U.S. patent application number 17/440618 was filed with the patent office on 2022-05-26 for control of nitrogen fixation in rhizobia that associate with cereals.
This patent application is currently assigned to Massachusetts Institute of Technology. The applicant listed for this patent is Massachusetts Institute of Technology. Invention is credited to Min-Hyung Ryu, Christopher A. Voigt.
Application Number | 20220162544 17/440618 |
Document ID | / |
Family ID | 1000006128682 |
Filed Date | 2022-05-26 |
United States Patent Application | 20220162544 |
Kind Code | A1 |
Voigt; Christopher A.; et al. | May 26, 2022 |
CONTROL OF NITROGEN FIXATION IN RHIZOBIA THAT ASSOCIATE WITH CEREALS
Abstract
Disclosed herein are engineered rhizobia having nif clusters
that enable the fixation of nitrogen under free-living conditions,
as well as ammonium and oxygen tolerant nitrogen fixation under
free-living conditions. Also provided are methods for producing
nitrogen for consumption by a cereal crop using these engineered
rhizobia.
Inventors: | Voigt; Christopher A. (Belmont, MA); Ryu; Min-Hyung (Cambridge, MA) |
Applicant: |
Name | City | State | Country | Type |
Massachusetts Institute of Technology | Cambridge | MA | US | |
Assignee: | Massachusetts Institute of Technology, Cambridge, MA |
Family ID: | 1000006128682 |
Appl. No.: | 17/440618 |
Filed: | March 19, 2020 |
PCT Filed: | March 19, 2020 |
PCT NO: | PCT/US2020/023646 |
371 Date: | September 17, 2021 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
16746215 | Jan 17, 2020 | |
17440618 | | |
62820765 | Mar 19, 2019 | |
Current U.S. Class: | 1/1 |
Current CPC Class: | C12N 2510/00 20130101; C05F 11/08 20130101; C12N 15/87 20130101; C12N 1/20 20130101 |
International Class: | C12N 1/20 20060101 C12N001/20; C05F 11/08 20060101 C05F011/08; C12N 15/87 20060101 C12N015/87 |
Government Interests
GOVERNMENT SUPPORT
[0002] This invention was made with Government support under Grant
No. IOS1331098 awarded by the National Science Foundation (NSF).
The government has certain rights in this invention.
Claims
1. A rhizobium that can fix nitrogen under aerobic free-living
conditions, comprising a symbiotic rhizobium having an exogenous
nif cluster, wherein the exogenous nif cluster confers nitrogen
fixation capability on the symbiotic rhizobium under aerobic
free-living conditions, and wherein the rhizobium is not
Azorhizobium caulinodans.
2. The rhizobium of claim 1, wherein the exogenous nif cluster is
selected from a group consisting of a free-living diazotroph, a
symbiotic diazotroph, a photosynthetic Alphaproteobacteria, a
Gammaproteobacteria, a cyanobacteria, a firmicutes, a Rhodobacter
sphaeroides, and a Rhodopseudomonas palustris.
3. The rhizobium of claim 1, wherein the exogenous nif cluster is
an inducible refactored nif cluster.
4. The rhizobium of claim 3, wherein the inducible refactored nif
cluster is an inducible refactored Klebsiella nif cluster.
5. The rhizobium of claim 1, wherein the rhizobium is IRBG74.
6. The rhizobium of claim 1, wherein the exogenous nif cluster
comprises 6 nif genes or operons.
7. The rhizobium of claim 6, wherein the 6 nif genes or operons are
nifHDK(T)Y, nifEN(X), nifJ, nifBQ, nifF, and nifUSVWZM.
8. The rhizobium of claim 6, wherein each nif gene or operon of the
exogenous nif cluster is preceded by a T7 promoter.
9. The rhizobium of claim 1, further comprising an endogenous nif
cluster.
10. The rhizobium of claim 1, wherein the exogenous nif cluster
further comprises a terminator.
11. The rhizobium of claim 8, wherein the T7 promoter has a
terminator and wherein the terminator is downstream from the T7
promoter.
12. The rhizobium of claim 11, wherein the exogenous nif cluster is
a refactored rhizobium IRBG74 nif cluster.
13-33. (canceled)
34. A method for making a nitrogen-fixing bacterium, the method
comprising: a) identifying a host bacterium; b) selecting a donor
bacterium having a nif cluster based on evolutionary distance
between the host bacterium and the donor bacterium; c) inserting
the nif cluster of the donor bacterium to the host bacterium,
thereby making a nitrogen-fixing bacterium.
35. The method of claim 34, wherein the evolutionary distance
between the host bacterium and the donor bacterium is less than
10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%,
0.5%, 0.4%, 0.3%, 0.2%, or 0.1% substitutions per site in 16S
ribosomal RNA gene sequence.
36. The method of claim 34, wherein the host bacterium and the
donor bacterium are in the same genus, family, order, or class.
37-39. (canceled)
40. The method of claim 34, wherein the host bacterium is E. coli
and the donor bacterium is K. oxytoca.
41. (canceled)
42. The method of claim 34, wherein the host bacterium is Rhizobium
IRBG74, and the donor bacterium is R. sphaeroides.
43. The method of claim 34, wherein the host bacterium is a
nonsymbiotic bacterium, e.g., Azotobacter, Beijerinckia, or
Clostridium bacterium.
44-45. (canceled)
46. The method of claim 34, wherein the inserted nif cluster is
under inducible control.
47-59. (canceled)
Description
RELATED APPLICATIONS
[0001] This application is a national stage filing under 35 U.S.C.
§ 371 of International Patent Application Number
PCT/US2020/023646, filed Mar. 19, 2020, which claims priority under
35 U.S.C. § 119(e) to U.S. Provisional Application Ser. No.
62/820,765, filed Mar. 19, 2019 and under 35 U.S.C. § 120 of
U.S. application Ser. No. 16/746,215, filed on Jan. 17, 2020, the
entire contents of each of which are incorporated by reference
herein.
BACKGROUND OF THE INVENTION
[0003] In agriculture, nitrogen is a limiting nutrient that needs
to be added as fertilizer to those crops that cannot produce it on
their own, including the cereals rice, corn, and wheat. In
contrast, legumes are able to obtain nitrogen from the atmosphere
using nitrogen-fixing bacteria that reside in root nodules.
However, the majority of the world's calories are from cereals;
thus, it has been a longstanding problem in genetic engineering to
transfer this ability to these crops. This would reduce the need
for nitrogenous fertilizer and the economic, environmental, and
energy burdens that it brings.
SUMMARY OF THE INVENTION
[0004] The present disclosure is based, at least in part, on rhizobia
and methods for making rhizobia that can fix nitrogen under aerobic
free-living conditions. The present disclosure also provides
refactored nif-clusters that confer the ability to fix nitrogen
under aerobic free-living conditions.
[0005] Accordingly, one aspect of the present disclosure provides a
rhizobium that can fix nitrogen under aerobic free-living
conditions, comprising a symbiotic rhizobium having an exogenous
nif cluster, wherein the exogenous nif cluster confers nitrogen
fixation capability on the symbiotic rhizobium under aerobic
free-living conditions, and wherein the rhizobium is not
Azorhizobium caulinodans. In some embodiments, the exogenous nif
cluster is from a free-living diazotroph. In some embodiments, the
exogenous nif cluster is from a symbiotic diazotroph. In some
embodiments, the exogenous nif cluster is from a photosynthetic
Alphaproteobacteria. In some embodiments, the exogenous nif cluster
is from a Gammaproteobacteria. In some embodiments, the exogenous
nif cluster is from a cyanobacteria. In some embodiments, the
exogenous nif cluster is from a firmicutes. In some embodiments,
the exogenous nif cluster is from Rhodobacter sphaeroides. In some
embodiments, the exogenous nif cluster is from Rhodopseudomonas
palustris. In some embodiments, the exogenous nif cluster is an
inducible refactored nif cluster. In some embodiments, the
inducible refactored nif cluster is an inducible refactored
Klebsiella nif cluster. In some embodiments, the rhizobium is
IRBG74. In some embodiments, the exogenous nif cluster comprises 6
nif genes. In some embodiments, the 6 nif genes are nifHDK(T)Y,
nifEN(X), nifJ, nifBQ, nifF, and nifUSVWZM. In some embodiments,
each nif gene of the exogenous nif cluster is preceded by a T7
promoter. In some embodiments, the T7 promoter is a wild-type
promoter. In some embodiments, the rhizobium further comprises an
endogenous nif cluster. In some embodiments, the nif cluster has a
nifV gene. In some embodiments, the nifV gene is endogenous. In
some embodiments, the exogenous nif cluster further comprises a
terminator. In some embodiments, the T7 promoter has a terminator
and the terminator is downstream from the T7 promoter. In some
embodiments, the exogenous nif cluster is a refactored v3.2 nif
cluster as shown in FIG. 2H.
[0006] Another aspect of the present disclosure provides a plant
growth promoting bacterium that can fix nitrogen under aerobic
free-living conditions, comprising a bacterium having an exogenous
nif cluster having at least one inducible promoter, wherein the
exogenous nif cluster confers nitrogen fixation capability on the
bacterium, under aerobic free-living conditions, and wherein the
bacterium is not Azorhizobium caulinodans. In some embodiments, the
bacterium is a symbiotic bacterium. In some embodiments, the
bacterium is an endophyte. In some embodiments, the endophyte is
rhizobium IRBG74. In some embodiments, the bacterium is an
epiphyte. In some embodiments, the epiphyte is Pseudomonas
protegens Pf-5. In some embodiments, the plant growth promoting
bacterium is associated with a genetically modified cereal plant.
In some embodiments, the genetically modified cereal plant includes
an exogenous gene encoding a chemical signal. In some embodiments,
the nitrogen fixation is under the control of the chemical signal.
In some embodiments, the chemical signal is an opine (e.g.,
octopine, nopaline, or mannopine), phloroglucinol, or rhizopine. In
some embodiments, the exogenous nif cluster comprises 6 nif genes.
In some embodiments, the 6 nif genes are nifHDK(T)Y, nifEN(X),
nifJ, nifBQ, nifF, and nifUSVWZM. In some embodiments, the
inducible promoter is a T7 promoter. In some embodiments, the
inducible promoter is the P_A1lacO1 promoter. In some embodiments,
the inducible promoter is activated by an agent selected from a
group that includes IPTG, sodium salicylate, octopine, nopaline,
the quorum signal 3OC6HSL, aTc, cuminic acid, DAPG, and salicylic
acid. In some embodiments, the exogenous nif cluster further
comprises a terminator. In some embodiments, the inducible promoter
has a terminator and the terminator is downstream from the
inducible promoter.
[0007] Another aspect of the present disclosure provides an
Azorhizobium caulinodans capable of inducible ammonium-independent
nitrogen fixation in a cereal crop, comprising: (i) a modified nif
cluster, wherein an endogenous nifA gene is deleted or altered; and
(ii) at least one operon comprising nifA and RNA polymerase sigma
factor (RpoN), wherein the operon comprises a regulatory element
including an inducible promoter. In some embodiments, the inducible
promoter is the P_A1lacO1 promoter. In some embodiments, the
inducible promoter is activated by an agent selected from the group
consisting of IPTG, sodium salicylate, octopine, nopaline, the
quorum signal 3OC6HSL, aTc, cuminic acid, DAPG, and salicylic acid.
In some embodiments, the endogenous nifA gene is altered with at
least one of the following substitutions: (i) L94Q, (ii) D95Q, and
(iii) both L94Q and D95Q.
[0008] Another aspect of the present disclosure provides a method
of engineering a rhizobium that can fix nitrogen under aerobic
free-living conditions, comprising transferring an exogenous nif
cluster to a symbiotic rhizobium, wherein the exogenous nif cluster
confers nitrogen fixation capability on the symbiotic rhizobium,
under aerobic free-living conditions, and wherein the rhizobium is
not Azorhizobium caulinodans. In some embodiments, the exogenous
nif cluster comprises 6 nif genes. In some embodiments, the 6 nif
genes are nifHDK(T)Y, nifEN(X), nifJ, nifBQ, nifF, and nifUSVWZM. In some
embodiments, each of the nif genes is preceded by a wild-type T7
promoter. In some embodiments, the exogenous nif cluster is
transferred to the rhizobium in a plasmid. In some embodiments, the
exogenous nif cluster further comprises a terminator. In some
embodiments, the wild-type T7 promoter has a terminator, and the
terminator is downstream from the wild-type T7 promoter. In some
embodiments, the endogenous nifL gene is deleted.
[0009] Another aspect of the present disclosure provides a method
of producing nitrogen for consumption by a cereal plant, comprising
providing a plant growth promoting bacterium that can fix nitrogen
under aerobic free-living conditions in proximity of the cereal
plant, wherein the plant growth promoting bacterium is a symbiotic
bacterium having an exogenous nif cluster, wherein the exogenous
nif cluster confers nitrogen fixation capability on the symbiotic
bacterium, enabling nitrogen fixation under aerobic free-living
conditions. In some embodiments, the plant growth promoting
bacterium is a rhizobium. In some embodiments, the plant growth
promoting bacterium is a bacterium as described in the present disclosure. In
some embodiments, the cereal plant is a genetically modified cereal
plant. In some embodiments, the genetically modified cereal plant
includes an exogenous gene encoding a chemical signal. In some
embodiments, the nitrogen fixation is under the control of the
chemical signal. In some embodiments, the chemical signal is an opine,
phloroglucinol, or rhizopine. In some embodiments, the nitrogen
fixation is under the control of a chemical signal. In some
embodiments, the chemical signal is a root exudate, biocontrol
agent or phytohormone. In some embodiments, the root exudate is
selected from the group consisting of sugars, hormones, flavonoids,
and antimicrobials. In some embodiments, the chemical signal is
vanillate. In some embodiments, the chemical signal is IPTG, aTc,
cuminic acid, DAPG, salicylic acid, 3,4-dihydroxybenzoic acid,
3OC6HSL, or 3OC14HSL.
[0010] In one aspect, the disclosure also provides a genetically
engineered plant that can produce orthogonal carbon sources, such
as opines or less common sugars, and bacteria with the
corresponding catabolism pathways, which can respond to these
signals.
[0011] In one aspect, the present disclosure provides a method for
making a nitrogen-fixing bacterium, the method comprising a)
identifying a host bacterium; b) selecting a donor bacterium having
a nif cluster based on evolutionary distance between the host
bacterium and the donor bacterium; and c) inserting the nif cluster
of the donor bacterium into the host bacterium, thereby making a
nitrogen-fixing bacterium. In some embodiments, the evolutionary
distance between the host bacterium and the donor bacterium is less
than 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%,
0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% substitutions per site in 16S
ribosomal RNA gene sequence. In some embodiments, the host
bacterium and the donor bacterium are in the same genus, family,
order, or class. In some embodiments, the donor bacterium is
selected from Klebsiella, Pseudomonas, Azotobacter,
Gluconacetobacter, Azospirillum, Azorhizobium, Rhodopseudomonas,
Rhodobacter, Cyanothece, or Paenibacillus genus. In some
embodiments, the host bacterium is selected from the group
consisting of E. coli, Pseudomonas protegens Pf-5, and Rhizobium
IRBG74. In some embodiments, the donor bacterium is selected from
the group consisting of K. oxytoca, P. stutzeri, A. vinelandii, G.
diazotrophicus, A. brasilense, A. caulinodans, R. palustris, R.
sphaeroides, Cyanothece, and Paenibacillus. In some embodiments,
the host bacterium is E. coli and the donor bacterium is K.
oxytoca. In some embodiments, the host bacterium is Pseudomonas
protegens Pf-5, and the donor bacterium is P. stutzeri. In some
embodiments, the host bacterium is Rhizobium IRBG74, and the donor
bacterium is R. sphaeroides. In some embodiments, the host
bacterium is a nonsymbiotic bacterium, e.g., Azotobacter,
Beijerinckia, or Clostridium bacterium. In some embodiments, the
host bacterium is a symbiotic bacterium, e.g., Rhizobium, Frankia,
or Azospirillum bacterium. In some embodiments, the host bacterium
is symbiotic with a leguminous plant, an actinorhizal plant, or a
cereal crop. In some embodiments, the inserted nif cluster is under
inducible control.
[0012] In one aspect, the present disclosure provides a method of
selecting a nif cluster of a donor bacterium that is compatible
with a host bacterium, the method comprising a) performing a
phylogenetic analysis for the donor bacterium and the host
bacterium; b) determining that the evolutionary distance between the
donor bacterium and the host bacterium, based on the phylogenetic
analysis, is less than a reference value; and c) selecting the nif
cluster of the donor bacterium for the host bacterium. In some
embodiments, the phylogenetic analysis is performed by using
distance-matrix, maximum parsimony, maximum likelihood, or Bayesian
inference. In some embodiments, the phylogenetic analysis is
performed by analyzing ribosomal RNA (e.g., 16S rRNA) substitution
rate. In some embodiments, the reference value is 10%, 9%, 8%, 7%,
6%, 5%, 4%, 3%, 2%, 1%, 0.9%, 0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%,
0.2%, or 0.1% substitutions per site in 16S ribosomal RNA gene
sequence. In some embodiments, the reference value is 500, 400,
300, 200, 100, 50, or 10 million years. In some embodiments, the
method further comprises inserting the nif cluster into the host
bacterium and evaluating the nitrogen fixation activity. In some
embodiments, the nif cluster is under inducible control.
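The selection step described in paragraphs [0011] and [0012] can be sketched as follows. This is a minimal illustration only: it assumes pre-aligned 16S rRNA sequences of equal length and uses the Jukes-Cantor correction as one simple distance-based estimate of substitutions per site. The function names, the toy sequences, and the default reference value of 10% are hypothetical choices made for the example and are not taken from the disclosure, which also contemplates maximum parsimony, maximum likelihood, and Bayesian inference.

```python
import math

def p_distance(seq_a, seq_b):
    """Fraction of aligned, ungapped sites at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip alignment gaps
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

def jukes_cantor_distance(seq_a, seq_b):
    """Estimated substitutions per site under the Jukes-Cantor model."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - (4.0 / 3.0) * p)

def select_donors(host_seq, donor_seqs, reference_value=0.10):
    """Return names of donors closer to the host than the reference value, closest first."""
    scored = [(jukes_cantor_distance(host_seq, seq), name)
              for name, seq in donor_seqs.items()]
    return [name for dist, name in sorted(scored) if dist < reference_value]

# Toy aligned fragments (hypothetical, not real 16S rRNA sequences).
HOST = "ACGTACGTACGTACGTACGT"
DONORS = {
    "donor_near": "ACGTACGTACGTACGTACGA",  # 1 of 20 sites differs
    "donor_far": "TCGTTCGTTCGTTCGTTCGT",   # 5 of 20 sites differ
}
print(select_donors(HOST, DONORS))
```

In practice the distances would come from a full-length 16S alignment and a phylogenetics package; the threshold comparison in `select_donors` is the only part that mirrors the claimed step directly.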
[0013] In one aspect, the present disclosure provides a bacterium
comprising a nif cluster, where the nif cluster is under control of
an exogenous control genetic element. In some embodiments, the nif
cluster is an endogenous or exogenous nif cluster. In some
embodiments, the exogenous control genetic element initiates
promoter activities in response to an inducer (e.g., a chemical
signal). In some embodiments, the promoter activities are measured
by the below equation:
$$\delta J = \frac{\gamma}{n}\left[\sum_{i=x_{\mathrm{tss}}+1}^{x_{\mathrm{tss}}+1+n} m(i) - \sum_{i=x_{\mathrm{tss}}-1-n}^{x_{\mathrm{tss}}-1} m(i)\right]$$
where m(i) is the number of transcripts at each position i from the
FPKM-normalized transcriptomic profiles, γ = 0.0067 s^-1 is the
degradation rate of mRNA, and n is the window length before and
after x_tss. In some embodiments, the inducer is delivered to
the bacterium by chemical delivery or biocontrol delivery. In some
embodiments, the inducer is a chemical signal in seeds (e.g.,
cuminic acid), a native root exudate (e.g., arabinose, salicylic
acid, vanillic acid, or naringenin), a chemical signal from a
bacterium (e.g., 3OC6HSL, 3OC14HSL, DHBA, or DAPG), or a chemical
signal from a genetically modified plant (e.g., Nopaline or
Octopine).
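The promoter activity calculation in paragraph [0013] can be illustrated with a short sketch. It assumes the expression is read as the difference between the summed transcript counts in a window after the transcription start site x_tss and in a window before it, scaled by γ/n, with both sums taken over inclusive position ranges. The function name, the list-based profile, and the toy numbers are hypothetical; only γ = 0.0067 s^-1 comes from the text.

```python
GAMMA = 0.0067  # mRNA degradation rate in 1/s, as stated in the text

def promoter_activity(m, x_tss, n, gamma=GAMMA):
    """Estimate promoter activity from a per-position transcript profile.

    m      -- sequence of FPKM-normalized transcript counts, indexed by position
    x_tss  -- index of the transcription start site
    n      -- window length before and after x_tss
    """
    if n <= 0:
        raise ValueError("window length n must be positive")
    if x_tss - 1 - n < 0 or x_tss + 1 + n >= len(m):
        raise ValueError("window extends beyond the profile")
    # Inclusive range x_tss+1 .. x_tss+1+n (after the start site).
    downstream = sum(m[x_tss + 1 : x_tss + n + 2])
    # Inclusive range x_tss-1-n .. x_tss-1 (before the start site).
    upstream = sum(m[x_tss - 1 - n : x_tss])
    return gamma / n * (downstream - upstream)

# Toy profile (hypothetical): low read-through before position 10, higher after.
profile = [1.0] * 10 + [5.0] * 11
print(promoter_activity(profile, x_tss=10, n=4))
```

A flat profile gives an activity of zero under this reading, since the two windows cancel, which matches the intent of measuring the change in transcript level across the start site.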
[0014] The details of one or more embodiments of the invention are
set forth in the description below. Other features or advantages of
the present invention will be apparent from the following drawings
and detailed description of several embodiments, and also from the
appended claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0015] The following drawings form part of the present
specification and are included to further demonstrate certain
aspects of the present disclosure, which can be better understood
by reference to one or more of these drawings in combination with
the detailed description of specific embodiments presented herein.
For purposes of clarity, not every component may be labeled in
every drawing. It is to be understood that the data illustrated in
the drawings in no way limit the scope of the disclosure. In the
drawings:
[0016] FIGS. 1A-1F include diagrams showing transfer of nif
clusters across species. (FIG. 1A) Eight nif clusters from
free-living nitrogen fixing bacteria are aligned based on
phylogenetic relationships of 16S rRNA sequences. The genes and
operons are based on K. oxytoca M5al. Dots in the DNA line indicate
where multiple regions were cloned from genomic DNA and combined to
form one large plasmid-borne nif cluster. A complete list of strain
genotypes is provided in Table 3. Nitrogenase activity from
transfer of the native nif clusters was measured in three species.
The activities of the R. palustris and R. sphaeroides nif clusters
were also measured in 12 Rhizobia strains. Asterisks indicate
ethylene production below the detection limit (<10 a.u.). Error
bars represent s.d. from three independent experiments. (FIG. 1B)
Transcriptomic profile of the native K. oxytoca nif cluster in K.
oxytoca, compared with those obtained from its transfer to the
indicated species. (FIG. 1C) Transcription levels (FPKM) of the
native K. oxytoca nif cluster across species. Transcriptional units
are underlined. (FIG. 1D) Transcription levels (FPKM) of the K.
oxytoca nif genes in K. oxytoca (→Klebsiella) compared to
that obtained when transferred to a new host. (FIG. 1E) Same as in
(FIG. 1C), except the translational efficiency is compared, as
calculated using ribosome profiling. (FIG. 1F) Same as in (FIG.
1D), except the ribosome densities (RD) are compared, as calculated
using ribosome profiling. R² in log-log plots was calculated from
the line (y=x+b), where b is an expression variable between
hosts.
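The R² calculation mentioned at the end of paragraph [0016] can be sketched as below. The sketch assumes that the line y = x + b is applied to log10-transformed values, that b is estimated as the mean offset between hosts, and that R² is the usual coefficient of determination relative to that fixed-slope line. These choices, the function name, and the example values are assumptions made for illustration.

```python
import math

def r_squared_unit_slope(x, y):
    """Fit log10(y) = log10(x) + b with slope fixed at 1; return (b, R^2)."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    if any(v <= 0 for v in x) or any(v <= 0 for v in y):
        raise ValueError("log-log analysis requires positive values")
    lx = [math.log10(v) for v in x]
    ly = [math.log10(v) for v in y]
    # With the slope fixed at 1, the least-squares offset is the mean difference.
    b = sum(yi - xi for xi, yi in zip(lx, ly)) / len(lx)
    mean_ly = sum(ly) / len(ly)
    ss_res = sum((yi - (xi + b)) ** 2 for xi, yi in zip(lx, ly))
    ss_tot = sum((yi - mean_ly) ** 2 for yi in ly)
    if ss_tot == 0:
        raise ValueError("y values are constant; R^2 is undefined")
    return b, 1.0 - ss_res / ss_tot

# Hypothetical expression levels: the new host is uniformly 5-fold higher.
native_host = [1.0, 10.0, 100.0, 1000.0]
new_host = [5.0, 50.0, 500.0, 5000.0]
b, r2 = r_squared_unit_slope(native_host, new_host)
print(b, r2)
```

Fixing the slope at 1 means that a uniform fold change between hosts is absorbed into b, so R² reports how well the relative expression ratios between genes are preserved after transfer.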
[0017] FIGS. 2A-2M include diagrams showing the transfer of the
refactored K. oxytoca nif clusters to R. sp. IRBG74. (FIG. 2A) The
genetic systems for the controller for E. coli MG1655 (left) and R.
sp. IRBG74 (right) are shown. A variant of T7 RNAP (R632S,
N-terminal lon tag, GTG start codon) is used for the E. coli
controller. Several genetic parts were substituted to build the R.
sp. IRBG74 controller (FIG. 16). The sequences for the genetic
parts are provided in Table 5. (FIG. 2B) The response functions for
the controllers with the reporter plasmid pMR-79 (Table 4 and Table
5). The IPTG concentrations used to induce nitrogenase are
circled. (FIG. 2C) The genetic parts used to build the refactored
v2.1 nif gene cluster are shown (Table 5). (FIG. 2D) The activity
of the refactored nif gene cluster v2.1 in different hosts is
shown. Asterisks indicate ethylene production below the detection
limit (<10 a.u.). (FIG. 2E) The activities of the v2.1 promoters
and terminators in E. coli MG1655 and R. sp. IRBG74 as calculated
from RNA-seq data (see Materials and Methods). (FIG. 2F) The
translation efficiency of the v2.1 nif genes in E. coli MG1655 and
R. sp. IRBG74, as calculated using ribosome profiling and RNA-seq.
Lines connect points that occur in the same operon. (FIG. 2G) The
ribosome density (RD) is compared for the refactored v2.1 nif genes
in a new host (E. coli MG1655; R. sp. IRBG74) versus that measured
for the nif genes from the native K. oxytoca cluster in K. oxytoca
(→Klebsiella). The point corresponding to nifH is marked H.
(FIGS. 2H-2L) The same as (FIGS. 2C-2G) but with the refactored nif
cluster v3.2. Genetic parts are provided in Table 5. (FIG. 2M)
Nitrogenase activity is shown as a function of T7 promoter
strength. The refactored nif cluster v3.2 was expressed from three
controller strains with varying strengths (FIG. 16). Error bars
represent s.d. from three independent experiments.
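FIG. 2F refers to translation efficiency calculated from ribosome profiling and RNA-seq. One common way to combine the two measurements, assumed here rather than taken from the text, is to divide each gene's ribosome density by its mRNA level (FPKM). The function name and the numbers in the example are hypothetical.

```python
def translation_efficiency(ribosome_density, mrna_fpkm):
    """Return per-gene translation efficiency as ribosome density / mRNA FPKM.

    Genes with missing or non-positive mRNA levels are skipped, because the
    ratio is undefined for them.
    """
    efficiency = {}
    for gene, rd in ribosome_density.items():
        fpkm = mrna_fpkm.get(gene)
        if fpkm is None or fpkm <= 0:
            continue
        efficiency[gene] = rd / fpkm
    return efficiency

# Hypothetical values for three nif genes.
rd = {"nifH": 200.0, "nifD": 50.0, "nifK": 10.0}
fpkm = {"nifH": 100.0, "nifD": 100.0, "nifK": 0.0}
print(translation_efficiency(rd, fpkm))
```

Because the result is a ratio, comparing it between hosts separates changes in translation from changes in transcript abundance, which is the distinction the figure panels draw.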
[0018] FIGS. 3A-3F include diagrams showing the control of nitrogen
fixation in A. caulinodans ORS571. (FIG. 3A) The controller is
shown, carried on a pBBR1 origin plasmid (genetic parts are
provided in Table 5). NifA and RpoN co-induce the expression of
three sites in the genome (identified by consensus NifA binding
sequences). (FIG. 3B) Expression from the nifH promoter was
evaluated using a fluorescent reporter (see Materials and Methods).
NifA and RpoN were complemented (+) individually or in combination
in the A. caulinodans ΔnifA strain where the genomic rpoN
remains intact. (FIG. 3C) The response function for the induction
of the nifH promoter by the controller is shown. (FIG. 3D) The
nitrogenase activity is shown for wild-type A. caulinodans ORS571
compared to the ΔnifA complemented with the controller
plasmid (+) and the addition of 1 mM IPTG (+). (FIG. 3E) The effect
of the absence or presence of 10 mM ammonium chloride is shown. The
WT NifA from A. caulinodans ORS571 is compared to different
combinations of amino acid substitutions with additional RpoN
expression. NifA/RpoN expression is induced by 1 mM IPTG (+) for
the ΔnifA strain containing the controller plasmids pMR-121,
122, 123, and 124 (+). Asterisks indicate ethylene production below
the detection limit (<10 au). (FIG. 3F) The nitrogenase activity
is shown as a function of the oxygen concentration in the headspace
(see Materials and Methods). The native nif cluster (wild-type A.
caulinodans ORS571) is compared to the inducible version including
the controller plasmid and 1 mM IPTG. Error bars represent s.d.
from three independent experiments.
[0019] FIGS. 4A-4F include diagrams showing nitrogenase activity of
the inducible nif clusters in Pseudomonas protegens Pf-5. (FIG. 4A)
The controllers, based on P. stutzeri NifA, were used for all three
clusters. Plasmids and genetic parts are provided in Table 4 and
Table 5. (FIG. 4B) The nif clusters from K. oxytoca, P. stutzeri,
and A. vinelandii are shown. The deleted regions corresponding to the
NifLA regulators are marked. The dotted lines indicate that
multiple regions from the genome were cloned and combined to form
the nif cluster. The clusters were carried on the plasmids pMR-4, 6, 8
(Table 4). (FIG. 4C) The induction of the nifH promoters from each
species by the controller is shown (0.5 mM IPTG) (see Materials
and Methods). (FIG. 4D) The nitrogenase activities of the native
clusters (intact nifLA) are compared to the inducible clusters in the
presence and absence of 0.5 mM IPTG. The dashed lines indicate the
activity of the native clusters in the wild-type context (top to
bottom, K. oxytoca M5al, P. stutzeri A1501 and A. vinelandii DJ).
(FIG. 4E) The sensitivities of the native and inducible (+0.5 mM
IPTG) nif clusters to 17.1 mM ammonium acetate are compared.
Asterisks indicate ethylene production below the detection limit
(<10 au). (FIG. 4F) The nitrogenase activity is shown as a
function of the oxygen concentration in the headspace (see
Materials and Methods). The native nif cluster is compared to the
inducible version including the controller plasmid and 0.5 mM IPTG.
Error bars represent s.d. from three independent experiments.
[0020] FIGS. 5A-5D include diagrams showing the control of
nitrogenase activity with sensors that respond to diverse chemicals
in the rhizosphere. (FIG. 5A) Schematic showing the origins of the
chemicals. "Introduced DNA" refers to the genetic modification of
the plant to produce nopaline and octopine. (FIG. 5B) The genetic
sensors built for A. caulinodans are shown. Sequences for the
genetic parts are provided in Table 5. (FIG. 5C) The response
functions for the sensors are shown. Either the sensor expresses T7
RNAP, which then activates PT7, or it expresses NifA (P. protegens
Pf-5) or NifA/RpoN (A. caulinodans) and activates the nifH promoter
(species origin in parentheses). (FIG. 5D) The nitrogenase activity
is measured in the presence or absence of inducer (see Materials
and Methods). The refactored Klebsiella nif clusters v2.1 and v3.2
were used in E. coli MG1655 and R. sp. IRBG74, respectively. The
inducible A. vinelandii nif cluster was used in P. protegens Pf-5.
The controller containing nifA/rpoN was used in A. caulinodans
ΔnifA. The inducer concentrations are: 50 µM vanillic
acid, 500 µM DHBA, 50 µM cuminic acid, 25 nM 3OC6HSL, 500 nM
3OC14HSL, 33 µM arabinose, 100 µM naringenin, 100 nM DAPG,
200 µM salicylic acid, 1 mM nopaline and 1 mM octopine. Error
bars represent s.d. from three independent experiments.
[0021] FIG. 6 includes a plot of the growth curve of R. sp. IRBG74
in UMS minimal medium with varying carbon sources. Cultures grown
overnight in 2 mL TY medium in 15 mL culture tubes at 30 °C
and 250 rpm were diluted to an OD600 of 0.02 into 1 mL of UMS
minimal medium plus varying carbon sources in 96-deepwell plates
and incubated for 16 hours at 30 °C and 900 rpm. Bacterial
growth was spectrophotometrically monitored at OD600. Error
bars represent s.d. from three independent experiments.
[0022] FIGS. 7A-7F include diagrams showing the nitrogenase
activity when different inducible nif clusters are transferred to
E. coli MG1655. (FIG. 7A) The same controller system based on K.
oxytoca NifA was used for all three clusters. The controller
plasmid pMR-99 and genetic parts are provided in Table 4 and Table
5. (FIG. 7B) The nif clusters from K. oxytoca, P. stutzeri, and A.
vinelandii are shown. The deleted regions corresponding to the NifLA
regulators are marked. The dotted lines indicate that multiple
regions from the genome were cloned and combined to form the nif
cluster. The clusters were carried on the plasmids pMR-3, 5, 7 (Table
4). (FIG. 7C) The induction of the nifH promoters from each species
by the controller is shown (50 µM IPTG) (see Materials and
Methods). (FIG. 7D) The nitrogenase activities of the native clusters
(intact nifLA) are compared to the inducible clusters in the
presence and absence of 50 µM IPTG. The dashed lines indicate
the activity of the native clusters in the wild-type context (top
to bottom, K. oxytoca M5al, P. stutzeri A1501 and A. vinelandii
DJ). (FIG. 7E) Regulation of nitrogenase activity by ammonia.
Ammonium tolerance of nitrogenase from the native (black bar) and
inducible (gray bar) systems was tested in the presence of 17.1 mM
ammonium acetate. Asterisks indicate ethylene production below the
detection limit (<10 au). (FIG. 7F) Regulation of nitrogenase
activity by oxygen. The native nif cluster is compared to the
inducible version including the controller plasmid and 50 µM
IPTG. Nitrogenase activities were measured after 3 h of incubation
at constant oxygen concentrations (0 to 3%) in the headspace (see
Materials and Methods). Error bars represent s.d. from three
independent experiments.
[0023] FIGS. 8A-8B include plots showing ammonium repression of the
transferred nif clusters. Nitrogenase sensitivity to ammonium was
measured by nitrogenase assay in the absence (-) or presence (+) of
17.1 mM ammonium acetate. The sensitivity of the native and
inducible nif clusters is shown in E. coli MG1655 (FIG. 8A) and P.
protegens Pf-5 (FIG. 8B). Note that the data are from FIGS. 4A-4F and
FIGS. 7A-7F. The nif clusters were induced by 50 µM and 0.5 mM IPTG in
E. coli MG1655 and P. protegens Pf-5, respectively. Asterisks
indicate ethylene production below the detection limit (<10 au).
Error bars represent s.d. from three independent experiments.
[0024] FIG. 9 includes a diagram showing the ribosome profiling
data for the K. oxytoca native nif cluster in K. oxytoca M5al, E.
coli MG1655, P. protegens Pf-5 and R. sp. IRBG74 (see Materials and
Methods).
[0025] FIGS. 10A-10B include diagrams showing the effect of NifA
overexpression on the nifH promoter activity in R. sp. IRBG74.
(FIG. 10A) The reporter construct used to measure nifH promoter
activity is shown. The nifH promoter activity was analyzed in the
R. sp. IRBG74 wild-type background using flow cytometry. Additional
copies of NifA of R. sp. IRBG74 increased activity of the R. sp.
IRBG74 nifH promoter but failed to complement or enhance activity
of the other nifH promoters including K. oxytoca, P. stutzeri and
A. caulinodans. Error bars represent s.d. from three independent
experiments. (FIG. 10B) Plasmid maps used to assess the effect of
NifA overexpression in R. sp. IRBG74. WT, wild-type; Rsp, R. sp.
IRBG74; Kox, K. oxytoca M5al; Pst, P. stutzeri A1501; Aca, A.
caulinodans ORS571.
[0026] FIGS. 11A-11C include diagrams showing promoter
characterization in R. sp. IRBG74 and P. protegens Pf-5. (FIG. 11A)
Constitutive promoters are rank-ordered by their strength. Plasmids
used to measure promoter activity are depicted on the top. (FIG.
11B) The strength of the T7 promoter wild-type and its variants was
analyzed in the controller strains containing the IPTG-inducible T7
RNAP on the genome of R. sp. IRBG74 and P. protegens Pf-5 with 1 mM
IPTG induction. A reporter plasmid used to measure T7 promoter
activity is shown on the right. (FIG. 11C) Correlation of T7
promoter strength between species. Error bars represent s.d. from
three independent experiments.
[0027] FIGS. 12A-12B include diagrams showing RBS characterization
in R. sp. IRBG74 and P. protegens Pf-5. An RBS library for GFP was
designed using the RBS library calculator at the highest-resolution
mode. (FIG. 12A) The strengths of the synthetic RBSs in R. sp.
IRBG74 were analyzed in the plasmid pMR-40 containing the
IPTG-inducible system with 1 mM IPTG induction. 33 of the RBSs
spanning a range of 5,684-fold expression were selected and their
sequences are provided in Table 6. (FIG. 12B) The strengths of the
synthetic RBSs in P. protegens Pf-5 were analyzed in the plasmid
pMR-65 containing the arabinose-inducible system with 7 .mu.M
arabinose induction. 33 of the RBSs spanning a range of 1,075-fold
expression were selected and their sequences are provided in Table
6.
[0028] FIGS. 13A-13B include diagrams showing the characterization
of terminators for T7 RNAP in R. sp. IRBG74 (FIG. 13A) and P.
protegens Pf-5 (FIG. 13B). (FIG. 13A) The strength of terminators
was analyzed in the controller R. sp. IRBG74 strain MR16
containing the IPTG-inducible T7 RNAP on the genome with 1 mM IPTG
induction. (FIG. 13B) Plasmids used to measure terminator strength
are shown on the right. Genetic parts are provided in Table 5. Error
bars represent s.d. from three independent experiments.
[0029] FIG. 14 includes diagrams showing the response functions for
the sensors in R. sp. IRBG74. Plasmids used to characterize the
sensors are shown on top of each panel and provided in Table 4.
Genetic parts are provided in Table 5. Error bars represent s.d.
from three independent experiments. Experimental details are
provided in Methods.
[0030] FIGS. 15A-15C include diagrams showing the response
functions for the sensors in P. protegens Pf-5. The output changes
as a function of input inducer concentrations. Plasmids used to
characterize the sensors are shown on top of each panel. (FIG. 15A)
Inducible promoter characterization in P. protegens Pf-5. (FIG.
15B) Optimization of the arabinose-inducible systems. Constitutive
expression of a plasmid-borne AraE transporter decreased a
dissociation constant of arabinose (dark gray). A mutation in the
-10 region (TACTGT to TATATT) of the P.sub.BAD promoter increased
promoter strength (black). (FIG. 15C) Optimization of
IPTG-inducible systems. The IPTG-inducible promoters were induced
by 1 mM IPTG. The combination of the P.sub.tac promoter and the
LacI (Q18M/A47V/F161Y) protein yielded an expression range of
110-fold. Plasmids and genetic parts are provided in Table 4 and
Table 5. Error bars represent s.d. from three independent
experiments.
[0031] FIG. 16 includes diagrams showing the tuning of controller
strength in R. sp. IRBG74. The controller containing the
IPTG-inducible T7 RNAP is integrated into the genome of R. sp.
IRBG74 (top). Controller strengths were adjusted by modulating the
RBS of T7 RNAP in the plasmids pMR-81, 82, and 83. Response
functions of the T7 promoter were measured with the reporter
plasmid pMR-79 (right) in the R. sp. IRBG74 controller strains
MR16, MR17, and MR18. Genetic parts and RBS sequences are provided
in Table 5 and Table 6, respectively. Error bars represent s.d. from three
independent experiments.
[0032] FIG. 17 includes a plot showing the nitrogenase activity of
the refactored nif clusters across species. Error bars represent
s.d. from three independent experiments.
[0033] FIG. 18 includes diagrams showing RNA-seq (top) and ribosome
profiling (bottom) data, respectively in E. coli MG1655 and R. sp.
IRBG74. The nif genes were induced by 1 mM IPTG for 6 hours (see
Materials and Methods).
[0034] FIG. 19 includes diagrams showing RNA-seq (top) and ribosome
profiling (bottom) data, respectively, in E. coli MG1655, P.
protegens Pf-5, and R. sp. IRBG74. The nif genes were induced by 1
mM, 0.1 mM, and 0.5 mM IPTG for 6 h in E. coli MG1655, P. protegens
Pf-5 and R. sp. IRBG74, respectively (see Materials and
Methods).
[0035] FIGS. 20A-20F include diagrams showing the transfer of the
refactored nif cluster v3.2 in P. protegens Pf-5. (FIG. 20A)
Controllers whose output is T7 RNAP from the genome of P. protegens
Pf-5 are described. Genetic parts substituted for controller
optimization, including a new RBS and IPTG-inducible promoter,
relative to the controller module pKT249 in E. coli MG1655 are
highlighted in red. The response functions for the controllers were
measured with the reporter plasmid pMR-80 in the P. protegens Pf-5
controller strain MR7. Controllers driving the expression of GFP by
the T7 promoter achieved a large dynamic range, with up to 96-fold
activation by IPTG. Error bars represent s.d. from three independent experiments.
(FIG. 20B) The genetic parts used to build the refactored v3.2 nif
gene cluster are shown (Table 5). (FIG. 20C) The activity of the
refactored nif cluster v3.2. Nitrogenase expression was induced by
1 mM IPTG. (FIG. 20D) Function of the transcriptional parts of the
cluster v3.2 was analyzed by RNA-seq (FIG. 19). The performance of
the promoters (left) and terminators (right) was calculated (see
Materials and Methods). (FIG. 20E) The translation efficiency of
the nif genes v3.2 was calculated using ribosome profiling and
RNA-seq. Lines connect points that occur in the same operon. (FIG.
20F) The ribosome density (RD) is compared for the refactored v3.2
nif genes in P. protegens Pf-5 versus that measured for the nif
genes from the native K. oxytoca cluster in K. oxytoca
(.fwdarw.Klebsiella).
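By way of illustration only, the two quantities referred to in this figure description can be sketched as simple ratios. The sketch below assumes that fold activation is the induced reporter output divided by the uninduced output, and that translation efficiency is ribosome density divided by transcript abundance; these definitions, the function names, and the numbers are assumptions made for illustration and are not taken from the Materials and Methods.

```python
def fold_activation(induced, uninduced):
    """Ratio of induced to uninduced reporter output (assumed definition)."""
    if uninduced <= 0:
        raise ValueError("uninduced output must be positive")
    return induced / uninduced


def translation_efficiency(ribosome_density, mrna_level):
    """Ribosome density per unit of transcript (assumed definition)."""
    if mrna_level <= 0:
        raise ValueError("mRNA level must be positive")
    return ribosome_density / mrna_level


# Hypothetical values chosen only to illustrate the arithmetic.
print(fold_activation(960.0, 10.0))         # 96.0
print(translation_efficiency(50.0, 200.0))  # 0.25
```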
[0036] FIG. 21 includes diagrams showing the response function of
inducible promoters in A. caulinodans ORS571. Plasmids used to
characterize inducible promoters are shown on top of each panel and
provided in Table 4. Genetic parts are provided in Table 5. Error
bars represent s.d. from three independent experiments.
[0037] FIG. 22 includes a diagram showing the multiple sequence
alignment of NifA of A. caulinodans ORS571 with that of R.
sphaeroides 2.4.1, generated using MUSCLE2. The corresponding
residues for ammonium tolerance in R. sphaeroides are outlined. The A.
caulinodans strand corresponds to SEQ ID NO: 293, and the R.
sphaeroides strand corresponds to SEQ ID NO: 292.
[0038] FIGS. 23A-23B include diagrams showing functional testing of
the NifA homologues that activate the nifH promoters. (FIG. 23A)
The ability of the various NifA to activate the nifH promoters was
tested with pairwise combinations of the nifH promoters and the
NifA in E. coli MG1655 and P. protegens Pf-5. Error bars represent
s.d. from three independent experiments. (FIG. 23B) Plasmids used
to measure nifH promoter activity by NifA overexpression are shown
and provided in Table 4. Genetic parts are provided in Table 5.
Pst, P. stutzeri A1501; Avi, A. vinelandii DJ; Kox, K. oxytoca
M5al.
[0039] FIGS. 24A-24B include diagrams showing optimization of the
controllers in P. protegens Pf-5 and E. coli MG1655 that induce the
nifH promoters. (FIG. 24A) The controllers with different strengths
were designed by RBS replacement and tested with the reporter
plasmids (pMR103-105) in which each of the three nifH promoters is
fused to sfgfp (Methods). The nifH promoters were induced with 0.5
mM IPTG. Genetic parts and RBS sequences are provided in Table 5
and 6, respectively. (FIG. 24B) Activation of the nifH promoters in
the E. coli MG1655 containing the controller plasmid pMR102 was
tested with the reporter plasmids pMR106-108. The P. protegens Pf-5
controller strain MR10 was used to drive expression of the nifH
promoter of K. oxytoca and the controller strain MR9 was used to
drive expression of the nifH promoters of P. stutzeri and A.
vinelandii. The nifH promoters were induced with 0.05 mM IPTG and
0.5 mM IPTG in E. coli MG1655 and P. protegens Pf-5, respectively.
Error bars represent s.d. from three independent experiments.
[0040] FIG. 25 includes diagrams showing the effect of oxygen on
the activity of the nifH promoters. Expression from the nifH
promoters was analyzed in E. coli MG1655 containing the controller
plasmid pMR102, P. protegens Pf-5 MR10 (for K. oxytoca) and MR9
(for P. stutzeri and A. vinelandii) at varying initial oxygen
levels in the headspace. The three nifH promoters were induced with
0.05 mM IPTG and 0.5 mM IPTG in E. coli MG1655 and P. protegens
Pf-5, respectively, and incubated at varying initial oxygen
concentrations. Oxygen had no effect on nifH expression in either
strain. Error bars represent s.d. from three independent
experiments.
[0041] FIGS. 26A-26B include diagrams describing the nitrogenase
activity assay. (FIG. 26A) Nitrogenase activity assay at constant
oxygen levels in the headspace. The experimental setup used in this
study to analyze oxygen tolerance of nitrogenase is shown. Following
induction of nitrogenase expression with preincubation under low
oxygen conditions, targeted oxygen concentrations in the headspace
are maintained by oxygen spiking while monitoring with an oxygen
monitoring system (Methods). (FIG. 26B) Nitrogenase activity in E.
coli MG1655 and P. protegens Pf-5 over a course of three hours.
[0042] FIG. 27 includes diagrams showing the effect of the rnf and
fix complex on nitrogenase activity. The modified nif clusters of
A. vinelandii on the plasmids pMR25-28 were analyzed in the
controller strain P. protegens Pf-5 MR9. The regions deleted from
the clusters are provided in Table 4. Nitrogenase was induced with
0.5 mM IPTG. Removing the rnf complex from the cluster abrogated
activity. The cluster without the fixABCX complex showed identical
oxygen tolerance to the cluster with the complex. Error bars
represent s.d. from three independent experiments.
[0043] FIGS. 28A-28C include diagrams showing regulation of
nitrogenase activity in E. coli MG1655 "Marionette" strain5. (FIG.
28A) Controller plasmids used to drive expression of T7 promoters.
(FIG. 28B) Inducibility of the T7 promoter by the controller
plasmids encoding T7 RNAP under the regulation of the 12 sensors
was tested with a reporter plasmid pMR121 (right). (FIG. 28C)
Inducible control of nitrogenase activity in response to 12
inducers was tested with the plasmid pMR136 (right) carrying the
refactored nif cluster v2.1 on pBBR1 origin. The choline-Cl
inducible system was omitted for activity assay as the system was
not inducible. For the DAPG-, DHBA-, and vanillic acid-inducible
system, the refactored cluster was carried on a lower copy number
plasmid pMR31 (right) as transformation of the plasmid pMR29 gave
rise to no colony formation. The inducer concentrations are: 400
.mu.M arabinose, 1 mM choline-Cl, 500 nM 3OC14HSL, 50 .mu.M cuminic
acid, 25 nM 3OC6HSL, 25 .mu.M DAPG, 500 .mu.M DHBA, 1 mM IPTG, 100
nM aTc, 250 .mu.M naringenin, 50 .mu.M vanillic acid, and 250 .mu.M
salicylic acid. Plasmids and genetic parts are provided in Table 4
and 5. Error bars represent s.d. from three independent
experiments.
[0044] FIG. 29 includes schematic plasmid maps used to assess the
effect of inducible expression of NifA/RpoN on the activity of the
nifH promoter in A. caulinodans ORS571.
[0045] FIGS. 30A-30B include diagrams showing the phylogenetic
relationships of 10 diazotrophs based on 16S ribosomal RNA
sequences. The scale bar indicates 2% substitutions per site. The
clusters based on evolutionary closeness are circled. FIG. 30B
shows the relative nitrogenase activity in three host strains (E.
coli, Pseudomonas protegens Pf-5, and Rhizobium sp. IRBG74)
carrying each of the 10 nif clusters. The result suggests that
phylogenetic closeness has predictive power for achieving the highest
nitrogenase activity in a new host that lacks an endogenous nif
cluster.
DETAILED DESCRIPTION OF THE INVENTION
[0046] Nitrogen fixation in the root nodules of leguminous plants
is a major contributor to world food production and therefore, the
practical applications of this field are of major interest. Legumes
obtain nitrogen from air through bacteria residing in root nodules,
some species of which also associate with cereals but do not fix
nitrogen under these conditions. Disabling native regulation can
turn on expression, even in the presence of nitrogenous fertilizer
and low O.sub.2, but continuous nitrogenase production confers an
energetic burden.
[0047] The present disclosure in some aspects describes the
surprising discovery that bacteria can be genetically altered in a
manner that will enable the bacteria to deliver fixed nitrogen to
cereal crops. Several strategies to implement control over nitrogen
fixation in bacteria that live on or inside the roots of cereals
are described. At least two approaches can be taken. In one
embodiment, the native regulation is replaced. In alternative
embodiments, a nif cluster is transferred from another species and
placed under inducible control. The Examples section below includes
a description of the achievement of these two approaches in
multiple species with multiple constructs. For example, in A.
caulinodans, ammonium-independent control can be achieved using a
sensor to drive the co-expression of a NifA mutant and RpoN in a
.DELTA.nifA strain. Rhizobium sp. IRBG74 can be engineered to
express functional nitrogenase under free living conditions either
by transferring a native nif cluster from Rhodobacter or a
refactored cluster from Klebsiella. Multiple approaches enable P.
protegens Pf-5 to express functional nitrogenase, of which the
transfer of the nif cluster from Azotobacter vinelandii DJ yields
the highest activity and O.sub.2 tolerance.
[0048] To date, it has not been shown that a Rhizobium strain can
be engineered to fix nitrogen under free-living conditions when it
does not do so naturally. Some Rhizobia isolated from legume root
nodules are also cereal endophytes; however, most are unable to fix
nitrogen under free-living conditions (outside of the nodule)
(Ramachandran, V. K., East, A. K., Karunakaran, R., Downie, J. A.
& Poole, P. S. Adaptation of Rhizobium leguminosarum to pea,
alfalfa and sugar beet rhizospheres investigated by comparative
transcriptomics. Genome biology 12, R106 (2011); Frans, J. et al.
in Nitrogen Fixation 33-44 (Springer, 1990)). There have been
reports of cereal yield improvements due to these bacteria,
including a 20% increase for rice by Rhizobium sp. IRBG74, but this
is likely due to other growth-promoting mechanisms, such as
improved nutrient uptake or root formation (Ramachandran, V. K.,
East, A. K., Karunakaran, R., Downie, J. A. & Poole, P. S.
Adaptation of Rhizobium leguminosarum to pea, alfalfa and sugar
beet rhizospheres investigated by comparative transcriptomics.
Genome biology 12, R106 (2011); Delmotte, N. et al. An integrated
proteomics and transcriptomics reference data set provides new
insights into the Bradyrhizobium japonicum bacteroid metabolism in
soybean root nodules. Proteomics 10, 1391-1400 (2010); Hoover, T.
R., Imperial, J., Ludden, P. W. & Shah, V. K. Homocitrate is a
component of the iron-molybdenum cofactor of nitrogenase.
Biochemistry 28, 2768-2771 (1989)). Azorhizobium caulinodans ORS571
is exceptional because it is able to fix nitrogen in both aerobic
free-living and symbiotic states, has been shown to be a rice and
wheat endophyte, and does not rely on plant metabolites to produce
functional nitrogenase. However, when Rhizobia or Azorhizobium
species are living in cereal roots, there is low nitrogenase
expression and .sup.15N.sub.2 transfer rates suggest any reported
uptake is due to bacterial death.
Cereal Crops, Nitrogen Fixation, and Bacteria
[0049] Cereal crops are broadly defined as any grass cultivated for
the edible components of its grain (also referred to as caryopsis),
composed of the endosperm, germ, and bran. Cereal crops are
considered staple crops in many parts of the world. They are grown
in greater quantities and provide more food energy worldwide than
any other type of crop. Non-limiting examples of cereal crops
include maize, rye, barley, wheat, sorghum, oats, millet and rice.
As used herein, the terms "cereal crop" and "cereal plant" are used
interchangeably.
[0050] Nitrogen fixation is the process by which atmospheric
nitrogen is assimilated into organic compounds as part of the
nitrogen cycle. The fixation of atmospheric nitrogen associated
with specific legumes is the result of a highly specific symbiotic
relationship with rhizobial bacteria. These indigenous bacteria
dwell in the soil and are responsible for the formation of nodules
in the roots of leguminous plants as sites for the nitrogen
fixation. Most Rhizobium symbioses are confined to leguminous
plants. Furthermore, Rhizobium strains which fix nitrogen in
association with the agriculturally-important temperate legumes are
usually restricted in their host range to a single legume
genus.
[0051] The nif genes are genes encoding enzymes involved in the
fixation of atmospheric nitrogen into a form of nitrogen available
to living organisms. The primary enzyme encoded by the nif genes is
the nitrogenase complex which converts atmospheric nitrogen
(N.sub.2) to other nitrogen forms (e.g. ammonia) which the organism
can process. As used herein, the term "nif cluster" refers to a
gene cluster comprising nif genes. As used herein, the term
"refactored" refers to an engineered gene cluster, i.e. its genes
have been reordered, deleted or altered in some way.
[0052] Rhizobia are diazotrophic bacteria. In general, they are
gram negative, motile, non-sporulating rods. In terms of taxonomy,
they fall into two classes: alphaproteobacteria and
betaproteobacteria. Non-limiting examples of rhizobia include
Azorhizobium caulinodans, Rhizobium (R.) sp. IRBG74, R. radiobacter,
R. rhizogenes, R. rubi, R. vitis, Alfalfa Rhizobia (R. meliloti),
Chickpea Rhizobia (Rhizobium sp.), Soybean Rhizobia (Bradyrhizobium
japonicum), Leucaena Rhizobia (Rhizobium sp.), R. leguminosarum bv.
trifolii, R. leguminosarum bv. phaseoli, and Rhizobium leguminosarum
bv. viciae (see for example U.S. Pat. No. 7,888,552, herein
incorporated by reference). In some embodiments, the rhizobia of
the present invention are Azorhizobium caulinodans. In some
embodiments, the rhizobia of the present invention are not
Azorhizobium caulinodans.
[0053] As used herein, the term "free-living conditions" refers to
a bacterium (e.g. rhizobium) that is not within a leguminous root
nodule. It generally refers to something that has not formed a
parasitic (or dependent) relationship with another organism or is
not on a substrate. As used herein, the term "symbiotic" refers to
the interaction between two organisms living in close proximity.
Close proximity can be about 0.2 .mu.m, 0.4 .mu.m, 0.6 .mu.m, 0.8
.mu.m, 1 .mu.m, 5 .mu.m, 10 .mu.m, 20 .mu.m, 50 .mu.m, 100 .mu.m,
500 .mu.m, 1 mm, 1 cm, 5 cm, 10 cm. Close proximity can also be
less than 0.2 .mu.m. In many cases, a symbiotic relationship refers
to a mutually beneficial interaction.
[0054] As used herein, "aerobic free-living conditions" refer to
conditions under which a bacterium is not within a leguminous root
nodule and the bacterium is in the presence of oxygen. Aerobic
free-living conditions can also be referred to as nonsymbiotic or
non-parasitic conditions in the presence of oxygen. The bacterium
can be in close proximity to a crop, as defined above.
[0055] As used herein, the term "endophyte" refers to a group of
organisms, often fungi and bacteria, that live within living plant
cells for at least part of their life cycle without having an
apparent detrimental effect on the plant cell. This contrasts
with an epiphyte, which is a plant that grows on another plant,
without being parasitic.
[0056] As used herein, the term "diazotroph" refers to
microorganisms that are able to grow without external sources of
fixed nitrogen. The group includes some bacteria and some archaea.
There are free-living and symbiotic diazotrophs. An example of a
free-living diazotroph is Klebsiella pneumoniae. K. pneumoniae is a
facultative anaerobe: it can grow either with or without oxygen,
but it only fixes nitrogen anaerobically.
[0057] As used herein, the term "Alphaproteobacteria" refers to a
diverse class of bacteria falling under the phylum Proteobacteria.
Non-limiting examples of Alphaproteobacteria include species
Rhodobacter sphaeroides and Rhodopseudomonas palustris. As used
herein, the term "Gammaproteobacteria" refers to another class of
bacteria falling under the phylum of Proteobacteria. All
proteobacteria are gram negative. As used herein, the term
"Cyanobacteria" refers to a phylum of bacteria that obtain their
energy through photosynthesis. They are also referred to as
Cyanophyta. They have characteristic internal membranes and
thylakoids, the latter being for photosynthetic purposes. As used
herein, the term "Firmicutes" refers to a phylum of bacteria. This
phylum includes the classes Bacilli, Clostridia, and
Thermolithobacteria.
Nif Genes
[0058] Typically, the genes necessary for nitrogen fixation occur
together in a gene cluster, including genes encoding the nitrogenase
subunits, metallocluster biosynthesis, electron transport, and
regulatory proteins. Nif genes are genes that encode the enzymes
involved in nitrogen fixation. In most cases nif genes occur as an
operon. Some of these genes encode the subunits for the nitrogenase
complex, which is the primary enzyme imparting the ability to
convert atmospheric nitrogen (N.sub.2) to forms of nitrogen
accessible to living organisms. In most cases, nif gene
transcription is regulated by the NifA protein, which is
responsive to nitrogen levels. When there are nitrogen deficits,
NtrC activates NifA expression, which in turn leads to the
activation of the remaining nif genes. When nitrogen levels are
adequate or in excess, the NifL protein, encoded by nifL, inhibits NifA
activity.
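The regulatory logic described in this paragraph can be sketched as a small Boolean model. This is a deliberately simplified illustration: real regulation is graded rather than on/off, and the function name and the optional parameters (nifL_present, nifA_constitutive) are hypothetical, introduced only to show how removing NifL inhibition and decoupling NifA expression from NtrC would change the outcome.

```python
def nif_transcription_on(nitrogen_limited, nifL_present=True,
                         nifA_constitutive=False):
    """Simplified Boolean view of NtrC -> NifA -> nif genes, with NifL inhibition."""
    # NtrC activates NifA expression only when fixed nitrogen is scarce,
    # unless NifA is expressed independently of NtrC (hypothetical option).
    nifA_expressed = nifA_constitutive or nitrogen_limited
    # NifL inhibits NifA activity when nitrogen is adequate or in excess.
    nifL_blocks = nifL_present and not nitrogen_limited
    # The remaining nif genes are transcribed only if NifA is present and active.
    return nifA_expressed and not nifL_blocks


print(nif_transcription_on(nitrogen_limited=True))   # True
print(nif_transcription_on(nitrogen_limited=False))  # False
print(nif_transcription_on(False, nifL_present=False,
                           nifA_constitutive=True))  # True
```

In this toy model, nitrogen-replete cells transcribe the nif genes only when both layers of native control are bypassed.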
[0059] Nif gene pathways are generally sensitive to small changes
in expression. Important genes include nifHDK, which form the
subunits for nitrogenase. The chaperone NifY is required to achieve
full activity and broadens the tolerance to changes in expression
level. NifJ and nif regulate electron transport. The nifUSVWZM
operon encodes proteins for early Fe--S cluster formation (NifUS)
and proteins for component maturation (NifVWZ for Component I and
NifM for Component II), whereas nifBQ encodes proteins for FeMo-co
core synthesis (NifB) and molybdenum integration (NifQ). NifEN is
tolerant to varied expression levels.
[0060] Exemplary sequences for various nif genes are provided in
Table 5. Non-limiting examples of nif genes include nifH, nifD,
nifK, nifE, nifN, nifU, nifS, nifV, nifW, nifX, nifB, nifQ, nifY,
nifT, nifJ, and nifF.
Nitrogen Fixation and Regulatory Elements
[0061] The nitrogen fixation (nif) genes are organized as genomic
clusters, ranging from a 10.5 kb single operon in Paenibacillus to
64 kb divided amongst three genomic locations in A. caulinodans.
Conserved genes include those encoding the nitrogenase enzyme
(nifHDK), FeMoCo biosynthesis, and chaperones. Species that can fix
nitrogen under more conditions tend to have larger gene clusters
that include environment-specific paralogues, alternative electron
transport routes, and oxygen protective mechanisms. Often, the
functions of many genes in the larger clusters are unknown.
[0062] There is evolutionary evidence for the lateral transfer of
nif clusters between species (Pascuan, C., Fox, A. R., Soto, G.
& Ayub, N. D. Exploring the ancestral mechanisms of regulation
of horizontally acquired nitrogenases. Journal of molecular
evolution 81, 84-89 (2015); Kechris, K. J., Lin, J. C., Bickel, P.
J. & Glazer, A. N. Quantitative exploration of the occurrence
of lateral gene transfer by using nitrogen fixation genes as a case
study. Proceedings of the National Academy of Sciences 103,
9584-9589 (2006)). However, achieving such a transfer via genetic
engineering poses a challenge as many things can go awry, including
differences in regulation, missing genes, and the intracellular
environment (Frans, J. et al. in Nitrogen Fixation 33-44 (Springer,
1990); Poudel, S. et al. Electron transfer to nitrogenase in
different genomic and metabolic backgrounds. Journal of
bacteriology 200, e00757-00717 (2018); Thony, B., Anthamatten, D.
& Hennecke, H. Dual control of the Bradyrhizobium japonicum
symbiotic nitrogen fixation regulatory operon fixR nifA: analysis
of cis- and trans-acting elements. Journal of bacteriology 171,
4162-4169 (1989); Han, Y. et al. Interspecies Transfer and
Regulation of Pseudomonas stutzeri A1501 Nitrogen Fixation Island
in Escherichia coli. Journal of microbiology and biotechnology 25,
1339-1348 (2015)). Nitrogenase is under stringent control because
it is oxygen sensitive and energetically expensive: it can make up
20% of the cell mass and each NH.sub.3 requires .about.40 ATP. It
is also irreversibly deactivated by oxygen. Across species,
transcription of nif genes is strongly repressed by fixed nitrogen
(ammonia) and oxygen with these signals converging on the NifA
regulatory protein that works in concert with the sigma factor
RpoN. Diverse, species-specific, and often poorly understood
signals control these regulators, including plant-produced
chemicals, ATP, reducing power, temperature, and carbon sources.
Those bacteria that can fix nitrogen in a wider range of
environmental conditions tend to be controlled by more complex
regulatory networks.
[0063] When a nif cluster is transferred from one species to
another, it either preserves its regulation by environmental
stimuli or has an unregulated constitutive phenotype. Maintaining
the native regulation, notably ammonium repression, limits the use of
such clusters in agriculture because ammonium levels are likely to fluctuate
according to soil types, irrigation, and fertilization.
Nitrogen-fixing diazotrophs have been engineered to reduce ammonia
sensitivity by disrupting NifL or mutating NifA and placing the
entire cluster under the control of T7 RNA polymerase (RNAP).
Constitutive expression of nitrogenase is also undesirable as it
imparts a fitness burden on the cells. For example, when the nif
cluster from P. stutzeri A1501 was transferred to P. protegens
Pf-5, this was reported to result in sufficient ammonia production
to support maize and wheat growth, but the bacteria quickly
declined after a month when competing with other species in soil.
Constitutive activity is detrimental even before the bacteria are
introduced to the soil, impacting production, formulation, and
long-term storage. Therefore, uncontrolled nitrogenase production
could lead to more expensive production, shorter shelf life, and
more in-field variability.
[0064] An important aspect of the nif clusters or nif genes of the
present disclosure is that they can each be under the control of a
regulatory element. In some embodiments, 2 or more genes are under
the control of a regulatory element. In some embodiments, all the
genes are under the control of a regulatory element. The regulatory
elements may also be activation elements or inhibitory elements. An
activation element is a nucleic acid sequence that when presented
in context with a nucleic acid to be expressed will cause
expression of the nucleic acid in the presence of an activation
signal. An inhibitory element is a nucleic acid sequence that when
presented in context with a nucleic acid to be expressed will cause
expression of the nucleic acid unless an inhibitory signal is
present. Each of the activation and inhibitory elements may be a
promoter, such as a bacteriophage T7 promoter, sigma 70 promoter,
sigma 54 promoter, lac promoter, etc. As used herein, the term
"promoter" is intended to refer to those regulatory sequences which
are sufficient to enable the transcription of an operably linked
DNA molecule. Promoters may be constitutive or inducible. As used
herein, the term "constitutive promoter" refers to a promoter that
is always on (i.e. causing transcription at a constant level).
Examples of constitutive promoters include, without limitation,
sigma 70 promoter, bla promoter, lacI. promoter, etc. Non-limiting
examples of inducible promoters are shown in Table 1. The
P.sub.A1lacO1 promoter is another example of an inducible promoter
that can be used in the present invention.
TABLE-US-00001 TABLE 1 Examples of regulatory elements (e.g.
inducible promoters, repressors).
Name | Chemical inducer and/or repressor | Essential regulatory gene(s)
ParaBAD ("PBAD") | L-arabinose (ON) & glucose (OFF) | araC
PrhaBAD | L-rhamnose (ON) & glucose (OFF) | rhaR & rhaS
Plac | lactose or IPTG (ON) & glucose (OFF) | lacI
Ptac | lactose or IPTG (ON) | lacI
Plux | acyl-homoserine lactone (ON) | luxR
Ptet | tetracycline or aTc (ON) | tetR
Psal | salicylate (ON) | nahR
Ptrp | tryptophan (OFF) | (NONE)
Ppho | phosphate (OFF) | phoB & phoR
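For illustration, the ON/OFF behavior listed in Table 1 can be encoded as a lookup and queried for a given set of compounds. The sketch assumes, as a simplification, that any listed OFF compound overrides an ON compound and that promoters listed with only an OFF compound are active by default; the names TABLE_1 and promoter_is_on are hypothetical.

```python
# Table 1 as data: promoter -> (compounds that turn it ON, compounds that turn it OFF)
TABLE_1 = {
    "ParaBAD": ({"L-arabinose"}, {"glucose"}),
    "PrhaBAD": ({"L-rhamnose"}, {"glucose"}),
    "Plac": ({"lactose", "IPTG"}, {"glucose"}),
    "Ptac": ({"lactose", "IPTG"}, set()),
    "Plux": ({"acyl-homoserine lactone"}, set()),
    "Ptet": ({"tetracycline", "aTc"}, set()),
    "Psal": ({"salicylate"}, set()),
    "Ptrp": (set(), {"tryptophan"}),
    "Ppho": (set(), {"phosphate"}),
}


def promoter_is_on(name, present):
    """Return True if the named promoter is expected to be active."""
    on_signals, off_signals = TABLE_1[name]
    present = set(present)
    # Assumption: an OFF compound dominates any ON compound.
    if present & off_signals:
        return False
    # Promoters listed with only an OFF compound are treated as on by default.
    if not on_signals:
        return True
    return bool(present & on_signals)


print(promoter_is_on("Plac", {"IPTG"}))             # True
print(promoter_is_on("Plac", {"IPTG", "glucose"}))  # False
print(promoter_is_on("Ptrp", set()))                # True
```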
[0065] Inducible promoters allow regulation of gene expression and
can be regulated by exogenously supplied compounds, environmental
factors such as temperature, or the presence of a specific
physiological state, e.g., acute phase, a particular
differentiation state of the cell, or in replicating cells only.
Inducible promoters and inducible systems are available from a
variety of commercial sources, including, without limitation,
Invitrogen, Clontech and Ariad. Many other systems have been
described and can be readily selected by one of skill in the art.
Examples of inducible promoters regulated by exogenously supplied
compounds include the zinc-inducible sheep metallothionein (MT)
promoter, the dexamethasone (Dex)-inducible mouse mammary tumor
virus (MMTV) promoter, the T7 polymerase promoter system [WO
98/10088]; the ecdysone insect promoter [No et al, Proc. Natl.
Acad. Sci. USA, 93:3346-3351 (1996)], the tetracycline-repressible
system [Gossen et al, Proc. Natl. Acad. Sci. USA, 89:5547-5551
(1992)], the tetracycline-inducible system [Gossen et al, Science,
268:1766-1769 (1995), see also Harvey et al, Curr. Opin. Chem.
Biol., 2:512-518 (1998)], the RU486-inducible system [Wang et al,
Nat. Biotech., 15:239-243 (1997) and Wang et al, Gene Ther.,
4:432-441 (1997)] and the rapamycin-inducible system [Magari et al,
J. Clin. Invest., 100:2865-2872 (1997)]. Still other types of
inducible promoters which may be useful in this context are those
which are regulated by a specific physiological state, e.g.,
temperature, acute phase, a particular differentiation state of the
cell, or in replicating cells only.
[0066] As used herein, the term "terminator" (also referred to as a
transcription terminator) is a section of nucleic acid sequence
that marks the end of a gene or operon in genomic DNA during
transcription. Terminators stop transcription by a polymerase.
Terminators can be classified into several groups. In the first group
of termination signals, the core enzyme can terminate at certain
sites in the absence of any other factors (as tested in
vitro). These sites of termination are called intrinsic terminators
or also class I terminators. Intrinsic terminators usually share
one common structural feature, the so called hairpin or stem-loop
structure. On the one hand the hairpin comprises a stem structure,
encoded by a dG-dC rich sequence of dyad symmetrical structure. On
the other hand the terminator also exhibits a dA-dT rich region at
the 3'-end directly following the stem structure. The uridine rich
region at the 3' end is thought to facilitate transcript release
when RNA polymerase pauses at hairpin structures. Two or more
terminators can be operatively linked if they are positioned relative
to each other to provide concerted termination of a preceding coding
sequence. Particularly preferred, the terminator sequences are
downstream of coding sequences, i.e. on the 3' position of the
coding sequence. The terminator can e.g. be at least 1, at least
10, at least 30, at least 50, at least 100, at least 150, at least
200, at least 250, at least 300, at least 400, at least 500
nucleotides downstream of the coding sequence or directly adjacent.
Examples of terminators include, but are not limited to, T7
terminator, rrnBT1, L3S2P21, tonB, rrnA, rrnB, rrnD, RNAI, crp,
his, ilv, lambda, M13, rpoC, and trp (see, for example, U.S. Pat. No.
9,745,588, incorporated herein by reference).
[0067] RpoN
[0068] As used herein "RpoN" refers to a gene that encodes the
sigma factor sigma-54 (.sigma.54, sigma N, or RpoN), a protein in
Escherichia coli and other species of bacteria. Sigma factors are
initiation factors that promote attachment of RNA polymerase to
specific initiation sites and are then released. Bacteria normally
only have one functional copy of the alternative sigma factor,
.sigma.54 or RpoN, which regulates a complex genetic network that
extends into various facets of bacterial physiology, including
metabolism, survival in strenuous environments, production of
virulence factors, and formation of biofilms. RpoN is one of seven
RNA polymerase sigma subunits in E. coli required for
promoter-initiated transcription and RpoN plays a major role in the
response of E. coli to nitrogen-limiting conditions. Under such
conditions, RpoN directs the transcription of at least 14 E. coli
operons/regulators in the nitrogen regulatory (Ntr) response. RpoN
also plays an important role in stress resistance (e.g. resistance
to osmotic stress) and virulence of bacteria. RpoN is structurally
and functionally distinct from the other E. coli .sigma. factors.
It is able to bind promoter DNA in the absence of core RNA
polymerase and it recognizes promoter sequences with conserved GG
and GC elements located -24 to -12 nucleotides upstream of the
transcription start site. Additionally, regulatory proteins such as
NtrB and NtrC can activate the .sigma.54 holoenzyme.
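As an illustration of the promoter architecture described above, the following Python sketch scans a DNA string for candidate .sigma.54-type promoter cores. It assumes that the GG and GC dinucleotides are separated by exactly 10 nucleotides, which is one common reading of the -24/-12 arrangement. The spacing, the function name, and the toy sequence are assumptions of this sketch and are not taken from the disclosure. A match is only a candidate and does not show that a site is functional.

```python
import re

# Assumed core pattern: GG, ten arbitrary bases, then GC.
# The lookahead lets overlapping candidates be reported.
SIGMA54_CORE = re.compile(r"(?=(GG[ACGT]{10}GC))")


def find_sigma54_candidates(sequence):
    """Return (start_index, matched_text) for each candidate core in sequence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in SIGMA54_CORE.finditer(seq)]


if __name__ == "__main__":
    toy = "AAATGGCACGATTCTTGCATTT"  # hypothetical sequence, for illustration only
    for start, core in find_sigma54_candidates(toy):
        print(start, core)
```

Indices are 0-based positions in the input string. Converting them to coordinates relative to a transcription start site would require the start site to be known.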
[0069] Without being bound by theory or mechanism, it is believed
that RpoN works in concert with NifA to turn on the transcription
of nif clusters. An exemplary sequence for RpoN is provided in
Table 5.
Gene Cluster Nucleic Acids
[0070] As used herein, a "gene cluster" or "genetic cluster" refers
to a set of two or more genes that encode gene products. A target,
naturally occurring, or wild type genetic cluster can be used as
the original model for refactoring. In some embodiments, the gene
products are enzymes. In some embodiments, the gene products of a
genetic cluster function in a biosynthetic pathway. In some
embodiments, the gene cluster encodes proteins of the nif nitrogen
fixation pathway.
[0071] The genetic clusters can encode proteins of a biosynthetic
pathway. A biosynthetic pathway, as used herein, refers to any
pathway found in a biological system that involves more than one
protein. In some instances, these pathways involve 2-1,000
proteins. In other instances the number of proteins involved in a
biosynthetic pathway may be 2-500, 2-100, 5-1000, 5-500, 5-100,
5-10, 10-1,000, 10-900, 10-800, 10-700, 10-600, 10-500, 10-400,
10-300, 10-200, 10-100, 50-1,000, 50-500, 50-100, 100-1,000, or
100-500. Examples of biosynthetic pathways include but are not
limited to the nitrogen fixation pathway.
[0072] In some instances, the refactored genetic clusters have
naturally occurring non-coding DNA, naturally occurring regulatory
sequences, and/or non-essential genes that have been removed from
at least one or in some instances all of the transcriptional units.
These can be replaced by synthetic regulatory sequences, not
replaced at all or replaced by spacers. A spacer simply refers to a
set of nucleotides or analogs thereof that do not have a function
such as coding for a protein or in any way regulating the activity
of the gene cluster.
[0073] The genetic components in the genetic cluster typically will
include at least one regulatory element. A synthetic regulatory
element is any nucleic acid sequence which plays a role in
regulating gene expression and which differs from the naturally
occurring regulatory element. It may differ for instance by a
single nucleotide from the naturally occurring element. In some
cases, it is an exogenous regulatory element (i.e. not identical to
the naturally occurring version). Thus, a "regulatory element"
refers to a nucleic acid having nucleotide sequences that influence
transcription or translation initiation or rate, or stability
and/or mobility of a transcription or translation product.
Regulatory regions include, without limitation, promoter sequences,
ribosome binding sites, ribozymes, enhancer sequences, response
elements, protein recognition sites, inducible elements, protein
binding sequences, 5' and 3' untranslated regions (UTRs),
transcriptional start sites, transcription terminator sequences,
polyadenylation sequences, introns, and combinations thereof.
[0074] The genetic clusters can be expressed in vivo in an organism
or in vitro in a cell. The organism or cell can be any organism or
cell in which a DNA can be introduced. For example, organisms and
cells can include prokaryotes and eukaryotes (e.g., yeast, plants).
Prokaryotes include but are not limited to Cyanobacteria, Bacillus
subtilis, E. coli, Clostridium, and Rhodococcus. Eukaryotes
include, for instance, algae (Nannochloropsis), yeast such as S.
cerevisiae and P. pastoris, plant cells, and mammalian cells. Thus,
some aspects of this disclosure relate to engineering of a cell to
express proteins from the modified genetic clusters.
[0075] In some embodiments, the present disclosure provides a
genetic cluster that includes a nucleotide sequence that is at
least about 85% homologous or identical to the entire length of a
naturally occurring genetic cluster sequence, or to a portion
thereof (e.g., at least 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%
or more of the full length naturally occurring genetic cluster
sequence). In some embodiments,
the nucleotide sequence is at least about 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99% or 100% homologous or identical to a
naturally occurring genetic cluster sequence. In some embodiments,
the nucleotide sequence is at least about 85%, e.g., is at least
about 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100%
homologous or identical to a genetic cluster sequence, in a
fragment thereof or a region that is much more conserved, such as
an essential, but has lower sequence identity outside that region.
The disclosure also provides a nucleotide sequence that is at least
about 85%, e.g., is at least about 90%, 91%, 92%, 93%, 94%, 95%,
96%, 97%, 98%, 99% or 100% identical to any nucleotide sequence as
described herein or an amino acid sequence that is at least about
85%, e.g., is at least about 90%, 91%, 92%, 93%, 94%, 95%, 96%,
97%, 98%, 99% or 100% identical to any amino acid sequence as
described herein.
[0076] Calculations of homology or sequence identity between
sequences (the terms are used interchangeably herein) are performed
as follows. To determine the percent identity of two nucleic acid
sequences, the sequences are aligned for optimal comparison
purposes (e.g., gaps can be introduced in one or both of a first
and a second amino acid or nucleic acid sequence for optimal
alignment and non-homologous sequences can be disregarded for
comparison purposes). The length of a reference sequence aligned
for comparison purposes is at least 80% of the length of the
reference sequence, and in some embodiments is at least 90% or
100%. The nucleotides at corresponding amino acid positions or
nucleotide positions are then compared. When a position in the
first sequence is occupied by the same nucleotide as the
corresponding position in the second sequence, then the molecules
are identical at that position (as used herein nucleic acid
"identity" is equivalent to nucleic acid "homology"). The percent
identity between the two sequences is a function of the number of
identical positions shared by the sequences, taking into account
the number of gaps, and the length of each gap, which need to be
introduced for optimal alignment of the two sequences.
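The identity calculation described above can be sketched in Python as follows. This is a minimal illustration that assumes the two sequences have already been aligned (for example by a separate alignment tool) and that gaps are written as "-". The function names, and the choice to count a position gapped in only one sequence as a non-identical position, are assumptions of this sketch rather than a procedure stated in the disclosure.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of compared alignment columns that hold the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # a column gapped in both sequences carries no information
        compared += 1
        if a == b:
            matches += 1  # a gap opposite a residue never matches
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def reference_coverage(aligned_ref, aligned_query, gap="-"):
    """Fraction of reference residues that are aligned opposite a query residue."""
    ref_length = sum(1 for r in aligned_ref if r != gap)
    if ref_length == 0:
        raise ValueError("reference sequence is empty")
    aligned = sum(1 for r, q in zip(aligned_ref, aligned_query)
                  if r != gap and q != gap)
    return aligned / ref_length


if __name__ == "__main__":
    ref, query = "ATGCTA", "ATGC-A"  # hypothetical aligned pair
    print(percent_identity(ref, query), reference_coverage(ref, query))
```

The second function gives one way to check the stated requirement that at least 80% of the reference sequence length be aligned before the identity value is used.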
[0077] In some embodiments the gene clusters are native gene
clusters. In some embodiments, the gene clusters are refactored
gene clusters. In some instances, the nucleic acids may include
non-naturally occurring nucleotides and/or substitutions, i.e.,
sugar or base substitutions or modifications.
[0078] One or more substituted sugar moieties include, e.g., one of
the following at the 2' position: OH, SH, SCH3, F, OCN, OCH3OCH3,
OCH3O(CH2)n CH3, O(CH2)n NH2 or O(CH2)n CH3 where n is from 1 to
about 10; C1 to C10 lower alkyl, alkoxyalkoxy, substituted lower
alkyl, alkaryl or aralkyl; Cl; Br; CN; CF3; OCF3; O-, S-, or
N-alkyl; O-, S-, or N-alkenyl; SOCH3; SO2 CH3; ONO2; NO2; N3; NH2;
heterocycloalkyl; heterocycloalkaryl; aminoalkylamino;
polyalkylamino; substituted silyl; an RNA cleaving group; a
reporter group; an intercalator; a group for improving the
pharmacokinetic properties of a nucleic acid; or a group for
improving the pharmacodynamic properties of a nucleic acid and
other substituents having similar properties. Similar modifications
may also be made at other positions on the nucleic acid,
particularly the 3' position of the sugar on the 3' terminal
nucleotide and the 5' position of 5' terminal nucleotide. Nucleic
acids may also have sugar mimetics such as cyclobutyls in place of
the pentofuranosyl group.
[0079] Nucleic acids can also include, additionally or
alternatively, nucleobase (often referred to in the art simply as
"base") modifications or substitutions. As used herein,
"unmodified" or "natural" nucleobases include adenine (A), guanine
(G), thymine (T), cytosine (C) and uracil (U). Modified nucleobases
include nucleobases found only infrequently or transiently in
natural nucleic acids, e.g., hypoxanthine, 6-methyladenine, 5-Me
pyrimidines, particularly 5-methylcytosine (also referred to as
5-methyl-2' deoxycytosine and often referred to in the art as
5-Me-C), 5-hydroxymethylcytosine (HMC), glycosyl HMC and
gentobiosyl HMC, isocytosine, pseudoisocytosine, as well as
synthetic nucleobases, e.g., 2-aminoadenine,
2-(methylamino)adenine, 2-(imidazolylalkyl)adenine,
2-(aminoalkylamino)adenine or other heterosubstituted
alkyladenines, 2-thiouracil, 2-thiothymine, 5-bromouracil,
5-hydroxymethyluracil, 5-propynyluracil, 8-azaguanine,
7-deazaguanine, N6 (6-aminohexyl)adenine, 6-aminopurine,
2-aminopurine, 2-chloro-6-aminopurine and 2,6-diaminopurine or
other diaminopurines. See, e.g., Kornberg, "DNA Replication," W. H.
Freeman & Co., San Francisco, 1980, pp 75-77; and Gebeyehu,
G., et al., Nucl. Acids Res., 15:4513 (1987). A "universal" base
known in the art, e.g., inosine, can also be included.
[0080] Methods to deliver expression vectors or expression
constructs into cells, for example, into bacteria, yeast, or plant
cells, are well known to those of skill in the art. Nucleic acids,
including expression vectors, can be delivered to prokaryotic and
eukaryotic cells by various methods well known to those of skill in
the relevant biological arts. Methods for the delivery of nucleic
acids to a cell, include, but are not limited to, different
chemical, electrochemical and biological approaches, for example,
heat shock transformation, electroporation, transfection, for
example liposome-mediated transfection, DEAE-Dextran-mediated
transfection or calcium phosphate transfection. In some
embodiments, a nucleic acid construct, for example an expression
construct comprising a fusion protein nucleic acid sequence, is
introduced into the host cell using a vehicle, or vector, for
transferring genetic material. Vectors for transferring genetic
material to cells are well known to those of skill in the art and
include, for example, plasmids, artificial chromosomes, and viral
vectors. Methods for the construction of nucleic acid constructs,
including expression constructs comprising constitutive or
inducible heterologous promoters, knockout and knockdown
constructs, as well as methods and vectors for the delivery of a
nucleic acid or nucleic acid construct to a cell are well known to
those of skill in the art, and are described, for example, in J.
Sambrook and D. Russell, Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Laboratory Press; 3rd edition (Jan. 15, 2001);
David C. Amberg, Daniel J. Burke, and Jeffrey N. Strathern, Methods
in Yeast Genetics: A Cold Spring Harbor Laboratory Course Manual,
Cold Spring Harbor Laboratory Press (April 2005); John N. Abelson,
Melvin I. Simon, Christine Guthrie, and Gerald R. Fink, Guide to
Yeast Genetics and Molecular Biology, Part A, Volume 194 (Methods
in Enzymology Series, 194), Academic Press (Mar. 11, 2004);
Christine Guthrie and Gerald R. Fink, Guide to Yeast Genetics and
Molecular and Cell Biology, Part B, Volume 350 (Methods in
Enzymology, Vol 350), Academic Press; 1st edition (Jul. 2, 2002);
Christine Guthrie and Gerald R. Fink, Guide to Yeast Genetics and
Molecular and Cell Biology, Part C, Volume 351, Academic Press; 1st
edition (Jul. 9, 2002); Gregory N. Stephanopoulos, Aristos A.
Aristidou and Jens Nielsen, Metabolic Engineering: Principles and
Methodologies, Academic Press; 1st edition (Oct. 16, 1998); and
Christina Smolke, The Metabolic Pathway Engineering Handbook:
Fundamentals, CRC Press; 1st edition (Jul. 28, 2009), all of which
are incorporated by reference herein.
Phylogenetic Analysis
[0081] The present disclosure also provides methods of selecting a
nif cluster of a donor bacterium that is compatible with a host
bacterium. The methods involve performing a phylogenetic analysis
for the donor bacterium and the host bacterium.
[0082] A phylogenetic analysis is a method of estimating
evolutionary relationships. In molecular phylogenetic analysis, the
sequence of a common gene or protein can be used to assess the
evolutionary relationship of species. In some embodiments,
phylogenetic analysis is performed based on rRNA sequences (e.g.,
the full-length 16S rRNA gene). These sequences include, e.g.,
K. oxytoca, BWI76_05380; A. vinelandii, Avin_55000; R. sphaeroides,
DQL45_00005; Cyanothece ATCC51142, cce_RNA045; A. brasilense,
AMK58_25190; R. palustris, RNA_55; P. protegens, PST_0759;
Paenibacillus sp. WLY78, JQ003557. In some embodiments, a multiple
sequence alignment can be generated using MUSCLE (Edgar, R. C.,
MUSCLE: multiple sequence alignment with high accuracy and high
throughput, Nucleic Acids Res., 32:1792-1797 (2004)). A
phylogenetic tree is then constructed using the Jukes-Cantor
distance model and UPGMA as the tree-building method.
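The tree construction described above can be sketched in Python as follows. This is a minimal illustration, not the software used in the disclosure: it assumes the sequences are already aligned, uses the standard Jukes-Cantor correction d = -(3/4) ln(1 - 4p/3), where p is the fraction of differing ungapped sites, and returns the UPGMA topology as nested tuples without branch lengths. The function names and the three toy sequences are hypothetical and are not real 16S rRNA data.

```python
import math
from itertools import combinations


def jukes_cantor_distance(seq_a, seq_b, gap="-"):
    """Jukes-Cantor distance (substitutions per site) for two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
               if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    p = sum(a != b for a, b in columns) / len(columns)
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor correction")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def pairwise_distances(aligned):
    """Map frozenset({name_a, name_b}) to the Jukes-Cantor distance."""
    return {frozenset((a, b)): jukes_cantor_distance(aligned[a], aligned[b])
            for a, b in combinations(aligned, 2)}


def upgma(names, distances):
    """Return the UPGMA topology as nested tuples of the given names."""
    d = dict(distances)
    sizes = {name: 1 for name in names}  # cluster label -> number of leaves
    while len(sizes) > 1:
        # Join the two clusters with the smallest average distance.
        a, b = min(combinations(sizes, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        merged = (a, b)
        for other in sizes:
            # Size-weighted mean keeps the distance an average over all leaves.
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))]
                + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return next(iter(sizes))


if __name__ == "__main__":
    toy = {"A": "ACGTACGTAC", "B": "ACGTACGTAT", "C": "ACGAACCTTT"}
    print(upgma(list(toy), pairwise_distances(toy)))
```

The pairwise distances produced here are in substitutions per site, so they can be compared directly against a per-site cutoff when screening donor and host pairs.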
[0083] As shown in FIGS. 30A and 30B, phylogenetic closeness has
predictive power for the nitrogenase activity obtained when a nif
cluster is transferred into a new host. In some embodiments, the
host bacterium and the donor bacterium are in the same genus,
family, order, or class. In some embodiments, the donor bacterium
is selected from the genus Klebsiella, Pseudomonas, Azotobacter,
Gluconacetobacter, Azospirillum, Azorhizobium, Rhodopseudomonas,
Rhodobacter, Cyanothece, or Paenibacillus.
[0084] In some embodiments, the evolutionary distance based on the
phylogenetic analysis between the donor bacterium and the host
bacterium is less than a reference value. For example, the
reference value is 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, 0.9%,
0.8%, 0.7%, 0.6%, 0.5%, 0.4%, 0.3%, 0.2%, or 0.1% substitutions per
site in the 16S ribosomal RNA gene sequence. In some embodiments, the
reference value is 500, 400, 300, 200, 100, 50, or 10 million years
on the phylogenetic tree.
[0085] The methods can also involve transferring the nif cluster to
the host bacterium and determining the nitrogenase activity.
Genetically Modified Plants
[0086] In some embodiments, this disclosure features
genetically-modified plant cells and plants (e.g.,
genetically-modified cereal plants or cells) comprising at least
one recombinant nucleic acid construct described herein. In some
embodiments, a nucleic acid construct can encode, for example, a
chemical signal synthesis peptide, operably linked in sense
orientation to one or more regulatory regions. In some embodiments,
the chemical signal synthesis peptide is an opine biosynthetic
polypeptide (e.g., from A. tumefaciens) such as octopine synthase
or nopaline synthase. In some embodiments, the chemical signal
synthesis peptide can produce a chemical signal to which the
genetically engineered bacterium can respond. It will be
appreciated that because of the degeneracy of the genetic code, a
number of nucleic acids can encode a particular opine biosynthetic
polypeptide; i.e., for many amino acids, there is more than one
nucleotide triplet that serves as the codon for the amino acid.
Thus, codons in the coding sequence for a given opine biosynthetic
polypeptide can be modified such that expression in a particular
plant species is obtained, using appropriate codon bias tables for
that species.
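The degeneracy argument above can be made concrete with a short Python sketch. It builds the standard genetic code, lists the synonymous codons for each amino acid, and re-encodes a peptide using one chosen codon per amino acid. The preference table used here simply takes the first synonymous codon in table order. It is a placeholder assumption, not codon-usage data for any plant species, and a real application would substitute a species-specific codon bias table. All function and variable names are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Amino acid -> list of codons that encode it.
SYNONYMOUS = defaultdict(list)
for codon, aa in CODON_TABLE.items():
    SYNONYMOUS[aa].append(codon)

# Placeholder preference: first synonymous codon, NOT real usage data.
PLACEHOLDER_PREFERENCE = {aa: codons[0] for aa, codons in SYNONYMOUS.items()}


def translate(dna):
    """Translate a coding DNA string whose length is a multiple of three."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def recode(protein, preferred):
    """Encode a protein using one preferred codon per amino acid."""
    codons = []
    for aa in protein.upper():
        codon = preferred[aa]
        if CODON_TABLE.get(codon) != aa:
            raise ValueError(f"{codon} does not encode {aa}")
        codons.append(codon)
    return "".join(codons)


if __name__ == "__main__":
    peptide = "MKVL"  # hypothetical peptide, for illustration only
    dna = recode(peptide, PLACEHOLDER_PREFERENCE)
    print(dna, translate(dna), len(SYNONYMOUS["L"]))
```

Because every codon in the preference table is checked against the genetic code, the recoded sequence always translates back to the original peptide, whichever synonymous codons are chosen.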
[0087] In some cases, the regulatory region is a constitutive
promoter. In some cases, the regulatory region is an inducible
promoter. In some cases, the regulatory region is a root-active
promoter that can confer transcription in root tissue, e.g., root
endodermis, root epidermis, or root vascular tissues. In some
embodiments, root-active promoters can include the root-specific
subdomains of the CaMV 35S promoter (Lam et al., Proc. Natl. Acad.
Sci. USA, 86:7890-7894 (1989)), root cell specific promoters of
Conkling et al., Plant Physiol., 93:1203-1211 (1990), or the
tobacco RD2 promoter.
[0088] As described herein, a cereal plant or plant cell can be
transformed by having a nucleic acid construct integrated into its
genome, i.e., can be stably transformed. Stably transformed cells
typically retain the introduced nucleic acid with each cell
division. A plant or plant cell can also be transiently transformed
such that the construct is not integrated into its genome.
Transiently transformed cells typically lose all or some portion of
the introduced nucleic acid construct with each cell division such
that the introduced nucleic acid cannot be detected in daughter
cells after a sufficient number of cell divisions. Both transiently
transformed and stably transformed transgenic plants and plant
cells can be useful in the methods described herein.
[0089] Genetically modified plant cells used in methods described
herein can constitute part or all of a whole plant. Such plants can
be grown in a manner suitable for the species under consideration,
either in a growth chamber, a greenhouse, or in a field. As used
herein, a genetically modified plant also refers to progeny of an
initial engineered plant provided the progeny inherits the
construct. Seeds produced by a modified plant can be grown and then
selfed (or outcrossed and selfed) to obtain seeds homozygous for
the nucleic acid construct.
[0090] Modified plants can be grown in suspension culture, or
tissue or organ culture. When using solid medium, modified plant
cells can be placed directly onto the medium or can be placed onto
a filter that is then placed in contact with the medium. When using
liquid medium, modified plant cells can be placed onto a flotation
device, e.g., a porous membrane that contacts the liquid medium. A
solid medium can be, for example, Murashige and Skoog (MS) medium
containing agar and a suitable concentration of an auxin, e.g.,
2,4-dichlorophenoxyacetic acid (2,4-D), and a suitable
concentration of a cytokinin, e.g., kinetin.
[0091] When transiently transformed plant cells are used, a
reporter sequence encoding a reporter polypeptide having a reporter
activity can be included in the transformation procedure and an
assay for reporter activity or expression can be performed at a
suitable time after transformation. A suitable time for conducting
the assay typically is about 1-21 days after transformation, e.g.,
about 1-14 days, about 1-7 days, or about 1-3 days. The use of
transient assays is particularly convenient for rapid analysis in
different species, or to confirm expression of a polypeptide whose
expression has not previously been confirmed in particular
recipient cells.
[0092] Techniques for introducing nucleic acids into plants are
known and include, without limitation, Agrobacterium-mediated
transformation, viral vector-mediated transformation,
electroporation and particle gun transformation, e.g., U.S. Pat.
Nos. 5,538,880; 5,204,253; 6,329,571 and 6,013,863. If a cell or
cultured tissue is used as the recipient tissue for transformation,
plants can be regenerated from transformed cultures if desired, by
techniques known to those skilled in the art.
[0093] A population of modified plants can be screened and/or
selected for those members of the population that produce a
chemical signal as described (e.g., an opine) at a desired location
(e.g., in the roots) as conferred by expression of the transgene.
For example, a population of progeny of a single transformation
event can be screened for those plants having a desired level of
expression of a polypeptide or nucleic acid.
Control of Nitrogenase Activity
[0094] In some embodiments, the disclosure provides a genetically
engineered bacterium that contains a regulatory sequence or a
genetic sensor that regulates the nitrogenase activity in response
to a chemical signal (e.g., an environmental signal or artificial
signal). In some embodiments, the chemical signal can be an
environmental signal such as ammonia, IPTG, or oxygen. In some
embodiments, the nif cluster is placed under the control of a
genetic sensor that can respond to the chemical signal. In some
embodiments, the genetic sensor can respond to biocontrol agents or
components of added fertilizer and other treatments (e.g., DAPG).
In some embodiments, the genetic sensor can respond to root
exudates from a plant, including e.g., sugars such as arabinose,
hormones such as salicylic acid, flavonoids such as naringenin,
antimicrobials such as vanillic acid, and various chemicals that
can remodel the microbial community (e.g., cuminic acid). In some
embodiments, the genetic sensor can respond to chemicals released
by other bacteria including e.g., 3,4-dihydroxybenzoic acid (DHBA),
3OC6HSL or 3OC14HSL.
[0095] In some embodiments, sensors for chemicals are used to
construct controllers. In some embodiments, a "Marionette" strain
of E. coli, which includes sensors for e.g., vanillic acid, DHBA,
cuminic acid, 3OC6HSL and 3OC14HSL in the genome, is used to host
the nif cluster. In some embodiments, the output promoter of a
sensor is used to express T7 RNA polymerase. In some embodiments,
the arabinose and naringenin sensors are used to express NifA,
which leads to the induction of the nifH promoter and nitrogenase
activity. In some embodiments, the arabinose and naringenin sensors
are used to express NifA, which leads to the induction of the nifH
promoter and nitrogenase activity in P. protegens Pf-5. In some
embodiments, the DAPG sensor is used to drive T7 RNAP, which then
induces nitrogenase activity. In some embodiments, the DAPG sensor
is used to drive T7 RNAP, which then induces nitrogenase activity
in R. sp. IRBG74. In some embodiments, the salicylic acid sensor is
used to control NifA.sup.L94Q/D95Q/RpoN expression, which then
activates nitrogenase activity. In some embodiments, the salicylic
acid sensor is used to control NifA.sup.L94Q/D95Q/RpoN expression,
which then activates nitrogenase activity in A. caulinodans.
[0096] In some embodiments, a plant is engineered to release an
orthogonal chemical signal that can be sensed by a corresponding
engineered bacterium. This would have the benefit of only inducing
nitrogenase in the presence of the engineered crop. In some
embodiments, legumes and Arabidopsis are engineered to produce
opines, including nopaline and octopine. In some embodiments, an
engineered bacterium contains sensors for nopaline and octopine. In
some embodiments, an engineered bacterium contains the LysR-type
transcriptional activators OccR (octopine) and NocR (nopaline) and
their corresponding promoters. In some embodiments, sensors for
nopaline and octopine are used to control the expression of
NifA.sup.L94Q/D95Q/RpoN, which then activates nitrogenase
activity.
[0097] The present disclosure also provides methods of selecting a
nif cluster or a regulatory element for the nif cluster. The
methods involve calculation of genetic-part strengths based on
sequencing data. In some embodiments, RNA-seq and
ribosome-footprint profiling are carried out according to the
method described herein. In some embodiments, to generate the
RNA-seq read profile for each nif cluster, the raw trace profiles
can be multiplied by e.g., at least or about 10.sup.5, 10.sup.6,
10.sup.7, 10.sup.8, or 10.sup.9 and normalized by respective total
reads from coding sequences of each species. In some embodiments,
the mRNA expression level of each gene is estimated from the total
sequencing reads mapped onto the gene and expressed in units of
fragments per kilobase of transcript per million fragments mapped
(FPKM).
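The normalization described above can be sketched in Python as follows. This is a minimal illustration using the conventional FPKM definition (fragments mapped to the gene, divided by gene length in kilobases and by total mapped fragments in millions). The function names, the default scale factor of 10^6, and the example numbers are assumptions of this sketch and are not values from the disclosure.

```python
def fpkm(fragments, gene_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million fragments mapped."""
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and total fragments must be positive")
    # 1e9 = 1e3 (bp per kilobase) * 1e6 (fragments per million)
    return fragments * 1e9 / (gene_length_bp * total_mapped_fragments)


def normalize_profile(raw_profile, total_cds_reads, scale=1e6):
    """Scale a per-position read profile by total reads from coding sequences."""
    if total_cds_reads <= 0:
        raise ValueError("total coding-sequence reads must be positive")
    return [value * scale / total_cds_reads for value in raw_profile]


if __name__ == "__main__":
    # Hypothetical numbers: 500 fragments on a 1,000 bp gene, 1 million mapped.
    print(fpkm(500, 1000, 1_000_000))
    print(normalize_profile([2, 4], 2_000_000))
```

Dividing by each species' own coding-sequence read total is what allows profiles from different hosts to be placed on a common scale.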
[0098] In some embodiments, the activity of a promoter is defined
as the change in RNAP flux $\delta J$ around a transcription start
site $x_{tss}$. In some embodiments, the promoter strength or the
regulatory element strength is calculated using the equation
below:
$$\delta J = \frac{\gamma}{n}\left[\sum_{i=x_{tss}+1}^{x_{tss}+1+n} m(i) - \sum_{i=x_{tss}-1-n}^{x_{tss}-1} m(i)\right]$$
where $m(i)$ is the number of transcripts at each position $i$ from
the FPKM-normalized transcriptomic profiles, $\gamma = 0.0067\ \mathrm{s}^{-1}$
is the degradation rate of mRNA, and $n$ is the window length before
and after $x_{tss}$. In some embodiments, the window length is set
to ten. In some embodiments, the terminator strength $T_s$ is
defined as the fold-decrease in transcription before and after a
terminator, which can be quantified from the FPKM-normalized
transcriptomic profiles as:
$$T_s = \frac{\sum_{i=x_1+1}^{x_1+n} m(i)}{\sum_{i=x_0-n}^{x_0-1} m(i)}$$
where $x_0$ and $x_1$ are the beginning and end positions of the
terminator part, respectively. The translation efficiency was
calculated by dividing the ribosome density by the FPKM.
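The part-strength calculations above can be sketched in Python as follows. This is a hedged illustration rather than the analysis code used in the disclosure. It assumes the profile m is a list indexed by 0-based position, reads the two promoter sums as windows of n + 1 positions immediately after and immediately before the transcription start site, and writes the terminator ratio in the order in which the two sums appear in the text (the window after the terminator over the window before it), so a value below 1 indicates reduced transcription downstream. Function names and the toy profiles are hypothetical.

```python
GAMMA = 0.0067  # mRNA degradation rate in 1/s, the value given in the text


def promoter_strength(m, x_tss, n=10, gamma=GAMMA):
    """Change in RNAP flux across a transcription start site at index x_tss."""
    if x_tss - 1 - n < 0 or x_tss + 1 + n >= len(m):
        raise ValueError("window extends past the ends of the profile")
    after = sum(m[x_tss + 1: x_tss + n + 2])   # i = x_tss+1 .. x_tss+1+n
    before = sum(m[x_tss - 1 - n: x_tss])      # i = x_tss-1-n .. x_tss-1
    return gamma / n * (after - before)


def terminator_ratio(m, x0, x1, n=10):
    """Ratio of summed signal after the terminator to summed signal before it."""
    if x0 - n < 0 or x1 + n >= len(m):
        raise ValueError("window extends past the ends of the profile")
    after = sum(m[x1 + 1: x1 + n + 1])  # i = x1+1 .. x1+n
    before = sum(m[x0 - n: x0])         # i = x0-n .. x0-1
    if before == 0:
        raise ValueError("no transcription upstream of the terminator")
    return after / before


def translation_efficiency(ribosome_density, fpkm_value):
    """Ribosome density divided by mRNA level (FPKM)."""
    if fpkm_value <= 0:
        raise ValueError("FPKM must be positive")
    return ribosome_density / fpkm_value


if __name__ == "__main__":
    promoter_profile = [0] * 20 + [5] * 20            # toy step up at index 20
    terminator_profile = [8] * 20 + [0] * 5 + [2] * 20  # toy drop across 20..24
    print(promoter_strength(promoter_profile, 19))
    print(terminator_ratio(terminator_profile, 20, 24))
    print(translation_efficiency(30.0, 120.0))
```

Taking the reciprocal of the terminator ratio gives the fold-decrease in transcription across the terminator.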
Definitions
[0099] As used herein, the equivalent terms "expression" or "gene
expression" are intended to refer to the transcription of a DNA
molecule into RNA, and the translation of such RNA into a
polypeptide.
[0100] As used herein, a "gene cluster" refers to a set of two or
more genes that encode gene products. As used herein, a "nif gene
cluster" refers to a set of two or more nitrogen fixation (nif)
genes.
[0101] "Exogenous" with respect to genes indicates that the nucleic
acid or gene is not in its natural (native) environment. For
example, an exogenous gene can refer to a gene that is from a
different species. In contrast, "endogenous" with respect to genes
indicates that the gene is in its native environment. As used
herein, the terms "endogenous" and "native" are used
interchangeably.
[0102] As used herein, the term "delete" or "deleted" refers to the
removal of a gene (e.g. endogenous gene) from a sequence or
cluster. As used herein, the term "alter" or "altered" refers to
the modification of one or more nucleotides in a gene or the
deletion of one or more base pairs in a gene. This alteration may
render the gene dysfunctional. Herein, ".DELTA.nifA" refers to a
strain or cluster within which NifA was deleted or altered. Methods
of deletion and alteration, in the context of genes, are known in
the art.
[0103] As used herein, the term "chemical signals" refers to
chemical compounds. Any substance consisting of two or more
different types of atoms (chemical elements) in a fixed
stoichiometric proportion can be termed a chemical compound.
Chemical signals can be synthetic or natural chemical compounds. In
some embodiments of the present invention, a bacterium of the
present disclosure or a sensor of the present disclosure is under
the control of a chemical signal. In some embodiments, the signal
is a native biological signal (e.g. root exudate, biological
control agent, etc.). In some embodiments, the chemical signal is a
quorum sensing signal from the bacterium. Non-limiting examples of
chemical signals include root exudates (as defined below),
biocontrol agents (as defined below), phytohormones, vanillate,
IPTG, aTc, cuminic acid, DAPG, salicylic acid,
3,4-dihydroxybenzoic acid, 3OC6HSL, and 3OC14HSL.
[0104] As used herein, the term "root exudate" refers to chemicals
secreted or emitted by plant roots in response to their
environment. These allow plants to manipulate or alter their
immediate environment, specifically their rhizosphere. Root
exudates are a complex mixture of soluble organic substances, which
may contain sugars, amino acids, organic acids, enzymes, and other
substances. Root exudates include, but are not limited to, ions,
carbon-based compounds, amino acids, sterols, sugars, hormones
(phytohormones), flavonoids, antimicrobials, and many other
chemical compounds. The exudates can serve as either positive
regulators or negative regulators.
[0105] As used herein, the term "phytohormone" refers to plant
hormones, which are any of various hormones produced by plants
that influence processes such as germination, growth, and
metabolism in the plant.
[0106] As used herein, the term "vanillate" refers to a
methoxybenzoate that is the conjugate base of vanillic acid. It is
a plant metabolite.
[0107] Biological control or biocontrol is a method of controlling
pests such as insects, mites, weeds and plant diseases using other
organisms. Natural enemies of insect pests, also known as
biological control agents, include predators, parasitoids,
pathogens, and competitors. Biological control agents of plant
diseases are most often referred to as antagonists. Biological
control agents of weeds include seed predators, herbivores and
plant pathogens. The inducible clusters or promoters of the present
invention may be modulated by a secretion of (or chemical otherwise
associated with) a biological control agent. Herein, that is
referred to as a "biocontrol agent".
[0108] Without further elaboration, it is believed that one skilled
in the art can, based on the above description, utilize the present
invention to its fullest extent. The following specific embodiments
are, therefore, to be construed as merely illustrative, and not
limitative of the remainder of the disclosure in any way
whatsoever. All publications cited herein are incorporated by
reference for the purposes or subject matter referenced herein.
EXAMPLES
[0109] Herein, inducible nitrogenase activity is engineered in two
cereal endophytes (Azorhizobium caulinodans ORS571 and Rhizobium
sp. IRBG74) and the epiphyte Pseudomonas protegens Pf-5, a maize
seed inoculant. For each organism, different strategies are taken
to eliminate ammonium repression and place nitrogenase expression
under the control of agriculturally-relevant signals, including
root exudates, biocontrol agents, and phytohormones. The present
disclosure demonstrates that Rhizobium sp. (e.g., IRBG74) can be
engineered to fix nitrogen under free living conditions, inter
alia, by transferring a nif cluster from either Rhodobacter or
Klebsiella. For P. protegens Pf-5, the transfer of an inducible
cluster from Azotobacter vinelandii yields the highest ammonia and
oxygen tolerance. Collectively, data from the transfer of 12 nif
gene clusters between diverse species (including E. coli and 12
additional Rhizobia) help identify the barriers that must be
overcome to engineer a bacterium to deliver a high nitrogen flux to
a cereal crop and provide a solution such that Rhizobium can be
engineered to fix nitrogen under free living conditions.
Materials and Methods
[0110] Bacterial Strains and Growth Media.
[0111] All bacterial strains and their derivatives used in this
study are listed in Table 2. E. coli DH10-beta (New England
Biolabs, MA, Cat #C3019) was used for cloning. E. coli K-12 MG1655
was used for the nitrogenase assay. P. protegens Pf-5 was obtained
from the ATCC (BAA-477). Strains used in this study are listed in
Table 3. For rich media, LB medium (10 g/L tryptone, 5 g/L yeast
extract, 10 g/L NaCl), LB-Lennox medium (10 g/L tryptone, 5 g/L
yeast extract, 5 g/L NaCl), and TY medium (5 g/L tryptone, 3 g/L
yeast extract, 0.87 g/L CaCl.sub.2.2H.sub.2O) were used. For
minimal media, BB medium (0.25 g/L MgSO.sub.4.7H.sub.2O, 1 g/L
NaCl, 0.1 g/L CaCl.sub.2.2H.sub.2O, 2.9 mg/L FeCl.sub.3, 0.25 mg/L
Na.sub.2MoO.sub.4.2H.sub.2O, 1.32 g/L NH.sub.4CH.sub.3CO.sub.2, 25
g/L Na.sub.2HPO.sub.4, 3 g/L KH.sub.2PO.sub.4 pH [7.4]), UMS medium
(0.5 g/L MgSO.sub.4.7H.sub.2O, 0.2 g/L NaCl, 0.375 mg/L
EDTA-Na.sub.2, 0.16 mg/L ZnSO.sub.4.7H.sub.2O, 0.2 mg/L
Na.sub.2MoO.sub.4.2H.sub.2O, 0.25 mg/L H.sub.3BO.sub.3, 0.2 mg/L
MnSO.sub.4.H.sub.2O, 0.02 mg/L CuSO.sub.4.5H.sub.2O, 1 mg/L
CoCl.sub.2.6H.sub.2O, 75 mg/L CaCl.sub.2.2H.sub.2O, 12 mg/L
FeSO.sub.4.7H.sub.2O, 1 mg/L thiamine hydrochloride, 2 mg/L
D-pantothenic acid hemicalcium salt, 0.1 mg/L biotin, 87.4 mg/L
K.sub.2HPO.sub.4, 4.19 g/L MOPS pH [7.0]), and Burk medium (0.2 g/L
MgSO.sub.4.7H.sub.2O, 73 mg/L CaCl.sub.2.2H.sub.2O, 5.4 mg/L
FeCl.sub.3.6H.sub.2O, 4.2 mg/L Na.sub.2MoO.sub.4.2H.sub.2O, 0.2 g/L
KH.sub.2PO.sub.4, 0.8 g/L K.sub.2HPO.sub.4 pH [7.4]) were used.
Antibiotics were used at the following concentrations (.mu.g/mL):
E. coli (kanamycin, 50; spectinomycin, 100; tetracycline, 15;
gentamicin, 15). P. protegens Pf-5 (kanamycin, 30; tetracycline,
50; gentamicin, 15; carbenicillin, 50). R. sp. IRBG74 (neomycin,
150; gentamicin, 150; tetracycline, 10; nitrofurantoin, 10). A.
caulinodans (kanamycin, 30; gentamicin, 15; tetracycline, 10;
nitrofurantoin, 10). Chemicals including inducers used in this
study are listed in Table 7.
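The media recipes above mix mass concentrations (g/L) with molar concentrations (mM) used later in the assays. The following minimal sketch, which is illustrative only and not part of the disclosed methods, converts between the two. The molar masses are standard reference values and the function names are hypothetical.

```python
# Molar masses in g/mol. These are standard reference values supplied
# for illustration; they are not stated in the source text.
MOLAR_MASS = {
    "ammonium acetate": 77.08,   # NH4CH3CO2
    "ammonium chloride": 53.49,  # NH4Cl
    "MOPS": 209.26,
}


def mm_to_g_per_l(compound, millimolar):
    """Convert a concentration in mM to g/L."""
    return millimolar / 1000.0 * MOLAR_MASS[compound]


def g_per_l_to_mm(compound, grams_per_litre):
    """Convert a concentration in g/L to mM."""
    return grams_per_litre / MOLAR_MASS[compound] * 1000.0


if __name__ == "__main__":
    # 17.1 mM ammonium acetate, as used in the nitrogenase assays
    print(round(mm_to_g_per_l("ammonium acetate", 17.1), 2))
    # 4.19 g/L MOPS, as listed for UMS medium
    print(round(g_per_l_to_mm("MOPS", 4.19), 1))
```

Under these assumptions, 17.1 mM ammonium acetate corresponds to about 1.32 g/L, matching the BB medium recipe, and 4.19 g/L MOPS corresponds to about 20 mM.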
[0112] Strain Construction.
[0113] In order to increase transformation efficiency in R. sp.
IRBG74, a type-I restriction modification system was inactivated by
deleting hsdR, which encodes a restriction enzyme for foreign DNA
(this strain was the basis for all experiments) (Ferri, L., Gori,
A., Biondi, E. G., Mengoni, A. & Bazzicalupo, M. Plasmid
electroporation of Sinorhizobium strains: The role of the
restriction gene hsdR in type strain Rm1021. Plasmid 63, 128-135
(2010)). A sacB markerless insertion method was used to replace a
native locus with synthetic parts by homologous recombination.
Two homology arms of .about.500 bp flanking the hsdR gene were
amplified by PCR and cloned to yield the suicide plasmid pMR-44. The
suicide
plasmid was mobilized into R. sp. IRBG74 by triparental mating.
Single-crossover recombinants were selected for resistance to
gentamicin and subsequently grown and plated on LB plates
supplemented with 15% sucrose to induce deletion of the vector DNA
part containing the counter-selectable marker sacB, which converts
sucrose into a toxic product (levan). Two native nif gene clusters
encompassing nifHDKENX (genomic location 219,579-227,127) and
nifSW-fixABCX-nifAB-fdxN-nifTZ (genomic location 234,635-234,802)
of R. sp. IRBG74 were sequentially deleted using pMR45-46. To
increase genetic stability, the recA gene was deleted using the plasmid
pMR47. The R. sp. IRBG74 .DELTA.nif, hsdR, recA strain was the
basis for all experiments unless indicated otherwise. Two homology
arms of .about.900 bp flanking the nifA gene were amplified by PCR
and cloned to yield the suicide plasmid pMR-47, used to generate a nifA
deletion in A. caulinodans ORS571. The suicide plasmid pMR47 in E.
coli was mobilized into A. caulinodans by triparental mating.
Single-crossover recombinants were selected for resistance to
gentamicin and subsequently grown and plated on plain TY plates
supplemented with 15% sucrose to induce deletion of the vector DNA
part. All markerless deletions were confirmed by gentamicin
sensitivity and diagnostic PCR. A list of the mutant strains is
provided in Table 3.
[0114] Plasmid System.
[0115] Plasmids with the pBBR1 origin were derived from pMQ131 and
pMQ132. Plasmids with the pRO1600 origin were derived from pMQ80.
Plasmids with the RK2 origin were derived from pJP2. Plasmids with
the RSF1010 origin were derived from pSEVA651. Plasmids with the
IncW origin were derived from pKT249. Plasmids used in this study
are provided in Table 4.
[0116] Phylogenetic Analysis of Nif Clusters.
[0117] Phylogenetic analysis was performed based on the full-length
16S rRNA gene sequences (K. oxytoca, BWI76_05380; A. vinelandii,
Avin_55000; R. sphaeroides, DQL45_00005; Cyanothece ATCC51142,
cce_RNA045; A. brasilense, AMK58_25190; R. palustris, RNA_55; P.
protegens, PST_0759; Paenibacillus sp. WLY78, JQ003557). A multiple
sequence alignment was generated using MUSCLE (Edgar, R. C. MUSCLE:
multiple sequence alignment with high accuracy and high
throughput. Nucleic Acids Res. 32, 1792-1797 (2004)). A phylogenetic
tree was constructed using the Geneious software (R9.0.5) with the
Jukes-Cantor distance model and UPGMA as the tree-building method,
with bootstrap values from 1,000 replicates.
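For orientation, the Jukes-Cantor model named above converts the observed fraction of differing sites p between two aligned sequences into an evolutionary distance d = -(3/4) ln(1 - 4p/3). The following is a minimal sketch of that formula only. It is not the Geneious implementation, the function names are hypothetical, and skipping gapped columns is an assumption made here.

```python
import math


def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences.

    Columns with a gap ('-') in either sequence are skipped (an
    assumption of this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor distance d = -(3/4) * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        # The correction is undefined at or beyond saturation.
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)
```

A matrix of such pairwise distances is the input that a UPGMA clustering step would then use to build the tree.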
[0118] Nif Cluster Construction.
[0119] To obtain large nif clusters on mobilizable plasmids that
carry origin of transfer (oriT) for conjugative transfer of the
plasmids, the genomic DNAs from K. oxytoca, P. stutzeri, A.
vinelandii, A. caulinodans and R. sphaeroides were purified using
Wizard genomic DNA purification kit, following the isolation
protocol for gram negative bacteria (Promega, Cat #A1120). The
genomic DNAs of Cyanothece ATCC51142, A. brasilense ATCC29729, R.
palustris ATCC BAA-98, and G. diazotrophicus ATCC49037 were
obtained from ATCC. Each nif cluster was amplified by PCR as several
fragments (4-10 kb), with 45 bp linkers added upstream and downstream
of the 5'-most and 3'-most ends of the cluster, using the primer sets
in Table 2, and assembled onto the linearized E. coli-yeast shuttle
vectors pMR-1 (for E. coli and Rhizobia) and pMR-2 (for P. protegens
Pf-5) using yeast recombineering. For the nif cluster of
Paenibacillus sp. WLY78, the DNA sequence information was obtained
from contig ALJV01, and the DNA of the nif cluster was synthesized
by GeneArt gene synthesis (Thermo Fisher Scientific, MA) as four
fragments that were used as templates for PCR amplification and
assembly. Two to eight amplified fragments (Table 2) were
assembled with a linearized vector into a single large plasmid by a
one-pot yeast assembly procedure (Shanks, R. M. et al. Saccharomyces
cerevisiae-based molecular tool kit for manipulation of genes from
gram-negative bacteria. Appl. Environ. Microbiol. 72, 5027-5036
(2006)). Once assembled, the
nif cluster-plasmids were isolated from yeast using Zymoprep Yeast
Miniprep kit (Zymo Research Cat #D2004) and transformed into E.
coli. The purified plasmid was isolated from E. coli and sequenced
to verify the correct assembly and sequence (MGH CCIB DNA Core
facility, Cambridge, Mass.). E. coli containing a mutation-free
plasmid were stored for further experiments. Plasmids containing
nif clusters are provided in Table 4.
[0120] Construction of Refactored Nif v3.2.
[0121] The six transcriptional units (nifHDKTY, nifENX, nifJ,
nifBQ, nifF, nifUSVWZM) were amplified from the plasmid pMR-3 that
harbors the native Klebsiella nif cluster. The units were distributed
across six level-1 module plasmids in which the nif genes are preceded
by a terminator. The wild-type T7 promoter or the T7 promoter variant
PT7.P2 was placed between the terminator and the first gene of the
transcriptional unit. Assembly linkers (.about.45 bp) were placed
at both ends of the units. The level-1 plasmids (pMR32-37) are
provided in Tables 4 and 5. Each of the six plasmids was linearized
by digestion with restriction enzymes and assembled with a
linearized pMR-1 or pMR-2 vector into a single large plasmid by
one-pot yeast assembly procedure, yielding pMR38 and pMR39.
[0122] Transformation.
[0123] Electroporation was used to transfer plasmids into P.
protegens Pf-5. A single colony was inoculated in 4 mL of LB and
grown for 16 h at 30.degree. C. with shaking at 250 rpm. The cell
pellets were washed twice with 2 mL of 300 mM sucrose and resuspended
in 100 .mu.l of 300 mM sucrose at RT. A total of 50-100 ng DNA was
electroporated and recovered in 1 mL of LB media for 1 h before
plating on selective LB plates. Triparental mating was used to
transfer DNA from E. coli to Rhizobia. An aliquot of 40 .mu.l of
late-log phase (OD.sub.600.about.0.6) donor cells and 40 .mu.l of
late-log phase helper cells containing pRK7013 were mixed with 200
.mu.l of late-log phase (OD.sub.600.about.0.8) recipient Rhizobia
cells and washed in 200 .mu.l of TY medium. Mating was initiated by
spotting 20 .mu.l of the mixed cells on TY plates and incubated at
30.degree. C. for 6 h. The mating mixtures were plated on TY medium
supplemented with nitrofurantoin to isolate Rhizobia
transconjugants.
[0124] Construction and Characterization of Genetic Parts for
Rhizobia.
[0125] Genetic part libraries were built on a pBBR1-ori plasmid
pMR-1 using Gibson assembly (New England Biolabs, Cat #E2611). The
fluorescence proteins, GFPmut3b and mRFP1, were used as reporters.
The Anderson promoter library (Anderson, J. et al. BglBricks: A
flexible standard for biological part assembly. J. Biol. Eng. 4, 1
(2010)) on the BioBricks Registry was used for the characterization of
constitutive promoters (FIGS. 11A-11C). To characterize inducible
promoters, a regulator protein is constitutively expressed by the
PlacIq promoter, and GFP expression is driven by a cognate
inducible promoter from the opposite direction, facilitating
replacement of the reporter with gene of interest (e.g., T7 RNAP
and nifA) and transfer of the controller unit across different
plasmid backbones for diverse microbes. The following combinations
of cognate regulators and inducible promoters were characterized.
IPTG-inducible LacI-P.sub.A1lacO1, DAPG-inducible PhlF-P.sub.Phl,
aTc-inducible TetR-P.sub.Tet, 3OC6HSL-inducible LuxR-P.sub.Lux,
salicylic acid-inducible NahR-P.sub.Sal, and cuminic acid-inducible
CymR-P.sub.Cym systems were optimized for R. sp. IRBG74 (FIG. 14).
Opine-inducible OccR-P.sub.occ and nopaline-inducible NocR-P.sub.noc
systems were optimized for A. caulinodans (FIGS. 20A-20F and Tables
4 and 5). For RBS characterization, an IPTG-inducible GFP
expression plasmid pMR-40 was used and GFP was expressed to the
highest levels with 1 mM IPTG (FIGS. 12A-12B). An RBS library for GFP
was designed using the RBS library calculator in the
highest-resolution mode, and the 3' end of the 16S rRNA sequence
was adjusted according to the species (3'-ACCTCCTTC-5' for R. sp.
IRBG74). Terminators for T7 RNAP were characterized by placing a
terminator between two fluorescence reporters expressed from a
single T7 wild-type promoter located upstream of the first
fluorescence protein GFP. Expression of the two fluorescence
proteins is driven by the controller strain MR18, which encodes the
IPTG-inducible T7 RNAP system, induced with 1 mM IPTG (FIGS. 13A-13B).
The terminator strength (Ts) was determined by normalizing the
fluorescence levels of a terminator construct to those of a reference
construct, pMR-66, in which a 40 bp spacer was placed between the
reporters. All genetic
parts for Rhizobia were characterized as follows. Single colonies
were inoculated into 0.5 ml TY supplemented with antibiotics in
96-deepwell plates (USA Scientific, Cat #18962110) and grown
overnight at 30.degree. C., 900 rpm in a Multitron incubator
(INFORS HT, MD). 1.5 .mu.l of overnight cultures was diluted into
200 .mu.l of TY with antibiotics and appropriate inducers in
96-well plates (Thermo Scientific, Cat #12565215) and incubated for
7 h at 30.degree. C., 1,000 rpm in an ELMI DTS-4 shaker (ELMI, CA).
After growth, 8 .mu.l of culture sample was diluted into 150 .mu.l
PBS with 2 mg/mL kanamycin for flow cytometry analysis. Plasmids
and genetic parts are listed in Tables 4 and 5.
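The text states that terminator strength (Ts) is obtained by normalizing a terminator construct to the spacer-only reference construct, but it does not give the exact formula. The following sketch assumes one commonly used definition: the downstream-to-upstream reporter ratio of the reference divided by the same ratio for the terminator construct. This formula, the function name, and the argument names are assumptions for illustration and may differ from the calculation actually used.

```python
def terminator_strength(gfp_term, rfp_term, gfp_ref, rfp_ref):
    """Assumed definition: Ts = (RFP_ref / GFP_ref) / (RFP_term / GFP_term).

    GFP is the upstream reporter and RFP the downstream reporter.
    '_term' values come from the terminator construct and '_ref' values
    from the spacer-only reference construct. All inputs are
    background-subtracted median fluorescence values in arbitrary units.
    """
    if min(gfp_term, rfp_term, gfp_ref, rfp_ref) <= 0:
        raise ValueError("fluorescence values must be positive")
    read_through_term = rfp_term / gfp_term
    read_through_ref = rfp_ref / gfp_ref
    return read_through_ref / read_through_term
```

Under this assumed definition, a construct that lowers downstream expression tenfold relative to the reference at equal upstream expression has Ts of about 10, and a construct with no terminating activity has Ts of about 1.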
[0126] Construction and Characterization of Genetic Parts for P.
protegens.
[0127] Genetic part libraries were built on a pRO1600-ori plasmid
pMR-2 using Gibson assembly (New England Biolabs, Cat #E2611). The
fluorescence proteins, GFPmut3b and mRFP1 were used as reporters.
The Anderson promoter library on the BioBricks Registry was
used for the characterization of constitutive promoters (FIGS.
11A-11C). The following combinations of cognate regulators and
inducible promoters were characterized. IPTG-inducible
LacI-P.sub.tac, DAPG-inducible PhlF-P.sub.Phl, aTc-inducible
TetR-P.sub.Tet, 3OC6HSL-inducible LuxR-P.sub.Lux, arabinose-inducible
AraC-P.sub.BAD, cuminic acid-inducible CymR-P.sub.Cym,
and naringenin-inducible FdeR-P.sub.Fde systems were optimized (FIGS.
15A-15C). For RBS characterization, an arabinose-inducible GFP
expression plasmid pMR-65 was used and GFP was expressed with 1 mM
IPTG (FIGS. 12A-12B). An RBS library for GFP was designed using the
RBS library calculator in the highest-resolution mode, and the 3'
end of the 16S rRNA sequence was adjusted according to the
species (3'-ACCTCCTTA-5' for P. protegens Pf-5). Terminators for T7
RNAP were characterized by placing a terminator between two
fluorescence reporters expressed from a single T7 wild-type
promoter located upstream of the first fluorescence protein GFP.
The expression of the two fluorescence proteins is enabled by an
IPTG-inducible T7 RNAP expression system of the controller strain
MR7 (FIGS. 13A-13B). All genetic parts for P. protegens Pf-5 were
characterized as follows. Single colonies were inoculated into 1 ml
LB supplemented with antibiotics in 96-deepwell plates (USA
Scientific, Cat #18962110) and grown overnight at 30.degree. C.,
900 rpm in a Multitron incubator (INFORS HT, MD). 0.5 .mu.l of
overnight cultures was diluted into 200 .mu.l of LB with
antibiotics and appropriate inducers in 96-well plates (Thermo
Scientific, Cat #12565215) and incubated for 7 h at 30.degree. C.,
1,000 rpm in an ELMI DTS-4 shaker (ELMI, CA). After growth, 10
.mu.l of culture sample was diluted into 150 .mu.l PBS with 2 mg/mL
kanamycin for flow cytometry analysis. Plasmids and genetic parts
are listed in Tables 4 and 5.
[0128] Genomic Integration and Characterization of Controllers.
[0129] The mini-Tn7 insertion system was used to introduce a
controller into the genome of P. protegens Pf-5. The IPTG-inducible
T7 RNAP expression system and a tetracycline resistance marker, tetA,
were placed between two Tn7 ends (Tn7L and Tn7R). The controller
plasmid pMR-85 was introduced into P. protegens Pf-5 by double
transformation with pTNS3 encoding the TnsABCD transposase. A
genomically-integrated controller located 25 bp downstream of the
stop codon of glmS was confirmed by PCR and sequencing. A
markerless insertion method using homologous recombination was
employed in R. sp. IRBG74. A controller encoding an inducible T7 RNAP
system, flanked by two homology fragments that enable the
replacement of recA, was cloned into a suicide plasmid. These
controller plasmids (IPTG-inducible, pMR82-84; DAPG-inducible,
pMR85) in E. coli were mobilized into R. sp. IRBG74 MR18
(.DELTA.hsdR, .DELTA.nif) by triparental mating, generating the
controller strains (MR19, 20, 21 and 22, respectively). The
controller integration in the genome was confirmed by gentamicin
sensitivity and diagnostic PCR. All controllers were characterized
in a manner identical to that described in genetic part
characterization.
[0130] Construction and Characterization of Marionette-Based
Controllers.
[0131] To regulate nitrogenase expression in the E. coli Marionette
MG1655, the yfp in the 12 reporter plasmids was replaced with T7
RNAP while keeping other genetic parts (e.g., promoters and RBSs)
unchanged (FIGS. 28A-28C). The reporter plasmid pMR-120 in which
gfpmut3b is fused to the PT7(P2) promoter (FIGS. 28A-28C) was
co-transformed to analyze the response functions of each of the 12
T7 RNAP controller plasmids. To characterize controllers, single
colonies were inoculated into 1 ml LB supplemented with antibiotics
in 96-deepwell plates (USA Scientific, Cat #18962110) and grown
overnight at 30.degree. C., 900 rpm in a Multitron incubator
(INFORS HT, MD). 0.5 .mu.l of overnight cultures was diluted into
200 .mu.l of LB with antibiotics and appropriate inducers in
96-well plates (Thermo Scientific, Cat #12565215) and incubated for
6 h at 30.degree. C., 1,000 rpm in an ELMI DTS-4 shaker (ELMI, CA).
After growth, 4 .mu.l of culture sample was diluted into 150 .mu.l
PBS with 2 mg/mL kanamycin for flow cytometry analysis.
[0132] Flow Cytometry.
[0133] Cultures with fluorescence proteins were analyzed by flow
cytometry using a BD Biosciences LSRII Fortessa analyzer with a
488 nm laser and 510/20-nm band pass filter for GFP and a 561 nm
laser and 610/20 nm band pass filter for mCherry and mRFP1. Cells
were diluted into 96-well plates containing phosphate buffered
saline solution (PBS) supplemented with 2 mg/mL kanamycin after
incubation. More than 20,000 events were collected per sample and
gated by forward and side scatter in FlowJo (TreeStar Inc., Ashland,
Oreg.) to remove background events. The median
fluorescence from cytometry histograms was calculated for all
samples. The median autofluorescence was subtracted from the median
fluorescence, and the result was reported as the fluorescence value in
arbitrary units (au).
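The background correction described above can be sketched as follows. This is a minimal illustration that assumes gated per-event fluorescence values have already been exported (for example from FlowJo) as plain lists of numbers. The function name is hypothetical.

```python
from statistics import median


def corrected_fluorescence(sample_events, autofluorescence_events):
    """Median sample fluorescence minus median autofluorescence (au).

    Both arguments are sequences of per-event fluorescence values that
    have already been gated on forward and side scatter.
    """
    if not sample_events or not autofluorescence_events:
        raise ValueError("both populations must contain events")
    return median(sample_events) - median(autofluorescence_events)
```

Using the median rather than the mean keeps the reported value robust to the small number of very bright or very dim outlier events that typically survive gating.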
[0134] Nitrogenase Assay (E. coli and K. oxytoca).
[0135] Cultures were initiated by inoculating a single colony into
1 mL of LB supplemented with appropriate antibiotics in 96-deepwell
plates (USA Scientific, Cat #18962110) and grown overnight at
30.degree. C., 900 rpm in a Multitron incubator. 5 .mu.l of
overnight cultures was diluted into 500 .mu.l of BB medium with
17.1 mM NH.sub.4CH.sub.3CO.sub.2 and appropriate antibiotics in
96-deepwell plates and incubated for 24 h at 30.degree. C., 900 rpm in
a Multitron incubator. Cultures were diluted to an OD.sub.600 of 0.4
into 2 mL of BB medium supplemented with appropriate antibiotics,
1.43 mM serine to facilitate nitrogenase derepression, and an inducer
(if necessary) in
10 mL glass vials with PTFE-silicone septa screw caps (Supelco
Analytical, Cat #SU860103). Headspace in the vials was replaced
with 100% argon gas using a vacuum manifold. Acetylene freshly
generated from CaC.sub.2 in a Burris bottle was injected to 10%
(vol/vol) into each culture vial to begin the reaction. The
acetylene reduction was carried out for 20 h at 30.degree. C. with
shaking at 250 rpm in an Innova 44 shaking incubator (New
Brunswick) to prevent cell aggregations, followed by quenching via
the addition of 0.5 mL of 4 M NaOH to each vial.
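The dilution step above (to an OD.sub.600 of 0.4 in a 2 mL final volume) can be worked out with a simple C1V1 = C2V2 calculation. The sketch below is illustrative only. It assumes that OD.sub.600 scales linearly with cell density over the range used, and the function name is hypothetical.

```python
def dilution_volumes(culture_od, target_od=0.4, final_volume_ml=2.0):
    """Return (culture_ml, medium_ml) needed to reach target_od.

    Assumes OD600 is proportional to cell density (C1*V1 = C2*V2).
    """
    if culture_od < target_od:
        raise ValueError("culture is below the target OD; cannot dilute")
    culture_ml = target_od * final_volume_ml / culture_od
    return culture_ml, final_volume_ml - culture_ml
```

For example, under this assumption a culture at an OD.sub.600 of 1.6 would need 0.5 mL of culture and 1.5 mL of fresh medium per assay vial.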
[0136] Nitrogenase Assay (P. protegens Pf-5).
[0137] Cultures were initiated by inoculating a single colony into
1 mL of LB supplemented with appropriate antibiotics in 96-deepwell
plates (USA Scientific, Cat #18962110) and grown overnight at
30.degree. C., 900 rpm in a Multitron incubator. 5 .mu.l of
overnight cultures was diluted into 500 .mu.l of BB medium with
17.1 mM NH.sub.4CH.sub.3CO.sub.2 and appropriate antibiotics in
96-deepwell plates and incubated for 24 h at 30.degree. C., 900 rpm in a
Multitron incubator. Cultures were diluted to an OD.sub.600 of 0.4
into 2 mL of BB medium supplemented with appropriate antibiotics,
1.43 mM serine and an inducer (if necessary) in 10 mL glass vials
with PTFE-silicone septa screw caps. Headspace in the vials was
replaced with 99% argon and 1% oxygen gas (Airgas, MA USA) using a
vacuum manifold. Acetylene was injected to 10% (vol/vol) into each
culture vial to begin the reaction. The acetylene reduction was
carried out for 20 h at 30.degree. C. with shaking at 250 rpm,
followed by quenching via the addition of 0.5 mL of 4 M NaOH to
each vial.
[0138] Nitrogenase Assays (Rhizobia Strains).
[0139] Cultures were initiated by inoculating a single colony into
0.5 mL of TY medium supplemented with appropriate antibiotics in
96-deepwell plates (USA Scientific, Cat #18962110) and grown
overnight at 30.degree. C., 900 rpm in a Multitron incubator. 5
.mu.l of overnight cultures was diluted into 500 .mu.l of UMS
medium with 30 mM succinate, 10 mM sucrose, and 10 mM NH.sub.4Cl
and appropriate antibiotics in 96-deepwell plates and incubated for 24 h
at 30.degree. C., 900 rpm in a Multitron incubator. Cultures were
diluted to an OD.sub.600 of 0.4 into 2 mL of UMS medium plus 30 mM
succinate and 10 mM sucrose supplemented with appropriate
antibiotics, 1.43 mM serine and an inducer (if necessary) in 10 mL
glass vials with PTFE-silicone septa screw caps. Headspace in the
vials was replaced with 99% argon and 1% oxygen gas using a vacuum
manifold. Acetylene was injected to 10% (vol/vol) into each culture
vial to begin the reaction. The acetylene reduction was carried out
for 20 h at 30.degree. C. with shaking at 250 rpm, followed by
quenching via the addition of 0.5 mL of 4 M NaOH to each vial.
[0140] Nitrogenase Assays (A. caulinodans and P. stutzeri).
[0141] Cultures were initiated by inoculating a single colony into
0.2 mL of TY medium supplemented with appropriate antibiotics in
96-deepwell plates and grown overnight at 37.degree. C. and
30.degree. C. for A. caulinodans and P. stutzeri, respectively, 900
rpm in a Multitron incubator. 5 .mu.l of overnight cultures was
diluted into 500 .mu.l of UMS medium with 30 mM lactate and 10 mM
NH.sub.4Cl and appropriate antibiotics in 96-deepwell plates and incubated
for 24 h at 37.degree. C. and 30.degree. C. for A. caulinodans and
P. stutzeri, respectively, 900 rpm in a Multitron incubator.
Cultures were diluted to an OD.sub.600 of 0.4 into 2 mL of UMS
medium plus 30 mM lactate supplemented with appropriate antibiotics
and an inducer (if necessary) in 10 mL glass vials with
PTFE-silicone septa screw caps. Headspace in the vials was replaced
with 99% argon plus 1% oxygen gas using a vacuum manifold.
Acetylene was injected to 10% (vol/vol) into each culture vial to
begin the reaction. The acetylene reduction was carried out for 20
h at 30.degree. C. with shaking at 250 rpm, followed by quenching
via the addition of 0.5 mL of 4 M NaOH to each vial.
[0142] Nitrogenase Assays (A. vinelandii).
[0143] Cultures were initiated by inoculating a single colony into
0.5 mL of Burk medium supplemented with appropriate antibiotics in
96-deepwell plates (USA Scientific, Cat #18962110) and grown
overnight at 30.degree. C., 900 rpm in a Multitron incubator. 5
.mu.l of overnight cultures was diluted into 500 .mu.l of Burk
medium with 17.1 mM NH.sub.4CH.sub.3CO.sub.2 and appropriate
antibiotics in 96-deepwell plates and incubated for 24 h at 30.degree.
C., 900 rpm in a
Multitron incubator. Headspace in the vials was replaced with 97%
argon and 3% oxygen gas (Airgas, MA USA) using a vacuum manifold.
Acetylene was injected to 10% (vol/vol) into each culture vial to
begin the reaction. The acetylene reduction was carried out for 20
h at 30.degree. C. with shaking at 250 rpm, followed by quenching
via the addition of 0.5 mL of 4 M NaOH to each vial.
[0144] Nitrogenase Activity Assay in the Presence of Ammonium.
[0145] Following overnight incubation in minimal medium with a
nitrogen source (described above), cultures were diluted to an
OD.sub.600 of 0.4 in 2 mL of nitrogen-free minimal medium, 1.43 mM
serine (for E. coli and P. protegens Pf-5) and an inducer (for
inducible systems) in 10 mL glass vials with PTFE-silicone septa
screw caps. Ammonium (17.1 mM NH.sub.4CH.sub.3CO.sub.2 for E. coli
and P. protegens Pf-5 and 10 mM NH.sub.4Cl for Rhizobia) was added to
the nitrogen-free minimal medium when testing ammonium tolerance of
nitrogenase activity. Headspace in the vials was replaced with
either 100% argon gas for E. coli or 99% argon plus 1% oxygen for
Pseudomonas and Rhizobia using a vacuum manifold. Acetylene was
injected to 10% (vol/vol) into each culture vial to begin the
reaction. The acetylene reduction was carried out for 20 h at
30.degree. C. with shaking at 250 rpm followed by quenching via the
addition of 0.5 mL of 4 M NaOH to each vial.
[0146] Nitrogenase Activity Assay at Varying Oxygen Levels.
[0147] Following overnight incubation in minimal medium with a
nitrogen source (described above), cultures were diluted to an
OD.sub.600 of 0.4 in 2 mL of minimal medium, 1.43 mM serine (for E.
coli and P. protegens Pf-5), and an inducer (for inducible systems)
in 10 mL glass vials with PTFE-silicone septa screw caps. The vial
headspace was replaced with either 100% nitrogen gas for E. coli or
99% nitrogen plus 1% oxygen for P. protegens Pf-5 and A.
caulinodans using a vacuum manifold. Cultures were incubated with
shaking at 250 rpm at 30.degree. C. for 6 h and 9 h for P.
protegens Pf-5 and A. caulinodans, respectively, after which oxygen
concentrations in the headspace were recorded with the optical
oxygen meter FireStingO2 equipped with a needle-type sensor
OXF500PT (Pyro Science, Germany). After the induction period, no
oxygen remained in the headspace for any species, as confirmed by
the oxygen meter. The initial oxygen levels in the headspace were
adjusted by injecting pure oxygen via syringe into the headspace of
the vials and stabilized with shaking at 250 rpm at 30.degree. C.
for 15 min. Acetylene was then injected to 10% (vol/vol)
into each culture vial to begin the reaction, and the initial oxygen
concentrations in the headspace were recorded concomitantly. The
oxygen levels in the headspace were maintained around the set
points (<.+-.0.25% O.sub.2) while incubating at 250 rpm and
30.degree. C. by injecting oxygen every hour for 3 h with oxygen
monitoring before and after oxygen spiking (FIGS. 26A-26B). The
reactions were quenched after 3 h of incubation by the injection of
0.5 mL of 4 M NaOH to each vial using a syringe.
[0148] Ethylene Quantification.
[0149] Ethylene production was analyzed by gas chromatography using
an Agilent 7890A GC system (Agilent Technologies, Inc., CA USA)
equipped with a PAL headspace autosampler and flame ionization
detector as follows. An aliquot of 0.5 mL headspace preincubated to
35.degree. C. for 30 s was injected and separated for 4 min on a
GS-CarbonPLOT column (0.32 mm.times.30 m, 3 microns; Agilent) at
60.degree. C. and a He flow rate of 1.8 mL/min. Detection occurred
in a FID heated to 300.degree. C. with a gas flow of 35 mL/min H.sub.2
and 400 mL/min air. Acetylene and ethylene were detected at 3.0 min
and 3.7 min after injection, respectively. Ethylene production was
quantified by integrating the 3.7 min peak using Agilent GC/MSD
ChemStation Software.
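As a rough illustration of what integrating the 3.7 min ethylene peak involves, the sketch below applies trapezoidal integration to a detector trace over a retention-time window. It is not the ChemStation algorithm. The window bounds are chosen by the user, no baseline correction is applied, and the function name is hypothetical.

```python
def integrate_peak(times_min, signal, start_min, end_min):
    """Trapezoidal area of the signal between start_min and end_min.

    times_min and signal are equal-length sequences (retention time in
    minutes and detector response). Points outside the window are
    ignored, and no baseline is subtracted.
    """
    if len(times_min) != len(signal):
        raise ValueError("times and signal must have equal length")
    points = [(t, s) for t, s in zip(times_min, signal)
              if start_min <= t <= end_min]
    area = 0.0
    for (t0, s0), (t1, s1) in zip(points, points[1:]):
        area += 0.5 * (s0 + s1) * (t1 - t0)
    return area
```

Converting such an area into an amount of ethylene would additionally require a calibration curve from ethylene standards, which is outside the scope of this sketch.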
[0150] Sample Preparation for RNA-Seq and Ribosome Profiling.
[0151] Cultures of K. oxytoca, E. coli, P. protegens Pf-5 or R. sp.
IRBG74 were grown following the same protocol as used for
nitrogenase activity assay (described above) with a few changes.
Following overnight incubation in minimal medium with a nitrogen
source, cultures were diluted to an OD.sub.600=0.4 in 25 mL of
minimal medium (with an inducer, if needed) and antibiotics in 125
mL Wheaton serum vials (DWK Life Sciences, Cat #223748) with septum
stoppers (Fisher Scientific, Cat #FB57873). The vial headspace was
replaced with either 100% nitrogen gas for E. coli and K. oxytoca
or 99% nitrogen plus 1% oxygen for P. protegens Pf-5 and R. sp.
IRBG74 using a vacuum manifold. Cultures grown 6 h at 30.degree.
C., 250 rpm were filtered onto a nitrocellulose filter with a 0.45
.mu.m pore size (Fisher Scientific, Cat #GVS1215305). Cell pellets
were combined from three vials using a stainless-steel scoopula and
then flash-frozen in liquid nitrogen. The frozen pellets
were added to 650 .mu.l of frozen droplets of lysis buffer (20 mM
Tris (pH 8.0), 100 mM NH.sub.4Cl, 10 mM MgCl.sub.2, 0.4% Triton
X-100, 0.1% NP-40, 1 mM chloramphenicol and 100 U/mL DNase I) in
a prechilled 25 mL canister (Retsch, Germany, Cat #014620213) in
liquid nitrogen and pulverized using a TissueLyser II (Qiagen USA)
set at 15 Hz for five 3-min cycles with intermittent
cooling between cycles. The pellet was removed by centrifugation at
20,000 rcf at 4.degree. C. for 10 min and the lysate was recovered
in the supernatant.
[0152] RNA-Seq Experiments.
[0153] RNA-seq and ribosome-footprint profiling were carried out
according to the method described earlier with a few
modifications (Li, G.-W., Oh, E. & Weissman, J. S. The
anti-Shine-Dalgarno sequence drives translational pausing and
codon choice in bacteria. Nature 484, 538 (2012); Li, G.-W., Burkhardt,
D., Gross, C. & Weissman, J. S. Quantifying absolute protein
synthesis rates reveals principles underlying allocation of
cellular resources. Cell 157, 624-635 (2014)). The total RNA was
isolated using the hot phenol-SDS extraction method. The rRNA
fractions were determined and subtracted from the total using the
MICROBExpress kit (Thermo Fisher Scientific, Cat #AM1905). The
remaining mRNAs and tRNAs were fragmented by RNA fragmentation
reagents (Thermo Fisher Scientific, Cat #AM8740) at 95.degree. C.
for 1 min 45 s. RNA fragments (10-45 bp) were isolated from a 15%
TBE-Urea polyacrylamide gel (Thermo Fisher Scientific, Cat
#EC6885). The 3' ends of the RNA fragments were dephosphorylated
using T4 polynucleotide kinase (1 U/.mu.l, New England Biolabs, Cat
#M0201S) in a 20 .mu.l reaction volume supplemented with 1 .mu.l of
20 U SUPERase.In at 37.degree. C. for 1 h, after which the
denatured fragments (5 pmoles) were incubated at 80.degree. C. for
2 min and ligated to 1 .mu.g of the oligo
(/5rApp/CTGTAGGCACCATCAAT/3ddc/, Integrated DNA technologies) (SEQ
ID NO: 1) in a 20 .mu.l reaction volume supplemented with 8 .mu.l
of 50% PEG 8000, 2 .mu.l of 10.times.T4 RNA ligase 2 buffer, 1
.mu.l of 200 U/.mu.l truncated K227Q T4 RNA ligase 2 (New England
Biolabs, Cat #M0351) and 1 .mu.l of 20 U/.mu.l SUPERase.In
(Invitrogen) at 25.degree. C. for 3 h. The ligated fragments (35-65
bp) were isolated from a 10% TBE-Urea polyacrylamide gel
(Invitrogen, Cat #EC6875). cDNA libraries from the purified mRNA
products were reverse-transcribed using Superscript III (Thermo
Fisher Scientific, Cat #18080044) with oCJ485 primer
(/5Phos/AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT/iSp18/CAAGCAGAAGA
CGGCATACGAGATATTGATGGTGCCTACAG (SEQ ID NO: 2, SEQ ID NO: 3)) at
50.degree. C. for 30 min and RNA products subsequently were
hydrolyzed by the addition of NaOH at a final concentration of 0.1
M, followed by incubation at 95.degree. C. for 15 min. The cDNA
libraries (125-150 bp) were isolated from a 10% TBE-Urea
polyacrylamide gel (Invitrogen, Cat #EC6875). The cDNA products
were circularized in a 20 .mu.l reaction volume supplemented with 2
.mu.l of 10.times. CircLigase buffer, 1 .mu.l of 1 mM ATP, 1 .mu.l
of 50 mM MnCl.sub.2 and 1 .mu.l of CircLigase (Epicentre, Cat #CL4115K)
at 60.degree. C. for 2 h and heat-inactivated at 80.degree. C. for
10 min. 5 .mu.l of circularized DNA was amplified using Phusion HF
DNA polymerase (New England Biolabs, Cat #M0530) with o231 primer
(CAAGCAGAAGACGGCATACGA (SEQ ID NO: 4)) and indexing primers
(AATGATACGGCGACCACCGAGATCTACACGATCGGAAGAGCACACGTCTGAACT
CCAGTCACNNNNNNACACTCTTTCCCTACAC (SEQ ID NO: 5)) for 7 to 10 cycles.
The amplified products (125-150 bp) were recovered from an 8%
TBE-Urea polyacrylamide gel (Invitrogen, Cat #EC62152). The
purified products were analyzed by BioAnalyzer (Agilent, CA USA)
and sequenced with a sequencing primer
(CGACAGGTTCAGAGTTCTACAGTCCGACGATC (SEQ ID NO: 6)) using an Illumina
HiSeq 2500 with a rapid run mode. To generate the RNA-seq read
profile for each nif cluster, the raw trace profiles are multiplied
by 10.sup.7 and normalized by respective total reads from coding
sequences of each species (K. oxytoca M5al, CP020657.1; E. coli
MG1655, NC_000913.3; P. protegens Pf-5, CP000076; R. sp. IRBG74
HG518322, HG518323, HG518324 and an appropriate plasmid carrying a
nif cluster). The mRNA expression level of each gene was estimated
using total sequencing reads mapped onto the gene, representing
fragments per kilobase of transcript per million fragments mapped
units (FPKM).
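The two normalizations described above can be sketched as follows. The FPKM function uses the standard definition (reads on the gene, scaled per kilobase of gene length and per million mapped reads). The trace function applies the stated 10.sup.7 scaling and division by the total coding-sequence reads. Function and parameter names are hypothetical, and the sketch is illustrative rather than the script actually used.

```python
def fpkm(reads_on_gene, gene_length_bp, total_mapped_reads):
    """Fragments per kilobase of transcript per million mapped fragments."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("length and total reads must be positive")
    return reads_on_gene * 1e9 / (gene_length_bp * total_mapped_reads)


def normalize_trace(raw_trace, total_cds_reads, scale=1e7):
    """Scale a per-position read trace by `scale` / total CDS reads.

    raw_trace is a sequence of per-nucleotide read depths across the
    nif cluster. total_cds_reads is the total number of reads mapped to
    coding sequences in that sample.
    """
    if total_cds_reads <= 0:
        raise ValueError("total_cds_reads must be positive")
    return [depth * scale / total_cds_reads for depth in raw_trace]
```

Normalizing by the per-sample read total is what allows traces and expression levels to be compared across species and plasmids that were sequenced to different depths.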
[0154] Ribo-Seq Experiments.
[0155] 0.5 mg of RNA was diluted into 195 .mu.l of the lysis buffer
including 0.5 U RNase inhibitor SUPERase.In (Invitrogen, Cat
#AM2694) and 5 mM CaCl.sub.2 and was treated with 5 .mu.l of 750 U of
micrococcal nuclease (Sigma Aldrich, Cat #10107921001) at
25.degree. C. for 1 h to obtain ribosome-protected monosomes. The
digestions were quenched by the addition of EGTA to a final
concentration of 6 mM and then kept on ice before the isolation of
monosomes. Subsequently, the monosome fraction was collected by
sucrose density gradient (10-55% w/v) ultracentrifugation at 35,000
rpm for 3 h, followed by a hot phenol-SDS extraction to isolate
ribosome-protected mRNA fragments. The mRNA fragments (15-45 bp)
were isolated from a 15% TBE-Urea polyacrylamide gel. The 3' ends
of the purified fragments were dephosphorylated and ligated to the
modified oligo. cDNA libraries generated by Superscript III were
circularized by CircLigase as described above. rRNA products were
depleted by a respective biotinylated oligo mix for E. coli and P.
protegens Pf-5. 5 .mu.l of circularized DNA was amplified using
Phusion HF DNA polymerase with o231 primer and indexing primers for
7 to 10 cycles. The amplified products (125-150 bp) were recovered
from an 8% TBE-Urea polyacrylamide gel. The purified products were
analyzed by BioAnalyzer and sequenced with a sequencing primer
(CGACAGGTTCAGAGTTCTACAGTCCGACGATC (SEQ ID NO: 7)) using an Illumina
HiSeq 2500 with a rapid run mode. Sequences were aligned to
reference sequences using Bowtie 1.1.2 with the
parameters -k 1 -m 2 -v 1. A center-weighting approach was used to map
the aligned footprint reads ranging from 22 to 42 nucleotides in
length. To map the P-site of the ribosome from footprint reads, 11
nucleotides from both ends were trimmed, and the remaining
nucleotides were given the same score, normalized by the length of
the center region. Aligned reads (10-45 nucleotides) were mapped to
the reference with equal weight of each nucleotide. A Python 3.4
script was used to perform the mapping. To generate the Ribo-seq
read profile for each nif cluster, the raw trace profiles are
multiplied by 10.sup.8 and normalized by respective total reads
from coding sequences of each species. To calculate the ribosome
density of each gene, read densities were first normalized in the
following ways: (i) The first and last 5 codons of the gene are
excluded for the calculation to remove the effects of translation
initiation and termination. (ii) A genome-wide read density profile
was fitted to an exponential function and the density at each
nucleotide on a given gene was corrected using this function. (iii)
If the average read density on a gene is higher than 1, a 90%
winsorization was applied to reduce the effect of outliers. The sum
of normalized reads on a gene was normalized by the gene length and
the total read densities on coding sequences to yield the ribosome
density.
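The center-weighting step described above can be sketched in Python as follows. This is an illustrative reading of the description and not the script used in this work: the function names, the 0-based start coordinate, the omission of strand handling and the decision to skip reads that leave no center after trimming are all assumptions.

```python
from collections import defaultdict

TRIM = 11  # nucleotides trimmed from each end of a footprint, per the text


def center_weight(start, length, trim=TRIM):
    """Return {position: weight} for one aligned footprint read.

    start: 0-based leftmost aligned position (assumption).
    Each remaining center nucleotide receives 1 / (center length), so every
    read contributes a total weight of 1.
    """
    center_len = length - 2 * trim
    if center_len <= 0:
        # Assumption: reads with no remaining center contribute nothing.
        return {}
    weight = 1.0 / center_len
    return {start + trim + k: weight for k in range(center_len)}


def footprint_profile(reads, min_len=22, max_len=42):
    """Accumulate center-weighted scores over (start, length) read tuples."""
    profile = defaultdict(float)
    for start, length in reads:
        if not (min_len <= length <= max_len):
            continue  # outside the footprint size range given in the text
        for pos, weight in center_weight(start, length).items():
            profile[pos] += weight
    return dict(profile)


if __name__ == "__main__":
    # Hypothetical reads: two overlapping 24-nt footprints and one 50-nt read.
    print(footprint_profile([(100, 24), (101, 24), (0, 50)]))
```

Because each read sums to a weight of 1, long and short footprints contribute equally to the profile, which keeps the read density comparable across footprint lengths.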
[0156] Calculation of Genetic Part Strengths Based on -Seq
Data.
[0157] The activity of a promoter is defined as the change in RNAP
flux .delta.J around a transcription start site x.sub.tss
(Gorochowski, T. E. et al. Genetic circuit characterization and
debugging using RNA-seq. 13, 952 (2017)). The promoter strength is
calculated by
\[ \delta J = \frac{\gamma}{n} \left[ \sum_{i=x_{tss}+1}^{x_{tss}+1+n} m(i) - \sum_{i=x_{tss}-1-n}^{x_{tss}-1} m(i) \right] \qquad (1) \]
where m(i) is the number of transcripts at each position i from
FPKM-normalized transcriptomic profiles, .gamma.=0.0067 s.sup.-1 is the
degradation rate of mRNA, and n is the window length before and after
x.sub.tss. The window length is set to 10. The terminator strength
T.sub.s is defined as the fold-decrease in transcription before and
after a terminator, which can be quantified from FPKM-normalized
transcriptomic profiles as
\[ T_s = \frac{\sum_{i=x_1+1}^{x_1+n} m(i)}{\sum_{i=x_0-n}^{x_0-1} m(i)} \qquad (2) \]
where x.sub.0 and x.sub.1 are the beginning and end positions of
the terminator part, respectively. Translation efficiency was
calculated by dividing the ribosome density by the FPKM.
[0158] nifH Expression Analysis.
[0159] Complementation of NifA was tested using plasmids pMR-128 to
130, which contain sfgfp fused to the nifH promoter, in the A.
caulinodans .DELTA.nifA mutant. The inducible NifA/RpoN expression
was provided by the plasmid pMR-121 into which sfgfp driven by the
nifH promoter was added to analyze nifH promoter activity, yielding
pMR-131 (FIG. 29). The IPTG-inducible system in the plasmid pMR-124
was substituted with other inducible systems including the
salicylic acid-inducible, nopaline-inducible and octopine-inducible
systems, yielding pMR-125, 126, and 127, respectively. Each of the
plasmids was mobilized into the A. caulinodans .DELTA.nifA mutant,
which was grown following the same protocol as used for nitrogenase
activity (described herein). Following overnight incubation in
minimal medium with a nitrogen source, cultures were diluted to an
OD.sub.600=0.4 in 2 mL of UMS medium plus 30 mM lactate,
antibiotics and an inducer (for inducible systems) in 10 mL glass
vials with PTFE-silicone septa screw caps. Headspace in the vials
was replaced with 99% argon plus 1% oxygen using a vacuum manifold.
The vials were incubated with shaking at 250 rpm at 30.degree. C.
for 9 h, after which 10 .mu.l of cultures was diluted into 150
.mu.l PBS with 2 mg/mL kanamycin for flow cytometry analysis. To
test activation of the nifH promoters by diverse NifA proteins, the
plasmids pMR-51, 53, 88, 89 and 90 were introduced into E. coli
MG1655 and the plasmids pMR-91, 92, 93, 94 and 95 to P. protegens
Pf-5. The plasmid pMR-101 was used to provide inducible NifA
expression by IPTG in E. coli. The controller encoding the
IPTG-inducible NifA was inserted into the genome of P. protegens
Pf-5 using the plasmids pMR-96, 97 and 98. The IPTG-inducible
system of the NifA controller plasmid pMR-96 was replaced with the
arabinose-inducible and the naringenin-inducible system, yielding
pMR-99 and 100, respectively. The inducibility of nifH expression
was assessed by the reporter plasmids pMR-105 to 107 and pMR-102 to
104 for E. coli and P. protegens Pf-5, respectively. The controller
plasmids were transformed into E. coli or P. protegens Pf-5 with
the reporter plasmids. Following overnight incubation in minimal
medium with a nitrogen source, cultures were diluted to an
OD.sub.600=0.4 in 2 mL of BB medium, antibiotics and an inducer
(for inducible systems) in 10 mL glass vials with PTFE-silicone
septa screw caps. Headspace in the vials was replaced with either
100% argon for E. coli or 99% argon plus 1% oxygen for P. protegens
Pf-5 using a vacuum manifold. The vials were incubated with shaking
at 250 rpm at 30.degree. C. for 9 h, after which 10 .mu.l of
cultures was diluted into 150 .mu.l PBS with 2 mg/mL kanamycin for
flow cytometry analysis.
[0160] Sequence Alignment.
[0161] NifA sequences of R. sphaeroides 2.4.1 (RSP_0547) and A.
caulinodans ORS571 (AZC_1049) were obtained from NCBI. NifA protein
sequences were aligned with MUSCLE
(https://www.ebi.ac.uk/Tools/msa/muscle/) with default settings
(FIG. 22).
[0162] Results
[0163] Performance of Native Nif Clusters in E. coli, P. Protegens
Pf-5, and Symbiotic Rhizobia
[0164] A set of diverse native nif clusters were cloned in order to
determine their relative performance in different strains and the
associated species barriers (FIG. 1A).
[0165] Previously defined boundaries for the well-studied nif cluster
from K. oxytoca (Arnold, W., Rump, A., Klipp, W., Priefer, U. B. &
Pühler, A. Nucleotide sequence of a 24,206-base-pair DNA fragment
carrying the entire nitrogen fixation gene cluster of Klebsiella
pneumoniae. Journal of Molecular Biology 203, 715-738 (1988)) and the
small (10 kb) cluster from Paenibacillus polymyxa WLY78.sup.70 were
used. Similarly, the published
boundaries (43.7 kb) of the P. stutzeri A1501 (Yan, Y. et al.
Nitrogen fixation island and rhizosphere competence traits in the
genome of root-associated Pseudomonas stutzeri A1501. Proceedings
of the National Academy of Sciences (2008)) and A. vinelandii DJ
clusters were used (Hamilton, T. L. et al. Transcriptional
profiling of nitrogen fixation in Azotobacter vinelandii. J
Bacteriol 193, 4477-4486, doi:10.1128/JB.05099-11 (2011)). A region
of the P. stutzeri A1501 nif cluster (Pst1307-Pst1312) was excluded
as these genes are predicted to have no effect on nitrogenase. A.
vinelandii DJ contains three putative electron transport systems
(the Rnf1 and Rnf2 complexes and the Fix complex) located in other
regions of the genome. RNA-seq data shows that Rnf2 is not
co-expressed with the nif genes, so only the Rnf1 and Fix complexes
were included by fusing their DNA to create a single 46.9 kb
construct.
[0166] The nif cluster (40.1 kb) from Azospirillum brasilense Sp7 was selected
because this species is a cereal endophyte and fixes nitrogen in
free-living conditions. Several less-studied gene clusters were
also cloned in order to probe species barriers. As a representative
of cyanobacteria, the gene cluster from Cyanothece sp. ATCC51142
was cloned following published boundaries. Its transcriptional
activator PatB occurs outside of the nif cluster, which was cloned
along with its native promoter and fused to the nif cluster to form a
single construct (31.7 kb). Several gene clusters were selected
from photosynthetic purple bacteria (Rhodopseudomonas palustris
CGA009 (Oda, Y. et al. Functional genomic analysis of three
nitrogenase isozymes in the photosynthetic bacterium
Rhodopseudomonas palustris. 187, 7784-7794 (2005)) and Rhodobacter
sphaeroides 2.4.1 (Haselkorn, R. & Kapatral, V. in Genomes and
genomics of nitrogen-fixing organisms, 71-82 (Springer, 2005))) as
these are members of the same alphaproteobacteria class as
Rhizobia.
[0167] The rnf cluster, encoded on a separate chromosome of R.
sphaeroides 2.4.1, was added to the nif cluster to
provide electrons to nitrogenase. Finally, the gene clusters from
the sugarcane and rice endosymbiont Gluconacetobacter
diazotrophicus PA1 5 (28.9 kb) as well as the three nif clusters
from A. caulinodans ORS571 (64 kb).sup.37 were cloned together with
an upstream regulator fixLJK, but these were found to be inactive
in all species tested, so they are not shown in FIGS. 1A-1F. The
precise genomic locations for all the nif clusters are provided in
Table 2 and the plasmids containing nif clusters are provided in
Table 3.
[0168] Each cluster was amplified from genomic DNA as multiple
fragments by PCR and assembled with the plasmid backbone using
yeast assembly (see Materials and Methods Section). The P. polymyxa
WLY78 cluster was de novo synthesized based on the DNA sequence on
contig ALJV01 (Shanks, R. M. et al. Saccharomyces cerevisiae-based
molecular tool kit for manipulation of genes from gram-negative
bacteria. 72, 5027-5036 (2006)). The clusters were cloned into
different plasmid systems to facilitate transfer. For transfer to
E. coli and R. sp. IRBG74, the broad-host range plasmid based on a
pBBR1 origin was used (a second compatible RK2-origin plasmid was
used for the nif cluster from A. caulinodans ORS571). These
plasmids contain the RK2 oriT to enable the conjugative transfer of
large DNA (see Materials and Methods). For transfer to P. protegens
Pf-5, this plasmid system was found to be unstable and produce a
mixed population. To transfer into this strain, the
Pseudomonas-specific plasmid pRO1600 with the oriT was used. After
construction, all of the plasmids were verified using
next-generation sequencing (see Materials and Methods Section).
[0169] The set of 10 nif clusters were transferred into E. coli
MG1655, the cereal epiphyte P. protegens Pf-5, and the cereal
endophyte R. sp. IRBG74 to create 30 strains (FIG. 1A). E. coli was
selected as a control as successful transfers to this recipient
have been performed. Native P. protegens Pf-5 does not fix
nitrogen. R. sp. IRBG74 contains two nif clusters in different
genomic locations, which were left intact, but does not have
nitrogenase activity under free living conditions. The genomic
cluster does not have the required NifV enzyme as it obtains
homocitrate from the plant. All of the clusters in the set have
nifV, except the one from P. polymyxa WLY78. A test was run to
determine whether the expression of recombinant NifV from A.
caulinodans ORS571 in R. sp. IRBG74 would result in active
nitrogenase, but no activity was detected.
[0170] The bacteria were grown in appropriate media, including
antibiotics, and then evaluated for nitrogenase activity using an
acetylene reduction assay (see Materials and Methods Section). E.
coli and Pseudomonas were grown at 30.degree. C. in BB minimal
media, as described previously.sup.71. However, no growth was
observed for R. sp. IRBG74 under these conditions. Different media
and carbon sources were tested and it was found that UMS media with
dicarboxylic acids (malate or succinate), the major carbon source
from plants.sup.147, with 10 mM sucrose yielded the highest growth
rates (FIG. 6). After overnight growth, cells were transferred to
stoppered test tubes in ammonium-free minimal media to a final
OD.sub.600 of 0.4. For E. coli, the headspace air was completely
replaced with argon gas. For P. protegens Pf-5 and R. sp. IRBG74,
the initial headspace concentration of oxygen was maintained at 1%
because these bacteria require oxygen for their metabolism. The
cells were incubated at 30.degree. C. for 20 hours in the presence of
excess acetylene and the conversion to ethylene was quantified by
GC-MS (see Materials and Methods Section). There was no significant
growth for any of the strains under these conditions, so the
nitrogenase activities reported correspond to the same cell
densities.
[0171] A surprising 6 out of 10 clusters were functional in E. coli
MG1655, with the K. oxytoca cluster producing the highest activity
(FIG. 1A). The K. oxytoca cluster is also functional in P.
protegens Pf-5, albeit with 60-fold less activity as compared to
that in E. coli MG1655. Interestingly, the clusters from P.
stutzeri and A. vinelandii--both obligate aerobes--are able to
achieve high activities in P. protegens Pf-5. The resulting
nitrogenase activities are 3- to 7-fold higher than that achieved
from K. oxytoca, which only fixes nitrogen under strict anaerobic
conditions. These clusters have common organizational features and
similar electron transport chains, such as the Rnf complex.
[0172] A single gene cluster, from R. sphaeroides, yielded
nitrogenase activity in R. sp. IRBG74 (FIG. 1A). Notably, both
Rhizobium and Rhodobacter are alphaproteobacteria and their nif
clusters may contain interchangeable genes. When the native nif
clusters are knocked out of R. sp. IRBG74, introducing the R.
sphaeroides cluster alone does not yield active nitrogenase. These
data point to a complex complementation between the endogenous and
introduced gene clusters. To determine whether this approach could
be generalized to other symbiotic Rhizobia, the Rhodobacter and
Rhodopseudomonas gene clusters were transferred to a panel of 12
species isolated from diverse legumes (FIG. 1A). Remarkably, the
transfer of these clusters was able to produce detectable
nitrogenase activity in 7 of the strains.
[0173] Phylogenetic analysis was performed based on the full-length
16S rRNA gene sequences (K. oxytoca, BWI76_05380; A. vinelandii,
Avin_55000; R. sphaeroides, DQL45_00005; Cyanothece ATCC51142,
cce_RNA045; A. brasilense, AMK58_25190; R. palustris, RNA_55; P.
protegens, PST_0759; Paenibacillus sp. WLY78, JQ003557). A multiple
sequence alignment was generated using MUSCLE (Edgar, R. C.
MUSCLE: multiple sequence alignment with high accuracy and
high throughput. Nucleic Acids Research 32, 1792-1797 (2004)). A phylogenetic tree was
constructed using the Geneious software (R9.0.5) with the
Jukes-Cantor distance model and UPGMA as a tree build method, with
bootstrap values from 1,000 replicates. This phylogenetic tree is
shown in FIG. 30A. The scale bar indicates 2% substitutions per
site. The clusters based on evolutionary closeness are circled.
Using the same data from FIG. 1A, FIG. 30B summarizes the relative
nitrogenase activity in the three host strains carrying each of the
10 nif clusters. The result indicates that phylogenetic
closeness has predictive power for achieving the highest nitrogenase
activity in a new host that lacks a nif cluster.
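For illustration, the Jukes-Cantor distance between two aligned 16S rRNA sequences can be computed with the short Python sketch below. It is not a description of how the Geneious software implements the model; the function names, the toy sequences and the choice to ignore alignment columns containing a gap are assumptions.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Assumption: columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def jukes_cantor(seq_a, seq_b):
    """Jukes-Cantor corrected distance: d = -(3/4) * ln(1 - (4/3) * p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 0.75:
        raise ValueError("distance undefined for p >= 0.75")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


if __name__ == "__main__":
    # Hypothetical aligned fragments, for illustration only.
    print(jukes_cantor("ACGTACGTAC", "ACGTACGTAA"))
```

The pairwise distances produced this way form the matrix from which a UPGMA tree is built by repeatedly joining the two closest clusters.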
[0174] Hereafter, studies were conducted to further characterize
the extent to which changes in transcription and translation
impacted the differences in activity observed when a native cluster
is transferred between species. Differences in promoter activity,
ribosome binding sites, and codon usage could change the expression
levels of nif genes in detrimental ways. To quantify this effect,
RNA-seq and ribosome profiling experiments were performed to
evaluate the expression of the K. oxytoca nif cluster in K. oxytoca as
well as E. coli MG1655, P. protegens Pf-5, and R. sp. IRBG74.
RNA-seq experiments provide mRNA levels of genes (calculated as
FPKM) and can be used to measure the performance of promoters and
terminators. Ribosome profiling can be used to quantify protein
synthesis rates, ribosome binding site (RBS) strength and ribosome
pausing internal to genes. The ribosome density (RD) has been shown
to correlate with protein expression rates. The translation
efficiency is calculated by normalizing the RD by the number of
transcripts (FPKM from RNA-seq). Ribosome profiling has been
applied to determine the relative levels of proteins expressed in
multi-subunit complexes.
[0175] The RNA-seq profiles in both the sense and antisense
direction are very close when compared between K. oxytoca and E.
coli (FIGS. 1B-1C) and the ratios between mRNAs are preserved
(R.sup.2=0.89) (FIG. 1D). This is consistent with the observation
that this cluster yields a similar activity in both hosts. In
contrast, the RNA-seq profiles differ more significantly for P.
protegens Pf-5 and R. sp. IRBG74 (FIGS. 1B-1C), and there was no
correlation between mRNA transcripts (FIG. 1D).
[0176] The ratios between protein expression rates were measured
using ribosome profiling (FIG. 1E and FIG. 9). It is noteworthy
that the ratios measured in K. oxytoca almost perfectly correlate
with immunoblotting assays of A. vinelandii and the stoichiometry
of H:D:K reflects the known 2:1:1 ratio. Interestingly, unlike mRNA
levels, the ratios in expression rates are strongly correlated when
the cluster is transferred between species: E. coli (R.sup.2=0.94),
P. protegens Pf-5 (R.sup.2=0.61), and R. sp. IRBG74 (R.sup.2=0.71)
(FIGS. 1E-1F). The production of NifH is significantly lower in R.
sp. IRBG74 as compared to other strains. In an attempt to increase
the induction of the cluster in this host, NifA was overexpressed,
but this proved unsuccessful in producing high levels of active
nitrogenase (FIGS. 10A-10B).
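The R.sup.2 values reported above compare per-gene expression in the native host with expression in a recipient host. A minimal Python sketch of such a comparison follows. The text does not state whether the values were log-transformed before the correlation was computed, so the log10 transform, the function names and the toy numbers are assumptions.

```python
import math


def r_squared(x, y):
    """Squared Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    syy = sum((v - mean_y) ** 2 for v in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy * sxy / (sxx * syy)


def log_r_squared(native, transferred):
    """R^2 between log10 expression values (assumed transform) of the same genes."""
    return r_squared(
        [math.log10(v) for v in native],
        [math.log10(v) for v in transferred],
    )


if __name__ == "__main__":
    # Hypothetical per-gene expression levels in two hosts.
    native = [1.0, 10.0, 100.0]
    transferred = [2.0, 20.0, 200.0]
    print(log_r_squared(native, transferred))
```

A uniform fold-change in every gene leaves the value near 1, so the measure reflects whether the ratios between genes are preserved rather than whether the absolute levels match.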
[0177] The following summarizes the results of the transfer of
native nif clusters to new species. The most successful recipient
was E. coli. However, this is not a viable agricultural strain and
activity was eliminated in the presence of 17.1 mM ammonium (FIGS.
7A-7E, and FIGS. 8A-8B). Moderately high activity was obtained in
P. protegens Pf-5, but this yielded a constitutively-on response
(the K. oxytoca cluster) or was strongly repressed by ammonium (the
A. vinelandii cluster). It was also found that the P. stutzeri
cluster in P. protegens Pf-5 is inactive in the presence of
ammonium, in disagreement with previously published results
(Setten, L. et al. Engineering Pseudomonas protegens Pf-5 for
nitrogen fixation and its application to improve plant growth under
nitrogen-deficient conditions. PLoS One 8, e63666 (2013)). Only low
levels of activity could be obtained by transferring clusters to
Rhizobia. To address these issues, different approaches were
applied to engineer the clusters to generate higher activity,
exhibit less repression by ammonium, and be inducible.
Transfer of Refactored Klebsiella Nif Clusters to R. Sp. IRBG74
[0178] The process of refactoring a gene cluster involves the
complete reconstruction of the genetic system from the bottom-up,
using only well-characterized genetic parts. An exhaustive approach
is to recode the genes (to eliminate internal regulation),
reorganize into operons, control expression with synthetic ribosome
binding sites (RBSs), and use T7 RNAP promoters and terminators. A
separate "controller," carried in a genetically distinct location,
links synthetic sensors and circuits to the expression of T7 RNAP.
For various applications, this approach has proven useful for
transferring multi-gene systems between species, simplifies
optimization through part replacement and enzyme mining, and
enables the replacement of environmental signals that naturally
control the cluster with the stimuli that induce the synthetic
sensors (Smanski, M. J. et al. Synthetic biology to access and
expand nature's chemical diversity. Nature Reviews Microbiology 14,
135 (2016); Song, M. et al. Control of type III protein secretion
using a minimal genetic system. 8, 14737 (2017); Guo, C.-J. et al.
Discovery of reactive microbiota-derived metabolites that inhibit
host proteases. 168, 517-526. e518 (2017); Ren, H., Hu, P., Zhao,
H. J. B. & bioengineering. A plug-and-play pathway refactoring
workflow for natural product research in Escherichia coli and
Saccharomyces cerevisiae. 114, 1847-1854 (2017)). In previous
studies, the Klebsiella nif cluster was refactored, which was
subsequently used as a platform to optimize activity by changing
the genetic organization and the parts controlling expression. The
top variant (v2.1) fully recovered activity in a K. oxytoca nif
knockout and is functional in E. coli. To transfer into E. coli, a
controller based on the isopropyl-.beta.-D-thiogalactoside
(IPTG)-inducible T7 RNAP carried on a plasmid was used (FIG. 2A).
An interesting observation during optimization is that the genetic
organization of the native cluster, including the existence of
operons, was not correlated with activity.
[0179] An advantage of using T7 RNAP is that it is functional in
essentially all prokaryotes, so the refactored cluster can be
transferred as-is and transcription induced by expressing T7 RNAP
in the new host. However, a new controller needs to be built for
each host based on regulation and regulatory parts that work in
that species. A controller for E. coli was designed based on the
IPTG-inducible T7 RNAP carried on a plasmid (pKT249) (FIG. 2A). To
transfer the refactored cluster to R. sp. IRBG74, first a
controller was constructed that functions in this species and
produces an equivalent range of T7 RNAP expression.
[0180] While a handful of inducible systems and sets of genetic
parts have been previously described for Rhizobia, a new part
collection needed to be built and characterized in order to have
those needed to create a controller with sufficient dynamic range.
First, a set of 20 constitutive promoters (Anderson, J. et al.
BglBricks: A flexible standard for biological part assembly. 4, 1
(2010)) and seven T7 RNAP-dependent promoters (Temme, K., Zhao, D.
& Voigt, C. A. Refactoring the nitrogen fixation gene cluster
from Klebsiella oxytoca. Proceedings of the National Academy of
Sciences 109, 7085-7090 (2012)) that were found to span a range of
382-fold and 23-fold expression, respectively, were characterized
(FIGS. 11A-11C). Second, a library of 285 ribosome binding sites
(RBSs) was screened using the RBS Library Calculator, representing
an expression range of 5,600-fold (FIGS. 12A-12B). Finally, a set
of 29 terminators was characterized, of which 17 were found to have
a terminator strength >10 (FIGS. 13A-13B). Using these part
libraries, six inducible systems for R. sp. IRBG74 were then
constructed that respond to IPTG, the quorum signal 3OC6HSL, aTc,
cuminic acid, DAPG, and salicylic acid (FIG. 14). After
optimization, these systems generate between 7- and 400-fold
induction.
[0181] A controller was then constructed by using the optimized
IPTG-inducible system to drive the expression of a variant of T7
RNAP (R632S, N-terminal lon tag, GTG start codon) (FIG. 2A). RBS
variants controlling T7 RNAP expression were tested and an
intermediate strength was selected to maximize induction while
limiting toxicity (FIG. 16). The controller was carried on the
genome by replacing recA (see Materials and Methods). The response
function of the final controller is compared to that obtained for
pKT249 in E. coli, showing that they sweep through the same range
of expression at intermediate levels of induction (FIG. 2B). To
achieve the same level of induction in the two species, 0.1 mM IPTG
is selected for E. coli and 0.5 mM for R. sp. IRBG74 (circled
points in FIG. 2B).
[0182] The refactored v2.1 cluster was then transferred to R. sp.
IRBG74, but no activity was observed (FIGS. 2C-2D). Activity was
also not observed when the v2.1 cluster was transferred to P.
protegens Pf-5 (FIG. 17). To determine if the genetic parts that
make up the refactored cluster were functioning as designed,
RNA-seq and ribosome profiling experiments were performed (FIG.
18). From these data, the strengths of promoters/terminators and
the transcription level and translation rates of genes could be
calculated (see Materials and Methods). The performance of the
promoters in R. sp. IRBG74 was systematically lower than in E. coli,
particularly the first promoter controlling nifH (FIG. 2E). The
terminators were functioning the same in the two species, albeit
weakly, and no termination could be detected from the three
terminators in the center of the cluster (FIG. 2E). The translation
of the genes differed significantly between organisms (FIG. 2F).
When the expression rates of the nif genes from the refactored
cluster are compared with their levels in their native context in
K. oxytoca, there is almost no correlation (FIG. 2F). Importantly,
there is 9-fold less NifH expressed from the refactored cluster in
R. sp. IRBG74 as compared to the same cluster in E. coli. Thus, the
refactored cluster produces wildly different expression levels of
the component genes when transferred between organisms, even when
transcription is matched between them using different
controllers.
[0183] Based on these results, a new refactored cluster (v3.2)
(FIG. 2G) was designed. A very strong promoter was chosen for nifH.
The transcription was broken up by adding promoters to divide
nifENX and nifJ and selecting stronger terminators. Noting that the
expression ratios between nif genes are better preserved when the
native cluster is transferred to a new host (FIG. 1D) but not the
refactored cluster (FIG. 2F), it was hypothesized that this could
be due to the disruption of the operon structures and the
associated translational coupling between genes. The K. oxytoca
operons were cloned intact, including native RBSs, and used to replace
the corresponding regions of the refactored cluster (FIG. 2G). Note
that this also preserves nifT and nifX, which were not included in the
first versions because they were either inessential (Simon, H. M.,
Homer, M. J. & Roberts, G. P. Perturbation of nifT
expression in Klebsiella pneumoniae has limited effect on nitrogen
fixation. Journal of Bacteriology 178, 2975-2977 (1996)) or inhibitory
(Gosink, M. M., Franklin, N. M. & Roberts, G. P. The product of the
Klebsiella pneumoniae nifX gene is a negative regulator of the
nitrogen fixation (nif) regulon. Journal of Bacteriology 172,
1441-1447 (1990)).
[0184] Compared to v2.1, the v3.2 cluster is less active in E. coli
but is active in R. sp. IRBG74 (FIG. 2H) and P. protegens Pf-5
(FIG. 17). This experiment was performed in the double nif knockout
strain in R. sp. IRBG74, thus indicating that the refactored
cluster is self-contained in producing nitrogenase activity.
RNA-seq and ribosome profiling were applied to evaluate the
performance of v3.2 in all three species (FIG. 2I, FIG. 19, and
FIGS. 20A-20F). The promoters perform similarly in the different
hosts, but there was significant diversity in terminator function.
Despite this, the translation rates (RD) of the genes were
remarkably consistent and NifH expression is nearly identical (FIG.
2J). The higher expression of NifH and the preserved ratios between
proteins are the likely reasons that the refactored cluster is
functional in R. sp. IRBG74. The next attempt was to increase the
expression level of the nif genes in R. sp. IRBG74 by increasing
the concentration of inducer used, but a clear optimum beyond which
increased expression caused a rapid decline in activity was found
(FIG. 2M). This indicates a potential upper limit in obtaining
activity in R. sp. IRBG74 under free living conditions using only
the genes from K. oxytoca.
Replacement of A. caulinodans Nif Regulation with Synthetic
Control
[0185] The A. caulinodans nif genes are distributed across three
clusters in different genomic locations. The regulatory signals
converge on the NifA activator that, in concert with the RpoN sigma
factor, turns on transcription of the genomic nif clusters.
Numerous and not fully characterized environmental signals are
integrated upstream of this node, including NtrBC (Kaminski, P. A.
& Elmerich, C. The control of Azorhizobium caulinodans
nifA expression by oxygen, ammonia and by the HF-I-like protein,
NrfA. Molecular Microbiology 28, 603-613 (1998)), NtrXY (Pawlowski,
K., Klosse, U. & de Bruijn, F. J. Characterization of a novel
Azorhizobium caulinodans ORS571 two-component regulatory system,
NtrY/NtrX, involved in nitrogen fixation and metabolism. Molecular
and General Genetics MGG 231, 124-138 (1991)), FixLJK (Kaminski, P.
& Elmerich, C. Involvement of fixLJ in the regulation of nitrogen
fixation in Azorhizobium caulinodans. Molecular Microbiology 5,
665-673 (1991); Kaminski, P., Mandon, K., Arigoni, F., Desnoues, N.
& Elmerich, C. Regulation of nitrogen fixation in Azorhizobium
caulinodans: identification of a fixK-like gene, a positive regulator
of nifA. Molecular Microbiology 5, 1983-1991 (1991)), NrfA (Kaminski,
P. A. & Elmerich, C. The control of Azorhizobium caulinodans nifA
expression by oxygen, ammonia and by the HF-I-like protein, NrfA.
Molecular Microbiology 28, 603-613 (1998)), and PII proteins (e.g.,
GlnB and GlnK (Michel-Reydellet, N. & Kaminski, P. A. Azorhizobium
caulinodans PII and GlnK proteins control nitrogen fixation and
ammonia assimilation. Journal of Bacteriology 181, 2655-2658
(1999))). The clusters (64 kb total,
containing 76 genes) were cloned into the plasmid systems described
above and transferred into R. sp. IRBG74 and P. protegens Pf-5, but
no activity was found in either strain. Overexpression of A.
caulinodans NifA and RpoN did not lead to activity and, upon
further investigation, these regulators were found to be inactive
in these strains. The size of the clusters and the lack of genetic
and gene function information would complicate fully refactoring
the system. For these reasons, it was decided to modify the
regulation controlling nif such that it can be placed under the
control of synthetic sensors.
[0186] One goal herein was to eliminate ammonium repression of
nitrogenase activity, which converges on the regulation of NifA.
The native nifA gene was knocked out of the genome using the sacB
markerless deletion method (see Materials and Methods), with the
intent of placing NifA under inducible control (FIG. 3A). There is
only basal activity from the nifH promoter in the .DELTA.nifA
strain (FIG. 3B). When NifA is overexpressed, the promoter turns on
and its activity is further enhanced by the co-expression of RpoN
in an operon (note that the genomic rpoN gene is left intact for
these experiments). The IPTG-inducible system designed for
Rhizobium (previous section) was tested in A. caulinodans carried
on a pBBR1-ori plasmid. Using GFP, this was found to induce
expression over several orders of magnitude (FIG. 21). Then, the A.
caulinodans nifA and rpoN genes were placed under IPTG control and
the fluorescent reporter fused to the A. caulinodans nifH promoter
(encompassing 281 nt upstream of the ATG), carried on the same
plasmid (see Materials and Methods). The response function from the
nifH promoter was analyzed at the condition used for nitrogen
fixation, exhibiting a wide dynamic range of up to 45-fold (FIG. 3C).
[0187] The controller was designed to co-express NifA and RpoN and
tested for its ability to induce nitrogenase (FIG. 3D). When fully
induced, there was a complete recovery of activity as compared to
the wild-type strain. The repression of nitrogenase activity by
ammonium was then evaluated. The presence of 10 mM ammonium
chloride leads to no detectable activity by the wild-type strain
(FIG. 3E). Even when both NifA and RpoN are under inducible
control, there is strong repression with only 5% of the nitrogenase
activity of the wild-type. This suggests that the
post-transcriptional control of NifA activity by ammonium remains
intact.
[0188] In related alphaproteobacteria, mutations have been
identified in NifA that abrogate ammonium repression (Paschen, A.,
Drepper, T., Masepohl, B. & Klipp, W. Rhodobacter capsulatus
nifA mutants mediating nif gene expression in the presence of
ammonium. FEMS microbiology letters 200, 207-213 (2001); Rey, F.
E., Heiniger, E. K. & Harwood, C. S. Redirection of metabolism
for biological hydrogen production. Applied and environmental
microbiology 73, 1665-1671 (2007)). These mutations occur in the
N-terminal GAF domain. Using a multiple sequence alignment, two
equivalent residues were identified to mutate in A. caulinodans
(L94Q and D95Q) (FIG. 22). These mutations were made and then
tested individually and in combination (FIG. 3D). When the double
mutant of NifA is co-expressed with RpoN, the presence of ammonium
only results in a slight decrease in activity.
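By way of a non-limiting illustration, the substitution notation used
above (wild-type residue, 1-based position, replacement residue, as in
L94Q) can be applied to a protein sequence programmatically. The sketch
below is an assumption made for clarity only: the function names are
hypothetical, and the placeholder sequence is NOT the A. caulinodans
NifA sequence; it merely carries L and D at positions 94 and 95.

```python
import re


def apply_substitution(protein: str, mutation: str) -> str:
    """Apply one substitution such as 'L94Q' (positions are 1-based)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation!r}")
    wild_type, replacement = match.group(1), match.group(3)
    index = int(match.group(2)) - 1
    if not 0 <= index < len(protein):
        raise ValueError(f"position out of range for {mutation!r}")
    # Refuse to mutate if the sequence does not carry the expected residue.
    if protein[index] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {index + 1}, "
            f"found {protein[index]}"
        )
    return protein[:index] + replacement + protein[index + 1:]


def apply_substitutions(protein: str, mutations: list[str]) -> str:
    """Apply several substitutions; each is checked against the input residue."""
    for mutation in mutations:
        protein = apply_substitution(protein, mutation)
    return protein


# Hypothetical placeholder sequence (not a real NifA sequence).
toy_sequence = "A" * 93 + "LD" + "A" * 5
double_mutant = apply_substitutions(toy_sequence, ["L94Q", "D95Q"])
```

Checking the wild-type residue before substituting guards against
off-by-one numbering errors, which are a plausible risk when residue
positions are carried over from a multiple sequence alignment.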
[0189] Oxygen irreversibly inhibits nitrogenase and represses nif
clusters. The inducible nif clusters were tested for oxygen
sensitivity, noting that A. caulinodans is an obligate aerobe and
fixes nitrogen under micro-aerobic conditions. The tolerance of
nitrogenase to oxygen was then assessed as a function of the
concentration of oxygen in the headspace, held constant by
injecting oxygen while monitoring its level (Methods and FIG. 26A).
The native and inducible gene clusters responded nearly identically
to oxygen (FIG. 3F). The optimum activity occurs between 0.5% and 1%
with a wide tolerance (30% activity at 3% oxygen).
Introduction of Controllable Nif Activity in P. protegens Pf-5
[0190] The native K. oxytoca, P. stutzeri, and A. vinelandii nif
clusters are all functional in P. protegens Pf-5 (FIG. 1A).
However, when the native P. stutzeri and A. vinelandii clusters are
transferred, nitrogenase is strongly repressed. In contrast,
transferring the native K. oxytoca cluster produces uncontrolled
(constitutively on) nitrogenase activity (FIG. 4E). For these three
clusters in P. protegens Pf-5, it was sought to gain regulatory
control by removing the nifA master regulators from the clusters
and expressing them from a controller (FIG. 4A).
[0191] As with Rhizobia, it was found that first, part libraries
for P. protegens Pf-5 had to be built before building controllers
with sufficient dynamic range. A range of 20 constitutive promoters
and seven T7 promoters that span a range of 778-fold and 24-fold
expression, respectively, was characterized (FIGS. 11A-11C). A
library of 192 RBSs was screened, representing an expression range
of 4,079-fold (FIGS. 12A-12B). A set of seven terminators that
share no sequence homology between each other and have a terminator
strength >10 in R. sp. IRBG74 was selected and characterized
together with the three well-used terminators (e.g., T7 terminator,
rrnBT1, and L3S2P21). These seven terminators showed a terminator
strength >50 (FIGS. 13A-13B).
[0192] The inducible systems designed for Rhizobium were
transferred as-is to a Pseudomonas-specific pRO1600 plasmid (see
Materials and Methods). The 3OC6HSL-, aTc-, cuminic acid-, and
DAPG-inducible systems were all found to be functional (FIG. 15A).
In addition, a naringenin-inducible system based on the P.sub.fde
promoter was constructed and found to be functional. The strength
of the arabinose-inducible system was increased by substituting the -10
box in the P.sub.BAD promoter, and arabinose import was improved by
constitutive expression of the arabinose transporter AraE (FIG.
15B). Finally, the IPTG-inducible system was optimized for P.
protegens Pf-5 by replacing the P.sub.A1lacO1 promoter with the
P.sub.tac promoter and making three amino acid substitutions to LacI
(Meyer, A. J., Segall-Shapiro, T. H., Glassey, E., Zhang, J. &
Voigt, C. A. J. N. c. b. Escherichia coli "Marionette" strains with
12 highly optimized small-molecule sensors. 1 (2018)). This effort
resulted in seven new inducible systems that produce 41- to
554-fold induction in P. protegens Pf-5 (FIG. 15C).
[0193] To simplify the comparison between clusters, it was sought
to build a single, universal controller that could induce all
three. Each has a different NifA sequence, so the ability to cross
induce the gene clusters was tested. To do this, the nifH promoters
from each nif cluster were cloned and fused to gfp to build
plasmid-based reporters (see Materials and Methods). The ability of
the various NifA homologues to activate the nifH promoters was
evaluated in E. coli and P. protegens Pf-5 (FIGS. 23A-23B). The
results suggest that it is more important to express a NifA variant
from a similar species as the host, as opposed to expressing the
NifA variant that is cognate to the transferred cluster. This may
be due to the need for NifA to recruit host transcriptional
machinery, whereas the NifA binding sites in the promoters are well
conserved across species. Based on these data, the controller was
constructed using the P. stutzeri NifA, placed under the control of
the optimized IPTG-inducible system, described above. The RBSs of
NifA were synthetically designed to span a wide range of expression
of nif genes (FIG. 24A). The controller was inserted into the
genome 25 bp downstream of the stop codon of glmS using the
mini-Tn7 system. The ability for this controller to induce the nifH
promoter from each cluster using a fluorescent reporter is shown in
FIG. 4C and FIG. 24B.
[0194] The nitrogenase activity for each of the gene clusters in P.
protegens Pf-5 was then assessed (FIG. 4D). The three P. protegens
Pf-5 strains containing the transferred clusters were modified to
insert the controller and delete the native nifLA genes from each
cluster (FIG. 4B). All three are inducible, with nitrogenase
activity showing dynamic ranges of 1,200-fold, 2,300-fold, and
130-fold for the K. oxytoca, P. stutzeri, and A. vinelandii nif
clusters, respectively. When induced, these systems all produce
similar or even higher nitrogenase activities than can be achieved
by the transfer of the unmodified native clusters (FIG. 4D). For
reference, the nitrogenase activities produced by K. oxytoca, P.
stutzeri, and A. vinelandii are shown as dashed lines in FIG. 4D
(top to bottom) (see Methods and Materials). All three inducible
clusters produce similar levels of activity that approach those
measured from wild-type P. stutzeri and A. vinelandii.
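For illustration only, a dynamic range of the kind reported above can
be taken as the ratio of the induced to the uninduced activity. The
sketch below assumes that definition; the function name and the
readings are hypothetical placeholders in arbitrary units and are not
measurements from this disclosure.

```python
def fold_induction(induced: float, uninduced: float) -> float:
    """Return the induced/uninduced ratio (the dynamic range, in fold)."""
    if uninduced <= 0:
        # A zero or negative baseline makes the ratio undefined; in practice
        # a detection limit would be substituted (an assumption, not shown).
        raise ValueError("uninduced activity must be positive")
    return induced / uninduced


# Hypothetical (induced, uninduced) readings, arbitrary units.
readings = {
    "cluster_a": (500.0, 2.0),
    "cluster_b": (90.0, 3.0),
}

dynamic_ranges = {
    name: fold_induction(on, off) for name, (on, off) in readings.items()
}
```

Because the ratio is sensitive to the uninduced value, a small absolute
change in basal activity can shift the reported fold-change
considerably.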
[0195] The native P. stutzeri and A. vinelandii clusters are
strongly repressed by ammonium: the presence of 17.1 mM eliminates
activity or reduces it 7-fold, respectively (FIG. 4E and FIGS.
8A-8B). The inducible clusters show little reduction in activity
and the inducible A. vinelandii cluster exhibits almost no ammonia
repression. While the native K. oxytoca cluster in P. protegens
Pf-5 generates a constitutive response, there is still some
repression, which is reduced by the inducible version.
[0196] The inducible nif clusters were tested for oxygen
sensitivity. Note that wild-type A. vinelandii is able to fix
nitrogen under ambient conditions due to genetic factors internal
and external to the cluster. First, it was established that the
controller in P. protegens Pf-5 could induce transcription from the
three nifH promoters in the presence of oxygen (FIGS. 26A-26B). The
tolerance of nitrogenase to oxygen was then assessed as a function
of the concentration of oxygen in the headspace, as described for
A. caulinodans (previous section). The native and inducible
clusters exhibited the same oxygen response (FIG. 4F). The nif
cluster from K. oxytoca was the most sensitive, generating the
highest activity under anaerobic conditions, but this is quickly
abolished in the presence of O.sub.2. In contrast, the nif clusters from
P. stutzeri and A. vinelandii showed wider tolerance with optima at
1% and 0.5%, respectively. However, both clusters lose activity at
lower oxygen concentrations than A. caulinodans.
[0197] To explore the impact of the electron transport chains,
several mutants to the A. vinelandii cluster were made (FIG. 27).
The A. vinelandii cluster contains two potential electron transport
systems to nitrogenase and the redundant system may help maintain
redox status for nitrogenase at various oxygen levels. The
dependence of nitrogenase activity on the oxygen concentration in
various mutant backgrounds was re-measured. No effect was seen by
adding the rnf2 operon or deleting the fix operon; however, deleting
rnf1 eliminated activity. This suggests that the rnf1 operon is the
sole source of electrons in P. protegens Pf-5 under these
conditions and that the Fix complex cannot compensate for the Rnf
complex, unlike in A. vinelandii.
Control of Nitrogen Fixation with Agriculturally-Relevant
Sensors
[0198] The careful design and characterization of the controller
has the benefit of simplifying the process by which different
synthetic sensors can be used to induce nitrogenase expression. By
knowing the dynamic range required to go from inactive to active
nitrogenase, one can quantitatively select sensors that produce a
compatible response. This allows different environmental
signals--or combinations of signals using genetic logic
circuits--to be used to control expression. To demonstrate this, 11
synthetic sensors were selected that respond to a variety of
chemical signals of relevance to the rhizosphere, and it was
demonstrated that these can be used to create inducible nitrogenase
in, for example, engineered strains of E. coli (carrying the refactored
v2.1 nif), R. sp. IRBG74 (carrying the refactored v3.2 nif), P.
protegens Pf-5 (carrying the inducible A. vinelandii nif), and A.
caulinodans (inducible nifA/rpoN) (FIGS. 5A-5D).
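The quantitative selection of sensors described above can be sketched
as follows. This is a minimal illustration under stated assumptions:
the Hill-type response function is a common modeling choice and is not
specified in this section, and the sensor names, parameter values, and
thresholds are hypothetical placeholders rather than data from this
disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sensor:
    name: str
    y_min: float  # output with no inducer (arbitrary units)
    y_max: float  # output at saturating inducer
    k: float      # inducer concentration giving half-maximal response
    n: float      # Hill coefficient (steepness)

    def output(self, inducer: float) -> float:
        """Assumed Hill-type response at a given inducer concentration."""
        if inducer < 0:
            raise ValueError("inducer concentration cannot be negative")
        fraction = inducer**self.n / (self.k**self.n + inducer**self.n)
        return self.y_min + (self.y_max - self.y_min) * fraction


def compatible_sensors(
    sensors: list[Sensor], off_below: float, on_above: float
) -> list[str]:
    """Keep sensors whose output range spans the inactive-to-active window."""
    return [
        s.name for s in sensors
        if s.y_min <= off_below and s.y_max >= on_above
    ]


# Hypothetical sensors and thresholds, for illustration only.
candidates = [
    Sensor("sensor_a", y_min=0.01, y_max=5.0, k=10.0, n=2.0),
    Sensor("sensor_b", y_min=0.50, y_max=6.0, k=5.0, n=1.5),  # too leaky
    Sensor("sensor_c", y_min=0.02, y_max=0.8, k=1.0, n=1.0),  # too weak
]
selected = compatible_sensors(candidates, off_below=0.05, on_above=2.0)
```

Expressing all sensor outputs in the same units is what makes this
comparison possible; a sensor is rejected either because its basal
output exceeds the level at which nitrogenase must remain inactive or
because its maximal output falls short of the level needed for
activity.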
[0199] The roles of the chemical signals in the rhizosphere are
shown in FIG. 5A. Cuminic acid is present in plant seeds and
functions as a fungicide. Natural root exudates may include sugars,
amino acids, organic acids, phenolic compounds, phytohormones, and
flavonoids. These represent potential signals to control
nitrogenase production close to the root surface. Cereals have been
shown to release arabinose, vanillic acid, and salicylic acid. In
addition, salicylic acid regulates the plant innate immune response
and the impact of its exogenous addition to cereals has been
studied. Naringenin is a common precursor for many flavonoids and
improves endophytic root colonization when applied to rice and
wheat. Genistein, a product from naringenin catalyzed by the
isoflavone synthase, is released from maize roots. A quorum sensing
mimic released by rice can regulate the 3OC6HSL receptor protein
LuxR, which has been visualized using E. coli biosensor
strains.
[0200] Bacteria either native to the rhizome or added as biocontrol
agents introduced as a spray inoculant or seed coating produce
chemical signatures. Inoculation of cereals with root colonizing
Pseudomonas strains that produce DAPG elicits protection against
fungal pathogens. Many bacteria produce quorum molecules, such as
N-acyl homoserine lactones, as a means of communication and plants
can respond to these signals. The bacterium Sinorhizobium meliloti
produces 3OC14HSL, which enhances Medicago nodulation and has been
shown to induce systemic resistance in cereals. DHBA can be
produced by root colonizing bacteria to increase iron solubility
and play a role as a chemoattractant for Agrobacterium and
Rhizobium.
[0201] Sensors for these chemicals were constructed based on the
controllers for each species. For E. coli MG1655, a strain that
contains 12 optimized sensors, carried in the genome, that respond
to various small molecules ("Marionette") had been previously
constructed (Meyer, A. J., Segall-Shapiro, T. H., Glassey, E.,
Zhang, J. & Voigt, C. A. J. N. b. Escherichia coli "Marionette"
strains with 12 highly optimized small-molecule sensors. 1
(2018).). The response functions of these sensors were
characterized in standard units, making it simple to identify those
that can be connected to nitrogenase expression without further
tuning. Marionette contains sensors for vanillic acid, DHBA,
cuminic acid, 3OC6HSL, and 3OC14HSL. For each sensor, the output
promoter was transcriptionally fused to T7 RNAP and the response of
the responsive promoter (PT7) was measured as a function of inducer
concentration (FIG. 5B and FIG. 28B). Then, the v2.1 refactored nif
cluster was introduced and nitrogenase activity was measured in the
presence and absence of inducer (FIG. 5C and FIG. 28C). The
inducible systems constructed for P. protegens Pf-5 that respond to
arabinose and naringenin were used to drive NifA expression for the
control of the A. vinelandii nif cluster (FIG. 4A). The induction
of the nifH promoter by these sensors was first confirmed using a
reporter (FIG. 5B). When this is replaced with the nif gene
cluster, it results in an inducible response of nitrogenase
activity (FIG. 5C). The best nitrogenase activity in R. sp. IRBG74
is low; however, herein it was demonstrated that it could be placed
under inducible control. The DAPG-inducible system developed for R.
sp. IRBG74 was connected to the control of T7 RNAP and this
produces a strong response from PT7 (FIG. 5B). However, when used
to drive the expression of the v3.2 refactored pathway, only a
9-fold induction is observed, consistent with the low nitrogenase
activity observed in this strain (FIG. 5C). Finally, the salicylic
acid sensor designed for Rhizobium was used to control NifA
(L94Q/D95Q)/RpoN expression in A. caulinodans (FIG. 3A and FIG.
5B). This yielded a 1000-fold dynamic range of nitrogenase activity
(FIG. 5C).
[0202] Plants could be engineered to release an orthogonal chemical
signal that could then be sensed by a corresponding engineered
bacterium. This would have the benefit of only inducing nitrogenase
in the presence of the engineered crop. Further, if the molecule is
metabolizable by the engineered bacterium, it could serve as a
mechanism around which a synthetic symbiosis could be designed,
where the plant provides the carbon and the bacterium fixed
nitrogen in an engineered relationship. To this end, legumes and
Arabidopsis have been engineered to produce opines, including
nopaline and octopine. Sensors were constructed for these two
opines for A. caulinodans based on the LysR-type transcriptional
activators OccR (octopine) and NocR (nopaline) and their
corresponding P.sub.occ and P.sub.noc promoters (FIG. 5D and FIG.
21). These sensors were connected to the expression of
NifA(L94Q/D95Q)/RpoN and the response from P.sub.nifH was measured
using a fluorescent reporter. Both response functions had a large
dynamic range (FIG. 5B) and produced highly-inducible nitrogenase
activity (FIG. 5C). The nopaline sensor yielded a 412-fold dynamic
range and the octopine sensor led to 40% higher nitrogenase
activity than the wild-type.
Discussion
[0203] Towards designing a bacterium that can deliver fixed
nitrogen to a cereal crop, this work provides a side-by-side
comparison of diverse species, natural nif clusters, and
engineering strategies that can be used to obtain inducible
nitrogenase activity in a strain that can associate with cereals as
an endophyte or epiphyte. To this end, .about.100 strains involving
the transfer of 10 natural nif clusters ranging in size from 10 kb
to 64 kb to 16 diverse species of Rhizobia, Azorhizobium,
Pseudomonas, and E. coli were constructed. Different approaches were
taken to make these nif clusters inducible, from bioinformatics and
protein engineering to complete genetic reconstruction from the
ground-up (refactoring). In addition to the highest activity, it is
important that nitrogen fixation be robust to the addition of
nitrogenous fertilizer (ammonia) and microaerobic environments. For
example, an endophyte such as a variant of Azorhizobium where nifA
is knocked out of the genome and a nifA mutant and rpoN are
complemented on a plasmid can be used to obtain high nitrogenase
activities. For an epiphyte, P. protegens Pf-5 is a versatile
strain based on the transfer of the A. vinelandii nif cluster and
placement of nifA of P. stutzeri under inducible control. In both
such cases, nitrogenase activities were obtained that are nearly
identical to wild-type A. caulinodans and P. stutzeri,
respectively. Neither showed significant repression by ammonia and
optimal activity was obtained in 1% oxygen. Based on these strains,
it was demonstrated that nitrogenase can be placed under inducible
control in response to cereal root exudates (arabinose, salicylic
acid), phytohormones (naringenin) and putative signaling molecules
that could be released by genetically modified plants (e.g., plants
that can express or exude nopaline or octopine).
[0204] Because R. sp. IRBG74 can fix nitrogen in a legume nodule
and also associates with rice, significant effort was directed to
engineering this strain to fix nitrogen when cereal-associated. The
first attempt was simply complementing nifV, as this is absent in
R. sp. IRBG74 and produces a metabolite provided by the plant, but
this attempt was unsuccessful. Then, it was found that all of the
initial nif clusters transferred, some of which have high activity
in P. protegens Pf-5 and E. coli, are non-functional in R. sp.
IRBG74, which led to trying clusters from alphaproteobacteria, one
of which produced a very low level of activity that was dependent
on the nif genes native to R. sp. IRBG74. The previously-published
refactored gene clusters based on Klebsiella nif were attempted in
R. sp. IRBG74 but these showed no activity. It was only after the
construction of a new refactored cluster (v3.2) that activity was
obtained under free-living conditions that was not dependent on the
native nif genes. This allowed an increase in the expression
levels, and an optimum was discovered beyond which activity was
lost. This is the first time that nif activity has been engineered
in a Rhizobium under free-living conditions that could otherwise
not perform this function.
[0205] The present disclosure encompasses different degrees of nif
pathway re-engineering to promote heterologous transfer. The most
ambitious is the complete refactoring of all the nif genes and
regulation, where all regulatory genetic parts are replaced, genes
are recoded, operons are reorganized, and transcription is
performed by the orthogonal T7 RNAP. Initially, the evaluation of
performance relied on the overall nitrogenase activity, rather than
an understanding of the underlying parts. As such, the first
refactored pathway performed poorly. In subsequent studies, better
part libraries and DNA assembly and automation platforms enabled
the synthesis of many variants. Further, as the cost of RNA-seq
declined, it was used to evaluate the performance of internal
parts, such as promoters and terminators. This revealed that the
first designs were effectively large single operons with little
differential control over the transcription levels of individual
genes. These techniques allowed the tailoring of the function
of the refactored nif pathway and the discovery that many of the
underlying genetic structures were not needed to achieve high
activities.
[0206] Ribosome profiling, a new technique that enables the
measurement of translational parts (e.g., ribosome binding sites),
was applied and expression levels were inferred. Further,
nitrogenase activity and the function of underlying parts were
assessed as the clusters were moved between species. Interestingly,
the native Klebsiella nif cluster could be transferred and it
performed similarly but the refactored cluster yielded widely
varying expression levels in the different hosts, sometimes leading
to a total loss in activity. This could be recovered by maintaining
the native operon structure in the refactored cluster, implying
that it was not due to the synthetic sensors, T7 RNAP, or
promoters/terminators. This is one of the hypothesized functions of
operons. Achieving this required maintenance of the codon usage and
translational coupling of the native cluster. However, this does
not mean that it will not be possible to also encode this function
synthetically. There have been computational advances that enable
the calculation of RBSs internal to upstream genes when encoded on
an operon. If coupled with codon optimization algorithms, this
would allow the design of de novo genetic parts that achieve a
desired degree of translational coupling and expression level.
[0207] The present disclosure demonstrates the deregulation of nif
clusters in A. caulinodans and P. protegens Pf-5, enabling them to
be placed under the control of cereal root exudates. This
derepresses the pathway in the presence of exogenous nitrogenous
fertilizer--critical for the use of the bacterium as part of an
integrated agricultural solution. Further, these organisms retain
the ability to fix nitrogen in microaerobic environments, thus
avoiding the need for a root nodule that enforces strict
anaerobiosis. The complete deregulation of the nif pathway makes
the bacterium non-competitive in the soil so that it is lost quickly, thus
limiting its impact to particular phases of the growth cycle. Thus,
it is demonstrated that nitrogenase can be placed under the control
of chemical root exudates.
EMBODIMENTS
[0208] 1. A rhizobium that can fix nitrogen under aerobic
free-living conditions, comprising a symbiotic rhizobium having an
exogenous nif cluster, wherein the exogenous nif cluster confers
nitrogen fixation capability on the symbiotic rhizobium under
aerobic free-living conditions, and wherein the rhizobium is not
Azorhizobium caulinodans.
[0209] 2. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is from a free-living diazotroph.
[0210] 3. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is from a symbiotic diazotroph.
[0211] 4. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is from a photosynthetic Alphaproteobacteria.
[0212] 5. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is from a Gammaproteobacteria.
[0213] 6. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is from a cyanobacteria.
[0214] 7. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is from a firmicutes.
[0215] 8. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is from Rhodobacter sphaeroides.
[0216] 9. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is from Rhodopseudomonas palustris.
[0217] 10. The rhizobium of paragraph 1, wherein the exogenous nif
cluster is an inducible refactored nif cluster.
[0218] 11. The rhizobium of paragraph 10, wherein the inducible
refactored nif cluster is an inducible refactored Klebsiella nif
cluster.
[0219] 12. The rhizobium of any one of the preceding paragraphs,
wherein the rhizobium is IRBG74.
[0220] 13. The rhizobium of any one of the preceding paragraphs,
wherein the exogenous nif cluster comprises 6 nif genes.
[0221] 14. The rhizobium of paragraph 13, wherein the 6 nif genes
are nifHDK(T)Y, nifEN(X), nifJ, nifBQ, nifF, and nifUSVWZM.
[0222] 15. The rhizobium of paragraphs 13 or 14, wherein each nif
gene of the exogenous nif cluster is preceded by a T7 promoter.
[0223] 16. The rhizobium of paragraph 15, wherein the T7 promoter
is a wild-type promoter.
[0224] 17. The rhizobium of any one of the preceding paragraphs,
further comprising an endogenous nif cluster.
[0225] 18. The rhizobium of any one of the preceding paragraphs,
wherein the nif cluster has a nifV gene.
[0226] 19. The rhizobium of paragraph 18, wherein the nifV gene is
endogenous.
[0227] 20. The rhizobium of any one of the preceding paragraphs,
wherein the exogenous nif cluster further comprises a
terminator.
[0228] 21. The rhizobium of any one of paragraphs 15-20, wherein
the T7 promoter has a terminator and wherein the terminator is
downstream from the T7 promoter.
[0229] 22. The rhizobium of paragraph 12, wherein the exogenous nif
cluster is a refactored rhizobium IRBG74 nif cluster.
[0230] 23. A plant growth promoting bacterium that can fix nitrogen
under aerobic free-living conditions, comprising a bacterium having
an exogenous nif cluster having at least one inducible promoter,
wherein the exogenous nif cluster confers nitrogen fixation
capability on the bacterium, under aerobic free-living conditions,
and wherein the bacterium is not Azorhizobium caulinodans.
[0231] 24. The plant growth promoting bacterium of paragraph 23,
wherein the bacterium is a symbiotic bacterium.
[0232] 25. The plant growth promoting bacterium of paragraph 23,
wherein the bacterium is an endophyte.
[0233] 26. The plant growth promoting bacterium of paragraph 25,
wherein the endophyte is rhizobium IRBG74.
[0234] 27. The plant growth promoting bacterium of paragraph 23,
wherein the bacterium is an epiphyte.
[0235] 28. The plant growth promoting bacterium of paragraph 27,
wherein the epiphyte is Pseudomonas protegens Pf-5.
[0236] 29. The plant growth promoting bacterium of any one of
paragraphs 23-28, wherein the plant growth promoting bacterium is
associated with a genetically modified cereal plant.
[0237] 30. The plant growth promoting bacterium of paragraph 29,
wherein the genetically modified cereal plant includes an exogenous
gene encoding a chemical signal.
[0238] 31. The plant growth promoting bacterium of paragraph 29,
wherein the nitrogen fixation is under the control of the chemical
signal.
[0239] 32. The plant growth promoting bacterium of paragraphs 30 or
31, wherein the chemical signal is opine, phloroglucinol or
rhizopine.
[0240] 33. The rhizobium of any one of paragraphs 23-32, wherein
the exogenous nif cluster comprises 6 nif genes.
[0241] 34. The rhizobium of paragraph 33, wherein the 6 nif genes
are nifHDK(T)Y, nifEN(X), nifJ, nifBQ, nifF, and nifUSVWZM.
[0242] 35. The rhizobium of any one of paragraphs 23-34, wherein
the inducible promoter is a T7 promoter.
[0243] 36. The rhizobium of any one of paragraphs 23-34, wherein
the inducible promoter is P.sub.A1lacO1 promoter.
[0244] 37. The rhizobium of any one of paragraphs 23-36, wherein
the inducible promoter is activated by an agent selected from a
group that includes IPTG, sodium salicylate, octopine, nopaline,
the quorum signal 3OC6HSL, aTc, cuminic acid, DAPG, and salicylic
acid.
[0245] 38. The rhizobium of any one of paragraphs 23-37, wherein
the exogenous nif cluster further comprises a terminator.
[0246] 39. The rhizobium of any one of paragraphs 23-37, wherein
the inducible promoter has a terminator and wherein the terminator
is downstream from the inducible promoter.
[0247] 40. An Azorhizobium caulinodans capable of inducible
ammonium-independent nitrogen fixation in a cereal crop,
comprising:
[0248] (i) a modified nif cluster, wherein an endogenous nifA gene
is deleted or altered; and
[0249] (ii) at least one operon comprising nifA and RNA polymerase
sigma factor (RpoN), wherein the operon comprises a regulatory
element including an inducible promoter.
[0250] 41. The Azorhizobium caulinodans of paragraph 40, wherein the
inducible promoter is P.sub.A1lacO1 promoter.
[0251] 42. The Azorhizobium caulinodans of paragraphs 40 or 41,
wherein the inducible promoter is activated by an agent selected
from IPTG, sodium salicylate, octopine, nopaline, the quorum signal
3OC6HSL, aTc, cuminic acid, DAPG, and salicylic acid.
[0252] 43. The Azorhizobium caulinodans of any one of paragraphs
40-42, wherein the endogenous nifA gene is altered with at least
one of the following substitutions:
[0253] (i) L94Q;
[0254] (ii) D95Q; and
[0255] (iii) both L94Q and D95Q.
[0256] 44. A method of engineering a rhizobium that can fix
nitrogen under aerobic free-living conditions, comprising
transferring an exogenous nif cluster to a symbiotic rhizobium,
wherein the exogenous nif cluster confers nitrogen fixation
capability on the symbiotic rhizobium, under aerobic free-living
conditions, and wherein the rhizobium is not Azorhizobium
caulinodans.
[0257] 45. The method of paragraph 44, wherein the exogenous nif
cluster comprises 6 nif genes.
[0258] 46. The method of paragraph 45, wherein the 6 nif genes are
nifHDK(T)Y, nifEN(X), nifJ, nifBQ, nifF, and nifUSVWZM.
[0259] 47. The method of paragraph 45 or 46, wherein each of the
nif genes is preceded by a wild-type T7 promoter.
[0260] 48. The method of any one of paragraphs 44-47, wherein the
exogenous nif cluster is transferred to the rhizobium in a
plasmid.
[0261] 49. The method of any one of paragraphs 44-48, wherein the
exogenous nif cluster further comprises a terminator.
[0262] 50. The method of any one of paragraphs 47-49, wherein the
wild-type T7 promoter has a terminator, and wherein the terminator
is downstream from the wild-type T7 promoter.
[0263] 51. The method of any one of paragraphs 44-50, wherein the
endogenous nifL gene is deleted.
[0264] 52. A method of producing nitrogen for consumption by a
cereal plant, comprising providing a plant growth promoting
bacterium that can fix nitrogen under aerobic free-living
conditions in proximity of the cereal plant, wherein the plant
growth promoting bacterium is a symbiotic bacterium having an
exogenous nif cluster, wherein the exogenous nif cluster confers
nitrogen fixation capability on the symbiotic bacterium, enabling
nitrogen fixation under aerobic free-living conditions.
[0265] 53. The method of paragraph 52, wherein the plant growth
promoting bacterium is a rhizobium.
[0266] 54. The method of paragraph 52, wherein the plant growth
promoting bacterium is the bacterium of any one of paragraphs 1-22 and
23-39.
[0267] 55. The method of any one of paragraphs 52-54, wherein the
cereal plant is a genetically modified cereal plant.
[0268] 56. The method of paragraph 55, wherein the genetically
modified cereal plant includes an exogenous gene encoding a
chemical signal.
[0269] 57. The method of paragraph 56, wherein the nitrogen
fixation is under the control of the chemical signal.
[0270] 58. The method of paragraph 56 or 57, wherein the chemical
signal is opine, phloroglucinol or rhizopine.
[0271] 59. The method of any one of paragraphs 52-55, wherein the
nitrogen fixation is under the control of a chemical signal.
[0272] 60. The method of paragraph 57 or 59, wherein the chemical
signal is a root exudate, biocontrol agent or phytohormone.
[0273] 61. The method of paragraph 60, wherein the root exudate is
selected from the group consisting of sugars, hormones, flavonoids,
and antimicrobials.
[0274] 62. The method of paragraph 57 or 59, wherein the chemical
signal is vanillate.
[0275] 63. The method of paragraph 57 or 59, wherein the chemical
signal is IPTG, aTc, cuminic acid, DAPG, salicylic acid,
3,4-dihydroxybenzoic acid, 3OC6HSL or 3OC14HSL.
[0276] All of the features disclosed in this specification may be
combined in any combination. Each feature disclosed in this
specification may be replaced by an alternative feature serving the
same, equivalent, or similar purpose. Thus, unless expressly stated
otherwise, each feature disclosed is only an example of a generic
series of equivalent or similar features. From the above
description, one skilled in the art can easily ascertain the
essential characteristics of the present invention, and without
departing from the spirit and scope thereof, can make various
changes and modifications of the invention to adapt it to various
usages and conditions. Thus, other embodiments are also within the
claims.
EQUIVALENTS
[0277] While several inventive embodiments have been described and
illustrated herein, those of ordinary skill in the art will readily
envision a variety of other means and/or structures for performing
the function and/or obtaining the results and/or one or more of the
advantages described herein, and each of such variations and/or
modifications is deemed to be within the scope of the inventive
embodiments described herein. More generally, those skilled in the
art will readily appreciate that all parameters, dimensions,
materials, and configurations described herein are meant to be
exemplary and that the actual parameters, dimensions, materials,
and/or configurations will depend upon the specific application or
applications for which the inventive teachings is/are used. Those
skilled in the art will recognize, or be able to ascertain using no
more than routine experimentation, many equivalents to the specific
inventive embodiments described herein. It is, therefore, to be
understood that the foregoing embodiments are presented by way of
example only and that, within the scope of the appended claims and
equivalents thereto, inventive embodiments may be practiced
otherwise than as specifically described and claimed. Inventive
embodiments of the present disclosure are directed to each
individual feature, system, article, material, kit, and/or method
described herein. In addition, any combination of two or more such
features, systems, articles, materials, kits, and/or methods, if
such features, systems, articles, materials, kits, and/or methods
are not mutually inconsistent, is included within the inventive
scope of the present disclosure.
[0278] All definitions, as defined and used herein, should be
understood to control over dictionary definitions, definitions in
documents incorporated by reference, and/or ordinary meanings of
the defined terms.
[0279] All references, patents and patent applications disclosed
herein are incorporated by reference with respect to the subject
matter for which each is cited, which in some cases may encompass
the entirety of the document.
[0280] The indefinite articles "a" and "an," as used herein in the
specification and in the claims, unless clearly indicated to the
contrary, should be understood to mean "at least one."
[0281] The phrase "and/or," as used herein in the specification and
in the claims, should be understood to mean "either or both" of the
elements so conjoined, i.e., elements that are conjunctively
present in some cases and disjunctively present in other cases.
Multiple elements listed with "and/or" should be construed in the
same fashion, i.e., "one or more" of the elements so conjoined.
Other elements may optionally be present other than the elements
specifically identified by the "and/or" clause, whether related or
unrelated to those elements specifically identified. Thus, as a
non-limiting example, a reference to "A and/or B", when used in
conjunction with open-ended language such as "comprising" can
refer, in one embodiment, to A only (optionally including elements
other than B); in another embodiment, to B only (optionally
including elements other than A); in yet another embodiment, to
both A and B (optionally including other elements); etc.
[0282] As used herein in the specification and in the claims, "or"
should be understood to have the same meaning as "and/or" as
defined above. For example, when separating items in a list, "or"
or "and/or" shall be interpreted as being inclusive, i.e., the
inclusion of at least one, but also including more than one, of a
number or list of elements, and, optionally, additional unlisted
items. Only terms clearly indicated to the contrary, such as "only
one of" or "exactly one of," or, when used in the claims,
"consisting of," will refer to the inclusion of exactly one element
of a number or list of elements. In general, the term "or" as used
herein shall only be interpreted as indicating exclusive
alternatives (i.e. "one or the other but not both") when preceded
by terms of exclusivity, such as "either," "one of," "only one of,"
or "exactly one of." "Consisting essentially of," when used in the
claims, shall have its ordinary meaning as used in the field of
patent law.
[0283] As used herein in the specification and in the claims, the
phrase "at least one," in reference to a list of one or more
elements, should be understood to mean at least one element
selected from any one or more of the elements in the list of
elements, but not necessarily including at least one of each and
every element specifically listed within the list of elements and
not excluding any combinations of elements in the list of elements.
This definition also allows that elements may optionally be present
other than the elements specifically identified within the list of
elements to which the phrase "at least one" refers, whether related
or unrelated to those elements specifically identified. Thus, as a
non-limiting example, "at least one of A and B" (or, equivalently,
"at least one of A or B," or, equivalently "at least one of A
and/or B") can refer, in one embodiment, to at least one,
optionally including more than one, A, with no B present (and
optionally including elements other than B); in another embodiment,
to at least one, optionally including more than one, B, with no A
present (and optionally including elements other than A); in yet
another embodiment, to at least one, optionally including more than
one, A, and at least one, optionally including more than one, B
(and optionally including other elements); etc.
[0284] It should also be understood that, unless clearly indicated
to the contrary, in any methods claimed herein that include more
than one step or act, the order of the steps or acts of the method
is not necessarily limited to the order in which the steps or acts
of the method are recited.
ADDITIONAL TABLES
TABLE-US-00002 [0285] TABLE 2 Primers used for nif cluster cloning.
Nif cluster Forward primer (SEQ ID NOs: 8-64) Reverse Primer (SEQ
ID NOs: 65-121) Genomic location GenBank accession No. Klebsiella
oxytoca CGTAGGGCGCATTAATGCAGCTGGCACGA GTGACGCTCGCGTATCAGGTTTG
3,897,443-3,909,294 CP020657.1 M5al CAGGTGAATTC
TAGACTGCTGGATACGCTGCTTAAGGTC TACGCTGTTTGAGCTGGCAAACCT
ATCAGGCGCATATTTGAATGTATTTACTGCA 3,909,255-3,920,878 CP020657.1
GCGGCCGCTTCTAG AGTGACCAAAAGCTTCCGCAACCC Pseudomonas
GCCCGGAGAGCAAGCCCGTAGGGCGCATT ACTACGCATCACTAGCAGGGCACGCACCGCG
1,410,207-1,414,229 NC_009434 stutzeri AATGCAGCTGG GACGAAATCGAAGT
A1501 CACGACAGGTGTTAGGTTGGCCTGAATTC GAG GGTGT
GGCTCACTTCGATTTCGTCCGCGGTGCGT TTGTCGACTCCCGGGGTCTGAC
1,419,757-1,424,637 NC_009434 GCCCTGCTAGT GATGCGTA
CGCCTGATTTCGCCTGATGAACAGG GGCTTTAACGGCATGTTCCGGGT
1,424,588-1,429,971 NC_009434 TGACGCTGTTGACCACCGCC
GTAGTCGTCGTTGTGGCCGAACTC 1,429,922-1,434,417 NC_009434
ATGGAAGTGGTCGGCACCGGCTA AAAGCATCATCTCGGGTCGGGC 1,434,370-1,438,503
NC_009434 CGCAACGGTTGGGGTAGGTTGG CGTCGAGCGACAACGCCTCGA
1,438,454-1,442,613 NC_009434 GACGTCCATCGCTTCGGCTTCGA
CTATGAGCTGGACTGAACCGCGATG 1,442,565-1,448,340 NC_009434
CTGCGAAATCGACGCTGTCGAGCATCATC GAAAATACCGCATCAGGCGCATATTTGAATG
1,448,291-1,459,252 NC_009434 GCGGTTCA TATTTACTGCAGCG
GCCGCTGGCGAATCTCCTTCCTCGGTTCG Azotobacter
ATCCATTCTCAGGCTGTCTCGTCTCGTCT GCCTTCGAACATGTTGTCCCAG
134,732-144,115 NC_012560 vinelandii CTACGTACGCG DJ
GATCCCAGGCAACGTCTTCGTACTGCGGT ACCGGGTTGCG GGGGCAGCCAGTGGAAAAAGG
CTACGGCACGCCCTGGTTCGA TCGAGTTCGAGCAGTTTCTCCAGC 144,076-148,534
NC_012560 GCTCGGAAAGTGCTGGAGAAAC AGCGAACAATACCTGTGGCC
148,500-152,895 NC_012560 AAATCAGACATTCATGGCCACAGG
TGGCGCTTGCCCTTGTTCCAA 152,861-157,152 NC_012560
TCTACCATGGCGTGACTCTCGG GCGCGGTGGTAGAGTTCCGGGAGTTTAAACG
157,101-162,181 NC_012560 GACAGAAGACGAGT CGTGCGGGC
ACTCGTCTTCTGTCCGTTTAAACTCCCGG TTGCTCAGGGTCGGGTTGGC
5,161,399-5,168,611 NC_012560 AACTCTACCAC CGC CTTGGATAGACGAGGCACAGC
CATCATCCTCGGCCCCTTCAGGTTGCAGGAG 5,168,561-5,175,635 NC_012560
CCGGCTTG GCCGGCTCCTGCAACCTGAAGGGGCCGAG GCAAGCCACTCCACTGACGAA
995,860-1,000,698 NC_012560 GATGATG Paenibacillus
GAATTGAGGATAAATGTCAGGGATTTCAT ACAGGTTCCGCAGTTCACAAGC 23,686-26,413
ALJV01.1 polymyxa G contig00089 WLY78 CCAAGCATTTTGAGATCGCGGATG
GCTGATTGTGATCGACAATATTCGG 26,364-27,763 ALJV01.1 contig00089
CGGAGGTGCCGGTATGAGCGA GAAAGCCTACACGAAGCAAAGG 27,714-29,113 ALJV01.1
contig00089 GAAGTTTGCAGCGAAAGAGGCG CTTGAGAATCTGCCGGGCGCCT
29,064-30,463 ALJV01.1 contig00089 GGGATGATGCAGAATACATCCCG
ATCCACAAATCAACACCCTGCG 30,414-31,813 ALJV01.1 contig00089
GGTGACCTGGATGATGCAGAGGAGAG AAAGCGTTCCAGTCACGGTCAC 31,764-34,402
ALJV01.1 contig00089 Cyanothece GGCCCGCGTTAGGTTGGCCTGAATTCGGT
GAGACTTTCCCCACCTTATTATGCATGCAGA 1,931,343-1,929,132 NC_010546.1
ATCC51142 GTGTATCCCCC TGTTATGGGAATTA ACG
GGAGATACGTAAAAAAAAAAACCCCGCCC TGTCAGGGGCG
GGGTTTTTTTTTGATAAGTCAAGCTATCA GAACCGATC
TAATTCCCATAACATCTGCATGCATAATA ACCTTGACAATCATTACACAGCG
555,364-562,941 NC_010546.1 AGGTGGGGAAA GTCTCAGC
AATGTATTTCTGATCGATGCGACG CAAATATAATGATCGACATTTTCACCAC
562,897-570,603 NC_010546.1 GTTATCTGGCTGATGTTTGTGGTG
CGTTAACTTTGTCGCAAAACTTCG 570,558-577,494 NC_010546.1
GTCAAACTGTCTTGTTTAAAGCCG ACCAAGGCGAATCTCCTTCCTCGGTTCGCGA
577,449-584,687 NC_010546.1 TCACGCTACTCCGC
CAATAAAAAAGCCCCCGGAATGATCTTCCGG GGGCCAGATTCAGG TAACTGCTCAAG
Azospirillum TTAAGGTCATGCAGCAGGAGAACTAAAGG TGCGTCTTCTTCGGGCATCGTCA
1,043,795-1,035,568 CP012914 brasilense CCCGCGTTAGG Sp7
TTGGTAATAAAAAAGCCCCCGGAATGATC TTCCGGGGGCC CTGCGCAAATACAACATCGAGATC
GACGACTGAATAAGGATCGCGGAATG AGAAAATTGATTGCGGACGAGCG
1,035,614-1,027,483 CP012914 TATGTCACAGGCCCGACAAAGCG
TTCAATAAGTTAAGCAGATCGGCCTCG 1,027,533-1,019,166 CP012914
GATTGTCGGGTATCGCACACGAG CGGTGTTACGAATAAATATTTCTACGAATAG
1,019,211-1,010,628 CP012914 AC CGAAGGAGTTCGCCCCAGTCTATTC
GCTCCAAAAGGAGCCTTTAATTGTATCGGTT 1,010,677-1,003,838 CP012914
TATCAGCTTGCTTT GTTCCGCGGGTCTCGATACAACG Rhodopseudomonas
AATACGATCGCATGTCCTAGGTAATACGA GGTCTTGCGGATCATCACTTTC
5,215,514-5,207,699 NC_005296.1 palustris CTCACTATAGG CGA009
GAGAGGTAATCAGTGGTGGATTTGATGT CCAAGCAAAGGACCACCCTC
GACGGTCAGGTGGTCCGAAC 5,207,743-5,201,639 NC_005296.1
AGCTTCGATATCATCCGCTGAT GGTGAGAATGATCATGATCGGCC 5,201,687-5,196,113
NC_005296.1 TTGTTCATGTCGGACCTAACCGA CTCCAAAAGGAGCCTTTAATTGTATCGGTTT
5,196,162-5,187,847 NC_005296.1 ATCAGCTTGCTTTG
ACGACAAGTGGAGAAGGGATAG Rhodobacter CAATACGATCGCATGTCCTAGGTAATACG
TCCCATGGTCATGTCCTTTGCG 2,285,634-2,279,216 NC_007493 sphaeroides
ACTCACTATAG 2.4.1. GGAGATGCATTTCACGCTTCGCGATTC CCGCCTTCACCAGAGACACC
GTGCGCTTTTCCACGAGGAGC 2,279,260-2,271,404 NC_007493
ATCGAGAAGTTCTACGATGCCGT AATTGAAAAAAAAAACCCCGCCCTGTCAGGG
2,271,450-2,264,419 NC_007493 GCGGGGTTTTTTTT TGCAGCGCCCATTCCGTCTTC
GCAAAAAAAAACCCCGCCCCTGACAGGGC GCTCCAAAAGGAGCCTTTAATTGTATCGGTT
245,956-252,936 NC_007494 GGGGTTTTTTT TATCAGCTTGCTTT
TTTCAATTGGACCTGGATGGGCAGCAAG GGAGAAAGCCTGCGCGGCTAG Azorhizobium
CTCGCATCCATTCTCAGGCTGTCTCGTCT GCCCCCGGAAGGTGATCTTCCGGGGGCTTTC
5,290,244-5,293,483 NC_009937 caulinodans CGTCTCTCTAG TCATGCGTTGA
ORS571 AGTCGGAGCTCTTGGGGCCTCTAAACGGG CAGCCTTGAGATAGATCAAGTGC
TCTTGAGGGGT TTTTTGTTGTCTTCGACGCGAAGCTC
ATAGGCAATACGATCGCATGTCCGTTTAA CTGATCCAGGCCTTCATCGG
1,183,854-1,175,614 NC_009937 ACTGATAAGGA CGGCACTGGCTGG
CGATGCCGTCCAGCACCTC GACATGTCTGGTCTCCTTGGAAC 1,175,653-1,170,712
NC_009937 CTGCCACGGTTCCCAAGGTTC TTCTGGAATTTGGTACCGAGTCAGTAACGTG
1,179,751-1,162,529 NC_009937 CCACAGCCTCG
TAAAAAAGCGGCTAACCACGCCGCTTTTT ATCAGGCGCATATTTGAATGTATTTACTGCA
3,922,323-3,919,341 NC_009937 TTACGTCTGCA GCGGCCGCTAC
GTGTTGTCGAAGCTTGATGCGC GTACTTGTGGGGTCAGTTCCGGCTGGGGGTT CAGCAGCCACC
TGCAGTTAATTAAGGCGCTCCTTTCCTGATT CG CGCTGCTTAAGGTCATGCAGCAGGAGAAC
GCTGCTGTGTGGAGAGATCG 3,930,607-3,934,260 NC_009937 TAAAGGCCCGC
TCTGCGAAAGGAATAGCGTC CTATCGCCGCCACCTGACC GTCGGTGAGATTGATCATGGCC
3,934,220-3,937,923 NC_009937 CGTCAGAACGGCTCTGACGCATCAGGGAG
TGCATGTCCGTTCCTCGCTG 3,937,871-3,941,205 NC_009937 A
AGTAATATTGCGGATCGGCCAGCAGCGAG ACATGTCTTGAATTCCTTCGAACC
3,941,164-3,959,444 NC_009937 GAA GGTGGTCATTGGCAACGGTTCGAAG
TGCATTGCGTTCGCTCCC 3,959,405-3,962,598 NC_009937
TCCCCAAGAGCCCAACCGTTCCGGGAGCG TGTCAGGGCAGGCAGGGCC
3,962,559-3,966,562 NC_009937 AA Gluconacetobacter
TTAAGGTCATGCAGCAGGAGAACTAAAGG TCACCAGCCGTATCCGGAATATGTCAGGATC
1,759,465-1,754,718 CP001189 diazotrophicus CCCGCGTTAGG ATGACATCCC
PA1 5 TTGGTAATAAAAAAGCCCCCGGAATGATC TTCCGGGGGCC GATCGAGGAAATCGACGTG
ATATTCCGGATACGGCTGGTGAGGTGGA ACGATTTCCATGCCCAGGTC
1,754,739-1,746,565 CP001189 CGCCACGTCGTCAATGCCTATAAC
CCTCCAGCACCTCTTCGATG 1,746,608-1,738,322 CP001189
TGACCACCGTGCAGAAGATCC GCTCCAAAAGGAGCCTTTAATTGTATCGGTT
1,738,366-1,730,601 CP001189 TATCAGCTTGC
TTTGGGCAATACCTGAGACGTTTCA
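The genomic locations in Table 2 suggest that consecutive amplicons of a cluster share a short stretch of sequence at each junction, which is the kind of homology that overlap-based assembly relies on. Table 2 itself does not name an assembly method, so the following is only a sketch of how the shared length can be checked from the listed coordinates. It uses the seven P. stutzeri A1501 fragments from 1,419,757 onward (NC_009434); the function name `junction_overlaps` is hypothetical and coordinates are assumed to be inclusive.

```python
# Coordinates copied from Table 2 for seven consecutive P. stutzeri A1501
# amplicons (NC_009434), in the order listed. Treating the coordinates as
# inclusive is an assumption, not something the table states.
FRAGMENTS = [
    (1_419_757, 1_424_637),
    (1_424_588, 1_429_971),
    (1_429_922, 1_434_417),
    (1_434_370, 1_438_503),
    (1_438_454, 1_442_613),
    (1_442_565, 1_448_340),
    (1_448_291, 1_459_252),
]


def junction_overlaps(fragments):
    """Return the number of bases shared by each pair of consecutive fragments."""
    shared_lengths = []
    for (start_a, end_a), (start_b, end_b) in zip(fragments, fragments[1:]):
        # Length of the intersection of two inclusive intervals (0 if disjoint).
        shared = min(end_a, end_b) - max(start_a, start_b) + 1
        shared_lengths.append(max(shared, 0))
    return shared_lengths


if __name__ == "__main__":
    for index, length in enumerate(junction_overlaps(FRAGMENTS), start=1):
        print(f"fragment {index} / fragment {index + 1}: {length} bp shared")
```

The same check can be run on any other cluster in Table 2 by swapping in its coordinate pairs; clusters listed with descending coordinates would need each pair ordered as (lower, higher) first.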
TABLE-US-00003 TABLE 3 Strains used in this study
Name | Strain | Source | Description
MR1 | E. coli DH10-beta | NEB Cat# C3019 |
MR2 | E. coli K-12 MG1655 | Voigt lab |
MR3 | Klebsiella oxytoca M5al | Voigt lab |
MR4 | Pseudomonas stutzeri A1501 | Poole lab |
MR5 | Azotobacter vinelandii DJ | Peters lab |
MR6 | Pseudomonas protegens Pf-5 | ATCC BAA-477 |
MR7 | P. protegens Pf-5 controller (P.sub.tac-T7RNAP) | This study | generated by pMR86
MR8 | P. protegens Pf-5 controller v1 (P.sub.tac-nifA) | This study | generated by pMR97
MR9 | P. protegens Pf-5 controller v2 (P.sub.tac-nifA v2) | This study | generated by pMR98
MR10 | P. protegens Pf-5 controller v3 (P.sub.tac-nifA v3) | This study | generated by pMR99
MR11 | P. protegens Pf-5 controller v4 (P.sub.BAD.10-nifA) | This study | generated by pMR100
MR12 | P. protegens Pf-5 controller v5 (P.sub.Fde-nifA) | This study | generated by pMR101
MR13 | Rhizobium sp. IRBG74 | Ane lab |
MR14 | R. sp. IRBG74 .DELTA.hsdR | This study | generated by pMR44
MR15 | R. sp. IRBG74 .DELTA.recA | This study | generated by pMR47
MR16 | R. sp. IRBG74 .DELTA.nif | This study | generated by pMR45-46. Two nif clusters (227,127-219,579 and 234,635-234,802) were removed.
MR17 | R. sp. IRBG74 .DELTA.hsdR, recA | This study |
MR18 | R. sp. IRBG74 .DELTA.hsdR, .DELTA.nif | This study |
MR19 | R. sp. IRBG74 .DELTA.hsdR .DELTA.nif .DELTA.recA::P.sub.A1lacO1-T7RNAP v1 | This study | generated by pMR82
MR20 | R. sp. IRBG74 .DELTA.hsdR, recA .DELTA.nif | This study |
MR21 | R. sp. IRBG74 .DELTA.hsdR .DELTA.nif .DELTA.recA::P.sub.A1lacO1-T7RNAP v2 | This study | generated by pMR83
MR22 | R. sp. IRBG74 .DELTA.hsdR .DELTA.nif .DELTA.recA::P.sub.A1lacO1-T7RNAP v3 | This study | generated by pMR84
MR23 | R. sp. IRBG74 .DELTA.hsdR .DELTA.nif .DELTA.recA::P.sub.Phl-T7RNAP | This study | generated by pMR85
MR24 | Azorhizobium caulinodans ORS571 | Poole lab |
MR25 | Azorhizobium caulinodans ORS571 .DELTA.nifA | This study | generated by pMR48
MR26 | R. spp NGR234 | Poole lab |
MR27 | R. leguminosarum bv. Trifolii WSM1325 | Poole lab |
MR28 | Sinorhizobium medicae WSM419 | Poole lab |
MR29 | R. leguminosarum 8002 | Poole lab |
MR30 | Sinorhizobium meliloti WSM1022 | Poole lab |
MR31 | R. leguminosarum A34 | Poole lab |
MR32 | Sinorhizobium fredii HH103 | Poole lab |
MR33 | Sinorhizobium meliloti 1021 | Poole lab |
MR34 | R. tropici CIAT899 | Poole lab |
MR35 | R. leguminosarum viciae 3841 | Poole lab |
MR36 | R. etli CFN42 | Poole lab |
MR37 | Agrobacterium tumefaciens C58 | Poole lab |
TABLE-US-00004 TABLE 4 Plasmids used in this study
Name | Origin of replication | Marker | Description
pMR1 | pBBR1 | Kanamycin | Plasmid for nif cluster cloning
pMR2 | pRO1600, p15A | Gentamicin | Plasmid for nif cluster cloning
pMR3 | pBBR1 | Kanamycin | Native nif cluster of K. oxytoca M5al
pMR4 | pRO1600, p15A | Gentamicin | Native nif cluster of K. oxytoca M5al
pMR5 | pBBR1 | Kanamycin | Native nif cluster of P. stutzeri A1501
pMR6 | pRO1600, p15A | Gentamicin | Native nif cluster of P. stutzeri A1501
pMR7 | pBBR1 | Kanamycin | Native nif cluster of A. vinelandii DJ
pMR8 | pRO1600, p15A | Gentamicin | Native nif cluster of A. vinelandii DJ
pMR9 | pBBR1 | Gentamicin | Native nif cluster of Cyanothece ATCC51142
pMR10 | pRO1600, p15A | Gentamicin | Native nif cluster of Cyanothece ATCC51142
pMR11 | pBBR1 | Kanamycin | Native nif cluster of P. polymyxa WLY78
pMR12 | pRO1600, ColE1 | Gentamicin | Native nif cluster of P. polymyxa WLY78
pMR13 | pBBR1 | Kanamycin | Native nif cluster of A. brasilense Sp7
pMR14 | pRO1600, ColE1 | Gentamicin | Native nif cluster of A. brasilense Sp7
pMR15 | pBBR1 | Kanamycin | Native nif cluster of R. sphaeroides 2.4.1
pMR16 | pRO1600, ColE1 | Gentamicin | Native nif cluster of R. sphaeroides 2.4.1
pMR17 | pBBR1 | Kanamycin | Native nif cluster of R. palustris CGA009
pMR18 | pRO1600, ColE1 | Gentamicin | Native nif cluster of R. palustris CGA009
pMR19 | pBBR1 | Kanamycin | Native nif cluster of A. caulinodans ORS571 (Part1 of 2)
pMR20 | RK2 | Tetracycline | Native nif cluster of A. caulinodans ORS571 (Part2 of 2)
pMR21 | pBBR1 | Kanamycin | Native nif cluster of G. diazotrophicus PA1 5
pMR22 | pRO1600, ColE1 | Gentamicin | Native nif cluster of G. diazotrophicus PA1 5
pMR23 | pRO1600, p15A | Gentamicin | nifLA (3,915,521-3,918,529) deletion in the nif cluster of K. oxytoca M5al
pMR24 | pRO1600, p15A | Gentamicin | nifLA (1,420,874-1,423,084) deletion in the nif cluster of P. stutzeri A1501
pMR25 | pRO1600, p15A | Gentamicin | nifLA (5,168,709-5,171,731) deletion in the nif cluster of A. vinelandii DJ
pMR26 | pRO1600, p15A | Gentamicin | Native nif cluster of A. vinelandii DJ with the rnf2 operon
pMR27 | pRO1600, p15A | Gentamicin | rnf1 (5,168,156-5,162,716) operon deletion in the nif cluster of A. vinelandii DJ
pMR28 | pRO1600, p15A | Gentamicin | fix operon (995,860-1,000,698) deletion in the nif cluster of A. vinelandii DJ
pMR29 | pBBR1 | Kanamycin | Refactored nif cluster v2.1
pMR30 | pRO1600, p15A | Gentamicin | Refactored nif cluster v2.1
pMR31 | RK2 | Tetracycline | Refactored nif cluster v2.1
pMR32 | ColE1 | Gentamicin | P.sub.WT-nifHDKTY
pMR33 | ColE1 | Gentamicin | P.sub.2-nifENX
pMR34 | ColE1 | Gentamicin | P.sub.2-nifJ
pMR35 | ColE1 | Gentamicin | P.sub.2-nifBQ
pMR36 | ColE1 | Gentamicin | P.sub.2-nifF
pMR37 | ColE1 | Gentamicin | P.sub.2-nifUSVWZM
pMR38 | pBBR1 | Kanamycin | Refactored nif cluster v3.2
pMR39 | pRO1600, p15A | Gentamicin | Refactored nif cluster v3.2
pMR40 | pBBR1 | Kanamycin | LacI, P.sub.A1lacO1-gfpmut3b
pMR41 | RSF1010 | Gentamicin | LacI, P.sub.A1lacO1-gfpmut3b
pMR42 | RK2 | Tetracycline | LacI, P.sub.tac-gfpmut3b
pMR43 | pRO1600, ColE1 | Gentamicin | LacI, P.sub.A1lacO1-gfpmut3b
pMR44 | p15A | Gentamicin | Suicide plasmid for hsdR deletion in R. sp. IRBG74
pMR45 | p15A | Gentamicin | Suicide plasmid for the nif cluster I (219,579-227,127) deletion in R. sp. IRBG74
pMR46 | p15A | Gentamicin | Suicide plasmid for the nif cluster II (234,635-234,802) deletion in R. sp. IRBG74
pMR47 | p15A | Gentamicin | Suicide plasmid for recA deletion in R. sp. IRBG74
pMR48 | p15A | Gentamicin | Suicide plasmid for nifA deletion in A. caulinodans ORS571
pMR49 | pBBR1 | Gentamicin | LacI, P.sub.A1lacO1-nifV (A. caulinodans ORS571)
pMR50 | pBBR1 | Gentamicin | P.sub.nifH(R. sp. IRBG74)-sfgfp
pMR51 | pBBR1 | Gentamicin | NifA(R. sp. IRBG74), P.sub.nifH(R. sp. IRBG74)-sfgfp
pMR52 | pBBR1 | Gentamicin | NifA(K. oxytoca), P.sub.nifH(K. oxytoca)-sfgfp
pMR53 | pBBR1 | Gentamicin | NifA(R. sp. IRBG74), P.sub.nifH(K. oxytoca)-sfgfp
pMR54 | pBBR1 | Gentamicin | NifA(P. stutzeri), P.sub.nifH(P. stutzeri)-sfgfp
pMR55 | pBBR1 | Gentamicin | NifA(R. sp. IRBG74), P.sub.nifH(P. stutzeri)-sfgfp
pMR56 | pBBR1 | Gentamicin | NifA(A. caulinodans), P.sub.nifH(A. caulinodans)-sfgfp
pMR57 | pBBR1 | Gentamicin | NifA(R. sp. IRBG74), P.sub.nifH(A. caulinodans)-sfgfp
pMR58 | pBBR1 | Kanamycin | Plasmid for constitutive promoter characterization. P.sub.constitutive-gfpmut3b
pMR59 | pRO1600, p15A | Gentamicin | Plasmid for constitutive promoter characterization. P.sub.constitutive-gfpmut3b
pMR60 | pBBR1 | Kanamycin | P.sub.T7(WT)-mCherry
pMR61 | pBBR1 | Kanamycin | P.sub.T7(P1)-mCherry
pMR62 | pBBR1 | Kanamycin | P.sub.T7(P2)-mCherry
pMR63 | pBBR1 | Kanamycin | P.sub.T7(P3)-mCherry
pMR64 | pBBR1 | Kanamycin | P.sub.T7(P4)-mCherry
pMR65 | pBBR1 | Kanamycin | P.sub.T7(P5)-mCherry
pMR66 | pRO1600, ColE1 | Gentamicin | AraE, AraC, P.sub.BAD.10-gfpmut3b
pMR67 | pBBR1 | Kanamycin | Plasmid for terminator characterization. P.sub.T7-gfpmut3b-mrfp1
pMR68 | pRO1600, ColE1 | Gentamicin | Plasmid for terminator characterization. P.sub.T7-gfpmut3b-mrfp1
pMR69 | pBBR1 | Kanamycin | LuxR, P.sub.Lux-gfpmut3b
pMR70 | pBBR1 | Kanamycin | TetR, P.sub.Tet-gfpmut3b
pMR71 | pBBR1 | Kanamycin | CymR, P.sub.Cym-gfpmut3b
pMR72 | pBBR1 | Kanamycin | PhlF, P.sub.Phl-gfpmut3b
pMR73 | pBBR1 | Kanamycin | NahR, P.sub.Sal-gfpmut3b
pMR74 | pRO1600, ColE1 | Gentamicin | PhlF, P.sub.Phl-gfpmut3b
pMR75 | pRO1600, ColE1 | Gentamicin | TetR, P.sub.Tet-gfpmut3b
pMR76 | pRO1600, ColE1 | Gentamicin | LuxR, P.sub.Lux-gfpmut3b
pMR77 | pRO1600, ColE1 | Gentamicin | CymR, P.sub.Cym-gfpmut3b
pMR78 | pRO1600, ColE1 | Gentamicin | FdeR, P.sub.Fde-gfpmut3b
pMR79 | pRO1600, ColE1 | Gentamicin | LacI(Q18M/A47V/F161Y), P.sub.tac-gfpmut3b
pMR80 | pBBR1 | Kanamycin | P.sub.T7-gfpmut3b
pMR81 | pRO1600, p15A | Gentamicin | P.sub.T7-gfpmut3b
pMR82 | p15A | Gentamicin | Controller for R. sp. IRBG74, LacI, P.sub.A1lacO1-T7RNAP (RBSr33 for T7RNAP)
pMR83 | p15A | Gentamicin | Controller for R. sp. IRBG74, LacI, P.sub.A1lacO1-T7RNAP (RBSr32 for T7RNAP)
pMR84 | p15A | Gentamicin | Controller for R. sp. IRBG74, LacI, P.sub.A1lacO1-T7RNAP (RBSr3 for T7RNAP)
pMR85 | p15A | Gentamicin | Controller for R. sp. IRBG74, PhlF, P.sub.PhlF-T7RNAP (RBSr33 for T7RNAP)
pMR86 | ColE1 | Tetracycline | Controller for P. protegens Pf-5, LacI(Q18M/A47V/F161Y), P.sub.tac-T7RNAP
pMR87 | pBBR1 | Kanamycin | NocR, P.sub.noc-gfpmut3b
pMR88 | pBBR1 | Kanamycin | OccR, P.sub.occ-gfpmut3b
pMR89 | pBBR1 | Gentamicin | NifA(A. vinelandii), P.sub.nifH(A. vinelandii)-sfgfp
pMR90 | pBBR1 | Gentamicin | NifA(K. oxytoca), P.sub.nifH(P. stutzeri)-sfgfp
pMR91 | pBBR1 | Gentamicin | NifA(K. oxytoca), P.sub.nifH(A. vinelandii)-sfgfp
pMR92 | pRO1600, p15A | Gentamicin | NifA(K. oxytoca), P.sub.nifH(K. oxytoca)-sfgfp
pMR93 | pRO1600, p15A | Gentamicin | NifA(A. vinelandii), P.sub.nifH(A. vinelandii)-sfgfp
pMR94 | pRO1600, p15A | Gentamicin | NifA(P. stutzeri), P.sub.nifH(P. stutzeri)-sfgfp
pMR95 | pRO1600, p15A | Gentamicin | NifA(P. stutzeri), P.sub.nifH(K. oxytoca)-sfgfp
pMR96 | pRO1600, p15A | Gentamicin | NifA(P. stutzeri), P.sub.nifH(A. vinelandii)-sfgfp
pMR97 | ColE1 | Tetracycline | NifA controller for P. protegens Pf-5, LacI(Q18M/A47V/F161Y), P.sub.tac-nifA(P. stutzeri) (RBSp32 for NifA)
pMR98 | ColE1 | Tetracycline | NifA controller for P. protegens Pf-5, LacI(Q18M/A47V/F161Y), P.sub.tac-nifA(P. stutzeri) (RBSp27 for NifA)
pMR99 | ColE1 | Tetracycline | NifA controller for P. protegens Pf-5, LacI(Q18M/A47V/F161Y), P.sub.tac-nifA(P. stutzeri) (RBSp33 for NifA)
pMR100 | ColE1 | Tetracycline | NifA controller for P. protegens Pf-5, AraE, AraC, P.sub.BAD.10-nifA
pMR101 | ColE1 | Tetracycline | NifA controller for P. protegens Pf-5, FdeR, P.sub.Fde-nifA
pMR102 | IncW | Spectinomycin | NifA controller plasmid for E. coli, LacI, P.sub.A1lacO1-nifA(K. oxytoca)
pMR103 | pRO1600, p15A | Gentamicin | P.sub.nifH(K. oxytoca)-sfgfp
pMR104 | pRO1600, p15A | Gentamicin | P.sub.nifH(P. stutzeri)-sfgfp
pMR105 | pRO1600, p15A | Gentamicin | P.sub.nifH(A. vinelandii)-sfgfp
pMR106 | pBBR1 | Gentamicin | P.sub.nifH(K. oxytoca)-sfgfp
pMR107 | pBBR1 | Gentamicin | P.sub.nifH(P. stutzeri)-sfgfp
pMR108 | pBBR1 | Gentamicin | P.sub.nifH(A. vinelandii)-sfgfp
pMR109 | p15A | Kanamycin | P.sub.BAD-T7RNAP
pMR110 | p15A | Kanamycin | P.sub.Bet-T7RNAP
pMR111 | p15A | Kanamycin | P.sub.Cin-T7RNAP
pMR112 | p15A | Kanamycin | P.sub.Cym-T7RNAP
pMR113 | p15A | Kanamycin | P.sub.Lux-T7RNAP
pMR114 | p15A | Kanamycin | P.sub.Phl-T7RNAP
pMR115 | p15A | Kanamycin | P.sub.3B5B-T7RNAP
pMR116 | p15A | Kanamycin | P.sub.tac-T7RNAP
pMR117 | p15A | Kanamycin | P.sub.Tet-T7RNAP
pMR118 | p15A | Kanamycin | P.sub.Tfg-T7RNAP
pMR119 | p15A | Kanamycin | P.sub.Van-T7RNAP
pMR120 | p15A | Kanamycin | P.sub.Sal-T7RNAP
pMR121 | pBBR1 | Gentamicin | P.sub.T7(P2)-gfpmut3b
pMR122 | pBBR1 | Gentamicin | NifA controller for A. caulinodans, LacI, P.sub.A1lacO1-nifA-rpoN
pMR123 | pBBR1 | Gentamicin | NifA controller for A. caulinodans, LacI, P.sub.A1lacO1-nifA(L94Q)-rpoN
pMR124 | pBBR1 | Gentamicin | NifA controller for A. caulinodans, LacI, P.sub.A1lacO1-nifA(D95Q)-rpoN(A. caulinodans)
pMR125 | pBBR1 | Gentamicin | NifA controller for A. caulinodans, LacI, P.sub.A1lacO1-nifA(L94Q/D95Q)-rpoN
pMR126 | pBBR1 | Gentamicin | NifA controller for A. caulinodans, NahR, P.sub.Sal-nifA(L94Q/D95Q)-rpoN
pMR127 | pBBR1 | Gentamicin | NifA controller for A. caulinodans, NocR, P.sub.noc-nifA(L94Q/D95Q)-rpoN
pMR128 | pBBR1 | Gentamicin | NifA controller for A. caulinodans, OccR, P.sub.occ-nifA(L94Q/D95Q)-rpoN
pMR129 | pBBR1 | Gentamicin | P.sub.nifH(A. caulinodans)-sfgfp
pMR130 | pBBR1 | Gentamicin | NifA, P.sub.nifH(A. caulinodans)-sfgfp
pMR131 | pBBR1 | Gentamicin | NifA, RpoN, P.sub.nifH(A. caulinodans)-sfgfp
pMR132 | pBBR1 | Gentamicin | LacI, P.sub.A1lacO1-nifA(L94Q/D95Q)-rpoN, P.sub.nifH-sfgfp
pMR133 | pBBR1 | Gentamicin | NahR, P.sub.Sal-nifA(L94Q/D95Q)-rpoN, P.sub.nifH-sfgfp
pMR134 | pBBR1 | Gentamicin | NocR, P.sub.noc-nifA(L94Q/D95Q)-rpoN, P.sub.nifH-sfgfp
pMR135 | pBBR1 | Gentamicin | OccR, P.sub.occ-nifA(L94Q/D95Q)-rpoN, P.sub.nifH-sfgfp
pMR136 | pBBR1 | Gentamicin | Refactored nif cluster v2.1
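Table 4 lists reporter plasmids pMR60 to pMR65 that carry T7 promoter variants (WT and P1 to P5), and Table 5 gives a 23-nt sequence for each variant, including a P6. A quick way to see how each variant differs from the wild-type promoter is to compare the strings base by base. The sketch below does only that; the sequences are copied from Table 5, positions are counted from the first base of the listed string (an assumption, not a numbering relative to the transcription start site), and the function name `substitutions` is hypothetical.

```python
# T7 promoter sequences as listed in Table 5 (P.sub.WT and P.sub.1 to P.sub.6).
WILD_TYPE = "TAATACGACTCACTATAGGGAGA"
VARIANTS = {
    "P1": "TAATACGACTCACTACAGGCAGA",
    "P2": "TAATACGACTCACTAGAGAGAGA",
    "P3": "TAATACGACTCACTAATGGGAGA",
    "P4": "TAATACGACTCACTAAAGGGAGA",
    "P5": "TAATACGACTCACTATAGGTAGA",
    "P6": "TAATACGACTCACTATTGGGAGA",
}


def substitutions(reference, variant):
    """List (position, reference_base, variant_base) for every mismatch.

    Positions are 1-based and counted from the first base of the string.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (index + 1, ref_base, var_base)
        for index, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


if __name__ == "__main__":
    for name, sequence in VARIANTS.items():
        changes = ", ".join(f"{r}{pos}{v}" for pos, r, v in substitutions(WILD_TYPE, sequence))
        print(f"{name}: {changes}")
```

Every listed variant differs from the wild-type string only within positions 16 to 20, which is consistent with the variants being point-mutated versions of one promoter.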
TABLE-US-00005 TABLE 5 Genetic part sequences used in this study
Name Genetic part DNA sequence (SEQ ID NOs: 122-225) P.sub.A1lacO1
Promoter.sup.6
AGAGTGTTGACTTGTGAGCGGATAACAATGATACTTAGATTCAATTGTGAGCGGATAACAATTTCAC
ACA T7 RNAP Gene
ATGAACACGATTAACATCGCTAAGAACGACTTCTCTGACATCGAACTGGCTGCTATCCCGTTCAACACTC
TGGCTGACCATTACGGTGAGCG
TTTAGCTCGCGAACAGTTGGCCCTTGAGCATGAGTCTTACGAGATGGGTGAAGCACGCTTCCGCAAGATG
TTTGAGCGTCAACTTAAAGCTG
GTGAGGTTGCGGATAACGCTGCCGCCAAGCCTCTCATCACTACCCTACTCCCTAAGATGATTGCACGCAT
CAACGACTGGTTTGAGGAAGTG
AAAGCTAAGCGCGGCAAGCGCCCGACAGCCTTCCAGTTCCTGCAAGAAATCAAGCCGGAAGCCGTAGCGT
ACATCACCATTAAGACCACTCT
GGCTTGCCTAACCAGTGCTGACAATACAACCGTTCAGGCTGTAGCAAGCGCAATCGGTCGGGCCATTGAG
GACGAGGCTCGCTTCGGTCGTA
TCCGTGACCTTGAAGCTAAGCACTTCAAGAAAAACGTTGAGGAACAACTCAACAAGCGCGTAGGGCACGT
CTACAAGAAAGCATTTATGCAA
GTTGTCGAGGCTGACATGCTCTCTAAGGGTCTACTTGGTGGCGAGGCGTGGTCTTCGTGGCATAAGGAAG
ACTCTATTCATGTAGGAGTACG
CTGCATCGAGATGCTCATTGAGTCAACCGGAATGGTTAGCTTACACCGCCAAAATGCTGGCGTAGTAGGT
CAAGACTCTGAGACTATCGAAC
TCGCACCTGAATACGCTGAGGCTATCGCAACCCGTGCAGGTGCGCTGGCTGGCATCTCTCCGATGTTCCA
ACCTTGCGTAGTTCCTCCTAAG
CCGTGGACTGGCATTACTGGTGGTGGCTATTGGGCTAACGGTCGTCGTCCTCTGGCGCTGGTGCGTACTC
ACAGTAAGAAAGCACTGATGCG
CTACGAAGACGTTTACATGCCTGAGGTGTACAAAGCGATTAACATTGCGCAAAACACCGCATGGAAAATC
AACAAGAAAGTCCTAGCGGTCG
CCAACGTAATCACCAAGTGGAAGCATTGTCCGGTCGAGGACATCCCTGCGATTGAGCGTGAAGAACTCCC
GATGAAACCGGAAGACATCGAC
ATGAATCCTGAGGCTCTCACCGCGTGGAAACGTGCTGCCGCTGCTGTGTACCGCAAGGACAAGGCTCGCA
AGTCTCGCCGTATCAGCCTTGA
GTTCATGCTTGAGCAAGCCAATAAGTTTGCTAACCATAAGGCCATCTGGTTCCCTTACAACATGGACTGG
CGCGGTCGTGTTTACGCTGTGT
CAATGTTCAACCCGCAAGGTAACGATATGACCAAAGGACTGCTTACGCTGGCGAAAGGTAAACCAATCGG
TAAGGAAGGTTACTACTGGCTG
AAAATCCACGGTGCAAACTGTGCGGGTGTCGACAAGGTTCCGTTCCCTGAGCGCATCAAGTTCATTGAGG
AAAACCACGAGAACATCATGGC
TTGCGCTAAGTCTCCACTGGAGAACACTTGGTGGGCTGAGCAAGATTCTCCGTTCTGCTTCCTTGCGTTC
TGCTTTGAGTACGCTGGGGTAC
AGCACCACGGCCTGAGCTATAACTGCTCCCTTCCGCTGGCGTTTGACGGGTCTTGCTCTGGCATCCAGCA
CTTCTCCGCGATGCTCCGAGAT
GAGGTAGGTGGTCGCGCGGTTAACTTGCTTCCTAGTGAAACCGTTCAGGACATCTACGGGATTGTTGCTA
AGAAAGTCAACGAGATTCTACA
AGCAGACGCAATCAATGGGACCGATAACGAAGTAGTTACCGTGACCGATGAGAACACTGGTGAAATCTCT
GAGAAAGTCAAGCTGGGCACTA
AGGCACTGGCTGGTCAATGGCTGGCTTACGGTGTTACTCGCAGTGTGACTAAGCGTTCAGTCATGACGCT
GGCTTACGGGTCCAAAGAGTTC
GGCTTCCGTCAACAAGTGCTGGAAGATACCATTCAGCCAGCTATTGATTCCGGCAAGGGTCTGATGTTCA
CTCAGCCGAATCAGGCTGCTGG
ATACATGGCTAAGCTGATTTGGGAATCTGTGAGCGTGACGGTGGTAGCTGCGGTTGAAGCAATGAACTGG
CTTAAGTCTGCTGCTAAGCTGC
TGGCTGCTGAGGTCAAAGATAAGAAGACTGGAGAGATTCTTCGCAAGCGTTGCGCTGTGCATTGGGTAAC
TCCTGATGGTTTCCCTGTGTGG
CAGGAATACAAGAAGCCTATTCAGACGCGCTTGAACCTGATGTTCCTCGGTCAGTTCCGCTTACAGCCTA
CCATTAACACCAACAAAGATAG
CGAGATTGATGCACACAAACAGGAGTCTGGTATCGCTCCTAACTTTGTACACAGCCAAGACGGTAGCCAC
CTTCGTAAGACTGTAGTGTGGG
CACACGAGAAGTACGGAATCGAATCTTTTGCACTGATTCACGACTCCTTCGGTACGATTCCGGCTGACGC
TGCGAACCTGTTCAAAGCAGTG
CGCGAAACTATGGTTGACACATATGAGTCTTGTGATGTACTGGCTGATTTCTACGACCAGTTCGCTGACC
AGTTGCACGAGTCTCAATTGGA
CAAAATGCCAGCACTTCCGGCTAAAGGTAACTTGAACCTCCGTGACATCTTAGAGTCGGACTTCGC
GTTCGCGTAA P.sub.lacIq Promoter.sup.7
CGAATGGTGCAAAACCTTTCGCGGTATGGCATGATAGCGCCCGGAAGAGAG rpoN of A.
caulinodans Gene
ATGGCGATGAGCCCAAAGATGGAGTTCCGCCAGAGCCAGTCTCTGGTGATGACGCCGCAGCTGATGCAGG
CCATCAAGCTGCTGCAGCTCTC
CAATCTCGAACTGGTCGCCTATGTGGAGGCCGAGCTCGAACGCAATCCGCTGCTGGAGCGGGCGAGCGAG
CCGGAAAGCCCCGAGCACGATC
CGCCGAACCCGCAGGAAGAGGCACCCACCCCGCCTGACAGTGGCGCGCCGGTGTCCGGCGACTGGATGGA
AAGCGACATGGGCTCGAGCCGC
GAGGCCATCGAGACCCGGCTGGACACCGACCTCGGCAATGTCTTTCCCGATGATGCGCCGGCCGAGCGCA
TCGGCGCGGGCAGCGGCAGCGG
CTCGTCCATCGAATGGGGCTCGGGCGGCGACCGGGGCGAGGACTACAATCCGGAAGCCTTCCTCGCTGCC
GAGACGACGCTGGCCGACCATC
TGGAAGCCCAGCTCTCCGTGGCGGAGCCCGATCCGGCGCGCCGCCTCATCGGCCTCAACCTCATCGGCCT
CATCGACGAGACGGGTTATTTC
TCCGGCGACCTCGATGCGGTGGCCGAGCAACTGGGCGCCACCCACGATCAGGTGGCCGACGTGCTGCGCG
TCATCCAGAGCTTCGAGCCGTC
CGGCGTCGGCGCACGGTCGCTCAGCGAATGCCTGGCCCTGCAATTGCGCGACAAGGATCGCTGCGATCCC
GCCATGCAGGCGCTGCTCGACA
ATCTGGAACTCCTCGCCCGCCACGACCGCAACGCGCTGAAGCGCATCTGCGGGGTGGACGCGGAAGACCT
CGCGGACATGATCGGCGAGATC
CGCCGCCTCGATCCGAAGCCCGGCCTCGCCTATGGCGGCGGCGTCGTCCACCCGCTGGTGCCGGACGTGT
TCGTGCGCGAGGGCTCCGACGG
CAGCTGGATCGTGGAACTGAATTCCGAGACGCTGCCGCGCGTGCTGGTGAACCAGACCTATCACGCGACG
GTGGCCAAGGCGGCGCGCTCGG
CCGAGGAAAAGACCTTCCTCGCCGACTGCCTCCAGAGCGCCTCCTGGCTTACCCGCTCGCTCGACCAGCG
GGCTCGCACCATCCTCAAGGTG
GCGAGCGAGATCGTGCGCCAGCAGGACGCCTTCCTCGTGCACGGCGTGCGGCACCTGCGCCCCCTGAACC
TGCGCACGGTGGCGGATGCCAT
CGGCATGCACGAATCCACCGTCTCGCGGGTGACCTCGAACAAGTACATCTCCACCCCGCGCGGGGTGCTG
GAGATGAAGTTCTTCTTCTCCT
CCTCCATCGCTTCCTCGGGTGGTGGCGAGGCCCATGCGGCGGAGGCGGTGCGCCACCGCATCAAGAGCCT
CATCGAGGCCGAGAGTGCGGAC
GACGTGCTGTCCGACGACACGCTGGTGCAGAAGCTGAAGGACGACGGCATCGATATCGCCCGCCGAACGG
TCGCGAAATATCGCGAGAGCAT
GAACATCCCGTCCTCGGTCCAGCGCCGCCGCGAAAAGCAGGCCCTGCGCAGCGACGCCGCCGCCGC
CGGCTGA lacI Gene
GTGAAACCAGTAACGTTATACGATGTCGCAGAGTATGCCGGTGTCTCTTATCAGACCGTTTCCC-
GCGTGG TGAACCAGGCCAGCCACGTTTC
TGCGAAAACGCGGGAAAAAGTGGAAGCGGCGATGGCGGAGCTGAATTACATTCCCAACCGCGTGGCACAA
CAACTGGCGGGCAAACAGTCGT
TGCTGATTGGCGTTGCCACCTCCAGTCTGGCCCTGCACGCGCCGTCGCAAATTGTCGCGGCGATTAAATC
TCGCGCCGATCAACTGGGTGCC
AGCGTGGTGGTGTCGATGGTAGAACGAAGCGGCGTCGAAGCCTGTAAAGCGGCGGTGCACAATCTTCTCG
CGCAACGCGTCAGTGGGCTGAT
CATTAACTATCCGCTGGATGACCAGGATGCCATTGCTGTGGAAGCTGCCTGCACTAATGTTCCGGCGTTA
TTTCTTGATGTCTCTGACCAGA
CACCCATCAACAGTATTATTTTCTCCCATGAAGACGGTACGCGACTGGGCGTGGAGCATCTGGTCGCATT
GGGTCACCAGCAAATCGCGCTG
TTAGCGGGCCCATTAAGTTCTGTCTCGGCGCGTCTGCGTCTGGCTGGCTGGCATAAATATCTCACTCGCA
ATCAAATTCAGCCGATAGCGGA
ACGGGAAGGCGACTGGAGTGCCATGTCCGGTTTTCAACAAACCATGCAAATGCTGAATGAGGGCATCGTT
CCCACTGCGATGCTGGTTGCCA
ACGATCAGATGGCGCTGGGCGCAATGCGCGCCATTACCGAGTCCGGGCTGCGCGTTGGTGCGGATATCTC
GGTAGTGGGATACGACGATACC
GAAGACAGCTCATGTTATATCCCGCCGTTAACCACCATCAAACAGGATTTTCGCCTGCTGGGGCAAACCA
GCGTGGACCGCTTGCTGCAACT
CTCTCAGGGCCAGGCGGTGAAGGGCAATCAGCTGTTGCCCGTCTCACTGGTGAAAAGAAAAACCACCCTG
GCGCCCAATACGCAAACCGCCT
CTCCCCGCGCGTTGGCCGATTCATTAATGCAGCTGGCACGACAGGTTTCCCGACTGGAAAGCGGGC
AGTGA gfpmut3b Gene
ATGAGTAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTTAATG
GGCACAAATTTTCTGTTAGTGG
AGAGGGTGAAGGTGATGCAACATACGGAAAACTTACCCTTAAATTTATTTGCACTACTGGAAAACTACCT
GTTCCATGGCCAACACTTGTCA
CTACTTTCGGTTATGGTGTTCAATGCTTTGCGAGATACCCAGATCATATGAAACAGCATGACTTTTTCAA
GAGTGCCATGCCCGAAGGTTAT
GTACAGGAAAGAACTATATTTTTCAAAGATGACGGGAACTACAAGACACGTGCTGAAGTCAAGTTTGAAG
GTGATACCCTTGTTAATAGAAT
CGAGTTAAAAGGTATTGATTTTAAAGAAGATGGAAACATTCTTGGACACAAATTGGAATACAACTATAAC
TCACACAATGTATACATCATGG
CAGACAAACAAAAGAATGGAATCAAAGTTAACTTCAAAATTAGACACAACATTGAAGATGGAAGCGTTCA
ACTAGCAGACCATTATCAACAA
AATACTCCAATTGGCGATGGCCCTGTCCTTTTACCAGACAACCATTACCTGTCCACACAATCTGCCCTTT
CGAAAGATCCCAACGAAAAGAG
AGACCACATGGTCCTTCTTGAGTTTGTAACAGCTGCTGGGATTACACATGGCATGGATGAACTATA
CAAATAG sfgfp Gene
ATGCGTAAAGGCGAAGAGCTGTTCACTGGTGTCGTCCCTATTCTGGTGGAACTGGATGGTGAT-
GTCAACGGTCATAAGTTTTCC GTGCGTGG
CGAGGGTGAAGGTGACGCAACTAATGGTAAACTGACGCTGAAGTTCATCTGTACTACTGGTAAACTGCCGGT-
ACCTTGGCCGAC TCTGGTAA
CGACGCTGACTTATGGTGTTCAGTGCTTTGCTCGTTATCCGGACCATATGAAGCAGCATGACTTCTTCAAGT-
CCGCCATGCCGG AAGGCTAT
GTGCAGGAACGCACGATTTCCTTTAAGGATGACGGCACGTACAAAACGCGTGCGGAAGTGAAATTTGAAGGC-
GATACCCTGGTA AACCGCAT
TGAGCTGAAAGGCATTGACTTTAAAGAAGACGGCAATATCCTGGGCCATAAGCTGGAATACAATTTTAACAG-
CCACAATGTTTA CATCACCG
CCGATAAACAAAAAAATGGCATTAAAGCGAATTTTAAAATTCGCCACAACGTGGAGGATGGCAGCGTGCAGC-
TGGCTGATCACT ACCAGCAA
AACACTCCAATCGGTGATGGTCCTGTTCTGCTGCCAGACAATCACTATCTGAGCACGCAAAGCGTTCTGTCT-
AAAGATCCGAAC GAGAAACG
CGATCATATGGTTCTGCTGGAGTTCGTAACCGCAGCGGGCATCACGCATGGTATGGATGAACTGTACAAATG-
A mrfp1 Gene
ATGGCTTCCTCCGAAGACGTTATCAAAGAGTTCATGCGTTTCAAAGTTCGTATGGAAGGTTCC-
GTTAACGGTCACGAGTTCGAA ATCGAAGG
TGAAGGTGAAGGTCGTCCGTACGAAGGTACCCAGACCGCTAAACTGAAAGTTACCAAAGGTGGTCCGCTGCC-
GTTCGCTTGGGA CATCCTGT
CCCCGCAGTTCCAGTACGGTTCCAAAGCTTACGTTAAACACCCGGCTGACATCCCGGACTACCTGAAACTGT-
CCTTCCCGGAAG GTTTCAAA
TGGGAACGTGTTATGAACTTCGAAGACGGTGGTGTTGTTACCGTTACCCAGGACTCCTCCCTGCAAGACGGT-
GAGTTCATCTAC AAAGTTAA
ACTGCGTGGTACCAACTTCCCGTCCGACGGTCCGGTTATGCAGAAAAAAACCATGGGTTGGGAAGCTTCCAC-
CGAACGTATGTA CCCGGAAG
ACGGTGCTCTGAAAGGTGAAATCAAAATGCGTCTGAAACTGAAAGACGGTGGTCACTACGACGCTGAAGTTA-
AAACCACCTACA TGGCTAAA
AAACCGGTTCAGCTGCCGGGTGCTTACAAAACCGACATCAAACTGGACATCACCTCCCACAACGAAGACTAC-
ACCATCGTTGAA CAGTACGAACGTGCTGAAGGTCGTCACTCCACCGGTGCTTAA mCherry
Gene
ATGGTTTCGAAGGGCGAGGAGGATAACATGGCCATCATCAAGGAGTTCATGCGCTTCAAGGTGCACATGGAGG-
GCTCCGTGAAC GGCCACGA
GTTCGAGATCGAGGGCGAGGGCGAGGGCCGCCCCTACGAGGGCACCCAGACCGCCAAGCTGAAGGTGACCAA-
GGGTGGCCCCCT GCCCTTCG
CCTGGGACATCCTGTCCCCTCAGTTCATGTACGGCTCCAAGGCCTACGTGAAGCACCCCGCCGACATCCCCG-
ACTACTTGAAGC TGTCCTTC
CCCGAGGGCTTCAAGTGGGAGCGCGTGATGAACTTCGAGGACGGCGGCGTGGTGACCGTGACCCAGGACTCC-
TCCTTGCAGGAC GGCGAGTT
CATCTACAAGGTGAAGCTGCGCGGCACCAACTTCCCCTCCGACGGCCCCGTAATGCAGAAGAAGACGATGGG-
CTGGGAGGCCTC CTCCGAGC
GGATGTACCCCGAGGACGGCGCCCTGAAGGGCGAGATCAAGCAGAGGCTGAAGCTGAAGGACGGCGGCCACT-
ACGACGCTGAGG TCAAGACC
ACCTACAAGGCCAAGAAGCCCGTGCAGCTGCCCGGCGCCTACAACGTCAACATCAAGTTGGACATCACCTCC-
CACAACGAGGAC
TACACCATCGTGGAACAGTACGAACGCGCCGAGGGCCGCCACTCCACCGGCGGCATGGACGAGCTGTACAAG-
TAA P.sub.WT Promoter.sup.8 TAATACGACTCACTATAGGGAGA P.sub.1
Promoter.sup.8 TAATACGACTCACTACAGGCAGA P.sub.2 Promoter.sup.8
TAATACGACTCACTAGAGAGAGA P.sub.3 Promoter.sup.8
TAATACGACTCACTAATGGGAGA P.sub.4 Promoter.sup.8
TAATACGACTCACTAAAGGGAGA P.sub.5 Promoter.sup.8
TAATACGACTCACTATAGGTAGA P.sub.6 Promoter.sup.8
TAATACGACTCACTATTGGGAGA nifV of A. caulinodans Gene
GTGTTCCGTGGGAGGCCTGCCATGCTCGCCAAGACACCCGCAAACCCCGCGCCGCTTCAGCGGACGGCGTTCC-
TGAACGACACC ACGCTGCG
CGACGGCGAGCAGGCGCCGGGTGTCGCCTTCACCCGCAAGGAGAAGATCGAGATCGCCGCCGCCCTTGCCGC-
CGCCGGTGTCCC GGAGATCG
AGGCGGGAACGCCCGCCATGGGCGACGAAGAGGTGGAAACCATCCGCTCCATCGTCTCGCTGAACCTCCCGA-
CGCGCGTCATGG CCTGGTGC
CGCATGAGCGAGGACGACCTGATGGCCGCCGTCGCGGCGGGCGTGAAGATCGTCAATGTCTCCATTCCCACC-
TCCGACCGGCAA CTGGCCGG
CAAGCTCGGCAAGGATCGCGCCTGGGCGCTCGGCCGTGTGGCGGAGGTGGTGACACTGGCGCGTCGGCTCGG-
CTTTGAGGTGGC GGTAGGGG
GCGAGGATTCCTCGCGGGCCGATCCCGATTTTCTCTGCCGTCTCGCGGAGACGGCGAAGGCGGCGGGCGCCT-
TTCGCCTGCGGC TGGCCGAC
ACGCTTGGCGTGCTTGACCCCTTCGGCACCTATGCATTGGTGCGCCGGGTGGCCGCCACCACCGACATCGAG-
CTTGAGTTCCAC GCCCATGA
CGATCTCGGCCTTGCCACCGCCAATACGCTGGCGGCGGTGATGGGCGGAGCGCGTCACGCCAGCGTCACCGT-
CGCCGGGCTCGG CGAGCGCG
CGGGCAATGCCGCGCTGGAGGAAGTGGCCATCGCCCTGCGCCAGACGGCGCGGGCGGAGACCGGCATCGCTC-
CGGCCGCGCTGA AGCCGCTG
GCCGAACTAGTGTGCGGCGCCGCCGCCCGTCCGGTGCCGCGCGGCAAGGCCATCGTCGGCGCGGATGTGTTC-
ACCCACGAGTCG GGCATCCA
TGTCTCCGGCCTGCTCAAGGACCGGGCCACCTATGAAGCTCTGAATCCGGAACTGTTCGGGCGTGGCCACAC-
GGTGGTGCTCGG AAAGCATT
CCGGTCTTGCGGCGGTGGAGAAGGCGCTGGCCGACGAGGGCATCACCGTGGATGCGGTGCGCGGGCGCGCCA-
TTCTCGACCGGG TGCGGGCT
TTTGCTGTCCGCACCAAGGAGAATGTTTCCCGCGAGACGCTGCTGCGCTTCTATCAGGACAGCTTCACCGAG-
TCCGCGCTGCGT CTGCGGCGGGCCGCCGTGGAAGGCGCAATCTGA P.sub.nifh of K.
oxytoca Promoter
TGTTGCCTCAAGCACAGCCTGTGCCAGCTCGCGGATGACAGAAGAGTTAGCGCGAATTCAACGCGTTATGAAG-
AGAGTCGCCGC GCAGCGCG
CCAAGAGATTGCGTGGAATAAGACACAGGGGGCGACAAGCTGTTGAACAGGCGACAAAGCGCCACCATGGCC-
CCGGCAGGCGCA ATTGTTCT
GTTTCCCACATTTGGTCGCCTTATTGTGCCGTTTTGTTTTACGTCCTGCGCGGCGACAAATAACTAACTTCA-
TAAAAATCATAA GAATAC
ATAAACAGGCACGGCTGGTATGTTCCCTGCACTTCTCTGCTGGCAAACA P.sub.nifh of P.
stutzeri Promoter
TGTCATGTTCGCAACAGTTGCCGAAAGTGTGGAAAACCGGCGCTTGGCCCGGCCGATCTTTTTGTCGCCATTG-
CAACAGTCAGG CCTGTCGG
TTGTTAACTATCGAACCGCCGAAGGATGTTGCTAGTAATTAAATTATTCTAATTAAAACAAGTGCTTAGATT-
ATTTTAGAAACG
CTGGCACAAAGGCTGCTATTGCCCTGTTGCGCAGGCTTGTTCGTGCCTATAGCCCAC
P.sub.nifh of A. vinelandii Promoter
TGTCAGTTTTGTCACAGGGGGCCGGACCAGGATGGTGGACGCTCGATGGGGATGTCGGGCCATTGTTCGGTTG-
TAGCAATTACA ACAGTCGG
AGTAGGGGGATTGTAGGGGGATTGTTGTGTATCAGACCGCCCTGCAGCTCCCGTCGATGGATAATTAATCAT-
TTAAAATCAATG GTTTATTT
ATGTGTTGCGGGTGCTGGCACAGACGCTGCATTACCTTTGGTGCGCGGAGTTGTTCGGGCTTACGGCCGAAC
P.sub.nifh of R. sp. IRBG74 Promoter
TTGACAAAGCCTCCGAGAAGAGCGCCCCCTAACCCCTCCTCAGCCCTGATCGGCAGTATCATCTTGTCGAATC-
CTAACGTCTGA TAGGCAAC
GCTATACGACAAACGCTGGTTACAATTGTCGGTTCCGCGACAAGAATTTGCTTTGTCTGGCGGGTGGTCTAT-
TTTGAGCTAAGT AGCTGAGA
AATCAGGAAAACAAAACTCTATTCGGTCTACCCGACGAGTTGGCACGGGTCTTGTAACCATCCTTGCGCAGG-
CGGCGAAAGCCA CCGGCGATATTCATGTTGCGGGCAAC P.sub.nifh of A.
caulinodans Promoter
TGTCGCGTTTGAAACACGGGGCTTTTGGAACCGTTCGATTCTGCAATGCACTGATTTTACTTGATTAATTCGA-
CCACACGACCA CTGGCACA
CCCGTTGCAAAACCCCTTGGTGCAGGCGACGGGTTGCCGGTCTGGTTCGCGGATCTCCTCGATCCCCGGCTA-
CCGACCCGCCTC CGAAAAGTCCGGTCCCGATCCAGTTCGGCGGGGCCACAC nifA of K.
oxytoca Gene
ATGATCCATAAATCCGATTCGGACACCACCGTCAGACGTTTCGATCTCTCCCAGCAGTTTACCGCCATGCAGC-
GGATAAGCGTG GTCCTGAG
TCGCGCCACCGAAGCGAGCAAAACCCTGCAGGAGGTTCTGAGCGTGCTACATAACGATGCCTTTATGCAGCA-
CGGGATGATTTG CCTGTACG
ACAGCCAGCAGGAGATCCTGAGCATCGAAGCGCTGCAGCAAACGGAAGATCAGACGCTGCCCGGCAGTACGC-
AAATTCGCTACC GGCCGGGG
GAAGGATTAGTCGGTACCGTGCTGGCGCAGGGCCAGTCGCTGGTGCTGCCGCGCGTCGCCGACGACCAGCGT-
TTTCTCGATCGT CTGAGCCT
GTACGACTATGACCTGCCGTTTATCGCCGTTCCGCTGATGGGCCCCCACTCCCGGCCCATCGGCGTACTGGC-
GGCGCAGCCGAT GGCGCGTC
AGGAAGAGCGGCTGCCCGCCTGCACGCGCTTTCTCGAAACCGTCGCCAATCTGATCGCCCAGACGATTCGCC-
TGATGATCCTGC CAACCTCC
GCCGCGCAGGCGCCGCAGCAGAGCCCCAGAATAGAGCGCCCGCGCGCCTGTACCCCTTCGCGCGGTTTCGGC-
CTGGAAAATATG GTCGGTAA
AAGCCCGGCGATGCGGCAGATTATGGATATTATTCGTCAGGTTTCCCGCTGGGATACCACGGTGCTGGTACG-
CGGCGAGAGCGG CACCGGGA
AAGAGCTCATCGCCAACGCCATCCACCATAATTCTCCGCGCGCCGCCGCGGCGTTCGTCAAATTTAACTGCG-
CGGCGCTGCCGG ACAACCTG
CTGGAGAGCGAGCTGTTTGGTCATGAGAAAGGCGCGTTTACCGGCGCGGTGCGCCAGCGGAAAGGCCGCTTT-
GAGCTGGCGGAC GGCGGCAC
CTTATTCCTCGATGAGATCGGCGAAAGCAGCGCCTCGTTTCAGGCTAAGCTACTGCGTATTCTGCAAGAGGG-
GGAGATGGAGCG CGTCGGCG
GCGACGAAACCCTGCGGGTCAACGTGCGCATTATCGCGGCGACCAACCGCCATCTGGAAGAGGAGGTGCGGC-
TGGGTCATTTCC GCGAGGAT
CTATACTACCGCCTGAACGTAATGCCTATCGCGCTGCCGCCGCTGCGCGAGCGCCAGGAGGATATCGCCGAG-
CTGGCGCACTTT CTGGTGCG
AAAAATCGCCCACAGCCAGGGGCGAACGCTGCGCATCAGCGATGGGGCGATTCGCCTGCTGATGGAGTACAG-
CTGGCCGGGAAA CGTGCGCG
AACTGGAAAACTGTCTCGAACGTTCGGCGGTGCTGTCGGAAAGCGGCCTGATAGACCGGGACGTGATTCTGT-
TCAACCATCGCG ATAACCCG
CCGAAAGCGCTCGCCAGCAGCGGCCCGGCGGAGGACGGCTGGCTCGATAACAGCCTCGACGAGCGCCAGCGG-
CTGATCGCCGCC CTGGAAAA
AGCGGGCTGGGTGCAGGCCAAAGCGGCGCGGCTGCTCGGCATGACCCCGCGCCAGGTGGCGTATCGCATTCA-
GATTATGGATAT CACCATGC CGCGACTGTGA nifA of P. stutzeri Gene
ATGAACGCCACATTCGCCGAACGCCCCAGCGCGCCAACCCGCAACGAACTGCTGGATGCCCAACTGCAGGCGC-
TGGCGCAGATC GCCCGCAT
CCTTAACCGCGGCCGGCCCATCGAGGAACTGCTGGCCGAGATCCTCGCCGTGCTGCACGAAGACCTCGGCCT-
GCTGCACGGGCT GGTCTCCA
TCTGCAACCCGAAGGACGGCAGCCTGCAGGTGGGCGCCGTGCACAGCGACTCCGAAACCGTGGTACGGGCCT-
GCGAAAGCACCC GCTACCGC
ATCGGCGAAGGCGTGTTCGGCAACATCCTCAAGCATGGCAACAGCGTGGTGCTCGGGCGTATCGACGCCGAA-
CCGCGCTTTCTC GACCGACT
GGCGCTGTACGACATGGACCTGCCCTTCATCGCCGTGCCGATCAAGGCCGTCGACGGCACCACCATCGGCGT-
GCTGGCTGCCCA GCCCGACC
GCCGCGCCGACGAGCTGATGCCCGAACGCACCCGTTTGATGGAAATCGTCGCCCGCCTACTGGCGCAGACCG-
TGCGCCTGGTGG TGAACCTC
GAGGACGGCCAGGAAGTGGTCGACGAGCGCGACGAGCTACGCCGCGAAGTCCGCGCCAAGTACGGCTTCGAG-
AACATGGTGGTG GGCCACAC
CGCCTCCATGCGCCGGGTTTTCGACCAGGTTCGACGGGTCGCCAAGTGGAACAGCACCGTGCTGATCCTCGG-
CGAATCCGGCAC CGGCAAGG
AGCTGATCGCCAGCGCCATCCACTACAACTCACCGCGCGCTCACCAGCCGCTGGTACGCCTGAACTGCGCCG-
CGCTACCGGAAA CCCTGCTC
GAATCGGAACTGTTCGGTCACGAGAAAGGCGCCTTCACCGGCGCCGTGAAGCAGCGCAAGGGACGTTTCGAA-
CAGGCCGACGGC GGCACCCT
GTTCCTCGACGAGATCGGCGAGATCTCGCCGATGTTCCAGGCCAAGCTGCTGCGCGTGCTGCAGGAAGGCGA-
GCTGGAGCGCGT CGGCGGCA
GCCAGACGGTGAAGGTCAACGTGCGCATCGTCGCCGCCACCAACCGCGACCTGGAGCACGAGGTGGAGCAAG-
GCAAGTTCCGCG AAGACCTC
TACTACCGCCTCAACGTCATGGCCATCCGCGTCCCGCCGCTGCGCGAGCGCAGCGCCGACATCCCGGAACTG-
GCCGAATTCCTC CTCGACAA
GATCGCCCGCCAGCAGGGTCGCAAACTCAAGCTGACCGACAGCGCCCTGCGTCTGCTGATGAGCCACCGCTG-
GCCGGGCAACGT GCGCGAAC
TGGAAAACTGCCTGGAACGCTCGGCCATCATGAGCGAGGATGGCACCATCAGCCGCGACGTGGTCTCCCTCA-
CCGGCCTCGACC ACGACGCC
ACGCCGCTGGCGCCGGTCCCCGAAGTCGACCTCGCCGACGACAGCCTCGACGACCGCGAGCGCGTCATCGCC-
GCGCTGGAACAG GCCGGCTG
GGTCCAGGCCAAGGCCGCCCGCCTGCTCGGCATGACGCCCCGGCAGATCGCCTACCGAGTGCAGACGCTGAA-
CATTCATATGCG CAAGATCT GA nifA of A. vinelandii Gene
ATGAATGCAACCATCCCTCAGCGCTCGGCCAAACAGAACCCGGTCGAACTCTATGACCTGCAATTGCAGGCCC-
TGGCGAGCATC GCCCGCAC
GCTCAGCCGCGAACAACAGATCGACGAACTGCTCGAACAGGTCCTGGCCGTACTGCACAATGACCTCGGCCT-
GCTGCATGGCCT GGTGACCA
TTTCCGACCCGGAACACGGCGCCCTGCAGATCGGCGCCATCCACACCGACTCGGAAGCGGTGGCCCAGGCCT-
GCGAAGGCGTGC GCTACAGA
AGCGGCGAAGGCGTGATCGGCAACGTGCTCAAGCACGGCAACAGCGTGGTGCTCGGGCGCATCTCCGCCGAC-
CCGCGCTTTCTC GACCGCCT
GGCGCTGTACGACCTGGAAATGCCGTTCATCGCCGTGCCGATCAAGAACCCCGAGGGCAACACCATCGGCGT-
GCTGGCGGCCCA GCCGGACT
GCCGCGCCGACGAGCACATGCCCGCGCGCACGCGCCTTCTGGAGATCGTCGCCAACCTGCTGGCGCAGACCG-
TGCGCCTGGTGG TGAACATC
GAGGACGGCCGCGAGGCGGCCGACGAGCGCGACGAACTGCGTCGCGAGGTGCGCGGCAAGTACGGCTTCGAG-
AACATGGTGGTG GGCCACAC
CCCCACCATGCGCCGGGTGTTCGATCAGATCCGCCGGGTCGCCAAGTGGAACAGCACCGTACTGGTCCTCGG-
CGAGTCCGGTAC CGGCAAGG
AACTGATCGCCAGCGCCATCCACTACAACTCGCCGCGCGCGCACCGCCCCTTCGTGCGCCTGAACTGCGCCG-
CGCTGCCGGAAA CCCTGCTC
GAGTCCGAACTCTTCGGCCACGAGAAGGGCGCCTTCACCGGCGCGGTGAAGCAGCGCAAGGGGCGTTTCGAG-
CAGGCCGACGGC GGCACCCT
GTTCCTCGACGAGATCGGCGAGATCTCGCCGATGTTCCAGGCCAAGCTGCTGCGCGTGCTGCAGGAAGGCGA-
GTTCGAGCGGGT CGGCGGCA
ACCAGACGGTGCGGGTCAACGTGCGCATCGTCGCCGCCACCAACCGCGACCTGGAAAGCGAGGTGGAAAAGG-
GCAAGTTCCGCG AGGACCTC
TACTACCGCCTGAACGTCATGGCCATCCGCATTCCGCCGCTGCGCGAGCGTACCGCCGACATTCCCGAACTG-
GCGGAATTCCTG CTCGGCAA
GATCGGCCGCCAGCAGGGCCGCCCGCTGACCGTCACCGACAGCGCCATCCGCCTGCTGATGAGCCACCGCTG-
GCCGGGCAACGT GCGCGAAC
TGGAGAACTGCCTGGAGCGCTCGGCGATCATGAGCGAGGACGGCACCATCACCCGCGACGTGGTCTCGCTGA-
CCGGGGTCGACA ACGAGAGC
CCGCCGCTCGCCGCGCCGCTGCCCGAGGTCAACCTGGCCGACGAGACCCTGGACGACCGCGAACGGGTGATC-
GCCGCCCTCGAA CAGGCCGG
CTGGGTGCAGGCCAAGGCCGCGCGGCTGCTGGGCATGACGCCGCGGCAGATCGCCTACCGCATCCAGACCCT-
CAACATCCACAT GCGCAAGA TCTGA nifA of R. sp. IRBG74 Gene
ATGCTGCACAATGGGCTCAATGAGGGTATGACTGAACGATCCGCTCAAACCATCCACAAACCGGATTTCTGGG-
GCAGCGGTATC TATCGGAT
ATCGAAAGTTTTGATTGGTCCAGACAGTCTCGAGACGAAGCTTGCCAATGTCATTAACGCCCTCTCAGTAAT-
TCTCCCAATGCG GCGCGGCG
CAATCGTCGTTCTAAATGTTAAAGGAGAGCCCGAGATGGTTGCAATGCTGGGCCTAGAGCAAGCATCTCAAG-
GCGCCCGCTCCA TTCCGGCG
GAGGCTGCGATAGATAGAATCGTCGCCAAAGGCGCGCCGCTGGTCGTACCGGACATTTGCAAGTCGGACCTG-
TTCCAGGCGGAG CTCCAAAC
CAACTCGAACGCCACAGGCCCAGCCACGTTCGTTGGCGTCCCGATGAAGGTCGAAAAAGAAACGCTTGGAAC-
ACTATGGATCGA CCGCGCCA
AAGATGGCAGCACTAGGATCCAATTTGAGGAAGAGGTGCGCTTCCTCTCCATGGTCGCCAACCTTTCGGCCC-
GGGCCATTTGGC TGGATCGC
CACCAGAGCCGCGATGGTCAGCCAATCGTGGGCGAGGAAGGAACTCGCAAGACTAGTTCAGGCGACAAGGAA-
CTGCCCGAATCT GCCCGACA
AAGGCCCACAAAAATCGATTGGATTGTCGGGGAAAGCCCTGCCCTCAAGCAGGTGGTTGAAAGCGTCAAAGT-
CGTTGCAACAAC CAATTCTG
CGGTGCTTCTCAGGGGCGAAAGCGGCACGGGCAAGGAGTTCTTTGCAAAGGCCATCCACGAGCTTTCATACC-
GGAAAAAGAAGC CCTTCGTG
AAGTTGAACTGCGCCGCGCTGTCTGCAGGCGTTTTGGAATCGGAATTGTTTGGACATGAAAAGGGCGCCTTC-
ACGGGGGCCATC TCTCAGCG
CGCAGGCCGCTTCGAACTCGCAGACGGCGGAACGCTGCTGCTCGATGAGATCGGCGACATTTCGCCGGGCTT-
CCAAGCGAAACT GTTGCGCG
TCTTGCAGGAAGGTGAGCTTGAGCGAGTCGGCGGCACAAAAACACTCAAAGTGGACGTTCGACTCATATGCG-
CCACGAACAAAG ACCTAGAA
GCGGCAGTCGCGGATGGGGAGTTCAGGGCCGACCTTTATTACCGGATCAATGTGGTGCCCCTATTTCTGCCG-
CCTCTCCGGGAG CGAAATGG
GGATATTCCACGCCTTGCGAGAGTTTTCCTCGGCCGATTCAACAGGGAAAACAATCGCGATCTCGCGTTCGC-
GCCGGCTGCGCT CGAGCTCT
TGTCAAAATGCAACTTTCCCGGCAACGTCCGAGAGCTTGAAAACTGCGTCCGCAGGACCGCCACTCTCGCGC-
GTTCGGAGACGA TCGTTCCA
TCAGATTTCTCCTGCCTGAAGAACCAGTGCTTTTCTTCAATGCTCTGGAAAACCGGTGACCGTCCACTTGGG-
GATACGCTCAAT GGGTTGGC
CATGCGTAAGAGTTTGTCGGTCGAATCGCCGATCAGCCTCGGTTACTCCAATGGACCGGCCGGCTTAACGGT-
GGCACCACATCT AACGGACC
GCGAGCTGCTAATCAGTGCGATGGAGAAGGCCGGTTGGGTTCAGGCAAAGGCAGCTCGGATCCTCGGCCTCA-
CACCGCGACAGG TCGGCTAT GCTTTACGTAGGCATCGTATACAGGTGAAGAAAATCTAA nifA
of A. caulinodans Gene
ATGCCAATGACCGACGCCTTCCAGGTCCGCGTACCTCGGGTTTCGTCGAGCACCGCCGGAGACATCGCCGCGT-
CATCCATCACC ACGCGGGG
CGCGCTGCCGCGCCCGGGAGGGATGCCTGTGTCCATGTCGCGGGGGACCTCGCCCGAGGTGGCACTCATCGG-
GGTCTATGAGAT ATCGAAGA
TCCTGACGGCGCCCCGGCGCCTCGAAGTCACGCTCGCCAATGTGGTGAACGTGCTCTCCTCCATGCTGCAGA-
TGCGGCATGGCA TGATCTGC
ATCCTCGACAGCGAGGGCGATCCCGACATGGTGGCCACCACCGGCTGGACGCCTGAGATGGCGGGCCAGATC-
CGCGCGCATGTG CCCCAGAA
GGCCATCGACCAGATCGTCGCCACGCAGATGCCGCTGGTGGTGCAGGACGTGACGGCCGATCCGCTCTTCGC-
CGGTCACGAGGA TCTGTTCG
GCCCGCCTGAGGAGGCCACCGTCTCCTTCATCGGCGTGCCGATCAAGGCCGACCACCATGTGATGGGCACCC-
TCTCCATCGACC GCATCTGG
GACGGCACCGCCCGTTTCCGCTTCGACGAGGACGTGCGCTTCCTCACCATGGTGGCCAATCTCGTCGGCCAG-
ACCGTGCGCCTG CACAAGCT
GGTGGCGAGCGACCGCGACCGGCTGATCGCCCAGACGCACCGCCTCGAAAAGGCGCTGCGGGAAGAAAAATC-
CGGGGCCGAGCC GGAGGTGG
CCGAGGCCGCCAACGGATCCGCCATGGGCATCGTGGGCGATAGCCCGCTGGTGAAACGCCTGATCGCGACCG-
CGCAAGTGGTCG CCCGCTCA
AACTCCACCGTGCTGCTGCGCGGGGAGAGCGGCACCGGCAAGGAGTTGTTCGCCCGTGCCATCCACGAACTG-
TCGCCCCGCAAG GGCAAGCC
CTTCGTGAAGGTGAACTGCGCCGCCCTCCCGGAATCGGTGCTGGAATCGGAACTGTTCGGCCATGAGAAGGG-
CGCCTTCACCGG TGCGCTGA
ACATGCGCCAGGGCCGCTTCGAGCTGGCGCACGGCGGCACGCTCTTCCTTGACGAGATCGGCGAGATCACCC-
CCGCTTTCCAGG CCAAGCTG
CTGCGCGTGCTGCAGGAAGGCGAGTTCGAGCGGGTCGGCGGCAATCGCACGCTGAAGGTGGATGTGCGGCTC-
GTGTGCGCCACC AACAAGAA
TCTGGAAGAGGCGGTCTCCAAGGGCGAGTTCCGGGCCGATCTCTACTACCGCATCCATGTGGTGCCGCTGAT-
CCTGCCGCCGCT GCGCGAAC
GGCCGGGCGACATTCCCAAGCTCGCGAAGAACTTCCTCGACCGCTTCAACAAGGAAAACAAGCTCCACATGA-
TGCTCTCGGCGC CGGCCATC
GACGTGCTGCGGCGCTGCTATTTCCCGGGCAACGTGCGCGAGCTGGAGAACTGTATCCGGCGGACGGCAACG-
CTCGCCCACGAT GCCGTCAT
CACCCCCCATGACTTCGCCTGCGACAGCGGCCAGTGCCTCTCGGCCATGCTCTGGAAGGGCTCGGCCCCGAA-
GCCTGTGATGCC GCACGTGC
CGCCGGCGCCCACGCCGCTGACTCCGCTCTCCCCTGCTCCGCTCGCGACCGCAGCGCCCGCTGCGGCGAGCC-
CGGCGCCGGCGG CCGACAGC
CTGCCGGTCACTTGCCCCGGCACCGAGGCCTGTCCCGCGGTGCCCCCCCGCCAGAGCGAAAAGGAGCAGTTG-
CTCCAGGCCATG GAGCGCTC
CGGCTGGGTGCAGGCGAAGGCCGCGCGCCTCCTCAACCTCACGCCGCGCCAGGTGGGTTATGCGCTGCGCAA-
ATATGACATCGA CATCAAGC GCTTCTGA P.sub.J23100 Promoter.sup.9
TAGGTGTTGACGGCTAGCTCAGTCCTAGGTACAGTGCTAGCTCTAGA P.sub.J23101
Promoter.sup.9 TAGGTGTTTACAGCTAGCTCAGTCCTAGGTATTATGCTAGCTCTAGA
P.sub.J23102 Promoter.sup.9
TAGGTGTTGACAGCTAGCTCAGTCCTAGGTACTGTGCTAGCTCTAGA P.sub.J23103
Promoter.sup.9 TAGGTGCTGATAGCTAGCTCAGTCCTAGGGATTATGCTAGCTCTAGA
P.sub.J23104 Promoter.sup.9
TAGGTGTTGACAGCTAGCTCAGTCCTAGGTATTGTGCTAGCTCTAGA P.sub.J23105
Promoter.sup.9 TAGGTGTTTACGGCTAGCTCAGTCCTAGGTACTATGCTAGCTCTAGA
P.sub.J23106 Promoter.sup.9
TAGGTGTTTACGGCTAGCTCAGTCCTAGGTATAGTGCTAGCTCTAGA P.sub.J23107
Promoter.sup.9 TAGGTGTTTACGGCTAGCTCAGCCCTAGGTATTATGCTAGCTCTAGA
P.sub.J23108 Promoter.sup.9
TAGGTGCTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCTCTAGA P.sub.J23109
Promoter.sup.9 TAGGTGTTTACAGCTAGCTCAGTCCTAGGGACTGTGCTAGCTCTAGA
P.sub.J23110 Promoter.sup.9
TAGGTGTTTACGGCTAGCTCAGTCCTAGGTACAATGCTAGCTCTAGA P.sub.J23111
Promoter.sup.9 TAGGTGTTGACGGCTAGCTCAGTCCTAGGTATAGTGCTAGCTCTAGA
P.sub.J23112 Promoter.sup.9
TAGGTGCTGATAGCTAGCTCAGTCCTAGGGATTATGCTAGCTCTAGA P.sub.J23113
Promoter.sup.9 TAGGTGCTGATGGCTAGCTCAGTCCTAGGGATTATGCTAGCTCTAGA
P.sub.J23114 Promoter.sup.9
TAGGTGTTTATGGCTAGCTCAGTCCTAGGTACAATGCTAGCTCTAGA P.sub.J23115
Promoter.sup.9 TAGGTGTTTATAGCTAGCTCAGCCCTTGGTACAATGCTAGCTCTAGA
P.sub.J23116 Promoter.sup.9
TAGGTGTTGACAGCTAGCTCAGTCCTAGGGACTATGCTAGCTCTAGA P.sub.J23117
Promoter.sup.9 TAGGTGTTGACAGCTAGCTCAGTCCTAGGGATTGTGCTAGCTCTAGA
P.sub.J23118 Promoter.sup.9
TAGGTGTTGACGGCTAGCTCAGTCCTAGGTATTGTGCTAGCTCTAGA P.sub.J23119
Promoter.sup.9 TAGGTGTTGACAGCTAGCTCAGTCCTAGGTATAATGCTAGCTCTAGA
P.sub.trp Promoter.sup.10
TAGGTGTTGACATTATTCCATCGAACTAGTTAACTAGTACGAAAGTT T.sub.T7
Terminator.sup.11 TAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTGT
T.sub.T2.2 Terminator.sup.11
TACTCGAACCCCTAGCCCGCTCTTATCGGGCGGCTAGGGGTTTTTTGT T.sub.T7.3
Terminator.sup.11 TACATATCGGGGGGGTAGGGGTTTTTTGT T.sub.rrnBT1
Terminator.sup.12
CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAA-
CGCTCTC T.sub.L3S2P21 Terminator.sup.13
CTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCC
T.sub.1 Terminator.sup.13
CTCGGTACCAAATTCCAGAAAAGACACCCGAAAGGGTGTTTTTTCGTTTTGGTCCTCCTTGGCCCTCCATCCT-
TAGATAGCAGA TAAAAAAA
ATCCTTAGCTTTCGCTAAGGATGATTTCTTCATAGGCAATACGATCGCATGTCC T.sub.2
Terminator.sup.14
CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAA-
CGCTCTCTACT AGAGTCAC ACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA T.sub.3
Terminator.sup.13
CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAA-
CGCTCTCTACT AGAGTCAC ACTGGCTCACCTTCGGGTGGGCCTTTCTGCGTTTATA T.sub.4
Terminator.sup.13
GGTCTTGTCCACTACCTTGCAGTAATGCGGTGGACAGGATCGGCGGTTTTCTTTTCTCTTCTCAATGACTGAA-
TAGAAAAGACG AACATTAA
CGCATGAGAAAGCCCCCGGAAGATCACCTTCCGGGGGCTTTTTTATTGCGCTACAAATGAAAGTACATAGAA-
ATTA T.sub.5 Terminator.sup.13
CAGATAAAAAAAATCCTTAGCTTTCGCTAAGGATGATTTCTTCCTTGGCCCTCCATCCTTAGATAGCTCGGTA-
CCAAATTCCAG AAAAGACA
CCCGAAAGGGTGTTTTTTCGTTTTGGTCCTCATAGGCAATACGATCGCATGTCC T.sub.6
Terminator.sup.13
CCAGGCATCAAATAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCTGTTGTTTGTCGGTGAA-
CGCTCTCCTAG CATAACCC CTTGGGGCCTCTAAACGGGTCTTGAGGGGTTTTTTG T.sub.7
Terminator.sup.13
CTCGGTACCAAATTCCAGAAAAGAGACGCTTTCGAGCGTCTTTTTTCGTTTTGGTCCTCCTTGGCCCTCCATC-
CTTAGATAGAG TTAACCAA
AAAGGGGGGATTTTATCTCCCCTTTAATTTTTCCTTCATAGGCAATACGATCGCATGTCC
T.sub.8 Terminator.sup.13
CGCAGATAGCAAAAAAGCGCCTTTAGGGCGCTTTTTTACATTGGTGGTCCTTGGCCCTCCATCCTTAGATAGA-
GGCGACTGACG AAACCTCG
CTCCGGCGGGGTTTTTTGTTATCTGCATCATAGGCAATACGATCGCATGTCC T.sub.9
Terminator.sup.14
TCGGTCAGTTTCACCTGATTTACGTAAAAACCCGCTTCGGCGGGTTTTTGCTTTTGGAGGGGCAGAAAGATGA-
ATGACTG TC T.sub.10 Terminator.sup.14
GCCCCCGGAAGATCACCTTCCGGGGGCTTTTTTATTGGCGGCCGGCTGATTGATCAGGCGGCCGGCTGATTGG-
CGCGTTACCTG GTAGCGCG CCATTTTGTTT T.sub.11 Terminator.sup.14
GTAATCGTTAATCCGCAAATAACGTAAAAACCCGCTTCGGCGGGTTTTTTTATGGGGGGAGTTTAGGGAAAGA-
GCATTTG TCA T.sub.12 Terminator.sup.14
AAAAAAAAACCCCGCCCCTGACAGGGCGGGGTTTTTTTTTT T.sub.13
Terminator.sup.13
TCCGGCAATTAAAAAAGCGGCTAACCACGCCGCTTTTTTTACGTCTGCATGACTGAATAGAAAAGACGAACAT-
TAACGCATGAG AAAGCCCC
CGGAAGATCACCTTCCGGGGGCTTTTTTATTGCGCTCCTTGGCCCTCCATCCTTAGATAG
T.sub.14 Terminator.sup.13
GGAAGACCATACTGGAAACACAGAAAAAAGCCCGCACCTGACAGTGCGGGCTTTTTTTTTCGACCAAAGGTGA-
CTGAATAGAAA AGACGAAC
ATTCGCAGATAGCAAAAAAGCGCCTTTAGGGCGCTTTTTTACATTGGTGGTCATAGGCAATACGATCGCATG-
TCC T.sub.15 Terminator.sup.13
TCCGGCAATTAAAAAAGCGGCTAACCACGCCGCTTTTTTTACGTCTGCATCCTTGGCCCTCCATCCTTAGATA-
GCTCGGTACCA AATTCCAG
AAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCTCATAGGCAATACGATCGCA-
TGTCC T.sub.16 Terminator.sup.13
TTCAGCCAAAAAACTTAAGACCGCCGGTCTTGTCCACTACCTTGCAGTAATGCGGTGGACAGGATCGGCGGTT-
TTCTTTTCTCT TCTCAATA
CATGAAAGTACATAGAAATTACTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTC-
GTTTTGGTCCTC ATAGGCAA TACGATCGCATGTCC T.sub.17 Terminator.sup.13
TTCAGCCAAAAAACTTAAGACCGCCGGTCTTGTCCACTACCTTGCAGTAATGCGGTGGACAGGATCGGCGGTT-
TTCTTTTCTCT TCTCAATC
CTTGGCCCTCCATCCTTAGATAGTCCGGCAATTAAAAAAGCGGCTAACCACGCCGCTTTTTTTACGTCTGCA-
TCATAGGCAATA CGATCGCA TGTCC T.sub.18 Terminator.sup.13
CTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCTGACTGAATAGA-
AAAGACGAACA TTAACGCA
TGAGAAAGCCCCCGGAAGATCACCTTCCGGGGGCTTTTTTATTGCGCTCCTTGGCCCTCCATCCTTAGATAG
T.sub.19 Terminator.sup.13
CTCGGTACCAAATTCCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCTCCTTGGCCCTC-
CATCCTTAGAT GTCCGGCA
ATTAAAAAAGCGGCTAACCACGCCGCTTTTTTTACGTCTGCATCATAGGCAATACGATCGCAT-
GTCC T.sub.20 Terminator.sup.13
CTCGGTACCAAAGACGAACAATAAGACGCTGAAAAGCGTCTTTTTTCGTTTTGGTCCTACAAATGAAAGTACA-
TAGAAATTATT CAGCCAAA
AAACTTAAGACCGCCGGTCTTGTCCACTACCTTGCAGTAATGCGGTGGACAGGATCGGCGGTTTTCTTTTCT-
CTTCTCAATCCT TGGCCCTC CATCCTTAGATAG T.sub.21 Terminator.sup.14
GGGAACTGCCAGACATCAAATAAAACAAAAGGCTCAGTCGGAAGACTGGGCCTTTTGTTTTATCTGTTGTTTG-
TCGGTGAACAC TCTCCCGA CTAGTAGCGGCCGCTGCAGAAAGAGGAGA T.sub.22
Terminator.sup.13
AACGCATGAGAAAGCCCCCGGAAGATCACCTTCCGGGGGCTTTTTTATTGCGCTCATAGGCAATACGATCGCA-
TGTCCTCCGGC AATTAAAA
AAGCGGCTAACCACGCCGCTTTTTTTACGTCTGCATCCTTGGCCCTCCATCCTTAGATAG
T.sub.23 Terminator.sup.14
GGGAACTGCCAGACATCAAATAAAACAAAAGGCTCAGTCGGAAGACTGGGCCTTTTGTTTTATCTGTTGTTTG-
TCGGTGA ACACTCTCCCG T.sub.24 Terminator.sup.14
AAAGCAAGCTGATAAACCGATACAATTAAAGGCTCCTTTTGGAGCCTTTTTTTTTGGAGATTTTCAACATGAA-
AAAATTATTAT TTGATGAT
CAGATAGCGGCGGGGAACTGCCAGACATCAAATAAAACAAAAGGCTCAGTCGGAAGACTGGGCCTTTTGTTT-
TATCTGTTGTTT GTCGGTGA ACACTCTCCCG T.sub.25 Terminator.sup.13
AACGCATGAGAAAGCCCCCGGAAGATCACCTTCCGGGGGCTTTTTTATTGCGCTCCTTGGCCCTCCATCCTTA-
GATAGCTCGGT ACCAAATT
CCAGAAAAGAGGCCTCCCGAAAGGGGGGCCTTTTTTCGTTTTGGTCCTCATAGGCAATACGATCGCATGTCC
T.sub.26 Terminator.sup.14
AACGCATGAGAAAGCCCCCGGAAGATCACCTTCCGGGGGCTTTTTTATTGCGCTCCTTGGCCCTCCATCCTTA-
GATAGTTCAGC CAAAAAAC
TTAAGACCGCCGGTCTTGTCCACTACCTTGCAGTAATGCGGTGGACAGGATCGGCGGTTTTCTTTTCTCTTC-
TCAATCATAGGC AATACGAT CGCATGTCC P.sub.Lux Promoter.sup.15
CCTAGGACCTGTAGGATCGTACAGGTTTACGCAAGAAAATGGTTTGTTACTTTCGAATAAATCTAGA
P.sub.Tet Promoter.sup.16
CGGTGGAATCCCTATCAGTGATAGAGATTGACATCCCTATCAGTGATAGATATAATGAGCACTCTAGA
P.sub.Cym Promoter.sup.5
AACAAACAGACAATCTGGTCTGTTTGTATTATGGAAAATTTTTCTGTATAATAGATTCAACAAACAGACAATC-
TGGTCTG TTTGTATTAT P.sub.Phl Promoter.sup.17
AAAAAGAGTTTGACATGATACGAAACGTACCGTATCGTTAAGGTTACTAGAGTCTAGA
P.sub.Sal Promoter.sup.5
GGGGCCTCGCTTGGGTTATTGCTGGTGCCCGGCCGGGCGCAATATTCATGTTGATGATTTATTATATATCGAG-
TGGTGTATTTA TTTATATT GTTTGCTCCGTTACCGTTATTAAC luxR Gene
ATGAAAAACATAAATGCCGACGACACATACAGAATAATTAATAAAATTAAAGCTTGTAGAAGCA-
ATAATGATATTAATCAATGC TTATCTGA
TATGACTAAAATGGTACATTGTGAATATTATTTACTCGCGATCATTTATCCTCATTCTATGGTTAAATCTGA-
TATTTCAATCCT AGATAATT
ACCCTAAAAAATGGAGGCAATATTATGATGACGCTAATTTAATAAAATATGATCCTATAGTAGATTATTCTA-
ACTCCAATCATT CACCAATT
AATTGGAATATATTTGAAAACAATGCTGTAAATAAAAAATCTCCAAATGTAATTAAAGAAGCGAAAACATCA-
GGTCTTATCACT GGGTTTAG
TTTCCCTATTCATACGGCTAACAATGGCTTCGGAATGCTTAGTTTTGCACATTCAGAAAAAGACAACTATAT-
AGATAGTTTATT TTTACATG
CGTGTATGAACATACCATTAATTGTTCCTTCTCTAGTTGATAATTATCGAAAAATAAATATAGCAAATAATA-
AATCAAACAACG ATTTAACC
AAAAGAGAAAAAGAATGTTTAGCGTGGGCATGCGAAGGAAAAAGCTCTTGGGATATTTCAAAAATATTAGGT-
TGCAGTGAGCGT ACTGTCAC
TTTCCATTTAACCAATGCGCAAATGAAACTCAATACAACAAACCGCTGCCAAAGTATTTCTAAAGCAATTTT-
AACAGGAGCAAT TGATTGCC CATACTTTAAAAATTAA tetR Gene
ATGTCCAGATTAGATAAAAGTAAAGTGATTAACAGCGCATTAGAGCTGCTTAATGAGGTCGGAA-
TCGAAGGTTTAACAACCCGT AAACTCGC
CCAGAAGCTAGGTGTAGAGCAGCCTACATTGTATTGGCATGTAAAAAATAAGCGGGCTTTGCTCGACGCCTT-
AGCCATTGAGAT GTTAGATA
GGCACCATACTCACTTTTGCCCTTTAGAAGGGGAAAGCTGGCAAGATTTTTTACGTAATAACGCTAAAAGTT-
TTAGATGTGCTT TACTAAGT
CATCGCGATGGAGCAAAAGTACATTTAGGTACACGGCCTACAGAAAAACAGTATGAAACTCTCGAAAATCAA-
TTAGCCTTTTTA TGCCAACA
AGGTTTTTCACTAGAGAATGCATTATATGCACTCAGCGCTGTGGGGCATTTTACTTTAGGTTGCGTATTGGA-
AGATCAAGAGCA TCAAGTCG
CTAAAGAAGAAAGGGAAACACCTACTACTGATAGTATGCCGCCATTATTACGACAAGCTATCGAATTATTTG-
ATCACCAAGGTG CAGAGCCA
GCCTTCTTATTCGGCCTTGAATTGATCATATGCGGATTAGAAAAACAACTTAAATGTGAAAGTGGGTCCTAA
cymR Gene
ATGAGCCCGAAACGTCGTACCCAGGCAGAACGTGCAATGGAAACCCAGGGTAAACTGATTGCAG-
CAGCACTGGGTGTTCTGCGT GAAAAAGG
TTATGCAGGTTTTCGTATTGCAGATGTTCCGGGTGCAGCCGGTGTTAGCCGTGGTGCACAGAGCCATCATTT-
TCCGACCAAACT GGAACTGC
TGCTGGCAACCTTTGAATGGCTGTATGAGCAGATTACCGAACGTAGCCGTGCACGTCTGGCAAAACTGAAAC-
CGGAAGATGATG TTATTCAG
CAGATGCTGGATGATGCAGCAGAATTTTTTCTGGATGATGATTTTAGCATCAGCCTGGATCTGATTGTTGCA-
GCAGATCGTGAT CCGGCACT
GCGTGAAGGTATTCAGCGTACCGTTGAACGTAATCGTTTTGTTGTTGAAGATATGTGGCTGGGTGTGCTGGT-
GAGCCGTGGTCT GAGCCGTG
ATGATGCCGAAGATATTCTGTGGCTGATTTTTAACAGCGTTCGTGGTCTGGCAGTTCGTAGCCTGTGGCAGA-
AAGATAAAGAAC GTTTTGAA
CGTGTGCGTAATAGCACCCTGGAAATTGCACGTGAACGTTATGCAAAATTCAAACGTTGA phlF
Gene
ATGGCACGTACCCCGAGCCGTAGCAGCATTGGTAGCCTGCGTAGTCCGCATACCCATAAAGCAA-
TTCTGACCAGCACCATTGAA ATCCTGAA
AGAATGTGGTTATAGCGGTCTGAGCATTGAAAGCGTTGCACGTCGTGCCGGTGCAAGCAAACCGACCATTTA-
TCGTTGGTGGAC CAATAAAG
CAGCACTGATTGCCGAAGTGTATGAAAATGAAAGCGAACAGGTGCGTAAATTTCCGGATCTGGGTAGCTTTA-
AAGCCGATCTGG ATTTTCTG
CTGCGTAATCTGTGGAAAGTTTGGCGTGAAACCATTTGTGGTGAAGCATTTCGTTGTGTTATTGCAGAAGCA-
CAGCTGGACCCT GCAACCCT
GACCCAGCTGAAAGATCAGTTTATGGAACGTCGTCGTGAGATGCCGAAAAAACTGGTTGAAAATGCCATTAG-
CAATGGTGAACT GCCGAAAG
ATACCAATCGTGAACTGCTGCTGGATATGATTTTTGGTTTTTGTTGGTATCGCCTGCTGACCGAACAGCTGA-
CCGTTGAACAGG ATATTGAA
GAATTTACCTTCCTGCTGATTAATGGTGTTTGTCCGGGTACACAGCGTTAA nahR Gene
ATGGAACTGCGTGACCTGGATTTAAACCTGCTGGTGGTGTTCAACCAGTTGCTGGTCGACAGAC-
GCGTCTCTGTCACTGCGGAG AACCTGGG
CCTGACCCAGCCTGCCGTGAGCAATGCGCTGAAACGCCTGCGCACCTCGCTACAGGACCCACTCTTCGTGCG-
CACACATCAGGG AATGGAAC
CCACACCCTATGCCGCGCATCTGGCCGAGCACGTCACTTCGGCCATGCACGCACTGCGCAACGCCCTACAGC-
ACCATGAAAGCT TCGATCCG
CTGACCAGCGAGCGTACCTTCACCCTGGCCATGACCGACATTGGCGAGATCTACTTCATGCCGCGGCTGATG-
GATGCGCTGGCT CACCAGGC
CCCCAATTGCGTGATCAGTACGGTGCGCGACAGTTCGATGAGCCTGATGCAGGCCTTGCAGAACGGAACCGT-
GGACTTGGCCGT GGGCCTGC
TTCCCAATCTGCAAACTGGCTTCTTTCAGCGCCGGCTGCTCCAGAATCACTACGTGTGCCTATGTCGCAAGG-
ACCATCCAGTCA CCCGCGAA
CCCCTGACTCTGGAGCGCTTCTGTTCCTACGGCCACGTGCGTGTCATCGCCGCTGGCACCGGCCACGGCGAG-
GTGGACACGTAC ATGACACG
GGTCGGCATCCGGCGCGACATCCGTCTGGAAGTGCCGCACTTCGCCGCCGTTGGCCACATCCTCCAGCGCAC-
CGATCTGCTCGC CACTGTGC
CGATATGTTTAGCCGACTGCTGCGTAGAGCCCTTCGGCCTAAGCGCCTTGCCGCACCCAGTCGTCTTGCCTG-
AAATAGCCATCA ACATGTTC
TGGCATGCGAAGTACCACAAGGACCTAGCCAATATTTGGTTGCGGCAACTGATGTTTGACCTGTTTACGGAT-
TGATAA P.sub.Fde Promoter.sup.18
TCAATGTATTGATGCCGTCCATATCATGAATCAAAACAATCCATTTGATCAATATCAAGCTCACTCTTAAGCT-
TCACTCA TCCGCTGCAT fdeR Gene
ATGCGTTTCAACAAGCTCGACCTCAATCTTCTGGTCGCCCTGGATGCACTGCTCACGGAGATGA-
GCATCAGCCGCGCCGCCGAA AAGATCCA
TCTGAGCCAGTCGGCCATGAGCAATGCCCTGGCGCGGCTGCGCGAGTATTTCGATGATGAATTGCTGATCCA-
GGTGGGCCGGCG CATGGAGC
CCACGCCGCGCGCCGAGGTGCTCAAGGATGCGGTGCATGATGTGCTGCGGCGTATCGATGGCTCCATCGCGG-
CGCTGCCGGCCT TCGTGCCG
GCCGAGTCCACGCGCGAGTTTCGCATCTCGGTTTCGGACTTTACGCTCTCCGTCCTCATCCCCCGGGTGCTG-
GCGCGCGCGCAC GCCGAGGG
CAAGCACATCCGCTTTGCCCTGATGCCGCAGGTGCAAGACCCGACCCGCTCGCTGGATCGGGCCGAGGTGGA-
CCTGCTGGTCTT GCCGCAGG
AATTCTGCACGCCCGATCATCCTGCCGAAGAGGTCTTCCGCGAACGGCATGTCTGCGTGGTCTGGCGCGACA-
GTGCGCTGGCGC AAGGCGAG
CTGACGCTGGAACGCTACATGGCCTCAGGCCATGTGGTGATGGTGCCGCCTGGGGCCAATGCGTCGTCGGTG-
GAGGCGTGGATG GCCAGGAA
GCTGGGCTTTGCGCGCCGGGTGGAAGTGACCAGCTTCAGCTTCGCTTCTGCGCTGGCGCTGGTACAGGGGAC-
GGACCGCATCGC CACGGTGC
ATGCCCGGCTGGCGCAGCTGCTGGCTCCGCAATGGCCGGTGGTGATCAAGGAGAGTCCGCTGTCGCTGGGCG-
AGATGCGGCAGA TGATGCAG
TGGCATCGCTACCGCAGCAATGATCCTGGCATCCAGTGGCTGCGTCGGGTGTTTCTGGAGAGTGCGCAGGAG-
ATGGATGCGGCG CTGCCAGG CATCTGCTGA P.sub.BAD.10 Promoter
CAGACATTGCCGTCACTGCGTCTTTTACTGGCTCTTCTCGCTAACCAAACCGGTAACCCCGCTTATTAAAAGC-
ATTCTGTAACA
AAGCGGGA
CCAAAGCCATGACAAAAACGCGTAACAAAAGTGTCTATAATCACGGCAGAAAAGTCCACATTGATTATTTGC-
ACGGCGTCACAC TTTGCTAT
GCCATAGCATTTTTATCCATAAGATTAGCGGATCCTACCTGACGCTTTTTATCGCAACTCTCTATATTTTCT-
CCATACCCGTTT TTTTGGGCTAGCGAATTC araC Gene
ATGCAATATGGACAATTGGTTTCTTCTCTGAATGGCGGGAGTATGAAAAGTATGGCTGAAGCGC-
AAAATGATCCCCTGCTGCCG GGATACTC
GTTTAATGCCCATCTGGTGGCGGGTTTAACGCCGATTGAGGCCAACGGTTATCTCGATTTTTTTATCGACCG-
ACCGCTGGGAAT GAAAGGTT
ATATTCTCAATCTCACCATTCGCGGTCAGGGGGTGGTGAAAAATCAGGGACGAGAATTTGTTTGCCGACCGG-
GTGATATTTTGC TGTTCCCG
CCAGGAGAGATTCATCACTACGGTCGTCATCCGGAGGCTCGCGAATGGTATCACCAGTGGGTTTACTTTCGT-
CCGCGCGCCTAC TGGCATGA
ATGGCTTAACTGGCCGTCAATATTTGCCAATACGGGGTTCTTTCGCCCGGATGAAGCGCACCAGCCGCATTT-
CAGCGACCTGTT TGGGCAAA
TCATTAACGCCGGGCAAGGGGAAGGGCGCTATTCGGAGCTGCTGGCGATAAATCTGCTTGAGCAATTGTTAC-
TGCGGCGCATGG AAGCGATT
AACGAGTCGCTCCATCCACCGATGGATAATCGGGTACGCGAGGCTTGTCAGTACATCAGCGATCACCTGGCA-
GACAGCAATTTT GATATCGC
CAGCGTCGCACAGCATGTTTGCTTGTCGCCGTCGCGTCTGTCACATCTTTTCCGCCAGCAGTTAGGGATTAG-
CGTCTTAAGCTG GCGCGAGG
ACCAACGTATCAGCCAGGCGAAGCTGCTTTTGAGCACCACCCGGATGCCTATCGCCACCGTCGGTCGCAATG-
TTGGTTTTGACG ATCAACTC
TATTTCTCGCGGGTATTTAAAAAATGCACCGGGGCCAGCCCGAGCGAGTTCCGTGCCGGTTGTGAAGAAAAA-
GTGAATGATGTA GCCGTCAA GTTGTCATAA araE Gene
ATGGTTACTATCAATACGGAATCTGCTTTAACGCCACGTTCTTTGCGGGATACGCGGCGTATGA-
ATATGTTTGTTTCGGTAGCT GCTGCGGT
CGCAGGATTGTTATTTGGTCTTGATATCGGCGTAATCGCCGGAGCGTTGCCGTTCATTACCGATCACTTTGT-
GCTGACCAGTCG TTTGCAGG
AATGGGTGGTTAGTAGCATGATGCTCGGTGCAGCAATTGGTGCGCTGTTTAATGGTTGGCTGTCGTTCCGCC-
TGGGGCGTAAAT ACAGCCTG
ATGGCGGGGGCCATCCTGTTTGTACTCGGTTCTATAGGGTCCGCTTTTGCGACCAGCGTAGAGATGTTAATC-
GCCGCTCGTGTG GTGCTGGG
CATTGCTGTCGGGATCGCGTCTTACACCGCTCCTCTGTATCTTTCTGAAATGGCAAGTGAAAACGTTCGCGG-
TAAGATGATCAG TATGTACC
AGTTGATGGTCACACTCGGCATCGTGCTGGCGTTTTTATCCGATACAGCGTTCAGTTATAGCGGTAACTGGC-
GCGCAATGTTGG GGGTTCTT
GCTTTACCAGCAGTTCTGCTGATTATTCTGGTAGTCTTCCTGCCAAATAGCCCGCGCTGGCTGGCGGAAAAG-
GGGCGTCATATT GAGGCGGA
AGAAGTATTGCGTATGCTGCGCGATACGTCGGAAAAAGCGCGAGAAGAACTCAACGAAATTCGTGAAAGCCT-
GAAGTTAAAACA GGGCGGTT
GGGCACTGTTTAAGATCAACCGTAACGTCCGTCGTGCTGTGTTTCTCGGTATGTTGTTGCAGGCGATGCAGC-
AGTTTACCGGTA TGAACATC
ATCATGTACTACGCGCCGCGTATCTTCAAAATGGCGGGCTTTACGACCACAGAACAACAGATGATTGCGACT-
CTGGTCGTAGGG CTGACCTT
TATGTTCGCCACCTTTATTGCGGTGTTTACGGTAGATAAAGCAGGGCGTAAACCGGCTCTGAAAATTGGTTT-
CAGCGTGATGGC GTTAGGCA
CTCTGGTGCTGGGCTATTGCCTGATGCAGTTTGATAACGGTACGGCTTCCAGTGGCTTGTCCTGGCTCTCTG-
TTGGCATGACGA TGATGTGT
ATTGCCGGTTATGCGATGAGCGCCGCGCCAGTGGTGTGGATCCTGTGCTCTGAAATTCAGCCGCTGAAATGC-
CGCGATTTCGGT ATTACCTG
TTCGACCACCACGAACTGGGTGTCGAATATGATTATCGGCGCGACCTTCCTGACACTGCTTGATAGCATTGG-
CGCTGCCGGTAC GTTCTGGC
TCTACACTGCGCTGAACATTGCGTTTGTGGGCATTACTTTCTGGCTCATTCCGGAAACCAAAAATGTCACGC-
TGGAACATATCG AACGCAAA CTGATGGCAGGCGAGAAGTTGAGAAATATCGGCGTCTGA
P.sub.tac Promoter.sup.10
CTCGAGTGTTGACAATTAATCATCGGCTCGTATAATGTGTGGAATTGTGAGCGCTCACAATTTCACACATCTA-
GA P.sub.noc Promoter
AACAAATACACATGGGCGCATGCCTATTACTGCCCTTGCGATATGGAAGGCAAGCTTTTAGTAACAATAGAAA-
ACTGGGTCCTA CTCTCGAA
GAATGCACTGCGGCGGTCACGTCAACACGTGCTGCACCGTTGAGAATGAATGCTGGGCAGATTGCCAGCGGC-
GTCATTTTCGGC TGTCCCGT CCTCACGGTTTTGCGCTGCATCGCAAGAGATTGGGAA nocR
Gene
ATGACGTCAGCAGCGAATCTGGTGAGGATCACGCAGCCCGCGATCAGCCGGCTGATCAGGGATC-
TCGAAGAGGAAATTGGGATC AGCCTCTT
CGAAAGAACGGGCAACCGGTTACGTCCTACGCGGGAGGCCGGTATTCTGTTCAAGGAAGTGTCGCGACATTT-
CAACGGGATTCA GCACATCG
ACAAAGTCGCGGCTGAACTGAAGAAGTCTCATATGGGGTCCCTAAGGGTCGCCTGTTATACAGCGCCAGCTC-
TGAGTTTTATGT CCGGCGTC
ATTCAGACGTTCATCGCCGATCGGCCCGACGTGTCGGTCTACCTCGACACAGTTCCTTCCCAGACGGTCCTC-
GAATTGGTCTCG CTCCAGCA
CTACGATCTCGGAATATCGATATTGGCTGGCGACTATCCTGGTCTCACCACCGAACCTGTCCCTTCCTTTCG-
TGCGGTCTGCCT GCTGCCGC
CGGGGCATCGTCTCGAAGACAAGGAAACTGTTCATGCGACGGACCTTGAAGGAGAGTCATTGATTTGCCTCT-
CTCCAGTGAGCC TTCTACGG
ATGCAAACGGACGCCGCACTGGACAGCTGCGGCGTCCACTGTAATCGCAGGATAGAAAGTAGTCTGGCGCTG-
AATCTCTGCGAT CTGGTAAG
CAGGGGAATGGGGGTTGGTATCGTCGACCCCTTCACTGCCGACTACTACAGTGCAAATCCGGTTATTCAGCG-
CTCCTTTGATCC GGTTGTCC
CCTACCATTTTGCTATAGTTCTTCCGACCGACAGCCCACCGCCGCGCTTGGTTAGCGAGTTCCGGGCAGCGT-
TGCTTGATGCTT TGAAAGCC TTGCCCTATGAAACCATTTGA P.sub.occ Promoter
AAACGCACCATAACATCTGCTTATTCTTGCCCGGTCATTATGAATTTGACCGAATGCATATCGAATGTAAAGC-
TCACCCTATAA ATCACAAC TCTTCCGGGCCAACCGGGATCAGACGT occR Gene
ATGAATCTCAGGCAGGTCGAGGCGTTCCGGGCAGTCATGCTGACGGGGCAAATGACGGCGGCGG-
CTGAACTAATGCTGGTGACT CAGCCGGC
CATCAGTCGCCTAATCAAGGACTTTGAACAGGCGACAAAACTGCAGCTCTTCGAGAGGCGTGGGAACCATAT-
TATCCCGACACA GGAGGCAA
AGACGCTGTGGAAAGAGGTCGATCGGGCGTTCGTCGGGCTTAATCATATAGGCAACCTGGCTGCCGACATCG-
GCAGGCAGGCAG CGGGGACG
CTCCGCATTGCTGCAATGCCTGCTCTGGCAAACGGCCTCTTGCCGCGGTTTCTTGCTCAGTTCATCCGTGAC-
AGACCAAATCTC CAGGTCTC
CCTAATGGGACTGCCCTCAAGCATGGTCATGGAAGCCGTTGCGTCCGGCAGGGCCGACATCGGTTATGCCGA-
TGGCCCACAGGA GCGCCAAG
GTTTTCTAATCGAAACCCGGTCGCTTCCCGCTGTTGTCGCTGTCCCGATGGGACATCGACTTGCTGGCCTTG-
ACCGTGTCACGC CACAGGAC
CTTGCCGGTGAGCGTATTATAAAACAGGAGACTGGCACTCTCTTCGCCATGCGGGTAGAGGTGGCGATTGGT-
GGTATTCAACGC CGGCCGTC
AATTGAAGTGAGCCTGTCGCATACTGCGCTAAGTCTCGTCCGCGAAGGCGCCGGGATCGCAATTATCGATCC-
AGCCGCGGCGAT CGAGTTCA
CGGACAGGATCGTACTGCGACCGTTCTCGATCTTCATTGACGCCGGATTCCTCGAAGTCCGGTCAGCAATTG-
GCGCTCCCTCAA CCATCGTC
GATCGTTTCACAACCGAATTCTGGAGGTTTCATGATGACTTGATGAAGCAGAACGGCCTAATG-
GAGTAA P.sub.Bet Promoter.sup.17
AGCGCGGGTGAGAGGGATTCGTTACCAATAGACAATTGATTGGACGTTCAATATAATGCTAGC
P.sub.Cin Promoter
CCCTTTGTGCGTCCAAACGGACGCACGGCGCTCTAAAGCGGGTCGCGATCTTTCAGATTCGCTCCTCGCGCTT-
TCAGTCTTTGTTTTGGCGC
ATGTCGTTATCGCAAAACCGCTGCACACTTTTGCGCGACATGCTCTGATCCCCCTCATCTGGGGGGGCCTAT-
CTGAGGGAATTT CCGATCCG GCTCGCCTGAACCATTCTGCTTTCCACGAACTTGAAAACGCT
P.sub.3B5B Promoter.sup.5
TTTTGTTCGATTATCGAACAAATTATTGAAATATCGAACAAAACCTCTAAACTACTGTGGCACTGAATCAAAA-
AATTATA AACCCTGATCAG A P.sub.TTg Promoter.sup.19
CACCCAGCAGTATTTACAAACAACCATGAATGTAAGTATATTCCTTAGCAA P.sub.Van
Promoter.sup.20
ATTGGATCCAATTGACAGCTAGCTCAGTCCTAGGTACCATTGGATCCAAT
TABLE-US-00006 TABLE 6 RBS sequences used in this study
Name | Strain | RBS sequence.sup.a (SEQ ID NOs: 226-291) | Strength (GFP, au)
RBSr1 | R. sp. IRBG74 | ATTTCACACATCTAGAGCTAATCATCTCGTACTAAAGAGGAGAAATTAACCATG | 8242
RBSr2 | R. sp. IRBG74 | ATTTCACACATCTAGAGCTAATCATCGCGTACTCAGGAGGCAAGTAATG | 7181.5
RBSr3 | R. sp. IRBG74 | ATTTCACACATCTAGAATTAAAGAGGAGAAATTAACCATG | 6238.5
RBSr4 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTAAAGAGGCAAGTAATG | 3618
RBSr5 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTAAGGAGGCAAGTAATG | 3560
RBSr6 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTCAAGAGGCAAGTAATG | 2614.5
RBSr7 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCGCGTACTAAAGAGGCAAGTAATG | 2418.5
RBSr8 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTCAGGAGGCAAGTAATG | 1882.5
RBSr9 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTAATGAGGCAAGTAATG | 1593.5
RBSr10 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTAATGAGGCAAGTAATG | 1590
RBSr11 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTCACGAGGCAAGTAATG | 1554
RBSr12 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTAAAAAGGCAAGTAATG | 1138
RBSr13 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCGCGTACTAAAAAGGCAAGTAATG | 895.5
RBSr14 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCGCGTACTAAGAAGGCAAGTAATG | 632.5
RBSr15 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTAAATAGGCAAGTAATG | 648.5
RBSr16 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTAATAAGGCAAGTAATG | 532
RBSr17 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCTCGTACTAAAGAGGCAAGTAATG | 488
RBSr18 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTCAATAGGCCAGTAATG | 305.5
RBSr19 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTAAGTAGGCAAGTAATG | 242
RBSr20 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTAACGAGGCAAGTAATG | 248
RBSr21 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTCAGCAGGCAAGTAATG | 183
RBSr22 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCGCGTACTAAGTAGGCAAGTAATG | 130
RBSr23 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCGCGTACTAATTAGGCAAGTAATG | 84.4
RBSr24 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCTCGTACTAACAAGGCAAGTAATG | 75.15
RBSr25 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTCAATAGGCAAGTAATG | 45.45
RBSr26 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTAAGCACGCAAGTAATG | 36
RBSr27 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTAACTACGCAAGTAATG | 12.2
RBSr28 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCGCGTACTAAGAACGCAAGTAATG | 13
RBSr29 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCGCGTACTAAAAACGCAAGTAATG | 4.6
RBSr30 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCGCGTACTAACAACGCAAGTAATG | 2.95
RBSr31 | R. sp. IRBG74 | TAACAATTTCACACATCTAGAGCTAATCTTCTCGTACTCATGACGCAAGTAATG | 1.45
RBSr32 | R. sp. IRBG74 | ATTTCACACATCTAGAATTAAAGAGAAGAAATTAACCATG | N/A.sup.b
RBSr33 | R. sp. IRBG74 | CTAGTGCGAACTAGCTCATACCGCAGATG | N/A.sup.b
RBSp1 | P. protegens Pf-5 | CTAGCGCAGGTCCAACGTTTTTCTAAGCAAGGAGGTCATATG | 25090
RBSp2 | P. protegens Pf-5 | CTAGCGAAGGTCCAACGTTTTTCTAAGCAAGGAGGTCATATG | 21590
RBSp3 | P. protegens Pf-5 | CTAGCGAAGGTCCAACGTTTTTCTAAGCCAGGAGGTCATATG | 19690
RBSp4 | P. protegens Pf-5 | CTAGCGCAGGTCCAACGTTTTTCTAAGCCAGGAGGTCATATG | 19490
RBSp5 | P. protegens Pf-5 | CTAGCGAAGCTCCAACGTTTTTCTAAGCAAGGAGGTCATATG | 17990
RBSp6 | P. protegens Pf-5 | GAATTCTACACTAACGGACAGGAGGGTCCGATG | 14490
RBSp7 | P. protegens Pf-5 | GAATTCTAAACTAACGGACAGGAGGGTCCGATG | 13390
RBSp8 | P. protegens Pf-5 | GAATTCTAAGCTAACGGACAGGAGGGTCCGATG | 12790
RBSp9 | P. protegens Pf-5 | GAATTCTTAACTAACGGACAGGAGGGTCCGATG | 11490
RBSp10 | P. protegens Pf-5 | GAATTCTACACTAACGGACAGGAGGGTCGGATG | 11090
RBSp11 | P. protegens Pf-5 | GAATTCTACGCTAACGGACAGGAGGGTCCGATG | 10390
RBSp12 | P. protegens Pf-5 | GAATTCTCAACTAACGGACAGGAGGGTCCGATG | 9590
RBSp13 | P. protegens Pf-5 | GAATTCTAAGCTAACGGACAGGAGGGTCGGATG | 8918
RBSp14 | P. protegens Pf-5 | GAATTCTCAGCTAACGGACAGGAGGGTCCGATG | 8766
RBSp15 | P. protegens Pf-5 | GAATTCTCAACTAACGGACAGGAGGGTCCGATG | 7596
RBSp16 | P. protegens Pf-5 | GAATTCTACGCTAACGGACAGGAGGGTCGGATG | 6055
RBSp17 | P. protegens Pf-5 | GAATTCTCAACTAACGGACAGGAGATATACATATG | 5939
RBSp18 | P. protegens Pf-5 | GAATTCTCAGCTAACGGACAGGAGGGTCGGATG | 5915
RBSp19 | P. protegens Pf-5 | GAATTCTAAACTAACGGACAGGAGGGTCGGATG | 4867
RBSp20 | P. protegens Pf-5 | GAATTCTCAGCTCACGGACAGGAGGGTCGGATG | 4426
RBSp21 | P. protegens Pf-5 | GAATTCTCAACTAACGGACAGGAGGGTCGGGATG | 4110
RBSp22 | P. protegens Pf-5 | GAATTCTACACTCACGGACAGGAGGGTCGGATG | 3977
RBSp23 | P. protegens Pf-5 | GAATTCTAAGCTCACGGACAGGAGGGTCGGATG | 3829
RBSp24 | P. protegens Pf-5 | GAATTCTCAACTCACGGACAGGAGGGTCGGATG | 3661
RBSp25 | P. protegens Pf-5 | GAATTCTACACTAACGGACAGCAGGGTCGGATG | 3542
RBSp26 | P. protegens Pf-5 | CTAGCGCAGGTCCAACCTTTTTCTAAGCAAGTAGGTCATATG | 2139
RBSp27 | P. protegens Pf-5 | GAATTCTCAGCTAACGGACAGCAGGGTCGGATG | 1265
RBSp28 | P. protegens Pf-5 | CTAGCGCAGGTCCAACCTTTTTCTAAGCAACTAGGTCATATG | 389
RBSp29 | P. protegens Pf-5 | CTAGCGAAGGTCCAACCTTTTTCTAAGCCAGTAGGTCATATG | 377
RBSp30 | P. protegens Pf-5 | GAATTCTACGCTCACGGACAGCAGGGTCGGATG | 221
RBSp31 | P. protegens Pf-5 | GAATTCTCCGCTCACGGACAGGAGGGTCCGATG | 23.3
RBSp32 | P. protegens Pf-5 | CTTCTCGGCCAGCTGACAGGGGAAGCTCGCATG | N/A.sup.b
RBSp33 | P. protegens Pf-5 | CTTCTCGGCCAGCTGACAGGAGGAAGCTCGCATG | N/A.sup.b
.sup.aThe start codon is underlined.
.sup.bRBSs are rationally designed for the controllers by the RBS Calculator.sup.2
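As an illustration of how a characterized RBS library such as Table 6 might be used, the following minimal sketch picks the library member whose measured GFP strength is closest to a desired expression level. The sequences and strengths are a subset of the R. sp. IRBG74 rows of Table 6. The names `RBS_LIBRARY`, `closest_rbs`, and `ends_with_start_codon` and the log-scale distance are assumptions made for this example only; they are not part of the disclosed methods.

```python
import math

# Subset of Table 6 (R. sp. IRBG74): name -> (RBS sequence, GFP strength in au).
# The dictionary name and the choice of rows are illustrative assumptions.
RBS_LIBRARY = {
    "RBSr1": ("ATTTCACACATCTAGAGCTAATCATCTCGTACTAAAGAGGAGAAATTAACCATG", 8242.0),
    "RBSr4": ("TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTAAAGAGGCAAGTAATG", 3618.0),
    "RBSr12": ("TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTAAAAAGGCAAGTAATG", 1138.0),
    "RBSr19": ("TAACAATTTCACACATCTAGAGCTAATCATCGCGTACTAAGTAGGCAAGTAATG", 242.0),
    "RBSr26": ("TAACAATTTCACACATCTAGAGCTAATCATCTCGTACTAAGCACGCAAGTAATG", 36.0),
    "RBSr31": ("TAACAATTTCACACATCTAGAGCTAATCTTCTCGTACTCATGACGCAAGTAATG", 1.45),
}


def closest_rbs(library, target_au):
    """Return the name of the RBS whose strength is nearest to target_au.

    Distance is measured on a log10 scale (an assumption of this sketch),
    because the tabulated strengths span several orders of magnitude.
    """
    if target_au <= 0:
        raise ValueError("target strength must be positive")
    log_target = math.log10(target_au)
    return min(
        library,
        key=lambda name: abs(math.log10(library[name][1]) - log_target),
    )


def ends_with_start_codon(sequence):
    """Check that the listed RBS sequence ends with the ATG start codon."""
    return sequence.upper().endswith("ATG")


if __name__ == "__main__":
    for target in (1000.0, 50.0):
        print(target, closest_rbs(RBS_LIBRARY, target))
```

Entries whose strength is listed as N/A (the rationally designed RBSs) would have to be excluded from such a lookup, since no measured value is tabulated for them.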
TABLE-US-00007 TABLE 7 Chemicals used in this study
Chemicals | Source | Identifier
Tryptone | Fisher Scientific | Cat# BP1421
Yeast extract | BD Bacto | Cat# DF0127
NaCl | Fisher Scientific | Cat# S271
CaCl.sub.2.cndot.2H.sub.2O | Sigma-Aldrich | Cat# C3306
MgSO.sub.4.cndot.7H.sub.2O | Fisher Scientific | Cat# M80
FeCl.sub.3 | Alfa Aesar | Cat# AA1235709
Na.sub.2MoO.sub.4.cndot.2H.sub.2O | Sigma-Aldrich | Cat# 331058
NH.sub.4CH.sub.3CO.sub.2 | Sigma-Aldrich | Cat# A1542
Na.sub.2HPO.sub.4 | Fisher Scientific | Cat# S375
KH.sub.2PO.sub.4 | Sigma-Aldrich | Cat# P9791
EDTA-Na2 | Sigma-Aldrich | Cat# E5134
ZnSO.sub.4.cndot.7H.sub.2O | ACROS Organics | Cat# AC424605000
H.sub.3BO.sub.3 | Fisher Scientific | Cat# A73
MnSO.sub.4.cndot.H.sub.2O | MP Biomedicals | Cat# ICN225099
CuSO.sub.4.cndot.5H.sub.2O | Aldon Corp | Cat# CC0535
CoCl.sub.2.cndot.6H.sub.2O | Sigma-Aldrich | Cat# C8661
FeSO.sub.4.cndot.7H.sub.2O | Sigma-Aldrich | Cat# 215422
Thiamine hydrochloride | ACROS Organics | Cat# 148990100
D-pantothenic acid hemicalcium salt | Sigma-Aldrich | Cat# P5155
Biotin | Sigma-Aldrich | Cat# B4501
Nicotinic acid | Sigma-Aldrich | Cat# 72309
MOPS | Fisher Scientific | Cat# BP308
Isopropyl-beta-D-thiogalactoside (IPTG) | GoldBio | Cat# I2481
L-arabinose | Sigma | Cat# A3256
Anhydrotetracycline hydrochloride (aTc) | Sigma | Cat# 37919
N-(3-Oxohexanoyl)-L-homoserine lactone (3OC6HSL) | Sigma | Cat# K3007
N-(3-Hydroxytetradecanoyl)-DL-homoserine lactone (3OC14HSL) | Sigma | Cat# 51481
Naringenin | Sigma | Cat# N5893
2,4-Diacetylphloroglucinol (DAPG) | Santa Cruz | Cat# sc-206518
Salicylic acid sodium salt | Sigma | Cat# S3007
3,4-Dihydroxybenzoic acid (DHBA) | Sigma | Cat# 37580
Vanillic acid | Sigma | Cat# 94770
Cuminic acid | Sigma | Cat# 268402
Nopaline | Toronto Research Chemicals | Cat# N650600
Octopine | Toronto Research Chemicals | Cat# O239850
Choline chloride | Sigma | Cat# C7017
Tris (1M), pH 8.0 | Invitrogen | Cat# AM9855
Triton X-100 | Sigma-Aldrich | Cat# T8787
Tergitol solution | Sigma-Aldrich | Cat# NP40S
DNase I | Sigma-Aldrich | Cat# 4716728001
RNA Fragmentation Reagents | Invitrogen | Cat# AM8740
T4 Polynucleotide kinase | New England Biolabs | Cat# M0201
SUPERase.cndot.In | Invitrogen | Cat# AM2694
PEG 8000 | Sigma-Aldrich | Cat# 1546605
T4 RNA ligase 2, truncated K277Q | New England Biolabs | Cat# M0351
SuperScript III reverse transcriptase | Invitrogen | Cat# 18080044
CircLigase ssDNA ligase | Epicentre | Cat# CL4115K
Phusion High-Fidelity DNA polymerase | New England Biolabs | Cat# M0530
Micrococcal nuclease | Roche | Cat# 10107921001
Sequence CWU 1
1
293117DNAArtificial SequenceSynthetic
polynucleotidemisc_feature(1)..(1)may be modified by
/5rApp/misc_feature(17)..(17)may be modified by /3ddc/ 1ctgtaggcac
catcaat 17233DNAArtificial SequenceSynthetic
polynucleotidemisc_feature(1)..(1)may be modified by
/5Phos/misc_feature(33)..(33)may be modified by /iSp18/ 2agatcggaag
agcgtcgtgt agggaaagag tgt 33341DNAArtificial SequenceSynthetic
polynucleotidemisc_feature(1)..(1)may be modified by /iSp18/
3caagcagaag acggcatacg agatattgat ggtgcctaca g 41421DNAArtificial
SequenceSynthetic polynucleotide 4caagcagaag acggcatacg a
21585DNAArtificial SequenceSynthetic
polynucleotidemisc_feature(63)..(68)n is a, c, g, or t 5aatgatacgg
cgaccaccga gatctacacg atcggaagag cacacgtctg aactccagtc 60acnnnnnnac
actctttccc tacac 85632DNAArtificial SequenceSynthetic
polynucleotide 6cgacaggttc agagttctac agtccgacga tc
32732DNAArtificial SequenceSynthetic polynucleotide 7cgacaggttc
agagttctac agtccgacga tc 32868DNAKlebsiella oxytoca 8cgtagggcgc
attaatgcag ctggcacgac aggtgaattc tagactgctg gatacgctgc 60ttaaggtc
68924DNAKlebsiella oxytoca 9tacgctgttt gagctggcaa acct
241074DNAPseudomonas stutzeri 10gcccggagag caagcccgta gggcgcatta
atgcagctgg cacgacaggt gttaggttgg 60cctgaattcg gtgt
741148DNAPseudomonas stutzeri 11ggctcacttc gatttcgtcc gcggtgcgtg
ccctgctagt gatgcgta 481225DNAPseudomonas stutzeri 12cgcctgattt
cgcctgatga acagg 251320DNAPseudomonas stutzeri 13tgacgctgtt
gaccaccgcc 201423DNAPseudomonas stutzeri 14atggaagtgg tcggcaccgg
cta 231522DNAPseudomonas stutzeri 15cgcaacggtt ggggtaggtt gg
221623DNAPseudomonas stutzeri 16gacgtccatc gcttcggctt cga
231737DNAPseudomonas stutzeri 17ctgcgaaatc gacgctgtcg agcatcatcg
cggttca 3718101DNAAzotobacter vinelandii 18atccattctc aggctgtctc
gtctcgtctc tacgtacgcg gatcccaggc aacgtcttcg 60tactgcggta ccgggttgcg
ggggcagcca gtggaaaaag g 1011921DNAAzotobacter vinelandii
19ctacggcacg ccctggttcg a 212022DNAAzotobacter vinelandii
20gctcggaaag tgctggagaa ac 222124DNAAzotobacter vinelandii
21aaatcagaca ttcatggcca cagg 242222DNAAzotobacter vinelandii
22tctaccatgg cgtgactctc gg 222343DNAAzotobacter vinelandii
23actcgtcttc tgtccgttta aactcccgga actctaccac cgc
432421DNAAzotobacter vinelandii 24cttggataga cgaggcacag c
212536DNAAzotobacter vinelandii 25gccggctcct gcaacctgaa ggggccgagg
atgatg 362630DNAPaenibacillus polymyxa 26gaattgagga taaatgtcag
ggatttcatg 302724DNAPaenibacillus polymyxa 27ccaagcattt tgagatcgcg
gatg 242821DNAPaenibacillus polymyxa 28cggaggtgcc ggtatgagcg a
212922DNAPaenibacillus polymyxa 29gaagtttgca gcgaaagagg cg
223023DNAPaenibacillus polymyxa 30gggatgatgc agaatacatc ccg
233126DNAPaenibacillus polymyxa 31ggtgacctgg atgatgcaga ggagag
2632118DNAUnknownCyanothece 32ggcccgcgtt aggttggcct gaattcggtg
tgtatccccc ggagatacgt aaaaaaaaaa 60accccgccct gtcaggggcg gggttttttt
ttgataagtc aagctatcag aaccgatc 1183348DNAUnknownCyanothece
33taattcccat aacatctgca tgcataataa ggtggggaaa gtctcagc
483424DNAUnknownCyanothece 34aatgtatttc tgatcgatgc gacg
243524DNAUnknownCyanothece 35gttatctggc tgatgtttgt ggtg
243624DNAUnknownCyanothece 36gtcaaactgt cttgtttaaa gccg
2437104DNAAzospirillum brasilense 37ttaaggtcat gcagcaggag
aactaaaggc ccgcgttagg ttggtaataa aaaagccccc 60ggaatgatct tccgggggcc
ctgcgcaaat acaacatcga gatc 1043826DNAAzospirillum brasilense
38gacgactgaa taaggatcgc ggaatg 263923DNAAzospirillum brasilense
39tatgtcacag gcccgacaaa gcg 234023DNAAzospirillum brasilense
40gattgtcggg tatcgcacac gag 234125DNAAzospirillum brasilense
41cgaaggagtt cgccccagtc tattc 254268DNARhodopseudomonas palustris
42aatacgatcg catgtcctag gtaatacgac tcactatagg gagaggtaat cagtggtgga
60tttgatgt 684320DNARhodopseudomonas palustris 43ccaagcaaag
gaccaccctc 204422DNARhodopseudomonas palustris 44agcttcgata
tcatccgctg at 224523DNARhodopseudomonas palustris 45ttgttcatgt
cggacctaac cga 234667DNARhodobacter sphaeroides 46caatacgatc
gcatgtccta ggtaatacga ctcactatag ggagatgcat ttcacgcttc 60gcgattc
674720DNARhodobacter sphaeroides 47ccgccttcac cagagacacc
204823DNARhodobacter sphaeroides 48atcgagaagt tctacgatgc cgt
234968DNARhodobacter sphaeroides 49gcaaaaaaaa accccgcccc tgacagggcg
gggttttttt tttcaattgg acctggatgg 60gcagcaag 6850106DNAAzorhizobium
caulinodans 50ctcgcatcca ttctcaggct gtctcgtctc gtctctctag
agtcggagct cttggggcct 60ctaaacgggt cttgaggggt tttttgttgt cttcgacgcg
aagctc 1065153DNAAzorhizobium caulinodans 51ataggcaata cgatcgcatg
tccgtttaaa ctgataagga cggcactggc tgg 535219DNAAzorhizobium
caulinodans 52cgatgccgtc cagcacctc 195321DNAAzorhizobium
caulinodans 53ctgccacggt tcccaaggtt c 215462DNAAzorhizobium
caulinodans 54taaaaaagcg gctaaccacg ccgctttttt tacgtctgca
gtgttgtcga agcttgatgc 60gc 625560DNAAzorhizobium caulinodans
55cgctgcttaa ggtcatgcag caggagaact aaaggcccgc tctgcgaaag gaatagcgtc
605619DNAAzorhizobium caulinodans 56ctatcgccgc cacctgacc
195730DNAAzorhizobium caulinodans 57cgtcagaacg gctctgacgc
atcagggaga 305832DNAAzorhizobium caulinodans 58agtaatattg
cggatcggcc agcagcgagg aa 325925DNAAzorhizobium caulinodans
59ggtggtcatt ggcaacggtt cgaag 256031DNAAzorhizobium caulinodans
60tccccaagag cccaaccgtt ccgggagcga a 316199DNAGluconacetobacter
diazotrophicus 61ttaaggtcat gcagcaggag aactaaaggc ccgcgttagg
ttggtaataa aaaagccccc 60ggaatgatct tccgggggcc gatcgaggaa atcgacgtg
996228DNAGluconacetobacter diazotrophicus 62atattccgga tacggctggt
gaggtgga 286324DNAGluconacetobacter diazotrophicus 63cgccacgtcg
tcaatgccta taac 246421DNAGluconacetobacter diazotrophicus
64tgaccaccgt gcagaagatc c 216523DNAKlebsiella oxytoca 65gtgacgctcg
cgtatcaggt ttg 236669DNAKlebsiella oxytoca 66atcaggcgca tatttgaatg
tatttactgc agcggccgct tctagagtga ccaaaagctt 60ccgcaaccc
696748DNAPseudomonas stutzeri 67actacgcatc actagcaggg cacgcaccgc
ggacgaaatc gaagtgag 486822DNAPseudomonas stutzeri 68ttgtcgactc
ccggggtctg ac 226923DNAPseudomonas stutzeri 69ggctttaacg gcatgttccg
ggt 237024DNAPseudomonas stutzeri 70gtagtcgtcg ttgtggccga actc
247122DNAPseudomonas stutzeri 71aaagcatcat ctcgggtcgg gc
227221DNAPseudomonas stutzeri 72cgtcgagcga caacgcctcg a
217325DNAPseudomonas stutzeri 73ctatgagctg gactgaaccg cgatg
257474DNAPseudomonas stutzeri 74gaaaataccg catcaggcgc atatttgaat
gtatttactg cagcggccgc tggcgaatct 60ccttcctcgg ttcg
747522DNAAzotobacter vinelandii 75gccttcgaac atgttgtccc ag
227624DNAAzotobacter vinelandii 76tcgagttcga gcagtttctc cagc
247720DNAAzotobacter vinelandii 77agcgaacaat acctgtggcc
207821DNAAzotobacter vinelandii 78tggcgcttgc ccttgttcca a
217954DNAAzotobacter vinelandii 79gcgcggtggt agagttccgg gagtttaaac
ggacagaaga cgagtcgtgc gggc 548020DNAAzotobacter vinelandii
80ttgctcaggg tcgggttggc 208139DNAAzotobacter vinelandii
81catcatcctc ggccccttca ggttgcagga gccggcttg 398221DNAAzotobacter
vinelandii 82gcaagccact ccactgacga a 218322DNAPaenibacillus
polymyxa 83acaggttccg cagttcacaa gc 228425DNAPaenibacillus polymyxa
84gctgattgtg atcgacaata ttcgg 258522DNAPaenibacillus polymyxa
85gaaagcctac acgaagcaaa gg 228622DNAPaenibacillus polymyxa
86cttgagaatc tgccgggcgc ct 228722DNAPaenibacillus polymyxa
87atccacaaat caacaccctg cg 228822DNAPaenibacillus polymyxa
88aaagcgttcc agtcacggtc ac 228948DNAUnknownCyanothece 89gagactttcc
ccaccttatt atgcatgcag atgttatggg aattaacg
489023DNAUnknownCyanothece 90accttgacaa tcattacaca gcg
239128DNAUnknownCyanothece 91caaatataat gatcgacatt ttcaccac
289224DNAUnknownCyanothece 92cgttaacttt gtcgcaaaac ttcg
2493102DNAUnknownCyanothece 93accaaggcga atctccttcc tcggttcgcg
atcacgctac tccgccaata aaaaagcccc 60cggaatgatc ttccgggggc cagattcagg
taactgctca ag 1029423DNAAzospirillum brasilense 94tgcgtcttct
tcgggcatcg tca 239523DNAAzospirillum brasilense 95agaaaattga
ttgcggacga gcg 239627DNAAzospirillum brasilense 96ttcaataagt
taagcagatc ggcctcg 279733DNAAzospirillum brasilense 97cggtgttacg
aataaatatt tctacgaata gac 339868DNAAzospirillum brasilense
98gctccaaaag gagcctttaa ttgtatcggt ttatcagctt gctttgttcc gcgggtctcg
60atacaacg 689922DNARhodopseudomonas palustris 99ggtcttgcgg
atcatcactt tc 2210020DNARhodopseudomonas palustris 100gacggtcagg
tggtccgaac 2010123DNARhodopseudomonas palustris 101ggtgagaatg
atcatgatcg gcc 2310267DNARhodopseudomonas palustris 102ctccaaaagg
agcctttaat tgtatcggtt tatcagcttg ctttgacgac aagtggagaa 60gggatag
6710322DNARhodobacter sphaeroides 103tcccatggtc atgtcctttg cg
2210421DNARhodobacter sphaeroides 104gtgcgctttt ccacgaggag c
2110566DNARhodobacter sphaeroides 105aattgaaaaa aaaaaccccg
ccctgtcagg ggcggggttt ttttttgcag cgcccattcc 60gtcttc
6610666DNARhodobacter sphaeroides 106gctccaaaag gagcctttaa
ttgtatcggt ttatcagctt gctttggaga aagcctgcgc 60ggctag
6610765DNAAzorhizobium caulinodans 107gcccccggaa ggtgatcttc
cgggggcttt ctcatgcgtt gacagccttg agatagatca 60agtgc
6510820DNAAzorhizobium caulinodans 108ctgatccagg ccttcatcgg
2010923DNAAzorhizobium caulinodans 109gacatgtctg gtctccttgg aac
2311042DNAAzorhizobium caulinodans 110ttctggaatt tggtaccgag
tcagtaacgt gccacagcct cg 42111117DNAAzorhizobium caulinodans
111atcaggcgca tatttgaatg tatttactgc agcggccgct acgtacttgt
ggggtcagtt 60ccggctgggg gttcagcagc cacctgcagt taattaaggc gctcctttcc
tgattcg 11711220DNAAzorhizobium caulinodans 112gctgctgtgt
ggagagatcg 2011322DNAAzorhizobium caulinodans 113gtcggtgaga
ttgatcatgg cc 2211420DNAAzorhizobium caulinodans 114tgcatgtccg
ttcctcgctg 2011524DNAAzorhizobium caulinodans 115acatgtcttg
aattccttcg aacc 2411618DNAAzorhizobium caulinodans 116tgcattgcgt
tcgctccc 1811719DNAAzorhizobium caulinodans 117tgtcagggca ggcagggcc
1911841DNAGluconacetobacter diazotrophicus 118tcaccagccg tatccggaat
atgtcaggat catgacatcc c 4111920DNAGluconacetobacter diazotrophicus
119acgatttcca tgcccaggtc 2012020DNAGluconacetobacter diazotrophicus
120cctccagcac ctcttcgatg 2012167DNAGluconacetobacter diazotrophicus
121gctccaaaag gagcctttaa ttgtatcggt ttatcagctt gctttgggca
atacctgaga 60cgtttca 6712270DNAArtificial SequenceSynthetic
polynucleotide 122agagtgttga cttgtgagcg gataacaatg atacttagat
tcaattgtga gcggataaca 60atttcacaca 701232652DNAArtificial
SequenceSynthetic polynucleotide 123atgaacacga ttaacatcgc
taagaacgac ttctctgaca tcgaactggc tgctatcccg 60ttcaacactc tggctgacca
ttacggtgag cgtttagctc gcgaacagtt ggcccttgag 120catgagtctt
acgagatggg tgaagcacgc ttccgcaaga tgtttgagcg tcaacttaaa
180gctggtgagg ttgcggataa cgctgccgcc aagcctctca tcactaccct
actccctaag 240atgattgcac gcatcaacga ctggtttgag gaagtgaaag
ctaagcgcgg caagcgcccg 300acagccttcc agttcctgca agaaatcaag
ccggaagccg tagcgtacat caccattaag 360accactctgg cttgcctaac
cagtgctgac aatacaaccg ttcaggctgt agcaagcgca 420atcggtcggg
ccattgagga cgaggctcgc ttcggtcgta tccgtgacct tgaagctaag
480cacttcaaga aaaacgttga ggaacaactc aacaagcgcg tagggcacgt
ctacaagaaa 540gcatttatgc aagttgtcga ggctgacatg ctctctaagg
gtctacttgg tggcgaggcg 600tggtcttcgt ggcataagga agactctatt
catgtaggag tacgctgcat cgagatgctc 660attgagtcaa ccggaatggt
tagcttacac cgccaaaatg ctggcgtagt aggtcaagac 720tctgagacta
tcgaactcgc acctgaatac gctgaggcta tcgcaacccg tgcaggtgcg
780ctggctggca tctctccgat gttccaacct tgcgtagttc ctcctaagcc
gtggactggc 840attactggtg gtggctattg ggctaacggt cgtcgtcctc
tggcgctggt gcgtactcac 900agtaagaaag cactgatgcg ctacgaagac
gtttacatgc ctgaggtgta caaagcgatt 960aacattgcgc aaaacaccgc
atggaaaatc aacaagaaag tcctagcggt cgccaacgta 1020atcaccaagt
ggaagcattg tccggtcgag gacatccctg cgattgagcg tgaagaactc
1080ccgatgaaac cggaagacat cgacatgaat cctgaggctc tcaccgcgtg
gaaacgtgct 1140gccgctgctg tgtaccgcaa ggacaaggct cgcaagtctc
gccgtatcag ccttgagttc 1200atgcttgagc aagccaataa gtttgctaac
cataaggcca tctggttccc ttacaacatg 1260gactggcgcg gtcgtgttta
cgctgtgtca atgttcaacc cgcaaggtaa cgatatgacc 1320aaaggactgc
ttacgctggc gaaaggtaaa ccaatcggta aggaaggtta ctactggctg
1380aaaatccacg gtgcaaactg tgcgggtgtc gacaaggttc cgttccctga
gcgcatcaag 1440ttcattgagg aaaaccacga gaacatcatg gcttgcgcta
agtctccact ggagaacact 1500tggtgggctg agcaagattc tccgttctgc
ttccttgcgt tctgctttga gtacgctggg 1560gtacagcacc acggcctgag
ctataactgc tcccttccgc tggcgtttga cgggtcttgc 1620tctggcatcc
agcacttctc cgcgatgctc cgagatgagg taggtggtcg cgcggttaac
1680ttgcttccta gtgaaaccgt tcaggacatc tacgggattg ttgctaagaa
agtcaacgag 1740attctacaag cagacgcaat caatgggacc gataacgaag
tagttaccgt gaccgatgag 1800aacactggtg aaatctctga gaaagtcaag
ctgggcacta aggcactggc tggtcaatgg 1860ctggcttacg gtgttactcg
cagtgtgact aagcgttcag tcatgacgct ggcttacggg 1920tccaaagagt
tcggcttccg tcaacaagtg ctggaagata ccattcagcc agctattgat
1980tccggcaagg
gtctgatgtt cactcagccg aatcaggctg ctggatacat ggctaagctg
2040atttgggaat ctgtgagcgt gacggtggta gctgcggttg aagcaatgaa
ctggcttaag 2100tctgctgcta agctgctggc tgctgaggtc aaagataaga
agactggaga gattcttcgc 2160aagcgttgcg ctgtgcattg ggtaactcct
gatggtttcc ctgtgtggca ggaatacaag 2220aagcctattc agacgcgctt
gaacctgatg ttcctcggtc agttccgctt acagcctacc 2280attaacacca
acaaagatag cgagattgat gcacacaaac aggagtctgg tatcgctcct
2340aactttgtac acagccaaga cggtagccac cttcgtaaga ctgtagtgtg
ggcacacgag 2400aagtacggaa tcgaatcttt tgcactgatt cacgactcct
tcggtacgat tccggctgac 2460gctgcgaacc tgttcaaagc agtgcgcgaa
actatggttg acacatatga gtcttgtgat 2520gtactggctg atttctacga
ccagttcgct gaccagttgc acgagtctca attggacaaa 2580atgccagcac
ttccggctaa aggtaacttg aacctccgtg acatcttaga gtcggacttc
2640gcgttcgcgt aa 265212451DNAA. caulinodans 124cgaatggtgc
aaaacctttc gcggtatggc atgatagcgc ccggaagaga g
511251545DNAArtificial SequenceSynthetic polynucleotide
125atggcgatga gcccaaagat ggagttccgc cagagccagt ctctggtgat
gacgccgcag 60ctgatgcagg ccatcaagct gctgcagctc tccaatctcg aactggtcgc
ctatgtggag 120gccgagctcg aacgcaatcc gctgctggag cgggcgagcg
agccggaaag ccccgagcac 180gatccgccga acccgcagga agaggcaccc
accccgcctg acagtggcgc gccggtgtcc 240ggcgactgga tggaaagcga
catgggctcg agccgcgagg ccatcgagac ccggctggac 300accgacctcg
gcaatgtctt tcccgatgat gcgccggccg agcgcatcgg cgcgggcagc
360ggcagcggct cgtccatcga atggggctcg ggcggcgacc ggggcgagga
ctacaatccg 420gaagccttcc tcgctgccga gacgacgctg gccgaccatc
tggaagccca gctctccgtg 480gcggagcccg atccggcgcg ccgcctcatc
ggcctcaacc tcatcggcct catcgacgag 540acgggttatt tctccggcga
cctcgatgcg gtggccgagc aactgggcgc cacccacgat 600caggtggccg
acgtgctgcg cgtcatccag agcttcgagc cgtccggcgt cggcgcacgg
660tcgctcagcg aatgcctggc cctgcaattg cgcgacaagg atcgctgcga
tcccgccatg 720caggcgctgc tcgacaatct ggaactcctc gcccgccacg
accgcaacgc gctgaagcgc 780atctgcgggg tggacgcgga agacctcgcg
gacatgatcg gcgagatccg ccgcctcgat 840ccgaagcccg gcctcgccta
tggcggcggc gtcgtccacc cgctggtgcc ggacgtgttc 900gtgcgcgagg
gctccgacgg cagctggatc gtggaactga attccgagac gctgccgcgc
960gtgctggtga accagaccta tcacgcgacg gtggccaagg cggcgcgctc
ggccgaggaa 1020aagaccttcc tcgccgactg cctccagagc gcctcctggc
ttacccgctc gctcgaccag 1080cgggctcgca ccatcctcaa ggtggcgagc
gagatcgtgc gccagcagga cgccttcctc 1140gtgcacggcg tgcggcacct
gcgccccctg aacctgcgca cggtggcgga tgccatcggc 1200atgcacgaat
ccaccgtctc gcgggtgacc tcgaacaagt acatctccac cccgcgcggg
1260gtgctggaga tgaagttctt cttctcctcc tccatcgctt cctcgggtgg
tggcgaggcc 1320catgcggcgg aggcggtgcg ccaccgcatc aagagcctca
tcgaggccga gagtgcggac 1380gacgtgctgt ccgacgacac gctggtgcag
aagctgaagg acgacggcat cgatatcgcc 1440cgccgaacgg tcgcgaaata
tcgcgagagc atgaacatcc cgtcctcggt ccagcgccgc 1500cgcgaaaagc
aggccctgcg cagcgacgcc gccgccgccg gctga 15451261083DNAArtificial
SequenceSynthetic polynucleotide 126gtgaaaccag taacgttata
cgatgtcgca gagtatgccg gtgtctctta tcagaccgtt 60tcccgcgtgg tgaaccaggc
cagccacgtt tctgcgaaaa cgcgggaaaa agtggaagcg 120gcgatggcgg
agctgaatta cattcccaac cgcgtggcac aacaactggc gggcaaacag
180tcgttgctga ttggcgttgc cacctccagt ctggccctgc acgcgccgtc
gcaaattgtc 240gcggcgatta aatctcgcgc cgatcaactg ggtgccagcg
tggtggtgtc gatggtagaa 300cgaagcggcg tcgaagcctg taaagcggcg
gtgcacaatc ttctcgcgca acgcgtcagt 360gggctgatca ttaactatcc
gctggatgac caggatgcca ttgctgtgga agctgcctgc 420actaatgttc
cggcgttatt tcttgatgtc tctgaccaga cacccatcaa cagtattatt
480ttctcccatg aagacggtac gcgactgggc gtggagcatc tggtcgcatt
gggtcaccag 540caaatcgcgc tgttagcggg cccattaagt tctgtctcgg
cgcgtctgcg tctggctggc 600tggcataaat atctcactcg caatcaaatt
cagccgatag cggaacggga aggcgactgg 660agtgccatgt ccggttttca
acaaaccatg caaatgctga atgagggcat cgttcccact 720gcgatgctgg
ttgccaacga tcagatggcg ctgggcgcaa tgcgcgccat taccgagtcc
780gggctgcgcg ttggtgcgga tatctcggta gtgggatacg acgataccga
agacagctca 840tgttatatcc cgccgttaac caccatcaaa caggattttc
gcctgctggg gcaaaccagc 900gtggaccgct tgctgcaact ctctcagggc
caggcggtga agggcaatca gctgttgccc 960gtctcactgg tgaaaagaaa
aaccaccctg gcgcccaata cgcaaaccgc ctctccccgc 1020gcgttggccg
attcattaat gcagctggca cgacaggttt cccgactgga aagcgggcag 1080tga
1083127717DNAArtificial SequenceSynthetic polynucleotide
127atgagtaaag gagaagaact tttcactgga gttgtcccaa ttcttgttga
attagatggt 60gatgttaatg ggcacaaatt ttctgttagt ggagagggtg aaggtgatgc
aacatacgga 120aaacttaccc ttaaatttat ttgcactact ggaaaactac
ctgttccatg gccaacactt 180gtcactactt tcggttatgg tgttcaatgc
tttgcgagat acccagatca tatgaaacag 240catgactttt tcaagagtgc
catgcccgaa ggttatgtac aggaaagaac tatatttttc 300aaagatgacg
ggaactacaa gacacgtgct gaagtcaagt ttgaaggtga tacccttgtt
360aatagaatcg agttaaaagg tattgatttt aaagaagatg gaaacattct
tggacacaaa 420ttggaataca actataactc acacaatgta tacatcatgg
cagacaaaca aaagaatgga 480atcaaagtta acttcaaaat tagacacaac
attgaagatg gaagcgttca actagcagac 540cattatcaac aaaatactcc
aattggcgat ggccctgtcc ttttaccaga caaccattac 600ctgtccacac
aatctgccct ttcgaaagat cccaacgaaa agagagacca catggtcctt
660cttgagtttg taacagctgc tgggattaca catggcatgg atgaactata caaatag
717128717DNAArtificial SequenceSynthetic polynucleotide
128atgcgtaaag gcgaagagct gttcactggt gtcgtcccta ttctggtgga
actggatggt 60gatgtcaacg gtcataagtt ttccgtgcgt ggcgagggtg aaggtgacgc
aactaatggt 120aaactgacgc tgaagttcat ctgtactact ggtaaactgc
cggtaccttg gccgactctg 180gtaacgacgc tgacttatgg tgttcagtgc
tttgctcgtt atccggacca tatgaagcag 240catgacttct tcaagtccgc
catgccggaa ggctatgtgc aggaacgcac gatttccttt 300aaggatgacg
gcacgtacaa aacgcgtgcg gaagtgaaat ttgaaggcga taccctggta
360aaccgcattg agctgaaagg cattgacttt aaagaagacg gcaatatcct
gggccataag 420ctggaataca attttaacag ccacaatgtt tacatcaccg
ccgataaaca aaaaaatggc 480attaaagcga attttaaaat tcgccacaac
gtggaggatg gcagcgtgca gctggctgat 540cactaccagc aaaacactcc
aatcggtgat ggtcctgttc tgctgccaga caatcactat 600ctgagcacgc
aaagcgttct gtctaaagat ccgaacgaga aacgcgatca tatggttctg
660ctggagttcg taaccgcagc gggcatcacg catggtatgg atgaactgta caaatga
717129678DNAArtificial SequenceSynthetic polynucleotide
129atggcttcct ccgaagacgt tatcaaagag ttcatgcgtt tcaaagttcg
tatggaaggt 60tccgttaacg gtcacgagtt cgaaatcgaa ggtgaaggtg aaggtcgtcc
gtacgaaggt 120acccagaccg ctaaactgaa agttaccaaa ggtggtccgc
tgccgttcgc ttgggacatc 180ctgtccccgc agttccagta cggttccaaa
gcttacgtta aacacccggc tgacatcccg 240gactacctga aactgtcctt
cccggaaggt ttcaaatggg aacgtgttat gaacttcgaa 300gacggtggtg
ttgttaccgt tacccaggac tcctccctgc aagacggtga gttcatctac
360aaagttaaac tgcgtggtac caacttcccg tccgacggtc cggttatgca
gaaaaaaacc 420atgggttggg aagcttccac cgaacgtatg tacccggaag
acggtgctct gaaaggtgaa 480atcaaaatgc gtctgaaact gaaagacggt
ggtcactacg acgctgaagt taaaaccacc 540tacatggcta aaaaaccggt
tcagctgccg ggtgcttaca aaaccgacat caaactggac 600atcacctccc
acaacgaaga ctacaccatc gttgaacagt acgaacgtgc tgaaggtcgt
660cactccaccg gtgcttaa 678130711DNAArtificial SequenceSynthetic
polynucleotide 130atggtttcga agggcgagga ggataacatg gccatcatca
aggagttcat gcgcttcaag 60gtgcacatgg agggctccgt gaacggccac gagttcgaga
tcgagggcga gggcgagggc 120cgcccctacg agggcaccca gaccgccaag
ctgaaggtga ccaagggtgg ccccctgccc 180ttcgcctggg acatcctgtc
ccctcagttc atgtacggct ccaaggccta cgtgaagcac 240cccgccgaca
tccccgacta cttgaagctg tccttccccg agggcttcaa gtgggagcgc
300gtgatgaact tcgaggacgg cggcgtggtg accgtgaccc aggactcctc
cttgcaggac 360ggcgagttca tctacaaggt gaagctgcgc ggcaccaact
tcccctccga cggccccgta 420atgcagaaga agacgatggg ctgggaggcc
tcctccgagc ggatgtaccc cgaggacggc 480gccctgaagg gcgagatcaa
gcagaggctg aagctgaagg acggcggcca ctacgacgct 540gaggtcaaga
ccacctacaa ggccaagaag cccgtgcagc tgcccggcgc ctacaacgtc
600aacatcaagt tggacatcac ctcccacaac gaggactaca ccatcgtgga
acagtacgaa 660cgcgccgagg gccgccactc caccggcggc atggacgagc
tgtacaagta a 71113123DNAArtificial SequenceSynthetic polynucleotide
131taatacgact cactataggg aga 2313223DNAArtificial SequenceSynthetic
polynucleotide 132taatacgact cactacaggc aga 2313323DNAArtificial
SequenceSynthetic polynucleotide 133taatacgact cactagagag aga
2313423DNAArtificial SequenceSynthetic polynucleotide 134taatacgact
cactaatggg aga 2313523DNAArtificial SequenceSynthetic
polynucleotide 135taatacgact cactaaaggg aga 2313623DNAArtificial
SequenceSynthetic polynucleotide 136taatacgact cactataggt aga
2313723DNAA. caulinodans 137taatacgact cactattggg aga
231381221DNAK. oxytoca 138gtgttccgtg ggaggcctgc catgctcgcc
aagacacccg caaaccccgc gccgcttcag 60cggacggcgt tcctgaacga caccacgctg
cgcgacggcg agcaggcgcc gggtgtcgcc 120ttcacccgca aggagaagat
cgagatcgcc gccgcccttg ccgccgccgg tgtcccggag 180atcgaggcgg
gaacgcccgc catgggcgac gaagaggtgg aaaccatccg ctccatcgtc
240tcgctgaacc tcccgacgcg cgtcatggcc tggtgccgca tgagcgagga
cgacctgatg 300gccgccgtcg cggcgggcgt gaagatcgtc aatgtctcca
ttcccacctc cgaccggcaa 360ctggccggca agctcggcaa ggatcgcgcc
tgggcgctcg gccgtgtggc ggaggtggtg 420acactggcgc gtcggctcgg
ctttgaggtg gcggtagggg gcgaggattc ctcgcgggcc 480gatcccgatt
ttctctgccg tctcgcggag acggcgaagg cggcgggcgc ctttcgcctg
540cggctggccg acacgcttgg cgtgcttgac cccttcggca cctatgcatt
ggtgcgccgg 600gtggccgcca ccaccgacat cgagcttgag ttccacgccc
atgacgatct cggccttgcc 660accgccaata cgctggcggc ggtgatgggc
ggagcgcgtc acgccagcgt caccgtcgcc 720gggctcggcg agcgcgcggg
caatgccgcg ctggaggaag tggccatcgc cctgcgccag 780acggcgcggg
cggagaccgg catcgctccg gccgcgctga agccgctggc cgaactagtg
840tgcggcgccg ccgcccgtcc ggtgccgcgc ggcaaggcca tcgtcggcgc
ggatgtgttc 900acccacgagt cgggcatcca tgtctccggc ctgctcaagg
accgggccac ctatgaagct 960ctgaatccgg aactgttcgg gcgtggccac
acggtggtgc tcggaaagca ttccggtctt 1020gcggcggtgg agaaggcgct
ggccgacgag ggcatcaccg tggatgcggt gcgcgggcgc 1080gccattctcg
accgggtgcg ggcttttgct gtccgcacca aggagaatgt ttcccgcgag
1140acgctgctgc gcttctatca ggacagcttc accgagtccg cgctgcgtct
gcggcgggcc 1200gccgtggaag gcgcaatctg a 1221139323DNAP. stutzeri
139tgttgcctca agcacagcct gtgccagctc gcggatgaca gaagagttag
cgcgaattca 60acgcgttatg aagagagtcg ccgcgcagcg cgccaagaga ttgcgtggaa
taagacacag 120ggggcgacaa gctgttgaac aggcgacaaa gcgccaccat
ggccccggca ggcgcaattg 180ttctgtttcc cacatttggt cgccttattg
tgccgttttg ttttacgtcc tgcgcggcga 240caaataacta acttcataaa
aatcataaga atacataaac aggcacggct ggtatgttcc 300ctgcacttct
ctgctggcaa aca 323140233DNAA. vinelandii 140tgtcatgttc gcaacagttg
ccgaaagtgt ggaaaaccgg cgcttggccc ggccgatctt 60tttgtcgcca ttgcaacagt
caggcctgtc ggttgttaac tatcgaaccg ccgaaggatg 120ttgctagtaa
ttaaattatt ctaattaaaa caagtgctta gattatttta gaaacgctgg
180cacaaaggct gctattgccc tgttgcgcag gcttgttcgt gcctatagcc cac
233141256DNAUnknownR. sp. IRBG74 141tgtcagtttt gtcacagggg
gccggaccag gatggtggac gctcgatggg gatgtcgggc 60cattgttcgg ttgtagcaat
tacaacagtc ggagtagggg gattgtaggg ggattgttgt 120gtatcagacc
gccctgcagc tcccgtcgat ggataattaa tcatttaaaa tcaatggttt
180atttatgtgt tgcgggtgct ggcacagacg ctgcattacc tttggtgcgc
ggagttgttc 240gggcttacgg ccgaac 256142294DNAA. caulinodans
142ttgacaaagc ctccgagaag agcgccccct aacccctcct cagccctgat
cggcagtatc 60atcttgtcga atcctaacgt ctgataggca acgctatacg acaaacgctg
gttacaattg 120tcggttccgc gacaagaatt tgctttgtct ggcgggtggt
ctattttgag ctaagtagct 180gagaaatcag gaaaacaaaa ctctattcgg
tctacccgac gagttggcac gggtcttgta 240accatccttg cgcaggcggc
gaaagccacc ggcgatattc atgttgcggg caac 294143215DNAK. oxytoca
143tgtcgcgttt gaaacacggg gcttttggaa ccgttcgatt ctgcaatgca
ctgattttac 60ttgattaatt cgaccacacg accactggca cacccgttgc aaaacccctt
ggtgcaggcg 120acgggttgcc ggtctggttc gcggatctcc tcgatccccg
gctaccgacc cgcctccgaa 180aagtccggtc ccgatccagt tcggcggggc cacac
2151441575DNAP. stutzeri 144atgatccata aatccgattc ggacaccacc
gtcagacgtt tcgatctctc ccagcagttt 60accgccatgc agcggataag cgtggtcctg
agtcgcgcca ccgaagcgag caaaaccctg 120caggaggttc tgagcgtgct
acataacgat gcctttatgc agcacgggat gatttgcctg 180tacgacagcc
agcaggagat cctgagcatc gaagcgctgc agcaaacgga agatcagacg
240ctgcccggca gtacgcaaat tcgctaccgg ccgggggaag gattagtcgg
taccgtgctg 300gcgcagggcc agtcgctggt gctgccgcgc gtcgccgacg
accagcgttt tctcgatcgt 360ctgagcctgt acgactatga cctgccgttt
atcgccgttc cgctgatggg cccccactcc 420cggcccatcg gcgtactggc
ggcgcagccg atggcgcgtc aggaagagcg gctgcccgcc 480tgcacgcgct
ttctcgaaac cgtcgccaat ctgatcgccc agacgattcg cctgatgatc
540ctgccaacct ccgccgcgca ggcgccgcag cagagcccca gaatagagcg
cccgcgcgcc 600tgtacccctt cgcgcggttt cggcctggaa aatatggtcg
gtaaaagccc ggcgatgcgg 660cagattatgg atattattcg tcaggtttcc
cgctgggata ccacggtgct ggtacgcggc 720gagagcggca ccgggaaaga
gctcatcgcc aacgccatcc accataattc tccgcgcgcc 780gccgcggcgt
tcgtcaaatt taactgcgcg gcgctgccgg acaacctgct ggagagcgag
840ctgtttggtc atgagaaagg cgcgtttacc ggcgcggtgc gccagcggaa
aggccgcttt 900gagctggcgg acggcggcac cttattcctc gatgagatcg
gcgaaagcag cgcctcgttt 960caggctaagc tactgcgtat tctgcaagag
ggggagatgg agcgcgtcgg cggcgacgaa 1020accctgcggg tcaacgtgcg
cattatcgcg gcgaccaacc gccatctgga agaggaggtg 1080cggctgggtc
atttccgcga ggatctatac taccgcctga acgtaatgcc tatcgcgctg
1140ccgccgctgc gcgagcgcca ggaggatatc gccgagctgg cgcactttct
ggtgcgaaaa 1200atcgcccaca gccaggggcg aacgctgcgc atcagcgatg
gggcgattcg cctgctgatg 1260gagtacagct ggccgggaaa cgtgcgcgaa
ctggaaaact gtctcgaacg ttcggcggtg 1320ctgtcggaaa gcggcctgat
agaccgggac gtgattctgt tcaaccatcg cgataacccg 1380ccgaaagcgc
tcgccagcag cggcccggcg gaggacggct ggctcgataa cagcctcgac
1440gagcgccagc ggctgatcgc cgccctggaa aaagcgggct gggtgcaggc
caaagcggcg 1500cggctgctcg gcatgacccc gcgccaggtg gcgtatcgca
ttcagattat ggatatcacc 1560atgccgcgac tgtga 15751451566DNAA.
vinelandii 145atgaacgcca cattcgccga acgccccagc gcgccaaccc
gcaacgaact gctggatgcc 60caactgcagg cgctggcgca gatcgcccgc atccttaacc
gcggccggcc catcgaggaa 120ctgctggccg agatcctcgc cgtgctgcac
gaagacctcg gcctgctgca cgggctggtc 180tccatctgca acccgaagga
cggcagcctg caggtgggcg ccgtgcacag cgactccgaa 240accgtggtac
gggcctgcga aagcacccgc taccgcatcg gcgaaggcgt gttcggcaac
300atcctcaagc atggcaacag cgtggtgctc gggcgtatcg acgccgaacc
gcgctttctc 360gaccgactgg cgctgtacga catggacctg cccttcatcg
ccgtgccgat caaggccgtc 420gacggcacca ccatcggcgt gctggctgcc
cagcccgacc gccgcgccga cgagctgatg 480cccgaacgca cccgtttgat
ggaaatcgtc gcccgcctac tggcgcagac cgtgcgcctg 540gtggtgaacc
tcgaggacgg ccaggaagtg gtcgacgagc gcgacgagct acgccgcgaa
600gtccgcgcca agtacggctt cgagaacatg gtggtgggcc acaccgcctc
catgcgccgg 660gttttcgacc aggttcgacg ggtcgccaag tggaacagca
ccgtgctgat cctcggcgaa 720tccggcaccg gcaaggagct gatcgccagc
gccatccact acaactcacc gcgcgctcac 780cagccgctgg tacgcctgaa
ctgcgccgcg ctaccggaaa ccctgctcga atcggaactg 840ttcggtcacg
agaaaggcgc cttcaccggc gccgtgaagc agcgcaaggg acgtttcgaa
900caggccgacg gcggcaccct gttcctcgac gagatcggcg agatctcgcc
gatgttccag 960gccaagctgc tgcgcgtgct gcaggaaggc gagctggagc
gcgtcggcgg cagccagacg 1020gtgaaggtca acgtgcgcat cgtcgccgcc
accaaccgcg acctggagca cgaggtggag 1080caaggcaagt tccgcgaaga
cctctactac cgcctcaacg tcatggccat ccgcgtcccg 1140ccgctgcgcg
agcgcagcgc cgacatcccg gaactggccg aattcctcct cgacaagatc
1200gcccgccagc agggtcgcaa actcaagctg accgacagcg ccctgcgtct
gctgatgagc 1260caccgctggc cgggcaacgt gcgcgaactg gaaaactgcc
tggaacgctc ggccatcatg 1320agcgaggatg gcaccatcag ccgcgacgtg
gtctccctca ccggcctcga ccacgacgcc 1380acgccgctgg cgccggtccc
cgaagtcgac ctcgccgacg acagcctcga cgaccgcgag 1440cgcgtcatcg
ccgcgctgga acaggccggc tgggtccagg ccaaggccgc ccgcctgctc
1500ggcatgacgc cccggcagat cgcctaccga gtgcagacgc tgaacattca
tatgcgcaag 1560atctga 15661461569DNAUnknownR. sp. IRBG74
146atgaatgcaa ccatccctca gcgctcggcc aaacagaacc cggtcgaact
ctatgacctg 60caattgcagg ccctggcgag catcgcccgc acgctcagcc gcgaacaaca
gatcgacgaa 120ctgctcgaac aggtcctggc cgtactgcac aatgacctcg
gcctgctgca tggcctggtg 180accatttccg acccggaaca cggcgccctg
cagatcggcg ccatccacac cgactcggaa 240gcggtggccc aggcctgcga
aggcgtgcgc tacagaagcg gcgaaggcgt gatcggcaac 300gtgctcaagc
acggcaacag cgtggtgctc gggcgcatct ccgccgaccc gcgctttctc
360gaccgcctgg cgctgtacga cctggaaatg ccgttcatcg ccgtgccgat
caagaacccc 420gagggcaaca ccatcggcgt gctggcggcc cagccggact
gccgcgccga cgagcacatg 480cccgcgcgca cgcgccttct ggagatcgtc
gccaacctgc tggcgcagac cgtgcgcctg 540gtggtgaaca tcgaggacgg
ccgcgaggcg gccgacgagc gcgacgaact gcgtcgcgag 600gtgcgcggca
agtacggctt cgagaacatg gtggtgggcc acacccccac catgcgccgg
660gtgttcgatc agatccgccg ggtcgccaag tggaacagca ccgtactggt
cctcggcgag 720tccggtaccg gcaaggaact gatcgccagc gccatccact
acaactcgcc gcgcgcgcac 780cgccccttcg tgcgcctgaa ctgcgccgcg
ctgccggaaa ccctgctcga gtccgaactc 840ttcggccacg agaagggcgc
cttcaccggc gcggtgaagc agcgcaaggg gcgtttcgag 900caggccgacg
gcggcaccct gttcctcgac gagatcggcg agatctcgcc gatgttccag
960gccaagctgc tgcgcgtgct gcaggaaggc gagttcgagc gggtcggcgg
caaccagacg 1020gtgcgggtca acgtgcgcat cgtcgccgcc accaaccgcg
acctggaaag cgaggtggaa 1080aagggcaagt tccgcgagga cctctactac
cgcctgaacg tcatggccat ccgcattccg 1140ccgctgcgcg agcgtaccgc
cgacattccc gaactggcgg aattcctgct cggcaagatc 1200ggccgccagc
agggccgccc gctgaccgtc accgacagcg ccatccgcct gctgatgagc
1260caccgctggc cgggcaacgt gcgcgaactg gagaactgcc tggagcgctc
ggcgatcatg 1320agcgaggacg gcaccatcac ccgcgacgtg gtctcgctga
ccggggtcga caacgagagc 1380ccgccgctcg ccgcgccgct gcccgaggtc
aacctggccg acgagaccct ggacgaccgc 1440gaacgggtga tcgccgccct
cgaacaggcc ggctgggtgc aggccaaggc cgcgcggctg 1500ctgggcatga
cgccgcggca gatcgcctac cgcatccaga ccctcaacat ccacatgcgc
1560aagatctga 15691471695DNAA. caulinodans 147atgctgcaca atgggctcaa
tgagggtatg actgaacgat ccgctcaaac catccacaaa 60ccggatttct ggggcagcgg
tatctatcgg atatcgaaag ttttgattgg tccagacagt 120ctcgagacga
agcttgccaa tgtcattaac gccctctcag taattctccc aatgcggcgc
180ggcgcaatcg tcgttctaaa tgttaaagga gagcccgaga tggttgcaat
gctgggccta 240gagcaagcat ctcaaggcgc ccgctccatt ccggcggagg
ctgcgataga tagaatcgtc 300gccaaaggcg cgccgctggt cgtaccggac
atttgcaagt cggacctgtt ccaggcggag 360ctccaaacca actcgaacgc
cacaggccca gccacgttcg ttggcgtccc gatgaaggtc 420gaaaaagaaa
cgcttggaac actatggatc gaccgcgcca aagatggcag cactaggatc
480caatttgagg aagaggtgcg cttcctctcc atggtcgcca acctttcggc
ccgggccatt 540tggctggatc gccaccagag ccgcgatggt cagccaatcg
tgggcgagga aggaactcgc 600aagactagtt caggcgacaa ggaactgccc
gaatctgccc gacaaaggcc cacaaaaatc 660gattggattg tcggggaaag
ccctgccctc aagcaggtgg ttgaaagcgt caaagtcgtt 720gcaacaacca
attctgcggt gcttctcagg ggcgaaagcg gcacgggcaa ggagttcttt
780gcaaaggcca tccacgagct ttcataccgg aaaaagaagc ccttcgtgaa
gttgaactgc 840gccgcgctgt ctgcaggcgt tttggaatcg gaattgtttg
gacatgaaaa gggcgccttc 900acgggggcca tctctcagcg cgcaggccgc
ttcgaactcg cagacggcgg aacgctgctg 960ctcgatgaga tcggcgacat
ttcgccgggc ttccaagcga aactgttgcg cgtcttgcag 1020gaaggtgagc
ttgagcgagt cggcggcaca aaaacactca aagtggacgt tcgactcata
1080tgcgccacga acaaagacct agaagcggca gtcgcggatg gggagttcag
ggccgacctt 1140tattaccgga tcaatgtggt gcccctattt ctgccgcctc
tccgggagcg aaatggggat 1200attccacgcc ttgcgagagt tttcctcggc
cgattcaaca gggaaaacaa tcgcgatctc 1260gcgttcgcgc cggctgcgct
cgagctcttg tcaaaatgca actttcccgg caacgtccga 1320gagcttgaaa
actgcgtccg caggaccgcc actctcgcgc gttcggagac gatcgttcca
1380tcagatttct cctgcctgaa gaaccagtgc ttttcttcaa tgctctggaa
aaccggtgac 1440cgtccacttg gggatacgct caatgggttg gccatgcgta
agagtttgtc ggtcgaatcg 1500ccgatcagcc tcggttactc caatggaccg
gccggcttaa cggtggcacc acatctaacg 1560gaccgcgagc tgctaatcag
tgcgatggag aaggccggtt gggttcaggc aaaggcagct 1620cggatcctcg
gcctcacacc gcgacaggtc ggctatgctt tacgtaggca tcgtatacag
1680gtgaagaaaa tctaa 16951481848DNAArtificial SequenceSynthetic
polynucleotide 148atgccaatga ccgacgcctt ccaggtccgc gtacctcggg
tttcgtcgag caccgccgga 60gacatcgccg cgtcatccat caccacgcgg ggcgcgctgc
cgcgcccggg agggatgcct 120gtgtccatgt cgcgggggac ctcgcccgag
gtggcactca tcggggtcta tgagatatcg 180aagatcctga cggcgccccg
gcgcctcgaa gtcacgctcg ccaatgtggt gaacgtgctc 240tcctccatgc
tgcagatgcg gcatggcatg atctgcatcc tcgacagcga gggcgatccc
300gacatggtgg ccaccaccgg ctggacgcct gagatggcgg gccagatccg
cgcgcatgtg 360ccccagaagg ccatcgacca gatcgtcgcc acgcagatgc
cgctggtggt gcaggacgtg 420acggccgatc cgctcttcgc cggtcacgag
gatctgttcg gcccgcctga ggaggccacc 480gtctccttca tcggcgtgcc
gatcaaggcc gaccaccatg tgatgggcac cctctccatc 540gaccgcatct
gggacggcac cgcccgtttc cgcttcgacg aggacgtgcg cttcctcacc
600atggtggcca atctcgtcgg ccagaccgtg cgcctgcaca agctggtggc
gagcgaccgc 660gaccggctga tcgcccagac gcaccgcctc gaaaaggcgc
tgcgggaaga aaaatccggg 720gccgagccgg aggtggccga ggccgccaac
ggatccgcca tgggcatcgt gggcgatagc 780ccgctggtga aacgcctgat
cgcgaccgcg caagtggtcg cccgctcaaa ctccaccgtg 840ctgctgcgcg
gggagagcgg caccggcaag gagttgttcg cccgtgccat ccacgaactg
900tcgccccgca agggcaagcc cttcgtgaag gtgaactgcg ccgccctccc
ggaatcggtg 960ctggaatcgg aactgttcgg ccatgagaag ggcgccttca
ccggtgcgct gaacatgcgc 1020cagggccgct tcgagctggc gcacggcggc
acgctcttcc ttgacgagat cggcgagatc 1080acccccgctt tccaggccaa
gctgctgcgc gtgctgcagg aaggcgagtt cgagcgggtc 1140ggcggcaatc
gcacgctgaa ggtggatgtg cggctcgtgt gcgccaccaa caagaatctg
1200gaagaggcgg tctccaaggg cgagttccgg gccgatctct actaccgcat
ccatgtggtg 1260ccgctgatcc tgccgccgct gcgcgaacgg ccgggcgaca
ttcccaagct cgcgaagaac 1320ttcctcgacc gcttcaacaa ggaaaacaag
ctccacatga tgctctcggc gccggccatc 1380gacgtgctgc ggcgctgcta
tttcccgggc aacgtgcgcg agctggagaa ctgtatccgg 1440cggacggcaa
cgctcgccca cgatgccgtc atcacccccc atgacttcgc ctgcgacagc
1500ggccagtgcc tctcggccat gctctggaag ggctcggccc cgaagcctgt
gatgccgcac 1560gtgccgccgg cgcccacgcc gctgactccg ctctcccctg
ctccgctcgc gaccgcagcg 1620cccgctgcgg cgagcccggc gccggcggcc
gacagcctgc cggtcacttg ccccggcacc 1680gaggcctgtc ccgcggtgcc
cccccgccag agcgaaaagg agcagttgct ccaggccatg 1740gagcgctccg
gctgggtgca ggcgaaggcc gcgcgcctcc tcaacctcac gccgcgccag
1800gtgggttatg cgctgcgcaa atatgacatc gacatcaagc gcttctga
184814947DNAArtificial SequenceSynthetic polynucleotide
149taggtgttga cggctagctc agtcctaggt acagtgctag ctctaga
4715047DNAArtificial SequenceSynthetic polynucleotide 150taggtgttta
cagctagctc agtcctaggt attatgctag ctctaga 4715147DNAArtificial
SequenceSynthetic polynucleotide 151taggtgttga cagctagctc
agtcctaggt actgtgctag ctctaga 4715247DNAArtificial
SequenceSynthetic polynucleotide 152taggtgctga tagctagctc
agtcctaggg attatgctag ctctaga 4715347DNAArtificial
SequenceSynthetic polynucleotide 153taggtgttga cagctagctc
agtcctaggt attgtgctag ctctaga 4715447DNAArtificial
SequenceSynthetic polynucleotide 154taggtgttta cggctagctc
agtcctaggt actatgctag ctctaga 4715547DNAArtificial
SequenceSynthetic polynucleotide 155taggtgttta cggctagctc
agtcctaggt atagtgctag ctctaga 4715647DNAArtificial
SequenceSynthetic polynucleotide 156taggtgttta cggctagctc
agccctaggt attatgctag ctctaga 4715747DNAArtificial
SequenceSynthetic polynucleotide 157taggtgctga cagctagctc
agtcctaggt ataatgctag ctctaga 4715847DNAArtificial
SequenceSynthetic polynucleotide 158taggtgttta cagctagctc
agtcctaggg actgtgctag ctctaga 4715947DNAArtificial
SequenceSynthetic polynucleotide 159taggtgttta cggctagctc
agtcctaggt acaatgctag ctctaga 4716047DNAArtificial
SequenceSynthetic polynucleotide 160taggtgttga cggctagctc
agtcctaggt atagtgctag ctctaga 4716147DNAArtificial
SequenceSynthetic polynucleotide 161taggtgctga tagctagctc
agtcctaggg attatgctag ctctaga 4716247DNAArtificial
SequenceSynthetic polynucleotide 162taggtgctga tggctagctc
agtcctaggg attatgctag ctctaga 4716347DNAArtificial
SequenceSynthetic polynucleotide 163taggtgttta tggctagctc
agtcctaggt acaatgctag ctctaga 4716447DNAArtificial
SequenceSynthetic polynucleotide 164taggtgttta tagctagctc
agcccttggt acaatgctag ctctaga 4716547DNAArtificial
SequenceSynthetic polynucleotide 165taggtgttga cagctagctc
agtcctaggg actatgctag ctctaga 4716647DNAArtificial
SequenceSynthetic polynucleotide 166taggtgttga cagctagctc
agtcctaggg attgtgctag ctctaga 4716747DNAArtificial
SequenceSynthetic polynucleotide 167taggtgttga cggctagctc
agtcctaggt attgtgctag ctctaga 4716847DNAArtificial
SequenceSynthetic polynucleotide 168taggtgttga cagctagctc
agtcctaggt ataatgctag ctctaga 4716947DNAArtificial
SequenceSynthetic polynucleotide 169taggtgttga cattattcca
tcgaactagt taactagtac gaaagtt 4717048DNAArtificial
SequenceSynthetic polynucleotide 170tagcataacc ccttggggcc
tctaaacggg tcttgagggg ttttttgt 4817148DNAArtificial
SequenceSynthetic polynucleotide 171tactcgaacc cctagcccgc
tcttatcggg cggctagggg ttttttgt 4817229DNAArtificial
SequenceSynthetic polynucleotide 172tacatatcgg gggggtaggg gttttttgt
2917380DNAArtificial SequenceSynthetic polynucleotide 173ccaggcatca
aataaaacga aaggctcagt cgaaagactg ggcctttcgt tttatctgtt 60gtttgtcggt
gaacgctctc 8017461DNAArtificial SequenceSynthetic polynucleotide
174ctcggtacca aattccagaa aagaggcctc ccgaaagggg ggcctttttt
cgttttggtc 60c 61175146DNAArtificial SequenceSynthetic
polynucleotide 175ctcggtacca aattccagaa aagacacccg aaagggtgtt
ttttcgtttt ggtcctcctt 60ggccctccat ccttagatag cagataaaaa aaatccttag
ctttcgctaa ggatgatttc 120ttcataggca atacgatcgc atgtcc
146176129DNAArtificial SequenceSynthetic polynucleotide
176ccaggcatca aataaaacga aaggctcagt cgaaagactg ggcctttcgt
tttatctgtt 60gtttgtcggt gaacgctctc tactagagtc acactggctc accttcgggt
gggcctttct 120gcgtttata 129177129DNAArtificial SequenceSynthetic
polynucleotide 177ccaggcatca aataaaacga aaggctcagt cgaaagactg
ggcctttcgt tttatctgtt 60gtttgtcggt gaacgctctc tactagagtc acactggctc
accttcgggt gggcctttct 120gcgtttata 129178168DNAArtificial
SequenceSynthetic polynucleotide 178ggtcttgtcc actaccttgc
agtaatgcgg tggacaggat cggcggtttt cttttctctt 60ctcaatgact gaatagaaaa
gacgaacatt aacgcatgag aaagcccccg gaagatcacc 120ttccgggggc
ttttttattg cgctacaaat gaaagtacat agaaatta 168179146DNAArtificial
SequenceSynthetic polynucleotide 179cagataaaaa aaatccttag
ctttcgctaa ggatgatttc ttccttggcc ctccatcctt 60agatagctcg gtaccaaatt
ccagaaaaga cacccgaaag ggtgtttttt cgttttggtc 120ctcataggca
atacgatcgc atgtcc 146180128DNAArtificial SequenceSynthetic
polynucleotide 180ccaggcatca aataaaacga aaggctcagt cgaaagactg
ggcctttcgt tttatctgtt 60gtttgtcggt gaacgctctc ctagcataac cccttggggc
ctctaaacgg gtcttgaggg 120gttttttg 128181152DNAArtificial
SequenceSynthetic polynucleotide 181ctcggtacca aattccagaa
aagagacgct ttcgagcgtc ttttttcgtt ttggtcctcc 60ttggccctcc atccttagat
agagttaacc aaaaaggggg gattttatct cccctttaat 120ttttccttca
taggcaatac gatcgcatgt cc 152182144DNAArtificial SequenceSynthetic
polynucleotide 182cgcagatagc aaaaaagcgc ctttagggcg cttttttaca
ttggtggtcc ttggccctcc 60atccttagat agaggcgact gacgaaacct cgctccggcg
gggttttttg ttatctgcat 120cataggcaat acgatcgcat gtcc
14418382DNAArtificial SequenceSynthetic polynucleotide
183tcggtcagtt tcacctgatt tacgtaaaaa cccgcttcgg cgggtttttg
cttttggagg 60ggcagaaaga tgaatgactg tc 82184103DNAArtificial
SequenceSynthetic polynucleotide 184gcccccggaa gatcaccttc
cgggggcttt tttattggcg gccggctgat tgatcaggcg 60gccggctgat tggcgcgtta
cctggtagcg cgccattttg ttt 10318583DNAArtificial SequenceSynthetic
polynucleotide 185gtaatcgtta atccgcaaat aacgtaaaaa cccgcttcgg
cgggtttttt tatgggggga 60gtttagggaa agagcatttg tca
8318641DNAArtificial SequenceSynthetic polynucleotide 186aaaaaaaaac
cccgcccctg acagggcggg gttttttttt t 41187152DNAArtificial
SequenceSynthetic polynucleotide 187tccggcaatt aaaaaagcgg
ctaaccacgc cgcttttttt acgtctgcat gactgaatag 60aaaagacgaa cattaacgca
tgagaaagcc cccggaagat caccttccgg gggctttttt 120attgcgctcc
ttggccctcc atccttagat ag 152188167DNAArtificial SequenceSynthetic
polynucleotide 188ggaagaccat actggaaaca cagaaaaaag cccgcacctg
acagtgcggg cttttttttt 60cgaccaaagg tgactgaata gaaaagacga acattcgcag
atagcaaaaa agcgccttta 120gggcgctttt ttacattggt ggtcataggc
aatacgatcg catgtcc 167189160DNAArtificial SequenceSynthetic
polynucleotide 189tccggcaatt aaaaaagcgg ctaaccacgc cgcttttttt
acgtctgcat ccttggccct 60ccatccttag atagctcggt accaaattcc agaaaagagg
cctcccgaaa ggggggcctt 120ttttcgtttt ggtcctcata ggcaatacga
tcgcatgtcc 160190199DNAArtificial SequenceSynthetic polynucleotide
190ttcagccaaa aaacttaaga ccgccggtct tgtccactac cttgcagtaa
tgcggtggac 60aggatcggcg gttttctttt ctcttctcaa tacatgaaag tacatagaaa
ttactcggta 120ccaaattcca gaaaagaggc ctcccgaaag gggggccttt
tttcgttttg gtcctcatag 180gcaatacgat cgcatgtcc
199191189DNAArtificial SequenceSynthetic polynucleotide
191ttcagccaaa aaacttaaga ccgccggtct tgtccactac cttgcagtaa
tgcggtggac 60aggatcggcg gttttctttt ctcttctcaa tccttggccc tccatcctta
gatagtccgg 120caattaaaaa agcggctaac cacgccgctt tttttacgtc
tgcatcatag gcaatacgat 180cgcatgtcc 189192164DNAArtificial
SequenceSynthetic polynucleotide 192ctcggtacca aattccagaa
aagaggcctc ccgaaagggg ggcctttttt cgttttggtc 60ctgactgaat agaaaagacg
aacattaacg catgagaaag cccccggaag atcaccttcc 120gggggctttt
ttattgcgct ccttggccct ccatccttag atag 164193159DNAArtificial
SequenceSynthetic polynucleotide 193ctcggtacca aattccagaa
aagaggcctc ccgaaagggg ggcctttttt cgttttggtc 60ctccttggcc ctccatcctt
agatgtccgg caattaaaaa agcggctaac cacgccgctt 120tttttacgtc
tgcatcatag gcaatacgat cgcatgtcc 159194197DNAArtificial
SequenceSynthetic polynucleotide 194ctcggtacca aagacgaaca
ataagacgct gaaaagcgtc ttttttcgtt ttggtcctac 60aaatgaaagt acatagaaat
tattcagcca aaaaacttaa gaccgccggt cttgtccact 120accttgcagt
aatgcggtgg acaggatcgg cggttttctt ttctcttctc aatccttggc
180cctccatcct tagatag 197195121DNAArtificial SequenceSynthetic
polynucleotide 195gggaactgcc agacatcaaa taaaacaaaa ggctcagtcg
gaagactggg ccttttgttt 60tatctgttgt ttgtcggtga acactctccc gactagtagc
ggccgctgca gaaagaggag 120a 121196152DNAArtificial SequenceSynthetic
polynucleotide 196aacgcatgag aaagcccccg gaagatcacc ttccgggggc
ttttttattg cgctcatagg 60caatacgatc gcatgtcctc cggcaattaa aaaagcggct
aaccacgccg ctttttttac 120gtctgcatcc ttggccctcc atccttagat ag
15219791DNAArtificial SequenceSynthetic polynucleotide
197gggaactgcc agacatcaaa taaaacaaaa ggctcagtcg gaagactggg
ccttttgttt 60tatctgttgt ttgtcggtga acactctccc g
91198195DNAArtificial SequenceSynthetic polynucleotide
198aaagcaagct gataaaccga tacaattaaa ggctcctttt ggagcctttt
tttttggaga 60ttttcaacat gaaaaaatta ttatttgatg atcagatagc ggcggggaac
tgccagacat 120caaataaaac aaaaggctca gtcggaagac tgggcctttt
gttttatctg ttgtttgtcg 180gtgaacactc tcccg 195199164DNAArtificial
SequenceSynthetic polynucleotide 199aacgcatgag aaagcccccg
gaagatcacc ttccgggggc ttttttattg cgctccttgg 60ccctccatcc ttagatagct
cggtaccaaa ttccagaaaa gaggcctccc gaaagggggg 120ccttttttcg
ttttggtcct cataggcaat acgatcgcat gtcc 164200193DNAArtificial
SequenceSynthetic polynucleotide 200aacgcatgag aaagcccccg
gaagatcacc ttccgggggc ttttttattg cgctccttgg 60ccctccatcc ttagatagtt
cagccaaaaa acttaagacc gccggtcttg tccactacct 120tgcagtaatg
cggtggacag gatcggcggt tttcttttct cttctcaatc ataggcaata
180cgatcgcatg tcc 19320167DNAArtificial SequenceSynthetic
polynucleotide 201cctaggacct gtaggatcgt acaggtttac gcaagaaaat
ggtttgttac tttcgaataa 60atctaga 6720268DNAArtificial
SequenceSynthetic polynucleotide 202cggtggaatc cctatcagtg
atagagattg acatccctat cagtgataga tataatgagc 60actctaga
6820390DNAArtificial SequenceSynthetic polynucleotide 203aacaaacaga
caatctggtc tgtttgtatt atggaaaatt tttctgtata atagattcaa 60caaacagaca
atctggtctg tttgtattat 9020458DNAArtificial SequenceSynthetic
polynucleotide 204aaaaagagtt tgacatgata cgaaacgtac cgtatcgtta
aggttactag agtctaga 58205116DNAArtificial SequenceSynthetic
polynucleotide 205ggggcctcgc ttgggttatt gctggtgccc ggccgggcgc
aatattcatg ttgatgattt 60attatatatc gagtggtgta tttatttata ttgtttgctc
cgttaccgtt attaac 116206753DNAArtificial SequenceSynthetic
polynucleotide 206atgaaaaaca taaatgccga cgacacatac agaataatta
ataaaattaa agcttgtaga 60agcaataatg atattaatca atgcttatct gatatgacta
aaatggtaca ttgtgaatat 120tatttactcg cgatcattta tcctcattct
atggttaaat ctgatatttc aatcctagat 180aattacccta aaaaatggag
gcaatattat gatgacgcta atttaataaa atatgatcct 240atagtagatt
attctaactc caatcattca ccaattaatt ggaatatatt tgaaaacaat
300gctgtaaata aaaaatctcc aaatgtaatt aaagaagcga aaacatcagg
tcttatcact 360gggtttagtt tccctattca tacggctaac aatggcttcg
gaatgcttag ttttgcacat 420tcagaaaaag acaactatat agatagttta
tttttacatg cgtgtatgaa cataccatta 480attgttcctt ctctagttga
taattatcga aaaataaata tagcaaataa taaatcaaac 540aacgatttaa
ccaaaagaga aaaagaatgt ttagcgtggg catgcgaagg aaaaagctct
600tgggatattt caaaaatatt aggttgcagt gagcgtactg tcactttcca
tttaaccaat 660gcgcaaatga aactcaatac aacaaaccgc tgccaaagta
tttctaaagc aattttaaca 720ggagcaattg attgcccata ctttaaaaat taa
753207624DNAArtificial SequenceSynthetic polynucleotide
207atgtccagat tagataaaag taaagtgatt aacagcgcat tagagctgct
taatgaggtc 60ggaatcgaag
gtttaacaac ccgtaaactc gcccagaagc taggtgtaga gcagcctaca
120ttgtattggc atgtaaaaaa taagcgggct ttgctcgacg ccttagccat
tgagatgtta 180gataggcacc atactcactt ttgcccttta gaaggggaaa
gctggcaaga ttttttacgt 240aataacgcta aaagttttag atgtgcttta
ctaagtcatc gcgatggagc aaaagtacat 300ttaggtacac ggcctacaga
aaaacagtat gaaactctcg aaaatcaatt agccttttta 360tgccaacaag
gtttttcact agagaatgca ttatatgcac tcagcgctgt ggggcatttt
420actttaggtt gcgtattgga agatcaagag catcaagtcg ctaaagaaga
aagggaaaca 480cctactactg atagtatgcc gccattatta cgacaagcta
tcgaattatt tgatcaccaa 540ggtgcagagc cagccttctt attcggcctt
gaattgatca tatgcggatt agaaaaacaa 600cttaaatgtg aaagtgggtc ctaa
624208612DNAArtificial SequenceSynthetic polynucleotide
208atgagcccga aacgtcgtac ccaggcagaa cgtgcaatgg aaacccaggg
taaactgatt 60gcagcagcac tgggtgttct gcgtgaaaaa ggttatgcag gttttcgtat
tgcagatgtt 120ccgggtgcag ccggtgttag ccgtggtgca cagagccatc
attttccgac caaactggaa 180ctgctgctgg caacctttga atggctgtat
gagcagatta ccgaacgtag ccgtgcacgt 240ctggcaaaac tgaaaccgga
agatgatgtt attcagcaga tgctggatga tgcagcagaa 300ttttttctgg
atgatgattt tagcatcagc ctggatctga ttgttgcagc agatcgtgat
360ccggcactgc gtgaaggtat tcagcgtacc gttgaacgta atcgttttgt
tgttgaagat 420atgtggctgg gtgtgctggt gagccgtggt ctgagccgtg
atgatgccga agatattctg 480tggctgattt ttaacagcgt tcgtggtctg
gcagttcgta gcctgtggca gaaagataaa 540gaacgttttg aacgtgtgcg
taatagcacc ctggaaattg cacgtgaacg ttatgcaaaa 600ttcaaacgtt ga
612209603DNAArtificial SequenceSynthetic polynucleotide
209atggcacgta ccccgagccg tagcagcatt ggtagcctgc gtagtccgca
tacccataaa 60gcaattctga ccagcaccat tgaaatcctg aaagaatgtg gttatagcgg
tctgagcatt 120gaaagcgttg cacgtcgtgc cggtgcaagc aaaccgacca
tttatcgttg gtggaccaat 180aaagcagcac tgattgccga agtgtatgaa
aatgaaagcg aacaggtgcg taaatttccg 240gatctgggta gctttaaagc
cgatctggat tttctgctgc gtaatctgtg gaaagtttgg 300cgtgaaacca
tttgtggtga agcatttcgt tgtgttattg cagaagcaca gctggaccct
360gcaaccctga cccagctgaa agatcagttt atggaacgtc gtcgtgagat
gccgaaaaaa 420ctggttgaaa atgccattag caatggtgaa ctgccgaaag
ataccaatcg tgaactgctg 480ctggatatga tttttggttt ttgttggtat
cgcctgctga ccgaacagct gaccgttgaa 540caggatattg aagaatttac
cttcctgctg attaatggtg tttgtccggg tacacagcgt 600taa
603210906DNAArtificial SequenceSynthetic polynucleotide
210atggaactgc gtgacctgga tttaaacctg ctggtggtgt tcaaccagtt
gctggtcgac 60agacgcgtct ctgtcactgc ggagaacctg ggcctgaccc agcctgccgt
gagcaatgcg 120ctgaaacgcc tgcgcacctc gctacaggac ccactcttcg
tgcgcacaca tcagggaatg 180gaacccacac cctatgccgc gcatctggcc
gagcacgtca cttcggccat gcacgcactg 240cgcaacgccc tacagcacca
tgaaagcttc gatccgctga ccagcgagcg taccttcacc 300ctggccatga
ccgacattgg cgagatctac ttcatgccgc ggctgatgga tgcgctggct
360caccaggccc ccaattgcgt gatcagtacg gtgcgcgaca gttcgatgag
cctgatgcag 420gccttgcaga acggaaccgt ggacttggcc gtgggcctgc
ttcccaatct gcaaactggc 480ttctttcagc gccggctgct ccagaatcac
tacgtgtgcc tatgtcgcaa ggaccatcca 540gtcacccgcg aacccctgac
tctggagcgc ttctgttcct acggccacgt gcgtgtcatc 600gccgctggca
ccggccacgg cgaggtggac acgtacatga cacgggtcgg catccggcgc
660gacatccgtc tggaagtgcc gcacttcgcc gccgttggcc acatcctcca
gcgcaccgat 720ctgctcgcca ctgtgccgat atgtttagcc gactgctgcg
tagagccctt cggcctaagc 780gccttgccgc acccagtcgt cttgcctgaa
atagccatca acatgttctg gcatgcgaag 840taccacaagg acctagccaa
tatttggttg cggcaactga tgtttgacct gtttacggat 900tgataa
90621190DNAArtificial SequenceSynthetic polynucleotide
211tcaatgtatt gatgccgtcc atatcatgaa tcaaaacaat ccatttgatc
aatatcaagc 60tcactcttaa gcttcactca tccgctgcat 90212930DNAArtificial
SequenceSynthetic polynucleotide 212atgcgtttca acaagctcga
cctcaatctt ctggtcgccc tggatgcact gctcacggag 60atgagcatca gccgcgccgc
cgaaaagatc catctgagcc agtcggccat gagcaatgcc 120ctggcgcggc
tgcgcgagta tttcgatgat gaattgctga tccaggtggg ccggcgcatg
180gagcccacgc cgcgcgccga ggtgctcaag gatgcggtgc atgatgtgct
gcggcgtatc 240gatggctcca tcgcggcgct gccggccttc gtgccggccg
agtccacgcg cgagtttcgc 300atctcggttt cggactttac gctctccgtc
ctcatccccc gggtgctggc gcgcgcgcac 360gccgagggca agcacatccg
ctttgccctg atgccgcagg tgcaagaccc gacccgctcg 420ctggatcggg
ccgaggtgga cctgctggtc ttgccgcagg aattctgcac gcccgatcat
480cctgccgaag aggtcttccg cgaacggcat gtctgcgtgg tctggcgcga
cagtgcgctg 540gcgcaaggcg agctgacgct ggaacgctac atggcctcag
gccatgtggt gatggtgccg 600cctggggcca atgcgtcgtc ggtggaggcg
tggatggcca ggaagctggg ctttgcgcgc 660cgggtggaag tgaccagctt
cagcttcgct tctgcgctgg cgctggtaca ggggacggac 720cgcatcgcca
cggtgcatgc ccggctggcg cagctgctgg ctccgcaatg gccggtggtg
780atcaaggaga gtccgctgtc gctgggcgag atgcggcaga tgatgcagtg
gcatcgctac 840cgcagcaatg atcctggcat ccagtggctg cgtcgggtgt
ttctggagag tgcgcaggag 900atggatgcgg cgctgccagg catctgctga
930213286DNAArtificial SequenceSynthetic polynucleotide
213cagacattgc cgtcactgcg tcttttactg gctcttctcg ctaaccaaac
cggtaacccc 60gcttattaaa agcattctgt aacaaagcgg gaccaaagcc atgacaaaaa
cgcgtaacaa 120aagtgtctat aatcacggca gaaaagtcca cattgattat
ttgcacggcg tcacactttg 180ctatgccata gcatttttat ccataagatt
agcggatcct acctgacgct ttttatcgca 240actctctata ttttctccat
acccgttttt ttgggctagc gaattc 286214930DNAArtificial
SequenceSynthetic polynucleotide 214atgcaatatg gacaattggt
ttcttctctg aatggcggga gtatgaaaag tatggctgaa 60gcgcaaaatg atcccctgct
gccgggatac tcgtttaatg cccatctggt ggcgggttta 120acgccgattg
aggccaacgg ttatctcgat ttttttatcg accgaccgct gggaatgaaa
180ggttatattc tcaatctcac cattcgcggt cagggggtgg tgaaaaatca
gggacgagaa 240tttgtttgcc gaccgggtga tattttgctg ttcccgccag
gagagattca tcactacggt 300cgtcatccgg aggctcgcga atggtatcac
cagtgggttt actttcgtcc gcgcgcctac 360tggcatgaat ggcttaactg
gccgtcaata tttgccaata cggggttctt tcgcccggat 420gaagcgcacc
agccgcattt cagcgacctg tttgggcaaa tcattaacgc cgggcaaggg
480gaagggcgct attcggagct gctggcgata aatctgcttg agcaattgtt
actgcggcgc 540atggaagcga ttaacgagtc gctccatcca ccgatggata
atcgggtacg cgaggcttgt 600cagtacatca gcgatcacct ggcagacagc
aattttgata tcgccagcgt cgcacagcat 660gtttgcttgt cgccgtcgcg
tctgtcacat cttttccgcc agcagttagg gattagcgtc 720ttaagctggc
gcgaggacca acgtatcagc caggcgaagc tgcttttgag caccacccgg
780atgcctatcg ccaccgtcgg tcgcaatgtt ggttttgacg atcaactcta
tttctcgcgg 840gtatttaaaa aatgcaccgg ggccagcccg agcgagttcc
gtgccggttg tgaagaaaaa 900gtgaatgatg tagccgtcaa gttgtcataa
9302151419DNAArtificial SequenceSynthetic polynucleotide
215atggttacta tcaatacgga atctgcttta acgccacgtt ctttgcggga
tacgcggcgt 60atgaatatgt ttgtttcggt agctgctgcg gtcgcaggat tgttatttgg
tcttgatatc 120ggcgtaatcg ccggagcgtt gccgttcatt accgatcact
ttgtgctgac cagtcgtttg 180caggaatggg tggttagtag catgatgctc
ggtgcagcaa ttggtgcgct gtttaatggt 240tggctgtcgt tccgcctggg
gcgtaaatac agcctgatgg cgggggccat cctgtttgta 300ctcggttcta
tagggtccgc ttttgcgacc agcgtagaga tgttaatcgc cgctcgtgtg
360gtgctgggca ttgctgtcgg gatcgcgtct tacaccgctc ctctgtatct
ttctgaaatg 420gcaagtgaaa acgttcgcgg taagatgatc agtatgtacc
agttgatggt cacactcggc 480atcgtgctgg cgtttttatc cgatacagcg
ttcagttata gcggtaactg gcgcgcaatg 540ttgggggttc ttgctttacc
agcagttctg ctgattattc tggtagtctt cctgccaaat 600agcccgcgct
ggctggcgga aaaggggcgt catattgagg cggaagaagt attgcgtatg
660ctgcgcgata cgtcggaaaa agcgcgagaa gaactcaacg aaattcgtga
aagcctgaag 720ttaaaacagg gcggttgggc actgtttaag atcaaccgta
acgtccgtcg tgctgtgttt 780ctcggtatgt tgttgcaggc gatgcagcag
tttaccggta tgaacatcat catgtactac 840gcgccgcgta tcttcaaaat
ggcgggcttt acgaccacag aacaacagat gattgcgact 900ctggtcgtag
ggctgacctt tatgttcgcc acctttattg cggtgtttac ggtagataaa
960gcagggcgta aaccggctct gaaaattggt ttcagcgtga tggcgttagg
cactctggtg 1020ctgggctatt gcctgatgca gtttgataac ggtacggctt
ccagtggctt gtcctggctc 1080tctgttggca tgacgatgat gtgtattgcc
ggttatgcga tgagcgccgc gccagtggtg 1140tggatcctgt gctctgaaat
tcagccgctg aaatgccgcg atttcggtat tacctgttcg 1200accaccacga
actgggtgtc gaatatgatt atcggcgcga ccttcctgac actgcttgat
1260agcattggcg ctgccggtac gttctggctc tacactgcgc tgaacattgc
gtttgtgggc 1320attactttct ggctcattcc ggaaaccaaa aatgtcacgc
tggaacatat cgaacgcaaa 1380ctgatggcag gcgagaagtt gagaaatatc
ggcgtctga 141921675DNAArtificial SequenceSynthetic polynucleotide
216ctcgagtgtt gacaattaat catcggctcg tataatgtgt ggaattgtga
gcgctcacaa 60tttcacacat ctaga 75217221DNAArtificial
SequenceSynthetic polynucleotide 217aacaaataca catgggcgca
tgcctattac tgcccttgcg atatggaagg caagctttta 60gtaacaatag aaaactgggt
cctactctcg aagaatgcac tgcggcggtc acgtcaacac 120gtgctgcacc
gttgagaatg aatgctgggc agattgccag cggcgtcatt ttcggctgtc
180ccgtcctcac ggttttgcgc tgcatcgcaa gagattggga a
221218849DNAArtificial SequenceSynthetic polynucleotide
218atgacgtcag cagcgaatct ggtgaggatc acgcagcccg cgatcagccg
gctgatcagg 60gatctcgaag aggaaattgg gatcagcctc ttcgaaagaa cgggcaaccg
gttacgtcct 120acgcgggagg ccggtattct gttcaaggaa gtgtcgcgac
atttcaacgg gattcagcac 180atcgacaaag tcgcggctga actgaagaag
tctcatatgg ggtccctaag ggtcgcctgt 240tatacagcgc cagctctgag
ttttatgtcc ggcgtcattc agacgttcat cgccgatcgg 300cccgacgtgt
cggtctacct cgacacagtt ccttcccaga cggtcctcga attggtctcg
360ctccagcact acgatctcgg aatatcgata ttggctggcg actatcctgg
tctcaccacc 420gaacctgtcc cttcctttcg tgcggtctgc ctgctgccgc
cggggcatcg tctcgaagac 480aaggaaactg ttcatgcgac ggaccttgaa
ggagagtcat tgatttgcct ctctccagtg 540agccttctac ggatgcaaac
ggacgccgca ctggacagct gcggcgtcca ctgtaatcgc 600aggatagaaa
gtagtctggc gctgaatctc tgcgatctgg taagcagggg aatgggggtt
660ggtatcgtcg accccttcac tgccgactac tacagtgcaa atccggttat
tcagcgctcc 720tttgatccgg ttgtccccta ccattttgct atagttcttc
cgaccgacag cccaccgccg 780cgcttggtta gcgagttccg ggcagcgttg
cttgatgctt tgaaagcctt gccctatgaa 840accatttga
849219119DNAArtificial SequenceSynthetic polynucleotide
219aaacgcacca taacatctgc ttattcttgc ccggtcatta tgaatttgac
cgaatgcata 60tcgaatgtaa agctcaccct ataaatcaca actcttccgg gccaaccggg
atcagacgt 119220897DNAArtificial SequenceSynthetic polynucleotide
220atgaatctca ggcaggtcga ggcgttccgg gcagtcatgc tgacggggca
aatgacggcg 60gcggctgaac taatgctggt gactcagccg gccatcagtc gcctaatcaa
ggactttgaa 120caggcgacaa aactgcagct cttcgagagg cgtgggaacc
atattatccc gacacaggag 180gcaaagacgc tgtggaaaga ggtcgatcgg
gcgttcgtcg ggcttaatca tataggcaac 240ctggctgccg acatcggcag
gcaggcagcg gggacgctcc gcattgctgc aatgcctgct 300ctggcaaacg
gcctcttgcc gcggtttctt gctcagttca tccgtgacag accaaatctc
360caggtctccc taatgggact gccctcaagc atggtcatgg aagccgttgc
gtccggcagg 420gccgacatcg gttatgccga tggcccacag gagcgccaag
gttttctaat cgaaacccgg 480tcgcttcccg ctgttgtcgc tgtcccgatg
ggacatcgac ttgctggcct tgaccgtgtc 540acgccacagg accttgccgg
tgagcgtatt ataaaacagg agactggcac tctcttcgcc 600atgcgggtag
aggtggcgat tggtggtatt caacgccggc cgtcaattga agtgagcctg
660tcgcatactg cgctaagtct cgtccgcgaa ggcgccggga tcgcaattat
cgatccagcc 720gcggcgatcg agttcacgga caggatcgta ctgcgaccgt
tctcgatctt cattgacgcc 780ggattcctcg aagtccggtc agcaattggc
gctccctcaa ccatcgtcga tcgtttcaca 840accgaattct ggaggtttca
tgatgacttg atgaagcaga acggcctaat ggagtaa 89722163DNAArtificial
SequenceSynthetic polynucleotide 221agcgcgggtg agagggattc
gttaccaata gacaattgat tggacgttca atataatgct 60agc
63222226DNAArtificial SequenceSynthetic polynucleotide
222ccctttgtgc gtccaaacgg acgcacggcg ctctaaagcg ggtcgcgatc
tttcagattc 60gctcctcgcg ctttcagtct ttgttttggc gcatgtcgtt atcgcaaaac
cgctgcacac 120ttttgcgcga catgctctga tccccctcat ctgggggggc
ctatctgagg gaatttccga 180tccggctcgc ctgaaccatt ctgctttcca
cgaacttgaa aacgct 22622393DNAArtificial SequenceSynthetic
polynucleotide 223ttttgttcga ttatcgaaca aattattgaa atatcgaaca
aaacctctaa actactgtgg 60cactgaatca aaaaattata aaccctgatc aga
9322451DNAArtificial SequenceSynthetic polynucleotide 224cacccagcag
tatttacaaa caaccatgaa tgtaagtata ttccttagca a 5122550DNAArtificial
SequenceSynthetic polynucleotide 225attggatcca attgacagct
agctcagtcc taggtaccat tggatccaat 5022654DNAUnknownR. sp. IRBG74
226atttcacaca tctagagcta atcatctcgt actaaagagg agaaattaac catg
5422749DNAUnknownR. sp. IRBG74 227atttcacaca tctagagcta atcatcgcgt
actcaggagg caagtaatg 4922840DNAUnknownR. sp. IRBG74 228atttcacaca
tctagaatta aagaggagaa attaaccatg 4022954DNAUnknownR. sp. IRBG74
229taacaatttc acacatctag agctaatcat ctcgtactaa agaggcaagt aatg
5423054DNAUnknownR. sp. IRBG74 230taacaatttc acacatctag agctaatcat
cgcgtactaa ggaggcaagt aatg 5423154DNAUnknownR. sp. IRBG74
231taacaatttc acacatctag agctaatcat cgcgtactca agaggcaagt aatg
5423254DNAUnknownR. sp. IRBG74 232taacaatttc acacatctag agctaatctt
cgcgtactaa agaggcaagt aatg 5423354DNAUnknownR. sp. IRBG74
233taacaatttc acacatctag agctaatcat ctcgtactca ggaggcaagt aatg
5423454DNAUnknownR. sp. IRBG74 234taacaatttc acacatctag agctaatcat
ctcgtactaa tgaggcaagt aatg 5423554DNAUnknownR. sp. IRBG74
235taacaatttc acacatctag agctaatcat cgcgtactaa tgaggcaagt aatg
5423654DNAUnknownR. sp. IRBG74 236taacaatttc acacatctag agctaatcat
cgcgtactca cgaggcaagt aatg 5423754DNAUnknownR. sp. IRBG74
237taacaatttc acacatctag agctaatcat cgcgtactaa aaaggcaagt aatg
5423854DNAUnknownR. sp. IRBG74 238taacaatttc acacatctag agctaatctt
cgcgtactaa aaaggcaagt aatg 5423954DNAUnknownR. sp. IRBG74
239taacaatttc acacatctag agctaatctt cgcgtactaa gaaggcaagt aatg
54
SEQ ID NO 240 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat ctcgtactaa ataggcaagt aatg 54
SEQ ID NO 241 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat ctcgtactaa taaggcaagt aatg 54
SEQ ID NO 242 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatctt ctcgtactaa agaggcaagt aatg 54
SEQ ID NO 243 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat cgcgtactca ataggccagt aatg 54
SEQ ID NO 244 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat cgcgtactaa gtaggcaagt aatg 54
SEQ ID NO 245 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat ctcgtactaa cgaggcaagt aatg 54
SEQ ID NO 246 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat cgcgtactca gcaggcaagt aatg 54
SEQ ID NO 247 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatctt cgcgtactaa gtaggcaagt aatg 54
SEQ ID NO 248 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatctt cgcgtactaa ttaggcaagt aatg 54
SEQ ID NO 249 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatctt ctcgtactaa caaggcaagt aatg 54
SEQ ID NO 250 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat ctcgtactca ataggcaagt aatg 54
SEQ ID NO 251 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat ctcgtactaa gcacgcaagt aatg 54
SEQ ID NO 252 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatcat cgcgtactaa ctacgcaagt aatg 54
SEQ ID NO 253 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatctt cgcgtactaa gaacgcaagt aatg 54
SEQ ID NO 254 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatctt cgcgtactaa aaacgcaagt aatg 54
SEQ ID NO 255 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatctt cgcgtactaa caacgcaagt aatg 54
SEQ ID NO 256 | 54 | DNA | Unknown | R. sp. IRBG74
taacaatttc acacatctag agctaatctt ctcgtactca tgacgcaagt aatg 54
SEQ ID NO 257 | 40 | DNA | Unknown | R. sp. IRBG74
atttcacaca tctagaatta aagagaagaa attaaccatg 40
SEQ ID NO 258 | 29 | DNA | Unknown | R. sp. IRBG74
ctagtgcgaa ctagctcata ccgcagatg 29
SEQ ID NO 259 | 42 | DNA | P. protegens
ctagcgcagg tccaacgttt ttctaagcaa ggaggtcata tg 42
SEQ ID NO 260 | 42 | DNA | P. protegens
ctagcgaagg tccaacgttt ttctaagcaa ggaggtcata tg 42
SEQ ID NO 261 | 42 | DNA | P. protegens
ctagcgaagg tccaacgttt ttctaagcca ggaggtcata tg 42
SEQ ID NO 262 | 42 | DNA | P. protegens
ctagcgcagg tccaacgttt ttctaagcca ggaggtcata tg 42
SEQ ID NO 263 | 42 | DNA | P. protegens
ctagcgaagc tccaacgttt ttctaagcaa ggaggtcata tg 42
SEQ ID NO 264 | 33 | DNA | P. protegens
gaattctaca ctaacggaca ggagggtccg atg 33
SEQ ID NO 265 | 33 | DNA | P. protegens
gaattctaaa ctaacggaca ggagggtccg atg 33
SEQ ID NO 266 | 33 | DNA | P. protegens
gaattctaag ctaacggaca ggagggtccg atg 33
SEQ ID NO 267 | 33 | DNA | P. protegens
gaattcttaa ctaacggaca ggagggtccg atg 33
SEQ ID NO 268 | 33 | DNA | P. protegens
gaattctaca ctaacggaca ggagggtcgg atg 33
SEQ ID NO 269 | 33 | DNA | P. protegens
gaattctacg ctaacggaca ggagggtccg atg 33
SEQ ID NO 270 | 33 | DNA | P. protegens
gaattctcaa ctaacggaca ggagggtccg atg 33
SEQ ID NO 271 | 33 | DNA | P. protegens
gaattctaag ctaacggaca ggagggtcgg atg 33
SEQ ID NO 272 | 33 | DNA | P. protegens
gaattctcag ctaacggaca ggagggtccg atg 33
SEQ ID NO 273 | 33 | DNA | P. protegens
gaattctcaa ctaacggaca ggagggtccg atg 33
SEQ ID NO 274 | 33 | DNA | P. protegens
gaattctacg ctaacggaca ggagggtcgg atg 33
SEQ ID NO 275 | 35 | DNA | P. protegens
gaattctcaa ctaacggaca ggagatatac atatg 35
SEQ ID NO 276 | 33 | DNA | P. protegens
gaattctcag ctaacggaca ggagggtcgg atg 33
SEQ ID NO 277 | 33 | DNA | P. protegens
gaattctaaa ctaacggaca ggagggtcgg atg 33
SEQ ID NO 278 | 33 | DNA | P. protegens
gaattctcag ctcacggaca ggagggtcgg atg 33
SEQ ID NO 279 | 34 | DNA | P. protegens
gaattctcaa ctaacggaca ggagggtcgg gatg 34
SEQ ID NO 280 | 33 | DNA | P. protegens
gaattctaca ctcacggaca ggagggtcgg atg 33
SEQ ID NO 281 | 33 | DNA | P. protegens
gaattctaag ctcacggaca ggagggtcgg atg 33
SEQ ID NO 282 | 33 | DNA | P. protegens
gaattctcaa ctcacggaca ggagggtcgg atg 33
SEQ ID NO 283 | 33 | DNA | P. protegens
gaattctaca ctaacggaca gcagggtcgg atg 33
SEQ ID NO 284 | 42 | DNA | P. protegens
ctagcgcagg tccaaccttt ttctaagcaa gtaggtcata tg 42
SEQ ID NO 285 | 33 | DNA | P. protegens
gaattctcag ctaacggaca gcagggtcgg atg 33
SEQ ID NO 286 | 42 | DNA | P. protegens
ctagcgcagg tccaaccttt ttctaagcaa ctaggtcata tg 42
SEQ ID NO 287 | 42 | DNA | P. protegens
ctagcgaagg tccaaccttt ttctaagcca gtaggtcata tg 42
SEQ ID NO 288 | 33 | DNA | P. protegens
gaattctacg ctcacggaca gcagggtcgg atg 33
SEQ ID NO 289 | 33 | DNA | P. protegens
gaattctccg ctcacggaca ggagggtccg atg 33
SEQ ID NO 290 | 33 | DNA | P. protegens
cttctcggcc agctgacagg ggaagctcgc atg 33
SEQ ID NO 291 | 34 | DNA | P. protegens
cttctcggcc agctgacagg aggaagctcg catg 34
SEQ ID NO 292 | 582 | PRT | R. sphaeroides
Met Asp Thr Ser Ala Ala Arg Ser Gly Ala Val Ala Glu Arg Gly Glu 16
Glu Tyr Leu Thr Leu Asp Ala Leu Cys Glu Ile Ala Lys Leu Leu Thr 32
Gly Ala Ser Asp Pro Ile Ala Cys Met Pro Ala Val Phe Gly Val Leu 48
Gly Ala Phe Met Gly Leu Arg His Gly Ala Leu Ala Ile Leu Gln Glu 64
Gly Ala Gln Ala Glu Thr Gln Arg Asn Ala Arg His Val Asn Pro Tyr 80
Val Ile Ala Ala Thr Ala Ser Gly Val Pro Pro Ala Gly Ala Glu Ala 96
Arg Ala Ile Pro Ala Gln Val Ala Arg His Val Phe Arg Asn Gly Val 112
Ser Leu Val Ser Cys Asp Ile Leu Glu Glu Phe Gly Ala Glu Ala Leu 128
Pro Pro Gly Leu Gly Asp Ser Arg Gln Ala Leu Val Ala Val Pro Ile 144
Arg Asp Gln Ala Asn Ser Pro Phe Val Leu Gly Val Leu Cys Ala Tyr 160
Arg Ser Leu Lys Asp Asn Gly Ala Arg Tyr Leu Asp Thr Asp Leu Arg 176
Val Leu Asn Met Val Ala Ala Val Leu Glu Gln Ser Ile Arg Phe Arg 192
Arg Leu Val Ala Arg Asp Arg Asp Arg Ile Val Gln Glu Ala Arg Glu 208
Ala Ile Arg Val Ala Ala Glu Ala Thr Ala Gly Pro Pro Val Glu Ala 224
Pro Ala Glu Leu Ala Leu Glu Gly Val Ile Gly Ser Ser Pro Ala Ile 240
Gln Arg Val Ile Gly Gln Ile Arg Lys Val Ala Gly Thr His Thr Pro 256
Val Leu Leu Arg Gly Glu Ser Gly Thr Gly Lys Glu Val Phe Ala Arg 272
Ala Leu His Ala Leu Ser Glu Arg Arg Asp Lys Ala Phe Ile Lys Val 288
Asn Cys Ala Ala Leu Ser Gln Ser Leu Leu Glu Ser Glu Leu Phe Gly 304
His Glu Lys Gly Ser Phe Thr Gly Ala Val Gln Gln Lys Lys Gly Arg 320
Pro Glu Met Ala Glu Gly Gly Thr Leu Phe Leu Asp Glu Ile Gly Glu 336
Ile Ser Leu Glu Phe Gln Ala Lys Leu Leu Arg Ile Leu Gln Glu Gly 352
Glu Phe Glu Arg Val Gly Gly Thr Arg Thr Leu Arg Val Asp Val Arg 368
Leu Val Thr Ala Thr Asn Lys Asp Leu Glu Arg Ala Val Ala Asn Gly 384
Thr Phe Arg Ala Asp Leu Tyr Phe Arg Ile Cys Val Val Pro Ile Val 400
Leu Pro Pro Leu Arg Asp Arg Lys Glu Asp Ile Gly Leu Leu Ala Gln 416
Gly Leu Leu Glu Arg Phe Asn Lys Arg Asn Gly Met Lys Lys Lys Leu 432
His Pro Ser Ala Val Ala Ala Leu Ala Gln Cys Asn Phe Pro Gly Asn 448
Val Arg Glu Leu Glu Asn Cys Ile Ala Arg Val Ala Ala Leu Ser Pro 464
Glu Thr Val Ile His Ala Asp Asp Leu Ala Cys His His Asp His Cys 480
Leu Ser Ala Asp Leu Trp Arg Leu Gln Thr Gly Ser Ala Ser Pro Val 496
Gly Gly Leu Ala Gln Gly Pro Leu Glu Leu Pro Val Leu Gly Ser Arg 512
Pro Pro Ala Ala Ala Pro Ser Ala Pro Pro Pro Pro Pro Pro Thr Val 528
Pro Ser Ala Pro Leu Asp Gly Glu Ala Ala Glu Arg Glu Ala Leu Ile 544
Glu Ala Met Glu Arg Ala Gly Trp Val Gln Ala Lys Ala Ala Arg Leu 560
Arg Gly Met Thr Pro Arg Gln Ile Gly Tyr Ala Leu Lys Lys Tyr Asn 576
Ile Arg Val Glu Lys Phe 582
SEQ ID NO 293 | 615 | PRT | A. caulinodans
Met Pro Met Thr Asp Ala Phe Gln Val Arg Val Pro Arg Val Ser Ser 16
Ser Thr Ala Gly Asp Ile Ala Ala Ser Ser Ile Thr Thr Arg Gly Ala 32
Leu Pro Arg Pro Gly Gly Met Pro Val Ser Met Ser Arg Gly Thr Ser 48
Pro Glu Val Ala Leu Ile Gly Val Tyr Glu Ile Ser Lys Ile Leu Thr 64
Ala Pro Arg Arg Leu Glu Val Thr Leu Ala Asn Val Val Asn Val Leu 80
Ser Ser Met Leu Gln Met Arg His Gly Met Ile Cys Ile Leu Asp Ser 96
Glu Gly Asp Pro Asp Met Val Ala Thr Thr Gly Trp Thr Pro Glu Met 112
Ala Gly Gln Ile Arg Ala His Val Pro Gln Lys Ala Ile Asp Gln Ile 128
Val Ala Thr Gln Met Pro Leu Val Val Gln Asp Val Thr Ala Asp Pro 144
Leu Phe Ala Gly His Glu Asp Leu Phe Gly Pro Pro Glu Glu Ala Thr 160
Val Ser Phe Ile Gly Val Pro Ile Lys Ala Asp His His Val Met Gly 176
Thr Leu Ser Ile Asp Arg Ile Trp Asp Gly Thr Ala Arg Phe Arg Phe 192
Asp Glu Asp Val Arg Phe Leu Thr Met Val Ala Asn Leu Val Gly Gln 208
Thr Val Arg Leu His Lys Leu Val Ala Ser Asp Arg Asp Arg Leu Ile 224
Ala Gln Thr His Arg Leu Glu Lys Ala Leu Arg Glu Glu Lys Ser Gly 240
Ala Glu Pro Glu Val Ala Glu Ala Ala Asn Gly Ser Ala Met Gly Ile 256
Val Gly Asp Ser Pro Leu Val Lys Arg Leu Ile Ala Thr Ala Gln Val 272
Val Ala Arg Ser Asn Ser Thr Val Leu Leu Arg Gly Glu Ser Gly Thr 288
Gly Lys Glu Leu Phe Ala Arg Ala Ile His Glu Leu Ser Pro Arg Lys 304
Gly Lys Pro Phe Val Lys Val Asn Cys Ala Ala Leu Pro Glu Ser Val 320
Leu Glu Ser Glu Leu Phe Gly His Glu Lys Gly Ala Phe Thr Gly Ala 336
Leu Asn Met Arg Gln Gly Arg Phe Glu Leu Ala His Gly Gly Thr Leu 352
Phe Leu Asp Glu Ile Asp Glu Ile Thr Pro Ala Phe Gln Ala Lys Leu 368
Leu Arg Val Leu Gln Glu Gly Glu Phe Glu Arg Val Gly Gly Asn Arg 384
Thr Leu Lys Val Asp Val Arg Leu Val Cys Ala Thr Asn Lys Asn Leu 400
Glu Glu Ala Val Ser Lys Gly Glu Phe Arg Ala Asp Leu Tyr Tyr Arg 416
Ile His Val Val Pro Leu Ile Leu Pro Pro Leu Arg Glu Arg Pro Gly 432
Asp Ile Pro Lys Leu Ala Lys Asn Phe Leu Asp Arg Phe Asn Lys Glu 448
Asn Lys Leu His Met Met Leu Ser Ala Pro Ala Ile Asp Val Leu Arg 464
Arg Cys Tyr Phe Pro Gly Asn Val Arg Glu Leu Glu Asn Cys Ile Arg 480
Arg Thr Ala Thr Leu Ala His Asp Ala Val Ile Thr Pro His Asp Phe 496
Ala Cys Asp Ser Gly Gln Cys Leu Ser Ala Met Leu Trp Lys Gly Ser 512
Ala Pro Lys Pro Val Met Pro His Val Pro Pro Ala Pro Thr Pro Leu 528
Thr Pro Leu Ser Pro Ala Pro Leu Ala Thr Ala Ala Pro Ala Ala Ala 544
Ser Pro Ala Pro Ala Ala Asp Ser Leu Pro Val Thr Cys Pro Gly Thr 560
Glu Ala Cys Pro Ala Val Pro Pro Arg Gln Ser Glu Lys Glu Gln Leu 576
Leu Gln Ala Met Glu Arg Ser Gly Trp Val Gln Ala Lys Ala Ala Arg 592
Leu Leu Asn Leu Thr Pro Arg Gln Val Gly Tyr Ala Leu Arg Lys Tyr 608
Asp Ile Asp Ile Lys Arg Phe 615
* * * * *
References