U.S. patent application number 14/262633 was filed with the patent office on 2014-04-25 and published on 2014-12-04 for nucleic acids for cloning and expressing multiprotein complexes.
The applicant listed for this patent is Europaisches Laboratorium fur Molekularbiologie (EMBL). Invention is credited to Imre Berger.
Application Number | 20140356960 14/262633 |
Family ID | 42244215 |
Filed Date | 2014-04-25 |
United States Patent Application | 20140356960 |
Kind Code | A1 |
Berger; Imre | December 4, 2014 |
Nucleic acids for cloning and expressing multiprotein complexes
Abstract
The present invention relates to a nucleic acid containing at
least one homing endonuclease site (HE) and at least one
restriction enzyme site (X) wherein the HE and X sites are selected
such that HE and X result in compatible cohesive ends when cut by
the homing endonuclease and restriction enzyme, respectively, and
the ligation product of HE and X cohesive ends can neither be
cleaved by the homing endonuclease nor by the restriction enzyme.
Further subject-matter of the present invention relates to a vector
comprising the nucleic acid of the present invention, host cells
containing the nucleic acid and/or the vector; a kit for cloning
and/or expression of multiprotein complexes making use of the
vector and the host cells, a method for producing a vector
containing multiple expression cassettes, and a method for
producing multiprotein complexes. The invention also relates to
methods of assembling multiple single vectors ("vector entities")
into fusion vectors and to a method of disassembling a fusion vector
containing multiple of such vector entities into single vectors.
The invention is also directed to fusion vectors containing
multiple vector entities.
Inventors: | Berger; Imre; (St. Egreve, FR) |
Applicant: |
Name | City | State | Country | Type |
Europaisches Laboratorium fur Molekularbiologie (EMBL) | Heidelberg | | DE | |
Family ID: | 42244215 |
Appl. No.: | 14/262633 |
Filed: | April 25, 2014 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13254831 | Sep 2, 2011 | 8709798 |
PCT/EP2010/052892 | Mar 8, 2010 | |
14262633 | | |
Current U.S.
Class: |
435/462 ;
435/199; 435/254.21; 435/254.22; 435/254.23; 435/258.3; 435/320.1;
435/325; 435/348; 435/366; 435/367; 435/368; 435/369; 435/370;
435/465; 435/471 |
Current CPC
Class: |
C12N 2800/30 20130101;
C12N 15/902 20130101; C12N 15/65 20130101; C12N 15/64 20130101;
C12N 15/10 20130101; C12N 9/22 20130101 |
Class at
Publication: |
435/462 ;
435/320.1; 435/325; 435/366; 435/369; 435/367; 435/368; 435/370;
435/254.21; 435/254.22; 435/254.23; 435/348; 435/258.3; 435/465;
435/471; 435/199 |
International
Class: |
C12N 15/90 20060101
C12N015/90; C12N 9/22 20060101 C12N009/22 |
Foreign Application Data
Date | Code | Application Number |
Mar 6, 2009 | EP | 09154567.3 |
Claims
1. A nucleic acid comprising a multiple integration element (MIE)
having the following sequence elements: ##STR00001## wherein HE is
a homing endonuclease site selected from the group consisting of an
I-CeuI site and a PI-SceI site; Prom represents a promoter; rbs
represents a ribosome binding site; term represents a terminator;
and wherein the HE and BstXI sites are selected such that HE and
BstXI result in compatible cohesive ends when cut by the homing
endonuclease and the BstXI restriction enzyme, respectively, and
the ligation product of HE and BstXI cohesive ends can neither be
cleaved by the homing endonuclease nor the restriction enzyme.
2. The nucleic acid of claim 1 further comprising the nucleotide
sequence of SEQ ID NO: 1.
3. The nucleic acid of claim 1 further comprising at least
one site for integration of the nucleic acid of claim 1 into a
vector or host cell.
4. A vector comprising the nucleic acid of claim 1.
5. The vector of claim 4 further comprising at least one
recognition sequence for a site-specific recombinase, preferably a
LoxP imperfect inverted repeat or a Tn7 attachment site.
6. The vector of claim 4 comprising a sequence selected from the
group consisting of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ
ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9,
SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID
NO: 14, SEQ ID NO: 15, SEQ ID NO: 16 and SEQ ID NO: 17.
7. The vector of claim 6 comprising more than one of the sequence
elements of the nucleic acid as defined in claim 1 and containing
more than one recognition sequence for a site-specific
recombinase.
8. The vector of claim 7 comprising the sequence of SEQ ID NO:
18.
9. The vector of claim 4 wherein the vector is a virus.
10. The vector of claim 9 wherein the virus is a baculovirus.
11. A host cell comprising the nucleic acid of claim 1.
12. A host cell comprising the vector of claim 4.
13. A kit for cloning and/or expression of multiprotein complexes
containing at least one vector of claim 4 together with at least
one host cell suitable for the propagation of said vector(s).
14. A method for assembling n vector entities each containing a
multiple integration element as defined in claim 1 into 1 to (n-1)
fusion vectors wherein said fusion vector(s) contain(s) 2 to n of
said vector entities comprising the steps of: (1) contacting said n
vector entities each containing a site-specific recombination site
and an individual resistance marker different from the resistance
markers of the other vector entities with a recombinase specific
for said site-specific recombination site so as to generate a
mixture of fusions of the vector entities comprising 2 to n of said
vector entities, (2) transforming said mixture into host cells; (3)
culturing one or more sample(s) of the transformed cells in the
presence of an appropriate combination of antibiotics for selecting
one or more desired fusion vector(s) containing 2 to n vector
entities; (4) obtaining n single clones of transformed cells from
the culture obtained in step (3) in which these were viable in the
presence of the respective combination of antibiotics; and (5)
culturing n samples of each of said n single clones in the presence
of each of n antibiotics specific for the n individual resistance
markers present in said n vector entities; wherein n is an integer
of at least 3.
15. The method of claim 14 wherein (n-1) of the vector entities to
be fused each contains a further selectable marker different from
the resistance markers such that only host cells transformed with
fusions between the vector entity not containing the further
selectable marker and one or more of the vector entities containing
the selectable marker are viable in step (3).
16. The method of claim 15 wherein (n-1) of the vector entities
contain a conditional origin of replication making the propagation
of said vector entities dependent on the presence or absence of a
specific gene in the host cells.
17. The method of claim 16 wherein the host cells are bacteria,
preferably E. coli, the origin of replication is R6K.gamma. or a
derivative thereof and the bacteria are pir.sup.-.
18. The method of claim 14 wherein each of the n vector entities
contains one or more genes of interest, preferably within an
expression cassette.
19. A method of disassembling a fusion vector containing n vector
entities each containing a multiple integration element (MIE) as
defined in claim 1 into one or more desired fusion vectors selected
from the group consisting of fusion vectors containing 2 to (n-1)
vector entities or into one or more of said single vector entities
each containing a multiple integration element (MIE) as defined in
claim 1, wherein in said fusion vector containing n vector entities
said n vector entities are separated from each other by n
site-specific recombination sites, and each vector entity contains
an individual resistance marker different from the resistance
markers of the other vector entities, comprising the steps of: (A)
contacting the fusion vector containing n vector entities each
containing a multiple integration element (MIE) as defined in claim
1 with a recombinase specific for said site-specific recombination
sites in order to generate a mixture of fusions of the vector
entities comprising 2 to (n-1) of said vector entities and single
vector entities; (B) transforming said mixture into host cells; and
(C) culturing one or more sample(s) of the transformed cells in the
presence of: (C1) an appropriate combination of antibiotics for
selecting one or more desired fusion vector(s) containing 2 to
(n-1) vector entities; and/or (C2) a single appropriate antibiotic
for selecting a desired single vector entity; (D) obtaining n
single clones of transformed cells from the sample of the
transformed cells in which the single clones of transformed cells
were viable in the presence of the respective antibiotic or
combination of antibiotics, respectively, and (E) culturing n
samples of each of said n single clones of transformed cells in the
presence of each of n antibiotics specific for the n individual
resistance markers present in said n vector entities; wherein n is
an integer of at least 3.
20. The method of claim 19 wherein for disassembling the fusion
vector containing n vector entities into single vector entities,
steps (A), (B), and (C1) are carried out for selecting an
appropriate fusion vector containing 2 to (n-1) vector entities,
and steps (A), (B), and (C2) to (E) are carried out with said
selected fusion vector containing 2 to (n-1) vector entities.
21. The method of claim 20 wherein (n-1) of the vector entities in
said fusion vector containing n vector entities each contains a
further selectable marker different from the resistance markers
such that only host cells transformed with fusions between a vector
entity not containing the further selectable marker and one or more
of the vector entities containing the selectable marker are viable
in step (C1).
22. The method of claim 21 wherein (n-1) of the vector entities
comprise a conditional origin of replication making the propagation
of said vector entities dependent on the presence or absence of a
specific gene in the host cells.
23. The method of claim 22 wherein the host cells are bacteria,
preferably E. coli, the origin of replication is R6K.gamma. or a
derivative thereof, and the bacteria are pir.sup.-.
24. The method of claim 19 wherein each of the n vector entities
comprises one or more genes of interest.
25. The method of claim 24 wherein the one or more genes of
interest are within an expression cassette.
26. A fusion vector comprising n vector entities as defined in
claim 4, separated from each other by n of the same site-specific
recombination site, wherein each vector entity contains an
individual resistance marker gene different from the resistance
marker genes of the other vector entities, wherein n is an integer
of at least 3.
27. A kit for assembly and/or disassembly of n vectors comprising a
fusion vector comprising n vector entities each containing a
multiple integration element (MIE) as defined in claim 1, the n
vector entities being separated from each other by n of the same
site-specific recombination site, wherein each vector entity
contains an individual resistance marker gene different from the
resistance marker genes of the other vector entities; and/or n
vector entities each containing a site-specific recombination site
and an individual resistance marker gene different from the
resistance marker genes of the other vector entities, and a
recombinase specific for said site-specific recombination site
and/or cells for the propagation of said fusion vector and/or said
n vector entities; wherein n is an integer of at least 3.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This patent application is a continuation application of
U.S. patent application Ser. No. 13/254,831, filed on Sep. 2, 2011
(currently pending), which was the National Stage of International
Application No. PCT/EP2010/052892, filed Mar. 8, 2010, entitled
"Nucleic acids for cloning and expressing multiprotein complexes,"
which claims the benefit of European Patent Application No. EP
09154567.3, filed Mar. 6, 2009, which applications are incorporated
in their entirety here by this reference.
[0002] The present invention relates to a nucleic acid containing
at least one homing endonuclease site (HE) and at least one
restriction enzyme site (X) wherein the HE and X sites are selected
such that HE and X result in compatible cohesive ends when cut by
the homing endonuclease and restriction enzyme, respectively, and
the ligation product of HE and X cohesive ends can neither be
cleaved by the homing endonuclease nor by the restriction enzyme.
Further subject-matter of the present invention relates to a vector
comprising the nucleic acid of the present invention, host cells
containing the nucleic acid and/or the vector, a kit for cloning
and/or expression of multiprotein complexes making use of the
vector and the host cells, a method for producing a vector
containing multiple expression cassettes, and a method for
producing multiprotein complexes. The invention also relates to a
method for assembling multiple single vectors ("vector entities")
into fusion vectors and to a method for disassembling a fusion
vector containing multiple of such vector entities into lower order
fusion vectors and/or into single vectors. The invention is also
directed to fusion vectors containing multiple vector entities.
[0003] Many vital processes in cells are controlled by proteins
associating into interlocking molecular machines, in higher
eukaryotes often containing 10 and more subunits (Rual, J. F. et
al. Nature 437, 1173-1178 (2005); Charbonnier S., Gallego, O. and
Gavin, A. C. Biotechnol. Annu. Rev. 14, 1-28 (2008)). This has
profound consequences for functional and structural studies that
now aim to decipher physiologically relevant molecular mechanisms.
Consequently, work on complexes is increasingly becoming imperative
in contemporary biology. The low abundance and frequently
heterogeneous nature of many multisubunit complexes, however, often
preclude extraction from source.
[0004] Recombinant production methods certainly have had a decisive
impact on life science research. In particular E. coli, as an
expression host, is commonplace. Successful functional analysis of
proteins and elucidation of their molecular architecture often
crucially depends on introducing alterations, such as truncations,
mutations and extension with purification tags, or with particular
promoter/terminator elements. The ensuing requirements in terms of
experimental throughput are already considerable for diversifying
single open reading frames (ORFs). In particular structural
genomics consortia demand the standardization of subcloning
routines and implementation of automation for this. The exponential
increase in workload when many ORFs have to be rapidly diversified
and assembled in the context of a multisubunit complex is daunting,
and an unresolved challenge to date.
[0005] A number of systems have been introduced in recent years for
expression of several genes in eukaryotic and prokaryotic hosts;
see, e.g. Fitzgerald et al. (2006) Nat. Methods 3, 1021-1032; Tan
et al. (2005) Protein Expr. Purif. 40, 385-395; Tolia, N. H.
and Joshua-Tor (2006) Nat. Methods 3, 55-64; Chanda et al. (2006)
Protein Expr. Purif. 47, 217-224; Scheich et al. (2007) Nucleic
Acids Res. 35, e43. In spite of considerable improvements of
eukaryotic expression systems, in particular the baculovirus/insect
cell expression (Fitzgerald et al. (2006), supra), E. coli still
remains to date the dominant work-horse in most laboratories, for
many good reasons such as low-cost and availability of a multitude
of specialized expression strains. The current co-expression
systems for E. coli rely essentially on serial, mostly conventional
(i.e. restriction/ligation) subcloning of encoding genes either as
single expression cassettes (Tolia et al. (2006), supra; Chanda et
al. (2006), supra) or as polycistrons constituting several genes
under the control of the same promoter (Tan et al. (2005), supra).
This considerably limits the applicability of these co-expression
techniques for production of protein complexes with many subunits,
in particular at the throughput typically required for structural
molecular biology.
[0006] A major impediment of such largely serial (one gene at a
time) constructions stems from the inherent inflexibility with
regards to rapidly revising an expression experiment once the
multiprotein complex has been produced, purified and characterized.
However, such revisions, including variations of the protein
subunits, are a sine qua non in contemporary functional and
structural research.
[0007] Fitzgerald et al. (2006), supra, and WO-A-2005/085456
describe polynucleotides having a so-called multiplication module
wherein two expression cassettes in head-to-head, head-to-tail or
tail-to-tail orientation are flanked by specifically designed pairs
of restriction enzyme sites allowing iterative cloning of multiple
genes into the expression cassettes.
[0008] In view of the drawbacks of prior art constructs it is
therefore the technical problem underlying the present invention to
provide versatile systems for cloning and expression of
multiprotein complexes.
[0009] The solution to the above technical problem is achieved by
the provision of the embodiments of the present invention as
defined in the claims.
[0010] In particular, the present invention relates to a nucleic
acid (or polynucleotide) containing at least one homing
endonuclease site (HE) and at least one restriction enzyme site (X)
wherein the HE and the X sites are selected such that HE and X
result in compatible cohesive ends when cut by the homing
endonuclease and restriction enzyme, respectively, and the ligation
product of HE and X cohesive ends can neither be cleaved by the
homing endonuclease nor the restriction enzyme.
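The selection criterion in this paragraph can be illustrated with a minimal Python sketch. It assumes that both enzymes leave the same 4-nucleotide overhang and checks that the hybrid junction formed from the X half-site and the HE half-site is recognised by neither enzyme. The BstXI pattern CCANNNNNNTGG is the published recognition sequence; the homing endonuclease site, its half-sites and all names used here (HE_LEFT, HE_RIGHT, X_LEFT, X_RIGHT, junction_is_resistant) are invented for illustration and are not sequences from this application.

```python
import re

# BstXI recognises CCANNNNNNTGG; the pattern reads the same on both strands.
BSTXI_SITE = re.compile(r"CCA[ACGT]{6}TGG")

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def contains_he_site(seq: str, he_site: str) -> bool:
    """True if the asymmetric HE site occurs on either strand of seq."""
    return he_site in seq or reverse_complement(he_site) in seq


def junction_is_resistant(junction: str, he_site: str) -> bool:
    """True if neither the HE site nor a BstXI site is present in junction."""
    no_he = not contains_he_site(junction, he_site)
    no_bstxi = BSTXI_SITE.search(junction) is None
    return no_he and no_bstxi


# Toy sequences, invented for illustration only (ASSUMPTION, not real sites).
OVERHANG = "CTAA"                           # shared 4-nt cohesive end
HE_LEFT, HE_RIGHT = "GATTACAGGTC", "GGTAGCGATT"
HE_SITE = HE_LEFT + OVERHANG + HE_RIGHT     # hypothetical homing endonuclease site
X_LEFT, X_RIGHT = "CCAT", "GTGG"
X_SITE = X_LEFT + OVERHANG + X_RIGHT        # fits the BstXI pattern CCANNNNNNTGG

# Ligating an X-cut end to an HE-cut end gives a hybrid junction.
hybrid = X_LEFT + OVERHANG + HE_RIGHT
# Ligating two HE-cut ends simply restores the HE site.
regenerated = HE_LEFT + OVERHANG + HE_RIGHT

print(junction_is_resistant(hybrid, HE_SITE))
print(junction_is_resistant(regenerated, HE_SITE))
```

In this toy case the hybrid junction is resistant to both enzymes while the regenerated HE site is not, which is the property the paragraph requires of a usable HE/X pair.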
[0011] According to the present invention, the terms "nucleic acid"
and "polynucleotide" are used interchangeably and refer to DNA, RNA
or species containing one or more nucleotide analogues. Preferred
nucleic acids or polynucleotides according to the present invention
are DNA, most preferred double-stranded (ds) DNA.
[0012] Preferably, the nucleic acid of the present invention has
the following sequence elements:
[0013] HE-Prom-MCS-Term-X or HE-Prom-MCS-X
[0014] wherein
[0015] Prom: represents a promoter;
[0016] MCS: represents a multiple cloning site; and
[0017] Term: represents a terminator.
[0018] The above arrangement is hereinafter often referred to as
"multiple integration element" (MIE).
[0019] Promoters useful in the present invention include, but are
not limited to, promoters of prokaryotic, viral, mammalian, or
insect cell origin or a combination thereof. Likewise, terminators
useful in a nucleic acid according to the invention include, but
are not limited to, terminators of prokaryotic, viral, mammalian,
insect cell origin or a combination thereof. The term "multiple
cloning site" according to the present invention means a sequence
having at least one restriction enzyme site different from the site
X as defined above. The MCS according to the present invention may,
e.g. be derived from the multiple cloning sites of any commercially
available plasmid.
[0020] Preferred prokaryotic promoters are Lac, T7, arabinose and
trc promoters. Further promoters useful in the context of the
present invention are viral promoters, in particular baculoviral
promoters such as polh, p10 and p.sub.XIV very late baculoviral
promoters, vp39 baculoviral late promoter, vp39 polh baculoviral
late/very late hybrid promoter, P.sub.cap/polh, pcna, etl, p35,
egt, da26 baculoviral early promoters. Further promoters useful in
the context of the present invention are the promoter sequences
CMV, SV40, UbC, EF-1.alpha., RSVLTR, MT, P.sub.DS47, Ac5, P.sub.GAL
and P.sub.ADH.
[0021] Examples of terminator sequences useful in the context of
the present invention are T7, SV40, HSVtk or BGH.
[0022] The multiple cloning site according to the present invention
may contain, in addition to the at least one restriction enzyme
site (other than X), one or more, especially 1 to 4 homology
regions. The restriction enzymes sites contained in the MCS can
easily be chosen by the skilled person and examples of such sites
together with their recognition sequences can be taken from the
latest product catalogue of New England Biolabs, Ipswich, Mass.,
USA.
[0023] A "homing endonuclease" according to the present invention
is a DNase specific for double-stranded DNA having a large,
asymmetric recognition site of e.g. 12-40 base pairs or even more,
preferably 20 to 30 base pairs. For a recent review with regard to
homing endonucleases, see Stoddard B. L. (2005) Q. Rev. Biophys.
38, 49-95. Due to the length of HE recognition sequences it is
highly unlikely that a corresponding site occurs in the nucleotide
sequence of a gene or polygene (or any other nucleotide sequence of
any origin) to be inserted into the constructs according to the
present invention making this strategy particularly useful for
cloning larger and/or many genes of interest ("GOI").
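The statement that a long recognition sequence is highly unlikely to occur by chance can be made concrete with a rough estimate. The sketch below assumes a random sequence with equal and independent base frequencies, which real genes do not strictly follow, and counts both strands; the function name expected_random_hits is hypothetical. For a palindromic site the factor of two overcounts, so the numbers are order-of-magnitude guides only.

```python
def expected_random_hits(site_length: int, sequence_length: int) -> float:
    """Expected chance occurrences of one specific site in a random sequence.

    ASSUMPTIONS: equal, independent base frequencies; both strands counted;
    the site is treated as asymmetric.
    """
    positions = max(sequence_length - site_length + 1, 0)
    return 2 * positions / 4 ** site_length


# Compare a typical 6-bp restriction site with a 20-bp homing endonuclease
# site in a 10 kb insert.
for length in (6, 20):
    print(length, expected_random_hits(length, 10_000))
```

Under these assumptions a 6-bp site is expected several times in 10 kb, whereas a 20-bp site is expected far less than once, consistent with the argument made above for large genes of interest.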
[0024] A preferred HE site according to the present invention is a
recognition sequence of a homing endonuclease that results in a 4
nucleotide overhang when cut by the respective homing
endonuclease.
[0025] Examples of such HE sites include, but are not limited to:
recognition sequences of PI-SceI, I-CeuI, I-PpoI, I-HmuI, I-CreI,
I-DmoI, PI-PfuI and I-MsoI, PI-PspI, I-SceI, other LAGLIDADG group
members and variants thereof, SegH and Hef or other GIY-YIG homing
endonucleases, I-ApelI, I-AniI, Cytochrome b mRNA maturase bI3,
PI-TliI and PI-TfulI, PI-ThyI and others; see also Stoddard (2005),
supra.
[0026] A preferred restriction enzyme site X according to the
present invention compatible with HE sites producing a 4 bp
overhang (examples are given above) is a BstXI site.
[0027] Corresponding enzymes are commercially available, e.g. from
New England Biolabs Inc., Ipswich, Mass., USA.
[0028] Especially preferred MIEs of the invention containing
prokaryotic promoters/terminators have one of the following
structures:
[0029] I-CeuI-T7 Prom-MCS-T7 Term-BstXI
[0030] PI-SceI-T7 Prom-MCS-T7 Term-BstXI
[0031] Especially preferred MIEs of the invention containing
baculoviral promoters have one of the following structures:
[0032] I-CeuI-p10-MCS-BstXI
[0033] PI-SceI-p10-MCS-BstXI
[0034] I-CeuI-polh-MCS-BstXI
[0035] PI-SceI-polh-MCS-BstXI
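The preferred MIE layouts listed above follow one pattern, HE-Prom-MCS-Term-X with an optional terminator, and can be written down as a small data structure. The following Python sketch is only a bookkeeping aid; the class name MIE and its field names are hypothetical, and it models element order, not nucleotide sequence.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MIE:
    """Symbolic multiple integration element: HE-Prom-MCS-[Term-]X."""
    he: str                            # homing endonuclease site
    promoter: str
    terminator: Optional[str] = None   # omitted in the baculoviral layouts
    x: str = "BstXI"                   # restriction enzyme site X

    def layout(self) -> str:
        parts = [self.he, self.promoter, "MCS"]
        if self.terminator:
            parts.append(self.terminator)
        parts.append(self.x)
        return "-".join(parts)


he_sites = ("I-CeuI", "PI-SceI")
prokaryotic = [MIE(he, "T7 Prom", "T7 Term") for he in he_sites]
baculoviral = [MIE(he, prom) for prom in ("p10", "polh") for he in he_sites]

for mie in prokaryotic + baculoviral:
    print(mie.layout())
```

Running the loop prints the six layouts given in paragraphs [0029] to [0035].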
[0036] Particularly preferred examples of nucleic acids according
to the present invention comprise the sequence according to SEQ ID
NO: 1 (for a detailed map see FIG. 13A and B; the sequence
antisense to SEQ ID NO: 1 is outlined in SEQ ID NO: 54), SEQ ID NO:
50 (restriction map: FIG. 42), SEQ ID NO: 51 (restriction map: FIG.
43), SEQ ID NO: 52 (restriction map: FIG. 44) or SEQ ID NO: 53
(restriction map: FIG. 45).
[0037] In preferred embodiments of the present invention, the
above-defined nucleic acid additionally comprises at least one site
for integration of the nucleic acid into a vector or host cell. The
integration site may allow for a transient or genomic
incorporation.
[0038] With respect to the integration into a vector, in particular
into a plasmid or virus, the integration site is preferably
compatible for integration of the nucleic acid into an adenovirus,
adeno-associated virus (AAV), autonomous parvovirus, herpes
simplex virus (HSV), retrovirus, rhadinovirus, Epstein-Barr virus,
lentivirus, semliki forest virus or baculovirus.
[0039] Particularly preferred integration sites that may be
incorporated into the nucleic acid of the present invention can be
selected from the transposon element of Tn7, .lamda.-integrase
specific attachment sites and site-specific recombinases (SSRs), in
particular LoxP site or FLP recombinase specific recombination
(FRT) site. Further preferred mechanisms for integration of the
nucleic acid according to the invention are specific homologous
recombination sequences such as lef2-603/Orf1629.
[0040] In further preferred embodiments of the present invention,
the nucleic acid as described herein additionally contains one or
more resistance markers for selecting against otherwise toxic
substances. Preferred examples of resistance markers useful in the
context of the present invention include, but are not limited to,
markers conferring resistance to antibiotics such as ampicillin,
chloramphenicol, gentamycin, spectinomycin and kanamycin.
[0041] The nucleic acid of the present invention may also contain
one or more ribosome binding site(s) (RBS), preferably integrated
into an MIE as defined above.
[0042] Further subject-matter of the present invention relates to a
vector comprising a nucleic acid as defined above.
[0043] Preferred vectors of the present invention are plasmids,
expression vectors, transfer vectors, more preferred eukaryotic
gene transfer vectors, transient or viral vector-mediated gene
transfer vectors. Other vectors according to the invention are
viruses such as adenovirus vectors, adeno-associated virus (AAV)
vectors, autonomous parvovirus vectors, herpes simplex virus (HSV)
vectors, retrovirus vectors, rhadinovirus vectors, Epstein-Barr
virus vectors, lentivirus vectors, semliki forest virus vectors and
baculovirus vectors.
[0044] Baculovirus vectors suitable for integrating a nucleic acid
according to the invention (e.g. present on a suitable plasmid such
as a transfer vector) are also subject matter of the present
invention and preferably contain site-specific integration sites
such as a Tn7 attachment site (which may be embedded in a lacZ gene
for blue/white screening of productive integration) and/or a LoxP
site. Further preferred baculovirus according to the invention
contain (alternative to or in addition to the above-described
integration sites) a gene for expressing a substance toxic for the host,
flanked by sequences for homologous recombination. An example for a
gene for expressing a toxic substance is the diphtheria toxin A
gene. A preferred pair of sequences for homologous recombination is
e.g. lef2-603/Orf1629. The baculovirus can also contain further
marker gene(s) as described above, including also fluorescent
markers such as GFP, YFP and so on. Specific examples of
corresponding baculovirus of the invention have the structure of
EMBac, EMBacY, EMBac_Direct and EMBacY_Direct as disclosed in the
schemes according to FIGS. 38, 39, 40 and 41, respectively.
[0045] Vectors useful in prokaryotic host cells comprise,
preferably besides the above-exemplified marker genes (one or more
thereof), an origin of replication (ori). Examples are BR322,
ColE1, and conditional origins of replication such as OriV and
R6K.gamma., the latter being a preferred conditional origin of
replication which makes the propagation of the vector of the
present application dependent on the pir gene in a prokaryotic
host. OriV makes the propagation of the vector of the present
application dependent on the trfA gene in a prokaryotic host.
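The dependence of conditional origins on host genes described in this paragraph amounts to a simple lookup. The sketch below is a minimal illustration in Python; the dictionary REQUIRED_HOST_GENE, the function can_propagate and the spelling "R6Kgamma" are hypothetical names chosen here, and origins not listed are treated as unconditional, which is a simplifying assumption.

```python
# Host gene required by each conditional origin, as stated in the text.
REQUIRED_HOST_GENE = {
    "R6Kgamma": "pir",
    "OriV": "trfA",
}


def can_propagate(origin: str, host_genes: set) -> bool:
    """True if a vector with this origin can be maintained in the host.

    ASSUMPTION: any origin not listed above (e.g. ColE1) is unconditional.
    """
    required = REQUIRED_HOST_GENE.get(origin)
    return required is None or required in host_genes


print(can_propagate("R6Kgamma", {"pir"}))   # host carries pir
print(can_propagate("R6Kgamma", set()))     # host lacks pir
print(can_propagate("ColE1", set()))        # unconditional origin
```

A lookup of this kind makes explicit why a vector carrying only a conditional origin is lost in a host lacking the corresponding gene unless it is fused to a vector with an unconditional origin.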
[0046] Furthermore, the present invention is directed to a host
cell containing the nucleic acid of the invention and/or the vector
of the present invention.
[0047] The host cells may be prokaryotic or eukaryotic. Eukaryotic
host cells may for example be mammalian cells, preferably human
cells. Examples of human host cells include, but are not limited
to, HeLa, Huh7, HEK293, HepG2, KATO-III, IMR32, MT-2, pancreatic
.beta.-cells, keratinocytes, bone-marrow fibroblasts, CHP212,
primary neural cells, W12, SK-N-MC, Saos-2, WI38, primary
hepatocytes, FLC4, 143TK, DLD-1, embryonic lung fibroblasts,
primary foreskin fibroblasts, MRC5, and MG63 cells. Further
preferred host cells of the present invention are porcine cells,
preferably CPK, FS-13, PK-15 cells, bovine cells, preferably MDB,
BT cells, bovine cells, such as FLL-YFT cells. Other eukaryotic
cells useful in the context of the present invention are C. elegans
cells. Further eukaryotic cells include yeast cells such as S.
cerevisiae, S. pombe, C. albicans and P. pastoris. Furthermore, the
present invention is directed to insect cells as host cells which
include cells from S. frugiperda, more preferably Sf9, Sf21,
Express Sf+, High Five H5 cells, and cells from D. melanogaster,
particularly S2 Schneider cells. Further host cells include
Dictyostelium discoideum cells and cells from parasites such as
Leishmania spec.
[0048] Prokaryotic hosts according to the present invention include
bacteria, in particular E. coli such as commercially available
strains like TOP10, DH5.alpha., HB101 etc.
[0049] The person skilled in the art is readily able to select
appropriate vector construct/host cell pairs for appropriate
propagation and/or transfer of the nucleic acid elements according
to the present invention into a suitable host. Specific methods for
introducing appropriate vector elements and vectors into
appropriate host cells are equally known to the art and methods can
be found in the latest edition of Ausubel et al. (ed.) Current
Protocols In Molecular Biology, John Wiley & Sons, New York,
USA.
[0050] In preferred embodiments of the present invention, the
vector as defined above additionally comprises a site for site
specific recombinases (SSRs), preferably one or more LoxP sites for
Cre-lox specific recombination. In further preferred embodiments,
the vector according to the present invention comprises a
transposon element, preferably a Tn7 attachment site.
[0051] It is further preferred that the attachment site as defined
above is located within a marker gene. This arrangement makes it
feasible to select for successfully integrated sequences into the
attachment site by transposition. According to preferred
embodiments, such a marker gene is selected from luciferase,
.beta.-GAL, CAT, fluorescent encoding protein genes, preferably
GFP, BFP, YFP, CFP and their variants, and the lacZ.alpha.
gene.
[0052] Particularly preferred embodiments of the vector according
to the present invention have a sequence selected from the group
consisting of SEQ ID NO: 2 to SEQ ID NO: 17.
[0053] Further preferred embodiments of the present invention are
vectors containing more than one of the sequence elements of the
nucleic acids of the present invention as defined above and,
optionally, additionally containing more than one recombination
sequence for a site specific recombinase, e.g. 2 to 6, more
preferred 2, 3 or 4 of such recognition sequences, preferably 2 to
6, especially preferred 1 to 4 loxP sites.
[0054] A particularly preferred example of such a vector has the
sequence of SEQ ID NO. 18.
[0055] It is to be understood that, if the vector of the present
invention contains more than one recombination sequences, these can
be recognition sequences of the same or different site-specific
recombinases.
[0056] Further subject-matter of the present invention is a kit for
cloning and/or expression of multiprotein complexes containing at
least one vector as defined above together with at least one host cell
suitable for the propagation of said vector(s). Preferred host
cells have been already described above. Preferably, the kit of
this aspect of the present invention additionally contains a
site-specific recombinase such as Cre.
[0057] The present invention also relates to a method for producing
a vector containing multiple expression cassettes comprising the
steps of:
[0058] (a) inserting one or more genes between the HE and the X
site of a first vector of the present invention;
[0059] (b) inserting one or more genes between the HE and the X
site of a second vector as defined herein;
[0060] (c) cleaving the first vector with a homing endonuclease
specific for site HE and with a restriction enzyme specific for
site X yielding a fragment containing the at least one gene flanked
by the cleaved HE and X sites;
[0061] (d) cleaving the second vector with a homing endonuclease
specific for site HE;
[0062] (e) ligating the fragment obtained in step (c) into the
cleaved second vector obtained in step (d) generating a third
vector; and optionally
[0063] (f) repeating steps (a) to (e) with one or more vector(s)
generating a vector containing multiple genes.
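The iterative logic of steps (a) to (f) can be illustrated with a toy model, offered only as a hedged sketch. Vectors are represented as lists of symbolic elements. The function names, the "HE/X" label for the uncleavable hybrid junction, and the placement of the regenerated HE site upstream of the inserted fragment are assumptions of this illustration and are not taken from the disclosure.

```python
# Toy model of HE/X multiplication; all names are illustrative assumptions.
HE, X, HYBRID = "HE", "X", "HE/X"  # HYBRID: junction cut by neither enzyme


def excise_fragment(vector):
    """Step (c): cut at HE and X, return the elements between the two sites."""
    start, end = vector.index(HE), vector.index(X)
    if start >= end:
        raise ValueError("expected the HE site upstream of the X site")
    return vector[start + 1:end]


def insert_at_he(fragment, recipient):
    """Steps (d)-(e): open the recipient at HE and ligate the fragment in.

    One junction regenerates HE (HE end joined to HE end); the other joins
    an X end to an HE end and is modelled as an uncleavable hybrid site.
    """
    pos = recipient.index(HE)
    return recipient[:pos] + [HE] + fragment + [HYBRID] + recipient[pos + 1:]


def combine(vectors):
    """Step (f): fold any number of loaded vectors into one construct."""
    result = vectors[-1]
    for donor in reversed(vectors[:-1]):
        result = insert_at_he(excise_fragment(donor), result)
    return result


first = [HE, "geneA", X]
second = [HE, "geneB", X]
third = combine([first, second])
print(third)
```

In this model the combined vector still carries exactly one cleavable HE site and one cleavable X site, which is what allows the cycle to be repeated with further vectors.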
[0064] According to preferred embodiments of the present invention
it is possible to insert one or more genes into the vectors of the
invention by methods known to the skilled person, e.g. by
restriction enzyme digestion/ligation via compatible sites within
the MCS or by recombination, preferably using the optionally
present homology region(s), preferably using the SLIC method. If
more than one gene is inserted, these can be provided as single
expression cassettes. However, it is clear for the skilled person
that the (several or multiple) genes can be present as a polygene
within one ORF.
[0065] The present invention is further directed to a method for
producing multiple protein complexes comprising the steps of
[0066] (i) producing a vector containing multiple expression
cassettes by the method as defined above;
[0067] (ii) introducing the vector obtained in step (i) into a
suitable host cell such as the host cells described above; and
[0068] (iii) incubating the host cell under conditions allowing the
simultaneous expression of the genes present in the vector.
[0069] The introduction of the vector into suitable host cells (as
exemplified above) is carried out by methods known to the skilled
person (see, e.g. Ausubel et al. (ed.), supra).
[0070] A further aspect of the present application is a fusion
vector comprising n vector entities separated from each other by n
of the same site-specific recombination site wherein each vector
entity contains an individual resistance marker gene different from
the resistance marker genes of the other vector entities, wherein n
is an integer of at least 3.
[0071] A "single vector" or "vector entity" according to the
present aspect of the invention is generally a nucleic acid
suitable for integration of foreign genetic elements (in
particular, one or more genes of interest) into host cells and
which are suited for amplification. Typical examples are plasmids,
bacmids, viruses, lambda vectors, cosmids etc. Preferred examples
of one or more of the above vector categories are outlined in more
detail above with respect to the HE/X site containing vector which
definitions are also valid for this aspect of the present
invention.
[0072] It is clear for the skilled person that the number of vector
entities to be assembled into a fusion vector according to the
present invention (or disassembled from such a fusion vector; with
respect to methods of assembly/disassembly see below) is generally
not specifically limited as long as a corresponding number of
resistance markers is available. With respect to practical
considerations, the number n in the context of the present
invention is preferably 3, 4, 5 or 6, (but may be more) which in
part depends on the size of constructs that can be propagated in
the host.
[0073] The present invention furthermore relates to a kit for
assembly and/or disassembly of n vectors comprising
[0074] a fusion vector comprising n vector entities separated from
each other by n of the same site-specific recombination site
wherein each vector entity contains an individual resistance marker
gene different from the resistance marker genes of the other vector
entities; and/or
[0075] n vectors (vector entities) each containing a site-specific
recombination site and an individual resistance marker gene
different from the resistance marker genes of the other
vectors,
[0076] wherein n is an integer of at least 3; and
[0077] a recombinase specific for said site-specific recombination
site and/or cells for the propagation of said fusion vector and/or
said n vectors.
[0078] Preferred embodiments of the above fusion vector and vector
kits are or contain, respectively, fusion vector(s) and/or vector
entities comprising LoxP sites and Cre as the corresponding
recombinase enzyme. Other examples of site-specific recombination
sites/recombinases are FRT sites and the corresponding enzyme (FLP
recombinase).
[0079] According to a preferred embodiment the above-defined n
vectors or vector entities, respectively, each contain one or more
expression cassettes of the form Prom-MCS-Term or Prom-MCS
(definitions are as defined above, preferably between a HE and
restriction enzyme site X as defined above). It is further
preferred that the expression cassette preferably present in the
vectors or vector entities, respectively, contains one or more
genes of interest ("GOI").
[0080] Examples of resistance marker genes (or simply "resistance
markers") useful in the context of this aspect of the present
invention are as already defined above.
[0081] An especially preferred example of the fusion vector as
defined above is vector pACKS (SEQ ID NO: 18) described in more
detail below.
[0082] Preferred examples of the vector entities are pACE (SEQ ID
NO: 2), pACE2 (SEQ ID NO: 3), pDC (SEQ ID NO: 4), pDK (SEQ ID NO:
5) and pDS (SEQ ID NO: 6), which are all adapted for expression in
prokaryotic hosts, and pIDC (SEQ ID NO: 7), pIDK (SEQ ID NO: 8),
pIDS (SEQ ID NO: 9), pACEBac1 (SEQ ID NO: 10), pACEBac2 (SEQ ID NO:
11), pACEBac3 (SEQ ID NO: 12), pACEBac4 (SEQ ID NO: 13), pOmniBac1
(SEQ ID NO: 14), pOmniBac2 (SEQ ID NO: 15), pOmniBac3 (SEQ ID NO:
16) and pOmniBac4 (SEQ ID NO: 17), which are tailored for
expression in insect cells using baculovirus. The above preferred
examples of vector entities are described in more detail below.
[0083] It is further preferred that at least one of the vector
entities (and/or of the individual vectors in the above kit)
contains a further selectable marker different from the resistance
marker genes. An example is a conditional origin of replication
making the propagation of the respective vector entity dependent on
a specific genetic background in a host. An example is an Ori
derived from (or being) R6K.gamma. making the propagation of the
vector dependent on the pir gene.
[0084] The present invention further provides a method for
assembling n vector entities into 1 to (n-1) fusion vectors wherein
said fusion vector(s) contain(s) 2 to n of said vector entities
comprising the steps of:
[0085] (1) contacting n vector entities each containing a
site-specific recombination site and an individual resistance
marker different from the resistance markers of the other vector
entities with a recombinase specific for said site-specific
recombination site so as to generate a mixture of fusions of the
vector entities comprising 2 to n of said vector entities,
[0086] (2) transforming said mixture into host cells;
[0087] (3) culturing one or more sample(s) of the transformed cells
in the presence of the appropriate combination of antibiotics for
selecting a desired fusion vector containing 2 to n vector
entities;
[0088] (4) obtaining n single clones of transformed cells from the
culture obtained in step (3) in which these were viable in the
presence of the respective combination of antibiotics; and
[0089] (5) culturing n samples of each of said n single clones in
the presence of each of n antibiotics specific for the n individual
resistance markers present in said n vector entities;
wherein n is as defined above.
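Selection steps (3) to (5) reduce to set logic over the resistance markers, which the following hedged sketch makes explicit. The marker assignments (pACE: Ap, pDC: Cm, pDK: Kn, pDS: Sp) are inferred from the figure descriptions further on and should be treated as assumptions, as should both function names.

```python
# Illustrative only: marker assignments and function names are assumptions.
MARKERS = {"pACE": "Ap", "pDC": "Cm", "pDK": "Kn", "pDS": "Sp"}


def selection_antibiotics(desired_fusion):
    """Step (3): antibiotics to combine when selecting the desired fusion."""
    return sorted(MARKERS[v] for v in desired_fusion)


def expected_growth(desired_fusion):
    """Step (5): for each single antibiotic, should a correct clone grow?"""
    present = {MARKERS[v] for v in desired_fusion}
    return {ab: ab in present for ab in sorted(MARKERS.values())}


fusion = ["pACE", "pDK", "pDS"]
print(selection_antibiotics(fusion))
print(expected_growth(fusion))
```

A clone that also grows on an antibiotic outside the selecting combination would, in this model, carry an additional vector entity and would not be the desired fusion.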
[0090] If it is desired to select for more than one desired vector
fusions, the transformed cells obtained in above step (2) are
divided into the appropriate number of aliquots or samples. For
example, if it is desired to select all possible (n!-n) vector
fusions (i.e. the single vector entities as educts of the above
method are not selected for), the transformed host cells are
divided into (n!-n) aliquots (or samples) and each aliquot is
cultured in the presence of the appropriate antibiotics.
[0091] In the context of the present invention, the term "aliquot"
as used herein does not necessarily mean that the aliquots have the
same volume or number of cells. Rather, each of the aliquots or
samples may have the same or different volumes or number of
cells.
[0092] The term "culturing" the transformed cells or the aliquot
(or sample) means that the transformed cells are incubated under
the appropriate conditions for viability of the host cells. For
example, the transformed host cells may be used to inoculate a
(e.g. larger) volume of liquid culture medium or the aliquot may be
plated out on an appropriate solid medium.
[0093] If the vector assembly method as defined above is used to
select for more than one desired vector fusion, e.g. if all
possible fusions are desired, the selection step (3) is preferably
carried out using typical well plate formats such as 96-well
plates.
[0094] According to a preferred embodiment of the present vector
assembly method (n-1) of the vector entities to be fused each
contains a further selectable marker different from the resistance
marker. Such vector entities are hereinafter referred to as "Donor"
vectors, since, when fused to a vector entity which does not
contain said selectable marker different from the resistance marker
(hereinafter referred to as "Acceptor"), in a fusion between the
Donor(s) and the Acceptor, said Donor(s) provide host cells with a
phenotype that allows only the propagation of Acceptor-Donor
fusions but no Donor-Donor fusions. Preferred examples of such a
selectable marker are conditional origins of replication making the
propagation of the Donor dependent on a specific genetic
background. A specific example of such a selectable marker is
R6K.gamma. Ori making the propagation of the Donor dependent on the
presence of the pir gene in a bacterial host such as E. coli. In
this case, the mixture obtained in step (1) of the above vector
assembly method is transformed into bacterial cells lacking the pir
gene (such as E. coli strains TOP10, DH5.alpha., HB101 or other
commercially available pir.sup.- cells).
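The effect of the conditional origin on which constructs survive can be sketched as follows. This is a simplified model under stated assumptions: one Acceptor (pACE) with an unconditional origin, three Donors (pDC, pDK, pDS) with a pir-dependent origin, and fusions treated as unordered combinations in which each entity appears at most once. The function names are hypothetical.

```python
from itertools import combinations

# Assumed roles: True means the origin of that entity needs the pir gene.
ORIGIN_NEEDS_PIR = {"pACE": False, "pDC": True, "pDK": True, "pDS": True}


def propagates(fusion, host_has_pir):
    """A construct replicates if at least one of its origins works in the host."""
    return any(host_has_pir or not ORIGIN_NEEDS_PIR[v] for v in fusion)


def viable_fusions(host_has_pir=False):
    """List all fusions of two or more entities that the host can propagate."""
    names = sorted(ORIGIN_NEEDS_PIR)
    fusions = [c for k in range(2, len(names) + 1)
               for c in combinations(names, k)]
    return [f for f in fusions if propagates(f, host_has_pir)]


for fusion in viable_fusions(host_has_pir=False):
    print("+".join(fusion))
```

In a host lacking pir, every construct listed by this model contains the Acceptor, so Donor-Donor fusions drop out without any extra screening step.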
[0095] A preferred embodiment of the above-defined vector assembly
method is described in more detail below (ACEMBL system; Section
C.2.1).
[0096] According to a preferred embodiment of the above-defined
method, the n vector entities, respectively, each contain one or
more expression cassettes of the form Prom-MCS-Term or Prom-MCS (as
defined above, preferably between a HE and restriction enzyme site
X as defined above). It is further preferred that the expression
cassette preferably present in the vectors or vector entities,
respectively, contains one or more genes of interest ("GOI") to be
expressed in a suitable host.
[0097] Another method for providing fusion vectors according to the
present invention is a sequential assembly process wherein in the
first step two of the vector entities are recombined, transformed
into host cells and the host cells cultured in the presence of two
antibiotics. The second round comprises the isolation of the double
fusion vector (n=2) from a viable clone, contacting with a third
vector entity in the presence of the respective recombinase,
transformation into host cells and selection for the three
resistance markers present in the triple fusion vector (n=3) and so
on until the desired multifusion vector is reached.
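The round-by-round selection scheme of this sequential process can be written down as a short plan generator. This is a hedged sketch: the marker assignments and the helper name are assumptions made for illustration only.

```python
# Hypothetical helper; marker assignments are assumptions for illustration.
MARKERS = {"pACE": "Ap", "pDC": "Cm", "pDK": "Kn", "pDS": "Sp"}


def sequential_assembly_plan(order):
    """Return one (fusion so far, antibiotics to select with) pair per round."""
    if len(order) < 2:
        raise ValueError("need at least two vector entities")
    plan = []
    for size in range(2, len(order) + 1):
        fusion = order[:size]
        plan.append((fusion, [MARKERS[v] for v in fusion]))
    return plan


for fusion, antibiotics in sequential_assembly_plan(["pACE", "pDC", "pDK"]):
    print(" + ".join(fusion), "-> select on", "/".join(antibiotics))
```

Each round adds one vector entity and one antibiotic, so the number of rounds is one less than the number of entities in the final fusion.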
[0098] Of course, it is also possible to provide fusion vectors
according to the invention, in particular fusion vectors of higher
order (i.e. n>3) by a combined approach using the vector
assembly method of steps (1) to (5) as defined above (e.g. for
assembling a fusion vector with n=3, 4 or 5) and then adding one or
more further vector entities sequentially as described in the
previous paragraph.
[0099] The principle underlying the above-described method for
assembling a fusion vector, i.e. the equilibrium of educts and
products in recombination reactions, can equally be applied to the
disassembly of fusion vectors.
[0100] Therefore, the invention further provides a method of
disassembling a fusion vector containing n vector entities into one
or more desired fusion vectors selected from the group consisting
of fusion vectors containing 2 to (n-1) vector entities or into one
or more desired single vector entities, wherein in said fusion
vector containing n vector entities said n vector entities are
separated from each other by n site-specific recombination sites
and each vector entity contains an individual resistance marker
different from the resistance markers of the other vector entities,
comprising the steps of:
[0101] (A) contacting the fusion vector containing n vector
entities with a recombinase specific for said site-specific
recombination sites in order to generate a mixture of fusion
vectors comprising 2 to (n-1) of said vector entities and single
vector entities;
[0102] (B) transforming said mixture into host cells; and
[0103] (C) culturing one or more sample(s) of the transformed cells
in the presence of
[0104] (C1) an appropriate combination of antibiotics for selecting
one or more desired fusion vector(s) containing 2 to (n-1) vector
entities; and/or
[0105] (C2) a single appropriate antibiotic for selecting a desired
single vector entity;
[0106] (D) obtaining n single clones of transformed cells from the
sample of the transformed cells in which these were viable in the
presence of the respective antibiotic or combination of
antibiotics, respectively; and
[0107] (E) culturing n samples of each of said n single clones in
the presence of each of n antibiotics specific for the n individual
resistance markers present in said n vector entities;
wherein n is as defined above.
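Step (E) amounts to reading the content of a clone from its growth pattern on the individual antibiotics. The sketch below illustrates that reading under assumed marker assignments (pACE: Ap, pDC: Cm, pDK: Kn, pDS: Sp); the function names are hypothetical.

```python
# Illustrative only: marker assignments and function names are assumptions.
MARKERS = {"pACE": "Ap", "pDC": "Cm", "pDK": "Kn", "pDS": "Sp"}


def infer_entities(growth):
    """Map a clone's growth pattern on single antibiotics to vector entities."""
    by_marker = {ab: vec for vec, ab in MARKERS.items()}
    return sorted(by_marker[ab] for ab, grew in growth.items() if grew)


def matches_target(growth, target):
    """True if the clone carries exactly the desired entities and no others."""
    return infer_entities(growth) == sorted(target)


clone = {"Ap": True, "Cm": False, "Kn": True, "Sp": False}
print(infer_entities(clone))
print(matches_target(clone, ["pDK", "pACE"]))
```

A clone selected for a single vector entity would, in this model, grow on exactly one of the n antibiotics.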
[0108] If it is desired to select for single vectors using the
above fusion vector disassembly method, it is preferred that steps
(A), (B) and (C1) to (E) are carried out for selecting an
appropriate fusion vector containing 2 to (n-1) vector entities and
that steps (A), (B) and (C2) to (E) are then carried out with
said selected fusion vector containing 2 to (n-1) vector entities.
It is understood that this sequential approach can be repeated
which is especially preferred when starting from a fusion vector
containing a higher number of vector entities, i.e. one can select
for a (n-1) fusion vector in the first, then for a (n-2) construct
in the second round and so on, e.g. until reaching a fusion vector
with n=3 or 2 such that the presence of the single vector entities
in the recombinase reaction equilibrium makes the selection of
respective clones containing said single vector entities according
to the selection steps (C2) to (E) more likely.
[0109] Furthermore, in analogy to the above-defined vector assembly
method, it is preferred in the fusion vector disassembly method of
the present invention that (n-1) of the vector entities in said
fusion vector containing n vector entities each contains a further
selectable marker different from the resistance markers such that
only host cells transformed with fusions between a vector entity
not containing the further selectable marker and one or more of the
vector entities containing the selectable marker are viable in step
(C1).
[0110] With respect to preferred selectable markers (conditional
Ori etc.), host cells, the use of multi-well test plates etc.,
reference is made to the preferred embodiments of the vector
assembly method outlined above.
[0111] The fusion vector disassembly method of the present
application is further elaborated below with respect to a preferred
embodiment (ACEMBL system; Section C.2.2).
[0112] The nucleic acids and vectors (including fusion vectors and
single vectors (i.e. vector entities)) of the present invention may
contain further typical sequence elements, e.g. elements that
enable or simplify the detection and/or purification of the
(multiple) proteins expressed from the one or more genes of
interest. Typical examples of such elements are sequences coding
for GFP and its derivatives, His-tags, GST etc.
[0113] Fusion vectors according to the present invention are
advantageously used for the expression of multiprotein complexes in
a suitable host. Thus, the present invention further provides a
corresponding process comprising transforming a fusion vector of
the invention (containing vector entities having inserted one or
more genes of interest, e.g. in form of multiple or single
expression cassettes, or in the form of polygenes as appropriate)
into a suitable host and culturing the transformed host under
conditions allowing simultaneous expression of the genes of
interest.
[0114] From the disclosure of the various aspects of the present
invention the skilled person readily understands that the HE/X site
polynucleotide (in particular corresponding vectors), preferably
used for iterative cloning of multiple expression cassettes, can be
combined with the assembly (or disassembly) methods as defined
above for creating multigene constructs. For example, one or more
of a single gene or multigene vector(s) can be prepared using the
HE/X site elements as described which may then be assembled into
fusion vectors of choice (e.g. triple, quadruple or higher order
fusion vectors) using the recombination-based assembly methods
defined herein. Such fusion vectors may then be (partly or
completely) disassembled as disclosed herein and different
constructs can be assembled in turn as appropriate for the
respective multiprotein application envisaged by the skilled
person. Thus, the aspects of the present invention represent a
building block system which provides the person skilled in the art
with a hitherto unknown freedom of combining multiple genes (or
polygenes) of interest for multiprotein applications.
[0115] The figures show:
[0116] FIG. 1 shows a schematic overview of preferred vectors
according to the present invention for expression of multiprotein
complexes in prokaryotic hosts contained in a preferred kit called
"ACEMBL".
[0117] FIG. 2 is a graphic representation of a preferred embodiment
of the nucleic acid of the present invention called "multiple
integration element" (MIE).
[0118] FIG. 3 shows a schematic overview of a preferred method for
inserting a gene of interest ("GOI") into a vector of the present
invention by sequence and ligation independent cloning (SLIC; see
Tan, S. et al. Protein Expr. Purif. 40, 385 (2005)). A gene of
interest (GOI 1) is PCR amplified with specific primers and
integrated into a vector (Acceptor, Donor) linearized by PCR with
complementary primers (complementary regions are shaded in light
gray or dark grey, respectively). Resulting PCR fragments contain
homology regions at the ends. T4 DNA polymerase acts as an
exonuclease in the absence of dNTP and produces long sticky
overhangs. Mixing (optionally annealing) of T4 DNA polymerase
exonuclease treated insert and vector is followed by transformation
yielding a single gene expression cassette.
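The end design behind this SLIC step can be illustrated with a small model, given here as a hedged sketch. The sequences are made up, the arm length of 15 bases is arbitrary, and all function names are hypothetical. The model only checks that the single-stranded ends exposed by 3'-to-5' exonuclease treatment of insert and vector are complementary.

```python
# Toy model of SLIC end design. Sequences and function names are made up
# for illustration; they are not taken from the disclosed vectors.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def slic_insert(goi, vector, arm):
    """PCR product: the GOI flanked by copies of the linearized vector's ends."""
    return vector[-arm:] + goi + vector[:arm]


def overhangs(top_strand, n):
    """5' overhangs left after a 3'->5' exonuclease trims n bases per 3' end.

    Returns (left overhang on the top strand, right overhang on the bottom
    strand), both written 5'->3'.
    """
    return top_strand[:n], reverse_complement(top_strand[-n:])


def anneals(a, b):
    """Two single strands pair if one is the reverse complement of the other."""
    return a == reverse_complement(b)


vector = "GGATCCGAATTCGAGCTCCGTCGACAAGCTTGCGGCCGCACTCGAG"
goi = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGA"
insert = slic_insert(goi, vector, arm=15)

vec_left, vec_right = overhangs(vector, 15)
ins_left, ins_right = overhangs(insert, 15)
print(anneals(ins_left, vec_right), anneals(ins_right, vec_left))
```

Both checks succeed by construction, because the insert carries copies of the two vector ends; in practice the primer extensions supply these homology regions.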
[0119] FIG. 4 shows a schematic overview of a preferred method for
inserting a polycistron into a vector of the present invention by
SLIC. Genes of interest (GOI 1, 2, 3) are PCR amplified with
specific primers and integrated into a vector (Acceptor, Donor)
linearized by PCR with primers complementary to the ends of the
forward primer of the first (GOI 1) and the reverse primer of the
last (GOI 3) gene to be assembled in the polycistron (complementary
regions are shaded in light gray or dark grey, respectively).
Resulting PCR fragments contain homology regions at the ends. T4
DNA polymerase acts as an exonuclease in the absence of dNTP and
produces long sticky overhangs. Mixing (optionally annealing) of T4
DNA polymerase exonuclease treated insert and vector is followed by
transformation, yielding a polycistronic expression cassette.
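For the multifragment case, the requirement is that every fragment shares a homology region with its neighbour and that the first and last fragments match the vector ends. The following is a minimal sketch under stated assumptions: the sequences, the adaptor sequences, the arm length of 8 bases, and both helper names are hypothetical, and the circular product is written as a linear string starting at the vector.

```python
# Toy model of multifragment SLIC; all sequences and names are made up.

def design_fragments(genes, adaptors, vector, arm):
    """Flank each gene so that it overlaps its neighbours by `arm` bases."""
    if len(adaptors) != len(genes) - 1 or any(len(a) != arm for a in adaptors):
        raise ValueError("need one adaptor of length `arm` between each gene pair")
    left_ends = [vector[-arm:]] + adaptors
    right_ends = adaptors + [vector[:arm]]
    return [l + g + r for l, g, r in zip(left_ends, genes, right_ends)]


def assemble(fragments, vector, arm):
    """Join fragments whose neighbouring ends share `arm` identical bases."""
    chain = [vector] + fragments + [vector]
    for left, right in zip(chain, chain[1:]):
        if left[-arm:] != right[:arm]:
            raise ValueError("adjacent fragments lack a shared homology region")
    product = vector
    for frag in fragments:
        product += frag[arm:]      # keep each shared region only once
    return product[:-arm]          # closing junction repeats the vector start


vector = "TTGACAGCTAGCTCAGTCCTAGG"
genes = ["ATGAAA", "ATGCCC", "ATGGGG"]
adaptors = ["AGGAGGTA", "GGAGATAT"]
fragments = design_fragments(genes, adaptors, vector, arm=8)
polycistron_vector = assemble(fragments, vector, arm=8)
print(polycistron_vector)
```

Swapping two fragments makes `assemble` raise an error, which mirrors the way the homology regions fix the gene order in the polycistron.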
[0120] FIG. 5 shows the sequence of a LoxP imperfect inverted
repeat (SEQ ID NO: 19).
[0121] FIG. 6 (left panel) shows a schematic representation in form
of a pyramid illustrating Cre-mediated assembly and disassembly of
preferred embodiments of the vector of the present invention (pACE,
pDK and pDS vectors). LoxP sites are shown as red circles,
resistance markers and origins are labelled. White arrows stand for
the entire expression cassette (including promoter, terminator and
multiple integration elements) in the ACEMBL vectors. Not all
possible fusion products are shown for clarity. Levels of
multiresistance are indicated in the right panel.
[0122] FIG. 7 is a schematic representation of a multiresistance
analysis of bacterial colonies carrying vector constructs resulting
from Cre-deCre assembly/disassembly according to the invention (cf.
also FIG. 12).
[0123] FIG. 8 shows a schematic representation of the strategy for
cloning of human RAP74 and human RAP30 into vectors of the present
invention for expression of human TFIIF (left panel). hRAP74 was
cloned by SLIC into pDC. hRAP30 was cloned by SLIC into pACE.
Cre-Lox recombination of pDC-RAP74 (donor) and pACE-RAP30 results
in vector pACEMBL-hTFIIF. Results from restriction mapping by
BstZ17I/BamHI double digestion of 11 double resistant (Cm, Ap)
colonies are shown by a gel section from 1% E-gel electrophoresis
(M: NEB 1 kb DNA marker). All clones tested showed the expected
pattern (5.0+2.8 kb) (right panel).
[0124] FIG. 9 illustrates the strategy for cloning of human
VHL/elongin b/elongin c complex (VHLbc) (tricistron) into vector
pACE by multifragment SLIC.
[0125] FIG. 10 shows a schematic representation of the strategy for
iterative cloning of the components of yeast RES complex (Pml1p;
Snu17p, Bud13p) using a preferred homing endonuclease site
(HE)/restriction enzyme site (X) module (PI-SceI/BstXI) according
to the present invention.
[0126] FIG. 11 shows a schematic representation of the generation
of single vectors from multifusion vector pACKS (SEQ ID NO:
18).
[0127] FIG. 12 shows schematic representations and photographs
illustrating a 96 well microtiter analysis of pACKS De-Cre
reaction.
[0128] FIGS. 13A and 13B show the sequence and map of a preferred
nucleic acid ("multiple integration element", MIE) according to the
present invention (SEQ ID NO: 1). Forward and reverse primers for
sequencing can be standard vector primers for T7 and lac. Adaptor
primer sequences (see Table 1) are indicated. DNA sequences in
these homology regions contain tried-and-tested sequencing primers
(Tan et al. (2005), supra). Sites of insertion (I1-I4) are shown.
The adaptor sequences, and probably any sequence in the homology
regions, can be used as adaptors for multifragment insertions. The
ribosome binding site present in the MIE (rbs) is boxed in red.
[0129] FIG. 14 shows a plasmid map of Acceptor vector pACE.
[0130] FIG. 15 shows a plasmid map of Acceptor vector pACE2.
[0131] FIG. 16 shows a plasmid map of Donor vector pDC.
[0132] FIG. 17 shows a plasmid map of Donor vector pDK.
[0133] FIG. 18 shows a plasmid map of Donor vector pDS.
As can be seen in the above plasmid maps, Acceptor vectors pACE
(FIG. 14) and pACE2 (FIG. 15) contain a T7 promoter and terminator.
Donor vectors pDC (FIG. 16), pDK (FIG. 17) and pDS (FIG. 18)
contain conditional origins of replication. pDS (FIG. 18) and pDK
(FIG. 17) have a lac promoter. pDC (FIG. 16) has a T7 promoter.
Resistance markers and origins of replication are shown. LoxP
imperfect inverted repeat sequences are shown as circles. Homing
endonuclease sites and corresponding BstXI sites are boxed. The
restriction enzyme sites in the multiple integration element (MIE)
are indicated. All MIEs have the same DNA sequence between ClaI and
PmeI. Differences in unique restriction site composition stem from
differences in the plasmid backbone sequences.
[0134] FIG. 19 shows the results of a restriction mapping of
preferred vectors according to the invention. Both undigested
Acceptor (pACE, pACE2) and Donor vectors (pDC, pDK, pDS) are shown
as well as the same vectors digested with BamHI. All restriction
reactions yield the expected sizes. Lanes 1-5 show uncut pACE,
pACE2, pDC, pDK, and pDS vectors; lane M shows .lamda. StyI marker;
lanes A-E show BamHI digested pACE, pACE2, pDC, pDK, and pDS
vectors.
[0135] FIG. 20 shows the strategy for Acceptor/Donor recombineering
according to the invention, exemplified as follows. Genes coding for
the Von Hippel-Lindau/elonginB/elonginC (VHLbc) complex (Tan et al.
(2005), supra; see also FIG. 9 above), FtsH soluble domain
(Bieniossek et al. (2006) Proc. Natl. Acad. Sci. USA 103,
3066-3071), blue fluorescent protein (BFP), and green fluorescent
protein (mGFP) with a coiled-coil domain (Berger et al. (2003) Proc.
Natl. Acad. Sci. USA 100, 12177-12182) were inserted into pACE, pDC,
pDK and pDS, respectively. Cre-fusion was followed by transformation into
pir.sup.- cells (TOP10). Aliquots were plated on agar with two
(Ap/Kn; Ap/Cm; Ap/Sp), three (Ap/Kn/Sp) and four (Ap/Kn/Sp/Cm)
antibiotics. Four colonies from each plate were challenged in a 96
well microtiter plate. Labels left of the plate image denote
antibiotics contained in media aliquots in horizontal rows. Wells
in the bottom two rows were charged differently (labels below the
plate image). Those inoculated with four colonies each from one
agar plate are boxed in black, and flagged with antibiotics
contained in the agar plate. Four vertical rows in each such
16-well box were inoculated with the same colony. In the bottom two
rows, four wells in a row were inoculated with the same colony.
Expected vector architecture of the double, triple (ADD) and
quadruple (ADDD) fusions is shown left or right (16 well boxes),
respectively, or below (bottom two rows) of the plate image. Red
dye is used as positional marker. Deconstruction of the ADDD fusion
was carried out successfully in the reverse approach.
[0136] FIG. 21 shows the results of multiprotein complex expression
of human TFIIF (FIG. 21A), the Von Hippel-Lindau/elonginB/elonginC
(VHLbc) complex (FIG. 21B) and the prokaryotic transmembrane
holotranslocon (HTL) YidC-SecYEGDF (FIG. 21C). (A) Human TFIIF was
assembled and purified using a TECAN Freedom EvoII 200 workstation,
and analyzed by SDS-PAGE. Uninduced and induced whole cell extracts
and purified hTFIIF are shown, with subunits (RAP74, RAP30) marked.
RAP74 contained a C-terminal oligohistidine tag. (B) All multigene
constructs from FIG. 20 were assembled, expressed and cell lysates
analyzed in parallel following the same routine as for hTFIIF
(labels as in FIG. 20). The VHLbc complex was captured by an
oligohistidine-thioredoxin-fusion tag on the VHL subunit (Tan et
al. (2005), supra). FtsH contained an oligohistidine tag at its
C-terminus (Bieniossek et al. (2006), supra). Fluorescent proteins
were identified in lysates by Western blot with antibody Roche
1814460 (1:1000 in TBST/3% BSA). (C) Production of the entire
prokaryotic transmembrane holotranslocon (HTL) YidC-SecYEGDF.
Membrane vesicle preparation, detergent solubilization, Ni.sup.2+
affinity capture and size exclusion chromatography resulted in
purified holotranslocon complex (right). Subunits are labeled. A
breakdown product of SecY is marked with an asterisk. In all
panels, M stands for Biorad broad range marker (sizes in kDa).
[0137] FIG. 22 shows a schematic workflow for an automated SLIC
process.
[0138] FIG. 23 shows a schematic workflow for an automated Cre
fusion process.
[0139] FIG. 24 shows a plasmid map of a preferred vector (Donor
vector) of the invention called pIDC (SEQ ID NO: 7).
[0140] FIG. 25 shows the plasmid map of a preferred vector (Donor
vector) of the invention called pIDK (SEQ ID NO: 8).
[0141] FIG. 26 shows the plasmid map of a preferred vector (Donor
vector) of the invention called pIDS (SEQ ID NO: 9).
[0142] FIG. 27 shows the plasmid map of a preferred vector
(Acceptor vector) of the invention called pACEBac1 (SEQ ID NO:
10).
[0143] FIG. 28 shows the plasmid map of a preferred vector
(Acceptor vector) of the invention called pACEBac2 (SEQ ID NO:
11).
[0144] FIG. 29 shows the plasmid map of a preferred vector
(Acceptor vector) of the invention called pACEBac3 (SEQ ID NO:
12).
[0145] FIG. 30 shows the plasmid map of a preferred vector
(Acceptor vector) of the invention called pACEBac4 (SEQ ID NO:
13).
[0146] FIG. 31 shows the plasmid map of a preferred vector
(Acceptor vector) of the invention called pOmniBac1 (SEQ ID NO:
14).
[0147] FIG. 32 shows the plasmid map of a preferred vector
(Acceptor vector) of the invention called pOmniBac2 (SEQ ID NO:
15).
[0148] FIG. 33 shows the plasmid map of a preferred vector
(Acceptor vector) of the invention called pOmniBac3 (SEQ ID NO:
16).
[0149] FIG. 34 shows the plasmid map of a preferred vector
(Acceptor vector) of the invention called pOmniBac4 (SEQ ID NO:
17).
[0150] FIG. 35 shows a scheme for multiprotein expression in insect
cells by generating composite baculovirus using Acceptor vectors of
the present invention carrying a ColE1 origin (pACEBac1, pACEBac2,
pOmniBac1, pOmniBac2). Multigene fusions are generated by Cre-LoxP
fusion of the desired Donor/Acceptor combinations (multigene
construction). The fusion vector is transformed in bacteria
carrying a baculovirus genome (such as baculovirus EMBac or EMBacY)
as a bacterial artificial chromosome (BAC). The vector fusion is
integrated into the baculovirus genome by Tn7 based transposition.
Productive composite viruses are selected by blue/white screening
(integration of the vector fusion into the Tn7 attachment site of
the virus destroys a lacZ gene present on the virus). Composite
viruses are prepared and suitable insect cells are transfected for
protein production.
[0151] FIG. 36 shows a scheme for multiprotein expression in insect
cells by generating composite baculovirus using Acceptor vectors of
the present invention carrying an OriV origin (pACEBac3, pACEBac4,
pOmniBac3, pOmniBac4). Multigene fusions are generated by Cre-LoxP
fusion of the desired Donor/Acceptor combinations. The fusion
vector is transformed in bacteria carrying a baculovirus genome
(such as baculovirus EMBac or EMBacY) as a bacterial artificial
chromosome (BAC). The vector fusion is integrated into the
baculovirus genome by Tn7 based transposition. Since the Acceptor
vectors carrying an OriV can only be propagated, if a trfA gene is
provided in trans, unproductive integration events in bacteria not
containing the trfA gene lead to elimination of such transformants
upon exposure to the appropriate antibiotic (here: gentamycin).
Thus, blue/white screening is not necessary in this case. Composite
viruses are then prepared and suitable insect cells are transfected
for protein production.
[0152] FIG. 37 shows a scheme for multiprotein expression in insect
cells by generating composite baculovirus using Acceptor vectors of
the present invention carrying lef2-603 and Orf1629 homology
sequences (pOmniBac1, pOmniBac2, pOmniBac3, pOmniBac4). Multigene
fusions are generated by Cre-LoxP fusion of the desired
Donor/Acceptor combinations (multigene construction). The multigene
construct and genomic baculovirus DNA carrying a diphtheria toxin A
gene flanked by the lef2-603/Orf1629 homology sequences can be
directly co-transfected into suitable insect cells for protein
production. Transformation of transfer vector into bacteria
containing the baculovirus genome, blue/white screening for
composite viruses and preparation of composite viruses from the
bacteria are no longer necessary.
[0153] FIG. 38 shows a schematic representation of a baculovirus
vector according to the invention called EMBac.
[0154] FIG. 39 shows a schematic representation of a baculovirus
vector according to the invention called EMBacY.
[0155] FIG. 40 shows a schematic representation of a baculovirus
vector according to the invention called EMBac_Direct.
[0156] FIG. 41 shows a schematic representation of a baculovirus
vector according to the invention called EMBac_DirectY.
[0157] FIG. 42 shows a schematic representation of an MIE according
to the invention having the general structure I-CeuI-p10-MCS-BstXI
present, for example in Acceptor vectors such as pACEBac2.
[0158] FIG. 43 shows a schematic representation of an MIE according
to the invention having the general structure PI-SceI-p10-MCS-BstXI
present, for example in Donor vectors such as pIDS.
[0159] FIG. 44 shows a schematic representation of an MIE according
to the invention having the general structure I-CeuI-polh-MCS-BstXI
present, for example in Acceptor vectors such as pACEBac1.
[0160] FIG. 45 shows a schematic representation of an MIE according
to the invention having the general structure
PI-SceI-polh-MCS-BstXI present, for example in Donor vectors such
as pIDC.
[0161] FIG. 46 shows a schematic representation of vector
pACEBac1-HisIKK1.
[0162] FIG. 47 shows a schematic representation of vector
pIDC-CSIKK2.
[0163] FIG. 48 shows a schematic representation of vector
pIDS-IKK3.
[0164] FIG. 49 shows a schematic representation of vector
pACEBac-HA-NA.
[0165] FIG. 50 shows a schematic representation of vector
pIDC-M1-M2.
[0166] The present invention is in the following further described
in detail with reference to preferred embodiments designated as
the "ACEMBL" system.
A. Synopsis
[0167] The preferred embodiments according to the present invention
denoted as "ACEMBL" provide a multi-expression system for multigene
expression in E. coli and insect cells using the baculovirus
system. ACEMBL can be used both manually and also in an automated
setup by using a liquid handling workstation. ACEMBL applies tandem
recombination steps for rapidly assembling many genes into
multigene expression cassettes. These can be polycistronic or
multiple expression modules, or a combination of these elements.
ACEMBL also offers the option to employ conventional approaches
involving restriction enzymes and ligase, if desired.
[0168] The following strategies for multigene assembly and
expression are provided for in the ACEMBL system:
[0169] (1) Single gene insertions into vectors (recombination or
restriction/ligation)
[0170] (2) Multigene assembly into a polycistron (recombination or
restriction/ligation)
[0171] (3) Multigene assembly using homing endonucleases
[0172] (4) Multigene plasmid fusion by Cre-LoxP reaction
[0173] (5) Multigene expression by cotransformation in E. coli
[0174] (6) Multigene expression in insect cells using the
baculovirus system
[0175] These strategies can be used individually or in conjunction,
depending on the project and user.
[0176] In the following Section C, step-by-step protocols are
provided for each of these methods for multigene cassette assembly
that can be used in the ACEMBL system.
B. ACEMBL System
B.1 ACEMBL Vectors
[0177] The present invention provides as preferred exemplary
embodiments small de novo designed vectors which are called
"Acceptor" and "Donor" vectors (FIGS. 1 and 20; for plasmid maps,
see FIGS. 14 to 18 and FIGS. 21 to 31). Acceptor vectors for
expression of proteins in prokaryotic hosts (e.g. pACE, pACE2)
contain origins of replication derived from ColE1 and resistance
markers (ampicillin or tetracycline). Donor vectors contain
conditional origins of replication (derived from R6K.gamma.), which
make their propagation dependent on hosts expressing the pir gene.
Donor vectors contain resistance markers kanamycin,
chloramphenicol, and spectinomycin. Preferably, three Donor vectors
are used in conjunction with one Acceptor vector.
[0178] All Donor and Acceptor vectors according to the present example
contain a LoxP imperfect inverted repeat and in addition, a
multiple integration element (MIE). The preferred MIE of the
invention comprises an expression cassette with a promoter of
choice (prokaryotic, mammalian, insect cell specific or a
combination thereof) and a terminator (prokaryotic, mammalian,
insect cell specific or a combination thereof). In between is a DNA
segment which contains a number of restriction sites that can be
used for conventional cloning approaches or also for generating
double-strand breaks for the integration of expression elements of
choice (further promoters, ribosomal binding sites, terminators and
genes). The MIE is completed by a homing endonuclease site and a
specifically-designed restriction enzyme site (BstXI) flanking the
promoter and the terminator (see B.2).
[0179] The sequences of ACEMBL vectors for expression in
prokaryotic hosts are outlined in the sequence listing (pACE: SEQ
ID NO: 2; pACE2: SEQ ID NO: 3; pDC: SEQ ID NO: 4; pDK: SEQ ID NO:
5; pDS: SEQ ID NO: 6; pACKS: SEQ ID NO: 18). Maps of the vectors
pACE, pACE2, pDC, pDK and pDS are shown in FIGS. 14 to 18.
[0180] The ACEMBL system according to the present invention also
provides Donor and Acceptor vectors adapted for expression of
multiprotein complexes in insect cells using baculovirus (pIDC (SEQ
ID NO: 7), pIDK (SEQ ID NO: 8), pIDS (SEQ ID NO: 9), pACEBac1 (SEQ
ID NO: 10), pACEBac2 (SEQ ID NO: 11), pACEBac3 (SEQ ID NO: 12),
pACEBac4 (SEQ ID NO: 13), pOmniBac1 (SEQ ID NO: 14), pOmniBac2 (SEQ
ID NO: 15), pOmniBac3 (SEQ ID NO: 16) and pOmniBac4 (SEQ ID NO:
17)). Plasmid maps of the vectors are shown in FIGS. 24 to 34.
[0181] Donor vectors pIDC, pIDK and pIDS contain a conditional
origin of replication (from R6Kgamma phage), a homing endonuclease
(HE) site (PI-SceI) and a complementary BstXI site (see the
corresponding E. coli vectors pDC, pDK, pDS). Donors are propagated
in cell strains containing the pir gene.
[0182] In contrast to the versions adapted for expression in
bacteria, the vectors for expression of proteins in insect cells do
not contain prokaryotic promoter and terminator structures.
Instead, they have either a polh expression cassette (polh EC) or a
p10 expression cassette (p10 EC). These expression cassettes
contain common polyhedrin or p10 promoters from AcMNPV, an
oligonucleotide encoding restriction sites (different from the
MIE in the prokaryotic ACEMBL version) and either SV40 or HSVtk
polyadenylation signal sequences.
[0183] Obviously, due to the HE and BstXI sites, the expression
cassettes can be freely exchanged between the vectors, even if
they contain an inserted gene. This can be done by restriction
ligation or by restriction enzyme/ligase independent methods (e.g.
SLIC). Therefore, versions can be created with ease which contain a
p10 or polh marker in combination with any one of the resistance
markers (spectinomycin, kanamycin, chloramphenicol, or others).
[0184] The HE/BstXI site combinations can be used to multiply
expression cassettes or also to fit the vectors with combinations
of p10 and polh expression cassettes.
[0185] All Donors contain a LoxP inverted imperfect repeat. This
can be used for LoxP mediated constructions and deconstructions of
Acceptor/Donor multifusions as described for the bacterial ACEMBL
vectors.
[0186] The present embodiment of the invention relating to vectors
adapted for protein expression in insect cells provides a number of
Acceptor vectors in the baculovirus-version of ACEMBL. These share
common features: all contain a LoxP site, a resistance marker
(gentamycin) and again either a p10 or a polh expression cassette
(identical to the ones present in the Donors).
[0187] The expression cassettes of the Acceptors are flanked by a
homing endonuclease site (I-CeuI) and a corresponding BstXI
site.
[0188] The expression cassettes can be exchanged in between the
Acceptors and also multiplied or combined using the HE/BstXI
combination as described for the bacterial ACEMBL vectors.
[0189] There are two families of Acceptors in terms of the origin
used:
[0190] pACEBac1, pACEBac2, pOmniBac1 and pOmniBac2 all contain a
ColE1 origin of replication which allows propagation in all common
E. coli cloning cell strains.
[0191] All Acceptor vectors contain Tn7L and Tn7R sequences which
enable integration of the region in between into a Tn7 attachment
site by using the Tn7 transposition procedure.
[0192] pACEBac3, pACEBac4, pOmniBac3 and pOmniBac4 contain a
conditional origin of replication (OriV) from V. cholerae which is
dependent on the trfA gene that needs to be provided in trans in
the cloning strains used. The function of this OriV is to eliminate
the background (blue colonies) when these Acceptors, fitted with
genes and if required fused with Donors, are transformed into cells
that contain the baculovirus genome in form of a bacterial
artificial chromosome (i.e. DH10Bac from Invitrogen and similar).
Here, the Tn7 transposition system is used to integrate the regions
in between Tn7L and Tn7R of the DNA transformed into the cells into
a Tn7 attachment site on the viral genome of choice. Normally,
unproductive integration events would result in blue colonies (if
the Tn7 attachment site is embedded in a LacZalpha gene on the
baculovirus genome). These blue colonies propagate the plasmid
transformed outside of the baculovirus genome. With these four OriV
containing plasmids, the blue colonies cannot survive upon exposure
to Gentamycin (since the DH10Bac or other cells do not contain
trfA) and only white colonies are produced, which all contain
productively integrated composite bacmid carrying the heterologous
genes provided on the plasmid transformed; see also the scheme in
FIG. 36.
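By way of illustration only, the selection logic of the two Acceptor families can be written as a small decision function. The Python sketch below is not part of the vectors or protocols themselves. The function name, argument names and returned labels are hypothetical, and it assumes gentamycin selection and a Tn7 attachment site embedded in a LacZalpha gene, as described above.

```python
def colony_outcome(origin: str, host_has_trfA: bool, productive_integration: bool) -> str:
    """Predict the colony phenotype on gentamycin plates (illustrative model).

    origin: "ColE1" (pACEBac1/2, pOmniBac1/2) or "OriV" (pACEBac3/4, pOmniBac3/4).
    """
    if origin not in ("ColE1", "OriV"):
        raise ValueError(f"unknown origin: {origin}")
    if productive_integration:
        # Tn7 transposition into the attachment site disrupts LacZalpha.
        return "white"
    # Without transposition, resistance is only carried by the free plasmid,
    # so survival depends on whether that plasmid can replicate in the host.
    plasmid_replicates = origin == "ColE1" or host_has_trfA
    return "blue" if plasmid_replicates else "no colony"
```

Under these assumptions, an OriV Acceptor in a host lacking trfA yields either white colonies or none, which is why blue/white screening becomes unnecessary.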
[0193] The Acceptor vectors pOmniBac1-4 contain, in addition to the
Tn7L and Tn7R regions, also the lef2-603 and Orf1629 homology
sequences. These are used for homologous recombination procedures
for generating composite baculovirus as used by the Novagen
Bacvector series, the Baculogold system from Pharmingen, FlashBac
from OET and others. Thus, these Acceptor vectors can be used for
every baculovirus system that is currently available, including the
Tn7 based baculoviruses and all viruses relying on lef2-603/Orf1629
homologous recombination procedures, for expressing heterologous
genes in insect cell cultures, see also the scheme in FIG. 37.
B.2 Multiple Integration Element (MIE)
[0194] A preferred multiple integration element (MIE) according to
the invention was derived from a polylinker (see Tan et al. (2005)
supra) and allows for several approaches for multigene assembly
(see Section C below). Multiple genes can be inserted into the MIE
of any one of the vectors by a variety of methods, for example
BD In-Fusion recombination (see ClonTech TaKaRa Bio Europe,
www.clontech.com) or SLIC (sequence and ligation independent
cloning; see Li et al. (2007) Nat. Methods 4, 251). For this, the
vector needs to be linearized, which can also be carried out
efficiently by PCR reaction with appropriate primers, since the
vectors are all small (2-3.0 kb). Use of ultrahigh-fidelity
polymerases such as Phusion (Finnzymes/New England BioLabs,
www.neb.com) is preferred. Alternatively, if more conventional
approaches shall be used, e.g. in an ordinary wet lab setting
without robotics, the vectors can also be linearized by restriction
digestion, and a gene of interest can be integrated by
restriction/ligation (see below Section C of the present
embodiment). The DNA sequence (SEQ ID NO: 1) and map of the present
MIE are shown in FIG. 13.
B.3. Tags, Promoters, Terminators
[0195] For expression of proteins in prokaryotic hosts, the vectors
of the ACEMBL system contain by default promoters T7 and Lac, as
well as the T7 terminator element (FIGS. 1, 14). The T7 system
requires bacterial strains which contain a T7 polymerase gene, e.g.
in the E. coli genome. The Lac promoter is a strong endogenous
promoter which can be utilized in most strains. The present ACEMBL
vectors contain the lac operator element for repression of
heterologous expression.
[0196] Evidently, all promoters and terminators present in ACEMBL
Donor and Acceptor vectors, and in fact the entire multiple
integration element (MIE), excluding the HE and X Site,
respectively, can be exchanged with an expression cassette of
choice by using restriction/ligation cloning with appropriate
enzymes (for example ClaI/PmeI, FIG. 2) or insertion into
linearized ACEMBL vectors where the MIE was removed by sequence and
ligation independent approaches such as SLIC. For example, the T7
promoter in pDC can be substituted with a trc promoter
(pDC.sup.trc), and the T7 promoter in pACE with an arabinose
promoter (pACE.sup.ara). Such variants can be used successfully in
coexpression experiments by inducing with arabinose and IPTG.
[0197] In contrast to the ACEMBL vectors for expression in
prokaryotic hosts, the vectors for expression in insect cells do
not contain prokaryotic promoter and terminator structures. As
already mentioned above, they have either a polh expression
cassette (polh EC) or a p10 expression cassette (p10 EC). These
expression cassettes contain common polyhedrin or p10 promoters
from AcMNPV, a sequence of restriction sites and either SV40 or
HSVtk polyadenylation signal sequences.
[0198] The ACEMBL system vectors of the present example do not
contain DNA sequences encoding for affinity tags to facilitate
purification or solubilization of the protein(s) of interest.
However, typically used C- or N-terminal oligohistidine tags, with
or without protease sites for tag removal, can be introduced by
means of the respective PCR primers used for amplification of the
genes of interest prior to insertion into the MIE, e.g. by
SLIC-mediated insertion. Thus, Donor and Acceptor vectors of the
present invention may be equipped with an array of custom tags prior
to inserting recombinant genes of interest. This is best done by a
design which will, after tag insertion, still be compatible with the
recombination-based principles of ACEMBL system usage.
B.4 Complex Expression
[0199] For expression in E. coli, the ACEMBL multigene expression
vector fusions with appropriate promoters or terminators are
transformed into the appropriate expression host of choice. With
respect to the present exemplary vectors (T7 and lac promoter
elements), most of the wide array of currently available expression
strains can be utilized. If particular expression strains already
contain helper plasmids with DNA encoding chaperones, lysozyme
or the like, the design of the multigene fusion is preferably such that
the ACEMBL vector containing the resistance marker that is also
present on the helper plasmid is not included in multigene vector
construction.
[0200] Alternatively, if further vectors are required for complex
production in an experiment, the issue can be resolved by creating
alternative versions of the ACEMBL vectors containing resistance
markers that circumvent the conflict. This can be easily performed
by PCR amplifying the vectors minus the resistance marker, and
combining the resulting fragments with a PCR amplified resistance
marker by recombination (SLIC) or blunt-end ligation (using 5'
phosphorylated primers).
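The marker conflict discussed in the two preceding paragraphs can be detected before any cloning is done. The following Python sketch is a hedged illustration, not a prescribed tool: the function name is hypothetical, and the marker assignments are those stated in this description for pACE, pACE2, pDC, pDK and pDS.

```python
# Resistance markers of the prokaryotic ACEMBL vectors as given in the text.
ACEMBL_MARKERS = {
    "pACE": "ampicillin",
    "pACE2": "tetracycline",
    "pDC": "chloramphenicol",
    "pDK": "kanamycin",
    "pDS": "spectinomycin",
}


def conflicting_vectors(fusion, helper_markers):
    """Return the vectors of a planned fusion whose resistance marker is
    already present on a helper plasmid of the expression strain."""
    helper = {marker.lower() for marker in helper_markers}
    return sorted(v for v in fusion if ACEMBL_MARKERS[v] in helper)
```

For a strain whose helper plasmid carries chloramphenicol resistance, `conflicting_vectors(["pACE", "pDC", "pDK"], ["chloramphenicol"])` flags pDC, which would then either be left out of the fusion or be replaced by a variant with a different marker.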
[0201] Donor vectors of the present example depend on expression by
the host of the pir gene product, due to the R6K.gamma. conditional
origin of replication. In regular expression strains, they rely on
fusion with an Acceptor for productive replication. Donors or
Donor-Donor fusions can nonetheless be used even for expression
when not fused with an Acceptor, by using expression strains
carrying a genomic insertion of the pir gene. Such strains are
commercially available (Novagen Inc., Madison Wis., USA).
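The replication rules for Donors, Acceptors and their fusions reduce to a simple check, sketched below in Python. This is an illustrative model under the assumptions stated in this description (Donors carry the R6K.gamma. conditional origin, pACE and pACE2 carry a ColE1-derived origin); the function and constant names are hypothetical.

```python
CONDITIONAL_ORIGIN = {"pDC", "pDK", "pDS"}   # R6K gamma origin, needs the pir gene product
COLE1_ORIGIN = {"pACE", "pACE2"}             # replicates in common E. coli strains


def can_replicate(construct, host_has_pir):
    """construct: names of the vectors fused into one plasmid
    (a single-element list for an unfused vector)."""
    parts = set(construct)
    if not parts:
        raise ValueError("construct must contain at least one vector")
    unknown = parts - CONDITIONAL_ORIGIN - COLE1_ORIGIN
    if unknown:
        raise ValueError(f"unknown vectors: {sorted(unknown)}")
    if parts & COLE1_ORIGIN:
        # An Acceptor in the fusion supplies a host-independent origin.
        return True
    # Donors and Donor-Donor fusions depend on pir provided by the host.
    return host_has_pir
```

In this model a Donor-Donor fusion such as `["pDC", "pDK"]` replicates only in a pir-expressing strain, whereas `["pACE", "pDC", "pDK"]` replicates in regular expression strains as well.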
[0202] Cotransformation of two ACEMBL plasmids adapted for
expression in bacteria can lead to a successful protein complex
expression. The present ACEMBL system for expression in prokaryotic
hosts contains two Acceptor vectors, pACE and pACE2, which are
identical except for the resistance marker (FIGS. 1, 14). These can
be used to express genes present on pACE or pACE2, respectively, by
cotransformation and exposure to both antibiotics simultaneously.
In fact, entire Acceptor-Donor fusions containing several genes,
based on pACE or pACE2 as Acceptors, can in principle be
cotransformed for multi-expression, if needed.
[0203] For expression in insect cells (such as Sf9, Sf21, Hi5 etc.)
using the baculovirus system, suitable ACEMBL vectors of the
present invention need to be integrated into a baculovirus genome
(composite virus generation). This is typically carried out by
transformation of the desired Cre-LoxP fusion into bacterial cells
containing the desired virus genome as a bacterial artificial
chromosome. When the vector system of the present invention
adapted for baculovirus integration is used, three approaches are
possible as outlined in FIGS. 35, 36 and 37, respectively.
C. Procedures
[0204] C.1. Cloning into ACEMBL Vectors
[0205] All Donors and Acceptors of the preferred embodiment for
expression in prokaryotic hosts contain an identical MIE with
exception of the homing endonuclease site/BstXI tandem
encompassing the MIE (FIGS. 1 and 14). The MIE is tailored for
sequence and ligation independent gene insertion methods. In
addition, the MIE also contains a series of unique restriction
sites, and therefore can be used as a classical polylinker for
conventional gene insertion by restriction/ligation. For automated
applications insertion of genes of interest is preferably carried
out by recombination approaches such as SLIC.
[0206] The Donor vectors for expression in insect cells according
to the present preferred embodiment also contain an MIE which is,
however, different for each vector (see plasmid maps of vectors
pIDC, pIDK and pIDS in FIGS. 21, 22 and 23, respectively).
C.1.1. Single Gene Insertion into the MIE by SLIC
[0207] Several procedures for restriction/ligation independent
insertion of genes into vectors have been published or
commercialized (e.g. Novagen LIC, Becton-Dickinson BD In-Fusion
etc). These systems share in common that they rely on the
exonuclease activity of DNA polymerases. In the absence of dNTPs,
5' extensions are created from blunt ends or overhangs by digestion
from the 3' end. If two DNA fragments contain the same .about.20-30
bp sequence at their termini at opposite ends, this results in
overhangs that share complementary sequences capable of annealing.
This can be exploited for ligation independent combination of two
or several DNA fragments containing homologous sequences.
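The principle described above can be illustrated with a simplified model in which each double-stranded fragment is represented by its top strand and the exonuclease removes a fixed number of nucleotides from the 3' end of each strand. The Python sketch below is an assumption-laden illustration (uniform chew-back length, perfect homology, hypothetical function names), not a simulation of the actual enzyme kinetics.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def exposed_5prime_overhangs(top, chew):
    """Return (left, right) single-stranded 5' overhangs, each written 5'->3',
    after removing `chew` nucleotides from the 3' end of both strands."""
    if chew < 0 or 2 * chew > len(top):
        raise ValueError("chew-back must be between 0 and half the fragment length")
    if chew == 0:
        return "", ""
    left = top[:chew].upper()        # top strand, bottom strand 3' end removed
    right = revcomp(top[-chew:])     # bottom strand, top strand 3' end removed
    return left, right


def can_anneal(frag_a, frag_b, homology, chew):
    """True if the right end of frag_a can pair with the left end of frag_b
    over `homology` nucleotides of exposed single strand."""
    if chew < homology:
        return False                 # homology region not fully single-stranded
    _, a_right = exposed_5prime_overhangs(frag_a, chew)
    b_left, _ = exposed_5prime_overhangs(frag_b, chew)
    return a_right[:homology] == revcomp(b_left[:homology])
```

Two fragments anneal in this model only if the last `homology` nucleotides of the first equal the first `homology` nucleotides of the second and the chew-back is long enough to expose them.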
[0208] If T4 DNA polymerase is used, this can be carried out in a
manner that is independent of the sequences of the homology regions
(Sequence and Ligation Independent Cloning, SLIC) and detailed
protocols are available for the skilled person. In the context of
multiprotein expression, this is particularly useful, as this
approach is independent of the presence of unique restriction
sites, or of their creation by mutagenesis, in the ensemble of
encoding DNAs.
[0209] For use in the context of the present invention, the SLIC
process was adapted for inserting encoding DNAs amplified by
Phusion polymerase into the ACEMBL Acceptor and Donor vectors. In
this way, not only seamless integration of genes into the
expression cassettes, but also concatamerization of expression
cassettes to multigene constructs can be achieved by applying the
same, simple routine that can be readily automated.
[0210] The following Protocol 1 represents an improved process
based on the method described in Li et al. (2007, supra). Protocol
1 is designed for manual operation. Other systems may be used (e.g.
BD In-Fusion etc.), and if so, the manufacturers' recommendations
should be followed. The present protocol may be adapted for
robotics applications. Corresponding modifications of the protocol
are outlined in Section D.
Protocol 1: Single Gene Insertion by SLIC.
[0211] Reagents required:
[0212] Phusion Polymerase
[0213] 5.times.HF Buffer for Phusion Polymerase
[0214] dNTP mix (10 mM)
[0215] T4 DNA polymerase (and 10.times. Buffer)
[0216] DpnI enzyme
[0217] E. coli competent cells
[0218] 100 mM DTT, 2M Urea, 500 mM EDTA
[0219] Antibiotics
Step 1: Primer Design
[0220] Primers for the SLIC procedure are designed to provide the
regions of homology which result in the long sticky ends upon
treatment with T4 DNA polymerase in the absence of dNTP.
[0221] Primers for the insert contain a DNA sequence corresponding
to this region of homology ("Adaptor sequence" in FIG. 3, inset),
followed by a sequence which specifically anneals to the insert to
be amplified (FIG. 3, inset). Useful examples of adaptor sequences
for SLIC are listed below (Table I).
[0222] The "insert specific sequence" can be located upstream of a
ribosome binding site (rbs), for example if the gene of interest
(GOI) is amplified from a vector already containing expression
elements (e.g. the pET vector series). Otherwise, the forward
primer needs to be designed such that a ribosome binding site is
also provided in the final construct (FIG. 3, inset).
[0223] Primers for PCR linearization of the vector backbone are
simply complementary to the two adaptor sequences present in the
primer pair chosen for insert amplification (FIG. 3).
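The primer design rules of Step 1 can be sketched in Python as follows. The adaptor sequences are NdeInsFor and XhoInsRev as listed in Table I. Everything else is an assumption made for illustration: the function names are hypothetical, the 20-nucleotide annealing length is a common default rather than a value prescribed here, and the caller decides where the target region starts and ends (for example whether a stop codon is included).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Adaptor sequences as listed in Table I (SEQ ID NO: 24 and SEQ ID NO: 25).
NDE_INS_FOR = "GTTTAACTTTAAGAAGGAGATATACATATG"
XHO_INS_REV = "GGGTTTAAACGGAACTAGTCTCGAG"


def insert_primers(target, fwd_adaptor=NDE_INS_FOR, rev_adaptor=XHO_INS_REV, anneal_len=20):
    """Adaptor at the 5' end, insert-specific sequence at the 3' end."""
    if len(target) < anneal_len:
        raise ValueError("target shorter than the annealing length")
    forward = fwd_adaptor + target[:anneal_len].upper()
    reverse = rev_adaptor + revcomp(target[-anneal_len:])
    return forward, reverse


def vector_primers(fwd_adaptor=NDE_INS_FOR, rev_adaptor=XHO_INS_REV):
    """Vector primers are the reverse complements of the insert adaptors:
    the vector forward primer pairs with the insert reverse adaptor."""
    return revcomp(rev_adaptor), revcomp(fwd_adaptor)
```

With the default adaptors, `vector_primers()` reproduces the XhoVecFor and NdeVecRev sequences of Table I, which is a convenient consistency check when new adaptor pairs are designed.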
Step 2: PCR Amplification of Insert and Vector
[0224] Identical reactions are prepared in 100 .mu.l volume for DNA
insert to be cloned and vector to be linearized by PCR:
TABLE-US-00001
Component | Volume
ddH.sub.2O | 75 .mu.l
5x Phusion HF Reaction buffer | 20 .mu.l
dNTPs (10 mM stock) | 2 .mu.l
Template DNA (100 ng/.mu.l) | 1 .mu.l
5' SLIC primer (100 .mu.M stock) | 1 .mu.l
3' SLIC primer (100 .mu.M stock) | 1 .mu.l
Phusion polymerase (2 U/.mu.l) | 0.5 .mu.l
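When several inserts and the vector are amplified in parallel, the shared components of the above reaction can be combined into a master mix. The Python sketch below scales the per-reaction volumes. The split into shared and per-reaction components and the 10% pipetting overage are assumptions made for illustration, and the function names are hypothetical.

```python
# Per-reaction volumes in microliters, taken from the reaction table above.
SHARED = {
    "ddH2O": 75.0,
    "5x Phusion HF Reaction buffer": 20.0,
    "dNTPs (10 mM stock)": 2.0,
    "Phusion polymerase (2 U/ul)": 0.5,
}
PER_REACTION = {
    "Template DNA (100 ng/ul)": 1.0,
    "5' SLIC primer (100 uM stock)": 1.0,
    "3' SLIC primer (100 uM stock)": 1.0,
}


def master_mix(n_reactions, overage=0.1):
    """Volumes of the shared components for n reactions plus overage."""
    if n_reactions < 1:
        raise ValueError("at least one reaction is required")
    factor = n_reactions * (1.0 + overage)
    return {name: round(vol * factor, 2) for name, vol in SHARED.items()}


def aliquot_volume():
    """Master mix volume to dispense per tube before adding template and primers."""
    return sum(SHARED.values())
```

Each tube then receives `aliquot_volume()` microliters of master mix plus the template and primer volumes listed in `PER_REACTION`.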
[0225] PCR reactions are then carried out with a standard PCR
program (if very long DNAs are amplified, the extension time is
doubled):
[0226] 1.times.98.degree. C. for 2 min
[0227] 30.times.[98.degree. C. for 20 sec.->50.degree. C. for 30 sec.->72.degree. C. for 3 min]
[0228] Hold at 10.degree. C.
[0229] Analysis of the PCR reactions by agarose gel electrophoresis
and ethidium bromide staining is recommended.
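For planning purposes, the nominal run time of the above program follows directly from the step durations. The short Python sketch below adds them up. It ignores ramping between temperatures, so actual instrument times will be longer, and the function name is hypothetical.

```python
def program_duration_s(cycles=30, denature_s=20, anneal_s=30, extend_s=180,
                       initial_denature_s=120):
    """Nominal duration in seconds, excluding temperature ramping and the final hold."""
    return initial_denature_s + cycles * (denature_s + anneal_s + extend_s)


standard_s = program_duration_s()                       # 3 min extension
long_template_s = program_duration_s(extend_s=2 * 180)  # doubled extension time
```

The standard program thus takes 117 min of nominal cycling time, and the variant with doubled extension time takes 207 min.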
Step 3: DpnI treatment of PCR Products (Optional)
[0230] PCR reactions are then supplied with 1 .mu.l DpnI enzyme
which cleaves parental plasmids (that are methylated). For insert
PCR reactions, DpnI treatment is not required if the resistance
marker of the template plasmid differs from the destination
vector.
[0231] Reactions are then carried out as follows:
[0232] Incubation: 37.degree. C. for 1-4 h
[0233] Inactivation: 80.degree. C. for 20 min
Step 4: Purification of PCR Products
[0234] PCR products should be cleaned of residual dNTPs. Otherwise,
the T4 DNA polymerase reaction (Step 5) is compromised. Product
purification is preferably performed by using commercial PCR
Purification Kits or NucleoSpin Kits (e.g. from Qiagen,
Macherey-Nagel etc.). It is recommended to perform elution in the
minimal possible volume indicated by the respective
manufacturer.
Step 5: T4 DNA Polymerase Exonuclease Treatment
[0235] Identical reactions are prepared in 20 .mu.l volume for
insert and for vector (eluted in Step 4):
TABLE-US-00002
Component | Volume
10x T4 DNA polymerase buffer | 2 .mu.l
100 mM DTT | 1 .mu.l
2M Urea | 2 .mu.l
DNA eluate from Step 4 (vector or insert) | 14 .mu.l
T4 DNA polymerase | 1 .mu.l
[0236] Reactions are then carried out as follows:
[0237] Incubation: 23.degree. C. for 20 min
[0238] Arrest: Addition of 1 .mu.l 500 mM EDTA
[0239] Inactivation: 75.degree. C. for 20 min
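The final concentrations in the exonuclease reaction follow from the stock concentrations and volumes given above by simple dilution (stock concentration times volume added, divided by total volume). The Python sketch below works through the arithmetic; the helper name is hypothetical.

```python
def final_conc(stock, vol_added, total_vol):
    """Dilution: final concentration in the same unit as `stock`."""
    return stock * vol_added / total_vol


REACTION_UL = 2 + 1 + 2 + 14 + 1                 # buffer, DTT, urea, DNA, enzyme

dtt_mM = final_conc(100.0, 1.0, REACTION_UL)     # 100 mM stock, 1 ul
urea_mM = final_conc(2000.0, 2.0, REACTION_UL)   # 2 M stock, 2 ul
# Arrest step: 1 ul of 500 mM EDTA added to the 20 ul reaction.
edta_mM = final_conc(500.0, 1.0, REACTION_UL + 1)
```

This gives 5 mM DTT and 200 mM urea during the incubation, and about 23.8 mM EDTA after the arrest step.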
Step 6: Mixing and Annealing
[0240] T4 DNA polymerase exonuclease-treated insert and vector are
then mixed, followed by an (optional) annealing step which enhances
the efficiency:
TABLE-US-00003
Component | Volume
T4 DNA pol-treated insert | 10 .mu.l
T4 DNA pol-treated vector | 10 .mu.l
[0241] Annealing: 65.degree. C. for 10 min
[0242] Cooling: Slowly (in heat block) to RT (room temperature)
Step 7: Transformation
[0243] Mixtures are next transformed into competent cells following
standard transformation procedures.
[0244] Reactions for pACE and pACE2 derivatives are transformed
into standard E. coli cells for cloning (such as TOP10, DH5.alpha.,
HB101) and after recovery (2-4 h) plated on agar containing
ampicillin (100 .mu.g/ml) or tetracycline (25 .mu.g/ml),
respectively.
[0245] Reactions for Donor derivatives are transformed into E. coli
cells expressing the pir gene (such as BW23473, BW23474, or PIR1
and PIR2, Invitrogen) and plated on agar containing chloramphenicol
(25 .mu.g/ml, pDC), kanamycin (50 .mu.g/ml, pDK), and spectinomycin
(50 .mu.g/ml, pDS).
Step 8: Plasmid Analysis
[0246] Clones are cultured in small-scale in media containing the
corresponding antibiotic, and analyzed by sequencing and
(optionally) restriction mapping with an appropriate restriction
enzyme.
[0247] C.1.2. Polycistron Assembly in MIE by SLIC
[0248] The multiple integration element according to the present
invention can also be used to integrate genes of interest by using
multi-fragment SLIC recombination as shown in FIG. 4. Genes
preceded by ribosome binding sites (rbs) can be assembled in this
way into polycistrons.
[0249] A detailed protocol is outlined in the following Protocol
2:
[0250] Protocol 2. Polycistron assembly by SLIC.
[0251] Reagents required:
[0252] Phusion Polymerase
[0253] 5.times.HF Buffer for Phusion Polymerase
[0254] dNTP mix (10 mM)
[0255] T4 DNA polymerase (and 10.times. Buffer)
[0256] E. coli competent cells
[0257] 100 mM DTT, 2M Urea, 500 mM EDTA
[0258] Antibiotics
Step 1: Primer Design
[0259] The MIE element according to the present embodiment is
composed of tried-and-tested primer sequences. These constitute the
"Adaptor" sequences that can be used for inserting single genes or
multigene constructs. Examples of useful adaptor sequences are
listed below (see Table I).
[0260] Adaptor sequences form the 5' segments of the primers used
to amplify DNA fragments to be inserted into the MIE. Insert
specific sequences are added at the 3' end. DNA coding for a ribosome
binding site can optionally be inserted, if not already present on
the PCR template.
Step 2: PCR Amplification of Insert and Vector
[0261] Identical reactions are prepared in 100 .mu.l volume for all
DNA inserts (GOI 1, 2, 3) to be cloned and the vector to be
linearized by PCR:
TABLE-US-00004
Component | Volume
ddH.sub.2O | 75 .mu.l
5x Phusion HF Reaction buffer | 20 .mu.l
dNTPs (10 mM stock) | 2 .mu.l
Template DNA (100 ng/.mu.l) | 1 .mu.l
5' SLIC primer (100 .mu.M stock) | 1 .mu.l
3' SLIC primer (100 .mu.M stock) | 1 .mu.l
Phusion polymerase (2 U/.mu.l) | 0.5 .mu.l
[0262] PCR reactions are then carried out with a standard PCR
program (if very long DNAs are amplified, the extension time is
doubled):
[0263] 1.times.98.degree. C. for 2 min
[0264] 30.times.[98.degree. C. for 20 sec.->50.degree. C. for 30 sec.->72.degree. C. for 3 min]
[0265] Hold at 10.degree. C.
[0266] Analysis of the PCR reactions by agarose gel electrophoresis
and ethidium bromide staining is recommended.
Step 3: DpnI Treatment of PCR Products (Optional)
[0267] PCR reactions are then supplied with 1 .mu.l DpnI enzyme
which cleaves parental plasmids (that are methylated). For insert
PCR reactions, DpnI treatment is not required if the resistance
marker of the template plasmids differs from the destination
vector.
[0268] Reactions are then carried out as follows:
[0269] Incubation: 37.degree. C. for 1-4 h
[0270] Inactivation: 80.degree. C. for 20 min
Step 4: Purification of PCR Products
[0271] PCR products should be cleaned of residual dNTPs. Otherwise,
the T4 DNA polymerase reaction (Step 5) is compromised.
[0272] Product purification is preferably performed by using
commercial PCR Purification Kits or NucleoSpin Kits (Qiagen,
Macherey-Nagel or others). It is recommended to perform elution in
the minimal possible volume indicated by the respective
manufacturer.
Step 5: T4 DNA Polymerase Exonuclease Treatment
[0273] Identical reactions are prepared in 20 .mu.l volume for each
insert (GOI 1, 2, 3) and for the vector (eluted in Step 4):
TABLE-US-00005
Component | Volume
10x T4 DNA polymerase buffer | 2 .mu.l
100 mM DTT | 1 .mu.l
2M Urea | 2 .mu.l
DNA eluate from Step 4 (vector or insert) | 14 .mu.l
T4 DNA polymerase | 1 .mu.l
[0274] Reactions are then carried out as follows:
[0275] Incubation: 23.degree. C. for 20 min
[0276] Arrest: Addition of 1 .mu.l 500 mM EDTA
[0277] Inactivation: 75.degree. C. for 20 min
Step 6: Mixing and Annealing
[0278] T4 DNA polymerase exonuclease-treated insert and vector are
then mixed, followed by an (optional) annealing step which enhances
efficiency.
TABLE-US-00006
Component | Volume
T4 DNA pol-treated insert 1 (GOI 1) | 5 .mu.l
T4 DNA pol-treated insert 2 (GOI 2) | 5 .mu.l
T4 DNA pol-treated insert 3 (GOI 3) | 5 .mu.l
T4 DNA pol-treated vector | 5 .mu.l
[0279] Annealing: 65.degree. C. for 10 min
[0280] Cooling: Slowly (in heat block) to RT
Step 7: Transformation
[0281] Mixtures are next transformed into competent cells following
standard transformation procedures.
[0282] Reactions for pACE and pACE2 derivatives are transformed
into standard E. coli cells for cloning (such as TOP10, DH5.alpha.,
HB101) and after recovery plated on agar containing ampicillin (100
.mu.g/ml) or tetracycline (25 .mu.g/ml), respectively.
[0283] Reactions for Donor derivatives are transformed into E. coli
cells expressing the pir gene (such as BW23473, BW23474, or PIR1
and PIR2, available from Invitrogen) and plated on agar containing
chloramphenicol (25 .mu.g/ml, pDC), kanamycin (50 .mu.g/ml, pDK),
and spectinomycin (50 .mu.g/ml, pDS).
Step 8: Plasmid Analysis
[0284] Clones are cultured and correct clones are selected based
on specific restriction digestion and DNA sequencing of the
inserts.
TABLE-US-00007 TABLE I: Adaptor DNA sequences for single gene or multigene insertions into ACEMBL vectors by SLIC.
Adaptor* | Sequence | Description
(a) Adaptors for cloning into ACEMBL vectors for expression in prokaryotic hosts
T7InsFor | TCCCGCGAAATTAATACGACTCACTATAGGG (SEQ ID NO: 20) | Forward primer for insert amplification, if gene of interest (GOI) is present in a T7 system vector (i.e. pET series). No further extension (rbs, insert specific overlap) required.
T7InsRev | CCTCAAGACCCGTTTAGAGGCCCCAAGGGGTTATGCTAG (SEQ ID NO: 21) | Reverse primer for insert amplification, if GOI is present in a T7 system vector (i.e. pET series). No further extension (stop codon, insert specific overlap) required.
T7VecFor | CTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGG (SEQ ID NO: 22) | Forward primer for vector amplification, reverse complement of T7InsRev. No further extension required.
T7VecRev | CCCTATAGTGAGTCGTATTAATTTCGCGGGA (SEQ ID NO: 23) | Reverse primer for vector amplification, reverse complement of T7InsFor. No further extension required.
NdeInsFor | GTTTAACTTTAAGAAGGAGATATACATATG (SEQ ID NO: 24) | Forward primer for insert amplification for insertion into MIE site I1 (FIG. 2). Further extension at 3' (insert specific overlap) required. Can be used with adaptor XhoInsRev in case of single fragment SLIC (FIG. 3).
XhoInsRev | GGGTTTAAACGGAACTAGTCTCGAG (SEQ ID NO: 25) | Reverse primer for insert amplification for insertion into MIE site I4 (FIG. 2). Further extension at 3' (stop codon, insert specific overlap) required. Can be used with adaptor NdeInsFor in case of single fragment SLIC (FIG. 3).
XhoVecFor | CTCGAGACTAGTTCCGTTTAAACCC (SEQ ID NO: 26) | Forward primer for vector amplification, reverse complement of XhoInsRev. No further extension required.
NdeVecRev | CATATGTATATCTCCTTCTTAAAGTTAAAC (SEQ ID NO: 27) | Reverse primer for vector amplification, reverse complement of NdeInsFor. No further extension required.
SmaBam | GAATTCACTGGCCGTCGTTTTACAGGATCC (SEQ ID NO: 28) | Reverse primer for insert amplification (GOI1) for insertion into MIE site I1 (FIG. 2). Further extension at 3' (stop codon, insert specific overlap) required. Use with adaptor NdeInsFor.
BamSma | GGATCCTGTAAAACGACGGCCAGTGAATTC (SEQ ID NO: 29) | Forward primer for insert amplification (GOI2) for insertion into site I2 (FIG. 2, 4). Further extension at 3' (rbs, insert specific overlap) required. Use with adaptor SacHind (multifragment SLIC, FIG. 4).
SacHind | GCTCGACTGGGAAAACCCTGGCGAAGCTT (SEQ ID NO: 30) | Reverse primer for insert amplification (GOI2) for insertion into MIE site I2 (FIG. 2, 4). Further extension at 3' (stop codon, insert specific overlap) required. Use with adaptor BamSma (multifragment SLIC, FIG. 4).
HindSac | AAGCTTCGCCAGGGTTTTCCCAGTCGAGC (SEQ ID NO: 31) | Forward primer for insert amplification (GOI3) for insertion into site I3 (FIG. 2, 4). Further extension at 3' (rbs, insert specific overlap) required. Use with adaptor BspEco5 (multifragment SLIC, FIG. 4).
BspEco5 | GATCCGGATGTGAAATTGTTATCCGCTGGTACC (SEQ ID NO: 32) | Reverse primer for insert amplification (GOI3) for insertion into MIE site I3 (FIG. 2, 4). Further extension at 3' (stop codon, insert specific overlap) required. Use with adaptor HindSac (multifragment SLIC, FIG. 4).
Eco5Bsp | GGTACCAGCGGATAACAATTTCACATCCGGATC (SEQ ID NO: 33) | Forward primer for insert amplification (GOI3) for insertion into site I4 (FIG. 2, 4). Further extension at 3' (rbs, insert specific overlap) required. Use with adaptor XhoInsRev (multifragment SLIC, FIG. 4).
(b) Adaptors for cloning into ACEMBL vectors for expression in insect cells
PolhInsFor | CCCACCATCGGGCGCGGATCCCG (SEQ ID NO: 34) | Forward primer for insert amplification, needs to be followed by insert specific sequence (ca. 20 bp).
PolhInsRev | CGAGACTGCAGGCTCTAGATTCG (SEQ ID NO: 35) | Reverse primer for insert amplification, needs to be followed by insert specific sequence (ca. 20 bp).
PolhVecFor | CGGGATCCGCGCCCGATGGTGGG (SEQ ID NO: 36) | Forward primer for vector amplification, reverse complement of PolhInsRev. No further extension required.
PolhVecRev | CGAATCTAGAGCCTGCAGTCTCG (SEQ ID NO: 37) | Reverse primer for vector amplification, reverse complement of PolhInsFor. No further extension required.
P10InsFor | CTCCCGGTACCGCATGCTATGCATCAGC (SEQ ID NO: 38) | Forward primer for insert amplification, needs to be followed by insert specific sequence (ca. 20 bp).
P10InsRev | AATCACTCGACGAAGACTTGATCACC (SEQ ID NO: 39) | Reverse primer for insert amplification, needs to be followed by insert specific sequence (ca. 20 bp).
P10VecFor | GCTGATGCATAGCATGCGGTACCGGGAG (SEQ ID NO: 40) | Forward primer for vector amplification, reverse complement of P10InsRev. No further extension required.
P10VecRev | GGTGATCAAGTCTTCGTCGAGTGATT (SEQ ID NO: 41) | Reverse primer for vector amplification, reverse complement of P10InsFor. No further extension required.
* All Adaptor primers (without extension) can be used as sequencing primers for genes of interest that were inserted into the MIE according to the present embodiment.
C.1.3. Gene Insertion by Restriction/Ligation
[0285] The MIEs of the present invention can also be used as a
multiple cloning site with a series of unique restriction sites.
Preferably, the MIE described herein for expression of proteins in
prokaryotic hosts is preceded by a promoter and a ribosome binding
site, and followed by a terminator. The MIEs of the preferred
embodiments described herein for expression of proteins in insect
cells contain a polh expression cassette or a p10 expression
cassette as already mentioned above. Therefore, cloning into the
MIE by classical restriction/ligation also yields functional
expression cassettes.
[0286] Genes of interest (GOI) can be subcloned by using standard
cloning procedures into the multiple integration element (MIE)
(see, for example, FIG. 13) of ACEMBL vectors.
Protocol 3. Restriction/Ligation Cloning into an MIE
[0287] Reagents required: [0288] Phusion Polymerase [0289]
5.times.HF Buffer for Phusion Polymerase [0290] dNTP mix (10 mM)
[0291] 10 mM BSA [0292] Restriction endonucleases (and 10.times.
Buffer) [0293] T4 DNA ligase (and 10.times. Buffer) [0294] Calf or
Shrimp intestinal-alkaline phosphatase [0295] E. coli competent
cells [0296] Antibiotics
Step 1: Primer Design
[0297] For conventional cloning, PCR primers are designed
containing chosen restriction sites, preceded by appropriate
overhangs for efficient cutting (see, e.g. New England Biolabs
catalogue), and followed by .gtoreq.20 nucleotides overlapping with
the gene of interest that is to be inserted.
[0298] In the case of the ACEMBL system for expression in bacteria,
the MIE of the present embodiment is identical for all ACEMBL
vectors. They contain a ribosome binding site (rbs) preceding the
NdeI site. For single gene insertions, therefore, an rbs need not
be included in the primer.
[0299] If multigene insertions are needed (for example in insertion
sites I1-I4 of the MIE), primers should be designed such that an
rbs preceding the gene and a stop codon at its 3' end are
provided.
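The primer layout described in Step 1 can be sketched as follows. This is a minimal illustration, not the applicant's tool: the function names, the 5' clamp "GCGC" and the rbs string are hypothetical placeholders, and the restriction sites passed in are assumed to be palindromic.

```python
# Minimal sketch of restriction-cloning primer assembly (hypothetical helpers).
# Assumptions: the 5' clamp "GCGC" and the rbs string are placeholders only;
# the restriction site is palindromic, so it reads the same on both strands.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def forward_primer(gene, site, clamp="GCGC", rbs="", overlap=20):
    """5' clamp + restriction site + optional rbs + first `overlap` nt of the gene."""
    return clamp + site + rbs + gene[:overlap].upper()


def reverse_primer(gene, site, clamp="GCGC", stop="TAA", overlap=20):
    """5' clamp + restriction site + reverse complement of (gene 3' end + stop codon)."""
    tail = gene[-overlap:].upper() + stop
    return clamp + site + reverse_complement(tail)


if __name__ == "__main__":
    gene = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATT"  # made-up ORF fragment
    print(forward_primer(gene, "GGATCC", rbs="AAGGAGATATACAT"))
    print(reverse_primer(gene, "AAGCTT"))
```

For a single gene inserted behind the vector's own rbs, the `rbs` argument would simply be left empty; for the downstream genes of a polycistron it would carry the chosen rbs.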
[0300] In particular for polycistron cloning by
restriction/ligation, it is recommended to construct templates by
custom gene synthesis. In the process, the restriction sites
present in the MIE can be eliminated from the encoding DNAs.
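Before ordering a synthetic template, a coding sequence can be screened for internal copies of the sites intended for cloning. The sketch below is illustrative only: `internal_sites` is a hypothetical helper, and the four enzymes listed are merely examples of palindromic sites, not the complete set of sites present in the MIE.

```python
# Sketch: report internal occurrences of selected restriction sites in an ORF.
# The enzyme list is an illustrative assumption, not the full MIE site set.
# All four sites are palindromic, so scanning one strand is sufficient.
SITES = {
    "NdeI": "CATATG",
    "XhoI": "CTCGAG",
    "BamHI": "GGATCC",
    "HindIII": "AAGCTT",
}


def internal_sites(orf, sites=SITES):
    """Map enzyme name -> list of 0-based positions where its site occurs."""
    orf = orf.upper()
    hits = {}
    for name, site in sites.items():
        positions = [
            i for i in range(len(orf) - len(site) + 1)
            if orf[i:i + len(site)] == site
        ]
        if positions:
            hits[name] = positions
    return hits


if __name__ == "__main__":
    print(internal_sites("ATGAAAGGATCCGCTCTCGAGTAA"))
```

Any hit would be a candidate for removal by a synonymous codon change during gene synthesis.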
Step 2: Insert Preparation
PCR of Insert(s):
[0301] Identical PCR reactions are prepared in 100 .mu.l volume for
genes of interest to be inserted into the MIE:
TABLE-US-00008
ddH.sub.2O                          75 .mu.l
5x Phusion HF Reaction buffer       20 .mu.l
dNTPs (10 mM stock)                 2 .mu.l
Template DNA (100 ng/.mu.l)         1 .mu.l
5' primer (100 .mu.M stock)         1 .mu.l
3' primer (100 .mu.M stock)         1 .mu.l
Phusion polymerase (2 U/.mu.l)      0.5 .mu.l
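When several identical reactions are set up, the shared components of the table above can be combined into a master mix. The following is a minimal sketch; the 10% pipetting excess and the function name are assumptions, and template and primers are left out because they differ per reaction.

```python
# Sketch: scale the shared PCR components (volumes in microliters per reaction,
# taken from the table above) to a master mix for n reactions.
# Assumption: a 10% excess is added to cover pipetting losses.
PER_REACTION_UL = {
    "ddH2O": 75.0,
    "5x Phusion HF buffer": 20.0,
    "dNTPs (10 mM)": 2.0,
    "Phusion polymerase (2 U/ul)": 0.5,
}


def master_mix(n_reactions, excess=0.1):
    """Return the volume of each shared component for n reactions plus excess."""
    factor = n_reactions * (1 + excess)
    return {name: round(vol * factor, 2) for name, vol in PER_REACTION_UL.items()}


if __name__ == "__main__":
    for name, vol in master_mix(8).items():
        print(f"{name}: {vol} ul")
```

Each reaction then receives 97.5 microliters of master mix plus template and the two primers.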
[0302] PCR reactions are then carried out with a standard PCR
program (if very long DNAs are amplified, the extension time is
doubled): [0303] 1.times.98.degree. C. for 2 min [0304]
30.times.[98.degree. C. for 20 sec.->50.degree. C. for 30
sec.->72.degree. C. for 3 min] [0305] Hold at 10.degree. C.
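As a rough planning aid, the programmed hold time of this cycling program can be added up as below. This is only a sketch: the function name is hypothetical, and ramping between temperatures and the final hold are deliberately ignored, so the real run time will be longer.

```python
# Sketch: sum of the programmed hold times of the standard PCR program
# (initial 2 min denaturation, then 30 cycles of 20 s / 30 s / 3 min).
# Ramp times and the final hold are not included (assumption of this sketch).
def programmed_hold_seconds(cycles=30, denature=20, anneal=30, extend=180,
                            initial_denature=120, double_extension=False):
    """Return the total programmed hold time in seconds."""
    if double_extension:  # for very long DNAs the extension time is doubled
        extend *= 2
    return initial_denature + cycles * (denature + anneal + extend)


if __name__ == "__main__":
    print(programmed_hold_seconds() / 60, "min")
    print(programmed_hold_seconds(double_extension=True) / 60, "min")
```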
[0306] Analysis of the PCR reactions by agarose gel electrophoresis
and ethidium bromide staining is recommended.
[0307] Product purification is preferably performed by using
commercial PCR Purification Kits or NucleoSpin Kits (available from
Qiagen, Macherey-Nagel and other manufacturers). It is recommended
to perform elution in the minimal possible volume indicated by the
manufacturer.
Restriction Digestion of Insert(s):
[0308] Restriction reactions are carried out in 40 .mu.l reaction
volumes, using specific restriction enzymes as specified by
manufacturer's recommendations (see, e.g. New England Biolabs
catalogue and others).
TABLE-US-00009
PCR Kit eluate (.gtoreq.1 .mu.g)     30 .mu.l
10x Restriction enzyme buffer       4 .mu.l
10 mM BSA                           2 .mu.l
Restriction enzyme for 5'           2 .mu.l
Restriction enzyme for 3'           2 .mu.l
(in case of double digestion, otherwise ddH.sub.2O)
[0309] Restriction digestions are performed in a single reaction
with both enzymes (double digestion), or, alternatively,
sequentially (two single digestions) if the buffer conditions
required are incompatible.
Gel Extraction of Insert(s):
[0310] Processed insert is then purified by agarose gel extraction
using commercial kits (Qiagen, Macherey-Nagel etc). It is
recommended to elute the extracted DNA in the minimal volume
defined by the respective manufacturer.
Step 3: Vector Preparation
[0311] Restriction digestion of ACEMBL plasmid(s):
[0312] Restriction reactions are carried out in 40 .mu.l reaction
volumes, using specific restriction enzymes as specified by
manufacturer's recommendations (see, e.g. New England Biolabs
catalogue and others).
TABLE-US-00010
ACEMBL plasmid (.gtoreq.0.5 .mu.g) in ddH.sub.2O   30 .mu.l
10x Restriction enzyme buffer                     4 .mu.l
10 mM BSA                                         2 .mu.l
Restriction enzyme for 5'                         2 .mu.l
Restriction enzyme for 3'                         2 .mu.l
(in case of double digestion, otherwise ddH.sub.2O)
[0313] Restriction digestions are performed in a single reaction
with two enzymes (double digestion), or, alternatively,
sequentially (two single digestions), if the buffer conditions
required are incompatible.
Gel Extraction of Vector(s):
[0314] The processed vector is then purified by agarose gel
extraction using commercial kits (Qiagen, Macherey-Nagel etc.). It
is recommended to elute the extracted DNA in the minimal volume
defined by the respective manufacturer.
Step 4: Ligation
[0315] Ligation reactions are carried out in 20 .mu.l reaction
volumes according to the recommendations of the supplier of the T4
DNA ligase:
TABLE-US-00011
ACEMBL plasmid (gel extracted)      8 .mu.l
Insert (gel extracted)              10 .mu.l
10x T4 DNA Ligase buffer            2 .mu.l
T4 DNA Ligase                       0.5 .mu.l
[0316] Ligation reactions are performed at 25.degree. C. (sticky
end) for 1 h or at 16.degree. C. (blunt end) overnight.
Step 5: Transformation
[0317] Mixtures are next transformed into competent cells following
standard transformation procedures.
[0318] Reactions for Acceptor derivatives are transformed into
standard E. coli cells for cloning (such as TOP10, DH5.alpha.,
HB101) and after recovery plated on agar containing ampicillin (100
.mu.g/ml) or tetracycline (25 .mu.g/ml), respectively.
[0319] Reactions for Donor derivatives are transformed into E. coli
cells expressing the pir gene (such as BW23473, BW23474, or PIR1
and PIR2, Invitrogen) and plated on agar containing chloramphenicol
(25 .mu.g/ml, pDC), kanamycin (50 .mu.g/ml, pDK), and spectinomycin
(50 .mu.g/ml, pDS).
Step 6: Plasmid Analysis
[0320] Plasmids are cultured and correct clones are selected based
on specific restriction digestion and DNA sequencing of the
inserts.
C.1.4. Multiplication by Using the HE and BstXI Sites
[0321] The ACEMBL system vectors according to the present invention
contain a homing endonuclease (HE) site and a designed BstXI site
that envelop the multiple integration element (MIE). The homing
endonuclease site can be used to insert entire expression
cassettes, containing single genes or polycistrons, into a vector
already containing one gene or several genes of interest. Homing
endonucleases have long recognition sites (12 to 40 base pairs or
more, preferably 20-30 base pairs). Although not all equally
stringent, homing endonuclease sites are most probably unique in
the context of even large plasmids, or, in fact, entire
genomes.
[0322] In the ACEMBL system of the present embodiment, Donor
vectors contain a recognition site for homing endonuclease PI-SceI
(FIG. 2). This HE site yields upon cleavage a 3' overhang with the
sequence -GTGC. Acceptor vectors contain the homing endonuclease
site I-CeuI, which upon cleavage will result in a 3' overhang of
-CTAA. On Acceptors and Donors, the respective HE site precedes
the MIE. The 3' end of the MIE contains a specifically designed
BstXI site, which upon cleavage will generate a matching overhang.
The basis of this is the specificity of cleavage by BstXI. The
recognition sequence of BstXI is defined as CCANNNNN'NTGG (SEQ ID
NO: 42) (apostrophe marks position of phosphodiester link
cleavage). The residues denoted as N can be chosen freely. Donor
vectors thus contain a BstXI recognition site of the sequence
CCATGTGC'CTGG (SEQ ID NO: 43), and Acceptor vectors contain
CCATCTAA'TTGG (SEQ ID NO: 44). The overhangs generated by BstXI
cleavage in each case will match the overhangs generated by HE
cleavage. Note that Acceptors and Donors have different HE
sites.
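The overhang produced by a designed BstXI site can be read directly from its six freely chosen residues. The sketch below assumes the cut geometry given in supplier documentation (top strand cut after the fifth N, bottom strand cut so that a four-nucleotide 3' overhang remains); the function name is hypothetical.

```python
import re

# Sketch: derive the 4-nt 3' overhang of a BstXI site CCANNNNN'NTGG.
# Assumption (supplier documentation, not the text above): the top strand is
# cut after N5 and the bottom strand after N1, leaving N2..N5 as 3' overhang.
BSTXI = re.compile(r"CCA([ACGT]{6})TGG")


def bstxi_overhang(site):
    """Return the 3' overhang of a BstXI site; the apostrophe marker is ignored."""
    match = BSTXI.fullmatch(site.replace("'", "").upper())
    if match is None:
        raise ValueError(f"not a BstXI site: {site!r}")
    n_residues = match.group(1)
    return n_residues[1:5]


if __name__ == "__main__":
    print(bstxi_overhang("CCATGTGC'CTGG"))  # Donor site, SEQ ID NO: 43
    print(bstxi_overhang("CCATCTAA'TTGG"))  # Acceptor site, SEQ ID NO: 44
```

Under these assumptions the Donor site gives GTGC and the Acceptor site gives CTAA, the latter matching the I-CeuI overhang stated above.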
[0323] The recognition sites are not symmetric. Therefore, ligation
of a HE/BstXI digested fragment into a HE site of an ACEMBL vector
will be (1) directional and (2) result in a hybrid DNA sequence
where a HE half site is combined with a BstXI half site. This site
will be cut by neither HE nor BstXI. Therefore, in a construct that
had been digested with a HE, insertion by ligation of HE/BstXI
digested DNA fragment containing an expression cassette with one or
several genes will result in a construct which contains all
heterologous genes of interest, enveloped by an intact HE site in
front, and a BstXI site at the end. Therefore, the process of
integrating entire expression cassettes by means of HE/BstXI
digestion and ligation into a HE site can be repeated
iteratively.
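The reasoning in the preceding paragraph can be checked on sequence level for the Acceptor case. This is a sketch under stated assumptions: the I-CeuI recognition sequence and its cut position are taken from supplier documentation rather than from this text, and all names are hypothetical.

```python
import re

# Sketch: junctions formed when an I-CeuI/BstXI fragment is ligated into an
# I-CeuI-cut Acceptor. Assumed top-strand halves of the I-CeuI site, split at
# the top-strand cut (sequence per supplier documentation, not this text).
ICEUI_LEFT, ICEUI_RIGHT = "TAACTATAACGGTCCTAA", "GGTAGCGAA"
# Acceptor BstXI site CCATCTAA'TTGG (SEQ ID NO: 44), split at the apostrophe.
BSTXI_LEFT, BSTXI_RIGHT = "CCATCTAA", "TTGG"
BSTXI = re.compile(r"CCA[ACGT]{6}TGG")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def junctions():
    """Return (upstream, downstream) junction sequences after insertion."""
    upstream = ICEUI_LEFT + ICEUI_RIGHT    # vector HE half + insert HE half
    downstream = BSTXI_LEFT + ICEUI_RIGHT  # insert BstXI half + vector HE half
    return upstream, downstream


def cleavable(seq):
    """Report whether seq still carries a full I-CeuI or BstXI site."""
    he_site = ICEUI_LEFT + ICEUI_RIGHT
    he = he_site in seq or he_site in reverse_complement(seq)
    return {"HE": he, "BstXI": BSTXI.search(seq) is not None}


if __name__ == "__main__":
    up, down = junctions()
    print("upstream  ", cleavable(up))
    print("downstream", cleavable(down))
```

In this model the upstream junction restores the HE site while the downstream hybrid is recognized by neither enzyme, which is what allows the next round of insertion.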
Protocol 4. Multiplication by Using Homing Endonuclease/BstXI.
[0324] Reagents required: [0325] Homing endonucleases PI-SceI,
I-CeuI [0326] 10.times. Buffers for homing endonucleases [0327]
Restriction enzyme BstXI (and 10.times. Buffer) [0328] T4 DNA
ligase (and 10.times. Buffer) [0329] E. coli competent cells [0330]
Antibiotics
Step 2: Insert Preparation
[0331] Restriction reactions are carried out in 40 .mu.l reaction
volumes, using homing endonucleases PI-SceI (Donors) or I-CeuI
(Acceptors) as recommended by the supplier (e.g. New England
Biolabs or others).
TABLE-US-00012
ACEMBL plasmid (.gtoreq.0.5 .mu.g) in ddH.sub.2O   32 .mu.l
10x Restriction enzyme buffer                     4 .mu.l
10 mM BSA                                         2 .mu.l
PI-SceI (Donors) or I-CeuI (Acceptors)            2 .mu.l
[0332] Reactions are then purified by PCR extraction kit or acidic
ethanol precipitation, and next digested by BstXI according to the
recommendations of the supplier.
TABLE-US-00013
HE digested DNA in ddH.sub.2O       32 .mu.l
10x Restriction enzyme buffer       4 .mu.l
10 mM BSA                           2 .mu.l
BstXI                               2 .mu.l
Gel Extraction of Insert(s):
[0333] Processed insert is then purified by agarose gel extraction
using commercial kits (Qiagen, Macherey-Nagel etc). It is
recommended to elute the extracted DNA in the minimal volume
defined by the respective manufacturer.
Step 3: Vector Preparation
[0334] Restriction reactions are carried out in 40 .mu.l reaction
volumes, using homing endonucleases PI-SceI (Donors) or I-CeuI
(Acceptors) as recommended by the supplier (e.g. New England
Biolabs catalogue or others).
TABLE-US-00014
ACEMBL plasmid (.gtoreq.0.5 .mu.g) in ddH.sub.2O   33 .mu.l
10x Restriction enzyme buffer                     4 .mu.l
10 mM BSA                                         2 .mu.l
PI-SceI (Donors) or I-CeuI (Acceptors)            1 .mu.l
[0335] Reactions are then purified by PCR extraction kit or acidic
ethanol precipitation, and next treated with intestinal alkaline
phosphatase according to the recommendations of the respective
supplier.
TABLE-US-00015
HE digested DNA in ddH.sub.2O       17 .mu.l
10x Alkaline phosphatase buffer     2 .mu.l
Alkaline phosphatase                1 .mu.l
Gel Extraction of Vector:
[0336] Processed vector is then purified by agarose gel extraction
using commercial kits (Qiagen, Macherey-Nagel etc). It is
recommended to elute the extracted DNA in the minimal volume
defined by the respective manufacturer.
Step 4: Ligation
[0337] Ligation reactions are carried out in 20 .mu.l reaction
volumes:
TABLE-US-00016
HE/Phosphatase treated vector (gel extracted)   4 .mu.l
HE/BstXI treated insert (gel extracted)         14 .mu.l
10x T4 DNA Ligase buffer                        2 .mu.l
T4 DNA Ligase                                   0.5 .mu.l
[0338] Ligation reactions are performed at 25.degree. C. for 1 h or
at 16.degree. C. overnight.
Step 5: Transformation
[0339] Mixtures are next transformed into competent cells following
standard transformation procedures.
[0340] Reactions for Acceptor derivatives are transformed into
standard E. coli cells for cloning (such as TOP10, DH5.alpha.,
HB101) and after recovery plated on agar containing ampicillin (100
.mu.g/ml) or tetracycline (25 .mu.g/ml), respectively.
[0341] Reactions for Donor derivatives are transformed into E. coli
expressing the pir gene (such as BW23473, BW23474, or PIR1 and
PIR2, Invitrogen) and plated on agar containing chloramphenicol (25
.mu.g/ml, pDC, pIDC), kanamycin (50 .mu.g/ml, pDK, pIDK), and
spectinomycin (50 .mu.g/ml, pDS, pIDS).
Step 6: Plasmid Analysis
[0342] Plasmids are cultured and correct clones selected based on
specific restriction digestion and DNA sequencing of the
inserts.
C.2. Cre-LoxP Reaction of Acceptors and Donors
[0343] Cre recombinase is a member of the integrase family (Type I
topoisomerase from bacteriophage P1). It recombines a 34 bp loxP
site (SEQ ID NO: 19; see FIG. 5) in the absence of accessory
protein or auxiliary DNA sequence. The loxP site is comprised of
two 13 bp recombinase-binding elements arranged as inverted repeats
which flank an 8 bp central region where the cleavage and ligation
reactions occur.
[0344] The site-specific recombination mediated by Cre recombinase
involves the formation of a Holliday junction (HJ). The
recombination events catalyzed by Cre recombinase are dependent on
the location and relative orientation of the LoxP sites. Two DNA
molecules, for example an Acceptor and a Donor plasmid, containing
single LoxP sites will be fused. Furthermore, the Cre recombination
is an equilibrium reaction with 15-20% efficiency in recombination.
This creates useful options for multigene combinations for
multiprotein complex expressions.
[0345] In a reaction where several DNA molecules such as Donors and
Acceptors are incubated with Cre recombinase, the fusion/excision
activity of the enzyme will result in an equilibrium state where
single vectors (educt vectors) and all possible fusions coexist.
Donor vectors can be fused with Acceptors and/or Donors, and
likewise for Acceptor vectors. Higher order fusions are also
generated where more than two vectors are fused. This is shown
schematically in FIG. 6.
[0346] The fact that Donors of the present example contain a
conditional origin of replication that depends on a pir.sup.+ (pir
positive) background now allows for selecting out from this
reaction mix all desired Acceptor-Donor(s) combinations. For this,
the reaction mix is used to transform pir negative strains
(TOP10, DH5.alpha., HB101 or other common laboratory cloning
strains). Then, Donor vectors will act as suicide vectors when
plated out on agar containing the antibiotic corresponding to the
Donor encoded resistance marker, unless fused with an Acceptor. By
using agar with the appropriate combinations of antibiotics, all
desired Acceptor-Donor fusions can be selected for.
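The selection logic can be sketched as a small enumeration. The model below is a simplification with hypothetical names: each vector is assumed to occur at most once per fusion, a plasmid is assumed to replicate in a pir negative host only if it contains an Acceptor, and marker assignments follow the vectors named in this document (pACE, pDC, pDK, pDS).

```python
from itertools import combinations

# Sketch: which fusions can give colonies in a pir-negative host on a given
# antibiotic combination. Simplifying assumptions: each vector occurs at most
# once per fusion; Donors replicate only when fused to an Acceptor.
VECTORS = {
    "pACE": {"role": "acceptor", "marker": "ampicillin"},
    "pDC": {"role": "donor", "marker": "chloramphenicol"},
    "pDK": {"role": "donor", "marker": "kanamycin"},
    "pDS": {"role": "donor", "marker": "spectinomycin"},
}


def propagates_in_pir_negative(fusion):
    """A plasmid replicates without pir only if it carries an Acceptor origin."""
    return any(VECTORS[v]["role"] == "acceptor" for v in fusion)


def survivors(antibiotics):
    """List all vector combinations that grow on every antibiotic given."""
    names = sorted(VECTORS)
    result = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            markers = {VECTORS[v]["marker"] for v in combo}
            if propagates_in_pir_negative(combo) and set(antibiotics) <= markers:
                result.append(combo)
    return result


if __name__ == "__main__":
    for fusion in survivors({"ampicillin", "kanamycin"}):
        print("-".join(fusion))
```

Note that plating on a given antibiotic combination does not exclude higher order fusions that carry additional markers, which is why a further single-antibiotic challenge is useful.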
[0347] In this way, fusion vectors of 25 kb and larger can be
generated. In stability tests (serial passaging for more than 60
generations), even such large plasmids are stable as checked by
restriction mapping, even if only one of the antibiotics
corresponding to the encoded resistance markers was provided in the
growth medium.
C.2.1. Cre-LoxP Fusion of Acceptors and Donors
[0348] The following protocol is designed for generating multigene
fusions from Donors and Acceptors by Cre-LoxP reaction.
[0349] Reagents: [0350] Cre recombinase [0351] Standard E. coli
competent cells (pir.sup.- strain) [0352] Antibiotics [0353] 96
well microtiter plates
[0354] 12 well tissue-culture plates (or petri dishes) w.
agar/antibiotics [0355] LB media
[0356] 1. For a 20 .mu.l Cre reaction, mix 1-2 .mu.g of each educt
in approximately equal amounts (5' DNA termini). Add ddH.sub.2O to
adjust the total volume to 16.about.17 .mu.l, then add 2 .mu.l
10.times.Cre buffer and 1.about.2 .mu.l Cre recombinase.
[0357] 2. Incubate the Cre reaction at 37.degree. C. (or 30.degree.
C.) for 1 hour.
[0358] 3. Optional: load 2-5 .mu.l Cre reaction on an analytical
agarose gel for examination. Heat inactivation at 70.degree. C. for
10 minutes before the gel loading is strongly recommended.
[0359] 4. For chemical transformation, mix 10-15 .mu.l Cre reaction
with 200 .mu.l chemical competent cells, incubate the mixture on
ice for 15-30 minutes. Then perform heat shock at 42.degree. C. for
45-60 s. [0360] Up to 20 .mu.l Cre reaction (0.1 volumes of the
chemical competent cell suspension) can be directly transformed
into 200 .mu.l chemical competent cells. [0361] For
electrotransformation, up to 2 .mu.l Cre reaction could be directly
mixed with 100 .mu.l electrocompetent cells, and transformed by
using an electroporator (e.g. BIORAD E. coli Pulser) at 1.8-2.0 kV.
[0362] Larger volumes of Cre reactions should be desalted by
ethanol precipitation or PCR purification column before
electrotransformation. The desalted Cre reaction mix does
preferably not exceed 0.1 volumes of the electrocompetent cell
suspension. [0363] The cell/DNA mixture could be immediately used
for electrotransformation without prolonged incubation on ice.
[0364] 5. Add up to 400 .mu.l of LB media (or SOC media) per 100
.mu.l of cell/DNA suspension immediately after the transformation
(heat shock or electroporation).
[0365] 6. Incubate the suspension in a 37.degree. C. shaking
incubator overnight or for at least 4 hours. [0366] For recovering
multifusion plasmids containing more than 2 resistance markers, it
is strongly recommended to incubate the suspension at 37.degree. C.
overnight.
[0367] 7. Plate out the recovered cell suspension on agar
containing desired combination of antibiotics. Incubate at
37.degree. C. overnight.
[0368] 8. Colonies that emerge after overnight incubation can be
verified directly by restriction digestion at this stage, by
referring to steps 12-16 below. [0369] This is especially useful if
only one multifusion plasmid is desired.
[0370] For further selection by single antibiotic challenges on a
96 well microtiter plate, continue to step 9.
[0371] Several different multifusion plasmids can be processed and
selected on one 96 well microtiter plate in parallel.
[0372] 9. For 96 well antibiotic tests, inoculate four colonies
from each antibiotic agar plate into .about.500 .mu.l LB media
without antibiotics. Incubate the cell cultures in a 37.degree. C.
shaking incubator for 1-2 hours.
[0373] 10. During the incubation of colonies, fill a 96 well
microtiter plate with 150 .mu.l antibiotic-containing LB media or
coloured dye (positional marker) in corresponding wells. [0374] A
typical arrangement of the solutions, which is used for parallel
selections of multifusion plasmids, is shown in FIG. 7. The basic
principle underlying this aspect of the present invention is that
every cell suspension from single colonies needs to be challenged
by all four single antibiotics.
[0375] 11. Add 1 .mu.l aliquots of pre-incubated cell cultures to
the corresponding wells. Then incubate the inoculated 96 well
microtiter plate in a 37.degree. C. shaking incubator overnight at
180-200 rpm. [0376] It is recommended to use parafilm to wrap the
plate. [0377] The remaining pre-incubated cell cultures can be kept
in a 4.degree. C. fridge for further inoculations.
[0378] 12. Select transformants containing desired multifusion
plasmids according to the combination of dense and clear cell
cultures from each colony. Inoculate 10-20 .mu.l cell cultures into
10 ml LB media with corresponding antibiotics. Incubate in a
37.degree. C. shaking incubator overnight.
[0379] 13. Centrifuge the overnight cell cultures at 4000 g for
5-10 minutes. Purify cell pellets with plasmid miniprep kit
according to manufacturers' information.
[0380] 14. Determine the concentrations of purified plasmid
solutions by using UV absorption (e.g. NanoDrop.TM. 1000).
[0381] 15. Digest 0.5.about.1 .mu.g of the purified plasmid
solution in a 20 .mu.l restriction digestion (with 5-10 unit
endonuclease). Incubate under recommended reaction condition for
.about.2 hours.
[0382] 16. Use 5-10 .mu.l of the digestion for analytical agarose
gel (0.8-1.2%) electrophoresis. Verify the plasmid integrity by
comparing the actual restriction pattern to the predicted
restriction pattern in silico (e.g. by using VectorNTI).
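The read-out of the 96 well antibiotic challenge (steps 9 to 12) amounts to mapping the pattern of dense and clear wells to the set of vectors present. A minimal sketch, assuming the four markers of pACE, pDC, pDK and pDS and using hypothetical function names:

```python
# Sketch: interpret the single-antibiotic challenge of one colony.
# Assumption: the four markers below correspond to pACE, pDC, pDK and pDS;
# `growth` maps each antibiotic to True (dense culture) or False (clear).
MARKER_TO_VECTOR = {
    "ampicillin": "pACE",
    "chloramphenicol": "pDC",
    "kanamycin": "pDK",
    "spectinomycin": "pDS",
}


def infer_fusion(growth):
    """Return the sorted list of vectors implied by the growth pattern."""
    missing = set(MARKER_TO_VECTOR) - set(growth)
    if missing:
        raise ValueError(f"colony not challenged with: {sorted(missing)}")
    return sorted(MARKER_TO_VECTOR[ab] for ab, dense in growth.items() if dense)


def is_desired(growth, desired):
    """True if the colony carries exactly the desired set of vectors."""
    return infer_fusion(growth) == sorted(desired)


if __name__ == "__main__":
    colony = {"ampicillin": True, "chloramphenicol": False,
              "kanamycin": True, "spectinomycin": False}
    print(infer_fusion(colony), is_desired(colony, ["pACE", "pDK"]))
```

Requiring all four antibiotics in the input mirrors the principle that every colony is challenged by each single antibiotic.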
C.2.2. Deconstruction of Fusion Vectors by Cre
[0383] The following protocol can be used, for instance for the
recovery of four single ACEMBL vectors (pACE, pDC, pDK, pDS) by
deconstructing tetra-fused pACKS plasmid (pACE-pDC-pDK-pDS) which
preferably forms part of the ACEMBL System kit (see below Section E
of the present embodiment). Likewise, the protocol is suitable for
releasing single educts from multifusion constructs. This is
achieved by Cre-LoxP reaction, transformation and plating on agar
with appropriately reduced antibiotic resistance level (FIG. 6).
For the liberated educt, encoding genes can be modified and
diversified. Then, the diversified construct is resupplied by
Cre-LoxP reaction.
[0384] Reagents: [0385] Cre recombinase (and 10.times. Buffer)
[0386] E. coli competent cells [0387] (pir.sup.+ strains, pir.sup.-
strains could be used only when partially deconstructed
Acceptor-Donor fusions are desired). [0388] Antibiotics
[0389] 1. For a 20 .mu.l De-Cre reaction, incubate .about.1 .mu.g
multifusion plasmid with 2 .mu.l 10.times.Cre buffer and 1.about.2
.mu.l Cre recombinase; add ddH.sub.2O to adjust the total reaction
volume to 20 .mu.l.
[0390] 2. Incubate the De-Cre reaction at 30.degree. C. for 1-4
hours.
[0391] 3. Optional: load 2-5 .mu.l De-Cre reaction on an analytical
agarose gel for examination. [0392] Heat inactivation at 70.degree.
C. for 10 minutes before the gel loading is recommended.
[0393] 4. For chemical transformation, mix 10-15 .mu.l De-Cre
reaction with 200 .mu.l chemical competent cells. Incubate the
mixture on ice for 15-30 minutes. Then perform heat shock at
42.degree. C. for 45-60 s. [0394] Up to 20 .mu.l De-Cre reaction
(0.1 volumes of the chemical competent cell suspension) can be
directly transformed into 200 .mu.l chemical competent cells.
[0395] For electrotransformation, up to 2 .mu.l De-Cre reaction can
be directly mixed with 100 .mu.l electrocompetent cells, and
transformed by using an electroporator (e.g. BIORAD E. coli Pulser)
at 1.8-2.0 kV. [0396] Larger volumes of De-Cre reaction should be
desalted by ethanol precipitation or PCR purification column before
electrotransformation. The desalted De-Cre reaction mix does
preferably not exceed 0.1 volumes of the electrocompetent cell
suspension. [0397] The cell/DNA mixture can be used immediately
for electrotransformation without prolonged incubation on ice.
[0398] 5. Add up to 400 .mu.l of LB media (or SOC media) per 100
.mu.l of cell/DNA suspension immediately after the transformation
(heat shock or electroporation).
[0399] 6. Incubate the suspension in a 37.degree. C. shaking
incubator. [0400] For recovery of partially deconstructed
double/triple fusions, incubate the suspension in a 37.degree. C.
shaking incubator overnight or for at least 4 hours. [0401] For
recovery of individual educts such as single ACEMBL vectors from
pACKS plasmid, incubate the suspension in a 37.degree. C. incubator
for 1-2 hours.
[0402] 7. Plate out the recovered cell suspension on agar
containing desired combination of antibiotics. Incubate at
37.degree. C. overnight.
[0403] 8. Colonies after overnight incubation can be verified
directly by restriction digestion at this stage, by referring to
steps 12-16. [0404] This is especially recommended if only one
single educt or partially deconstructed multifusion plasmid is
desired. [0405] For further selection by single antibiotic
challenges on a 96 well microtiter plate, continue to step 9.
[0406] Several different single educts/partially deconstructed
multifusion plasmids can be processed and selected on one 96 well
microtiter plate in parallel.
[0407] 9. For 96 well antibiotic tests, inoculate four colonies
from each antibiotic agar plate into .about.500 .mu.l LB media
without antibiotics. Incubate the cell cultures in a 37.degree. C.
shaking incubator for 1-2 hours.
[0408] 10. During the incubation of colonies, fill a 96 well
microtiter plate with 150 .mu.l antibiotic-containing LB media or
coloured dye (positional marker) in corresponding wells. [0409]
Referring to FIG. 7, it is possible to provide a similar arrangement
of the solutions, which is used for parallel selections of four
different single educts/partially deconstructed multifusion
plasmids. The underlying principle of the present aspect of the
invention is that every cell suspension from single colonies is to
be challenged by all four antibiotics separately.
[0410] 11. Add 1 .mu.l aliquots of pre-incubated cell cultures to
the corresponding wells. Then incubate the inoculated 96 well
microtiter plate in a 37.degree. C. shaking incubator overnight at
180-200 rpm. [0411] It is recommended to use parafilm to wrap the
plate. [0412] The remaining pre-incubated cell cultures can be
kept in a 4.degree. C. fridge for further inoculations.
[0413] 12. Select transformants containing desired single
educts/partially deconstructed multifusion plasmids according to
the combination of dense and clear cell cultures from each colony.
Inoculate 10-20 .mu.l cell cultures into 10 ml LB media with
corresponding antibiotic(s). Incubate in a 37.degree. C. shaking
incubator overnight.
[0414] 13. Centrifuge the overnight cell cultures at 4000 g for
5-10 minutes. Purify cell pellets with plasmid miniprep kit
according to manufacturers' information.
[0415] 14. Determine the concentrations of purified plasmid
solutions by using UV absorption (e.g. NanoDrop.TM. 1000).
[0416] 15. Digest 0.5-1 .mu.g of the purified plasmid solution in a
20 .mu.l restriction digestion (with 5-10 unit endonuclease).
Incubate under recommended reaction condition for .about.2
hours.
[0417] 16. Use 5-10 .mu.l of the digestion for analytical agarose
gel (0.8-1.2%) electrophoresis. Verify the plasmid integrity by
comparing the actual restriction pattern to predicted restriction
pattern in silico (e.g. by using VectorNTI).
[0418] 17. Optional: during recovery of all four single ACEMBL
vectors from pACKS plasmid, in case one or more single ACEMBL
vectors fail to be liberated from one De-Cre reaction, one can
pick partially deconstructed double/triple fusions containing the
desired single ACEMBL vector(s), and perform a second De-Cre
reaction (repeat steps 1-8). [0419] Typically, up to 2 sequential
De-Cre reactions are sufficient to recover all four single ACEMBL
vectors from pACKS plasmid, and the liberation of single educts
from double/triple fusions could be much more efficient than from
pACKS plasmid (quadruple fusion). The same principle also applies
to the deconstruction of any other multifusion plasmid based on the
ACEMBL system according to the present invention.
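The choice of host strain and the expected challenge pattern for a deconstruction product can be summarized as below. This is an illustrative sketch with hypothetical names; it only encodes two points made in this document, namely that Donors need a pir positive host unless fused to an Acceptor, and that a product should grow only on the antibiotics matching its own markers.

```python
# Sketch: plan the recovery of a De-Cre product from the pACKS fusion.
# Assumptions: vector roles and markers as named in this document; products
# without an Acceptor can only be propagated in a pir-positive host.
ROLES = {"pACE": "acceptor", "pDC": "donor", "pDK": "donor", "pDS": "donor"}
MARKERS = {"pACE": "ampicillin", "pDC": "chloramphenicol",
           "pDK": "kanamycin", "pDS": "spectinomycin"}


def recovery_plan(desired):
    """Return host requirement and expected growth pattern for `desired`."""
    unknown = set(desired) - set(ROLES)
    if unknown:
        raise ValueError(f"unknown vector(s): {sorted(unknown)}")
    has_acceptor = any(ROLES[v] == "acceptor" for v in desired)
    return {
        "requires_pir_positive": not has_acceptor,
        "grows_on": sorted(MARKERS[v] for v in desired),
        "clear_on": sorted(m for v, m in MARKERS.items() if v not in desired),
    }


if __name__ == "__main__":
    print(recovery_plan(["pDK"]))
    print(recovery_plan(["pACE", "pDS"]))
```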
C.3. Coexpression in Bacteria by Cotransformation
[0420] Protein complexes can be expressed also from two separate
vectors that were cotransformed in expression strains. The
cotransformed vectors can have the same or different origins of
replication, however, they must encode for different resistance
markers. Plasmids pACE (ampicillin resistance marker) and pACE2
(tetracycline resistance marker) have both a ColE1 derived replicon
and can therefore be used with all common expression strains. pACE
and pACE2 derivatives (also including fused Donors if needed) can
be cotransformed into expression strains, and double transformants
selected for by plating on agar plates containing both ampicillin
and tetracycline antibiotics.
[0421] Transformations are carried out using standard
transformation protocols (see, e.g. the latest edition of Ausubel
et al. (ed.), supra).
D. Automation
[0422] As already outlined above, cloning and expression of
multiple protein complexes using the nucleic acids, vectors and
methods of the present invention is highly suited for automation
equipment employing current robotic techniques.
[0423] In the following, general protocols as exemplified for a
Tecan Freedom Evo II 200 pipetting device are provided. The
pipetting device is typically equipped with liquid handling arm1
(LiHa1), 4 fixed tips (steel needles), 4 disposable tips coni
(Diti's), 250 .mu.l syringes, liquid handling arm2 (LiHa2), 8 fixed
tips (steel needles), 2.5 ml syringes, robotic manipulator arm
(RoMa/transportation of plates), version long. The work station
usually contains the following integrated devices: thermocycler
PTC-200 (Biorad), Te-Shake, heatable plate shaker (Tecan), Variomag
Thermoshaker, heat- and coolable plate shaker (Inheco), Te-Vacs,
dual vacuum station for filter plates (Tecan), Safire II, UV VIS
plate reader (Tecan) and cooling unit 400 W (FRYKA multistar).
D.1. Automated SLIC Process
[0424] A schematic representation of a workflow for automated SLIC
is shown in FIG. 22.
Step 1: Initial PCR
[0425] Source plate: 96 well standard microtiter plate containing
the PCR templates (cDNA approx. 0.2 .mu.g/.mu.l)
[0426] Reaction plate: 96 well PCR plate (Eppendorf)
[0427] Material: Sample mix plate (96 well PCR plate: Eppendorf),
1% agarose E-Gel.RTM. (Invitrogen), Phusion.RTM. DNA Polymerase
master mix, oligonucleotide primers at 20 .mu.M, 2.times.DNA
loading dye (2.times.DLD) (Fermentas), E-Gel.RTM. Low Range
quantitative DNA Ladder (Invitrogen), 10.times. Buffer Tango.RTM.
with BSA (Fermentas), DpnI (Fermentas)
[0428] PCR program: [0429] 11.times.[98.degree. C. for 20
sec..fwdarw.60-50.degree. C. for 30 sec. (step down every 2.sup.nd
cycle 1.degree. C.).fwdarw.72.degree. C. for 3 min.] [0430]
19.times.[98.degree. C. for 20 sec..fwdarw.50.degree. C. for 30
sec..fwdarw.72.degree. C. for 3 min.] [0431] 72.degree. C. for 3
min. [0432] Hold at 10.degree. C.
[0433] DpnI digest program: [0434] 37.degree. C. for 3 h [0435]
10.degree. C. for 1 min
[0436] Procedure: [0437] Wash tips.fwdarw.Pipet 89 .mu.l PCR
master-mix into reaction plate [0438] Wash tips.fwdarw.Pipet 1
.mu.l template DNA according to worklist [0439] Wash
tips.fwdarw.Pipet 5 .mu.l primer each to reaction plate [0440] Wash
tips.fwdarw.Run PCR program [0441] Wash tips.fwdarw.Pipet 10 .mu.l
10.times. Buffer Tango.RTM. with BSA to reaction plate [0442] Wash
tips.fwdarw.Pipet 5 .mu.l DpnI to reaction plate [0443] Wash
tips.fwdarw.Run DpnI digest program [0444] Wash tips.fwdarw.Pipet
10 .mu.l 2.times.DLD to each well of sample mix plate [0445] Wash
tips.fwdarw.Pipet 15 .mu.l DNA marker each to the E-gel marker
slots [0446] Wash tips.fwdarw.Pipet 10 .mu.l PCR product to
2.times.DLD on sample mix plate [0447] Wash tips.fwdarw.Pipet 15
.mu.l sample mix to the E-Gel sample slots [0448] Wash
tips.fwdarw.Run E-Gel.RTM. for 25 min. [0449] Assess results
Step 2: PCR Purification
[0450] Source plate: 96 well PCR plate (Eppendorf) with PCR
samples
[0451] Target plate: 96 well microtiter elution plate
(Macherey-Nagel)
[0452] Material: PCR purification Kit, NucleoSpin 96 Extract II Kit
(Macherey-Nagel)
[0453] Procedure: According to manufacturer's information
(http://www.macherey-nagel.com/tabid/10887/default.aspx)
Step 3: T4 DNA Polymerase Reaction
[0454] Source plate: 96 well microtiter elution plate
(Macherey-Nagel)
[0455] Reaction plate: 96 well PCR plate (Eppendorf)
[0456] Material: bidest. water, 10.times.T4 DNA polymerase reaction
buffer (Novagen), 100 mM DTT, 2 M Urea, T4 DNA polymerase (Novagen
LIC qualified), 500 mM EDTA
[0457] Incubation program: 23.degree. C. for 10 min. (program 1)
[0458] 75.degree. C. for 20 min. (program 2)
[0459] Procedure: [0460] Wash tips.fwdarw.Pipet 6 .mu.l water into
reaction plate [0461] Wash tips.fwdarw.Pipet 2 .mu.l 10.times.
reaction buffer into reaction plate [0462] Wash tips.fwdarw.Pipet 1
.mu.l 100 mM DTT into reaction plate [0463] Wash tips.fwdarw.Pipet
2 .mu.l 2 M Urea into reaction plate [0464] Wash tips.fwdarw.Pipet
8 .mu.l DNA sample from prev. PCR into reaction plate [0465] Wash
tips.fwdarw.Pipet 0.5 .mu.l T4 DNA polymerase into reaction plate
[0466] Wash tips.fwdarw.Run incubation program 1 [0467] Wash
tips.fwdarw.Pipet 1 .mu.l 500 mM EDTA into reaction plate [0468]
Wash tips.fwdarw.Run incubation program 2
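The pipetting volumes listed above determine the reaction volume and the final concentrations of the additives. The following is a minimal sketch of that bookkeeping; the dictionary keys and the helper function are illustrative names only, and pipetting losses are assumed to be negligible.

```python
# Volumes in microlitres, in the pipetting order given in the procedure.
T4_REACTION_UL = {
    "water": 6.0,
    "10x reaction buffer": 2.0,
    "100 mM DTT": 1.0,
    "2 M urea": 2.0,
    "DNA sample": 8.0,
    "T4 DNA polymerase": 0.5,
}
EDTA_STOP_UL = 1.0  # 500 mM EDTA, added after incubation program 1


def final_concentration(stock, added_ul, total_ul):
    """Dilute a stock (any unit) by added_ul into total_ul."""
    return stock * added_ul / total_ul


reaction_ul = sum(T4_REACTION_UL.values())
stopped_ul = reaction_ul + EDTA_STOP_UL

dtt_mM = final_concentration(100.0, T4_REACTION_UL["100 mM DTT"], reaction_ul)
urea_M = final_concentration(2.0, T4_REACTION_UL["2 M urea"], reaction_ul)
edta_mM = final_concentration(500.0, EDTA_STOP_UL, stopped_ul)

print(f"reaction volume: {reaction_ul} ul, after EDTA: {stopped_ul} ul")
print(f"DTT {dtt_mM:.1f} mM, urea {urea_M:.2f} M, EDTA {edta_mM:.1f} mM")
```

With the listed volumes the reaction is 19.5 .mu.l before and 20.5 .mu.l after the EDTA addition, so the reaction buffer is present at approximately, not exactly, 1.times. concentration.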
Step 4: Annealing
[0469] Source plate: Reaction plate from T4 DNA polymerase
reaction
[0470] Reaction plate: 96 well PCR plate (Eppendorf)
[0471] Material: bidest. water, 10.times.DNA Ligase Reaction Buffer
(NEB), linearised vector
[0472] Incubation program: 65.degree. C. for 8 min..fwdarw.ramp
down 0.4.degree. C./min. to 35.degree. C..fwdarw.10.degree. C. for
1 min.
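The duration of the annealing incubation program follows directly from the ramp rate. The sketch below assumes a linear ramp and ignores the time needed to cool from 35.degree. C. to 10.degree. C.; the function name and its parameters are illustrative.

```python
from fractions import Fraction


def annealing_program_minutes(start_c=65, end_c=35,
                              rate_c_per_min=Fraction("0.4"),
                              initial_hold_min=8, final_hold_min=1):
    """Initial hold + linear ramp-down + final hold, in minutes.

    Fractions are used so that 30 / 0.4 is computed exactly.
    """
    ramp_min = Fraction(start_c - end_c) / rate_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


total_min = annealing_program_minutes()
print(f"annealing program: {float(total_min):.0f} min")
```

Under these assumptions the ramp alone takes 75 min and the whole program 84 min, which is the dominant time factor when scheduling the annealing step on the work station.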
[0473] Procedure: [0474] Wash tips.fwdarw.Pipet 150 ng T4 DNA
polymerase treated insert DNA according to worklist into reaction
plate [0475] Wash tips.fwdarw.Pipet 150 ng linearised vector DNA
according to worklist into reaction plate [0476] Wash
tips.fwdarw.Run incubation program
Step 5: Transformation in E. coli
[0477] Source plate: Reaction plate from the annealing step
[0478] Reaction plate: 96 well PCR plate (Eppendorf)
[0479] Culture plate: 2 ml 96 well plate (Nunc)
[0480] Target plates: 12 well cell culture plates containing 2 ml
of LB-agar with appropriate antibiotics (standard concentrations
used: Ampicillin 100 .mu.g/ml, Kanamycin 50 .mu.g/ml, Spectinomycin
50 .mu.g/ml, Chloramphenicol 30 .mu.g/ml)
[0481] Material: E. coli cells (XL1-Blue) that are chemically
competent for transformation, SOC-medium
[0482] Transformation program: Heat thermocycler to 42.degree. C.
[0483] Incubate at 42.degree. C. for 30 sec. [0484] Transfer
immediately to cooled (0.degree. C.) pipetting carrier
[0485] Procedure: [0486] Wash tips.fwdarw.Pipet 100 .mu.l competent
E. coli cells into reaction plate [0487] Wash tips.fwdarw.Pipet 10
.mu.l DNA sample from annealing step into reaction plate [0488]
Wash tips.fwdarw.Incubate at 0.degree. C. for 30 min. [0489] Run
transformation program [0490] Incubate at 0.degree. C. for 5 min.
[0491] Wash tips.fwdarw.Pipet 250 .mu.l SOC-medium into culture
plate [0492] Wash tips.fwdarw.Transfer transformation mix into
culture plate [0493] Incubate at 37.degree. C. and 720 rpm
(Te-Shake Shaker) for 2 h [0494] Wash tips.fwdarw.Pipet 50 .mu.l
culture into target plate (agar plate) [0495] Wash
tips.fwdarw.Shake target plate at 12 Hz for 1 min. (plating out)
[0496] Incubate target plates over night at 37.degree. C.
Step 6: Picking Clones and Setting Up Over Night Cultures (Manual
Step)
[0497] Source plate: 12 well cell culture plates containing E. coli
colonies
[0498] Target plate: 24 well culture plate
[0499] Material: 2.times.TY culture medium, incubator which carries
culture plates
[0500] Procedure: Pick 4 colonies per reaction and transfer to 3 ml
2.times.TY medium in a 24 well culture plate. Incubate at
37.degree. C. and approx. 220 rpm over night.
Step 7: Plasmid Extraction (Miniprep)
[0501] Source plate: 24 well culture plate (usually 3 ml
culture)
[0502] Target plate: 96 well microtiter elution plate
(Macherey-Nagel)
[0503] Material: Plasmid extraction kit, NucleoSpin Robot 96
Plasmid Kit (Macherey-Nagel)
[0504] Procedure: According to manufacturer (see
http://www.machereynagel.com/tabid/10885/default.aspx)
Step 8: Assessment
[0505] Plasmid yield is quantified by measuring UV absorbance with
a Thermo Scientific NanoDrop.TM. 1000 Spectrophotometer according
to the manufacturer's instructions. Plasmid integrity is assessed
by E-gel (Invitrogen).
[0506] The efficacy of the SLIC protocol is assessed in manual and
robotics mode. The results of the comparison are shown in Table II.
Results are based on a set of 25 different Donor/Acceptor
constructions prepared.
TABLE-US-00017 TABLE II Comparison Manual versus Robotic SLIC
procedure (based on 25 constructs each)
| Manual | EvoII |
DNA used for T4 reaction: | 200-400 ng insert, 200-400 ng vector | 400-800 ng insert, 400-800 ng vector |
T4 reaction volume for transformation: | 5 .mu.l: 2.5 .mu.l (insert) + 2.5 .mu.l (vector) | 5 .mu.l: 2.5 .mu.l (insert) + 2.5 .mu.l (vector) |
Volume comp. cells (XL1-Blue, chem. comp.): | 100 .mu.l (+300 .mu.l SOC) | 100 .mu.l (+300 .mu.l SOC) |
Volume plated: | 200 .mu.l (Petri dish) | 50 .mu.l/well (12 well plate), 200 .mu.l (Petri dish) |
Clones obtained: | 200->2000 (Petri dish) | 25-250 (12 well plate), 70-5300 (Petri dish) |
D.2. Automated Cre Fusion Process
[0507] A schematic representation of a workflow for automated Cre
fusion is shown in FIG. 23.
Step 1: Cre-LoxP Plasmid Fusion Reaction
[0508] Source plate: 96 well microtiter elution plate from the
plasmid extraction process containing plasmids suitable for Cre-Lox
fusion
[0509] Reaction plate: 96 well PCR plate (Eppendorf)
[0510] Material: bidest. water, 10.times.Cre reaction buffer (NEB),
Cre recombinase (NEB)
[0511] Incubation program: 37.degree. C. for 1 h.fwdarw.10.degree.
C. for 1 min.
[0512] Procedure: [0513] Wash tips.fwdarw.Pipet 6 .mu.l bidest.
water into reaction plate [0514] Wash tips.fwdarw.Pipet 2 .mu.l
10.times.Cre reaction buffer into reaction plate [0515] Wash
tips.fwdarw.Pipet plasmid DNA suitable for Cre recombination
according to worklist into reaction plate [0516] Wash
tips.fwdarw.Pipet 2 .mu.l Cre recombinase into reaction plate
[0517] Wash tips.fwdarw.Run incubation program [0518] Total
reaction volume: 20 .mu.l
Steps 2, 3 and 4: Transformation in E. coli and Plasmid Extraction
[0519] Identical to the method described in above Section D.1.,
with the exception that the reaction plate from the Cre
recombination step is used as source plate and the recovery time in
SOC-medium is prolonged to a total of 4 h. Chemically competent
Mach1 cells are used for transformation. For Cre reactions with 3
or 4 vectors, agar plates with half of the standard antibiotic
concentration (standard concentrations used: Ampicillin 100
.mu.g/ml, Kanamycin 50 .mu.g/ml, Spectinomycin 50 .mu.g/ml,
Chloramphenicol 30 .mu.g/ml) are used.
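The plate concentrations for the different fusion reactions can be tabulated from the standard concentrations given above. The following is a minimal sketch; the function name is illustrative, and treating every reaction other than a 3 or 4 vector fusion as using the full standard concentration is an assumption of this sketch.

```python
# Standard concentrations in micrograms per millilitre, as listed above.
STANDARD_UG_PER_ML = {
    "ampicillin": 100,
    "kanamycin": 50,
    "spectinomycin": 50,
    "chloramphenicol": 30,
}


def plate_concentrations(n_vectors):
    """Antibiotic concentrations for plates selecting an n-vector fusion.

    Fusions of 3 or 4 vectors are plated at half the standard
    concentration; all other cases are assumed to use the standard.
    """
    factor = 0.5 if n_vectors in (3, 4) else 1.0
    return {name: conc * factor for name, conc in STANDARD_UG_PER_ML.items()}


for n in (2, 3, 4):
    print(n, plate_concentrations(n))
```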
Step 5: Assessment
[0520] Plasmid fusion yield is quantified by measuring UV
absorbance with a Thermo Scientific NanoDrop.TM. 1000
Spectrophotometer according to the manufacturer's instructions.
Plasmid integrity is assessed by E-gel (Invitrogen) of undigested
and digested samples. Suitable restriction sites that yield a
digestion pattern characteristic for the respective fusions are
identified by using Vector NTI (Invitrogen) and used for
restriction mapping.
[0521] The efficacy of the Cre reaction is tested by performing a
series of fusion reactions, each in triplicate, by using the EvoII
liquid handling workstation. The results are summarized in Table
III.
TABLE-US-00018 TABLE III Efficiency of Cre-LoxP Reactions on EvoII
(assessed in triplicate for each reaction)
Volume Cre-reaction used for transformation (all reactions): | 10 .mu.l |
Volume chem. comp. cells (XL1-Blue, Mach1) per transformation (all reactions): | 100 .mu.l (+300 .mu.l SOC) |
Volume transformation reaction plated: | 50 .mu.l/well (12 well plate), 200 .mu.l (Petri dish) |
Clones obtained:
(a) Double vector fusion reaction (AD, one Acceptor, one Donor) | >1000 fused functional AD plasmids plated on a standard Petri dish containing the respective two antibiotics |
(b) Triple vector fusion reaction (ADD, one Acceptor, two Donors) | 12-80 fused functional ADD plasmids plated on a standard Petri dish containing the respective three antibiotics |
(c) Quadruple vector fusion reaction (ADDD, one Acceptor, three Donors) | Two possibilities exist: (1) Single reaction ADDD (four vector Cre-Lox fusion, low efficiency). (2) Two step reaction ADD + D: triple fusion as in (b), then addition of a further Donor. Option 2 (ADD + D) is preferred for routine robot use as it represents a more robust approach, resulting in example experiments in 20-100 fused functional ADDD plasmids when plated on a standard Petri dish containing all four antibiotics. |
D.3. High-Throughput Micro Batch IMAC
[0522] Source plate: 2 ml deepwell plate (Eppendorf)
[0523] Filter plate: Glass filter plate (Novagen)
[0524] Target plate: standard microtiter plate (Greiner)
[0525] Material: Ni-NTA bulk beads 50% in 20% ethanol
(GE Healthcare), freezer at -20.degree. C., tabletop centrifuge
suitable for microtiter plates, sonication device with microtip,
IMAC binding and elution buffer suitable for the specific protein
(Berrow et al., Acta Cryst. (2006) D62, 1218-1226).
[0526] Procedure:
[0527] Sample Preparation (Off Line) [0528] Harvest E. coli cells
expressing the desired protein by centrifugation at 3000 g
(4.degree. C.) directly in the source plate [0529] Freeze cell
pellets for 30 min. at -20.degree. C. [0530] Thaw cell
pellets 15 min. at room temperature
[0531] Preparation of the Filter Plate [0532] Wash
tips.fwdarw.Resuspend Ni-NTA bead suspension by pipetting up and
down [0533] 20 times 200 .mu.l.fwdarw.Transfer 200 .mu.l bead
suspension to filter plate [0534] Wash tips.fwdarw.Apply vacuum 550
mbar for 30 sec. (remove 20% ethanol) [0535] Wash tips.fwdarw.Pipet
1 ml equilibration buffer (e.g. binding buffer) to resin [0536]
Wash tips.fwdarw.Apply vacuum 300 mbar for 60 sec.
(equilibration)
[0537] IMAC Purification, Preparation [0538] Wash tips.fwdarw.Pipet
1 ml binding buffer to the samples in the source plate [0539] Wash
tips.fwdarw.Resuspend cell pellets by pipetting up and down 10
times 750 .mu.l [0540] Wash tips
[0541] Sonication of Samples (Off Line) [0542] Sonication of the
samples to ensure complete lysis of the cells
[0543] IMAC Purification, Loading and Elution [0544] Wash
tips.fwdarw.Transfer whole lysate to filter plate [0545] Wash
tips.fwdarw.Apply vacuum 300 mbar for 80 sec. (binding step) [0546]
Wash tips.fwdarw.Pipet 1 ml wash buffer to the samples [0547] Wash
tips.fwdarw.Apply vacuum 300 mbar for 90 sec. (wash step) [0548]
Repeat wash step 3 times [0549] Wash tips.fwdarw.Pipet 100 .mu.l
elution buffer to the samples [0550] Wash tips.fwdarw.Incubate 3
min. at room temperature [0551] Apply vacuum 350 mbar for 90 sec.
(elution step)
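The buffer volumes of the IMAC procedure above can be related to the settled resin volume in each well. The sketch below assumes that "50%" denotes a 50% (v/v) bead slurry, so that 200 .mu.l of suspension yields 100 .mu.l of resin; the names used are illustrative.

```python
SLURRY_UL = 200.0        # bead suspension transferred per well
RESIN_FRACTION = 0.5     # assumption: 50% (v/v) slurry

# Buffer volumes per well in microlitres, from the procedure above.
BUFFER_STEPS_UL = {
    "equilibration": 1000.0,
    "wash (each)": 1000.0,
    "elution": 100.0,
}


def bed_volume_ul(slurry_ul, resin_fraction):
    """Settled resin volume obtained from a bead slurry."""
    return slurry_ul * resin_fraction


def in_bed_volumes(buffer_ul, bed_ul):
    """Express a buffer volume as a multiple of the resin bed volume."""
    return buffer_ul / bed_ul


bed_ul = bed_volume_ul(SLURRY_UL, RESIN_FRACTION)
for step, volume_ul in BUFFER_STEPS_UL.items():
    print(f"{step}: {in_bed_volumes(volume_ul, bed_ul):.0f} bed volumes")
```

Under this assumption the equilibration and each wash correspond to 10 bed volumes and the elution to 1 bed volume, which keeps the eluted protein concentrated in a small volume.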
Assessment
[0552] Eluted samples (10 .mu.l-12 .mu.l) are loaded manually on
12% denaturing gels using a Biorad Minigel System, pre-run at 135 V
for 25 min., and then run for 65-70 min. at 185 V. Gels are stained
with Coomassie Brilliant Blue according to standard procedures.
E. ACEMBL Kit for Expression of Proteins in Prokaryotic Hosts
[0553] A kit according to a preferred embodiment for expression in
prokaryotic hosts contains: [0554] BW23473, BW23474 cells.sup.1
and/or Cre recombinase [0555] pACKS quadruple fusion vector.sup.2
[0556] made of pACE (Acceptor), and pDC, pDK, pDS (Donors) [0557]
pACE2 vector [0558] pACE-[VHLbc/BFP/mGFP] control plasmid [0559]
triple fusion vector made of pACE-VHLbc, pDK-BFP, pDS-mGFP.sup.3
.sup.1 E. coli strains expressing the pir gene for propagation of
Donor derivatives (any other strain with pir.sup.+ background can
be used).
.sup.2 This fusion vector was created by Cre-LoxP reaction of pACE,
pDC, pDK and pDS. It is resistant to ampicillin, kanamycin,
chloramphenicol and spectinomycin. Individual ACEMBL vectors can be
liberated from this quadruple fusion by Cre-LoxP mediated
deconstruction as described above in protocol C.2.2. Sequences for
single ACEMBL vectors according to the present embodiment and pACKS
quadruple fusion are provided in SEQ ID NO: 2 to 7.
.sup.3 pDS-mGFP contains a coiled-coil fused to the N-terminus of
eGFP (see Berger et al. (2003) Proc. Natl. Acad. Sci. USA 100,
12177-82).
[0560] Optional Components: [0561] Antibiotics: ampicillin,
chloramphenicol, kanamycin, spectinomycin, tetracycline [0562]
Enzymes: [0563] T4 DNA polymerase (for recombination insertion of
genes) [0564] Phusion polymerase (for PCR amplification of DNA)
[0565] Restriction enzymes and T4 DNA ligase (for conventional
cloning, if desired)
[0566] The present invention is further illustrated by the
following non-limiting examples.
EXAMPLES
[0567] Examples of multiprotein expression using the
above-described ACEMBL system are shown in the following,
illustrating the gene combination procedures outlined above.
Reactions presented were either carried out manually following the
protocols provided in above Section C, or on a Tecan Freedom EvoII
200 robot with adapted protocols according to above Section D.
Example 1: SLIC Cloning into ACEMBL Vectors: Human TFIIF
[0568] Genes coding for full-length RAP74 with a C-terminal
oligo-histidine tag and full-length human RAP30 were amplified from
pET-based plasmid template (Gaiser et al. (2000) J. Mol. Biol. 302,
1119-1127) by using the primer pair T7InsFor
(5'-TCCCGCGAAATTAATACGACTCACTAGGG-3'; SEQ ID NO: 20) and T7InsRev
(5'-CCTCAAGACCCGTTTAGAGGCCCCAAGGGGTTATGCTAG-3'; SEQ ID NO: 21)
following the protocols described above. Linearized vector
backbones were generated by PCR amplification from pACE and pDC by
using primer pair T7VecFor
(5'-CTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGG-3'; SEQ ID NO: 22) and
T7VecRev (5'-CCCTATAGTGAGTCGTATTAATTTCGCGGGA-3'; SEQ ID NO: 23) in
both cases. Above Protocol 1 (Section C) was followed, resulting in
pACE-RAP30 and pDC-RAP74his (FIG. 8). Those plasmids were fused by
Cre-LoxP reaction (see above Section C). Results from restriction
mapping by BstZ17I/BamHI double digestion of 11 double resistant
(Cm, Ap) colonies are shown by a gel section from 1% E-gel
electrophoresis (M: NEB 1 kb DNA marker) in FIG. 8. All clones
tested showed the expected pattern (5.0+2.8 kb). One clone was
transformed into BL21(DE3) cells. Expression and purification by
Ni.sup.2+-capture and S200 chromatography resulted in human TFIIF
complex (FIG. 21A).
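SLIC relies on the ends of the insert being homologous to the ends of the linearized vector. For the primers printed above this can be checked by comparing the reverse complement of a vector primer with the corresponding insert primer. The following is a minimal sketch using the sequences of T7InsRev and T7VecFor as given in the text; the helper function is an illustrative name and not part of the described method.

```python
# Translation table for the four unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences as printed above (5' to 3').
T7_INS_REV = "CCTCAAGACCCGTTTAGAGGCCCCAAGGGGTTATGCTAG"   # SEQ ID NO: 21
T7_VEC_FOR = "CTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGG"   # SEQ ID NO: 22

homologous = reverse_complement(T7_VEC_FOR) == T7_INS_REV
print(f"T7InsRev is the reverse complement of T7VecFor: {homologous}")
```

The two primers are exact reverse complements over their full length of 39 nucleotides, so the insert and the vector backbone carry the same sequence at this junction.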
[0569] The high-level soluble expression of full-length human TFIIF
(FIG. 21 A) is noteworthy, as individual expression of the subunits
invariably leads to insoluble material. In the past, crystal
structure analysis of the human TFIIF dimerization domain had
necessitated many iterative cycles of limited proteolysis,
recloning, insoluble expression of the designed fragments and
co-refolding (Gaiser et al. (2000), supra). Similar laborious
situations are commonplace in prior art protein complex research.
It is conceivable that the large investment of labor involved can
now be significantly reduced using the nucleic acids and vectors of
the present invention, in particular the ACEMBL system.
Example 2: Polycistron Insertion by SLIC: Human VHL/Elongin
b/Elongin c Complex
[0570] The gene encoding Von Hippel Lindau protein (amino acids
54-213), fused at its N-terminus to a six-histidine-thioredoxin
fusion tag, was PCR amplified from plasmid pET3-HisTrxVHL by using
primers T7InsFor (see above Table I) and SmaBamVHL
(5'-GAATTCACTGGCCGTCGTTTTACAGGATCCTTAATCTCCCATCCGTTGATGTGCAATG-3';
SEQ ID NO: 45). SmaBamVHL primer is a derivative of the SmaBam
adaptor sequence (Table I; SEQ ID NO: 17) elongated at its 3' end
by the insert specific sequence at the 3' end of the VHL gene
(including a stop codon). The gene encoding full-length elongin
b was PCR amplified from pET3-ElonginB by using primers BamSmaEB
(5'-GGATCCTGTAAAACGACGGCGAGTGAATTCGCTAGCTCTAGAAATAATTTTGTTTAAC-3';
SEQ ID NO: 46) and SacHindEB
(5'-GAGCTCGACTGGGAAAACCCTGGCGAAGCTTAGATCTGGATCCTTACTGCACG
GCTTGTTCATTGG-3'; SEQ ID NO: 47), which are derivatives of the
corresponding adaptors (Table I). The gene for elongin c (amino
acids 17-112) was amplified from pET3-ElonginC by using primers
HindSacEC (5'-AAGCTTCGCCAGGGTTTTCCCA
GTCGAGCTCCAATTGGAATTCGCTAGCTCTAG-3'; SEQ ID NO: 48) and BspEco5EC
(5'-GATCCGGATGTGAAATTGTTATCCGCTGGTACCAAGCTTAGAT
CTGGATCCTTAACAATCTAAGAAG-3'; SEQ ID NO: 49), which are derivatives
of the corresponding adaptors (Table I). Vector backbone was PCR
amplified by using primers Tn7VecRev and Eco5Bsp, and pACE as a
template (FIG. 9). Multifragment SLIC was carried out according to
above Protocol 2 (Section C) resulting in pACE-VHLbc which contains
a tricistron. Clones were plated on agar plates containing
ampicillin. A positive clone, verified by sequencing, was used in
the coexpression experiment described below (Example 5).
Example 3: The Homing Endonuclease/BstXI Module: Yeast RES
Complex
[0571] Plasmids pCDFDuet-Pml1p, pRSFDuet-bSnu17p-NHis and
pETDuet-Bud13p, coding for yeast proteins (all full-length) Pml1p,
Snu17p and Bud13p, respectively, were provided by Dr. Simon
Trowitzsch and Dr. Markus Wahl (Max-Planck-Institute for
Biophysical Chemistry, Gottingen, Germany). Snu17p contains a
six-histidine tag fused to its N-terminus. The gene encoding
His6-tagged Snu17p was excised from pRSFDuet-Snu17p-NHis by using
restriction enzymes NcoI and XhoI, and ligated into a NcoI/XhoI
digested pACE construct (containing an unrelated gene between NcoI
and XhoI sites) resulting in pACE-Snu17. The gene encoding
Bud13p was liberated from pETDuet-Bud13p by restriction digestion
with XbaI and EcoRV, and placed into XbaI/PmeI digested pDC
resulting in pDC-Bud13. The gene encoding Pml1p was liberated
from pCDFDuet-Pml1p by restriction digestion with NdeI and XhoI,
and placed into NdeI/XhoI digested pDC resulting in pDC-Pml1. Next,
the expression cassette for Bud13p was liberated from pDC-Bud13 by
digestion with PI-SceI and BstXI. The liberated fragment was
inserted into PI-SceI digested and alkaline phosphatase treated
pDC-Pml1p resulting in pDC-Bud13p-Pml1p.
[0572] pACE-Snu17 and pDC-BudPml were fused by Cre-LoxP reaction
and selected for by plating on agar plates containing ampicillin
and chloramphenicol. Fusion plasmids were transformed into
BL21(DE3) cells. Expression and purification by Ni.sup.2+-capture
and S200 size exclusion chromatography resulted in the trimeric RES
complex.
[0573] The strategy for cloning the yeast RES complex according to
the method of the present invention is schematically illustrated in
FIG. 10.
Example 4: Coexpression by Cotransformation: Human NYB/NYC
[0574] Genes encoding protein NYB (amino acids 49-141) and NYC
(amino acids 27-12) were excised from vectors pACYC18411-NYB and
pET15-NYC, respectively (Romier et al. (2003) J. Biol. Chem. 278,
1336-1345). NdeI and BamHI were used for NYB. XbaI and BamHI
were used for NYC, thus importing a six-histidine tag at the
N-terminus of the protein. The NYB insert was ligated into pACE
digested with NdeI and BamHI. The NYC insert was ligated into pACE2
digested with XbaI and BamHI. pACE-NFYB and pACE2-NFYC were
transformed into BL21(DE3) cells containing the pLysS plasmid.
Selection on agar plates containing ampicillin, tetracycline and
chloramphenicol resulted in triple resistant colonies. The complex
was expressed and purified by Ni.sup.2+ capture (IMAC) and S75HR
(Pharmacia) size exclusion chromatography.
Example 5: Coexpression from Acceptor-Donor Fusions
[0575] Six heterologous genes were assembled as illustrated in FIG.
20: three genes coding for a trimeric protein complex (VHLbc: Von
Hippel-Lindau protein amino acids 54-213/full-length
elonginB/elonginC amino acids 17-112) (Stebbins et al. (1999)
Science 284, 455-61), a gene encoding the AAA ATPase FtsH
(amino acids 147-610), and two genes encoding fluorescent
markers (BFP and GFP). In a single Cre reaction, all combinations
of one Acceptor (pACE-VHLbc) and three Donors (pDC-FtsH, pDK-BFP,
pDS-mGFP) were obtained and selected, including a quadruple fusion
containing all six heterologous genes (see FIG. 20). Clones were
verified by 96 well microtiter assay as described above for the
ACEMBL system in Section C. Expression and Ni.sup.2+ affinity
capture, combined with immunostaining of the untagged fluorescent
markers, confirmed successful multiprotein expression (FIGS. 16 and
17B). Proteins
were expressed overnight in BL21(DE3) cells in 24 well deep-well
plates in small scale using autoinduction media (Studier (2005)
Protein Expr. Purif. 41, 207-34). Restriction mapping revealed that
even large fusion plasmids were stable over many (more than 60)
generations, even if challenged by a single antibiotic in the
medium only.
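The Acceptor-Donor combinations obtainable from the single Cre reaction described above, together with the antibiotics needed to select each of them, can be enumerated as follows. This is a minimal sketch: the resistance assigned to each vector is an assumption inferred from the vector names and from the selections used in the preceding examples, and a combination is taken to mean the Acceptor fused with any non-empty subset of the Donors.

```python
from itertools import combinations

ACCEPTOR = "pACE-VHLbc"
DONORS = ("pDC-FtsH", "pDK-BFP", "pDS-mGFP")

# Assumed resistance marker per vector (inferred, not stated per plasmid).
RESISTANCE = {
    "pACE-VHLbc": "ampicillin",
    "pDC-FtsH": "chloramphenicol",
    "pDK-BFP": "kanamycin",
    "pDS-mGFP": "spectinomycin",
}


def fusion_products(acceptor, donors):
    """List every Acceptor + Donor subset with its selecting antibiotics."""
    products = []
    for n in range(1, len(donors) + 1):
        for subset in combinations(donors, n):
            plasmids = (acceptor,) + subset
            antibiotics = tuple(RESISTANCE[p] for p in plasmids)
            products.append((plasmids, antibiotics))
    return products


products = fusion_products(ACCEPTOR, DONORS)
for plasmids, antibiotics in products:
    print(" + ".join(plasmids), "->", ", ".join(antibiotics))
```

Under these assumptions the reaction yields three double fusions, three triple fusions and one quadruple fusion, each selectable by the combination of antibiotics matching its constituent vectors.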
Example 6: Expression of the YidC-SecYEGDF Holotranslocon
[0576] As illustrated in FIG. 21, the ACEMBL system was used to
produce a large multiprotein complex, the YidC-SecYEGDF
holotranslocon that contains in total 33 transmembrane helices.
This machinery is used to transport unfolded polypeptides into the
cell membrane or for translocation into the periplasm of bacteria
(Duong et al. (1997) EMBO J. 16, 2757-68).
Example 7: Expression of Human IKK Complex in Insect Cells
[0577] Following the protocols for single gene insertion into
ACEMBL vectors as outlined above in Section C.1., the genes for
IKK1 (also called IKKalpha), IKK2 (also called IKKbeta) and IKK3
(also called Nemo) were cloned into pACEBac1, pIDC and pIDS
respectively (maps of the resulting plasmids pACEBac1-HisIKK1,
pIDC-CSIKK2 and pIDS-IKK3 are shown in FIGS. 46, 47 and 48,
respectively). IKK1-2 double fusion (pACEBac1-HisIKK1 with
pIDC-CSIKK2) and IKK1-2-3 triple fusions (all three vectors) were
created by Cre-LoxP fusions as outlined above in Section C.2. The
fusions were introduced into suitable host cells carrying a
baculovirus genome (EMBac) as a bacterial artificial chromosome.
The vector fusions were integrated into the baculoviral genome via
Tn7 transposition. Productive integration was assessed by
blue/white screening. DNA of composite virus was prepared from
white clones and transfected into Sf21 cells.
Example 8: Expression of a H1N1-Influenza Virus-Like Particle
[0578] A virus-like particle (VLP) of the swine-flu virus
(influenza virus of type H1N1) comprising the proteins HA, NA, M1
and M2 was expressed in insect cells (Sf21) by the following
strategy: genes coding for HA and NA were cloned into pACEBac1 by
single gene insertion as outlined above in Section C.1. The same
procedure was followed for cloning the genes coding for M1 and M2
into pIDC. Double expression cassettes for HA-NA and M1-M2,
respectively, were generated by using the HE-BstXI sites in the
respective MIE (see above Section C.1.4.) resulting in plasmids
pACEBac-HA-NA (plasmid map see FIG. 49) and pIDC-M1-M2 (plasmid map
see FIG. 50). The vector for coding the complete H1N1-influenza-VLP
was generated by Cre-LoxP fusion of pACEBac-HA-NA with pIDC-M1-M2
following the protocol in above Section C.2. The fusion vector was
introduced into suitable host cells carrying a baculovirus genome
(EMBac) as a bacterial artificial chromosome. The vector fusions
were integrated into the baculoviral genome via Tn7 transposition.
Productive integration was assessed by blue/white screening. DNA of
composite virus was prepared from white clones and transfected into
Sf21 cells.
INCORPORATION OF SEQUENCE LISTING
[0579] A paper copy of a compliant sequence listing, submitted on
Mar. 5, 2012 in connection with U.S. application Ser. No.
13/254,831 filed by the same applicant as the present application,
and an identical compliant computer readable form of the sequence
listing, submitted on Mar. 5, 2012 in connection with U.S.
application Ser. No. 13/254,831 filed by the same applicant as the
present application, are incorporated herein by reference.
[0580] Applicant hereby requests the use of the compliant computer
readable sequence listing that is already on file for U.S.
application Ser. No. 13/254,831 (in connection with which the
compliant sequence listing and CRF were submitted on Mar. 5, 2012).
The paper copy of the sequence listing submitted with the present
application is identical to the computer readable copy filed for
the other application.
Sequence CWU 1
1
541210DNAArtificialMultiple integration element 1gggaattgtg
agcggataac aattcccctc tagaaataat tttgtttaac tttaagaagg 60agatatacat
atgaggcctc ggatcctgta aaacgacggc cagtgaattc cccgggaagc
120ttcgccaggg ttttcccagt cgagctcgat atcggtacca gcggataaca
atttcacatc 180cggatcgcga acgcgtctcg agagatccgg
21022652DNAArtificialpACE 2ggtaccgcgg ccgcgtagag gatctgttga
tcagcagttc aacctgttga tagtacttcg 60ttaatacaga tgtaggtgtt ggcaccatgc
ataactataa cggtcctaag gtagcgacct 120aggtatcgat aatacgactc
actatagggg aattgtgagc ggataacaat tcccctctag 180aaataatttt
gtttaacttt aagaaggaga tatacatatg aggcctcgga tcctgtaaaa
240cgacggccag tgaattcccc gggaagcttc gccagggttt tcccagtcga
gctcgatatc 300ggtaccagcg gataacaatt tcacatccgg atcgcgaacg
cgtctcgaga gatccggctg 360ctaacaaagc ccgaaaggaa gctgagttgg
ctgctgccac cgctgagcaa taactagcat 420aaccccttgg ggcctctaaa
cgggtcttga ggggtttttt ggtttaaacc catctaattg 480gactagtagc
ccgcctaatg agcgggcttt tttttaattc ccctatttgt ttatttttct
540aaatacattc aaatatgtat ccgctcatga gacaataacc ctgataaatg
cttcaataat 600attgaaaaag gaagagtatg agtattcaac atttccgtgt
cgcccttatt cccttttttg 660cggcattttg ccttcctgtt tttgctcacc
cagaaacgct cgtgaaagta aaagacgcag 720aggaccaatt gggggcacga
gtgggataca tagaactgga cttgaatagc ggtaaaatcc 780ttgagagttt
tcgccctgaa gagcgttttc caatgatgag cactttcaaa gttctgctat
840gtggagcagt attatcccgt gtagatgcgg ggcaagagca actcggacga
cgaatacact 900attcgcagaa tgacttggtt gaatactccc cagtgacaga
aaagcacctt acggacggaa 960tgacggtaag agaattatgt agtgccgcca
taacgatgag tgataacact gcggcgaact 1020tacttctgac aaccatcggt
ggaccgaagg aattaaccgc ttttttgcac aatatgggag 1080accatgtaac
tcgccttgac cgttgggaac cagaactgaa tgaagccata ccaaacgacg
1140agcgagacac cacaatgcct gcggcaatgg caacaacatt acgcaaacta
ttaactggcg 1200aactacttac tctggcttca cggcaacaat taatagactg
gcttgaagcg gataaagttg 1260caggaccact actgcgttcg gcacttcctg
ctggctggtt tattgctgat aaatctgggg 1320caggagagcg tggttcacgg
ggtatcattg ccgcacttgg accagatggt aagccttccc 1380gtatcgtagt
tatctacacg acgggtagtc aggcaactat ggacgaacga aatagacaga
1440ttgctgaaat aggggcttca ctgattaagc attggtaaac cgatacaatt
aaaggctcct 1500tttggagcct ttttttttgg acggaccggt agaaaagatc
aaaggatctt cttgagatcc 1560tttttttctg cgcgtaatct gctgcttgca
aacaaaaaaa ccaccgctac cagcggtggt 1620ttgtttgccg gatcaagagc
taccaactct ttttccgaag gtaactggct tcagcagagc 1680gcagatacca
aatactgtcc ttctagtgta gccgtagtta ggccaccact tcaagaactc
1740tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg
ctgccagtgg 1800cgataagtcg tgtcttaccg ggttggactc aagacgatag
ttaccggata aggcgcagcg 1860gtcgggctga acggggggtt cgtgcacaca
gcccagcttg gagcgaacga cctacaccga 1920actgagatac ctacagcgtg
agctatgaga aagcgccacg cttcccgaag ggagaaaggc 1980ggacaggtat
ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg agcttccagg
2040gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac
ttgagcgtcg 2100atttttgtga tgctcgtcag gggggcggag cctatggaaa
aacgccagca acgcggcctt 2160tttacggttc ctggcctttt gctggccttt
tgctcacatg ttctttcctg cgttatcccc 2220tgattctgtg gataaccgta
ttaccgcctt tgagtgagct gataccgctc gccgcagccg 2280aacgaccgag
cgcagcgagt cagtgagcga ggaagcggaa gagcgcctga tgcggtattt
2340tctccttacg catctgtgcg gtatttcaca ccgcaatggt gcactctcag
tacaatctgc 2400tctgatgccg catagttaag ccagtataca ctccgctatc
gctacgtgac tgggtcatgg 2460ctgcgccccg acacccgcca acacccgctg
acgcgccctg acgggcttgt ctgctcccgg 2520catccgctta cagacaagct
gtgaccgtct ccgggagctg catgtgtcag aggttttcac 2580cgtcatcacc
gaaacgcgcg aggcaggggg aattccagat aacttcgtat aatgtatgct
2640atacgaagtt at 265232982DNAArtificialpACE2 3atgaaatcta
acaatgcgct catcgtcatc ctcggcaccg tcaccctgga tgctgtaggc 60ataggcttgg
ttatgccggt actgccgggc ctcttgcggg atatcgtcca ttccgacagc
120atcgccagtc actatggcgt gctgctagcg ctatatgcgt tgatgcaatt
tctatgcgca 180cccgttctcg gagcactgtc cgaccgcttt ggccgccgcc
cagtcctgct cgcttcgcta 240cttggagcca ctatcgacta cgcgatcatg
gcgaccacac ccgtcctgtg gattctctac 300gccggacgca tcgtggccgg
catcaccggc gccacaggtg cggttgctgg cgcctatatc 360gccgacatca
ccgatgggga agatcgggct cgccacttcg ggctcatgag cgcttgtttc
420ggcgtgggta tggtggcagg ccccgtggcc gggggactgt tgggcgccat
ctccttacat 480gcaccattcc ttgcggcggc ggtgctcaac ggcctcaacc
tactactggg ctgcttccta 540atgcaggagt cgcataaggg agagcgccga
cccatgccct tgagagcctt caacccagtc 600agctccttcc ggtgggcgcg
gggcatgact atcgtcgccg cacttatgac tgtcttcttt 660atcatgcaac
tcgtaggaca ggtgccggca gcgctctggg tcattttcgg cgaggaccgc
720tttcgctgga gcgcgacgat gatcggcctg tcgcttgcgg tattcggaat
cttgcacgcc 780ctcgctcaag ccttcgtcac tggtcccgcc accaaacgtt
tcggcgagaa gcaggccatt 840atcgccggca tggcggccga cgcgctgggc
tacgtcttgc tggcgttcgc gacgcgaggc 900tggatggcct tccccattat
gattcttctc gcttccggcg gcatcgggat gcccgcgttg 960caggccatgc
tgtccaggca ggtagatgac gaccatcagg gacagcttca aggatcgctc
1020gcggctctta ccagcctaac ttcgatcatt ggaccgctga tcgtcacggc
gatttatgcc 1080gcctcggcga gcacatggaa cgggttggca tggattgtag
gcgccgccct ataccttgtc 1140tgcctccccg cgttgcgtcg cggtgcatgg
agccgggcca cctcgacctg aaccgataca 1200attaaaggct ccttttggag
cctttttttt tggacggacc ggtagaaaag atcaaaggat 1260cttcttgaga
tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc
1320taccagcggt ggtttgtttg ccggatcaag agctaccaac tctttttccg
aaggtaactg 1380gcttcagcag agcgcagata ccaaatactg tccttctagt
gtagccgtag ttaggccacc 1440acttcaagaa ctctgtagca ccgcctacat
acctcgctct gctaatcctg ttaccagtgg 1500ctgctgccag tggcgataag
tcgtgtctta ccgggttgga ctcaagacga tagttaccgg 1560ataaggcgca
gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa
1620cgacctacac cgaactgaga tacctacagc gtgagctatg agaaagcgcc
acgcttcccg 1680aagggagaaa ggcggacagg tatccggtaa gcggcagggt
cggaacagga gagcgcacga 1740gggagcttcc agggggaaac gcctggtatc
tttatagtcc tgtcgggttt cgccacctct 1800gacttgagcg tcgatttttg
tgatgctcgt caggggggcg gagcctatgg aaaaacgcca 1860gcaacgcggc
ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc
1920ctgcgttatc ccctgattct gtggataacc gtattaccgc ctttgagtga
gctgataccg 1980ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag
cgaggaagcg gaagagcgcc 2040tgatgcggta ttttctcctt acgcatctgt
gcggtatttc acaccgcaat ggtgcactct 2100cagtacaatc tgctctgatg
ccgcatagtt aagccagtat acactccgct atcgctacgt 2160gactgggtca
tggctgcgcc ccgacacccg ccaacacccg ctgacgcgcc ctgacgggct
2220tgtctgctcc cggcatccgc ttacagacaa gctgtgaccg tctccgggag
ctgcatgtgt 2280cagaggtttt caccgtcatc accgaaacgc gcgaggcagg
gggaattcca gataacttcg 2340tataatgtat gctatacgaa gttatggtac
cgcggccgcg tagaggatct gttgatcagc 2400agttcaacct gttgatagta
cttcgttaat acagatgtag gtgttggcac catgcataac 2460tataacggtc
ctaaggtagc gacctaggta tcgataatac gactcactat aggggaattg
2520tgagcggata acaattcccc tctagaaata attttgttta actttaagaa
ggagatatac 2580atatgaggcc tcggatcctg taaaacgacg gccagtgaat
tccccgggaa gcttcgccag 2640ggttttccca gtcgagctcg atatcggtac
cagcggataa caatttcaca tccggatcgc 2700gaacgcgtct cgagagatcc
ggctgctaac aaagcccgaa aggaagctga gttggctgct 2760gccaccgctg
agcaataact agcataaccc cttggggcct ctaaacgggt cttgaggggt
2820tttttggttt aaacccatct aattggacta gtagcccgcc taatgagcgg
gctttttttt 2880aattccccta tttgtttatt tttctaaata cattcaaata
tgtatccgct catgagacaa 2940taaccctgat aaatgcttca ataatattga
aaaaggaaga gt 298242067DNAArtificialpDC 4atcaacgtct cattttcgcc
aaaagttggc ccagatctat gtcgggtgcg gagaaagagg 60taatgaaatg gcacctaggt
atcgataata cgactcacta taggggaatt gtgagcggat 120aacaattccc
ctctagaaat aattttgttt aactttaaga aggagatata catatgaggc
180ctcggatcct gtaaaacgac ggccagtgaa ttccccggga agcttcgcca
gggttttccc 240agtcgagctc gatatcggta ccagcggata acaatttcac
atccggatcg cgaacgcgtc 300tcgagagatc cggctgctaa caaagcccga
aaggaagctg agttggctgc tgccaccgct 360gagcaataac tagcataacc
ccttggggcc tctaaacggg tcttgagggg ttttttggtt 420taaacccatg
tgcctggcag ataacttcgt ataatgtatg ctatacgaag ttatggtacc
480gcggccgcgt agaggatctg ttgatcagca gttcaacctg ttgatagtac
gtactaagct 540ctcatgtttc acgtactaag ctctcatgtt taacgtacta
agctctcatg tttaacgaac 600taaaccctca tggctaacgt actaagctct
catggctaac gtactaagct ctcatgtttc 660acgtactaag ctctcatgtt
tgaacaataa aattaatata aatcagcaac ttaaatagcc 720tctaaggttt
taagttttat aagaaaaaaa agaatatata aggcttttaa agcttttaag
780gtttaacggt tgtggacaac aagccaggga tgtaacgcac tgagaagccc
ttagagcctc 840tcaaagcaat tttgagtgac acaggaacac ttaacggctg
acagaattag cttcacgctg 900ccgcaagcac tcagggcgca agggctgcta
aaggaagcgg aacacgtaga aagccagtcc 960gcagaaacgg tgctgacccc
ggatgaatgt cagctgggag gcagaataaa tgatcatatc 1020gtcaattatt
acctccacgg ggagagcctg agcaaactgg cctcaggcat ttgagaagca
1080cacggtcaca ctgcttccgg tagtcaataa accggtaaac cagcaataga
cataagcggc 1140tatttaacga ccctgccctg aaccgacgac cgggtcgaat
ttgctttcga atttctgcca 1200ttcatccgct tattatcact tattcaggcg
tagcaaccag gcgtttaagg gcaccaataa 1260ctgccttaaa aaaattacgc
cccgccctgc cactcatcgc agtactgttg taattcatta 1320agcattctgc
cgacatggaa gccatcacaa acggcatgat gaacctgaat cgccagcggc
1380atcagcacct tgtcgccttg cgtataatat ttgcccatgg tgaaaacggg
ggcgaagaag 1440ttgtccatat tggccacgtt taaatcaaaa ctggtgaaac
tcacccaggg attggctgag 1500acgaaaaaca tattctcaat aaacccttta
gggaaatagg ccaggttttc accgtaacac 1560gccacatctt gcgaatatat
gtgtagaaac tgccggaaat cgtcgtggta ttcactccag 1620agcgatgaaa
acgtttcagt ttgctcatgg aaaacggtgt aacaagggtg aacactatcc
1680catatcacca gctcaccgtc tttcattgcc atacggaatt ccggatgagc
attcatcagg 1740cgggcaagaa tgtgaataaa ggccggataa aacttgtgct
tatttttctt tacggtcttt 1800aaaaaggccg taatatccag ctgaacggtc
tggttatagg tacattgagc aactgactga 1860aatgcctcaa aatgttcttt
acgatgccat tgggatatat caacggtggt atatccagtg 1920atttttttct
ccattttagc ttccttagct cctgaaaatc tcgataactc aaaaaatacg
1980cccggtagtg atcttatttc attatggtga aagttggacc ctcttacgtg
ccgatcaacg 2040tctcattttc gccaaaagtt ggcccag
2067
SEQ ID NO 5 | Length 2077 | DNA | Artificial | pDK
ctatgtcggg tgcggagaaa gaggtaatga
aatggcacct aggtatcgat ggctttacac 60tttatgcttc cggctcgtat gttgtgtgga
attgtgagcg gataacaatt tcacacagga 120aacagctatg accatgatta
cgaatttcta gaaataattt tgtttaactt taagaaggag 180atatacatat
gaggcctcgg atcctgtaaa acgacggcca gtgaattccc cgggaagctt
240cgccagggtt ttcccagtcg agctcgatat cggtaccagc ggataacaat
ttcacatccg 300gatcgcgaac gcgtctcgag actagttccg tttaaaccca
tgtgcctggc agataacttc 360gtataatgta tgctatacga agttatggta
cgtactaagc tctcatgttt cacgtactaa 420gctctcatgt ttaacgtact
aagctctcat gtttaacgaa ctaaaccctc atggctaacg 480tactaagctc
tcatggctaa cgtactaagc tctcatgttt cacgtactaa gctctcatgt
540ttgaacaata aaattaatat aaatcagcaa cttaaatagc ctctaaggtt
ttaagtttta 600taagaaaaaa aagaatatat aaggctttta aagcttttaa
ggtttaacgg ttgtggacaa 660caagccaggg atgtaacgca ctgagaagcc
cttagagcct ctcaaagcaa ttttcagtga 720cacaggaaca cttaacggct
gacagaatta gcttcacgct gccgcaagca ctcagggcgc 780aagggctgct
aaaggaagcg gaacacgtag aaagccagtc cgcagaaacg gtgctgaccc
840cggatgaatg tcagctactg ggctatctgg acaagggaaa acgcaagcgc
aaagagaaag 900caggtagctt gcagtgggct tacatggcga tagctagact
gggcggtttt atggacagca 960agcgaaccgg aattgccagc tggggcgccc
tctggtaagg ttgggaagcc ctgcaaagta 1020aactggatgg ctttcttgcc
gccaaggatc tgatggcgca ggggatcaag atctgatcaa 1080gagacaggat
gaggatcgtt tcgcatgatt gaacaagatg gattgcacgc aggttctccg
1140gccgcttggg tggagaggct attcggctat gactgggcac aacagacaat
cggctgctct 1200gatgccgccg tgttccggct gtcagcgcag gggcgcccgg
ttctttttgt caagaccgac 1260ctgtccggtg ccctgaatga actgcaggac
gaggcagcgc ggctatcgtg gctggccacg 1320acgggcgttc cttgcgcagc
tgtgctcgac gttgtcactg aagcgggaag ggactggctg 1380ctattgggcg
aagtgccggg gcaggatctc ctgtcatctc accttgctcc tgccgagaaa
1440gtatccatca tggctgatgc aatgcggcgg ctgcatacgc ttgatccggc
tacctgccca 1500ttcgaccacc aagcgaaaca tcgcatcgag cgagcacgta
ctcggatgga agccggtctt 1560gtcgatcagg atgatctgga cgaagagcat
caggggctcg cgccagccga actgttcgcc 1620aggctcaagg cgcgcatgcc
cgacggcgag gatctcgtcg tgacacatgg cgatgcctgc 1680ttgccgaata
tcatggtgga aaatggccgc ttttctggat tcatcgactg tggccggctg
1740ggtgtggcgg accgctatca ggacatagcg ttggctaccc gtgatattgc
tgaagagctt 1800ggcggcgaat gggctgaccg cttcctcgtg ctttacggta
tcgccgctcc cgattcgcag 1860cgcatcgcct tctatcgcct tcttgacgag
ttcttctgag cgggactctg gggttcgaaa 1920tgaccgacca agcgacgccc
aacctgccat cacgagattt cgattccacc gccgccttct 1980atgaaaggtt
gggcttcgga atcgttttcc gggacgccgg ctggatgatc ctccagcgcg
2040gggatctcat gctggagttc ttcgcccacc ccgggat
2077
SEQ ID NO 6 | Length 2027 | DNA | Artificial | pDS
ctatgtcggg tgcggagaaa gaggtaatga
aatggcacct aggtatcgat ggctttacac 60tttatgcttc cggctcgtat gttgtgtgga
attgtgagcg gataacaatt tcacacagga 120aacagctatg accatgatta
cgaatttcta gaaataattt tgtttaactt taagaaggag 180atatacatat
gaggcctcgg atcctgtaaa acgacggcca gtgaattccc cgggaagctt
240cgccagggtt ttcccagtcg agctcgatat cggtaccagc ggataacaat
ttcacatccg 300gatcgcgaac gcgtctcgag actagttccg tttaaaccca
tgtgcctggc agataacttc 360gtataatgta tgctatacga agttatggta
cgtactaagc tctcatgttt cacgtactaa 420gctctcatgt ttaacgtact
aagctctcat gtttaacgaa ctaaaccctc atggctaacg 480tactaagctc
tcatggctaa cgtactaagc tctcatgttt cacgtactaa gctctcatgt
540ttgaacaata aaattaatat aaatcagcaa cttaaatagc ctctaaggtt
ttaagtttta 600taagaaaaaa aagaatatat aaggctttta aagcttttaa
ggtttaacgg ttgtggacaa 660caagccaggg atgtaacgca ctgagaagcc
cttagagcct ctcaaagcaa ttttgagtga 720cacaggaaca cttaacggct
gacataattc agcttcacgc tgccgcaagc actcagggcg 780caagggctgc
taaaggaagc ggaacacgta gaaagccagt ccgcagaaac ggtgctgacc
840ccggatgaat gtcagctggg aggcagaata aatgatcata tcgtcaatta
ttacctccac 900ggggagagcc tgagcaaact ggcctcaggc atttgagaag
cacacggtca cactgcttcc 960ggtagtcaat aaaccggtaa gtagcgtatg
cgctcacgca actggtccag aaccttgacc 1020gaacgcagcg gtggtaacgg
cgcagtggcg gttttcatgg cttgttatga ctgttttttt 1080ggggtacagt
ctatgcctcg ggcatccaag cagcaagcgc gttacgccgt gggtcgatgt
1140ttgatgttat ggagcagcaa cgatgttacg cagcagggca gtcgccctaa
aacaaagtta 1200aacatcatga gggaagcggt gatcgccgaa gtatcgactc
aactatcaga ggtagttggc 1260gtcatcgagc gccatctcga accgacgttg
ctggccgtac atttgtacgg ctccgcagtg 1320gatggcggcc tgaagccaca
cagtgatatt gatttgctgg ttacggtgac cgtaaggctt 1380gatgaaacaa
cgcggcgagc tttgatcaac gaccttttgg aaacttcggc ttcccctgga
1440gagagcgaga ttctccgcgc tgtagaagtc accattgttg tgcacgacga
catcattccg 1500tggcgttatc cagctaagcg cgaactgcaa tttggagaat
ggcagcgcaa tgacattctt 1560gcaggtatct tcgagccagc cacgatcgac
attgatctgg ctatcttgct gacaaaagca 1620agagaacata gcgttgcctt
ggtaggtcca gcggcggagg aactctttga tccggttcct 1680gaacaggatc
tatttgaggc gctaaatgaa accttaacgc tatggaactc gccgcccgac
1740tgggctggcg atgagcgaaa tgtagtgctt acgttgtccc gcatttggta
cagcgcagta 1800accggcaaaa tcgcgccgaa ggatgtcgct gccgactggg
caatggagcg cctgccggcc 1860cagtatcagc ccgtcatact tgaagctaga
caggcttatc ttggacaaga agaagatcgc 1920ttggcctcgc gcgcagatca
gttggaagaa tttgtccact acgtgaaagg cgagatcacc 1980aaggtagtcg
gcaaataatg tctaacaatt cgttcaagcc gacggat 2027
SEQ ID NO 7 | Length 2346 | DNA | Artificial | pIDC
aaacccatgt gcctggcaga taacttcgta taatgtatgc tatacgaagt tatggtaccg
60cggccgcgta gaggatctgt tgatcagcag ttcaacctgt tgatagtacg tactaagctc
120tcatgtttca cgtactaagc tctcatgttt aacgtactaa gctctcatgt
ttaacgaact 180aaaccctcat ggctaacgta ctaagctctc atggctaacg
tactaagctc tcatgtttca 240cgtactaagc tctcatgttt gaacaataaa
attaatataa atcagcaact taaatagcct 300ctaaggtttt aagttttata
agaaaaaaaa gaatatataa ggcttttaaa gcttttaagg 360tttaacggtt
gtggacaaca agccagggat gtaacgcact gagaagccct tagagcctct
420caaagcaatt ttgagtgaca caggaacact taacggctga cagaattagc
ttcacgctgc 480cgcaagcact cagggcgcaa gggctgctaa aggaagcgga
acacgtagaa agccagtccg 540cagaaacggt gctgaccccg gatgaatgtc
agctgggagg cagaataaat gatcatatcg 600tcaattatta cctccacggg
gagagcctga gcaaactggc ctcaggcatt tgagaagcac 660acggtcacac
tgcttccggt agtcaataaa ccggtaaacc agcaatagac ataagcggct
720atttaacgac cctgccctga accgacgacc gggtcgaatt tgctttcgaa
tttctgccat 780tcatccgctt attatcactt attcaggcgt agcaaccagg
cgtttaaggg caccaataac 840tgccttaaaa aaattacgcc ccgccctgcc
actcatcgca gtactgttgt aattcattaa 900gcattctgcc gacatggaag
ccatcacaaa cggcatgatg aacctgaatc gccagcggca 960tcagcacctt
gtcgccttgc gtataatatt tgcccatggt gaaaacgggg gcgaagaagt
1020tgtccatatt ggccacgttt aaatcaaaac tggtgaaact cacccaggga
ttggctgaga 1080cgaaaaacat attctcaata aaccctttag ggaaataggc
caggttttca ccgtaacacg 1140ccacatcttg cgaatatatg tgtagaaact
gccggaaatc gtcgtggtat tcactccaga 1200gcgatgaaaa cgtttcagtt
tgctcatgga aaacggtgta acaagggtga acactatccc 1260atatcaccag
ctcaccgtct ttcattgcca tacggaattc cggatgagca ttcatcaggc
1320gggcaagaat gtgaataaag gccggataaa acttgtgctt atttttcttt
acggtcttta 1380aaaaggccgt aatatccagc tgaacggtct ggttataggt
acattgagca actgactgaa 1440atgcctcaaa atgttcttta cgatgccatt
gggatatatc aacggtggta tatccagtga 1500tttttttctc cattttagct
tccttagctc ctgaaaatct cgataactca aaaaatacgc 1560ccggtagtga
tcttatttca ttatggtgaa agttggaccc tcttacgtgc cgatcaacgt
1620ctcattttcg ccaaaagttg gcccagatca acgtctcatt ttcgccaaaa
gttggcccag 1680atctatgtcg ggtgcggaga aagaggtaat gaaatggcac
ctaggggtta tgatagttat 1740tgctcagcgg tggcagcagc caactcagct
tcctttcggg ctttgttagc agccggatct 1800tctaggctca agcagtgatc
agatccagac atgataagat acattgatga gtttggacaa 1860accacaacta
gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct
1920ttatttgtaa ccattataag ctgcaataaa caagttaaca acaacaattg
cattcatttt 1980atgtttcagg ttcaggggga ggtgtgggag gttttttaaa
gcaagtaaaa cctctacaaa 2040tgtggtatgg ctgattatga tcctctagta
cttctcgaca agcttgtcga gactgcaggc 2100tctagattcg aaagcggccg
cgactagtga gctcgtcgac gtaggccttt gaattccgcg 2160cgcttcggac
cgggatccgc gcccgatggt gggacggtat gaataatccg gaatatttat
2220aggttttttt attacaaaac tgttacgaaa acagtaaaat acttatttat
ttgcgagatg 2280gttatcattt taattatctc catgatctat taatattccg
gagtaggtcg cgaatcgata 2340ctagta 2346
SEQ ID NO 8 | Length 2281 | DNA | Artificial | pIDK
gatactagta tacggacctt taattcaacc caacacaata tattatagtt aaataagaat
60tattatcaaa tcatttgtat attaattaaa atactatact gtaaattaca ttttatttac
120aatcactcga cgaagacttg atcacccggg atctcgagcc atggtgctag
cagctgatgc 180atagcatgcg gtaccgggag atgggggagg ctaactgaaa
cacggaagga gacaataccg 240gaaggaaccc gcgctatgac ggcaataaaa
agacagaata aaacgcacgg gtgttgggtc
300gtttgttcat aaacgcgggg ttcggtccca gggctggcac tctgtcgata
ccccaccgag 360accccattgg gaccaatacg cccgcgtttc ttccttttcc
ccaccccaac ccccaagttc 420gggtgaaggc ccagggctcg cagccaacgt
cggggcggca agccctgcca tagccactac 480gggtacgttt aaacccatgt
gcctggcaga taacttcgta taatgtatgc tatacgaagt 540tatggtacgt
actaagctct catgtttcac gtactaagct ctcatgttta acgtactaag
600ctctcatgtt taacgaacta aaccctcatg gctaacgtac taagctctca
tggctaacgt 660actaagctct catgtttcac gtactaagct ctcatgtttg
aacaataaaa ttaatataaa 720tcagcaactt aaatagcctc taaggtttta
agttttataa gaaaaaaaag aatatataag 780gcttttaaag cttttaaggt
ttaacggttg tggacaacaa gccagggatg taacgcactg 840agaagccctt
agagcctctc aaagcaattt tcagtgacac aggaacactt aacggctgac
900agaattagct tcacgctgcc gcaagcactc agggcgcaag ggctgctaaa
ggaagcggaa 960cacgtagaaa gccagtccgc agaaacggtg ctgaccccgg
atgaatgtca gctactgggc 1020tatctggaca agggaaaacg caagcgcaaa
gagaaagcag gtagcttgca gtgggcttac 1080atggcgatag ctagactggg
cggttttatg gacagcaagc gaaccggaat tgccagctgg 1140ggcgccctct
ggtaaggttg ggaagccctg caaagtaaac tggatggctt tcttgccgcc
1200aaggatctga tggcgcaggg gatcaagatc tgatcaagag acaggatgag
gatcgtttcg 1260catgattgaa caagatggat tgcacgcagg ttctccggcc
gcttgggtgg agaggctatt 1320cggctatgac tgggcacaac agacaatcgg
ctgctctgat gccgccgtgt tccggctgtc 1380agcgcagggg cgcccggttc
tttttgtcaa gaccgacctg tccggtgccc tgaatgaact 1440gcaggacgag
gcagcgcggc tatcgtggct ggccacgacg ggcgttcctt gcgcagctgt
1500gctcgacgtt gtcactgaag cgggaaggga ctggctgcta ttgggcgaag
tgccggggca 1560ggatctcctg tcatctcacc ttgctcctgc cgagaaagta
tccatcatgg ctgatgcaat 1620gcggcggctg catacgcttg atccggctac
ctgcccattc gaccaccaag cgaaacatcg 1680catcgagcga gcacgtactc
ggatggaagc cggtcttgtc gatcaggatg atctggacga 1740agagcatcag
gggctcgcgc cagccgaact gttcgccagg ctcaaggcgc gcatgcccga
1800cggcgaggat ctcgtcgtga cacatggcga tgcctgcttg ccgaatatca
tggtggaaaa 1860tggccgcttt tctggattca tcgactgtgg ccggctgggt
gtggcggacc gctatcagga 1920catagcgttg gctacccgtg atattgctga
agagcttggc ggcgaatggg ctgaccgctt 1980cctcgtgctt tacggtatcg
ccgctcccga ttcgcagcgc atcgccttct atcgccttct 2040tgacgagttc
ttctgagcgg gactctgggg ttcgaaatga ccgaccaagc gacgcccaac
2100ctgccatcac gagatttcga ttccaccgcc gccttctatg aaaggttggg
cttcggaatc 2160gttttccggg acgccggctg gatgatcctc cagcgcgggg
atctcatgct ggagttcttc 2220gcccaccccg ggatctatgt cgggtgcgga
gaaagaggta atgaaatggc acctaggtat 2280c 2281
SEQ ID NO 9 | Length 2231 | DNA | Artificial | pIDS
cgatactagt atacggacct ttaattcaac ccaacacaat atattatagt taaataagaa
60ttattatcaa atcatttgta tattaattaa aatactatac tgtaaattac attttattta
120caatcactcg acgaagactt gatcacccgg gatctcgagc catggtgcta
gcagctgatg 180catagcatgc ggtaccggga gatgggggag gctaactgaa
acacggaagg agacaatacc 240ggaaggaacc cgcgctatga cggcaataaa
aagacagaat aaaacgcacg ggtgttgggt 300cgtttgttca taaacgcggg
gttcggtccc agggctggca ctctgtcgat accccaccga 360gaccccattg
ggaccaatac gcccgcgttt cttccttttc cccaccccaa cccccaagtt
420cgggtgaagg cccagggctc gcagccaacg tcggggcggc aagccctgcc
atagccacta 480cgggtacgtt taaacccatg tgcctggcag ataacttcgt
ataatgtatg ctatacgaag 540ttatggtacg tactaagctc tcatgtttca
cgtactaagc tctcatgttt aacgtactaa 600gctctcatgt ttaacgaact
aaaccctcat ggctaacgta ctaagctctc atggctaacg 660tactaagctc
tcatgtttca cgtactaagc tctcatgttt gaacaataaa attaatataa
720atcagcaact taaatagcct ctaaggtttt aagttttata agaaaaaaaa
gaatatataa 780ggcttttaaa gcttttaagg tttaacggtt gtggacaaca
agccagggat gtaacgcact 840gagaagccct tagagcctct caaagcaatt
ttgagtgaca caggaacact taacggctga 900cataattcag cttcacgctg
ccgcaagcac tcagggcgca agggctgcta aaggaagcgg 960aacacgtaga
aagccagtcc gcagaaacgg tgctgacccc ggatgaatgt cagctgggag
1020gcagaataaa tgatcatatc gtcaattatt acctccacgg ggagagcctg
agcaaactgg 1080cctcaggcat ttgagaagca cacggtcaca ctgcttccgg
tagtcaataa accggtaagt 1140agcgtatgcg ctcacgcaac tggtccagaa
ccttgaccga acgcagcggt ggtaacggcg 1200cagtggcggt tttcatggct
tgttatgact gtttttttgg ggtacagtct atgcctcggg 1260catccaagca
gcaagcgcgt tacgccgtgg gtcgatgttt gatgttatgg agcagcaacg
1320atgttacgca gcagggcagt cgccctaaaa caaagttaaa catcatgagg
gaagcggtga 1380tcgccgaagt atcgactcaa ctatcagagg tagttggcgt
catcgagcgc catctcgaac 1440cgacgttgct ggccgtacat ttgtacggct
ccgcagtgga tggcggcctg aagccacaca 1500gtgatattga tttgctggtt
acggtgacgg taaggcttga tgaaacaacg cggcgagctt 1560tgatcaacga
ccttttggaa acttcggctt cccctggaga gagcgagatt ctccgcgctg
1620tagaagtcac cattgttgtg cacgacgaca tcattccgtg gcgttatcca
gctaagcgcg 1680aactgcaatt tggagaatgg cagcgcaatg acattcttgc
aggtatcttc gagccagcca 1740cgatcgacat tgatctggct atcttgctga
caaaagcaag agaacatagc gttgccttgg 1800taggtccagc ggcggaggaa
ctctttgatc cggttcctga acaggatcta tttgaggcgc 1860taaatgaaac
cttaacgcta tggaactcgc cgcccgactg ggctggcgat gagcgaaatg
1920tagtgcttac gttgtcccgc atttggtaca gcgcagtaac cggcaaaatc
gcgccgaagg 1980atgtcgctgc cgactgggca atggagcgcc tgccggccca
gtatcagccc gtcatacttg 2040aagctagaca ggcttatctt ggacaagaag
aagatcgctt ggcctcgcgc gcagatcagt 2100tggaagaatt tgtccactac
gtgaaaggcg agatcaccaa ggtagtcggc aaataatgtc 2160taacaattcg
ttcaagccga cggatctatg tcgggtgcgg agaaagaggt aatgaaatgg
2220cacctaggta t 2231
SEQ ID NO 10 | Length 2904 | DNA | Artificial | pACEBac1
accggttgac
ttgggtcaac tgtcagacca agtttactca tatatacttt agattgattt 60aaaacttcat
ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac
120caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag
aaaagatcaa 180aggatcttct tgagatcctt tttttctgcg cgtaatctgc
tgcttgcaaa caaaaaaacc 240accgctacca gcggtggttt gtttgccgga
tcaagagcta ccaactcttt ttccgaaggt 300aactggcttc agcagagcgc
agataccaaa tactgttctt ctagtgtagc cgtagttagg 360ccaccacttc
aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc
420agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa
gacgatagtt 480accggataag gcgcagcggt cgggctgaac ggggggttcg
tgcacacagc ccagcttgga 540gcgaacgacc tacaccgaac tgagatacct
acagcgtgag ctatgagaaa gcgccacgct 600tcccgaaggg agaaaggcgg
acaggtatcc ggtaagcggc agggtcggaa caggagagcg 660cacgagggag
cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca
720cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc
tatggaaaaa 780cgccagcaac gcggcctttt tacggttcct ggccttttgc
tggccttttg ctcacatgtt 840ctttcctgcg ttatcccctg attgacttgg
gtcgctcttc ctgtggatgc gcagatgccc 900tgcgtaagcg ggtgtgggcg
gacaataaag tcttaaactg aacaaaatag atctaaacta 960tgacaataaa
gtcttaaact agacagaata gttgtaaact gaaatcagtc cagttatgct
1020gtgaaaaagc atactggact tttgttatgg ctaaagcaaa ctcttcattt
tctgaagtgc 1080aaattgcccg tcgtattaaa gaggggcgtg gccaagggca
tgtaaagact atattcgcgg 1140cgttgtgaca atttaccgaa caactccgcg
gccgggaagc cgatctcggc ttgaacgaat 1200tgttaggtgg cggtacttgg
gtcgatatca aagtgcatca cttcttcccg tatgcccaac 1260tttgtataga
gagccactgc gggatcgtca ccgtaatctg cttgcacgta gatcacataa
1320gcaccaagcg cgttggcctc atgcttgagg agattgatga gcgcggtggc
aatgccctgc 1380ctccggtgct cgccggagac tgcgagatca tagatataga
tctcactacg cggctgctca 1440aacttgggca gaacgtaagc cgcgagagcg
ccaacaaccg cttcttggtc gaaggcagca 1500agcgcgatga atgtcttact
acggagcaag ttcccgaggt aatcggagtc cggctgatgt 1560tgggagtagg
tggctacgtc tccgaactca cgaccgaaaa gatcaagagc agcccgcatg
1620gatttgactt ggtcagggcc gagcctacat gtgcgaatga tgcccatact
tgagccacct 1680aactttgttt tagggcgact gccctgctgc gtaacatcgt
tgctgctgcg taacatcgtt 1740gctgctccat aacatcaaac atcgacccac
ggcgtaacgc gcttgctgct tggatgcccg 1800aggcatagac tgtacaaaaa
aacagtcata acaagccatg aaaaccgcca ctgcgccgtt 1860accaccgctg
cgttcggtca aggttctgga ccagttgcgt gagcgcatac gctacttgca
1920ttacagttta cgaaccgaac aggcttatgt caactgggtt cgtgccttca
tccgtttcca 1980cggtgtgcgt cacccggcaa ccttgggcag cagcgaagtc
gccataactt cgtatagcat 2040acattatacg aagttatctg taactataac
ggtcctaagg tagcgagttt aaacactagt 2100atcgattcgc gacctactcc
ggaatattaa tagatcatgg agataattaa aatgataacc 2160atctcgcaaa
taaataagta ttttactgtt ttcgtaacag ttttgtaata aaaaaaccta
2220taaatattcc ggattattca taccgtccca ccatcgggcg cggatcccgg
tccgaagcgc 2280gcggaattca aaggcctacg tcgacgagct cacttgtcgc
ggccgctttc gaatctagag 2340cctgcagtct cgacaagctt gtcgagaagt
actagaggat cataatcagc cataccacat 2400ttgtagaggt tttacttgct
ttaaaaaacc tcccacacct ccccctgaac ctgaaacata 2460aaatgaatgc
aattgttgtt gttaacttgt ttattgcagc ttataatggt tacaaataaa
2520gcaatagcat cacaaatttc acaaataaag catttttttc actgcattct
agttgtggtt 2580tgtccaaact catcaatgta tcttatcatg tctggatctg
atcactgctt gagcctagaa 2640gatccggctg ctaacaaagc ccgaaaggaa
gctgagttgg ctgctgccac cgctgagcaa 2700taactatcat aacccctagg
gtatacccat ctaattggaa ccagataagt gaaatctagt 2760tccaaactat
tttgtcattt ttaattttcg tattagctta cgacgctaca cccagttccc
2820atctattttg tcactcttcc ctaaataatc cttaaaaact ccatttccac
ccctcccagt 2880tcccaactat tttgtccgcc caca
2904
SEQ ID NO 11 | Length 2761 | DNA | Artificial | pACEBac2
accggttgac ttgggtcaac tgtcagacca
agtttactca tatatacttt agattgattt 60aaaacttcat ttttaattta aaaggatcta
ggtgaagatc ctttttgata atctcatgac 120caaaatccct taacgtgagt
tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 180aggatcttct
tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc
240accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt
ttccgaaggt 300aactggcttc agcagagcgc agataccaaa tactgttctt
ctagtgtagc cgtagttagg 360ccaccacttc aagaactctg tagcaccgcc
tacatacctc gctctgctaa tcctgttacc 420agtggctgct gccagtggcg
ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 480accggataag
gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga
540gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa
gcgccacgct 600tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc
agggtcggaa caggagagcg 660cacgagggag cttccagggg gaaacgcctg
gtatctttat agtcctgtcg ggtttcgcca 720cctctgactt gagcgtcgat
ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 780cgccagcaac
gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt
840ctttcctgcg ttatcccctg attgacttgg gtcgctcttc ctgtggatgc
gcagatgccc 900tgcgtaagcg ggtgtgggcg gacaataaag tcttaaactg
aacaaaatag atctaaacta 960tgacaataaa gtcttaaact agacagaata
gttgtaaact gaaatcagtc cagttatgct 1020gtgaaaaagc atactggact
tttgttatgg ctaaagcaaa ctcttcattt tctgaagtgc 1080aaattgcccg
tcgtattaaa gaggggcgtg gccaagggca tgtaaagact atattcgcgg
1140cgttgtgaca atttaccgaa caactccgcg gccgggaagc cgatctcggc
ttgaacgaat 1200tgttaggtgg cggtacttgg gtcgatatca aagtgcatca
cttcttcccg tatgcccaac 1260tttgtataga gagccactgc gggatcgtca
ccgtaatctg cttgcacgta gatcacataa 1320gcaccaagcg cgttggcctc
atgcttgagg agattgatga gcgcggtggc aatgccctgc 1380ctccggtgct
cgccggagac tgcgagatca tagatataga tctcactacg cggctgctca
1440aacttgggca gaacgtaagc cgcgagagcg ccaacaaccg cttcttggtc
gaaggcagca 1500agcgcgatga atgtcttact acggagcaag ttcccgaggt
aatcggagtc cggctgatgt 1560tgggagtagg tggctacgtc tccgaactca
cgaccgaaaa gatcaagagc agcccgcatg 1620gatttgactt ggtcagggcc
gagcctacat gtgcgaatga tgcccatact tgagccacct 1680aactttgttt
tagggcgact gccctgctgc gtaacatcgt tgctgctgcg taacatcgtt
1740gctgctccat aacatcaaac atcgacccac ggcgtaacgc gcttgctgct
tggatgcccg 1800aggcatagac tgtacaaaaa aacagtcata acaagccatg
aaaaccgcca ctgcgccgtt 1860accaccgctg cgttcggtca aggttctgga
ccagttgcgt gagcgcatac gctacttgca 1920ttacagttta cgaaccgaac
aggcttatgt caactgggtt cgtgccttca tccgtttcca 1980cggtgtgcgt
cacccggcaa ccttgggcag cagcgaagtc gccataactt cgtatagcat
2040acattatacg aagttatctg taactataac ggtcctaagg tagcgagttt
aaacgtaccc 2100gtagtggcta tggcagggct tgccgccccg acgttggctg
cgagccctgg gccttcaccc 2160gaacttgggg gttggggtgg ggaaaaggaa
gaaacgcggg cgtattggtc ccaatggggt 2220ctcggtgggg tatcgacaga
gtgccagccc tgggaccgaa ccccgcgttt atgaacaaac 2280gacccaacac
ccgtgcgttt tattctgtct ttttattgcc gtcatagcgc gggttccttc
2340cggtattgtc tccttccgtg tttcagttag cctcccccat ctcccggtac
cgcatgctat 2400gcatcagctg ctagcaccat ggctcgagat cccgggtgat
caagtcttcg tcgagtgatt 2460gtaaataaaa tgtaatttac agtatagtat
tttaattaat atacaaatga tttgataata 2520attcttattt aactataata
tattgtgttg ggttgaatta aaggtccgta tactagggta 2580tacccatcta
attggaacca gataagtgaa atctagttcc aaactatttt gtcattttta
2640attttcgtat tagcttacga cgctacaccc agttcccatc tattttgtca
ctcttcccta 2700aataatcctt aaaaactcca tttccacccc tcccagttcc
caactatttt gtccgcccac 2760a 2761
SEQ ID NO 12 | Length 2940 | DNA | Artificial | pACEBac3
gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc
60ctgcgttatc ccctgattga cttgggtcgc tcttcctgtg gatgcgcaga tgccctgcgt
120aagcgggtgt gggcggacaa taaagtctta aactgaacaa aatagatcta
aactatgaca 180ataaagtctt aaactagaca gaatagttgt aaactgaaat
cagtccagtt atgctgtgaa 240aaagcatact ggacttttgt tatggctaaa
gcaaactctt cattttctga agtgcaaatt 300gcccgtcgta ttaaagaggg
gcgtggccaa gggcatgtaa agactatatt cgcggcgttg 360tgacaattta
ccgaacaact ccgcggccgg gaagccgatc tcggcttgaa cgaattgtta
420ggtggcggta cttgggtcga tatcaaagtg catcacttct tcccgtatgc
ccaactttgt 480atagagagcc actgcgggat cgtcaccgta atctgcttgc
acgtagatca cataagcacc 540aagcgcgttg gcctcatgct tgaggagatt
gatgagcgcg gtggcaatgc cctgcctccg 600gtgctcgccg gagactgcga
gatcatagat atagatctca ctacgcggct gctcaaactt 660gggcagaacg
taagccgcga gagcgccaac aaccgcttct tggtcgaagg cagcaagcgc
720gatgaatgtc ttactacgga gcaagttccc gaggtaatcg gagtccggct
gatgttggga 780gtaggtggct acgtctccga actcacgacc gaaaagatca
agagcagccc gcatggattt 840gacttggtca gggccgagcc tacatgtgcg
aatgatgccc atacttgagc cacctaactt 900tgttttaggg cgactgccct
gctgcgtaac atcgttgctg ctgcgtaaca tcgttgctgc 960tccataacat
caaacatcga cccacggcgt aacgcgcttg ctgcttggat gcccgaggca
1020tagactgtac aaaaaaacag tcataacaag ccatgaaaac cgccactgcg
ccgttaccac 1080cgctgcgttc ggtcaaggtt ctggaccagt tgcgtgagcg
catacgctac ttgcattaca 1140gtttacgaac cgaacaggct tatgtcaact
gggttcgtgc cttcatccgt ttccacggtg 1200tgcgtcaccc ggcaaccttg
ggcagcagcg aagtcgccat aacttcgtat agcatacatt 1260atacgaagtt
atctgtaact ataacggtcc taaggtagcg agtttaaaca ctagtatcga
1320ttcgcgacct actccggaat attaatagat catggagata attaaaatga
taaccatctc 1380gcaaataaat aagtatttta ctgttttcgt aacagttttg
taataaaaaa acctataaat 1440attccggatt attcataccg tcccaccatc
gggcgcggat cccggtccga agcgcgcgga 1500attcaaaggc ctacgtcgac
gagctcactt gtcgcggccg ctttcgaatc tagagcctgc 1560agtctcgaca
agcttgtcga gaagtactag aggatcataa tcagccatac cacatttgta
1620gaggttttac ttgctttaaa aaacctccca cacctccccc tgaacctgaa
acataaaatg 1680aatgcaattg ttgttgttaa cttgtttatt gcagcttata
atggttacaa ataaagcaat 1740agcatcacaa atttcacaaa taaagcattt
ttttcactgc attctagttg tggtttgtcc 1800aaactcatca atgtatctta
tcatgtctgg atctgatcac tgcttgagcc tagaagatcc 1860ggctgctaac
aaagcccgaa aggaagctga gttggctgct gccaccgctg agcaataact
1920atcataaccc ctagggtata cccatctaat tggaaccaga taagtgaaat
ctagttccaa 1980actattttgt catttttaat tttcgtatta gcttacgacg
ctacacccag ttcccatcta 2040ttttgtcact cttccctaaa taatccttaa
aaactccatt tccacccctc ccagttccca 2100actattttgt ccgcccacaa
ccggttgact tgggtcaact gtcagaccaa gtttactcat 2160atatacttta
gattgattta aaacttcatt tttaatttaa aaggatctag gtgaagatcc
2220tttttgataa tctcatgacc acaggcattg gcggccttgc tgttcttcta
cggcaaggtg 2280ctgtgcacgc ccagctgcca tttttggggt gaggtcgttc
gcggccgagg ggcgcagccc 2340ctggggggat ggggtgccgc gttagcgggc
cgggagggtt cgagaagggg gggcaccccc 2400cttcggcgtg cgcggtcacg
cgccagggcg cagccctggt taaaaacaag gtttataaat 2460attggtttaa
aagcaggtta aaagacaggt tagcggtggc cgaaaaacgg gcggaaaccc
2520ttgcaaatgc tggattttct gcctgtggac agcccctcaa atgtcaatag
gtgcgcccct 2580catctgtcat cactctgccc ctcaagtgtc aaggatcgcg
cccctcatct gtcagtagtc 2640gcgcccctca agtgtcaata ccgcagggca
cttatcccca ggcttgtcca catcatctgt 2700gggaaactcg cgtaaaatca
ggcgttttcg ccgatttgcg aggctggcca gctccacgtc 2760gccggccgaa
atcgagcctg cccctcatct gtcaacgccg cgccgggtga gtcggcccct
2820caagtgtcaa cgtccgcccc tcatctgtca gtgagggcca agttttccgc
gtggtatcca 2880caacgccggc ggccaaaaga agagctttca caccgcatag
accagccgcg taacctggca 2940132805DNAArtificialpACEBac4 13gcaacgcggc
ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc 60ctgcgttatc
ccctgattga cttgggtcgc tcttcctgtg gatgcgcaga tgccctgcgt
120aagcgggtgt gggcggacaa taaagtctta aactgaacaa aatagatcta
aactatgaca 180ataaagtctt aaactagaca gaatagttgt aaactgaaat
cagtccagtt atgctgtgaa 240aaagcatact ggacttttgt tatggctaaa
gcaaactctt cattttctga agtgcaaatt 300gcccgtcgta ttaaagaggg
gcgtggccaa gggcatgtaa agactatatt cgcggcgttg 360tgacaattta
ccgaacaact ccgcggccgg gaagccgatc tcggcttgaa cgaattgtta
420ggtggcggta cttgggtcga tatcaaagtg catcacttct tcccgtatgc
ccaactttgt 480atagagagcc actgcgggat cgtcaccgta atctgcttgc
acgtagatca cataagcacc 540aagcgcgttg gcctcatgct tgaggagatt
gatgagcgcg gtggcaatgc cctgcctccg 600gtgctcgccg gagactgcga
gatcatagat atagatctca ctacgcggct gctcaaactt 660gggcagaacg
taagccgcga gagcgccaac aaccgcttct tggtcgaagg cagcaagcgc
720gatgaatgtc ttactacgga gcaagttccc gaggtaatcg gagtccggct
gatgttggga 780gtaggtggct acgtctccga actcacgacc gaaaagatca
agagcagccc gcatggattt 840gacttggtca gggccgagcc tacatgtgcg
aatgatgccc atacttgagc cacctaactt 900tgttttaggg cgactgccct
gctgcgtaac atcgttgctg ctgcgtaaca tcgttgctgc 960tccataacat
caaacatcga cccacggcgt aacgcgcttg ctgcttggat gcccgaggca
1020tagactgtac aaaaaaacag tcataacaag ccatgaaaac cgccactgcg
ccgttaccac 1080cgctgcgttc ggtcaaggtt ctggaccagt tgcgtgagcg
catacgctac ttgcattaca 1140gtttacgaac cgaacaggct tatgtcaact
gggttcgtgc cttcatccgt ttccacggtg 1200tgcgtcaccc ggcaaccttg
ggcagcagcg aagtcgccat aacttcgtat agcatacatt 1260atacgaagtt
atctgtaact ataacggtcc taaggtagcg agtttaaacg tacccgtagt
1320ggctatggca gggcttgccg ccccgacgtt ggctgcgagc cctgggcctt
cacccgaact 1380tgggggttgg ggtggggaaa aggaagaaac gcgggcgtat
tggtcccaat ggggtctcgg 1440tggggtatcg acagagtgcc agccctggga
ccgaaccccg cgtttatgaa caaacgaccc 1500aacacccgtg cgttttattc
tgtcttttta ttgccgtcat agcgcgggtt ccttccggta 1560ttgtctcctt
ccgtgtttca gttagcctcc cccatctccc ggtaccgcat gctatgcatc
1620agctgctagc accatggctc gagatcccgg gtgatcaagt cttcgtcgag
tgattgtaaa 1680taaaatgtaa tttacagtat agtattttaa ttaatataca
aatgatttga taataattct 1740tatttaacta taatatattg tgttgggttg
aattaaaggt ccgtatacta gtatcctagg 1800gtatacccat ctaattggaa
ccagataagt gaaatctagt tccaaactat tttgtcattt 1860ttaattttcg
tattagctta cgacgctaca
cccagttccc atctattttg tcactcttcc 1920ctaaataatc cttaaaaact
ccatttccac ccctcccagt tcccaactat tttgtccgcc 1980cacaaccggt
tgacttgggt caactgtcag accaagttta ctcatatata ctttagattg
2040atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt
gataatctca 2100tgaccacagg cattggcggc cttgctgttc ttctacggca
aggtgctgtg cacgcccagc 2160tgccattttt ggggtgaggt cgttcgcggc
cgaggggcgc agcccctggg gggatggggt 2220gccgcgttag cgggccggga
gggttcgaga agggggggca ccccccttcg gcgtgcgcgg 2280tcacgcgcca
gggcgcagcc ctggttaaaa acaaggttta taaatattgg tttaaaagca
2340ggttaaaaga caggttagcg gtggccgaaa aacgggcgga aacccttgca
aatgctggat 2400tttctgcctg tggacagccc ctcaaatgtc aataggtgcg
cccctcatct gtcatcactc 2460tgcccctcaa gtgtcaagga tcgcgcccct
catctgtcag tagtcgcgcc cctcaagtgt 2520caataccgca gggcacttat
ccccaggctt gtccacatca tctgtgggaa actcgcgtaa 2580aatcaggcgt
tttcgccgat ttgcgaggct ggccagctcc acgtcgccgg ccgaaatcga
2640gcctgcccct catctgtcaa cgccgcgccg ggtgagtcgg cccctcaagt
gtcaacgtcc 2700gcccctcatc tgtcagtgag ggccaagttt tccgcgtggt
atccacaacg ccggcggcca 2760aaagaagagc tttcacaccg catagaccag
ccgcgtaacc tggca 2805
SEQ ID NO 14 | Length 4589 | DNA | Artificial | pOmniBac1
accggtggag
gaaattctcc ttgaagtttc cctggtgttc aaagtaaagg agtttgcacc 60agacgcacct
ctgttcactg gtccggcgta ttaaaacacg atacattgtt attagtacat
120ttattaagcg ctagattctg tgcgttgttg atttacagac aattgttgta
cgtattttaa 180taattcatta aatttataat ctttagggtg gtatgttaga
gcgaaaatca aatgattttc 240agcgtcttta tatctgaatt taaatattaa
atcctcaata gatttgtaaa ataggtttcg 300attagtttca aacaagggtt
gtttttccga accgatggct ggactatcta atggattttc 360gctcaacgcc
acaaaacttg ccaaatcttg tagcagcaat ctagctttgt cgatattcgt
420ttgtgttttg ttttgtaata aaggttcgac gtcgttcaaa atattatgcg
cttttgtatt 480tctttcatca ctgtcgttag tgtacaattg actcgacgta
aacacgttaa atagagcttg 540gacatattta acatcgggcg tgttagcttt
attaggccga ttatcgtcgt cgtcccaacc 600ctcgtcgtta gaagttgctt
ccgaagacga ttttgccata gccacacgac gcctattaat 660tgtgtcggct
aacacgtccg cgatcaaatt tgtagttgag ctttttggaa ttaccggttg
720acttgggtca actgtcagac caagtttact catatatact ttagattgat
ttaaaacttc 780atttttaatt taaaaggatc taggtgaaga tcctttttga
taatctcatg accaaaatcc 840cttaacgtga gttttcgttc cactgagcgt
cagaccccgt agaaaagatc aaaggatctt 900cttgagatcc tttttttctg
cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac 960cagcggtggt
ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct
1020tcagcagagc gcagatacca aatactgttc ttctagtgta gccgtagtta
ggccaccact 1080tcaagaactc tgtagcaccg cctacatacc tcgctctgct
aatcctgtta ccagtggctg 1140ctgccagtgg cgataagtcg tgtcttaccg
ggttggactc aagacgatag ttaccggata 1200aggcgcagcg gtcgggctga
acggggggtt cgtgcacaca gcccagcttg gagcgaacga 1260cctacaccga
actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag
1320ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag
cgcacgaggg 1380agcttccagg gggaaacgcc tggtatcttt atagtcctgt
cgggtttcgc cacctctgac 1440ttgagcgtcg atttttgtga tgctcgtcag
gggggcggag cctatggaaa aacgccagca 1500acgcggcctt tttacggttc
ctggcctttt gctggccttt tgctcacatg ttctttcctg 1560cgttatcccc
tgattgactt gggtcgctct tcctgtggat gcgcaggtat gtacaggaag
1620aggtttatac taaactgtta cattgcaaac gtggtttcgt gtgccaagtg
tgaaaaccga 1680tgtttaatca aggctctgac gcatttctac aaccacgact
ctaagtgtgt gggtgaagtc 1740atgcatcttt taatcaaatc ccaagatgtg
tataaaccac caaactgcca aaaaatgaaa 1800actgtcgaca agctctgtcc
gtttgctggc aactgcaagg gtctcaatcc tatttgtaat 1860tattgaataa
taaaacaatt ataaatgtca aatttgtttt ttattaacga tacaaaccaa
1920acgcaacaag aacatttgta gtattatcta taattgaaaa cgcgtagtta
taatcgctga 1980ggtaatattt aaaatcattt tcaaatgatt cacagttaat
ttgcgacaat ataattttat 2040tttcacataa actagacgcc ttgtcgtctt
cttcttcgta ttccttctct ttttcatttt 2100tctcttcata aaaattaaca
tagttattat cgtatccata tatgtatcta tcgtatagag 2160taaatttttt
gttgtcataa atatatatgt cttttttaat ggggtgtata gtaccgctgc
2220gcatagtttt tctgtaattt acaacagtgc tattttctgg tagttcttcg
gagtgtgttg 2280ctttaattat taaatttata taatcaatga atttgggatc
gtcggttttg tacaatatgt 2340tgccggcata gtacgcagct tcttctagtt
caattacacc attttttagc agcaccggat 2400taacataact ttccaaaatg
ttgtacgaac cgttaaacaa aaacagttca cctccctttt 2460ctatactatt
gtctgcgagc agttgtttgt tgttaaaaat aacagccatt gtaatgagac
2520gcacaaacta atatcacaaa ctggaaatgt ctatcaatat atagttgctg
attgcgcaga 2580tgccctgcgt aagcgggtgt gggcggacaa taaagtctta
aactgaacaa aatagatcta 2640aactatgaca ataaagtctt aaactagaca
gaatagttgt aaactgaaat cagtccagtt 2700atgctgtgaa aaagcatact
ggacttttgt tatggctaaa gcaaactctt cattttctga 2760agtgcaaatt
gcccgtcgta ttaaagaggg gcgtggccaa gggcatgtaa agactatatt
2820cgcggcgttg tgacaattta ccgaacaact ccgcggccgg gaagccgatc
tcggcttgaa 2880cgaattgtta ggtggcggta cttgggtcga tatcaaagtg
catcacttct tcccgtatgc 2940ccaactttgt atagagagcc actgcgggat
cgtcaccgta atctgcttgc acgtagatca 3000cataagcacc aagcgcgttg
gcctcatgct tgaggagatt gatgagcgcg gtggcaatgc 3060cctgcctccg
gtgctcgccg gagactgcga gatcatagat atagatctca ctacgcggct
3120gctcaaactt gggcagaacg taagccgcga gagcgccaac aaccgcttct
tggtcgaagg 3180cagcaagcgc gatgaatgtc ttactacgga gcaagttccc
gaggtaatcg gagtccggct 3240gatgttggga gtaggtggct acgtctccga
actcacgacc gaaaagatca agagcagccc 3300gcatggattt gacttggtca
gggccgagcc tacatgtgcg aatgatgccc atacttgagc 3360cacctaactt
tgttttaggg cgactgccct gctgcgtaac atcgttgctg ctgcgtaaca
3420tcgttgctgc tccataacat caaacatcga cccacggcgt aacgcgcttg
ctgcttggat 3480gcccgaggca tagactgtac aaaaaaacag tcataacaag
ccatgaaaac cgccactgcg 3540ccgttaccac cgctgcgttc ggtcaaggtt
ctggaccagt tgcgtgagcg catacgctac 3600ttgcattaca gtttacgaac
cgaacaggct tatgtcaact gggttcgtgc cttcatccgt 3660ttccacggtg
tgcgtcaccc ggcaaccttg ggcagcagcg aagtcgccat aacttcgtat
3720agcatacatt atacgaagtt atctgtaact ataacggtcc taaggtagcg
agtttaaaca 3780ctagtatcga ttcgcgacct actccggaat attaatagat
catggagata attaaaatga 3840taaccatctc gcaaataaat aagtatttta
ctgttttcgt aacagttttg taataaaaaa 3900acctataaat attccggatt
attcataccg tcccaccatc gggcgcggat cccggtccga 3960agcgcgcgga
attcaaaggc ctacgtcgac gagctcactt gtcgcggccg ctttcgaatc
4020tagagcctgc agtctcgaca agcttgtcga gaagtactag aggatcataa
tcagccatac 4080cacatttgta gaggttttac ttgctttaaa aaacctccca
cacctccccc tgaacctgaa 4140acataaaatg aatgcaattg ttgttgttaa
cttgtttatt gcagcttata atggttacaa 4200ataaagcaat agcatcacaa
atttcacaaa taaagcattt ttttcactgc attctagttg 4260tggtttgtcc
aaactcatca atgtatctta tcatgtctgg atctgatcac tgcttgagcc
4320tagaagatcc ggctgctaac aaagcccgaa aggaagctga gttggctgct
gccaccgctg 4380agcaataact atcataaccc ctagggtata cccatctaat
tggaaccaga taagtgaaat 4440ctagttccaa actattttgt catttttaat
tttcgtatta gcttacgacg ctacacccag 4500ttcccatcta ttttgtcact
cttccctaaa taatccttaa aaactccatt tccacccctc 4560ccagttccca
actattttgt ccgcccaca 4589
SEQ ID NO 15, length 4446, DNA, Artificial, pOmniBac2
ccggtggagg
aaattctcct tgaagtttcc ctggtgttca aagtaaagga gtttgcacca 60gacgcacctc
tgttcactgg tccggcgtat taaaacacga tacattgtta ttagtacatt
120tattaagcgc tagattctgt gcgttgttga tttacagaca attgttgtac
gtattttaat 180aattcattaa atttataatc tttagggtgg tatgttagag
cgaaaatcaa atgattttca 240gcgtctttat atctgaattt aaatattaaa
tcctcaatag atttgtaaaa taggtttcga 300ttagtttcaa acaagggttg
tttttccgaa ccgatggctg gactatctaa tggattttcg 360ctcaacgcca
caaaacttgc caaatcttgt agcagcaatc tagctttgtc gatattcgtt
420tgtgttttgt tttgtaataa aggttcgacg tcgttcaaaa tattatgcgc
ttttgtattt 480ctttcatcac tgtcgttagt gtacaattga ctcgacgtaa
acacgttaaa tagagcttgg 540acatatttaa catcgggcgt gttagcttta
ttaggccgat tatcgtcgtc gtcccaaccc 600tcgtcgttag aagttgcttc
cgaagacgat tttgccatag ccacacgacg cctattaatt 660gtgtcggcta
acacgtccgc gatcaaattt gtagttgagc tttttggaat taccggttga
720cttgggtcaa ctgtcagacc aagtttactc atatatactt tagattgatt
taaaacttca 780tttttaattt aaaaggatct aggtgaagat cctttttgat
aatctcatga ccaaaatccc 840ttaacgtgag ttttcgttcc actgagcgtc
agaccccgta gaaaagatca aaggatcttc 900ttgagatcct ttttttctgc
gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 960agcggtggtt
tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt
1020cagcagagcg cagataccaa atactgttct tctagtgtag ccgtagttag
gccaccactt 1080caagaactct gtagcaccgc ctacatacct cgctctgcta
atcctgttac cagtggctgc 1140tgccagtggc gataagtcgt gtcttaccgg
gttggactca agacgatagt taccggataa 1200ggcgcagcgg tcgggctgaa
cggggggttc gtgcacacag cccagcttgg agcgaacgac 1260ctacaccgaa
ctgagatacc tacagcgtga gctatgagaa agcgccacgc ttcccgaagg
1320gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc
gcacgaggga 1380gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc
gggtttcgcc acctctgact 1440tgagcgtcga tttttgtgat gctcgtcagg
ggggcggagc ctatggaaaa acgccagcaa 1500cgcggccttt ttacggttcc
tggccttttg ctggcctttt gctcacatgt tctttcctgc 1560gttatcccct
gattgacttg ggtcgctctt cctgtggatg cgcaggtatg tacaggaaga
1620ggtttatact aaactgttac attgcaaacg tggtttcgtg tgccaagtgt
gaaaaccgat 1680gtttaatcaa ggctctgacg catttctaca accacgactc
taagtgtgtg ggtgaagtca 1740tgcatctttt aatcaaatcc caagatgtgt
ataaaccacc aaactgccaa aaaatgaaaa 1800ctgtcgacaa gctctgtccg
tttgctggca actgcaaggg tctcaatcct atttgtaatt 1860attgaataat
aaaacaatta taaatgtcaa atttgttttt tattaacgat acaaaccaaa
1920cgcaacaaga acatttgtag tattatctat aattgaaaac gcgtagttat
aatcgctgag 1980gtaatattta aaatcatttt caaatgattc acagttaatt
tgcgacaata taattttatt 2040ttcacataaa ctagacgcct tgtcgtcttc
ttcttcgtat tccttctctt tttcattttt 2100ctcttcataa aaattaacat
agttattatc gtatccatat atgtatctat cgtatagagt 2160aaattttttg
ttgtcataaa tatatatgtc ttttttaatg gggtgtatag taccgctgcg
2220catagttttt ctgtaattta caacagtgct attttctggt agttcttcgg
agtgtgttgc 2280tttaattatt aaatttatat aatcaatgaa tttgggatcg
tcggttttgt acaatatgtt 2340gccggcatag tacgcagctt cttctagttc
aattacacca ttttttagca gcaccggatt 2400aacataactt tccaaaatgt
tgtacgaacc gttaaacaaa aacagttcac ctcccttttc 2460tatactattg
tctgcgagca gttgtttgtt gttaaaaata acagccattg taatgagacg
2520cacaaactaa tatcacaaac tggaaatgtc tatcaatata tagttgctga
ttgcgcagat 2580gccctgcgta agcgggtgtg ggcggacaat aaagtcttaa
actgaacaaa atagatctaa 2640actatgacaa taaagtctta aactagacag
aatagttgta aactgaaatc agtccagtta 2700tgctgtgaaa aagcatactg
gacttttgtt atggctaaag caaactcttc attttctgaa 2760gtgcaaattg
cccgtcgtat taaagagggg cgtggccaag ggcatgtaaa gactatattc
2820gcggcgttgt gacaatttac cgaacaactc cgcggccggg aagccgatct
cggcttgaac 2880gaattgttag gtggcggtac ttgggtcgat atcaaagtgc
atcacttctt cccgtatgcc 2940caactttgta tagagagcca ctgcgggatc
gtcaccgtaa tctgcttgca cgtagatcac 3000ataagcacca agcgcgttgg
cctcatgctt gaggagattg atgagcgcgg tggcaatgcc 3060ctgcctccgg
tgctcgccgg agactgcgag atcatagata tagatctcac tacgcggctg
3120ctcaaacttg ggcagaacgt aagccgcgag agcgccaaca accgcttctt
ggtcgaaggc 3180agcaagcgcg atgaatgtct tactacggag caagttcccg
aggtaatcgg agtccggctg 3240atgttgggag taggtggcta cgtctccgaa
ctcacgaccg aaaagatcaa gagcagcccg 3300catggatttg acttggtcag
ggccgagcct acatgtgcga atgatgccca tacttgagcc 3360acctaacttt
gttttagggc gactgccctg ctgcgtaaca tcgttgctgc tgcgtaacat
3420cgttgctgct ccataacatc aaacatcgac ccacggcgta acgcgcttgc
tgcttggatg 3480cccgaggcat agactgtaca aaaaaacagt cataacaagc
catgaaaacc gccactgcgc 3540cgttaccacc gctgcgttcg gtcaaggttc
tggaccagtt gcgtgagcgc atacgctact 3600tgcattacag tttacgaacc
gaacaggctt atgtcaactg ggttcgtgcc ttcatccgtt 3660tccacggtgt
gcgtcacccg gcaaccttgg gcagcagcga agtcgccata acttcgtata
3720gcatacatta tacgaagtta tctgtaacta taacggtcct aaggtagcga
gtttaaacgt 3780acccgtagtg gctatggcag ggcttgccgc cccgacgttg
gctgcgagcc ctgggccttc 3840acccgaactt gggggttggg gtggggaaaa
ggaagaaacg cgggcgtatt ggtcccaatg 3900gggtctcggt ggggtatcga
cagagtgcca gccctgggac cgaaccccgc gtttatgaac 3960aaacgaccca
acacccgtgc gttttattct gtctttttat tgccgtcata gcgcgggttc
4020cttccggtat tgtctccttc cgtgtttcag ttagcctccc ccatctcccg
gtaccgcatg 4080ctatgcatca gctgctagca ccatggctcg agatcccggg
tgatcaagtc ttcgtcgagt 4140gattgtaaat aaaatgtaat ttacagtata
gtattttaat taatatacaa atgatttgat 4200aataattctt atttaactat
aatatattgt gttgggttga attaaaggtc cgtatactag 4260ggtataccca
tctaattgga accagataag tgaaatctag ttccaaacta ttttgtcatt
4320tttaattttc gtattagctt acgacgctac acccagttcc catctatttt
gtcactcttc 4380cctaaataat ccttaaaaac tccatttcca cccctcccag
ttcccaacta ttttgtccgc 4440
ccacaa 4446
SEQ ID NO 16, length 4625, DNA, Artificial, pOmniBac3
gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc
60ctgcgttatc ccctgattga cttgggtcgc tcttcctgtg gatgcgcagg tatgtacagg
120aagaggttta tactaaactg ttacattgca aacgtggttt cgtgtgccaa
gtgtgaaaac 180cgatgtttaa tcaaggctct gacgcatttc tacaaccacg
actctaagtg tgtgggtgaa 240gtcatgcatc ttttaatcaa atcccaagat
gtgtataaac caccaaactg ccaaaaaatg 300aaaactgtcg acaagctctg
tccgtttgct ggcaactgca agggtctcaa tcctatttgt 360aattattgaa
taataaaaca attataaatg tcaaatttgt tttttattaa cgatacaaac
420caaacgcaac aagaacattt gtagtattat ctataattga aaacgcgtag
ttataatcgc 480tgaggtaata tttaaaatca ttttcaaatg attcacagtt
aatttgcgac aatataattt 540tattttcaca taaactagac gccttgtcgt
cttcttcttc gtattccttc tctttttcat 600ttttctcttc ataaaaatta
acatagttat tatcgtatcc atatatgtat ctatcgtata 660gagtaaattt
tttgttgtca taaatatata tgtctttttt aatggggtgt atagtaccgc
720tgcgcatagt ttttctgtaa tttacaacag tgctattttc tggtagttct
tcggagtgtg 780ttgctttaat tattaaattt atataatcaa tgaatttggg
atcgtcggtt ttgtacaata 840tgttgccggc atagtacgca gcttcttcta
gttcaattac accatttttt agcagcaccg 900gattaacata actttccaaa
atgttgtacg aaccgttaaa caaaaacagt tcacctccct 960tttctatact
attgtctgcg agcagttgtt tgttgttaaa aataacagcc attgtaatga
1020gacgcacaaa ctaatatcac aaactggaaa tgtctatcaa tatatagttg
ctgattgcgc 1080agatgccctg cgtaagcggg tgtgggcgga caataaagtc
ttaaactgaa caaaatagat 1140ctaaactatg acaataaagt cttaaactag
acagaatagt tgtaaactga aatcagtcca 1200gttatgctgt gaaaaagcat
actggacttt tgttatggct aaagcaaact cttcattttc 1260tgaagtgcaa
attgcccgtc gtattaaaga ggggcgtggc caagggcatg taaagactat
1320attcgcggcg ttgtgacaat ttaccgaaca actccgcggc cgggaagccg
atctcggctt 1380gaacgaattg ttaggtggcg gtacttgggt cgatatcaaa
gtgcatcact tcttcccgta 1440tgcccaactt tgtatagaga gccactgcgg
gatcgtcacc gtaatctgct tgcacgtaga 1500tcacataagc accaagcgcg
ttggcctcat gcttgaggag attgatgagc gcggtggcaa 1560tgccctgcct
ccggtgctcg ccggagactg cgagatcata gatatagatc tcactacgcg
1620gctgctcaaa cttgggcaga acgtaagccg cgagagcgcc aacaaccgct
tcttggtcga 1680aggcagcaag cgcgatgaat gtcttactac ggagcaagtt
cccgaggtaa tcggagtccg 1740gctgatgttg ggagtaggtg gctacgtctc
cgaactcacg accgaaaaga tcaagagcag 1800cccgcatgga tttgacttgg
tcagggccga gcctacatgt gcgaatgatg cccatacttg 1860agccacctaa
ctttgtttta gggcgactgc cctgctgcgt aacatcgttg ctgctgcgta
1920acatcgttgc tgctccataa catcaaacat cgacccacgg cgtaacgcgc
ttgctgcttg 1980gatgcccgag gcatagactg tacaaaaaaa cagtcataac
aagccatgaa aaccgccact 2040gcgccgttac caccgctgcg ttcggtcaag
gttctggacc agttgcgtga gcgcatacgc 2100tacttgcatt acagtttacg
aaccgaacag gcttatgtca actgggttcg tgccttcatc 2160cgtttccacg
gtgtgcgtca cccggcaacc ttgggcagca gcgaagtcgc cataacttcg
2220tatagcatac attatacgaa gttatctgta actataacgg tcctaaggta
gcgagtttaa 2280acactagtat cgattcgcga cctactccgg aatattaata
gatcatggag ataattaaaa 2340tgataaccat ctcgcaaata aataagtatt
ttactgtttt cgtaacagtt ttgtaataaa 2400aaaacctata aatattccgg
attattcata ccgtcccacc atcgggcgcg gatcccggtc 2460cgaagcgcgc
ggaattcaaa ggcctacgtc gacgagctca cttgtcgcgg ccgctttcga
2520atctagagcc tgcagtctcg acaagcttgt cgagaagtac tagaggatca
taatcagcca 2580taccacattt gtagaggttt tacttgcttt aaaaaacctc
ccacacctcc ccctgaacct 2640gaaacataaa atgaatgcaa ttgttgttgt
taacttgttt attgcagctt ataatggtta 2700caaataaagc aatagcatca
caaatttcac aaataaagca tttttttcac tgcattctag 2760ttgtggtttg
tccaaactca tcaatgtatc ttatcatgtc tggatctgat cactgcttga
2820gcctagaaga tccggctgct aacaaagccc gaaaggaagc tgagttggct
gctgccaccg 2880ctgagcaata actatcataa cccctagggt atacccatct
aattggaacc agataagtga 2940aatctagttc caaactattt tgtcattttt
aattttcgta ttagcttacg acgctacacc 3000cagttcccat ctattttgtc
actcttccct aaataatcct taaaaactcc atttccaccc 3060ctcccagttc
ccaactattt tgtccgccca caaccggtgg aggaaattct ccttgaagtt
3120tccctggtgt tcaaagtaaa ggagtttgca ccagacgcac ctctgttcac
tggtccggcg 3180tattaaaaca cgatacattg ttattagtac atttattaag
cgctagattc tgtgcgttgt 3240tgatttacag acaattgttg tacgtatttt
aataattcat taaatttata atctttaggg 3300tggtatgtta gagcgaaaat
caaatgattt tcagcgtctt tatatctgaa tttaaatatt 3360aaatcctcaa
tagatttgta aaataggttt cgattagttt caaacaaggg ttgtttttcc
3420gaaccgatgg ctggactatc taatggattt tcgctcaacg ccacaaaact
tgccaaatct 3480tgtagcagca atctagcttt gtcgatattc gtttgtgttt
tgttttgtaa taaaggttcg 3540acgtcgttca aaatattatg cgcttttgta
tttctttcat cactgtcgtt agtgtacaat 3600tgactcgacg taaacacgtt
aaatagagct tggacatatt taacatcggg cgtgttagct 3660ttattaggcc
gattatcgtc gtcgtcccaa ccctcgtcgt tagaagttgc ttccgaagac
3720gattttgcca tagccacacg acgcctatta attgtgtcgg ctaacacgtc
cgcgatcaaa 3780tttgtagttg agctttttgg aattaccggt tgacttgggt
caactgtcag accaagttta 3840ctcatatata ctttagattg atttaaaact
tcatttttaa tttaaaagga tctaggtgaa 3900gatccttttt gataatctca
tgaccacagg cattggcggc cttgctgttc ttctacggca 3960aggtgctgtg
cacgcccagc tgccattttt ggggtgaggt cgttcgcggc cgaggggcgc
4020agcccctggg gggatggggt gccgcgttag cgggccggga gggttcgaga
agggggggca 4080ccccccttcg gcgtgcgcgg tcacgcgcca gggcgcagcc
ctggttaaaa acaaggttta 4140taaatattgg tttaaaagca ggttaaaaga
caggttagcg gtggccgaaa aacgggcgga 4200aacccttgca aatgctggat
tttctgcctg tggacagccc ctcaaatgtc aataggtgcg 4260cccctcatct
gtcatcactc tgcccctcaa gtgtcaagga tcgcgcccct catctgtcag
4320tagtcgcgcc cctcaagtgt caataccgca gggcacttat ccccaggctt
gtccacatca 4380tctgtgggaa actcgcgtaa aatcaggcgt tttcgccgat
ttgcgaggct ggccagctcc 4440acgtcgccgg ccgaaatcga gcctgcccct
catctgtcaa cgccgcgccg ggtgagtcgg 4500cccctcaagt gtcaacgtcc
gcccctcatc tgtcagtgag ggccaagttt tccgcgtggt 4560atccacaacg
ccggcggcca aaagaagagc tttcacaccg catagaccag ccgcgtaacc 4620tggca
4625
SEQ ID NO 17, length 4490, DNA, Artificial, pOmniBac4
gcaacgcggc ctttttacgg ttcctggcct
tttgctggcc ttttgctcac atgttctttc 60ctgcgttatc ccctgattga cttgggtcgc
tcttcctgtg gatgcgcagg tatgtacagg 120aagaggttta tactaaactg
ttacattgca aacgtggttt cgtgtgccaa gtgtgaaaac 180cgatgtttaa
tcaaggctct gacgcatttc
tacaaccacg actctaagtg tgtgggtgaa 240gtcatgcatc ttttaatcaa
atcccaagat gtgtataaac caccaaactg ccaaaaaatg 300aaaactgtcg
acaagctctg tccgtttgct ggcaactgca agggtctcaa tcctatttgt
360aattattgaa taataaaaca attataaatg tcaaatttgt tttttattaa
cgatacaaac 420caaacgcaac aagaacattt gtagtattat ctataattga
aaacgcgtag ttataatcgc 480tgaggtaata tttaaaatca ttttcaaatg
attcacagtt aatttgcgac aatataattt 540tattttcaca taaactagac
gccttgtcgt cttcttcttc gtattccttc tctttttcat 600ttttctcttc
ataaaaatta acatagttat tatcgtatcc atatatgtat ctatcgtata
660gagtaaattt tttgttgtca taaatatata tgtctttttt aatggggtgt
atagtaccgc 720tgcgcatagt ttttctgtaa tttacaacag tgctattttc
tggtagttct tcggagtgtg 780ttgctttaat tattaaattt atataatcaa
tgaatttggg atcgtcggtt ttgtacaata 840tgttgccggc atagtacgca
gcttcttcta gttcaattac accatttttt agcagcaccg 900gattaacata
actttccaaa atgttgtacg aaccgttaaa caaaaacagt tcacctccct
960tttctatact attgtctgcg agcagttgtt tgttgttaaa aataacagcc
attgtaatga 1020gacgcacaaa ctaatatcac aaactggaaa tgtctatcaa
tatatagttg ctgattgcgc 1080agatgccctg cgtaagcggg tgtgggcgga
caataaagtc ttaaactgaa caaaatagat 1140ctaaactatg acaataaagt
cttaaactag acagaatagt tgtaaactga aatcagtcca 1200gttatgctgt
gaaaaagcat actggacttt tgttatggct aaagcaaact cttcattttc
1260tgaagtgcaa attgcccgtc gtattaaaga ggggcgtggc caagggcatg
taaagactat 1320attcgcggcg ttgtgacaat ttaccgaaca actccgcggc
cgggaagccg atctcggctt 1380gaacgaattg ttaggtggcg gtacttgggt
cgatatcaaa gtgcatcact tcttcccgta 1440tgcccaactt tgtatagaga
gccactgcgg gatcgtcacc gtaatctgct tgcacgtaga 1500tcacataagc
accaagcgcg ttggcctcat gcttgaggag attgatgagc gcggtggcaa
1560tgccctgcct ccggtgctcg ccggagactg cgagatcata gatatagatc
tcactacgcg 1620gctgctcaaa cttgggcaga acgtaagccg cgagagcgcc
aacaaccgct tcttggtcga 1680aggcagcaag cgcgatgaat gtcttactac
ggagcaagtt cccgaggtaa tcggagtccg 1740gctgatgttg ggagtaggtg
gctacgtctc cgaactcacg accgaaaaga tcaagagcag 1800cccgcatgga
tttgacttgg tcagggccga gcctacatgt gcgaatgatg cccatacttg
1860agccacctaa ctttgtttta gggcgactgc cctgctgcgt aacatcgttg
ctgctgcgta 1920acatcgttgc tgctccataa catcaaacat cgacccacgg
cgtaacgcgc ttgctgcttg 1980gatgcccgag gcatagactg tacaaaaaaa
cagtcataac aagccatgaa aaccgccact 2040gcgccgttac caccgctgcg
ttcggtcaag gttctggacc agttgcgtga gcgcatacgc 2100tacttgcatt
acagtttacg aaccgaacag gcttatgtca actgggttcg tgccttcatc
2160cgtttccacg gtgtgcgtca cccggcaacc ttgggcagca gcgaagtcgc
cataacttcg 2220tatagcatac attatacgaa gttatctgta actataacgg
tcctaaggta gcgagtttaa 2280acgtacccgt agtggctatg gcagggcttg
ccgccccgac gttggctgcg agccctgggc 2340cttcacccga acttgggggt
tggggtgggg aaaaggaaga aacgcgggcg tattggtccc 2400aatggggtct
cggtggggta tcgacagagt gccagccctg ggaccgaacc ccgcgtttat
2460gaacaaacga cccaacaccc gtgcgtttta ttctgtcttt ttattgccgt
catagcgcgg 2520gttccttccg gtattgtctc cttccgtgtt tcagttagcc
tcccccatct cccggtaccg 2580catgctatgc atcagctgct agcaccatgg
ctcgagatcc cgggtgatca agtcttcgtc 2640gagtgattgt aaataaaatg
taatttacag tatagtattt taattaatat acaaatgatt 2700tgataataat
tcttatttaa ctataatata ttgtgttggg ttgaattaaa ggtccgtata
2760ctagtatcct agggtatacc catctaattg gaaccagata agtgaaatct
agttccaaac 2820tattttgtca tttttaattt tcgtattagc ttacgacgct
acacccagtt cccatctatt 2880ttgtcactct tccctaaata atccttaaaa
actccatttc cacccctccc agttcccaac 2940tattttgtcc gcccacaacc
ggtggaggaa attctccttg aagtttccct ggtgttcaaa 3000gtaaaggagt
ttgcaccaga cgcacctctg ttcactggtc cggcgtatta aaacacgata
3060cattgttatt agtacattta ttaagcgcta gattctgtgc gttgttgatt
tacagacaat 3120tgttgtacgt attttaataa ttcattaaat ttataatctt
tagggtggta tgttagagcg 3180aaaatcaaat gattttcagc gtctttatat
ctgaatttaa atattaaatc ctcaatagat 3240ttgtaaaata ggtttcgatt
agtttcaaac aagggttgtt tttccgaacc gatggctgga 3300ctatctaatg
gattttcgct caacgccaca aaacttgcca aatcttgtag cagcaatcta
3360gctttgtcga tattcgtttg tgttttgttt tgtaataaag gttcgacgtc
gttcaaaata 3420ttatgcgctt ttgtatttct ttcatcactg tcgttagtgt
acaattgact cgacgtaaac 3480acgttaaata gagcttggac atatttaaca
tcgggcgtgt tagctttatt aggccgatta 3540tcgtcgtcgt cccaaccctc
gtcgttagaa gttgcttccg aagacgattt tgccatagcc 3600acacgacgcc
tattaattgt gtcggctaac acgtccgcga tcaaatttgt agttgagctt
3660tttggaatta ccggttgact tgggtcaact gtcagaccaa gtttactcat
atatacttta 3720gattgattta aaacttcatt tttaatttaa aaggatctag
gtgaagatcc tttttgataa 3780tctcatgacc acaggcattg gcggccttgc
tgttcttcta cggcaaggtg ctgtgcacgc 3840ccagctgcca tttttggggt
gaggtcgttc gcggccgagg ggcgcagccc ctggggggat 3900ggggtgccgc
gttagcgggc cgggagggtt cgagaagggg gggcaccccc cttcggcgtg
3960cgcggtcacg cgccagggcg cagccctggt taaaaacaag gtttataaat
attggtttaa 4020aagcaggtta aaagacaggt tagcggtggc cgaaaaacgg
gcggaaaccc ttgcaaatgc 4080tggattttct gcctgtggac agcccctcaa
atgtcaatag gtgcgcccct catctgtcat 4140cactctgccc ctcaagtgtc
aaggatcgcg cccctcatct gtcagtagtc gcgcccctca 4200agtgtcaata
ccgcagggca cttatcccca ggcttgtcca catcatctgt gggaaactcg
4260cgtaaaatca ggcgttttcg ccgatttgcg aggctggcca gctccacgtc
gccggccgaa 4320atcgagcctg cccctcatct gtcaacgccg cgccgggtga
gtcggcccct caagtgtcaa 4380cgtccgcccc tcatctgtca gtgagggcca
agttttccgc gtggtatcca caacgccggc 4440ggccaaaaga agagctttca
caccgcatag accagccgcg taacctggca 4490
SEQ ID NO 18, length 8823, DNA, Artificial, pACKS
ggtaccgcgg ccgcgtagag gatctgttga tcagcagttc aacctgttga tagtacttcg
60ttaatacaga tgtaggtgtt ggcaccatgc ataactataa cggtcctaag gtagcgacct
120aggtatcgat aatacgactc actatagggg aattgtgagc ggataacaat
tcccctctag 180aaataatttt gtttaacttt aagaaggaga tatacatatg
aggcctcgga tcctgtaaaa 240cgacggccag tgaattcccc gggaagcttc
gccagggttt tcccagtcga gctcgatatc 300ggtaccagcg gataacaatt
tcacatccgg atcgcgaacg cgtctcgaga gatccggctg 360ctaacaaagc
ccgaaaggaa gctgagttgg ctgctgccac cgctgagcaa taactagcat
420aaccccttgg ggcctctaaa cgggtcttga ggggtttttt ggtttaaacc
catctaattg 480gactagtagc ccgcctaatg agcgggcttt tttttaattc
ccctatttgt ttatttttct 540aaatacattc aaatatgtat ccgctcatga
gacaataacc ctgataaatg cttcaataat 600attgaaaaag gaagagtatg
agtattcaac atttccgtgt cgcccttatt cccttttttg 660cggcattttg
ccttcctgtt tttgctcacc cagaaacgct cgtgaaagta aaagacgcag
720aggaccaatt gggggcacga gtgggataca tagaactgga cttgaatagc
ggtaaaatcc 780ttgagagttt tcgccctgaa gagcgttttc caatgatgag
cactttcaaa gttctgctat 840gtggagcagt attatcccgt gtagatgcgg
ggcaagagca actcggacga cgaatacact 900attcgcagaa tgacttggtt
gaatactccc cagtgacaga aaagcacctt acggacggaa 960tgacggtaag
agaattatgt agtgccgcca taacgatgag tgataacact gcggcgaact
1020tacttctgac aaccatcggt ggaccgaagg aattaaccgc ttttttgcac
aatatgggag 1080accatgtaac tcgccttgac cgttgggaac cagaactgaa
tgaagccata ccaaacgacg 1140agcgagacac cacaatgcct gcggcaatgg
caacaacatt acgcaaacta ttaactggcg 1200aactacttac tctggcttca
cggcaacaat taatagactg gcttgaagcg gataaagttg 1260caggaccact
actgcgttcg gcacttcctg ctggctggtt tattgctgat aaatctgggg
1320caggagagcg tggttcacgg ggtatcattg ccgcacttgg accagatggt
aagccttccc 1380gtatcgtagt tatctacacg acgggtagtc aggcaactat
ggacgaacga aatagacaga 1440ttgctgaaat aggggcttca ctgattaagc
attggtaaac cgatacaatt aaaggctcct 1500tttggagcct ttttttttgg
acggaccggt agaaaagatc aaaggatctt cttgagatcc 1560tttttttctg
cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt
1620ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct
tcagcagagc 1680gcagatacca aatactgtcc ttctagtgta gccgtagtta
ggccaccact tcaagaactc 1740tgtagcaccg cctacatacc tcgctctgct
aatcctgtta ccagtggctg ctgccagtgg 1800cgataagtcg tgtcttaccg
ggttggactc aagacgatag ttaccggata aggcgcagcg 1860gtcgggctga
acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga
1920actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag
ggagaaaggc 1980ggacaggtat ccggtaagcg gcagggtcgg aacaggagag
cgcacgaggg agcttccagg 2040gggaaacgcc tggtatcttt atagtcctgt
cgggtttcgc cacctctgac ttgagcgtcg 2100atttttgtga tgctcgtcag
gggggcggag cctatggaaa aacgccagca acgcggcctt 2160tttacggttc
ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc
2220tgattctgtg gataaccgta ttaccgcctt tgagtgagct gataccgctc
gccgcagccg 2280aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa
gagcgcctga tgcggtattt 2340tctccttacg catctgtgcg gtatttcaca
ccgcaatggt gcactctcag tacaatctgc 2400tctgatgccg catagttaag
ccagtataca ctccgctatc gctacgtgac tgggtcatgg 2460ctgcgccccg
acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg
2520catccgctta cagacaagct gtgaccgtct ccgggagctg catgtgtcag
aggttttcac 2580cgtcatcacc gaaacgcgcg aggcaggggg aattccagat
aacttcgtat aatgtatgct 2640atacgaagtt atggtaccgc ggccgcgtag
aggatctgtt gatcagcagt tcaacctgtt 2700gatagtacgt actaagctct
catgtttcac gtactaagct ctcatgttta acgtactaag 2760ctctcatgtt
taacgaacta aaccctcatg gctaacgtac taagctctca tggctaacgt
2820actaagctct catgtttcac gtactaagct ctcatgtttg aacaataaaa
ttaatataaa 2880tcagcaactt aaatagcctc taaggtttta agttttataa
gaaaaaaaag aatatataag 2940gcttttaaag cttttaaggt ttaacggttg
tggacaacaa gccagggatg taacgcactg 3000agaagccctt agagcctctc
aaagcaattt tgagtgacac aggaacactt aacggctgac 3060agaattagct
tcacgctgcc gcaagcactc agggcgcaag ggctgctaaa ggaagcggaa
3120cacgtagaaa gccagtccgc agaaacggtg ctgaccccgg atgaatgtca
gctgggaggc 3180agaataaatg atcatatcgt caattattac ctccacgggg
agagcctgag caaactggcc 3240tcaggcattt gagaagcaca cggtcacact
gcttccggta gtcaataaac cggtaaacca 3300gcaatagaca taagcggcta
tttaacgacc ctgccctgaa ccgacgaccg ggtcgaattt 3360gctttcgaat
ttctgccatt catccgctta ttatcactta ttcaggcgta gcaaccaggc
3420gtttaagggc accaataact gccttaaaaa aattacgccc cgccctgcca
ctcatcgcag 3480tactgttgta attcattaag cattctgccg acatggaagc
catcacaaac ggcatgatga 3540acctgaatcg ccagcggcat cagcaccttg
tcgccttgcg tataatattt gcccatggtg 3600aaaacggggg cgaagaagtt
gtccatattg gccacgttta aatcaaaact ggtgaaactc 3660acccagggat
tggctgagac gaaaaacata ttctcaataa accctttagg gaaataggcc
3720aggttttcac cgtaacacgc cacatcttgc gaatatatgt gtagaaactg
ccggaaatcg 3780tcgtggtatt cactccagag cgatgaaaac gtttcagttt
gctcatggaa aacggtgtaa 3840caagggtgaa cactatccca tatcaccagc
tcaccgtctt tcattgccat acggaattcc 3900ggatgagcat tcatcaggcg
ggcaagaatg tgaataaagg ccggataaaa cttgtgctta 3960tttttcttta
cggtctttaa aaaggccgta atatccagct gaacggtctg gttataggta
4020cattgagcaa ctgactgaaa tgcctcaaaa tgttctttac gatgccattg
ggatatatca 4080acggtggtat atccagtgat ttttttctcc attttagctt
ccttagctcc tgaaaatctc 4140gataactcaa aaaatacgcc cggtagtgat
cttatttcat tatggtgaaa gttggaccct 4200cttacgtgcc gatcaacgtc
tcattttcgc caaaagttgg cccagatcaa cgtctcattt 4260tcgccaaaag
ttggcccaga tctatgtcgg gtgcggagaa agaggtaatg aaatggcacc
4320taggtatcga taatacgact cactataggg gaattgtgag cggataacaa
ttcccctcta 4380gaaataattt tgtttaactt taagaaggag atatacatat
gaggcctcgg atcctgtaaa 4440acgacggcca gtgaattccc cgggaagctt
cgccagggtt ttcccagtcg agctcgatat 4500cggtaccagc ggataacaat
ttcacatccg gatcgcgaac gcgtctcgag agatccggct 4560gctaacaaag
cccgaaagga agctgagttg gctgctgcca ccgctgagca ataactagca
4620taaccccttg gggcctctaa acgggtcttg aggggttttt tggtttaaac
ccatgtgcct 4680ggcagataac ttcgtataat gtatgctata cgaagttatg
gtacgtacta agctctcatg 4740tttcacgtac taagctctca tgtttaacgt
actaagctct catgtttaac gaactaaacc 4800ctcatggcta acgtactaag
ctctcatggc taacgtacta agctctcatg tttcacgtac 4860taagctctca
tgtttgaaca ataaaattaa tataaatcag caacttaaat agcctctaag
4920gttttaagtt ttataagaaa aaaaagaata tataaggctt ttaaagcttt
taaggtttaa 4980cggttgtgga caacaagcca gggatgtaac gcactgagaa
gcccttagag cctctcaaag 5040caattttcag tgacacagga acacttaacg
gctgacagaa ttagcttcac gctgccgcaa 5100gcactcaggg cgcaagggct
gctaaaggaa gcggaacacg tagaaagcca gtccgcagaa 5160acggtgctga
ccccggatga atgtcagcta ctgggctatc tggacaaggg aaaacgcaag
5220cgcaaagaga aagcaggtag cttgcagtgg gcttacatgg cgatagctag
actgggcggt 5280tttatggaca gcaagcgaac cggaattgcc agctggggcg
ccctctggta aggttgggaa 5340gccctgcaaa gtaaactgga tggctttctt
gccgccaagg atctgatggc gcaggggatc 5400aagatctgat caagagacag
gatgaggatc gtttcgcatg attgaacaag atggattgca 5460cgcaggttct
ccggccgctt gggtggagag gctattcggc tatgactggg cacaacagac
5520aatcggctgc tctgatgccg ccgtgttccg gctgtcagcg caggggcgcc
cggttctttt 5580tgtcaagacc gacctgtccg gtgccctgaa tgaactgcag
gacgaggcag cgcggctatc 5640gtggctggcc acgacgggcg ttccttgcgc
agctgtgctc gacgttgtca ctgaagcggg 5700aagggactgg ctgctattgg
gcgaagtgcc ggggcaggat ctcctgtcat ctcaccttgc 5760tcctgccgag
aaagtatcca tcatggctga tgcaatgcgg cggctgcata cgcttgatcc
5820ggctacctgc ccattcgacc accaagcgaa acatcgcatc gagcgagcac
gtactcggat 5880ggaagccggt cttgtcgatc aggatgatct ggacgaagag
catcaggggc tcgcgccagc 5940cgaactgttc gccaggctca aggcgcgcat
gcccgacggc gaggatctcg tcgtgacaca 6000tggcgatgcc tgcttgccga
atatcatggt ggaaaatggc cgcttttctg gattcatcga 6060ctgtggccgg
ctgggtgtgg cggaccgcta tcaggacata gcgttggcta cccgtgatat
6120tgctgaagag cttggcggcg aatgggctga ccgcttcctc gtgctttacg
gtatcgccgc 6180tcccgattcg cagcgcatcg ccttctatcg ccttcttgac
gagttcttct gagcgggact 6240ctggggttcg aaatgaccga ccaagcgacg
cccaacctgc catcacgaga tttcgattcc 6300accgccgcct tctatgaaag
gttgggcttc ggaatcgttt tccgggacgc cggctggatg 6360atcctccagc
gcggggatct catgctggag ttcttcgccc accccgggat ctatgtcggg
6420tgcggagaaa gaggtaatga aatggcacct aggtatcgat ggctttacac
tttatgcttc 6480cggctcgtat gttgtgtgga attgtgagcg gataacaatt
tcacacagga aacagctatg 6540accatgatta cgaatttcta gaaataattt
tgtttaactt taagaaggag atatacatat 6600gaggcctcgg atcctgtaaa
acgacggcca gtgaattccc cgggaagctt cgccagggtt 6660ttcccagtcg
agctcgatat cggtaccagc ggataacaat ttcacatccg gatcgcgaac
6720gcgtctcgag actagttccg tttaaaccca tgtgcctggc agataacttc
gtataatgta 6780tgctatacga agttatggta cgtactaagc tctcatgttt
cacgtactaa gctctcatgt 6840ttaacgtact aagctctcat gtttaacgaa
ctaaaccctc atggctaacg tactaagctc 6900tcatggctaa cgtactaagc
tctcatgttt cacgtactaa gctctcatgt ttgaacaata 6960aaattaatat
aaatcagcaa cttaaatagc ctctaaggtt ttaagtttta taagaaaaaa
7020aagaatatat aaggctttta aagcttttaa ggtttaacgg ttgtggacaa
caagccaggg 7080atgtaacgca ctgagaagcc cttagagcct ctcaaagcaa
ttttgagtga cacaggaaca 7140cttaacggct gacataattc agcttcacgc
tgccgcaagc actcagggcg caagggctgc 7200taaaggaagc ggaacacgta
gaaagccagt ccgcagaaac ggtgctgacc ccggatgaat 7260gtcagctggg
aggcagaata aatgatcata tcgtcaatta ttacctccac ggggagagcc
7320tgagcaaact ggcctcaggc atttgagaag cacacggtca cactgcttcc
ggtagtcaat 7380aaaccggtaa gtagcgtatg cgctcacgca actggtccag
aaccttgacc gaacgcagcg 7440gtggtaacgg cgcagtggcg gttttcatgg
cttgttatga ctgttttttt ggggtacagt 7500ctatgcctcg ggcatccaag
cagcaagcgc gttacgccgt gggtcgatgt ttgatgttat 7560ggagcagcaa
cgatgttacg cagcagggca gtcgccctaa aacaaagtta aacatcatga
7620gggaagcggt gatcgccgaa gtatcgactc aactatcaga ggtagttggc
gtcatcgagc 7680gccatctcga accgacgttg ctggccgtac atttgtacgg
ctccgcagtg gatggcggcc 7740tgaagccaca cagtgatatt gatttgctgg
ttacggtgac cgtaaggctt gatgaaacaa 7800cgcggcgagc tttgatcaac
gaccttttgg aaacttcggc ttcccctgga gagagcgaga 7860ttctccgcgc
tgtagaagtc accattgttg tgcacgacga catcattccg tggcgttatc
7920cagctaagcg cgaactgcaa tttggagaat ggcagcgcaa tgacattctt
gcaggtatct 7980tcgagccagc cacgatcgac attgatctgg ctatcttgct
gacaaaagca agagaacata 8040gcgttgcctt ggtaggtcca gcggcggagg
aactctttga tccggttcct gaacaggatc 8100tatttgaggc gctaaatgaa
accttaacgc tatggaactc gccgcccgac tgggctggcg 8160atgagcgaaa
tgtagtgctt acgttgtccc gcatttggta cagcgcagta accggcaaaa
8220tcgcgccgaa ggatgtcgct gccgactggg caatggagcg cctgccggcc
cagtatcagc 8280ccgtcatact tgaagctaga caggcttatc ttggacaaga
agaagatcgc ttggcctcgc 8340gcgcagatca gttggaagaa tttgtccact
acgtgaaagg cgagatcacc aaggtagtcg 8400gcaaataatg tctaacaatt
cgttcaagcc gacggatcta tgtcgggtgc ggagaaagag 8460gtaatgaaat
ggcacctagg tatcgatggc tttacacttt atgcttccgg ctcgtatgtt
8520gtgtggaatt gtgagcggat aacaatttca cacaggaaac agctatgacc
atgattacga 8580atttctagaa ataattttgt ttaactttaa gaaggagata
tacatatgag gcctcggatc 8640ctgtaaaacg acggccagtg aattccccgg
gaagcttcgc cagggttttc ccagtcgagc 8700tcgatatcgg taccagcgga
taacaatttc acatccggat cgcgaacgcg tctcgagact 8760agttccgttt
aaacccatgt gcctggcaga taacttcgta taatgtatgc tatacgaagt 8820tat
8823
SEQ ID NO 19, length 34, DNA, Artificial, LoxP imperfect repeat
ataacttcgt atagcataca ttatacgaag ttat 34
SEQ ID NO 20, length 31, DNA, Artificial, Adaptor T7InsFor
tcccgcgaaa ttaatacgac tcactatagg g 31
SEQ ID NO 21, length 39, DNA, Artificial, Adaptor T7InsRev
cctcaagacc cgtttagagg ccccaagggg ttatgctag 39
SEQ ID NO 22, length 39, DNA, Artificial, Adaptor T7VecFor
ctagcataac cccttggggc ctctaaacgg gtcttgagg 39
SEQ ID NO 23, length 31, DNA, Artificial, Adaptor T7VecRev
ccctatagtg agtcgtatta atttcgcggg a 31
SEQ ID NO 24, length 30, DNA, Artificial, Adaptor NdeInsFor
gtttaacttt aagaaggaga tatacatatg 30
SEQ ID NO 25, length 25, DNA, Artificial, Adaptor XhoInsRev
gggtttaaac ggaactagtc tcgag 25
SEQ ID NO 26, length 25, DNA, Artificial, Adaptor XhoVecFor
ctcgagacta gttccgttta aaccc 25
SEQ ID NO 27, length 30, DNA, Artificial, Adaptor NdeVecRev
catatgtata tctccttctt aaagttaaac 30
SEQ ID NO 28, length 30, DNA, Artificial, Adaptor SmaBam
gaattcactg gccgtcgttt tacaggatcc 30
SEQ ID NO 29, length 30, DNA, Artificial, Adaptor BamSma
ggatcctgta aaacgacggc cagtgaattc 30
SEQ ID NO 30, length 29, DNA, Artificial, Adaptor SacHind
gctcgactgg gaaaaccctg gcgaagctt 29
SEQ ID NO 31, length 29, DNA, Artificial, Adaptor HindSac
aagcttcgcc agggttttcc cagtcgagc 29
SEQ ID NO 32, length 33, DNA, Artificial, Adaptor BspEco5
gatccggatg tgaaattgtt atccgctggt acc 33
SEQ ID NO 33, length 33, DNA, Artificial, Adaptor Eco5Bsp
ggtaccagcg gataacaatt tcacatccgg atc 33
SEQ ID NO 34, length 23, DNA, Artificial, Adaptor PolhInsFor
cccaccatcg ggcgcggatc ccg 23
SEQ ID NO 35, length 23, DNA, Artificial, Adaptor PolhInsRev
cgagactgca ggctctagat tcg 23
SEQ ID NO 36, length 23, DNA, Artificial, Adaptor PolhVecFor
cgggatccgc gcccgatggt ggg 23
SEQ ID NO 37, length 23, DNA, Artificial, Adaptor PolhVecRev
cgaatctaga gcctgcagtc tcg 23
SEQ ID NO 38, length 28, DNA, Artificial, Adaptor P10InsFor
ctcccggtac cgcatgctat gcatcagc 28
SEQ ID NO 39, length 26, DNA, Artificial, Adaptor P10InsRev
aatcactcga cgaagacttg atcacc 26
SEQ ID NO 40, length 28, DNA, Artificial, Adaptor P10VecFor
gctgatgcat agcatgcggt accgggag 28
SEQ ID NO 41, length 26, DNA, Artificial, Adaptor P10VecRev
ggtgatcaag tcttcgtcga gtgatt 26
SEQ ID NO 42, length 12, DNA, Artificial, BstXI recognition sequence general
ccannnnnnt gg 12
SEQ ID NO 43, length 12, DNA, Artificial, BstXI recognition sequence contained in Donor vectors
ccatgtgcct gg 12
SEQ ID NO 44, length 12, DNA, Artificial, BstXI recognition sequence contained in Acceptor vectors
ccatctaatt gg 12
SEQ ID NO 45, length 58, DNA, Artificial, Primer SmaBamVHL
gaattcactg gccgtcgttt tacaggatcc ttaatctccc atccgttgat gtgcaatg 58
SEQ ID NO 46, length 58, DNA, Artificial, Primer BamSmaEB
ggatcctgta aaacgacggc cagtgaattc gctagctcta gaaataattt tgtttaac 58
SEQ ID NO 47, length 66, DNA, Artificial, Primer SacHindEB
gagctcgact gggaaaaccc tggcgaagct tagatctgga tccttactgc acggcttgtt 60
cattgg 66
SEQ ID NO 48, length 54, DNA, Artificial, Primer HindSacEC
aagcttcgcc agggttttcc cagtcgagct ccaattggaa ttcgctagct ctag 54
SEQ ID NO 49, length 67, DNA, Artificial, Primer BspEco5EC
gatccggatg tgaaattgtt atccgctggt accaagctta gatctggatc cttaacaatc 60
taagaag 67
SEQ ID NO 50, length 593, DNA, Artificial, Multiple integration element
tatagcatac attatacgaa gttatctgta actataacgg tcctaaggta
gcgagtttaa 60acactagtat cgattcgcga cctactccgg aatattaata gatcatggag
ataattaaaa 120tgataaccat ctcgcaaata aataagtatt ttactgtttt
cgtaacagtt ttgtaataaa 180aaaacctata aatattccgg attattcata
ccgtcccacc atcgggcgcg gatcccggtc 240cgaagcgcgc ggaattcaaa
ggcctacgtc gacgagctca cttgtcgcgg ccgctttcga 300atctagagcc
tgcagtctcg acaagcttgt cgagaagtac tagaggatca taatcagcca
360taccacattt gtagaggttt tacttgcttt aaaaaacctc ccacacctcc
ccctgaacct 420gaaacataaa atgaatgcaa ttgttgttgt taacttgttt
attgcagctt ataatggtta 480caaataaagc aatagcatca caaatttcac
aaataaagca tttttttcac tgcattctag 540ttgtggtttg tccaaactca
tcaatgtatc ttatcatgtc tggatctgat cac 593
SEQ ID NO 51, length 498, DNA, Artificial, Multiple integration element
caagccgacg gatctatgtc gggtgcggag aaagaggtaa
tgaaatggca cctaggtatc 60gatactagta tacggacctt taattcaacc caacacaata
tattatagtt aaataagaat 120tattatcaaa tcatttgtat attaattaaa
atactatact gtaaattaca ttttatttac 180aatcactcga cgaagacttg
atcacccggg atctcgagcc atggtgctag cagctgatgc 240atagcatgcg
gtaccgggag atgggggagg ctaactgaaa cacggaagga gacaataccg
300gaaggaaccc gcgctatgac ggcaataaaa agacagaata aaacgcacgg
gtgttgggtc 360gtttgttcat aaacgcgggg ttcggtccca gggctggcac
tctgtcgata ccccaccgag 420accccattgg gaccaatacg cccgcgtttc
ttccttttcc ccaccccaac ccccaagttc 480gggtgaaggc ccagggct
498
SEQ ID NO 52, length 593, DNA, Artificial, Multiple integration element
tatagcatac
attatacgaa gttatctgta actataacgg tcctaaggta gcgagtttaa 60acactagtat
cgattcgcga cctactccgg aatattaata gatcatggag ataattaaaa
120tgataaccat ctcgcaaata aataagtatt ttactgtttt cgtaacagtt
ttgtaataaa 180aaaacctata aatattccgg attattcata ccgtcccacc
atcgggcgcg gatcccggtc 240cgaagcgcgc ggaattcaaa ggcctacgtc
gacgagctca cttgtcgcgg ccgctttcga 300atctagagcc tgcagtctcg
acaagcttgt cgagaagtac tagaggatca taatcagcca 360taccacattt
gtagaggttt tacttgcttt aaaaaacctc ccacacctcc ccctgaacct
420gaaacataaa atgaatgcaa ttgttgttgt taacttgttt attgcagctt
ataatggtta 480caaataaagc aatagcatca caaatttcac aaataaagca
tttttttcac tgcattctag 540ttgtggtttg tccaaactca tcaatgtatc
ttatcatgtc tggatctgat cac 593
SEQ ID NO 53, length 808, DNA, Artificial, Multiple integration element
ttagtacgta ctatcaacag gttgaactgc tgatcaacag atcctctacg
cggccgcggt 60accataactt cgtatagcat acattatacg aagttatctg ccaggcacat
gggttttact 120agtatcgatt cgcgacctac tccggaatat taatagatca
tggagataat taaaatgata 180accatctcgc aaataaataa gtattttact
gttttcgtaa cagttttgta ataaaaaaac 240ctataaatat tccggattat
tcataccgtc ccaccatcgg gcgcggatcc cggtccgaag 300cgcgcggaat
tcaaaggcct acgtcgacga gctcactagt cgcggccgct ttcgaatcta
360gagcctgcag tctcgacaag cttgtcgaga agtactagag gatcataatc
agccatacca 420catttgtaga ggttttactt gctttaaaaa acctcccaca
cctccccctg aacctgaaac 480ataaaatgaa tgcaattgtt gttgttaact
tgtttattgc agcttataat ggttacaaat 540aaagcaatag catcacaaat
ttcacaaata aagcattttt ttcactgcat tctagttgtg 600gtttgtccaa
actcatcaat gtatcttatc atgtctggat ctgatcactg cttgagccta
660gaagatccgg ctgctaacaa agcccgaaag gaagctgagt tggctgctgc
caccgctgag 720caataactat cataacccct aggtgccatt tcattacctc
tttctccgca cccgacataa 780aaatgagacg ttgatctggg ccaacttt
808
SEQ ID NO 54, length 211, DNA, Artificial, Antisense sequence to multiple integration element of SEQ ID NO 1
ccggatctct cgagacgcgg ttcgcgatcc
ggatgtgaaa ttgttatccg ctggtaccga 60tatcgagctc gactgggaaa accctggcga
agcttcccgg ggaattcact ggccgtcgtt 120ttacaggatc cgaggcctca
tatgtatatc tccttcttaa agttaaacaa aattatttct 180agaggggaat
tgttatccgc tcacaattcc c 211
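Several of the adaptor sequences listed above appear to come in reverse-complementary pairs (for example T7InsFor/T7VecRev, T7InsRev/T7VecFor and NdeInsFor/NdeVecRev), and the Donor and Acceptor BstXI sites of SEQ ID NOs 43 and 44 are instances of the general pattern of SEQ ID NO 42. The following Python sketch is one way to check this. It is illustrative only; the helper names `reverse_complement` and `is_bstxi_site` and the dictionary layout are assumptions and are not part of the original listing.

```python
import re

# Translation table for the lower-case DNA alphabet used in the listing.
_COMPLEMENT = str.maketrans("acgtn", "tgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (spaces are ignored)."""
    cleaned = seq.replace(" ", "").lower()
    return cleaned.translate(_COMPLEMENT)[::-1]


# General BstXI pattern from SEQ ID NO 42: cca, six arbitrary bases, tgg.
_BSTXI = re.compile(r"cca[acgt]{6}tgg")


def is_bstxi_site(seq: str) -> bool:
    """True if the whole string is a concrete instance of ccannnnnntgg."""
    return _BSTXI.fullmatch(seq.replace(" ", "").lower()) is not None


# Adaptor pairs copied from SEQ ID NOs 20/23, 21/22 and 24/27.
ADAPTOR_PAIRS = {
    ("T7InsFor", "T7VecRev"): (
        "tcccgcgaaa ttaatacgac tcactatagg g",
        "ccctatagtg agtcgtatta atttcgcggg a",
    ),
    ("T7InsRev", "T7VecFor"): (
        "cctcaagacc cgtttagagg ccccaagggg ttatgctag",
        "ctagcataac cccttggggc ctctaaacgg gtcttgagg",
    ),
    ("NdeInsFor", "NdeVecRev"): (
        "gtttaacttt aagaaggaga tatacatatg",
        "catatgtata tctccttctt aaagttaaac",
    ),
}

# BstXI sites copied from SEQ ID NOs 43 (Donor) and 44 (Acceptor).
BSTXI_SITES = {"Donor": "ccatgtgcct gg", "Acceptor": "ccatctaatt gg"}

if __name__ == "__main__":
    for (name_a, name_b), (seq_a, seq_b) in ADAPTOR_PAIRS.items():
        paired = reverse_complement(seq_a) == seq_b.replace(" ", "")
        print(f"{name_a} / {name_b}: reverse complements = {paired}")
    for vector, site in BSTXI_SITES.items():
        print(f"{vector} site matches BstXI pattern = {is_bstxi_site(site)}")
```

The same two helpers could be applied to the remaining adaptor pairs in the listing by adding further entries to `ADAPTOR_PAIRS`.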
* * * * *
References