Nucleic acids for cloning and expressing multiprotein complexes Berger; Imre [Europaisches Laboratorium fur Molekularbiologie (EMBL)]

Patent Application Summary

U.S. patent application number 14/262633 was filed with the patent office on 2014-04-25 and published on 2014-12-04 for nucleic acids for cloning and expressing multiprotein complexes. The applicant listed for this patent is Europaisches Laboratorium fur Molekularbiologie (EMBL). Invention is credited to Imre Berger.

Application Number	20140356960 14/262633
Document ID	/
Family ID	42244215
Filed Date	2014-04-25

United States Patent Application	20140356960
Kind Code	A1
Berger; Imre	December 4, 2014

Nucleic acids for cloning and expressing multiprotein complexes

Abstract

The present invention relates to a nucleic acid containing at least one homing endonuclease site (HE) and at least one restriction enzyme site (X) wherein the HE and X sites are selected such that HE and X result in compatible cohesive ends when cut by the homing endonuclease and restriction enzyme, respectively, and the ligation product of HE and X cohesive ends can neither be cleaved by the homing endonuclease nor by the restriction enzyme. Further subject-matter of the present invention relates to a vector comprising the nucleic acid of the present invention, host cells containing the nucleic acid and/or the vector, a kit for cloning and/or expression of multiprotein complexes making use of the vector and the host cells, a method for producing a vector containing multiple expression cassettes, and a method for producing multiprotein complexes. The invention also relates to a method of assembling multiple single vectors ("vector entities") into fusion vectors and to a method of disassembling a fusion vector containing multiple of such vector entities into single vectors. The invention is also directed to fusion vectors containing multiple vector entities.

Inventors:

Berger; Imre; (St. Egreve, FR)

Applicant:

Name	City	State	Country	Type
Europaisches Laboratorium fur Molekularbiologie (EMBL)	Heidelberg		DE

Family ID:

42244215

Appl. No.:

14/262633

Filed:

April 25, 2014

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
13254831	Sep 2, 2011	8709798
PCT/EP2010/052892	Mar 8, 2010
14262633

Current U.S. Class:	435/462 ; 435/199; 435/254.21; 435/254.22; 435/254.23; 435/258.3; 435/320.1; 435/325; 435/348; 435/366; 435/367; 435/368; 435/369; 435/370; 435/465; 435/471
Current CPC Class:	C12N 2800/30 20130101; C12N 15/902 20130101; C12N 15/65 20130101; C12N 15/64 20130101; C12N 15/10 20130101; C12N 9/22 20130101
Class at Publication:	435/462 ; 435/320.1; 435/325; 435/366; 435/369; 435/367; 435/368; 435/370; 435/254.21; 435/254.22; 435/254.23; 435/348; 435/258.3; 435/465; 435/471; 435/199
International Class:	C12N 15/90 20060101 C12N015/90; C12N 9/22 20060101 C12N009/22

Foreign Application Data

Date	Code	Application Number
Mar 6, 2009	EP	09154567.3

Claims

1. A nucleic acid comprising a multiple integration element (MIE) having the following sequence elements: ##STR00001## wherein HE is a homing endonuclease site selected from the group consisting of an I-CeuI site and a PI-SceI site; Prom represents a promoter; rbs represents a ribosome binding site; term represents a terminator; and wherein the HE and BstXI sites are selected such that HE and BstXI result in compatible cohesive ends when cut by the homing endonuclease and the BstXI restriction enzyme, respectively, and the ligation product of HE and BstXI cohesive ends can neither be cleaved by the homing endonuclease nor the restriction enzyme.

2. The nucleic acid of claim 1 further comprising the nucleotide sequence of SEQ ID NO: 1.

3. The nucleic acid of claim 1 further comprising at least one site for integration of the nucleic acid of claim 1 into a vector or host cell.

4. A vector comprising the nucleic acid of claim 1.

5. The vector of claim 4 further comprising at least one recognition sequence for a site-specific recombinase, preferably a LoxP imperfect inverted repeat or a Tn7 attachment site.

6. The vector of claim 4 comprising a sequence selected from the group consisting of SEQ ID NO: 2, SEQ ID NO: 3, SEQ ID NO: 4, SEQ ID NO: 5, SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16 and SEQ ID NO: 17.

7. The vector of claim 6 comprising more than one of the sequence elements of the nucleic acid as defined in claim 1 and containing more than one recognition sequence for a site-specific recombinase.

8. The vector of claim 7 comprising the sequence of SEQ ID NO: 18.

9. The vector of claim 4 wherein the vector is a virus.

10. The vector of claim 9 wherein the virus is a baculovirus.

11. A host cell comprising the nucleic acid of claim 1.

12. A host cell comprising the vector of claim 4.

13. A kit for cloning and/or expression of multiprotein complexes containing at least one vector of claim 4 together with at least one host cell suitable for the propagation of said vector(s).

14. A method for assembling n vector entities each containing a multiple integration element as defined in claim 1 into 1 to (n-1) fusion vectors wherein said fusion vector(s) contain(s) 2 to n of said vector entities comprising the steps of: (1) contacting said n vector entities each containing a site-specific recombination site and an individual resistance marker different from the resistance markers of the other vector entities with a recombinase specific for said site-specific recombination site so as to generate a mixture of fusions of the vector entities comprising 2 to n of said vector entities, (2) transforming said mixture into host cells; (3) culturing one or more sample(s) of the transformed cells in the presence of an appropriate combination of antibiotics for selecting one or more desired fusion vector(s) containing 2 to n vector entities; (4) obtaining n single clones of transformed cells from the culture obtained in step (3) in which these were viable in the presence of the respective combination of antibiotics; and (5) culturing n samples of each of said n single clones in the presence of each of n antibiotics specific for the n individual resistance markers present in said n vector entities; wherein n is an integer of at least 3.

15. The method of claim 14 wherein (n-1) of the vector entities to be fused each contains a further selectable marker different from the resistance markers such that only host cells transformed with fusions between the vector entity not containing the further selectable marker and one or more of the vector entities containing the selectable marker are viable in step (3).

16. The method of claim 15 wherein (n-1) of the vector entities contain a conditional origin of replication making the propagation of said vector entities dependent on the presence or absence of a specific gene in the host cells.

17. The method of claim 16 wherein the host cells are bacteria, preferably E. coli, the origin of replication is R6Kγ or a derivative thereof and the bacteria are pir⁻.

18. The method of claim 14 wherein each of the n vector entities contains one or more genes of interest, preferably within an expression cassette.

19. A method of disassembling a fusion vector containing n vector entities each containing a multiple integration element (MIE) as defined in claim 1 into one or more desired fusion vectors selected from the group consisting of fusion vectors containing 2 to (n-1) vector entities or into one or more of said single vector entities each containing a multiple integration element (MIE) as defined in claim 1, wherein in said fusion vector containing n vector entities said n vector entities are separated from each other by n site-specific recombination sites, and each vector entity contains an individual resistance marker different from the resistance markers of the other vector entities, comprising the steps of: (A) contacting the fusion vector containing n vector entities each containing a multiple integration element (MIE) as defined in claim 1 with a recombinase specific for said site-specific recombination sites in order to generate a mixture of fusions of the vector entities comprising 2 to (n-1) of said vector entities and single vector entities; (B) transforming said mixture into host cells; and (C) culturing one or more sample(s) of the transformed cells in the presence of: (C1) an appropriate combination of antibiotics for selecting one or more desired fusion vector(s) containing 2 to (n-1) vector entities; and/or (C2) a single appropriate antibiotic for selecting a desired single vector entity; (D) obtaining n single clones of transformed cells from the sample of the transformed cells in which the single clones of transformed cells were viable in the presence of the respective antibiotic or combination of antibiotics, respectively, and (E) culturing n samples of each of said n single clones of transformed cells in the presence of each of n antibiotics specific for the n individual resistance markers present in said n vector entities; wherein n is an integer of at least 3.

20. The method of claim 19 wherein for disassembling the fusion vector containing n vector entities into single vector entities, steps (A), (B), and (C1) are carried out for selecting an appropriate fusion vector containing 2 to (n-1) vector entities, and steps (A), (B), and (C2) to (E) are carried out with said selected fusion vector containing 2 to (n-1) vector entities.

21. The method of claim 20 wherein (n-1) of the vector entities in said fusion vector containing n vector entities each contains a further selectable marker different from the resistance markers such that only host cells transformed with fusions between a vector entity not containing the further selectable marker and one or more of the vector entities containing the selectable marker are viable in step (C1).

22. The method of claim 21 wherein (n-1) of the vector entities comprise a conditional origin of replication making the propagation of said vector entities dependent on the presence or absence of a specific gene in the host cells.

23. The method of claim 22 wherein the host cells are bacteria, preferably E. coli, the origin of replication is R6Kγ or a derivative thereof, and the bacteria are pir⁻.

24. The method of claim 19 wherein each of the n vector entities comprises one or more genes of interest.

25. The method of claim 24 wherein the one or more genes of interest are within an expression cassette.

26. A fusion vector comprising n vector entities as defined in claim 4, separated from each other by n of the same site-specific recombination site, wherein each vector entity contains an individual resistance marker gene different from the resistance marker genes of the other vector entities, wherein n is an integer of at least 3.

27. A kit for assembly and/or disassembly of n vectors comprising a fusion vector comprising n vector entities each containing a multiple integration element (MIE) as defined in claim 1, the n vector entities being separated from each other by n of the same site-specific recombination site, wherein each vector entity contains an individual resistance marker gene different from the resistance marker genes of the other vector entities; and/or n vector entities each containing a site-specific recombination site and an individual resistance marker gene different from the resistance marker genes of the other vector entities, and a recombinase specific for said site-specific recombination site and/or cells for the propagation of said fusion vector and/or said n vector entities; wherein n is an integer of at least 3.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This patent application is a continuation application of U.S. patent application Ser. No. 13/254,831, filed on Sep. 2, 2011 (currently pending), which was the National Stage of International Application No. PCT/EP2010/052892, filed Mar. 8, 2010, entitled "Nucleic acids for cloning and expressing multiprotein complexes," which claims the benefit of European Patent Application No. EP 09154567.3, filed Mar. 6, 2009, which applications are incorporated in their entirety here by this reference.

[0002] The present invention relates to a nucleic acid containing at least one homing endonuclease site (HE) and at least one restriction enzyme site (X) wherein the HE and X sites are selected such that HE and X result in compatible cohesive ends when cut by the homing endonuclease and restriction enzyme, respectively, and the ligation product of HE and X cohesive ends can neither be cleaved by the homing endonuclease nor by the restriction enzyme. Further subject-matter of the present invention relates to a vector comprising the nucleic acid of the present invention, host cells containing the nucleic acid and/or the vector, a kit for cloning and/or expression of multiprotein complexes making use of the vector and the host cells, a method for producing a vector containing multiple expression cassettes, and a method for producing multiprotein complexes. The invention also relates to a method for assembling multiple single vectors ("vector entities") into fusion vectors and to a method for disassembling a fusion vector containing multiple of such vector entities into lower order fusion vectors and/or into single vectors. The invention is also directed to fusion vectors containing multiple vector entities.

[0003] Many vital processes in cells are controlled by proteins associating into interlocking molecular machines, in higher eukaryotes often containing 10 and more subunits (Rual, J. F. et al. Nature 437, 1173-1178 (2005); Charbonnier S., Gallego, O. and Gavin, A. C. Biotechnol. Annu. Rev. 14, 1-28 (2008)). This has profound consequences for functional and structural studies that now aim to decipher physiologically relevant molecular mechanisms. Consequently, work on complexes is increasingly becoming imperative in contemporary biology. The low abundance and frequently heterogeneous nature of many multisubunit complexes, however, often preclude extraction from source.

[0004] Recombinant production methods certainly have had a decisive impact on life science research. In particular E. coli, as an expression host, is commonplace. Successful functional analysis of proteins and elucidation of their molecular architecture often crucially depends on introducing alterations, such as truncations, mutations and extension with purification tags, or with particular promoter/terminator elements. The ensuing requirements in terms of experimental throughput are already considerable for diversifying single open reading frames (ORFs). In particular structural genomics consortia demand the standardization of subcloning routines and implementation of automation for this. The exponential increase in workload when many ORFs have to be rapidly diversified and assembled in the context of a multisubunit complex is daunting, and an unresolved challenge to date.

[0005] A number of systems have been introduced in recent years for expression of several genes in eukaryotic and prokaryotic hosts; see, e.g. Fitzgerald et al. (2006) Nat. Methods 3, 1021-1032; Tan et al. (2005) Protein Expr. Purif. 40, 385-395; Tolia, N. H. and Joshua-Tor (2006) Nat. Methods 3, 55-64; Chanda et al. (2006) Protein Expr. Purif. 47, 217-224; Scheich et al. (2007) Nucleic Acids Res. 35, e43. In spite of considerable improvements of eukaryotic expression systems, in particular the baculovirus/insect cell expression (Fitzgerald et al. (2006), supra), E. coli still remains to date the dominant work-horse in most laboratories, for many good reasons such as low cost and availability of a multitude of specialized expression strains. The current co-expression systems for E. coli rely essentially on serial, mostly conventional (i.e. restriction/ligation) subcloning of encoding genes either as single expression cassettes (Tolia et al. (2006), supra; Chanda et al. (2006), supra) or as polycistrons constituting several genes under the control of the same promoter (Tan et al. (2005), supra). This considerably limits the applicability of these co-expression techniques for production of protein complexes with many subunits, in particular at the throughput typically required for structural molecular biology.

[0006] A major impediment of such largely serial (one gene at a time) constructions stems from the inherent inflexibility with regards to rapidly revising an expression experiment once the multiprotein complex has been produced, purified and characterized. However, such revisions, including variations of the protein subunits, are a sine qua non in contemporary functional and structural research.

[0007] Fitzgerald et al. (2006), supra, and WO-A-2005/085456 describe polynucleotides having a so-called multiplication module wherein two expression cassettes in head-to-head, head-to-tail or tail-to-tail orientation are flanked by specifically designed pairs of restriction enzyme sites allowing iterative cloning of multiple genes into the expression cassettes.

[0008] In view of the drawbacks of prior art constructs, the technical problem underlying the present invention is therefore to provide versatile systems for cloning and expression of multiprotein complexes.

[0009] The solution to the above technical problem is achieved by the provision of the embodiments of the present invention as defined in the claims.

[0010] In particular, the present invention relates to a nucleic acid (or polynucleotide) containing at least one homing endonuclease site (HE) and at least one restriction enzyme site (X) wherein the HE and the X sites are selected such that HE and X result in compatible cohesive ends when cut by the homing endonuclease and restriction enzyme, respectively, and the ligation product of HE and X cohesive ends can neither be cleaved by the homing endonuclease nor the restriction enzyme.

[0011] According to the present invention, the terms "nucleic acid" and "polynucleotide" are used interchangeably and refer to DNA, RNA or species containing one or more nucleotide analogues. Preferred nucleic acids or polynucleotides according to the present invention are DNA, most preferred double-stranded (ds) DNA.

[0012] Preferably, the nucleic acid of the present invention has the following sequence elements:

[0013] HE-Prom-MCS-Term-X or HE-Prom-MCS-X

[0014] wherein

[0015] Prom: represents a promoter;

[0016] MCS: represents a multiple cloning site; and

[0017] Term: represents a terminator.

[0018] The above arrangement is hereinafter often referred to as "multiple integration element" (MIE).

[0019] Promoters useful in the present invention include, but are not limited to, promoters of prokaryotic, viral, mammalian, or insect cell origin or a combination thereof. Likewise, terminators useful in a nucleic acid according to the invention include, but are not limited to, terminators of prokaryotic, viral, mammalian, insect cell origin or a combination thereof. The term "multiple cloning site" according to the present invention means a sequence having at least one restriction enzyme site different from the site X as defined above. The MCS according to the present invention may, e.g. be derived from the multiple cloning sites of any commercially available plasmid.

[0020] Preferred prokaryotic promoters are Lac, T7, arabinose and trc promoters. Further promoters useful in the context of the present invention are viral promoters, in particular baculoviral promoters such as polh, p10 and p.sub.XIV very late baculoviral promoters, vp39 baculoviral late promoter, vp39 polh baculoviral late/very late hybrid promoter, P.sub.cap/polh, pcna, etl, p35, egt, da26 baculoviral early promoters. Further promoters useful in the context of the present invention are the promoter sequences CMV, SV40, UbC, EF-1α, RSVLTR, MT, P.sub.DS47, Ac5, P.sub.GAL and P.sub.ADH.

[0021] Examples of terminator sequences useful in the context of the present invention are T7, SV40, HSVtk or BGH.

[0022] The multiple cloning site according to the present invention may contain, in addition to the at least one restriction enzyme site (other than X), one or more, especially 1 to 4 homology regions. The restriction enzyme sites contained in the MCS can easily be chosen by the skilled person and examples of such sites together with their recognition sequences can be taken from the latest product catalogue of New England Biolabs, Ipswich, Mass., USA.

[0023] A "homing endonuclease" according to the present invention is a DNase specific for double-stranded DNA having a large, isometric recognition site of e.g. 12-40 base pairs or even more, preferably 20 to 30 base pairs. For a recent review with regard to homing endonucleases, see Stoddard B. L. (2005) Q. Rev. Biophys. 38, 49-95. Due to the length of HE recognition sequences it is highly unlikely that a corresponding site occurs in the nucleotide sequence of a gene or polygene (or any other nucleotide sequence of any origin) to be inserted into the constructs according to the present invention making this strategy particularly useful for cloning larger and/or many genes of interest ("GOI").
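The rarity argument above can be put into rough numbers. The following is a minimal sketch, assuming a uniformly random sequence with equal base frequencies and counting exact matches on one strand only; the function name `expected_sites` is hypothetical and not part of the application.

```python
def expected_sites(site_length: int, sequence_length: int) -> float:
    """Expected number of exact matches of one fixed site on one strand.

    ASSUMPTION: bases are independent and equally likely (A, C, G, T),
    which real genes only approximate.
    """
    # Number of positions at which a site of this length could start.
    windows = max(sequence_length - site_length + 1, 0)
    # Each window matches a fixed site with probability (1/4) ** site_length.
    return windows / 4 ** site_length


# Compare a typical 6-bp restriction site with a 20-bp homing endonuclease site
# in a 10 kb insert (sizes chosen for illustration only).
six_cutter = expected_sites(6, 10_000)
homing_site = expected_sites(20, 10_000)
print(f"6-bp site: {six_cutter:.3g} expected hits; 20-bp site: {homing_site:.3g}")
```

Under these assumptions a 6-bp site is expected to occur a few times in such an insert, whereas a 20-bp site is expected essentially never, which is the property the text relies on.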

[0024] A preferred HE site according to the present invention is a recognition sequence of a homing endonuclease that results in a 4 nucleotide overhang when cut by the respective homing endonuclease.

[0025] Examples of such HE sites include, but are not limited to: recognition sequences of PI-SceI, I-CeuI, I-PpoI, I-HmuI, I-CreI, I-DmoI, PI-PfuI and I-MsoI, PI-PspI, I-SceI, other LAGLIDADG group members and variants thereof, SegH and Hef or other GIY-YIG homing endonucleases, I-ApelI, I-AniI, Cytochrome b mRNA maturase bI3, PI-TliI and PI-TfulI, PI-ThyI and others; see also Stoddard (2005), supra.

[0026] A preferred restriction enzyme site X according to the present invention compatible with HE sites producing a 4 bp overhang (examples are given above) is a BstXI site.
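The design rule (compatible overhangs, non-cleavable ligation product) can be illustrated with a small string model. This is a sketch under stated assumptions, not the sequences of SEQ ID NO: 1: BstXI is taken to recognise CCANNNNN^NTGG and leave top-strand bases 5 to 8 as a 3' overhang, and the I-CeuI site and its CTAA overhang are quoted as illustrative values that should be checked against the enzyme supplier's documentation. All function names are hypothetical, and only the top strand is searched.

```python
import re

# BstXI recognises CCANNNNN^NTGG: the six central bases can be chosen freely.
BSTXI = re.compile(r"CCA[ACGT]{6}TGG")

# ASSUMPTION: illustrative I-CeuI site, split as left arm / overhang / right arm.
HE_LEFT, HE_OVERHANG, HE_RIGHT = "TAACTATAACGGTC", "CTAA", "GGTAGCGA"
HE_SITE = HE_LEFT + HE_OVERHANG + HE_RIGHT


def design_bstxi_site(overhang: str, n4: str = "G", n9: str = "C") -> str:
    """Build a 12-bp BstXI site whose 4-nt overhang equals `overhang`."""
    site = "CCA" + n4 + overhang + n9 + "TGG"
    if not BSTXI.fullmatch(site):
        raise ValueError("overhang and flanking bases must be A/C/G/T (4 + 1 + 1)")
    return site


def bstxi_overhang(site: str) -> str:
    """Top-strand bases 5-8 remain single-stranded after BstXI cleavage."""
    return site[4:8]


def hybrid_junction(bstxi_site: str) -> str:
    """Top strand formed when the BstXI-cut end is ligated to the HE right arm."""
    return bstxi_site[:8] + HE_RIGHT


def is_cleavable(seq: str) -> bool:
    """True if the top strand still holds an intact HE site or BstXI site."""
    return HE_SITE in seq or BSTXI.search(seq) is not None


site = design_bstxi_site(HE_OVERHANG)
print(site, bstxi_overhang(site), is_cleavable(hybrid_junction(site)))
```

The point of the model is that the free central bases of the BstXI site let its overhang be matched to the homing endonuclease overhang, while the resulting junction carries only half of each recognition sequence.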

[0027] Corresponding enzymes are commercially available, e.g. from New England Biolabs Inc., Ipswich, Mass., USA.

[0028] Especially preferred MIEs of the invention containing prokaryotic promoters/terminators have one of the following structures:

[0029] I-CeuI-T7 Prom-MCS-T7 Term-BstXI

[0030] PI-SceI-T7 Prom-MCS-T7 Term-BstXI

[0031] Especially preferred MIEs of the invention containing baculoviral promoters have one of the following structures:

[0032] I-CeuI-p10-MCS-BstXI

[0033] PI-SceI-p10-MCS-BstXI

[0034] I-CeuI-polh-MCS-BstXI

[0035] PI-SceI-polh-MCS-BstXI
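The six preferred layouts listed above follow one pattern (HE, promoter, MCS, optional terminator, BstXI). As a minimal sketch, that pattern can be held in a small record type; the class name `MIE` and its field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MIE:
    """Symbolic description of a multiple integration element."""
    he: str                            # homing endonuclease site, e.g. "I-CeuI"
    promoter: str                      # e.g. "T7 Prom", "p10", "polh"
    terminator: Optional[str] = None   # omitted in the baculoviral layouts
    x: str = "BstXI"                   # restriction site compatible with HE

    def layout(self) -> str:
        """Render the element order in the notation used in the text."""
        parts = [self.he, self.promoter, "MCS"]
        if self.terminator:
            parts.append(self.terminator)
        parts.append(self.x)
        return "-".join(parts)


# Enumerate the layouts named in the text.
layouts = [MIE(he, "T7 Prom", "T7 Term").layout() for he in ("I-CeuI", "PI-SceI")]
layouts += [MIE(he, prom).layout()
            for prom in ("p10", "polh") for he in ("I-CeuI", "PI-SceI")]
for line in layouts:
    print(line)
```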

[0036] Particularly preferred examples of nucleic acids according to the present invention comprise the sequence according to SEQ ID NO: 1 (for a detailed map see FIG. 13A and B; the sequence antisense to SEQ ID NO: 1 is outlined in SEQ ID NO: 54), SEQ ID NO: 50 (restriction map: FIG. 42), SEQ ID NO: 51 (restriction map: FIG. 43), SEQ ID NO: 52 (restriction map: FIG. 44) or SEQ ID NO: 53 (restriction map: FIG. 45).

[0037] In preferred embodiments of the present invention, the above-defined nucleic acid additionally comprises at least one site for integration of the nucleic acid into a vector or host cell. The integration site may allow for a transient or genomic incorporation.

[0038] With respect to the integration into a vector, in particular into a plasmid or virus, the integration site is preferably compatible for integration of the nucleic acid into an adenovirus, adeno-associated virus (AAV), autonomous parvovirus, herpes simplex virus (HSV), retrovirus, rhadinovirus, Epstein-Barr virus, lentivirus, semliki forest virus or baculovirus.

[0039] Particularly preferred integration sites that may be incorporated into the nucleic acid of the present invention can be selected from the transposon element of Tn7, λ-integrase specific attachment sites and site-specific recombinases (SSRs), in particular LoxP site or FLP recombinase specific recombination (FRT) site. Further preferred mechanisms for integration of the nucleic acid according to the invention are specific homologous recombination sequences such as lef2-603/Orf1629.

[0040] In further preferred embodiments of the present invention, the nucleic acid as described herein additionally contains one or more resistance markers for selecting against otherwise toxic substances. Preferred examples of resistance markers useful in the context of the present invention include, but are not limited to, antibiotics such as ampicillin, chloramphenicol, gentamycin, spectinomycin, and kanamycin resistance markers.

[0041] The nucleic acid of the present invention may also contain one or more ribosome binding site(s) (RBS), preferably integrated into an MIE as defined above.

[0042] Further subject-matter of the present invention relates to a vector comprising a nucleic acid as defined above.

[0043] Preferred vectors of the present invention are plasmids, expression vectors, transfer vectors, more preferred eukaryotic gene transfer vectors, transient or viral vector-mediated gene transfer vectors. Other vectors according to the invention are viruses such as adenovirus vectors, adeno-associated virus (AAV) vectors, autonomous parvovirus vectors, herpes simplex virus (HSV) vectors, retrovirus vectors, rhadinovirus vectors, Epstein-Barr virus vectors, lentivirus vectors, semliki forest virus vectors and baculovirus vectors.

[0044] Baculovirus vectors suitable for integrating a nucleic acid according to the invention (e.g. present on a suitable plasmid such as a transfer vector) are also subject matter of the present invention and preferably contain site-specific integration sites such as a Tn7 attachment site (which may be embedded in a lacZ gene for blue/white screening of productive integration) and/or a LoxP site. Further preferred baculovirus according to the invention contain (alternative to or in addition to the above-described integration sites) a gene for expressing a substance toxic for the host flanked by sequences for homologous recombination. An example for a gene for expressing a toxic substance is the diphtheria toxin A gene. A preferred pair of sequences for homologous recombination is e.g. lef2-603/Orf1629. The baculovirus can also contain further marker gene(s) as described above, including also fluorescent markers such as GFP, YFP and so on. Specific examples of corresponding baculovirus of the invention have the structure of EMBac, EMBacY, EMBac_Direct and EMBacY_Direct as disclosed in the schemes according to FIGS. 38, 39, 40 and 41, respectively.

[0045] Vectors useful in prokaryotic host cells comprise, preferably besides the above-exemplified marker genes (one or more thereof), an origin of replication (ori). Examples are pBR322, ColE1, and conditional origins of replication such as OriV and R6Kγ, the latter being a preferred conditional origin of replication which makes the propagation of the vector of the present application dependent on the pir gene in a prokaryotic host. OriV makes the propagation of the vector of the present application dependent on the trfA gene in a prokaryotic host.
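The dependence of conditional origins on host genes can be sketched as a simple lookup. This is an illustrative model only: the table `ORI_REQUIRES` and the function `can_propagate` are hypothetical names, and the model ignores copy number, incompatibility groups and every other biological detail.

```python
# Host gene each origin needs for replication (None = no special requirement).
# The R6K-gamma/pir and OriV/trfA pairs are taken from the paragraph above.
ORI_REQUIRES = {
    "ColE1": None,
    "R6K-gamma": "pir",
    "OriV": "trfA",
}


def can_propagate(origins, host_genes) -> bool:
    """A vector is maintained if at least one of its origins is functional."""
    return any(
        ORI_REQUIRES[ori] is None or ORI_REQUIRES[ori] in host_genes
        for ori in origins
    )


# A vector carrying only R6K-gamma is lost in a host lacking pir ...
print(can_propagate(["R6K-gamma"], host_genes=set()))
# ... but is maintained once fused to a vector with an unconditional origin.
print(can_propagate(["R6K-gamma", "ColE1"], host_genes=set()))
```

The second call reflects how a conditional origin can serve as a selection for fusion: in a host lacking the required gene, a vector with only the conditional origin persists only as part of a fusion that contributes a functional origin.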

[0046] Furthermore, the present invention is directed to a host cell containing the nucleic acid of the invention and/or the vector of the present invention.

[0047] The host cells may be prokaryotic or eukaryotic. Eukaryotic host cells may for example be mammalian cells, preferably human cells. Examples of human host cells include, but are not limited to, HeLa, Huh7, HEK293, HepG2, KATO-III, IMR32, MT-2, pancreatic β-cells, keratinocytes, bone-marrow fibroblasts, CHP212, primary neural cells, W12, SK-N-MC, Saos-2, WI38, primary hepatocytes, FLC4, 143TK, DLD-1, embryonic lung fibroblasts, primary foreskin fibroblasts, MRC5, and MG63 cells. Further preferred host cells of the present invention are porcine cells, preferably CPK, FS-13, PK-15 cells, bovine cells, preferably MDB, BT cells, bovine cells, such as FLL-YFT cells. Other eukaryotic cells useful in the context of the present invention are C. elegans cells. Further eukaryotic cells include yeast cells such as S. cerevisiae, S. pombe, C. albicans and P. pastoris. Furthermore, the present invention is directed to insect cells as host cells which include cells from S. frugiperda, more preferably Sf9, Sf21, Express Sf+, High Five H5 cells, and cells from D. melanogaster, particularly S2 Schneider cells. Further host cells include Dictyostelium discoideum cells and cells from parasites such as Leishmania spec.

[0048] Prokaryotic hosts according to the present invention include bacteria, in particular E. coli such as commercially available strains like TOP10, DH5α, HB101 etc.

[0049] The person skilled in the art is readily able to select appropriate vector construct/host cell pairs for appropriate propagation and/or transfer of the nucleic acid elements according to the present invention into a suitable host. Specific methods for introducing appropriate vector elements and vectors into appropriate host cells are equally known to the art and methods can be found in the latest edition of Ausubel et al. (ed.) Current Protocols In Molecular Biology, John Wiley & Sons, New York, USA.

[0050] In preferred embodiments of the present invention, the vector as defined above additionally comprises a site for site specific recombinases (SSRs), preferably one or more LoxP sites for Cre-lox specific recombination. In further preferred embodiments, the vector according to the present invention comprises a transposon element, preferably a Tn7 attachment site.

[0051] It is further preferred that the attachment site as defined above is located within a marker gene. This arrangement makes it feasible to select for successfully integrated sequences into the attachment site by transposition. According to preferred embodiments, such a marker gene is selected from luciferase, β-GAL, CAT, fluorescent protein encoding genes, preferably GFP, BFP, YFP, CFP and their variants, and the lacZα gene.

[0052] Particularly preferred embodiments of the vector according to the present invention have a sequence selected from the group consisting of SEQ ID NO: 2 to SEQ ID NO: 17.

[0053] Further preferred embodiments of the present invention are vectors containing more than one of the sequence elements of the nucleic acids of the present invention as defined above and, optionally, additionally containing more than one recombination sequence for a site specific recombinase, e.g. 2 to 6, more preferred 2, 3 or 4 of such recognition sequences, preferably 2 to 6, especially preferred 1 to 4 loxP sites.

[0054] A particularly preferred example of such a vector has the sequence of SEQ ID NO. 18.

[0055] It is to be understood that, if the vector of the present invention contains more than one recombination sequence, these can be recognition sequences of the same or different site-specific recombinases.

[0056] Further subject-matter of the present invention is a kit for cloning and/or expression of multiprotein complexes containing at least one vector as defined above together with at least one host cell suitable for the propagation of said vector(s). Preferred host cells have been already described above. Preferably, the kit of this aspect of the present invention additionally contains a site-specific recombinase such as Cre.

[0057] The present invention also relates to a method for producing a vector containing multiple expression cassettes comprising the steps of:

[0058] (a) inserting one or more genes between the HE and the X site of a first vector of the present invention;

[0059] (b) inserting one or more genes between the HE and the X site of a second vector as defined herein;

[0060] (c) cleaving the first vector with a homing endonuclease specific for site HE and with a restriction enzyme specific for site X yielding a fragment containing the at least one gene flanked by the cleaved HE and X sites;

[0061] (d) cleaving the second vector with a homing endonuclease specific for site HE;

[0062] (e) ligating the fragment obtained in step (c) into the cleaved second vector obtained in step (d) generating a third vector; and optionally

[0063] (f) repeating steps (a) to (e) with one or more vector(s) generating a vector containing multiple genes.
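
The iterative logic of steps (a) to (f) can be illustrated with a purely symbolic sketch. The following Python model is not part of the disclosed method. It represents each vector as a list of tokens and assumes, for illustration only, that ligating the HE end of the excised fragment to one HE end of the cleaved second vector restores an HE site, while the X end joined to the other HE end forms a hybrid junction that neither enzyme cleaves. The function name transfer_cassette, the token "SCAR" and the position of the restored HE site are assumptions introduced for this example; no sequences or overhangs are modelled.

```python
def transfer_cassette(first_vector, second_vector):
    """Symbolic model of steps (c) to (e).

    Each vector is a list of tokens containing exactly one "HE" token
    followed (not necessarily directly) by exactly one "X" token.
    """
    # Step (c): cut the first vector at HE and X, keep what lies between.
    he = first_vector.index("HE")
    x = first_vector.index("X")
    fragment = first_vector[he + 1:x]

    # Step (d): linearize the second vector at its single HE site.
    cut = second_vector.index("HE")
    upstream, downstream = second_vector[:cut], second_vector[cut + 1:]

    # Step (e): ligate. Assumption: HE end + HE end restores "HE";
    # X end + HE end gives an uncleavable hybrid junction ("SCAR").
    return upstream + ["HE"] + fragment + ["SCAR"] + downstream


v1 = ["HE", "gene1", "X"]
v2 = ["HE", "gene2", "X"]
v3 = ["HE", "gene3", "X"]

third = transfer_cassette(v1, v2)
print(third)   # ['HE', 'gene1', 'SCAR', 'gene2', 'X']

# Step (f): repeat with a further vector to accumulate genes.
multi = transfer_cassette(v3, third)
print(multi)   # ['HE', 'gene3', 'SCAR', 'gene1', 'SCAR', 'gene2', 'X']
```

In this model the resulting vector again carries exactly one HE token and one X token after every round, which is what would allow the cycle to be repeated.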

[0064] According to preferred embodiments of the present invention it is possible to insert one or more genes into the vectors of the invention by methods known to the skilled person, e.g. by restriction enzyme digestion/ligation via compatible sites within the MCS or by recombination, preferably using the optionally present homology region(s), preferably using the SLIC method. If more than one gene is inserted, these can be provided as single expression cassettes. However, it is clear for the skilled person that the (several or multiple) genes can be present as a polygene within one ORF.

[0065] The present invention is further directed to a method for producing multiple protein complexes comprising the steps of

[0066] (i) producing a vector containing multiple expression cassettes by the method as defined above;

[0067] (ii) introducing the vector obtained in step (i) into suitable host cells such as the host cells described above; and

[0068] (iii) incubating the host cell under conditions allowing the simultaneous expression of the genes present in the vector.

[0069] The introduction of the vector into suitable host cells (as exemplified above) is carried out by methods known to the skilled person (see, e.g. Ausubel et al. (ed.), supra).

[0070] A further aspect of the present application is a fusion vector comprising n vector entities separated from each other by n of the same site-specific recombination site wherein each vector entity contains an individual resistance marker gene different from the resistance marker genes of the other vector entities, wherein n is an integer of at least 3.

[0071] A "single vector" or "vector entity" according to the present aspect of the invention is generally a nucleic acid which is suitable for integration of foreign genetic elements (in particular, one or more genes of interest) into host cells and which is suited for amplification. Typical examples are plasmids, bacmids, viruses, lambda vectors, cosmids etc. Preferred examples of one or more of the above vector categories are outlined in more detail above with respect to the HE/X site containing vector which definitions are also valid for this aspect of the present invention.

[0072] It is clear for the skilled person that the number of vector entities to be assembled into a fusion vector according to the present invention (or disassembled from such a fusion vector; with respect to methods of assembly/disassembly see below) is generally not specifically limited as long as a corresponding number of resistance markers is available. With respect to practical considerations, the number n in the context of the present invention is preferably 3, 4, 5 or 6 (but may be more), which in part depends on the size of constructs that can be propagated in the host.

[0073] The present invention furthermore relates to a kit for assembly and/or disassembly of n vectors comprising

[0074] a fusion vector comprising n vector entities separated from each other by n of the same site-specific recombination site wherein each vector entity contains an individual resistance marker gene different from the resistance marker genes of the other vector entities; and/or

[0075] n vectors (vector entities) each containing a site-specific recombination site and an individual resistance marker gene different from the resistance marker genes of the other vectors,

[0076] wherein n is an integer of at least 3; and

[0077] a recombinase specific for said site-specific recombination site and/or cells for the propagation of said fusion vector and/or said n vectors.

[0078] Preferred embodiments of the above fusion vector and vector kits are or contain, respectively, fusion vector(s) and/or vector entities comprising LoxP sites and Cre as the corresponding recombinase enzyme. Other examples of site-specific recombination sites/recombinases are FRT sites and the corresponding enzyme (FLP recombinase).

[0079] According to a preferred embodiment the above-defined n vectors or vector entities, respectively, each contain one or more expression cassettes of the form Prom-MCS-Term or Prom-MCS-Term (definitions are as defined above, preferably between a HE and restriction enzyme site X as defined above). It is further preferred that the expression cassette preferably present in the vectors or vector entities, respectively, contains one or more genes of interest ("GOI").

[0080] Examples of resistance marker genes (or simply "resistance markers") useful in the context of this aspect of the present invention are as already defined above.

[0081] An especially preferred example of the fusion vector as defined above is vector pACKS (SEQ ID NO: 18) described in more detail below.

[0082] Preferred examples of the vector entities are pACE (SEQ ID NO: 2), pACE2 (SEQ ID NO: 3), pDC (SEQ ID NO: 4), pDK (SEQ ID NO: 5) and pDS (SEQ ID NO: 6), which are all adapted for expression in prokaryotic hosts, and pIDC (SEQ ID NO: 7), pIDK (SEQ ID NO: 8), pIDS (SEQ ID NO: 9), pACEBac1 (SEQ ID NO: 10), pACEBac2 (SEQ ID NO: 11), pACEBac3 (SEQ ID NO: 12), pACEBac4 (SEQ ID NO: 13), pOmniBac1 (SEQ ID NO: 14), pOmniBac2 (SEQ ID NO: 15), pOmniBac3 (SEQ ID NO: 16) and pOmniBac4 (SEQ ID NO: 17), which are tailored for expression in insect cells using baculovirus. The above preferred examples of vector entities are described in more detail below.

[0083] It is further preferred that at least one of the vector entities (and/or of the individual vectors in the above kit) contains a further selectable marker different from the resistance marker genes. An example is a conditional origin of replication making the propagation of the respective vector entity dependent on a specific genetic background in a host. An example is an Ori derived from (or being) R6k.gamma. making the propagation of the vector dependent on the pir gene.

[0084] The present invention further provides a method for assembling n vector entities into 1 to (n-1) fusion vectors wherein said fusion vector(s) contain(s) 2 to n of said vector entities comprising the steps of:

[0085] (1) contacting n vector entities each containing a site-specific recombination site and an individual resistance marker different from the resistance markers of the other vector entities with a recombinase specific for said site-specific recombination site so as to generate a mixture of fusions of the vector entities comprising 2 to n of said vector entities,

[0086] (2) transforming said mixture into host cells;

[0087] (3) culturing one or more sample(s) of the transformed cells in the presence of the appropriate combination of antibiotics for selecting a desired fusion vector containing 2 to n vector entities;

[0088] (4) obtaining n single clones of transformed cells from the culture obtained in step (3) in which these were viable in the presence of the respective combination of antibiotics; and

[0089] (5) culturing n samples of each of said n single clones in the presence of each of n antibiotics specific for the n individual resistance markers present in said n vector entities;

wherein n is as defined above.
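
The selection logic of steps (3) and (5) can be sketched in a few lines of Python. This is an illustrative model only: the marker assignment (pACE: Ap, pDC: Cm, pDK: Kn, pDS: Sp) is inferred from the examples given in this document, and the function names are hypothetical. A construct is modelled simply as a tuple of vector entity names.

```python
# Resistance marker per vector entity (inferred from the examples in this
# document; treat as an illustrative assumption).
MARKERS = {"pACE": "Ap", "pDC": "Cm", "pDK": "Kn", "pDS": "Sp"}


def resistances(construct):
    """Set of antibiotics tolerated by a construct (tuple of entities)."""
    return {MARKERS[entity] for entity in construct}


def survives(construct, antibiotics):
    """Step (3): a clone grows only if it resists every antibiotic present."""
    return set(antibiotics) <= resistances(construct)


def resistance_profile(construct):
    """Step (5): challenge a clone with each antibiotic separately."""
    return {ab: survives(construct, [ab]) for ab in sorted(MARKERS.values())}


# A hypothetical mixture after the recombinase reaction of step (1).
mixture = [("pACE",), ("pACE", "pDK"), ("pACE", "pDK", "pDS")]

selected = [c for c in mixture if survives(c, ["Ap", "Kn", "Sp"])]
print(selected)  # [('pACE', 'pDK', 'pDS')]

print(resistance_profile(("pACE", "pDK")))
# {'Ap': True, 'Cm': False, 'Kn': True, 'Sp': False}
```

In this model a triple fusion would also grow on Ap/Kn alone, so the single-antibiotic challenge of step (5) is what tells a double fusion apart from a higher-order fusion carrying the same two markers.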

[0090] If it is desired to select for more than one desired vector fusion, the transformed cells obtained in above step (2) are divided into the appropriate number of aliquots or samples. For example, if it is desired to select all possible (n!-n) vector fusions (i.e. the single vector entities as educts of the above method are not selected for), the transformed host cells are divided into (n!-n) aliquots (or samples) and each aliquot is cultured in the presence of the appropriate antibiotics.

[0091] In the context of the present invention, the term "aliquot" as used herein does not necessarily mean that the aliquots have the same volume or number of cells. Rather, each of the aliquots or samples may have the same or different volumes or number of cells.

[0092] The term "culturing" the transformed cells or the aliquot (or sample) means that the transformed cells are incubated under the appropriate conditions for viability of the host cells. For example, the transformed host cells may be used to inoculate a (e.g. larger) volume of liquid culture medium or the aliquot may be plated out on an appropriate solid medium.

[0093] If the vector assembly method as defined above is used to select for more than one desired vector fusion, e.g. if all possible fusions are desired, the selection step (3) is preferably carried out using typical well plate formats such as 96-well plates.

[0094] According to a preferred embodiment of the present vector assembly method, (n-1) of the vector entities to be fused each contain a further selectable marker different from the resistance marker. Such vector entities are hereinafter referred to as "Donor" vectors, since, when fused to a vector entity which does not contain said selectable marker different from the resistance marker (hereinafter referred to as "Acceptor"), said Donor(s) provide host cells with a phenotype that allows only the propagation of Acceptor-Donor fusions but no Donor-Donor fusions. Preferred examples of such a selectable marker are conditional origins of replication making the propagation of the Donor dependent on a specific genetic background. A specific example of such a selectable marker is the R6K.gamma. Ori making the propagation of the Donor dependent on the presence of the pir gene in a bacterial host such as E. coli. In this case, the mixture obtained in step (1) of the above vector assembly method is transformed into bacterial cells lacking the pir gene (such as E. coli strains TOP10, DH5.alpha., HB101 or other commercially available pir.sup.- cells).
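
The effect of the conditional origin can be expressed as a simple rule. The following Python sketch is an assumption-laden illustration, not a description of any actual software: it treats pACE as the Acceptor and pDC, pDK and pDS as Donors (as stated for the plasmid maps in this document), and the function name propagates is hypothetical.

```python
# Role of each vector entity; Donors carry the conditional R6K.gamma. origin,
# which requires the pir gene in the host.
ROLE = {"pACE": "Acceptor", "pDC": "Donor", "pDK": "Donor", "pDS": "Donor"}


def propagates(construct, host_has_pir):
    """A construct replicates if the host supplies pir, or if at least one
    entity contributes an origin that does not depend on pir (an Acceptor)."""
    return host_has_pir or any(ROLE[e] == "Acceptor" for e in construct)


candidates = [
    ("pDC",),                 # single Donor
    ("pDC", "pDK"),           # Donor-Donor fusion
    ("pACE", "pDC"),          # Acceptor-Donor fusion
    ("pACE", "pDC", "pDK"),   # Acceptor-Donor-Donor fusion
]

in_pir_minus = [c for c in candidates if propagates(c, host_has_pir=False)]
print(in_pir_minus)  # [('pACE', 'pDC'), ('pACE', 'pDC', 'pDK')]
```

The Acceptor alone would also propagate under this rule; in the sketch it is the antibiotic combination, not the origin, that would exclude it.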

[0095] A preferred embodiment of the above-defined vector assembly method is described in more detail below (ACEMBL system; Section C.2.1).

[0096] According to a preferred embodiment of the above-defined method, the n vector entities, respectively, each contain one or more expression cassettes of the form Prom-MCS-Term or Prom-MCS (as defined above, preferably between a HE and restriction enzyme site X as defined above). It is further preferred that the expression cassette preferably present in the vectors or vector entities, respectively, contains one or more genes of interest ("GOI") to be expressed in a suitable host.

[0097] Another method for providing fusion vectors according to the present invention is a sequential assembly process wherein in the first step two of the vector entities are recombined, transformed into host cells and the host cells cultured in the presence of two antibiotics. The second round comprises the isolation of the double fusion vector (n=2) from a viable clone, contacting with a third vector entity in the presence of the respective recombinase, transformation into host cells and selection for the three resistance markers present in the triple fusion vector (n=3) and so on until the desired multifusion vector is reached.
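
The bookkeeping of this sequential process, one additional vector entity and one additional antibiotic per round, can be sketched as follows. The Python code is illustrative only; the marker assignment is inferred from the examples in this document and the function name sequential_assembly is hypothetical. The recombination, transformation and isolation steps themselves are not modelled.

```python
# Resistance marker per vector entity (illustrative assumption).
MARKERS = {"pACE": "Ap", "pDC": "Cm", "pDK": "Kn", "pDS": "Sp"}


def sequential_assembly(entities, markers):
    """Add one vector entity per round and record which antibiotics
    are used for selection after that round."""
    fusion = [entities[0]]
    rounds = []
    for entity in entities[1:]:
        fusion.append(entity)  # recombine, transform, isolate viable clone
        selection = [markers[e] for e in fusion]
        rounds.append((tuple(fusion), selection))
    return rounds


for fusion, selection in sequential_assembly(["pACE", "pDK", "pDS"], MARKERS):
    print(len(fusion), fusion, selection)
# 2 ('pACE', 'pDK') ['Ap', 'Kn']
# 3 ('pACE', 'pDK', 'pDS') ['Ap', 'Kn', 'Sp']
```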

[0098] Of course, it is also possible to provide fusion vectors according to the invention, in particular fusion vectors of higher order (i.e. n>3) by a combined approach using the vector assembly method of steps (1) to (5) as defined above (e.g. for assembling a fusion vector with n=3, 4 or 5) and then adding one or more further vector entities sequentially as described in the previous paragraph.

[0099] The principle underlying the above-described method for assembling a fusion vector, i.e. the equilibrium of educts and products in recombination reactions, can equally be applied to the disassembly of fusion vectors.

[0100] Therefore, the invention further provides a method of disassembling a fusion vector containing n vector entities into one or more desired fusion vectors selected from the group consisting of fusion vectors containing 2 to (n-1) vector entities or into one or more desired single vector entities, wherein in said fusion vector containing n vector entities said n vector entities are separated from each other by n site-specific recombination sites and each vector entity contains an individual resistance marker different from the resistance markers of the other vector entities, comprising the steps of:

[0101] (A) contacting the fusion vector containing n vector entities with a recombinase specific for said site-specific recombination sites in order to generate a mixture of fusion vectors comprising 2 to (n-1) of said vector entities and single vector entities;

[0102] (B) transforming said mixture into host cells; and

[0103] (C) culturing one or more sample(s) of the transformed cells in the presence of

[0104] (C1) an appropriate combination of antibiotics for selecting one or more desired fusion vector(s) containing 2 to (n-1) vector entities; and/or

[0105] (C2) a single appropriate antibiotic for selecting a desired single vector entity;

[0106] (D) obtaining n single clones of transformed cells from the sample of the transformed cells in which these were viable in the presence of the respective antibiotic or combination of antibiotics, respectively; and

[0107] (E) culturing n samples of each of said n single clones in the presence of each of n antibiotics specific for the n individual resistance markers present in said n vector entities;

wherein n is as defined above.
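
Step (E) amounts to reading out which resistance markers a clone still carries after disassembly. The following Python sketch illustrates that readout under stated assumptions: the marker assignment is inferred from the examples in this document, the growth results are invented for the example, and the function names are hypothetical.

```python
# Resistance marker per vector entity (illustrative assumption).
MARKERS = {"pACE": "Ap", "pDC": "Cm", "pDK": "Kn", "pDS": "Sp"}


def entities_present(profile):
    """Translate a growth profile (antibiotic -> grew?) from step (E)
    into the list of vector entities the clone must contain."""
    return sorted(e for e, ab in MARKERS.items() if profile[ab])


def is_single_entity(profile, wanted):
    """True if the clone carries the wanted entity and nothing else."""
    return entities_present(profile) == [wanted]


# Two hypothetical clones picked after selection on Kn alone, step (C2).
clone_1 = {"Ap": False, "Cm": False, "Kn": True, "Sp": False}
clone_2 = {"Ap": True, "Cm": False, "Kn": True, "Sp": False}

print(entities_present(clone_1), is_single_entity(clone_1, "pDK"))
# ['pDK'] True
print(entities_present(clone_2), is_single_entity(clone_2, "pDK"))
# ['pACE', 'pDK'] False
```

Both clones survive the single-antibiotic selection, but only the first shows the profile expected for the isolated vector entity.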

[0108] If it is desired to select for single vectors using the above fusion vector disassembly method, it is preferred that steps (A), (B) and (C1) to (E) are carried out for selecting an appropriate fusion vector containing 2 to (n-1) vector entities and that steps (A), (B) and (C2) to (E) are then carried out with said selected fusion vector containing 2 to (n-1) vector entities. It is understood that this sequential approach can be repeated, which is especially preferred when starting from a fusion vector containing a higher number of vector entities, i.e. one can select for a (n-1) fusion vector in the first round, then for a (n-2) construct in the second round and so on, e.g. until reaching a fusion vector with n=3 or 2 such that the presence of the single vector entities in the recombinase reaction equilibrium makes the selection of respective clones containing said single vector entities according to the selection steps (C2) to (E) more likely.

[0109] Furthermore, in analogy to the above-defined vector assembly method, it is preferred in the fusion vector disassembly method of the present invention that (n-1) of the vector entities in said fusion vector containing n vector entities each contains a further selectable marker different from the resistance markers such that only host cells transformed with fusions between a vector entity not containing the further selectable marker and one or more of the vector entities containing the selectable marker are viable in step (C1).

[0110] With respect to preferred selectable markers (conditional Ori etc.), host cells, the use of multi well test plates etc. it is referred to the preferred embodiments of the vector assembly method outlined above.

[0111] The fusion vector disassembly method of the present application is further elaborated below with respect to a preferred embodiment (ACEMBL system; Section C.2.2).

[0112] The nucleic acids and vectors (including fusion vectors and single vectors (i.e. vector entities)) of the present invention may contain further typical sequence elements, e.g. elements that enable or simplify the detection and/or purification of the (multiple) proteins expressed from the one or more genes of interest. Typical examples of such elements are sequences coding for GFP and its derivatives, His-tags, GST etc.

[0113] Fusion vectors according to the present invention are advantageously used for the expression of multiprotein complexes in a suitable host. Thus, the present invention further provides a corresponding process comprising transforming a fusion vector of the invention (containing vector entities having inserted one or more genes of interest, e.g. in form of multiple or single expression cassettes, or in the form of polygenes as appropriate) into a suitable host and culturing the transformed host under conditions allowing simultaneous expression of the genes of interest.

[0114] From the disclosure of the various aspects of the present invention the skilled person readily understands that the HE/X site polynucleotide (in particular corresponding vectors), preferably used for iterative cloning of multiple expression cassettes, can be combined with the assembly (or disassembly) methods as defined above for creating multigene constructs. For example, one or more of a single gene or multigene vector(s) can be prepared using the HE/X site elements as described which may then be assembled into fusion vectors of choice (e.g. triple, quadruple or higher order fusion vectors) using the recombination-based assembly methods defined herein. Such fusion vectors may then be (partly or completely) disassembled as disclosed herein and different constructs can be assembled in turn as appropriate for the respective multiprotein application envisaged by the skilled person. Thus, the aspects of the present invention represent a building block system which provides the person skilled in the art with a hitherto unknown freedom of combining multiple genes (or polygenes) of interest for multiprotein applications.

[0115] The figures show:

[0116] FIG. 1 shows a schematic overview of preferred vectors according to the present invention for expression of multiprotein complexes in prokaryotic hosts contained in a preferred kit called "ACEMBL".

[0117] FIG. 2 is a graphic representation of a preferred embodiment of the nucleic acid of the present invention called "multiple integration element" (MIE).

[0118] FIG. 3 shows a schematic overview of a preferred method for inserting a gene of interest ("GOI") into a vector of the present invention by sequence and ligation independent cloning (SLIC; see Tan, S. et al. Protein Expr. Purif. 40, 385 (2005)). A gene of interest (GOI 1) is PCR amplified with specific primers and integrated into a vector (Acceptor, Donor) linearized by PCR with complementary primers (complementary regions are shaded in light gray or dark grey, respectively). Resulting PCR fragments contain homology regions at the ends. T4 DNA polymerase acts as an exonuclease in the absence of dNTP and produces long sticky overhangs. Mixing (optionally annealing) of T4 DNA polymerase exonuclease treated insert and vector is followed by transformation yielding a single gene expression cassette.

[0119] FIG. 4 shows a schematic overview of a preferred method for inserting a polycistron into a vector of the present invention by SLIC. Genes of interest (GOI 1, 2, 3) are PCR amplified with specific primers and integrated into a vector (Acceptor, Donor) linearized by PCR with primers complementary to the ends of the forward primer of the first (GOI 1) and the reverse primer of the last (GOI 3) gene to be assembled in the polycistron (complementary regions are shaded in light gray or dark grey, respectively). Resulting PCR fragments contain homology regions at the ends. T4 DNA polymerase acts as an exonuclease in the absence of dNTP and produces long sticky overhangs. Mixing (optionally annealing) of T4 DNA polymerase exonuclease treated insert and vector is followed by transformation, yielding a polycistronic expression cassette.

[0120] FIG. 5 shows the sequence of a LoxP imperfect inverted repeat (SEQ ID NO: 19).

[0121] FIG. 6 (left panel) shows a schematic representation in form of a pyramid illustrating Cre-mediated assembly and disassembly of preferred embodiments of the vector of the present invention (pACE, pDK and pDS vectors). LoxP sites are shown as red circles, resistance markers and origins are labelled. White arrows stand for the entire expression cassette (including promoter, terminator and multiple integration elements) in the ACEMBL vectors. Not all possible fusion products are shown for clarity. Levels of multiresistance are indicated in the right panel.

[0122] FIG. 7 is a schematic representation of a multiresistance analysis of bacterial colonies carrying vector constructs resulting from Cre-deCre assembly/disassembly according to the invention (cf. also FIG. 12).

[0123] FIG. 8 shows a schematic representation of the strategy for cloning of human RAP74 and human RAP30 into vectors of the present invention for expression of human TFIIF (left panel). hRAP74 was cloned by SLIC into pDC. hRAP30 was cloned by SLIC into pACE. Cre-Lox recombination of pDC-RAP74 (donor) and pACE-RAP30 results in vector pACEMBL-hTFIIF. Results from restriction mapping by BstZ17I/BamHI double digestion of 11 double resistant (Cm, Ap) colonies are shown by a gel section from 1% E-gel electrophoresis (M: NEB 1 kb DNA marker). All clones tested showed the expected pattern (5.0+2.8 kb) (right panel).

[0124] FIG. 9 illustrates the strategy for cloning of human VHL/elongin b/elongin c complex (VHLbc) (tricistron) into vector pACE by multifragment SLIC.

[0125] FIG. 10 shows a schematic representation of the strategy for iterative cloning of the components of yeast RES complex (Pml1p, Snu17p, Bud13p) using a preferred homing endonuclease site (HE)/restriction enzyme site (X) module (PI-SceI/BstXI) according to the present invention.

[0126] FIG. 11 shows a schematic representation of the generation of single vectors from multifusion vector pACKS (SEQ ID NO: 18).

[0127] FIG. 12 shows schematic representations and photographs illustrating a 96 well microtiter analysis of pACKS De-Cre reaction.

[0128] FIGS. 13A and 13B show the sequence and map of a preferred nucleic acid ("multiple integration element", MIE) according to the present invention (SEQ ID NO: 1). Forward and reverse primers for sequencing can be standard vector primers for T7 and lac. Adaptor primer sequences (see Table 1) are indicated. DNA sequences in these homology regions contain tried-and-tested sequencing primers (Tan et al. (2005), supra). Sites of insertion (I1-I4) are shown. The adaptor sequences, and probably any sequence in the homology regions, can be used as adaptors for multifragment insertions. The ribosome binding site present in the MIE (rbs) is boxed in red.

[0129] FIG. 14 shows a plasmid map of Acceptor vector pACE.

[0130] FIG. 15 shows a plasmid map of Acceptor vector pACE2.

[0131] FIG. 16 shows a plasmid map of Donor vector pDC.

[0132] FIG. 17 shows a plasmid map of Donor vector pDK.

[0133] FIG. 18 shows a plasmid map of Donor vector pDS.

As can be seen in the above plasmid maps, Acceptor vectors pACE (FIG. 14) and pACE2 (FIG. 15) contain a T7 promoter and terminator. Donor vectors pDC (FIG. 16), pDK (FIG. 17) and pDS (FIG. 18) contain conditional origins of replication. pDS (FIG. 18) and pDK (FIG. 17) have a lac promoter. pDC (FIG. 16) has a T7 promoter. Resistance markers and origins of replication are shown. LoxP imperfect inverted repeat sequences are shown as circles. Homing endonuclease sites and corresponding BstXI sites are boxed. The restriction enzyme sites in the multiple integration element (MIE) are indicated. All MIEs have the same DNA sequence between ClaI and PmeI. Differences in unique restriction site composition stem from differences in the plasmid backbone sequences.

[0134] FIG. 19 shows the results of a restriction mapping of preferred vectors according to the invention. Both undigested Acceptor (pACE, pACE2) and Donor vectors (pDC, pDK, pDS) are shown as well as the same vectors digested with BamHI. All restriction reactions yield the expected sizes. Lanes 1-5 show uncut pACE, pACE2, pDC, pDK, and pDS vectors; lane M shows .lamda. StyI marker; lanes A-E show BamHI digested pACE, pACE2, pDC, pDK, and pDS vectors.

[0135] FIG. 20 shows the strategy for Acceptor/Donor recombineering according to the invention. Genes coding for the Von Hippel-Lindau/elonginB/elonginC (VHLbc) complex (Tan et al. (2005), supra; see also FIG. 9 above), FtsH soluble domain (Bieniossek et al. (2006) Proc. Natl. Acad. Sci. USA 103, 3066-3071), blue fluorescent protein (BFP), and green fluorescent protein (mGFP) with a coiled-coil domain (Berger et al. (2003) Proc. Natl. Acad. Sci. USA 100, 12177-12182) were inserted into pACE, pDC, pDK and pDS, respectively. Cre-fusion was followed by transformation into pir.sup.- cells (TOP10). Aliquots were plated on agar with two (Ap/Kn; Ap/Cm; Ap/Sp), three (Ap/Kn/Sp) and four (Ap/Kn/Sp/Cm) antibiotics. Four colonies from each plate were challenged in a 96 well microtiter plate. Labels left of the plate image denote antibiotics contained in media aliquots in horizontal rows. Wells in the bottom two rows were charged differently (labels below the plate image). Those inoculated with four colonies each from one agar plate are boxed in black, and flagged with antibiotics contained in the agar plate. Four vertical rows in each such 16-well box were inoculated with the same colony. In the bottom two rows, four wells in a row were inoculated with the same colony. Expected vector architecture of the double, triple (ADD) and quadruple (ADDD) fusions is shown left or right (16 well boxes), respectively, or below (bottom two rows) of the plate image. Red dye is used as positional marker. Deconstruction of the ADDD fusion was carried out successfully in the reverse approach.

[0136] FIG. 21 shows the results of multiprotein complex expression of human TFIIF (FIG. 21A), the Von Hippel-Lindau/elonginB/elonginC (VHLbc) complex (FIG. 21B) and the prokaryotic transmembrane holotranslocon (HTL) YidC-SecYEGDF (FIG. 21C). (A) Human TFIIF was assembled and purified using a TECAN Freedom EvoII 200 workstation, and analyzed by SDS-PAGE. Uninduced and induced whole cell extracts and purified hTFIIF are shown, with subunits (RAP74, RAP30) marked. RAP74 contained a C-terminal oligohistidine tag. (B) All multigene constructs from FIG. 20 were assembled, expressed and cell lysates analyzed in parallel following the same routine as for hTFIIF (labels as in FIG. 20). The VHLbc complex was captured by an oligohistidine-thioredoxin-fusion tag on the VHL subunit (Tan et al. (2005), supra). FtsH contained an oligohistidine tag at its C-terminus (Bieniossek et al. (2006), supra). Fluorescent proteins were identified in lysates by Western blot with antibody Roche 1814460 (1:1000 in TBST/3% BSA). (C) Production of the entire prokaryotic transmembrane holotranslocon (HTL) YidC-SecYEGDF. Membrane vesicle preparation, detergent solubilization, Ni.sup.2+ affinity capture and size exclusion chromatography resulted in purified holotranslocon complex (right). Subunits are labeled. A breakdown product of SecY is marked with an asterisk. In all panels, M stands for Biorad broad range marker (sizes in kDa).

[0137] FIG. 22 shows a schematic workflow for an automated SLIC process.

[0138] FIG. 23 shows a schematic workflow for an automated Cre fusion process.

[0139] FIG. 24 shows a plasmid map of a preferred vector (Donor vector) of the invention called pIDC (SEQ ID NO: 7).

[0140] FIG. 25 shows the plasmid map of a preferred vector (Donor vector) of the invention called pIDK (SEQ ID NO: 8).

[0141] FIG. 26 shows the plasmid map of a preferred vector (Donor vector) of the invention called pIDS (SEQ ID NO: 9).

[0142] FIG. 27 shows the plasmid map of a preferred vector (Acceptor vector) of the invention called pACEBac1 (SEQ ID NO: 10).

[0143] FIG. 28 shows the plasmid map of a preferred vector (Acceptor vector) of the invention called pACEBac2 (SEQ ID NO: 11).

[0144] FIG. 29 shows the plasmid map of a preferred vector (Acceptor vector) of the invention called pACEBac3 (SEQ ID NO: 12).

[0145] FIG. 30 shows the plasmid map of a preferred vector (Acceptor vector) of the invention called pACEBac4 (SEQ ID NO: 13).

[0146] FIG. 31 shows the plasmid map of a preferred vector (Acceptor vector) of the invention called pOmniBac1 (SEQ ID NO: 14).

[0147] FIG. 32 shows the plasmid map of a preferred vector (Acceptor vector) of the invention called pOmniBac2 (SEQ ID NO: 15).

[0148] FIG. 33 shows the plasmid map of a preferred vector (Acceptor vector) of the invention called pOmniBac3 (SEQ ID NO: 16).

[0149] FIG. 34 shows the plasmid map of a preferred vector (Acceptor vector) of the invention called pOmniBac4 (SEQ ID NO: 17).

[0150] FIG. 35 shows a scheme for multiprotein expression in insect cells by generating composite baculovirus using Acceptor vectors of the present invention carrying a ColE1 origin (pACEBac1, pACEBac2, pOmniBac1, pOmniBac2). Multigene fusions are generated by Cre-LoxP fusion of the desired Donor/Acceptor combinations (multigene construction). The fusion vector is transformed into bacteria carrying a baculovirus genome (such as baculovirus EMBac or EMBacY) as a bacterial artificial chromosome (BAC). The vector fusion is integrated into the baculovirus genome by Tn7 based transposition. Productive composite viruses are selected by blue/white screening (integration of the vector fusion into the Tn7 attachment site of the virus destroys a lacZ gene present on the virus). Composite viruses are prepared and suitable insect cells are transfected for protein production.

[0151] FIG. 36 shows a scheme for multiprotein expression in insect cells by generating composite baculovirus using Acceptor vectors of the present invention carrying an OriV origin (pACEBac3, pACEBac4, pOmniBac3, pOmniBac4). Multigene fusions are generated by Cre-LoxP fusion of the desired Donor/Acceptor combinations. The fusion vector is transformed into bacteria carrying a baculovirus genome (such as baculovirus EMBac or EMBacY) as a bacterial artificial chromosome (BAC). The vector fusion is integrated into the baculovirus genome by Tn7 based transposition. Since the Acceptor vectors carrying an OriV can only be propagated if a trfA gene is provided in trans, unproductive integration events in bacteria not containing the trfA gene lead to elimination of such transformants upon exposure to the appropriate antibiotic (here: gentamycin). Thus, blue/white screening is not necessary in this case. Composite viruses are then prepared and suitable insect cells are transfected for protein production.

[0152] FIG. 37 shows a scheme for multiprotein expression in insect cells by generating composite baculovirus using Acceptor vectors of the present invention carrying lef2-603 and Orf1629 homology sequences (pOmniBac1, pOmniBac2, pOmniBac3, pOmniBac4). Multigene fusions are generated by Cre-LoxP fusion of the desired Donor/Acceptor combinations (multigene construction). The multigene construct and genomic baculovirus DNA carrying a diphtheria toxin A gene flanked by the lef2-603/Orf1629 homology sequences can be directly co-transfected into suitable insect cells for protein production. Transformation of the transfer vector into bacteria containing the baculovirus genome, blue/white screening for composite viruses and preparation of composite viruses from the bacteria are no longer necessary.

[0153] FIG. 38 shows a schematic representation of a baculovirus vector according to the invention called EMBac.

[0154] FIG. 39 shows a schematic representation of a baculovirus vector according to the invention called EMBacY.

[0155] FIG. 40 shows a schematic representation of a baculovirus vector according to the invention called EMBac_Direct.

[0156] FIG. 41 shows a schematic representation of a baculovirus vector according to the invention called EMBac_DirectY.

[0157] FIG. 42 shows a schematic representation of an MIE according to the invention having the general structure I-CeuI-p10-MCS-BstXI present, for example in Acceptor vectors such as pACEBac2.

[0158] FIG. 43 shows a schematic representation of an MIE according to the invention having the general structure PI-SceI-p10-MCS-BstXI present, for example in Donor vectors such as pIDS.

[0159] FIG. 44 shows a schematic representation of an MIE according to the invention having the general structure I-CeuI-polh-MCS-BstXI present, for example in Acceptor vectors such as pACEBac1.

[0160] FIG. 45 shows a schematic representation of an MIE according to the invention having the general structure PI-SceI-polh-MCS-BstXI present, for example in Donor vectors such as pIDC.

[0161] FIG. 46 shows a schematic representation of vector pACEBac1-HisIKK1.

[0162] FIG. 47 shows a schematic representation of vector pIDC-CSIKK2.

[0163] FIG. 48 shows a schematic representation of vector pIDS-IKK3.

[0164] FIG. 49 shows a schematic representation of vector pACEBac-HA-NA.

[0165] FIG. 50 shows a schematic representation of vector pIDC-M1-M2.

[0166] The present invention is in the following further described in detail with reference to preferred embodiments designated as "ACEMBL" system.

A. Synopsis

[0167] The preferred embodiments according to the present invention denoted as "ACEMBL" provide a multi-expression system for multigene expression in E. coli and insect cells using the baculovirus system. ACEMBL can be used both manually and in an automated setup by using a liquid handling workstation. ACEMBL applies tandem recombination steps for rapidly assembling many genes into multigene expression cassettes. These can be polycistronic or multiple expression modules, or a combination of these elements. ACEMBL also offers the option to employ conventional approaches involving restriction enzymes and ligase, if desired.

[0168] The following strategies for multigene assembly and expression are provided for in the ACEMBL system:

[0169] (1) Single gene insertions into vectors (recombination or restriction/ligation)

[0170] (2) Multigene assembly into a polycistron (recombination or restriction/ligation)

[0171] (3) Multigene assembly using homing endonucleases

[0172] (4) Multigene plasmid fusion by Cre-LoxP reaction

[0173] (5) Multigene expression by cotransformation in E. coli

[0174] (6) Multigene expression in insect cells using the baculovirus system

[0175] These strategies can be used individually or in conjunction, depending on the project and user.

[0176] In the following Section C, step-by-step protocols are provided for each of these methods for multigene cassette assembly that can be used in the ACEMBL system.

B. ACEMBL System

B.1 ACEMBL Vectors

[0177] The present invention provides as preferred exemplary embodiments small de novo designed vectors which are called "Acceptor" and "Donor" vectors (FIGS. 1 and 20; for plasmid maps, see FIGS. 14 to 18 and FIGS. 21 to 31). Acceptor vectors for expression of proteins in prokaryotic hosts (e.g. pACE, pACE2) contain origins of replication derived from ColE1 and resistance markers (ampicillin or tetracycline). Donor vectors contain conditional origins of replication (derived from R6K.gamma.), which make their propagation dependent on hosts expressing the pir gene. Donor vectors contain resistance markers kanamycin, chloramphenicol, and spectinomycin. Preferably, three Donor vectors are used in conjunction with one Acceptor vector.

[0178] All Donor and Acceptor vectors according to the present example contain a LoxP imperfect inverted repeat and, in addition, a multiple integration element (MIE). The preferred MIE of the invention comprises an expression cassette with a promoter of choice (prokaryotic, mammalian, insect cell specific or a combination thereof) and a terminator (prokaryotic, mammalian, insect cell specific or a combination thereof). In between is a DNA segment which contains a number of restriction sites that can be used for conventional cloning approaches or for generating double-strand breaks for the integration of expression elements of choice (further promoters, ribosomal binding sites, terminators and genes). The MIE is completed by a homing endonuclease site and a specifically-designed restriction enzyme site (BstXI) flanking the promoter and the terminator (see B.2).

[0179] The sequences of ACEMBL vectors for expression in prokaryotic hosts are outlined in the sequence listing (pACE: SEQ ID NO: 2; pACE2: SEQ ID NO: 3; pDC: SEQ ID NO: 4; pDK: SEQ ID NO: 5; pDS: SEQ ID NO: 6; pACKS: SEQ ID NO: 18). Maps of the vectors pACE, pACE2, pDC, pDK and pDS are shown in FIGS. 14 to 18.

[0180] The ACEMBL system according to the present invention also provides Donor and Acceptor vectors adapted for expression of multiprotein complexes in insect cells using baculovirus (pIDC (SEQ ID NO: 7), pIDK (SEQ ID NO: 8), pIDS (SEQ ID NO: 9), pACEBac1 (SEQ ID NO: 10), pACEBac2 (SEQ ID NO: 11), pACEBac3 (SEQ ID NO: 12), pACEBac4 (SEQ ID NO: 13), pOmniBac1 (SEQ ID NO: 14), pOmniBac2 (SEQ ID NO: 15), pOmniBac3 (SEQ ID NO: 16) and pOmniBac4 (SEQ ID NO: 17)). Plasmid maps of the vectors are shown in FIGS. 24 to 34.

[0181] Donor vectors pIDC, pIDK and pIDS contain a conditional origin of replication (from R6Kgamma phage), a homing endonuclease (HE) site (PI-SceI) and a complementary BstXI site (see the corresponding E. coli vectors pDC, pDK, pDS). Donors are propagated in cell strains containing the pir gene.

[0182] In contrast to the versions adapted for expression in bacteria, the vectors for expression of proteins in insect cells do not contain prokaryotic promoter and terminator structures. Instead, they have either a polh expression cassette (polh EC) or a p10 expression cassette (p10 EC). These expression cassettes contain common polyhedrin or p10 promoters from AcMNPV, an oligonucleotide encoding restriction sites (different from the MIE in the prokaryotic ACEMBL version) and either SV40 or HSVtk polyadenylation signal sequences.

[0183] Due to the HE and BstXI sites, the expression cassettes can be freely exchanged between the vectors, even if they contain an inserted gene. This can be done by restriction/ligation or by restriction enzyme/ligase independent methods (e.g. SLIC). Therefore, versions can be created with ease which contain a p10 or polh marker in combination with any one of the resistance markers (spectinomycin, kanamycin, chloramphenicol, or others).

[0184] The HE/BstXI site combinations can be used to multiply expression cassettes or also to fit the vectors with combinations of p10 and polh expression cassettes.

[0185] All Donors contain a LoxP inverted imperfect repeat. This can be used for LoxP mediated constructions and deconstructions of Acceptor/Donor multifusions as described for the bacterial ACEMBL vectors.

[0186] The present embodiment of the invention relating to vectors adapted for protein expression in insect cells provides a number of Acceptor vectors in the baculovirus-version of ACEMBL. These share common features: all contain a LoxP site, a resistance marker (gentamycin) and again either a p10 or a polh expression cassette (identical to the ones present in the Donors).

[0187] The expression cassettes of the Acceptors are flanked by a homing endonuclease site (I-CeuI) and a corresponding BstXI site.

[0188] The expression cassettes can be exchanged in between the Acceptors and also multiplied or combined using the HE/BstXI combination as described for the bacterial ACEMBL vectors.

[0189] There are two families of Acceptors in terms of the origin used:

[0190] pACEBac1, pACEBac2, pOmniBac1 and pOmniBac2 all contain a ColE1 origin of replication which allows propagation in all common E. coli cloning cell strains.

[0191] All Acceptor vectors contain Tn7L and Tn7R sequences which enable integration of the region in between into a Tn7 attachment site by using the Tn7 transposition procedure.

[0192] pACEBac3, pACEBac4, pOmniBac3 and pOmniBac4 contain a conditional origin of replication (OriV) from V. cholerae which is dependent on the trfA gene that needs to be provided in trans in the cloning strains used. The function of this OriV is to eliminate the background (blue colonies) when these Acceptors, fitted with genes and if required fused with Donors, are transformed into cells that contain the baculovirus genome in the form of a bacterial artificial chromosome (e.g. DH10Bac from Invitrogen and similar). Here, the Tn7 transposition system is used to integrate the region between Tn7L and Tn7R of the transformed DNA into a Tn7 attachment site on the viral genome of choice. Normally, unproductive integration events would result in blue colonies (if the Tn7 attachment site is embedded in a LacZalpha gene on the baculovirus genome). These blue colonies propagate the transformed plasmid outside of the baculovirus genome. With these four OriV containing plasmids, the blue colonies cannot survive upon exposure to gentamycin (since the DH10Bac or other cells do not contain trfA) and only white colonies are produced, all of which contain a productively integrated composite bacmid carrying the heterologous genes provided on the transformed plasmid; see also the scheme in FIG. 36.

[0193] The Acceptor vectors pOmniBac1-4 contain, in addition to the Tn7L and Tn7R regions, also the lef2-603 and Orf1629 homology sequences. These are used for homologous recombination procedures for generating composite baculovirus as used by the Novagen Bacvector series, the Baculogold system from Pharmingen, FlashBac from OET and others. Thus, these Acceptor vectors can be used for every baculovirus system that is currently available, including the Tn7 based baculoviruses and all viruses relying on lef2-603/Orf1629 homologous recombination procedures, for expressing heterologous genes in insect cell cultures; see also the scheme in FIG. 37.

B.2 Multiple Integration Element (MIE)

[0194] A preferred multiple integration element (MIE) according to the invention was derived from a polylinker (see Tan et al. (2005) supra) and allows for several approaches for multigene assembly (see Section C below). Multiple genes can be inserted into the MIE of any one of the vectors by a variety of methods, for example BD-in-Fusion recombination (see ClonTech TaKaRa Bio Europe, www.clontech.com) or SLIC (sequence and ligation independent cloning; see Li et al. (2007) Nat. Methods 4, 251). For this, the vector needs to be linearized, which can also be carried out efficiently by PCR reaction with appropriate primers, since the vectors are all small (2-3.0 kb). Use of ultrahigh-fidelity polymerases such as Phusion (Finnzymes/New England BioLabs, www.neb.com) is preferred. Alternatively, if more conventional approaches shall be used, e.g. in an ordinary wet lab setting without robotics, the vectors can also be linearized by restriction digestion, and a gene of interest can be integrated by restriction/ligation (see below Section C of the present embodiment). The DNA sequence (SEQ ID NO: 1) and map of the present MIE is shown in FIG. 13.

B.3. Tags, Promoters, Terminators

[0195] For expression of proteins in prokaryotic hosts, the vectors of the ACEMBL system contain per default promoters T7 and Lac, as well as the T7 terminator element (FIGS. 1, 14). The T7 system requires bacterial strains which contain a T7 polymerase gene, e.g. in the E. coli genome. The Lac promoter is a strong endogenous promoter which can be utilized in most strains. The present ACEMBL vectors contain the lac operator element for repression of heterologous expression.

[0196] Evidently, all promoters and terminators present in ACEMBL Donor and Acceptor vectors, and in fact the entire multiple integration element (MIE), excluding the HE and X Site, respectively, can be exchanged with an expression cassette of choice by using restriction/ligation cloning with appropriate enzymes (for example ClaI/PmeI, FIG. 2) or insertion into linearized ACEMBL vectors where the MIE was removed by sequence and ligation independent approaches such as SLIC. For example, the T7 promoter in pDC can be substituted with a trc promoter (pDC.sup.trc), and the T7 promoter in pACE with an arabinose promoter (pACE.sup.ara). Such variants can be used successfully in coexpression experiments by inducing with arabinose and IPTG.

[0197] In contrast to the ACEMBL vectors for expression in prokaryotic hosts, the vectors for expression in insect cells do not contain prokaryotic promoter and terminator structures. As already mentioned above, they have either a polh expression cassette (polh EC) or a p10 expression cassette (p10 EC). These expression cassettes contain common polyhedrin or p10 promoters from AcMNPV, a sequence of restriction sites and either SV40 or HSVtk polyadenylation signal sequences.

[0198] The ACEMBL system vectors of the present example do not contain DNA sequences encoding affinity tags to facilitate purification or solubilization of the protein(s) of interest. However, typically used C- or N-terminal oligohistidine tags, with or without protease sites for tag removal, can be introduced by means of the respective PCR primers used for amplification of the genes of interest prior to insertion into the MIE, e.g. by SLIC-mediated insertion. Thus, Donor and Acceptor vectors of the present invention may be equipped with an array of custom tags prior to inserting recombinant genes of interest. This is best done by a design which will, after tag insertion, still be compatible with the recombination based principles of ACEMBL system usage.

B.4 Complex Expression

[0199] For expression in E. coli, the ACEMBL multigene expression vector fusions with appropriate promoters or terminators are transformed into the appropriate expression host of choice. With respect to the present exemplary vectors (T7 and lac promoter elements), most of the wide array of currently available expression strains can be utilized. If particular expression strains already contain helper plasmids with DNA encoding chaperones, lysozyme or other factors, the design of the multigene fusion is preferably such that the ACEMBL vector containing the resistance marker that is also present on the helper plasmid is not included in multigene vector construction.

[0200] Alternatively, if further vectors are required for complex production in an experiment, the issue can be resolved by creating alternative versions of the ACEMBL vectors containing resistance markers that circumvent the conflict. This can be easily performed by PCR amplifying the vectors minus the resistance marker, and combining the resulting fragments with a PCR amplified resistance marker by recombination (SLIC) or blunt-end ligation (using 5' phosphorylated primers).

[0201] Donor vectors of the present example depend on expression by the host of the pir gene product, due to the R6K.gamma. conditional origin of replication. In regular expression strains, they rely on fusion with an Acceptor for productive replication. Donors or Donor-Donor fusions can nonetheless be used even for expression when not fused with an Acceptor, by using expression strains carrying a genomic insertion of the pir gene. Such strains are commercially available (Novagen Inc., Madison Wis., USA).

[0202] Cotransformation of two ACEMBL plasmids adapted for expression in bacteria can lead to a successful protein complex expression. The present ACEMBL system for expression in prokaryotic hosts contains two Acceptor vectors, pACE and pACE2, which are identical except for the resistance marker (FIGS. 1, 14). These can be used to express genes present on pACE or pACE2, respectively, by cotransformation and exposure to both antibiotics simultaneously. In fact, entire Acceptor-Donor fusions containing several genes, based on pACE or pACE2 as Acceptors, can in principle be cotransformed for multi-expression, if needed.

[0203] For expression in insect cells (such as Sf9, Sf21, Hi5 etc.) using the baculovirus system, suitable ACEMBL vectors of the present invention need to be integrated into a baculovirus genome (composite virus generation). This is typically carried out by transformation of the desired Cre-LoxP fusion into bacterial cells containing the desired virus genome as a bacterial artificial chromosome. When the vector system of the present invention adapted for baculovirus integration is used, three approaches are possible as outlined in FIGS. 35, 36 and 37, respectively.

C. Procedures

[0204] C.1. Cloning into ACEMBL Vectors

[0205] All Donors and Acceptors of the preferred embodiment for expression in prokaryotic hosts contain an identical MIE with the exception of the homing endonuclease site/BstXI tandem encompassing the MIE (FIGS. 1 and 14). The MIE is tailored for sequence and ligation independent gene insertion methods. In addition, the MIE also contains a series of unique restriction sites, and therefore can be used as a classical polylinker for conventional gene insertion by restriction/ligation. For automated applications, insertion of genes of interest is preferably carried out by recombination approaches such as SLIC.

[0206] The Donor vectors for expression in insect cells according to the present preferred embodiment also contain an MIE which is, however, different for each vector (see plasmid maps of vectors pIDC, pIDK and pIDS in FIGS. 21, 22 and 23, respectively).

C1.1. Single Gene Insertion into the MIE by SLIC

[0207] Several procedures for restriction/ligation independent insertion of genes into vectors have been published or commercialized (e.g. Novagen LIC, Becton-Dickinson BD In-Fusion etc). These systems share in common that they rely on the exonuclease activity of DNA polymerases. In the absence of dNTPs, 5' extensions are created from blunt ends or overhangs by digestion from the 3' end. If two DNA fragments contain the same .about.20-30 bp sequence at their termini at opposite ends, this results in overhangs that share complementary sequences capable of annealing. This can be exploited for ligation independent combination of two or several DNA fragments containing homologous sequences.
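The chew-back and annealing logic described above can be illustrated with a short Python sketch. It is a simplified model under stated assumptions, not a description of enzyme kinetics: the function names, the fixed resection length and the two example fragments are hypothetical, and only the 30-nt homology region is taken from the document (SEQ ID NO: 29 in Table I).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def five_prime_overhangs(top, resected):
    """Single-stranded 5' overhangs left after 3'->5' resection of both strands.

    `top` is the top strand (5'->3') of a blunt double-stranded fragment.
    Returns (left, right), each written 5'->3': the left overhang lies on
    the top strand, the right overhang lies on the bottom strand.
    """
    left = top[:resected]
    right = reverse_complement(top[-resected:])
    return left, right


def can_anneal(upstream, downstream, resected):
    """True if the right overhang of `upstream` base-pairs with the
    left overhang of `downstream` over the full resected length."""
    _, right = five_prime_overhangs(upstream, resected)
    left, _ = five_prime_overhangs(downstream, resected)
    return reverse_complement(right) == left


# Shared terminal homology: adaptor sequence SEQ ID NO: 29 (Table I).
HOMOLOGY = "GGATCCTGTAAAACGACGGCCAGTGAATTC"
# Hypothetical placeholder fragments, not real genes.
fragment_a = "ATGGCTAGCAAAGGAGAAGAACTTTTCTAA" + HOMOLOGY
fragment_b = HOMOLOGY + "AGGAGGTATACATATGTCTGGTTCTCATTAA"

print(can_anneal(fragment_a, fragment_b, resected=len(HOMOLOGY)))
print(can_anneal(fragment_b, fragment_a, resected=len(HOMOLOGY)))
```

Only the orientation in which the shared sequence sits at the 3' terminus of the upstream fragment and the 5' terminus of the downstream fragment yields complementary overhangs, which is why the order of fragments in a SLIC assembly is determined by the placement of the homology regions.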

[0208] If T4 DNA polymerase is used, this can be carried out in a manner that is independent of the sequences of the homology regions (Sequence and Ligation Independent Cloning, SLIC) and detailed protocols are available for the skilled person. In the context of multiprotein expression, this is particularly useful, as this approach is independent of the presence of unique restriction sites, or of their creation by mutagenesis, in the ensemble of encoding DNAs.

[0209] For use in the context of the present invention, the SLIC process was adapted for inserting encoding DNAs amplified by Phusion polymerase into the ACEMBL Acceptor and Donor vectors. In this way, not only seamless integration of genes into the expression cassettes, but also concatamerization of expression cassettes to multigene constructs can be achieved by applying the same simple routine, which can be readily automated.

[0210] The following Protocol 1 represents an improved process based on the method described in Li et al. (2007, supra). Protocol 1 is designed for manual operation. Other systems may be used (e.g. BD-InFusion etc.), and if so, the manufacturers' recommendations should be followed. The present protocol may be adapted for robotics applications. Corresponding modifications of the protocol are outlined in Section D.

Protocol 1: Single Gene Insertion by SLIC.

[0211] Reagents required: [0212] Phusion Polymerase [0213] 5.times.HF Buffer for Phusion Polymerase [0214] dNTP mix (10 mM) [0215] T4 DNA polymerase (and 10.times. Buffer) [0216] DpnI enzyme [0217] E. coli competent cells [0218] 100 mM DTT, 2M Urea, 500 mM EDTA [0219] Antibiotics

Step 1: Primer Design

[0220] Primers for the SLIC procedure are designed to provide the regions of homology which result in the long sticky ends upon treatment with T4 DNA polymerase in the absence of dNTP.

[0221] Primers for the insert contain a DNA sequence corresponding to this region of homology ("Adaptor sequence" in FIG. 3, inset), followed by a sequence which specifically anneals to the insert to be amplified (FIG. 3, inset). Useful examples of adaptor sequences for SLIC are listed below (Table I).

[0222] The "insert specific sequence" can be located upstream of a ribosome binding site (rbs), for example if the gene of interest (GOI) is amplified from a vector already containing expression elements (e.g. the pET vector series). Otherwise, the forward primer needs to be designed such that a ribosome binding site is also provided in the final construct (FIG. 3, inset).

[0223] Primers for PCR linearization of the vector backbone are simply complementary to the two adaptor sequences present in the primer pair chosen for insert amplification (FIG. 3).
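The primer layout of Step 1 can be sketched as follows. This is a minimal illustration, assuming an insert-specific annealing region of 20 nucleotides; the function name, the annealing length and the example target sequence are hypothetical, while the two adaptor sequences are copied from Table I (SEQ ID NO: 24 and 25). Whether the start codon of the gene should overlap the ATG at the 3' end of the forward adaptor depends on the construct and is not decided here: the sketch simply appends the first nucleotides of whatever target sequence is passed in.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Adaptor sequences from Table I (SEQ ID NO: 24 and SEQ ID NO: 25).
NDE_INS_FOR = "GTTTAACTTTAAGAAGGAGATATACATATG"
XHO_INS_REV = "GGGTTTAAACGGAACTAGTCTCGAG"


def slic_primers(target, fwd_adaptor, rev_adaptor, anneal_len=20):
    """Build insert and vector primers for single-fragment SLIC.

    Insert primers: adaptor at 5', insert-specific sequence at 3'.
    Vector primers: reverse complements of the two adaptors.
    `anneal_len` (20 nt) is an assumed value, not taken from the protocol.
    """
    if len(target) < anneal_len:
        raise ValueError("target is shorter than the annealing region")
    return {
        "insert_forward": fwd_adaptor + target[:anneal_len],
        "insert_reverse": rev_adaptor + reverse_complement(target[-anneal_len:]),
        "vector_forward": reverse_complement(rev_adaptor),
        "vector_reverse": reverse_complement(fwd_adaptor),
    }


# Hypothetical target sequence used only to exercise the function.
target = "ATGAGCAAAGGTGAAGAACTGTTCACCGGTGTTGTTCCGATCCTGGTTGAACTGTAA"
for name, primer in slic_primers(target, NDE_INS_FOR, XHO_INS_REV).items():
    print(name, primer)
```

A real design would additionally check melting temperatures and secondary structure of the insert-specific parts, which this sketch leaves out.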

Step 2: PCR Amplification of Insert and Vector

[0224] Identical reactions are prepared in 100 .mu.l volume for DNA insert to be cloned and vector to be linearized by PCR:

TABLE-US-00001
ddH.sub.2O                             75 .mu.l
5x Phusion HF Reaction buffer          20 .mu.l
dNTPs (10 mM stock)                     2 .mu.l
Template DNA (100 ng/.mu.l)             1 .mu.l
5' SLIC primer (100 .mu.M stock)        1 .mu.l
3' SLIC primer (100 .mu.M stock)        1 .mu.l
Phusion polymerase (2 U/.mu.l)        0.5 .mu.l

[0225] PCR reactions are then carried out with a standard PCR program (unless very long DNAs are amplified, then double extension time): [0226] 1.times.98.degree. C. for 2 min [0227] 30.times.[98.degree. C. for 20 sec.->50.degree. C. for 30 sec.->72.degree. C. for 3 min] [0228] Hold at 10.degree. C.
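When several inserts and vectors are amplified in parallel, the reaction table and cycling program of Step 2 translate into simple arithmetic. The sketch below assumes that water, buffer, dNTPs and polymerase are pooled into a master mix while template and primers are added per reaction, and that a 10 % pipetting excess is used; both choices, as well as all function and variable names, are assumptions for illustration and are not part of the protocol.

```python
# Per-reaction volumes in microlitres, as listed in the reaction table of Step 2.
PCR_MIX_UL = {
    "ddH2O": 75.0,
    "5x Phusion HF buffer": 20.0,
    "dNTPs (10 mM)": 2.0,
    "template DNA (100 ng/ul)": 1.0,
    "5' SLIC primer (100 uM)": 1.0,
    "3' SLIC primer (100 uM)": 1.0,
    "Phusion polymerase (2 U/ul)": 0.5,
}

# Components assumed to be identical in every reaction.
SHARED = ("ddH2O", "5x Phusion HF buffer", "dNTPs (10 mM)",
          "Phusion polymerase (2 U/ul)")


def master_mix(mix, n_reactions, excess=0.1, shared=SHARED):
    """Volumes of the shared components for n reactions plus pipetting excess."""
    factor = n_reactions * (1.0 + excess)
    return {name: round(mix[name] * factor, 2) for name in shared}


def program_seconds(initial=120, cycles=30, denature=20, anneal=30, extend=180):
    """Sum of the programmed hold times; ramping and the final hold are ignored."""
    return initial + cycles * (denature + anneal + extend)


total_ul = sum(PCR_MIX_UL.values())
print(f"volume per reaction: {total_ul} ul")
print(f"master mix for 4 reactions: {master_mix(PCR_MIX_UL, 4)}")
print(f"programmed cycling time: {program_seconds() / 60:.0f} min")
```

Doubling the extension time for very long templates, as the protocol suggests, corresponds to calling `program_seconds(extend=360)`.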

[0229] Analysis of the PCR reactions by agarose gel electrophoresis and ethidium bromide staining is recommended.

Step 3: DpnI treatment of PCR Products (Optional)

[0230] PCR reactions are then supplied with 1 .mu.l DpnI enzyme which cleaves parental plasmids (that are methylated). For insert PCR reactions, DpnI treatment is not required if the resistance marker of the template plasmid differs from the destination vector.

[0231] Reactions are then carried out as follows: [0232] Incubation: 37.degree. C. for 1-4 h [0233] Inactivation: 80.degree. C. for 20 min

Step 4: Purification of PCR Products

[0234] PCR products should be cleaned of residual dNTPs. Otherwise, the T4 DNA polymerase reaction (Step 5) is compromised. Product purification is preferably performed by using commercial PCR Purification Kits or NucleoSpin Kits (e.g. from Qiagen, Macherey-Nagel etc.). It is recommended to perform elution in the minimal possible volume indicated by the respective manufacturer.

Step 5: T4 DNA Polymerase Exonuclease Treatment

[0235] Identical reactions are prepared in 20 .mu.l volume for insert and for vector (eluted in Step 4):

TABLE-US-00002
10x T4 DNA polymerase buffer                  2 .mu.l
100 mM DTT                                    1 .mu.l
2M Urea                                       2 .mu.l
DNA eluate from Step 4 (vector or insert)    14 .mu.l
T4 DNA polymerase                             1 .mu.l

[0236] Reactions are then carried out as follows: [0237] Incubation: 23.degree. C. for 20 min [0238] Arrest: Addition of 1 .mu.l 500 mM EDTA [0239] Inactivation: 75.degree. C. for 20 min
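The final concentrations implied by the Step 5 reaction table follow from the usual dilution relation (stock concentration times added volume divided by total volume). The short sketch below works them out; the function and variable names are illustrative only, and the EDTA value assumes that the 1 .mu.l of 500 mM EDTA is added to the complete 20 .mu.l reaction.

```python
# Volumes in microlitres, as listed in the reaction table of Step 5.
T4_MIX_UL = {
    "10x T4 DNA polymerase buffer": 2.0,
    "100 mM DTT": 1.0,
    "2 M urea": 2.0,
    "DNA eluate": 14.0,
    "T4 DNA polymerase": 1.0,
}


def final_concentration(stock, added_ul, total_ul):
    """Dilution relation C_final = C_stock * V_added / V_total."""
    return stock * added_ul / total_ul


reaction_ul = sum(T4_MIX_UL.values())
dtt_mM = final_concentration(100.0, T4_MIX_UL["100 mM DTT"], reaction_ul)
urea_M = final_concentration(2.0, T4_MIX_UL["2 M urea"], reaction_ul)
# Arrest: 1 ul of 500 mM EDTA added to the complete reaction.
edta_mM = final_concentration(500.0, 1.0, reaction_ul + 1.0)

print(f"reaction volume: {reaction_ul} ul")
print(f"DTT: {dtt_mM:.1f} mM, urea: {urea_M:.2f} M, EDTA after arrest: {edta_mM:.1f} mM")
```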

Step 6: Mixing and Annealing

[0240] T4 DNA polymerase exonuclease-treated insert and vector are then mixed, followed by an (optional) annealing step which enhances the efficiency:

TABLE-US-00003
T4 DNA pol-treated insert:    10 .mu.l
T4 DNA pol-treated vector:    10 .mu.l

[0241] Annealing: 65.degree. C. for 10 min [0242] Cooling: Slowly (in heat block) to RT (room temperature)

Step 7: Transformation

[0243] Mixtures are next transformed into competent cells following standard transformation procedures.

[0244] Reactions for pACE and pACE2 derivatives are transformed into standard E. coli cells for cloning (such as TOP10, DH5.alpha., HB101) and after recovery (2-4 h) plated on agar containing ampicillin (100 .mu.g/ml) or tetracycline (25 .mu.g/ml), respectively.

[0245] Reactions for Donor derivatives are transformed into E. coli cells expressing the pir gene (such as BW23473, BW23474, or PIR1 and PIR2, Invitrogen) and plated on agar containing chloramphenicol (25 .mu.g/ml, pDC), kanamycin (50 .mu.g/ml, pDK), and spectinomycin (50 .mu.g/ml, pDS).
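For automated setups it can be convenient to keep the host and selection conditions of Step 7 in a small lookup table. The following sketch encodes only the values stated above for the prokaryotic vectors; the class and function names are hypothetical, and the strain examples in the generated text are the ones named in this protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Selection:
    antibiotic: str
    ug_per_ml: int
    needs_pir_host: bool  # True for Donors with the conditional R6K origin


# Values as stated in Step 7 of Protocol 1.
SELECTION = {
    "pACE": Selection("ampicillin", 100, False),
    "pACE2": Selection("tetracycline", 25, False),
    "pDC": Selection("chloramphenicol", 25, True),
    "pDK": Selection("kanamycin", 50, True),
    "pDS": Selection("spectinomycin", 50, True),
}


def plating_instructions(vector):
    """Return a one-line plating instruction for a vector backbone."""
    sel = SELECTION[vector]
    if sel.needs_pir_host:
        host = "a pir-expressing strain (e.g. BW23473 or PIR1)"
    else:
        host = "a standard cloning strain (e.g. TOP10 or DH5alpha)"
    return (f"{vector}: transform into {host}; "
            f"plate on {sel.antibiotic} at {sel.ug_per_ml} ug/ml")


for backbone in SELECTION:
    print(plating_instructions(backbone))
```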

Step 8: Plasmid Analysis

[0246] Plasmids are cultured in small-scale in media containing the corresponding antibiotic, and analyzed by sequencing and (optionally) restriction mapping with an appropriate restriction enzyme.

[0247] C1.2. Polycistron Assembly in MIE by SLIC

[0248] The multiple integration element according to the present invention can also be used to integrate genes of interest by using multi-fragment SLIC recombination as shown in FIG. 4. Genes preceded by ribosome binding sites (rbs) can be assembled in this way into polycistrons.

[0249] A detailed protocol is outlined in the following Protocol 2:

[0250] Protocol 2. Polycistron assembly by SLIC.

[0251] Reagents required: [0252] Phusion Polymerase [0253] 5.times.HF Buffer for Phusion Polymerase [0254] dNTP mix (10 mM) [0255] T4 DNA polymerase (and 10.times. Buffer) [0256] E. coli competent cells [0257] 100 mM DTT, 2M Urea, 500 mM EDTA [0258] Antibiotics

Step 1: Primer Design

[0259] The MIE element according to the present embodiment is composed of tried-and-tested primer sequences. These constitute the "Adaptor" sequences that can be used for inserting single genes or multigene constructs. Examples of useful adaptor sequences are listed below (see Table I).

[0260] Adaptor sequences form the 5' segments of the primers used to amplify DNA fragments to be inserted into the MIE. Insert specific sequences are added at the 3' end. DNA coding for a ribosome binding site can be inserted optionally, if not already present on the PCR template.
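How the adaptor sequences determine the order of genes in a polycistron can be sketched in Python. The example builds the top strands of three amplicons from adaptor pairs listed in Table I (SEQ ID NO: 24 and 28 to 32) and joins them through their shared terminal sequences. The insert bodies are hypothetical placeholders, the function names and the minimum homology length of 15 nt are assumptions, and the linearized vector is left out for brevity.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


# Adaptor sequences copied from Table I (SEQ ID NO: 24, 28, 29, 30, 31, 32).
NDE_INS_FOR = "GTTTAACTTTAAGAAGGAGATATACATATG"
SMA_BAM = "GAATTCACTGGCCGTCGTTTTACAGGATCC"
BAM_SMA = "GGATCCTGTAAAACGACGGCCAGTGAATTC"
SAC_HIND = "GCTCGACTGGGAAAACCCTGGCGAAGCTT"
HIND_SAC = "AAGCTTCGCCAGGGTTTTCCCAGTCGAGC"
BSP_ECO5 = "GATCCGGATGTGAAATTGTTATCCGCTGGTACC"


def amplicon(fwd_adaptor, body, rev_adaptor):
    """Top strand of a PCR product whose primers carry the given 5' adaptors."""
    return fwd_adaptor + body + reverse_complement(rev_adaptor)


def terminal_homology(upstream, downstream, min_len=15):
    """Length of the longest suffix of `upstream` that is a prefix of `downstream`."""
    for n in range(min(len(upstream), len(downstream)), min_len - 1, -1):
        if upstream.endswith(downstream[:n]):
            return n
    return 0


def join(fragments, min_len=15):
    """Join consecutive fragments through their shared terminal sequences."""
    product = fragments[0]
    for nxt in fragments[1:]:
        n = terminal_homology(product, nxt, min_len)
        if n == 0:
            raise ValueError("no terminal homology between consecutive fragments")
        product += nxt[n:]
    return product


# Hypothetical insert bodies (placeholders, not real genes).
BODY1 = "GCTAGCAAAGGTGAAGAACTGTAA"
BODY2 = "AGGAGGTAAATAATGTCTGGTCATCACTAA"
BODY3 = "AGGAGGTAAATAATGACCGAATACAAATAA"

fragments = [
    amplicon(NDE_INS_FOR, BODY1, SMA_BAM),   # GOI 1
    amplicon(BAM_SMA, BODY2, SAC_HIND),      # GOI 2
    amplicon(HIND_SAC, BODY3, BSP_ECO5),     # GOI 3
]
polycistron = join(fragments)
print(polycistron)
```

Because each junction adaptor occurs only at one pair of fragment termini, fragments supplied in the wrong order share no terminal homology and `join` raises an error instead of producing a scrambled product.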

Step 2: PCR Amplification of Insert and Vector

[0261] Identical reactions are prepared in 100 .mu.l volume for all DNA inserts (GOI 1, 2, 3) to be cloned and the vector to be linearized by PCR:

TABLE-US-00004
ddH.sub.2O                             75 .mu.l
5x Phusion HF Reaction buffer          20 .mu.l
dNTPs (10 mM stock)                     2 .mu.l
Template DNA (100 ng/.mu.l)             1 .mu.l
5' SLIC primer (100 .mu.M stock)        1 .mu.l
3' SLIC primer (100 .mu.M stock)        1 .mu.l
Phusion polymerase (2 U/.mu.l)        0.5 .mu.l

[0262] PCR reactions are then carried out with a standard PCR program (unless very long DNAs are amplified, then double extension time): [0263] 1.times.98.degree. C. for 2 min [0264] 30.times.[98.degree. C. for 20 sec.->50.degree. C. for 30 sec.->72.degree. C. for 3 min] [0265] Hold at 10.degree. C.

[0266] Analysis of the PCR reactions by agarose gel electrophoresis and ethidium bromide staining is recommended.

Step 3: DpnI Treatment of PCR Products (Optional)

[0267] PCR reactions are then supplied with 1 .mu.l DpnI enzyme which cleaves parental plasmids (that are methylated). For insert PCR reactions, DpnI treatment is not required if the resistance marker of the template plasmids differs from the destination vector.

[0268] Reactions are then carried out as follows: [0269] Incubation: 37.degree. C. for 1-4 h [0270] Inactivation: 80.degree. C. for 20 min

Step 4: Purification of PCR Products

[0271] PCR products should be cleaned of residual dNTPs. Otherwise, the T4 DNA polymerase reaction (Step 5) is compromised.

[0272] Product purification is preferably performed by using commercial PCR Purification Kits or NucleoSpin Kits (Qiagen, Macherey-Nagel or others). It is recommended to perform elution in the minimal possible volume indicated by the respective manufacturer.

Step 5: T4 DNA Polymerase Exonuclease Treatment

[0273] Identical reactions are prepared in 20 .mu.l volume for each insert (GOI 1, 2, 3) and for the vector (eluted in Step 4):

TABLE-US-00005
10x T4 DNA polymerase buffer                  2 .mu.l
100 mM DTT                                    1 .mu.l
2M Urea                                       2 .mu.l
DNA eluate from Step 4 (vector or insert)    14 .mu.l
T4 DNA polymerase                             1 .mu.l

[0274] Reactions are then carried out as follows: [0275] Incubation: 23.degree. C. for 20 min [0276] Arrest: Addition of 1 .mu.l 500 mM EDTA [0277] Inactivation: 75.degree. C. for 20 min

Step 6: Mixing and Annealing

[0278] T4 DNA polymerase exonuclease-treated insert and vector are then mixed, followed by an (optional) annealing step which enhances efficiency.

TABLE-US-00006
T4 DNA pol-treated insert 1 (GOI 1):    5 .mu.l
T4 DNA pol-treated insert 2 (GOI 2):    5 .mu.l
T4 DNA pol-treated insert 3 (GOI 3):    5 .mu.l
T4 DNA pol-treated vector:              5 .mu.l

[0279] Annealing: 65.degree. C. for 10 min [0280] Cooling: Slowly (in heat block) to RT

Step 7: Transformation

[0281] Mixtures are next transformed into competent cells following standard transformation procedures.

[0282] Reactions for pACE and pACE2 derivatives are transformed into standard E. coli cells for cloning (such as TOP10, DH5.alpha., HB101) and after recovery plated on agar containing ampicillin (100 .mu.g/ml) or tetracycline (25 .mu.g/ml), respectively.

[0283] Reactions for Donor derivatives are transformed into E. coli cells expressing the pir gene (such as BW23473, BW23474, or PIR1 and PIR2, available from Invitrogen) and plated on agar containing chloramphenicol (25 .mu.g/ml, pDC), kanamycin (50 .mu.g/ml, pDK), and spectinomycin (50 .mu.g/ml, pDS).

Step 8: Plasmid Analysis

[0284] Plasmids are cultured and correct clones are selected based on specific restriction digestion and DNA sequencing of the inserts.

TABLE-US-00007 TABLE I Adaptor DNA sequences for single gene or multigene insertions into ACEMBL vectors by SLIC.

Adaptor*	Sequence	Description

(a) Adaptors for cloning into ACEMBL vectors for expression in prokaryotic hosts
T7InsFor	TCCCGCGAAATTAATACGACTCACTATAGGG (SEQ ID NO: 20)	Forward primer for insert amplification, if gene of interest (GOI) is present in a T7 system vector (i.e. pET series). No further extension (rbs, insert specific overlap) required.
T7InsRev	CCTCAAGACCCGTTTAGAGGCCCCAAGGGGTTATGCTAG (SEQ ID NO: 21)	Reverse primer for insert amplification, if GOI is present in a T7 system vector (i.e. pET series). No further extension (stop codon, insert specific overlap) required.
T7VecFor	CTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGG (SEQ ID NO: 22)	Forward primer for vector amplification, reverse complement of T7InsRev. No further extension required.
T7VecRev	CCCTATAGTGAGTCGTATTAATTTCGCGGGA (SEQ ID NO: 23)	Reverse primer for vector amplification, reverse complement of T7InsFor. No further extension required.
NdeInsFor	GTTTAACTTTAAGAAGGAGATATACATATG (SEQ ID NO: 24)	Forward primer for insert amplification for insertion into MIE site I1 (FIG. 2). Further extension at 3' (insert specific overlap) required. Can be used with adaptor XhoInsRev in case of single fragment SLIC (FIG. 3).
XhoInsRev	GGGTTTAAACGGAACTAGTCTCGAG (SEQ ID NO: 25)	Reverse primer for insert amplification for insertion into MIE site I4 (FIG. 2). Further extension at 3' (stop codon, insert specific overlap) required. Can be used with adaptor NdeInsFor in case of single fragment SLIC (FIG. 3).
XhoVecFor	CTCGAGACTAGTTCCGTTTAAACCC (SEQ ID NO: 26)	Forward primer for vector amplification, reverse complement of XhoInsRev. No further extension required.
NdeVecRev	CATATGTATATCTCCTTCTTAAAGTTAAAC (SEQ ID NO: 27)	Reverse primer for vector amplification, reverse complement of NdeInsFor. No further extension required.
SmaBam	GAATTCACTGGCCGTCGTTTTACAGGATCC (SEQ ID NO: 28)	Reverse primer for insert amplification (GOI1) for insertion into MIE site I1 (FIG. 2). Further extension at 3' (stop codon, insert specific overlap) required. Use with adaptor NdeInsFor.
BamSma	GGATCCTGTAAAACGACGGCCAGTGAATTC (SEQ ID NO: 29)	Forward primer for insert amplification (GOI2) for insertion into site I2 (FIG. 2, 4). Further extension at 3' (rbs, insert specific overlap) required. Use with adaptor SacHind (multifragment SLIC, FIG. 4).
SacHind	GCTCGACTGGGAAAACCCTGGCGAAGCTT (SEQ ID NO: 30)	Reverse primer for insert amplification (GOI2) for insertion into MIE site I2 (FIG. 2, 4). Further extension at 3' (stop codon, insert specific overlap) required. Use with adaptor BamSma (multifragment SLIC, FIG. 4).
HindSac	AAGCTTCGCCAGGGTTTTCCCAGTCGAGC (SEQ ID NO: 31)	Forward primer for insert amplification (GOI3) for insertion into site I3 (FIG. 2, 4). Further extension at 3' (rbs, insert specific overlap) required. Use with adaptor BspEco5 (multifragment SLIC, FIG. 4).
BspEco5	GATCCGGATGTGAAATTGTTATCCGCTGGTACC (SEQ ID NO: 32)	Reverse primer for insert amplification (GOI3) for insertion into MIE site I3 (FIG. 2, 4). Further extension at 3' (stop codon, insert specific overlap) required. Use with adaptor HindSac (multifragment SLIC, FIG. 4).
Eco5Bsp	GGTACCAGCGGATAACAATTTCACATCCGGATC (SEQ ID NO: 33)	Forward primer for insert amplification (GOI3) for insertion into site I4 (FIG. 2, 4). Further extension at 3' (rbs, insert specific overlap) required. Use with adaptor XhoInsRev (multifragment SLIC, FIG. 4).

(b) Adaptors for cloning into ACEMBL vectors for expression in insect cells
PolhInsFor	CCCACCATCGGGCGCGGATCCCG (SEQ ID NO: 34)	Forward primer for insert amplification, needs to be followed by insert specific sequence (ca. 20 bp).
PolhInsRev	CGAGACTGCAGGCTCTAGATTCG (SEQ ID NO: 35)	Reverse primer for insert amplification, needs to be followed by insert specific sequence (ca. 20 bp).
PolhVecFor	CGGGATCCGCGCCCGATGGTGGG (SEQ ID NO: 36)	Forward primer for vector amplification, reverse complement of PolhInsFor. No further extension required.
PolhVecRev	CGAATCTAGAGCCTGCAGTCTCG (SEQ ID NO: 37)	Reverse primer for vector amplification, reverse complement of PolhInsRev. No further extension required.
P10InsFor	CTCCCGGTACCGCATGCTATGCATCAGC (SEQ ID NO: 38)	Forward primer for insert amplification, needs to be followed by insert specific sequence (ca. 20 bp).
P10InsRev	AATCACTCGACGAAGACTTGATCACC (SEQ ID NO: 39)	Reverse primer for insert amplification, needs to be followed by insert specific sequence (ca. 20 bp).
P10VecFor	GCTGATGCATAGCATGCGGTACCGGGAG (SEQ ID NO: 40)	Forward primer for vector amplification, reverse complement of P10InsFor. No further extension required.
P10VecRev	GGTGATCAAGTCTTCGTCGAGTGATT (SEQ ID NO: 41)	Reverse primer for vector amplification, reverse complement of P10InsRev. No further extension required.

* All Adaptor primers (without extension) can be used as sequencing primers for genes of interest that were inserted into the MIE according to the present embodiment.
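The reverse-complement relationships between insert and vector adaptors in Table I can be checked programmatically. The following Python sketch is illustrative only and is not part of the described embodiment. It uses the prokaryotic adaptor sequences as listed in Table I; the helper names (`reverse_complement`, `insert_primer`, `check_vector_primers`) and the gene-specific extension used in the usage note are hypothetical.

```python
# Adaptor sequences copied from Table I, part (a).
ADAPTORS = {
    "T7InsFor": "TCCCGCGAAATTAATACGACTCACTATAGGG",
    "T7InsRev": "CCTCAAGACCCGTTTAGAGGCCCCAAGGGGTTATGCTAG",
    "T7VecFor": "CTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGG",
    "T7VecRev": "CCCTATAGTGAGTCGTATTAATTTCGCGGGA",
    "NdeInsFor": "GTTTAACTTTAAGAAGGAGATATACATATG",
    "XhoInsRev": "GGGTTTAAACGGAACTAGTCTCGAG",
    "XhoVecFor": "CTCGAGACTAGTTCCGTTTAAACCC",
    "NdeVecRev": "CATATGTATATCTCCTTCTTAAAGTTAAAC",
}

# (vector primer, insert primer) pairs described as reverse complements.
VECTOR_INSERT_PAIRS = [
    ("T7VecFor", "T7InsRev"),
    ("T7VecRev", "T7InsFor"),
    ("XhoVecFor", "XhoInsRev"),
    ("NdeVecRev", "NdeInsFor"),
]

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def insert_primer(adaptor_name, gene_specific):
    """Hypothetical helper: adaptor followed by the insert-specific 3' extension."""
    return ADAPTORS[adaptor_name] + gene_specific.upper()


def check_vector_primers():
    """Map each (vector, insert) pair to True if they are reverse complements."""
    return {
        (vec, ins): ADAPTORS[vec] == reverse_complement(ADAPTORS[ins])
        for vec, ins in VECTOR_INSERT_PAIRS
    }
```

For example, `insert_primer("NdeInsFor", "atggct")` appends a made-up six-nucleotide extension to the adaptor; in practice the extension would be the insert-specific overlap described in the table.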

C.1.3. Gene Insertion by Restriction/Ligation

[0285] The MIEs of the present invention can also be used as a multiple cloning site with a series of unique restriction sites. Preferably, the MIE described herein for expression of proteins in prokaryotic hosts is preceded by a promoter and a ribosome binding site, and followed by a terminator. The MIEs of the preferred embodiments described herein for expression of proteins in insect cells contain a polh expression cassette or a p10 expression cassette as already mentioned above. Therefore, cloning into the MIE by classical restriction/ligation also yields functional expression cassettes.

[0286] Genes of interest (GOI) can be subcloned by using standard cloning procedures into the multiple integration element (MIE) (see, for example, FIG. 13) of ACEMBL vectors.

Protocol 3. Restriction/Ligation Cloning into an MIE

[0287] Reagents required:
[0288] Phusion Polymerase
[0289] 5.times.HF Buffer for Phusion Polymerase
[0290] dNTP mix (10 mM)
[0291] 10 mM BSA
[0292] Restriction endonucleases (and 10.times. Buffer)
[0293] T4 DNA ligase (and 10.times. Buffer)
[0294] Calf intestinal or shrimp alkaline phosphatase
[0295] E. coli competent cells
[0296] Antibiotics

Step 1: Primer Design

[0297] For conventional cloning, PCR primers are designed containing chosen restriction sites, preceded by appropriate overhangs for efficient cutting (see, e.g. New England Biolabs catalogue), and followed by .gtoreq.20 nucleotides overlapping with the gene of interest that is to be inserted.

[0298] In the case of the ACEMBL system for expression in bacteria, the MIE of the present embodiment is identical for all ACEMBL vectors. They contain a ribosome binding site (rbs) preceding the NdeI site. For single gene insertions, therefore, an rbs need not be included in the primer.

[0299] If multigene insertions are needed (for example in insertion sites I1-I4 of the MIE), primers should be designed such that an rbs preceding the gene and a stop codon at its 3' end are provided.

[0300] In particular for polycistron cloning by restriction/ligation, it is recommended to construct templates by custom gene synthesis. In the process, the restriction sites present in the MIE can be eliminated from the encoding DNAs.

Step 2: Insert Preparation

PCR of Insert(s):

[0301] Identical PCR reactions are prepared in 100 .mu.l volume for genes of interest to be inserted into the MIE:

TABLE-US-00008
ddH.sub.2O	75 .mu.l
5x Phusion HF Reaction buffer	20 .mu.l
dNTPs (10 mM stock)	2 .mu.l
Template DNA (100 ng/.mu.l)	1 .mu.l
5' primer (100 .mu.M stock)	1 .mu.l
3' primer (100 .mu.M stock)	1 .mu.l
Phusion polymerase (2 U/.mu.l)	0.5 .mu.l
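When several genes of interest are amplified in parallel, the shared components of the recipe above can be pooled into a master mix. The following Python sketch is a convenience calculation only, under stated assumptions: the per-reaction volumes are taken from the recipe above, whereas the function name `master_mix`, the choice of which components count as sample specific, and the 10% pipetting excess are assumptions and are not specified in the protocol.

```python
# Per-reaction volumes in microlitres, taken from the PCR recipe above.
PCR_RECIPE_UL = {
    "ddH2O": 75.0,
    "5x Phusion HF Reaction buffer": 20.0,
    "dNTPs (10 mM stock)": 2.0,
    "Template DNA (100 ng/ul)": 1.0,
    "5' primer (100 uM stock)": 1.0,
    "3' primer (100 uM stock)": 1.0,
    "Phusion polymerase (2 U/ul)": 0.5,
}

# Assumption: template and primers differ per gene and are pipetted individually.
SAMPLE_SPECIFIC = {
    "Template DNA (100 ng/ul)",
    "5' primer (100 uM stock)",
    "3' primer (100 uM stock)",
}


def master_mix(n_reactions, excess=0.10):
    """Return pooled volumes (ul) of the shared components for n reactions.

    `excess` is an assumed pipetting reserve (10% by default).
    """
    if n_reactions < 1:
        raise ValueError("n_reactions must be at least 1")
    factor = n_reactions * (1.0 + excess)
    return {
        name: round(volume * factor, 2)
        for name, volume in PCR_RECIPE_UL.items()
        if name not in SAMPLE_SPECIFIC
    }
```

Note that the listed per-reaction volumes add up to 100.5 .mu.l, i.e. the nominal 100 .mu.l reaction plus the polymerase.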

[0302] PCR reactions are then carried out with a standard PCR program (for very long DNAs, double the extension time):
[0303] 1.times.98.degree. C. for 2 min
[0304] 30.times.[98.degree. C. for 20 sec.->50.degree. C. for 30 sec.->72.degree. C. for 3 min]
[0305] Hold at 10.degree. C.

[0306] Analysis of the PCR reactions by agarose gel electrophoresis and ethidium bromide staining is recommended.

[0307] Product purification is preferably performed by using commercial PCR Purification Kits or NucleoSpin Kits (available from Qiagen, Macherey-Nagel and other manufacturers). It is recommended to perform elution in the minimal possible volume indicated by the manufacturer.

Restriction Digestion of Insert(s):

[0308] Restriction reactions are carried out in 40 .mu.l reaction volumes, using specific restriction enzymes as specified by the manufacturer's recommendations (cf. New England Biolabs catalogue and others).

TABLE-US-00009
PCR Kit eluate (.gtoreq.1 .mu.g)	30 .mu.l
10x Restriction enzyme buffer	4 .mu.l
10 mM BSA	2 .mu.l
Restriction enzyme for 5'	2 .mu.l
Restriction enzyme for 3'	2 .mu.l (in case of double digestion, otherwise ddH.sub.2O)

[0309] Restriction digestions are performed in a single reaction with both enzymes (double digestion), or, alternatively, sequentially (two single digestions) if the buffer conditions required are incompatible.

Gel Extraction of Insert(s):

[0310] Processed insert is then purified by agarose gel extraction using commercial kits (Qiagen, Macherey-Nagel etc). It is recommended to elute the extracted DNA in the minimal volume defined by the respective manufacturer.

Step 3: Vector Preparation

[0311] Restriction digestion of ACEMBL plasmid(s):

[0312] Restriction reactions are carried out in 40 .mu.l reaction volumes, using specific restriction enzymes as specified by manufacturer's recommendations (see, e.g. New England Biolabs catalogue and others).

TABLE-US-00010
ACEMBL plasmid (.gtoreq.0.5 .mu.g) in ddH.sub.2O	30 .mu.l
10x Restriction enzyme buffer	4 .mu.l
10 mM BSA	2 .mu.l
Restriction enzyme for 5'	2 .mu.l
Restriction enzyme for 3'	2 .mu.l (in case of double digestion, otherwise ddH.sub.2O)

[0313] Restriction digestions are performed in a single reaction with two enzymes (double digestion), or, alternatively, sequentially (two single digestions), if the buffer conditions required are incompatible.

Gel Extraction of Vector(s):

[0314] The processed vector is then purified by agarose gel extraction using commercial kits (Qiagen, Macherey-Nagel etc.). It is recommended to elute the extracted DNA in the minimal volume defined by the respective manufacturer.

Step 4: Ligation

[0315] Ligation reactions are carried out in 20 .mu.l reaction volumes according to the recommendations of the supplier of the T4 DNA ligase:

TABLE-US-00011
ACEMBL plasmid (gel extracted)	8 .mu.l
Insert (gel extracted)	10 .mu.l
10x T4 DNA Ligase buffer	2 .mu.l
T4 DNA Ligase	0.5 .mu.l

[0316] Ligation reactions are performed at 25.degree. C. (sticky end) for 1 h or at 16.degree. C. (blunt end) overnight.

Step 5: Transformation

[0317] Mixtures are next transformed into competent cells following standard transformation procedures.

[0318] Reactions for Acceptor derivatives are transformed into standard E. coli cells for cloning (such as TOP10, DH5.alpha., HB101) and after recovery plated on agar containing ampicillin (100 .mu.g/ml) or tetracycline (25 .mu.g/ml), respectively.

[0319] Reactions for Donor derivatives are transformed into E. coli cells expressing the pir gene (such as BW23473, BW23474, or PIR1 and PIR2, Invitrogen) and plated on agar containing chloramphenicol (25 .mu.g/ml, pDC), kanamycin (50 .mu.g/ml, pDK), or spectinomycin (50 .mu.g/ml, pDS), respectively.

Step 6: Plasmid Analysis

[0320] Plasmids are cultured and correct clones are selected based on specific restriction digestion and DNA sequencing of the inserts.

C.1.4. Multiplication by Using the HE and BstXI Sites

[0321] The ACEMBL system vectors according to the present invention contain a homing endonuclease (HE) site and a designed BstXI site that envelop the multiple integration element (MIE). The homing endonuclease site can be used to insert entire expression cassettes, containing single genes or polycistrons, into a vector already containing one gene or several genes of interest. Homing endonucleases have long recognition sites (12 to 40 base pairs or more, preferably 20-30 base pairs). Although not all equally stringent, homing endonuclease sites are most probably unique in the context of even large plasmids, or, in fact, entire genomes.

[0322] In the ACEMBL system of the present embodiment, Donor vectors contain a recognition site for homing endonuclease PI-SceI (FIG. 2). This HE site yields upon cleavage a 3' overhang with the sequence -GTGC. Acceptor vectors contain the homing endonuclease site I-CeuI, which upon cleavage will result in a 3' overhang of -CTAA. On Acceptors and Donors, the respective HE site precedes the MIE. The 3' end of the MIE contains a specifically designed BstXI site, which upon cleavage will generate a matching overhang. The basis of this is the specificity of cleavage by BstXI. The recognition sequence of BstXI is defined as CCANNNNN'NTGG (SEQ ID NO: 42) (apostrophe marks position of phosphodiester link cleavage). The residues denoted as N can be chosen freely. Donor vectors thus contain a BstXI recognition site of the sequence CCATGTGC'CTGG (SEQ ID NO: 43), and Acceptor vectors contain CCATCTAA'TTGG (SEQ ID NO: 44). The overhangs generated by BstXI cleavage in each case will match the overhangs generated by HE cleavage. Note that Acceptors and Donors have different HE sites.
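The overhang produced by a designed BstXI site can be derived directly from its sequence. BstXI leaves a four-nucleotide 3' overhang consisting of the four N residues that immediately precede the marked cleavage position. The following Python sketch is illustrative only: it uses the two BstXI sites given above (SEQ ID NO: 43 and 44), while the function names `bstxi_overhang` and `matches_he_overhang` are hypothetical.

```python
import re

# BstXI recognition sequence CCANNNNN'NTGG, written without the cleavage mark.
_BSTXI_SITE = re.compile(r"^CCA[ACGT]{6}TGG$")

DONOR_BSTXI = "CCATGTGCCTGG"     # SEQ ID NO: 43 without the apostrophe
ACCEPTOR_BSTXI = "CCATCTAATTGG"  # SEQ ID NO: 44 without the apostrophe


def bstxi_overhang(site):
    """Return the 4-nt 3' overhang left on the top strand by BstXI cleavage.

    The top strand is cut after position 8 of the 12-nt site; the staggered
    cut on the bottom strand leaves positions 5-8 single stranded.
    """
    site = site.upper().replace("'", "")
    if not _BSTXI_SITE.match(site):
        raise ValueError("not a BstXI site: %r" % site)
    return site[4:8]


def matches_he_overhang(site, he_overhang):
    """True if the BstXI overhang equals the given homing endonuclease overhang."""
    return bstxi_overhang(site) == he_overhang.upper().lstrip("-")
```

For example, `matches_he_overhang(ACCEPTOR_BSTXI, "-CTAA")` compares the Acceptor BstXI site with the I-CeuI overhang stated in the text.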

[0323] The recognition sites are not symmetric. Therefore, ligation of a HE/BstXI digested fragment into a HE site of an ACEMBL vector will be (1) directional and (2) result in a hybrid DNA sequence where a HE half site is combined with a BstXI half site. This site will be cut by neither HE nor BstXI. Therefore, in a construct that had been digested with a HE, insertion by ligation of HE/BstXI digested DNA fragment containing an expression cassette with one or several genes will result in a construct which contains all heterologous genes of interest, enveloped by an intact HE site in front, and a BstXI site at the end. Therefore, the process of integrating entire expression cassettes by means of HE/BstXI digestion and ligation into a HE site can be repeated iteratively.

Protocol 4. Multiplication by Using Homing Endonuclease/BstXI.

[0324] Reagents required: [0325] Homing endonucleases PI-SceI, I-CeuI [0326] 10.times. Buffers for homing endonucleases [0327] Restriction enzyme BstXI (and 10.times. Buffer) [0328] T4 DNA ligase (and 10.times. Buffer) [0329] E. coli competent cells [0330] Antibiotics

Step 2: Insert Preparation

[0331] Restriction reactions are carried out in 40 .mu.l reaction volumes, using homing endonucleases PI-SceI (Donors) or I-CeuI (Acceptors) as recommended by the supplier (e.g. New England Biolabs or others).

TABLE-US-00012
ACEMBL plasmid (.gtoreq.0.5 .mu.g) in ddH.sub.2O	32 .mu.l
10x Restriction enzyme buffer	4 .mu.l
10 mM BSA	2 .mu.l
PI-SceI (Donors) or I-CeuI (Acceptors)	2 .mu.l

[0332] Reactions are then purified by PCR extraction kit or acidic ethanol precipitation, and next digested by BstXI according to the recommendations of the supplier.

TABLE-US-00013
HE digested DNA in ddH.sub.2O	32 .mu.l
10x Restriction enzyme buffer	4 .mu.l
10 mM BSA	2 .mu.l
BstXI	2 .mu.l

Gel Extraction of Insert(s):

[0333] Processed insert is then purified by agarose gel extraction using commercial kits (Qiagen, Macherey-Nagel etc). It is recommended to elute the extracted DNA in the minimal volume defined by the respective manufacturer.

Step 3: Vector Preparation

[0334] Restriction reactions are carried out in 40 .mu.l reaction volumes, using homing endonucleases PI-SceI (Donors) or I-CeuI (Acceptors) as recommended by the supplier (e.g. New England Biolabs catalogue or others).

TABLE-US-00014
ACEMBL plasmid (.gtoreq.0.5 .mu.g) in ddH.sub.2O	33 .mu.l
10x Restriction enzyme buffer	4 .mu.l
10 mM BSA	2 .mu.l
PI-SceI (Donors) or I-CeuI (Acceptors)	1 .mu.l

[0335] Reactions are then purified by PCR extraction kit or acidic ethanol precipitation, and next treated with intestinal alkaline phosphatase according to the recommendations of the respective supplier.

TABLE-US-00015
HE digested DNA in ddH.sub.2O	17 .mu.l
10x Alkaline phosphatase buffer	2 .mu.l
Alkaline phosphatase	1 .mu.l

Gel Extraction of Vector:

[0336] Processed vector is then purified by agarose gel extraction using commercial kits (Qiagen, Macherey-Nagel etc). It is recommended to elute the extracted DNA in the minimal volume defined by the respective manufacturer.

Step 4: Ligation

[0337] Ligation reactions are carried out in 20 .mu.l reaction volumes:

TABLE-US-00016
HE/Phosphatase treated vector (gel extracted)	4 .mu.l
HE/BstXI treated insert (gel extracted)	14 .mu.l
10x T4 DNA Ligase buffer	2 .mu.l
T4 DNA Ligase	0.5 .mu.l

[0338] Ligation reactions are performed at 25.degree. C. for 1 h or at 16.degree. C. overnight.

Step 5: Transformation

[0339] Mixtures are next transformed into competent cells following standard transformation procedures.

[0340] Reactions for Acceptor derivatives are transformed into standard E. coli cells for cloning (such as TOP10, DH5.alpha., HB101) and after recovery plated on agar containing ampicillin (100 .mu.g/ml) or tetracycline (25 .mu.g/ml), respectively.

[0341] Reactions for Donor derivatives are transformed into E. coli expressing the pir gene (such as BW23473, BW23474, or PIR1 and PIR2, Invitrogen) and plated on agar containing chloramphenicol (25 .mu.g/ml, pDC, pIDC), kanamycin (50 .mu.g/ml, pDK, pIDK), or spectinomycin (50 .mu.g/ml, pDS, pIDS), respectively.

Step 6: Plasmid Analysis

[0342] Plasmids are cultured and correct clones selected based on specific restriction digestion and DNA sequencing of the inserts.

C.2. Cre-LoxP Reaction of Acceptors and Donors

[0343] Cre recombinase is a member of the integrase family (Type I topoisomerase from bacteriophage P1). It recombines a 34 bp loxP site (SEQ ID NO: 19; see FIG. 5) in the absence of accessory protein or auxiliary DNA sequence. The loxP site is comprised of two 13 bp recombinase-binding elements arranged as inverted repeats which flank an 8 bp central region where the cleavage and ligation reactions occur.

[0344] The site-specific recombination mediated by Cre recombinase involves the formation of a Holliday junction (HJ). The recombination events catalyzed by Cre recombinase are dependent on the location and relative orientation of the LoxP sites. Two DNA molecules, for example an Acceptor and a Donor plasmid, containing single LoxP sites will be fused. Furthermore, the Cre recombination is an equilibrium reaction with 15-20% efficiency in recombination. This creates useful options for multigene combinations for multiprotein complex expressions.

[0345] In a reaction where several DNA molecules such as Donors and Acceptors are incubated with Cre recombinase, the fusion/excision activity of the enzyme will result in an equilibrium state where single vectors (educt vectors) and all possible fusions coexist. Donor vectors can be used with Acceptors and/or Donors, likewise for Acceptor vectors. Higher order fusions are also generated where more than two vectors are fused. This is shown schematically in FIG. 6.

[0346] The fact that Donors of the present example contain a conditional origin of replication that depends on a pir.sup.+ (pir positive) background now allows for selecting out from this reaction mix all desired Acceptor-Donor(s) combinations. For this, the reaction mix is used to transform pir negative strains (TOP10, DH5.alpha., HB101 or other common laboratory cloning strains). Then, Donor vectors will act as suicide vectors when plated out on agar containing the antibiotic corresponding to the Donor encoded resistance marker, unless fused with an Acceptor. By using agar with the appropriate combinations of antibiotics, all desired Acceptor-Donor fusions can be selected for.
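The selection logic described above can be modelled in a few lines. The following Python sketch is a simplified illustration under stated assumptions, not a description of the actual reaction: each vector is assumed to occur at most once per fusion, the vector set is limited to pACE, pDC, pDK and pDS with the resistance markers named in this document, and the function names (`grows`, `candidate_species`, `survivors`) are hypothetical.

```python
from itertools import combinations

# Vector -> (role, resistance marker), as named in this document.
VECTORS = {
    "pACE": ("acceptor", "ampicillin"),
    "pDC": ("donor", "chloramphenicol"),
    "pDK": ("donor", "kanamycin"),
    "pDS": ("donor", "spectinomycin"),
}


def grows(fusion, antibiotics):
    """True if a pir-negative cell carrying `fusion` survives on `antibiotics`.

    A species replicates in a pir-negative strain only if it contains an
    Acceptor; it survives only if it carries a marker for every antibiotic.
    """
    has_acceptor = any(VECTORS[v][0] == "acceptor" for v in fusion)
    markers = {VECTORS[v][1] for v in fusion}
    return has_acceptor and set(antibiotics) <= markers


def candidate_species():
    """Yield every single vector and every fusion (each vector at most once)."""
    names = sorted(VECTORS)
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            yield frozenset(combo)


def survivors(antibiotics):
    """Return the sorted names of all species that grow on the given plate."""
    return sorted(
        "-".join(sorted(species))
        for species in candidate_species()
        if grows(species, antibiotics)
    )
```

In this model, plating on chloramphenicol alone leaves every Acceptor fusion that contains pDC, whereas plating on all four antibiotics leaves only the quadruple fusion, which illustrates why the antibiotic combination determines which fusion is recovered.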

[0347] In this way, fusion vectors of 25 kb and larger can be generated. In stability tests (serial passaging for more than 60 generations), even such large plasmids are stable as checked by restriction mapping, even if only one of the antibiotics corresponding to the encoded resistance markers was provided in the growth medium.

C.2.1. Cre-LoxP Fusion of Acceptors and Donors

[0348] The following protocol is designed for generating multigene fusions from Donors and Acceptors by Cre-LoxP reaction.

[0349] Reagents:
[0350] Cre recombinase
[0351] Standard E. coli competent cells (pir.sup.- strain)
[0352] Antibiotics
[0353] 96 well microtiter plates
[0354] 12 well tissue-culture plates (or petri dishes) w. agar/antibiotics
[0355] LB media

[0356] 1. For a 20 .mu.l Cre reaction, mix 1-2 .mu.g of each educt in approximately equal amounts (5' DNA termini). Add ddH.sub.2O to adjust the total volume to 16.about.17 .mu.l, then add 2 .mu.l 10.times.Cre buffer and 1.about.2 .mu.l Cre recombinase.

[0357] 2. Incubate the Cre reaction at 37.degree. C. (or 30.degree. C.) for 1 hour.

[0358] 3. Optional: load 2-5 .mu.l Cre reaction on an analytical agarose gel for examination. Heat inactivation at 70.degree. C. for 10 minutes before the gel loading is strongly recommended.

[0359] 4. For chemical transformation, mix 10-15 .mu.l Cre reaction with 200 .mu.l chemical competent cells, incubate the mixture on ice for 15-30 minutes. Then perform heat shock at 42.degree. C. for 45-60 s. [0360] Up to 20 .mu.l Cre reaction (0.1 volumes of the chemical competent cell suspension) can be directly transformed into 200 .mu.l chemical competent cells. [0361] For electrotransformation, up to 2 .mu.l Cre reaction could be directly mixed with 100 .mu.l electrocompetent cells, and transformed by using an electroporator (e.g. BIORAD E. coli Pulser) at 1.8-2.0 kV. [0362] Larger volumes of Cre reactions should be desalted by ethanol precipitation or PCR purification column before electrotransformation. The desalted Cre reaction mix does preferably not exceed 0.1 volumes of the electrocompetent cell suspension. [0363] The cell/DNA mixture could be immediately used for electrotransformation without prolonged incubation on ice.

[0364] 5. Add up to 400 .mu.l of LB media (or SOC media) per 100 .mu.l of cell/DNA suspension immediately after the transformation (heat shock or electroporation).

[0365] 6. Incubate the suspension in a 37.degree. C. shaking incubator overnight or for at least 4 hours. [0366] For recovering multifusion plasmids containing more than 2 resistance markers, it is strongly recommended to incubate the suspension at 37.degree. C. overnight.

[0367] 7. Plate out the recovered cell suspension on agar containing desired combination of antibiotics. Incubate at 37.degree. C. overnight.

[0368] 8. Colonies that emerge after overnight incubation can be verified directly by restriction digestion at this stage, by referring to steps 12-16 below. [0369] This is especially recommended if only one multifusion plasmid is desired.

[0370] For further selection by single antibiotic challenges on a 96 well microtiter plate, continue to step 9.

[0371] Several different multifusion plasmids can be processed and selected on one 96 well microtiter plate in parallel.

[0372] 9. For 96 well antibiotic tests, inoculate four colonies from each antibiotic agar plate into .about.500 .mu.l LB media without antibiotics. Incubate the cell cultures in a 37.degree. C. shaking incubator for 1-2 hours.

[0373] 10. During the incubation of colonies, fill a 96 well microtiter plate with 150 .mu.l antibiotic-containing LB media or coloured dye (positional marker) in corresponding wells. [0374] A typical arrangement of the solutions, which is used for parallel selections of multifusion plasmids, is shown in FIG. 7. The basic principle underlying this aspect of the present invention is that every cell suspension from single colonies needs to be challenged by all four single antibiotics.

[0375] 11. Add 1 .mu.l aliquots of pre-incubated cell cultures to the corresponding wells. Then incubate the inoculated 96 well microtiter plate in a 37.degree. C. shaking incubator overnight at 180-200 rpm. [0376] It is recommended to wrap the plate in parafilm. [0377] The remaining pre-incubated cell cultures can be kept in a 4.degree. C. fridge for further inoculations.

[0378] 12. Select transformants containing desired multifusion plasmids according to the combination of dense and clear cell cultures from each colony. Inoculate 10-20 .mu.l cell cultures into 10 ml LB media with corresponding antibiotics. Incubate in a 37.degree. C. shaking incubator overnight.

[0379] 13. Centrifuge the overnight cell cultures at 4000 g for 5-10 minutes. Purify cell pellets with plasmid miniprep kit according to manufacturers' information.

[0380] 14. Determine the concentrations of purified plasmid solutions by using UV absorption (e.g. NanoDrop.TM. 1000).

[0381] 15. Digest 0.5.about.1 .mu.g of the purified plasmid solution in a 20 .mu.l restriction digestion (with 5-10 unit endonuclease). Incubate under recommended reaction condition for .about.2 hours.

[0382] 16. Use 5-10 .mu.l of the digestion for analytical agarose gel (0.8-1.2%) electrophoresis. Verify the plasmid integrity by comparing the actual restriction pattern to the predicted restriction pattern in silico (e.g. by using VectorNTI).
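The read-out of the 96 well antibiotic challenge in steps 9 to 12 amounts to translating a pattern of dense and clear cultures into the resistance markers, and hence the vectors, present in a colony. The following Python sketch illustrates that decoding step only. It assumes that the four single antibiotics are ampicillin, chloramphenicol, kanamycin and spectinomycin, corresponding to pACE, pDC, pDK and pDS; the function names `infer_fusion` and `is_desired` are hypothetical.

```python
# Assumed mapping of the four single antibiotics to ACEMBL vectors.
MARKER_TO_VECTOR = {
    "ampicillin": "pACE",
    "chloramphenicol": "pDC",
    "kanamycin": "pDK",
    "spectinomycin": "pDS",
}


def infer_fusion(growth):
    """Translate a growth pattern into the vectors present in a colony.

    `growth` maps each antibiotic to True (dense culture) or False (clear).
    All four antibiotics must be scored, since every colony is challenged
    by each of them separately.
    """
    missing = set(MARKER_TO_VECTOR) - set(growth)
    if missing:
        raise ValueError("unscored antibiotics: %s" % ", ".join(sorted(missing)))
    return sorted(
        MARKER_TO_VECTOR[antibiotic]
        for antibiotic in MARKER_TO_VECTOR
        if growth[antibiotic]
    )


def is_desired(growth, desired_vectors):
    """True if the growth pattern corresponds exactly to the desired fusion."""
    return infer_fusion(growth) == sorted(desired_vectors)
```

A colony that grows in ampicillin and chloramphenicol but stays clear in kanamycin and spectinomycin would thus be scored as a pACE-pDC fusion; the restriction analysis of steps 13 to 16 remains necessary to confirm plasmid integrity.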

C.2.2. Deconstruction of Fusion Vectors by Cre

[0383] The following protocol can be used, for instance for the recovery of four single ACEMBL vectors (pACE, pDC, pDK, pDS) by deconstructing tetra-fused pACKS plasmid (pACE-pDC-pDK-pDS) which preferably forms part of the ACEMBL System kit (see below Section E of the present embodiment). Likewise, the protocol is suitable for releasing single educts from multifusion constructs. This is achieved by Cre-LoxP reaction, transformation and plating on agar with appropriately reduced antibiotic resistance level (FIG. 6). For the liberated educt, encoding genes can be modified and diversified. Then, the diversified construct is resupplied by Cre-LoxP reaction.

[0384] Reagents: [0385] Cre recombinase (and 10.times. Buffer) [0386] E. coli competent cells [0387] (pir.sup.+ strains, pir.sup.- strains could be used only when partially deconstructed Acceptor-Donor fusions are desired). [0388] Antibiotics

[0389] 1. For a 20 .mu.l De-Cre reaction, incubate .about.1 .mu.g multifusion plasmid with 2 .mu.l 10.times.Cre buffer and 1.about.2 .mu.l Cre recombinase; add ddH.sub.2O to adjust the total reaction volume to 20 .mu.l.

[0390] 2. Incubate the De-Cre reaction at 30.degree. C. for 1-4 hours.

[0391] 3. Optional: load 2-5 .mu.l De-Cre reaction on an analytical agarose gel for examination. [0392] Heat inactivation at 70.degree. C. for 10 minutes before the gel loading is recommended.

[0393] 4. For chemical transformation, mix 10-15 .mu.l De-Cre reaction with 200 .mu.l chemical competent cells. Incubate the mixture on ice for 15-30 minutes. Then perform heat shock at 42.degree. C. for 45-60 s. [0394] Up to 20 .mu.l De-Cre reaction (0.1 volumes of the chemical competent cell suspension) can be directly transformed into 200 .mu.l chemical competent cells. [0395] For electrotransformation, up to 2 .mu.l De-Cre reaction could be directly mixed with 100 .mu.l electrocompetent cells, and transformed by using an electroporator (e.g. BIORAD E. coli Pulser) at 1.8-2.0 kV. [0396] Larger volumes of De-Cre reaction should be desalted by ethanol precipitation or PCR purification column before electrotransformation. The desalted De-Cre reaction mix does preferably not exceed 0.1 volumes of the electrocompetent cell suspension. [0397] The cell/DNA mixture could be immediately used for electrotransformation without prolonged incubation on ice.

[0398] 5. Add up to 400 .mu.l of LB media (or SOC media) per 100 .mu.l of cell/DNA suspension immediately after the transformation (heat shock or electroporation).

[0399] 6. Incubate the suspension in a 37.degree. C. shaking incubator. [0400] For recovery of partially deconstructed double/triple fusions, incubate the suspension in a 37.degree. C. shaking incubator overnight or for at least 4 hours. [0401] For recovery of individual educts such as single ACEMBL vectors from pACKS plasmid, incubate the suspension in a 37.degree. C. incubator for 1-2 hours.

[0402] 7. Plate out the recovered cell suspension on agar containing desired combination of antibiotics, incubate at 37.degree. C. overnight.

[0403] 8. Colonies after overnight incubation can be verified directly by restriction digestion at this stage, by referring to steps 12-16. [0404] This is especially recommended if only one single educt or partially deconstructed multifusion plasmid is desired. [0405] For further selection by single antibiotic challenges on a 96 well microtiter plate, continue to step 9. [0406] Several different single educts/partially deconstructed multifusion plasmids can be processed and selected on one 96 well microtiter plate in parallel.

[0407] 9. For 96 well antibiotic tests, inoculate four colonies from each antibiotic agar plate into .about.500 .mu.l LB media without antibiotics, incubate the cell cultures in a 37.degree. C. shaking incubator for 1-2 hours.

[0408] 10. During the incubation of colonies, fill a 96 well microtiter plate with 150 .mu.l antibiotic-containing LB media or coloured dye (positional marker) in corresponding wells. [0409] Referring to FIG. 7 it is possible to provide a similar arrangement of the solutions, which is used for parallel selections of four different single educts/partially deconstructed multifusion plasmids. The underlying principle of the present aspect of the invention is that every cell suspension from single colonies is to be challenged by all four antibiotics separately.

[0410] 11. Add 1 .mu.l aliquots of pre-incubated cell cultures to the corresponding wells. Then incubate the inoculated 96 well microtiter plate in a 37.degree. C. shaking incubator overnight at 180-200 rpm. [0411] It is recommended to wrap the plate in parafilm. [0412] The remaining pre-incubated cell cultures can be kept in a 4.degree. C. fridge for further inoculations.

[0413] 12. Select transformants containing desired single educts/partially deconstructed multifusion plasmids according to the combination of dense and clear cell cultures from each colony. Inoculate 10-20 .mu.l cell cultures into 10 ml LB media with corresponding antibiotic(s). Incubate in a 37.degree. C. shaking incubator overnight.

[0414] 13. Centrifuge the overnight cell cultures at 4000 g for 5-10 minutes. Purify cell pellets with plasmid miniprep kit according to manufacturers' information.

[0415] 14. Determine the concentrations of purified plasmid solutions by using UV absorption (e.g. NanoDrop.TM. 1000).

[0416] 15. Digest 0.5-1 .mu.g of the purified plasmid solution in a 20 .mu.l restriction digestion (with 5-10 unit endonuclease). Incubate under recommended reaction condition for .about.2 hours.

[0417] 16. Use 5-10 .mu.l of the digestion for analytical agarose gel (0.8-1.2%) electrophoresis. Verify the plasmid integrity by comparing the actual restriction pattern to predicted restriction pattern in silico (e.g. by using VectorNTI).

[0418] 17. Optional: during recovery of all four single ACEMBL vectors from pACKS plasmid, if one or more single ACEMBL vectors fail to be liberated in one De-Cre reaction, one can pick partially deconstructed double/triple fusions containing the desired single ACEMBL vector(s) and perform a second De-Cre reaction (repeat steps 1-8). [0419] Typically, up to 2 sequential De-Cre reactions are sufficient to recover all four single ACEMBL vectors from pACKS plasmid, and the liberation of single educts from double/triple fusions could be much more efficient than from pACKS plasmid (quadruple fusion). The same principle also applies to the deconstruction of any other multifusion plasmid based on the ACEMBL system according to the present invention.

C.3. Coexpression in Bacteria by Cotransformation

[0420] Protein complexes can also be expressed from two separate vectors that were cotransformed in expression strains. The cotransformed vectors can have the same or different origins of replication; however, they must encode different resistance markers. Plasmids pACE (ampicillin resistance marker) and pACE2 (tetracycline resistance marker) both have a ColE1 derived replicon and can therefore be used with all common expression strains. pACE and pACE2 derivatives (also including fused Donors if needed) can be cotransformed into expression strains, and double transformants selected for by plating on agar plates containing both ampicillin and tetracycline antibiotics.

[0421] Transformations are carried out using standard transformation protocols (see, e.g. the latest edition of Ausubel et al. (ed.), supra).

D. Automation

[0422] As already outlined above, cloning and expression of multiple protein complexes using the nucleic acids, vectors and methods of the present invention is highly suited for automation equipment employing current robotic techniques.

[0423] In the following, general protocols are provided as exemplified for a Tecan Freedom EvoII 200 pipetting device. The pipetting device is typically equipped with liquid handling arm1 (LiHa1), 4 fixed tips (steel needles), 4 disposable tips coni (Diti's), 250 .mu.l syringes, liquid handling arm2 (LiHa2), 8 fixed tips (steel needles), 2.5 ml syringes, robotic manipulator arm (RoMa/transportation of plates), version long. The work station usually contains the following integrated devices: thermocycler PTC-200 (Biorad), Te-Shake, heatable plate shaker (Tecan), Variomag Thermoshaker, heat- and coolable plate shaker (Inheco), Te-Vacs, dual vacuum station for filter plates (Tecan), SafireII, UV VIS plate reader (Tecan) and cooling unit 400 W (FRYKA multistar).

D.1. Automated SLIC Process

[0424] A schematic representation of a workflow for automated SLIC is shown in FIG. 22.

Step 1: Initial PCR

[0425] Source plate: 96 well standard microtiter plate containing the PCR templates (cDNA approx. 0.2 .mu.g/.mu.l)

[0426] Reaction plate: 96 well PCR plate (Eppendorf)

[0427] Material: Sample mix plate (96 well PCR plate: Eppendorf), 1% agarose E-Gel.RTM. (Invitrogen), Phusion.RTM. DNA Polymerase master mix, oligonucleotide primers at 20 .mu.M, 2.times.DNA loading dye (2.times.DLD) (Fermentas), E-Gel.RTM. Low Range quantitative DNA Ladder (Invitrogen), 10.times. Buffer Tango.RTM. with BSA (Fermentas), DpnI (Fermentas)

[0428] PCR program:
[0429] 11.times.[98.degree. C. for 20 sec..fwdarw.60-50.degree. C. for 30 sec. (step down 1.degree. C. every 2.sup.nd cycle).fwdarw.72.degree. C. for 3 min.]
[0430] 19.times.[98.degree. C. for 20 sec..fwdarw.50.degree. C. for 30 sec..fwdarw.72.degree. C. for 3 min.]
[0431] 72.degree. C. for 3 min.
[0432] Hold at 10.degree. C.

[0433] DpnI digest program:
[0434] 37.degree. C. for 3 h
[0435] 10.degree. C. for 1 min

[0436] Procedure:
[0437] Wash tips.fwdarw.Pipet 89 .mu.l PCR master-mix into reaction plate
[0438] Wash tips.fwdarw.Pipet 1 .mu.l template DNA according to worklist
[0439] Wash tips.fwdarw.Pipet 5 .mu.l primer each to reaction plate
[0440] Wash tips.fwdarw.Run PCR program
[0441] Wash tips.fwdarw.Pipet 10 .mu.l 10.times. Buffer Tango.RTM. with BSA to reaction plate
[0442] Wash tips.fwdarw.Pipet 5 .mu.l DpnI to reaction plate
[0443] Wash tips.fwdarw.Run DpnI digest program
[0444] Wash tips.fwdarw.Pipet 10 .mu.l 2.times.DLD to each well of sample mix plate
[0445] Wash tips.fwdarw.Pipet 15 .mu.l DNA marker each to the E-gel marker slots
[0446] Wash tips.fwdarw.Pipet 10 .mu.l PCR product to 2.times.DLD on sample mix plate
[0447] Wash tips.fwdarw.Pipet 15 .mu.l sample mix to the E-Gel sample slots
[0448] Wash tips.fwdarw.Run E-Gel.RTM. for 25 min.
[0449] Assess results

Step 2: PCR Purification

[0450] Source plate: 96 well PCR plate (Eppendorf) with PCR samples

[0451] Target plate: 96 well microtiter elution plate (Macherey-Nagel)

[0452] Material: PCR purification Kit, NucleoSpin 96 Extract II Kit (Macherey-Nagel)

[0453] Procedure: According to manufacturer's information (http://www.macherey-nagel.com/tabid/10887/default.aspx)

Step 3: T4 DNA Polymerase Reaction

[0454] Source plate: 96 well microtiter elution plate (Macherey-Nagel)

[0455] Reaction plate: 96 well PCR plate (Eppendorf)

[0456] Material: bidest. water, 10× T4 DNA polymerase reaction buffer (Novagen), 100 mM DTT, 2 M Urea, T4 DNA polymerase (Novagen LIC qualified), 500 mM EDTA

[0457] Incubation program: 23 °C for 10 min. (program 1)
[0458] 75 °C for 20 min. (program 2)

[0459] Procedure:
[0460] Wash tips → Pipet 6 µl water into reaction plate
[0461] Wash tips → Pipet 2 µl 10× reaction buffer into reaction plate
[0462] Wash tips → Pipet 1 µl 100 mM DTT into reaction plate
[0463] Wash tips → Pipet 2 µl 2 M Urea into reaction plate
[0464] Wash tips → Pipet 8 µl DNA sample from prev. PCR into reaction plate
[0465] Wash tips → Pipet 0.5 µl T4 DNA polymerase into reaction plate
[0466] Wash tips → Run incubation program 1
[0467] Wash tips → Pipet 1 µl 500 mM EDTA into reaction plate
[0468] Wash tips → Run incubation program 2
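
The final concentrations of the additives in the T4 DNA polymerase reaction follow from the pipetted volumes by the dilution relation C1·V1 = C2·V2. The sketch below applies this relation to the volumes listed in the procedure; the helper name final_mm is hypothetical, and the calculation ignores any DTT or EDTA already present in the 10× buffer or the DNA eluate.

```python
# Volumes (µl) transcribed from the Step 3 procedure; names are illustrative.
T4_MIX_UL = {
    "water": 6.0,
    "10x reaction buffer": 2.0,
    "100 mM DTT": 1.0,
    "2 M urea": 2.0,
    "DNA sample": 8.0,
    "T4 DNA polymerase": 0.5,
}
EDTA_UL = 1.0  # added after incubation program 1 to stop the reaction


def final_mm(stock_mm: float, added_ul: float, total_ul: float) -> float:
    """Dilution relation C1*V1 = C2*V2, solved for the final concentration."""
    return stock_mm * added_ul / total_ul


reaction_ul = sum(T4_MIX_UL.values())   # volume during program 1
stopped_ul = reaction_ul + EDTA_UL      # volume during program 2

dtt_mm = final_mm(100.0, T4_MIX_UL["100 mM DTT"], reaction_ul)
urea_mm = final_mm(2000.0, T4_MIX_UL["2 M urea"], reaction_ul)
edta_mm = final_mm(500.0, EDTA_UL, stopped_ul)

if __name__ == "__main__":
    print(f"Reaction volume: {reaction_ul} µl, after EDTA: {stopped_ul} µl")
    print(f"DTT {dtt_mm:.1f} mM, urea {urea_mm:.0f} mM, EDTA {edta_mm:.1f} mM")
```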

Step 4: Annealing

[0469] Source plate: Reaction plate from T4 DNA polymerase reaction

[0470] Reaction plate: 96 well PCR plate (Eppendorf)

[0471] Material: bidest. water, 10× DNA Ligase Reaction Buffer (NEB), linearised vector

[0472] Incubation program: 65 °C for 8 min. → ramp down 0.4 °C/min. to 35 °C → 10 °C for 1 min.
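
The duration of the annealing program is dominated by the slow ramp. As a rough sketch, assuming the stated ramp rate is held constant and neglecting the unspecified cool-down from 35 °C to 10 °C, the run time can be estimated as follows (the function name and its parameters are illustrative):

```python
def annealing_minutes(
    hold_min: float = 8.0,         # hold at 65 degrees C
    start_c: float = 65.0,
    end_c: float = 35.0,
    rate_c_per_min: float = 0.4,   # ramp-down rate
    final_hold_min: float = 1.0,   # hold at 10 degrees C
) -> float:
    """Estimate program duration: initial hold + linear ramp + final hold."""
    ramp_min = (start_c - end_c) / rate_c_per_min
    return hold_min + ramp_min + final_hold_min


if __name__ == "__main__":
    print(f"Estimated annealing program duration: {annealing_minutes():.0f} min")
```

The ramp alone takes 75 minutes, so the annealing step occupies the thermocycler for about an hour and a half per plate.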

[0473] Procedure:
[0474] Wash tips → Pipet 150 ng T4 DNA polymerase treated insert DNA according to worklist into reaction plate
[0475] Wash tips → Pipet 150 ng linearised vector DNA according to worklist into reaction plate
[0476] Wash tips → Run incubation program

Step 5: Transformation in E. coli

[0477] Source plate: Reaction plate from the annealing step

[0478] Reaction plate: 96 well PCR plate (Eppendorf)

[0479] Culture plate: 2 ml 96 well plate (Nunc)

[0480] Target plates: 12 well cell culture plates containing 2 ml of LB-agar with appropriate antibiotics (standard concentrations used: Ampicillin 100 µg/ml, Kanamycin 50 µg/ml, Spectinomycin 50 µg/ml, Chloramphenicol 30 µg/ml)

[0481] Material: E. coli cells (XL1blue) that are chemically competent for transformation, SOC-medium

[0482] Transformation program: Heat thermocycler to 42 °C
[0483] Incubate at 42 °C for 30 sec.
[0484] Transfer immediately to cooled (0 °C) pipetting carrier

[0485] Procedure:
[0486] Wash tips → Pipet 100 µl competent E. coli cells into reaction plate
[0487] Wash tips → Pipet 10 µl DNA sample from annealing step into reaction plate
[0488] Wash tips → Incubate at 0 °C for 30 min.
[0489] Run transformation program
[0490] Incubate at 0 °C for 5 min.
[0491] Wash tips → Pipet 250 µl SOC-medium into culture plate
[0492] Wash tips → Transfer transformation mix into culture plate
[0493] Incubate at 37 °C and 720 rpm (Te-Shake Shaker) for 2 h
[0494] Wash tips → Pipet 50 µl culture into target plate (agar plate)
[0495] Wash tips → Shake target plate at 12 Hz for 1 min. (plating out)
[0496] Incubate target plates over night at 37 °C

Step 6: Picking Clones and Setting Up Over Night Cultures (Manual Step)

[0497] Source plate: 12 well cell culture plates containing E. coli colonies

[0498] Target plate: 24 well culture plate

[0499] Material: 2×TY culture medium, incubator which carries culture plates

[0500] Procedure: Pick 4 colonies per reaction and transfer to 3 ml 2×TY medium in a 24 well culture plate. Incubate at 37 °C and approx. 220 rpm over night.

Step 7: Plasmid Extraction (Miniprep)

[0501] Source plate: 24 well culture plate (usually 3 ml culture)

[0502] Target plate: 96 well microtiter elution plate (Macherey-Nagel)

[0503] Material: Plasmid extraction kit, NucleoSpin Robot 96 Plasmid Kit (Macherey-Nagel)

[0504] Procedure: According to manufacturer (see http://www.machereynagel.com/tabid/10885/default.aspx)

Step 8: Assessment

[0505] Plasmid yield is quantified by measuring UV absorbance with a Thermo Scientific NanoDrop™ 1000 Spectrophotometer according to the manufacturer's instructions. Plasmid integrity is assessed by E-gel (Invitrogen).
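
Plasmid concentration is commonly derived from the UV reading with the standard conversion that an absorbance of 1.0 at 260 nm (10 mm path length) corresponds to approximately 50 ng/µl of double-stranded DNA. The sketch below applies this general conversion; it is not taken from the instrument software, the function names are hypothetical, and the example readings are invented for illustration.

```python
DSDNA_NG_PER_UL_PER_A260 = 50.0  # standard conversion factor for dsDNA


def dsdna_ng_per_ul(a260: float, dilution_factor: float = 1.0) -> float:
    """Convert a 10 mm path length A260 reading into ng/µl of dsDNA."""
    return a260 * DSDNA_NG_PER_UL_PER_A260 * dilution_factor


def plasmid_yield_ug(a260: float, elution_volume_ul: float) -> float:
    """Total plasmid yield in µg for a given elution volume."""
    return dsdna_ng_per_ul(a260) * elution_volume_ul / 1000.0


if __name__ == "__main__":
    # Hypothetical reading and elution volume, for illustration only.
    print(f"{dsdna_ng_per_ul(2.0):.0f} ng/µl")
    print(f"{plasmid_yield_ug(2.0, 100.0):.1f} µg total")
```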

[0506] The efficacy of the SLIC protocol is assessed in manual and robotic mode. The results of the comparison are shown in Table II. Results are based on a set of 25 different Donor/Acceptor constructs.

TABLE-US-00017 TABLE II Comparison Manual versus Robotic SLIC procedure (based on 25 constructs each)
Parameter | Manual | EvoII
DNA used for T4 reaction | 200-400 ng insert, 200-400 ng vector | 400-800 ng insert, 400-800 ng vector
T4 reaction volume for transformation | 5 µl: 2.5 µl (insert) + 2.5 µl (vector) | 5 µl: 2.5 µl (insert) + 2.5 µl (vector)
Volume comp. cells (XL1Blue, chem. comp.) | 100 µl (+300 µl SOC) | 100 µl (+300 µl SOC)
Volume plated | 200 µl (Petri dish) | 50 µl/well (12 well plate); 200 µl (Petri dish)
Clones obtained | 200->2000 (Petri dish) | 25-250 (12 well plate); 70-5300 (Petri dish)

D.2 Automated Cre Fusion Process

[0507] A schematic representation of a workflow for automated Cre fusion is shown in FIG. 23.

Step 1: Cre-LoxP Plasmid Fusion Reaction

[0508] Source plate: 96 well microtiter elution plate from the plasmid extraction process containing plasmids suitable for Cre-Lox fusion

[0509] Reaction plate: 96 well PCR plate (Eppendorf)

[0510] Material: bidest. water, 10× Cre reaction buffer (NEB), Cre recombinase (NEB)

[0511] Incubation program: 37 °C for 1 h → 10 °C for 1 min.

[0512] Procedure:
[0513] Wash tips → Pipet 6 µl bidest. water into reaction plate
[0514] Wash tips → Pipet 2 µl 10× Cre reaction buffer into reaction plate
[0515] Wash tips → Pipet plasmid DNA suitable for Cre recombination according to worklist into reaction plate
[0516] Wash tips → Pipet 2 µl Cre recombinase into reaction plate
[0517] Wash tips → Run incubation program
[0518] Total reaction volume: 20 µl

Step 2, 3 and 4: Transformation in E. coli and Plasmid Extraction

[0519] Identical to the method described in Section D.1. above, with the exception that the reaction plate from the Cre recombination step is used as source plate and the recovery time in SOC-medium is prolonged to a total of 4 h. Chemically competent Mach1 cells are used for transformation. For Cre reactions with 3 and 4 vectors, agar plates with half of the antibiotic concentration (standard concentrations used: Ampicillin 100 µg/ml, Kanamycin 50 µg/ml, Spectinomycin 50 µg/ml, Chloramphenicol 30 µg/ml) are used.
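
The rule for antibiotic concentrations on the selection plates can be written as a small lookup. The following sketch encodes the standard concentrations stated above and halves them for fusions of three or four vectors; the function name and the way a fusion is described (a list of antibiotics plus a vector count) are illustrative choices, not part of the described workstation software.

```python
# Standard plate concentrations in µg/ml, as stated in the protocol.
STANDARD_UG_PER_ML = {
    "ampicillin": 100.0,
    "kanamycin": 50.0,
    "spectinomycin": 50.0,
    "chloramphenicol": 30.0,
}


def plate_concentrations(antibiotics: list[str], n_vectors: int) -> dict[str, float]:
    """Return µg/ml per antibiotic; halved for fusions of 3 or 4 vectors."""
    factor = 0.5 if n_vectors >= 3 else 1.0
    return {name: STANDARD_UG_PER_ML[name] * factor for name in antibiotics}


if __name__ == "__main__":
    print(plate_concentrations(["ampicillin", "chloramphenicol"], n_vectors=2))
    print(plate_concentrations(
        ["ampicillin", "chloramphenicol", "kanamycin"], n_vectors=3))
```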

Step 5: Assessment

[0520] Plasmid fusion yield is quantified by measuring UV absorbance with a Thermo Scientific NanoDrop™ 1000 Spectrophotometer according to the manufacturer's instructions. Plasmid integrity is assessed by E-gel (Invitrogen) of undigested and digested samples. Suitable restriction sites that yield a digestion pattern characteristic for the respective fusions are identified by using Vector NTI (Invitrogen) and used for restriction mapping.

[0521] The efficacy of the Cre reaction is tested by performing a series of fusion reactions, each in triplicate, by using the EvoII liquid handling workstation. The results are summarized in Table III.

TABLE-US-00018 TABLE III Efficiency of Cre-LoxP Reactions on EvoII (assessed in triplicate for each reaction)
Volume Cre-reaction used for transformation (all reactions): 10 µl
Volume chem. comp. cells (XL1Blue, Mach1) per transformation (all reactions): 100 µl (+300 µl SOC)
Volume transformation reaction plated: 50 µl/well (12 well plate); 200 µl (petri dish)
Clones obtained:
(a) Double vector fusion reaction (AD, one Acceptor, one Donor): >1000 fused functional AD plasmids plated on a standard petri dish containing the respective two antibiotics
(b) Triple vector fusion reaction (ADD, one Acceptor, two Donors): 12-80 fused functional ADD plasmids plated on a standard petri dish containing the respective three antibiotics
(c) Quadruple vector fusion reaction (ADDD, one Acceptor, three Donors): two possibilities exist.
(1) Single reaction ADDD (four vector Cre-Lox fusion, low efficiency)
(2) Two step reaction ADD + D. Triple fusion as in (b), then addition of a further Donor.
Option 2 (ADD + D) is preferred for routine robot use as it represents a more robust approach, resulting in example experiments in 20-100 fused functional ADDD plasmids when plated on a standard petri dish containing all four antibiotics.

D.3. High-Throughput Micro Batch IMAC

[0522] Source plate: 2 ml deepwell plate (Eppendorf)

[0523] Filter plate: Glass filter plate (Novagen)

[0524] Target plate: standard microtiter plate (Greiner)

[0525] Material: Ni-NTA bulk beads 50% in 20% ethanol (GE Healthcare), freezer at -20 °C, tabletop centrifuge suitable for microtiter plates, sonication device with microtip, IMAC binding and elution buffer suitable for the specific protein (Berrow et al., Acta Cryst. (2006). D62, 1218-1226).

[0526] Procedure:

[0527] Sample Preparation (Off Line)
[0528] Harvest E. coli cells expressing the desired protein by centrifugation at 3000 g (4 °C) directly in the source plate
[0529] Freeze cell pellets for 30 min. at -20 °C
[0530] Thaw cell pellets 15 min. at room temperature

[0531] Preparation of the Filter Plate
[0532] Wash tips → Resuspend Ni-NTA bead suspension by pipetting up and down
[0533] 20 times 200 µl → Transfer 200 µl bead suspension to filter plate
[0534] Wash tips → Apply vacuum 550 mbar for 30 sec. (remove 20% ethanol)
[0535] Wash tips → Pipet 1 ml equilibration buffer (e.g. binding buffer) to resin
[0536] Wash tips → Apply vacuum 300 mbar for 60 sec. (equilibration)

[0537] IMAC Purification, Preparation
[0538] Wash tips → Pipet 1 ml binding buffer to the samples in the source plate
[0539] Wash tips → Resuspend cell pellets by pipetting up and down 10 times 750 µl
[0540] Wash tips

[0541] Sonication of Samples (Off Line)
[0542] Sonication of the samples to ensure complete lysis of the cells

[0543] IMAC Purification, Loading and Elution
[0544] Wash tips → Transfer whole lysate to filter plate
[0545] Wash tips → Apply vacuum 300 mbar for 80 sec. (binding step)
[0546] Wash tips → Pipet 1 ml wash buffer to the samples
[0547] Wash tips → Apply vacuum 300 mbar for 90 sec. (wash step)
[0548] Repeat wash step 3 times
[0549] Wash tips → Pipet 100 µl elution buffer to the samples
[0550] Wash tips → Incubate 3 min. at room temperature
[0551] Apply vacuum 350 mbar for 90 sec. (elution step)
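
The buffer volumes of the micro batch protocol can be related to the amount of resin in each well. The sketch below assumes that the "50%" bead suspension means 50% settled resin by volume, which is the usual convention but is not stated explicitly above; the names are illustrative.

```python
SLURRY_UL = 200.0       # bead suspension transferred per well
RESIN_FRACTION = 0.5    # assumed: 50% (v/v) settled resin in the slurry

BED_UL = SLURRY_UL * RESIN_FRACTION  # settled resin per well


def bed_volumes(buffer_ul: float) -> float:
    """Express a buffer volume as multiples of the resin bed volume."""
    return buffer_ul / BED_UL


if __name__ == "__main__":
    print(f"Resin bed per well: {BED_UL:.0f} µl")
    print(f"Equilibration (1 ml): {bed_volumes(1000.0):.0f} bed volumes")
    print(f"Each wash (1 ml): {bed_volumes(1000.0):.0f} bed volumes")
    print(f"Elution (100 µl): {bed_volumes(100.0):.0f} bed volume")
```

Under this assumption each well holds about 100 µl of resin, so equilibration and each wash use ten bed volumes, and the eluate is recovered in a single bed volume.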

Assessment

[0552] Eluted samples (10 µl-12 µl) are loaded manually on 12% denaturing gels using a Biorad Minigel System, pre-run at 135 V for 25 min, and then run for 65-70 min at 185 V. Gels are stained with Coomassie Brilliant Blue according to standard procedures.

E. ACEMBL Kit for Expression of Proteins in Prokaryotic Hosts

[0553] A kit according to a preferred embodiment for expression in prokaryotic hosts contains:
[0554] BW23473, BW23474 cells¹ and/or Cre recombinase
[0555] pACKS quadruple fusion vector²
[0556] made of pACE (Acceptor), and pDC, pDK, pDS (Donors)
[0557] pACE2 vector
[0558] pACE-[VHLbc/BFP/mGFP] control plasmid
[0559] triple fusion vector made of pACE-VHLbc, pDK-BFP, pDS-mGFP³
¹ E. coli strains expressing the pir gene for propagation of Donor derivatives (any other strain with pir+ background can be used).
² This fusion vector was created by Cre-LoxP reaction of pACE, pDC, pDK and pDS. It is resistant to ampicillin, kanamycin, chloramphenicol and spectinomycin. Individual ACEMBL vectors can be liberated from this quadruple fusion by Cre-LoxP mediated deconstruction as described above in protocol C.2.2. Sequences for single ACEMBL vectors according to the present embodiment and pACKS quadruple fusion are provided in SEQ ID NO: 2 to 7.
³ pDS-mGFP contains a coiled-coil fused to the N-terminus of eGFP (see Berger et al. (2003) Proc. Natl. Acad. Sci. USA 100, 12177-82).

[0560] Optional Components:
[0561] Antibiotics: ampicillin, chloramphenicol, kanamycin, spectinomycin, tetracycline
[0562] Enzymes:
[0563] T4 DNA polymerase (for recombination insertion of genes)
[0564] Phusion polymerase (for PCR amplification of DNA)
[0565] Restriction enzymes and T4 DNA ligase (for conventional cloning, if desired)

[0566] The present invention is further illustrated by the following non-limiting examples.

EXAMPLES

[0567] Examples of multiprotein expression using the above-described ACEMBL system are shown in the following, illustrating the gene combination procedures outlined above. Reactions presented were carried out either manually following the protocols provided in Section C above, or on a Tecan Freedom EvoII 200 robot with adapted protocols according to Section D above.

Example 1: SLIC Cloning into ACEMBL Vectors: Human TFIIF

[0568] Genes coding for full-length RAP74 with a C-terminal oligo-histidine tag and full-length human RAP30 were amplified from a pET-based plasmid template (Gaiser et al. (2000) J. Mol. Biol. 302, 1119-1127) by using the primer pair T7InsFor (5'-TCCCGCGAAATTAATACGACTCACTAGGG-3'; SEQ ID NO: 20) and T7InsRev (5'-CCTCAAGACCCGTTTAGAGGCCCCAAGGGGTTATGCTAG-3'; SEQ ID NO: 21) following the protocols described above. Linearized vector backbones were generated by PCR amplification from pACE and pDC by using the primer pair T7VecFor (5'-CTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGG-3'; SEQ ID NO: 22) and T7VecRev (5'-CCCTATAGTGAGTCGTATTAATTTCGCGGGA-3'; SEQ ID NO: 23) in both cases. Protocol 1 (Section C above) was followed, resulting in pACE-RAP30 and pDC-RAP74his (FIG. 8). These plasmids were fused by Cre-LoxP reaction (see Section C above). Results from restriction mapping by BstZ17I/BamHI double digestion of 11 double resistant (Cm, Ap) colonies are shown by a gel section from 1% E-gel electrophoresis (M: NEB 1 kb DNA marker) in FIG. 8. All clones tested showed the expected pattern (5.0+2.8 kb). One clone was transformed into BL21(DE3) cells. Expression and purification by Ni²⁺-capture and S200 chromatography resulted in human TFIIF complex (FIG. 21A).
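
SLIC relies on the insert and the linearized vector sharing identical terminal sequences, which is why the insert and vector primers are designed as reverse complements of one another. The following sketch checks this relationship for the primer pair T7InsRev/T7VecFor, using the sequences exactly as printed above; the helper function reverse_complement is written for this illustration and is not part of any described software.

```python
# Translation table for the four DNA bases (upper case only).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence given 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences as printed in Example 1 (SEQ ID NO: 21 and 22).
T7_INS_REV = "CCTCAAGACCCGTTTAGAGGCCCCAAGGGGTTATGCTAG"
T7_VEC_FOR = "CTAGCATAACCCCTTGGGGCCTCTAAACGGGTCTTGAGG"

if __name__ == "__main__":
    matches = reverse_complement(T7_VEC_FOR) == T7_INS_REV
    print(f"T7VecFor is the reverse complement of T7InsRev: {matches}")
    print(f"Shared homology region: {len(T7_INS_REV)} nt")
```

Because the two primers are exact reverse complements, the corresponding ends of the insert and vector PCR products carry the same 39 nt sequence, which T4 DNA polymerase treatment then exposes as complementary single strands for annealing.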

[0569] The high-level soluble expression of full-length human TFIIF (FIG. 21A) is noteworthy, as individual expression of the subunits invariably leads to insoluble material. In the past, crystal structure analysis of the human TFIIF dimerization domain had necessitated many iterative cycles of limited proteolysis, recloning, insoluble expression of the designed fragments and co-refolding (Gaiser et al. (2000), supra). Similar laborious situations are commonplace in prior art protein complex research. It is conceivable that the large investment of labor involved can now be significantly reduced using the nucleic acids and vectors of the present invention, in particular the ACEMBL system.

Example 2: Polycistron Insertion by SLIC: Human VHL/Elongin b/Elongin c Complex

[0570] The gene encoding Von Hippel-Lindau protein (amino acids 54-213), fused at its N-terminus to a six-histidine-thioredoxin fusion tag, was PCR amplified from plasmid pET3-HisTrxVHL by using primers T7InsFor (see Table I above) and SmaBamVHL (5'-GAATTCACTGGCCGTCGTTTTACAGGATCCTTAATCTCCCATCCGTTGATGTGCAATG-3'; SEQ ID NO: 45). The SmaBamVHL primer is a derivative of the SmaBam adaptor sequence (Table I; SEQ ID NO: 17) elongated at its 3' end by the insert specific sequence at the 3' end of the VHL gene (including a stop codon). The gene encoding full-length elongin b was PCR amplified from pET3-ElonginB by using primers BamSmaEB (5'-GGATCCTGTAAAACGACGGCGAGTGAATTCGCTAGCTCTAGAAATAATTTTGTTTAAC-3'; SEQ ID NO: 46) and SacHindEB (5'-GAGCTCGACTGGGAAAACCCTGGCGAAGCTTAGATCTGGATCCTTACTGCACGGCTTGTTCATTGG-3'; SEQ ID NO: 47), which are derivatives of the corresponding adaptors (Table I). The gene for elongin c (amino acids 17-112) was amplified from pET3-ElonginC by using primers HindSacEC (5'-AAGCTTCGCCAGGGTTTTCCCAGTCGAGCTCCAATTGGAATTCGCTAGCTCTAG-3'; SEQ ID NO: 48) and BspEco5EC (5'-GATCCGGATGTGAAATTGTTATCCGCTGGTACCAAGCTTAGATCTGGATCCTTAACAATCTAAGAAG-3'; SEQ ID NO: 49), which are derivatives of the corresponding adaptors (Table I). The vector backbone was PCR amplified by using primers Tn7VecRev and Eco5Bsp, and pACE as a template (FIG. 9). Multifragment SLIC was carried out according to Protocol 2 (Section C above), resulting in pACE-VHLbc, which contains a tricistron. Clones were plated on agar plates containing ampicillin. A positive clone, verified by sequencing, was used in the coexpression experiment described below (Example 5).

Example 3: The Homing Endonuclease/BstXI Module: Yeast RES Complex

[0571] Plasmids pCDFDuet-Pml1p, pRSFDuet-Snu17p-NHis and pETDuet-Bud13p, coding for yeast proteins (all full-length) Pml1p, Snu17p and Bud13p, respectively, were provided by Dr. Simon Trowitzsch and Dr. Markus Wahl (Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany). Snu17p contains a six-histidine tag fused to its N-terminus. The gene encoding His6-tagged Snu17p was excised from pRSFDuet-Snu17p-NHis by using restriction enzymes NcoI and XhoI, and ligated into a NcoI/XhoI digested pACE construct (containing an unrelated gene between the NcoI and XhoI sites), resulting in pACE-Snu17. The gene encoding Bud13p was liberated from pETDuet-Bud13p by restriction digestion with XbaI and EcoRV, and placed into XbaI/PmeI digested pDC, resulting in pDC-Bud13. The gene encoding Pml1p was liberated from pCDFDuet-Pml1p by restriction digestion with NdeI and XhoI, and placed into NdeI/XhoI digested pDC, resulting in pDC-Pml1. Next, the expression cassette for Bud13p was liberated from pDC-Bud13 by digestion with PI-SceI and BstXI. The liberated fragment was inserted into PI-SceI digested and alkaline phosphatase treated pDC-Pml1p, resulting in pDC-Bud13p-Pml1p.

[0572] pACE-Snu17 and pDC-BudPml were fused by Cre-LoxP reaction and selected for by plating on agar plates containing ampicillin and chloramphenicol. Fusion plasmids were transformed into BL21(DE3) cells. Expression and purification by Ni²⁺-capture and S200 size exclusion chromatography resulted in the trimeric RES complex.

[0573] The strategy for cloning the yeast RES complex according to the method of the present invention is schematically illustrated in FIG. 10.

Example 4: Coexpression by Cotransformation: Human NYB/NYC

[0574] Genes encoding proteins NYB (amino acids 49-141) and NYC (amino acids 27-12) were excised from vectors pACYC18411-NYB and pET15-NYC, respectively (Romier et al. (2003) J. Biol. Chem. 278, 1336-1345). NdeI and BamHI were used for NFYB. XbaI and BamHI were used for NYC, thus importing a six-histidine tag at the N-terminus of the protein. The NYB insert was ligated into pACE digested with NdeI and BamHI. The NYC insert was ligated into pACE2 digested with XbaI and BamHI. pACE-NFYB and pACE2-NFYC were transformed into BL21(DE3) cells containing the pLysS plasmid. Selection on agar plates containing ampicillin, tetracycline and chloramphenicol resulted in triple resistant colonies. The complex was expressed and purified by Ni²⁺ capture (IMAC) and S75HR (Pharmacia) size exclusion chromatography.

Example 5: Coexpression from Acceptor-Donor Fusions

[0575] Six heterologous genes, namely three coding for a trimeric protein complex (VHLbc: Von Hippel-Lindau protein amino acids 54-213/full-length elonginB/elonginC amino acids 17-112) (Stebbins et al. (1999) Science 284, 455-61), one encoding the AAA ATPase FtsH (amino acids 147-610), and two encoding fluorescent markers (BFP and GFP), were assembled as illustrated in FIG. 20. In a single Cre reaction, all combinations of one Acceptor (pACE-VHLbc) and three Donors (pDC-FtsH, pDK-BFP, pDS-mGFP) were obtained and selected, including a quadruple fusion containing all six heterologous genes (see FIG. 20). Clones were verified by 96 well microtiter assay as described above for the ACEMBL system in Section C. Expression and Ni²⁺ affinity capture, combined with immunostaining of the untagged fluorescent markers, confirmed successful multiprotein expression (FIGS. 16 and 17B). Proteins were expressed overnight in BL21(DE3) cells in 24 well deep-well plates in small scale using autoinduction media (Studier (2005) Protein Expr. Purif. 41, 207-34). Restriction mapping revealed that even large fusion plasmids were stable over many (more than 60) generations, even when challenged by only a single antibiotic in the medium.
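
The set of Acceptor-Donor fusions obtainable from one Acceptor and three Donors can be enumerated, together with the antibiotic combination that selects each fusion. The sketch below is illustrative only. The resistance assignment (pACE: ampicillin, pDC: chloramphenicol, pDK: kanamycin, pDS: spectinomycin) is an assumption inferred from the resistances reported for the pACE/pDC fusion and the pACKS quadruple fusion, and the function name is hypothetical.

```python
from itertools import combinations

ACCEPTOR = "pACE-VHLbc"
DONORS = ["pDC-FtsH", "pDK-BFP", "pDS-mGFP"]

# Assumed resistance marker per vector backbone (see lead-in).
RESISTANCE = {
    "pACE": "ampicillin",
    "pDC": "chloramphenicol",
    "pDK": "kanamycin",
    "pDS": "spectinomycin",
}


def acceptor_donor_fusions(acceptor: str, donors: list[str]) -> list[tuple[str, ...]]:
    """List every fusion of the Acceptor with one or more distinct Donors."""
    fusions = []
    for n_donors in range(1, len(donors) + 1):
        for subset in combinations(donors, n_donors):
            fusions.append((acceptor, *subset))
    return fusions


def selection(fusion: tuple[str, ...]) -> list[str]:
    """Antibiotics needed to select a fusion, one per backbone it contains."""
    return [RESISTANCE[plasmid.split("-")[0]] for plasmid in fusion]


if __name__ == "__main__":
    for fusion in acceptor_donor_fusions(ACCEPTOR, DONORS):
        print(" + ".join(fusion), "->", ", ".join(selection(fusion)))
```

For three Donors this yields seven fusions: three double, three triple and one quadruple fusion, each selectable by its own combination of antibiotics.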

Example 6: Expression of the YidC-SecYEGDF Holotranslocon

[0576] As illustrated in FIG. 21, the ACEMBL system was used to produce a large multiprotein complex, the YidC-SecYEGDF holotranslocon, which contains a total of 33 transmembrane helices. This machinery is used to transport unfolded polypeptides into the cell membrane or for translocation into the periplasm of bacteria (Duong et al. (1997) EMBO J. 16, 2757-68).

Example 7: Expression of Human IKK Complex in Insect Cells

[0577] Following the protocols for single gene insertion into ACEMBL vectors as outlined above in Section C.1., the genes for IKK1 (also called IKKalpha), IKK2 (also called IKKbeta) and IKK3 (also called Nemo) were cloned into pACEBac1, pIDC and pIDS respectively (maps of the resulting plasmids pACEBac1-HisIKK1, pIDC-CSIKK2 and pIDS-IKK3 are shown in FIGS. 46, 47 and 48, respectively). IKK1-2 double fusion (pACEBac1-HisIKK1 with pIDC-CSIKK2) and IKK1-2-3 triple fusions (all three vectors) were created by Cre-LoxP fusions as outlined above in Section C.2. The fusions were introduced into suitable host cells carrying a baculovirus genome (EMBac) as a bacterial artificial chromosome. The vector fusions were integrated into the baculoviral genome via Tn7 transposition. Productive integration was assessed by blue/white screening. DNA of composite virus was prepared from white clones and transfected into Sf21 cells.

Example 8: Expression of a H1N1-Influenza Virus-Like Particle

[0578] A virus-like particle (VLP) of the swine-flu virus (influenza virus of type H1N1) comprising the proteins HA, NA, M1 and M2 was expressed in insect cells (Sf21) by the following strategy: genes coding for HA and NA were cloned into pACEBac1 by single gene insertion as outlined above in Section C.1. The same procedure was followed for cloning the genes coding for M1 and M2 into pIDC. Double expression cassettes for HA-NA and M1-M2, respectively, were generated by using the HE-BstXI sites in the respective MIE (see Section C.1.4. above), resulting in plasmids pACEBac-HA-NA (for the plasmid map see FIG. 49) and pIDC-M1-M2 (for the plasmid map see FIG. 50). The vector encoding the complete H1N1-influenza-VLP was generated by Cre-LoxP fusion of pACEBac-HA-NA with pIDC-M1-M2 following the protocol in Section C.2. above. The fusion vector was introduced into suitable host cells carrying a baculovirus genome (EMBac) as a bacterial artificial chromosome. The vector fusion was integrated into the baculoviral genome via Tn7 transposition. Productive integration was assessed by blue/white screening. DNA of composite virus was prepared from white clones and transfected into Sf21 cells.

INCORPORATION OF SEQUENCE LISTING

[0579] A paper copy of a compliant sequence listing, submitted on Mar. 5, 2012 in connection with U.S. application Ser. No. 13/254,831 filed by the same applicant as the present application, and an identical compliant computer readable form of the sequence listing, submitted on Mar. 5, 2012 in connection with U.S. application Ser. No. 13/254,831 filed by the same applicant as the present application, are incorporated herein by reference.

[0580] Applicant hereby requests the use of the compliant computer readable sequence listing that is already on file for U.S. application Ser. No. 13/254,831 (in connection with which the compliant sequence listing and CRF were submitted on Mar. 5, 2012). The paper copy of the sequence listing submitted with the present application is identical to the computer readable copy filed for the other application.

Sequence CWU 1

1

541210DNAArtificialMultiple integration element 1gggaattgtg agcggataac aattcccctc tagaaataat tttgtttaac tttaagaagg 60agatatacat atgaggcctc ggatcctgta aaacgacggc cagtgaattc cccgggaagc 120ttcgccaggg ttttcccagt cgagctcgat atcggtacca gcggataaca atttcacatc 180cggatcgcga acgcgtctcg agagatccgg 21022652DNAArtificialpACE 2ggtaccgcgg ccgcgtagag gatctgttga tcagcagttc aacctgttga tagtacttcg 60ttaatacaga tgtaggtgtt ggcaccatgc ataactataa cggtcctaag gtagcgacct 120aggtatcgat aatacgactc actatagggg aattgtgagc ggataacaat tcccctctag 180aaataatttt gtttaacttt aagaaggaga tatacatatg aggcctcgga tcctgtaaaa 240cgacggccag tgaattcccc gggaagcttc gccagggttt tcccagtcga gctcgatatc 300ggtaccagcg gataacaatt tcacatccgg atcgcgaacg cgtctcgaga gatccggctg 360ctaacaaagc ccgaaaggaa gctgagttgg ctgctgccac cgctgagcaa taactagcat 420aaccccttgg ggcctctaaa cgggtcttga ggggtttttt ggtttaaacc catctaattg 480gactagtagc ccgcctaatg agcgggcttt tttttaattc ccctatttgt ttatttttct 540aaatacattc aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat 600attgaaaaag gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg 660cggcattttg ccttcctgtt tttgctcacc cagaaacgct cgtgaaagta aaagacgcag 720aggaccaatt gggggcacga gtgggataca tagaactgga cttgaatagc ggtaaaatcc 780ttgagagttt tcgccctgaa gagcgttttc caatgatgag cactttcaaa gttctgctat 840gtggagcagt attatcccgt gtagatgcgg ggcaagagca actcggacga cgaatacact 900attcgcagaa tgacttggtt gaatactccc cagtgacaga aaagcacctt acggacggaa 960tgacggtaag agaattatgt agtgccgcca taacgatgag tgataacact gcggcgaact 1020tacttctgac aaccatcggt ggaccgaagg aattaaccgc ttttttgcac aatatgggag 1080accatgtaac tcgccttgac cgttgggaac cagaactgaa tgaagccata ccaaacgacg 1140agcgagacac cacaatgcct gcggcaatgg caacaacatt acgcaaacta ttaactggcg 1200aactacttac tctggcttca cggcaacaat taatagactg gcttgaagcg gataaagttg 1260caggaccact actgcgttcg gcacttcctg ctggctggtt tattgctgat aaatctgggg 1320caggagagcg tggttcacgg ggtatcattg ccgcacttgg accagatggt aagccttccc 1380gtatcgtagt tatctacacg acgggtagtc aggcaactat ggacgaacga aatagacaga 1440ttgctgaaat aggggcttca 
ctgattaagc attggtaaac cgatacaatt aaaggctcct 1500tttggagcct ttttttttgg acggaccggt agaaaagatc aaaggatctt cttgagatcc 1560tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt 1620ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct tcagcagagc 1680gcagatacca aatactgtcc ttctagtgta gccgtagtta ggccaccact tcaagaactc 1740tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg ctgccagtgg 1800cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata aggcgcagcg 1860gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga 1920actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag ggagaaaggc 1980ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg agcttccagg 2040gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg 2100atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca acgcggcctt 2160tttacggttc ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc 2220tgattctgtg gataaccgta ttaccgcctt tgagtgagct gataccgctc gccgcagccg 2280aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa gagcgcctga tgcggtattt 2340tctccttacg catctgtgcg gtatttcaca ccgcaatggt gcactctcag tacaatctgc 2400tctgatgccg catagttaag ccagtataca ctccgctatc gctacgtgac tgggtcatgg 2460ctgcgccccg acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg 2520catccgctta cagacaagct gtgaccgtct ccgggagctg catgtgtcag aggttttcac 2580cgtcatcacc gaaacgcgcg aggcaggggg aattccagat aacttcgtat aatgtatgct 2640atacgaagtt at 265232982DNAArtificialpACE2 3atgaaatcta acaatgcgct catcgtcatc ctcggcaccg tcaccctgga tgctgtaggc 60ataggcttgg ttatgccggt actgccgggc ctcttgcggg atatcgtcca ttccgacagc 120atcgccagtc actatggcgt gctgctagcg ctatatgcgt tgatgcaatt tctatgcgca 180cccgttctcg gagcactgtc cgaccgcttt ggccgccgcc cagtcctgct cgcttcgcta 240cttggagcca ctatcgacta cgcgatcatg gcgaccacac ccgtcctgtg gattctctac 300gccggacgca tcgtggccgg catcaccggc gccacaggtg cggttgctgg cgcctatatc 360gccgacatca ccgatgggga agatcgggct cgccacttcg ggctcatgag cgcttgtttc 420ggcgtgggta tggtggcagg ccccgtggcc gggggactgt tgggcgccat ctccttacat 480gcaccattcc ttgcggcggc 
ggtgctcaac ggcctcaacc tactactggg ctgcttccta 540atgcaggagt cgcataaggg agagcgccga cccatgccct tgagagcctt caacccagtc 600agctccttcc ggtgggcgcg gggcatgact atcgtcgccg cacttatgac tgtcttcttt 660atcatgcaac tcgtaggaca ggtgccggca gcgctctggg tcattttcgg cgaggaccgc 720tttcgctgga gcgcgacgat gatcggcctg tcgcttgcgg tattcggaat cttgcacgcc 780ctcgctcaag ccttcgtcac tggtcccgcc accaaacgtt tcggcgagaa gcaggccatt 840atcgccggca tggcggccga cgcgctgggc tacgtcttgc tggcgttcgc gacgcgaggc 900tggatggcct tccccattat gattcttctc gcttccggcg gcatcgggat gcccgcgttg 960caggccatgc tgtccaggca ggtagatgac gaccatcagg gacagcttca aggatcgctc 1020gcggctctta ccagcctaac ttcgatcatt ggaccgctga tcgtcacggc gatttatgcc 1080gcctcggcga gcacatggaa cgggttggca tggattgtag gcgccgccct ataccttgtc 1140tgcctccccg cgttgcgtcg cggtgcatgg agccgggcca cctcgacctg aaccgataca 1200attaaaggct ccttttggag cctttttttt tggacggacc ggtagaaaag atcaaaggat 1260cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa aaaccaccgc 1320taccagcggt ggtttgtttg ccggatcaag agctaccaac tctttttccg aaggtaactg 1380gcttcagcag agcgcagata ccaaatactg tccttctagt gtagccgtag ttaggccacc 1440acttcaagaa ctctgtagca ccgcctacat acctcgctct gctaatcctg ttaccagtgg 1500ctgctgccag tggcgataag tcgtgtctta ccgggttgga ctcaagacga tagttaccgg 1560ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac acagcccagc ttggagcgaa 1620cgacctacac cgaactgaga tacctacagc gtgagctatg agaaagcgcc acgcttcccg 1680aagggagaaa ggcggacagg tatccggtaa gcggcagggt cggaacagga gagcgcacga 1740gggagcttcc agggggaaac gcctggtatc tttatagtcc tgtcgggttt cgccacctct 1800gacttgagcg tcgatttttg tgatgctcgt caggggggcg gagcctatgg aaaaacgcca 1860gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc 1920ctgcgttatc ccctgattct gtggataacc gtattaccgc ctttgagtga gctgataccg 1980ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg gaagagcgcc 2040tgatgcggta ttttctcctt acgcatctgt gcggtatttc acaccgcaat ggtgcactct 2100cagtacaatc tgctctgatg ccgcatagtt aagccagtat acactccgct atcgctacgt 2160gactgggtca tggctgcgcc ccgacacccg ccaacacccg ctgacgcgcc ctgacgggct 
2220tgtctgctcc cggcatccgc ttacagacaa gctgtgaccg tctccgggag ctgcatgtgt 2280cagaggtttt caccgtcatc accgaaacgc gcgaggcagg gggaattcca gataacttcg 2340tataatgtat gctatacgaa gttatggtac cgcggccgcg tagaggatct gttgatcagc 2400agttcaacct gttgatagta cttcgttaat acagatgtag gtgttggcac catgcataac 2460tataacggtc ctaaggtagc gacctaggta tcgataatac gactcactat aggggaattg 2520tgagcggata acaattcccc tctagaaata attttgttta actttaagaa ggagatatac 2580atatgaggcc tcggatcctg taaaacgacg gccagtgaat tccccgggaa gcttcgccag 2640ggttttccca gtcgagctcg atatcggtac cagcggataa caatttcaca tccggatcgc 2700gaacgcgtct cgagagatcc ggctgctaac aaagcccgaa aggaagctga gttggctgct 2760gccaccgctg agcaataact agcataaccc cttggggcct ctaaacgggt cttgaggggt 2820tttttggttt aaacccatct aattggacta gtagcccgcc taatgagcgg gctttttttt 2880aattccccta tttgtttatt tttctaaata cattcaaata tgtatccgct catgagacaa 2940taaccctgat aaatgcttca ataatattga aaaaggaaga gt 298242067DNAArtificialpDC 4atcaacgtct cattttcgcc aaaagttggc ccagatctat gtcgggtgcg gagaaagagg 60taatgaaatg gcacctaggt atcgataata cgactcacta taggggaatt gtgagcggat 120aacaattccc ctctagaaat aattttgttt aactttaaga aggagatata catatgaggc 180ctcggatcct gtaaaacgac ggccagtgaa ttccccggga agcttcgcca gggttttccc 240agtcgagctc gatatcggta ccagcggata acaatttcac atccggatcg cgaacgcgtc 300tcgagagatc cggctgctaa caaagcccga aaggaagctg agttggctgc tgccaccgct 360gagcaataac tagcataacc ccttggggcc tctaaacggg tcttgagggg ttttttggtt 420taaacccatg tgcctggcag ataacttcgt ataatgtatg ctatacgaag ttatggtacc 480gcggccgcgt agaggatctg ttgatcagca gttcaacctg ttgatagtac gtactaagct 540ctcatgtttc acgtactaag ctctcatgtt taacgtacta agctctcatg tttaacgaac 600taaaccctca tggctaacgt actaagctct catggctaac gtactaagct ctcatgtttc 660acgtactaag ctctcatgtt tgaacaataa aattaatata aatcagcaac ttaaatagcc 720tctaaggttt taagttttat aagaaaaaaa agaatatata aggcttttaa agcttttaag 780gtttaacggt tgtggacaac aagccaggga tgtaacgcac tgagaagccc ttagagcctc 840tcaaagcaat tttgagtgac acaggaacac ttaacggctg acagaattag cttcacgctg 900ccgcaagcac tcagggcgca agggctgcta aaggaagcgg 
aacacgtaga aagccagtcc 960gcagaaacgg tgctgacccc ggatgaatgt cagctgggag gcagaataaa tgatcatatc 1020gtcaattatt acctccacgg ggagagcctg agcaaactgg cctcaggcat ttgagaagca 1080cacggtcaca ctgcttccgg tagtcaataa accggtaaac cagcaataga cataagcggc 1140tatttaacga ccctgccctg aaccgacgac cgggtcgaat ttgctttcga atttctgcca 1200ttcatccgct tattatcact tattcaggcg tagcaaccag gcgtttaagg gcaccaataa 1260ctgccttaaa aaaattacgc cccgccctgc cactcatcgc agtactgttg taattcatta 1320agcattctgc cgacatggaa gccatcacaa acggcatgat gaacctgaat cgccagcggc 1380atcagcacct tgtcgccttg cgtataatat ttgcccatgg tgaaaacggg ggcgaagaag 1440ttgtccatat tggccacgtt taaatcaaaa ctggtgaaac tcacccaggg attggctgag 1500acgaaaaaca tattctcaat aaacccttta gggaaatagg ccaggttttc accgtaacac 1560gccacatctt gcgaatatat gtgtagaaac tgccggaaat cgtcgtggta ttcactccag 1620agcgatgaaa acgtttcagt ttgctcatgg aaaacggtgt aacaagggtg aacactatcc 1680catatcacca gctcaccgtc tttcattgcc atacggaatt ccggatgagc attcatcagg 1740cgggcaagaa tgtgaataaa ggccggataa aacttgtgct tatttttctt tacggtcttt 1800aaaaaggccg taatatccag ctgaacggtc tggttatagg tacattgagc aactgactga 1860aatgcctcaa aatgttcttt acgatgccat tgggatatat caacggtggt atatccagtg 1920atttttttct ccattttagc ttccttagct cctgaaaatc tcgataactc aaaaaatacg 1980cccggtagtg atcttatttc attatggtga aagttggacc ctcttacgtg ccgatcaacg 2040tctcattttc gccaaaagtt ggcccag 206752077DNAArtificialpDK 5ctatgtcggg tgcggagaaa gaggtaatga aatggcacct aggtatcgat ggctttacac 60tttatgcttc cggctcgtat gttgtgtgga attgtgagcg gataacaatt tcacacagga 120aacagctatg accatgatta cgaatttcta gaaataattt tgtttaactt taagaaggag 180atatacatat gaggcctcgg atcctgtaaa acgacggcca gtgaattccc cgggaagctt 240cgccagggtt ttcccagtcg agctcgatat cggtaccagc ggataacaat ttcacatccg 300gatcgcgaac gcgtctcgag actagttccg tttaaaccca tgtgcctggc agataacttc 360gtataatgta tgctatacga agttatggta cgtactaagc tctcatgttt cacgtactaa 420gctctcatgt ttaacgtact aagctctcat gtttaacgaa ctaaaccctc atggctaacg 480tactaagctc tcatggctaa cgtactaagc tctcatgttt cacgtactaa gctctcatgt 540ttgaacaata aaattaatat aaatcagcaa 
cttaaatagc ctctaaggtt ttaagtttta 600taagaaaaaa aagaatatat aaggctttta aagcttttaa ggtttaacgg ttgtggacaa 660caagccaggg atgtaacgca ctgagaagcc cttagagcct ctcaaagcaa ttttcagtga 720cacaggaaca cttaacggct gacagaatta gcttcacgct gccgcaagca ctcagggcgc 780aagggctgct aaaggaagcg gaacacgtag aaagccagtc cgcagaaacg gtgctgaccc 840cggatgaatg tcagctactg ggctatctgg acaagggaaa acgcaagcgc aaagagaaag 900caggtagctt gcagtgggct tacatggcga tagctagact gggcggtttt atggacagca 960agcgaaccgg aattgccagc tggggcgccc tctggtaagg ttgggaagcc ctgcaaagta 1020aactggatgg ctttcttgcc gccaaggatc tgatggcgca ggggatcaag atctgatcaa 1080gagacaggat gaggatcgtt tcgcatgatt gaacaagatg gattgcacgc aggttctccg 1140gccgcttggg tggagaggct attcggctat gactgggcac aacagacaat cggctgctct 1200gatgccgccg tgttccggct gtcagcgcag gggcgcccgg ttctttttgt caagaccgac 1260ctgtccggtg ccctgaatga actgcaggac gaggcagcgc ggctatcgtg gctggccacg 1320acgggcgttc cttgcgcagc tgtgctcgac gttgtcactg aagcgggaag ggactggctg 1380ctattgggcg aagtgccggg gcaggatctc ctgtcatctc accttgctcc tgccgagaaa 1440gtatccatca tggctgatgc aatgcggcgg ctgcatacgc ttgatccggc tacctgccca 1500ttcgaccacc aagcgaaaca tcgcatcgag cgagcacgta ctcggatgga agccggtctt 1560gtcgatcagg atgatctgga cgaagagcat caggggctcg cgccagccga actgttcgcc 1620aggctcaagg cgcgcatgcc cgacggcgag gatctcgtcg tgacacatgg cgatgcctgc 1680ttgccgaata tcatggtgga aaatggccgc ttttctggat tcatcgactg tggccggctg 1740ggtgtggcgg accgctatca ggacatagcg ttggctaccc gtgatattgc tgaagagctt 1800ggcggcgaat gggctgaccg cttcctcgtg ctttacggta tcgccgctcc cgattcgcag 1860cgcatcgcct tctatcgcct tcttgacgag ttcttctgag cgggactctg gggttcgaaa 1920tgaccgacca agcgacgccc aacctgccat cacgagattt cgattccacc gccgccttct 1980atgaaaggtt gggcttcgga atcgttttcc gggacgccgg ctggatgatc ctccagcgcg 2040gggatctcat gctggagttc ttcgcccacc ccgggat 207762027DNAArtificialpDS 6ctatgtcggg tgcggagaaa gaggtaatga aatggcacct aggtatcgat ggctttacac 60tttatgcttc cggctcgtat gttgtgtgga attgtgagcg gataacaatt tcacacagga 120aacagctatg accatgatta cgaatttcta gaaataattt tgtttaactt taagaaggag 180atatacatat 
gaggcctcgg atcctgtaaa acgacggcca gtgaattccc cgggaagctt 240cgccagggtt ttcccagtcg agctcgatat cggtaccagc ggataacaat ttcacatccg 300gatcgcgaac gcgtctcgag actagttccg tttaaaccca tgtgcctggc agataacttc 360gtataatgta tgctatacga agttatggta cgtactaagc tctcatgttt cacgtactaa 420gctctcatgt ttaacgtact aagctctcat gtttaacgaa ctaaaccctc atggctaacg 480tactaagctc tcatggctaa cgtactaagc tctcatgttt cacgtactaa gctctcatgt 540ttgaacaata aaattaatat aaatcagcaa cttaaatagc ctctaaggtt ttaagtttta 600taagaaaaaa aagaatatat aaggctttta aagcttttaa ggtttaacgg ttgtggacaa 660caagccaggg atgtaacgca ctgagaagcc cttagagcct ctcaaagcaa ttttgagtga 720cacaggaaca cttaacggct gacataattc agcttcacgc tgccgcaagc actcagggcg 780caagggctgc taaaggaagc ggaacacgta gaaagccagt ccgcagaaac ggtgctgacc 840ccggatgaat gtcagctggg aggcagaata aatgatcata tcgtcaatta ttacctccac 900ggggagagcc tgagcaaact ggcctcaggc atttgagaag cacacggtca cactgcttcc 960ggtagtcaat aaaccggtaa gtagcgtatg cgctcacgca actggtccag aaccttgacc 1020gaacgcagcg gtggtaacgg cgcagtggcg gttttcatgg cttgttatga ctgttttttt 1080ggggtacagt ctatgcctcg ggcatccaag cagcaagcgc gttacgccgt gggtcgatgt 1140ttgatgttat ggagcagcaa cgatgttacg cagcagggca gtcgccctaa aacaaagtta 1200aacatcatga gggaagcggt gatcgccgaa gtatcgactc aactatcaga ggtagttggc 1260gtcatcgagc gccatctcga accgacgttg ctggccgtac atttgtacgg ctccgcagtg 1320gatggcggcc tgaagccaca cagtgatatt gatttgctgg ttacggtgac cgtaaggctt 1380gatgaaacaa cgcggcgagc tttgatcaac gaccttttgg aaacttcggc ttcccctgga 1440gagagcgaga ttctccgcgc tgtagaagtc accattgttg tgcacgacga catcattccg 1500tggcgttatc cagctaagcg cgaactgcaa tttggagaat ggcagcgcaa tgacattctt 1560gcaggtatct tcgagccagc cacgatcgac attgatctgg ctatcttgct gacaaaagca 1620agagaacata gcgttgcctt ggtaggtcca gcggcggagg aactctttga tccggttcct 1680gaacaggatc tatttgaggc gctaaatgaa accttaacgc tatggaactc gccgcccgac 1740tgggctggcg atgagcgaaa tgtagtgctt acgttgtccc gcatttggta cagcgcagta 1800accggcaaaa tcgcgccgaa ggatgtcgct gccgactggg caatggagcg cctgccggcc 1860cagtatcagc ccgtcatact tgaagctaga caggcttatc ttggacaaga 
agaagatcgc 1920ttggcctcgc gcgcagatca gttggaagaa tttgtccact acgtgaaagg cgagatcacc 1980aaggtagtcg gcaaataatg tctaacaatt cgttcaagcc gacggat 202772346DNAArtificialpIDC 7aaacccatgt gcctggcaga taacttcgta taatgtatgc tatacgaagt tatggtaccg 60cggccgcgta gaggatctgt tgatcagcag ttcaacctgt tgatagtacg tactaagctc 120tcatgtttca cgtactaagc tctcatgttt aacgtactaa gctctcatgt ttaacgaact 180aaaccctcat ggctaacgta ctaagctctc atggctaacg tactaagctc tcatgtttca 240cgtactaagc tctcatgttt gaacaataaa attaatataa atcagcaact taaatagcct 300ctaaggtttt aagttttata agaaaaaaaa gaatatataa ggcttttaaa gcttttaagg 360tttaacggtt gtggacaaca agccagggat gtaacgcact gagaagccct tagagcctct 420caaagcaatt ttgagtgaca caggaacact taacggctga cagaattagc ttcacgctgc 480cgcaagcact cagggcgcaa gggctgctaa aggaagcgga acacgtagaa agccagtccg 540cagaaacggt gctgaccccg gatgaatgtc agctgggagg cagaataaat gatcatatcg 600tcaattatta cctccacggg gagagcctga gcaaactggc ctcaggcatt tgagaagcac 660acggtcacac tgcttccggt agtcaataaa ccggtaaacc agcaatagac ataagcggct 720atttaacgac cctgccctga accgacgacc gggtcgaatt tgctttcgaa tttctgccat 780tcatccgctt attatcactt attcaggcgt agcaaccagg cgtttaaggg caccaataac 840tgccttaaaa aaattacgcc ccgccctgcc actcatcgca gtactgttgt aattcattaa 900gcattctgcc gacatggaag ccatcacaaa cggcatgatg aacctgaatc gccagcggca 960tcagcacctt gtcgccttgc gtataatatt tgcccatggt gaaaacgggg gcgaagaagt 1020tgtccatatt ggccacgttt aaatcaaaac tggtgaaact cacccaggga ttggctgaga 1080cgaaaaacat attctcaata aaccctttag ggaaataggc caggttttca ccgtaacacg 1140ccacatcttg cgaatatatg tgtagaaact gccggaaatc gtcgtggtat tcactccaga 1200gcgatgaaaa cgtttcagtt tgctcatgga aaacggtgta acaagggtga acactatccc 1260atatcaccag ctcaccgtct ttcattgcca tacggaattc cggatgagca ttcatcaggc 1320gggcaagaat gtgaataaag gccggataaa acttgtgctt atttttcttt acggtcttta 1380aaaaggccgt aatatccagc tgaacggtct ggttataggt acattgagca actgactgaa 1440atgcctcaaa atgttcttta cgatgccatt gggatatatc aacggtggta tatccagtga 1500tttttttctc cattttagct tccttagctc ctgaaaatct cgataactca aaaaatacgc 1560ccggtagtga tcttatttca 
ttatggtgaa agttggaccc tcttacgtgc cgatcaacgt 1620ctcattttcg ccaaaagttg gcccagatca acgtctcatt ttcgccaaaa gttggcccag 1680atctatgtcg ggtgcggaga aagaggtaat gaaatggcac ctaggggtta tgatagttat 1740tgctcagcgg tggcagcagc caactcagct tcctttcggg ctttgttagc agccggatct 1800tctaggctca agcagtgatc agatccagac atgataagat acattgatga gtttggacaa 1860accacaacta gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct 1920ttatttgtaa ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt 1980atgtttcagg ttcaggggga ggtgtgggag gttttttaaa gcaagtaaaa cctctacaaa 2040tgtggtatgg ctgattatga tcctctagta cttctcgaca agcttgtcga gactgcaggc 2100tctagattcg aaagcggccg cgactagtga gctcgtcgac gtaggccttt gaattccgcg 2160cgcttcggac cgggatccgc gcccgatggt gggacggtat gaataatccg gaatatttat 2220aggttttttt attacaaaac tgttacgaaa acagtaaaat acttatttat ttgcgagatg 2280gttatcattt taattatctc catgatctat taatattccg gagtaggtcg cgaatcgata 2340ctagta 234682281DNAArtificialpIDK 8gatactagta tacggacctt taattcaacc caacacaata tattatagtt aaataagaat 60tattatcaaa tcatttgtat attaattaaa atactatact gtaaattaca ttttatttac 120aatcactcga cgaagacttg atcacccggg atctcgagcc atggtgctag cagctgatgc 180atagcatgcg gtaccgggag atgggggagg ctaactgaaa cacggaagga gacaataccg 240gaaggaaccc gcgctatgac ggcaataaaa agacagaata aaacgcacgg gtgttgggtc
300gtttgttcat aaacgcgggg ttcggtccca gggctggcac tctgtcgata ccccaccgag 360accccattgg gaccaatacg cccgcgtttc ttccttttcc ccaccccaac ccccaagttc 420gggtgaaggc ccagggctcg cagccaacgt cggggcggca agccctgcca tagccactac 480gggtacgttt aaacccatgt gcctggcaga taacttcgta taatgtatgc tatacgaagt 540tatggtacgt actaagctct catgtttcac gtactaagct ctcatgttta acgtactaag 600ctctcatgtt taacgaacta aaccctcatg gctaacgtac taagctctca tggctaacgt 660actaagctct catgtttcac gtactaagct ctcatgtttg aacaataaaa ttaatataaa 720tcagcaactt aaatagcctc taaggtttta agttttataa gaaaaaaaag aatatataag 780gcttttaaag cttttaaggt ttaacggttg tggacaacaa gccagggatg taacgcactg 840agaagccctt agagcctctc aaagcaattt tcagtgacac aggaacactt aacggctgac 900agaattagct tcacgctgcc gcaagcactc agggcgcaag ggctgctaaa ggaagcggaa 960cacgtagaaa gccagtccgc agaaacggtg ctgaccccgg atgaatgtca gctactgggc 1020tatctggaca agggaaaacg caagcgcaaa gagaaagcag gtagcttgca gtgggcttac 1080atggcgatag ctagactggg cggttttatg gacagcaagc gaaccggaat tgccagctgg 1140ggcgccctct ggtaaggttg ggaagccctg caaagtaaac tggatggctt tcttgccgcc 1200aaggatctga tggcgcaggg gatcaagatc tgatcaagag acaggatgag gatcgtttcg 1260catgattgaa caagatggat tgcacgcagg ttctccggcc gcttgggtgg agaggctatt 1320cggctatgac tgggcacaac agacaatcgg ctgctctgat gccgccgtgt tccggctgtc 1380agcgcagggg cgcccggttc tttttgtcaa gaccgacctg tccggtgccc tgaatgaact 1440gcaggacgag gcagcgcggc tatcgtggct ggccacgacg ggcgttcctt gcgcagctgt 1500gctcgacgtt gtcactgaag cgggaaggga ctggctgcta ttgggcgaag tgccggggca 1560ggatctcctg tcatctcacc ttgctcctgc cgagaaagta tccatcatgg ctgatgcaat 1620gcggcggctg catacgcttg atccggctac ctgcccattc gaccaccaag cgaaacatcg 1680catcgagcga gcacgtactc ggatggaagc cggtcttgtc gatcaggatg atctggacga 1740agagcatcag gggctcgcgc cagccgaact gttcgccagg ctcaaggcgc gcatgcccga 1800cggcgaggat ctcgtcgtga cacatggcga tgcctgcttg ccgaatatca tggtggaaaa 1860tggccgcttt tctggattca tcgactgtgg ccggctgggt gtggcggacc gctatcagga 1920catagcgttg gctacccgtg atattgctga agagcttggc ggcgaatggg ctgaccgctt 1980cctcgtgctt tacggtatcg ccgctcccga ttcgcagcgc 
atcgccttct atcgccttct 2040tgacgagttc ttctgagcgg gactctgggg ttcgaaatga ccgaccaagc gacgcccaac 2100ctgccatcac gagatttcga ttccaccgcc gccttctatg aaaggttggg cttcggaatc 2160gttttccggg acgccggctg gatgatcctc cagcgcgggg atctcatgct ggagttcttc 2220gcccaccccg ggatctatgt cgggtgcgga gaaagaggta atgaaatggc acctaggtat 2280c 228192231DNAArtificialpIDS 9cgatactagt atacggacct ttaattcaac ccaacacaat atattatagt taaataagaa 60ttattatcaa atcatttgta tattaattaa aatactatac tgtaaattac attttattta 120caatcactcg acgaagactt gatcacccgg gatctcgagc catggtgcta gcagctgatg 180catagcatgc ggtaccggga gatgggggag gctaactgaa acacggaagg agacaatacc 240ggaaggaacc cgcgctatga cggcaataaa aagacagaat aaaacgcacg ggtgttgggt 300cgtttgttca taaacgcggg gttcggtccc agggctggca ctctgtcgat accccaccga 360gaccccattg ggaccaatac gcccgcgttt cttccttttc cccaccccaa cccccaagtt 420cgggtgaagg cccagggctc gcagccaacg tcggggcggc aagccctgcc atagccacta 480cgggtacgtt taaacccatg tgcctggcag ataacttcgt ataatgtatg ctatacgaag 540ttatggtacg tactaagctc tcatgtttca cgtactaagc tctcatgttt aacgtactaa 600gctctcatgt ttaacgaact aaaccctcat ggctaacgta ctaagctctc atggctaacg 660tactaagctc tcatgtttca cgtactaagc tctcatgttt gaacaataaa attaatataa 720atcagcaact taaatagcct ctaaggtttt aagttttata agaaaaaaaa gaatatataa 780ggcttttaaa gcttttaagg tttaacggtt gtggacaaca agccagggat gtaacgcact 840gagaagccct tagagcctct caaagcaatt ttgagtgaca caggaacact taacggctga 900cataattcag cttcacgctg ccgcaagcac tcagggcgca agggctgcta aaggaagcgg 960aacacgtaga aagccagtcc gcagaaacgg tgctgacccc ggatgaatgt cagctgggag 1020gcagaataaa tgatcatatc gtcaattatt acctccacgg ggagagcctg agcaaactgg 1080cctcaggcat ttgagaagca cacggtcaca ctgcttccgg tagtcaataa accggtaagt 1140agcgtatgcg ctcacgcaac tggtccagaa ccttgaccga acgcagcggt ggtaacggcg 1200cagtggcggt tttcatggct tgttatgact gtttttttgg ggtacagtct atgcctcggg 1260catccaagca gcaagcgcgt tacgccgtgg gtcgatgttt gatgttatgg agcagcaacg 1320atgttacgca gcagggcagt cgccctaaaa caaagttaaa catcatgagg gaagcggtga 1380tcgccgaagt atcgactcaa ctatcagagg tagttggcgt catcgagcgc catctcgaac 
1440cgacgttgct ggccgtacat ttgtacggct ccgcagtgga tggcggcctg aagccacaca 1500gtgatattga tttgctggtt acggtgacgg taaggcttga tgaaacaacg cggcgagctt 1560tgatcaacga ccttttggaa acttcggctt cccctggaga gagcgagatt ctccgcgctg 1620tagaagtcac cattgttgtg cacgacgaca tcattccgtg gcgttatcca gctaagcgcg 1680aactgcaatt tggagaatgg cagcgcaatg acattcttgc aggtatcttc gagccagcca 1740cgatcgacat tgatctggct atcttgctga caaaagcaag agaacatagc gttgccttgg 1800taggtccagc ggcggaggaa ctctttgatc cggttcctga acaggatcta tttgaggcgc 1860taaatgaaac cttaacgcta tggaactcgc cgcccgactg ggctggcgat gagcgaaatg 1920tagtgcttac gttgtcccgc atttggtaca gcgcagtaac cggcaaaatc gcgccgaagg 1980atgtcgctgc cgactgggca atggagcgcc tgccggccca gtatcagccc gtcatacttg 2040aagctagaca ggcttatctt ggacaagaag aagatcgctt ggcctcgcgc gcagatcagt 2100tggaagaatt tgtccactac gtgaaaggcg agatcaccaa ggtagtcggc aaataatgtc 2160taacaattcg ttcaagccga cggatctatg tcgggtgcgg agaaagaggt aatgaaatgg 2220cacctaggta t 2231102904DNAArtificialpACEBac1 10accggttgac ttgggtcaac tgtcagacca agtttactca tatatacttt agattgattt 60aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 120caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 180aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 240accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 300aactggcttc agcagagcgc agataccaaa tactgttctt ctagtgtagc cgtagttagg 360ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 420agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 480accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 540gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 600tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 660cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 720cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 780cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 840ctttcctgcg ttatcccctg attgacttgg gtcgctcttc ctgtggatgc gcagatgccc 
900tgcgtaagcg ggtgtgggcg gacaataaag tcttaaactg aacaaaatag atctaaacta 960tgacaataaa gtcttaaact agacagaata gttgtaaact gaaatcagtc cagttatgct 1020gtgaaaaagc atactggact tttgttatgg ctaaagcaaa ctcttcattt tctgaagtgc 1080aaattgcccg tcgtattaaa gaggggcgtg gccaagggca tgtaaagact atattcgcgg 1140cgttgtgaca atttaccgaa caactccgcg gccgggaagc cgatctcggc ttgaacgaat 1200tgttaggtgg cggtacttgg gtcgatatca aagtgcatca cttcttcccg tatgcccaac 1260tttgtataga gagccactgc gggatcgtca ccgtaatctg cttgcacgta gatcacataa 1320gcaccaagcg cgttggcctc atgcttgagg agattgatga gcgcggtggc aatgccctgc 1380ctccggtgct cgccggagac tgcgagatca tagatataga tctcactacg cggctgctca 1440aacttgggca gaacgtaagc cgcgagagcg ccaacaaccg cttcttggtc gaaggcagca 1500agcgcgatga atgtcttact acggagcaag ttcccgaggt aatcggagtc cggctgatgt 1560tgggagtagg tggctacgtc tccgaactca cgaccgaaaa gatcaagagc agcccgcatg 1620gatttgactt ggtcagggcc gagcctacat gtgcgaatga tgcccatact tgagccacct 1680aactttgttt tagggcgact gccctgctgc gtaacatcgt tgctgctgcg taacatcgtt 1740gctgctccat aacatcaaac atcgacccac ggcgtaacgc gcttgctgct tggatgcccg 1800aggcatagac tgtacaaaaa aacagtcata acaagccatg aaaaccgcca ctgcgccgtt 1860accaccgctg cgttcggtca aggttctgga ccagttgcgt gagcgcatac gctacttgca 1920ttacagttta cgaaccgaac aggcttatgt caactgggtt cgtgccttca tccgtttcca 1980cggtgtgcgt cacccggcaa ccttgggcag cagcgaagtc gccataactt cgtatagcat 2040acattatacg aagttatctg taactataac ggtcctaagg tagcgagttt aaacactagt 2100atcgattcgc gacctactcc ggaatattaa tagatcatgg agataattaa aatgataacc 2160atctcgcaaa taaataagta ttttactgtt ttcgtaacag ttttgtaata aaaaaaccta 2220taaatattcc ggattattca taccgtccca ccatcgggcg cggatcccgg tccgaagcgc 2280gcggaattca aaggcctacg tcgacgagct cacttgtcgc ggccgctttc gaatctagag 2340cctgcagtct cgacaagctt gtcgagaagt actagaggat cataatcagc cataccacat 2400ttgtagaggt tttacttgct ttaaaaaacc tcccacacct ccccctgaac ctgaaacata 2460aaatgaatgc aattgttgtt gttaacttgt ttattgcagc ttataatggt tacaaataaa 2520gcaatagcat cacaaatttc acaaataaag catttttttc actgcattct agttgtggtt 2580tgtccaaact catcaatgta tcttatcatg 
tctggatctg atcactgctt gagcctagaa 2640gatccggctg ctaacaaagc ccgaaaggaa gctgagttgg ctgctgccac cgctgagcaa 2700taactatcat aacccctagg gtatacccat ctaattggaa ccagataagt gaaatctagt 2760tccaaactat tttgtcattt ttaattttcg tattagctta cgacgctaca cccagttccc 2820atctattttg tcactcttcc ctaaataatc cttaaaaact ccatttccac ccctcccagt 2880tcccaactat tttgtccgcc caca 2904112761DNAArtificialpACEBac2 11accggttgac ttgggtcaac tgtcagacca agtttactca tatatacttt agattgattt 60aaaacttcat ttttaattta aaaggatcta ggtgaagatc ctttttgata atctcatgac 120caaaatccct taacgtgagt tttcgttcca ctgagcgtca gaccccgtag aaaagatcaa 180aggatcttct tgagatcctt tttttctgcg cgtaatctgc tgcttgcaaa caaaaaaacc 240accgctacca gcggtggttt gtttgccgga tcaagagcta ccaactcttt ttccgaaggt 300aactggcttc agcagagcgc agataccaaa tactgttctt ctagtgtagc cgtagttagg 360ccaccacttc aagaactctg tagcaccgcc tacatacctc gctctgctaa tcctgttacc 420agtggctgct gccagtggcg ataagtcgtg tcttaccggg ttggactcaa gacgatagtt 480accggataag gcgcagcggt cgggctgaac ggggggttcg tgcacacagc ccagcttgga 540gcgaacgacc tacaccgaac tgagatacct acagcgtgag ctatgagaaa gcgccacgct 600tcccgaaggg agaaaggcgg acaggtatcc ggtaagcggc agggtcggaa caggagagcg 660cacgagggag cttccagggg gaaacgcctg gtatctttat agtcctgtcg ggtttcgcca 720cctctgactt gagcgtcgat ttttgtgatg ctcgtcaggg gggcggagcc tatggaaaaa 780cgccagcaac gcggcctttt tacggttcct ggccttttgc tggccttttg ctcacatgtt 840ctttcctgcg ttatcccctg attgacttgg gtcgctcttc ctgtggatgc gcagatgccc 900tgcgtaagcg ggtgtgggcg gacaataaag tcttaaactg aacaaaatag atctaaacta 960tgacaataaa gtcttaaact agacagaata gttgtaaact gaaatcagtc cagttatgct 1020gtgaaaaagc atactggact tttgttatgg ctaaagcaaa ctcttcattt tctgaagtgc 1080aaattgcccg tcgtattaaa gaggggcgtg gccaagggca tgtaaagact atattcgcgg 1140cgttgtgaca atttaccgaa caactccgcg gccgggaagc cgatctcggc ttgaacgaat 1200tgttaggtgg cggtacttgg gtcgatatca aagtgcatca cttcttcccg tatgcccaac 1260tttgtataga gagccactgc gggatcgtca ccgtaatctg cttgcacgta gatcacataa 1320gcaccaagcg cgttggcctc atgcttgagg agattgatga gcgcggtggc aatgccctgc 1380ctccggtgct cgccggagac 
tgcgagatca tagatataga tctcactacg cggctgctca 1440aacttgggca gaacgtaagc cgcgagagcg ccaacaaccg cttcttggtc gaaggcagca 1500agcgcgatga atgtcttact acggagcaag ttcccgaggt aatcggagtc cggctgatgt 1560tgggagtagg tggctacgtc tccgaactca cgaccgaaaa gatcaagagc agcccgcatg 1620gatttgactt ggtcagggcc gagcctacat gtgcgaatga tgcccatact tgagccacct 1680aactttgttt tagggcgact gccctgctgc gtaacatcgt tgctgctgcg taacatcgtt 1740gctgctccat aacatcaaac atcgacccac ggcgtaacgc gcttgctgct tggatgcccg 1800aggcatagac tgtacaaaaa aacagtcata acaagccatg aaaaccgcca ctgcgccgtt 1860accaccgctg cgttcggtca aggttctgga ccagttgcgt gagcgcatac gctacttgca 1920ttacagttta cgaaccgaac aggcttatgt caactgggtt cgtgccttca tccgtttcca 1980cggtgtgcgt cacccggcaa ccttgggcag cagcgaagtc gccataactt cgtatagcat 2040acattatacg aagttatctg taactataac ggtcctaagg tagcgagttt aaacgtaccc 2100gtagtggcta tggcagggct tgccgccccg acgttggctg cgagccctgg gccttcaccc 2160gaacttgggg gttggggtgg ggaaaaggaa gaaacgcggg cgtattggtc ccaatggggt 2220ctcggtgggg tatcgacaga gtgccagccc tgggaccgaa ccccgcgttt atgaacaaac 2280gacccaacac ccgtgcgttt tattctgtct ttttattgcc gtcatagcgc gggttccttc 2340cggtattgtc tccttccgtg tttcagttag cctcccccat ctcccggtac cgcatgctat 2400gcatcagctg ctagcaccat ggctcgagat cccgggtgat caagtcttcg tcgagtgatt 2460gtaaataaaa tgtaatttac agtatagtat tttaattaat atacaaatga tttgataata 2520attcttattt aactataata tattgtgttg ggttgaatta aaggtccgta tactagggta 2580tacccatcta attggaacca gataagtgaa atctagttcc aaactatttt gtcattttta 2640attttcgtat tagcttacga cgctacaccc agttcccatc tattttgtca ctcttcccta 2700aataatcctt aaaaactcca tttccacccc tcccagttcc caactatttt gtccgcccac 2760a 2761122940DNAArtificialpACEBac3 12gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc 60ctgcgttatc ccctgattga cttgggtcgc tcttcctgtg gatgcgcaga tgccctgcgt 120aagcgggtgt gggcggacaa taaagtctta aactgaacaa aatagatcta aactatgaca 180ataaagtctt aaactagaca gaatagttgt aaactgaaat cagtccagtt atgctgtgaa 240aaagcatact ggacttttgt tatggctaaa gcaaactctt cattttctga agtgcaaatt 300gcccgtcgta ttaaagaggg 
gcgtggccaa gggcatgtaa agactatatt cgcggcgttg 360tgacaattta ccgaacaact ccgcggccgg gaagccgatc tcggcttgaa cgaattgtta 420ggtggcggta cttgggtcga tatcaaagtg catcacttct tcccgtatgc ccaactttgt 480atagagagcc actgcgggat cgtcaccgta atctgcttgc acgtagatca cataagcacc 540aagcgcgttg gcctcatgct tgaggagatt gatgagcgcg gtggcaatgc cctgcctccg 600gtgctcgccg gagactgcga gatcatagat atagatctca ctacgcggct gctcaaactt 660gggcagaacg taagccgcga gagcgccaac aaccgcttct tggtcgaagg cagcaagcgc 720gatgaatgtc ttactacgga gcaagttccc gaggtaatcg gagtccggct gatgttggga 780gtaggtggct acgtctccga actcacgacc gaaaagatca agagcagccc gcatggattt 840gacttggtca gggccgagcc tacatgtgcg aatgatgccc atacttgagc cacctaactt 900tgttttaggg cgactgccct gctgcgtaac atcgttgctg ctgcgtaaca tcgttgctgc 960tccataacat caaacatcga cccacggcgt aacgcgcttg ctgcttggat gcccgaggca 1020tagactgtac aaaaaaacag tcataacaag ccatgaaaac cgccactgcg ccgttaccac 1080cgctgcgttc ggtcaaggtt ctggaccagt tgcgtgagcg catacgctac ttgcattaca 1140gtttacgaac cgaacaggct tatgtcaact gggttcgtgc cttcatccgt ttccacggtg 1200tgcgtcaccc ggcaaccttg ggcagcagcg aagtcgccat aacttcgtat agcatacatt 1260atacgaagtt atctgtaact ataacggtcc taaggtagcg agtttaaaca ctagtatcga 1320ttcgcgacct actccggaat attaatagat catggagata attaaaatga taaccatctc 1380gcaaataaat aagtatttta ctgttttcgt aacagttttg taataaaaaa acctataaat 1440attccggatt attcataccg tcccaccatc gggcgcggat cccggtccga agcgcgcgga 1500attcaaaggc ctacgtcgac gagctcactt gtcgcggccg ctttcgaatc tagagcctgc 1560agtctcgaca agcttgtcga gaagtactag aggatcataa tcagccatac cacatttgta 1620gaggttttac ttgctttaaa aaacctccca cacctccccc tgaacctgaa acataaaatg 1680aatgcaattg ttgttgttaa cttgtttatt gcagcttata atggttacaa ataaagcaat 1740agcatcacaa atttcacaaa taaagcattt ttttcactgc attctagttg tggtttgtcc 1800aaactcatca atgtatctta tcatgtctgg atctgatcac tgcttgagcc tagaagatcc 1860ggctgctaac aaagcccgaa aggaagctga gttggctgct gccaccgctg agcaataact 1920atcataaccc ctagggtata cccatctaat tggaaccaga taagtgaaat ctagttccaa 1980actattttgt catttttaat tttcgtatta gcttacgacg ctacacccag ttcccatcta 
2040ttttgtcact cttccctaaa taatccttaa aaactccatt tccacccctc ccagttccca 2100actattttgt ccgcccacaa ccggttgact tgggtcaact gtcagaccaa gtttactcat 2160atatacttta gattgattta aaacttcatt tttaatttaa aaggatctag gtgaagatcc 2220tttttgataa tctcatgacc acaggcattg gcggccttgc tgttcttcta cggcaaggtg 2280ctgtgcacgc ccagctgcca tttttggggt gaggtcgttc gcggccgagg ggcgcagccc 2340ctggggggat ggggtgccgc gttagcgggc cgggagggtt cgagaagggg gggcaccccc 2400cttcggcgtg cgcggtcacg cgccagggcg cagccctggt taaaaacaag gtttataaat 2460attggtttaa aagcaggtta aaagacaggt tagcggtggc cgaaaaacgg gcggaaaccc 2520ttgcaaatgc tggattttct gcctgtggac agcccctcaa atgtcaatag gtgcgcccct 2580catctgtcat cactctgccc ctcaagtgtc aaggatcgcg cccctcatct gtcagtagtc 2640gcgcccctca agtgtcaata ccgcagggca cttatcccca ggcttgtcca catcatctgt 2700gggaaactcg cgtaaaatca ggcgttttcg ccgatttgcg aggctggcca gctccacgtc 2760gccggccgaa atcgagcctg cccctcatct gtcaacgccg cgccgggtga gtcggcccct 2820caagtgtcaa cgtccgcccc tcatctgtca gtgagggcca agttttccgc gtggtatcca 2880caacgccggc ggccaaaaga agagctttca caccgcatag accagccgcg taacctggca 2940132805DNAArtificialpACEBac4 13gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc 60ctgcgttatc ccctgattga cttgggtcgc tcttcctgtg gatgcgcaga tgccctgcgt 120aagcgggtgt gggcggacaa taaagtctta aactgaacaa aatagatcta aactatgaca 180ataaagtctt aaactagaca gaatagttgt aaactgaaat cagtccagtt atgctgtgaa 240aaagcatact ggacttttgt tatggctaaa gcaaactctt cattttctga agtgcaaatt 300gcccgtcgta ttaaagaggg gcgtggccaa gggcatgtaa agactatatt cgcggcgttg 360tgacaattta ccgaacaact ccgcggccgg gaagccgatc tcggcttgaa cgaattgtta 420ggtggcggta cttgggtcga tatcaaagtg catcacttct tcccgtatgc ccaactttgt 480atagagagcc actgcgggat cgtcaccgta atctgcttgc acgtagatca cataagcacc 540aagcgcgttg gcctcatgct tgaggagatt gatgagcgcg gtggcaatgc cctgcctccg 600gtgctcgccg gagactgcga gatcatagat atagatctca ctacgcggct gctcaaactt 660gggcagaacg taagccgcga gagcgccaac aaccgcttct tggtcgaagg cagcaagcgc 720gatgaatgtc ttactacgga gcaagttccc gaggtaatcg gagtccggct gatgttggga 780gtaggtggct 
acgtctccga actcacgacc gaaaagatca agagcagccc gcatggattt 840gacttggtca gggccgagcc tacatgtgcg aatgatgccc atacttgagc cacctaactt 900tgttttaggg cgactgccct gctgcgtaac atcgttgctg ctgcgtaaca tcgttgctgc 960tccataacat caaacatcga cccacggcgt aacgcgcttg ctgcttggat gcccgaggca 1020tagactgtac aaaaaaacag tcataacaag ccatgaaaac cgccactgcg ccgttaccac 1080cgctgcgttc ggtcaaggtt ctggaccagt tgcgtgagcg catacgctac ttgcattaca 1140gtttacgaac cgaacaggct tatgtcaact gggttcgtgc cttcatccgt ttccacggtg 1200tgcgtcaccc ggcaaccttg ggcagcagcg aagtcgccat aacttcgtat agcatacatt 1260atacgaagtt atctgtaact ataacggtcc taaggtagcg agtttaaacg tacccgtagt 1320ggctatggca gggcttgccg ccccgacgtt ggctgcgagc cctgggcctt cacccgaact 1380tgggggttgg ggtggggaaa aggaagaaac gcgggcgtat tggtcccaat ggggtctcgg 1440tggggtatcg acagagtgcc agccctggga ccgaaccccg cgtttatgaa caaacgaccc 1500aacacccgtg cgttttattc tgtcttttta ttgccgtcat agcgcgggtt ccttccggta 1560ttgtctcctt ccgtgtttca gttagcctcc cccatctccc ggtaccgcat gctatgcatc 1620agctgctagc accatggctc gagatcccgg gtgatcaagt cttcgtcgag tgattgtaaa 1680taaaatgtaa tttacagtat agtattttaa ttaatataca aatgatttga taataattct 1740tatttaacta taatatattg tgttgggttg aattaaaggt ccgtatacta gtatcctagg 1800gtatacccat ctaattggaa ccagataagt gaaatctagt tccaaactat tttgtcattt 1860ttaattttcg tattagctta cgacgctaca
cccagttccc atctattttg tcactcttcc 1920ctaaataatc cttaaaaact ccatttccac ccctcccagt tcccaactat tttgtccgcc 1980cacaaccggt tgacttgggt caactgtcag accaagttta ctcatatata ctttagattg 2040atttaaaact tcatttttaa tttaaaagga tctaggtgaa gatccttttt gataatctca 2100tgaccacagg cattggcggc cttgctgttc ttctacggca aggtgctgtg cacgcccagc 2160tgccattttt ggggtgaggt cgttcgcggc cgaggggcgc agcccctggg gggatggggt 2220gccgcgttag cgggccggga gggttcgaga agggggggca ccccccttcg gcgtgcgcgg 2280tcacgcgcca gggcgcagcc ctggttaaaa acaaggttta taaatattgg tttaaaagca 2340ggttaaaaga caggttagcg gtggccgaaa aacgggcgga aacccttgca aatgctggat 2400tttctgcctg tggacagccc ctcaaatgtc aataggtgcg cccctcatct gtcatcactc 2460tgcccctcaa gtgtcaagga tcgcgcccct catctgtcag tagtcgcgcc cctcaagtgt 2520caataccgca gggcacttat ccccaggctt gtccacatca tctgtgggaa actcgcgtaa 2580aatcaggcgt tttcgccgat ttgcgaggct ggccagctcc acgtcgccgg ccgaaatcga 2640gcctgcccct catctgtcaa cgccgcgccg ggtgagtcgg cccctcaagt gtcaacgtcc 2700gcccctcatc tgtcagtgag ggccaagttt tccgcgtggt atccacaacg ccggcggcca 2760aaagaagagc tttcacaccg catagaccag ccgcgtaacc tggca 2805144589DNAArtificialpOmniBac1 14accggtggag gaaattctcc ttgaagtttc cctggtgttc aaagtaaagg agtttgcacc 60agacgcacct ctgttcactg gtccggcgta ttaaaacacg atacattgtt attagtacat 120ttattaagcg ctagattctg tgcgttgttg atttacagac aattgttgta cgtattttaa 180taattcatta aatttataat ctttagggtg gtatgttaga gcgaaaatca aatgattttc 240agcgtcttta tatctgaatt taaatattaa atcctcaata gatttgtaaa ataggtttcg 300attagtttca aacaagggtt gtttttccga accgatggct ggactatcta atggattttc 360gctcaacgcc acaaaacttg ccaaatcttg tagcagcaat ctagctttgt cgatattcgt 420ttgtgttttg ttttgtaata aaggttcgac gtcgttcaaa atattatgcg cttttgtatt 480tctttcatca ctgtcgttag tgtacaattg actcgacgta aacacgttaa atagagcttg 540gacatattta acatcgggcg tgttagcttt attaggccga ttatcgtcgt cgtcccaacc 600ctcgtcgtta gaagttgctt ccgaagacga ttttgccata gccacacgac gcctattaat 660tgtgtcggct aacacgtccg cgatcaaatt tgtagttgag ctttttggaa ttaccggttg 720acttgggtca actgtcagac caagtttact catatatact ttagattgat ttaaaacttc 
780atttttaatt taaaaggatc taggtgaaga tcctttttga taatctcatg accaaaatcc 840cttaacgtga gttttcgttc cactgagcgt cagaccccgt agaaaagatc aaaggatctt 900cttgagatcc tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac 960cagcggtggt ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct 1020tcagcagagc gcagatacca aatactgttc ttctagtgta gccgtagtta ggccaccact 1080tcaagaactc tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg 1140ctgccagtgg cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata 1200aggcgcagcg gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga 1260cctacaccga actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag 1320ggagaaaggc ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg 1380agcttccagg gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac 1440ttgagcgtcg atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca 1500acgcggcctt tttacggttc ctggcctttt gctggccttt tgctcacatg ttctttcctg 1560cgttatcccc tgattgactt gggtcgctct tcctgtggat gcgcaggtat gtacaggaag 1620aggtttatac taaactgtta cattgcaaac gtggtttcgt gtgccaagtg tgaaaaccga 1680tgtttaatca aggctctgac gcatttctac aaccacgact ctaagtgtgt gggtgaagtc 1740atgcatcttt taatcaaatc ccaagatgtg tataaaccac caaactgcca aaaaatgaaa 1800actgtcgaca agctctgtcc gtttgctggc aactgcaagg gtctcaatcc tatttgtaat 1860tattgaataa taaaacaatt ataaatgtca aatttgtttt ttattaacga tacaaaccaa 1920acgcaacaag aacatttgta gtattatcta taattgaaaa cgcgtagtta taatcgctga 1980ggtaatattt aaaatcattt tcaaatgatt cacagttaat ttgcgacaat ataattttat 2040tttcacataa actagacgcc ttgtcgtctt cttcttcgta ttccttctct ttttcatttt 2100tctcttcata aaaattaaca tagttattat cgtatccata tatgtatcta tcgtatagag 2160taaatttttt gttgtcataa atatatatgt cttttttaat ggggtgtata gtaccgctgc 2220gcatagtttt tctgtaattt acaacagtgc tattttctgg tagttcttcg gagtgtgttg 2280ctttaattat taaatttata taatcaatga atttgggatc gtcggttttg tacaatatgt 2340tgccggcata gtacgcagct tcttctagtt caattacacc attttttagc agcaccggat 2400taacataact ttccaaaatg ttgtacgaac cgttaaacaa aaacagttca cctccctttt 2460ctatactatt gtctgcgagc agttgtttgt 
tgttaaaaat aacagccatt gtaatgagac 2520gcacaaacta atatcacaaa ctggaaatgt ctatcaatat atagttgctg attgcgcaga 2580tgccctgcgt aagcgggtgt gggcggacaa taaagtctta aactgaacaa aatagatcta 2640aactatgaca ataaagtctt aaactagaca gaatagttgt aaactgaaat cagtccagtt 2700atgctgtgaa aaagcatact ggacttttgt tatggctaaa gcaaactctt cattttctga 2760agtgcaaatt gcccgtcgta ttaaagaggg gcgtggccaa gggcatgtaa agactatatt 2820cgcggcgttg tgacaattta ccgaacaact ccgcggccgg gaagccgatc tcggcttgaa 2880cgaattgtta ggtggcggta cttgggtcga tatcaaagtg catcacttct tcccgtatgc 2940ccaactttgt atagagagcc actgcgggat cgtcaccgta atctgcttgc acgtagatca 3000cataagcacc aagcgcgttg gcctcatgct tgaggagatt gatgagcgcg gtggcaatgc 3060cctgcctccg gtgctcgccg gagactgcga gatcatagat atagatctca ctacgcggct 3120gctcaaactt gggcagaacg taagccgcga gagcgccaac aaccgcttct tggtcgaagg 3180cagcaagcgc gatgaatgtc ttactacgga gcaagttccc gaggtaatcg gagtccggct 3240gatgttggga gtaggtggct acgtctccga actcacgacc gaaaagatca agagcagccc 3300gcatggattt gacttggtca gggccgagcc tacatgtgcg aatgatgccc atacttgagc 3360cacctaactt tgttttaggg cgactgccct gctgcgtaac atcgttgctg ctgcgtaaca 3420tcgttgctgc tccataacat caaacatcga cccacggcgt aacgcgcttg ctgcttggat 3480gcccgaggca tagactgtac aaaaaaacag tcataacaag ccatgaaaac cgccactgcg 3540ccgttaccac cgctgcgttc ggtcaaggtt ctggaccagt tgcgtgagcg catacgctac 3600ttgcattaca gtttacgaac cgaacaggct tatgtcaact gggttcgtgc cttcatccgt 3660ttccacggtg tgcgtcaccc ggcaaccttg ggcagcagcg aagtcgccat aacttcgtat 3720agcatacatt atacgaagtt atctgtaact ataacggtcc taaggtagcg agtttaaaca 3780ctagtatcga ttcgcgacct actccggaat attaatagat catggagata attaaaatga 3840taaccatctc gcaaataaat aagtatttta ctgttttcgt aacagttttg taataaaaaa 3900acctataaat attccggatt attcataccg tcccaccatc gggcgcggat cccggtccga 3960agcgcgcgga attcaaaggc ctacgtcgac gagctcactt gtcgcggccg ctttcgaatc 4020tagagcctgc agtctcgaca agcttgtcga gaagtactag aggatcataa tcagccatac 4080cacatttgta gaggttttac ttgctttaaa aaacctccca cacctccccc tgaacctgaa 4140acataaaatg aatgcaattg ttgttgttaa cttgtttatt gcagcttata atggttacaa 
4200ataaagcaat agcatcacaa atttcacaaa taaagcattt ttttcactgc attctagttg 4260tggtttgtcc aaactcatca atgtatctta tcatgtctgg atctgatcac tgcttgagcc 4320tagaagatcc ggctgctaac aaagcccgaa aggaagctga gttggctgct gccaccgctg 4380agcaataact atcataaccc ctagggtata cccatctaat tggaaccaga taagtgaaat 4440ctagttccaa actattttgt catttttaat tttcgtatta gcttacgacg ctacacccag 4500ttcccatcta ttttgtcact cttccctaaa taatccttaa aaactccatt tccacccctc 4560ccagttccca actattttgt ccgcccaca 4589154446DNAArtificialpOmniBac2 15ccggtggagg aaattctcct tgaagtttcc ctggtgttca aagtaaagga gtttgcacca 60gacgcacctc tgttcactgg tccggcgtat taaaacacga tacattgtta ttagtacatt 120tattaagcgc tagattctgt gcgttgttga tttacagaca attgttgtac gtattttaat 180aattcattaa atttataatc tttagggtgg tatgttagag cgaaaatcaa atgattttca 240gcgtctttat atctgaattt aaatattaaa tcctcaatag atttgtaaaa taggtttcga 300ttagtttcaa acaagggttg tttttccgaa ccgatggctg gactatctaa tggattttcg 360ctcaacgcca caaaacttgc caaatcttgt agcagcaatc tagctttgtc gatattcgtt 420tgtgttttgt tttgtaataa aggttcgacg tcgttcaaaa tattatgcgc ttttgtattt 480ctttcatcac tgtcgttagt gtacaattga ctcgacgtaa acacgttaaa tagagcttgg 540acatatttaa catcgggcgt gttagcttta ttaggccgat tatcgtcgtc gtcccaaccc 600tcgtcgttag aagttgcttc cgaagacgat tttgccatag ccacacgacg cctattaatt 660gtgtcggcta acacgtccgc gatcaaattt gtagttgagc tttttggaat taccggttga 720cttgggtcaa ctgtcagacc aagtttactc atatatactt tagattgatt taaaacttca 780tttttaattt aaaaggatct aggtgaagat cctttttgat aatctcatga ccaaaatccc 840ttaacgtgag ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc 900ttgagatcct ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc 960agcggtggtt tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt 1020cagcagagcg cagataccaa atactgttct tctagtgtag ccgtagttag gccaccactt 1080caagaactct gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc 1140tgccagtggc gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa 1200ggcgcagcgg tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac 1260ctacaccgaa ctgagatacc tacagcgtga gctatgagaa agcgccacgc 
ttcccgaagg 1320gagaaaggcg gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga 1380gcttccaggg ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact 1440tgagcgtcga tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa 1500cgcggccttt ttacggttcc tggccttttg ctggcctttt gctcacatgt tctttcctgc 1560gttatcccct gattgacttg ggtcgctctt cctgtggatg cgcaggtatg tacaggaaga 1620ggtttatact aaactgttac attgcaaacg tggtttcgtg tgccaagtgt gaaaaccgat 1680gtttaatcaa ggctctgacg catttctaca accacgactc taagtgtgtg ggtgaagtca 1740tgcatctttt aatcaaatcc caagatgtgt ataaaccacc aaactgccaa aaaatgaaaa 1800ctgtcgacaa gctctgtccg tttgctggca actgcaaggg tctcaatcct atttgtaatt 1860attgaataat aaaacaatta taaatgtcaa atttgttttt tattaacgat acaaaccaaa 1920cgcaacaaga acatttgtag tattatctat aattgaaaac gcgtagttat aatcgctgag 1980gtaatattta aaatcatttt caaatgattc acagttaatt tgcgacaata taattttatt 2040ttcacataaa ctagacgcct tgtcgtcttc ttcttcgtat tccttctctt tttcattttt 2100ctcttcataa aaattaacat agttattatc gtatccatat atgtatctat cgtatagagt 2160aaattttttg ttgtcataaa tatatatgtc ttttttaatg gggtgtatag taccgctgcg 2220catagttttt ctgtaattta caacagtgct attttctggt agttcttcgg agtgtgttgc 2280tttaattatt aaatttatat aatcaatgaa tttgggatcg tcggttttgt acaatatgtt 2340gccggcatag tacgcagctt cttctagttc aattacacca ttttttagca gcaccggatt 2400aacataactt tccaaaatgt tgtacgaacc gttaaacaaa aacagttcac ctcccttttc 2460tatactattg tctgcgagca gttgtttgtt gttaaaaata acagccattg taatgagacg 2520cacaaactaa tatcacaaac tggaaatgtc tatcaatata tagttgctga ttgcgcagat 2580gccctgcgta agcgggtgtg ggcggacaat aaagtcttaa actgaacaaa atagatctaa 2640actatgacaa taaagtctta aactagacag aatagttgta aactgaaatc agtccagtta 2700tgctgtgaaa aagcatactg gacttttgtt atggctaaag caaactcttc attttctgaa 2760gtgcaaattg cccgtcgtat taaagagggg cgtggccaag ggcatgtaaa gactatattc 2820gcggcgttgt gacaatttac cgaacaactc cgcggccggg aagccgatct cggcttgaac 2880gaattgttag gtggcggtac ttgggtcgat atcaaagtgc atcacttctt cccgtatgcc 2940caactttgta tagagagcca ctgcgggatc gtcaccgtaa tctgcttgca cgtagatcac 3000ataagcacca agcgcgttgg 
cctcatgctt gaggagattg atgagcgcgg tggcaatgcc 3060ctgcctccgg tgctcgccgg agactgcgag atcatagata tagatctcac tacgcggctg 3120ctcaaacttg ggcagaacgt aagccgcgag agcgccaaca accgcttctt ggtcgaaggc 3180agcaagcgcg atgaatgtct tactacggag caagttcccg aggtaatcgg agtccggctg 3240atgttgggag taggtggcta cgtctccgaa ctcacgaccg aaaagatcaa gagcagcccg 3300catggatttg acttggtcag ggccgagcct acatgtgcga atgatgccca tacttgagcc 3360acctaacttt gttttagggc gactgccctg ctgcgtaaca tcgttgctgc tgcgtaacat 3420cgttgctgct ccataacatc aaacatcgac ccacggcgta acgcgcttgc tgcttggatg 3480cccgaggcat agactgtaca aaaaaacagt cataacaagc catgaaaacc gccactgcgc 3540cgttaccacc gctgcgttcg gtcaaggttc tggaccagtt gcgtgagcgc atacgctact 3600tgcattacag tttacgaacc gaacaggctt atgtcaactg ggttcgtgcc ttcatccgtt 3660tccacggtgt gcgtcacccg gcaaccttgg gcagcagcga agtcgccata acttcgtata 3720gcatacatta tacgaagtta tctgtaacta taacggtcct aaggtagcga gtttaaacgt 3780acccgtagtg gctatggcag ggcttgccgc cccgacgttg gctgcgagcc ctgggccttc 3840acccgaactt gggggttggg gtggggaaaa ggaagaaacg cgggcgtatt ggtcccaatg 3900gggtctcggt ggggtatcga cagagtgcca gccctgggac cgaaccccgc gtttatgaac 3960aaacgaccca acacccgtgc gttttattct gtctttttat tgccgtcata gcgcgggttc 4020cttccggtat tgtctccttc cgtgtttcag ttagcctccc ccatctcccg gtaccgcatg 4080ctatgcatca gctgctagca ccatggctcg agatcccggg tgatcaagtc ttcgtcgagt 4140gattgtaaat aaaatgtaat ttacagtata gtattttaat taatatacaa atgatttgat 4200aataattctt atttaactat aatatattgt gttgggttga attaaaggtc cgtatactag 4260ggtataccca tctaattgga accagataag tgaaatctag ttccaaacta ttttgtcatt 4320tttaattttc gtattagctt acgacgctac acccagttcc catctatttt gtcactcttc 4380cctaaataat ccttaaaaac tccatttcca cccctcccag ttcccaacta ttttgtccgc 4440ccacaa 4446164625DNAArtificialpOmniBac3 16gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc 60ctgcgttatc ccctgattga cttgggtcgc tcttcctgtg gatgcgcagg tatgtacagg 120aagaggttta tactaaactg ttacattgca aacgtggttt cgtgtgccaa gtgtgaaaac 180cgatgtttaa tcaaggctct gacgcatttc tacaaccacg actctaagtg tgtgggtgaa 240gtcatgcatc ttttaatcaa 
atcccaagat gtgtataaac caccaaactg ccaaaaaatg 300aaaactgtcg acaagctctg tccgtttgct ggcaactgca agggtctcaa tcctatttgt 360aattattgaa taataaaaca attataaatg tcaaatttgt tttttattaa cgatacaaac 420caaacgcaac aagaacattt gtagtattat ctataattga aaacgcgtag ttataatcgc 480tgaggtaata tttaaaatca ttttcaaatg attcacagtt aatttgcgac aatataattt 540tattttcaca taaactagac gccttgtcgt cttcttcttc gtattccttc tctttttcat 600ttttctcttc ataaaaatta acatagttat tatcgtatcc atatatgtat ctatcgtata 660gagtaaattt tttgttgtca taaatatata tgtctttttt aatggggtgt atagtaccgc 720tgcgcatagt ttttctgtaa tttacaacag tgctattttc tggtagttct tcggagtgtg 780ttgctttaat tattaaattt atataatcaa tgaatttggg atcgtcggtt ttgtacaata 840tgttgccggc atagtacgca gcttcttcta gttcaattac accatttttt agcagcaccg 900gattaacata actttccaaa atgttgtacg aaccgttaaa caaaaacagt tcacctccct 960tttctatact attgtctgcg agcagttgtt tgttgttaaa aataacagcc attgtaatga 1020gacgcacaaa ctaatatcac aaactggaaa tgtctatcaa tatatagttg ctgattgcgc 1080agatgccctg cgtaagcggg tgtgggcgga caataaagtc ttaaactgaa caaaatagat 1140ctaaactatg acaataaagt cttaaactag acagaatagt tgtaaactga aatcagtcca 1200gttatgctgt gaaaaagcat actggacttt tgttatggct aaagcaaact cttcattttc 1260tgaagtgcaa attgcccgtc gtattaaaga ggggcgtggc caagggcatg taaagactat 1320attcgcggcg ttgtgacaat ttaccgaaca actccgcggc cgggaagccg atctcggctt 1380gaacgaattg ttaggtggcg gtacttgggt cgatatcaaa gtgcatcact tcttcccgta 1440tgcccaactt tgtatagaga gccactgcgg gatcgtcacc gtaatctgct tgcacgtaga 1500tcacataagc accaagcgcg ttggcctcat gcttgaggag attgatgagc gcggtggcaa 1560tgccctgcct ccggtgctcg ccggagactg cgagatcata gatatagatc tcactacgcg 1620gctgctcaaa cttgggcaga acgtaagccg cgagagcgcc aacaaccgct tcttggtcga 1680aggcagcaag cgcgatgaat gtcttactac ggagcaagtt cccgaggtaa tcggagtccg 1740gctgatgttg ggagtaggtg gctacgtctc cgaactcacg accgaaaaga tcaagagcag 1800cccgcatgga tttgacttgg tcagggccga gcctacatgt gcgaatgatg cccatacttg 1860agccacctaa ctttgtttta gggcgactgc cctgctgcgt aacatcgttg ctgctgcgta 1920acatcgttgc tgctccataa catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg 
1980gatgcccgag gcatagactg tacaaaaaaa cagtcataac aagccatgaa aaccgccact 2040gcgccgttac caccgctgcg ttcggtcaag gttctggacc agttgcgtga gcgcatacgc 2100tacttgcatt acagtttacg aaccgaacag gcttatgtca actgggttcg tgccttcatc 2160cgtttccacg gtgtgcgtca cccggcaacc ttgggcagca gcgaagtcgc cataacttcg 2220tatagcatac attatacgaa gttatctgta actataacgg tcctaaggta gcgagtttaa 2280acactagtat cgattcgcga cctactccgg aatattaata gatcatggag ataattaaaa 2340tgataaccat ctcgcaaata aataagtatt ttactgtttt cgtaacagtt ttgtaataaa 2400aaaacctata aatattccgg attattcata ccgtcccacc atcgggcgcg gatcccggtc 2460cgaagcgcgc ggaattcaaa ggcctacgtc gacgagctca cttgtcgcgg ccgctttcga 2520atctagagcc tgcagtctcg acaagcttgt cgagaagtac tagaggatca taatcagcca 2580taccacattt gtagaggttt tacttgcttt aaaaaacctc ccacacctcc ccctgaacct 2640gaaacataaa atgaatgcaa ttgttgttgt taacttgttt attgcagctt ataatggtta 2700caaataaagc aatagcatca caaatttcac aaataaagca tttttttcac tgcattctag 2760ttgtggtttg tccaaactca tcaatgtatc ttatcatgtc tggatctgat cactgcttga 2820gcctagaaga tccggctgct aacaaagccc gaaaggaagc tgagttggct gctgccaccg 2880ctgagcaata actatcataa cccctagggt atacccatct aattggaacc agataagtga 2940aatctagttc caaactattt tgtcattttt aattttcgta ttagcttacg acgctacacc 3000cagttcccat ctattttgtc actcttccct aaataatcct taaaaactcc atttccaccc 3060ctcccagttc ccaactattt tgtccgccca caaccggtgg aggaaattct ccttgaagtt 3120tccctggtgt tcaaagtaaa ggagtttgca ccagacgcac ctctgttcac tggtccggcg 3180tattaaaaca cgatacattg ttattagtac atttattaag cgctagattc tgtgcgttgt 3240tgatttacag acaattgttg tacgtatttt aataattcat taaatttata atctttaggg 3300tggtatgtta gagcgaaaat caaatgattt tcagcgtctt tatatctgaa tttaaatatt 3360aaatcctcaa tagatttgta aaataggttt cgattagttt caaacaaggg ttgtttttcc 3420gaaccgatgg ctggactatc taatggattt tcgctcaacg ccacaaaact tgccaaatct 3480tgtagcagca atctagcttt gtcgatattc gtttgtgttt tgttttgtaa taaaggttcg 3540acgtcgttca aaatattatg cgcttttgta tttctttcat cactgtcgtt agtgtacaat 3600tgactcgacg taaacacgtt aaatagagct tggacatatt taacatcggg cgtgttagct 3660ttattaggcc gattatcgtc gtcgtcccaa 
ccctcgtcgt tagaagttgc ttccgaagac 3720gattttgcca tagccacacg acgcctatta attgtgtcgg ctaacacgtc cgcgatcaaa 3780tttgtagttg agctttttgg aattaccggt tgacttgggt caactgtcag accaagttta 3840ctcatatata ctttagattg atttaaaact tcatttttaa tttaaaagga tctaggtgaa 3900gatccttttt gataatctca tgaccacagg cattggcggc cttgctgttc ttctacggca 3960aggtgctgtg cacgcccagc tgccattttt ggggtgaggt cgttcgcggc cgaggggcgc 4020agcccctggg gggatggggt gccgcgttag cgggccggga gggttcgaga agggggggca 4080ccccccttcg gcgtgcgcgg tcacgcgcca gggcgcagcc ctggttaaaa acaaggttta 4140taaatattgg tttaaaagca ggttaaaaga caggttagcg gtggccgaaa aacgggcgga 4200aacccttgca aatgctggat tttctgcctg tggacagccc ctcaaatgtc aataggtgcg 4260cccctcatct gtcatcactc tgcccctcaa gtgtcaagga tcgcgcccct catctgtcag 4320tagtcgcgcc cctcaagtgt caataccgca gggcacttat ccccaggctt gtccacatca 4380tctgtgggaa actcgcgtaa aatcaggcgt tttcgccgat ttgcgaggct ggccagctcc 4440acgtcgccgg ccgaaatcga gcctgcccct catctgtcaa cgccgcgccg ggtgagtcgg 4500cccctcaagt gtcaacgtcc gcccctcatc tgtcagtgag ggccaagttt tccgcgtggt 4560atccacaacg ccggcggcca aaagaagagc tttcacaccg catagaccag ccgcgtaacc 4620tggca 4625174490DNAArtificialpOmniBac4 17gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac atgttctttc 60ctgcgttatc ccctgattga cttgggtcgc tcttcctgtg gatgcgcagg tatgtacagg 120aagaggttta tactaaactg ttacattgca aacgtggttt cgtgtgccaa gtgtgaaaac 180cgatgtttaa tcaaggctct gacgcatttc

tacaaccacg actctaagtg tgtgggtgaa 240gtcatgcatc ttttaatcaa atcccaagat gtgtataaac caccaaactg ccaaaaaatg 300aaaactgtcg acaagctctg tccgtttgct ggcaactgca agggtctcaa tcctatttgt 360aattattgaa taataaaaca attataaatg tcaaatttgt tttttattaa cgatacaaac 420caaacgcaac aagaacattt gtagtattat ctataattga aaacgcgtag ttataatcgc 480tgaggtaata tttaaaatca ttttcaaatg attcacagtt aatttgcgac aatataattt 540tattttcaca taaactagac gccttgtcgt cttcttcttc gtattccttc tctttttcat 600ttttctcttc ataaaaatta acatagttat tatcgtatcc atatatgtat ctatcgtata 660gagtaaattt tttgttgtca taaatatata tgtctttttt aatggggtgt atagtaccgc 720tgcgcatagt ttttctgtaa tttacaacag tgctattttc tggtagttct tcggagtgtg 780ttgctttaat tattaaattt atataatcaa tgaatttggg atcgtcggtt ttgtacaata 840tgttgccggc atagtacgca gcttcttcta gttcaattac accatttttt agcagcaccg 900gattaacata actttccaaa atgttgtacg aaccgttaaa caaaaacagt tcacctccct 960tttctatact attgtctgcg agcagttgtt tgttgttaaa aataacagcc attgtaatga 1020gacgcacaaa ctaatatcac aaactggaaa tgtctatcaa tatatagttg ctgattgcgc 1080agatgccctg cgtaagcggg tgtgggcgga caataaagtc ttaaactgaa caaaatagat 1140ctaaactatg acaataaagt cttaaactag acagaatagt tgtaaactga aatcagtcca 1200gttatgctgt gaaaaagcat actggacttt tgttatggct aaagcaaact cttcattttc 1260tgaagtgcaa attgcccgtc gtattaaaga ggggcgtggc caagggcatg taaagactat 1320attcgcggcg ttgtgacaat ttaccgaaca actccgcggc cgggaagccg atctcggctt 1380gaacgaattg ttaggtggcg gtacttgggt cgatatcaaa gtgcatcact tcttcccgta 1440tgcccaactt tgtatagaga gccactgcgg gatcgtcacc gtaatctgct tgcacgtaga 1500tcacataagc accaagcgcg ttggcctcat gcttgaggag attgatgagc gcggtggcaa 1560tgccctgcct ccggtgctcg ccggagactg cgagatcata gatatagatc tcactacgcg 1620gctgctcaaa cttgggcaga acgtaagccg cgagagcgcc aacaaccgct tcttggtcga 1680aggcagcaag cgcgatgaat gtcttactac ggagcaagtt cccgaggtaa tcggagtccg 1740gctgatgttg ggagtaggtg gctacgtctc cgaactcacg accgaaaaga tcaagagcag 1800cccgcatgga tttgacttgg tcagggccga gcctacatgt gcgaatgatg cccatacttg 1860agccacctaa ctttgtttta gggcgactgc cctgctgcgt aacatcgttg ctgctgcgta 1920acatcgttgc 
tgctccataa catcaaacat cgacccacgg cgtaacgcgc ttgctgcttg 1980gatgcccgag gcatagactg tacaaaaaaa cagtcataac aagccatgaa aaccgccact 2040gcgccgttac caccgctgcg ttcggtcaag gttctggacc agttgcgtga gcgcatacgc 2100tacttgcatt acagtttacg aaccgaacag gcttatgtca actgggttcg tgccttcatc 2160cgtttccacg gtgtgcgtca cccggcaacc ttgggcagca gcgaagtcgc cataacttcg 2220tatagcatac attatacgaa gttatctgta actataacgg tcctaaggta gcgagtttaa 2280acgtacccgt agtggctatg gcagggcttg ccgccccgac gttggctgcg agccctgggc 2340cttcacccga acttgggggt tggggtgggg aaaaggaaga aacgcgggcg tattggtccc 2400aatggggtct cggtggggta tcgacagagt gccagccctg ggaccgaacc ccgcgtttat 2460gaacaaacga cccaacaccc gtgcgtttta ttctgtcttt ttattgccgt catagcgcgg 2520gttccttccg gtattgtctc cttccgtgtt tcagttagcc tcccccatct cccggtaccg 2580catgctatgc atcagctgct agcaccatgg ctcgagatcc cgggtgatca agtcttcgtc 2640gagtgattgt aaataaaatg taatttacag tatagtattt taattaatat acaaatgatt 2700tgataataat tcttatttaa ctataatata ttgtgttggg ttgaattaaa ggtccgtata 2760ctagtatcct agggtatacc catctaattg gaaccagata agtgaaatct agttccaaac 2820tattttgtca tttttaattt tcgtattagc ttacgacgct acacccagtt cccatctatt 2880ttgtcactct tccctaaata atccttaaaa actccatttc cacccctccc agttcccaac 2940tattttgtcc gcccacaacc ggtggaggaa attctccttg aagtttccct ggtgttcaaa 3000gtaaaggagt ttgcaccaga cgcacctctg ttcactggtc cggcgtatta aaacacgata 3060cattgttatt agtacattta ttaagcgcta gattctgtgc gttgttgatt tacagacaat 3120tgttgtacgt attttaataa ttcattaaat ttataatctt tagggtggta tgttagagcg 3180aaaatcaaat gattttcagc gtctttatat ctgaatttaa atattaaatc ctcaatagat 3240ttgtaaaata ggtttcgatt agtttcaaac aagggttgtt tttccgaacc gatggctgga 3300ctatctaatg gattttcgct caacgccaca aaacttgcca aatcttgtag cagcaatcta 3360gctttgtcga tattcgtttg tgttttgttt tgtaataaag gttcgacgtc gttcaaaata 3420ttatgcgctt ttgtatttct ttcatcactg tcgttagtgt acaattgact cgacgtaaac 3480acgttaaata gagcttggac atatttaaca tcgggcgtgt tagctttatt aggccgatta 3540tcgtcgtcgt cccaaccctc gtcgttagaa gttgcttccg aagacgattt tgccatagcc 3600acacgacgcc tattaattgt gtcggctaac acgtccgcga 
tcaaatttgt agttgagctt 3660tttggaatta ccggttgact tgggtcaact gtcagaccaa gtttactcat atatacttta 3720gattgattta aaacttcatt tttaatttaa aaggatctag gtgaagatcc tttttgataa 3780tctcatgacc acaggcattg gcggccttgc tgttcttcta cggcaaggtg ctgtgcacgc 3840ccagctgcca tttttggggt gaggtcgttc gcggccgagg ggcgcagccc ctggggggat 3900ggggtgccgc gttagcgggc cgggagggtt cgagaagggg gggcaccccc cttcggcgtg 3960cgcggtcacg cgccagggcg cagccctggt taaaaacaag gtttataaat attggtttaa 4020aagcaggtta aaagacaggt tagcggtggc cgaaaaacgg gcggaaaccc ttgcaaatgc 4080tggattttct gcctgtggac agcccctcaa atgtcaatag gtgcgcccct catctgtcat 4140cactctgccc ctcaagtgtc aaggatcgcg cccctcatct gtcagtagtc gcgcccctca 4200agtgtcaata ccgcagggca cttatcccca ggcttgtcca catcatctgt gggaaactcg 4260cgtaaaatca ggcgttttcg ccgatttgcg aggctggcca gctccacgtc gccggccgaa 4320atcgagcctg cccctcatct gtcaacgccg cgccgggtga gtcggcccct caagtgtcaa 4380cgtccgcccc tcatctgtca gtgagggcca agttttccgc gtggtatcca caacgccggc 4440ggccaaaaga agagctttca caccgcatag accagccgcg taacctggca 4490188823DNAArtificialpACKS 18ggtaccgcgg ccgcgtagag gatctgttga tcagcagttc aacctgttga tagtacttcg 60ttaatacaga tgtaggtgtt ggcaccatgc ataactataa cggtcctaag gtagcgacct 120aggtatcgat aatacgactc actatagggg aattgtgagc ggataacaat tcccctctag 180aaataatttt gtttaacttt aagaaggaga tatacatatg aggcctcgga tcctgtaaaa 240cgacggccag tgaattcccc gggaagcttc gccagggttt tcccagtcga gctcgatatc 300ggtaccagcg gataacaatt tcacatccgg atcgcgaacg cgtctcgaga gatccggctg 360ctaacaaagc ccgaaaggaa gctgagttgg ctgctgccac cgctgagcaa taactagcat 420aaccccttgg ggcctctaaa cgggtcttga ggggtttttt ggtttaaacc catctaattg 480gactagtagc ccgcctaatg agcgggcttt tttttaattc ccctatttgt ttatttttct 540aaatacattc aaatatgtat ccgctcatga gacaataacc ctgataaatg cttcaataat 600attgaaaaag gaagagtatg agtattcaac atttccgtgt cgcccttatt cccttttttg 660cggcattttg ccttcctgtt tttgctcacc cagaaacgct cgtgaaagta aaagacgcag 720aggaccaatt gggggcacga gtgggataca tagaactgga cttgaatagc ggtaaaatcc 780ttgagagttt tcgccctgaa gagcgttttc caatgatgag cactttcaaa gttctgctat 840gtggagcagt 
attatcccgt gtagatgcgg ggcaagagca actcggacga cgaatacact 900attcgcagaa tgacttggtt gaatactccc cagtgacaga aaagcacctt acggacggaa 960tgacggtaag agaattatgt agtgccgcca taacgatgag tgataacact gcggcgaact 1020tacttctgac aaccatcggt ggaccgaagg aattaaccgc ttttttgcac aatatgggag 1080accatgtaac tcgccttgac cgttgggaac cagaactgaa tgaagccata ccaaacgacg 1140agcgagacac cacaatgcct gcggcaatgg caacaacatt acgcaaacta ttaactggcg 1200aactacttac tctggcttca cggcaacaat taatagactg gcttgaagcg gataaagttg 1260caggaccact actgcgttcg gcacttcctg ctggctggtt tattgctgat aaatctgggg 1320caggagagcg tggttcacgg ggtatcattg ccgcacttgg accagatggt aagccttccc 1380gtatcgtagt tatctacacg acgggtagtc aggcaactat ggacgaacga aatagacaga 1440ttgctgaaat aggggcttca ctgattaagc attggtaaac cgatacaatt aaaggctcct 1500tttggagcct ttttttttgg acggaccggt agaaaagatc aaaggatctt cttgagatcc 1560tttttttctg cgcgtaatct gctgcttgca aacaaaaaaa ccaccgctac cagcggtggt 1620ttgtttgccg gatcaagagc taccaactct ttttccgaag gtaactggct tcagcagagc 1680gcagatacca aatactgtcc ttctagtgta gccgtagtta ggccaccact tcaagaactc 1740tgtagcaccg cctacatacc tcgctctgct aatcctgtta ccagtggctg ctgccagtgg 1800cgataagtcg tgtcttaccg ggttggactc aagacgatag ttaccggata aggcgcagcg 1860gtcgggctga acggggggtt cgtgcacaca gcccagcttg gagcgaacga cctacaccga 1920actgagatac ctacagcgtg agctatgaga aagcgccacg cttcccgaag ggagaaaggc 1980ggacaggtat ccggtaagcg gcagggtcgg aacaggagag cgcacgaggg agcttccagg 2040gggaaacgcc tggtatcttt atagtcctgt cgggtttcgc cacctctgac ttgagcgtcg 2100atttttgtga tgctcgtcag gggggcggag cctatggaaa aacgccagca acgcggcctt 2160tttacggttc ctggcctttt gctggccttt tgctcacatg ttctttcctg cgttatcccc 2220tgattctgtg gataaccgta ttaccgcctt tgagtgagct gataccgctc gccgcagccg 2280aacgaccgag cgcagcgagt cagtgagcga ggaagcggaa gagcgcctga tgcggtattt 2340tctccttacg catctgtgcg gtatttcaca ccgcaatggt gcactctcag tacaatctgc 2400tctgatgccg catagttaag ccagtataca ctccgctatc gctacgtgac tgggtcatgg 2460ctgcgccccg acacccgcca acacccgctg acgcgccctg acgggcttgt ctgctcccgg 2520catccgctta cagacaagct gtgaccgtct ccgggagctg 
catgtgtcag aggttttcac 2580cgtcatcacc gaaacgcgcg aggcaggggg aattccagat aacttcgtat aatgtatgct 2640atacgaagtt atggtaccgc ggccgcgtag aggatctgtt gatcagcagt tcaacctgtt 2700gatagtacgt actaagctct catgtttcac gtactaagct ctcatgttta acgtactaag 2760ctctcatgtt taacgaacta aaccctcatg gctaacgtac taagctctca tggctaacgt 2820actaagctct catgtttcac gtactaagct ctcatgtttg aacaataaaa ttaatataaa 2880tcagcaactt aaatagcctc taaggtttta agttttataa gaaaaaaaag aatatataag 2940gcttttaaag cttttaaggt ttaacggttg tggacaacaa gccagggatg taacgcactg 3000agaagccctt agagcctctc aaagcaattt tgagtgacac aggaacactt aacggctgac 3060agaattagct tcacgctgcc gcaagcactc agggcgcaag ggctgctaaa ggaagcggaa 3120cacgtagaaa gccagtccgc agaaacggtg ctgaccccgg atgaatgtca gctgggaggc 3180agaataaatg atcatatcgt caattattac ctccacgggg agagcctgag caaactggcc 3240tcaggcattt gagaagcaca cggtcacact gcttccggta gtcaataaac cggtaaacca 3300gcaatagaca taagcggcta tttaacgacc ctgccctgaa ccgacgaccg ggtcgaattt 3360gctttcgaat ttctgccatt catccgctta ttatcactta ttcaggcgta gcaaccaggc 3420gtttaagggc accaataact gccttaaaaa aattacgccc cgccctgcca ctcatcgcag 3480tactgttgta attcattaag cattctgccg acatggaagc catcacaaac ggcatgatga 3540acctgaatcg ccagcggcat cagcaccttg tcgccttgcg tataatattt gcccatggtg 3600aaaacggggg cgaagaagtt gtccatattg gccacgttta aatcaaaact ggtgaaactc 3660acccagggat tggctgagac gaaaaacata ttctcaataa accctttagg gaaataggcc 3720aggttttcac cgtaacacgc cacatcttgc gaatatatgt gtagaaactg ccggaaatcg 3780tcgtggtatt cactccagag cgatgaaaac gtttcagttt gctcatggaa aacggtgtaa 3840caagggtgaa cactatccca tatcaccagc tcaccgtctt tcattgccat acggaattcc 3900ggatgagcat tcatcaggcg ggcaagaatg tgaataaagg ccggataaaa cttgtgctta 3960tttttcttta cggtctttaa aaaggccgta atatccagct gaacggtctg gttataggta 4020cattgagcaa ctgactgaaa tgcctcaaaa tgttctttac gatgccattg ggatatatca 4080acggtggtat atccagtgat ttttttctcc attttagctt ccttagctcc tgaaaatctc 4140gataactcaa aaaatacgcc cggtagtgat cttatttcat tatggtgaaa gttggaccct 4200cttacgtgcc gatcaacgtc tcattttcgc caaaagttgg cccagatcaa cgtctcattt 4260tcgccaaaag 
ttggcccaga tctatgtcgg gtgcggagaa agaggtaatg aaatggcacc 4320taggtatcga taatacgact cactataggg gaattgtgag cggataacaa ttcccctcta 4380gaaataattt tgtttaactt taagaaggag atatacatat gaggcctcgg atcctgtaaa 4440acgacggcca gtgaattccc cgggaagctt cgccagggtt ttcccagtcg agctcgatat 4500cggtaccagc ggataacaat ttcacatccg gatcgcgaac gcgtctcgag agatccggct 4560gctaacaaag cccgaaagga agctgagttg gctgctgcca ccgctgagca ataactagca 4620taaccccttg gggcctctaa acgggtcttg aggggttttt tggtttaaac ccatgtgcct 4680ggcagataac ttcgtataat gtatgctata cgaagttatg gtacgtacta agctctcatg 4740tttcacgtac taagctctca tgtttaacgt actaagctct catgtttaac gaactaaacc 4800ctcatggcta acgtactaag ctctcatggc taacgtacta agctctcatg tttcacgtac 4860taagctctca tgtttgaaca ataaaattaa tataaatcag caacttaaat agcctctaag 4920gttttaagtt ttataagaaa aaaaagaata tataaggctt ttaaagcttt taaggtttaa 4980cggttgtgga caacaagcca gggatgtaac gcactgagaa gcccttagag cctctcaaag 5040caattttcag tgacacagga acacttaacg gctgacagaa ttagcttcac gctgccgcaa 5100gcactcaggg cgcaagggct gctaaaggaa gcggaacacg tagaaagcca gtccgcagaa 5160acggtgctga ccccggatga atgtcagcta ctgggctatc tggacaaggg aaaacgcaag 5220cgcaaagaga aagcaggtag cttgcagtgg gcttacatgg cgatagctag actgggcggt 5280tttatggaca gcaagcgaac cggaattgcc agctggggcg ccctctggta aggttgggaa 5340gccctgcaaa gtaaactgga tggctttctt gccgccaagg atctgatggc gcaggggatc 5400aagatctgat caagagacag gatgaggatc gtttcgcatg attgaacaag atggattgca 5460cgcaggttct ccggccgctt gggtggagag gctattcggc tatgactggg cacaacagac 5520aatcggctgc tctgatgccg ccgtgttccg gctgtcagcg caggggcgcc cggttctttt 5580tgtcaagacc gacctgtccg gtgccctgaa tgaactgcag gacgaggcag cgcggctatc 5640gtggctggcc acgacgggcg ttccttgcgc agctgtgctc gacgttgtca ctgaagcggg 5700aagggactgg ctgctattgg gcgaagtgcc ggggcaggat ctcctgtcat ctcaccttgc 5760tcctgccgag aaagtatcca tcatggctga tgcaatgcgg cggctgcata cgcttgatcc 5820ggctacctgc ccattcgacc accaagcgaa acatcgcatc gagcgagcac gtactcggat 5880ggaagccggt cttgtcgatc aggatgatct ggacgaagag catcaggggc tcgcgccagc 5940cgaactgttc gccaggctca aggcgcgcat gcccgacggc 
gaggatctcg tcgtgacaca 6000tggcgatgcc tgcttgccga atatcatggt ggaaaatggc cgcttttctg gattcatcga 6060ctgtggccgg ctgggtgtgg cggaccgcta tcaggacata gcgttggcta cccgtgatat 6120tgctgaagag cttggcggcg aatgggctga ccgcttcctc gtgctttacg gtatcgccgc 6180tcccgattcg cagcgcatcg ccttctatcg ccttcttgac gagttcttct gagcgggact 6240ctggggttcg aaatgaccga ccaagcgacg cccaacctgc catcacgaga tttcgattcc 6300accgccgcct tctatgaaag gttgggcttc ggaatcgttt tccgggacgc cggctggatg 6360atcctccagc gcggggatct catgctggag ttcttcgccc accccgggat ctatgtcggg 6420tgcggagaaa gaggtaatga aatggcacct aggtatcgat ggctttacac tttatgcttc 6480cggctcgtat gttgtgtgga attgtgagcg gataacaatt tcacacagga aacagctatg 6540accatgatta cgaatttcta gaaataattt tgtttaactt taagaaggag atatacatat 6600gaggcctcgg atcctgtaaa acgacggcca gtgaattccc cgggaagctt cgccagggtt 6660ttcccagtcg agctcgatat cggtaccagc ggataacaat ttcacatccg gatcgcgaac 6720gcgtctcgag actagttccg tttaaaccca tgtgcctggc agataacttc gtataatgta 6780tgctatacga agttatggta cgtactaagc tctcatgttt cacgtactaa gctctcatgt 6840ttaacgtact aagctctcat gtttaacgaa ctaaaccctc atggctaacg tactaagctc 6900tcatggctaa cgtactaagc tctcatgttt cacgtactaa gctctcatgt ttgaacaata 6960aaattaatat aaatcagcaa cttaaatagc ctctaaggtt ttaagtttta taagaaaaaa 7020aagaatatat aaggctttta aagcttttaa ggtttaacgg ttgtggacaa caagccaggg 7080atgtaacgca ctgagaagcc cttagagcct ctcaaagcaa ttttgagtga cacaggaaca 7140cttaacggct gacataattc agcttcacgc tgccgcaagc actcagggcg caagggctgc 7200taaaggaagc ggaacacgta gaaagccagt ccgcagaaac ggtgctgacc ccggatgaat 7260gtcagctggg aggcagaata aatgatcata tcgtcaatta ttacctccac ggggagagcc 7320tgagcaaact ggcctcaggc atttgagaag cacacggtca cactgcttcc ggtagtcaat 7380aaaccggtaa gtagcgtatg cgctcacgca actggtccag aaccttgacc gaacgcagcg 7440gtggtaacgg cgcagtggcg gttttcatgg cttgttatga ctgttttttt ggggtacagt 7500ctatgcctcg ggcatccaag cagcaagcgc gttacgccgt gggtcgatgt ttgatgttat 7560ggagcagcaa cgatgttacg cagcagggca gtcgccctaa aacaaagtta aacatcatga 7620gggaagcggt gatcgccgaa gtatcgactc aactatcaga ggtagttggc gtcatcgagc 7680gccatctcga 
accgacgttg ctggccgtac atttgtacgg ctccgcagtg gatggcggcc 7740tgaagccaca cagtgatatt gatttgctgg ttacggtgac cgtaaggctt gatgaaacaa 7800cgcggcgagc tttgatcaac gaccttttgg aaacttcggc ttcccctgga gagagcgaga 7860ttctccgcgc tgtagaagtc accattgttg tgcacgacga catcattccg tggcgttatc 7920cagctaagcg cgaactgcaa tttggagaat ggcagcgcaa tgacattctt gcaggtatct 7980tcgagccagc cacgatcgac attgatctgg ctatcttgct gacaaaagca agagaacata 8040gcgttgcctt ggtaggtcca gcggcggagg aactctttga tccggttcct gaacaggatc 8100tatttgaggc gctaaatgaa accttaacgc tatggaactc gccgcccgac tgggctggcg 8160atgagcgaaa tgtagtgctt acgttgtccc gcatttggta cagcgcagta accggcaaaa 8220tcgcgccgaa ggatgtcgct gccgactggg caatggagcg cctgccggcc cagtatcagc 8280ccgtcatact tgaagctaga caggcttatc ttggacaaga agaagatcgc ttggcctcgc 8340gcgcagatca gttggaagaa tttgtccact acgtgaaagg cgagatcacc aaggtagtcg 8400gcaaataatg tctaacaatt cgttcaagcc gacggatcta tgtcgggtgc ggagaaagag 8460gtaatgaaat ggcacctagg tatcgatggc tttacacttt atgcttccgg ctcgtatgtt 8520gtgtggaatt gtgagcggat aacaatttca cacaggaaac agctatgacc atgattacga 8580atttctagaa ataattttgt ttaactttaa gaaggagata tacatatgag gcctcggatc 8640ctgtaaaacg acggccagtg aattccccgg gaagcttcgc cagggttttc ccagtcgagc 8700tcgatatcgg taccagcgga taacaatttc acatccggat cgcgaacgcg tctcgagact 8760agttccgttt aaacccatgt gcctggcaga taacttcgta taatgtatgc tatacgaagt 8820tat 88231934DNAArtificialLoxP imperfect repeat 19ataacttcgt atagcataca ttatacgaag ttat 342031DNAArtificialAdaptor T7InsFor 20tcccgcgaaa ttaatacgac tcactatagg g 312139DNAArtificialAdaptor T7InsRev 21cctcaagacc cgtttagagg ccccaagggg ttatgctag 392239DNAArtificialAdaptor T7VecFor 22ctagcataac cccttggggc ctctaaacgg gtcttgagg 392331DNAArtificialAdaptor T7VecRev 23ccctatagtg agtcgtatta atttcgcggg a 312430DNAArtificialAdaptor NdeInsFor 24gtttaacttt aagaaggaga tatacatatg 302525DNAArtificialAdaptor XhoInsRev 25gggtttaaac ggaactagtc tcgag 252625DNAArtificialAdaptor XhoVecFor 26ctcgagacta gttccgttta aaccc 252730DNAArtificialAdaptor NdeVecRev 27catatgtata tctccttctt aaagttaaac 
302830DNAArtificialAdaptor SmaBam 28gaattcactg gccgtcgttt tacaggatcc 302930DNAArtificialAdaptor BamSma 29ggatcctgta aaacgacggc cagtgaattc 303029DNAArtificialAdaptor SacHind 30gctcgactgg gaaaaccctg gcgaagctt 293129DNAArtificialAdaptor HindSac 31aagcttcgcc agggttttcc cagtcgagc 293233DNAArtificialAdaptor BspEco5 32gatccggatg tgaaattgtt atccgctggt acc 333333DNAArtificialAdaptor Eco5Bsp 33ggtaccagcg gataacaatt tcacatccgg atc 333423DNAArtificialAdaptor PolhInsFor 34cccaccatcg ggcgcggatc ccg 233523DNAArtificialAdaptor PolhInsRev 35cgagactgca ggctctagat tcg 233623DNAArtificialAdaptor PolhVecFor 36cgggatccgc gcccgatggt ggg 233723DNAArtificialAdaptor PolhVecRev 37cgaatctaga gcctgcagtc tcg 233828DNAArtificialAdaptor P10InsFor 38ctcccggtac cgcatgctat gcatcagc 283926DNAArtificialAdaptor P10InsRev

39aatcactcga cgaagacttg atcacc 264028DNAArtificialAdaptor P10VecFor 40gctgatgcat agcatgcggt accgggag 284126DNAArtificialAdaptor P10VecRev 41ggtgatcaag tcttcgtcga gtgatt 264212DNAArtificialBxtXI recognition sequence general 42ccannnnnnt gg 124312DNAArtificialBstXI recognition sequence contained in Donor vectors 43ccatgtgcct gg 124412DNAArtificialBstXI recognition sequence contained in Acceptor vectors 44ccatctaatt gg 124558DNAArtificialPrimer SmaBamVHL 45gaattcactg gccgtcgttt tacaggatcc ttaatctccc atccgttgat gtgcaatg 584658DNAArtificialPrimer BamSmaEB 46ggatcctgta aaacgacggc cagtgaattc gctagctcta gaaataattt tgtttaac 584766DNAArtificialPrimer SacHindEB 47gagctcgact gggaaaaccc tggcgaagct tagatctgga tccttactgc acggcttgtt 60cattgg 664854DNAArtificialPrimer HindSacEC 48aagcttcgcc agggttttcc cagtcgagct ccaattggaa ttcgctagct ctag 544967DNAArtificialPrimer BspEco5EC 49gatccggatg tgaaattgtt atccgctggt accaagctta gatctggatc cttaacaatc 60taagaag 6750593DNAArtificialMultiple integration element 50tatagcatac attatacgaa gttatctgta actataacgg tcctaaggta gcgagtttaa 60acactagtat cgattcgcga cctactccgg aatattaata gatcatggag ataattaaaa 120tgataaccat ctcgcaaata aataagtatt ttactgtttt cgtaacagtt ttgtaataaa 180aaaacctata aatattccgg attattcata ccgtcccacc atcgggcgcg gatcccggtc 240cgaagcgcgc ggaattcaaa ggcctacgtc gacgagctca cttgtcgcgg ccgctttcga 300atctagagcc tgcagtctcg acaagcttgt cgagaagtac tagaggatca taatcagcca 360taccacattt gtagaggttt tacttgcttt aaaaaacctc ccacacctcc ccctgaacct 420gaaacataaa atgaatgcaa ttgttgttgt taacttgttt attgcagctt ataatggtta 480caaataaagc aatagcatca caaatttcac aaataaagca tttttttcac tgcattctag 540ttgtggtttg tccaaactca tcaatgtatc ttatcatgtc tggatctgat cac 59351498DNAArtificialMultiple integration element 51caagccgacg gatctatgtc gggtgcggag aaagaggtaa tgaaatggca cctaggtatc 60gatactagta tacggacctt taattcaacc caacacaata tattatagtt aaataagaat 120tattatcaaa tcatttgtat attaattaaa atactatact gtaaattaca ttttatttac 180aatcactcga cgaagacttg atcacccggg atctcgagcc atggtgctag 
cagctgatgc 240atagcatgcg gtaccgggag atgggggagg ctaactgaaa cacggaagga gacaataccg 300gaaggaaccc gcgctatgac ggcaataaaa agacagaata aaacgcacgg gtgttgggtc 360gtttgttcat aaacgcgggg ttcggtccca gggctggcac tctgtcgata ccccaccgag 420accccattgg gaccaatacg cccgcgtttc ttccttttcc ccaccccaac ccccaagttc 480gggtgaaggc ccagggct 49852593DNAArtificialMultiple integration element 52tatagcatac attatacgaa gttatctgta actataacgg tcctaaggta gcgagtttaa 60acactagtat cgattcgcga cctactccgg aatattaata gatcatggag ataattaaaa 120tgataaccat ctcgcaaata aataagtatt ttactgtttt cgtaacagtt ttgtaataaa 180aaaacctata aatattccgg attattcata ccgtcccacc atcgggcgcg gatcccggtc 240cgaagcgcgc ggaattcaaa ggcctacgtc gacgagctca cttgtcgcgg ccgctttcga 300atctagagcc tgcagtctcg acaagcttgt cgagaagtac tagaggatca taatcagcca 360taccacattt gtagaggttt tacttgcttt aaaaaacctc ccacacctcc ccctgaacct 420gaaacataaa atgaatgcaa ttgttgttgt taacttgttt attgcagctt ataatggtta 480caaataaagc aatagcatca caaatttcac aaataaagca tttttttcac tgcattctag 540ttgtggtttg tccaaactca tcaatgtatc ttatcatgtc tggatctgat cac 59353808DNAArtificialMultiple integration element 53ttagtacgta ctatcaacag gttgaactgc tgatcaacag atcctctacg cggccgcggt 60accataactt cgtatagcat acattatacg aagttatctg ccaggcacat gggttttact 120agtatcgatt cgcgacctac tccggaatat taatagatca tggagataat taaaatgata 180accatctcgc aaataaataa gtattttact gttttcgtaa cagttttgta ataaaaaaac 240ctataaatat tccggattat tcataccgtc ccaccatcgg gcgcggatcc cggtccgaag 300cgcgcggaat tcaaaggcct acgtcgacga gctcactagt cgcggccgct ttcgaatcta 360gagcctgcag tctcgacaag cttgtcgaga agtactagag gatcataatc agccatacca 420catttgtaga ggttttactt gctttaaaaa acctcccaca cctccccctg aacctgaaac 480ataaaatgaa tgcaattgtt gttgttaact tgtttattgc agcttataat ggttacaaat 540aaagcaatag catcacaaat ttcacaaata aagcattttt ttcactgcat tctagttgtg 600gtttgtccaa actcatcaat gtatcttatc atgtctggat ctgatcactg cttgagccta 660gaagatccgg ctgctaacaa agcccgaaag gaagctgagt tggctgctgc caccgctgag 720caataactat cataacccct aggtgccatt tcattacctc tttctccgca cccgacataa 780aaatgagacg 
ttgatctggg ccaacttt 80854211DNAArtificialAntisense sequence to multiple integration element of SEQ ID NO 1 54ccggatctct cgagacgcgg ttcgcgatcc ggatgtgaaa ttgttatccg ctggtaccga 60tatcgagctc gactgggaaa accctggcga agcttcccgg ggaattcact ggccgtcgtt 120ttacaggatc cgaggcctca tatgtatatc tccttcttaa agttaaacaa aattatttct 180agaggggaat tgttatccgc tcacaattcc c 211
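The adaptor pairs and BstXI sites listed above can be sanity-checked with a few lines of code. The following is a minimal sketch, not part of the application: the helper names `reverse_complement` and `matches_site` are hypothetical, and the sequences are copied from SEQ ID NOs 20, 23, 28, 29 and 42 to 44 with the spacing removed. It checks that each "Ins" adaptor is the reverse complement of its paired "Vec" adaptor, and that the Donor and Acceptor BstXI sites both fit the general recognition pattern ccannnnnntgg.

```python
import re

# Complement table for lowercase DNA; 'n' stands for any base.
_COMPLEMENT = {"a": "t", "t": "a", "g": "c", "c": "g", "n": "n"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (lowercase output)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.lower()))


def matches_site(pattern: str, seq: str) -> bool:
    """Check seq against a recognition pattern in which 'n' means any base."""
    regex = pattern.lower().replace("n", "[acgt]")
    return re.fullmatch(regex, seq.lower()) is not None


# Sequences copied from the listing above (spaces removed).
T7_INS_FOR = "tcccgcgaaattaatacgactcactataggg"   # SEQ ID NO 20
T7_VEC_REV = "ccctatagtgagtcgtattaatttcgcggga"   # SEQ ID NO 23
SMA_BAM = "gaattcactggccgtcgttttacaggatcc"       # SEQ ID NO 28
BAM_SMA = "ggatcctgtaaaacgacggccagtgaattc"       # SEQ ID NO 29
BSTXI_GENERAL = "ccannnnnntgg"                   # SEQ ID NO 42
BSTXI_DONOR = "ccatgtgcctgg"                     # SEQ ID NO 43
BSTXI_ACCEPTOR = "ccatctaattgg"                  # SEQ ID NO 44

if __name__ == "__main__":
    # Each adaptor should be the reverse complement of its partner.
    for name, fwd, rev in [
        ("T7InsFor / T7VecRev", T7_INS_FOR, T7_VEC_REV),
        ("SmaBam / BamSma", SMA_BAM, BAM_SMA),
    ]:
        print(name, "reverse complements:", reverse_complement(fwd) == rev)

    # Both vector-specific BstXI sites should fit the general pattern.
    for name, site in [("Donor", BSTXI_DONOR), ("Acceptor", BSTXI_ACCEPTOR)]:
        print(name, "site fits BstXI pattern:", matches_site(BSTXI_GENERAL, site))
```

The same two helpers could be applied to the remaining adaptor pairs in the listing by copying their sequences in the same way.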
