Systems, Methods, And Compositions For Site-Specific Genetic Engineering Using Programmable Addition Via Site-Specific Targeting Elements (PASTE)

Abudayyeh; Omar ;   et al.

Patent Application Summary

U.S. patent application number 17/451734 was filed with the patent office on 2021-10-21 and published on 2022-05-12 for systems, methods, and compositions for site-specific genetic engineering using programmable addition via site-specific targeting elements (PASTE). The applicant listed for this patent is Massachusetts Institute of Technology. Invention is credited to Omar Abudayyeh, Jonathan Gootenberg.

Application Number: 17/451734
Publication Number: 20220145293
Publication Date: 2022-05-12

United States Patent Application 20220145293
Kind Code A1
Abudayyeh; Omar ;   et al. May 12, 2022

SYSTEMS, METHODS, AND COMPOSITIONS FOR SITE-SPECIFIC GENETIC ENGINEERING USING PROGRAMMABLE ADDITION VIA SITE-SPECIFIC TARGETING ELEMENTS (PASTE)

Abstract

This disclosure provides systems, methods, and compositions for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE). PASTE comprises the addition of an integration site into a target genome followed by the insertion of one or more genes of interest or one or more nucleic acid sequences of interest at the site. PASTE combines gene editing technologies and integrase technologies to achieve unidirectional incorporation of genes in a genome for the treatment of diseases and diagnosis of disease.


Inventors: Abudayyeh; Omar; (Cambridge, MA) ; Gootenberg; Jonathan; (Cambridge, MA)
Applicant: Massachusetts Institute of Technology (Cambridge, MA, US)
Appl. No.: 17/451734
Filed: October 21, 2021

Related U.S. Patent Documents

Application Number Filing Date Patent Number
63094803 Oct 21, 2020
63222550 Jul 16, 2021

International Class: C12N 15/11 20060101 C12N015/11; A61K 31/7105 20060101 A61K031/7105; C12N 9/22 20060101 C12N009/22; C12N 9/12 20060101 C12N009/12; C12N 15/10 20060101 C12N015/10; C12N 15/85 20060101 C12N015/85

Claims



1. A method of site-specific integration of a nucleic acid into a cell genome or target nucleic acid, the method comprising: (a) incorporating an integration site at a desired location in the cell genome or target nucleic acid by introducing into a cell: i. a DNA binding nuclease domain linked to a reverse transcriptase domain, wherein the DNA binding nuclease domain comprises a nickase activity; and ii. a guide RNA (gRNA) comprising a primer binding targeting sequence linked to a complement of an integration sequence, wherein the gRNA interacts with the DNA binding nuclease domain and targets the desired location in the cell genome or target nucleic acid, wherein the DNA binding nuclease domain nicks a strand of the cell genome or target nucleic acid and the reverse transcriptase domain incorporates the integration sequence of the gRNA into the nicked site, thereby providing the integration site at the desired location of the cell genome or target nucleic acid; and (b) integrating the nucleic acid into the cell genome or target nucleic acid by introducing into the cell: i. a DNA or RNA strand comprising the nucleic acid linked to a sequence that is complementary or associated to the integration site; and ii. an integration enzyme, wherein the integration enzyme incorporates the nucleic acid into the cell genome or target nucleic acid at the integration site by integration, recombination, or reverse transcription of the sequence that is complementary or associated to the integration site, thereby introducing the nucleic acid into the desired location of the cell genome or target nucleic acid of the cell.

2. The method of claim 1, wherein the gRNA hybridizes to a complementary strand of the cell genome to the genomic strand that is nicked by the DNA binding nuclease domain.

3. The method of claim 1, wherein: the integration enzyme is introduced as a polypeptide or a nucleic acid encoding the integration enzyme; and/or the DNA binding nuclease domain is introduced as a polypeptide or a nucleic acid encoding the DNA binding nuclease.

4. (canceled)

5. The method of claim 1, wherein the DNA or RNA strand comprising the nucleic acid is introduced into the cell as a minicircle, a plasmid, mRNA or a linear DNA, optionally wherein: the DNA or RNA strand comprising the nucleic acid is between 1000 bp and 36,000 bp; the DNA or RNA strand comprising the nucleic acid is more than 36,000 bp; and/or the DNA or RNA strand comprising the nucleic acid is less than 1000 bp.

6. (canceled)

7. (canceled)

8. (canceled)

9. The method of claim 1, wherein the DNA comprising the nucleic acid is introduced into the cell as a minicircle, optionally wherein the minicircle does not comprise a sequence of a bacterial origin.

10. (canceled)

11. The method of claim 1, wherein the DNA binding nuclease linked to a reverse transcriptase domain and the integration enzyme are linked via a linker, optionally wherein: the linker is cleavable; the linker is non-cleavable; or the linker can be replaced by two associating binding domains of the DNA binding nuclease linked to a reverse transcriptase.

12. (canceled)

13. (canceled)

14. (canceled)

15. The method of claim 1, wherein: the integration enzyme is selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof; the integration site is an attB site, an attP site, an attL site, an attR site, a lox71 site, a Vox site, or a FRT site; the DNA binding nuclease comprising a nickase activity is selected from Cas9-D10A, Cas9-H840A, and Cas12a/b nickase; and/or the reverse transcriptase domain is selected from the group consisting of Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase domain, transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), and Eubacterium rectale maturase RT (MarathonRT), optionally wherein: the reverse transcriptase domain comprises a mutation relative to the wild-type sequence; and/or the M-MLV reverse transcriptase domain comprises one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P and L603W.

16. (canceled)

17. (canceled)

18. (canceled)

19. (canceled)

20. (canceled)

21. (canceled)

22. The method of claim 1, further comprising introducing a nicking guide RNA (ngRNA).

23. The method of claim 1, wherein: the gRNA, the nucleic acid encoding the DNA binding nuclease, the reverse transcriptase, the DNA comprising nucleic acid linked to a complementary or associated integration site, the integration enzyme, and optionally the ngRNA, are introduced into a cell in a single reaction; and/or the gRNA, the nucleic acid encoding the DNA binding nuclease, the reverse transcriptase, the DNA comprising nucleic acid linked to a complementary integration site, the integration enzyme, and optionally the ngRNA, are introduced using a virus, a RNP, an mRNA, a lipid, or a polymeric nanoparticle.

24. (canceled)

25. The method of claim 1, wherein: the nucleic acid is a reporter gene, optionally wherein the reporter gene is a fluorescent protein; the nucleic acid is a degradation tag for programmable knockdown of proteins in the presence of small molecules; the nucleic acid is a T-cell receptor (TCR), a chimeric antigen receptor (CAR), an interleukin, a cytokine, or an immune checkpoint gene for integration into a T-cell or natural killer (NK) cell, optionally wherein the TCR, the CAR, the interleukin, the cytokine, or the immune checkpoint gene is incorporated into the target site of the T-cell or NK cell genome using a minicircle DNA; the nucleic acid is a beta hemoglobin (HBB) gene and the cell is a hematopoietic stem cell (HSC), optionally wherein the HBB gene is incorporated into the target site in the HSC genome using a minicircle DNA and/or the nucleic acid is a gene responsible for beta thalassemia or sickle cell anemia; the nucleic acid is a metabolic gene, optionally wherein the metabolic gene is involved in alpha-1 antitrypsin deficiency or ornithine transcarbamylase (OTC) deficiency and/or the metabolic gene is a gene involved in an inherited disease; or the nucleic acid is a gene involved in an inherited disease or an inherited syndrome, optionally wherein the inherited disease is cystic fibrosis, familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), Wiskott-Aldrich syndrome (WAS), hemochromatosis, Tay-Sachs, fragile X syndrome, Huntington's disease, Marfan syndrome, phenylketonuria, or muscular dystrophy.

26. (canceled)

27. The method of claim 1, wherein the cell is a dividing cell or a non-dividing cell, optionally wherein: the desired location in the cell genome is the locus of a mutated gene; and/or the cell is a mammalian cell, a bacterial cell or a plant cell.

28. (canceled)

29. (canceled)

30. (canceled)

31. (canceled)

32. (canceled)

33. (canceled)

34. (canceled)

35. (canceled)

36. (canceled)

37. (canceled)

38. (canceled)

39. (canceled)

40. (canceled)

41. (canceled)

42. A vector comprising a nucleic acid encoding the polypeptide of claim 63.

43. (canceled)

44. (canceled)

45. (canceled)

46. (canceled)

47. (canceled)

48. (canceled)

49. (canceled)

50. (canceled)

51. (canceled)

52. (canceled)

53. (canceled)

54. A cell comprising: (a) the vector of claim 42; (b) a gRNA comprising a primer binding sequence, an integration sequence, and a guide sequence, wherein the gRNA can interact with the encoded nuclease comprising a nickase activity; (c) a DNA minicircle comprising a nucleic acid and a sequence recognized by the encoded integrase, recombinase, or reverse transcriptase; and (d) a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, wherein the ngRNA targets a sequence away from the gRNA.

55. The cell of claim 54, wherein: the minicircle does not comprise a sequence of bacterial origin; the integration enzyme is selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any mutants thereof; the DNA binding nuclease comprising a nickase activity is selected from the group consisting of Cas9-D10A, Cas9-H840A and Cas12a; the reverse transcriptase is a M-MLV reverse transcriptase, optionally wherein the reverse transcriptase is a modified M-MLV reverse transcriptase, optionally wherein the amino acid sequence of the M-MLV reverse transcriptase comprises one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W; and/or the cell further comprises a ngRNA.

56. (canceled)

57. (canceled)

58. (canceled)

59. (canceled)

60. (canceled)

61. (canceled)

62. (canceled)

63. A polypeptide comprising a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker.

64. The polypeptide of claim 63, wherein: the linker is cleavable or non-cleavable; the integration enzyme is fused to an estrogen receptor; the DNA binding nuclease comprising a nickase activity is selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b/c/d/e/f/g/h/i/j; the reverse transcriptase is an M-MLV reverse transcriptase, an AMV-RT, a MarathonRT, or an RTX, optionally wherein the reverse transcriptase is a modified M-MLV relative to a wild-type M-MLV reverse transcriptase, optionally wherein the M-MLV reverse transcriptase domain comprises one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W; the integration enzyme is selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any mutants thereof.

65. (canceled)

66. (canceled)

67. (canceled)

68. (canceled)

69. (canceled)

70. (canceled)

71. (canceled)

72. (canceled)

73. A gRNA that specifically binds to a DNA binding nuclease comprising nickase activity, the gRNA comprising: (a) a primer binding site, which hybridizes to a nicked DNA strand; (b) a recognition site for an integration enzyme; and (c) a target recognition sequence recognizing a target site in a cell genome and hybridizing to a genomic strand complementary to the strand that is nicked by the DNA binding nuclease.

74. The gRNA of claim 73, wherein: the DNA binding nuclease comprising a nickase activity is selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b/c/d/e/f/g/h/i/j; the primer binding site hybridizes to the 3' end of the nicked DNA strand; the recognition site for the integration enzyme is selected from an attB site, an attP site, an attL site, an attR site, a lox71 site, and a FRT site; and/or the recognition site for the integration enzyme is a Bxb1 site.

75. (canceled)

76. (canceled)

77. (canceled)

78. A method of site-specific integration of two or more nucleic acids into a cell genome, the method comprising: (a) incorporating two integration sites at desired locations in the cell genome by introducing into the cell: i. a DNA binding nuclease linked to a reverse transcriptase domain, wherein the DNA binding nuclease comprises a nickase activity; and ii. two guide RNAs (gRNAs), each comprising, a primer binding sequence, and is linked to a unique integration sequence, wherein the gRNA interacts with the DNA binding nuclease and targets the desired locations in the cell genome, wherein the DNA binding nuclease nicks a strand of the cell genome and the reverse transcriptase domain incorporates each of the integration sequence of the gRNA into the nicked site, thereby providing the integration site at the desired locations of the cell genome; and (b) integrating the nucleic acid by introducing into the cell: i. two or more DNA or RNA comprising the nucleic acids, wherein each DNA is flanked by orthogonal integration sites; and ii. an integration enzyme, wherein the integration enzyme incorporates the nucleic acids into the cell genome at the integration sites by integrase, recombinase, or reverse transcriptase of the sequence that is complementary or associated to the integration site, thereby introducing the nucleic acids into the desired locations of the cell genome of the cell.

79. The method of claim 78, wherein each of the two different integration sites inserted into the cell genome are attB and/or attP sequences comprising different palindromic or non-palindromic central dinucleotide, optionally wherein: the integration enzyme enables each of the two or more DNA or RNA comprising the nucleic acids to directionally enable integration of the nucleic acids into a genome via recombination of a pair of orthogonal attB site sequence and an attP site sequence; and/or the pair of an attB site sequence and an attP site sequence are selected from the group consisting of SEQ ID NO: 5 and SEQ ID NO: 6, SEQ ID NO: 7 and SEQ ID NO: 8, SEQ ID NO: 9 and SEQ ID NO: 10, SEQ ID NO: 11 and SEQ ID NO: 12, SEQ ID NO: 13 and SEQ ID NO: 14, SEQ ID NO: 15 and SEQ ID NO: 16, SEQ ID NO: 17 and SEQ ID NO: 18, SEQ ID NO: 19 and SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, SEQ ID NO: 23 and SEQ ID NO: 24, SEQ ID NO: 25 and SEQ ID NO: 26, SEQ ID NO: 27 and SEQ ID NO: 28, SEQ ID NO: 29 and SEQ ID NO: 30, SEQ ID NO: 31 and SEQ ID NO: 32, SEQ ID NO: 33 and SEQ ID NO: 34, and SEQ ID NO: 35 and SEQ ID NO: 36.

80. (canceled)

81. (canceled)

82. (canceled)

83. (canceled)

84. (canceled)

85. (canceled)

86. (canceled)

87. (canceled)

88. (canceled)

89. (canceled)

90. (canceled)

91. (canceled)

92. The method of claim 17, wherein the attB site is about 40-46 basepair.

93. The gRNA of claim 74, wherein the attB site is about 40-46 basepair.
Description



CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Patent Application Ser. No. 63/222,550, filed Jul. 16, 2021 and U.S. Provisional Patent Application Ser. No. 63/094,803, filed Oct. 21, 2020. The entire contents of the above-referenced patent applications are incorporated by reference in their entirety herein.

FIELD OF DISCLOSURE

[0002] The subject matter disclosed herein is generally directed to systems, methods, and compositions for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE) for the treatment of diseases and diagnostics.

BACKGROUND

[0003] Editing genomes using the RNA-guided DNA targeting principle of CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats-CRISPR associated proteins) immunity has been widely exploited and has become a powerful genome editing means for a wide variety of applications. The main advantage of the CRISPR-Cas system lies in the minimal requirement for programmable DNA interference: an endonuclease, such as Cas9, Cas12, or another programmable nuclease, guided by a customizable dual-RNA structure. Cas9 is a multi-domain enzyme that uses an HNH nuclease domain to cleave the target strand. The CRISPR/Cas9 protein-RNA complex is localized on the target by a guide RNA (gRNA), and the target is then cleaved to generate a DNA double strand break (dsDNA break, DSB). After cleavage, DNA repair mechanisms are activated to repair the cleaved strand. Repair mechanisms are generally of one of two types: non-homologous end joining (NHEJ) or homologous recombination (HR). In general, NHEJ dominates the repair and, being error prone, generates random indels (insertions or deletions) causing frame shift mutations, among others. In contrast, HR has a more precise repairing capability and is potentially capable of incorporating the exact substitution or insertion. To enhance HR, several techniques have been tried, for example: fusing Cas9 nuclease with homology-directed repair (HDR) effectors to enforce their localization at DSBs, introducing an overlapping homology arm, or suppressing NHEJ. Most of these techniques rely on the host DNA repair systems.

[0004] Recently, new guided editors have been developed, such as guided prime editors (PE) PE1, PE2, and PE3; see, e.g., Liu, D. et al., Nature 2019, 576, 149-157. These PEs are a reverse transcriptase (RT) fused with a Cas9 H840A nickase (Cas9n (H840A)), and the genome editing is achieved using a prime-editing guide RNA (pegRNA). Despite these developments, programmable gene integration is still generally dependent on cellular pathways or repair processes.

[0005] Therefore, there is a need for more effective tools for gene editing and delivery.

SUMMARY

[0006] The present disclosure provides a method of site-specific integration of a nucleic acid into a cell genome. The method comprises incorporating an integration site at a desired location in the cell genome by introducing into the cell a DNA binding nuclease linked to a reverse transcriptase domain, wherein the DNA binding nuclease comprises a nickase activity; and a guide RNA (gRNA) comprising a primer binding sequence linked to an integration sequence, wherein the gRNA interacts with the DNA binding nuclease and targets the desired location in the cell genome, wherein the DNA binding nuclease nicks a strand of the cell genome and the reverse transcriptase domain incorporates the integration sequence of the gRNA into the nicked site, thereby providing the integration site at the desired location of the cell genome. The method further comprises integrating the nucleic acid into the cell genome by introducing into the cell a DNA or RNA strand comprising the nucleic acid linked to a sequence that is complementary or associated to the integration site, and an integration enzyme, wherein the integration enzyme incorporates the nucleic acid into the cell genome at the integration site by integration, recombination, or reverse transcription of the sequence that is complementary or associated to the integration site, thereby introducing the nucleic acid into the desired location of the cell genome of the cell.
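
The two steps summarized in paragraph [0006] can be pictured with a toy string model. The following Python sketch is an illustration only and is not the disclosed method: the genome and donor are plain strings, the bracketed tokens `[attB]`, `[attP]`, `[attL]`, and `[attR]` are placeholders rather than real sequences, and the function names `add_integration_site` and `integrate_cargo` are hypothetical. It assumes that an attB site is written at the nick and that an integrase then recombines it with an attP site on a circular donor, leaving the cargo flanked by attL and attR.

```python
# Placeholder tokens standing in for real attachment-site sequences (assumption).
ATT_B, ATT_P, ATT_L, ATT_R = "[attB]", "[attP]", "[attL]", "[attR]"


def add_integration_site(genome: str, nick_index: int, site: str = ATT_B) -> str:
    """Step (a): write the integration site into the genome at the nick position."""
    if not 0 <= nick_index <= len(genome):
        raise ValueError("nick_index is outside the genome")
    return genome[:nick_index] + site + genome[nick_index:]


def integrate_cargo(genome: str, donor: str) -> str:
    """Step (b): recombine a circular donor carrying attP into the genomic attB."""
    if genome.count(ATT_B) != 1 or donor.count(ATT_P) != 1:
        raise ValueError("expected exactly one attB in the genome and one attP in the donor")
    left, right = genome.split(ATT_B)
    before, after = donor.split(ATT_P)
    # The donor is circular, so opening it at attP puts the part after attP first.
    cargo = after + before
    return left + ATT_L + cargo + ATT_R + right


genome = "AAAACCCCGGGGTTTT"          # toy genome
edited = add_integration_site(genome, nick_index=8)
result = integrate_cargo(edited, donor="CARGO" + ATT_P)  # "CARGO" is a stand-in payload
print(edited)
print(result)
```

The model ignores strand orientation, flap resolution, and DNA repair; it only shows how the landing site created in the first step determines where the cargo ends up in the second.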

[0007] In some embodiments, the gRNA can hybridize to the strand of the cell genome that is complementary to the genomic strand nicked by the DNA binding nuclease.

[0008] In some embodiments, the integration enzyme can be introduced as a peptide or a nucleic acid encoding the same.

[0009] In some embodiments, the DNA binding nuclease can be introduced as a peptide or a nucleic acid encoding the same.

[0010] In some embodiments, the DNA or RNA strand comprising the nucleic acid can be introduced into the cell as a minicircle, a plasmid, mRNA or a linear DNA.

[0011] In some embodiments, the DNA or RNA strand comprising the nucleic acid can be between 1000 bp and 10,000 bp.

[0012] In some embodiments, the DNA or RNA strand comprising the nucleic acid can be more than 10,000 bp.

[0013] In some embodiments, the DNA or RNA strand comprising the nucleic acid can be less than 1000 bp.

[0014] In some embodiments, the DNA comprising the nucleic acid can be introduced into the cell as a minicircle.

[0015] In some embodiments, the minicircle does not comprise sequences of a bacterial origin.

[0016] In some embodiments, the DNA binding nuclease linked to a reverse transcriptase domain and the integration enzyme can be linked via a linker. The linker can be cleavable. The linker can be non-cleavable. The linker can be replaced by two associating binding domains of the DNA binding nuclease linked to a reverse transcriptase.

[0017] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.

[0018] In some embodiments, the integration enzyme can be Bxb1 or a mutant thereof.

[0019] In some embodiments, the integration site can be selected from an attB site, an attP site, an attL site, an attR site, a lox71 site, a Vox site, or a FRT site.

[0020] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from Cas9-D10A, Cas9-H840A, and Cas12a/b nickase.

[0021] In some embodiments, the reverse transcriptase domain can be selected from the group consisting of Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase domain, transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), and Eubacterium rectale maturase RT (MarathonRT).

[0022] In some embodiments, the reverse transcriptase domain can comprise a mutation relative to the wild-type sequence.

[0023] In some embodiments, the M-MLV reverse transcriptase domain can comprise one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P and L603W.
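
The mutation labels in paragraph [0023] follow the usual substitution notation: wild-type residue, 1-based position, replacement residue. As a minimal sketch of how such labels can be parsed and applied, the Python below uses hypothetical helper names (`parse_mutation`, `apply_mutations`) and a placeholder poly-alanine protein in which only the five named positions carry the stated wild-type residues. The placeholder is not the M-MLV reverse transcriptase sequence.

```python
import re

# Substitution notation: wild-type residue, 1-based position, replacement residue.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutation(label: str) -> tuple[str, int, str]:
    """Split a label such as 'D200N' into ('D', 200, 'N')."""
    match = MUTATION_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def apply_mutations(protein: str, labels: list[str]) -> str:
    """Apply each substitution after checking the expected wild-type residue."""
    residues = list(protein)
    for label in labels:
        wild_type, position, replacement = parse_mutation(label)
        if not 1 <= position <= len(residues):
            raise ValueError(f"{label}: position is outside the protein")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: found {residues[position - 1]} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


MUTATIONS = ["D200N", "T306K", "W313F", "T330P", "L603W"]

# Placeholder protein (assumption): poly-alanine with the named wild-type residues set.
placeholder = ["A"] * 700
for label in MUTATIONS:
    wild_type, position, _ = parse_mutation(label)
    placeholder[position - 1] = wild_type
placeholder_protein = "".join(placeholder)

mutant = apply_mutations(placeholder_protein, MUTATIONS)
print([mutant[parse_mutation(label)[1] - 1] for label in MUTATIONS])
```

Checking the wild-type residue before substituting catches numbering mismatches, which are a common source of error when a domain is numbered relative to a longer fusion protein.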

[0024] In some embodiments, the method can further comprise introducing a second nicking guide RNA (ngRNA). The ngRNA can direct nicking at 90 bases downstream of the gRNA nick on a complementary strand.
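
The offset described in paragraph [0024] can be expressed as simple coordinate arithmetic. The sketch below is an assumption-laden illustration: it treats "downstream" as the 5' to 3' direction of the strand nicked by the gRNA, uses 1-D genomic coordinates on a "+"/"-" strand convention, and the function name `ngrna_nick` is hypothetical.

```python
def ngrna_nick(grna_nick: int, edited_strand: str, offset: int = 90) -> tuple[int, str]:
    """Return (coordinate, strand) of the second nick placed on the complementary strand.

    Assumption: downstream means 5'->3' along the strand nicked by the gRNA, so
    coordinates increase for a "+" strand edit and decrease for a "-" strand edit.
    """
    if edited_strand not in ("+", "-"):
        raise ValueError("edited_strand must be '+' or '-'")
    if edited_strand == "+":
        return grna_nick + offset, "-"
    return grna_nick - offset, "+"


print(ngrna_nick(1_000, "+"))
print(ngrna_nick(1_000, "-"))
```

The default of 90 mirrors the value stated in the paragraph; other offsets can be passed explicitly.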

[0025] In some embodiments, the gRNA, the nucleic acid encoding the DNA binding nuclease, the reverse transcriptase, the DNA comprising nucleic acid linked to a complementary integration site, the integration enzyme, and optionally the ngRNA can be introduced into a cell in a single reaction.

[0026] In some embodiments, the gRNA, the nucleic acid encoding the DNA binding nuclease, the reverse transcriptase, the DNA comprising nucleic acid linked to a complementary integration site, the integration enzyme, and optionally the ngRNA can be introduced using a virus, a RNP, an mRNA, a lipid, or a polymeric nanoparticle.

[0027] In some embodiments, the nucleic acid can be a reporter gene. The reporter gene can be a fluorescent protein.

[0028] In some embodiments, the cell can be a dividing cell.

[0029] In some embodiments, the cell can be a non-dividing cell.

[0030] In some embodiments, the desired location in the cell genome can be the locus of a mutated gene.

[0031] In some embodiments, the nucleic acid can be a degradation tag for programmable knockdown of proteins in the presence of small molecules.

[0032] In some embodiments, the cell can be a mammalian cell, a bacterial cell or a plant cell.

[0033] In some embodiments, the nucleic acid can be a T-cell receptor (TCR), a chimeric antigen receptor (CAR), an interleukin, a cytokine, or an immune checkpoint gene for integration into a T-cell or natural killer (NK) cell. The TCR, the CAR, the interleukin, the cytokine, or the immune checkpoint gene can be incorporated into the target site of the T-cell or NK cell genome using a minicircle DNA.

[0034] In some embodiments, the nucleic acid can be a beta hemoglobin (HBB) gene and the cell can be a hematopoietic stem cell (HSC). The HBB gene can be incorporated into the target site in the HSC genome using a minicircle DNA. The nucleic acid can be a gene responsible for beta thalassemia or sickle cell anemia.

[0035] In some embodiments, the nucleic acid can be a metabolic gene. The metabolic gene can be involved in alpha-1 antitrypsin deficiency or ornithine transcarbamylase (OTC) deficiency. The metabolic gene can be a gene involved in inherited diseases.

[0036] In some embodiments, the nucleic acid can be a gene involved in an inherited disease or an inherited syndrome. The inherited disease can be cystic fibrosis, familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), Wiskott-Aldrich syndrome (WAS), hemochromatosis, Tay-Sachs, fragile X syndrome, Huntington's disease, Marfan syndrome, phenylketonuria, or muscular dystrophy.

[0037] The present disclosure provides a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker.

[0038] In some embodiments, the linker can be cleavable.

[0039] In some embodiments, the linker can be non-cleavable.

[0040] In some embodiments, the linker can comprise two associating binding domains of the DNA binding nuclease linked to a reverse transcriptase.

[0041] In some embodiments, the integration enzyme can comprise a conditional activation domain or conditional expression domain.

[0042] In some embodiments, the integration enzyme can be fused to an estrogen receptor.

[0043] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b.

[0044] In some embodiments, the reverse transcriptase can be a M-MLV reverse transcriptase, a AMV-RT, MarathonRT, or a RTX. The reverse transcriptase can be a modified M-MLV reverse transcriptase relative to the wildtype M-MLV reverse transcriptase. The M-MLV reverse transcriptase domain can comprise one or more of the mutations selected from the group consisting of D200N, T306K, W313F, T330P and L603W.

[0045] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.

[0046] In some embodiments, the recombinase or integrase can be Bxb1 or a mutant thereof.

[0047] The present disclosure provides a cell comprising a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker. The cell further comprises a gRNA comprising a primer binding sequence, an integration sequence, and a guide sequence, wherein the gRNA can interact with the encoded nuclease comprising a nickase activity. The cell further comprises a DNA minicircle comprising a nucleic acid and a sequence recognized by the encoded integrase, recombinase, or reverse transcriptase. The cell further comprises a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, wherein the ngRNA targets a sequence away from the gRNA.

[0048] In some embodiments, the minicircle does not comprise a sequence of bacterial origin.

[0049] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.

[0050] In some embodiments, the integration enzyme can be Bxb1 or a mutant thereof.

[0051] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A and Cas12a.

[0052] In some embodiments, the reverse transcriptase can be a M-MLV reverse transcriptase. The reverse transcriptase can be a modified M-MLV reverse transcriptase. The amino acid sequence of the M-MLV reverse transcriptase can comprise one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W.

[0053] In some embodiments, the cell can further comprise an ngRNA. The ngRNA can be a +90 ngRNA. The +90 ngRNA can direct nicking at 90 bases downstream of the gRNA nick on a complementary strand.

[0054] The present disclosure provides a polypeptide comprising a DNA binding nuclease comprising a nickase activity C-terminally linked to a reverse transcriptase linked to an integration enzyme via a linker.

[0055] In some embodiments, the linker can be cleavable.

[0056] In some embodiments, the linker can be non-cleavable.

[0057] In some embodiments, the integration enzyme can be fused to an estrogen receptor.

[0058] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b.

[0059] In some embodiments, the reverse transcriptase can be an M-MLV reverse transcriptase, an AMV-RT, a MarathonRT, or an RTX. The reverse transcriptase can be an M-MLV reverse transcriptase modified relative to a wild-type M-MLV reverse transcriptase. The M-MLV reverse transcriptase domain can comprise one or more mutations selected from the group consisting of D200N, T306K, W313F, T330P, and L603W.

[0060] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.

[0061] In some embodiments, the integration enzyme can be Bxb1 or a mutant thereof.

[0062] The present disclosure provides a gRNA that specifically binds to a DNA binding nuclease comprising nickase activity, the gRNA comprising a primer binding site, which hybridizes to a nicked DNA strand, a recognition site for an integration enzyme, and a target recognition sequence recognizing a target site in a cell genome and hybridizing to a genomic strand complementary to the strand that is nicked by the DNA binding nuclease.

[0063] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b.

[0064] In some embodiments, the primer binding site can hybridize to the 3' end of the nicked DNA strand.
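The hybridization described above can be sketched in Python as follows: a primer binding site complementary to the 3' end of the nicked strand is the reverse complement of the bases immediately 5' of the nick. The function names are hypothetical, the sequence in the usage note is a toy string, and the result is written in the DNA alphabet (replace T with U for the RNA form).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return dna.upper().translate(_COMPLEMENT)[::-1]


def primer_binding_site(nicked_strand: str, nick_index: int, length: int) -> str:
    """Return a PBS (5'->3', DNA alphabet) that pairs with the 3' end at the nick.

    nicked_strand -- sequence of the strand to be nicked, written 5'->3'
    nick_index    -- number of bases lying 5' of the nick on that strand
    length        -- desired PBS length in bases
    """
    if not 0 < length <= nick_index <= len(nicked_strand):
        raise ValueError("length and nick_index must fit within the strand")
    # The free 3' end created by the nick consists of the bases just 5' of it.
    three_prime_end = nicked_strand[nick_index - length:nick_index]
    return reverse_complement(three_prime_end)
```

For the toy strand `"AAACCCGGGTTT"` nicked after the sixth base, `primer_binding_site("AAACCCGGGTTT", 6, 4)` pairs with `ACCC` and returns `"GGGT"`.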

[0065] In some embodiments, the recognition site for the integration enzyme can be selected from an attB site, an attP site, an attL site, an attR site, a lox71 site, and an FRT site.

[0066] In some embodiments, the recognition site for the integration enzyme can be a Bxb1 site.

[0067] The present disclosure provides a method of site-specific integration of two or more nucleic acids into a cell genome. The method comprises incorporating two integration sites at desired locations in the cell genome by introducing into the cell a DNA binding nuclease linked to a reverse transcriptase domain, wherein the DNA binding nuclease comprises a nickase activity, and two guide RNAs (gRNAs), each comprising a primer binding sequence linked to a unique integration sequence, wherein the gRNAs interact with the DNA binding nuclease and target the desired locations in the cell genome, wherein the DNA binding nuclease nicks a strand of the cell genome and the reverse transcriptase domain incorporates the integration sequence of each gRNA into the nicked site, thereby providing the integration sites at the desired locations of the cell genome. The method further comprises integrating the nucleic acids by introducing into the cell two or more DNA or RNA comprising the nucleic acids, wherein each DNA is flanked by orthogonal integration sites, and an integration enzyme, wherein the integration enzyme incorporates the nucleic acids into the cell genome at the integration sites by integration, recombination, or reverse transcription of the sequence that is complementary or associated to the integration site, thereby introducing the nucleic acids into the desired locations of the cell genome.

[0068] In some embodiments, each of the two different integration sites inserted into the cell genome can be attB sequences comprising different palindromic or non-palindromic central dinucleotides.

[0069] In some embodiments, each of the two different integration sites inserted into the cell genome can be attP sequences comprising different palindromic or non-palindromic central dinucleotides.
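To make the palindromic versus non-palindromic distinction concrete, the short Python sketch below enumerates all 16 possible central dinucleotides and classifies each one. A dinucleotide is treated as palindromic when it equals its own reverse complement. The names are illustrative only and are not taken from the disclosure.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string written 5'->3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def is_palindromic(dinucleotide: str) -> bool:
    """True when the dinucleotide reads the same on both strands."""
    return dinucleotide == reverse_complement(dinucleotide)


# All 16 dinucleotides over A, C, G, T, in alphabetical order.
ALL_DINUCLEOTIDES = ["".join(pair) for pair in product("ACGT", repeat=2)]
PALINDROMIC = [d for d in ALL_DINUCLEOTIDES if is_palindromic(d)]
NON_PALINDROMIC = [d for d in ALL_DINUCLEOTIDES if not is_palindromic(d)]
```

Under this definition four dinucleotides (AT, CG, GC, TA) are palindromic and the remaining twelve are not.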

[0070] In some embodiments, the integration enzyme can enable directional integration of each of the two or more DNA or RNA comprising the nucleic acids into a genome via recombination of a pair of an orthogonal attB site sequence and an attP site sequence.

[0071] In some embodiments, the integration enzyme can be selected from the group consisting of Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, retrotransposases encoded by R1, R2, R3, R4, R5, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos, and any mutants thereof.

[0072] In some embodiments, the integration enzyme can be Bxb1 or a mutant thereof.

[0073] In some embodiments, the DNA comprising genes can be genes involved in a cell maintenance pathway, cell-division, or a signal transduction pathway.

[0074] In some embodiments, the reverse transcriptase domain can comprise Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase domain, transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), or Eubacterium rectale maturase RT (MarathonRT).

[0075] In some embodiments, the DNA binding nuclease comprising a nickase activity can be selected from the group consisting of Cas9-D10A, Cas9-H840A, and Cas12a/b.

[0076] In some embodiments, the pair of an attB site sequence and an attP site sequence can be selected from the group consisting of SEQ ID NO: 5 and SEQ ID NO: 6, SEQ ID NO: 7 and SEQ ID NO: 8, SEQ ID NO: 9 and SEQ ID NO: 10, SEQ ID NO: 11 and SEQ ID NO: 12, SEQ ID NO: 13 and SEQ ID NO: 14, SEQ ID NO: 15 and SEQ ID NO: 16, SEQ ID NO: 17 and SEQ ID NO: 18, SEQ ID NO: 19 and SEQ ID NO: 20, SEQ ID NO: 21 and SEQ ID NO: 22, SEQ ID NO: 23 and SEQ ID NO: 24, SEQ ID NO: 25 and SEQ ID NO: 26, SEQ ID NO: 27 and SEQ ID NO: 28, SEQ ID NO: 29 and SEQ ID NO: 30, SEQ ID NO: 31 and SEQ ID NO: 32, SEQ ID NO: 33 and SEQ ID NO: 34, and SEQ ID NO: 35 and SEQ ID NO: 36.

[0077] The present disclosure provides a cell comprising a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity, wherein the DNA binding nuclease is C-terminally linked to a reverse transcriptase, wherein the reverse transcriptase is linked to a recombinase or integrase via a linker. The cell further comprises two guide RNAs (gRNAs) comprising a primer binding sequence, an integration sequence and a guide sequence, wherein the gRNA can interact with the encoded DNA binding nuclease comprising a nickase activity. The cell further comprises two or more DNA or RNA strands comprising a nucleic acid and a pair of flanking attB site sequence and an attP site sequence recognized by the encoded integrase or recombinase. The cell optionally further comprises a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, and wherein the ngRNA targets a sequence away from the gRNA.

[0078] The present disclosure provides a cell comprising a modified genome, wherein the modification comprises incorporation of two orthogonal integration sites within the cell genome by introducing into the cell: a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity, wherein the DNA binding nuclease is C-terminally linked to a reverse transcriptase; two guide RNAs (gRNAs), each comprising a primer binding sequence, a genomic integration sequence, and a guide sequence, wherein the gRNA can interact with the encoded nuclease comprising a nickase activity; and optionally a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, and wherein the ngRNA targets a sequence away from the gRNA.

[0079] The present disclosure provides a method of integrating two or more nucleic acids into the cell genome of the cell of claim 90, the method comprising introducing into the cell: two or more DNA, each comprising a nucleic acid and a pair of flanking orthogonal integration site sequences; an integration enzyme that can recognize the integration site sequence enabling directional linking of the two or more DNA comprising nucleic acid; and enabling incorporation of the nucleic acids into the cell genome by integrating the 5' orthogonal integration sequence of the first DNA with the first genomic integration sequence and 3' orthogonal integration sequence of the last DNA with the last genomic integration sequence, thereby incorporating the two or more nucleic acids into the cell genome.
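One way to picture the directional linking described above is as a chain of junctions: the first genomic site pairs with the 5' site of the first DNA, each DNA's 3' site pairs with the 5' site of the next DNA, and the 3' site of the last DNA pairs with the last genomic site. The Python sketch below checks such a chain under two simplifying assumptions that are not stated in the disclosure: each site is represented only by a label (for example, its central dinucleotide), and two sites pair when their labels are equal. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class Cargo:
    name: str
    five_prime_site: str   # label of the 5' flanking integration site
    three_prime_site: str  # label of the 3' flanking integration site


def check_directional_chain(first_genomic_site: str,
                            last_genomic_site: str,
                            cargos: Sequence[Cargo]) -> List[str]:
    """Return the junction labels, in order, for a valid cargo chain.

    Assumptions (illustrative only): sites pair when labels are equal, and
    every junction must use a distinct label so the junctions are orthogonal.
    Raises ValueError when the chain does not satisfy these conditions.
    """
    if len(cargos) < 2:
        raise ValueError("expected two or more cargos")
    junctions = [(first_genomic_site, cargos[0].five_prime_site)]
    for upstream, downstream in zip(cargos, cargos[1:]):
        junctions.append((upstream.three_prime_site, downstream.five_prime_site))
    junctions.append((cargos[-1].three_prime_site, last_genomic_site))

    labels = []
    for left, right in junctions:
        if left != right:
            raise ValueError(f"sites {left!r} and {right!r} do not pair")
        labels.append(left)
    if len(set(labels)) != len(labels):
        raise ValueError("junction sites are not mutually orthogonal")
    return labels
```

For example, two cargos flanked by (GT, GA) and (GA, CT), with genomic sites GT and CT, give the junction order GT, GA, CT.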

[0080] The present disclosure provides a cell comprising a modified genome, wherein the modification comprises incorporation of two orthogonal integration sites within the cell genome by introducing into the cell: a vector comprising a nucleic acid encoding a DNA binding nuclease comprising a nickase activity, wherein the DNA binding nuclease is C-terminally linked to a reverse transcriptase; two guide RNAs (gRNAs), each comprising a primer binding sequence, a genomic integration sequence, and a guide sequence, wherein the gRNA can interact with the encoded nuclease comprising a nickase activity; and optionally a nicking guide RNA (ngRNA) capable of binding the encoded nuclease comprising a nickase activity, and wherein the ngRNA targets a sequence away from the gRNA; two or more DNA or RNA comprising the nucleic acids, wherein each DNA is flanked by orthogonal integration sites; and an integration enzyme, wherein the integration enzyme incorporates the nucleic acids into the cell genome at the integration sites.

BRIEF DESCRIPTION OF THE DRAWINGS

[0081] Aspects, features, benefits and advantages of the embodiments described herein will be apparent with regard to the following description, appended claims, and accompanying drawings where:

[0082] FIG. 1 shows a schematic diagram of a concept of Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0083] FIG. 2 shows a schematic diagram of a prime editing process according to embodiments of the present teachings;

[0084] FIG. 3 shows the percent integration of green fluorescent protein (GFP) at the lentivirally integrated lox71 site in the HEK293FT cell line in the presence of various plasmids according to embodiments of the present teachings;

[0085] FIG. 4 shows the percent editing of the HEK293FT genome for incorporation of various lengths of lox71 or lox66 according to embodiments of the present teachings;

[0086] FIG. 5A shows the percent editing of lox71 site with different PE/Cre vectors according to embodiments of the present teachings;

[0087] FIG. 5B shows the percent integration of GFP at the lox71 site in HEK293FT cell genome according to embodiments of the present teachings;

[0088] FIG. 6 shows a schematic representation of using Bxb1 to integrate a nucleic acid into the genome according to embodiments of the present teachings;

[0089] FIG. 7 shows the percent integration of GFP or Gluc into the attB locus using Bxb1 Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0090] FIG. 8 shows the percent editing of various HEK3 targeting pegRNA Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0091] FIG. 9A shows a fluorescent image of cells wherein the SUPT16H marker is tagged with EGFP using PASTE according to embodiments of the present teachings;

[0092] FIG. 9B shows a fluorescent image of cells wherein the SRRM2 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0093] FIG. 9C shows a fluorescent image of cells wherein the LMNB1 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0094] FIG. 9D shows a fluorescent image of cells wherein the NOLC1 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0095] FIG. 9E shows a fluorescent image of cells wherein the NOLC1 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0096] FIG. 9F shows a fluorescent image of cells wherein the NOLC1 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0097] FIG. 9G shows a fluorescent image of cells wherein the DEPDC4 marker is tagged with EGFP using Programmable Addition via Site-Specific Targeting Elements (PASTE) according to embodiments of the present teachings;

[0098] FIG. 10A shows comparisons of lipofectamine aided transfection in blue with electroporation aided transfection in red for the addition of the Bxb1 attB site at the ACTB N-terminal site in the genome using PASTE according to embodiments of the present teachings;

[0099] FIG. 10B shows comparisons of lipofectamine aided transfection in blue with electroporation aided transfection in red for EGFP integration at the ACTB N-terminal site in the genome using PASTE according to embodiments of the present teachings;

[0100] FIG. 11 shows a diagram of the integration of EGFP and Gluc with various HEK3 targeting pegRNAs according to embodiments of the present teachings;

[0101] FIG. 12 shows a schematic diagram of using φC31 as the integration enzyme, according to embodiments of the present teachings;

[0102] FIG. 13 shows a schematic diagram of multiplexing involving inserting multiple genes of interest in multiple loci using unique guide RNAs that incorporated exterior flanking attB sites according to embodiments of the present teachings;

[0103] FIG. 14A shows a diagram of the orthogonal editing with the right GT-EGFP according to embodiments of the present teachings;

[0104] FIG. 14B shows a diagram of the orthogonal editing with the right GA-mCherry according to embodiments of the present teachings;

[0105] FIG. 15A shows a fluorescent image of a multiplexing of ACTB-EGFP and NOLC1-mCherry according to embodiments of the present teachings;

[0106] FIG. 15B shows a fluorescent image of a multiplexing of ACTB-EGFP and LMNB1-mCherry according to embodiments of the present teachings;

[0107] FIG. 16A shows next generation sequencing results of 9×9 attP and attB central dinucleotide variants and their edit percentage, wherein the orthogonality of attB/attP combinations for potential multiplexing applications is shown according to embodiments of the present teachings;

[0108] FIG. 16B shows a heatmap of 9×9 attP and attB central dinucleotide variants and their edit percentage according to embodiments of the present teachings;

[0109] FIG. 17 shows integration of SERPINA and CPS1 into Albumin loci using Albumin guide-pegRNA in HEK293FT cells according to embodiments of the present teachings;

[0110] FIG. 18 shows schematics for different nucleic acids for engineering T-cells according to embodiments of the present teachings;

[0111] FIG. 19 shows the editing efficiency for EGFP integration at the ACTB locus in primary T-cells according to embodiments of the present teachings;

[0112] FIG. 20 shows editing in TRAC locus in HEK293FT with different pegRNA according to embodiments of the present teachings;

[0113] FIG. 21A shows the attB integration at the ACTB locus using nicking guides 1 and 2 according to embodiments of the present teachings;

[0114] FIG. 21B shows the EGFP integration at the ACTB locus using nicking guides 1 and 2 according to embodiments of the present teachings;

[0115] FIG. 21C shows the EGFP integration at an ACTB site according to embodiments of the present teachings;

[0116] FIG. 22A shows PASTE editing in liver hepatocellular carcinoma cell line HEPG2 according to embodiments of the present teachings;

[0117] FIG. 22B shows PASTE editing of chronic myelogenous leukemia cell line K562 according to embodiments of the present teachings;

[0118] FIG. 23A shows the attB addition with targeting and non-targeting guides according to embodiments of the present teachings;

[0119] FIG. 23B shows the EGFP integration with targeting and non-targeting guides according to embodiments of the present teachings;

[0120] FIG. 23C shows the EGFP integration for mutagenized Bxb1 according to embodiments of the present teachings;

[0121] FIG. 24A shows a schematic of the design parameters for the pegRNA according to embodiments of the present teachings;

[0122] FIG. 24B shows a schematic of the design parameters for nicking guide RNA according to embodiments of the present teachings;

[0123] FIG. 25A shows the integration of EGFP at the ACTB locus with different PBS and RT lengths according to embodiments of the present teachings;

[0124] FIG. 25B shows the integration of EGFP at the LMNB1 locus with different PBS and RT lengths according to embodiments of the present teachings;

[0125] FIG. 25C shows the integration of EGFP at the NOLC1 locus with different PBS and RT lengths according to embodiments of the present teachings;

[0126] FIG. 25D shows the integration of EGFP at the GRSF1 locus with different PBS and RT lengths and different nicking guides according to embodiments of the present teachings;

[0127] FIG. 25E shows EGFP integration with mutant attP sites according to embodiments of the present teachings;

[0128] FIG. 25F shows the PASTE editing of an expanded panel of genes according to embodiments of the present teachings;

[0129] FIG. 26A shows the PASTE EGFP editing at the ACTB locus according to embodiments of the present teachings;

[0130] FIG. 26B shows the HITI EGFP editing at the ACTB locus according to embodiments of the present teachings;

[0131] FIG. 26C shows the comparison between PASTE and HITI editing at a panel of 14 genes according to embodiments of the present teachings;

[0132] FIG. 26D shows PASTE Bxb1 off-target integrations according to embodiments of the present teachings;

[0133] FIG. 26E shows PASTE Cas9 off-target integrations according to embodiments of the present teachings;

[0134] FIG. 26F shows the EGFP integration for gene inserts of different sizes according to embodiments of the present teachings;

[0135] FIG. 27A shows the orthogonality between selected sets of attB and attP sites according to embodiments of the present teachings;

[0136] FIG. 27B shows the orthogonality between selected sets of attB and attP sites according to embodiments of the present teachings;

[0137] FIG. 27C shows a schematic for the orthogonal PASTE editing using engineered di-nucleotide combinations according to embodiments of the present teachings;

[0138] FIG. 28A shows fluorescent images of the GFP tagging of ACTB and SUPT16H genes with PASTE according to embodiments of the present teachings;

[0139] FIG. 28B shows fluorescent images of the GFP tagging of NOLC1 and SRRM2 genes with PASTE according to embodiments of the present teachings;

[0140] FIG. 28C shows fluorescent images of the GFP tagging of LMNB1 and DEPDC4 genes with PASTE according to embodiments of the present teachings;

[0141] FIG. 28D shows the orthogonal gene integration at three endogenous sites with PASTE according to embodiments of the present teachings;

[0142] FIG. 28E shows the multiplexed insertion via one-plex, two-plex, and three-plex gene insertion at three endogenous sites via PASTE according to embodiments of the present teachings;

[0143] FIG. 28F shows fluorescent images of two single cells with multiplexed gene tagging of ACTB (EGFP) and NOLC1 (mCherry) using PASTE according to embodiments of the present teachings;

[0144] FIG. 28G shows fluorescent images of two single cells with multiplexed gene tagging of ACTB (EGFP) and LMNB1 (mCherry) using PASTE according to embodiments of the present teachings;

[0145] FIG. 29A shows the prime editing efficiency of Bxb1 attB site insertion at the ACTB locus according to embodiments of the present teachings;

[0146] FIG. 29B shows the prime editing efficiency at inserting Bxb1 attB sites of different lengths at the ACTB locus according to embodiments of the present teachings;

[0147] FIG. 29C shows the prime editing efficiency of inserting attB sequences from different integrases, wherein both orientations of landing sites are profiled (F, forward; and R, reverse) according to embodiments of the present teachings;

[0148] FIG. 29D shows the prime editing efficiency of inserting attB sequences from Bxb1 integrase and Cre recombinase, wherein both orientations of landing sites are profiled (F, forward; and R, reverse) according to embodiments of the present teachings;

[0149] FIG. 29E shows a schematic of PASTE insertion at the ACTB locus showing guide and target sequences according to embodiments of the present teachings. FIG. 29E discloses SEQ ID NOS 428-431, respectively, in order of appearance;

[0150] FIG. 29F shows a comparison of PASTE integration efficiency of GFP with a panel of integrases targeting the 5' end of the ACTB locus, wherein both orientations of landing sites are profiled (F, forward; and R, reverse) according to embodiments of the present teachings;

[0151] FIG. 29G shows a comparison of GFP cargo integration efficiency between Bxb1 integrases and Cre recombinase according to embodiments of the present teachings;

[0152] FIG. 29H shows the dependence of PASTE editing activity on different prime and integrase components according to embodiments of the present teachings;

[0153] FIG. 29I shows a titration of a single vector PASTE system (SpCas9-RT-P2A-Bxb1) on integrase efficiency according to embodiments of the present teachings;

[0154] FIG. 29J shows the effect of cargo size on PASTE insertion efficiency at the endogenous ACTB target according to embodiments of the present teachings;

[0155] FIG. 29K shows a gel electrophoresis showing complete insertion by PASTE for multiple cargo sizes according to embodiments of the present teachings;

[0156] FIG. 30A shows a schematic of PASTE integration, including resulting attR and attL sites that are generated and PCR primers for assaying the integration junctions according to embodiments of the present teachings;

[0157] FIG. 30B shows a PCR and gel electrophoresis readout of the left integration junction from PASTE insertion of GFP at the ACTB locus, wherein the insertion is analyzed for in-frame and out-of-frame GFP integration experiments as well as for a no prime control, and the expected sizes of the PCR fragments are shown using the primers shown in the schematic in FIG. 30A according to embodiments of the present teachings;

[0158] FIG. 30C shows a PCR and gel electrophoresis readout of the right integration junction from PASTE insertion of GFP at the ACTB locus, wherein the insertion is analyzed for in-frame and out-of-frame GFP integration experiments as well as for a no prime control, and the expected sizes of the PCR fragments are shown using the primers shown in the schematic in FIG. 30A according to embodiments of the present teachings;

[0159] FIG. 30D shows Sanger sequencing of the right integration junction for an in-frame fusion of GFP via PASTE to the N-terminus of ACTB according to embodiments of the present teachings;

[0160] FIG. 30E shows Sanger sequencing of the left integration junction for an in-frame fusion of GFP via PASTE to the N-terminus of ACTB according to embodiments of the present teachings;

[0161] FIG. 31A shows a schematic of various parameters that affect PASTE integration of a ~1 kb GFP insert, wherein on the pegRNA, the PBS, RT, and attB lengths can alter the efficiency of attB insertion, and nicking guide selection also affects overall gene integration efficiency according to embodiments of the present teachings;

[0162] FIG. 31B shows the impact of PBS and RT length on PASTE integration of GFP at the ACTB locus according to embodiments of the present teachings;

[0163] FIG. 31C shows the impact of PBS and RT length on PASTE integration of GFP at the LMNB1 locus according to embodiments of the present teachings;

[0164] FIG. 31D shows the impact of attB length on PASTE integration of GFP at the ACTB locus according to embodiments of the present teachings;

[0165] FIG. 31E shows the impact of attB length on PASTE integration of GFP at the LMNB1 locus according to embodiments of the present teachings;

[0166] FIG. 31F shows the impact of attB length on PASTE integration of GFP at the NOLC1 locus according to embodiments of the present teachings;

[0167] FIG. 31G shows the impact of minimal PBS, RT, and attB lengths on PASTE integration efficiency of GFP at the ACTB locus according to embodiments of the present teachings;

[0168] FIG. 31H shows the impact of minimal PBS, RT, and attB lengths on PASTE integration efficiency of GFP at the LMNB1 locus according to embodiments of the present teachings;

[0169] FIG. 31I shows the PASTE integration of GFP at the LMNB1 locus in the presence and absence of nicking guide, prime, and Bxb1 with a minimally compact pegRNA containing a 38 bp attB compared to a longer pegRNA design according to embodiments of the present teachings;

[0170] FIG. 32A shows the PASTE insertion efficiency at ACTB and LMNB1 loci with two different nicking guide designs according to embodiments of the present teachings;

[0171] FIG. 32B shows the PASTE editing efficiency at ACTB and LMNB1 with target and non-targeting spacers and matched pegRNAs with and without Bxb1 expression according to embodiments of the present teachings;

[0172] FIG. 33A shows the PASTE integration of GFP at the ACTB locus with different Bxb1 catalytic mutants according to embodiments of the present teachings;

[0173] FIG. 33B shows the PASTE integration of GFP at the ACTB locus with different RT catalytic mutants according to embodiments of the present teachings;

[0174] FIG. 34A shows the GFP integration by PASTE at a panel of endogenous genomic loci according to embodiments of the present teachings;

[0175] FIG. 34B shows the integration of a panel of different gene cargo at ACTB locus via PASTE according to embodiments of the present teachings;

[0176] FIG. 34C shows the integration efficiency of therapeutically relevant genes at the ACTB locus according to embodiments of the present teachings;

[0177] FIG. 34D shows the endogenous protein tagging with GFP via PASTE by in-frame endogenous gene tagging at the ACTB loci and SRRM2 loci according to embodiments of the present teachings;

[0178] FIG. 34E shows the endogenous protein tagging with GFP via PASTE by in-frame endogenous gene tagging at the NOLC1 loci and LMNB1 loci according to embodiments of the present teachings;

[0179] FIG. 35 shows the integration of a panel of different gene cargo at LMNB1 locus via PASTE according to embodiments of the present teachings;

[0180] FIG. 36A shows the PASTE integration efficiency for all 16 central dinucleotide attB/attP sequence pairs with a 5 kb GFP template at the ACTB locus according to embodiments of the present teachings;

[0181] FIG. 36B shows a schematic of the pooled attB/attP dinucleotide orthogonality assay, wherein each attB dinucleotide sequence is co-transfected with a barcoded pool of all 16 attP dinucleotide sequences and Bxb1 integrase, relative integration efficiencies are determined by next generation sequencing of barcodes, and all 16 attB dinucleotides are profiled in an arrayed format with attP pools according to embodiments of the present teachings;

[0182] FIG. 36C shows the relative insertion preferences for all possible attB/attP dinucleotide pairs determined by the pooled orthogonality assay according to embodiments of the present teachings;

[0183] FIG. 36D shows the orthogonality of top 4 attB/attP dinucleotide pairs evaluated for GFP integration with PASTE at the ACTB locus according to embodiments of the present teachings;

[0184] FIG. 37 shows the orthogonality of Bxb1 dinucleotides as measured by a pooled reporter assay, wherein each web logo motif shows the relative integration of different attP sequences in a pool at a denoted attB sequence with the listed dinucleotide according to embodiments of the present teachings;

[0185] FIG. 38A shows a schematic of multiplexed integration of different cargo sets at specific genomic loci, wherein three fluorescent cargos (GFP, mCherry, and YFP) are inserted orthogonally at three different loci (ACTB, LMNB1, NOLC1) for in-frame gene tagging according to embodiments of the present teachings;

[0186] FIG. 38B shows the efficiency of multiplexed PASTE insertion of combinations of fluorophores at ACTB, LMNB1, and NOLC1 loci according to embodiments of the present teachings;

[0187] FIG. 39A shows the GFP integration efficiency at a panel of genomic loci by PASTE compared to insertion rates by homology-independent targeted integration (HITI) according to embodiments of the present teachings;

[0188] FIG. 39B shows a comparison of unintended indel generation by PASTE and HITI at the ACTB and LMNB1 target sites, wherein the on-target EGFP integration rate observed compared to unintended indels is shown according to embodiments of the present teachings;

[0189] FIG. 39C shows the integration of a GFP template by PASTE at the ACTB locus compared to homology-directed repair (HDR) at the same target, wherein the quantification is by single-cell clone counting, wherein targeting and non-targeting guides were used for HDR insertion, and wherein, for PASTE, targeting and non-targeting refer to the presence or absence of the SpCas9-RT protein, respectively, according to embodiments of the present teachings;

[0190] FIG. 39D shows the comparison of unintended indel generation by PASTE and HDR based EGFP insertion at the ACTB target site, wherein the average indel rate measured across all single-cell clones generated is shown according to embodiments of the present teachings;

[0191] FIG. 39E shows a schematic for Bxb1 and Cas9 off-target identification and a detection assay according to embodiments of the present teachings;

[0192] FIG. 39F shows the GFP integration activity at predicted Bxb1 off-target sites in the human genome according to embodiments of the present teachings;

[0193] FIG. 39G shows the GFP integration activity at predicted PASTE ACTB Cas9 guide off-target sites according to embodiments of the present teachings;

[0194] FIG. 39H shows the GFP integration activity at predicted HITI ACTB Cas9 guide off-target sites according to embodiments of the present teachings;

[0195] FIG. 39I shows a schematic of next-generation sequencing method to assay genome-wide off-target integration sites by PASTE according to embodiments of the present teachings;

[0196] FIG. 39J shows the alignment of reads at the on-target ACTB site using a genome-wide integration assay, wherein expected on-target integration outcomes are shown according to embodiments of the present teachings;

[0197] FIG. 39K shows the analysis of on-target and off-target integration events across 3 single-cell clones for PASTE and 3 single-cell clones for no prime condition according to embodiments of the present teachings;

[0198] FIG. 39L shows a Manhattan plot of integration events for a representative single-cell clone with PASTE editing, wherein the on-target site is at the ACTB gene on chromosome 7 according to embodiments of the present teachings;

[0199] FIG. 40A shows a comparison of indel rates generated by PASTE and HITI mediated insertion of EGFP at the ACTB and LMNB1 loci in HepG2 cells according to embodiments of the present teachings;

[0200] FIG. 40B shows the validation of ddPCR assays for detecting editing at predicted Bxb1 off-target sites using synthetic amplicons according to embodiments of the present teachings;

[0201] FIG. 40C shows the validation of ddPCR assays for detecting editing at predicted PASTE ACTB Cas9 guide off-target sites using synthetic amplicons according to embodiments of the present teachings;

[0202] FIG. 40D shows the validation of ddPCR assays for detecting editing at predicted HITI ACTB Cas9 guide off-target sites using synthetic amplicons according to embodiments of the present teachings;

[0203] FIG. 41A shows a number of significant differentially regulated genes in HEK293FT cells expressing Bxb1 integrase, PASTE targeting ACTB integration of EGFP, or Prime editing targeting ACTB for EGFP insertion without Bxb1 expression according to embodiments of the present teachings;

[0204] FIG. 41B shows Volcano plots depicting the fold expression change of sequenced mRNAs versus significance (p-value), wherein each dot represents a unique mRNA transcript and significant transcripts are shaded according to either upregulation (red) or downregulation (blue), and wherein fold expression change is measured against ACTB-targeting guide-only expression (including cargo) according to embodiments of the present teachings;

[0205] FIG. 41C shows top significantly upregulated and downregulated genes for Bxb1-only conditions, wherein genes are shown with their corresponding Z-scores of counts per million (cpm) for Bxb1 only expression, GFP-only expression, PASTE targeting ACTB for EGFP insertion, Prime targeting ACTB for EGFP expression without Bxb1, and guide/cargo only according to embodiments of the present teachings;

[0206] FIG. 42A shows a schematic of PASTE performance in the presence of cell cycle inhibition, wherein cells are transfected with plasmids for insertion with PASTE or Cas9-induced HDR and treated with aphidicolin to arrest cell division, and wherein the efficiency of PASTE and HDR are read out with ddPCR or amplicon sequencing respectively according to embodiments of the present teachings;

[0207] FIG. 42B shows the editing efficiency of single mutations by HDR at EMX1 locus with two Cas9 guides in the presence or absence of cell division read out with amplicon sequencing according to embodiments of the present teachings;

[0208] FIG. 42C shows the integration efficiency of various sized GFP inserts up to 13.3 kb at the ACTB locus with PASTE in the presence or absence of cell division according to embodiments of the present teachings;

[0209] FIG. 42D shows the PASTE editing efficiency with two vector (PE2 and Bxb1) and single vector (PE2-P2A-Bxb1) designs in K562 cells according to embodiments of the present teachings;

[0210] FIG. 42E shows the PASTE editing efficiency with single vector (PE2-P2A-Bxb1) designs in primary human T cells according to embodiments of the present teachings;

[0211] FIG. 42F shows the integration efficiency of therapeutically relevant genes at the ACTB locus according to embodiments of the present teachings;

[0212] FIG. 42G shows a schematic of protein production assay for PASTE-integrated transgene, wherein SERPINA1 and CPS1 transgenes are tagged with HIBIT luciferase for readout with both ddPCR and luminescence according to embodiments of the present teachings;

[0213] FIG. 42H shows the integration efficiency of SERPINA1 and CPS1 transgenes in HEK293FT cells at the ACTB locus according to embodiments of the present teachings;

[0214] FIG. 42I shows the integration efficiency of SERPINA1 and CPS1 transgenes in HepG2 cells at the ACTB locus according to embodiments of the present teachings;

[0215] FIG. 42J shows the intracellular levels of SERPINA1-HIBIT and CPS1-HIBIT in HepG2 cells according to embodiments of the present teachings;

[0216] FIG. 42K shows the secreted levels of SERPINA1-HIBIT and CPS1-HIBIT in HepG2 cells according to embodiments of the present teachings;

[0217] FIG. 43A shows the HDR mediated editing of the EMX1 locus that is significantly diminished in non-dividing HEK293FT cells blocked by 5 μM aphidicolin treatment according to embodiments of the present teachings;

[0218] FIG. 43B shows the effect of insert minicircle DNA amount on PASTE-mediated insertion at the ACTB locus in dividing and non-dividing HEK293FT cells blocked by 5 μM aphidicolin treatment according to embodiments of the present teachings;

[0219] FIG. 43C shows the PASTE integration of GFP at the ACTB locus with the GFP template delivered via AAV, showing dose dependence of integration efficiency according to embodiments of the present teachings;

[0220] FIG. 44A shows the PASTE integration activity at three endogenous loci comparing the normal PASTE SV40 NLS to a c-Myc NLS/variable bi-partite SV40 NLS design according to embodiments of the present teachings;

[0221] FIG. 44B shows the PASTE integration activity at the ACTB locus with different GFP minicircle template amounts comparing the normal PASTE SV40 NLS to a c-Myc NLS/variable bi-partite SV40 NLS design according to embodiments of the present teachings;

[0222] FIG. 45 shows the improvement of the PASTE editing activity using a puromycin growth selection marker according to embodiments of the present teachings;

[0223] FIG. 46A shows the integration of SERPINA1 and CPS1 genes that are HIBIT tagged as measured by a protein expression luciferase assay according to embodiments of the present teachings;

[0224] FIG. 46B shows the integration of SERPINA1 and CPS1 genes that are HIBIT tagged as measured by a protein expression luciferase assay normalized to a standardized HIBIT ladder, enabling accurate quantification of protein levels according to embodiments of the present teachings;

[0225] FIG. 47A shows optimization of PASTE constructs with a panel of linkers and reverse transcriptase (RT) modifications for EGFP integration at the ACTB locus, according to embodiments of the present teachings;

[0226] FIG. 47B shows the effect of cargo size on PASTE insertion efficiency at the endogenous ACTB target. Cargos were transfected with fixed molar amounts, according to embodiments of the present teachings;

[0227] FIG. 48A shows prime editing efficiency for the insertion of different length BxbINT AttB sites at ACTB, according to embodiments of the present teachings;

[0228] FIG. 48B shows prime editing efficiency for the insertion of a BxbINT AttB site at ACTB with targeting and non-targeting guides, according to embodiments of the present teachings;

[0229] FIG. 48C shows prime editing efficiency for the insertion of different integrases' (Bxb1, Tp9, and Bt1) AttB sites at ACTB. Both orientations of landing sites are profiled (F, forward; R, reverse), according to embodiments of the present teachings;

[0230] FIG. 48D shows PASTE editing efficiency for the insertion of EGFP at ACTB with and without a nicking guide, according to embodiments of the present teachings; and

[0231] FIG. 49A shows optimization of PASTE editing by dosage titration and protein optimization. PASTE integration efficiency of EGFP at ACTB measured with different doses of a single-vector delivery of components.

[0232] FIG. 49B PASTE integration efficiency of EGFP at ACTB measured with different ratios of a single-vector delivery of components to the EGFP template vector.

[0233] FIG. 49C PASTE integration efficiency of EGFP at ACTB with different RT domain fusions.

[0234] FIG. 49D PASTE integration efficiency of EGFP at ACTB with different RT domain fusions and linkers.

[0235] FIG. 49E PASTE integration efficiency of EGFP at ACTB with mutant RT domains.

[0236] FIG. 49F PASTE integration efficiency of EGFP at ACTB with mutated BxbINT domains.

[0237] FIG. 50A Insertion templates delivered via AAV transduction. PASTE editing machinery was delivered via transfection, and templates were co-delivered via AAV dosing at levels indicated.

[0238] FIG. 50B Schematic of AdV delivery of the complete PASTE system with three viral vectors.

[0239] FIG. 50C Integration efficiency of AdV delivery of integrase, guides, and cargo in HEK293FT and HepG2 cells. BxbINT and guide RNAs or cargo were delivered either via plasmid transfection (P1), AdV transduction (AdV), or omitted (-). SpCas9-RT was only delivered as plasmid or omitted.

[0240] FIG. 50D AdV delivery of all PASTE components in HEK293FT and HepG2 cells.

[0241] FIG. 50E Schematic of mRNA and synthetic guide delivery of PASTE components.

[0242] FIG. 50F Delivery of PASTE system components with mRNA and synthetic guides, paired with either AdV or plasmid cargo.

[0243] FIG. 50G Delivery of circular mRNA with synthetic guides and either AdV or plasmid cargo.

[0244] FIG. 50H PASTE editing efficiency with single vector designs in primary human T cells.

[0245] FIG. 50I PASTE editing efficiency with single vector designs in primary human hepatocytes.

[0246] FIG. 51A PASTE editing efficiency at the LMNB1 locus with 130 bp and 385 bp deletions of the first exon of LMNB1 with combined insertion of an attB sequence.

[0247] FIG. 51B PASTE editing efficiency with a 130 bp deletion of the first exon of LMNB1 with a combined insertion of a 967 bp cargo using the PASTE system.

DETAILED DESCRIPTION

[0248] It will be appreciated that for clarity, the following discussion will describe various aspects of embodiments of the applicant's teachings. It should be noted that the specific embodiments are not intended as an exhaustive description or as a limitation to the broader aspects discussed herein. One aspect described in conjunction with a particular embodiment is not necessarily limited to that embodiment and can be practiced with any other embodiment(s). Reference throughout this specification to "one embodiment," "an embodiment," or "an example embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Thus, appearances of the phrases "in one embodiment," "in an embodiment," or "an example embodiment" in various places throughout this specification are not necessarily all referring to the same embodiment, but may. Furthermore, the particular features, structures or characteristics may be combined in any suitable manner, as would be apparent to a person skilled in the art from this disclosure, in one or more embodiments.

General Definitions

[0249] Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure pertains. Definitions of common terms and techniques in molecular biology may be found in Molecular Cloning: A Laboratory Manual, 2nd edition (1989) (Sambrook, Fritsch, and Maniatis); Molecular Cloning: A Laboratory Manual, 4th edition (2012) (Green and Sambrook); Current Protocols in Molecular Biology (1987) (F. M. Ausubel et al. eds.); the series Methods in Enzymology (Academic Press, Inc.); PCR 2: A Practical Approach (1995) (M. J. MacPherson, B. D. Hames, and G. R. Taylor eds.); Antibodies, A Laboratory Manual (1988) (Harlow and Lane, eds.); Antibodies A Laboratory Manual, 2nd edition 2013 (E. A. Greenfield ed.); Animal Cell Culture (1987) (R. I. Freshney, ed.); Benjamin Lewin, Genes IX, published by Jones and Bartlett, 2008 (ISBN 0763752223); Kendrew et al. (eds.), The Encyclopedia of Molecular Biology, published by Blackwell Science Ltd., 1994 (ISBN 0632021829); Robert A. Meyers (ed.), Molecular Biology and Biotechnology: a Comprehensive Desk Reference, published by VCH Publishers, Inc., 1995 (ISBN 9780471185710); Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, N.Y. 1994); March, Advanced Organic Chemistry Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, N.Y. 1992); and Marten H. Hofker and Jan van Deursen, Transgenic Mouse Methods and Protocols, 2nd edition (2011).

[0250] As used herein, the singular forms "a", "an," and "the" include both singular and plural referents unless the context clearly dictates otherwise. Thus, for example, reference to "a cell" includes a plurality of such cells.

[0251] As used herein, the term "optional" or "optionally" means that the subsequent described event, circumstance or substituent may or may not occur, and that the description includes instances where the event or circumstance occurs and instances where it does not.

[0252] The recitation of numerical ranges by endpoints includes all numbers and fractions subsumed within the respective ranges, as well as the recited endpoints.

[0253] As used herein, the term "about" or "approximately," when referring to a measurable value such as a parameter, an amount, a temporal duration, and the like, is meant to encompass variations of and from the specified value, such as variations of +/-10% or less, +/-5% or less, +/-1% or less, +/-0.5% or less, and +/-0.1% or less of and from the specified value, insofar as such variations are appropriate to perform in the disclosure. It is to be understood that the value to which the modifier "about" or "approximately" refers is itself also specifically, and preferably, disclosed.

[0254] It is noted that all publications and references cited herein are expressly incorporated herein by reference in their entirety. The publications discussed herein are provided solely for their disclosure prior to the filing date of the present application. Nothing herein is to be construed as an admission that the present disclosure is not entitled to antedate such publication. Further, the dates of publication provided may be different from the actual publication dates which may need to be independently confirmed.

Overview

[0255] The embodiments disclosed herein provide non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using Programmable Addition via Site-Specific Targeting Elements (PASTE). A schematic diagram illustrating the concept of PASTE is shown in FIG. 1. As discussed in more detail below, PASTE comprises the addition of an integration site into a target genome followed by the insertion of one or more genes of interest or one or more nucleic acid sequences of interest at the site. This process can be done as one or more reactions in a cell. The addition of the integration site into the target genome is done using gene editing technologies that include for example, without limitation, prime editing, recombinant adeno-associated virus (rAAV)-mediated nucleic acid integration, transcription activator-like effector nucleases (TALENs), and zinc finger nucleases (ZFNs). The integration of the transgene at the integration site is done using integrase technologies that include for example, without limitation, integrases, recombinases and reverse transcriptases. The necessary components for the site-specific genetic engineering disclosed herein comprise at least one or more nucleases, one or more gRNA, one or more integration enzymes, and one or more sequences that are complementary or associated to the integration site and linked to the one or more genes of interest or one or more nucleic acid sequences of interest to be inserted into the cell genome.
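As a purely illustrative aid, the two-step logic described above (adding an integration site, then inserting cargo at that site) can be sketched as string operations in Python. The `AttSite` class, the function names, and every sequence below are hypothetical placeholders and are not sequences or methods of this disclosure. The sketch assumes a serine-integrase-style exchange in which the genomic site and the cargo site share a central core and recombine into two hybrid sites flanking the cargo.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttSite:
    """Toy attachment site: left arm, shared central core, right arm (all hypothetical)."""
    left: str
    core: str
    right: str

    @property
    def seq(self) -> str:
        return self.left + self.core + self.right


def add_integration_site(genome: str, position: int, site: AttSite) -> str:
    """Step 1: write the integration site into the genome at a 0-based position."""
    if not 0 <= position <= len(genome):
        raise ValueError("position is outside the genome string")
    return genome[:position] + site.seq + genome[position:]


def integrate_cargo(genome: str, genomic_site: AttSite,
                    cargo_site: AttSite, payload: str) -> str:
    """Step 2: exchange strands at the shared core so the payload ends up
    flanked by two hybrid sites built from one arm of each parent site."""
    if genomic_site.core != cargo_site.core:
        raise ValueError("sites must share the same central core")
    start = genome.find(genomic_site.seq)
    if start < 0:
        raise ValueError("integration site not found in genome")
    end = start + len(genomic_site.seq)
    hybrid_left = genomic_site.left + genomic_site.core + cargo_site.right
    hybrid_right = cargo_site.left + cargo_site.core + genomic_site.right
    return genome[:start] + hybrid_left + payload + hybrid_right + genome[end:]


# Placeholder sequences chosen only to make the bookkeeping visible.
site_in_genome = AttSite("AAAC", "GT", "CCCA")
site_on_cargo = AttSite("GGTT", "GT", "TTGG")
tagged = add_integration_site("TTTTGGGG", 4, site_in_genome)
edited = integrate_cargo(tagged, site_in_genome, site_on_cargo, "ATGCATGC")
print(tagged)
print(edited)
```

In this toy model the original genomic site no longer exists after integration, because each hybrid site carries one arm from each parent site. That is one simple way to picture why the insertion does not readily reverse.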

[0256] An advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is programmable insertion of large elements without reliance on DNA damage responses.

[0257] Another advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is facile multiplexing, enabling programmable insertion at multiple sites.

[0258] Another advantage of the non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering disclosed herein is scalable production and delivery through minicircle templates.

Prime Editing

[0259] The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using gene editing technologies, such as prime editing, to add an integration site into a target genome. Prime editing will be discussed in more detail below.

[0260] Prime editing is a versatile and precise genome editing method that directly writes new genetic information into a specified DNA site. A schematic diagram illustrating the concept of prime editing is shown in FIG. 2. See, Anzalone, A. V., et al. "Search-and-replace genome editing without double-strand breaks or donor DNA," Nature 576, 149-157 (2019). Prime editing uses a catalytically-impaired Cas9 endonuclease that is fused to an engineered reverse transcriptase (RT) and programmed with a prime-editing guide RNA (pegRNA). The skilled person in the art would appreciate that the pegRNA both specifies the target site and encodes the desired edit. The catalytically-impaired Cas9 endonuclease also comprises a Cas9 nickase that is fused to the reverse transcriptase. During genetic editing, the Cas9 nickase part of the protein is guided to the DNA target site by the pegRNA. The reverse transcriptase domain then uses the pegRNA to template reverse transcription of the desired edit, directly polymerizing DNA onto the nicked target DNA strand. The edited DNA strand replaces the original DNA strand, creating a heteroduplex containing one edited strand and one unedited strand. Afterward, the prime editor (PE) guides resolution of the heteroduplex to favor copying the edit onto the unedited strand, completing the process.

[0261] The prime editors refer to a Moloney Murine Leukemia Virus (M-MLV) reverse transcriptase (RT) fused to a Cas9 H840A nickase. Fusing the RT to the C-terminus of the Cas9 nickase may result in higher editing efficiency. Such a complex is called PE1. The Cas9(H840A) can also be linked to a non-M-MLV reverse transcriptase such as an AMV-RT or XRT (Cas9(H840A)-AMV-RT or XRT). In some embodiments, Cas9(H840A) can be replaced with Cas12a/b or Cas9(D10A). A Cas9 (wild type), Cas9(H840A), Cas9(D10A) or Cas12a/b nickase fused to a pentamutant of M-MLV RT (D200N/L603W/T330P/T306K/W313F), having up to about 45-fold higher efficiency, is called PE2. In some embodiments, the M-MLV RT comprises one or more of the mutations: Y8H, P51L, S56A, S67R, E69K, V129P, L139P, T197A, H204R, V223H, T246E, N249D, E286R, Q291I, E302K, E302R, F309N, M320L, P330E, L435G, L435R, N454K, D524A, D524G, D524N, E562Q, D583N, H594Q, E607K, D653N, and L671P. In some embodiments, the reverse transcriptase can also be a wild-type or modified transcription xenopolymerase (RTX), avian myeloblastosis virus reverse transcriptase (AMV-RT), Feline Immunodeficiency Virus reverse transcriptase (FIV-RT), FeLV-RT (Feline leukemia virus reverse transcriptase), HIV-RT (Human Immunodeficiency Virus reverse transcriptase), or Eubacterium rectale maturase RT (MarathonRT). PE3 involves nicking the non-edited strand, potentially causing the cell to remake that strand using the edited strand as the template to induce HR. The nicking of the non-edited strand can involve the use of a nicking guide RNA (ngRNA).

[0262] Nicking the non-edited strand can increase editing efficiency. For example, nicking the non-edited strand can increase editing efficiency by about 1.1 fold, about 1.3 fold, about 1.5 fold, about 1.7 fold, about 1.9 fold, about 2.1 fold, about 2.3 fold, about 2.5 fold, about 2.7 fold, about 2.9 fold, about 3.1 fold, about 3.3 fold, about 3.5 fold, about 3.7 fold, about 3.9 fold, about 4.1 fold, about 4.3 fold, about 4.5 fold, about 4.7 fold, about 4.9 fold, or any range that is formed from any two of those values as endpoints.

[0263] Although the optimal nicking position varies depending on the genomic site, nicks positioned 3' of the edit about 40-90 bp from the pegRNA-induced nick can generally increase editing efficiency without excess indel formation. The prime editing practice allows starting with non-edited strand nicks about 50 bp from the pegRNA-mediated nick, and testing alternative nick locations if indel frequencies exceed acceptable levels.
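By way of a hedged illustration only, the search practice described above (begin near 50 bp, then try other positions in the roughly 40-90 bp window if indels are too frequent) could be organized as in the following Python sketch. The function names, the 10 bp step, the measured indel rates, and the acceptance threshold are all hypothetical assumptions introduced for this example and are not values taught by this disclosure.

```python
from typing import Dict, List, Optional


def candidate_nick_offsets(start: int = 50, lo: int = 40, hi: int = 90,
                           step: int = 10) -> List[int]:
    """List offsets (bp from the pegRNA-induced nick) within [lo, hi],
    ordered so that positions closest to the starting offset come first."""
    offsets = range(lo, hi + 1, step)
    return sorted(offsets, key=lambda d: (abs(d - start), d))


def pick_nick_offset(indel_rate_by_offset: Dict[int, float],
                     max_indel_rate: float) -> Optional[int]:
    """Return the first candidate offset whose measured indel rate is
    acceptable, or None if no tested offset passes."""
    for offset in candidate_nick_offsets():
        rate = indel_rate_by_offset.get(offset)
        if rate is not None and rate <= max_indel_rate:
            return offset
    return None


# Hypothetical measurements: indel fraction observed at each tested offset.
measured = {50: 0.12, 40: 0.03, 60: 0.02}
print(candidate_nick_offsets())
print(pick_nick_offset(measured, max_indel_rate=0.05))
```

The ordering rule simply encodes "start at about 50 bp and work outward"; a practitioner could equally rank candidates by predicted guide quality or by measured editing efficiency.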

[0264] As used herein, the term "guide RNA" (gRNA) and the like refer to an RNA that guides the insertion or deletion of one or more genes of interest or one or more nucleic acid sequences of interest into a target genome. The gRNA can also refer to a prime editing guide RNA (pegRNA), a nicking guide RNA (ngRNA), and a single guide RNA (sgRNA). In some embodiments, the term "gRNA molecule" refers to a nucleic acid encoding a gRNA. In some embodiments, the gRNA molecule is naturally occurring. In some embodiments, a gRNA molecule is non-naturally occurring. In some embodiments, a gRNA molecule is a synthetic gRNA molecule. A gRNA can target a nuclease or a nickase such as Cas9, Cas12a/b, Cas9 (H840A) or Cas9 (D10A) molecule to a target nucleic acid or sequence in a genome. In some embodiments, the gRNA can bind to a DNA nickase bound to a reverse transcriptase domain. A "modified gRNA," as used herein, refers to a gRNA molecule that has an improved half-life after being introduced into a cell as compared to a non-modified gRNA molecule after being introduced into a cell. In some embodiments, the guide RNA can facilitate the addition of the insertion site sequence for recognition by integrases, transposases, or recombinases.

[0265] As used herein, the term "prime-editing guide RNA" (pegRNA) and the like refer to an extended single guide RNA (sgRNA) comprising a primer binding site (PBS), a reverse transcriptase (RT) template sequence, and an integration site sequence that can be recognized by recombinases, integrases, or transposases. Exemplary design parameters for pegRNA are shown in FIG. 24A. For example, the PBS can have a length of at least about 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, or more nt. For example, the PBS can have a length of about 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, or any range that is formed from any two of those values as endpoints. For example, the RT template sequence can have a length of at least about 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, 31 nt, 32 nt, 33 nt, 34 nt, 35 nt, 36 nt, 37 nt, 38 nt, 39 nt, 40 nt, 41 nt, 42 nt, 43 nt, 44 nt, 45 nt, 46 nt, 47 nt, 48 nt, 49 nt, 50 nt, or more nt. For example, the RT template sequence can have a length of about 4 nt, 5 nt, 6 nt, 7 nt, 8 nt, 9 nt, 10 nt, 11 nt, 12 nt, 13 nt, 14 nt, 15 nt, 16 nt, 17 nt, 18 nt, 19 nt, 20 nt, 21 nt, 22 nt, 23 nt, 24 nt, 25 nt, 26 nt, 27 nt, 28 nt, 29 nt, 30 nt, 31 nt, 32 nt, 33 nt, 34 nt, 35 nt, 36 nt, 37 nt, 38 nt, 39 nt, 40 nt, 41 nt, 42 nt, 43 nt, 44 nt, 45 nt, 46 nt, 47 nt, 48 nt, 49 nt, 50 nt, or any range that is formed from any two of those values as endpoints.

[0266] During genome editing, the primer binding site allows the 3' end of the nicked DNA strand to hybridize to the pegRNA, while the RT template serves as a template for the synthesis of edited genetic information. The pegRNA is capable for instance, without limitation, of (i) identifying the target nucleotide sequence to be edited and (ii) encoding new genetic information that replaces the targeted sequence. In some embodiments, the pegRNA is capable of (i) identifying the target nucleotide sequence to be edited and (ii) encoding an integration site that replaces the targeted sequence.
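The roles of the primer binding site and the RT template described above can be illustrated with a minimal Python sketch that derives a pegRNA 3' extension from a target strand. This is an assumption-laden sketch, not the design procedure of this disclosure: the function names, the coordinate convention (a 0-based `nick` index on the PAM-containing strand), and the toy sequences are hypothetical; sequences are written in the DNA alphabet for simplicity; and the default lengths merely echo the "PBS 13 RT 29" naming that appears in Table 1, on the assumption that the RT length refers to homology 3' of the nick.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA-alphabet sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def peg_extension(pam_strand: str, nick: int, insert: str,
                  pbs_len: int = 13, homology_len: int = 29) -> str:
    """Build the 3' extension (5'->3', DNA alphabet) as RT template + PBS.

    pam_strand: PAM-containing strand, 5'->3'.
    nick: 0-based index of the first base 3' of the nick.
    insert: sequence to be written at the nick (e.g., an integration site).
    """
    if pbs_len > nick or nick + homology_len > len(pam_strand):
        raise ValueError("not enough flanking sequence around the nick")
    upstream = pam_strand[nick - pbs_len:nick]         # 3' end of the nicked strand
    downstream = pam_strand[nick:nick + homology_len]  # homology 3' of the nick
    pbs = revcomp(upstream)                   # anneals to the nicked 3' end
    rt_template = revcomp(insert + downstream)  # templates insert plus homology
    return rt_template + pbs


# Toy example with placeholder sequences and short lengths.
ext = peg_extension("AAAACCCCGGGGTTTT", nick=8, insert="ACGTAC",
                    pbs_len=4, homology_len=4)
print(ext)
print(revcomp(ext))  # upstream + insert + downstream, i.e., the edited strand segment
```

A useful sanity check in this model is that the reverse complement of the extension reads as the edited strand: the bases 5' of the nick, then the insert, then the homology 3' of the nick.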

[0267] As used herein, the term "nicking guide RNA" (ngRNA) and the like refer to an RNA sequence that can nick a strand such as an edited strand and a non-edited strand. Exemplary design parameters for ngRNA are shown in FIG. 24B. The ngRNA can induce nicks at about 1 or more nt away from the site of the gRNA-induced nick. For example, the ngRNA can nick at least at about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, 100, 101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111, 112, 113, 114, 115, 116, 117, 118, 119, 120, or more nt away from the site of the gRNA-induced nick. In some embodiments, the ngRNA comprises SEQ ID NO: 75 with guide sequence SEQ ID NO: 74. As used herein, the terms "reverse transcriptase" and "reverse transcriptase domain" refer to an enzyme or an enzymatically active domain that can reverse transcribe an RNA into a complementary DNA. The reverse transcriptase or reverse transcriptase domain is an RNA-dependent DNA polymerase. Such reverse transcriptase domains encompass, but are not limited to, an M-MLV reverse transcriptase, or a modified reverse transcriptase such as, without limitation, Superscript® reverse transcriptase (Invitrogen; Carlsbad, Calif.), Superscript® VILO™ cDNA synthesis (Invitrogen; Carlsbad, Calif.), RTX, AMV-RT, and Quantiscript Reverse Transcriptase (Qiagen, Hilden, Germany).
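For illustration only, the RNA-dependent DNA polymerase activity defined above can be modeled at the sequence level: the complementary DNA, read 5' to 3', is the reverse complement of the RNA template written in the DNA alphabet. The function name and the example sequence in this Python sketch are hypothetical, and the sketch ignores enzyme fidelity, processivity, and priming.

```python
# Pairing rule for copying an RNA template into complementary DNA.
_RNA_TO_CDNA = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_transcribe(rna: str) -> str:
    """Return the cDNA (5'->3') complementary to an RNA template (5'->3')."""
    try:
        # Read the template 3'->5' so the product is written 5'->3'.
        return "".join(_RNA_TO_CDNA[base] for base in reversed(rna.upper()))
    except KeyError as exc:
        raise ValueError(f"not an RNA base: {exc.args[0]!r}") from None


print(reverse_transcribe("AUGGCU"))
```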

[0268] The pegRNA-PE complex disclosed herein recognizes the target site in the genome and the Cas9 for example nicks a protospacer adjacent motif (PAM) strand. The primer binding site (PBS) in the pegRNA hybridizes to the PAM strand. The RT template operably linked to the PBS, containing the edit sequence, directs the reverse transcription of the RT template to DNA into the target site. Equilibration between the edited 3' flap and the unedited 5' flap, cellular 5' flap cleavage and ligation, and DNA repair results in stably edited DNA. To optimize base editing, a Cas9 nickase can be used to nick the non-edited strand, thereby directing DNA repair to that strand, using the edited strand as a template.

Integrase Technologies

[0269] The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using integrase technologies. Integrase technologies will be discussed in more detail below.

[0270] The integrase technologies used herein comprise proteins or nucleic acids encoding the proteins that direct integration of a gene of interest or nucleic acid sequence of interest into an integration site via a nuclease such as a prime editing nuclease. The protein directing the integration can be an enzyme such as an integration enzyme. The integration enzyme can be an integrase that incorporates the genome or nucleic acid of interest into the cell genome at the integration site by integration. The integration enzyme can be a recombinase that incorporates the genome or nucleic acid of interest into the cell genome at the integration site by recombination. The integration enzyme can be a reverse transcriptase that incorporates the genome or nucleic acid of interest into the cell genome at the integration site by reverse transcription. The integration enzyme can be a retrotransposase that incorporates the genome or nucleic acid of interest into the cell genome at the integration site by retrotransposition.

[0271] As used herein, the term "integration enzyme" refers to an enzyme or protein used to integrate a gene of interest or nucleic acid sequence of interest into a desired location or at the integration site, in the genome of a cell, in a single reaction or multiple reactions. Examples of integration enzymes include, without limitation, Cre, Dre, Vika, Bxb1, φC31, RDF, FLP, φBT1, R1, R2, R3, R4, R5, TP901-1, A118, φFC1, φC1, MR11, TG1, φ370.1, Wβ, BL3, SPBc, K38, Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, BxZ2, φRV, and retrotransposases encoded by R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), and Minos. In some embodiments, the term "integration enzyme" refers to a nucleic acid (DNA or RNA) encoding the above-mentioned enzymes. In some embodiments, the Cre recombinase is expressed from a Cre recombinase expression plasmid (SEQ ID NO: 71).

[0272] Mammalian expression plasmids can be found in Table 1 below.

TABLE-US-00001 TABLE 1
Name | Full Description | SEQ ID NOS:
PE2-Bxb1 Single Vector | pCMV-PE2-P2A-Bxb1 | (SEQ ID NO: 381)
PE2 prime editor | pCMV-PE2/Addgene #132775 | (SEQ ID NO: 382)
PE2*-Bxb1 Single Vector | New NLS pCMV-PE2-P2A-Bxb1 | (SEQ ID NO: 383)
PASTEv3 | pCMV-SpCas9-XTEN-RT(1-478)-Sto7d-GGGGS-BxbINT | (SEQ ID NO: 384)
ACTB pegRNA | ACTB N-term PBS 13 RT 29 attB 46 pegRNA | (SEQ ID NO: 385)
ACTB Nicking +48 | ACTB N-term Nicking guide 1 +48 guide | (SEQ ID NO: 386)
Bxb1 integrase | pCAG-NLS-HA-Bxb1integrase/Addgene #51271 | (SEQ ID NO: 387)
TP901-1 Integrase | TP901-1 Integrase | (SEQ ID NO: 388)
PhiBT Integrase | PhiBT Integrase | (SEQ ID NO: 389)
HDR sgRNA guide | Minicircle U6-sgRNA EFS-SpCas9 | (SEQ ID NO: 390)
HDR EGFP cargo | Cas9 HDR template site with EGFP | (SEQ ID NO: 391)
AAV helper plasmid | PDF6 AAV helper plasmid | (SEQ ID NO: 392)
AAV EGFP donor | GFP AAV donor plasmid | (SEQ ID NO: 393)
AAV2/8 | AAV2/8 capsid protein | (SEQ ID NO: 394)

[0273] Minicircle cargo gene maps can be found in Table 2 below.

TABLE-US-00002 TABLE 2
Name | Full Description | SEQ ID NOS:
Cargo EGFP | Parent minicircle plasmid - Cargo EGFP with attP Bxb1 site | (SEQ ID NO: 76)
Cargo EGFP post cleavage | Cargo EGFP with attP Bxb1 site - post minicircle cleavage | (SEQ ID NO: 395)
Cargo EGFP for fusion | Parent minicircle plasmid - Cargo EGFP with attP Bxb1 site for fusion | (SEQ ID NO: 396)
mCherry Cargo post cleavage | Cargo mCherry with attP Bxb1 site - post minicircle cleavage | (SEQ ID NO: 397)
YFP Cargo post cleavage | Cargo YFP with attP Bxb1 site - post minicircle cleavage | (SEQ ID NO: 398)
SERPINA1 Cargo post cleavage | Cargo SERPINA1 with attP Bxb1 site - post minicircle cleavage | (SEQ ID NO: 399)
CPS1 Cargo post cleavage | Cargo CPS1 with attP Bxb1 site - post minicircle cleavage | (SEQ ID NO: 400)
CFTR Cargo | Parent minicircle plasmid - Cargo CFTR with attP Bxb1 site | (SEQ ID NO: 401)
NYESO TCR Cargo post cleavage | Cargo NYESO TCR with attP Bxb1 site - post minicircle cleavage | (SEQ ID NO: 402)

[0274] In some embodiments, the serine integrase φC31 from φC31 phage is used as the integration enzyme. The integrase φC31 in combination with a pegRNA can be used to insert the pseudo attP integration site (SEQ ID NO: 78). A DNA minicircle containing a gene or nucleic acid of interest and an attB (SEQ ID NO: 3) site can be used to integrate the gene or nucleic acid of interest into the genome of a cell. This integration can be aided by co-transfection of an expression vector having the φC31 integrase.

[0275] As used herein, the term "integrase" refers to a bacteriophage derived integrase, including wild-type integrase and any of a variety of mutant or modified integrases. As used herein, the term "integrase complex" may refer to a complex comprising integrase and integration host factor (IHF). As used herein, the term "integrase complex" and the like may also refer to a complex comprising an integrase, an integration host factor, and a bacteriophage λ-derived excisionase (Xis).

[0276] As used herein, the term "recombinase" and the like refer to a site-specific enzyme that mediates the recombination of DNA between recombinase recognition sequences, which results in the excision, integration, inversion, or exchange (e.g., translocation) of DNA fragments between the recombinase recognition sequences. Recombinases can be classified into two distinct families: serine recombinases (e.g., resolvases and invertases) and tyrosine recombinases (e.g., integrases). Examples of serine recombinases include, without limitation, Hin, Gin, Tn3, β-six, CinH, ParA, γδ, Bxb1, φC31, TP901, TG1, φBT1, R1, R2, R3, R4, R5, φRV1, φFC1, MR11, A118, U153, and gp29. Examples of serine recombinases also include, without limitation, recombinases Peaches, Veracruz, Rebeuca, Theia, Benedict, KSSJEB, PattyP, Doom, Scowl, Lockley, Switzer, Bob3, Troube, Abrogate, Anglerfish, Sarfire, SkiPole, ConceptII, Museum, Severus, Airmid, Benedict, Hinder, ICleared, Sheen, Mundrea, and BxZ2 from Mycobacterial phages. Examples of tyrosine recombinases include, without limitation, Cre, FLP, R, Lambda, HK101, HK022, and pSAM2. The serine and tyrosine recombinase names stem from the conserved nucleophilic amino acid residue that the recombinase uses to attack the DNA and which becomes covalently linked to the DNA during strand exchange.

[0277] Recombinases have numerous applications, including the creation of gene knockouts/knock-ins and gene therapy applications. See, e.g., Brown et al., "Serine recombinases as tools for genome engineering." Methods, 2011; 53(4):372-9; Hirano et al., "Site-specific recombinases as tools for heterologous gene integration." Appl. Microbiol. Biotechnol. 2011; 92(2):227-39; Chavez and Calos, "Therapeutic applications of the .PHI.C31 integrase system." Curr. Gene Ther. 2011; 11(5):375-81; Turan and Bode, "Site-specific recombinases: from tag-and-target- to tag-and-exchange-based genomic modifications." FASEB J. 2011; 25(12):4088-107; Venken and Bellen, "Genome-wide manipulations of Drosophila melanogaster with transposons, Flp recombinase, and .PHI.C31 integrase." Methods Mol. Biol. 2012; 859:203-28; Murphy, "Phage recombinases and their applications." Adv. Virus Res. 2012; 83:367-414; Zhang et al., "Conditional gene manipulation: Creating a new biological era." J. Zhejiang Univ. Sci. B. 2012; 13(7):511-24; Karpenshif and Bernstein, "From yeast to mammals: recent advances in genetic control of homologous recombination." DNA Repair (Amst). 2012; 1; 11(10):781-8; the entire contents of each are hereby incorporated by reference in their entirety.

[0278] The recombinases provided herein are not meant to be exclusive examples of recombinases that can be used in embodiments of the disclosure. The methods and compositions of the disclosure can be expanded by mining databases for new orthogonal recombinases or designing synthetic recombinases with defined DNA specificities (See, e.g., Groth et al., "Phage integrases: biology and applications." J. Mol. Biol. 2004; 335, 667-678; Gordley et al., "Synthesis of programmable integrases." Proc. Natl. Acad. Sci. USA. 2009; 106, 5053-5058; the entire contents of each are hereby incorporated by reference in their entirety).

[0279] Other examples of recombinases that are useful in the systems, methods, and compositions described herein are known to those of skill in the art, and any new recombinase that is discovered or generated is expected to be able to be used in the different embodiments of the disclosure.

[0280] As used herein, the term "retrotransposase" and the like refer to an enzyme, or combination of one or more enzymes, wherein at least one enzyme has a reverse transcriptase domain. Retrotransposases are capable of inserting long sequences (e.g., over 3000 nucleotides) of heterologous nucleic acid into a genome. Examples of retrotransposases include, without limitation, retrotransposases encoded by elements such as R2, L1, Tol2, Tc1, Tc3, Mariner (Himar 1), Mariner (mos 1), Minos, and any mutants thereof.

[0281] In some embodiments, the one or more genes of interest or one or more nucleic acid sequences of interest are inserted into a desired location in a genome using an RNA fragment, such as a retrotransposon, encoding the nucleic acid linked to a complementary or associated integration site. Insertion of the nucleic acid of interest into the desired location in the genome using a retrotransposon is aided by a retrotransposase.

[0282] The gene and nucleic acid sequence of interest disclosed herein can be any gene and nucleic acid sequence known in the art. The gene and nucleic acid sequence of interest can be for therapeutic and/or diagnostic uses. Examples of genes of interest include, without limitation, GBA, BTK, ADA, CNGB3, CNGA3, ATF6, GNAT2, ABCA1, ABCA7, APOE, CETP, LIPC, MMP9, PLTP, VTN, ABCA4, MFSD8, TLR3, TLR4, ERCC6, HMCN1, HTRA1, MCDR4, MCDR5, ARMS2, C2, C3, CFB, CFH, JAG1, NOTCH2, CACNA1F, SERPINA1, TTR, GSN, B2M, APOA2, APOA1, OSMR, ELP4, PAX6, ARG, ASL, PITX2, FOXC1, BBS1, BBS10, BBS2, BBS9, MKKS, MKS1, BBS4, BBS7, TTC8, ARL6, BBS5, BBS12, TRIM32, CEP290, ADIPOR1, BBIP1, CEP19, IFT27, LZTFL1, DMD, BEST1, HBB, CYP4V2, AMACR, CYP7B1, HSD3B7, AKR1D1, OPN1SW, NR2F1, RLBP1, RGS9, RGS9BP, PROM1, PRPH2, GUCY2D, CACD, CHM, ALAD, ASS1, SLC25A13, OTC, ACADVL, ETFDH, TMEM67, CC2D2A, RPGRIP1L, KCNV2, CRX, GUCA1A, CERKL, CDHR1, PDE6C, TTLL5, RPGR, CEP78, C21orf2, C8ORF37, RPGRIP1, ADAM9, POC1B, PITPNM3, RAB28, CACNA2D4, AIPL1, UNC119, PDE6H, OPN1LW, RIMS1, CNNM4, IFT81, RAX2, RDH5, SEMA4A, CORD17, PDE6B, GRK1, SAG, RHO, CABP4, GNB3, SLC24A1, GNAT1, GRM6, TRPM1, LRIT3, TGFBI, TACSTD2, KRT12, OVOL2, CPS1, UGT1A1, UGT1A9, UGT1A8, UGT1A7, UGT1A6, UGT1A5, UGT1A4, CFTR, DLD, EFEMP1, ABCC2, ZNF408, LRP5, FZD4, TSPAN12, EVR3, APOB, SLC2A2, LOC106627981, GBA1, NR2E3, OAT, SLC40A1, F8, F9, UROD, CPOX, HFE, JH, LDLR, EPHX1, TJP2, BAAT, NBAS, LARS1, HAMP, HJV, RS1, ADAMTS18, LRAT, RPE65, LCA5, MERTK, GDF6, RD3, CCT2, CLUAP1, DTHD1, NMNAT1, SPATA7, IFT140, IMPDH1, OTX2, RDH12, TULP1, CRB1, MT-ND4, MT-ND1, MT-ND6, BCKDHA, BCKDHB, DBT, MMAB, ARSB, GUSB, NAGS, NPC1, NPC2, NDP, OPA1, OPA3, OPA4, OPA5, RTN4IP1, TMEM126A, OPA6, OPA8, ACO2, PAH, PRKCSH, SEC63, GAA, UROS, PPOX, HPX, HMOX1, HMBS, MIR223, CYP1B1, LTBP2, AGXT, ATP8B1, ABCB11, ABCB4, FECH, ALAS2, PRPF31, RP1, EYS, TOPORS, USH2A, CNGA1, C2ORF71, RP2, KLHL7, ORF1, RP6, RP24, RP34, ROM1, ADGRA3, AGBL5, AHR, ARHGEF18, CA4, CLCC1, DHDDS, EMC1, FAM161A, HGSNAT, HK1, IDH3B, KIAA1549, KIZ, MAK, NEUROD1, NRL, PDE6A, PDE6G, PRCD, PRPF3, PRPF4, PRPF6, PRPF8, RBP3, REEP6, SAMD11, SLC7A14, SNRNP200, SPP2, ZNF513, NEK2, NEK4, NXNL1, OFD1, RP1L1, RP22, RP29, RP32, RP63, RP9, RGR, POMGNT1, DHX38, ARL3, COL2A1, SLCO1B1, SLCO1B3, KCNJ13, TIMP3, ELOVL4, TFR2, FAH, HPD, MYO7A, CDH23, PCDH15, DFNB31, GPR98, USH1C, USH1G, CIB2, CLRN1, HARS, ABHD12, ADGRV1, ARSG, CEP250, IMPG1, IMPG2, VCAN, G6PC1, ATP7B, and any derivatives thereof.

[0283] As used herein, the terms "retrotransposons," "jumping genes," "jumping nucleic acids," and the like refer to cellular movable genetic elements dependent on reverse transcription. The retrotransposons are of non-replication competent cellular origin, and are capable of carrying a foreign nucleic acid sequence. The retrotransposons can act as parasites of retroviruses, retaining certain classical hallmarks, such as long terminal repeats (LTR), retroviral primer binding sites, and the like. However, the naturally occurring retrotransposons usually do not contain functional retroviral structural genes, which would normally be capable of recombining to yield replication competent viruses. Some retrotransposons are examples of so-called "selfish DNA", or genetic information, which encodes nothing except the ability to replicate itself. The retrotransposon may do so by utilizing the occasional presence of a retrovirus or a retrotransposase within the host cell, efficiently packaging itself within the viral particle, which transports it to the new host genome, where it is expressed again as RNA. The information encoded within that RNA is potentially transported with the jumping gene. A transposon can be a DNA transposon or a retrotransposon, the latter including an LTR retrotransposon or a non-LTR retrotransposon.

[0284] Non-long terminal repeat (LTR) retrotransposons are a type of mobile genetic element that is widespread in eukaryotic genomes. They include two classes: the apurinic/apyrimidinic endonuclease (APE)-type and the restriction enzyme-like endonuclease (RLE)-type. The APE class retrotransposons are comprised of two functional domains: an endonuclease/DNA binding domain, and a reverse transcriptase domain. The RLE class are comprised of three functional domains: a DNA binding domain, a reverse transcription domain, and an endonuclease domain. The reverse transcriptase domain of a non-LTR retrotransposon functions by binding an RNA sequence template and reverse transcribing it into the host genome's target DNA. The RNA sequence template has a 3' untranslated region which is specifically bound to the transposase, and a variable 5' region generally having Open Reading Frame(s) ("ORF") encoding transposase proteins. The RNA sequence template may also comprise a 5' untranslated region which specifically binds the retrotransposase. In some embodiments, non-LTR retrotransposons can include a LINE retrotransposon, such as L1, and a SINE retrotransposon, such as an Alu sequence. Other examples include, without limitation, R1, R2, R3, R4, and R5 retrotransposons (Moss, W. N. et al., RNA Biol. 2011, 8(5), 714-718; and Burke, W. D. et al., Molecular Biology and Evolution 2003, 20(8), 1260-1270). The transposon can be autonomous or non-autonomous.

[0285] LTR retrotransposons, which include retroviruses, make up a significant fraction of the typical mammalian genome, comprising about 8% of the human genome and 10% of the mouse genome. Lander et al., 2001, Nature 409, 860-921; Waterston et al., 2002, Nature 420, 520-562. LTR elements include retrotransposons, endogenous retroviruses (ERVs), and repeat elements with HERV origins, such as SINE-R. LTR retrotransposons include two LTR sequences that flank a region encoding two enzymes: integrase and retrotransposase.

[0286] ERVs include human endogenous retroviruses (HERVs), the remnants of ancient germ-cell infections. While most HERV proviruses have undergone extensive deletions and mutations, some have retained ORFs coding for functional proteins, including the glycosylated env protein. The env gene confers the potential for LTR elements to spread between cells and individuals. Indeed, all three open reading frames (pol, gag, and env) have been identified in humans, and evidence suggests that ERVs are active in the germline. See, e.g., Wang et al., 2010, Genome Res. 20, 19-27. Moreover, a few families, including the HERV-K (HML-2) group, have been shown to form viral particles, and an apparently intact provirus has recently been discovered in a small fraction of the human population. See, e.g., Bannert and Kurth, 2006, Proc. Natl. Acad. Sci. USA 101, 14572-14579.

[0287] LTR retrotransposons insert into new sites in the genome using the same steps of DNA cleavage and DNA strand-transfer observed in DNA transposons. In contrast to DNA transposons, however, recombination of LTR retrotransposons involves an RNA intermediate. LTR retrotransposons make up about 8% of the human genome. See, e.g., Lander et al., 2001, Nature 409, 860-921; Hua-Van et al., 2011, Biol. Dir. 6, 19.

Integration Site

[0288] The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering via the addition of an integration site into a target genome. The integration site is discussed in more detail below.

[0289] As used herein, the term "integration site" refers to the site within the target genome where one or more genes of interest or one or more nucleic acid sequences of interest are inserted. Examples of integration sites include, without limitation, a lox71 site (SEQ ID NO: 1), attB sites (SEQ ID NO: 3 and SEQ ID NO: 43), attP sites (SEQ ID NO: 4 and SEQ ID NO: 44), an attL site (SEQ ID NO: 67), an attR site (SEQ ID NO: 68), a Vox site (SEQ ID NO: 69), a FRT site (SEQ ID NO: 70), or a pseudo attP site (SEQ ID NO: 78). The integration site can be inserted into the genome or a fragment thereof of a cell using a nuclease, a gRNA, and/or an integration enzyme. The integration site can be inserted into the genome of a cell using a prime editor such as, without limitation, PE1, PE2, and PE3, wherein the integration site is carried on a pegRNA. The pegRNA can target any site that is known in the art. Examples of sites targeted by the pegRNA include, without limitation, ACTB, SUPT16H, SRRM2, NOLC1, DEPDC4, NES, LMNB1, AAVS1 locus, CC10, CFTR, SERPINA1, ABCA4, and any derivatives thereof. The complementary integration site may be operably linked to a gene of interest or nucleic acid sequence of interest in an exogenous DNA or RNA. In some embodiments, one integration site is added to a target genome. In some embodiments, more than one integration site is added to a target genome.

[0290] To insert multiple genes or nucleic acids of interest, two or more integration sites are added to a desired location. Multiple DNAs comprising nucleic acid sequences of interest are flanked by orthogonal integration sequences, such as, without limitation, attB and attP. An integration site is "orthogonal" when it does not significantly recognize the recognition site or nucleotide sequence of a recombinase. Thus, one attB site of a recombinase can be orthogonal to an attB site of a different recombinase. In addition, one pair of attB and attP sites of a recombinase can be orthogonal to another pair of attB and attP sites recognized by the same recombinase. A pair of recombinases are considered orthogonal to each other, as defined herein, when there is no recognition of each other's attB or attP site sequences.

[0291] The lack of recognition of integration sites or pairs of sites by the same recombinase or a different recombinase can be less than about 30%. In some embodiments, the lack of recognition of integration sites or pairs of sites by the same recombinase or a different recombinase can be less than about 30%, less than about 28%, less than about 26%, less than about 24%, less than about 22%, less than about 20%, less than about 18%, less than about 16%, less than about 14%, less than about 12%, less than about 10%, less than about 8%, less than about 6%, less than about 4%, less than about 2%, about 1%, or any range that is formed from any two of those values as endpoints. The crosstalk can be less than about 30%. In some embodiments, the crosstalk is less than about 30%, less than about 28%, less than about 26%, less than about 24%, less than about 22%, less than about 20%, less than about 18%, less than about 16%, less than about 14%, less than about 12%, less than about 10%, less than about 8%, less than about 6%, less than about 4%, less than about 2%, less than about 1%, or any range that is formed from any two of those values as endpoints.

[0292] In some embodiments, the attB and/or attP site sequences comprise a central dinucleotide sequence. It has been shown that, for example, the central dinucleotide can be changed to GA from GT and that only GA containing attB/attP sites interact and will not cross react with GT containing sequences. In some embodiments, the central dinucleotide is selected from the group consisting of AG, AC, TG, TC, CA, CT, GA, AA, TT, CC, GG, AT, TA, GC, CG and GT.

[0293] As used herein, the term "pair of an attB and attP site sequences" and the like refer to attB and attP site sequences that share the same central dinucleotide and can recombine. This means that in the presence of one serine integrase as many as six pairs of these orthogonal att sites can recombine (attPTT will specifically recombine with attBTT, attPTC will specifically recombine with attBTC, and so on).
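By way of non-limiting illustration, the pairing rule described above can be sketched in Python. The sketch assumes a single serine integrase and idealized behavior (no crosstalk); the names AttSite and can_recombine are hypothetical and do not appear in the disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttSite:
    """Hypothetical record for an attachment site of one serine integrase."""
    kind: str          # "attB" or "attP"
    dinucleotide: str  # central dinucleotide, e.g. "TT"


def can_recombine(a: AttSite, b: AttSite) -> bool:
    """Idealized rule: one attB plus one attP sharing the same central dinucleotide."""
    return {a.kind, b.kind} == {"attB", "attP"} and a.dinucleotide == b.dinucleotide


sites = [
    AttSite("attB", "TT"), AttSite("attP", "TT"),
    AttSite("attB", "TC"), AttSite("attP", "TC"),
]

# Enumerate every attB/attP combination and keep the productive ones.
productive = [
    (b, p)
    for b in sites if b.kind == "attB"
    for p in sites if p.kind == "attP"
    if can_recombine(b, p)
]

for b, p in productive:
    print(f"{b.kind}-{b.dinucleotide} recombines with {p.kind}-{p.dinucleotide}")
```

In this idealized model, attB-TT pairs only with attP-TT and attB-TC only with attP-TC; measured crosstalk between non-matching sites would have to be evaluated experimentally.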

[0294] In some embodiments, the central dinucleotide is nonpalindromic. In some embodiments, the central dinucleotide is palindromic. In some embodiments, a pair of an attB site sequence and an attP site sequence are used in different DNA encoding genes of interest or nucleic acid sequences of interest for inducing directional integration of two or more different nucleic acids.
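As a non-limiting sketch, the palindromic and nonpalindromic central dinucleotides can be enumerated in Python. A dinucleotide is treated here as palindromic when it equals its own reverse complement. Grouping each nonpalindromic dinucleotide with its reverse complement is an assumption made for illustration only; it is one possible way to arrive at six mutually distinct classes, consistent with the "as many as six pairs" noted above, and is not stated in this disclosure.

```python
from itertools import product

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


# All 16 possible central dinucleotides.
dinucleotides = ["".join(p) for p in product("ACGT", repeat=2)]

# Palindromic: identical to its own reverse complement (e.g. AT, GC).
palindromic = sorted(d for d in dinucleotides if d == reverse_complement(d))
nonpalindromic = [d for d in dinucleotides if d != reverse_complement(d)]

# Assumption: a dinucleotide and its reverse complement form one class.
classes = sorted({tuple(sorted((d, reverse_complement(d)))) for d in nonpalindromic})

print("palindromic:", palindromic)
print("nonpalindromic classes:", classes)
```

Under these assumptions the four palindromic dinucleotides are set aside, and the twelve nonpalindromic dinucleotides fall into six reverse-complement classes.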

[0295] Table 3 below shows examples of pairs of attB site sequences and attP site sequences with different central dinucleotides (CD).

TABLE 3

Pair   attB             attP             CD
1      SEQ ID NO: 5     SEQ ID NO: 6     TT
2      SEQ ID NO: 7     SEQ ID NO: 8     AA
3      SEQ ID NO: 9     SEQ ID NO: 10    CC
4      SEQ ID NO: 11    SEQ ID NO: 12    GG
5      SEQ ID NO: 13    SEQ ID NO: 14    TG
6      SEQ ID NO: 15    SEQ ID NO: 16    GT
7      SEQ ID NO: 17    SEQ ID NO: 18    CT
8      SEQ ID NO: 19    SEQ ID NO: 20    CA
9      SEQ ID NO: 21    SEQ ID NO: 22    TC
10     SEQ ID NO: 23    SEQ ID NO: 24    GA
11     SEQ ID NO: 25    SEQ ID NO: 26    AG
12     SEQ ID NO: 27    SEQ ID NO: 28    AC
13     SEQ ID NO: 29    SEQ ID NO: 30    AT
14     SEQ ID NO: 31    SEQ ID NO: 32    GC
15     SEQ ID NO: 33    SEQ ID NO: 34    CG
16     SEQ ID NO: 35    SEQ ID NO: 36    TA
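By way of non-limiting illustration, the attB and attP sequences of SEQ ID NOs: 5-36 appear to share constant flanking sequences and to differ only at the central dinucleotide. The following Python sketch rebuilds a pair from those flanks. The flank constants were read from the sequences listed in Table 4; the function name att_pair is hypothetical.

```python
# Constant flanks as read from SEQ ID NOs: 5-36 (Table 4).
ATTB_LEFT = "GGCTTGTCGACGACGGCG"
ATTB_RIGHT = "CTCCGTCGTCAGGATCAT"
ATTP_LEFT = "GTGGTTTGTCTGGTCAACCACCGCG"
ATTP_RIGHT = "CTCAGTGGTGTACGGTACAAACCCA"


def att_pair(central_dinucleotide: str) -> tuple[str, str]:
    """Return (attB, attP) sequences carrying the given central dinucleotide."""
    cd = central_dinucleotide.upper()
    if len(cd) != 2 or set(cd) - set("ACGT"):
        raise ValueError("central dinucleotide must be two bases from A, C, G, T")
    return ATTB_LEFT + cd + ATTB_RIGHT, ATTP_LEFT + cd + ATTP_RIGHT


attb_tt, attp_tt = att_pair("TT")
print("attB-TT:", attb_tt)  # compare with SEQ ID NO: 5
print("attP-TT:", attp_tt)  # compare with SEQ ID NO: 6
```

Each generated sequence can be compared against the corresponding entry of Table 4 before use; the listed sequences remain the authoritative reference.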

PASTE

[0296] The present disclosure provides non-naturally occurring or engineered systems, methods, and compositions for site-specific genetic engineering using PASTE. PASTE is discussed in more detail below.

[0297] The site-specific genetic engineering disclosed herein is for the insertion of one or more genes of interest or one or more nucleic acid sequences of interest into a genome of a cell. In some embodiments, the gene of interest is a mutated gene implicated in a genetic disease such as, without limitation, a metabolic disease, cystic fibrosis, muscular dystrophy, hemochromatosis, Tay-Sachs, Huntington disease, congenital deafness, sickle cell anemia, familial hypercholesterolemia, adenosine deaminase (ADA) deficiency, X-linked SCID (X-SCID), and Wiskott-Aldrich syndrome (WAS). In some embodiments, the gene of interest or nucleic acid sequence of interest can be a reporter gene upstream or downstream of a gene for genetic analyses such as, without limitation, for determining the expression of a gene. In some embodiments, the reporter gene is a GFP template (SEQ ID NO: 76) or a Gaussia Luciferase (G-Luciferase) template (SEQ ID NO: 77). In some embodiments, the gene of interest or nucleic acid sequence of interest can be used in plant genetics to insert genes to enhance drought tolerance, weather hardiness, and increased yield and herbicide resistance in plants. In some embodiments, the gene of interest or nucleic acid sequence of interest can be used for site-specific insertion of a protein (e.g., a lysosomal enzyme), a blood factor (e.g., Factor I, II, V, VII, X, XI, XII or XIII), a membrane protein, an exon, an intracellular protein (e.g., a cytoplasmic protein, a nuclear protein, an organellar protein such as a mitochondrial protein or lysosomal protein), an extracellular protein, a structural protein, a signaling protein, a regulatory protein, a transport protein, a sensory protein, a motor protein, a defense protein, a storage protein, or anti-inflammatory signaling molecules into cells for treatment of immune diseases, including but not limited to arthritis, psoriasis, lupus, coeliac disease, glomerulonephritis, hepatitis, and inflammatory bowel disease.

[0298] The size of the inserted gene or nucleic acid can vary from about 1 bp to about 50,000 bp. In some embodiments, the size of the inserted gene or nucleic acid can be about 1 bp, 10 bp, 50 bp, 100 bp, 150 bp, 200 bp, 250 bp, 300 bp, 350 bp, 400 bp, 600 bp, 800 bp, 1000 bp, 1200 bp, 1400 bp, 1600 bp, 1800 bp, 2000 bp, 2200 bp, 2400 bp, 2600 bp, 2800 bp, 3000 bp, 3200 bp, 3400 bp, 3600 bp, 3800 bp, 4000 bp, 4200 bp, 4400 bp, 4600 bp, 4800 bp, 5000 bp, 5200 bp, 5400 bp, 5600 bp, 5800 bp, 6000 bp, 6200 bp, 6400 bp, 6600 bp, 6800 bp, 7000 bp, 7200 bp, 7400 bp, 7600 bp, 7800 bp, 8000 bp, 8200 bp, 8400 bp, 8600 bp, 8800 bp, 9000 bp, 9200 bp, 9400 bp, 9600 bp, 9800 bp, 10,000 bp, 10,200 bp, 10,400 bp, 10,600 bp, 10,800 bp, 11,000 bp, 11,200 bp, 11,400 bp, 11,600 bp, 11,800 bp, 12,000 bp, 14,000 bp, 16,000 bp, 18,000 bp, 20,000 bp, 30,000 bp, 40,000 bp, 50,000 bp, or any range that is formed from any two of those values as endpoints.

[0299] In some embodiments, the site-specific engineering using the gene of interest or nucleic acid sequence of interest disclosed herein is for the engineering of T cells and NK cells for tumor targeting or allogeneic generation. This can involve the use of a receptor or CAR for tumor specificity, an anti-PD1 antibody, cytokines such as IFN-gamma, TNF-alpha, IL-15, IL-12, IL-18, IL-21, and IL-10, and immune escape genes.

[0300] In the present disclosure, the site-specific insertion of the gene of interest or nucleic acid of interest is performed through Programmable Addition via Site-Specific Targeting Elements (PASTE). Components for inserting a gene of interest or a nucleic acid of interest using PASTE include, without limitation, a nuclease, a gRNA adding the integration site, a DNA or RNA strand comprising the gene or nucleic acid linked to a sequence that is complementary or associated to the integration site, and an integration enzyme. Such components can also include, without limitation, a prime editor expression vector, a pegRNA adding the integration site, a nicking guide RNA, an integration enzyme (Cre or serine recombinase), and a transgene vector comprising the gene of interest or nucleic acid sequence of interest and an integration signal. The nuclease and prime editor integrate the integration site into the genome. The integration enzyme integrates the gene of interest into the integration site. In some embodiments, the transgene vector comprising the gene or nucleic acid sequence of interest and the integration signal is a DNA minicircle devoid of bacterial DNA sequences. In some embodiments, the transgene vector is a eukaryotic or prokaryotic vector.

[0301] As used herein, the term "vector" or "transgene vector" refers to a recombinant DNA molecule containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a host organism. Nucleic acid sequences necessary for expression in prokaryotes usually include, without limitation, a promoter, an operator (optional), a ribosome binding site, and/or other sequences. Eukaryotic cells are generally known to utilize promoters (constitutive, inducible or tissue specific), enhancers, and termination and polyadenylation signals, although some elements may be deleted and other elements added without sacrificing the necessary expression. The transgene vector may encode the PE and the integration enzyme, linked to each other via a linker. The linker can be a cleavable linker. For example, a transgene vector encoding the PE and the integration enzyme linked to each other via a linker is pCMV PE2 P2A Cre, which comprises SEQ ID NO: 73. In some embodiments, the linker can be a non-cleavable linker. In some embodiments, the nuclease, prime editor, and/or integration enzyme can be encoded in different vectors.

[0302] A method of inserting multiple genes or nucleic acid sequences of interest into a single site according to embodiments of the present disclosure is illustrated in FIG. 12. In some embodiments, multiplexing involves inserting multiple genes of interest in multiple loci using unique pegRNAs as illustrated in FIG. 13 (Merrick, C. A. et al., ACS Synth. Biol. 2018, 7, 299-310). The insertion of multiple genes of interest or nucleic acids of interest into a cell genome, referred to herein as "multiplexing," is facilitated by incorporation of the complementary 5' integration site to the 5' end of the DNA or RNA comprising the first nucleic acid and the 3' integration site to the 3' end of the DNA or RNA comprising the last nucleic acid. In some embodiments, the number of genes of interest or nucleic acid sequences of interest that are inserted into a cell genome using multiplexing can be about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or any range that is formed from any two of those values as endpoints.
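As a non-limiting sketch of multiplexing with orthogonal sites, the following Python function assigns one central dinucleotide to each cargo so that every cargo carries a site matching exactly one genomic landing site. The function name assign_sites, the cargo labels, and the choice of placing attB in the genome and attP on the cargo are assumptions made for illustration; the roles of the two sites can be swapped.

```python
def assign_sites(cargos: list[str], dinucleotides: list[str]) -> dict[str, dict[str, str]]:
    """Give each cargo its own central dinucleotide (hypothetical helper).

    Assumption: attB is written into the genome by the pegRNA and the
    matching attP is carried on the cargo.
    """
    # De-duplicate while preserving order, ignoring letter case.
    unique = list(dict.fromkeys(d.upper() for d in dinucleotides))
    if len(cargos) > len(unique):
        raise ValueError("not enough distinct central dinucleotides for all cargos")
    return {
        cargo: {"genomic_site": f"attB-{cd}", "cargo_site": f"attP-{cd}"}
        for cargo, cd in zip(cargos, unique)
    }


# Hypothetical cargo labels; the dinucleotides are taken from Table 3.
plan = assign_sites(["cargo_1", "cargo_2"], ["TT", "GA", "tt"])
for cargo, sites in plan.items():
    print(cargo, sites["genomic_site"], sites["cargo_site"])
```

Whether a given set of dinucleotides is sufficiently orthogonal for a particular integrase would need to be confirmed experimentally; the helper only prevents the same dinucleotide from being assigned twice.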

[0303] In some embodiments, multiplexing allows integration of, for example, a signaling cascade, over-expression of a protein of interest with its cofactor, insertion of multiple genes mutated in a neoplastic condition, or insertion of multiple CARs for treatment of cancer.

[0304] In some embodiments, the integration sites may be inserted into the genome using non-prime editing methods such as rAAV-mediated nucleic acid integration, TALENs, and ZFNs. A number of unique properties make AAV a promising vector for human gene therapy (Muzyczka, CURRENT TOPICS IN MICROBIOLOGY AND IMMUNOLOGY, 158:97-129 (1992)). Unlike other viral vectors, AAVs have not been shown to be associated with any known human disease and are generally not considered pathogenic. Wild type AAV is capable of integrating into host chromosomes in a site-specific manner (R. M. Kotin et al., PROC. NATL. ACAD. SCI, USA, 87:2211-2215 (1990); R. J. Samulski, EMBO 10(12):3941-3950 (1991)). Instead of creating a double-stranded DNA break, AAV stimulates endogenous homologous recombination to achieve the DNA modification. Further, transcription activator-like effector nucleases (TALENs) and zinc-finger nucleases (ZFNs) can be used for genome editing and for introducing targeted double-strand breaks (DSBs). The specificity of TALENs arises from two polymorphic amino acids, the so-called repeat variable diresidues (RVDs), located at positions 12 and 13 of a repeated unit. TALENs are linked to FokI nucleases, which cleave the DNA at the desired locations. ZFNs are artificial restriction enzymes for custom site-specific genome editing. Zinc fingers themselves are transcription factors, where each finger recognizes 3-4 bases. By mixing and matching these finger modules, researchers can customize which sequence to target.
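By way of non-limiting illustration, the modular recognition described above can be sketched in Python. The RVD-to-base assignments used here (NI for A, HD for C, NN for G, NG for T) are the commonly reported TALE code from the literature and are not recited in this disclosure; NN is also reported to tolerate A. The function names are hypothetical, and the zinc-finger estimate simply applies the figure of about three bases per finger.

```python
import math

# Commonly reported TALE repeat variable diresidue (RVD) code; not from this disclosure.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def rvds_for_target(target: str) -> list[str]:
    """List one RVD per base of the target DNA sequence (hypothetical helper)."""
    try:
        return [RVD_FOR_BASE[base] for base in target.upper()]
    except KeyError as exc:
        raise ValueError(f"unsupported base: {exc.args[0]}") from None


def zinc_fingers_needed(target_length: int, bases_per_finger: int = 3) -> int:
    """Estimate the number of finger modules, assuming a fixed number of bases per finger."""
    return math.ceil(target_length / bases_per_finger)


print(rvds_for_target("ACGT"))
print(zinc_fingers_needed(9))
```

Real TALEN and ZFN designs involve further constraints (for example, spacer length between the two FokI-linked monomers and context effects between neighboring fingers) that this sketch does not model.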

[0305] As used herein, the terms "administration," "introducing," or "delivery" into a cell, a tissue, or an organ of a plasmid, nucleic acids, or proteins for modification of the host genome refer to transport that can occur in vivo, in vitro, or ex vivo. Plasmids, DNA, or RNA for genetic modification can be introduced into cells by transfection, which is typically accomplished by chemical means (e.g., calcium phosphate transfection, polyethyleneimine (PEI), or lipofection) or physical means (e.g., electroporation or microinjection); by infection (this typically means the introduction of an infectious agent such as a virus (e.g., a baculovirus expressing the AAV Rep gene)); or by transduction (in microbiology, this refers to the stable infection of cells by viruses, or the transfer of genetic material from one microorganism to another by viral factors (e.g., bacteriophages)). Vectors for the expression of a recombinant polypeptide, protein, or oligonucleotide may be introduced into a cell, a tissue, an organ, or a subject by such means (e.g., calcium phosphate transfection, electroporation, microinjection, or lipofection). The vector can be delivered by preparing the vector in a pharmaceutically acceptable carrier for in vitro, ex vivo, or in vivo delivery.

[0306] As used herein, the term "transfection" refers to the uptake of an exogenous nucleic acid molecule by a cell. A cell is "transfected" when an exogenous nucleic acid has been introduced inside the cell membrane. The transfection can be a single transfection, co-transfection, or multiple transfection. Numerous transfection techniques are generally known in the art. See, for example, Graham et al. (1973) Virology, 52: 456. Such techniques can be used to introduce one or more exogenous nucleic acid molecules into a suitable host cell.

[0307] In some embodiments, the exogenous nucleic acid molecule and/or other components for gene editing are combined and delivered in a single transfection. In other embodiments, the exogenous nucleic acid molecule and/or other components for gene editing are not combined and delivered in a single transfection. In some embodiments, the exogenous nucleic acid molecule and/or other components for gene editing that are combined and delivered in a single transfection comprise, without limitation, a prime editing vector, a landing site such as a landing site-containing pegRNA, a nicking guide such as a nicking guide for stimulating prime editing, an expression vector such as an expression vector for a corresponding integrase or recombinase, a minicircle DNA cargo such as a minicircle DNA cargo encoding green fluorescent protein (GFP), any derivatives thereof, and any combinations thereof. In some embodiments, the gene of interest or nucleic acid sequence of interest can be introduced using liposomes. In some embodiments, the gene of interest or nucleic acid sequence of interest can be delivered using suitable vectors, for instance, without limitation, plasmids and viral vectors. Examples of viral vectors include, without limitation, adeno-associated viruses (AAV), lentiviruses, adenoviruses, other viral vectors, derivatives thereof, or combinations thereof. The proteins and one or more guide RNAs can be packaged into one or more vectors, e.g., plasmids or viral vectors. In some embodiments, the delivery is via nanoparticles or exosomes. For example, exosomes can be particularly useful in delivering RNA.

[0308] In some embodiments, the prime editing inserts the landing site with efficiencies of at least about 1%, at least about 5%, at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, or at least about 50%. In some embodiments, the prime editing inserts the landing site(s) with efficiencies of about 1%, about 2%, about 3%, about 4%, about 5%, about 6%, about 7%, about 8%, about 9%, about 10%, about 11%, about 12%, about 13%, about 14%, about 15%, about 16%, about 17%, about 18%, about 19%, about 20%, about 21%, about 22%, about 23%, about 24%, about 25%, about 26%, about 27%, about 28%, about 29%, about 30%, about 31%, about 32%, about 33%, about 34%, about 35%, about 36%, about 37%, about 38%, about 39%, about 40%, about 41%, about 42%, about 43%, about 44%, about 45%, about 46%, about 47%, about 48%, about 49%, about 50%, or any range that is formed from any two of those values as endpoints.

Sequences

[0309] Sequences of enzymes, guides, integration sites, and plasmids can be found in Table 4 below.

TABLE 4

SEQ ID NO  DESCRIPTION (SOURCE)  SEQUENCE
SEQ ID NO: 1  Lox71 (Artificial Sequence)  ATAACTTCGTATAATGTATGCTATACGAACGGTA
SEQ ID NO: 2  Lox66 (Artificial Sequence)  TACCGTTCGTATAATGTATGCTATACGAAGTTAT
SEQ ID NO: 3  attB (Artificial Sequence)  GGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGG
SEQ ID NO: 4  attP (Artificial Sequence)  CCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCC
SEQ ID NO: 5  attB-TT (Artificial Sequence)  GGCTTGTCGACGACGGCGTTCTCCGTCGTCAGGATCAT
SEQ ID NO: 6  attP-TT (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGTTCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 7  attB-AA (Artificial Sequence)  GGCTTGTCGACGACGGCGAACTCCGTCGTCAGGATCAT
SEQ ID NO: 8  attP-AA (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGAACTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 9  attB-CC (Artificial Sequence)  GGCTTGTCGACGACGGCGCCCTCCGTCGTCAGGATCAT
SEQ ID NO: 10  attP-CC (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGCCCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 11  attB-GG (Artificial Sequence)  GGCTTGTCGACGACGGCGGGCTCCGTCGTCAGGATCAT
SEQ ID NO: 12  attP-GG (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGGGCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 13  attB-TG (Artificial Sequence)  GGCTTGTCGACGACGGCGTGCTCCGTCGTCAGGATCAT
SEQ ID NO: 14  attP-TG (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGTGCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 15  attB-GT (Artificial Sequence)  GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT
SEQ ID NO: 16  attP-GT (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 17  attB-CT (Artificial Sequence)  GGCTTGTCGACGACGGCGCTCTCCGTCGTCAGGATCAT
SEQ ID NO: 18  attP-CT (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGCTCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 19  attB-CA (Artificial Sequence)  GGCTTGTCGACGACGGCGCACTCCGTCGTCAGGATCAT
SEQ ID NO: 20  attP-CA (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGCACTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 21  attB-TC (Artificial Sequence)  GGCTTGTCGACGACGGCGTCCTCCGTCGTCAGGATCAT
SEQ ID NO: 22  attP-TC (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGTCCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 23  attB-GA (Artificial Sequence)  GGCTTGTCGACGACGGCGGACTCCGTCGTCAGGATCAT
SEQ ID NO: 24  attP-GA (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGGACTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 25  attB-AG (Artificial Sequence)  GGCTTGTCGACGACGGCGAGCTCCGTCGTCAGGATCAT
SEQ ID NO: 26  attP-AG (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGAGCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 27  attB-AC (Artificial Sequence)  GGCTTGTCGACGACGGCGACCTCCGTCGTCAGGATCAT
SEQ ID NO: 28  attP-AC (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGACCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 29  attB-AT (Artificial Sequence)  GGCTTGTCGACGACGGCGATCTCCGTCGTCAGGATCAT
SEQ ID NO: 30  attP-AT (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGATCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 31  attB-GC (Artificial Sequence)  GGCTTGTCGACGACGGCGGCCTCCGTCGTCAGGATCAT
SEQ ID NO: 32  attP-GC (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGGCCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 33  attB-CG (Artificial Sequence)  GGCTTGTCGACGACGGCGCGCTCCGTCGTCAGGATCAT
SEQ ID NO: 34  attP-CG (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGCGCTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 35  attB-TA (Artificial Sequence)  GGCTTGTCGACGACGGCGTACTCCGTCGTCAGGATCAT
SEQ ID NO: 36  attP-TA (Artificial Sequence)  GTGGTTTGTCTGGTCAACCACCGCGTACTCAGTGGTGTACGGTACAAACCCA
SEQ ID NO: 37  C31-attB (Artificial Sequence)  TGCGGGTGCCAGGGCGTGCCCTTGGGCTCCCCGGGCGCGTACTCC
SEQ ID NO: 38  C31-attP (Artificial Sequence)  GTGCCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTTGGGGG
SEQ ID NO: 39  R4-attB (Artificial Sequence)  GCGCCCAAGTTGCCCATGACCATGCCGAAGCAGTGGTAGAAGGGCACCGGCAGACAC
SEQ ID NO: 40  R4-attP (Artificial Sequence)  AGGCATGTTCCCCAAAGCGATACCACTTGAAGCAGTGGTACTGCTTGTGGGTACACTCTGCGGGTGATGA
SEQ ID NO: 41  BT1-attB (Artificial Sequence)  GTCCTTGACCAGGTTTTTGACGAAAGTGATCCAGATGATCCAGCTCCACACCCCGAACGC
SEQ ID NO: 42  BT1-attP (Artificial Sequence)  GGTGCTGGGTTGTTGTCTCTGGACAGTGATCCATGGGAAACTACTCAGCACCACCAATGTTCC
SEQ ID NO: 43  Bxb-attB (Artificial Sequence)  TCGGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGGGC
SEQ ID NO: 44 Bxb-attP (Artificial Sequence) GTCGTGGTTTGTCTGGTCAACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCCGAC
SEQ ID NO: 45 TG1-attB (Artificial Sequence) GATCAGCTCCGCGGGCAAGACCTTCTCCTTCACGGGGTGGAAGGTC
SEQ ID NO: 46 TG1-attP (Artificial Sequence) TCAACCCCGTTCCAGCCCAACAGTGTTAGTCTTTGCTCTTACCCAGTTGGGCGGGATAGCCTGCCCG
SEQ ID NO: 47 C1-attB (Artificial Sequence) AACGATTTTCAAAGGATCACTGAATCAAAAGTATTGCTCATCCACGCGAAATTTTTC
SEQ ID NO: 48 C1-attP (Artificial Sequence) AATATTTTAGGTATATGATTTTGTTTATTAGTGTAAATAACACTATGTACCTAAAAT
SEQ ID NO: 49 C370-attB (Artificial Sequence) TGTAAAGGAGACTGATAATGGCATGTACAACTATACTCGTCGGTAAAAAGGCA
SEQ ID NO: 50 C370-attP (Artificial Sequence) TAAAAAAATACAGCGTTTTTCATGTACAACTATACTAGTTGTAGTGCCTAAA
SEQ ID NO: 51 K38-attB (Artificial Sequence) GAGCGCCGGATCAGGGAGTGGACGGCCTGGGAGCGCTACACGCTGTGGCTGCGGTC
SEQ ID NO: 52 K38-attP (Artificial Sequence) CCCTAATACGCAAGTCGATAACTCTCCTGGGAGCGTTGACAACTTGCGCACCCTGA
SEQ ID NO: 53 RB-attB (Artificial Sequence) TCTCGTGGTGGTGGAAGGTGTTGGTGCGGGGTTGGCCGTGGTCGAGGTGGGGTGGTGGTAGCCATTCG
SEQ ID NO: 54 RV-attP (Artificial Sequence) GCACAGGTGTAGTGTATCTCACAGGTCCACGGTTGGCCGTGGACTGCTGAAGAACATTCCACGCCAGGA
SEQ ID NO: 55 SPBC-attB (Artificial Sequence) AGTGCAGCATGTCATTAATATCAGTACAGATAAAGCTGTATCTCCTGTGAACACAATGGGTGCCA
SEQ ID NO: 56 SPBC-attP (Artificial Sequence) AAAGTAGTAAGTATCTTAAAAAACAGATAAAGCTGTATATTAAGATACTTACTAC
SEQ ID NO: 57 TP901-attB (Artificial Sequence) TGATAATTGCCAACACAATTAACATCTCAATCAAGGTAAATGCTTTTTCGTTTT
SEQ ID NO: 58 TP901-attP (Artificial Sequence) AATTGCGAGTTTTTATTTCGTTTATTTCAATTAAGGTAACTAAAAAACTCCTTT
SEQ ID NO: 59 Wβ-attB (Artificial Sequence) AAGGTAGCGTCAACGATAGGTGTAACTGTCGTGTTTGTAACGGTACTTCCAACAGCTGGCGTTTCAGT
SEQ ID NO: 60 Wβ-attP (Artificial Sequence) TAGTTTTAAAGTTGGTTATTAGTTACTGTGATATTTATCACGGTACCCAATAACCAATGAATATTTGA
SEQ ID NO: 61 A118-attB (Artificial Sequence) TGTAACTTTTTCGGATCAAGCTATGAAGGACGCAAAGAGGGAACTAAACACTTAATT

SEQ ID NO: 62 A118-attP (Artificial Sequence) TTGTTTAGTTCCTCGTTTTCTCTCGTTGGAAGAAGAAGAAACGAGAAACTAAAATTA
SEQ ID NO: 63 BL3-attB (Artificial Sequence) CAACCTGTTGACATGTTTCCACAGACAACTCACGTGGAGGTAGTCACGGCTTTTACGTTAGTT
SEQ ID NO: 64 BL3-attP (Artificial Sequence) GAGAATACTGTTGAACAATGAAAAACTAGGCATGTAGAAGTTGTTTGTGCACTAACTTTAA
SEQ ID NO: 65 MR11-attB (Artificial Sequence) ACAGGTCAACACATCGCAGTTATCGAACAATCTTCGAAAATGTATGGAGGCACTTGTATCAATATAGGATGTATACCTTCGAAGACACTTGTACATGATGGATTAGAAGGCAAATCCTTT
SEQ ID NO: 66 MR11-attP (Artificial Sequence) CAAAATAAAAAACATTGATTTTTATTAACTTCTTTTGTGCGGAACTACGAACAGTTCATTAATACGAAGTGTACAAACTTCCATACAAAAATAACCACGACAATTAAGACGTGGTTTCTA
SEQ ID NO: 67 attL (Artificial Sequence) ATTATTTCTCACCCTGA
SEQ ID NO: 68 attR (Artificial Sequence) ATCATCTCCCACCCGGA
SEQ ID NO: 69 Vox (Artificial Sequence) AATAGGTCTG AGAACGCCCA TTCTCAGACG TATT
SEQ ID NO: 70 FRT (Artificial Sequence) GAAGTTCCTATAC TTTCTAGA GAATAGGAACTTC
SEQ ID NO: 71 Cre recombinase expression plasmid (Artificial Sequence) GGTCGACATTGATTATTGACTAGTTATTAATAGTAATCAATTACGG GGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATAACT TACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCGCCC ATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATAGG GACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACTGCC CACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCCCCTA TTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCAGT ACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATT AGTCATCGCTATTACCATGGTCGAGGTGAGCCCCACGTTCTGCTTC ACTCTCCCCATCTCCCCCCCCTCCCCACCCCCAATTTTGTATTTATT TATTTTTTAATTATTTTGTGCAGCGATGGGGGCGGGGGGGGGGGG GGCGCGCGCCAGGCGGGGGGGGGGGGGGGGGGGGGGGGGGGGG GGGGGGGCGGGGGGGGGCGGCGGCAGCCAATCAGAGCGGCGCGC TCCGAAAGTTTCCTTTTATGGCGAGGCGGCGGCGGCGGCGGCCCT ATAAAAAGCGAAGCGCGCGGCGGGCGGGAGTCGCTGCGCGCTGC CTTCGCCCCGTGCCCCGCTCCGCCGCCGCCTCGCGCCGCCCGCCCC GGCTCTGACTGACCGCGTTACTCCCACAGGTGAGCGGGCGGGACG GCCCTTCTCCTCCGGGCTGTAATTAGCGCTTGGTTTAATGACGGCT TGTTTCTTTTCTGTGGCTGCGTGAAAGCCTTGAGGGGCTCCGGGAG GGCCCTTTGTGCGGGGGGAGCGGCTCGGGGGGTGCGTGCGTGTGT GTGTGCGTGGGGAGCGCCGCGTGCGGCTCCGCGCTGCCCGGCGGC
TGTGAGCGCTGCGGGCGCGGCGCGGGGCTTTGTGCGCTCCGCAGT GTGCGCGAGGGGAGCGCGGCCGGGGGCGGTGCCCCGCGGTGCGG GGGGGGCTGCGAGGGGAACAAAGGCTGCGTGCGGGGTGTGTGCG TGGGGGGGTGAGCAGGGGGTGTGGGCGCGTCGGTCGGGCTGCAA CCCCCCCTGCACCCCCCTCCCCGAGTTGCTGAGCACGGCCCGGCTT CGGGTGCGGGGCTCCGTACGGGGCGTGGCGCGGGGCTCGCCGTGC CGGGCGGGGGGTGGCGGCAGGTGGGGGTGCCGGGCGGGGCGGGG CCGCCTCGGGCCGGGGAGGGCTCGGGGGAGGGGCGCGGCGGCCC CCGGAGCGCCGGCGGCTGTCGAGGCGCGGCGAGCCGCAGCCATTG CCTTTTATGGTAATCGTGCGAGAGGGCGCAGGGACTTCCTTTGTCC CAAATCTGTGCGGAGCCGAAATCTGGGAGGCGCCGCCGCACCCCC TCTAGCGGGCGCGGGGCGAAGCGGTGCGGCGCCGGCAGGAAGGA AATGGGCGGGGAGGGCCTTCGTGCGTCGCCGCGCCGCCGTCCCCT TCTCCCTCTCCAGCCTCGGGGCTGTCCGCGGGGGGACGGCTGCCTT CGGGGGGGACGGGGCAGGGCGGGGTTCGGCTTCTGGCGTGTGACC GGCGGCTCTAGAGCCTCTGCTAACCATGTTCATGCCTTCTTCTTTTT CCTACAGCTCCTGGGCAACGTGCTGGTTATTGTGCTGTCTCATCAT TTTGGCAAAGAATTCTGAGCCGCCACCATGGCCAATTTACTGACC GTACACCAAAATTTGCCTGCATTACCGGTCGATGCAACGAGTGAT GAGGTTCGCAAGAACCTGATGGACATGTTCAGGGATCGCCAGGCG TTTTCTGAGCATACCTGGAAAATGCTTCTGTCCGTTTGCCGGTCGT GGGCGGCATGGTGCAAGTTGAATAACCGGAAATGGTTTCCCGCAG AACCTGAAGATGTTCGCGATTATCTTCTATATCTTCAGGCGCGCGG TCTGGCAGTAAAAACTATCCAGCAACATTTGGGCCAGCTAAACAT GCTTCATCGTCGGTCCGGGCTGCCACGACCAAGTGACAGCAATGC TGTTTCACTGGTTATGCGGCGGATCCGAAAAGAAAACGTTGATGC CGGTGAACGTGCAAAACAGGCTCTAGCGTTCGAACGCACTGATTT CGACCAGGTTCGTTCACTCATGGAAAATAGCGATCGCTGCCAGGA TATACGTAATCTGGCATTTCTGGGGATTGCTTATAACACCCTGTTA CGTATAGCCGAAATTGCCAGGATCAGGGTTAAAGATATCTCACGT ACTGACGGTGGGAGAATGTTAATCCATATTGGCAGAACGAAAACG CTGGTTAGCACCGCAGGTGTAGAGAAGGCACTTAGCCTGGGGGTA ACTAAACTGGTCGAGCGATGGATTTCCGTCTCTGGTGTAGCTGATG ATCCGAATAACTACCTGTTTTGCCGGGTCAGAAAAAATGGTGTTG CCGCGCCATCTGCCACCAGCCAGCTATCAACTCGCGCCCTGGAAG GGATTTTTGAAGCAACTCATCGATTGATTTACGGCGCTAAGGATG ACTCTGGTCAGAGATACCTGGCCTGGTCTGGACACAGTGCCCGTG TCGGAGCCGCGCGAGATATGGCCCGCGCTGGAGTTTCAATACCGG AGATCATGCAAGCTGGTGGCTGGACCAATGTAAATATTGTCATGA ACTATATCCGTAACCTGGATAGTGAAACAGGGGCAATGGTGCGCC TGCTGGAAGATGGCGATGGACCGGTGGAACAAAAACTTATTTCTG AAGAAGATCTGTGATAGCGGCCGCACTCCTCAGGTGCAGGCTGCC TATCAGAAGGTGGTGGCTGGTGTGGCCAATGCCCTGGCTCACAAA 
TACCACTGAGATCTTTTTCCCTCTGCCAAAAATTATGGGGACATCA TGAAGCCCCTTGAGCATCTGACTTCTGGCTAATAAAGGAAATTTAT TTTCATTGCAATAGTGTGTTGGAATTTTTTGTGTCTCTCACTCGGAA GGACATATGGGAGGGCAAATCATTTAAAACATCAGAATGAGTATT TGGTTTAGAGTTTGGCAACATATGCCCATATGCTGGCTGCCATGAA CAAAGGTTGGCTATAAAGAGGTCATCAGTATATGAAACAGCCCCC TGCTGTCCATTCCTTATTCCATAGAAAAGCCTTGACTTGAGGTTAG ATTTTTTTTATATTTTGTTTTGTGTTATTTTTTTCTTTAACATCCCTA AAATTTTCCTTACATGTTTTACTAGCCAGATTTTTCCTCCTCTCCTG ACTACTCCCAGTCATAGCTGTCCCTCTTCTCTTATGGAGATCCCTC GACCTGCAGCCCAAGCTTGGCGTAATCATGGTCATAGCTGTTTCCT GTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACGAGCC GGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAA CTCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAA ACCTGTCGTGCCAGCGGATCCGCATCTCAATTAGTCAGCAACCAT AGTCCCGCCCCTAACTCCGCCCATCCCGCCCCTAACTCCGCCCAGT TCCGCCCATTCTCCGCCCCATGGCTGACTAATTTTTTTTATTTATGC AGAGGCCGAGGCCGCCTCGGCCTCTGAGCTATTCCAGAAGTAGTG AGGAGGCTTTTTTGGAGGCCTAGGCTTTTGCAAAAAGCTAACTTGT TTATTGCAGCTTATAATGGTTACAAATAAAGCAATAGCATCACAA ATTTCACAAATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTG TCCAAACTCATCAATGTATCTTATCATGTCTGGATCCGCTGCATTA ATGAATCGGCCAACGCGCGGGGAGAGGCGGTTTGCGTATTGGGCG CTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGG CTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTA TCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAA AGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGG CGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCG ACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGAT ACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCC GACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGA AGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGG TGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGT TCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCC AACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGT AACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTC TTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTT GGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTT GGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGT TTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCT CAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGA ACGAAAACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAA 
GGATCTTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATC AATCTAAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATG CTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCA TCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGG AGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACC CACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCG GAAGGGCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCA TCCAGTCTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCC AGTTAATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTG GTGTCACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCC AACGATCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAG CGGTTAGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGC CGCAGTGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTT ACTGTCATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACT CAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCT CTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAA CTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAAC TCTCAAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCAC TCGTGCACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTT CTGGGTGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGA ATAAGGGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTC AATATTATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATA CATATTTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCG CACATTTCCCCGAAAAGTGCCACCTG SEQ ID NO: 72 AGCTCTGATCAAGAGACAGGATGAGGATCGTTTCGCATGATTGAA GFP-Lox66 Cre CAAGATGGATTGCACGCAGGTTCTCCGGCCGCTTGGGTGGAGAGG expression plasmid CTATTCGGCTATGACTGGGCACAACAGACAATCGGCTGCTCTGAT (Artificial Sequence) GCCGCCGTGTTCCGGCTGTCAGCGCAGGGGCGCCCGGTTCTTTTTG TCAAGACCGACCTGTCCGGTGCCCTGAATGAACTGCAAGACGAGG CAGCGCGGCTATCGTGGCTGGCCACGACGGGCGTTCCTTGCGCAG CTGTGCTCGACGTTGTCACTGAAGCGGGAAGGGACTGGCTGCTAT TGGGCGAAGTGCCGGGGCAGGATCTCCATGTCATCTACACCTTGC TCCTGCCGAGAAAGTATCCATCATGGCTGATGCAATGCGGCGGCT GCATACGCTTGATCCGGCTACCTGCCCATTCGACCACCAAGCGAA ACATCGCATCGAGCGAGCACGTACTCGGATGGAAGCCGGTCTTGT CGATCAGGATGATCTGGACGAAGAGCATCAGGGGCTCGCGCCAGC CGAACTGTTCGCCAGGCTCAAGGCGAGCATGCCCGACGGCGAGGA TCTCGTCGTGACCCATGGCGATGCCTGCTTGCCGAATATCATGGTG GAAAATGGCCGCTTTTCTGGATTCATCGACTGTGGCCGGCTGGGTG TGGCGGACCGCTATCAGGACATAGCGTTGGCTACCCGTGATATTG CTGAAGAGCTTGGCGGCGAATGGGCTGACCGCTTCCTCGTGCTTTA CGGTATCGCCGCTCCCGATTCGCAGCGCATCGCCTTCTATCGCCTT 
CTTGACGAGTTCTTCTGAATTATTAACTCGAGATCCACTAGAGTGT GGCGGCCGCATTCTTATAATCAGCATCATGATGTGGTACCACATCA TGATGCTGATTACCCCCAACTGAGAGAACTCAAAGGTTACCCCAG TTGGGGCGGGCCCACAAATAAAGCAATAGCATCACAAATTTCACA AATAAAGCATTTTTTTCACTGCATTCTAGTTGTGGTTTGTCCAAAC TCATCGAGCTCGAGATCTGGCGAAGGCGATGGGGGTCTTGAAGGC GTGCTGGTACTCCACGATGCCCAGCTCGGTGTTGCTGTGCAGCTCC TCCACGCGGCGGAAGGCGAACATGGGGCCCCCGTTCTGCAGGATG CTGGGGTGGATGGCGCTCTTGAAGTGCATGTGGCTGTCCACCACG AAGCTGTAGTAGCCGCCGTCGCGCAGGCTGAAGGTGCGGGCGAAG CTGCCCACCAGCACGTTATCGCCCATGGGGTGCAGGTGCTCCACG GTGGCGTTGCTGCGGATGATCTTGTCGGTGAAGATCACGCTGTCCT CGGGGAAGCCGGTGCCCACCACCTTGAAGTCGCCGATCACGCGGC CGGCCTCGTAGCGGTAGCTGAAGCTCACGTGCAGCACGCCGCCGT CCTCGTACTTCTCGATGCGGGTGTTGGTGTAGCCGCCGTTGTTGAT GGCGTGCAGGAAGGGGTTCTCGTAGCCGCTGGGGTAGGTGCCGAA GTGGTAGAAGCCGTAGCCCATCACGTGGCTCAGCAGGTAGGGGCT GAAGGTCAGGGCGCCTTTGGTGCTCTTCATCTTGTTGGTCATGCGG CCCTGCTCGGGGGTGCCCTCTCCGCCGCCCACCAGCTCGAACTCCA CGCCGTTCAGGGTGCCGGTGATGCGGCACTCGATCTTCATGGCGG GCATGGTGGCGACCGGTAGCGCTAGCGGCTTCGGATAACTTCGTA TAGCATACATTATACGAACGGTAAGCGCTACCGCCGGCATACCCA AGTGAAGTTGCTCGCAGCTTATAGTCGCGCCCGGGGAGCCCAAGG GCACGCCCTGGCACCGCGGCCGCTGAGTCTCGACCATCATCATCA TCATCATTGAGTTTATCTGGGATAACAGGGTAATGTCATCTAGGGA TAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTAGGGA TAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATCTAGG GATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTAGG GATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATCTA GGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATCTA GGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCATC TAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTATC TAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGTCA TCTAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATGTA TCTAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAATGT CATCTAGGGATAACAGGGTATGTCATCTGGGATAACAGGGTAATG TATCTAGGGATAACAGGGTAATGTCATCTGGGATAACAGGGTAAT GTCATCTAGGGATAACAGGGTAAATGTCATCTAGGGATAACAGGG TAATGTCATCTAGGGATAACAGGGTAATGTCATCTGGGATAACAG GGTAATGTCATCTAGGGATAACAGGGTAATGTATCGCCAGCGTCG CACAGCATGTTTGCTTGTCGCCGTCGCGTCTGTCACATCTTTTCCG CCAGCAGTTAGGGATTAGCGTCTTAAGCTGGCGCGAGGACCAACG TATCAGCCAGGCGAAGCTGCTTTTGAGCACCACCCGGATGCCTAT 
CGCCACCGTCGGTCGCAATGTTGGTTTTGACGATCAACTCTATTTC TCGCGGGTATTTAAAAAATGCACCGGGGCCAGCCCGAGCGAGTTC CGTGCCGGTTGTGAAGAAAAAGTGAATGATGTAGCCGTCAAGTTG TCATAATTGGTAACGAATCAGACAATTGACGGCTTGACGGAGTAG CATAGGGTTTGCAGAATCCCTGCTTCGTCCATTTGACAGGCACATT ATGCATGCCGCTTCGCCTTCGCGCGCGAATTGATCTGCTGCCTCGC GCGTTTCGGTGATGACGGTGAAAACCTCTGACACATGCAGCTCCC GGAGACGGTCACAGCTTGTCTGTAAGCGGATGCCGGGAGCAGACA AGCCCGTCAGGGCGCGTCAGCGGGTGTTGGCGGGTGTCGGGGCGC AGCCATGACCCAGTCACGTAGCGATAGCGGAGTGTATACTGGCTT AACTATGCGGCATCAGAGCAGATTGTACTGAGAGTGCACCATATG CGGTGTGAAATACCGCACAGATGCGTAAGGAGAAAATACCGCATC AGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCG TTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATAC GGTTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGA GCAAAAGGCCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTT GCTGGCGTTTTTCCATAGGCTCCGCCCCCCTGACGAGCATCACAAA AATCGACGCTCAAGTCAGAGGTGGCGAAACCCGACAGGACTATAA AGATACCAGGCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTG TTCCGACCCTGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCG GGAAGCGTGGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTT CGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCC CCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGA

GTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCCAC TGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGA GTTCTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGGACAGT ATTTGGTATCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGA GTTGGTAGCTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGT GGTTTTTTTGTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGA TCTCAAGAAGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGT GGAACGAAAACTCACGTTAAGGGATTTTGGTCATGAGCGGATACA TATTTGAATGTATTTAGAAAAATAAACAAAAGAGTTTGTAGAAAC GCAAAAAGGCCATCCGTCAGGATGGCCTTCTGCTTAATTTGATGCC TGGCAGTTTATGGCGGGCGTCCTGCCCGCCACCCTCCGGGCCGTTG CTTCGCAACGTTCAAATCCGCTCCCGGCGGATTTGTCCTACTCAGG AGAGCGTTCACCGACAAACAACAGATAAAACGAAAGGCCCAGTC TTTCGACTGAGCCTTTCGTTTTATTTGATGCCTGGCAGTTCCCTACT CTCGCATGGGGAGACCCCACACTACCATCGGCGCTACGGCGTTTC ACTTCTGAGTTCGGCATGGGGTCAGGTGGGACCACCGCGCTACTG CCGCCAGGCAAATTCTGTTTTATCAGACCGCTTCTGCGTTCTGATT TAATCTGTATCAGGCTGAAAATCTTCTCTCATCCGCCAAAACAGCC AAGCTGGAGACCGTTTGGCCCCCCTCGAGCACGTAGAAAGCCAGT CCGCAGAAACGGTGCTGACCCCGGATGAATGTCAGCTACTGGGCT ATCTGGACAAGGGAAAACGCAAGCGCAAAGAGAAAGCAGGTAGC TTGCAGTGGGCTTACATGGCGATAGCTAGACTGGGCGGTTTTATG GACAGCAAGCGAACCGGAATTGCCAGCTGGGGCGCCCTCTGGTAA GGTTGGGAAGCCCTGCAAAGTAAACTGGATGGCTTTCTCGCCGCC AAGGATCTGATGGCGCAGGGGATCA SEQ ID NO: 73 ACGCGTTGACATTGATTATTGACTAGTTATTAATAGTAATCAATTA pCMV PE2 P2A Cre CGGGGTCATTAGTTCATAGCCCATATATGGAGTTCCGCGTTACATA plasmid ACTTACGGTAAATGGCCCGCCTGGCTGACCGCCCAACGACCCCCG (Artificial Sequence) CCCATTGACGTCAATAATGACGTATGTTCCCATAGTAACGCCAATA GGGACTTTCCATTGACGTCAATGGGTGGAGTATTTACGGTAAACT GCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACGCCC CCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCC AGTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGT ATTAGTCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATC AATGGGCGTGGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTC CACCCCATTGACGTCAATGGGAGTTTGTTTTGGCACCAAAATCAAC GGGACTTTCCAAAATGTCGTAACAACTCCGCCCCATTGACGCAAA TGGGCGGTAGGCGTGTACGGTGGGAGGTCTATATAAGCAGAGCTG GTTTAGTGAACCGTCAGATCCGCTAGAGATCCGCGGCCGCTAATA CGACTCACTATAGGGAGAGCCGCCACCATGAAACGGACAGCCGAC GGAAGCGAGTTCGAGTCACCAAAGAAGAAGCGGAAAGTCGACAA GAAGTACAGCATCGGCCTGGACATCGGCACCAACTCTGTGGGCTG 
GGCCGTGATCACCGACGAGTACAAGGTGCCCAGCAAGAAATTCAA GGTGCTGGGCAACACCGACCGGCACAGCATCAAGAAGAACCTGAT CGGAGCCCTGCTGTTCGACAGCGGCGAAACAGCCGAGGCCACCCG GCTGAAGAGAACCGCCAGAAGAAGATACACCAGACGGAAGAACC GGATCTGCTATCTGCAAGAGATCTTCAGCAACGAGATGGCCAAGG TGGACGACAGCTTCTTCCACAGACTGGAAGAGTCCTTCCTGGTGG AAGAGGATAAGAAGCACGAGCGGCACCCCATCTTCGGCAACATCG TGGACGAGGTGGCCTACCACGAGAAGTACCCCACCATCTACCACC TGAGAAAGAAACTGGTGGACAGCACCGACAAGGCCGACCTGCGG CTGATCTATCTGGCCCTGGCCCACATGATCAAGTTCCGGGGCCACT TCCTGATCGAGGGCGACCTGAACCCCGACAACAGCGACGTGGACA AGCTGTTCATCCAGCTGGTGCAGACCTACAACCAGCTGTTCGAGG AAAACCCCATCAACGCCAGCGGCGTGGACGCCAAGGCCATCCTGT CTGCCAGACTGAGCAAGAGCAGACGGCTGGAAAATCTGATCGCCC AGCTGCCCGGCGAGAAGAAGAATGGCCTGTTCGGAAACCTGATTG CCCTGAGCCTGGGCCTGACCCCCAACTTCAAGAGCAACTTCGACC TGGCCGAGGATGCCAAACTGCAGCTGAGCAAGGACACCTACGACG ACGACCTGGACAACCTGCTGGCCCAGATCGGCGACCAGTACGCCG ACCTGTTTCTGGCCGCCAAGAACCTGTCCGACGCCATCCTGCTGAG CGACATCCTGAGAGTGAACACCGAGATCACCAAGGCCCCCCTGAG CGCCTCTATGATCAAGAGATACGACGAGCACCACCAGGACCTGAC CCTGCTGAAAGCTCTCGTGCGGCAGCAGCTGCCTGAGAAGTACAA AGAGATTTTCTTCGACCAGAGCAAGAACGGCTACGCCGGCTACAT TGACGGCGGAGCCAGCCAGGAAGAGTTCTACAAGTTCATCAAGCC CATCCTGGAAAAGATGGACGGCACCGAGGAACTGCTCGTGAAGCT GAACAGAGAGGACCTGCTGCGGAAGCAGCGGACCTTCGACAACG GCAGCATCCCCCACCAGATCCACCTGGGAGAGCTGCACGCCATTC TGCGGCGGCAGGAAGATTTTTACCCATTCCTGAAGGACAACCGGG AAAAGATCGAGAAGATCCTGACCTTCCGCATCCCCTACTACGTGG GCCCTCTGGCCAGGGGAAACAGCAGATTCGCCTGGATGACCAGAA AGAGCGAGGAAACCATCACCCCCTGGAACTTCGAGGAAGTGGTGG ACAAGGGCGCTTCCGCCCAGAGCTTCATCGAGCGGATGACCAACT TCGATAAGAACCTGCCCAACGAGAAGGTGCTGCCCAAGCACAGCC TGCTGTACGAGTACTTCACCGTGTATAACGAGCTGACCAAAGTGA AATACGTGACCGAGGGAATGAGAAAGCCCGCCTTCCTGAGCGGCG AGCAGAAAAAGGCCATCGTGGACCTGCTGTTCAAGACCAACCGGA AAGTGACCGTGAAGCAGCTGAAAGAGGACTACTTCAAGAAAATC GAGTGCTTCGACTCCGTGGAAATCTCCGGCGTGGAAGATCGGTTC AACGCCTCCCTGGGCACATACCACGATCTGCTGAAAATTATCAAG GACAAGGACTTCCTGGACAATGAGGAAAACGAGGACATTCTGGA AGATATCGTGCTGACCCTGACACTGTTTGAGGACAGAGAGATGAT CGAGGAACGGCTGAAAACCTATGCCCACCTGTTCGACGACAAAGT GATGAAGCAGCTGAAGCGGCGGAGATACACCGGCTGGGGCAGGC 
TGAGCCGGAAGCTGATCAACGGCATCCGGGACAAGCAGTCCGGCA AGACAATCCTGGATTTCCTGAAGTCCGACGGCTTCGCCAACAGAA ACTTCATGCAGCTGATCCACGACGACAGCCTGACCTTTAAAGAGG ACATCCAGAAAGCCCAGGTGTCCGGCCAGGGCGATAGCCTGCACG AGCACATTGCCAATCTGGCCGGCAGCCCCGCCATTAAGAAGGGCA TCCTGCAGACAGTGAAGGTGGTGGACGAGCTCGTGAAAGTGATGG GCCGGCACAAGCCCGAGAACATCGTGATCGAAATGGCCAGAGAG AACCAGACCACCCAGAAGGGACAGAAGAACAGCCGCGAGAGAAT GAAGCGGATCGAAGAGGGCATCAAAGAGCTGGGCAGCCAGATCC TGAAAGAACACCCCGTGGAAAACACCCAGCTGCAGAACGAGAAG CTGTACCTGTACTACCTGCAGAATGGGCGGGATATGTACGTGGAC CAGGAACTGGACATCAACCGGCTGTCCGACTACGATGTGGACGCT ATCGTGCCTCAGAGCTTTCTGAAGGACGACTCCATCGACAACAAG GTGCTGACCAGAAGCGACAAGAACCGGGGCAAGAGCGACAACGT GCCCTCCGAAGAGGTCGTGAAGAAGATGAAGAACTACTGGCGGC AGCTGCTGAACGCCAAGCTGATTACCCAGAGAAAGTTCGACAATC TGACCAAGGCCGAGAGAGGCGGCCTGAGCGAACTGGATAAGGCC GGCTTCATCAAGAGACAGCTGGTGGAAACCCGGCAGATCACAAAG CACGTGGCACAGATCCTGGACTCCCGGATGAACACTAAGTACGAC GAGAATGACAAGCTGATCCGGGAAGTGAAAGTGATCACCCTGAA GTCCAAGCTGGTGTCCGATTTCCGGAAGGATTTCCAGTTTTACAAA GTGCGCGAGATCAACAACTACCACCACGCCCACGACGCCTACCTG AACGCCGTCGTGGGAACCGCCCTGATCAAAAAGTACCCTAAGCTG GAAAGCGAGTTCGTGTACGGCGACTACAAGGTGTACGACGTGCGG AAGATGATCGCCAAGAGCGAGCAGGAAATCGGCAAGGCTACCGC CAAGTACTTCTTCTACAGCAACATCATGAACTTTTTCAAGACCGAG ATTACCCTGGCCAACGGCGAGATCCGGAAGCGGCCTCTGATCGAG ACAAACGGCGAAACCGGGGAGATCGTGTGGGATAAGGGCCGGGA TTTTGCCACCGTGCGGAAAGTGCTGAGCATGCCCCAAGTGAATAT CGTGAAAAAGACCGAGGTGCAGACAGGCGGCTTCAGCAAAGAGT CTATCCTGCCCAAGAGGAACAGCGATAAGCTGATCGCCAGAAAGA AGGACTGGGACCCTAAGAAGTACGGCGGCTTCGACAGCCCCACCG TGGCCTATTCTGTGCTGGTGGTGGCCAAAGTGGAAAAGGGCAAGT CCAAGAAACTGAAGAGTGTGAAAGAGCTGCTGGGGATCACCATCA TGGAAAGAAGCAGCTTCGAGAAGAATCCCATCGACTTTCTGGAAG CCAAGGGCTACAAAGAAGTGAAAAAGGACCTGATCATCAAGCTG CCTAAGTACTCCCTGTTCGAGCTGGAAAACGGCCGGAAGAGAATG CTGGCCTCTGCCGGCGAACTGCAGAAGGGAAACGAACTGGCCCTG CCCTCCAAATATGTGAACTTCCTGTACCTGGCCAGCCACTATGAGA AGCTGAAGGGCTCCCCCGAGGATAATGAGCAGAAACAGCTGTTTG TGGAACAGCACAAGCACTACCTGGACGAGATCATCGAGCAGATCA GCGAGTTCTCCAAGAGAGTGATCCTGGCCGACGCTAATCTGGACA AAGTGCTGTCCGCCTACAACAAGCACCGGGATAAGCCCATCAGAG 
AGCAGGCCGAGAATATCATCCACCTGTTTACCCTGACCAATCTGG GAGCCCCTGCCGCCTTCAAGTACTTTGACACCACCATCGACCGGA AGAGGTACACCAGCACCAAAGAGGTGCTGGACGCCACCCTGATCC ACCAGAGCATCACCGGCCTGTACGAGACACGGATCGACCTGTCTC AGCTGGGAGGTGACTCTGGAGGATCTAGCGGAGGATCCTCTGGCA GCGAGACACCAGGAACAAGCGAGTCAGCAACACCAGAGAGCAGT GGCGGCAGCAGCGGCGGCAGCAGCACCCTAAATATAGAAGATGA GTATCGGCTACATGAGACCTCAAAAGAGCCAGATGTTTCTCTAGG GTCCACATGGCTGTCTGATTTTCCTCAGGCCTGGGCGGAAACCGG GGGCATGGGACTGGCAGTTCGCCAAGCTCCTCTGATCATACCTCTG AAAGCAACCTCTACCCCCGTGTCCATAAAACAATACCCCATGTCA CAAGAAGCCAGACTGGGGATCAAGCCCCACATACAGAGACTGTTG GACCAGGGAATACTGGTACCCTGCCAGTCCCCCTGGAACACGCCC CTGCTACCCGTTAAGAAACCAGGGACTAATGATTATAGGCCTGTC CAGGATCTGAGAGAAGTCAACAAGCGGGTGGAAGACATCCACCC CACCGTGCCCAACCCTTACAACCTCTTGAGCGGGCTCCCACCGTCC CACCAGTGGTACACTGTGCTTGATTTAAAGGATGCCTTTTTCTGCC TGAGACTCCACCCCACCAGTCAGCCTCTCTTCGCCTTTGAGTGGAG AGATCCAGAGATGGGAATCTCAGGACAATTGACCTGGACCAGACT CCCACAGGGTTTCAAAAACAGTCCCACCCTGTTTAATGAGGCACT GCACAGAGACCTAGCAGACTTCCGGATCCAGCACCCAGACTTGAT CCTGCTACAGTACGTGGATGACTTACTGCTGGCCGCCACTTCTGAG CTAGACTGCCAACAAGGTACTCGGGCCCTGTTACAAACCCTAGGG AACCTCGGGTATCGOGCCTCGGCCAAGAAAGCCCAAATTTGCCAG AAACAGGTCAAGTATCTGGGGTATCTTCTAAAAGAGGGTCAGAGA TGGCTGACTGAGGCCAGAAAAGAGACTGTGATGGGGCAGCCTACT CCGAAGACCCCTCGACAACTAAGGGAGTTCCTAGGGAAGGCAGGC TTCTGTCGCCTCTTCATCCCTGGGTTTGCAGAAATGGCAGCCCCCC TGTACCCTCTCACCAAACCGGGGACTCTGTTTAATTGGGGCCCAGA CCAACAAAAGGCCTATCAAGAAATCAAGCAAGCTCTTCTAACTGC CCCAGCCCTGGGGTTGCCAGATTTGACTAAGCCCTTTGAACTCTTT GTCGACGAGAAGCAGGGCTACGCCAAAGGTGTCCTAACGCAAAA ACTGGGACCTTGGCGTCGGCCGGTGGCCTACCTGTCCAAAAAGCT AGACCCAGTAGCAGCTGGGTGGCCCCCTTGCCTACGGATGGTAGC AGCCATTGCCGTACTGACAAAGGATGCAGGCAAGCTAACCATGGG ACAGCCACTAGTCATTCTGGCCCCCCATGCAGTAGAGGCACTAGT CAAACAACCCCCCGACCGCTGGCTTTCCAACGCCCGGATGACTCA CTATCAGGCCTTGCTTTTGGACACGGACCGGGTCCAGTTCGGACCG GTGGTAGCCCTGAACCCGGCTACGCTGCTCCCACTGCCTGAGGAA GGGCTGCAACACAACTGCCTTGATATCCTGGCCGAAGCCCACGGA ACCCGACCCGACCTAACGGACCAGCCGCTCCCAGACGCCGACCAC ACCTGGTACACGGATGGAAGCAGTCTCTTACAAGAGGGACAGCGT AAGGCGGGAGCTGCGGTGACCACCGAGACCGAGGTAATCTGGGCT 
AAAGCCCTGCCAGCCGGGACATCCGCTCAGCGGGCTGAACTGATA GCACTCACCCAGGCCCTAAAGATGGCAGAAGGTAAGAAGCTAAAT GTTTATACTGATAGCCGTTATGCTTTTGCTACTGCCCATATCCATG GAGAAATATACAGAAGGCGTGGGTGGCTCACATCAGAAGGCAAA GAGATCAAAAATAAAGACGAGATCTTGGCCCTACTAAAAGCCCTC TTTCTGCCCAAAAGACTTAGCATAATCCATTGTCCAGGACATCAAA AGGGACACAGCGCCGAGGCTAGAGGCAACCGGATGGCTGACCAA GCGGCCCGAAAGGCAGCCATCACAGAGACTCCAGACACCTCTACC CTCCTCATAGAAAATTCATCACCCTCTGGCGGCTCAAAAAGAACC GCCGACGGCAGCGAATTCGAGCCCAAGAAGAAGAGGAAAGTCGG AAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGCGACGT GGAGGAGAACCCTGGACCTAATTTACTGACCGTACACCAAAATTT GCCTGCATTACCGGTCGATGCAACGAGTGATGAGGTTCGCAAGAA CCTGATGGACATGTTCAGGGATCGCCAGGCGTTTTCTGAGCATACC TGGAAAATGCTTCTGTCCGTTTGCCGGTCGTGGGCGGCATGGTGCA AGTTGAATAACCGGAAATGGTTTCCCGCAGAACCTGAAGATGTTC GCGATTATCTTCTATATCTTCAGGCGCGCGGTCTGGCAGTAAAAAC TATCCAGCAACATTTGGGCCAGCTAAACATGCTTCATCGTCGGTCC GGGCTGCCACGACCAAGTGACAGCAATGCTGTTTCACTGGTTATG CGGCGGATCCGAAAAGAAAACGTTGATGCCGGTGAACGTGCAAA ACAGGCTCTAGCGTTCGAACGCACTGATTTCGACCAGGTTCGTTCA CTCATGGAAAATAGCGATCGCTGCCAGGATATACGTAATCTGGCA TTTCTGGGGATTGCTTATAACACCCTGTTACGTATAGCCGAAATTG CCAGGATCAGGGTTAAAGATATCTCACGTACTGACGGTGGGAGAA TGTTAATCCATATTGGCAGAACGAAAACGCTGGTTAGCACCGCAG GTGTAGAGAAGGCACTTAGCCTGGGGGTAACTAAACTGGTCGAGC GATGGATTTCCGTCTCTGGTGTAGCTGATGATCCGAATAACTACCT GTTTTGCCGGGTCAGAAAAAATGGTGTTGCCGCGCCATCTGCCAC CAGCCAGCTATCAACTCGCGCCCTGGAAGGGATTTTTGAAGCAAC TCATCGATTGATTTACGGCGCTAAGGATGACTCTGGTCAGAGATA CCTGGCCTGGTCTGGACACAGTGCCCGTGTCGGAGCCGCGCGAGA TATGGCCCGCGCTGGAGTTTCAATACCGGAGATCATGCAAGCTGG TGGCTGGACCAATGTAAATATTGTCATGAACTATATCCGTAACCTG GATAGTGAAACAGGGGCAATGGTGCGCCTGCTGGAAGATGGCGAT TAATTTAAACCCGCTGATCAGCCTCGACTGTGCCTTCTAGTTGCCA GCCATCTGTTGTTTGCCCCTCCCCCGTGCCTTCCTTGACCCTGGAA GGTGCCACTCCCACTGTCCTTTCCTAATAAAATGAGAAAATTGCAT CGCATTGTCTGAGTAGGTGTCATTCTATTCTGGGGGGTGGGGTGGG GCAGGACAGCAAGGGGGAGGATTGGGAAGACAATAGCAGGCATG CTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCGGAAAGAACCA GCTGGGGCTCGATACCGTCGACCTCTAGCTAGAGCTTGGCGTAAT CATGGTCATAGCTGTTTCCTGTGTGAAATTGTTATCCGCTCACAAT TCCACACAACATACGAGCCGGAAGCATAAAGTGTAAAGCCTAGGG 
TGCCTAATGAGTGAGCTAACTCACATTAATTGCGTTGCGCTCACTG CCCGCTTTCCAGTCGGGAAACCTGTCGTGCCAGCTGCATTAATGAA tcggccaacgcgcggggagaggcggtttgcgtattgggcgctctt CCGCTTCCTCGCTCACTGACTCGCTGCGCTCGGTCGTTCGGCTGCG GCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGGTTATCCAC AGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGGCC AGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTT TCCATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCT CAAGTCAGAGGTGGCGAAACCCGACAGGACTATAAAGATACCAG GCGTTTCCCCCTGGAAGCTCCCTCGTGCGCTCTCCTGTTCCGACCC TGCCGCTTACCGGATACCTGTCCGCCTTTCTCCCTTCGGGAAGCGT GGCGCTTTCTCATAGCTCACGCTGTAGGTATCTCAGTTCGGTGTAG GTCGTTCGCTCCAAGCTGGGCTGTGTGCACGAACCCCCCGTTCAGC CCGACCGCTGCGCCTTATCCGGTAACTATCGTCTTGAGTCCAACCC GGTAAGACACGACTTATCGCCACTGGCAGCAGCCACTGGTAACAG GATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTTCTTGAA GTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTAT CTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAG CTCTTGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTT GTTTGCAAGCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGA AGATCCTTTGATCTTTTCTACGGGGTCTGACGCTCAGTGGAACGAA AACTCACGTTAAGGGATTTTGGTCATGAGATTATCAAAAAGGATC TTCACCTAGATCCTTTTAAATTAAAAATGAAGTTTTAAATCAATCT AAAGTATATATGAGTAAACTTGGTCTGACAGTTACCAATGCTTAAT CAGTGAGGCACCTATCTCAGCGATCTGTCTATTTCGTTCATCCATA GTTGCCTGACTCCCCGTCGTGTAGATAACTACGATACGGGAGGGC TTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACCCACGC TCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGG GCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGT CTATTAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTA ATAGTTTGCGCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTC ACGCTCGTCGTTTGGTATGGCTTCATTCAGCTCCGGTTCCCAACGA TCAAGGCGAGTTACATGATCCCCCATGTTGTGCAAAAAAGCGGTT AGCTCCTTCGGTCCTCCGATCGTTGTCAGAAGTAAGTTGGCCGCAG TGTTATCACTCATGGTTATGGCAGCACTGCATAATTCTCTTACTGT CATGCCATCCGTAAGATGCTTTTCTGTGACTGGTGAGTACTCAACC

AAGTCATTCTGAGAATAGTGTATGCGGCGACCGAGTTGCTCTTGCC CGGCGTCAATACGGGATAATACCGCGCCACATAGCAGAACTTTAA AAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTCAA GGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGC ACCCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGG TGAGCAAAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAG GGCGACACGGAAATGTTGAATACTCATACTCTTCCTTTTTCAATAT TATTGAAGCATTTATCAGGGTTATTGTCTCATGAGCGGATACATAT TTGAATGTATTTAGAAAAATAAACAAATAGGGGTTCCGCGCACAT TTCCCCGAAAAGTGCCACCTGACGTCGACGGATCGGGAGATCGAT CTCCCGATCCCCTAGGGTCGACTCTCAGTACAATCTGCTCTGATGC CGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGTGTTGGAGGTC GCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGCAAGGC TTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTTG CGCTGCTTCGCGATGTACGGGCCAGATAT SEQ ID NO: 74 GTCAACCAGTATCCCGGTGC +90 ngRNA guide sequence (Artificial Sequence) SEQ ID NO: 75 GTCAACCAGTATCCCGGTGCGTTTTAGAGCTAGAAATAGCAAGTT +90 ngRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGC SEQ ID NO: 76 TGATCCCCTGCGCCATCAGATCCTTGGCGGCGAGAAAGCCATCCA GFP minicircle GTTTACTTTGCAGGGCTTCCCAACCTTACCAGAGGGCGCCCCAGCT template (before GGCAATTCCGGTTCGCTTGCTGTCCATAAAACCGCCCAGTCTAGCT cleavage into a ATCGCCATGTAAGCCCACTGCAAGCTACCTGCTTTCTCTTTGCGCT minicircle) TGCGTTTTCCCTTGTCCAGATAGCCCAGTAGCTGACATTCATCCGG (Artificial Sequence) GGTCAGCACCGTTTCTGCGGACTGGCTTTCTACGTGCTCGAGGGGG GCCAAACGGTCTCCAGCTTGGCTGTTTTGGCGGATGAGAGAAGAT TTTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGAT AAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGA CCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAG TGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAA TAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCT GTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGG GAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGG GCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAG GCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTGTT TATTTTTCTAAATACATTCAAATATGTATCCGCTCATGACCAAAAT CCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAA AAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCT GCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTT TGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTT 
CAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTC GCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGT CGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGG CGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCT TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGC TATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGG TATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGA GCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTT CGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGG GGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGT TCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTA TCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTG ATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGA GCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGC ATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAAT CTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGC TACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCG CTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAG ACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTC ACCGTCATCACCGAAACGCGCGAGGCAGCAGATCAATTCGCGCGC GAAGGCGAAGCGGCATGCATAATGTGCCTGTCAAATGGACGAAGC AGGGATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATTGTC TGATTCGTTACCAATTATGACAACTTGACGGCTACATCATTCACTT TTTCTTCACAACCGGCACGGAACTCGCTCGGGCTGGCCCCGGTGC ATTTTTTAAATACCCGCGAGAAATAGAGTTGATCGTCAAAACCAA CATTGCGACCGACGGTGGCGATAGGCATCCGGGTGGTGCTCAAAA GCAGCTTCGCCTGGCTGATACGTTGGTCCTCGCGCCAGCTTAAGAC GCTAATCCCTAACTGCTGGCGGAAAAGATGTGACAGACGCGACGG CGACAAGCAAACATGCTGTGCGACGCTGGCGATACATTACCCTGT TATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTACCCTG TTATCCCTAGATGACATTACCCTGTTATCCCTAGATGACATTTACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC 
CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATAAACTCAA TGATGATGATGATGATGGTCGAGACTCAGCGGCCGCGGTGCCAGG GCGTGCCCTTGGGCTCCCCGGGCGCGACTATAAGCTGCGAGCAAC TTCACTTGGGTATGCCGGCGGTAGCGCTTACCGTTCGTATAATGTA TGCTATACGAAGTTATCCGAAGCCGCTAGCGGTGGTTTGTCTGGTC AACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCAGCTACCGG TCGCCACCATGCCCGCCATGAAGATCGAGTGCCGCATCACCGGCA CCCTGAACGGCGTGGAGTTCGAGCTGGTGGGCGGCGGAGAGGGC ACCCCCGAGCAGGGCCGCATGACCAACAAGATGAAGAGCACCAA AGGCGCCCTGACCTTCAGCCCCTACCTGCTGAGCCACGTGATGGG CTACGGCTTCTACCACTTCGGCACCTACCCCAGCGGCTACGAGAA CCCCTTCCTGCACGCCATCAACAACGGCGGCTACACCAACACCCG CATCGAGAAGTACGAGGACGGCGGCGTGCTGCACGTGAGCTTCAG CTACCGCTACGAGGCCGGCCGCGTGATCGGCGACTTCAAGGTGGT GGGCACCGGCTTCCCCGAGGACAGCGTGATCTTCACCGACAAGAT CATCCGCAGCAACGCCACCGTGGAGCACCTGCACCCCATGGGCGA TAACGTGCTGGTGGGCAGCTTCGCCCGCACCTTCAGCCTGCGCGA CGGCGGCTACTACAGCTTCGTGGTGGACAGCCACATGCACTTCAA GAGCGCCATCCACCCCAGCATCCTGCAGAACGGGGGCCCCATGTT CGCCTTCCGCCGCGTGGAGGAGCTGCACAGCAACACCGAGCTGGG CATCGTGGAGTACCAGCACGCCTTCAAGACCCCCATCGCCTTCGCC AGATCTCGAGCTCGATGAGTTTGGACAAACCACAACTAGAATGCA GTGAAAAAAATGCTTTATTTGTGAAATTTGTGATGCTATTGCTTTA TTTGTGGGCCCGCCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTT GGGGGTAATCAGCATCATGATGTGGTACCACATCATGATGCTGAT TATAAGAATGCGGCCGCCACACTCTAGTGGATCTCGAGTTAATAA TTCAGAAGAACTCGTCAAGAAGGCGATAGAAGGCGATGCGCTGCG AATCGGGAGCGGCGATACCGTAAAGCACGAGGAAGCGGTCAGCC CATTCGCCGCCAAGCTCTTCAGCAATATCACGGGTAGCCAACGCT ATGTCCTGATAGCGGTCCGCCACACCCAGCCGGCCACAGTCGATG AATCCAGAAAAGCGGCCATTTTCCACCATGATATTCGGCAAGCAG GCATCGCCATGGGTCACGACGAGATCCTCGCCGTCGGGCATGCTC GCCTTGAGCCTGGCGAACAGTTCGGCTGGCGCGAGCCCCTGATGC TCTTCGTCCAGATCATCCTGATCGACAAGACCGGCTTCCATCCGAG TACGTGCTCGCTCGATGCGATGTTTCGCTTGGTGGTCGAATGGGCA GGTAGCCGGATCAAGCGTATGCAGCCGCCGCATTGCATCAGCCAT GATGGATACTTTCTCGGCAGGAGCAAGGTGTAGATGACATGGAGA TCCTGCCCCGGCACTTCGCCCAATAGCAGCCAGTCCCTTCCCGCTT CAGTGACAACGTCGAGCACAGCTGCGCAAGGAACGCCCGTCGTGG CCAGCCACGATAGCCGCGCTGCCTCGTCTTGCAGTTCATTCAGGGC ACCGGACAGGTCGGTCTTGACAAAAAGAACCGGGCGCCCCTGCGC 
TGACAGCCGGAACACGGCGGCATCAGAGCAGCCGATTGTCTGTTG TGCCCAGTCATAGCCGAATAGCCTCTCCACCCAAGCGGCCGGAGA ACCTGCGTGCAATCCATCTTGTTCAATCATGCGAAACGATCCTCAT CCTGTCTCTTGATCAGAGCT SEQ ID NO: 77 TGATCCCCTGCGCCATCAGATCCTTGGCGGCGAGAAAGCCATCCA Gaussia Luciferase GTTTACTTTGCAGGGCTTCCCAACCTTACCAGAGGGCGCCCCAGCT minicircle template GGCAATTCCGGTTCGCTTGCTGTCCATAAAACCGCCCAGTCTAGCT (Artificial Sequence) ATCGCCATGTAAGCCCACTGCAAGCTACCTGCTTTCTCTTTGCGCT TGCGTTTTCCCTTGTCCAGATAGCCCAGTAGCTGACATTCATCCGG GGTCAGCACCGTTTCTGCGGACTGGCTTTCTACGTGCTCGAGGGGG GCCAAACGGTCTCCAGCTTGGCTGTTTTGGCGGATGAGAGAAGAT TTTCAGCCTGATACAGATTAAATCAGAACGCAGAAGCGGTCTGAT AAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGA CCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAG TGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGGCATCAAA TAAAACGAAAGGCTCAGTCGAAAGACTGGGCCTTTCGTTTTATCT GTTGTTTGTCGGTGAACGCTCTCCTGAGTAGGACAAATCCGCCGG GAGCGGATTTGAACGTTGCGAAGCAACGGCCCGGAGGGTGGCGG GCAGGACGCCCGCCATAAACTGCCAGGCATCAAATTAAGCAGAAG GCCATCCTGACGGATGGCCTTTTTGCGTTTCTACAAACTCTTTTGTT TATTTTTCTAAATACATTCAAATATGTATCCGCTCATGACCAAAAT CCCTTAACGTGAGTTTTCGTTCCACTGAGCGTCAGACCCCGTAGAA AAGATCAAAGGATCTTCTTGAGATCCTTTTTTTCTGCGCGTAATCT GCTGCTTGCAAACAAAAAAACCACCGCTACCAGCGGTGGTTTGTT TGCCGGATCAAGAGCTACCAACTCTTTTTCCGAAGGTAACTGGCTT CAGCAGAGCGCAGATACCAAATACTGTCCTTCTAGTGTAGCCGTA GTTAGGCCACCACTTCAAGAACTCTGTAGCACCGCCTACATACCTC GCTCTGCTAATCCTGTTACCAGTGGCTGCTGCCAGTGGCGATAAGT CGTGTCTTACCGGGTTGGACTCAAGACGATAGTTACCGGATAAGG CGCAGCGGTCGGGCTGAACGGGGGGTTCGTGCACACAGCCCAGCT TGGAGCGAACGACCTACACCGAACTGAGATACCTACAGCGTGAGC TATGAGAAAGCGCCACGCTTCCCGAAGGGAGAAAGGCGGACAGG TATCCGGTAAGCGGCAGGGTCGGAACAGGAGAGCGCACGAGGGA GCTTCCAGGGGGAAACGCCTGGTATCTTTATAGTCCTGTCGGGTTT CGCCACCTCTGACTTGAGCGTCGATTTTTGTGATGCTCGTCAGGGG GGCGGAGCCTATGGAAAAACGCCAGCAACGCGGCCTTTTTACGGT TCCTGGCCTTTTGCTGGCCTTTTGCTCACATGTTCTTTCCTGCGTTA TCCCCTGATTCTGTGGATAACCGTATTACCGCCTTTGAGTGAGCTG ATACCGCTCGCCGCAGCCGAACGACCGAGCGCAGCGAGTCAGTGA GCGAGGAAGCGGAAGAGCGCCTGATGCGGTATTTTCTCCTTACGC ATCTGTGCGGTATTTCACACCGCATATGGTGCACTCTCAGTACAAT CTGCTCTGATGCCGCATAGTTAAGCCAGTATACACTCCGCTATCGC 
TACGTGACTGGGTCATGGCTGCGCCCCGACACCCGCCAACACCCG CTGACGCGCCCTGACGGGCTTGTCTGCTCCCGGCATCCGCTTACAG ACAAGCTGTGACCGTCTCCGGGAGCTGCATGTGTCAGAGGTTTTC ACCGTCATCACCGAAACGCGCGAGGCAGCAGATCAATTCGCGCGC GAAGGCGAAGCGGCATGCATAATGTGCCTGTCAAATGGACGAAGC AGGGATTCTGCAAACCCTATGCTACTCCGTCAAGCCGTCAATTGTC TGATTCGTTACCAATTATGACAACTTGACGGCTACATCATTCACTT TTTCTTCACAACCGGCACGGAACTCGCTCGGGCTGGCCCCGGTGC ATTTTTTAAATACCCGCGAGAAATAGAGTTGATCGTCAAAACCAA CATTGCGACCGACGGTGGCGATAGGCATCCGGGTGGTGCTCAAAA GCAGCTTCGCCTGGCTGATACGTTGGTCCTCGCGCCAGCTTAAGAC GCTAATCCCTAACTGCTGGCGGAAAAGATGTGACAGACGCGACGG CGACAAGCAAACATGCTGTGCGACGCTGGCGATACATTACCCTGT TATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTACCCTG TTATCCCTAGATGACATTACCCTGTTATCCCTAGATGACATTTACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATGACATTAC CCTGTTATCCCTAGATACATTACCCTGTTATCCCAGATGACATACC CTGTTATCCCTAGATGACATTACCCTGTTATCCCAGATAAACTCAA TGATGATGATGATGATGGTCGAGACTCAGCGGCCGCGGTGCCAGG GCGTGCCCTTGGGCTCCCCGGGCGCGACTATAAGCTGCGAGCAAC TTCACTTGGGTATGCCGGCGGTAGCGCTTACCGTTCGTATAATGTA TGCTATACGAAGTTATCCGAAGCCGCTAGCGGTGGTTTGTCTGGTC AACCACCGCGGTCTCAGTGGTGTACGGTACAAACCCACTACCGGT CGCCACCATGGGAGTCAAAGTTCTGTTTGCCCTGATCTGCATCGCT GTGGCCGAGGCCAAGCCCACCGAGAACAACGAAGACTTCAACATC GTGGCCGTGGCCAGCAACTTCGCGACCACGGATCTCGATGCTGAC CGCGGGAAGTTGCCCGGCAAGAAGCTGCCGCTGGAGGTGCTCAAA GAGATGGAAGCCAATGCCCGGAAAGCTGGCTGCACCAGGGGCTGT CTGATCTGCCTGTCCCACATCAAGTGCACGCCCAAGATGAAGAAG TTCATCCCAGGACGCTGCCACACCTACGAAGGCGACAAAGAGTCC GCACAGGGCGGCATAGGCGAGGCGATCGTCGACATTCCTGAGATT CCTGGGTTCAAGGACTTGGAGCCCATGGAGCAGTTCATCGCACAG GTCGATCTGTGTGTGGACTGCACAACTGGCTGCCTCAAAGGGCTT 
GCCAACGTGCAGTGTTCTGACCTGCTCAAGAAGTGGCTGCCGCAA CGCTGTGCGACCTTTGCCAGCAAGATCCAGGGCCAGGTGGACAAG ATCAAGGGGGCCGGTGGTGACTAAGCGGAGCTCGATGAGTTTGGA CAAACCACAACTAGAATGCAGTGAAAAAAATGCTTTATTTGTGAA ATTTGTGATGCTATTGCTTTATTTGTGGGCCCGCCCCAACTGGGGT AACCTTTGAGTTCTCTCAGTTGGGGGTAATCAGCATCATGATGTGG TACCACATCATGATGCTGATTATAAGAATGCGGCCGCCACACTCT AGTGGATCTCGAGTTAATAATTCAGAAGAACTCGTCAAGAAGGCG ATAGAAGGCGATGCGCTGCGAATCGGGAGCGGCGATACCGTAAA GCACGAGGAAGCGGTCAGCCCATTCGCCGCCAAGCTCTTCAGCAA TATCACGGGTAGCCAACGCTATGTCCTGATAGCGGTCCGCCACAC CCAGCCGGCCACAGTCGATGAATCCAGAAAAGCGGCCATTTTCCA CCATGATATTCGGCAAGCAGGCATCGCCATGGGTCACGACGAGAT CCTCGCCGTCGGGCATGCTCGCCTTGAGCCTGGCGAACAGTTCGG CTGGCGCGAGCCCCTGATGCTCTTCGTCCAGATCATCCTGATCGAC AAGACCGGCTTCCATCCGAGTACGTGCTCGCTCGATGCGATGTTTC GCTTGGTGGTCGAATGGGCAGGTAGCCGGATCAAGCGTATGCAGC CGCCGCATTGCATCAGCCATGATGGATACTTTCTCGGCAGGAGCA AGGTGTAGATGACATGGAGATCCTGCCCCGGCACTTCGCCCAATA GCAGCCAGTCCCTTCCCGCTTCAGTGACAACGTCGAGCACAGCTG CGCAAGGAACGCCCGTCGTGGCCAGCCACGATAGCCGCGCTGCCT CGTCTTGCAGTTCATTCAGGGCACCGGACAGGTCGGTCTTGACAA AAAGAACCGGGCGCCCCTGCGCTGACAGCCGGAACACGGCGGCA TCAGAGCAGCCGATTGTCTGTTGTGCCCAGTCATAGCCGAATAGC CTCTCCACCCAAGCGGCCGGAGAACCTGCGTGCAATCCATCTTGTT CAATCATGCGAAACGATCCTCATCCTGTCTCTTGATCAGAGCT SEQ ID NO: 78 CCCCAACTGGGGTAACCTTTGAGTTCTCTCAGTTGGGG pseudo attP site (Artificial sequence) SEQ ID NO: 79 GACTGAAACTTCACAGAATAGTTTTAGAGCTAGAAATAGCAAGTT Albumin-pegRNA- AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT SERPIN CGGTGCTTGGGATAGTTATGAATTCAATCTTCAACCCTATCCGGAT

(Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCTGT GAAGTTTCAGTCA SEQ ID NO: 80 GACTGAAACTTCACAGAATAGTTTTAGAGCTAGAAATAGCAAGTT Albumin-pegRNA- AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT CPS1 CGGTGCTTGGGATAGTTATGAATTCAATCTTCAACCCTATCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCTGT GAAGTTTC SEQ ID NO: 81 GGCCCAGACTGAGCACGTGAGTTTTAGAGCTAGAAATAGCAAGTT 34 bp lox71 pegRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGCTGGAGGAAGCAGGGCTTCCTTTCCTCTGCCATCATACCGT TCGTATAGCATACATTATACGAAGTTATCGTGCTCAGTCTG SEQ ID NO: 82 GGCCCAGACTGAGCACGTGAGTTTTAGAGCTAGAAATAGCAAGTT 34 bp lox66 pegRNA AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT (Artificial Sequence) CGGTGCTGGAGGAAGCAGGGCTTCCTTTCCTCTGCCATCAATAACT TCGTATAGCATACATTATACGAACGGTACGTGCTCAGTCTG SEQ ID NO: 83 GGCCCAGACTGAGCACGTGA gRNA (Artificial Sequence) SEQ ID NO: 84 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 46 GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC (original length) TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA pegRNA GAA (Artificial Sequence) SEQ ID NO: 85 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT TP901-1 minimal GGCACAATTAACATCTCAATCAAGGTAAATGCTTGAGCTGCGAG attB f pegRNA AA (Artificial Sequence) SEQ ID NO: 86 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT TP901-1 minimal GGAGCATTTACCTTGATTGAGATGTTAATTGTGTGAGCTGCGAGA attB rc pegRNA A (Artificial Sequence) SEQ ID NO: 87 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS_13_RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT PhiBT1 minimal GGCAGGTTTTTGACGAAAGTGATCCAGATGATCCAGTGAGCTGC attB f pegRNA GAGAA (Artificial Sequence) SEQ ID 
NO: 88 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS 13 RT_29_with TCGGTGCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCAT PhiBT1 minimal GGCTGGATCATCTGGATCACTTTCGTCAAAAACCTGTGAGCTGCG attB rc pegRNA AGAA (Artificial Sequence) SEQ ID NO: 89 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA Nicking guide 1 + 48 GTCGGTGC guide (Artificial Sequence) SEQ ID NO: 90 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA PBS_18_RT_16_with_ GTCGGTGCATATCATCATCCATGGTACCGTTCGTATAGCATACAT Lox71_Cre TATACGAAGTTATTGAGCTGCGAGAATAGCC pegRNA (Artificial Sequence) SEQ ID NO: 91 GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGT ACTB N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA PBS_13_RT_29_with_ GTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGTACCGTT Lox71_Cre CGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAA pegRNA (Artificial Sequence) SEQ ID NO: 92 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 34 pegRNA GGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGC TGCGAGAA SEQ ID NO: 93 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 26 pegRNA GGTGCGAGCGCGGCGATATCATCATCCATGGCCGGATGATCCTGA (Artificial Sequence) CGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 94 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 23 pegRNA GGTGCCGCGGCGATATCATCATCCATGGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 95 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 20 pegRNA GGTGCGGCGATATCATCATCCATGGCCGGATGATCCTGACGACGG (Artificial Sequence) AGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 96 
GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 16 pegRNA GGTGCATATCATCATCCATGGCCGGATGATCCTGACGACGGAGAC (Artificial Sequence) CGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA SEQ ID NO: 97 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 34 pegRNA GGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGCCGGAT (Artificial Sequence) GATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGC TGCGAGAATAGCC SEQ ID NO: 98 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 29 pegRNA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC (Artificial Sequence) TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA GAATAGCC SEQ ID NO: 99 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 18 RT 16 pegRNA GGTGCATATCATCATCCATGGCCGGATGATCCTGACGACGGAGAC (Artificial Sequence) CGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAATAGCC SEQ ID NO: 100 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 39 pegRNA TCGGTGCCTGCCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCA (Artificial Sequence) TGCCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCC GGCCCGGGCGGCGGAGA SEQ ID NO: 101 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 34 pegRNA TCGGTGCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCG (Artificial Sequence) GATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCC GGGCGGCGGAGA SEQ ID NO: 102 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 29 pegRNA TCGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGA (Artificial Sequence) TCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCG GCGGAGA SEQ ID NO: 103 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 24 pegRNA TCGGTGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATCCTG (Artificial 
Sequence) ACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGA GA SEQ ID NO: 104 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 19 pegRNA TCGGTGCGGGGGTCGCAGTCGCCATGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGAGA SEQ ID NO: 105 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 39 pegRNA TCGGTGCCTGCCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCA (Artificial Sequence) TGCCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCC GGCCCGGGCGGCGGAGACAGCG SEQ ID NO: 106 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 34 pegRNA TCGGTGCCATCCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCG (Artificial Sequence) GATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCC GGGCGGCGGAGACAGCG SEQ ID NO: 107 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATC (Artificial Sequence) CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCG GAGACAGCG SEQ ID NO: 108 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 24 pegRNA TCGGTGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATCCTG (Artificial Sequence) ACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGA GACAGCG SEQ ID NO: 109 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 18 RT 19 pegRNA TCGGTGCGGGGGTCGCAGTCGCCATGCCGGATGATCCTGACGAC (Artificial Sequence) GGAGACCGCCGTCGTCGACAAGCCGGCCCGGGCGGCGGAGACAG CG SEQ ID NO: 110 GCGTGGTGGGGCCGCCAGCGGTTTTAGAGCTAGAAATAGCAAGT LMNB1 N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA Nicking guide 1 + 46 GTCGGTGC (Artificial Sequence) SEQ ID NO: 111 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 42 GGTGCGACGAGCGCGGCGATATCATCATCCATGGGGATGATCCTG pegRNA 
ACGACGGAGACCGCCGTCGTCGACAAGCCGGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 112 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 40 GGTGCGACGAGCGCGGCGATATCATCATCCATGGGATGATCCTGA pegRNA CGACGGAGACCGCCGTCGTCGACAAGCCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 113 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 38 GGTGCGACGAGCGCGGCGATATCATCATCCATGGATGATCCTGAC pegRNA GACGGAGACCGCCGTCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 114 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 36 GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGATCCTGACG pegRNA ACGGAGACCGCCGTCGTCGACAAGCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 115 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 44 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCGGATGATCC pegRNA v2 TGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCGGGCGGCGG (Artificial Sequence) AGA SEQ ID NO: 116 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 42 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGGGATGATCCT pegRNA v2 GACGACGGAGACCGCCGTCGTCGACAAGCCGGCGGGCGGCGGAG (Artificial Sequence) A SEQ ID NO: 117 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 40 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGGATGATCCTG pegRNA v2 ACGACGGAGACCGCCGTCGTCGACAAGCCGCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 118 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 38 CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGATGATCCTGA pegRNA v2 CGACGGAGACCGCCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 119 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS 
TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 18 RT 29 attB 46 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATG pegRNA ATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCCTC (Artificial Sequence) CAGGCAATACGCG SEQ ID NO: 120 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 13 RT 29 attB 46 CGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATGATC pegRNA CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTCCTCCAGG (Artificial Sequence) CAAT SEQ ID NO: 121 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 44 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCGGATGA pegRNA TCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCTCCTCCA (Artificial Sequence) GGCAAT

SEQ ID NO: 122 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 42 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCGGATGAT pegRNA CCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGTCCTCCAGG (Artificial Sequence) CAAT SEQ ID NO: 123 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGT NOLC1 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 29 attB 40 GTCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCGATGATC pegRNA CTGACGACGGAGACCGCCGTCGTCGACAAGCCGTCCTCCAGGCA (Artificial Sequence) AT SEQ ID NO: 124 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 29 attB 38 TCGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCATGATCCT pegRNA GACGACGGAGACCGCCGTCGTCGACAAGCCTCCTCCAGGCAAT (Artificial Sequence) SEQ ID NO: 125 GAGCCGAGCACGAGGGGATACGTTTTAGAGCTAGAAATAGCAAGT NOLC1 nicking TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG guide-43 TCGGTGC (Artificial Sequence) SEQ ID NO: 126 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 20 attB 38 GGTGCGGCGATATCATCATCCATGGATGATCCTGACGACGGAGAC pegRNA CGCCGTCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 127 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 15 attB 38 GGTGCTATCATCATCCATGGATGATCCTGACGACGGAGACCGCCG pegRNA TCGTCGACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 128 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 10 attB 38 GGTGCTCATCCATGGATGATCCTGACGACGGAGACCGCCGTCGTC pegRNA GACAAGCCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 129 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT ACTB N-term PBS 9 AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG RT 20 attB 38 TCGGTGCGGCGATATCATCATCCATGGATGATCCTGACGACGGAG pegRNA ACCGCCGTCGTCGACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 130 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS 9 
AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC RT 15 attB 38 GGTGCTATCATCATCCATGGATGATCCTGACGACGGAGACCGCCG pegRNA TCGTCGACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 131 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS 9 AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC RT 10 attB 38 GGTGCTCATCCATGGATGATCCTGACGACGGAGACCGCCGTCGTC pegRNA GACAAGCCTGAGCTGCG (Artificial Sequence) SEQ ID NO: 132 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 20 attB 38 TCGGTGCCGGGGGTCGCAGTCGCCATGATGATCCTGACGACGGA pegRNA GACCGCCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 133 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 15 attB 38 TCGGTGCGTCGCAGTCGCCATGATGATCCTGACGACGGAGACCG pegRNA CCGTCGTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 134 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 13 RT 10 attB 38 TCGGTGCAGTCGCCATGATGATCCTGACGACGGAGACCGCCGTC pegRNA GTCGACAAGCCCGGGCGGCGGAGA (Artificial Sequence) SEQ ID NO: 135 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 9 RT 20 attB 38 CGGTGCCGGGGGTCGCAGTCGCCATGATGATCCTGACGACGGAGA pegRNA CCGCCGTCGTCGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 136 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG 9 RT 15 attB 38 TCGGTGCGTCGCAGTCGCCATGATGATCCTGACGACGGAGACCG pegRNA CCGTCGTCGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 137 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 9 RT 10 attB 38 CGGTGCAGTCGCCATGATGATCCTGACGACGGAGACCGCCGTCGT pegRNA CGACAAGCCCGGGCGGCG (Artificial Sequence) SEQ ID NO: 138 GAGAAGCGGCGTCCGGGGCTAGTTTTAGAGCTAGAAATAGCAAGT SUPT16H N-term TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG PBS 13 RT 24 Bxb1- 
TCGGTGCTCTTTGTCCAGAGTCACAGCCATACCGGATGATCCTGAC GT_Initial length GACGGAGACCGCCGTCGTCGACAAGCCGGCCCCCCGGACGCCGC (Artificial Sequence) SEQ ID NO: 139 GGGCACGGGGCCATGTACAAGTTTTAGAGCTAGAAATAGCAAGT SRRM2 N-term PBS TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGA 13 RT 24 Bxb1 GTCGGTGCGGCGTCGGCAGCCCGATCCCGTTGCCGGATGATCCT Initial length GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTACATGGCCC (Artificial Sequence) CGT SEQ ID NO: 140 GTGTCAGGTGGGGCGGGGCTAGTTTTAGAGCTAGAAATAGCAAG DEPDC4 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG PBS 18 RT 24 Bxb1 AGTCGGTGCGCTGGCTCCTCCCCTGGCACCATACCGGATGATCCT Initial length GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCCCCCGCCCCA (Artificial Sequence) CCTGACAC SEQ ID NO: 141 GAGTGGGTCAGACGAGCAGGAGTTTTAGAGCTAGAAATAGCAAGT NES N-term PBS 13 TAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG RT 29 Bxb1 Initial TCGGTGCGATGGAGGGCTGCATGGGGGAGGAGTCGCCGGATGATC length CTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGCTCGTCT (Artificial Sequence) GACC SEQ ID NO: 142 GCAGCCACCCGCTCTCGGCCCGTTTTAGAGCTAGAAATAGCAAG SUPT16H nicking TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG guide-53 AGTCGGTGC (Artificial Sequence) SEQ ID NO: 143 GTGTAGTCAGGCCGCTCACCCGTTTTAGAGCTAGAAATAGCAAG SRRM2 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG nicking guide 1 + 87 AGTCGGTGC (Artificial Sequence) SEQ ID NO: 144 GCTGACAAGTCTACGGAACCTGTTTTAGAGCTAGAAATAGCAAG DEPDC4 N-term TTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCG Nicking guide 1 + 59 AGTCGGTGC (Artificial Sequence) SEQ ID NO: 145 GCTCCTCCAGCGCCTTGACCGTTTTAGAGCTAGAAATAGCAAGTTA NES N-term Nicking AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC guide 2 + 9 GGTGC (Artificial Sequence) SEQ ID NO: 146 GCTATTCTCGCAGCTCACCA HITI_ACTB_guide (Artificial Sequence) SEQ ID NO: 147 AGAAGCGGCGTCCGGGGCTA HITI_SUPT16H_guide (Artificial Sequence) SEQ ID NO: 148 GGGCACGGGGCCATGTACAA HITI_SRRM2_guide (Artificial Sequence) SEQ ID NO: 149 GCGTATTGCCTGGAGGATGG HITI_NOLC1_guide (Artificial Sequence) SEQ ID NO: 150 TGTCAGGTGGGGCGGGGCTA HITI_DEPDC4_guide (Artificial Sequence) SEQ ID 
NO: 151 AGTGGGTCAGACGAGCAGGA HITI_NES_guide (Artificial Sequence) SEQ ID NO: 152 GCTGTCTCCGCCGCCCGCCA HITI_LMNB1_guide (Artificial Sequence) SEQ ID NO: 153 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTT HDR Cas9 ACTB AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAG guide TCGGTGC (Artificial Sequence) SEQ ID NO: 154 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC original length TGACGACGGAGXXCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA pegRNAs for GAA dinucleotides XX: CG, GC, AT, TA, GG, TT, GA, AG, CC, TC, CT, AA, TG, GT, CA, or (Artificial Sequence) AC SEQ ID NO: 155 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 pegRNA GGTGCGACGAGCGCGGCGATATCATCATCCATGCCGGATGATCCT with attB 46 GT for GACGACGGAGACCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAG fusion AA (Artificial Sequence) SEQ ID NO: 156 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 pegRNA GGTGCGACGAGCGCGGCGATATCATCATCCATGCCGGATGATCCT with attB 46 CT for GACGACGGAGAGCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA multiplexing GAA (Artificial Sequence) SEQ ID NO: 157 GCGTATTGCCTGGAGGATGGGTTTTAGAGCTAGAAATAGCAAGTT NOLC1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGAACCACGCGGCGAATGCCGGCGTCCGCCCCGGATGATC with attB 46 GA for CTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCTCCTCCAGG multiplexing CAATACGCG (Artificial Sequence) SEQ ID NO: 158 GCTGTCTCCGCCGCCCGCCAGTTTTAGAGCTAGAAATAGCAAGTT LMNB1 N-term PBS AAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGT 18 RT 29 pegRNA CGGTGCGCGGCGGCACGGGGGTCGCAGTCGCCATGCCGGATGATC with attB 46 AG for CTGACGACGGAGCTCGCCGTCGTCGACAAGCCGGCCCGGGCGGCG multiplexing GAGACAGCG (Artificial Sequence) SEQ ID NO: 159 GTCACCTCCAATGACTAGGG EMX1 Cas9 guide 1 (Artificial Sequence) SEQ ID NO: 160 GGGCAACCACAAACCCACGA EMX1 Cas9 guide 2 (Artificial Sequence) SEQ ID NO: 161 
GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 56 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCTATGCCGGAT pegRNA GATCCTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCCTAGC (Artificial Sequence) TGAGCTGCGAGAA SEQ ID NO: 162 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 51 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGCCGGATGAT pegRNA CCTGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCCTATGAGC (Artificial Sequence) TGCGAGAA SEQ ID NO: 163 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 46 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGCCGGATGATCC pegRNA TGACGACGGAGTCCGCCGTCGTCGACAAGCCGGCCTGAGCTGCGA (Artificial Sequence) GAA SEQ ID NO: 164 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 41 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGGGATGATCCTG pegRNA ACGACGGAGTCCGCCGTCGTCGACAAGCCGTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 165 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 36 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGTGATCCTGACG pegRNA ACGGAGTCCGCCGTCGTCGACAAGCTGAGCTGCGAGAA (Artificial Sequence) SEQ ID NO: 166 GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA ACTB N-term PBS AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC 13 RT 29 attB 31 GA GGTGCGACGAGCGCGGCGATATCATCATCCATGGATCCTGACGAC pegRNA GGAGTCCGCCGTCGTCGACATGAGCTGCGAGAA

(Artificial Sequence)
SEQ ID NO: 167  GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGCCTGACGACGGAGTCCGCCGTCGTCGTGAGCTGCGAGAA  ACTB N-term PBS 13 RT 29 attB 26 GA pegRNA (Artificial Sequence)
SEQ ID NO: 168  GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGTGACGACGGAGTCCGCCGTCGTGAGCTGCGAGAA  ACTB N-term PBS 13 RT 29 attB 21 GA pegRNA (Artificial Sequence)
SEQ ID NO: 169  GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGACGACGGAGTCCGCCGTGAGCTGCGAGAA  ACTB N-term PBS 13 RT 29 attB 16 GA pegRNA (Artificial Sequence)
SEQ ID NO: 170  GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGGACGGAGTCCGTGAGCTGCGAGAA  ACTB N-term PBS 13 RT 29 attB 11 GA pegRNA (Artificial Sequence)
SEQ ID NO: 171  GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGCGGAGTTGAGCTGCGAGAA  ACTB N-term PBS 13 RT 29 attB 6 GA pegRNA (Artificial Sequence)
SEQ ID NO: 172  GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGTACCGTTCGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAATAGCC  ACTB N-term PBS_18_RT_34_with_Lox71_Cre pegRNA (Artificial Sequence)
SEQ ID NO: 173  GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCGACGAGCGCGGCGATATCATCATCCATGGTACCGTTCGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAATAGCC  ACTB N-term PBS_18_RT_29_with_Lox71_Cre pegRNA (Artificial Sequence)
SEQ ID NO: 174  GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCTCGACGACGAGCGCGGCGATATCATCATCCATGGTACCGTTCGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAA  ACTB N-term PBS_13_RT_34_with_Lox71_Cre pegRNA (Artificial Sequence)
SEQ ID NO: 175  GAAGCCGGCCTTGCACATGCGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGCATATCATCATCCATGGTACCGTTCGTATAGCATACATTATACGAAGTTATTGAGCTGCGAGAA  ACTB N-term PBS_13_RT_16_with_Lox71_Cre pegRNA (Artificial Sequence)
SEQ ID NO: 176  CCCCACGATGGAGGGGAAGAGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC  ACTB N-term Nicking guide 2 + 93 guide (Artificial Sequence)
SEQ ID NO: 177  CCTTCTCCTGGAGCCGCGACGTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTCGGTGC  LMNB1 N-term Nicking guide 2 + 87 guide (Artificial Sequence)
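
The pegRNA entries listed above appear to share a common layout: a 20-nt spacer, a constant guide scaffold, a reverse-transcription (RT) template, the integration site to be written, and a primer binding site (PBS). The following Python sketch is an illustration only. It reassembles SEQ ID NO: 96 (ACTB N-term PBS 13 RT 16 pegRNA) from those parts. All function and variable names are hypothetical, and the rule used for the PBS (the reverse complement of the spacer bases immediately 5' of a nick assumed to lie 3 nt from the spacer's 3' end) is an inference from the listed sequences, not a statement made in the disclosure.

```python
# Illustrative sketch only: names and the PBS rule are assumptions, not from the disclosure.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


# Parts read off the listed sequences.
SPACER = "GCTATTCTCGCAGCTCACCA"  # ACTB spacer (matches SEQ ID NO: 146)
SCAFFOLD = (
    "GTTTTAGAGCTAGAAATAGCAAGTTAAAATAAGGCTAGTCCGTTATCAACTTG"
    "AAAAAGTGGCACCGAGTCGGTGC"
)  # constant region shared by the guide and pegRNA entries
RT_TEMPLATE_16 = "ATATCATCATCCATGG"  # the "RT 16" portion of SEQ ID NO: 96
ATTB_46_GT_REVERSE = "CCGGATGATCCTGACGACGGAGACCGCCGTCGTCGACAAGCCGGCC"  # SEQ ID NO: 211


def primer_binding_site(spacer: str, length: int, nick_offset: int = 3) -> str:
    """PBS = reverse complement of the `length` spacer bases 5' of the assumed nick."""
    upstream = spacer[: len(spacer) - nick_offset]
    return reverse_complement(upstream[-length:])


def assemble_pegrna(spacer: str, rt_template: str, insert: str, pbs_length: int) -> str:
    """Concatenate spacer, scaffold, RT template, insert, and PBS (5' to 3')."""
    return spacer + SCAFFOLD + rt_template + insert + primer_binding_site(spacer, pbs_length)


# SEQ ID NO: 96 as listed, joined from its four printed segments.
SEQ_ID_NO_96 = (
    "GCTATTCTCGCAGCTCACCAGTTTTAGAGCTAGAAATAGCAAGTTA"
    "AAATAAGGCTAGTCCGTTATCAACTTGAAAAAGTGGCACCGAGTC"
    "GGTGCATATCATCATCCATGGCCGGATGATCCTGACGACGGAGAC"
    "CGCCGTCGTCGACAAGCCGGCCTGAGCTGCGAGAA"
)

pegrna = assemble_pegrna(SPACER, RT_TEMPLATE_16, ATTB_46_GT_REVERSE, pbs_length=13)
print(pegrna == SEQ_ID_NO_96)
```

Under these assumptions, changing `pbs_length` to 9 yields the shorter 3' end seen in the "PBS 9" ACTB entries, which suggests the PBS and RT lengths in the entry names describe truncations of the same underlying design.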

[0310] Sequences of insertion sites can be found in Table 4 below.

TABLE-US-00005 TABLE 4 FORWARD SEQUENCE (5'-3') REVERSE SEQUENCE (5'-3') DESCRIPTION/ SEQ ID SEQ ID SOURCE NO Sequence NO Sequence Bxb1_attP_GT_ 178 GTGGTTTGTCTGGTC 179 TGGGTTTGTACCGTA original_site AACCACCGCGGTCT CACCACTGAGACCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_C 180 GTGGTTTGTCTGGTC 181 TGGGTTTGTACCGTA G_site AACCACCGCGCGCT CACCACTGAGCGCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_G 182 GTGGTTTGTCTGGTC 183 TGGGTTTGTACCGTA C_site AACCACCGCGGCCT CACCACTGAGGCCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_AT_ 184 GTGGTTTGTCTGGTC 185 TGGGTTTGTACCGTA site AACCACCGCGATCT CACCACTGAGATCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_TA_ 186 GTGGTTTGTCTGGTC 187 TGGGTTTGTACCGTA site AACCACCGCGTACT CACCACTGAGTACG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_G 188 GTGGTTTGTCTGGTC 189 TGGGTTTGTACCGTA G_site AACCACCGCGGGCT CACCACTGAGCCCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_TT_ 190 GTGGTTTGTCTGGTC 191 TGGGTTTGTACCGTA site AACCACCGCGTTCTC CACCACTGAGAACG (Artificial AGTGGTGTACGGTA CGGTGGTTGACCAG Sequence) CAAACCCA ACAAACCAC Bxb1_attP_G 192 GTGGTTTGTCTGGTC 193 TGGGTTTGTACCGTA A_site AACCACCGCGGACT CACCACTGAGTCCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_A 194 GTGGTTTGTCTGGTC 195 TGGGTTTGTACCGTA G_site AACCACCGCGAGCT CACCACTGAGCTCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_CC_ 196 GTGGTTTGTCTGGTC 197 TGGGTTTGTACCGTA site AACCACCGCGCCCT CACCACTGAGGGCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_TC_ 198 GTGGTTTGTCTGGTC 199 TGGGTTTGTACCGTA site AACCACCGCGTCCTC CACCACTGAGGACG (Artificial AGTGGTGTACGGTA CGGTGGTTGACCAG Sequence) CAAACCCA ACAAACCAC Bxb1_attP_CT_ 200 GTGGTTTGTCTGGTC 201 TGGGTTTGTACCGTA site AACCACCGCGCTCTC 
CACCACTGAGAGCG (Artificial AGTGGTGTACGGTA CGGTGGTTGACCAG Sequence) CAAACCCA ACAAACCAC Bxb1_attP_A 202 GTGGTTTGTCTGGTC 203 TGGGTTTGTACCGTA A_site AACCACCGCGAACT CACCACTGAGTTCGC (Artificial CAGTGGTGTACGGT GGTGGTTGACCAGA Sequence) ACAAACCCA CAAACCAC Bxb1_attP_C 204 GTGGTTTGTCTGGTC 205 TGGGTTTGTACCGTA A_site AACCACCGCGCACT CACCACTGAGTGCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_A 206 GTGGTTTGTCTGGTC 207 TGGGTTTGTACCGTA C_site AACCACCGCGACCT CACCACTGAGGTCG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attP_TG_ 208 GTGGTTTGTCTGGTC 209 TGGGTTTGTACCGTA site AACCACCGCGTGCT CACCACTGAGCACG (Artificial CAGTGGTGTACGGT CGGTGGTTGACCAG Sequence) ACAAACCCA ACAAACCAC Bxb1_attB_46_ 210 GGCCGGCTTGTCGA 211 CCGGATGATCCTGA GT_ CGACGGCGGTCTCC CGACGGAGACCGCC original_site GTCGTCAGGATCATC GTCGTCGACAAGCC (Artificial CGG GGCC Sequence) Bxb1_attB_46_ 212 GGCCGGCTTGTCGA 213 CCGGATGATCCTGA AA_site CGACGGCGAACTCC CGACGGAGTTCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 214 GGCCGGCTTGTCGA 215 CCGGATGATCCTGA GA_site CGACGGCGGACTCC CGACGGAGTCCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 216 GGCCGGCTTGTCGA 217 CCGGATGATCCTGA CA_site CGACGGCGCACTCC CGACGGAGTGCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 218 GGCCGGCTTGTCGA 219 CCGGATGATCCTGA TA_site CGACGGCGTACTCC CGACGGAGTACGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 220 GGCCGGCTTGTCGA 221 CCGGATGATCCTGA AG_site CGACGGCGAGCTCC CGACGGAGCTCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 222 GGCCGGCTTGTCGA 223 CCGGATGATCCTGA GG_site CGACGGCGGGCTCC CGACGGAGCCCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 224 GGCCGGCTTGTCGA 225 CCGGATGATCCTGA CG_site CGACGGCGCGCTCC CGACGGAGCGCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 226 GGCCGGCTTGTCGA 227 
CCGGATGATCCTGA TG_site CGACGGCGTGCTCC CGACGGAGCACGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 228 GGCCGGCTTGTCGA 229 CCGGATGATCCTGA AC_site CGACGGCGACCTCC CGACGGAGGTCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 230 GGCCGGCTTGTCGA 231 CCGGATGATCCTGA GC_site CGACGGCGGCCTCC CGACGGAGGCCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 232 GGCCGGCTTGTCGA 233 CCGGATGATCCTGA CC_site CGACGGCGCCCTCC CGACGGAGGGCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 234 GGCCGGCTTGTCGA 235 CCGGATGATCCTGA TC_site CGACGGCGTCCTCC CGACGGAGGACGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 236 GGCCGGCTTGTCGA 237 CCGGATGATCCTGA AT_site CGACGGCGATCTCC CGACGGAGATCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 238 GGCCGGCTTGTCGA 239 CCGGATGATCCTGA CT_site CGACGGCGCTCTCC CGACGGAGAGCGCC (Artificial GTCGTCAGGATCATC GTCGTCGACAAGCC Sequence) CGG GGCC Bxb1_attB_46_ 240 GGCCGGCTTGTCGA 241 CCGGATGATCCTGA TT_site CGACGGCGTTCTCCG CGACGGAGAACGCC (Artificial TCGTCAGGATCATCC GTCGTCGACAAGCC Sequence) GG GGCC Bxb1_attB_38_ 242 GGCTTGTCGACGAC 243 ATGATCCTGACGAC GT_site GGCGGTCTCCGTCGT GGAGACCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 244 GGCTTGTCGACGAC 245 ATGATCCTGACGAC AA_site GGCGAACTCCGTCG GGAGTTCGCCGTCGT (Artificial TCAGGATCAT CGACAAGCC Sequence) Bxb1_attB_38_ 246 GGCTTGTCGACGAC 247 ATGATCCTGACGAC GA_site GGCGGACTCCGTCG GGAGTCCGCCGTCG (Artificial TCAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 248 GGCTTGTCGACGAC 249 ATGATCCTGACGAC CA_site GGCGCACTCCGTCGT GGAGTGCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 250 GGCTTGTCGACGAC 251 ATGATCCTGACGAC TA_site GGCGTACTCCGTCGT GGAGTACGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 252 GGCTTGTCGACGAC 253 ATGATCCTGACGAC AG_site GGCGAGCTCCGTCG GGAGCTCGCCGTCG (Artificial TCAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 
254 GGCTTGTCGACGAC 255 ATGATCCTGACGAC GG_site GGCGGGCTCCGTCG GGAGCCCGCCGTCG (Artificial TCAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 256 GGCTTGTCGACGAC 257 ATGATCCTGACGAC CG_site GGCGCGCTCCGTCGT GGAGCGCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 258 GGCTTGTCGACGAC 259 ATGATCCTGACGAC TG_site GGCGTGCTCCGTCGT GGAGCACGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 260 GGCTTGTCGACGAC 261 ATGATCCTGACGAC AC_site GGCGACCTCCGTCGT GGAGGTCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 262 GGCTTGTCGACGAC 263 ATGATCCTGACGAC GC_site GGCGGCCTCCGTCGT GGAGGCCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 264 GGCTTGTCGACGAC 265 ATGATCCTGACGAC CC_site GGCGCCCTCCGTCGT GGAGGGCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 266 GGCTTGTCGACGAC 267 ATGATCCTGACGAC TC_site GGCGTCCTCCGTCGT GGAGGACGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 268 GGCTTGTCGACGAC 269 ATGATCCTGACGAC AT_site GGCGATCTCCGTCGT GGAGATCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 270 GGCTTGTCGACGAC 271 ATGATCCTGACGAC CT_site GGCGCTCTCCGTCGT GGAGAGCGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Bxb1_attB_38_ 272 GGCTTGTCGACGAC 273 ATGATCCTGACGAC TT_site GGCGTTCTCCGTCGT GGAGAACGCCGTCG (Artificial CAGGATCAT TCGACAAGCC Sequence) Cre Lox 66 274 TACCGTTCGTATAAT 275 ATAACTTCGTATAGC site GTATGCTATACGAA ATACATTATACGAA (Artificial GTTAT CGGTA

Sequence) Cre Lox 71 276 ATAACTTCGTATAAT 277 TACCGTTCGTATAGC site GTATGCTATACGAA ATACATTATACGAA (Artificial CGGTA GTTAT Sequence) TP901-1 278 TTTACCTTGATTGAG 279 CACAATTAACATCTC minimal attB ATGTTAATTGTG AATCAAGGTAAA site (Artificial Sequence) TP901-1 280 GCGAGTTTTTATTTC 281 AAAGGAGTTTTTTAG minimal attP GTTTATTTCAATTAA TTACCTTAATTGAAA site GGTAACTAAAAAAC TAAACGAAATAAAA (Artificial TCCTTT ACTCGC Sequence) PhiBT1 282 CTGGATCATCTGGAT 283 CAGGTTTTTGACGAA minimal attB CACTTTCGTCAAAAA AGTGATCCAGATGA site CCTG TCCAG (Artificial Sequence) PhiBT1 284 TTCGGGTGCTGGGTT 285 TGGTGCTGAGTAGTT minimal attP GTTGTCTCTGGACAG TCCCATGGATCACTG site TGATCCATGGGAAA TCCAGAGACAACAA (Artificial CTACTCAGCACCA CCCAGCACCCGAA Sequence)
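As a minimal sketch that is not part of the original disclosure, the pairing of the two sequence columns in the table above can be checked computationally: for the rows examined here, the second sequence is the reverse complement of the first. The helper name `reverse_complement` and the dictionary `PAIRS` are illustrative; the sequences are copied from the Cre Lox 71 and TP901-1 minimal attB rows (SEQ ID NOs: 276/277 and 278/279).

```python
# Illustrative check: the second sequence of a row is the reverse complement of the first.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Sequence pairs copied from the table (chunks joined without spaces).
PAIRS = {
    "Cre Lox 71 site": (
        "ATAACTTCGTATAATGTATGCTATACGAACGGTA",
        "TACCGTTCGTATAGCATACATTATACGAAGTTAT",
    ),
    "TP901-1 minimal attB site": (
        "TTTACCTTGATTGAGATGTTAATTGTG",
        "CACAATTAACATCTCAATCAAGGTAAA",
    ),
}

for name, (first, second) in PAIRS.items():
    # Prints the row name and whether the pair is consistent.
    print(name, reverse_complement(first) == second)
```

The same check can be applied to any other row once its space-separated chunks are joined.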

[0311] Sequences of Bxb1 and RT mutants can be found in Table 6 below.

TABLE-US-00006 TABLE 6
SEQ ID NO/ DESCRIPTION/ SOURCE | FORWARD SEQUENCE (5'-3')
SEQ ID NO: 286 Bxb1_mut_V368A (Artificial Sequence) | AAAAGTGTGGGCTGCAGGATCTGA
SEQ ID NO: 287 Bxb1_mut_E379A (Artificial Sequence) | GGAGCTGGCAGCTGTCAATGCC
SEQ ID NO: 288 Bxb1_mut_E383A (Artificial Sequence) | AGTCAATGCCGCTCTCGTGGA
SEQ ID NO: 403 RT_mut_L139P (Artificial Sequence) | TTGAGCGGGCCCCCACCGT
SEQ ID NO: 289 RT_mut_E562Q (Artificial Sequence) | CAGCGGGCTCAGCTGATAGCA
SEQ ID NO: 290 RT_mut_D653N (Artificial Sequence) | CGGATGGCTAACCAAGCGGCC
SEQ ID NO: 404 RT(1-478)_Sto7d fusion | atgactcactatcaggccttgcttttggacacggaccgggtccagttcggaccggtggtagccctgaacccggctacgctgctcccactgcctgaggaagggctgcaacacaactgccttgatGGGACAGGTGGCGGTGGTGTCACCGTCAAGTTCAAGTACAAGGGTGAGGAACTTGAAGTTGATATTAGCAAAATCAAGAAGGTTTGGCGCGTTGGTAAAATGATATCTTTTACTTATGACGACAACGGCAAGACAGGTAGAGGGGCAGTGTCTGAGAAAGACGCCCCCAAGGAGCTGTTGCAAATGTTGGAAAAGTCTGGGAAAAAGtctggcggctcaaaaagaaccgccgacggcagcgaattcgagcccaagaagaagaggaaagtc

[0312] Sequences of primers, probes and restriction enzymes used in ddPCR readout can be found in Table 7 below.

TABLE-US-00007 TABLE 7
Locus | Cargo | SEQ ID NO: | Forward Primer | SEQ ID NO: | Reverse Primer | Probe | SEQ ID NO: | Restriction Enzymes
ACTB | GFP (pDY0186) | 291 | CCCGGCTTCCTTTGTCC | 292 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I, HindIII
ACTB | TP90-1 GFP (pDY0333) | 293 | CCCGGCTTCCTTTGTCC | 294 | AACCACAACTAGAATGCAGTGA | /56-FAM/TG CTA TTG C/ZEN/T TTA TTT GTG GGC CCG/3IABkFQ/ | 406 | None
ACTB | TP90-1 rc GFP (pDY0334) | 295 | CCCGGCTTCCTTTGTCC | 296 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | None
ACTB | PhiBT1 GFP (pDY0367) | 297 | CCCGGCTTCCTTTGTCC | 298 | AACCACAACTAGAATGCAGTGA | /56-FAM/TG CTA TTG C/ZEN/T TTA TTT GTG GGC CCG/3IABkFQ/ | 406 | None
ACTB | PhiBT1 rc GFP (pDY0368) | 299 | CCCGGCTTCCTTTGTCC | 300 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | None
LMNB1 | GFP (pDY0186) | 301 | TCCTTATCACGGTCCCGCTCG | 302 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I, HindIII
NOLC1 | GFP (pDY0186) | 303 | CGTCGACAACGGTAGTG | 304 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I, HindIII
SUPT16H | GFP (pDY0186) | 305 | TCGCGTGATTCTCGGAAC | 306 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I, HindIII
SRRM2 | GFP (pDY0186) | 307 | GGGCGGTAAGTGGTTAGTTT | 308 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I, HindIII
DEPDC4 | GFP (pDY0186) | 309 | AAGAGGCGGAGCCAGTA | 310 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I, HindIII
NES | GFP (pDY0186) | 311 | CTCCCTTCTCCCGGTGCCC | 312 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I, HindIII
ACTB | ACTB HITI template GFP (pDY0219) | 313 | CCCGGCTTCCTTTGTCC | 314 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I
SRRM2 | SRRM2 HITI template GFP (aRY0182_A2) | 315 | GGGCGGTAAGTGGTTAGTTT | 316 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I
NOLC1 | NOLC1 HITI template GFP (aRY0182_A3) | 317 | CGTCGACAACGGTAGTG | 318 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I
DEPDC4 | DEPDC4 HITI template GFP (aRY0182_A5) | 319 | AAGAGGCGGAGCCAGTA | 320 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I
NES | NES HITI template GFP (aRY0182_A7) | 321 | CTCCCTTCTCCCGGTGCCC | 322 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I
LMNB1 | LMNB1 HITI template GFP (aRY0182_A4) | 323 | TCCTTATCACGGTCCCGCTCG | 324 | GAACTCCACGCCGTTCA | /56-FAM/CC ATG AAG A/ZEN/T CGA GTG CCG CAT CA/3IABkFQ/ | 407 | Eco91I
ACTB | SERPINA (pDY0298) | 325 | CCCGGCTTCCTTTGTCC | 326 | GGCCTGCCAGCAGGAGGA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | EcoRI, XhoI, HindIII
ACTB | CPS1 (pDY299) | 327 | CCCGGCTTCCTTTGTCC | 328 | GGTGTGCAGTCACATTGGTAAAGCC | /56-FAM/AC AGC TTT C/ZEN/A AAG TGG TGA GGA CAC T/3IABkFQ/ | 408 | XhoI, HindIII
ACTB | CFTR (pDY0373) | 329 | CCCGGCTTCCTTTGTCC | 330 | GATGGGTCTAGTCCAGCTAAAG | /56-FAM/TAC GGT ACA/ZEN/AAC CC ACC CGA GAG A/3IABkFQ/ | 409 | Eco91I, HindIII
ACTB | NYESO TRAC (pDY0318) | 331 | CCCGGCTTCCTTTGTCC | 332 | GAGAGACAAGGCTGCACA | /56-FAM/TAC GGT ACA/ZEN/AAC CC ACC CGA GAG A/3IABkFQ/ | 409 | Eco47III, HindIII
NC_000003 | GFP (pDY0186) | 333 | CCAGGTGAGAGTCAGGGTAGTGTTCA | 334 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I, HindIII
NC_000002 | GFP (pDY0186) | 335 | AGGGACCTTTGCCTGTGTGAGTC | 336 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I, HindIII
NC_000009 | GFP (pDY0186) | 337 | TCAGCTCTGTGCTGAGGCGAA | 338 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I, HindIII
chr6:149045959 | GFP (pDY0186) | 339 | AAGCCATCTCCCAGAATATCTGCTTAGAAATG | 340 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I, HindIII
chr16:18607730 | GFP (pDY0186) | 341 | GAGAGGAGCAACAGTGAGCATGATG | 342 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I, HindIII
chr6:149045959 | ACTB HITI template GFP (pDY0219) | 343 | AAGCCATCTCCCAGAATATCTGCTTAGAAATG | 344 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I
chr16:18607730 | ACTB HITI template GFP (pDY0219) | 345 | GAGAGGAGCAACAGTGAGCATGATG | 346 | GAACTCCACGCCGTTCA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I
ACTB | CAG_Kozak_bGH_therapeutic_genes generic minicircle | 347 | CCCGGCTTCCTTTGTCC | 348 | GGCTATGAACTAATGACCCCGT | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | Eco91I, HindIII
ACTB | Hibit-SERPINA (pDY0405) | 349 | CCCGGCTTCCTTTGTCC | 350 | GGCCTGCCAGCAGGAGGA | /56-FAM/CC GGC TTG T/ZEN/C GAC GAC GGC G/3IABkFQ/ | 405 | EcoRI, XhoI, HindIII
ACTB | Hibit-CPS1 (pDY406) | 351 | CCCGGCTTCCTTTGTCC | 352 | GGTGTGCAGTCACATTGGTAAAGCC | /56-FAM/AC AGC TTT C/ZEN/A AAG TGG TGA GGA CAC T/3IABkFQ/ | 408 | XhoI, HindIII

[0313] Sequences of primers used for NGS readout can be found in Table 8 below.

TABLE-US-00008 TABLE 8
SEQ ID NO / DESCRIPTION / SOURCE | ID | SEQUENCE (5'-3')
SEQ ID NO: 353 N-term ACTB Tn5 readout F 1 (Artificial Sequence) | PD0966 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCGACCTCGGCTCACAGCG
SEQ ID NO: 354 N-term ACTB Tn5 readout F 2 (Artificial Sequence) | PD0967 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCGACCTCGGCTCACAGCG
SEQ ID NO: 355 N-term ACTB Tn5 readout F 3 (Artificial Sequence) | PD0968 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTGACCGACCTCGGCTCACAGCG
SEQ ID NO: 356 N-term ACTB Tn5 readout F 4 (Artificial Sequence) | PD0969 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTTGACCGACCTCGGCTCACAGCG
SEQ ID NO: 357 N-term ACTB Tn5 readout F 5 (Artificial Sequence) | PD0970 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGACCGACCTCGGCTCACAGCG
SEQ ID NO: 358 N-term ACTB Tn5 readout F 6 (Artificial Sequence) | PD0971 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTACTGACCGACCTCGGCTCACAGCG
SEQ ID NO: 359 N-term ACTB Tn5 readout F 7 (Artificial Sequence) | PD0972 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTTACTGACCGACCTCGGCTCACAGCG
SEQ ID NO: 360 N-term ACTB Tn5 readout F 8 (Artificial Sequence) | PD0973 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTACTGACCGACCTCGGCTCACAGCG
SEQ ID NO: 361 ACTB N-term NGS R for Cas14 indels (Artificial Sequence) | FP0952 | GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCCACCCAGCCAGCTCCC
SEQ ID NO: 362 NGS EMX1 Forward 1 (Artificial Sequence) | PD0313 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTCCGGTGGCGCATTGCCAC
SEQ ID NO: 363 NGS EMX1 Forward 2 (Artificial Sequence) | PD0314 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTACCGGTGGCGCATTGCCAC
SEQ ID NO: 364 NGS EMX1 Forward 3 (Artificial Sequence) | PD0315 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTGACCGGTGGCGCATTGCCAC
SEQ ID NO: 365 NGS EMX1 Forward 4 (Artificial Sequence) | PD0316 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTTGACCGGTGGCGCATTGCCAC
SEQ ID NO: 366 NGS EMX1 Forward 5 (Artificial Sequence) | PD0317 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTCTGACCGGTGGCGCATTGCCAC
SEQ ID NO: 367 NGS EMX1 Forward 6 (Artificial Sequence) | PD0318 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTACTGACCGGTGGCGCATTGCCAC
SEQ ID NO: 368 NGS EMX1 Forward 7 (Artificial Sequence) | PD0319 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTTACTGACCGGTGGCGCATTGCCAC
SEQ ID NO: 369 NGS EMX1 Forward 8 (Artificial Sequence) | PD0320 | ACACTCTTTCCCTACACGACGCTCTTCCGATCTGTACTGACCGGTGGCGCATTGCCAC
SEQ ID NO: 370 NGS EMX1 Reverse (Artificial Sequence) | PD0321 | GTGACTGGAGTTCAGACGTGTGCTCTTCCGATCTCAGAGTCCAGCTTGGGCCCA
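The eight forward primers listed for each locus in Table 8 appear to differ only by a spacer of 0 to 7 nucleotides placed between a shared 5' portion and the locus-specific 3' portion. The sketch below rebuilds them under that reading. The names `ADAPTER`, `SPACER`, and `staggered_primers` are illustrative and do not come from the disclosure, and the decomposition into adapter, spacer, and locus-specific parts is an observation about the listed sequences rather than a stated design rule.

```python
# Sketch: rebuild the eight forward primers per locus listed in Table 8.
ADAPTER = "ACACTCTTTCCCTACACGACGCTCTTCCGATCT"  # 5' portion shared by every forward primer
SPACER = "GTACTGA"  # full spacer seen in forward primer 8; primers 1-7 use its 3' suffixes


def staggered_primers(locus_specific: str) -> list[str]:
    """Return primers 1-8, inserting the last 0-7 bases of SPACER before the locus sequence."""
    primers = []
    for n in range(len(SPACER) + 1):
        spacer = SPACER[len(SPACER) - n:]  # empty string when n == 0
        primers.append(ADAPTER + spacer + locus_specific)
    return primers


# Locus-specific 3' portions read from forward primer 1 of each set.
ACTB_PRIMERS = staggered_primers("CCGACCTCGGCTCACAGCG")
EMX1_PRIMERS = staggered_primers("CCGGTGGCGCATTGCCAC")

for number, primer in enumerate(ACTB_PRIMERS, start=1):
    print(number, primer)
```

Comparing the generated list with the table row by row is a quick way to catch transcription errors in either set.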

[0314] Sequences of off-target sites can be found in Table 9 below.

TABLE-US-00009 TABLE 9
SEQ ID NO / DESCRIPTION / SOURCE | SEQUENCE (5'-3')
SEQ ID NO: 371 Cas9_chr6:149045959 (Artificial Sequence) | GATATTTTCCCAGCTCACCA
SEQ ID NO: 372 Cas9_chr16:18607730 (Artificial Sequence) | TCTATTCTCCCAGCTCCCCA
SEQ ID NO: 373 Bxb1_NC_000002 (Artificial Sequence) | AGCGGCTTCTGTCTCTGTGAGTGAGCTGGCGGTCTCCGTC
SEQ ID NO: 374 Bxb1_NC_000003 (Artificial Sequence) | GACTAGCCCACGCTCCGGTTCTGAGCCGCGACGGCGGTCTCCG
SEQ ID NO: 375 Bxb1_NC_000009 (Artificial Sequence) | CCCAGGGTCCCATGCGCTCCCCGGCCCTGACGGCGGTCTCC

[0315] Linker sequences can be found in Table 10 below.

TABLE-US-00010 TABLE 10
Description | Sequence (5'-3') | Amino acid sequence
A - P2A | GGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGCGACGTGGAGGAGAACCCTGGACCT (SEQ ID NO: 410) | GSGATNFSLLKQAGDVEENPGP (SEQ ID NO: 418)
B - (GGGS)3 | GGGGGAGGAGGTTCTGGAGGCGGAGGCTCCGGAGGCGGAGGGTCA (SEQ ID NO: 411) | GGGGSGGGGSGGGGS (SEQ ID NO: 419)
C - GGGGS | GGAGGTGGCGGGAGC (SEQ ID NO: 412) | GGGGS (SEQ ID NO: 420)
D - PAPAP | CCCGCACCAGCGCCT (SEQ ID NO: 413) | PAPAP (SEQ ID NO: 421)
E - (EAAAK)3 | GAGGCAGCTGCCAAGGAAGCCGCTGCCAAGGAGGCGGCCGCAAAG (SEQ ID NO: 414) | EAAAKEAAAKEAAAK (SEQ ID NO: 422)
F - XTEN | AGTGGGAGCGAGACCCCTGGGACTAGCGAGTCAGCTACACCCGAAAGC (SEQ ID NO: 415) | SGSETPGTSESATPES (SEQ ID NO: 423)
G - (GGS)6 | GGGGGGTCAGGTGGATCCGGCGGAAGTGGCGGATCCGGTGGATCTGGCGGCAGT (SEQ ID NO: 416) | GGSGGSGGSGGSGGSGGS (SEQ ID NO: 424)
H - EAAAK | GAAGCTGCTGCTAAG (SEQ ID NO: 417) | EAAAK (SEQ ID NO: 425)
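As a consistency check that is not part of the original disclosure, each nucleotide linker sequence in Table 10 can be translated with the standard genetic code and compared with the amino acid sequence listed beside it. The sketch below is self-contained; the names `CODON_TABLE`, `translate`, and `LINKERS` are illustrative, and the sequences are the table entries with their space-separated chunks joined.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA sequence whose length is a multiple of three."""
    if len(dna) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (nucleotide sequence, listed amino acid sequence) for each linker in Table 10.
LINKERS = {
    "A - P2A": ("GGAAGCGGAGCTACTAACTTCAGCCTGCTGAAGCAGGCTGGCGACGTGGAGGAGAACCCTGGACCT",
                "GSGATNFSLLKQAGDVEENPGP"),
    "B - (GGGS)3": ("GGGGGAGGAGGTTCTGGAGGCGGAGGCTCCGGAGGCGGAGGGTCA", "GGGGSGGGGSGGGGS"),
    "C - GGGGS": ("GGAGGTGGCGGGAGC", "GGGGS"),
    "D - PAPAP": ("CCCGCACCAGCGCCT", "PAPAP"),
    "E - (EAAAK)3": ("GAGGCAGCTGCCAAGGAAGCCGCTGCCAAGGAGGCGGCCGCAAAG", "EAAAKEAAAKEAAAK"),
    "F - XTEN": ("AGTGGGAGCGAGACCCCTGGGACTAGCGAGTCAGCTACACCCGAAAGC", "SGSETPGTSESATPES"),
    "G - (GGS)6": ("GGGGGGTCAGGTGGATCCGGCGGAAGTGGCGGATCCGGTGGATCTGGCGGCAGT",
                   "GGSGGSGGSGGSGGSGGS"),
    "H - EAAAK": ("GAAGCTGCTGCTAAG", "EAAAK"),
}

for name, (dna, protein) in LINKERS.items():
    print(name, translate(dna) == protein)
```

Note that the translation for linker B is three repeats of GGGGS, while its description in the table reads (GGGS)3; the sketch compares against the listed amino acid sequence, not the description.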

[0316] Exemplary fusion sequences can be found in Table 11 below.

TABLE-US-00011 Description Sequence SpCas9-XTEN- MKRTADGSEFESPKKKRKVDKKYSIGLDTN RT(1-478)-Sto7d- SVGWAVITDEYKVPSKKFKVLGNTDRHSIK GGGGS-BxbINT KNLIGALLFDSGETAEATRLKRTARRRYTR Amino acid RKNRICYLQEIFSNEMAKVDDSFFHRLEES SEQ ID NO: 376 FLVEEDKKHERHPIFGNIVDEVAYHEKYPT IYHLRKKLVDSTDKADLRLIYLALAHMIKF RGHFLIEGDLNPDNSDVDKLFIQLVQTYNQ LFEENPINASGVDAKAILSARLSKSRRLEN LIAQLPGEKKNGLFGNLIALSLGLTPNFKS NFDLAEDAKLQLSKDTYDDDLDNLLAQIGD QYADLFLAAKNLSDAILLSDILRVNTEITK APLSASMIKRYDEHHQDLTLLKALVRQQLP EKYKEIFFDQSKNGYAGYIDGGASQEEFYK FIKPILEKMDGTEELLVKLNREDLLRKQRT FDNGSIPHQIHLGELHAILRRQEDFYPFLK DNREKIEKILTFRIPYYVGPLARGNSRFAW MTRKSEETITPWNFEEVVDKGASAQSFIER MTNFDKNLPNEKVLPKHSLLYEYFTVYNEL TKVKYVTEGMRKPAFLSGEQKKAIVDLLFK TNRKVTVKQLKEDYFKKIECFDSVEISGVE DRFNASLGTYHDLLKIIKDKDFLDNEENED ILEDIVLTLTLFEDREMIEERLKTYAHLFD DKVMKQLKRRRYTGWGRLSRKLINGIRDKQ SGKTILDFLKSDGFANRNFMQLIHDDSLTF KEDIQKAQVSGQGDSLHEHIANLAGSPAIK KGILQTVKVVDELVKVMGRHKPENIVIEMA RENQTTQKGQKNSRERMKRIEEGIKELGSQ ILKEHPVENTQLQNEKLYLYYLQNGRDMYV DQELDINRLSDYDVDAIVPQSFLKDDSIDN KVLTRSDKNRGKSDNVPSEEVVKKMKNYWR QLLNAKLITQRKFDNLTKAERGGLSELDKA GFIKRQLVETRQITKHVAQILDSRMNTKYD ENDKLIREVKVITLKSKLVSDFRKDFQFYK VREINNYHHAHDAYLNAVVGTALIKKYPKL ESEFVYGDYKVYDVRKMIAKSEQEIGKATA KYFFYSNIMNFFKTEITLANGEIRKRPLIE TNGETGEIVWDKGRDFATVRKVLSMPQVNI VKKTEVQTGGFSKESILPKRNSDKLIARKK DWDPKKYGGFDSPTVAYSVLVVAKVEKGKS KKLKSVKELLGITIMERSSFEKNPIDFLEA KGYKEVKKDLIIKLPKYSLFELENGRKRML ASAGELQKGNELALPSKYVNFLYLASHYEK LKGSPEDNEQKQLFVEQHKHYLDEIIEQIS EFSKRVILADANLDKVLSAYNKHRDKPIRE QAENIIHLFTLTNLGAPAAFKYFDTTIDRK RYTSTKEVLDATLIHQSITGLYETRIDLSQ LGGDSGGSSGGSSGSETPGTSESATPESSG SETPGTSESATPESSGSETPGTSESATPES SGGSSGGSSTLNIEDEYRLHETSKEPDVSL GSTWLSDFPQAWAETGGMGLAVRQAPLIIP LKATSTPVSIKQYPMSQEARLGIKPHIQRL LDQGILVPCQSPWNTPLLPVKKPGTNDYRP VQDLREVNKRVEDIHPTVPNPYNLLSGPPP SHQWYTVLDLKDAFFCLRLHPTSQPLFAFE WRDPEMGISGQLTWTRLPQGFKNSPTLFNE ALHRDLADFRIQHPDLILLQYVDDLLLAAT SELDCQQGTRALLQTLGNLGYRASAKKAQI CQKQVKYLGYLLKEGQRWLTEARKETVMGQ PTPKTPRQLREFLGKAGFCRLFIPGFAEMA APLYPLTKPGTLFNWGPDQQKAYQEIKQAL LTAPALGLPDLTKPFELFVDEKQGYAKGVL 
TQKLGPWRRPVAYLSKKLDPVAAGWPPCLR MVAAIAVLTKDAGKLTMGQPLVILAPHAVE ALVKQPPDRWLSNARMTHYQALLLDTDRVQ FGPVVALNPATLLPLPEEGLQHNCLDGTGG GGVTVKFKYKGEELEVDISKIKKVWRVGKM ISFTYDDNGKTGRGAVSEKDAPKELLQMLE KSGKKSGGSKRTADSEFEPKKKRKVGGGGS PKKKRKVYPYDVPDYAGSRALVVIRLSRVT DATTSPERQLESCQQLCAQRGWDVVGVAED LDVSGAVDPFDRKRRPNLARWLAFEEQPFD VIVAYRVDRLTRSIRHLQQLVHWAEDHKKL VVSATEAHFDTTTPFAAVVIALMGTVAQME LEAIKERNRSAAHFNIRAGKYRGSLPPWGY LPTRVDGEWRLVPDPVQRERILEVYHRVVD NHEPLHLVAHDLNRRGVLSPKDYFAQLQGR EPQGREWSATALKRSMISEAMLGYATLNGK TVRDDDGAPLVRAEPILTREQLEALRAELV KTSRAKPAVSTPSLLLRVLFCAVCGEPAYK FAGGGRKHPRYRCRSMGFPKHCGNGTVAMA EWDAFCEEQVLDLLGDAERLEKVWVAGSDS AVELAEVNAELVDLTSLIGSPAYRAGSPQR EALDARIAALAARQEELEGLEARPSGWEWR ETGQRFGDWWREQDTAAKNTWLRSMNVRLT FDVRGGLTRTIDFGDLQEYEQHLRLGSVVE RLHTGMS SpCas9-XTEN- ATGAAACGGACAGCCGACGGAAGCGAGTTC RT(1-478)-Sto7d- GAGTCACCAAAGAAGAAGCGGAAAGTCGAC GGGGS-BxbINT AAGAAGTACAGCATCGGCCTGGACATCGGC Nucleic acid ACCAACTCTGTGGGCTGGGCCGTGATCACC SEQ ID NO: 377 GACGAGTACAAGGTGCCCAGCAAGAAATTC AAGGTGCTGGGCAACACCGACCGGCACAGC ATCAAGAAGAACCTGATCGGAGCCCTGCTG TTCGACAGCGGCGAAACAGCCGAGGCCACC CGGCTGAAGAGAACCGCCAGAAGAAGATAC ACCAGACGGAAGAACCGGATCTGCTATCTG CAAGAGATCTTCAGCAACGAGATGGCCAAG GTGGACGACAGCTTCTTCCACAGACTGGAA GAGTCCTTCCTGGTGGAAGAGGATAAGAAG CACGAGCGGCACCCCATCTTCGGCAACATC GTGGACGAGGTGGCCTACCACGAGAAGTAC CCCACCATCTACCACCTGAGAAAGAAACTG GTGGACAGCACCGACAAGGCCGACCTGCGG CTGATCTATCTGGCCCTGGCCCACATGATC AAGTTCCGGGGCCACTTCCTGATCGAGGGC GACCTGAACCCCGACAACAGCGACGTGGAC AAGCTGTTCATCCAGCTGGTGCAGACCTAC AACCAGCTGTTCGAGGAAAACCCCATCAAC GCCAGCGGCGTGGACGCCAAGGCCATCCTG TCTGCCAGACTGAGCAAGAGCAGACGGCTG GAAAATCTGATCGCCCAGCTGCCCGGCGAG AAGAAGAATGGCCTGTTCGGAAACCTGATT GCCCTGAGCCTGGGCCTGACCCCCAACTTC AAGAGCAACTTCGACCTGGCCGAGGATGCC AAACTGCAGCTGAGCAAGGACACCTACGAC GACGACCTGGACAACCTGCTGGCCCAGATC GGCGACCAGTACGCCGACCTGTTTCTGGCC GCCAAGAACCTGTCCGACGCCATCCTGCTG AGCGACATCCTGAGAGTGAACACCGAGATC ACCAAGGCCCCCCTGAGCGCCTCTATGATC AAGAGATACGACGAGCACCACCAGGACCTG ACCCTGCTGAAAGCTCTCGTGCGGCAGCAG CTGCCTGAGAAGTACAAAGAGATTTTCTTC 
GACCAGAGCAAGAACGGCTACGCCGGCTAC ATTGACGGCGGAGCCAGCCAGGAAGAGTTC TACAAGTTCATCAAGCCCATCCTGGAAAAG ATGGACGGCACCGAGGAACTGCTCGTGAAG CTGAACAGAGAGGACCTGCTGCGGAAGCAG CGGACCTTCGACAACGGCAGCATCCCCCAC CAGATCCACCTGGGAGAGCTGCACGCCATT CTGCGGCGGCAGGAAGATTTTTACCCATTC CTGAAGGACAACCGGGAAAAGATCGAGAAG ATCCTGACCTTCCGCATCCCCTACTACGTG GGCCCTCTGGCCAGGGGAAACAGCAGATTC GCCTGGATGACCAGAAAGAGCGAGGAAACC ATCACCCCCTGGAACTTCGAGGAAGTGGTG GACAAGGGCGCTTCCGCCCAGAGCTTCATC GAGCGGATGACCAACTTCGATAAGAACCTG CCCAACGAGAAGGTGCTGCCCAAGCACAGC CTGCTGTACGAGTACTTCACCGTGTATAAC GAGCTGACCAAAGTGAAATACGTGACCGAG GGAATGAGAAAGCCCGCCTTCCTGAGCGGC GAGCAGAAAAAGGCCATCGTGGACCTGCTG TTCAAGACCAACCGGAAAGTGACCGTGAAG CAGCTGAAAGAGGACTACTTCAAGAAAATC GAGTGCTTCGACTCCGTGGAAATCTCCGGC GTGGAAGATCGGTTCAACGCCTCCCTGGGC ACATACCACGATCTGCTGAAAATTATCAAG GACAAGGACTTCCTGGACAATGAGGAAAAC GAGGACATTCTGGAAGATATCGTGCTGACC CTGACACTGTTTGAGGACAGAGAGATGATC GAGGAACGGCTGAAAACCTATGCCCACCTG TTCGACGACAAAGTGATGAAGCAGCTGAAG CGGCGGAGATACACCGGCTGGGGCAGGCTG AGCCGGAAGCTGATCAACGGCATCCGGGAC AAGCAGTCCGGCAAGACAATCCTGGATTTC CTGAAGTCCGACGGCTTCGCCAACAGAAAC TTCATGCAGCTGATCCACGACGACAGCCTG ACCTTTAAAGAGGACATCCAGAAAGCCCAG GTGTCCGGCCAGGGCGATAGCCTGCACGAG CACATTGCCAATCTGGCCGGCAGCCCCGCC ATTAAGAAGGGCATCCTGCAGACAGTGAAG GTGGTGGACGAGCTCGTGAAAGTGATGGGC CGGCACAAGCCCGAGAACATCGTGATCGAA ATGGCCAGAGAGAACCAGACCACCCAGAAG GGACAGAAGAACAGCCGCGAGAGAATGAAG CGGATCGAAGAGGGCATCAAAGAGCTGGGC AGCCAGATCCTGAAAGAACACCCCGTGGAA AACACCCAGCTGCAGAACGAGAAGCTGTAC CTGTACTACCTGCAGAATGGGCGGGATATG TACGTGGACCAGGAACTGGACATCAACCGG CTGTCCGACTACGATGTGGACGCTATCGTG CCTCAGAGCTTTCTGAAGGACGACTCCATC GACAACAAGGTGCTGACCAGAAGCGACAAG AACCGGGGCAAGAGCGACAACGTGCCCTCC GAAGAGGTCGTGAAGAAGATGAAGAACTAC TGGCGGCAGCTGCTGAACGCCAAGCTGATT ACCCAGAGAAAGTTCGACAATCTGACCAAG GCCGAGAGAGGCGGCCTGAGCGAACTGGAT AAGGCCGGCTTCATCAAGAGACAGCTGGTG GAAACCCGGCAGATCACAAAGCACGTGGCA CAGATCCTGGACTCCCGGATGAACACTAAG TACGACGAGAATGACAAGCTGATCCGGGAA GTGAAAGTGATCACCCTGAAGTCCAAGCTG GTGTCCGATTTCCGGAAGGATTTCCAGTTT TACAAAGTGCGCGAGATCAACAACTACCAC CACGCCCACGACGCCTACCTGAACGCCGTC 
GTGGGAACCGCCCTGATCAAAAAGTACCCT AAGCTGGAAAGCGAGTTCGTGTACGGCGAC TACAAGGTGTACGACGTGCGGAAGATGATC GCCAAGAGCGAGCAGGAAATCGGCAAGGCT ACCGCCAAGTACTTCTTCTACAGCAACATC ATGAACTTTTTCAAGACCGAGATTACCCTG GCCAACGGCGAGATCCGGAAGCGGCCTCTG ATCGAGACAAACGGCGAAACCGGGGAGATC GTGTGGGATAAGGGCCGGGATTTTGCCACC GTGCGGAAAGTGCTGAGCATGCCCCAAGTG AATATCGTGAAAAAGACCGAGGTGCAGACA GGCGGCTTCAGCAAAGAGTCTATCCTGCCC AAGAGGAACAGCGATAAGCTGATCGCCAGA AAGAAGGACTGGGACCCTAAGAAGTACGGC GGCTTCGACAGCCCCACCGTGGCCTATTCT GTGCTGGTGGTGGCCAAAGTGGAAAAGGGC AAGTCCAAGAAACTGAAGAGTGTGAAAGAG CTGCTGGGGATCACCATCATGGAAAGAAGC AGCTTCGAGAAGAATCCCATCGACTTTCTG GAAGCCAAGGGCTACAAAGAAGTGAAAAAG GACCTGATCATCAAGCTGCCTAAGTACTCC CTGTTCGAGCTGGAAAACGGCCGGAAGAGA ATGCTGGCCTCTGCCGGCGAACTGCAGAAG GGAAACGAACTGGCCCTGCCCTCCAAATAT GTGAACTTCCTGTACCTGGCCAGCCACTAT GAGAAGCTGAAGGGCTCCCCCGAGGATAAT GAGCAGAAACAGCTGTTTGTGGAACAGCAC AAGCACTACCTGGACGAGATCATCGAGCAG ATCAGCGAGTTCTCCAAGAGAGTGATCCTG GCCGACGCTAATCTGGACAAAGTGCTGTCC GCCTACAACAAGCACCGGGATAAGCCCATC AGAGAGCAGGCCGAGAATATCATCCACCTG TTTACCCTGACCAATCTGGGAGCCCCTGCC GCCTTCAAGTACTTTGACACCACCATCGAC CGGAAGAGGTACACCAGCACCAAAGAGGTG CTGGACGCCACCCTGATCCACCAGAGCATC ACCGGCCTGTACGAGACACGGATCGACCTG TCTCAGCTGGGAGGTGACTCTGGAGGATCT AGCGGAGGATCCTCTGGCAGCGAGACACCA GGAACAAGCGAGTCAGCAACACCAGAGAGC TCTGGTAGCGAGACACCCGGTACCAGTGAA AGCGCCACGCCAGAAAGCAGTGGGAGTGAG ACTCCGGGTACATCTGAATCAGCGACACCG GAATCAAGTGGCGGCAGCAGCGGCGGCAGC AGCACCCTAAATATAGAAGATGAGTATCGG CTACATGAGACCTCAAAAGAGCCAGATGTT TCTCTAGGGTCCACATGGCTGTCTGATTTT CCTCAGGCCTGGGCGGAAACCGGGGGCATG GGACTGGCAGTTCGCCAAGCTCCTCTGATC ATACCTCTGAAAGCAACCTCTACCCCCGTG TCCATAAAACAATACCCCATGTCACAAGAA GCCAGACTGGGGATCAAGCCCCACATACAG AGACTGTTGGACCAGGGAATACTGGTACCC TGCCAGTCCCCCTGGAACACGCCCCTGCTA CCCGTTAAGAAACCAGGGACTAATGATTAT AGGCCTGTCCAGGATCTGAGAGAAGTCAAC AAGCGGGTGGAAGACATCCACCCCACCGTG CCCAACCCTTACAACCTCTTGAGCGGGCCC CCACCGTCCCACCAGTGGTACACTGTGCTT

GATTTAAAGGATGCCTTTTTCTGCCTGAGA CTCCACCCCACCAGTCAGCCTCTCTTCGCC TTTGAGTGGAGAGATCCAGAGATGGGAATC TCAGGACAATTGACCTGGACCAGACTCCCA CAGGGTTTCAAAAACAGTCCCACCCTGTTT AATGAGGCACTGCACAGAGACCTAGCAGAC TTCCGGATCCAGCACCCAGACTTGATCCTG CTACAGTACGTGGATGACTTACTGCTGGCC GCCACTTCTGAGCTAGACTGCCAACAAGGT ACTCGGGCCCTGTTACAAACCCTAGGGAAC CTCGGGTATCGGGCCTCGGCCAAGAAAGCC CAAATTTGCCAGAAACAGGTCAAGTATCTG GGGTATCTTCTAAAAGAGGGTCAGAGATGG CTGACTGAGGCCAGAAAAGAGACTGTGATG GGGCAGCCTACTCCGAAGACCCCTCGACAA CTAAGGGAGTTCCTAGGGAAGGCAGGCTTC TGTCGCCTCTTCATCCCTGGGTTTGCAGAA ATGGCAGCCCCCCTGTACCCTCTCACCAAA CCGGGGACTCTGTTTAATTGGGGCCCAGAC CAACAAAAGGCCTATCAAGAAATCAAGCAA GCTCTTCTAACTGCCCCAGCCCTGGGGTTG CCAGATTTGACTAAGCCCTTTGAACTCTTT GTCGACGAGAAGCAGGGCTACGCCAAAGGT GTCCTAACGCAAAAACTGGGACCTTGGCGT CGGCCGGTGGCCTACCTGTCCAAAAAGCTA GACCCAGTAGCAGCTGGGTGGCCCCCTTGC CTACGGATGGTAGCAGCCATTGCCGTACTG ACAAAGGATGCAGGCAAGCTAACCATGGGA CAGCCACTAGTCATTCTGGCCCCCCATGCA GTAGAGGCACTAGTCAAACAACCCCCCGAC CGCTGGCTTTCCAACGCCCGGATGACTCAC TATCAGGCCTTGCTTTTGGACACGGACCGG GTCCAGTTCGGACCGGTGGTAGCCCTGAAC CCGGCTACGCTGCTCCCACTGCCTGAGGAA GGGCTGCAACACAACTGCCTTGATGGGACA GGTGGCGGTGGTGTCACCGTCAAGTTCAAG TACAAGGGTGAGGAACTTGAAGTTGATATT AGCAAAATCAAGAAGGTTTGGCGCGTTGGT AAAATGATATCTTTTACTTATGACGACAAC GGCAAGACAGGTAGAGGGGCAGTGTCTGAG AAAGACGCCCCCAAGGAGCTGTTGCAAATG TTGGAAAAGTCTGGGAAAAAGTCTGGCGGC TCAAAAAGAACCGCCGACGGCAGCGAATTC GAGCCCAAGAAGAAGAGGAAAGTCGGAGGT GGCGGGAGCCCAAAAAAGAAAAGAAAAGTG TATCCCTATGATGTCCCCGATTATGCCGGT TCAAGAGCCCTGGTCGTGATTAGACTGAGC CGAGTGACAGACGCCACCACAAGTCCCGAG AGACAGCTGGAATCATGCCAGCAGCTCTGT GCTCAGCGGGGTTGGGATGTGGTCGGCGTG GCAGAGGATCTGGACGTGAGCGGGGCCGTC GATCCATTCGACAGAAAGAGGAGGCCCAAC CTGGCAAGATGGCTCGCTTTCGAGGAACAG CCCTTTGATGTGATCGTCGCCTACAGAGTG GACCGGCTGACCCGCTCAATTCGACATCTC CAGCAGCTGGTGCATTGGGCTGAGGACCAC AAGAAACTGGTGGTCAGCGCAACAGAAGCC CACTTCGATACTACCACACCTTTTGCCGCT GTGGTCATCGCACTGATGGGCACTGTGGCC CAGATGGAGCTCGAAGCTATCAAGGAGCGA AACAGGAGCGCAGCCCATTTCAATATTAGG GCCGGTAAATACAGAGGCTCCCTGCCCCCT TGGGGATATCTCCCTACCAGGGTGGATGGG GAGTGGAGACTGGTGCCAGACCCCGTCCAG 
AGAGAGCGGATTCTGGAAGTGTACCACAGA GTGGTCGATAACCACGAACCACTCCATCTG GTGGCACACGACCTGAATAGACGCGGCGTG CTCTCTCCAAAGGATTATTTTGCTCAGCTG CAGGGAAGAGAGCCACAGGGAAGAGAATGG AGTGCTACTGCACTGAAGAGATCTATGATC AGTGAGGCTATGCTGGGTTACGCAACACTC AATGGCAAAACTGTCCGGGACGATGACGGA GCCCCTCTGGTGAGGGCTGAGCCTATTCTC ACCAGAGAGCAGCTCGAAGCTCTGCGGGCA GAACTGGTCAAGACTAGTCGCGCCAAACCT GCCGTGAGCACCCCAAGCCTGCTCCTGAGG GTGCTGTTCTGCGCCGTCTGTGGAGAGCCA GCATACAAGTTTGCCGGCGGAGGGCGCAAA CATCCCCGCTATCGATGCAGGAGCATGGGG TTCCCTAAGCACTGTGGAAACGGGACAGTG GCCATGGCTGAGTGGGACGCCTTTTGCGAG GAACAGGTGCTGGATCTCCTGGGTGACGCT GAGCGGCTGGAAAAAGTGTGGGTGGCAGGA TCTGACTCCGCTGTGGAGCTGGCAGAAGTC AATGCCGAGCTCGTGGATCTGACTTCCCTC ATCGGATCTCCTGCATATAGAGCTGGGTCC CCACAGAGAGAAGCTCTGGACGCACGAATT GCTGCACTCGCTGCTAGACAGGAGGAACTG GAGGGCCTGGAGGCCAGGCCCTCTGGATGG GAGTGGCGAGAAACCGGACAGAGGTTTGGG GATTGGTGGAGGGAGCAGGACACCGCAGCC AAGAACACATGGCTGAGATCCATGAATGTC CGGCTCACATTCGACGTGCGCGGTGGCCTG ACTCGAACCATCGATTTTGGCGACCTGCAG GAGTATGAACAGCACCTGAGACTGGGGTCC GTGGTCGAAAGACTGCACACTGGGATGTCC SpCas9 DKKYSIGLDIGTNSVGWAVITDEYKVPSKK Amino acid FKVLGNTDRHSIKKNLIGALLFDSGETAEA SEQ ID NO: 378 TRLKRTARRRYTRRKNRICYLQEIFSNEMA KVDDSFFHRLEESFLVEEDKKHERHPIFGN IVDEVAYHEKYPTIYHLRKKLVDSTDKADL RLIYLALAHMIKFRGHFLIEGDLNPDNSDV DKLFIQLVQTYNQLFEENPINASGVDAKAI LSARLSKSRRLENLIAQLPGEKKNGLFGNL IALSLGLTPNFKSNFDLAEDAKLQLSKDTY DDDLDNLLAQIGDQYADLFLAAKNLSDAIL LSDILRVNTEITKAPLSASMIKRYDEHHQD LTLLKALVRQQLPEKYKEIFFDQSKNGYAG YIDGGASQEEFYKFIKPILEKMDGTEELLV KLNREDLLRKQRTFDNGSIPHQIHLGELHA ILRRQEDFYPFLKDNREKIEKILTFRIPYY VGPLARGNSRFAWMTRKSEETITPWNFEEV VDKGASAQSSFIERMTNFDKNLPNEKVLPK HSLLYEYFTVYNELTKVKYVTEGMRKPAFL SGEQKKAIVDLLFKTNRKVTVKQLKEDYFK KIECFDSVEISGVEDRFNASLGTYHDLLKI IKDKDFLDNEENEDILEDIVLTLTLFEDRE MIEERLKTYAHLFDDKVMKQLKRRRYTGWG RLSRKLINGIRDKQSGKTILDFLKSDGFAN RNFMQLIHDDSLTFKEDIQKAQVSGQGDSL HEHIANLAGSPAIKKGILQTVKVVDELVKV MGRHKPENIVIEMARENQTTQKGQKNSRER MKRIEEGIKELGSQILKEHPVENTQLQNEK LYLYYLQNGRDMYVDQELDINRLSDYDVDA IVPQSFLKDDSIDNKVLTRSDKNRGKSDNV PSEEVVKKMKNYWRQLLNAKLITQRKFDNL TKAERGGLSELDKAGFIKRQLVETRQITKH 
VAQILDSRMNTKYDENDKLIREVKVITLKS KLVSDFRKDFQFYKVREINNYHHAHDAYLN AVVGTALIKKYPKLESEFVYGDYKVYDVRK MIAKSEQEIGKATAKYFFYSNIMNFFKTEI TLANGEIRKRPLIETNGETGEIVWDKGRDF ATVRKVLSMPQVNIVKKTEVQTGGFSKESI LPKRNSDKLIARKKDWDPKKYGGFDSPTVA YSVLVVAKVEKGKSKKLKSVKELLGITIME RSSFEKNPIDFLEAKGYKEVKKDLIIKLPK YSLFELENGRKRMLASAGELQKGNELALPS KYVNFLYLASHYEKLKGSPEDNEQKQLFVE QHKHYLDEIIEQISEFSKRVILADANLDKV LSAYKHRDKPIREQAENIIHLFTLTNLGAP AAFKYFDTTIDRKRYTSTKEVLDATLIHQS ITGLYETRIDLSQLGGD RT(1-478)-Sto7d LNIEDEYRLHETSKEPDVSLGSTWLSDFPQ Amino acid AWAETGGMGLAVRQAPLIIPLKATSTPVSI SEQ ID NO: 379 KQYPMSQEARLGIKPHIQRLLDQGILVPCQ SPWNTPLLPVKKPGTNDYRPVQDLREVNKR VEDIHPTVPNPYNLLSGPPPSHQWYTVLDL KDAFFCLRLHPTSQPLFAFEWRDPEMGISG QLTWTRLPQGFKNSPTLFNEALHRDLADFR IQHPDLILLQYVDDLLLAATSELDCQQGTR ALLQTLGNLGYRASAKKAQICQKQVKYLGY LLKEGQRWLTEARKETVMGQPTPKTPRQLR EFLGKAGFCRLFIPGFAEMAAPLYPLTKPG TLFNWGPDQQKAYQEIKQALLTAPALGLPD LTKPFELFVDEKQGYAKGVLTQKLGPWRRP VAYLSKKLDPVAAGWPPCLRMVAAIAVLTK DAGKLTMGQPLVILAPHAVEALVKQPPDRW LSNARMTHYQALLLDTDRVQFGPVVALNPA TLLPLPEEGLQHNCLDGTGGGGVTVKFKYK GEELEVDISKIKKVWRVGKMISFTYDDNGK TGRGAVSEKDAPKELLQMLEKSGKKSGGSK RTADGS BxbINT SRALVVIRLSRVTDATTSPERQLESCQQLC Amino acid AQRGWDVVGVAEDLDVSGAVDPFDRKRRPN SEQ ID NO: 380 LARWLAFEEQPFDVIVAYRVDRLTRSIRHL QQLVHWAEDHKKLVVSATEAHFDTTTPFAA VVIALMGTVAQMELEAIKERNRSAAHFNIR AGKYRGSLPPWGYLPTRVDGEWRLVPDPVQ RERILEVYHRVVDNHEPLHLVAHDLNRRGV LSPKDYFAQLQGREPQGREWSATALKRSMI SEAMLGYATLNGKTVRDDDGAPLVRAEPIL TREQLEALRAELVKTSRAKPAVSTPSLLLR VLFCAVCGEPAYKFAGGGKHPPYRCRSMGF PKHCGNGTVAMAEWDAFCEEQVLDLLGDAE RLEKVWVAGSDSAVELAEVNAELVDLTSLI GSPAYRAGSPQREALDARIAALAARQEELE GLEARPSGWEWRETGQRFGDWWREQDTAAK NTWLRSMNVRLTFDVRGGLTRTIDFGDLQE YEQHLRLGSVVERLHTGMS

EXAMPLES

[0317] While several experimental Examples are contemplated, these Examples are intended to be non-limiting.

Example 1

CRE Integration Efficiency

[0318] The efficiency of Cre-mediated integration was tested. To test the efficacy of PASTE with GFP using the lox71/lox66/Cre recombinase system, a clonal HEK293FT cell line with a lox71 sequence (SEQ ID NO: 1) integrated into the genome using lentivirus was developed. Integration of GFP was tested by transfecting the modified HEK293FT cell line with: (1) plus/minus SEQ ID NO: 71 comprising a Cre recombinase expression plasmid, and (2) SEQ ID NO: 72 comprising a GFP template and a lox66 Cre site of SEQ ID NO: 2. After 72 hours, the percent integration of GFP into the lox71 site was probed. FIG. 3 shows the percent integration of GFP at the lentivirally integrated lox71 site in the HEK293FT cell line in the presence of various plasmids. It was observed that pCMV PE2 P2A Cre (SEQ ID NO: 73), a mammalian expression vector with the prime editing complex and Cre recombinase linked to PE2 via a cleavable linker or a non-cleavable linker, showed integration of GFP.

Example 2

Programmable Addition Via Site-Specific Targeting Elements (PASTE) with Cre Recombinase--Addition of Lox Site

[0319] The lox71 (SEQ ID NO: 1) or lox66 (SEQ ID NO: 2) sequence was inserted into the HEK293FT cell genome using prime editing to test integration of GFP into the HEK293FT genome. To insert the lox71 or lox66 sequence into the HEK293FT cell genome, a pegRNA with a PBS length of 13 base pairs operably linked to an RT region of varying lengths was used. The HEK293FT cells were transfected with the following plasmids: (1) a prime editing construct (PE2) or PE2 with conditional Cre expression, (2) a Lox71 or Lox66 pegRNA targeting the HEK3 locus, and (3) plus/minus a +90 HEK3 nicking second guide RNA targeting the HEK3 locus (+90 ngRNA). After 72 hours, the percent editing of the HEK293FT genome at the HEK3 locus was probed for incorporation of various lengths of lox71 or lox66 (see FIG. 4). It was observed that the 34 base pair lox71 (HEK3 locus guide, SEQ ID NO: 83; and Lox71 pegRNA with RT 34 and PBS 13, SEQ ID NO: 81) with +90 ngRNA (SEQ ID NO: 75) and the 34 base pair lox66 (HEK3 locus guide, SEQ ID NO: 83; and Lox66 pegRNA with RT 34 and PBS 13, SEQ ID NO: 82) with +90 ngRNA (SEQ ID NO: 75) had the highest percent editing.

Example 3

PASTE with Cre Recombinase--Integration of Gene

[0320] The lox71 or lox66 pegRNAs having a PBS length of 13 base pairs and an insert length of 34 base pairs were used to probe integration of GFP in the HEK293FT genome. The PE and Cre were delivered in inducible expression vectors and induced at day 2. The HEK293FT cells were transfected with the following plasmids: (1) a prime editing construct (PE2 or PE2 with conditional Cre expression); (2) a Lox71 pegRNA; (3) plus/minus a +90 HEK3 nicking guide RNA; and (4) an EGFP template with a Lox66 site. After 72 hours, the percent editing of the lox71 site and the percent integration of GFP were probed with or without the lox66 site in the presence of various PE/Cre constructs. FIG. 5A summarizes the percent editing of the lox71 site with different PE/Cre vectors. FIG. 5B summarizes the percent integration of GFP at the lox71 site in the HEK293FT cell genome. It was observed that although the lox71 site was edited in the presence of the inducible or non-inducible PE/Cre expression system, there was no GFP integration.

Example 4

Bxb1 Integration Data Lenti Reporter

[0321] The integration system was switched to an integrase system that could result in an integration of target genes into a genome with higher efficiency. Serine integrase Bxb1 has been shown to be more active than Cre recombinase and highly efficient in bacteria and mammalian cells for irreversible integration of target genes. FIG. 6 shows a schematic of PASTE methodology using Bxb1 (Merrick, C. A. et al., ACS Synth. Biol. 2018, 7, 299-310).

[0322] To probe the efficiency of the Bxb1 integration system, a clonal HEK293FT cell line with an attB Bxb1 site (SEQ ID NO: 3) integrated using lentivirus was developed. The modified HEK293FT cell line was then transfected with the following plasmids: (1) plus/minus a Bxb1 expression plasmid and (2) plus/minus a GFP (SEQ ID NO: 76) or Gluc (SEQ ID NO: 77) minicircle template with an attP Bxb1 site. After 72 hours, the integration of GFP or Gluc into the attB site in the HEK293FT genome was probed. The percent integrations of GFP or Gluc into the attB locus are shown in FIG. 7. It was observed that GFP and Gluc showed efficient integration into the attB site in HEK293FT cells.

Example 5

Addition of Bxb1 Site to Human Genome Using PRIME

[0323] The maximum length of attB that can be integrated into a HEK293FT cell line with the best efficiency was probed. To probe the best length of attB (SEQ ID NO: 3) or its reverse complement attP (SEQ ID NO: 4) for prime editing, pegRNAs having a PBS length of 13 nt with varying RT homology lengths were used. The following plasmids were transfected into HEK293FT cells: (1) a prime expression plasmid; (2) a HEK3 targeting pegRNA design; and (3) a HEK3 +90 nicking guide. After 72 hours, the percent integration of each of the attB constructs was probed. FIG. 8 shows the percent editing for each HEK3 targeting pegRNA. It was observed that attB with 44, 34 and 26 base pairs and the attB reverse complement with 34 and 26 base pairs showed the highest percent editing.
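The relationship between attB sites of different lengths can be illustrated with a minimal sketch. It assumes that shorter sites are obtained by trimming equal numbers of bases from both ends of a longer site, which holds for the 46 bp and 38 bp GT sites listed in the sequence tables (SEQ ID NOs: 210 and 242) but is not stated for the 44, 34 and 26 base pair constructs discussed above. The function name `truncate_symmetric` is hypothetical.

```python
# Sketch: symmetric truncation of a Bxb1 attB site around its centre.
# Assumption: shorter sites are made by trimming both ends equally.
ATTB_46_GT = "GGCCGGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCATCCGG"  # SEQ ID NO: 210
ATTB_38_GT = "GGCTTGTCGACGACGGCGGTCTCCGTCGTCAGGATCAT"          # SEQ ID NO: 242


def truncate_symmetric(site: str, length: int) -> str:
    """Trim the same number of bases from each end so the site centre stays fixed."""
    excess = len(site) - length
    if excess < 0 or excess % 2:
        raise ValueError("target length must be shorter by an even number of bases")
    trim = excess // 2
    return site[trim:len(site) - trim]


# The listed 38 bp site is recovered from the listed 46 bp site.
print(truncate_symmetric(ATTB_46_GT, 38) == ATTB_38_GT)

# Hypothetical shorter variants for illustration only; the constructs actually
# tested in FIG. 8 are not reproduced in this paragraph.
for length in (44, 34, 26):
    print(length, truncate_symmetric(ATTB_46_GT, length))
```

Keeping the centre fixed preserves the central dinucleotide in every truncated variant.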

[0324] PASTE integration was then tested by tagging cell-organelle marker proteins with GFP in HEK293FT cells. PASTE was used to tag SUPT16H, SRRM2, LMNB1, NOLC1 and DEPDC4 with GFP in different cell-culture wells and to test the usefulness of PASTE in tracking protein localization within the cells using microscopy. FIGS. 9A-9G show the fluorescence microscopy results for each of the organelles. SUPT16H-GFP was observed to be enriched in the nucleus, SRRM2-GFP was observed to be enriched in the nuclear speckles, LMNB1-GFP was observed to be enriched in the nuclear membrane, NOLC1-GFP was observed to be enriched in the fibrillar center, and DEPDC4-GFP was observed to be enriched in the aggresome.

[0325] The transfection of the plasmids can be achieved using electroporation as illustrated in FIGS. 10A-10B.

Example 6

Programmable Integration of Genes with PASTE

[0326] The efficiency of gene integration of Gluc or EGFP with PASTE was tested. To enable gene integration with PASTE, the following HEK3 targeting pegRNAs were used: (1) 44 pegRNA: PBS of 13 nt and RT homology of 44 nt; (2) 34 pegRNA: PBS of 13 nt and RT homology of 34 nt; and (3) 26 pegRNA: PBS of 13 nt and RT homology of 26 nt.

[0327] HEK293FT cells were transfected with the following plasmids: (1) Prime expression plasmid; (2) Bxb1 expression plasmid; (3) HEK3 targeting pegRNA design; (4) HEK3+90 nicking guide; and (5) EGFP or Gluc minicircle. After 72 hours, the percent integration of Gluc or EGFP was observed. FIG. 11 shows integration of EGFP and Gluc with each of the tested HEK3 targeting pegRNAs. It was observed that EGFP and Gluc were efficiently integrated using PASTE.

Example 7

PASTE for Integration of Multiple Genes

[0328] The PASTE technique for site-specific integration of multiple genes into a cell is facilitated by the use of orthogonal attB and attP sites. The central dinucleotide can be changed from GT to GA; GA-containing attB/attP sites interact only with each other and do not cross-react with GT-containing sequences. A screen of dinucleotide combinations can be performed to find orthogonal attB/attP pairs for multiplexed PASTE editing. It has been shown that many orthogonal dinucleotide combinations can be found using a Bxb1 reporter system.
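As a non-limiting sketch, changing the central dinucleotide of an attachment site can be expressed as a simple string substitution. The sequences are taken from the sequence listing (SEQ ID NOs: 3, 5 and 7). The function names are hypothetical, and the code assumes an even-length site whose dinucleotide sits exactly at the center, which holds for the listed sequences but is not asserted as a general rule.

```python
ATTB_GT_46 = "ggccggcttgtcgacgacggcggtctccgtcgtcaggatcatccgg"  # SEQ ID NO: 3
ATTB_TT_38 = "ggcttgtcgacgacggcgttctccgtcgtcaggatcat"          # SEQ ID NO: 5
ATTB_AA_38 = "ggcttgtcgacgacggcgaactccgtcgtcaggatcat"          # SEQ ID NO: 7


def _center(site: str) -> int:
    """Index of the first base of the central dinucleotide (assumed centered)."""
    if len(site) % 2:
        raise ValueError("expected an even-length attachment site")
    return len(site) // 2 - 1


def central_dinucleotide(site: str) -> str:
    """Return the two central bases of an even-length attachment site."""
    mid = _center(site)
    return site[mid:mid + 2]


def with_central_dinucleotide(site: str, dinuc: str) -> str:
    """Return a copy of `site` with its central dinucleotide replaced."""
    if len(dinuc) != 2:
        raise ValueError("dinucleotide must be exactly two bases")
    mid = _center(site)
    return site[:mid] + dinuc.lower() + site[mid + 2:]


if __name__ == "__main__":
    print(central_dinucleotide(ATTB_GT_46))
    print(with_central_dinucleotide(ATTB_GT_46, "GA"))
```

Applying the same substitution to the 38 nt TT variant with "aa" reproduces SEQ ID NO: 7, which is a convenient consistency check on the listing.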

[0329] To test this, attB.sup.GT and attB.sup.GA sites for Bxb1 were added at an ACTB site by prime editing. An EGFP-attP.sup.GT DNA minicircle and an mCherry-attP.sup.GA DNA minicircle were introduced to test the percent EGFP and mCherry editing in the presence or absence of Bxb1. The results of EGFP and mCherry editing are shown in FIGS. 14A-14B.

[0330] Orthogonal editing with the correctly matched GT-EGFP and GA-mCherry pairs was achieved, demonstrating the ability to perform multiplexed PASTE editing in cells.

[0331] Two genes were introduced in the same cell using multiplexed PASTE to tag two different genes in a single reaction. EGFP and mCherry were tagged into the loci of ACTB and NOLC1 in a x cell line, in a single reaction. Further, EGFP and mCherry were tagged into the loci of ACTB and LMNB1. The cells were visualized using fluorescence microscopy. FIGS. 15A-15B show the results of fluorescence microscopy for multiplexed PASTE.

[0332] The ability to multiplex with nine different attB and attP central dinucleotides (AA, GA, CA, AG, AC, CC, GT, CT and TT; SEQ ID NOs: 7, 8, 23, 24, 19, 20, 25, 26, 27, 28, 9, 10, 15, 16, 17, 18, 5 and 6) in a 9×9 cross of attB and attP was tested. The edits were probed using next-generation sequencing. The results of the 9×9 cross of attB and attP central dinucleotides (AA, GA, CA, AG, AC, CC, GT, CT and TT) are shown in FIG. 16A. Only orthogonal pairs of attB and attP show the highest edit percentage. This result is also shown in the heat-map of FIG. 16B.
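For illustration, the layout of such a 9×9 cross can be enumerated as follows. This sketch only lists the attB/attP combinations and flags those with identical central dinucleotides; it contains no measured editing values, and the function name is hypothetical.

```python
from itertools import product

# Central dinucleotides named in the 9x9 cross described above.
DINUCLEOTIDES = ["AA", "GA", "CA", "AG", "AC", "CC", "GT", "CT", "TT"]


def dinucleotide_cross(dinucs):
    """Return (attB, attP, matched) tuples for every attB x attP combination."""
    return [(b, p, b == p) for b, p in product(dinucs, repeat=2)]


if __name__ == "__main__":
    cross = dinucleotide_cross(DINUCLEOTIDES)
    matched = [(b, p) for b, p, same in cross if same]
    print(len(cross), "combinations;", len(matched), "with matching dinucleotides")
```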

Example 8

Integration of Albumin and CPS1 into Albumin Locus

[0333] Twelve pegRNAs, each comprising an albumin guide linked to PBS and reverse transcriptase sequences of variable length, were transfected into HEK293FT cells together with different nicking guide RNAs. The percent editing at the albumin locus was probed using next-generation sequencing. The results of prime editing at the albumin locus are shown in FIG. 17. It was observed that SEQ ID NO: 79 showed the highest percent edits with SERPINA1 and SEQ ID NO: 80 showed the highest percent edits with CPS1.

Example 9

Engineering T-Cells

[0334] In order to engineer CD8+ T-cells, the efficiency of PASTE delivery and editing in T-cells can be evaluated (FIG. 18). ACTB targeting pegRNA can be used to insert an integration site with an EGFP insertion template. To deliver the PASTE components to CD8+ T-cells, electroporation can be used along with an optimized electroporation protocol for unstimulated T-cells. As multiple plasmids may reduce the efficiency of electroporation, the consolidated PASTE components that use fewer vectors can be applied.

[0335] Five-vector, three-vector, and two-vector PASTE systems show that robust T-cell editing can be achieved, with maximal editing using the three-vector approach (FIG. 19). Further, expanded sets of electroporation conditions, including the overall plasmid amounts, cell numbers, and voltage/amperage protocol, can be tested. In addition, stimulation of T-cells may influence the efficiency of transduction and PASTE efficiency. Further, CD4+/CD8+ T cell mixtures stimulated with T-Activator CD3/CD28 ligands can have higher PASTE editing efficiency than unstimulated cells. In order to separate the efficiency of PASTE from the overall delivery rate, an mCherry expression cassette on PASTE vectors can be evaluated in order to sort successfully transfected T cells. Once optimized parameters are achieved, a panel of 10 insertion sites with PASTE in T cells, including the TRAC, IL2Rα, and PDCD1 loci, can be evaluated using different insertions (e.g., EGFP, BFP, and YFP), both in single and multiplexed editing contexts. A tested subset of relevant sites in HEK293FT achieved greater than 40% editing for EGFP insertion (FIG. 20). The PASTE efficiency at the TRAC locus with different TCR and CAR constructs can be evaluated. The T-cells can successfully be transfected to achieve insertion of CARs or TCRs.

Example 10

PASTE for CFTR

[0336] PASTE for the CFTR locus can be tested in HEK293FT cells to identify top performing pegRNA and nicking designs for human cells. Neuro-2A cells can also be tested to identify top performing pegRNA and nicking designs for mouse cells. The best constructs can be applied for testing in mouse air-liquid interface (ALI) organoids in vitro or for delivery in pre-clinical models of cystic fibrosis in mice. Table 12 shows the pegRNA, nicking guide and minicircle DNA characteristics for the CFTR gene modulation.

TABLE 12

Variables             Characteristics

pegRNA                38 bp shortened minimal attB and normal 46 bp attB sequence with:
                      a. PBS of 17, 13, and 9 nt in length, and
                      b. RT of 20, 15, and 10 nt in length

Nicking guides        Nicking guide 1: +64 bp
                      Nicking guide 2: +23 bp
                      Nicking guide 3: -60 bp
                      Nicking guide 4: -78 bp
                      (distance is calculated from cut site of pegRNA)

Minicircle template   A. CFTR coding sequence alone (~4,454 bp in size)
                      B. CFTR coding sequence plus 5' and 3' UTRs (~6,011 bp in size)
                      (Both minicircles have attP site on them for integration by Bxb1 and a bGH poly A signal)
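As one possible way to organize such a screen, the variables of Table 12 can be expanded into a list of candidate conditions. The sketch below assumes a full factorial design (every attB length with every PBS length, RT length, and nicking guide); the table does not state that all combinations are tested, and the dictionary keys and function names are hypothetical.

```python
from itertools import product

# Values transcribed from Table 12.
ATTB_LENGTHS_BP = (38, 46)
PBS_LENGTHS_NT = (17, 13, 9)
RT_LENGTHS_NT = (20, 15, 10)
# Offsets in bp relative to the pegRNA cut site.
NICKING_GUIDE_OFFSETS = {
    "Nicking guide 1": 64,
    "Nicking guide 2": 23,
    "Nicking guide 3": -60,
    "Nicking guide 4": -78,
}


def pegrna_designs():
    """Enumerate pegRNA designs (assumed full factorial over attB, PBS, RT)."""
    return [
        {"attB_bp": attb, "PBS_nt": pbs, "RT_nt": rt}
        for attb, pbs, rt in product(ATTB_LENGTHS_BP, PBS_LENGTHS_NT, RT_LENGTHS_NT)
    ]


def screen_conditions():
    """Pair every pegRNA design with every nicking guide."""
    return [
        {**design, "nicking_guide": name, "offset_bp": offset}
        for design in pegrna_designs()
        for name, offset in NICKING_GUIDE_OFFSETS.items()
    ]


if __name__ == "__main__":
    print(len(pegrna_designs()), "pegRNA designs")
    print(len(screen_conditions()), "pegRNA x nicking guide conditions")
```

Under this assumption the screen comprises 18 pegRNA designs and 72 pegRNA/nicking guide conditions per minicircle template.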

Example 11

AttB and EGFP Integration Using PASTE

[0337] The efficiency of the integration of attB and EGFP at the ACTB locus was evaluated (FIGS. 21A-21C). To investigate whether Bxb1 can add an EGFP template into this site, a delivery approach using a five-plasmid system expressing each of the following components was deployed: 1) pegRNA expression, 2) nicking guide expression, 3) Prime expression (Cas9-RT), 4) Bxb1 expression and 5) the insertion template (in this case EGFP). This approach was found to yield editing efficiency of the attB site of up to 24% and integration of EGFP of ~10% in HEK293FT cells as measured by sequencing (FIGS. 21A-21B). Optimal activity is achieved in 3-4 days, and the approach can be performed as a single-step transfection or electroporation of all components. Because the EGFP plasmid is designed as a minicircle, allowing removal of all undesired bacterial components, only the desired gene is inserted along with minimal scars from the Bxb1 recombined sites.

[0338] To make the tool simpler to use, the Bxb1 can be linked to Prime via a P2A linker to the Cas9-RT fusion, allowing for only a single plasmid to be used for PASTE protein expression rather than two. This optimization can maintain the same level of editing, making it easier to use the tool and deliver it (FIG. 21C).

Example 12

Programmable EGFP Integrations in Different Cell Types

[0339] The programmable EGFP integration in liver hepatocellular carcinoma cell line HEPG2 (FIG. 22A) and chronic myelogenous leukemia cell line K562 (FIG. 22B) was evaluated. EGFP integration at the ACTB locus in K562 and HEPG2 cells of about 15% was observed, demonstrating robustness of the platform across cell types.

Example 13

Mutagenesis of Bxb1 for Enhanced PASTE Activity

[0340] The mutagenesis of Bxb1 for enhanced PASTE activity was evaluated (FIGS. 23A-23C). Two levers for optimizing PASTE activity exist: 1) improving the activity of the integrase and 2) enhancing the Prime addition of the integration sequence. As illustrated in FIGS. 23A-23B, there is room to improve Bxb1 activity, as only about 30% of the Bxb1 attB sites that are added by PASTE go on to receive a Bxb1-mediated integration. This illustrates that if the Bxb1 efficiency can be improved, PASTE can be improved. Furthermore, catalytic residues in the Bxb1 integrase were identified via conservation and structural analyses, and Bxb1 mutants were generated to test as part of PASTE. As illustrated in FIG. 23B, the mutations can improve integration by about 20-30%.

Example 14

Effect of the pegRNA PBS and RT Lengths on the Prime Editing Integration Efficiency

[0341] The effect of the pegRNA PBS and RT lengths on the prime editing integration efficiency was evaluated (FIGS. 25A-25F). It was found that PASTE can be optimized by tuning the PBS and RT lengths at the ACTB locus to achieve editing rates up to about 20% (FIG. 25A). It was found that shortening the attB site can help improve PASTE function as Prime is better at inserting shorter sequences. Further optimization of PBS, RT, and attB lengths showed that optimal designs can be found for insertion upstream of the LMNB1, NOLC1, and GRSF1 loci (FIGS. 25B, 25C, and 25D). Lengths as short as 36 nt for attB were found to be still functional for integration into a reporter plasmid (FIGS. 25B and 25C). It was found that the reverse complemented version of the attB sequence was better integrated via Prime editing, suggesting that the sequence of what Prime is inserting matters. EGFP integrations with attP site mutants showed that certain mutants can improve integration efficiency significantly (FIG. 25E). PASTE was also performed with a large panel of genes, inserting EGFP at the N-terminus of ACTB, LMNB1, SUPT16H, SRRM2, NOLC1, KLHL15, GRSF1, DEPDC4, NES, PGM1, CLTA, BASP1, and DNAJC18 (FIG. 25F). Editing rates of about 5%-40% were measured using droplet digital PCR (ddPCR).

Example 15

Comparison of PASTE and HITI On-Target and Off-Target Activities

[0342] The PASTE and HITI on-target and off-target activities were compared (FIGS. 26A-26F). PASTE and HITI were found to have about 22% and 5% integration efficiencies, respectively, when using the same guide sequence (FIGS. 26A and 26B). PASTE was found to outperform HITI at most sites when analyzing the editing of 14 genes (FIG. 26C). Using a ddPCR-based approach, it was found that PASTE was very specific, with minimal off-target activity for Bxb1 off-target integrations (FIG. 26D) and Cas9 off-target integrations (FIG. 26E). The analysis of inserts of different sizes showed that PASTE can reliably insert sequences 1 kb-10 kb in size (FIG. 26F), revealing the wide range of sequence sizes PASTE is capable of working with. A decrease in insertion efficiency at larger sizes was also observed, which was likely due to the reduction in plasmid delivery to HEK293FT cells at larger plasmid sizes.

Example 16

Multiplexing with PASTE and Orthogonal Di-Nucleotide attB and attP Sites

[0343] Multiplexing with PASTE and orthogonal di-nucleotide attB and attP sites was evaluated (FIGS. 28A-28C). Multiple orthogonal combinations were found for mutants of the central di-nucleotide motif (FIGS. 28A and 28B). As illustrated in FIG. 28C, programmable multiplexed gene insertion can be achieved by using these orthogonal combinations with PASTE only delivering different pegRNAs and gene inserts while keeping the protein components the same (FIG. 8C).

Example 17

PASTE Multiplexed Integrations at Endogenous Sites

[0344] PASTE multiplexed integrations at endogenous sites were evaluated (FIGS. 28A-28G). A reading frame was identified for the attR scar left post-integration by Bxb1 that is ideal for a protein linker due to the enrichment of glycines, serines, and prolines in the sequence (GLSGQPPRSPSSGSSG (SEQ ID NO: 426)). PegRNAs were designed using this linker frame for the resolution of the attR for tagging a number of genes at the N-terminus with EGFP (ACTB, NOLC1, LMNB1, SUPT16H, SRRM2, and DEPDC4). As these genes all have distinct protein localization appearances, microscopy can be used for ascertaining proper gene tagging. PASTE was found to be capable of high-efficiency gene tagging with protein localizations that match the reference images and expected localization of the proteins in the cells (FIGS. 28A-28C). Genes were also tagged in multiplexed fashion to demonstrate the orthogonality of the engineered integration sites. ACTB, LMNB1, NOLC1, and GRSF1 were targeted with orthogonal pegRNAs carrying GT, TG, AC, and CA, respectively, in HEK293FT in groups of single, dual-plexing, and triple-plexing (FIGS. 28D-28E). These dinucleotides were paired with templates carrying EGFP, BFP, and mCherry to allow for multicolor imaging of these labeled genes. The efficiencies of integration for these multiplexing experiments were found to range from about 5%-32%, revealing efficient multiplex integration with PASTE. Using confocal microscopy of these multiplexed integration experiments, cells were found with simultaneous labeling of these different proteins (FIGS. 28F-28G).
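The glycine, serine, and proline enrichment of the attR-derived linker can be quantified directly from its sequence. The following is a minimal sketch using only the peptide of SEQ ID NO: 426 as given above; the function name is hypothetical.

```python
from collections import Counter

ATTR_LINKER = "GLSGQPPRSPSSGSSG"  # SEQ ID NO: 426


def residue_fraction(peptide: str, residues: str = "GSP") -> float:
    """Fraction of positions in `peptide` occupied by any of `residues`."""
    counts = Counter(peptide)
    return sum(counts[r] for r in residues) / len(peptide)


if __name__ == "__main__":
    print(Counter(ATTR_LINKER))
    print(f"G/S/P fraction: {residue_fraction(ATTR_LINKER):.4f}")
```

For this 16-residue linker, 13 of 16 positions (about 81%) are glycine, serine, or proline.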

Example 18

Combination of CRISPR-Based Genome Editing and Site-Specific Integration

[0345] The combination of CRISPR-based genome editing and site-specific integration was evaluated.

[0346] PegRNAs containing different attB length truncations were assessed (FIG. 29A). Prime editing was found to be capable of inserting sequences up to 56 bp at the beta-actin (ACTB) gene locus, with higher efficiency at lengths below 31 bp (FIGS. 29A-B). The integration of cognate landing sites was tested for multiple insertion enzymes: Bxb1, TP901, and phiBT1 phage serine integrases and Cre recombinase. Prime editing successfully inserted all landing sites tested, with efficiencies between 10-30% (FIGS. 29C-D). To test the complete system, all components were combined and delivered in a single transfection: the prime editing vector, the landing site containing pegRNA, a nicking guide for stimulating prime editing, a mammalian expression vector for the corresponding integrase or recombinase and a 969 bp minicircle DNA cargo encoding green fluorescent protein (GFP) (FIG. 29E). GFP integration rates among the four integrases and recombinases were compared and Bxb1 integrase was found to have the highest integration rate (~20%) at the targeted ACTB locus and to require the prime editing nicking guide for optimal performance (FIGS. 29F-H). Finally, to reduce the number of transfected components, Bxb1 was co-expressed with the SpCas9-M-MLV reverse transcriptase (PE2) fusion protein via a P2A protein cleavage site. This combination maintained high GFP insertion efficiency, up to 30% (FIG. 29E). The complete system, PASTE, achieved precise integration of templates as large as 9,500 bp with greater than 10% integration efficiency (FIGS. 29J-K and 26E), with complete integration of the full-length cargo confirmed by Sanger sequencing (FIGS. 30A-E).

Example 19

Impact of Prime Editing and Integrase Parameters on PRIME Editing

[0347] The impact of prime editing and integrase parameters on the integration efficiency of PRIME editing was assessed.

[0348] Relevant pegRNA parameters for PASTE include the primer binding site (PBS), reverse transcription template (RT), and attB site lengths, as well as the relative locations and efficacy of the pegRNA spacer and nicking guide (FIG. 31A). A range of PBS and RT lengths were tested at two loci, ACTB and lamin B1 (LMNB1), and rules governing efficiency were found to vary between loci, with shorter PBS lengths and longer RT designs having higher editing at the ACTB locus (FIG. 31B) and longer PBS and shorter RT designs performing better at LMNB1 (FIG. 31C).

[0349] The length of the attB landing site must balance two conflicting factors: the higher efficiency of prime editing for smaller inserts and reduced efficiency of Bxb1 integration at shorter attB lengths. AttB lengths were evaluated at ACTB, LMNB1, and nucleolar phosphoprotein p130 (NOLC1), and the optimal attB length was found to be locus dependent. At the ACTB locus, long attB lengths could be inserted by prime editing (FIG. 29B) and overall PASTE efficiencies for the insertion of GFP were highest for long attB lengths (FIG. 31D). In contrast, intermediate attB lengths had higher overall integration efficiencies (>20%) at LMNB1 (FIG. 31E) and NOLC1 (FIG. 31F), indicating that the increased efficiency of installing shorter attB sequences overcame the reduction of Bxb1 integration at these sites.

[0350] The PE3 version of prime editing combines PE2 and an additional nicking guide to bias resolution of the flap intermediate towards insertion. To test the importance of nicking guide selection on PASTE editing, editing at ACTB and LMNB1 loci was tested with two nicking guide positions. Suboptimal nicking guide positions were found to reduce the PASTE efficiency by up to 30% (FIG. 32A), in agreement with the 75% reduction of PASTE efficiency in the absence of a nicking guide (FIG. 29G). The pegRNA spacer sequence was found to be necessary for PASTE editing, and substitution of the spacer sequence with a non-targeting guide was found to eliminate editing (FIG. 32B).

[0351] Rational mutations were also introduced in both the Bxb1 integrase and reverse transcriptase domain of the PE2 construct to optimize PASTE further. While some of these mutations were well tolerated by PASTE (FIGS. 33A-B), none of them improved PASTE editing efficiency.

[0352] Short RT and PBS lengths can offer additional improvements for editing. A panel of shorter RT and PBS guides was tested at ACTB and LMNB1 loci. While shorter RT and PBS sequences did not increase editing at ACTB (FIG. 31G), they improved editing at LMNB1 (FIG. 31H), with the best performing guides reaching GFP insertion rates of ~40% (FIG. 31I).

Example 20

PASTE Tagging at Multiple Endogenous Genes

[0353] GFP insertion efficiency was measured at seven different gene loci--ACTB, SUPT16H, SRRM2, NOLC1, DEPDC4, NES, and LMNB1--to test the versatility of the PASTE programming. A range of integration rates up to 22% was found (FIG. 34A). Because PASTE does not require homology or sequence similarity on cargo plasmids, integration of diverse cargo sequences is modular and easily scaled across different loci. Six different gene cargos, varying in size from 969 bp to 4906 bp, were tested for insertion at ACTB and LMNB1 loci with PASTE. Integration frequencies between 5% and 22%, depending on the gene and insertion locus, were found (FIGS. 34B and 35). Additionally, a panel of seven common therapeutic genes, CEP290, OTC, HBB, PAH, GBA, BTK, and ADA, was evaluated for insertion at the ACTB locus, and these cargos were found to integrate efficiently, at rates between 5% and 20% (FIG. 34C).

[0354] The ability of PASTE to make precise insertions for in-frame protein tagging or for expressing cargo without disruption of endogenous gene expression was assessed. As Bxb1 leaves residual sequences in the genome (termed attL and attR) after cargo integration, these genomic scars can serve as protein linkers. The frame of the attR sequence was positioned through strategic placement of the attP on the minicircle cargo, achieving a suitable protein linker, GGLSGQPPRSPSSGSSG (SEQ ID NO: 427). Using this linker, four genes (ACTB, SRRM2, NOLC1, and LMNB1) were tagged with GFP using PASTE. To assess correct gene tagging, the subcellular location of GFP was compared with the tagged gene product by immunofluorescence. For all four targeted loci, GFP co-localized with the tagged gene product, indicating successful tagging (FIGS. 34D-E).

Example 21

Orthogonal Sequence Preferences for Bxb1 Integration

[0355] The central dinucleotide of Bxb1 is involved in the association of attB and attP sites for integration, and changing the matched central dinucleotide sequences can modify integrase activity and provide orthogonality for insertion of two genes. Expanding the set of attB/attP dinucleotides can enable multiplexed gene insertion with PASTE. The efficiency of GFP integration at the ACTB locus with PASTE across all 16 dinucleotide attB/attP sequence pairs was profiled to find optimal attB/attP dinucleotides for PASTE insertion. Several dinucleotides with integration efficiencies greater than the wild-type GT sequence were found (FIG. 36A). A majority of dinucleotides had 75% editing efficiency or greater compared to wild-type attB/attP efficiency, implying that these dinucleotides can be orthogonal channels for multiplexed gene insertion with PASTE.

[0356] The specificity of matched and unmatched attB/attP dinucleotide interactions was then assessed. The interactions between all dinucleotide combinations were profiled in a scalable fashion using a pooled assay to compare attB/attP integration (FIG. 36B). By barcoding 16 attP dinucleotide plasmids with unique identifiers, co-transfecting this attP pool with the Bxb1 integrase expression vector and a single attB dinucleotide acceptor plasmid, and sequencing the resulting integration products, the relative integration efficiencies of all possible attB/attP pairs were measured (FIG. 36C). Dinucleotide specificity was found to vary, with some dinucleotides (GG) exhibiting strong self-interaction with negligible crosstalk, and others (AA) showing minimal self-preference. Sequence logos of attP preferences (FIG. 37) revealed that dinucleotides with C or G in the first position have stronger preferences for attB dinucleotide sequences with shared first bases, while other attP dinucleotides, especially those with an A in the first position, have reduced specificity for the first attB base.
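One way the readout of such a pooled assay could be summarized is sketched below. For a single attB acceptor, barcode read counts for each attP dinucleotide are normalized to fractions of all integration reads. The normalization scheme, the function names, and the read counts are illustrative assumptions; the counts are invented for the example and are not data from the disclosure.

```python
from itertools import product

# All 16 possible central dinucleotides.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def relative_efficiencies(barcode_counts):
    """Normalize attP barcode read counts for one attB acceptor to fractions."""
    total = sum(barcode_counts.values())
    if total == 0:
        return {dinuc: 0.0 for dinuc in barcode_counts}
    return {dinuc: n / total for dinuc, n in barcode_counts.items()}


def self_preference(attb_dinuc, barcode_counts):
    """Fraction of integration reads carrying the attP matching `attb_dinuc`."""
    return relative_efficiencies(barcode_counts).get(attb_dinuc, 0.0)


if __name__ == "__main__":
    # Invented counts for a hypothetical GG attB acceptor (not measured data).
    example_counts = {dinuc: 10 for dinuc in DINUCLEOTIDES}
    example_counts["GG"] = 850
    print(f"GG self-preference: {self_preference('GG', example_counts):.2f}")
```

Repeating the calculation for each of the 16 attB acceptors would give a 16×16 matrix of relative efficiencies of the kind that a heat-map could display.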

[0357] GA, AG, AC, and CT dinucleotide pegRNAs were then tested for GFP integration at ACTB, either paired with their corresponding attP cargo or mispaired with the other three dinucleotide attP sequences. All four of the tested dinucleotides were found to integrate cargo efficiently only when the attB was paired with the corresponding attP, with no detectable integration across mispaired combinations (FIG. 36D).

Example 22

Multiplex Gene Integration with PASTE

[0358] Multiplexing in cells by using orthogonal pegRNAs that direct a matched attP cargo to a specific site in the genome was assessed (FIG. 38A). The three top dinucleotide attachment site pairs (CT, AG, and GA) were selected, and pegRNAs that target ACTB (CT), LMNB1 (AG), and NOLC1 (GA) and corresponding minicircle cargo containing GFP (CT), mCherry (AG), and YFP (GA) were designed. Upon co-delivering these reagents to cells, single-plex, dual-plex, and triple-plex editing of all possible combinations of these pegRNAs and cargo was achieved, with integration in the range of 5%-25% (FIG. 38B).
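The set of single-plex, dual-plex, and triple-plex conditions implied by three orthogonal channels can be enumerated as shown in the following sketch. The dinucleotide, locus, and cargo assignments are taken from the paragraph above; the data structure and function name are hypothetical.

```python
from itertools import combinations

# Dinucleotide channel -> (target locus, minicircle cargo), as described above.
CHANNELS = {
    "CT": ("ACTB", "GFP"),
    "AG": ("LMNB1", "mCherry"),
    "GA": ("NOLC1", "YFP"),
}


def multiplex_conditions(channels):
    """List every non-empty combination of channels, grouped by plexity."""
    keys = sorted(channels)
    return [
        combo
        for plexity in range(1, len(keys) + 1)
        for combo in combinations(keys, plexity)
    ]


if __name__ == "__main__":
    for combo in multiplex_conditions(CHANNELS):
        targets = ", ".join(f"{CHANNELS[d][0]}:{CHANNELS[d][1]}" for d in combo)
        print(f"{len(combo)}-plex ({'+'.join(combo)}): {targets}")
```

Three channels give three single-plex, three dual-plex, and one triple-plex condition, seven in total.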

[0359] An application for multiplexed gene integration is for labeling different proteins to visualize intracellular localization and interactions within the same cell. PASTE was used to simultaneously tag ACTB (GFP) and NOLC1 (mCherry) or ACTB (GFP) and LMNB1 (mCherry) in the same cell. No overlap of GFP and mCherry fluorescence was observed and tagged genes were confirmed to be visible in their appropriate cellular compartments, based on the known subcellular localizations of the ACTB, NOLC1 and LMNB1 protein products (FIGS. 15A-B).

Example 23

PASTE Efficiencies Compared with DSB-Based Insertion Methods

[0360] PASTE efficiencies were found to exceed comparable DSB-based insertion methods.

[0361] PASTE editing was assessed alongside DSB-dependent gene integration using either NHEJ (i.e., homology-independent targeted integration, HITI) or HDR pathways. PASTE had equivalent or better gene insertion efficiencies than either HITI (FIGS. 39A-B) or HDR (FIGS. 39C-D). On a panel of 7 different endogenous targets, PASTE exceeded HITI editing at 6 out of 7 genes, with similar efficiency for the 7th gene (FIG. 39A). As DSB generation can lead to insertions or deletions (indels) as an alternative and undesired editing outcome, the indel frequency of all three methods was assessed by next-generation sequencing, finding significantly fewer indels generated with PASTE than either HDR or HITI in both HEK293FT and HepG2 cells (FIGS. 39B, 39D and 40A), showcasing the high purity of gene integration outcomes with PASTE.

Example 24

Off-Target Characterization of PASTE and HITI Gene Integration

[0362] Off-target editing is a key consideration for genome editing technologies. The specificity of PASTE at specific sites was assessed based on off-targets generated by Bxb1 integration into pseudo-attB sites in the human genome and off-targets generated via guide- and Cas9-dependent editing in the human genome (FIG. 39E). While Bxb1 lacks documented integration into the human genome at pseudo-attachment sites, potential sites with partial similarity to the natural Bxb1 attB core sequence were computationally identified. Bxb1 integration across these sites was tested by ddPCR and no off-target activity was found (FIGS. 39F and 40B-D). To assay Cas9 off-targets for the ACTB pegRNA, two potential off-target sites were identified via computational prediction and no off-target integration for PASTE was found (FIGS. 39G and 40A-D), but substantial off-target activity by HITI at one of the sites was found (FIGS. 39H and 40A-D).

[0363] Genome-wide off-targets due to either Cas9 or Bxb1 through tagging and PCR amplification of insert-genomic junctions were additionally assessed (FIG. 39I). Single cell clones were isolated for conditions with PASTE editing and negative controls missing PE2, and deep sequencing of insert genomic junctions from these clones showed all reads aligning to the on-target ACTB site, confirming no off-target genomic insertions (FIGS. 39J-L).

[0364] Expression of reverse transcriptases and integrases involved in PASTE can have detrimental effects on cellular health. The complete PASTE system, the corresponding guides and cargo with only PE2, and the corresponding guides and cargo with only Bxb1 were transfected and compared to both GFP control transfections and guides without protein expression via transcriptome-wide RNA sequencing to determine the extent of these effects. While Bxb1 expression in the absence of Prime editing was found to have several significant off targets, the complete PASTE system had only one differentially regulated gene with more than a 1.5-fold change (FIGS. 41A-B). Genes upregulated by Bxb1 overexpression included stress response genes, such as TENT5C and DDIT3, but these changes were not seen in the expression of the PASTE system (FIG. 41C), potentially due to the decreased expression of Bxb1 from the P2A linker on the PASTE construct.

Example 25

PASTE Efficiency in Non-Dividing Cells

[0365] PASTE activity in non-dividing cells was assessed. Cas9 and HDR templates or PASTE were transfected into HEK293FT cells and cell division was arrested via aphidicolin treatment (FIG. 42A). In this model of blocked cell division, PASTE was found to maintain a GFP gene integration activity greater than 20% at the ACTB locus whereas HDR-mediated integration was abolished (FIGS. 42B and 43A).

Example 26

Production and Secretion of Therapeutic Transgene

[0366] PASTE with larger transgenes and in additional cell lines was assessed.

[0367] To evaluate the size limits for therapeutic transgenes, insertion of cargos up to 13.3 kb in length in both dividing and aphidicolin-treated cells was assessed. Insertion efficiency greater than 10% was found (FIG. 42C), enabling insertion of ~99.7% of all full-length human cDNA transgenes. To overcome the reduced delivery of large inserts to cells, larger DNA amounts of insert were delivered, which was found to significantly improve gene integration efficiency (FIG. 43B). PASTE editing in additional cell types, namely the K562 lymphoblast line and primary human T cells, was also assessed. Both PE2-P2A-Bxb1 (PASTE) and separate delivery of PE2 and Bxb1 were found to result in efficient editing in both cell types (FIGS. 42D-E). Lastly, as therapeutic delivery of PASTE in vivo might require viral delivery of the DNA cargo, whether AAV could deliver an attP-containing payload that could be integrated into the genome via Bxb1 was evaluated. Targeting the ACTB locus, AAV was found to be capable of delivering the appropriate template for integrase-mediated insertion with rates up to 4% in a dose-dependent fashion (FIGS. 42F and 43C).

[0368] To improve the efficiency of PASTE, PE2* NLS was incorporated for prime editing and improved PASTE integration at multiple loci was found (FIG. 44A). Furthermore, PE2* resulted in more robust integration at lower titrations of cargo plasmid, demonstrating integration at amounts as low as 8 ng of plasmid (FIG. 44B). To combat reductions in PASTE efficiency due to incomplete plasmid delivery, a puromycin resistance gene was co-delivered and found to increase the PASTE efficiency in the presence of drug selection (FIG. 45).

[0369] Programmable gene integration provides a modality for expression of therapeutic protein products, and protein production was assessed for therapeutically relevant proteins Alpha-1 antitrypsin (encoded by SERPINA1) and Carbamoyl phosphate synthetase I (encoded by CPS1), involved in the diseases Alpha-1 antitrypsin deficiency and CPS1 deficiency, respectively. By tagging gene products with the luminescent protein subunit HiBiT, the transgene production and secretion were assessed independently in response to PASTE treatment (FIG. 42G). PASTE was transfected with SERPINA1 or CPS1 cargo in HEK293FT cells and a human hepatocellular carcinoma cell line (HepG2) and efficient integration at the ACTB locus was found (FIG. 42H-I). This integration resulted in robust protein expression, intracellular accumulation of transgene products (FIGS. 42J and 46A-B), and secretion of proteins into the media (FIG. 42K).

Example 27

Optimized PASTE Constructs

[0370] To optimize complex activity, a panel of protein modifications was screened, including alternative reverse transcriptase fusions and mutations, various linkers between the reverse transcriptase domain and integrase and between the Cas9 and reverse transcriptase domain, and reverse transcriptase and BxbINT domain mutants (FIG. 47A and FIG. 49C-FIG. 49F). A number of protein modifications, including a 48 residue XTEN linker between the Cas9 and reverse transcriptase and the fusion of MMuLV to the Sto7d DNA binding domain (Oscorbin et al. FEBS Lett. 594. 4338-4356. 2020), improved editing efficiency (FIG. 47A and FIG. 49C-FIG. 49D). When these top modifications were combined with a GGGGS linker (SEQ ID NO: 420) between the reverse transcriptase-Sto7d domain and the BxbINT, they produced ~55% gene integration, highlighting the importance of directly recruiting the integrase to the target site (FIG. 47A). This optimized construct was referred to as SpCas9-(XTEN-48)-RT-Sto7d-(GGGGS)-BxbINT. The optimized construct achieved precise integration of templates as large as 36,000 bp with ~20% integration efficiency (FIG. 47A), with complete integration of the full-length cargo confirmed by Sanger sequencing.

[0371] Additionally, pegRNAs containing different AttB length truncations were tested, and prime editing was found to be capable of inserting sequences up to 56 bp at the beta-actin (ACTB) gene locus, with higher efficiency at lengths below 31 bp (FIG. 48A-FIG. 48B). A panel of multiple enzymes was evaluated, including Bxb1 (i.e., BxbINT), TP901 (i.e., Tp9INT), and phiBT1 (i.e., Bt1INT) phage serine integrases. Prime editing successfully inserted all landing sites tested, with efficiencies between 10-30% (FIG. 48C-FIG. 48D).

Example 28

Viral Delivery & In Vivo Editing

[0372] In order to package the complete PASTE system in viral vectors, an AdV vector was utilized (FIG. 50B). Adenovirus was evaluated for whether it could deliver a suitable template for BxbINT-mediated insertion along with plasmids for SpCas9-RT-BxbINT and guide expression, or whether AdV could deliver guides and BxbINT with plasmid delivery of SpCas9-RT. Integration of 10-20% of the ~36 kb adenovirus genome carrying EGFP was achieved in HEK293FT and HepG2 cells (FIG. 50C). Upon packaging and delivering the cargo and PASTE system components across 3 AdV vectors, the complete PASTE system (Cas9-reverse transcriptase, integrase and guide RNAs, or cargo) could be substituted by adenoviral delivery, with integration of up to ~50-60% with viral-only delivery in HEK293FT and HepG2 cells (FIG. 50D).

[0373] To further demonstrate that PASTE would be amenable to in vivo delivery, an mRNA version of the PASTE protein components was developed, as well as a chemically modified synthetic atgRNA and nicking guide against the LMNB1 target (FIG. 50E). Electroporation of the mRNA and guides, along with delivery of the template via adenovirus or plasmid, yielded high-efficiency integration of up to ~23% (FIG. 50E-FIG. 50F). Because more sustained BxbINT expression could allow for integration into newly placed AttB sites in the genome, circular mRNA expression was tested and found to boost the efficiency of integration to ~30% (FIG. 50G-FIG. 50I).

Example 29

Simultaneous Deletion & Insertion with PASTE

[0374] The PASTE system was used to simultaneously delete one sequence and insert another. Deletions of 130 bp and 385 bp of the first exon of LMNB1 were performed with combined insertion of an AttB nucleic acid sequence (FIG. 51A). These data show that it is possible to replace a DNA sequence using the PASTE system.

[0375] A 130 bp deletion of the first exon of LMNB1 with combined insertion of a 967 bp cargo using the PASTE system was also performed.

[0376] One of two AttP sequences was inserted using the minicircle template that carries a mutated AttP, as described above. This AttP mutant shows better integration kinetics and efficiency, especially for the shorter AttBs (38-44 bp). The LMNB1 AttB used in this experiment is 38 bp (FIG. 51B).
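At the sequence level, the outcome of combining a deletion with an AttB insertion can be sketched as a simple string replacement. The following is only a toy model under stated assumptions: the locus and the 38 bp insert are placeholder strings rather than the real LMNB1 or AttB sequences, the function names are hypothetical, and the sketch says nothing about the editing mechanism itself.

```python
def replace_region(locus: str, start: int, deletion_len: int, insert: str) -> str:
    """Delete `deletion_len` bases beginning at `start` and put `insert` there."""
    if start < 0 or deletion_len < 0 or start + deletion_len > len(locus):
        raise ValueError("deletion window falls outside the locus")
    return locus[:start] + insert + locus[start + deletion_len:]


def net_length_change(deletion_len: int, insert_len: int) -> int:
    """Change in locus length after the combined deletion and insertion."""
    return insert_len - deletion_len


if __name__ == "__main__":
    toy_locus = "a" * 100 + "c" * 385 + "g" * 100  # placeholder, not LMNB1
    toy_attb = "n" * 38                            # placeholder 38 bp AttB

    for deletion in (130, 385):
        edited = replace_region(toy_locus, 100, deletion, toy_attb)
        print(deletion, len(edited) - len(toy_locus))
```

With a 38 bp AttB, the 130 bp and 385 bp deletions described above would shorten the locus by 92 bp and 347 bp, respectively, before any cargo is integrated at the new AttB site.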

Sequence CWU 1

1

431134DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideLox71 1ataacttcgt ataatgtatg ctatacgaac ggta 34234DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideLox66 2taccgttcgt ataatgtatg ctatacgaag ttat 34346DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB 3ggccggcttg tcgacgacgg cggtctccgt cgtcaggatc atccgg 46446DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP 4ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcc 46538DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-TT 5ggcttgtcga cgacggcgtt ctccgtcgtc aggatcat 38652DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-TT 6gtggtttgtc tggtcaacca ccgcgttctc agtggtgtac ggtacaaacc ca 52738DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-AA 7ggcttgtcga cgacggcgaa ctccgtcgtc aggatcat 38852DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-AA 8gtggtttgtc tggtcaacca ccgcgaactc agtggtgtac ggtacaaacc ca 52938DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-CC 9ggcttgtcga cgacggcgcc ctccgtcgtc aggatcat 381052DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-CC 10gtggtttgtc tggtcaacca ccgcgccctc agtggtgtac ggtacaaacc ca 521138DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-GG 11ggcttgtcga cgacggcggg ctccgtcgtc aggatcat 381252DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-GG 12gtggtttgtc tggtcaacca ccgcgggctc agtggtgtac ggtacaaacc ca 521338DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-TG 13ggcttgtcga cgacggcgtg ctccgtcgtc aggatcat 381452DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-TG 14gtggtttgtc tggtcaacca ccgcgtgctc 
agtggtgtac ggtacaaacc ca 521538DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-GT 15ggcttgtcga cgacggcggt ctccgtcgtc aggatcat 381652DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-GT 16gtggtttgtc tggtcaacca ccgcggtctc agtggtgtac ggtacaaacc ca 521738DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-CT 17ggcttgtcga cgacggcgct ctccgtcgtc aggatcat 381852DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-CT 18gtggtttgtc tggtcaacca ccgcgctctc agtggtgtac ggtacaaacc ca 521938DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-CA 19ggcttgtcga cgacggcgca ctccgtcgtc aggatcat 382052DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-CA 20gtggtttgtc tggtcaacca ccgcgcactc agtggtgtac ggtacaaacc ca 522138DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-TC 21ggcttgtcga cgacggcgtc ctccgtcgtc aggatcat 382252DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-TC 22gtggtttgtc tggtcaacca ccgcgtcctc agtggtgtac ggtacaaacc ca 522338DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-GA 23ggcttgtcga cgacggcgga ctccgtcgtc aggatcat 382452DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-GA 24gtggtttgtc tggtcaacca ccgcggactc agtggtgtac ggtacaaacc ca 522538DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-AG 25ggcttgtcga cgacggcgag ctccgtcgtc aggatcat 382652DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-AG 26gtggtttgtc tggtcaacca ccgcgagctc agtggtgtac ggtacaaacc ca 522738DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-AC 27ggcttgtcga cgacggcgac ctccgtcgtc aggatcat 382852DNAArtificial SequenceDescription of Artificial 
Sequence Synthetic oligonucleotideAttP-AC 28gtggtttgtc tggtcaacca ccgcgacctc agtggtgtac ggtacaaacc ca 522938DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-AT 29ggcttgtcga cgacggcgat ctccgtcgtc aggatcat 383052DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-AT 30gtggtttgtc tggtcaacca ccgcgatctc agtggtgtac ggtacaaacc ca 523138DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-GC 31ggcttgtcga cgacggcggc ctccgtcgtc aggatcat 383252DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-GC 32gtggtttgtc tggtcaacca ccgcggcctc agtggtgtac ggtacaaacc ca 523338DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-CG 33ggcttgtcga cgacggcgcg ctccgtcgtc aggatcat 383452DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-CG 34gtggtttgtc tggtcaacca ccgcgcgctc agtggtgtac ggtacaaacc ca 523538DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttB-TA 35ggcttgtcga cgacggcgta ctccgtcgtc aggatcat 383652DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttP-TA 36gtggtttgtc tggtcaacca ccgcgtactc agtggtgtac ggtacaaacc ca 523745DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideC-31-B 37tgcgggtgcc agggcgtgcc cttgggctcc ccgggcgcgt actcc 453842DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideC31-P 38gtgccccaac tggggtaacc tttgagttct ctcagttggg gg 423957DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideR4-B 39gcgcccaagt tgcccatgac catgccgaag cagtggtaga agggcaccgg cagacac 574070DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideR4-P 40aggcatgttc cccaaagcga taccacttga agcagtggta ctgcttgtgg gtacactctg 60cgggtgatga 704160DNAArtificial SequenceDescription of Artificial Sequence Synthetic 
oligonucleotideBT1-B 41gtccttgacc aggtttttga cgaaagtgat ccagatgatc cagctccaca ccccgaacgc 604263DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideBT1-P 42ggtgctgggt tgttgtctct ggacagtgat ccatgggaaa ctactcagca ccaccaatgt 60tcc 634350DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideBxb-B 43tcggccggct tgtcgacgac ggcggtctcc gtcgtcagga tcatccgggc 504458DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideBxb-P 44gtcgtggttt gtctggtcaa ccaccgcggt ctcagtggtg tacggtacaa accccgac 584546DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideTG1-B 45gatcagctcc gcgggcaaga ccttctcctt cacggggtgg aaggtc 464667DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideTG1-P 46tcaaccccgt tccagcccaa cagtgttagt ctttgctctt acccagttgg gcgggatagc 60ctgcccg 674757DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideC1-B 47aacgattttc aaaggatcac tgaatcaaaa gtattgctca tccacgcgaa atttttc 574857DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideC1-P 48aatattttag gtatatgatt ttgtttatta gtgtaaataa cactatgtac ctaaaat 574953DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideC370-B 49tgtaaaggag actgataatg gcatgtacaa ctatactcgt cggtaaaaag gca 535052DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideC370-P 50taaaaaaata cagcgttttt catgtacaac tatactagtt gtagtgccta aa 525156DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideK38-B 51gagcgccgga tcagggagtg gacggcctgg gagcgctaca cgctgtggct gcggtc 565256DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideK38-P 52ccctaatacg caagtcgata actctcctgg gagcgttgac aacttgcgca ccctga 565368DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideRB-B 53tctcgtggtg gtggaaggtg ttggtgcggg gttggccgtg gtcgaggtgg ggtggtggta 
60gccattcg 685469DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideRV-P 54gcacaggtgt agtgtatctc acaggtccac ggttggccgt ggactgctga agaacattcc 60acgccagga 695565DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideSPBC-B 55agtgcagcat gtcattaata tcagtacaga taaagctgta tctcctgtga acacaatggg 60tgcca 655655DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideSPBC-P 56aaagtagtaa gtatcttaaa aaacagataa agctgtatat taagatactt actac 555754DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideTP901-B 57tgataattgc caacacaatt aacatctcaa tcaaggtaaa tgctttttcg tttt 545854DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideTP901-P 58aattgcgagt ttttatttcg tttatttcaa ttaaggtaac taaaaaactc cttt 545968DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideWbeta-B 59aaggtagcgt caacgatagg tgtaactgtc gtgtttgtaa cggtacttcc aacagctggc 60gtttcagt 686068DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideWbeta-P 60tagttttaaa gttggttatt agttactgtg atatttatca cggtacccaa taaccaatga 60atatttga 686157DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideA118-B 61tgtaactttt tcggatcaag ctatgaagga cgcaaagagg gaactaaaca cttaatt 576257DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideA118-P 62ttgtttagtt cctcgttttc tctcgttgga agaagaagaa acgagaaact aaaatta 576363DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideBL3-B 63caacctgttg acatgtttcc acagacaact cacgtggagg tagtcacggc ttttacgtta 60gtt 636461DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideBL3-P 64gagaatactg ttgaacaatg aaaaactagg catgtagaag ttgtttgtgc actaacttta 60a 6165120DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotideMR11-B 65acaggtcaac acatcgcagt tatcgaacaa tcttcgaaaa tgtatggagg cacttgtatc 
60aatataggat gtataccttc gaagacactt gtacatgatg gattagaagg caaatccttt 12066120DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotideMR11-P 66caaaataaaa aacattgatt tttattaact tcttttgtgc ggaactacga acagttcatt 60aatacgaagt gtacaaactt ccatacaaaa ataaccacga caattaagac gtggtttcta 1206717DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttL 67attatttctc accctga 176817DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideAttR 68atcatctccc acccgga 176934DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideVox 69aataggtctg agaacgccca ttctcagacg tatt 347034DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotideFRT 70gaagttccta tactttctag agaataggaa cttc 34715881DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotideCre Recombinase Expression Plasmid 71ggtcgacatt gattattgac tagttattaa tagtaatcaa ttacggggtc attagttcat 60agcccatata tggagttccg cgttacataa cttacggtaa atggcccgcc tggctgaccg 120cccaacgacc cccgcccatt gacgtcaata atgacgtatg ttcccatagt aacgccaata 180gggactttcc attgacgtca atgggtggag tatttacggt aaactgccca cttggcagta 240catcaagtgt atcatatgcc aagtacgccc cctattgacg tcaatgacgg taaatggccc 300gcctggcatt atgcccagta catgacctta tgggactttc ctacttggca gtacatctac 360gtattagtca tcgctattac catggtcgag gtgagcccca cgttctgctt cactctcccc 420atctcccccc cctccccacc cccaattttg tatttattta ttttttaatt attttgtgca 480gcgatggggg cggggggggg gggggcgcgc gccaggcggg gggggggggg gggggggggg 540gggggggggg gggcgggggg gggcggcggc agccaatcag agcggcgcgc tccgaaagtt 600tccttttatg gcgaggcggc ggcggcggcg gccctataaa aagcgaagcg cgcggcgggc 660gggagtcgct gcgcgctgcc ttcgccccgt gccccgctcc gccgccgcct cgcgccgccc 720gccccggctc tgactgaccg cgttactccc acaggtgagc gggcgggacg gcccttctcc 780tccgggctgt aattagcgct tggtttaatg acggcttgtt tcttttctgt ggctgcgtga 840aagccttgag gggctccggg agggcccttt gtgcgggggg agcggctcgg ggggtgcgtg 900cgtgtgtgtg tgcgtgggga gcgccgcgtg cggctccgcg 
ctgcccggcg gctgtgagcg 960ctgcgggcgc ggcgcggggc tttgtgcgct ccgcagtgtg cgcgagggga gcgcggccgg 1020gggcggtgcc ccgcggtgcg gggggggctg cgaggggaac aaaggctgcg tgcggggtgt 1080gtgcgtgggg gggtgagcag ggggtgtggg cgcgtcggtc gggctgcaac cccccctgca 1140cccccctccc cgagttgctg agcacggccc ggcttcgggt gcggggctcc gtacggggcg 1200tggcgcgggg ctcgccgtgc cgggcggggg gtggcggcag gtgggggtgc cgggcggggc 1260ggggccgcct cgggccgggg agggctcggg ggaggggcgc ggcggccccc ggagcgccgg 1320cggctgtcga ggcgcggcga gccgcagcca ttgcctttta tggtaatcgt gcgagagggc 1380gcagggactt cctttgtccc aaatctgtgc ggagccgaaa tctgggaggc gccgccgcac 1440cccctctagc gggcgcgggg cgaagcggtg cggcgccggc aggaaggaaa tgggcgggga 1500gggccttcgt gcgtcgccgc gccgccgtcc ccttctccct ctccagcctc ggggctgtcc 1560gcggggggac ggctgccttc gggggggacg gggcagggcg gggttcggct tctggcgtgt 1620gaccggcggc tctagagcct ctgctaacca tgttcatgcc ttcttctttt tcctacagct 1680cctgggcaac gtgctggtta ttgtgctgtc tcatcatttt ggcaaagaat tctgagccgc 1740caccatggcc aatttactga ccgtacacca aaatttgcct gcattaccgg tcgatgcaac 1800gagtgatgag gttcgcaaga acctgatgga catgttcagg gatcgccagg cgttttctga 1860gcatacctgg aaaatgcttc tgtccgtttg ccggtcgtgg gcggcatggt gcaagttgaa 1920taaccggaaa tggtttcccg cagaacctga agatgttcgc gattatcttc tatatcttca 1980ggcgcgcggt ctggcagtaa aaactatcca gcaacatttg ggccagctaa acatgcttca 2040tcgtcggtcc gggctgccac gaccaagtga cagcaatgct gtttcactgg ttatgcggcg 2100gatccgaaaa gaaaacgttg atgccggtga acgtgcaaaa caggctctag cgttcgaacg 2160cactgatttc gaccaggttc gttcactcat ggaaaatagc gatcgctgcc aggatatacg 2220taatctggca tttctgggga ttgcttataa caccctgtta cgtatagccg aaattgccag 2280gatcagggtt aaagatatct cacgtactga cggtgggaga atgttaatcc atattggcag 2340aacgaaaacg ctggttagca ccgcaggtgt agagaaggca cttagcctgg gggtaactaa 2400actggtcgag cgatggattt ccgtctctgg tgtagctgat gatccgaata actacctgtt 2460ttgccgggtc agaaaaaatg gtgttgccgc gccatctgcc accagccagc tatcaactcg 2520cgccctggaa gggatttttg aagcaactca tcgattgatt tacggcgcta aggatgactc 2580tggtcagaga tacctggcct ggtctggaca cagtgcccgt gtcggagccg cgcgagatat 2640ggcccgcgct 
ggagtttcaa taccggagat catgcaagct ggtggctgga ccaatgtaaa 2700tattgtcatg aactatatcc gtaacctgga tagtgaaaca ggggcaatgg tgcgcctgct 2760ggaagatggc gatggaccgg tggaacaaaa acttatttct gaagaagatc tgtgatagcg 2820gccgcactcc tcaggtgcag gctgcctatc agaaggtggt ggctggtgtg gccaatgccc 2880tggctcacaa ataccactga gatctttttc cctctgccaa aaattatggg gacatcatga 2940agccccttga gcatctgact tctggctaat aaaggaaatt tattttcatt gcaatagtgt 3000gttggaattt tttgtgtctc tcactcggaa ggacatatgg gagggcaaat catttaaaac 3060atcagaatga gtatttggtt tagagtttgg caacatatgc ccatatgctg gctgccatga 3120acaaaggttg gctataaaga ggtcatcagt atatgaaaca gccccctgct gtccattcct 3180tattccatag aaaagccttg acttgaggtt agattttttt tatattttgt tttgtgttat 3240ttttttcttt aacatcccta aaattttcct tacatgtttt actagccaga tttttcctcc 3300tctcctgact actcccagtc atagctgtcc ctcttctctt atggagatcc ctcgacctgc 3360agcccaagct tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc 3420acaattccac acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga 3480gtgagctaac tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg 3540tcgtgccagc ggatccgcat ctcaattagt cagcaaccat agtcccgccc ctaactccgc 3600ccatcccgcc cctaactccg cccagttccg cccattctcc gccccatggc tgactaattt 3660tttttattta tgcagaggcc gaggccgcct cggcctctga gctattccag aagtagtgag

3720gaggcttttt tggaggccta ggcttttgca aaaagctaac ttgtttattg cagcttataa 3780tggttacaaa taaagcaata gcatcacaaa tttcacaaat aaagcatttt tttcactgca 3840ttctagttgt ggtttgtcca aactcatcaa tgtatcttat catgtctgga tccgctgcat 3900taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc ttccgcttcc 3960tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca 4020aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca 4080aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg 4140ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg 4200acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 4260ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 4320tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 4380tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 4440gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 4500agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 4560tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa 4620agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt 4680tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct 4740acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatta 4800tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa 4860agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc 4920tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt gtagataact 4980acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg agacccacgc 5040tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga gcgcagaagt 5100ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga agctagagta 5160agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg catcgtggtg 5220tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc aaggcgagtt 5280acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc 5340agaagtaagt tggccgcagt gttatcactc atggttatgg cagcactgca taattctctt 5400actgtcatgc catccgtaag atgcttttct 
gtgactggtg agtactcaac caagtcattc 5460tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg ggataatacc 5520gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa 5580ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg tgcacccaac 5640tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa 5700aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat actcttcctt 5760tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa 5820tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct 5880g 5881724915DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotideGFP-Lox66-Cre expression plasmid 72agctctgatc aagagacagg atgaggatcg tttcgcatga ttgaacaaga tggattgcac 60gcaggttctc cggccgcttg ggtggagagg ctattcggct atgactgggc acaacagaca 120atcggctgct ctgatgccgc cgtgttccgg ctgtcagcgc aggggcgccc ggttcttttt 180gtcaagaccg acctgtccgg tgccctgaat gaactgcaag acgaggcagc gcggctatcg 240tggctggcca cgacgggcgt tccttgcgca gctgtgctcg acgttgtcac tgaagcggga 300agggactggc tgctattggg cgaagtgccg gggcaggatc tccatgtcat ctacaccttg 360ctcctgccga gaaagtatcc atcatggctg atgcaatgcg gcggctgcat acgcttgatc 420cggctacctg cccattcgac caccaagcga aacatcgcat cgagcgagca cgtactcgga 480tggaagccgg tcttgtcgat caggatgatc tggacgaaga gcatcagggg ctcgcgccag 540ccgaactgtt cgccaggctc aaggcgagca tgcccgacgg cgaggatctc gtcgtgaccc 600atggcgatgc ctgcttgccg aatatcatgg tggaaaatgg ccgcttttct ggattcatcg 660actgtggccg gctgggtgtg gcggaccgct atcaggacat agcgttggct acccgtgata 720ttgctgaaga gcttggcggc gaatgggctg accgcttcct cgtgctttac ggtatcgccg 780ctcccgattc gcagcgcatc gccttctatc gccttcttga cgagttcttc tgaattatta 840actcgagatc cactagagtg tggcggccgc attcttataa tcagcatcat gatgtggtac 900cacatcatga tgctgattac ccccaactga gagaactcaa aggttacccc agttggggcg 960ggcccacaaa taaagcaata gcatcacaaa tttcacaaat aaagcatttt tttcactgca 1020ttctagttgt ggtttgtcca aactcatcga gctcgagatc tggcgaaggc gatgggggtc 1080ttgaaggcgt gctggtactc cacgatgccc agctcggtgt tgctgtgcag ctcctccacg 1140cggcggaagg cgaacatggg 
gcccccgttc tgcaggatgc tggggtggat ggcgctcttg 1200aagtgcatgt ggctgtccac cacgaagctg tagtagccgc cgtcgcgcag gctgaaggtg 1260cgggcgaagc tgcccaccag cacgttatcg cccatggggt gcaggtgctc cacggtggcg 1320ttgctgcgga tgatcttgtc ggtgaagatc acgctgtcct cggggaagcc ggtgcccacc 1380accttgaagt cgccgatcac gcggccggcc tcgtagcggt agctgaagct cacgtgcagc 1440acgccgccgt cctcgtactt ctcgatgcgg gtgttggtgt agccgccgtt gttgatggcg 1500tgcaggaagg ggttctcgta gccgctgggg taggtgccga agtggtagaa gccgtagccc 1560atcacgtggc tcagcaggta ggggctgaag gtcagggcgc ctttggtgct cttcatcttg 1620ttggtcatgc ggccctgctc gggggtgccc tctccgccgc ccaccagctc gaactccacg 1680ccgttcaggg tgccggtgat gcggcactcg atcttcatgg cgggcatggt ggcgaccggt 1740agcgctagcg gcttcggata acttcgtata gcatacatta tacgaacggt aagcgctacc 1800gccggcatac ccaagtgaag ttgctcgcag cttatagtcg cgcccgggga gcccaagggc 1860acgccctggc accgcggccg ctgagtctcg accatcatca tcatcatcat tgagtttatc 1920tgggataaca gggtaatgtc atctagggat aacagggtat gtcatctggg ataacagggt 1980aatgtatcta gggataacag ggtaatgtca tctgggataa cagggtaatg tcatctaggg 2040ataacagggt atgtcatctg ggataacagg gtaatgtatc tagggataac agggtaatgt 2100catctgggat aacagggtaa tgtcatctag ggataacagg gtatgtcatc tgggataaca 2160gggtaatgta tctagggata acagggtaat gtcatctggg ataacagggt aatgtcatct 2220agggataaca gggtatgtca tctgggataa cagggtaatg tatctaggga taacagggta 2280atgtcatctg ggataacagg gtaatgtcat ctagggataa cagggtatgt catctgggat 2340aacagggtaa tgtatctagg gataacaggg taatgtcatc tgggataaca gggtaatgtc 2400atctagggat aacagggtat gtcatctggg ataacagggt aatgtatcta gggataacag 2460ggtaatgtca tctgggataa cagggtaatg tcatctaggg ataacagggt aaatgtcatc 2520tagggataac agggtaatgt catctaggga taacagggta atgtcatctg ggataacagg 2580gtaatgtcat ctagggataa cagggtaatg tatcgccagc gtcgcacagc atgtttgctt 2640gtcgccgtcg cgtctgtcac atcttttccg ccagcagtta gggattagcg tcttaagctg 2700gcgcgaggac caacgtatca gccaggcgaa gctgcttttg agcaccaccc ggatgcctat 2760cgccaccgtc ggtcgcaatg ttggttttga cgatcaactc tatttctcgc gggtatttaa 2820aaaatgcacc ggggccagcc cgagcgagtt ccgtgccggt tgtgaagaaa 
aagtgaatga 2880tgtagccgtc aagttgtcat aattggtaac gaatcagaca attgacggct tgacggagta 2940gcatagggtt tgcagaatcc ctgcttcgtc catttgacag gcacattatg catgccgctt 3000cgccttcgcg cgcgaattga tctgctgcct cgcgcgtttc ggtgatgacg gtgaaaacct 3060ctgacacatg cagctcccgg agacggtcac agcttgtctg taagcggatg ccgggagcag 3120acaagcccgt cagggcgcgt cagcgggtgt tggcgggtgt cggggcgcag ccatgaccca 3180gtcacgtagc gatagcggag tgtatactgg cttaactatg cggcatcaga gcagattgta 3240ctgagagtgc accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc 3300atcaggcgct cttccgcttc ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg 3360cgagcggtat cagctcactc aaaggcggta atacggttat ccacagaatc aggggataac 3420gcaggaaaga acatgtgagc aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg 3480ttgctggcgt ttttccatag gctccgcccc cctgacgagc atcacaaaaa tcgacgctca 3540agtcagaggt ggcgaaaccc gacaggacta taaagatacc aggcgtttcc ccctggaagc 3600tccctcgtgc gctctcctgt tccgaccctg ccgcttaccg gatacctgtc cgcctttctc 3660ccttcgggaa gcgtggcgct ttctcatagc tcacgctgta ggtatctcag ttcggtgtag 3720gtcgttcgct ccaagctggg ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc 3780ttatccggta actatcgtct tgagtccaac ccggtaagac acgacttatc gccactggca 3840gcagccactg gtaacaggat tagcagagcg aggtatgtag gcggtgctac agagttcttg 3900aagtggtggc ctaactacgg ctacactaga aggacagtat ttggtatctg cgctctgctg 3960aagccagtta ccttcggaaa aagagttggt agctcttgat ccggcaaaca aaccaccgct 4020ggtagcggtg gtttttttgt ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa 4080gaagatcctt tgatcttttc tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa 4140gggattttgg tcatgagcgg atacatattt gaatgtattt agaaaaataa acaaaagagt 4200ttgtagaaac gcaaaaaggc catccgtcag gatggccttc tgcttaattt gatgcctggc 4260agtttatggc gggcgtcctg cccgccaccc tccgggccgt tgcttcgcaa cgttcaaatc 4320cgctcccggc ggatttgtcc tactcaggag agcgttcacc gacaaacaac agataaaacg 4380aaaggcccag tctttcgact gagcctttcg ttttatttga tgcctggcag ttccctactc 4440tcgcatgggg agaccccaca ctaccatcgg cgctacggcg tttcacttct gagttcggca 4500tggggtcagg tgggaccacc gcgctactgc cgccaggcaa attctgtttt atcagaccgc 4560ttctgcgttc tgatttaatc 
tgtatcaggc tgaaaatctt ctctcatccg ccaaaacagc 4620caagctggag accgtttggc ccccctcgag cacgtagaaa gccagtccgc agaaacggtg 4680ctgaccccgg atgaatgtca gctactgggc tatctggaca agggaaaacg caagcgcaaa 4740gagaaagcag gtagcttgca gtgggcttac atggcgatag ctagactggg cggttttatg 4800gacagcaagc gaaccggaat tgccagctgg ggcgccctct ggtaaggttg ggaagccctg 4860caaagtaaac tggatggctt tctcgccgcc aaggatctga tggcgcaggg gatca 49157310815DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotidepCMV-PE2-P2A-Cre 73acgcgttgac attgattatt gactagttat taatagtaat caattacggg gtcattagtt 60catagcccat atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga 120ccgcccaacg acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca 180atagggactt tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca 240gtacatcaag tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg 300cccgcctggc attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc 360tacgtattag tcatcgctat taccatggtg atgcggtttt ggcagtacat caatgggcgt 420ggatagcggt ttgactcacg gggatttcca agtctccacc ccattgacgt caatgggagt 480ttgttttggc accaaaatca acgggacttt ccaaaatgtc gtaacaactc cgccccattg 540acgcaaatgg gcggtaggcg tgtacggtgg gaggtctata taagcagagc tggtttagtg 600aaccgtcaga tccgctagag atccgcggcc gctaatacga ctcactatag ggagagccgc 660caccatgaaa cggacagccg acggaagcga gttcgagtca ccaaagaaga agcggaaagt 720cgacaagaag tacagcatcg gcctggacat cggcaccaac tctgtgggct gggccgtgat 780caccgacgag tacaaggtgc ccagcaagaa attcaaggtg ctgggcaaca ccgaccggca 840cagcatcaag aagaacctga tcggagccct gctgttcgac agcggcgaaa cagccgaggc 900cacccggctg aagagaaccg ccagaagaag atacaccaga cggaagaacc ggatctgcta 960tctgcaagag atcttcagca acgagatggc caaggtggac gacagcttct tccacagact 1020ggaagagtcc ttcctggtgg aagaggataa gaagcacgag cggcacccca tcttcggcaa 1080catcgtggac gaggtggcct accacgagaa gtaccccacc atctaccacc tgagaaagaa 1140actggtggac agcaccgaca aggccgacct gcggctgatc tatctggccc tggcccacat 1200gatcaagttc cggggccact tcctgatcga gggcgacctg aaccccgaca acagcgacgt 1260ggacaagctg ttcatccagc tggtgcagac ctacaaccag 
ctgttcgagg aaaaccccat 1320caacgccagc ggcgtggacg ccaaggccat cctgtctgcc agactgagca agagcagacg 1380gctggaaaat ctgatcgccc agctgcccgg cgagaagaag aatggcctgt tcggaaacct 1440gattgccctg agcctgggcc tgacccccaa cttcaagagc aacttcgacc tggccgagga 1500tgccaaactg cagctgagca aggacaccta cgacgacgac ctggacaacc tgctggccca 1560gatcggcgac cagtacgccg acctgtttct ggccgccaag aacctgtccg acgccatcct 1620gctgagcgac atcctgagag tgaacaccga gatcaccaag gcccccctga gcgcctctat 1680gatcaagaga tacgacgagc accaccagga cctgaccctg ctgaaagctc tcgtgcggca 1740gcagctgcct gagaagtaca aagagatttt cttcgaccag agcaagaacg gctacgccgg 1800ctacattgac ggcggagcca gccaggaaga gttctacaag ttcatcaagc ccatcctgga 1860aaagatggac ggcaccgagg aactgctcgt gaagctgaac agagaggacc tgctgcggaa 1920gcagcggacc ttcgacaacg gcagcatccc ccaccagatc cacctgggag agctgcacgc 1980cattctgcgg cggcaggaag atttttaccc attcctgaag gacaaccggg aaaagatcga 2040gaagatcctg accttccgca tcccctacta cgtgggccct ctggccaggg gaaacagcag 2100attcgcctgg atgaccagaa agagcgagga aaccatcacc ccctggaact tcgaggaagt 2160ggtggacaag ggcgcttccg cccagagctt catcgagcgg atgaccaact tcgataagaa 2220cctgcccaac gagaaggtgc tgcccaagca cagcctgctg tacgagtact tcaccgtgta 2280taacgagctg accaaagtga aatacgtgac cgagggaatg agaaagcccg ccttcctgag 2340cggcgagcag aaaaaggcca tcgtggacct gctgttcaag accaaccgga aagtgaccgt 2400gaagcagctg aaagaggact acttcaagaa aatcgagtgc ttcgactccg tggaaatctc 2460cggcgtggaa gatcggttca acgcctccct gggcacatac cacgatctgc tgaaaattat 2520caaggacaag gacttcctgg acaatgagga aaacgaggac attctggaag atatcgtgct 2580gaccctgaca ctgtttgagg acagagagat gatcgaggaa cggctgaaaa cctatgccca 2640cctgttcgac gacaaagtga tgaagcagct gaagcggcgg agatacaccg gctggggcag 2700gctgagccgg aagctgatca acggcatccg ggacaagcag tccggcaaga caatcctgga 2760tttcctgaag tccgacggct tcgccaacag aaacttcatg cagctgatcc acgacgacag 2820cctgaccttt aaagaggaca tccagaaagc ccaggtgtcc ggccagggcg atagcctgca 2880cgagcacatt gccaatctgg ccggcagccc cgccattaag aagggcatcc tgcagacagt 2940gaaggtggtg gacgagctcg tgaaagtgat gggccggcac aagcccgaga acatcgtgat 3000cgaaatggcc 
agagagaacc agaccaccca gaagggacag aagaacagcc gcgagagaat 3060gaagcggatc gaagagggca tcaaagagct gggcagccag atcctgaaag aacaccccgt 3120ggaaaacacc cagctgcaga acgagaagct gtacctgtac tacctgcaga atgggcggga 3180tatgtacgtg gaccaggaac tggacatcaa ccggctgtcc gactacgatg tggacgctat 3240cgtgcctcag agctttctga aggacgactc catcgacaac aaggtgctga ccagaagcga 3300caagaaccgg ggcaagagcg acaacgtgcc ctccgaagag gtcgtgaaga agatgaagaa 3360ctactggcgg cagctgctga acgccaagct gattacccag agaaagttcg acaatctgac 3420caaggccgag agaggcggcc tgagcgaact ggataaggcc ggcttcatca agagacagct 3480ggtggaaacc cggcagatca caaagcacgt ggcacagatc ctggactccc ggatgaacac 3540taagtacgac gagaatgaca agctgatccg ggaagtgaaa gtgatcaccc tgaagtccaa 3600gctggtgtcc gatttccgga aggatttcca gttttacaaa gtgcgcgaga tcaacaacta 3660ccaccacgcc cacgacgcct acctgaacgc cgtcgtggga accgccctga tcaaaaagta 3720ccctaagctg gaaagcgagt tcgtgtacgg cgactacaag gtgtacgacg tgcggaagat 3780gatcgccaag agcgagcagg aaatcggcaa ggctaccgcc aagtacttct tctacagcaa 3840catcatgaac tttttcaaga ccgagattac cctggccaac ggcgagatcc ggaagcggcc 3900tctgatcgag acaaacggcg aaaccgggga gatcgtgtgg gataagggcc gggattttgc 3960caccgtgcgg aaagtgctga gcatgcccca agtgaatatc gtgaaaaaga ccgaggtgca 4020gacaggcggc ttcagcaaag agtctatcct gcccaagagg aacagcgata agctgatcgc 4080cagaaagaag gactgggacc ctaagaagta cggcggcttc gacagcccca ccgtggccta 4140ttctgtgctg gtggtggcca aagtggaaaa gggcaagtcc aagaaactga agagtgtgaa 4200agagctgctg gggatcacca tcatggaaag aagcagcttc gagaagaatc ccatcgactt 4260tctggaagcc aagggctaca aagaagtgaa aaaggacctg atcatcaagc tgcctaagta 4320ctccctgttc gagctggaaa acggccggaa gagaatgctg gcctctgccg gcgaactgca 4380gaagggaaac gaactggccc tgccctccaa atatgtgaac ttcctgtacc tggccagcca 4440ctatgagaag ctgaagggct cccccgagga taatgagcag aaacagctgt ttgtggaaca 4500gcacaagcac tacctggacg agatcatcga gcagatcagc gagttctcca agagagtgat 4560cctggccgac gctaatctgg acaaagtgct gtccgcctac aacaagcacc gggataagcc 4620catcagagag caggccgaga atatcatcca cctgtttacc ctgaccaatc tgggagcccc 4680tgccgccttc aagtactttg acaccaccat cgaccggaag 
aggtacacca gcaccaaaga 4740ggtgctggac gccaccctga tccaccagag catcaccggc ctgtacgaga cacggatcga 4800cctgtctcag ctgggaggtg actctggagg atctagcgga ggatcctctg gcagcgagac 4860accaggaaca agcgagtcag caacaccaga gagcagtggc ggcagcagcg gcggcagcag 4920caccctaaat atagaagatg agtatcggct acatgagacc tcaaaagagc cagatgtttc 4980tctagggtcc acatggctgt ctgattttcc tcaggcctgg gcggaaaccg ggggcatggg 5040actggcagtt cgccaagctc ctctgatcat acctctgaaa gcaacctcta cccccgtgtc 5100cataaaacaa taccccatgt cacaagaagc cagactgggg atcaagcccc acatacagag 5160actgttggac cagggaatac tggtaccctg ccagtccccc tggaacacgc ccctgctacc 5220cgttaagaaa ccagggacta atgattatag gcctgtccag gatctgagag aagtcaacaa 5280gcgggtggaa gacatccacc ccaccgtgcc caacccttac aacctcttga gcgggctccc 5340accgtcccac cagtggtaca ctgtgcttga tttaaaggat gcctttttct gcctgagact 5400ccaccccacc agtcagcctc tcttcgcctt tgagtggaga gatccagaga tgggaatctc 5460aggacaattg acctggacca gactcccaca gggtttcaaa aacagtccca ccctgtttaa 5520tgaggcactg cacagagacc tagcagactt ccggatccag cacccagact tgatcctgct 5580acagtacgtg gatgacttac tgctggccgc cacttctgag ctagactgcc aacaaggtac 5640tcgggccctg ttacaaaccc tagggaacct cgggtatcgg gcctcggcca agaaagccca 5700aatttgccag aaacaggtca agtatctggg gtatcttcta aaagagggtc agagatggct 5760gactgaggcc agaaaagaga ctgtgatggg gcagcctact ccgaagaccc ctcgacaact 5820aagggagttc ctagggaagg caggcttctg tcgcctcttc atccctgggt ttgcagaaat 5880ggcagccccc ctgtaccctc tcaccaaacc ggggactctg tttaattggg gcccagacca 5940acaaaaggcc tatcaagaaa tcaagcaagc tcttctaact gccccagccc tggggttgcc 6000agatttgact aagccctttg aactctttgt cgacgagaag cagggctacg ccaaaggtgt 6060cctaacgcaa aaactgggac cttggcgtcg gccggtggcc tacctgtcca aaaagctaga 6120cccagtagca gctgggtggc ccccttgcct acggatggta gcagccattg ccgtactgac 6180aaaggatgca ggcaagctaa ccatgggaca gccactagtc attctggccc cccatgcagt 6240agaggcacta gtcaaacaac cccccgaccg ctggctttcc aacgcccgga tgactcacta 6300tcaggccttg cttttggaca cggaccgggt ccagttcgga ccggtggtag ccctgaaccc 6360ggctacgctg ctcccactgc ctgaggaagg gctgcaacac aactgccttg atatcctggc 6420cgaagcccac 
ggaacccgac ccgacctaac ggaccagccg ctcccagacg ccgaccacac 6480ctggtacacg gatggaagca gtctcttaca agagggacag cgtaaggcgg gagctgcggt 6540gaccaccgag accgaggtaa tctgggctaa agccctgcca gccgggacat ccgctcagcg 6600ggctgaactg atagcactca cccaggccct aaagatggca gaaggtaaga agctaaatgt 6660ttatactgat agccgttatg cttttgctac tgcccatatc catggagaaa tatacagaag 6720gcgtgggtgg ctcacatcag aaggcaaaga gatcaaaaat aaagacgaga tcttggccct 6780actaaaagcc ctctttctgc ccaaaagact tagcataatc cattgtccag gacatcaaaa 6840gggacacagc gccgaggcta gaggcaaccg gatggctgac caagcggccc gaaaggcagc 6900catcacagag actccagaca cctctaccct cctcatagaa aattcatcac cctctggcgg 6960ctcaaaaaga accgccgacg gcagcgaatt cgagcccaag aagaagagga aagtcggaag 7020cggagctact aacttcagcc tgctgaagca ggctggcgac gtggaggaga accctggacc 7080taatttactg accgtacacc aaaatttgcc tgcattaccg gtcgatgcaa cgagtgatga 7140ggttcgcaag aacctgatgg acatgttcag ggatcgccag gcgttttctg agcatacctg 7200gaaaatgctt ctgtccgttt gccggtcgtg ggcggcatgg tgcaagttga ataaccggaa 7260atggtttccc gcagaacctg aagatgttcg cgattatctt ctatatcttc aggcgcgcgg 7320tctggcagta aaaactatcc agcaacattt gggccagcta aacatgcttc atcgtcggtc 7380cgggctgcca cgaccaagtg acagcaatgc tgtttcactg gttatgcggc ggatccgaaa 7440agaaaacgtt gatgccggtg aacgtgcaaa acaggctcta gcgttcgaac gcactgattt 7500cgaccaggtt cgttcactca tggaaaatag cgatcgctgc caggatatac gtaatctggc 7560atttctgggg attgcttata acaccctgtt acgtatagcc gaaattgcca ggatcagggt 7620taaagatatc tcacgtactg acggtgggag aatgttaatc catattggca gaacgaaaac 7680gctggttagc accgcaggtg
tagagaaggc acttagcctg ggggtaacta aactggtcga 7740gcgatggatt tccgtctctg gtgtagctga tgatccgaat aactacctgt tttgccgggt 7800cagaaaaaat ggtgttgccg cgccatctgc caccagccag ctatcaactc gcgccctgga 7860agggattttt gaagcaactc atcgattgat ttacggcgct aaggatgact ctggtcagag 7920atacctggcc tggtctggac acagtgcccg tgtcggagcc gcgcgagata tggcccgcgc 7980tggagtttca ataccggaga tcatgcaagc tggtggctgg accaatgtaa atattgtcat 8040gaactatatc cgtaacctgg atagtgaaac aggggcaatg gtgcgcctgc tggaagatgg 8100cgattaattt aaacccgctg atcagcctcg actgtgcctt ctagttgcca gccatctgtt 8160gtttgcccct cccccgtgcc ttccttgacc ctggaaggtg ccactcccac tgtcctttcc 8220taataaaatg agaaaattgc atcgcattgt ctgagtaggt gtcattctat tctggggggt 8280ggggtggggc aggacagcaa gggggaggat tgggaagaca atagcaggca tgctggggat 8340gcggtgggct ctatggcttc tgaggcggaa agaaccagct ggggctcgat accgtcgacc 8400tctagctaga gcttggcgta atcatggtca tagctgtttc ctgtgtgaaa ttgttatccg 8460ctcacaattc cacacaacat acgagccgga agcataaagt gtaaagccta gggtgcctaa 8520tgagtgagct aactcacatt aattgcgttg cgctcactgc ccgctttcca gtcgggaaac 8580ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg ggagaggcgg tttgcgtatt 8640gggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 8700gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca 8760ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg 8820ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt 8880cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc 8940ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct 9000tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc 9060gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta 9120tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca 9180gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag 9240tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag 9300ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt 9360agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 
atctcaagaa 9420gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg 9480attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga 9540agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta 9600atcagtgagg cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc 9660cccgtcgtgt agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg 9720ataccgcgag acccacgctc accggctcca gatttatcag caataaacca gccagccgga 9780agggccgagc gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt 9840tgccgggaag ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt 9900gctacaggca tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc 9960caacgatcaa ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc 10020ggtcctccga tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca 10080gcactgcata attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag 10140tactcaacca agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg 10200tcaatacggg ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa 10260cgttcttcgg ggcgaaaact ctcaaggatc ttaccgctgt tgagatccag ttcgatgtaa 10320cccactcgtg cacccaactg atcttcagca tcttttactt tcaccagcgt ttctgggtga 10380gcaaaaacag gaaggcaaaa tgccgcaaaa aagggaataa gggcgacacg gaaatgttga 10440atactcatac tcttcctttt tcaatattat tgaagcattt atcagggtta ttgtctcatg 10500agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc gcgcacattt 10560ccccgaaaag tgccacctga cgtcgacgga tcgggagatc gatctcccga tcccctaggg 10620tcgactctca gtacaatctg ctctgatgcc gcatagttaa gccagtatct gctccctgct 10680tgtgtgttgg aggtcgctga gtagtgcgcg agcaaaattt aagctacaac aaggcaaggc 10740ttgaccgaca attgcatgaa gaatctgctt agggttaggc gttttgcgct gcttcgcgat 10800gtacgggcca gatat 108157420DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide+90ngRNA guide sequence 74gtcaaccagt atcccggtgc 207596DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide+90ngRNA 75gtcaaccagt atcccggtgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt 
cggtgc 96764968DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotideGFP minicircle template (before cleavage) 76tgatcccctg cgccatcaga tccttggcgg cgagaaagcc atccagttta ctttgcaggg 60cttcccaacc ttaccagagg gcgccccagc tggcaattcc ggttcgcttg ctgtccataa 120aaccgcccag tctagctatc gccatgtaag cccactgcaa gctacctgct ttctctttgc 180gcttgcgttt tcccttgtcc agatagccca gtagctgaca ttcatccggg gtcagcaccg 240tttctgcgga ctggctttct acgtgctcga ggggggccaa acggtctcca gcttggctgt 300tttggcggat gagagaagat tttcagcctg atacagatta aatcagaacg cagaagcggt 360ctgataaaac agaatttgcc tggcggcagt agcgcggtgg tcccacctga ccccatgccg 420aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg ggtctcccca tgcgagagta 480gggaactgcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt 540tatctgttgt ttgtcggtga acgctctcct gagtaggaca aatccgccgg gagcggattt 600gaacgttgcg aagcaacggc ccggagggtg gcgggcagga cgcccgccat aaactgccag 660gcatcaaatt aagcagaagg ccatcctgac ggatggcctt tttgcgtttc tacaaactct 720tttgtttatt tttctaaata cattcaaata tgtatccgct catgaccaaa atcccttaac 780gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag 840atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg 900tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca 960gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 1020actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca 1080gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc 1140agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca 1200ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa 1260aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 1320cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc 1380gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg 1440cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat 1500cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca 1560gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc 
ctgatgcggt 1620attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact ctcagtacaa 1680tctgctctga tgccgcatag ttaagccagt atacactccg ctatcgctac gtgactgggt 1740catggctgcg ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct 1800cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt 1860ttcaccgtca tcaccgaaac gcgcgaggca gcagatcaat tcgcgcgcga aggcgaagcg 1920gcatgcataa tgtgcctgtc aaatggacga agcagggatt ctgcaaaccc tatgctactc 1980cgtcaagccg tcaattgtct gattcgttac caattatgac aacttgacgg ctacatcatt 2040cactttttct tcacaaccgg cacggaactc gctcgggctg gccccggtgc attttttaaa 2100tacccgcgag aaatagagtt gatcgtcaaa accaacattg cgaccgacgg tggcgatagg 2160catccgggtg gtgctcaaaa gcagcttcgc ctggctgata cgttggtcct cgcgccagct 2220taagacgcta atccctaact gctggcggaa aagatgtgac agacgcgacg gcgacaagca 2280aacatgctgt gcgacgctgg cgatacatta ccctgttatc cctagatgac attaccctgt 2340tatcccagat gacattaccc tgttatccct agatgacatt accctgttat ccctagatga 2400catttaccct gttatcccta gatgacatta ccctgttatc ccagatgaca ttaccctgtt 2460atccctagat acattaccct gttatcccag atgacatacc ctgttatccc tagatgacat 2520taccctgtta tcccagatga cattaccctg ttatccctag atacattacc ctgttatccc 2580agatgacata ccctgttatc cctagatgac attaccctgt tatcccagat gacattaccc 2640tgttatccct agatacatta ccctgttatc ccagatgaca taccctgtta tccctagatg 2700acattaccct gttatcccag atgacattac cctgttatcc ctagatacat taccctgtta 2760tcccagatga cataccctgt tatccctaga tgacattacc ctgttatccc agatgacatt 2820accctgttat ccctagatac attaccctgt tatcccagat gacataccct gttatcccta 2880gatgacatta ccctgttatc ccagatgaca ttaccctgtt atccctagat acattaccct 2940gttatcccag atgacatacc ctgttatccc tagatgacat taccctgtta tcccagataa 3000actcaatgat gatgatgatg atggtcgaga ctcagcggcc gcggtgccag ggcgtgccct 3060tgggctcccc gggcgcgact ataagctgcg agcaacttca cttgggtatg ccggcggtag 3120cgcttaccgt tcgtataatg tatgctatac gaagttatcc gaagccgcta gcggtggttt 3180gtctggtcaa ccaccgcggt ctcagtggtg tacggtacaa acccagctac cggtcgccac 3240catgcccgcc atgaagatcg agtgccgcat caccggcacc ctgaacggcg tggagttcga 3300gctggtgggc ggcggagagg 
gcacccccga gcagggccgc atgaccaaca agatgaagag 3360caccaaaggc gccctgacct tcagccccta cctgctgagc cacgtgatgg gctacggctt 3420ctaccacttc ggcacctacc ccagcggcta cgagaacccc ttcctgcacg ccatcaacaa 3480cggcggctac accaacaccc gcatcgagaa gtacgaggac ggcggcgtgc tgcacgtgag 3540cttcagctac cgctacgagg ccggccgcgt gatcggcgac ttcaaggtgg tgggcaccgg 3600cttccccgag gacagcgtga tcttcaccga caagatcatc cgcagcaacg ccaccgtgga 3660gcacctgcac cccatgggcg ataacgtgct ggtgggcagc ttcgcccgca ccttcagcct 3720gcgcgacggc ggctactaca gcttcgtggt ggacagccac atgcacttca agagcgccat 3780ccaccccagc atcctgcaga acgggggccc catgttcgcc ttccgccgcg tggaggagct 3840gcacagcaac accgagctgg gcatcgtgga gtaccagcac gccttcaaga cccccatcgc 3900cttcgccaga tctcgagctc gatgagtttg gacaaaccac aactagaatg cagtgaaaaa 3960aatgctttat ttgtgaaatt tgtgatgcta ttgctttatt tgtgggcccg ccccaactgg 4020ggtaaccttt gagttctctc agttgggggt aatcagcatc atgatgtggt accacatcat 4080gatgctgatt ataagaatgc ggccgccaca ctctagtgga tctcgagtta ataattcaga 4140agaactcgtc aagaaggcga tagaaggcga tgcgctgcga atcgggagcg gcgataccgt 4200aaagcacgag gaagcggtca gcccattcgc cgccaagctc ttcagcaata tcacgggtag 4260ccaacgctat gtcctgatag cggtccgcca cacccagccg gccacagtcg atgaatccag 4320aaaagcggcc attttccacc atgatattcg gcaagcaggc atcgccatgg gtcacgacga 4380gatcctcgcc gtcgggcatg ctcgccttga gcctggcgaa cagttcggct ggcgcgagcc 4440cctgatgctc ttcgtccaga tcatcctgat cgacaagacc ggcttccatc cgagtacgtg 4500ctcgctcgat gcgatgtttc gcttggtggt cgaatgggca ggtagccgga tcaagcgtat 4560gcagccgccg cattgcatca gccatgatgg atactttctc ggcaggagca aggtgtagat 4620gacatggaga tcctgccccg gcacttcgcc caatagcagc cagtcccttc ccgcttcagt 4680gacaacgtcg agcacagctg cgcaaggaac gcccgtcgtg gccagccacg atagccgcgc 4740tgcctcgtct tgcagttcat tcagggcacc ggacaggtcg gtcttgacaa aaagaaccgg 4800gcgcccctgc gctgacagcc ggaacacggc ggcatcagag cagccgattg tctgttgtgc 4860ccagtcatag ccgaatagcc tctccaccca agcggccgga gaacctgcgt gcaatccatc 4920ttgttcaatc atgcgaaacg atcctcatcc tgtctcttga tcagagct 4968774855DNAArtificial SequenceDescription of Artificial Sequence Synthetic 
polynucleotideGLuc minicircle template 77tgatcccctg cgccatcaga tccttggcgg cgagaaagcc atccagttta ctttgcaggg 60cttcccaacc ttaccagagg gcgccccagc tggcaattcc ggttcgcttg ctgtccataa 120aaccgcccag tctagctatc gccatgtaag cccactgcaa gctacctgct ttctctttgc 180gcttgcgttt tcccttgtcc agatagccca gtagctgaca ttcatccggg gtcagcaccg 240tttctgcgga ctggctttct acgtgctcga ggggggccaa acggtctcca gcttggctgt 300tttggcggat gagagaagat tttcagcctg atacagatta aatcagaacg cagaagcggt 360ctgataaaac agaatttgcc tggcggcagt agcgcggtgg tcccacctga ccccatgccg 420aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg ggtctcccca tgcgagagta 480gggaactgcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt 540tatctgttgt ttgtcggtga acgctctcct gagtaggaca aatccgccgg gagcggattt 600gaacgttgcg aagcaacggc ccggagggtg gcgggcagga cgcccgccat aaactgccag 660gcatcaaatt aagcagaagg ccatcctgac ggatggcctt tttgcgtttc tacaaactct 720tttgtttatt tttctaaata cattcaaata tgtatccgct catgaccaaa atcccttaac 780gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag 840atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg 900tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca 960gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 1020actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca 1080gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc 1140agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca 1200ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa 1260aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 1320cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc 1380gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg 1440cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat 1500cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca 1560gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc ctgatgcggt 1620attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact ctcagtacaa 1680tctgctctga 
tgccgcatag ttaagccagt atacactccg ctatcgctac gtgactgggt 1740catggctgcg ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct 1800cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt 1860ttcaccgtca tcaccgaaac gcgcgaggca gcagatcaat tcgcgcgcga aggcgaagcg 1920gcatgcataa tgtgcctgtc aaatggacga agcagggatt ctgcaaaccc tatgctactc 1980cgtcaagccg tcaattgtct gattcgttac caattatgac aacttgacgg ctacatcatt 2040cactttttct tcacaaccgg cacggaactc gctcgggctg gccccggtgc attttttaaa 2100tacccgcgag aaatagagtt gatcgtcaaa accaacattg cgaccgacgg tggcgatagg 2160catccgggtg gtgctcaaaa gcagcttcgc ctggctgata cgttggtcct cgcgccagct 2220taagacgcta atccctaact gctggcggaa aagatgtgac agacgcgacg gcgacaagca 2280aacatgctgt gcgacgctgg cgatacatta ccctgttatc cctagatgac attaccctgt 2340tatcccagat gacattaccc tgttatccct agatgacatt accctgttat ccctagatga 2400catttaccct gttatcccta gatgacatta ccctgttatc ccagatgaca ttaccctgtt 2460atccctagat acattaccct gttatcccag atgacatacc ctgttatccc tagatgacat 2520taccctgtta tcccagatga cattaccctg ttatccctag atacattacc ctgttatccc 2580agatgacata ccctgttatc cctagatgac attaccctgt tatcccagat gacattaccc 2640tgttatccct agatacatta ccctgttatc ccagatgaca taccctgtta tccctagatg 2700acattaccct gttatcccag atgacattac cctgttatcc ctagatacat taccctgtta 2760tcccagatga cataccctgt tatccctaga tgacattacc ctgttatccc agatgacatt 2820accctgttat ccctagatac attaccctgt tatcccagat gacataccct gttatcccta 2880gatgacatta ccctgttatc ccagatgaca ttaccctgtt atccctagat acattaccct 2940gttatcccag atgacatacc ctgttatccc tagatgacat taccctgtta tcccagataa 3000actcaatgat gatgatgatg atggtcgaga ctcagcggcc gcggtgccag ggcgtgccct 3060tgggctcccc gggcgcgact ataagctgcg agcaacttca cttgggtatg ccggcggtag 3120cgcttaccgt tcgtataatg tatgctatac gaagttatcc gaagccgcta gcggtggttt 3180gtctggtcaa ccaccgcggt ctcagtggtg tacggtacaa acccactacc ggtcgccacc 3240atgggagtca aagttctgtt tgccctgatc tgcatcgctg tggccgaggc caagcccacc 3300gagaacaacg aagacttcaa catcgtggcc gtggccagca acttcgcgac cacggatctc 3360gatgctgacc gcgggaagtt gcccggcaag aagctgccgc 
tggaggtgct caaagagatg 3420gaagccaatg cccggaaagc tggctgcacc aggggctgtc tgatctgcct gtcccacatc 3480aagtgcacgc ccaagatgaa gaagttcatc ccaggacgct gccacaccta cgaaggcgac 3540aaagagtccg cacagggcgg cataggcgag gcgatcgtcg acattcctga gattcctggg 3600ttcaaggact tggagcccat ggagcagttc atcgcacagg tcgatctgtg tgtggactgc 3660acaactggct gcctcaaagg gcttgccaac gtgcagtgtt ctgacctgct caagaagtgg 3720ctgccgcaac gctgtgcgac ctttgccagc aagatccagg gccaggtgga caagatcaag 3780ggggccggtg gtgactaagc ggagctcgat gagtttggac aaaccacaac tagaatgcag 3840tgaaaaaaat gctttatttg tgaaatttgt gatgctattg ctttatttgt gggcccgccc 3900caactggggt aacctttgag ttctctcagt tgggggtaat cagcatcatg atgtggtacc 3960acatcatgat gctgattata agaatgcggc cgccacactc tagtggatct cgagttaata 4020attcagaaga actcgtcaag aaggcgatag aaggcgatgc gctgcgaatc gggagcggcg 4080ataccgtaaa gcacgaggaa gcggtcagcc cattcgccgc caagctcttc agcaatatca 4140cgggtagcca acgctatgtc ctgatagcgg tccgccacac ccagccggcc acagtcgatg 4200aatccagaaa agcggccatt ttccaccatg atattcggca agcaggcatc gccatgggtc 4260acgacgagat cctcgccgtc gggcatgctc gccttgagcc tggcgaacag ttcggctggc 4320gcgagcccct gatgctcttc gtccagatca tcctgatcga caagaccggc ttccatccga 4380gtacgtgctc gctcgatgcg atgtttcgct tggtggtcga atgggcaggt agccggatca 4440agcgtatgca gccgccgcat tgcatcagcc atgatggata ctttctcggc aggagcaagg 4500tgtagatgac atggagatcc tgccccggca cttcgcccaa tagcagccag tcccttcccg 4560cttcagtgac aacgtcgagc acagctgcgc aaggaacgcc cgtcgtggcc agccacgata 4620gccgcgctgc ctcgtcttgc agttcattca gggcaccgga caggtcggtc ttgacaaaaa 4680gaaccgggcg cccctgcgct gacagccgga acacggcggc atcagagcag ccgattgtct 4740gttgtgccca gtcatagccg aatagcctct ccacccaagc ggccggagaa cctgcgtgca 4800atccatcttg ttcaatcatg cgaaacgatc ctcatcctgt ctcttgatca gagct 48557838DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotidepseudo-attP 78ccccaactgg ggtaaccttt gagttctctc agttgggg 3879194DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotideAlbumin-pegRNA-SERPIN 79gactgaaact tcacagaata gttttagagc tagaaatagc 
aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcttgg gatagttatg aattcaatct 120tcaaccctat ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcctctg 180tgaagtttca gtca 19480189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotideAlbumin-pegRNA-CPS1 80gactgaaact tcacagaata gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcttgg gatagttatg aattcaatct 120tcaaccctat ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcctctg 180tgaagtttc 18981177DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide34bp lox71 pegRNA 81ggcccagact gagcacgtga gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctgga ggaagcaggg cttcctttcc 120tctgccatca taccgttcgt atagcataca ttatacgaag ttatcgtgct cagtctg 17782177DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide34bp lox66 pegRNA 82ggcccagact gagcacgtga gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctgga ggaagcaggg cttcctttcc 120tctgccatca ataacttcgt atagcataca
ttatacgaac ggtacgtgct cagtctg 1778320DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotidegRNA2 83ggcccagact gagcacgtga 2084184DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 84gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ctgagctgcg 180agaa 18485179DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 85gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgagt cggtgcgacg agcgcggcga 120tatcatcatc catggcacaa ttaacatctc aatcaaggta aatgcttgag ctgcgagaa 17986179DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 86gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgagt cggtgcgacg agcgcggcga 120tatcatcatc catggagcat ttaccttgat tgagatgtta attgtgtgag ctgcgagaa 17987182DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 87gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgagt cggtgcgacg agcgcggcga 120tatcatcatc catggcaggt ttttgacgaa agtgatccag atgatccagt gagctgcgag 180aa 18288182DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 88gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgagt cggtgcgacg agcgcggcga 120tatcatcatc catggctgga tcatctggat cactttcgtc aaaaacctgt gagctgcgag 180aa 1828996DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 89gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc 9690164DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 90gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcatat catcatccat ggtaccgttc 120gtatagcata cattatacga agttattgag ctgcgagaat agcc 16491172DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 91gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggtaccg ttcgtatagc atacattata cgaagttatt gagctgcgag aa 17292189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 92gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcga cgacgagcgc ggcgatatca 120tcatccatgg ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcctgag 180ctgcgagaa 18993181DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 93gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgagc gcggcgatat catcatccat 120ggccggatga tcctgacgac ggagaccgcc gtcgtcgaca agccggcctg agctgcgaga 180a 18194178DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 94gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgccgcg gcgatatcat catccatggc 120cggatgatcc tgacgacgga gaccgccgtc gtcgacaagc cggcctgagc tgcgagaa 17895175DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 95gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggcg atatcatcat ccatggccgg 120atgatcctga cgacggagac cgccgtcgtc gacaagccgg cctgagctgc gagaa 17596171DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 96gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcatat catcatccat ggccggatga 120tcctgacgac ggagaccgcc gtcgtcgaca agccggcctg agctgcgaga a 17197194DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 97gctattctcg cagctcacca 
gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcga cgacgagcgc ggcgatatca 120tcatccatgg ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcctgag 180ctgcgagaat agcc 19498189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 98gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ctgagctgcg 180agaatagcc 18999176DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 99gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcatat catcatccat ggccggatga 120tcctgacgac ggagaccgcc gtcgtcgaca agccggcctg agctgcgaga atagcc 176100194DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 100gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcctgc ccatccgcgg cggcacgggg 120gtcgcagtcg ccatgccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc 180ccgggcggcg gaga 194101189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 101gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgccatc cgcggcggca cgggggtcgc 120agtcgccatg ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcccggg 180cggcggaga 189102184DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 102gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatgccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ccgggcggcg 180gaga 184103179DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 103gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggca cgggggtcgc agtcgccatg 120ccggatgatc ctgacgacgg agaccgccgt 
cgtcgacaag ccggcccggg cggcggaga 179104174DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 104gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgggg gtcgcagtcg ccatgccgga 120tgatcctgac gacggagacc gccgtcgtcg acaagccggc ccgggcggcg gaga 174105199DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 105gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcctgc ccatccgcgg cggcacgggg 120gtcgcagtcg ccatgccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc 180ccgggcggcg gagacagcg 199106194DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 106gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgccatc cgcggcggca cgggggtcgc 120agtcgccatg ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcccggg 180cggcggagac agcg 194107189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 107gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatgccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ccgggcggcg 180gagacagcg 189108184DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 108gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggca cgggggtcgc agtcgccatg 120ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcccggg cggcggagac 180agcg 184109179DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 109gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgggg gtcgcagtcg ccatgccgga 120tgatcctgac gacggagacc gccgtcgtcg acaagccggc ccgggcggcg gagacagcg 17911096DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 110gcgtggtggg gccgccagcg gttttagagc 
tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc 96111180DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 111gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggggatg atcctgacga cggagaccgc cgtcgtcgac aagccggtga gctgcgagaa 180112178DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 112gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catgggatga tcctgacgac ggagaccgcc gtcgtcgaca agccgtgagc tgcgagaa 178113176DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 113gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggatgat cctgacgacg gagaccgccg tcgtcgacaa gcctgagctg cgagaa 176114174DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 114gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggtgatc ctgacgacgg agaccgccgt cgtcgacaag ctgagctgcg agaa 174115182DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 115gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatgcggat gatcctgacg acggagaccg ccgtcgtcga caagccggcc gggcggcgga 180ga 182116180DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 116gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatgggatg atcctgacga cggagaccgc cgtcgtcgac aagccggcgg gcggcggaga 180117178DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 117gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat 
aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatggatga tcctgacgac ggagaccgcc gtcgtcgaca agccgcgggc ggcggaga 178118176DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 118gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatgatgat cctgacgacg gagaccgccg tcgtcgacaa gcccgggcgg cggaga 176119189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 119gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt 120ccgccccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ctcctccagg 180caatacgcg 189120184DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 120gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt 120ccgccccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ctcctccagg 180caat 184121182DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 121gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt 120ccgcccggat gatcctgacg acggagaccg ccgtcgtcga caagccggct cctccaggca 180at 182122180DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 122gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt 120ccgccggatg atcctgacga cggagaccgc cgtcgtcgac aagccggtcc tccaggcaat 180123178DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 123gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt 120ccgccgatga tcctgacgac ggagaccgcc gtcgtcgaca agccgtcctc caggcaat 178124176DNAArtificial SequenceDescription of 
Artificial Sequence Synthetic polynucleotide 124gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt 120ccgccatgat cctgacgacg gagaccgccg tcgtcgacaa gcctcctcca ggcaat 17612597DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 125gagccgagca cgaggggata cgttttagag ctagaaatag caagttaaaa taaggctagt 60ccgttatcaa cttgaaaaag tggcaccgag tcggtgc 97126167DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 126gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggcg atatcatcat ccatggatga 120tcctgacgac ggagaccgcc gtcgtcgaca agcctgagct gcgagaa 167127162DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 127gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctatc atcatccatg gatgatcctg 120acgacggaga ccgccgtcgt cgacaagcct gagctgcgag aa 162128157DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 128gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcat ccatggatga tcctgacgac 120ggagaccgcc gtcgtcgaca agcctgagct gcgagaa 157129163DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 129gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggcg atatcatcat ccatggatga 120tcctgacgac ggagaccgcc gtcgtcgaca agcctgagct gcg 163130158DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 130gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctatc atcatccatg gatgatcctg 120acgacggaga ccgccgtcgt cgacaagcct gagctgcg 158131153DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 131gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 
60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcat ccatggatga tcctgacgac 120ggagaccgcc gtcgtcgaca agcctgagct gcg 153132167DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 132gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgccggg ggtcgcagtc gccatgatga 120tcctgacgac ggagaccgcc gtcgtcgaca agcccgggcg gcggaga 167133162DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 133gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgtcg cagtcgccat gatgatcctg 120acgacggaga ccgccgtcgt cgacaagccc gggcggcgga ga 162134157DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 134gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcagtc gccatgatga tcctgacgac 120ggagaccgcc gtcgtcgaca agcccgggcg gcggaga 157135163DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 135gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgccggg ggtcgcagtc gccatgatga 120tcctgacgac ggagaccgcc gtcgtcgaca
agcccgggcg gcg 163136158DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 136gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgtcg cagtcgccat gatgatcctg 120acgacggaga ccgccgtcgt cgacaagccc gggcggcg 158137153DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 137gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcagtc gccatgatga tcctgacgac 120ggagaccgcc gtcgtcgaca agcccgggcg gcg 153138180DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 138gagaagcggc gtccggggct agttttagag ctagaaatag caagttaaaa taaggctagt 60ccgttatcaa cttgaaaaag tggcaccgag tcggtgctct ttgtccagag tcacagccat 120accggatgat cctgacgacg gagaccgccg tcgtcgacaa gccggccccc cggacgccgc 180139179DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 139gggcacgggg ccatgtacaa gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcggcg tcggcagccc gatcccgttg 120ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcctaca tggccccgt 179140185DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 140gtgtcaggtg gggcggggct agttttagag ctagaaatag caagttaaaa taaggctagt 60ccgttatcaa cttgaaaaag tggcaccgag tcggtgcgct ggctcctccc ctggcaccat 120accggatgat cctgacgacg gagaccgccg tcgtcgacaa gccggccccc cgccccacct 180gacac 185141184DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 141gagtgggtca gacgagcagg agttttagag ctagaaatag caagttaaaa taaggctagt 60ccgttatcaa cttgaaaaag tggcaccgag tcggtgcgat ggagggctgc atgggggagg 120agtcgccgga tgatcctgac gacggagacc gccgtcgtcg acaagccggc ctgctcgtct 180gacc 18414297DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 142gcagccaccc gctctcggcc cgttttagag ctagaaatag caagttaaaa taaggctagt 60ccgttatcaa cttgaaaaag tggcaccgag tcggtgc 9714397DNAArtificial 
SequenceDescription of Artificial Sequence Synthetic oligonucleotide 143gtgtagtcag gccgctcacc cgttttagag ctagaaatag caagttaaaa taaggctagt 60ccgttatcaa cttgaaaaag tggcaccgag tcggtgc 9714497DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 144gctgacaagt ctacggaacc tgttttagag ctagaaatag caagttaaaa taaggctagt 60ccgttatcaa cttgaaaaag tggcaccgag tcggtgc 9714596DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 145gctcctccag cgccttgacc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc 9614620DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 146gctattctcg cagctcacca 2014720DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 147agaagcggcg tccggggcta 2014820DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 148gggcacgggg ccatgtacaa 2014920DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 149gcgtattgcc tggaggatgg 2015020DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 150tgtcaggtgg ggcggggcta 2015120DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 151agtgggtcag acgagcagga 2015220DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 152gctgtctccg ccgcccgcca 2015396DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 153gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc 96154184DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotidemodified_base(148)..(149)CG, GC, AT, TA, GG, TT, GA, AG, CC, TC, CT, AA, TG, GT, CA, or AC 154gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggccgga tgatcctgac gacggagnnc gccgtcgtcg acaagccggc ctgagctgcg 180agaa 
184155183DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 155gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catgccggat gatcctgacg acggagaccg ccgtcgtcga caagccggcc tgagctgcga 180gaa 183156183DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 156gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catgccggat gatcctgacg acggagagcg ccgtcgtcga caagccggcc tgagctgcga 180gaa 183157189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 157gcgtattgcc tggaggatgg gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgaac cacgcggcga atgccggcgt 120ccgccccgga tgatcctgac gacggagtcc gccgtcgtcg acaagccggc ctcctccagg 180caatacgcg 189158189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 158gctgtctccg ccgcccgcca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgcgg cggcacgggg gtcgcagtcg 120ccatgccgga tgatcctgac gacggagctc gccgtcgtcg acaagccggc ccgggcggcg 180gagacagcg 18915920DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 159gtcacctcca atgactaggg 2016020DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 160gggcaaccac aaacccacga 20161194DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 161gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggctatg ccggatgatc ctgacgacgg agtccgccgt cgtcgacaag ccggccctag 180ctgagctgcg agaa 194162189DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 162gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga 
tatcatcatc 120catggtgccg gatgatcctg acgacggagt ccgccgtcgt cgacaagccg gccctatgag 180ctgcgagaa 189163184DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 163gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggccgga tgatcctgac gacggagtcc gccgtcgtcg acaagccggc ctgagctgcg 180agaa 184164179DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 164gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggggatg atcctgacga cggagtccgc cgtcgtcgac aagccgtgag ctgcgagaa 179165174DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 165gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggtgatc ctgacgacgg agtccgccgt cgtcgacaag ctgagctgcg agaa 174166169DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 166gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggatcct gacgacggag tccgccgtcg tcgacatgag ctgcgagaa 169167164DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 167gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggcctga cgacggagtc cgccgtcgtc gtgagctgcg agaa 164168159DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 168gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggtgacg acggagtccg ccgtcgtgag ctgcgagaa 159169154DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 169gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac 
ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggacgac ggagtccgcc gtgagctgcg agaa 154170149DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 170gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catgggacgg agtccgtgag ctgcgagaa 149171144DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 171gctattctcg cagctcacca gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggcggag ttgagctgcg agaa 144172182DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 172gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcga cgacgagcgc ggcgatatca 120tcatccatgg taccgttcgt atagcataca ttatacgaag ttattgagct gcgagaatag 180cc 182173177DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 173gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgacg agcgcggcga tatcatcatc 120catggtaccg ttcgtatagc atacattata cgaagttatt gagctgcgag aatagcc 177174177DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 174gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgctcga cgacgagcgc ggcgatatca 120tcatccatgg taccgttcgt atagcataca ttatacgaag ttattgagct gcgagaa 177175159DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 175gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgcatat catcatccat ggtaccgttc 120gtatagcata cattatacga agttattgag ctgcgagaa 15917696DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 176ccccacgatg gaggggaaga gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc 
9617796DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 177ccttctcctg gagccgcgac gttttagagc tagaaatagc aagttaaaat aaggctagtc 60cgttatcaac ttgaaaaagt ggcaccgagt cggtgc 9617852DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 178gtggtttgtc tggtcaacca ccgcggtctc agtggtgtac ggtacaaacc ca 5217952DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 179tgggtttgta ccgtacacca ctgagaccgc ggtggttgac cagacaaacc ac 5218052DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 180gtggtttgtc tggtcaacca ccgcgcgctc agtggtgtac ggtacaaacc ca 5218152DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 181tgggtttgta ccgtacacca ctgagcgcgc ggtggttgac cagacaaacc ac 5218252DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 182gtggtttgtc tggtcaacca ccgcggcctc agtggtgtac ggtacaaacc ca 5218352DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 183tgggtttgta ccgtacacca ctgaggccgc ggtggttgac cagacaaacc ac 5218452DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 184gtggtttgtc tggtcaacca ccgcgatctc agtggtgtac ggtacaaacc ca 5218552DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 185tgggtttgta ccgtacacca ctgagatcgc ggtggttgac cagacaaacc ac 5218652DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 186gtggtttgtc tggtcaacca ccgcgtactc agtggtgtac ggtacaaacc ca 5218752DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 187tgggtttgta ccgtacacca ctgagtacgc ggtggttgac cagacaaacc ac 5218852DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 188gtggtttgtc tggtcaacca ccgcgggctc agtggtgtac ggtacaaacc ca 5218952DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 189tgggtttgta ccgtacacca ctgagcccgc ggtggttgac 
cagacaaacc ac 5219052DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 190gtggtttgtc tggtcaacca ccgcgttctc agtggtgtac ggtacaaacc ca 5219152DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 191tgggtttgta ccgtacacca ctgagaacgc ggtggttgac cagacaaacc ac 5219252DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 192gtggtttgtc tggtcaacca ccgcggactc agtggtgtac ggtacaaacc ca 5219352DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 193tgggtttgta ccgtacacca ctgagtccgc ggtggttgac cagacaaacc ac 5219452DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 194gtggtttgtc tggtcaacca ccgcgagctc agtggtgtac ggtacaaacc ca 5219552DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 195tgggtttgta ccgtacacca ctgagctcgc ggtggttgac cagacaaacc ac 5219652DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 196gtggtttgtc tggtcaacca ccgcgccctc agtggtgtac ggtacaaacc ca 5219752DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 197tgggtttgta ccgtacacca ctgagggcgc ggtggttgac cagacaaacc ac 5219852DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 198gtggtttgtc tggtcaacca ccgcgtcctc agtggtgtac ggtacaaacc ca 5219952DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 199tgggtttgta ccgtacacca ctgaggacgc ggtggttgac cagacaaacc ac 5220052DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 200gtggtttgtc tggtcaacca ccgcgctctc agtggtgtac ggtacaaacc ca 5220152DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 201tgggtttgta ccgtacacca ctgagagcgc ggtggttgac cagacaaacc ac 5220252DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 202gtggtttgtc tggtcaacca ccgcgaactc agtggtgtac ggtacaaacc ca 5220352DNAArtificial 
SequenceDescription of Artificial Sequence Synthetic oligonucleotide 203tgggtttgta ccgtacacca ctgagttcgc ggtggttgac cagacaaacc ac 5220452DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 204gtggtttgtc tggtcaacca ccgcgcactc agtggtgtac ggtacaaacc ca 5220552DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 205tgggtttgta ccgtacacca ctgagtgcgc ggtggttgac cagacaaacc ac 5220652DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 206gtggtttgtc tggtcaacca ccgcgacctc agtggtgtac ggtacaaacc ca 5220752DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 207tgggtttgta
ccgtacacca ctgaggtcgc ggtggttgac cagacaaacc ac 5220852DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 208gtggtttgtc tggtcaacca ccgcgtgctc agtggtgtac ggtacaaacc ca 5220952DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 209tgggtttgta ccgtacacca ctgagcacgc ggtggttgac cagacaaacc ac 5221046DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 210ggccggcttg tcgacgacgg cggtctccgt cgtcaggatc atccgg 4621146DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 211ccggatgatc ctgacgacgg agaccgccgt cgtcgacaag ccggcc 4621246DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 212ggccggcttg tcgacgacgg cgaactccgt cgtcaggatc atccgg 4621346DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 213ccggatgatc ctgacgacgg agttcgccgt cgtcgacaag ccggcc 4621446DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 214ggccggcttg tcgacgacgg cggactccgt cgtcaggatc atccgg 4621546DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 215ccggatgatc ctgacgacgg agtccgccgt cgtcgacaag ccggcc 4621646DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 216ggccggcttg tcgacgacgg cgcactccgt cgtcaggatc atccgg 4621746DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 217ccggatgatc ctgacgacgg agtgcgccgt cgtcgacaag ccggcc 4621846DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 218ggccggcttg tcgacgacgg cgtactccgt cgtcaggatc atccgg 4621946DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 219ccggatgatc ctgacgacgg agtacgccgt cgtcgacaag ccggcc 4622046DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 220ggccggcttg tcgacgacgg cgagctccgt cgtcaggatc atccgg 4622146DNAArtificial SequenceDescription of Artificial Sequence 
Synthetic oligonucleotide 221ccggatgatc ctgacgacgg agctcgccgt cgtcgacaag ccggcc 4622246DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 222ggccggcttg tcgacgacgg cgggctccgt cgtcaggatc atccgg 4622346DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 223ccggatgatc ctgacgacgg agcccgccgt cgtcgacaag ccggcc 4622446DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 224ggccggcttg tcgacgacgg cgcgctccgt cgtcaggatc atccgg 4622546DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 225ccggatgatc ctgacgacgg agcgcgccgt cgtcgacaag ccggcc 4622646DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 226ggccggcttg tcgacgacgg cgtgctccgt cgtcaggatc atccgg 4622746DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 227ccggatgatc ctgacgacgg agcacgccgt cgtcgacaag ccggcc 4622846DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 228ggccggcttg tcgacgacgg cgacctccgt cgtcaggatc atccgg 4622946DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 229ccggatgatc ctgacgacgg aggtcgccgt cgtcgacaag ccggcc 4623046DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 230ggccggcttg tcgacgacgg cggcctccgt cgtcaggatc atccgg 4623146DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 231ccggatgatc ctgacgacgg aggccgccgt cgtcgacaag ccggcc 4623246DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 232ggccggcttg tcgacgacgg cgccctccgt cgtcaggatc atccgg 4623346DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 233ccggatgatc ctgacgacgg agggcgccgt cgtcgacaag ccggcc 4623446DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 234ggccggcttg tcgacgacgg cgtcctccgt cgtcaggatc atccgg 4623546DNAArtificial SequenceDescription of 
Artificial Sequence Synthetic oligonucleotide 235ccggatgatc ctgacgacgg aggacgccgt cgtcgacaag ccggcc 4623646DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 236ggccggcttg tcgacgacgg cgatctccgt cgtcaggatc atccgg 4623746DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 237ccggatgatc ctgacgacgg agatcgccgt cgtcgacaag ccggcc 4623846DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 238ggccggcttg tcgacgacgg cgctctccgt cgtcaggatc atccgg 4623946DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 239ccggatgatc ctgacgacgg agagcgccgt cgtcgacaag ccggcc 4624046DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 240ggccggcttg tcgacgacgg cgttctccgt cgtcaggatc atccgg 4624146DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 241ccggatgatc ctgacgacgg agaacgccgt cgtcgacaag ccggcc 4624238DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 242ggcttgtcga cgacggcggt ctccgtcgtc aggatcat 3824338DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 243atgatcctga cgacggagac cgccgtcgtc gacaagcc 3824438DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 244ggcttgtcga cgacggcgaa ctccgtcgtc aggatcat 3824538DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 245atgatcctga cgacggagtt cgccgtcgtc gacaagcc 3824638DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 246ggcttgtcga cgacggcgga ctccgtcgtc aggatcat 3824738DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 247atgatcctga cgacggagtc cgccgtcgtc gacaagcc 3824838DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 248ggcttgtcga cgacggcgca ctccgtcgtc aggatcat 3824938DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 
249atgatcctga cgacggagtg cgccgtcgtc gacaagcc 3825038DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 250ggcttgtcga cgacggcgta ctccgtcgtc aggatcat 3825138DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 251atgatcctga cgacggagta cgccgtcgtc gacaagcc 3825238DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 252ggcttgtcga cgacggcgag ctccgtcgtc aggatcat 3825338DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 253atgatcctga cgacggagct cgccgtcgtc gacaagcc 3825438DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 254ggcttgtcga cgacggcggg ctccgtcgtc aggatcat 3825538DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 255atgatcctga cgacggagcc cgccgtcgtc gacaagcc 3825638DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 256ggcttgtcga cgacggcgcg ctccgtcgtc aggatcat 3825738DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 257atgatcctga cgacggagcg cgccgtcgtc gacaagcc 3825838DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 258ggcttgtcga cgacggcgtg ctccgtcgtc aggatcat 3825938DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 259atgatcctga cgacggagca cgccgtcgtc gacaagcc 3826038DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 260ggcttgtcga cgacggcgac ctccgtcgtc aggatcat 3826138DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 261atgatcctga cgacggaggt cgccgtcgtc gacaagcc 3826238DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 262ggcttgtcga cgacggcggc ctccgtcgtc aggatcat 3826338DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 263atgatcctga cgacggaggc cgccgtcgtc gacaagcc 3826438DNAArtificial SequenceDescription of Artificial Sequence 
Synthetic oligonucleotide 264ggcttgtcga cgacggcgcc ctccgtcgtc aggatcat 3826538DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 265atgatcctga cgacggaggg cgccgtcgtc gacaagcc 3826638DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 266ggcttgtcga cgacggcgtc ctccgtcgtc aggatcat 3826738DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 267atgatcctga cgacggagga cgccgtcgtc gacaagcc 3826838DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 268ggcttgtcga cgacggcgat ctccgtcgtc aggatcat 3826938DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 269atgatcctga cgacggagat cgccgtcgtc gacaagcc 3827038DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 270ggcttgtcga cgacggcgct ctccgtcgtc aggatcat 3827138DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 271atgatcctga cgacggagag cgccgtcgtc gacaagcc 3827238DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 272ggcttgtcga cgacggcgtt ctccgtcgtc aggatcat 3827338DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 273atgatcctga cgacggagaa cgccgtcgtc gacaagcc 3827434DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 274taccgttcgt ataatgtatg ctatacgaag ttat 3427534DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 275ataacttcgt atagcataca ttatacgaac ggta 3427634DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 276ataacttcgt ataatgtatg ctatacgaac ggta 3427734DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 277taccgttcgt atagcataca ttatacgaag ttat 3427827DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 278tttaccttga ttgagatgtt aattgtg 2727927DNAArtificial SequenceDescription of Artificial Sequence 
Synthetic oligonucleotide 279cacaattaac atctcaatca aggtaaa 2728050DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 280gcgagttttt atttcgttta tttcaattaa ggtaactaaa aaactccttt 5028150DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 281aaaggagttt tttagttacc ttaattgaaa taaacgaaat aaaaactcgc 5028234DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 282ctggatcatc tggatcactt tcgtcaaaaa cctg 3428334DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 283caggtttttg acgaaagtga tccagatgat ccag 3428457DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 284ttcgggtgct gggttgttgt ctctggacag tgatccatgg gaaactactc agcacca 5728557DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 285tggtgctgag tagtttccca tggatcactg tccagagaca acaacccagc acccgaa 5728624DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 286aaaagtgtgg gctgcaggat ctga 2428722DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 287ggagctggca gctgtcaatg cc 2228821DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 288agtcaatgcc gctctcgtgg a 2128921DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 289cagcgggctc agctgatagc a 2129021DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 290cggatggcta accaagcggc c 2129117DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 291cccggcttcc tttgtcc 1729217DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 292gaactccacg ccgttca 1729317DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 293cccggcttcc tttgtcc 1729422DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 294aaccacaact agaatgcagt ga 2229517DNAArtificial SequenceDescription of Artificial Sequence 
Synthetic primer 295cccggcttcc tttgtcc 1729617DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 296gaactccacg ccgttca 1729717DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 297cccggcttcc tttgtcc 1729822DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 298aaccacaact agaatgcagt ga 2229917DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 299cccggcttcc tttgtcc 1730017DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 300gaactccacg ccgttca 1730121DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 301tccttatcac ggtcccgctc g 2130217DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 302gaactccacg ccgttca 1730317DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 303cgtcgacaac ggtagtg 1730417DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 304gaactccacg ccgttca 1730518DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 305tcgcgtgatt ctcggaac 1830617DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 306gaactccacg ccgttca 1730720DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 307gggcggtaag tggttagttt 2030817DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 308gaactccacg ccgttca 1730917DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 309aagaggcgga gccagta
1731017DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 310gaactccacg ccgttca 1731119DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 311ctcccttctc ccggtgccc 1931217DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 312gaactccacg ccgttca 1731317DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 313cccggcttcc tttgtcc 1731417DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 314gaactccacg ccgttca 1731520DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 315gggcggtaag tggttagttt 2031617DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 316gaactccacg ccgttca 1731717DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 317cgtcgacaac ggtagtg 1731817DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 318gaactccacg ccgttca 1731917DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 319aagaggcgga gccagta 1732017DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 320gaactccacg ccgttca 1732119DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 321ctcccttctc ccggtgccc 1932217DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 322gaactccacg ccgttca 1732321DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 323tccttatcac ggtcccgctc g 2132417DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 324gaactccacg ccgttca 1732517DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 325cccggcttcc tttgtcc 1732618DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 326ggcctgccag caggagga 1832717DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 327cccggcttcc tttgtcc 1732825DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 328ggtgtgcagt cacattggta aagcc 2532917DNAArtificial 
SequenceDescription of Artificial Sequence Synthetic primer 329cccggcttcc tttgtcc 1733022DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 330gatgggtcta gtccagctaa ag 2233117DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 331cccggcttcc tttgtcc 1733218DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 332gagagacaag gctgcaca 1833326DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 333ccaggtgaga gtcagggtag tgttca 2633417DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 334gaactccacg ccgttca 1733523DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 335agggaccttt gcctgtgtga gtc 2333617DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 336gaactccacg ccgttca 1733721DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 337tcagctctgt gctgaggcga a 2133817DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 338gaactccacg ccgttca 1733932DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 339aagccatctc ccagaatatc tgcttagaaa tg 3234017DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 340gaactccacg ccgttca 1734125DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 341gagaggagca acagtgagca tgatg 2534217DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 342gaactccacg ccgttca 1734332DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 343aagccatctc ccagaatatc tgcttagaaa tg 3234417DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 344gaactccacg ccgttca 1734525DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 345gagaggagca acagtgagca tgatg 2534617DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 346gaactccacg ccgttca 1734717DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 
347cccggcttcc tttgtcc 1734822DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 348ggctatgaac taatgacccc gt 2234917DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 349cccggcttcc tttgtcc 1735018DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 350ggcctgccag caggagga 1835117DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 351cccggcttcc tttgtcc 1735225DNAArtificial SequenceDescription of Artificial Sequence Synthetic primer 352ggtgtgcagt cacattggta aagcc 2535352DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 353acactctttc cctacacgac gctcttccga tctccgacct cggctcacag cg 5235453DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 354acactctttc cctacacgac gctcttccga tctaccgacc tcggctcaca gcg 5335554DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 355acactctttc cctacacgac gctcttccga tctgaccgac ctcggctcac agcg 5435655DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 356acactctttc cctacacgac gctcttccga tcttgaccga cctcggctca cagcg 5535756DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 357acactctttc cctacacgac gctcttccga tctctgaccg acctcggctc acagcg 5635857DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 358acactctttc cctacacgac gctcttccga tctactgacc gacctcggct cacagcg 5735958DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 359acactctttc cctacacgac gctcttccga tcttactgac cgacctcggc tcacagcg 5836059DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 360acactctttc cctacacgac gctcttccga tctgtactga ccgacctcgg ctcacagcg 5936151DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 361gtgactggag ttcagacgtg tgctcttccg atctccaccc agccagctcc c 5136251DNAArtificial SequenceDescription of Artificial 
Sequence Synthetic oligonucleotide 362acactctttc cctacacgac gctcttccga tctccggtgg cgcattgcca c 5136352DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 363acactctttc cctacacgac gctcttccga tctaccggtg gcgcattgcc ac 5236453DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 364acactctttc cctacacgac gctcttccga tctgaccggt ggcgcattgc cac 5336554DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 365acactctttc cctacacgac gctcttccga tcttgaccgg tggcgcattg ccac 5436655DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 366acactctttc cctacacgac gctcttccga tctctgaccg gtggcgcatt gccac 5536756DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 367acactctttc cctacacgac gctcttccga tctactgacc ggtggcgcat tgccac 5636857DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 368acactctttc cctacacgac gctcttccga tcttactgac cggtggcgca ttgccac 5736958DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 369acactctttc cctacacgac gctcttccga tctgtactga ccggtggcgc attgccac 5837054DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 370gtgactggag ttcagacgtg tgctcttccg atctcagagt ccagcttggg ccca 5437120DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 371gatattttcc cagctcacca 2037220DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 372tctattctcc cagctcccca 2037340DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 373agcggcttct gtctctgtga gtgagctggc ggtctccgtc 4037443DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 374gactagccca cgctccggtt ctgagccgcg acggcggtct ccg 4337541DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 375cccagggtcc catgcgctcc ccggccctga cggcggtctc c 413762560PRTArtificial 
SequenceDescription of Artificial Sequence Synthetic polypeptide 376Met Lys Arg Thr Ala Asp Gly Ser Glu Phe Glu Ser Pro Lys Lys Lys1 5 10 15Arg Lys Val Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn 20 25 30Ser Val Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys 35 40 45Lys Phe Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn 50 55 60Leu Ile Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr65 70 75 80Arg Leu Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg 85 90 95Ile Cys Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp 100 105 110Asp Ser Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp 115 120 125Lys Lys His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val 130 135 140Ala Tyr His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu145 150 155 160Val Asp Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu 165 170 175Ala His Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu 180 185 190Asn Pro Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln 195 200 205Thr Tyr Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val 210 215 220Asp Ala Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu225 230 235 240Glu Asn Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe 245 250 255Gly Asn Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser 260 265 270Asn Phe Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr 275 280 285Tyr Asp Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr 290 295 300Ala Asp Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu305 310 315 320Ser Asp Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser 325 330 335Ala Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu 340 345 350Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile 355 360 365Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly 370 375 380Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys385 390 395 400Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp 
Leu 405 410 415Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile 420 425 430His Leu Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr 435 440 445Pro Phe Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe 450 455 460Arg Ile Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe465 470 475 480Ala Trp Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe 485 490 495Glu Glu Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg 500 505 510Met Thr Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys 515 520 525His Ser Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys 530 535 540Val Lys Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly545 550 555 560Glu Gln Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys 565 570 575Val Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys 580 585 590Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser 595 600 605Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe 610 615 620Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr625 630 635 640Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr 645 650 655Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg 660 665 670Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile 675 680 685Arg Asp Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp 690 695 700Gly Phe Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu705 710 715 720Thr Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp 725 730 735Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys 740 745 750Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val 755 760 765Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu 770 775 780Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys785 790 795 800Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu 805 810 815His Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr 820 825 830Tyr Leu Gln Asn Gly 
Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile 835 840 845Asn Arg Leu Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe 850 855 860Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys865 870 875 880Asn Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys 885 890 895Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln 900 905 910Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu 915 920 925Leu Asp Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln 930
935 940Ile Thr Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys945 950 955 960Tyr Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu 965 970 975Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys 980 985 990Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn 995 1000 1005Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu 1010 1015 1020Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys 1025 1030 1035Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys 1040 1045 1050Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile 1055 1060 1065Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr 1070 1075 1080Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe 1085 1090 1095Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val 1100 1105 1110Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile 1115 1120 1125Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp 1130 1135 1140Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala 1145 1150 1155Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys 1160 1165 1170Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu 1175 1180 1185Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys 1190 1195 1200Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys 1205 1210 1215Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala 1220 1225 1230Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser 1235 1240 1245Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu 1250 1255 1260Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu 1265 1270 1275Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu 1280 1285 1290Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val 1295 1300 1305Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln 1310 1315 1320Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala 1325 1330 1335Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg 
1340 1345 1350Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln 1355 1360 1365Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu 1370 1375 1380Gly Gly Asp Ser Gly Gly Ser Ser Gly Gly Ser Ser Gly Ser Glu 1385 1390 1395Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly Ser 1400 1405 1410Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser Gly 1415 1420 1425Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser Ser 1430 1435 1440Gly Gly Ser Ser Gly Gly Ser Ser Thr Leu Asn Ile Glu Asp Glu 1445 1450 1455Tyr Arg Leu His Glu Thr Ser Lys Glu Pro Asp Val Ser Leu Gly 1460 1465 1470Ser Thr Trp Leu Ser Asp Phe Pro Gln Ala Trp Ala Glu Thr Gly 1475 1480 1485Gly Met Gly Leu Ala Val Arg Gln Ala Pro Leu Ile Ile Pro Leu 1490 1495 1500Lys Ala Thr Ser Thr Pro Val Ser Ile Lys Gln Tyr Pro Met Ser 1505 1510 1515Gln Glu Ala Arg Leu Gly Ile Lys Pro His Ile Gln Arg Leu Leu 1520 1525 1530Asp Gln Gly Ile Leu Val Pro Cys Gln Ser Pro Trp Asn Thr Pro 1535 1540 1545Leu Leu Pro Val Lys Lys Pro Gly Thr Asn Asp Tyr Arg Pro Val 1550 1555 1560Gln Asp Leu Arg Glu Val Asn Lys Arg Val Glu Asp Ile His Pro 1565 1570 1575Thr Val Pro Asn Pro Tyr Asn Leu Leu Ser Gly Pro Pro Pro Ser 1580 1585 1590His Gln Trp Tyr Thr Val Leu Asp Leu Lys Asp Ala Phe Phe Cys 1595 1600 1605Leu Arg Leu His Pro Thr Ser Gln Pro Leu Phe Ala Phe Glu Trp 1610 1615 1620Arg Asp Pro Glu Met Gly Ile Ser Gly Gln Leu Thr Trp Thr Arg 1625 1630 1635Leu Pro Gln Gly Phe Lys Asn Ser Pro Thr Leu Phe Asn Glu Ala 1640 1645 1650Leu His Arg Asp Leu Ala Asp Phe Arg Ile Gln His Pro Asp Leu 1655 1660 1665Ile Leu Leu Gln Tyr Val Asp Asp Leu Leu Leu Ala Ala Thr Ser 1670 1675 1680Glu Leu Asp Cys Gln Gln Gly Thr Arg Ala Leu Leu Gln Thr Leu 1685 1690 1695Gly Asn Leu Gly Tyr Arg Ala Ser Ala Lys Lys Ala Gln Ile Cys 1700 1705 1710Gln Lys Gln Val Lys Tyr Leu Gly Tyr Leu Leu Lys Glu Gly Gln 1715 1720 1725Arg Trp Leu Thr Glu Ala Arg Lys Glu Thr Val Met Gly Gln Pro 1730 1735 1740Thr Pro Lys Thr Pro Arg Gln Leu Arg Glu Phe Leu Gly Lys Ala 
1745 1750 1755Gly Phe Cys Arg Leu Phe Ile Pro Gly Phe Ala Glu Met Ala Ala 1760 1765 1770Pro Leu Tyr Pro Leu Thr Lys Pro Gly Thr Leu Phe Asn Trp Gly 1775 1780 1785Pro Asp Gln Gln Lys Ala Tyr Gln Glu Ile Lys Gln Ala Leu Leu 1790 1795 1800Thr Ala Pro Ala Leu Gly Leu Pro Asp Leu Thr Lys Pro Phe Glu 1805 1810 1815Leu Phe Val Asp Glu Lys Gln Gly Tyr Ala Lys Gly Val Leu Thr 1820 1825 1830Gln Lys Leu Gly Pro Trp Arg Arg Pro Val Ala Tyr Leu Ser Lys 1835 1840 1845Lys Leu Asp Pro Val Ala Ala Gly Trp Pro Pro Cys Leu Arg Met 1850 1855 1860Val Ala Ala Ile Ala Val Leu Thr Lys Asp Ala Gly Lys Leu Thr 1865 1870 1875Met Gly Gln Pro Leu Val Ile Leu Ala Pro His Ala Val Glu Ala 1880 1885 1890Leu Val Lys Gln Pro Pro Asp Arg Trp Leu Ser Asn Ala Arg Met 1895 1900 1905Thr His Tyr Gln Ala Leu Leu Leu Asp Thr Asp Arg Val Gln Phe 1910 1915 1920Gly Pro Val Val Ala Leu Asn Pro Ala Thr Leu Leu Pro Leu Pro 1925 1930 1935Glu Glu Gly Leu Gln His Asn Cys Leu Asp Gly Thr Gly Gly Gly 1940 1945 1950Gly Val Thr Val Lys Phe Lys Tyr Lys Gly Glu Glu Leu Glu Val 1955 1960 1965Asp Ile Ser Lys Ile Lys Lys Val Trp Arg Val Gly Lys Met Ile 1970 1975 1980Ser Phe Thr Tyr Asp Asp Asn Gly Lys Thr Gly Arg Gly Ala Val 1985 1990 1995Ser Glu Lys Asp Ala Pro Lys Glu Leu Leu Gln Met Leu Glu Lys 2000 2005 2010Ser Gly Lys Lys Ser Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser 2015 2020 2025Glu Phe Glu Pro Lys Lys Lys Arg Lys Val Gly Gly Gly Gly Ser 2030 2035 2040Pro Lys Lys Lys Arg Lys Val Tyr Pro Tyr Asp Val Pro Asp Tyr 2045 2050 2055Ala Gly Ser Arg Ala Leu Val Val Ile Arg Leu Ser Arg Val Thr 2060 2065 2070Asp Ala Thr Thr Ser Pro Glu Arg Gln Leu Glu Ser Cys Gln Gln 2075 2080 2085Leu Cys Ala Gln Arg Gly Trp Asp Val Val Gly Val Ala Glu Asp 2090 2095 2100Leu Asp Val Ser Gly Ala Val Asp Pro Phe Asp Arg Lys Arg Arg 2105 2110 2115Pro Asn Leu Ala Arg Trp Leu Ala Phe Glu Glu Gln Pro Phe Asp 2120 2125 2130Val Ile Val Ala Tyr Arg Val Asp Arg Leu Thr Arg Ser Ile Arg 2135 2140 2145His Leu Gln Gln Leu Val His Trp Ala Glu Asp His Lys Lys Leu 
2150 2155 2160Val Val Ser Ala Thr Glu Ala His Phe Asp Thr Thr Thr Pro Phe 2165 2170 2175Ala Ala Val Val Ile Ala Leu Met Gly Thr Val Ala Gln Met Glu 2180 2185 2190Leu Glu Ala Ile Lys Glu Arg Asn Arg Ser Ala Ala His Phe Asn 2195 2200 2205Ile Arg Ala Gly Lys Tyr Arg Gly Ser Leu Pro Pro Trp Gly Tyr 2210 2215 2220Leu Pro Thr Arg Val Asp Gly Glu Trp Arg Leu Val Pro Asp Pro 2225 2230 2235Val Gln Arg Glu Arg Ile Leu Glu Val Tyr His Arg Val Val Asp 2240 2245 2250Asn His Glu Pro Leu His Leu Val Ala His Asp Leu Asn Arg Arg 2255 2260 2265Gly Val Leu Ser Pro Lys Asp Tyr Phe Ala Gln Leu Gln Gly Arg 2270 2275 2280Glu Pro Gln Gly Arg Glu Trp Ser Ala Thr Ala Leu Lys Arg Ser 2285 2290 2295Met Ile Ser Glu Ala Met Leu Gly Tyr Ala Thr Leu Asn Gly Lys 2300 2305 2310Thr Val Arg Asp Asp Asp Gly Ala Pro Leu Val Arg Ala Glu Pro 2315 2320 2325Ile Leu Thr Arg Glu Gln Leu Glu Ala Leu Arg Ala Glu Leu Val 2330 2335 2340Lys Thr Ser Arg Ala Lys Pro Ala Val Ser Thr Pro Ser Leu Leu 2345 2350 2355Leu Arg Val Leu Phe Cys Ala Val Cys Gly Glu Pro Ala Tyr Lys 2360 2365 2370Phe Ala Gly Gly Gly Arg Lys His Pro Arg Tyr Arg Cys Arg Ser 2375 2380 2385Met Gly Phe Pro Lys His Cys Gly Asn Gly Thr Val Ala Met Ala 2390 2395 2400Glu Trp Asp Ala Phe Cys Glu Glu Gln Val Leu Asp Leu Leu Gly 2405 2410 2415Asp Ala Glu Arg Leu Glu Lys Val Trp Val Ala Gly Ser Asp Ser 2420 2425 2430Ala Val Glu Leu Ala Glu Val Asn Ala Glu Leu Val Asp Leu Thr 2435 2440 2445Ser Leu Ile Gly Ser Pro Ala Tyr Arg Ala Gly Ser Pro Gln Arg 2450 2455 2460Glu Ala Leu Asp Ala Arg Ile Ala Ala Leu Ala Ala Arg Gln Glu 2465 2470 2475Glu Leu Glu Gly Leu Glu Ala Arg Pro Ser Gly Trp Glu Trp Arg 2480 2485 2490Glu Thr Gly Gln Arg Phe Gly Asp Trp Trp Arg Glu Gln Asp Thr 2495 2500 2505Ala Ala Lys Asn Thr Trp Leu Arg Ser Met Asn Val Arg Leu Thr 2510 2515 2520Phe Asp Val Arg Gly Gly Leu Thr Arg Thr Ile Asp Phe Gly Asp 2525 2530 2535Leu Gln Glu Tyr Glu Gln His Leu Arg Leu Gly Ser Val Val Glu 2540 2545 2550Arg Leu His Thr Gly Met Ser 2555 25603777680DNAArtificial 
SequenceDescription of Artificial Sequence Synthetic polynucleotide 377atgaaacgga cagccgacgg aagcgagttc gagtcaccaa agaagaagcg gaaagtcgac 60aagaagtaca gcatcggcct ggacatcggc accaactctg tgggctgggc cgtgatcacc 120gacgagtaca aggtgcccag caagaaattc aaggtgctgg gcaacaccga ccggcacagc 180atcaagaaga acctgatcgg agccctgctg ttcgacagcg gcgaaacagc cgaggccacc 240cggctgaaga gaaccgccag aagaagatac accagacgga agaaccggat ctgctatctg 300caagagatct tcagcaacga gatggccaag gtggacgaca gcttcttcca cagactggaa 360gagtccttcc tggtggaaga ggataagaag cacgagcggc accccatctt cggcaacatc 420gtggacgagg tggcctacca cgagaagtac cccaccatct accacctgag aaagaaactg 480gtggacagca ccgacaaggc cgacctgcgg ctgatctatc tggccctggc ccacatgatc 540aagttccggg gccacttcct gatcgagggc gacctgaacc ccgacaacag cgacgtggac 600aagctgttca tccagctggt gcagacctac aaccagctgt tcgaggaaaa ccccatcaac 660gccagcggcg tggacgccaa ggccatcctg tctgccagac tgagcaagag cagacggctg 720gaaaatctga tcgcccagct gcccggcgag aagaagaatg gcctgttcgg aaacctgatt 780gccctgagcc tgggcctgac ccccaacttc aagagcaact tcgacctggc cgaggatgcc 840aaactgcagc tgagcaagga cacctacgac gacgacctgg acaacctgct ggcccagatc 900ggcgaccagt acgccgacct gtttctggcc gccaagaacc tgtccgacgc catcctgctg 960agcgacatcc tgagagtgaa caccgagatc accaaggccc ccctgagcgc ctctatgatc 1020aagagatacg acgagcacca ccaggacctg accctgctga aagctctcgt gcggcagcag 1080ctgcctgaga agtacaaaga gattttcttc gaccagagca agaacggcta cgccggctac 1140attgacggcg gagccagcca ggaagagttc tacaagttca tcaagcccat cctggaaaag 1200atggacggca ccgaggaact gctcgtgaag ctgaacagag aggacctgct gcggaagcag 1260cggaccttcg acaacggcag catcccccac cagatccacc tgggagagct gcacgccatt 1320ctgcggcggc aggaagattt ttacccattc ctgaaggaca accgggaaaa gatcgagaag 1380atcctgacct tccgcatccc ctactacgtg ggccctctgg ccaggggaaa cagcagattc 1440gcctggatga ccagaaagag cgaggaaacc atcaccccct ggaacttcga ggaagtggtg 1500gacaagggcg cttccgccca gagcttcatc gagcggatga ccaacttcga taagaacctg 1560cccaacgaga aggtgctgcc caagcacagc ctgctgtacg agtacttcac cgtgtataac 1620gagctgacca aagtgaaata cgtgaccgag ggaatgagaa agcccgcctt 
cctgagcggc 1680gagcagaaaa aggccatcgt ggacctgctg ttcaagacca accggaaagt gaccgtgaag 1740cagctgaaag aggactactt caagaaaatc gagtgcttcg actccgtgga aatctccggc 1800gtggaagatc ggttcaacgc ctccctgggc acataccacg atctgctgaa aattatcaag 1860gacaaggact tcctggacaa tgaggaaaac gaggacattc tggaagatat cgtgctgacc 1920ctgacactgt ttgaggacag agagatgatc gaggaacggc tgaaaaccta tgcccacctg 1980ttcgacgaca aagtgatgaa gcagctgaag cggcggagat acaccggctg gggcaggctg 2040agccggaagc tgatcaacgg catccgggac aagcagtccg gcaagacaat cctggatttc 2100ctgaagtccg acggcttcgc caacagaaac ttcatgcagc tgatccacga cgacagcctg 2160acctttaaag aggacatcca gaaagcccag gtgtccggcc agggcgatag cctgcacgag 2220cacattgcca atctggccgg cagccccgcc attaagaagg gcatcctgca gacagtgaag 2280gtggtggacg agctcgtgaa agtgatgggc cggcacaagc ccgagaacat cgtgatcgaa 2340atggccagag agaaccagac cacccagaag ggacagaaga acagccgcga gagaatgaag 2400cggatcgaag agggcatcaa agagctgggc agccagatcc tgaaagaaca ccccgtggaa 2460aacacccagc tgcagaacga gaagctgtac ctgtactacc tgcagaatgg gcgggatatg 2520tacgtggacc aggaactgga catcaaccgg ctgtccgact acgatgtgga cgctatcgtg 2580cctcagagct ttctgaagga cgactccatc gacaacaagg tgctgaccag aagcgacaag 2640aaccggggca agagcgacaa cgtgccctcc gaagaggtcg tgaagaagat gaagaactac 2700tggcggcagc tgctgaacgc caagctgatt acccagagaa agttcgacaa tctgaccaag 2760gccgagagag gcggcctgag cgaactggat aaggccggct tcatcaagag acagctggtg 2820gaaacccggc agatcacaaa gcacgtggca cagatcctgg actcccggat gaacactaag 2880tacgacgaga atgacaagct gatccgggaa gtgaaagtga tcaccctgaa gtccaagctg 2940gtgtccgatt tccggaagga tttccagttt tacaaagtgc gcgagatcaa caactaccac 3000cacgcccacg acgcctacct gaacgccgtc gtgggaaccg ccctgatcaa aaagtaccct 3060aagctggaaa gcgagttcgt gtacggcgac tacaaggtgt acgacgtgcg gaagatgatc 3120gccaagagcg agcaggaaat cggcaaggct accgccaagt acttcttcta cagcaacatc 3180atgaactttt tcaagaccga gattaccctg gccaacggcg agatccggaa gcggcctctg 3240atcgagacaa acggcgaaac cggggagatc gtgtgggata agggccggga ttttgccacc 3300gtgcggaaag tgctgagcat gccccaagtg aatatcgtga aaaagaccga ggtgcagaca 3360ggcggcttca gcaaagagtc 
tatcctgccc aagaggaaca gcgataagct gatcgccaga 3420aagaaggact gggaccctaa gaagtacggc ggcttcgaca gccccaccgt ggcctattct 3480gtgctggtgg tggccaaagt ggaaaagggc aagtccaaga aactgaagag tgtgaaagag 3540ctgctgggga tcaccatcat ggaaagaagc agcttcgaga agaatcccat cgactttctg 3600gaagccaagg gctacaaaga agtgaaaaag gacctgatca tcaagctgcc taagtactcc 3660ctgttcgagc tggaaaacgg ccggaagaga atgctggcct ctgccggcga actgcagaag 3720ggaaacgaac tggccctgcc ctccaaatat gtgaacttcc tgtacctggc cagccactat 3780gagaagctga agggctcccc cgaggataat gagcagaaac agctgtttgt ggaacagcac 3840aagcactacc tggacgagat catcgagcag atcagcgagt tctccaagag agtgatcctg 3900gccgacgcta atctggacaa agtgctgtcc gcctacaaca agcaccggga taagcccatc 3960agagagcagg ccgagaatat catccacctg tttaccctga ccaatctggg agcccctgcc 4020gccttcaagt actttgacac caccatcgac cggaagaggt acaccagcac caaagaggtg 4080ctggacgcca ccctgatcca ccagagcatc accggcctgt acgagacacg gatcgacctg 4140tctcagctgg gaggtgactc tggaggatct agcggaggat cctctggcag cgagacacca 4200ggaacaagcg agtcagcaac accagagagc tctggtagcg agacacccgg taccagtgaa 4260agcgccacgc cagaaagcag tgggagtgag actccgggta catctgaatc agcgacaccg 4320gaatcaagtg gcggcagcag cggcggcagc agcaccctaa atatagaaga tgagtatcgg 4380ctacatgaga cctcaaaaga gccagatgtt tctctagggt ccacatggct gtctgatttt 4440cctcaggcct gggcggaaac cgggggcatg ggactggcag ttcgccaagc tcctctgatc 4500atacctctga aagcaacctc tacccccgtg tccataaaac aataccccat gtcacaagaa 4560gccagactgg ggatcaagcc ccacatacag agactgttgg accagggaat actggtaccc 4620tgccagtccc cctggaacac gcccctgcta cccgttaaga aaccagggac taatgattat 4680aggcctgtcc aggatctgag agaagtcaac aagcgggtgg aagacatcca ccccaccgtg 4740cccaaccctt acaacctctt gagcgggccc ccaccgtccc accagtggta cactgtgctt 4800gatttaaagg atgccttttt ctgcctgaga ctccacccca ccagtcagcc tctcttcgcc 4860tttgagtgga gagatccaga gatgggaatc tcaggacaat tgacctggac cagactccca 4920cagggtttca aaaacagtcc caccctgttt
aatgaggcac tgcacagaga cctagcagac 4980ttccggatcc agcacccaga cttgatcctg ctacagtacg tggatgactt actgctggcc 5040gccacttctg agctagactg ccaacaaggt actcgggccc tgttacaaac cctagggaac 5100ctcgggtatc gggcctcggc caagaaagcc caaatttgcc agaaacaggt caagtatctg 5160gggtatcttc taaaagaggg tcagagatgg ctgactgagg ccagaaaaga gactgtgatg 5220gggcagccta ctccgaagac ccctcgacaa ctaagggagt tcctagggaa ggcaggcttc 5280tgtcgcctct tcatccctgg gtttgcagaa atggcagccc ccctgtaccc tctcaccaaa 5340ccggggactc tgtttaattg gggcccagac caacaaaagg cctatcaaga aatcaagcaa 5400gctcttctaa ctgccccagc cctggggttg ccagatttga ctaagccctt tgaactcttt 5460gtcgacgaga agcagggcta cgccaaaggt gtcctaacgc aaaaactggg accttggcgt 5520cggccggtgg cctacctgtc caaaaagcta gacccagtag cagctgggtg gcccccttgc 5580ctacggatgg tagcagccat tgccgtactg acaaaggatg caggcaagct aaccatggga 5640cagccactag tcattctggc cccccatgca gtagaggcac tagtcaaaca accccccgac 5700cgctggcttt ccaacgcccg gatgactcac tatcaggcct tgcttttgga cacggaccgg 5760gtccagttcg gaccggtggt agccctgaac ccggctacgc tgctcccact gcctgaggaa 5820gggctgcaac acaactgcct tgatgggaca ggtggcggtg gtgtcaccgt caagttcaag 5880tacaagggtg aggaacttga agttgatatt agcaaaatca agaaggtttg gcgcgttggt 5940aaaatgatat cttttactta tgacgacaac ggcaagacag gtagaggggc agtgtctgag 6000aaagacgccc ccaaggagct gttgcaaatg ttggaaaagt ctgggaaaaa gtctggcggc 6060tcaaaaagaa ccgccgacgg cagcgaattc gagcccaaga agaagaggaa agtcggaggt 6120ggcgggagcc caaaaaagaa aagaaaagtg tatccctatg atgtccccga ttatgccggt 6180tcaagagccc tggtcgtgat tagactgagc cgagtgacag acgccaccac aagtcccgag 6240agacagctgg aatcatgcca gcagctctgt gctcagcggg gttgggatgt ggtcggcgtg 6300gcagaggatc tggacgtgag cggggccgtc gatccattcg acagaaagag gaggcccaac 6360ctggcaagat ggctcgcttt cgaggaacag ccctttgatg tgatcgtcgc ctacagagtg 6420gaccggctga cccgctcaat tcgacatctc cagcagctgg tgcattgggc tgaggaccac 6480aagaaactgg tggtcagcgc aacagaagcc cacttcgata ctaccacacc ttttgccgct 6540gtggtcatcg cactgatggg cactgtggcc cagatggagc tcgaagctat caaggagcga 6600aacaggagcg cagcccattt caatattagg gccggtaaat acagaggctc cctgccccct 
6660tggggatatc tccctaccag ggtggatggg gagtggagac tggtgccaga ccccgtccag 6720agagagcgga ttctggaagt gtaccacaga gtggtcgata accacgaacc actccatctg 6780gtggcacacg acctgaatag acgcggcgtg ctctctccaa aggattattt tgctcagctg 6840cagggaagag agccacaggg aagagaatgg agtgctactg cactgaagag atctatgatc 6900agtgaggcta tgctgggtta cgcaacactc aatggcaaaa ctgtccggga cgatgacgga 6960gcccctctgg tgagggctga gcctattctc accagagagc agctcgaagc tctgcgggca 7020gaactggtca agactagtcg cgccaaacct gccgtgagca ccccaagcct gctcctgagg 7080gtgctgttct gcgccgtctg tggagagcca gcatacaagt ttgccggcgg agggcgcaaa 7140catccccgct atcgatgcag gagcatgggg ttccctaagc actgtggaaa cgggacagtg 7200gccatggctg agtgggacgc cttttgcgag gaacaggtgc tggatctcct gggtgacgct 7260gagcggctgg aaaaagtgtg ggtggcagga tctgactccg ctgtggagct ggcagaagtc 7320aatgccgagc tcgtggatct gacttccctc atcggatctc ctgcatatag agctgggtcc 7380ccacagagag aagctctgga cgcacgaatt gctgcactcg ctgctagaca ggaggaactg 7440gagggcctgg aggccaggcc ctctggatgg gagtggcgag aaaccggaca gaggtttggg 7500gattggtgga gggagcagga caccgcagcc aagaacacat ggctgagatc catgaatgtc 7560cggctcacat tcgacgtgcg cggtggcctg actcgaacca tcgattttgg cgacctgcag 7620gagtatgaac agcacctgag actggggtcc gtggtcgaaa gactgcacac tgggatgtcc 76803781367PRTArtificial SequenceDescription of Artificial Sequence Synthetic polypeptide 378Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val Gly1 5 10 15Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys 20 25 30Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly 35 40 45Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys 50 55 60Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr65 70 75 80Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe 85 90 95Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His 100 105 110Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His 115 120 125Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser 130 135 140Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala 
His Met145 150 155 160Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp 165 170 175Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn 180 185 190Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys 195 200 205Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu 210 215 220Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu225 230 235 240Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp 245 250 255Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp 260 265 270Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu 275 280 285Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile 290 295 300Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met305 310 315 320Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala 325 330 335Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp 340 345 350Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln 355 360 365Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly 370 375 380Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys385 390 395 400Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly 405 410 415Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu 420 425 430Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro 435 440 445Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met 450 455 460Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val465 470 475 480Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn 485 490 495Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu 500 505 510Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr 515 520 525Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys 530 535 540Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val545 550 555 560Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser 565 570 575Val Glu Ile 
Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr 580 585 590Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn 595 600 605Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu 610 615 620Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His625 630 635 640Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr 645 650 655Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys 660 665 670Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala 675 680 685Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys 690 695 700Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His705 710 715 720Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile 725 730 735Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg 740 745 750His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr 755 760 765Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu 770 775 780Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val785 790 795 800Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln 805 810 815Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu 820 825 830Ser Asp Tyr Asp Val Asp Ala Ile Val Pro Gln Ser Phe Leu Lys Asp 835 840 845Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly 850 855 860Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn865 870 875 880Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe 885 890 895Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys 900 905 910Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys 915 920 925His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu 930 935 940Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys945 950 955 960Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu 965 970 975Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val 980 985 990Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu 
Glu Ser Glu Phe Val 995 1000 1005Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys 1010 1015 1020Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr 1025 1030 1035Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn 1040 1045 1050Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr 1055 1060 1065Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg 1070 1075 1080Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu 1085 1090 1095Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg 1100 1105 1110Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys 1115 1120 1125Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu 1130 1135 1140Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser 1145 1150 1155Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe 1160 1165 1170Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu 1175 1180 1185Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe 1190 1195 1200Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu 1205 1210 1215Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn 1220 1225 1230Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro 1235 1240 1245Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His 1250 1255 1260Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg 1265 1270 1275Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr 1280 1285 1290Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile 1295 1300 1305Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe 1310 1315 1320Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr 1325 1330 1335Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly 1340 1345 1350Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360 1365379576PRTArtificial SequenceDescription of Artificial Sequence Synthetic polypeptide 379Leu Asn Ile Glu Asp Glu Tyr Arg Leu His Glu Thr Ser Lys Glu Pro1 5 10 15Asp Val Ser Leu Gly Ser Thr Trp 
Leu Ser Asp Phe Pro Gln Ala Trp 20 25 30Ala Glu Thr Gly Gly Met Gly Leu Ala Val Arg Gln Ala Pro Leu Ile 35 40 45Ile Pro Leu Lys Ala Thr Ser Thr Pro Val Ser Ile Lys Gln Tyr Pro 50 55 60Met Ser Gln Glu Ala Arg Leu Gly Ile Lys Pro His Ile Gln Arg Leu65 70 75 80Leu Asp Gln Gly Ile Leu Val Pro Cys Gln Ser Pro Trp Asn Thr Pro 85 90 95Leu Leu Pro Val Lys Lys Pro Gly Thr Asn Asp Tyr Arg Pro Val Gln 100 105 110Asp Leu Arg Glu Val Asn Lys Arg Val Glu Asp Ile His Pro Thr Val 115 120 125Pro Asn Pro Tyr Asn Leu Leu Ser Gly Pro Pro Pro Ser His Gln Trp 130 135 140Tyr Thr Val Leu Asp Leu Lys Asp Ala Phe Phe Cys Leu Arg Leu His145 150 155 160Pro Thr Ser Gln Pro Leu Phe Ala Phe Glu Trp Arg Asp Pro Glu Met 165 170 175Gly Ile Ser Gly Gln Leu Thr Trp Thr Arg Leu Pro Gln Gly Phe Lys 180 185 190Asn Ser Pro Thr Leu Phe Asn Glu Ala Leu His Arg Asp Leu Ala Asp 195 200 205Phe Arg Ile Gln His Pro Asp Leu Ile Leu Leu Gln Tyr Val Asp Asp 210 215 220Leu Leu Leu Ala Ala Thr Ser Glu Leu Asp Cys Gln Gln Gly Thr Arg225 230 235 240Ala Leu Leu Gln Thr Leu Gly Asn Leu Gly Tyr Arg Ala Ser Ala Lys 245 250 255Lys Ala Gln Ile Cys Gln Lys Gln Val Lys Tyr Leu Gly Tyr Leu Leu 260 265 270Lys Glu Gly Gln Arg Trp Leu Thr Glu Ala Arg Lys Glu Thr Val Met 275 280 285Gly Gln Pro Thr Pro Lys Thr Pro Arg Gln Leu Arg Glu Phe Leu Gly 290 295 300Lys Ala Gly Phe Cys Arg Leu Phe Ile Pro Gly Phe Ala Glu Met Ala305 310 315 320Ala Pro Leu Tyr Pro Leu Thr Lys Pro Gly Thr Leu Phe Asn Trp Gly 325 330 335Pro Asp Gln Gln Lys Ala Tyr Gln Glu Ile Lys Gln Ala Leu Leu Thr 340 345 350Ala Pro Ala Leu Gly Leu Pro Asp Leu Thr Lys Pro Phe Glu Leu Phe 355 360 365Val Asp Glu Lys Gln Gly Tyr Ala Lys Gly Val Leu Thr Gln Lys Leu 370 375 380Gly Pro Trp Arg Arg Pro Val Ala Tyr Leu Ser Lys Lys Leu Asp Pro385 390 395 400Val Ala Ala Gly Trp Pro Pro Cys Leu Arg Met Val Ala Ala Ile Ala 405 410 415Val Leu Thr Lys Asp Ala Gly Lys Leu Thr Met Gly Gln Pro Leu Val 420 425 430Ile Leu Ala Pro His Ala Val Glu Ala Leu Val Lys Gln Pro Pro Asp 435 440 445Arg Trp 
Leu Ser Asn Ala Arg Met Thr His Tyr Gln Ala Leu Leu Leu 450 455 460Asp Thr Asp Arg Val Gln Phe Gly Pro Val Val Ala Leu Asn Pro Ala465 470 475 480Thr Leu Leu Pro Leu Pro Glu Glu Gly Leu Gln His Asn Cys Leu Asp 485 490 495Gly Thr Gly Gly Gly Gly Val Thr Val Lys Phe Lys Tyr Lys Gly Glu 500 505 510Glu Leu Glu Val Asp Ile Ser Lys Ile Lys Lys Val Trp Arg Val Gly 515 520 525Lys Met Ile Ser Phe Thr Tyr Asp Asp Asn Gly Lys Thr Gly Arg Gly 530 535 540Ala Val Ser Glu Lys Asp Ala Pro Lys Glu Leu Leu Gln Met Leu Glu545 550 555 560Lys Ser Gly Lys Lys Ser Gly Gly Ser Lys Arg Thr Ala Asp Gly Ser 565 570 575
SEQ ID NO: 380; length: 500; type: PRT; organism: Artificial Sequence; Description of Artificial Sequence: Synthetic polypeptide
380 Ser Arg Ala Leu Val Val Ile Arg Leu Ser Arg Val Thr Asp Ala Thr1 5 10 15Thr Ser Pro Glu Arg Gln Leu Glu Ser Cys Gln Gln Leu Cys Ala Gln 20 25 30Arg Gly Trp Asp Val Val Gly
Val Ala Glu Asp Leu Asp Val Ser Gly 35 40 45Ala Val Asp Pro Phe Asp Arg Lys Arg Arg Pro Asn Leu Ala Arg Trp 50 55 60Leu Ala Phe Glu Glu Gln Pro Phe Asp Val Ile Val Ala Tyr Arg Val65 70 75 80Asp Arg Leu Thr Arg Ser Ile Arg His Leu Gln Gln Leu Val His Trp 85 90 95Ala Glu Asp His Lys Lys Leu Val Val Ser Ala Thr Glu Ala His Phe 100 105 110Asp Thr Thr Thr Pro Phe Ala Ala Val Val Ile Ala Leu Met Gly Thr 115 120 125Val Ala Gln Met Glu Leu Glu Ala Ile Lys Glu Arg Asn Arg Ser Ala 130 135 140Ala His Phe Asn Ile Arg Ala Gly Lys Tyr Arg Gly Ser Leu Pro Pro145 150 155 160Trp Gly Tyr Leu Pro Thr Arg Val Asp Gly Glu Trp Arg Leu Val Pro 165 170 175Asp Pro Val Gln Arg Glu Arg Ile Leu Glu Val Tyr His Arg Val Val 180 185 190Asp Asn His Glu Pro Leu His Leu Val Ala His Asp Leu Asn Arg Arg 195 200 205Gly Val Leu Ser Pro Lys Asp Tyr Phe Ala Gln Leu Gln Gly Arg Glu 210 215 220Pro Gln Gly Arg Glu Trp Ser Ala Thr Ala Leu Lys Arg Ser Met Ile225 230 235 240Ser Glu Ala Met Leu Gly Tyr Ala Thr Leu Asn Gly Lys Thr Val Arg 245 250 255Asp Asp Asp Gly Ala Pro Leu Val Arg Ala Glu Pro Ile Leu Thr Arg 260 265 270Glu Gln Leu Glu Ala Leu Arg Ala Glu Leu Val Lys Thr Ser Arg Ala 275 280 285Lys Pro Ala Val Ser Thr Pro Ser Leu Leu Leu Arg Val Leu Phe Cys 290 295 300Ala Val Cys Gly Glu Pro Ala Tyr Lys Phe Ala Gly Gly Gly Arg Lys305 310 315 320His Pro Arg Tyr Arg Cys Arg Ser Met Gly Phe Pro Lys His Cys Gly 325 330 335Asn Gly Thr Val Ala Met Ala Glu Trp Asp Ala Phe Cys Glu Glu Gln 340 345 350Val Leu Asp Leu Leu Gly Asp Ala Glu Arg Leu Glu Lys Val Trp Val 355 360 365Ala Gly Ser Asp Ser Ala Val Glu Leu Ala Glu Val Asn Ala Glu Leu 370 375 380Val Asp Leu Thr Ser Leu Ile Gly Ser Pro Ala Tyr Arg Ala Gly Ser385 390 395 400Pro Gln Arg Glu Ala Leu Asp Ala Arg Ile Ala Ala Leu Ala Ala Arg 405 410 415Gln Glu Glu Leu Glu Gly Leu Glu Ala Arg Pro Ser Gly Trp Glu Trp 420 425 430Arg Glu Thr Gly Gln Arg Phe Gly Asp Trp Trp Arg Glu Gln Asp Thr 435 440 445Ala Ala Lys Asn Thr Trp Leu Arg Ser Met Asn Val Arg Leu Thr Phe 450 455 
460Asp Val Arg Gly Gly Leu Thr Arg Thr Ile Asp Phe Gly Asp Leu Gln465 470 475 480Glu Tyr Glu Gln His Leu Arg Leu Gly Ser Val Val Glu Arg Leu His 485 490 495Thr Gly Met Ser 50038111344DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 381ccgaaaagtg ccacctgacg tcgacggatc gggagatcga tctcccgatc ccctagggtc 60gactctcagt acaatctgct ctgatgccgc atagttaagc cagtatctgc tccctgcttg 120tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa gctacaacaa ggcaaggctt 180gaccgacaat tgcatgaaga atctgcttag ggttaggcgt tttgcgctgc ttcgcgatgt 240acgggccaga tatacgcgtt gacattgatt attgactagt tattaatagt aatcaattac 300ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 360cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 420catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 480tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 540tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 600ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 660catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 720cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 780ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 840agctggttta gtgaaccgtc agatccgcta gagatccgcg gccgctaata cgactcacta 900tagggagagc cgccaccatg aaacggacag ccgacggaag cgagttcgag tcaccaaaga 960agaagcggaa agtcgacaag aagtacagca tcggcctgga catcggcacc aactctgtgg 1020gctgggccgt gatcaccgac gagtacaagg tgcccagcaa gaaattcaag gtgctgggca 1080acaccgaccg gcacagcatc aagaagaacc tgatcggagc cctgctgttc gacagcggcg 1140aaacagccga ggccacccgg ctgaagagaa ccgccagaag aagatacacc agacggaaga 1200accggatctg ctatctgcaa gagatcttca gcaacgagat ggccaaggtg gacgacagct 1260tcttccacag actggaagag tccttcctgg tggaagagga taagaagcac gagcggcacc 1320ccatcttcgg caacatcgtg gacgaggtgg cctaccacga gaagtacccc accatctacc 1380acctgagaaa gaaactggtg gacagcaccg acaaggccga cctgcggctg atctatctgg 1440ccctggccca catgatcaag ttccggggcc acttcctgat cgagggcgac ctgaaccccg 
1500acaacagcga cgtggacaag ctgttcatcc agctggtgca gacctacaac cagctgttcg 1560aggaaaaccc catcaacgcc agcggcgtgg acgccaaggc catcctgtct gccagactga 1620gcaagagcag acggctggaa aatctgatcg cccagctgcc cggcgagaag aagaatggcc 1680tgttcggaaa cctgattgcc ctgagcctgg gcctgacccc caacttcaag agcaacttcg 1740acctggccga ggatgccaaa ctgcagctga gcaaggacac ctacgacgac gacctggaca 1800acctgctggc ccagatcggc gaccagtacg ccgacctgtt tctggccgcc aagaacctgt 1860ccgacgccat cctgctgagc gacatcctga gagtgaacac cgagatcacc aaggcccccc 1920tgagcgcctc tatgatcaag agatacgacg agcaccacca ggacctgacc ctgctgaaag 1980ctctcgtgcg gcagcagctg cctgagaagt acaaagagat tttcttcgac cagagcaaga 2040acggctacgc cggctacatt gacggcggag ccagccagga agagttctac aagttcatca 2100agcccatcct ggaaaagatg gacggcaccg aggaactgct cgtgaagctg aacagagagg 2160acctgctgcg gaagcagcgg accttcgaca acggcagcat cccccaccag atccacctgg 2220gagagctgca cgccattctg cggcggcagg aagattttta cccattcctg aaggacaacc 2280gggaaaagat cgagaagatc ctgaccttcc gcatccccta ctacgtgggc cctctggcca 2340ggggaaacag cagattcgcc tggatgacca gaaagagcga ggaaaccatc accccctgga 2400acttcgagga agtggtggac aagggcgctt ccgcccagag cttcatcgag cggatgacca 2460acttcgataa gaacctgccc aacgagaagg tgctgcccaa gcacagcctg ctgtacgagt 2520acttcaccgt gtataacgag ctgaccaaag tgaaatacgt gaccgaggga atgagaaagc 2580ccgccttcct gagcggcgag cagaaaaagg ccatcgtgga cctgctgttc aagaccaacc 2640ggaaagtgac cgtgaagcag ctgaaagagg actacttcaa gaaaatcgag tgcttcgact 2700ccgtggaaat ctccggcgtg gaagatcggt tcaacgcctc cctgggcaca taccacgatc 2760tgctgaaaat tatcaaggac aaggacttcc tggacaatga ggaaaacgag gacattctgg 2820aagatatcgt gctgaccctg acactgtttg aggacagaga gatgatcgag gaacggctga 2880aaacctatgc ccacctgttc gacgacaaag tgatgaagca gctgaagcgg cggagataca 2940ccggctgggg caggctgagc cggaagctga tcaacggcat ccgggacaag cagtccggca 3000agacaatcct ggatttcctg aagtccgacg gcttcgccaa cagaaacttc atgcagctga 3060tccacgacga cagcctgacc tttaaagagg acatccagaa agcccaggtg tccggccagg 3120gcgatagcct gcacgagcac attgccaatc tggccggcag ccccgccatt aagaagggca 3180tcctgcagac agtgaaggtg gtggacgagc 
tcgtgaaagt gatgggccgg cacaagcccg 3240agaacatcgt gatcgaaatg gccagagaga accagaccac ccagaaggga cagaagaaca 3300gccgcgagag aatgaagcgg atcgaagagg gcatcaaaga gctgggcagc cagatcctga 3360aagaacaccc cgtggaaaac acccagctgc agaacgagaa gctgtacctg tactacctgc 3420agaatgggcg ggatatgtac gtggaccagg aactggacat caaccggctg tccgactacg 3480atgtggacgc tatcgtgcct cagagctttc tgaaggacga ctccatcgac aacaaggtgc 3540tgaccagaag cgacaagaac cggggcaaga gcgacaacgt gccctccgaa gaggtcgtga 3600agaagatgaa gaactactgg cggcagctgc tgaacgccaa gctgattacc cagagaaagt 3660tcgacaatct gaccaaggcc gagagaggcg gcctgagcga actggataag gccggcttca 3720tcaagagaca gctggtggaa acccggcaga tcacaaagca cgtggcacag atcctggact 3780cccggatgaa cactaagtac gacgagaatg acaagctgat ccgggaagtg aaagtgatca 3840ccctgaagtc caagctggtg tccgatttcc ggaaggattt ccagttttac aaagtgcgcg 3900agatcaacaa ctaccaccac gcccacgacg cctacctgaa cgccgtcgtg ggaaccgccc 3960tgatcaaaaa gtaccctaag ctggaaagcg agttcgtgta cggcgactac aaggtgtacg 4020acgtgcggaa gatgatcgcc aagagcgagc aggaaatcgg caaggctacc gccaagtact 4080tcttctacag caacatcatg aactttttca agaccgagat taccctggcc aacggcgaga 4140tccggaagcg gcctctgatc gagacaaacg gcgaaaccgg ggagatcgtg tgggataagg 4200gccgggattt tgccaccgtg cggaaagtgc tgagcatgcc ccaagtgaat atcgtgaaaa 4260agaccgaggt gcagacaggc ggcttcagca aagagtctat cctgcccaag aggaacagcg 4320ataagctgat cgccagaaag aaggactggg accctaagaa gtacggcggc ttcgacagcc 4380ccaccgtggc ctattctgtg ctggtggtgg ccaaagtgga aaagggcaag tccaagaaac 4440tgaagagtgt gaaagagctg ctggggatca ccatcatgga aagaagcagc ttcgagaaga 4500atcccatcga ctttctggaa gccaagggct acaaagaagt gaaaaaggac ctgatcatca 4560agctgcctaa gtactccctg ttcgagctgg aaaacggccg gaagagaatg ctggcctctg 4620ccggcgaact gcagaaggga aacgaactgg ccctgccctc caaatatgtg aacttcctgt 4680acctggccag ccactatgag aagctgaagg gctcccccga ggataatgag cagaaacagc 4740tgtttgtgga acagcacaag cactacctgg acgagatcat cgagcagatc agcgagttct 4800ccaagagagt gatcctggcc gacgctaatc tggacaaagt gctgtccgcc tacaacaagc 4860accgggataa gcccatcaga gagcaggccg agaatatcat ccacctgttt accctgacca 
4920atctgggagc ccctgccgcc ttcaagtact ttgacaccac catcgaccgg aagaggtaca 4980ccagcaccaa agaggtgctg gacgccaccc tgatccacca gagcatcacc ggcctgtacg 5040agacacggat cgacctgtct cagctgggag gtgactctgg aggatctagc ggaggatcct 5100ctggcagcga gacaccagga acaagcgagt cagcaacacc agagagcagt ggcggcagca 5160gcggcggcag cagcacccta aatatagaag atgagtatcg gctacatgag acctcaaaag 5220agccagatgt ttctctaggg tccacatggc tgtctgattt tcctcaggcc tgggcggaaa 5280ccgggggcat gggactggca gttcgccaag ctcctctgat catacctctg aaagcaacct 5340ctacccccgt gtccataaaa caatacccca tgtcacaaga agccagactg gggatcaagc 5400cccacataca gagactgttg gaccagggaa tactggtacc ctgccagtcc ccctggaaca 5460cgcccctgct acccgttaag aaaccaggga ctaatgatta taggcctgtc caggatctga 5520gagaagtcaa caagcgggtg gaagacatcc accccaccgt gcccaaccct tacaacctct 5580tgagcgggct cccaccgtcc caccagtggt acactgtgct tgatttaaag gatgcctttt 5640tctgcctgag actccacccc accagtcagc ctctcttcgc ctttgagtgg agagatccag 5700agatgggaat ctcaggacaa ttgacctgga ccagactccc acagggtttc aaaaacagtc 5760ccaccctgtt taatgaggca ctgcacagag acctagcaga cttccggatc cagcacccag 5820acttgatcct gctacagtac gtggatgact tactgctggc cgccacttct gagctagact 5880gccaacaagg tactcgggcc ctgttacaaa ccctagggaa cctcgggtat cgggcctcgg 5940ccaagaaagc ccaaatttgc cagaaacagg tcaagtatct ggggtatctt ctaaaagagg 6000gtcagagatg gctgactgag gccagaaaag agactgtgat ggggcagcct actccgaaga 6060cccctcgaca actaagggag ttcctaggga aggcaggctt ctgtcgcctc ttcatccctg 6120ggtttgcaga aatggcagcc cccctgtacc ctctcaccaa accggggact ctgtttaatt 6180ggggcccaga ccaacaaaag gcctatcaag aaatcaagca agctcttcta actgccccag 6240ccctggggtt gccagatttg actaagccct ttgaactctt tgtcgacgag aagcagggct 6300acgccaaagg tgtcctaacg caaaaactgg gaccttggcg tcggccggtg gcctacctgt 6360ccaaaaagct agacccagta gcagctgggt ggcccccttg cctacggatg gtagcagcca 6420ttgccgtact gacaaaggat gcaggcaagc taaccatggg acagccacta gtcattctgg 6480ccccccatgc agtagaggca ctagtcaaac aaccccccga ccgctggctt tccaacgccc 6540ggatgactca ctatcaggcc ttgcttttgg acacggaccg ggtccagttc ggaccggtgg 6600tagccctgaa cccggctacg ctgctcccac 
tgcctgagga agggctgcaa cacaactgcc 6660ttgatatcct ggccgaagcc cacggaaccc gacccgacct aacggaccag ccgctcccag 6720acgccgacca cacctggtac acggatggaa gcagtctctt acaagaggga cagcgtaagg 6780cgggagctgc ggtgaccacc gagaccgagg taatctgggc taaagccctg ccagccggga 6840catccgctca gcgggctgaa ctgatagcac tcacccaggc cctaaagatg gcagaaggta 6900agaagctaaa tgtttatact gatagccgtt atgcttttgc tactgcccat atccatggag 6960aaatatacag aaggcgtggg tggctcacat cagaaggcaa agagatcaaa aataaagacg 7020agatcttggc cctactaaaa gccctctttc tgcccaaaag acttagcata atccattgtc 7080caggacatca aaagggacac agcgccgagg ctagaggcaa ccggatggct gaccaagcgg 7140cccgaaaggc agccatcaca gagactccag acacctctac cctcctcata gaaaattcat 7200caccctctgg cggctcaaaa agaaccgccg acggcagcga attcgagccc aagaagaaga 7260ggaaagtcgg aagcggagct actaacttca gcctgctgaa gcaggctggc gacgtggagg 7320agaaccctgg acctccaaaa aagaaaagaa aagtgtatcc ctatgatgtc cccgattatg 7380ccggttcaag agccctggtc gtgattagac tgagccgagt gacagacgcc accacaagtc 7440ccgagagaca gctggaatca tgccagcagc tctgtgctca gcggggttgg gatgtggtcg 7500gcgtggcaga ggatctggac gtgagcgggg ccgtcgatcc attcgacaga aagaggaggc 7560ccaacctggc aagatggctc gctttcgagg aacagccctt tgatgtgatc gtcgcctaca 7620gagtggaccg gctgacccgc tcaattcgac atctccagca gctggtgcat tgggctgagg 7680accacaagaa actggtggtc agcgcaacag aagcccactt cgatactacc acaccttttg 7740ccgctgtggt catcgcactg atgggcactg tggcccagat ggagctcgaa gctatcaagg 7800agcgaaacag gagcgcagcc catttcaata ttagggccgg taaatacaga ggctccctgc 7860ccccttgggg atatctccct accagggtgg atggggagtg gagactggtg ccagaccccg 7920tccagagaga gcggattctg gaagtgtacc acagagtggt cgataaccac gaaccactcc 7980atctggtggc acacgacctg aatagacgcg gcgtgctctc tccaaaggat tattttgctc 8040agctgcaggg aagagagcca cagggaagag aatggagtgc tactgcactg aagagatcta 8100tgatcagtga ggctatgctg ggttacgcaa cactcaatgg caaaactgtc cgggacgatg 8160acggagcccc tctggtgagg gctgagccta ttctcaccag agagcagctc gaagctctgc 8220gggcagaact ggtcaagact agtcgcgcca aacctgccgt gagcacccca agcctgctcc 8280tgagggtgct gttctgcgcc gtctgtggag agccagcata caagtttgcc ggcggagggc 
8340gcaaacatcc ccgctatcga tgcaggagca tggggttccc taagcactgt ggaaacggga 8400cagtggccat ggctgagtgg gacgcctttt gcgaggaaca ggtgctggat ctcctgggtg 8460acgctgagcg gctggaaaaa gtgtgggtgg caggatctga ctccgctgtg gagctggcag 8520aagtcaatgc cgagctcgtg gatctgactt ccctcatcgg atctcctgca tatagagctg 8580ggtccccaca gagagaagct ctggacgcac gaattgctgc actcgctgct agacaggagg 8640aactggaggg cctggaggcc aggccctctg gatgggagtg gcgagaaacc ggacagaggt 8700ttggggattg gtggagggag caggacaccg cagccaagaa cacatggctg agatccatga 8760atgtccggct cacattcgac gtgcgcggtg gcctgactcg aaccatcgat tttggcgacc 8820tgcaggagta tgaacagcac ctgagactgg ggtccgtggt cgaaagactg cacactggga 8880tgtcctaggt ttaaacccgc tgatcagcct cgactgtgcc ttctagttgc cagccatctg 8940ttgtttgccc ctcccccgtg ccttccttga ccctggaagg tgccactccc actgtccttt 9000cctaataaaa tgagaaaatt gcatcgcatt gtctgagtag gtgtcattct attctggggg 9060gtggggtggg gcaggacagc aagggggagg attgggaaga caatagcagg catgctgggg 9120atgcggtggg ctctatggct tctgaggcgg aaagaaccag ctggggctcg ataccgtcga 9180cctctagcta gagcttggcg taatcatggt catagctgtt tcctgtgtga aattgttatc 9240cgctcacaat tccacacaac atacgagccg gaagcataaa gtgtaaagcc tagggtgcct 9300aatgagtgag ctaactcaca ttaattgcgt tgcgctcact gcccgctttc cagtcgggaa 9360acctgtcgtg ccagctgcat taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta 9420ttgggcgctc ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc 9480gagcggtatc agctcactca aaggcggtaa tacggttatc cacagaatca ggggataacg 9540caggaaagaa catgtgagca aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt 9600tgctggcgtt tttccatagg ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa 9660gtcagaggtg gcgaaacccg acaggactat aaagatacca ggcgtttccc cctggaagct 9720ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc 9780cttcgggaag cgtggcgctt tctcatagct cacgctgtag gtatctcagt tcggtgtagg 9840tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct 9900tatccggtaa ctatcgtctt gagtccaacc cggtaagaca cgacttatcg ccactggcag 9960cagccactgg taacaggatt agcagagcga ggtatgtagg cggtgctaca gagttcttga 10020agtggtggcc taactacggc tacactagaa 
gaacagtatt tggtatctgc gctctgctga 10080agccagttac cttcggaaaa agagttggta gctcttgatc cggcaaacaa accaccgctg 10140gtagcggtgg tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag 10200aagatccttt gatcttttct acggggtctg acgctcagtg gaacgaaaac tcacgttaag 10260ggattttggt catgagatta tcaaaaagga tcttcaccta gatcctttta aattaaaaat 10320gaagttttaa atcaatctaa agtatatatg agtaaacttg gtctgacagt taccaatgct 10380taatcagtga ggcacctatc tcagcgatct gtctatttcg ttcatccata gttgcctgac 10440tccccgtcgt gtagataact acgatacggg agggcttacc atctggcccc agtgctgcaa 10500tgataccgcg agacccacgc tcaccggctc cagatttatc agcaataaac cagccagccg 10560gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag tctattaatt 10620gttgccggga agctagagta agtagttcgc cagttaatag tttgcgcaac gttgttgcca 10680ttgctacagg catcgtggtg tcacgctcgt cgtttggtat ggcttcattc agctccggtt 10740cccaacgatc aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg gttagctcct 10800tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt gttatcactc atggttatgg 10860cagcactgca taattctctt actgtcatgc catccgtaag atgcttttct gtgactggtg 10920agtactcaac caagtcattc tgagaatagt gtatgcggcg accgagttgc tcttgcccgg 10980cgtcaatacg ggataatacc gcgccacata gcagaacttt aaaagtgctc atcattggaa 11040aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc agttcgatgt 11100aacccactcg tgcacccaac tgatcttcag catcttttac tttcaccagc gtttctgggt 11160gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt 11220gaatactcat actcttcctt tttcaatatt attgaagcat ttatcagggt tattgtctca 11280tgagcggata catatttgaa tgtatttaga aaaataaaca aataggggtt ccgcgcacat 11340ttcc 113443829753DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 382ccgaaaagtg ccacctgacg tcgacggatc gggagatcga tctcccgatc ccctagggtc 60gactctcagt acaatctgct ctgatgccgc atagttaagc cagtatctgc tccctgcttg 120tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa gctacaacaa ggcaaggctt 180gaccgacaat tgcatgaaga atctgcttag ggttaggcgt tttgcgctgc ttcgcgatgt 240acgggccaga tatacgcgtt gacattgatt attgactagt tattaatagt aatcaattac 300ggggtcatta gttcatagcc 
catatatgga gttccgcgtt acataactta cggtaaatgg 360cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 420catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 480tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 540tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 600ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta
660catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 720cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 780ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 840agctggttta gtgaaccgtc agatccgcta gagatccgcg gccgctaata cgactcacta 900tagggagagc cgccaccatg aaacggacag ccgacggaag cgagttcgag tcaccaaaga 960agaagcggaa agtcgacaag aagtacagca tcggcctgga catcggcacc aactctgtgg 1020gctgggccgt gatcaccgac gagtacaagg tgcccagcaa gaaattcaag gtgctgggca 1080acaccgaccg gcacagcatc aagaagaacc tgatcggagc cctgctgttc gacagcggcg 1140aaacagccga ggccacccgg ctgaagagaa ccgccagaag aagatacacc agacggaaga 1200accggatctg ctatctgcaa gagatcttca gcaacgagat ggccaaggtg gacgacagct 1260tcttccacag actggaagag tccttcctgg tggaagagga taagaagcac gagcggcacc 1320ccatcttcgg caacatcgtg gacgaggtgg cctaccacga gaagtacccc accatctacc 1380acctgagaaa gaaactggtg gacagcaccg acaaggccga cctgcggctg atctatctgg 1440ccctggccca catgatcaag ttccggggcc acttcctgat cgagggcgac ctgaaccccg 1500acaacagcga cgtggacaag ctgttcatcc agctggtgca gacctacaac cagctgttcg 1560aggaaaaccc catcaacgcc agcggcgtgg acgccaaggc catcctgtct gccagactga 1620gcaagagcag acggctggaa aatctgatcg cccagctgcc cggcgagaag aagaatggcc 1680tgttcggaaa cctgattgcc ctgagcctgg gcctgacccc caacttcaag agcaacttcg 1740acctggccga ggatgccaaa ctgcagctga gcaaggacac ctacgacgac gacctggaca 1800acctgctggc ccagatcggc gaccagtacg ccgacctgtt tctggccgcc aagaacctgt 1860ccgacgccat cctgctgagc gacatcctga gagtgaacac cgagatcacc aaggcccccc 1920tgagcgcctc tatgatcaag agatacgacg agcaccacca ggacctgacc ctgctgaaag 1980ctctcgtgcg gcagcagctg cctgagaagt acaaagagat tttcttcgac cagagcaaga 2040acggctacgc cggctacatt gacggcggag ccagccagga agagttctac aagttcatca 2100agcccatcct ggaaaagatg gacggcaccg aggaactgct cgtgaagctg aacagagagg 2160acctgctgcg gaagcagcgg accttcgaca acggcagcat cccccaccag atccacctgg 2220gagagctgca cgccattctg cggcggcagg aagattttta cccattcctg aaggacaacc 2280gggaaaagat cgagaagatc ctgaccttcc gcatccccta ctacgtgggc cctctggcca 2340ggggaaacag cagattcgcc tggatgacca 
gaaagagcga ggaaaccatc accccctgga 2400acttcgagga agtggtggac aagggcgctt ccgcccagag cttcatcgag cggatgacca 2460acttcgataa gaacctgccc aacgagaagg tgctgcccaa gcacagcctg ctgtacgagt 2520acttcaccgt gtataacgag ctgaccaaag tgaaatacgt gaccgaggga atgagaaagc 2580ccgccttcct gagcggcgag cagaaaaagg ccatcgtgga cctgctgttc aagaccaacc 2640ggaaagtgac cgtgaagcag ctgaaagagg actacttcaa gaaaatcgag tgcttcgact 2700ccgtggaaat ctccggcgtg gaagatcggt tcaacgcctc cctgggcaca taccacgatc 2760tgctgaaaat tatcaaggac aaggacttcc tggacaatga ggaaaacgag gacattctgg 2820aagatatcgt gctgaccctg acactgtttg aggacagaga gatgatcgag gaacggctga 2880aaacctatgc ccacctgttc gacgacaaag tgatgaagca gctgaagcgg cggagataca 2940ccggctgggg caggctgagc cggaagctga tcaacggcat ccgggacaag cagtccggca 3000agacaatcct ggatttcctg aagtccgacg gcttcgccaa cagaaacttc atgcagctga 3060tccacgacga cagcctgacc tttaaagagg acatccagaa agcccaggtg tccggccagg 3120gcgatagcct gcacgagcac attgccaatc tggccggcag ccccgccatt aagaagggca 3180tcctgcagac agtgaaggtg gtggacgagc tcgtgaaagt gatgggccgg cacaagcccg 3240agaacatcgt gatcgaaatg gccagagaga accagaccac ccagaaggga cagaagaaca 3300gccgcgagag aatgaagcgg atcgaagagg gcatcaaaga gctgggcagc cagatcctga 3360aagaacaccc cgtggaaaac acccagctgc agaacgagaa gctgtacctg tactacctgc 3420agaatgggcg ggatatgtac gtggaccagg aactggacat caaccggctg tccgactacg 3480atgtggacgc tatcgtgcct cagagctttc tgaaggacga ctccatcgac aacaaggtgc 3540tgaccagaag cgacaagaac cggggcaaga gcgacaacgt gccctccgaa gaggtcgtga 3600agaagatgaa gaactactgg cggcagctgc tgaacgccaa gctgattacc cagagaaagt 3660tcgacaatct gaccaaggcc gagagaggcg gcctgagcga actggataag gccggcttca 3720tcaagagaca gctggtggaa acccggcaga tcacaaagca cgtggcacag atcctggact 3780cccggatgaa cactaagtac gacgagaatg acaagctgat ccgggaagtg aaagtgatca 3840ccctgaagtc caagctggtg tccgatttcc ggaaggattt ccagttttac aaagtgcgcg 3900agatcaacaa ctaccaccac gcccacgacg cctacctgaa cgccgtcgtg ggaaccgccc 3960tgatcaaaaa gtaccctaag ctggaaagcg agttcgtgta cggcgactac aaggtgtacg 4020acgtgcggaa gatgatcgcc aagagcgagc aggaaatcgg caaggctacc gccaagtact 
4080tcttctacag caacatcatg aactttttca agaccgagat taccctggcc aacggcgaga 4140tccggaagcg gcctctgatc gagacaaacg gcgaaaccgg ggagatcgtg tgggataagg 4200gccgggattt tgccaccgtg cggaaagtgc tgagcatgcc ccaagtgaat atcgtgaaaa 4260agaccgaggt gcagacaggc ggcttcagca aagagtctat cctgcccaag aggaacagcg 4320ataagctgat cgccagaaag aaggactggg accctaagaa gtacggcggc ttcgacagcc 4380ccaccgtggc ctattctgtg ctggtggtgg ccaaagtgga aaagggcaag tccaagaaac 4440tgaagagtgt gaaagagctg ctggggatca ccatcatgga aagaagcagc ttcgagaaga 4500atcccatcga ctttctggaa gccaagggct acaaagaagt gaaaaaggac ctgatcatca 4560agctgcctaa gtactccctg ttcgagctgg aaaacggccg gaagagaatg ctggcctctg 4620ccggcgaact gcagaaggga aacgaactgg ccctgccctc caaatatgtg aacttcctgt 4680acctggccag ccactatgag aagctgaagg gctcccccga ggataatgag cagaaacagc 4740tgtttgtgga acagcacaag cactacctgg acgagatcat cgagcagatc agcgagttct 4800ccaagagagt gatcctggcc gacgctaatc tggacaaagt gctgtccgcc tacaacaagc 4860accgggataa gcccatcaga gagcaggccg agaatatcat ccacctgttt accctgacca 4920atctgggagc ccctgccgcc ttcaagtact ttgacaccac catcgaccgg aagaggtaca 4980ccagcaccaa agaggtgctg gacgccaccc tgatccacca gagcatcacc ggcctgtacg 5040agacacggat cgacctgtct cagctgggag gtgactctgg aggatctagc ggaggatcct 5100ctggcagcga gacaccagga acaagcgagt cagcaacacc agagagcagt ggcggcagca 5160gcggcggcag cagcacccta aatatagaag atgagtatcg gctacatgag acctcaaaag 5220agccagatgt ttctctaggg tccacatggc tgtctgattt tcctcaggcc tgggcggaaa 5280ccgggggcat gggactggca gttcgccaag ctcctctgat catacctctg aaagcaacct 5340ctacccccgt gtccataaaa caatacccca tgtcacaaga agccagactg gggatcaagc 5400cccacataca gagactgttg gaccagggaa tactggtacc ctgccagtcc ccctggaaca 5460cgcccctgct acccgttaag aaaccaggga ctaatgatta taggcctgtc caggatctga 5520gagaagtcaa caagcgggtg gaagacatcc accccaccgt gcccaaccct tacaacctct 5580tgagcgggct cccaccgtcc caccagtggt acactgtgct tgatttaaag gatgcctttt 5640tctgcctgag actccacccc accagtcagc ctctcttcgc ctttgagtgg agagatccag 5700agatgggaat ctcaggacaa ttgacctgga ccagactccc acagggtttc aaaaacagtc 5760ccaccctgtt taatgaggca ctgcacagag 
acctagcaga cttccggatc cagcacccag 5820acttgatcct gctacagtac gtggatgact tactgctggc cgccacttct gagctagact 5880gccaacaagg tactcgggcc ctgttacaaa ccctagggaa cctcgggtat cgggcctcgg 5940ccaagaaagc ccaaatttgc cagaaacagg tcaagtatct ggggtatctt ctaaaagagg 6000gtcagagatg gctgactgag gccagaaaag agactgtgat ggggcagcct actccgaaga 6060cccctcgaca actaagggag ttcctaggga aggcaggctt ctgtcgcctc ttcatccctg 6120ggtttgcaga aatggcagcc cccctgtacc ctctcaccaa accggggact ctgtttaatt 6180ggggcccaga ccaacaaaag gcctatcaag aaatcaagca agctcttcta actgccccag 6240ccctggggtt gccagatttg actaagccct ttgaactctt tgtcgacgag aagcagggct 6300acgccaaagg tgtcctaacg caaaaactgg gaccttggcg tcggccggtg gcctacctgt 6360ccaaaaagct agacccagta gcagctgggt ggcccccttg cctacggatg gtagcagcca 6420ttgccgtact gacaaaggat gcaggcaagc taaccatggg acagccacta gtcattctgg 6480ccccccatgc agtagaggca ctagtcaaac aaccccccga ccgctggctt tccaacgccc 6540ggatgactca ctatcaggcc ttgcttttgg acacggaccg ggtccagttc ggaccggtgg 6600tagccctgaa cccggctacg ctgctcccac tgcctgagga agggctgcaa cacaactgcc 6660ttgatatcct ggccgaagcc cacggaaccc gacccgacct aacggaccag ccgctcccag 6720acgccgacca cacctggtac acggatggaa gcagtctctt acaagaggga cagcgtaagg 6780cgggagctgc ggtgaccacc gagaccgagg taatctgggc taaagccctg ccagccggga 6840catccgctca gcgggctgaa ctgatagcac tcacccaggc cctaaagatg gcagaaggta 6900agaagctaaa tgtttatact gatagccgtt atgcttttgc tactgcccat atccatggag 6960aaatatacag aaggcgtggg tggctcacat cagaaggcaa agagatcaaa aataaagacg 7020agatcttggc cctactaaaa gccctctttc tgcccaaaag acttagcata atccattgtc 7080caggacatca aaagggacac agcgccgagg ctagaggcaa ccggatggct gaccaagcgg 7140cccgaaaggc agccatcaca gagactccag acacctctac cctcctcata gaaaattcat 7200caccctctgg cggctcaaaa agaaccgccg acggcagcga attcgagccc aagaagaaga 7260ggaaagtcta accggtcatc atcaccatca ccattgagtt taaacccgct gatcagcctc 7320gactgtgcct tctagttgcc agccatctgt tgtttgcccc tcccccgtgc cttccttgac 7380cctggaaggt gccactccca ctgtcctttc ctaataaaat gagaaaattg catcgcattg 7440tctgagtagg tgtcattcta ttctgggggg tggggtgggg caggacagca agggggagga 
7500ttgggaagac aatagcaggc atgctgggga tgcggtgggc tctatggctt ctgaggcgga 7560aagaaccagc tggggctcga taccgtcgac ctctagctag agcttggcgt aatcatggtc 7620atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 7680aagcataaag tgtaaagcct agggtgccta atgagtgagc taactcacat taattgcgtt 7740gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 7800ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 7860ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 7920acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 7980aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 8040tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 8100aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 8160gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 8220acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 8280accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 8340ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 8400gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 8460aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 8520ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 8580gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 8640cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 8700cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 8760gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 8820tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 8880gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 8940agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 9000tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 9060agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 9120gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 9180catgttgtgc aaaaaagcgg ttagctcctt 
cggtcctccg atcgttgtca gaagtaagtt 9240ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta ctgtcatgcc 9300atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 9360tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 9420cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 9480cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 9540atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 9600aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 9660ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 9720aaataaacaa ataggggttc cgcgcacatt tcc 975338311433DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 383ccgaaaagtg ccacctgacg tcgacggatc gggagatcga tctcccgatc ccctagggtc 60gactctcagt acaatctgct ctgatgccgc atagttaagc cagtatctgc tccctgcttg 120tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa gctacaacaa ggcaaggctt 180gaccgacaat tgcatgaaga atctgcttag ggttaggcgt tttgcgctgc ttcgcgatgt 240acgggccaga tatacgcgtt gacattgatt attgactagt tattaatagt aatcaattac 300ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 360cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 420catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 480tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 540tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 600ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 660catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 720cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 780ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 840agctggttta gtgaaccgtc agatccgcta gagatccgcg gccgctaata cgactcacta 900tagggagagc cgccaccatg cccgcggcta agagggtgaa gcttgacggt ggaaaacgga 960cagccgacgg aagcgagttc gagtcaccaa agaagaagcg gaaagtcgac aagaagtaca 1020gcatcggcct ggacatcggc accaactctg tgggctgggc cgtgatcacc gacgagtaca 1080aggtgcccag caagaaattc 
aaggtgctgg gcaacaccga ccggcacagc atcaagaaga 1140acctgatcgg agccctgctg ttcgacagcg gcgaaacagc cgaggccacc cggctgaaga 1200gaaccgccag aagaagatac accagacgga agaaccggat ctgctatctg caagagatct 1260tcagcaacga gatggccaag gtggacgaca gcttcttcca cagactggaa gagtccttcc 1320tggtggaaga ggataagaag cacgagcggc accccatctt cggcaacatc gtggacgagg 1380tggcctacca cgagaagtac cccaccatct accacctgag aaagaaactg gtggacagca 1440ccgacaaggc cgacctgcgg ctgatctatc tggccctggc ccacatgatc aagttccggg 1500gccacttcct gatcgagggc gacctgaacc ccgacaacag cgacgtggac aagctgttca 1560tccagctggt gcagacctac aaccagctgt tcgaggaaaa ccccatcaac gccagcggcg 1620tggacgccaa ggccatcctg tctgccagac tgagcaagag cagacggctg gaaaatctga 1680tcgcccagct gcccggcgag aagaagaatg gcctgttcgg aaacctgatt gccctgagcc 1740tgggcctgac ccccaacttc aagagcaact tcgacctggc cgaggatgcc aaactgcagc 1800tgagcaagga cacctacgac gacgacctgg acaacctgct ggcccagatc ggcgaccagt 1860acgccgacct gtttctggcc gccaagaacc tgtccgacgc catcctgctg agcgacatcc 1920tgagagtgaa caccgagatc accaaggccc ccctgagcgc ctctatgatc aagagatacg 1980acgagcacca ccaggacctg accctgctga aagctctcgt gcggcagcag ctgcctgaga 2040agtacaaaga gattttcttc gaccagagca agaacggcta cgccggctac attgacggcg 2100gagccagcca ggaagagttc tacaagttca tcaagcccat cctggaaaag atggacggca 2160ccgaggaact gctcgtgaag ctgaacagag aggacctgct gcggaagcag cggaccttcg 2220acaacggcag catcccccac cagatccacc tgggagagct gcacgccatt ctgcggcggc 2280aggaagattt ttacccattc ctgaaggaca accgggaaaa gatcgagaag atcctgacct 2340tccgcatccc ctactacgtg ggccctctgg ccaggggaaa cagcagattc gcctggatga 2400ccagaaagag cgaggaaacc atcaccccct ggaacttcga ggaagtggtg gacaagggcg 2460cttccgccca gagcttcatc gagcggatga ccaacttcga taagaacctg cccaacgaga 2520aggtgctgcc caagcacagc ctgctgtacg agtacttcac cgtgtataac gagctgacca 2580aagtgaaata cgtgaccgag ggaatgagaa agcccgcctt cctgagcggc gagcagaaaa 2640aggccatcgt ggacctgctg ttcaagacca accggaaagt gaccgtgaag cagctgaaag 2700aggactactt caagaaaatc gagtgcttcg actccgtgga aatctccggc gtggaagatc 2760ggttcaacgc ctccctgggc acataccacg atctgctgaa aattatcaag 
gacaaggact 2820tcctggacaa tgaggaaaac gaggacattc tggaagatat cgtgctgacc ctgacactgt 2880ttgaggacag agagatgatc gaggaacggc tgaaaaccta tgcccacctg ttcgacgaca 2940aagtgatgaa gcagctgaag cggcggagat acaccggctg gggcaggctg agccggaagc 3000tgatcaacgg catccgggac aagcagtccg gcaagacaat cctggatttc ctgaagtccg 3060acggcttcgc caacagaaac ttcatgcagc tgatccacga cgacagcctg acctttaaag 3120aggacatcca gaaagcccag gtgtccggcc agggcgatag cctgcacgag cacattgcca 3180atctggccgg cagccccgcc attaagaagg gcatcctgca gacagtgaag gtggtggacg 3240agctcgtgaa agtgatgggc cggcacaagc ccgagaacat cgtgatcgaa atggccagag 3300agaaccagac cacccagaag ggacagaaga acagccgcga gagaatgaag cggatcgaag 3360agggcatcaa agagctgggc agccagatcc tgaaagaaca ccccgtggaa aacacccagc 3420tgcagaacga gaagctgtac ctgtactacc tgcagaatgg gcgggatatg tacgtggacc 3480aggaactgga catcaaccgg ctgtccgact acgatgtgga cgctatcgtg cctcagagct 3540ttctgaagga cgactccatc gacaacaagg tgctgaccag aagcgacaag aaccggggca 3600agagcgacaa cgtgccctcc gaagaggtcg tgaagaagat gaagaactac tggcggcagc 3660tgctgaacgc caagctgatt acccagagaa agttcgacaa tctgaccaag gccgagagag 3720gcggcctgag cgaactggat aaggccggct tcatcaagag acagctggtg gaaacccggc 3780agatcacaaa gcacgtggca cagatcctgg actcccggat gaacactaag tacgacgaga 3840atgacaagct gatccgggaa gtgaaagtga tcaccctgaa gtccaagctg gtgtccgatt 3900tccggaagga tttccagttt tacaaagtgc gcgagatcaa caactaccac cacgcccacg 3960acgcctacct gaacgccgtc gtgggaaccg ccctgatcaa aaagtaccct aagctggaaa 4020gcgagttcgt gtacggcgac tacaaggtgt acgacgtgcg gaagatgatc gccaagagcg 4080agcaggaaat cggcaaggct accgccaagt acttcttcta cagcaacatc atgaactttt 4140tcaagaccga gattaccctg gccaacggcg agatccggaa gcggcctctg atcgagacaa 4200acggcgaaac cggggagatc gtgtgggata agggccggga ttttgccacc gtgcggaaag 4260tgctgagcat gccccaagtg aatatcgtga aaaagaccga ggtgcagaca ggcggcttca 4320gcaaagagtc tatcctgccc aagaggaaca gcgataagct gatcgccaga aagaaggact 4380gggaccctaa gaagtacggc ggcttcgaca gccccaccgt ggcctattct gtgctggtgg 4440tggccaaagt ggaaaagggc aagtccaaga aactgaagag tgtgaaagag ctgctgggga 4500tcaccatcat ggaaagaagc 
agcttcgaga agaatcccat cgactttctg gaagccaagg 4560gctacaaaga agtgaaaaag gacctgatca tcaagctgcc taagtactcc ctgttcgagc 4620tggaaaacgg ccggaagaga atgctggcct ctgccggcga actgcagaag ggaaacgaac 4680tggccctgcc ctccaaatat gtgaacttcc tgtacctggc cagccactat gagaagctga 4740agggctcccc cgaggataat gagcagaaac agctgtttgt ggaacagcac aagcactacc 4800tggacgagat catcgagcag atcagcgagt tctccaagag agtgatcctg gccgacgcta 4860atctggacaa agtgctgtcc gcctacaaca agcaccggga taagcccatc agagagcagg 4920ccgagaatat catccacctg tttaccctga ccaatctggg agcccctgcc gccttcaagt 4980actttgacac caccatcgac cggaagaggt acaccagcac caaagaggtg ctggacgcca 5040ccctgatcca ccagagcatc accggcctgt acgagacacg gatcgacctg tctcagctgg 5100gaggtgactc tggaggatct agcggaggat cctctggcag cgagacacca ggaacaagcg 5160agtcagcaac accagagagc agtggcggca gcagcggcgg cagcagcacc ctaaatatag 5220aagatgagta tcggctacat gagacctcaa aagagccaga tgtttctcta gggtccacat 5280ggctgtctga ttttcctcag gcctgggcgg aaaccggggg catgggactg gcagttcgcc 5340aagctcctct gatcatacct ctgaaagcaa cctctacccc cgtgtccata aaacaatacc 5400ccatgtcaca agaagccaga ctggggatca agccccacat acagagactg ttggaccagg 5460gaatatggta ccctgccagt ccccctggaa cacgcccctg ctacccgtta agaaaccagg 5520gactaatgat tataggcctg tccaggatct gagagaagtc aacaagcggg tggaagacat 5580ccaccccacc gtgcccaacc cttacaacct cttgagcggg ctcccaccgt cccaccagtg 5640gtacactgtg cttgatttaa aggatgcctt tttctgcctg agactccacc ccaccagtca 5700gcctctcttc gcctttgagt ggagagatcc agagatggga atctcaggac aattgacctg 5760gaccagactc ccacagggtt tcaaaaacag tcccaccctg tttaatgagg cactgcacag 5820agacctagca gacttccgga tccagcaccc
agacttgatc ctgctacagt acgtggatga 5880cttactgctg gccgccactt ctgagctaga ctgccaacaa ggtactcggg ccctgttaca 5940aaccctaggg aacctcgggt atcgggcctc ggccaagaaa gcccaaattt gccagaaaca 6000ggtcaagtat ctggggtatc ttctaaaaga gggtcagaga tggctgactg aggccagaaa 6060agagactgtg atggggcagc ctactccgaa gacccctcga caactaaggg agttcctagg 6120gaaggcaggc ttctgtcgcc tcttcatccc tgggtttgca gaaatggcag cccccctgta 6180ccctctcacc aaaccgggga ctctgtttaa ttggggccca gaccaacaaa aggcctatca 6240agaaatcaag caagctcttc taactgcccc agccctgggg ttgccagatt tgactaagcc 6300ctttgaactc tttgtcgacg agaagcaggg ctacgccaaa ggtgtcctaa cgcaaaaact 6360gggaccttgg cgtcggccgg tggcctacct gtccaaaaag ctagacccag tagcagctgg 6420gtggccccct tgcctacgga tggtagcagc cattgccgta ctgacaaagg atgcaggcaa 6480gctaaccatg ggacagccac tagtcattct ggccccccat gcagtagagg cactagtcaa 6540acaacccccc gaccgctggc tttccaacgc ccggatgact cactatcagg ccttgctttt 6600ggacacggac cgggtccagt tcggaccggt ggtagccctg aacccggcta cgctgctccc 6660actgcctgag gaagggctgc aacacaactg ccttgatatc ctggccgaag cccacggaac 6720ccgacccgac ctaacggacc agccgctccc agacgccgac cacacctggt acacggatgg 6780aagcagtctc ttacaagagg gacagcgtaa ggcgggagct gcggtgacca ccgagaccga 6840ggtaatctgg gctaaagccc tgccagccgg gacatccgct cagcgggctg aactgatagc 6900actcacccag gccctaaaga tggcagaagg taagaagcta aatgtttata ctgatagccg 6960ttatgctttt gctactgccc atatccatgg agaaatatac agaaggcgtg ggtggctcac 7020atcagaaggc aaagagatca aaaataaaga cgagatcttg gccctactaa aagccctctt 7080tctgcccaaa agacttagca taatccattg tccaggacat caaaagggac acagcgccga 7140ggctagaggc aaccggatgg ctgaccaagc ggcccgaaag gcagccatca cagagactcc 7200agacacctct accctcctca tagaaaattc atcaccctct ggcggctcaa aaagaaccgc 7260cgacggcagc gaaaaaagaa ccgctgactc tcaacattcc acacctccaa aaaccaagcg 7320aaaagtggaa ttcgagccca agaagaagag gaaagtcgga agcggagcta ctaacttcag 7380cctgctgaag caggctggcg acgtggagga gaaccctgga cctccaaaaa agaaaagaaa 7440agtgtatccc tatgatgtcc ccgattatgc cggttcaaga gccctggtcg tgattagact 7500gagccgagtg acagacgcca ccacaagtcc cgagagacag ctggaatcat gccagcagct 
7560ctgtgctcag cggggttggg atgtggtcgg cgtggcagag gatctggacg tgagcggggc 7620cgtcgatcca ttcgacagaa agaggaggcc caacctggca agatggctcg ctttcgagga 7680acagcccttt gatgtgatcg tcgcctacag agtggaccgg ctgacccgct caattcgaca 7740tctccagcag ctggtgcatt gggctgagga ccacaagaaa ctggtggtca gcgcaacaga 7800agcccacttc gatactacca caccttttgc cgctgtggtc atcgcactga tgggcactgt 7860ggcccagatg gagctcgaag ctatcaagga gcgaaacagg agcgcagccc atttcaatat 7920tagggccggt aaatacagag gctccctgcc cccttgggga tatctcccta ccagggtgga 7980tggggagtgg agactggtgc cagaccccgt ccagagagag cggattctgg aagtgtacca 8040cagagtggtc gataaccacg aaccactcca tctggtggca cacgacctga atagacgcgg 8100cgtgctctct ccaaaggatt attttgctca gctgcaggga agagagccac agggaagaga 8160atggagtgct actgcactga agagatctat gatcagtgag gctatgctgg gttacgcaac 8220actcaatggc aaaactgtcc gggacgatga cggagcccct ctggtgaggg ctgagcctat 8280tctcaccaga gagcagctcg aagctctgcg ggcagaactg gtcaagacta gtcgcgccaa 8340acctgccgtg agcaccccaa gcctgctcct gagggtgctg ttctgcgccg tctgtggaga 8400gccagcatac aagtttgccg gcggagggcg caaacatccc cgctatcgat gcaggagcat 8460ggggttccct aagcactgtg gaaacgggac agtggccatg gctgagtggg acgccttttg 8520cgaggaacag gtgctggatc tcctgggtga cgctgagcgg ctggaaaaag tgtgggtggc 8580aggatctgac tccgctgtgg agctggcaga agtcaatgcc gagctcgtgg atctgacttc 8640cctcatcgga tctcctgcat atagagctgg gtccccacag agagaagctc tggacgcacg 8700aattgctgca ctcgctgcta gacaggagga actggagggc ctggaggcca ggccctctgg 8760atgggagtgg cgagaaaccg gacagaggtt tggggattgg tggagggagc aggacaccgc 8820agccaagaac acatggctga gatccatgaa tgtccggctc acattcgacg tgcgcggtgg 8880cctgactcga accatcgatt ttggcgacct gcaggagtat gaacagcacc tgagactggg 8940gtccgtggtc gaaagactgc acactgggat gtcctaggtt taaacccgct gatcagcctc 9000gactgtgcct tctagttgcc agccatctgt tgtttgcccc tcccccgtgc cttccttgac 9060cctggaaggt gccactccca ctgtcctttc ctaataaaat gagaaaattg catcgcattg 9120tctgagtagg tgtcattcta ttctgggggg tggggtgggg caggacagca agggggagga 9180ttgggaagac aatagcaggc atgctgggga tgcggtgggc tctatggctt ctgaggcgga 9240aagaaccagc tggggctcga taccgtcgac 
ctctagctag agcttggcgt aatcatggtc 9300atagctgttt cctgtgtgaa attgttatcc gctcacaatt ccacacaaca tacgagccgg 9360aagcataaag tgtaaagcct agggtgccta atgagtgagc taactcacat taattgcgtt 9420gcgctcactg cccgctttcc agtcgggaaa cctgtcgtgc cagctgcatt aatgaatcgg 9480ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct tccgcttcct cgctcactga 9540ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca gctcactcaa aggcggtaat 9600acggttatcc acagaatcag gggataacgc aggaaagaac atgtgagcaa aaggccagca 9660aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt ttccataggc tccgcccccc 9720tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg cgaaacccga caggactata 9780aagataccag gcgtttcccc ctggaagctc cctcgtgcgc tctcctgttc cgaccctgcc 9840gcttaccgga tacctgtccg cctttctccc ttcgggaagc gtggcgcttt ctcatagctc 9900acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc aagctgggct gtgtgcacga 9960accccccgtt cagcccgacc gctgcgcctt atccggtaac tatcgtcttg agtccaaccc 10020ggtaagacac gacttatcgc cactggcagc agccactggt aacaggatta gcagagcgag 10080gtatgtaggc ggtgctacag agttcttgaa gtggtggcct aactacggct acactagaag 10140aacagtattt ggtatctgcg ctctgctgaa gccagttacc ttcggaaaaa gagttggtag 10200ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt ttttttgttt gcaagcagca 10260gattacgcgc agaaaaaaag gatctcaaga agatcctttg atcttttcta cggggtctga 10320cgctcagtgg aacgaaaact cacgttaagg gattttggtc atgagattat caaaaaggat 10380cttcacctag atccttttaa attaaaaatg aagttttaaa tcaatctaaa gtatatatga 10440gtaaacttgg tctgacagtt accaatgctt aatcagtgag gcacctatct cagcgatctg 10500tctatttcgt tcatccatag ttgcctgact ccccgtcgtg tagataacta cgatacggga 10560gggcttacca tctggcccca gtgctgcaat gataccgcga gacccacgct caccggctcc 10620agatttatca gcaataaacc agccagccgg aagggccgag cgcagaagtg gtcctgcaac 10680tttatccgcc tccatccagt ctattaattg ttgccgggaa gctagagtaa gtagttcgcc 10740agttaatagt ttgcgcaacg ttgttgccat tgctacaggc atcgtggtgt cacgctcgtc 10800gtttggtatg gcttcattca gctccggttc ccaacgatca aggcgagtta catgatcccc 10860catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg atcgttgtca gaagtaagtt 10920ggccgcagtg ttatcactca tggttatggc agcactgcat aattctctta 
ctgtcatgcc 10980atccgtaaga tgcttttctg tgactggtga gtactcaacc aagtcattct gagaatagtg 11040tatgcggcga ccgagttgct cttgcccggc gtcaatacgg gataataccg cgccacatag 11100cagaacttta aaagtgctca tcattggaaa acgttcttcg gggcgaaaac tctcaaggat 11160cttaccgctg ttgagatcca gttcgatgta acccactcgt gcacccaact gatcttcagc 11220atcttttact ttcaccagcg tttctgggtg agcaaaaaca ggaaggcaaa atgccgcaaa 11280aaagggaata agggcgacac ggaaatgttg aatactcata ctcttccttt ttcaatatta 11340ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 11400aaataaacaa ataggggttc cgcgcacatt tcc 1143338411056DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 384ccgaaaagtg ccacctgacg tcgacggatc gggagatcga tctcccgatc ccctagggtc 60gactctcagt acaatctgct ctgatgccgc atagttaagc cagtatctgc tccctgcttg 120tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa gctacaacaa ggcaaggctt 180gaccgacaat tgcatgaaga atctgcttag ggttaggcgt tttgcgctgc ttcgcgatgt 240acgggccaga tatacgcgtt gacattgatt attgactagt tattaatagt aatcaattac 300ggggtcatta gttcatagcc catatatgga gttccgcgtt acataactta cggtaaatgg 360cccgcctggc tgaccgccca acgacccccg cccattgacg tcaataatga cgtatgttcc 420catagtaacg ccaataggga ctttccattg acgtcaatgg gtggagtatt tacggtaaac 480tgcccacttg gcagtacatc aagtgtatca tatgccaagt acgcccccta ttgacgtcaa 540tgacggtaaa tggcccgcct ggcattatgc ccagtacatg accttatggg actttcctac 600ttggcagtac atctacgtat tagtcatcgc tattaccatg gtgatgcggt tttggcagta 660catcaatggg cgtggatagc ggtttgactc acggggattt ccaagtctcc accccattga 720cgtcaatggg agtttgtttt ggcaccaaaa tcaacgggac tttccaaaat gtcgtaacaa 780ctccgcccca ttgacgcaaa tgggcggtag gcgtgtacgg tgggaggtct atataagcag 840agctggttta gtgaaccgtc agatccgcta gagatccgcg gccgctaata cgactcacta 900tagggagagc cgccaccatg aaacggacag ccgacggaag cgagttcgag tcaccaaaga 960agaagcggaa agtcgacaag aagtacagca tcggcctgga catcggcacc aactctgtgg 1020gctgggccgt gatcaccgac gagtacaagg tgcccagcaa gaaattcaag gtgctgggca 1080acaccgaccg gcacagcatc aagaagaacc tgatcggagc cctgctgttc gacagcggcg 1140aaacagccga ggccacccgg ctgaagagaa 
ccgccagaag aagatacacc agacggaaga 1200accggatctg ctatctgcaa gagatcttca gcaacgagat ggccaaggtg gacgacagct 1260tcttccacag actggaagag tccttcctgg tggaagagga taagaagcac gagcggcacc 1320ccatcttcgg caacatcgtg gacgaggtgg cctaccacga gaagtacccc accatctacc 1380acctgagaaa gaaactggtg gacagcaccg acaaggccga cctgcggctg atctatctgg 1440ccctggccca catgatcaag ttccggggcc acttcctgat cgagggcgac ctgaaccccg 1500acaacagcga cgtggacaag ctgttcatcc agctggtgca gacctacaac cagctgttcg 1560aggaaaaccc catcaacgcc agcggcgtgg acgccaaggc catcctgtct gccagactga 1620gcaagagcag acggctggaa aatctgatcg cccagctgcc cggcgagaag aagaatggcc 1680tgttcggaaa cctgattgcc ctgagcctgg gcctgacccc caacttcaag agcaacttcg 1740acctggccga ggatgccaaa ctgcagctga gcaaggacac ctacgacgac gacctggaca 1800acctgctggc ccagatcggc gaccagtacg ccgacctgtt tctggccgcc aagaacctgt 1860ccgacgccat cctgctgagc gacatcctga gagtgaacac cgagatcacc aaggcccccc 1920tgagcgcctc tatgatcaag agatacgacg agcaccacca ggacctgacc ctgctgaaag 1980ctctcgtgcg gcagcagctg cctgagaagt acaaagagat tttcttcgac cagagcaaga 2040acggctacgc cggctacatt gacggcggag ccagccagga agagttctac aagttcatca 2100agcccatcct ggaaaagatg gacggcaccg aggaactgct cgtgaagctg aacagagagg 2160acctgctgcg gaagcagcgg accttcgaca acggcagcat cccccaccag atccacctgg 2220gagagctgca cgccattctg cggcggcagg aagattttta cccattcctg aaggacaacc 2280gggaaaagat cgagaagatc ctgaccttcc gcatccccta ctacgtgggc cctctggcca 2340ggggaaacag cagattcgcc tggatgacca gaaagagcga ggaaaccatc accccctgga 2400acttcgagga agtggtggac aagggcgctt ccgcccagag cttcatcgag cggatgacca 2460acttcgataa gaacctgccc aacgagaagg tgctgcccaa gcacagcctg ctgtacgagt 2520acttcaccgt gtataacgag ctgaccaaag tgaaatacgt gaccgaggga atgagaaagc 2580ccgccttcct gagcggcgag cagaaaaagg ccatcgtgga cctgctgttc aagaccaacc 2640ggaaagtgac cgtgaagcag ctgaaagagg actacttcaa gaaaatcgag tgcttcgact 2700ccgtggaaat ctccggcgtg gaagatcggt tcaacgcctc cctgggcaca taccacgatc 2760tgctgaaaat tatcaaggac aaggacttcc tggacaatga ggaaaacgag gacattctgg 2820aagatatcgt gctgaccctg acactgtttg aggacagaga gatgatcgag gaacggctga 
2880aaacctatgc ccacctgttc gacgacaaag tgatgaagca gctgaagcgg cggagataca 2940ccggctgggg caggctgagc cggaagctga tcaacggcat ccgggacaag cagtccggca 3000agacaatcct ggatttcctg aagtccgacg gcttcgccaa cagaaacttc atgcagctga 3060tccacgacga cagcctgacc tttaaagagg acatccagaa agcccaggtg tccggccagg 3120gcgatagcct gcacgagcac attgccaatc tggccggcag ccccgccatt aagaagggca 3180tcctgcagac agtgaaggtg gtggacgagc tcgtgaaagt gatgggccgg cacaagcccg 3240agaacatcgt gatcgaaatg gccagagaga accagaccac ccagaaggga cagaagaaca 3300gccgcgagag aatgaagcgg atcgaagagg gcatcaaaga gctgggcagc cagatcctga 3360aagaacaccc cgtggaaaac acccagctgc agaacgagaa gctgtacctg tactacctgc 3420agaatgggcg ggatatgtac gtggaccagg aactggacat caaccggctg tccgactacg 3480atgtggacgc tatcgtgcct cagagctttc tgaaggacga ctccatcgac aacaaggtgc 3540tgaccagaag cgacaagaac cggggcaaga gcgacaacgt gccctccgaa gaggtcgtga 3600agaagatgaa gaactactgg cggcagctgc tgaacgccaa gctgattacc cagagaaagt 3660tcgacaatct gaccaaggcc gagagaggcg gcctgagcga actggataag gccggcttca 3720tcaagagaca gctggtggaa acccggcaga tcacaaagca cgtggcacag atcctggact 3780cccggatgaa cactaagtac gacgagaatg acaagctgat ccgggaagtg aaagtgatca 3840ccctgaagtc caagctggtg tccgatttcc ggaaggattt ccagttttac aaagtgcgcg 3900agatcaacaa ctaccaccac gcccacgacg cctacctgaa cgccgtcgtg ggaaccgccc 3960tgatcaaaaa gtaccctaag ctggaaagcg agttcgtgta cggcgactac aaggtgtacg 4020acgtgcggaa gatgatcgcc aagagcgagc aggaaatcgg caaggctacc gccaagtact 4080tcttctacag caacatcatg aactttttca agaccgagat taccctggcc aacggcgaga 4140tccggaagcg gcctctgatc gagacaaacg gcgaaaccgg ggagatcgtg tgggataagg 4200gccgggattt tgccaccgtg cggaaagtgc tgagcatgcc ccaagtgaat atcgtgaaaa 4260agaccgaggt gcagacaggc ggcttcagca aagagtctat cctgcccaag aggaacagcg 4320ataagctgat cgccagaaag aaggactggg accctaagaa gtacggcggc ttcgacagcc 4380ccaccgtggc ctattctgtg ctggtggtgg ccaaagtgga aaagggcaag tccaagaaac 4440tgaagagtgt gaaagagctg ctggggatca ccatcatgga aagaagcagc ttcgagaaga 4500atcccatcga ctttctggaa gccaagggct acaaagaagt gaaaaaggac ctgatcatca 4560agctgcctaa gtactccctg ttcgagctgg 
aaaacggccg gaagagaatg ctggcctctg 4620ccggcgaact gcagaaggga aacgaactgg ccctgccctc caaatatgtg aacttcctgt 4680acctggccag ccactatgag aagctgaagg gctcccccga ggataatgag cagaaacagc 4740tgtttgtgga acagcacaag cactacctgg acgagatcat cgagcagatc agcgagttct 4800ccaagagagt gatcctggcc gacgctaatc tggacaaagt gctgtccgcc tacaacaagc 4860accgggataa gcccatcaga gagcaggccg agaatatcat ccacctgttt accctgacca 4920atctgggagc ccctgccgcc ttcaagtact ttgacaccac catcgaccgg aagaggtaca 4980ccagcaccaa agaggtgctg gacgccaccc tgatccacca gagcatcacc ggcctgtacg 5040agacacggat cgacctgtct cagctgggag gtgactctgg aggatctagc ggaggatcct 5100ctggcagcga gacaccagga acaagcgagt cagcaacacc agagagctct ggtagcgaga 5160cacccggtac cagtgaaagc gccacgccag aaagcagtgg gagtgagact ccgggtacat 5220ctgaatcagc gacaccggaa tcaagtggcg gcagcagcgg cggcagcagc accctaaata 5280tagaagatga gtatcggcta catgagacct caaaagagcc agatgtttct ctagggtcca 5340catggctgtc tgattttcct caggcctggg cggaaaccgg gggcatggga ctggcagttc 5400gccaagctcc tctgatcata cctctgaaag caacctctac ccccgtgtcc ataaaacaat 5460accccatgtc acaagaagcc agactgggga tcaagcccca catacagaga ctgttggacc 5520agggaatact ggtaccctgc cagtccccct ggaacacgcc cctgctaccc gttaagaaac 5580cagggactaa tgattatagg cctgtccagg atctgagaga agtcaacaag cgggtggaag 5640acatccaccc caccgtgccc aacccttaca acctcttgag cgggccccca ccgtcccacc 5700agtggtacac tgtgcttgat ttaaaggatg cctttttctg cctgagactc caccccacca 5760gtcagcctct cttcgccttt gagtggagag atccagagat gggaatctca ggacaattga 5820cctggaccag actcccacag ggtttcaaaa acagtcccac cctgtttaat gaggcactgc 5880acagagacct agcagacttc cggatccagc acccagactt gatcctgcta cagtacgtgg 5940atgacttact gctggccgcc acttctgagc tagactgcca acaaggtact cgggccctgt 6000tacaaaccct agggaacctc gggtatcggg cctcggccaa gaaagcccaa atttgccaga 6060aacaggtcaa gtatctgggg tatcttctaa aagagggtca gagatggctg actgaggcca 6120gaaaagagac tgtgatgggg cagcctactc cgaagacccc tcgacaacta agggagttcc 6180tagggaaggc aggcttctgt cgcctcttca tccctgggtt tgcagaaatg gcagcccccc 6240tgtaccctct caccaaaccg gggactctgt ttaattgggg cccagaccaa caaaaggcct 
6300atcaagaaat caagcaagct cttctaactg ccccagccct ggggttgcca gatttgacta 6360agccctttga actctttgtc gacgagaagc agggctacgc caaaggtgtc ctaacgcaaa 6420aactgggacc ttggcgtcgg ccggtggcct acctgtccaa aaagctagac ccagtagcag 6480ctgggtggcc cccttgccta cggatggtag cagccattgc cgtactgaca aaggatgcag 6540gcaagctaac catgggacag ccactagtca ttctggcccc ccatgcagta gaggcactag 6600tcaaacaacc ccccgaccgc tggctttcca acgcccggat gactcactat caggccttgc 6660ttttggacac ggaccgggtc cagttcggac cggtggtagc cctgaacccg gctacgctgc 6720tcccactgcc tgaggaaggg ctgcaacaca actgccttga tgggacaggt ggcggtggtg 6780tcaccgtcaa gttcaagtac aagggtgagg aacttgaagt tgatattagc aaaatcaaga 6840aggtttggcg cgttggtaaa atgatatctt ttacttatga cgacaacggc aagacaggta 6900gaggggcagt gtctgagaaa gacgccccca aggagctgtt gcaaatgttg gaaaagtctg 6960ggaaaaagtc tggcggctca aaaagaaccg ccgacggcag cgaattcgag cccaagaaga 7020agaggaaagt cggaggtggc gggagcccaa aaaagaaaag aaaagtgtat ccctatgatg 7080tccccgatta tgccggttca agagccctgg tcgtgattag actgagccga gtgacagacg 7140ccaccacaag tcccgagaga cagctggaat catgccagca gctctgtgct cagcggggtt 7200gggatgtggt cggcgtggca gaggatctgg acgtgagcgg ggccgtcgat ccattcgaca 7260gaaagaggag gcccaacctg gcaagatggc tcgctttcga ggaacagccc tttgatgtga 7320tcgtcgccta cagagtggac cggctgaccc gctcaattcg acatctccag cagctggtgc 7380attgggctga ggaccacaag aaactggtgg tcagcgcaac agaagcccac ttcgatacta 7440ccacaccttt tgccgctgtg gtcatcgcac tgatgggcac tgtggcccag atggagctcg 7500aagctatcaa ggagcgaaac aggagcgcag cccatttcaa tattagggcc ggtaaataca 7560gaggctccct gcccccttgg ggatatctcc ctaccagggt ggatggggag tggagactgg 7620tgccagaccc cgtccagaga gagcggattc tggaagtgta ccacagagtg gtcgataacc 7680acgaaccact ccatctggtg gcacacgacc tgaatagacg cggcgtgctc tctccaaagg 7740attattttgc tcagctgcag ggaagagagc cacagggaag agaatggagt gctactgcac 7800tgaagagatc tatgatcagt gaggctatgc tgggttacgc aacactcaat ggcaaaactg 7860tccgggacga tgacggagcc cctctggtga gggctgagcc tattctcacc agagagcagc 7920tcgaagctct gcgggcagaa ctggtcaaga ctagtcgcgc caaacctgcc gtgagcaccc 7980caagcctgct cctgagggtg ctgttctgcg 
ccgtctgtgg agagccagca tacaagtttg 8040ccggcggagg gcgcaaacat ccccgctatc gatgcaggag catggggttc cctaagcact 8100gtggaaacgg gacagtggcc atggctgagt gggacgcctt ttgcgaggaa caggtgctgg 8160atctcctggg tgacgctgag cggctggaaa aagtgtgggt ggcaggatct gactccgctg 8220tggagctggc agaagtcaat gccgagctcg tggatctgac ttccctcatc ggatctcctg 8280catatagagc tgggtcccca cagagagaag ctctggacgc acgaattgct gcactcgctg 8340ctagacagga ggaactggag ggcctggagg ccaggccctc tggatgggag tggcgagaaa 8400ccggacagag gtttggggat tggtggaggg agcaggacac cgcagccaag aacacatggc 8460tgagatccat gaatgtccgg ctcacattcg acgtgcgcgg tggcctgact cgaaccatcg 8520attttggcga cctgcaggag tatgaacagc acctgagact ggggtccgtg gtcgaaagac 8580tgcacactgg gatgtcctag gtttaaaccc gctgatcagc ctcgactgtg ccttctagtt 8640gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa ggtgccactc 8700ccactgtcct ttcctaataa aatgagaaaa ttgcatcgca ttgtctgagt aggtgtcatt 8760ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa gacaatagca 8820ggcatgctgg ggatgcggtg ggctctatgg cttctgaggc ggaaagaacc agctggggct 8880cgataccgtc gacctctagc tagagcttgg cgtaatcatg gtcatagctg tttcctgtgt 8940gaaattgtta tccgctcaca attccacaca acatacgagc cggaagcata aagtgtaaag 9000cctagggtgc ctaatgagtg agctaactca cattaattgc gttgcgctca ctgcccgctt 9060tccagtcggg aaacctgtcg tgccagctgc attaatgaat cggccaacgc gcggggagag 9120gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 9180ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 9240caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 9300aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa
9360atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 9420cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 9480ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 9540gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 9600accgctgcgc cttatccggt aactatcgtc ttgagtccaa cccggtaaga cacgacttat 9660cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 9720cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct 9780gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 9840aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 9900aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 9960actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 10020taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 10080gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca 10140tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc 10200ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa 10260accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc 10320agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca 10380acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 10440tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 10500cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac 10560tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt 10620ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt 10680gctcttgccc ggcgtcaata cgggataata ccgcgccaca tagcagaact ttaaaagtgc 10740tcatcattgg aaaacgttct tcggggcgaa aactctcaag gatcttaccg ctgttgagat 10800ccagttcgat gtaacccact cgtgcaccca actgatcttc agcatctttt actttcacca 10860gcgtttctgg gtgagcaaaa acaggaaggc aaaatgccgc aaaaaaggga ataagggcga 10920cacggaaatg ttgaatactc atactcttcc tttttcaata ttattgaagc atttatcagg 10980gttattgtct catgagcgga tacatatttg aatgtattta gaaaaataaa caaatagggg 11040ttccgcgcac atttcc 
110563852367DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 385tgttgagatc cagttcgatg taacccactc gtgcacccaa ctgatcttca gcatctttta 60ctttcaccag cgtttctggg tgagcaaaaa caggaaggca aaatgccgca aaaaagggaa 120taagggcgac acggaaatgt tgaatactca tactcttcct ttttcaatat tattgaagca 180tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag aaaaataaac 240aaataggggt tccgcgcaca tttccccgaa aagtgccacc tgacgtcgct agctgtacaa 300aaaagcaggc tttaaaggaa ccaattcagt cgactggatc cggtaccaag gtcgggcagg 360aagagggcct atttcccatg attccttcat atttgcatat acgatacaag gctgttagag 420agataattag aattaatttg actgtaaaca caaagatatt agtacaaaat acgtgacgta 480gaaagtaata atttcttggg tagtttgcag ttttaaaatt atgttttaaa atggactatc 540atatgcttac cgtaacttga aagtatttcg atttcttggc tttatatatc ttgtggaaag 600gacgaaacac cgctattctc gcagctcacc agttttagag ctagaaatag caagttaaaa 660taaggctagt ccgttatcaa cttgaaaaag tggcaccgag tcggtgcgac gagcgcggcg 720atatcatcat ccatggccgg atgatcctga cgacggagac cgccgtcgtc gacaagccgg 780cctgagctgc gagaattttt ttaagcttgg gccgctcgag gtacctctct acatatgaca 840tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 900tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 960gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 1020ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 1080tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 1140agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 1200atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 1260acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 1320actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 1380tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 1440tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 1500tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 1560tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 1620caatctaaag tatatatgag 
taaacttggt ctgacagtta ccaatgctta atcagtgagg 1680cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc cccgtcgtgt 1740agataactac gatacgggag ggcttaccat ctggccccag tgctgcaatg ataccgcgag 1800atccacgctc accggctcca gatttatcag caataaacca gccagccgga agggccgagc 1860gcagaagtgg tcctgcaact ttatccgcct ccatccagtc tattaattgt tgccgggaag 1920ctagagtaag tagttcgcca gttaatagtt tgcgcaacgt tgttgccatt gctacaggca 1980tcgtggtgtc acgctcgtcg tttggtatgg cttcattcag ctccggttcc caacgatcaa 2040ggcgagttac atgatccccc atgttgtgca aaaaagcggt tagctccttc ggtcctccga 2100tcgttgtcag aagtaagttg gccgcagtgt tatcactcat ggttatggca gcactgcata 2160attctcttac tgtcatgcca tccgtaagat gcttttctgt gactggtgag tactcaacca 2220agtcattctg agaatagtgt atgcggcgac cgagttgctc ttgcccggcg tcaatacggg 2280ataataccgc gccacatagc agaactttaa aagtgctcat cattggaaaa cgttcttcgg 2340ggcgaaaact ctcaaggatc ttaccgc 23673862280DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 386ctccatccag tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag 60tttgcgcaac gttgttgcca ttgctacagg catcgtggtg tcacgctcgt cgtttggtat 120ggcttcattc agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg 180caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt 240gttatcactc atggttatgg cagcactgca taattctctt actgtcatgc catccgtaag 300atgcttttct gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg 360accgagttgc tcttgcccgg cgtcaatacg ggataatacc gcgccacata gcagaacttt 420aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct 480gttgagatcc agttcgatgt aacccactcg tgcacccaac tgatcttcag catcttttac 540tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat 600aagggcgaca cggaaatgtt gaatactcat actcttcctt tttcaatatt attgaagcat 660ttatcagggt tattgtctca tgagcggata catatttgaa tgtatttaga aaaataaaca 720aataggggtt ccgcgcacat ttccccgaaa agtgccacct gacgtcgcta gctgtacaaa 780aaagcaggct ttaaaggaac caattcagtc gactggatcc ggtaccaagg tcgggcagga 840agagggccta tttcccatga ttccttcata tttgcatata cgatacaagg ctgttagaga 900gataattaga attaatttga 
ctgtaaacac aaagatatta gtacaaaata cgtgacgtag 960aaagtaataa tttcttgggt agtttgcagt tttaaaatta tgttttaaaa tggactatca 1020tatgcttacc gtaacttgaa agtatttcga tttcttggct ttatatatct tgtggaaagg 1080acgaaacacc gaagccggcc ttgcacatgc gttttagagc tagaaatagc aagttaaaat 1140aaggctagtc cgttatcaac ttgaaaaagt ggcaccgagt cggtgcgttt ttttaagctt 1200gggccgctcg aggtacctct ctacatatga catgtgagca aaaggccagc aaaaggccag 1260gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 1320tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 1380ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 1440atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 1500gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 1560tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 1620cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 1680cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt 1740tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc 1800cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg 1860cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg 1920gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta 1980gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg 2040gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg 2100ttcatccata gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc 2160atctggcccc agtgctgcaa tgataccgcg agatccacgc tcaccggctc cagatttatc 2220agcaataaac cagccagccg gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc 22803876386DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 387tgagcgcaac gcaattaatg tgagttagct cactcattag gcaccccagg ctttacactt 60tatgcttccg gctcgtatgt tgtgtggaat tgtgagcgga taacaatttc acacaggaaa 120cagctatgac catgaggcgc gccggattcg acattgatta ttgactagtt attaatagta 180atcaattacg gggtcattag ttcatagccc atatatggag ttccgcgtta cataacttac 240ggtaaatggc ccgcctggct gaccgcccaa cgacccccgc 
ccattgacgt caataatgac 300gtatgttccc atagtaacgc caatagggac tttccattga cgtcaatggg tggagtattt 360acggtaaact gcccacttgg cagtacatca agtgtatcat atgccaagta cgccccctat 420tgacgtcaat gacggtaaat ggcccgcctg gcattatgcc cagtacatga ccttatggga 480ctttcctact tggcagtaca tctacgtatt agtcatcgct attaccatgg tcgaggtgag 540ccccacgttc tgcttcactc tccccatctc ccccccctcc ccacccccaa ttttgtattt 600atttattttt taattatttt gtgcagcgat gggggcgggg gggggggggg ggcgcgcgcc 660rggsggggsg gggsggggsg rggggsgggg sggggsgagg cggagaggtg cggcggcagc 720caatcagagc ggcgcgctcc gaaagtttcc ttttatggcg aggcggcggc ggcggcggcc 780ctataaaaag cgaagcgcgc ggcgggcggg agtcgctgcg cgctgccttc gccccgtgcc 840ccgctccgcc gccgcctcgc gccgcccgcc ccggctctga ctgaccgcgt tactcccaca 900ggtgagcggg cgggacggcc cttctcctcc gggctgtaat tagcgcttgg tttaatgacg 960gcttgtttct tttctgtggc tgcgtgaaag ccttgagggg ctccgggagg gccctttgtg 1020cggggggagc ggctcggggg gtgcgtgcgt gtgtgtgtgc gtggggagcg ccgcgtgcgg 1080ctccgcgctg cccggcggct gtgagcgctg cgggcgcggc gcggggcttt gtgcgctccg 1140cagtgtgcgc gaggggagcg cggccggggg cggtgccccg cggtgcgggg ggggctgcga 1200ggggaacaaa ggctgcgtgc ggggtgtgtg cgtggggggg tgagcagggg gtgtgggcgc 1260gtcggtcggg ctgcaacccc ccctgcaccc ccctccccga gttgctgagc acggcccggc 1320ttcgggtgcg gggctccgta cggggcgtgg cgcggggctc gccgtgccgg gcggggggtg 1380gcggcaggtg ggggtgccgg gcggggcggg gccgcctcgg gccggggagg gctcggggga 1440ggggcgcggc ggcccccgga gcgccggcgg ctgtcgaggc gcggcgagcc gcagccattg 1500ccttttatgg taatcgtgcg agagggcgca gggacttcct ttgtcccaaa tctgtgcgga 1560gccgaaatct gggaggcgcc gccgcacccc ctctagcggg cgcggggcga agcggtgcgg 1620cgccggcagg aaggaaatgg gcggggaggg ccttcgtgcg tcgccgcgcc gccgtcccct 1680tctccctctc cagcctcggg gctgtccgcg gggggacggc tgccttcggg ggggacgggg 1740cagggcgggg ttcggcttct ggcgtgtgac cggcggctct agagcctctg ctaaccatgt 1800tcatgccttc ttctttttcc tacagatcct taattaataa tacgactcac tatagggggt 1860cgacccgcca ccatgccaaa aaagaaaaga aaagtgtatc cctatgatgt ccccgattat 1920gccggttcaa gagccctggt cgtgattaga ctgagccgag tgacagacgc caccacaagt 1980cccgagagac agctggaatc 
atgccagcag ctctgtgctc agcggggttg ggatgtggtc 2040ggcgtggcag aggatctgga cgtgagcggg gccgtcgatc cattcgacag aaagaggagg 2100cccaacctgg caagatggct cgctttcgag gaacagccct ttgatgtgat cgtcgcctac 2160agagtggacc ggctgacccg ctcaattcga catctccagc agctggtgca ttgggctgag 2220gaccacaaga aactggtggt cagcgcaaca gaagcccact tcgatactac cacacctttt 2280gccgctgtgg tcatcgcact gatgggcact gtggcccaga tggagctcga agctatcaag 2340gagcgaaaca ggagcgcagc ccatttcaat attagggccg gtaaatacag aggctccctg 2400cccccttggg gatatctccc taccagggtg gatggggagt ggagactggt gccagacccc 2460gtccagagag agcggattct ggaagtgtac cacagagtgg tcgataacca cgaaccactc 2520catctggtgg cacacgacct gaatagacgc ggcgtgctct ctccaaagga ttattttgct 2580cagctgcagg gaagagagcc acagggaaga gaatggagtg ctactgcact gaagagatct 2640atgatcagtg aggctatgct gggttacgca acactcaatg gcaaaactgt ccgggacgat 2700gacggagccc ctctggtgag ggctgagcct attctcacca gagagcagct cgaagctctg 2760cgggcagaac tggtcaagac tagtcgcgcc aaacctgccg tgagcacccc aagcctgctc 2820ctgagggtgc tgttctgcgc cgtctgtgga gagccagcat acaagtttgc cggcggaggg 2880cgcaaacatc cccgctatcg atgcaggagc atggggttcc ctaagcactg tggaaacggg 2940acagtggcca tggctgagtg ggacgccttt tgcgaggaac aggtgctgga tctcctgggt 3000gacgctgagc ggctggaaaa agtgtgggtg gcaggatctg actccgctgt ggagctggca 3060gaagtcaatg ccgagctcgt ggatctgact tccctcatcg gatctcctgc atatagagct 3120gggtccccac agagagaagc tctggacgca cgaattgctg cactcgctgc tagacaggag 3180gaactggagg gcctggaggc caggccctct ggatgggagt ggcgagaaac cggacagagg 3240tttggggatt ggtggaggga gcaggacacc gcagccaaga acacatggct gagatccatg 3300aatgtccggc tcacattcga cgtgcgcggt ggcctgactc gaaccatcga ttttggcgac 3360ctgcaggagt atgaacagca cctgagactg gggtccgtgg tcgaaagact gcacactggg 3420atgtcctagg tcagagctcg ctgatcagcc tcgactgtgc cttctagttg ccagccatct 3480gttgtttgcc cctcccccgt gccttccttg accctggaag gtgccactcc cactgtcctt 3540tcctaataaa atgaggaaat tgcatcgcat tgtctgagta ggtgtcattc tattctgggg 3600ggtggggtgg ggcaggacag caagggggag gattgggaag acaatagcag gcatgctggg 3660gatgcggtgg gctctatggc ttctgaggcg gaaagaacca gctggggctc 
gagatccact 3720agttctagcc tcgaggctag agcggccgcc actggccgtc gttttacaac gtcgtgactg 3780ggaaaaccct ggcgttaccc aacttaatcg ccttgcagca catccccctt tcgccagctg 3840gcgtaatagc gaagaggccc gcaccgatcg cccttcccaa cagttgcgca gcctgaatgg 3900cgaatgggac gcgccctgta gcggcgcatt aagcgcggcg ggtgtggtgg ttacgcgcag 3960cgtgaccgct acacttgcca gcgccctagc gcccgctcct ttcgctttct tcccttcctt 4020tctcgccacg ttcgccggct ttccccgtca agctctaaat cgggggctcc ctttagggtt 4080ccgatttagt gctttacggc acctcgaccc caaaaaactt gattagggtg atggttcacg 4140tagtgggcca tcgccctgat agacggtttt tcgccctttg acgttggagt ccacgttctt 4200taatagtgga ctcttgttcc aaactggaac aacactcaac cctatctcgg tctattcttt 4260tgatttataa gggattttgc cgatttcggc ctattggtta aaaaatgagc tgatttaaca 4320aaaatttaac gcgaatttta acaaaatatt aacgcttacr mktymsrtks smcwttymgg 4380sgaaatgtgc gcggaacccc tatttgttta tttttctaaa tacattcaaa tatgtatccg 4440ctcatgagac aataaccctg ataaatgctt caataatatt gaaaaaggaa gagtatgagt 4500attcaacatt tccgtgtcgc ccttattccc ttttttgcgg cattttgcct tcctgttttt 4560gctcacccag aaacgctggt gaaagtaaaa gatgctgaag atcagttggg tgcacgagtg 4620ggttacatcg aactggatct caacagcggt aagatccttg agagttttcg ccccgaagaa 4680cgttttccaa tgatgagcac ttttaaagtt ctgctatgtg gcgcggtatt atcccgtatt 4740gacgccgggc aagagcaact cggtcgccgc atacactatt ctcagaatga cttggttgag 4800tactcaccag tcacagaaaa gcatcttacg gatggcatga cagtaagaga attatgcagt 4860gctgccataa ccatgagtga taacactgcg gccaacttac ttctgacaac gatcggagga 4920ccgaaggagc taaccgcttt tttgcacaac atgggggatc atgtaactcg ccttgatcgt 4980tgggaaccgg agctgaatga agccatacca aacgacgagc gtgacaccac gatgcctgta 5040gcaatggcaa caacgttgcg caaactatta actggcgaac tacttactct agcttcccgg 5100caacaattaa tagactggat ggaggcggat aaagttgcag gaccacttct gcgctcggcc 5160cttccggctg gctggtttat tgctgataaa tctggagccg gtgagcgtgg gtctcgcggt 5220atcattgcag cactggggcc agatggtaag ccctcccgta tcgtagttat ctacacgacg 5280gggagtcagg caactatgga tgaacgaaat agacagatcg ctgagatagg tgcctcactg 5340attaagcatt ggtaactgtc agaccaagtt tactcatata tactttagat tgatttaaaa 5400cttcattttt aatttaaaag 
gatctaggtg aagatccttt ttgataatct catgaccaaa 5460atcccttaac gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga 5520tcttcttgag atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg 5580ctaccagcgg tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact 5640ggcttcagca gagcgcagat accaaatact gttcttctag tgtagccgta gttaggccac 5700cacttcaaga actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg 5760gctgctgcca gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg 5820gataaggcgc agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga 5880acgacctaca ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc 5940gaagggagaa aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg 6000agggagcttc cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc 6060tgacttgagc gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc 6120agcaacgcgg cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt 6180cctgcgttat cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc 6240gctcgccgca gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc 6300ccaatacgca aaccgcctct ccccgcgcgt tggccgattc attaatgcag ctggcacgac 6360aggtttcccg actggaaagc gggcag 63863886317DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 388gattcgacat tgattattga ctagttatta atagtaatca attacggggt cattagttca 60tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc 120gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat 180agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt 240acatcaagtg tatcatatgc caagtacgcc ccctattgac gtcaatgacg gtaaatggcc 300cgcctggcat tatgcccagt acatgacctt atgggacttt cctacttggc agtacatcta 360cgtattagtc atcgctatta ccatggtcga ggtgagcccc acgttctgct tcactctccc 420catctccccc ccctccccac ccccaatttt gtatttattt attttttaat tattttgtgc 480agcgatgggg gcgggggggg ggggggggcg cgcgccrggs ggggsggggs ggggsgrggg 540gsggggsggg gsgaggcgga gaggtgcggc ggcagccaat cagagcggcg cgctccgaaa 600gtttcctttt atggcgaggc ggcggcggcg gcggccctat aaaaagcgaa gcgcgcggcg 660ggcgggagtc 
gctgcgcgct gccttcgccc cgtgccccgc tccgccgccg cctcgcgccg 720cccgccccgg ctctgactga ccgcgttact cccacaggtg agcgggcggg acggcccttc 780tcctccgggc tgtaattagc gcttggttta atgacggctt gtttcttttc tgtggctgcg 840tgaaagcctt gaggggctcc gggagggccc tttgtgcggg gggagcggct cggggggtgc 900gtgcgtgtgt gtgtgcgtgg ggagcgccgc gtgcggctcc gcgctgcccg gcggctgtga 960gcgctgcggg cgcggcgcgg ggctttgtgc gctccgcagt gtgcgcgagg ggagcgcggc 1020cgggggcggt gccccgcggt gcgggggggg ctgcgagggg aacaaaggct gcgtgcgggg 1080tgtgtgcgtg ggggggtgag cagggggtgt gggcgcgtcg gtcgggctgc aaccccccct 1140gcacccccct ccccgagttg ctgagcacgg cccggcttcg ggtgcggggc tccgtacggg 1200gcgtggcgcg gggctcgccg tgccgggcgg ggggtggcgg caggtggggg tgccgggcgg 1260ggcggggccg cctcgggccg gggagggctc gggggagggg cgcggcggcc cccggagcgc 1320cggcggctgt cgaggcgcgg cgagccgcag ccattgcctt ttatggtaat cgtgcgagag 1380ggcgcaggga cttcctttgt cccaaatctg tgcggagccg aaatctggga ggcgccgccg 1440caccccctct agcgggcgcg gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg 1500ggagggcctt cgtgcgtcgc cgcgccgccg tccccttctc cctctccagc ctcggggctg 1560tccgcggggg gacggctgcc ttcggggggg acggggcagg gcggggttcg gcttctggcg 1620tgtgaccggc ggctctagag cctctgctaa ccatgttcat gccttcttct ttttcctaca 1680gatccttaat taataatacg actcactata gggggtcgac ccgccaccat gacagcgcca 1740aagaaaaaga ggaaggtcat gaccaagaaa gtggccatct atactagagt gagcacaacg 1800aatcaggccg aggaggggtt ctctattgac gagcaaatcg atcgtctgac caagtacgcg
1860gaagcaatgg gctggcaagt cagcgacact tacaccgatg ctgggttctc cggcgccaaa 1920ctggaaaggc ctgccatgca gcggctgatt aacgacattg agaacaaggc ctttgataca 1980gtgctcgtat acaagctcga caggctcagc cgatctgtgc gggacacgct ttacctcgta 2040aaggatgttt tcactaagaa taaaatcgac ttcattagcc tgaacgaatc cattgacacc 2100agctcagcta tgggctctct gttcctgacc atcctgagcg ctatcaatga gtttgagagg 2160gagaatataa aggagcgcat gacaatggga aagctgggta gagcgaagtc cgggaaatct 2220atgatgtgga ccaagaccgc ttttggatac taccacaata ggaagacggg cattctggag 2280atcgtgccct tgcaggcaac catcgttgag cagatcttca ccgactacct gagcggaata 2340tctctcacga agttgcgaga taagctgaat gagagcggac acattggcaa ggatattcct 2400tggtcatata gaaccctccg ccaaactctg gataatccgg tgtactgcgg ttacatcaag 2460ttcaaagaca gcctcttcga gggaatgcat aaacctatca ttccatacga gacatacctg 2520aaagtccaaa aggaactcga agagcgccag caacagactt acgaacggaa taataatccc 2580aggcctttcc aggccaaata tatgctgtcc ggcatggcaa gatgcggata ctgcggggca 2640ccactcaaga ttgtgcttgg ccataaacgg aaggatggaa gcagaaccat gaaatatcac 2700tgcgcaaacc gctttccaag gaaaacgaag gggattaccg tgtacaatga caacaaaaaa 2760tgtgatagcg gaacctacga tctgtccaac ttggaaaaca ccgtcattga caatttaatt 2820ggatttcagg aaaataatga cagccttctg aagattatca acgggaacaa tcagccgatt 2880ctggacactt catctttcaa aaaacagatc tctcagattg ataagaaaat tcagaaaaat 2940tccgatttat acctcaatga tttcataacg atggatgagc tgaaggaccg gaccgacagt 3000ttgcaggccg agaagaaact gctgaaagca aagatctccg agaacaagtt caatgacagt 3060accgatgtct tcgagttggt gaagacccag ctgggtagta tcccaatcaa cgagttgagc 3120tatgacaata agaagaagat tgttaataac ctggtgagca aagtggacgt gaccgctgat 3180aacgtggata ttatcttcaa gttccagctg gcctgagtca gagctcgctg atcagcctcg 3240actgtgcctt ctagttgcca gccatctgtt gtttgcccct cccccgtgcc ttccttgacc 3300ctggaaggtg ccactcccac tgtcctttcc taataaaatg aggaaattgc atcgcattgt 3360ctgagtaggt gtcattctat tctggggggt ggggtggggc aggacagcaa gggggaggat 3420tgggaagaca atagcaggca tgctggggat gcggtgggct ctatggcttc tgaggcggaa 3480agaaccagct ggggctcgag atccactagt tctagcctcg aggctagagc ggccgccact 3540ggccgtcgtt ttacaacgtc gtgactggga 
aaaccctggc gttacccaac ttaatcgcct 3600tgcagcacat ccccctttcg ccagctggcg taatagcgaa gaggcccgca ccgatcgccc 3660ttcccaacag ttgcgcagcc tgaatggcga atgggacgcg ccctgtagcg gcgcattaag 3720cgcggcgggt gtggtggtta cgcgcagcgt gaccgctaca cttgccagcg ccctagcgcc 3780cgctcctttc gctttcttcc cttcctttct cgccacgttc gccggctttc cccgtcaagc 3840tctaaatcgg gggctccctt tagggttccg atttagtgct ttacggcacc tcgaccccaa 3900aaaacttgat tagggtgatg gttcacgtag tgggccatcg ccctgataga cggtttttcg 3960ccctttgacg ttggagtcca cgttctttaa tagtggactc ttgttccaaa ctggaacaac 4020actcaaccct atctcggtct attcttttga tttataaggg attttgccga tttcggccta 4080ttggttaaaa aatgagctga tttaacaaaa atttaacgcg aattttaaca aaatattaac 4140gcttacrmkt ymsrtkssmc wttymggsga aatgtgcgcg gaacccctat ttgtttattt 4200ttctaaatac attcaaatat gtatccgctc atgagacaat aaccctgata aatgcttcaa 4260taatattgaa aaaggaagag tatgagtatt caacatttcc gtgtcgccct tattcccttt 4320tttgcggcat tttgccttcc tgtttttgct cacccagaaa cgctggtgaa agtaaaagat 4380gctgaagatc agttgggtgc acgagtgggt tacatcgaac tggatctcaa cagcggtaag 4440atccttgaga gttttcgccc cgaagaacgt tttccaatga tgagcacttt taaagttctg 4500ctatgtggcg cggtattatc ccgtattgac gccgggcaag agcaactcgg tcgccgcata 4560cactattctc agaatgactt ggttgagtac tcaccagtca cagaaaagca tcttacggat 4620ggcatgacag taagagaatt atgcagtgct gccataacca tgagtgataa cactgcggcc 4680aacttacttc tgacaacgat cggaggaccg aaggagctaa ccgctttttt gcacaacatg 4740ggggatcatg taactcgcct tgatcgttgg gaaccggagc tgaatgaagc cataccaaac 4800gacgagcgtg acaccacgat gcctgtagca atggcaacaa cgttgcgcaa actattaact 4860ggcgaactac ttactctagc ttcccggcaa caattaatag actggatgga ggcggataaa 4920gttgcaggac cacttctgcg ctcggccctt ccggctggct ggtttattgc tgataaatct 4980ggagccggtg agcgtgggtc tcgcggtatc attgcagcac tggggccaga tggtaagccc 5040tcccgtatcg tagttatcta cacgacgggg agtcaggcaa ctatggatga acgaaataga 5100cagatcgctg agataggtgc ctcactgatt aagcattggt aactgtcaga ccaagtttac 5160tcatatatac tttagattga tttaaaactt catttttaat ttaaaaggat ctaggtgaag 5220atcctttttg ataatctcat gaccaaaatc ccttaacgtg agttttcgtt ccactgagcg 
5280tcagaccccg tagaaaagat caaaggatct tcttgagatc ctttttttct gcgcgtaatc 5340tgctgcttgc aaacaaaaaa accaccgcta ccagcggtgg tttgtttgcc ggatcaagag 5400ctaccaactc tttttccgaa ggtaactggc ttcagcagag cgcagatacc aaatactgtt 5460cttctagtgt agccgtagtt aggccaccac ttcaagaact ctgtagcacc gcctacatac 5520ctcgctctgc taatcctgtt accagtggct gctgccagtg gcgataagtc gtgtcttacc 5580gggttggact caagacgata gttaccggat aaggcgcagc ggtcgggctg aacggggggt 5640tcgtgcacac agcccagctt ggagcgaacg acctacaccg aactgagata cctacagcgt 5700gagctatgag aaagcgccac gcttcccgaa gggagaaagg cggacaggta tccggtaagc 5760ggcagggtcg gaacaggaga gcgcacgagg gagcttccag ggggaaacgc ctggtatctt 5820tatagtcctg tcgggtttcg ccacctctga cttgagcgtc gatttttgtg atgctcgtca 5880ggggggcgga gcctatggaa aaacgccagc aacgcggcct ttttacggtt cctggccttt 5940tgctggcctt ttgctcacat gttctttcct gcgttatccc ctgattctgt ggataaccgt 6000attaccgcct ttgagtgagc tgataccgct cgccgcagcc gaacgaccga gcgcagcgag 6060tcagtgagcg aggaagcgga agagcgccca atacgcaaac cgcctctccc cgcgcgttgg 6120ccgattcatt aatgcagctg gcacgacagg tttcccgact ggaaagcggg cagtgagcgc 6180aacgcaatta atgtgagtta gctcactcat taggcacccc aggctttaca ctttatgctt 6240ccggctcgta tgttgtgtgg aattgtgagc ggataacaat ttcacacagg aaacagctat 6300gaccatgagg cgcgccg 63173896638DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 389gattcgacat tgattattga ctagttatta atagtaatca attacggggt cattagttca 60tagcccatat atggagttcc gcgttacata acttacggta aatggcccgc ctggctgacc 120gcccaacgac ccccgcccat tgacgtcaat aatgacgtat gttcccatag taacgccaat 180agggactttc cattgacgtc aatgggtgga gtatttacgg taaactgccc acttggcagt 240acatcaagtg tatcatatgc caagtacgcc ccctattgac gtcaatgacg gtaaatggcc 300cgcctggcat tatgcccagt acatgacctt atgggacttt cctacttggc agtacatcta 360cgtattagtc atcgctatta ccatggtcga ggtgagcccc acgttctgct tcactctccc 420catctccccc ccctccccac ccccaatttt gtatttattt attttttaat tattttgtgc 480agcgatgggg gcgggggggg ggggggggcg cgcgccrggs ggggsggggs ggggsgrggg 540gsggggsggg gsgaggcgga gaggtgcggc ggcagccaat cagagcggcg cgctccgaaa 
600gtttcctttt atggcgaggc ggcggcggcg gcggccctat aaaaagcgaa gcgcgcggcg 660ggcgggagtc gctgcgcgct gccttcgccc cgtgccccgc tccgccgccg cctcgcgccg 720cccgccccgg ctctgactga ccgcgttact cccacaggtg agcgggcggg acggcccttc 780tcctccgggc tgtaattagc gcttggttta atgacggctt gtttcttttc tgtggctgcg 840tgaaagcctt gaggggctcc gggagggccc tttgtgcggg gggagcggct cggggggtgc 900gtgcgtgtgt gtgtgcgtgg ggagcgccgc gtgcggctcc gcgctgcccg gcggctgtga 960gcgctgcggg cgcggcgcgg ggctttgtgc gctccgcagt gtgcgcgagg ggagcgcggc 1020cgggggcggt gccccgcggt gcgggggggg ctgcgagggg aacaaaggct gcgtgcgggg 1080tgtgtgcgtg ggggggtgag cagggggtgt gggcgcgtcg gtcgggctgc aaccccccct 1140gcacccccct ccccgagttg ctgagcacgg cccggcttcg ggtgcggggc tccgtacggg 1200gcgtggcgcg gggctcgccg tgccgggcgg ggggtggcgg caggtggggg tgccgggcgg 1260ggcggggccg cctcgggccg gggagggctc gggggagggg cgcggcggcc cccggagcgc 1320cggcggctgt cgaggcgcgg cgagccgcag ccattgcctt ttatggtaat cgtgcgagag 1380ggcgcaggga cttcctttgt cccaaatctg tgcggagccg aaatctggga ggcgccgccg 1440caccccctct agcgggcgcg gggcgaagcg gtgcggcgcc ggcaggaagg aaatgggcgg 1500ggagggcctt cgtgcgtcgc cgcgccgccg tccccttctc cctctccagc ctcggggctg 1560tccgcggggg gacggctgcc ttcggggggg acggggcagg gcggggttcg gcttctggcg 1620tgtgaccggc ggctctagag cctctgctaa ccatgttcat gccttcttct ttttcctaca 1680gatccttaat taataatacg actcactata gggggtcgac ccgccaccat gcccaagaag 1740aaacggaaag tgatgagccc ctttatcgcc ccggacgtgc ccgagcacct cctggacact 1800gtgcgcgtct ttctgtacgc ccgtcagagt aaaggacggt cagatggatc tgacgtgtcc 1860accgaagcac agctcgctgc cggacgggcc cttgttgcct caagaaacgc acaaggggga 1920gctagatggg tggtggcggg cgaattcgtg gatgtgggca gatcagggtg ggacccgaat 1980gtgacacgcg ccgacttcga aagaatgatg ggcgaggtgc gcgccggtga gggagacgta 2040gtggtggtta atgaactgag tcgccttacg aggaagggcg cccacgacgc tctggagatc 2100gataacgaac tcaaaaaaca cggtgtgcgg ttcatgagcg tgctggaacc attcctggat 2160accagcaccc caatcggtgt cgcgatcttt gccctgattg ccgcgctcgc taaacaggat 2220tcagacctta aagctgagcg gctgaagggg gctaaagatg agatcgctgc cttggggggt 2280gtgcacagct catctgcgcc attcggcatg 
agggcggtca gaaagaaagt ggataacctg 2340gtcatatctg ttctggagcc tgatgaggac aacccggacc acgttgagct tgtggaacgg 2400atggctaaga tgtctttcga aggcgtcagc gataacgcaa ttgccacaac atttgagaag 2460gagaaaatcc cctctccggg gatggctgag agacgagcca cggagaagag gcttgcttct 2520attaaggcac ggaggctcaa tggcgccgaa aagccgatca tgtggcgggc gcagacagtt 2580agatggattc ttaaccatcc cgcgattggt ggattcgcat tcgagcgggt gaaacacgga 2640aaagcccaca tcaacgtgat acgaagagat cccggcggca aaccccttac ccctcacact 2700ggtatcctgt ctggatccaa gtggttggaa ctccaggaga agagaagcgg gaaaaatctc 2760tccgaccgca aaccaggtgc cgaagtggaa cctacgctgc tttccgggtg gagatttctg 2820ggatgtcgga tatgcggtgg gtcaatgggc cagtcccaag ggggccgtaa gaggaatggg 2880gacttggctg agggcaatta catgtgtgca aacccaaagg ggcacggcgg tctgagcgtc 2940aagaggtctg agcttgatga attcgtggca tcaaaagtct gggccaggtt gcgcacggct 3000gacatggagg atgaacatga ccaagcatgg attgcagctg cagctgaacg gtttgctttg 3060cagcacgacc tggcgggggt agctgacgag cgacgggagc aacaagctca cctggataac 3120gttcggagat caataaaaga tctccaggcg gataggaagg caggtctcta cgtgggacgc 3180gaagaactgg agacctggcg cagtaccgtc ctgcaatata ggagctacga ggctgagtgt 3240actactaggt tggctgagct ggatgaaaaa atgaatggat ccacccgggt gccttcagaa 3300tggtttagcg gcgaggaccc aaccgcggaa ggaggcatat gggcgagctg ggatgtctat 3360gagcgccggg agtttctcag cttttttttg gactccgtaa tggttgacag gggcagacat 3420cctgaaacca agaaatatat accattgaaa gaccgggtga ccttaaagtg ggcggagctg 3480ttaaaggaag aggatgaagc aagcgaggcc acagaacggg agctggcagc tctttaggtc 3540agagctcgct gatcagcctc gactgtgcct tctagttgcc agccatctgt tgtttgcccc 3600tcccccgtgc cttccttgac cctggaaggt gccactccca ctgtcctttc ctaataaaat 3660gaggaaattg catcgcattg tctgagtagg tgtcattcta ttctgggggg tggggtgggg 3720caggacagca agggggagga ttgggaagac aatagcaggc atgctgggga tgcggtgggc 3780tctatggctt ctgaggcgga aagaaccagc tggggctcga gatccactag ttctagcctc 3840gaggctagag cggccgccac tggccgtcgt tttacaacgt cgtgactggg aaaaccctgg 3900cgttacccaa cttaatcgcc ttgcagcaca tccccctttc gccagctggc gtaatagcga 3960agaggcccgc accgatcgcc cttcccaaca gttgcgcagc ctgaatggcg aatgggacgc 
4020gccctgtagc ggcgcattaa gcgcggcggg tgtggtggtt acgcgcagcg tgaccgctac 4080acttgccagc gccctagcgc ccgctccttt cgctttcttc ccttcctttc tcgccacgtt 4140cgccggcttt ccccgtcaag ctctaaatcg ggggctccct ttagggttcc gatttagtgc 4200tttacggcac ctcgacccca aaaaacttga ttagggtgat ggttcacgta gtgggccatc 4260gccctgatag acggtttttc gccctttgac gttggagtcc acgttcttta atagtggact 4320cttgttccaa actggaacaa cactcaaccc tatctcggtc tattcttttg atttataagg 4380gattttgccg atttcggcct attggttaaa aaatgagctg atttaacaaa aatttaacgc 4440gaattttaac aaaatattaa cgcttacrmk tymsrtkssm cwttymggsg aaatgtgcgc 4500ggaaccccta tttgtttatt tttctaaata cattcaaata tgtatccgct catgagacaa 4560taaccctgat aaatgcttca ataatattga aaaaggaaga gtatgagtat tcaacatttc 4620cgtgtcgccc ttattccctt ttttgcggca ttttgccttc ctgtttttgc tcacccagaa 4680acgctggtga aagtaaaaga tgctgaagat cagttgggtg cacgagtggg ttacatcgaa 4740ctggatctca acagcggtaa gatccttgag agttttcgcc ccgaagaacg ttttccaatg 4800atgagcactt ttaaagttct gctatgtggc gcggtattat cccgtattga cgccgggcaa 4860gagcaactcg gtcgccgcat acactattct cagaatgact tggttgagta ctcaccagtc 4920acagaaaagc atcttacgga tggcatgaca gtaagagaat tatgcagtgc tgccataacc 4980atgagtgata acactgcggc caacttactt ctgacaacga tcggaggacc gaaggagcta 5040accgcttttt tgcacaacat gggggatcat gtaactcgcc ttgatcgttg ggaaccggag 5100ctgaatgaag ccataccaaa cgacgagcgt gacaccacga tgcctgtagc aatggcaaca 5160acgttgcgca aactattaac tggcgaacta cttactctag cttcccggca acaattaata 5220gactggatgg aggcggataa agttgcagga ccacttctgc gctcggccct tccggctggc 5280tggtttattg ctgataaatc tggagccggt gagcgtgggt ctcgcggtat cattgcagca 5340ctggggccag atggtaagcc ctcccgtatc gtagttatct acacgacggg gagtcaggca 5400actatggatg aacgaaatag acagatcgct gagataggtg cctcactgat taagcattgg 5460taactgtcag accaagttta ctcatatata ctttagattg atttaaaact tcatttttaa 5520tttaaaagga tctaggtgaa gatccttttt gataatctca tgaccaaaat cccttaacgt 5580gagttttcgt tccactgagc gtcagacccc gtagaaaaga tcaaaggatc ttcttgagat 5640cctttttttc tgcgcgtaat ctgctgcttg caaacaaaaa aaccaccgct accagcggtg 5700gtttgtttgc cggatcaaga gctaccaact 
ctttttccga aggtaactgg cttcagcaga 5760gcgcagatac caaatactgt tcttctagtg tagccgtagt taggccacca cttcaagaac 5820tctgtagcac cgcctacata cctcgctctg ctaatcctgt taccagtggc tgctgccagt 5880ggcgataagt cgtgtcttac cgggttggac tcaagacgat agttaccgga taaggcgcag 5940cggtcgggct gaacgggggg ttcgtgcaca cagcccagct tggagcgaac gacctacacc 6000gaactgagat acctacagcg tgagctatga gaaagcgcca cgcttcccga agggagaaag 6060gcggacaggt atccggtaag cggcagggtc ggaacaggag agcgcacgag ggagcttcca 6120gggggaaacg cctggtatct ttatagtcct gtcgggtttc gccacctctg acttgagcgt 6180cgatttttgt gatgctcgtc aggggggcgg agcctatgga aaaacgccag caacgcggcc 6240tttttacggt tcctggcctt ttgctggcct tttgctcaca tgttctttcc tgcgttatcc 6300cctgattctg tggataaccg tattaccgcc tttgagtgag ctgataccgc tcgccgcagc 6360cgaacgaccg agcgcagcga gtcagtgagc gaggaagcgg aagagcgccc aatacgcaaa 6420ccgcctctcc ccgcgcgttg gccgattcat taatgcagct ggcacgacag gtttcccgac 6480tggaaagcgg gcagtgagcg caacgcaatt aatgtgagtt agctcactca ttaggcaccc 6540caggctttac actttatgct tccggctcgt atgttgtgtg gaattgtgag cggataacaa 6600tttcacacag gaaacagcta tgaccatgag gcgcgccg 66383909530DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 390taatcagcat catgatgtgg taccacatca tgatgctgat tataagaatg cggccgccac 60actctagtgg atctcgagtt aataattcag aagaactcgt caagaaggcg atagaaggcg 120atgcgctgcg aatcgggagc ggcgataccg taaagcacga ggaagcggtc agcccattcg 180ccgccaagct cttcagcaat atcacgggta gccaacgcta tgtcctgata gcggtccgcc 240acacccagcc ggccacagtc gatgaatcca gaaaagcggc cattttccac catgatattc 300ggcaagcagg catcgccatg ggtcacgacg agatcctcgc cgtcgggcat gctcgccttg 360agcctggcga acagttcggc tggcgcgagc ccctgatgct cttcgtccag atcatcctga 420tcgacaagac cggcttccat ccgagtacgt gctcgctcga tgcgatgttt cgcttggtgg 480tcgaatgggc aggtagccgg atcaagcgta tgcagccgcc gcattgcatc agccatgatg 540gatactttct cggcaggagc aaggtgtaga tgacatggag atcctgcccc ggcacttcgc 600ccaatagcag ccagtccctt cccgcttcag tgacaacgtc gagcacagct gcgcaaggaa 660cgcccgtcgt ggccagccac gatagccgcg ctgcctcgtc ttgcagttca ttcagggcac 720cggacaggtc 
ggtcttgaca aaaagaaccg ggcgcccctg cgctgacagc cggaacacgg 780cggcatcaga gcagccgatt gtctgttgtg cccagtcata gccgaatagc ctctccaccc 840aagcggccgg agaacctgcg tgcaatccat cttgttcaat catgcgaaac gatcctcatc 900ctgtctcttg atcagagctt gatcccctgc gccatcagat ccttggcggc gagaaagcca 960tccagtttac tttgcagggc ttcccaacct taccagaggg cgccccagct ggcaattccg 1020gttcgcttgc tgtccataaa accgcccagt ctagctatcg ccatgtaagc ccactgcaag 1080ctacctgctt tctctttgcg cttgcgtttt cccttgtcca gatagcccag tagctgacat 1140tcatccgggg tcagcaccgt ttctgcggac tggctttcta cgtgctcgag gggggccaaa 1200cggtctccag cttggctgtt ttggcggatg agagaagatt ttcagcctga tacagattaa 1260atcagaacgc agaagcggtc tgataaaaca gaatttgcct ggcggcagta gcgcggtggt 1320cccacctgac cccatgccga actcagaagt gaaacgccgt agcgccgatg gtagtgtggg 1380gtctccccat gcgagagtag ggaactgcca ggcatcaaat aaaacgaaag gctcagtcga 1440aagactgggc ctttcgtttt atctgttgtt tgtcggtgaa cgctctcctg agtaggacaa 1500atccgccggg agcggatttg aacgttgcga agcaacggcc cggagggtgg cgggcaggac 1560gcccgccata aactgccagg catcaaatta agcagaaggc catcctgacg gatggccttt 1620ttgcgtttct acaaactctt ttgtttattt ttctaaatac attcaaatat gtatccgctc 1680atgaccaaaa tcccttaacg tgagttttcg ttccactgag cgtcagaccc cgtagaaaag 1740atcaaaggat cttcttgaga tccttttttt ctgcgcgtaa tctgctgctt gcaaacaaaa 1800aaaccaccgc taccagcggt ggtttgtttg ccggatcaag agctaccaac tctttttccg 1860aaggtaactg gcttcagcag agcgcagata ccaaatactg tccttctagt gtagccgtag 1920ttaggccacc acttcaagaa ctctgtagca ccgcctacat acctcgctct gctaatcctg 1980ttaccagtgg ctgctgccag tggcgataag tcgtgtctta ccgggttgga ctcaagacga 2040tagttaccgg ataaggcgca gcggtcgggc tgaacggggg gttcgtgcac acagcccagc 2100ttggagcgaa cgacctacac cgaactgaga tacctacagc gtgagctatg agaaagcgcc 2160acgcttcccg aagggagaaa ggcggacagg tatccggtaa gcggcagggt cggaacagga 2220gagcgcacga gggagcttcc agggggaaac gcctggtatc tttatagtcc tgtcgggttt 2280cgccacctct gacttgagcg tcgatttttg tgatgctcgt caggggggcg gagcctatgg 2340aaaaacgcca gcaacgcggc ctttttacgg ttcctggcct tttgctggcc ttttgctcac 2400atgttctttc ctgcgttatc ccctgattct gtggataacc gtattaccgc 
ctttgagtga 2460gctgataccg ctcgccgcag ccgaacgacc gagcgcagcg agtcagtgag cgaggaagcg 2520gaagagcgcc tgatgcggta ttttctcctt acgcatctgt gcggtatttc acaccgcata 2580tggtgcactc tcagtacaat ctgctctgat gccgcatagt taagccagta tacactccgc 2640tatcgctacg tgactgggtc atggctgcgc cccgacaccc gccaacaccc gctgacgcgc 2700cctgacgggc ttgtctgctc ccggcatccg cttacagaca agctgtgacc gtctccggga 2760gctgcatgtg tcagaggttt tcaccgtcat caccgaaacg cgcgaggcag cagatcaatt 2820cgcgcgcgaa ggcgaagcgg catgcataat gtgcctgtca aatggacgaa gcagggattc 2880tgcaaaccct atgctactcc gtcaagccgt caattgtctg attcgttacc aattatgaca 2940acttgacggc tacatcattc actttttctt cacaaccggc acggaactcg ctcgggctgg 3000ccccggtgca ttttttaaat acccgcgaga aatagagttg atcgtcaaaa ccaacattgc 3060gaccgacggt ggcgataggc atccgggtgg tgctcaaaag cagcttcgcc tggctgatac 3120gttggtcctc gcgccagctt aagacgctaa tccctaactg ctggcggaaa agatgtgaca 3180gacgcgacgg cgacaagcaa acatgctgtg cgacgctggc gatacattac cctgttatcc 3240ctagatgaca ttaccctgtt atcccagatg acattaccct gttatcccta gatgacatta 3300ccctgttatc cctagatgac atttaccctg ttatccctag atgacattac cctgttatcc 3360cagatgacat taccctgtta tccctagata cattaccctg ttatcccaga tgacataccc 3420tgttatccct agatgacatt accctgttat cccagatgac attaccctgt tatccctaga 3480tacattaccc tgttatccca gatgacatac cctgttatcc ctagatgaca ttaccctgtt 3540atcccagatg acattaccct gttatcccta gatacattac cctgttatcc cagatgacat 3600accctgttat ccctagatga cattaccctg ttatcccaga tgacattacc ctgttatccc 3660tagatacatt accctgttat cccagatgac ataccctgtt atccctagat gacattaccc
3720tgttatccca gatgacatta ccctgttatc cctagataca ttaccctgtt atcccagatg 3780acataccctg ttatccctag atgacattac cctgttatcc cagatgacat taccctgtta 3840tccctagata cattaccctg ttatcccaga tgacataccc tgttatccct agatgacatt 3900accctgttat cccagataaa ctcaatgatg atgatgatga tggtcgagac tcagcggccg 3960cggtgccagg gcgtgccctt gggctccccg ggcgcgacta taagctgcga gcaacttcac 4020ttgggtatgc cggcggtagc gctgagggcc tatttcccat gattccttca tatttgcata 4080tacgatacaa ggctgttaga gagataattg gaattaattt gactgtaaac acaaagatat 4140tagtacaaaa tacgtgacgt agaaagtaat aatttcttgg gtagtttgca gttttaaaat 4200tatgttttaa aatggactat catatgctta ccgtaacttg aaagtatttc gatttcttgg 4260ctttatatat cttgtggaaa ggacgaaaca ccgggtcttc gagaagacct gttttagagc 4320tagaaatcgt ggttcgcacc gactcggtgc cacagcaagt taaaataagg ctagtccgtt 4380atcaacttga aaaagtggca ccgagtcggt gcttttttga attcgctagc taggtcttga 4440aaggagtggg aattggctcc ggtgcccgtc agtgggcaga gcgcacatcg cccacagtcc 4500ccgagaagtt ggggggaggg gtcggcaatt gatccggtgc ctagagaagg tggcgcgggg 4560taaactggga aagtgatgtc gtgtactggc tccgcctttt tcccgagggt gggggagaac 4620cgtatataag tgcagtagtc gccgtgaacg ttctttttcg caacgggttt gccgccagaa 4680cacaggaccg gttctagagc gctgccacca tggacaagaa gtacagcatc ggcctggaca 4740tcggcaccaa ctctgtgggc tgggccgtga tcaccgacga gtacaaggtg cccagcaaga 4800aattcaaggt gctgggcaac accgaccggc acagcatcaa gaagaacctg atcggagccc 4860tgctgttcga cagcggcgaa acagccgagg ccacccggct gaagagaacc gccagaagaa 4920gatacaccag acggaagaac cggatctgct atctgcaaga gatcttcagc aacgagatgg 4980ccaaggtgga cgacagcttc ttccacagac tggaagagtc cttcctggtg gaagaggata 5040agaagcacga gcggcacccc atcttcggca acatcgtgga cgaggtggcc taccacgaga 5100agtaccccac catctaccac ctgagaaaga aactggtgga cagcaccgac aaggccgacc 5160tgcggctgat ctatctggcc ctggcccaca tgatcaagtt ccggggccac ttcctgatcg 5220agggcgacct gaaccccgac aacagcgacg tggacaagct gttcatccag ctggtgcaga 5280cctacaacca gctgttcgag gaaaacccca tcaacgccag cggcgtggac gccaaggcca 5340tcctgtctgc cagactgagc aagagcagac ggctggaaaa tctgatcgcc cagctgcccg 5400gcgagaagaa gaatggcctg ttcggaaacc 
tgattgccct gagcctgggc ctgaccccca 5460acttcaagag caacttcgac ctggccgagg atgccaaact gcagctgagc aaggacacct 5520acgacgacga cctggacaac ctgctggccc agatcggcga ccagtacgcc gacctgtttc 5580tggccgccaa gaacctgtcc gacgccatcc tgctgagcga catcctgaga gtgaacaccg 5640agatcaccaa ggcccccctg agcgcctcta tgatcaagag atacgacgag caccaccagg 5700acctgaccct gctgaaagct ctcgtgcggc agcagctgcc tgagaagtac aaagagattt 5760tcttcgacca gagcaagaac ggctacgccg gctacattga cggcggagcc agccaggaag 5820agttctacaa gttcatcaag cccatcctgg aaaagatgga cggcaccgag gaactgctcg 5880tgaagctgaa cagagaggac ctgctgcgga agcagcggac cttcgacaac ggcagcatcc 5940cccaccagat ccacctggga gagctgcacg ccattctgcg gcggcaggaa gatttttacc 6000cattcctgaa ggacaaccgg gaaaagatcg agaagatcct gaccttccgc atcccctact 6060acgtgggccc tctggccagg ggaaacagca gattcgcctg gatgaccaga aagagcgagg 6120aaaccatcac cccctggaac ttcgaggaag tggtggacaa gggcgcttcc gcccagagct 6180tcatcgagcg gatgaccaac ttcgataaga acctgcccaa cgagaaggtg ctgcccaagc 6240acagcctgct gtacgagtac ttcaccgtgt ataacgagct gaccaaagtg aaatacgtga 6300ccgagggaat gagaaagccc gccttcctga gcggcgagca gaaaaaggcc atcgtggacc 6360tgctgttcaa gaccaaccgg aaagtgaccg tgaagcagct gaaagaggac tacttcaaga 6420aaatcgagtg cttcgactcc gtggaaatct ccggcgtgga agatcggttc aacgcctccc 6480tgggcacata ccacgatctg ctgaaaatta tcaaggacaa ggacttcctg gacaatgagg 6540aaaacgagga cattctggaa gatatcgtgc tgaccctgac actgtttgag gacagagaga 6600tgatcgagga acggctgaaa acctatgccc acctgttcga cgacaaagtg atgaagcagc 6660tgaagcggcg gagatacacc ggctggggca ggctgagccg gaagctgatc aacggcatcc 6720gggacaagca gtccggcaag acaatcctgg atttcctgaa gtccgacggc ttcgccaaca 6780gaaacttcat gcagctgatc cacgacgaca gcctgacctt taaagaggac atccagaaag 6840cccaggtgtc cggccagggc gatagcctgc acgagcacat tgccaatctg gccggcagcc 6900ccgccattaa gaagggcatc ctgcagacag tgaaggtggt ggacgagctc gtgaaagtga 6960tgggccggca caagcccgag aacatcgtga tcgaaatggc cagagagaac cagaccaccc 7020agaagggaca gaagaacagc cgcgagagaa tgaagcggat cgaagagggc atcaaagagc 7080tgggcagcca gatcctgaaa gaacaccccg tggaaaacac ccagctgcag aacgagaagc 
7140tgtacctgta ctacctgcag aatgggcggg atatgtacgt ggaccaggaa ctggacatca 7200accggctgtc cgactacgat gtggaccata tcgtgcctca gagctttctg aaggacgact 7260ccatcgacaa caaggtgctg accagaagcg acaagaaccg gggcaagagc gacaacgtgc 7320cctccgaaga ggtcgtgaag aagatgaaga actactggcg gcagctgctg aacgccaagc 7380tgattaccca gagaaagttc gacaatctga ccaaggccga gagaggcggc ctgagcgaac 7440tggataaggc cggcttcatc aagagacagc tggtggaaac ccggcagatc acaaagcacg 7500tggcacagat cctggactcc cggatgaaca ctaagtacga cgagaatgac aagctgatcc 7560gggaagtgaa agtgatcacc ctgaagtcca agctggtgtc cgatttccgg aaggatttcc 7620agttttacaa agtgcgcgag atcaacaact accaccacgc ccacgacgcc tacctgaacg 7680ccgtcgtggg aaccgccctg atcaaaaagt accctaagct ggaaagcgag ttcgtgtacg 7740gcgactacaa ggtgtacgac gtgcggaaga tgatcgccaa gagcgagcag gaaatcggca 7800aggctaccgc caagtacttc ttctacagca acatcatgaa ctttttcaag accgagatta 7860ccctggccaa cggcgagatc cggaagcggc ctctgatcga gacaaacggc gaaaccgggg 7920agatcgtgtg ggataagggc cgggattttg ccaccgtgcg gaaagtgctg agcatgcccc 7980aagtgaatat cgtgaaaaag accgaggtgc agacaggcgg cttcagcaaa gagtctatcc 8040tgcccaagag gaacagcgat aagctgatcg ccagaaagaa ggactgggac cctaagaagt 8100acggcggctt cgacagcccc accgtggcct attctgtgct ggtggtggcc aaagtggaaa 8160agggcaagtc caagaaactg aagagtgtga aagagctgct ggggatcacc atcatggaaa 8220gaagcagctt cgagaagaat cccatcgact ttctggaagc caagggctac aaagaagtga 8280aaaaggacct gatcatcaag ctgcctaagt actccctgtt cgagctggaa aacggccgga 8340agagaatgct ggcctctgcc ggcgaactgc agaagggaaa cgaactggcc ctgccctcca 8400aatatgtgaa cttcctgtac ctggccagcc actatgagaa gctgaagggc tcccccgagg 8460ataatgagca gaaacagctg tttgtggaac agcacaagca ctacctggac gagatcatcg 8520agcagatcag cgagttctcc aagagagtga tcctggccga cgctaatctg gacaaagtgc 8580tgtccgccta caacaagcac cgggataagc ccatcagaga gcaggccgag aatatcatcc 8640acctgtttac cctgaccaat ctgggagccc ctgccgcctt caagtacttt gacaccacca 8700tcgaccggaa gaggtacacc agcaccaaag aggtgctgga cgccaccctg atccaccaga 8760gcatcaccgg cctgtacgag acacggatcg acctgtctca gctgggaggc gacaagcgac 8820ctgccgccac aaagaaggct ggacaggcta 
agaagaagaa agattacaaa gacgatgacg 8880ataagtaact agagctcgct gatcagcctc gactgtgcct tctagttgcc agccatctgt 8940tgtttgcccc tcccccgtgc cttccttgac cctggaaggt gccactccca ctgtcctttc 9000ctaataaaat gaggaaattg catcgcattg tctgagtagg tgtcattcta ttctgggggg 9060tggggtgggg caggacagca agggggagga ttgggaagag aatagcaggc atgctgggga 9120ctgaggcgga aagaaccagc tgtggaatgt gtgtcagtta gggtgtggaa agtccccagg 9180ctccccagca ggcagaagta tgcaaagcat gcatctcaat tagtcagcaa ccaggtgtgg 9240aaagtcccca ggctccccag caggcagaag tatgcaaagc atgcatctca attagtcagc 9300aaccatagtc ccgcccctaa ctccgcccat cccgccccta actccgccca gttccgccca 9360ttctccgccc catggctgac taattttttt tatttatgca gaggccgagg ccgcctcggc 9420ctctgagcta ttccagaagt agtgaggagg cttttttgga ggcctaggct tttgcaaaaa 9480gcttgggccc gccccaactg gggtaacctt tgagttctct cagttggggg 95303915722DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 391tgatcccctg cgccatcaga tccttggcgg cgagaaagcc atccagttta ctttgcaggg 60cttcccaacc ttaccagagg gcgccccagc tggcaattcc ggttcgcttg ctgtccataa 120aaccgcccag tctagctatc gccatgtaag cccactgcaa gctacctgct ttctctttgc 180gcttgcgttt tcccttgtcc agatagccca gtagctgaca ttcatccggg gtcagcaccg 240tttctgcgga ctggctttct acgtgctcga ggggggccaa acggtctcca gcttggctgt 300tttggcggat gagagaagat tttcagcctg atacagatta aatcagaacg cagaagcggt 360ctgataaaac agaatttgcc tggcggcagt agcgcggtgg tcccacctga ccccatgccg 420aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg ggtctcccca tgcgagagta 480gggaactgcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt 540tatctgttgt ttgtcggtga acgctctcct gagtaggaca aatccgccgg gagcggattt 600gaacgttgcg aagcaacggc ccggagggtg gcgggcagga cgcccgccat aaactgccag 660gcatcaaatt aagcagaagg ccatcctgac ggatggcctt tttgcgtttc tacaaactct 720tttgtttatt tttctaaata cattcaaata tgtatccgct catgaccaaa atcccttaac 780gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag 840atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg 900tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca 
960gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 1020actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca 1080gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc 1140agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca 1200ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa 1260aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 1320cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc 1380gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg 1440cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat 1500cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca 1560gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc ctgatgcggt 1620attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact ctcagtacaa 1680tctgctctga tgccgcatag ttaagccagt atacactccg ctatcgctac gtgactgggt 1740catggctgcg ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct 1800cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt 1860ttcaccgtca tcaccgaaac gcgcgaggca gcagatcaat tcgcgcgcga aggcgaagcg 1920gcatgcataa tgtgcctgtc aaatggacga agcagggatt ctgcaaaccc tatgctactc 1980cgtcaagccg tcaattgtct gattcgttac caattatgac aacttgacgg ctacatcatt 2040cactttttct tcacaaccgg cacggaactc gctcgggctg gccccggtgc attttttaaa 2100tacccgcgag aaatagagtt gatcgtcaaa accaacattg cgaccgacgg tggcgatagg 2160catccgggtg gtgctcaaaa gcagcttcgc ctggctgata cgttggtcct cgcgccagct 2220taagacgcta atccctaact gctggcggaa aagatgtgac agacgcgacg gcgacaagca 2280aacatgctgt gcgacgctgg cgatacatta ccctgttatc cctagatgac attaccctgt 2340tatcccagat gacattaccc tgttatccct agatgacatt accctgttat ccctagatga 2400catttaccct gttatcccta gatgacatta ccctgttatc ccagatgaca ttaccctgtt 2460atccctagat acattaccct gttatcccag atgacatacc ctgttatccc tagatgacat 2520taccctgtta tcccagatga cattaccctg ttatccctag atacattacc ctgttatccc 2580agatgacata ccctgttatc cctagatgac attaccctgt tatcccagat gacattaccc 2640tgttatccct agatacatta ccctgttatc 
ccagatgaca taccctgtta tccctagatg 2700acattaccct gttatcccag atgacattac cctgttatcc ctagatacat taccctgtta 2760tcccagatga cataccctgt tatccctaga tgacattacc ctgttatccc agatgacatt 2820accctgttat ccctagatac attaccctgt tatcccagat gacataccct gttatcccta 2880gatgacatta ccctgttatc ccagatgaca ttaccctgtt atccctagat acattaccct 2940gttatcccag atgacatacc ctgttatccc tagatgacat taccctgtta tcccagataa 3000actcaatgat gatgatgatg atggtcgaga ctcagcggcc gcggtgccag ggcgtgccct 3060tgggctcccc gggcgcggtc ctttgggcgc taactgcgtg cgcgctggga attggcgcta 3120attgcgcgtg cgcgctggga ctcaaggcgc taactgcgcg tgcgttctgg ggcccggggt 3180gccgcggcct gggctggggc gaaggcgggc tcggccggaa ggggtggggt cgccgcggct 3240cccgggcgct tgcgcgcact tcctgcccga gccgctggcc gcccgagggt gtggccgctg 3300cgtgcgcgcg cgccgacccg gcgctgtttg aaccgggcgg aggcggggct ggcgcccggt 3360tgggaggggg ttggggcctg gcttcctgcc gcgcgccgcg gggacgcctc cgaccagtgt 3420ttgcctttta tggtaataac gcggccggcc cggcttcctt tgtccccaat ctgggcgcgc 3480gccggcgccc cctggcggcc taaggactcg gcgcgccgga agtggccagg gcgggggcga 3540cctcggctca cagcgcgccc ggctattctc gcagctcgcc accatgcccg ccatgaagat 3600cgagtgccgc atcaccggca ccctgaacgg cgtggagttc gagctggtgg gcggcggaga 3660gggcaccccc gagcagggcc gcatgaccaa caagatgaag agcaccaaag gcgccctgac 3720cttcagcccc tacctgctga gccacgtgat gggctacggc ttctaccact tcggcaccta 3780ccccagcggc tacgagaacc ccttcctgca cgccatcaac aacggcggct acaccaacac 3840ccgcatcgag aagtacgagg acggcggcgt gctgcacgtg agcttcagct accgctacga 3900ggccggccgc gtgatcggcg acttcaaggt ggtgggcacc ggcttccccg aggacagcgt 3960gatcttcacc gacaagatca tccgcagcaa cgccaccgtg gagcacctgc accccatggg 4020cgataacgtg ctggtgggca gcttcgcccg caccttcagc ctgcgcgacg gcggctacta 4080cagcttcgtg gtggacagcc acatgcactt caagagcgcc atccacccca gcatcctgca 4140gaacgggggc cccatgttcg ccttccgccg cgtggaggag ctgcacagca acaccgagct 4200gggcatcgtg gagtaccagc acgccttcaa gacccccatc gccttcgcca gatctcgagc 4260tcgaaccatg gatgatgata tcgccgcgct cgtcgtcgac aacggctccg gcatgtgcaa 4320ggccggcttc gcgggcgacg atgccccccg ggccgtcttc ccctccatcg tggggcgccc 
4380caggcaccag gtaggggagc tggctgggtg gggcagcccc gggagcgggc gggaggcaag 4440ggcgctttct ctgcacagga gcctcccggt ttccggggtg ggggctgcgc ccgtgctcag 4500ggcttcttgt cctttccttc ccagggcgtg atggtgggca tgggtcagaa ggattcctat 4560gtgggcgacg aggcccagag caagagaggc atcctcaccc tgaagtaccc catcgagcac 4620ggcatcgtca ccaactggga cgacatggag aaaatctggc accacacctt ctacaatgag 4680ctgcgtgtgg ctcccgagga gcaccccgtg ctgctgaccg aggcccccct gaaccccaag 4740gccaaccgcg agaagatgac ccagccccaa ctggggtaac ctttgagttc tctcagttgg 4800gggtaatcag catcatgatg tggtaccaca tcatgatgct gattataaga atgcggccgc 4860cacactctag tggatctcga gttaataatt cagaagaact cgtcaagaag gcgatagaag 4920gcgatgcgct gcgaatcggg agcggcgata ccgtaaagca cgaggaagcg gtcagcccat 4980tcgccgccaa gctcttcagc aatatcacgg gtagccaacg ctatgtcctg atagcggtcc 5040gccacaccca gccggccaca gtcgatgaat ccagaaaagc ggccattttc caccatgata 5100ttcggcaagc aggcatcgcc atgggtcacg acgagatcct cgccgtcggg catgctcgcc 5160ttgagcctgg cgaacagttc ggctggcgcg agcccctgat gctcttcgtc cagatcatcc 5220tgatcgacaa gaccggcttc catccgagta cgtgctcgct cgatgcgatg tttcgcttgg 5280tggtcgaatg ggcaggtagc cggatcaagc gtatgcagcc gccgcattgc atcagccatg 5340atggatactt tctcggcagg agcaaggtgt agatgacatg gagatcctgc cccggcactt 5400cgcccaatag cagccagtcc cttcccgctt cagtgacaac gtcgagcaca gctgcgcaag 5460gaacgcccgt cgtggccagc cacgatagcc gcgctgcctc gtcttgcagt tcattcaggg 5520caccggacag gtcggtcttg acaaaaagaa ccgggcgccc ctgcgctgac agccggaaca 5580cggcggcatc agagcagccg attgtctgtt gtgcccagtc atagccgaat agcctctcca 5640cccaagcggc cggagaacct gcgtgcaatc catcttgttc aatcatgcga aacgatcctc 5700atcctgtctc ttgatcagag ct 572239215424DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 392tcgacggtat cgataagctt gatatcgaat tcctgcagcc cgggggatcc actagttcta 60gagcggccgc caccgcggtg gagctccagc ttttgttccc tttagtgagg gttaatttcg 120agcttggcgt aatcatggtc atagctgttt cctgtgtgaa attgttatcc gctcacaatt 180ccacacaaca tacgagccgg aagcataaag tgtaaagcct ggggtgccta atgagtgagc 240taactcacat taattgcgtt gcgctcactg cccgctttcc agtcgggaaa 
cctgtcgtgc 300cagctgcatt aatgaatcgg ccaacgcgcg gggagaggcg gtttgcgtat tgggcgctct 360tccgcttcct cgctcactga ctcgctgcgc tcggtcgttc ggctgcggcg agcggtatca 420gctcactcaa aggcggtaat acggttatcc acagaatcag gggataacgc aggaaagaac 480atgtgagcaa aaggccagca aaaggccagg aaccgtaaaa aggccgcgtt gctggcgttt 540ttccataggc tccgcccccc tgacgagcat cacaaaaatc gacgctcaag tcagaggtgg 600cgaaacccga caggactata aagataccag gcgtttcccc ctggaagctc cctcgtgcgc 660tctcctgttc cgaccctgcc gcttaccgga tacctgtccg cctttctccc ttcgggaagc 720gtggcgcttt ctcatagctc acgctgtagg tatctcagtt cggtgtaggt cgttcgctcc 780aagctgggct gtgtgcacga accccccgtt cagcccgacc gctgcgcctt atccggtaac 840tatcgtcttg agtccaaccc ggtaagacac gacttatcgc cactggcagc agccactggt 900aacaggatta gcagagcgag gtatgtaggc ggtgctacag agttcttgaa gtggtggcct 960aactacggct acactagaag gacagtattt ggtatctgcg ctctgctgaa gccagttacc 1020ttcggaaaaa gagttggtag ctcttgatcc ggcaaacaaa ccaccgctgg tagcggtggt 1080ttttttgttt gcaagcagca gattacgcgc agaaaaaaag gatctcaaga agatcctttg 1140atcttttcta cggggtctga cgctcagtgg aacgaaaact cacgttaagg gattttggtc 1200atgagattat caaaaaggat cttcacctag atccttttaa attaaaaatg aagttttaaa 1260tcaatctaaa gtatatatga gtaaacttgg tctgacagtt accaatgctt aatcagtgag 1320gcacctatct cagcgatctg tctatttcgt tcatccatag ttgcctgact ccccgtcgtg 1380tagataacta cgatacggga gggcttacca tctggcccca gtgctgcaat gataccgcga 1440gacccacgct caccggctcc agatttatca gcaataaacc agccagccgg aagggccgag 1500cgcagaagtg gtcctgcaac tttatccgcc tccatccagt ctattaattg ttgccgggaa 1560gctagagtaa gtagttcgcc agttaatagt ttgcgcaacg ttgttgccat tgctacaggc 1620atcgtggtgt cacgctcgtc gtttggtatg gcttcattca gctccggttc ccaacgatca 1680aggcgagtta catgatcccc catgttgtgc aaaaaagcgg ttagctcctt cggtcctccg 1740atcgttgtca gaagtaagtt ggccgcagtg ttatcactca tggttatggc agcactgcat 1800aattctctta ctgtcatgcc atccgtaaga tgcttttctg tgactggtga gtactcaacc 1860aagtcattct gagaatagtg tatgcggcga ccgagttgct cttgcccggc gtcaatacgg 1920gataataccg cgccacatag cagaacttta aaagtgctca tcattggaaa acgttcttcg 1980gggcgaaaac tctcaaggat cttaccgctg 
ttgagatcca gttcgatgta acccactcgt 2040gcacccaact gatcttcagc atcttttact ttcaccagcg tttctgggtg agcaaaaaca 2100ggaaggcaaa atgccgcaaa aaagggaata agggcgacac ggaaatgttg aatactcata 2160ctcttccttt ttcaatatta ttgaagcatt tatcagggtt attgtctcat gagcggatac 2220atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt tccccgaaaa 2280gtgccacctg acgtctaaga aaccattatt atcatgacat taacctataa aaataggcgt 2340atcacgaggc cctttcgtct tcaagaattc tcatgtttga cagcttatca tcgataagct 2400ttaatgcggt agtttatcac agttaaattg ctaacgcagt caggcaccgt gtatgaaatc 2460taacaatgcg ctcatcgtca tcctcggcac cgtcaccctg gatgctgtag gcataggctt 2520ggttatgccg gtactgccgg gcctcttgcg ggatatcgtc cattccgaca gcatcgccag 2580tcactatggc gtgctgctag cgctatatgc gttgatgcaa tttctatgcg cacccgttct 2640cggagcactg tccgaccgct ttggccgccg cccagtcctg ctcgcttcgc tacttggagc 2700cactatcgac tacgcgatca tggcgaccac acccgtcctg tggatccggc gcacaccaaa 2760aacgtcactt ttgccacatc cgtcgcttac atgtgttccg ccacacttgc aacatcacac 2820ttccgccaca ctactacgtc acccgccccg ttcccacgcc ccgcgccacg tcacaaactc 2880caccccctca ttatcatatt ggcttcaatc caaaataaat catcaataat ataccttatt 2940ttggattgaa gccaatatga taatgagggg gtggagtttg tgacgtggcg cggggcgtgg 3000gaacggggcg ggtgacgtag gttttagggc ggagtaactt gtatgtgttg ggaattgtag 3060ttttcttaaa atgggaagtt acgtaacgtg ggaaaacgga agtgacgatt tgaggaagtt 3120gtgggttttt tggctttcgt ttctgggcgt aggttcgcgt gcggttttct gggtgttttt 3180tgtggacttt aaccgttacg tcatttttta gtcctatata tactcgctct gcacttggcc 3240cttttttaca ctgtgactga ttgagctggt gccgtgtcga gtggtgtttt tttaataggt
3300tttctttttt actggtaagg ctgactgtta ggctgccgct gtgaagcgct gtatgttgtt 3360ctggagcggg agggtgctat tttgcctagg caggagggtt tttcaggtgt ttatgtgttt 3420ttctctccta ttaattttgt tatacctcct atgggggctg taatgttgtc tctacgcctg 3480cgggtatgta ttcccccggg ctatttcggt cgctttttag cactgaccga tgaatcaacc 3540tgatgtgttt accgagtctt acattatgac tccggacatg accgaggagc tgtcggtggt 3600gctttttaat cacggtgacc agttttttta cggtcacgcc ggcatggccg tagtccgtct 3660tatgcttata agggttgttt ttcctgttgt aagacaggct tctaatgttt aaatgttttt 3720ttgttatttt attttgtgtt tatgcagaaa cccgcagaca tgtttgagag aaaaatggtg 3780tctttttctg tggtggttcc ggagcttacc tgcctttatc tgcatgagca tgactacgat 3840gtgctttctt ttttgcgcga ggctttgcct gattttttga gcagcacctt gcattttata 3900tcgccgccca tgcaacaaag cttacatcgg ggctacgctg gttagcatag ctccgagtat 3960gcgtgtcata atcagtgtgg gttcttttgt caaggttcct ggcggggaag tggccgcgct 4020ggtccgtgca gacctgcacg attatgttca gctggccctg cgaagggacc tacgggatcg 4080cggtattttt gttaatgttc cgcttttgaa tcttatacag gtctgtgagg aacctgaatt 4140tttgcaatca tgattcgctg cttgaggctg aaggtggagg gcgctctgga gcagattttt 4200acaatggccg gacttaatat tcgggatttg cttagagata tattgagaag gtggcgagat 4260gagaattatt tgggcatggt tgaaggtgct ggaatgttta tagaggagat tcaccctgaa 4320gggtttagcc tttacgtcca cttggacgtg agggccgttt gccttttgga agccattgtg 4380caacatctta caaatgccat tatctgttct ttggctgtag agtttgacca cgccaccgga 4440ggggagcgcg ttcacttaat agatcttcat tttgaggttt tggataatct tttggaataa 4500aaaaaaaaac atggttcttc cagctcttcc cgctcctccc gtgtgtgact cgcagaacga 4560atgtgtaggt tggctgggtg tggcttattc tgcggtggtg gatgttatca gggcagcggc 4620gcatgaagga gtttacatag aacccgaagc cagggggcgc ctggatgctt tgagagagtg 4680gatatactac aactactaca cagagcgatc taagcggcga gaccggagac gcagatctgt 4740ttgtcacgcc cgcacctggt tttgcttcag gaaatatgac tacgtccggc gttccatttg 4800gcatgacact acgaccaaca cgatctcggt tgtctcggcg cactccgtac agtagggatc 4860gtctacctcc ttttgagaca gaaacccgcg ctaccatact ggaggatcat ccgctgctgc 4920ccgaatgtaa cactttgaca atgcacaacg tgagttacgt gcgaggtctt ccctgcagtg 4980tgggatttac gctgattcag gaatgggttg 
ttccctggga tatggttcta acgcgggagg 5040agcttgtaat cctgaggaag tgtatgcacg tgtgcctgtg ttgtgccaac attgatatca 5100tgacgagcat gatgatccat ggttacgagt cctgggctct ccactgtcat tgttccagtc 5160ccggttccct gcagtgtata gccggcgggc aggttttggc cagctggttt aggatggtgg 5220tggatggcgc catgtttaat cagaggttta tatggtaccg ggaggtggtg aattacaaca 5280tgccaaaaga ggtaatgttt atgtccagcg tgtttatgag gggtcgccac ttaatctacc 5340tgcgcttgtg gtatgatggc cacgtgggtt ctgtggtccc cgccatgagc tttggataca 5400gcgccttgca ctgtgggatt ttgaacaata ttgtggtgct gtgctgcagt tactgtgctg 5460atttaagtga gatcagggtg cgctgctgtg cccggaggac aaggcgcctt atgctgcggg 5520cggtgcgaat catcgctgag gagaccactg ccatgttgta ttcctgcagg acggagcggc 5580ggcggcagca gtttattcgc gcgctgctgc agcaccaccg ccctatcctg atgcacgatt 5640atgactctac ccccatgtag gcgtggactt ctccttcgcc gcccgttaag caaccgcaag 5700ttggacagca gcctgtggct cagcagctgg acagcgacat gaacttaagt gagctgcccg 5760gggagtttat taatatcact gatgagcgtt tggctcgaca ggaaaccgtg tggaatataa 5820cacctaagaa tatgtctgtt acccatgata tgatgctttt taaggccagc cggggagaaa 5880ggactgtgta ctctgtgtgt tgggagggag gtggcaggtt gaatactagg gttctgtgag 5940tttgattaag gtacggtgat ctgtataagc tatgtggtgg tggggctata ctactgaatg 6000aaaaatgact tgaaattttc tgcaattgaa aaataaacac gttgaaacat aacacaaacg 6060attctttatt cttgggcaat gtatgaaaaa gtgtaagagg atgtggcaaa tatttcatta 6120atgtagttgt ggccagacca gtcccatgaa aatgacatag agtatgcact tggagttgtg 6180tctcctgttt cctgtgtacc gtttagtgta atggttagtg ttacaggttt agttttgtct 6240ccgtttaagt aaacttgact gacaatgtta cttttggcag ttttaccgtg agattttgga 6300taagctgata ggttaggcat aaatccaaca gcgtttgtat aggctgtgcc ttcagtaaga 6360tctccatttc taaagttcca atattctggg tccaggaagg aattgtttag tagcactcca 6420ttttcgtcaa atcttataat aagatgagca ctttgaactg ttccagatat tggagccaaa 6480ctgcctttaa cagccaaaac tgaaactgta gcaagtattt gactgccaca ttttgttaag 6540accaaagtga gtttagcatc tttctctgca tttagtctac agttaggaga tggagctggt 6600gtggtccaca aagttagctt atcattattt ttgtttccta ctgtaatggc acctgtgctg 6660tcaaaactaa ggccagttcc tagtttagga accatagcct tgtttgaatc aaattctagg 
6720ccatggccaa tttttgtttt gaggggattt gtgtttggtg cattaggtga accaaattca 6780agcccatctc ctgcattaat ggctatggct gtagcgtcaa acatcaaccc cttggcagtg 6840cttaggttaa cctcaagctt tttggaattg tttgaagctg taaacaagta aaggcctttg 6900ttgtagttaa tatccaagtt gtgggctgag tttataaaaa gagggccctg tcctagtctt 6960agatttagtt ggttttgagc atcaaacgga taactaacat caagtataag gcgtctgttt 7020tgagaatcaa tccttagtcc tcctgctaca ttaagttgca tattgccttg tgaatcaaaa 7080cccaaggctc cagtaacttt agtttgcaag gaagtattat taatagtcac acctggacca 7140gttgctacgg tcaaagtgtt taggtcgtct gttacatgca aaggagcccc gtactttagt 7200cctagttttc cattttgtgt ataaatgggc tctttcaagt caatgcccaa gctaccagtg 7260gcagtagtta gagggggtga ggcagtgata gtaagggtac tgctatcggt ggtggtgagg 7320gggcctgatg tttgcagggc tagctttcct tctgacactg tgaggggtcc ttgggtggca 7380atgctaagtt tggagtcgtg cacggttagc ggggcctgtg attgcatggt gagtgtgttg 7440cccgcgacca ttagaggtgc ggcggcagcc acagttaggg cttctgaggt aactgtgagg 7500ggtgcagata tttccaggtt tatgtttgac ttggtttttt tgagaggtgg gctcacagtg 7560gttacatttt gggaggtaag gttgccggcc tcgtccagag agaggccgtt gcccattttg 7620agcgcaagca tgccattgga ggtaactaga ggttcggata ggcgcaaaga gagtacccca 7680gggggactct cttgaaaccc attgggggat acaaagggag gagtaagaaa aggcacagtt 7740ggaggaccgg tttccgtgtc atatggatac acggggttga aggtatcttc agacggtctt 7800gcgcgcttca tctgcaacaa catgaagata gtgggtgcgg atggacagga acaggaggaa 7860actgacattc catttagatt gtggagaaag tttgcagcca ggaggaagct gcaataccag 7920agctgggagg agggcaagga ggtgctgctg aataaactgg acagaaattt gctaactgat 7980tttaagtaag tgatgcttta ttattttttt ttattagtta aagggaataa gatccccggg 8040tactctagtt aattaactag aggatcttga tgtaatccaa ggttaggaca gttgcaaatc 8100acagtgagaa cacagggtcc cctgtcccgc tcaactagca gggggcgctg ggtaaactcc 8160cgaatcaggc tacgggcaag ctctccctgg gcggtaagcc ggacgccgtg cgccgggccc 8220tcgatatgat cctcgggcaa ttcaaagtag caaaactcac cggagtcgcg ggcaaagcac 8280ttgtggcggc gacagtggac caggtgtttc aggcgcagtt gctctgcctc tccacttaac 8340attcagtcgt agccgtccgc cgagtccttt accgcgtcaa agttaggaat aaattgatcc 8400ggatagtggc cgggaggtcc cgagaagggg 
ttaaagtaga ccgatggcac aaactcctca 8460ataaattgca gagttccaat gcctccagag cgcggctcag aggacgaggt ctgcagagtt 8520aggattgcct gacgaggcgt gaatgaagga cggccggcgc cgccgatctg aaatgtcccg 8580tccggacgga gaccaagcga ggagctcacc gactcgtcgt tgagctgaat acctcgccct 8640ctgattgtca ggtgagttat accctgcccg ggcgaccgca ccctgtgacg aaagccgccc 8700gcaagctgcg cccctgagtt agtcatctga acttcggcct gggcgtctct gggaagtacc 8760acagtggtgg gagcgggact ttcctggtac accagggcag cgggccaact acggggatta 8820aggttattac gaggtgtggt ggtaatagcc gcctgttcca agagaattcg gtttcggtgg 8880gcgcggattc cgttgacccg ggatatcatg tggggtcccg cgctcatgta gtttattcgg 8940gttgagtagt cttgggcagc tccagccgca agtcccattt gtggctggta actccacatg 9000tagggcgtgg gaatttcctt gctcataatg gcgctgacga caggtgctgg cgccgggtgt 9060ggccgctgga gatgacgtag ttttcgcgct taaatttgag aaagggcgcg aaactagtcc 9120ttaagagtca gcgcgcagta tttactgaag agagcctccg cgtcttccag cgtgcgccga 9180agctgatctt cgcttttgtg atacaggcag ctgcgggtga gggatcgcag agacctgttt 9240tttattttca gctcttgttc ttggcccctg ctctgttgaa atatagcata cagagtggga 9300aaaatcctgt ttctaagctc gcgggtcgat acgggttcgt tgggcgccag acgcagcgct 9360cctcctcctg ctgctgccgc cgctgtggat ttcttgggct ttgtcagagt cttgctatcc 9420ggtcgccttt gcttctgtgt ggccgctgct gttgctgccg ctgccgctgc cgccggtgca 9480gtatgggctg tagagatgac ggtagtaatg caggatgtta cgggggaagg ccacgccgtg 9540atggtagaga agaaagcggc gggcgaagga gatgttgccc ccacagtctt gcaagcaagc 9600aactatggcg ttcttgtgcc cgcgccatga gcggtagcct tggcgctgtt gttgctcttg 9660ggctaacggc ggcggctgct tggacttacc ggccctggtt ccagtggtgt cccatctacg 9720gttgggtcgg cgaacgggca gtgccggcgg cgcctgagga gcggaggttg tagccatgct 9780ggaaccggtt gccgatttct ggggcgccgg cgaggggaat gcgaccgagg gtgacggtgt 9840ttcgtctgac acctcttcga cctcggaagc ttcctcgtct aggctctccc agtcttccat 9900catgtcctcc tcctcctcgt ccaaaacctc ctctgcctga ctgtcccagt attcctcctc 9960gtccgtgggt ggcggcggca gctgcagctt ctttttgggt gccatcctgg gaagcaaggg 10020cccgcggctg ctgctgatag ggctgcggcg gcggggggat tgggttgagc tcctcgccgg 10080actgggggtc caagtaaacc ccccgtccct ttcgtagcag aaactcttgg cgggctttgt 
10140tgatggcttg caattggcca agaatgtggc cctgggtaat gacgcaggcg gtaagctccg 10200catttggcgg gcgggattgg tcttcgtaga acctaatctc gtgggcgtgg tagtcctcag 10260gtacaaattt gcgaaggtaa gccgacgtcc acagccccgg agtgagtttc aaccccggag 10320ccgcggactt ttcgtcaggc gagggaccct gcagctcaaa ggtaccgata atttgacttt 10380cgttaagcag ctgcgaattg caaaccaggg agcggtgcgg ggtgcatagg ttgcagcgac 10440agtgacactc cagtagaccg tcaccgctca cgtcttccat tatgtcagag tggtaggcaa 10500ggtagttggc tagctgcaga aggtagcagt ggccccaaag cggcggaggg cattcgcggt 10560acttaatggg cacaaagtcg ctaggaagtg cacagcaggt ggcgggcaag attcctgagc 10620gctctaggat aaagttccta aagttctgca acatgctttg actggtgaag tctggcagac 10680cctgttgcag ggttttaagc aggcgttcgg ggaaaatgat gtccgccagg tgcgcggcca 10740cggagcgctc gttgaaggcc gtccataggt ccttcaagtt ttgctttagc agtttctgca 10800gctccttgag gttgcactcc tccaagcact gctgccaaac gcccatggcc gtctgccagg 10860tgtagcatag aaataagtaa acgcagtcgc ggacgtagtc gcggcgcgcc tcgcccttga 10920gcgtggaatg aagcacgttt tgcccaaggc ggttttcgtg caaaattcca aggtaggaga 10980ccaggttgca gagctccacg ttggagatct tgcaggcctg gcgtacgtag ccctgtcgaa 11040aggtgtagtg caatgtttcc tctagcttgc gctgcatctc cgggtcagca aagaaccgct 11100gcatgcactc aagctccacg gtaacgagca ctgcggccat cattagtttg cgtcgctcct 11160ccaagtcggc aggctcgcgc gtttgaagcc agcgcgctag ctgctcgtcg ccaactgcgg 11220gtaggccctc ctctgtttgt tcttgcaaat ttgcatccct ctccaggggc tgcgcacggc 11280gcacgatcag ctcactcatg actgtgctca tgaccttggg gggtaggtta agtgccgggt 11340aggcaaagtg ggtgacctcg atgctgcgtt ttagtacggc taggcgcgcg ttgtcaccct 11400cgagttccac caacactcca gagtgacttt cattttcgct gttttcctgt tgcagagcgt 11460ttgccgcgcg cttctcgtcg cgtccaagac cctcaaagat ttttggcact tcgttgagcg 11520aggcgatatc aggtatgaca gcgccctgcc gcaaggccag ctgcttgtcc gctcggctgc 11580ggttggcacg gcaggatagg ggtatcttgc agttttggaa aaagatgtga taggtggcaa 11640gcacctctgg cacggcaaat acggggtaga agttgaggcg cgggttgggc tcgcatgtgc 11700cgttttcttg gcgtttgggg ggtacgcgcg gtgagaatag gtggcgttcg taggcaaggc 11760tgacatccgc tatggcgagg ggcacatcgc tgcgctcttg caacgcgtcg cagataatgg 
11820cgcactggcg ctgcagatgc ttcaacagca cgtcgtctcc cacatctagg tagtcgccat 11880gcctttcgtc cccccgcccg acttgttcct cgtttgcctc tgcgttgtcc tggtcttgct 11940ttttatcctc tgttggtact gagcggtcct cgtcgtcttc gcttacaaaa cctgggtcct 12000gctcgataat cacttcctcc tcctcaagcg ggggtgcctc gacggggaag gtggtaggcg 12060cgttggcggc atcggtggag gcggtggtgg cgaactcaga gggggcggtt aggctgtcct 12120tcttctcgac tgactccatg atctttttct gcctatagga gaaggaaatg gccagtcggg 12180aagaggagca gcgcgaaacc acccccgagc gcggacgcgg tgcggcgcga cgtcccccaa 12240ccatggagga cgtgtcgtcc ccgtccccgt cgccgccgcc tccccgggcg cccccaaaaa 12300agcggatgag gcggcgtatc gagtccgagg acgaggaaga ctcatcacaa gacgcgctgg 12360tgccgcgcac acccagcccg cggccatcga cctcggcggc ggatttggcc attgcgccca 12420agaagaaaaa gaagcgccct tctcccaagc ccgagcgccc gccatcacca gaggtaatcg 12480tggacagcga ggaagaaaga gaagatgtgg cgctacaaat ggtgggtttc agcaacccac 12540cggtgctaat caagcatggc aaaggaggta agcgcacagt gcggcggctg aatgaagacg 12600acccagtggc gcgtggtatg cggacgcaag aggaagagga agagcccagc gaagcggaaa 12660gtgaaattac ggtgatgaac ccgctgagtg tgccgatcgt gtctgcgtgg gagaagggca 12720tggaggctgc gcgcgcgctg atggacaagt accacgtgga taacgatcta aaggcgaact 12780tcaaactact gcctgaccaa gtggaagctc tggcggccgt atgcaagacc tggctgaacg 12840aggagcaccg cgggttgcag ctgaccttca ccagcaacaa gacctttgtg acgatgatgg 12900ggcgattcct gcaggcgtac ctgcagtcgt ttgcagaggt gacctacaag catcacgagc 12960ccacgggctg cgcgttgtgg ctgcaccgct gcgctgagat cgaaggcgag cttaagtgtc 13020tacacggaag cattatgata aataaggagc acgtgattga aatggatgtg acgagcgaaa 13080acgggcagcg cgcgctgaag gagcagtcta gcaaggccaa gatcgtgaag aaccggtggg 13140gccgaaatgt ggtgcagatc tccaacaccg acgcaaggtg ctgcgtgcac gacgcggcct 13200gtccggccaa tcagttttcc ggcaagtctt gcggcatgtt cttctctgaa ggcgcaaagg 13260ctcaggtggc ttttaagcag atcaaggctt ttatgcaggc gctgtatcct aacgcccaga 13320ccgggcacgg tcaccttttg atgccactac ggtgcgagtg caactcaaag cctgggcacg 13380cgcccttttt gggaaggcag ctaccaaagt tgactccgtt cgccctgagc aacgcggagg 13440acctggacgc ggatctgatc tccgacaaga gcgtgctggc cagcgtgcac cacccggcgc 
13500tgatagtgtt ccagtgctgc aaccctgtgt atcgcaactc gcgcgcgcag ggcggaggcc 13560ccaactgcga cttcaagata tcggcgcccg acctgctaaa cgcgttggtg atggtgcgca 13620gcctgtggag tgaaaacttc accgagctgc cgcggatggt tgtgcctgag tttaagtgga 13680gcactaaaca ccagtatcgc aacgtgtccc tgccagtggc gcatagcgat gcgcggcaga 13740acccctttga tttttaaacg gcgcagacgg caagggtggg ggtaaataat cacccgagag 13800tgtacaaata aaagcatttg cctttattga aagtgtctct agtacattat ttttacatgt 13860ttttcaagtg acaaaaagaa gtggcgctcc taatctgcgc actgtggctg cggaagtagg 13920gcgagtggcg ctccaggaag ctgtagagct gttcctggtt gcgacgcagg gtgggctgta 13980cctggggact gttgagcatg gagttgggta ccccggtaat aaggttcatg gtggggttgt 14040gatccatggg agtttggggc cagttggcaa aggcgtggag aaacatgcag cagaatagtc 14100cacaggcggc cgagttgggc ccctgtacgc tttgggtgga cttttccagc gttatacagc 14160ggtcggggga agaagcaatg gcgctacggc gcaggagtga ctcgtactca aactggtaaa 14220cctgcttgag tcgctggtca gaaaagccaa agggctcaaa gaggtagcat gtttttgagt 14280gcgggttcca ggcaaaggcc atccagtgta cgcccccagt ctcggtccga gactcgaacc 14340gggggtcccg cgactcaacc cttggaaaat aaccctccgg ctacagggag cgagccactt 14400aatgctttcg ctttccagcc taaccgctta cgctgcgcgc ggccagtggc caaaaaagct 14460agcgcagcag ccgccgcgcc tggaaggaag ccaaaaggag cactcccccg ttgtctgacg 14520tcgcacacct gggttcgaca cgcgggcggt aaccgcatgg atcacggcgg acggccggat 14580acggggctcg aaccccggtc gtccgccatg atacccttgc gaatttatcc accagaccac 14640ggaagagtgc ccgcttacag gctctccttt tgcacggtag agcgtcaacg attgcgcgcg 14700cctgaccggc cagagcgtcc cgaccatgga gcactttttg ccgctgcgca acatctggaa 14760ccgcgtccgc gactttccgc gcgcctccac caccgccgcc ggcatcacct ggatgtccag 14820gtacatctac ggatatcatc gccttatgtt ggaagatctc gcccccggag ccccggccac 14880cctacgctgg cccctctacc gccagccgcc gccgcacttt ttggtgggat accagtacct 14940ggtgcggact tgcaacgact acgtatttga ctcgagggct tactcgcgtc tcaggtacac 15000cgagctctcg cagccgggtc accagaccgt taactggtcc gttatggcca actgcactta 15060caccatcaac acgggcgcat accaccgctt tgtggacatg gatgacttcc agtctaccct 15120cacgcaggtg cagcaggcca tattagccga gcgcgttgtc gccgacctag ccctgcttca 
15180gccgatgagg ggcttcgggg tcacacgcat gggaggaaga gggcgccacc tacggccaaa 15240ctccgccgcc gccgcagcga tagatgcaag agatgcagga caagaggaag gagaagaaga 15300agtgccggta gaaaggctca tgcaagacta ctacaaagac ctgcgccgat gtcaaaacga 15360agcctggggc atggccgacc gcctgcgcat tcagcaggcc ggacccaagg acatggtgct 15420tctg 154243933849DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 393cggcgtcaat acgggataat accgcgccac atagcagaac tttaaaagtg ctcatcattg 60gaaaacgttc ttcggggcga aaactctcaa ggatcttacc gctgttgaga tccagttcga 120tgtaacccac tcgtgcaccc aactgatctt cagcatcttt tactttcacc agcgtttctg 180ggtgagcaaa aacaggaagg caaaatgccg caaaaaaggg aataagggcg acacggaaat 240gttgaatact catactcttc ctttttcaat attattgaag catttatcag ggttattgtc 300tcatgagcgg atacatattt gaatgtattt agaaaaataa acaaataggg gttccgcgca 360catttccccg aaaagtgcca cctaaattgt aagcgttaat attttgttaa aattcgcgtt 420aaatttttgt taaatcagct cattttttaa ccaataggcc gaaatcggca aaatccctta 480taaatcaaaa gaatagaccg agatagggtt gagtgttgtt ccagtttgga acaagagtcc 540actattaaag aacgtggact ccaacgtcaa agggcgaaaa accgtctatc agggcgatgg 600cccactacgt gaaccatcac cctaatcaag ttttttgggg tcgaggtgcc gtaaagcact 660aaatcggaac cctaaaggga gcccccgatt tagagcttga cggggaaagc cggcgaacgt 720ggcgagaaag gaagggaaga aagcgaaagg agcgggcgct agggcgctgg caagtgtagc 780ggtcacgctg cgcgtaacca ccacacccgc cgcgcttaat gcgccgctac agggcgcgtc 840ccattcgcca ttcaggctgc gcaactgttg ggaagggcga tcggtgcggg cctcttcgct 900attacgccag ctgcgcgctc gctcgctcac tgaggccgcc cgggcaaagc ccgggcgtcg 960ggcgaccttt ggtcgcccgg cctcagtgag cgagcgagcg cgcagagagg gagtggccaa 1020ctccatcact aggggttcct tgtagttaat gattaacccg ccatgctact tatctacgta 1080gccatgctct aggaagagta ccattgacgt caataatgac gtatgttccc atagtaacgc 1140caatagggac tttccattga cgtcaatggg tggagtattt acggtaaact gcccacttgg 1200cagtacatca agtgtatcag tggtttgtct ggtcaaccac cgcggtctca gtggtgtacg 1260gtacaaaccc agctaccggt cgccaccatg cccgccatga agatcgagtg ccgcatcacc 1320ggcaccctga acggcgtgga gttcgagctg gtgggcggcg gagagggcac ccccgagcag 1380ggccgcatga 
ccaacaagat gaagagcacc aaaggcgccc tgaccttcag cccctacctg 1440ctgagccacg tgatgggcta cggcttctac cacttcggca cctaccccag cggctacgag 1500aaccccttcc tgcacgccat caacaacggc ggctacacca acacccgcat cgagaagtac 1560gaggacggcg gcgtgctgca cgtgagcttc agctaccgct acgaggccgg ccgcgtgatc 1620ggcgacttca aggtggtggg caccggcttc cccgaggaca gcgtgatctt caccgacaag 1680atcatccgca gcaacgccac cgtggagcac ctgcacccca tgggcgataa cgtgctggtg 1740ggcagcttcg cccgcacctt cagcctgcgc gacggcggct actacagctt cgtggtggac 1800agccacatgc acttcaagag cgccatccac cccagcatcc tgcagaacgg gggccccatg 1860ttcgccttcc gccgcgtgga ggagctgcac agcaacaccg agctgggcat cgtggagtac 1920cagcacgcct tcaagacccc catcgccttc gccagatctc gagctcgatg agtttggaca 1980aaccacaact agaatgcagt gaaaaaaatg ctttatttgt gaaatttgtg atgctattgc 2040tttatttgtg ggcccgggat cttcctagag catggctacg tagataagta gcatggcggg 2100ttaatcatta actacaagga acccctagtg atggagttgg ccactccctc tctgcgcgct 2160cgctcgctca ctgaggccgg gcgaccaaag gtcgcccgac gcccgggctt tgcccgggcg 2220gcctcagtga gcgagcgagc gcgcagctgc attaatgaat cggccaacgc gcggggagag 2280gcggtttgcg tattgggcgc tcttccgctt cctcgctcac tgactcgctg cgctcggtcg 2340ttcggctgcg gcgagcggta tcagctcact caaaggcggt aatacggtta tccacagaat 2400caggggataa cgcaggaaag aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta 2460aaaaggccgc gttgctggcg tttttccata ggctccgccc ccctgacgag catcacaaaa 2520atcgacgctc aagtcagagg tggcgaaacc cgacaggact ataaagatac caggcgtttc 2580cccctggaag ctccctcgtg cgctctcctg ttccgaccct gccgcttacc ggatacctgt 2640ccgcctttct cccttcggga agcgtggcgc tttctcatag ctcacgctgt aggtatctca 2700gttcggtgta ggtcgttcgc tccaagctgg gctgtgtgca cgaacccccc gttcagcccg 2760accgctgcgc cttatccggt aactatcgtc
ttgagtccaa cccggtaaga cacgacttat 2820cgccactggc agcagccact ggtaacagga ttagcagagc gaggtatgta ggcggtgcta 2880cagagttctt gaagtggtgg cctaactacg gctacactag aagaacagta tttggtatct 2940gcgctctgct gaagccagtt accttcggaa aaagagttgg tagctcttga tccggcaaac 3000aaaccaccgc tggtagcggt ggtttttttg tttgcaagca gcagattacg cgcagaaaaa 3060aaggatctca agaagatcct ttgatctttt ctacggggtc tgacgctcag tggaacgaaa 3120actcacgtta agggattttg gtcatgagat tatcaaaaag gatcttcacc tagatccttt 3180taaattaaaa atgaagtttt aaatcaatct aaagtatata tgagtaaact tggtctgaca 3240gttaccaatg cttaatcagt gaggcaccta tctcagcgat ctgtctattt cgttcatcca 3300tagttgcctg actccccgtc gtgtagataa ctacgatacg ggagggctta ccatctggcc 3360ccagtgctgc aatgataccg cgagacccac gctcaccggc tccagattta tcagcaataa 3420accagccagc cggaagggcc gagcgcagaa gtggtcctgc aactttatcc gcctccatcc 3480agtctattaa ttgttgccgg gaagctagag taagtagttc gccagttaat agtttgcgca 3540acgttgttgc cattgctaca ggcatcgtgg tgtcacgctc gtcgtttggt atggcttcat 3600tcagctccgg ttcccaacga tcaaggcgag ttacatgatc ccccatgttg tgcaaaaaag 3660cggttagctc cttcggtcct ccgatcgttg tcagaagtaa gttggccgca gtgttatcac 3720tcatggttat ggcagcactg cataattctc ttactgtcat gccatccgta agatgctttt 3780ctgtgactgg tgagtactca accaagtcat tctgagaata gtgtatgcgg cgaccgagtt 3840gctcttgcc 38493947336DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 394atgccggggt tttacgagat tgtgattaag gtccccagcg accttgacga gcatctgccc 60ggcatttctg acagctttgt gaactgggtg gccgagaagg aatgggagtt gccgccagat 120tctgacatgg atctgaatct gattgagcag gcacccctga ccgtggccga gaagctgcag 180cgcgactttc tgacggaatg gcgccgtgtg agtaaggccc cggaggctct tttctttgtg 240caatttgaga agggagagag ctacttccac atgcacgtgc tcgtggaaac caccggggtg 300aaatccatgg ttttgggacg tttcctgagt cagattcgcg aaaaactgat tcagagaatt 360taccgcggga tcgagccgac tttgccaaac tggttcgcgg tcacaaagac cagaaatggc 420gccggaggcg ggaacaaggt ggtggatgag tgctacatcc ccaattactt gctccccaaa 480acccagcctg agctccagtg ggcgtggact aatatggaac agtatttaag cgcctgtttg 540aatctcacgg agcgtaaacg gttggtggcg cagcatctga 
cgcacgtgtc gcagacgcag 600gagcagaaca aagagaatca gaatcccaat tctgatgcgc cggtgatcag atcaaaaact 660tcagccaggt acatggagct ggtcgggtgg ctcgtggaca aggggattac ctcggagaag 720cagtggatcc aggaggacca ggcctcatac atctccttca atgcggcctc caactcgcgg 780tcccaaatca aggctgcctt ggacaatgcg ggaaagatta tgagcctgac taaaaccgcc 840cccgactacc tggtgggcca gcagcccgtg gaggacattt ccagcaatcg gatttataaa 900attttggaac taaacgggta cgatccccaa tatgcggctt ccgtctttct gggatgggcc 960acgaaaaagt tcggcaagag gaacaccatc tggctgtttg ggcctgcaac taccgggaag 1020accaacatcg cggaggccat agcccacact gtgcccttct acgggtgcgt aaactggacc 1080aatgagaact ttcccttcaa cgactgtgtc gacaagatgg tgatctggtg ggaggagggg 1140aagatgaccg ccaaggtcgt ggagtcggcc aaagccattc tcggaggaag caaggtgcgc 1200gtggaccaga aatgcaagtc ctcggcccag atagacccga ctcccgtgat cgtcacctcc 1260aacaccaaca tgtgcgccgt gattgacggg aactcaacga ccttcgaaca ccagcagccg 1320ttgcaagacc ggatgttcaa atttgaactc acccgccgtc tggatcatga ctttgggaag 1380gtcaccaagc aggaagtcaa agactttttc cggtgggcaa aggatcacgt ggttgaggtg 1440gagcatgaat tctacgtcaa aaagggtgga gccaagaaaa gacccgcccc cagtgacgca 1500gatataagtg agcccaaacg ggtgcgcgag tcagttgcgc agccatcgac gtcagacgcg 1560gaagcttcga tcaactacgc agacaggtac caaaacaaat gttctcgtca cgtgggcatg 1620aatctgatgc tgtttccctg cagacaatgc gagagaatga atcagaattc aaatatctgc 1680ttcactcacg gacagaaaga ctgtttagag tgctttcccg tgtcagaatc tcaacccgtt 1740tctgtcgtca aaaaggcgta tcagaaactg tgctacattc atcatatcat gggaaaggtg 1800ccagacgctt gcactgcctg cgatctggtc aatgtggatt tggatgactg catctttgaa 1860caataaatga tttaaatcag gtatggctgc cgatggttat cttccagatt ggctcgagga 1920caacctctct gagggcattc gcgagtggtg ggcgctgaaa cctggagccc cgaagcccaa 1980agccaaccag caaaagcagg acgacggccg gggtctggtg cttcctggct acaagtacct 2040cggacccttc aacggactcg acaaggggga gcccgtcaac gcggcggacg cagcggccct 2100cgagcacgac aaggcctacg accagcagct gcaggcgggt gacaatccgt acctgcggta 2160taaccacgcc gacgccgagt ttcaggagcg tctgcaagaa gatacgtctt ttgggggcaa 2220cctcgggcga gcagtcttcc aggccaagaa gcgggttctc gaacctctcg gtctggttga 2280ggaaggcgct 
aagacggctc ctggaaagaa gagaccggta gagccatcac cccagcgttc 2340tccagactcc tctacgggca tcggcaagaa aggccaacag cccgccagaa aaagactcaa 2400ttttggtcag actggcgact cagagtcagt tccagaccct caacctctcg gagaacctcc 2460agcagcgccc tctggtgtgg gacctaatac aatggctgca ggcggtggcg caccaatggc 2520agacaataac gaaggcgccg acggagtggg tagttcctcg ggaaattggc attgcgattc 2580cacatggctg ggcgacagag tcatcaccac cagcacccga acctgggccc tgcccaccta 2640caacaaccac ctctacaagc aaatctccaa cgggacatcg ggaggagcca ccaacgacaa 2700cacctacttc ggctacagca ccccctgggg gtattttgac tttaacagat tccactgcca 2760cttttcacca cgtgactggc agcgactcat caacaacaac tggggattcc ggcccaagag 2820actcagcttc aagctcttca acatccaggt caaggaggtc acgcagaatg aaggcaccaa 2880gaccatcgcc aataacctca ccagcaccat ccaggtgttt acggactcgg agtaccagct 2940gccgtacgtt ctcggctctg cccaccaggg ctgcctgcct ccgttcccgg cggacgtgtt 3000catgattccc cagtacggct acctaacact caacaacggt agtcaggccg tgggacgctc 3060ctccttctac tgcctggaat actttccttc gcagatgctg agaaccggca acaacttcca 3120gtttacttac accttcgagg acgtgccttt ccacagcagc tacgcccaca gccagagctt 3180ggaccggctg atgaatcctc tgattgacca gtacctgtac tacttgtctc ggactcaaac 3240aacaggaggc acggcaaata cgcagactct gggcttcagc caaggtgggc ctaatacaat 3300ggccaatcag gcaaagaact ggctgccagg accctgttac cgccaacaac gcgtctcaac 3360gacaaccggg caaaacaaca atagcaactt tgcctggact gctgggacca aataccatct 3420gaatggaaga aattcattgg ctaatcctgg catcgctatg gcaacacaca aagacgacga 3480ggagcgtttt tttcccagta acgggatcct gatttttggc aaacaaaatg ctgccagaga 3540caatgcggat tacagcgatg tcatgctcac cagcgaggaa gaaatcaaaa ccactaaccc 3600tgtggctaca gaggaatacg gtatcgtggc agataacttg cagcagcaaa acacggctcc 3660tcaaattgga actgtcaaca gccagggggc cttacccggt atggtctggc agaaccggga 3720cgtgtacctg cagggtccca tctgggccaa gattcctcac acggacggca acttccaccc 3780gtctccgctg atgggcggct ttggcctgaa acatcctccg cctcagatcc tgatcaagaa 3840cacgcctgta cctgcggatc ctccgaccac cttcaaccag tcaaagctga actctttcat 3900cacgcaatac agcaccggac aggtcagcgt ggaaattgaa tgggagctgc agaaggaaaa 3960cagcaagcgc tggaaccccg agatccagta cacctccaac 
tactacaaat ctacaagtgt 4020ggactttgct gttaatacag aaggcgtgta ctctgaaccc cgccccattg gcacccgtta 4080cctcacccgt aatctgtaat tgcctgttaa tcaataaacc ggttgattcg tttcagttga 4140actttggtct ctgcgaaggg cgaattcgtt taaacctgca ggactagagg tcctgtatta 4200gaggtcacgt gagtgttttg cgacattttg cgacaccatg tggtcacgct gggtatttaa 4260gcccgagtga gcacgcaggg tctccatttt gaagcgggag gtttgaacgc gcagccgcca 4320agccgaattc tgcagatatc catcacactg gcggccgctc gactagagcg gccgccaccg 4380cggtggagct ccagcttttg ttccctttag tgagggttaa ttgcgcgctt ggcgtaatca 4440tggtcatagc tgtttcctgt gtgaaattgt tatccgctca caattccaca caacatacga 4500gccggaagca taaagtgtaa agcctggggt gcctaatgag tgagctaact cacattaatt 4560gcgttgcgct cactgcccgc tttccagtcg ggaaacctgt cgtgccagct gcattaatga 4620atcggccaac gcgcggggag aggcggtttg cgtattgggc gctcttccgc ttcctcgctc 4680actgactcgc tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg 4740gtaatacggt tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc 4800cagcaaaagg ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc 4860ccccctgacg agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga 4920ctataaagat accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc 4980ctgccgctta ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat 5040agctcacgct gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg 5100cacgaacccc ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc 5160aacccggtaa gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga 5220gcgaggtatg taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact 5280agaagaacag tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt 5340ggtagctctt gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag 5400cagcagatta cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg 5460tctgacgctc agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa 5520aggatcttca cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata 5580tatgagtaaa cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg 5640atctgtctat ttcgttcatc catagttgcc tgactccccg tcgtgtagat aactacgata 5700cgggagggct 
taccatctgg ccccagtgct gcaatgatac cgcgagaccc acgctcaccg 5760gctccagatt tatcagcaat aaaccagcca gccggaaggg ccgagcgcag aagtggtcct 5820gcaactttat ccgcctccat ccagtctatt aattgttgcc gggaagctag agtaagtagt 5880tcgccagtta atagtttgcg caacgttgtt gccattgcta caggcatcgt ggtgtcacgc 5940tcgtcgtttg gtatggcttc attcagctcc ggttcccaac gatcaaggcg agttacatga 6000tcccccatgt tgtgcaaaaa agcggttagc tccttcggtc ctccgatcgt tgtcagaagt 6060aagttggccg cagtgttatc actcatggtt atggcagcac tgcataattc tcttactgtc 6120atgccatccg taagatgctt ttctgtgact ggtgagtact caaccaagtc attctgagaa 6180tagtgtatgc ggcgaccgag ttgctcttgc ccggcgtcaa tacgggataa taccgcgcca 6240catagcagaa ctttaaaagt gctcatcatt ggaaaacgtt cttcggggcg aaaactctca 6300aggatcttac cgctgttgag atccagttcg atgtaaccca ctcgtgcacc caactgatct 6360tcagcatctt ttactttcac cagcgtttct gggtgagcaa aaacaggaag gcaaaatgcc 6420gcaaaaaagg gaataagggc gacacggaaa tgttgaatac tcatactctt cctttttcaa 6480tattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt 6540tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctaaattg 6600taagcgttaa tattttgtta aaattcgcgt taaatttttg ttaaatcagc tcatttttta 6660accaataggc cgaaatcggc aaaatccctt ataaatcaaa agaatagacc gagatagggt 6720tgagtgttgt tccagtttgg aacaagagtc cactattaaa gaacgtggac tccaacgtca 6780aagggcgaaa aaccgtctat cagggcgatg gcccactacg tgaaccatca ccctaatcaa 6840gttttttggg gtcgaggtgc cgtaaagcac taaatcggaa ccctaaaggg agcccccgat 6900ttagagcttg acggggaaag ccggcgaacg tggcgagaaa ggaagggaag aaagcgaaag 6960gagcgggcgc tagggcgctg gcaagtgtag cggtcacgct gcgcgtaacc accacacccg 7020ccgcgcttaa tgcgccgcta cagggcgcgt cccattcgcc attcaggctg cgcaactgtt 7080gggaagggcg atcggtgcgg gcctcttcgc tattacgcca gctggcgaaa gggggatgtg 7140ctgcaaggcg attaagttgg gtaacgccag ggttttccca gtcacgacgt tgtaaaacga 7200cggccagtga gcgcgcgtaa tacgactcac tatagggcga attgggtacc gggccccccc 7260tcgatcgagg tcgacggtat cgggggagct cgcagggtct ccattttgaa gcgggaggtt 7320tgaacgcgca gccgcc 7336395969DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 
395ccccaactgg ggtaaccttt gggctccccg ggcgcgacta taagctgcga gcaacttcac 60ttgggtatgc cggcggtagc gcttaccgtt cgtataatgt atgctatacg aagttatccg 120aagccgctag cggtggtttg tctggtcaac caccgcggtc tcagtggtgt acggtacaaa 180cccagctacc ggtcgccacc atgcccgcca tgaagatcga gtgccgcatc accggcaccc 240tgaacggcgt ggagttcgag ctggtgggcg gcggagaggg cacccccgag cagggccgca 300tgaccaacaa gatgaagagc accaaaggcg ccctgacctt cagcccctac ctgctgagcc 360acgtgatggg ctacggcttc taccacttcg gcacctaccc cagcggctac gagaacccct 420tcctgcacgc catcaacaac ggcggctaca ccaacacccg catcgagaag tacgaggacg 480gcggcgtgct gcacgtgagc ttcagctacc gctacgaggc cggccgcgtg atcggcgact 540tcaaggtggt gggcaccggc ttccccgagg acagcgtgat cttcaccgac aagatcatcc 600gcagcaacgc caccgtggag cacctgcacc ccatgggcga taacgtgctg gtgggcagct 660tcgcccgcac cttcagcctg cgcgacggcg gctactacag cttcgtggtg gacagccaca 720tgcacttcaa gagcgccatc caccccagca tcctgcagaa cgggggcccc atgttcgcct 780tccgccgcgt ggaggagctg cacagcaaca ccgagctggg catcgtggag taccagcacg 840ccttcaagac ccccatcgcc ttcgccagat ctcgagctcg atgagtttgg acaaaccaca 900actagaatgc agtgaaaaaa atgctttatt tgtgaaattt gtgatgctat tgctttattt 960gtgggcccg 9693964769DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 396tgatcccctg cgccatcaga tccttggcgg cgagaaagcc atccagttta ctttgcaggg 60cttcccaacc ttaccagagg gcgccccagc tggcaattcc ggttcgcttg ctgtccataa 120aaccgcccag tctagctatc gccatgtaag cccactgcaa gctacctgct ttctctttgc 180gcttgcgttt tcccttgtcc agatagccca gtagctgaca ttcatccggg gtcagcaccg 240tttctgcgga ctggctttct acgtgctcga ggggggccaa acggtctcca gcttggctgt 300tttggcggat gagagaagat tttcagcctg atacagatta aatcagaacg cagaagcggt 360ctgataaaac agaatttgcc tggcggcagt agcgcggtgg tcccacctga ccccatgccg 420aactcagaag tgaaacgccg tagcgccgat ggtagtgtgg ggtctcccca tgcgagagta 480gggaactgcc aggcatcaaa taaaacgaaa ggctcagtcg aaagactggg cctttcgttt 540tatctgttgt ttgtcggtga acgctctcct gagtaggaca aatccgccgg gagcggattt 600gaacgttgcg aagcaacggc ccggagggtg gcgggcagga cgcccgccat aaactgccag 660gcatcaaatt aagcagaagg 
ccatcctgac ggatggcctt tttgcgtttc tacaaactct 720tttgtttatt tttctaaata cattcaaata tgtatccgct catgaccaaa atcccttaac 780gtgagttttc gttccactga gcgtcagacc ccgtagaaaa gatcaaagga tcttcttgag 840atcctttttt tctgcgcgta atctgctgct tgcaaacaaa aaaaccaccg ctaccagcgg 900tggtttgttt gccggatcaa gagctaccaa ctctttttcc gaaggtaact ggcttcagca 960gagcgcagat accaaatact gtccttctag tgtagccgta gttaggccac cacttcaaga 1020actctgtagc accgcctaca tacctcgctc tgctaatcct gttaccagtg gctgctgcca 1080gtggcgataa gtcgtgtctt accgggttgg actcaagacg atagttaccg gataaggcgc 1140agcggtcggg ctgaacgggg ggttcgtgca cacagcccag cttggagcga acgacctaca 1200ccgaactgag atacctacag cgtgagctat gagaaagcgc cacgcttccc gaagggagaa 1260aggcggacag gtatccggta agcggcaggg tcggaacagg agagcgcacg agggagcttc 1320cagggggaaa cgcctggtat ctttatagtc ctgtcgggtt tcgccacctc tgacttgagc 1380gtcgattttt gtgatgctcg tcaggggggc ggagcctatg gaaaaacgcc agcaacgcgg 1440cctttttacg gttcctggcc ttttgctggc cttttgctca catgttcttt cctgcgttat 1500cccctgattc tgtggataac cgtattaccg cctttgagtg agctgatacc gctcgccgca 1560gccgaacgac cgagcgcagc gagtcagtga gcgaggaagc ggaagagcgc ctgatgcggt 1620attttctcct tacgcatctg tgcggtattt cacaccgcat atggtgcact ctcagtacaa 1680tctgctctga tgccgcatag ttaagccagt atacactccg ctatcgctac gtgactgggt 1740catggctgcg ccccgacacc cgccaacacc cgctgacgcg ccctgacggg cttgtctgct 1800cccggcatcc gcttacagac aagctgtgac cgtctccggg agctgcatgt gtcagaggtt 1860ttcaccgtca tcaccgaaac gcgcgaggca gcagatcaat tcgcgcgcga aggcgaagcg 1920gcatgcataa tgtgcctgtc aaatggacga agcagggatt ctgcaaaccc tatgctactc 1980cgtcaagccg tcaattgtct gattcgttac caattatgac aacttgacgg ctacatcatt 2040cactttttct tcacaaccgg cacggaactc gctcgggctg gccccggtgc attttttaaa 2100tacccgcgag aaatagagtt gatcgtcaaa accaacattg cgaccgacgg tggcgatagg 2160catccgggtg gtgctcaaaa gcagcttcgc ctggctgata cgttggtcct cgcgccagct 2220taagacgcta atccctaact gctggcggaa aagatgtgac agacgcgacg gcgacaagca 2280aacatgctgt gcgacgctgg cgatacatta ccctgttatc cctagatgac attaccctgt 2340tatcccagat gacattaccc tgttatccct agatgacatt accctgttat ccctagatga 
2400catttaccct gttatcccta gatgacatta ccctgttatc ccagatgaca ttaccctgtt 2460atccctagat acattaccct gttatcccag atgacatacc ctgttatccc tagatgacat 2520taccctgtta tcccagatga cattaccctg ttatccctag atacattacc ctgttatccc 2580agatgacata ccctgttatc cctagatgac attaccctgt tatcccagat gacattaccc 2640tgttatccct agatacatta ccctgttatc ccagatgaca taccctgtta tccctagatg 2700acattaccct gttatcccag atgacattac cctgttatcc ctagatacat taccctgtta 2760tcccagatga cataccctgt tatccctaga tgacattacc ctgttatccc agatgacatt 2820accctgttat ccctagatac attaccctgt tatcccagat gacataccct gttatcccta 2880gatgacatta ccctgttatc ccagatgaca ttaccctgtt atccctagat acattaccct 2940gttatcccag atgacatacc ctgttatccc tagatgacat taccctgtta tcccagataa 3000actcaatgat gatgatgatg atggtcgaga ctcagcggcc gcggtgccag ggcgtgccct 3060tgggctcccc gggcgcgatg cccgccatga agatcgagtg ccgcatcacc ggcaccctga 3120acggcgtgga gttcgagctg gtgggcggcg gagagggcac ccccgagcag ggccgcatga 3180ccaacaagat gaagagcacc aaaggcgccc tgaccttcag cccctacctg ctgagccacg 3240tgatgggcta cggcttctac cacttcggca cctaccccag cggctacgag aaccccttcc 3300tgcacgccat caacaacggc ggctacacca acacccgcat cgagaagtac gaggacggcg 3360gcgtgctgca cgtgagcttc agctaccgct acgaggccgg ccgcgtgatc ggcgacttca 3420aggtggtggg caccggcttc cccgaggaca gcgtgatctt caccgacaag atcatccgca 3480gcaacgccac cgtggagcac ctgcacccca tgggcgataa cgtgctggtg ggcagcttcg 3540cccgcacctt cagcctgcgc gacggcggct actacagctt cgtggtggac agccacatgc 3600acttcaagag cgccatccac cccagcatcc tgcagaacgg gggccccatg ttcgccttcc 3660gccgcgtgga ggagctgcac agcaacaccg agctgggcat cgtggagtac cagcacgcct 3720tcaagacccc catcgccttc gccagatctc gagctcgagg tggtttgtct ggtcaaccac 3780cgcggtctca gtggtgtacg gtacaaaccc accccaactg gggtaacctt tgagttctct 3840cagttggggg taatcagcat catgatgtgg taccacatca tgatgctgat tataagaatg 3900cggccgccac actctagtgg atctcgagtt aataattcag aagaactcgt caagaaggcg 3960atagaaggcg atgcgctgcg aatcgggagc ggcgataccg taaagcacga ggaagcggtc 4020agcccattcg ccgccaagct cttcagcaat atcacgggta gccaacgcta tgtcctgata 4080gcggtccgcc acacccagcc ggccacagtc 
gatgaatcca gaaaagcggc cattttccac 4140catgatattc ggcaagcagg catcgccatg ggtcacgacg agatcctcgc cgtcgggcat 4200gctcgccttg agcctggcga acagttcggc tggcgcgagc ccctgatgct cttcgtccag 4260atcatcctga tcgacaagac cggcttccat ccgagtacgt gctcgctcga tgcgatgttt 4320cgcttggtgg tcgaatgggc aggtagccgg atcaagcgta tgcagccgcc gcattgcatc 4380agccatgatg gatactttct cggcaggagc aaggtgtaga tgacatggag atcctgcccc 4440ggcacttcgc ccaatagcag ccagtccctt cccgcttcag tgacaacgtc gagcacagct 4500gcgcaaggaa cgcccgtcgt ggccagccac gatagccgcg ctgcctcgtc ttgcagttca 4560ttcagggcac cggacaggtc ggtcttgaca aaaagaaccg ggcgcccctg cgctgacagc 4620cggaacacgg cggcatcaga gcagccgatt gtctgttgtg cccagtcata gccgaatagc 4680ctctccaccc aagcggccgg agaacctgcg tgcaatccat cttgttcaat catgcgaaac 4740gatcctcatc ctgtctcttg atcagagct 4769397797DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 397ccccaactgg ggtaaccttt gggctccccg ggcgcgatgg tgagcaaggg cgaggaggat 60aacatggcca tcatcaagga gttcatgcgc ttcaaggtgc acatggaggg ctccgtgaac 120ggccacgagt tcgagatcga gggcgagggc gagggccgcc cctacgaggg cacccagacc 180gccaagctga aggtgaccaa gggtggcccc ctgcccttcg cctgggacat cctgtcccct 240cagttcatgt acggctccaa ggcctacgtg aagcaccccg ccgacatccc cgactacttg 300aagctgtcct tccccgaggg cttcaagtgg gagcgcgtga tgaacttcga ggacggcggc 360gtggtgaccg tgacccagga ctcctccctg
caggacggcg agttcatcta caaggtgaag 420ctgcgcggca ccaacttccc ctccgacggc cccgtaatgc agaagaagac catgggctgg 480gaggcctcct ccgagcggat gtaccccgag gacggcgccc tgaagggcga gatcaagcag 540aggctgaagc tgaaggacgg cggccactac gacgctgagg tcaagaccac ctacaaggcc 600aagaagcccg tgcagctgcc cggcgcctac aacgtcaaca tcaagttgga catcacctcc 660cacaacgagg actacaccat cgtggaacag tacgaacgcg ccgagggccg ccactccacc 720ggcggcatgg acgagctgta caagggtggt ttgtctggtc aaccaccgcg agctcagtgg 780tgtacggtac aaaccca 797398815DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 398ccccaactgg ggtaaccttt gggctccccg ggcgcggccg ccaccatggt gtccaagggt 60gaggaacttt ttaccggagt ggtgccgata ctggtagagc tggatggcga cgtaaacggg 120cacaagttca gtgtacgggg agagggcgag ggcgacgcta cgaatgggaa attgactttg 180aaatttattt gcaccacggg caaattgccg gtcccgtggc caactttggt tacgaccttg 240acctatggcg ttcagtgttt ctcacggtac ccagaccaca tgaaacagca tgactttttt 300aagtcagcga tgccggaggg atatgtgcaa gaacggacta tctcatttaa agatgatggc 360acatataaga caagagcgga agtcaaattc gaaggggaca ccctcgtcaa tcgaatagaa 420ctcaagggaa tagacttcaa agaagatggt aatatactgg ggcacaaact cgaatacaat 480ttcaacagtc ataacgtcta catcactgcc gacaaacaaa aaaatgggat caaagcgaac 540ttcaaaatcc gacataatgt cgaggatggg agcgtccaac tggcagacca ttaccagcaa 600aatactccaa taggtgatgg tccagtgctt ttgccagata atcattatct tagctatcag 660agcaagttga gtaaggatcc gaatgaaaag cgagatcaca tggtcttgct ggagtttgtt 720acggcggctg gtatcacact tggtatggat gaattgtaca agggtggttt gtctggtcaa 780ccaccgcgga ctcagtggtg tacggtacaa accca 8153991660DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 399ccccaactgg ggtaaccttt gggctccccg ggcgcggcca ccatgaaatg ggttactttc 60atatctctgt tgtttttgtt ttcctctagt tccagggcca tgccgtcttc tgtctcgtgg 120ggcatcctcc tgctggcagg cctgtgctgc ctggtccctg tctccctggc tgaggatccc 180cagggagatg ctgcccagaa gacagataca tcccaccatg atcaggatca cccaaccttc 240aacaagatca cccccaacct ggctgagttc gccttcagcc tataccgcca gctggcacac 300cagtccaaca gcaccaatat cttcttctcc ccagtgagca tcgctacagc 
ctttgcaatg 360ctctccctgg ggaccaaggc tgacactcac gatgaaatcc tggagggcct gaatttcaac 420ctcacggaga ttccggaggc tcagatccat gaaggcttcc aggaactcct ccgtaccctc 480aaccagccag acagccagct ccagctgacc accggcaatg gcctgttcct cagcgagggc 540ctgaagctag tggataagtt tttggaggat gttaaaaagt tgtaccactc agaagccttc 600actgtcaact tcggggacac cgaagaggcc aagaaacaga tcaacgatta cgtggagaag 660ggtactcaag ggaaaattgt ggatttggtc aaggagcttg acagagacac agtttttgct 720ctggtgaatt acatcttctt taaaggcaaa tgggagagac cctttgaagt caaggacacc 780gaggaagagg acttccacgt ggaccaggtg accaccgtga aggtgcctat gatgaagcgt 840ttaggcatgt ttaacatcca gcactgtaag aagctgtcca gctgggtgct gctgatgaaa 900tacctgggca atgccaccgc catcttcttc ctgcctgatg aggggaaact acagcacctg 960gaaaatgaac tcacccacga tatcatcacc aagttcctgg aaaatgaaga cagaaggtct 1020gccagcttac atttacccaa actgtccatt actggaacct atgatctgaa gagcgtcctg 1080ggtcaactgg gcatcactaa ggtcttcagc aatggggctg acctctccgg ggtcacagag 1140gaggcacccc tgaagctctc caaggccgtg cataaggctg tgctgaccat cgacgagaaa 1200gggactgaag ctgctggggc catgttttta gaggccatac ccatgtctat cccccccgag 1260gtcaagttca acaaaccctt tgtcttctta atgattgaac aaaataccaa gtctcccctc 1320ttcatgggaa aagtggtgaa tcccacccaa aaataagaat tctaactaga gctcgctgat 1380cagcctcgac tgtgccttct agttgccagc catctgttgt ttgcccctcc cccgtgcctt 1440ccttgaccct ggaaggtgcc actcccactg tcctttccta ataaaatgag gaaattgcat 1500cgcattgtct gagtaggtgt cattctattc tggggggtgg ggtggggcag gacagcaagg 1560gggaggattg ggaagagaat agcaggcatg ctggggagcg agctcgaggt ggtttgtctg 1620gtcaaccacc gcggtctcag tggtgtacgg tacaaaccca 16604004906DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 400ccccaactgg ggtaaccttt gggctccccg ggcgcggcca ccatgaaatg ggttactttc 60atatctctgt tgtttttgtt ttcctctagt tccagggcca tgacgaggat tttgacagct 120ttcaaagtgg tgaggacact gaagactggt tttggcttta ccaatgtgac tgcacaccaa 180aaatggaaat tttcaagacc tggcatcagg ctcctttctg tcaaggcaca gacagcacac 240attgtcctgg aagatggaac taagatgaaa ggttactcct ttggccatcc atcctctgtt 300gctggtgaag tggtttttaa tactggcctg 
ggagggtacc cagaagctat tactgaccct 360gcctacaaag gacagattct cacaatggcc aaccctatta ttgggaatgg tggagctcct 420gatactactg ctctggatga actgggactt agcaaatatt tggagtctaa tggaatcaag 480gtttcaggtt tgctggtgct ggattatagt aaagactaca accactggct ggctaccaag 540agtttagggc aatggctaca ggaagaaaag gttcctgcaa tttatggagt ggacacaaga 600atgctgacta aaataattcg ggataagggt accatgcttg ggaagattga atttgaaggt 660cagcctgtgg attttgtgga tccaaataaa cagaatttga ttgctgaggt ttcaaccaag 720gatgtcaaag tgtacggcaa aggaaacccc acaaaagtgg tagctgtaga ctgtgggatt 780aaaaacaatg taatccgcct gctagtaaag cgaggagctg aagtgcactt agttccctgg 840aaccatgatt tcaccaagat ggagtatgat gggattttga tcgcgggagg accggggaac 900ccagctcttg cagaaccact aattcagaat gtcagaaaga ttttggagag tgatcgcaag 960gagccattgt ttggaatcag tacaggaaac ttaataacag gattggctgc tggtgccaaa 1020acctacaaga tgtccatggc caacagaggg cagaatcagc ctgttttgaa tatcacaaac 1080aaacaggctt tcattactgc tcagaatcat ggctatgcct tggacaacac cctccctgct 1140ggctggaaac cactttttgt gaatgtcaac gatcaaacaa atgaggggat tatgcatgag 1200agcaaaccct tcttcgctgt gcagttccac ccagaggtca ccccggggcc aatagacact 1260gagtacctgt ttgattcctt tttctcactg ataaagaaag gaaaagctac caccattaca 1320tcagtcttac cgaagccagc actagttgca tctcgggttg aggtttccaa agtccttatt 1380ctaggatcag gaggtctgtc cattggtcag gctggagaat ttgattactc aggatctcaa 1440gctgtaaaag ccatgaagga agaaaatgtc aaaactgttc tgatgaaccc aaacattgca 1500tcagtccaga ccaatgaggt gggcttaaag caagcggata ctgtctactt tcttcccatc 1560acccctcagt ttgtcacaga ggtcatcaag gcagaacagc cagatgggtt aattctgggc 1620atgggtggcc agacagctct gaactgtgga gtggaactat tcaagagagg tgtgctcaag 1680gaatatggtg tgaaagtcct gggaacttca gttgagtcca ttatggctac ggaagacagg 1740cagctgtttt cagataaact aaatgagatc aatgaaaaga ttgctccaag ttttgcagtg 1800gaatcgattg aggatgcact gaaggcagca gacaccattg gctacccagt gatgatccgt 1860tccgcctatg cactgggtgg gttaggctca ggcatctgtc ccaacagaga gactttgatg 1920gacctcagca caaaggcctt tgctatgacc aaccaaattc tggtggagaa gtcagtgaca 1980ggttggaaag aaatagaata tgaagtggtt cgagatgctg atgacaattg tgtcactgtc 2040tgtaacatgg 
aaaatgttga tgccatgggt gttcacacag gtgactcagt tgttgtggct 2100cctgcccaga cactctccaa tgccgagttt cagatgttga gacgtacttc aatcaatgtt 2160gttcgccact tgggcattgt gggtgaatgc aacattcagt ttgcccttca tcctacctca 2220atggaatact gcatcattga agtgaatgcc agactgtccc gaagctctgc tctggcctca 2280aaagccactg gctacccatt ggcattcatt gctgcaaaga ttgccctagg aatcccactt 2340ccagaaatta agaacgtcgt atccgggaag acatcagcct gttttgaacc tagcctggat 2400tacatggtca ccaagattcc ccgctgggat cttgaccgtt ttcatggaac atctagccga 2460attggtagct ctatgaaaag tgtaggagag gtcatggcta ttggtcgtac ctttgaggag 2520agtttccaga aagctttacg gatgtgccac ccatctatag aaggtttcac tccccgtctc 2580ccaatgaaca aagaatggcc atctaattta gatcttagaa aagagttgtc tgaaccaagc 2640agcacgcgta tctatgccat tgccaaggcc attgatgaca acatgtccct tgatgagatt 2700gagaagctca catacattga caagtggttt ttgtataaga tgcgtgatat tttaaacatg 2760gaaaagacac tgaaaggcct caacagtgag tccatgacag aagaaaccct gaaaagggca 2820aaggagattg ggttctcaga taagcagatt tcaaaatgcc ttgggctcac tgaggcccag 2880acaagggagc tgaggttaaa gaaaaacatc cacccttggg ttaaacagat tgatacactg 2940gctgcagaat acccatcagt aacaaactat ctctatgtta cctacaatgg tcaggagcat 3000gatgtcaatt ttgatgacca tggaatgatg gtgctaggct gtggtccata tcacattggc 3060agcagtgtgg aatttgattg gtgtgctgtc tctagtatcc gcacactgcg tcaacttggc 3120aagaagacgg tggtggtgaa ttgcaatcct gagactgtga gcacagactt tgatgagtgt 3180gacaaactgt actttgaaga gttgtccttg gagagaatcc tagacatcta ccatcaggag 3240gcatgtggtg gctgcatcat atcagttgga ggccagattc caaacaacct ggcagttcct 3300ctatacaaga atggtgtcaa gatcatgggc acaagccccc tgcagatcga cagggctgag 3360gatcgctcca tcttctcagc tgtcttggat gagctgaagg tggctcaggc accttggaaa 3420gctgttaata ctttgaatga agcactggaa tttgcaaagt ctgtggacta cccctgcttg 3480ttgaggcctt cctatgtttt gagtgggtct gctatgaatg tggtattctc tgaggatgag 3540atgaaaaaat tcctagaaga ggcgactaga gtttctcagg agcacccagt ggtgctgaca 3600aaatttgttg aaggggcccg agaagtagaa atggacgctg ttggcaaaga tggaagggtt 3660atctctcatg ccatctctga acatgttgaa gatgcaggtg tccactcggg agatgccact 3720ctgatgctgc ccacacaaac catcagccaa ggggccattg 
aaaaggtgaa ggatgctacc 3780cggaagattg caaaggcttt tgccatctct ggtccattca acgtccaatt tcttgtcaaa 3840ggaaatgatg tcttggtgat tgagtgtaac ttgagagctt ctcgatcctt cccctttgtt 3900tccaagactc ttggggttga cttcattgat gtggccacca aggtgatgat tggagagaat 3960gttgatgaga aacatcttcc aacattggac catcccataa ttcctgctga ctatgttgca 4020attaaggctc ccatgttttc ctggccccgg ttgagggatg ctgaccccat tctgagatgt 4080gagatggctt ccactggaga ggtggcttgc tttggtgaag gtattcatac agccttccta 4140aaggcaatgc tttccacagg atttaagata ccccagaaag gcatcctgat aggcatccag 4200caatcattcc ggccaagatt ccttggtgtg gctgaacaat tacacaatga aggtttcaag 4260ctgtttgcca cggaagccac atcagactgg ctcaacgcca acaatgtccc tgccacccca 4320gtggcatggc cgtctcaaga aggacagaat cccagcctct cttccatcag aaaattgatt 4380agagatggca gcattgacct agtgattaac cttcccaaca acaacactaa atttgtccat 4440gataattatg tgattcggag gacagctgtt gatagtggaa tccctctcct cactaatttt 4500caggtgacca aactttttgc tgaagctgtg cagaaatctc gcaaggtgga ctccaagagt 4560cttttccact acaggcagta cagtgctgga aaagcagcat aggaattcta actagagctc 4620gctgatcagc ctcgactgtg ccttctagtt gccagccatc tgttgtttgc ccctcccccg 4680tgccttcctt gaccctggaa ggtgccactc ccactgtcct ttcctaataa aatgaggaaa 4740ttgcatcgca ttgtctgagt aggtgtcatt ctattctggg gggtggggtg gggcaggaca 4800gcaaggggga ggattgggaa gagaatagca ggcatgctgg ggagcgagct cgaggtggtt 4860tgtctggtca accaccgcgg tctcagtggt gtacggtaca aaccca 49064014882DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 401ccccaactgg ggtaaccttt gggctccccg ggcgcgacta taagctgcga gcaacttcac 60ttgggtatgc cggcggtagc gcttaccgtt cgtataatgt atgctatacg aagttatccg 120aagccgctag cggtggtttg tctggtcaac caccgcggtc tcagtggtgt acggtacaaa 180cccacccgag agaccatgca gaggtcgcct ctggaaaagg ccagcgttgt ctccaaactt 240ttctttagct ggactagacc catccttcgt aaaggataca gacagcgcct ggaattgtca 300gacatatacc aaatcccttc tgttgattct gctgacaatc tatctgaaaa attggaaaga 360gaatgggata gagagctggc ttcaaagaaa aatcctaaac tcattaatgc ccttcggcga 420tgttttttct ggagatttat gttctatgga atctttttat atttagggga agtcaccaaa 480gcagtacagc 
ctctcttact gggaagaatc atagcttcct atgacccgga taacaaggag 540gaacgctcta tcgcgattta tctaggcata ggcttatgcc ttctctttat tgtgaggaca 600ctgctcctac acccagccat ttttggcctt catcacattg gaatgcagat gagaatagct 660atgtttagtt tgatttataa gaagacttta aagctgtcaa gccgtgttct agataaaata 720agtattggac aacttgttag tctcctttcc aacaacctga acaaatttga tgaaggactt 780gcattggcac atttcgtgtg gatcgctcct ttgcaagtgg cactcctcat ggggctaatc 840tgggagttgt tacaggcgtc tgccttctgt ggacttggtt tcctgatagt ccttgccctt 900tttcaggctg ggctagggag aatgatgatg aagtacagag atcagagagc tgggaagatc 960agtgaaagac ttgtgattac ctcagaaatg attgaaaata tccaatctgt taaggcatac 1020tgctgggaag aagcaatgga aaaaatgatt gaaaacttaa gacaaacaga actgaaactg 1080actcggaagg cagcctatgt gagatacttc aatagctcag ccttcttctt ctcagggttc 1140tttgtggtgt ttttatctgt gcttccctat gcactaatca aaggaatcat cctccggaaa 1200atattcacca ccatctcatt ctgcattgtt ctgcgcatgg cggtcactcg gcaatttccc 1260tgggctgtac aaacatggta tgactctctt ggagcaataa acaaaataca ggatttctta 1320caaaagcaag aatataagac attggaatat aacttaacga ctacagaagt agtgatggag 1380aatgtaacag ccttctggga ggagggattt ggggaattat ttgagaaagc aaaacaaaac 1440aataacaata gaaaaacttc taatggtgat gacagcctct tcttcagtaa tttctcactt 1500cttggtactc ctgtcctgaa agatattaat ttcaagatag aaagaggaca gttgttggcg 1560gttgctggat ccactggagc aggcaagact tcacttctaa tggtgattat gggagaactg 1620gagccttcag agggtaaaat taagcacagt ggaagaattt cattctgttc tcagttttcc 1680tggattatgc ctggcaccat taaagaaaat atcatctttg gtgtttccta tgatgaatat 1740agatacagaa gcgtcatcaa agcatgccaa ctagaagagg acatctccaa gtttgcagag 1800aaagacaata tagttcttgg agaaggtgga atcacactga gtggaggtca acgagcaaga 1860atttctttag caagagcagt atacaaagat gctgatttgt atttattaga ctctcctttt 1920ggatacctag acgtattgac tgagaaggag atcttcgagt cctgcgtttg caagcttatg 1980gccaataaga caagaatcct ggttacaagt aagatggagc acctgaagaa ggccgataag 2040attctgatcc tgcacgaggg atcttcatac ttctacggca ctttcagcga gcttcagaac 2100ttgcaacctg atttctctag caagcttatg ggctgcgact cctttgatca gttctctgcc 2160gagcgtcgca actccattct gaccgaaaca ctgcataggt tttccctcga 
gggcgacgca 2220ccagtgtctt ggactgagac taagaagcag agcttcaagc aaaccggcga attcggtgag 2280aagagaaaga acagtatcct gaaccccatt aattcaattc ggaagttcag tatcgttcag 2340aaaacgcctc ttcagatgaa cgggattgag gaagactcag acgaaccgct tgaaaggcga 2400ctctcattgg ttcctgacag tgaacaaggg gaagctattc tcccccggat ttcagtaatt 2460tccacaggtc cgactctgca agcccggaga agacaatccg tgttgaatct tatgacccat 2520tccgtgaatc aggggcaaaa tatccataga aagactactg cctctacgag gaaggtatcc 2580cttgcacccc aagccaatct gacggagctc gacatctact ctcgccgcct gtcccaggag 2640acaggactgg agattagcga ggagatcaat gaagaggatc tgaaagaatg tttcttcgac 2700gacatggaat ccatccctgc cgtcacgacg tggaatacct atttgcgtta catcacggta 2760cataaaagtc tgatattcgt cctgatctgg tgtcttgtga tcttcctcgc tgaagtcgca 2820gccagcctgg tcgttctttg gctgctcggg aataccccct tgcaggataa gggaaactcc 2880acccactctc ggaacaatag ttacgccgtc atcattactt ccacttcctc atactacgta 2940ttctatatat atgtcggggt cgctgataca ctgctggcca tgggcttctt tcgcggcctg 3000ccgctcgtcc acacgctgat aactgtctcc aagatcttgc atcataagat gctgcactca 3060gtgctgcagg ctccaatgag tacactgaat actcttaagg ctggcggcat cctgaaccgc 3120tttagtaagg acatcgccat acttgacgat ctcttgcccc tgacaatctt cgattttatt 3180caactccttt tgatcgttat cggggcgatc gctgtggttg ctgtgttgca gccatatata 3240ttcgtagcta ctgttcccgt catcgtcgcg ttcatcatgc tccgtgccta ctttctgcag 3300acgtcccaac agctgaagca gctcgagagc gagggacggt cccccatatt tacgcacttg 3360gtaactagtc tgaaggggct gtggactctg agagcatttg gtcgacaacc atatttcgag 3420accctctttc ataaggccct caacctgcac accgcgaatt ggtttctgta tttgagtacg 3480ttgcggtggt ttcagatgcg catcgagatg atattcgtga tattctttat cgcagtcaca 3540tttatcagca tcctgactac gggcgaggga gagggtcgcg tgggcatcat actcacgctc 3600gctatgaaca ttatgagcac cctgcaatgg gccgtgaata gctctatcga cgttgacagt 3660cttatgcgat ctgtgagccg agtctttaag ttcattgaca tgccaacaga aggtaaacct 3720accaagtcaa ccaaaccata caagaatggc caactctcga aagttatgat tattgagaat 3780tcacacgtga agaaagatga catctggccc tcagggggcc aaatgactgt caaagatctc 3840acagcaaaat acacagaagg tggaaatgcc atattagaga acatttcctt ctcaataagt 3900cctggccaga gggtgggcct 
cttgggaaga actggatcag ggaagagtac tttgttatca 3960gcttttttga gactactgaa cactgaagga gaaatccaga tcgatggtgt gtcttgggat 4020tcaataactt tgcaacagtg gaggaaagcc tttggagtga taccacagaa agtatttatt 4080ttttctggaa catttagaaa aaacttggat ccctatgaac agtggagtga tcaagaaata 4140tggaaagttg cagatgaggt tgggctcaga tctgtgatag aacagtttcc tgggaagctt 4200gactttgtcc ttgtggatgg gggctgtgtc ctaagccatg gccacaagca gttgatgtgc 4260ttggctagat ctgttctcag taaggcgaag atcttgctgc ttgatgaacc cagtgctcat 4320ttggatccag taacatacca aataattaga agaactctaa aacaagcatt tgctgattgc 4380acagtaattc tctgtgaaca caggatagaa gcaatgctgg aatgccaaca atttttggtc 4440atagaagaga acaaagtgcg gcagtacgat tccatccaga aactgctgaa cgagaggagc 4500ctcttccggc aagccatcag cccctccgac agggtgaagc tctttcccca ccggaactca 4560agcaagtgca agtctaagcc ccagattgct gctctgaaag aggagacaga agaagaggtg 4620caagatacaa ggctttagac ccgctgatca gcctcgactg tgccttctag ttgccagcca 4680tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac tcccactgtc 4740ctttcctaat aaaatgagaa aattgcatcg cattgtctga gtaggtgtca ttctattctg 4800gggggtgggg tggggcagga cagcaagggg gaggattggg aagacaatag caggcatgct 4860ggggatgcgg tgggctctat gg 48824021594DNAArtificial SequenceDescription of Artificial Sequence Synthetic polynucleotide 402ccccaactgg ggtaaccttt gggctccccg ggcgcggttc cggatccgga gagggcaggg 60gatctctcct tacttgtggc gacgtggagg agaaccccgg ccccatgagc atcggcctcc 120tgtgctgtgc agccttgtct ctcctgtggg caggtccagt gaatgctggt gtcactcaga 180ccccaaaatt ccaggtcctg aagacaggac agagcatgac actgcagtgt gcccaggata 240tgaaccatga atacatgtcc tggtatcgac aagacccagg catggggctg aggctgattc 300attactcagt tggtgctggt atcactgacc aaggagaagt ccccaatggc tacaatgtct 360ccagatcaac cacagaggat ttcccgctca ggctgctgtc ggctgctccc tcccagacat 420ctgtgtactt ctgtgccagc agttacgtcg ggaacaccgg ggagctgttt tttggagaag 480gctctaggct gaccgtactg gaggacctga aaaacgtgtt cccacccgag gtcgctgtgt 540ttgagccatc agaagcagag atctcccaca cccaaaaggc cacactggta tgcctggcca 600caggcttcta ccccgaccac gtggagctga gctggtgggt gaatgggaag gaggtgcaca 660gtggggtcag cacagacccg 
cagcccctca aggagcagcc cgccctcaat gactccagat 720actgcctgag cagccgcctg agggtctcgg ccaccttctg gcagaacccc cgcaaccact 780tccgctgtca agtccagttc tacgggctct cggagaatga cgagtggacc caggataggg 840ccaaacccgt cacccagatc gtcagcgccg aggcctgggg tagagcagac tgtggcttca 900cctccgagtc ttaccagcaa ggggtcctgt ctgccaccat cctctatgag atcttgctag 960ggaaggccac cttgtatgcc gtgctggtca gtgccctcgt gctgatggct atggtcaaga 1020gaaaggattc cagaggccgg gccaagcggt ccggatccgg agccaccaac ttcagcctgc 1080tgaagcaggc cggcgacgtg gaggagaacc ccggccccat ggagaccctc ttgggcctgc 1140ttatcctttg gctgcagctg caatgggtga gcagcaaaca ggaggtgacg cagattcctg 1200cagctctgag tgtcccagaa ggagaaaact tggttctcaa ctgcagtttc actgatagcg 1260ctatttacaa cctccagtgg tttaggcagg accctgggaa aggtctcaca tctctgttgc 1320ttattcagtc aagtcagaga gagcaaacaa gtggaagact taatgcctcg ctggataaat 1380catcaggacg tagtacttta tacattgcag cttctcagcc tggtgactca gccacctacc 1440tctgtgctgt gaggcccctg tacggaggaa gctacatacc tacatttgga agaggaacca 1500gccttattgt tcatccgtat atccagaacc ctgaccctgc gggtggtttg tctggtcaac 1560caccgcggtc tcagtggtgt acggtacaaa ccca 159440319DNAArtificial SequenceDescription of Artificial Sequence Synthetic oligonucleotide 403ttgagcgggc ccccaccgt 19404393DNAArtificial SequenceDescription
of Artificial Sequence Synthetic polynucleotide
404 atgactcact atcaggcctt gcttttggac acggaccggg tccagttcgg accggtggta 60
gccctgaacc cggctacgct gctcccactg cctgaggaag ggctgcaaca caactgcctt 120
gatgggacag gtggcggtgg tgtcaccgtc aagttcaagt acaagggtga ggaacttgaa 180
gttgatatta gcaaaatcaa gaaggtttgg cgcgttggta aaatgatatc ttttacttat 240
gacgacaacg gcaagacagg tagaggggca gtgtctgaga aagacgcccc caaggagctg 300
ttgcaaatgt tggaaaagtc tgggaaaaag tctggcggct caaaaagaac cgccgacggc 360
agcgaattcg agcccaagaa gaagaggaaa gtc 393

405 11 DNA Artificial Sequence Description of Artificial Sequence Synthetic probe
405 cgacgacggc g 11

406 16 DNA Artificial Sequence Description of Artificial Sequence Synthetic probe
406 tttatttgtg ggcccg 16

407 15 DNA Artificial Sequence Description of Artificial Sequence Synthetic probe
407 tcgagtgccg catca 15

408 17 DNA Artificial Sequence Description of Artificial Sequence Synthetic probe
408 aaagtggtga ggacact 17

409 15 DNA Artificial Sequence Description of Artificial Sequence Synthetic probe
409 aacccacccg agaga 15

410 66 DNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
410 ggaagcggag ctactaactt cagcctgctg aagcaggctg gcgacgtgga ggagaaccct 60
ggacct 66

411 45 DNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
411 gggggaggag gttctggagg cggaggctcc ggaggcggag ggtca 45

412 15 DNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
412 ggaggtggcg ggagc 15

413 15 DNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
413 cccgcaccag cgcct 15

414 45 DNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
414 gaggcagctg ccaaggaagc cgctgccaag gaggcggccg caaag 45

415 48 DNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
415 agtgggagcg agacccctgg gactagcgag tcagctacac ccgaaagc 48

416 54 DNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
416 ggggggtcag gtggatccgg cggaagtggc ggatccggtg gatctggcgg cagt 54

417 15 DNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
417 gaagctgctg ctaag 15

418 22 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
418 Gly Ser Gly Ala Thr Asn Phe Ser Leu Leu Lys Gln Ala Gly Asp Val
1 5 10 15
Glu Glu Asn Pro Gly Pro
20

419 15 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
419 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser
1 5 10 15

420 5 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
420 Gly Gly Gly Gly Ser
1 5

421 5 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
421 Pro Ala Pro Ala Pro
1 5

422 15 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
422 Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys
1 5 10 15

423 16 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
423 Ser Gly Ser Glu Thr Pro Gly Thr Ser Glu Ser Ala Thr Pro Glu Ser
1 5 10 15

424 18 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
424 Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly
1 5 10 15
Gly Ser

425 5 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
425 Glu Ala Ala Ala Lys
1 5

426 16 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
426 Gly Leu Ser Gly Gln Pro Pro Arg Ser Pro Ser Ser Gly Ser Ser Gly
1 5 10 15

427 17 PRT Artificial Sequence Description of Artificial Sequence Synthetic peptide
427 Gly Gly Leu Ser Gly Gln Pro Pro Arg Ser Pro Ser Ser Gly Ser Ser
1 5 10 15
Gly

428 88 RNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
428 gacgagcgcg gcgauaucau cauccauggc cggaugaucc ugacgacgga gaccgccguc 60
gucgacaagc cggccugagc ugcgagaa 88

429 20 RNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
429 gaagccggcc uugcacaugc 20

430 95 DNA Homo sapiens
430 gcgcgcccgg ctattctcgc agctcaccat ggatgatgat atcgccgcgc tcgtcgtcga 60
caacggctcc ggcatgtgca aggccggctt cgcgg 95

431 20 RNA Artificial Sequence Description of Artificial Sequence Synthetic oligonucleotide
431 accacucgac gcucuuaucg 20

* * * * *

