Composition and Methods for Treatment of Primary Ciliary Dyskinesia

Dias; Anusha ;   et al.

Patent Application Summary

U.S. patent application number 17/420346 was filed with the patent office on 2022-03-24 for composition and methods for treatment of primary ciliary dyskinesia. The applicant listed for this patent is Translate Bio, Inc. Invention is credited to Christian Cobaugh, Frank DeRosa, Anusha Dias, Jeffrey S. Dubins, Sara J. Dunaj, Michael Heartlein, Shrirang Karve, Darshan Parekh, Zarna Patel.

Application Number: 17/420346
Publication Number: 20220087935
Filed Date: 2022-03-24

United States Patent Application 20220087935
Kind Code A1
Dias; Anusha ;   et al. March 24, 2022

Abstract

The present invention provides, among other things, methods and compositions for treating primary ciliary dyskinesia (PCD) based on mRNA therapy. The compositions used in treatment of PCD comprise an mRNA comprising a dynein axonemal heavy chain 5 (DNAH5) coding sequence and are administered at an effective dose and an administration interval such that at least one symptom or feature of PCD is reduced in intensity, severity, or frequency or has a delayed onset. mRNAs with optimized DNAH5 coding sequences are provided that can be administered without the need for modifying the nucleotides of the mRNA to achieve sustained in vivo function.


Inventors: Dias; Anusha; (Lexington, MA) ; Parekh; Darshan; (Lexington, MA) ; Dubins; Jeffrey S.; (Lexington, MA) ; Cobaugh; Christian; (Lexington, MA) ; Karve; Shrirang; (Lexington, MA) ; Patel; Zarna; (Lexington, MA) ; Dunaj; Sara J.; (Lexington, MA) ; DeRosa; Frank; (Lexington, MA) ; Heartlein; Michael; (Lexington, MA)
Applicant: Translate Bio, Inc. (Lexington, MA, US)
Appl. No.: 17/420346
Filed: January 7, 2020
PCT Filed: January 7, 2020
PCT NO: PCT/US2020/012529
371 Date: July 1, 2021

Related U.S. Patent Documents

Application Number Filing Date Patent Number
62789414 Jan 7, 2019

International Class: A61K 9/127 20060101 A61K009/127; A61K 31/711 20060101 A61K031/711; A61P 11/00 20060101 A61P011/00

Claims



1. A method of delivery of human axonemal dynein heavy chain 5 (DNAH5) in vivo comprising administering to a subject in need of delivery an mRNA encoding a human DNAH5 protein.

2. A method of treating primary ciliary dyskinesia (PCD) comprising administering to a subject in need of treatment an mRNA encoding human axonemal dynein heavy chain 5 (DNAH5) at an effective dose and an administration interval such that at least one symptom or feature of PCD is reduced in intensity, severity, or frequency or has a delayed onset.

3. The method of claim 1 or claim 2, wherein the DNAH5 mRNA is encapsulated in a liposome.

4. The method of claim 3, wherein the liposome comprises one or more cationic lipids, one or more non-cationic lipids and one or more PEG-modified lipids.

5. The method of claim 4, wherein the one or more cationic lipids are selected from the group consisting of cKK-E12, OF-02, C12-200, MC3, DLinDMA, DLinkC2DMA, ICE (Imidazol-based), HGT5000, HGT5001, HGT4003, DODAC, DDAB, DMRIE, DOSPA, DOGS, DODAP, DODMA and DMDMA, DODAC, DLenDMA, DMRIE, CLinDMA, CpLinDMA, DMOBA, DOcarbDAP, DLinDAP, DLincarbDAP, DLinCDAP, DLinSSDMA, KLin-K-DMA, DLin-K-XTC2-DMA, 3-(4-(bis(2-hydroxydodecyl)amino)butyl)-6-(4-((2-hydroxydodecyl)(2-hydroxyundecyl)amino)butyl)-1,4-dioxane-2,5-dione (Target 23), 3-(5-(bis(2-hydroxydodecyl)amino)pentan-2-yl)-6-(5-((2-hydroxydodecyl)(2-hydroxyundecyl)amino)pentan-2-yl)-1,4-dioxane-2,5-dione (Target 24), ccBene, ML7 and combinations thereof.

6. The method of claim 5, wherein the cationic lipid is ICE.

7. The method of any one of the preceding claims, wherein the one or more non-cationic lipids are selected from DSPC (1,2-distearoyl-sn-glycero-3-phosphocholine), DPPC (1,2-dipalmitoyl-sn-glycero-3-phosphocholine), DOPE (1,2-dioleoyl-sn-glycero-3-phosphoethanolamine), DOPC (1,2-dioleoyl-sn-glycero-3-phosphatidylcholine), DPPE (1,2-dipalmitoyl-sn-glycero-3-phosphoethanolamine), DMPE (1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine), DOPG (1,2-dioleoyl-sn-glycero-3-phospho-(1'-rac-glycerol)) or combinations thereof.

8. The method of any one of claims 4-7, wherein the one or more PEG-modified lipids comprise a poly(ethylene) glycol chain of up to 5 kDa in length covalently attached to a lipid with alkyl chain(s) of C6-C20 length.

9. The method of any one of the preceding claims, wherein the cationic lipid constitutes about 30-60% of the liposome by molar ratio.

10. The method of claim 9, wherein the cationic lipid constitutes about 30%, 40%, 50%, or 60% of the liposome by molar ratio.

11. The method of any one of the preceding claims, wherein the liposome comprises ICE, DOPE and DMG-PEG2K.

12. The method of any one of claims 3-11, wherein the liposome has a diameter of about 80 nm to 200 nm, optionally wherein the liposome has a diameter of about 100 nm or less than 100 nm.

13. The method of any one of the preceding claims, wherein the DNAH5 mRNA is codon optimized.

14. The method of any one of the preceding claims, wherein the DNAH5 mRNA comprises one or more modified nucleotides.

15. The method of claim 14, wherein the one or more modified nucleotides are selected from pseudouridine, N-1-methyl-pseudouridine, 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and/or 2-thiocytidine.

16. The method of any one of the preceding claims, wherein the mRNA is unmodified.

17. The method of any one of the preceding claims, wherein the mRNA comprises a 5'-untranslated region (5'-UTR) that has a sequence set forth in SEQ ID NO: 2 or 3.

18. The method of any one of the preceding claims, wherein the mRNA comprises a 3'-untranslated region (3'-UTR) that has a sequence set forth in SEQ ID NO: 4 or 5.

19. The method of any one of the preceding claims, wherein the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to any one of SEQ ID NO: 6 to SEQ ID NO: 31.

20. The method of any one of the preceding claims, wherein the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 6 to SEQ ID NO: 31.

21. The method of any one of the preceding claims, wherein the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 6 to SEQ ID NO: 31.

22. The method of any one of the preceding claims, wherein the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 6 to SEQ ID NO: 31.

23. The method of any one of the preceding claims, wherein the mRNA comprises a coding sequence at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6 to SEQ ID NO: 31.

24. The method of any one of the preceding claims, wherein the mRNA comprises a coding sequence set forth in SEQ ID NO: 6 to SEQ ID NO: 31.

25. The method of any one of the preceding claims, wherein administering the mRNA to the subject is performed by intratracheal, intranasal, intravenous, intramuscular or subcutaneous delivery.

26. The method of any one of the preceding claims, wherein administering the mRNA to the subject is performed by intratracheal delivery.

27. The method of any one of the preceding claims, wherein administering the mRNA to the subject is performed by intranasal delivery.

28. The method of any one of the preceding claims, wherein the composition is administered once daily.

29. The method of any one of the preceding claims, wherein the composition is administered once a week.

30. The method of any one of the preceding claims, wherein the composition is administered once every two weeks.

31. The method of any one of the preceding claims, wherein the composition is administered twice a month.

32. The method of any one of the preceding claims, wherein the composition is administered once a month.

33. The method of any one of the preceding claims, wherein the administering the mRNA results in DNAH5 protein expression detectable in one or more internal organs selected from lung, heart, liver, spleen, kidney, brain, stomach, intestines, ovary and testis.

34. The method of any one of the preceding claims, wherein the administering the mRNA results in DNAH5 protein expression detectable in the lung.

35. The method of any one of the preceding claims, wherein the administering the mRNA results in DNAH5 protein expression detectable in the lung epithelium.

36. A composition for use in the treatment of primary ciliary dyskinesia (PCD), the composition comprising an mRNA encoding human axonemal dynein heavy chain 5 (DNAH5) encapsulated in a liposome, wherein the liposome comprises one or more cationic lipids, one or more non-cationic lipids and one or more PEG-modified lipids.

37. The composition of claim 36, wherein the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to any one of SEQ ID NO: 6 to SEQ ID NO: 31.

38. The composition of claim 36 or 37, wherein the mRNA comprises a coding sequence at least 70%, at least 80%, at least 90%, at least 95% or at least 98% identical to SEQ ID NO: 6 to SEQ ID NO: 31.

39. The composition of any one of claims 36-38, wherein the mRNA comprises a coding sequence set forth in SEQ ID NO: 6 to SEQ ID NO: 31.

40. The composition of any one of claims 36-39, wherein the mRNA has a 5'-untranslated region (5'-UTR) that has a sequence set forth in SEQ ID NO: 2, and a 3'-untranslated region (3'-UTR) that has a sequence set forth in SEQ ID NO: 4 or SEQ ID NO: 5.

41. The composition of any one of claims 36-39, wherein the mRNA has one or more modified nucleotides.

42. The composition of claim 41, wherein the one or more modified nucleotides are selected from pseudouridine, N-1-methyl-pseudouridine, 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and/or 2-thiocytidine.

43. The composition of any one of claims 36-42, wherein the mRNA is unmodified.

44. The composition of any one of claims 36-43, wherein the liposome is 100 nm in diameter or less.

45. The composition of any one of claims 36-44, wherein the one or more cationic lipids are selected from the group consisting of cKK-E12, OF-02, C12-200, MC3, DLinDMA, DLinkC2DMA, ICE (Imidazol-based), HGT5000, HGT5001, HGT4003, DODAC, DDAB, DMRIE, DOSPA, DOGS, DODAP, DODMA and DMDMA, DODAC, DLenDMA, DMRIE, CLinDMA, CpLinDMA, DMOBA, DOcarbDAP, DLinDAP, DLincarbDAP, DLinCDAP, DLinSSDMA, KLin-K-DMA, DLin-K-XTC2-DMA, 3-(4-(bis(2-hydroxydodecyl)amino)butyl)-6-(4-((2-hydroxydodecyl)(2-hydroxyundecyl)amino)butyl)-1,4-dioxane-2,5-dione (Target 23), 3-(5-(bis(2-hydroxydodecyl)amino)pentan-2-yl)-6-(5-((2-hydroxydodecyl)(2-hydroxyundecyl)amino)pentan-2-yl)-1,4-dioxane-2,5-dione (Target 24), ccBene, ML7 and combinations thereof.

46. The composition of any one of claims 36-45, wherein the cationic lipid is ICE.

47. The composition of any one of claims 36-46, wherein the one or more non-cationic lipids are selected from DSPC (1,2-distearoyl-sn-glycero-3-phosphocholine), DPPC (1,2-dipalmitoyl-sn-glycero-3-phosphocholine), DOPE (1,2-dioleoyl-sn-glycero-3-phosphoethanolamine), DOPC (1,2-dioleoyl-sn-glycero-3-phosphatidylcholine), DPPE (1,2-dipalmitoyl-sn-glycero-3-phosphoethanolamine), DMPE (1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine), DOPG (1,2-dioleoyl-sn-glycero-3-phospho-(1'-rac-glycerol)) or combinations thereof.

48. The composition of any one of claims 36-47, wherein the non-cationic lipid is DOPE.

49. The composition of any one of claims 36-48, wherein the one or more PEG-modified lipids comprise a poly(ethylene) glycol chain of up to 5 kDa in length covalently attached to a lipid with alkyl chain(s) of C6-C20 length.

50. A pharmaceutical composition comprising the composition of any one of claims 36-49 and a suitable excipient.
Description



CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application claims benefit of, and priority to, U.S. Provisional Patent Application Ser. No. 62/789,414 filed on Jan. 7, 2019, the contents of which are incorporated herein in their entirety.

INCORPORATION-BY-REFERENCE OF SEQUENCE LISTING

[0002] The contents of the text file named "MRT-2060WO_ST25.txt", which was created on Jan. 6, 2020 and is 504 KB in size, are hereby incorporated by reference in their entirety.

BACKGROUND

[0003] Primary ciliary dyskinesia (PCD) is an autosomal recessive disorder characterized by abnormal cilia and flagella that are found in the linings of the airway, the reproductive system, and other organs and tissues. PCD occurs in approximately 1 in 16,000 individuals. Symptoms are present as early as birth, with breathing problems, and affected individuals develop frequent respiratory tract infections beginning in early childhood. People with PCD also have year-round nasal congestion and chronic cough. Chronic respiratory tract infections can result in a condition called bronchiectasis, which damages the passages, called bronchi, and can cause life-threatening breathing problems. Some individuals with PCD also have infertility, recurrent ear infections, and abnormally placed organs within their chest and abdomen.

[0004] Mutations in the DNAH1 or DNAH5 genes account for about a third of all cases of primary ciliary dyskinesia. The DNAH5 gene encodes dynein axonemal heavy chain 5, which forms the inner structure of cilia. With an absent or abnormal dynein axonemal heavy chain 5, defective cilia cannot produce the force and movement needed to eliminate fluid, bacteria, and particles from the lungs. The movement of cilia also helps establish the left-right axis during embryonic development and propel the sperm cells forward to the female egg cell.

[0005] There is currently no cure for PCD. The current standard of care includes aggressive measures to enhance clearance of mucus and antibiotic therapy for bacterial infections of the airways. Routine immunizations are administered to prevent respiratory infections and other secondary complications. For some patients, lobectomy, lung transplantation, and sinus surgery are considered. Gene therapy has been studied to address the urgent need for new, more effective treatments of PCD. However, due to the large size of DNAH5, conventional gene therapy methods remain challenging.

SUMMARY OF THE INVENTION

[0006] The present invention provides, among other things, methods and compositions for use in the treatment of primary ciliary dyskinesia (PCD). The present invention is based, in part, on the surprising discovery that DNAH5 mRNA, which is approximately 14 kb in length, can be successfully encapsulated in a liposome and effectively delivered to target tissues in vivo.

[0007] In some aspects, the present invention provides a method of delivery of a 10 kb or greater mRNA encoding a protein or peptide in vivo comprising administering to a subject in need of delivery a 10 kb or greater mRNA encoding a protein or peptide. In some embodiments, the 10 kb or greater mRNA is encapsulated in a liposome. In some embodiments, the 10 kb or greater mRNA is 11 kb or greater in length. In some embodiments, the 10 kb or greater mRNA is 12 kb or greater in length. In some embodiments, the 10 kb or greater mRNA is 13 kb or greater in length. In some embodiments, the 10 kb or greater mRNA is 14 kb or greater in length.

[0008] In some aspects, the present invention provides a method of delivery of human axonemal dynein heavy chain 5 (DNAH5) in vivo comprising administering to a subject in need of delivery an mRNA encoding a human DNAH5 protein. In some embodiments, the DNAH5 mRNA is encapsulated in a liposome.

[0009] In some aspects, the present invention provides a method of treating primary ciliary dyskinesia (PCD) comprising administering to a subject in need of treatment an mRNA encoding human axonemal dynein heavy chain 5 (DNAH5) at an effective dose and an administration interval such that at least one symptom or feature of PCD is reduced in intensity, severity, or frequency or has a delayed onset.

[0010] In some embodiments, the DNAH5 mRNA is encapsulated in a liposome.

[0011] In some embodiments, the liposome comprises one or more cationic lipids, one or more non-cationic lipids and one or more PEG-modified lipids.

[0012] In some embodiments, the one or more cationic lipids are selected from the group consisting of cKK-E12, OF-02, C12-200, MC3, DLinDMA, DLinkC2DMA, ICE (Imidazol-based), HGT5000, HGT5001, HGT4003, DODAC, DDAB, DMRIE, DOSPA, DOGS, DODAP, DODMA and DMDMA, DODAC, DLenDMA, DMRIE, CLinDMA, CpLinDMA, DMOBA, DOcarbDAP, DLinDAP, DLincarbDAP, DLinCDAP, DLinSSDMA, KLin-K-DMA, DLin-K-XTC2-DMA, 3-(4-(bis(2-hydroxydodecyl)amino)butyl)-6-(4-((2-hydroxydodecyl)(2-hydroxyundecyl)amino)butyl)-1,4-dioxane-2,5-dione (Target 23), 3-(5-(bis(2-hydroxydodecyl)amino)pentan-2-yl)-6-(5-((2-hydroxydodecyl)(2-hydroxyundecyl)amino)pentan-2-yl)-1,4-dioxane-2,5-dione (Target 24), ccBene, ML7 and combinations thereof.

[0013] In some embodiments, the cationic lipid is ICE.

[0014] In some embodiments, the one or more non-cationic lipids are selected from DSPC (1,2-distearoyl-sn-glycero-3-phosphocholine), DPPC (1,2-dipalmitoyl-sn-glycero-3-phosphocholine), DOPE (1,2-dioleoyl-sn-glycero-3-phosphoethanolamine), DOPC (1,2-dioleoyl-sn-glycero-3-phosphatidylcholine), DPPE (1,2-dipalmitoyl-sn-glycero-3-phosphoethanolamine), DMPE (1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine), DOPG (1,2-dioleoyl-sn-glycero-3-phospho-(1'-rac-glycerol)) or combinations thereof. In some embodiments, the non-cationic lipid is DOPE.

[0015] In some embodiments, the one or more PEG-modified lipids comprise a poly(ethylene) glycol chain of up to 5 kDa in length covalently attached to a lipid with alkyl chain(s) of C6-C20 length.

[0016] In some embodiments, the cationic lipid constitutes about 30-60% of the liposome by molar ratio.

[0017] In some embodiments, the cationic lipid constitutes about 30%, 40%, 50%, or 60% of the liposome by molar ratio.

[0018] In some embodiments, the liposome comprises ICE, DOPE and DMG-PEG2K.
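
By way of illustration only, the molar-ratio language in the preceding paragraphs can be sketched as a short calculation. The lipid amounts used here are hypothetical values chosen for the example and are not taken from this application; the function names are likewise assumptions. The sketch treats "about 30-60%" as hard bounds, which is a simplification.

```python
from typing import Dict


def molar_percentages(moles_by_lipid: Dict[str, float]) -> Dict[str, float]:
    """Convert per-lipid molar amounts into percentages of the total lipid."""
    total = sum(moles_by_lipid.values())
    if total <= 0:
        raise ValueError("total lipid amount must be positive")
    return {name: 100.0 * n / total for name, n in moles_by_lipid.items()}


def cationic_fraction_in_range(
    moles_by_lipid: Dict[str, float],
    cationic_name: str,
    low: float = 30.0,
    high: float = 60.0,
) -> bool:
    """Check whether the named cationic lipid falls within [low, high] mol%.

    The hard bounds are a simplification of the "about 30-60%" wording.
    """
    pct = molar_percentages(moles_by_lipid)[cationic_name]
    return low <= pct <= high


# Hypothetical amounts (arbitrary molar units), for illustration only.
example = {"ICE": 6.0, "DOPE": 3.5, "DMG-PEG2K": 0.5}
percentages = molar_percentages(example)
in_range = cationic_fraction_in_range(example, "ICE")
```

With these hypothetical amounts, the cationic lipid accounts for 60% of the total lipid by molar ratio, which sits at the upper end of the stated range.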

[0019] In some embodiments, the liposome has a size of about 80 nm to 200 nm, optionally wherein the liposome has a size of about 100 nm or less than 100 nm.

[0020] In some embodiments, the DNAH5 mRNA is codon optimized.

[0021] In some embodiments, the DNAH5 mRNA comprises one or more modified nucleotides.

[0022] In some embodiments, the one or more modified nucleotides are selected from pseudouridine, N-1-methyl-pseudouridine, 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and/or 2-thiocytidine.

[0023] In some embodiments, the mRNA is unmodified.

[0024] In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) that has a sequence set forth in SEQ ID NO: 2 or SEQ ID NO: 3.

[0025] In some embodiments, the mRNA comprises a 3'-untranslated region (3'-UTR) that has a sequence set forth in SEQ ID NO: 4 or SEQ ID NO: 5.

[0026] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to any one of SEQ ID NO: 6 to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to any one of SEQ ID NO: 6 to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to any one of SEQ ID NO: 6 to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to any one of SEQ ID NO: 6 to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to any one of SEQ ID NO: 6 to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6 to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 6 to SEQ ID NO: 31.
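
As a non-limiting illustration of the identity thresholds recited above, percent identity between two coding sequences can be sketched as the fraction of matching positions in a pairwise alignment. The sketch assumes the two sequences have already been aligned to equal length, with "-" marking gaps; the alignment step itself, the function names, and the short example sequences are assumptions made for illustration and are not SEQ ID NOs of this application. Other conventions for counting gapped positions exist and may give different values.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pre-computed pairwise alignment.

    Columns where both sequences carry a gap are skipped; a gap opposite a
    nucleotide counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold_percent: float) -> bool:
    """Return True if the aligned pair is at least threshold_percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold_percent


# Hypothetical 10-nucleotide fragments differing at one position.
reference = "AUGGCCAUGG"
candidate = "AUGGCAAUGG"
identity = percent_identity(reference, candidate)
```

Here the two fragments match at 9 of 10 positions, so the candidate would satisfy an "at least 90% identical" threshold but not an "at least 95% identical" one.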

[0027] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 6. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 6. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 6. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 6. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 6. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 6. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 6.

[0028] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 7. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 7. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 7. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 7. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 7. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 7. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 7.

[0029] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 8. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 8. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 8. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 8. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 8. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 8. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 8.

[0030] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 9. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 9. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 9. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 9. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 9. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 9. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 9.

[0031] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 10. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 10. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 10. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 10. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 10. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 10. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 10.

[0032] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 11. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 11. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 11. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 11. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 11. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 11. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 11.

[0033] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 12. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 12. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 12. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 12. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 12. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 12. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 12.

[0034] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 13. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 13. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 13. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 13. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 13. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 13. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 13.

[0035] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 14. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 14. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 14. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 14. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 14. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 14. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 14.

[0036] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 15. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 15. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 15. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 15. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 15. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 15. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 15.

[0037] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 16. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 16. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 16. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 16. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 16. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 16. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 16.

[0038] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 17. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 17. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 17. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 17. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 17. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 17. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 17.

[0039] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 18. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 18. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 18. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 18. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 18. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 18. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 18.

[0040] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 19. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 19. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 19. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 19. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 19. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 19. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 19.

[0041] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 20. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 20. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 20. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 20. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 20. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 20. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 20.

[0042] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 21. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 21. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 21. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 21. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 21. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 21. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 21.

[0043] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 22. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 22. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 22. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 22. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 22. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 22. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 22.

[0044] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 23. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 23. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 23. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 23. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 23. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 23. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 23.

[0045] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 24. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 24. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 24. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 24. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 24. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 24. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 24.

[0046] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 25. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 25. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 25. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 25. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 25. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 25. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 25.

[0047] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 26. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 26. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 26. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 26. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 26. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 26. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 26.

[0048] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 27. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 27. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 27. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 27. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 27. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 27. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 27.

[0049] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 28. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 28. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 28. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 28. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 28. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 28. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 28.

[0050] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 29. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 29. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 29. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 29. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 29. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 29. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 29.

[0051] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 30. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 30. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 30. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 30. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 30. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 30. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 30.

[0052] In some embodiments, the mRNA comprises a coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 70% identical to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 80% identical to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 90% identical to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 95% identical to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 31. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 31.
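
As a non-limiting illustration of the percent-identity thresholds recited above, the following minimal sketch compares a candidate coding sequence against a reference. It assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts gap positions as mismatches; the function names `percent_identity` and `meets_identity_threshold` and the short example sequences are hypothetical and are not taken from any SEQ ID NO. Other alignment methods and identity conventions may give different values.

```python
def percent_identity(aligned_query: str, aligned_reference: str) -> float:
    """Return the percentage of aligned positions with identical residues.

    Assumption: both strings are pre-aligned and of equal length, with '-'
    marking gaps. Gap positions are counted as mismatches in this sketch.
    """
    if len(aligned_query) != len(aligned_reference):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_reference:
        raise ValueError("sequences must be non-empty")
    matches = sum(
        1
        for q, r in zip(aligned_query.upper(), aligned_reference.upper())
        if q == r and q != "-"
    )
    return 100.0 * matches / len(aligned_reference)


def meets_identity_threshold(aligned_query: str, aligned_reference: str,
                             threshold_percent: float) -> bool:
    """Check an 'at least N% identical' condition for one threshold."""
    return percent_identity(aligned_query, aligned_reference) >= threshold_percent


# Hypothetical 12-nucleotide example with a single mismatch (11 of 12 match).
reference = "AUGGCCAAAGGC"
candidate = "AUGGCCAAGGGC"
identity = percent_identity(candidate, reference)
```

In this toy case the candidate is about 91.7% identical to the reference, so it would satisfy an "at least 90%" condition but not an "at least 95%" condition.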

[0053] In some embodiments, administering the mRNA to the subject is performed by intratracheal, intranasal, intravenous, intramuscular or subcutaneous delivery.

[0054] In some embodiments, administering the mRNA to the subject is performed by intratracheal delivery.

[0055] In some embodiments, administering the mRNA to the subject is performed by intranasal delivery.

[0056] In some embodiments, administering the mRNA to the subject is performed by aerosol delivery.

[0057] In some embodiments, administering the mRNA to the subject is performed by nebulized delivery.

[0058] In some embodiments, administering the mRNA to the subject is performed by dry powder inhalation.

[0059] In some embodiments, the composition is administered once a week.

[0060] In some embodiments, the composition is administered once every two weeks.

[0061] In some embodiments, the composition is administered twice a month.

[0062] In some embodiments, the composition is administered once a month.

[0063] In some embodiments, the administering the mRNA results in DNAH5 protein expression detectable in one or more internal organs selected from lung, heart, liver, spleen, kidney, brain, stomach, intestines, ovary and testis.

[0064] In some embodiments, the administering the mRNA results in DNAH5 protein expression detectable in the lung.

[0065] In some embodiments, the administering the mRNA results in DNAH5 protein expression detectable in the lung epithelium.

[0066] In some aspects, the invention provides a composition for use in the treatment of primary ciliary dyskinesia (PCD), the composition comprising an mRNA encoding human axonemal dynein heavy chain 5 (DNAH5) encapsulated in a liposome, wherein the liposome comprises one or more cationic lipids, one or more non-cationic lipids and one or more PEG-modified lipids.

[0067] In some embodiments, the mRNA comprises a DNAH5 coding sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to any one of SEQ ID NO: 6 to SEQ ID NO: 31.

[0068] In some embodiments, the mRNA comprises a coding sequence at least 70%, at least 80%, at least 90%, at least 95% or at least 98% identical to SEQ ID NO: 6. In some embodiments, the mRNA comprises a coding sequence at least 70%, at least 80%, at least 90%, at least 95% or at least 98% identical to SEQ ID NO: 7.

[0069] In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 6. In some embodiments, the mRNA comprises a coding sequence set forth in SEQ ID NO: 7.

[0070] In some embodiments, the mRNA has a 5'-untranslated region (5'-UTR) that has a sequence set forth in SEQ ID NO: 2 or SEQ ID NO: 3, and a 3'-untranslated region (3'-UTR) that has a sequence set forth in SEQ ID NO: 4 or SEQ ID NO: 5.

[0071] In some embodiments, the mRNA has one or more modified nucleotides.

[0072] In some embodiments, the one or more modified nucleotides are selected from pseudouridine, N-1-methyl-pseudouridine, 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and/or 2-thiocytidine.

[0073] In some embodiments, the mRNA is unmodified.

[0074] In some embodiments, the liposome is 100 nm in diameter or less.

[0075] In some embodiments, the invention provides a pharmaceutical composition comprising the composition described above and a suitable excipient.

[0076] In some aspects, the present invention provides a method of delivery of an mRNA encoding a protein or peptide in vivo comprising administering to a subject in need of delivery an mRNA encoding a protein or peptide and having a 5'-untranslated region (5'-UTR) that has a sequence at least 70%, 75%, 80%, 85%, 90%, or 95% identical to SEQ ID NO: 2 and that is not SEQ ID NO: 3. In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) that has a sequence at least 70% identical to SEQ ID NO: 2 that is not SEQ ID NO: 3. In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) that has a sequence at least 75% identical to SEQ ID NO: 2 that is not SEQ ID NO: 3. In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) that has a sequence at least 80% identical to SEQ ID NO: 2 that is not SEQ ID NO: 3. In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) that has a sequence at least 85% identical to SEQ ID NO: 2 that is not SEQ ID NO: 3. In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) that has a sequence at least 90% identical to SEQ ID NO: 2 that is not SEQ ID NO: 3. In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) that has a sequence at least 95% identical to SEQ ID NO: 2 that is not SEQ ID NO: 3. In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) that is at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 2 that is not SEQ ID NO: 3. In some embodiments, the mRNA comprises a 5'-untranslated region (5'-UTR) set forth in SEQ ID NO: 2.

Other features, objects, and advantages of the present invention are apparent in the detailed description, drawings and claims that follow. It should be understood, however, that the detailed description, the drawings, and the claims, while indicating embodiments of the present invention, are given by way of illustration only, not limitation. Various changes and modifications within the scope of the invention will become apparent to those skilled in the art.

BRIEF DESCRIPTION OF THE DRAWING

[0077] The drawings are for illustration purposes only, not for limitation.

[0078] FIG. 1A is a schematic diagram that shows the dissection and usage of various parts of mouse trachea and lungs for quantitative PCR (qPCR) analysis and immunohistochemistry (IHC) analysis, 24 hours after mRNA administration. FIG. 1B (left) and (right) are graphs that show qPCR data for hDNAH5 mRNA in the different regions of the respiratory system as indicated in the figure. FIG. 1B (left) shows data from Group 1 mice that were administered MRT-1 hDNAH5 mRNA; FIG. 1B (right) shows data from Group 2 mice that were administered MRT-1 hDNAH5-GFP mRNA.

[0079] FIG. 2A and FIG. 2B show a series of photomicrographs depicting results from IHC analysis for hDNAH5 protein expression in the respiratory airways. FIG. 2A depicts representative IHC data for hDNAH5 protein staining in MRT-1 hDNAH5 mRNA treated mice compared to saline-treated control (Group 1, left); and IHC data for GFP protein staining in MRT-1 hDNAH5-GFP mRNA treated mice compared to saline-treated control (Group 2, right). FIG. 2B shows detailed localization of the respective hDNAH5 mRNA derived protein in epithelial tissue of the airways in Group 1 (upper panel) and Group 2 (lower panel) mice.

DEFINITIONS

[0080] In order for the present invention to be more readily understood, certain terms are first defined below. Additional definitions for the following terms and other terms are set forth throughout the specification. The publications and other reference materials referenced herein to describe the background of the invention and to provide additional detail regarding its practice are hereby incorporated by reference.

[0081] Alkyl: As used herein, "alkyl" refers to a radical of a straight-chain or branched saturated hydrocarbon group having from 1 to 15 carbon atoms ("C1-15 alkyl"). In some embodiments, an alkyl group has 1 to 3 carbon atoms ("C1-3 alkyl"). Examples of C1-3 alkyl groups include methyl (C1), ethyl (C2), n-propyl (C3), and isopropyl (C3). In some embodiments, an alkyl group has 8 to 12 carbon atoms ("C8-12 alkyl"). Examples of C8-12 alkyl groups include, without limitation, n-octyl (C8), n-nonyl (C9), n-decyl (C10), n-undecyl (C11), n-dodecyl (C12) and the like. The prefix "n-" (normal) refers to unbranched alkyl groups. For example, n-C8 alkyl refers to (CH2)7CH3, n-C10 alkyl refers to (CH2)9CH3, etc.

[0082] Amino acid: As used herein, the term "amino acid," in its broadest sense, refers to any compound and/or substance that can be incorporated into a polypeptide chain. In some embodiments, an amino acid has the general structure H2N--C(H)(R)--COOH. In some embodiments, an amino acid is a naturally occurring amino acid. In some embodiments, an amino acid is a synthetic amino acid; in some embodiments, an amino acid is a D-amino acid; in some embodiments, an amino acid is an L-amino acid. "Standard amino acid" refers to any of the twenty standard L-amino acids commonly found in naturally occurring peptides. "Nonstandard amino acid" refers to any amino acid, other than the standard amino acids, regardless of whether it is prepared synthetically or obtained from a natural source. As used herein, "synthetic amino acid" encompasses chemically modified amino acids, including but not limited to salts, amino acid derivatives (such as amides), and/or substitutions. Amino acids, including carboxy- and/or amino-terminal amino acids in peptides, can be modified by methylation, amidation, acetylation, protecting groups, and/or substitution with other chemical groups that can change the peptide's circulating half-life without adversely affecting its activity. Amino acids may participate in a disulfide bond. Amino acids may comprise one or more posttranslational modifications, such as association with one or more chemical entities (e.g., methyl groups, acetate groups, acetyl groups, phosphate groups, formyl moieties, isoprenoid groups, sulfate groups, polyethylene glycol moieties, lipid moieties, carbohydrate moieties, biotin moieties, etc.). The term "amino acid" is used interchangeably with "amino acid residue," and may refer to a free amino acid and/or to an amino acid residue of a peptide. It will be apparent from the context in which the term is used whether it refers to a free amino acid or a residue of a peptide.

[0083] Animal: As used herein, the term "animal" refers to any member of the animal kingdom. In some embodiments, "animal" refers to humans, at any stage of development. In some embodiments, "animal" refers to non-human animals, at any stage of development. In certain embodiments, the non-human animal is a mammal (e.g., a rodent, a mouse, a rat, a rabbit, a monkey, a dog, a cat, a sheep, cattle, a primate, and/or a pig). In some embodiments, animals include, but are not limited to, mammals, birds, reptiles, amphibians, fish, insects, and/or worms. In some embodiments, an animal may be a transgenic animal, genetically-engineered animal, and/or a clone.

[0084] Approximately or about: As used herein, the term "approximately" or "about," as applied to one or more values of interest, refers to a value that is similar to a stated reference value. In certain embodiments, the term "approximately" or "about" refers to a range of values that fall within 25%, 20%, 19%, 18%, 17%, 16%, 15%, 14%, 13%, 12%, 11%, 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2%, 1%, or less in either direction (greater than or less than) of the stated reference value unless otherwise stated or otherwise evident from the context (except where such number would exceed 100% of a possible value). Typically, the term "approximately" or "about" refers to a range of values that fall within 10%, or more typically 1%, of the stated reference value.
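
As a non-limiting illustration of the tolerance ranges recited in this definition, the following minimal sketch tests whether a value falls within a given percentage of a stated reference value in either direction. The function name `is_about` and the default tolerance of 10% are choices made for this example only; the sketch does not implement the exception for values that would exceed 100% of a possible value.

```python
def is_about(value: float, reference: float, tolerance_percent: float = 10.0) -> bool:
    """Return True if value lies within tolerance_percent of reference.

    The comparison is symmetric (greater than or less than the reference).
    Both sides are scaled by 100 to avoid dividing the tolerance, which keeps
    boundary cases exact for integer inputs.
    """
    if tolerance_percent < 0:
        raise ValueError("tolerance_percent must be non-negative")
    return abs(value - reference) * 100 <= tolerance_percent * abs(reference)


# Illustrative checks against a reference value of 100.
within_default = is_about(109, 100)        # inside the 10% band
outside_default = is_about(111, 100)       # outside the 10% band
within_wide_band = is_about(124, 100, 25)  # inside a 25% band
```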

[0085] Biologically active: As used herein, the phrase "biologically active" refers to a characteristic of any agent that has activity in a biological system, and particularly in an organism. For instance, an agent that, when administered to an organism, has a biological effect on that organism, is considered to be biologically active.

[0086] Codon-optimized: As used herein, the term describes a nucleic acid in which one or more of the nucleotides present in a naturally occurring nucleic acid sequence (also referred to as "wild-type" sequence) has been substituted with an alternative nucleotide to optimize protein expression without changing the amino acid sequence of the polypeptide encoded by the naturally occurring nucleic acid sequence. For example, the codon AAA may be altered to become AAG without changing the identity of the encoded amino acid (lysine). In some embodiments, the nucleic acids of the invention are codon-optimized to increase protein expression of the protein encoded by the nucleic acid. For the purpose of this application, the nucleobases thymine (T) and uracil (U) are used interchangeably in narration of mRNA sequences.
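
As a non-limiting illustration of the AAA-to-AAG example above, the following minimal sketch translates two short coding sequences with the standard genetic code and checks that a synonymous codon substitution leaves the encoded amino acid sequence unchanged. The sequences, the function names `translate` and `encodes_same_polypeptide`, and the treatment of T as U are assumptions made for this example; the sketch does not decide which codon is preferable for expression.

```python
# Standard genetic code, built from the conventional U, C, A, G ordering.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def translate(coding_sequence: str) -> str:
    """Translate a coding sequence into one-letter amino acids ('*' = stop).

    T is read as U, so DNA-style and RNA-style spellings give the same result.
    """
    rna = coding_sequence.upper().replace("T", "U")
    if len(rna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def encodes_same_polypeptide(sequence_a: str, sequence_b: str) -> bool:
    """Return True if both coding sequences encode the same amino acids."""
    return translate(sequence_a) == translate(sequence_b)


# Hypothetical example: AAA -> AAG (lysine) and GCU -> GCC (alanine).
wild_type = "AUGAAAGCUUAA"
recoded = "AUGAAGGCCUAA"
unchanged = encodes_same_polypeptide(wild_type, recoded)
```

Both example sequences translate to methionine, lysine, alanine and a stop codon, so the substitution changes the nucleotide sequence but not the encoded polypeptide.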

[0087] Delivery: As used herein, the term "delivery" encompasses both local and systemic delivery. For example, delivery of mRNA encompasses situations in which an mRNA is delivered to a target tissue and the encoded protein is expressed and retained within the target tissue (also referred to as "local distribution" or "local delivery"), and situations in which an mRNA is delivered to a target tissue and the encoded protein is expressed and secreted into the patient's circulation system (e.g., serum) and systemically distributed and taken up by other tissues (also referred to as "systemic distribution" or "systemic delivery").

[0088] Dosing interval: As used herein, "dosing interval" in the context of a method for treating a disease is the frequency of administering a therapeutic composition, for example an mRNA composition, to a subject (mammal) in need thereof at an effective dose of the mRNA, such that one or more symptoms associated with the disease are reduced, or one or more biomarkers associated with the disease are reduced, at least for the period of the dosing interval. Dosing frequency and dosing interval may be used interchangeably in the current disclosure.

[0089] Expression: As used herein, "expression" of a nucleic acid sequence refers to translation of an mRNA into a polypeptide, assembly of multiple polypeptides into an intact protein (e.g., enzyme) and/or post-translational modification of a polypeptide or fully assembled protein (e.g., enzyme). In this application, the terms "expression" and "production," and grammatical equivalents, are used interchangeably.

[0090] Effective dose: As used herein, an effective dose is a dose of the mRNA in the pharmaceutical composition which, when administered to the subject in need thereof, here a mammalian subject, according to the methods of the invention, is effective to bring about an expected outcome in the subject, for example to reduce a symptom associated with the disease.

[0091] Functional: As used herein, a "functional" biological molecule is a biological molecule in a form in which it exhibits a property and/or activity by which it is characterized.

[0092] Half-life: As used herein, the term "half-life" is the time required for a quantity such as nucleic acid or protein concentration or activity to fall to half of its value as measured at the beginning of a time period.
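
As a non-limiting illustration of this definition, the following minimal sketch relates a half-life to the fraction of a quantity remaining after a given time, and estimates a half-life from two measurements. It assumes simple first-order (exponential) decay, which the definition above does not require; the function names and the numerical values are hypothetical.

```python
import math


def remaining_fraction(elapsed_time: float, half_life: float) -> float:
    """Fraction of the starting quantity left after elapsed_time.

    Assumption: first-order decay, so the quantity halves every half_life.
    Both arguments must use the same time unit.
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return 0.5 ** (elapsed_time / half_life)


def estimate_half_life(initial_value: float, later_value: float,
                       elapsed_time: float) -> float:
    """Estimate half-life from two measurements taken elapsed_time apart.

    Assumption: first-order decay between the two measurements.
    """
    if not (0 < later_value < initial_value):
        raise ValueError("expected 0 < later_value < initial_value")
    return elapsed_time * math.log(2) / math.log(initial_value / later_value)


# Hypothetical example: a signal that falls from 100 to 25 units over 8 hours
# has gone through two halvings, so the estimated half-life is about 4 hours.
estimated = estimate_half_life(100.0, 25.0, 8.0)
```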

[0093] Improve, increase, or reduce: As used herein, the terms "improve," "increase" or "reduce," or grammatical equivalents, indicate values that are relative to a baseline measurement, such as a measurement in the same individual prior to initiation of the treatment described herein, or a measurement in a control subject (or multiple control subjects) in the absence of the treatment described herein. A "control subject" is a subject afflicted with the same form of disease as the subject being treated, who is about the same age as the subject being treated.

[0094] In Vitro: As used herein, the term "in vitro" refers to events that occur in an artificial environment, e.g., in a test tube or reaction vessel, in cell culture, etc., rather than within a multi-cellular organism.

[0095] In Vivo: As used herein, the term "in vivo" refers to events that occur within a multi-cellular organism, such as a human or a non-human animal. In the context of cell-based systems, the term may be used to refer to events that occur within a living cell (as opposed to, for example, in vitro systems).

[0096] Isolated: As used herein, the term "isolated" refers to a substance and/or entity that has been (1) separated from at least some of the components with which it was associated when initially produced (whether in nature and/or in an experimental setting), and/or (2) produced, prepared, and/or manufactured by the hand of man. Isolated substances and/or entities may be separated from about 10%, about 20%, about 30%, about 40%, about 50%, about 60%, about 70%, about 80%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% of the other components with which they were initially associated. In some embodiments, isolated agents are about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99%, or more than about 99% pure. As used herein, a substance is "pure" if it is substantially free of other components. As used herein, calculation of percent purity of isolated substances and/or entities should not include excipients (e.g., buffer, solvent, water, etc.).
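
As a non-limiting illustration of the purity calculation described in this definition, the following minimal sketch computes percent purity of a substance of interest while leaving excipients (e.g., buffer, solvent, water) out of the calculation. The function name `percent_purity`, the component names, and the masses are hypothetical values chosen for this example.

```python
def percent_purity(component_masses: dict, target: str, excipients: set) -> float:
    """Percent purity of target, with excipients excluded from the total.

    component_masses maps each component name to its amount (any single unit).
    Components named in excipients are ignored, as the definition directs.
    """
    considered = {
        name: mass
        for name, mass in component_masses.items()
        if name not in excipients
    }
    if target not in considered:
        raise ValueError("target must be a non-excipient component")
    total = sum(considered.values())
    if total <= 0:
        raise ValueError("total non-excipient mass must be positive")
    return 100.0 * considered[target] / total


# Hypothetical preparation: 9 units of full-length mRNA, 1 unit of other RNA
# species, and 90 units of buffer. The buffer is excluded, so purity is
# 9 / (9 + 1), i.e. 90%.
preparation = {"full-length mRNA": 9.0, "other RNA species": 1.0, "buffer": 90.0}
purity = percent_purity(preparation, "full-length mRNA", {"buffer"})
```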

[0097] Local distribution or delivery: As used herein, the terms "local distribution," "local delivery," or grammatical equivalents, refer to tissue-specific delivery or distribution. Typically, local distribution or delivery requires that a protein (e.g., enzyme) encoded by mRNAs be translated and expressed intracellularly or with limited secretion that avoids entering the patient's circulation system.

[0098] messenger RNA (mRNA): As used herein, the term "messenger RNA (mRNA)" refers to a polyribonucleotide that encodes at least one polypeptide. mRNA may contain one or more coding and non-coding regions. mRNA can be purified from natural sources, produced using recombinant expression systems and optionally purified, in vitro transcribed, chemically synthesized, etc. An mRNA sequence is presented in the 5' to 3' direction unless otherwise indicated. Typically, the mRNA of the present invention is synthesized from adenosine, guanosine, cytidine and uridine nucleotides that bear no modifications. Such mRNA is referred to herein as mRNA with unmodified nucleotides or `unmodified mRNA` for short. Typically, this means that the mRNA of the present invention does not comprise any of the following nucleoside analogs: 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolo-pyrimidine, 3-methyl adenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaadenosine, 7-deazaguanosine, 8-oxoadenosine, 8-oxoguanosine, O(6)-methylguanine, and 2-thiocytidine. An mRNA suitable for practising the claimed invention commonly does not comprise nucleosides comprising chemically modified bases; biologically modified bases (e.g., methylated bases); intercalated bases; modified sugars (e.g., 2'-fluororibose, ribose, 2'-deoxyribose, arabinose, and hexose); and/or modified phosphate groups (e.g., phosphorothioates and 5'-N-phosphoramidite linkages).

[0099] Nucleic acid: As used herein, the term "nucleic acid," in its broadest sense, refers to any compound and/or substance that is or can be incorporated into a polynucleotide chain. In some embodiments, a nucleic acid is a compound and/or substance that is or can be incorporated into a polynucleotide chain via a phosphodiester linkage. In some embodiments, "nucleic acid" refers to individual nucleic acid residues (e.g., nucleotides and/or nucleosides). In some embodiments, "nucleic acid" refers to a polynucleotide chain comprising individual nucleic acid residues. In some embodiments, "nucleic acid" encompasses RNA as well as single and/or double-stranded DNA and/or cDNA.

[0100] Patient: As used herein, the term "patient" or "subject" refers to any organism to which a provided composition may be administered, e.g., for experimental, diagnostic, prophylactic, cosmetic, and/or therapeutic purposes. Typical patients include animals (e.g., mammals such as mice, rats, rabbits, non-human primates, and/or humans). In some embodiments, a patient is a human. A human includes pre- and post-natal forms.

[0101] Pharmaceutically acceptable: The term "pharmaceutically acceptable" as used herein, refers to substances that, within the scope of sound medical judgment, are suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a reasonable benefit/risk ratio.

[0102] Pharmaceutically acceptable salt: Pharmaceutically acceptable salts are well known in the art. For example, S. M. Berge et al. describe pharmaceutically acceptable salts in detail in J. Pharmaceutical Sciences (1977) 66:1-19. Pharmaceutically acceptable salts of the compounds of this invention include those derived from suitable inorganic and organic acids and bases. Examples of pharmaceutically acceptable, nontoxic acid addition salts are salts of an amino group formed with inorganic acids such as hydrochloric acid, hydrobromic acid, phosphoric acid, sulfuric acid and perchloric acid or with organic acids such as acetic acid, oxalic acid, maleic acid, tartaric acid, citric acid, succinic acid or malonic acid or by using other methods used in the art such as ion exchange. Other pharmaceutically acceptable salts include adipate, alginate, ascorbate, aspartate, benzenesulfonate, benzoate, bisulfate, borate, butyrate, camphorate, camphorsulfonate, citrate, cyclopentanepropionate, digluconate, dodecylsulfate, ethanesulfonate, formate, fumarate, glucoheptonate, glycerophosphate, gluconate, hemisulfate, heptanoate, hexanoate, hydroiodide, 2-hydroxy-ethanesulfonate, lactobionate, lactate, laurate, lauryl sulfate, malate, maleate, malonate, methanesulfonate, 2-naphthalenesulfonate, nicotinate, nitrate, oleate, oxalate, palmitate, pamoate, pectinate, persulfate, 3-phenylpropionate, phosphate, picrate, pivalate, propionate, stearate, succinate, sulfate, tartrate, thiocyanate, p-toluenesulfonate, undecanoate, valerate salts, and the like. Salts derived from appropriate bases include alkali metal, alkaline earth metal, ammonium and N+(C1-4 alkyl)4 salts. Representative alkali or alkaline earth metal salts include sodium, lithium, potassium, calcium, magnesium, and the like. Further pharmaceutically acceptable salts include, when appropriate, nontoxic ammonium, quaternary ammonium, and amine cations formed using counterions such as halide, hydroxide, carboxylate, sulfate, phosphate, nitrate, sulfonate and aryl sulfonate. Further pharmaceutically acceptable salts include salts formed from the quaternization of an amine using an appropriate electrophile, e.g., an alkyl halide, to form a quaternized alkylated amino salt.

[0103] Systemic distribution or delivery: As used herein, the terms "systemic distribution," "systemic delivery," or grammatical equivalents, refer to a delivery or distribution mechanism or approach that affects the entire body or an entire organism. Typically, systemic distribution or delivery is accomplished via the body's circulation system, e.g., the blood stream. Compare to the definition of "local distribution or delivery."

[0104] Subject: As used herein, the term "subject" refers to a human or any non-human animal (e.g., mouse, rat, rabbit, dog, cat, cattle, swine, sheep, horse or primate). A human includes pre- and post-natal forms. In many embodiments, a subject is a human being. A subject can be a patient, which refers to a human presenting to a medical provider for diagnosis or treatment of a disease. The term "subject" is used herein interchangeably with "individual" or "patient." A subject can be afflicted with or susceptible to a disease or disorder but may or may not display symptoms of the disease or disorder.

[0105] Substantially: As used herein, the term "substantially" refers to the qualitative condition of exhibiting total or near-total extent or degree of a characteristic or property of interest. One of ordinary skill in the biological arts will understand that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result. The term "substantially" is therefore used herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.

[0106] Target tissues: As used herein, the term "target tissues" refers to any tissue that is affected by a disease to be treated. In some embodiments, target tissues include those tissues that display disease-associated pathology, symptom, or feature.

[0107] Therapeutically effective amount: As used herein, the term "therapeutically effective amount" of a therapeutic agent means an amount that is sufficient, when administered to a subject suffering from or susceptible to a disease, disorder, and/or condition, to treat, diagnose, prevent, and/or delay the onset of the symptom(s) of the disease, disorder, and/or condition. It will be appreciated by those of ordinary skill in the art that a therapeutically effective amount is typically administered via a dosing regimen comprising at least one unit dose.

[0108] Treating: As used herein, the term "treat," "treatment," or "treating" refers to any method used to partially or completely alleviate, ameliorate, relieve, inhibit, prevent, delay onset of, reduce severity of and/or reduce incidence of one or more symptoms or features of a particular disease, disorder, and/or condition. Treatment may be administered to a subject who does not exhibit signs of a disease and/or exhibits only early signs of the disease for the purpose of decreasing the risk of developing pathology associated with the disease.

[0109] Various aspects of the invention are described in detail in the following sections. The use of sections is not meant to limit the invention. Each section can apply to any aspect of the invention. In this application, the use of "or" means "and/or" unless stated otherwise.

DETAILED DESCRIPTION

Primary Ciliary Dyskinesia (PCD)

[0110] Primary ciliary dyskinesia (PCD) is an autosomal recessive disorder characterized by abnormal cilia and flagella that are found in the linings of the airway, the reproductive system, and other organs and tissues. Mutations in the DNAH5 gene, which encodes the dynein axonemal heavy chain 5 protein that forms the inner structure of cilia, cause PCD. Over 80 different mutations in the DNAH5 gene have been identified in patients with PCD.

[0111] Mutations in the DNAH5 gene result in an absent or abnormal dynein axonemal heavy chain 5, which is required for the proper functioning of cilia. Without a normal version of dynein axonemal heavy chain 5, defective cilia cannot produce the force and movement needed to eliminate fluid, bacteria, and particles from the lungs, to establish the left-right axis during embryonic development, and to propel the sperm cells. PCD can lead to chronic respiratory tract infections, bronchiectasis, year-round nasal congestion, abnormally placed organs within the chest and abdomen, and infertility.

[0112] Polyribonucleotides of the disclosure can be used, for example, to treat a subject having or at risk of having primary ciliary dyskinesia or any other condition associated with a defect or malfunction of a gene whose function is linked to cilia maintenance and function. Non-limiting examples of genes that have been associated with primary ciliary dyskinesia include: armadillo repeat containing 4 (ARMC4), chromosome 21 open reading frame 59 (C21orf59), coiled-coil domain containing 103 (CCDC103), coiled-coil domain containing 114 (CCDC114), coiled-coil domain containing 39 (CCDC39), coiled-coil domain containing 40 (CCDC40), coiled-coil domain containing 65 (CCDC65), cyclin O (CCNO), dynein (axonemal) assembly factor 1 (DNAAF1), dynein (axonemal) assembly factor 2 (DNAAF2), dynein (axonemal) assembly factor 3 (DNAAF3), dynein (axonemal) assembly factor 5 (DNAAF5), dynein axonemal heavy chain 11 (DNAH11), dynein axonemal heavy chain 5 (DNAH5), dynein axonemal heavy chain 6 (DNAH6), dynein axonemal heavy chain 8 (DNAH8), dynein axonemal intermediate chain 2 (DNAI2), dynein axonemal light chain 1 (DNAL1), dynein regulatory complex subunit 1 (DRC1), dyslexia susceptibility 1 candidate 1 (DYX1C1), growth arrest specific 8 (GAS8), axonemal central pair apparatus protein (HYDIN), leucine rich repeat containing 6 (LRRC6), NME/NM23 family member 8 (NME8), oral-facial-digital syndrome 1 (OFD1), retinitis pigmentosa GTPase regulator (RPGR), radial spoke head 1 homolog (Chlamydomonas) (RSPH1), radial spoke head 4 homolog A (Chlamydomonas) (RSPH4A), radial spoke head 9 homolog (Chlamydomonas) (RSPH9), sperm associated antigen 1 (SPAG1), and zinc finger MYND-type containing 10 (ZMYND10).

Dynein Axonemal Heavy Chain 5 (DNAH5) Gene and Protein Sequence

[0113] In some embodiments, the present invention provides methods and compositions for delivering mRNA encoding DNAH5 to a subject for the treatment of PCD. A suitable DNAH5 mRNA encodes any full length, fragment or portion of a DNAH5 protein which can be substituted for naturally-occurring DNAH5 protein activity and/or reduce the intensity, severity, and/or frequency of one or more symptoms associated with PCD.

[0114] In some embodiments, a suitable mRNA sequence is an mRNA sequence encoding a human DNAH5 protein. The naturally-occurring human DNAH5 mRNA coding sequence and the corresponding amino acid sequence are shown in Table 1:

TABLE-US-00001 TABLE 1 Human DNAH5 Amino Acid Sequence Human (SEQ ID NO: 1) DNAH5 MFRIGRRQLWKHSVTRVLTQRLKGEKEAKRALLDARHNYLFAIVASCLDL Protein NKTEVEDAILEGNQIERIDQLFAVGGLRHLMFYYQDVEEAETGQLGSLGGV Sequence NLVSGKIKKPKVFVTEGNDVALTGVCVFFIRTDPSKAITPDNIHQEVSFNML DAADGGLLNSVRRLLSDIFIPALRATSHGWGELEGLQDAANIRQEFLSSLEG FVNVLSGAQESLKEKVNLRKCDILELKTLKEPTDYLTLANNPETLGKIEDCM KVWIKQTEQVLAENNQLLKEADDVGPRAELEHWKKRLSKFNYLLEQLKSP DVKAVLAVLAAAKSKLLKTWREMDIRITDATNEAKDNVKYLYTLEKCCDP LYSSDPLSMMDAIPTLINAIKMIYSISHYYNTSEKITSLFVKVTNQIISACKAYI TNNGTASIWNQPQDVVEEKILSAIKLKQEYQLCFUKTKQKLKQNPNAKQFD FSEMYIFGKFETFHRRLAKIIDIFTTLKTYSVLQDSTIEGLEDMATKYQGIVAT IKKKEYNFLDQRKMDFDQDYEEFCKQTNDLHNELRKFMDVTFAKIQNTNQ ALRMLKKFERLNIPNLGIDDKYQLILENYGADIDMISKLYTKQKYDPPLARN QPPIAGKILWARQLFHRIQQPMQLFQQHPAVLSTAEAKPIIRSYNRMAKVLL EFEVLFHRAWLRQIEEIHVGLEASLLVKAPGTGELFVNFDPQILILFRETECM AQMGLEVSPLATSLFQKRDRYKRNFSNMKMMLAEYQRVKSKIPAAIEQLIV PHLAKVDEALQPGLAALTWTSLNIEAYLENTFAKIKDLELLLDRVNDLIEFRI DAILEEMSSTPLCQLPQEEPLTCEEFLQMTKDLCVNGAQILHFKSSLVEEAV NELVNMLLDVEVLSEEESEKISNENSVNYKNESSAKREEGNFDTLTSSINAR ANALLLTTVTRKKKETEMLGEEARELLSHFNHQNMDALLKVTRNTLEAIRK RIRSSHTINFRDSNSASNMKQNSLPIFRASVTLAIPNIVMAPALEDVQQTLNK AVECIISVPKGVRQWSSELLSKKKIQERKMAALQSNEDSDSDVEMGENELQ DTLEIASVNLPIPVQTKNYYKNVSENKEIVKLVSVLSTIINSTKKEVITSMDCF KRYNHIWQKGKEEAIKTFITQSPLLSEFESQILYFQNLEQEINAEPEYVCVGSI ALYTADLKFALTAETKAWMVVIGRHCNKKYRSEMENIFMLIEEFNKKLNRP IKDLDDIRIAMAALKEIREEQISIDFQVGPIEESYALLNRYGLLIAREEIDKVDT LHYAWEKLLARAGEVQNKLVSLQPSFKKELISAVEVFLQDCHQFYLDYDLN GPMASGLKPQEASDRLIMFQNQFDNIYRKYITYTGGEELFGLPATQYPQLLE IKKQLNLLQKIYTLYNSVIETVNSYYDILWSEVNIEKINNELLEFQNRCRKLP RALKDWQAFLDLKKIIDDFSECCPLLEYMASKAMMERHWERITTLTGHSLD VGNESFKLRNIMEAPLLKYKEEIEDICISAVKERDIEQKLKQVINEWDNKTFT FGSFKTRGELLLRGDSTSEIIANMEDSLMLLGSLLSNRYNMPFKAQIQKWVQ YLSNSTDIIESWMTVQNLWIYLEAVFVGGDIAKQLPKEAKRFSNIDKSWVKI MTRAHEVPSVVQCCVGDETLGQLLPHLLDQLEICQKSLTGYLEKKRLCFPR FFFVSDPALLEILGQASDSHTIQAHLLNVFDNIKSVKFHEKIYDRILSISSQEGE TIELDKPVMAEGNVEVWLNSLLEESQSSLHLVIRQAAANIQETGFQLTEFLSS 
FPAQVGLLGIQMIWTRDSEEALRNAKFDKKEVIQKTNQAFLELLNTLIDVTTR DLSSTERVKYETLITIHVHQRDIFDDLCHMHIKSPMDFEWLKQCRFYFNEDS DKMMIHITDVAFIYQNEFLGCTDRLVITPLTDRCYITLAQALGMSMGGAPA GPAGTGKTETTKDMGRCLGKYVVVFNCSDQMDFRGLGRIFKGLAQSGSWG CFDEFNRIDLPVLSVAAQQISIILTCKKEHKKSFIFTDGDNVTMNPEFGLFLT MNPGYAGRQELPENLKINFRSVAMMVPDRQIIIRVKLASCGFIDNVVLARKF FTLYKLCEEQLSKQVHYDFGLRNILSVLRTLGAAKRANPMDTESTIVMRVL RDMNLSKLIDEDEPLFLSLIEDLFPNILLDKAGYPELEAAISRQVEEAGLINHP PWKLKVIQLFETQRVRHGMMTLGPSGAGKTTCIHTLMRAMTDCGKPHREM RMNPKAITAPQMFGRLDVATNDWTDGIFSTLWRKTLRAKKGEHIWIILDGP VDAIWIENLNSVLDDNKTLTLANGDRIPMAPNCKIIFEPHNIDNASPATVSRN GMVFMSSSILDWSPILEGFLKKRSPQEAEILRQLYTESFPDLYRFCIQNLEYK MEVLEAFVITQSINMLQGLIPLKEQGGEVSQAHLGRLFVFALLWSAGAALEL DGRRRLELWLRSRPTGTLELPPPAGPGDTAFDYYVAPDGTWTHWNTRTQE YLYPSDTTPEYGSILVPNVDNVRTDFLIQTIAKQGKAVLLIGEQGTAKTVIIK GEMSKYDPECHMIKSLNFSSATTPLMFQRTIESYVDKRMGTTYGPPAGKKM TVFIDDVNMPIINEWGDQVTNEIVRQLMEQNGFYNLEKPGEFTSIVDIQFLA AMIHPGGGRNDIPQRLKRQFSIFNCTLPSEASVDKIFGVIGVGHYCTQRGFSE EVRDSVTKLVPLTRRLWQMTKIKMLPTPAKFHYVFNLRDLSRVWQGMLNT T SEVIKEPNDLLKLWKHECKRVIADRFTVSSDVTWFDKALVSLVEEEFGEEK KLLVDCGIDTYFVDFLRDAPEAAGETSEEADAETPKIYEPIESFSHLKERLNM FLQLYNESIRGAGMDMVFFADAMVHLVKISRVIRTPQGNALLVGVGGSGK QSLTRLASFIAGYVSFQITLTRSYNTSNLMEDLKVLYRTAGQQGKGITFIFTD NEIKDESFLEYMNNVLSSGEVSNLFARDEIDEINSDLASVMKKEFPRCLPTNE NLHDYFMSRVRQNLHIVLCFSPVGEKFRNRALKFPALISGCTIDWFSRWPKD ALVAVSEHFLTSYDIDCSLEIKKEVVQCMGSFQDGVAEKCVDYFQRFRRST HVTPKSYLSFIQGYKFIYGEKHVEVRTLANRMNTGLEKLKEASESVAALSKE LEAKEKELQVANDKADMVLKEVTMKAQAAEKVKAEVQKVKDRAQAIVD SISKDKAIAEEKLEAAKPALEEAEAALQTIRPSDIATVRTLGRPPHLIMRIMD CVLLLFQRKVSAVKIDLEKSCTMPSWQESLKLMTAGNFLQNLQQFPKDTIN EEVIEFLSPYFEMPDYNIETAKRVCGNVAGLCSWTKAMASFFSINKEVLPLK ANLVVQENRHLLAMQDLQKAQAELDDKQAELDVVQAEYEQAMTEKQTLL EDAERCRHKMQTASTLISGLAGEKERWTEQSQEFAAQTKRLVGDVLLATAF LSYSGPFNQEFRDLLLNDWRKEMKARKIPFGKNLNLSEMLIDAPTISEWNLQ GLPNDDLSIQNGIIVTKASRYPLLIDPQTQGKIWIKNKESRNELQITSLNHKYF RNHLEDSLSLGRPLLIEDVGEELDPALDNVLERNFIKTGSTFKVKVGDKEVD VLDGFRLYITTKLPNPAYTPEISARTSIIDFTVTMKGLEDQLLGRVILTEKQEL 
EKERTHLMEDVTANKRRMKELEDNLLYRLTSTQGSLVEDESLIVVLSNTKR TAEEVTQKLEISAETEVQINSAREEYRPVATRGSILYFLITEMRLVNEMYQTS LRQFLGLFDLSLARSVKSPITSKRIANIIEHMTYEVYKYAARGLYEEHKFLFT LLLTLKIDIQRNRVKHEEFLTLIKGGASLDLKACPPKPSKWILDITWLNLVEL SKLRQFSDVLDQISRNEKMWKIWFDKENPEEEPLPNAYDKSLDCFRRLLLIR SWCPDRTIAQARKYIVDSMGEKYAEGVILDLEKTWEESDPRTPLICLLSMGS DPTDSIIALGKRLKIETRYVSMGQGQEVHARKLLQQTMANGGWALLQNCH LGLDFMDELMDIIIETELVHDAFRLWMTTEAHKQFPITLLQMSIKFANDPPQ GLRAGLKRTYSGVSQDLLDVSSGSQWKPMLYAVAFLHSTVQERRKFGALG WNIPYEENQADFNATVQFIQNHLDDMDVKKGVSWTTIRYMIGEIQYGGRVT DDYDKRLLNTFAKVWFSENMFGPDFSFYQGYNIPKCSTVDNYLQYIQSLPA YDSPEVFGLHPNADITYQSKLAKDVLDTILGIQPKDTSGGGDETREAVVARL ADDMLEKLPPDYVPFEVKERLQKMGPFQPMNIFLRQEIDRMQRVLSLVRST LTELKLAIDGTIIMSENLRDALDCMFDARIPAWWKKASWISSTLGFWFTELI ERNSQFTSWVFNGRPHCFWMTGFFNPQGFLTAMRQEITRANKGWALDNM VLCNEVTKWMKDDISAPPTEGVYVYGLYLEGAGWDKRNMKLIESKPKVLF ELMPVIRIYAENNTLRDPRFYSCPIYKKPVRTDLNYIAAVDLRTAQTPEHWV LRGVALLCDVK

[0116] In some embodiments, a suitable mRNA is a wild-type human DNAH5 mRNA sequence. In some embodiments, a suitable therapeutic candidate mRNA is a codon-optimized hDNAH5 sequence that encodes a DNAH5 amino acid sequence shown in Table 1 as SEQ ID NO: 1 or an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%, 98%, or 99% identical to SEQ ID NO: 1. In some embodiments, an mRNA according to the present invention encodes a DNAH5 protein with an amino acid sequence that is identical to SEQ ID NO: 1.

Codon Optimization

[0117] According to an increasing amount of research, mRNAs contain numerous layers of information that overlap the amino acid code. Traditionally, codon optimization has been used to remove rare codons which were thought to be rate-limiting for protein expression. While fast growing bacteria and yeast both exhibit strong codon bias in highly expressed genes, higher eukaryotes exhibit much less codon bias, making it more difficult to discern codons that may be rate-limiting. In addition, it has been found that codon bias per se does not necessarily yield high expression but requires other features.

[0118] For example, rare codons have been implicated in slowing translation and forming pause sites, which may be required for correct protein folding. Therefore, variations in codon usage may provide a mechanism to fine-tune the temporal pattern of elongation and thus increase the time available for a protein to take on its correct conformation. Codon optimization can interfere with this fine-tuning mechanism, resulting in less efficient protein translation or an increased amount of incorrectly folded proteins. Similarly, codon optimization may disrupt the normal patterns of cognate and wobble tRNA usage, thereby affecting protein structure and function because wobble-dependent slowing of elongation may likewise have been selected as a mechanism for achieving correct protein folding.

[0119] Despite these obstacles, the inventors have arrived at a codon-optimized hDNAH5 sequence that improves expression of the DNAH5 protein at least threefold over the coding sequence of the wild type gene. The increase in expression is not limited to cell cultures of mammalian cells but was also observed in vivo in a mouse model. It is expected that the observed improvement in expression of the codon-optimized DNAH5 coding sequence will result in an improved, more cost-effective mRNA replacement therapy for patients suffering from PCD, because it does not require the use of modified nucleotides for the preparation of the mRNA and allows treatment with a reduced dose and/or at extended dosing intervals.
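
The constraint that any codon optimization must respect can be illustrated with a minimal Python sketch of synonymous recoding. It is illustrative only: the GC-maximizing preference rule and the names `translate`, `recode` and `PREFERRED` are assumptions chosen for brevity, not the procedure used to arrive at the sequences of this application, and, as noted above, codon bias per se does not necessarily yield high expression. The sketch shows only that every codon can be swapped for a synonymous codon while the encoded amino acid sequence stays unchanged.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate an RNA coding sequence codon by codon ('*' marks a stop)."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def gc_count(codon: str) -> int:
    """Number of G or C bases in a codon."""
    return sum(base in "GC" for base in codon)


# Hypothetical preference table: for each amino acid, the synonymous codon
# with the most G/C bases (ties broken alphabetically). This is a toy rule,
# not measured codon-usage data.
PREFERRED = {}
for aa in set(AMINO_ACIDS):
    synonymous = sorted(c for c, a in CODON_TABLE.items() if a == aa)
    PREFERRED[aa] = max(synonymous, key=gc_count)


def recode(cds: str) -> str:
    """Replace every codon with the preferred synonymous codon."""
    return "".join(PREFERRED[aa] for aa in translate(cds))


wild_type = "AUGUUUCGUAUUGGU"  # toy sequence encoding M-F-R-I-G
optimized = recode(wild_type)

# The nucleotide sequence changes, the encoded protein does not.
assert translate(optimized) == translate(wild_type)
```

A realistic optimization would weigh further features (for example, pause sites and tRNA usage as discussed above) rather than a single per-codon rule.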

Exemplary Codon Optimized DNAH5 mRNA Sequences

[0120] The sequences that follow recite select, exemplary codon-optimized DNAH5 mRNA sequences.

[0121] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 6.

[0122] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 7.

[0123] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 8.

[0124] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 9.

[0125] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 10.

[0126] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 11.

[0127] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 12.

[0128] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 13.

[0129] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 14.

[0130] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 15.

[0131] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 16.

[0132] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 17.

[0133] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 18.

[0134] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 19.

[0135] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 20.

[0136] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 21.

[0137] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 22.

[0138] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 23.

[0139] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 24.

[0140] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 25.

[0141] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 26.

[0142] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 27.

[0143] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 28.

[0144] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 29.

[0145] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 30.

[0146] In some embodiments, a suitable mRNA may be a codon-optimized sequence, as shown in SEQ ID NO: 31.

[0147] In some embodiments, a suitable mRNA sequence may be an mRNA sequence encoding a homolog or an analog of human DNAH5 protein. For example, a homolog or an analog of human DNAH5 protein may be a modified human DNAH5 protein containing one or more amino acid substitutions, deletions, and/or insertions as compared to a wild-type or naturally-occurring human DNAH5 protein while retaining substantial DNAH5 protein activity. In some embodiments, an mRNA suitable for the present invention encodes an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more homologous to SEQ ID NO: 1. In some embodiments, an mRNA suitable for the present invention encodes a protein substantially identical to human DNAH5 protein. In some embodiments, an mRNA suitable for the present invention encodes an amino acid sequence at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 1. Typically, an mRNA according to the present invention encodes a DNAH5 protein with an amino acid sequence that is identical to SEQ ID NO: 1.

[0148] In some embodiments, an mRNA suitable for the present invention encodes a fragment or a portion of human DNAH5 protein. In some embodiments, an mRNA suitable for the present invention encodes a fragment or a portion of human DNAH5 protein, wherein the fragment or portion of the protein still maintains DNAH5 activity similar to that of the wild-type protein.

[0149] In some embodiments, a suitable mRNA encodes a fusion protein comprising a full length, fragment or portion of a DNAH5 protein fused to another protein (e.g., an N or C terminal fusion). In some embodiments, the protein fused to the full length, fragment or portion of a DNAH5 protein comprises a signal or a cellular targeting sequence.

[0150] In some embodiments, an mRNA suitable for the present invention comprises a nucleotide sequence at least 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 6, SEQ ID NO: 7, SEQ ID NO: 8, SEQ ID NO: 9, SEQ ID NO: 10, SEQ ID NO: 11, SEQ ID NO: 12, SEQ ID NO: 13, SEQ ID NO: 14, SEQ ID NO: 15, SEQ ID NO: 16, SEQ ID NO: 17, SEQ ID NO: 18, SEQ ID NO: 19, SEQ ID NO: 20, SEQ ID NO: 21, SEQ ID NO: 22, SEQ ID NO: 23, SEQ ID NO: 24, SEQ ID NO: 25, SEQ ID NO: 26, SEQ ID NO: 27, SEQ ID NO: 28, SEQ ID NO: 29, SEQ ID NO: 30 or SEQ ID NO: 31. More typically, an mRNA in accordance with the present invention comprises a nucleotide sequence at least 95% identical to SEQ ID NO: 6. Preferably, an mRNA according to the present invention comprises a nucleotide sequence at least 99% identical to SEQ ID NO: 7. For example, an mRNA according to the present invention comprises the nucleotide sequence of SEQ ID NO: 6 or SEQ ID NO: 7.
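
The percent-identity thresholds recited above can be checked with a short Python sketch. It rests on stated assumptions: the two sequences are supplied already aligned (equal length, with "-" as a hypothetical gap character), and identity is computed as matching columns divided by alignment length. Other conventions for the denominator exist, and in practice the alignment itself would come from an established alignment tool. The function names are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumes the inputs are pre-aligned and of equal length; a gap ('-')
    never counts as a match.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned, non-empty and equal in length")
    matches = sum(x == y and x != "-" for x, y in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the sequences are at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


reference = "MFRIGRRQLW"  # first ten residues of SEQ ID NO: 1 (Table 1)
variant = "MFRIGKRQLW"    # hypothetical variant with one substitution

print(percent_identity(reference, variant))    # 9 of 10 columns match
print(meets_identity(reference, variant, 85))  # passes an 85% threshold
print(meets_identity(reference, variant, 95))  # fails a 95% threshold
```

The same function applies to nucleotide sequences, since it compares characters column by column.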

[0151] Messenger RNAs according to the present invention may be synthesized according to any of a variety of known methods. For example, mRNAs according to the present invention may be synthesized via in vitro transcription (IVT). Briefly, IVT is typically performed with a linear or circular DNA template containing a promoter, a pool of ribonucleotide triphosphates, a buffer system that may include DTT and magnesium ions, and an appropriate RNA polymerase (e.g., T3, T7 or SP6 RNA polymerase), DNAse I, pyrophosphatase, and/or RNAse inhibitor. The exact conditions will vary according to the specific application.

[0152] In some embodiments, for the preparation of mRNA according to the invention, a DNA template is transcribed in vitro. A suitable DNA template typically has a promoter, for example a T3, T7 or SP6 promoter, for in vitro transcription, followed by the desired nucleotide sequence for the desired mRNA and a termination signal.

[0153] Typically, the mRNA according to the present invention is synthesized as unmodified mRNA. Accordingly, the mRNAs of the invention are synthesized from naturally occurring nucleotides including purines (adenine (A), guanine (G)) or pyrimidines (cytosine (C), uracil (U)).

[0154] Typically, mRNA synthesis includes the addition of a "cap" on the 5' end, and a "tail" on the 3' end. The presence of the cap is important in providing resistance to nucleases found in most eukaryotic cells. The presence of a "tail" serves to protect the mRNA from exonuclease degradation.

[0155] Thus, in some embodiments, mRNAs (e.g., DNAH5-encoding mRNAs) include a 5' cap structure. A 5' cap is typically added as follows: first, an RNA terminal phosphatase removes one of the terminal phosphate groups from the 5' nucleotide, leaving two terminal phosphates; guanosine triphosphate (GTP) is then added to the terminal phosphates via a guanylyl transferase, producing a 5'-5' triphosphate linkage; and the 7-nitrogen of guanine is then methylated by a methyltransferase. Examples of cap structures include, but are not limited to, m7G(5')ppp(5')A, G(5')ppp(5')A and G(5')ppp(5')G.

[0156] In some embodiments, mRNAs (e.g., DNAH5-encoding mRNAs) include a 3' poly(A) tail structure. A poly-A tail on the 3' terminus of mRNA typically includes about 10 to 800 adenosine nucleotides (e.g., about 300 to 500 adenosine nucleotides, about 300 to 800 adenosine nucleotides, about 10 to 500 adenosine nucleotides, about 10 to 300 adenosine nucleotides, about 10 to 200 adenosine nucleotides, about 10 to 150 adenosine nucleotides, about 10 to 100 adenosine nucleotides, about 20 to 70 adenosine nucleotides, or about 20 to 60 adenosine nucleotides). Typically, a poly-A tail in an mRNA in accordance with the invention is about 300 to about 800 adenosine nucleotides long. More commonly, the poly-A tail is about 300 adenosine nucleotides long. In some embodiments, the poly(A) tail structure comprises at least 85%, 90%, 95% or 100% adenosine.
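
The tail criteria in the preceding paragraph (a length window and a minimum adenosine content) can be expressed as a small Python check. This is a sketch under assumptions: the tail is passed in as its own string, the default bounds of 10 to 800 nucleotides and 85% adenosine are taken from the ranges recited above, and "about" is treated as an exact bound for simplicity. The function names are hypothetical.

```python
from typing import Tuple


def poly_a_tail_stats(tail: str) -> Tuple[int, float]:
    """Return (length, adenosine fraction) for a 3' tail sequence."""
    if not tail:
        raise ValueError("tail must not be empty")
    length = len(tail)
    return length, tail.count("A") / length


def is_suitable_tail(
    tail: str,
    min_len: int = 10,
    max_len: int = 800,
    min_adenosine: float = 0.85,
) -> bool:
    """Check a tail against a length window and a minimum adenosine content."""
    if not tail:
        return False
    length, fraction = poly_a_tail_stats(tail)
    return min_len <= length <= max_len and fraction >= min_adenosine


pure_tail = "A" * 300              # 300 adenosines, 100% adenosine
mixed_tail = "A" * 270 + "C" * 30  # 300 nucleotides, 90% adenosine
short_tail = "A" * 5               # below the 10-nucleotide lower bound

print(poly_a_tail_stats(pure_tail))
print(is_suitable_tail(mixed_tail))
print(is_suitable_tail(short_tail))
```

Narrower windows, such as the 300 to 800 nucleotide range described as typical, can be tested by passing different `min_len` and `max_len` values.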

[0157] In some embodiments, mRNAs include a 3' poly(C) tail structure. A suitable poly-C tail on the 3' terminus of mRNA typically includes about 10 to 200 cytosine nucleotides (e.g., about 10 to 150 cytosine nucleotides, about 10 to 100 cytosine nucleotides, about 20 to 70 cytosine nucleotides, about 20 to 60 cytosine nucleotides, or about 10 to 40 cytosine nucleotides). The poly-C tail may be added to the poly-A tail or may substitute for the poly-A tail.

[0158] In some embodiments, the mRNA further comprises a 5' untranslated region (5' UTR) comprising a nucleotide sequence and positioned between the 5' cap structure and coding sequence, and/or a 3' untranslated region (3' UTR) comprising a nucleotide sequence and positioned between the coding sequence and the poly(A) tail structure. In some embodiments, a 5' untranslated region includes one or more elements that affect an mRNA's stability or translation, for example, an iron responsive element. In some embodiments, a 5' untranslated region may be between about 50 and 500 nucleotides in length.

[0159] In some embodiments, a 3' untranslated region includes one or more of a polyadenylation signal, a binding site for proteins that affect an mRNA's stability or location in a cell, or one or more binding sites for miRNAs. In some embodiments, a 3' untranslated region may be between 50 and 500 nucleotides in length or longer.
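
The order of elements described in the preceding paragraphs (5' cap, 5' UTR, coding sequence, 3' UTR, poly(A) tail) can be summarized in a minimal Python sketch. The class name `MrnaConstruct`, its fields and the toy sequences are hypothetical; the length limits in `check` simply reuse the ranges recited above for some embodiments and are not requirements of the invention.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MrnaConstruct:
    """Hypothetical container for the parts of an mRNA, listed 5' to 3'."""
    five_prime_utr: str
    coding_sequence: str
    three_prime_utr: str
    poly_a_length: int
    cap: str = "m7G(5')ppp(5')G"  # cap is recorded as a label, not as bases

    def transcript(self) -> str:
        """Nucleotide sequence 5' to 3', excluding the cap."""
        return (
            self.five_prime_utr
            + self.coding_sequence
            + self.three_prime_utr
            + "A" * self.poly_a_length
        )

    def check(self) -> List[str]:
        """Return a list of human-readable problems (empty if none found)."""
        problems = []
        if not self.coding_sequence.startswith("AUG"):
            problems.append("coding sequence does not start with AUG")
        if len(self.coding_sequence) % 3:
            problems.append("coding sequence length is not a multiple of 3")
        if not 50 <= len(self.five_prime_utr) <= 500:
            problems.append("5' UTR outside 50-500 nucleotides")
        if len(self.three_prime_utr) < 50:
            problems.append("3' UTR shorter than 50 nucleotides")
        if not 10 <= self.poly_a_length <= 800:
            problems.append("poly(A) tail outside 10-800 nucleotides")
        return problems


construct = MrnaConstruct(
    five_prime_utr="G" * 60,               # toy placeholder UTR
    coding_sequence="AUGUUUCGUAUUGGUUAA",  # toy CDS: M-F-R-I-G-stop
    three_prime_utr="C" * 60,              # toy placeholder UTR
    poly_a_length=300,
)
print(construct.check())
print(len(construct.transcript()))
```

Keeping the parts separate until `transcript` is called makes it straightforward to swap a UTR or change the tail length without touching the coding sequence.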

Modified mRNA

[0160] mRNAs according to the present invention are typically synthesized as unmodified mRNAs. In some embodiments, it may be advantageous to synthesize an mRNA comprising a codon-optimized DNAH5 coding sequence of the present invention with one or more modified nucleotides. Typically, mRNAs are modified to enhance their stability or reduce their immunogenic properties, in particular when administered to a subject as naked mRNAs or in complexed form. Therefore, providing a modified mRNA comprising a codon-optimized DNAH5 coding sequence of the present invention may have synergistic effects, resulting in sustained in vivo function that exceeds that observed with unmodified mRNAs.

[0161] Modifications of mRNA can include, for example, modifications of the nucleotides of the RNA. A modified mRNA according to the invention can thus include, for example, backbone modifications, sugar modifications or base modifications. In some embodiments, mRNAs may be synthesized from naturally occurring nucleotides and/or nucleotide analogues (modified nucleotides) including, but not limited to, purines (adenine (A), guanine (G)) or pyrimidines (thymine (T), cytosine (C), uracil (U)), and as modified nucleotides analogues or derivatives of purines and pyrimidines, such as e.g. 1-methyl-adenine, 2-methyl-adenine, 2-methylthio-N-6-isopentenyl-adenine, N6-methyl-adenine, N6-isopentenyl-adenine, 2-thio-cytosine, 3-methyl-cytosine, 4-acetyl-cytosine, 5-methyl-cytosine, 2,6-diaminopurine, 1-methyl-guanine, 2-methyl-guanine, 2,2-dimethyl-guanine, 7-methyl-guanine, inosine, 1-methyl-inosine, pseudouracil (5-uracil), dihydro-uracil, 2-thio-uracil, 4-thio-uracil, 5-carboxymethylaminomethyl-2-thio-uracil, 5-(carboxyhydroxymethyl)-uracil, 5-fluoro-uracil, 5-bromo-uracil, 5-carboxymethylaminomethyl-uracil, 5-methyl-2-thio-uracil, 5-methyl-uracil, N-uracil-5-oxyacetic acid methyl ester, 5-methylaminomethyl-uracil, 5-methoxyaminomethyl-2-thio-uracil, 5'-methoxycarbonylmethyl-uracil, 5-methoxy-uracil, uracil-5-oxyacetic acid methyl ester, uracil-5-oxyacetic acid (v), 1-methyl-pseudouracil, queuosine, β-D-mannosyl-queuosine, wybutoxosine, and phosphoramidates, phosphorothioates, peptide nucleotides, methylphosphonates, 7-deazaguanosine, 5-methylcytosine and inosine. The preparation of such analogues is known to a person skilled in the art e.g. from the U.S. Pat. Nos. 4,373,071, 4,401,796, 4,415,732, 4,458,066, 4,500,707, 4,668,777, 4,973,679, 5,047,524, 5,132,418, 5,153,319, 5,262,530 and 5,700,642, the disclosures of which are incorporated by reference in their entirety.

[0162] In some embodiments, mRNAs of the present invention may contain RNA backbone modifications. Typically, a backbone modification is a modification in which the phosphates of the backbone of the nucleotides contained in the RNA are modified chemically. Exemplary backbone modifications typically include, but are not limited to, modifications from the group consisting of methylphosphonates, methylphosphoramidates, phosphoramidates, phosphorothioates (e.g. cytidine 5'-O-(1-thiophosphate)), boranophosphates, positively charged guanidinium groups etc., that is, modifications that replace the phosphodiester linkage with other anionic, cationic or neutral groups.

[0163] In some embodiments, mRNAs of the present invention may contain sugar modifications. A typical sugar modification is a chemical modification of the sugar of the nucleotides it contains including, but not limited to, sugar modifications chosen from the group consisting of 2'-deoxy-2'-fluoro-oligoribonucleotide (2'-fluoro-2'-deoxycytidine 5'-triphosphate, 2'-fluoro-2'-deoxyuridine 5'-triphosphate), 2'-deoxy-2'-deamine-oligoribonucleotide (2'-amino-2'-deoxycytidine 5'-triphosphate, 2'-amino-2'-deoxyuridine 5'-triphosphate), 2'-O-alkyloligoribonucleotide, 2'-deoxy-2'-C-alkyloligoribonucleotide (2'-O-methylcytidine 5'-triphosphate, 2'-methyluridine 5'-triphosphate), 2'-C-alkyloligoribonucleotide, and isomers thereof (2'-aracytidine 5'-triphosphate, 2'-arauridine 5'-triphosphate), or azidotriphosphates (2'-azido-2'-deoxycytidine 5'-triphosphate, 2'-azido-2'-deoxyuridine 5'-triphosphate).

[0164] In some embodiments, mRNAs of the present invention may contain modifications of the bases of the nucleotides (base modifications). A modified nucleotide which contains a base modification is also called a base-modified nucleotide. Examples of such base-modified nucleotides include, but are not limited to, 2-amino-6-chloropurine riboside 5'-triphosphate, 2-aminoadenosine 5'-triphosphate, 2-thiocytidine 5'-triphosphate, 2-thiouridine 5'-triphosphate, 4-thiouridine 5'-triphosphate, 5-aminoallylcytidine 5'-triphosphate, 5-aminoallyluridine 5'-triphosphate, 5-bromocytidine 5'-triphosphate, 5-bromouridine 5'-triphosphate, 5-iodocytidine 5'-triphosphate, 5-iodouridine 5'-triphosphate, 5-methylcytidine 5'-triphosphate, 5-methyluridine 5'-triphosphate, 6-azacytidine 5'-triphosphate, 6-azauridine 5'-triphosphate, 6-chloropurine riboside 5'-triphosphate, 7-deazaadenosine 5'-triphosphate, 7-deazaguanosine 5'-triphosphate, 8-azaadenosine 5'-triphosphate, 8-azidoadenosine 5'-triphosphate, benzimidazole riboside 5'-triphosphate, N1-methyladenosine 5'-triphosphate, N1-methylguanosine 5'-triphosphate, N6-methyladenosine 5'-triphosphate, O6-methylguanosine 5'-triphosphate, pseudouridine 5'-triphosphate, puromycin 5'-triphosphate or xanthosine 5'-triphosphate.

Cap Structure

[0165] In some embodiments, mRNAs include a 5' cap structure. A 5' cap is typically added as follows: first, an RNA terminal phosphatase removes one of the terminal phosphate groups from the 5' nucleotide, leaving two terminal phosphates; guanosine triphosphate (GTP) is then added to the terminal phosphates via a guanylyl transferase, producing a 5'-5' triphosphate linkage; and the 7-nitrogen of guanine is then methylated by a methyltransferase. Examples of cap structures include, but are not limited to, m7G(5')ppp(5')A, G(5')ppp(5')A and G(5')ppp(5')G.

[0166] Naturally occurring cap structures comprise a 7-methyl guanosine that is linked via a triphosphate bridge to the 5'-end of the first transcribed nucleotide, resulting in a dinucleotide cap of m7G(5')ppp(5')N, where N is any nucleoside. In vivo, the cap is added enzymatically. The cap is added in the nucleus and is catalyzed by the enzyme guanylyl transferase. The addition of the cap to the 5' terminal end of RNA occurs immediately after initiation of transcription. The terminal nucleoside is typically a guanosine, and is in the reverse orientation to all the other nucleotides, i.e., G(5')ppp(5')GpNpNp.

[0167] A common cap for mRNA produced by in vitro transcription is m7G(5')ppp(5')G, which has been used as the dinucleotide cap in transcription with T7 or SP6 RNA polymerase in vitro to obtain RNAs having a cap structure in their 5'-termini. The prevailing method for the in vitro synthesis of capped mRNA employs a pre-formed dinucleotide of the form m7G(5')ppp(5')G ("m7GpppG") as an initiator of transcription.

[0168] To date, a usual form of a synthetic dinucleotide cap used in in vitro translation experiments is the Anti-Reverse Cap Analog ("ARCA") or modified ARCA, which is generally a modified cap analog in which the 2' or 3' OH group is replaced with --OCH3.

[0169] Additional cap analogs include, but are not limited to, chemical structures selected from the group consisting of m7GpppG, m7GpppA, m7GpppC; unmethylated cap analogs (e.g., GpppG); dimethylated cap analog (e.g., m2,7GpppG), trimethylated cap analog (e.g., m2,2,7GpppG), dimethylated symmetrical cap analogs (e.g., m7Gpppm7G), or anti-reverse cap analogs (e.g., ARCA; m7,2'OmeGpppG, m7,2'dGpppG, m7,3'OmeGpppG, m7,3'dGpppG and their tetraphosphate derivatives) (see, e.g., Jemielity, J. et al., "Novel 'anti-reverse' cap analogs with superior translational properties", RNA, 9: 1108-1122 (2003)).

[0170] In some embodiments, a suitable cap is a 7-methyl guanylate ("m7G") linked via a triphosphate bridge to the 5'-end of the first transcribed nucleotide, resulting in m7G(5')ppp(5')N, where N is any nucleoside. A preferred embodiment of a m7G cap utilized in embodiments of the invention is m7G(5')ppp(5')G.

[0171] In some embodiments, the cap is a Cap0 structure. Cap0 structures lack a 2'-O-methyl residue of the ribose attached to bases 1 and 2. In some embodiments, the cap is a Cap1 structure. Cap1 structures have a 2'-O-methyl residue at base 2. In some embodiments, the cap is a Cap2 structure. Cap2 structures have a 2'-O-methyl residue attached to both bases 2 and 3.

[0172] A variety of m7G cap analogs are known in the art, many of which are commercially available. These include the m7GpppG described above, as well as the ARCA 3'-OCH3 and 2'-OCH3 cap analogs (Jemielity, J. et al., RNA, 9: 1108-1122 (2003)). Additional cap analogs for use in embodiments of the invention include N7-benzylated dinucleoside tetraphosphate analogs (described in Grudzien, E. et al., RNA, 10: 1479-1487 (2004)), phosphorothioate cap analogs (described in Grudzien-Nogalska, E., et al., RNA, 13: 1745-1755 (2007)), and cap analogs (including biotinylated cap analogs) described in U.S. Pat. Nos. 8,093,367 and 8,304,529, incorporated by reference herein.

Tail Structure

[0173] Typically, the presence of a "tail" serves to protect the mRNA from exonuclease degradation. The poly-A tail is thought to stabilize natural messengers and synthetic sense RNA. Therefore, in certain embodiments a long poly-A tail can be added to an mRNA molecule thus rendering the RNA more stable. Poly-A tails can be added using a variety of art-recognized techniques. For example, long poly-A tails can be added to synthetic or in vitro transcribed RNA using poly A polymerase (Yokoe, et al. Nature Biotechnology. 1996; 14: 1252-1256). A transcription vector can also encode long poly-A tails. In addition, poly-A tails can be added by transcription directly from PCR products. Poly-A may also be ligated to the 3' end of a sense RNA with RNA ligase (see, e.g., Molecular Cloning A Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring Harbor Laboratory Press: 1991 edition)).

[0174] In some embodiments, mRNAs include a 3' poly(A) tail structure. Typically, the poly-A tail can be at least about 10, 50, 100, 200, 300, 400 or 500 nucleotides in length. In some embodiments, a poly-A tail on the 3' terminus of mRNA typically includes about 10 to 800 adenosine nucleotides (e.g., about 300 to 500 adenosine nucleotides, about 300 to 800 adenosine nucleotides, about 10 to 200 adenosine nucleotides, about 10 to 150 adenosine nucleotides, about 10 to 100 adenosine nucleotides, about 20 to 70 adenosine nucleotides, or about 20 to 60 adenosine nucleotides). Typically, a poly-A tail in an mRNA in accordance with the invention is about 300 to about 800 adenosine nucleotides long. More commonly, the poly-A tail is about 300 adenosine nucleotides long.

[0175] In some embodiments, mRNAs include a 3' poly(C) tail structure. A suitable poly-C tail on the 3' terminus of mRNA typically includes about 10 to 200 cytosine nucleotides (e.g., about 10 to 150 cytosine nucleotides, about 10 to 100 cytosine nucleotides, about 20 to 70 cytosine nucleotides, about 20 to 60 cytosine nucleotides, or about 10 to 40 cytosine nucleotides). The poly-C tail may be added to the poly-A tail or may substitute for the poly-A tail.

[0176] In some embodiments, the length of the poly A or poly C tail is adjusted to control the stability of a modified sense mRNA molecule of the invention and, thus, the translation of protein. For example, since the length of the poly A tail can influence the half-life of a sense mRNA molecule, the length of the poly A tail can be adjusted to modify the level of resistance of the mRNA to nucleases and thereby control the time course of polynucleotide expression and/or polypeptide production in a target cell.
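
The tail lengths discussed above can be illustrated with a short sketch. The Python below is only an illustration and not part of the described methods: the function names and the placeholder sequence are hypothetical, the poly-C tail (when present) is assumed to sit 3' of the poly-A tail, and the "about" qualifiers on the stated ranges are treated as exact bounds for simplicity.

```python
from typing import Tuple

# Length ranges taken from the text above; "about" is treated as exact here.
POLY_A_RANGE = (10, 800)   # adenosine nucleotides
POLY_C_RANGE = (10, 200)   # cytosine nucleotides


def tail_lengths(mrna: str) -> Tuple[int, int]:
    """Return (poly_a_length, poly_c_length) read from the 3' end.

    Assumes a layout of body-A(n)-C(m): an optional poly-C run that
    follows an optional poly-A run (an assumption made for this sketch).
    """
    without_c = mrna.rstrip("C")
    poly_c = len(mrna) - len(without_c)
    without_a = without_c.rstrip("A")
    poly_a = len(without_c) - len(without_a)
    return poly_a, poly_c


def in_range(length: int, bounds: Tuple[int, int]) -> bool:
    """True if length falls within the inclusive bounds."""
    low, high = bounds
    return low <= length <= high


# Hypothetical placeholder body (not a DNAH5 sequence); it ends in U so
# that the start of the tail is unambiguous.
body = "GGGAUGGCCUAAGCU"
mrna = body + "A" * 300 + "C" * 20

poly_a, poly_c = tail_lengths(mrna)
print(poly_a, in_range(poly_a, POLY_A_RANGE))
print(poly_c, in_range(poly_c, POLY_C_RANGE))
```

A sequence whose body itself ends in A or C would be over-counted by this simple approach, so an experimental measurement (for example, by gel electrophoresis) remains the reference for actual tail length.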

5' and 3' Untranslated Region

[0177] In some embodiments, mRNAs include a 5' untranslated region (UTR). In some embodiments, mRNAs include a 3' untranslated region. In some embodiments, mRNAs include both a 5' untranslated region and a 3' untranslated region. In some embodiments, a 5' untranslated region includes one or more elements that affect an mRNA's stability or translation, for example, an iron responsive element. In some embodiments, a 5' untranslated region may be between about 50 and 500 nucleotides in length.

[0178] In some embodiments, a 3' untranslated region includes one or more of a polyadenylation signal, a binding site for proteins that affect an mRNA's stability or location in a cell, or one or more binding sites for miRNAs. In some embodiments, a 3' untranslated region may be between 50 and 500 nucleotides in length or longer.
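
As an illustration of the features listed above, the following Python sketch checks a 3' untranslated region against the stated length range and scans it for a polyadenylation signal. The sequence is the one given for SEQ ID NO: 4 later in this document; the use of the canonical AAUAAA hexamer as the signal of interest, and all function names, are assumptions made for this example.

```python
from typing import List

# Canonical polyadenylation hexamer; assumed here as the signal of interest.
POLY_A_SIGNAL = "AAUAAA"

# 3' UTR as listed for SEQ ID NO: 4 in this document.
UTR3 = (
    "CGGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAG"
    "UUGCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGCAUC"
    "AAGCU"
)


def find_motif(sequence: str, motif: str) -> List[int]:
    """Return 0-based start positions of every (possibly overlapping) hit."""
    hits = []
    start = sequence.find(motif)
    while start != -1:
        hits.append(start)
        start = sequence.find(motif, start + 1)
    return hits


def length_ok(sequence: str, low: int = 50, high: int = 500) -> bool:
    """Check the 50 to 500 nucleotide range; the text also allows longer UTRs."""
    return low <= len(sequence) <= high


hits = find_motif(UTR3, POLY_A_SIGNAL)
print(len(UTR3), length_ok(UTR3), hits)
```

The same `find_motif` helper could be pointed at other short motifs, such as a candidate miRNA binding site, provided the motif is supplied by the user.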

[0179] Exemplary 3' and 5' untranslated region sequences can be derived from mRNA molecules which are stable (e.g., globin, actin, GAPDH, tubulin, histone, or citric acid cycle enzymes) to increase the stability of the sense mRNA molecule. For example, a 5' UTR sequence may include a partial sequence of a CMV immediate-early 1 (IE1) gene, or a fragment thereof to improve the nuclease resistance and/or improve the half-life of the polynucleotide. Also contemplated is the inclusion of a sequence encoding human growth hormone (hGH), or a fragment thereof to the 3' end or untranslated region of the polynucleotide (e.g., mRNA) to further stabilize the polynucleotide. Generally, these modifications improve the stability and/or pharmacokinetic properties (e.g., half-life) of the polynucleotide relative to their unmodified counterparts, and include, for example modifications made to improve such polynucleotides' resistance to in vivo nuclease digestion.

[0180] In certain embodiments, the codon-optimized DNAH5 mRNA includes a codon-optimized coding region flanked by 5' and 3' untranslated regions, represented as X and Y, respectively (vide infra):

[0181] X-Coding Region-Y

where the coding region sequence is SEQ ID NO: 6, or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 6; or SEQ ID NO: 7 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 7; SEQ ID NO: 8 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 8; SEQ ID NO: 9 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 9; SEQ ID NO: 10 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 10; SEQ ID NO: 11 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 11; SEQ ID NO: 12 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 12; SEQ ID NO: 13 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 13; SEQ ID NO: 14 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 14; SEQ ID NO: 15 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 15; SEQ ID NO: 16 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 16; SEQ ID NO: 17 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 17; SEQ ID NO: 18 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 18; SEQ ID NO: 19 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 19; SEQ ID NO: 20 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 20; SEQ ID NO: 21 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 21; SEQ ID NO: 22 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 22; SEQ ID NO: 23 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 23; SEQ ID NO: 24 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 24; SEQ ID NO: 25 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 25; SEQ ID NO: 26 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 26; SEQ ID NO: 27 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 27; SEQ ID NO: 28 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 28; SEQ ID NO: 29 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 29; SEQ ID NO: 30 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 30; SEQ ID NO: 31 or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 31; and where

X (5' UTR Sequence) is (SEQ ID NO: 2)

AGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGAAG
ACACCGGGACCGAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGAACGC
GGAUUCCCCGUGCCAAGAGUGACUCACCGUCCUUGACACG

or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 2, or (SEQ ID NO: 3)

GGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGAAG
ACACCGGGACCGAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGAACGC
GGAUUCCCCGUGCCAAGAGUGACUCACCGUCCUUGACACG

or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 3; and where Y (3' UTR Sequence) is (SEQ ID NO: 4)

CGGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAG
UUGCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGCAUC
AAGCU

or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 4, or (SEQ ID NO: 5)

GGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAGU
UGCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGCAUCA
AAGCU

or a sequence 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or more identical to SEQ ID NO: 5.
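
The X-Coding Region-Y layout and the percent-identity language above can be made concrete with a small sketch. The Python below is illustrative only: the coding region is a short hypothetical placeholder (the sequences of SEQ ID NOs: 6-31 are not reproduced in this section), the function names are invented for the example, and the identity measure is a simple ungapped, position-by-position comparison of equal-length sequences rather than the alignment-based calculation normally used to determine sequence identity.

```python
# 5' UTR (X) as listed for SEQ ID NO: 2 and 3' UTR (Y) as listed for SEQ ID NO: 4.
UTR5_SEQ2 = (
    "AGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGAAG"
    "ACACCGGGACCGAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGAACGC"
    "GGAUUCCCCGUGCCAAGAGUGACUCACCGUCCUUGACACG"
)
UTR3_SEQ4 = (
    "CGGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAG"
    "UUGCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGCAUC"
    "AAGCU"
)

# SEQ ID NO: 3 as listed differs from SEQ ID NO: 2 only at the first
# nucleotide (G instead of A), so it is derived here rather than retyped.
UTR5_SEQ3 = "G" + UTR5_SEQ2[1:]

# Hypothetical stand-in for a coding region; NOT a DNAH5 sequence.
PLACEHOLDER_CDS = "AUGGCCUAA"


def assemble_mrna(x: str, coding_region: str, y: str) -> str:
    """Concatenate 5' UTR (X), coding region, and 3' UTR (Y) in 5'-to-3' order."""
    return x + coding_region + y


def ungapped_identity(a: str, b: str) -> float:
    """Percent of matching positions between two equal-length sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for p, q in zip(a, b) if p == q)
    return 100.0 * matches / len(a)


construct = assemble_mrna(UTR5_SEQ2, PLACEHOLDER_CDS, UTR3_SEQ4)
identity = ungapped_identity(UTR5_SEQ2, UTR5_SEQ3)
print(len(construct), round(identity, 2))
```

Under these assumptions the two 5' UTR variants are more than 99% identical. Sequences that appear to differ by an inserted or deleted nucleotide, such as SEQ ID NO: 4 and SEQ ID NO: 5, would need a gapped alignment before identity could be computed meaningfully.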

In Vitro Transcription

[0182] In certain embodiments of the invention, a codon-optimized human dynein axonemal heavy chain 5 messenger RNA (DNAH5 mRNA) is synthesized by in vitro transcription from a plasmid DNA template encoding the gene, which is followed by the addition of a 5' cap structure (Fechter, P.; Brownlee, G. G. "Recognition of mRNA cap structures by viral and cellular proteins" J. Gen. Virology 2005, 86, 1239-1249) and a 3' poly(A) tail of approximately 100, 200, 250, 300, 400, 500 or 800 nucleotides in length as determined by gel electrophoresis.

Delivery Vehicles

[0183] According to the present invention, mRNA encoding a DNAH5 protein (e.g., a full-length DNAH5 protein, or a fragment or portion thereof) as described herein may be delivered as naked RNA (unpackaged) or via delivery vehicles. As used herein, the terms "delivery vehicle," "transfer vehicle," "nanoparticle" or grammatical equivalents, are used interchangeably.

[0184] In some embodiments, mRNAs encoding a DNAH5 protein may be delivered via a single delivery vehicle. In some embodiments, mRNAs encoding a DNAH5 protein may be delivered via one or more delivery vehicles each of a different composition. According to various embodiments, suitable delivery vehicles include, but are not limited to, polymer-based carriers, such as polyethyleneimine (PEI), lipid nanoparticles and liposomes, nanoliposomes, ceramide-containing nanoliposomes, proteoliposomes, both natural and synthetically-derived exosomes, natural, synthetic and semi-synthetic lamellar bodies, nanoparticulates, calcium phosphor-silicate nanoparticulates, calcium phosphate nanoparticulates, silicon dioxide nanoparticulates, nanocrystalline particulates, semiconductor nanoparticulates, poly(D-arginine), sol-gels, nanodendrimers, starch-based delivery systems, micelles, emulsions, niosomes, multi-domain-block polymers (vinyl polymers, polypropyl acrylic acid polymers, dynamic polyconjugates), dry powder formulations, plasmids, viruses, calcium phosphate nucleotides, aptamers, peptides and other vectorial tags.

[0185] Polymers

[0186] In some embodiments, a suitable delivery vehicle is formulated using a polymer as a carrier, alone or in combination with other carriers including various lipids described herein. Thus, in some embodiments, liposomal delivery vehicles, as used herein, also encompass polymer-containing nanoparticles. Suitable polymers may include, for example, polyacrylates, polyalkylcyanoacrylates, polylactide, polylactide-polyglycolide copolymers, polycaprolactones, dextran, albumin, gelatin, alginate, collagen, chitosan, cyclodextrins, protamine, PEGylated protamine, PLL, PEGylated PLL and polyethylenimine (PEI). When PEI is present, it may be branched PEI of a molecular weight ranging from 10 to 40 kDa, e.g., 25 kDa branched PEI (Sigma #408727).

[0187] Liposomes

[0188] In some embodiments, a suitable delivery vehicle is a liposome. As used herein, liposomes are usually characterized as microscopic vesicles having an interior aqueous space sequestered from an outer medium by a membrane of one or more bilayers. Bilayer membranes of liposomes are typically formed by amphiphilic molecules, such as lipids of synthetic or natural origin that comprise spatially separated hydrophilic and hydrophobic domains (Lasic, Trends Biotechnol., 16: 307-321, 1998). Bilayer membranes of the liposomes can also be formed by amphiphilic polymers and surfactants (e.g., polymersomes, niosomes, etc.). In the context of the present invention, a liposome typically serves to transport a desired mRNA to a target cell or tissue. A typical liposome in accordance with the invention comprises one or more cationic lipids, one or more non-cationic lipids, one or more cholesterol-based lipids and one or more PEG-modified lipids.

[0189] Cationic Lipids

[0190] As used herein, the phrase "cationic lipids" refers to any of a number of lipid species that have a net positive charge at a selected pH, such as physiological pH.

[0191] Several cationic lipids have been described in the literature, many of which are commercially available. Suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2010/144740, which is incorporated herein by reference.

[0192] In certain embodiments, the compositions and methods of the present invention include a cationic lipid, (6Z,9Z,28Z,31Z)-heptatriaconta-6,9,28,31-tetraen-19-yl 4-(dimethylamino) butanoate, having a compound structure of:

##STR00001##

and pharmaceutically acceptable salts thereof.

[0193] Other suitable cationic lipids for use in the compositions and methods of the present invention include ionizable cationic lipids as described in International Patent Publication WO 2013/149140, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid of one of the following formulas:

##STR00002##

[0194] or a pharmaceutically acceptable salt thereof, wherein R1 and R2 are each independently selected from the group consisting of hydrogen, an optionally substituted, variably saturated or unsaturated C1-C20 alkyl and an optionally substituted, variably saturated or unsaturated C6-C20 acyl; wherein L1 and L2 are each independently selected from the group consisting of hydrogen, an optionally substituted C1-C30 alkyl, an optionally substituted variably unsaturated C1-C30 alkenyl, and an optionally substituted C1-C30 alkynyl; wherein m and o are each independently selected from the group consisting of zero and any positive integer (e.g., where m is three); and wherein n is zero or any positive integer (e.g., where n is one). In certain embodiments, the compositions and methods of the present invention include the cationic lipid (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl) tetracosa-15,18-dien-1-amine ("HGT5000"), having a compound structure of:

##STR00003##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include the cationic lipid (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl) tetracosa-4,15,18-trien-1-amine ("HGT5001"), having a compound structure of:

##STR00004##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include the cationic lipid (15Z,18Z)-N,N-dimethyl-6-((9Z,12Z)-octadeca-9,12-dien-1-yl) tetracosa-5,15,18-trien-1-amine ("HGT5002"), having a compound structure of:

##STR00005##

and pharmaceutically acceptable salts thereof.

[0195] Other suitable cationic lipids for use in the compositions and methods of the invention include cationic lipids described as aminoalcohol lipidoids in International Patent Publication WO 2010/053572, which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00006##

and pharmaceutically acceptable salts thereof.

[0196] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2016/118725, which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00007##

and pharmaceutically acceptable salts thereof.

[0197] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2016/118724, which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00008##

and pharmaceutically acceptable salts thereof.

[0198] Other suitable cationic lipids for use in the compositions and methods of the invention include a cationic lipid having the formula of 14,25-ditridecyl 15,18,21,24-tetraaza-octatriacontane, and pharmaceutically acceptable salts thereof.

[0199] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publications WO 2013/063468 and WO 2016/205691, each of which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid of the following formula:

##STR00009##

or pharmaceutically acceptable salts thereof, wherein each instance of RL is independently optionally substituted C6-C40 alkenyl. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00010##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00011##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00012##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00013##

and pharmaceutically acceptable salts thereof.

[0200] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2015/184256, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid of the following formula:

##STR00014##

or a pharmaceutically acceptable salt thereof, wherein each X independently is O or S; each Y independently is O or S; each m independently is 0 to 20; each n independently is 1 to 6; each RA is independently hydrogen, optionally substituted C1-50 alkyl, optionally substituted C2-50 alkenyl, optionally substituted C2-50 alkynyl, optionally substituted C3-10 carbocyclyl, optionally substituted 3-14 membered heterocyclyl, optionally substituted C6-14 aryl, optionally substituted 5-14 membered heteroaryl or halogen; and each RB is independently hydrogen, optionally substituted C1-50 alkyl, optionally substituted C2-50 alkenyl, optionally substituted C2-50 alkynyl, optionally substituted C3-10 carbocyclyl, optionally substituted 3-14 membered heterocyclyl, optionally substituted C6-14 aryl, optionally substituted 5-14 membered heteroaryl or halogen. In certain embodiments, the compositions and methods of the present invention include a cationic lipid, "Target 23", having a compound structure of:

##STR00015##

and pharmaceutically acceptable salts thereof.

[0201] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2016/004202, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00016##

or a pharmaceutically acceptable salt thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00017##

or a pharmaceutically acceptable salt thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00018##

or a pharmaceutically acceptable salt thereof.

[0202] Other suitable cationic lipids for use in the compositions and methods of the present invention include the cationic lipids as described in J. McClellan, M. C. King, Cell 2010, 141, 210-217 and in Whitehead et al., Nature Communications (2014) 5:4277, each of which is incorporated herein by reference. In certain embodiments, the cationic lipids of the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00019##

and pharmaceutically acceptable salts thereof.

[0203] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2015/199952, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00020##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00021##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00022##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00023##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00024##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00025##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00026##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00027##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00028##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00029##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00030##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00031##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00032##

and pharmaceutically acceptable salts thereof.

[0204] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2017/004143, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00033##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00034##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00035##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00036##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00037##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00038##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00039##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00040##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00041##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00042##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00043##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00044##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00045##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00046##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00047##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00048##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00049##

and pharmaceutically acceptable salts thereof.

[0205] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2017/075531, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid of the following formula:

##STR00050##

or a pharmaceutically acceptable salt thereof, wherein one of L1 or L2 is --O(C=O)--, --(C=O)O--, --C(=O)--, --O--, --S(O)x--, --S--S--, --C(=O)S--, --SC(=O)--, --NRaC(=O)--, --C(=O)NRa--, --NRaC(=O)NRa--, --OC(=O)NRa--, or --NRaC(=O)O--; and the other of L1 or L2 is --O(C=O)--, --(C=O)O--, --C(=O)--, --O--, --S(O)x--, --S--S--, --C(=O)S--, --SC(=O)--, --NRaC(=O)--, --C(=O)NRa--, --NRaC(=O)NRa--, --OC(=O)NRa-- or --NRaC(=O)O-- or a direct bond; G1 and G2 are each independently unsubstituted C1-C12 alkylene or C1-C12 alkenylene; G3 is C1-C24 alkylene, C1-C24 alkenylene, C3-C8 cycloalkylene, or C3-C8 cycloalkenylene; Ra is H or C1-C12 alkyl; R1 and R2 are each independently C6-C24 alkyl or C6-C24 alkenyl; R3 is H, OR5, CN, --C(=O)OR4, --OC(=O)R4 or --NR5C(=O)R4; R4 is C1-C12 alkyl; R5 is H or C1-C6 alkyl; and x is 0, 1 or 2.

[0206] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2017/117528, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00051##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00052##

and pharmaceutically acceptable salts thereof. In some embodiments, the compositions and methods of the present invention include a cationic lipid having the compound structure:

##STR00053##

and pharmaceutically acceptable salts thereof.

[0207] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2017/049245, which is incorporated herein by reference. In some embodiments, the cationic lipids of the compositions and methods of the present invention include a compound of one of the following formulas:

##STR00054##

and pharmaceutically acceptable salts thereof. For any one of these four formulas, R4 is independently selected from --(CH2)nQ and --(CH2)nCHQR; Q is selected from the group consisting of --OR, --OH, --O(CH2)nN(R)2, --OC(O)R, --CX3, --CN, --N(R)C(O)R, --N(H)C(O)R, --N(R)S(O)2R, --N(H)S(O)2R, --N(R)C(O)N(R)2, --N(H)C(O)N(R)2, --N(H)C(O)N(H)(R), --N(R)C(S)N(R)2, --N(H)C(S)N(R)2, --N(H)C(S)N(H)(R), and a heterocycle; and n is 1, 2, or 3. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00055##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00056##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00057##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00058##

and pharmaceutically acceptable salts thereof.

[0208] Other suitable cationic lipids for use in the compositions and methods of the invention include the cationic lipids as described in International Patent Publication WO 2017/173054 and WO 2015/095340, each of which is incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00059##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00060##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00061##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid having a compound structure of:

##STR00062##

and pharmaceutically acceptable salts thereof.

[0209] Other suitable cationic lipids for use in the compositions and methods of the present invention include cleavable cationic lipids as described in International Patent Publication WO 2012/170889, which is incorporated herein by reference. In some embodiments, the compositions and methods of the present invention include a cationic lipid of the following formula:

##STR00063##

wherein R.sub.1 is selected from the group consisting of imidazole, guanidinium, amino, imine, enamine, an optionally-substituted alkyl amino (e.g., an alkyl amino such as dimethylamino) and pyridyl; wherein R.sub.2 is selected from the group consisting of one of the following two formulas:

##STR00064##

and wherein R.sub.3 and R.sub.4 are each independently selected from the group consisting of an optionally substituted, variably saturated or unsaturated C.sub.6-C.sub.20 alkyl and an optionally substituted, variably saturated or unsaturated C.sub.6-C.sub.20 acyl; and wherein n is zero or any positive integer (e.g., one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve, thirteen, fourteen, fifteen, sixteen, seventeen, eighteen, nineteen, twenty or more). In certain embodiments, the compositions and methods of the present invention include a cationic lipid, "HGT4001", having a compound structure of:

##STR00065##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid, "HGT4002", having a compound structure of:

##STR00066##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid, "HGT4003", having a compound structure of:

##STR00067##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid, "HGT4004", having a compound structure of:

##STR00068##

and pharmaceutically acceptable salts thereof. In certain embodiments, the compositions and methods of the present invention include a cationic lipid "HGT4005", having a compound structure of:

##STR00069##

and pharmaceutically acceptable salts thereof.

[0210] Other suitable cationic lipids for use in the compositions and methods of the present invention include cleavable cationic lipids as described in U.S. Provisional Application No. 62/672,194, filed May 16, 2018, and incorporated herein by reference. In certain embodiments, the compositions and methods of the present invention include a cationic lipid that is any of general formulas or any of structures (1a)-(21a) and (1b)-(21b) and (22)-(237) described in U.S. Provisional Application No. 62/672,194. In certain embodiments, the compositions and methods of the present invention include a cationic lipid that has a structure according to Formula (I'),

##STR00070##

[0211] wherein:

[0212] R.sup.X is independently --H, -L.sup.1-R.sup.1, or -L.sup.5A-L.sup.5B-B';

[0213] each of L.sup.1, L.sup.2, and L.sup.3 is independently a covalent bond, --C(O)--, --C(O)O--, --C(O)S--, or --C(O)NR.sup.L--;

[0214] each L.sup.4A and L.sup.5A is independently --C(O)--, --C(O)O--, or --C(O)NR.sup.L--;

[0215] each L.sup.4B and L.sup.5B is independently C.sub.1-C.sub.20 alkylene; C.sub.2-C.sub.20 alkenylene; or C.sub.2-C.sub.20 alkynylene;

[0216] each B and B' is NR.sup.4R.sup.5 or a 5- to 10-membered nitrogen-containing heteroaryl;

[0217] each R.sup.1, R.sup.2, and R.sup.3 is independently C.sub.6-C.sub.30 alkyl, C.sub.6-C.sub.30 alkenyl, or C.sub.6-C.sub.30 alkynyl;

[0218] each R.sup.4 and R.sup.5 is independently hydrogen, C.sub.1-C.sub.10 alkyl; C.sub.2-C.sub.10 alkenyl; or C.sub.2-C.sub.10 alkynyl; and

[0219] each R.sup.L is independently hydrogen, C.sub.1-C.sub.20 alkyl, C.sub.2-C.sub.20 alkenyl, or C.sub.2-C.sub.20 alkynyl. In certain embodiments, the compositions and methods of the present invention include a cationic lipid that is Compound (139) of 62/672,194, having a compound structure of:

##STR00071##

[0220] In some embodiments, the compositions and methods of the present invention include the cationic lipid N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride ("DOTMA") (Felgner et al., Proc. Nat'l Acad. Sci. 84, 7413 (1987); U.S. Pat. No. 4,897,355, which is incorporated herein by reference). Other cationic lipids suitable for the compositions and methods of the present invention include, for example, 5-carboxyspermylglycinedioctadecylamide ("DOGS"); 2,3-dioleyloxy-N-[2(spermine-carboxamido)ethyl]-N,N-dimethyl-1-propanaminium ("DOSPA") (Behr et al., Proc. Nat'l Acad. Sci. 86, 6982 (1989); U.S. Pat. Nos. 5,171,678; 5,334,761); 1,2-Dioleoyl-3-Dimethylammonium-Propane ("DODAP"); and 1,2-Dioleoyl-3-Trimethylammonium-Propane ("DOTAP").

[0221] Additional exemplary cationic lipids suitable for the compositions and methods of the present invention also include: 1,2-distearyloxy-N,N-dimethyl-3-aminopropane ("DSDMA"); 1,2-dioleyloxy-N,N-dimethyl-3-aminopropane ("DODMA"); 1,2-dilinoleyloxy-N,N-dimethyl-3-aminopropane ("DLinDMA"); 1,2-dilinolenyloxy-N,N-dimethyl-3-aminopropane ("DLenDMA"); N-dioleyl-N,N-dimethylammonium chloride ("DODAC"); N,N-distearyl-N,N-dimethylammonium bromide ("DDAB"); N-(1,2-dimyristyloxyprop-3-yl)-N,N-dimethyl-N-hydroxyethyl ammonium bromide ("DMRIE"); 3-dimethylamino-2-(cholest-5-en-3-beta-oxybutan-4-oxy)-1-(cis,cis-9,12-octadecadienoxy)propane ("CLinDMA"); 2-[5'-(cholest-5-en-3-beta-oxy)-3'-oxapentoxy)-3-dimethyl-1-(cis,cis-9',12'-octadecadienoxy)propane ("CpLinDMA"); N,N-dimethyl-3,4-dioleyloxybenzylamine ("DMOBA"); 1,2-N,N'-dioleylcarbamyl-3-dimethylaminopropane ("DOcarbDAP"); 2,3-Dilinoleoyloxy-N,N-dimethylpropylamine ("DLinDAP"); 1,2-N,N'-Dilinoleylcarbamyl-3-dimethylaminopropane ("DLincarbDAP"); 1,2-Dilinoleoylcarbamyl-3-dimethylaminopropane ("DLinCDAP"); 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane ("DLin-K-DMA"); 2-((8-[(3beta)-cholest-5-en-3-yloxy]octyl)oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propane-1-amine ("Octyl-CLinDMA"); (2R)-2-((8-[(3beta)-cholest-5-en-3-yloxy]octyl)oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propan-1-amine ("Octyl-CLinDMA (2R)"); (2S)-2-((8-[(3beta)-cholest-5-en-3-yloxy]octyl)oxy)-N,N-dimethyl-3-[(9Z,12Z)-octadeca-9,12-dien-1-yloxy]propan-1-amine ("Octyl-CLinDMA (2S)"); 2,2-dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane ("DLin-K-XTC2-DMA"); and 2-(2,2-di((9Z,12Z)-octadeca-9,12-dien-1-yl)-1,3-dioxolan-4-yl)-N,N-dimethylethanamine ("DLin-KC2-DMA") (see WO 2010/042877, which is incorporated herein by reference; Semple et al., Nature Biotech. 28: 172-176 (2010); Heyes, J., et al., J Controlled Release 107: 276-287 (2005); Morrissey, D V., et al., Nat. Biotechnol. 23(8): 1003-1007 (2005); International Patent Publication WO 2005/121348). In some embodiments, one or more of the cationic lipids comprise at least one of an imidazole, dialkylamino, or guanidinium moiety.

[0222] In some embodiments, one or more cationic lipids suitable for the compositions and methods of the present invention include 2,2-Dilinoleyl-4-dimethylaminoethyl-[1,3]-dioxolane ("XTC"); (3aR,5s,6aS)-N,N-dimethyl-2,2-di((9Z,12Z)-octadeca-9,12-dienyl)tetrahydro-3aH-cyclopenta[d][1,3]dioxol-5-amine ("ALNY-100") and/or 4,7,13-tris(3-oxo-3-(undecylamino)propyl)-N1,N16-diundecyl-4,7,10,13-tetraazahexadecane-1,16-diamide ("NC98-5").

[0223] In some embodiments, the compositions of the present invention include one or more cationic lipids that constitute at least about 5%, 10%, 20%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, or 70%, measured by weight, of the total lipid content in the composition, e.g., a lipid nanoparticle. In some embodiments, the compositions of the present invention include one or more cationic lipids that constitute at least about 5%, 10%, 20%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, or 70%, measured as mol %, of the total lipid content in the composition, e.g., a lipid nanoparticle. In some embodiments, the compositions of the present invention include one or more cationic lipids that constitute about 30-70% (e.g., about 30-65%, about 30-60%, about 30-55%, about 30-50%, about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%), measured by weight, of the total lipid content in the composition, e.g., a lipid nanoparticle. In some embodiments, the compositions of the present invention include one or more cationic lipids that constitute about 30-70% (e.g., about 30-65%, about 30-60%, about 30-55%, about 30-50%, about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%), measured as mol %, of the total lipid content in the composition, e.g., a lipid nanoparticle.

[0224] In some embodiments, sterol-based cationic lipids may be used instead of or in addition to the cationic lipids described herein. Suitable sterol-based cationic lipids are dialkylamino-, imidazole-, and guanidinium-containing sterol-based cationic lipids. For example, certain embodiments are directed to a composition comprising one or more sterol-based cationic lipids comprising an imidazole, for example, the imidazole cholesterol ester or "ICE" lipid (3S,10R,13R,17R)-10,13-dimethyl-17-((R)-6-methylheptan-2-yl)-2,3,4,7,8,9,10,11,12,13,14,15,16,17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate, as represented by structure (I) below. In certain embodiments, a lipid nanoparticle for delivery of RNA (e.g., mRNA) encoding a functional protein may comprise one or more imidazole-based cationic lipids, for example, the imidazole cholesterol ester or "ICE" lipid (3S,10R,13R,17R)-10,13-dimethyl-17-((R)-6-methylheptan-2-yl)-2,3,4,7,8,9,10,11,12,13,14,15,16,17-tetradecahydro-1H-cyclopenta[a]phenanthren-3-yl 3-(1H-imidazol-4-yl)propanoate, as represented by the following structure:

##STR00072##

[0225] In some embodiments, the percentage of cationic lipid in a liposome may be greater than 10%, greater than 20%, greater than 30%, greater than 40%, greater than 50%, greater than 60%, or greater than 70%. In some embodiments, cationic lipid(s) constitute(s) about 30-50% (e.g., about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%) of the liposome by weight. In some embodiments, the cationic lipid (e.g., ICE lipid) constitutes about 30%, about 35%, about 40%, about 45%, or about 50% of the liposome by molar ratio.

[0226] Non-Cationic/Helper Lipids

[0227] In some embodiments, provided liposomes contain one or more non-cationic ("helper") lipids. As used herein, the phrase "non-cationic lipid" refers to any neutral, zwitterionic or anionic lipid. As used herein, the phrase "anionic lipid" refers to any of a number of lipid species that carry a net negative charge at a selected pH, such as physiological pH. Non-cationic lipids include, but are not limited to, distearoylphosphatidylcholine (DSPC), dioleoylphosphatidylcholine (DOPC), dipalmitoylphosphatidylcholine (DPPC), dioleoylphosphatidylglycerol (DOPG), dipalmitoylphosphatidylglycerol (DPPG), dioleoylphosphatidylethanolamine (DOPE), palmitoyloleoylphosphatidylcholine (POPC), palmitoyloleoyl-phosphatidylethanolamine (POPE), dioleoyl-phosphatidylethanolamine 4-(N-maleimidomethyl)-cyclohexane-1-carboxylate (DOPE-mal), dipalmitoyl phosphatidyl ethanolamine (DPPE), dimyristoylphosphoethanolamine (DMPE), distearoyl-phosphatidyl-ethanolamine (DSPE), 16-O-monomethyl PE, 16-O-dimethyl PE, 18-1-trans PE, 1-stearoyl-2-oleoyl-phosphatidylethanolamine (SOPE), or a mixture thereof.

[0228] In some embodiments, such non-cationic lipids may be used alone, but are preferably used in combination with other excipients, for example, cationic lipids. In some embodiments, the non-cationic lipid may comprise a molar ratio of about 5% to about 90%, or about 10% to about 70% of the total lipid present in a liposome. In some embodiments, a non-cationic lipid is a neutral lipid, i.e., a lipid that does not carry a net charge in the conditions under which the composition is formulated and/or administered. In some embodiments, the percentage of non-cationic lipid in a liposome may be greater than 5%, greater than 10%, greater than 20%, greater than 30%, or greater than 40%.

[0229] Cholesterol-Based Lipids

[0230] In some embodiments, provided liposomes comprise one or more cholesterol-based lipids. For example, suitable cholesterol-based cationic lipids include DC-Chol (N,N-dimethyl-N-ethylcarboxamidocholesterol), 1,4-bis(3-N-oleylamino-propyl)piperazine (Gao, et al. Biochem. Biophys. Res. Comm. 179, 280 (1991); Wolf et al. BioTechniques 23, 139 (1997); U.S. Pat. No. 5,744,335), or ICE. In some embodiments, the cholesterol-based lipid may comprise a molar ratio of about 2% to about 30%, or about 5% to about 20% of the total lipid present in a liposome. In some embodiments, the percentage of cholesterol-based lipid in the liposome may be greater than 5%, greater than 10%, greater than 20%, greater than 30%, or greater than 40%.

[0231] PEGylated Lipids

[0232] In some embodiments, provided liposomes comprise one or more PEGylated lipids. For example, the use of polyethylene glycol (PEG)-modified phospholipids and derivatized lipids such as derivatized ceramides (PEG-CER), including N-Octanoyl-Sphingosine-1-[Succinyl(Methoxy Polyethylene Glycol)-2000] (C8 PEG-2000 ceramide) is also contemplated by the present invention in combination with one or more of the cationic and, in some embodiments, other lipids which together comprise the liposome. Contemplated PEG-modified lipids include, but are not limited to, a polyethylene glycol chain of up to 2 kDa, up to 3 kDa, up to 4 kDa or up to 5 kDa in length covalently attached to a lipid with alkyl chain(s) of C6-C20 length. In some embodiments, a PEG-modified or PEGylated lipid is PEGylated cholesterol or PEG-2K. The addition of such components may prevent complex aggregation and may also provide a means for increasing circulation lifetime and increasing the delivery of the lipid-nucleic acid composition to the target cell (Klibanov et al. (1990) FEBS Letters, 268 (1): 235-237), or they may be selected to rapidly exchange out of the formulation in vivo (see U.S. Pat. No. 5,885,613). In some embodiments, particularly useful exchangeable lipids are PEG-ceramides having shorter acyl chains (e.g., C14 or C18).

[0233] The PEG-modified phospholipid and derivatized lipids of the present invention may comprise a molar ratio from about 0% to about 15%, about 0.5% to about 15%, about 1% to about 15%, about 4% to about 10%, or about 2% of the total lipid present in the liposome. PEG-modified phospholipid and derivatized lipids may constitute at least about 5%, 10%, 20%, 30%, 40%, 50%, 60%, or 70% of the total lipids in a suitable lipid solution by weight or by molar ratio. In some embodiments, PEGylated lipid(s) constitute(s) about 30-50% (e.g., about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%) of the total lipids in a suitable lipid solution by weight or by molar ratio.

[0234] According to various embodiments, the selection of cationic lipids, non-cationic lipids and/or PEG-modified lipids which comprise the liposome, as well as the relative molar ratio of such lipids to each other, is based upon the characteristics of the selected lipid(s), the nature of the intended target cells, and the characteristics of the mRNA to be delivered. Additional considerations include, for example, the saturation of the alkyl chain, as well as the size, charge, pH, pKa, fusogenicity and toxicity of the selected lipid(s). Thus, the molar ratios may be adjusted accordingly.

[0235] Liposome Formulations

[0236] A suitable liposome for the present invention may include one or more of any of the cationic lipids, non-cationic lipids, cholesterol lipids, PEGylated lipids and/or polymers described herein at various ratios. Typically, a liposome in accordance with the present invention comprises a cationic lipid, a non-cationic lipid, a cholesterol lipid and a PEGylated lipid. As non-limiting examples, a suitable liposome formulation may include a combination selected from cKK-E12, DOPE, cholesterol and DMG-PEG2K; C12-200, DOPE, cholesterol and DMG-PEG2K; HGT4003, DOPE, cholesterol and DMG-PEG2K; or ICE, DOPE, cholesterol and DMG-PEG2K or ICE, DOPE and DMG-PEG2K. Additional combinations of lipids are described in the art, e.g., U.S. Ser. No. 62/420,421 (filed on Nov. 10, 2016), U.S. Ser. No. 62/421,021 (filed on Nov. 11, 2016), U.S. Ser. No. 62/464,327 (filed on Feb. 27, 2017), and PCT Application entitled "Novel ICE-based Lipid Nanoparticle Formulation for Delivery of mRNA," filed on Nov. 10, 2017, the disclosures of which are included here in their full scope by reference.

[0237] In various embodiments, cationic lipids (e.g., cKK-E12, C12-200, ICE, and/or HGT4003) constitute about 30-60% (e.g., about 30-55%, about 30-50%, about 30-45%, about 30-40%, about 35-50%, about 35-45%, or about 35-40%) of the liposome by molar ratio. In some embodiments, the percentage of cationic lipids (e.g., cKK-E12, C12-200, ICE, and/or HGT4003) is equal to or greater than about 30%, about 35%, about 40%, about 45%, about 50%, about 55%, or about 60% of the liposome by molar ratio.

[0238] In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEGylated lipid(s) may be between about 30-60:25-35:20-30:1-15, respectively. In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEGylated lipid(s) is approximately 40:30:20:10, respectively. In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEGylated lipid(s) is approximately 40:30:25:5, respectively. In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEGylated lipid(s) is approximately 40:32:25:3, respectively. In some embodiments, the ratio of cationic lipid(s) to non-cationic lipid(s) to cholesterol-based lipid(s) to PEGylated lipid(s) is approximately 50:25:20:5.
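As an illustration only, the ratio ranges in the preceding paragraph can be checked programmatically. The following Python sketch normalizes a four-component ratio to mol % and tests it against the approximate range 30-60:25-35:20-30:1-15. The function names, and the treatment of "about" as exact inclusive bounds, are assumptions made for this example and are not part of the disclosure.

```python
# Illustrative sketch; names and exact bounds are assumptions, not part of the disclosure.

# Approximate mol % ranges from the text, in the order
# cationic : non-cationic : cholesterol-based : PEGylated.
RANGES = {
    "cationic": (30, 60),
    "non_cationic": (25, 35),
    "cholesterol_based": (20, 30),
    "pegylated": (1, 15),
}


def to_mol_percent(ratio):
    """Scale a ratio such as (40, 30, 20, 10) so that its parts sum to 100."""
    total = sum(ratio)
    if total <= 0:
        raise ValueError("ratio parts must sum to a positive number")
    return [100.0 * part / total for part in ratio]


def within_ranges(ratio):
    """Return True if every component falls inside its range (bounds inclusive)."""
    percents = to_mol_percent(ratio)
    return all(low <= p <= high for p, (low, high) in zip(percents, RANGES.values()))


if __name__ == "__main__":
    # The four example ratios recited in the text
    for ratio in [(40, 30, 20, 10), (40, 30, 25, 5), (40, 32, 25, 3), (50, 25, 20, 5)]:
        print(ratio, within_ranges(ratio))
```

Normalizing first means that a ratio written with parts that do not sum to 100 is compared on the same mol % scale as the ranges.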

[0239] Formation of Liposomes

[0240] The liposomal transfer vehicles for use in the compositions of the invention can be prepared by various techniques which are presently known in the art. For example, multilamellar vesicles (MLV) may be prepared according to conventional techniques, such as by depositing a selected lipid on the inside wall of a suitable container or vessel by dissolving the lipid in an appropriate solvent, and then evaporating the solvent to leave a thin film on the inside of the vessel or by spray drying. An aqueous phase then may be added to the vessel with a vortexing motion which results in the formation of MLVs. Unilamellar vesicles (ULV) can then be formed by homogenization, sonication or extrusion of the multilamellar vesicles. In addition, unilamellar vesicles can be formed by detergent removal techniques.

[0241] In certain embodiments, provided compositions comprise a liposome wherein the mRNA is both associated with the surface of the liposome and encapsulated within the same liposome. For example, during preparation of the compositions of the present invention, cationic liposomes may associate with the mRNA through electrostatic interactions.

[0242] In some embodiments, the compositions and methods of the invention comprise mRNA encapsulated in a liposome. In some embodiments, the one or more mRNA species may be encapsulated in the same liposome. In some embodiments, the one or more mRNA species may be encapsulated in different liposomes. In some embodiments, the mRNA is encapsulated in one or more liposomes, which differ in their lipid composition, molar ratio of lipid components, size, charge (Zeta potential), targeting ligands and/or combinations thereof. In some embodiments, the one or more liposomes may have a different composition of cationic lipids, neutral lipid, PEG-modified lipid and/or combinations thereof. In some embodiments, the one or more liposomes may have a different molar ratio of cationic lipid, neutral lipid, cholesterol and PEG-modified lipid used to create the liposome.

[0243] The process of incorporation of a desired mRNA into a liposome is often referred to as "loading". Exemplary methods are described in Lasic, et al., FEBS Lett., 312: 255-258, 1992, which is incorporated herein by reference. In a typical embodiment, the mRNA of the invention is encapsulated in a liposome using the methods described in WO 2018/089801 (the teachings of which are incorporated herein by reference in their entirety). Briefly, the mRNA is encapsulated by mixing of a solution comprising pre-formed liposomes with mRNA such that liposomes encapsulating mRNA are formed.

[0244] Typically, the liposome-incorporated nucleic acids are completely located in the interior space of the liposome within the bilayer membrane of the liposome, although as discussed above, some of the mRNA (e.g., no more than 10% of total mRNA in the liposome composition) may also be associated with the exterior surface of the liposome membrane. The incorporation of a nucleic acid into liposomes is also referred to herein as "encapsulation". Typically, the purpose of incorporating an mRNA into a liposome is to protect the nucleic acid from an environment which may contain enzymes or chemicals that degrade nucleic acids and/or systems or receptors that cause the rapid excretion of the nucleic acids. Accordingly, in some embodiments, a suitable delivery vehicle is capable of enhancing the stability of the mRNA contained therein and/or facilitating the delivery of mRNA to the target cell or tissue.

[0245] Liposome Size

[0246] Suitable liposomes in accordance with the present invention may be made in various sizes. In some embodiments, provided liposomes may be made smaller than previously known mRNA encapsulating liposomes. In some embodiments, decreased size of liposomes is associated with more efficient delivery of mRNA. Selection of an appropriate liposome size may take into consideration the site of the target cell or tissue and to some extent the application for which the liposome is being made.

[0247] In some embodiments, an appropriate size of liposome is selected to facilitate systemic distribution of antibody encoded by the mRNA. In some embodiments, it may be desirable to limit transfection of the mRNA to certain cells or tissues. For example, to target hepatocytes a liposome may be sized such that its dimensions are smaller than the fenestrations of the endothelial layer lining hepatic sinusoids in the liver; in such cases the liposome could readily penetrate such endothelial fenestrations to reach the target hepatocytes.

[0248] Alternatively or additionally, a liposome may be sized such that the dimensions of the liposome are of a sufficient diameter to limit or expressly avoid distribution into certain cells or tissues. For example, a liposome may be sized such that its dimensions are larger than the fenestrations of the endothelial layer lining hepatic sinusoids to thereby limit distribution of the liposomes to hepatocytes.

[0249] In some embodiments, the size of a liposome is determined by the length of the largest diameter of the liposome particle. In some embodiments, a suitable liposome has a size no greater than about 250 nm (e.g., no greater than about 225 nm, 200 nm, 175 nm, 150 nm, 125 nm, 100 nm, 75 nm, or 50 nm). In some embodiments, a suitable liposome has a size ranging from about 10-250 nm (e.g., ranging from about 10-225 nm, 10-200 nm, 10-175 nm, 10-150 nm, 10-125 nm, 10-100 nm, 10-75 nm, or 10-50 nm). In some embodiments, a suitable liposome has a size ranging from about 100-250 nm (e.g., ranging from about 100-225 nm, 100-200 nm, 100-175 nm, 100-150 nm). Liposomes with a size of 80-200 nm are particularly suitable for some applications. In some embodiments, a suitable liposome has a size ranging from about 10-100 nm (e.g., ranging from about 10-90 nm, 10-80 nm, 10-70 nm, 10-60 nm, or 10-50 nm). In a particular embodiment, a suitable liposome has a size less than about 100 nm.

[0250] A variety of alternative methods known in the art are available for sizing of a population of liposomes. One such sizing method is described in U.S. Pat. No. 4,737,323, incorporated herein by reference. Sonicating a liposome suspension either by bath or probe sonication produces a progressive size reduction down to small ULV less than about 0.05 microns in diameter. Homogenization is another method that relies on shearing energy to fragment large liposomes into smaller ones. In a typical homogenization procedure, MLV are recirculated through a standard emulsion homogenizer until selected liposome sizes, typically between about 0.1 and 0.5 microns, are observed. The size of the liposomes may be determined by quasi-elastic light scattering (QELS) as described in Bloomfield, Ann. Rev. Biophys. Bioeng., 10:421-450 (1981), incorporated herein by reference. Average liposome diameter may be reduced by sonication of formed liposomes. Intermittent sonication cycles may be alternated with QELS assessment to guide efficient liposome synthesis.
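Because this paragraph gives sizes in microns while the preceding paragraphs give them in nanometers, a small conversion helper can make the two comparable. The Python sketch below is illustrative only. The function names are hypothetical, and treating the approximate bounds in the text as exact inclusive limits is an assumption.

```python
# Illustrative sketch; function names are hypothetical and "about" bounds are treated as exact.

def microns_to_nm(microns):
    """Convert a diameter in microns to nanometers."""
    return microns * 1000.0


def in_size_range(diameter_nm, low_nm=10.0, high_nm=250.0):
    """Return True if a diameter lies within [low_nm, high_nm]; the default is about 10-250 nm."""
    return low_nm <= diameter_nm <= high_nm


if __name__ == "__main__":
    # Sizes quoted in microns in the text
    for microns in (0.05, 0.1, 0.5):
        nm = microns_to_nm(microns)
        print(microns, "microns =", nm, "nm; within 10-250 nm:", in_size_range(nm))
```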

Liposome Formulations for DNAH5 mRNA Delivery and Expression

[0251] This section provides exemplary liposome formulations for effective delivery and expression of DNAH5 mRNA in vivo.

Lipid Materials

[0252] The formulations described herein include a multi-component lipid mixture of varying ratios employing one or more cationic lipids, helper lipids (e.g., non-cationic lipids and/or cholesterol-based lipids) and PEGylated lipids designed to encapsulate mRNA encoding DNAH5 protein. Cationic lipids can include (but not exclusively) DOTAP (1,2-dioleyl-3-trimethylammonium propane), DODAP (1,2-dioleyl-3-dimethylammonium propane), DOTMA (1,2-di-O-octadecenyl-3-trimethylammonium propane), DLinDMA (Heyes, J.; Palmer, L.; Bremner, K.; MacLachlan, I. "Cationic lipid saturation influences intracellular delivery of encapsulated nucleic acids" J. Contr. Rel. 2005, 107, 276-287), DLin-KC2-DMA (Semple, S. C. et al. "Rational Design of Cationic Lipids for siRNA Delivery" Nature Biotech. 2010, 28, 172-176), C12-200 (Love, K. T. et al. "Lipid-like materials for low-dose in vivo gene silencing" PNAS 2010, 107, 1864-1869), cKK-E12 (3,6-bis(4-(bis(2-hydroxydodecyl)amino)butyl)piperazine-2,5-dione), HGT5000, HGT5001, HGT4003, ICE, OF-02, dialkylamino-based, imidazole-based, guanidinium-based, etc. Helper lipids can include (but not exclusively) DSPC (1,2-distearoyl-sn-glycero-3-phosphocholine), DPPC (1,2-dipalmitoyl-sn-glycero-3-phosphocholine), DOPE (1,2-dioleyl-sn-glycero-3-phosphoethanolamine), DOPC (1,2-dioleyl-sn-glycero-3-phosphatidylcholine), DPPE (1,2-dipalmitoyl-sn-glycero-3-phosphoethanolamine), DMPE (1,2-dimyristoyl-sn-glycero-3-phosphoethanolamine), DOPG (1,2-dioleoyl-sn-glycero-3-phospho-(1'-rac-glycerol)), cholesterol, etc. The PEGylated lipids can include (but not exclusively) a poly(ethylene) glycol chain of up to 5 kDa in length covalently attached to a lipid with alkyl chain(s) of C6-C20 length.

Exemplary Formulation Protocols

[0253] A. cKK-E12

[0254] Aliquots of 50 mg/mL ethanolic solutions of cKK-E12, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1.times.PBS (pH 7.4), concentrated and stored at 2-8.degree. C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.
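The volumes implied by the protocol above can be worked through as a short calculation. The Python sketch below estimates the aqueous volume that brings a 3 mL ethanolic lipid solution to 20% ethanol, and the volume of a 50 mg/mL lipid stock that contains a given lipid mass. It assumes that volumes are additive on mixing, which is only approximately true for ethanol and water. The lipid masses in the usage example are hypothetical placeholders, not values from this disclosure.

```python
# Illustrative sketch; additive (ideal) mixing is assumed, and example masses are hypothetical.

def aqueous_volume_ml(ethanol_ml, ethanol_fraction):
    """Aqueous volume to combine with ethanol_ml so ethanol is ethanol_fraction (v/v) of the total."""
    if not 0 < ethanol_fraction <= 1:
        raise ValueError("ethanol_fraction must be in (0, 1]")
    total_ml = ethanol_ml / ethanol_fraction
    return total_ml - ethanol_ml


def stock_volume_ml(mass_mg, stock_mg_per_ml=50.0):
    """Volume of a stock solution that contains mass_mg of lipid."""
    return mass_mg / stock_mg_per_ml


if __name__ == "__main__":
    # 3 mL of ethanolic lipid solution at a final 20% ethanol
    print(aqueous_volume_ml(3.0, 0.20))
    # Hypothetical lipid masses in mg (placeholders only)
    for name, mass_mg in {"cationic lipid": 10.0, "DOPE": 7.5}.items():
        print(name, stock_volume_ml(mass_mg))
```

Under the additive-volume assumption, 3 mL of ethanol at 20% of the final volume corresponds to a total of about 15 mL, so roughly 12 mL of aqueous mRNA solution.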

[0255] B. C12-200

[0256] Aliquots of 50 mg/mL ethanolic solutions of C12-200, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1.times.PBS (pH 7.4), concentrated and stored at 2-8.degree. C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.

[0257] C. HGT4003

[0258] Aliquots of 50 mg/mL ethanolic solutions of HGT4003, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1.times.PBS (pH 7.4), concentrated and stored at 2-8.degree. C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.

[0259] D. ICE

[0260] Aliquots of 50 mg/mL ethanolic solutions of ICE, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1.times.PBS (pH 7.4), concentrated and stored at 2-8.degree. C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.

[0261] E. HGT5001

[0262] Aliquots of 50 mg/mL ethanolic solutions of HGT5001, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1.times.PBS (pH 7.4), concentrated and stored at 2-8.degree. C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.

[0263] F. HGT5000

[0264] Aliquots of 50 mg/mL ethanolic solutions of HGT5000, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1× PBS (pH 7.4), concentrated and stored at 2-8° C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.

[0265] G. DLinKC2DMA

[0266] Aliquots of 50 mg/mL ethanolic solutions of DLinKC2DMA, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1× PBS (pH 7.4), concentrated and stored at 2-8° C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.

[0267] H. DODAP

[0268] Aliquots of 50 mg/mL ethanolic solutions of DODAP, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1× PBS (pH 7.4), concentrated and stored at 2-8° C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.

[0269] I. DODMA

[0270] Aliquots of 50 mg/mL ethanolic solutions of DODMA, DOPE, cholesterol and DMG-PEG2K are mixed and diluted with ethanol to 3 mL final volume. Separately, an aqueous buffered solution (10 mM citrate/150 mM NaCl, pH 4.5) of DNAH5 mRNA is prepared from a 1 mg/mL stock. The lipid solution is injected rapidly into the aqueous mRNA solution and shaken to yield a final suspension in 20% ethanol. The resulting liposome suspension is filtered, diafiltrated with 1× PBS (pH 7.4), concentrated and stored at 2-8° C. The final concentration, Zave, Dv(50) and Dv(90) of the DNAH5 encapsulated mRNA are determined.
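As a rough illustration of the mixing step shared by the preparations above, the aqueous volume needed to reach 20% ethanol after injecting the 3 mL lipid solution follows from a simple volume balance. The sketch below is not taken from the specification: it assumes that volumes are additive (ethanol-water mixtures actually contract slightly) and that the lipid solution is essentially pure ethanol. The function name is hypothetical.

```python
def aqueous_volume_ml(ethanol_volume_ml: float, final_ethanol_fraction: float) -> float:
    """Aqueous volume that dilutes an ethanolic solution to a target v/v fraction.

    Assumption: ideal, additive volumes. This is only an approximation for
    ethanol-water mixtures.
    """
    if not 0.0 < final_ethanol_fraction <= 1.0:
        raise ValueError("final_ethanol_fraction must be in (0, 1]")
    total_volume_ml = ethanol_volume_ml / final_ethanol_fraction
    return total_volume_ml - ethanol_volume_ml


# 3 mL ethanolic lipid solution, 20% ethanol in the final suspension
aqueous_ml = aqueous_volume_ml(3.0, 0.20)
total_ml = 3.0 + aqueous_ml
print(f"aqueous buffer: {aqueous_ml:.1f} mL, total suspension: {total_ml:.1f} mL")
```

Under these assumptions, 3 mL of lipid solution corresponds to about 12 mL of aqueous mRNA solution and about 15 mL of suspension before diafiltration.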

[0271] Clinical or therapeutic candidate mRNA formulations are selected from the exemplary codon-optimized mRNA sequences having a 5'-cap and a 3'-poly A tail, which are formulated in a suitable lipid combination as described above. Clinically relevant mRNA candidates are characterized by efficient delivery and uptake by in vivo tissue, high level of expression and sustained protein production, without detectable adverse effects in the subject to whom the therapeutic is administered, whether caused by the pharmacologically active ingredient, by the lipids in the liposome, or by any excipients used in the formulation. In general, high efficiency with low dose administration is favorable for the selection process of a relevant candidate therapeutic.

[0272] Pharmaceutical Compositions

[0273] The present invention provides compositions for use in the treatment of primary ciliary dyskinesia (PCD). The compositions of the present invention are for use in the manufacture of a medicament for the treatment of primary ciliary dyskinesia (PCD).

[0274] To facilitate expression of mRNA in vivo, delivery vehicles such as liposomes can be formulated in combination with one or more additional nucleic acids, carriers, targeting ligands or stabilizing reagents, or in pharmacological compositions where they are mixed with suitable excipients. Techniques for formulation and administration of drugs may be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co., Easton, Pa., latest edition.

[0275] Provided liposomally-encapsulated or associated mRNAs, and compositions containing the same, may be administered and dosed in accordance with current medical practice, taking into account the clinical condition of the subject, the site and method of administration, the scheduling of administration, the subject's age, sex, body weight and other factors relevant to clinicians of ordinary skill in the art. As used herein, the term "therapeutically effective amount" is largely determined based on the total amount of the therapeutic agent contained in the pharmaceutical compositions of the present invention. Generally, a therapeutically effective amount is sufficient to achieve a meaningful benefit to the subject or mammal (e.g., treating, modulating, curing, preventing and/or ameliorating PCD). For example, a therapeutically effective amount may be an amount sufficient to achieve a desired therapeutic and/or prophylactic effect. Generally, the amount of a therapeutic agent (e.g., mRNA encoding a DNAH5 protein) administered to a subject in need thereof will depend upon the characteristics of the subject. Such characteristics include the condition, disease severity, general health, age, sex and body weight of the subject. One of ordinary skill in the art will be readily able to determine appropriate dosages depending on these and other related factors. In addition, both objective and subjective assays may optionally be employed to identify optimal dosage ranges.

[0276] In some embodiments, an effective therapeutic dose of the pharmaceutical composition comprising an mRNA encoding dynein axonemal heavy chain 5 protein is administered to the mammal at a dosing interval sufficient to reduce for the period of the dosing interval or longer the level of at least one symptom or biomarker associated with PCD in the mammal relative to its state prior to the treatment.

[0277] In some embodiments the mammal is a human. A suitable therapeutic dose that may be applicable for a human being can be derived based on animal studies. A basic guideline for deriving a human equivalent dose from studies performed in animals can be obtained from the U.S. Food and Drug Administration (FDA) website at https://www.fda.gov/downloads/drugs/guidances/ucm078932.pdf, entitled, "Guidance for Industry Estimating the Maximum Safe Starting Dose in Initial Clinical Trials for Therapeutics in Adult Healthy Volunteers." Based on the guidelines for allometric scaling, a suitable dose of, for example, 0.6 mg/kg in a mouse would relate to a human equivalent dose of approximately 0.049 mg/kg (the mouse dose divided by the guidance's conversion factor of 12.3). Thus, considering the derived human equivalent dose, a projected human therapeutic dose can be derived based on studies in other animals.
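The body-surface-area conversion described in the cited FDA guidance can be sketched as follows. This is a minimal illustration rather than part of the guidance or of the specification: the function and dictionary names are hypothetical, and the Km factors shown (mouse 3, rat 6, dog 20, adult human 37) are the values commonly tabulated from that guidance, which should be checked against the current document before use.

```python
# Km factors (body weight divided by body surface area, kg/m^2).
# Assumption: values as commonly tabulated from the FDA guidance cited above;
# only a few species are listed here.
KM_FACTORS = {
    "mouse": 3.0,
    "rat": 6.0,
    "dog": 20.0,
    "human_adult": 37.0,
}


def human_equivalent_dose(animal_dose_mg_per_kg: float, species: str) -> float:
    """Convert an animal dose (mg/kg) to a human equivalent dose (mg/kg).

    HED = animal dose * (animal Km / human Km)
    """
    if species not in KM_FACTORS or species == "human_adult":
        raise ValueError(f"unsupported animal species: {species!r}")
    return animal_dose_mg_per_kg * KM_FACTORS[species] / KM_FACTORS["human_adult"]


hed_mouse = human_equivalent_dose(0.6, "mouse")
print(f"HED for 0.6 mg/kg in mouse: {hed_mouse:.4f} mg/kg")
```

The ratio 37/3 is the origin of the "divide by 12.3" shortcut for mouse doses. The result is a starting estimate only; the guidance applies a further safety factor before a first-in-human dose is chosen.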

[0278] In some embodiments, the dosing interval is once every 15 days or longer, or once every 20 days or longer, or once every 21 days, or once every 22 days, or once every 23 days, or once every 24 days, or once every 25 days, or once every 26 days, or once every 27 days, or once every 28 days, or once every 29 days or longer, or once every 30 days or longer, or once every 31 days or longer. In some embodiments, the dosing interval is once every 40, 45, 50 or 60 days, or any number of days in between. In some embodiments, the dosing interval is once every 80, 90, 120 or 150 days, or any number of days in between.

[0279] In some embodiments, the therapeutic low dose is administered at a dosing interval of once every 2 weeks or longer, which is sufficient to reduce the level of at least one symptom or biomarker associated with PCD in the mammal relative to the state prior to the treatment. In some embodiments, the therapeutic low dose is administered at a dosing interval of once every 3 weeks or longer, which is sufficient to reduce the level of at least one symptom or biomarker associated with PCD in the mammal relative to the state prior to the treatment. In some embodiments, the dosing interval is once every 4 weeks or longer. In some embodiments, the dosing interval is once every 5 weeks or longer. In some embodiments, the dosing interval is once every 6 weeks or longer. In some embodiments, the dosing interval is once every 8 weeks or longer. In some embodiments, the dosing interval is once every 12 or 15 or 18 weeks or longer.

[0280] In some embodiments, the dosing interval is once a month. In some embodiments, the dosing interval is once in every two months. In some embodiments, the dosing interval is once every three months, or once every four months or once every five months or once every six months or anywhere in between.

[0281] In some embodiments, administering the provided composition results in an increased dynein axonemal heavy chain 5 mRNA expression level in a biological sample from a subject as compared to a baseline expression level before treatment. Typically, the baseline level is measured immediately before treatment. Biological samples include, for example, whole blood, serum, plasma, urine and tissue samples (e.g., muscle, liver, skin fibroblasts). In some embodiments, administering the provided composition results in an increased DNAH5 mRNA expression level by at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% as compared to the baseline level immediately before treatment. In some embodiments, administering the provided composition results in an increased DNAH5 mRNA expression level as compared to a DNAH5 mRNA expression level in subjects who are not treated.

[0282] According to the present invention, a therapeutically effective dose of the provided composition, when administered regularly, results in an increased DNAH5 protein expression or activity level in a subject as compared to a baseline DNAH5 protein expression or activity level before treatment. Typically, the DNAH5 protein expression or activity level is measured in a biological sample obtained from the subject such as blood, plasma or serum, urine, or solid tissue extracts. In some embodiments, the administering of a composition of the invention results in DNAH5 expression detectable in the liver. In some embodiments, administering the provided composition results in an increased DNAH5 protein expression or activity level in a biological sample (e.g., plasma/serum or urine) by at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% as compared to a baseline level before treatment. In some embodiments, administering the provided composition results in an increased DNAH5 protein expression or activity level in a biological sample (e.g., plasma/serum or urine) by at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% as compared to a baseline level before treatment for at least 24 hours, at least 48 hours, at least 72 hours, at least 3 days, at least 4 days, at least 5 days, at least 6 days, at least 7 days, at least 8 days, at least 9 days, at least 10 days, at least 11 days, at least 12 days, at least 13 days, at least 14 days, or at least 15 days.

[0283] In some embodiments, the therapeutic dose is sufficient to achieve at least some stabilization, improvement or elimination of symptoms. Other indicators, such as biomarkers, are selected by those of skill in the art as appropriate measures of disease progress, disease regression or improvement.

[0284] Suitable routes of administration include, for example, oral, rectal, vaginal, transmucosal, pulmonary including intratracheal or inhaled, or intestinal administration; parenteral delivery, including intradermal, transdermal (topical), intramuscular, subcutaneous, intramedullary injections, as well as intrathecal, direct intraventricular, intravenous, intraperitoneal, or intranasal.

[0285] In some embodiments, the therapeutically effective dose comprising the mRNA encoding dynein axonemal heavy chain protein is administered to the subject by intramuscular administration.

[0286] In some embodiments, the therapeutically effective dose comprising the mRNA encoding dynein axonemal heavy chain protein is administered to the subject by subcutaneous administration.

[0287] In particular embodiments, the intramuscular administration is to a muscle selected from the group consisting of skeletal muscle, smooth muscle and cardiac muscle. In some embodiments the administration results in delivery of the mRNA to a muscle cell. In some embodiments the administration results in delivery of the mRNA to a hepatocyte (i.e., liver cell). In a particular embodiment, the intramuscular administration results in delivery of the mRNA to a muscle cell.

[0288] Most commonly, the therapeutically effective dose comprising the mRNA encoding dynein axonemal heavy chain protein is administered to the subject by intravenous administration.

[0289] Alternatively or additionally, liposomally encapsulated mRNAs and compositions of the invention may be administered in a local rather than systemic manner, for example, via injection of the pharmaceutical composition directly into a targeted tissue, preferably in a sustained release formulation. Local delivery can be effected in various ways, depending on the tissue to be targeted. For example, aerosols containing compositions of the present invention can be inhaled (for nasal, tracheal, or bronchial delivery); compositions of the present invention can be injected into the site of injury, disease manifestation, or pain, for example; compositions can be provided in lozenges for oral, tracheal, or esophageal application; can be supplied in liquid, tablet or capsule form for administration to the stomach or intestines; can be supplied in suppository form for rectal or vaginal application; or can even be delivered to the eye by use of creams, drops, or even injection. Formulations containing provided compositions complexed with therapeutic molecules or ligands can even be surgically administered, for example in association with a polymer or other structure or substance that can allow the compositions to diffuse from the site of implantation to surrounding cells. Alternatively, they can be applied surgically without the use of polymers or supports.

[0290] In particular embodiments, DNAH5 encoding mRNA is administered intravenously, wherein intravenous administration is associated with delivery of the mRNA to hepatocytes.

[0291] In some embodiments, the therapeutically effective dose comprising the mRNA encoding dynein axonemal heavy chain protein is administered for suitable delivery to the mammal's liver. In some embodiments, the therapeutically effective dose comprising the mRNA encoding dynein axonemal heavy chain protein is administered for suitable expression in hepatocytes of the administered mammal.

[0292] Provided methods of the present invention contemplate single as well as multiple administrations of a therapeutically effective amount of the therapeutic agents (e.g., mRNA encoding a DNAH5 protein) described herein. Therapeutic agents can be administered at regular intervals, depending on the nature, severity and extent of the subject's condition (e.g., PCD). In some embodiments, a therapeutically effective amount of the therapeutic agents (e.g., mRNA encoding a DNAH5 protein) of the present invention may be administered intrathecally periodically at regular intervals (e.g., once every year, once every six months, once every five months, once every three months, bimonthly (once every two months), monthly (once every month), biweekly (once every two weeks), twice a month, once every 30 days, once every 28 days, once every 14 days, once every 10 days, once every 7 days, weekly, twice a week, daily or continuously).

[0293] In some embodiments, provided liposomes and/or compositions are formulated such that they are suitable for extended-release of the mRNA contained therein. Such extended-release compositions may be conveniently administered to a subject at extended dosing intervals. For example, in one embodiment, the compositions of the present invention are administered to a subject twice a day, daily or every other day. In some embodiments, the compositions of the present invention are administered to a subject twice a week, once a week, once every 7 days, once every 10 days, once every 14 days, once every 28 days, once every 30 days, once every two weeks, once every three weeks, once every four weeks, once a month, twice a month, once every six weeks, once every eight weeks, once every other month, once every three months, once every four months, once every six months, once every eight months, once every nine months or annually.

[0294] In a preferred embodiment, the compositions of the present invention are administered to a subject once a week, once every two weeks or once a month. In a more preferred embodiment, the compositions of the present invention are administered to a subject once every two weeks or once every month. In the most preferred embodiment, the compositions of the present invention are administered to a subject once every month.

[0295] In some embodiments the mRNA is administered concurrently with an additional therapy.

[0296] Also contemplated are compositions and liposomes which are formulated for depot administration (e.g., intramuscularly, subcutaneously, intravitreally) to either deliver or release an mRNA over extended periods of time. Preferably, the extended-release means employed are combined with modifications made to the mRNA to enhance stability.

[0297] A therapeutically effective amount is commonly administered in a dosing regimen that may comprise multiple unit doses. For any particular therapeutic protein, a therapeutically effective amount (and/or an appropriate unit dose within an effective dosing regimen) may vary, for example, depending on route of administration or on combination with other pharmaceutical agents. Also, the specific therapeutically effective amount (and/or unit dose) for any particular patient may depend upon a variety of factors including the disorder being treated and the severity of the disorder; the activity of the specific pharmaceutical agent employed; the specific composition employed; the age, body weight, general health, sex and diet of the patient; the time of administration, route of administration, and/or rate of excretion or metabolism of the specific protein employed; the duration of the treatment; and like factors as is well known in the medical arts. According to the present invention, a therapeutically effective dose of the provided composition, when administered regularly, results in at least one symptom or feature of PCD being reduced in intensity, severity, or frequency or having delayed onset.

[0298] Also contemplated herein are lyophilized pharmaceutical compositions comprising one or more of the liposomes disclosed herein and related methods for the use of such compositions as disclosed for example, in International Patent Application PCT/US12/41663, filed Jun. 8, 2012, the teachings of which are incorporated herein by reference in their entirety. For example, lyophilized pharmaceutical compositions according to the invention may be reconstituted prior to administration or can be reconstituted in vivo. For example, a lyophilized pharmaceutical composition can be formulated in an appropriate dosage form (e.g., an intradermal dosage form such as a disk, rod or membrane) and administered such that the dosage form is rehydrated over time in vivo by the individual's bodily fluids.

[0299] In some embodiments, the pharmaceutical composition comprises a lyophilized liposomal delivery vehicle that comprises a cationic lipid, a non-cationic lipid, a PEG-modified lipid and cholesterol. In some embodiments, the pharmaceutical composition has a Dv50 of less than 500 nm, 300 nm, 200 nm, 150 nm, 125 nm, 120 nm, 100 nm, 75 nm, 50 nm, 25 nm or smaller upon reconstitution. In some embodiments, the pharmaceutical composition has a Dv90 of less than 750 nm, 700 nm, 500 nm, 300 nm, 200 nm, 150 nm, 125 nm, 100 nm, 75 nm, 50 nm, 25 nm or smaller upon reconstitution. In some embodiments, the pharmaceutical composition has a polydispersity index value of less than 1, 0.95, 0.9, 0.8, 0.75, 0.7, 0.6, 0.5, 0.4, 0.3, 0.25, 0.2, 0.1, 0.05 or less upon reconstitution. In some embodiments, the pharmaceutical composition has an average particle size of less than 500 nm, 400 nm, 300 nm, 200 nm, 175 nm, 150 nm, 125 nm, 100 nm, 75 nm, 50 nm, 25 nm or smaller upon reconstitution.
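For readers unfamiliar with the size metrics named above, Dv50 and Dv90 are the particle diameters below which 50% and 90% of the total particle volume lies. Sizing instruments normally report these values directly; the sketch below only illustrates the definition by interpolating a binned volume distribution. The function name and the example distribution are hypothetical and are not data from this application.

```python
def dv_percentile(diameters_nm, volume_percents, percentile):
    """Diameter below which `percentile` percent of the total particle volume lies.

    Each diameter is treated as the upper edge of a size bin; values between
    bin edges are linearly interpolated on the cumulative volume curve.
    """
    bins = sorted(zip(diameters_nm, volume_percents))
    total = sum(volume for _, volume in bins)
    if total <= 0:
        raise ValueError("volume distribution must have positive total volume")
    target = total * percentile / 100.0
    cumulative = 0.0
    prev_diameter = prev_cumulative = None
    for diameter, volume in bins:
        cumulative += volume
        if cumulative >= target:
            if prev_diameter is None:
                return float(diameter)  # target falls inside the first bin
            fraction = (target - prev_cumulative) / (cumulative - prev_cumulative)
            return prev_diameter + fraction * (diameter - prev_diameter)
        prev_diameter, prev_cumulative = diameter, cumulative
    return float(bins[-1][0])


# Made-up volume-weighted distribution (percent of volume per size bin)
diameters = [50, 75, 100, 125, 150, 200]
volumes = [5, 20, 35, 25, 10, 5]
dv50 = dv_percentile(diameters, volumes, 50)
dv90 = dv_percentile(diameters, volumes, 90)
print(f"Dv50 = {dv50:.1f} nm, Dv90 = {dv90:.1f} nm")
```

For this made-up distribution the sketch gives a Dv50 of about 92.9 nm and a Dv90 of 137.5 nm, both of which would satisfy, for example, the "less than 150 nm" thresholds listed above.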

[0300] In some embodiments, the lyophilized pharmaceutical composition further comprises one or more lyoprotectants, such as sucrose, trehalose, dextran or inulin. Typically, the lyoprotectant is sucrose. In some embodiments, the pharmaceutical composition is stable for at least 1 month or at least 6 months upon storage at 4° C., or for at least 6 months upon storage at 25° C. In some embodiments, the biological activity of the mRNA of the reconstituted lyophilized pharmaceutical composition exceeds 75% of the biological activity observed prior to lyophilization of the composition.

[0301] Provided liposomes and compositions may be administered to any desired tissue. In some embodiments, the DNAH5 mRNA delivered by provided liposomes or compositions is expressed in the tissue in which the liposomes and/or compositions were administered. In some embodiments, the mRNA delivered is expressed in a tissue different from the tissue in which the liposomes and/or compositions were administered. Exemplary tissues in which delivered mRNA may be delivered and/or expressed include, but are not limited to the liver, kidney, heart, spleen, serum, brain, skeletal muscle, lymph nodes, skin, and/or cerebrospinal fluid.

[0302] According to various embodiments, the timing of expression of delivered mRNAs can be tuned to suit a particular medical need. In some embodiments, the expression of the protein encoded by delivered mRNA is detectable 1, 2, 3, 6, 12, 24, 48, 72, 96 hours, 1 week, 2 weeks, or 1 month after administration of provided liposomes and/or compositions.

[0303] In some embodiments, a therapeutically effective dose of the provided composition, when administered regularly, results in a reduced methylmalonic acid level in a subject as compared to a baseline methylmalonic acid level before treatment.

[0304] In some embodiments, administering the provided composition results in an increased level of DNAH5 protein in a liver cell (e.g., a hepatocyte) of a subject as compared to a baseline level before treatment. Typically, the baseline level is measured immediately before treatment. In some embodiments, administering the provided composition results in an increased DNAH5 protein level in the liver cell by at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% as compared to a baseline level before treatment. In some embodiments, administering the provided composition results in an increased DNAH5 protein level in a liver cell as compared to the DNAH5 protein level in a liver cell of subjects who are not treated.

[0305] In some embodiments, administering the provided composition results in an increased DNAH5 protein level in plasma or serum of a subject as compared to a baseline level before treatment. Typically, the baseline level is measured immediately before treatment. In some embodiments, administering the provided composition results in an increased DNAH5 protein level in plasma or serum by at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% as compared to a baseline level before treatment. In some embodiments, administering the provided composition results in an increased DNAH5 protein level in plasma or serum as compared to a DNAH5 protein level in plasma or serum of subjects who are not treated.

[0306] In some embodiments, administering the provided composition results in increased DNAH5 enzyme activity in a biological sample from a subject as compared to the baseline level before treatment. Typically, the baseline level is measured immediately before treatment. Biological samples include, for example, whole blood, serum, plasma, urine and tissue samples (e.g., liver). In some embodiments, administering the provided composition results in an increased DNAH5 enzyme activity by at least about 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 95% as compared to a baseline level immediately before treatment. In some embodiments, administering the provided composition results in an increased DNAH5 enzyme activity as compared to DNAH5 enzyme activity in subjects who are not treated.

[0307] In some embodiments the subject is a mammal. In some embodiments, the mammal is an adult. In some embodiments the mammal is an adolescent. In some embodiments the mammal is an infant or a young mammal. In some embodiments, the mammal is a primate. In some embodiments the mammal is a human. In some embodiments the subject is 6 years to 80 years old.

EXAMPLES

[0308] While certain compounds, compositions and methods of the present invention have been described with specificity in accordance with certain embodiments, the following examples serve only to illustrate the compounds of the invention and are not intended to limit the same.

Example 1. Exemplary Liposome Formulations for DNAH5 mRNA Delivery and Expression

[0309] This example provides exemplary liposome formulations for effective delivery and expression of hDNAH5 mRNA in vivo.

[0310] Lipid Materials

[0311] The formulations described in the following Examples, unless otherwise specified, contain a multi-component lipid mixture of varying ratios employing one or more cationic lipids, helper lipids (e.g., non-cationic lipids and/or cholesterol lipids) and PEGylated lipids designed to encapsulate human dynein axonemal heavy chain 5 (hDNAH5) mRNA. Unless otherwise specified, the multi-component lipid mixtures used in the following Examples were ethanolic solutions of an imidazole cholesterol ester ("ICE") cationic lipid, a non-cationic lipid such as DOPE, and a PEGylated lipid such as DMG-PEG2K.

[0312] Messenger RNA Material

[0313] Codon-optimized hDNAH5 messenger RNA was synthesized by in vitro transcription from a plasmid DNA template encoding the gene. Following in vitro transcription, a 5' cap structure (Cap 1) (Fechter, P.; Brownlee, G. G. "Recognition of mRNA cap structures by viral and cellular proteins" J. Gen. Virology 2005, 86, 1239-1249) and a 3' poly(A) tail were added. The poly(A) tail was approximately 135 nucleotides in length on average. The 5' and 3' untranslated regions present in each mRNA product are represented as X and Y, respectively, and defined as stated (vide infra).

Codon-Optimized hDNAH5 mRNA:

TABLE-US-00003
X - Coding region - Y

5' and 3' UTR Sequences

X (5' UTR Sequence) =
[SEQ ID NO.: 2]
AGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGAAG
ACACCGGGACCGAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGAACGC
GGAUUCCCCGUGCCAAGAGUGACUCACCGUCCUUGACACG
OR
[SEQ ID NO.: 3]
GGACAGAUCGCCUGGAGACGCCAUCCACGCUGUUUUGACCUCCAUAGAAG
ACACCGGGACCGAUCCAGCCUCCGCGGCCGGGAACGGUGCAUUGGAACGC
GGAUUCCCCGUGCCAAGAGUGACUCACCGUCCUUGACACG

Y (3' UTR Sequence) =
[SEQ ID NO.: 4]
CGGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAG
UUGCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGCAUC
AAGCU
OR
[SEQ ID NO.: 5]
GGGUGGCAUCCCUGUGACCCCUCCCCAGUGCCUCUCCUGGCCCUGGAAGU
UGCCACUCCAGUGCCCACCAGCCUUGUCCUAAUAAAAUUAAGUUGCAUCA
AAGCU

Coding Regions

[0314] The MRT-1 codon-optimized hDNAH5 messenger RNA coding region comprised the sequence of SEQ ID NO. 6 or SEQ ID NO. 7. A 3'-GFP-tagged version of MRT-1 codon-optimized hDNAH5, MRT-hDNAH5-GFP, was likewise prepared using molecular cloning techniques well known in the art.

Formulation Protocol

[0315] hDNAH5 mRNA was encapsulated in multi-component liposomes as described in WO 2018/089790, published May 17, 2018 (incorporated herein by reference), at an N/P ratio of approximately 10.
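The N/P ratio is the molar ratio of ionizable nitrogen atoms in the cationic lipid to phosphate groups in the mRNA. As a hedged illustration of what an N/P ratio of approximately 10 implies for lipid mass, the sketch below assumes one phosphate per nucleotide and an average nucleotide mass of about 330 g/mol, a common approximation. The lipid molecular weight and the number of ionizable nitrogens per lipid in the usage line are placeholders, not properties of ICE or of any lipid named in this application.

```python
# Assumption: average mass per nucleotide residue, each carrying one phosphate.
AVG_NUCLEOTIDE_MW_G_PER_MOL = 330.0


def cationic_lipid_mass_mg(mrna_mass_mg: float,
                           np_ratio: float,
                           lipid_mw_g_per_mol: float,
                           ionizable_n_per_lipid: int = 1) -> float:
    """Cationic lipid mass (mg) needed to reach a target N/P ratio."""
    phosphate_mmol = mrna_mass_mg / AVG_NUCLEOTIDE_MW_G_PER_MOL  # mg / (g/mol) = mmol
    nitrogen_mmol = np_ratio * phosphate_mmol
    lipid_mmol = nitrogen_mmol / ionizable_n_per_lipid
    return lipid_mmol * lipid_mw_g_per_mol  # mmol * g/mol = mg


# Placeholder lipid: 500 g/mol with a single ionizable nitrogen (hypothetical)
lipid_mg = cationic_lipid_mass_mg(mrna_mass_mg=1.0, np_ratio=10,
                                  lipid_mw_g_per_mol=500.0)
print(f"cationic lipid needed per mg mRNA: {lipid_mg:.2f} mg")
```

Because the result scales linearly with the assumed lipid molecular weight and inversely with the number of ionizable nitrogens, the actual lipid properties must be substituted before the figure means anything for a real formulation.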

Example 2. In Vivo Administration and Delivery of hDNAH5 mRNA to the Lung and Expression of hDNAH5 Protein

[0316] This example illustrates exemplary methods of administering hDNAH5 mRNA-loaded liposome nanoparticles and methods for analyzing delivered mRNA and subsequently expressed hDNAH5 protein in lung epithelium in vivo.

[0317] The studies in this Example were performed using male 129S1/SvImJ mice of approximately 10-12 weeks of age. Three groups of mice (each n=5) received, by a single intratracheal aerosol administration via Microsprayer® (50 μL/animal), a test article (Groups 1 and 2) or a control. The test article for Group 1 was 10 μg/animal of MRT-1 hDNAH5 mRNA prepared as described in Example 1. The test article for Group 2 was 10 μg/animal (unless otherwise specified) of hDNAH5-GFP mRNA (i.e., a sequence including both MRT-1 hDNAH5 mRNA and green fluorescent protein (GFP) mRNA) prepared as described in Example 1. The control was either saline administered at the same volume or an irrelevant mRNA in the same delivery vehicle as the test articles. Mice were euthanized at 24 hours (±5%) post dose administration.

Isolation of Plasma for Analysis

[0318] All animals were euthanized by isoflurane overdose via nose cone followed by thoracotomy and terminal blood collection. Whole blood (maximal obtainable volume) was collected via cardiac puncture on euthanized animals and discarded. The animals were then perfused with saline.

Isolation of Organ Tissues for Analysis

[0319] Following perfusion, the liver and the entire airway (trachea to lungs) of each mouse were harvested. The entire airway from the top of the trachea to, and including, the lungs was dissected in one piece and then sagittally cut to provide left and right sections of the entire airway. FIG. 1A depicts the dissection scheme of the lung. The left section of the entire airway was fixed in buffer for subsequent immunohistochemical and histological analysis. The right section of the entire airway was snap-frozen and stored at -70° C. for subsequent qPCR analysis of the trachea (1), superior lobe (2), middle lobe (3), inferior lobe (4), and post-caval lobe (5). The liver also was snap-frozen and stored at -70° C.

qPCR Assay

[0320] Mouse trachea and each lung lobe were homogenized in the presence of TRIzol for complete lysis, followed by RNA extraction using silica-membrane-based spin columns. The codon-optimized hDNAH5 mRNA levels were determined using RT-qPCR. First, the purified RNA was reverse transcribed (RT) into cDNA using random primers. Then, a PCR reaction was performed using sequence-specific primers and quantified in real time using a TaqMan fluorophore probe (qPCR). Purified, in vitro transcribed hDNAH5 mRNA, run as a reference in the qPCR assay, was used to generate a standard curve and calculate hDNAH5 copy numbers per milligram of the analyzed tissue. Results of the qPCR analysis are shown in FIG. 1B.
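The standard-curve calculation described in this paragraph can be sketched as follows. This is a generic illustration of absolute quantification by qPCR, not the applicant's analysis code: the function names, the dilution series, the Ct values and the tissue amount per reaction are all hypothetical. The curve is the usual linear fit of Ct against log10 of the input copy number.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate copies in a reaction."""
    return 10 ** ((ct - intercept) / slope)


def copies_per_mg(ct, slope, intercept, tissue_mg_per_reaction):
    """Normalize estimated copies to the tissue mass represented in the reaction."""
    return copies_from_ct(ct, slope, intercept) / tissue_mg_per_reaction


# Hypothetical dilution series of the in vitro transcribed reference
standard_copies = [1e3, 1e4, 1e5, 1e6, 1e7]
standard_ct = [30.1, 26.8, 23.5, 20.2, 16.9]
slope, intercept = fit_standard_curve(standard_copies, standard_ct)

# Hypothetical tissue sample: Ct of 23.5 from 0.5 mg tissue equivalent
sample_copies_per_mg = copies_per_mg(23.5, slope, intercept, 0.5)
print(f"slope={slope:.2f}, intercept={intercept:.1f}, "
      f"copies/mg={sample_copies_per_mg:.3g}")
```

A slope near -3.3 corresponds to roughly a doubling of product per cycle, which is one common check on the quality of the standard curve before sample values are read from it.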

Immunohistochemical (IHC) Analysis--DNAH5 or GFP

[0321] The hDNAH5 and GFP proteins in the trachea and lungs were characterized by IHC staining. Briefly, the harvested tissues were fixed in formalin and embedded in paraffin blocks. Sections (5 micron thick) along the length of the tissues were mounted on glass slides for staining. Antigen retrieval was performed using an EDTA-based buffer, followed by blocking with hydrogen peroxide and goat serum. Primary antibodies against hDNAH5 (Ab122390) and GFP (Ab290) were incubated with respective samples overnight at 4° C. Enzyme-conjugated secondary antibodies were used for detection of the bound primary antibodies. The images of the stained slides were captured at 20× magnification. Results of the IHC analysis are shown in FIG. 2.

Results

[0322] This Example shows the successful in vivo administration, delivery and expression of a greater than 10 kb therapeutic mRNA. In particular, in this Example, hDNAH5 mRNA, a 14 kb mRNA, was successfully encapsulated, administered by nebulization and delivered in vivo to the lung. FIG. 1B provides qPCR data showing successful hDNAH5 mRNA deposition in cells in each of the trachea (1), superior lobe (2), middle lobe (3), inferior lobe (4), and post-caval lobe (5) of the lung for each mouse in Groups 1 and 2. FIG. 2A provides exemplary IHC images showing positive staining for hDNAH5 protein expressed from the hDNAH5 mRNA in lung tissue from mice in each of Groups 1 and 2. Further, FIG. 2B shows IHC images with positive staining for hDNAH5 protein, from mice in Groups 1 and 2, in tissue from the trachea as well as tissue across the entire lung, from top to bottom (left to right in FIG. 2B).

Exemplary Sequences

[0323] Exemplary codon-optimized mRNA sequences are shown in SEQ ID NOs: 6-31. For the purpose of the sequence disclosure, U and T are used interchangeably.
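
As an illustration of the U/T convention, the following sketch normalizes a sequence so that its RNA and DNA spellings compare equal, then translates it with the standard genetic code. The helper names `normalize` and `translate` are hypothetical and are not part of the disclosure. The input is the first 60 nucleotides of SEQ ID NO: 6 as printed in the sequence listing, and the expected peptide is the first 20 residues of SEQ ID NO: 1 in one-letter code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def normalize(seq):
    """Strip whitespace, upper-case, and spell every U as T."""
    return "".join(seq.split()).upper().replace("U", "T")


def translate(seq):
    """Translate a coding sequence (DNA or RNA spelling) codon by codon."""
    s = normalize(seq)
    usable = len(s) - len(s) % 3  # ignore a trailing partial codon
    return "".join(CODON_TABLE[s[i:i + 3]] for i in range(0, usable, 3))


# First 60 nt of SEQ ID NO: 6, with the listing's 10-nt spacing kept.
dna_fragment = (
    "atgttcagaa ttggacgccg ccagctctgg "
    "aagcattccg tcacccgggt cctgactcag"
)
# The same fragment written with U, as it would appear in the mRNA.
rna_fragment = dna_fragment.replace("t", "u")

peptide = translate(rna_fragment)
```

Because both spellings normalize to the same string, either form of a listed sequence can be checked against the protein sequence of SEQ ID NO: 1 in the same way.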

EQUIVALENTS

[0324] Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. The scope of the present invention is not intended to be limited to the above Description, but rather is as set forth in the following claims:

Sequence CWU 1

1

3114624PRTHomo sapiens 1Met Phe Arg Ile Gly Arg Arg Gln Leu Trp Lys His Ser Val Thr Arg1 5 10 15Val Leu Thr Gln Arg Leu Lys Gly Glu Lys Glu Ala Lys Arg Ala Leu 20 25 30Leu Asp Ala Arg His Asn Tyr Leu Phe Ala Ile Val Ala Ser Cys Leu 35 40 45Asp Leu Asn Lys Thr Glu Val Glu Asp Ala Ile Leu Glu Gly Asn Gln 50 55 60Ile Glu Arg Ile Asp Gln Leu Phe Ala Val Gly Gly Leu Arg His Leu65 70 75 80Met Phe Tyr Tyr Gln Asp Val Glu Glu Ala Glu Thr Gly Gln Leu Gly 85 90 95Ser Leu Gly Gly Val Asn Leu Val Ser Gly Lys Ile Lys Lys Pro Lys 100 105 110Val Phe Val Thr Glu Gly Asn Asp Val Ala Leu Thr Gly Val Cys Val 115 120 125Phe Phe Ile Arg Thr Asp Pro Ser Lys Ala Ile Thr Pro Asp Asn Ile 130 135 140His Gln Glu Val Ser Phe Asn Met Leu Asp Ala Ala Asp Gly Gly Leu145 150 155 160Leu Asn Ser Val Arg Arg Leu Leu Ser Asp Ile Phe Ile Pro Ala Leu 165 170 175Arg Ala Thr Ser His Gly Trp Gly Glu Leu Glu Gly Leu Gln Asp Ala 180 185 190Ala Asn Ile Arg Gln Glu Phe Leu Ser Ser Leu Glu Gly Phe Val Asn 195 200 205Val Leu Ser Gly Ala Gln Glu Ser Leu Lys Glu Lys Val Asn Leu Arg 210 215 220Lys Cys Asp Ile Leu Glu Leu Lys Thr Leu Lys Glu Pro Thr Asp Tyr225 230 235 240Leu Thr Leu Ala Asn Asn Pro Glu Thr Leu Gly Lys Ile Glu Asp Cys 245 250 255Met Lys Val Trp Ile Lys Gln Thr Glu Gln Val Leu Ala Glu Asn Asn 260 265 270Gln Leu Leu Lys Glu Ala Asp Asp Val Gly Pro Arg Ala Glu Leu Glu 275 280 285His Trp Lys Lys Arg Leu Ser Lys Phe Asn Tyr Leu Leu Glu Gln Leu 290 295 300Lys Ser Pro Asp Val Lys Ala Val Leu Ala Val Leu Ala Ala Ala Lys305 310 315 320Ser Lys Leu Leu Lys Thr Trp Arg Glu Met Asp Ile Arg Ile Thr Asp 325 330 335Ala Thr Asn Glu Ala Lys Asp Asn Val Lys Tyr Leu Tyr Thr Leu Glu 340 345 350Lys Cys Cys Asp Pro Leu Tyr Ser Ser Asp Pro Leu Ser Met Met Asp 355 360 365Ala Ile Pro Thr Leu Ile Asn Ala Ile Lys Met Ile Tyr Ser Ile Ser 370 375 380His Tyr Tyr Asn Thr Ser Glu Lys Ile Thr Ser Leu Phe Val Lys Val385 390 395 400Thr Asn Gln Ile Ile Ser Ala Cys Lys Ala Tyr Ile Thr Asn Asn Gly 405 410 415Thr Ala Ser Ile Trp Asn Gln 
Pro Gln Asp Val Val Glu Glu Lys Ile 420 425 430Leu Ser Ala Ile Lys Leu Lys Gln Glu Tyr Gln Leu Cys Phe His Lys 435 440 445Thr Lys Gln Lys Leu Lys Gln Asn Pro Asn Ala Lys Gln Phe Asp Phe 450 455 460Ser Glu Met Tyr Ile Phe Gly Lys Phe Glu Thr Phe His Arg Arg Leu465 470 475 480Ala Lys Ile Ile Asp Ile Phe Thr Thr Leu Lys Thr Tyr Ser Val Leu 485 490 495Gln Asp Ser Thr Ile Glu Gly Leu Glu Asp Met Ala Thr Lys Tyr Gln 500 505 510Gly Ile Val Ala Thr Ile Lys Lys Lys Glu Tyr Asn Phe Leu Asp Gln 515 520 525Arg Lys Met Asp Phe Asp Gln Asp Tyr Glu Glu Phe Cys Lys Gln Thr 530 535 540Asn Asp Leu His Asn Glu Leu Arg Lys Phe Met Asp Val Thr Phe Ala545 550 555 560Lys Ile Gln Asn Thr Asn Gln Ala Leu Arg Met Leu Lys Lys Phe Glu 565 570 575Arg Leu Asn Ile Pro Asn Leu Gly Ile Asp Asp Lys Tyr Gln Leu Ile 580 585 590Leu Glu Asn Tyr Gly Ala Asp Ile Asp Met Ile Ser Lys Leu Tyr Thr 595 600 605Lys Gln Lys Tyr Asp Pro Pro Leu Ala Arg Asn Gln Pro Pro Ile Ala 610 615 620Gly Lys Ile Leu Trp Ala Arg Gln Leu Phe His Arg Ile Gln Gln Pro625 630 635 640Met Gln Leu Phe Gln Gln His Pro Ala Val Leu Ser Thr Ala Glu Ala 645 650 655Lys Pro Ile Ile Arg Ser Tyr Asn Arg Met Ala Lys Val Leu Leu Glu 660 665 670Phe Glu Val Leu Phe His Arg Ala Trp Leu Arg Gln Ile Glu Glu Ile 675 680 685His Val Gly Leu Glu Ala Ser Leu Leu Val Lys Ala Pro Gly Thr Gly 690 695 700Glu Leu Phe Val Asn Phe Asp Pro Gln Ile Leu Ile Leu Phe Arg Glu705 710 715 720Thr Glu Cys Met Ala Gln Met Gly Leu Glu Val Ser Pro Leu Ala Thr 725 730 735Ser Leu Phe Gln Lys Arg Asp Arg Tyr Lys Arg Asn Phe Ser Asn Met 740 745 750Lys Met Met Leu Ala Glu Tyr Gln Arg Val Lys Ser Lys Ile Pro Ala 755 760 765Ala Ile Glu Gln Leu Ile Val Pro His Leu Ala Lys Val Asp Glu Ala 770 775 780Leu Gln Pro Gly Leu Ala Ala Leu Thr Trp Thr Ser Leu Asn Ile Glu785 790 795 800Ala Tyr Leu Glu Asn Thr Phe Ala Lys Ile Lys Asp Leu Glu Leu Leu 805 810 815Leu Asp Arg Val Asn Asp Leu Ile Glu Phe Arg Ile Asp Ala Ile Leu 820 825 830Glu Glu Met Ser Ser Thr Pro Leu Cys Gln Leu Pro Gln Glu Glu 
Pro 835 840 845Leu Thr Cys Glu Glu Phe Leu Gln Met Thr Lys Asp Leu Cys Val Asn 850 855 860Gly Ala Gln Ile Leu His Phe Lys Ser Ser Leu Val Glu Glu Ala Val865 870 875 880Asn Glu Leu Val Asn Met Leu Leu Asp Val Glu Val Leu Ser Glu Glu 885 890 895Glu Ser Glu Lys Ile Ser Asn Glu Asn Ser Val Asn Tyr Lys Asn Glu 900 905 910Ser Ser Ala Lys Arg Glu Glu Gly Asn Phe Asp Thr Leu Thr Ser Ser 915 920 925Ile Asn Ala Arg Ala Asn Ala Leu Leu Leu Thr Thr Val Thr Arg Lys 930 935 940Lys Lys Glu Thr Glu Met Leu Gly Glu Glu Ala Arg Glu Leu Leu Ser945 950 955 960His Phe Asn His Gln Asn Met Asp Ala Leu Leu Lys Val Thr Arg Asn 965 970 975Thr Leu Glu Ala Ile Arg Lys Arg Ile His Ser Ser His Thr Ile Asn 980 985 990Phe Arg Asp Ser Asn Ser Ala Ser Asn Met Lys Gln Asn Ser Leu Pro 995 1000 1005Ile Phe Arg Ala Ser Val Thr Leu Ala Ile Pro Asn Ile Val Met 1010 1015 1020Ala Pro Ala Leu Glu Asp Val Gln Gln Thr Leu Asn Lys Ala Val 1025 1030 1035Glu Cys Ile Ile Ser Val Pro Lys Gly Val Arg Gln Trp Ser Ser 1040 1045 1050Glu Leu Leu Ser Lys Lys Lys Ile Gln Glu Arg Lys Met Ala Ala 1055 1060 1065Leu Gln Ser Asn Glu Asp Ser Asp Ser Asp Val Glu Met Gly Glu 1070 1075 1080Asn Glu Leu Gln Asp Thr Leu Glu Ile Ala Ser Val Asn Leu Pro 1085 1090 1095Ile Pro Val Gln Thr Lys Asn Tyr Tyr Lys Asn Val Ser Glu Asn 1100 1105 1110Lys Glu Ile Val Lys Leu Val Ser Val Leu Ser Thr Ile Ile Asn 1115 1120 1125Ser Thr Lys Lys Glu Val Ile Thr Ser Met Asp Cys Phe Lys Arg 1130 1135 1140Tyr Asn His Ile Trp Gln Lys Gly Lys Glu Glu Ala Ile Lys Thr 1145 1150 1155Phe Ile Thr Gln Ser Pro Leu Leu Ser Glu Phe Glu Ser Gln Ile 1160 1165 1170Leu Tyr Phe Gln Asn Leu Glu Gln Glu Ile Asn Ala Glu Pro Glu 1175 1180 1185Tyr Val Cys Val Gly Ser Ile Ala Leu Tyr Thr Ala Asp Leu Lys 1190 1195 1200Phe Ala Leu Thr Ala Glu Thr Lys Ala Trp Met Val Val Ile Gly 1205 1210 1215Arg His Cys Asn Lys Lys Tyr Arg Ser Glu Met Glu Asn Ile Phe 1220 1225 1230Met Leu Ile Glu Glu Phe Asn Lys Lys Leu Asn Arg Pro Ile Lys 1235 1240 1245Asp Leu Asp Asp Ile Arg Ile Ala Met Ala 
Ala Leu Lys Glu Ile 1250 1255 1260Arg Glu Glu Gln Ile Ser Ile Asp Phe Gln Val Gly Pro Ile Glu 1265 1270 1275Glu Ser Tyr Ala Leu Leu Asn Arg Tyr Gly Leu Leu Ile Ala Arg 1280 1285 1290Glu Glu Ile Asp Lys Val Asp Thr Leu His Tyr Ala Trp Glu Lys 1295 1300 1305Leu Leu Ala Arg Ala Gly Glu Val Gln Asn Lys Leu Val Ser Leu 1310 1315 1320Gln Pro Ser Phe Lys Lys Glu Leu Ile Ser Ala Val Glu Val Phe 1325 1330 1335Leu Gln Asp Cys His Gln Phe Tyr Leu Asp Tyr Asp Leu Asn Gly 1340 1345 1350Pro Met Ala Ser Gly Leu Lys Pro Gln Glu Ala Ser Asp Arg Leu 1355 1360 1365Ile Met Phe Gln Asn Gln Phe Asp Asn Ile Tyr Arg Lys Tyr Ile 1370 1375 1380Thr Tyr Thr Gly Gly Glu Glu Leu Phe Gly Leu Pro Ala Thr Gln 1385 1390 1395Tyr Pro Gln Leu Leu Glu Ile Lys Lys Gln Leu Asn Leu Leu Gln 1400 1405 1410Lys Ile Tyr Thr Leu Tyr Asn Ser Val Ile Glu Thr Val Asn Ser 1415 1420 1425Tyr Tyr Asp Ile Leu Trp Ser Glu Val Asn Ile Glu Lys Ile Asn 1430 1435 1440Asn Glu Leu Leu Glu Phe Gln Asn Arg Cys Arg Lys Leu Pro Arg 1445 1450 1455Ala Leu Lys Asp Trp Gln Ala Phe Leu Asp Leu Lys Lys Ile Ile 1460 1465 1470Asp Asp Phe Ser Glu Cys Cys Pro Leu Leu Glu Tyr Met Ala Ser 1475 1480 1485Lys Ala Met Met Glu Arg His Trp Glu Arg Ile Thr Thr Leu Thr 1490 1495 1500Gly His Ser Leu Asp Val Gly Asn Glu Ser Phe Lys Leu Arg Asn 1505 1510 1515Ile Met Glu Ala Pro Leu Leu Lys Tyr Lys Glu Glu Ile Glu Asp 1520 1525 1530Ile Cys Ile Ser Ala Val Lys Glu Arg Asp Ile Glu Gln Lys Leu 1535 1540 1545Lys Gln Val Ile Asn Glu Trp Asp Asn Lys Thr Phe Thr Phe Gly 1550 1555 1560Ser Phe Lys Thr Arg Gly Glu Leu Leu Leu Arg Gly Asp Ser Thr 1565 1570 1575Ser Glu Ile Ile Ala Asn Met Glu Asp Ser Leu Met Leu Leu Gly 1580 1585 1590Ser Leu Leu Ser Asn Arg Tyr Asn Met Pro Phe Lys Ala Gln Ile 1595 1600 1605Gln Lys Trp Val Gln Tyr Leu Ser Asn Ser Thr Asp Ile Ile Glu 1610 1615 1620Ser Trp Met Thr Val Gln Asn Leu Trp Ile Tyr Leu Glu Ala Val 1625 1630 1635Phe Val Gly Gly Asp Ile Ala Lys Gln Leu Pro Lys Glu Ala Lys 1640 1645 1650Arg Phe Ser Asn Ile Asp Lys Ser Trp Val 
Lys Ile Met Thr Arg 1655 1660 1665Ala His Glu Val Pro Ser Val Val Gln Cys Cys Val Gly Asp Glu 1670 1675 1680Thr Leu Gly Gln Leu Leu Pro His Leu Leu Asp Gln Leu Glu Ile 1685 1690 1695Cys Gln Lys Ser Leu Thr Gly Tyr Leu Glu Lys Lys Arg Leu Cys 1700 1705 1710Phe Pro Arg Phe Phe Phe Val Ser Asp Pro Ala Leu Leu Glu Ile 1715 1720 1725Leu Gly Gln Ala Ser Asp Ser His Thr Ile Gln Ala His Leu Leu 1730 1735 1740Asn Val Phe Asp Asn Ile Lys Ser Val Lys Phe His Glu Lys Ile 1745 1750 1755Tyr Asp Arg Ile Leu Ser Ile Ser Ser Gln Glu Gly Glu Thr Ile 1760 1765 1770Glu Leu Asp Lys Pro Val Met Ala Glu Gly Asn Val Glu Val Trp 1775 1780 1785Leu Asn Ser Leu Leu Glu Glu Ser Gln Ser Ser Leu His Leu Val 1790 1795 1800Ile Arg Gln Ala Ala Ala Asn Ile Gln Glu Thr Gly Phe Gln Leu 1805 1810 1815Thr Glu Phe Leu Ser Ser Phe Pro Ala Gln Val Gly Leu Leu Gly 1820 1825 1830Ile Gln Met Ile Trp Thr Arg Asp Ser Glu Glu Ala Leu Arg Asn 1835 1840 1845Ala Lys Phe Asp Lys Lys Ile Met Gln Lys Thr Asn Gln Ala Phe 1850 1855 1860Leu Glu Leu Leu Asn Thr Leu Ile Asp Val Thr Thr Arg Asp Leu 1865 1870 1875Ser Ser Thr Glu Arg Val Lys Tyr Glu Thr Leu Ile Thr Ile His 1880 1885 1890Val His Gln Arg Asp Ile Phe Asp Asp Leu Cys His Met His Ile 1895 1900 1905Lys Ser Pro Met Asp Phe Glu Trp Leu Lys Gln Cys Arg Phe Tyr 1910 1915 1920Phe Asn Glu Asp Ser Asp Lys Met Met Ile His Ile Thr Asp Val 1925 1930 1935Ala Phe Ile Tyr Gln Asn Glu Phe Leu Gly Cys Thr Asp Arg Leu 1940 1945 1950Val Ile Thr Pro Leu Thr Asp Arg Cys Tyr Ile Thr Leu Ala Gln 1955 1960 1965Ala Leu Gly Met Ser Met Gly Gly Ala Pro Ala Gly Pro Ala Gly 1970 1975 1980Thr Gly Lys Thr Glu Thr Thr Lys Asp Met Gly Arg Cys Leu Gly 1985 1990 1995Lys Tyr Val Val Val Phe Asn Cys Ser Asp Gln Met Asp Phe Arg 2000 2005 2010Gly Leu Gly Arg Ile Phe Lys Gly Leu Ala Gln Ser Gly Ser Trp 2015 2020 2025Gly Cys Phe Asp Glu Phe Asn Arg Ile Asp Leu Pro Val Leu Ser 2030 2035 2040Val Ala Ala Gln Gln Ile Ser Ile Ile Leu Thr Cys Lys Lys Glu 2045 2050 2055His Lys Lys Ser Phe Ile Phe Thr Asp Gly 
Asp Asn Val Thr Met 2060 2065 2070Asn Pro Glu Phe Gly Leu Phe Leu Thr Met Asn Pro Gly Tyr Ala 2075 2080 2085Gly Arg Gln Glu Leu Pro Glu Asn Leu Lys Ile Asn Phe Arg Ser 2090 2095 2100Val Ala Met Met Val Pro Asp Arg Gln Ile Ile Ile Arg Val Lys 2105 2110 2115Leu Ala Ser Cys Gly Phe Ile Asp Asn Val Val Leu Ala Arg Lys 2120 2125 2130Phe Phe Thr Leu Tyr Lys Leu Cys Glu Glu Gln Leu Ser Lys Gln 2135 2140 2145Val His Tyr Asp Phe Gly Leu Arg Asn Ile Leu Ser Val Leu Arg 2150 2155 2160Thr Leu Gly Ala Ala Lys Arg Ala Asn Pro Met Asp Thr Glu Ser 2165 2170 2175Thr Ile Val Met Arg Val Leu Arg Asp Met Asn Leu Ser Lys Leu 2180 2185 2190Ile Asp Glu Asp Glu Pro Leu Phe Leu Ser Leu Ile Glu Asp Leu 2195 2200 2205Phe Pro Asn Ile Leu Leu Asp Lys Ala Gly Tyr Pro Glu Leu Glu 2210 2215 2220Ala Ala Ile Ser Arg Gln Val Glu Glu Ala Gly Leu Ile Asn His 2225 2230 2235Pro Pro Trp Lys Leu Lys Val Ile Gln Leu Phe Glu Thr Gln Arg 2240 2245 2250Val Arg His Gly Met Met Thr Leu Gly Pro Ser Gly Ala Gly Lys 2255 2260 2265Thr Thr Cys Ile His Thr Leu Met Arg Ala Met Thr Asp Cys Gly 2270 2275 2280Lys Pro His Arg Glu Met Arg Met Asn Pro Lys Ala Ile Thr Ala 2285 2290 2295Pro Gln Met Phe Gly Arg Leu Asp Val Ala Thr Asn Asp Trp Thr 2300 2305 2310Asp Gly Ile Phe Ser Thr Leu Trp Arg Lys Thr Leu Arg Ala Lys 2315 2320 2325Lys Gly Glu His Ile Trp Ile Ile Leu Asp Gly Pro Val Asp Ala 2330 2335 2340Ile Trp Ile Glu Asn Leu Asn Ser Val Leu Asp Asp Asn Lys Thr 2345 2350 2355Leu Thr Leu Ala Asn Gly Asp Arg Ile Pro Met Ala Pro Asn Cys 2360 2365 2370Lys Ile Ile Phe Glu Pro His Asn Ile Asp Asn Ala Ser Pro Ala 2375 2380 2385Thr Val Ser Arg Asn Gly Met Val Phe Met Ser Ser Ser Ile Leu 2390 2395 2400Asp Trp Ser Pro Ile Leu Glu Gly Phe Leu Lys Lys Arg Ser Pro 2405 2410 2415Gln Glu Ala Glu Ile Leu Arg Gln Leu Tyr Thr Glu Ser Phe Pro 2420 2425 2430Asp Leu Tyr Arg Phe Cys Ile Gln Asn Leu Glu Tyr Lys Met Glu 2435 2440 2445Val
Leu Glu Ala Phe Val Ile Thr Gln Ser Ile Asn Met Leu Gln 2450 2455 2460Gly Leu Ile Pro Leu Lys Glu Gln Gly Gly Glu Val Ser Gln Ala 2465 2470 2475His Leu Gly Arg Leu Phe Val Phe Ala Leu Leu Trp Ser Ala Gly 2480 2485 2490Ala Ala Leu Glu Leu Asp Gly Arg Arg Arg Leu Glu Leu Trp Leu 2495 2500 2505Arg Ser Arg Pro Thr Gly Thr Leu Glu Leu Pro Pro Pro Ala Gly 2510 2515 2520Pro Gly Asp Thr Ala Phe Asp Tyr Tyr Val Ala Pro Asp Gly Thr 2525 2530 2535Trp Thr His Trp Asn Thr Arg Thr Gln Glu Tyr Leu Tyr Pro Ser 2540 2545 2550Asp Thr Thr Pro Glu Tyr Gly Ser Ile Leu Val Pro Asn Val Asp 2555 2560 2565Asn Val Arg Thr Asp Phe Leu Ile Gln Thr Ile Ala Lys Gln Gly 2570 2575 2580Lys Ala Val Leu Leu Ile Gly Glu Gln Gly Thr Ala Lys Thr Val 2585 2590 2595Ile Ile Lys Gly Phe Met Ser Lys Tyr Asp Pro Glu Cys His Met 2600 2605 2610Ile Lys Ser Leu Asn Phe Ser Ser Ala Thr Thr Pro Leu Met Phe 2615 2620 2625Gln Arg Thr Ile Glu Ser Tyr Val Asp Lys Arg Met Gly Thr Thr 2630 2635 2640Tyr Gly Pro Pro Ala Gly Lys Lys Met Thr Val Phe Ile Asp Asp 2645 2650 2655Val Asn Met Pro Ile Ile Asn Glu Trp Gly Asp Gln Val Thr Asn 2660 2665 2670Glu Ile Val Arg Gln Leu Met Glu Gln Asn Gly Phe Tyr Asn Leu 2675 2680 2685Glu Lys Pro Gly Glu Phe Thr Ser Ile Val Asp Ile Gln Phe Leu 2690 2695 2700Ala Ala Met Ile His Pro Gly Gly Gly Arg Asn Asp Ile Pro Gln 2705 2710 2715Arg Leu Lys Arg Gln Phe Ser Ile Phe Asn Cys Thr Leu Pro Ser 2720 2725 2730Glu Ala Ser Val Asp Lys Ile Phe Gly Val Ile Gly Val Gly His 2735 2740 2745Tyr Cys Thr Gln Arg Gly Phe Ser Glu Glu Val Arg Asp Ser Val 2750 2755 2760Thr Lys Leu Val Pro Leu Thr Arg Arg Leu Trp Gln Met Thr Lys 2765 2770 2775Ile Lys Met Leu Pro Thr Pro Ala Lys Phe His Tyr Val Phe Asn 2780 2785 2790Leu Arg Asp Leu Ser Arg Val Trp Gln Gly Met Leu Asn Thr Thr 2795 2800 2805Ser Glu Val Ile Lys Glu Pro Asn Asp Leu Leu Lys Leu Trp Lys 2810 2815 2820His Glu Cys Lys Arg Val Ile Ala Asp Arg Phe Thr Val Ser Ser 2825 2830 2835Asp Val Thr Trp Phe Asp Lys Ala Leu Val Ser Leu Val Glu Glu 2840 2845 2850Glu 
Phe Gly Glu Glu Lys Lys Leu Leu Val Asp Cys Gly Ile Asp 2855 2860 2865Thr Tyr Phe Val Asp Phe Leu Arg Asp Ala Pro Glu Ala Ala Gly 2870 2875 2880Glu Thr Ser Glu Glu Ala Asp Ala Glu Thr Pro Lys Ile Tyr Glu 2885 2890 2895Pro Ile Glu Ser Phe Ser His Leu Lys Glu Arg Leu Asn Met Phe 2900 2905 2910Leu Gln Leu Tyr Asn Glu Ser Ile Arg Gly Ala Gly Met Asp Met 2915 2920 2925Val Phe Phe Ala Asp Ala Met Val His Leu Val Lys Ile Ser Arg 2930 2935 2940Val Ile Arg Thr Pro Gln Gly Asn Ala Leu Leu Val Gly Val Gly 2945 2950 2955Gly Ser Gly Lys Gln Ser Leu Thr Arg Leu Ala Ser Phe Ile Ala 2960 2965 2970Gly Tyr Val Ser Phe Gln Ile Thr Leu Thr Arg Ser Tyr Asn Thr 2975 2980 2985Ser Asn Leu Met Glu Asp Leu Lys Val Leu Tyr Arg Thr Ala Gly 2990 2995 3000Gln Gln Gly Lys Gly Ile Thr Phe Ile Phe Thr Asp Asn Glu Ile 3005 3010 3015Lys Asp Glu Ser Phe Leu Glu Tyr Met Asn Asn Val Leu Ser Ser 3020 3025 3030Gly Glu Val Ser Asn Leu Phe Ala Arg Asp Glu Ile Asp Glu Ile 3035 3040 3045Asn Ser Asp Leu Ala Ser Val Met Lys Lys Glu Phe Pro Arg Cys 3050 3055 3060Leu Pro Thr Asn Glu Asn Leu His Asp Tyr Phe Met Ser Arg Val 3065 3070 3075Arg Gln Asn Leu His Ile Val Leu Cys Phe Ser Pro Val Gly Glu 3080 3085 3090Lys Phe Arg Asn Arg Ala Leu Lys Phe Pro Ala Leu Ile Ser Gly 3095 3100 3105Cys Thr Ile Asp Trp Phe Ser Arg Trp Pro Lys Asp Ala Leu Val 3110 3115 3120Ala Val Ser Glu His Phe Leu Thr Ser Tyr Asp Ile Asp Cys Ser 3125 3130 3135Leu Glu Ile Lys Lys Glu Val Val Gln Cys Met Gly Ser Phe Gln 3140 3145 3150Asp Gly Val Ala Glu Lys Cys Val Asp Tyr Phe Gln Arg Phe Arg 3155 3160 3165Arg Ser Thr His Val Thr Pro Lys Ser Tyr Leu Ser Phe Ile Gln 3170 3175 3180Gly Tyr Lys Phe Ile Tyr Gly Glu Lys His Val Glu Val Arg Thr 3185 3190 3195Leu Ala Asn Arg Met Asn Thr Gly Leu Glu Lys Leu Lys Glu Ala 3200 3205 3210Ser Glu Ser Val Ala Ala Leu Ser Lys Glu Leu Glu Ala Lys Glu 3215 3220 3225Lys Glu Leu Gln Val Ala Asn Asp Lys Ala Asp Met Val Leu Lys 3230 3235 3240Glu Val Thr Met Lys Ala Gln Ala Ala Glu Lys Val Lys Ala Glu 3245 3250 3255Val 
Gln Lys Val Lys Asp Arg Ala Gln Ala Ile Val Asp Ser Ile 3260 3265 3270Ser Lys Asp Lys Ala Ile Ala Glu Glu Lys Leu Glu Ala Ala Lys 3275 3280 3285Pro Ala Leu Glu Glu Ala Glu Ala Ala Leu Gln Thr Ile Arg Pro 3290 3295 3300Ser Asp Ile Ala Thr Val Arg Thr Leu Gly Arg Pro Pro His Leu 3305 3310 3315Ile Met Arg Ile Met Asp Cys Val Leu Leu Leu Phe Gln Arg Lys 3320 3325 3330Val Ser Ala Val Lys Ile Asp Leu Glu Lys Ser Cys Thr Met Pro 3335 3340 3345Ser Trp Gln Glu Ser Leu Lys Leu Met Thr Ala Gly Asn Phe Leu 3350 3355 3360Gln Asn Leu Gln Gln Phe Pro Lys Asp Thr Ile Asn Glu Glu Val 3365 3370 3375Ile Glu Phe Leu Ser Pro Tyr Phe Glu Met Pro Asp Tyr Asn Ile 3380 3385 3390Glu Thr Ala Lys Arg Val Cys Gly Asn Val Ala Gly Leu Cys Ser 3395 3400 3405Trp Thr Lys Ala Met Ala Ser Phe Phe Ser Ile Asn Lys Glu Val 3410 3415 3420Leu Pro Leu Lys Ala Asn Leu Val Val Gln Glu Asn Arg His Leu 3425 3430 3435Leu Ala Met Gln Asp Leu Gln Lys Ala Gln Ala Glu Leu Asp Asp 3440 3445 3450Lys Gln Ala Glu Leu Asp Val Val Gln Ala Glu Tyr Glu Gln Ala 3455 3460 3465Met Thr Glu Lys Gln Thr Leu Leu Glu Asp Ala Glu Arg Cys Arg 3470 3475 3480His Lys Met Gln Thr Ala Ser Thr Leu Ile Ser Gly Leu Ala Gly 3485 3490 3495Glu Lys Glu Arg Trp Thr Glu Gln Ser Gln Glu Phe Ala Ala Gln 3500 3505 3510Thr Lys Arg Leu Val Gly Asp Val Leu Leu Ala Thr Ala Phe Leu 3515 3520 3525Ser Tyr Ser Gly Pro Phe Asn Gln Glu Phe Arg Asp Leu Leu Leu 3530 3535 3540Asn Asp Trp Arg Lys Glu Met Lys Ala Arg Lys Ile Pro Phe Gly 3545 3550 3555Lys Asn Leu Asn Leu Ser Glu Met Leu Ile Asp Ala Pro Thr Ile 3560 3565 3570Ser Glu Trp Asn Leu Gln Gly Leu Pro Asn Asp Asp Leu Ser Ile 3575 3580 3585Gln Asn Gly Ile Ile Val Thr Lys Ala Ser Arg Tyr Pro Leu Leu 3590 3595 3600Ile Asp Pro Gln Thr Gln Gly Lys Ile Trp Ile Lys Asn Lys Glu 3605 3610 3615Ser Arg Asn Glu Leu Gln Ile Thr Ser Leu Asn His Lys Tyr Phe 3620 3625 3630Arg Asn His Leu Glu Asp Ser Leu Ser Leu Gly Arg Pro Leu Leu 3635 3640 3645Ile Glu Asp Val Gly Glu Glu Leu Asp Pro Ala Leu Asp Asn Val 3650 3655 3660Leu 
Glu Arg Asn Phe Ile Lys Thr Gly Ser Thr Phe Lys Val Lys 3665 3670 3675Val Gly Asp Lys Glu Val Asp Val Leu Asp Gly Phe Arg Leu Tyr 3680 3685 3690Ile Thr Thr Lys Leu Pro Asn Pro Ala Tyr Thr Pro Glu Ile Ser 3695 3700 3705Ala Arg Thr Ser Ile Ile Asp Phe Thr Val Thr Met Lys Gly Leu 3710 3715 3720Glu Asp Gln Leu Leu Gly Arg Val Ile Leu Thr Glu Lys Gln Glu 3725 3730 3735Leu Glu Lys Glu Arg Thr His Leu Met Glu Asp Val Thr Ala Asn 3740 3745 3750Lys Arg Arg Met Lys Glu Leu Glu Asp Asn Leu Leu Tyr Arg Leu 3755 3760 3765Thr Ser Thr Gln Gly Ser Leu Val Glu Asp Glu Ser Leu Ile Val 3770 3775 3780Val Leu Ser Asn Thr Lys Arg Thr Ala Glu Glu Val Thr Gln Lys 3785 3790 3795Leu Glu Ile Ser Ala Glu Thr Glu Val Gln Ile Asn Ser Ala Arg 3800 3805 3810Glu Glu Tyr Arg Pro Val Ala Thr Arg Gly Ser Ile Leu Tyr Phe 3815 3820 3825Leu Ile Thr Glu Met Arg Leu Val Asn Glu Met Tyr Gln Thr Ser 3830 3835 3840Leu Arg Gln Phe Leu Gly Leu Phe Asp Leu Ser Leu Ala Arg Ser 3845 3850 3855Val Lys Ser Pro Ile Thr Ser Lys Arg Ile Ala Asn Ile Ile Glu 3860 3865 3870His Met Thr Tyr Glu Val Tyr Lys Tyr Ala Ala Arg Gly Leu Tyr 3875 3880 3885Glu Glu His Lys Phe Leu Phe Thr Leu Leu Leu Thr Leu Lys Ile 3890 3895 3900Asp Ile Gln Arg Asn Arg Val Lys His Glu Glu Phe Leu Thr Leu 3905 3910 3915Ile Lys Gly Gly Ala Ser Leu Asp Leu Lys Ala Cys Pro Pro Lys 3920 3925 3930Pro Ser Lys Trp Ile Leu Asp Ile Thr Trp Leu Asn Leu Val Glu 3935 3940 3945Leu Ser Lys Leu Arg Gln Phe Ser Asp Val Leu Asp Gln Ile Ser 3950 3955 3960Arg Asn Glu Lys Met Trp Lys Ile Trp Phe Asp Lys Glu Asn Pro 3965 3970 3975Glu Glu Glu Pro Leu Pro Asn Ala Tyr Asp Lys Ser Leu Asp Cys 3980 3985 3990Phe Arg Arg Leu Leu Leu Ile Arg Ser Trp Cys Pro Asp Arg Thr 3995 4000 4005Ile Ala Gln Ala Arg Lys Tyr Ile Val Asp Ser Met Gly Glu Lys 4010 4015 4020Tyr Ala Glu Gly Val Ile Leu Asp Leu Glu Lys Thr Trp Glu Glu 4025 4030 4035Ser Asp Pro Arg Thr Pro Leu Ile Cys Leu Leu Ser Met Gly Ser 4040 4045 4050Asp Pro Thr Asp Ser Ile Ile Ala Leu Gly Lys Arg Leu Lys Ile 4055 4060 4065Glu 
Thr Arg Tyr Val Ser Met Gly Gln Gly Gln Glu Val His Ala 4070 4075 4080Arg Lys Leu Leu Gln Gln Thr Met Ala Asn Gly Gly Trp Ala Leu 4085 4090 4095Leu Gln Asn Cys His Leu Gly Leu Asp Phe Met Asp Glu Leu Met 4100 4105 4110Asp Ile Ile Ile Glu Thr Glu Leu Val His Asp Ala Phe Arg Leu 4115 4120 4125Trp Met Thr Thr Glu Ala His Lys Gln Phe Pro Ile Thr Leu Leu 4130 4135 4140Gln Met Ser Ile Lys Phe Ala Asn Asp Pro Pro Gln Gly Leu Arg 4145 4150 4155Ala Gly Leu Lys Arg Thr Tyr Ser Gly Val Ser Gln Asp Leu Leu 4160 4165 4170Asp Val Ser Ser Gly Ser Gln Trp Lys Pro Met Leu Tyr Ala Val 4175 4180 4185Ala Phe Leu His Ser Thr Val Gln Glu Arg Arg Lys Phe Gly Ala 4190 4195 4200Leu Gly Trp Asn Ile Pro Tyr Glu Phe Asn Gln Ala Asp Phe Asn 4205 4210 4215Ala Thr Val Gln Phe Ile Gln Asn His Leu Asp Asp Met Asp Val 4220 4225 4230Lys Lys Gly Val Ser Trp Thr Thr Ile Arg Tyr Met Ile Gly Glu 4235 4240 4245Ile Gln Tyr Gly Gly Arg Val Thr Asp Asp Tyr Asp Lys Arg Leu 4250 4255 4260Leu Asn Thr Phe Ala Lys Val Trp Phe Ser Glu Asn Met Phe Gly 4265 4270 4275Pro Asp Phe Ser Phe Tyr Gln Gly Tyr Asn Ile Pro Lys Cys Ser 4280 4285 4290Thr Val Asp Asn Tyr Leu Gln Tyr Ile Gln Ser Leu Pro Ala Tyr 4295 4300 4305Asp Ser Pro Glu Val Phe Gly Leu His Pro Asn Ala Asp Ile Thr 4310 4315 4320Tyr Gln Ser Lys Leu Ala Lys Asp Val Leu Asp Thr Ile Leu Gly 4325 4330 4335Ile Gln Pro Lys Asp Thr Ser Gly Gly Gly Asp Glu Thr Arg Glu 4340 4345 4350Ala Val Val Ala Arg Leu Ala Asp Asp Met Leu Glu Lys Leu Pro 4355 4360 4365Pro Asp Tyr Val Pro Phe Glu Val Lys Glu Arg Leu Gln Lys Met 4370 4375 4380Gly Pro Phe Gln Pro Met Asn Ile Phe Leu Arg Gln Glu Ile Asp 4385 4390 4395Arg Met Gln Arg Val Leu Ser Leu Val Arg Ser Thr Leu Thr Glu 4400 4405 4410Leu Lys Leu Ala Ile Asp Gly Thr Ile Ile Met Ser Glu Asn Leu 4415 4420 4425Arg Asp Ala Leu Asp Cys Met Phe Asp Ala Arg Ile Pro Ala Trp 4430 4435 4440Trp Lys Lys Ala Ser Trp Ile Ser Ser Thr Leu Gly Phe Trp Phe 4445 4450 4455Thr Glu Leu Ile Glu Arg Asn Ser Gln Phe Thr Ser Trp Val Phe 4460 4465 4470Asn 
Gly Arg Pro His Cys Phe Trp Met Thr Gly Phe Phe Asn Pro 4475 4480 4485Gln Gly Phe Leu Thr Ala Met Arg Gln Glu Ile Thr Arg Ala Asn 4490 4495 4500Lys Gly Trp Ala Leu Asp Asn Met Val Leu Cys Asn Glu Val Thr 4505 4510 4515Lys Trp Met Lys Asp Asp Ile Ser Ala Pro Pro Thr Glu Gly Val 4520 4525 4530Tyr Val Tyr Gly Leu Tyr Leu Glu Gly Ala Gly Trp Asp Lys Arg 4535 4540 4545Asn Met Lys Leu Ile Glu Ser Lys Pro Lys Val Leu Phe Glu Leu 4550 4555 4560Met Pro Val Ile Arg Ile Tyr Ala Glu Asn Asn Thr Leu Arg Asp 4565 4570 4575Pro Arg Phe Tyr Ser Cys Pro Ile Tyr Lys Lys Pro Val Arg Thr 4580 4585 4590Asp Leu Asn Tyr Ile Ala Ala Val Asp Leu Arg Thr Ala Gln Thr 4595 4600 4605Pro Glu His Trp Val Leu Arg Gly Val Ala Leu Leu Cys Asp Val 4610 4615 4620Lys2140RNAArtificial SequenceSynthetic polynucleotide 2agacagaucg ccuggagacg ccauccacgc uguuuugacc uccauagaag acaccgggac 60cgauccagcc uccgcggccg ggaacggugc auuggaacgc ggauuccccg ugccaagagu 120gacucaccgu ccuugacacg 1403140RNAArtificial SequenceSynthetic polynucleotide 3ggacagaucg ccuggagacg ccauccacgc uguuuugacc uccauagaag acaccgggac 60cgauccagcc uccgcggccg ggaacggugc auuggaacgc ggauuccccg ugccaagagu 120gacucaccgu ccuugacacg 1404105RNAArtificial SequenceSynthetic polynucleotide 4cggguggcau cccugugacc ccuccccagu gccucuccug gcccuggaag uugccacucc 60agugcccacc agccuugucc uaauaaaauu aaguugcauc aagcu 1055105RNAArtificial SequenceSynthetic polynucleotide 5ggguggcauc ccugugaccc cuccccagug ccucuccugg cccuggaagu ugccacucca 60gugcccacca gccuuguccu aauaaaauua aguugcauca aagcu 105613875DNAArtificial SequenceSynthetic polynucleotide 6atgttcagaa ttggacgccg ccagctctgg aagcattccg tcacccgggt cctgactcag 60aggctgaagg gggaaaagga agcgaaacgc gccctgctgg acgcccgcca taactacctc 120tttgccatcg tggcctcctg cctggacctc aacaagacag aagtcgagga cgccattctc 180gaagggaacc agattgagcg gatcgaccaa ctcttcgccg tcggaggact tcggcacctc 240atgttctact accaggacgt ggaggaggca gaaaccggcc agctcggcag cctgggtgga 300gtgaacctcg tgtcggggaa gattaagaag ccaaaagtgt tcgtgaccga aggaaatgac 360gtggcactca 
ccggagtgtg cgtgttcttc attcggacag acccctcgaa ggcgattacc 420ccagacaata tccaccaaga agtgtcgttc aacatgctgg acgctgcgga tggcgggctg 480ttgaactccg tgcggcggtt gctgtccgac attttcattc cggccctgcg agctacttcg 540cacggctggg gcgaactgga gggcttgcag gatgcggcta acattcggca ggagttcctt 600tcctctctgg aaggcttcgt caacgtcctg tccggcgctc aggagtcact gaaggagaag 660gtgaacctta gaaaatgcga catcctggag ctcaagaccc tgaaggagcc caccgattac 720cttaccctgg
ccaacaaccc tgaaacactg ggaaagattg aggattgcat gaaggtctgg 780attaagcaga ctgaacaagt gctggccgaa aacaaccagc tgctgaagga agccgacgat 840gtggggcctc gcgcggagct tgaacactgg aagaagagac tctcgaagtt caactacctc 900ctggagcaat tgaagtcccc tgatgtgaag gccgtgctgg ccgtcctggc cgccgccaag 960tccaagctgc tcaagacctg gagggaaatg gacataagaa ttactgacgc gaccaatgag 1020gccaaggaca acgtgaaata cttgtacacc ctggaaaagt gctgcgaccc gctctactcc 1080tccgacccgc tcagcatgat ggacgccatt ccgaccctta tcaacgcgat caagatgatc 1140tactccattt cccactacta caacacctcc gagaagatta cttcgctgtt cgtcaaggtg 1200accaaccaga tcatctcagc ctgcaaggcc tacattacca acaacggaac cgcctccatc 1260tggaaccagc cgcaggacgt ggtggaagaa aaaatccttt ccgccattaa gctcaagcag 1320gaataccagt tgtgcttcca caagaccaag cagaagctca agcagaaccc taacgccaag 1380cagttcgact tctccgaaat gtacatcttc ggcaaatttg agacattcca ccgccggctc 1440gctaagatca ttgacatttt caccacgctc aagacctact ccgtcctgca agactcgacc 1500attgagggac ttgaggacat ggcgaccaag taccagggca tcgtcgccac catcaagaag 1560aaggagtaca acttcctcga ccaaaggaag atggacttcg accaggatta cgaggaattc 1620tgcaaacaga ctaacgacct ccacaacgag ctgcgcaagt tcatggacgt caccttcgcg 1680aagattcaga acaccaacca agcgctgagg atgcttaaaa agttcgaacg gctcaacatc 1740cccaacctgg ggatcgatga caagtaccaa ctcatcctgg aaaactacgg cgccgatatc 1800gacatgatct ccaagctgta taccaagcag aagtacgacc ccccgctggc acgcaaccaa 1860ccccccattg ccgggaagat tctgtgggcg aggcaactgt tccacagaat tcagcaaccc 1920atgcaactct tccagcagca ccccgccgtg ctctcgaccg ccgaagccaa acccatcatt 1980cggtcctaca accgcatggc caaggtcctt ttggagttcg aagtcctgtt ccaccgcgcg 2040tggctccggc agattgagga gattcatgtg gggcttgagg ccagcctgct cgtcaaggct 2100cccggcactg gagaactgtt cgtgaatttt gacccgcaaa tcctgatcct gttcagggag 2160acagagtgta tggcccagat gggccttgaa gtgtctccgc tcgccacctc actgttccag 2220aagcgcgaca gatacaagag aaacttcagc aacatgaaga tgatgctggc cgagtaccag 2280agagtcaagt ccaagatccc cgcagccatc gagcagctca ttgtgccgca ccttgcgaag 2340gtcgacgagg cacttcagcc aggactcgcc gccctgactt ggacttccct caacatcgaa 2400gcctacctgg agaacacctt cgcgaagatc aaagacttgg aactgctgct 
ggatagagtc 2460aacgatctga tcgagttcag aattgacgcc atcctggaag aaatgtctag caccccgctc 2520tgtcaacttc ctcaggaaga acccctcact tgcgaagagt tcctgcagat gactaaggat 2580ctgtgcgtga atggggccca gatcctgcac ttcaagtcgt ccctggtgga ggaagccgtg 2640aacgagctgg tcaacatgct cctggacgtc gaggtgctgt ccgaggagga gtcggagaag 2700attagcaacg aaaacagcgt gaactacaaa aacgaatcct ccgccaagag agaagaggga 2760aacttcgaca ccctgacctc ctcgataaat gcgcgggcca acgccctgct cctgaccacc 2820gtcactcgga agaaaaagga aactgagatg ctgggcgaag aggcccggga gctgctctcc 2880cacttcaacc accaaaacat ggacgccctg ctcaaggtga cccgcaacac cctggaggcc 2940attcgaaagc ggattcattc ctcccacacg attaacttcc gggacagcaa ctccgcgtcc 3000aacatgaagc agaactccct cccaatcttc agggcttcgg tgaccctcgc cattcccaac 3060atcgtgatgg cgcctgccct cgaggacgtg cagcagacgc tgaacaaggc agtcgagtgc 3120attatctccg tgcccaaggg cgtgcgccag tggtccagcg agttgcttag caagaaaaag 3180atccaagagc gcaagatggc cgccctgcaa tcaaacgaag attcggactc cgacgtggaa 3240atgggagaaa acgaactgca ggacaccctc gagatcgcct ccgtgaactt gcccatcccg 3300gtgcaaacca agaactacta taagaacgtg tccgaaaaca aggaaatcgt caagctcgtg 3360agcgtcctct cgaccatcat caactccacc aagaaagagg tgatcacctc catggactgc 3420ttcaagcgct acaaccacat ctggcagaaa gggaaagaag aggccattaa gaccttcatc 3480acccagtcac cactgctgtc agagttcgag tcgcagatcc tgtacttcca gaacctcgag 3540caggaaatca acgcggaacc agaatacgtc tgcgtcggat cgattgccct gtacacggca 3600gacctgaagt tcgcactcac tgcagagact aaagcctgga tggtcgtgat cgggcgccat 3660tgcaacaaga agtatagaag cgagatggaa aacatattca tgctgatcga ggaattcaac 3720aagaaattga accggcctat caaggacctg gacgacatcc gcatcgccat ggccgccctg 3780aaggagattc gcgaggaaca aatttcgata gatttccaag tgggacccat tgaggagtcc 3840tacgccctgc tgaaccggta cggactgctg atcgctagag aggagattga caaggtggac 3900accctgcact acgcctggga aaagctgctg gccagggccg gtgaagtgca gaacaagctg 3960gtcagcctcc agccttcctt caagaaggaa ctgatttccg cggtcgaagt gttcctgcaa 4020gactgccacc agttctacct ggactatgac ctcaacggac cgatggccag cggtctgaaa 4080ccgcaggagg cctccgaccg cctgatcatg ttccaaaacc agttcgacaa catataccgg 4140aaatacatca cttacactgg 
aggagaagaa ctcttcggcc tgcccgccac ccaataccct 4200caactgctgg agatcaagaa gcagctgaat ctgctgcaaa agatctacac cctctacaac 4260tccgtcattg agactgtgaa ttcatactac gatatcctgt ggtccgaagt gaacatcgaa 4320aagattaaca acgagctcct ggagtttcag aacagatgcc ggaagctgcc gcgggccctc 4380aaggattggc aggctttcct tgacctgaag aagataatcg acgacttctc ggagtgttgc 4440cccctcctcg agtacatggc ctcgaaggcc atgatggaac gccactggga aaggatcact 4500actctgactg ggcacagcct ggatgtgggt aacgagtcct tcaagctgcg gaacatcatg 4560gaggccccgc ttctgaaata caaggaggaa atcgaggaca tctgcatctc ggccgtcaag 4620gaacgcgaca ttgagcaaaa gctgaagcaa gtgatcaacg agtgggacaa caagactttc 4680accttcggat ccttcaaaac ccggggagag ctcctgctga gaggagactc cacttccgaa 4740atcattgcga acatggagga ctccctcatg ctcctcgggt cgctcctgag caacagatac 4800aacatgccgt tcaaggccca gatccagaag tgggtgcagt acctgtcgaa ctccactgac 4860atcattgaga gctggatgac tgtgcagaac ctctggatct acttagaggc cgtgttcgtg 4920ggaggcgaca ttgctaagca gctcccaaag gaagcaaagc gcttcagcaa catcgacaag 4980tcgtgggtga agattatgac ccgcgcgcat gaggtgccca gcgtggtgca gtgttgcgtg 5040ggcgatgaaa cgctgggcca gcttctgccg catctgctgg accagctgga gatttgccag 5100aagtccctca ccggatacct ggagaagaag cggctgtgct tcccccggtt cttcttcgtg 5160tccgaccctg ccctgctgga aattcttgga caggcctctg acagccatac tatccaggcc 5220cacctcctga acgtgttcga caacatcaag tccgtgaagt tccacgaaaa gatatatgac 5280cgcatcctgt cgattagcag ccaggaaggg gaaaccatcg agctcgacaa gcctgtgatg 5340gccgaaggaa acgtcgaggt gtggctgaac tcgctcctgg aggagagcca gagcagcctg 5400cacctcgtta taagacaggc cgccgccaac atccaggaga ctggattcca gctcactgaa 5460ttcttgtcgt ccttccctgc ccaagtgggc ttgctgggaa ttcaaatgat ctggacccgc 5520gactccgaag aggccctcag gaacgccaag ttcgacaaaa agatcatgca gaaaaccaat 5580caggcgttcc tggagcttct gaacaccctc attgatgtga ccactaggga cctgtcaagc 5640accgaacggg tgaagtacga aaccctcatc accatccacg tccatcagcg ggatattttc 5700gatgacctgt gtcacatgca tatcaagtcc cctatggact tcgaatggct gaagcaatgc 5760cgcttctact tcaatgagga ctcggacaag atgatgatcc atatcacaga tgtcgccttc 5820atctaccaga acgaattcct gggatgcacc gaccgcttgg tgatcacccc 
cctcactgac 5880cggtgctaca ttaccttggc ccaggccctg ggaatgtcca tgggaggggc gcctgccggg 5940ccagccggca ccggaaaaac cgaaacaaca aaggacatgg gccgctgcct ggggaagtac 6000gtggtggtgt tcaattgctc cgatcaaatg gacttcagag gtctggggcg gatcttcaag 6060ggtctggctc agtccggctc ctggggatgc ttcgacgaat tcaaccggat cgacttgccg 6120gtgctgagcg tcgccgccca gcagatctcc atcatcctga cctgtaagaa ggaacacaag 6180aagtcgttca ttttcaccga cggagacaac gtgacgatga acccggagtt cgggctattc 6240ttgaccatga acccgggtta cgccggccga caagagctgc ccgaaaactt gaagatcaac 6300ttccgcagcg tggctatgat ggtcccggat cggcagatta ttattagagt gaagctggct 6360tcgtgcggat tcatcgataa cgtcgtgctg gcccggaagt tcttcacgct gtacaagctc 6420tgcgaggaac agctgagcaa gcaagtccac tacgacttcg ggctgcgcaa cattctctcc 6480gtgctgcgca ctctgggagc ggccaagcgg gcgaacccta tggataccga gagcaccatt 6540gtcatgcgcg tgctccggga catgaacctc tcaaagttga tcgacgagga cgagcctctc 6600ttcctctccc tgatcgagga cctctttccc aacatcctgc tcgataaggc cgggtacccc 6660gagctggagg ccgccattag ccgccaggtg gaagaggctg gcctcatcaa ccaccctcct 6720tggaagttga aggtcataca gctgttcgaa actcagcgcg tgcgacatgg catgatgacc 6780ctgggacctt cgggagccgg caaaactact tgcatccaca ccctgatgcg ggctatgacc 6840gactgcggga agcctcaccg ggagatgagg atgaacccga aggccatcac cgccccgcaa 6900atgttcggcc ggctggacgt ggccactaat gattggaccg acgggatctt ctcaaccctc 6960tggcgcaaaa ccctgcgcgc aaagaaggga gagcacatct ggatcatcct ggatggcccg 7020gtcgatgcca tctggattga gaacctgaac agcgtgctgg acgacaacaa gactctcact 7080ctcgcgaacg gcgacagaat tcccatggcc ccgaactgca agatcatttt cgagccacac 7140aacattgaca acgcctcccc ggctaccgtg tcgagaaacg gaatggtgtt tatgtccagc 7200tcgatcctcg attggtcccc catcctggaa ggcttcctga agaagaggtc gccgcaagag 7260gccgaaattt tgagacagct gtacaccgaa tccttcccgg acctgtaccg cttctgcatc 7320cagaacctcg agtacaagat ggaagtgctg gaggcattcg tgatcactca gagcattaac 7380atgctgcagg gactcatccc tcttaaggaa caggggggcg aagtctctca ggctcacctc 7440ggccggctgt tcgtgttcgc cctcctctgg agcgcgggtg cggccctgga gctggatggc 7500cgccgcagac tcgagctgtg gctgagatcc agacccaccg gaactctcga gttgcccccg 7560cccgccggcc ccggagatac 
ggccttcgac tactacgtgg cccccgacgg gacgtggact 7620cattggaaca cacgcacaca ggaatacctg taccctagcg acaccactcc ggaatacgga 7680tcgatcctgg tgcccaacgt ggataacgtg aggaccgatt tccttataca aaccatcgcc 7740aagcaaggaa aggcggtcct cctcatcggg gagcagggga ccgccaagac cgtgatcatt 7800aagggattca tgtccaagta cgaccccgaa tgccacatga tcaagtctct caactttagc 7860tccgccacca ccccactgat gttccaacgg actatcgaaa gctacgtcga caagcgcatg 7920ggaactactt acggccctcc ggcaggaaag aaaatgaccg tgttcatcga cgacgtcaac 7980atgcccatca ttaacgaatg gggggaccag gtgacgaacg agatcgtgcg ccagctcatg 8040gaacaaaacg gtttctacaa ccttgagaag cccggagagt ttacttcaat tgtggacatc 8100cagttcctcg cggcaatgat tcatcccggc ggaggaagaa acgacatccc gcagcgcctc 8160aaacgccagt tctcgatctt caactgcacg ctgccgtccg aggctagcgt cgacaagata 8220tttggcgtga ttggtgtggg acattactgc acccagaggg gattctcgga ggaagtgcgg 8280gacagcgtga ccaagctggt gccgctgact aggcggctct ggcagatgac caagatcaag 8340atgctcccta cacccgccaa gttccactac gtgttcaatc tgcgcgacct gtcacgcgtg 8400tggcagggaa tgctgaacac tacctccgaa gtcatcaagg aaccgaacga tctccttaaa 8460ctttggaagc acgaatgcaa gcgcgtaatc gcagaccggt tcaccgtgag ctccgatgtg 8520acctggttcg acaaggccct ggtgtccctc gtggaggaag agttcggaga ggaaaagaaa 8580cttctcgtag actgcggtat agatacgtac ttcgtggact tcctgcgcga cgcccccgaa 8640gccgccggcg aaacctccga agaagctgac gccgaaacac caaagatata cgagcctatc 8700gaatccttca gccatctgaa ggagagactc aacatgttcc tgcaattgta taatgagtcc 8760atccgcggcg ccggcatgga catggtgttc tttgcggatg ccatggtgca cctggtgaag 8820atctcgcggg tgattcggac tccccagggc aacgcgctgc tcgtgggcgt ggggggctcc 8880gggaagcaga gccttacccg gctggcctcc ttcattgccg gctatgtgag cttccagatc 8940actctgactc gctcgtacaa taccagcaac ctcatggagg atctcaaggt cctgtaccgg 9000accgccggac aacagggcaa gggtatcacc ttcatcttta ccgacaacga gatcaaggat 9060gagtccttcc tggagtacat gaacaacgtg ctgtcctccg gtgaagtctc caacctcttt 9120gcccgggacg agattgatga gatcaactcg gacttggcgt cagtgatgaa gaaggaattc 9180ccgaggtgtc tgcctaccaa cgagaacctc cacgactact tcatgagccg ggtgcggcag 9240aatctgcaca tcgtgctttg cttcagcccg gtgggagaaa agtttcggaa 
ccgcgccctc 9300aagttcccgg cccttatctc gggatgcacc attgactggt tttcacggtg gccgaaggat 9360gctctggtcg cggtgtccga acacttcctt acttcgtacg acattgactg ctccctggaa 9420atcaagaaag aagtggtcca gtgtatgggc tccttccagg acggtgtcgc cgaaaagtgc 9480gtggactact tccagcgctt ccgcagatcc acgcacgtga cccctaagtc ctacctgtcg 9540ttcatccaag gatacaagtt catctacggg gaaaagcacg tcgaagtgcg gaccctggcc 9600aaccgaatga acacaggact ggaaaagctg aaggaggcgt ccgagtccgt ggctgccctg 9660agcaaggagc tggaagctaa agagaaggag ctccaggtcg ccaacgacaa ggccgacatg 9720gtcctgaagg aggtgaccat gaaggctcaa gccgccgaga aagtcaaggc ggaggtgcaa 9780aaggtcaagg atagagcgca agccatcgtg gactccatct ccaaagacaa ggctattgcg 9840gaggagaagc tcgaggccgc caagcctgcc ttggaagagg ccgaagccgc tctgcagacc 9900attcggccga gcgatatagc caccgtgcgg acgctgggac ggcctcctca cctgattatg 9960cggatcatgg actgtgtgct cctgctgttt caacggaagg tgtccgccgt gaagatcgat 10020ctcgagaaat cctgcactat gccgagctgg caggagtccc tgaagctgat gaccgccggc 10080aacttcctcc agaacctcca gcagttccca aaagatacca ttaacgagga ggtgatcgaa 10140ttcctgagcc cgtacttcga aatgcccgat tacaacatcg aaaccgccaa gagggtgtgc 10200ggcaacgtgg ctgggctctg ctcgtggact aaggccatgg cctccttctt ctcaatcaac 10260aaggaagtgc tgcccctgaa ggccaacctc gtggtgcagg aaaatcgcca ccttctcgcc 10320atgcaggatc tgcagaaggc acaggctgaa ctggacgaca aacaggcaga gctggatgtg 10380gtgcaggccg agtacgaaca ggccatgacg gagaagcaaa ccctgttgga ggatgcggag 10440agatgcagac ataagatgca gactgcctcc accctgattt caggcctggc cggagaaaag 10500gaacgctgga ctgaacaatc ccaggaattc gcggcccaga caaagagact cgtgggagat 10560gtgctgcttg cgactgcctt tctgagctac tctggaccgt tcaaccagga attccgggac 10620ctcctgctca acgactggcg caaggagatg aaggcccgga agatcccctt cgggaagaac 10680ctcaacctga gcgagatgct gatcgacgcc cccaccatct ccgagtggaa cctccaggga 10740ctgcctaacg acgacctctc cattcagaac ggcatcatcg tgactaaggc gtcccgctac 10800ccgctgctca ttgaccctca gacccaggga aagatttgga tcaaaaataa ggagtcccgc 10860aacgagctgc agatcacctc cctgaaccac aagtactttc gcaaccatct ggaagattcc 10920ctcagcctgg gacggccgct tctgatagag gatgtgggag aagaactgga cccggctctc 
10980gacaacgtcc tggagaggaa cttcatcaag accgggtcca ccttcaaggt gaaggtgggc 11040gacaaggagg tggacgtcct ggatggattc cgcctgtaca ttaccactaa gctcccaaac 11100cccgcttaca ctcctgagat cagcgcgcgg accagcatca ttgatttcac cgtgactatg 11160aagggccttg aggaccagct gctgggtcgc gtgatcttga ccgagaagca ggaacttgaa 11220aaggaacgca ctcacctcat ggaggacgtg accgccaata agcgccggat gaaggaactg 11280gaagataact tgctgtacag gcttacttcc actcagggtt ccctggtgga ggacgagtcg 11340ctgattgtgg tgctgagcaa caccaagcgg actgccgaag aagtgaccca aaaactggag 11400atctcggcgg aaaccgaggt gcagatcaat agcgcgcggg aggagtaccg gccagtcgca 11460accagagggt ccatcctgta cttcctgatt accgaaatga ggctggtcaa cgaaatgtac 11520cagacgtccc tgaggcagtt cctggggttg ttcgacctga gccttgcccg ctcggtgaag 11580tcaccaatca cttccaagag aattgccaac attattgagc acatgaccta cgaggtctat 11640aagtacgccg cgcggggact gtacgaggaa cacaaattcc tgttcactct gctgctcacc 11700ctgaaaatcg atattcagcg gaaccgcgtc aagcacgagg agttcctcac cctgattaag 11760ggaggagcca gcctggacct gaaggcgtgc cctcctaagc cctccaagtg gattcttgac 11820atcacctggc tgaacctggt ggaactcagc aagttgcggc aattctccga cgtgcttgac 11880caaatttccc gcaacgagaa gatgtggaag atttggttcg acaaagagaa cccggaggaa 11940gaacccctgc ctaacgccta cgacaagtcg ctcgattgtt tccggcgcct cctgctgatt 12000aggagctggt gccctgacag gaccattgcc caagctcgga agtacattgt ggactccatg 12060ggcgagaagt atgccgaagg ggtcattctc gacctggaga aaacttggga agagtccgac 12120ccgagaactc cactgatctg cctcctgtcc atgggctccg accccaccga tagcatcatc 12180gcgctgggaa agcggctgaa gatcgaaact agatacgtga gcatgggaca agggcaggag 12240gtgcacgcga ggaagctcct ccaacaaacc atggccaacg gaggctgggc cttgctgcag 12300aactgtcacc tcggactcga ctttatggac gagctgatgg acattatcat cgagactgaa 12360cttgtgcatg acgccttcag actgtggatg accaccgaag cccacaagca gttcccgatt 12420acactgctcc aaatgtccat caaattcgca aacgacccgc ctcagggact ccgggccgga 12480ctgaagcgga cctactcggg agtgtcccaa gacctcctgg atgtgagctc agggagccaa 12540tggaagccaa tgctctacgc cgtggccttc ctgcattcga ccgtgcagga acggcgcaag 12600ttcggcgccc tgggctggaa catcccgtac gaattcaacc aggccgactt caatgccacc 
12660gtgcagttca tccagaacca ccttgacgac atggacgtga agaagggagt gtcgtggact 12720accatccgct acatgattgg cgaaattcag tatggcggcc gcgtgaccga cgactacgat 12780aagagactcc tcaacacctt cgccaaggtc tggttctcgg aaaacatgtt cggtccagac 12840ttctccttct accaaggata caacatcccc aagtgctcca ccgtggataa ttaccttcag 12900tacatccagt cgctgccggc ctacgattca ccggaagtgt tcggactgca tcctaacgcg 12960gacattactt accagagcaa gctcgccaag gatgtcctgg acactatcct gggaattcag 13020cctaaggata cttcaggagg gggagatgag actagagagg cggtggtcgc tcgcctggcc 13080gacgacatgt tggaaaagct cccgcctgat tacgtgccgt tcgaggtgaa agagcggctg 13140cagaagatgg gacctttcca accgatgaac atttttctga gacaggagat cgaccggatg 13200cagagagtgc tgtccctcgt gcggtccacc ctgaccgaac tgaagttggc tatcgacggg 13260accattatta tgtcggagaa cctccgggac gccctggact gcatgttcga tgcgcggatt 13320ccggcctggt ggaagaaggc atcctggatt tccagcaccc tgggcttctg gtttaccgaa 13380ctgattgaaa gaaattcgca attcacttcc tgggtgttca acgggcgccc gcactgtttt 13440tggatgaccg gcttcttcaa cccccaggga tttctcactg cgatgaggca ggaaattacc 13500cgcgcgaaca agggatgggc cctggataac atggtgctct gcaacgaggt gaccaaatgg 13560atgaaggatg acatttccgc ccccccgacc gaaggagtct acgtctacgg cctgtacctc 13620gagggtgccg ggtgggacaa gcgaaatatg aagttgatcg aatcaaagcc aaaggtcttg 13680ttcgaactga tgccggtgat cagaatatac gccgagaaca acaccttgcg cgaccccagg 13740ttctactcct gccctatcta caagaagcca gtgcgcaccg acctcaacta catcgccgcc 13800gtcgacctcc ggactgccca aaccccggaa cactgggtgc tgcgcggtgt ggccctgctc 13860tgcgatgtca agtag 13875713875DNAArtificial SequenceSynthetic polynucleotide 7atgttccgaa tcggccggcg ccagctgtgg aagcactcag tgacccgcgt cctgacccag 60agactgaagg gagaaaagga ggccaaacgg gccctcctgg acgcgcggca caactacctg 120ttcgctatcg tggcctcgtg tctcgacctc aacaagaccg aggtggaaga tgcaatcctg 180gagggcaatc aaattgagcg gattgatcag ctgttcgcag tgggcggctt gcggcacctg 240atgttttact accaagacgt ggaggaggcc gaaactgggc agctgggaag cctgggcgga 300gtgaacctgg tgtccggcaa aatcaaaaag cccaaggtct ttgtcactga agggaacgac 360gtggccctca ccggagtgtg cgtgttcttc atccggacgg accctagcaa agccatcacc 420ccggacaaca 
tccaccagga agtgtccttt aacatgctgg acgcggccga cgggggcctc 480cttaactcgg tccggcgcct cctttccgac atcttcatcc cggctctgcg ggctacgtcc 540cacggttggg gagaactgga aggactgcag gacgccgcta acatccgcca agagttcctc 600tcatccctcg agggttttgt caatgtgctg tcgggcgccc aggaatcgct gaaggagaag 660gtgaacctcc gcaagtgcga tattctggag ctcaagaccc tcaaggaacc gactgactac 720ctcactctgg ccaacaaccc cgaaactctg gggaaaattg aggactgcat gaaggtgtgg 780atcaagcaga ccgaacaggt cctggccgaa aacaaccagc tgctgaaaga ggccgatgac 840gtgggacccc gggccgagct ggagcactgg aagaagcgcc tctcaaagtt caactacctt 900ctcgagcaac tgaagtcccc tgacgtgaag gctgtgctgg cggtgcttgc cgcggctaaa 960tcaaagctcc tgaaaacctg gagagaaatg gatattagaa tcaccgacgc gaccaatgaa 1020gcgaaggaca acgtgaagta cctgtacacg ctggaaaaat gctgcgaccc cctctattcc 1080tccgacccgc tgtccatgat ggatgcaatt cccaccctga tcaacgccat caagatgatc 1140tactcgattt cacactacta caatacttcc gagaagatca ccagcctgtt cgtcaaggtc 1200actaaccaaa tcatcagcgc atgcaaggcc tacatcacca acaacggaac cgcgtccatt 1260tggaaccagc cacaggatgt ggtcgaggaa aagatccttt ccgcaattaa gctcaagcag 1320gaataccagc tgtgcttcca caagactaag cagaaactga aacaaaaccc gaacgcaaag 1380cagttcgact ttagcgagat gtacattttc ggaaagttcg aaaccttcca tagacggctg 1440gctaagatca tcgatatctt caccaccctc aagacctact cggtgttgca ggacagcacc 1500atcgaaggtc tggaggacat ggcgactaag taccagggaa tcgtcgccac tatcaagaag 1560aaggagtaca acttcctgga tcagcggaag atggacttcg accaggatta cgaagagttc 1620tgcaagcaga ccaatgacct ccacaacgag ctgcggaagt tcatggatgt gactttcgcc 1680aaaattcaga acaccaacca ggcgcttcgg atgctgaaga agttcgaaag gcttaacatc 1740cccaacctgg gcatcgatga taaataccag ttgatccttg agaactacgg agccgacatc 1800gacatgatta
gcaagctgta caccaaacaa aaatacgacc ctcctctggc cagaaaccaa 1860cctccgattg ctggaaagat tctctgggcc cggcagctgt tccaccgcat ccagcagcct 1920atgcagctgt ttcaacagca cccggccgtc ctgtccaccg ctgaagccaa gccgatcatc 1980aggtcctata acaggatggc caaggtgctg ctggagttcg aggtgctgtt ccaccgagcc 2040tggctgcggc aaattgaaga aatccacgtg ggactggaag cctccctcct ggtgaaggcc 2100ccagggaccg gcgaactgtt cgtgaatttc gaccctcaga tcctgattct gttccgggaa 2160accgaatgca tggcccagat gggcctggag gtgtcccccc tggcaacctc actgttccaa 2220aagcgcgaca ggtacaagag gaacttctcc aatatgaaga tgatgctcgc agagtaccag 2280agagtcaagt ccaagatccc cgcagccatt gagcagctga ttgtcccgca ccttgctaag 2340gtggacgaag cgctccagcc cggcctcgcc gccctgacct ggacttcctt gaatatcgag 2400gcgtatctcg aaaacacttt cgctaaaatc aaggacctgg aattgctgct ggacagggtc 2460aacgatctca tcgagttcag aattgacgcg atcctggagg aaatgtcctc gacccctctg 2520tgccagcttc ctcaggagga gccactgacc tgtgaggagt tcctgcagat gaccaaggac 2580ctgtgtgtga acggagccca gatcctccac tttaagagca gcttggtgga agaagccgtg 2640aacgaattgg tgaacatgct cctggacgtg gaggtgctgt ccgaagagga gtccgagaag 2700atttcgaacg agaactcagt gaactacaag aacgaatcct cagcgaagag agaggaaggc 2760aactttgaca ctctgacctc ctcgattaat gcacgggcca acgctctgct gttgaccacc 2820gtgactcgga aaaagaagga aaccgagatg ctgggtgaag aggcccgcga acttctctcg 2880cacttcaatc accaaaacat ggacgccctg ttgaaggtca ccaggaacac cctggaagcc 2940atcaggaaga gaattcactc ctcgcacact atcaacttca gagattcaaa ttccgcatcc 3000aacatgaagc agaacagcct gccgatcttc cgcgcgagcg tgacccttgc catcccaaac 3060atcgtgatgg ccccagctct cgaggacgtg cagcagactc tgaacaaagc tgtggaatgc 3120attatttccg tgcccaaggg agtgagacaa tggtcatccg aactgctgag caagaagaag 3180atccaagagc gcaagatggc cgcactccag agcaacgagg attccgatag cgacgtggaa 3240atgggcgaga acgaactgca ggataccctg gagatcgcca gcgtcaacct tcccatcccg 3300gtgcagacca agaactatta caagaacgtg tcggaaaaca aggaaatcgt gaaactcgtg 3360tccgtgctct ccactattat caacagcact aagaaggaag tcatcacttc aatggattgt 3420ttcaagaggt acaaccacat ctggcaaaag ggcaaagaag aagccatcaa gacttttatt 3480acccaaagcc ccctgctgtc agagttcgag agccagatcc 
tgtacttcca gaacctggaa 3540caagaaatca acgccgaacc ggagtacgtg tgcgtgggta gcatcgccct ttacactgct 3600gacctgaagt tcgcactgac cgcagagact aaagcatgga tggtggtgat cggacggcac 3660tgcaacaaga agtaccgctc cgaaatggag aacatcttca tgcttatcga ggagttcaac 3720aagaagctca accggccgat caaggacctg gatgacattc ggattgcgat ggccgccctg 3780aaagagatta gagaagaaca aatctccatc gatttccaag tcggacccat tgaagaatcc 3840tacgccctgc tgaaccgcta cggattgctg atcgcccgcg aggaaatcga caaggtggac 3900accctgcatt acgcctggga aaagctgctg gccagagcag gagaagtgca aaacaagctc 3960gtgtccctgc agccgtcgtt taagaaagag ctcatttcgg ccgtggaagt gttcctgcag 4020gactgtcacc agttctatct ggactacgat ctcaatggac ctatggcttc cgggttgaag 4080ccgcaggaag cctctgaccg cctcattatg ttccagaatc agttcgacaa tatctaccgc 4140aagtacatta catacaccgg cggagaagaa ctgttcgggc tcccagcgac ccagtacccc 4200cagctcctcg agatcaaaaa gcagctgaac cttctgcaga aaatctacac tctatacaac 4260tccgtgatcg aaactgtcaa ttcctactac gatatcctgt ggtcggaagt caacattgag 4320aagatcaaca atgaactgct cgaattccag aaccgctgca ggaagctgcc gcgggccctc 4380aaggattggc aggccttcct ggacctgaag aagatcatcg acgacttctc cgaatgttgc 4440cccctcctgg aatacatggc atccaaggcc atgatggagc ggcactggga aaggattacc 4500acactcaccg gccactcact ggacgtcggg aacgagtcct tcaagctccg gaacattatg 4560gaggcccccc tgctgaagta caaggaggag attgaggata tctgcatttc ggccgtgaag 4620gaaagagaca tagagcagaa gctgaagcaa gtgatcaacg aatgggacaa caagacgttc 4680actttcggat ccttcaagac gaggggtgaa ctgctgctcc ggggagacag cacctccgag 4740atcatcgcaa acatggagga ctcgctgatg ctgctggggt ccctgctgtc caaccggtac 4800aatatgcctt tcaaggccca gattcagaag tgggtccagt acttgagcaa ctcgaccgac 4860atcatcgagt cctggatgac tgtccaaaac ctgtggatct acctcgaagc ggtgttcgtg 4920ggaggtgaca ttgccaaaca gctccccaag gaagcgaagc gcttttcgaa cattgacaag 4980tcctgggtga agatcatgac tcgggctcac gaagtgccgt ccgtcgtcca gtgttgcgtg 5040ggcgacgaaa ccctgggaca gttgctcccc catctgctgg atcagctcga aatttgccag 5100aagtcgctga ctggatacct cgagaagaag agattgtgct tcccgagatt cttcttcgtg 5160tcggacccgg ccctgctgga gatcctgggc caggcctccg atagccacac tattcaagcg 5220cacctcttga 
acgtgttcga caatatcaag agcgtgaagt tccacgaaaa gatctatgac 5280cggatcctgt caattagcag ccaagaagga gaaaccattg aactggacaa gcccgtgatg 5340gctgagggca acgtggaagt gtggctgaac agcctcttgg aagagtcgca gtcaagcctc 5400catctggtga tcagacaggc agccgccaac atccaggaaa ccggtttcca actgactgag 5460ttcctgtcat ccttcccggc tcaagtcggg ctgctcggaa tccagatgat ttggacccgg 5520gactccgagg aagccctcag gaatgccaag ttcgataaga agatcatgca aaagaccaac 5580caggcatttt tggagctcct gaacaccctg atcgacgtca caacccggga cctgtccagc 5640accgagcggg tcaagtacga aaccttgatc accattcacg tgcaccagag agacatcttc 5700gatgacctgt gtcacatgca catcaagtcc cccatggact tcgaatggct gaagcagtgc 5760cggttctact ttaacgagga cagcgataag atgatgatcc acatcaccga cgtggccttc 5820atctaccaaa acgagtttct tggatgcacc gaccgcctgg tcatcacccc cctgacggac 5880cggtgttaca tcacgctcgc ccaagcactg ggcatgagca tgggcggcgc cccagctgga 5940cccgcgggaa ctggaaagac tgaaaccacc aaggacatgg gccggtgcct gggaaagtac 6000gtcgtggtgt ttaactgctc cgatcagatg gacttcagag gcctcggccg catctttaag 6060gggctggccc agtccggctc ctggggctgc ttcgacgagt tcaacaggat cgatctgccg 6120gtcctgagcg tggctgcgca gcagatctcc attatcctga cttgcaagaa ggaacacaag 6180aagtccttta tcttcactga tggcgataac gtcactatga accccgagtt cgggctgttc 6240ctgaccatga accccggata cgccggtcgg caggagctgc ctgagaacct caagatcaac 6300ttccggtccg tggccatgat ggtgcccgac cgccagatta tcatacgggt gaagctcgcc 6360tcatgcggct ttatcgacaa cgtggtgctg gcccggaagt tctttaccct gtacaagctg 6420tgcgaggaac agctgtcgaa gcaagtccac tacgacttcg gcctgcgcaa catcttgtcc 6480gtcctgcgca ccctgggcgc ggccaagcgg gccaacccca tggatactga gagcaccatt 6540gtcatgcgcg tgctccggga tatgaacttg agcaagctga ttgatgagga cgagccgctt 6600tttctgtccc tgattgaaga cttgtttcca aacatcctgc tggacaaagc cggataccca 6660gagctggagg cagcgatctc ccgccaagtg gaagaagccg gactgatcaa ccacccgcct 6720tggaaattga aagtgatcca actgttcgaa actcagcggg tccggcacgg catgatgacc 6780ttgggcccct cgggggcggg aaagacgact tgcatccata ctttgatgcg ggctatgacg 6840gattgcggga agccgcatag ggaaatgcgg atgaacccca aggccatcac cgctccacaa 6900atgttcggta ggcttgacgt ggccaccaac gattggaccg 
atggcatctt ctcaactctg 6960tggagaaaga ccctgcgggc taagaagggg gagcatattt ggattatcct cgacggccca 7020gtggacgcca tctggatcga aaaccttaac tccgtgctcg acgacaacaa gaccctgacc 7080ctggccaacg gcgaccgcat cccgatggct ccgaactgca agatcatctt cgaaccgcac 7140aacatcgaca acgcctcccc ggccaccgtg tccagaaacg gaatggtgtt catgtcctcc 7200tctatcttgg actggtcgcc gatcctggag ggattcctga agaagagaag cccgcaagaa 7260gccgagatcc tgcgccaact gtatactgaa tcgttccccg atctgtaccg gttttgcatc 7320caaaacctgg agtacaagat ggaggtgctc gaggcgttcg tgatcactca gagcattaac 7380atgttgcaag gcctgatccc gctcaaggag caggggggcg aagtgtccca ggcccacctg 7440gggcggctgt tcgtgttcgc cctcctgtgg tcagccggag cggcgctgga gcttgacgga 7500cgccgcagac tggaactgtg gctgcggtcc cgaccgactg ggactctgga gttgcctccg 7560ccggccggcc ctggcgatac agcgttcgac tattacgtgg cccctgatgg aacttggacc 7620cattggaaca ctaggaccca ggaatacctc tacccgagcg acaccacccc agaatacggg 7680tcaatcctgg tcccgaacgt ggataatgtc cggactgatt tcttgatcca gaccatcgcg 7740aagcagggaa aagccgtgct gctgattgga gaacagggta ccgctaagac tgtgattatc 7800aagggattca tgtccaagta cgacccggaa tgtcacatga ttaagtcctt gaacttctca 7860tcggccacca cccccctgat gttccagcgg accatcgagt cctacgtgga caagcggatg 7920ggaactacct acggaccgcc ggcagggaag aagatgaccg tgttcattga cgacgtcaac 7980atgccaatca tcaatgaatg gggcgatcaa gtcaccaacg agatcgtccg ccagctgatg 8040gaacagaacg gcttctacaa cctcgaaaaa ccgggagaat tcactagcat cgtggacatc 8100caatttctcg cagccatgat ccaccctggt ggcgggcgga acgatattcc ccagagactg 8160aagcgccagt ttagcatctt caactgcact ttgcctagcg aagccagcgt cgacaagatc 8220ttcggcgtga taggagtggg acactactgc actcagcggg gattctcaga agaagtccgg 8280gactccgtga ctaagctcgt gcctctgaca aggcggctgt ggcagatgac taagatcaag 8340atgctgccga cccccgcaaa gttccactac gtgtttaacc ttcgggacct gtccagagtc 8400tggcagggga tgctcaacac cacctcggag gtcattaagg aacccaacga tcttctgaag 8460ttgtggaagc acgagtgcaa gagggtcatc gccgacagat tcactgtgtc ctcagacgtc 8520acttggttcg acaaggccct cgtgtccctt gtggaggaag aattcggaga agagaagaag 8580ttgctcgtgg actgcggcat tgatacttac ttcgtggact tcctccggga tgcgccggag 8640gccgccggag 
aaacctctga ggaggccgac gctgaaaccc ccaaaatcta cgagcctatc 8700gagtcgttct cccatctgaa ggagcggctg aacatgttct tgcagctgta caacgagtcc 8760atccgggggg ccggaatgga catggtgttt tttgccgatg cgatggtgca cctggtgaag 8820atttccagag tgatccgcac cccccaggga aacgcactcc ttgtgggagt gggaggatcc 8880ggcaaacaga gcctgactcg cttggcgtcg ttcatcgcgg gatacgtgtc gttccagatc 8940accctgactc gcagctacaa cacctcgaac ctcatggagg acctcaaggt cctgtacagg 9000accgcaggcc aacaaggcaa aggaattacc ttcattttca ccgacaacga aatcaaggac 9060gaatcattcc tggagtacat gaacaacgtg ctgagctcag gagaggtgtc aaacctgttc 9120gcccgcgacg agatcgacga gatcaactcg gatctggctt ccgtgatgaa aaaggaattc 9180cctcggtgtc tgcctaccaa tgagaacctc catgactact tcatgagccg cgtgcggcag 9240aacttgcata tcgtcctgtg cttttccccc gtcggagaaa agttcagaaa cagggccctg 9300aagttcccgg cgctcatctc cggctgcacc attgattggt ttagccgctg gccaaaggac 9360gcactcgtgg ccgtgtcgga acatttcctc acatcctacg acatcgattg ctcgcttgag 9420atcaaaaagg aggtggtgca gtgcatgggc tcgttccaag acggcgtggc tgaaaagtgt 9480gtggactact tccagaggtt ccgacggtcc actcacgtga ccccaaagtc ctacctgagc 9540ttcatccagg gatacaagtt catctacgga gagaagcatg tcgaagtgcg caccttggcg 9600aaccggatga acaccgggct cgaaaagctg aaggaggcct ctgaatccgt cgccgccctg 9660tccaaagagc tggaggccaa ggaaaaggaa ctgcaagtcg ccaacgataa ggccgacatg 9720gtcctgaagg aagtcaccat gaaggctcag gccgccgaga aggtcaaagc agaggtgcag 9780aaggtgaagg atcgcgcgca ggccattgtg gacagcattt caaaggacaa ggccatcgcc 9840gaggaaaaac tggaagccgc gaagccggcc ttggaagagg cagaggccgc gctgcagacc 9900atacggccct ctgacattgc caccgtgcgg accctcgggc ggcccccgca tctgatcatg 9960agaattatgg actgcgtgct gctgctgttc caacggaaag tgtccgccgt gaagattgac 10020ctggagaagt cctgcactat gccgagctgg caggagtcgc tgaagctcat gactgcggga 10080aacttcctgc agaacctcca acaatttccc aaagacacca ttaacgagga agtgattgag 10140ttcctgtccc catacttcga gatgcccgac tacaacatcg agactgccaa gagggtgtgc 10200ggaaacgtgg ccggcctctg ctcgtggacc aaggccatgg cgtcgttttt cagcatcaac 10260aaggaggtgt tgcctcttaa ggccaacctg gtggtgcaag agaaccggca tctcttggcc 10320atgcaggacc tccagaaggc ccaggcagag ttggacgaca 
agcaggccga gctggacgtc 10380gtccaagccg agtacgaaca ggccatgacc gaaaagcaga ccctcctgga ggatgctgaa 10440cgctgccgcc acaagatgca gactgccagc actctgatct ccggacttgc cggagaaaag 10500gagcgctgga ccgagcagtc ccaggagttc gcagcccaga cgaagcgcct cgtgggggac 10560gtgctgctgg cgaccgcctt cctctcgtac tcgggcccgt tcaaccagga gtttcgggat 10620cttctgttga acgattggcg caaggagatg aaagccagaa aaatcccgtt cggtaaaaac 10680ctcaatctga gcgagatgct gatcgatgcc cccaccatct ccgaatggaa ccttcaggga 10740ctgccgaacg atgatttgtc aatccagaac ggtatcattg tcactaaggc ctcccgctac 10800cccttattga ttgatcctca gacccagggg aagatttgga tcaaaaacaa ggaatcgcgg 10860aacgagctgc agatcacatc tctgaaccac aagtacttcc gaaaccactt ggaagattcc 10920ctgtccctgg gccggcccct gctgattgag gacgtgggcg aagaactcga cccggccctg 10980gataatgtgc tggaacggaa tttcattaag accggtagca ctttcaaggt gaaagtggga 11040gataaggagg tggacgtcct ggacggattc cgcttgtaca tcacgaccaa gctgcctaac 11100cccgcgtaca ctccggaaat cagcgcgagg acgtcgatca tcgatttcac tgtgaccatg 11160aagggtctgg aagatcagct tctgggacgg gtcatcctga ctgagaagca ggaactggaa 11220aaagagagaa cccacctgat ggaagatgtg accgccaata agcgcaggat gaaagagctc 11280gaggacaacc tcctttaccg cctgacctcc acccagggtt ccttggtgga ggatgaatcg 11340ctgattgtgg tgctgtcgaa caccaagagg accgccgaag aagtgaccca aaagttggaa 11400atctccgccg aaaccgaagt gcagatcaac tcggctcggg aggagtaccg gccggtggcc 11460actcgaggat caattctgta cttcctgatc accgagatgc ggctcgtgaa cgagatgtat 11520cagactagcc tccgccagtt ccttggcttg ttcgacctgt cgctggcgag aagcgtgaag 11580tccccaatta cctcgaaacg gatcgccaac attattgaac atatgactta cgaagtgtac 11640aaatacgcgg cacgggggct ctacgaagaa cacaagtttc tcttcacgct gctgctgact 11700ctgaagatcg atattcagcg caaccgggtg aagcacgaag agttcctgac cctgattaag 11760ggtggcgcct ccctggacct caaggcctgc ccccccaagc cgtccaagtg gatcctcgac 11820attacctggc tcaacctcgt ggaactttca aagctcagac agttctccga cgtgctcgat 11880cagatctcca ggaacgagaa gatgtggaag atttggttcg ataaggaaaa tcccgaagag 11940gagcctcttc ctaacgcgta cgacaagtcc ctggattgct tccgccggct gctcctgatc 12000cggtcgtggt gtccagaccg gactattgcc caagcccgga aatacatcgt 
ggactcaatg 12060ggcgagaagt acgcagaggg tgtgatcctg gacctggaaa agacctggga agagtccgat 12120ccgagaactc ctctgatctg cctgctgtcc atgggttcag accccaccga ctcgatcatc 12180gccctcggaa agcgcctgaa gatcgaaacc cgatacgtca gcatgggcca aggccaggag 12240gtccacgccc gcaaactgct gcagcagacc atggctaacg gaggatgggc gctgctgcag 12300aattgtcacc tgggacttga cttcatggat gagctgatgg acatcatcat cgaaaccgag 12360ctcgtgcacg atgcattccg gctgtggatg acaactgagg cgcacaagca gttcccgatt 12420accttgctgc agatgtctat taagttcgcg aacgatccgc cccagggact gagagccgga 12480ctcaagcgca cctacagcgg cgtgtcccaa gacttgttgg acgtgtcctc gggaagccag 12540tggaagccga tgctctacgc cgtggcgttc ctccattcaa ccgtccagga gcgcagaaag 12600ttcggcgcac tgggatggaa catcccttac gaattcaacc aggcggattt taacgccacc 12660gtgcagttca ttcagaacca cctggacgac atggacgtga agaagggggt gtcatggacc 12720actatccggt acatgatcgg agagatacag tatggaggtc gggtgaccga cgactacgac 12780aagcgcttgc tgaacacctt cgccaaggtc tggttctccg agaacatgtt cggacccgat 12840ttcagcttct accagggata caacatccct aagtgctcca ccgtcgacaa ctacctccag 12900tacattcagt ccctgccagc ctacgacagc ccggaggtgt tcggactcca ccccaacgca 12960gacatcacct accagtccaa gctcgccaag gatgtgttgg acaccatcct gggaatccag 13020cctaaggaca cgagcggcgg gggggatgaa actcgggagg ctgtggtggc acggctggcc 13080gacgatatgc tggaaaagct cccacctgac tacgtgccgt tcgaagtgaa ggagcggctc 13140cagaagatgg gcccgttcca gcccatgaac atcttcctgc ggcaagaaat tgaccggatg 13200cagagagtgc tgtccctggt ccggtcaacc ctgactgaac tgaagctggc catcgacgga 13260accatcatca tgtccgagaa cctcagagat gcgctggatt gcatgttcga cgcccggatc 13320cctgcctggt ggaaaaaggc ctcctggatc tccagcactt tgggattctg gttcaccgaa 13380ctgatcgaaa gaaactcaca attcacttcc tgggtgttta acggcagacc acactgtttc 13440tggatgaccg gcttcttcaa cccacaagga ttcctgacag cgatgagaca ggaaatcacc 13500cgcgccaaca agggctgggc cctggacaac atggtgctgt gcaacgaagt gaccaagtgg 13560atgaaggacg acatttccgc accgcctact gaaggggtgt acgtgtacgg cctgtacctg 13620gagggcgctg gatgggacaa gcggaacatg aaactgattg aatccaagcc caaggtcctg 13680ttcgaactca tgccagtcat taggatctac gcggagaaca acacgctccg ggacccgagg 
13740ttttactcct gccccatcta taagaagccc gtgcggaccg atctgaacta cattgcggcg 13800gtggacctca ggaccgcgca gacccctgaa cattgggtgc tccgcggcgt ggcccttctg 13860tgtgacgtga agtag 13875813875DNAArtificial SequenceSynthetic polynucleotide 8atgttcagaa tcgggcggcg gcaactgtgg aagcattcag taactcgcgt cctgactcag 60agactgaagg gagaaaagga ggcaaaaagg gccttgctgg acgcccgcca caactacctc 120ttcgcgattg tggcctcatg cctggacctg aataagactg aagtggagga cgctatcctc 180gagggaaacc aaatagagag aatcgaccaa ttgttcgccg tgggtggact ccggcacctg 240atgttctact accaagacgt ggaggaagcg gaaaccggac agctgggatc actggggggc 300gtgaacctcg tgagcggaaa gatcaagaag cccaaggtgt tcgtcaccga aggcaacgat 360gtggcgctca ccggcgtgtg cgtgtttttc attcgcaccg acccatcgaa ggccattact 420cccgataaca tccatcaaga ggtgtccttc aacatgctgg acgctgccga tggaggactc 480ctcaacagcg tgcgccggct cctctccgac attttcatcc ccgcgctgag agctaccagc 540cacggatggg gggaactcga gggactgcag gacgccgcaa acattcgcca agaattcctc 600tcctcattgg agggcttcgt gaacgtgctc agcggcgctc aggaaagcct gaaagaaaag 660gtcaatctcc gcaagtgcga catcctagag ctcaaaacgc tgaaagagcc cacagactac 720ctcactcttg ccaataaccc tgaaaccctg ggaaagatcg aggactgcat gaaggtctgg 780attaagcaaa cagaacaagt cctggccgag aacaaccagc tcctgaagga ggcggatgac 840gtgggcccgc gggcagaatt agagcactgg aagaaacgcc tcagcaagtt taattacctc 900ctggagcaat tgaagtcccc tgacgtgaag gccgtgctcg cagtgttggc agcggcaaag 960tccaagctgc tgaaaacttg gcgggagatg gacattagaa ttactgacgc gacgaacgag 1020gcaaaggata acgtcaaata cttgtacacg ctcgagaagt gctgcgaccc gttgtattcc 1080tcagacccac tgagcatgat ggacgccatt cccaccctca tcaacgccat taagatgatt 1140tactcaattt cccattacta caacacctca gagaagatta cttcactctt cgtgaaggtg 1200accaaccaaa ttatctccgc ctgcaaggcc tacatcacta acaacgggac agcttcgatc 1260tggaaccagc cgcaggatgt cgtggaggag aagatcctgt cggccatcaa acttaagcag 1320gagtaccaac tgtgctttca caaaaccaag cagaagctga agcaaaatcc aaacgccaag 1380caattcgact tctccgagat gtacatcttc ggcaagttcg aaaccttcca caggagactg 1440gccaaaatca ttgatatttt cactaccctt aagacctaca gcgtgctcca agacagcact 1500atcgagggac tggaggacat ggccaccaag taccaaggca 
tcgtcgccac cattaaaaaa 1560aaggaataca acttcctgga tcagcggaag atggacttcg atcaggatta tgaagagttc 1620tgcaaacaga ccaatgatct ccacaacgaa ctgcggaagt ttatggacgt gaccttcgca 1680aaaatccaga acaccaacca ggcgctgcgc atgctgaaga agttcgagag attgaacatc 1740ccgaatctcg gcatcgacga taagtaccag ctcatcctgg aaaactacgg ggccgacatc 1800gacatgatct ccaagctgta cactaagcag aaatacgacc cgccactggc gagaaaccag 1860ccccccattg ccggcaagat cctctgggcc cgacagcttt tccaccgaat ccagcaaccc 1920atgcagcttt tccagcaaca ccccgccgtg ctctcaaccg ccgaggccaa gccgatcatt 1980agaagctaca acagaatggc gaaggtgctg ctcgaattcg aggtgttgtt ccaccgggca 2040tggttgaggc agatcgagga aatacacgtg ggactggagg cctcgctgtt ggtgaaggcc 2100ccagggaccg gagaactgtt cgtcaacttc gacccgcaaa tcctgatcct gttccgcgaa 2160actgaatgca tggctcagat gggattggaa gtcagccccc tcgcgacttc cctcttccaa 2220aagagagata gatacaaacg gaacttctcc aatatgaaga tgatgctggc ggaataccag 2280agagtgaaat ccaaaattcc ggccgctatt gagcagctga ttgtgcctca ccttgccaag 2340gtcgacgaag cgttgcagcc tggcctggcc gctctcactt ggaccagcct gaacatcgag 2400gcctacttgg aaaacacctt cgccaaaatt aaggacctcg aactcctgct cgaccgggtg 2460aacgacttga tcgagtttag gattgatgcc attctggagg agatgagctc cactccgttg 2520tgtcaactgc cacaggagga acccctcaca tgcgaggaat tcctgcaaat gactaaggac 2580ctgtgcgtca acggggccca aatcctgcac ttcaaatcct ccctggtcga ggaagcagtc 2640aacgagttgg tgaacatgct cctggatgtg gaggtgctgt ccgaggagga gtccgagaag 2700atctccaacg aaaactccgt gaactataag aatgaatcct ccgcaaagcg ggaggagggg 2760aatttcgata ccctgacttc ctccatcaac gcccgcgcca acgccttgct gctaactacc 2820gtgactagaa agaagaaaga aactgaaatg ctgggggagg aggcacgcga actcctgtcc 2880cactttaacc
accagaacat ggacgccctg ctgaaggtca cccggaacac cttggaggcc 2940attcgcaagc ggatccatag cagccacacc attaacttca gagactcaaa ctcggcatcc 3000aacatgaagc agaattcact cccgatcttc agggcaagcg tgactttggc tatcccgaac 3060attgtcatgg cgcctgctct ggaggacgtc cagcagacgc tgaacaaggc cgtggaatgc 3120atcatcagcg tcccgaaggg tgtgagacag tggtcctccg aattgttgtc aaagaagaag 3180atccaagaga ggaagatggc ggcgctccag tccaatgaag atagcgatag cgacgtggag 3240atgggcgaga acgaactcca ggataccctg gagatcgcgt ccgtcaacct ccctatcccg 3300gtccagacca agaactacta caagaatgtc tcggaaaaca aggagatcgt gaagctcgtg 3360tcggtcctgt ccaccattat caactccacc aagaaggagg tcattactag catggactgc 3420ttcaagcgct ataaccacat ctggcaaaag gggaaggaag aggccatcaa gacctttatc 3480acccagtcgc cgctcttgtc agagtttgag tcacagattc tgtacttcca gaacctggaa 3540caggagatta atgctgaacc agagtacgtg tgcgtgggct ccatcgcgct gtatactgcg 3600gacctcaagt tcgcgttgac cgcagaaact aaggcctgga tggtggtcat cggcagacac 3660tgcaataaga agtaccgcag cgaaatggaa aacatcttca tgttgattga agagttcaac 3720aagaagctca accggcccat taaggacctc gatgatattc gcatcgccat ggcggccctc 3780aaagaaatcc gggaggagca aatctccatc gacttccagg tcggccccat tgaagagagc 3840tacgcactgc tgaaccgcta tggactgtta atcgcccggg aagaaatcga taaggtggat 3900accctgcatt acgcttggga aaagttgctg gcccgggcag gagaggtgca gaacaagctc 3960gtgagcctcc aaccctcctt caaaaaagaa ctgatcagcg cggtggaagt gtttctccag 4020gactgccacc agttctacct cgactatgac ctgaacggcc ccatggctag cggcttgaag 4080cctcaggagg cctcagaccg cctgatcatg tttcagaacc aattcgataa catctaccgg 4140aagtacatta cctataccgg cggcgaggag ttgtttggat tgccagccac ccagtaccct 4200caactcctgg agatcaaaaa gcaactgaac ttgctccaga agatctacac cctctacaac 4260tcggtcatcg aaactgtgaa ctcgtactac gacattcttt ggagcgaggt caacatcgaa 4320aagatcaata acgaactcct ggaattccag aaccgatgca ggaagctgcc ccgggccctg 4380aaagattggc aagccttctt ggacctgaag aagattattg atgacttctc agaatgttgc 4440cccctcctgg agtacatggc ctccaaggcc atgatggaac ggcattggga gcggattact 4500acccttacgg gccacagcct ggacgtcggc aacgagagct tcaaactgag aaacatcatg 4560gaggccccac tcctgaagta caaggaagag attgaggata 
tttgcatttc cgccgtgaag 4620gaacgcgaca tcgaacagaa acttaagcaa gtcattaacg agtgggacaa caaaaccttc 4680acgttcggat ccttcaagac gagaggcgag ctcctcctga ggggagactc aaccagcgaa 4740attatcgcca acatggagga ctccctgatg ctcctggggt cgctgctgtc gaacaggtac 4800aacatgccct tcaaggccca gatccagaag tgggtccagt acctcagcaa ctccaccgac 4860atcatcgagt cctggatgac tgtgcagaac ttgtggatct acctggaggc cgtgttcgtg 4920ggaggagata tcgccaaaca attgcctaag gaagccaaga ggttctcgaa tattgacaag 4980agctgggtga agatcatgac cagggcacac gaagtgcctt cggtggtgca atgttgcgtg 5040ggggatgaaa ctctcggaca gttgctgcct cacctcctgg accaactcga gatttgtcag 5100aagtccctga ctggatacct cgagaagaaa cgcttgtgct tcccaaggtt tttcttcgtg 5160tcggatcctg ccctcttgga aatcctcggt caggcctcag actcacacac cattcaagcc 5220cacctcctta acgtctttga taacattaag agcgtcaagt tccatgagaa aatctacgac 5280cggatcctct ccatttcgtc ccaagaggga gaaacgattg aacttgacaa gccagtgatg 5340gccgaaggga atgtcgaggt gtggctcaac agcctcctgg aagaatccca aagctccctt 5400catcttgtga tccggcaggc cgccgccaat atccaggaaa ccggattcca actcaccgag 5460ttcctttcct ccttccccgc acaagtggga ctgctcggca ttcaaatgat ctggacgcgg 5520gattccgagg aggccctgag gaacgccaag ttcgacaaga agatcatgca aaaaacaaac 5580caggccttcc tcgaacttct caataccctg atcgatgtga ccactagaga tctctcctcg 5640acggaacggg tgaaatacga aaccctcatc accatccacg tgcaccagag agatattttc 5700gacgacctct gccacatgca tattaagtcg ccaatggact tcgaatggtt gaaacaatgc 5760agattttact ttaacgagga cagcgataag atgatgatcc atatcaccga cgtcgccttc 5820atctaccaga acgaattcct gggatgcacc gataggctgg tgattacccc gctgactgac 5880cggtgctaca ttaccctggc ccaggccctg ggaatgagca tgggcggcgc ccctgccgga 5940ccggcgggca ccggcaagac cgaaaccacc aaggatatgg gacggtgcct cggaaagtac 6000gtggtggtgt ttaactgctc ggaccagatg gatttccgcg gactgggcag gatcttcaaa 6060ggcctggctc agagcggttc atggggctgc ttcgacgagt tcaaccgaat tgacttgccg 6120gtgctgtccg tcgcagcgca gcaaatctcg atcatcctga cttgtaagaa ggaacataaa 6180aagtccttca tttttaccga cggagacaac gtgacaatga acccggagtt cggactgttc 6240ctgactatga accctgggta cgccgggcgc caggagctcc ctgaaaacct taagatcaac 6300ttccgctccg 
tggcaatgat ggtgcctgac agacagatta tcattcgcgt gaagctggcg 6360tcatgcggct tcatcgacaa cgtggtgctg gcgaggaagt ttttcacact gtacaaactt 6420tgcgaggagc agctctccaa acaggtgcac tacgacttcg gactgagaaa catcctgagc 6480gtcctgagga ccctgggggc tgctaagcgc gccaacccca tggataccga atccaccatt 6540gtcatgcggg tcctgaggga catgaacctg tccaagctca tcgacgagga tgaacccctg 6600ttcctgagcc tgattgaaga tctgtttcca aacatcttgc tggacaaggc gggttacccc 6660gagctggaag ccgccatctc ccgccaagtg gaggaggctg gactcattaa ccacccaccc 6720tggaagctca aggtcatcca actgttcgaa acgcagagag tgcgacacgg catgatgaca 6780ctggggccat caggtgcagg aaagaccacg tgcatccaca ccttgatgcg ggcgatgacc 6840gactgcggga agccacatcg ggagatgcgc atgaacccga aggcgatcac tgcacctcaa 6900atgttcggac ggctcgacgt ggccactaac gactggaccg acgggatttt ctcgaccttg 6960tggcgcaaga ccctacgggc caagaaagga gagcacatct ggattatcct ggatggtcca 7020gtggatgcga tctggatcga gaaccttaac tccgtgctgg acgacaacaa gaccctgacc 7080ctggctaacg gcgaccggat cccaatggcg cccaactgca aaatcatctt cgaaccccac 7140aacattgaca acgcctcgcc cgccactgtg tcgcggaacg ggatggtgtt catgtcgtcg 7200tccatcttgg actggtcccc cattctcgaa ggcttcctga agaagcgcag ccctcaagaa 7260gccgagatac tccgccaact ctacaccgag tcgttcccgg atttgtaccg gttctgtatc 7320cagaacttgg agtacaagat ggaggtgctt gaggcattcg tgatcaccca atcgatcaac 7380atgctgcaag gactcatccc cctgaaagaa cagggaggtg aagtctccca agctcacctg 7440ggacgcctct tcgtgttcgc gctgctttgg agcgcgggag ccgcgctcga gctcgacggg 7500cggcgcaggc tggagctctg gctgcgctcc cgcccgaccg gaaccctgga gctgccgccc 7560ccggccggcc cgggcgacac cgcctttgac tactacgtgg cccccgacgg gacctggact 7620cactggaaca ctagaaccca ggaatacctt tacccctccg acactactcc cgaatacgga 7680agcatccttg tgccgaacgt ggacaacgtg cgcaccgact tcctaattca gaccatcgcc 7740aagcagggaa aggccgtgct gcttattgga gaacagggta ccgcaaagac cgtgatcatc 7800aagggattca tgtcaaagta cgaccctgaa tgtcacatga ttaagtcact taacttctcc 7860agcgccacca cccctctgat gttccagaga accatcgaga gctacgtgga caaacgcatg 7920ggcaccacgt acggtccccc ggccggaaag aagatgaccg tattcatcga cgacgtgaac 7980atgccgatca ttaacgaatg gggagatcag gtgaccaacg 
aaatcgtgcg ccagttaatg 8040gagcagaacg gtttctacaa ccttgaaaaa cccggagagt tcacttcaat cgtggacatc 8100cagttcctgg ccgccatgat ccacccgggc ggaggtagaa acgacatccc gcagagactg 8160aagagacagt tctcaatctt caactgcacc ctgccctccg aagcatcagt cgataagatt 8220ttcggggtga tcggagtggg ccactactgc acgcagaggg gtttctcaga ggaggtgcgc 8280gactccgtga ccaaactggt cccactcact cgaaggctgt ggcagatgac caagattaag 8340atgctcccta ctcctgccaa gttccattac gtctttaacc ttcgggactt gtcccgggtc 8400tggcagggaa tgctgaatac cacctccgaa gtgattaagg aacctaacga cctcctgaag 8460ctctggaaac acgagtgcaa gagggtgatc gccgatagat tcaccgtgtc ctccgacgtg 8520acctggttcg acaaggccct cgtgtccttg gtggaagaag agttcggtga agaaaagaag 8580ctcctcgtgg actgcggaat cgacacctac ttcgtcgact tcctcagaga tgcccccgag 8640gctgccggag aaacctcaga agaggccgat gcggagactc cgaagattta cgaacccatc 8700gaatccttca gccacttgaa ggagaggctc aacatgttcc tgcagctcta caacgaaagc 8760atcaggggag ctggcatgga catggtgttc ttcgccgacg cgatggtgca ccttgtcaag 8820atctcccggg tcattcgaac gccgcaggga aacgcattgc tcgtgggcgt cggaggttcc 8880ggaaaacagt ccctcacgag gctggcgtcc ttcattgcgg gatacgtgag cttccaaatt 8940accctcaccc gcagctacaa tacctccaac cttatggagg acttgaaggt cttgtaccgc 9000actgccggac agcagggaaa ggggatcacc ttcatcttca ccgacaacga aatcaaggat 9060gagagcttct tggagtacat gaacaacgtc ctttcgtccg gagaagtgtc caacctcttc 9120gctcgcgatg aaatcgacga gatcaactcc gacctcgcca gcgtcatgaa aaaagaattc 9180cctcgctgtc tccccaccaa cgagaacctc cacgattact ttatgtcccg ggtccgccaa 9240aacttgcata ttgtgctgtg cttctcgccc gtgggggaga agtttcggaa ccgggcgctg 9300aagttccccg ccctgattag cggatgtact atcgactggt tctcgagatg gcccaaagac 9360gccctggtcg ccgtgagcga acatttcctg acttcctacg acatcgactg cagcctcgaa 9420atcaagaagg aagtggtgca gtgcatgggg tcatttcagg atggagtggc cgagaagtgc 9480gtcgactact tccagagatt ccggcggtca acccatgtga cgcccaaaag ctacctttcg 9540ttcatccagg gctacaagtt catctacggg gaaaagcatg tcgaagtgcg gacccttgca 9600aaccgcatga acaccggcct tgagaagttg aaagaggcct cggaatccgt ggccgcgctc 9660agcaaagaac tggaagctaa ggagaaggaa ctccaagtcg ccaacgataa agcggacatg 9720gtgctgaagg 
aagtgaccat gaaggcccag gccgccgaga aggtcaaggc cgaggtccag 9780aaggtgaagg accgcgcaca agcaatcgtg gatagcatct ccaaggacaa agcaatcgca 9840gaagagaagc tcgaggcggc aaagcccgcg ctcgaagagg cggaagcggc gctgcagact 9900atccggccgt ccgacattgc aaccgtgaga accctgggcc gccccccaca cctcatcatg 9960cgcattatgg actgcgtgct cttgctcttt caacggaagg tgtccgccgt gaagatcgac 10020cttgagaagt cctgcaccat gccaagctgg caggagtcgc tgaaactcat gaccgccgga 10080aacttcctgc agaacttgca acagttcccg aaagacacca tcaacgaaga agtcatcgag 10140ttcctttccc cgtacttcga aatgcctgat tacaacattg aaaccgccaa gagagtgtgc 10200ggaaatgtcg cgggcctgtg ctcgtggacc aaggccatgg cgtcgttctt tagcatcaac 10260aaggaggtgc tccccctgaa ggccaacctc gtggtgcagg aaaatcgcca cttgctggcc 10320atgcaagatc ttcagaaggc tcaagcggag ctggacgata aacaggccga acttgacgtg 10380gtccaggccg agtacgagca ggctatgacg gaaaagcaga ccctcctgga ggatgcagaa 10440cgctgcaggc acaagatgca gaccgcctcc acccttattt ccggcctggc gggcgaaaag 10500gagcggtgga ccgagcagtc ccaggaattc gcagctcaga ccaagcggct cgtgggcgat 10560gtgctgctgg ccactgcctt cttgagctac tccggcccct tcaaccagga atttcgggac 10620ctcctgctga acgactggag gaaggagatg aaggcgcgga agatcccatt cgggaagaac 10680ttgaacctct ccgagatgct catcgacgct cccaccatca gcgaatggaa cctccaggga 10740ctgcccaacg atgaccttag cattcaaaac ggaatcatcg tgaccaaggc ctcgcgctac 10800ccgctgctta tcgacccaca aactcaagga aagatttgga ttaagaacaa ggagtcacgc 10860aacgagctgc agatcacctc cctgaaccat aaatacttta gaaaccatct cgaggattcc 10920ctgagcctgg gcagacccct tctcatcgag gacgtgggcg aggagctcga tccagcgctg 10980gacaacgtcc tggagagaaa cttcattaag accggatcca cgttcaaggt caaggtcggc 11040gacaaggaag tggatgtcct ggacggcttc cgcctgtaca tcaccaccaa attgcctaac 11100cccgcataca ccccggaaat ctcagctcgc acgtcgatca ttgattttac cgtcactatg 11160aaaggactgg aggaccagct gctgggcaga gtcattctca ccgaaaagca agagctggaa 11220aaggaacgca cccatctcat ggaggacgtg actgcgaata agcggcggat gaaagagctg 11280gaagataact tgctgtaccg cctgacttcc actcaggggt ccctcgtcga agatgagtca 11340ctgatcgtgg tcctgtcaaa cacgaagagg accgccgagg aagtaaccca gaagctggag 11400atttccgccg aaaccgaagt 
gcagatcaac tccgcaagag aggaatatag acccgtagct 11460acgcggggga gcattctgta cttcctcatc acggagatga gacttgtcaa cgaaatgtac 11520cagacctcat tgcggcagtt cctcggactg tttgacctgt ccctcgcaag aagcgtcaag 11580tccccaatta cttcaaagcg catcgcgaac attattgagc acatgactta cgaagtgtac 11640aagtacgcgg ccagggggtt gtatgaggag cacaagtttc tcttcaccct gctgctgacc 11700ttgaagatcg acattcaacg gaatagagtg aagcatgaag agttcctgac cctcatcaaa 11760ggcggcgctt ccctcgatct gaaggcttgc cctccgaaac cgtcaaaatg gatcctggac 11820attacctggc tgaaccttgt cgagctgtcc aagttgcgcc aattctccga cgtgctggac 11880cagatctccc ggaacgagaa gatgtggaag atctggttcg acaaggaaaa cccagaggag 11940gagcctctgc ccaacgccta tgacaaaagc ctggactgct tccggcggct ccttctcatt 12000cgctcttggt gtcccgaccg gaccattgcc caggcccgca agtacatcgt ggattcaatg 12060ggggagaagt acgctgaggg ggtgatcctt gacctggaga aaacttggga ggagagcgat 12120ccgcggaccc cgctgatttg cttgctttca atgggatctg accccaccga ctccatcatc 12180gccctgggaa agaggcttaa gatcgaaact cgctacgtca gcatgggaca aggacaggag 12240gtgcacgccc ggaagctgct ccagcagacc atggccaacg ggggatgggc gctgctgcag 12300aactgccacc ttggactgga tttcatggac gaactcatgg acatcattat cgagactgaa 12360ttggtccatg acgccttcag actgtggatg actactgagg cccataagca gttccccatc 12420acacttctgc agatgagcat caagttcgcg aacgatcctc ctcaaggcct gagagccgga 12480ttgaaaagga cgtactccgg ggtgtcccaa gacctcctgg atgtgtcctc cggatcccaa 12540tggaagccaa tgctctacgc ggtggcgttc cttcacagca ctgtgcagga gaggcggaag 12600tttggagccc tgggatggaa cattccatac gagttcaacc aagccgactt caacgcgact 12660gtgcaattca tccagaacca cctggacgat atggatgtga aaaagggggt gtcctggacg 12720accatccgct acatgatcgg ggagatccag tacgggggaa gagtgaccga tgattacgac 12780aagaggctcc tgaacacttt cgccaaggtc tggttctccg aaaacatgtt cggccccgac 12840ttctcgttct accaggggta taacattccg aagtgctcga cggtggataa ctacctccag 12900tacattcaat cgctgccggc ctacgactcc cccgaggtgt tcggcctcca ccccaacgcc 12960gacattacct accagagcaa gctggctaag gacgtgctag acaccatact ggggatccaa 13020ccgaaggata cttccggcgg aggggatgaa acccgcgaag cagtggtggc acggctggct 13080gacgacatgc tggagaaact gccccctgac 
tacgtcccct ttgaggtcaa ggaaaggctc 13140cagaagatgg gacctttcca gccaatgaac atcttcttgc ggcaagagat cgaccggatg 13200cagagagtgc tctccctcgt gcgctcaacc ctcactgagc tcaagctggc aatcgacggt 13260accattatca tgtcggagaa cctccgggac gcactggact gcatgttcga tgcgcggatc 13320cctgcgtggt ggaagaaggc ctcctggatc tcgtcaaccc tggggttctg gttcaccgag 13380ctgattgaaa ggaactccca attcacctcc tgggtcttta acggccgccc gcactgcttc 13440tggatgaccg gctttttcaa cccccaggga tttctcaccg ccatgcggca ggaaatcacc 13500agggccaaca agggctgggc gttggataac atggtgctgt gcaacgaagt gaccaagtgg 13560atgaaagatg acatttcagc cccgccgacc gaaggcgtct acgtctacgg gctctacttg 13620gaaggggccg gatgggacaa gcggaatatg aaactcattg agtccaagcc caaggtcctg 13680ttcgagctga tgccagtgat ccgcatctac gccgaaaata acaccctccg ggatccgagg 13740ttctactcgt gcccaattta caagaagccc gtgcggaccg acctgaatta catcgccgct 13800gtcgaccttc gcactgccca aactccggaa cactgggtgc tgcggggagt cgccctgctt 13860tgcgacgtga agtag 13875913875DNAArtificial SequenceSynthetic polynucleotide 9atgtttagga ttggtcggcg ccagctgtgg aaacatagcg tcactcgcgt cctcacgcaa 60agacttaaag gagaaaagga ggcaaaacgg gcgttgctcg acgctcggca taactacctg 120tttgcaatcg tcgcatcgtg tcttgatctt aacaagacgg aggtggaaga cgctattctg 180gaaggtaatc agatcgagag aattgaccag ttgttcgctg tgggcggtct tcgccacctt 240atgttctatt atcaagatgt cgaagaggcg gagactgggc agttggggag cttgggtgga 300gtcaatttgg tcagcggtaa aatcaaaaaa ccaaaagtgt tcgtcacgga gggcaatgac 360gtcgcgctga cgggtgtctg cgtcttcttt atccggacgg accccagcaa ggctatcacg 420ccagataaca tccatcagga ggtgtcattc aatatgttgg atgccgcaga tggaggactc 480cttaattccg tgaggcgcct tctttccgac atttttattc cggctctgcg cgccacctcg 540catggttggg gagagctcga gggacttcaa gatgcagcga atattcggca agaatttttg 600tcctccctgg aaggattcgt caatgtgctt tcgggagcac aggagtccct taaagaaaaa 660gtcaatttgc ggaagtgtga catcctggag cttaaaactt tgaaagaacc cacggactat 720cttactttgg cgaacaatcc agaaacgttg gggaagattg aagactgtat gaaagtgtgg 780attaaacaga ctgaacaggt gttggcagaa aataatcaat tgctcaaaga agccgacgat 840gtgggcccga gagcggaact cgaacattgg aaaaagagac tgagcaagtt taattacctt 
900cttgaacagc ttaaatcccc ggacgtcaaa gcagtgctgg cagtgttggc ggctgctaag 960tcgaaacttt tgaaaacttg gcgggagatg gacattcgga ttacggatgc cacgaacgag 1020gccaaagata atgtgaaata tctttatacc ctcgagaagt gctgcgaccc actttattca 1080tccgacccgc tctcaatgat ggacgctatt cctaccctga tcaacgcaat taagatgatt 1140tactcaattt cgcattacta caataccagc gagaaaatca cttcgttgtt cgtgaaagtg 1200accaatcaaa tcattagcgc atgcaaggcc tatatcacga acaatggcac ggcctcaatc 1260tggaaccaac cgcaagatgt ggtcgaggaa aagattttgt cggcaattaa gttgaagcaa 1320gaataccagc tctgctttca taaaaccaag caaaagttga aacaaaaccc caatgccaaa 1380caattcgact tcagcgaaat gtacatcttt gggaaattcg agacgtttca caggcggctg 1440gcaaaaatca tcgatatctt taccacgctt aaaacgtact cggtcctcca agattcgacg 1500atcgaggggt tggaggacat ggcgacgaag tatcaaggta ttgtggcaac tatcaagaaa 1560aaagagtata acttcttgga ccagagaaaa atggactttg accaagatta cgaggagttc 1620tgtaagcaaa ccaacgattt gcacaacgag cttcgcaagt tcatggacgt cacctttgcg 1680aaaatccaga acaccaatca ggcgctccgc atgctgaaaa agtttgaacg gctcaacatt 1740cccaatcttg gaattgacga caagtatcaa ctcattctcg aaaattatgg ggcagatatc 1800gatatgatca gcaagttgta taccaagcag aaatatgacc cgccgttggc taggaatcag 1860ccgcccatcg ctggcaaaat cctttgggca cgccagctct tccatcggat tcagcagccc 1920atgcaactgt ttcaacagca tcccgcggtc ctgtcgaccg ccgaagccaa gccaattatt 1980cggtcgtata acagaatggc taaggtcttg ctcgagttcg aagtcctgtt ccaccgcgct 2040tggctgcgcc aaattgagga aattcacgtc ggtcttgaag caagcctttt ggtcaaagct 2100cctggcactg gcgaactctt tgtcaatttt gacccgcaga ttttgattct gttcagagaa 2160actgagtgta tggctcagat gggactcgaa gtgagccctt tggccacttc cctgtttcaa 2220aagagagata ggtataagcg caacttctcg aatatgaaga tgatgcttgc tgaatatcaa 2280agggtgaaaa gcaagattcc cgcggcgatt gaacagctca ttgtgccgca cctcgcgaag 2340gtggatgagg cattgcaacc cggacttgcg gcgttgacct ggacttcgct caacattgaa 2400gcctacttgg agaacacttt tgccaaaatc aaagatctgg agcttttgct cgacagggtg 2460aatgatttga tcgagttcag aatcgacgcg atcctggaag agatgtcatc cactccactg 2520tgtcagttgc ctcaggagga accacttacc tgcgaagaat ttctgcagat gacgaaagac 2580ctctgtgtca acggcgcaca aattttgcac 
tttaaatcat cattggtgga agaagcggtc 2640aatgaacttg tgaacatgtt gcttgatgtg gaagtgttgt ccgaggagga atcggagaaa 2700atttcgaacg aaaattcggt caactacaag aatgaatcgt ccgcgaaacg cgaggagggg 2760aattttgaca ctctgacctc ctcaattaat gccagagcca atgccctctt gctgactacc 2820gtgacgagaa agaagaagga gaccgagatg cttggagagg aagctaggga attgctctcc 2880cactttaacc atcagaacat ggacgcgttg ctcaaggtga cgcgcaatac tcttgaggcc 2940attaggaagc gcatccactc gtcccacacg atcaacttcc gggattcaaa ctccgcttcc 3000aatatgaaac agaactcgct cccgatcttt agagcttcag tgactctggc cattcctaat 3060attgtcatgg cacctgcttt ggaggacgtc cagcaaacgc tcaacaaggc cgtcgagtgt 3120attatttcgg tcccgaaggg agtcagacag tggtcctcag aactcttgtc caagaagaag 3180atccaagaac gcaaaatggc agcccttcaa tccaacgaag attcggattc agatgtggaa 3240atgggcgaaa atgagctcca ggataccctg gaaattgcat cggtcaatct tccaatccct 3300gtccagacta agaactatta taagaatgtc tcagaaaaca aagaaatcgt caagttggtg 3360tcagtcctgt cgacgatcat taatagcact aagaaggaag tgattacgtc gatggactgc 3420ttcaaaaggt ataaccatat ctggcagaaa ggcaaggaag aagcaattaa aacctttatc 3480acgcagtcac cgctgctgtc agagtttgag agccagattt tgtacttcca gaatcttgag 3540caagaaatta atgcagagcc ggagtatgtc tgcgtcggtt ccatcgctct gtacacggcc 3600gatttgaagt ttgctcttac cgcagaaacg aaagcatgga tggtggtgat cggcagacat 3660tgtaataaga aataccggtc ggagatggaa aatatcttta tgcttattga ggaattcaat 3720aaaaaattga accggcctat taaggatctg gatgatatcc gcatcgccat ggccgccctg 3780aaagagatta gagaagaaca aatctcaatc gatttccagg tgggccccat cgaagaatca 3840tatgcgctcc ttaataggta cggactgctc attgccagag aggagatcga caaagtcgac 3900accctccatt acgcttggga gaagctgctc gccagagcgg gggaagtgca aaacaaactt 3960gtgagcttgc
aaccgtcctt taagaaagaa ctcatttcag ccgtcgaggt gtttctgcaa 4020gattgtcatc aattctacct cgattatgat ctgaacggtc caatggcttc gggactgaaa 4080ccccaggaag ccagcgaccg gctgattatg ttccagaacc agtttgacaa catttatcgg 4140aaatacatta cgtacacggg gggtgaggaa cttttcggtc tgccagcaac gcagtatcct 4200cagctgttgg aaattaaaaa acaactgaac ctgctccaaa agatttacac cctctacaat 4260tcagtcattg agaccgtgaa ctcatactac gacattctct ggtcagaggt gaatatcgag 4320aagattaata atgaactcct tgaattccag aacaggtgca ggaagcttcc acgggcactc 4380aaggattggc aggcgttctt ggatctcaaa aaaattattg acgatttctc agagtgttgc 4440cctcttctgg agtatatggc ctcgaaagcg atgatggaaa ggcattggga aaggattacc 4500accttgacgg gccacagcct tgatgtcggg aatgagtcat ttaaactcag aaacattatg 4560gaagcccctt tgctgaagta taaggaggag atcgaggata tctgcattag cgctgtgaaa 4620gagcgcgata tcgagcagaa attgaagcaa gtgattaacg aatgggacaa caaaacgttc 4680acgtttggga gcttcaagac cagaggggaa ctcctcctgc ggggcgactc cacttccgaa 4740attatcgcga acatggaaga ctcgcttatg cttctcggct cgctgttgtc caacaggtac 4800aatatgccat ttaaggcaca gattcaaaaa tgggtgcaat acctcagcaa ctcgacggat 4860attatcgagt catggatgac cgtgcaaaac ttgtggattt atctggaagc ggtgtttgtg 4920ggaggagaca tcgctaaaca gctcccgaaa gaagcaaagc gcttttcgaa cattgacaaa 4980agctgggtga aaattatgac gagggcccac gaagtgccct ccgtcgtgca atgttgtgtc 5040ggcgacgaga ccctcggcca gctgcttccc cacctcctcg atcagctcga aatttgtcag 5100aaatcgttga cgggatatct tgagaaaaaa aggttgtgct tcccgagatt ttttttcgtg 5160agcgaccccg ctctcttgga gattttgggt caggcctccg attcgcatac catccaagca 5220catctgctca acgtctttga caacattaag tcagtgaagt ttcatgaaaa gatttacgat 5280agaatcctta gcatctcatc ccaagaggga gaaactattg aactggacaa gcccgtgatg 5340gcagagggga acgtcgaggt ctggcttaat tcgttgctgg aagagtccca atcatcactg 5400catcttgtca ttcgccaagc cgcggccaat atccaggaaa cggggttcca gctcactgaa 5460ttcctttcaa gcttcccggc tcaggtcgga ctgttgggaa ttcaaatgat ttggacgcgg 5520gattccgaag aggctttgag aaacgcgaag tttgacaaaa aaattatgca aaagaccaat 5580caggctttct tggaactcct taatacgttg attgacgtca ccactagaga tttgtcatcg 5640acggagcggg tgaagtacga aaccctgatc acgatccacg 
tgcatcagag agacatcttc 5700gacgacttgt gtcatatgca cattaagagc ccaatggact ttgaatggct gaaacaatgt 5760agattctact tcaacgagga ctccgataaa atgatgattc atatcacgga cgtggcattt 5820atctatcaaa acgaattctt gggatgcact gatagactcg tgattactcc ccttaccgac 5880agatgttata ttactttggc gcaagcactg ggcatgtcaa tgggtggcgc accggccgga 5940cccgctggca ctggtaaaac cgaaaccacc aaggatatgg ggcggtgtct ggggaaatac 6000gtcgtggtgt ttaattgctc cgaccagatg gatttccgcg gcctcggccg catctttaag 6060ggcctcgccc aatcgggctc atgggggtgc tttgacgagt tcaatagaat tgacctgcct 6120gtcttgtcgg tcgcagccca acaaatttca attatcttga cctgcaaaaa ggagcataaa 6180aaatcgttta tctttactga cggggataat gtcactatga atccggagtt cggactcttt 6240ttgacgatga acccgggtta cgccggaaga caagaactgc ctgagaatct caaaatcaat 6300tttagatccg tggcaatgat ggtccccgac cgccaaatta ttattagagt caaacttgcc 6360tcgtgtgggt tcattgataa cgtcgtgctc gcccggaaat tttttaccct gtacaaactg 6420tgtgaagaac aactttcgaa gcaggtgcat tatgatttcg ggcttcggaa tatcctgagc 6480gtgctccgga cgctgggtgc agcgaaaagg gccaacccaa tggacaccga gtcaactatt 6540gtgatgcggg tccttagaga catgaatttg tccaaactga ttgatgagga cgagcctctg 6600ttcctttcgc tgatcgagga tcttttcccg aacattcttc tggataaagc tggatatcct 6660gaattggaag cggcaatcag cagacaggtg gaggaagcag gtttgatcaa ccatccccct 6720tggaaattga aagtcatcca gttgttcgag acgcagcgcg tcaggcacgg tatgatgacc 6780ctggggccct caggcgcagg gaagactacg tgtatccaca ctttgatgag ggcgatgact 6840gattgcggta agccccacag ggaaatgaga atgaatccaa aagcaattac tgctccacag 6900atgtttggtc ggctcgacgt ggccacgaat gattggacgg acggaatttt tagcactttg 6960tggagaaaaa ctctcagagc taaaaagggt gaacatatct ggattattct cgatggcccg 7020gtggatgcga tttggatcga gaatctgaat tcagtgctgg acgacaataa aactttgacg 7080ctcgcgaacg gtgatcggat tcccatggcc cctaattgta aaattatttt cgaacctcac 7140aacattgata atgctagccc ggccactgtc tccaggaacg ggatggtgtt tatgtcgtcg 7200tccattttgg actggtcccc gattctggag ggcttcctga agaaaaggag cccacaggag 7260gcagaaattt tgagacagtt gtacactgag tccttcccag atctctatcg gttctgcatc 7320cagaacctcg aatacaaaat ggaggtgctc gaggcctttg tcattaccca gtcaattaat 7380atgttgcagg 
ggcttattcc ccttaaggag caggggggtg aggtgagcca ggcccatctg 7440gggcgcttgt tcgtctttgc tcttctgtgg tcggccgggg ctgctctgga gcttgacggc 7500cggagacggt tggaattgtg gctgaggagc agacctacgg gtacgctcga attgcccccg 7560ccagccgggc ccggggacac ggcgttcgat tactacgtcg cgcccgatgg gacttggacc 7620cactggaaca ctcggacgca agagtatttg tatccctccg ataccacccc ggaatacggt 7680agcatcctcg tgcctaacgt cgataacgtc cgcacggact ttcttatcca aaccatcgcg 7740aagcagggca aggcagtgct gttgattggg gagcaaggca ctgccaaaac ggtgatcatc 7800aaaggtttca tgtcgaagta tgatccagaa tgccatatga ttaaatcgct gaacttcagc 7860tccgcgacta ccccgctcat gtttcaacgc accatcgagt cgtacgtcga taagaggatg 7920ggcaccacgt acggtccgcc agccggtaaa aaaatgaccg tctttattga tgatgtgaat 7980atgcctatca ttaatgagtg gggtgatcag gtcactaatg aaatcgtgcg ccagcttatg 8040gaacagaatg gcttttacaa tctcgagaaa cccggcgaat tcacttcaat tgtggatatc 8100caatttctgg ctgccatgat tcacccaggt ggaggaagga atgacattcc gcagagactc 8160aaacggcagt tcagcatttt taattgcact ctcccttcgg aggcgtcagt ggacaagatc 8220tttggagtca tcggggtcgg tcattactgt acccagagag gattttcgga ggaggtccgc 8280gactcggtca ccaagcttgt ccctcttact aggcgcctct ggcaaatgac taagatcaag 8340atgcttccca ccccggcgaa attccactac gtgtttaatc ttagggacct gtcccgggtc 8400tggcagggca tgttgaacac tacgtcggag gtgattaagg aacccaacga tttgcttaaa 8460ctgtggaagc acgagtgcaa acgcgtcatc gctgaccgct ttactgtgtc ctcagacgtg 8520acctggtttg acaaagcctt ggtctccttg gtcgaggagg aatttggtga ggaaaaaaaa 8580ttgctggtgg attgcggaat tgacacttac ttcgtggatt tcctccgcga tgcaccagaa 8640gctgccggtg aaacctcgga ggaagcggac gccgagaccc ccaaaattta cgaaccgatt 8700gaatcgttct cccacttgaa agagcggctc aacatgtttc tccaactgta taacgagtcg 8760atcaggggag ctgggatgga catggtgttc ttcgccgatg ccatggtcca ccttgtcaag 8820atctcgcggg tcatccgcac gcctcaaggt aacgctctct tggtcggtgt gggagggagc 8880ggcaaacaaa gcctcactcg cctcgcgtcg ttcattgcag gttatgtctc atttcaaatt 8940actctcaccc gctcctataa tacttcgaat ttgatggagg atttgaaggt cctttatagg 9000accgctgggc aacaaggcaa aggaatcacc ttcatcttca ccgacaacga aattaaggac 9060gaatcctttt tggaatatat gaataatgtc ctcagctcgg 
gagaagtgag caacctgttt 9120gcaagagatg agattgatga aatcaattcc gaccttgctt ccgtgatgaa aaaggagttc 9180ccaaggtgcc tgcccaccaa tgagaatctt cacgactact ttatgagccg cgtccggcaa 9240aatctgcata tcgtgttgtg tttctcaccc gtcggtgaga agtttagaaa tcgcgccctc 9300aagtttccgg ctttgatctc aggctgcacc attgattggt tttccagatg gccgaaggat 9360gcactggtgg cagtctccga acatttcctg acttcatacg atattgactg ttcactggaa 9420attaagaagg aagtggtcca atgcatggga tcattccagg atggcgtcgc agaaaagtgt 9480gtggattact tccaaaggtt tcggaggagc acccacgtca cgcccaaatc atatttgtca 9540ttcattcagg gctacaaatt tatctacggc gaaaagcacg tcgaggtccg gactttggca 9600aacaggatga acaccggcct tgaaaaactg aaggaagcta gcgagtccgt ggctgcactc 9660tcgaaggagc tggaggccaa agaaaaagag ctgcaagtcg ctaatgacaa ggccgacatg 9720gtcttgaaag aggtgactat gaaagcgcag gctgctgaga aggtgaaggc tgaggtgcaa 9780aaggtgaaag accgggctca ggccatcgtg gactcaattt caaaggataa ggctatcgct 9840gaggaaaagt tggaagccgc taagcccgca ttggaggagg cagaggctgc gcttcaaacc 9900atcaggccct ccgacattgc gactgtgagg accctgggaa ggccgcccca tctcatcatg 9960cggattatgg actgtgtgct cctcctcttc caacgcaagg tctcagcagt caaaatcgat 10020cttgaaaaaa gctgtacgat gccctcctgg caagagtcgc ttaaacttat gactgcgggc 10080aattttctcc agaatcttca acagttccct aaagacacca ttaacgaaga agtcattgaa 10140tttctttcgc cgtatttcga gatgcccgac tataatatcg agactgccaa acgcgtgtgt 10200ggaaacgtcg cgggactgtg tagctggacg aaagcaatgg caagcttctt ttcgattaat 10260aaagaagtcc tgccactgaa agccaatctc gtggtccaag aaaaccgcca ccttttggca 10320atgcaagatc tccaaaaggc tcaagccgaa ttggacgata agcaagccga gctggacgtg 10380gtccaggccg aatacgaaca agcaatgacg gagaagcaga cgttgctgga ggacgcagaa 10440cggtgcaggc ataagatgca gacggcttcg acgcttattt cgggtcttgc cggagaaaag 10500gaaaggtgga ccgagcaatc gcaagagttc gcagcccaaa ctaaaagact tgtgggagac 10560gtcttgctcg ccactgcctt tttgtcatac agcggcccat tcaaccagga gttcagggac 10620ctcttgctta acgattggag aaaggaaatg aaggctcgca aaatcccgtt tggaaagaac 10680ctgaatttga gcgaaatgct tattgacgca ccgaccattt ccgagtggaa tctgcaaggc 10740ctcccgaatg acgatcttag catccaaaac ggtattattg tgaccaaggc ctccaggtat 
10800ccactgttga tcgacccgca aactcaagga aagatctgga ttaaaaacaa ggaatcgcgg 10860aacgaacttc agatcactag cctgaaccac aagtacttcc gcaaccatct cgaggattcc 10920ctcagcctgg gccgccccct tctgatcgaa gacgtcggtg aggagctcga tcctgcgctc 10980gataacgtcc tcgagaggaa ctttatcaaa acgggatcaa cgttcaaggt gaaagtcgga 11040gacaaggaag tggatgtcct ggacgggttt cgcctctaca ttacgactaa gttgccaaac 11100cctgcttaca cgcccgagat ctcggcaagg acgtcaatca ttgattttac cgtgaccatg 11160aaaggcctcg aggatcagct tctcgggcgg gtgattctga ctgaaaagca ggaactcgaa 11220aaagaaagaa cgcatcttat ggaggatgtg accgcgaata aacgccggat gaaagagctc 11280gaagataacc ttctctacag gcttacctca acgcaaggtt ccctggtcga ggacgaatca 11340cttattgtcg tgctgtccaa cactaagagg accgcggaag aggtgacgca gaagttggaa 11400atttcagcag aaacggaagt gcagattaat tcggctcgcg aagaatatag accagtcgca 11460actcgcggat cgattctcta ttttttgatc actgagatgc ggcttgtcaa tgaaatgtat 11520caaacctcgc tgcgccagtt tcttggattg tttgacctgt cactcgcacg gagcgtcaaa 11580tcgcccatca cgtcaaagcg cattgcgaat atcatcgaac atatgaccta cgaggtctat 11640aagtatgccg cacggggatt gtacgaagag cacaagttcc tctttactct gctgctgacc 11700cttaaaatcg acattcagag gaacagagtc aagcacgagg aattccttac gctgattaag 11760ggaggagctt cactcgattt gaaggcgtgc ccacccaaac cgtccaagtg gattcttgac 11820attacctggc tgaacctcgt cgagttgtcc aaattgaggc agttctcaga tgtcctggat 11880cagatctcgc ggaacgagaa aatgtggaaa atttggtttg ataaggaaaa cccggaggag 11940gagcccctgc cgaatgcgta cgacaaatca cttgattgtt ttagaaggct ccttttgatt 12000cggtcatggt gccctgacag gacgatcgcc caagctagaa agtacattgt ggactcgatg 12060ggggaaaagt atgcagaggg agtcatcctg gatcttgaaa agacctggga agagtcagac 12120cctagaactc ctcttatctg tctgctttcc atgggctcgg atccgacgga ttcaatcatc 12180gcactcggca agcgcctcaa aatcgagacg cggtacgtgt caatgggtca aggacaagag 12240gtgcatgcac gcaagttgct gcaacagacc atggcgaacg gtgggtgggc cctgcttcaa 12300aactgccacc ttgggctgga cttcatggac gaattgatgg atattattat cgagacggag 12360ttggtgcatg acgcttttag actctggatg actacggaag cccataagca gttccccatc 12420accctgttgc agatgtcgat taagttcgca aatgatcccc cccagggttt gcgggctggt 
12480ctgaaaagga cgtattcggg agtgtcgcaa gatttgttgg atgtctcctc cgggtcgcaa 12540tggaaaccaa tgttgtatgc cgtggccttc cttcattcca cggtgcaaga gcgccgcaaa 12600ttcggggcgc ttggatggaa catcccttat gagttcaatc aagcagattt caatgccacc 12660gtgcaattca tccagaatca ccttgacgat atggatgtga aaaagggtgt ctcatggacc 12720acgatccgct atatgatcgg tgagattcag tatggcggtc gcgtgacgga cgattatgat 12780aagcggttgt tgaacacctt cgcgaaagtg tggttcagcg agaatatgtt cggacctgac 12840ttctccttct accaaggcta taacattcca aagtgttcga ccgtcgataa ctacctccag 12900tacattcaaa gccttcccgc atatgacagc cctgaggtct tcggtttgca cccgaatgcc 12960gacattactt atcagagcaa gctggcaaag gacgtcctgg acaccatcct gggaatccag 13020ccaaaagaca cgagcggtgg aggagacgag acgagggagg ccgtcgtggc cagattggca 13080gacgacatgt tggagaaact gcctcccgac tatgtcccct ttgaggtgaa agaacggctg 13140cagaaaatgg gtcctttcca gccgatgaac attttcctga ggcaggaaat tgatcggatg 13200caaagagtgc tttcccttgt cagatcgacc ctgacggaac tgaaacttgc tatcgatggc 13260actatcatta tgtcggagaa cttgagggat gccttggatt gcatgttcga tgcgagaatc 13320ccggcatggt ggaagaaagc ttcatggatt tcatccactc tcgggttctg gttcacggaa 13380cttatcgaga gaaactcaca gtttaccagc tgggtgttca acggtagacc acattgcttt 13440tggatgactg gtttcttcaa cccgcagggc tttttgaccg ccatgaggca ggagattacc 13500cgcgcaaata aaggttgggc tttggacaac atggtcctct gcaacgaagt cactaaatgg 13560atgaaggacg atatctccgc gccgccaacg gagggcgtgt acgtctacgg gttgtatttg 13620gagggagccg gatgggataa aagaaacatg aagcttatcg aaagcaagcc taaggtgttg 13680ttcgaactca tgccagtgat cagaatttat gcggaaaaca atacgcttcg ggacccgcgg 13740ttttattcct gccctatcta taaaaagcct gtccggaccg acctcaacta tattgcagcc 13800gtggatctgc ggaccgccca aacccccgaa cattgggtgc tgaggggggt ggcgcttctc 13860tgcgacgtga aatag 138751013875DNAArtificial SequenceSynthetic polynucleotide 10atgttccgga ttggtcgccg ccagctttgg aagcacagcg tcactcgcgt cctgactcag 60aggctgaagg gtgaaaaaga ggccaaaagg gccctgctcg atgcaagaca caattatctc 120tttgcaattg tggcatcatg cctcgatctt aataagactg aagtcgagga cgcgattctt 180gaagggaatc aaatcgaaag gattgatcag ctcttcgctg tgggtgggct tagacatctt 
240atgttttact accaggacgt cgaagaagca gaaaccggcc aattggggag cctcggtggt 300gtcaatctgg tctccggaaa gattaaaaaa ccgaaggtct tcgtcactga gggtaacgac 360gtcgcgctca cgggagtctg cgtctttttt attagaactg acccttcaaa agccatcacc 420cccgacaaca ttcaccagga ggtgtcattc aacatgctgg acgcagcaga cggaggtctt 480ctgaattcag tcaggcggct tctttccgat attttcattc cggcgcttcg ggccacctcg 540cacggatggg gcgaactcga aggcctccag gatgctgcga acattagaca ggaattcctc 600agctcgttgg agggatttgt gaacgtgttg tcaggggcac aggaaagcct gaaagagaag 660gtgaacctca ggaaatgtga catccttgaa ctgaaaaccc ttaaggagcc aactgattat 720ttgactctgg ctaataatcc tgaaaccctt ggtaaaatcg aggattgtat gaaagtgtgg 780atcaaacaga ccgaacaagt cttggccgag aacaatcaac tgctcaagga ggcagacgac 840gtgggtcctc gggcagaact tgagcattgg aaaaaacggc tgtccaagtt caattatctg 900cttgaacagc ttaaaagccc ggacgtcaaa gccgtgcttg ctgtcctggc cgccgctaaa 960tcgaaacttt tgaaaacttg gcgggaaatg gatatccgga ttaccgatgc tactaacgag 1020gctaaagaca atgtgaagta cttgtacacg ttggagaagt gttgtgatcc tttgtattcc 1080tcagatccgc tcagcatgat ggacgcgatt ccaaccctta ttaatgcaat taagatgatt 1140tattcgatca gccattatta caatacctca gagaaaatta cctcactgtt cgtcaaagtg 1200acgaatcaga ttatctccgc gtgtaaagca tatattacca acaacggcac tgcatcaatc 1260tggaatcaac cacaagacgt cgtcgaagaa aaaattcttt ccgctatcaa gctgaaacaa 1320gaatatcagc tctgctttca caaaactaaa caaaaactca aacagaaccc caatgccaaa 1380cagtttgatt ttagcgaaat gtatatcttt ggaaagtttg agacgttcca tcggaggttg 1440gcgaagatta tcgatatctt caccacgctc aagacttatt cagtgctcca ggattcaacc 1500atcgaaggat tggaggatat ggctacgaag taccaaggaa ttgtcgcaac tatcaaaaaa 1560aaggaatata acttcttgga tcagcgcaaa atggatttcg atcaggatta cgaggagttt 1620tgtaagcaaa cgaatgacct ccacaatgaa ctcaggaaat tcatggacgt gacgttcgcc 1680aaaattcaaa atactaatca agctttgagg atgctcaaga agttcgaaag gcttaacatc 1740ccaaatcttg gaattgacga taaataccaa cttatcctcg agaactatgg agctgatatc 1800gacatgattt ccaaattgta cacgaagcag aagtatgacc ctccccttgc ccggaatcag 1860cctccgattg caggtaagat cctgtgggcc cgccagctgt ttcacaggat ccaacaacct 1920atgcagcttt ttcagcagca ccccgcagtc ctgtcgaccg 
ctgaggccaa acctattatt 1980agatcatata accgcatggc taaagtcctc ctggagttcg aggtcctgtt ccatagagcc 2040tggcttaggc agatcgagga aatccacgtc ggcttggagg cttccctgct tgtcaaggcc 2100ccggggacgg gcgagctctt cgtgaatttt gatccacaga ttttgatttt gtttcgcgag 2160acggaatgca tggcacaaat gggtctggaa gtctcgccac ttgctacctc gttgtttcag 2220aaaagagata ggtacaagcg caacttttcg aacatgaaga tgatgctcgc agaatatcaa 2280cgggtgaagt ccaaaatccc ggccgcaatc gagcagctca tcgtcccgca cctcgctaaa 2340gtcgacgaag ccttgcaacc tggactggca gcgctcacgt ggacgtcgct caatatcgag 2400gcctacctcg aaaatacgtt tgctaagatc aaagacctcg aattgctcct tgatcgggtc 2460aacgatctga ttgagtttag gattgacgca atcctggagg aaatgtcgtc aacccctctc 2520tgtcagctcc cgcaagaaga gccacttacg tgcgaggaat ttttgcaaat gacgaaagac 2580ctgtgcgtca atggagccca gattcttcac ttcaagtcgt cattggtgga ggaggccgtg 2640aatgagctcg tcaacatgtt gctcgatgtg gaagtgcttt cggaggagga gtcggaaaaa 2700atctccaatg agaatagcgt caactataag aacgaatcaa gcgcaaagcg ggaagagggt 2760aacttcgata ccctgactag ctcaattaac gctagagcga acgctttgct ccttacgacg 2820gtgactcgga aaaaaaaaga gaccgagatg ttgggtgagg aggccaggga gctgctctcg 2880cacttcaacc atcagaacat ggacgcactc cttaaggtga ccaggaatac tttggaggct 2940atccgcaaga ggatccattc gagccatact atcaacttcc gcgactccaa tagcgcgtcg 3000aacatgaaac aaaattcact tccaatcttc agagcgagcg tcacgcttgc cattcctaat 3060atcgtcatgg ctcctgcact ggaagatgtg cagcaaactt tgaacaaggc cgtggagtgt 3120atcatttcgg tcccgaaggg ggtgagacaa tggagcagcg agcttcttag caaaaaaaag 3180attcaagaac gcaagatggc agccctccag tccaacgaag attcagattc agacgtcgaa 3240atgggtgaaa atgagttgca agatacgttg gaaatcgcga gcgtgaatct tcccattccc 3300gtccagacta aaaactatta caagaacgtc agcgaaaata aggagattgt caaacttgtc 3360tcggtccttt caactattat taactcgacg aaaaaagagg tgatcacttc aatggattgc 3420tttaaaaggt ataaccatat ctggcagaag ggtaaggagg aagcaatcaa aacgttcatc 3480acccagagcc cccttctctc agaatttgaa agccaaatcc tctatttcca aaaccttgag 3540caagaaatca acgcggagcc tgaatacgtc tgcgtggggt caattgcgct gtatacggcg 3600gacctcaagt tcgcgctgac cgcggaaacc aaggcatgga tggtggtcat cggaaggcat 3660tgtaacaaaa 
aatatcgctc ggagatggag aacatcttca tgttgatcga ggagttcaac 3720aagaaactga acagaccgat taaggacctg gatgacatca ggattgccat ggcggcgctc 3780aaagaaatta gagaggaaca aatttccatc gatttccagg tcggcccaat cgaagaatcc 3840tatgcattgc tcaacaggta tggccttctg atcgcccgcg aggagattga caaagtggac 3900acgctgcatt atgcttggga aaaacttctt gctagggcgg gagaagtcca aaacaagctc 3960gtgagccttc agccaagctt caagaaagag ctgatcagcg ccgtcgaggt gtttcttcag 4020gattgtcatc aattttacct cgactacgat ttgaatggtc ccatggcatc gggtctgaaa 4080ccacaagagg cttcggatcg gctcatcatg tttcaaaatc agtttgataa tatctacaga 4140aaatacatta cgtacacggg aggtgaagag cttttcggac ttcctgcaac tcaatatccg 4200caacttcttg agatcaaaaa gcaacttaac cttctccaga aaatttatac tctgtataat 4260tcagtcattg agactgtcaa cagctactac gacatcttgt ggtcagaggt caatattgag 4320aaaatcaata atgagttgct tgaatttcag aatagatgtc ggaagcttcc tcgggccctc 4380aaagactggc aggctttttt ggaccttaaa aaaatcatcg atgatttctc cgaatgttgc 4440cccttgctcg agtatatggc tagcaaagcg atgatggaac gccattggga gcggatcacg 4500acgcttactg gacatagcct ggatgtcggc aacgaatcct ttaaacttcg gaacatcatg 4560gaggcgccgc ttcttaaata taaggaggaa attgaggaca tttgtatttc ggcggtgaaa 4620gaaagggaca ttgagcagaa actgaaacag gtgattaacg agtgggacaa taagactttc 4680actttcggtt cctttaaaac gagaggtgaa ctcctgctga gaggagactc gacgtcagag 4740attattgcta atatggaaga ttcacttatg ctgttgggct cacttctctc aaataggtat 4800aacatgccgt ttaaagcaca gatccagaaa tgggtgcaat accttagcaa ctccactgat 4860attattgaga gctggatgac tgtgcaaaac ttgtggatct atcttgaggc agtgtttgtg 4920gggggtgata tcgcgaaaca actgcctaaa gaggccaagc ggttctcaaa catcgataaa 4980tcgtgggtca aaattatgac gcgggcacac gaggtcccta gcgtcgtcca atgctgtgtg 5040ggggacgaga
ctctcgggca actgcttccg cacctgctgg atcagctgga gatctgtcag 5100aaatcgctga ctggatacct cgaaaagaaa cgcttgtgtt tccccaggtt cttttttgtg 5160tcagaccctg ccttgttgga aattttgggg caggcaagcg acagccacac tattcaagcc 5220cacttgttga acgtctttga taacattaaa tccgtgaagt ttcatgagaa aatttacgat 5280agaatcttgt cgattagcag ccaggagggt gaaactatcg agctggataa acctgtcatg 5340gctgagggta atgtcgaagt ctggttgaac agccttctgg aggagtccca gagctcactc 5400catttggtca ttagacaggc tgcagctaat atccaggaga ctggatttca actgacggag 5460tttctctcaa gctttcctgc tcaggtgggt ctgcttggaa tccaaatgat ctggacccgc 5520gattcggagg aggctctgcg gaacgctaag ttcgataaaa agattatgca gaaaacgaat 5580caggcatttc tggaactctt gaataccctg attgacgtga cgactcgcga ccttagctca 5640accgagaggg tcaaatatga gaccctcatt accattcatg tgcatcaacg cgatattttt 5700gacgacctct gccacatgca tattaaatca ccgatggact tcgagtggct gaagcaatgc 5760agattttatt tcaacgagga ttcggataaa atgatgatcc atatcacgga tgtcgccttc 5820atctaccaga atgagttcct tggatgcact gaccgcttgg tcatcactcc attgaccgac 5880cgctgctaca ttactcttgc ccaggctttg gggatgtcga tgggtggggc cccagcgggc 5940cctgcaggta cgggcaaaac ggaaactacg aaggacatgg ggaggtgcct tgggaagtac 6000gtggtcgtgt tcaattgctc agatcagatg gattttaggg ggctcggaag aattttcaaa 6060ggcctggctc agtccggctc ctgggggtgt ttcgacgaat ttaatcggat tgacttgcct 6120gtgctttccg tggctgccca gcaaatcagc attattctca cgtgtaagaa ggagcacaag 6180aaatcattta ttttcactga cggagataac gtgaccatga atcctgaatt cggccttttc 6240ctcacgatga acccaggtta tgcgggtcgg caggagttgc ctgaaaattt gaagattaac 6300tttcgctccg tcgctatgat ggtcccggac cggcaaatca ttattagagt gaagttggcc 6360tcgtgcgggt tcatcgacaa cgtcgtgctg gcaagaaaat tctttacgct ctataagctc 6420tgcgaagaac agctttcaaa acaggtccat tacgactttg gactcagaaa tattctctca 6480gtgctcagga cgttgggtgc cgcaaagagg gccaatccca tggataccga atcaaccatc 6540gtgatgagag tcctgagaga catgaatttg tccaagctta tcgacgagga cgaaccgctc 6600ttcctgagct tgatcgagga ccttttccca aacattttgc tcgataaagc gggctaccca 6660gaattggaag ccgccatttc gcggcaagtc gaggaagcag gtctgatcaa tcatccgcct 6720tggaagctga aagtcattca attgttcgaa acccaacggg 
tccggcatgg gatgatgacc 6780ctcggtccat cgggcgcggg taaaactacc tgtatccaca ctctcatgag ggcaatgacc 6840gactgtggaa agccacaccg ggagatgagg atgaacccta aagctattac cgcgccccag 6900atgttcggaa gactcgatgt ggccaccaat gattggacgg atgggatttt ttcgactttg 6960tggcggaaga ctttgcgggc caagaagggg gaacacatct ggatcattct tgacggaccg 7020gtggatgcca tttggatcga aaatctgaat tcggtgttgg acgataataa aactctgacc 7080cttgcgaatg gtgatcgcat tccgatggcg cccaactgca aaattatctt cgagccccat 7140aatatcgata atgcttcgcc agctactgtg tccaggaacg ggatggtgtt tatgtcatcc 7200tcgatcctgg attggagccc aatcctggag gggtttctca aaaaacggtc gcctcaggag 7260gccgaaattc tcagacagct ttacaccgag tccttcccgg atctctacag gttctgcatc 7320caaaatctcg aatataagat ggaagtcctc gaagccttcg tgatcaccca atccatcaat 7380atgcttcagg gactcatccc ccttaaggag caaggaggag aggtgtcgca ggcccatctg 7440ggaagacttt ttgtcttcgc attgttgtgg agcgccgggg cagctctcga acttgacgga 7500cgcaggcgcc tggaactttg gcttaggtcc aggcccaccg gaacgctgga acttcctccg 7560cctgctggcc ccggggacac tgcttttgac tattatgtcg cccctgacgg tacctggacg 7620cactggaaca ccaggactca ggaatatctg tacccttcag acaccacgcc cgagtacggg 7680tccatcttgg tccctaacgt cgacaacgtg agaactgact ttcttatcca gactatcgcg 7740aagcaaggca aagcagtcct cctgatcggc gaacaaggaa cggctaagac ggtcatcatt 7800aagggtttca tgtcaaagta tgatcctgag tgccacatga ttaaatcgtt gaatttttcc 7860tcggcgacga cgcccttgat gtttcaacgc actattgagt cgtacgtgga taaaagaatg 7920gggaccacgt acgggccccc ggcgggaaag aagatgactg tgttcatcga cgacgtcaat 7980atgccgatca tcaacgagtg gggggaccag gtcacgaacg agatcgtgag gcaattgatg 8040gaacagaatg gtttctacaa tcttgagaag cctggagagt tcacttcaat cgtcgatatt 8100cagtttctgg ccgctatgat ccatccagga ggtggtcgga atgacattcc gcagaggctc 8160aaaaggcagt tttcgatttt taattgcact ttgcccagcg aggccagcgt ggataaaatc 8220ttcggggtga ttggagtggg tcattactgt actcagcggg gattctccga ggaggtgaga 8280gactcggtca cgaagcttgt cccattgact cggcggcttt ggcaaatgac caagattaag 8340atgctgccaa ctcccgccaa attccattat gtgttcaatc tcagggacct ttcgcgcgtg 8400tggcaaggca tgctcaacac cactagcgag gtcattaaag agcccaacga ccttttgaaa 8460ttgtggaagc 
atgagtgcaa acgcgtcatc gcagacagat ttacggtgtc ctcagacgtg 8520acttggttcg acaaagccct cgtgtccctg gtggaagagg aatttggaga ggaaaaaaag 8580ctgctggtgg actgcggtat tgatacttat ttcgtggact ttcttcggga cgcaccggaa 8640gcagccggag aaacgtcgga ggaagctgac gccgagactc ctaagatcta tgagcccatc 8700gagagcttta gccacctcaa agaaaggctt aacatgttcc tccaattgta taatgagtcg 8760attagaggtg ccggcatgga tatggtgttc tttgctgatg cgatggtgca cctcgtcaaa 8820atttcgcggg tgatccggac tccacaggga aatgccctgc ttgtgggcgt cggcggatca 8880gggaaacagt cgctcacgcg cttggcaagc tttatcgccg gttacgtctc attccagatc 8940actctgacga gatcatataa tacttcgaac ttgatggaag atttgaaagt gttgtaccgc 9000acggcgggcc aacaagggaa agggattacg ttcattttca ccgataacga aatcaaagac 9060gaatcgtttc tcgaatacat gaacaacgtc ctgagctccg gggaggtgtc caatctgttt 9120gctagggatg agattgacga aattaattca gaccttgcaa gcgtgatgaa gaaagagttc 9180ccgcgctgcc tccccacgaa cgaaaacctg cacgattatt tcatgagcag agtccgccag 9240aaccttcata tcgtgctttg ttttagccca gtcggagaga agtttcggaa tagagcactt 9300aaatttccgg cgctgatctc cgggtgcacc atcgattggt tttcccgctg gcctaaagac 9360gctctggtcg ccgtgtccga gcactttctc acttcgtatg atattgattg ctcactcgaa 9420atcaagaagg aggtggtgca gtgcatgggg agcttccaag atggtgtggc agagaagtgc 9480gtcgattact tccaaaggtt tagacgctcg actcacgtca ctcctaagtc atatctgagc 9540tttattcaag gttataaatt tatctacggc gaaaagcacg tcgaggtccg gacccttgct 9600aacagaatga acacggggct tgaaaagttg aaggaagctt cagaatcggt ggcagcactc 9660tccaaggaac tcgaggccaa agagaaagaa cttcaagtcg caaacgataa agcggatatg 9720gtcctcaaag aagtgaccat gaaggcacaa gcggcagaaa aagtgaaggc ggaggtgcaa 9780aaagtgaagg atcgcgcgca agcaatcgtc gatagcatct caaaagacaa ggctattgcg 9840gaggagaagc tggaggccgc caagcctgct ctggaggaag cagaggcagc tcttcagacc 9900attcgcccta gcgatatcgc taccgtgaga actcttggta ggccccccca cttgattatg 9960aggattatgg attgcgtgct tctccttttt caacggaagg tgtcggcggt gaagatcgac 10020ttggagaagt catgtactat gccgagctgg caagaatcat tgaaattgat gaccgcagga 10080aacttcttgc agaacctcca acagttccca aaggatacca tcaatgaaga agtcattgaa 10140tttctctccc catattttga aatgccagac tataatattg 
aaacggccaa aagggtgtgt 10200ggaaacgtgg cgggactgtg ctcctggacg aaggcaatgg cgtcgttttt ctcgatcaac 10260aaggaggtgc ttccgttgaa agcaaacctc gtggtccaag agaaccggca cctgctggcc 10320atgcaggacc ttcagaaggc tcaagccgag ctggacgaca agcaggcaga actggacgtc 10380gtgcaagccg aatacgagca agcaatgacg gaaaagcaga cgttgttgga agatgcggaa 10440cgctgccggc ataaaatgca gacggcttcg acgcttattt ccggactcgc tggggagaaa 10500gaaaggtgga cggaacagtc acaagaattt gccgcccaaa ctaagagact ggtgggagac 10560gtgcttctgg caactgcctt tctgtcgtac agcggcccat tcaatcagga attccgggat 10620ctgttgctga atgattggag aaaagagatg aaagcccgga agatcccctt cgggaaaaat 10680cttaatctct ccgaaatgct cattgatgcc cctaccatta gcgagtggaa tctccaagga 10740ttgccgaatg acgatctgtc aattcaaaac ggcatcattg tgacgaaggc ctccaggtac 10800ccacttctta tcgatccgca aacccagggg aaaatctgga ttaagaataa agaatcgcgg 10860aacgagctcc agattacctc acttaatcac aaatatttca ggaaccacct tgaggattca 10920ctgtcgctcg ggcggcctct gttgattgag gatgtgggtg aggaactgga tccagctttg 10980gataatgtcc ttgaacgcaa cttcattaag accggatcaa cctttaaagt caaggtcggc 11040gataaagagg tggatgtgct ggacggcttt agactctata ttacgactaa gcttcctaat 11100cctgcgtaca cgccagaaat ttccgcgagg accagcatca tcgacttcac cgtcacgatg 11160aagggactcg aggaccaatt gctggggagg gtcatcctca ctgaaaagca agaacttgag 11220aaagagagaa cgcacctcat ggaagatgtg actgctaaca agagacggat gaaggaattg 11280gaagataatt tgctgtatcg gctgacttca acccagggct cgctggtgga agacgaaagc 11340cttattgtcg tcttgtcgaa tactaagcgc actgctgagg aagtcactca aaaactcgaa 11400atttcagcgg agaccgaggt ccagatcaac tcggccaggg aggagtacag gccagtggcc 11460actagaggtt cgatcttgta ttttcttatt accgagatgc ggctggtgaa tgaaatgtac 11520caaacgtccc tgcggcaatt ccttggcctt ttcgacttga gccttgctcg ctcggtcaaa 11580tcaccaatta cttccaaacg catcgcgaat atcattgagc acatgactta cgaagtgtac 11640aagtacgcgg ccaggggact gtatgaggaa cacaaattcc tttttacgct cctcctcact 11700ctcaaaattg atatccaacg caacagggtg aagcatgaag agtttcttac tttgatcaag 11760ggaggtgcct cactcgacct caaggcctgc cctcccaaac cgtccaaatg gatcttggat 11820attacttggt tgaacctcgt ggagcttagc aagctgcggc aattctcaga 
tgtcctcgat 11880caaatttcac ggaatgaaaa aatgtggaag atttggtttg acaaagagaa ccccgaggag 11940gaacctcttc ccaatgccta cgacaaaagc ctggactgtt ttaggcggct tctcttgatc 12000aggtcatggt gtccggatcg cactatcgct caagcgcgga agtatatcgt cgactccatg 12060ggtgaaaagt acgcagaggg ggtcatcctc gatctggaga aaacctggga agagtcagac 12120ccaagaactc cgttgatttg cctcttgtca atgggctccg accctaccga ctccatcatc 12180gcgctgggta aaaggctcaa aatcgaaacc cggtatgtca gcatggggca aggtcaggag 12240gtgcacgcgc ggaagcttct ccaacagacg atggcaaatg ggggttgggc acttcttcag 12300aactgccact tgggcctcga cttcatggat gaactgatgg acattattat tgagacggag 12360ctggtccacg atgcattccg cctctggatg accacggaag cccacaaaca atttcctatc 12420acgctgctgc agatgtccat taagtttgca aatgatcccc cccaaggtct tcgcgcaggc 12480cttaagagaa cgtattcagg agtgtcacag gatctccttg atgtctcatc ggggtcacaa 12540tggaaaccga tgctgtacgc ggtcgctttc cttcactcca ctgtgcagga gcggaggaag 12600tttggagcgt tgggatggaa tatcccctac gaatttaacc aagccgactt taatgctact 12660gtgcaattta ttcaaaacca tcttgacgac atggacgtca aaaagggagt gagctggact 12720accatcagat acatgatcgg tgagattcag tatggaggga gggtcaccga cgactatgac 12780aaacggcttt tgaacacgtt cgcaaaagtg tggttttcag agaacatgtt tggcccggat 12840ttctcatttt atcaagggta taatatccct aagtgctcaa ccgtcgataa ctatctccag 12900tatatccaga gccttcccgc ttatgattcc ccagaggtgt ttgggttgca cccgaatgcg 12960gatatcactt accagagcaa acttgctaag gacgtgcttg atacgattct cggtattcaa 13020cccaaagata cgagcggagg aggagacgaa accagggaag ccgtggtcgc taggctcgct 13080gacgacatgc tcgagaaact tccgcccgac tacgtcccgt ttgaagtcaa ggaaaggttg 13140cagaagatgg gtcccttcca gccaatgaac attttcctcc ggcaggagat tgatcgcatg 13200cagagagtgc tgtcattggt ccggtcgacg ctcactgaac ttaaacttgc cattgacggg 13260actatcatca tgagcgaaaa tctgcgggat gcattggatt gcatgtttga tgcgcggatc 13320ccagcgtggt ggaagaaagc ttcctggatc agcagcactc tcgggttttg gtttaccgag 13380ctcattgaaa gaaattcgca gttcacgtca tgggtcttta acgggcgccc tcattgcttt 13440tggatgacgg gcttctttaa cccgcagggt tttctcacgg ctatgaggca ggaaatcact 13500agggcaaata aaggctgggc gcttgacaac atggtcctgt gtaatgaggt gactaaatgg 
13560atgaaggacg acatcagcgc tccccctacg gagggtgtgt atgtgtacgg cctctatttg 13620gaaggtgctg gatgggataa gcgcaatatg aaacttattg agagcaaacc taaggtgctg 13680tttgagctga tgcccgtgat ccggatctat gctgaaaata acacgttgag ggaccccagg 13740ttctattcgt gtccgatcta caaaaagccg gtccgcaccg atttgaatta cattgctgct 13800gtcgatcttc ggacggccca gacgcctgaa cattgggtcc tcaggggtgt ggcgctgctg 13860tgtgatgtca aatag 138751113875DNAArtificial SequenceSynthetic polynucleotide 11atgtttcgga ttggcaggcg ccagctgtgg aagcactcag tgactcgggt ccttacgcag 60cggttgaaag gcgagaagga ggcaaaacgg gcactgttgg acgcaaggca caactatttg 120ttcgcaattg tggcatcgtg cttggatctt aacaaaacgg aggtcgaaga cgcgatcctg 180gaggggaatc agattgagag gatcgaccaa ttgttcgccg tcggagggtt gcggcacttg 240atgttctact atcaagacgt cgaggaggct gagaccggcc agctcgggag cttgggagga 300gtcaaccttg tgtcgggcaa aatcaaaaaa cccaaggtct tcgtgacgga ggggaatgat 360gtcgcactga ccggtgtgtg cgtcttcttt atccgcaccg atccgtcaaa agcaattacc 420cccgacaata tccatcagga agtgtcgttc aatatgctcg atgcggctga tggtggcctc 480cttaattcag tcagaaggct tctctcagat atttttattc ccgccctccg ggctacgtcg 540catggctggg gggagctgga ggggcttcaa gacgcagcta atattagaca ggagtttctt 600tcctcactcg agggcttcgt caatgtcctg tccggcgcgc aggaatcact caaagaaaaa 660gtcaacctca gaaaatgtga tatcctcgaa cttaaaaccc ttaaagaacc gaccgattac 720ctcacgttgg caaataaccc cgaaactctg ggcaaaattg aagattgcat gaaagtgtgg 780atcaagcaaa ctgagcaggt cctcgccgag aataaccagc tgctgaagga ggcggatgac 840gtgggacctc gcgcggaact ggaacactgg aagaaacggc tttcaaaatt caactacctt 900ctggaacagc ttaaatcccc tgatgtgaag gccgtcctgg ccgtgctcgc ggccgccaaa 960agcaagctgt tgaagacttg gcgggagatg gacatcagaa tcacggacgc gaccaacgaa 1020gccaaggaca atgtcaagta cctctacacg cttgagaaat gttgtgatcc cctctattca 1080agcgatcctc ttagcatgat ggacgctatc ccaactctta tcaacgctat taaaatgatc 1140tatagcatta gccactatta taacacttca gagaaaatca cgagcctctt cgtgaaggtc 1200actaatcaaa tcatttccgc ttgtaaagcc tacattacta ataacggaac ggcatcgatc 1260tggaaccagc cgcaagacgt ggtggaggaa aaaatccttt ccgcaattaa gcttaaacag 1320gagtatcagc tttgctttca caagacgaaa 
cagaagttga aacaaaaccc aaacgcaaag 1380caattcgatt tctcagagat gtatattttc ggaaagttcg aaacgtttca caggagactg 1440gcgaaaatta tcgacatctt cactacgctt aaaacgtact cggtgctcca ggacagcact 1500atcgagggtc ttgaggacat ggccaccaag tatcagggga tcgtcgccac cattaagaag 1560aaggagtaca acttccttga ccaacggaaa atggattttg accaagatta tgaagaattc 1620tgcaaacaaa ccaatgattt gcataacgaa cttcgcaaat tcatggacgt cacgtttgcg 1680aaaatccaga atactaacca ggcgctgcgc atgctcaaaa agtttgagag gctgaacatc 1740ccgaatctcg ggatcgacga caagtatcag ctcattcttg agaactacgg cgccgacatt 1800gatatgatct cgaaactcta cacgaagcag aaatatgatc cgccgctcgc taggaaccag 1860cctcctattg ccggaaagat tctctgggcg cgccagctgt tccatcgcat tcagcaacca 1920atgcagcttt tccagcagca tcctgctgtg ctgtccaccg cagaggcgaa acctattatt 1980agatcataca acaggatggc caaagtgttg ctggaatttg aggtgctgtt ccaccgggcc 2040tggctgcgcc aaatcgagga gattcacgtg gggcttgagg catccctctt ggtgaaagct 2100ccaggtaccg gggaactgtt cgtcaatttt gacccgcaaa ttttgattct ttttcgggag 2160actgaatgta tggctcagat gggtttggag gtgtcaccct tggcaactag cctgtttcag 2220aagagagatc gctataaaag gaacttcagc aacatgaaga tgatgctcgc agagtaccag 2280cgcgtcaagt ccaagatccc agccgctatc gagcagctga tcgtccccca cctcgctaaa 2340gtggacgaag cgttgcagcc tgggctcgca gcacttacgt ggactagcct caacatcgaa 2400gcttatttgg agaatacttt cgcaaagatc aaggacctgg agcttctgct tgatagggtc 2460aacgatttga tcgagttcag aattgatgct attctcgagg agatgtcctc aacgccgttg 2520tgccagctcc cacaggaaga accattgact tgcgaggagt tcctgcaaat gaccaaggac 2580ttgtgtgtca acggtgcgca gattttgcat tttaagtcaa gcttggtgga agaggcggtg 2640aacgagcttg tcaacatgct cttggacgtc gaggtgttgt cagaggaaga atcggaaaaa 2700atctcgaatg aaaattccgt gaactataaa aacgaatcgt ccgccaagcg ggaagaaggg 2760aacttcgata ccctcacctc aagcattaac gcgagggcta acgccctcct gctcactact 2820gtgacgcgga aaaaaaagga aactgaaatg ctgggtgaag aggctcgcga acttttgtca 2880cattttaatc accagaatat ggacgctctg ctcaaggtca ctcgcaacac tcttgaagcc 2940atccgcaaac gcatccattc aagccacacg atcaacttca gggattccaa ctcggcaagc 3000aacatgaaac agaactcact gcccattttt cgggcgtcag tgacgctcgc gatcccgaat 
3060atcgtcatgg cccctgccct tgaagatgtc caacaaactc tcaacaaagc ggtggaatgt 3120attatttcag tgcctaaggg tgtcaggcaa tggagctcag agcttctgtc gaaaaaaaaa 3180atccaggaga ggaaaatggc cgctttgcag agcaatgagg actcggattc cgacgtcgaa 3240atgggggaaa acgaactgca agacacgctg gaaattgcat cagtcaacct tccaattcct 3300gtccaaacta aaaactatta taagaacgtc tcggaaaata aggagattgt caaactggtg 3360agcgtgttga gcactattat taatagcacg aagaaagaag tgatcacgtc catggactgc 3420ttcaagaggt ataaccatat ctggcagaaa ggtaaagagg aagccattaa gacttttatt 3480acgcaaagcc ccttgttgag cgagtttgag tcccagattc tctacttcca gaatctcgag 3540caggagatca acgctgaacc cgagtatgtc tgtgtgggta gcattgcctt gtatactgcc 3600gacctcaagt ttgctctcac ggctgaaact aaagcgtgga tggtcgtgat cgggaggcat 3660tgtaacaaga agtaccgcag cgaaatggaa aatattttta tgctgatcga agagttcaac 3720aaaaaactta atcggcccat caaagacttg gatgacattc ggattgcgat ggccgctctc 3780aaagaaatca gggaggagca gatctccatt gactttcagg tcgggcctat tgaggagtcc 3840tacgcactcc tgaacagata tggtctgctc atcgctcggg aagaaatcga caaagtggat 3900accctgcatt atgcatggga aaaattgttg gcacgcgcag gggaggtcca gaacaaactt 3960gtgagcctcc aaccttcgtt taaaaaggag ctgatctccg cagtggaggt ctttttgcag 4020gactgtcatc aattctatct tgattatgac ctcaatgggc ctatggcgag cgggctcaag 4080cctcaagaag catccgatcg cttgattatg ttccagaatc agtttgacaa tatctaccgg 4140aagtatatta cttataccgg cggtgaagaa ttgttcggtc ttcctgcgac ccagtacccg 4200cagcttctcg aaattaagaa gcaactgaat ctcctccaaa aaatttacac gctttacaac 4260agcgtgatcg aaaccgtgaa ctcgtattac gatatcctct ggtcggaagt caacattgaa 4320aaaattaaca atgaactgct ggaatttcag aatagatgca gaaaattgcc acgggctctg 4380aaagattggc aggccttttt ggacttgaag aaaattatcg acgacttttc cgagtgctgc 4440cccttgctgg aatatatggc ttccaaggcg atgatggaaa gacactggga aaggatcacg 4500actctcaccg gtcactcgct tgatgtggga aacgaaagct ttaagctcag aaatatcatg 4560gaggcaccgc tgttgaagta taaggaggag attgaagata tttgtatttc ggccgtgaag 4620gaacgggaca tcgagcagaa actgaagcag gtgatcaatg aatgggataa caagactttt 4680acctttgggt cctttaagac ccggggcgag ctccttctta gaggcgactc gacttccgaa 4740atcatcgcaa atatggaaga ttccctgatg 
ctccttggat cattgttgtc aaatcgctac 4800aacatgccct tcaaggccca gattcagaag tgggtccaat atctgtccaa cagcaccgac 4860atcatcgaat cctggatgac tgtccagaac ctctggatct acctggaagc ggtgtttgtc 4920ggcggggata tcgcgaaaca gttgccaaag gaggcgaaaa ggttctcgaa cattgataaa 4980tcctgggtga aaattatgac tcgcgcccac gaagtgccaa gcgtcgtcca atgctgtgtg 5040ggggatgaaa ccctgggcca actgctgccc caccttctgg accagcttga gatctgtcag 5100aaatccctga cgggctacct cgaaaagaaa agactttgct tccccaggtt cttctttgtc 5160tcagacccgg ccctgctgga gatcctcgga caagcatcgg actcgcatac cattcaggca 5220catctcctta acgtgttcga taatatcaag tccgtgaaat tccacgaaaa gatttacgac 5280cggatcctta gcattagcag ccaggaagga gaaactatcg agcttgacaa accagtgatg 5340gctgaaggaa atgtcgaggt ctggttgaat tcccttcttg aagagtcgca gagctccctc 5400catcttgtca ttcggcaagc cgcagcgaat attcaggaga cggggtttca acttaccgag 5460tttctttcga gcttccccgc tcaagtcggt ttgttgggca ttcagatgat ttggacgagg 5520gactcggagg aggccctccg caatgctaag ttcgataaaa aaatcatgca aaaaaccaac 5580caggcattcc ttgagcttct taacactctt atcgatgtca ccacccggga cttgtcctcc 5640acggagaggg tcaagtacga aacgcttatc actatccacg tgcaccaacg ggacatcttt 5700gatgacttgt gccatatgca tatcaaatcg ccaatggatt tcgagtggct gaaacagtgc 5760cggttctact tcaacgaaga ttccgataaa atgatgattc acattaccga tgtcgcattc 5820atttaccaaa atgagttcct cggatgtact gatcggcttg tcattacgcc cctgactgac 5880aggtgttaca tcactctggc gcaagctctg ggtatgtcga tggggggcgc tccggcaggc 5940cccgcgggta ccgggaagac cgaaactacg aaggatatgg gcagatgctt gggcaaatac 6000gtggtcgtgt ttaactgttc agatcagatg gacttccggg ggctgggtcg catctttaag 6060gggttggcac agtcaggctc ctggggatgt ttcgatgaat tcaatcggat cgacttgccg 6120gtcttgagcg
tggcagcgca acagatttca atcatcctta cctgtaagaa ggagcataag 6180aagtcgttta ttttcacgga cggggacaac gtcaccatga acccagagtt tggattgttc 6240ctcactatga atccggggta cgcaggccgc caagagctcc cagagaatct caaaattaat 6300tttagatcag tggctatgat ggtcccggac agacagatca tcattcgggt gaaactcgcc 6360agctgcggct tcatcgacaa cgtggtgttg gcgcggaaat ttttcacgct ctataaactc 6420tgcgaagaac agctttcaaa acaggtgcac tatgattttg gcctccggaa cattctctcc 6480gtcctgagaa ctctcggagc ggcgaaaagg gcaaatccta tggataccga gtcgacgatt 6540gtgatgaggg tcctgagaga tatgaacctt tcaaaactga tcgacgagga cgaaccactg 6600tttctttcgt tgatcgagga tttgtttccg aacatccttc tggacaaggc tggttacccg 6660gagcttgaag ctgcgatttc acggcaagtc gaagaggctg gattgattaa ccacccgcca 6720tggaagctga aagtcatcca attgtttgag actcaaagag tccgccatgg catgatgact 6780cttggtccta gcggcgcggg gaagacgacg tgtatccaca ctttgatgag ggcaatgacg 6840gattgcggta aacctcacag agaaatgagg atgaatccaa aggctattac cgcaccgcag 6900atgttcggaa ggttggacgt ggcgacgaat gactggactg acggcatttt ctcaacgttg 6960tggcgcaaga ccttgagagc caaaaaagga gaacatatct ggattatcct cgacggcccc 7020gtggatgcca tctggattga gaatcttaac tcggtgctcg atgataataa gaccctgacc 7080ctggctaacg gagataggat cccgatggcc cctaattgca aaatcatctt tgaaccgcat 7140aacattgata atgcatcacc agcgaccgtc tccaggaatg gtatggtgtt catgagctca 7200agcattctgg attggtcgcc cattcttgag ggattcctca aaaaaagatc acctcaggag 7260gcagagattt tgagacaact gtatacggaa tccttcccgg atctgtatag attttgtatc 7320caaaatctcg agtataaaat ggaggtcctt gaggctttcg tcattacgca aagcatcaat 7380atgctgcagg gtctgatccc tttgaaagaa caggggggag aggtgtcaca agctcacctg 7440ggaagactct tcgtgttcgc gttgctctgg agcgcaggcg cagcgctcga gctggatgga 7500aggaggaggc tcgaattgtg gctgcggagc cgccccacgg gcactttgga actgccgccc 7560ccggccggtc cgggcgacac cgcattcgac tactacgtgg cgccagatgg tacgtggact 7620cactggaata cgagaaccca agaatatctt tatccatcgg ataccacgcc tgagtatggt 7680agcattctgg tccctaatgt cgataatgtc agaacggact tcctcatcca aactattgcc 7740aaacagggta aagccgtcct cttgatcggt gaacagggta ccgcaaagac cgtcattatc 7800aaagggttca tgtcaaaata tgacccggag tgtcatatga 
tcaagagcct caacttctca 7860tcagccacca ctccgctgat gttccaaagg actatcgagt cgtatgtcga caagaggatg 7920ggcacgacgt atgggcctcc tgccggcaag aagatgaccg tctttattga tgacgtgaat 7980atgccgatta ttaacgaatg gggagatcag gtcactaatg aaatcgtccg ccaactgatg 8040gagcagaacg ggttctataa tctggagaag cccggcgaat ttacttcaat tgtggatatt 8100cagttcttgg cagctatgat ccatccaggc ggaggccgga acgacatccc ccaacggctg 8160aagagacagt ttagcatctt taactgcacg ttgccctcag aggcatcagt ggataagatc 8220tttggagtga tcggagtggg tcactactgt acccagcggg gtttctcgga ggaggtccgc 8280gacagcgtca ctaagctggt gcccttgacg agacggctct ggcagatgac gaagattaag 8340atgctgccaa ctcccgcgaa gttccactac gtgtttaatc tgcgggactt gagccgcgtg 8400tggcagggga tgctcaacac cacgtcagag gtcatcaaag agccgaacga cctgctcaag 8460ttgtggaaac acgaatgcaa acgggtgatt gcggatcgct ttacggtctc ctccgatgtc 8520acctggttcg ataaggcctt ggtgtcactt gtcgaagagg agttcggcga ggaaaaaaag 8580cttctcgtcg actgcggaat tgatacgtat tttgtggact tcctgaggga tgcccccgaa 8640gcggcaggag agacttccga ggaagcagat gcggaaaccc ccaaaattta tgaaccgatc 8700gagtcatttt cacacttgaa ggagcgcctg aacatgttcc tccaactcta taatgaatcg 8760atccgcggtg caggaatgga catggtgttc tttgctgatg ctatggtcca tctggtcaaa 8820atttcaagag tgattagaac tccgcagggc aatgcgctcc tcgtcggagt ggggggatcc 8880gggaagcaat cgctcacccg gctggcctcg ttcatcgcgg ggtatgtgtc gttccaaatt 8940actctcacca ggagctacaa taccagcaac cttatggagg acctcaaggt gctctacagg 9000actgctggcc agcagggtaa gggcattacc tttattttta ccgataatga aattaaagac 9060gaatcctttt tggagtacat gaacaatgtc ctttcgtcag gtgaagtgtc aaacctcttc 9120gcaagggatg aaattgacga gatcaacagc gacctggcta gcgtgatgaa aaaagaattt 9180ccgcggtgtt tgccaaccaa tgaaaacttg catgactact tcatgtcacg ggtccgccaa 9240aacttgcaca tcgtcctctg tttttcgcca gtgggtgaga agtttcggaa cagggcactt 9300aagttcccgg cactcatctc cggctgtact atcgactggt tctcgaggtg gcctaaagat 9360gcattggtgg cagtgagcga gcacttcttg acgtcctacg atattgactg ttcgttggaa 9420attaagaaag aggtggtcca gtgtatgggg tcctttcagg atggagtcgc agaaaaatgc 9480gtcgactatt tccagagatt caggagatca acgcatgtca ctccgaagtc atatttgagc 9540tttattcaag 
gctacaagtt tatttacggg gaaaagcatg tcgaggtcag aacccttgca 9600aatagaatga ataccggctt ggagaaactc aaagaagcgt cggaatcagt ggcggcattg 9660tcaaaggagc tcgaagcaaa ggaaaaggaa ttgcaggtgg ctaacgacaa agcggacatg 9720gtgctgaaag aagtgaccat gaaagcccaa gctgcagaga aagtcaaagc tgaagtgcag 9780aaggtgaagg accgcgcaca ggctatcgtc gattcgatct cgaaggataa ggctattgcc 9840gaagagaaac ttgaggccgc caagcccgct cttgaagagg cggaagctgc attgcagacc 9900attagaccct ccgacattgc aacggtcagg actctgggca ggccccctca cttgattatg 9960agaatcatgg actgcgtgct ccttctgttc caaagaaagg tgtccgccgt gaagatcgat 10020cttgagaagt catgcacgat gccgtcctgg caggaatcgc ttaagcttat gacggcaggt 10080aactttctcc aaaatctcca gcaatttccc aaggatacca ttaacgaaga ggtcatcgaa 10140ttcttgtcac cctactttga aatgcccgat tacaacattg agacggctaa acgggtctgc 10200gggaatgtcg ccggactgtg tagctggact aaggccatgg caagcttctt ctccatcaac 10260aaagaagtcc tgcctctgaa agcaaatctg gtggtccaag agaatagaca tcttctggct 10320atgcaggact tgcaaaaagc ccaggcggag ttggacgata aacaagcaga gttggacgtg 10380gtgcaggccg agtacgaaca ggctatgacc gagaagcaaa cgctcctcga ggatgcagag 10440cgctgtaggc ataaaatgca gacggcatcc accctcatct cagggctggc tggggaaaag 10500gagaggtgga ccgaacagtc acaagaattt gccgcccaga ctaagagact tgtgggagat 10560gtcctgctcg caacggcctt cctgtcgtat agcggtccat ttaatcagga atttcgcgat 10620ctcttgctta acgattggag aaaagaaatg aaggccagaa aaatcccgtt cggtaaaaac 10680cttaatttgt cggagatgct gatcgacgcg cctactattt cagaatggaa tctgcaaggg 10740ctgccaaatg acgacttgtc catccaaaac ggaattatcg tgactaaggc ttcgcggtat 10800ccactcctca ttgatccgca gactcaaggt aaaatttgga tcaagaataa ggaatcgcgg 10860aacgagcttc agatcacttc actcaaccac aaatatttcc gcaaccacct ggaggattcc 10920ttgagccttg gaagaccgtt gctcatcgaa gatgtcggcg aagaacttga tccggccttg 10980gacaacgtcc tggaaaggaa cttcatcaaa actggctcga cttttaaagt gaaggtgggc 11040gacaaggaag tcgatgtcct ggatggattt aggctttaca ttacgactaa attgccgaat 11100ccagcataca ccccggaaat ttcggcgagg accagcatca ttgactttac ggtgactatg 11160aagggattgg aagaccagct cctcggcagg gtgattttga cggaaaaaca ggagctggaa 11220aaagaaagga cgcacctcat 
ggaggatgtg accgccaata aacgccggat gaaggagctc 11280gaggataatc tcctttatcg cctcacgagc actcaaggat ccttggtcga ggacgagtcc 11340cttattgtcg tgcttagcaa cactaagaga acggcggagg aggtcactca gaaactcgaa 11400attagcgcag aaacggaggt gcaaatcaac tcagctaggg aggaatatcg gccagtcgca 11460actagaggct ccattctcta tttcctcatc accgaaatgc gcctcgtcaa tgagatgtac 11520caaacttcac tgcgccaatt cctgggtctt tttgatctgt ccctcgcaag atcggtgaaa 11580tcccccatta ccagcaagcg gattgcgaac attatcgaac atatgacgta tgaggtctac 11640aagtatgccg ccagggggct ttacgaagag cacaagttcc tctttacgtt gttgttgact 11700cttaagattg acattcagcg gaaccgcgtg aaacatgagg agtttctcac tctgatcaaa 11760ggaggggcaa gccttgactt gaaggcatgc ccccccaaac catcgaaatg gattcttgat 11820atcacctggc tcaacttggt cgagctgtca aagctccggc aattctcgga cgtccttgac 11880caaatttcga ggaacgagaa gatgtggaag atctggttcg ataaagagaa tcccgaagag 11940gagcctttgc ccaacgccta tgataaatca ttggactgct ttcgccggct ccttctcatc 12000agaagctggt gtccagacag aacgattgcc caggcgcgga agtatatcgt cgatagcatg 12060ggggaaaaat acgccgaggg tgtcattctt gaccttgaaa aaacttggga ggaatccgat 12120ccgcggactc ctttgatttg tttgctgtcc atgggctccg atcccactga tagcatcatt 12180gcacttggta agaggcttaa aattgaaacg cgctacgtga gcatgggaca gggccaggag 12240gtccatgctc ggaaacttct gcagcaaacg atggccaacg gtggttgggc ccttttgcag 12300aattgccact tgggtctcga ttttatggat gaacttatgg acatcattat cgagaccgaa 12360cttgtccacg acgcatttcg gctctggatg acgactgaag cgcataaaca gtttccgatc 12420accttgctcc agatgtcgat taaattcgcg aacgaccctc cgcaagggct tagagcgggt 12480ctcaaaagga cctactcggg ggtgtcacag gatcttcttg acgtctcctc cggcagccag 12540tggaaaccaa tgctgtacgc tgtggcattc ttgcactcca cggtgcagga aaggcggaag 12600ttcggagctt tgggctggaa tatcccgtac gaattcaacc aggccgattt taatgcaacg 12660gtgcaattta tccagaatca tctcgatgac atggatgtga aaaagggggt ctcatggacc 12720accattagat atatgatcgg ggaaatccaa tacggtggta gggtcactga tgattatgat 12780aagagacttc tgaatacgtt cgcaaaggtc tggttctcag agaatatgtt tggtcctgat 12840ttctcgtttt atcagggcta taacatccct aagtgcagca ccgtggataa ctatctccaa 12900tatatccaat ccctccctgc ttatgattca 
ccagaagtct ttggcttgca tcctaatgca 12960gatattacgt atcagtcaaa actggcgaag gacgtcttgg acactatcct gggtattcag 13020ccgaaagata cgagcggggg tggagacgaa accagagagg cagtcgtggc gaggctggct 13080gacgacatgc tggagaagct gcctcccgac tacgtcccct ttgaggtgaa agaaagactg 13140cagaagatgg gccccttcca acctatgaac atcttcttga gacaagaaat cgacaggatg 13200caaagagtgc tgagcctcgt gcgctccacc ctgactgaat tgaagctcgc aatcgatgga 13260acgatcatca tgtcggaaaa cttgcgggac gcacttgact gtatgttcga cgccaggatc 13320ccagcgtggt ggaaaaaagc atcatggatc tcatcgactc tgggtttctg gtttaccgaa 13380ctgatcgaaa ggaattcgca gtttacgtcc tgggtgttta acggacggcc acattgcttc 13440tggatgaccg gcttttttaa ccctcagggt tttcttacgg ctatgcgcca agaaatcacc 13500cgggcaaaca agggttgggc acttgacaac atggtcttgt gtaacgaggt gactaagtgg 13560atgaaggatg acatctcagc tccgccgact gagggggtct acgtgtatgg tctttatctg 13620gagggcgcag gttgggataa acgcaacatg aagctgatcg agtcgaaacc aaaagtcttg 13680ttcgagctca tgcccgtcat tagaatctac gccgagaaca atacgcttcg cgaccctaga 13740ttctatagct gcccgattta taaaaaaccg gtgcggacgg acttgaatta tatcgcggca 13800gtcgatctgc ggacggcgca gacccctgag cattgggtgc tgcggggagt ggctcttctg 13860tgcgatgtca agtag 138751213875DNAArtificial SequenceSynthetic polynucleotide 12atgtttcgca tcgggaggcg gcagctctgg aagcactccg tgacgcgcgt gctgacgcag 60cgccttaaag gcgaaaagga agcgaagcgg gcccttctgg acgctcggca taactacttg 120tttgccattg tggcttcctg tcttgatctg aacaagacgg aggtcgagga tgcaatcctt 180gaaggcaatc aaattgaaag aattgaccag ttgttcgcag tgggcggtct caggcatctt 240atgttttact atcaagacgt cgaggaggcg gaaacgggcc aattggggag ccttggggga 300gtcaatctgg tgagcgggaa aatcaagaag ccgaaagtct ttgtgacgga gggaaacgac 360gtggcgctta ccggtgtgtg cgtgttcttt attagaaccg acccatcgaa ggccatcacg 420cccgacaaca ttcatcagga ggtgagcttc aacatgttgg atgcagcaga tgggggactg 480ttgaactcgg tgagaagact cttgtccgac atttttatcc cggctttgcg ggcaactagc 540cacggttggg gtgaattgga agggctccag gacgctgcaa atatccgcca agagtttctt 600agctcccttg aaggttttgt gaatgtcttg agcggagcgc aggagtccct taaggaaaaa 660gtcaatctga gaaaatgtga tattttggaa ctgaaaacgc tgaaggaacc 
gactgattac 720ctgaccttgg cgaataatcc agaaacgttg gggaagatcg aagactgcat gaaagtgtgg 780attaagcaga ccgagcaagt cttggcagaa aacaaccagt tgctcaagga ggctgacgac 840gtggggcccc gggccgagct cgagcattgg aagaagcggt tgagcaagtt taactatttg 900ctcgaacagt tgaaatcgcc tgacgtcaag gcggtgctcg cagtcttggc agctgcgaag 960tcaaagttgc tgaaaacttg gcgcgagatg gacatccgca ttactgacgc taccaatgaa 1020gcaaaagaca acgtgaaata tctgtatact ctcgagaaat gttgtgaccc tttgtattcg 1080tcggaccccc tttcgatgat ggatgcaatc ccgacgttga tcaacgctat caagatgatt 1140tattccattt cacactacta caatacgtcg gagaagatca cctccctgtt tgtcaaggtg 1200acgaatcaga tcatttcggc gtgtaaagca tatatcacta acaacggcac ggcatccatt 1260tggaaccagc ctcaggatgt cgtcgaagag aagatcttgt ccgctatcaa gctgaagcaa 1320gagtatcaac tgtgttttca caagactaaa cagaagctca agcaaaaccc aaatgcgaag 1380cagtttgatt tttcagaaat gtacatcttc gggaagttcg agaccttcca caggcggttg 1440gctaaaatta tcgatatctt cacgacgctt aagacgtata gcgtccttca ggattcgact 1500atcgaaggac ttgaagatat ggcgacgaag taccaaggga tcgtcgcaac gatcaagaag 1560aaagagtata actttctcga tcagagaaag atggactttg atcaagacta tgaagagttc 1620tgcaaacaaa ccaatgatct tcataacgaa ctgcggaaat tcatggatgt cacctttgcg 1680aagatccaaa acaccaacca agccctcaga atgctcaaga aatttgaaag gcttaatatt 1740ccaaatctcg gcattgatga taagtaccaa cttatcctcg aaaattacgg agccgacatc 1800gacatgatta gcaagcttta caccaaacag aaatatgatc ccccattggc gagaaaccag 1860ccccccattg caggtaaaat tttgtgggcc cgccagctct ttcatagaat tcagcagccg 1920atgcagctgt ttcaacagca tcccgcagtg ctttccacgg cagaggccaa gccaatcatc 1980cgctcgtata atagaatggc aaaggtgctt ctggaattcg aggtcctttt ccatcgggcc 2040tggttgcggc agattgagga aatccatgtg ggccttgaag cgtcactgct tgtcaaagcc 2100ccgggtactg gtgagctgtt tgtcaacttt gatccccaaa tcttgatttt gttccgggaa 2160acggaatgca tggcccagat gggtctcgag gtctccccat tggccacgtc gctctttcaa 2220aagcgggacc ggtacaaaag gaatttttcg aatatgaaaa tgatgctcgc tgaataccaa 2280cgcgtcaaat cgaagatccc agctgctatt gaacaactca ttgtcccaca ccttgctaaa 2340gtcgatgagg ccctgcagcc tggtcttgcc gctttgacgt ggactagcct taacattgaa 2400gcatatcttg aaaatacgtt 
cgcaaagatc aaagaccttg agctcctgct cgaccgggtg 2460aatgatctga tcgagttccg gatcgatgca atcttggaag aaatgtcgtc caccccactc 2520tgtcaacttc cacaggagga acccctgacc tgtgaagaat tcttgcagat gaccaaagat 2580ctttgcgtca acggtgctca gatcttgcac tttaagagca gcctggtcga ggaggcggtc 2640aatgagctcg tcaatatgtt gcttgacgtc gaagtgttgt cggaagagga aagcgagaaa 2700atttccaacg aaaattcagt caactacaaa aatgaaagct cggctaagag ggaggaaggc 2760aatttcgata cgctgaccag ctccattaac gcgcgcgcta acgccctgct tctcaccacc 2820gtcacccgga agaagaaaga gactgaaatg ctgggagaag aggctcggga gcttctcagc 2880cactttaatc accaaaatat ggacgcgctc ttgaaggtca cccggaatac cctcgaggcg 2940atccggaaga ggatccactc atcccacact attaattttc gcgattccaa cagcgctagc 3000aacatgaaac agaattccct gccgattttt cgggctagcg tgactcttgc cattccaaac 3060attgtgatgg ctccagcatt ggaagatgtg caacaaacgc ttaacaaggc ggtcgagtgt 3120atcattagcg tgccaaaggg ggtcaggcaa tggagctccg agttgttgag caagaagaag 3180attcaagagc ggaagatggc ggcactccaa tccaacgagg attcggattc agatgtcgag 3240atgggtgaaa atgaactgca agatactctg gaaatcgcaa gcgtgaatct gccaattcct 3300gtccaaacca agaattatta taaaaacgtc tcggagaaca aagaaatcgt caagctcgtg 3360tcggtcctct caactatcat taatagcact aaaaaagaag tcattaccag catggactgt 3420tttaaacggt ataatcatat ctggcagaaa ggaaaggaag aggcaatcaa gacgtttatc 3480acccagagcc ccttgctctc agaattcgaa tcacagattc tctacttcca gaatcttgaa 3540caagaaatca atgctgagcc agagtatgtg tgcgtggggt ccatcgcttt gtatacggcc 3600gatctgaaat ttgcgttgac cgcagaaacg aaggcttgga tggtggtcat tggccgccat 3660tgcaacaaaa agtatcggtc agaaatggaa aacatcttca tgctgattga agagtttaac 3720aagaagttga atcggcctat taaagatctc gatgatattc gcattgctat ggctgctctg 3780aaggagatta gggaagaaca aatttccatc gactttcaag tcggtcctat cgaagaaagc 3840tacgctctcc tcaacaggta tggtctcctc attgctaggg aagaaatcga taaagtcgat 3900acgttgcact atgcgtggga gaagttgctc gcccgggccg gcgaggtcca aaataaactt 3960gtgtccttgc aaccctcgtt caagaaggag ctgatctcag cagtggaggt cttcttgcaa 4020gactgccatc aattctactt ggattatgat ctgaatggac ctatggcaag cggcctcaag 4080ccccaagagg cgtcagaccg cctgatcatg ttccaaaatc aatttgataa 
catttacaga 4140aagtatatta cgtacaccgg cggagaggaa ctgtttgggc tcccagcaac gcaatatcct 4200caactcctgg aaattaaaaa gcagcttaat cttcttcaga agatttacac tttgtacaac 4260tcagtcatcg agaccgtgaa ttcatattat gatatccttt ggtcggaagt gaatatcgag 4320aagatcaata atgaactcct tgaattccag aatcgctgta ggaaattgcc cagagcactg 4380aaagattggc aagccttctt ggacctcaaa aagatcattg acgacttctc cgagtgttgt 4440ccacttctcg agtacatggc ctcgaaggct atgatggaac gccattggga gcggatcacc 4500acgctcacgg gacactcgct tgacgtcgga aatgagtcct tcaaattgag gaatattatg 4560gaggcgcccc tgcttaagta taaggaggag attgaggaca tctgtatttc ggctgtgaag 4620gaaagggaca ttgaacagaa attgaagcag gtgattaatg aatgggacaa taaaaccttt 4680accttcggtt cctttaaaac tcgcggagag cttctgttgc ggggagacag cacttcggaa 4740attatcgcca acatggaaga ttcacttatg ctcctggggt cgctgctctc gaatagatat 4800aatatgccct tcaaagccca aattcagaaa tgggtgcagt atttgagcaa ctcgactgac 4860atcattgaat cctggatgac tgtgcagaac ttgtggattt acttggaggc agtcttcgtg 4920ggaggcgata tcgcaaaaca acttccgaag gaagctaaaa gattctccaa tatcgacaaa 4980tcctgggtga aaatcatgac tcgggcacat gaagtcccat cggtcgtgca atgctgtgtc 5040ggtgacgaaa ctcttgggca actgctgccg cacctgttgg atcagctcga gatctgtcaa 5100aagtcattga cgggatacct ggaaaagaag cgcctgtgtt ttcctcgctt cttctttgtg 5160tccgaccccg cgctgttgga gatcttgggc caggcttccg actcgcacac tattcaggcc 5220catctcctta atgtgttcga caatattaaa tccgtgaaat ttcatgaaaa gatttatgat 5280cgcattctgt cgatctcctc acaggaagga gagacgattg aacttgacaa gcctgtcatg 5340gccgaaggga atgtcgaggt ctggttgaat tccctcttgg aagagtcgca gagctcgttg 5400cacctcgtca ttaggcaggc cgcagctaac atccaagaga ccggatttca gcttacggaa 5460ttcctttcga gctttccggc tcaagtcggt ctgctcggca tccagatgat ttggacgcgg 5520gacagcgaag aggccctcag aaacgcgaaa tttgacaaaa aaatcatgca gaagactaat 5580caggcatttt tggagttgct gaatacgctg attgacgtga ctacgaggga tttgtcaagc 5640acggagcgcg tgaagtatga gactttgatc actatccatg tgcaccaaag agatattttc 5700gacgacttgt gccatatgca tattaaaagc ccgatggact ttgagtggct gaaacaatgt 5760agattttact tcaatgagga ctcagacaag atgatgattc acatcaccga cgtcgcgttt 5820atctaccaaa acgaattttt 
ggggtgcact gatagactcg tgattacgcc cctcactgat 5880aggtgttata tcaccttggc ccaggcgctt ggtatgagca tgggcggggc gccagcgggc 5940ccggcaggaa ccggtaaaac ggaaactact aaagatatgg ggcgctgtct gggcaagtat 6000gtcgtggtct tcaattgtag cgatcagatg gattttcggg gcctcggacg catttttaag 6060ggccttgccc aatccggctc ctggggatgt tttgacgagt tcaatcggat tgacttgccg 6120gtcttgtccg tcgccgccca acaaatctcc atcatcctga cttgcaagaa agagcacaag 6180aagtcgttca tctttaccga cggtgacaat gtgactatga atcctgagtt tggtctcttt 6240ctcacgatga atccgggata tgcgggaagg caggaactgc ctgaaaatct caaaatcaat 6300ttcaggtcag tggctatgat ggtgcccgat cgccaaatca tcattcgcgt caaactggcc 6360tcgtgcggat ttatcgataa tgtcgtgctg gcgaggaaat ttttcactct ctacaaactc 6420tgtgaagaac aacttagcaa acaggtccac tacgatttcg gactccggaa catccttagc 6480gtcttgagaa ccctcggggc tgccaaacgg gcgaacccaa tggatactga gagcaccatt 6540gtcatgagag tgttgagaga tatgaacctc tccaagctga tcgatgaaga tgaacccctt 6600ttcctgagct tgatcgaaga tctctttccc aatatcctgc ttgacaaagc gggttatccc 6660gaactcgaag ctgctattag caggcaggtg gaggaagcag gactcatcaa ccatcctccg 6720tggaaactga aagtcattca gctgttcgaa acccaaaggg tgcggcatgg tatgatgacg 6780ctggggcctt ccggtgccgg gaaaacgacc tgcatccata ctcttatgag agccatgacg 6840gattgcggca aacctcatcg cgaaatgagg atgaatccga aagcgattac cgccccacag 6900atgtttggac ggttggatgt ggcgacgaac gactggactg acgggatttt ctcgacgttg 6960tggcggaaga cgcttcgggc gaaaaagggg gagcatattt ggatcattct cgatggtccc 7020gtggatgcca tttggatcga aaatttgaac tccgtgctcg acgataataa gactcttacg 7080ttggcaaatg gtgacagaat tccgatggca ccaaactgca agattatctt cgaaccacac 7140aatatcgaca atgcgtcccc cgccaccgtc tcccgcaacg ggatggtctt tatgtcatcg 7200agcattttgg
actggtcgcc aattctcgaa gggttcctga aaaaacgctc gccgcaggag 7260gcggaaattc tgaggcaact ctatacggaa tcattcccag atctctatcg cttctgcatc 7320caaaacctgg aatataaaat ggaagtcctc gaggcttttg tcattacgca atccattaac 7380atgctccagg ggctcattcc attgaaagag caaggaggag aggtgagcca agcacacctg 7440ggcaggcttt ttgtgttcgc actcttgtgg tcggcggggg ccgctctgga acttgatggg 7500agacgcaggc tggaattgtg gcttcggtcg cggccaaccg gtacgttgga actcccacct 7560cccgcaggcc caggggacac cgcttttgat tattatgtcg ccccagatgg cacctggacc 7620cactggaaca ccagaacgca agaatacctt tatccgtcgg acaccactcc agagtatggg 7680tccatccttg tgccgaacgt cgataatgtg agaacggatt ttcttatcca gaccattgcc 7740aagcaaggaa aagcggtcct tctgatcgga gaacaaggga ccgctaaaac tgtgatcatc 7800aaaggtttta tgtcgaagta tgatcccgaa tgtcacatga tcaaatcatt gaattttagc 7860agcgcgacga ccccacttat gttccaaaga acgattgagt catatgtgga taaaagaatg 7920ggtacgacct atgggccccc agcgggtaaa aagatgaccg tctttattga tgacgtcaat 7980atgccgatta ttaatgagtg gggcgaccag gtcacgaacg agattgtccg gcagctcatg 8040gagcaaaacg gcttctacaa tctcgaaaaa cccggagagt tcacgtcaat tgtggatatt 8100cagttcctgg cagccatgat ccacccaggt gggggtagaa atgatatccc ccaaaggttg 8160aagagacaat tttcgatttt taattgcacc ctccccagcg aagcctccgt ggataaaatt 8220ttcggcgtca ttggagtggg gcactattgc acccaaagag gtttttccga agaagtccgc 8280gattcggtca cgaaactcgt gcctctcacg cgccggcttt ggcagatgac taaaattaag 8340atgctcccca cgccagctaa attccactac gtcttcaact tgagggacct gtcccgcgtg 8400tggcagggca tgctcaacac tacctcggag gtgatcaaag agcccaatga tttgctgaaa 8460ctctggaaac acgagtgcaa gagggtgatc gctgatcgct tcactgtctc gagcgacgtc 8520acttggtttg acaaggcgct tgtcagcctg gtggaagagg aatttggtga agaaaaaaag 8580ctcttggtcg actgtgggat cgatacttac tttgtcgatt ttctgaggga tgcgcccgaa 8640gcggcggggg agacctccga agaagctgat gccgagactc cgaaaatcta cgagccgatt 8700gaatcctttt cacatctcaa agagagactt aacatgttct tgcaactgta taacgaatca 8760atccgggggg ccggaatgga catggtgttc ttcgctgacg caatggtcca tctggtcaaa 8820atctcgcgcg tgattcgcac ccctcaaggg aatgctcttc tggtgggtgt gggagggtcc 8880ggcaagcaaa gcctgacccg gcttgcctcc tttatcgccg 
gctacgtctc gtttcaaatt 8940acgcttacgc gctcctataa cacgtcaaac cttatggaag atctcaaagt gttgtacaga 9000actgctggac aacagggaaa gggaatcact tttatcttca ccgacaacga aatcaaggat 9060gagagctttc tcgagtatat gaacaatgtg cttagcagcg gagaggtctc aaatttgttt 9120gcgagagacg aaatcgatga gattaatagc gatctggcaa gcgtcatgaa aaaggagttt 9180cctcggtgtt tgccgactaa tgaaaatttg cacgattatt tcatgtcacg cgtgcggcag 9240aacctgcaca tcgtcctgtg cttctcacct gtgggtgaaa agtttaggaa ccgggcactc 9300aagtttccgg cactgatcag cggctgtact attgattggt ttagccgctg gccaaaggat 9360gccttggtgg cggtctcaga gcattttctc acctcctacg atattgattg cagcctcgag 9420attaagaaag aagtcgtcca atgcatgggt tcgttccaag acggggtcgc cgaaaagtgt 9480gtcgactatt tccagagatt caggcggagc actcatgtca ctccgaagtc ctatttgtcg 9540ttcatccagg gatacaaatt tatttacgga gaaaagcatg tggaggtgag aactttggca 9600aataggatga acacggggct tgagaaattg aaggaggcta gcgaatcagt ggccgcactc 9660tcaaaagagt tggaagctaa ggagaaggag ttgcaggtgg ctaatgataa ggctgatatg 9720gtccttaaag aggtcacgat gaaggcgcaa gccgcagaaa aagtcaaagc ggaagtgcaa 9780aaagtgaaag atcgggctca ggctattgtc gacagcatct ccaaggataa agccattgcc 9840gaggagaagc tcgaagctgc taagcctgct cttgaggaag ctgaggcagc actccagacc 9900atcaggccgt ccgacatcgc aaccgtgagg acgctgggaa ggcctccgca ccttatcatg 9960agaatcatgg attgcgtcct gctgctcttc caacggaaag tctccgcggt caagattgat 10020ttggagaaat cgtgtaccat gccctcatgg caggaatcct tgaagttgat gaccgcaggc 10080aactttctgc aaaacctgca gcaatttcca aaggacacca tcaacgaaga ggtcatcgag 10140ttcctcagcc cctattttga gatgccggac tataatatcg aaacggcaaa acgcgtgtgt 10200ggcaacgtcg caggcttgtg ctcctggacc aaagctatgg cttcgttctt ctcaattaac 10260aaagaggtgc tgccgttgaa agcaaacctc gtggtgcagg agaatagaca tcttctggca 10320atgcaggact tgcaaaaagc tcaagctgag ctggatgata agcaagcaga gcttgatgtg 10380gtccaggctg agtacgagca ggctatgact gaaaagcaaa cgcttctgga ggacgcagaa 10440cgctgcagac acaagatgca gactgcttcc accttgattt cagggctggc tggagagaaa 10500gaacggtgga cggaacagtc acaagagttt gccgcacaaa ctaaaaggtt ggtcggtgac 10560gtcttgcttg cgaccgcgtt tctttcgtac tcagggccat tcaaccaaga gtttcgggat 
10620ttgctgctca atgattggag gaaggaaatg aaagcgcgca agattccatt cggaaagaat 10680ctcaacctct cggagatgct tatcgacgcc ccgaccattt cagagtggaa cctccaaggg 10740ctgccgaatg atgatctctc catccagaac gggatcattg tcacgaaagc ctcacgctac 10800cctctgctca tcgatccgca gactcagggg aagatctgga ttaagaataa ggagtcgagg 10860aacgaactgc agatcactag cttgaaccac aaatacttta gaaaccacct tgaggattca 10920ctcagcctcg gtaggccgct gctcattgag gatgtgggcg aggaactgga cccagccctt 10980gataacgtcc tggagcggaa cttcatcaaa accgggtcga cgtttaaagt gaaggtcggc 11040gacaaagagg tcgacgtgct ggatggattc aggctttaca tcactacgaa actgcctaac 11100ccagcgtaca ctccggagat ttccgcccgg acctcgatta tcgacttcac cgtcaccatg 11160aaagggcttg aggaccagct ccttggacgc gtgatcctca ctgaaaaaca agaactcgaa 11220aaggaacgga cccacctgat ggaggatgtc accgccaata aaagaaggat gaaagaactt 11280gaagataatc ttctttatcg cctcacgagc actcagggct cgttggtgga agatgaatcc 11340cttatcgtcg tcttgtcgaa caccaaacgc actgcggaag aggtcaccca gaagctggaa 11400atttcggcag aaacggaggt ccaaattaat tccgcaaggg aggagtacag gccggtggcg 11460acccgcgggt cgattcttta ttttctcatt acggagatga ggttggtcaa cgaaatgtat 11520caaacgtccc tcaggcagtt tcttggcttg ttcgacctgt cacttgctcg ctcggtgaag 11580tcgccgatta cgtcgaagag aatcgcaaat atcattgagc atatgaccta tgaggtctat 11640aaatatgccg cccggggcct gtatgaggaa cataaatttt tgtttaccct gttgcttacg 11700ttgaaaatcg acatccaacg gaacagggtc aaacacgagg aatttttgac gctgatcaag 11760ggtggtgcct ccctggatct gaaggcctgc ccccccaagc ctagcaaatg gattcttgat 11820atcacttggc tcaacctggt cgagctgtca aaattgcgcc agttttcgga tgtgctcgac 11880cagatctcaa ggaacgaaaa gatgtggaag atctggttcg ataaagagaa cccggaggag 11940gaaccactgc ctaatgctta tgacaaatca ctggactgtt ttcggagatt gcttctgatc 12000cggtcctggt gcccggatcg cactatcgcc caggcaagaa agtatatcgt cgattcgatg 12060ggagagaaat atgcagaagg tgtgatcctc gatcttgaaa agacgtggga agagtcagat 12120cctcgcactc ctcttatttg cctcctttcg atgggatcag acccaacgga cagcattatt 12180gcgctgggca agcggttgaa aatcgaaacg cggtatgtgt caatgggtca ggggcaggaa 12240gtgcatgcac gcaaacttct ccaacaaact atggccaatg ggggctgggc gttgctgcaa 
12300aactgccacc ttggcttgga ctttatggac gaacttatgg atattattat tgagaccgaa 12360ctggtccatg atgctttcag attgtggatg actacggagg ctcacaagca atttccgatc 12420actttgctcc aaatgtcgat caaatttgcc aatgaccccc ctcaaggact gcgcgccgga 12480ttgaaaagaa cttactccgg ggtgtcgcag gatttgcttg acgtctcaag cgggtcccag 12540tggaagccca tgctttatgc tgtggccttc ttgcattcaa ctgtccaaga acgccggaag 12600ttcggtgcac ttggctggaa cattccctat gaattcaacc aggctgattt caatgccacg 12660gtccagttca tccaaaacca cctggatgat atggatgtca agaagggtgt ctcctggact 12720actatccggt acatgatcgg tgaaattcaa tatggcggta gggtcacgga cgactacgat 12780aaaaggctct tgaatacttt cgcgaaggtc tggttctcgg aaaacatgtt tggtccagac 12840tttagctttt accagggata caacatccct aaatgtagca cggtggacaa ttatttgcag 12900tacattcaaa gccttcctgc ttatgactcg ccggaggtct tcggtttgca tccaaacgcg 12960gacattacgt atcagtcgaa acttgcaaaa gatgtgctcg acacgatcct ggggattcag 13020ccgaaagata cgagcggcgg aggggacgaa actcgggaag ccgtcgtggc taggctggcg 13080gatgacatgt tggaaaagct cccccccgac tatgtgcctt tcgaagtcaa agagaggctt 13140caaaagatgg ggccgttcca acctatgaac attttccttc ggcaggagat cgatcgcatg 13200caacgcgtcc tgagccttgt ccgcagcacc ctcactgaac tcaagctggc cattgatggg 13260accatcatta tgtccgaaaa cctgcgcgat gcgcttgact gcatgtttga cgcgcgcatt 13320ccagcttggt ggaagaaggc ctcctggatc tcatccactt tgggtttctg gtttaccgag 13380ctgattgaac gcaactcgca attcacttcg tgggtcttca atgggcgccc gcactgtttc 13440tggatgaccg gtttttttaa tcctcaaggt tttcttactg ctatgcgcca agagatcacc 13500agggccaata aggggtgggc cctcgacaat atggtcctct gcaatgaagt cactaagtgg 13560atgaaggacg atatctcagc acccccaacg gagggtgtct atgtctacgg cttgtacctg 13620gaaggtgcgg gatgggacaa aaggaacatg aaacttatcg aatccaagcc caaggtgctt 13680tttgagttga tgccggtgat ccggatttat gcggaaaata acactctgag agaccctagg 13740ttctactcat gcccaattta caaaaaaccg gtccgcacgg acttgaacta cattgcggcg 13800gtggacctcc gcaccgccca aaccccggaa cattgggtcc tcagaggcgt cgccctgctt 13860tgcgacgtga agtag 13875
<210> SEQ ID NO 13 <211> LENGTH: 13875 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Synthetic polynucleotide <400> SEQUENCE: 13
atgtttcgga ttgggaggag gcaactctgg aaacacagcg tgaccagagt gcttacccag 
60cggctgaagg gagagaaaga agccaaacgc gctttgttgg atgctaggca caactatctc 120tttgccattg tggcctcgtg cctcgatttg aacaaaactg aagtggagga cgccattctg 180gaaggcaacc aaattgaaag aattgatcaa ctctttgctg tgggcggttt gagacacctc 240atgttttatt accaggatgt ggaagaggcc gaaactggac aactgggatc gcttggtggt 300gtcaatcttg tctccggaaa gatcaaaaag ccaaaagtgt tcgtgacgga aggaaacgat 360gtcgccctga ctggggtgtg tgtgttcttt attagaaccg acccctccaa agcgatcacc 420cccgataaca ttcatcagga ggtgtccttt aatatgcttg atgctgccga cggtggtctc 480ttgaacagcg tgagacggtt gctttcagat atcttcatcc ctgcattgag ggcgacttcc 540catggttggg gtgaattgga aggcttgcaa gacgctgcca atatccgcca ggagtttttg 600tcgtccttgg aaggatttgt caacgtcctt agcggagcgc aagagtcgct taaggagaag 660gtcaatttgc gcaaatgcga tatccttgag ctcaaaactc tcaaggaacc cacggattac 720ttgaccttgg caaataaccc cgaaaccttg gggaagattg aggactgcat gaaggtgtgg 780atcaaacaga cggagcaagt cctcgccgaa aataatcagt tgcttaagga ggctgacgat 840gtcggcccca gggccgaact tgagcattgg aagaagcggc tctcgaaatt taactatttg 900ctcgagcaac ttaagtcccc tgacgtcaaa gctgtgctgg cagtgcttgc cgcggccaag 960tcaaaactgc tcaaaacgtg gagagaaatg gatattagaa tcactgacgc cacgaacgag 1020gccaaagaca atgtgaaata cttgtacact cttgagaagt gctgcgatcc actgtactcg 1080tcggatccgt tgagcatgat ggatgcgatc cccactctca tcaatgctat taaaatgatc 1140tatagcattt cccattacta caacacgtcc gaaaagatca cgtccttgtt tgtgaaagtc 1200accaatcaga ttatctcagc ttgtaaagca tacattacta ataatgggac tgcgtcaatc 1260tggaatcagc cacaggacgt ggtggaggaa aagattcttt cggcgatcaa acttaagcaa 1320gagtaccagc tctgcttcca taaaaccaag caaaagctta agcaaaatcc caatgccaag 1380caatttgact tttcagagat gtacattttc ggtaagttcg agactttcca ccggcggctc 1440gctaagatta tcgatatttt cactaccctt aaaacttaca gcgtcctcca agattccacg 1500atcgaaggac tcgaggacat ggccaccaag taccagggta ttgtggcaac tatcaaaaaa 1560aaagaatata actttttgga tcagagaaaa atggattttg atcaagatta tgaggagttc 1620tgtaaacaga ctaacgacct ccacaatgaa ctccggaaat ttatggacgt cacctttgcc 1680aagatccaga acacgaacca agctttgcgg atgcttaaga aatttgaaag acttaacatt 1740ccaaacctcg gcatcgacga taaataccag ctcatccttg 
aaaattatgg ggcggatatt 1800gatatgatta gcaaattgta cactaagcag aagtacgacc cgcctttggc acggaaccaa 1860ccccctatcg cgggcaagat cctttgggcc aggcagttgt ttcatcgcat ccagcagcct 1920atgcagcttt ttcaacaaca tccggcagtc ctgtcaaccg ccgaagcgaa gcctattatt 1980aggagctata accgcatggc caaagtgctg ctcgaatttg aggtcttgtt tcacagagca 2040tggcttagac agattgagga aatccatgtc ggccttgaag catccctgct ggtcaaagct 2100ccaggcacgg gcgaactttt cgtcaatttt gacccccaaa tcctcattct gttcagagag 2160acggagtgta tggcacaaat ggggttggag gtgtccccct tggcaacgtc cctttttcag 2220aaacgggacc gctacaagag gaattttagc aacatgaaga tgatgcttgc ggagtaccag 2280cgggtgaagt ccaaaatccc agctgcaatc gaacaactga tcgtgcccca ccttgccaag 2340gtcgacgagg ccctgcaacc aggcctcgcc gcgttgactt ggacctcgct gaatatcgaa 2400gcctatttgg aaaatacttt tgcgaagatc aaagaccttg agctgttgct tgacagggtc 2460aacgacttga tcgaattccg gattgatgcc atccttgagg aaatgagcag cacccccctc 2520tgtcaattgc ctcaagagga gcctttgacg tgcgaggaat tcctgcaaat gacgaaggat 2580ctttgcgtga acggggcaca aatccttcac ttcaaatcat ccttggtcga ggaggccgtc 2640aacgaactcg tcaatatgct tctcgatgtg gaggtcctta gcgaagaaga gagcgagaaa 2700atttccaatg aaaactccgt gaattacaaa aatgagagct ccgcaaaaag ggaggagggg 2760aacttcgaca ctctgacctc atccatcaac gcaagagcaa atgcattgct gcttactacg 2820gtgactcgca aaaagaaaga gaccgagatg cttggtgagg aggccagaga actcttgtca 2880cactttaatc atcagaacat ggacgccctt ctcaaggtga ccaggaatac cttggaagcg 2940atcagaaaga gaatccacag cagccatacc attaacttta gggattcgaa ttcagcctcc 3000aatatgaagc agaattcgtt gcctatcttc cgggcgtcag tgacccttgc cattccaaac 3060attgtgatgg cccctgcact cgaagacgtc caacaaacgc ttaacaaagc ggtcgaatgt 3120atcatttcgg tgccgaaggg agtgcggcaa tggtcatccg aactgttgtc gaaaaagaag 3180attcaggaaa gaaagatggc cgctctgcaa agcaatgagg attcagattc agatgtcgag 3240atgggagaaa atgagctcca ggatactctg gagattgctt cggtgaactt gcccatcccc 3300gtgcaaacga aaaattacta caaaaatgtc tcggagaaca aagagatcgt caaacttgtg 3360tcggtcctct ccacgattat caacagcacc aaaaaggaag tcattacgtc aatggattgc 3420tttaaaagat ataaccatat ttggcagaag ggcaaggagg aagctatcaa gacctttatt 3480acccaatccc 
ctctcctcag cgagttcgaa agccagattc tctatttcca aaacttggaa 3540caagagatta acgccgagcc agaatacgtc tgtgtggggt cgatcgcgct gtacacggcg 3600gacctcaaat ttgcacttac ggcggagacg aaggcctgga tggtcgtcat cggtaggcat 3660tgtaacaaga aatacagaag cgagatggaa aatatcttca tgcttattga agaattcaat 3720aagaaactga acaggcctat caaagatctt gatgatatca gaatcgcgat ggctgccctg 3780aaagaaattc gggaggaaca aatttcgatc gattttcagg tgggtcctat cgaagaaagc 3840tatgctttgt tgaatagata tgggctcctg atcgcacggg aagaaattga caaagtcgat 3900actctgcatt acgcctggga aaagcttctc gcgagagccg gggaagtcca gaacaaactt 3960gtctcccttc agcctagctt taagaaagag ctgatcagcg ctgtggaagt gtttcttcaa 4020gattgccatc aattctacct cgactacgat ctcaacggtc caatggcctc cggtttgaag 4080ccccaagagg cctccgacag acttatcatg ttccaaaacc agtttgataa tatctacaga 4140aaatatatta cgtacacggg tggcgaggaa ctgttcggtc tcccagcaac ccaataccct 4200cagctgcttg agattaaaaa acaactgaat ttgttgcaga agatttacac gctctataac 4260tcggtgatcg aaacggtcaa cagctattac gatattctct ggtcagaagt gaatatcgag 4320aagatcaata atgaattgct cgaatttcag aatcggtgta gaaaactgcc cagggcactc 4380aaagactggc aagccttcct tgatttgaag aaaattattg atgacttcag cgaatgctgt 4440ccccttctcg agtacatggc ctcgaaggcc atgatggaga ggcactggga acgcattacc 4500actctgactg gccacagcct cgatgtgggt aatgagtcat tcaaattgag aaacatcatg 4560gaggctcccc ttctgaaata taaggaggag atcgaggata tttgtatttc ggctgtgaag 4620gagcgcgata ttgagcagaa attgaagcag gtgattaatg aatgggataa caagaccttc 4680acgtttggtt ccttcaaaac cagaggcgag ctgcttctgc ggggcgactc aacgagcgag 4740attatcgcaa acatggaaga ttccttgatg ctgttggggt cactgctttc aaatcgctat 4800aatatgccgt ttaaggcaca aattcagaaa tgggtgcagt atctttccaa ttccaccgat 4860attattgaat cgtggatgac tgtccaaaac ttgtggatct accttgaagc cgtgttcgtc 4920ggtggggata ttgctaagca gttgccaaaa gaagctaaac gcttttccaa tatcgataaa 4980agctgggtga agatcatgac tagagcacat gaggtgcctt ccgtggtgca gtgttgtgtc 5040ggcgatgaaa cgcttggaca gcttctcccc caccttctcg accaactgga aatctgccaa 5100aaatccttga ccgggtatct tgaaaagaaa agactttgct ttccaagatt ctttttcgtc 5160tcagatcctg cgcttttgga aatcctgggc caggccagcg 
attcccatac gattcaagca 5220cacctcctca atgtgttcga taatatcaaa tcagtcaagt ttcatgagaa aatttacgat 5280cgcatcctgt caatctcctc ccaagaaggt gagaccatcg agttggataa acctgtcatg 5340gcggagggga acgtggaagt gtggttgaac tccttgttgg aagaatccca atcatccctg 5400caccttgtga ttcgccaggc ggcggctaat atccaggaaa cggggttcca gctcaccgag 5460tttctcagct cattccctgc tcaggtcggg ctgctcggca ttcagatgat ttggacgcgg 5520gactcggagg aagccctcag aaatgcgaag tttgacaaaa aaatcatgca aaagaccaat 5580caagcctttc ttgaactgct gaataccctc atcgatgtga ctaccaggga tctgtcgtcg 5640accgaacggg tcaaatacga gacgcttatt actatccacg tccaccaaag ggatatcttc 5700gatgatctct gtcacatgca tatcaaatca ccaatggact ttgaatggct gaagcagtgt 5760cgcttttact ttaacgagga ttcagacaag atgatgattc acattaccga tgtggcattt 5820atttatcaaa atgaattcct gggttgcacg gatcgcctgg tgattacgcc actcacggat 5880cggtgttata tcacgctcgc acaggcattg ggaatgtcaa tggggggggc cccggcaggg 5940ccagctggaa cgggtaaaac cgaaacgact aaggatatgg gtcggtgtct tggaaagtac 6000gtggtcgtgt tcaattgcag cgatcaaatg gacttccggg gattgggaag aattttcaag 6060ggattggccc aatccggatc ctgggggtgt tttgacgaat tcaatagaat tgatcttccg 6120gtcctgtcag tggccgcgca gcagattagc atcatcctta cttgcaaaaa agaacacaag 6180aagagcttca tttttacgga cggagataac gtgactatga atccggagtt cgggctcttc 6240ctgaccatga atccgggcta cgcgggcagg caggagctgc ctgagaatct caaaattaac 6300tttcggtcag tggctatgat ggtccctgat cgccagatta ttatccgcgt caaacttgcg 6360tcgtgcggtt ttatcgacaa tgtcgtgttg gcaaggaaat tctttactct ttataagctc 6420tgtgaggaac aactctccaa gcaagtgcac tacgacttcg ggctccggaa tattctttcc 6480gtccttcgga cgctcggcgc cgcaaaaagg gctaacccca tggacacgga atccacgatc 6540gtcatgaggg tgctgcgcga catgaacctg tcgaagctca tcgacgaaga cgagccgctg 6600tttttgagcc tcatcgaaga cctctttcct aacatccttc ttgacaaggc cgggtaccct 6660gagcttgaag ccgctatttc gcggcaagtg gaggaggcgg ggcttatcaa tcatccgccc 6720tggaagctta aggtcatcca attgtttgag acgcagaggg tccgccatgg aatgatgacg 6780ctgggcccaa gcggtgccgg gaagactacc tgcatccaca ccctcatgag agcgatgacg 6840gactgcggga agccccacag agagatgaga atgaacccta aggcaattac tgcaccccaa 6900atgttcggcc 
gcctggacgt ggctacgaat gattggaccg acggaatctt ctcgaccctc 6960tggaggaaaa cgctcagagc gaagaaggga gagcatatct ggattatcct cgacgggcct 7020gtcgacgcaa tttggattga aaacttgaat tcagtcttgg acgataacaa gacgctgacc 7080ttggcgaacg gcgatcggat ccctatggct cccaactgca aaatcatctt cgaaccccat 7140aatatcgaca atgcgtcccc ggctacggtg tcgaggaacg gtatggtctt catgagctca 7200tccatcctgg attggtcccc gattctggaa gggtttctca aaaagcggtc cccgcaagaa 7260gcagaaattt tgagacaact ttatactgag agctttcccg acttgtatcg cttttgtatc 7320caaaacttgg agtataagat ggaggtcttg gaggcatttg tcatcactca gtccatcaac 7380atgctccagg ggctcattcc gctgaaagag caaggaggtg aggtgtcgca ggcacatctg 7440ggaaggcttt tcgtgtttgc cctgctgtgg agcgcaggag ctgccctgga gttggacggt 7500cggagaagac tggagctctg gctgcgctca agaccgacgg gcaccctgga acttccgcct 7560ccagccgggc cgggcgacac tgcgttcgat tactacgtcg ctccggatgg aacttggacg 7620cactggaata ctcgcactca agagtatctc tatccttcag ataccactcc ggaatacggc 7680tcaattctcg tgccgaacgt cgacaatgtc aggaccgatt tccttatcca aaccattgct 7740aagcaaggga aagccgtcct gcttattggc gagcaaggta ctgctaagac tgtcatcatt 7800aaagggttca tgtcgaaata tgaccctgaa tgccacatga ttaaaagcct caatttcagc 7860tcggccacta cgccgcttat gttccaacgc actatcgagt cgtacgtcga caagagaatg 7920ggtaccactt atggtccacc ggcaggaaaa aaaatgactg tgtttattga tgacgtgaac 7980atgcccatca ttaacgaatg gggtgatcag gtgacgaacg agattgtgcg gcaactcatg 8040gagcaaaacg gcttttataa tttggaaaag ccaggcgagt ttacctcaat cgtggacatc 8100cagttcctcg cagcgatgat tcaccccggg ggcgggcgca atgacatccc acagaggctg 8160aagagacagt tttcaatttt caattgcacg ctgccctcgg aagcaagcgt cgacaaaatt 8220tttggtgtca tcggagtggg tcactactgc actcaacgcg gcttctccga agaagtgaga 8280gattcagtca
ctaagctggt cccactgact cggcggcttt ggcagatgac gaaaattaaa 8340atgctgccta ctcccgcgaa attccactac gtctttaatt tgagggatct ttcccgggtc 8400tggcaaggta tgctcaatac cacttcggag gtcatcaagg agcccaacga tctcttgaaa 8460ttgtggaagc atgaatgcaa gagagtcatc gccgaccggt tcacggtgag cagcgacgtg 8520acttggttcg acaaagcgct tgtctcattg gtggaggagg aatttggcga agagaagaag 8580ttgttggtgg actgtggaat cgatacttac ttcgtggatt ttcttcgcga tgcaccggaa 8640gctgcgggag aaacgtcgga agaagcagac gccgaaacgc ctaaaatcta cgaaccaatt 8700gagtcatttt cccaccttaa agaacggctg aatatgtttc tgcaacttta caacgaatca 8760attcgcggtg cagggatgga catggtcttc tttgccgacg caatggtcca tctcgtgaaa 8820atttcgagag tgattaggac gcctcagggt aatgcactcc ttgtcggggt gggcggctcc 8880ggaaaacaat cattgacgcg gcttgcttca tttattgcag ggtacgtctc atttcagatt 8940acgcttacca gatcgtataa cacctccaat ctcatggagg accttaaagt gttgtatcgg 9000actgctgggc agcaggggaa ggggattacc ttcattttca ctgataatga aattaaagat 9060gaaagctttc tggaatatat gaataatgtg ctttcatcgg gggaggtctc aaatcttttc 9120gccagggatg aaattgacga aatcaacagc gaccttgcct ccgtgatgaa gaaagaattc 9180cctcggtgcc tccctactaa cgagaatctc cacgattatt tcatgtccag agtgcgccaa 9240aatctccata tcgtcctgtg tttttcgcca gtcggtgaaa agtttagaaa tagagctctt 9300aaatttcccg cactcatcag cggctgtacg attgattggt tttcacgctg gcccaaagac 9360gcgcttgtcg ccgtgtccga gcacttcctg actagctacg acattgactg ctcactggag 9420attaaaaaag aagtggtcca atgtatgggt tcgtttcaag atggagtggc cgaaaaatgc 9480gtggattact tccagagatt cagacggtcc acgcacgtga cgccgaaatc atacctgtca 9540tttatccagg ggtacaagtt tatttacggg gaaaagcatg tggaggtccg cacgcttgct 9600aacaggatga atacgggcct ggagaagctt aaagaagctt ccgaatcggt cgcggcactg 9660tcaaaggagc ttgaagccaa agagaaggag ctccaagtcg cgaatgataa agctgacatg 9720gtcctgaagg aggtcacgat gaaggctcaa gcggcagaaa aggtgaaggc cgaggtgcag 9780aaagtcaagg atcgcgcaca ggctatcgtc gattcaattt caaaggataa agctatcgcg 9840gaagaaaagc tggaagctgc aaagccggcc cttgaagagg cggaggccgc tctgcaaact 9900attcgcccgt ccgatatcgc tacggtgagg actctcggac gcccaccaca tctcatcatg 9960agaattatgg attgcgtgct gcttcttttc cagagaaaag 
tcagcgcagt caagatcgat 10020ctggagaaat catgtactat gccgtcatgg caggagagcc tgaagctgat gacggcagga 10080aacttcttgc aaaacttgca acagtttccg aaagacacca tcaacgagga agtgattgag 10140ttcttgtcgc cttactttga gatgccagat tacaatattg agaccgcaaa aagagtctgt 10200ggcaacgtgg ccgggctttg tagctggact aaagcaatgg cctccttttt ctccatcaac 10260aaagaggtcc tgcctcttaa ggcgaacctg gtggtgcagg agaataggca tcttctggca 10320atgcaagacc tccaaaaggc ccaggctgag ttggatgaca agcaggccga actggacgtc 10380gtccaggctg aatatgaaca ggcaatgacg gagaaacaaa ctctcctgga agatgccgaa 10440cggtgcaggc ataagatgca aactgcttcc actcttatca gcggattggc gggcgaaaag 10500gagagatgga cggagcaatc acaagaattc gcggctcaga ccaagcggct ggtcggtgac 10560gtcctcctcg ccaccgcctt cttgtcgtac tcgggaccct tcaaccagga atttagagat 10620cttcttttga atgactggag aaaggaaatg aaggctagaa aaatcccatt tggcaagaac 10680ctcaacctct cggagatgct gattgacgct cccactattt ccgaatggaa ccttcaaggt 10740ctgccaaacg atgacctcag cattcagaat ggaattattg tgactaaggc atcaaggtat 10800ccattgctta ttgacccgca gacgcaaggc aaaatttgga tcaaaaataa ggagtccaga 10860aacgagctgc agattactag ccttaatcac aaatacttta gaaaccattt ggaggactcc 10920ttgtcactgg ggcgcccgtt gcttatcgag gacgtggggg aggagctgga cccggcgctg 10980gacaacgtcc ttgaaaggaa tttcatcaag actggaagca cgtttaaagt caaggtcggg 11040gataaggagg tggatgtcct ggatggattt cggctctata ttacgactaa actccccaac 11100ccggcgtata ctccggagat ctcagcgcgc acgtcgatca ttgacttcac tgtcactatg 11160aaaggtcttg aggatcagct tctgggaagg gtgatcctta ctgaaaagca agagctggag 11220aaagagagga cgcatcttat ggaggatgtg actgcgaata aaaggcggat gaaagagctc 11280gaagataacc tgttgtaccg cctgacgagc acccagggat cattggtgga agatgaatca 11340ctgatcgtgg tcctgtcaaa caccaaaagg accgcagagg aggtgaccca aaagctggag 11400atttcggctg agaccgaagt ccaaattaac tcggccagag aagaatacag gcctgtcgct 11460actcggggaa gcattctgta ttttcttatt accgagatgc gccttgtgaa tgagatgtat 11520caaacttcac tgaggcaatt tctcggactc ttcgatctgt cgcttgcgag atcagtgaaa 11580tcccctatta cttcaaaacg gattgctaat attattgagc acatgacgta tgaagtctac 11640aaatacgctg caagaggctt gtacgaggag cacaagtttc tgttcactct 
cctcttgacg 11700cttaagattg acatccagcg gaacagagtg aagcatgagg agtttctgac gcttattaag 11760ggaggtgctt cactggacct gaaagcatgt cctccgaagc cttccaagtg gattctggac 11820atcacgtggc ttaatctggt ggagctctca aaactcagac aattttcaga cgtgttggat 11880caaatttcaa ggaacgagaa aatgtggaag atttggttcg acaaagaaaa cccggaagag 11940gagccattgc cgaacgctta tgacaaatcc ctggactgtt tcagaaggct gttgctcatt 12000cggagctggt gtcctgatcg caccatcgca caggcgagaa aatatatcgt ggactcaatg 12060ggtgagaagt acgcagaggg ggtcatcctg gatttggaaa agacttggga ggaatcagat 12120ccgcgcactc ccctcatttg cctcctgtcc atgggaagcg atcccactga ttcgatcatt 12180gcgcttggca aacggcttaa gattgaaacg aggtatgtct cgatgggtca aggacaggag 12240gtccacgcta ggaaattgct tcagcaaacg atggccaatg gtgggtgggc gcttctccag 12300aactgccatc tcggtcttga cttcatggac gaactgatgg atatcattat tgaaaccgag 12360ttggtccatg acgcttttag gctgtggatg actaccgagg cccataaaca gttcccaatt 12420acgttgctcc aaatgtccat taagtttgcg aacgaccctc cgcaaggttt gcgcgcgggc 12480cttaaacgga cttactccgg agtcagccag gacttgctgg acgtcagcag cggatcacaa 12540tggaagccca tgctctatgc agtggccttt ttgcacagca ctgtccagga gagaaggaag 12600tttggagcgc tggggtggaa tatcccctat gaattcaacc aagcggactt taacgctacg 12660gtccagttca tccagaacca ccttgatgac atggatgtca agaaaggcgt ctcgtggacg 12720acgatcaggt acatgatcgg ggaaattcag tacggcggta gggtcaccga cgattacgac 12780aaaaggctct tgaatacttt tgccaaagtg tggttttcag agaacatgtt cgggccggac 12840ttctcctttt accaaggtta taatatcccc aagtgctcga ccgtcgacaa ttatcttcaa 12900tacatccagt cgctgccagc atacgattcg ccagaggtct tcggtctgca tcctaacgcg 12960gacatcacgt accaatccaa gttggcgaaa gatgtcttgg atactatttt gggtatccag 13020cctaaagaca cctcgggggg gggcgatgaa accagggagg cagtggtcgc gaggctcgca 13080gatgatatgc ttgagaagct gccacccgat tacgtgccct tcgaggtgaa ggagaggctc 13140caaaaaatgg gaccattcca gccgatgaat attttcttgc gccaggagat cgacaggatg 13200cagcgcgtgc tgtcactcgt ccgcagcact cttaccgagc tgaagcttgc catcgatggc 13260acgattatta tgtcagagaa ccttagagac gcgctcgact gcatgtttga cgcaagaatt 13320cccgcttggt ggaagaaggc atcatggatt tcctcgacgt tgggcttttg gttcacggag 
13380cttattgaac ggaattccca gttcacttcc tgggtgttca atggcagacc acactgtttt 13440tggatgaccg gcttcttcaa cccgcagggg tttctgacgg cgatgaggca ggaaattact 13500cgggctaata agggttgggc tcttgacaac atggtgctct gcaacgaagt gactaagtgg 13560atgaaggacg atatttcggc ccctcctact gaaggggtgt atgtctatgg actttacttg 13620gaaggggcag gttgggataa gaggaatatg aagcttatcg aatcaaaacc aaaagtgctc 13680tttgagttga tgcctgtgat cagaatctat gcagaaaaca atactctcag ggatcccaga 13740ttctactcct gtcctatcta taagaaaccc gtccgcacgg acttgaacta tatcgccgcg 13800gtcgatctcc gcactgctca gactcccgag cactgggtcc ttcgcggggt cgcgctcctt 13860tgtgacgtga agtag 13875
<210> SEQ ID NO 14 <211> LENGTH: 13875 <212> TYPE: DNA <213> ORGANISM: Artificial Sequence <220> FEATURE: <223> OTHER INFORMATION: Synthetic polynucleotide <400> SEQUENCE: 14
atgttccgga tcggccggag gcaactgtgg aagcattcag tcaccagagt ccttacccaa 60cgcctcaagg gagagaaaga ggccaaacgc gcgctgctgg acgctcgcca taattacctc 120tttgcgattg tcgcctcatg cttggatctc aacaagactg aagtcgaaga cgccatcctt 180gaaggcaacc aaatcgaacg gatcgatcag cttttcgccg tcgggggtct ccggcacctt 240atgttttact accaggacgt ggaagaagct gaaacgggac agctggggtc acttggtggg 300gtgaacctgg tgtccgggaa aattaagaag ccgaaggtgt ttgtcaccga gggcaatgac 360gtggccctta ccggagtgtg cgtgttcttt atcagaacgg acccatcaaa agcaattact 420ccggataaca tccatcaaga ggtcagcttc aacatgcttg atgcggcaga tggtggtttg 480cttaattcag tccgccggtt gctttcggac atcttcattc ctgcccttcg cgcaacctcc 540cacggttggg gagaacttga ggggttgcag gacgccgcta acattcgcca agaatttctt 600tcaagcctcg aaggcttcgt gaatgtgctg tcgggggcac aggagtcgct caaagagaaa 660gtcaatctga gaaaatgtga cattctcgag ctcaagactc tcaaggaacc cactgattat 720ttgacgctcg cgaataatcc agaaaccctc gggaaaatcg aggactgtat gaaagtgtgg 780attaaacaaa cggagcaagt gctcgcagaa aataatcaac tcctcaagga ggcggatgac 840gtcggccctc gcgcggagct tgaacattgg aaaaagaggc tttcgaagtt taattatctg 900ctcgaacagc ttaagtcacc ggatgtgaag gcagtgctgg cggtgcttgc agctgcaaag 960tcgaaattgc tcaagacttg gcgggagatg gacatcagga tcactgacgc aacgaatgag 1020gcgaaggata acgtgaagta cctgtatacc ttggaaaagt gctgtgatcc tctttatagc 1080tccgaccctc tctcgatgat ggatgctatc ccaaccctca ttaatgccat taagatgatt 1140tattcgattt cgcattatta taatacctcg 
gaaaaaatta cctcgttgtt cgtgaaggtc 1200acgaatcaga ttatttcagc atgcaaagct tacattacca ataatggtac cgcttcgatc 1260tggaatcagc cgcaagatgt ggtggaggag aaaattttga gcgcaatcaa gttgaaacag 1320gaataccagt tgtgcttcca caagactaag caaaagttga agcaaaaccc aaacgctaaa 1380caatttgatt ttagcgaaat gtatattttc gggaagtttg aaacctttca taggagactc 1440gcgaagatta tcgacatttt cactacgctt aagacttatt cggtgttgca ggactccact 1500attgagggat tggaggatat ggctactaag taccagggta ttgtcgccac gattaaaaag 1560aaggaataca actttctgga ccaacgcaaa atggactttg accaagacta tgaagagttc 1620tgcaagcaga ccaatgacct tcataatgaa ttgagaaagt tcatggacgt cactttcgcg 1680aaaatccaga acaccaatca ggcccttagg atgctcaaaa aattcgagag acttaacatt 1740cccaatctcg gaatcgacga caaatatcag ctcattctcg agaattatgg cgctgacatc 1800gacatgatct caaaactgta caccaaacaa aagtatgatc cacctctggc taggaaccag 1860cctccaatcg cggggaagat cctttgggcc cggcagcttt tccataggat ccagcagcct 1920atgcaacttt tccagcagca tccagctgtg ttgtccacgg cggaggccaa accgattatc 1980cgctcctata atcggatggc aaaggtcttg ctcgagttcg aagtgctctt ccaccgcgca 2040tggcttcggc aaatcgagga gattcacgtg ggtctcgaag ctagcctttt ggtcaaggct 2100cctggcacgg gggagctttt tgtcaatttt gatcctcaaa ttttgattct gttcagggaa 2160actgaatgta tggcgcagat gggcttggag gtgtcaccct tggcgacgag cctctttcag 2220aagagggaca ggtataagcg caatttctcc aatatgaaaa tgatgctggc tgagtaccag 2280cgggtcaaat caaagatccc cgctgccatt gaacaactta ttgtgccaca tctcgcaaaa 2340gtggacgaag ccctccaacc cggactcgcc gcgctcactt ggacttcgct gaacatcgag 2400gcctacttgg aaaatacctt cgctaagatc aaagacctgg aattgttgct tgacagagtc 2460aacgatctga ttgagttccg gattgacgct atcttggaag agatgtcgtc gacgccgctt 2520tgtcagttgc cccaggagga gccattgact tgtgaagagt tccttcaaat gacgaaggac 2580ttgtgcgtga acggtgccca aatcctccac ttcaagtcgt cattggtcga agaggctgtg 2640aatgaactcg tgaatatgtt gcttgatgtc gaggtgctga gcgaggaaga atccgagaaa 2700atttcgaatg agaattccgt caattataaa aatgaatcca gcgccaaacg ggaggagggg 2760aactttgaca ctctgacgtc gtcgattaac gcacgcgcaa atgccctgtt gctgactacg 2820gtgacgcgca agaaaaagga aaccgagatg ttgggcgagg aggccagaga actccttagc 
2880catttcaacc atcaaaatat ggatgctctc ctcaaagtga cccggaacac cctcgaggct 2940atcagaaagc ggattcactc ctcgcacacc attaatttcc gggactcgaa ctcagcgagc 3000aatatgaaac aaaactcact gccaatcttt cgcgcaagcg tgactctcgc aatcccaaat 3060atcgtgatgg cccccgcctt ggaggatgtc cagcagacgc tgaataaggc tgtggagtgc 3120atcatctcgg tgccaaaagg agtccgccag tggtccagcg agctcttgtc gaagaaaaaa 3180atccaagagc gcaagatggc cgcccttcaa tccaacgagg atagcgattc agacgtcgag 3240atgggcgaga acgagttgca agacaccttg gagattgcat ccgtgaatct cccaattcca 3300gtccagacta agaactacta taaaaatgtg tccgaaaata aagaaatcgt caaactcgtg 3360agcgtgctta gcaccattat caattccact aaaaaagagg tgattacgtc aatggactgt 3420ttcaaacggt acaatcatat ttggcagaag ggcaaggaag aggctattaa aacgttcatt 3480acgcagagcc cgctcctctc agaattcgaa tcccagattt tgtatttcca aaatttggag 3540caggagatca atgcggagcc tgaatatgtg tgcgtgggat ccattgctct ttatactgcg 3600gatctgaagt tcgctctgac ggctgaaact aaggcgtgga tggtggtcat tggcagacat 3660tgcaataaaa agtataggtc cgaaatggaa aatatcttca tgctgatcga agagtttaat 3720aagaaactca atcggccgat taaggatctg gatgatatcc gcattgccat ggcagccttg 3780aaagagatca gagaagagca gatctcgatc gatttccagg tcgggcccat tgaggagagc 3840tacgctctgc tgaatcgcta cggtcttctg attgcgcggg aggaaatcga caaggtggat 3900actctccatt acgcatggga aaaacttttg gcacgcgccg gggaagtcca gaataaattg 3960gtgtcgctgc agccttcgtt caaaaaagag ttgatttccg ccgtggaagt ctttttgcag 4020gattgccatc agttctatct ggactacgac ctcaacggac ctatggcttc gggccttaaa 4080ccccaggaag cgtccgaccg gctgatcatg ttccagaatc agtttgataa tatctatagg 4140aaatacatta cgtacacggg tggggaggag ctgttcggac tgccagcaac ccaataccct 4200caattgttgg agattaagaa gcagttgaat ttgctccaga agatctacac gctgtataac 4260tcggtcattg aaactgtcaa ttcgtactat gacatcttgt ggtcagaagt caacatcgag 4320aagatcaata acgaattgct cgaattccaa aatcgctgta ggaagttgcc aagggctctg 4380aaggattggc aagccttcct ggatcttaag aaaattattg acgacttttc agaatgttgc 4440ccactcctgg aatacatggc gagcaaagct atgatggaaa ggcactggga gcgcattacg 4500actttgaccg ggcactcact tgatgtcggc aatgaatcgt ttaaactgcg gaatattatg 4560gaggcaccgt tgttgaagta caaagaggaa 
attgaagata tctgtatcag cgcggtcaaa 4620gaacgcgata tcgaacagaa actcaaacaa gtgattaatg agtgggacaa taagacgttt 4680acgtttgggt cattcaagac tcggggagaa ctgctcttga ggggagactc cacttccgag 4740atcattgcca acatggaaga ttccttgatg ttgcttggat ccttgctctc gaatcgctat 4800aacatgccgt tcaaagccca gatccaaaag tgggtccagt atctctcgaa cagcaccgac 4860atcatcgaat cgtggatgac ggtccaaaat ttgtggattt acctggaagc cgtcttcgtg 4920ggcggcgata tcgcaaagca gttgcctaaa gaggcaaaaa gattttccaa tatcgacaaa 4980agctgggtca aaattatgac cagagcacac gaagtgcctt ccgtggtcca gtgttgcgtg 5040ggcgacgaaa cgcttggtca actgctcccg caccttttgg atcaactcga gatctgtcag 5100aaatccctga ccggttatct tgagaaaaaa agactttgct ttcctagatt ctttttcgtc 5160tcggacccgg cactgctgga gatcctcggt caggcgtcgg actcccacac tattcaagcc 5220caccttctca acgtcttcga caatatcaaa tcggtcaaat tccatgagaa gatctacgat 5280agaatcctgt cgatctcgtc acaagaaggc gaaactattg aactcgataa gcctgtgatg 5340gcagaaggga acgtcgaagt ctggttgaat agcctcctgg aggagtcaca gtcgagcttg 5400catcttgtca ttcggcaagc cgctgctaat attcaggaga ctggtttcca acttaccgag 5460tttttgtcgt cctttccagc tcaggtgggc ctcctgggca ttcagatgat ttggacgagg 5520gatagcgaag aagcccttcg caacgcaaag ttcgacaaga aaattatgca aaagaccaac 5580caagcttttc tggaactcct caatactctg attgacgtga ccacgcggga tctcagctcc 5640accgaaagag tcaagtacga gacgctcatt accatccacg tccatcaaag ggatattttc 5700gacgacctct gccatatgca catcaaatcg cctatggatt tcgagtggtt gaaacagtgc 5760cggttttact tcaacgagga ctccgacaaa atgatgattc atattaccga tgtggcattc 5820atctatcaaa acgagtttct tggttgtacc gacagactcg tcattacccc tttgacggat 5880aggtgctaca tcacccttgc acaagctctc ggaatgagca tgggaggtgc gcccgccggc 5940ccggcgggaa ctggcaagac cgaaaccact aaggacatgg gcagatgtct gggcaagtat 6000gtggtggtgt tcaactgttc cgaccaaatg gactttaggg gcctgggacg catcttcaaa 6060ggactggcgc agagcggatc atggggctgc tttgacgagt ttaaccgcat cgatctcccg 6120gtgctctccg tggccgccca gcagatttcg attattctca cctgtaagaa ggaacacaag 6180aagtcattca ttttcacgga tggtgataat gtcacgatga accccgaatt tgggctgttt 6240ctcactatga atccagggta tgctggcaga caagaactgc cagagaacct taaaattaat 
6300tttcggtccg tcgcgatgat ggtgcccgac cgccaaatta ttatccgggt gaaactggcg 6360tcatgtggct tcattgataa cgtcgtcttg gcgcggaaat tttttactct ttataagctt 6420tgtgaagaac agctctccaa gcaggtgcac tacgacttcg gtctgcggaa tatcctcagc 6480gtcttgcgca cgcttggcgc cgcgaagagg gctaatccca tggatactga aagcactatc 6540gtgatgcgcg tccttcgcga catgaacctg tcaaagttga tcgatgaaga cgaaccactg 6600tttctgtcgc tgatcgaaga cttgtttccc aatatcctcc tcgacaaggc gggttatcct 6660gaattggagg cagcgatttc ccggcaagtg gaggaagcgg gcctgatcaa ccatcctccg 6720tggaagctga aggtgattca gcttttcgag actcaacggg tgcggcatgg gatgatgacg 6780cttggtccgt ccggggctgg taaaactact tgtattcaca ccctgatgag agccatgacc 6840gattgcggta aacctcaccg cgagatgagg atgaatccaa aggccattac ggcacctcag 6900atgtttgggc gcctggacgt ggccaccaac gattggaccg acggaatttt ttcgaccctg 6960tggcgcaaaa ctcttagagc caagaaagga gagcatattt ggatcattct ggacggtcca 7020gtcgatgcta tttggatcga aaacctgaac tcggtcttgg acgacaataa gacgctcacg 7080cttgctaacg gagaccgcat ccctatggcg cccaattgca aaattatttt cgagccacac 7140aacatcgata acgcatcacc tgccactgtc agccggaatg gcatggtctt tatgtcgagc 7200tcgattctgg actggagccc gatcctggaa ggatttctca aaaagcgcag cccgcaagaa 7260gcagagattc ttaggcaact ctacaccgaa agcttccccg acttgtacag gttttgcatt 7320cagaatcttg aatacaagat ggaggtcctt gaggccttcg tcatcacgca gtcaattaac 7380atgcttcagg gcctgatccc actcaaagaa cagggcggag aggtgtcaca ggcacacctc 7440gggagattgt ttgtgtttgc tcttctttgg tcagcgggcg cggcacttga actggatggg 7500cgccgccgcc ttgagctctg gctgcggtcg agacctaccg gaactcttga actgccgcct 7560cctgctggtc ctggagacac ggcatttgac tactatgtgg ccccagacgg cacttggact 7620cattggaaca ccagaactca agaatatctt tacccatccg atactacgcc agagtatggg 7680agcatccttg tgcctaacgt cgacaatgtc cgcacggatt tccttattca aactattgcg 7740aaacagggga aggccgtcct gcttatcgga gagcaaggga ctgccaagac tgtcatcatc 7800aaggggttta tgagcaaata tgacccagag tgccatatga tcaagtcatt gaatttttca 7860tccgccacta cgcctctcat gtttcaacgc accatcgaat cctacgtcga caaaagaatg 7920ggaactactt atgggccccc tgccgggaag aaaatgactg tgttcatcga tgacgtgaac 7980atgcccatta ttaatgagtg gggagatcag 
gtcaccaatg aaattgtcag acagcttatg 8040gaacagaatg gattctataa tctcgaaaag ccaggggaat ttacctccat tgtcgacatt 8100caattcctcg cagccatgat ccatccagga ggtgggagga acgatattcc gcaacgcctt 8160aagcggcagt tctccatttt caattgcacc ctcccctccg aggctagcgt cgataaaatt 8220tttggggtca tcggtgtcgg gcactactgc actcaaaggg gcttttcgga agaggtgaga 8280gattcagtca cgaaactggt gccgttgact agacgcttgt ggcagatgac caaaatcaaa 8340atgcttccaa ctcccgctaa attccactac gtgtttaatt tgcgggacct ttcgcgcgtg 8400tggcagggta tgctgaacac tacctccgag gtcatcaaag aacccaacga tttgctgaag 8460ctctggaaac atgaatgcaa acgcgtcatt gctgatagat tcacggtcag ctccgatgtc 8520acctggttcg ataaagcgct ggtcagcctg gtcgaggaag agtttgggga agagaagaag 8580ttgttggtcg attgtggtat tgacacctat tttgtcgatt ttctcagaga cgcgcctgaa 8640gctgcgggtg agacctccga ggaggctgat gctgaaactc ccaaaattta tgaaccgatc 8700gagtccttct cccatttgaa ggaacggctg aacatgttcc tgcagctcta taatgagagc 8760atccgcggtg ccggaatgga catggtcttt ttcgcagatg ctatggtgca cttggtgaaa 8820atttcaagag tcattaggac gccccagggt aatgcattgc tggtcggggt gggtggatcc 8880ggtaagcaat ccctgacccg ccttgcatca tttatcgcgg ggtacgtgtc attccagatc 8940actcttacga gaagctataa tacgagcaat cttatggagg acttgaaagt gttgtatagg 9000acggccggtc agcagggcaa gggtatcacg ttcatcttca ccgataatga gatcaaagat 9060gagtccttct tggagtacat gaataacgtg ctcagcagcg gtgaggtgtc gaaccttttc 9120gcaagagacg aaatcgatga aattaactcc gatctggcgt ccgtcatgaa aaaggaattt 9180ccaaggtgcc tcccgacgaa cgagaatctc cacgactatt tcatgagcag agtccggcag 9240aacctgcaca ttgtgttgtg cttctcaccg gtgggcgaaa agtttaggaa ccgcgcgctt 9300aaattccccg ctctcatttc aggttgcact atcgattggt tctcaaggtg gccgaaggat 9360gcactggtcg
cagtgtcgga acactttctc acttcatatg atatcgattg ctcccttgaa 9420atcaagaaag aggtggtgca gtgtatggga tcattccagg acggtgtggc ggagaaatgt 9480gtcgactact ttcaaagatt cagaagaagc acgcacgtga cccctaaatc atacctctca 9540tttattcagg gatacaagtt tatctacggc gaaaaacatg tcgaggtccg gactttggcg 9600aaccgcatga acactggttt ggaaaagttg aaagaagcgt cagaatcggt cgctgcactg 9660tcaaaggaac ttgaagcgaa ggagaaagaa ttgcaggtcg cgaacgacaa ggccgatatg 9720gtcttgaagg aagtcaccat gaaggcacaa gctgccgaaa aggtgaaggc cgaagtgcaa 9780aaagtcaaag atcgggcaca agccattgtg gattcaattt cgaaagacaa ggcgattgct 9840gaagaaaagt tggaggcggc gaagccagcc ctggaagagg ctgaggcggc gcttcaaacc 9900attaggcctt cagatattgc cacggtcaga accctgggca ggccgcctca ccttattatg 9960agaattatgg attgcgtcct tcttttgttc cagcgcaaag tctcagctgt caagattgac 10020ctcgagaaat cgtgtactat gccttcatgg caggaatcgc ttaagcttat gacggctggt 10080aatttccttc aaaatctcca gcagttcccg aaggatacta tcaatgagga agtgattgag 10140ttcctctcac cgtacttcga gatgcccgac tataacattg agacggcaaa gagagtgtgc 10200ggaaacgtcg ccggactgtg ttcgtggacg aaggcaatgg cctcgttctt ttcaatcaat 10260aaagaagtgt tgcctctgaa ggcaaacctc gtggtgcagg aaaacaggca tcttctcgct 10320atgcaggatc ttcaaaaagc acaagctgag ctcgatgaca aacaggctga actcgatgtg 10380gtgcaagccg agtatgagca agcaatgact gaaaaacaaa cgctccttga ggacgcggaa 10440aggtgtagac acaagatgca aactgcatca actttgattt cggggcttgc cggagaaaaa 10500gagagatgga ccgagcaatc acaagagttc gccgctcaaa cgaagaggtt ggtcggtgac 10560gtcttgctcg caaccgcctt cctgtcctat tcaggtccat ttaaccaaga gtttcgggat 10620ctcttgttga acgactggcg caaggaaatg aaggcccgca aaatcccatt cggtaagaat 10680cttaatctga gcgagatgct tatcgacgcg cctaccatta gcgaatggaa tcttcagggt 10740cttccaaatg acgatctgtc gattcagaat gggatcattg tgacgaaagc gagccgctat 10800ccgctgctta tcgacccaca gactcagggt aaaatctgga tcaagaacaa ggaaagccgg 10860aacgaactcc aaatcactag ccttaatcat aagtactttc ggaaccattt ggaggattca 10920ctgtccttgg ggcggccact gttgattgag gatgtgggtg aagagctcga tccggcgctc 10980gataacgtcc tggaaaggaa tttcattaaa accgggagca ccttcaaggt gaaagtcgga 11040gacaaggagg tggacgtctt ggatggattc 
cggctctaca ttacgactaa gctcccaaac 11100cctgcttata ctcccgagat ctcggcccgc acgagcatca ttgatttcac ggtcacgatg 11160aaaggtctcg aggaccagtt gttggggagg gtcattctca cggagaagca agagcttgaa 11220aaggaaagaa cgcatctcat ggaggatgtg accgccaata agcggcggat gaaggagctg 11280gaagacaacc tgttgtatcg gctgacgagc acccagggca gcctggtgga agacgaaagc 11340cttatcgtcg tgctgagcaa cactaagagg actgctgaag aagtcactca gaaacttgaa 11400atctcagctg aaacggaggt gcaaatcaat tccgcgcggg aagagtaccg cccggtcgct 11460acgagaggtt cgattcttta tttcttgatt accgagatgc gcctggtgaa cgagatgtac 11520caaacctccc ttcgccagtt tcttggactc ttcgacttga gcctcgcaag atcggtcaag 11580agccccatca ctagcaaaag aattgctaat attatcgagc acatgactta tgaagtctac 11640aagtatgcag caagaggcct gtatgaagaa cataagtttc tgtttacgtt gttgctgacc 11700ctgaaaattg acattcagag aaatagagtc aagcatgagg agttcctcac gcttattaaa 11760ggtggagcct cacttgactt gaaagcttgt cctccaaagc cctcaaaatg gattctcgac 11820attacctggc tcaatcttgt ggagctttcc aaactcagac agttttcaga cgtccttgac 11880cagatctcgc ggaatgaaaa gatgtggaag atttggttcg acaaggaaaa tcctgaagaa 11940gaaccacttc ctaatgccta tgacaaatcc ctcgactgtt ttagaagatt gctgcttatt 12000cggagctggt gtcctgatcg caccattgct caggcaagga aatatatcgt ggactccatg 12060ggggaaaagt acgcagaagg ggtgattctt gatctggaga agacctggga ggaatcagat 12120cctcggacgc cccttatttg cctgttgtcg atggggagcg atcccactga ctccattatc 12180gcacttggca agagattgaa gattgaaacc agatatgtga gcatgggtca aggtcaagaa 12240gtgcacgcca ggaagttgtt gcaacagacg atggccaatg gagggtgggc actccttcaa 12300aattgccacc tcggcttgga cttcatggac gaacttatgg acattatcat cgaaaccgag 12360ctggtccacg atgcgttcag attgtggatg acgactgaag cacacaaaca gttcccaatc 12420acgctgcttc aaatgtcgat caagtttgcg aatgatccac cacaaggact tcgggccgga 12480ttgaagcgga cttattcggg cgtctcacag gatctcctcg acgtgtcgtc agggagccaa 12540tggaagccga tgctttatgc ggtggcgttc ctgcatagca ccgtgcaaga gaggagaaaa 12600ttcggcgccc ttggatggaa tatcccttac gagtttaacc aggctgattt caatgccacc 12660gtgcaattta ttcaaaacca cttggacgac atggatgtga agaaaggggt ctcgtggact 12720acgattcggt acatgattgg agaaatccag tacggaggaa 
gggtcactga cgattatgac 12780aagcggctgc ttaacacttt tgccaaagtc tggttctcgg aaaatatgtt tgggcccgat 12840ttctcatttt accagggcta taatatcccg aagtgctcga ccgtcgataa ctatctccaa 12900tacatccaat cgttgccagc ctatgacagc ccagaggtgt ttggtcttca cccaaacgct 12960gatattacct accagtccaa actcgctaag gacgtgctcg acaccatcct gggtatccag 13020ccaaaagata cgtcaggagg gggggacgaa acccgcgagg cagtggtggc tcgcctcgct 13080gatgatatgc ttgaaaagtt gccaccagat tatgtgcctt tcgaggtcaa agaaagattg 13140caaaagatgg gcccgttcca accaatgaac atcttcctca gacaggaaat cgacaggatg 13200caacgggtgc tctcgttggt cagaagcacg cttacggagt tgaagttggc aatcgacggg 13260acgatcatta tgtcagaaaa ccttagagac gccttggatt gtatgttcga cgctcgcatt 13320ccggcttggt ggaaaaaagc gtcatggatt agctcaacgc tcgggttttg gtttacggag 13380cttattgaaa ggaattcaca gtttacttcc tgggtgttca atggtcggcc acattgcttt 13440tggatgacgg gtttctttaa ccctcaagga ttcttgactg cgatgaggca ggagatcact 13500cgggcaaata agggttgggc gctggataat atggtccttt gcaacgaagt gactaaatgg 13560atgaaagacg acattagcgc accgcctact gaaggtgtct acgtctacgg cttgtacttg 13620gagggagcag gttgggacaa aaggaatatg aaactgatcg agtccaaacc gaaggtcctc 13680tttgaactga tgccggtgat tcggatttac gctgagaata atacgctccg ggacccaaga 13740ttttatagct gcccgattta caaaaaacct gtcaggacgg atctgaacta cattgcagca 13800gtggatctca gaaccgctca aacgccagag cattgggtgc tgcgcggggt cgcattgttg 13860tgtgatgtga aatag 138751513875DNAArtificial SequenceSynthetic polynucleotide 15atgtttcgga ttggtcggcg ccagctttgg aagcactccg tgactagggt cctgacccag 60agattgaagg gagaaaagga agcgaaacgc gccttgctcg atgcgcgcca caattatctc 120tttgccattg tggcctcatg cctcgatttg aacaagaccg aggtcgagga cgctatcctg 180gaaggtaacc aaattgagag aattgaccaa ttgttcgccg tgggcggact tcggcacctg 240atgttctact accaggatgt cgaagaggcc gaaacgggtc agcttggtag cctgggcgga 300gtgaaccttg tctcagggaa aattaagaaa cccaaagtct ttgtgactga aggaaatgac 360gtggcattga ctggggtgtg tgtctttttt atccggactg atccgtccaa ggctatcact 420cctgacaaca ttcaccagga ggtctcattc aatatgctgg acgctgctga cggcggtctt 480cttaactcag tcaggcgcct cctgtccgat atcttcattc ctgccctgag ggcaacgagc 
540cacggatggg gcgaacttga ggggttgcaa gacgcagcaa atatcaggca ggaatttctg 600tcaagcttgg agggtttcgt gaacgtcttg tcaggagccc aagagtccct caaagaaaaa 660gtgaatctcc ggaaatgcga tatcctggag ctgaagacgc tcaaggagcc cactgactat 720ttgactttgg caaataatcc agagacgctt ggaaaaatcg aggattgtat gaaagtctgg 780attaagcaga cggagcaagt ccttgccgaa aataaccaac tgctgaaaga ggcagacgat 840gtgggtccca gagctgaact cgaacactgg aagaaacgcc tgtcgaagtt taactacctt 900ctggaacagc ttaagtcgcc ggatgtcaaa gcggtgctcg cagtcctcgc ggccgcgaag 960tccaagcttt tgaaaacctg gagggagatg gacatccgca tcacggatgc tactaatgag 1020gccaaagaca acgtgaagta tttgtatact cttgaaaagt gctgcgaccc tttgtactcg 1080tcggacccgc tgtcaatgat ggacgctatc cccacgttga ttaatgcgat taagatgatc 1140tactccattt cgcattacta caatacgtcc gagaagatta cctcactgtt cgtcaaagtg 1200acgaaccaga tcatttcagc gtgtaaagcg tacattacca acaacggcac cgctagcatt 1260tggaatcaac cccaagacgt ggtggaggag aagatccttt ccgcgatcaa gcttaagcag 1320gagtatcaat tgtgctttca caaaactaaa caaaagctta agcagaatcc taacgccaag 1380cagttcgatt tttcagaaat gtacatcttc ggaaagtttg aaacgttcca cagacgcctc 1440gccaaaatca ttgatatctt tacgaccctc aagacctata gcgtgctgca ggattcgacg 1500atcgaggggc ttgaggatat ggctactaag taccaaggaa tcgtcgctac tattaaaaag 1560aaggaatata actttctcga ccaacgcaaa atggacttcg atcaggacta cgaggaattc 1620tgtaaacaga ctaatgatct gcataatgaa cttagaaagt ttatggacgt cactttcgct 1680aaaattcaaa acactaatca agcactcaga atgctgaaaa aatttgaacg gttgaatatc 1740ccaaatctgg gcattgatga taaatatcaa cttatcctcg agaactatgg ggcagacatc 1800gacatgatct ccaagttgta tactaagcaa aaatacgacc ccccccttgc tcgcaaccaa 1860cccccaattg caggaaaaat cctgtgggcc aggcagctct tccaccggat ccaacagcca 1920atgcagctct ttcagcaaca tcccgctgtg cttagcaccg cagaagccaa acctatcatc 1980aggtcataca accgcatggc taaggtgctg ctggaatttg aagtcctttt ccatcgggct 2040tggcttagac aaatcgagga gattcacgtc ggactcgaag cctcccttct ggtgaaagca 2100ccgggaacgg gggagctgtt cgtgaatttt gatccccaaa ttttgatctt gttccgggaa 2160acggagtgca tggctcaaat gggtcttgaa gtctcaccac tcgctaccag cctctttcag 2220aagagagata gatacaaacg gaacttttcc aacatgaaaa 
tgatgctcgc ggagtatcaa 2280agggtgaaga gcaaaatccc tgccgcaatt gaacagttga tcgtcccgca cctggccaag 2340gtggacgaag ccctccagcc cggacttgca gctttgacgt ggacgtcact taatattgag 2400gcgtatctgg aaaatacgtt cgcgaagatc aaggacctgg agctcttgct cgatcgggtc 2460aatgatctca tcgaatttag gattgatgca atcctcgagg agatgtccag cactcctctt 2520tgtcaacttc ctcaggagga gcccttgact tgcgaggagt ttctgcagat gacgaaagat 2580ctctgtgtca acggtgcaca aatcttgcac tttaagagca gccttgtgga agaagcagtc 2640aatgagcttg tcaatatgtt gcttgacgtg gaagtcctct ccgaggaaga gagcgagaag 2700atttccaacg aaaactccgt caattacaag aatgaatcaa gcgctaaacg cgaggaaggg 2760aatttcgaca ctctcacgtc atcgatcaac gctcgggcca acgctcttct cctgaccact 2820gtcacgagaa agaagaaaga aacggagatg ctgggagaag aggccagaga attgctttcg 2880cactttaatc accaaaatat ggatgccctg ctcaaggtca ccaggaacac cctggaggcc 2940attagaaaac ggattcacag ctcacatacc atcaacttta gggattccaa tagcgcttcg 3000aatatgaagc agaactcatt gccaatcttc cgcgcttccg tgaccctggc gattcctaac 3060attgtgatgg ccccggcatt ggaagacgtg cagcagactc tgaataaggc cgtcgaatgt 3120atcatttcgg tccctaaagg ggtgcgccag tggtcgtcag aactcctgag caagaaaaag 3180atccaggaga gaaaaatggc ggccttgcaa tcgaatgagg attcggattc agatgtcgaa 3240atgggtgaga atgaactcca agacacgctt gagatcgcct cggtcaatct cccaatccct 3300gtccaaacta aaaactacta taagaacgtc tccgaaaaca aggaaattgt gaaattggtg 3360tcggtcttga gcactatcat taactcaacg aaaaaagagg tcatcaccag catggattgc 3420tttaaaagat ataatcatat ttggcagaag ggtaaggaag aggccatcaa gacctttatt 3480acccaatcac ctctgctctc ggaattcgag tcgcaaatcc tgtacttcca aaatctggag 3540caagagatta atgctgagcc agaatatgtc tgcgtcggca gcattgcgct gtataccgct 3600gatctgaagt ttgcattgac ggcggaaact aaagcttgga tggtcgtgat tgggcgccat 3660tgtaacaaaa aatataggtc ggaaatggaa aacatcttca tgttgatcga ggagtttaat 3720aagaagctca ataggcccat caaagacctg gacgatatta ggattgccat ggcagccctc 3780aaggaaatta gagaggaaca aatctcaatt gacttccagg tcggacccat tgaagagtca 3840tatgcgctgc ttaatcgcta cggattgttg attgcccggg aagagatcga taaagtggac 3900actcttcatt acgcatggga gaaactcctc gctagggccg gtgaagtgca gaataagctt 3960gtcagcctcc 
aacccagctt caaaaaagaa ctgatctccg cagtcgaggt cttcttgcag 4020gattgccacc agttttactt ggattatgat ctgaatggtc ccatggcatc cggcttgaag 4080cctcaggaag ccagcgatcg cctcattatg ttccagaacc agtttgataa tatttacaga 4140aagtacatta cctataccgg aggcgaagag ctttttggct tgcctgctac ccaatatccg 4200cagttgctcg aaattaaaaa gcagctcaat ctgcttcaga aaatttacac cttgtataac 4260tcagtgattg aaacggtgaa ctcgtactac gatattctct ggagcgaggt gaatattgag 4320aaaatcaaca acgaactcct tgaattccaa aaccggtgta gaaagctgcc cagggcgctg 4380aaagattggc aggccttctt ggacctgaag aaaattattg atgacttctc agaatgctgc 4440cccctcttgg agtatatggc atcaaaagcc atgatggaaa gacattggga acgcattacg 4500acgcttacgg gacactccct tgacgtgggc aatgaatcct tcaagcttcg gaacatcatg 4560gaagctccct tgttgaagta caaagaagag atcgaagata tctgcatttc agccgtcaaa 4620gaaagggaca tcgagcagaa gcttaaacag gtgattaacg aatgggacaa caaaacgttt 4680actttcggga gcttcaagac caggggcgag ctgcttcttc gcggtgactc aacttcagag 4740attatcgcaa atatggaaga cagccttatg cttctcgggt cactcctctc gaatcggtat 4800aacatgccct ttaaagccca aattcaaaag tgggtccaat acctttccaa ctcgactgat 4860attattgagt cctggatgac cgtccaaaac ttgtggattt acctcgaagc cgtcttcgtc 4920ggtggagaca tcgccaagca actccccaag gaggccaaga ggttctcaaa cattgataag 4980tcctgggtca aaatcatgac tcgggcacac gaggtcccgt cagtggtcca gtgctgtgtc 5040ggcgacgaaa ctcttgggca gctccttccg caccttctcg atcaattgga gatttgtcaa 5100aaatccttga ctggctacct tgaaaagaaa cggttgtgtt ttccacgctt tttctttgtc 5160agcgatcctg cgttgctgga aatcttgggt caggcgtcgg actcgcacac cattcaggca 5220cacctgctta atgtctttga taatattaaa agcgtgaaat tccatgagaa gatctatgac 5280agaattttga gcatttcgtc acaagaaggt gagactattg aactcgataa accagtcatg 5340gctgaaggta acgtcgaggt ctggcttaat tcactcttgg aagagtcgca gtccagcttg 5400cacctggtca ttagacaagc ggcggcaaac attcaagaga ctggttttca acttaccgag 5460tttttgtcct ccttccccgc tcaagtgggt ctgttgggga ttcagatgat ttggactagg 5520gatagcgaag aggcgcttcg gaatgcgaaa tttgataaaa aaatcatgca gaaaactaac 5580caagcgtttc tcgaacttct taataccctg atcgacgtca cgaccagaga cctgagctcg 5640actgaaagag tcaagtacga aaccctgatt acgattcacg 
tccaccagag ggatatcttt 5700gatgacttgt gccacatgca tatcaaatca ccgatggatt ttgaatggct caagcagtgc 5760agattctatt tcaatgagga tagcgataaa atgatgatcc atatcactga cgtggcgttc 5820atctatcaga acgagtttct tggctgcact gatagactgg tcatcactcc tctgactgat 5880cgctgttaca ttacccttgc ccaagccctt ggtatgtcca tgggcggtgc tccagctggg 5940cccgcgggca cgggaaagac cgaaactacg aaggacatgg gcaggtgtct gggaaagtat 6000gtcgtcgtgt tcaattgctc cgatcagatg gacttccggg gcctcggcag aatttttaag 6060ggcctggctc agtcgggttc atgggggtgc ttcgacgaat tcaatcggat tgacttgcca 6120gtgttgtcag tggcggcgca acagattagc attattttga cgtgtaaaaa agagcacaag 6180aaatcgttta tcttcactga cggcgataac gtcacgatga atcccgaatt tggcttgttc 6240ttgactatga atcccgggta cgcaggtcgc caggaacttc cggaaaatct taaaattaat 6300tttaggagcg tcgcgatgat ggtgccagat cgccagatca ttatccgggt caaattggcc 6360tcgtgtggat tcattgataa cgtggtgctg gcgagaaaat tttttaccct ttataaattg 6420tgcgaggaac agctctcgaa gcaggtgcac tacgacttcg gactccgcaa tattctttca 6480gtcttgagaa ctctgggtgc agctaaaagg gcaaacccga tggacacgga aagcactatt 6540gtcatgaggg tgcttcggga tatgaacctt tccaaattga ttgacgaaga cgaacctttg 6600tttctcagcc tgattgaaga tctcttcccg aatattctcc tggacaaggc cggctaccct 6660gaattggagg cagccatttc ccggcaggtc gaagaagccg gcttgatcaa ccatcctcca 6720tggaagctca aagtgatcca gctcttcgag actcaacggg tgcggcacgg tatgatgacg 6780cttggtccgt cgggtgccgg taaaaccact tgcatccaca ctctcatgag ggcgatgact 6840gactgtggca aaccgcaccg cgaaatgcgc atgaatccga aagcgattac tgctccccaa 6900atgttcggac ggttggatgt cgcgactaac gactggaccg atggcatctt ctcgacgttg 6960tggcgcaaaa cgctgcgcgc caaaaagggc gagcacattt ggatcattct tgacggtcct 7020gtggacgcga tttggattga gaatctgaat tccgtgcttg atgacaacaa aaccttgacc 7080cttgccaacg gtgataggat ccccatggcc ccgaattgca aaattatttt cgaacctcac 7140aatattgata atgcgtcacc cgctacggtc tcccgcaacg gcatggtctt tatgagctcg 7200tcaatcctgg attggtcacc tatcctcgaa ggattcttga agaagcgctc cccccaagag 7260gcagagattc tccggcaact ctataccgaa agcttccccg acctgtatcg cttttgtatt 7320cagaacttgg agtacaaaat ggaagtcctt gaggcctttg tcattacgca atcgattaac 7380atgcttcaag 
gccttattcc tttgaaagag caaggaggag aggtgtccca agcacatctg 7440gggaggttgt ttgtgtttgc gcttctttgg tcagcgggcg ccgcgttgga actggacggc 7500cgcagacggc tcgagttgtg gctccgctcg agacctactg gtacgttgga actcccgcct 7560cccgcaggtc ctggggacac cgcgtttgat tattacgtgg ctcccgacgg tacctggact 7620cactggaaca ctaggaccca agagtacctc tacccttcag acacgacccc tgaatacggg 7680tcgattcttg tgcccaatgt cgacaatgtc aggaccgact ttctcatcca gacgattgca 7740aaacagggta aagcagtcct gctgatcggg gaacagggca cggcaaaaac tgtgatcatt 7800aagggtttca tgtccaagta tgatccagag tgtcacatga ttaagtcatt gaacttttca 7860tcggcgacta cgccgctcat gttccaaaga acgatcgagt cgtacgtgga caaacgcatg 7920ggaaccactt atggcccacc agccgggaaa aagatgactg tgtttattga cgacgtgaat 7980atgcctatta ttaatgagtg gggcgatcag gtcaccaacg aaattgtgag gcaacttatg 8040gagcagaatg ggttttataa cctcgaaaag ccaggggaat tcacttccat cgtcgacatc 8100caattcctcg ctgccatgat tcatccaggg ggaggtagaa acgatatccc ccagcggctg 8160aaacggcaat tttcaatctt taattgcacg ctgcccagcg aggcctcagt ggacaagatc 8220tttggggtca ttggggtcgg acactactgc acgcaacgcg gattttcaga agaagtccgg 8280gactcggtga ccaagctcgt cccgcttacc cgccggcttt ggcaaatgac caaaatcaag 8340atgcttccaa cccccgcaaa gtttcactac gtcttcaatc ttcgcgatct ttcacgggtg 8400tggcaaggga tgctgaacac tacgtcggag gtgattaagg agcccaacga cttgctcaaa 8460ttgtggaaac atgaatgtaa gcgggtgatt gcagacaggt ttactgtgtc cagcgacgtc 8520acgtggttcg ataaggctct cgtctccctg gtcgaagaag agttcggcga ggaaaaaaaa 8580ctgcttgtcg attgcgggat tgatacctat ttcgtggact tcctgcggga tgccccagag 8640gcggcgggag agacttccga agaggctgac gcggagaccc ctaagatcta tgaacctatc 8700gaaagctttt cacacctcaa ggaacgcttg aacatgtttc tccaactgta caatgaatcg 8760attagaggcg ccggcatgga catggtgttc ttcgccgacg cgatggtcca tctcgtgaaa 8820atctcgcggg tcattcgcac ccctcaaggt aacgcgcttt tggtcggtgt cggcggcagc 8880ggaaagcaga gcttgacccg cttggcttcc tttatcgcgg ggtatgtctc gtttcaaatc 8940acgctgacca gatcgtataa tacttccaat ttgatggaag accttaaagt cctttataga 9000actgcagggc aacaaggcaa gggtatcact tttatcttca ccgacaacga aattaaggac 9060gaatccttcc tggaatatat gaataatgtg ctttcgagcg 
gggaggtgag caatcttttt 9120gcccgcgacg aaatcgatga aatcaatagc gaccttgcga gcgtcatgaa gaaggagttc 9180ccaagatgtt tgcctacgaa cgaaaacctt catgactact tcatgtcacg ggtgaggcag 9240aacttgcata tcgtcctgtg tttttccccg gtgggtgaga agtttcgcaa tcgcgcactg 9300aaatttccgg cattgattag cggctgcacg atcgactggt tctcgcggtg gcctaaagac 9360gcattggtgg cggtgagcga gcattttctg acctcatatg acatcgactg ttccctggaa 9420attaaaaaag aagtcgtcca atgcatggga agcttccagg acggtgtcgc cgagaagtgc 9480gtggattact tccagcggtt tcgccgctcg actcatgtca ctccaaaaag ctacttgtcc 9540ttcattcaag gatataagtt tatctatggt gagaaacacg tggaggtgag gacccttgcc 9600aatcgcatga acactggcct ggaaaaactg aaggaagcaa gcgagtccgt ggctgcgttg 9660tcgaaagaac tcgaggcgaa agaaaaagaa ctccaagtcg ctaatgataa agcggacatg 9720gtgctcaaag aggtgacgat gaaggctcaa gcagcagaga aagtcaaggc ggaggtccaa 9780aaagtgaagg accgcgccca ggcgatcgtc gattcgattt ccaaagataa ggctatcgcc 9840gaagaaaaat tggaggccgc gaagcctgct cttgaagagg ctgaagcagc acttcaaacc 9900attagaccct cggacattgc cacggtgcgg actctgggca gacctccgca ccttatcatg 9960agaatcatgg attgcgtctt gctgttgttc caaagaaaag tctccgcagt caaaatcgat 10020ctcgaaaaat cgtgtacgat gccgtcctgg caggagtccc tcaaactgat gaccgcgggt 10080aacttccttc aaaatctgca gcaattccct aaagacacta tcaacgagga agtgatcgaa 10140ttcctgtcgc cgtactttga aatgcctgat tacaatatcg agaccgctaa aagagtgtgc 10200gggaacgtgg ccggtctctg cagctggacc aaggcgatgg cttcgttctt ctcgattaat 10260aaggaggtgc ttcccctcaa agccaacctc gtcgtccagg agaacaggca tctgctggcc 10320atgcaagatc tccaaaaagc ccaagcagaa ttggatgaca agcaggcaga gcttgacgtc 10380gtccaagccg agtacgagca agcaatgacg gaaaagcaaa ctcttctcga agacgcggaa 10440aggtgtcggc
acaaaatgca aaccgcttcc actcttatct cagggttggc aggagaaaag 10500gaacgctgga ccgaacaaag ccaagaattt gcggctcaaa cgaagagatt ggtcggcgat 10560gtgttgctcg caaccgcgtt tctgtcgtat tccggcccct ttaatcagga gtttcgcgat 10620ttgctcctca atgattggcg gaaagagatg aaggcaagga agatcccgtt cggcaagaac 10680ctgaatctct cggagatgct catcgacgct cctactatct cggagtggaa tctccaagga 10740ctgcctaacg atgacctctc cattcaaaat gggattattg tcacgaaggc gagccgctac 10800ccccttctga tcgatcccca gactcaaggg aagatctgga tcaagaacaa agaatcgcgc 10860aatgagctcc aaattacttc cctcaaccat aagtactttc gcaaccacct cgaagactcg 10920ctttcattgg gcaggccact cttgatcgaa gatgtgggag aagaactgga ccctgccctt 10980gataacgtgc tggaacgcaa cttcattaag acgggatcaa ctttcaaagt gaaagtgggt 11040gataaggaag tggatgtgct cgacggattc agactttaca ttactacgaa gctccctaat 11100ccagcctaca cgcctgagat ctccgctcgg acttcaatca tcgatttcac ggtgacgatg 11160aagggccttg aggaccaact gttgggcaga gtcattctca ccgagaaaca ggaactcgaa 11220aaggagcgga cgcatcttat ggaggacgtg acggccaata agcgccggat gaaggaattg 11280gaagataact tgctctacag gctcactagc acgcagggta gcctcgtgga ggacgaaagc 11340ctgattgtgg tcctcagcaa cacgaaaagg acggccgagg aggtgaccca aaagttggaa 11400atttcagctg aaacggaggt gcagatcaat tcagcaaggg aagagtatag gcctgtggcg 11460acgagggggt caatcttgta tttcctgatt acggaaatgc gccttgtcaa cgaaatgtac 11520cagacgtcat tgcgccaatt tctcgggctt tttgacctgt cgctcgctag gtcggtcaaa 11580tcccctatca cgtccaaacg cattgcaaac attattgaac atatgacgta cgaggtgtac 11640aagtatgcag cacggggact gtacgaggaa cacaaattcc ttttcaccct tctgctcacc 11700ttgaagattg acattcagag gaatagagtc aaacacgagg agttcctcac tctcatcaaa 11760ggtggggcgt cgctcgactt gaaggcttgt ccgcctaaac catccaagtg gattctggat 11820attacgtggc tcaatctcgt cgaattgtca aaattgagac aattttcaga tgtgctcgat 11880cagatttccc gcaacgagaa gatgtggaaa atctggttcg ataaagagaa tcccgaggag 11940gagccccttc caaatgcgta tgacaaatcg ctggactgct ttcggaggtt gctcctgatt 12000cgctcgtggt gtccagatag aaccatcgcc caggcgcgga agtatattgt ggattcgatg 12060ggtgaaaaat acgctgaagg agtcattttg gatctggaaa agacgtggga agagtcggac 12120ccacgcaccc cgctcatctg 
cctccttagc atgggctcgg acccaactga ttcgattatt 12180gccttgggga aaaggctcaa aattgaaacc aggtacgtca gcatgggtca gggtcaagag 12240gtgcacgctc ggaaactgtt gcaacagacc atggcaaacg gcggatgggc tctgctccag 12300aactgccacc tcgggttgga cttcatggac gaactgatgg acattatcat cgaaaccgaa 12360ctggtgcacg atgcttttcg cctctggatg acgaccgagg cccataagca atttcccatc 12420acgctcctcc aaatgagcat taaatttgcg aatgatccgc cacaggggct gagagctggt 12480ttgaagcgga cttatagcgg agtcagccag gatctgttgg atgtctccag cggaagccag 12540tggaagccca tgttgtatgc agtggccttt ttgcattcaa ctgtccagga gaggcggaag 12600ttcggcgccc tcgggtggaa catcccgtat gaatttaacc aggcggattt caacgcgact 12660gtgcagttta ttcagaacca tctggatgat atggatgtca aaaagggagt gtcctggacg 12720accatcaggt atatgattgg tgagatccaa tatggtggga gggtcacgga cgattatgac 12780aagaggcttc tcaacacctt tgcgaaagtg tggttttccg aaaacatgtt cgggcctgac 12840tttagctttt accagggtta caatatcccc aaatgtagca ccgtggacaa ttacctgcaa 12900tacattcaga gccttcccgc atacgacagc ccggaggtgt tcggtttgca tccgaatgca 12960gatattacgt atcaatccaa gctcgcaaaa gacgtcttgg acacgatcct cggaattcaa 13020cctaaagaca cgtccggcgg cggcgatgaa acccgcgaag ccgtcgtcgc gcggcttgcg 13080gacgatatgt tggagaaatt gccacccgat tacgtccctt ttgaagtcaa ggaaaggctt 13140cagaagatgg gcccttttca acccatgaat atcttcctta ggcaggaaat tgaccgcatg 13200cagagggtgt tgtcgctcgt ccggtcgacg cttaccgagt tgaagcttgc gatcgatggg 13260accatcatta tgtccgaaaa tctcagagac gctctcgact gtatgttcga tgccaggatc 13320ccagcttggt ggaaaaaggc gagctggatc agctccactt tgggcttttg gttcacggag 13380ctcattgaga gaaactcgca gtttacttcg tgggtgttca acggcagacc acactgtttt 13440tggatgactg gattctttaa tccccagggc tttttgactg ctatgaggca ggaaattact 13500cgggctaaca aaggttgggc attggataac atggtgcttt gtaacgaagt cacgaagtgg 13560atgaaggacg acatctcggc cccccccact gagggggtgt acgtgtacgg gttgtacctg 13620gaaggcgcgg gctgggacaa aagaaacatg aagctcattg aatcaaagcc caaggtcctt 13680ttcgaactca tgcctgtcat tcgcatctat gccgagaata acactctcag agatcctagg 13740ttctattcgt gtccgatcta caaaaagccc gtgagaaccg atcttaatta cattgctgca 13800gtggacttga gaacggctca aactcccgaa 
cactgggtcc tgagaggcgt ggcactcttg 13860tgcgatgtca aatag 138751613875DNAArtificial SequenceSynthetic polynucleotide 16atgttcagaa ttgggcggcg gcaactttgg aagcatagcg taacacgggt gctgactcag 60agactgaaag gagaaaagga ggcgaaaagg gccttgctgg acgcccgcca taactacctc 120ttcgcgattg tggcctcatg cctggacctg aataagactg aagtcgagga cgctatcctc 180gagggcaacc aaatagagcg catcgaccaa ttgttcgccg tgggtggact tcggcacctg 240atgttctact accaagacgt ggaagaagcg gaaaccgggc agctgggatc actggggggc 300gtgaacctcg tgagcggaaa gatcaagaag cccaaggtgt tcgtcaccga aggcaacgat 360gtggcgctca ccggggtgtg cgtgtttttc attcgcactg acccatcaaa ggccattact 420cccgataaca tccatcaaga ggtgtccttc aacatgctgg acgctgccga tggaggactg 480ctcaacagcg tgcgccggct cctctcggac attttcatcc ccgccctgag agctaccagc 540catggatggg gggaactcga gggactacag gacgcggcaa acattcgcca agaattcctg 600tcctcattgg agggcttcgt gaacgtgctc agcggagctc aggaaagcct caaagaaaaa 660gtcaacctcc gcaagtgcga catcctagag ctcaaaacgc tgaaggagcc cacagactac 720ctcactctcg ccaataaccc agaaaccctc ggaaagatcg aggactgcat gaaggtctgg 780attaagcaaa cagaacaagt cctggcggag aacaaccagc tccttaagga ggcggacgac 840gtcggcccga gggcggagtt agaacactgg aagaaacgcc tcagcaagtt taattacctc 900ctggagcaat tgaagtcccc tgacgtgaag gccgtgctcg cagtgttggc agcggccaaa 960tcgaagctgc tgaaaacttg gcgggagatg gacattagaa ttactgacgc gactaacgag 1020gccaaggata acgtcaaata cttgtacacc ctcgagaagt gttgcgaccc gttgtatagc 1080tcagacccac tgagcatgat ggacgccatc cccaccctca ttaacgccat taagatgatt 1140tactcaatct cccattacta caacacctca gagaagatta cttcactctt cgtgaaggtg 1200accaaccaaa ttatctcggc ctgcaaggct tacattacta ataacgggac agcttccatc 1260tggaaccagc cgcaagatgt cgtggaagag aagatcctgt cggccatcaa acttaaacag 1320gagtaccaac tctgctttca caaaaccaag cagaagctga agcaaaatcc gaacgccaag 1380caattcgact tctccgaaat gtacatcttc ggaaagttcg aaaccttcca ccggagactg 1440gccaagatca ttgacatttt cactactctt aagacctaca gcgtgctcca ggacagcact 1500atcgagggac tggaggacat ggcaacgaag taccaaggca tcgtcgccac cattaaaaaa 1560aaggaataca acttcctgga tcagcggaag atggacttcg atcaggatta tgaggagttc 1620tgcaaacaga 
ccaacgacct ccacaacgaa ctgcgcaagt ttatggacgt gaccttcgca 1680aagatccaaa acacgaacca ggcgctgcgg atgctcaaga agttcgagag attgaacatc 1740ccgaatctcg gcatcgacga taagtaccaa ctcatcctgg aaaactacgg ggccgacatc 1800gacatgatct ccaagctgta tactaagcag aagtacgacc cgccactggc gagaaaccag 1860ccccccattg ccggcaagat cctctgggcc cgacagcttt tccaccgaat ccagcaaccc 1920atgcagcttt tccagcaaca ccccgccgtg ctctcaaccg ccgaggccaa gcccatcatt 1980aggagctaca acagaatggc gaaggtcctg ctcgaattcg aagtcttgtt ccaccgcgca 2040tggttgcgcc agatcgagga aatacacgtg ggactggaag cgtcgctgct cgtgaaggca 2100cctgggaccg gagaactgtt cgtcaacttc gacccgcaaa tcctgatcct gttccgcgaa 2160actgaatgca tggctcagat gggattggaa gtcagccccc tggcgacttc cctcttccaa 2220aagagagata gatacaaacg gaacttctcc aacatgaaga tgatgctggc ggaataccag 2280cgcgtgaaat ccaagattcc ggccgctatc gagcagctca tcgtgcctca ccttgccaag 2340gtcgacgaag cactgcagcc tggcctggcc gctctcactt ggaccagcct caacatcgag 2400gcctacttgg aaaacacctt cgccaagatc aaggacctcg aactcctcct cgaccgggtg 2460aacgacttga tcgagtttag gattgatgcc attctggagg agatgagctc cactccgctg 2520tgtcaactgc cacaggagga acccctcaca tgcgaggaat tcctgcaaat gactaaggac 2580ctgtgcgtca acggggccca aatcctgcac ttcaaatcct ccctggtcga ggaggcagtc 2640aacgagctcg tgaacatgct cctggatgtg gaagtgctgt ccgaggagga gtccgagaag 2700atctccaacg aaaatagcgt gaactataag aacgaatcca gcgccaagcg ggaagagggg 2760aatttcgata ccctgacttc cagcatcaac gccagggcca acgccctctt actaactacc 2820gtgactagaa agaagaaaga aaccgaaatg ctgggcgagg aggcacgcga actcctgtcc 2880cactttaacc atcagaacat ggacgccctt cttaaggtca cccggaacac tttggaggcg 2940attcgcaaga gaatccacag cagccacacc attaacttcc gcgacagcaa ctcggcatcc 3000aacatgaagc agaattcact gccgatcttc agggccagcg tgactttggc tatcccgaat 3060attgtcatgg cgcctgctct ggaggacgtc cagcaaaccc tgaacaaggc cgtcgagtgc 3120atcatcagcg tgccgaaggg tgtccggcag tggtccagcg aattgttgtc taaaaagaag 3180attcaagaga ggaagatggc tgcgctccaa tccaatgaag atagcgacag cgacgtggag 3240atgggcgaga acgaactgca ggacaccctc gagatcgcgt ccgtcaacct tcctatcccg 3300gtccagacca agaactacta caagaatgtc tcggaaaaca 
aggagatcgt gaagctcgtg 3360tcggtcctgt ccaccattat caactccaca aagaaggagg tcattacttc catggactgc 3420ttcaagaggt ataaccacat ttggcaaaag gggaaggaag aggccatcaa gacctttatc 3480acccagtcgc cactcttgag cgagtttgag tcacagattc tgtacttcca gaacctggaa 3540caggagatta atgctgaacc agagtacgtg tgcgtgggct ccattgcgct gtatactgcg 3600gacctcaagt tcgcgttgac tgcagagact aaggcctgga tggtggtcat tggcagacac 3660tgcaacaaga agtacaggag cgaaatggag aacattttca tgttgatcga agagttcaac 3720aagaagctca accggccaat caaggacctc gatgatattc gcattgccat ggcggccctc 3780aaggaaatcc gggaggagca gatctccatc gacttccagg tcggccctat tgaagagagc 3840tacgcactgc tcaaccgcta tggactgtta atcgcccggg aagaaattga taaggtggat 3900accctgcatt acgcttggga aaagttgctg gcccgggcag gagaggtgca gaacaagctc 3960gtcagcctcc aaccctcctt caaaaaggaa ttgatcagcg cggtggaggt ctttctccaa 4020gactgccacc agttctactt ggattatgat ctgaacggcc ccatggctag cggcctgaag 4080cctcaagagg cctcagaccg gctgatcatg tttcagaacc aattcgataa catctaccgg 4140aagtacatta cctatactgg cggagaggag ttgtttggat tgccggcaac ccagtaccct 4200cagctcctgg agatcaagaa gcagctgaac ttgctgcaga agatctacac cctctacaac 4260tccgtcatag agactgtgaa ctcctactac gacattcttt ggagcgaagt aaacatcgaa 4320aagatcaata acgagctctt ggaatttcag aaccgatgca ggaagctgcc ccgggccctg 4380aaagactggc aggctttctt ggaccttaag aagattattg atgacttctc agaatgctgc 4440cccctcctgg agtacatggc ctccaaggcc atgatggaac gccactggga gcggatcact 4500accctgacgg gacacagcct ggatgtcggc aacgagagct tcaaactgag aaacatcatg 4560gaagcgccac tcctgaagta caaggaagag attgaggata tttgcatttc cgccgtgaaa 4620gaaagggaca ttgaacagaa acttaagcaa gtcatcaacg agtgggacaa caaaaccttc 4680acgttcggct ccttcaaaac ccgcggcgag ctcctcctga ggggagactc aaccagcgaa 4740atcatcgcca acatggagga tagcctgatg ctcctgggat cgctgctgtc gaacagatat 4800aacatgccct tcaaggccca gattcagaag tgggtgcagt acctctccaa ctccaccgac 4860atcatcgagt cctggatgac tgtgcagaac ttgtggatct acctcgaggc cgtgttcgtc 4920ggaggggata tcgcaaaaca acttcctaag gaagccaaga ggttcagcaa tattgacaag 4980agctgggtga agatcatgac ccgggcacac gaagtgcctt cggtggtgca atgttgcgtg 5040ggggatgaaa 
ccctcggaca gttgctgcct cacctccttg accaactcga gatttgtcaa 5100aagtccctga ctggatacct cgagaagaaa cgcttgtgct tcccaaggtt tttcttcgtg 5160tccgatcctg ccctcttgga aatcctgggt caggcctccg actcacacac catccaagcc 5220cacctcctta acgtctttga taacattaag agcgtcaagt tccatgagaa aatctacgac 5280cggatcctct ccatttcgtc ccaagaggga gaaacgattg agctggataa gccagtgatg 5340gccgaaggaa atgtcgaggt gtggctcaac agcctgctgg aagagtcaca aagctccctt 5400catcttgtga tccggcaggc agccgccaat atccaggaaa ccggattcca actcaccgag 5460ttcctcagct ccttccccgc acaagtggga ctgctgggca ttcaaatgat ctggacgcgg 5520gactccgagg aagccctgag gaacgcaaag ttcgacaaga agatcatgca aaaaaccaac 5580caggctttcc tcgaacttct caacaccctg atcgatgtga ccactagaga tctctcctcg 5640acggaacggg tgaaatacga aaccctcatc accatccacg tgcaccagcg ggatattttc 5700gacgacctct gccacatgca tattaagagc ccaatggatt ttgaatggtt gaaacagtgc 5760cggttttact tcaacgagga cagcgacaag atgatgatcc atatcaccga cgtcgccttc 5820atctaccaga acgaattcct gggatgcacc gataggctgg tgattacccc gctgactgac 5880cggtgctaca ttacgctggc ccaggccctg ggaatgtcga tgggcggcgc ccctgccgga 5940ccggctggca ccggcaagac cgaaaccacc aaggatatgg gacggtgtct cggaaagtac 6000gtggtggtgt ttaactgctc ggaccagatg gacttccgcg gacttggaag gattttcaaa 6060ggcctcgccc aaagcggttc atggggatgc ttcgacgagt tcaaccgcat tgatttgccg 6120gtgctgtccg tcgcagcgca gcaaatttcg atcatcctga cctgtaaaaa ggaacacaaa 6180aagtcgttca tttttaccga cggagacaac gtcacaatga acccggagtt cgggttgttc 6240ctgactatga accctgggta cgccgggcgc caggaactcc ctgaaaacct gaaaattaac 6300ttccgctcag tggcaatgat ggtgcctgac agacagatca ttattcgggt gaagctggcg 6360agctgcggct tcatcgataa cgtggtgctg gcgcggaagt ttttcacact gtacaaactt 6420tgcgaggagc agctctccaa acaggtgcac tacgatttcg gactgagaaa tatcctgagc 6480gtgctcagga ccctgggggc cgctaagcgc gcgaacccca tggatactga atccaccatt 6540gtgatgaggg tgctgagaga catgaacctg tccaagctca tcgacgagga tgaacccctg 6600ttcctgtccc tgattgaaga tctgttccca aacatcctcc tggacaaggc gggatacccc 6660gagctggaag cagccatttc cagacaagtg gaggaggccg gactcatcaa ccacccaccc 6720tggaagctca aggtcatcca gctgttcgaa acgcagagag 
tgcgacacgg catgatgaca 6780ctggggccgt caggggcagg aaagaccacg tgcatccaca ccttgatgcg ggcgatgacc 6840gactgcggga agccgcaccg ggagatgcgc atgaacccga aggcgattac tgcccctcaa 6900atgttcggac ggctcgacgt ggccactaac gactggaccg acggcatttt ctcgacattg 6960tggcgcaaga ccctaagagc caagaaggga gagcatatct ggatcatcct ggatggtcca 7020gtggatgcga tttggatcga gaaccttaac tccgtgctgg acgacaacaa gaccctgaca 7080ctggctaacg gcgaccggat ccctatggcg cccaactgca aaatcatctt cgagcctcac 7140aacattgaca acgcctcgcc cgccaccgtg tcgcggaacg gcatggtgtt catgtcgtcg 7200tccatcctgg actggtcccc cattctcgaa ggcttcctga agaagcgcag ccctcaagaa 7260gccgagatac tccggcaact ctacaccgag tcgttccccg atctttaccg gttctgtatc 7320cagaacttgg agtacaaaat ggaagtcctt gaggccttcg tgatcaccca gtcgatcaac 7380atgctgcagg gactcatccc cctgaaagaa cagggaggtg aagtatccca ggctcacctg 7440ggccgcctgt tcgtgtttgc gttgctttgg agcgcggggg ctgcgctcga gctggacggg 7500cgacgccgcc tggagctctg gctccgcagt aggccgaccg gaaccctgga actcccgccc 7560ccggccggcc ctggagacac cgccttcgac tactacgtgg cccccgacgg gacctggact 7620cactggaaca cgagaaccca agaatacctg tacccctccg acaccactcc ggaatacgga 7680agcatccttg tgccgaacgt ggacaacgtg cgcactgact tcctaattca gaccatcgcc 7740aaacagggga aggcagtgct gcttattgga gaacaaggta ccgcaaagac cgtgatcatc 7800aagggattca tgtccaagta cgatcctgag tgtcacatga tcaagtcact taacttctcc 7860agcgctacta cccctctgat gttccagaga accatcgaaa gctacgtcga caagcgcatg 7920ggaaccacgt acggtccccc ggccggaaag aagatgaccg tattcatcga cgacgtgaac 7980atgccgatca ttaacgaatg gggggatcag gtcaccaacg aaatcgtgcg gcaattaatg 8040gagcagaacg gattctacaa cctggaaaag cccggagagt tcacttcaat cgtggacatt 8100cagttcctgg ccgccatgat ccacccgggc ggaggaagaa acgacatccc gcagagactc 8160aaaagacaat tctcaatctt caactgcacc ctgccgtccg aggcctcagt cgataagatt 8220ttcggagtga tcggagtggg ccactactgc acgcagaggg gtttcagcga ggaggtccgc 8280gactccgtga ccaaattggt cccactcact cgaaggctgt ggcagatgac caagattaag 8340atgctcccca ctcctgccaa gtttcattac gtgtttaacc ttcgggactt gtcccgggtc 8400tggcagggaa tgctgaatac gacctccgaa gtgattaagg aaccaaatga cctcctgaag 8460ctctggaaac 
acgaatgcaa gagggtgatc gcggatagat tcacggtgtc ctccgacgtg 8520acctggttcg acaaggccct cgtgtccttg gtggaagaag agttcggtga agagaagaag 8580ctcctggtgg actgcggaat cgacacctac tttgtcgact tcttgagaga tgccccggag 8640gctgcgggag aaacctcaga agaggccgat gcggagactc cgaagattta cgaacccatc 8700gaatccttca gccacttgaa ggaaaggctc aacatgttcc tgcagctcta caacgaaagc 8760atccggggag ctgggatgga catggtgttc ttcgccgacg ccatggtgca ccttgtcaag 8820atctcccggg tcattcgaac gccccaggga aacgcattgc tcgtgggtgt cggaggatcc 8880ggaaaacagt ccctgacgag gctggcgtcc ttcattgcgg ggtacgtgag cttccaaatt 8940actctcaccc gctcctacaa tacttccaac cttatggagg acttgaaggt cttgtaccgc 9000accgccggac aacagggaaa ggggatcacc ttcatcttca ccgacaacga aatcaaggac 9060gagagcttcc tggagtacat gaacaacgtc ctttcgtccg gagaagtgtc caacctcttc 9120gcacgcgatg aaatcgacga gattaactcc gacttggcca gcgtcatgaa aaaagaattc 9180cctcgctgtc tccccaccaa cgagaacctc catgattact ttatgagccg ggtccgccaa 9240aacttgcaca ttgtgctgtg cttctcgccc gtgggggaga aatttcggaa ccgggcgctg 9300aagttccccg cgctgattag cggatgtact atcgactggt tctcgagatg gcccaaagac 9360gccctggtcg ccgtgagcga acacttcctg acttcgtatg atatcgactg cagcctcgaa 9420attaagaagg aagtggtgca gtgcatgggg tcatttcagg atggagtggc cgagaagtgc 9480gtcgactact tccagagatt ccggcggtca actcatgtga cgcccaaaag ctacctttcg 9540ttcatccagg gttacaagtt catatacggg gaaaagcacg tcgaagtgcg gaccctggcc 9600aaccgcatga acaccggcct tgagaagttg aaagaggcct cggaatccgt ggcggcgctc 9660agcaaggaac tggaggccaa agagaaggaa ctccaagtcg cgaatgataa agcagacatg 9720gtgctgaagg aagtgacgat gaaagcccag gccgccgaga aggtcaaggc cgaggtccag 9780aaagtgaagg accgcgctca agcaatcgtg gattcgattt ccaaggacaa agcaatcgca 9840gaggagaagc tggaggccgc aaagcccgcg ctcgaagagg ctgaagcggc actgcagact 9900atccggccgt ccgacattgc caccgtgaga accctgggcc gccccccaca cctcatcatg 9960cgcatcatgg actgcgtcct cttgctcttt caacggaagg tgtccgccgt caagatcgac 10020cttgagaaat catgcaccat gccatcgtgg caggagtccc tgaaactcat gacagccggc 10080aacttcctgc agaatctgca acagttcccc aaagacacca tcaacgaaga agtgatcgag 10140ttcttgtccc cgtacttcga aatgcctgat tataacattg 
aaaccgccaa gcgcgtgtgc 10200ggaaatgtcg cgggcctgtg ctcctggact aaggccatgg cctccttctt tagcattaac 10260aaggaagtgc tcccactgaa ggccaacctc gtggtgcagg aaaatcgcca cttgctggcc 10320atgcaggatc tccagaaggc tcaagcggag ctggacgata aacaggccga acttgacgtg 10380gtgcaggcgg agtacgagca ggctatgacg gaaaagcaga ccttgctgga agatgcagaa 10440agatgcaggc acaagatgca gaccgcctcc acccttattt ccggcctggc cggcgaaaag 10500gaacggtgga ccgagcagtc ccaggaattc gcagctcaga ccaagaggct cgtgggcgat 10560gtgctgctgg ccactgcctt cttgagctac tcaggcccct ttaaccagga atttcgggac 10620ctcctgctga acgattggag gaaggaaatg aaggcgcgga agatcccatt cgggaagaac 10680ttgaacctct ccgaaatgct catcgacgct cccaccatta gcgagtggaa cctccaggga 10740ctgcccaacg acgaccttag cattcaaaac gggatcatcg tgaccaaggc ctcgcgctac 10800cccctcctta tcgacccaca aactcaagga aagatttgga ttaagaacaa ggagtcacgc 10860aacgagctgc agatcacctc cctgaaccat aagtacttca gaaaccatct cgaggattcc 10920ctgagcctgg gcagacccct tctgatcgag gacgtgggcg aggagctcga tccggccctg 10980gacaacgtcc tggagagaaa cttcatcaag actggatcca ccttcaaggt caaggtcggc 11040gataaggaag tcgatgtcct ggatggcttc aggctgtata tcaccaccaa attgcctaac 11100cccgcataca ccccggaaat ctcagcgcgc acctcgatca ttgactttac tgtcaccatg 11160aaaggactgg aggatcaact gctgggcaga gtcattctca ccgaaaagca agagctcgaa 11220aaggaacgca cccatctcat ggaggacgtg accgcgaaca agcggcggat gaaagagctt 11280gaggataact tgctgtaccg cctgacctcg actcaggggt ccctcgtcga agatgagtcc 11340ctgatcgtcg tcctgagcaa tactaagagg accgccgagg aagtaaccca gaagctcgag 11400atcagcgcgg aaaccgaagt gcagatcaac agcgcaagag aagaatatag acccgtagct 11460acgaggggga gcattctgta cttcctcatc acggagatga gacttgtcaa cgaaatgtac 11520cagacctcat
tgcggcaatt cctcggactg ttcgacctgt ccctcgctcg gtccgtcaag 11580tcccctatca cttcaaagcg cattgcgaac attatcgagc acatgaccta cgaagtgtac 11640aagtacgcgg ccagggggtt gtatgaggag cacaagtttc tcttcaccct cctgctgacc 11700ttgaagatcg acattcaaag gaatcgcgtg aagcatgaag aattcctgac cctcatcaaa 11760ggcggcgctt ccctcgatct gaaggcttgc ccaccgaaac cgagcaaatg gatcctggac 11820atcacgtggc tgaaccttgt cgaacttagc aagctgcggc aattctccga cgtcctggac 11880cagatctccc ggaacgagaa aatgtggaag atctggttcg acaaagaaaa ccccgaggag 11940gagcctctgc ccaacgcgta tgacaaaagc ctggactgct tccggcggct cctcctcatt 12000cgctcgtggt gtcccgaccg gaccattgca caggcccgca agtacatcgt ggactccatg 12060ggggagaagt acgctgaggg cgtgatcctt gacctggaga aaacttggga ggaaagcgac 12120ccgaggacgc ctctgatttg cctgctttca atgggaagcg acccgaccga tagcatcatc 12180gcgctgggaa agaggcttaa gattgaaact cgctacgtca gcatggggca aggccaggaa 12240gtgcacgccc ggaagctgct ccagcagacc atggccaacg gaggctgggc gctgctgcag 12300aactgccacc ttggactgga cttcatggac gaactcatgg acatcattat cgagactgaa 12360cttgtccacg acgccttcag actgtggatg actaccgagg cccataagca gttccccatc 12420acactcctcc agatgagcat caaattcgcc aacgatcctc cacagggcct gcgcgccgga 12480ttgaaaagga cgtactcagg ggtgtcccag gacctcctgg acgtgtcctc cggctcccaa 12540tggaagccaa tgctctacgc agtggcattc ctgcacagca ctgtgcagga gaggcggaag 12600tttggagccc tgggatggaa cattccatac gagttcaacc aggccgactt caacgcgact 12660gtgcaattca tccagaacca cctggacgat atggatgtga aaaagggggt gtcctggacg 12720accattcgct acatgatcgg ggagatccag tacgggggaa gagtgaccga tgattacgac 12780aagaggctcc tgaacacttt cgctaaggtc tggttctccg aaaacatgtt cggccccgac 12840ttctcgttct accaggggta taacattcca aagtgctcga cggtggataa ctacctccag 12900tacatccagt cgctgccggc ctacgattcc cccgaggtgt tcggcctcca ccccaacgcc 12960gacattacct accagagcaa gctggctaaa gacgtgctgg acaccatact ggggatccaa 13020ccgaaggata cctccggagg aggggacgaa acccgcgaag cagtggtggc acgccttgcc 13080gacgacatgc tggaaaaact gccccctgac tacgtcccct ttgaggtcaa ggaacgcctc 13140cagaagatgg gacctttcca gccaatgaac attttcttgc gacaagagat cgaccggatg 13200cagcgcgtcc tctccctcgt 
gcgctcaacc ctcaccgagc tcaagctggc aatcgacggt 13260accattatca tgtcggaaaa cctccgggac gcactggact gcatgttcga tgcgcggatc 13320ccagcgtggt ggaagaaagc ctcctggatt tcgtcgaccc tggggttctg gttcaccgag 13380ctgattgaaa ggaactccca attcacctcc tgggtcttta acggcagacc gcactgcttc 13440tggatgaccg gcttttttaa cccccaggga tttctcaccg ccatgcgcca agagatcacc 13500agggcgaaca agggctgggc gttggataac atggtgctgt gcaacgaagt gactaagtgg 13560atgaaagatg acatttcagc cccgccaacc gaaggcgtct acgtctacgg gctctacttg 13620gaaggggccg gatgggacaa gagaaatatg aaactcattg agtccaagcc gaaggtgctg 13680ttcgagctga tgccagtgat ccgcatctac gctgaaaata acactctccg ggatcccagg 13740ttctactcgt gcccaattta caagaagccc gtgcggaccg acctgaacta catcgccgcc 13800gtcgaccttc gcactgccca gactccggag cactgggtgc tgcggggagt cgccctgctt 13860tgcgacgtga agtag 13875

SEQ ID NO: 17 (13875 nt, DNA, Artificial Sequence, Synthetic polynucleotide)
atgtttagga ttggaaggag acaattgtgg aaacacagcg tgaccagggt gctcactcaa 60cgccttaaag gggagaagga ggctaaacgg gcgctgctcg acgctaggca taactacctc 120ttcgccatcg tggcatcctg ccttgatctt aacaagacgg aagtggagga cgctattctt 180gagggcaacc aaatcgagag aatcgaccag ctgttcgccg tgggaggcct gcgccacctt 240atgttctact atcaggacgt ggaggaggcc gagactggac agctggggtc tttgggagga 300gtcaatttgg tcagcggcaa aatcaagaag ccaaaagtgt ttgtcacgga aggcaatgac 360gtcgcgctga ctggggtctg cgtgttcttc atccggaccg acccctctaa ggccatcacc 420ccagataaca ttcaccagga ggtgtctttc aacatgctgg acgccgcaga tggaggcctg 480ctgaactccg tgaggaggct cctcagcgac atcttcattc ccgccctgcg cgccacctcg 540catggatggg gagaactgga ggggcttcaa gacgccgcga acattcggca agaattcctg 600tcctccctgg aaggattcgt gaacgtgctg agcggcgcac aggaatccct caaggagaaa 660gtcaacttga gaaagtgcga catcctcgaa ctcaagactt tgaaggagcc taccgattac 720ctcaccctcg cgaacaaccc tgaaaccttg ggaaagatcg aagattgtat gaaggtctgg 780attaagcaga ctgaacaggt gctggctgag aacaaccaac tgctgaagga ggccgacgat 840gtcgggccgc gggcggaact cgaacattgg aaaaagagac tctccaagtt taattacctc 900ctggaacagc ttaaatcccc ggatgtgaag gcggtgctgg cagtcctggc ggccgccaag 960tccaagctgc tcaagacttg gcgcgagatg gacatccgca ttaccgatgc 
caccaacgag 1020gcaaaggaca acgtgaaata tctgtacacc ctcgagaagt gctgcgaccc tttgtactca 1080agcgatccac tctcaatgat ggacgccatt ccgactttga ttaacgccat taagatgatt 1140tactccattt cgcattacta caatacctcg gaaaagatca cttcgttgtt cgtgaaggtg 1200accaaccaga tcattagcgc ctgcaaggca tacatcacaa ataacggaac cgcctccatt 1260tggaaccaac cgcaggacgt ggtggaggaa aagattttgt cagcaatcaa gttgaagcag 1320gagtaccagc tctgtttcca taagactaag cagaagttga agcagaatcc caatgccaag 1380caattcgact tctccgaaat gtacatcttc ggcaagtttg agactttcca ccggcggctg 1440gcaaagatca ttgatatctt caccacgctt aagacctaca gcgtcctcca ggattcgact 1500atcgaagggt tggaagatat ggcgaccaag tatcaaggaa ttgtcgctac catcaagaag 1560aaggagtaca acttcctcga ccagagaaaa atggacttcg atcaggacta cgaggagttc 1620tgtaagcaaa ccaacgatct gcataacgag cttcggaagt tcatggacgt cacgtttgcc 1680aagattcaga ataccaacca agccctgagg atgcttaaga agtttgaacg gctcaacatt 1740cccaacctcg gcatcgacga caagtaccaa ctcattctcg aaaattatgg ggcagacatt 1800gatatgatca gcaagctgta caccaagcag aagtacgacc ctccactagc caggaatcag 1860cctcctatcg ccggcaagat cctgtgggcg agacaactct tccacagaat tcagcagcca 1920atgcaactgt tccagcagca tcccgcggtg ctgtcgactg cggaagccaa gcctatcatc 1980cgctcctaca ataggatggc caaggtcctg ctggagttcg aagtgctgtt ccatagagct 2040tggctccgcc agatcgaaga aattcacgtc ggccttgaag catccctcct ggtcaaggcc 2100cccggcaccg gtgaactctt cgtcaacttc gaccctcaga tcctcatcct gtttagagag 2160actgagtgca tggcccagat gggactcgaa gtttcgccgc ttgccactag cttatttcaa 2220aagcgggatc ggtataagcg gaacttctcc aacatgaaga tgatgctggc cgagtatcaa 2280cgggtgaaat ccaagattcc cgcggcgatt gaacagctca ttgtgccgca cctcgccaaa 2340gtggatgagg ccttgcagcc gggacttgca gcactgacct ggacttcgct caacattgaa 2400gcctacttgg agaacacttt cgccaagatc aaagacctgg aactgttgct cgatagggtg 2460aacgacctca ttgagttccg gatcgacgcc atcctcgaag aaatgtccag cactcccctg 2520tgtcaactcc cccaagagga acccctcact tgcgaagaat ttctgcagat gacgaaggat 2580ctttgcgtca acggggcaca gattctgcac ttcaagtcat cacttgtgga agaggccgtg 2640aacgaacttg ttaacatgtt gcttgacgtc gaggtcttga gcgaggagga atccgagaaa 2700atctccaatg aaaactcggt 
gaactataag aacgagtcct ccgccaagag ggaagaggga 2760aacttcgaca ccctgacctc gtccatcaac gccagggcca atgccctgtt gctgactact 2820gtgaccagaa agaagaagga gactgaaatg ctgggggagg aagcaaggga gctcctcagc 2880cacttcaacc atcaaaacat ggacgccctg ctcaaggtga ccaggaatac ccttgaggcc 2940attcggaagc ggattcactc gagccacacc attaacttcc gcgatagcaa ctcagccagc 3000aacatgaagc agaacagtct ccccatcttc agagcctccg tgactctggc cattccgaac 3060attgtcatgg cccccgcttt ggaggatgtc cagcagactc ttaacaaggc cgtggagtgc 3120atcatttccg tacccaaggg cgtgcgccaa tggtctagcg aactcctgtc caagaagaag 3180attcaggaac gcaagatggc cgccctccag agtaatgaag atagcgactc cgatgtggaa 3240atgggggaaa acgaactcca ggatactctt gagatcgcct cggtgaactt gccgattccc 3300gtccagacta agaactacta caagaacgtc agcgaaaaca aagaaatcgt gaagctcgtg 3360tcggtgctga gcacgatcat taacagcact aagaaggaag tgattactag tatggactgc 3420ttcaagcggt ataaccacat ttggcagaag ggcaaggaag aggctattaa gaccttcatc 3480acccaatccc ccctgctgag cgagttcgag tcccagatct tgtacttcca gaacctcgaa 3540caagaaatta atgccgagcc cgagtacgtc tgcgtgggct ccatcgcgct gtacaccgcc 3600gatctgaagt ttgccctgac cgccgagact aaggcctgga tggtcgtgat cggcagacac 3660tgcaacaaga agtaccgctc ggagatggaa aacattttta tgcttattga agagttcaac 3720aagaaattga accgccctat caaggatctc gacgacattc gaatagctat ggccgccctg 3780aaggaaatcc gggaagaaca gatctcaatt gacttccagg tcgggcccat cgaagaaagc 3840tacgcactcc ttaatcgcta cggactactc attgccaggg aagagatcga caaggtcgac 3900accctccatt acgcatggga gaagctgctc gctcgcgctg gcgaagtgca gaacaaactc 3960gtcagcctgc agccgagctt taagaaggaa ctgatctccg ccgtggaagt gtttctgcaa 4020gactgccacc aattctacct tgattacgac ctcaatggtc caatggcctc gggcctgaag 4080ccccaagagg cctcggaccg gctgatcatg ttccaaaacc agttcgacaa catctaccgg 4140aagtacatta cctacactgg tggcgaagaa cttttcggcc tgccagctac ccagtatccg 4200cagctcctgg agatcaagaa gcaactgaac ttgttacaaa agatctacac tctctacaat 4260agcgtcattg aaaccgtcaa ctcctactac gacatcctgt ggagcgaagt gaatattgaa 4320aaaattaata acgagctgct tgagttccag aacagatgcc gcaagctccc gcgggcactc 4380aaggactggc aagcattctt ggacctcaag aaaattatcg acgatttcag 
cgagtgctgc 4440ccgctcctgg agtacatggc ctccaaggcg atgatggaac gccattggga gcggatcact 4500accctgacgg ggcacagcct ggacgtgggc aacgagagct ttaagctccg gaacattatg 4560gaagccccgt tgctgaagta taaggaagag atcgaggata tttgcatcag cgccgtgaag 4620gaaagggaca tcgagcagaa actaaagcaa gtgatcaacg agtgggacaa caagacgttc 4680acgttcggct cgttcaagac tagaggagaa ctcctcctga gaggagactc cacttccgaa 4740atcatcgcga acatggagga tagccttatg cttctcggct ccctacttag caacagatac 4800aacatgccgt tcaaggcgca gatccaaaag tgggtgcaat acctgagcaa ctccaccgat 4860attattgaat cctggatgac cgtgcagaac ttgtggatct acctcgaagc cgtgttcgtg 4920ggtggagaca ttgccaagca actgcccaaa gaggcgaagc ggttttccaa cattgacaag 4980agctgggtga agattatgac cagggcccac gaagtgccga gcgtggtgca gtgctgtgtc 5040ggtgacgaga ctctcggcca gctgctccct catctcctgg atcagctgga gatttgtcag 5100aagtcgttga ccggatacct tgaaaagaaa aggctctgct tcccgcgctt cttcttcgtg 5160agcgaccccg cgctgttgga gattttgggt caagcgtccg attcccacac catccaagcc 5220cacctcctca acgtctttga taacattaag tcagtgaagt tccatgagaa gatctacgac 5280agaattctta gcatctcgtc ccaggagggc gagacaatcg agctggacaa gccggtgatg 5340gccgagggta acgtcgaggt ctggcttaac tcgttgctgg aagagagcca gtcaagcctg 5400cacttggtca ttcgccaagc cgcggccaac atccaggaga ctggattcca gctcaccgag 5460ttcctgagca gcttcccggc tcaagtcgga ctcctgggca tccaaatgat ttggacgaga 5520gactctgagg aggcgttgcg caacgccaag tttgacaaga aaattatgca aaagacaaac 5580caggccttcc ttgaattgct aaacaccctt attgacgtga ccaccagaga tctcagctcg 5640actgagcgcg tgaagtacga aaccctgatc actatccacg tgcaccagag agacatcttc 5700gacgacttgt gtcacatgca tattaagtcg ccgatggact tcgaatggct caagcagtgt 5760agattctact tcaacgagga ctcggataaa atgatgattc atattactga tgtggcgttt 5820atctaccaaa acgaattctt gggctgcact gataggttgg tgatcactcc cctgactgac 5880agatgctata tcaccctggc ccaggcgttg ggaatgtcaa tgggaggcgc ccctgccgga 5940ccggctggca ctggaaagac cgaaacgact aaggatatgg ggagatgctt gggaaaatac 6000gtcgtagtgt tcaactgctc cgaccagatg gacttccgcg gcctgggcag aatttttaag 6060ggcctcgccc agtcagggtc gtgggggtgc ttcgacgagt tcaatagaat tgacttgccc 6120gtcctgtcgg tggccgccca 
gcaaattagc atcatattga cgtgcaagaa ggaacataag 6180aaatccttca tctttaccga cggggacaat gtcactatga atccagagtt cggtcttttc 6240ctaaccatga acccaggcta cgctggcaga caggagctgc ctgagaactt gaagatcaat 6300ttcagaagcg tggcgatgat ggtccccgac cgacaaatca ttatacgcgt caaactggcc 6360tcgtgtggct tcatcgacaa cgtcgtcctg gcccggaagt ttttcaccct ctacaagctc 6420tgtgaggagc agctcagcaa acaagtgcat tacgatttcg ggcttcggaa catcctgagc 6480gtcctccgga ccctcggagc cgccaagaga gccaacccaa tggacaccga gtcaaccatt 6540gtgatgcgcg tcttgcgaga tatgaacttg tccaagctca tcgacgagga cgagcccctc 6600ttcctgagcc tgattgagga cctgttccct aacatcctcc tggataaggc cggctaccct 6660gaactggagg cggccattag cagacaggtg gaggaggcgg ggctgatcaa ccatccccct 6720tggaagctca aggtcatcca gttgttcgag actcagcgcg tgcggcacgg catgatgacc 6780ctgggaccgt ccggcgcagg caaaactacg tgcattcaca ccctcatgcg ggcgatgacc 6840gactgtggca agccccacag agaaatgaga atgaatccca aagccattac tgctccccaa 6900atgtttggcc ggctggacgt ggccaccaat gactggacgg acggaatctt tagcaccttg 6960tggcggaaaa ccctgagagc taagaagggc gaacacattt ggatcattct cgacggccct 7020gtggacgcga tttggattga aaatctgaat agcgtgctgg atgacaataa gactctcacg 7080cttgccaacg gtgatcggat tcccatggcc ccgaattgta agattatttt cgagcctcat 7140aacatcgata acgcttcgcc tgcgactgtc tcaagaaatg gcatggtgtt tatgtcctct 7200tccattctgg actggtcccc gattctcgag ggcttcctga agaaaaggtc accacaagag 7260gctgaaattt tgaggcaact gtacactgag tccttcccag acctgtaccg gttttgcatc 7320cagaacctcg agtacaaaat ggaagtcttg gaagccttcg tcattactca gtcaatcaac 7380atgctccagg ggctcatccc cctcaaggag caggggggag aggtcagcca ggcgcacctc 7440ggacggctgt tcgtgttcgc actcctgtgg agcgcggggg ctgcactgga actggatggg 7500aggaggcggt tggagttgtg gctgagaagc agaccgacgg gtaccctgga actacccccc 7560cccgcgggtc cgggagacac cgccttcgat tactacgtcg cgcccgatgg cacttggacc 7620cattggaaca ctcgcaccca agagtatttg tacccttccg atactacgcc tgaatacggc 7680agcatcctcg tgcctaacgt ggacaacgtc cggacagact tccttatcca aaccattgcc 7740aaacagggca aggccgtgct cttgattggg gagcaaggca ctgccaagac ggtgatcatt 7800aagggcttca tgtcgaagta cgacccagaa tgccatatga tcaaatcact 
gaacttcagc 7860agcgccacta ccccactcat gttccaacgc acaattgagt cgtacgtcga taaaagaatg 7920ggcaccacct acggacctcc ggccggaaaa aaaatgaccg tgtttatcga cgatgtgaat 7980atgcctatta tcaacgagtg gggtgaccaa gtgaccaacg aaatcgtgcg ccagcttatg 8040gaacagaacg gattttacaa cctggagaag cctggggagt ttacctctat cgtggacatc 8100caattcctgg ccgcaatgat tcacccggga ggaggtcgga acgacatccc gcagcgactg 8160aagcggcaat tctcgatctt caactgtact ctcccgtcgg aggcgtcagt ggacaagatc 8220tttggagtca tcggggtggg ccattactgt acccagaggg gtttcagcga agaggtgcgc 8280gattccgtga ccaagctggt gccccttacc agacgcctgt ggcaaatgac caagatcaaa 8340atgcttccca ctcccgcgaa gttccactac gtgtttaacc tccgcgatct gtcccgggtc 8400tggcagggca tgttgaacac taccagtgaa gtcatcaagg agccgaacga cttgctcaag 8460ctgtggaagc acgaatgcaa gcgcgtcatc gccgaccggt ttacggtgtc ctccgacgtg 8520acctggttcg acaaagcgtt ggtgtcattg gtcgaagagg aattcggtga ggaaaaaaag 8580ctcctggtgg attgtgggat tgacacttac tttgtcgatt ttctccgcga cgcccctgaa 8640gcggccggtg aaaccagcga ggaagccgac gccgaaaccc cgaagatcta cgagccgatt 8700gaatcgttca gccatctcaa agagcggctc aacatgttcc tccagctgta caacgaatca 8760atcaggggcg ctggaatgga catggtgttc ttcgccgatg ccatggtcca ccttgtcaag 8820atctcgcgcg tgatccggac ccctcaagga aacgccctct tggtcggtgt cggagggtca 8880ggaaagcaga gcctgacccg gctcgcgtcg ttcattgccg gttacgtgag ctttcaaatt 8940actctcaccc ggtcgtataa cacctcgaac ttgatggagg acctgaaggt cctctatcgc 9000accgccgggc agcaaggcaa aggaatcacc ttcatcttca ccgacaacga aattaaggat 9060gaatcatttc tggaatacat gaacaatgtc ctgagcagcg gagaagtgtc caacttgttc 9120gctcgcgatg agatcgatga aattaactcc gacctggcct ccgtgatgaa gaaggagttc 9180cctaggtgcc tgccgaccaa cgaaaacctc catgactact tcatgtccag ggtccggcaa 9240aatctgcata ttgtgctgtg tttctcaccc gtgggggaaa aatttcgcaa tcgcgcgctg 9300aagttccccg cactgatcag cggctgcacc atcgactggt tctcccggtg gccgaaggat 9360gccctggtgg cagtcagcga acacttcctg actagctacg atattgactg tagcctcgaa 9420attaagaagg aggtggtcca atgcatgggg tcattccagg atggtgtcgc agaaaaatgc 9480gtggattatt tccaaaggtt ccggagaagc acccatgtga cgccgaagtc gtacttgtcg 9540ttcattcagg gctacaaatt 
catctatggc gaaaaacacg tcgaagtgcg gaccctggct 9600aacagaatga acaccggcct ggaaaaactg aaggaagcta gcgaaagcgt ggccgcgctc 9660agcaaggaac ttgaagcgaa ggaaaaagaa ctgcaagtgg ctaacgacaa agcggacatg 9720gtgttgaaag aggtgaccat gaaggcccag gcagcagaga aggtgaaggc cgaggtgcaa 9780aaggtcaaag accgggccca ggccattgtg gactcgatct ccaaggataa ggctatcgct 9840gaggagaagt tggaagcggc taagccagcc ctggaagagg ccgaggccgc cctgcaaacc 9900atcaggccgt ccgacatcgc gaccgtgcgc accttgggca gaccgcccca cctcatcatg 9960cggatcatgg actgtgtact cctcctgttc cagcgcaagg tgtccgcagt caagatcgat 10020ctggaaaagt catgcacgat gcccagctgg caggaaagtc ttaaactcat gactgccggt 10080aacttcctgc aaaaccttca acaatttccc aaggacacca ttaacgaaga agtcatcgaa 10140ttcctgtcgc cgtacttcga aatgcctgac tacaatattg agactgccaa acgcgtgtgc 10200ggaaacgtcg ccggactttg ctcctggacg aaggccatgg catccttctt cagcattaac 10260aaggaggtcc tgccgctgaa ggcgaacctc gtggtccagg aaaaccgcca ccttctggcc 10320atgcaagatc tccaaaaagc acaagccgaa ctggacgaca agcaggcgga acttgacgtg 10380gtccaggccg agtacgagca agcaatgacc gagaagcaga ctttgctgga ggacgcagaa 10440cgctgccggc acaagatgca gacggcgtca acgttgatct cgggcctcgc cggagaaaag 10500gaaagatgga ccgagcagag ccaagaattc gcagcgcaga ctaagagact tgtcggagat 10560gttctcctgg ccaccgcctt tctgtcctac agcgggccat tcaaccagga attcagagat 10620ctgctgctga acgactggcg caaggaaatg aaggctcgca agattccgtt cgggaagaat 10680ctcaatctct cggaaatgct tatcgacgca ccaaccatct ctgaatggaa tctgcaaggc 10740ctgccgaacg acgaccttag catccagaac ggaatcatcg tgactaaggc ctccaggtac 10800cccctcctga tcgaccccca gactcaggga aagatctgga ttaagaacaa ggaatcgcgg 10860aacgaactcc agatcacctc actcaaccac aagtacttcc gcaaccatct cgaggattca 10920ctcagcctgg gtcggcccct gcttattgag gacgtcggag aggagctgga tcctgccctc 10980gacaacgtcc tcgagagaaa cttcatcaag acgggatcaa ccttcaaggt gaaggtcgga 11040gacaaggaag tcgacgtgct ggatggattc aggctctaca tcaccactaa gctccccaac 11100cccgcctaca ctccggagat ttcggcccgc acctccatca tcgactttac cgtcaccatg 11160aagggactcg aagatcagct cctcgggaga gtgatcctga cagaaaagca ggaactggaa 11220aaagaaagaa cgcatctgat ggaggatgtg 
actgcgaata agcggagaat gaaggaactc 11280gaagataact tgctgtatag acttacgagc acccaagggt ccctggtcga ggacgagtca 11340ttgatcgtgg tgctgtccaa caccaagcgc accgccgagg aggtgacgca gaagctcgaa 11400atctcggcgg aaacggaagt ccagattaat tcggctcgcg aggaatatcg cccggtcgca 11460actcggggat cgatcctgta cttcctgatc actgaaatgc ggctcgtcaa cgaaatgtac 11520cagacatcgc tgcggcaatt cctgggactt tttgacctga gcctggcccg gtccgtcaag 11580tcgccgatca cctcaaagag aattgcgaac atcattgaac acatgaccta cgaagtgtac 11640aagtacgccg cccgtggatt gtacgaggag cacaaattcc tgtttacctt gctgcttacc 11700cttaagatag acatccagag gaacagagtg aagcacgaag agttccttac gctgattaag 11760ggcggagcct cactcgactt gaaggcatgc cccccgaagc catcaaagtg gattctcgac 11820atcacctggc tgaacctggt ggagctctcc aagctccggc agttctcgga cgtcctggac 11880caaatatcga gaaacgaaaa aatgtggaag atttggtttg acaaggaaaa cccggaagag 11940gagccccttc ccaatgccta tgacaagtcg ctggactgct ttcggcgcct cctcctgatt 12000aggtcctggt gccctgatag gactattgcc caggccagaa agtacattgt ggactcaatg 12060ggagagaagt acgccgaagg agtcatcctt gacctcgaaa agacttggga ggaatcagac 12120cctcgcaccc cgctgatttg tctcctttcg atgggctcgg atcctacaga tagcatcatc 12180gccctgggaa agcgcctcaa gatcgaaact cggtacgtgt ccatgggaca aggtcaggaa 12240gtccatgccc ggaagctgct tcaacagact atggcgaacg gtggatgggc cctgcttcag 12300aactgccacc ttgggctcga tttcatggac gaattgatgg acattattat cgaaacggag 12360ctggtgcacg acgcttttag actctggatg accactgagg cccataagca attccccatt 12420acgctgctcc aaatgtctat caaattcgcc aatgatcccc cgcaaggtct gcgggctggt 12480cttaaaagga cgtactccgg agtgtcccag gacttgctgg atgtctccag cgggtcgcag 12540tggaaaccaa tgctttacgc cgtcgccttc cttcactcga ccgtgcagga acgccggaag 12600ttcggcgccc
ttggctggaa cattccctac gaattcaatc aagccgactt caatgccact 12660gtgcaattca tccagaacca cttagacgat atggacgtca aaaagggagt gagctggacc 12720accatccgct acatgattgg tgaaatccag tacggcggac gggtcacgga cgattatgac 12780aaacggctcc tcaacacctt cgcgaaagtg tggttcagcg agaacatgtt cggtccggac 12840ttctccttct accaaggcta taatattcca aagtgctcga ctgtggataa ctacctccag 12900tacattcaga gccttcctgc atatgactcc cctgaagtgt tcggattgca tccgaacgcc 12960gatattactt accagagcaa gctggcaaag gacgtgctgg acaccattct gggcatccag 13020cccaaggata cgagcggagg aggagatgaa acccgcgaag ccgtcgtcgc acgcttggca 13080gacgacatgc tggagaaact gcctcccgac tacgtgccgt ttgaggtgaa ggaacggctg 13140cagaaaatgg gaccatttca gccgatgaat attttcctgc ggcaggaaat cgaccggatg 13200caacgcgtgc tttccctcgt ccgcagcacc ctgaccgagc tgaaacttgc tatcgatgga 13260accattatca tgagcgagaa cttgcgcgac gccttagact gtatgttcga tgcgagaatc 13320ccggcgtggt ggaagaaagc cagttggatc tcgtcgacac tggggttctg gttcaccgag 13380cttatcgaaa ggaactccca attcacgtcc tgggtgttca acggtagacc gcattgcttc 13440tggatgactg gattcttcaa cccgcagggc ttcctgaccg ccatgcggca ggagatcacc 13500agagccaaca aaggatgggc tttggacaac atggtgctct gcaacgaagt cacaaagtgg 13560atgaaggacg atatctccgc cccgcctacg gagggcgtat acgtctacgg gctctacttg 13620gaaggagccg gatgggacaa gagaaatatg aagctgatcg aaagcaagcc gaaggtgcta 13680ttcgaactga tgccagtgat ccgcatttac gccgagaaca acacacttcg cgatccgcga 13740ttctacagct gccctatcta caaaaagccc gtgaggactg acctgaacta tatcgcagca 13800gtggaccttc gcaccgccca aacccccgaa cactgggtgt tgaggggggt ggcgctgctc 13860tgcgacgtga agtag 13875

SEQ ID NO: 18 (13875 nt, DNA, Artificial Sequence, Synthetic polynucleotide)
atgttccgga tcggacggcg ccaactttgg aagcattccg tcacccgcgt gctgactcag 60cgcctcaagg gtgaaaaaga ggccaaaagg gccctgctcg atgctagaca taactactta 120tttgccattg tcgcatcctg cctggatctg aataagaccg aagtcgagga cgcgattctg 180gaagggaacc aaatcgagcg gatcgaccaa ctcttcgccg tgggtgggct gcgccatctc 240atgttctact accaggatgt ggaagaagca gaaaccggcc aattgggatc actgggaggt 300gtcaatctcg tatccggcaa gattaagaaa cctaaggtgt tcgtgactga gggaaacgac 360gtcgcgctta cgggcgtctg 
cgtctttttc attcggaccg atccttcgaa ggccatcacc 420ccggacaaca tccaccaaga agtgtccttc aacatgcttg acgcagccga cggcgggctt 480ttaaacagcg tcagacggct cctgagcgac attttcattc ctgccctgcg ggccactagc 540cacggctggg gagaattgga gggacttcag gacgctgcta atattagaca agaattcctc 600tcctcgttgg aaggattcgt gaacgtgcta agcggcgccc aggaaagcct gaaagagaag 660gtcaacttga ggaagtgcga catcctcgaa ttgaaaacac tgaaggagcc tactgattac 720ctgacactgg caaacaatcc tgagactctc gggaaaatcg aggattgtat gaaggtctgg 780atcaagcaga ccgagcaagt gctcgccgaa aacaatcaac tcctcaagga ggcggatgac 840gtgggtccta gggcagaatt ggagcattgg aaaaaaagac tctccaagtt caactacttg 900ctcgaacagc tgaaatcgcc tgacgtgaag gccgtccttg ccgtgctggc cgctgccaag 960agcaaactcc tgaaaacttg gcgggaaatg gacattcgga tcactgatgc caccaacgag 1020gctaaggaca atgtgaagta cttgtacaca ctcgagaagt gttgcgaccc gctgtattcg 1080agcgatccac tcagcatgat ggacgcgatc ccgaccctca ttaacgcaat taagatgatc 1140tactcgataa gccattatta caacacctct gaaaagatca cctccctttt cgtgaaggtg 1200acgaaccaga ttatttcagc ctgcaaggct tacattacta acaacggcac tgcctccatc 1260tggaatcagc ctcaggacgt cgtggaggag aaaatcctgt ccgcgatcaa gctgaagcaa 1320gaataccagc tgtgcttcca caagaccaag caaaagctca aacagaaccc caatgctaaa 1380caattcgact tttcggagat gtacatcttc ggaaagtttg aaaccttcca tcggaggctc 1440gcgaagatta tcgacatctt caccaccctc aagacttaca gcgtgctcca ggatagcact 1500atcgaaggcc tagaagatat ggctaccaaa taccaaggaa ttgtggcaac catcaagaag 1560aaggaataca acttcctgga ccagcgcaaa atggatttcg atcaggatta tgaggagttt 1620tgtaagcaga ccaacgatct gcataatgaa ctgaggaagt tcatggacgt gacgttcgct 1680aagatccaaa atactaatca agccctgagg atgctgaaga agtttgaacg cctgaacatc 1740cccaacctcg gaatcgacga taaataccaa ctcatcctcg aaaactacgg agcagatatc 1800gacatgatct ccaagttgta cacgaagcag aagtatgacc ctcctctcgc ccggaaccag 1860cctccgattg cgggaaagat cctttgggcc cgccagctct ttcaccggat ccagcagcca 1920atgcagctct ttcagcaaca tcctgcggtc cttagcaccg ccgaagccaa acctattatc 1980agatcataca accgcatggc caaagtgttg ttggagttcg aggtcctgtt ccatcgcgcc 2040tggctgaggc agatcgaaga gatccacgtc ggcctcgagg cgtccctgct tgtcaaggcc 
2100ccgggtactg gggaactctt cgtgaatttt gacccccaga tcttgatcct tttccgcgag 2160acagagtgca tggcgcaaat gggactggaa gtctcgccgt tggccacttc gttgttccag 2220aagcgggata ggtacaagcg aaacttcagc aacatgaaga tgatgctggc agaataccaa 2280cgtgtgaagt cgaaaattcc cgccgcaatc gaacagctca tcgtgcccca cttggctaaa 2340gtggacgagg ccctgcagcc gggactcgct gcgctgacct ggaccagcct caacattgaa 2400gcctacctcg agaacacctt cgctaagatc aaggatctcg agctcctcct cgatagagtc 2460aacgacctga ttgaattcag gattgacgca attctggagg aaatgagctc aactcccctc 2520tgtcagctcc cccaagaaga accgctgact tgcgaagaat ttctccagat gaccaaagac 2580ctgtgcgtca atggggcgca gatcctccac tttaagagtt cactcgtcga ggaggccgtg 2640aacgaactcg tgaacatgct gctggatgtg gaagtgctgt cggaggaaga gtcggagaag 2700atttccaacg aaaacagcgt gaattacaaa aacgagtcat cagcaaagcg cgaagagggg 2760aacttcgata ccctgaccag cagcatcaac gccagggcca acgctttgct cctcacgacc 2820gtgacccgaa agaagaaaga aaccgagatg ctcggagagg aagcgcggga attgctgtcg 2880cacttcaacc accagaacat ggacgcactc cttaaagtga ctcggaacac tttggaggcg 2940atccggaaga ggatccattc ctcccatacc atcaacttca gggacagcaa cagcgcgtcg 3000aacatgaagc agaactcact cccaattttc agagcgtcgg tgacacttgc cattccgaac 3060atcgtcatgg ctcccgcact ggaggatgtc caacagaccc tgaacaaggc agtggagtgt 3120atcatttcgg tccccaaagg ggtgcggcag tggagctccg aacttctttc caaaaaaaag 3180atccaagaac ggaagatggc ggccctccaa tccaacgaag attcggactc agacgtcgaa 3240atgggtgaaa acgaattgca agacactctg gaaattgcca gcgtgaatct ccctatcccc 3300gtccaaacta agaattacta caagaacgtg agcgagaaca aggagattgt gaagctggtg 3360tccgtcttga gcactatcat caattcgacc aagaaggagg tgatcaccag tatggattgc 3420ttcaagcgct acaaccatat ctggcaaaag ggaaaggagg aggccatcaa gaccttcatc 3480acccaaagcc ctcttctctc cgaattcgaa tcgcagatcc tctatttcca aaacttggaa 3540caggaaatca acgccgagcc tgagtacgtg tgcgtggggt caatcgccct gtatactgcg 3600gacctgaaat tcgcgctgac tgccgaaact aaggcctgga tggtggtcat tggccggcac 3660tgcaacaaaa aataccgcag cgagatggag aacatcttca tgctcatcga ggaattcaac 3720aagaagctga acagaccgat caaggacctc gatgatatca ggattgccat ggcggccctt 3780aaggaaatcc gggaggaaca aattagcatc 
gatttccagg tcggcccaat cgaggaatcc 3840tacgccttgc tgaaccgcta tggcctgctg attgcacggg aagagatcga caaggtggac 3900acccttcatt atgcgtggga gaagctgctt gcgcgggcgg gagaagtcca aaacaagctg 3960gtgtccctgc agccttcctt caaaaaggag ctgatctcag ccgtggaagt gttcttgcaa 4020gattgtcatc aattctacct cgactacgac ttgaatggac ccatggcatc aggcctgaaa 4080ccacaggagg cgtcagaccg gctgatcatg ttccagaacc aattcgacaa catttacagg 4140aagtatatca cgtacactgg aggagaggaa ttgttcggcc ttcctgccac tcagtacccg 4200caactgctgg aaattaagaa gcaacttaac ctcctccaga agatctatac tctctacaat 4260tcagtgattg agactgtgaa ctcgtactac gacattctgt ggtcagaggt gaacatcgaa 4320aagatcaata acgagttgct ggaatttcag aaccgctgca gaaagctccc tagagccctc 4380aaagactggc aggccttctt ggacctgaag aaaattatcg acgacttctc cgaatgctgc 4440ccgctgcttg agtacatggc ctccaaagcg atgatggaac gccattggga acggatcacg 4500actttgactg gacacagcct ggatgtgggg aacgaatcct ttaaactgag gaacatcatg 4560gaagctccgc tgctcaaata taaagaagaa atcgaggaca tttgcatcag cgccgtcaag 4620gaaagggaca ttgaacagaa gctgaaacag gtcatcaacg agtgggacaa taagactttc 4680accttcggca gcttcaagac tcggggcgaa ctcctcctga gaggagattc gacgagcgag 4740attattgcga acatggagga ttcgctgatg ctccttggaa gcctcctctc gaatagatac 4800aacatgccgt ttaaggcaca gatccaaaag tgggtccagt acctctcgaa ctccaccgac 4860attatcgagt cctggatgac cgtgcaaaac ctctggatct atctcgaagc agtattcgtc 4920gggggggata tcgccaaaca actgcctaaa gaggccaaga gattctccaa catcgataag 4980tcgtgggtga aaattatgac tcgggcacat gaggtgccca gcgtggtcca atgctgcgtc 5040ggtgacgaaa ctctcggtca actgcttcca cacttgctgg accaactgga gatttgccag 5100aagtcactca ctggatatct cgaaaagaag cggctctgtt tcccccgctt ctttttcgtg 5160agcgaccctg ccttgttgga aatcttgggg caggcctccg actcgcacac catccaggcc 5220cacctcctga acgtctttga caacatcaag tccgtgaagt ttcatgagaa gatctacgac 5280cgcatcctga gcatttcgtc acaggaaggt gaaactatcg aactggataa acctgtcatg 5340gcggaaggga acgtggaagt gtggctgaac agcctgctgg aggagtcaca gagctcgctc 5400cacctggtca ttcggcaggc cgccgcgaat atccaagaga ctggattcca gctgacggag 5460ttcctgagca gcttcccggc ccaagtgggc ctgctgggaa ttcaaatgat ctggacccgg 
5520gacagcgaag aggccctgag aaacgctaag ttcgacaaaa aaatcatgca gaaaacgaac 5580caggccttcc tggaattgct gaacaccctc attgacgtca ccactcgcga ccttagctcc 5640accgagaggg tgaaatacga aactctgatt accatccacg tgcatcaacg cgatattttc 5700gacgacctct gccacatgca cattaagtcg ccgatggatt tcgagtggct gaagcagtgt 5760agattctact ttaacgagga ttcggacaag atgatgatcc acatcaccga tgtcgccttt 5820atctaccaga atgagttcct gggatgcacc gacagactcg tgatcacccc actgaccgac 5880agatgctaca tcacccttgc gcaggctctg ggcatgtcga tgggtggggc cccggcgggc 5940cctgcaggca ccggcaagac ggaaaccacc aaggatatgg gaaggtgcct tgggaagtac 6000gtggtcgtgt tcaactgctc agaccagatg gacttccgcg ggctgggacg cattttcaag 6060ggcctggccc agtcgggctc gtggggttgc ttcgacgagt tcaacaggat cgacttgccg 6120gtcctgtccg tggctgcaca gcagatttcg attatcctca cttgcaagaa ggagcacaag 6180aagtccttca ttttcaccga cggagacaac gtcaccatga atccggaatt cggccttttc 6240ctgactatga accctgggta cgcgggaagg caggaactgc cagaaaacct caagattaat 6300tttcgcagcg tggctatgat ggtgccggac cggcaaatca ttattagagt gaagctcgca 6360tcgtgcggat tcatcgacaa cgtcgtgctc gcaagaaagt tcttcactct gtacaagctg 6420tgcgaggaac agcttagcaa acaggtgcac tacgattttg gcctccggaa cattctctca 6480gtgctccgga ccctgggagc cgcgaagagg gccaacccca tggacaccga atcaaccatt 6540gtgatgcggg tcctgagaga catgaacttg tccaagctca tcgacgagga cgaacccctg 6600ttcttgagct tgatcgagga cctgttccct aacatcctcc tggataaagc aggctacccg 6660gaattggaag ccgccattag cagacaagtg gaggaagccg gattgattaa tcatccaccg 6720tggaagctga aagtcatcca gttattcgaa acccaacgcg tgcgccatgg aatgatgacc 6780ctcggccctt ccggagccgg caaaaccacg tgcatccaca ccctcatgcg cgccatgact 6840gactgtggca agccacaccg ggaaatgagg atgaacccga aagctattac cgcgccacag 6900atgttcggac gcctggacgt ggccactaac gattggacgg atgggatctt ctcgaccctt 6960tggcgcaaaa ctctgcgcgc caagaaggga gaacacatct ggatcattct tgacggacct 7020gtggacgcca tttggattga aaatctgaac tcggtgcttg acgacaacaa gactctcacc 7080ttggcaaatg gcgaccgaat tcctatggcc ccgaactgca agattatttt cgagcctcac 7140aatattgata acgcttcgcc agcgactgtc tcccgcaacg ggatggtgtt catgtcaagc 7200tcgatcctgg attggagccc aatcctcgag 
ggctttttaa aaaagcggtc gcctcaggaa 7260gcggaaatcc ttcggcagct ttacaccgag tcgttcccgg atctctaccg cttctgcatt 7320cagaacctcg aatacaagat ggaagtgctc gaagccttcg tgatcacaca gtccattaac 7380atgctgcagg gcctgatccc cctgaaggag caaggaggag aagtgtccca ggcccatctg 7440ggtagacttt tcgtgttcgc gctcctgtgg tccgccggtg ccgcgctcga acttgacggc 7500cgaaggcggc tggaactgtg gctgaggtcg aggccgaccg gaactttgga gctgcctcca 7560cccgccggac ctggcgacac tgcctttgac tactacgtcg cccctgacgg aacctggacc 7620cactggaaca ccaggactca ggagtacctc tacccttccg acacgactcc cgagtacggg 7680tcaattttgg tgccgaacgt ggataacgtg cggacagact ttctgatcca gactatcgcg 7740aagcagggaa aggcagtcct gctgatcggc gaacaaggaa cggcaaagac ggtcattatt 7800aagggcttca tgagcaagta cgaccctgag tgccatatga tcaagtcact gaatttctcc 7860agcgcgacga ccccccttat gtttcaaaga accattgagt cgtacgtcga caagagaatg 7920ggcactacgt atgggcctcc cgccggaaaa aagatgaccg tgttcatcga cgatgtgaac 7980atgccgatta tcaacgaatg gggggaccaa gtcacgaacg agatcgtgag gcagcttatg 8040gaacagaacg gtttctacaa tcttgagaag ccgggagagt tcacttccat cgtcgatatt 8100cagttcctgg ccgccatgat ccacccagga ggaggccgga atgatatccc tcagcgcctg 8160aaaaggcagt tctcgatctt taattgcacc ttgccgtcag aagcctcggt ggacaagatc 8220ttcggcgtga tcggcgtggg gcattactgc actcagcgcg gcttctcgga ggaggtccgg 8280gactcggtca ccaagctggt gccactgacc cgcaggctct ggcaaatgac caagattaag 8340atgctgccta cccccgccaa gttccattat gtgttcaacc tcagggacct ctcacgggtg 8400tggcaaggaa tgctgaatac tacttccgaa gtgatcaaag agcctaacga ccttttgaag 8460ttgtggaagc acgagtgcaa gcgcgtcatt gccgaccggt tcaccgtgtc ctcagacgtg 8520acttggttcg acaaggccct cgtgtccctg gtcgaagagg agtttggaga agaaaaaaag 8580ctccttgtgg actgcggtat cgatacctac ttcgtggact ttcttcggga cgcaccggaa 8640gcagccggag aaactagcga agaagctgac gccgaaactc cgaagatcta tgagcctatc 8700gagagctttt cgcacctgaa ggaacggctg aacatgttcc tccagcttta taatgagagc 8760attcgcggtg ccgggatgga tatggtgttc ttcgccgacg cgatggtgca ccttgtgaag 8820atttcacggg tcattcggac cccacaagga aacgcgctgc tggtcggtgt cgggggatcc 8880gggaaacagt cactgacgag attggcgagc ttcattgccg gatacgtgag cttccagatc 
8940acgctgacca ggtcctacaa taccagcaac ttgatggagg atttgaaggt gctgtaccgc 9000accgcgggcc aacaagggaa ggggattact ttcattttca ctgacaacga aatcaaggac 9060gaatcgttcc tggaatacat gaataacgtc ctctcgagcg gggaagtgtc caacctgttc 9120gccagggatg agattgacga gatcaatagc gaccttgcaa gcgtcatgaa gaaggaattc 9180cctcgctgcc tccccacaaa cgaaaacctc cacgattact tcatgagccg ggtcagacag 9240aacctccata ttgtgctgtg cttctcacct gtgggtgaaa agtttcggaa ccgcgcactg 9300aaattccccg cactgatctc ggggtgcacc atcgactggt tttcccggtg gcctaaggac 9360gccctggtcg ccgtgtccga gcacttcctc actagctatg atatagattg cagcttggaa 9420atcaagaagg aggtagtcca gtgcatggga tcgttccaag atggcgtcgc ggaaaagtgc 9480gtggattact tccagaggtt tcggagatcg acccacgtga ccccgaagtc atacctgagc 9540tttattcaag gttataagtt tatttacggc gaaaagcacg tcgaagtccg caccctcgcc 9600aaccgaatga acaccgggct ggaaaagctc aaggaagcct cagaatccgt ggccgcactg 9660agcaaggaac tcgaggctaa agaaaaagag ttgcaagtgg ccaacgacaa agcggacatg 9720gtgcttaaag aagtgaccat gaaggcccaa gcagcggaaa aggtgaaggc cgaagtgcag 9780aaggtcaagg acagggcgca agcaatcgtg gactcgatca gcaaagacaa ggctattgcg 9840gaggagaagc tcgaggccgc caagcctgca ctggaggaag cagaagccgc tctccagacc 9900atcagaccta gcgacatcgc caccgtgcgg actcttggcc ggcctcccca cttgattatg 9960cgcatcatgg actgcgtgct gctgctcttt cagcggaagg tgtcggcggt gaagatcgac 10020ttggaaaagt cgtgtactat gccttcgtgg caagaatcac tcaaactcat gaccgcaggc 10080aacttcctac agaacctcca acaattccca aaggatacca tcaacgaaga agtcattgaa 10140ttcctctctc cgtatttcga aatgcccgat tacaacattg aaacggccaa acgcgtgtgc 10200ggcaacgtgg cgggactgtg ctcatggact aaggcgatgg cgtccttctt ctcgattaac 10260aaggaagtgc ttccacttaa ggcgaacctg gtggtccagg agaatcgaca cctcctggcg 10320atgcaggacc ttcagaaggc ccaggccgaa cttgacgaca agcaggcaga actggacgtg 10380gtgcaggccg aatacgagca ggctatgact gagaagcaaa cgctgctgga ggatgccgag 10440cggtgccggc ataaaatgca gaccgcctcg acccttattt ccggactcgc tggagagaag 10500gaacgctgga ctgaacagag ccaggaattt gccgcgcaaa ccaagagact tgtgggcgac 10560gtcctgctcg caactgcctt cctgtcctac tccggcccat tcaatcagga attccgcgac 10620ctgttgctta atgactggcg 
gaaagagatg aaagcccgca agatcccttt cgggaagaac 10680cttaacctct ccgagatgct gatcgacgca ccgaccatta gcgaatggaa cctccaagga 10740ttgccaaatg acgatctaag tatccagaac gggatcatcg tgacgaaggc ctccaggtac 10800cctcttctga tcgatccgca gacccaagga aagatctgga tcaagaataa ggagagcaga 10860aatgaactgc agataacctc actcaatcac aagtacttta ggaaccacct tgaggattca 10920ctgagcttgg gacggccgct gttgatcgag gatgtggggg aggaactgga tccagcgttg 10980gataacgtcc tggagcgcaa cttcatcaag accggatcga ccttcaaggt gaaggttggc 11040gataaggaag tggacgtgct ggacgggttt agactctaca taaccactaa gctccccaac 11100cctgcctaca cgcctgaaat ttccgccaga acctcaatta tcgacttcac cgtcactatg 11160aagggactcg aggaccagtt gctgggtcgc gtcatcctga ctgaaaagca ggaactggag 11220aaggaacgga ctcatctcat ggaggatgtg accgccaata agcgccggat gaaggaattg 11280gaagataact tgctgtatag acttacttcc acccaaggct ccctcgtgga ggacgaaagc 11340ctcattgtgg tcctgtccaa tactaagcgc actgccgaag aagtcactca aaagctggaa 11400atttcagcgg agactgaggt gcagatcaac tcggccagag aggaataccg gcccgtggct 11460actcgcggga gcatcctgta cttcctcatt actgagatgc gcctggtgaa cgagatgtat 11520cagacctccc tccgacaatt ccttggtctt ttcgacttga gccttgctcg ctcggtcaag 11580agccccatta ctagcaaacg catcgcgaac atcatcgagc acatgactta cgaagtgtac 11640aagtacgcgg ctagaggact gtacgaggag cacaagttcc ttttcacact cctcctcacc 11700ctcaagatcg atattcaacg gaacagagtc aagcacgaag aattcctgac tctcatcaag 11760ggaggagcct cgctcgatct caaggcctgc cctccgaagc catccaagtg gatcttggat 11820attacctggc tgaacctcgt ggagcttagc aaactgcggc aattcagcga cgtgctggac 11880cagatttcca gaaacgaaaa gatgtggaag atttggtttg acaaggagaa ccccgaggaa 11940gaaccattgc cgaatgccta tgataaaagc ctggactgct ttaggcggct gctcctgatc 12000agatcatggt gccctgatcg taccatcgcc caagcgcgca agtatatcgt cgactccatg 12060ggtgaaaagt acgctgaggg ggtgatcctc gacctggaaa agacgtggga agagagcgac 12120ccacgcaccc cgctcatttg cctcttaagc atgggatccg accccaccga tagcattatt 12180gctctgggca aaaggctcaa aatcgaaacg cggtacgtga gcatggggca gggtcaggag 12240gtgcacgccc ggaagctgct ccagcaaact atggcgaacg gcggatgggc gcttcttcag 12300aactgccatc tcggcctcga cttcatggat 
gagctgatgg atattattat tgaaaccgag 12360ctggtccacg acgcattcag gctctggatg accaccgagg cccacaaaca attccccatc 12420acgctgttgc agatgtccat caagttcgct aacgatcccc ctcaaggact tagggctggg 12480ctcaagcgga cgtattcagg agtctcacag gatctccttg atgtgtcaag cggctcacaa 12540tggaaaccca tgctctacgc tgtggctttc cttcactcca cggtgcagga acggcgcaag 12600ttcggagcct tggggtggaa tatcccctac gaattcaacc aagcggactt caacgctacc 12660gtgcaattca tccaaaatca tcttgacgac atggatgtga agaagggagt gagctggacc 12720accattagat acatgattgg cgagatccag tacggggggc gcgtgaccga cgactatgac 12780aaacggctct tgaacacgtt cgcaaaagtg tggtttagcg agaacatgtt cggcccagac 12840ttctctttct accaaggtta caatatcccg aagtgctcca ccgtcgataa ctacctccag 12900tacatccaaa gccttcccgc ctacgactcg cccgaggtgt tcggtctgca tccgaacgcg 12960gacatcacct accaatccaa actggctaag gacgtccttg atacgattct gggtattcag 13020ccaaaagaca cgagcggagg aggggacgaa accagagaag ccgtggtcgc ccgcttggct 13080gacgacatgc tggaaaaact tccgccggac tacgtgccgt ttgaagtgaa agaacgcttg 13140cagaagatgg gtccctttca accaatgaac atattcttga gacaagagat tgatagaatg 13200cagcgggtgc tgagcttggt gcggtcgacc ctcacagaac ttaaactggc gattgacggg 13260acgatcatta tgagcgagaa tctgcgcgac gccttggact gcatgtttga tgctcggatc 13320cccgcgtggt ggaagaaggc ttcttggatc tcgagcacgc tcgggttttg gtttaccgaa 13380ctcatcgagc ggaattcgca gttcacttca tgggtcttta acgggcgacc ccactgcttc 13440tggatgacgg gcttcttcaa cccgcaaggt tttctcaccg ccatgcggca ggagatcact 13500agagccaaca aaggctgggc gttggacaat atggtcctgt gcaacgaagt gaccaagtgg 13560atgaaggacg acatctccgc gccccccacc gagggcgttt atgtctacgg actttacttg 13620gaaggtgccg gatgggacaa gcggaacatg aagctcatcg agagcaaacc gaaggtgctg 13680tttgagctca
tgcccgtgat ccgcatctat gccgaaaata acacattgag agatccccgc 13740ttctactcgt gccccatcta taagaagcct gtcagaaccg acttgaacta cattgccgcc 13800gtcgacctga gaacggccca gactcccgag cattgggtcc tgaggggagt ggcgctgctg 13860tgcgatgtga agtag 138751913875DNAArtificial SequenceSynthetic polynucleotide 19atgtttagaa ttggcaggcg ccagctgtgg aagcactcag tgacccgggt gctaacgcag 60cggcttaagg gcgagaagga ggcgaagcgg gccttgctgg acgccaggca caactacctc 120ttcgcgattg tggcgtcgtg ccttgacctg aacaaaaccg aggtcgagga cgcgattttg 180gaagggaacc agatcgaaag gatcgaccaa ttgtttgccg tcggaggtct gcggcacctg 240atgttctact accaggatgt ggaggaggcc gaaaccggcc agctaggctc cctcggagga 300gtgaacctgg tgtcgggaaa aatcaagaag cccaaggtgt tcgtcaccga aggcaatgat 360gtcgctctga ccggcgtgtg cgtgttcttc attcggaccg acccctcaaa agcaatcacc 420ccggacaata tccaccagga ggtttccttc aacatgctcg acgcggctga cggtggactg 480cttaacagcg tccggaggct cctctcggac attttcattc ccgccctgcg ggctactagc 540catggctggg gagaactgga ggggcttcag gacgccgcta atattagaca ggaattcctg 600tcctcgctgg agggattcgt gaacgtcctg tcgggcgccc aggaaagtct caaagaaaag 660gtcaacctca gaaagtgcga catcttggaa ctgaaaacgc ttaaagaacc gaccgattac 720ctgaccctgg ctaacaaccc ggaaaccctg ggcaagatcg aagattgcat gaaagtgtgg 780atcaagcaaa cagaacaggt cctggctgaa aacaaccagc tgctgaagga agccgatgat 840gtggggccca gagccgaact ggagcactgg aagaagaggt taagcaagtt caactacctt 900ctggaacagc ttaaatcccc ggatgtgaag gccgtgctgg ccgtgctggc cgccgccaag 960agcaagctcc tcaagacttg gagagagatg gacatccgga tcacggacgc tacgaacgaa 1020gccaaggata atgtcaagta cctttacacc ttggagaagt gctgcgaccc actgtacagc 1080agcgatcctc tgagcatgat ggatgctatc ccgaccctta tcaacgcgat caagatgatt 1140tacagcatca gccactatta caacacaagc gaaaaaatta cttcactgtt cgtgaaggtc 1200actaaccaga ttatttccgc ttgtaaggcc tacatcacta acaacggaac tgccagcatc 1260tggaaccagc cacaggacgt ggtggaggag aagatcctga gcgccatcaa gctgaaacaa 1320gagtatcagc tgtgctttca caagacgaaa cagaagctga aacagaaccc aaacgctaag 1380cagttcgatt tctcagaaat gtacatcttc ggcaagttcg aaaccttcca caggcgcctc 1440gccaagatta ttgacatctt cactaccctt aaaacgtaca gcgtcctcca 
agactccacc 1500attgaagggc tggaggatat ggccaccaag taccagggga tcgtggccac catcaagaag 1560aaggaataca acttccttga ccaacggaag atggacttcg accaggacta tgaggagttc 1620tgcaaacaaa ccaacgattt gcacaatgaa ctcaggaagt tcatggatgt gacctttgcc 1680aagatccaga atactaacca ggccctgcgc atgctgaaga agttcgaacg gctgaatatt 1740ccgaacttgg ggatcgacga caagtaccaa ctcatcctgg aaaactacgg cgctgacatt 1800gacatgatct cgaaactcta caccaagcag aaatacgacc caccgctggc tcggaaccag 1860ccgcctatcg cgggaaagat cctgtgggcg agacagctgt tccataggat tcagcagcca 1920atgcagcttt tccagcagca tccggccgtg ctcagcaccg ccgaggcgaa acctatcatt 1980cggtcataca acaggatggc caaagtgttg ctggaattcg aagtgctgtt ccaccgggcg 2040tggctccggc agatcgagga gatccacgtc gggctagagg cctccctcct cgtgaaggct 2100cccggaaccg gggagctgtt cgtcaacttt gaccctcaaa tcttgatcct tttccgggag 2160actgaatgca tggcccagat gggattggag gtgtcaccat tggctacttc gctctttcaa 2220aagcgggacc gctacaaaag gaatttcagc aacatgaaga tgatgctcgc agaataccag 2280cgcgtcaagt ccaagatccc tgccgccatc gaacagctca tcgtcccgca ccttgccaaa 2340gtggacgaag cgcttcagcc cggactggcg gctctgacgt ggactagcct taacatcgaa 2400gcgtatttag agaacacctt cgccaagatc aaggacctgg agctgctctt ggatagggtg 2460aacgacctga tcgagttcag aatcgacgct atactggagg aaatgagctc aacgccgctg 2520tgccagctcc ctcaggaaga acccttgact tgtgaagaat tcctgcagat gactaaggac 2580ttgtgcgtga acggtgcgca gattctgcat tttaagtcat ccttggtgga ggaggcggtg 2640aacgagcttg tcaacatgct cctggacgtg gaggtgctgt cagaggagga atcggaaaaa 2700atctccaacg aaaacagcgt caactacaag aacgaatcga gtgccaagcg ggaagagggt 2760aacttcgaca ccctcacctc gagcattaac gcgcgggcga acgccctcct tctcaccact 2820gtgacgcgga agaagaagga gactgaaatg cttggcgagg aagccagaga actcctgtcc 2880cattttaatc atcaaaatat ggatgcactg ctcaaggtca ctcgcaacac tctcgaggcc 2940atccgcaaac gaatccacag ctcacacact atcaacttcc gggattccaa ctccgcaagc 3000aacatgaagc agaactcact gccgattttt cgggcttcag tcactctggc gatcccgaat 3060attgtgatgg ccccggccct ggaagatgtc cagcaaaccc tcaacaaggc ggtggaatgt 3120atcatctcag tgcctaaggg tgtgaggcaa tggagcagcg agcttctctc caagaagaag 3180atccaggagc gcaagatggc 
ggctctccag agcaacgaag attcggactc cgacgtggag 3240atgggcgaaa acgaactgca agacacgctg gaaattgcat cagtgaacct cccaattcct 3300gtgcaaacta aaaactacta taaaaacgtc agcgaaaata aggagatcgt caagctggtc 3360agcgtcctgt cgaccattat taactccacg aagaaagaag tgataaccag catggactgc 3420tttaagcgct acaaccacat ttggcagaag ggaaaggagg aagccattaa gactttcatt 3480acccagtccc cgctcttgag cgagttcgag tcccagatcc tgtacttcca gaacctcgag 3540caggagatta acgctgagcc cgaatacgtc tgcgtgggta gcattgcgct gtatactgcc 3600gatctcaagt ttgccctcac ggctgaaact aaggcctgga tggtcgtgat cgggaggcat 3660tgtaacaaga agtaccgctc cgaaatggaa aacattttca tgcttatcga agagttcaac 3720aagaaactca atcggcccat caaggatttg gacgatatcc ggattgcgat ggccgcgctc 3780aaggagatcc gagaggaaca gatctcgatt gacttccagg tcgggcctat cgaggagagc 3840tacgccctgc tgaacagata tggactcctc attgctcggg aagaaattga caaggtggac 3900actctgcatt atgcctggga gaaactgctg gcccgcgcag gggaggtcca gaacaaactg 3960gtcagcctcc aacctagctt caagaaagag ctgatctccg cagtggaggt gtttctgcag 4020gactgccatc aattctacct tgattacgac ctcaacggcc cgatggccag cggactcaaa 4080cctcaagagg catccgaccg gctcatcatg ttccagaacc aattcgacaa catctaccgg 4140aagtatatca cttataccgg cggagaagaa ctgttcggat tgccggcgac tcagtacccc 4200caacttctcg aaatcaagaa gcaattgaac ctcctgcaaa agatttacac gctttacaac 4260agcgtgatcg aaacggtgaa ctcctactac gacatccttt ggtccgaagt gaacatcgaa 4320aaaatcaaca atgaactgct ggaattccag aacagatgcc gcaagttgcc acgcgctctg 4380aaggattggc aggccttctt ggacctgaag aagattatcg acgattttag cgaatgctgc 4440cccttgctgg aatacatggc tagcaaggcc atgatggaga gacattggga gcgcatcacg 4500accctcactg gccacagcct tgacgtgggc aatgagtcgt tcaagctgag aaacattatg 4560gaagcaccgc tgctcaagta caaggaagag atcgaagata tttgcattag cgcggtcaag 4620gaacgggaca ttgagcagaa acttaaacag gtcatcaacg agtgggacaa taagactttt 4680acctttggat cctttaagac ccggggcgag ctccttctga gaggcgactc gactagcgag 4740atcatcgcaa atatggagga ttccctgatg ctcctgggat cactcctgag caacagatat 4800aatatgccct ttaaggctca aatccagaag tgggtgcaat acctgtccaa cagcaccgac 4860attatcgaga gctggatgac cgtccagaac ttgtggatct acctggaagc 
ggtgttcgtc 4920ggcggtgata ttgccaagca gctccctaag gaggctaaaa ggttctccaa cattgacaag 4980agctgggtga agatcatgac tagagcccat gaagtcccga gtgtggtcca atgctgtgtg 5040ggggacgaga ctctggggca gctgctaccc cacctcctgg accagcttga gatctgtcaa 5100aagagcctga cgggctacct ggagaagaaa cgcctgtgct ttccgaggtt cttcttcgtg 5160agcgacccgg ccctgctgga aattctcgga caagcgagcg actcccatac catccaggca 5220catctgctta acgtgttcga taacatcaag agcgtgaagt tccacgagaa gatctacgac 5280cggatcctga gcatctcatc gcaggaagga gagactatag agcttgacaa accagtgatg 5340gccgaaggaa atgtggaggt gtggttgaac tccctgctcg aagagagcca gagctccctg 5400cacctcgtca ttcgccaggc cgcggcgaat attcaggaaa ccgggttcca acttaccgag 5460ttcttgtcca gctttcccgc gcaagtcgga ctcttgggta ttcaaatgat ttggaccaga 5520gattccgaag aggccctccg caacgccaag ttcgacaaaa agatcatgca aaaaactaac 5580caagcattcc tggagctgct taacaccctt atcgatgtga ccaccaggga tctcagctcc 5640actgagcggg tcaaatacga aacgttgatt actatccacg tgcaccaacg cgacatcttc 5700gacgacttgt gccacatgca catcaagagc ccgatggatt tcgagtggct caaacagtgc 5760cggttctact tcaacgagga ttccgacaaa atgatgattc acattaccga tgtggctttc 5820atttaccaaa acgaattcct gggctgtact gaccggctgg tgatcacgcc gctgaccgac 5880cgctgctaca tcactctggc acaggctctg ggaatgtcga tgggaggagc tcctgcgggc 5940ccagcgggaa ctggcaaaac cgaaaccacg aaggatatgg ggcggtgtct ggggaagtac 6000gtcgtggtgt ttaactgctc agaccagatg gactttaggg gactgggtcg gatctttaag 6060ggactggccc agtcaggctc ctgggggtgt ttcgatgaat tcaatcggat cgacttgccg 6120gtgctgtccg tggccgcgca gcaaatttcc atcatcctta cctgtaagaa ggagcacaag 6180aagtccttca tctttaccga cggggacaac gtgaccatga acccagagtt cggactcttc 6240ctcactatga atcccgggta cgccggccgc caagagctcc cagagaatct gaagattaat 6300tttcgctcag tggccatgat ggtcccggat agacaaatca tcatccgggt gaagttggcg 6360tcctgcggct tcatcgacaa cgtggtgttg gccagaaaat tcttcacgct ctataagttg 6420tgtgaagaac agctctcaaa acaggtgcac tacgacttcg gacttaggaa catcctcagc 6480gtgttgagaa ctctcggagc ggcgaagcgc gcaaacccca tggataccga gtcgactatc 6540gtgatgagag tgctgagaga catgaacctt tcaaagctga ttgacgagga cgaaccgctg 6600ttcctttcct tgatcgagga 
cctcttcccg aacatcctcc tcgataaggc cggttacccc 6660gagctcgaag ccgcgatttc acggcaagtt gaagaggctg gactcattaa ccacccacca 6720tggaagctca aggtcatcca gctgttcgag actcagagag tgcggcatgg aatgatgaca 6780cttggtccta gcggcgcggg aaagactacg tgtatccaca ccttgatgcg ggcgatgacc 6840gattgcggca agccgcacag ggaaatgcgg atgaacccga aggcgatcac cgcaccccaa 6900atgttcggac ggctcgacgt ggcgaccaac gactggaccg acggcatttt ttcgaccttg 6960tggcgcaaga ccctgcgggc caagaaagga gaacacatct ggattatcct ggatggcccg 7020gtggatgcga tctggattga aaaccttaac tcagtgctcg acgacaataa gaccctgacc 7080ctggctaacg gcgataggat cccgatggct cctaactgca aaatcatctt cgagccgcat 7140aacattgata atgcatcacc agccaccgtg tcccgcaatg gtatggtgtt catgagctcc 7200agcatcctgg attggtcgcc cattcttgag ggattcctca agaagcgctc accacaggag 7260gccgagattt tgaggcagct gtataccgaa tcatttccgg atctctacag attttgtatc 7320cagaacctcg agtacaagat ggaggtcctt gaagccttcg tcatcaccca aagcattaac 7380atgctgcagg gacttatccc cttgaaggaa cagggcggag aggtgtcaca ggctcacctg 7440ggaaggctgt tcgtgtttgc cttgttgtgg tccgccggcg cggccctcga gctggatggc 7500aggaggcggc tcgagttgtg gctgcggagc cgccccaccg ggactttgga actgccgccc 7560ccggcgggtc cgggcgacac cgctttcgac tactacgtcg cgccggacgg aacttggact 7620cactggaata ccagaaccca agaatacctt tatccatcgg atactacccc tgaatatggt 7680agcatcctcg tccctaacgt ggacaacgtc cgaacggact tcctcatcca aactatcgcc 7740aagcagggca aggcagtcct gctgatcggc gaacaaggca ctgccaagac cgtcattatc 7800aaaggcttta tgagcaaata cgatccggag tgccatatga tcaagagcct gaacttctcc 7860tccgcgacaa ctccgctgat gttccaaaga actattgagt cgtacgtgga taagcgcatg 7920ggaaccactt acgggccgcc ggccggaaag aagatgaccg tgttcattga tgacgtgaac 7980atgccgatca tcaacgaatg gggcgaccag gtcactaacg aaatcgtgag acaactgatg 8040gagcagaacg gattctacaa cctggagaag cccggagagt ttacctccat cgtggatatc 8100cagttcctgg ccgccatgat ccatccgggc ggagggcgga acgacatccc acaaagactg 8160aagagacagt tctccatctt taattgcacc ttgccctcgg aggcctcagt ggataagatt 8220tttggagtga ttggcgtggg ccactactgc acccagcggg gtttcagcga ggaggtcagg 8280gatagcgtga ccaagctcgt gcccttgacc agacggctgt ggcagatgac 
gaagattaag 8340atgctgccca ccccggcgaa gttccactac gtgttcaatc tgagggactt gtcccgcgtg 8400tggcagggaa tgctgaacac cacgagcgaa gtgatcaagg agccgaacga cttgctcaaa 8460ttgtggaaac acgaatgcaa aagagtgatt gcggaccgct tcaccgtgag ctccgacgtg 8520acctggttcg acaaggctct cgtgtccctg gtggaagagg agttcggcga agaaaaaaag 8580ctcctggtgg attgcggcat cgatacgtac ttcgtggact tcctgagaga tgcacccgag 8640gccgcaggag aaacttccga ggaagcagac gccgaaaccc ccaaaattta cgagcccatc 8700gagtcctttt cacacctgaa ggaacgcctg aacatgttcc tccagctcta caacgaatcg 8760atccgcggcg caggaatgga catggtgttc ttcgcggacg ccatggtgca tctcgtgaag 8820atctccagag tgatcagaac tccccagggc aacgctctgc tggtcggagt ggggggatcg 8880gggaagcaat cgctcaccag actggccagc ttcatcgcgg gctacgttag cttccaaatc 8940accctcacca ggagctacaa tacctcgaac ctgatggagg atctgaaggt cctctacagg 9000actgcgggac agcaggggaa gggtattacc ttcatcttta ccgataacga gataaaggat 9060gagagttttc tcgagtacat gaacaatgtg ctgtcgtcgg gggaagtgtc aaacctcttc 9120gcccgcgatg aaatcgacga gatcaacagc gacttagcta gcgtgatgaa aaaggagttt 9180ccgagatgct tgccgaccaa tgagaacttg catgactact ttatgtcaag ggtccgccaa 9240aacctccaca tcgtgctttg tttctcgcca gtgggagaga agttccgcaa ccgcgcactt 9300aaattccccg ccctgatctc cggttgtacc attgactggt tctcccgctg gcctaaagat 9360gcactcgtgg cagtctcgga gcacttcctg acttcgtacg atattgactg ctccttggag 9420attaagaaag aggtcgtcca gtgtatgggg agcttccagg acggagtcgc cgaaaaatgc 9480gtggactact ttcagagatt caggcggagc acccatgtca cccccaagtc atacctcagc 9540tttatccaag gctacaagtt catctacggc gaaaagcatg tcgaggtccg caccctggca 9600aacagaatga ataccggtct ggagaagctc aaggaagcgt cggagtcagt ggctgccttg 9660tcaaaggaac tcgaagccaa ggaaaaggaa ttacaggtgg cgaacgacaa ggccgacatg 9720gtgctgaaag aagtgactat gaaggcccag gcggcggaga aggtgaaggc cgaagtgcag 9780aaggtcaagg accgcgccca agctatcgtg gactcgatct cgaaggacaa ggcgattgct 9840gaggagaagc tcgaagccgc caagccggct ctggaggaag ccgaagcggc attgcaaacc 9900attagacctt cagacatcgc tacggtcagg actctgggaa ggccccccca tttgatcatg 9960cggatcatgg actgcgtgct cctcctgttc caaagaaaag tgagcgccgt gaagatcgat 10020cttgagaaat cctgcacgat 
gccctcatgg caggagtcgc tcaagctcat gacggccggt 10080aacttcctgc aaaacctcca gcagtttcct aaggatacaa tcaacgaaga ggtgattgaa 10140ttcctcagcc cgtactttga gatgcccgac tacaacatcg aaactgcgaa acgcgtgtgc 10200gggaacgtgg ccggactgtg tagctggacc aaggccatgg cctccttctt ctccatcaac 10260aaggaagtgc tgcctctaaa ggcaaacttg gtcgtgcagg aaaaccggca ccttctcgca 10320atgcaggacc ttcagaaggc acaagcggag ctggacgaca agcaggctga gctggacgtg 10380gtccaggctg agtacgagca ggcgatgact gagaagcaaa cgctactgga ggacgcagag 10440cgctgtagac ataaaatgca gaccgcatcc accctgatct ccgggctcgc cggcgaaaag 10500gagaggtgga ctgaacagtc acaagaattt gctgctcaaa caaagcgcct ggtcggtgat 10560gtcctgcttg ccactgcctt cctgagctac agcggtcctt ttaaccaaga gttccgcgac 10620ctcctcctca atgactggag aaaggaaatg aaggctcgga agatcccgtt cggcaagaac 10680ctcaatctta gcgagatgct cattgacgca cctactatca gcgaatggaa cttgcaagga 10740ctgccgaacg acgacctgtc cattcaaaac ggaatcatcg tcaccaaggc ttcgcggtat 10800ccactcctca ttgacccgca gactcagggc aagatatgga ttaaaaacaa ggaaagccgc 10860aatgaactgc aaattacctc cctgaaccac aagtacttcc gcaaccacct cgaggacagc 10920ctcagccttg ggcgcccatt gctcatcgag gacgtcggcg aggaactgga tccggccctg 10980gacaacgtcc tggaaagaaa cttcatcaag accggctcga catttaaagt caaggtgggc 11040gacaaggaag tggacgtgct ggatggcttt agactctaca ttaccaccaa attaccgaac 11100ccggcctaca ccccagaaat ttcggcgcgg acctccatca ttgactttac tgtgactatg 11160aagggcctgg aggaccagct cctcggccgg gtgatcctga ctgaaaaaca agagctcgaa 11220aaggaaagga ctcacctcat ggaggacgtg acggcgaaca aaagacggat gaaggaattg 11280gaggataatc tcctgtatag actcactagc actcaagggt ccctggtcga agatgagtcc 11340cttattgtgg tcctttcgaa tacaaaacgg accgcggaag aggtcactca gaaactcgag 11400atctccgcgg aaactgaagt gcagatcaac agcgctcgag aggaatacag gccagtggca 11460accaggggct ccatactcta cttcctcatc accgaaatgc gcctcgtcaa tgagatgtac 11520caaacctcgc tgcggcaatt cttggggctt ttcgacctct cactcgcacg gtcggtgaaa 11580tcgccgatta ctagcaagag aattgccaat atcatcgagc acatgaccta cgaggtgtac 11640aagtacgcag ccaggggact gtacgaggaa cacaagttcc tgtttaccct tctcctgaca 11700ctgaagattg acattcagag aaacagagtg 
aaacatgagg aatttctcac cctcatcaaa 11760ggaggggcta gccttgattt gaaggcgtgc ccgccgaaac cttcgaagtg gatcctggat 11820atcacctggc tcaacctggt ggagctgtca aagctgcggc agttttccga cgtgctggac 11880caaatttcgc gcaacgagaa gatgtggaag atctggttcg acaaagagaa ccccgaagag 11940gagccgctgc ccaacgctta tgacaagtca cttgactgct tccgccgcct cctgctcata 12000cggagctggt gcccagaccg gaccattgcc caggcgcgaa agtacatcgt cgatagcatg 12060ggggaaaagt acgcggaggg tgtgattctc gacctcgaga aaacttggga ggaatcagac 12120cctcgcaccc ccttgatttg cctcctgtcc atgggctccg atcccactga cagcatcatt 12180gcactcggga agaggctgaa gatcgaaacg cgctacgtgt caatgggaca gggacaagaa 12240gtccacgctc ggaagctcct ccagcaaacg atggcgaacg gtggttgggc cctgttgcag 12300aattgccatc tggggctgga cttcatggat gaactcatgg acattattat cgaaacggaa 12360ctggtccacg atgcctttag actttggatg acgactgaag cccataaaca attccctatc 12420accctgcttc aaatgtcgat caagttcgcc aatgaccctc cgcaggggct ccgggccggc 12480ctcaaaagga cttactccgg ggtgtcacag gaccttcttg acgtgtcctc cggaagccaa 12540tggaagccaa tgctttacgc cgtcgcgttc ctccactcca cggtgcagga acggcggaag 12600ttcggagctc tgggctggaa tattccgtac gaattcaacc aggcagattt caatgcaacc 12660gtccagttca tccaaaacca tctcgatgat atggatgtga agaaaggagt gtcatggact 12720accattagat acatgatcgg ggagatccag tacgggggac gagtgactga cgattacgat 12780aagcgccttc tgaacacttt cgctaaggtc tggtttagcg aaaacatgtt tggtccagac 12840ttctccttct accaagggta taacatcccg aagtgctcca ccgtggacaa ctacctccag 12900tacattcagt cgctccccgc ttatgattca ccagaagtct ttggcttgca cccaaatgcc 12960gatatcacct accaaagcaa actcgcgaag gacgtcctgg acaccatact ggggatccag 13020ccgaaagata cctcgggggg cggcgatgag actcgcgaag cagtcgtggc gagactggcg 13080gacgacatgc ttgaaaagct gcctcctgac tacgtgccat ttgaagtgaa agaaagattg 13140cagaagatgg gccctttcca gcctatgaac attttcctga ggcaggagat tgaccgcatg 13200caacgggtgc tcagcctcgt gcgatccacc ctcactgagt tgaagctcgc catcgatggg 13260accatcatca tgagcgaaaa cttgcgggat gcactcgact gcatgttcga cgctaggatc 13320ccagcgtggt ggaaaaaggc atcatggatt agctccaccc tgggcttctg gttcaccgaa 13380ctcatcgaga ggaactccca gttcacctcc tgggtgttca 
acggacggcc tcattgcttt 13440tggatgaccg gcttcttcaa cccccagggt tttctcacgg ccatgcgcca ggaaattacc 13500cgggcaaaca aggggtgggc cctcgacaac atggtgcttt gtaacgaggt tactaagtgg 13560atgaaggacg acatctcagc cccccctacc gagggagtct acgtgtacgg cctgtacctg 13620gagggcgcag gatgggataa acggaacatg aagctgatcg agtcgaagcc gaaagtcctg 13680ttcgagctca tgcccgtcat tcgcatctac gccgagaaca acaccctgcg cgacccaaga 13740ttctacagct gcccgattta caaaaagccc gtccggacgg acttgaacta tatcgcggcc 13800gtcgatctgc ggactgcgca gacccctgag cactgggtgc tgcggggagt ggcgctgctc 13860tgcgacgtca agtag 138752013875DNAArtificial SequenceSynthetic polynucleotide 20atgttccgga tcggacgcag gcagctctgg aaacatagcg tcacgagagt gctgacccag 60cgcctcaagg gggagaagga agcgaagcgg gccctgctcg acgccagaca caattacttg 120ttcgccattg tcgcctcctg cctggatctg aacaagactg aggtcgaaga tgccatcctc 180gagggaaacc aaatcgaaag aattgaccaa ttattcgccg tgggcgggtt gcgccatttg 240atgttctact accaagacgt tgaggaggca gagactggac agcttggttc cttgggtgga 300gtgaacttag tgtcggggaa gatcaagaag ccgaaagtgt tcgtgactga aggcaacgat 360gtggctctga ctggagtgtg cgtgttcttt attcgcactg acccctctaa ggccattacg 420ccggacaaca ttcatcagga ggtgtcattc aacatgctgg acgccgcgga tggcggcctg 480ctcaacagcg tgcgccggct gctttccgac attttcattc ctgcattgag agcgacttcg 540catggttggg gtgaactgga agggctgcaa gacgctgcca acatccgaca ggaattcctg 600agctccctgg aaggtttcgt gaacgtcctc agcggcgccc aggaaagcct caaggaaaag 660gtcaacctga gaaagtgtga catcctggaa ctgaaaacgc tgaaggagcc cacagattac 720cttaccttgg ccaataaccc ggaaaccctc gggaagatcg aggactgcat gaaggtctgg 780atcaagcaga
ctgaacaagt gctcgcagag aacaaccaac tcctgaagga agccgacgac 840gtcgggcctc gggcagagct cgagcactgg aagaagcggc tgagcaagtt caactacctc 900ctggaacagt tgaagtcgcc ggacgtgaag gcggtgctcg ccgtgttggc cgccgccaag 960agcaagctgc tcaaaacttg gcgggagatg gacattagga tcaccgacgc caccaacgaa 1020gcaaaggaca atgtgaagta cttatacacc ctcgagaagt gctgcgatcc gctctactcg 1080tcggaccccc tgtcgatgat ggatgctatc ccgacgctca tcaacgcaat caagatgata 1140tattccatca gccactacta caatacgagc gagaagatta cttccctgtt cgtgaaggtt 1200accaaccaga ttatctccgc gtgcaaggcc tacattacca acaacggaac tgccagtatc 1260tggaatcaac ctcaagacgt ggtcgaagag aagatactct cggctatcaa gctgaagcag 1320gaataccaac tgtgctttca caagaccaag caaaagctga agcagaaccc aaacgccaaa 1380cagttcgatt tcagcgagat gtacatcttc gggaagttcg agacttttca ccgccggctg 1440gccaagatca tcgacatctt caccaccctc aagacgtact ccgtgctgca agactcgact 1500atcgagggat tggaggatat ggccacgaag tatcaaggca ttgtggcaac catcaagaag 1560aaagaataca actttctcga ccagagaaag atggatttcg accaagatta tgaagagttc 1620tgcaagcaaa ccaatgacct tcataacgaa ctccggaagt tcatggatgt caccttcgcg 1680aagatccaga acacgaacca agcactgaga atgctcaaga agttcgaacg ccttaacatt 1740ccaaacctcg ggattgatga caagtaccaa cttatcctgg agaattacgg tgccgacatc 1800gacatgatct ccaaattgta caccaaacaa aagtacgatc ctccactggc gcgcaaccaa 1860ccccccatcg ccggaaagat cctgtgggca agacaattat tccacagaat ccagcagccg 1920atgcaactgt ttcagcagca tcccgctgtc cttagcacgg cggaagccaa gcctatcatc 1980agatcgtata atcgcatggc aaaggtgctg ctggaattcg aggtcctctt ccaccgggcg 2040tggctgaggc agattgagga gattcacgtg ggacttgagg cgagcctgct tgtcaaagct 2100ccgggcaccg gagagctgtt cgtgaatttc gacccacaaa tcttgatcct gttccgggaa 2160actgagtgca tggcccagat gggcctcgag gtgtcaccac tggccacgtc attgttccaa 2220aagcgggaca gatacaagcg caatttttcg aacatgaaaa tgatgctggc cgaataccaa 2280cgggtgaaat ccaagattcc cgctgccatt gaacaactca tcgtgcctca tcttgcgaaa 2340gtcgatgagg ccctgcaacc gggactggcg gcgctgactt ggacttccct taacattgag 2400gcatacctcg agaacacctt cgcaaagatt aaagacctag agctcctcct cgaccgcgtg 2460aacgacctta tcgagttccg gatcgatgcc atcctggaag 
agatgtcgtc cactccactt 2520tgccaacttc cccaggagga accgttgact tgtgaagaat tcttgcagat gaccaaagac 2580ctttgcgtca acggcgccca gatcctgcac tttaagtcta gcctggtgga ggaagcggtg 2640aacgagctcg tcaacatgct tctcgacgtg gaagtgctgt cggaagagga atccgagaaa 2700atctccaacg aaaatagcgt gaactacaaa aatgagtcat cagccaagag ggaagagggt 2760aacttcgata cgctgactag cagcattaac gccagggcaa acgccctgct cctgaccacc 2820gtgactcgga agaagaaaga gactgagatg ctgggagaag aagctcggga gcttctgagc 2880cacttcaacc accaaaacat ggatgccctg cttaaggtga cccggaacac actggaggcg 2940atccggaagc ggatccactc gagccatacc attaattttc gggattcgaa cagcgcctcg 3000aacatgaagc aaaattccct gccgattttt agagcgagcg tcaccctggc catccccaac 3060atcgtgatgg cgcccgcatt ggaggacgtg cagcagaccc ttaacaaggc ggtggagtgt 3120atcattagcg tccctaaggg cgtccgccaa tggagctcag agttgctctc gaagaagaaa 3180atccaggagc ggaagatggc ggctctccag tccaacgagg actccgattc ggacgtggag 3240atgggtgaaa atgagttgca agacactctc gagatcgcct ccgtcaatct gccgattccc 3300gtccaaacca agaactacta caaaaacgtg agcgagaaca aagagatcgt gaagctcgtc 3360agcgtgctca gcactatcat taactcaacg aagaaagaag tgatcactag catggactgc 3420tttaagcggt acaaccatat ctggcagaaa ggcaaggaag aagccatcaa gaccttcatc 3480acccaatccc ccttgttgtc ggaattcgag tcacagattt tgtacttcca gaatctcgag 3540caggagatca atgcggagcc agaatacgtg tgcgtggggt ccattgcgct atacaccgcc 3600gaccttaaat tcgcgctgac ggccgaaacc aaggcctgga tggttgtgat cggccgccat 3660tgtaacaaga agtacaggag cgaaatggag aatatcttca tgctgatcga agagttcaac 3720aagaagttga accggcccat taaggacctg gatgatattc gcattgccat ggccgccctt 3780aaggagatcc gcgaagaaca gatctccatc gactttcagg tcggccctat cgaagagagc 3840tacgccctcc tgaaccgcta cgggctcctc attgccaggg aagaaattga caaggtcgat 3900acccttcact acgcatggga gaagttgctg gcgcgcgccg gagaggtcca aaacaagttg 3960gtgtccctgc aaccctcctt caagaaggaa ctgatttcgg cagtcgaggt gttcttgcaa 4020gactgccatc aattttacct ggactacgac ctgaatggac ccatggcgtc cggcctcaag 4080ccccaagagg cctcagacag actgatcatg tttcagaacc aattcgacaa catctaccgg 4140aagtatatca cgtataccgg tggagaagaa ctgttcggac tcccggccac gcaatacccg 4200cagctcctcg 
agattaagaa gcagcttaac cttcttcaga agatttacac gctgtacaat 4260agcgtcatcg agactgtgaa ttcctattat gacatccttt ggagcgaagt caatatcgag 4320aaaatcaaca acgaactcct cgagttccag aacaggtgcc ggaaattgcc ccgcgccctg 4380aaggactggc aggcgttctt ggatctgaaa aagattattg acgacttctc cgaatgctgt 4440cctttgctcg agtacatggc ctctaaggct atgatggagc gacattggga acgcatcacc 4500accctcaccg gacacagcct cgacgtcggc aatgagtcct tcaagttgag aaatattatg 4560gaggcgcctc tgctgaagta caaagaagag atcgaggaca tttgcatttc ggcggtcaag 4620gagagggaca ttgaacagaa gctcaagcaa gtcatcaatg agtgggataa taagaccttc 4680acatttggga gctttaagac ccgcggcgag ctactcctgc gcggcgactc gaccagcgaa 4740atcattgcca acatggaaga ttcactcatg ctgctgggtt ccttgctctc gaacagatat 4800aacatgccct tcaaagcgca gatacagaag tgggtgcagt acctgtctaa tagcaccgac 4860attatcgaat cctggatgac tgtgcagaac ctctggattt acctggaggc cgtgttcgtc 4920ggcggagaca tcgcgaagca actgccgaag gaggctaaaa gattctccaa tatcgataag 4980tcgtgggtga aaatcatgac gcgcgcacac gaggtcccat cagtggtgca gtgctgtgtc 5040ggtgacgaga ctttgggtca actgctgccg cacctcctcg accagctcga gatttgccaa 5100aagtccttga cgggatacct ggaaaagaag cggctttgct tcccccgctt cttcttcgtg 5160tccgaccccg cgctgctcga gatcctgggc caggctagcg actcacacac cattcaggcg 5220catctcctga acgtgttcga caacattaag tcagtgaagt tccacgaaaa aatttacgac 5280cgcatcctga gcatcagctc gcaggagggg gagactattg aactcgacaa gcccgtcatg 5340gccgagggga acgtggaggt gtggttgaac tcactcctgg aggaaagcca atcctcgctc 5400cacctcgtga ttcgccaggc cgcagcgaac atccaggaga ctggatttca gctcaccgag 5460ttcctctcga gctttcctgc acaagtcggc ctgcttggca ttcagatgat ttggacccgc 5520gacagcgaag aggccctgag aaacgccaag ttcgacaaga agatcatgca aaagactaat 5580caggcattct tggaactgct gaacactctg atcgatgtca ccactcggga tctgtcgagc 5640actgagcgcg tgaagtacga gactttgatt accattcatg tgcaccagcg ggacatcttc 5700gacgacctgt gccatatgca tattaaaagc ccgatggatt ttgagtggct gaagcaatgc 5760agattctact tcaatgagga ctcagacaag atgatgattc atatcaccga cgtggccttt 5820atctaccaaa acgaattcct gggttgtact gataggttag tgatcactcc gctcaccgac 5880aggtgctata tcaccctggc acaggccctg ggaatgtcga 
tgggaggggc ccccgccggc 5940ccggccggaa ccgggaaaac tgaaactact aaagatatgg ggcggtgcct tgggaagtac 6000gtcgtggtgt tcaattgcag cgatcaaatg gatttccggg gacttggacg cattttcaag 6060ggtctggcgc aaagcggcag ctggggatgc ttcgacgaat tcaaccggat cgacttgccc 6120gtgctctccg tcgcagccca acaaatctcc atcatcctga cttgcaagaa ggagcacaag 6180aagtcgttca tcttcaccga cggagacaat gtcactatga acccagagtt tggcctattc 6240ctgacgatga acccgggcta tgccggcagg caggagctgc ctgagaatct gaagatcaac 6300ttccggagcg tggctatgat ggtgcctgat cgccaaatca ttatccgcgt gaaactggcc 6360tcgtgtggat tcatcgacaa tgtggtgttg gctaggaagt ttttcactct ctacaagctc 6420tgtgaagaac agctcagcaa acaggtccat tacgacttcg gcctccggaa cattcttagc 6480gtgctccgga ctcttggagc cgccaagcgg gcgaacccga tggacaccga gtccaccatt 6540gtgatgaggg tgttgcggga tatgaacctc tccaaactga tcgacgaaga tgaacctttg 6600ttcctgagcc tcatcgagga tctgtttcct aacatcctgc tcgacaaagc cggatatccc 6660gaactcgaag ccgctatcag ccgccaggtg gaggaagcgg ggctcatcaa ccatcctccg 6720tggaagctca aggtcattca gctgtttgaa acgcagagag tgcggcacgg catgatgacc 6780ctgggaccga gcggtgccgg aaaaacgact tgcatccaca ccctcatgag agccatgacc 6840gattgtggca agccgcaccg ggaaatgcgg atgaacccaa aagcgattac ggccccccag 6900atgtttggac ggttggatgt ggcaaccaac gactggactg atgggatttt ctcaactctg 6960tggcgcaaga cgctgcgcgc gaaaaagggg gaacacattt ggatcattct tgacggtccg 7020gtggacgcca tttggattga aaatttgaac agcgtgctgg atgataacaa gacgctgact 7080ttggccaacg gagacagaat ccccatggcc ccaaactgca agatcatctt cgaacctcac 7140aacatcgaca atgcctcgcc cgcgaccgtg agccgcaacg gaatggtctt tatgagtagc 7200agcattttgg actggagccc tatcctcgag ggattcctga agaagcgctc accgcaggag 7260gcggagatcc tgaggcagct ttacactgaa agtttccccg atctctaccg cttctgcatc 7320cagaaccttg aatacaagat ggaggtgctc gaggccttcg tcatcaccca gtccatcaac 7380atgctccaag ggctcatccc gctgaaggag caaggcggag aggtgtcaca agcgcacctc 7440ggcagactgt ttgtgtttgc cctgttgtgg agcgccggag cagctttgga gcttgatggg 7500cggcggcgcc tggaattgtg gctgcgctcc cggcctaccg ggactttgga actcccacct 7560cccgccggcc ccggcgacac agcgttcgat tattacgtgg cccccgacgg cacctggacc 7620cactggaaca 
cccgcactca agaatacctg tacccttcgg acaccactcc agagtatgga 7680tccattcttg tgcctaacgt ggacaatgtc cggacggact ttttaatcca gaccattgct 7740aagcagggaa aggcggtgct gctcattgga gagcaaggga cagcaaagac cgtgatcatc 7800aaggggttca tgtcgaagta tgacccggaa tgccatatga taaagtcact gaatttcagc 7860agcgctacaa ccccactcat gttccaaaga accatcgagt catacgtcga caagagaatg 7920ggcactactt acggcccacc ggccggaaag aagatgaccg tgttcatcga tgatgtgaac 7980atgccaatca ttaacgagtg gggcgaccag gtcaccaacg agattgtccg gcagctcatg 8040gagcaaaacg gcttctacaa cctggagaag cccggagagt ttacctcaat cgtggatatt 8100cagttccttg cagcgatgat ccacccgggc ggaggtcgca atgacatccc ccagaggctt 8160aaaagacagt tctcgatttt taactgcacc ttacccagcg aagcgtcggt ggataagatc 8220ttcggagtca tcggcgtggg gcactattgc acccagagag gcttttccga ggaagtccgg 8280gactccgtca cgaagctcgt gcctttgacc cgccgcctgt ggcagatgac caaaattaag 8340atgctcccta cgccggctaa atttcactac gtgttcaacc tacgggacct gtcccgggtg 8400tggcaaggca tgctgaacac aaccagcgaa gtcatcaaag agccgaacga cttgctcaag 8460ctctggaaac acgaatgcaa gcgcgtgatc gcggaccggt ttaccgtcag cagcgacgtg 8520acctggttcg acaaggcgct cgtgtcgttg gtcgaagagg agtttgggga agaaaaaaag 8580ctccttgtgg actgtggcat cgacacctac ttcgtggatt tcctgcggga tgccccagag 8640gcggcgggag aaaccagcga agaagccgac gcagaaactc caaagattta cgagccgatt 8700gaatcgtttt cgcacctcaa agaacgcctc aacatgttcc tccagcttta taacgaatcc 8760atccggggag cgggcatgga tatggtgttc ttcgctgacg caatggtcca ccttgtgaag 8820atctcgcgcg tcatccgcac cccacagggg aacgccctct tggtgggggt cggaggctcc 8880ggaaagcaaa gcctgacccg gcttgcctcc ttcattgccg gctacgtcag ttttcaaatc 8940acgctgaccc gctcctataa caccagcaac cttatggaag atctcaaagt cttgtaccgg 9000accgctggcc agcagggcaa gggtatcacc ttcatcttca ctgacaatga gattaaagat 9060gagagttttc tcgaatacat gaacaatgtg ctgtcaagcg gcgaagtctc caaccttttt 9120gcgcgggatg agattgatga aattaactcg gacctggcaa gcgtgatgaa gaaggagttc 9180ccgaggtgct tgccgaccaa cgaaaacttg cacgactact tcatgagccg cgtgagacag 9240aacttgcata tcgtgctgtg cttctcgccg gtcggagaga agttccggaa ccgcgcgctc 9300aagtttcctg cactgatctc gggctgcacc attgattggt 
tctcacgctg gccaaaggac 9360gccctggtgg ctgtctccga gcacttcctc acctcctacg atattgactg cagcctcgag 9420attaaaaagg aggtggtgca gtgcatgggt agcttccaag acggggtcgc cgaaaagtgc 9480gtggactatt ttcaacggtt caggcggagc actcacgtca ctccgaagtc ctacttgtcg 9540ttcatccagg gctacaagtt tatctacggc gaaaagcacg tggaggtcag aactctcgcc 9600aataggatga acaccgggct ggagaagtta aaggaggcct cggaaagcgt ggccgccctc 9660tcaaaggagc tggaagctaa ggagaaggag ctgcaagtcg ctaacgataa ggccgacatg 9720gtgctgaagg aggtcaccat gaaggcccag gcggccgaaa aagtgaaggc cgaagtgcaa 9780aaagtcaaag acagagcgca ggctatcgtc gacagcatct cgaaggataa ggccattgcc 9840gaggagaagc tcgaggctgc aaagcctgcc ctggaggaag cggaagcagc actgcagacc 9900atcagacctt ccgatatcgc caccgtcagg accctgggaa ggcctcctca cctcatcatg 9960cggatcatgg attgcgtcct gttgcttttc caacggaagg tgtccgccgt caagatcgac 10020ttggagaagt cgtgcaccat gccatcatgg caggagtcac tgaagctcat gactgcagga 10080aactttctcc agaaccttca gcaattccca aaggatacca tcaacgaaga ggtgatcgag 10140ttcttgtcgc cgtatttcga aatgccggat tacaatatcg aaactgcaaa acgcgtgtgc 10200ggtaacgtgg caggcttgtg ctcctggacc aaggccatgg ctagcttttt ctcgattaac 10260aaggaagtac tgccactcaa ggcaaacttg gtggtgcagg aaaatagaca tcttctcgcg 10320atgcaggatc tgcaaaaggc tcaagccgag ctggatgata agcaggccga actggatgtc 10380gtgcaggccg aatacgagca ggccatgact gaaaagcaaa cgctgctgga ggacgcggaa 10440cgctgcagac acaagatgca gacagcttcc accttgattt cgggcctcgc tggagagaaa 10500gagagatgga cggagcagag ccaggagttt gccgcgcaaa ctaaacgcct ggtgggcgac 10560gtgctgctgg ccacagcgtt ccttagctac agcggcccat ttaaccagga attccgggac 10620ttgctcctga atgactggag gaaagaaatg aaggcgcgca agattccttt cggcaagaac 10680ctgaacttgt ccgagatgct tatcgacgcc ccgaccattt cagagtggaa tctgcaaggg 10740ctccccaatg acgatctgag catccagaac ggcatcattg tgacaaaggc ctcgcgctac 10800ccgctgctca tcgatccgca gactcaaggg aagatctgga ttaagaataa ggagagccgg 10860aacgaactgc agatcaccag cttgaaccac aagtacttta gaaaccacct ggaggattca 10920ctgagcctgg gccgccctct gctcatcgaa gatgtcggag aggaactgga tccggccctt 10980gacaacgtcc tggagcggaa ctttatcaaa actgggtcga ctttcaaggt caaggtcggc 
11040gacaaagagg tcgacgtgct ggacgggttc aggctttaca tcaccactaa actccccaat 11100cctgcctaca ctcctgagat ctccgcccgc acctcgatta tcgacttcac agtgaccatg 11160aaagggcttg aggaccagct ccttggacgc gtgatcctga ctgaaaagca agaactggaa 11220aaggaacgga cccacttgat ggaggacgtc accgccaaca aaagacgcat gaaggaattg 11280gaagataatc tcctgtatag acttaccagc actcagggta gtctggtgga agatgaatcc 11340ctcatcgtcg tgttgtccaa caccaagcgc actgcggaag aagtcaccca gaagctggaa 11400atttcggcag aaaccgaggt tcagattaac tcggcaaggg aagaataccg gcctgtggcc 11460acccggggtt ccattctcta cttcctcatc accgaaatga ggctggtcaa tgaaatgtac 11520cagacctcac tccgccagtt cctgggactg ttcgatctga gcctggcccg ctcggtgaag 11580tcgcctatca cgtcaaagag gatcgcaaac atcatcgagc acatgaccta tgaggtctac 11640aaatacgccg cccggggcct ttacgaggaa cataagtttt tgttcacgct cttgcttact 11700ctcaagattg acatccagcg gaaccgggtc aaacacgagg aattcctgac cctgatcaag 11760ggtggtgcct ccttggatct gaaggcatgc cctcctaagc ctagcaagtg gattcttgac 11820atcacttggc tgaacctggt ggagctgagc aaactgaggc agttttcgga cgtgctcgac 11880cagatttcaa ggaacgaaaa gatgtggaag atctggttcg acaaagagaa cccggaggaa 11940gaaccgctgc ctaacgccta tgacaagtca ctggactgtt tccggagatt gctgctgatc 12000aggtcctggt gcccggatag aaccattgcg caggccagga agtacattgt ggattcgatg 12060ggagaaaaat acgccgaggg agtgatcctc gatctggaaa agacctggga agagagcgat 12120cctcgcaccc ctctgatttg cctcctttcg atgggaagcg acccgactga tagcatcatt 12180gcactgggga aacgcttgaa aatcgaaact cgatatgtgt caatgggaca gggacaggaa 12240gtgcacgcaa gaaagctctt gcagcagact atggccaacg ggggatgggc gctcctccag 12300aactgccact tgggactgga cttcatggat gaactcatgg atattatcat tgagactgaa 12360ctcgtgcatg acgctttcag attgtggatg actacggagg cccacaagca gttccctatc 12420acccttttac aaatgtccat caaattcgcc aacgaccccc cccagggcct gcgggccggg 12480ttgaaacgaa cttactccgg agtgtcgcag gacctactgg acgtcagctc cggctcacag 12540tggaagccaa tgctgtatgc cgtggccttc ttgcactcca ctgtgcagga gcgcagaaag 12600ttcggtgcct tgggctggaa cattccctac gaatttaacc aagccgactt caacgccacc 12660gtgcagttca tccagaacca cctggacgat atggacgtca agaagggtgt cagctggact 
12720accatcaggt acatgattgg tgaaattcaa tacggcggac gcgtgacgga cgactacgac 12780aagaggctcc tcaacacctt cgctaaggtc tggttctcgg aaaacatgtt tggaccagac 12840ttttcattct accaaggata caatatcccg aagtgctcga cggtggataa ctacttgcag 12900tatattcaaa gccttcctgc ctacgattcg ccagaagtgt ttgggttgca tcctaacgcg 12960gatattacct accagtcgaa acttgcaaag gatgtgctcg acacgatcct gggcattcaa 13020ccgaaagaca ccagcggggg aggcgacgag actcgggagg cggtggtggc tcgcctggcg 13080gacgacatgc tggaaaagct cccgcccgat tacgtgccgt ttgaagtcaa agagcgcctt 13140caaaaaatgg gacccttcca gcctatgaac attttcctaa gacaggagat cgacagaatg 13200caacgggtcc tgagccttgt gcgcagcact ctcactgagc tcaagttggc gatcgacggg 13260accatcatta tgtcggaaaa ccttagagat gcgctcgact gcatgttcga tgcgagaatt 13320ccagcttggt ggaagaaggc ctcatggata tcatccaccc tcgggttctg gtttactgag 13380ctgattgaac ggaactccca gttcacgtcg tgggtgttca acggaaggcc gcactgcttc 13440tggatgacag ggttctttaa tccgcaaggt ttccttactg cgatgcgcca agagattacc 13500cgcgcgaaca aggggtgggc gctggacaat atggtgctct gcaacgaagt gaccaagtgg 13560atgaaggacg acatatcggc cccccctacc gagggcgtct acgtgtacgg attgtacctg 13620gaaggggcgg gctgggacaa gcgaaacatg aaattaatcg aatcgaagcc caaggtgctt 13680tttgaactga tgcccgtgat ccggatttat gctgaaaata acactctgcg cgacccgcgc 13740ttttactcat gcccaattta caaaaagccg gtcagaacgg acctcaacta catcgccgct 13800gtggacctca gaacggccca gacccccgag cactgggtgc tgagaggtgt cgcgctgctt 13860tgcgacgtga agtag 138752113875DNAArtificial SequenceSynthetic polynucleotide 21atgtttcgca ttggtaggcg gcagctctgg aagcactcgg tgactagagt gctgactcaa 60cgcctcaagg gcgagaagga agcaaagcgc gctctgctgg atgctcggca caactacctc 120ttcgcgatag tcgccagctg cctcgacctg aataagaccg aagtcgagga cgccatcctg 180gagggcaacc aaatcgaaag aattgatcaa ctgttcgccg tgggcggact gagacacctg 240atgttctatt accaggatgt ggaagaggcg gagactggac aactaggatc gctcggtggt 300gtcaacctcg tcagcggaaa gatcaagaag cccaaagtgt tcgtgaccga aggaaacgat 360gtggccctga ctggggtgtg cgtgttcttt attagaaccg acccttcgaa ggcgatcact 420cccgacaaca ttcaccagga ggtgtccttt aacatgcttg acgccgcaga tggaggcctg 480ctgaactcag 
tcagaaggct gctttcagac atttttatcc ctgccctgag agccacttct 540catggttggg gagaattgga aggcttgcaa gacgctgcca atattcgcca agaattcttg 600tcctcattgg aaggattcgt gaacgtgctc agtggagccc aggaaagcct gaaagaaaag 660gtcaacttgc gaaagtgcga catcctcgag ctcaaaaccc tgaaggagcc cactgattac 720ctcaccttag ccaacaatcc cgaaaccttg ggcaaaatcg aggactgcat gaaggtctgg 780atcaaacaga cggagcaagt cctcgccgaa aacaaccaat tgctgaagga ggctgacgat 840gtgggcccta gggccgaact cgagcattgg aagaagagac tgtcgaaatt taactacttg 900ctggaacagc ttaagagccc tgacgtcaag gctgtgctgg cggtgctggc cgcagccaag 960tccaagttac tcaaaacgtg gagagagatg gatataagaa tcaccgacgc cacgaacgag 1020gctaaggaca acgtgaagta cttgtatact ctcgagaagt gctgcgatcc cctgtactct 1080tccgaccctt tgagcatgat ggacgcgatc ccgaccctga tcaacgcgat taagatgatc 1140tatagcatct cccactacta caacacttcc gaaaagatta cgtccctctt cgtcaaggtg 1200accaaccaga ttatttccgc ctgtaaggcg tatattacta acaacggaac cgcctcgatc 1260tggaaccagc cgcaggacgt cgtggaagag aagatcctga gcgccatcaa gttgaagcag 1320gaataccaac tgtgcttcca taagaccaaa cagaagctca agcaaaatcc aaatgccaaa 1380caattcgact tctccgaaat gtacattttc ggcaagttcg aaactttcca ccggaggctc 1440gcgaagatta tcgacatttt cactaccctt aaaacctaca gcgtgcttca agattctacc 1500atcgaaggac tcgaggacat ggccacgaag taccagggta ttgtggcaac catcaagaag 1560aaagaatata acttcctgga ccagcgcaaa atggattttg accaggatta tgaggagttc 1620tgcaaacaga ctaacgacct ccacaatgaa ctcagaaagt ttatggacgt gaccttcgcc 1680aagatccaga acacgaacca agccctgagg atgcttaaga agttcgaacg cttgaacatt 1740ccgaacctcg gcatcgatga caaataccaa ctcatcttgg aaaattacgg cgcggatatt 1800gatatgatct caaagttgta cactaagcag aagtacgatc cgccgctggc ccggaaccaa 1860ccgcctatcg
cgggcaagat cctgtgggcc cggcagttgt ttcaccggat tcagcagccg 1920atgcagcttt ttcaacaaca tcccgcggtg ctctcgaccg ctgaagctaa gcctattatt 1980cggagctaca accgcatggc taaggtgctg ctggagtttg aggtcttgtt ccatcgggca 2040tggctcagac agatcgagga aatccacgtc ggactggagg cttccctgct cgtcaaagcg 2100ccaggcacgg gcgaactttt cgtgaatttc gatccgcaaa tcctgattct gttccgcgaa 2160acggagtgca tggcgcaaat gggattggaa gtgtcgcccc ttgccacttc cctgttccaa 2220aagcgcgacc gctacaagcg gaatttcagc aatatgaaga tgatgctcgc cgaataccag 2280cgggtgaaaa gcaagatccc agccgccatc gagcaactca tcgtgcctca cctcgccaag 2340gtcgacgagg ccctgcaacc gggtttggcc gcactgactt ggacctcgtt aaacatcgaa 2400gcctacttgg aaaacacttt cgcgaagatc aaggatcttg aacttctgct ggaccgcgtg 2460aacgacctaa tcgagttccg gatcgatgcc atcctggaag agatgagttc cacccccctc 2520tgccagttgc cacaagaaga acctctgaca tgcgaagaat tcctccaaat gactaaggac 2580ttgtgcgtca acggggcaca gatccttcac ttcaaatcct ccctggtcga ggaggccgtg 2640aacgagctag tgaacatgct ccttgatgtg gaggtcctca gcgaggaaga gtccgaaaag 2700atcagcaacg aaaactccgt gaactacaaa aacgagagca gcgctaaaag ggaggaggga 2760aatttcgaca ccctgacttc ctccatcaac gcgagagcga atgcactcct cctgactact 2820gtgacaagga agaagaaaga gactgaaatg ctcggcgagg aggccagaga actgttgtcg 2880cacttcaacc accagaacat ggacgccctg ctcaaggtga cccgaaacac cctggaagcg 2940atcagaaaga gaatccacag cagccacacc attaacttta gggacagcaa ctcagcctca 3000aatatgaagc agaattcact gcccatcttc cgcgcgtcag tgaccctggc catcccgaac 3060atcgtcatgg cgcccgcatt ggaggacgtg cagcagactc tgaacaaggc cgtcgagtgt 3120attattagcg ttcccaaggg agtgcggcag tggagctccg aactgttgtc gaaaaagaag 3180atccaggaac gcaagatggc cgccctgcaa tctaacgaag attcggactc agacgtggag 3240atgggagaga acgaattgca ggatactctg gagattgctt cggtgaactt gcccatcccg 3300gtgcagacga aaaattacta caagaatgtc agcgagaaca aagagattgt taagctcgtg 3360tcggtgctct caaccatcat caactcaact aaaaaggaag tcattaccag catggattgc 3420ttcaaacggt ataaccacat ttggcagaag gggaaggaag aggccattaa gaccttcata 3480acccaatcac cgctcctgtc cgagtttgag tcgcagatcc tgtactttca aaacttggaa 3540caggaaatca acgccgaacc cgaatacgtc tgcgtgggta 
gcatcgccct gtatactgcg 3600gacctcaagt tcgcgcttac tgctgaaacc aaggcatgga tggtggtcat cggtaggcat 3660tgtaacaaga agtaccgcag cgagatggaa aatatcttca tgctgattga agaattcaac 3720aaaaaactga acagacctat caaagacctg gacgatatta gaatcgccat ggcggccctg 3780aaggagattc gggaggaaca gatttccatc gacttccaag tgggccctat cgaagagagc 3840tacgccttat tgaacagata tggattgttg atcgcacggg aagagatcga caaagtcgac 3900actttgcact acgcatggga aaagctgctc gcccgagccg gggaggtcca gaacaaactt 3960gtgtcgcttc agccctcctt caagaaggag ctgatcagcg cggtggaagt gttcctccaa 4020gattgccatc aattctacct ggactacgat ttgaacggac ctatggctag cggcctgaag 4080ccccaggagg cgtccgacag acttatcatg tttcagaacc agtttgacaa catttacaga 4140aagtacatta cctacactgg tggcgaagaa ctcttcggac ttccggccac ccagtacccc 4200cagctccttg agatcaagaa gcaactgaac ttgctgcaga agatctacac tctgtataat 4260tccgtgattg aaacggtgaa ctcctactac gacattctct ggagcgaggt gaacatcgaa 4320aagatcaaca acgaattgct cgaatttcaa aacagatgcc ggaagctgcc cagggcactc 4380aaggactggc aagcgttcct cgacttgaaa aaaattattg atgatttctc ggagtgctgc 4440ccgctgctcg aatacatggc aagcaaggcg atgatggagc ggcactggga acggattacc 4500accctgactg gacactcgct cgatgtggga aatgagtcct ttaagctgcg gaacatcatg 4560gaagcccctc ttctgaagta taaggaggaa atcgaggata tttgcatctc cgctgtcaag 4620gagcgcgata tcgagcagaa gctcaagcaa gtcattaatg aatgggataa caagaccttc 4680acttttggct ccttcaagac tcgcggcgaa ctgcttcttc ggggcgactc aacgagcgaa 4740attatcgcga acatggagga ttcactgatg ctcctgggat cgctgctctc aaacaggtat 4800aacatgccct ttaaggctca gatccagaag tgggtgcagt acctgtcaaa ttccactgac 4860ataattgaat cgtggatgac cgtgcaaaac ctttggatct acctggaggc cgtgttcgtg 4920ggtggcgata ttgctaagca actccctaaa gaagctaaaa ggttcagcaa catcgataag 4980agctgggtca agatcatgac cagagcccat gaagtgcctt cggtggtgca gtgctgcgtg 5040ggcgatgaaa cgctgggaca gctcctgcct cacctcctgg accaattgga gatttgtcag 5100aagtcgctga ctgggtacct tgaaaagaaa agactttgct tcccacgctt cttctttgtc 5160tccgacccgg cgcttcttga gattctggga caggccagcg actcccatac catccaagcc 5220cacttgctca acgtgttcga taacattaaa tcagtgaagt tccatgagaa aatttacgac 5280cgaatcctgt 
cgatctccag ccaggagggc gaaaccatcg agttggataa acccgtgatg 5340gccgaaggaa acgtggaagt gtggttgaac tccttgctgg aagagagcca gagctccctc 5400cacctggtga tccgccaggc ggcggccaac attcaggaaa cggggttcca gctcaccgag 5460tttctgagct cgttccctgc ccaggtggga ctcctgggca tccaaatgat ctggacccgg 5520gattcggagg aagccctgag aaatgcgaaa tttgacaaga agattatgca aaagactaac 5580caggcctttc ttgaattgct gaatactctg atcgacgtga ccacccggga cctttcgtcc 5640acggaaaggg ttaagtacga aactttaatt actatccacg ttcaccaaag ggacattttc 5700gatgatctct gtcacatgca catcaagagc ccaatggact ttgagtggct gaagcaatgc 5760cggttttact tcaacgaaga tagcgacaag atgatgatcc acatcaccga tgtcgcgttt 5820atctaccaaa acgaattcct gggttgcact gaccgcctcg tgatcacccc gctgaccgat 5880aggtgttaca tcaccctggc acaagcattg gggatgtcca tggggggagc tccagccggg 5940cctgccggca ccggcaaaac cgaaacgaca aaggacatgg ggagatgcct tgggaagtac 6000gtggtggtgt tcaattgcag cgaccaaatg gacttccgcg ggctgggacg gatcttcaag 6060ggtctggctc aaagcggaag ctgggggtgc ttcgatgaat tcaatagaat tgacctccct 6120gtgctgtcgg tggccgcgca gcagatcagc atcatcctta cttgcaagaa ggagcacaag 6180aagtcattca ttttcactga cggagataac gtgactatga acccggaatt cggcctgttc 6240ttaaccatga accccggcta cgcgggccgg caggagctgc cagaaaacct gaaaatcaac 6300tttcggagtg tcgctatgat ggtcccggac agacagatca tcatcagagt gaaactcgcg 6360tcgtgcggtt tcatcgataa cgtcgtgctg gcccgcaaat tcttcaccct ctacaagttg 6420tgcgaagaac agctctccaa gcaagtgcac tacgacttcg gactgcgcaa cattcttagc 6480gtgcttcgca cgctcggagc cgcaaagaga gcgaacccta tggacacgga atccaccatc 6540gtcatgcggg tgctgcgcga catgaacctg tccaagctca tcgacgaaga tgagccgctc 6600tttttgtccc tgatcgagga cctgttccct aacatccttc ttgacaaggc cggctaccct 6660gagcttgaag ccgccatttc cagacaggtg gaagaagccg gccttattaa tcatccccca 6720tggaaactca aagtgatcca gctctttgag actcaaaggg tccggcacgg gatgatgaca 6780ctcggaccta gcggtgccgg caagactact tgtatccaca ccctcatgcg cgccatgacc 6840gactgtggaa aaccccacag agagatgaga atgaacccta aggccatcac tgcaccgcag 6900atgttcggcc ggctggacgt ggccacgaat gactggactg acggcatctt cagcaccctc 6960tggaggaaaa ccttgagagc gaagaagggc gaacacattt 
ggatcatcct cgatgggcct 7020gtggatgcga tttggattga gaacctcaat agcgtcctcg atgacaacaa gaccctgacc 7080ctcgccaacg gggaccggat ccccatggcc cccaactgca agatcatctt cgagccgcat 7140aacattgaca atgctagccc ggccacggtg tcccggaacg ggatggtctt tatgagctca 7200tccatcctgg attggtcccc tatcctcgaa gggttcctca agaagcggtc ccctcaggaa 7260gccgaaatcc tgcggcagct ctacacggag agcttcccgg acttgtaccg gttctgcatt 7320cagaacttgg aatacaagat ggaagtcttg gaggccttcg tcattaccca gagcatcaac 7380atgctgcagg ggctcattcc actcaaggaa cagggcggag aggtgtcgca ggcccatctc 7440gggaggcttt tcgtattcgc actgctgtgg tctgcaggag ctgccctgga actggacggc 7500cgcaggaggc tggagctctg gctgcgctca cggcccacgg gcacccttga actgcctccc 7560cccgcgggtc cgggagatac cgccttcgac tactacgtcg ctccggacgg cacatggacg 7620cactggaata cccgcactca ggagtatctc tacccttccg acactactcc ggaatacggc 7680agcatcctcg tgccgaacgt ggacaacgtg agaacggact tccttatcca aactattgct 7740aagcaaggaa aagccgtctt gcttatcggc gagcaaggca ccgcaaagac cgtgatcatc 7800aaggggttta tgagcaagta cgacccagaa tgccacatga tcaaaagcct caactttagc 7860tccgccacca ctccactgat gtttcagcgc actattgagt cgtacgtcga caagcgcatg 7920ggaaccactt acggtccacc ggccggaaag aagatgactg tgtttattga cgatgtgaac 7980atgcccatca tcaacgaatg gggcgatcaa gtgactaacg agattgtgcg ccagctgatg 8040gaacaaaacg gattctacaa cctggaaaag ccgggcgaat tcacctcgat tgtcgacatc 8100caattcctgg cagccatgat tcatcccgga ggcggaagaa acgacattcc tcagaggctc 8160aagcggcaat tctccatttt caattgcacc ctgccgtcgg aagcctccgt ggacaaaatt 8220ttcggcgtca ttggagtggg gcactactgc acccagcggg gcttcagcga agaagtgaga 8280gattccgtca ccaagctggt gccactcacc agacggcttt ggcaaatgac caagatcaaa 8340atgctgccta ccccggccaa atttcactac gtctttaacc tccgggatct cagccgggtg 8400tggcagggaa tgctcaatac tacttccgag gtgatcaagg aacccaatga cctcctgaaa 8460ctgtggaagc acgaatgcaa gcgcgtgatt gctgaccgct ttaccgtgag cagcgacgtg 8520acgtggtttg acaaagcgct cgtgtcactc gtcgaggaag aatttggcga ggagaagaag 8580ttgttggtgg actgtggaat tgatacctac ttcgtggatt tccttcggga cgcccctgaa 8640gcggccggag aaacttccga agaagcagat gccgaaaccc caaagattta cgaaccgatt 8700gagtcgtttt 
cacacctcaa agaacgcctc aacatgttcc ttcagctgta caacgagtca 8760atccggggtg ccggaatgga catggtgttt ttcgcggacg ccatggtaca cctcgtgaag 8820atctcgcgcg tgattagaac gccacagggg aacgccctgc tggtcggtgt ggggggtagc 8880ggaaagcagt cattgacccg gctggcctcg ttcattgcgg gatacgtgtc attccagata 8940acccttacca gatcgtacaa cacctccaac ctgatggagg acctgaaggt gctgtatcgc 9000actgccgggc agcaagggaa aggaataact ttcattttca ctgacaatga gattaaggat 9060gagagcttcc ttgaatacat gaacaacgtc ctttcatccg gggaagtcag caaccttttc 9120gcccgcgacg agatcgacga gattaacagc gatcttgcgt ccgtcatgaa gaaggaattc 9180cctagatgct tgccgaccaa cgaaaaccta cacgattact tcatgtcaag agtgcgccaa 9240aacctccaca tcgtgctgtg tttcagcccg gtgggcgaga agttcagaaa cagggctttg 9300aaattccccg cactgatttc cggttgtacg attgattggt tctcacgctg gccgaaggac 9360gccctggtgg ccgtcagcga gcacttcctg accagctacg acattgactg cagcctcgag 9420atcaagaagg aggtggtcca gtgtatgggc tccttccaag atggagtggc cgaaaagtgc 9480gtggactact tccagagatt cagacgctcc actcacgtga ctcccaaatc ctacttgtcc 9540ttcatccaag gttataagtt catttacggc gaaaagcatg tcgaggtccg gactctggcc 9600aaccggatga acaccggact ggagaagctg aaagaagcca gcgagtcggt ggccgctctc 9660tcaaaggaac tggaggccaa ggagaaggag ctccaggtgg cgaacgacaa ggcggacatg 9720gtcctgaagg aggtgaccat gaaggcccag gccgctgaga aagtgaaggc ggaggtgcag 9780aaggtcaaag atagggcgca ggccatcgtc gattccatca gcaaggacaa ggccattgcg 9840gaagagaagc tggaggctgc aaagccggcg cttgaagaag ccgaggccgc gctgcaaacc 9900attcgcccca gcgacatcgc caccgtgagg actctgggac gacccccaca ccttatcatg 9960cggatcatgg attgcgtgct cctcctgttc cagcgcaaag tgtcggcagt caagatcgac 10020ctggagaaat catgcactat gccttcatgg caggagtcct tgaagctgat gacggcaggc 10080aacttcctgc aaaacttgca gcagtttccg aaggatacca tcaacgaaga agtgatcgag 10140ttcctttcgc cgtactttga gatgcccgat tacaacattg aaaccgcaaa gagggtctgc 10200ggcaacgtgg cagggctgtg cagctggacc aaggcaatgg cctcgttctt ctcgattaac 10260aaggaagtgc tgccgctgaa ggcgaacctg gtcgtacagg agaacaggca ccttctggcc 10320atgcaagacc tccaaaaggc acaagccgaa ttggacgaca agcaggccga gctcgatgta 10380gtccaggctg aatacgaaca ggccatgacc gagaaacaga 
ctctgctgga agatgccgag 10440cggtgccgcc acaagatgca gaccgcttcg accctcatta gcgggctggc cggcgaaaag 10500gagaggtgga ccgaacagtc acaagaattc gccgcccaga cgaaacgcct ggtcggcgac 10560gtcctcctcg ccaccgcctt cctgagctac agcgggcctt tcaaccagga attcagagat 10620ctccttctga acgactggcg gaaggaaatg aaggccagaa agatcccgtt cggcaagaac 10680ctgaacctct cggagatgct gattgacgct ccgacaattt cagagtggaa ccttcagggt 10740ctgccaaatg atgacctgag cattcaaaat ggcatcatcg tgaccaaggc atccaggtac 10800cctctcctta ttgatcccca aacccagggg aagatctgga tcaagaacaa agagtccaga 10860aacgaactgc agattaccag cctcaaccat aaatactttc gcaatcacct ggaagattcc 10920ctgagcctcg ggaggcccct gctcatcgag gatgtgggag aggaactgga ccccgcgctc 10980gataacgtgc tggagcgcaa tttcatcaag accggaagca ccttcaaagt caaggtcggc 11040gataaggagg tggatgtcct ggatggtttc cggctctaca ttactacgaa actcccgaac 11100cctgcctata cgccggaaat ctcagcccga accagcatca tcgacttcac tgtgactatg 11160aaaggcctgg aggaccagct cctcggacgc gtgatcctga cggagaagca agagctggag 11220aaggagagga ctcatctcat ggaggatgtc accgcgaaca aaagacggat gaaggagctg 11280gaggacaacc tgttgtacag gctgacgtca actcagggat cactggtgga agatgagtcc 11340ctcatcgtcg ttctgtccaa caccaagcgg accgccgagg aagtcaccca gaagctcgag 11400attagcgctg aaaccgaagt ccagattaac tccgccagag aggaataccg gcccgtggct 11460actcggggct ctatcctgta cttcctcatc accgagatgc gcttggtcaa cgaaatgtac 11520cagacttccc ttcggcaatt cctgggactg ttcgatttga gcctcgccag atccgtgaag 11580tcgcccatca cttcaaagag gattgcgaac atcatcgaac acatgactta cgaggtgtac 11640aagtacgccg cacgaggcct gtacgaagaa cacaagttcc tgttcacttt actgttgacg 11700cttaagatcg acatccagcg gaacagagtg aagcacgagg agttcctcac gcttattaag 11760ggcggtgctt ccctggacct gaaggcctgt cctcccaagc ctagcaagtg gatcctggat 11820atcacgtggc tgaacttggt ggagctctcc aaacttagac agttctccga cgtcttggac 11880cagatttcaa ggaatgagaa aatgtggaag atctggttcg acaaggagaa cccggaagag 11940gagcccttgc cgaacgctta cgacaagtcc ctggactgtt tcaggagact gttgctcatt 12000cggtcctggt gtccggatag gaccatcgcc caggcccgca agtacatcgt ggatagcatg 12060ggtgaaaagt acgctgaagg cgtcattctt gacctggaaa agacttggga 
ggaaagcgac 12120ccgcgcaccc cactgatttg cctcctctcg atgggaagcg atcctacgga ctccatcatt 12180gcccttggta aaagacttaa gatcgagact cgctatgtct cgatgggcca ggggcaggag 12240gtccacgccc ggaagttgct tcagcagact atggccaatg gaggttgggc gctgctccag 12300aactgccatc tgggtttgga cttcatggac gaattaatgg atatcatcat tgaaaccgaa 12360ctggtgcatg acgctttccg gctgtggatg accaccgaag cccataagca atttccaatt 12420accctcctcc aaatgagcat caagtttgcc aacgacccac cacaaggttt gcgcgcgggc 12480cttaagcgga cttactccgg agtcagccag gacttgctcg acgtcagcag cggatcacag 12540tggaagccca tgctctacgc agtggccttt ttgcacagca ctgtgcagga gagaaggaag 12600tttggagcgc tggggtggaa tattccgtac gaattcaacc aagcggactt taatgctacg 12660gtccagttca tccagaacca ccttgacgat atggacgtca aaaagggcgt gtcgtggacg 12720acgatcaggt acatgatcgg ggaaatccag tacggcggaa gagtcactga cgattacgac 12780aagaggctcc tgaacacttt tgccaaagtg tggttttcag agaacatgtt cgggccggac 12840ttctccttct accaaggata caatatcccc aaatgcagca ccgtggacaa ctacttgcaa 12900tacatccaga gcctgccagc atacgactcc ccagaagtct ttggactgca cccgaacgcc 12960gacattacct accagagcaa gttggcgaag gacgtcttgg acactattct tggtatccag 13020ccgaaagaca cctcgggggg gggggacgaa accagagagg cagtggtcgc gcggctcgct 13080gacgacatgc tggaaaagct gcccccggat tacgtcccgt tcgaggtcaa ggagaggctg 13140cagaagatgg gaccattcca gcctatgaat attttcctga ggcaggagat cgaccggatg 13200cagcgcgtcc tgtcactcgt gcgcagcact ctcaccgagt tgaaactggc aatcgatggc 13260acgattatca tgtcggaaaa ccttcgcgat gcactggact gcatgtttga cgccagaatt 13320cccgcttggt ggaagaaggc ttcatggata agctccacct tgggattctg gttcacagag 13380ctcattgaac gcaacagcca gttcacttcc tgggtgttca atggcagacc ccactgcttc 13440tggatgaccg gattcttcaa cccccagggg ttcctgaccg ctatgcggca ggagattact 13500cgggcgaata agggatgggc cctggacaac atggtgctct gcaacgaagt gaccaaatgg 13560atgaaggatg atattagcgc gccgccgact gagggagtgt acgtctacgg actttacttg 13620gagggcgcgg ggtgggataa aaggaacatg aagttgatcg agtcaaaacc caaggtgctc 13680tttgaattga tgccggtgat ccgcatctac gccgaaaaca acactctcag agatcccaga 13740ttctactcgt gtcctattta caagaagccc gtccgcaccg acttgaacta tatcgccgcg 
13800gtcgatttga gaactgccca gactcccgag cactgggtgc ttcggggagt cgcgctgctt 13860tgcgatgtga agtag 138752213875DNAArtificial SequenceSynthetic polynucleotide 22atgttccgca tcggccgccg ccagctctgg aagcatagcg tgactagagt ccttacccaa 60cgcctgaagg gtgaaaaaga ggccaagcgc gccctgctgg acgcccggca taattacctg 120tttgccattg tcgcctcatg cctggacctg aataagactg aggtcgagga tgccatcctt 180gaagggaacc aaatcgagag aatcgatcaa ctcttcgccg tgggggggct gcggcatctc 240atgttctact accaggacgt ggaagaagca gaaaccggcc agcttggctc ccttggtggg 300gtgaacctcg tgtcggggaa gatcaaaaag cctaaggtgt tcgtgaccga gggcaacgac 360gtggcgctta ccggagtgtg cgtgttcttc ataaggacgg atccatcaaa ggccattacc 420cctgataaca ttcatcaaga ggtcagcttc aacatgctcg atgccgcgga tggaggtttg 480ctgaatagcg tgcgcagatt gctctcagac attttcatcc ctgcgctccg cgccacctcc 540cacggttggg gagaacttga ggggctccag gatgccgcca acattagaca ggaattcctg 600agcagcctgg aaggattcgt gaacgtgctg tccggggcac aagagagcct caaagaaaag 660gtcaatttga gaaaatgcga catcctcgaa ctcaagactc tgaaggaacc gactgattat 720ctgactctcg cgaataaccc agaaaccctc ggaaagatcg aggactgcat gaaggtctgg 780attaaacaaa cggagcaagt gctggctgag aataaccagc ttctcaagga agccgatgac 840gtcggcccga gagcagagct tgaacattgg aagaagagac tttccaagtt caactacctc 900ctcgaacagc tgaagtcccc ggacgtcaag gccgtgctcg ccgtgctggc cgctgccaag 960agcaagctgc tcaagacttg gcgggaaatg gacatcagga tcaccgacgc aacgaacgag 1020gccaaagata acgtcaagta cctttatacc ttggagaagt gctgtgatcc gctttactcg 1080agcgaccctc tgtcgatgat ggacgccatc ccaactctga ttaacgccat taagatgatt 1140tatagcatca gccactacta caatacctcc gaaaaaatca cctcgttgtt cgtgaaggtc 1200accaatcaga ttatttccgc ttgcaaggcg tacattacca ataacggtac cgcgagcatc 1260tggaaccagc cgcaggacgt ggtggaagaa aaaattctga gcgcaattaa gctgaaacaa 1320gagtaccagt tgtgctttca caagaccaag cagaagttga agcagaaccc aaacgccaag 1380cagtttgatt tttccgaaat gtacattttc gggaagtttg agactttcca ccggagactc 1440gcgaagatta ttgacatttt caccaccctt aagacttaca gcgtgctaca ggactcgacc 1500attgaggggc tcgaggatat ggccaccaag taccagggaa ttgtggccac aattaagaag 1560aaggaataca acttcctcga ccagcgcaag 
atggacttcg accaagacta cgaggagttt 1620tgcaagcaga ctaatgacct tcataatgag ctgcggaagt tcatggatgt gaccttcgcg 1680aaaatccaga acaccaacca ggccctgcgg atgctcaaga aattcgaacg cctgaacatt 1740cctaacctcg ggatcgacga caaataccaa ctcattttgg agaactatgg ggcggatatc 1800gatatgatct ccaaactcta caccaaacag aagtacgacc cgcccttggc tcggaaccag 1860ccacccatcg cgggaaagat cctgtgggcc cgccaacttt tccatcgcat ccagcagcct 1920atgcagctgt tccagcagca tcccgccgtg ctctccacgg ccgaagcaaa gccgattatt 1980cgctcctata accggatggc caaagtgctg ctcgagttcg aagtgctgtt ccatcgcgca 2040tggctccggc agatcgagga gattcacgtg ggcctcgaag ctagcttgct cgtgaaggct 2100cctggcactg gagaactttt cgtgaacttt gaccctcaaa tcctgatcct gttcagggag 2160actgagtgca tggcgcagat gggcttggag gtgtcaccct tggccacgag cctgtttcag 2220aagagagaca ggtacaagcg caacttttcc aacatgaaga tgatgctggc cgaataccag 2280cgcgtgaaat ccaagatacc ggcggccatc gaacaactta ttgtgcccca cctcgcaaag 2340gtggacgagg cgcttcaacc cggactcgcc gcgctcacat ggacgtcgct gaacatcgag 2400gcctacttgg agaacacctt tgccaagatc aaggacctgg aactgttgct agatcgcgtg 2460aacgacctga ttgaattccg gattgatgct attttggaag agatgtcgtc gaccccgctg 2520tgtcagctgc cccaggagga gcctctcact tgcgaagagt tcttacaaat gacgaaggac 2580ttgtgcgtga acggtgccca aatcctgcac ttcaagtcct cgctcgtcga ggaggccgtg 2640aacgagctcg tgaatatgct tcttgatgtc gaggtgctga gcgaggaaga atcagagaaa 2700atttcgaatg agaatagcgt caattacaag aacgagagta gcgcgaagcg cgaggagggg 2760aattttgaca ctcttacctc ctccatcaac gcccgggcaa atgcactcct tctgactacc 2820gtgacccgga agaagaagga aaccgagatg ctgggagaag aagcaagaga gctccttagc 2880cacttcaacc accagaacat ggacgcgctc ctgaaagtga cccggaacac cctggaggct 2940atccggaagc
ggatccatag cagccacacc attaatttcc gcgacagcaa cagcgcaagc 3000aacatgaagc agaactcact gccaatcttc cgcgcctccg tgactctcgc cattccaaac 3060atcgtgatgg ctcctgcctt ggaggatgtc cagcaaaccc tgaataaggc cgtggaatgc 3120atcatctccg tgccgaaggg agtcaggcag tggtcgtcag agctcctgtc aaaaaagaag 3180atccaagagc ggaagatggc cgccctgcaa tccaacgaag atagcgattc ggacgtggag 3240atgggagaga acgagttgca ggataccctc gagatcgcct ccgtgaatct gccaatccca 3300gtgcagacta agaattatta caagaacgtg agcgaaaaca aagaaattgt caaactcgtg 3360agcgtcctga gcaccattat caattccact aagaaggaag tcatcaccag catggactgc 3420ttcaagaggt acaaccatat ttggcagaag ggaaaggaag aggcgattaa gaccttcatc 3480actcaatccc ccctcctcag cgagttcgaa tcgcagattc tgtacttcca aaacttggag 3540caggaaatta acgccgagcc tgaatacgtg tgcgtgggct ccatcgccct gtacaccgcg 3600gatctcaagt tcgcgctcac cgcagaaacc aaggcgtgga tggtggtgat cgggagacac 3660tgcaacaaaa agtaccggtc cgaaatggaa aacatcttca tgctgattga ggaattcaac 3720aagaaactga accggccgat aaaggacctc gacgacatca ggatagccat ggcagccttg 3780aaggaaatcc gcgaggagca gatcagcatt gacttccagg tcggaccgat tgaggagtcc 3840tacgccctgc tgaatagata cggtctgcta atcgccagag aagagatcga caaggtggac 3900accttgcact acgcttggga aaagctcctt gcccgggccg gggaagtgca gaacaagctg 3960gtcagcttac agccatcctt taaaaaggaa ttgatctccg ccgtggaagt gttcttgcaa 4020gattgccacc agttctacct ggactacgac ctcaacggac cgatggctag cggccttaag 4080ccacaggagg catcagaccg gctgatcatg tttcagaacc aattcgacaa catctacaga 4140aagtacatta cctacaccgg gggcgaggaa ttgttcggcc tgccggcaac ccaatacccc 4200cagcttctcg agattaagaa acaattgaac ttgctccaga agatctacac gctctataac 4260tccgtgatcg agactgtgaa ctcgtactat gacattctct ggtcagaggt caacatcgaa 4320aagattaata acgaactgct ggaatttcaa aatcggtgca ggaagctccc tagggctctg 4380aaggattggc aggccttcct ggaccttaag aagatcattg acgacttctc ggaatgctgc 4440ccactcctgg agtatatggc tagcaaggct atgatggaga ggcattggga gcgcatcacc 4500actttgactg gccacagcct tgacgtgggc aatgaatcct tcaaactgcg caacatcatg 4560gaggcaccat tgttgaaata caaggaagaa atcgaagata tctgcatcag cgccgtcaag 4620gaacgcgata tcgaacagaa gctcaaacag gtcattaacg 
agtgggacaa caagacgttc 4680accttcggga gcttcaagac taggggagaa ctgctgttga gaggagactc cacttcggag 4740atcattgcca acatggaaga tagcctcatg ttgctgggct ccttgctctc gaaccgctat 4800aacatgccat tcaaggcaca aatccagaag tgggtccaat acctctccaa tagcacagac 4860atcatcgaat cgtggatgac cgtccaaaac ctgtggatct acctggaagc cgtgttcgtg 4920ggcggagata ttgcgaagca gttgcctaag gaggcaaaaa gattcagcaa tatcgacaag 4980tcgtgggtga agatcatgac cagagcccac gaggttcctt ccgtggtgca gtgctgcgtc 5040ggagatgaga ctctgggcca gctcctcccg caccttctgg atcaactgga aatctgccaa 5100aagtccctga ctggctacct ggaaaaaaaa agactgtgct ttcctcggtt cttcttcgtg 5160tccgaccccg ctctgctgga aatcctcgga caggcgtccg actcccacac tatccaggcc 5220caccttctca acgtgttcga taatatcaag tcggtcaaat tccatgagaa gatctacgac 5280agaatcctgt ccatctcgtc acaagagggc gagactatcg agcttgataa gccagtcatg 5340gccgaaggca acgtcgaagt ctggctcaat tcactcctgg aagaatcaca gtcgtccttg 5400catctggtca ttcgccaagc cgccgctaac attcaggaga ctggtttcca acttactgag 5460ttcttgtcgt cattccccgc ccaagtgggc ctgctgggca ttcagatgat ttggacgagg 5520gattccgaag aggcccttcg caatgcaaaa ttcgacaaga aaattatgca gaaaaccaat 5580caagccttcc tggaacttct gaacactttg attgatgtga ctactcggga tctcagctcc 5640accgaaagag tcaaatacga gactctcatc accatccacg tgcatcaacg cgacattttc 5700gacgacttgt gccacatgca tatcaaatcc ccgatggatt tcgagtggct gaagcagtgt 5760cgcttttact tcaacgaaga ttccgacaaa atgatgattc acatcaccga tgtggccttt 5820atttaccaaa acgaattctt gggctgtact gaccggctgg tgattactcc tttgactgat 5880cgctgctaca ttacccttgc ccaggctctg ggaatgtcca tgggaggggc gcctgccggc 5940ccggccggaa ctggaaagac cgaaaccacg aaggacatgg gtcggtgcct gggcaagtac 6000gtggtggtgt tcaactgcag cgaccaaatg gactttcggg gactgggaag aatcttcaag 6060ggactcgccc agtcggggtc atggggctgc tttgacgaat tcaacaggat tgacctcccg 6120gtcctgtccg tcgccgccca acaaatttca atcattctca cctgtaagaa agaacataag 6180aagtcattca tcttcaccga cggagacaac gtgaccatga accctgaatt tggactgttc 6240ctcaccatga acccgggata tgcgggccgg caggagctgc cagagaacct taagattaat 6300ttccggtccg tcgcgatgat ggtgccggac cggcaaatta tcatccgggt aaagctggct 6360tcatgcgggt 
tcatcgataa cgtcgtcctc gcccggaaat ttttcaccct ttataagctg 6420tgcgaagaac agctttcgaa acaggtgcat tacgattttg gtctgcggaa catcctgagc 6480gtcctccgga ctctgggagc cgcgaagagg gctaatccga tggacactga gtccactatc 6540gtcatgagag tcctgcgcga catgaacctt tccaagctca tcgacgagga tgagcctctc 6600ttcctgagcc tgatcgagga cctgttcccc aacatcctgc tcgacaaggc gggatatcct 6660gaactggagg cggccatcag cagacaggtc gaggaggccg gcctaatcaa ccaccctccc 6720tggaagctca aggtgatcca gctgtttgag acacaacggg tgcggcatgg aatgatgacc 6780ctggggccgt ccggggctgg aaaaactact tgtatccata ctctcatgcg cgccatgacc 6840gattgtggaa aacctcaccg cgagatgaga atgaatccca aggcgattac tgccccccag 6900atgtttgggc ggctggacgt cgccactaac gactggacgg acggaatttt ctccaccctc 6960tggcgcaaaa cactgagagc caagaaggga gagcatatct ggatcattct ggacggtcca 7020gtcgatgcga tttggatcga gaacctgaac tccgtcttgg acgataacaa gacgctcacc 7080ctggctaacg gcgatagaat cccaatggcc ccaaattgta agatcatctt cgagccacac 7140aacattgata acgcgtcacc ggccaccgtc agccggaacg gaatggtgtt tatgagcagc 7200tcgattcttg actggagccc gatcctcgaa ggtttcctca aaaagagatc cccccaggag 7260gcagagattc tcaggcagct ctacaccgag tcattcccgg atctgtacag gttctgcatt 7320cagaatctgg aatacaagat ggaagtcctt gaagccttcg tgatcaccca atcgattaac 7380atgcttcagg ggctgatccc actcaaagaa caagggggag aggtgtctca ggcgcacctg 7440ggccgcttgt tcgtgtttgc tttgctttgg agcgcgggcg ccgcccttga actggatggc 7500cgcaggcgcc tcgagctctg gctgagaagc agacctacgg gaacccttga actccctccc 7560cccgctggcc ccggagatac cgccttcgac tactacgtgg ccccggacgg gacttggact 7620cattggaaca ccagaactca agagtacctt tatccgtccg atactactcc ggagtacgga 7680tccatcctgg tccctaacgt ggacaacgtg cgaaccgact tcctgattca gacaatcgcc 7740aagcagggaa aggcggtgct gcttatcgga gaacaaggaa cagccaagac tgtcatcatc 7800aagggattta tgagcaagta tgatcctgag tgccacatga tcaagagctt gaacttctcg 7860tcggccacta ctcccctcat gttccaaaga accatcgagt cctacgtgga caaaagaatg 7920ggaaccacgt acgggccccc tgccgggaag aagatgaccg tctttatcga tgacgtgaac 7980atgcccatca tcaacgagtg gggtgatcaa gtcaccaacg aaatcgtgcg gcagctcatg 8040gaacagaacg gattctacaa cctggaaaag ccaggggaat 
tcacctccat cgtggatatt 8100cagttcctgg ccgccatgat ccatccaggc ggaggccgca atgacattcc gcaacgcctg 8160aagcggcagt tctcaatctt caattgcacc ctcccgtcgg aagctagcgt ggataagatt 8220tttggtgtca tcggagtcgg acactactgc acccaacgag gctttagcga agaggtcaga 8280gatagcgtca cgaaactcgt gccgttgact cgccgccttt ggcagatgac caagatcaag 8340atgctaccca cccccgctaa gttccactac gttttcaatt tgcgggactt gagccgggtg 8400tggcagggaa tgctgaacac cacctcggaa gtcatcaaag aacctaacga cttgctgaag 8460ctctggaagc acgaatgcaa gcgcgtgatc gccgaccgct tcactgtgag cagcgacgtg 8520acgtggtttg ataaggctct ggtcagcctg gtcgaggagg agttcgggga agagaagaag 8580ctcctggtcg actgtggcat cgacacgtat ttcgtggact ttctcagaga tgcgcccgaa 8640gccgcgggcg aaacctcgga agaagcagac gctgagactc ccaagatcta cgagccgatt 8700gagagcttca gccacttgaa ggagagactt aacatgtttc tgcagctcta caacgaaagc 8760attcggggcg ctgggatgga catggtgttt ttcgctgacg ccatggttca cttagtgaaa 8820atctcccgcg taattaggac cccgcagggt aatgccctgc ttgtaggcgt gggcggatcc 8880ggaaagcagt cactgacccg gcttgcatca ttcatcgcgg gctacgtgag cttccagatc 8940actcttacta gatcatataa cacgtccaat ctcatggagg acctgaaagt gctgtatagg 9000accgcgggcc agcaggggaa gggtatcacc ttcattttta ccgacaacga aatcaaagat 9060gagtccttct tggagtacat gaacaacgtg ctctccagcg gcgaagtgag caaccttttc 9120gcccgcgatg agatcgatga aattaactcc gacctggcct cagtgatgaa gaaggaattt 9180ccgcgctgcc tccctacgaa cgaaaatctg cacgattact tcatgtcaag agtccggcag 9240aatctccaca tcgtgctgtg cttttcaccg gtgggagaga agtttcgcaa ccgcgcactg 9300aagttcccgg ccctgatttc gggatgcacg attgattggt tctcgcggtg gcctaaagat 9360gccctcgtgg cggtgtccga gcacttcctt acttcctacg acatcgattg ctccctggag 9420atcaagaaag aggtggtgca gtgtatgggt tcattccagg acggggtcgc agagaagtgc 9480gtggactact ttcagagatt cagacgctcc actcacgtca ctccgaaatc ctacctgagc 9540ttcatccagg gatacaagtt tatatacggc gaaaagcacg tcgaagtgag aacactcgcc 9600aacaggatga acactggttt ggaaaagctc aaggaggcgt ccgaatcggt ggccgcactg 9660tcaaaggagc tcgaagcgaa ggagaaagaa ctccaggtcg cgaacgacaa ggccgacatg 9720gtgttgaagg aagtaaccat gaaggcccag gccgccgaga aggtgaaggc cgaagtgcaa 9780aaggtcaagg 
atagggcaca agccatcgtc gactcaattt cgaaggacaa ggcaatcgcc 9840gaagaaaagt tggaggccgc gaagccggcc ctcgaagaag cggaggcggc gctgcagacc 9900atcaggcctt cagatatcgc tacggtccgc accctgggtc gccctcctca cctcattatg 9960agaatcatgg actgcgtgct gcttttgttc cagcgcaaag tctcagcagt gaagatcgat 10020ctcgaaaagt catgcaccat gccatcatgg caggagagcc tgaagctaat gaccgctgga 10080aatttccttc aaaatctgca gcaattccca aaggacacta ttaacgaaga agtgatcgaa 10140ttcctctcgc cttacttcga gatgcccgac tacaacatcg aaaccgccaa aagagtgtgc 10200ggcaacgtcg ccgggctttg ctcctggacg aaggcaatgg ccagcttctt ctcaatcaac 10260aaagaggtgc tgcccttgaa ggccaacttg gtggtccagg aaaacaggca ccttctcgcc 10320atgcaagatc tccaaaaggc ccaagccgag ctcgatgaca agcaggccga gctggatgtg 10380gtgcaggcgg agtacgagca ggcgatgaca gaaaagcaga cgttgcttga ggacgccgag 10440agatgccggc acaaaatgca gaccgcctcc accctgatct cggggctggc aggcgaaaaa 10500gaaagatgga ccgagcaatc gcaggagttt gccgctcaaa ccaagaggtt ggtgggcgac 10560gtgttgttgg caaccgcatt cctgagctac agcggaccat tcaaccaaga gttccgcgac 10620ctgttgctga acgactggag aaaggagatg aaggcccgca agatcccatt cggcaagaac 10680cttaatctca gcgaaatgct gattgatgcc cctacaattt ccgagtggaa tctgcaggga 10740cttccaaacg acgacctgag catccaaaat ggcatcatcg tgaccaaggc ttcacggtac 10800cccctcctca tcgacccgca gactcaagga aagatttgga tcaagaataa ggaaagccgg 10860aacgagctgc agatcacttc ccttaaccac aagtacttta gaaaccacct cgaggactcc 10920ctttccttgg gtcggccgct cctcattgag gacgtgggcg aggaattaga cccagccctc 10980gacaacgtgc tggaaagaaa cttcattaag accggatcca cctttaaggt gaaagtcgga 11040gataaggaag tcgacgtcct ggacggcttc agactctaca ttacgactaa gctcccaaac 11100cctgcctata ccccggagat ctccgcgcgc actagcatta ttgacttcac cgtcacgatg 11160aagggactgg aggaccagct gctggggagg gtgatcctga cggaaaagca ggaactcgag 11220aaggagcgga cgcatttgat ggaggacgtg actgcgaaca agcggcgcat gaaggagctg 11280gaggataacc tgttgtaccg gctgaccagc acccagggct cactggtcga ggacgaaagc 11340cttatcgtgg tgctgtcgaa cactaagcgg actgccgagg aggtcactca gaagctggag 11400atctccgctg agacagaagt ccagattaac tccgcccgcg aagagtacag gcctgtcgcc 11460actagaggat ccatcctgta 
cttcttgatt accgaaatga gactcgtcaa cgagatgtac 11520caaacctccc tgagacaatt cctgggcctg ttcgacctga gcctggcccg gtcggtaaag 11580tcccctatca ccagcaagcg catcgctaat atcattgagc acatgaccta tgaggtgtac 11640aagtacgccg cccgggggct ctacgaagaa cataagttcc tgtttaccct tctgctcacc 11700cttaagatcg acattcagcg gaaccgcgtg aagcacgagg aattcctgac ccttatcaag 11760ggaggagcct cactcgactt gaaagcgtgt cccccaaagc cctcgaagtg gatactggac 11820attacctggc tgaaccttgt ggagctctcc aaactccggc aattttcaga cgtcttggat 11880caaatttcaa ggaacgaaaa aatgtggaag atctggttcg acaaggaaaa tccggaggaa 11940gaaccactgc cgaacgccta cgacaagagc ctcgactgct ttcggcggct gctgctcatt 12000agaagctggt gcccggatcg gaccatcgcg caggccagga agtatatagt ggattcgatg 12060ggggagaagt acgcggaagg cgtgattctc gatcttgaaa agacgtggga ggaatcggat 12120cccaggacgc ccctcatttg cctgctgtcg atgggcagcg atcccacaga ctccatcatt 12180gcactgggca agcggctcaa gattgaaact agatacgtgt cgatgggtca gggtcaagag 12240gtgcacgcca gaaagttgct tcaacaaacc atggcaaacg gaggttgggc cctgcttcaa 12300aattgccacc tgggactgga tttcatggac gaactcatgg atattattat cgaaaccgag 12360ctggtccacg acgccttcag actgtggatg accaccgagg cccacaagca gtttccgatc 12420acgctgctgc aaatgtcgat caagttcgcg aatgatccgc cgcaaggact ccgggccgga 12480ttgaagagga cctactcggg agtctcacag gacctcctcg acgtttcgag cggctcacaa 12540tggaagccca tgctgtacgc ggtcgctttc ctgcacagca ccgtgcaaga acggcggaag 12600ttcggcgcgc ttgggtggaa tatcccgtac gaattcaacc aagccgactt taacgccacc 12660gtgcagttta tccagaacca cctcgatgat atggatgtca aaaagggagt ctcgtggacc 12720accattcggt acatgattgg agaaattcag tacgggggac gcgtcaccga tgattacgac 12780aagaggcttc tcaacacctt cgccaaagtg tggttctcgg aaaatatgtt tggaccggac 12840ttctcgtttt accagggtta caacatcccg aaatgctcaa ccgtcgataa ctacctccaa 12900tacatccaat cgctgcccgc gtatgactcg ccagaggtct ttggactgca tcctaacgcc 12960gatattacct accagtccaa actggcgaag gacgtcctcg ataccatcct cgggatccaa 13020ccaaaggaca cgagcggagg aggcgacgaa acccgcgagg cagtggtggc tcgcctggcc 13080gatgatatgc ttgaaaagct gcccccagac tatgtgccgt tcgaggtcaa ggaaagattg 13140caaaagatgg gacccttcca accgatgaac 
atcttcctga gacaggagat cgacaggatg 13200cagcgcgtgc tatcgctggt cagatcgacg ctgaccgagc tgaaattggc aattgacgga 13260acgattatca tgagcgaaaa ccttcgggac gcccttgatt gcatgttcga cgccagaatt 13320ccggcttggt ggaaaaaagc gtcatggatt agctcaaccc tcgggttttg gttcacggag 13380ctcattgaga ggaactccca gttcacttca tgggtgttta atggacggcc tcattgcttc 13440tggatgactg gattctttaa tcctcaaggt ttcctgactg cgatgcggca ggagattact 13500cgggccaata agggttgggc cctggataac atggtgttgt gcaacgaagt gactaaatgg 13560atgaaggatg acatcagcgc accgcccacg gaaggcgtgt acgtctacgg attgtatctc 13620gaaggagcgg gttgggacaa aaggaacatg aagctgatcg agtccaaacc aaaagtgctc 13680tttgagctga tgcccgtgat ccggatctac gcagagaaca acaccctgcg ggatccccgc 13740ttttattcat gcccgattta caagaagccc gtgaggaccg acctcaacta catcgcagcc 13800gtggatttgc gcaccgccca aaccccggag cactgggtgc tgaggggggt cgcattgttg 13860tgcgatgtga aatag 138752313875DNAArtificial SequenceSynthetic polynucleotide 23atgttccgga ttgggaggag acagctttgg aagcactcag tcactagggt gctgacccaa 60agactgaagg gcgaaaagga agcgaagaga gctttgctcg atgcgcgcca caattacctc 120ttcgcgattg tggcctcatg cctggatctc aacaaaaccg aggtcgagga cgcgatcctc 180gagggtaacc aaatcgaacg catcgaccag ttgtttgccg tgggtggact gcggcacttg 240atgttctact accaggatgt cgaagaggct gaaacgggcc agctgggttc actcggagga 300gtgaatcttg tcagcgggaa gatcaaaaag ccgaaagtgt tcgtcaccga aggtaacgat 360gtagccctga ccggagtgtg tgtgttcttt atccggaccg atccgtccaa ggcgatcact 420cctgataata ttcaccagga ggtgtcgttc aacatgctgg acgctgctga cggaggactg 480cttaacagcg tcaggagact cctgtccgac atcttcattc ctgccctgcg ggcaaccagc 540cacggatggg gagagctcga gggattgcag gacgcggcaa acattaggca ggaattcctg 600agctccttgg agggcttcgt gaacgtcctt agcggagccc aagagtcctt gaaggaaaaa 660gtgaacttgc ggaaatgcga tatcctggag ctcaagactc tcaaggagcc gaccgattat 720ctgacgctgg ccaataatcc agagactctc ggcaagatcg aggattgtat gaaggtctgg 780attaagcaga ctgagcaagt gctcgctgaa aataaccaac tgctgaagga agcggatgat 840gtgggaccgc gcgctgaact ggaacactgg aagaaacgcc tgtcgaagtt taactacctc 900ctggaacagc tcaagagccc ggatgtgaag gcagtgctcg cagtgttggc ggccgcgaag 
960tcaaagctcc ttaagacttg gcgggagatg gacatccgga tcactgatgc gaccaacgaa 1020gccaaagaca acgtgaaata cttgtacact ctggaaaagt gctgcgaccc attgtactcc 1080agcgacccgc tgtcaatgat ggacgctatc cccacgttga ttaacgcgat taagatgatc 1140tactcaatca gccactacta caacacctcc gaaaagatta cctcactctt cgtcaaagtg 1200acgaaccaga tcatctcagc gtgcaaggcg tacattacta acaacggcac cgccagcatt 1260tggaatcaac ctcaagacgt cgtggaggag aagatcttga gcgccatcaa gctgaagcag 1320gagtaccaac tatgctttca caaaaccaag caaaagctca agcagaaccc taacgccaag 1380caattcgact tctccgaaat gtacatcttc ggcaagtttg aaactttcca caggcgcctc 1440gccaaaatca tcgacatctt caccaccctc aaaacttaca gcgtgctgca ggattccact 1500atcgagggac ttgaagatat ggctaccaag taccaaggaa tcgtggccac cattaagaag 1560aaggaataca attttctgga ccaacgcaaa atggacttcg accaggatta cgaggagttc 1620tgcaagcaga ctaacgatct gcacaatgag ctccggaagt tcatggacgt gactttcgca 1680aagatccaaa acaccaacca agccctccgc atgctgaaga aattcgaaag actcaatatc 1740cctaacctgg gcattgacga taaataccaa cttatcctcg agaactatgg cgcggatatc 1800gacatgatct cgaaactgta cactaagcag aaatacgacc cgccccttgc tcgcaatcag 1860cccccgattg ccggaaagat cctgtgggcc cggcagctct ttcacagaat tcaacagcca 1920atgcagctct ttcagcaaca tcccgccgtg ctttccaccg cggaggctaa gcctatcatc 1980cggtcctaca accggatggc taaggtcctc cttgagttcg aagtcctgtt ccaccgggcc 2040tggcttagac aaatcgaaga gattcacgtg ggactcgaag ctagcttgct cgtaaaagca 2100ccgggaacgg gggagctttt cgtcaacttt gacccccaaa tcctgatcct gttccgcgaa 2160accgagtgca tggctcaaat gggacttgag gtgtccccac tcgcgactag cctctttcag 2220aagagagaca gatacaagcg caacttctcc aacatgaaaa tgatgctggc tgagtatcaa 2280cgcgtgaagt cgaaaattcc agccgccatt gaacagctga ttgtgcctca cctggccaaa 2340gtggatgaag ccctgcagcc cggtctggcc gctttgacgt ggaccagcct caacatcgaa 2400gcgtacttag agaacacttt cgccaagatc aaggatctcg aactcctcct ggaccgagtc 2460aatgacctca tcgaattcag gatcgatgcc attttggaag agatgtccag cactcccctt 2520tgccagcttc cccaggagga gccgcttact tgcgaggagt tcctccagat gactaaggat 2580ctctgcgtga acggtgctca aatcctgcac tttaagtcct ccctggtgga agaagccgtc 2640aatgagctgg tgaacatgtt gcttgacgtg 
gaggtcctgt cggaggaaga gagcgagaag 2700atttccaacg aaaacagcgt caattacaag aacgagagta gcgccaaacg cgaggaagga 2760aacttcgaca ctctgacatc ctcaatcaac gcccgcgcta atgctctcct cctgaccact 2820gtcacgagaa agaagaagga aaccgagatg ctcggggagg aggctagaga gctcctgagc 2880cactttaacc accaaaacat ggacgcgctg cttaaggtga cccgcaacac cctcgaggcg 2940atcagaaagc ggatccactc gtcacacacc attaactttc gagactccaa ttccgcatcg 3000aacatgaagc aaaattcact accgatcttc agagcctcgg tgactttggc aattcccaac 3060atcgtcatgg ctcctgcact cgaggatgtc cagcaaaccc tcaataaggc cgtcgagtgt 3120atcatcagcg tgccgaaggg agtgcgccag tggtcgtcag aactcctcag caagaaaaag 3180atccaggaaa gaaagatggc cgcgctccag tctaacgaag atagcgactc cgacgtggaa 3240atgggagaga acgagctgca agacactctg gaaatcgcca gcgtcaatct ccctatcccc 3300gtccagacca agaactacta taagaacgtc agcgaaaaca aggaaatcgt gaagttggtg 3360tccgtcttgt ccacgatcat caactcgacc aaaaaggagg tgatcaccag catggattgc 3420tttaagagat acaaccacat ttggcagaag ggcaaggagg aggctatcaa gaccttcatt 3480acccagtccc cactgttgtc ggaattcgaa tcgcagatcc tgtacttcca gaatctggaa 3540caggagatta atgccgagcc cgagtacgtc tgcgtcggca gcatcgcgct gtacaccgcc 3600gatttgaagt tcgccttgac tgccgagact aaggcctgga tggtggtgat tggccgccat 3660tgcaacaaga agtataggag cgaaatggaa aatattttca tgttgatcga ggaattcaat 3720aagaagctga atagaccaat caaggacctc gacgacatta ggattgccat ggcagccctg 3780aaggaaattc gcgaggaaca aatcagcatt gacttccagg tcggcccaat cgaagaaagc 3840tacgccctgc ttaaccgcta cggactgctc attgcccgcg aagagatcga caaagtggac 3900acgttacatt acgcttggga gaaactcctg gctcgcgccg gagaggtgca aaacaagctg 3960gtcagcctcc aacccagctt caagaaggag ctgatctccg ccgtcgaggt gttcttgcag 4020gattgccacc
aattttacct ggattacgac ctgaacggac cgatggcgtc cggcctcaag 4080ccgcaagagg ctagcgacag gctgatcatg tttcagaacc aattcgacaa catttacaga 4140aagtacatca cttacaccgg aggcgaggag cttttcggcc tgcccgccac tcagtaccca 4200caattgctgg agatcaagaa gcagctcaat ctgctgcaga aaatttacac cttgtacaac 4260tcagtgattg aaactgtgaa cagctactac gacatcctgt ggagcgaggt caacattgaa 4320aagatcaata atgaactgct ggagttccag aaccggtgcc gcaagctgcc tagggcgctg 4380aaggactggc aggcgttcct ggacctgaag aagattatcg acgatttctc agaatgctgc 4440ccactgctcg agtacatggc ctcgaaggcc atgatggaac ggcactggga gcgcatcacg 4500actttgactg gacatagcct tgacgtgggc aatgagtcct tcaagctgcg gaacatcatg 4560gaagctccct tgctcaagta caaggaagaa atcgaagata tctgcatctc cgccgtgaag 4620gaacgcgaca tcgaacagaa actgaaacag gtcatcaacg aatgggacaa caagactttt 4680actttcggga gcttcaagac caggggcgag ctccttcttc ggggtgactc aacctcagag 4740attatcgcca atatggaaga tagcctcatg cttctcggaa gcctcctgtc caaccgctat 4800aacatgccct ttaaggccca aatccaaaag tgggtccaat acttatctaa cagcaccgat 4860attatcgaat cgtggatgac cgtgcagaac ctttggattt acctggaggc cgtgttcgtc 4920ggcggagata ttgccaagca actgcccaag gaggccaagc gcttctcaaa catcgacaag 4980tcctgggtca aaatcatgac tcgggcccac gaagtcccct cagtggtcca gtgctgtgtc 5040ggcgacgaaa ccttggggca gctcctgccg caccttctcg accagttgga gatttgtcaa 5100aagtccttga ccggctacct tgaaaaaaag aggttgtgct tcccacgctt tttctttgtg 5160agcgaccccg cgctcctcga gatcctgggt caggcgagcg actcccacac catccaggcg 5220cacctcctga acgtgttcga caacattaag agcgtcaaat tccatgaaaa gatctacgac 5280agaatcctga gcatctccag ccaagagggt gaaaccatcg aactcgacaa gccagtgatg 5340gctgagggaa acgtcgaggt ctggctgaat tcattgctgg aagaatcgca gagcagcttg 5400cacctcgtga tccgccaggc cgcggcgaac attcaggaga ctggctttca gttgaccgag 5460tttttgagct ccttccccgc gcaagtgggc ctgctgggga tccagatgat ttggactagg 5520gattccgaag aggcgctgag aaatgccaag tttgacaaaa aaattatgca gaaaaccaac 5580caggcctttc tcgagcttct gaacactttg atcgacgtca ctacacgcga tctgagctcg 5640actgagcgcg tcaagtatga aaccctgatt accatccacg tccatcaacg ggatattttc 5700gatgacctct gccacatgca tatcaaatca ccgatggatt 
ttgaatggct caaacagtgc 5760cgcttctact tcaatgaaga tagcgacaag atgatgattc atatcaccga tgtggccttc 5820atctaccaga acgaattcct tggctgcact gacagactgg tcatcactcc tctgaccgat 5880agatgctata ttacccttgc ccaggcctta ggaatgtcga tgggcggtgc tccagccggg 5940cccgcgggca ccggaaagac agagactacg aaggatatgg gaaggtgcct gggcaagtac 6000gtcgtcgtgt tcaattgcag cgaccagatg gatttccggg gcctgggccg cattttcaag 6060ggccttgcgc agtcgggatc gtgggggtgc ttcgatgaat tcaatcggat cgacttgcca 6120gtgttgagcg tcgcggccca acagatttca atcatactaa cctgtaaaaa ggagcacaag 6180aaatccttca ttttcacgga cggggacaac gtgacgatga acccagaatt tggcctcttc 6240ctcactatga atccagggta cgccggtcgg caggaacttc ccgaaaatct gaaaattaac 6300ttcaggagcg tcgccatgat ggtcccagat cggcaaatca tcattagggt caaactggct 6360tcgtgcggat tcattgataa cgtggtgctg gccagaaagt tcttcaccct ttacaaattg 6420tgcgaggaac agctctcgaa gcaagtgcac tacgactttg gcctcagaaa catcctaagc 6480gtgctgcgca ctctgggggc agccaaaaga gccaacccaa tggacaccga gagcactatt 6540gtgatgaggg tccttcgcga catgaacctc tcaaaactta tcgacgagga cgagccgcta 6600ttcctcagcc tgatcgaaga tctgttcccc aacatcctgc tggacaaggc gggctaccct 6660gaactggagg ccgccatttc ccgccaagtc gaagaggccg gcttgattaa ccatcctccc 6720tggaagctca aggtgatcca gttattcgag actcagcgcg tacggcacgg gatgatgacg 6780ctgggtccgt cgggtgccgg aaagactact tgtatccaca ccctcatgag ggctatgacc 6840gattgtggaa aaccccaccg ggaaatgcgc atgaacccga aggcgattac ggctccgcag 6900atgtttggac gcctggatgt cgccactaac gactggaccg atgggatttt ctccacgctg 6960tggagaaaaa ccttgcgcgc caaaaagggc gaacacattt ggatcattct tgacggaccg 7020gtggacgcga tctggatcga aaatctgaat tccgtcctgg acgataacaa aactctcacc 7080ctggctaacg gtgatcggat ccctatggcg ccgaattgta aaattatctt cgaacctcat 7140aacattgaca atgctagccc cgcaacggtc agccgcaatg gcatggtgtt tatgtcatcg 7200tccatcctgg attggtcacc gatcctcgag ggattcctca agaagcgctc ccctcaagag 7260gccgaaattc tccggcagct ctacaccgag tcgttcccag atttgtacag gttctgtatt 7320cagaacctgg aatacaagat ggaagtgttg gaagcattcg tgatcacgca gtcgatcaac 7380atgctgcaag gtctgattcc tttgaaggag cagggcggcg aggtgtccca agcgcacctg 7440ggacggttgt 
tcgtatttgc cttgctgtgg tccgccgggg ccgccctgga actggacggg 7500cgccgccgcc tggagctctg gctccggtcc agacctaccg gaacgttgga actgcccccc 7560ccggccggtc ccggggacac cgcctttgac tactacgtgg ctcccgacgg cacttggact 7620cattggaaca cccggactca agagtacctg tacccttcag acacaactcc cgagtacggt 7680tcaatccttg tgccaaacgt cgacaacgtg cgcaccgact ttctcattca gaccattgcc 7740aagcaaggaa aggcagtcct gctcatcgga gagcagggca cggccaagac cgtgatcatc 7800aagggattca tgagcaagta tgaccctgag tgtcatatga tcaagtcact gaacttttcg 7860agcgctacta caccgctcat gttccagcgg actatcgaaa gctacgtgga caaaagaatg 7920ggaaccactt acggaccgcc agccggaaag aagatgaccg tgttcattga cgatgtgaac 7980atgcccatta tcaacgaatg gggcgaccag gtcaccaacg aaatcgtgag gcaattgatg 8040gagcagaacg gattttataa cctcgaaaag ccaggcgaat tcacctccat cgtggatatc 8100cagttcctgg ccgcaatgat ccatcccggc ggcggtagaa acgacatccc gcaacgcctg 8160aagcggcaat tctcaatttt caattgtact ctcccctccg aggcctccgt ggacaagatc 8220tttggagtca ttggggtggg ccactactgc acccagagag gcttttccga ggaagtccgg 8280gactcggtca ccaagctggt gccgcttacc cgcaggctct ggcaaatgac caaaatcaag 8340atgcttccta ccccggccaa gtttcactac gtgttcaacc tccgcgatct gtcacgcgtg 8400tggcagggga tgctcaacac cacttccgag gtcatcaagg aaccgaacga cctcctgaaa 8460ctgtggaaac atgaatgcaa aagagtcatt gctgaccgct tcaccgtgag tagcgacgtc 8520acatggttcg acaaggcact ggtcagcctg gtcgaagagg agttcggcga ggaaaaaaaa 8580ctgcttgtgg actgtgggat cgatacttat ttcgtggatt tcctgaggga tgccccagag 8640gcggccggcg aaacatccga agaagcagac gccgaaaccc ctaagatcta cgagcccatt 8700gaatcctttt cccacttaaa ggaacgcctg aacatgttcc tgcagctgta caatgaaagc 8760attcgcggcg ccgggatgga catggttttc ttcgccgatg ccatggtgca cctcgtgaag 8820ataagccggg tcattagaac ccctcaaggg aacgccctgt tggtcggtgt cggcggctcg 8880gggaagcaga gcttgactcg gctcgccagc tttatcgccg ggtatgtctc atttcaaatc 8940accctgaccc gctcctacaa cactagcaac ctgatggagg acctgaaggt gctgtacaga 9000actgccgggc agcagggcaa gggtatcacg tttattttca ccgataacga aattaaggat 9060gagagctttc tcgaatacat gaacaacgtg ctgagttccg gggaggtgtc caaccttttt 9120gcaagggacg aaatcgatga gatcaactcc gacttggcca 
gcgtgatgaa gaaggaattc 9180ccaagatgcc tcccaactaa cgaaaatctc cacgactatt tcatgtcaag agtccgacaa 9240aacttgcaca tcgtgctgtg ctttagtcct gtgggagaga agttccggaa ccgggctctg 9300aagtttcccg ccctgattag cggatgcacc attgactggt tctcccggtg gcccaaagat 9360gcgctggttg ctgtgtcgga acacttcctg acctcgtacg atattgactg ttccctggaa 9420atcaagaagg aagtcgtcca atgcatggga agtttccagg acggtgtcgc cgaaaagtgc 9480gtggattact ttcaaaggtt ccgccgcagt actcatgtca cccccaaaag ctacctgtca 9540ttcattcagg ggtacaaatt catctacgga gagaaacacg tggaggtgcg gactctggcc 9600aatcgcatga acaccgggct ggagaagctc aaggaagcga gcgagagcgt ggccgccttg 9660tccaaggaac tcgaggccaa ggaaaaagag ctccaggtag ccaatgacaa ggccgatatg 9720gtgctgaagg aggtcaccat gaaagcacag gcagcagaga aagtcaaggc ggaagtccaa 9780aaggtgaagg accgggccca ggccatcgtc gactcgatct ccaaggataa ggccattgcc 9840gaggaaaaat tagaggcggc gaagccggct ctcgaggaag cggaagctgc cctgcaaacc 9900atcagaccct ccgacatcgc caccgtgagg actctgggcc gccctccgca tcttatcatg 9960cggatcatgg actgcgtcct gctcctgttc cagcgaaagg tgtccgccgt gaagatcgat 10020ctcgagaagt cgtgtaccat gccctcgtgg caggaatcgc tcaaattgat gacggccgga 10080aacttcctgc aaaatctgca gcagttccct aaagacacta tcaacgagga agtgatcgaa 10140ttcctctcgc cgtacttcga gatgccggac tacaacatcg aaaccgctaa acgggtctgc 10200ggcaacgtcg ccggtctttg ctcctggacc aaggcgatgg cctccttctt ctcgattaac 10260aaggaggtgc tgcctctcaa ggccaacctc gtcgtgcaag aaaacagaca cctactagcg 10320atgcaagacc ttcagaaggc acaggccgag ctggacgaca agcaagcaga actggacgtc 10380gtgcaggccg agtacgagca agccatgacc gagaagcaga ctctgctaga ggacgccgag 10440cgctgtcggc acaaaatgca aaccgcgtcc acccttattt cgggattggc aggagaaaag 10500gaacggtgga ctgaacagag ccaagaattt gccgcgcaaa ccaagagact tgtgggcgac 10560gtgctcctgg ccactgcctt cctgagctat tccggacctt tcaatcagga attccgcgac 10620ctcctgctga atgactggcg gaaggagatg aaggcacgca agatcccttt cggaaagaac 10680ttgaacctct ccgagatgct tatcgacgcg cctaccattt ccgaatggaa cctccaggga 10740ctgccgaatg acgatctcag cattcaaaac gggatcatcg tgacgaaggc cagccgctac 10800cccctgctca tcgaccccca gactcagggg aagatttgga tcaagaacaa ggaaagccgg 
10860aacgaactcc aaattaccag ccttaaccac aagtactttc gcaaccacct cgaggacagc 10920ctttccttgg gcagaccttt actgatcgag gatgtgggag aggaactgga ccctgccctt 10980gataacgtac tggaaagaaa ctttatcaag actggctcta ctttcaaagt gaaagtgggt 11040gacaaggaag tggatgtgct ggacgggttc cggctgtaca tcaccaccaa gctccccaat 11100ccagcctaca cccctgaaat ctccgcccgg actagcatca ttgatttcac cgtgactatg 11160aagggccttg aggaccaact gctgggcaga gtcattctga ccgagaaaca ggaactggaa 11220aaggaaagga cgcatcttat ggaggacgtc actgcgaaca agcggcggat gaaggagttg 11280gaagataact tgctctaccg gctgaccagc acccagggaa gcctcgtgga ggatgaatca 11340ctcatcgtgg tcctcagcaa caccaagagg actgcagagg aagtgaccca gaagttggaa 11400attagcgcag aaacggaggt gcagatcaat agtgctcggg aggaatacag gcccgtcgcg 11460accaggggga gcatcctgta cttcctgatt actgaaatgc gcctggtcaa tgaaatgtac 11520cagacgtccc tgcgccaatt cctggggctt tttgacttgt ccctcgctag gtcggtgaag 11580tcccctatca cgtccaaaag aattgcaaac attattgaac acatgacgta cgaagtctac 11640aagtatgcag ccagaggtct gtacgaggaa cacaagttcc ttttcaccct tctgctgacc 11700ctgaagattg atatccagcg caatagggtg aaacatgaag agttcctcac cctgatcaaa 11760ggtggcgctt ccctcgattt gaaggcatgt ccgcctaagc cttcaaagtg gatcctggat 11820attacgtggc tcaatctcgt ggaactgtcc aagttgcgac agtttagtga cgtgctggac 11880cagatttccc gcaatgaaaa gatgtggaag atctggttcg ataaggaaaa ccccgaagag 11940gagccgctgc ccaacgccta cgacaagtcg ctggattgct ttcggaggtt actcctgatt 12000agatcgtggt gcccggatcg caccatcgct caggccagga agtacatagt ggactcgatg 12060ggggaaaaat acgccgaagg agtcattttg gacctcgaaa agacttggga ggagtccgac 12120cctcgcactc cgctgatctg cctgcttagc atgggctcgg accctactga ttcgatcatt 12180gccttgggga agcgcctcaa gatcgaaacc agatacgtgt cgatgggcca aggtcaagag 12240gtgcacgcgc ggaaactcct ccagcagacc atggcgaacg gcggctgggc gctgctgcag 12300aactgccacc tcgggctgga cttcatggac gagctgatgg acatcattat cgaaaccgag 12360ctggtgcacg acgcgttccg cctttggatg accaccgagg cccataagca atttcccatc 12420acactcctcc aaatgtccat caagtttgcc aacgaccctc cgcagggtct gcgggctggt 12480ctgaagagga cttattccgg agtcagccag gacctcctag atgtcagcag cggatcccaa 
12540tggaagccaa tgttgtacgc cgtggccttt ttgcactcca ccgtccagga aaggaggaag 12600ttcggagcct tagggtggaa tatcccgtac gaattcaacc aggccgattt caatgctact 12660gtccagttca tccagaatca tctggacgat atggacgtaa agaagggagt gtcgtggacg 12720acgattcgct acatgattgg tgagattcag tacggggggc gggtcactga cgactatgat 12780aagaggcttc tcaatacctt tgcgaaggtc tggttctccg aaaacatgtt cggccccgat 12840ttctccttct accaaggtta caatatcccg aagtgctcga ccgtggacaa ctacctccag 12900tacattcaga gccttcctgc ctacgactcc ccggaagtgt tcggcctgca ccctaacgcg 12960gacattacgt atcaaagcaa gctggccaaa gacgtgctgg acactattct cggaatccag 13020cctaaggaca catcgggagg aggcgatgaa acccgcgaag ccgtggtggc gcggctggcg 13080gatgacatgt tggagaagtt gccaccggac tacgtgccat tcgaagtgaa agaaaggctg 13140caaaagatgg gaccgtttca acccatgaac attttcctta gacaggaaat tgatagaatg 13200caaagggtcc tgtcgctcgt tagatccacg ctgaccgagt taaagctggc tatcgacggc 13260accattatca tgagcgagaa tctccgggac gctctcgact gcatgttcga tgcaaggatt 13320ccggcttggt ggaagaaggc gagctggatt agctccactt tgggcttttg gttcactgaa 13380ctcattgaac ggaactccca gtttacttcg tgggtcttta acggtaggcc acactgcttt 13440tggatgactg gattctttaa cccccagggc ttccttaccg ctatgcggca ggagattacc 13500agagccaaca agggttgggc actggacaac atggtgcttt gtaacgaagt gaccaagtgg 13560atgaaggacg atattagcgc tcccccgacc gaaggggtct acgtctacgg actctacctg 13620gaaggggccg gttgggacaa aagaaacatg aagctcattg agtctaagcc caaggtcctc 13680ttcgaactca tgccagtcat tcgcatttac gccgaaaata acactctccg cgatcctcgg 13740ttctactcat gcccgatcta caagaagccc gtgagaactg acttgaatta catcgctgcc 13800gtggacctca gaacggccca gacccccgaa cactgggtcc tgagaggggt ggcactgctg 13860tgcgatgtca agtag 138752413875DNAArtificial SequenceSynthetic polynucleotide 24atgttcagaa tcggaaggag acaactatgg aagcacagcg tgacgcgggt acttacccaa 60cgtctaaagg gggagaagga ggcgaagcgg gcactgctag acgcgcgtca taattacctc 120tttgcaatag ttgccagctg cctcgacctc aacaagacgg aggtagagga cgccatatta 180gagggcaacc agattgagcg gatcgatcag ctatttgccg tgggcgggct ccggcatcta 240atgttttact accaggacgt cgaggaagct gagaccgggc aactgggatc cctgggaggc 300gtcaacctcg 
tctccggcaa gataaaaaag cctaaggttt tcgttacaga gggcaacgac 360gtagcgctga ctggtgtatg cgtcttcttc atacggacag accccagcaa ggcgattacg 420ccagacaaca tccaccagga ggtctcgttt aacatgctcg acgccgccga tggcgggctg 480ctgaactcgg tgcgccggct gctctcggat atctttatcc ccgcgcttcg ggcgacgagc 540cacgggtggg gtgagctgga aggcctacag gacgcggcca atattcgtca ggagttccta 600tccagcctgg aaggttttgt taacgtgctg tccggcgccc aggagtcgct taaggagaag 660gtgaacttac gaaagtgtga tatattagag ctgaaaaccc tgaaggaacc tacagactat 720ctcaccctcg caaacaaccc cgaaaccctc ggcaaaattg aagattgcat gaaggtgtgg 780attaagcaga cggaacaagt cctggcagag aacaaccaac tcttgaagga ggccgacgac 840gtgggcccgc gcgctgagct ggagcactgg aagaagaggc tcagcaagtt taactatctt 900cttgagcagc tgaagagccc ggacgttaag gcggtactag cggtcctcgc ggctgcgaag 960tcgaagctgc tcaagacctg gcgtgagatg gacatacgca tcacggacgc aaccaacgaa 1020gctaaggaca acgttaagta tttgtatacc ctcgagaagt gctgcgaccc cctctactca 1080tctgatccgc tcagtatgat ggatgccatc cccacgctaa ttaacgccat taagatgatc 1140tactcgatat cgcactatta caacacgtct gaaaaaatca ccagcctctt cgtaaaagtg 1200actaaccaaa tcattagcgc ctgcaaggct tacatcacta acaacggcac cgccagtata 1260tggaaccagc cccaggacgt cgtggaggag aagatcctat cggccataaa gctgaagcag 1320gagtatcagc tgtgcttcca caaaaccaag cagaaactca agcagaaccc aaatgctaag 1380cagttcgact tttctgagat gtatattttc gggaagtttg aaacatttca tcgccgcctg 1440gccaaaatca tcgacatatt caccactctg aagacctact cagtcctaca agacagcact 1500atagaagggc tagaggatat ggccacaaag taccagggca tcgtggccac tatcaaaaag 1560aaggagtaca acttcttaga ccagcgtaaa atggatttcg accaggacta tgaagaattc 1620tgcaaacaaa cgaatgattt gcacaacgag cttcggaaat tcatggatgt gacttttgcc 1680aaaatacaga ataccaacca agctcttagg atgttaaaga aatttgaaag gctcaatatt 1740cctaatttgg gcattgatga caaataccag ttgatactcg aaaattatgg agcagatatt 1800gatatgatct ctaaactgta cacaaaacaa aaatatgatc ccccgctagc tagaaatcaa 1860cctccgattg ctggtaagat actctgggcc agacagctct ttcaccgcat ccagcagccc 1920atgcagctgt ttcagcagca ccctgcggtg ctgtccaccg ccgaagcgaa acccattatt 1980cgatcttata accgcatggc caaggttctg ttagagtttg aagttttgtt 
ccaccgtgcc 2040tggttacgtc agatcgagga gatccatgtg ggactggagg cctctctcct agtcaaggcc 2100cccggcacag gcgaactctt tgtcaatttt gatccccaga ttctaatact cttccgggaa 2160accgagtgca tggcccagat gggcttagag gttagtcctc tggctacttc tctgttccag 2220aagagagacc gctataaacg gaatttcagc aatatgaaga tgatgctcgc tgaatatcag 2280agggtcaagt ccaaaatccc cgctgcgatc gagcagctga tagtgccaca cctggccaaa 2340gtagatgagg ccctacaacc aggactggcc gcgctgacgt ggacctctct gaatatcgaa 2400gcgtatttgg agaacacctt tgccaagatt aaggacctgg agcttttact ggacagagtg 2460aacgatctca ttgaattccg catagacgcg attttagagg agatgtcttc cacgccacta 2520tgccagcttc ctcaggagga gcctttaaca tgtgaagagt tccttcagat gactaaggac 2580ctctgcgtga atggcgctca gatactacat ttcaagtcta gcttggtcga ggaggcagtg 2640aacgaattgg ttaacatgtt actggatgta gaagtcctta gcgaggagga atccgaaaag 2700atcagcaacg aaaattcggt gaactataag aacgaatcta gcgccaagcg ggaggagggc 2760aactttgata cactcacttc ttccatcaat gcgagggcta atgctctctt gttgacgacc 2820gttaccagaa aaaaaaagga gactgagatg cttggggaag aggcaaggga gttgctgtcc 2880cacttcaacc atcagaatat ggacgccctc ttaaaggtta cccgaaacac gttagaggca 2940attaggaagc gtattcactc aagccacacg ataaacttcc gcgactcaaa ctcagcatca 3000aatatgaagc aaaactcctt gccgatcttc agagccagcg tcaccctggc catacctaac 3060atcgttatgg caccggcact tgaggacgta caacaaacct tgaacaaagc agtagagtgc 3120atcatcagcg tccctaaagg agttcgccaa tggtccagtg aactgctatc caagaagaag 3180atccaggagc gtaaaatggc tgcgttacag agtaacgaag attcggactc tgacgttgaa 3240atgggtgaaa acgagctcca agacacactg gagattgcga gcgttaacct gcctataccc 3300gtccagacca agaactacta caaaaacgtg tccgaaaaca aggagatcgt caagctcgtt 3360tctgtgctca gcaccataat aaattcgact aagaaagaag ttataacttc catggattgt 3420ttcaaacggt ataaccacat ctggcagaaa ggcaaggaag aagctatcaa gacatttatt 3480acccagagcc cactactaag cgagttcgag tctcagatcc tctacttcca gaatcttgag 3540caggagatca acgctgagcc cgaatatgtg tgcgtcggct cgatagccct gtacacggct 3600gatctgaaat ttgcgctgac cgctgagacg aaggcttgga tggtggtgat tggccgacac 3660tgcaacaaga agtaccggtc tgaaatggag aacatcttta tgctaatcga ggaatttaac 3720aaaaagctga accgtcccat 
taaggatctg gacgacatca ggattgccat ggcggcccta 3780aaggaaatta gagaggagca gatatccatt gattttcagg ttggccccat cgaagaatca 3840tatgcccttc tgaatcgata cggtctatta atcgcccgag aggaaataga taaggtggac 3900acacttcatt atgcatggga gaaactctta gcgcgggccg gcgaagtgca gaataagctc 3960gtatcgctgc agccatcatt taagaaggag ctcatcagtg ctgtcgaggt ctttctgcag 4020gactgccacc agttctatct ggattatgac ctgaacggtc cgatggcgag tggtctgaag 4080ccccaagagg cttcagaccg gcttatcatg ttccaaaatc agttcgacaa tatttaccga 4140aagtatatca cctatacagg gggtgaagaa ttgtttggtc tcccagccac ccagtatcca 4200caattattgg aaataaagaa gcagctgaac cttcttcaaa aaatctacac tctctataat 4260tcggtaattg aaactgttaa ttcctactac gatattctct ggagcgaggt caacattgag 4320aaaattaata acgaactctt ggagttccaa aacagatgcc gcaagttgcc gagagcgctg 4380aaggactggc aggcttttct cgaccttaag aaaataatcg atgatttcag tgaatgctgt 4440cctctcttag aatacatggc gagtaaggct atgatggaga gacactggga gaggattacg 4500actctgacgg ggcattcttt ggacgttggc aacgagtcct tcaagctgcg taatataatg 4560gaggctccac ttctcaagta caaagaggaa atagaagaca tctgtatatc tgctgtcaaa 4620gagcgcgaca tagaacagaa actaaagcag gtaattaacg aatgggacaa taaaacgttt 4680acatttggca gtttcaagac acgtggagaa ttattgcttc gaggcgactc cacctcggaa 4740attatcgcta acatggagga ctctctcatg ttactcggct cgctgttatc gaaccggtat 4800aatatgccat tcaaagcaca gatccagaag tgggtgcagt atctatctaa tagtacggat 4860ataatagaga gctggatgac cgtccagaat ctctggatct acctggaggc ggtgtttgtg 4920ggaggtgata tagcgaagca gcttccaaag gaggccaaaa gattctccaa cattgacaaa 4980tcctgggtca agattatgac tcgggcccac gaagtgccct ccgtggtgca gtgctgcgtc 5040ggggacgaaa ccttgggcca gctgttgccc cacctgttgg atcaattgga aatctgccaa 5100aagagcctga
ctggctacct agagaaaaag cgtctgtgct ttccccggtt cttcttcgtt 5160tctgaccctg cactactcga aatcttgggt caggcctctg attctcacac aattcaggct 5220catttgttaa atgtgtttga caacatcaaa agtgtgaaat ttcatgaaaa gatttatgac 5280aggatcttgt ccatttcatc ccaagaggga gaaaccattg agcttgataa gcctgtgatg 5340gcagagggaa acgtggaggt ctggcttaac agtctcctgg aagagtccca gtcctcactg 5400cacctggtca tccgccaggc ggcggctaat atccaggaga caggattcca gctcacggaa 5460ttccttagtt cgtttccggc gcaagtgggg ctcctcggca ttcagatgat ctggacgaga 5520gattcggagg aagccctccg caacgccaag tttgacaaga agattatgca gaaaactaac 5580caagccttcc tagagctcct caacactctg atcgatgtca caactcgtga tctatcgtct 5640accgagcggg tcaagtatga gacactgatt accatacacg ttcaccagcg tgatatattc 5700gatgatctat gccacatgca cataaagagt cccatggact tcgaatggct aaaacagtgc 5760aggttctact ttaatgaaga ctcggataag atgatgatcc atatcacaga tgtagcgttt 5820atttaccaaa acgagttcct tggctgcaca gacaggttag tcataactcc gttaactgat 5880cgctgctaca ttacactcgc ccaagcgctt ggaatgtcca tgggtggagc ccccgcaggg 5940ccggcgggga caggtaagac cgaaacaact aaagatatgg gccgttgcct cgggaagtat 6000gtagtagttt ttaactgctc agaccaaatg gatttccgag ggctgggccg tatctttaaa 6060gggctggcgc aatccggttc ctggggctgt tttgacgagt tcaatcgtat tgatttaccg 6120gtgctaagtg ttgccgcaca gcaaattagt ataattttga catgtaagaa agaacacaag 6180aaaagtttta tatttactga cggcgacaac gtcactatga atcctgaatt cgggcttttc 6240ttgactatga acccagggta tgctggccgt caagaacttc ctgaaaatct gaaaatcaac 6300tttcgatcgg tggctatgat ggtaccggac cgccagatca tcatccgggt aaaactggcc 6360tcgtgtggct tcatcgacaa cgtcgtactt gctcgaaagt tcttcaccct ttacaagcta 6420tgtgaggagc agttatcgaa acaagttcat tacgactttg ggctccggaa tatcttgtcc 6480gtcttacgca cactcggagc ggctaaacgt gcaaatccca tggacactga gagtacgatt 6540gtgatgcgag tgttaaggga tatgaatctc tcaaaattaa tagacgagga cgagcctctt 6600tttctcagcc ttatagagga tctgttccca aacatcctcc tggacaaggc tggatatccc 6660gagttggaag cggcgattag caggcaggtg gaggaggccg gattgattaa tcacccgccc 6720tggaaactga aagtcatcca gctgttcgag actcagcggg tccgacacgg tatgatgact 6780ttaggcccat ctggcgcggg gaaaaccacc tgcatccaca 
ccctgatgag ggctatgacc 6840gattgtggga agcctcaccg tgagatgcgg atgaacccga aggcgatcac agcgccccaa 6900atgtttgggc gtctggatgt ggcgacaaat gactggaccg atggaatctt ttccacactc 6960tggaggaaga ccctgcgcgc aaaaaaagga gagcacatct ggatcattct cgatggcccc 7020gttgacgcta tttggatcga aaacttaaac agcgtgctcg acgacaacaa gaccctgaca 7080ttggcaaatg gtgaccggat tcctatggct cccaattgca aaatcatttt tgaacctcac 7140aacatcgaca acgccagtcc ggctacggtg tcccgcaacg gtatggtttt catgagcagt 7200tccatactgg attggagtcc gatattggaa ggatttctca agaaacgcag tccccaggag 7260gcggaaattc tgcggcaact gtatacggaa agttttcctg acctgtaccg tttctgtatt 7320caaaatctcg agtataagat ggaggtgttg gaggccttcg tcatcacaca gtccattaac 7380atgcttcagg gcttgatccc cttgaaggag caaggagggg aagtcagcca ggcacatcta 7440gggcggcttt tcgttttcgc ccttctctgg tccgcgggtg ctgctctcga gctagacggt 7500cgccggcgct tggagttgtg gctgaggtct cgcccgaccg ggacactcga gctgccgccg 7560ccagccggac ccggggacac ggcattcgac tactacgttg ctccggacgg cacctggacc 7620cactggaaca cccgtacgca ggagtatctc tatcccagcg atacaactcc tgagtatggt 7680agcatactcg ttccgaacgt agacaacgtc agaaccgact ttctgatcca gaccattgct 7740aagcagggca aggcagtcct attgatcgga gagcaaggga ccgcgaaaac cgtgattatc 7800aagggcttca tgagtaaata tgacccagag tgtcatatga ttaagtccct caacttcagt 7860tctgctacca caccactcat gtttcagcgt actatcgaat cctacgtgga caagcggatg 7920ggcaccacct acgggccgcc tgccgggaag aagatgacgg tatttataga cgacgttaac 7980atgcccatca tcaacgagtg gggagatcaa gtgacaaacg aaatcgttcg gcaacttatg 8040gaacaaaacg ggttctataa cctcgagaag ccgggcgagt tcacctcaat agtagacatt 8100caatttctgg cagctatgat ccatcccgga ggaggacgga acgacattcc ccagcggctc 8160aagcgccaat tcagcatctt caactgcacg ctgccaagtg aagcatcggt agacaaaatc 8220ttcggcgtca tcggggtggg tcactactgc acccagcgcg gcttttcaga ggaagtccga 8280gattctgtta ctaaactggt tcctttgact agaaggctgt ggcagatgac caaaattaag 8340atgcttccta ctccagctaa attccactac gtgttcaatc tgcgagactt atccagggta 8400tggcaaggta tgcttaatac cacctctgag gtaattaaag aaccgaacga tctgctgaaa 8460ctgtggaagc acgaatgtaa gagagttatt gcagataggt ttactgtttc gtcagatgtt 8520acctggttcg 
ataaagccct ggtgtctttg gtcgaagaag agtttgggga agagaagaag 8580ctactcgtcg attgcgggat cgacacttac tttgtggact tcttaagaga cgcccctgag 8640gctgccggcg aaacatcgga agaggcagac gctgagactc ctaagatcta tgagccgatc 8700gagagcttta gccacctcaa ggaacgtctg aatatgtttt tacaattgta taatgagagc 8760attcgtggtg ctgggatgga tatggtgttc ttcgcggatg caatggtgca tctggtaaag 8820atttctcggg tgattcgcac gccacaggga aacgcgctac tggtcggggt gggtgggtca 8880ggaaagcaat cgttaactcg tttggcatcc tttatcgcgg gatatgtgag ttttcaaatc 8940accctgacaa ggtcctataa tacatccaac ctgatggagg atcttaaagt tctctacagg 9000acggcgggac aacagggcaa aggaataacc ttcatcttca cagataacga aattaaagac 9060gaatcattct tggagtatat gaacaacgtc ttatcaagcg gcgaagtttc gaacctcttc 9120gcgcgggacg aaatcgatga gatcaactct gatctcgctt ctgtcatgaa gaaagaattc 9180ccccggtgtt tgccgacgaa tgagaacttg catgactatt tcatgtcccg tgtgcggcag 9240aatttgcaca ttgtgctttg cttttcaccg gtgggggaga agttccgaaa tagagctttg 9300aagttccctg ctttgatttc tgggtgcact attgactggt tttcccgttg gcccaaggac 9360gctctggtcg ccgtgtccga gcacttttta accagctatg atatcgactg cagcctcgaa 9420attaagaagg aagtagttca gtgtatgggc tctttccaag acggtgtggc agagaagtgc 9480gtcgactatt tccagaggtt tcgccgatct actcatgtca cacctaagag ctacttgtcc 9540ttcatacagg ggtacaagtt tatatatggg gagaaacacg ttgaagtaag gactctggca 9600aaccgtatga atactggctt agagaagctc aaggaggcct cagaaagtgt ggctgctctc 9660tcaaaggagc ttgaagctaa ggagaaggaa ctccaagttg cgaacgataa agcggatatg 9720gttctaaaag aagtgactat gaaagcacaa gcggctgaaa aggttaaggc ggaggtacag 9780aaggtgaagg atcgcgccca ggcaatagtc gattccattt ccaaagacaa ggccatcgcg 9840gaagaaaagc tagaagccgc caagccggcc ttagaagagg cagaggctgc cttgcaaacc 9900ataagaccga gtgacatcgc gacggtacga acccttggtc gtcctcctca tttgattatg 9960aggattatgg actgtgtcct gcttttattt caacgtaaag tatctgcagt taagattgat 10020ttggaaaaat cctgtaccat gccctcatgg caggaatccc tgaaattgat gaccgcgggc 10080aatttccttc aaaatctaca acaattcccc aaggacacca ttaacgaaga ggtcatagaa 10140tttttatctc catattttga gatgccagat tacaatattg aaacagcgaa gcgcgtctgt 10200ggtaacgtag caggtctctg ttcgtggacc aaagctatgg 
cctccttctt tagtatcaat 10260aaagaggtac ttccactgaa agccaacctg gtggtacagg agaaccggca tctgcttgct 10320atgcaggatc ttcagaaggc ccaggccgaa ttagacgaca aacaggctga gcttgacgtt 10380gtgcaagcag aatacgaaca agctatgact gaaaagcaga ctttattaga ggacgctgaa 10440cgctgcagac ataagatgca gactgcaagc accctcatat ccgggttggc tggagaaaaa 10500gaacggtgga cagagcagtc gcaagaattc gctgctcaaa ccaaaaggtt ggttggagac 10560gttctactcg cgacagcttt tctctcctat tctggtcctt tcaaccagga attccgggac 10620cttttgctga atgactggag aaaagaaatg aaggctcgca aaataccatt tggtaagaat 10680ttgaacttgt ctgaaatgct tattgacgca cccactatat cagagtggaa tcttcaggga 10740cttccaaatg acgatctgtc catccaaaac ggaattattg taaccaaggc gagtcgctac 10800cctctgctca ttgacccgca gacacagggc aagatctgga ttaaaaataa ggaaagcagg 10860aacgaactcc agatcacaag tctcaaccac aagtacttcc gtaaccacct cgaagatagt 10920ctgtccctgg gacggccgtt gctaatcgag gacgttggag aagagctgga ccccgcatta 10980gacaacgtcc ttgaaagaaa ttttatcaag acaggatcaa cttttaaagt taaagtagga 11040gataaagaag tggatgtgtt agatggcttc cgcctatata tcacaactaa actcccgaat 11100cccgcctata ctccagagat cagcgctaga actagcatca tagatttcac tgtaactatg 11160aaggggttag aagatcaatt attaggacgc gtgatcctga cggagaaaca ggaattggaa 11220aaagagcgta cacacctcat ggaagacgtg acagctaaca aacgacggat gaaggaactg 11280gaggacaatt tactgtatcg gttgacatca acacagggct cccttgttga ggacgagagt 11340ctgatcgtgg tcctgtctaa cacaaagagg actgctgaag aagtaactca gaaattggag 11400atttctgccg aaactgaagt tcagattaac tccgctagag aagagtatcg tccagtcgct 11460actcggggct ctatcctata cttcctcata actgagatgc gcttggtcaa tgaaatgtac 11520cagacttcac tccggcaatt cctgggcttg tttgacttgt cgctggcaag atcagtaaaa 11580tctccaatta ccagcaagag aatcgcaaac atcattgagc acatgacgta cgaggtgtat 11640aaatacgcgg cgaggggtct ttatgaagag cataagttcc tcttcaccct actattaacg 11700ttgaagatag atattcagag gaaccgggtg aagcacgagg agtttctaac tctaataaaa 11760ggaggagcta gtttagattt gaaagcctgt ccaccgaaac cttctaagtg gattttagac 11820ataacatggc tgaatttagt ggagttgtcc aagttacgtc agttcagtga cgtattggat 11880cagatatcca ggaacgagaa gatgtggaag atctggttcg ataaagagaa 
tccggaggag 11940gagcccttgc caaatgctta tgataaatca ctagactgct tcaggaggtt gttgctcatc 12000cgctcctggt gccccgaccg cactattgcc caagctagga aatacattgt ggactccatg 12060ggggagaagt atgccgaggg agtcatactc gacctggaaa agacttggga agagtcagat 12120ccgaggacgc ccctcatttg tcttctttcc atgggttctg atcccacgga ctctattata 12180gcactcggga aaagactaaa gatcgagaca cgctatgtta gcatgggaca ggggcaagag 12240gtccatgccc gcaaactact acagcaaact atggctaatg gaggttgggc tctgttacag 12300aactgtcact taggcctgga ttttatggac gaattgatgg acataataat tgagacggag 12360ctagtccacg acgcatttcg cttatggatg acgactgaag cacacaagca gtttccgatt 12420accctgttgc agatgtccat caagttcgca aatgaccctc ctcagggcct tagggcaggt 12480ctcaaaagga cctacagcgg cgtttcccag gatttacttg acgtcagctc cggatcccag 12540tggaagccca tgctctacgc tgtggctttc cttcacagca cagttcaaga aaggcggaag 12600tttggtgcgc taggctggaa catcccctac gagttcaacc aagctgattt taatgcaaca 12660gtacagttta ttcaaaatca tctggatgac atggatgtta aaaaaggtgt gtcatggact 12720acaataaggt acatgattgg ggagattcag tacggaggac gggtaactga tgattatgac 12780aagcggctac tgaacacatt cgctaaagtg tggttttctg agaatatgtt cggtccagat 12840ttcagcttct accaaggtta caatataccc aagtgctcca cggtcgacaa ctaccttcag 12900tacatccaga gcttgcccgc gtacgacagc ccggaagtct tcggactcca ccccaacgcc 12960gacatcacgt accagagcaa gctggccaag gacgtgcttg acaccattct cggcatccag 13020ccgaaggaca cgtccggcgg gggggacgag acgcgggagg ccgtcgtcgc gcgcttggca 13080gatgacatgc tggagaagct cccccccgat tacgtcccgt ttgaggtcaa ggaaaggctc 13140cagaagatgg gcccgttcca gcccatgaac atcttcctcc gccaggagat cgaccggatg 13200cagcgcgtgc ttagcctggt gcgctcaacg ctgaccgagc tgaagctggc catcgacggg 13260acgatcatta tgtcggagaa cctccgggac gcgctggact gcatgttcga cgcgcgtatc 13320ccggcctggt ggaagaaggc gtcgtggatc tccagcaccc tggggttctg gttcacggag 13380ctgatcgagc gcaactcgca attcacctcc tgggtgttca acgggcggcc ccactgcttc 13440tggatgacgg gcttctttaa cccgcagggc ttcctgacgg ctatgcggca ggaaatcacc 13500cgggcgaaca agggttgggc gctcgacaat atggtgctct gcaacgaggt cacgaagtgg 13560atgaaggacg acatctcggc gcctcccacc gaaggggtct acgtctatgg cctgtacctc 
13620gagggggcgg gctgggacaa gcgtaacatg aagctgatcg agtcgaagcc caaggtcctg 13680ttcgagctga tgcccgtcat ccgcatctac gcagaaaaca acacgctgcg cgacccgcgg 13740ttctactcgt gccccatcta caagaagccg gtgcggacgg acctcaacta catcgccgcc 13800gtcgacctcc gcaccgcgca gacccccgag cactgggtgc tgcggggggt cgcactgctc 13860tgcgacgtca agtag 138752513875DNAArtificial SequenceSynthetic polynucleotide 25atgttcagaa tcggaaggag acaactatgg aagcacagcg tgacgcgggt acttacccaa 60cgtctaaagg gggagaagga ggcgaagcgg gcactgctag acgcgcgtca taattacctc 120tttgcaatag ttgccagctg cctcgacttg aacaagacgg aggtagagga cgccatatta 180gagggcaacc agattgaacg gatcgatcag ctatttgccg tgggcgggct ccggcatcta 240atgttttact accaggacgt cgaggaagct gagaccggcc aactgggatc cctgggaggc 300gtcaacctcg tctccggcaa gataaaaaag cctaaggttt tcgttacaga gggcaacgac 360gtagcgctga ctggtgtatg cgtcttcttc atacggacag acccctcaaa ggcaattacg 420ccagacaaca tccaccagga ggtctcgttt aacatgctcg acgccgccga tggcgggctg 480ctgaactcgg tgcgccggct gctctcggat atctttatcc ccgcgcttcg ggcgacgagc 540cacgggtggg gtgagctgga aggcctacag gacgcggcca atattcgtca ggagttccta 600tccagcctgg aaggttttgt taacgtgtta tccggcgccc aggagtcgct taaggagaag 660gtgaacttac gaaagtgtga tatattagag ctgaaaaccc tgaaggaacc tacagactat 720ctcaccctcg caaacaaccc cgaaacactc ggcaaaattg aagattgcat gaaggtgtgg 780attaagcaga cggaacaagt cctggcagag aacaaccaac tcttgaagga ggccgacgac 840gtgggcccgc gcgctgagct ggagcactgg aagaagaggc tcagcaagtt taactatctt 900cttgagcagc tgaagagccc ggacgttaag gcggtactag cggtcctcgc ggctgcgaag 960tcgaagctcc tcaagacctg gcgtgagatg gacatacgca tcacggacgc gaccaacgaa 1020gctaaggaca acgttaagta tttgtatacc ctcgagaagt gctgcgaccc cctctactca 1080tctgatccgc tcagtatgat ggatgccatc cccacgttga ttaacgccat taaaatgatc 1140tactcgatat cgcactatta caacacgtct gaaaaaatca ccagcctctt cgtaaaagtg 1200actaaccaaa tcattagcgc ctgcaaggct tacatcacta acaacggcac cgccagtata 1260tggaaccagc cccaggacgt cgtggaggag aagatcctat cggccataaa gctgaagcag 1320gagtatcagc tgtgcttcca caaaaccaag cagaaactca agcagaaccc aaatgctaag 1380cagttcgact tctctgagat gtacattttc 
gggaagtttg aaacatttca tcgccgcctg 1440gccaaaatca tcgacatatt caccactctg aagacctact cagtcctaca agatagcact 1500atagaagggc tagaggatat ggccacaaag taccagggca tcgtggccac tatcaaaaag 1560aaggagtaca acttcttaga ccagcgtaaa atggatttcg accaggacta tgaagaattc 1620tgcaaacaaa cgaatgattt gcacaacgag cttcggaaat tcatggatgt gacttttgcc 1680aaaatacaga ataccaacca agctcttagg atgttaaaga aatttgaaag gctcaatatt 1740cctaatttgg gcattgatga caaataccag ttgatactcg aaaattatgg agcagatatt 1800gatatgatct ctaaactgta cacaaaacaa aaatatgatc ccccgctagc tagaaatcaa 1860cctccgattg ccggtaagat actctgggcc agacagctct ttcaccgcat ccagcagccc 1920atgcagctgt ttcagcagca ccctgcggtg ctgtccaccg ccgaagcgaa acccattatt 1980cgatcttata accgcatggc caaggttctg ttagagtttg aagttttgtt ccaccgtgcc 2040tggttacgtc agatcgagga gatccatgtg ggactggagg cctctctcct agtcaaggcc 2100cccggcacag gcgaactctt tgtcaatttt gatccccaga ttctaatact cttccgggaa 2160accgagtgca tggcccagat gggcttagag gttagtcctc tggctacttc tctgttccag 2220aagagagacc gctataaacg gaatttcagc aatatgaaga tgatgctcgc tgaatatcag 2280agggtcaagt ccaaaatccc cgctgcgatc gagcagctga tagtgccaca cctggccaaa 2340gtagatgagg ccctacaacc aggactggcc gcgctgactt ggacctctct gaatatcgaa 2400gcgtatttgg agaacacctt tgcaaagatt aaggacctgg agcttttact ggacagagtg 2460aacgatctca ttgaattccg catagacgcg attttagagg agatgtcttc cacgccacta 2520tgccagcttc ctcaggagga gcctttaaca tgtgaagagt tccttcagat gactaaggac 2580ctctgcgtga atggcgctca gatactacat ttcaagtcta gcttggtcga ggaggcagtg 2640aacgaattgg ttaacatgtt actggatgta gaagtcctta gcgaggagga atccgaaaag 2700atcagcaacg aaaattcggt gaactataag aacgaatcta gcgccaagcg ggaggagggc 2760aactttgata cactcacttc ttccatcaat gcgagggcta atgctctctt gttgacgacc 2820gttaccagaa aaaaaaagga gactgagatg cttggggaag aggcacgtga gttgctgtcc 2880cacttcaacc atcagaatat ggacgccctc ttaaaggtta cccgaaacac gctagaggca 2940attaggaagc gtattcactc aagccacacg ataaacttcc gcgactcaaa ctcagcatca 3000aatatgaagc aaaactcctt gccgatcttc agagccagcg tcaccctggc catacctaac 3060atcgttatgg caccggcact tgaggacgta caacaaacct tgaacaaagc agtcgagtgc 
3120atcatcagcg tccctaaagg agttcgccaa tggtccagtg aactgctatc caagaagaag 3180atccaggagc gtaaaatggc tgcgttacag agtaacgaag attcggacag cgacgttgaa 3240atgggtgaaa acgagctcca agacacactg gagattgcga gcgttaacct gcctataccc 3300gtccagacca agaactacta caaaaacgtg tccgaaaaca aggagatcgt caagctcgtt 3360tctgtgctca gcaccataat aaattcgact aagaaagaag ttataacttc catggattgt 3420ttcaaacggt ataaccacat ctggcagaaa ggtaaggaag aagctatcaa gacatttatt 3480acccagagcc cactactaag cgagttcgag tctcagatcc tctacttcca gaatttagag 3540caggagatca acgctgagcc cgaatatgtg tgcgtcggct cgatagccct gtacacggct 3600gatctgaaat ttgcgctgac cgctgagacg aaggcttgga tggtggtgat tggccgacac 3660tgcaacaaga agtaccggtc tgaaatggag aacatcttta tgctaatcga ggaatttaac 3720aaaaagctga accgtcccat taaggatctg gacgacatca ggattgccat ggcggcccta 3780aaggaaatta gagaggagca gatatccatt gattttcagg ttggccccat cgaagaatca 3840tatgcccttc tgaatcgata cggtctatta atcgcccgag aggaaataga taaggtggac 3900acacttcatt atgcatggga gaaactctta gcgcgggccg gcgaagtgca gaataagctc 3960gtatcgctgc agccatcatt taagaaggag ctcatcagtg ctgtcgaggt ctttctgcag 4020gactgccacc agttctatct ggattatgac ctgaacggtc cgatggcgag tggtctgaag 4080ccccaagagg cttcagaccg gcttatcatg ttccaaaatc agttcgacaa tatttaccga 4140aagtatatca cctatacagg gggtgaagaa ttgtttggtc tcccagccac ccagtatcca 4200caattattgg aaataaagaa gcagttgaac cttctccaaa aaatctacac tctctataat 4260tcggtaattg aaactgttaa ttcctactac gatattctct ggagcgaggt caacattgag 4320aaaattaata acgaactctt ggagttccaa aacagatgcc gcaagttgcc gagagcgctg 4380aaggactggc aggcttttct cgaccttaag aaaataatcg atgatttcag tgaatgctgt 4440cctctcttag aatacatggc gagtaaggct atgatggaga gacactggga gaggattacg 4500actctgacgg ggcattcttt ggacgttggc aacgagtcct tcaagctgcg taatataatg 4560gaggctccac ttctcaagta caaagaggaa atagaagata tctgtatatc tgctgtcaaa 4620gagcgcgaca tagaacagaa actaaagcag gtaattaacg aatgggacaa taaaacgttt 4680acatttggca gtttcaagac acgtggagaa ttattgcttc gaggcgactc cacttcggaa 4740attatcgcta acatggagga ctctctcatg ttactcggct cgctgttatc gaaccggtat 4800aatatgccat tcaaagcaca gatccagaag 
tgggtgcagt atctatctaa tagtacggat 4860ataatagaga gctggatgac cgtccagaat ctgtggatct acctggaggc ggtgttcgtg 4920ggaggtgata tagcgaagca gcttccaaag gaggccaaaa gattctccaa cattgacaaa 4980tcctgggtca agattatgac tcgggcccac gaagtgccct ccgtggtgca gtgctgcgtc 5040ggggacgaaa ccttgggcca gctgttgccc cacctgttgg atcaattgga aatctgccaa 5100aagagcctga ctggctacct agagaaaaag cgtctgtgct ttccccggtt cttcttcgtt 5160tctgaccctg cactactcga aatcttgggt caggcctctg attcccacac aattcaggct 5220catttgttaa atgtgtttga caacatcaaa agtgtgaaat ttcatgaaaa gatttatgac 5280aggatcttgt ccatttcatc ccaagaggga gaaaccattg agcttgataa gcctgtgatg 5340gcagagggaa acgtggaggt ctggcttaac agtctcctgg aagagtccca gtcctcactg 5400cacctggtca tccgccaggc ggcggctaat atccaggaga caggattcca gctcacggaa 5460ttccttagtt cgtttccggc gcaagtgggg ctcctcggca ttcagatgat ctggacgaga 5520gattcggagg aagccctccg caacgccaag tttgacaaga agattatgca gaaaactaac 5580caagccttcc tagagctcct caacactctg atcgatgtca caactcgtga tctatcgtct 5640accgagcggg tcaagtatga gacactgatt accatacacg ttcaccagcg tgatatattc 5700gatgatctat gccacatgca cataaagagt cccatggact tcgaatggct aaaacagtgc 5760aggttctact ttaatgaaga ctcggataag atgatgatcc atatcacaga tgtagcgttt 5820atttaccaaa acgagttcct tggctgcaca gacaggttag tcataactcc gttaactgat 5880cgctgctaca ttacactcgc ccaagcgctt ggaatgtcca tgggtggagc ccccgcaggg 5940ccggcgggga caggtaagac cgaaacaact aaagatatgg gccgttgcct cggaaagtat 6000gtagtagttt ttaactgctc agaccaaatg gatttccgag ggctggggcg tatctttaaa 6060gggctggcgc aatccggttc ctggggctgt tttgacgagt tcaatcgtat tgatttaccg 6120gtgctaagtg ttgccgcaca gcaaattagt ataattttga catgtaagaa agaacacaag 6180aaaagtttta

tatttaccga cggcgacaac gtcactatga atcctgaatt cgggcttttc 6240ttgactatga acccagggta tgctggacgt caagaacttc ctgaaaatct gaaaatcaac 6300tttcgatcgg tggctatgat ggtaccggac cgccagatca tcatccgggt aaaactggcc 6360tcgtgtggct tcatcgacaa cgtcgtactt gctcgaaagt tcttcaccct ttacaagcta 6420tgtgaggagc agttatcgaa acaagttcat tacgactttg ggctccggaa tatcttgtcc 6480gtcttacgca cactcggagc ggctaaacgt gccaatccca tggacactga gagtacgatt 6540gtgatgcgag tgttaaggga tatgaatctc tcaaaactga tagacgagga cgagcctctt 6600tttctcagcc ttatagagga tctgttccca aacatcctcc tcgacaaggc tggatatccc 6660gagttggaag cggcgattag caggcaggtg gaggaggccg gattgattaa tcacccgccc 6720tggaaactga aagtcatcca gctgttcgag actcagcggg tccgacacgg tatgatgact 6780ttaggcccat ctggcgcggg gaaaaccacc tgcatccaca ccctgatgag ggctatgacc 6840gattgtggga agcctcaccg tgagatgcgg atgaacccga aggccatcac agcgccgcaa 6900atgtttgggc gtctggatgt ggcgacaaat gactggaccg atggaatctt ttccacactc 6960tggaggaaga ccctgcgcgc aaaaaaagga gagcacatct ggatcattct cgatggcccc 7020gttgacgcta tttggatcga aaacttaaac agcgtgctcg acgacaacaa gaccctgaca 7080ttggcaaatg gtgaccggat tccaatggct cccaattgca aaatcatttt cgaacctcac 7140aacatcgaca acgccagtcc ggctacggtg tcccgcaacg gtatggtttt catgagctca 7200tccatactgg attggagtcc gatattagaa ggatttctca agaagcgcag tccccaggag 7260gcggaaattc tgcggcaact gtatacggaa agttttcctg acctgtaccg tttctgtatt 7320caaaatctcg agtataagat ggaggtgttg gaggccttcg tcatcacaca gtccattaac 7380atgcttcagg gtttgatccc cttgaaggag caaggagggg aagtcagcca ggcacatcta 7440gggcggcttt tcgttttcgc ccttctctgg tccgcgggtg ctgctctcga gctagacggt 7500cgccggcgct tggagttgtg gctgaggtct cgcccgaccg ggacactcga gctgccgccg 7560ccagccggac cgggggacac ggcattcgac tactacgttg ctccggacgg cacctggacc 7620cactggaaca cccgtacgca ggagtacctc tatcccagcg atacaactcc tgagtatggt 7680agcatactcg ttccgaacgt agacaacgtc cgtaccgact ttctgatcca gaccattgct 7740aagcagggca aggcagtcct attgatcgga gagcaaggga ccgcgaaaac cgtgattatc 7800aagggcttca tgagtaaata tgacccagag tgtcatatga ttaagtccct caacttcagt 7860tctgctacca caccactcat gtttcagcgt actatcgaat 
cctacgtgga caagcggatg 7920ggcaccacct acgggccgcc tgccgggaag aagatgacgg tatttataga cgacgttaac 7980atgcccatca tcaacgagtg gggagatcaa gtgacaaacg aaatcgttcg gcaacttatg 8040gaacaaaacg ggttctataa cctcgagaag ccgggcgagt tcacctcaat agtagacatt 8100caatttctgg cagctatgat ccatcccgga ggagggcgga acgacattcc acagcggctc 8160aagcgccaat tcagcatctt caactgcacg ctgccaagtg aagcatcggt agacaaaatc 8220ttcggcgtca tcggggtggg tcactactgc acccagcgcg gcttttcaga ggaagtccga 8280gattctgtta ctaaactggt tcctttgact agaaggctgt ggcagatgac caaaattaag 8340atgcttccta ctccagctaa attccactac gtgttcaatc tgcgagactt atcgagggta 8400tggcaaggta tgcttaatac cacctctgag gtaattaaag aaccgaacga tctgctgaaa 8460ctgtggaagc acgaatgtaa gagagttatt gcagataggt ttactgtttc gtcagatgtt 8520acctggttcg ataaagccct ggtttctttg gtcgaagaag agtttgggga agagaagaag 8580ctactcgtcg attgcgggat cgacacttac tttgtggact tcttaagaga cgcccctgag 8640gctgccggcg aaacatcgga agaggcagac gctgagactc ctaagatcta tgagccgatc 8700gagagcttta gccacctcaa ggaacgtctg aatatgtttt tacaattgta taatgagagc 8760attcgtggtg ctgggatgga tatggtgttc ttcgcggatg ccatggtgca tctggtaaag 8820atttctcggg tgattcgcac gccacaggga aacgcgctac tggtcggggt gggtgggtca 8880ggaaagcaat cgttaactcg tttggcatcc tttatcgcgg gatatgtgag ttttcaaatc 8940accctgacaa ggtcctataa tacatccaac ctgatggagg atcttaaagt tctctacagg 9000acggcgggac aacagggcaa aggtataacc ttcatcttca cagataacga aattaaagac 9060gaatcattct tggagtatat gaacaacgtc ttatcaagcg gcgaagtttc gaacctcttc 9120gctagggacg aaatcgatga gatcaactct gatctcgctt ctgtcatgaa gaaagaattc 9180ccccggtgtt tgccgacgaa tgagaacttg catgactatt tcatgtcccg tgtgcggcag 9240aatttgcaca ttgtgctttg cttttcaccg gtgggggaga agttccgaaa tagagctttg 9300aagttccctg cgttgatttc tgggtgcact atcgactggt tttcccgttg gcccaaggac 9360gctctggtcg ccgtgtccga gcacttttta accagctatg atatcgactg cagcctcgaa 9420attaagaagg aagtagttca gtgtatgggc tctttccaag acggtgtggc agagaagtgc 9480gtcgactatt tccagaggtt tcgccgatct actcatgtca cacctaagag ctacttgtcc 9540ttcatacagg ggtacaagtt tatatatggg gagaaacacg ttgaagtaag gactctggca 9600aaccgtatga 
atactggctt agagaagctc aaggaggcct cagaaagtgt ggctgctctc 9660tcaaaggagc ttgaagctaa ggagaaggaa ctccaagttg cgaacgataa agcggatatg 9720gttctaaaag aagtgactat gaaagcacaa gcggctgaaa aggttaaggc ggaggtacag 9780aaggtgaagg atcgcgccca ggcaatagtc gattccattt ccaaagacaa ggccatcgcg 9840gaagaaaagc tagaagccgc caagccggcc ttagaagagg cagaggctgc cttgcaaacc 9900ataagaccga gtgacatcgc gacggtacgc acccttggtc gtcctcctca tttgattatg 9960aggattatgg actgtgtcct gctcttgttc cagcggaagg tgagcgcggt gaaaatcgac 10020ctagaaaaga gctgtacgat gccgagctgg caggagtccc tcaagctcat gaccgccggg 10080aacttcttac agaacttgca gcagttcccg aaggacacca taaacgagga ggtgatcgag 10140tttcttagcc cctactttga gatgcccgac tacaacatcg agaccgcgaa gcgcgtctgc 10200gggaacgtgg ccggtctctg ctcgtggacc aaggctatgg cctccttctt ttcgataaat 10260aaggaggtcc tgccgctcaa ggccaacctc gtcgtgcagg agaaccgtca cctactcgct 10320atgcaggacc tccagaaggc gcaggcggag ttggacgaca aacaggccga gctggacgtc 10380gtgcaagccg agtacgaaca ggctatgacg gagaagcaga cgctgctaga agacgcggag 10440cggtgccggc ataaaatgca gacggcgagc acactgattt cgggcctagc aggggagaaa 10500gagcggtgga cggagcagtc gcaagagttc gccgcccaaa ctaagcgact ggtgggcgac 10560gtgctcctgg ctactgcttt tttgtcctat tcgggcccct tcaatcaaga gttccgggac 10620ttgctactca acgactggcg taaggagatg aaggcccgca agatcccctt cgggaagaac 10680ctgaacctca gtgaaatgct gatagacgcc cccactatct ctgagtggaa tctccagggg 10740ctgcctaacg acgacctgag catacagaac gggatcatag tcaccaaggc gtcgcgctac 10800cccctgctga tcgatccgca aacccagggg aagatctgga tcaagaacaa ggagtcccgg 10860aacgagttgc aaatcacgtc cctcaatcac aagtactttc ggaaccacct agaggactcg 10920ctgtcgttgg ggcgcccgct tctgattgag gacgttggcg aggagctgga tcctgcgctg 10980gacaacgttc ttgagcgcaa cttcatcaag accgggtcca ccttcaaggt aaaggtggga 11040gacaaggagg tcgacgtgct ggacggcttt cgcctatata tcaccacgaa gctgcctaac 11100ccggcgtaca cgcccgagat cagtgcgcgt acgagcatca ttgacttcac cgtgactatg 11160aaaggcctcg aagatcagct gctcggtcgc gtcattctca cggaaaagca ggagctggag 11220aaggagcgaa cacacctgat ggaggacgtg acggccaaca agcggcgtat gaaagagcta 11280gaggacaacc tgctgtaccg 
cttgacatca acccaggggt cgctggtcga ggacgaatcc 11340ctgatcgtgg tgctgagtaa taccaaacgg acagcagaag aggtcacgca gaaactcgag 11400atctcggcgg agaccgaggt gcagatcaac tccgcgcggg aagagtaccg cccggtggcc 11460acccgcggga gcatcttgta cttcctgatc actgagatgc ggcttgtgaa tgagatgtac 11520cagacaagcc tgcggcagtt ccttggcctg ttcgatcttt cgctggcccg gtccgtcaag 11580tctccgatta cctccaagcg gatcgctaac ataattgaac acatgacgta cgaggtgtac 11640aagtacgcgg cgcgaggcct ctacgaagag cacaagttcc tgttcacgct gctcctcacg 11700ctcaagatcg acatccagcg caaccgcgtc aagcacgagg agtttctcac cctgataaag 11760ggtggagcgt ccctggacct gaaggcctgt ccgccgaagc cgtcgaagtg gatcctggac 11820ataacgtggc tcaaccttgt cgagctgtcc aagctccgtc agttctcgga cgtgctcgac 11880cagatttcgc ggaacgagaa gatgtggaaa atttggttcg acaaggaaaa tccagaggag 11940gagcccttgc ccaacgcgta tgacaagtcg ctcgactgct tccgtcgcct gctgctgatc 12000cgcagctggt gccccgaccg gacgatcgcg caggcgagga agtacatcgt ggacagtatg 12060ggtgagaaat atgcggaggg cgttattctg gatctggaga agacttggga ggagagcgac 12120ccccgcaccc ccctgatctg cctgctgtct atggggtccg acccgaccga tagcatcatt 12180gcgctgggga aacggctcaa gatcgagacc cggtacgtgt ccatgggtca ggggcaggag 12240gtgcatgccc gcaagctcct gcagcagact atggcgaacg ggggttgggc gctcttacag 12300aactgccatc tggggctcga cttcatggat gaactcatgg atatcatcat cgagacggaa 12360ctcgtgcacg acgcattccg cctgtggatg accaccgagg cgcacaagca gttcccgatc 12420acgttgctgc agatgtccat caagttcgcc aacgaccctc cgcagggcct ccgggcgggc 12480ctgaagcgca cgtatagcgg cgtgtctcag gatctccttg atgtcagctc ggggagccag 12540tggaagccga tgctctatgc cgtggcattt ctacactcga ccgtccagga gcggcgaaag 12600tttggagcgc tggggtggaa cataccctac gagtttaacc aggccgactt caacgccacc 12660gtacagttca tccagaacca tttggacgat atggatgtga agaagggggt gtcctggacg 12720accatacggt acatgatcgg cgagatccag tatgggggcc gggtcacgga cgactacgac 12780aagcggttgc tgaacacgtt cgcgaaggtc tggttcagcg agaatatgtt cgggcccgat 12840ttttcctttt accagggcta caatataccc aagtgctcca cggtagacaa ctatcttcag 12900tacatccaga gccttcccgc gtacgacagc ccggaagtct tcggactcca ccccaacgcc 12960gacatcacgt accagagcaa gctggccaag 
gacgtgctcg acaccattct cggcatccag 13020ccgaaggaca cgtccggcgg gggggacgag acgcgggagg ccgtcgtcgc gcgcttggca 13080gatgacatgc tggagaagct cccccccgat tacgtcccgt tcgaggtcaa ggaaaggctc 13140cagaagatgg gcccgttcca gcccatgaac atcttcctcc gccaggagat cgaccggatg 13200cagcgcgtgc ttagcctggt ccgctcaacg ctgacggagc tgaagctggc catcgacggg 13260acgatcatta tgtcggagaa cctccgggac gcgctggact gcatgttcga cgcgcgtatc 13320ccggcctggt ggaagaaggc gtcgtggatc tccagcaccc tggggttctg gttcacggag 13380ctgatcgagc gcaactcgca attcacctcc tgggtcttca acgggcggcc ccactgcttc 13440tggatgacgg gcttcttcaa cccgcagggc ttcctgacgg ctatgcggca ggaaatcacc 13500cgggcgaaca agggttgggc gctcgacaat atggtgctct gcaacgaggt cacgaagtgg 13560atgaaggacg acatctcggc gcctcccacc gaaggggtct acgtatacgg cctgtacctc 13620gagggggcgg gctgggacaa gcgtaacatg aagctgatcg agtcgaagcc caaggtcctg 13680ttcgagctga tgcccgtcat ccgcatctac gcagagaaca acacgctgcg cgacccgcgg 13740ttctactcgt gccccatcta caagaagccg gtgcggacgg acctcaacta catcgccgcc 13800gtcgacctcc gcaccgcgca gacccccgag cactgggtgc tgcggggggt cgcactgctc 13860tgcgacgtca agtag 138752613875DNAArtificial SequenceSynthetic polynucleotide 26atgttcagaa tcggaaggcg tcaactatgg aaacacagcg tcaccagagt actgactcaa 60cggctcaaag gagaaaaaga ggcgaagcga gcgcttctcg acgcccgaca caattatcta 120ttcgcgattg tagcatcgtg tcttgacctc aataagacgg aggtggaaga cgccatatta 180gaaggaaacc agattgagcg gatcgaccaa ttgtttgcgg ttggtggact cagacattta 240atgttttact atcaagacgt tgaagaagcg gaaactgggc aactcgggtc acttggtgga 300gttaacctcg tcagcggtaa aattaaaaag ccaaaagtat ttgtcacaga gggaaacgac 360gttgcactca caggtgtatg tgtattcttt attcggactg atccctcaaa agccataaca 420ccagataata tacaccagga agtcagcttt aacatgctcg acgccgccga cggcgggctg 480ctgaactcgg tgcgccggct gctctcggat atctttatcc ccgcgcttag agcgacgagc 540cacgggtggg gtgagctgga aggcctacag gacgcggcca atatcaggca ggaattcctg 600tcatccctgg aaggttttgt gaacgtgctc agcggcgcac aggagtcatt gaaagaaaag 660gtgaacttgc ggaagtgtga cattctggaa ttaaagactt tgaaggagcc aaccgactat 720ctcaccttgg cgaacaatcc ggagactcta gggaaaatag aggactgcat 
gaaggtgtgg 780ataaaacaga ccgagcaagt tttagcagaa aataaccagc tgctgaagga ggcggacgac 840gtaggccctc gggcggaact tgaacattgg aagaagcggc tgtctaagtt taattacctt 900ttagaacagt taaaatctcc agatgtcaaa gcagtgcttg cagtcctcgc agctgcaaag 960tccaaactgc tgaagacgtg gcgtgagatg gacatacgga taactgacgc gaccaatgaa 1020gccaaggata acgttaagta cctatacacc ctagagaagt gttgtgatcc tctatattcc 1080tctgatccgc tgtctatgat ggatgcaata cctacgttga tcaacgctat taagatgatt 1140tacagtatct ctcactatta taatacaagc gaaaaaataa cttccttatt cgtaaaagtc 1200acgaaccaaa ttatatcagc ctgtaaagca tatataacca ataacggcac ggcatctata 1260tggaatcagc cccaagacgt ggttgaggaa aaaattctta gtgctataaa actaaagcaa 1320gagtatcaat tgtgcttcca taagactaag cagaagctaa agcagaaccc aaacgctaag 1380cagttcgact tctctgagat gtatattttt gggaagtttg aaaccttcca ccgacgtcta 1440gccaaaatta tcgacatctt tacgacgtta aaaacttaca gcgtgctgca agattctacg 1500atagaagggc tagaagatat ggccacaaag taccagggca tcgtggccac tatcaaaaag 1560aaggagtaca acttcttaga ccagcgtaaa atggatttcg accaggacta tgaagaattc 1620tgcaaacaaa cgaatgattt gcacaacgag cttcggaaat tcatggatgt gacttttgcc 1680aaaatacaga acaccaacca agctcttagg atgttaaaaa aatttgaaag gctcaatatt 1740cctaatttgg gcattgatga caaataccag ttgatactcg aaaattatgg agcagatatt 1800gatatgatct ctaaactgta caccaaacaa aaatatgatc caccgctagc tagaaatcaa 1860cctccgattg ctggtaagat actctgggcc agacaactct ttcaccgcat ccagcagccc 1920atgcagctgt ttcagcagca ccctgcggtg ctttccaccg ccgaggcgaa acccattatt 1980cgatcttata accgcatggc caaggttctg ttagagtttg aagttttgtt ccacagagcc 2040tggttacgtc agatagagga gatccatgtg ggactggagg cctctctcct agtcaaggcc 2100cccggcacag gcgaactctt tgtcaatttt gatccccaga ttctaatact cttccgggaa 2160accgagtgca tggcccagat gggcttagag gtttcgcctc tggctacttc tctgttccag 2220aaaagagacc gctataaacg gaatttcagc aatatgaaga tgatgctcgc tgaatatcag 2280agggtcaagt ccaagatccc cgctgcgatc gagcagctga tagtgccaca cctggcaaaa 2340gtagatgagg ccctacaacc aggactggct gcgctgacgt ggacctctct gaatatcgaa 2400gcgtatcttg agaacacctt tgccaagatt aaagacctgg agcttttact ggacagagtg 2460aacgatctca ttgaattccg 
catagacgcg attttagagg agatgtctag cacgccacta 2520tgccagcttc ctcaggagga gcctttaact tgtgaagagt tccttcagat gactaaggat 2580ctctgcgtga atggcgctca gatactacat ttcaagtcta gcttggtcga ggaggcagta 2640aacgaattgg ttaacatgtt actggatgta gaagtcctta gcgaggagga aagcgaaaag 2700atcagcaacg aaaattcggt gaactataag aacgaatcta gcgccaagcg ggaggagggc 2760aactttgata cactcacttc ttccatcaat gcgagggcta atgctctctt gttgacgacg 2820gttaccagga aaaaaaagga gactgagatg cttggtgaag aggcaaggga gctactgtcc 2880cacttcaacc atcagaatat ggacgccctc ttaaaggtta cccgaaacac gttagaggca 2940attaggaagc gtattcactc aagccacacg ataaacttcc gagactcaaa ctcagcatca 3000aatatgaagc aaaactccct gccgatcttc agagccagcg tcaccctggc catacctaac 3060atcgttatgg caccggcact tgaggacgta caacaaacct tgaacaaagc agtagagtgc 3120atcatcagcg tccctaaagg agttcgccaa tggtccagtg aactgctatc caagaagaag 3180atccaggagc gtaaaatggc tgcgttacag agtaacgaag attcggactc tgacgttgaa 3240atgggtgaaa acgagctcca agacacactg gagattgcga gcgttaacct gcctatacca 3300gtccagacca agaactacta caaaaacgtg tccgaaaaca aggagatcgt caagctcgtt 3360tctgtgctca gcaccatcat aaattcgact aagaaagaag ttataacttc catggattgt 3420ttcaaacggt ataaccacat ctggcagaaa ggcaaggagg aagctatcaa gacatttatt 3480acccagagcc cactactaag cgagttcgag tctcagatcc tctacttcca gaatcttgag 3540caggagatca acgctgagcc cgaatatgtg tgcgtcggct cgatagccct gtacacggct 3600gatctgaaat ttgcgctgac cgctgagacg aaggcttgga tggtggtgat tggccgacac 3660tgcaacaaga agtaccgatc tgaaatggag aacatcttta tgctaatcga ggaatttaac 3720aaaaagctga accgtcccat taaagatctg gacgacatca ggattgccat ggcggcccta 3780aaggaaatta gagaggagca gatatccatt gattttcagg ttggccccat cgaagaatca 3840tatgcccttc tgaatcgata cggtctatta atcgccaggg aggaaataga taaggtggac 3900acactacatt atgcatggga gaaactctta gcgcgggccg gcgaagtgca gaataagctc 3960gtatcgttac aaccatcatt taagaaggag ctcatcagtg ctgtcgaggt ctttctgcag 4020gactgccatc agttctatct ggattatgac ctgaacggtc cgatggcgag tggtctgaag 4080ccccaagagg cttcagaccg gcttatcatg ttccaaaatc agtttgacaa tatttacagg 4140aagtatatca cctatacagg gggtgaagaa ttgtttggtc tcccagccac 
ccagtatcca 4200caattattgg aaataaagaa gcagctgaac cttcttcaaa aaatctacac tctctataat 4260tcggtaattg aaactgttaa ttcctactac gatattctct ggagcgaggt caacattgag 4320aaaattaata acgaactctt ggagtttcaa aaccgatgcc gcaagttgcc gagagcgctg 4380aaggactggc aggcttttct cgaccttaag aaaataatcg atgatttcag tgaatgctgt 4440cctctcttag aatacatggc gagtaaggct atgatggaga gacactggga gaggattacg 4500actctgaccg gccattcttt ggacgttggc aacgagtcct tcaagctgcg taatataatg 4560gaggctccac ttctcaagta caaagaggaa atagaagaca tctgtatatc tgctgtcaaa 4620gagcgcgaca tagaacagaa actaaagcag gtaattaacg aatgggacaa taaaacgttt 4680acatttggca gtttcaagac acgtggagaa ttattgcttc gtggcgactc cacctcggaa 4740attatcgcta acatggagga ctctctcatg ttactcggct cgctgttatc gaaccggtat 4800aatatgccat tcaaagcaca gatccaaaag tgggtgcagt atctatctaa tagtacggat 4860attatagaga gctggatgac cgtccagaat ctctggatct acctggaggc ggtgtttgtg 4920ggaggtgata ttgcgaagca gcttccaaag gaggccaaaa gattctccaa cattgacaag 4980tcctgggtca agattatgac tcgggcccac gaagtgccgt ccgtggtgca atgctgcgtt 5040ggggacgaaa ccttgggcca gctgttgccc cacctgttgg atcaattgga aatctgccaa 5100aagagcctga ctggctacct agaaaaaaag cgtctgtgct ttccccggtt ctttttcgtt 5160tctgaccctg cactcctcga aatcttgggt caggcctcag attctcacac aattcaggct 5220catttgttaa atgtgtttga caacatcaaa agtgtgaagt ttcatgaaaa gatttatgac 5280aggatcttgt ccatttcatc ccaagagggg gaaaccattg agcttgataa gcctgtgatg 5340gcagagggaa acgtggaggt ctggcttaac agtctcctgg aagagtccca gtcctcactg 5400cacctggtca tccgccaggc ggcggctaac atccaggaga caggattcca gctcacggaa 5460ttcctttcct cgtttccggc gcaagtgggg ctcctcggca ttcagatgat ctggacgaga 5520gattcggagg aagccctccg caacgccaag tttgacaaga agattatgca gaaaactaac 5580caagccttcc tagagctcct caacaccctg atcgatgtca caacacgtga tctatcgtct 5640accgagcggg tcaagtatga gacactgatt accattcacg tgcaccagcg tgatatattc 5700gatgatctct gccacatgca cataaagagt cccatggact tcgaatggct aaaacagtgc 5760aggttctact ttaatgaaga ctcggataag atgatgatcc atatcacaga tgtagcgttt 5820atttaccaaa acgagttcct tggctgcaca gacaggttag tcataactcc gttaactgat 5880cgctgctaca ttacactcgc 
ccaagcgctt ggaatgtcca tgggtggagc ccccgcaggg 5940ccggcgggga ctggtaagac cgaaacaacg aaagatatgg gccgttgcct cgggaagtat 6000gtagtagttt ttaactgttc agaccaaatg gatttccgag ggctgggccg tatctttaaa 6060gggctggcgc aatccggttc ctggggctgt tttgacgagt tcaatcgtat tgatttaccg 6120gtgctatcgg ttgccgcaca gcaaattagt ataattttga cttgtaagaa agaacacaag 6180aaaagtttta tatttactga tggcgacaac gtcactatga atcctgaatt cgggcttttc 6240ttgactatga acccagggta tgctggccgt caagaacttc ctgaaaatct gaagatcaac 6300ttccgatcgg tggctatgat ggtaccggat cgccagatca ttatacgggt aaaactggcc 6360tcgtgtggct tcatcgacaa cgtcgtactt gctagaaagt tcttcaccct ttataagcta 6420tgtgaggagc agttatcgaa gcaagttcat tacgactttg ggctccggaa tatcttgtcc 6480gtcttacgca cactcggagc ggctaaacgt gcaaatccca tggacactga gagtacgatt 6540gtgatgcgag tgttaaggga tatgaatctc tcaaaattaa tagacgagga cgagcctctt 6600tttctcagcc ttatagagga tctgtttcca aacatcctcc tggacaaggc gggatatccc 6660gagttggaag cggccattag caggcaggtg gaggaggccg gattgattaa tcacccgccc 6720tggaaactga aagtcatcca gctgttcgag actcagcggg tccgacacgg tatgatgact 6780ttaggcccat ctggtgcggg caaaactacc tgcatccaca ccctgatgag ggctatgacc 6840gattgtggga agcctcaccg tgagatgcgg atgaacccca aggcgatcac tgcgccccaa 6900atgtttgggc gtctggatgt ggcgacaaat gactggaccg atggaatctt ttccacactc 6960tggaggaaga ccctgcgcgc aaaaaaagga gagcacatct ggatcattct cgatggcccc 7020gttgacgcta tttggatcga aaacttaaac agcgtgctcg acgacaacaa gaccctgaca 7080ttggcaaatg gtgaccgaat tcctatggct cccaactgca aaatcatttt tgaacctcac 7140aacatcgaca acgccagtcc ggctacggtg tcccgcaacg gtatggtttt catgagcagt 7200tccatactgg attggagtcc gatattggaa ggatttctca agaaacgcag tccccaggag 7260gcggaaattc

tgagacaact gtatacggaa agttttcctg acctgtaccg cttctgtatt 7320caaaatctcg agtataagat ggaggtgttg gaggccttcg tcatcacaca gtccattaac 7380atgcttcagg ggttgatccc cttgaaggaa caaggagggg aagtcagcca ggcacatcta 7440gggcggcttt tcgttttcgc cctgctatgg tccgcgggtg ctgctctcga gctagacggt 7500cgccggcgct tggagttgtg gctgaggtct cgcccgaccg ggacactcga gctgccgccg 7560ccagccggac ccggggacac tgcattcgac tactacgtag ctccggacgg cacctggacc 7620cactggaaca cccgtacgca ggagtacctc tatcccagcg atacaactcc tgagtatgga 7680agcatactcg ttccgaacgt agacaacgtc agaaccgact ttctgatcca gaccattgct 7740aagcagggca aggcagtcct attgatcgga gagcaaggga ccgcgaaaac cgtgattatc 7800aagggcttca tgtctaaata tgacccagag tgtcatatga ttaagtcgct caacttcagt 7860tctgctacca caccactcat gtttcagcgt actatcgaat cctacgtgga caagcggatg 7920ggcaccacct acgggccacc tgccgggaag aagatgacgg tatttataga cgacgttaac 7980atgcccatca tcaacgagtg gggagatcaa gtgaccaacg agatcgttcg gcaacttatg 8040gaacaaaacg ggttctataa cctcgagaaa ccgggcgagt tcacctcaat agtggacatt 8100caatttctgg cagctatgat ccaccccgga ggaggacgga acgacattcc ccagcggctc 8160aagcgccaat tcagcatctt caactgcacg ctgccaagtg aagcatcggt agacaaaatc 8220ttcggcgtca tcggggtggg tcattactgc acccagcgcg gcttttcaga ggaagtccga 8280gattctgtta ctaaactggt tcctttgact agaaggttat ggcagatgac caaaattaaa 8340atgcttccta ctccagctaa attccactac gtgttcaatc tgcgagactt atccagggta 8400tggcaaggta tgcttaatac cacctctgag gtaattaaag aaccgaacga tctgctgaaa 8460ctgtggaagc acgaatgtaa gagagttatt gcagacaggt ttactgtatc gtcagatgtt 8520acctggttcg ataaagcact ggtgtctttg gtggaagaag agtttgggga agagaagaag 8580ctactcgtcg attgcgggat cgacacttac ttcgtggact tcttaagaga cgcccctgag 8640gctgccggcg aaacatcgga agaggcagac gctgagactc ctaagatcta tgagccgatc 8700gagagcttta gccacctcaa ggaacgtctg aatatgtttt tacaattgta taacgagagc 8760attcgtggtg ctgggatgga tatggtgttc ttcgcggatg caatggtgca tctggttaag 8820atttctcggg tgattcgcac gccacaggga aacgcgctac tggtcggggt gggtgggtca 8880ggaaagcaat cgttgactcg tctcgcatcc tttatcgcgg gatatgtgag ttttcaaatc 8940accctgacaa ggtcctataa tacatccaac ctgatggagg 
atcttaaagt tctctacagg 9000acggcgggac aacaaggcaa aggaataacc ttcatcttca ctgataacga aattaaagac 9060gaatcattct tggagtatat gaacaacgtc ttatcaagcg gcgaagtttc gaacctcttc 9120gcgcgggacg aaatcgatga gatcaactct gatctcgctt ctgtcatgaa gaaagaattc 9180ccacgctgtt tgccgacgaa tgagaacttg catgactatt tcatgtcccg tgtgcggcag 9240aatttgcaca ttgtgctttg cttttcaccg gtgggggaga agttccgaaa tagagctttg 9300aagttccctg ctttgatttc tgggtgtact attgactggt tttcccgttg gcccaaggac 9360gctctggtcg ccgtgtccga gcacttttta accagctatg atatcgactg cagcctcgaa 9420attaagaagg aagtagttca gtgtatgggc tctttccaag acggtgtggc agagaagtgc 9480gtcgactatt tccagaggtt tcgccgatct actcatgtca cacctaagag ctacttgtcc 9540ttcatacagg ggtacaagtt tatatatggg gagaaacacg ttgaagtaag gactctggca 9600aaccgtatga atactggctt agagaagctc aaggaggcct cagaaagtgt ggctgctctc 9660tcaaaggagc ttgaagctaa ggagaaggaa ctccaagttg cgaacgataa agcggatatg 9720gttctaaaag aagttactat gaaagcacaa gcggctgaaa aggttaaggc ggaggtacag 9780aaggtgaagg atcgcgccca ggcaatagtc gatagcattt ccaaggacaa ggcgatcgcg 9840gaagagaagc tcgaggccgc gaagcctgcg ctggaggagg ccgaggccgc gctgcagacc 9900atccgcccct ccgacatcgc gacggttcgc acgttggggc ggccacctca ccttatcatg 9960cgcattatgg actgcgttct gctcttgttc cagcggaagg tgagcgcggt gaaaatcgac 10020ctggaaaaga gctgtaccat gccgagctgg caggagtccc tcaagctcat gacggccggg 10080aacttcttac aaaacctgca gcagttcccg aaggacacca taaacgagga ggtgatcgag 10140tttcttagcc cctactttga gatgcccgac tacaacatcg agaccgcgaa gcgcgtatgc 10200gggaacgtgg ccggtctctg ctcgtggacc aaggctatgg cgtccttctt ttcgataaat 10260aaggaggtct tgccgctcaa ggccaacctc gtcgtgcagg agaaccgtca cctactcgct 10320atgcaggacc tccagaaagc gcaggcggag ttggacgaca aacaggccga gctggacgtc 10380gtgcaagccg agtacgaaca ggctatgacg gagaagcaga cgctgctaga agacgcggag 10440cggtgccggc ataaaatgca gacggcgagc acactgatta gtggcctagc aggggagaaa 10500gagcggtgga cggagcagtc gcaagagttc gccgcccaaa ctaagcgact ggtgggcgac 10560gtgctcctgg ctactgcttt tttgtcctat tcgggcccct tcaatcaaga gttccgggac 10620ttgctactca acgactggcg taaggagatg aaggcccgca agatcccctt cgggaagaac 
10680ctgaacctca gtgaaatgct gatcgacgcc cccactatct ctgagtggaa tctccagggg 10740ctgcccaacg acgacctgtc gatacaaaac ggaataatag tcaccaaggc gtcgcgctac 10800cccctgctga tcgatccgca aacccagggg aagatctgga tcaagaacaa ggagtcccgg 10860aacgagttgc aaatcacgtc cctcaatcac aagtactttc ggaaccacct cgaggactcg 10920ctgtcgttgg ggcgcccgct tctgatcgag gacgttggcg aggagctgga tcctgcgctg 10980gacaacgttc ttgagcgcaa cttcatcaag accgggtcca ccttcaaggt aaaggtggga 11040gataaggagg tcgacgtgct ggacggcttt cgcctatata tcaccacgaa gctgcctaac 11100ccggcgtaca cgcccgagat cagtgcgcgt acgagcatca ttgacttcac cgtgactatg 11160aaaggcctcg aagatcagct gctcggtcgc gtgattctca cggaaaagca ggagctggag 11220aaggagcgaa cacacctgat ggaggacgtg acggccaaca agcggcgtat gaaagagcta 11280gaggacaacc tgctgtaccg ccttacgtca acccaggggt cgctggtcga ggacgaatcc 11340ctgatcgtgg tgctgagtaa taccaagcgg acagctgagg aggtcacgca gaaactcgag 11400atctcggccg agaccgaggt gcagatcaac agcgcgcggg aggagtaccg cccggtggcc 11460acccgcggga gcatcttgta cttcctgatc actgagatgc gccttgtgaa tgagatgtac 11520cagacaagcc tgcggcagtt ccttggcctg ttcgatcttt cgctggcccg gtccgtcaag 11580tctccgatta cctccaagcg gatcgctaac ataattgaac acatgacgta cgaggtgtac 11640aagtacgcgg cgcggggcct ctacgaagag cacaagttcc tgttcacgct cctcctcacg 11700ctcaagatcg acatccagcg caaccgcgtc aagcacgagg agtttctcac cctgataaag 11760gggggagcgt ccttagacct gaaggcctgt ccgccgaagc cgtcgaagtg gatcctggac 11820ataacgtggc tcaacctcgt cgagctgtcc aagctccgtc agttctcgga cgtgctcgac 11880cagatttcgc ggaacgagaa gatgtggaaa atatggttcg acaaggagaa tccagaggag 11940gagcccttgc ccaatgcgta tgacaagtcg ctcgactgct tccgtcgcct gctgctgatc 12000cgcagctggt gccccgaccg gacgatcgcg caggcgagga agtacatcgt tgacagtatg 12060ggtgagaaat acgcggaggg cgttattctg gatctggaga agacttggga ggagagcgac 12120ccccgcaccc ccctgatttg ccttctgtct atggggtccg acccgaccga tagcatcatt 12180gcgctgggga aacggctcaa gatcgagacc cggtacgtgt ctatgggaca ggggcaggag 12240gtgcatgccc gcaagctcct gcagcagact atggcgaacg ggggttgggc gctcttacag 12300aactgccatc tggggctcga cttcatggat gaactgatgg acatcatcat cgagacggaa 
12360ctcgtgcacg acgcattccg cctgtggatg accaccgagg cgcacaagca gttcccgatc 12420acgttgctgc agatgtccat caagttcgcc aacgaccctc cgcagggcct ccgggcgggc 12480ctgaagcgca cgtattcggg cgtgtctcaa gatctcctag atgtcagctc ggggtcccag 12540tggaagccga tgctctatgc cgtggcattt ctacactcga ccgtccagga gcggcgaaag 12600tttggagcgc tggggtggaa cataccctac gagtttaacc aggccgactt caacgccacc 12660gtacagttca tccagaacca tttggacgat atggatgtta agaagggggt gtcctggacg 12720accatacggt atatgattgg cgagatccag tatggggggc gggtcacgga cgactacgac 12780aagcggttac tgaacacgtt cgcgaaggtc tggttctccg agaatatgtt cgggcccgat 12840ttttcctttt accagggcta taatataccc aagtgctcca cggtcgacaa ctaccttcag 12900tacatccaga gcctacccgc gtacgacagc ccggaagtct tcggactcca ccccaacgcc 12960gacatcacgt accagagcaa gctggccaag gacgtgctcg acaccatact cggcatccag 13020ccgaaggaca cgtccggcgg gggggacgag acgcgggagg ccgtcgtcgc gcgcttggca 13080gatgacatgc tggagaagct cccccccgat tacgtcccgt tcgaggtcaa ggaaaggctc 13140cagaagatgg gcccgttcca gcccatgaac atcttcctcc gccaggagat cgaccggatg 13200cagcgcgtgc ttagcctggt gcgctcaacg ctgacggagc tgaagctggc catcgacggg 13260acgatcatta tgtcggagaa cctccgggac gcgctggact gcatgttcga cgcgcgtatc 13320ccggcctggt ggaagaaggc gtcgtggatc tccagcaccc tggggttctg gttcacggag 13380ctgatcgagc gtaattctca atttacatct tgggtgttta atggtcgtcc ccattgtttt 13440tggatgactg gtttttttaa tccacaagga tttttaactg ctatgagaca ggaaataact 13500cgtgcgaata agggttgggc attagataat atggttttgt gtaatgaagt aactaagtgg 13560atgaaagacg atataagtgc accacctact gaaggtgttt atgtatatgg tttatattta 13620gaaggagctg gatgggataa acgtaatatg aaattaatag aatcgaaacc aaaagtttta 13680tttgaactga tgccagttat tagaatttat gcagaaaata atacattaag agatcctaga 13740ttttatagtt gtccgattta taaaaaacct gtaagaacag atttaaatta tatagcagcc 13800gtcgatctta gaactgctca aacaccagaa cattgggtat taagaggagt agctttactt 13860tgtgatgtta aatag 138752713875DNAArtificial SequenceSynthetic polynucleotide 27atgttcagaa taggcagacg acagttatgg aaacacagtg tgactagggt tctaacacag 60agactaaaag gagagaaaga ggcaaaaaga gcgctactcg atgcaagaca taactatcta 
120ttcgctattg tggcgtcgtg tttagattta aataaaactg aagtagagga cgccattctc 180gagggaaatc aaatagaacg catcgaccaa ttattcgcgg ttggaggact tagacatttg 240atgttttact atcaagacgt tgaagaagcg gaaactgggc aactcgggtc actaggtgga 300gttaatttag tctcaggtaa gatcaaaaag ccaaaagtat ttgtaacaga aggaaacgac 360gttgcactaa caggtgtttg cgtgtttttt attagaactg atccgtctaa agccataaca 420cccgacaaca tccatcagga agtttcgttt aacatgctgg atgcggccga cggaggacta 480ctgaattcag tgcgtcgcct gctatctgac attttcattc ctgctttgcg tgcgaccagt 540cacgggtggg gggagttaga aggactccag gacgccgcga atatccggca ggaattcctg 600tccagcctgg aaggttttgt taacgtcctg tccggcgccc aggagagcct taaggagaag 660gtgaacttac gaaagtgcga catactcgag ctcaaaaccc tgaaggaacc tacagactac 720ctcaccctcg caaacaaccc cgaaaccctc ggcaaaattg aagattgcat gaaggtttgg 780attaagcaga cggaacaagt cctggccgag aacaaccaac tcctaaagga ggccgacgac 840gtgggcccgc gcgctgagct ggagcactgg aagaagaggc tcagcaagtt taactatctt 900cttgagcagc tgaagagccc ggacgttaag gcggtactag ccgtcctcgc ggctgcgaag 960tcgaaactgc tcaagacctg gcgtgagatg gatatacgca ttacggacgc aaccaacgaa 1020gctaaggaca acgttaagta cttgtatacc ctcgagaagt gctgcgaccc actctactca 1080tctgacccgc tcagtatgat ggatgccatc cccacgctaa ttaacgcaat taagatgatc 1140tactcgatat cgcactatta caacacgtct gaaaaaataa ccagcctctt cgtaaaagtg 1200accaaccaga tcatttcagc ctgtaaggct tacatcacaa acaacggcac cgcctccata 1260tggaaccagc cccaggacgt cgtggaagag aagatcctat cggccataaa gctgaagcag 1320gagtaccagc tgtgcttcca caaaaccaag cagaaactca agcagaaccc caatgctaag 1380cagttcgact tctctgagat gtatattttc gggaagtttg aaaccttcca ccgccgcctg 1440gcgaagatca tcgacatatt cacaactctg aagacctaca gtgttctaca agacagcact 1500atagagggtc tcgaggatat ggccactaag taccagggca ttgtcgcaac tatcaagaag 1560aaagaatata atttcctcga tcagcgtaag atggacttcg accaggacta cgaagaattc 1620tgcaagcaga cgaatgattt gcataacgaa ctccggaagt tcatggacgt aacgtttgcc 1680aagattcaaa acacaaatca ggcgttgcgg atgctaaaga agttcgagcg tctgaacatc 1740cctaatctag ggattgacga caagtaccaa cttatactgg aaaactacgg ggctgacatc 1800gatatgatct ccaagctata taccaagcaa aagtatgacc 
cgccgttagc tcggaatcag 1860cccccgatcg ccgggaagat cctgtgggca cggcagcttt ttcaccgcat tcagcagcct 1920atgcagctgt tccagcagca cccggcggtt ctctccaccg ccgaggctaa gccaattata 1980cgtagctaca accgcatggc gaaagtcctg ctcgagtttg aggtcttgtt ccaccgagcg 2040tggcttcggc agatcgaaga gatccacgtc gggctcgagg cctcgctcct ggttaaggcg 2100ccggggaccg gtgagctgtt cgtaaacttc gacccgcaaa tactaatcct gttccgtgaa 2160accgagtgca tggcccagat gggcctcgag gtctcacctt tggccacgag cctgttccag 2220aagcgcgacc gctacaagcg gaacttctct aacatgaaga tgatgcttgc cgagtaccag 2280agggttaagt cgaagatccc tgctgccatc gagcagctca tcgtgccaca tctggccaag 2340gttgacgagg cactccaacc gggcttggcg gccctgacgt ggacctccct aaacattgag 2400gcctacttgg aaaatacttt cgcgaagatt aaggatctcg aattactact ggatcgtgtg 2460aatgacctca tagaattccg gatagacgcg atcctagagg agatgtcgag cacccccctc 2520tgtcagctcc cgcaggagga gccgctcaca tgcgaggaat ttctccagat gactaaggac 2580ctctgcgtta acggggccca gatactccat ttcaagtcgt ccctcgttga ggaggcggtg 2640aatgaactgg ttaatatgct cttggatgtg gaagtgctct cggaggaaga atccgagaag 2700attagcaacg agaatagcgt gaactacaag aacgagagct cagcaaaacg ggaggagggg 2760aattttgata cgctgacttc ctccatcaac gcgcgggcca acgctctctt gctgacaaca 2820gtaacgcgca aaaagaagga gactgagatg ctaggagagg aggcgcgcga gctgctgtcc 2880catttcaacc accaaaatat ggatgcgctt ctcaaagtca cccggaatac cctcgaggcg 2940atacgcaagc gcatccattc gagccatacg ataaacttca gggacagcaa ctccgcgtca 3000aatatgaagc agaactcgtt gccgatattc cgggcctcgg tgacgctggc gatcccgaac 3060attgtgatgg caccggccct cgaggacgta cagcagaccc ttaacaaggc ggtggagtgc 3120atcatctccg ttcccaaggg cgtccgccag tggtcctcgg agctgctcag caagaaaaag 3180attcaggagc gtaaaatggc ggcccttcaa tcgaacgaag actccgactc tgacgttgag 3240atgggagaga acgagctaca ggatacctta gaaatcgcct ccgtgaacct ccctatcccg 3300gtgcagacga aaaactacta caagaacgtc agcgaaaaca aggaaatcgt aaagctggtg 3360agcgtcctga gtaccatcat taacagcaca aagaaggaag tgataacctc gatggactgc 3420tttaagcgct acaaccacat ttggcagaaa ggcaaggagg aggctataaa aacgttcatc 3480acgcagagcc ccctgctctc ggagttcgag tcacaaatcc tctacttcca aaatctggag 3540caggagatca 
acgctgagcc ggaatatgtg tgcgtcggct cgatagccct gtacacggct 3600gatctgaaat ttgcgctgac cgctgagacg aaggcttgga tggtggtgat tggccgacac 3660tgcaacaaga agtaccggtc tgaaatggag aacatcttta tgctaatcga ggaatttaac 3720aaaaagctga accgtcccat taaggatctg gacgacatcc gaattgccat ggcggcccta 3780aaggaaatta gagaggagca gatatccatt gactttcagg ttggccccat cgaagaatca 3840tatgcccttc tgaatcgata cggtctatta atcgcccgag aggaaataga taaggtggac 3900acacttcatt atgcatggga gaaactctta gcgcgggccg gcgaagtgca gaataagctc 3960gtatcgctgc aaccatcatt taagaaggag ctcatcagtg ctgtcgaagt ctttctgcag 4020gactgccacc agttctatct ggattatgac ctgaacggtc cgatggcgag tggtctgaag 4080ccccaagagg cttcagaccg gcttattatg ttccaaaatc agtttgataa tatttaccgg 4140aagtatatca cctatacagg gggtgaagaa ttgtttggtc tcccagccac ccagtaccca 4200caattattgg aaataaagaa gcagctgaac cttcttcaaa aaatctacac tctctataat 4260tcggtaattg aaactgttaa ttcctactac gatattctct ggagcgaggt caacattgag 4320aaaattaaca acgaactctt ggagtttcaa aacagatgcc gcaagttgcc cagagcgctg 4380aaggactggc aggcttttct cgaccttaaa aaaataatcg atgatttcag tgaatgctgt 4440cctctcttag aatacatggc ctcaaaggct atgatggaga gacactggga gaggattacg 4500actctcacgg ggcattcttt ggacgttggc aacgagtcct tcaagctgcg taatataatg 4560gaggctccac ttctcaagta caaagaggaa atagaagaca tctgtatatc tgctgtcaaa 4620gagcgcgata tagaacagaa actaaagcag gtaattaacg aatgggacaa taaaacgttt 4680acatttggca gtttcaagac acgtggagaa ttattgcttc gtggcgactc cacctcggaa 4740attatagcta acatggagga ctctctcatg ttactcggct cgctgttatc gaaccggtat 4800aatatgccat tcaaagcaca gattcagaag tgggtgcagt atctatctaa tagtacggat 4860attatagaga gctggatgac cgtccagaat ctctggatct acctggaggc ggtgtttgtg 4920ggtggtgata ttgcgaagca gcttccaaag gaggccaaaa gattctccaa cattgacaaa 4980tcctgggtca agattatgac tcgggcccac gaagtgccct ccgtggtgca gtgctgcgtt 5040ggcgacgaaa ccttgggcca gctgttgccc cacctgttgg atcaattgga aatctgccaa 5100aagagcctga ctggctacct agaaaaaaag cgtctgtgct ttccccggtt ctttttcgtt 5160tctgaccctg cactactcga aatcttgggt caggcctcag attctcacac aattcaggct 5220catttgttaa atgtgtttga caacatcaaa agtgtgaaat 
ttcatgaaaa gatttatgac 5280aggatcttgt ccatttcatc ccaagagggg gaaaccattg agcttgataa gcctgtgatg 5340gcagagggaa acgtggaggt ctggttaaac agtctcctgg aagagtccca gtcctcactg 5400cacctggtaa tccgccaggc ggcggctaat atccaggaga caggattcca gctcacggaa 5460ttccttagtt cgtttccggc gcaagtgggg ctcctcggca ttcagatgat ctggacgaga 5520gattcggagg aagccctcag aaacgccaag tttgacaaga agattatgca gaaaactaac 5580caagccttcc tagagctcct caacaccctg atcgatgtca caacacgtga tctatcgtct 5640accgagcggg tcaagtatga gacactgatt accatacacg tgcaccagcg tgatatattc 5700gatgatctct gccacatgca cataaagagt cccatggact tcgaatggct aaaacagtgc 5760cgcttctact ttaatgaaga ctcggataag atgatgatcc atatcacaga tgtagcgttt 5820atttaccaaa acgagttcct tggctgcaca gacaggttag tcataactcc gttaactgat 5880cgctgctaca ttacactcgc ccaagcgctt ggaatgtcca tgggtggagc ccccgcaggg 5940ccggcgggga ctggtaagac cgaaacaact aaagatatgg gccgttgcct cgggaagtat 6000gtggtagttt ttaactgctc agaccaaatg gatttccgag ggctgggccg tatctttaaa 6060gggctggcgc aatcaggttc ctggggctgt tttgacgagt tcaatcgtat tgatttaccg 6120gtgctaagtg ttgccgcaca gcaaattagt ataattctaa cttgtaagaa agaacacaag 6180aaaagtttta tatttactga tggcgacaac gtcactatga atcctgaatt cgggcttttc 6240ttgactatga acccagggta tgctggccgt caagaacttc ctgaaaatct gaagatcaac 6300tttcgatcgg tggctatgat ggtaccggat cgccagatca ttatccgagt aaaactggct 6360tcgtgtggct tcatcgacaa cgtcgtactt gctcgaaagt tcttcaccct ttacaagcta 6420tgtgaggagc agttatcgaa acaagttcat tacgactttg ggctccggaa tatcttgtcc 6480gtcctccgca cactcggagc ggctaaacgt gcaaatccca tggacactga gagtacgatt 6540gtgatgcgag tgttaaggga tatgaatctc tcaaaattaa tcgacgagga cgagcctctt 6600tttttaagcc ttatagagga tctgttccca aacatcctcc tggacaaggc gggatatccc 6660gagttggaag cggccattag caggcaggtg gaggaggccg gattgattaa tcacccgccc 6720tggaaactga aagtcatcca gctgttcgag actcagcggg tccgacacgg tatgatgact 6780ttaggcccat ctggcgcggg caaaaccacc tgcatccaca ccctgatgag ggctatgacc 6840gattgtggga agcctcaccg tgagatgcgg atgaacccga aggcgatcac tgcgccccaa 6900atgtttgggc gtctggatgt ggcgacaaat gactggaccg atggaatctt tagcacactc 6960tggaggaaga 
ccctgcgcgc caaaaaagga gagcacatct ggatcattct cgatggcccc 7020gttgacgcta tttggatcga aaacttaaac agcgtgctcg acgacaacaa gaccctgaca 7080ttggcaaatg gtgaccggat tcctatggct cccaattgca aaatcatttt tgaacctcac 7140aacatcgaca acgccagtcc ggctacggtg tcccgcaacg gtatggtttt catgagcagt 7200tccatactgg attggagtcc gatattggaa ggatttctca agaaacgcag tccccaggag 7260gcggaaattc tgcggcaact gtatacggaa agttttcctg acctgtaccg cttctgtatt 7320caaaatctcg agtataagat ggaggtgttg gaggccttcg tcatcacaca gtccattaac 7380atgcttcagg gcttgatccc cttgaaggag caaggagggg aagtcagcca ggcacatcta 7440gggcggcttt tcgttttcgc ccttctctgg tccgcgggtg ctgctctcga gctagacgga 7500cgccggcgct tggagttgtg gctgaggtct cgcccgaccg ggacactcga gctcccgccg 7560ccagccggac ccggggacac tgcattcgac tactacgtag ctccggacgg cacctggacc 7620cactggaaca cccgtacgca ggagtatctc tatcccagcg atacaactcc tgagtatggt 7680agcatactcg ttccgaacgt agacaacgtc agaaccgact ttctgatcca gaccattgct 7740aagcagggca aggcagtcct acttatcgga gagcaaggga ccgcgaaaac cgtgattatc 7800aagggcttca tgagtaaata tgacccagag tgtcatatga ttaaatccct caacttctcc 7860tctgctacca caccactcat gtttcagcgc actatcgaat cctacgtgga caagcggatg 7920ggcaccacct acggaccgcc tgccgggaag aagatgacgg tatttataga cgacgttaac 7980atgcccatca tcaacgagtg gggagatcaa gtgaccaacg aaatcgttcg gcaacttatg 8040gaacaaaacg ggttctataa cctcgagaag ccgggcgagt tcacctcaat agtagacatt 8100caatttctgg cagctatgat ccaccccgga ggaggacgga acgacattcc ccagcggctc 8160aagcgccaat tcagcatctt caactgcacg ctgccaagtg aagcatcggt tgacaaaatc 8220ttcggcgtca tcggggtggg tcactactgc acccagcgcg gcttttcaga ggaagtccga 8280gattctgtta ctaaactggt tcctttgact agaaggctgt ggcagatgac caaaattaaa 8340atgcttccta
ctccagctaa attccactac gtgttcaatc tgcgagactt atccagggta 8400tggcaaggta tgcttaatac cacctctgag gtaattaaag aaccgaacga tctgctgaaa 8460ctgtggaagc acgaatgtaa gagagttatt gcagataggt ttactgtatc gtcagatgtt 8520acctggttcg ataaagcact ggtgtctttg gtcgaagaag agtttgggga agagaagaag 8580ctactcgtcg attgcgggat agacacttac tttgtggact tcttaagaga cgcccctgag 8640gctgccggcg aaacatcgga agaggcagac gctgagactc ctaagatcta tgagccgatc 8700gagagcttta gccacctcaa ggagcgtctg aatatgtttt tacaattgta taacgagagc 8760attcgtggtg ctgggatgga tatggtgttc ttcgccgatg caatggtgca tctggttaag 8820attagccggg tgattcgcac gccacaggga aacgcgctac tggtcggggt gggtgggtca 8880ggaaagcaat cgttaactcg tttggcatcc tttatcgcgg gatatgtgag ttttcaaatc 8940accctgacga ggtcctataa tacatccaac ctgatggagg atcttaaagt tttgtacagg 9000acggcgggac aacagggcaa aggaataacc ttcatcttca cagataacga aattaaagac 9060gaatcattct tggagtatat gaacaacgtc ttatcaagcg gcgaagtttc gaacctcttc 9120gcgcgggacg aaatcgatga gatcaactct gatctcgctt ctgtcatgaa gaaagaattc 9180ccccgctgtt tgccgacgaa tgagaacttg catgactatt tcatgtcccg tgtgcggcag 9240aatttgcaca ttgtgctatg tttttcaccg gtgggggaga agttccgaaa tagagctttg 9300aagttccctg ctttgatttc tgggtgtact attgactggt tttcccgttg gcccaaggac 9360gctctggtcg ccgtgtccga gcacttttta accagctatg atatcgactg cagcctcgaa 9420attaagaagg aagtagttca gtgtatgggc tctttccaag acggtgtggc agagaagtgc 9480gtcgactatt tccagaggtt tcgccgatct actcatgtca cacctaagag ctacttgtcc 9540ttcatacagg ggtacaagtt tatatatggg gagaaacacg ttgaagtaag gactctggca 9600aaccgtatga atactggctt agagaagctc aaggaggcct cagaatccgt ggctgctctc 9660tcaaaggagc ttgaagctaa ggagaaggaa ctccaagttg ctaacgataa agcggatatg 9720gttctaaagg aagtcactat gaaagcacaa gcggctgaaa aggttaaggc ggaggtacag 9780aaggtgaagg atcgcgccca ggcaatagtc gattccattt ccaaagacaa ggccatcgcg 9840gaagaaaagc tagaagcggc gaagccggcc ttagaagagg cagaggctgc cttgcaaacc 9900ataagaccca gtgacatcgc cacggtacga acccttggtc gtcctcctca tttgattatg 9960aggattatgg actgtgtcct gcttttattt caacgtaaag tatctgcagt taagattgat 10020ttggaaaaat cctgtaccat gccctcatgg caggaatccc 
tgaaattgat gaccgcgggc 10080aatttccttc aaaatttgca acaattcccc aaggacacca ttaacgaaga ggtcatagaa 10140tttttatctc catattttga gatgccagac tacaatattg aaacagcgaa gcgcgtctgt 10200ggtaacgttg caggtctctg ttcgtggacc aaagctatgg cctccttctt tagtatcaat 10260aaagaggtac taccactgaa agccaacctg gtggtacagg agaaccggca tctgcttgct 10320atgcaggatc ttcagaaggc ccaggccgaa ttagacgaca agcaggctga gcttgacgtt 10380gtgcaagcag aatacgaaca agctatgaca gaaaagcaga ctttattaga agacgctgaa 10440cgctgcagac ataagatgca gactgcaagc accctcatat ccggtttggc tggagaaaaa 10500gagcggtgga cagagcagtc gcaagaattc gctgctcaaa ccaaaaggtt ggttggagac 10560gttctactcg cgacagcgtt tctctcctat tctggtcctt tcaaccagga attccgggac 10620cttttgctga atgactggag aaaagaaatg aaggctcgca aaataccatt tggaaagaat 10680ttgaacttgt ctgaaatgct tattgacgca cccactatat cagagtggaa tcttcaggga 10740cttccaaatg acgatctgtc catccaaaac ggaattattg taaccaaggc gagccgctac 10800cctctgctaa ttgacccgca gacacagggc aagatctgga ttaaaaataa ggaaagcagg 10860aacgagctcc agatcactag tctcaaccac aagtacttcc gtaaccacct cgaagatagt 10920ctgtccctgg gacggccgtt gctaatcgag gacgtcggag aagagctgga ccccgcatta 10980gacaacgttc ttgaaagaaa ttttatcaag acaggatcaa ctttcaaagt taaagtagga 11040gataaagaag tggatgtgtt agatggcttc cgcctatata tcacgactaa actcccgaat 11100cccgcctata ctccagagat cagcgctaga actagcatca tagatttcac tgtaactatg 11160aaggggttag aagatcaatt attaggacgc gtcatcctga cggagaaaca ggaacttgaa 11220aaagagcgta cacacctcat ggaagacgtg acagctaaca aacgtcggat gaaggaactg 11280gaggacaatt tactgtatcg gttgacatca acacagggct cccttgttga ggacgagagt 11340ctgatcgtgg tcctgtctaa cacaaagagg actgctgaag aagtcactca gaaattggag 11400atttctgccg aaactgaagt tcagattaac tccgctagag aagagtatcg tccagtcgct 11460actcggggct ctatcctata cttcctcata actgagatgc gcttggtcaa tgagatgtac 11520cagacttcac tccggcaatt cctgggcttg tttgacttgt cgctggcaag atcagtaaaa 11580tctccaatta ccagcaagag aatcgcaaac atcattgagc acatgacgta cgaggtgtat 11640aaatacgcgg cgaggggtct ttatgaagag cataagttcc tcttcaccct actattaacg 11700ttgaagatag atattcagag gaaccgggtg aagcacgagg agtttctaac 
tctaataaaa 11760ggaggagctt ctttagatct taaagcctgt ccaccgaaac cttctaagtg gattttagac 11820ataacatggc tgaatttagt ggagttgtcc aagttacgtc agttcagtga cgttttggat 11880cagatatcta ggaacgagaa gatgtggaag atctggttcg ataaagagaa tccggaggag 11940gagcccttgc caaatgctta tgataaaagc ctagactgct tcaggaggct tttgctcatc 12000cgctcctggt gccccgaccg cactattgcc caagctagga aatacattgt ggactccatg 12060ggggagaagt atgccgaagg agtcatactc gacttggaga agacttggga agagtcagat 12120ccgaggacgc ccctcatttg tcttctttcc atgggttctg atcccacgga ctctattata 12180gcactcggga aaagactaaa gatcgagaca cgctatgtta gcatgggaca ggggcaagag 12240gtccatgccc gcaaactact acagcagact atggctaatg gaggttgggc tctgttacag 12300aactgtcact taggccttga ttttatggac gaattgatgg acatcataat tgagacggag 12360ctagttcacg acgcatttcg cttatggatg acgactgaag cacacaagca gtttccgata 12420accctgttgc agatgtccat caagttcgcc aacgaccctc cccagggcct tagggcaggt 12480ctcaaaagga cctacagcgg cgtttcccag gatttacttg acgtctccag cggatcccag 12540tggaagccca tgttgtacgc cgtggctttc cttcacagca cagttcagga aaggcggaag 12600tttggtgcgc taggctggaa catcccctac gagttcaacc aggctgactt taatgcaaca 12660gtacagttta ttcaaaatca tctggatgac atggatgtca aaaaaggtgt gtcatggact 12720acaataaggt acatgattgg ggagatacag tacggaggcc gggtaactga tgattatgat 12780aaaagattgt taaacacttt tgctaaagtt tggtttagtg aaaatatgtt tggacctgat 12840tttagttttt atcaaggtta taatatacct aagtgctcaa ctgtagataa ttatttacaa 12900tatattcaaa gtttacctgc atatgatagt cctgaagttt ttggtttgca tcctaatgca 12960gatataacat accaatcaaa attagcaaaa gacgtcttag atacaatact tggtatccag 13020cctaaggaca ccagtggggg cggtgacgag actcgcgagg ctgtggtggc ccggctcgct 13080gatgacatgc tagagaaact tcccccggac tacgtcccct ttgaggtcaa agagcggctg 13140cagaaaatgg ggcccttcca gcccatgaac atattcttgc gccaggagat agaccgtatg 13200caacgagtcc tgagcctggt ccgctcgact ctaaccgagc tcaagctggc catcgatggg 13260acgatcatta tgtctgagaa tttgagggac gcgctcgatt gcatgtttga cgccaggatc 13320ccagcctggt ggaaaaaagc tagttggatc tcatctaccc tggggttctg gttcacagag 13380ttgatagagc gcaactcgca attcacctcc tgggtgttca acgggcggcc ccactgcttc 
13440tggatgacgg gcttcttcaa cccgcagggc ttcctgacgg ctatgcggca ggaaatcacc 13500cgggcgaaca agggttgggc gctcgacaat atggtgctct gcaacgaggt cacgaagtgg 13560atgaaggacg acatctcggc gcctcccacc gaaggggtct acgtatacgg cctgtacctc 13620gagggggcgg gctgggacaa gcgtaacatg aagctgatcg agtcgaagcc caaggtgctg 13680ttcgagctga tgcccgtcat ccggatctac gcagagaaca acacgctgcg cgacccgcgg 13740ttctactcgt gccccatcta caagaagccg gtgcggacgg acctcaacta catcgccgcc 13800gtcgacctcc gcaccgcgca gacccccgag cactgggtgc tgcggggggt cgcattgctc 13860tgcgacgtca agtag 138752813875DNAArtificial SequenceSynthetic polynucleotide 28atgttcagaa taggcagacg acagttatgg aaacacagtg tgactagggt tctaacacag 60agactaaaag gagagaaaga ggcaaaaaga gcgctactcg atgcaagaca taactatcta 120ttcgctattg tggcgtcgtg tttagattta aataaaactg aagtagagga cgccattctc 180gagggaaatc aaatagaacg catcgaccaa ttattcgcgg ttggaggact tagacatttg 240atgttttact atcaagacgt tgaagaagcg gaaactgggc aactcgggtc actaggtgga 300gttaatttag tctcaggtaa gatcaaaaag ccaaaagtat ttgtaacaga aggaaacgac 360gttgcactaa caggtgtttg cgtgtttttt attagaactg atccgtctaa agccataaca 420cccgacaaca tccatcagga agtttcgttt aacatgctgg atgcggccga cggaggacta 480ctgaattcag tgcgtcgcct gctatctgac attttcattc ctgctttgcg tgcgaccagt 540cacgggtggg gggagttaga aggactccag gacgccgcga atatccggca ggaattcctg 600tccagcctgg aaggttttgt taacgtcctg tccggcgccc aggagagcct taaggagaag 660gtgaacttac gaaagtgcga catactcgag ctcaaaaccc tgaaggaacc tacagactac 720ctcaccctcg caaacaaccc cgaaaccctc ggcaaaattg aagattgcat gaaggtttgg 780attaagcaga cggaacaagt cctggccgag aacaaccaac tcctaaagga ggccgacgac 840gtgggcccgc gcgctgagct ggagcactgg aagaagaggc tcagcaagtt taactatctt 900cttgagcagc tgaagagccc ggacgttaag gcggtactag ccgtcctcgc ggctgcgaag 960tcgaaactgc tcaagacctg gcgtgagatg gatatacgca ttacggacgc aaccaacgaa 1020gctaaggaca acgttaagta cttgtatacc ctcgagaagt gctgcgaccc actctactca 1080tctgacccgc tcagtatgat ggatgccatc cccacgctaa ttaacgcaat taagatgatc 1140tactcgatat cgcactatta caacacgtct gaaaaaataa ccagcctctt cgtaaaagtg 1200accaaccaga tcatttcagc ctgtaaggct 
tacatcacaa acaacggcac cgcctccata 1260tggaaccagc cccaggacgt cgtggaagag aagatcctat cggccataaa gctgaagcag 1320gagtaccagc tgtgcttcca caaaaccaag cagaaactca agcagaaccc caatgctaag 1380cagttcgact tctctgagat gtatattttc gggaagtttg aaaccttcca ccgccgcctg 1440gcgaagatca tcgacatatt cacaactctg aagacctaca gtgttctaca agacagcact 1500atagagggtc tcgaggatat ggccactaag taccagggca ttgtcgcaac tatcaagaag 1560aaagaatata atttcctcga tcagcgtaag atggacttcg accaggacta cgaagaattc 1620tgcaagcaga cgaatgattt gcataacgaa ctccggaagt tcatggacgt aacgtttgcc 1680aagattcaaa acacaaatca ggcgttgcgg atgctaaaga agttcgagcg tctgaacatc 1740cctaatctag ggattgacga caagtaccaa cttatactgg aaaactacgg ggctgacatc 1800gatatgatct ccaagctata taccaagcaa aagtatgacc cgccgttagc tcggaatcag 1860cccccgatcg ccgggaagat cctgtgggca cggcagcttt ttcaccgcat tcagcagcct 1920atgcagctgt tccagcagca cccggcggtt ctctccaccg ccgaggctaa gccaattata 1980cgtagctaca accgcatggc gaaagtcctg ctcgagtttg aggtcttgtt ccaccgagcg 2040tggcttcggc agatcgaaga gatccacgtc gggctcgagg cctcgctcct ggttaaggcg 2100ccggggaccg gtgagctgtt cgtaaacttc gacccgcaaa tactaatcct gttccgtgaa 2160accgagtgca tggcccagat gggcctcgag gtctcacctt tggccacgag cctgttccag 2220aagcgcgacc gctacaagcg gaacttctct aacatgaaga tgatgcttgc cgagtaccag 2280agggttaagt cgaagatccc tgctgccatc gagcagctca tcgtgccaca tctggccaag 2340gttgacgagg cactccaacc gggcttggcg gccctgacgt ggacctccct aaacattgag 2400gcctacttgg aaaatacttt cgcgaagatt aaggatctcg aattactact ggatcgtgtg 2460aatgacctca tagaattccg gatagacgcg atcctagagg agatgtcgag cacccccctc 2520tgtcagctcc cgcaggagga gccgctcaca tgcgaggaat ttctccagat gactaaggac 2580ctctgcgtta acggggccca gatactccat ttcaagtcgt ccctcgttga ggaggcggtg 2640aatgaactgg ttaatatgct cttggatgtg gaagtgctct cggaggaaga atccgagaag 2700attagcaacg agaatagcgt gaactacaag aacgagagct cagcaaaacg ggaggagggg 2760aattttgata cgctgacttc ctccatcaac gcgcgggcca acgctctctt gctgacaaca 2820gtaacgcgca aaaagaagga gactgagatg ctaggagagg aggcgcgcga gctgctgtcc 2880catttcaacc accaaaatat ggatgcgctt ctcaaagtca cccggaatac cctcgaggcg 
2940atacgcaagc gcatccattc gagccatacg ataaacttca gggacagcaa ctccgcgtca 3000aatatgaagc agaactcgtt gccgatattc cgggcctcgg tgacgctggc gatcccgaac 3060attgtgatgg caccggccct cgaggacgta cagcagaccc ttaacaaggc ggtggagtgc 3120atcatctccg ttcccaaggg cgtccgccag tggtcctcgg agctgctcag caagaaaaag 3180attcaggagc gtaaaatggc ggcccttcaa tcgaacgaag actccgactc tgacgttgag 3240atgggagaga acgagctaca ggatacctta gaaatcgcct ccgtgaacct ccctatcccg 3300gtgcagacga aaaactacta caagaacgtc agcgaaaaca aggaaatcgt aaagctggtg 3360agcgtcctga gtaccatcat taacagcaca aagaaggaag tgataacctc gatggactgc 3420tttaagcgct acaaccacat ttggcagaaa ggcaaggagg aggctataaa aacgttcatc 3480acgcagagcc ccctgctctc ggagttcgag tcacaaatcc tctacttcca aaatctggag 3540caggagatca acgctgagcc ggaatatgtg tgcgtcggct cgatagccct gtacacggct 3600gatctgaaat ttgcgctgac cgctgagacg aaggcttgga tggtggtgat tggccgacac 3660tgcaacaaga agtaccggtc tgaaatggag aacatcttta tgctaatcga ggaatttaac 3720aaaaagctga accgtcccat taaggatctg gacgacatcc gaattgccat ggcggcccta 3780aaggaaatta gagaggagca gatatccatt gactttcagg ttggccccat cgaagaatca 3840tatgcccttc tgaatcgata cggtctatta atcgcccgag aggaaataga taaggtggac 3900acacttcatt atgcatggga gaaactctta gcgcgggccg gcgaagtgca gaataagctc 3960gtatcgctgc aaccatcatt taagaaggag ctcatcagtg ctgtcgaagt ctttctgcag 4020gactgccacc agttctatct ggattatgac ctgaacggtc cgatggcgag tggtctgaag 4080ccccaagagg cttcagaccg gcttattatg ttccaaaatc agtttgataa tatttaccgg 4140aagtatatca cctatacagg gggtgaagaa ttgtttggtc tcccagccac ccagtaccca 4200caattattgg aaataaagaa gcagctgaac cttcttcaaa aaatctacac tctctataat 4260tcggtaattg aaactgttaa ttcctactac gatattctct ggagcgaggt caacattgag 4320aaaattaaca acgaactctt ggagtttcaa aacagatgcc gcaagttgcc cagagcgctg 4380aaggactggc aggcttttct cgaccttaaa aaaataatcg atgatttcag tgaatgctgt 4440cctctcttag aatacatggc ctcaaaggct atgatggaga gacactggga gaggattacg 4500actctcacgg ggcattcttt ggacgttggc aacgagtcct tcaagctgcg taatataatg 4560gaggctccac ttctcaagta caaagaggaa atagaagaca tctgtatatc tgctgtcaaa 4620gagcgcgata tagaacagaa actaaagcag 
gtaattaacg aatgggacaa taaaacgttt 4680acatttggca gtttcaagac acgtggagaa ttattgcttc gtggcgactc cacctcggaa 4740attatagcta acatggagga ctctctcatg ttactcggct cgctgttatc gaaccggtat 4800aatatgccat tcaaagcaca gattcagaag tgggtgcagt atctatctaa tagtacggat 4860attatagaga gctggatgac cgtccagaat ctctggatct acctggaggc ggtgtttgtg 4920ggtggtgata ttgcgaagca gcttccaaag gaggccaaaa gattctccaa cattgacaaa 4980tcctgggtca agattatgac tcgggcccac gaagtgccct ccgtggtgca gtgctgcgtt 5040ggcgacgaaa ccttgggcca gctgttgccc cacctgttgg atcaattgga aatctgccaa 5100aagagcctga ctggctacct agaaaaaaag cgtctgtgct ttccccggtt ctttttcgtt 5160tctgaccctg cactactcga aatcttgggt caggcctcag attctcacac aattcaggct 5220catttgttaa atgtgtttga caacatcaaa agtgtgaaat ttcatgaaaa gatttatgac 5280aggatcttgt ccatttcatc ccaagagggg gaaaccattg agcttgataa gcctgtgatg 5340gcagagggaa acgtggaggt ctggttaaac agtctcctgg aagagtccca gtcctcactg 5400cacctggtaa tccgccaggc ggcggctaat atccaggaga caggattcca gctcacggaa 5460ttccttagtt cgtttccggc gcaagtgggg ctcctcggca ttcagatgat ctggacgaga 5520gattcggagg aagccctcag aaacgccaag tttgacaaga agattatgca gaaaactaac 5580caagccttcc tagagctcct caacaccctg atcgatgtca caacacgtga tctatcgtct 5640accgagcggg tcaagtatga gacactgatt accatacacg tgcaccagcg tgatatattc 5700gatgatctct gccacatgca cataaagagt cccatggact tcgaatggct aaaacagtgc 5760cgcttctact ttaatgaaga ctcggataag atgatgatcc atatcacaga tgtagcgttt 5820atttaccaaa acgagttcct tggctgcaca gacaggttag tcataactcc gttaactgat 5880cgctgctaca ttacactcgc ccaagcgctt ggaatgtcca tgggtggagc ccccgcaggg 5940ccggcgggga ctggtaagac cgaaacaact aaagatatgg gccgttgcct cgggaagtat 6000gtggtagttt ttaactgctc agaccaaatg gatttccgag ggctgggccg tatctttaaa 6060gggctggcgc aatcaggttc ctggggctgt tttgacgagt tcaatcgtat tgatttaccg 6120gtgctaagtg ttgccgcaca gcaaattagt ataattctaa cttgtaagaa agaacacaag 6180aaaagtttta tatttactga tggcgacaac gtcactatga atcctgaatt cgggcttttc 6240ttgactatga acccagggta tgctggccgt caagaacttc ctgaaaatct gaagatcaac 6300tttcgatcgg tggctatgat ggtaccggat cgccagatca ttatccgagt aaaactggct 
6360tcgtgtggct tcatcgacaa cgtcgtactt gctcgaaagt tcttcaccct ttacaagcta 6420tgtgaggagc agttatcgaa acaagttcat tacgactttg ggctccggaa tatcttgtcc 6480gtcctccgca cactcggagc ggctaaacgt gcaaatccca tggacactga gagtacgatt 6540gtgatgcgag tgttaaggga tatgaatctc tcaaaattaa tcgacgagga cgagcctctt 6600tttttaagcc ttatagagga tctgttccca aacatcctcc tggacaaggc gggatatccc 6660gagttggaag cggccattag caggcaggtg gaggaggccg gattgattaa tcacccgccc 6720tggaaactga aagtcatcca gctgttcgag actcagcggg tccgacacgg tatgatgact 6780ttaggcccat ctggcgcggg caaaaccacc tgcatccaca ccctgatgag ggctatgacc 6840gattgtggga agcctcaccg tgagatgcgg atgaacccga aggcgatcac tgcgccccaa 6900atgtttgggc gtctggatgt ggcgacaaat gactggaccg atggaatctt tagcacactc 6960tggaggaaga ccctgcgcgc caaaaaagga gagcacatct ggatcattct cgatggcccc 7020gttgacgcta tttggatcga aaacttaaac agcgtgctcg acgacaacaa gaccctgaca 7080ttggcaaatg gtgaccggat tcctatggct cccaattgca aaatcatttt tgaacctcac 7140aacatcgaca acgccagtcc ggctacggtg tcccgcaacg gtatggtttt catgagcagt 7200tccatactgg attggagtcc gatattggaa ggatttctca agaaacgcag tccccaggag 7260gcggaaattc tgcggcaact gtatacggaa agttttcctg acctgtaccg cttctgtatt 7320caaaatctcg agtataagat ggaggtgttg gaggccttcg tcatcacaca gtccattaac 7380atgcttcagg gcttgatccc cttgaaggag caaggagggg aagtcagcca ggcacatcta 7440gggcggcttt tcgttttcgc ccttctctgg tccgcgggtg ctgctctcga gctagacgga 7500cgccggcgct tggagttgtg gctgaggtct cgcccgaccg ggacactcga gctcccgccg 7560ccagccggac ccggggacac tgcattcgac tactacgtag ctccggacgg cacctggacc 7620cactggaaca cccgtacgca ggagtatctc tatcccagcg atacaactcc tgagtatggt 7680agcatactcg ttccgaacgt agacaacgtc agaaccgact ttctgatcca gaccattgct 7740aagcagggca aggcagtcct acttatcgga gagcaaggga ccgcgaaaac cgtgattatc 7800aagggcttca tgagtaaata tgacccagag tgtcatatga ttaaatccct caacttctcc 7860tctgctacca caccactcat gtttcagcgc actatcgaat cctacgtgga caagcggatg 7920ggcaccacct acggaccgcc tgccgggaag aagatgacgg tatttataga cgacgttaac 7980atgcccatca tcaacgagtg gggagatcaa gtgaccaacg aaatcgttcg gcaacttatg 8040gaacaaaacg ggttctataa cctcgagaag 
ccgggcgagt tcacctcaat agtagacatt 8100caatttctgg cagctatgat ccaccccgga ggaggacgga acgacattcc ccagcggctc 8160aagcgccaat tcagcatctt caactgcacg ctgccaagtg aagcatcggt tgacaaaatc 8220ttcggcgtca tcggggtggg tcactactgc acccagcgcg gcttttcaga ggaagtccga 8280gattctgtta ctaaactggt tcctttgact agaaggctgt ggcagatgac caaaattaaa 8340atgcttccta ctccagctaa attccactac gtgttcaatc tgcgagactt atccagggta 8400tggcaaggta tgcttaatac cacctctgag gtaattaaag aaccgaacga tctgctgaaa 8460ctgtggaagc acgaatgtaa gagagttatt gcagataggt ttactgtatc gtcagatgtt 8520acctggttcg ataaagcact ggtgtctttg gtcgaagaag agtttgggga agagaagaag 8580ctactcgtcg attgcgggat agacacttac tttgtggact tcttaagaga cgcccctgag 8640gctgccggcg aaacatcgga agaggcagac gctgagactc ctaagatcta tgagccgatc 8700gagagcttta gccacctcaa ggagcgtctg aatatgtttt tacaattgta taacgagagc 8760attcgtggtg ctgggatgga tatggtgttc ttcgccgatg caatggtgca tctggttaag 8820attagccggg tgattcgcac gccacaggga aacgcgctac tggtcggggt gggtgggtca 8880ggaaagcaat cgttaactcg tttggcatcc tttatcgcgg gatatgtgag ttttcaaatc 8940accctgacga ggtcctataa tacatccaac ctgatggagg atcttaaagt tttgtacagg 9000acggcgggac aacagggcaa aggaataacc ttcatcttca cagataacga aattaaagac 9060gaatcattct tggagtatat gaacaacgtc ttatcaagcg gcgaagtttc gaacctcttc 9120gcgcgggacg aaatcgatga gatcaactct gatctcgctt ctgtcatgaa gaaagaattc 9180ccccgctgtt tgccgacgaa tgagaacttg catgactatt tcatgtcccg tgtgcggcag 9240aatttgcaca ttgtgctatg tttttcaccg gtgggggaga agttccgaaa tagagctttg 9300aagttccctg ctttgatttc tgggtgtact attgactggt tttcccgttg gcccaaggac 9360gctctggtcg ccgtgtccga gcacttttta accagctatg atatcgactg cagcctcgaa 9420attaagaagg
aagtagttca gtgtatgggc tctttccaag acggtgtggc agagaagtgc 9480gtcgactatt tccagaggtt tcgccgatct actcatgtca cacctaagag ctacttgtcc 9540ttcatacagg ggtacaagtt tatatatggg gagaaacacg ttgaagtaag gactctggca 9600aaccgtatga atactggctt agagaagctc aaggaggcct cagaatccgt ggctgctctc 9660tcaaaggagc ttgaagctaa ggagaaggaa ctccaagttg ctaacgataa agcggatatg 9720gttctaaagg aagtcactat gaaagcacaa gcggctgaaa aggttaaggc ggaggtacag 9780aaggtgaagg atcgcgccca ggcaatagtc gattccattt ccaaagacaa ggccatcgcg 9840gaagaaaagc tagaagcggc gaagccggcc ttagaagagg cagaggctgc cttgcaaacc 9900ataagaccca gtgacatcgc cacggtacga acccttggtc gtcctcctca tttgattatg 9960aggattatgg actgtgtcct gcttttattt caacgtaaag tatctgcagt taagattgat 10020ttggaaaaat cctgtaccat gccctcatgg caggaatccc tgaaattgat gaccgcgggc 10080aatttccttc aaaatttgca acaattcccc aaggacacca ttaacgaaga ggtcatagaa 10140tttttatctc catattttga gatgccagac tacaatattg aaacagcgaa gcgcgtctgt 10200ggtaacgttg caggtctctg ttcgtggacc aaagctatgg cctccttctt tagtatcaat 10260aaagaggtac taccactgaa agccaacctg gtggtacagg agaaccggca tctgcttgct 10320atgcaggatc ttcagaaggc ccaggccgaa ttagacgaca agcaggctga gcttgacgtt 10380gtgcaagcag aatacgaaca agctatgaca gaaaagcaga ctttattaga agacgctgaa 10440cgctgcagac ataagatgca gactgcaagc accctcatat ccggtttggc tggagaaaaa 10500gagcggtgga cagagcagtc gcaagaattc gctgctcaaa ccaaaaggtt ggttggagac 10560gttctactcg cgacagcgtt tctctcctat tctggtcctt tcaaccagga attccgggac 10620cttttgctga atgactggag aaaagaaatg aaggctcgca aaataccatt tggaaagaat 10680ttgaacttgt ctgaaatgct tattgacgca cccactatat cagagtggaa tcttcaggga 10740cttccaaatg acgatctgtc catccaaaac ggaattattg taaccaaggc gagccgctac 10800cctctgctaa ttgacccgca gacacagggc aagatctgga ttaaaaataa ggaaagcagg 10860aacgagctcc agatcactag tctcaaccac aagtacttcc gtaaccacct cgaagatagt 10920ctgtccctgg gacggccgtt gctaatcgag gacgtcggag aagagctgga ccccgcatta 10980gacaacgttc ttgaaagaaa ttttatcaag acaggatcaa ctttcaaagt taaagtagga 11040gataaagaag tggatgtgtt agatggcttc cgcctatata tcacgactaa actcccgaat 11100cccgcctata ctccagagat 
cagcgctaga actagcatca tagatttcac tgtaactatg 11160aaggggttag aagatcaatt attaggacgc gtcatcctga cggagaaaca ggaacttgaa 11220aaagagcgta cacacctcat ggaagacgtg acagctaaca aacgtcggat gaaggaactg 11280gaggacaatt tactgtatcg gttgacatca acacagggct cccttgttga ggacgagagt 11340ctgatcgtgg tcctgtctaa cacaaagagg actgctgaag aagtcactca gaaattggag 11400atttctgccg aaactgaagt tcagattaac tccgctagag aagagtatcg tccagtcgct 11460actcggggct ctatcctata cttcctcata actgagatgc gcttggtcaa tgagatgtac 11520cagacttcac tccggcaatt cctgggcttg tttgacttgt cgctggcaag atcagtaaaa 11580tctccaatta ccagcaagag aatcgcaaac atcattgagc acatgacgta cgaggtgtat 11640aaatacgcgg cgaggggtct ttatgaagag cataagttcc tcttcaccct actattaacg 11700ttgaagatag atattcagag gaaccgggtg aagcacgagg agtttctaac tctaataaaa 11760ggaggagctt ctttagatct taaagcctgt ccaccgaaac cttctaagtg gattttagac 11820ataacatggc tgaatttagt ggagttgtcc aagttacgtc agttcagtga cgttttggat 11880cagatatcta ggaacgagaa gatgtggaag atctggttcg ataaagagaa tccggaggag 11940gagcccttgc caaatgctta tgataaaagc ctagactgct tcaggaggct tttgctcatc 12000cgctcctggt gccccgaccg cactattgcc caagctagga aatacattgt ggactccatg 12060ggggagaagt atgccgaagg agtcatactc gacttggaga agacttggga agagtcagat 12120ccgaggacgc ccctcatttg tcttctttcc atgggttctg atcccacgga ctctattata 12180gcactcggga aaagactaaa gatcgagaca cgctatgtta gcatgggaca ggggcaagag 12240gtccatgccc gcaaactact acagcagact atggctaatg gaggttgggc tctgttacag 12300aactgtcact taggccttga ttttatggac gaattgatgg acatcataat tgagacggag 12360ctagttcacg acgcatttcg cttatggatg acgactgaag cacacaagca gtttccgata 12420accctgttgc agatgtccat caagttcgcc aacgaccctc cccagggcct tagggcaggt 12480ctcaaaagga cctacagcgg cgtttcccag gatttacttg acgtctccag cggatcccag 12540tggaagccca tgttgtacgc cgtggctttc cttcacagca cagttcagga aaggcggaag 12600tttggtgcgc taggctggaa catcccctac gagttcaacc aggctgactt taatgcaaca 12660gtacagttta ttcaaaatca tctggatgac atggatgtca aaaaaggtgt gtcatggact 12720acaataaggt acatgattgg ggagatacag tacggaggcc gggtaactga tgattatgat 12780aaaagattgt taaacacttt tgctaaagtt 
tggtttagtg aaaatatgtt tggacctgat 12840tttagttttt atcaaggtta taatatacct aagtgctcaa ctgtagataa ttatttacaa 12900tatattcaaa gtttacctgc atatgatagt cctgaagttt ttggtttgca tcctaatgca 12960gatataacat accaatcaaa attagcaaaa gacgtcttag atacaatact tggtatccag 13020cctaaggaca ccagtggggg cggtgacgag actcgcgagg ctgtggtggc ccggctcgct 13080gatgacatgc tagagaaact tcccccggac tacgtcccct ttgaggtcaa agagcggctg 13140cagaaaatgg ggcccttcca gcccatgaac atattcttgc gccaggagat agaccgtatg 13200caacgagtcc tgagcctggt ccgctcgact ctaaccgagc tcaagctggc catcgatggg 13260acgatcatta tgtctgagaa tttgagggac gcgctcgatt gcatgtttga cgccaggatc 13320ccagcctggt ggaaaaaagc tagttggatc tcatctaccc tggggttctg gttcacagag 13380ttgatagagc gcaactcgca attcacctcc tgggtgttca acgggcggcc ccactgcttc 13440tggatgacgg gcttcttcaa cccgcagggc ttcctgacgg ctatgcggca ggaaatcacc 13500cgggcgaaca agggttgggc gctcgacaat atggtgctct gcaacgaggt cacgaagtgg 13560atgaaggacg acatctcggc gcctcccacc gaaggggtct acgtatacgg cctgtacctc 13620gagggggcgg gctgggacaa gcgtaacatg aagctgatcg agtcgaagcc caaggtgctg 13680ttcgagctga tgcccgtcat ccggatctac gcagagaaca acacgctgcg cgacccgcgg 13740ttctactcgt gccccatcta caagaagccg gtgcggacgg acctcaacta catcgccgcc 13800gtcgacctcc gcaccgcgca gacccccgag cactgggtgc tgcggggggt cgcattgctc 13860tgcgacgtca agtag 138752913875DNAArtificial SequenceSynthetic polynucleotide 29atgttcagaa tcgggcggcg acaattatgg aaacactcag tcactagagt tctaacacaa 60aggttaaaag gagaaaaaga agcaaaaaga gcgctattgg acgcaagaca taattatttg 120ttcgcgatag tcgcatcgtg tttagattta aataaaactg aagtagaaga cgctattttg 180gaaggaaatc aaatcgaacg gatcgaccaa ttattcgcgg ttggaggact ccgtcattta 240atgttttact accaagacgt ggaggaagcg gaaactgggc aactcgggtc acttggggga 300gttaacttag tctcaggtaa aattaagaag ccaaaagtat ttgtaacaga aggaaacgac 360gttgcactaa caggtgtttg cgtatttttt attcgaactg atccgtcaaa agccattaca 420ccagataata ttcatcagga ggtttcattt aacatgctag atgctgctga tggaggttta 480ctaaattccg tacgtaggtt gctaagcgat attttcattc ctgccctaag agcaacttct 540cacggatggg gtgagttaga aggtttacaa gacgctgcaa acatacgtca 
agaattcctt 600tcaagtcttg agggatttgt caatgtgctc tcaggagccc aagaatcact aaaggagaaa 660gtaaatttga gaaagtgtga tattcttgaa ttgaaaacct taaaagaacc taccgattat 720ctaacgcttg caaataatcc tgagacatta ggtaaaatcg aggattgtat gaaagtgtgg 780atcaaacaga ctgaacaagt cttagcagaa aataaccagt tattaaaaga agcggatgac 840gttggaccgc gagcagaact agaacactgg aagaagaggc taagtaaatt taattatctt 900cttgaacaac tgaagtctcc agatgttaag gcggttctcg cggtgttagc agctgcaaaa 960tcaaaactcc ttaaaacgtg gcgtgaaatg gatataagga ttactgatgc aactaacgaa 1020gcaaaagata acgtaaaata tttgtatacg ctcgaaaagt gttgtgatcc tttatattca 1080tcagatccac ttagtatgat ggacgctata cctacattaa ttaacgctat taaaatgatt 1140tatagtatct ctcattatta taatacaagc gaaaaaataa cttccttatt cgtaaaagtc 1200acgaaccaga ttatatcagc gtgtaaagca tatataacta acaacggtac tgcatctata 1260tggaatcagc cccaagacgt ggttgaggaa aaaattttga gtgctataaa attaaaacaa 1320gaatatcaat tgtgctttca taaaactaaa caaaaactga agcaaaatcc aaatgcaaaa 1380caatttgact tttctgaaat gtatatattc ggtaaattcg aaacgtttca ccgtcgatta 1440gccaaaatta tcgacatctt tacgacgtta aaaacttaca gcgtgctaca agattctacg 1500atagaagggc tagaggatat ggccacaaag taccagggca tcgtggccac tatcaagaag 1560aaggagtaca acttcttaga ccagcgtaaa atggatttcg accaggacta tgaagaattc 1620tgcaaacaaa cgaatgatct gcacaacgag ctgcggaaat tcatggatgt gacttttgcc 1680aaaatacaga ataccaacca agctcttagg atgttaaaga aatttgaaag gctcaacatt 1740cctaatttgg gcattgatga caaataccag ttgatactcg aaaattatgg agcagatatt 1800gatatgatct ctaaactata cacaaaacaa aaatatgatc ctccgttagc tagaaatcaa 1860ccgccgattg ctggtaagat actctgggcc agacaactct ttcaccgcat ccagcagccc 1920atgcagctgt ttcagcagca ccctgcggtg ctctccaccg ccgaggcgaa acccattatt 1980cgatcttata accgcatggc caaggttctg ttagagtttg aagttttgtt ccaccgtgcc 2040tggttacgtc agatcgagga gatccatgtg ggactggagg cctctctcct agtcaaggcc 2100cccggcaccg gcgaactctt tgtcaatttt gatccccaga ttctaatact cttccgggaa 2160accgagtgca tggcccagat gggtttagag gttagtcctc tggctacttc actgttccag 2220aagagggacc gctataaacg gaatttcagc aatatgaaga tgatgctcgc tgagtatcag 2280agggtcaagt ccaaaatccc 
cgctgcgatc gagcagctga tcgtgccaca cctggccaaa 2340gtagatgagg ccctacaacc aggactggct gcgctgacgt ggacctctct gaatatcgaa 2400gcgtatcttg aaaacacctt tgccaagatt aaagacctgg agcttttact ggacagagtg 2460aacgacctca ttgaattccg catagacgcg attttagagg agatgtcttc cacgccacta 2520tgtcagcttc ctcaggagga gcctctcaca tgtgaagagt tccttcagat gactaaggat 2580ctctgcgtga atggcgctca gatactacat ttcaaatcta gcttggtcga ggaggcagtg 2640aacgaattgg ttaacatgtt actggatgta gaagtcctta gcgaggagga atccgaaaag 2700atcagcaacg aaaattcggt gaactataag aacgaatctt ctgccaagcg ggaggagggc 2760aactttgata cactcacatc ttccatcaat gcgcgcgcta atgcactctt gttgacgacg 2820gttaccagga aaaagaagga gactgagatg cttggggaag aggcaaggga gttactgtcc 2880cacttcaacc atcagaatat ggacgccttg ttaaaggtta cccgaaacac gttagaggca 2940attcgtaagc gtattcactc aagccacacg ataaacttcc gagactcaaa ctcagcatca 3000aatatgaagc aaaactcctt gccgatcttc agagccagcg tcaccctggc catacctaac 3060atcgttatgg caccggcact tgaggacgta caacaaacct tgaacaaagc agtagagtgc 3120atcatcagcg tccctaaagg agttcgccaa tggtccagtg aactgctatc caagaagaag 3180atccaggagc gtaaaatggc tgcgttacag agtaacgaag attcggactc tgacgttgaa 3240atgggtgaaa acgagctcca agacacactg gagattgcga gcgttaacct gcctataccc 3300gtccagacca agaactacta caaaaacgtg tccgaaaaca aggagatcgt caagctcgtt 3360tctgtgctca gcaccatcat aaattcgact aagaaagaag ttataacttc catggattgt 3420ttcaaacggt ataaccacat ctggcagaaa ggcaaggagg aagctatcaa gacatttatt 3480acccagtccc cactactaag cgagttcgag tctcagatcc tttacttcca gaatcttgag 3540caggagatca acgcggagcc cgaatatgtg tgcgtcggct cgatagccct gtacacggct 3600gatctgaaat ttgcgctgac cgcggagacg aaggcttgga tggtggtgat tggccgacac 3660tgcaacaaga agtaccggtc tgaaatggag aacatcttta tgctaatcga ggaatttaac 3720aaaaagctga accgtcccat taaggatctg gacgacatca ggattgccat ggcggcccta 3780aaggaaatta gagaggagca gatatccatt gattttcagg ttggccccat cgaagaatca 3840tatgcccttc tgaatcgata cggtctatta atcgcccgag aggaaataga taaggtggac 3900acacttcatt acgcatggga gaaactctta gcgcgggccg gcgaagtgca gaataagctc 3960gtatcgctgc aaccatcatt taagaaggag ctcatcagtg ctgtcgaggt 
ctttctgcag 4020gactgccacc agttctatct ggattatgac ttaaacggtc cgatggcgag tggtctgaag 4080ccccaagagg cttcagaccg gcttatcatg ttccaaaatc agtttgacaa tatttaccga 4140aagtatatca cctatacagg gggtgaagaa ttgtttggtc tcccagccac ccagtatcca 4200caattattgg aaataaaaaa gcagctgaac cttttacaaa aaatctacac tctctataat 4260tcggtaattg aaactgttaa ttcctactac gatattctct ggagcgaggt caacattgag 4320aaaattaata acgaactctt ggagtttcaa aacagatgcc gcaagttgcc gagagcgctg 4380aaggactggc aggcttttct cgaccttaag aaaataatcg atgatttcag tgaatgctgt 4440cctctcttag aatacatggc gagtaaggct atgatggaga gacactggga gaggattacg 4500actctgacgg ggcattcttt agacgttggc aacgagtcct tcaagctgcg taatataatg 4560gaggctccac ttctcaaata caaagaggaa atagaagaca tctgtatctc tgctgtcaaa 4620gagcgcgaca tagaacagaa actaaagcag gtcattaacg aatgggacaa taaaacgttt 4680acatttggca gtttcaagac acgtggagaa ttattgcttc gtggcgactc cacctcggaa 4740attatcgcta acatggagga ctctctcatg ttactcggct cgctgttatc gaaccggtat 4800aatatgccat tcaaagcaca gatccagaag tgggtgcagt atctatctaa tagcacggat 4860attatagaga gctggatgac cgtccaaaat ctctggatct acctggaggc ggtgtttgtg 4920ggaggtgata ttgcgaagca gcttccaaag gaggccaaaa gattctccaa catcgacaag 4980tcctgggtca agattatgac tcgggcccac gaagtgccct ccgtggtgca gtgctgcgtt 5040ggggacgaaa ccttgggcca gctgttgccc cacctgttgg atcaattgga aatctgccaa 5100aagagcctga cgggctacct agaaaaaaag cgtctgtgct ttccccggtt ctttttcgtt 5160tctgaccctg cactactcga aatcttgggt caggcctcag attctcacac aattcaggct 5220catttgttaa atgtgtttga caatatcaaa agtgtgaaat ttcatgaaaa gatttatgac 5280aggatcttgt ccatttcatc ccaagagggg gaaaccattg agcttgataa gcctgtgatg 5340gcagagggaa acgtggaggt ctggcttaac agtctcctgg aagagtccca gtcctcactg 5400cacctggtca tccgccaggc ggcggctaat atccaggaga caggattcca gctcacggaa 5460ttccttagtt cgtttccggc gcaagtgggg ctcctcggaa ttcagatgat ctggacgaga 5520gattcggagg aagccctccg caacgccaag tttgacaaga agattatgca gaaaactaac 5580caagccttcc tagagctcct caacaccctg atcgatgtca caacacgtga tctatcgtct 5640accgagcggg tcaagtatga gacactgatt accatacacg tgcaccagcg tgatatattc 5700gacgatctct gccacatgca 
cataaagagt cccatggact tcgaatggct aaaacagtgc 5760aggttctact ttaacgaaga ctcggataag atgatgatcc atatcacaga tgtagcgttt 5820atttaccaaa acgagttcct tggctgcaca gacaggttag tcataactcc gttaactgat 5880cgctgctaca ttacactcgc ccaagcgctt ggaatgtcca tgggtggagc ccccgcaggg 5940ccggcgggga ctggtaagac cgaaacaact aaagatatgg gccgttgcct cgggaagtat 6000gtagtagttt ttaactgctc agaccaaatg gatttccgag ggctgggccg tatctttaaa 6060gggctggcgc aatccggttc ctggggctgt tttgacgagt tcaatcgtat tgatttaccg 6120gtgctaagtg ttgccgcaca gcaaattagt ataatattga cttgtaagaa agaacacaag 6180aaaagtttta tatttactga tggcgacaac gtcactatga atcctgaatt cgggcttttc 6240ttgactatga acccagggta tgctggccgt caagaacttc ctgaaaatct gaagatcaac 6300tttcgatcgg tggctatgat ggtaccggat cgccagatca ttatccgggt aaaactggcg 6360tcgtgtggct tcatcgacaa cgtggtactt gctcgaaagt tcttcaccct ttacaagcta 6420tgtgaggagc agttatcgaa acaagttcat tacgactttg ggctccggaa tatcttgtcc 6480gtcttacgca cactcggagc ggctaaacgt gcaaatccca tggacactga gagtacgatt 6540gtgatgaggg tgttaaggga tatgaatctc tcaaaattaa tagacgagga cgagcctctt 6600tttctctccc ttatagagga tctgttccca aacattctcc tggacaaggc gggatatccc 6660gagttggaag cggcgatcag caggcaggtg gaggaggccg gattgattaa tcacccgccc 6720tggaaactga aagtcatcca gctgttcgag actcagcggg tccgacacgg tatgatgact 6780ttaggcccct ctggcgcggg gaaaaccacc tgcatccaca ccctgatgag ggctatgacc 6840gattgtggga agcctcaccg tgagatgcgg atgaacccga aggcgatcac agcgccccaa 6900atgtttgggc gtctggatgt ggcgacaaat gactggaccg atggaatctt ttccacactc 6960tggaggaaga ccctgcgcgc aaaaaaagga gagcacatct ggatcattct cgatggcccc 7020gttgacgcta tttggatcga aaacttaaac agcgtgctcg acgacaacaa gaccctgaca 7080ttggcaaatg gtgaccggat tcctatggct cccaattgca aaatcatttt tgaacctcac 7140aacatcgaca acgccagtcc ggctacggtg tcccgcaacg gtatggtttt catgagcagt 7200tccatactgg attggagtcc gatattggaa ggatttctca agaaacgcag tccccaggag 7260gcggaaattc tgcggcaact gtatacggaa agtttccctg acctgtaccg cttctgtatt 7320caaaatctcg agtataagat ggaggtgttg gaggccttcg tcatcacaca gtccattaac 7380atgcttcagg gcttgatccc cttgaaggag caaggagggg aagtcagcca 
ggcacatcta 7440gggcggcttt tcgttttcgc ccttctctgg tccgcgggtg ctgctctcga gctagacggt 7500cgccggcgct tggagttgtg gctgaggtct cgcccgaccg ggacactcga gctgccgccg 7560ccagccggac ccggcgacac tgcattcgac tactacgtag ctccggacgg cacctggacc 7620cactggaaca cccgtacgca ggagtatctc tatcccagcg atacaactcc tgagtatggt 7680agcatactcg ttccgaacgt agacaacgtc agaaccgact ttctgatcca gaccattgct 7740aagcagggca aggcagtcct attgatcgga gagcaaggga ccgcgaaaac cgtgattatc 7800aagggcttca tgagtaaata tgacccagag tgtcatatga ttaagtccct taacttcagt 7860tctgctacca caccactcat gtttcagcgt actatcgaat cctacgtgga caagcggatg 7920ggcaccacct acgggccgcc tgccgggaag aagatgacgg tatttataga cgacgttaac 7980atgcccatca tcaacgagtg gggagatcaa gtgaccaacg aaatcgttcg gcaacttatg 8040gaacaaaacg ggttctataa cctcgagaag ccgggcgagt tcacctcaat agtagacatt 8100caatttctgg cagctatgat ccaccccgga ggaggacgga acgacattcc ccagcggctc 8160aagcgccaat tcagcatctt caactgcacg ctgccaagtg aagcatcggt agacaaaatc 8220ttcggcgtca tcggggtggg tcactactgc acccagcgcg gcttttcaga ggaagtccga 8280gattctgtta ctaaactggt tcctttgact agaaggctgt ggcagatgac caaaattaaa 8340atgcttccta ctccagctaa attccactac gtgttcaatc tgcgagactt atccagggta 8400tggcaaggta tgcttaatac cacctctgag gtaattaaag aaccgaacga tctgctgaaa 8460ctgtggaagc acgaatgtaa gagagttata gcagataggt ttactgtatc gtcagatgtt 8520acctggttcg ataaagcact ggtgtctttg gtcgaagaag agtttgggga agagaagaag 8580ctactcgtcg attgcgggat cgacacttac tttgtggact tcttaagaga cgcccctgag 8640gctgccggcg aaacatcgga agaggcagac gctgagactc ctaagatcta tgagccgatc 8700gagagcttta gccacctcaa ggagcgtctg aatatgtttt tacaattgta taacgagagc 8760attcgtggtg ctgggatgga tatggtgttc ttcgcggatg caatggtgca tctggttaag 8820atttctcggg tcattcgcac gccacaggga aacgcgctac tggtcggggt gggtgggtca 8880ggaaagcaat cgttaactcg tttggcatcc tttatcgcgg gatatgtgag ttttcaaatc 8940acactgacaa ggtcctataa tacatccaac ctgatggagg atcttaaagt tctttacagg 9000acggcgggac aacagggcaa aggaataacc ttcatcttca cagataacga aattaaagac 9060gaatcattct tggagtatat gaacaacgtc ttatcaagcg gcgaagtttc gaacctcttt 9120gcgcgggacg aaatcgatga 
gatcaactct gatctcgctt ctgtcatgaa gaaagaattc 9180ccccgctgtt tgccgacgaa cgagaacttg catgactatt tcatgtcccg tgtgcggcag 9240aatttgcaca ttgtgctttg cttttcaccg gtgggcgaga agttccgaaa tagagctttg 9300aagttccctg ctttgatttc tgggtgtact attgactggt tttcccgttg gcccaaggac 9360gctctggtcg ccgtgtccga gcacttttta accagctatg atatcgactg cagcctcgaa 9420attaagaagg aagtagttca gtgtatgggc tctttccaag acggtgtggc agagaagtgc 9480gtcgactatt tccagaggtt tcgccgatct actcatgtca cacctaagag ctacttgtcc 9540ttcatacagg ggtacaagtt tatatacggg gagaaacacg ttgaagtaag gactctggcg 9600aaccgtatga atactggctt agagaagctc aaggaggcct cagaaagtgt ggctgctctc 9660tcaaaggaac ttgaagctaa ggagaaggaa ctccaagttg cgaacgataa agcggatatg 9720gttctaaaag aagtcactat gaaagcacaa gcggctgaaa aggttaaggc ggaggtacag 9780aaggtgaaag atcgcgccca ggcaatagtc gattccattt ccaaagacaa ggccatcgcg 9840gaagaaaagc tagaagccgc caagccggcc ttagaagagg cagaggctgc cttgcaaacc 9900ataagaccga gcgacatcgc gacggtacga acccttggtc gtcctcctca tttgattatg 9960aggattatgg actgtgtcct gcttttattt caacgtaaag tatctgcagt taagattgat 10020ttggagaaat cctgtaccat gccctcatgg caggaatccc tgaaattgat gaccgcgggc 10080aatttccttc aaaatctaca acaattcccc aaggacacca ttaacgaaga ggtcatagaa 10140tttttatctc catattttga gatgccagat tacaatatag aaacagcgaa gcgcgtctgt 10200ggaaacgttg caggtctctg ttcgtggacc aaagctatgg cctccttctt tagtatcaat 10260aaagaggtac taccactgaa agccaacctg gtggtacagg agaaccggca tctgcttgct 10320atgcaggatc ttcagaaggc ccaggccgaa ttagacgaca agcaggctga gcttgacgtt 10380gtgcaagcag aatatgaaca agctatgact gaaaagcaga ctttattaga ggacgctgaa 10440cgctgcagac ataagatgca gactgcaagc accctcatat ccgggttggc tggagaaaaa 10500gagcggtgga

cagagcagtc gcaagaattc gctgctcaaa ccaaaaggtt ggttggagac 10560gttctactcg cgacagcttt cctctcctat tctggtcctt tcaaccagga attccgggac 10620cttttgctga atgactggag aaaagaaatg aaggctcgca agataccatt tggtaagaat 10680ttgaacttgt ctgaaatgct tattgacgca cccactatat cagagtggaa tcttcaggga 10740cttccaaatg acgatctgtc catccaaaac ggaattattg taaccaaggc gagtcgctac 10800cctctgctca ttgacccgca gacacagggc aagatctgga ttaaaaataa ggaaagcagg 10860aacgaattgc agatcactag tctcaaccac aagtacttcc gtaaccacct cgaagatagt 10920ctgtccctgg gaaggccgct tctgatcgag gacgttggcg aggagctgga tcctgcgctg 10980gacaacgttc ttgagcgcaa cttcatcaag accgggtcca ccttcaaggt aaaggtggga 11040gacaaggagg tggacgtgct ggacggcttt cgcctatata tcaccacgaa gctgcctaac 11100ccggcgtaca cgcccgagat cagtgcgcgt acgagcatca ttgacttcac cgtgactatg 11160aaaggcctcg aagatcagct gctcggtcgc gtcattctca cggaaaagca ggagctggag 11220aaggagcgaa cgcacctgat ggaggacgtg acggccaaca agcggcgtat gaaagagcta 11280gaggacaacc tgctgtaccg cctcacgtca acccaggggt cgctggtcga ggacgaatcc 11340ctgatcgtgg tgctgagtaa taccaagcgg acagcagagg aggtcacgca gaaactcgag 11400atctcggcgg agaccgaggt gcagatcaac agcgcgcggg aggagtacag gccggtggcc 11460acccgcggga gcatcttata cttcctgatc actgagatgc gccttgtgaa tgagatgtac 11520cagacaagcc tgcggcagtt ccttggcctg ttcgatcttt cgctggcccg gtccgtcaag 11580tctccgatta cctccaagcg gatcgctaac ataattgaac acatgacgta cgaggtgtac 11640aagtacgcgg cgaggggcct ctacgaagag cacaagttcc tgttcacgct gctcctcacg 11700ctcaagatcg acatccagcg caaccgcgtc aagcacgagg agtttctcac cctgataaag 11760gggggagcgt ccctggacct gaaggcctgt ccgccgaagc cgtcgaagtg gatcctggac 11820ataacgtggc tcaacctcgt cgagctgtcc aagctccgtc agttttcgga cgtgctcgac 11880cagatttcgc ggaacgagaa gatgtggaaa atatggttcg acaaggagaa tccagaggag 11940gagcccttgc ccaacgcgta tgacaagtcg ctcgactgct tccgtcgcct gctgctgatc 12000cgcagctggt gccccgaccg gacgatcgcg caggcgagga agtacatcgt tgacagtatg 12060ggtgagaaat acgcggaggg cgttattctg gatctggaga agacttggga ggagagcgac 12120ccccgcaccc ccctgatctg ccttctgtct atggggtccg acccgaccga tagcatcatt 12180gctctgggga agcggctcaa 
gatcgagacc cggtacgtgt ccatgggaca ggggcaggag 12240gtgcatgccc gcaagctcct gcagcagact atggcgaacg ggggttgggc gctcttacag 12300aactgccatc tggggctcga cttcatggat gaactcatgg acatcatcat cgagacggaa 12360ctcgtgcacg acgcattccg cctgtggatg accaccgagg cgcacaagca gttcccgatc 12420acgttgctgc agatgtccat caagttcgcc aacgaccctc cgcagggcct ccgggcgggc 12480ctgaagcgca cgtatagcgg cgtgtctcag gatctccttg atgtcagctc ggggagccag 12540tggaagccga tgctctatgc cgtggcattt ctacactcga ccgtccagga gcggcgaaag 12600tttggagcgc tggggtggaa cataccctac gagtttaacc aggccgactt caacgccacc 12660gtgcagttca tccagaacca tttggacgat atggatgtga agaagggggt gtcctggacg 12720accatacggt atatgatcgg cgagatccag tatggggggc gggtcacgga cgactacgac 12780aagcggttgc tgaacacgtt cgcgaaggtc tggttcagcg agaatatgtt cgggcccgat 12840ttttcctttt accagggcta caatataccc aagtgctcca cggtcgacaa ctaccttcag 12900tacatccaga gcttgcccgc atacgacagc ccggaagtct tcggactcca ccccaacgcc 12960gacatcacgt accagagcaa gctggccaag gacgtgctcg acaccattct cggcatccag 13020ccgaaggaca cgtccggcgg gggggacgag acgcgggagg ccgtcgtcgc gcgcttggca 13080gatgacatgc tggagaagct cccccccgat tacgtcccgt tcgaggtcaa ggaaaggctc 13140cagaagatgg gcccgttcca gcccatgaac atcttcctcc gccaggagat cgaccggatg 13200cagcgcgtgc ttagcctggt gcgctcaacg ctgacggagc tgaagctggc catcgacggg 13260acgatcatta tgtcggagaa cctccgggac gcgctggact gcatgttcga cgcgcgtatc 13320ccggcctggt ggaagaaggc gtcgtggatc tccagcaccc tggggttctg gttcacggag 13380ctgatcgagc gcaactcgca attcacctcc tgggtgttca acgggcggcc ccactgcttc 13440tggatgacgg gcttcttcaa cccgcagggc ttcctgacgg ctatgcggca ggaaatcacc 13500cgggcgaaca agggttgggc gctcgacaat atggtgctct gcaatgaggt cacgaagtgg 13560atgaaggacg acatctcggc gcctcccact gaaggggtct acgtctacgg cctgtacctc 13620gagggggcgg gctgggacaa gcgtaacatg aagctgatcg agtcgaagcc caaggtcctg 13680ttcgagctga tgcccgtcat ccgcatctac gccgagaaca acacgctgcg cgacccgcgg 13740ttctactcgt gccccatcta caagaagccg gtgcggacgg acctcaacta catcgccgcc 13800gtcgacctcc gcaccgcgca gacccccgag cactgggtgc tgcggggggt cgcactgctc 13860tgcgacgtca agtag 
138753013875DNAArtificial SequenceSynthetic polynucleotide 30atgttcagaa taggaaggag acaactatgg aagcacagcg tgacgcgggt acttacccaa 60cgtctaaagg gggagaagga ggcgaagcgg gcactgctag acgcgcgtca taattacctc 120tttgcaatag ttgccagctg cctcgacctc aacaagacgg aggtagagga cgccatatta 180gagggcaacc agattgagcg gatcgatcag ctatttgccg tgggcgggct ccggcatcta 240atgttttact accaggacgt cgaggaagct gagaccgggc aactgggatc cctgggaggc 300gtcaacctcg tctccggcaa gataaaaaag cctaaggttt tcgttacaga gggcaacgac 360gtagcgctga ctggtgtatg cgtcttcttc atccggacag accccagcaa ggcaattacg 420ccagacaaca tccaccagga ggtctcgttt aacatgctcg acgctgccga tggcgggctg 480ctgaactcgg tgcgccggct gctctcggat atctttatcc ccgcgcttcg ggcgacgagc 540cacgggtggg gtgagctgga aggcctacag gacgcggcca atattcgtca ggagttccta 600tccagcctgg aaggttttgt taacgtgctg tccggcgccc aggagtcgct taaggagaag 660gtgaacttac gaaagtgtga tatattagag ctgaaaaccc tgaaggaacc tacagactat 720ctcaccctcg caaacaaccc cgaaaccctc ggcaaaattg aagattgcat gaaggtgtgg 780attaagcaga cggaacaagt cctggcagag aacaaccaac tcttgaagga ggccgacgac 840gtgggcccgc gcgctgagct ggagcactgg aagaagaggc tcagcaagtt taactatctt 900cttgagcagc tgaagagccc ggacgttaag gcggtactag cggtcctcgc ggctgcgaag 960tcgaagctgc tcaagacctg gcgtgagatg gacatacgca tcacggacgc aaccaacgaa 1020gctaaggaca acgttaagta tttgtatacc ctcgagaagt gctgcgaccc cctctactca 1080tctgatccgc tcagtatgat ggatgccatc cccacgctaa ttaacgccat taagatgatc 1140tactcgatat cgcactatta caacacgtct gaaaaaatca ccagcctctt cgtaaaagtg 1200actaaccaaa tcatcagcgc ctgcaaggct tacatcacta acaacggcac cgccagtata 1260tggaaccagc cccaggacgt cgtggaggag aagatcctat cggccataaa gctgaagcag 1320gagtatcagc tgtgcttcca caaaacaaag cagaaactca agcagaaccc aaatgctaag 1380cagttcgact tttctgagat gtacattttc gggaagtttg aaacatttca tcgccgcctg 1440gccaaaatca tcgacatatt caccactctg aagacctact cagtcctaca agacagcact 1500atagaagggc tagaggatat ggccacaaag taccagggca tcgtggccac tatcaaaaag 1560aaggagtaca acttcttaga ccagcgtaaa atggatttcg accaggacta tgaagaattc 1620tgcaaacaaa cgaatgattt gcacaacgag cttcggaaat tcatggatgt gacttttgcc 
1680aaaatacaga ataccaacca agctcttagg atgttaaaga aatttgaaag gcttaatatt 1740cctaatttgg gcattgatga caaataccag ttgatactcg aaaattatgg agcagatatt 1800gatatgatct ctaagctgta cacaaaacaa aaatatgatc ccccgctagc tagaaatcaa 1860cctccgattg ctggtaagat actctgggcc agacagctct ttcaccgcat ccagcagccc 1920atgcagctgt ttcagcagca ccctgcggtg ctgtccaccg ccgaagcgaa acccattatt 1980cgatcttata accgcatggc caaggttctg ttagagtttg aagttttgtt ccaccgtgcc 2040tggttacgtc agatcgagga gatccatgtg ggactggagg cctctctcct agtcaaggcc 2100cccggcacag gcgaactctt tgtcaatttt gatccccaga ttctaatact cttccgggaa 2160accgagtgca tggcccagat gggcttagag gttagtcctc tggctacttc tctgttccag 2220aagagagacc gctataaacg gaatttcagc aatatgaaga tgatgctcgc tgaatatcag 2280agggtcaagt ccaaaatccc cgctgcgatc gagcagctga tagtgccaca cctggccaaa 2340gtagatgagg ccctacaacc aggactggcc gcgctgacgt ggacctctct gaatatcgaa 2400gcgtatttgg agaacacctt tgccaagatc aaggacctgg agcttttact ggacagagtg 2460aacgatctca ttgaattccg catagacgcg attttagagg agatgtcttc cacgccacta 2520tgccagcttc ctcaggagga gccgttaaca tgtgaagagt tcctgcagat gactaaggac 2580ctctgcgtga atggcgctca gatactacat ttcaagtcta gcttggtcga ggaggcagtg 2640aacgaattgg ttaacatgtt actggatgta gaagtccttt ccgaggagga atccgaaaag 2700atcagcaacg aaaattcggt gaactataag aacgaatcta gcgccaagcg ggaggagggc 2760aactttgata cactcacttc ttccatcaat gcgagggcta atgctctctt gttgacaacc 2820gttaccagaa aaaaaaagga gactgagatg cttggggaag aggcaaggga gttgctgtcc 2880cacttcaacc atcagaatat ggacgccctc ttaaaggtta cccgaaacac gttagaggca 2940attaggaagc gtattcactc aagccacacg ataaacttca gagactcaaa ctctgcatca 3000aatatgaagc aaaactcctt gccgatcttc agagccagcg tcaccctggc catacctaac 3060atcgttatgg caccggcact tgaggacgta caacaaacct tgaacaaagc agtagagtgc 3120atcatcagcg tccctaaagg agttcgccaa tggtccagtg aactgctatc caagaagaag 3180atccaggagc gtaaaatggc tgcgttacag agtaacgaag attcggactc tgacgttgaa 3240atgggtgaaa acgaactcca agacacactg gagattgcga gcgttaacct gcctataccc 3300gtccagacca agaactacta caaaaacgta tccgaaaaca aggagatcgt caagctcgtt 3360tctgtgctca gcaccataat aaattcgact 
aagaaagagg ttataacttc catggattgt 3420ttcaaacggt ataaccacat ctggcagaaa ggcaaggaag aagctatcaa gacatttatt 3480acccagagcc cactactaag cgagttcgag tctcagatcc tctacttcca gaatcttgag 3540caggagatca acgctgagcc cgaatatgtg tgcgtcggct cgatagccct gtacacggct 3600gatctgaaat ttgccctgac cgctgagact aaggcttgga tggtggtgat tggccgacac 3660tgcaacaaga agtaccggtc tgaaatggag aacatcttta tgctaatcga ggaatttaac 3720aaaaagctga accgtcccat taaggatctg gacgacatca ggattgccat ggcggcccta 3780aaggaaatta gagaggagca gatatccatt gattttcagg ttggccccat cgaagaatca 3840tatgcccttc tgaatcgata cggtctatta atcgcccgag aggaaataga taaggtggac 3900acacttcatt atgcatggga gaaactctta gcgcgggccg gcgaagtgca gaataagctc 3960gtatcgctgc agccatcatt taagaaggag ctcatcagtg ctgtcgaggt ctttctgcag 4020gactgccacc agttctatct ggattatgac ctgaacggtc cgatggcgag tggtctgaag 4080ccccaagagg cttcagaccg gcttatcatg ttccaaaatc agttcgacaa tatttaccga 4140aagtatatca cctatacagg gggtgaagaa ttgtttggtc tcccagccac ccagtatcca 4200caattattgg aaataaagaa gcagctgaac cttcttcaaa aaatctacac tctctataat 4260tcggtaattg aaactgttaa ttcctactac gatattctct ggagcgaggt caacattgag 4320aaaattaata acgaactctt ggagttccaa aacagatgcc gcaagctgcc gagagcgctg 4380aaggactggc aggcttttct cgaccttaag aaaataatcg atgatttcag tgaatgctgt 4440cctctcttag aatacatggc gagtaaggct atgatggaga gacactggga gaggattacg 4500actctgacgg ggcattcttt ggacgttggc aacgagtcct tcaagctgcg taatataatg 4560gaggctccac ttctcaagta caaagaggaa atagaagaca tctgtatatc tgctgtcaaa 4620gagcgcgaca tagaacagaa actaaagcag gtaattaacg agtgggacaa taaaacgttt 4680acatttggca gtttcaagac acgtggagaa ttattgcttc gaggtgactc cacctcggaa 4740attatcgcta acatggagga ctctctcatg ttactcggct cgctgttatc gaaccggtat 4800aatatgccat tcaaagcaca gatccagaag tgggtgcagt atctatctaa tagtacggat 4860attatagaga gctggatgac cgtccagaat ctctggatct acctggaggc ggtgtttgtg 4920ggaggtgata tagcgaagca gcttccaaag gaggccaaaa gattctccaa cattgacaaa 4980tcctgggtca agattatgac tcgggcccac gaagtgccct ccgttgtgca gtgctgcgtc 5040ggggacgaaa ccttgggcca gctgttgccc cacctgttgg atcaattgga aatctgccaa 
5100aagagcctga ctggctacct agagaaaaag cgtctgtgct ttccccggtt cttcttcgtt 5160tctgaccctg cactactcga aatcttgggt caggcctctg attctcacac aattcaggct 5220catttgttaa atgtgtttga caacatcaaa agtgtgaaat ttcatgaaaa gatttatgac 5280aggatcttgt ccatttcatc ccaagaggga gaaaccattg agcttgataa gcctgtgatg 5340gcagagggaa acgtggaagt ctggcttaac agtctcctgg aagagtccca gtcctcactg 5400cacctggtca tccgccaggc ggcggctaat atccaggaga caggattcca gctcacggaa 5460ttccttagtt cgtttccggc gcaagtgggg ctcctcggca ttcagatgat ctggacgaga 5520gattcggagg aagccctccg caacgccaag tttgacaaga agattatgca gaaaactaac 5580caagccttcc tagagctcct caacactctg atcgatgtca caactcgtga tctatcgtct 5640accgagcggg tcaagtatga gacactcatt accatacacg ttcaccaacg tgatatattc 5700gatgatctat gccacatgca cataaagagt cccatggact tcgaatggct aaaacagtgc 5760aggttctact ttaatgaaga ctctgataaa atgatgatac acatcacaga tgttgctttt 5820atctatcaaa acgaattttt aggttgtacc gacagactag taatcactcc attgacagat 5880agatgttata ttacacttgc tcaggctcta ggtatgagta tgggtggagc accagcgggt 5940ccggcaggaa caggtaaaac agaaacgaca aaggatatgg gacgttgttt aggtaaatat 6000gtagtcgtat ttaactgttc tgaccaaatg gattttcgtg gtcttggtag aatttttaaa 6060ggtttagctc aatcaggttc ttggggttgt tttgacgaat ttaatcggat agatttgcct 6120gttttatctg tcgcggctca acaaatctcc ataattttaa cttgtaaaaa agaacataaa 6180aaaagtttca tttttaccga tggtgataac gttacgatga accctgaatt tggtttgttc 6240ttaactatga atccaggata cgccggccgt caagagttac ctgaaaattt gaaaataaac 6300tttagatcag tggctatgat ggttcctgat cgccagatta ttatccgagt caaattagca 6360agttgtggat ttattgataa cgttgtttta gcaagaaagt tttttacact atacaaatta 6420tgtgaggaac aattgtctaa acaagtacat tacgattttg gtctaagaaa tatattatca 6480gtattgcgaa cattaggagc agctaagaga gcaaatccaa tggatactga atcaactatt 6540gtaatgcgcg ttttaagaga tatgaattta tcaaagttaa ttgatgaaga cgaacctctt 6600tttctcagcc ttatagagga tctgtttcca aacattctcc tggacaaggc gggatatccc 6660gagttggaag cggccattag caggcaggtg gaggaggccg gattgattaa tcacccgccc 6720tggaaactga aagtcatcca gctgttcgag actcagcggg tccgacacgg tatgatgact 6780ttaggcccat ctggcgcggg caaaaccacc 
tgcatccaca cccttatgag ggctatgacc 6840gactgtggga agcctcaccg tgagatgcgg atgaacccga aggcgatcac tgcgccccaa 6900atgtttgggc gtctggatgt ggcgacaaat gactggaccg atgggatctt ttccacactc 6960tggaggaaga ccctgcgcgc caaaaaagga gagcacatct ggatcattct cgatggcccc 7020gttgacgcta tttggatcga aaacttaaac agcgtgctcg acgacaacaa gaccctgaca 7080ttggcaaatg gtgaccggat tcctatggct cctaattgca aaataatttt tgaacctcac 7140aacatcgaca acgccagtcc ggctacggtg tcccgcaacg gtatggtttt catgagcagt 7200tccatactgg attggagtcc gatattggaa ggatttctca agaaacgcag tccccaggag 7260gcggaaattc tgcggcaact gtatacggaa agttttcctg acctgtaccg cttctgcatt 7320caaaatctcg agtataagat ggaggtgttg gaggccttcg tcatcacaca gtccattaac 7380atgcttcagg gcttgatccc cttgaaggag caaggggggg aagtcagcca ggcacatcta 7440gggcggcttt ttgttttcgc cctgctctgg tccgcgggtg ctgctctcga gctagacggt 7500cgccggcgct tggagttgtg gctgaggtct cgcccgaccg ggacactcga gctgccgccg 7560ccagccggac ccggggacac tgcattcgac tactacgtag ctccggacgg cacctggacc 7620cactggaaca cccggacgca ggagtatctc tatcccagcg atacaactcc tgagtatggt 7680agcatactcg ttccgaacgt agacaacgtc agaaccgact ttctgatcca gaccattgct 7740aagcaaggca aggcagtcct attgatcgga gagcaaggga ccgcgaaaac cgtgattatc 7800aagggcttca tgagtaaata tgacccagag tgtcatatga ttaagtcact caacttcagt 7860tctgctacca caccactcat gtttcagcgt actatcgaat cctacgtgga caagcggatg 7920ggcaccacct acgggccgcc tgccgggaag aagatgacgg tatttataga cgacgttaac 7980atgcccatca tcaacgagtg gggagatcaa gtgaccaacg aaatcgttcg gcaacttatg 8040gaacaaaacg ggttctacaa cctcgagaag ccgggcgagt tcacctcaat agtagacatt 8100caatttctgg cagctatgat ccaccccgga ggaggacgaa acgacattcc ccagcggctg 8160aagcgccaat tcagcatctt caactgcacg ctgccaagcg aagcatcggt agacaaaatc 8220ttcggcgtca tcggggtggg tcactactgc acccagcgcg gcttttcaga ggaagtccga 8280gattctgtta ctaaactggt tcctttgact agaaggctgt ggcagatgac caaaattaaa 8340atgcttccta ctccagctaa attccactac gtgttcaatc tgcgagactt atccagggta 8400tggcaaggta tgcttaatac cacctctgag gtaattaaag aaccgaacga tctgctgaaa 8460ctgtggaagc acgaatgtaa gagagtcatt gcagataggt ttactgtatc gtcagatgta 
8520acctggttcg ataaagcact ggtgtctttg gtcgaagaag agtttgggga agagaagaag 8580ctactcgtcg attgcgggat cgacacttac tttgtagact tcttaagaga cgcccctgag 8640gctgccggtg aaacatcgga agaggcagac gctgagactc ctaagatcta tgagccgatc 8700gagagcttta gccacctcaa ggagcgtctg aatatgtttt tacaactcta taacgagagc 8760attcgtggtg ctgggatgga tatggtgttc ttcgcggatg caatggtgca tctggtcaag 8820atctctcggg tgattcgcac gccacaggga aacgcgctac tggtcggggt gggtgggtca 8880ggaaagcaat cgttaactcg tttggcatcc tttatcgcgg gatatgtgtc atttcaaatc 8940accctgacaa ggtcctataa tacatccaac ctgatggagg atcttaaagt tctctacagg 9000acggcgggac aacagggcaa aggaataacc ttcatcttca cagataacga aattaaagac 9060gaatcattct tggagtatat gaacaacgtc ttatcaagcg gcgaagtttc gaacctcttc 9120gcgcgggacg aaatagatga gatcaactct gatctcgctt ctgtcatgaa gaaagaattc 9180ccccgctgtt tgccgacgaa tgagaacttg catgactatt tcatgtcccg tgtgcggcag 9240aatttgcaca ttgtgctttg cttttcaccg gtgggggaga agttccgaaa tagagctttg 9300aagttccctg ctttgattag cgggtgtact attgactggt tttcccgttg gcccaaggac 9360gctctggtcg ccgtgtccga gcacttttta acctcttatg atatcgactg cagcctcgaa 9420attaagaaag aagtagttca gtgtatgggc tctttccaag acggtgtggc agagaagtgc 9480gtcgactatt tccagaggtt tcgccgatct actcatgtca cacctaagag ctacttgtcc 9540ttcatacagg ggtacaagtt catatatggg gagaaacacg ttgaagtaag gactctggca 9600aaccgtatga atactggctt agagaagctc aaggaggcct cagaaagtgt ggctgctctc 9660tcaaaggagc ttgaagctaa ggagaaggaa ctccaagttg cgaacgataa agcggatatg 9720gttctaaaag aagtcactat gaaagcacaa gcggctgaaa aggttaaggc ggaggtacag 9780aaggttaagg atcgcgccca ggcaatagtc gattccattt ccaaagacaa ggccatcgcg 9840gaagaaaagc tagaagccgc caagcctgcc ttagaagagg cagaggctgc cttgcaaacc 9900ataagaccga gtgacatcgc gacggtacga acccttggtc gtcctcctca tttgattatg 9960aggattatgg actgtgtcct gcttttattt caacgtaaag tatctgcagt taagattgat 10020ttggaaaaat cctgtaccat gccctcatgg caggagtccc tgaaattgat gaccgcgggc 10080aatttccttc aaaatctaca acaattcccc aaggacacca ttaacgaaga ggtcatagaa 10140tttttatctc catattttga gatgccagat tacaatattg aaactgcgaa gcgcgtctgt 10200ggtaacgttg caggtctctg 
ttcgtggacc aaagctatgg cctccttctt tagtatcaat 10260aaagaggtac taccactgaa agccaacctg gtggtacagg agaaccggca tttgcttgct 10320atgcaggatc ttcagaaggc acaggccgaa ttagacgaca agcaggctga gcttgacgtt 10380gtgcaagcag aatacgaaca agctatgact gaaaagcaga ctttattaga ggacgctgaa 10440cgctgcagac ataagatgca gactgcaagc accctcatat ccgggttggc tggagaaaaa 10500gagcggtgga cagagcagtc gcaagaattt gctgctcaaa ccaaaaggtt ggttggagac 10560gtgctactcg cgacagcttt tctctcctat tctggtcctt tcaaccagga attccgggac 10620cttttgctga atgactggag aaaagaaatg aaggctcgca aaataccatt tggtaagaat 10680ttgaacttgt ctgaaatgct tattgacgca cccactatat cagagtggaa tcttcaggga 10740cttccgaatg acgacctgtc tatccaaaac ggaattattg taaccaaggc gagtcgctac 10800cctctgctca ttgacccgca gacacagggc aagatctgga ttaaaaataa ggaaagcagg 10860aacgaactcc agatcactag tctcaaccac aagtacttcc gtaaccacct cgaagatagt 10920ctgtccctgg gacggccatt gctaatcgag gacgtcggag aagagctgga ccccgcatta 10980gacaacgtcc ttgaaagaaa tttcatcaag acaggatcaa ctttcaaagt taaagtagga 11040gataaagaag tggatgtgtt agatggcttc cgcctatata tcacaactaa actcccgaat 11100cccgcctata ctccagagat cagcgctaga actagcatca tagatttcac tgtaactatg 11160aaggggttag aagatcaatt attaggacgc gtgatcctga cggagaaaca ggaattggaa 11220aaagagcgta cacatctcat ggaagacgtg acagctaaca aacgtcgaat gaaggaactg 11280gaggacaatt tactgtatcg gttgacatca acacagggct cccttgttga ggacgagagt 11340ctgatcgtgg tcctgtctaa cacaaagagg actgctgaag aagtaactca gaaattggag 11400atttctgccg aaactgaagt tcagattaac tccgctagag aagagtatcg tccagtcgct 11460actcggggct cgatcctata cttcctcata actgagatgc gcttggtcaa tgagatgtat 11520cagacttcac tccggcaatt cctgggcttg tttgacttgt cgctggcaag atcagtgaaa 11580tctccaatta

ccagcaagag aatcgcaaac atcattgagc acatgacgta cgaggtgtat 11640aaatacgcgg caaggggact ttacgaagag cataagttcc tcttcaccct actattaacg 11700ttgaagatag atattcagag gaaccgggtg aagcacgagg agtttctaac tctaataaaa 11760ggaggagctt ctttggattt gaaagcctgt ccaccgaaac cttctaagtg gattttagac 11820ataacatggc tgaatctcgt ggagttgtcc aagctccgtc agttcagtga cgttttggat 11880cagatatcca ggaacgagaa gatgtggaag atctggttcg ataaagagaa tccggaggaa 11940gagcccttgc caaatgctta tgataaaagc ctagactgct tcaggaggct tttgctcatc 12000cgctcctggt gccccgaccg cactattgcc caggctagga aatacattgt ggactccatg 12060ggggagaagt atgccgaagg agtcatactc gacttggaga agacttggga agagtcagat 12120ccgaggacgc ccctcatttg tcttctttcc atgggttctg atcccacgga ctctattata 12180gcactcggga aaagactaaa gatcgagaca cgctatgtta gcatgggaca ggggcaagag 12240gtccatgccc gcaaactact acagcagact atggctaatg gaggttgggc tctgttacag 12300aactgtcact taggcttaga ttttatggac gaattgatgg acatcataat tgagacggag 12360ctagtccacg acgcatttcg cttatggatg acgactgaag cacacaagca gtttccgata 12420accctgttgc agatgtccat caagttcgcc aatgaccctc cccagggcct tagggcaggt 12480ctcaaaagga cctacagcgg cgtttcccag gatttacttg acgtctcctc cggatcccag 12540tggaagccca tgttgtacgc tgtggctttc cttcacagca cagttcagga aaggcggaag 12600tttggtgcgc taggctggaa catcccctac gagttcaacc aggctgactt taatgcaaca 12660gtacagttta ttcaaaatca tctggatgac atggatgtta aaaaaggtgt gtcatggact 12720acaataaggt acatgattgg ggagatacag tacggaggcc gggtaactga tgattatgac 12780aagcggctac tgaacacttt cgctaaagtg tggttttctg agaatatgtt cggtccagat 12840ttcagcttct accaaggtta taacattcca aagtgctcca cagtcgacaa ctacctccaa 12900tatattcaaa gtttacctgc atatgatagt cctgaagttt ttggtttgca tcctaatgca 12960gatataacat atcaaagtaa attagcaaaa gacgtcttag atacaatact aggaatccaa 13020ccaaaagata catccggtgg gggagacgaa actcgagaag ccgttgttgc aagattagca 13080gatgatatgt tagaaaaatt acctcctgac tatgtacctt ttgaagttaa agaacgtttg 13140caaaaaatgg gaccttttca accaatgaat atctttctaa ggcaggagat tgatcgaatg 13200caaagggttt tatccttagt acgatctact ttaacagaac ttaaactagc tatagatggt 13260actataatta tgagtgaaaa 
tttaagagac gcattagatt gtatgtttga tgcaaggatt 13320ccagcttggt ggaaaaaagc atcttggata tcatcaacat tgggattttg gtttactgaa 13380ttgatagaac gtaattctca atttacaagt tgggtattta atggtcgtcc acattgtttt 13440tggatgactg gtttttttaa tccacaagga tttttaactg ccatgagaca ggaaataact 13500agagcaaata agggttgggc attagataat atggttttgt gtaatgaagt aactaagtgg 13560atgaaagacg atataagtgc accaccaact gagggtgttt atgtatatgg tttatattta 13620gaaggagctg gatgggataa acgtaatatg aaattaatag aatcgaaacc aaaagtttta 13680tttgaattaa tgccagttat cagaatttat gcagagaata atacattaag agatcctaga 13740ttttatagtt gtccaattta taaaaaacct gtcagaacag atcttaatta tatagcagcc 13800gtcgatctta gaactgctca aacaccagaa cattgggtat taagaggagt agctttactt 13860tgtgatgtta aatag 138753113875DNAArtificial SequenceSynthetic polynucleotide 31atgttcagaa tcggaaggcg gcaattatgg aagcattcag taactagagt cttgacacag 60aggctcaaag gtgaaaaaga ggcaaagcgt gcgctattgg atgcacggca taattacttg 120ttcgcgatag tagcaagttg tttagatcta aacaaaactg aagtagaaga cgccattctc 180gaaggaaatc aaattgagcg gattgaccaa ttattcgctg ttggaggcct cagacattta 240atgttttact atcaagacgt tgaagaagcg gaaactgggc aactcgggtc actaggtggc 300gttaatttag tctcaggtaa aattaaaaag ccaaaagtat ttgttactga gggcaacgac 360gtggccctaa ctggagtatg tgtgtttttt attcggaccg acccgagtaa ggccattacg 420ccagacaata ttcatcaaga ggtatcattt aacatgctag atgctgctga tggtggactt 480ctgaatagcg tgcgtcgcct gctatctgac attttcattc ctgctttacg cgcgaccagt 540cacgggtggg gggagcttga aggcctccag gacgccgcca atatcaggca ggaattcctg 600tcctccttag aaggttttgt gaatgttctc agcggtgccc aggagtcact aaaagaaaag 660gtgaacttgc ggaagtgtga cattcttgaa ttaaagactc tgaaggagcc aaccgattac 720ctcacgttag ctaacaaccc ggagactcta ggcaaaatcg aggactgcat gaaggtgtgg 780atcaaacaga ctgagcaagt tttagcagaa aacaaccagc tccttaagga agcggatgac 840gtaggccctc gggcggaact tgaacattgg aagaagcggt tgtctaagtt taattatctt 900cttgaacagc ttaaatctcc tgatgtcaaa gcagtgcttg cagtcctcgc tgcagcaaag 960tccaagctgc ttaagacctg gcgtgaaatg gacataagga taactgacgc taccaatgaa 1020gctaaggata acgttaagta cctatacaca ctagagaagt gttgtgatcc 
tttatactcc 1080tctgatccac tgtctatgat ggatgcaata cctacgctaa tcaacgctat taagatgatt 1140tatagtatct ctcactatta taacacatct gagaaaatta cttccctttt cgtgaaagtt 1200acgaaccaga ttattagcgc ctgcaaggct tacataacca ataacggtac agcgagcata 1260tggaatcagc cacaggacgt cgttgaagag aagattttgt ccgccatcaa attgaaacag 1320gagtaccagc tgtgcttcca taaaacaaag cagaagctga agcagaaccc aaacgctaag 1380cagttcgact tctctgagat gtatattttt gggaaatttg aaaccttcca ccgacgtcta 1440gccaaaatta tcgacatctt tacgacgtta aaaacttaca gcgtgctgca agattctacg 1500atagaagggc tagaggatat ggccacaaag taccagggca tcgtggccac tatcaaaaag 1560aaggagtaca acttcttaga ccagcgtaaa atggatttcg accaagacta tgaagaattc 1620tgcaaacaaa cgaatgattt gcacaacgag cttcggaaat tcatggatgt gacttttgcc 1680aaaatacaga ataccaacca agctcttagg atgttgaaga aatttgaaag gttgaatatt 1740cctaatttgg gcattgatga caaataccag ttgatactcg aaaattatgg agcagatatt 1800gatatgatct ctaaactgta cacaaaacaa aaatatgatc ccccgctagc tagaaatcaa 1860cctccgattg ctggtaagat actctgggcc agacaactct ttcaccgcat ccagcagccc 1920atgcagctgt ttcagcagca ccctgcggtg ctctccaccg ccgaggcgaa acccattatt 1980cgatcttata accgcatggc caaggttctg ttagagtttg aagttttgtt ccaccgtgcc 2040tggttacgtc agatcgagga gatccatgtg ggactggagg cctctctcct agtcaaggcc 2100cccggcacag gcgaactctt tgtcaatttt gatccccaga ttttaatact cttccgggag 2160accgagtgca tggcccagat gggcttagag gttagtcctc tggctacttc tctgttccag 2220aagagagacc gctataaacg gaatttcagc aatatgaaga tgatgctcgc tgaatatcag 2280agggtcaagt ccaaaatccc cgctgcgatc gagcagctga tagtgccaca cctggccaaa 2340gtagatgagg ccctacaacc aggactggcc gcgctgacgt ggacctctct gaatatcgaa 2400gcgtatttgg agaacacgtt tgccaagatt aaggacctgg agcttttact ggacagagtg 2460aacgatctca ttgaattccg catagacgcg attttagagg agatgtcttc cacgccacta 2520tgccagcttc ctcaggagga gcctttaaca tgtgaagagt tccttcagat gactaaggac 2580ctctgcgtga atggcgctca gatactacat ttcaagtcta gcttggtcga ggaggcagtg 2640aacgaattgg ttaacatgtt actggatgta gaagtcctta gcgaggagga atccgaaaag 2700atcagcaacg aaaattcggt gaactataag aacgaatcta gcgccaagcg ggaggagggg 2760aactttgata ctctcacttc 
ttccatcaat gcgagggcta atgctctctt gttgacgacc 2820gttaccagaa aaaaaaagga gactgagatg cttggggaag aggcaaggga gttgctgtcc 2880catttcaacc atcagaatat ggacgccctc ttaaaggtta cccgaaacac gttagaagcc 2940attaggaagc gtattcactc aagccacacg ataaatttcc gcgactcaaa ctcagcatca 3000aatatgaagc aaaactcctt gccgatcttc agagccagcg tcaccctggc catacctaac 3060atcgttatgg caccggcact tgaggacgta caacaaacct tgaacaaagc agtagagtgc 3120atcatcagtg tccctaaagg agttcgccaa tggtccagtg aactgctatc caagaagaag 3180atccaggagc gtaaaatggc tgcgttacag agtaacgaag attcggactc tgacgttgaa 3240atgggtgaaa acgagctcca agacacactg gagattgcga gcgttaacct gcctatacct 3300gtccagacca agaactacta caaaaacgtg tcggaaaaca aggagattgt caagctcgtt 3360tctgtgctca gcaccataat aaattcaact aagaaagaag ttataacttc catggattgt 3420ttcaaacggt ataaccacat ctggcagaaa ggcaaggaag aagctatcaa aacatttatt 3480acccagagcc cactactaag cgaattcgag tctcagatcc tctacttcca gaatcttgag 3540caggagatca acgctgagcc cgaatatgtg tgcgtcggct cgatagccct gtacacggct 3600gatctgaaat ttgcgctgac cgctgagaca aaggcttgga tggtggtgat tggccgacac 3660tgcaacaaga agtaccggtc tgaaatggag aacattttta tgctaatcga ggaatttaac 3720aaaaagctga accgtcccat taaggatctg gacgacatca ggattgccat ggcggcccta 3780aaggaaatta gagaggagca gatatccatt gattttcagg taggccccat cgaagaatca 3840tatgcccttc tgaatcgata cggtctatta atcgcccgag aggaaataga taaggtggac 3900acacttcatt atgcatggga gaaactctta gcgcgggccg gcgaagtgca gaataagctc 3960gtatcgctgc agccatcatt taagaaggag ctcatcagtg ctgtcgaggt ctttctgcag 4020gactgccacc agttctatct ggattatgac ctgaacggtc cgatggcgag tggtctgaag 4080ccccaagagg cttcagaccg gcttatcatg ttccaaaatc agttcgacaa tatttacagg 4140aagtatatca cctatacagg gggtgaagaa ttgttcggtc tcccagccac ccagtatcca 4200caattattgg aaataaagaa gcagttaaac cttcttcaaa aaatctacac tctctataat 4260tcggtaatag aaactgttaa ttcctactac gatattctct ggagcgaggt caatattgag 4320aaaattaata acgaactctt ggagttccaa aacagatgcc gcaagttgcc gagagcgctg 4380aaggactggc aggcttttct cgaccttaag aaaataatcg atgatttcag tgaatgctgt 4440cctctcttag aatacatggc gagtaaggct atgatggaga gacactggga 
gaggattacg 4500actctgacgg ggcattcttt ggacgttggc aacgagtcct tcaagctgcg taatataatg 4560gaggctccac ttctcaagta caaagaggag atagaggaca tctgtatatc tgctgtcaaa 4620gagcgcgaca tagaacagaa actaaagcag gtaattaacg aatgggacaa caaaacgttt 4680acatttggca gtttcaagac acgtggagaa ttattgcttc gaggcgactc cacctcggaa 4740attatcgcta acatggagga ctctctcatg ttacttggct cgctgttatc gaaccggtat 4800aatatgccat tcaaagcaca gatccagaag tgggtgcagt atctatctaa tagtacggat 4860ataatagaga gctggatgac cgtccagaat ctctggatct acttggaggc ggtgtttgtg 4920ggaggtgata tagcgaagca gcttcccaag gaggccaaaa gattctccaa cattgacaaa 4980tcctgggtca agattatgac tcgggcccac gaagtgccct ccgtggtgca gtgctgcgta 5040ggggacgaaa ccttgggcca gctgttgccc cacctgttgg atcaactcga aatctgccaa 5100aagtctctga ctggctacct agagaaaaag cgtctgtgct ttccccggtt cttcttcgtt 5160tctgatcctg cactactcga aatcttgggt caggcctctg attctcacac aattcaggct 5220catttgttaa atgtgtttga caacatcaaa agtgtgaaat ttcatgaaaa gatttatgac 5280aggatcttgt ccatttcatc ccaagaggga gaaaccattg agcttgataa gcctgtgatg 5340gcagagggaa acgtggaggt ctggcttaac agtctcttgg aagagtccca gtcctcactg 5400cacctggtca tccgccaggc ggcggctaat atccaggaga caggattcca gctcacggaa 5460ttccttagtt cgtttccggc gcaagtgggg ctcctcggca ttcagatgat ctggacgaga 5520gattcggagg aagccctccg caacgccaag tttgacaaga agattatgca gaaaactaac 5580caagccttcc tagagctcct caacactctg atcgatgtca caactcgtga cctatcgtct 5640accgagcggg tcaagtatga gacactgatt acaatccacg ttcaccagcg tgatatattc 5700gatgatctat gccacatgca cataaaaagt cccatggact tcgaatggct aaaacagtgc 5760aggttctact ttaatgaaga cagtgataag atgatgatcc atatcacaga tgtagcgttt 5820atttaccaaa acgagttcct tggctgtaca gacaggttag tcataactcc gttaactgat 5880cgctgctaca ttacactcgc ccaagcgctt ggaatgtcca tgggtggagc ccccgcaggg 5940ccggcgggga caggtaagac cgaaacaact aaagatatgg gccgttgcct cgggaagtat 6000gttgtagtat ttaactgctc agaccaaatg gatttccgag ggctgggccg tatctttaaa 6060gggctggcgc aatccggttc ctggggctgt tttgacgagt tcaatcgtat tgatttaccg 6120gtgctaagtg ttgccgcaca gcaaattagt ataattttga catgtaagaa agaacacaag 6180aaaagtttta tatttactga 
cggcgacaac gtcactatga atcctgaatt cgggcttttc 6240ttgactatga acccagggta cgctggccgt caagaacttc ctgaaaatct gaaaatcaac 6300tttcgatcgg tggctatgat ggtaccggac cgccagatca tcatccgggt aaaactggcc 6360tcgtgtggct tcatcgacaa cgtcgtactt gctcgaaagt tcttcaccct ttacaagcta 6420tgtgaggagc agttatcgaa acaagttcat tacgactttg ggctccggaa tatcttgtcc 6480gtcttacgca cactcggagc ggctaaacgt gcaaatccca tggacactga gagtacgatt 6540gtgatgcgag tgttaaggga tatgaatctc tcgaagttaa tagacgagga cgagcctctt 6600tttctcagcc ttatagagga tctgttccca aacatcctcc tggacaaggc tggatatccc 6660gagttggaag cggcgattag caggcaggtg gaggaggccg gattgattaa tcacccgccc 6720tggaaactga aagtcatcca gctgttcgag actcagcggg tccgacacgg tatgatgact 6780ttaggcccat ctggcgcggg gaaaaccacc tgcatccaca ccctgatgag ggctatgacc 6840gattgtggga agcctcaccg tgagatgcgg atgaacccga aggcgatcac agcgccccaa 6900atgtttgggc gtctggatgt ggcgacaaat gactggaccg atggaatctt ttccacactc 6960tggaggaaga ccctgcgcgc aaaaaaagga gagcacatct ggatcattct cgatggcccc 7020gttgacgcta tttggatcga aaacttaaac agcgtgctcg acgacaacaa gaccctgaca 7080ttggcaaatg gtgaccggat tcctatggct cccaattgca aaatcatttt tgaacctcac 7140aacatcgaca acgccagtcc ggctacggtg tcccgcaacg gtatggtttt catgagcagt 7200tccatactgg attggagtcc gatattggaa ggatttctca agaaacgcag tccccaggag 7260gcggaaattc tgcggcaact gtatacggaa agttttcctg acctgtaccg cttctgtatt 7320caaaatctcg agtataagat ggaggtgttg gaggccttcg tcatcacaca gtccattaac 7380atgcttcagg gcttgatccc cttgaaggag caaggagggg aagtcagcca ggcacatcta 7440gggcggcttt tcgttttcgc ccttctctgg tccgcgggtg ctgctctcga gctagacggt 7500cgccggcgct tggagttgtg gctgaggtct cgcccgaccg ggacactcga gctgccgccg 7560ccagcgggac ccggggacac tgcattcgac tactacgtag ctccggacgg cacctggacc 7620cactggaaca cccgtacgca ggagtatctc tatcccagcg atacaactcc tgagtatggt 7680agcatactcg ttccgaacgt agacaacgtc agaaccgact ttctgatcca gaccattgct 7740aagcagggca aggcagtcct attgatcgga gagcaaggga ccgcgaaaac cgtgattatc 7800aagggcttca tgagtaaata tgacccagag tgtcatatga ttaagtccct caacttcagt 7860tctgctacca caccactcat gtttcagcgt actatcgaat cctacgtgga 
caagcggatg 7920ggcaccacct acgggccgcc tgccgggaag aagatgacgg tatttataga cgacgttaac 7980atgcccatca tcaacgagtg gggagatcaa gtgaccaacg aaatcgttcg gcaacttatg 8040gaacaaaacg ggttctataa cctcgagaag ccgggcgagt tcacctcaat agtagacatt 8100caatttctgg cagctatgat ccaccccgga ggaggacgga acgacattcc ccagcggctc 8160aagcgccaat tcagcatctt caactgcacg ctgccaagtg aagcatcggt agacaaaatc 8220ttcggcgtca tcggggtggg tcactactgc acccagcgcg gcttttcaga ggaagtccga 8280gattcagtta ctaaactggt tcctttgact agaaggctgt ggcagatgac caaaattaaa 8340atgcttccta ctccagctaa attccactac gtgttcaatc tgcgagactt atccagggta 8400tggcaaggta tgcttaatac cacctctgag gtaattaaag aaccgaacga tctgctgaaa 8460ctgtggaagc acgaatgtaa gagagttatt gcagataggt ttactgtatc gtcagatgtt 8520acctggttcg ataaagcact ggtgtctttg gtcgaagaag agtttgggga agagaagaag 8580ctactcgtcg attgcgggat cgacacttac tttgtggact tcttaagaga cgcccctgag 8640gctgccggcg aaacatcgga agaggcagac gctgagactc ctaagatcta tgagccgatc 8700gagagcttta gccacctcaa ggagcgtctg aatatgtttt tacaattgta taacgagagc 8760attcgtggtg ctgggatgga tatggtgttc ttcgcggatg caatggtgca tctggttaag 8820atttctcggg tgattcgcac gccacaggga aacgcgctac tggtcggggt gggtgggtca 8880ggaaagcaat cgttaactcg tttggcatcc tttatcgcgg gatatgtgag ttttcaaatc 8940accctgacaa ggtcctataa tacatccaac ctgatggagg atcttaaagt tctctacagg 9000acggcgggac aacagggcaa aggaataacc ttcatcttca cagataacga aattaaagac 9060gaatcattct tggagtatat gaacaacgtc ttatcaagcg gcgaagtttc gaacctcttc 9120gcgcgggacg aaatcgatga gatcaactct gatctcgctt ctgtcatgaa gaaagaattc 9180ccccgctgtt tgccgacgaa tgagaacttg catgactatt tcatgtcccg tgtgcggcag 9240aatttgcaca ttgtgctttg cttttcaccg gtgggggaga agttccgaaa tagagctttg 9300aagttccctg ctttgatttc tgggtgtact attgactggt tttcccgttg gcccaaggac 9360gctctggtcg ccgtgtccga gcacttttta accagctatg atatcgactg cagcctcgaa 9420attaagaagg aagtagttca gtgtatgggc tctttccaag acggtgtggc agagaagtgc 9480gtcgactatt tccagaggtt tcgccgatct actcatgtca cacctaagag ctacttgtcc 9540ttcatacagg ggtacaagtt tatatatggg gagaaacacg ttgaagtaag gactctggca 9600aaccgtatga atactggctt 
agagaagctc aaggaggcct cagaaagtgt ggctgctctc 9660tcaaaggagc ttgaagctaa ggagaaggaa ctccaagttg cgaacgataa agcggatatg 9720gttctaaaag aagtcactat gaaagcacaa gcggctgaaa aggttaaggc ggaggtacag 9780aaggtgaagg atcgcgccca ggcaatagtc gattccattt ccaaagacaa ggccatcgcg 9840gaagaaaagc tagaagccgc caagccggcc ttagaagagg cagaggctgc cttgcaaacc 9900ataagaccga gtgacatcgc gacggtacga acccttggtc gtcctcctca tttgattatg 9960aggattatgg actgtgtcct gcttttattt caacgtaaag tatctgcagt taagattgat 10020ttggaaaaat cctgtaccat gccctcatgg caggaatccc tgaaattgat gaccgcgggc 10080aatttccttc aaaatctaca acaattcccc aaggacacca ttaacgaaga ggtcatagaa 10140tttttatctc catattttga gatgccagat tacaatattg aaacagcgaa gcgcgtctgt 10200ggtaacgttg caggtctctg ttcgtggacc aaagctatgg cctccttctt tagtatcaat 10260aaagaggtac taccactgaa agccaacctg gtggtacagg agaaccggca tctgcttgct 10320atgcaggatc ttcagaaggc ccaggccgaa ttagacgaca agcaggctga gcttgacgtt 10380gtgcaagcag aatacgaaca agctatgact gaaaagcaga ctttattaga ggacgctgaa 10440cgctgcagac ataagatgca gactgcaagc accctcatat ccgggttggc tggagaaaaa 10500gagcggtgga cagagcagtc gcaagaattc gctgctcaaa ccaaaaggtt ggttggagac 10560gttctactcg cgacagcttt tctctcctat tctggtcctt tcaaccagga attccgggac 10620cttttgctga atgactggag aaaagaaatg aaggctcgca aaataccatt tggtaagaat 10680ttgaacttgt ctgaaatgct tattgacgca cccactatat cagagtggaa tcttcaggga 10740cttccaaatg acgatctgtc catccaaaac ggaattattg taaccaaggc gagtcgctac 10800cctctgctca ttgacccgca gacacagggc aagatctgga ttaaaaataa ggaaagcagg 10860aacgaactcc agatcactag tctcaaccac aagtacttcc gtaaccacct cgaagatagt 10920ctgtccctgg gacggccgtt gctaatcgag gacgtcggag aagagctgga ccccgcatta 10980gacaacgttc ttgaaagaaa ttttatcaag acaggatcaa ctttcaaagt taaagtagga 11040gataaagaag tggatgtgtt agatggcttc cgcctatata tcacaactaa actcccgaat 11100cccgcctata ctccagagat cagcgctaga actagcatca tagatttcac tgtaactatg 11160aaggggttag aagatcaatt attaggacgc gtgatcctga cggagaaaca ggaattggaa 11220aaagagcgta cacacctcat ggaagacgtg acagctaaca aacgtcggat gaaggaactg 11280gaagacaatt tactgtatcg gttgacatca 
acacagggct cccttgttga ggacgagagt 11340ctgatcgtgg tcctgtctaa cacaaagagg actgctgaag aagtaactca gaaattggag 11400atttctgccg aaactgaagt tcagattaac tccgctagag aagagtatcg tccagtcgct 11460actcggggct ctatcctata cttcctcata actgagatgc gcttggtcaa tgagatgtac 11520cagacttcac tccggcaatt cctgggcttg tttgacttgt cgctggcaag atcagtaaaa 11580tctccaatta ccagcaagag aatcgcaaac atcattgagc acatgacgta cgaggtgtat 11640aaatacgcgg cgaggggtct ttatgaagag cataagttcc tcttcaccct actattaacg 11700ttgaagatag atattcagag gaaccgggtg aagcacgagg agtttctaac tctaataaaa 11760ggaggagctt ctttagattt gaaagcctgt ccaccgaaac cttctaagtg gattttagac 11820ataacatggc tgaatttagt ggagttgtcc aagttacgtc agttcagtga cgttttggat 11880cagatatcca ggaacgagaa gatgtggaag atctggttcg ataaagagaa tccggaggag 11940gagcccttgc caaatgctta tgataaaagc ctagactgct tcaggaggct tttgctcatc 12000cgctcctggt gccccgaccg cactattgcc caagctagga aatacattgt ggactccatg 12060ggggagaagt atgccgaagg agtcatactc gacttggaga agacttggga agagtcagat 12120ccgaggacgc ccctcatttg tcttctttcc atgggttctg atcccacgga ctctattata 12180gcactcggga aaagactaaa gatcgagaca cgctatgtta gcatgggaca ggggcaagag 12240gtccatgccc gcaaactact acagcagact atggctaatg gaggttgggc tctgttacag 12300aactgtcact taggccttga ttttatggac gaattgatgg acatcataat tgagacggag 12360ctagtccacg acgcatttcg cttatggatg acgactgaag cacacaagca gtttccgata 12420accctgttgc agatgtccat caagttcgcc aatgaccctc cccagggcct tagggcaggt 12480ctcaaaagga cctacagcgg cgtttcccag gatttacttg acgtctcctc cggatcccag 12540tggaagccca tgttgtacgc tgtggctttc cttcacagca cagttcagga aaggcggaag 12600tttggtgcgc taggctggaa catcccctac gagttcaacc aggctgactt taatgcaaca 12660gtacagttta
ttcaaaatca tctggatgac atggatgtta aaaaaggtgt gtcatggact 12720acaataaggt acatgattgg ggagatacag tacggaggcc gggtaactga tgattatgac 12780aagcggctac tgaacacttt cgctaaagtg tggttttctg agaatatgtt cggtccagat 12840ttcagcttct accaaggtta taacattcca aagtgctcca cagtcgacaa ctacctccag 12900tacatccaat cgttaccagc atacgacagc ccggaagttt ttggcctgca cccaaacgcg 12960gacatcactt atcagtcgaa gctggcaaag gacgtgctcg acaccatcct tggtatacag 13020cctaaggaca ccagtggggg cggtgacgag actcgcgagg ctgtggtggc ccggctcgct 13080gatgacatgc tagagaaact tcccccggac tacgtcccct ttgaggtcaa agagcggctg 13140cagaaaatgg ggcccttcca gcccatgaac atattcttgc gccaggagat agaccgtatg 13200caacgggtcc tgagcctggt ccgctcgact ctaaccgagc tcaagctggc catcgatggg 13260acgattatta tgtctgagaa tttgagggac gcgctcgatt gcatgtttga tgccaggatc 13320ccagcctggt ggaaaaaagc tagttggatc tcatctactc tggggttttg gtttacagag 13380ttgatagaac ggaacagcca gtttacttct tgggtattca atggtaggcc tcactgtttt 13440tggatgacag gcttctttaa cccccagggt ttcctcactg cgatgagaca agagattacg 13500cgagccaata agggctgggc actagataac atggtcctgt gtaatgaagt gaccaaatgg 13560atgaaagacg acatatcagc gccccccacc gagggtgtgt acgtatatgg cctctatttg 13620gaaggggctg gatgggacaa gcgtaacatg aaactgatag aatccaagcc taaggtcctc 13680tttgagctaa tgccggttat acgaatctac gccgagaaca atacattgag agatccaaga 13740ttttattctt gtcccatata caagaagcct gtccgtacag atttgaatta cattgctgcc 13800gtcgacctgc gcaccgcaca aactcccgag cactgggtgc tgcggggggt cgcactgctc 13860tgcgacgtca agtag 13875

* * * * *
