Engineered Nucleases That Target Human And Canine Factor Viii Genes As A Treatment For Hemophilia A Hekele; Armin ; et al. [Precision BioSciences, Inc.]

Patent Application Summary

U.S. patent application number 16/760902 was filed with the patent office on 2020-09-24 for engineered nucleases that target human and canine Factor VIII genes as a treatment for hemophilia A. This patent application is currently assigned to Precision BioSciences, Inc. The applicant listed for this patent is Precision BioSciences, Inc. Invention is credited to Victor Bartsevich, Clayton Beard, Armin Hekele, Derek Jantz, James Jefferson Smith.

Application Number	20200299658 16/760902
Family ID	1000004914478
Filed Date	2020-09-24

United States Patent Application	20200299658
Kind Code	A1
Hekele; Armin ; et al.	September 24, 2020

ENGINEERED NUCLEASES THAT TARGET HUMAN AND CANINE FACTOR VIII GENES AS A TREATMENT FOR HEMOPHILIA A

Abstract

The present invention encompasses engineered nucleases which recognize and cleave a recognition sequence within the int22h-1 sequence of a Factor VIII gene. The present invention also encompasses methods of using such engineered nucleases to make genetically-modified cells, and the use of such cells in a pharmaceutical composition and in methods for treating hemophilia A. Further, the invention encompasses pharmaceutical compositions comprising engineered nuclease proteins, nucleic acids encoding engineered nucleases, or genetically-modified cells of the invention, and the use of such compositions for treating hemophilia A.

Inventors:

Hekele; Armin; (Cary, NC) ; Beard; Clayton; (Durham, NC) ; Jantz; Derek; (Durham, NC) ; Smith; James Jefferson; (Morrisville, NC) ; Bartsevich; Victor; (Durham, NC)

Applicant:

Name	City	State	Country	Type
Precision BioSciences, Inc.	Durham	NC	US

Assignee:

Precision BioSciences, Inc.
Durham
NC

Family ID:

1000004914478

Appl. No.:

16/760902

Filed:

November 1, 2018

PCT Filed:

November 1, 2018

PCT NO:

PCT/US2018/058692

371 Date:

April 30, 2020

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
62580031	Nov 1, 2017

Current U.S. Class:	1/1
Current CPC Class:	C12N 9/22 20130101; C12N 2800/80 20130101; C12N 5/067 20130101; C12N 15/113 20130101; A61K 38/02 20130101; C12N 2310/20 20170501; C12N 2750/14143 20130101; A61K 9/127 20130101; C12N 5/0672 20130101; A61K 48/005 20130101; C12N 15/86 20130101
International Class:	C12N 9/22 20060101 C12N009/22; C12N 15/86 20060101 C12N015/86; C12N 5/071 20060101 C12N005/071; C12N 15/113 20060101 C12N015/113; A61K 48/00 20060101 A61K048/00; A61K 9/127 20060101 A61K009/127

Claims

1. An engineered meganuclease that recognizes and cleaves a recognition sequence positioned within an int22h-1 sequence of said Factor VIII gene, wherein said recognition sequence does not comprise a CpG site and is at least 95% identical between the human genome and the canine genome, wherein said engineered meganuclease comprises a first subunit and a second subunit, wherein said first subunit binds to a first recognition half-site of said recognition sequence and comprises a first hypervariable (HVR1) region, and wherein said second subunit binds to a second recognition half-site of said recognition sequence and comprises a second hypervariable (HVR2) region.

2. The engineered meganuclease of claim 1, wherein said int22h-1 sequence comprises a nucleotide sequence having at least 95% sequence identity to SEQ ID NO: 3 or SEQ ID NO: 4.

3. The engineered meganuclease of claim 1 or 2, wherein said recognition sequence is within an F8A1 coding sequence of said Factor VIII gene.

4. The engineered meganuclease of claim 3, wherein said F8A1 coding sequence comprises a nucleotide sequence having at least 95% sequence identity to SEQ ID NO: 5 or SEQ ID NO: 6.

5. The engineered meganuclease of any one of claims 1-4, wherein said recognition sequence comprises SEQ ID NO: 7.

6. The engineered meganuclease of any one of claims 1-5, wherein said recognition sequence comprises SEQ ID NO: 9 or SEQ ID NO: 11.

7. The engineered meganuclease of claim 5 or claim 6, wherein said HVR1 region comprises an amino acid sequence having at least 80% sequence identity to an amino acid sequence corresponding to residues 24-79 of any one of SEQ ID NOs: 13-21.

8. The engineered meganuclease of any one of claims 5-7, wherein said HVR1 region comprises residues corresponding to residues 24, 26, 28, 32, 33, 38, 40, 42, 44, 46, 68, 70, 75, and 77 of any one of SEQ ID NOs: 13-21.

9. The engineered meganuclease of any one of claims 5-8, wherein said HVR1 region comprises residues 24-79 of any one of SEQ ID NOs: 13-21.

10. The engineered meganuclease of any one of claims 5-9, wherein said HVR2 region comprises an amino acid sequence having at least 80% sequence identity to an amino acid sequence corresponding to residues 215-270 of any one of SEQ ID NOs: 13-21.

11. The engineered meganuclease of any one of claims 5-10, wherein said HVR2 region comprises residues corresponding to residues 215, 217, 219, 221, 223, 224, 229, 231, 233, 235, 237, 259, 261, 266, and 268 of any one of SEQ ID NOs: 13-21.

12. The engineered meganuclease of any one of claims 5-11, wherein said HVR2 region further comprises a residue corresponding to residue 245 of SEQ ID NO: 16.

13. The engineered meganuclease of any one of claims 5-12, wherein said HVR2 region further comprises a residue corresponding to residue 262 of SEQ ID NO: 19.

14. The engineered meganuclease of any one of claims 5-13, wherein said HVR2 region further comprises residues corresponding to residues 262, 263, 264, and 265 of SEQ ID NO: 19 or SEQ ID NO: 21.

15. The engineered meganuclease of any one of claims 5-14, wherein said HVR2 region comprises residues 215-270 of any one of SEQ ID NOs: 13-21.

16. The engineered meganuclease of any one of claims 5-15, wherein said first subunit comprises an amino acid sequence having at least 80% sequence identity to residues 7-153 of any one of SEQ ID NOs: 13-21, and wherein said second subunit comprises an amino acid sequence having at least 80% sequence identity to residues 198-344 of any one of SEQ ID NOs: 13-21.

17. The engineered meganuclease of any one of claims 5-16, wherein said second subunit further comprises a residue corresponding to residue 210 of any one of SEQ ID NOs: 13-21.

18. The engineered meganuclease of any one of claims 5-17, wherein said first subunit comprises residues 7-153 of any one of SEQ ID NOs: 13-21.

19. The engineered meganuclease of any one of claims 5-18, wherein said second subunit comprises residues 198-344 of any one of SEQ ID NOs: 13-21.

20. The engineered meganuclease of any one of claims 5-19, wherein said engineered meganuclease is a single-chain meganuclease comprising a linker, wherein said linker covalently joins said first subunit and said second subunit.

21. The engineered meganuclease of any one of claims 5-20, wherein said engineered meganuclease comprises the amino acid sequence of any one of SEQ ID NOs: 13-21.

22. A polynucleotide comprising a nucleic acid sequence encoding said engineered meganuclease of any one of claims 1-21.

23. The polynucleotide of claim 22, wherein said polynucleotide is an mRNA.

24. A recombinant DNA construct comprising a nucleic acid sequence encoding said engineered meganuclease of any one of claims 1-21.

25. The recombinant DNA construct of claim 24, wherein said recombinant DNA construct encodes a viral vector comprising said nucleic acid sequence encoding said engineered meganuclease of any one of claims 1-21.

26. The recombinant DNA construct of claim 24 or claim 25, wherein said viral vector is an adenoviral vector, a lentiviral vector, a retroviral vector, or an adeno-associated viral (AAV) vector.

27. The recombinant DNA construct of claim 26, wherein said viral vector is a recombinant AAV vector.

28. A viral vector comprising a nucleic acid sequence encoding said engineered meganuclease of any one of claims 1-21.

29. The viral vector of claim 28, wherein said viral vector is an adenoviral vector, a lentiviral vector, a retroviral vector, or an adeno-associated viral (AAV) vector.

30. The viral vector of claim 29, wherein said viral vector is a recombinant AAV vector.

31. A pharmaceutical composition for treatment of a subject having hemophilia A characterized by an inversion of exons 1-22 in a Factor VIII gene, said pharmaceutical composition comprising a pharmaceutically acceptable carrier and an effective amount of: (a) a nucleic acid encoding an engineered nuclease, wherein said engineered nuclease is expressed in a target cell in vivo; or (b) an engineered nuclease protein; wherein said engineered nuclease has specificity for a first recognition sequence positioned within an int22h-1 sequence of said Factor VIII gene; wherein said first recognition sequence does not comprise a CpG site; and wherein said first recognition sequence is at least 95% identical between the human genome and the canine genome.

32. The pharmaceutical composition of claim 31, wherein said int22h-1 sequence comprises a nucleotide sequence having at least 95% sequence identity to SEQ ID NO: 3 or SEQ ID NO: 4.

33. The pharmaceutical composition of claim 31 or claim 32, wherein said first recognition sequence is within an F8A1 coding sequence of said Factor VIII gene.

34. The pharmaceutical composition of any one of claims 31-33, wherein said F8A1 coding sequence comprises a nucleotide sequence having at least 95% sequence identity to SEQ ID NO: 5 or SEQ ID NO: 6.

35. The pharmaceutical composition of any one of claims 31-34, wherein said engineered nuclease has specificity for a second recognition sequence that is identical to said first recognition sequence, wherein said second recognition sequence is positioned in a repeat sequence telomeric to said Factor VIII gene in the X chromosome, and wherein said repeat sequence is identical to said int22h-1 sequence except that said repeat sequence is in reverse orientation relative to said int22h-1 sequence.

36. The pharmaceutical composition of any one of claims 31-35, wherein said nucleic acid is an mRNA.

37. The pharmaceutical composition of claim 36, wherein said mRNA is encapsulated in a lipid nanoparticle.

38. The pharmaceutical composition of any one of claims 31-35, wherein said pharmaceutical composition comprises a recombinant DNA construct comprising said nucleic acid.

39. The pharmaceutical composition of any one of claims 31-35, wherein said pharmaceutical composition comprises a viral vector comprising said nucleic acid.

40. The pharmaceutical composition of claim 39, wherein said viral vector is a recombinant AAV vector.

41. The pharmaceutical composition of any one of claims 31-40, wherein said engineered nuclease is an engineered meganuclease, a TALEN, a zinc finger nuclease, a compact TALEN, a CRISPR, or a megaTAL.

42. The pharmaceutical composition of claim 41, wherein said engineered nuclease is an engineered meganuclease.

43. The pharmaceutical composition of claim 42, wherein said first recognition sequence comprises SEQ ID NO: 7.

44. The pharmaceutical composition of claim 42 or claim 43, wherein said first recognition sequence comprises SEQ ID NO: 9 or SEQ ID NO: 11.

45. The pharmaceutical composition of claim 43 or claim 44, wherein said nucleic acid encodes said engineered meganuclease of any one of claims 1-21.

46. The pharmaceutical composition of claim 43 or claim 44, wherein said engineered meganuclease is said engineered meganuclease of any one of claims 1-21.

47. A method for treating a subject having hemophilia A characterized by an inversion of exons 1-22 of the Factor VIII gene, said method comprising administering to said subject said pharmaceutical composition of any one of claims 31-46.

48. The method of claim 47, wherein said engineered nuclease, or said nucleic acid encoding said engineered nuclease, is delivered to cells which express Factor VIII in a wild-type subject, or progenitor cells which differentiate into cells which express Factor VIII in a wild-type subject.

49. The method of claim 47 or claim 48, wherein said cells are hepatic sinusoidal endothelial cells.

50. The method of claim 47 or claim 48, wherein said cells are progenitor cells which differentiate into hepatic sinusoidal endothelial cells.

51. The method of any one of claims 47-50, wherein said engineered nuclease recognizes and cleaves said first recognition sequence to promote recombination between said int22h-1 sequence and said repeat sequence, resulting in reversion of exons 1-22 to generate a Factor VIII gene having a wild-type orientation.

52. The method of claim 51, wherein said engineered nuclease further recognizes and cleaves said second recognition sequence in said repeat sequence.

53. The method of any one of claims 47-52, wherein said engineered nuclease is an engineered meganuclease.

54. The method of claim 53, wherein said pharmaceutical composition comprises a nucleic acid encoding said engineered meganuclease of any one of claims 1-21.

55. The method of any one of claims 47-54, wherein said subject is a human.

56. The method of any one of claims 47-54, wherein said subject is a canine.

57. A method for genetically modifying the Factor VIII gene in the genome of a mammalian cell, wherein said mammalian cell comprises an inversion of exons 1-22 in the Factor VIII gene compared to a wild-type Factor VIII gene, said method comprising introducing into said mammalian cell: (a) an engineered nuclease having specificity for a first recognition sequence positioned within an int22h-1 sequence of said Factor VIII gene; or (b) a nucleic acid encoding said engineered nuclease, wherein said engineered nuclease is expressed in said mammalian cell; wherein said engineered nuclease cleaves said first recognition sequence and causes a reversion of exons 1-22 to a wild-type orientation in said genetically modified mammalian cell; and wherein said recognition sequence does not comprise a CpG site and is at least 95% identical between the human genome and the canine genome.

58. The method of claim 57, wherein said int22h-1 sequence comprises a nucleotide sequence having at least 95% sequence identity to SEQ ID NO: 3 or SEQ ID NO: 4.

59. The method of claim 57 or claim 58, wherein said first recognition sequence is within an F8A1 coding sequence of said Factor VIII gene.

60. The method of claim 59, wherein said F8A1 coding sequence comprises a nucleotide sequence having at least 95% sequence identity to SEQ ID NO: 5 or SEQ ID NO: 6.

61. The method of any one of claims 57-60, wherein said engineered nuclease has specificity for a second recognition sequence that is identical to said first recognition sequence, wherein said second recognition sequence is positioned in a repeat sequence telomeric to said Factor VIII gene in the X chromosome, and wherein said repeat sequence is identical to said int22h-1 sequence except that said repeat sequence is in reverse orientation relative to said int22h-1 sequence.

62. The method of any one of claims 57-61, wherein said nucleic acid is an mRNA.

63. The method of claim 62, wherein said mRNA is encapsulated in a lipid nanoparticle.

64. The method of any one of claims 57-61, wherein said nucleic acid is introduced using a recombinant DNA construct comprising said nucleic acid.

65. The method of any one of claims 57-61, wherein said nucleic acid is introduced using a viral vector comprising said nucleic acid.

66. The method of claim 65, wherein said viral vector is a recombinant AAV vector.

67. The method of any one of claims 57-66, wherein said engineered nuclease is an engineered meganuclease, a TALEN, a zinc finger nuclease, a compact TALEN, a CRISPR, or a megaTAL.

68. The method of any one of claims 57-67, wherein said engineered nuclease is an engineered meganuclease.

69. The method of claim 68, wherein said first recognition sequence comprises SEQ ID NO: 7.

70. The method of claim 68 or claim 69, wherein said first recognition sequence comprises SEQ ID NO: 9 or SEQ ID NO: 11.

71. The method of any one of claims 68-70, wherein said nucleic acid encodes said engineered meganuclease of any one of claims 1-21.

72. The method of any one of claims 68-70, wherein said engineered meganuclease is said engineered meganuclease of any one of claims 1-21.

73. The method of any one of claims 57-72, wherein said mammalian cell can express Factor VIII following reversion of exons 1-22 to a wild-type orientation.

74. The method of claim 73, wherein said mammalian cell is a progenitor cell which can differentiate into a cell which can express Factor VIII following reversion of exons 1-22 to a wild-type orientation.

75. The method of claim 73, wherein said mammalian cell is a hepatic cell.

76. The method of claim 75, wherein said mammalian cell is a hepatic sinusoidal endothelial cell.

77. The method of claim 74, wherein said mammalian cell is a progenitor cell capable of differentiating into a hepatic sinusoidal endothelial cell.

78. The method of claim 77, wherein said progenitor cell is a hepatic stem cell.

79. The method of claim 73, wherein said mammalian cell is a hematopoietic endothelial cell.

80. The method of claim 74, wherein said mammalian cell is a progenitor cell capable of differentiating into a hematopoietic endothelial cell.

81. The method of any one of claims 57-80, wherein said method is performed in a subject in vivo.

82. The method of claim 81, wherein said method comprises administering to said subject said pharmaceutical composition of any one of claims 31-46.

83. The method of claim 81 or claim 82, wherein said subject is a human.

84. The method of claim 81 or claim 82, wherein said subject is a canine.

85. The method of any one of claims 57-80, wherein said method is performed in vitro.

86. The method of claim 85, wherein said mammalian cell is a human cell.

87. The method of claim 86, wherein said mammalian cell is a canine cell.

88. The method of any one of claims 85-87, wherein said mammalian cell is a pluripotent cell.

89. A genetically-modified cell made by the method of any one of claims 57-88.

Description

FIELD OF THE INVENTION

[0001] The invention relates to the field of molecular biology and recombinant nucleic acid technology. In particular, the invention relates to engineered nucleases having specificity for a recognition sequence within intron 22 of a Factor VIII gene, and particularly within the int22h-1 sequence. Such engineered nucleases are useful in methods for treating hemophilia A characterized by an inversion of exons 1-22 in the Factor VIII gene.

REFERENCE TO A SEQUENCE LISTING SUBMITTED AS A TEXT FILE VIA EFS-WEB

[0002] The instant application contains a Sequence Listing which has been submitted in ASCII format via EFS-Web and is hereby incorporated by reference in its entirety. Said ASCII copy, created on Oct. 31, 2018, is named P109070027W000-SEQ-MJT, and is 90,208 bytes in size.

BACKGROUND OF THE INVENTION

[0003] Hemophilia A is a common genetic bleeding disorder with an incidence of 1 in 5000 males worldwide. This genetic disease can result from various mutations within the coagulation Factor VIII (F8) gene located on the X chromosome, which include large deletions, insertions, inversions, and point mutations. Clinically, hemophilia A can be classified based on relative Factor VIII activity in the patient's plasma as mild (5-30% activity; 50% of patients), moderate (2-5% activity; 10% of patients), or severe (<1% activity; 50% of patients). Currently, there is no cure for hemophilia A. Standard therapy includes the administration of recombinant Factor VIII, but this approach is limited by cost, the requirement for frequent injections, and the formation of Factor VIII-inactivating antibodies in the subject which reduce the effectiveness of therapy. Therefore, a clear need still exists for alternative treatments for hemophilia A. Gene therapy, targeting mutations in the Factor VIII gene, remains an attractive yet elusive approach to treatment.

[0004] Factor VIII is an essential component of the clotting cascade. The protein circulates in the body in an inactive form that is attached to von Willebrand factor. In response to injury, Factor VIII is activated (Factor VIIIa) and separates from von Willebrand factor, then interacts with Factor IXa as part of the clotting cascade which leads to the formation of fibrin and stable clotting. A number of studies have suggested that Factor VIII is produced by liver sinusoidal endothelial cells, as well as extra-hepatic, hematopoietic cells throughout the body.

[0005] The Factor VIII gene on the X chromosome is large and structurally complex, comprising ~180 kb and 26 exons. The wild-type Factor VIII gene encodes two proteins. The first protein is the full-length Factor VIII protein, which is encoded by the 9030 bases found in exons 1 to 26, and has a circulating form containing 2332 amino acid residues. The second protein, referred to as Factor VIIIb, is encoded by 2598 bases in 5 exons present in the Factor VIII gene. The resulting protein comprises 216 amino acids and has a presently unknown function.

[0006] Approximately 45% of severe hemophilia A cases are caused by an intra-chromosomal inversion that involves intron 22 of the Factor VIII gene. This inversion arises when an ~9.5 kb segment of intron 22, referred to as int22h-1, recombines with one of two repeat copies (referred to as int22h-2 and int22h-3, respectively) which are positioned approximately 400 kb and 500 kb telomeric to the Factor VIII gene on the X chromosome. Following recombination, exons 1-22 of the Factor VIII gene become inverted in the genome relative to exons 23-26, resulting in the expression of a truncated, inactive Factor VIII protein that lacks the amino acids encoded by exons 23-26 (Sauna et al. (2015) Blood 125(2): 223-228).

[0007] The upstream repeat copy involved in exon 1-22 inversion is oriented in the opposite direction to int22h-1. Early studies suggested that int22h-2 and int22h-3 were both in reverse orientation relative to int22h-1, allowing for recombination to occur with either repeat sequence. This was referred to as Type I inversion and Type II inversion. However, more recent evidence indicates that int22h-2 and int22h-3 are found in an inverse orientation to one another on the X chromosome, and are part of an imperfect palindrome (FIG. 1). Recombination of sequences within this palindrome allows int22h-2 and int22h-3 to swap places in the genome and, consequently, change their orientation relative to int22h-1. As a result, the int22h-1 sequence can, in different circumstances, recombine with the int22h-2 repeat or the int22h-3 repeat, depending on which is in the opposite orientation to int22h-1 (Bagnall et al. (2006) Journal of Thrombosis and Haemostasis 4: 591-598).

[0008] Of note, intron 22 of the Factor VIII gene contains a CpG island that acts as a bi-directional promoter for two further genes, referred to as F8A1 (Factor VIII-associated 1) and F8B. The CpG island and the intron-less F8A1 gene (SEQ ID NO: 5) are both contained within the int22h-1 sequence (and consequently, within int22h-2 and int22h-3) and are transcribed in the opposite direction to the Factor VIII gene (Bowen (2002) J. Clin. Pathol: Mol. Pathol. 55: 127-144). Interestingly, the sequence of the F8A1 gene is the only region of the human Factor VIII gene that exhibits significant homology to the Factor VIII gene in the canine genome, and particularly in a clinically-relevant population of canines that are Factor VIII-deficient and exhibit an inversion of exons 1-22 in their Factor VIII gene (Lozier et al. (2002) PNAS 99(20): 12991-12996).

[0009] The present invention requires the use of site-specific, rare-cutting nucleases that are engineered to recognize DNA sequences within the int22h-1 sequence in order to generate a double-strand break and promote recombination between int22h-1 and an inversely-oriented repeat sequence (int22h-2 or int22h-3) positioned telomeric to the Factor VIII gene. Nuclease-induced recombination between these regions results in an inversion or reversion of exons 1-22 of the Factor VIII gene. The inventors have discovered that nucleases capable of targeting recognition sequences in the int22h-1 region that are highly homologous between the human and canine Factor VIII genes, and that lack CpG sites, allow for cleavage of both the human and canine Factor VIII genes and recombination between int22h-1 and an inversely-oriented repeat sequence (int22h-2 or int22h-3). Thus, such nucleases are useful in the clinically-relevant Factor VIII-deficient canine model and for therapeutic treatment of humans.
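The inversion and reversion described above can be illustrated with a toy model. The following Python sketch is not part of the application; it treats the chromosome as an ordered list of oriented elements and assumes that recombination between two inversely oriented repeats flips everything lying between them. The element names (`distal_repeat`, `intervening_region`, and so on), the strand convention (+1/-1), and the function `invert_between` are hypothetical and chosen only for illustration. The list is ordered from the telomeric side toward exons 23-26, and distances are ignored.

```python
# Toy model (illustrative only): a chromosome is an ordered list of
# (name, strand) tuples. Strand is +1 or -1 relative to an arbitrary reference.

def invert_between(chromosome, repeat_a, repeat_b):
    """Return a new chromosome with the segment between two repeats inverted.

    Assumption: inversion only happens when the two repeats are in
    opposite orientation; otherwise a ValueError is raised.
    """
    names = [name for name, _ in chromosome]
    i, j = sorted((names.index(repeat_a), names.index(repeat_b)))
    if chromosome[i][1] == chromosome[j][1]:
        raise ValueError("repeats must be inversely oriented to invert")
    # Reverse the order of the intervening elements and flip each strand.
    flipped = [(name, -strand) for name, strand in reversed(chromosome[i + 1:j])]
    return chromosome[:i + 1] + flipped + chromosome[j:]


# Hypothetical wild-type layout: all exons share one orientation.
wild_type = [
    ("distal_repeat", -1),      # stands in for int22h-2 or int22h-3
    ("intervening_region", +1),
    ("exons_1_22", +1),
    ("int22h_1", +1),
    ("exons_23_26", +1),
]

# First recombination: exons 1-22 become inverted relative to exons 23-26.
inverted = invert_between(wild_type, "distal_repeat", "int22h_1")

# Second recombination between the same repeats: reversion to wild type.
reverted = invert_between(inverted, "distal_repeat", "int22h_1")

print(dict(inverted)["exons_1_22"], dict(inverted)["exons_23_26"])
print(reverted == wild_type)
```

In this model the same operation applied twice restores the original layout, which mirrors the statement that nuclease-induced recombination can produce either an inversion or a reversion depending on the starting orientation.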

[0010] Methods for producing engineered, site-specific nucleases are known in the art. For example, zinc-finger nucleases (ZFNs) can be engineered to recognize and cut pre-determined sites in a genome. ZFNs are chimeric proteins comprising a zinc finger DNA-binding domain fused to a nuclease domain from an endonuclease or exonuclease (e.g., Type IIs restriction endonuclease, such as the FokI restriction enzyme). The zinc finger domain can be a native sequence or can be redesigned through rational or experimental means to produce a protein which binds to a pre-determined DNA sequence ~18 basepairs in length. By fusing this engineered protein domain to the nuclease domain, it is possible to target DNA breaks with genome-level specificity. ZFNs have been used extensively to target gene addition, removal, and substitution in a wide range of eukaryotic organisms (reviewed in S. Durai et al., Nucleic Acids Res 33, 5978 (2005)).

[0011] Likewise, TAL-effector nucleases (TALENs) can be generated to cleave specific sites in genomic DNA. Like a ZFN, a TALEN comprises an engineered, site-specific DNA-binding domain fused to an endonuclease or exonuclease (e.g., Type IIs restriction endonuclease, such as the FokI restriction enzyme) (reviewed in Mak, et al. (2013) Curr Opin Struct Biol. 23:93-9). In this case, however, the DNA binding domain comprises a tandem array of TAL-effector domains, each of which specifically recognizes a single DNA basepair.

[0012] Compact TALENs are an alternative endonuclease architecture that avoids the need for dimerization (Beurdeley, et al. (2013) Nat Commun. 4:1762). A Compact TALEN comprises an engineered, site-specific TAL-effector DNA-binding domain fused to the nuclease domain from the I-TevI homing endonuclease or any of the endonucleases listed in Table 2 in U.S. Application No. 20130117869. Compact TALENs do not require dimerization for DNA processing activity, so a Compact TALEN is functional as a monomer.

[0013] Engineered endonucleases based on the CRISPR/Cas system are also known in the art (Ran, et al. (2013) Nat Protoc. 8:2281-2308; Mali et al. (2013) Nat Methods. 10:957-63). A CRISPR endonuclease comprises two components: (1) a Cas (CRISPR-associated) effector nuclease; and (2) a short "guide RNA" comprising a ~20 nucleotide targeting sequence that directs the nuclease to a location of interest in the genome. By expressing multiple guide RNAs in the same cell, each having a different targeting sequence, it is possible to target DNA breaks simultaneously to multiple sites in the genome.

[0014] In the preferred embodiment of the invention, the DNA break-inducing agent is an engineered homing endonuclease (also called a "meganuclease"). Homing endonucleases are a group of naturally-occurring nucleases which recognize 15-40 base-pair cleavage sites commonly found in the genomes of plants and fungi. They are frequently associated with parasitic DNA elements, such as group 1 self-splicing introns and inteins. They naturally promote homologous recombination or gene insertion at specific locations in the host genome by producing a double-stranded break in the chromosome, which recruits the cellular DNA-repair machinery (Stoddard (2006), Q. Rev. Biophys. 38: 49-95). Homing endonucleases are commonly grouped into four families: the LAGLIDADG (SEQ ID NO: 2) family, the GIY-YIG family, the His-Cys box family and the HNH family. These families are characterized by structural motifs, which affect catalytic activity and recognition sequence. For instance, members of the LAGLIDADG family are characterized by having either one or two copies of the conserved LAGLIDADG motif (see Chevalier et al. (2001), Nucleic Acids Res. 29(18): 3757-3774). The LAGLIDADG homing endonucleases with a single copy of the LAGLIDADG motif form homodimers, whereas members with two copies of the LAGLIDADG motif are found as monomers.

[0015] I-CreI (SEQ ID NO: 1) is a member of the LAGLIDADG family of homing endonucleases which recognizes and cuts a 22 basepair recognition sequence in the chloroplast chromosome of the alga Chlamydomonas reinhardtii. Genetic selection techniques have been used to modify the wild-type I-CreI cleavage site preference (Sussman et al. (2004), J Mol. Biol. 342: 31-41; Chames et al. (2005), Nucleic Acids Res. 33: e178; Seligman et al. (2002), Nucleic Acids Res. 30: 3870-9; Arnould et al. (2006), J Mol. Biol. 355: 443-58). Methods for rationally designing mono-LAGLIDADG homing endonucleases have been described which are capable of comprehensively redesigning I-CreI and other homing endonucleases to target widely-divergent DNA sites, including sites in mammalian, yeast, plant, bacterial, and viral genomes (WO 2007/047859).

[0016] As first described in WO 2009/059195, I-CreI and its engineered derivatives are normally dimeric but can be fused into a single polypeptide using a short peptide linker that joins the C-terminus of a first subunit to the N-terminus of a second subunit (Li, et al. (2009) Nucleic Acids Res. 37:1650-62; Grizot, et al. (2009) Nucleic Acids Res. 37:5405-19). Thus, a functional "single-chain" meganuclease can be expressed from a single transcript. This, coupled with the extremely low frequency of off-target cutting observed with engineered meganucleases, makes them the preferred endonuclease for the present invention.

[0017] The use of engineered nucleases for gene therapy in severe hemophilia A has been limited. Park et al. described the use of a TALEN to induce an inversion of exon 1 in the Factor VIII gene in HEK 293T cells and induced pluripotent stem cells (iPSCs) (Park et al. (2014), PNAS 111(25): 9253-9258). Inversions of exon 1, which are also associated with the occurrence of hemophilia A, occur due to homologous recombination between an int1h-1 sequence in intron 1 of the Factor VIII gene and a single homologous region (int1h-2) positioned telomeric to the Factor VIII gene. The TALEN selected for this study cut within the intron 1 homology region in order to induce an inversion of this shorter sequence with an efficiency of 1.9% and 1.4% in the HEK 293T cells and iPSCs, respectively. The authors further demonstrated reversion of exon 1 in the iPSCs at a similar efficiency of 1.3%.

[0018] In a subsequent study, Park et al. reported the use of a CRISPR/Cas system to induce a reversion of exons 1-22 of the Factor VIII gene in iPSCs obtained from patients suffering from severe hemophilia A (Park et al. (2015) Cell Stem Cell 17: 213-220). The authors noted that inversions of exons 1-22 are eight times more prevalent than inversions of exon 1, but emphasized that the exon 1-22 inversion is technically more challenging to revert due in part to the substantially larger size of the inversion (600 kbp compared to 140 kbp) and the presence of three homologs of the int22h-1 sequence on the X chromosome, compared to only two homologs of the int1h-1 sequence. Indeed, Park et al. specifically targeted recognition sequences outside of the int22h-1, int22h-2, and int22h-3 homology regions in order to rule out the possibility that unwanted deletions or inversions involving any two of the three int22 homologs, rather than the desired reversion of the inverted 600-kbp segment, would be induced by cutting within an int22h homology region. Using this approach, the authors observed a reversion frequency of approximately 3.7% in iPS cells.

[0019] The present invention improves on the art in several aspects. Despite suggestions in the art to avoid targeting recognition sequences within the int22h homology regions, targeting recognition sequences within int22h-1 can, surprisingly, produce an inversion or reversion of exons 1-22 in the Factor VIII gene with high efficiency. The recognition sequences targeted within the int22h-1 sequence by the inventors lack CpG sites and have a high degree of homology between the human and canine genomes, such that they can both be targeted by the same nuclease. The absence of CpG sites within the recognition sequences targeted by nucleases avoids potential methylation of these sites, which can hinder cleavage. Thus, the methods of the invention are useful not only in human subjects suffering from hemophilia A, but also in the clinically-relevant canine hemophilia A model which also has an inversion of exons 1-22. Accordingly, the present invention fulfills a need in the art for further gene therapy approaches to severe hemophilia A.

SUMMARY OF THE INVENTION

[0020] The present invention provides engineered nucleases useful for the treatment of hemophilia A, which is characterized by an inversion of exons 1-22 of the Factor VIII gene. The engineered nucleases of the invention recognize and cleave a recognition sequence within an int22h-1 sequence of the Factor VIII gene, thereby promoting recombination between the int22h-1 sequence and an identical, or highly homologous, inverted repeat sequence positioned telomeric to the Factor VIII gene on the X chromosome. Such recombination results in a reversion of exons 1-22 to generate a Factor VIII gene with a wild-type orientation that encodes a biologically-active Factor VIII protein. The recognition sequence recognized by the presently disclosed nucleases does not comprise a CpG site and is at least 80%, at least 85%, at least 90%, at least 95%, or more, identical between the human genome and the canine genome. The present invention also provides pharmaceutical compositions and methods for treatment of hemophilia A which utilize an engineered nuclease having specificity for a recognition sequence positioned within the int22h-1 sequence of the Factor VIII gene that does not comprise a CpG site and is at least 80%, at least 85%, at least 90%, at least 95%, or more, identical between the human genome and the canine genome. The present invention further provides genetically-modified cells which have been modified to correct an inversion of exons 1-22 in the Factor VIII gene, as well as pharmaceutical compositions comprising such genetically-modified cells and methods of using the same for the treatment of hemophilia A.
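
The two selection criteria stated above (no CpG site, and at least 80% identity between the human and canine sequences) can be illustrated with a minimal Python sketch. The function names and the two 22-bp sequences are hypothetical placeholders chosen for illustration; they are not SEQ ID NO: 7, 9, or 11, and the sketch assumes the two sequences are already aligned to equal length.

```python
def has_cpg(seq: str) -> bool:
    """Return True if the sequence contains a CpG dinucleotide.

    "CG" is its own reverse complement, so scanning one strand
    also covers the complementary strand.
    """
    return "CG" in seq.upper()


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions in two pre-aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def is_candidate_site(human: str, canine: str, min_identity: float = 80.0) -> bool:
    """Apply both criteria: no CpG in either sequence, identity at or above threshold."""
    return (
        not has_cpg(human)
        and not has_cpg(canine)
        and percent_identity(human, canine) >= min_identity
    )


# Hypothetical 22-bp placeholders (assumption: NOT the sequences of the invention).
human_site = "ATGGCTTCAGTAACCTGGATTC"
canine_site = "ATGGCTTCAGTGACCTGGATTC"

print(has_cpg(human_site), has_cpg(canine_site))
print(round(percent_identity(human_site, canine_site), 2))
print(is_candidate_site(human_site, canine_site))
```

The threshold is a parameter so that the 85%, 90%, or 95% levels recited above can be applied in the same way.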

[0021] Thus, in one aspect, the invention provides an engineered meganuclease that recognizes and cleaves a recognition sequence within an int22h-1 sequence of a Factor VIII gene that does not comprise a CpG site and is at least 80%, at least 85%, at least 90%, at least 95%, or more, identical between the human genome and the canine genome. The engineered meganuclease comprises a first subunit and a second subunit, wherein the first subunit binds to a first recognition half-site of the recognition sequence and comprises a first hypervariable (HVR1) region, and wherein the second subunit binds to a second recognition half-site of the recognition sequence and comprises a second hypervariable (HVR2) region.

[0022] In one embodiment, the int22h-1 sequence can have at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to SEQ ID NO: 3 or SEQ ID NO: 4. In one such embodiment, the int22h-1 sequence can comprise SEQ ID NO: 3 or SEQ ID NO: 4.

[0023] In another embodiment, the recognition sequence can be within an F8A1 coding sequence of the Factor VIII gene. In such an embodiment, the F8A1 coding sequence can have at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to SEQ ID NO: 5 or SEQ ID NO: 6. In another such embodiment, the F8A1 coding sequence can comprise SEQ ID NO: 5 or SEQ ID NO: 6.

[0024] In another embodiment, the recognition sequence can comprise SEQ ID NO: 7. In particular embodiments, the recognition sequence can comprise SEQ ID NO: 9 or SEQ ID NO: 11.

[0025] In some such embodiments, the HVR1 region can comprise an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to an amino acid sequence corresponding to residues 24-79 of any one of SEQ ID NOs: 13-21.

[0026] In certain embodiments, the HVR1 region can comprise residues corresponding to residues 24, 26, 28, 32, 33, 38, 40, 42, 44, 46, 68, 70, 75, and 77 of any one of SEQ ID NOs: 13-21.

[0027] In particular embodiments, the HVR1 region can comprise residues 24-79 of any one of SEQ ID NOs: 13-21.

[0028] In some such embodiments, the HVR2 region can comprise an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to an amino acid sequence corresponding to residues 215-270 of any one of SEQ ID NOs: 13-21.

[0029] In certain embodiments, the HVR2 region can comprise residues corresponding to residues 215, 217, 219, 221, 223, 224, 229, 231, 233, 235, 237, 259, 261, 266, and 268 of any one of SEQ ID NOs: 13-21.

[0030] In certain embodiments, the HVR2 region can further comprise a residue corresponding to residue 245 of SEQ ID NO: 16.

[0031] In certain embodiments, the HVR2 region can further comprise a residue corresponding to residue 262 of SEQ ID NO: 19.

[0032] In certain embodiments, the HVR2 region can further comprise one or more residues corresponding to residues 262, 263, 264, and 265 of SEQ ID NO: 19 or SEQ ID NO: 21.

[0033] In particular embodiments, the HVR2 region can comprise residues 215-270 of any one of SEQ ID NOs: 13-21.

[0034] In one such embodiment, the first subunit can comprise an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to residues 7-153 of any one of SEQ ID NOs: 13-21, and the second subunit can comprise an amino acid sequence having at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to residues 198-344 of any one of SEQ ID NOs: 13-21.

[0035] In another such embodiment, the first subunit can further comprise a residue corresponding to residue 80 of any one of SEQ ID NOs: 13-21.

[0036] In another such embodiment, the second subunit can further comprise a residue corresponding to residue 210 of any one of SEQ ID NOs: 13-21.

[0037] In another such embodiment, the second subunit can further comprise a residue corresponding to residue 271 of any one of SEQ ID NOs: 13-21.

[0038] In another embodiment, the first subunit can comprise residues 7-153 of any one of SEQ ID NOs: 13-21.

[0039] In another such embodiment, the second subunit can comprise residues 198-344 of any one of SEQ ID NOs: 13-21.

[0040] In another such embodiment, the engineered meganuclease can be a single-chain meganuclease comprising a linker, wherein the linker covalently joins the first subunit and the second subunit.

[0041] In another such embodiment, the engineered meganuclease can comprise the amino acid sequence of any one of SEQ ID NOs: 13-21.
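
The residue ranges recited in the preceding paragraphs are 1-based and inclusive, so extracting them from a full-length sequence needs an offset of one at the start position. The following is a minimal Python sketch of that bookkeeping. The `REGIONS` table and `extract_region` helper are hypothetical names, and the 354-residue placeholder string is an assumption used only so the example runs; it is not any of SEQ ID NOs: 13-21.

```python
# Residue ranges quoted in the text (1-based, inclusive).
REGIONS = {
    "first_subunit": (7, 153),
    "HVR1": (24, 79),
    "second_subunit": (198, 344),
    "HVR2": (215, 270),
}


def extract_region(protein: str, name: str) -> str:
    """Return the named region of a protein using 1-based inclusive coordinates."""
    start, end = REGIONS[name]
    if end > len(protein):
        raise ValueError(f"protein is shorter than residue {end}")
    # Python slices are 0-based and end-exclusive, hence start - 1 and end.
    return protein[start - 1:end]


# Placeholder sequence (assumption: arbitrary residues, arbitrary length of 354).
placeholder = "".join("ACDEFGHIKLMNPQRSTVWY"[i % 20] for i in range(354))

for region in REGIONS:
    print(region, len(extract_region(placeholder, region)))
```

With these coordinates each subunit spans 147 residues and each hypervariable region spans 56 residues, and each hypervariable region lies inside its subunit.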

[0042] In another aspect, the invention provides a polynucleotide comprising a nucleic acid sequence encoding any engineered meganuclease of the invention. In a particular embodiment, the polynucleotide can be an mRNA. In some embodiments, the mRNA can be a polycistronic mRNA encoding an engineered meganuclease of the invention and at least one additional polypeptide.

[0043] In another aspect, the invention provides a recombinant DNA construct comprising a nucleic acid sequence which encodes any engineered meganuclease of the invention.

[0044] In one embodiment, the recombinant DNA construct encodes a viral vector. In such an embodiment, the viral vector can be a retrovirus, a lentivirus, an adenovirus, or an adeno-associated virus (AAV) vector. In a particular embodiment, the viral vector can be a recombinant AAV vector.

[0045] In another aspect, the invention provides a viral vector comprising a nucleic acid sequence which encodes any engineered meganuclease of the invention.

[0046] In one embodiment, the viral vector can be a retrovirus, a lentivirus, an adenovirus, or an adeno-associated virus (AAV) vector. In a particular embodiment, the viral vector can be a recombinant AAV vector.

[0047] In another aspect, the invention provides a pharmaceutical composition for treatment of a subject having hemophilia A. In such an aspect, hemophilia A is characterized by an inversion of exons 1-22 in a Factor VIII gene. The pharmaceutical composition comprises a pharmaceutically acceptable carrier and an effective amount of: (a) a nucleic acid encoding an engineered nuclease, wherein the engineered nuclease is expressed in a target cell in vivo; or (b) an engineered nuclease protein; wherein the engineered nuclease has specificity for a first recognition sequence positioned within an int22h-1 sequence of the Factor VIII gene in the target cell, wherein the first recognition sequence does not comprise a CpG site and is at least 80%, at least 85%, at least 90%, at least 95%, or more identical between the human genome and the canine genome.

[0048] In one embodiment, the int22h-1 sequence can have at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to SEQ ID NO: 3 or SEQ ID NO: 4. In one such embodiment, the int22h-1 sequence can comprise SEQ ID NO: 3 or SEQ ID NO: 4.

[0049] In another embodiment, the first recognition sequence can be within an F8A1 coding sequence. In such an embodiment, the F8A1 coding sequence can have at least 80%, at least 85%, at least 90%, at least 95%, or more sequence identity to SEQ ID NO: 5 or SEQ ID NO: 6. In another such embodiment, the F8A1 coding sequence can comprise SEQ ID NO: 5 or SEQ ID NO: 6.

[0050] In another embodiment, the engineered nuclease can have specificity for a second recognition sequence that is identical to, or has a high degree of homology with, the first recognition sequence, wherein the second recognition sequence is positioned in a repeat sequence telomeric to the Factor VIII gene in the X chromosome. In such an embodiment, the repeat sequence is identical to, or has a high degree of homology with, the int22h-1 sequence except that the repeat sequence is in reverse orientation relative to the int22h-1 sequence.

[0051] In another embodiment, the nucleic acid encoding the engineered nuclease can be an mRNA. In a particular embodiment, the mRNA can be a polycistronic mRNA which encodes an engineered nuclease of the invention and at least one additional polypeptide. In some embodiments, the mRNA or polycistronic mRNA can be encapsulated in a lipid nanoparticle.

[0052] In another embodiment, the pharmaceutical composition comprises a recombinant DNA construct comprising the nucleic acid which encodes the engineered nuclease.

[0053] In another embodiment, the pharmaceutical composition comprises a viral vector comprising the nucleic acid. In one such embodiment, the viral vector can be a retrovirus, a lentivirus, an adenovirus, or an AAV. In a particular embodiment, the viral vector can be a recombinant AAV vector.

[0054] In another embodiment, the engineered nuclease can be an engineered meganuclease, a TALEN, a zinc finger nuclease, a compact TALEN, a CRISPR, or a megaTAL. In a particular embodiment, the engineered nuclease can be an engineered meganuclease.

[0055] In another embodiment, wherein the engineered nuclease is an engineered meganuclease, the first recognition sequence can comprise SEQ ID NO: 7. In particular embodiments, the first recognition sequence can comprise SEQ ID NO: 9 or SEQ ID NO: 11. In some such embodiments, the pharmaceutical composition can comprise an engineered meganuclease of the invention (or a nucleic acid encoding the same) which recognizes and cleaves SEQ ID NO: 7, 9, or 11. In a particular embodiment, the engineered meganuclease can comprise the amino acid sequence of any one of SEQ ID NOs: 13-21.

[0056] In another aspect, the invention provides a method for treating a subject having hemophilia A. In such an aspect, hemophilia A is characterized by an inversion of exons 1-22 of a Factor VIII gene. The method comprises delivering to a target cell in the subject an effective amount of: (a) a nucleic acid encoding an engineered nuclease, wherein the engineered nuclease is expressed in the target cell in vivo; or (b) an engineered nuclease protein; wherein the engineered nuclease is any engineered nuclease of the invention which has specificity for a first recognition sequence positioned within an int22h-1 sequence of the Factor VIII gene in the target cell, wherein the first recognition sequence does not comprise a CpG site and is at least 80%, at least 85%, at least 90%, at least 95%, or more identical between the human genome and the canine genome.

[0057] In one embodiment of the method, the method comprises administering to the subject a pharmaceutical composition of the invention described above, which comprises an effective amount of: (a) a nucleic acid encoding an engineered nuclease of the invention, wherein the engineered nuclease is expressed in a target cell in vivo; or (b) an engineered nuclease protein of the invention.

[0058] In another embodiment of the method, the engineered nuclease, or the nucleic acid encoding the engineered nuclease, can be delivered to a target cell which can express Factor VIII following a reversion of exons 1-22 to a wild-type orientation, or a progenitor cell which differentiates into a cell which can express Factor VIII following a reversion of exons 1-22 to a wild-type orientation. In one such embodiment, the target cell can be a hepatic cell. In a particular embodiment, the hepatic cell can be a hepatic sinusoidal endothelial cell. In another such embodiment, the hepatic cell can be a progenitor cell, such as a hepatic stem cell, which differentiates into a hepatic sinusoidal endothelial cell. In another such embodiment, the target cell can be a hematopoietic endothelial cell. In another such embodiment, the target cell can be a progenitor cell which differentiates into a hematopoietic endothelial cell. It is to be understood that target cells comprise a Factor VIII gene which has an inversion of exons 1-22.

[0059] In another embodiment of the method, the engineered nuclease recognizes and cleaves the first recognition sequence to promote recombination between the int22h-1 sequence and the repeat sequence, resulting in reversion of exons 1-22 to generate a Factor VIII gene having a wild-type orientation.

[0060] In another embodiment of the method, the engineered nuclease further recognizes and cleaves the second recognition sequence in the repeat sequence.

[0061] In another embodiment of the method, the engineered nuclease can be an engineered meganuclease, a TALEN, a zinc finger nuclease, a compact TALEN, a CRISPR, or a megaTAL. In a particular embodiment, the engineered nuclease can be an engineered meganuclease.

[0062] In another embodiment of the method, wherein the engineered nuclease is an engineered meganuclease, the first recognition sequence can comprise SEQ ID NO: 7. In one such embodiment, the engineered meganuclease can be any engineered meganuclease of the invention which recognizes and cleaves SEQ ID NO: 7, 9 or 11. In a particular embodiment, the engineered meganuclease can comprise the amino acid sequence of any one of SEQ ID NOs: 13-21.

[0063] In another embodiment of the method, the subject can be a mammal. In one such embodiment, the subject can be a human. In another such embodiment, the subject can be a canine. In some embodiments, hemophilia A in the subject is treated. In some embodiments, blood clotting time in the subject is reduced. In some embodiments, circulating levels of Factor VIII are increased.

[0064] In another aspect, the invention provides a method for producing a genetically-modified cell comprising a Factor VIII gene having a wild-type orientation. The method comprises: (a) obtaining a cell comprising a Factor VIII gene having an inversion of exons 1-22; and (b) introducing into the cell: (i) a nucleic acid sequence encoding an engineered nuclease, wherein the engineered nuclease is expressed in the cell; or (ii) an engineered nuclease protein; wherein the engineered nuclease has specificity for a first recognition sequence within an int22h-1 sequence of the Factor VIII gene; and wherein the engineered nuclease recognizes and cleaves the first recognition sequence within the int22h-1 sequence to promote recombination between the int22h-1 sequence and a repeat sequence positioned telomeric to the Factor VIII gene; wherein the first recognition sequence does not comprise a CpG site and is at least 80%, at least 85%, at least 90%, at least 95%, or more identical between the human genome and the canine genome, and wherein the repeat sequence is identical to, or has a high degree of homology with, the int22h-1 sequence except that the repeat sequence is in reverse orientation relative to the int22h-1 sequence; and wherein recombination causes reversion of exons 1-22 and generation of the genetically-modified cell comprising a Factor VIII gene having a wild-type orientation.

[0065] In one embodiment, the cell can be a eukaryotic cell. In one such embodiment, the eukaryotic cell can be a pluripotent cell. In such an embodiment, the pluripotent cell can be an induced pluripotent stem (iPS) cell. In a particular embodiment, the iPS cell can be a human iPS cell or a canine iPS cell.

[0066] In another embodiment, the int22h-1 sequence can have at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to SEQ ID NO: 3 or SEQ ID NO: 4. In one such embodiment, the int22h-1 sequence can comprise SEQ ID NO: 3 or SEQ ID NO: 4.

[0067] In another embodiment, the first recognition sequence can be within an F8A1 coding sequence of the Factor VIII gene. In such an embodiment, the F8A1 coding sequence can have at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to SEQ ID NO: 5 or SEQ ID NO: 6. In a particular embodiment, the F8A1 coding sequence can comprise SEQ ID NO: 5 or SEQ ID NO: 6.

[0068] In another embodiment, the engineered nuclease can have specificity for a second recognition sequence that is identical to, or has a high degree of homology with, the first recognition sequence, wherein the second recognition sequence is positioned in a repeat sequence telomeric to the Factor VIII gene in the X chromosome. In such an embodiment, the repeat sequence is identical to, or has a high degree of homology with, the int22h-1 sequence except that the repeat sequence is in reverse orientation relative to the int22h-1 sequence.

[0069] In another embodiment, the nucleic acid can be an mRNA.

[0070] In another embodiment, the nucleic acid can be introduced into the cell using a recombinant DNA construct.

[0071] In another embodiment, the nucleic acid can be introduced into the cell using a viral vector. In one such embodiment, the viral vector can be a retrovirus, a lentivirus, an adenovirus, or an AAV. In a particular embodiment, the viral vector can be a recombinant AAV vector.

[0072] In another embodiment, the engineered nuclease can be an engineered meganuclease, a TALEN, a zinc finger nuclease, a compact TALEN, a CRISPR, or a megaTAL. In a particular embodiment, the engineered nuclease can be an engineered meganuclease.

[0073] In another embodiment, the engineered nuclease can be any engineered meganuclease of the invention which recognizes and cleaves a recognition sequence comprising SEQ ID NO: 7, 9, or 11. In one such embodiment, the engineered meganuclease can comprise the amino acid sequence of any one of SEQ ID NOs: 13-21.

[0074] In another aspect, the invention provides a genetically-modified cell, wherein the genetically-modified cell comprises a Factor VIII gene having a wild-type orientation, and is produced according to the methods of the invention described herein, which produce a genetically-modified cell from a cell which comprises a Factor VIII gene having an inversion of exons 1-22.

[0075] In another aspect, the invention provides a pharmaceutical composition for treatment of a subject having hemophilia A. In such an aspect, hemophilia A is characterized by an inversion of exons 1-22 in a Factor VIII gene. In different embodiments, the pharmaceutical composition comprises a pharmaceutically acceptable carrier and an effective amount of any genetically-modified cell of the invention, and/or any genetically-modified cell produced according to the methods of the invention, which comprises a Factor VIII gene having a wild-type orientation.

[0076] In another aspect, the invention provides a method for treating a subject having hemophilia A. In such an aspect, hemophilia A is characterized by an inversion of exons 1-22 of the Factor VIII gene. The method comprises administering to the subject a pharmaceutical composition of the invention which comprises a pharmaceutically acceptable carrier and an effective amount of any genetically-modified cell of the invention. Such a genetically-modified cell comprises a Factor VIII gene having a wild-type orientation following modification.

[0077] In one embodiment of the method, the genetically-modified cell can be delivered to a target tissue. In one such embodiment, the target tissue can be the liver. In another such embodiment, the target tissue can be the circulatory system.

[0078] In another embodiment of the method, the genetically-modified cell can be a genetically-modified iPS cell. In one such embodiment, the genetically-modified iPS cell can differentiate into a cell which expresses Factor VIII when it is delivered to the target tissue.

[0079] In a particular embodiment, the genetically-modified iPS cell can differentiate into a hepatic sinusoidal endothelial cell which expresses Factor VIII. In another particular embodiment, the genetically-modified iPS cell can differentiate into a hematopoietic cell, such as a hematopoietic endothelial cell, which expresses Factor VIII.

[0080] In another embodiment of the method, the subject can be a mammal. In one such embodiment, the subject can be a human. In another such embodiment, the subject can be a canine. In some embodiments, hemophilia A in the subject is treated. In some embodiments, blood clotting time in the subject is reduced. In some embodiments, circulating levels of Factor VIII are increased.

[0081] In another aspect, the invention provides a method for genetically modifying the Factor VIII gene in the genome of a mammalian cell, wherein the mammalian cell comprises an inversion of exons 1-22 in the Factor VIII gene compared to a wild-type Factor VIII gene. In some embodiments, the method comprises introducing into the mammalian cell: (a) an engineered nuclease having specificity for a first recognition sequence positioned within an int22h-1 sequence of the Factor VIII gene; or (b) a nucleic acid encoding the engineered nuclease, wherein the engineered nuclease is expressed in the mammalian cell; wherein the engineered nuclease recognizes and cleaves the first recognition sequence and causes a reversion of exons 1-22 to a wild-type orientation in the genetically modified mammalian cell; and wherein the first recognition sequence does not comprise a CpG site and is at least 80%, at least 85%, at least 90%, at least 95%, or more identical between the human genome and the canine genome. In such embodiments of the method, the engineered nuclease recognizes and cleaves the first recognition sequence within the int22h-1 sequence to promote recombination between the int22h-1 sequence and a repeat sequence positioned telomeric to the Factor VIII gene; wherein the repeat sequence is identical to, or has a high degree of homology with, the int22h-1 sequence except that the repeat sequence is in reverse orientation relative to the int22h-1 sequence; and wherein recombination causes reversion of exons 1-22 and generation of the genetically-modified cell comprising a Factor VIII gene having a wild-type orientation.

[0082] In some embodiments of the method, the int22h-1 sequence comprises a nucleotide sequence having at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to SEQ ID NO: 3 or SEQ ID NO: 4.

[0083] In some embodiments of the method, the first recognition sequence is within an F8A1 coding sequence of said Factor VIII gene.

[0084] In certain embodiments of the method, the F8A1 coding sequence comprises a nucleotide sequence having at least 80%, at least 85%, at least 90%, at least 95%, or more, sequence identity to SEQ ID NO: 5 or SEQ ID NO: 6.

[0085] In some embodiments of the method, the engineered nuclease has specificity for a second recognition sequence that is identical to the first recognition sequence, wherein the second recognition sequence is positioned in a repeat sequence telomeric to the Factor VIII gene in the X chromosome, and wherein the repeat sequence is identical to the int22h-1 sequence except that the repeat sequence is in reverse orientation relative to the int22h-1 sequence.

[0086] In some embodiments of the method, the nucleic acid is an mRNA. In certain embodiments of the method, the mRNA is encapsulated in a lipid nanoparticle.

[0087] In some embodiments of the method, the nucleic acid is introduced using a recombinant DNA construct comprising the nucleic acid.

[0088] In some embodiments of the method, the nucleic acid is introduced using a viral vector comprising the nucleic acid. In particular embodiments of the method, the viral vector is a retroviral vector, a lentiviral vector, an adenoviral vector, or an AAV vector. In certain embodiments of the method, the viral vector is a recombinant AAV vector.

[0089] In some embodiments of the method, the engineered nuclease is an engineered meganuclease, a TALEN, a zinc finger nuclease, a compact TALEN, a CRISPR, or a megaTAL. In certain embodiments of the method, the engineered nuclease is an engineered meganuclease.

[0090] In particular embodiments of the method, the first recognition sequence comprises SEQ ID NO: 7. In certain embodiments of the method, the first recognition sequence comprises SEQ ID NO: 9 or SEQ ID NO: 11. In specific embodiments of the method, the engineered meganuclease is any engineered meganuclease described herein which has specificity for a recognition sequence comprising SEQ ID NO: 7, 9, or 11. In other specific embodiments of the method, the engineered meganuclease comprises the amino acid sequence of any one of SEQ ID NOs: 13-21.

[0091] In some embodiments of the method, the mammalian cell can express Factor VIII following reversion of exons 1-22 to a wild-type orientation. In certain embodiments of the method, the mammalian cell is a progenitor cell which can differentiate into a cell which can express Factor VIII following reversion of exons 1-22 to a wild-type orientation.

[0092] In some embodiments of the method, the mammalian cell is a hepatic cell. In particular embodiments of the method, the mammalian cell is a hepatic sinusoidal endothelial cell. In certain embodiments of the method, the mammalian cell is a progenitor cell capable of differentiating into a hepatic sinusoidal endothelial cell. In certain embodiments of the method, the progenitor cell is a hepatic stem cell.

[0093] In some embodiments of the method, the mammalian cell is a hematopoietic endothelial cell. In some embodiments of the method, the mammalian cell is a progenitor cell capable of differentiating into a hematopoietic endothelial cell.

[0094] In some embodiments, the method is performed in a subject in vivo. In certain embodiments, the subject has hemophilia A characterized by an inversion of exons 1-22 of the Factor VIII gene. In some such embodiments, the method comprises administering to the subject a pharmaceutical composition described herein. In some such embodiments of the method, the subject is a human. In some such embodiments of the method, the subject is a canine. In some embodiments, hemophilia A in the subject is treated. In some embodiments, blood clotting time in the subject is reduced. In some embodiments, circulating levels of Factor VIII are increased.

[0095] In some embodiments, the method is performed in vitro. In some such embodiments of the method, the mammalian cell is a human cell. In some such embodiments of the method, the mammalian cell is a canine cell. In some embodiments of the method, the mammalian cell can be a pluripotent cell, such as an induced pluripotent stem (iPS) cell.

[0096] In another aspect, the invention provides a genetically-modified cell produced according to the methods described herein.

[0097] In another aspect, the invention provides an engineered nuclease, and particularly an engineered meganuclease, described herein for use as a medicament. The invention further provides the use of an engineered nuclease, and particularly an engineered meganuclease, described herein in the manufacture of a medicament for treating hemophilia A, which is characterized by an inversion of exons 1-22 in the Factor VIII gene.

[0098] In another aspect, the invention provides a polynucleotide for use as a medicament, wherein the polynucleotide comprises a nucleic acid sequence encoding an engineered nuclease, and particularly an engineered meganuclease, of the invention. The invention further provides the use of a polynucleotide in the manufacture of a medicament for treating hemophilia A, which is characterized by an inversion of exons 1-22 in the Factor VIII gene, wherein the polynucleotide comprises a nucleic acid sequence encoding an engineered nuclease, and particularly an engineered meganuclease, of the invention.

[0099] In another aspect, the invention provides a recombinant AAV vector for use as a medicament, wherein the recombinant AAV vector comprises a polynucleotide which comprises a nucleic acid sequence encoding an engineered nuclease, and particularly an engineered meganuclease, of the invention. The invention further provides the use of a recombinant AAV vector in the manufacture of a medicament for treating hemophilia A, which is characterized by an inversion of exons 1-22 of the Factor VIII gene, wherein the recombinant AAV vector comprises a polynucleotide which comprises a nucleic acid sequence encoding an engineered nuclease, and particularly an engineered meganuclease, of the invention.

[0100] In another aspect, the invention provides a genetically-modified cell of the invention for use as a medicament, wherein the genetically-modified cell has been modified to comprise a Factor VIII gene having a wild-type orientation. The invention further provides the use of a genetically-modified cell of the invention in the manufacture of a medicament for treating hemophilia A, which is characterized by an inversion of exons 1-22 of the Factor VIII gene, wherein the genetically-modified cell has been modified to comprise a Factor VIII gene having a wild-type orientation.

BRIEF DESCRIPTION OF THE FIGURES

[0101] FIG. 1A and FIG. 1B. Inversion of introns 1-22 in the Factor VIII gene. The int22h-2 and int22h-3 repeat sequences are positioned telomeric to the int22h-1 sequence on the X chromosome. Further, int22h-2 and int22h-3 are found in an inverse orientation to one another as part of an imperfect palindrome. Recombination of sequences within this palindrome allows int22h-2 and int22h-3 to swap places in the genome and, consequently, change their orientation relative to int22h-1. As a result, the int22h-1 sequence can, in different circumstances, recombine with the int22h-2 repeat or the int22h-3 repeat, depending on which is in the opposite orientation to int22h-1. FIG. 1A shows a configuration in which int22h-3 is in an inverse orientation to int22h-1, allowing for intrachromosomal recombination to occur between these repeat sequences, resulting in the illustrated inversion of exons 1-22. FIG. 1B shows a configuration in which int22h-2 is in an inverse orientation to int22h-1, allowing for intrachromosomal recombination to occur between these repeat sequences, resulting in the illustrated inversion of exons 1-22.
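
The orientation logic described for FIG. 1A can be sketched as a toy model in Python. The model treats the chromosome as an ordered list of named elements, each with an orientation, and treats recombination between two oppositely oriented repeats as reversing the order and flipping the orientation of everything between them. The element layout, the `invert_between` helper, and the simplification that the repeats themselves are left unchanged are all assumptions made for illustration; they are not a description of the figures or of the actual genomic coordinates.

```python
from typing import List, Tuple

Element = Tuple[str, str]  # (name, orientation), where orientation is "+" or "-"


def flip(orientation: str) -> str:
    """Flip a "+" orientation to "-" and vice versa."""
    return "-" if orientation == "+" else "+"


def invert_between(chrom: List[Element], left: str, right: str) -> List[Element]:
    """Invert the segment lying between two oppositely oriented repeats."""
    names = [name for name, _ in chrom]
    i, j = sorted((names.index(left), names.index(right)))
    if chrom[i][1] == chrom[j][1]:
        raise ValueError("repeats must be in opposite orientation to cause an inversion")
    # Reverse the order of the intervening elements and flip each orientation.
    inner = [(name, flip(o)) for name, o in reversed(chrom[i + 1:j])]
    return chrom[:i + 1] + inner + chrom[j:]


# Toy layout (assumption), with int22h-3 inverse to int22h-1 as in FIG. 1A.
wild_type = [
    ("exons 23-26", "+"),
    ("int22h-1", "+"),
    ("exons 1-22", "+"),
    ("int22h-2", "+"),
    ("int22h-3", "-"),
]

inverted = invert_between(wild_type, "int22h-1", "int22h-3")
reverted = invert_between(inverted, "int22h-1", "int22h-3")

print(inverted)
print(reverted == wild_type)
```

In this toy model a second recombination between the same pair of repeats restores the original layout, which mirrors the reversion of exons 1-22 to a wild-type orientation described in the summary.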

[0102] FIG. 2. F8R recognition sequences in the Factor VIII gene. A) Each recognition sequence targeted by a recombinant meganuclease of the invention comprises two recognition half-sites. Each recognition half-site comprises 9 base pairs, separated by a 4 base pair central sequence. The human F8R 17-18 recognition sequence (SEQ ID NO: 9) and canine F8R 17-18 recognition sequence (SEQ ID NO: 11) comprise two recognition half-sites referred to as F8R17 and F8R18.
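The half-site layout described for FIG. 2 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `split_recognition_sequence` is hypothetical, and the 22-base placeholder string is invented for the example and is not SEQ ID NO: 9 or SEQ ID NO: 11.

```python
HALF_SITE_LEN = 9   # each recognition half-site, per the text
CENTER_LEN = 4      # central sequence separating the two half-sites


def split_recognition_sequence(seq):
    """Split a 22 bp recognition sequence into (half_site_1, center, half_site_2)."""
    expected = 2 * HALF_SITE_LEN + CENTER_LEN
    if len(seq) != expected:
        raise ValueError(f"expected {expected} bases, got {len(seq)}")
    first = seq[:HALF_SITE_LEN]
    center = seq[HALF_SITE_LEN:HALF_SITE_LEN + CENTER_LEN]
    second = seq[HALF_SITE_LEN + CENTER_LEN:]
    return first, center, second


# Hypothetical placeholder, NOT a sequence from the disclosure.
placeholder = "AAAAAAAAA" + "CGTA" + "TTTTTTTTT"
first, center, second = split_recognition_sequence(placeholder)
```

The two outer slices correspond to the positions occupied by the F8R17 and F8R18 half-sites, and the middle slice to the 4 base pair central sequence.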

[0103] FIG. 3. The recombinant meganucleases of the invention comprise two subunits, wherein the first subunit comprising the HVR1 region binds to a first recognition half-site (i.e., F8R17) and the second subunit comprising the HVR2 region binds to a second recognition half-site (i.e., F8R18). In embodiments where the recombinant meganuclease is a single-chain meganuclease, the first subunit comprising the HVR1 region can be positioned as either the N-terminal or C-terminal subunit. Likewise, the second subunit comprising the HVR2 region can be positioned as either the N-terminal or C-terminal subunit.

[0104] FIG. 4. Schematic of reporter assay in CHO cells for evaluating recombinant meganucleases targeting recognition sequences found in intron 22 of the Factor VIII gene. For the recombinant meganucleases described herein, a CHO cell line was produced in which a reporter cassette was integrated stably into the genome of the cell. The reporter cassette comprised, in 5' to 3' order: an SV40 Early Promoter; the 5' 2/3 of the GFP gene; the recognition sequence for an engineered meganuclease of the invention (i.e., the human or canine F8R 17-18 recognition sequence); the recognition sequence for the CHO-23/24 meganuclease (WO/2012/167192); and the 3' 2/3 of the GFP gene. Cells stably transfected with this cassette did not express GFP in the absence of a DNA break-inducing agent. Meganucleases were introduced by transduction of plasmid DNA or mRNA encoding each meganuclease. When a DNA break was induced at either of the meganuclease recognition sequences, the duplicated regions of the GFP gene recombined with one another to produce a functional GFP gene. The percentage of GFP-expressing cells could then be determined by flow cytometry as an indirect measure of the frequency of genome cleavage by the meganucleases.
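The readout of this reporter assay is a simple proportion. The sketch below shows how a percentage of GFP-expressing cells might be computed from flow cytometry event counts. The function name and the event counts are hypothetical and are not data from the disclosure.

```python
def percent_gfp_positive(gfp_positive_events, total_events):
    """Return the percentage of counted cells that express GFP."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= gfp_positive_events <= total_events:
        raise ValueError("gfp_positive_events must lie between 0 and total_events")
    return 100.0 * gfp_positive_events / total_events


# Hypothetical counts for illustration only.
example_pct = percent_gfp_positive(250, 10000)
```

Because GFP is restored only after a break at one of the two recognition sequences, a higher percentage is read as more frequent cleavage, which is why the text calls it an indirect measure.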

[0105] FIG. 5A and FIG. 5B. Efficiency of first-generation recombinant meganucleases for recognizing and cleaving F8R 17-18 recognition sequences in the int22h-1 sequence of the Factor VIII gene in a CHO cell reporter assay. Recombinant meganucleases set forth in SEQ ID NO: 15 (F8R 17-18x.1), SEQ ID NO: 16 (F8R 17-18x.2), SEQ ID NO: 17 (F8R 17-18x.79), and SEQ ID NO: 18 (F8R 17-18x.88) were engineered to target the human F8R 17-18 recognition sequence (SEQ ID NO: 9) and the canine F8R 17-18 recognition sequence (SEQ ID NO: 11), and were screened for efficacy in the CHO cell reporter assay. The results shown provide the percentage of GFP-expressing cells observed in each assay, which indicates the efficacy of each meganuclease for cleaving a target recognition sequence or the CHO-23/24 recognition sequence. A negative control (bs) was further included in each assay. FIG. 5A shows meganucleases targeting the human F8R 17-18 recognition sequence. FIG. 5B shows meganucleases targeting the canine F8R 17-18 recognition sequence.

[0106] FIG. 6A and FIG. 6B. Efficiency of a second-generation engineered meganuclease for recognizing and cleaving recognition sequences in the int22h-1 sequence of the Factor VIII gene in a CHO cell reporter assay. The second-generation engineered meganuclease F8R 17-18L1.35 set forth in SEQ ID NO: 13 was screened for efficacy for targeting the human F8R 17-18 recognition sequence (SEQ ID NO: 9) and the canine F8R 17-18 recognition sequence (SEQ ID NO: 11) in the CHO cell reporter assay and compared to the first-generation recombinant meganuclease F8R 17-18x.1 from which it was derived. The results shown provide the percentage of GFP-expressing cells observed in each assay, which indicates the efficacy of each meganuclease for cleaving a target recognition sequence or the CHO-23/24 recognition sequence. A negative control (bs) was further included in each assay. FIG. 6A shows meganucleases targeting the human F8R 17-18 recognition sequence. FIG. 6B shows meganucleases targeting the canine F8R 17-18 recognition sequence.

[0107] FIG. 7A and FIG. 7B. Efficiency of a second-generation engineered meganuclease for recognizing and cleaving recognition sequences in the int22h-1 sequence of the Factor VIII gene in a CHO cell reporter assay. The second-generation engineered meganuclease F8R 17-18L2.23 set forth in SEQ ID NO: 14 was screened for efficacy for targeting the human F8R 17-18 recognition sequence (SEQ ID NO: 9) and the canine F8R 17-18 recognition sequence (SEQ ID NO: 11) in the CHO cell reporter assay and compared to the first-generation recombinant meganuclease F8R 17-18x.88 from which it was derived. The results shown provide the percentage of GFP-expressing cells observed in each assay, which indicates the efficacy of each meganuclease for cleaving a target recognition sequence or the CHO-23/24 recognition sequence. A negative control (bs) was further included in each assay. FIG. 7A shows meganucleases targeting the human F8R 17-18 recognition sequence. FIG. 7B shows meganucleases targeting the canine F8R 17-18 recognition sequence.

[0108] FIG. 8A and FIG. 8B. Engineered meganucleases encompassed by the invention were engineered to target the human F8R 17-18 (SEQ ID NO: 9) and canine F8R 17-18 (SEQ ID NO: 11) recognition sequences, and were screened for efficacy in the CHO cell reporter assay at multiple time points over 7 days after nucleofection. The results shown provide the percentage of GFP-expressing cells observed in each assay at days 2, 5, and 7, which indicates the efficacy of each meganuclease for cleaving a target recognition sequence or the CHO-23/24 recognition sequence as a function of time. FIG. 8A shows F8R 17-18 meganucleases targeting the human F8R 17-18 recognition sequence. FIG. 8B shows F8R 17-18 meganucleases targeting the canine F8R 17-18 recognition sequence.

[0109] FIG. 9. Efficiency of third-generation engineered meganucleases for recognizing and cleaving human and canine F8R recognition sequences, and for reducing cleavage of an off-target recognition sequence, in a CHO cell reporter assay. Third-generation engineered meganucleases were screened for efficacy for targeting the human F8R 17-18 recognition sequence (SEQ ID NO: 9), the canine F8R 17-18 recognition sequence (SEQ ID NO: 11), or an off-target sequence (SEQ ID NO: 46), in the CHO cell reporter assay and were compared to the F8R 17-18L1.35 meganuclease.

[0110] FIG. 10A and FIG. 10B. Reversion of Factor VIII gene by F8R nucleases in primary human patient T cells and determination of editing by long-distance PCR. Hemophilia A patient T-cells were transfected with mRNA encoding F8R 17-18x.1, F8R 17-18x.2, F8R 17-18x.79, or F8R 17-18x.88 nucleases. At 3 days post-transfection, genomic DNA was isolated from cells and long-distance PCR was performed to determine Factor VIII genome editing. FIG. 10A shows PCR bands corresponding to the hemophilia A-associated Factor VIII gene inversion, as detected using primers H3D and H1D. FIG. 10B shows PCR bands corresponding to a wild-type Factor VIII gene configuration, as detected using primers H1U and H1D.

[0111] FIG. 11A and FIG. 11B. Inversion of exons 1-22 in the Factor VIII gene by F8R nucleases in HEK293 cells and determination of editing by long-distance PCR. HEK293 cells were transfected with mRNA encoding F8R 17-18x.1, F8R 17-18x.88, F8R 17-18L1.35, or F8R 17-18L2.23 nucleases. At 3 days post-transfection, genomic DNA was isolated from cells and long-distance PCR was performed to determine Factor VIII genome editing. FIG. 11A shows PCR bands corresponding to a wild-type Factor VIII gene configuration, as detected using primers H1U and H1D. FIG. 11B shows PCR bands corresponding to the hemophilia A-associated Factor VIII gene inversion, as detected using primers H3D and H1D.

[0112] FIG. 12A and FIG. 12B. Inversion of exons 1-22 in the Factor VIII gene by F8R nucleases in canine cells. MDCK cells were transfected with mRNA encoding F8R 17-18x.1, F8R 17-18x.2, F8R 17-18x.79, or F8R 17-18x.88 nucleases. At 3 days post-transfection, genomic DNA was isolated from cells and inverse digital PCR was performed to determine Factor VIII genome editing. FIG. 12A shows the results corresponding to a wild-type Factor VIII gene configuration, as detected using primers U1 and D1. FIG. 12B shows the results corresponding to the hemophilia A-associated Factor VIII gene inversion, as detected using primers U3 and U1.

[0113] FIG. 13A and FIG. 13B. Inversion of exons 1-22 in the Factor VIII gene by F8R nucleases in canine cells. MDCK cells were transfected with mRNA encoding F8R 17-18x.1, F8R 17-18x.88, F8R 17-18L1.35, or F8R 17-18L2.23 nucleases. At 3 days post-transfection, genomic DNA was isolated from cells and inverse digital PCR was performed to determine Factor VIII genome editing. FIG. 13A shows the results corresponding to a wild-type Factor VIII gene configuration, as detected using primers U1 and D1. FIG. 13B shows the results corresponding to the hemophilia A-associated Factor VIII gene inversion, as detected using primers U3 and U1.

[0114] FIG. 14. Whole blood clotting time in a nuclease-treated hemophilic dog. A dog carrying the int22 Factor VIII inversion was infused with a single dose of 2.4E14 AAV8 particles (1E13 viral genomes/kg) encoding the F8R17-18L1.35 meganuclease under the control of a liver-specific promoter. Whole-blood clotting time (WBCT) was determined repeatedly over a period of approximately 3.5 months (110 days) following AAV infusion.
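The two dose figures in this description can be related by simple arithmetic. The sketch below assumes that the total dose and the per-kilogram dose are counted in the same unit (that is, that "particles" and "viral genomes" are used interchangeably here). The resulting body weight is an inference from that assumption and is not stated in the text.

```python
# Values quoted in the figure description.
total_dose = 2.4e14        # AAV8 particles in the single infusion
dose_per_kg = 1e13         # viral genomes per kg of body weight

# Assumption: both figures use the same unit, so their ratio is a body weight.
implied_body_weight_kg = total_dose / dose_per_kg
```

Under that assumption the ratio works out to roughly 24 kg.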

[0115] FIG. 15. Detection of corrected Factor VIII mRNA by digital PCR in liver biopsy. A liver biopsy was obtained from the hemophilic dog approximately four months (117 days) after AAV administration of the nuclease. Total RNA was isolated from the biopsy sample and from a liver biopsy sample from the same dog collected prior to AAV infusion. RNA was analyzed by reverse transcription digital PCR to detect full-length corrected Factor VIII transcripts.

BRIEF DESCRIPTION OF THE SEQUENCES

[0116] SEQ ID NO: 1 sets forth the amino acid sequence of the wild-type I-CreI meganuclease from Chlamydomonas reinhardtii.

[0117] SEQ ID NO: 2 sets forth the amino acid sequence of the LAGLIDADG motif.

[0118] SEQ ID NO: 3 sets forth the nucleic acid sequence of a human int22h-1 sequence.

[0119] SEQ ID NO: 4 sets forth the nucleic acid sequence of a canine int22h-1 sequence.

[0120] SEQ ID NO: 5 sets forth the nucleic acid sequence of a human F8A1 sequence.

[0121] SEQ ID NO: 6 sets forth the nucleic acid sequence of a canine F8A1 sequence.

[0122] SEQ ID NO: 7 sets forth the nucleic acid sequence of the F8R 17-18 consensus recognition sequence (sense).

[0123] SEQ ID NO: 8 sets forth the nucleic acid sequence of the F8R 17-18 consensus recognition sequence (antisense).

[0124] SEQ ID NO: 9 sets forth the nucleic acid sequence of the F8R 17-18 human recognition sequence (sense).

[0125] SEQ ID NO: 10 sets forth the nucleic acid sequence of the F8R 17-18 human recognition sequence (antisense).

[0126] SEQ ID NO: 11 sets forth the nucleic acid sequence of the F8R 17-18 canine recognition sequence (sense).

[0127] SEQ ID NO: 12 sets forth the nucleic acid sequence of the F8R 17-18 canine recognition sequence (antisense).

[0128] SEQ ID NO: 13 sets forth the amino acid sequence of the F8R 17-18L1.35 meganuclease.

[0129] SEQ ID NO: 14 sets forth the amino acid sequence of the F8R 17-18L2.23 meganuclease.

[0130] SEQ ID NO: 15 sets forth the amino acid sequence of the F8R 17-18x.1 meganuclease.

[0131] SEQ ID NO: 16 sets forth the amino acid sequence of the F8R 17-18x.2 meganuclease.

[0132] SEQ ID NO: 17 sets forth the amino acid sequence of the F8R 17-18x.79 meganuclease.

[0133] SEQ ID NO: 18 sets forth the amino acid sequence of the F8R 17-18x.88 meganuclease.

[0134] SEQ ID NO: 19 sets forth the amino acid sequence of the F8R 17-18L.626 meganuclease.

[0135] SEQ ID NO: 20 sets forth the amino acid sequence of the F8R 17-18L.615 meganuclease.

[0136] SEQ ID NO: 21 sets forth the amino acid sequence of the F8R 17-18L.553 meganuclease.

[0137] SEQ ID NO: 22 sets forth the amino acid sequence of the F8R 17-18L1.35 meganuclease F8R17-binding monomer.

[0138] SEQ ID NO: 23 sets forth the amino acid sequence of the F8R 17-18L2.23 meganuclease F8R17-binding monomer.

[0139] SEQ ID NO: 24 sets forth the amino acid sequence of the F8R 17-18x.1 meganuclease F8R17-binding monomer.

[0140] SEQ ID NO: 25 sets forth the amino acid sequence of the F8R 17-18x.2 meganuclease F8R17-binding monomer.

[0141] SEQ ID NO: 26 sets forth the amino acid sequence of the F8R 17-18x.79 meganuclease F8R17-binding monomer.

[0142] SEQ ID NO: 27 sets forth the amino acid sequence of the F8R 17-18x.88 meganuclease F8R17-binding monomer.

[0143] SEQ ID NO: 28 sets forth the amino acid sequence of the F8R 17-18L.626 meganuclease F8R17-binding monomer.

[0144] SEQ ID NO: 29 sets forth the amino acid sequence of the F8R 17-18L.615 meganuclease F8R17-binding monomer.

[0145] SEQ ID NO: 30 sets forth the amino acid sequence of the F8R 17-18L.553 meganuclease F8R17-binding monomer.

[0146] SEQ ID NO: 31 sets forth the amino acid sequence of the F8R 17-18L1.35 meganuclease F8R18-binding monomer.

[0147] SEQ ID NO: 32 sets forth the amino acid sequence of the F8R 17-18L2.23 meganuclease F8R18-binding monomer.

[0148] SEQ ID NO: 33 sets forth the amino acid sequence of the F8R 17-18x.1 meganuclease F8R18-binding monomer.

[0149] SEQ ID NO: 34 sets forth the amino acid sequence of the F8R 17-18x.2 meganuclease F8R18-binding monomer.

[0150] SEQ ID NO: 35 sets forth the amino acid sequence of the F8R 17-18x.79 meganuclease F8R18-binding monomer.

[0151] SEQ ID NO: 36 sets forth the amino acid sequence of the F8R 17-18x.88 meganuclease F8R18-binding monomer.

[0152] SEQ ID NO: 37 sets forth the amino acid sequence of the F8R 17-18L.626 meganuclease F8R18-binding monomer.

[0153] SEQ ID NO: 38 sets forth the amino acid sequence of the F8R 17-18L.615 meganuclease F8R18-binding monomer.

[0154] SEQ ID NO: 39 sets forth the amino acid sequence of the F8R 17-18L.553 meganuclease F8R18-binding monomer.

[0155] SEQ ID NO: 40 sets forth the nucleic acid sequence of the H1U primer.

[0156] SEQ ID NO: 41 sets forth the nucleic acid sequence of the H1D primer.

[0157] SEQ ID NO: 42 sets forth the nucleic acid sequence of the H3D primer.

[0158] SEQ ID NO: 43 sets forth the nucleic acid sequence of the U1 primer.

[0159] SEQ ID NO: 44 sets forth the nucleic acid sequence of the D1 primer.

[0160] SEQ ID NO: 45 sets forth the nucleic acid sequence of the U3 primer.

[0161] SEQ ID NO: 46 sets forth the nucleic acid sequence of an off-target recognition sequence.

[0162] SEQ ID NO: 47 sets forth the nucleic acid sequence of a K9F8e22F primer sequence.

[0163] SEQ ID NO: 48 sets forth the nucleic acid sequence of a K9F8e22-23FAM probe sequence.

[0164] SEQ ID NO: 49 sets forth the nucleic acid sequence of a K9F8e22-23FAM probe sequence.

[0165] SEQ ID NO: 50 sets forth the nucleic acid sequence of a K9F8e23R primer sequence.

DETAILED DESCRIPTION OF THE INVENTION

1.1 References and Definitions

[0166] The patent and scientific literature referred to herein establishes knowledge that is available to those of skill in the art. The issued US patents, allowed applications, published foreign applications, and references, including GenBank database sequences, which are cited herein are hereby incorporated by reference to the same extent as if each was specifically and individually indicated to be incorporated by reference.

[0167] The present invention can be embodied in different forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art. For example, features illustrated with respect to one embodiment can be incorporated into other embodiments, and features illustrated with respect to a particular embodiment can be deleted from that embodiment. In addition, numerous variations and additions to the embodiments suggested herein will be apparent to those skilled in the art in light of the instant disclosure, which do not depart from the instant invention.

[0168] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.

[0169] All publications, patent applications, patents, and other references mentioned herein are incorporated by reference herein in their entirety.

[0170] As used herein, "a," "an," or "the" can mean one or more than one. For example, "a" cell can mean a single cell or a multiplicity of cells.

[0171] As used herein, unless specifically indicated otherwise, the word "or" is used in the inclusive sense of "and/or" and not the exclusive sense of "either/or."

[0172] As used herein, the terms "nuclease" and "endonuclease" are used interchangeably to refer to naturally-occurring or engineered enzymes which cleave a phosphodiester bond within a polynucleotide chain.

[0173] As used herein, the term "meganuclease" refers to an endonuclease that binds double-stranded DNA at a recognition sequence that is greater than 12 base pairs. In some embodiments, the recognition sequence for a meganuclease of the present disclosure is 22 base pairs. A meganuclease can be an endonuclease that is derived from I-CreI, and can refer to an engineered variant of I-CreI that has been modified relative to natural I-CreI with respect to, for example, DNA-binding specificity, DNA cleavage activity, DNA-binding affinity, or dimerization properties. Methods for producing such modified variants of I-CreI are known in the art (e.g., WO 2007/047859, incorporated by reference in its entirety). A meganuclease as used herein binds to double-stranded DNA as a heterodimer. A meganuclease may also be a "single-chain meganuclease" in which a pair of DNA-binding domains are joined into a single polypeptide using a peptide linker. The term "homing endonuclease" is synonymous with the term "meganuclease." Meganucleases of the present disclosure are substantially non-toxic when expressed in cells, particularly in human T cells, such that cells can be transfected and maintained at 37° C. without observing deleterious effects on cell viability or significant reductions in meganuclease cleavage activity when measured using the methods described herein.

[0174] As used herein, the term "single-chain meganuclease" refers to a polypeptide comprising a pair of nuclease subunits joined by a linker. A single-chain meganuclease has the organization: N-terminal subunit-Linker-C-terminal subunit. The two meganuclease subunits will generally be non-identical in amino acid sequence and will recognize non-identical DNA sequences. Thus, single-chain meganucleases typically cleave pseudo-palindromic or non-palindromic recognition sequences. A single-chain meganuclease may be referred to as a "single-chain heterodimer" or "single-chain heterodimeric meganuclease" although it is not, in fact, dimeric. For clarity, unless otherwise specified, the term "meganuclease" can refer to a dimeric or single-chain meganuclease.

[0175] As used herein, the term "linker" refers to an exogenous peptide sequence used to join two meganuclease subunits into a single polypeptide. A linker may have a sequence that is found in natural proteins, or may be an artificial sequence that is not found in any natural protein. A linker may be flexible and lacking in secondary structure or may have a propensity to form a specific three-dimensional structure under physiological conditions. A linker can include, without limitation, any of those encompassed by U.S. Pat. Nos. 8,445,251, 9,340,777, 9,434,931, and 10,041,053, each of which is incorporated by reference in its entirety. In some embodiments, a linker may have an amino acid sequence comprising residues 154-195 of any one of SEQ ID NOs: 13-21.

[0176] As used herein, the term "TALEN" refers to an endonuclease comprising a DNA-binding domain comprising a plurality of TAL domain repeats fused to a nuclease domain or an active portion thereof from an endonuclease or exonuclease, including but not limited to a restriction endonuclease, homing endonuclease, S1 nuclease, mung bean nuclease, pancreatic DNAse I, micrococcal nuclease, and yeast HO endonuclease. See, for example, Christian et al. (2010) Genetics 186:757-761, which is incorporated by reference in its entirety. Nuclease domains useful for the design of TALENs include those from a Type IIs restriction endonuclease, including but not limited to FokI, FoM, StsI, HhaI, HindIII, NotI, BbvCI, EcoRI, BglI, and AlwI. Additional Type IIs restriction endonucleases are described in International Publication No. WO 2007/014275. In some embodiments, the nuclease domain of the TALEN is a FokI nuclease domain or an active portion thereof. TAL domain repeats can be derived from the TALE (transcription activator-like effector) family of proteins used in the infection process by plant pathogens of the Xanthomonas genus. TAL domain repeats are 33-34 amino acid sequences with divergent 12th and 13th amino acids. These two positions, referred to as the repeat variable dipeptide (RVD), are highly variable and show a strong correlation with specific nucleotide recognition. Each base pair in the DNA target sequence is contacted by a single TAL repeat, with the specificity resulting from the RVD. In some embodiments, the TALEN comprises 16-22 TAL domain repeats. DNA cleavage by a TALEN requires two DNA recognition regions flanking a nonspecific central region (i.e., the "spacer"). The term "spacer" in reference to a TALEN refers to the nucleic acid sequence that separates the two nucleic acid sequences recognized and bound by each monomer constituting a TALEN.
The TAL domain repeats can be native sequences from a naturally-occurring TALE protein or can be redesigned through rational or experimental means to produce a protein which binds to a pre-determined DNA sequence (see, for example, Boch et al. (2009) Science 326(5959):1509-1512 and Moscou and Bogdanove (2009) Science 326(5959):1501, each of which is incorporated by reference in its entirety). See also, U.S. Publication No. 20110145940 and International Publication No. WO 2010/079430 for methods for engineering a TALEN to recognize a specific sequence and examples of RVDs and their corresponding target nucleotides. In some embodiments, each nuclease (e.g., FokI) monomer can be fused to a TAL effector sequence that recognizes a different DNA sequence, and only when the two recognition sites are in close proximity do the inactive monomers come together to create a functional enzyme.
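The one-repeat-per-base relationship described above can be illustrated with a small lookup. The RVD-to-nucleotide table below reflects the commonly reported associations from the literature cited in this paragraph. It is not reproduced from this disclosure, it is deliberately simplified (NN, for example, is also reported to recognize A), and the function name `predicted_target` is hypothetical.

```python
# Simplified, commonly reported RVD associations (an assumption for illustration;
# real RVDs can show degenerate recognition, e.g. NN may also bind A).
RVD_TO_BASE = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "G",
}


def predicted_target(rvds):
    """Map an ordered list of RVDs to the DNA bases they are expected to contact."""
    bases = []
    for rvd in rvds:
        if rvd not in RVD_TO_BASE:
            raise ValueError(f"no base association listed for RVD {rvd!r}")
        bases.append(RVD_TO_BASE[rvd])
    return "".join(bases)


# Hypothetical four-repeat array, far shorter than the 16-22 repeats mentioned above.
example_target = predicted_target(["NI", "HD", "NG", "NN"])
```

Each list element stands for one TAL repeat, so the length of the predicted target equals the number of repeats.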

[0177] As used herein, the term "compact TALEN" refers to an endonuclease comprising a DNA-binding domain with one or more TAL domain repeats fused in any orientation to any portion of the I-TevI homing endonuclease or any of the endonucleases listed in Table 2 in U.S. Application No. 20130117869 (which is incorporated by reference in its entirety), including but not limited to MmeI, EndA, EndI, I-BasI, I-TevII, I-TevIII, I-TwoI, MspI, MvaI, NucA, and NucM. Compact TALENs do not require dimerization for DNA processing activity, alleviating the need for dual target sites with intervening DNA spacers. In some embodiments, the compact TALEN comprises 16-22 TAL domain repeats.

[0178] As used herein, the term "zinc finger nuclease" or "ZFN" refers to a chimeric protein comprising a zinc finger DNA-binding domain fused to a nuclease domain from an endonuclease or exonuclease, including but not limited to a restriction endonuclease, homing endonuclease, S1 nuclease, mung bean nuclease, pancreatic DNAse I, micrococcal nuclease, and yeast HO endonuclease. Nuclease domains useful for the design of zinc finger nucleases include those from a Type IIs restriction endonuclease, including but not limited to FokI, FoM, and StsI restriction enzyme. Additional Type IIs restriction endonucleases are described in International Publication No. WO 2007/014275, which is incorporated by reference in its entirety. The structure of a zinc finger domain is stabilized through coordination of a zinc ion. DNA binding proteins comprising one or more zinc finger domains bind DNA in a sequence-specific manner. The zinc finger domain can be a native sequence or can be redesigned through rational or experimental means to produce a protein which binds to a pre-determined DNA sequence about 18 basepairs in length, comprising a pair of nine basepair half-sites separated by 2-10 basepairs. See, for example, U.S. Pat. Nos. 5,789,538, 5,925,523, 6,007,988, 6,013,453, 6,200,759, and International Publication Nos. WO 95/19431, WO 96/06166, WO 98/53057, WO 98/54311, WO 00/27878, WO 01/60970, WO 01/88197, and WO 02/099084, each of which is incorporated by reference in its entirety. By fusing this engineered protein domain to a nuclease domain, such as FokI nuclease, it is possible to target DNA breaks with genome-level specificity. The selection of target sites, zinc finger proteins and methods for design and construction of zinc finger nucleases are known to those of skill in the art and are described in detail in U.S. Publications Nos. 20030232410, 20050208489, 20050064474, 20050026157, 20060188987 and International Publication No. WO 07/014275, each of which is incorporated by reference in its entirety. Cleavage by a zinc finger nuclease can create a blunt end or a 5' overhang of variable length (frequently four basepairs).

[0179] As used herein, the term "CRISPR" refers to a system comprising a Cas-based endonuclease comprising a Cas enzyme, such as Cas9, Cpf1, or others known in the art, and a guide RNA that directs DNA cleavage by the Cas enzyme by hybridizing to a recognition site in the genomic DNA. The Cas component of a CRISPR is an RNA-guided DNA endonuclease. In certain embodiments, the Cas enzyme is a class II Cas enzyme. In some of these embodiments, the Cas enzyme is a class II, type II enzyme, such as Cas9. In other embodiments, the Cas enzyme is a class II, type V enzyme, such as Cpf1. The guide RNA comprises a direct repeat and a guide sequence (often referred to as a spacer in the context of an endogenous CRISPR system), which is complementary to the target recognition site. In certain embodiments, the CRISPR further comprises a tracrRNA (trans-activating CRISPR RNA) that is complementary (fully or partially) to a direct repeat sequence (sometimes referred to as a tracr-mate sequence) present on the guide RNA. In particular embodiments, the Cas enzyme can be mutated with respect to a corresponding wild-type enzyme such that the enzyme lacks the ability to cleave one strand of a target polynucleotide, functioning as a nickase, cleaving only a single strand of the target DNA. Non-limiting examples of Cas enzymes that function as a nickase include Cas9 enzymes with a D10A mutation within the RuvC I catalytic domain, or with a H840A, N854A, or N863A mutation.

[0180] As used herein, the term "megaTAL" refers to a single-chain endonuclease comprising a transcription activator-like effector (TALE) DNA binding domain with an engineered, sequence-specific homing endonuclease.

[0181] As used herein, with respect to a protein, the term "recombinant" or "engineered" means having an altered amino acid sequence as a result of the application of genetic engineering techniques to nucleic acids which encode the protein, and cells or organisms which express the protein. With respect to a nucleic acid, the term "recombinant" or "engineered" means having an altered nucleic acid sequence as a result of the application of genetic engineering techniques. Genetic engineering techniques include, but are not limited to, PCR and DNA cloning technologies; transfection, transformation and other gene transfer technologies; homologous recombination; site-directed mutagenesis; and gene fusion. In accordance with this definition, a protein having an amino acid sequence identical to a naturally-occurring protein, but produced by cloning and expression in a heterologous host, is not considered recombinant.

[0182] As used herein, the term "wild-type" refers to the most common naturally occurring allele (i.e., polynucleotide sequence) in the allele population of the same type of gene, wherein a polypeptide encoded by the wild-type allele has its original functions. The term "wild-type" also refers to a polypeptide encoded by a wild-type allele. Wild-type alleles (i.e., polynucleotides) and polypeptides are distinguishable from mutant or variant alleles and polypeptides, which comprise one or more mutations and/or substitutions relative to the wild-type sequence(s). Whereas a wild-type allele or polypeptide can confer a normal phenotype in an organism, a mutant or variant allele or polypeptide can, in some instances, confer an altered phenotype. Wild-type nucleases are distinguishable from recombinant or non-naturally-occurring nucleases. The term "wild-type" can also refer to a cell, an organism, and/or a subject which possesses a wild-type allele of a particular gene, or a cell, an organism, and/or a subject used for comparative purposes.

[0183] The term "wild-type orientation" as used herein refers to the orientation of exons 1-22 as found in the wild-type Factor VIII gene. It is understood that a Factor VIII gene having a wild-type orientation can be either the wild-type Factor VIII gene normally present in healthy subjects or, alternatively, a variant thereof which comprises an insertion or deletion ("indel") but still encodes a biologically-active Factor VIII protein. For example, in cases where a nuclease cleavage site is within an intron of the int22h-1 region, nuclease cleavage can cause a reversion of the inverted gene back to the wild-type orientation and may produce an indel at the nuclease cleavage site. In such cases, the Factor VIII gene may be modified but, because the resulting indel is within an intron, the gene still encodes a biologically-active Factor VIII protein.

[0184] As used herein, the term "genetically-modified" refers to a cell or organism in which, or in an ancestor of which, a genomic DNA sequence has been deliberately modified by recombinant technology. As used herein, the term "genetically-modified" encompasses the term "transgenic."

[0185] As used herein with respect to recombinant proteins, the term "modification" means any insertion, deletion, or substitution of an amino acid residue in the recombinant sequence relative to a reference sequence (e.g., a wild-type or a native sequence).

[0186] As used herein, the term "recognition sequence" refers to a DNA sequence that is bound and cleaved by an endonuclease. In the case of a meganuclease, a recognition sequence comprises a pair of inverted, 9 basepair "half-sites" which are separated by four basepairs. In the case of a single-chain meganuclease, the N-terminal domain of the protein contacts a first half-site and the C-terminal domain of the protein contacts a second half-site. Cleavage by a meganuclease produces four basepair 3' "overhangs". "Overhangs", or "sticky ends", are short, single-stranded DNA segments that can be produced by endonuclease cleavage of a double-stranded DNA sequence. In the case of meganucleases and single-chain meganucleases derived from I-CreI, the overhang comprises bases 10-13 of the 22 basepair recognition sequence. In the case of a compact TALEN, the recognition sequence comprises a first CNNNGN sequence that is recognized by the I-TevI domain, followed by a non-specific spacer 4-16 basepairs in length, followed by a second sequence 16-22 bp in length that is recognized by the TAL-effector domain (this sequence typically has a 5' T base). Cleavage by a Compact TALEN produces two basepair 3' overhangs. In the case of a CRISPR, the recognition sequence is the sequence, typically 16-24 basepairs, to which the guide RNA binds to direct cleavage. Full complementarity between the guide sequence and the recognition sequence is not necessarily required to effect cleavage. Cleavage by a CRISPR can produce blunt ends (such as by a class II, type II Cas enzyme) or overhanging ends (such as by a class II, type V Cas enzyme), depending on the Cas enzyme. In those embodiments wherein a Cpf1 enzyme is utilized, cleavage by the CRISPR complex comprising the same will result in 5' overhangs and in certain embodiments, 5 nucleotide 5' overhangs. Each Cas enzyme also requires the recognition of a PAM (protospacer adjacent motif) sequence that is near the recognition sequence complementary to the guide RNA. The precise sequence, length requirements for the PAM, and distance from the target sequence differ depending on the Cas enzyme, but PAMs are typically 2-5 base pair sequences adjacent to the target/recognition sequence. PAM sequences for particular Cas enzymes are known in the art (see, for example, U.S. Pat. No. 8,697,359 and U.S. Publication No. 20160208243, each of which is incorporated by reference in its entirety) and PAM sequences for novel or engineered Cas enzymes can be identified using methods known in the art, such as a PAM depletion assay (see, for example, Karvelis et al. (2017) Methods 121-122:3-8, which is incorporated herein in its entirety). In the case of a zinc finger, the DNA binding domains typically recognize an 18-bp recognition sequence comprising a pair of nine basepair "half-sites" separated by 2-10 basepairs and cleavage by the nuclease creates a blunt end or a 5' overhang of variable length (frequently four basepairs).
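
By way of non-limiting illustration, the layout of a 22 basepair meganuclease recognition sequence described above (two 9 basepair half-sites flanking a four basepair center at bases 10-13) can be sketched in Python. The function name and the example sequence are hypothetical and are not part of the disclosure; the example sequence is an arbitrary placeholder and is not any SEQ ID NO.

```python
def split_recognition_sequence(seq):
    """Split a 22 bp recognition sequence (one strand, read 5'->3') into its parts.

    Positions follow the 1-based numbering used in the text:
    bases 1-9 and 14-22 are the half-sites, bases 10-13 form the
    four-basepair region that becomes the 3' overhang after cleavage.
    """
    seq = seq.upper()
    if len(seq) != 22:
        raise ValueError("expected a 22 bp recognition sequence")
    return {
        "first_half_site": seq[0:9],      # bases 1-9
        "center": seq[9:13],              # bases 10-13
        "second_half_site": seq[13:22],   # bases 14-22
    }


# Arbitrary placeholder sequence, assembled from three pieces for readability.
placeholder = "ACTGACTGA" + "GTGA" + "TCAGTCAGT"
parts = split_recognition_sequence(placeholder)
print(parts)
```

The sketch only partitions the sequence by position; it does not model binding or cleavage.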

[0187] As used herein, the term "target site" or "target sequence" refers to a region of the chromosomal DNA of a cell comprising a recognition sequence for a nuclease.

[0188] As used herein, the term "DNA-binding affinity" or "binding affinity" means the tendency of a nuclease to non-covalently associate with a reference DNA molecule (e.g., a recognition sequence or an arbitrary sequence). Binding affinity is measured by a dissociation constant, K_d. As used herein, a nuclease has "altered" binding affinity if the K_d of the nuclease for a reference recognition sequence is increased or decreased by a statistically significant (p<0.05) amount relative to a reference nuclease.

[0189] As used herein, the term "specificity" means the ability of a nuclease to recognize and cleave double-stranded DNA molecules only at a particular sequence of base pairs referred to as the recognition sequence, or only at a particular set of recognition sequences. The set of recognition sequences will share certain conserved positions or sequence motifs, but may be degenerate at one or more positions. A highly-specific nuclease is capable of cleaving only one or a very few recognition sequences. Specificity can be determined by any method known in the art. As used herein, a nuclease has "altered" specificity if it binds to and cleaves a recognition sequence which is not bound to and cleaved by a reference nuclease (e.g., a wild-type) under physiological conditions, or if the rate of cleavage of a recognition sequence is increased or decreased by a biologically significant amount (e.g., at least 2×, or 2×-10×) relative to a reference nuclease.

[0190] As used herein, the term "homologous recombination" or "HR" refers to the natural, cellular process in which a double-stranded DNA-break is repaired using a homologous DNA sequence as the repair template (see, e.g. Cahill et al. (2006), Front. Biosci. 11:1958-1976). The homologous DNA sequence may be an endogenous chromosomal sequence or an exogenous nucleic acid that was delivered to the cell.

[0191] As used herein, the term "non-homologous end-joining" or "NHEJ" refers to the natural, cellular process in which a double-stranded DNA-break is repaired by the direct joining of two non-homologous DNA segments (see, e.g. Cahill et al. (2006), Front. Biosci. 11:1958-1976). DNA repair by non-homologous end-joining is error-prone and frequently results in the untemplated addition or deletion of DNA sequences at the site of repair. In some instances, cleavage at a target recognition sequence results in NHEJ at a target recognition site. Nuclease-induced cleavage of a target site in the coding sequence of a gene followed by DNA repair by NHEJ can introduce mutations into the coding sequence, such as frameshift mutations, that disrupt gene function. Thus, engineered nucleases can be used to effectively knock-out a gene in a population of cells.

[0192] As used herein with respect to both amino acid sequences and nucleic acid sequences, the terms "percent identity," "sequence identity," "percentage similarity," "sequence similarity" and the like refer to a measure of the degree of similarity of two sequences based upon an alignment of the sequences which maximizes similarity between aligned amino acid residues or nucleotides, and which is a function of the number of identical or similar residues or nucleotides, the number of total residues or nucleotides, and the presence and length of gaps in the sequence alignment. A variety of algorithms and computer programs are available for determining sequence similarity using standard parameters. As used herein, sequence similarity is measured using the BLASTp program for amino acid sequences and the BLASTn program for nucleic acid sequences, both of which are available through the National Center for Biotechnology Information (www.ncbi.nlm.nih.gov/), and are described in, for example, Altschul et al. (1990), J. Mol. Biol. 215:403-410; Gish and States (1993), Nature Genet. 3:266-272; Madden et al. (1996), Meth. Enzymol. 266:131-141; Altschul et al. (1997), Nucleic Acids Res. 25:3389-3402; Zhang et al. (2000), J. Comput. Biol. 7(1-2):203-14. As used herein, percent similarity of two amino acid sequences is the score based upon the following parameters for the BLASTp algorithm: word size=3; gap opening penalty=-11; gap extension penalty=-1; and scoring matrix=BLOSUM62. As used herein, percent similarity of two nucleic acid sequences is the score based upon the following parameters for the BLASTn algorithm: word size=11; gap opening penalty=-5; gap extension penalty=-2; match reward=1; and mismatch penalty=-3.

[0193] As used herein with respect to modifications of two proteins or amino acid sequences, the term "corresponding to" is used to indicate that a specified modification in the first protein is a substitution of the same amino acid residue as in the modification in the second protein, and that the amino acid position of the modification in the first protein corresponds to or aligns with the amino acid position of the modification in the second protein when the two proteins are subjected to standard sequence alignments (e.g., using the BLASTp program). Thus, the modification of residue "X" to amino acid "A" in the first protein will correspond to the modification of residue "Y" to amino acid "A" in the second protein if residues X and Y correspond to each other in a sequence alignment, and despite the fact that X and Y may be different numbers.
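
As a non-limiting sketch of how a "corresponding" position can be read off a pairwise alignment, the following Python function maps a residue number in a first protein to the aligned residue number in a second protein. It assumes the alignment has already been computed (for example, by BLASTp) and is supplied as two equal-length strings with "-" marking gaps. The function name and the toy sequences are hypothetical.

```python
def corresponding_position(aligned_first, aligned_second, position_in_first):
    """Map a 1-based residue number in the first protein to the second protein.

    Both arguments are rows of one pairwise alignment, with '-' for gaps.
    Returns the 1-based residue number in the second protein, or None if the
    residue of the first protein is aligned to a gap.
    """
    if len(aligned_first) != len(aligned_second):
        raise ValueError("aligned sequences must have the same length")
    count_first = 0
    count_second = 0
    for res_first, res_second in zip(aligned_first, aligned_second):
        if res_first != "-":
            count_first += 1
        if res_second != "-":
            count_second += 1
        if res_first != "-" and count_first == position_in_first:
            return count_second if res_second != "-" else None
    raise ValueError("position is beyond the end of the first protein")


# Toy alignment: the second protein lacks the first two residues of the first.
first = "MKTAYLQ"
second = "--TAYLQ"
print(corresponding_position(first, second, 4))  # residue X=4 aligns with Y=2
```

In this toy case residue 4 of the first protein and residue 2 of the second protein correspond to each other even though the numbers differ, which is the situation the definition above addresses.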

[0194] As used herein, the term "recognition half-site," "recognition sequence half-site," or simply "half-site" means a nucleic acid sequence in a double-stranded DNA molecule which is recognized by a monomer of a homodimeric or heterodimeric meganuclease, or by one subunit of a single-chain meganuclease.

[0195] As used herein, the term "hypervariable region" refers to a localized sequence within a meganuclease monomer or subunit that comprises amino acids with relatively high variability. A hypervariable region can comprise about 50-60 contiguous residues, about 53-57 contiguous residues, or preferably about 56 residues. In some embodiments, the residues of a hypervariable region may correspond to positions 24-79 or positions 215-270 of any one of SEQ ID NOs: 13-21. A hypervariable region can comprise one or more residues that contact DNA bases in a recognition sequence and can be modified to alter base preference of the monomer or subunit. A hypervariable region can also comprise one or more residues that bind to the DNA backbone when the meganuclease associates with a double-stranded DNA recognition sequence. Such residues can be modified to alter the binding affinity of the meganuclease for the DNA backbone and the target recognition sequence. In different embodiments of the invention, a hypervariable region may comprise between 1-20 residues that exhibit variability and can be modified to influence base preference and/or DNA-binding affinity. In particular embodiments, a hypervariable region comprises between about 15-18 residues that exhibit variability and can be modified to influence base preference and/or DNA-binding affinity. In some embodiments, variable residues within a hypervariable region correspond to one or more of positions 24, 26, 28, 32, 33, 38, 40, 42, 44, 46, 68, 70, 75, and 77 of any one of SEQ ID NOs: 13-21. In other embodiments, variable residues within a hypervariable region correspond to one or more of positions 215, 217, 219, 221, 223, 224, 229, 231, 233, 235, 237, 245, 259, 261, 266, and 268 of any one of SEQ ID NOs: 13-21. In certain embodiments, a hypervariable region further comprises a variable residue corresponding to residue 245 of SEQ ID NO: 16. In certain embodiments, a hypervariable region comprises a variable residue corresponding to residue 262 of SEQ ID NO: 19. In certain embodiments, a hypervariable region comprises a variable residue corresponding to one or more of residues 262, 263, 264, and 265 of SEQ ID NO: 19 or SEQ ID NO: 21.
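
For illustration only, the position ranges recited above can be checked with a few lines of Python: each range spans 56 residues, and the listed variable positions fall inside their respective ranges. The constant and function names are hypothetical, and the sequence used at the end is an arbitrary placeholder rather than any of SEQ ID NOs: 13-21.

```python
# Position ranges and variable positions as recited in the text (1-based, inclusive).
HVR1_RANGE = (24, 79)
HVR2_RANGE = (215, 270)
HVR1_VARIABLE = [24, 26, 28, 32, 33, 38, 40, 42, 44, 46, 68, 70, 75, 77]
HVR2_VARIABLE = [215, 217, 219, 221, 223, 224, 229, 231, 233, 235, 237,
                 245, 259, 261, 266, 268]


def region_length(bounds):
    """Number of residues in an inclusive 1-based range."""
    first, last = bounds
    return last - first + 1


def all_within(positions, bounds):
    """True if every position lies inside the inclusive range."""
    first, last = bounds
    return all(first <= p <= last for p in positions)


def extract_region(sequence, bounds):
    """Slice an inclusive 1-based range out of a protein sequence string."""
    first, last = bounds
    return sequence[first - 1:last]


print(region_length(HVR1_RANGE), region_length(HVR2_RANGE))
print(all_within(HVR1_VARIABLE, HVR1_RANGE), all_within(HVR2_VARIABLE, HVR2_RANGE))

# Arbitrary placeholder protein, long enough to contain both ranges.
placeholder = "ACDEFGHIKLMNPQRSTVWY" * 18
print(len(extract_region(placeholder, HVR2_RANGE)))
```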

[0196] As used herein, the terms "Factor VIII gene," "F8 gene," and the like, refer to a gene located on the X chromosome which encodes the coagulation Factor VIII protein. In humans, the Factor VIII gene, identified by NCBI as Gene ID No. 2157, is located from base pair 154,835,788 to base pair 155,026,934 on the X chromosome. In canines, the Factor VIII gene can be the gene identified by NCBI Reference Sequence: NM_001003212.1. It is understood that the term "Factor VIII gene" can include both a wild-type Factor VIII gene and a Factor VIII gene which comprises naturally-occurring polymorphisms and/or mutations that allow for the production of a functional Factor VIII protein.

[0197] As used herein, the terms "int22h-1" and "int22h-1 sequence" refer to a sequence positioned within intron 22 of the Factor VIII gene having a size of approximately 9.5 kb (Bagnall et al. (2006) Journal of Thrombosis and Haemostasis 4:591-598) and can further refer to the human sequence identified by GenBank as Accession No. AY619999.1. The int22h-1 sequence is characterized as comprising a CpG island, a coding sequence for the H2AFB1 histone protein, and a coding sequence for the Factor VIII-Associated 1 protein (F8A1; also referred to as the intron 22 protein). The int22h-1 sequence is further characterized as being identical to, or having high homology with, at least one repeat sequence that is positioned telomeric to the Factor VIII gene on the X chromosome. In humans, two repeat sequences, referred to as int22h-2 and int22h-3, are positioned telomeric to the Factor VIII gene on the X chromosome. In particular embodiments of the invention, the human int22h-1 sequence can comprise SEQ ID NO: 3. In other particular embodiments of the invention, the canine int22h-1 sequence can comprise SEQ ID NO: 4.

[0198] As used herein, the terms "F8A1 coding sequence" and "intron 22 protein coding sequence" are used interchangeably and refer to a sequence positioned within the int22h-1 sequence which encodes the F8A1 protein. The F8A1 coding sequence is intronless and is transcribed in the opposite direction as the Factor VIII gene. In one embodiment, the wild-type human F8A1 coding sequence can comprise SEQ ID NO: 5. In another embodiment, the wild-type canine F8A1 coding sequence can comprise SEQ ID NO: 6, which has ~75% homology to the human F8A1 coding sequence. It is understood that reference to an F8A1 coding sequence includes a wild-type F8A1 sequence and an F8A1 sequence comprising naturally-occurring polymorphisms and/or mutations that allow for the production of a functional F8A1 protein.

[0199] As used herein, the terms "inversion" and "inversion of exons 1-22" refer to a mutation of a Factor VIII gene wherein an intra-chromosomal homologous recombination event occurs between the int22h-1 sequence of the Factor VIII gene and an identical or closely related, inversely oriented, repeat sequence positioned telomeric to the Factor VIII gene on the X chromosome, which results in an inversion of exons 1-22 with respect to exons 23-26.

[0200] As used herein, the term "reversion" refers to an intra-chromosomal homologous recombination event in a cell comprising an inversion of exons 1-22 of the Factor VIII gene, wherein a double-strand break is produced within the int22h-1 sequence to promote recombination with a repeat sequence telomeric to the Factor VIII gene on the X chromosome. Such recombination results in the corrected orientation of exons 1-22 and the production of a functional Factor VIII gene having a wild-type orientation which encodes a biologically-active Factor VIII protein.
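
The relationship between "inversion" and "reversion" can be pictured with a deliberately simplified toy model in Python. The model treats the relevant stretch of the X chromosome as an ordered list of labeled segments, each with an orientation, and treats recombination between two inversely oriented repeats as flipping everything that lies between them. The segment labels, the "intergenic region" placeholder, and the function names are hypothetical; the model ignores the exchange of sequence within the repeats themselves and is not a description of the molecular mechanism.

```python
def flip(strand):
    """Reverse an orientation label."""
    return "-" if strand == "+" else "+"


def recombine_between_inverted_repeats(chromosome, repeat_a, repeat_b):
    """Invert the segments lying between two inversely oriented repeats.

    `chromosome` is a list of (label, strand) tuples in map order. The
    segments between the two named repeats are reversed in order and their
    orientation is flipped; the repeats and everything outside stay put.
    """
    labels = [label for label, _ in chromosome]
    i, j = sorted((labels.index(repeat_a), labels.index(repeat_b)))
    inverted_middle = [(label, flip(strand))
                       for label, strand in reversed(chromosome[i + 1:j])]
    return chromosome[:i + 1] + inverted_middle + chromosome[j:]


# "+" marks the wild-type orientation of each segment in this toy model.
wild_type = [
    ("exons 23-26", "+"),
    ("int22h-1", "+"),
    ("exons 1-22", "+"),
    ("intergenic region", "+"),
    ("telomeric repeat", "-"),   # inversely oriented relative to int22h-1
]

inverted = recombine_between_inverted_repeats(wild_type, "int22h-1", "telomeric repeat")
reverted = recombine_between_inverted_repeats(inverted, "int22h-1", "telomeric repeat")

print(inverted)
print(reverted == wild_type)
```

In the toy model a first recombination event places exons 1-22 in the inverted orientation relative to exons 23-26, and a second event between the same repeats restores the wild-type arrangement, which is the outcome the term "reversion" refers to.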

[0201] As used herein, the term "CpG site" refers to regions of DNA in the genome characterized by a cytosine nucleotide followed by a guanine nucleotide in the linear sequence of bases. Cytosine bases in CpG sites can be methylated.
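
As a minimal, non-limiting sketch, CpG sites as defined above (a cytosine immediately followed by a guanine on the same strand) can be located in a sequence string as follows. The function names and the test sequences are hypothetical and are not sequences from the disclosure.

```python
def cpg_positions(seq):
    """Return the 1-based positions of the C in every CpG dinucleotide."""
    seq = seq.upper()
    return [i + 1 for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def lacks_cpg(seq):
    """True if the sequence contains no CpG site."""
    return not cpg_positions(seq)


print(cpg_positions("ATCGGCGA"))
print(lacks_cpg("ATGCAT"))
```

Because "CG" read 5' to 3' on one strand is also "CG" on the complementary strand, scanning a single strand is sufficient to detect every CpG site in a double-stranded sequence.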

[0202] The terms "recombinant DNA construct," "recombinant construct," "expression cassette," "expression construct," "chimeric construct," "construct," and "recombinant DNA fragment" are used interchangeably herein and are nucleic acid fragments. A recombinant construct comprises an artificial combination of nucleic acid fragments, including, without limitation, regulatory and coding sequences that are not found together in nature. For example, a recombinant DNA construct may comprise regulatory sequences and coding sequences that are derived from different sources, or regulatory sequences and coding sequences derived from the same source and arranged in a manner different than that found in nature. Such a construct may be used by itself or may be used in conjunction with a vector.

[0203] As used herein, a "vector" or "recombinant DNA vector" may be a construct that includes a replication system and sequences that are capable of transcription and translation of a polypeptide-encoding sequence in a given host cell. If a vector is used then the choice of vector is dependent upon the method that will be used to transform host cells as is well known to those skilled in the art. Vectors can include, without limitation, plasmid vectors and recombinant AAV vectors, or any other vector known in the art suitable for delivering a gene encoding a nuclease of the invention to a target cell. The skilled artisan is well aware of the genetic elements that must be present on the vector in order to successfully transform, select and propagate host cells comprising any of the isolated nucleotides or nucleic acid sequences of the invention.

[0204] As used herein, a "vector" can also refer to a viral vector. Viral vectors can include, without limitation, retroviral vectors, lentiviral vectors, adenoviral vectors, and adeno-associated viral vectors (AAV).

[0205] As used herein, a "control" or "control cell" refers to a cell that provides a reference point for measuring changes in genotype or phenotype of a genetically-modified cell. A control cell may comprise, for example: (a) a wild-type cell, i.e., of the same genotype as the starting material for the genetic alteration which resulted in the genetically-modified cell; (b) a cell of the same genotype as the genetically-modified cell but which has been transformed with a null construct (i.e., with a construct which has no known effect on the trait of interest); or, (c) a cell genetically identical to the genetically-modified cell but which is not exposed to conditions or stimuli or further genetic modifications that would induce expression of altered genotype or phenotype.

[0206] As used herein, the terms "treatment" or "treating a subject" refer to the administration of an effective amount of an engineered nuclease of the invention, or an effective amount of a nucleic acid encoding an engineered nuclease of the invention, to a subject having hemophilia A for the purpose of correcting an inversion of exons 1-22 in the Factor VIII gene in cells which normally express Factor VIII in wild-type subjects. Such treatment results in correction of the Factor VIII gene in a number of cells sufficient to increase circulating levels of Factor VIII in the subject, and either partial or complete relief of one or more symptoms of hemophilia A in the subject including, but not limited to, a reduction in clotting time. The terms "treatment" or "treating a subject" can further refer to the administration of an effective amount of a genetically-modified cell comprising a Factor VIII gene having a wild-type orientation to a subject according to the method of the invention, wherein the genetically-modified cell is delivered to a target tissue and either produces biologically-active Factor VIII, or differentiates into a cell which produces biologically-active Factor VIII, in an amount sufficient to increase the circulating levels of biologically-active Factor VIII in the subject, resulting in either partial or complete relief of one or more symptoms of hemophilia A including, but not limited to, a reduction in clotting time. In some aspects, an engineered nuclease of the invention, a nucleic acid encoding the same, or a genetically-modified cell of the invention is administered during treatment in the form of a pharmaceutical composition of the invention.

[0207] The term "effective amount" or "therapeutically effective amount" refers to an amount sufficient to effect beneficial or desirable biological and/or clinical results. The therapeutically effective amount will vary depending on the formulation or composition used, the disease and its severity and the age, weight, physical condition and responsiveness of the subject to be treated. In specific embodiments, an effective amount of the engineered meganuclease or pharmaceutical compositions disclosed herein reduces clotting time in a subject having hemophilia A, and/or increases detectable circulating levels of functional Factor VIII protein in the subject's blood.

[0208] The term "lipid nanoparticle" refers to a lipid composition having a typically spherical structure with an average diameter between 10 and 1000 nanometers. In some formulations, lipid nanoparticles can comprise at least one cationic lipid, at least one non-cationic lipid, and at least one conjugated lipid. Lipid nanoparticles known in the art that are suitable for encapsulating nucleic acids, such as mRNA, are contemplated for use in the invention.

[0209] As used herein, the recitation of a numerical range for a variable is intended to convey that the invention may be practiced with the variable equal to any of the values within that range. Thus, for a variable which is inherently discrete, the variable can be equal to any integer value within the numerical range, including the end-points of the range. Similarly, for a variable which is inherently continuous, the variable can be equal to any real value within the numerical range, including the end-points of the range. As an example, and without limitation, a variable which is described as having values between 0 and 2 can take the values 0, 1 or 2 if the variable is inherently discrete, and can take the values 0.0, 0.1, 0.01, 0.001, or any other real values ≥0 and ≤2 if the variable is inherently continuous.

2.1 Principle of the Invention

[0210] The present invention is based, in part, on the hypothesis that engineered nucleases that cleave recognition sequences within an int22h-1 sequence of the Factor VIII gene in both the human and canine genome can be used to correct an inversion of exons 1-22 in the Factor VIII gene and are thus useful for the treatment of human patients with hemophilia A and the clinically-relevant canine model of hemophilia A. More specifically, nucleases can be engineered to recognize and cleave a recognition sequence within an int22h-1 sequence of the Factor VIII gene that does not comprise a CpG site and has a high degree of homology between the human and canine genome to produce a double-strand break in either the human or canine genome. Intra-chromosomal homologous recombination can then occur between the int22h-1 sequence and a repeat sequence which is telomeric to the Factor VIII gene on the X chromosome, resulting in a reversion of exons 1-22 and the production of a Factor VIII gene having a wild-type orientation in target cells of the subject, which encodes a biologically-active Factor VIII protein.

[0211] The invention is also based, in part, on the hypothesis that pluripotent cells (e.g., induced pluripotent stem (iPS) cells) comprising an inversion of exons 1-22 in the Factor VIII gene can be obtained and contacted with an engineered nuclease of the invention (or a nucleic acid encoding the same) in order to correct the Factor VIII gene by the same mechanism described above. Such pluripotent cells can then be administered to a subject having hemophilia A, wherein the cells are delivered to a target tissue (e.g., the liver or the circulatory system) and differentiate into cells which express biologically-active Factor VIII in the subject.

[0212] Thus, the present invention encompasses engineered nucleases, and particularly engineered recombinant meganucleases, which recognize and cleave a recognition sequence within the int22h-1 sequence of a Factor VIII gene in both the human and canine genomes. The present invention also encompasses methods of using such engineered nucleases to make genetically-modified cells, and the use of such cells in a pharmaceutical composition and in methods for treating hemophilia A. Further, the invention encompasses pharmaceutical compositions comprising engineered nuclease proteins, nucleic acids encoding engineered nucleases, or genetically-modified cells of the invention, and the use of such compositions for the treatment of hemophilia A.

2.2 Nucleases for Recognizing and Cleaving Recognition Sequences within an int22h-1 Sequence of the Factor VIII Gene

[0213] It is known in the art that it is possible to use a site-specific nuclease to make a DNA break in the genome of a living cell, and that such a DNA break can result in permanent modification of the genome via homologous recombination of the cleaved target site with an identical or highly homologous DNA sequence within the genome.

[0214] Thus, in different embodiments, a variety of different types of nuclease are useful for practicing the invention. In one embodiment, the invention can be practiced using engineered recombinant meganucleases. In another embodiment, the invention can be practiced using a CRISPR nuclease or CRISPR Nickase. Methods for making CRISPRs and CRISPR Nickases that recognize pre-determined DNA sites are known in the art, for example Ran, et al. (2013) Nat Protoc. 8:2281-308. In another embodiment, the invention can be practiced using TALENs or Compact TALENs. Methods for making TALE domains that bind to pre-determined DNA sites are known in the art, for example Reyon et al. (2012) Nat Biotechnol. 30:460-5. In another embodiment, the invention can be practiced using zinc finger nucleases (ZFNs). In a further embodiment, the invention can be practiced using megaTALs.

[0215] In preferred embodiments, the nucleases used to practice the invention are single-chain meganucleases. A single-chain meganuclease comprises an N-terminal subunit and a C-terminal subunit joined by a linker peptide. Each of the two domains recognizes half of the recognition sequence (i.e., a recognition half-site) and the site of DNA cleavage is at the middle of the recognition sequence near the interface of the two subunits. DNA strand breaks are offset by four base pairs such that DNA cleavage by a meganuclease generates a pair of four base pair, 3' single-strand overhangs.

[0216] In some examples, recombinant meganucleases of the invention have been engineered to recognize and cleave both the human F8R 17-18 recognition sequence (SEQ ID NO: 9) and the canine F8R 17-18 recognition sequence (SEQ ID NO: 11), which are represented by the consensus F8R 17-18 recognition sequence (SEQ ID NO: 7). The F8R 17-18 recognition sequence is positioned within both the int22h-1 sequence and the F8A1 sequence. Such recombinant meganucleases are collectively referred to herein as "F8R 17-18 meganucleases." Exemplary F8R 17-18 meganucleases are provided in SEQ ID NOs: 13-21.

[0217] Recombinant meganucleases of the invention comprise a first subunit, comprising a first hypervariable (HVR1) region, and a second subunit, comprising a second hypervariable (HVR2) region. Further, the first subunit binds to a first recognition half-site in the recognition sequence (i.e., the F8R17 half-site), and the second subunit binds to a second recognition half-site in the recognition sequence (e.g., the F8R18 half-site). In embodiments where the recombinant meganuclease is a single-chain meganuclease, the first and second subunits can be oriented such that the first subunit, which comprises the HVR1 region and binds the first half-site, is positioned as the N-terminal subunit, and the second subunit, which comprises the HVR2 region and binds the second half-site, is positioned as the C-terminal subunit. In alternative embodiments, the first and second subunits can be oriented such that the first subunit, which comprises the HVR1 region and binds the first half-site, is positioned as the C-terminal subunit, and the second subunit, which comprises the HVR2 region and binds the second half-site, is positioned as the N-terminal subunit. Exemplary F8R 17-18 meganucleases of the invention are provided in Table 1.

TABLE 1. Exemplary recombinant meganucleases engineered to recognize and cleave the consensus F8R 17-18 recognition sequence (SEQ ID NO: 7)

Meganuclease	AA SEQ ID	F8R17 Subunit Residues	F8R17 Subunit SEQ ID	*F8R17 Subunit %	F8R18 Subunit Residues	F8R18 Subunit SEQ ID	*F8R18 Subunit %
F8R 17-18L1.35	13	7-153	22	100	198-344	31	100
F8R 17-18L2.23	14	7-153	23	97.96	198-344	32	92.52
F8R 17-18x.1	15	7-153	24	97.28	198-344	33	98.64
F8R 17-18x.2	16	7-153	25	97.28	198-344	34	91.84
F8R 17-18x.79	17	7-153	26	96.6	198-344	35	92.52
F8R 17-18x.88	18	7-153	27	97.28	198-344	36	92.52
F8R 17-18L.626	19	7-153	28	96.6	198-344	37	96.6
F8R 17-18L.615	20	7-153	29	96.6	198-344	38	97.28
F8R 17-18L.553	21	7-153	30	96.6	198-344	39	94.56

*"F8R17 Subunit %" and "F8R18 Subunit %" represent the amino acid sequence identity between the F8R17-binding and F8R18-binding subunit regions of each meganuclease and the F8R17-binding and F8R18-binding subunit regions, respectively, of the F8R 17-18L1.35 meganuclease.
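
As a worked illustration of the percentages in Table 1, the subunit residue ranges 7-153 and 198-344 each span 147 residues, so a listed identity can be converted into an approximate count of identical residues. This sketch assumes that each percentage was computed over an ungapped alignment of all 147 positions; the table does not state the denominator, so the counts are an inference rather than a recitation of the disclosure. Function names are hypothetical.

```python
def subunit_length(first, last):
    """Number of residues in an inclusive 1-based residue range."""
    return last - first + 1


def identical_residues(percent_identity, length):
    """Nearest whole number of identical residues implied by a percentage."""
    return round(percent_identity / 100 * length)


length = subunit_length(7, 153)          # same length as residues 198-344
for percent in (100, 98.64, 97.96, 97.28, 96.6, 94.56, 92.52, 91.84):
    same = identical_residues(percent, length)
    # Recompute the percentage from the whole-number count as a consistency check.
    print(percent, same, length - same, round(100 * same / length, 2))
```

Under that assumption, for example, 97.96% corresponds to 144 identical residues out of 147 (three differences from the F8R 17-18L1.35 subunit), and 92.52% corresponds to 136 out of 147.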

2.3 Methods for Delivering and Expressing Nucleases

[0218] The invention provides methods for producing genetically-modified cells using engineered nucleases that recognize and cleave recognition sequences found within an intron 22 sequence of a Factor VIII gene. The invention further provides methods for treating hemophilia A in a subject by administering a pharmaceutical composition comprising a pharmaceutically acceptable carrier and an engineered nuclease of the invention (or a nucleic acid encoding the engineered nuclease). In each case, the invention requires that an engineered nuclease of the invention can be delivered to and/or expressed from DNA/RNA in appropriate cells that comprise an inversion of exons 1-22 in a Factor VIII gene and would typically express Factor VIII in a healthy subject (e.g., hepatic sinusoidal endothelial cells or hematopoietic endothelial cells, or progenitor cells which differentiate into the same).

[0219] Engineered nucleases of the invention can be delivered into a cell in the form of protein or, preferably, as a nucleic acid encoding the engineered nuclease. Such nucleic acid can be DNA (e.g., circular or linearized plasmid DNA or PCR products) or RNA (e.g., mRNA). For embodiments in which the engineered nuclease coding sequence is delivered in DNA form, it should be operably linked to a promoter to facilitate transcription of the nuclease gene. Mammalian promoters suitable for the invention include constitutive promoters such as the cytomegalovirus early (CMV) promoter (Thomsen et al. (1984), Proc Natl Acad Sci USA. 81(3):659-63) or the SV40 early promoter (Benoist and Chambon (1981), Nature. 290(5804):304-10) as well as inducible promoters such as the tetracycline-inducible promoter (Dingermann et al. (1992), Mol Cell Biol. 12(9):4038-45). An engineered nuclease of the invention can also be operably linked to a synthetic promoter. Synthetic promoters can include, without limitation, the JeT promoter (WO 2002/012514).

[0220] In some embodiments, mRNA encoding a nuclease is delivered to a cell because this reduces the likelihood that the gene encoding the engineered nuclease will integrate into the genome of the cell. Such mRNA encoding an engineered nuclease can be produced using methods known in the art such as in vitro transcription. In some embodiments, the mRNA is capped using 7-methyl-guanosine. In some embodiments, the mRNA may be polyadenylated.

[0221] In another particular embodiment, a nucleic acid encoding a nuclease of the invention can be introduced into the cell using a single-stranded DNA template. The single-stranded DNA can further comprise a 5' and/or a 3' AAV inverted terminal repeat (ITR) upstream and/or downstream of the sequence encoding the engineered nuclease. In other embodiments, the single-stranded DNA can further comprise a 5' and/or a 3' homology arm upstream and/or downstream of the sequence encoding the engineered nuclease.

[0222] In another particular embodiment, genes encoding a nuclease of the invention can be introduced into a cell using a linearized DNA template. In some examples, a plasmid DNA encoding a nuclease can be digested by one or more restriction enzymes such that the circular plasmid DNA is linearized prior to being introduced into a cell.

[0223] Purified nuclease proteins can be delivered into cells to cleave genomic DNA by a variety of different mechanisms known in the art, including those further detailed herein below.

[0224] The target tissue(s) for delivery of recombinant nucleases of the invention include, without limitation, cells of the liver, preferably hepatic sinusoidal endothelial cells or, alternatively, progenitor cells which differentiate into hepatic sinusoidal endothelial cells. Target tissues can also include, without limitation, cells in the circulatory system, preferably hematopoietic endothelial cells or, alternatively, progenitor cells which differentiate into hematopoietic endothelial cells. As discussed, nucleases of the invention can be delivered as purified protein or as RNA or DNA encoding the nucleases. In one embodiment, nuclease proteins, or mRNA, or DNA vectors encoding nucleases, are supplied to target cells (e.g., cells in the liver or cells in the circulatory system) via injection directly to the target tissue. Alternatively, nuclease protein, mRNA, or DNA can be delivered systemically via the circulatory system.

[0225] In some embodiments, nuclease proteins, or DNA/mRNA encoding nucleases, are formulated for systemic administration, or administration to target tissues, in a pharmaceutically acceptable carrier in accordance with known techniques. See, e.g., Remington, The Science And Practice of Pharmacy (21st ed. 2005). In the manufacture of a pharmaceutical formulation according to the invention, proteins/DNA/mRNA are typically admixed with a pharmaceutically acceptable carrier. The carrier must, of course, be acceptable in the sense of being compatible with any other ingredients in the formulation and must not be deleterious to the patient. The carrier can be a solid or a liquid, or both, and can be formulated with the compound as a unit-dose formulation.

[0226] In some embodiments, nuclease proteins, or DNA/mRNA encoding the nuclease, are coupled to a cell penetrating peptide or targeting ligand to facilitate cellular uptake. Examples of cell penetrating peptides known in the art include poly-arginine (Jearawiriyapaisarn, et al. (2008) Mol Ther. 16:1624-9), TAT peptide from the HIV virus (Hudecz et al. (2005), Med. Res. Rev. 25: 679-736), MPG (Simeoni, et al. (2003) Nucleic Acids Res. 31:2717-2724), Pep-1 (Deshayes et al. (2004) Biochemistry 43: 7698-7706), and HSV-1 VP-22 (Deshayes et al. (2005) Cell Mol Life Sci. 62:1839-49). In an alternative embodiment, nuclease proteins, or DNA/mRNA encoding nucleases, are coupled covalently or non-covalently to an antibody that recognizes a specific cell-surface receptor expressed on target cells such that the nuclease protein/DNA/mRNA binds to and is internalized by the target cells. Alternatively, nuclease protein/DNA/mRNA can be coupled covalently or non-covalently to the natural ligand (or a portion of the natural ligand) for such a cell-surface receptor (McCall, et al. (2014) Tissue Barriers. 2(4):e944449; Dinda, et al. (2013) Curr Pharm Biotechnol. 14:1264-74; Kang, et al. (2014) Curr Pharm Biotechnol. 15(3):220-30; Qian et al. (2014) Expert Opin Drug Metab Toxicol. 10(11):1491-508).

[0227] In some embodiments, nuclease proteins, or DNA/mRNA encoding nucleases, are encapsulated within biodegradable hydrogels for injection or implantation within the desired region of the liver (e.g., in proximity to hepatic sinusoidal endothelial cells or hematopoietic endothelial cells, or progenitor cells which differentiate into the same). Hydrogels can provide sustained and tunable release of the therapeutic payload to the desired region of the target tissue without the need for frequent injections, and stimuli-responsive materials (e.g., temperature- and pH-responsive hydrogels) can be designed to release the payload in response to environmental or externally applied cues (Kang Derwent et al. (2008) Trans Am Ophthalmol Soc. 106:206-214).

[0228] In some embodiments, nuclease proteins, or DNA/mRNA encoding nucleases, are coupled covalently or, preferably, non-covalently to a nanoparticle or encapsulated within such a nanoparticle using methods known in the art (Sharma, et al. (2014) Biomed Res Int. 2014). A nanoparticle is a nanoscale delivery system whose length scale is <1 μm, preferably <100 nm. Such nanoparticles may be designed using a core composed of metal, lipid, polymer, or biological macromolecule, and multiple copies of the nuclease proteins, mRNA, or DNA can be attached to or encapsulated with the nanoparticle core. This increases the copy number of the protein/mRNA/DNA that is delivered to each cell and, so, increases the intracellular expression of each nuclease to maximize the likelihood that the target recognition sequences will be cut. The surface of such nanoparticles may be further modified with polymers or lipids (e.g., chitosan, cationic polymers, or cationic lipids) to form a core-shell nanoparticle whose surface confers additional functionalities to enhance cellular delivery and uptake of the payload (Jian et al. (2012) Biomaterials. 33(30): 7621-30). Nanoparticles may additionally be advantageously coupled to targeting molecules to direct the nanoparticle to the appropriate cell type and/or increase the likelihood of cellular uptake. Examples of such targeting molecules include antibodies specific for cell-surface receptors and the natural ligands (or portions of the natural ligands) for cell surface receptors.

[0229] In some embodiments, the nuclease proteins or DNA/mRNA encoding the nucleases are encapsulated within liposomes or complexed using cationic lipids (see, e.g., Lipofectamine™, Life Technologies Corp., Carlsbad, Calif.; Zuris et al. (2015) Nat Biotechnol. 33: 73-80; Mishra et al. (2011) J Drug Deliv. 2011:863734). The liposome and lipoplex formulations can protect the payload from degradation, enhance accumulation and retention at the target site, and facilitate cellular uptake and delivery efficiency through fusion with and/or disruption of the cellular membranes of the target cells.

[0230] In some embodiments, nuclease proteins, or DNA/mRNA encoding nucleases, are encapsulated within polymeric scaffolds (e.g., PLGA) or complexed using cationic polymers (e.g., PEI, PLL) (Tamboli et al. (2011) Ther Deliv. 2(4): 523-536). Polymeric carriers can be designed to provide tunable drug release rates through control of polymer erosion and drug diffusion, and high drug encapsulation efficiencies can offer protection of the therapeutic payload until intracellular delivery to the desired target cell population.

[0231] In some embodiments, nuclease proteins, or DNA/mRNA encoding recombinant meganucleases, are combined with amphiphilic molecules that self-assemble into micelles (Tong et al. (2007) J Gene Med. 9(11): 956-66). Polymeric micelles may include a micellar shell formed with a hydrophilic polymer (e.g., polyethyleneglycol) that can prevent aggregation, mask charge interactions, and reduce nonspecific interactions.

[0232] In some embodiments, nuclease proteins, or DNA/mRNA encoding nucleases, are formulated into an emulsion or a nanoemulsion (i.e., having an average particle diameter of <1 nm) for administration and/or delivery to the target cell. The term "emulsion" refers to, without limitation, any oil-in-water, water-in-oil, water-in-oil-in-water, or oil-in-water-in-oil dispersions or droplets, including lipid structures that can form as a result of hydrophobic forces that drive apolar residues (e.g., long hydrocarbon chains) away from water and polar head groups toward water, when a water immiscible phase is mixed with an aqueous phase. These other lipid structures include, but are not limited to, unilamellar, paucilamellar, and multilamellar lipid vesicles, micelles, and lamellar phases. Emulsions are composed of an aqueous phase and a lipophilic phase (typically containing an oil and an organic solvent). Emulsions also frequently contain one or more surfactants. Nanoemulsion formulations are well known, e.g., as described in US Patent Application Nos. 2002/0045667 and 2004/0043041, and U.S. Pat. Nos. 6,015,832, 6,506,803, 6,635,676, and 6,559,189, each of which is incorporated herein by reference in its entirety.

[0233] In some embodiments, nuclease proteins, or DNA/mRNA encoding nucleases, are covalently attached to, or non-covalently associated with, multifunctional polymer conjugates, DNA dendrimers, and polymeric dendrimers (Mastorakos et al. (2015) Nanoscale. 7(9): 3845-56; Cheng et al. (2008) J Pharm Sci. 97(1): 123-43). The dendrimer generation can control the payload capacity and size, and can provide a high drug payload capacity. Moreover, display of multiple surface groups can be leveraged to improve stability, reduce nonspecific interactions, and enhance cell-specific targeting and drug release.

[0234] In some embodiments, genes encoding a nuclease are delivered using a viral vector. Such vectors are known in the art and include retroviral vectors, lentiviral vectors, adenoviral vectors, and adeno-associated virus (AAV) vectors (reviewed in Vannucci, et al. (2013) New Microbiol. 36:1-22). In some embodiments, the viral vectors are injected directly into target tissues. In alternative embodiments, the viral vectors are delivered systemically via the circulatory system. It is known in the art that different AAV vectors tend to localize to different tissues. In liver target tissues, effective transduction of hepatocytes has been shown, for example, with AAV serotypes 2, 8, and 9 (Sands (2011) Methods Mol. Biol. 807:141-157). AAV vectors can also be self-complementary such that they do not require second-strand DNA synthesis in the host cell (McCarty, et al. (2001) Gene Ther. 8:1248-54).

[0235] In one embodiment, a viral vector used for nuclease gene delivery is a self-limiting viral vector. A self-limiting viral vector can have limited persistence time in a cell or organism due to the presence of a recognition sequence for a recombinant nuclease within the vector. Thus, a self-limiting viral vector can be engineered to provide coding for a promoter, a nuclease described herein, and a nuclease recognition site within the ITRs. The self-limiting viral vector delivers the nuclease gene to a cell, tissue, or organism, such that the nuclease is expressed and able to cut the genome of the cell at an endogenous recognition sequence within the genome. The delivered nuclease will also find its target site within the self-limiting viral vector itself, and cut the vector at this target site. Once cut, the 5' and 3' ends of the viral genome will be exposed and degraded by exonucleases, thus killing the virus and ceasing production of the nuclease.

[0236] If the nuclease genes are delivered in DNA form (e.g. plasmid) and/or via a viral vector (e.g. AAV) they must be operably linked to a promoter. In some embodiments, this can be a viral promoter such as endogenous promoters from the viral vector (e.g. the LTR of a lentiviral vector) or the well-known cytomegalovirus- or SV40 virus-early promoters. In a preferred embodiment, nuclease genes are operably linked to a promoter that drives gene expression preferentially in the target cells. Examples of liver-specific promoters include, without limitation, human alpha-1 antitrypsin promoter and apolipoprotein A-II promoter.

[0237] It is envisioned that a single treatment will permanently cause a reversion of exons 1-22 in the Factor VIII gene, resulting in a gene having a wild-type orientation in a percentage of patient target cells. If the frequency of reversion is low, however, or if a large percentage of target cells need to be corrected, it may be necessary to perform multiple treatments on each patient.

2.4 Pharmaceutical Compositions

[0238] In some embodiments, the invention provides a pharmaceutical composition comprising a pharmaceutically acceptable carrier and an effective amount of an engineered nuclease of the invention, or a pharmaceutically acceptable carrier and an effective amount of a polynucleotide comprising a nucleic acid encoding an engineered nuclease of the invention. In other embodiments, the invention provides a pharmaceutical composition comprising a pharmaceutically acceptable carrier and an effective amount of a genetically-modified cell of the invention which can be delivered to a target tissue where the cell can then differentiate into a cell which expresses biologically-active Factor VIII. Pharmaceutical compositions of the invention can be useful for treating a subject having hemophilia A, wherein the disease is characterized by an inversion of exons 1-22 in a Factor VIII gene.

[0239] Such pharmaceutical compositions can be prepared in accordance with known techniques. See, e.g., Remington, The Science and Practice of Pharmacy (21st ed. 2005). In the manufacture of a pharmaceutical formulation according to the invention, nuclease polypeptides (or DNA/RNA encoding the same) are typically admixed with a pharmaceutically acceptable carrier and the resulting composition is administered to a subject. The carrier must, of course, be acceptable in the sense of being compatible with any other ingredients in the formulation and must not be deleterious to the subject. In some embodiments, pharmaceutical compositions of the invention can further comprise one or more additional agents or biological molecules useful in the treatment of a disease in the subject. Likewise, the additional agent(s) and/or biological molecule(s) can be co-administered as a separate composition.

[0240] In particular embodiments of the invention, the pharmaceutical composition can comprise one or more mRNAs described herein encapsulated within lipid nanoparticles, which are described elsewhere herein. In other embodiments, lipid nanoparticles can comprise one or more polycistronic mRNAs described herein, wherein each polycistronic mRNA encodes at least one engineered nuclease of the invention and at least one additional polypeptide.

[0241] Some lipid nanoparticles contemplated for use in the invention comprise at least one cationic lipid, at least one non-cationic lipid, and at least one conjugated lipid. In more particular examples, lipid nanoparticles can comprise from about 50 mol % to about 85 mol % of a cationic lipid, from about 13 mol % to about 49.5 mol % of a non-cationic lipid, and from about 0.5 mol % to about 10 mol % of a lipid conjugate, and are produced in such a manner as to have a non-lamellar (i.e., non-bilayer) morphology. In other particular examples, lipid nanoparticles can comprise from about 40 mol % to about 85 mol % of a cationic lipid, from about 13 mol % to about 49.5 mol % of a non-cationic lipid, and from about 0.5 mol % to about 10 mol % of a lipid conjugate, and are produced in such a manner as to have a non-lamellar (i.e., non-bilayer) morphology.
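The mol % ranges recited above can be checked mechanically for a candidate formulation. The following is a minimal sketch in Python, not a method taken from this application: the names `MolRange`, `check_composition`, and the dictionary keys are hypothetical, the two example formulations are invented for illustration, and the word "about" in the text is treated as a hard bound purely for simplicity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MolRange:
    """Inclusive range, in mol % of the total lipid in the particle."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Ranges quoted in paragraph [0241]; the keys are illustrative names only.
RANGES_50_85 = {
    "cationic": MolRange(50.0, 85.0),
    "non_cationic": MolRange(13.0, 49.5),
    "conjugate": MolRange(0.5, 10.0),
}
RANGES_40_85 = {
    "cationic": MolRange(40.0, 85.0),
    "non_cationic": MolRange(13.0, 49.5),
    "conjugate": MolRange(0.5, 10.0),
}


def check_composition(composition, ranges, tol=1e-6):
    """Return a list of problems; an empty list means the composition fits."""
    problems = []
    total = sum(composition.values())
    # The three lipid classes are assumed to make up all lipid in the particle.
    if abs(total - 100.0) > tol:
        problems.append(f"components sum to {total} mol %, not 100")
    for name, allowed in ranges.items():
        value = composition.get(name, 0.0)
        if not allowed.contains(value):
            problems.append(
                f"{name} = {value} mol % is outside {allowed.low}-{allowed.high}"
            )
    return problems


# Hypothetical formulations, for illustration only.
example_ok = {"cationic": 60.0, "non_cationic": 38.5, "conjugate": 1.5}
example_low = {"cationic": 45.0, "non_cationic": 50.0, "conjugate": 5.0}

if __name__ == "__main__":
    print(check_composition(example_ok, RANGES_50_85))
    print(check_composition(example_low, RANGES_50_85))
```

In this sketch the second formulation fails the 50-85 mol % example on two counts (cationic lipid too low, non-cationic lipid above 49.5 mol %) and fails the 40-85 mol % example only on the non-cationic bound.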

[0242] Cationic lipids can include, for example, one or more of the following: palmitoyl-oleoyl-nor-arginine (PONA), MPDACA, GUADACA, ((6Z,9Z,28Z,31Z)-heptatriaconta-6,9,28,31-tetraen-19-yl 4-(dimethylamino)butanoate) (MC3), LenMC3, CP-LenMC3, γ-LenMC3, CP-γ-LenMC3, MC3MC, MC2MC, MC3 Ether, MC4 Ether, MC3 Amide, Pan-MC3, Pan-MC4 and Pan MC5, 1,2-dilinoleyloxy-N,N-dimethylaminopropane (DLinDMA), 1,2-dilinolenyloxy-N,N-dimethylaminopropane (DLenDMA), 2,2-dilinoleyl-4-(2-dimethylaminoethyl)-[1,3]-dioxolane (DLin-K-C2-DMA; "XTC2"), 2,2-dilinoleyl-4-(3-dimethylaminopropyl)-[1,3]-dioxolane (DLin-K-C3-DMA), 2,2-dilinoleyl-4-(4-dimethylaminobutyl)-[1,3]-dioxolane (DLin-K-C4-DMA), 2,2-dilinoleyl-5-dimethylaminomethyl-[1,3]-dioxane (DLin-K6-DMA), 2,2-dilinoleyl-4-N-methylpiperazino-[1,3]-dioxolane (DLin-K-MPZ), 2,2-dilinoleyl-4-dimethylaminomethyl-[1,3]-dioxolane (DLin-K-DMA), 1,2-dilinoleylcarbamoyloxy-3-dimethylaminopropane (DLin-C-DAP), 1,2-dilinoleyoxy-3-(dimethylamino)acetoxypropane (DLin-DAC), 1,2-dilinoleyoxy-3-morpholinopropane (DLin-MA), 1,2-dilinoleoyl-3-dimethylaminopropane (DLinDAP), 1,2-dilinoleylthio-3-dimethylaminopropane (DLin-S-DMA), 1-linoleoyl-2-linoleyloxy-3-dimethylaminopropane (DLin-2-DMAP), 1,2-dilinoleyloxy-3-trimethylaminopropane chloride salt (DLin-TMA.Cl), 1,2-dilinoleoyl-3-trimethylaminopropane chloride salt (DLin-TAP.Cl), 1,2-dilinoleyloxy-3-(N-methylpiperazino)propane (DLin-MPZ), 3-(N,N-dilinoleylamino)-1,2-propanediol (DLinAP), 3-(N,N-dioleylamino)-1,2-propanediol (DOAP), 1,2-dilinoleyloxo-3-(2-N,N-dimethylamino)ethoxypropane (DLin-EG-DMA), N,N-dioleyl-N,N-dimethylammonium chloride (DODAC), 1,2-dioleyloxy-N,N-dimethylaminopropane (DODMA), 1,2-distearyloxy-N,N-dimethylaminopropane (DSDMA), N-(1-(2,3-dioleyloxy)propyl)-N,N,N-trimethylammonium chloride (DOTMA), N,N-distearyl-N,N-dimethylammonium bromide (DDAB), N-(1-(2,3-dioleoyloxy)propyl)-N,N,N-trimethylammonium chloride (DOTAP), 3-(N-(N',N'-dimethylaminoethane)-carbamoyl)cholesterol (DC-Chol), N-(1,2-dimyristyloxyprop-3-yl)-N,N-dimethyl-N-hydroxyethyl ammonium bromide (DMRIE), 2,3-dioleyloxy-N-[2(spermine-carboxamido)ethyl]-N,N-dimethyl-1-propanaminiumtrifluoroacetate (DOSPA), dioctadecylamidoglycyl spermine (DOGS), 3-dimethylamino-2-(cholest-5-en-3-beta-oxybutan-4-oxy)-1-(cis,cis-9,12-octadecadienoxy)propane (CLinDMA), 2-[5'-(cholest-5-en-3-beta-oxy)-3'-oxapentoxy)-3-dimethyl-1-(cis,cis-9',12'-octadecadienoxy)propane (CpLinDMA), N,N-dimethyl-3,4-dioleyloxybenzylamine (DMOBA), 1,2-N,N'-dioleylcarbamyl-3-dimethylaminopropane (DOcarbDAP), 1,2-N,N'-dilinoleylcarbamyl-3-dimethylaminopropane (DLincarbDAP), or mixtures thereof. The cationic lipid can also be DLinDMA, DLin-K-C2-DMA ("XTC2"), MC3, LenMC3, CP-LenMC3, γ-LenMC3, CP-γ-LenMC3, MC3MC, MC2MC, MC3 Ether, MC4 Ether, MC3 Amide, Pan-MC3, Pan-MC4, Pan MC5, or mixtures thereof.

[0243] In various embodiments, the cationic lipid may comprise from about 50 mol % to about 90 mol %, from about 50 mol % to about 85 mol %, from about 50 mol % to about 80 mol %, from about 50 mol % to about 75 mol %, from about 50 mol % to about 70 mol %, from about 50 mol % to about 65 mol %, or from about 50 mol % to about 60 mol % of the total lipid present in the particle.

[0244] In other embodiments, the cationic lipid may comprise from about 40 mol % to about 90 mol %, from about 40 mol % to about 85 mol %, from about 40 mol % to about 80 mol %, from about 40 mol % to about 75 mol %, from about 40 mol % to about 70 mol %, from about 40 mol % to about 65 mol %, or from about 40 mol % to about 60 mol % of the total lipid present in the particle.

[0245] The non-cationic lipid may comprise, e.g., one or more anionic lipids and/or neutral lipids. In preferred embodiments, the non-cationic lipid comprises one of the following neutral lipid components: (1) cholesterol or a derivative thereof; (2) a phospholipid; or (3) a mixture of a phospholipid and cholesterol or a derivative thereof. Examples of cholesterol derivatives include, but are not limited to, cholestanol, cholestanone, cholestenone, coprostanol, cholesteryl-2'-hydroxyethyl ether, cholesteryl-4'-hydroxybutyl ether, and mixtures thereof. The phospholipid may be a neutral lipid including, but not limited to, dipalmitoylphosphatidylcholine (DPPC), distearoylphosphatidylcholine (DSPC), dioleoylphosphatidylethanolamine (DOPE), palmitoyloleoyl-phosphatidylcholine (POPC), palmitoyloleoyl-phosphatidylethanolamine (POPE), palmitoyloleoyl-phosphatidylglycerol (POPG), dipalmitoyl-phosphatidylethanolamine (DPPE), dimyristoyl-phosphatidylethanolamine (DMPE), distearoyl-phosphatidylethanolamine (DSPE), monomethyl-phosphatidylethanolamine, dimethyl-phosphatidylethanolamine, dielaidoyl-phosphatidylethanolamine (DEPE), stearoyloleoyl-phosphatidylethanolamine (SOPE), egg phosphatidylcholine (EPC), and mixtures thereof. In certain preferred embodiments, the phospholipid is DPPC, DSPC, or mixtures thereof.

[0246] In some embodiments, the non-cationic lipid (e.g., one or more phospholipids and/or cholesterol) may comprise from about 10 mol % to about 60 mol %, from about 15 mol % to about 60 mol %, from about 20 mol % to about 60 mol %, from about 25 mol % to about 60 mol %, from about 30 mol % to about 60 mol %, from about 10 mol % to about 55 mol %, from about 15 mol % to about 55 mol %, from about 20 mol % to about 55 mol %, from about 25 mol % to about 55 mol %, from about 30 mol % to about 55 mol %, from about 13 mol % to about 50 mol %, from about 15 mol % to about 50 mol % or from about 20 mol % to about 50 mol % of the total lipid present in the particle. When the non-cationic lipid is a mixture of a phospholipid and cholesterol or a cholesterol derivative, the mixture may comprise up to about 40, 50, or 60 mol % of the total lipid present in the particle.

[0247] The conjugated lipid that inhibits aggregation of particles may comprise, e.g., one or more of the following: a polyethyleneglycol (PEG)-lipid conjugate, a polyamide (ATTA)-lipid conjugate, a cationic-polymer-lipid conjugate (CPL), or mixtures thereof. In one preferred embodiment, the nucleic acid-lipid particles comprise either a PEG-lipid conjugate or an ATTA-lipid conjugate. In certain embodiments, the PEG-lipid conjugate or ATTA-lipid conjugate is used together with a CPL. The conjugated lipid that inhibits aggregation of particles may comprise a PEG-lipid including, e.g., a PEG-diacylglycerol (DAG), a PEG-dialkyloxypropyl (DAA), a PEG-phospholipid, a PEG-ceramide (Cer), or mixtures thereof. The PEG-DAA conjugate may be a PEG-dilauryloxypropyl (C12), a PEG-dimyristyloxypropyl (C14), a PEG-dipalmityloxypropyl (C16), a PEG-distearyloxypropyl (C18), or mixtures thereof.

[0248] Additional PEG-lipid conjugates suitable for use in the invention include, but are not limited to, mPEG2000-1,2-di-O-alkyl-sn3-carbomoylglyceride (PEG-C-DOMG). The synthesis of PEG-C-DOMG is described in PCT Application No. PCT/US08/88676. Yet additional PEG-lipid conjugates suitable for use in the invention include, without limitation, 1-[8'-(1,2-dimyristoyl-3-propanoxy)-carboxamido-3',6'-dioxaoctanyl]carbamoyl-ω-methyl-poly(ethylene glycol) (2KPEG-DMG). The synthesis of 2KPEG-DMG is described in U.S. Pat. No. 7,404,969.

[0249] In some cases, the conjugated lipid that inhibits aggregation of particles (e.g., PEG-lipid conjugate) may comprise from about 0.1 mol % to about 2 mol %, from about 0.5 mol % to about 2 mol %, from about 1 mol % to about 2 mol %, from about 0.6 mol % to about 1.9 mol %, from about 0.7 mol % to about 1.8 mol %, from about 0.8 mol % to about 1.7 mol %, from about 1 mol % to about 1.8 mol %, from about 1.2 mol % to about 1.8 mol %, from about 1.2 mol % to about 1.7 mol %, from about 1.3 mol % to about 1.6 mol %, from about 1.4 mol % to about 1.5 mol %, or about 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, or 2 mol % (or any fraction thereof or range therein) of the total lipid present in the particle. Typically, in such instances, the PEG moiety has an average molecular weight of about 2,000 Daltons. In other cases, the conjugated lipid that inhibits aggregation of particles (e.g., PEG-lipid conjugate) may comprise from about 5.0 mol % to about 10 mol %, from about 5 mol % to about 9 mol %, from about 5 mol % to about 8 mol %, from about 6 mol % to about 9 mol %, from about 6 mol % to about 8 mol %, or about 5 mol %, 6 mol %, 7 mol %, 8 mol %, 9 mol %, or 10 mol % (or any fraction thereof or range therein) of the total lipid present in the particle. Typically, in such instances, the PEG moiety has an average molecular weight of about 750 Daltons.

[0250] In other embodiments, the composition may comprise amphoteric liposomes, which contain at least one positive and at least one negative charge carrier, which differs from the positive one, the isoelectric point of the liposomes being between 4 and 8. This objective is accomplished owing to the fact that liposomes are prepared with a pH-dependent, changing charge.

[0251] Liposomal structures with the desired properties are formed, for example, when the amount of membrane-forming or membrane-based cationic charge carriers exceeds that of the anionic charge carriers at a low pH and the ratio is reversed at a higher pH. This is always the case when the ionizable components have a pKa value between 4 and 9. As the pH of the medium drops, all cationic charge carriers are charged more and all anionic charge carriers lose their charge.
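The pH-dependent charge reversal described above can be illustrated with a simple Henderson-Hasselbalch calculation. The sketch below is an idealized model and not a procedure from this application: it assumes one weakly cationic and one weakly anionic carrier, each with a single ionizable group, ignores membrane and ionic-strength effects, and uses hypothetical pKa values and molar amounts. The function names are likewise illustrative.

```python
def protonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a weak base (cationic carrier) that carries a +1 charge."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def deprotonated_fraction(ph: float, pka: float) -> float:
    """Fraction of a weak acid (anionic carrier) that carries a -1 charge."""
    return 1.0 / (1.0 + 10.0 ** (pka - ph))


def net_charge(ph, cation_mol, cation_pka, anion_mol, anion_pka):
    """Net charge per unit of lipid: positive at low pH, negative at high pH."""
    positive = cation_mol * protonated_fraction(ph, cation_pka)
    negative = anion_mol * deprotonated_fraction(ph, anion_pka)
    return positive - negative


def isoelectric_point(cation_mol, cation_pka, anion_mol, anion_pka,
                      lo=0.0, hi=14.0):
    """Find the pH of zero net charge by bisection.

    Net charge falls monotonically as pH rises, so there is at most one
    zero crossing between lo and hi.
    """
    args = (cation_mol, cation_pka, anion_mol, anion_pka)
    if net_charge(lo, *args) <= 0 or net_charge(hi, *args) >= 0:
        raise ValueError("no sign change between lo and hi")
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if net_charge(mid, *args) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    # Hypothetical equimolar mixture: cationic pKa 6.5, anionic pKa 5.5.
    print(isoelectric_point(1.0, 6.5, 1.0, 5.5))
```

For an equimolar mixture in this model, the zero crossing sits midway between the two pKa values, so the hypothetical pair above gives an isoelectric point of about 6.0, within the range of 4 to 8 recited for amphoteric liposomes. Changing the molar ratio of the two carriers shifts the crossing point.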

[0252] Cationic compounds useful for amphoteric liposomes include those cationic compounds previously described herein above. Without limitation, strongly cationic compounds can include, for example: DC-Chol 3-O-[N-(N',N'-dimethylmethane) carbamoyl] cholesterol, TC-Chol N', N'-trimethylaminoethane) carbamoyl cholesterol, BGSC bisguanidinium-spermidine-cholesterol, BGTC bis-guanidinium-tren-cholesterol, DOTAP (1,2-dioleoyloxypropyl)-N,N,N-trimethylammonium chloride, DOSPER (1,3-dioleoyloxy-2-(6-carboxy-spermyl)-propylamide), DOTMA (1,2-dioleoyloxypropyl)-N,N,N-trimethylammonium chloride (Lipofectin®), DORIE (1,2-dioleoyloxypropyl)-3-dimethylhydroxyethylammonium bromide, DOSC (1,2-dioleoyl-3-succinyl-sn-glyceryl choline ester), DOGSDSO (1,2-dioleoyl-sn-glycero-3-succinyl-2-hydroxyethyl disulfide ornithine), DDAB dimethyldioctadecylammonium bromide, DOGS ((C18)2GlySper3+) N,N-dioctadecylamido-glycol-spermin (Transfectam®), (C18)2Gly+ N,N-dioctadecylamido-glycine, CTAB cetyltrimethylammonium bromide, CpyC cetylpyridinium chloride, DOEPC 1,2-dioleoyl-sn-glycero-3-ethylphosphocholine or other O-alkyl-phosphatidylcholine or ethanolamines, amides from lysine, arginine or ornithine and phosphatidyl ethanolamine.

[0253] Examples of weakly cationic compounds include, without limitation: His-Chol (histaminyl-cholesterol hemisuccinate), Mo-Chol (morpholine-N-ethylamino-cholesterol hemisuccinate), or histidinyl-PE.

[0254] Examples of neutral compounds include, without limitation: cholesterol, ceramides, phosphatidyl cholines, phosphatidyl ethanolamines, tetraether lipids, or diacyl glycerols.

[0255] Anionic compounds useful for amphoteric liposomes include those non-cationic compounds previously described herein. Without limitation, examples of weakly anionic compounds can include: CHEMS (cholesterol hemisuccinate), alkyl carboxylic acids with 8 to 25 carbon atoms, or diacyl glycerol hemisuccinate. Additional weakly anionic compounds can include the amides of aspartic acid, or glutamic acid and PE as well as PS and its amides with glycine, alanine, glutamine, asparagine, serine, cysteine, threonine, tyrosine, glutamic acid, aspartic acid or other amino acids or aminodicarboxylic acids. According to the same principle, the esters of hydroxycarboxylic acids or hydroxydicarboxylic acids and PS are also weakly anionic compounds.

[0256] In some embodiments, amphoteric liposomes may contain a conjugated lipid, such as those described herein above. Particular examples of useful conjugated lipids include, without limitation, PEG-modified phosphatidylethanolamine and phosphatidic acid, PEG-ceramide conjugates (e.g., PEG-CerC14 or PEG-CerC20), PEG-modified dialkylamines and PEG-modified 1,2-diacyloxypropan-3-amines. Particularly preferred are PEG-modified diacylglycerols and dialkylglycerols.

[0257] In some embodiments, the neutral lipids may comprise from about 10 mol % to about 60 mol %, from about 15 mol % to about 60 mol %, from about 20 mol % to about 60 mol %, from about 25 mol % to about 60 mol %, from about 30 mol % to about 60 mol %, from about 10 mol % to about 55 mol %, from about 15 mol % to about 55 mol %, from about 20 mol % to about 55 mol %, from about 25 mol % to about 55 mol %, from about 30 mol % to about 55 mol %, from about 13 mol % to about 50 mol %, from about 15 mol % to about 50 mol % or from about 20 mol % to about 50 mol % of the total lipid present in the particle.

[0258] In some cases, the conjugated lipid that inhibits aggregation of particles (e.g., PEG-lipid conjugate) may comprise from about 0.1 mol % to about 2 mol %, from about 0.5 mol % to about 2 mol %, from about 1 mol % to about 2 mol %, from about 0.6 mol % to about 1.9 mol %, from about 0.7 mol % to about 1.8 mol %, from about 0.8 mol % to about 1.7 mol %, from about 1 mol % to about 1.8 mol %, from about 1.2 mol % to about 1.8 mol %, from about 1.2 mol % to about 1.7 mol %, from about 1.3 mol % to about 1.6 mol %, from about 1.4 mol % to about 1.5 mol %, or about 1, 1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, or 2 mol % (or any fraction thereof or range therein) of the total lipid present in the particle. Typically, in such instances, the PEG moiety has an average molecular weight of about 2,000 Daltons. In other cases, the conjugated lipid that inhibits aggregation of particles (e.g., PEG-lipid conjugate) may comprise from about 5.0 mol % to about 10 mol %, from about 5 mol % to about 9 mol %, from about 5 mol % to about 8 mol %, from about 6 mol % to about 9 mol %, from about 6 mol % to about 8 mol %, or about 5 mol %, 6 mol %, 7 mol %, 8 mol %, 9 mol %, or 10 mol % (or any fraction thereof or range therein) of the total lipid present in the particle. Typically, in such instances, the PEG moiety has an average molecular weight of about 750 Daltons.

[0259] Considering the total amount of neutral and conjugated lipids, the remaining balance of the amphoteric liposome can comprise a mixture of cationic compounds and anionic compounds formulated at various ratios. The ratio of cationic to anionic lipid may be selected in order to achieve the desired properties of nucleic acid encapsulation, zeta potential, pKa, or other physicochemical property that is at least in part dependent on the presence of charged lipid components.

[0260] In some embodiments, the lipid nanoparticles have a composition which specifically enhances delivery and uptake in the liver, or specifically within hepatic sinusoidal endothelial cells. The composition may also specifically enhance delivery to and uptake by endothelial cells in the circulatory system.

[0261] An "effective amount" or "therapeutically effective amount" of a pharmaceutical composition, engineered nuclease, or nucleic acid encoding an engineered nuclease, generally refers to an amount sufficient to effect beneficial or desirable biological and/or clinical results. In the treatment of hemophilia A, characterized by an inversion of exons 1-22 of the Factor VIII gene, an effective amount reduces clotting time and/or increases detectable circulating levels of functional Factor VIII protein in the subject's blood. Clotting time in a subject may be reduced by any amount until achieving a clotting time exhibited by a wild-type healthy subject. Clotting time can be measured relative to the same subject prior to treatment, or relative to the clotting time of a subject having hemophilia A characterized by an inversion of exons 1-22 of the Factor VIII gene. Clotting time of a healthy human subject can typically be between 8 and 15 minutes. Such reductions in clotting time can be about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, or 60 minutes, or more, including a reduction to the clotting time of a healthy wild-type subject. Clotting time can be measured using any methods known in the art including, but not limited to, a whole blood clotting test (WBCT) as described in the Examples herein, an activated partial thromboplastin time (aPTT) test, a prothrombin time (PT) test, and/or a fibrinogen test. Increases in detectable circulating Factor VIII can be up to about 1 fold, 2 fold, 3 fold, 4 fold, 5 fold, 10 fold, 20 fold, 30 fold, 40 fold, 50 fold, 60 fold, 70 fold, 80 fold, 90 fold, 100 fold, 250 fold, 500 fold, 750 fold, or 1000 fold, or more, higher than the baseline level of circulating Factor VIII in the subject, or up to the level of circulating Factor VIII exhibited by a healthy wild-type subject. Circulating levels of Factor VIII can be measured using any methods known in the art including, but not limited to, an enzyme linked immunosorbent assay (ELISA). Methods for detecting mRNA encoding full-length Factor VIII in a wild-type orientation include those described in the Examples herein.

2.5 Methods for Producing Recombinant AAV Vectors

[0262] In some embodiments, the invention provides recombinant AAV vectors for use in the methods of the invention. Recombinant AAV vectors are typically produced in mammalian cell lines such as HEK-293. Because the viral cap and rep genes are removed from the vector to prevent its self-replication and to make room for the therapeutic gene(s) to be delivered (e.g. the nuclease gene), it is necessary to provide these in trans in the packaging cell line. In addition, it is necessary to provide the "helper" (e.g. adenoviral) components necessary to support replication (Cots D, Bosch A, Chillon M (2013) Curr. Gene Ther. 13(5): 370-81). Frequently, recombinant AAV vectors are produced using a triple-transfection in which a cell line is transfected with a first plasmid encoding the "helper" components, a second plasmid comprising the cap and rep genes, and a third plasmid comprising the viral ITRs containing the intervening DNA sequence to be packaged into the virus. Viral particles comprising a genome (ITRs and intervening gene(s) of interest) encased in a capsid are then isolated from cells by freeze-thaw cycles, sonication, detergent, or other means known in the art. Particles are then purified using cesium-chloride density gradient centrifugation or affinity chromatography and subsequently used to deliver the gene(s) of interest to cells, tissues, or an organism such as a human patient.

[0263] Because recombinant AAV particles are typically produced (manufactured) in cells, precautions must be taken in practicing the current invention to ensure that the site-specific nuclease is not expressed in the packaging cells. Because the viral genomes of the invention comprise a recognition sequence for the nuclease, any nuclease expressed in the packaging cell line will be capable of cleaving the viral genome before it can be packaged into viral particles. This will result in reduced packaging efficiency and/or the packaging of fragmented genomes. Several approaches can be used to prevent nuclease expression in the packaging cells, including: [0264] 1. The nuclease can be placed under the control of a tissue-specific promoter that is not active in the packaging cells. For example, if a viral vector is developed for delivery of (an) nuclease gene(s) to muscle tissue, a muscle-specific promoter can be used. Examples of muscle-specific promoters include C5-12 (Liu, et al. (2004) Hum Gene Ther. 15:783-92), the muscle-specific creatine kinase (MCK) promoter (Yuasa, et al. (2002) Gene Ther. 9:1576-88), or the smooth muscle 22 (SM22) promoter (Haase, et al. (2013) BMC Biotechnol. 13:49-54). Examples of CNS (neuron)-specific promoters include the NSE, Synapsin, and MeCP2 promoters (Lentz, et al. (2012) Neurobiol Dis. 48:179-88). Examples of liver-specific promoters include albumin promoters (such as Palb), human al-antitrypsin (such as PalAT), and hemopexin (such as Phpx) (Kramer, M G et al., (2003) Mol. Therapy 7:375-85). Examples of eye-specific promoters include opsin, and corneal epithelium-specific K12 promoters (Martin K R G, Klein R L, and Quigley H A (2002) Methods (28): 267-75) (Tong Y, et al., (2007) J Gene Med, 9:956-66). 
These promoters, or other tissue-specific promoters known in the art, are not highly active in HEK-293 cells and, thus, are not expected to yield significant levels of nuclease gene expression in packaging cells when incorporated into viral vectors of the present invention. Similarly, the viral vectors of the present invention contemplate the use of other cell lines with incompatible tissue-specific promoters (e.g., the well-known HeLa cell line (a human epithelial cell line) used with the liver-specific hemopexin promoter). Other examples of tissue-specific promoters include: synovial sarcomas PDZD4 (cerebellum), C6 (liver), ASBS (muscle), PPP1R12B (heart), SLC5A12 (kidney), cholesterol regulation APOM (liver), ADPRHL1 (heart), and monogenic malformation syndromes TP73L (muscle) (Jacox E, et al., (2010) PLoS One v.5(8):e12274).

[0265] 2. Alternatively, the vector can be packaged in cells from a different species in which the nuclease is not likely to be expressed. For example, viral particles can be produced in microbial, insect, or plant cells using mammalian promoters, such as the well-known cytomegalovirus- or SV40 virus-early promoters, which are not active in the non-mammalian packaging cells. In a preferred embodiment, viral particles are produced in insect cells using the baculovirus system as described by Gao, et al. (Gao, H., et al. (2007) J. Biotechnol. 131(2):138-43). A nuclease under the control of a mammalian promoter is unlikely to be expressed in these cells (Airenne, K J, et al. (2013) Mol. Ther. 21(4):739-49). Moreover, insect cells utilize different mRNA splicing motifs than mammalian cells. Thus, it is possible to incorporate a mammalian intron, such as the human growth hormone (HGH) intron or the SV40 large T antigen intron, into the coding sequence of a nuclease.
Because these introns are not spliced efficiently from pre-mRNA transcripts in insect cells, insect cells will not express a functional nuclease and will package the full-length genome. In contrast, mammalian cells to which the resulting recombinant AAV particles are delivered will properly splice the pre-mRNA and will express functional nuclease protein. Haifeng Chen has reported the use of the HGH and SV40 large T antigen introns to attenuate expression of the toxic proteins barnase and diphtheria toxin fragment A in insect packaging cells, enabling the production of recombinant AAV vectors carrying these toxin genes (Chen, H (2012) Mol Ther Nucleic Acids. 1(11): e57).

[0266] 3. The nuclease gene can be operably linked to an inducible promoter such that a small-molecule inducer is required for nuclease expression. Examples of inducible promoters include the Tet-On system (Clontech; Chen H., et al., (2015) BMC Biotechnol. 15(1):4) and the RheoSwitch system (Intrexon; Sowa G., et al., (2011) Spine, 36(10): E623-8). Both systems, as well as similar systems known in the art, rely on ligand-inducible transcription factors (variants of the Tet Repressor and Ecdysone receptor, respectively) that activate transcription in response to a small-molecule activator (Doxycycline or Ecdysone, respectively). Practicing the current invention using such ligand-inducible transcription activators includes: 1) placing the nuclease gene under the control of a promoter that responds to the corresponding transcription factor, the nuclease gene having (a) binding site(s) for the transcription factor; and 2) including the gene encoding the transcription factor in the packaged viral genome. The latter step is necessary because the nuclease will not be expressed in the target cells or tissues following recombinant AAV delivery if the transcription activator is not also provided to the same cells.
The transcription activator then induces nuclease gene expression only in cells or tissues that are treated with the cognate small-molecule activator. This approach is advantageous because it enables nuclease gene expression to be regulated in a spatio-temporal manner by selecting when and to which tissues the small-molecule inducer is delivered. However, the requirement to include the gene encoding the transcription activator in the viral genome, which has significantly limited carrying capacity, creates a drawback to this approach.

[0267] 4. In another preferred embodiment, recombinant AAV particles are produced in a mammalian cell line that expresses a transcription repressor that prevents expression of the nuclease. Transcription repressors are known in the art and include the Tet-Repressor, the Lac-Repressor, the Cro repressor, and the Lambda-repressor. Many nuclear hormone receptors such as the ecdysone receptor also act as transcription repressors in the absence of their cognate hormone ligand. To practice the current invention, packaging cells are transfected/transduced with a vector encoding a transcription repressor, and the nuclease gene in the viral genome (packaging vector) is operably linked to a promoter that is modified to comprise binding sites for the repressor such that the repressor silences the promoter. The gene encoding the transcription repressor can be placed in a variety of positions. It can be encoded on a separate vector; it can be incorporated into the packaging vector outside of the ITR sequences; it can be incorporated into the cap/rep vector or the adenoviral helper vector; or, most preferably, it can be stably integrated into the genome of the packaging cell such that it is expressed constitutively. Methods to modify common mammalian promoters to incorporate transcription repressor sites are known in the art.
For example, Chang and Roninson modified the strong, constitutive CMV and RSV promoters to comprise operators for the Lac repressor and showed that gene expression from the modified promoters was greatly attenuated in cells expressing the repressor (Chang B D, and Roninson I B (1996) Gene 183:137-42). The use of a non-human transcription repressor ensures that transcription of the nuclease gene will be repressed only in the packaging cells expressing the repressor and not in target cells or tissues transduced with the resulting recombinant AAV vector.

2.6 Engineered Nuclease Variants

[0268] Embodiments of the invention encompass the engineered nucleases described herein, and variants thereof. Further embodiments of the invention encompass polynucleotides comprising a nucleic acid sequence encoding the nucleases described herein, and variants of such polynucleotides.

[0269] As used herein, "variants" is intended to mean substantially similar sequences. A "variant" polypeptide is intended to mean a polypeptide derived from the "native" polypeptide by deletion or addition of one or more amino acids at one or more internal sites in the native protein and/or substitution of one or more amino acids at one or more sites in the native polypeptide. As used herein, a "native" polynucleotide or polypeptide comprises a parental sequence from which variants are derived. Variant polypeptides encompassed by the embodiments are biologically active. That is, they continue to possess the desired biological activity of the native protein; i.e., the ability to recognize and cleave recognition sequences found in an int22h-1 sequence in a Factor VIII gene including, for example, the human F8R 17-18 recognition sequence (SEQ ID NO: 9), the canine F8R 17-18 recognition sequence (SEQ ID NO: 11), and the consensus F8R 17-18 recognition sequence (SEQ ID NO: 7). Such variants may result, for example, from human manipulation. Biologically active variants of a native polypeptide of the embodiments (e.g., SEQ ID NOs: 13-21), or biologically active variants of the recognition half-site binding subunits described herein, will have at least about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, or about 99%, sequence identity to the amino acid sequence of the native polypeptide or native subunit, as determined by sequence alignment programs and parameters described elsewhere herein. A biologically active variant of a polypeptide or subunit of the embodiments may differ from that polypeptide or subunit by as few as about 1-40 amino acid residues, as few as about 1-20, as few as about 1-10, as few as about 5, as few as 4, 3, 2, or even 1 amino acid residue.
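The percent-identity thresholds recited above can be illustrated with a minimal sketch. It is not the alignment programs or parameters referred to elsewhere in the specification. It assumes the two sequences have already been aligned (gaps written as "-") and that identity is counted over ungapped columns only, which is one of several conventions in use. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs are pre-aligned and of equal length; the choice of
    denominator (ungapped columns) is an illustrative convention only.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy peptides differing at one of eight positions.
native = "MKTAYIAK"
variant = "MKTAYLAK"
print(percent_identity(native, variant))
```

Under a different convention (for example, counting gapped columns in the denominator) the same pair of sequences can give a different value, so the convention should be stated alongside any reported percentage.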

[0270] The polypeptides of the embodiments may be altered in various ways including amino acid substitutions, deletions, truncations, and insertions. Methods for such manipulations are generally known in the art. For example, amino acid sequence variants can be prepared by mutations in the DNA. Methods for mutagenesis and polynucleotide alterations are well known in the art. See, for example, Kunkel (1985) Proc. Natl. Acad. Sci. USA 82:488-492; Kunkel et al. (1987) Methods in Enzymol. 154:367-382; U.S. Pat. No. 4,873,192; Walker and Gaastra, eds. (1983) Techniques in Molecular Biology (MacMillan Publishing Company, New York) and the references cited therein. Guidance as to appropriate amino acid substitutions that do not affect biological activity of the protein of interest may be found in the model of Dayhoff et al. (1978) Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Found., Washington, D.C.), herein incorporated by reference. Conservative substitutions, such as exchanging one amino acid with another having similar properties, may be optimal.

[0271] A substantial number of amino acid modifications to the DNA recognition domain of the wild-type I-CreI meganuclease have previously been identified (e.g., U.S. Pat. No. 8,021,867) which, singly or in combination, result in recombinant meganucleases with specificities altered at individual bases within the DNA recognition sequence half-site, such that the resulting rationally-designed meganucleases have half-site specificities different from the wild-type enzyme. Table 2 provides potential substitutions that can be made in a recombinant meganuclease monomer or subunit to enhance specificity based on the base present at each half-site position (-1 through -9) of a recognition half-site.

TABLE-US-00002 TABLE 2
Favored Sense-Strand Base
Posn. A C G T A/T A/C A/G C/T G/T A/G/T A/C/G/T
-1 Y75 R70* K70 Q70* T46* G70 L75* H75* E70* C70 A70 C75* R75* E75* L70 S70 Y139* H46* E46* Y75* G46* C46* K46* D46* Q75* A46* R46* H75* H139 Q46* H46*
-2 Q70 E70 H70 Q44* C44* T44* D70 D44* A44* K44* E44* V44* R44* I44* L44* N44*
-3 Q68 E68 R68 M68 H68 Y68 K68 C24* F68 C68 I24* K24* L68 R24* F68
-4 A26* E77 R77 S77 S26* Q77 K26* E26* Q26*
-5 E42 R42 K28* C28* M66 Q42 K66
-6 Q40 E40 R40 C40 A40 S40 C28* R28* I40 A79 S28* V40 A28* C79 H28* I79 V79 Q28*
-7 N30* E38 K38 I38 C38 H38 Q38 K30* R38 L38 N38 R30* E30* Q30*
-8 F33 E33 F33 L33 R32* R33 Y33 D33 H33 V33 I33 F33 C33
-9 E32 R32 L32 D32 S32 K32 V32 I32 N32 A32 H32 C32 Q32 T32

Bold entries are wild-type contact residues and do not constitute "modifications" as used herein. An asterisk indicates that the residue contacts the base on the antisense strand.

[0272] For polynucleotides, a "variant" comprises a deletion and/or addition of one or more nucleotides at one or more sites within the native polynucleotide. One of skill in the art will recognize that variants of the nucleic acids of the embodiments will be constructed such that the open reading frame is maintained. For polynucleotides, conservative variants include those sequences that, because of the degeneracy of the genetic code, encode the amino acid sequence of one of the polypeptides of the embodiments. Variant polynucleotides include synthetically derived polynucleotides, such as those generated, for example, by using site-directed mutagenesis but which still encode a recombinant nuclease of the embodiments. Generally, variants of a particular polynucleotide of the embodiments will have at least about 40%, about 45%, about 50%, about 55%, about 60%, about 65%, about 70%, about 75%, about 80%, about 85%, about 90%, about 91%, about 92%, about 93%, about 94%, about 95%, about 96%, about 97%, about 98%, about 99% or more sequence identity to that particular polynucleotide as determined by sequence alignment programs and parameters described elsewhere herein. Variants of a particular polynucleotide of the embodiments (i.e., the reference polynucleotide) can also be evaluated by comparison of the percent sequence identity between the polypeptide encoded by a variant polynucleotide and the polypeptide encoded by the reference polynucleotide.
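The notion of a conservative polynucleotide variant, one that differs in nucleotide sequence but encodes the same polypeptide because of the degeneracy of the genetic code, can be sketched as follows. The sketch assumes the standard genetic code and uses short hypothetical sequences; the function names are illustrative and do not come from the specification.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_conservative_variant(reference: str, variant: str) -> bool:
    """True if both coding sequences encode the same amino acid sequence."""
    return translate(reference) == translate(variant)


# Hypothetical example: two synonymous codon changes (AAA->AAG, CTG->TTA).
reference = "ATGAAACTG"
variant = "ATGAAGTTA"
print(translate(reference), translate(variant), is_conservative_variant(reference, variant))
```

Because both sequences keep a length that is a multiple of three and translate identically, the open reading frame is maintained even though the nucleotide-level identity is below 100%.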

[0273] The deletions, insertions, and substitutions of the protein sequences encompassed herein are not expected to produce radical changes in the characteristics of the polypeptide. However, when it is difficult to predict the exact effect of the substitution, deletion, or insertion in advance of doing so, one skilled in the art will appreciate that the effect will be evaluated by screening the polypeptide for its ability to preferentially recognize and cleave recognition sequences found within an int22h-1 sequence of a Factor VIII gene.

EXAMPLES

[0274] This invention is further illustrated by the following examples, which should not be construed as limiting. Those skilled in the art will recognize, or be able to ascertain, using no more than routine experimentation, numerous equivalents to the specific substances and procedures described herein. Such equivalents are intended to be encompassed in the scope of the claims that follow the examples below.

Example 1

Characterization of Meganucleases that Recognize and Cleave F8R Recognition Sequences

[0275] 1. Meganucleases that Recognize and Cleave the F8R 17-18 Recognition Sequence

[0276] Recombinant meganucleases (SEQ ID NOs: 13-21), collectively referred to herein as "F8R 17-18 meganucleases," were engineered to recognize and cleave the F8R 17-18 recognition sequence in the human genome (SEQ ID NO: 9), and the F8R 17-18 recognition sequence in the canine genome (SEQ ID NO: 11). Each recognition sequence is present in the Factor VIII gene, specifically within the int22h-1 sequence, and more specifically within the F8A1 sequence. Each F8R 17-18 recombinant meganuclease comprises an N-terminal nuclear-localization signal derived from SV40, a first meganuclease subunit, a linker sequence, and a second meganuclease subunit. A first subunit in each F8R 17-18 meganuclease binds to the F8R17 recognition half-site of SEQ ID NO: 9 or SEQ ID NO: 11, while a second subunit binds to the F8R18 recognition half-site (see, FIG. 2). The F8R17-binding subunits and F8R18-binding subunits each comprise a 56 amino acid hypervariable region, referred to as HVR1 and HVR2, respectively. F8R17-binding subunits are highly conserved outside of the HVR1 region. Similarly, F8R18-binding subunits are also highly conserved outside of the HVR2 region. The F8R17-binding regions of SEQ ID NOs: 13-21 are provided as SEQ ID NOs: 22-30, respectively. Each of SEQ ID NOs: 22-30 shares at least 90% sequence identity to SEQ ID NO: 22, which is the F8R17-binding region of the meganuclease F8R 17-18L1.35 (SEQ ID NO: 13). F8R18-binding regions of SEQ ID NOs: 13-21 are provided as SEQ ID NOs: 31-39, respectively. Each of SEQ ID NOs: 31-39 shares at least 90% sequence identity to SEQ ID NO: 31, which is the F8R18-binding region of the meganuclease F8R 17-18L1.35 (SEQ ID NO: 13).

2. Cleavage of F8R Recognition Sequences in a CHO Cell Reporter Assay

[0277] To determine whether F8R 17-18 meganucleases could recognize and cleave both the human (SEQ ID NO: 9) and canine (SEQ ID NO: 11) F8R 17-18 recognition sequences, each recombinant meganuclease was evaluated using the CHO cell reporter assay previously described (see, WO/2012/167192 and FIG. 4). To perform the assays, CHO cell reporter lines were produced which carried a non-functional Green Fluorescent Protein (GFP) gene expression cassette integrated into the genome of the cells. The GFP gene in each cell line was interrupted by a pair of recognition sequences such that intracellular cleavage of either recognition sequence by a meganuclease would stimulate a homologous recombination event resulting in a functional GFP gene.

[0278] In CHO reporter cell lines developed for this study, one recognition sequence inserted into the GFP gene was the human F8R 17-18 recognition sequence (SEQ ID NO: 9) or the canine F8R 17-18 recognition sequence (SEQ ID NO: 11). The second recognition sequence inserted into the GFP gene was a CHO-23/24 recognition sequence, which is recognized and cleaved by a control meganuclease called "CHO-23/24". CHO reporter cells comprising the human F8R 17-18 recognition sequence and the CHO-23/24 recognition sequence are referred to as "F8R 17-18h cells." CHO reporter cells comprising the canine F8R 17-18 recognition sequence and the CHO-23/24 recognition sequence are referred to as "F8R 17-18c cells."

[0279] CHO reporter cells were transfected with plasmid DNA encoding their corresponding recombinant meganucleases (e.g., F8R 17-18h cells were transfected with plasmid DNA encoding F8R 17-18 meganucleases) or encoding the CHO-23/24 meganuclease. In each assay, 4e5 CHO reporter cells were transfected with 50 ng of plasmid DNA in a 96-well plate using Lipofectamine® 2000 (ThermoFisher) according to the manufacturer's instructions. At 48 hours post-transfection, cells were evaluated by flow cytometry to determine the percentage of GFP-positive cells compared to an untransfected negative control (F8R bs). As shown in FIG. 5, first-generation F8R 17-18 meganucleases were found to produce GFP-positive cells in cell lines comprising the human (FIG. 5A) or the canine (FIG. 5B) F8R 17-18 recognition sequence at frequencies significantly exceeding the negative control. Similarly, FIG. 6 shows that the second-generation F8R 17-18L1.35 meganuclease also produced GFP-positive cells in cell lines comprising the human (FIG. 6A) or the canine (FIG. 6B) F8R 17-18 recognition sequence at frequencies significantly exceeding the negative control, and comparable to the F8R 17-18x.1 meganuclease from which it was derived. Likewise, FIG. 7 shows that the second-generation F8R 17-18L2.23 meganuclease produced GFP-positive cells in cell lines comprising the human (FIG. 7A) or the canine (FIG. 7B) F8R 17-18 recognition sequence at frequencies significantly exceeding the negative control, and comparable to the F8R 17-18x.88 meganuclease from which it was derived.

[0280] The efficacy of F8R 17-18 meganucleases was also determined in a time-dependent manner 2, 5, and 7 days after introduction of the meganucleases into CHO reporter cells. In this study, F8R 17-18h or F8R 17-18c cells (1.0×10^6) were electroporated with 1×10^6 copies of F8R 17-18 meganuclease mRNA per cell using a BioRad Gene Pulser Xcell according to the manufacturer's instructions. At the designated time points post-transfection, cells were evaluated by flow cytometry to determine the percentage of GFP-positive cells. A CHO-23/24 meganuclease was also included at each time point as a positive control.

[0281] As shown in FIG. 8, the % GFP produced by F8R 17-18 meganucleases was relatively consistent over the time course of each study in both F8R 17-18h (FIG. 8A) and F8R 17-18c (FIG. 8B) cells, indicating persistent cleavage activity and a lack of any substantial toxicity in the cells.

3. Reduced Off-Target Cutting in a Cell Reporter Assay by Optimized F8R Nucleases

[0282] Analysis of off-target cutting by second-generation F8R meganucleases (which are specific for SEQ ID NOs: 9 and 11) revealed an off-target recognition sequence which is cleaved at some frequency in cells. This recognition site is referred to herein as "nf2" and is set forth in SEQ ID NO: 46.

[0283] Third-generation F8R nucleases were engineered to reduce cutting of the off-target site, while maintaining specificity for both the human and canine recognition sequences (SEQ ID NOs: 9 and 11). The CHO cell reporter assay used to evaluate the third-generation F8R nucleases is similar to that described above. In CHO reporter cell lines developed for this study, one recognition sequence inserted into the GFP gene was the human F8R 17-18 recognition sequence (SEQ ID NO: 9), the canine F8R 17-18 recognition sequence (SEQ ID NO: 11), or the nf2 off-target recognition sequence (SEQ ID NO: 46). The second recognition sequence inserted into the GFP gene was a CHO-23/24 recognition sequence, which is recognized and cleaved by a control meganuclease called "CHO-23/24". CHO reporter cells comprising the human F8R 17-18 recognition sequence and the CHO-23/24 recognition sequence are referred to as "F8R 17-18h cells." CHO reporter cells comprising the canine F8R 17-18 recognition sequence and the CHO-23/24 recognition sequence are referred to as "F8R 17-18c cells." CHO reporter cells comprising the nf2 off-target recognition sequence and the CHO-23/24 recognition sequence are referred to as "Off-target cells."

[0284] CHO reporter cells were transfected with mRNA encoding either the F8R 17-18L1.35 meganuclease, or third-generation nucleases including F8R 17-18L.553, F8R 17-18L.615, or F8R 17-18L.626. CHO reporter cells were also transfected with mRNA encoding the CHO-23/24 meganuclease. In each assay, 5e4 CHO reporter cells were transfected with 90 ng of mRNA in a 96-well plate using Lipofectamine® MessengerMax (ThermoFisher) according to the manufacturer's instructions. F8R 17-18h and F8R 17-18c cells were evaluated by flow cytometry at 48 hours, 120 hours and 168 hours to determine the percentage of GFP-positive cells compared to an untransfected negative control. F8R 17-18 off-target cells were evaluated by flow cytometry at 48 hours only. Data obtained at each time point was normalized to the % GFP positive cells observed using the CHO-23/24 meganuclease to determine an "activity score" and the normalized data from the earliest time point was subtracted from that of the latest time point to determine a "toxicity score". The activity and toxicity scores were then added together to determine an "activity index," which was then normalized to the activity index of the CHO-23/24 meganuclease to compare data between cell lines.
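The scoring arithmetic described above can be sketched as follows, under stated assumptions. The text does not say which time point supplies the activity score that enters the sum; this sketch assumes the earliest time point. All % GFP values are hypothetical placeholders, not data from the study, and the function names are illustrative.

```python
def normalized_scores(test_gfp: dict, control_gfp: dict) -> dict:
    """Divide % GFP-positive cells for a test nuclease by the CHO-23/24 value
    at the same time point (keys are hours post-transfection)."""
    if set(test_gfp) != set(control_gfp):
        raise ValueError("test and control must share the same time points")
    return {t: test_gfp[t] / control_gfp[t] for t in sorted(test_gfp)}


def activity_index(test_gfp: dict, control_gfp: dict) -> float:
    """Activity score plus toxicity score, normalized to the control's index.

    Assumption: the activity score is the normalized value at the earliest
    time point; the toxicity score is latest minus earliest normalized value.
    """
    norm = normalized_scores(test_gfp, control_gfp)
    times = sorted(norm)
    activity = norm[times[0]]
    toxicity = norm[times[-1]] - norm[times[0]]

    # The control normalized to itself scores 1.0 at every time point, so its
    # own index is 1.0 under this reading; the division is kept for clarity.
    control_norm = normalized_scores(control_gfp, control_gfp)
    control_index = control_norm[times[0]] + (control_norm[times[-1]] - control_norm[times[0]])
    return (activity + toxicity) / control_index


# Hypothetical % GFP-positive values at 48, 120 and 168 hours.
test = {48: 40.0, 120: 38.0, 168: 36.0}
control = {48: 50.0, 120: 50.0, 168: 50.0}
print(round(activity_index(test, control), 3))
```

A negative toxicity score, as in this hypothetical example, reflects a falling fraction of GFP-positive cells over time relative to the control.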

[0285] As shown in FIG. 9, the F8R 17-18L1.35 meganuclease cleaved both the human and canine F8R 17-18 recognition sequences with high frequency, while also showing cleavage of the nf2 off-target site. However, each of the optimized third-generation nucleases maintained a high efficiency of on-target cutting of the human and canine F8R 17-18 recognition sequences, while drastically reducing cleavage of the nf2 off-target recognition sequence. In particular, the F8R 17-18L.615 and F8R 17-18L.626 nucleases exhibited the lowest frequencies of cleavage of the off-target site.

4. Conclusions

[0286] These studies demonstrated that F8R meganucleases encompassed by the invention can efficiently target and cleave their respective recognition sequences in cells. Further, these studies demonstrated that optimized third-generation nucleases could be developed to improve specificity for the on-target human and canine recognition sequences while reducing cleavage of off-target sequences.

Example 2

Reversion of Exons 1-22 in the Human Factor VIII Gene in Patient T Cells

1. Methods

[0287] This study demonstrated that F8R 17-18 nucleases encompassed by the invention could lead to the reversion of the hemophilia A specific Factor VIII gene inversion in hemophilia A patient T cells.

[0288] Hemophilia A patient T cells (1×10^6) were transfected with mRNA (1 µg) encoding F8R 17-18x.1, F8R 17-18x.2, F8R 17-18x.79, or F8R 17-18x.88, respectively, using a Lonza 4D nucleofector according to the manufacturer's instructions. At 3 days post-transfection, genomic DNA was isolated from cells and long-distance PCR was performed to determine Factor VIII genome editing. Genomic DNA isolated from untreated normal T cells, as well as patient T cells transfected with mRNA encoding green fluorescent protein (GFP) and F8R 11-12x.69 nuclease, respectively, served as controls.

[0289] In this long-distance PCR, the genomic DNA was amplified between primers H1U/H1D and H3D/H1D, respectively.

TABLE-US-00003
H1U: (SEQ ID NO: 40) [5'-CCCTTACAGTTATTAACTACTCTCATGAGGTTCATTCC-3']
H1D: (SEQ ID NO: 41) [5'-GGCCCTACAACCATTCTGCCTTTCACTTTCAGTGCAATA-3']
H3D: (SEQ ID NO: 42) [5'-CACAAGGGGGAAGAGTGTGAGGGTGTGGGATAAGAA-3']

[0290] Primers H1U and H1D bind upstream and downstream, respectively, of the int22h-1 repeat in intron 22 of the Factor VIII gene; primer H3D binds downstream of a near-identical, inversely oriented copy of the int22h-1 repeat located about 0.5 Mb upstream of int22h-1.

[0291] FIG. 10 shows the gels obtained following the long-distance PCR. FIG. 10A shows the result of using the H3D/H1D primer pair. FIG. 10B shows the result of using the H1U/H1D primer pair. In each figure, lane 1 is a molecular weight standard, lane 2 is gDNA obtained from normal human T cells, and lanes 3-8 show gDNA obtained from hemophilia A patient T cells treated with mRNA for GFP (lane 3), the F8R 11-12x.69 meganuclease (which targets a separate recognition site (F8R 11-12) in the int22h-1 region that is only present in the human genome) (lane 4), the F8R 17-18x.1 meganuclease (lane 5), the F8R 17-18x.2 meganuclease (lane 6), the F8R 17-18x.79 meganuclease (lane 7), or the F8R 17-18x.88 meganuclease (lane 8). Lane 9 is blank.

[0292] Long-distance PCR of genomic DNA from normal human T cells with primers H1U/H1D yields an approximately 11 kb amplicon (FIG. 10B, lane 2) while PCR with primers H3D/H1D does not generate an amplification product (FIG. 10A, lane 2). Conversely, long-distance PCR of genomic DNA from patient cells with the hemophilia A gene inversion treated with GFP mRNA using primers H1U/H1D fails to generate a PCR product (FIG. 10B, lane 3) while the H3D/H1D PCR yields an approximately 11 kb amplicon (FIG. 10A, lane 3).

[0293] Upon successful reversion of the genomic fragment in patient T cells between two inversely oriented int22h repeats, the H1U primer binding site, which is located on the inverted fragment, is reoriented relative to the H1D and H3D primer binding sites. Now the H1U/H1D PCR yields the 11 kb amplicon (FIG. 10B, lanes 4 through 8), indicating a reversion to the wild-type configuration of the Factor VIII gene. PCR fragments were analyzed by agarose gel electrophoresis and visualized by ethidium bromide.

2. Results

[0294] Genomic DNAs from hemophilia A patient T cells treated with mRNA encoding F8R 17-18x.1, F8R 17-18x.2, F8R 17-18x.79 and F8R 17-18x.88 nucleases (or GFP as a control) were analyzed by long-distance PCR (FIG. 10). Only the H3D/H1D fragment could be amplified from genomic DNA isolated from patient T cells treated with GFP mRNA (FIG. 10A, lane 3). Using genomic DNA from F8R 17-18x.1, F8R 17-18x.2, F8R 17-18x.79 and F8R 17-18x.88 nuclease-treated patient T cells as PCR template, both H1U/H1D and H3D/H1D primer combinations yielded their signature wild-type (~11 kb) and inversion (~11 kb) amplicons, respectively (FIG. 10A, lanes 5-8, and FIG. 10B, lanes 5-8). The H3D/H1D fragment was still amplified from genomic DNA from F8R nuclease-treated patient T cells because the nuclease treatment generated a mixed population of cells with edited and unedited genomes.
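The readout logic of the long-distance PCR described in this example can be summarized in a short sketch. The function name and the returned labels are hypothetical; the mapping of band patterns to locus configurations follows the description above, and the "no product" case is an added assumption for completeness.

```python
def interpret_long_distance_pcr(h1u_h1d_band: bool, h3d_h1d_band: bool) -> str:
    """Map presence of the ~11 kb H1U/H1D and H3D/H1D amplicons to a locus call.

    H1U/H1D amplifies only the wild-type orientation; H3D/H1D amplifies only
    the inverted orientation.
    """
    if h1u_h1d_band and h3d_h1d_band:
        return "mixed population (wild-type and inverted alleles)"
    if h1u_h1d_band:
        return "wild-type configuration"
    if h3d_h1d_band:
        return "inverted configuration"
    # Assumption: neither band is treated as an uninformative reaction.
    return "no product"


# Patterns described in the text: normal cells, GFP-treated patient cells,
# and nuclease-treated patient cells.
print(interpret_long_distance_pcr(True, False))
print(interpret_long_distance_pcr(False, True))
print(interpret_long_distance_pcr(True, True))
```

The same mapping applies in the HEK293 study of Example 3, where the starting material is wild-type and the appearance of the H3D/H1D amplicon indicates inversion.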

3. Conclusions

[0295] F8R 17-18 meganucleases encompassed by the invention were capable of inducing a reversion of the inverted Factor VIII gene back to a wild-type configuration in hemophilia A patient T cells in vitro, and this reversion could be detected by long-distance PCR.

Example 3

Inversion of Exons 1-22 in the Human Factor VIII Gene in HEK293 Cells

1. Methods

[0296] Nucleases F8R 17-18x.1 and F8R 17-18x.88 were optimized with respect to their specificity for the human and canine F8R 17-18 recognition sequences. Two improved nucleases, F8R 17-18L1.35 (derived from F8R 17-18x.1) and F8R 17-18L2.23 (derived from F8R 17-18x.88), were evaluated in HEK293 cells for their ability to cause the Factor VIII locus inversion characteristic of a subset of hemophilia A patients.

[0297] HEK293 cells were transfected with mRNA encoding F8R17-18L1.35 and F8R17-18L2.23 (both in triplicate) and long-distance PCR was performed, essentially as described in Example 2. Genomic DNA isolated from HEK293 cells transfected with mRNA encoding green fluorescent protein (GFP) as well as F8R17-18x.1 and F8R17-18x.88 nucleases, respectively, served as controls.

[0298] Gels obtained from the long-distance PCR are shown in FIG. 11. FIG. 11A shows the result of using the H1U/H1D primer pair. FIG. 11B shows the result of using the H3D/H1D primer pair. In each figure, lane 1 is a molecular weight standard, lanes 2 and 12 are blank, lane 3 is gDNA obtained from untreated HEK293 cells, and lanes 3-11 show gDNA obtained from HEK293 cells treated with mRNA for GFP (lane 3), the F8R 17-18x.1 meganuclease (lane 4), the F8R 17-18x.88 meganuclease (lane 5), the F8R 17-18L1.35 meganuclease (lanes 6-8), or the F8R 17-18L2.23 meganuclease (lanes 9-11).

2. Results

[0299] Genomic DNAs from HEK293 cells treated with mRNA encoding F8R 17-18L1.35 and F8R 17-18L2.23 nucleases (or GFP as a control) were analyzed by long-distance PCR (FIG. 11). Only the H1U/H1D fragment could be amplified from genomic DNA isolated from HEK293 cells treated with GFP mRNA (FIG. 11A, lane 3; but not FIG. 11B, lane 3). Using genomic DNA from F8R 17-18L1.35 and F8R 17-18L2.23 nuclease-treated HEK293 cells as PCR template, both H1U/H1D and H3D/H1D primer combinations yielded their signature wild-type (~11 kb) and inversion (~11 kb) amplicons, respectively (FIG. 11A, lanes 6-11; FIG. 11B, lanes 6-11). The H1U/H1D fragment could still be amplified from genomic DNA from F8R 17-18 nuclease-treated HEK293 cells because the nuclease treatment generated a mixed population of cells with edited and unedited genomes.

3. Conclusions

[0300] The optimized F8R 17-18 meganucleases were capable of inducing an inversion of the Factor VIII locus in HEK293 cells in vitro, and this inversion could be detected by long-distance PCR.

Example 4

Inversion of Exons 1-22 in the Canine Factor VIII Gene in MDCK Cells

1. Methods

[0301] This study demonstrated that F8R 17-18 meganucleases encompassed by this invention can lead to the hemophilia A-specific Factor VIII gene inversion in canine MDCK cells. In addition, the described method can be used to determine the efficiency of F8R nuclease-mediated Factor VIII gene inversion.

[0302] MDCK cells (1×10^6) were transfected with mRNA (1 µg) encoding F8R 17-18x.1, F8R 17-18x.2, F8R 17-18x.79 and F8R 17-18x.88 meganucleases, respectively, using the ThermoFisher Neon Transfection System according to the manufacturer's instructions. At 3 days post-transfection, genomic DNA was isolated from cells and inverse digital PCR was performed to determine Factor VIII genome editing. Genomic DNA isolated from MDCK cells transfected with mRNA encoding green fluorescent protein (GFP) served as a control.

[0303] Genomic DNA was digested to completion with restriction endonuclease BclI. Digested DNA was circularized using T4 DNA ligase and analyzed by inverse digital PCR using the Bio-Rad QX200 Digital PCR System according to the manufacturer's instructions.

[0304] In normal canine genomic DNA, the BclI digest generates an approximately 12 kb fragment encompassing the int22h-1 repeat in intron 22 of the Factor VIII gene as well as an approximately 9 kb fragment encompassing a near-identical, inversely oriented copy of the int22h-1 repeat located about 0.5 Mb upstream of int22h-1.

[0305] In inverse digital PCR, the two circularized BclI fragments described above are amplified with primers flanking the respective BclI sites. Primers U1 and D1 bind upstream and downstream, respectively, of the int22h-1 repeat in intron 22 of the Factor VIII gene; primer U3 binds upstream of a near-identical, inversely oriented copy of the int22h-1 repeat located about 0.5 Mb upstream of int22h-1. All primers bind the genomic DNA in opposite orientation to conventional PCR and generate amplicons only when the BclI fragments are circularized.

TABLE-US-00004
U1 (SEQ ID NO: 43): 5'-GAGCCAGTTGTGTACCAT-3'
D1 (SEQ ID NO: 44): 5'-ACGTGATCCCGATTTGAATA-3'
U3 (SEQ ID NO: 45): 5'-GCACCTTACTGTCCTGAT-3'

[0306] Inverse digital PCR of BclI-digested and circularized MDCK genomic DNA with primers U1/D1 yields an approximately 0.5 kb amplicon that can be detected using a TaqMan probe while PCR with primers U3/U1 does not generate an amplification product.

[0307] Upon successful inversion of the genomic fragment between int22h-1 and its distal copy, the U1 primer binding site, which is located on the inverted fragment, is reoriented relative to the D1 and U3 primer binding sites. Now, the U1/D1 PCR fails to generate a PCR product, while the U3/U1 PCR yields an approximately 0.5 kb amplicon which can be detected with the same TaqMan probe.
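The expected amplification pattern before and after inversion can be summarized in a toy model. The following Python sketch is illustrative only and is not part of the described method: it assumes that each circularized BclI fragment can be represented simply as the set of primer binding sites it carries, and the names WILD_TYPE, INVERTED, amplifies and expected_pattern are hypothetical.

```python
# Toy model (assumption): a circularized BclI fragment is reduced to the
# set of primer binding sites located on it.
WILD_TYPE = [{"U1", "D1"}, {"U3"}]   # int22h-1 fragment, distal-copy fragment
INVERTED = [{"U3", "U1"}, {"D1"}]    # U1 site relocated next to U3 by the inversion


def amplifies(fragments, primer_pair):
    """Return True if both primers of the pair sit on the same circular fragment."""
    pair = set(primer_pair)
    return any(pair <= fragment for fragment in fragments)


def expected_pattern(fragments):
    """Predict which of the two inverse PCR assays yields an amplicon."""
    return {
        "U1/D1": amplifies(fragments, ("U1", "D1")),
        "U3/U1": amplifies(fragments, ("U3", "U1")),
    }


if __name__ == "__main__":
    print("wild type:", expected_pattern(WILD_TYPE))
    print("inverted: ", expected_pattern(INVERTED))
```

In this simplified view, an unedited genome is positive only in the U1/D1 assay and an inverted genome is positive only in the U3/U1 assay, so a mixed cell population gives signal in both.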

[0308] FIG. 12 shows the results of the digital PCR performed in this study. FIG. 12A shows the results obtained using the U1/D1 primer pair. FIG. 12B shows the results obtained using the U3/U1 primer pair. In each figure, the lanes are as follows: lane 1, MDCK cells treated with mRNA encoding GFP; lane 2, MDCK cells treated with mRNA encoding the F8R 17-18x.1 meganuclease; lane 3, MDCK cells treated with mRNA encoding the F8R 17-18x.2 meganuclease; lane 4, MDCK cells treated with mRNA encoding the F8R 17-18x.79 meganuclease; lane 5, MDCK cells treated with mRNA encoding the F8R 17-18x.88 meganuclease; lane 6, no template control.

2. Results

[0309] Genomic DNAs from MDCK cells treated with GFP and MDCK cells treated with F8R17-18x.1, F8R17-18x.2, F8R17-18x.79 and F8R17-18x.88 nucleases, respectively, were analyzed by inverse digital PCR. Only the U1/D1 fragment was amplified from genomic DNA isolated from GFP-treated MDCK cells (FIG. 12A, lane 1), while the U3/U1 PCR did not generate a signal (FIG. 12B, lane 1). Using genomic DNA from F8R nuclease-treated MDCK cells, both U1/D1 and U3/U1 amplicons were detected (FIG. 12A, lanes 2-5, and FIG. 12B, lanes 2-5). The U1/D1 fragment was still amplified from genomic DNA from F8R nuclease-treated MDCK cells because the nuclease treatment generated a mixed population of cells with both edited and unedited genomes. Since digital PCR allows parallel analysis of hundreds to thousands of chromosome equivalents, the Factor VIII gene inversion efficiency could be calculated. Out of the total number of Factor VIII genes detected by this assay, 0.3 to 2.8% showed an inversion as a result of the activity of nucleases F8R17-18x.1, F8R17-18x.2, F8R17-18x.79 and F8R17-18x.88, respectively.
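One way such an inversion percentage could be derived from digital PCR droplet counts is sketched below in Python. This is a minimal illustration under stated assumptions, not the calculation actually used in the study: it assumes the U1/D1 and U3/U1 reactions are run on equal amounts of the same circularized DNA, it applies the standard Poisson correction for digital PCR (mean copies per droplet = -ln(1 - fraction of positive droplets)), and the function names and droplet counts are hypothetical.

```python
import math


def copies_per_droplet(positive, total):
    """Poisson-corrected mean number of target copies per droplet."""
    if total <= 0 or not 0 <= positive < total:
        raise ValueError("need 0 <= positive < total droplets")
    return -math.log(1.0 - positive / total)


def inversion_percent(u1d1_positive, u1d1_total, u3u1_positive, u3u1_total):
    """Percent of detected Factor VIII genes carrying the inversion.

    U1/D1 droplets report uninverted copies; U3/U1 droplets report inverted copies.
    """
    uninverted = copies_per_droplet(u1d1_positive, u1d1_total)
    inverted = copies_per_droplet(u3u1_positive, u3u1_total)
    detected = uninverted + inverted
    if detected == 0:
        raise ValueError("no Factor VIII copies detected")
    return 100.0 * inverted / detected


if __name__ == "__main__":
    # Hypothetical droplet counts, for illustration only.
    print(round(inversion_percent(4000, 15000, 100, 15000), 2))
```

The denominator is the sum of inverted and uninverted copies, which mirrors the statement that the percentage is taken out of the total number of Factor VIII genes detected by the assay.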

3. Conclusions

[0310] Inverse digital PCR detected Factor VIII gene inversion in MDCK cells treated with nucleases F8R17-18x.1, F8R17-18x.2, F8R17-18x.79 and F8R17-18x.88. In addition, using inverse digital PCR, the editing efficiency could be calculated. Depending on the nuclease, up to 2.8% of the detected Factor VIII genes in MDCK cells were edited. Importantly, this study further demonstrates that Factor VIII gene inversions can be induced by DNA double-strand breaks within the int22h repeats, and that F8R 17-18 nucleases can induce a flipping of the intron in both human and canine cells.

Example 5

Inversion of Exons 1-22 in the Canine Factor VIII Gene in MDCK Cells

1. Methods

[0311] As previously discussed, the F8R 17-18x.1 and F8R 17-18x.88 meganucleases were optimized with respect to their specificity for the human and canine F8R 17-18 recognition sequence. In this study, the optimized F8R 17-18L1.35 and F8R17-18L2.23 meganucleases were evaluated in MDCK cells for their ability to cause the Factor VIII locus inversion characteristic of a subset of hemophilia A patients.

[0312] MDCK cells were transfected with mRNA encoding F8R 17-18L1.35 or F8R 17-18L2.23 and inverse digital PCR was performed, essentially as described in Example 4. Genomic DNA isolated from MDCK cells transfected with mRNA encoding green fluorescent protein (GFP) as well as F8R 17-18x.1 and F8R 17-18x.88 nucleases, respectively, served as controls.

[0313] FIG. 13 shows the results of the digital PCR performed in this study. FIG. 13A shows the results obtained using the U1/D1 primer pair. FIG. 13B shows the results obtained using the U3/U1 primer pair. In each figure, the lanes are as follows: lane 1, MDCK cells treated with mRNA encoding GFP; lane 2, MDCK cells treated with mRNA encoding the F8R 17-18x.1 meganuclease; lane 3, MDCK cells treated with mRNA encoding the F8R 17-18x.2 meganuclease; lane 4, MDCK cells treated with mRNA encoding the F8R 17-18L1.35 meganuclease; lane 5, MDCK cells treated with mRNA encoding the F8R 17-18L2.23 meganuclease.

2. Results

[0314] Genomic DNAs from MDCK cells treated with GFP and MDCK cells treated with F8R 17-18L1.35 and F8R 17-18L2.23 nucleases, respectively, were analyzed by inverse digital PCR. Only the U1/D1 fragment was amplified from genomic DNA isolated from GFP-treated MDCK cells (FIG. 13A, lane 1), while the U3/U1 PCR did not generate a signal (FIG. 13B, lane 1). Using genomic DNA from F8R nuclease-treated MDCK cells, both U1/D1 and U3/U1 amplicons were detected (FIG. 13A, lanes 4 and 5, and FIG. 13B, lanes 4 and 5). The U1/D1 fragment was still amplified from genomic DNA from F8R nuclease-treated MDCK cells because the nuclease treatment generated a mixed population of cells with both edited and unedited genomes. Since digital PCR allows parallel analysis of hundreds to thousands of chromosome equivalents, the Factor VIII gene inversion efficiency could be calculated. Out of the total number of Factor VIII genes detected by this assay, 2.8 to 4.5% showed an inversion as a result of the activity of nucleases F8R 17-18L1.35 and F8R 17-18L2.23, respectively.

3. Conclusions

[0315] The optimized F8R meganucleases F8R 17-18L1.35 and F8R 17-18L2.23 were also capable of inducing an inversion of the wild-type Factor VIII gene in MDCK cells in vitro, and this inversion could be detected by digital PCR.

Example 6

Whole-Blood Clotting Time in a Nuclease-Treated Hemophilic Dog

1. Methods

[0316] This study demonstrated that nucleases encompassed by this invention can be administered in vivo and lead to a significant reduction in whole-blood clotting time in a large animal model of hemophilia A.

[0317] A dog from the Chapel Hill colony of Factor VIII-deficient dogs was selected for use in this study. Dogs in this colony are hemophilic and carry the int22 inversion of exons 1-22 in their Factor VIII gene. In this study, a 24.1 kg dog was infused with a single dose of 2.4E14 AAV8 particles (1E13 viral genomes/kg) encoding the F8R17-18L1.35 meganuclease under the control of a liver-specific promoter.
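The relationship between the weight-based dose and the total dose can be checked with simple arithmetic. The Python sketch below is illustrative only; the function name is hypothetical, and it assumes, as the paragraph above does, that the particle count and the viral genome count are used interchangeably.

```python
def total_vector_genomes(body_weight_kg, dose_vg_per_kg):
    """Total dose (viral genomes) for a weight-based dose."""
    if body_weight_kg <= 0 or dose_vg_per_kg <= 0:
        raise ValueError("weight and dose must be positive")
    return body_weight_kg * dose_vg_per_kg


if __name__ == "__main__":
    total = total_vector_genomes(24.1, 1e13)
    print(f"{total:.1e}")  # prints 2.4e+14
```

For the 24.1 kg animal at 1E13 viral genomes/kg, this gives approximately 2.4E14, consistent with the stated single dose.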

[0318] Whole-blood clotting time (WBCT) was determined repeatedly over a period of approximately 2.5 months (110 days) following AAV infusion. The WBCT assay uses two siliconized glass tubes (Vacutainer, Becton-Dickinson, Rutherford, N.J.) in a 28° C. water bath. One mL of whole blood was drawn from the study animal and 0.5 mL of blood was distributed into each dry tube. After one minute, one tube was tilted every 30 seconds while the other was left undisturbed. When a clot formed in the tilted tube, the second tube was then tilted every 30 seconds until a clot formed. The time for formation of a fully gelled clot in the second tube was recorded as the WBCT.

2. Results

[0319] Untreated dogs from the Chapel Hill colony of Factor VIII-deficient dogs synthesize a truncated Factor VIII transcript and consequently fail to express functional Factor VIII protein. As a result, their baseline clotting time is at or above 60 minutes (see FIG. 14), whereas the WBCT of a normal dog is between 8 and 12 minutes (grey lines). However, following a single administration of AAV encoding the F8R 17-18L1.35 meganuclease under the control of a liver-specific promoter, a significant reduction in WBCT was observed within 21 days (~45 minutes), and WBCT was further reduced to as low as 35 minutes during the course of the study. A relatively stable clotting time between 35 and 40 minutes was maintained through day 112.

3. Conclusions

[0320] A functional Factor VIII protein can only be synthesized from a correctly spliced transcript containing all 26 exons. The shortened WBCT observed in this dog after AAV infusion provides in vivo proof-of-concept for nuclease-induced Factor VIII gene reversion to a wild-type orientation in a large animal model of hemophilia A.

Example 7

Detection of Corrected Factor VIII mRNA by Digital PCR in Liver Biopsy

1. Methods

[0321] A liver biopsy was obtained from the hemophilic dog treated in Example 6. The biopsy sample was collected approximately four months (117 days) after AAV administration of the nuclease. Total RNA was isolated using TRIzol reagent (ThermoFisher, Waltham, Mass.) following the manufacturer's instructions. Total RNA isolated from a liver biopsy from the same dog prior to AAV infusion served as a control. RNA was analyzed by reverse transcription digital PCR (RT-dPCR; Bio-Rad, Hercules, Calif.) using the following primers and probes. Full-length Factor VIII transcripts can be detected using primers K9F8e22F (which binds in exon 22) and K9F8e23R (which binds in exon 23) and probe K9F8e22-23FAM (which overlaps the exon 22-23 junction).

TABLE-US-00005
K9F8e22F (SEQ ID NO: 47): 5'-TCTGGATGGCAACAAGT-3'
K9F8e22-23FAM: 5'-56-FAM/TGCCAAAGA (SEQ ID NO: 48)/ZEN/AGACCATTAAGGTCCC (SEQ ID NO: 49)/3IABkFQ-3'
K9F8e23R (SEQ ID NO: 50): 5'-TGATCCCAGATGAATCCAC-3'

2. Results

[0322] Total RNA isolated from liver biopsies taken prior to the study and on day 117 after AAV infusion was analyzed by RT-dPCR using K9F8e22F, K9F8e22-23FAM and K9F8e23R. In RNA purified from the pre-treatment biopsy (Day 0), the assay failed to detect the exon 22-23 junction, whose presence is indicative of correctly spliced Factor VIII transcripts (FIG. 15, left panel). In contrast, the assay detected correctly spliced Factor VIII transcripts in RNA purified from the post-treatment biopsy (Day 117) (FIG. 15, right panel).

3. Conclusions

[0323] In this study, RT-dPCR detected correctly spliced Factor VIII transcripts in RNA isolated from a liver biopsy taken from a hemophilic dog treated with the F8R17-18L1.35 meganuclease. Prior to treatment, this transcript is absent due to the hemophilia A-specific Factor VIII gene inversion. In these dogs, two messenger RNAs containing Factor VIII exons 1-22 and 23-26, respectively, are transcribed in opposite directions from two different promoters on the X-chromosome and cannot be spliced into a transcript encoding functional Factor VIII protein. Only after Factor VIII gene reversion can an mRNA be synthesized that is spliced into a transcript containing all 26 exons, thereby leading to the synthesis of full-length, functional Factor VIII.

[0324] The presence of correctly spliced Factor VIII transcripts observed in this study provides further proof-of-concept for nuclease-induced Factor VIII gene reversion in a large animal model of hemophilia A.

Sequence CWU 1

1

501163PRTChlamydomonas reinhardtii 1Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Ile Ala Gln Ile Lys Pro Asn Gln Ser 20 25 30Tyr Lys Phe Lys His Gln Leu Ser Leu Ala Phe Gln Val Thr Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Arg Gly Ser Val Ser Asp Tyr Ile Leu Ser Glu65 70 75 80Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Trp Arg Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys145 150 155 160Ser Ser Pro29PRTChlamydomonas reinhardtii 2Leu Ala Gly Leu Ile Asp Ala Asp Gly1 539512DNAHomo sapiens 3ttgaacagtc actgagcaac tactatgtcc tgggttctaa ttcaggggtg ggcaaactat 60agccccagaa tttggcccat agcctgcttt tgtacggact gtgagttaag aatagttttt 120acacgttgaa aggattgcaa agacaaacat acaaagaaac aaagaagact gtgcaacaga 180gaccacctgt ggtctgtaaa gactaacaca tttctatccc gcccttgaca gaaagagtct 240gtgggctgct ggtctcctct aacggtggta gagatgcctg cacggttaac attaatcctt 300ggcaccagaa tcctcagcac ctaggaacca gatcttgcct aacacgctaa tttcagtctt 360gaccaccttc ctccggcgta gcggttctca aacgtctttg tctttgtact attcacgtat 420aaaatattct ttcataagca acatttatcc ttttgggaat acctcacaat ggggagaagg 480ggaaccccaa cagcctttaa gggttcactg cttcgctgcc accatttccg acggtttaca 540ctgttttaag tgtgaatttg gaattgtttt caagttaaaa agcaaactaa tcatgtcctc 600tgaaagtatt tgcttttggc atgctagaaa tcagtgttga cttttgatac tgatgtgata 660cattacgagg gaattctcaa aactgctgaa atctggcctc ggcccagtta tcactgctct 720taagcttctg agggagatac atgaatgagc ccacggcaaa tggaaaacga ggagttttag 780tgtttcctag aattatgctc agatacccac tacctgaccg tctggttatc cttcccctca 840cctccctgac gcaaggagtt tggaccagag gcctaaggag gcttctcctg aagcccaaat 900cccaaactgg tcaccagtca tgctccgtaa ctcctgaacc tgacaagaag ccagccggcc 960aggtctgcag 
cccgggttaa gaggagcata ccaggaaaga gccaaagagc aaaggagcat 1020gagcccttca accgctttta caataatttg ggctaggcgt tcagggctcc gtaggaccct 1080tcctggcagc caagtgagag aaagaggaat gatggtggaa tgggcctctc ctgtgcttcc 1140cattacttcc acactgtcga aatagaaata gaagcagaaa agaacaccct acaagtccca 1200cccatttgga ggcactcaac tcacagtgac aaccctccac acctctcccc tgcaaaaaga 1260cgcaaaacaa aaacacctac tccaaactgt gtccttacat ctcagccccg aagatcagga 1320ttgtgtgcaa cttcggccca aaggatgcat ttccccaggg ttgaaagttt gagaaagagg 1380ctatattctg aagagttctt gttgtcacca tcaaaaggat taaaaagacg caataaataa 1440gaaaacagcg tagttggggg gcatgctcca tttgagccag aaagccttgg aaacttaagt 1500gttctcaaac ggaacgccat cctgctttgg gggaacacgg aggctgcctt gcagtcacgt 1560gatcgcacaa caccaaaggg ccacgcactc tgatttcacc tacttaacta aaagttgcag 1620caaaatccct attacaggcc aggcgtggtg gctcatgcct gtaatcccag cactttggga 1680ggccgaggag ggtggatcat ttgaggtcag gagttggaga ccagcttggc caacatggtg 1740agaccccatc tctattaaaa atacaaaaat tagcccagcg tggtggtgca cgcctgtaat 1800cccaggcacc ctggaggttg aggtaggaga atcgcttgaa cccaggaggc ggaggttgca 1860gtgagccgag atcacgccac tgcgctcctg cctgggcgac agagtgagac tccatctcgg 1920ggaaaaaaaa aaaaaaaaaa aaaaaaatcc ctattacaaa taaaagctgt tgtgatccag 1980actgcatata cctctgcgaa tggaaccaga accgtgaatt ccaatgcaaa tcgatgcatc 2040ggcaccagac ccgctgcact ggatgtatct gcattgcagt cacccgagta cggagcacat 2100catagatgat ctctgcaggt tcgttgccca cataggaggc atagcgcaaa tttcaaagga 2160acgaatacat cctggagccc aaacagctat ctggttctgc tgctggcctc ctgacaagta 2220ggtaagagag tcacatttta tagacgacgg acaccaaaac cacacatgag gagtacaaga 2280gtagctttat catggattta gggctgtggt tacaaggaag ctgtaaggaa taaaatgact 2340cccatgaaga cgtaccgtgc ggacgagtgg aaggagaaat ttggccatta caaagacaca 2400ggaatatgtt aagaagtgag gggcaggatg aaatcatcta gggtaggtat ttagagggag 2460ggcgccgtgc aaaataaaat cctcactatg aaacaaaggc ggaggcagga ggctgcgtta 2520ggtggaagca gcggaggaag gagacgaaag ggattgtcat tttcatgtcg tggcttttta 2580gaagacagcc atgtcctcta ctctgattct atcaaaatgt gttctcgggg tgctggtaac 2640gttcagccaa cgaaataatt cctatggcgg cagtaggaat 
aacaaaacgc agaagcggga 2700acgatgtctt tttattcctc cccagacgca aacgtggatg catgaggttt ggtaacaggc 2760aaagtcatct ggttaacgtg actgatgcaa aaagtccagg cctgggcaaa aagaagtcac 2820tgggtgaatg ggatggatca gactccctgt cctgaggggg agatggtttc ttgcagaacg 2880aggtgaagga ggtggttctg ctcagcagtc aacagtggcc acatctccac ctgcagcgac 2940ttgatggctt ccgtgtcctt ttcgtgggta gccatgacca aagactggag cagcagaaag 3000agctcctcgg gaagctggcc gctgctctcc tgcccgtggc tgtcaaaagc ctcccaggag 3060tacttctcca gggtctgggc gtgctccggc agcagcttgg cgggcggtgg ttgcaggagg 3120agcagcagca gcacgcggga cacctcgcag cggaccagca cgtccgagaa ggcgcccagg 3180gcggcgggag agggcgccgc cgagccggag ttcggaggaa gcagcgcggc cggtagggcg 3240ggcgtcgccc cgggcccggg ctggggtgcc ggcggcgggg gcggcggcag tgactgcacc 3300gggtggctgc cgtgctcccg cgccaggcgc tgcatgcgcg tgaagaccgc cagggcgccg 3360gtgtagtcgc gcgccagcag ctggcaggag gcggcctcgc caagcgcctg cagcgcggcc 3420aggggcagct ggggcagctg gagctgggcg gcgcgctgga agtgaccggc ggcggcggcc 3480ggctggccca ggtcgcgcag ggcggcggcc agctcgaggc agagggcggc ggcggcggcc 3540ggctggccca gctcgaggtg cagacgcacc gcggcgccca gggcgctggc ggcggcctgc 3600agcggctccc cgtaggcggc ggggcagacc aggcgctggc gcgcgtcgcg ctcctgccgc 3660aggaagaggc gggcggcctc ggtgagggcc agcgcctccc cgggcccgtg gaagagcgcc 3720tgctggcagc gcgccaccgc cagctggcac caggccgcgt agggcagaca ctcctgggcg 3780cgcagctccc ggcccagctg tccgaactgc tcgccggcct ccgccacgtt cggcttccgc 3840aggaaccgct tcttcagctt gttcgatacc agccggtagc gggccaggaa gtccccggcc 3900tcgggtcccg ggccggcgcc gccgccgccc aggcctgcag ccgctgccgc catgctcgcc 3960gccccaagca cttcccgacg cgccgccgca gctggcgggc gggccggggc ggggcgacgt 4020gccctgcgtc cccctcggcg ggctgccgcc gtgcccgcgc cggctcccca gcccgagcct 4080gccccttgcc ctgatgaggt gcaaagagcg ggatcggagg cggggcctgg ccgggctgtg 4140agcggcgtat gcaaatcgag ggtctcgggg atgcggatcc aagaccctgg gaaggtacgc 4200ggggcctggc ggggcaccag ctgctgctag ctcggctgca atgcaagtgg tctaggttgc 4260taaaggcatc ccacagcctc tccatctgaa catgacccaa acgaaactcg tgaccctaat 4320tccatgtctg cgcatttcta gactgttgtc cccccccccc cccgccccga ctactcagtc 4380ctccgtcttc 
cggtccaggg ccccttgcca agcaccgggt ccacctctcc gtccccaccc 4440cggttgcctt agaagtccgt cctgtcgcaa cactgcagtc atggtcttga ggcccacccg 4500ccccaacgaa caccatcatg ctgaggactt tcccgggcag gccctgactt gctcagaacc 4560agcgggggtg tccccttccc acccagggcc actcccctgc actgtcaccc ggagagactg 4620ctcctctgtg ccatccctgg ctcccaccca accccagacc cccaccacct ctccatccct 4680ccagctgtgg aggtctcaca accccccaac ccatctcacc gcccccccac ccccacccca 4740aggcaaagtg actgaagcgg gcagatggct tccttgaaac attttattga cagaattaat 4800gaaggcccaa gactttgggg cctgggttgt ggggggaggg tgtttaaggc cgggggttca 4860ggccggggga tttggggccg ggtgggtgga cgagtggacc tgtcaggtcc caggggccgg 4920gtgtcagaag ctagtcctcg ccaggggcca cttgagagat ggtggtcgtg ttgaaaaggg 4980tgctcagtag cctgtcgttg tgaaccacca tgtccagcag caggggagtg atgttccgct 5040ctccgctgtt ctgggcctcg ttgcccgcca gctccgggac cttggccgtc aggtactcaa 5100taaccgcagc gaggtagacc ggcgccgtgc gactcaggcg ctgagcgtag tggccctccc 5160gtagactgcg ctccacctgg ctcactgaaa acgaaagctc cgctcggacg gtgcgagagc 5220aggtccgccc ccggccgcca gcaccggagg accctcggcg tctcctcctc ctcggcatgc 5280tgggcgttga gtgtgctatc tcggcttggc ccagctaggc aagatggctc tcaagaggac 5340agttaccgcg tccagtactg tgtatcctag cgaccagggc ccagcccctc attggctagg 5400gagccgagac caatgggcac gcacatccgg cgacgggcac gcatgtggtg acggcccctc 5460acaagggaca cacgtccgtc aggtgacctc atcactttcc cattggcctc gagggagcag 5520gcctgggcct agaagtggct ggagggccgt gggggtgggg tggggcgggg cagggggaat 5580cgcgctggtg accctctctt tgccagtggg aactttccct ttctactgga tgggaacacc 5640gtgggaaaga caaaggggtg ggcgagggga ggacgggtac cacgccttca caatgttgca 5700catccatcac gaccacctag ttccaaaacg ttttcaacac cccgaaaaga aaccgaaacc 5760cctgtaccta taagcagtca cttgccgcac gcctccttcc acaccaccac taccagcccc 5820cacaccctcc cacacacacc ccctgccccc gcccatacac acgttcccga tagtccctga 5880caacccctag tccatctgct ttctgtccat agaggttagc ctgttctgga gatttcctat 5940agatggaatt atacgaccaa atgtgaggcc gtgtgtgtct ggctgctttc acttagcgta 6000atggtttcat cagggtgcat ccatgtagag gcatgaatca ctacttcctt cctttgaatg 6060actgagtacg attctgttgt atgaatagga ggccacattt 
tgtttaccca ctcgtcagtt 6120gatggacagg ttatttcccc cttctggcta ttgtgagtgg cactgccatg accatctctc 6180tacaggtttt tctttgaata tctcttttca gttcttttgg gtctatttct agcagtcaaa 6240ctgctggctc gtgtggtaat tctgtttaac ttattgagga accaccaaac tgatttccac 6300agcagctgta atctttcgca ttcccaacag tagtgcatga gagtcccaat ttcttcacag 6360cctcatcaaa acctgttttc tgtttgcctc attttgtttt gtttacagta gccatcctac 6420tgggtgtcaa gtgctatctc atggtggttt tcattcgtat ttcccaaatg gctaatgatg 6480ttgctgtggt ttgagtgcat cccccaaatt gtgtgtcttg gaaacttaat ccccaaattc 6540acatgttgat tggaggcgca gcctctgaga cggtaattag gattagataa ggtcatcggg 6600gtgagacccc caggatgcga ctggtggctt tataagaata ggaagagagg cctgaaacga 6660catacacgct cttgccctct cgccgtgtga taccctctgc cgtccccaga tgccgggtca 6720cttcccagtc cccagaacgg taagaaataa atttcttttc tttataaatt gttcagtgtc 6780gggtattcaa ttatggcaac agaaaacaga ctaagacatc ttttcatgtg cttcttggcc 6840ctctgtacct ctgctttgga ggaatgtcta ttcaagccct ttgcccattt tttaattcgg 6900ttgattgtat tttggctgtg ggcttctaaa acttattcat atattctgga aaatagactc 6960ttatcagata tgtgacttgc aaatgtttct cccattcact ttctggatag agccctttgt 7020tgcccaaaag atttacattt ggatgtagtc caacttgcca aatgaaaaga tatctgtggc 7080tttgcctttg gtgtcatact gaaggagctg ttgcctaatc caaggtcgtg caaagttaca 7140tctccgtttt cttcttagag ttttatagtt tcagccctta catttagatc tgtgatccat 7200tttgaattaa ttctttacat gatgtgaggt aggggtccag gggccttctt ttgcatgtgg 7260ctatccagtt gtcccagcgc agtttgttga ggggattatt cttcccctcc acccattgag 7320gggtgccgga actcttactg aaaataaact ttacataaat atatgggttt attcctgact 7380ctgagttctg taacattgac ctaatgtatc gatcacgatg gcagtaccac ccttttcgga 7440ttactgcggt tttgtagtac gttttgaaat tgggaagtgt gagtccttca acttttttct 7500tttctgagat tgttttggct atctgagccc cttacatttt cttatgaatt ttaggatcag 7560cttgtcagtt tttacaaaga aggcaggttg gattctgaca ggcatcacga tgaatctgta 7620tattgccttg gagattatgg gcatcttaac aatattaagt gtcccaatcc gttaacacaa 7680aatgcctttc gatttattta ggtcttcttt aatttatttt agcaacgtct tgaaattttc 7740agagtataca tcttgtacac ctttagttaa atttattcct cgacatttta ttgtttcgat 7800gctactgtaa 
aatgaatcat ttccttaatc ttattttcat gttattcatt gctagggtgt 7860agaaatacaa ccgactgttg cagattgatc ttggatactg caactttgct gagccgaata 7920tgctttgctg agcatactca gacagggttg gcatattagt ccgttcctac actgctataa 7980agaactgcct gagaatgggt aattcctaaa gaaaagaggt ttaattgcct catggttctg 8040caggctgtac aaggcttctg cttctgggca ggcctcagga aacgtgcaat catggcggaa 8100ggcgaagggg aagcaagcac cttcttcaca tgttggagca ggaggaagag agagagaacg 8160cacgcaaagg gggaagcgct gcacattttc aaacaatcat cagatcttgt gagcgctcta 8220tcagaagaat agcaaggggg aagtccgccc ccatgattca atcacctccc actaggccct 8280tccttcaaca ggtggggatt acaattcgac atgagatttg ggtggggaca cagagccaaa 8340ccgtctcagt tttttttttt ttcttttgtt ggactcttta gtgtcctcta tataagaaca 8400tgccatctat gcatctatga atagagatgg ttttacttgt tcctttccga tctggatgcc 8460ttttatttct ttttcttgac taattgccct gactagaact ttgagtacga tgttgagtta 8520caagtggcat tcctgatctt agggggaaat caaccagtct ttcaccatta agtatgatat 8580tatctctggg tttttcatgg atgccctcta tcaggttgaa gaagtttctt tctgttcctg 8640gtttgttgaa tttattttca tgaaagggta ctgcgttttg tcaaatgatc ctttttgtac 8700atgattaaga tgaccatgag ccccctcccc cgcccccgct ccgccatgca ttctgttaat 8760atggtgtatt atataaattg attttcacat gttgaaccaa ccttacattt gtgggataaa 8820tcctatttgg tcatagtgta taaagagtgg tcaataaaca tttcgttgaa agaataggag 8880tggatctggc aagcttcttg gaggacaatg tgtgtgttaa agaatctgta gcatgatgag 8940aagccaaggc accggtggga ggaggggagt tgcaaccaat tcattaaggc tggagagtac 9000gatgccagtg gagcagtagt ggttgatgtg gctgggaaag aggtaggcag gagccaagac 9060atggaggttc tattatgcca tgctcaggtt ttagaatacc ctgtaggcta cactgaaccc 9120actgtggtct ttcagcttgg gagtgacgtg gtctgatttg cctctagaaa tatcaccctg 9180gaagctgtgt ggagaataga acagagagga ttgtgtgtgg agaatagaac agagaggact 9240gagattggaa ttagaaagct gctgtattat accagtcaag aaatgacaga tatctcaact 9300aagacaatgg cattggtaga taagactagg ggacagagtc cataaaaagt ttaggtagta 9360aaatggcaca cagtagacac tcactacata ttactcatac tggcgaacct agctggagac 9420atgataattc atgtgctcat tcttcaaaaa atattgaagg gtagtgccag gtatactgtg 9480ttaagcattg agacaacacc aacgagaaat at 
9512413008DNACanis lupus familiaris 4aagattttat ttttatttat ttgaaagaga gcatgagggg atccctgggt ggctcagcgg 60tttagcgcct gcctttggcc cagggcgtga tcctggagac ccgggatcga gtcccacatc 120aggctccttg catggagctt gcttctccct ctgcctgtgt ctccacctct ctctgtgtgt 180ttctcatgaa taaataaata aattttcaaa agagagagag agcatgagta agagggagag 240ggagagatag agaatctaaa gcagactctg tgctgagcag gaacccaaca cagggctcaa 300tcccacgacc ctgagatcgt gacctgagcc aaaaacaaga gtcagatgct taaccaactg 360agccacccag gcgcccctgg agttgttttc taactcatta atttctgttc aaaactttaa 420aattttcttt tactttcttt cggttttgtc tttcctaaaa catggagttg aatgactagt 480tcatttattt tcagtctttc tcattcagta tgataaaaga gttaactact ctcatgaagg 540tcattcattt aataaacagt tactgagtaa ctgctatgtt ctagattcta gtacagagat 600ggggaagcta gagcccatcc ttgaatttgg actaccactt gcttttgtac ggactgtgac 660ttaagaatgg tttttacagg tttaaagttg caaaaacaaa cattatgcaa cgaagatcat 720ttgtggcctg aaaagactaa aatattgcca tctggtcctt tacagaaagt ttgcagactg 780ctggtctggc ctaaaggcag tagagatgcc taggcagtgc catgaagcag gactcttcat 840ccccacagct tgcctaatgt gcagatttca atcttagtca tcgtccccca gtgtagtgat 900tctcagatgt ttttatatct ttgttttgct cacagacaaa atcttgttca tcagagatat 960ttcttatcat ggatgtccca cataaagagg gagaggggaa cctttaagga ccacagctac 1020actgccaccg ttctcaatag tgtagaataa tgtaaatgtt aactttttta tttttttaag 1080atttatttat tcacgaaaga cacagagaga gaggcacaga cagaggcaga gggagaagca 1140ggctccatgc agggagcccg acatgggact ccatcctggg gccccagcat cacgccctgg 1200actaaaggcg gcgctaaacc gctgagccac ctgggctgcc ctaacttttt tattttttaa 1260aagattttac ttatttatcc atgagagaca cagagaggca gagacacagg cagagggagg 1320agcaggcttc atgcagtgag cccgatgcgg gactcgatcc caggacccca ggatcatgac 1380ctgagccaaa ggcagatgct cagccactga gccacccagg tgcccctgta tatgttaact 1440ttgatatatc aatgaacatt tcagtgatgt ggctttttct ctctggtttt caagttaaaa 1500agaagagtat ttttaatgtc ctttaaaaat atttattttt ggttgtgtta gaaatcagtg 1560ttattttgac ataaagttga taattaccaa gggatttgtc aaagcctatg aaaattgcag 1620aaatctagcc ttagcccagc tatctatgct ttgaagtttc tgagggagag ctataaagga 1680gccaatgaca 
aatggaaaac aactttcatg ttttctaaaa ttatacttgt atacccaaca 1740ccaggctctc tggcaaaccc tcccctctac atcctcaatg caaggagtta gaccagaggt 1800ccaagggtgc ttttccttaa gcccaaaccc aaatactgcc accagtcaca tatcaaaact 1860cctatacatt gttgttgtag tgatgtgatt tgaagtgagg tttgggagcc aatggcgaag 1920aattcttgag actttttcag tgcaaaaaat gctgttttaa ttacagcaca gggacaggac 1980ccatgggcag aaagacctgc actggggttg cgaggagtgg ctgattatgt aagattttcc 2040atcctatgga ggggagggtg atgttaaggt cccaggaaat tgagtatagg gttcgggagg 2100tctggctatt gatgattgct ttttttcctt gtaaatcatt aagacagttg caaactgatg 2160gaagattcat gtcgtgcatg actgtgatct ctgtcagtta agcatttgtt tttccccttc 2220ctttgctctt gggcagccag gagtgcctgc acaacatcac acctttccca cctggtgggt 2280ggggtgggag gggggcagtt gttgcggagt attagcgtgt gctttaccct cagcttgcct 2340tttgctccct catcaataag actactggcc agactgaggg gggcacttga cgggatgagc 2400actgggtgtt actctagatg ttggcaaatt gaacaccaat aaaaaataaa tgtataaaaa 2460aaaaaaagac tactggccag gcctgtagcc cgggttgaga gtataccaag aaaagtaaaa 2520ctgcaaagga ccatgcaccc taaccacttt tttaaaaaat gtttttcaaa aattccagta 2580taattaatgt agtgttacat tagtttcagg tgtacaacat ggtgactcaa cacctctata 2640catttcccag ggctcatgat gattacgtgc actcttaatc cccatcatgt atttcaccca 2700tccccacacc tacaccccat ctggtgacca tcagtttctg gtctatattt aggagtctgg 2760tttttttatc tctttttttt ctttcatctt ttgtttctta aattccacag gagtgaaatc 2820atacagtatt tgtctttctc tgacttattt cacataacat tatacattct agatccatcc 2880atgttgcaaa tggcaagatt gaattctttt tcagggctca aaaatactcc attttggatc 2940tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg tcttgatcct ttcatctatg 3000gacacttgga ttgcttccat atattgactg ttgtaaataa tgctgctgta aacatagggg 3060tgtgtatctt tcttcgaatt agtgttctca ttttcgttgg gtaaataccc agttggggaa 3120ttacatggtc atatggtggt tctattttta atttttaaag gaacctccat actattttcc 3180acagtggccg caacagctgg cattcacacc aatagtgcat gaaggctcct ttttctccac 3240atcctcacca acacctgtgc tttcctgtgg ttttgatttt agccattctg acagatgtga 3300tatctcattg tggttttcac ttgcatttcc ctgatgatta gtgatgttga gcatcttttc 3360atgtgtctga taccctaaca cttttgcaat aagttagact 
aggcttttag ggctccgaaa 3420gacccttcca ggcagccacg tgagtgcgca ggactggtgg cggagtctgt gtcagatggg 3480cctcttttgt attttcttac ctactgactg cttccccgtt gctgaaatgg cggcatgagg 3540actacctaac cgtcccaccc atttacaggc gcttaactca cggtgaccac cgcctacaca 3600tcccccaaca cacacacaca ctaaaggcaa aacagaactg cccaaccgca tccttaaacc 3660tcaccccacc agatcagttc cgtacacagc ctcaggctaa agggtgcatt ttccgaaagc 3720agaaggactg agagggagat tacattctgg agagttctca ctggccctcc caaaccccta 3780aaggcaagtg gaatgtaaaa acacgcggct agtaacgttt gcacgcgaca tttggaccaa 3840aaagccttga aaatgtagga ttctgaaaca ggatgctgtc ctcttgggca agaaaggcgc 3900ggtctcgcag accgaggacc acgcaatagg tcggaagcac gctgattcaa cacatctgag 3960ttacctacca gtggcgggaa ggcgcccagg acaagtaagt cccagcgtgg cccaggccca 4020cgtctgagaa tggaacctag gcccacaacc ctgagttcca aggcacgcca gacacctcga 4080cagtcgaacg tgggcccctg ctttgcgcag gacacattcg ccttggcgcc gaggaaggcc 4140cacgcggcca aggttgtctg tggattgact gtgcggcaca gcccgagcag caggcgcacg 4200gaccgctgcg agtgaacgac cacactgcgg cccgaagcac agccacccgc tcttcccctt 4260gctgcctccc gacaggtgga gcagctattc cgcgttacgg aggaaggcga accgcaggcg 4320agaggcatca gaacggcttt attgtggatt tagggcagca gccccgaggg cccgtgccat 4380ctagtgctcc

ccaggggtac gcgtggggcg tgggcaacaa aggagaacct gggtgagcac 4440ctggagagac accggcctgc gctaaagcag atccccgacg aacaaaaaac ggaaccgaga 4500ggcgtgtgca acatccaacc gtccgacttg caacagaggt gtacgggagg gtgggggtgg 4560gggtgggggt gggggcccgc aggggagttg ggtgcaagaa gaggccaggg gtaaaggggc 4620cgctcggtgt cccgtggccc gtcttgggga agagagccaa gtcctccagg ctgaccccgt 4680gatgctgggg tggttgccgc gccagggacg gcacgaagcc tgtgggatgc gggaggctgc 4740agtcccgcag cgtccggtgg aggcgggcgg gcgaagagca agacaggtgc gagtgccggg 4800cccctgcccg tcggtccgtg ggcaccgccg cccgccggcc ggccggccgg ccggccggcc 4860ggcccgcggt ccaacggcgg ggcagcagca ggccacgggg ccagaggcgg aaagcggctt 4920cggggctggc gcgcgcgagt ggtcagacgc cctgccccga cggggacacg gcttcctgca 4980gaacgaggtg cagcaggtgg ttctgctccg cactgagcag cggccacatg tccacctgca 5040gcgacttgac ggcctccgtg tccttctcgt gggtggccat gaccagggac tggagcagca 5100ggaacagctc gtcgggaagc tggccggcgc cgtcgggccc gtggccgtcg aaggcctccc 5160aggagtactt ctccagcgtg tgggcgtgct cgggcagcag cttggcgggc ggcggctgca 5220gcaggagcag cagcagcacg cgggacacct cgcagcggac cagcacgtcc gagaaggcgc 5280ccagcgtggc gggcgcgggc gtgggcgtgg gcgcgggcgc ggcaggcagc agggcggcgg 5340gcagggccgg ggcggcggcg gcggggccca gcggggcgga ggccgacgag gacgaggccg 5400aggccgacga ggtcgaggcc gaggccgacg aggtcgacgc cgccgccggc ggggccaggg 5460gcagcggcgg ggcgggcggc ggcggggccg ggggcggctg ccgcagcggg tggctgccgt 5520gctcccgcgc caggcgctgc atgcgcgtga agacggccag ggcgccgctg tagtcgcgcg 5580ccagcagctg gcaggaggcg gcgtcgccca gcgcctgcag cgcggccagg ggcagctgcg 5640gcaggtgcag ctgcgcggcg cgcaggaagt ggccggcggc ggcggccggc tggcccaggt 5700cgcgcagggc ggcggccagc tcgaggcaca ggccggcggc ggcggccggc tggcccagct 5760ccaggtgcag gcgcacggcg gcgcccagcg cgctggcggc ggcctgcagc ggctccccgt 5820aggccgcggg gcaggcgagg cgctggcggg cgtcgcgctc ctgccgcagg aagaggcgcg 5880cggcctcggt cagcgccagc gcctccccgg gcccgtggaa cagcgcctgc tggcagcgcg 5940ccacggccag ctggcaccac gcggcgtacg gcaggcactc ctgcgcgcgc agctcccggc 6000ccagctgcgc gaactgctcg cccgcctccg ccacgttcgg cttccgcagg aaccgcttac 6060gcagcttgct cgacacctgc cggtagcggg ccaggaagtc 
cccggcctcg ggccccgggc 6120ccgcgccgcc gccgccgccg ccgccgccgc cgcccgggcc gccgccgccg ccgccgcccg 6180ggccgcccgc cgccgccgcc gccatcttgc ccgcacgcgc gcacgcccga cgtgcccgcg 6240tcccccggcc ccgccccctg cgggccccgc ccccccgcgg accccgcgca tgcgtgcgcc 6300gccccccgcc gtcccgccgg acggaaccga gcgcgcgggc cggcgcgggg cctgggcggc 6360cgcggccctt ccgaggcgac cccggccccc gggtcggccc gcgccccccg gcccctcccg 6420gcccctgccg gcccccgagc taacgtcgcg gcgccggccc gcctggcccc gaggccgctt 6480ggccggagtc aggatggtcc ccgccccccc agccttccgt caaagccctg tcccctcgag 6540tccgcgccgg cacctgtgtc ccccaacagt ccgcgccggc agctgtgtcc ccccattcac 6600tccgcgccgg caggtgtgtc cccctccatc ccgcagccag gttgttactg caccggcgtt 6660caccccatca ccacgccgca ggcatggtcc tgagctccag ccccacacac accatcgtcc 6720acgggactgg cccgtgcggt gggggggggt cttctatgcc ccagcgtcac tcccagccca 6780gccaccccct gtcctgaccc cgagcagtcc ccgtccccgc accccagccg gaccccacca 6840ccacccctgc accccagccg gaccccgcca cagcccccgt ccctgcaccc caacaatgct 6900ccggaccgca tgaccgaacc ccgcaccaca cccacgaacc cgcgccccac gccacacccc 6960aaatcctcca tcacccccat cgtgcaccgc cgagggcacc ccaacccccc gtcatccccg 7020agccccgcac ccccacccca acctcgccca caaccccaaa ccccaacctc ccccccgccc 7080ccccccgccc tccccccgag gcgcgtcccg cacccagtga gccaggcgca cacgtctggt 7140tgtctctgcg ccttttattg cggggacgcg ggggtcaccg agcccccccg gcggctcccc 7200gggcggcggc gggcatcagg ggccgggggc ggctagtacc gggccggggc cacctgggac 7260acggtggtca tcgtgaacaa gccgctgagc agctcgtggt tgtgcagcgc ccggtccacg 7320agctccgggg tgatgtacgc ggtgcgcctg tgccgggcct cgtcgcccgc cagccccagc 7380acggtggccg tcaggaactg gatgacggcc gccaggaaga tgggggcgga cgcgcccagg 7440cgcttggcgt agcggccggc ccgcaggagg cgctccatct ggcacacgga gaaggccagc 7500cccgcgcggg cgctgcgcga gcggcacgac ctcgggcggc cggcccgccc tccacggctc 7560cccctgagcc gcgcgcgggg cctcggcggc gctcggcgcg gggcctcggg ctgcggcccg 7620gctgcggccc ggctggggac gctcgggcgt cgggccgcgg agccacgggc ctcgggctgc 7680ggagaccccg ggctgcggac gctcgggcgt cgggccgcgg agccacgggc ctcgggctgc 7740ggagaccccg ggctgcggac ggtcgggcgt cgggccgcgg ggacatgggg cttgggctgc 7800agagacatcg 
ggctgcggac agagagacac cggcctcggg ctgcggagag acggggcaag 7860ggctgcagac agacgggcct caggctgcag agagaccgac ctcgggctgc agagagaccg 7920aactcgggct gcagagtgtc tgacctcggg ctgcagagac cgacctcggg ctgcggagag 7980acgggcctcg ggctgcagaa agtccgacct cgggctgcag agaccgacct caggctgcgg 8040agagacgggc ctcgcgctgc agagagaccg acctcgggct gtagagagac tgacctcggg 8100ctgcagagac cgacctcggg ctgcagagag tccgacctcg ggctgcagag agtccgacct 8160cgggctgcag agagacgggc ctcgggctgc gctgccgaaa cagcgtcggg gcgcagagag 8220gagcgccggg gtgcaccgcc gtgcggcgcg ctgggccggc tgcacccgag ccctcagcag 8280cgggcgagga ggccccgctc cgtatccgag ggacacaccc cctccccgcc ccgctgcgca 8340cgcggtgaca cgcaggcctg atgaggtcac cgcgtcccca ttggcccggc ccggccctcg 8400cccgccagaa aagccgctgg cgggaagtcg ctggctctgc gccgcgcgga cggcatgggg 8460cgccaccgac gagcgtgcag gagctcgcgc gcccccacgt gcacccccga catgtcggcc 8520ctctcggctg cacacgcggc accgccccgc acagacggcc cggccgccgc gcgctcactg 8580ccccctgcac cccgtcctgc ccccggggac cgaccgctcc tcggcctcct gtccctgccg 8640cttggcgtcc gctggacacc tgctgcaggg gccaccctgg gaccagtagg tggccgtgtg 8700cgccggccgc gttccctcgg caccgtgttc tcaggagggc tgttctctga gggagcggga 8760accggggtcc ctccccccga gagagcagga atcgggcccc tccccctgag ggagcaggaa 8820tcggggcacc cccccccagg gagcaggaat aacggcctct tccccccccc cagggagcag 8880gaatcggggc ccctcccctc aagggagcag gagtcggggt ccctcccccc cgagagagca 8940ggaatcgggc ccctccccct gagggagcag gaatcggggt ccctcctccc gagggagcag 9000gaatcggggc accccccccc ccagggagca ggaataacgg cctcttcccc cccccccccc 9060cgggagcagg aatcggggcc cctcccctcg agggagcagg aatccgggac ccccaaggga 9120gcgggaattg gggtccctcc gcccgaggga gcaaaagacg gacctcgggc tgcagaaaga 9180cggacctcgg gctgcggaga gaccgacctc gggctgcgga gagaccaacc tcgggctgcg 9240cttccgaaag acggcgtcgg ggcgcagaga ggagcgccgg ggtgcaccgc cgtgcggcgc 9300gctgggccgg ctgctcccga gctctgagca gcgggcgagg aggccccgct ccgtataaaa 9360gcgacacccc ctcccctccc cgctgcgcac gcggtgacac gcagatctga tgaggtcacc 9420gcgtcctcat tggcccggcc ctcgcccgcc agaaaaggcg ggaagtcgct ggctctgcgc 9480cgcgcggacg gcatggggcg ccacccacga gcgtgcagga 
gctcgcgcgc ccccacgtgc 9540acccccgaca tgtcggccct ctcggctgca cacgcggcag cgccccgcac agacggcccg 9600gccgccgcgc gctcactgcc ccctgcaccc cgtcctgccc ccggggaccg accgctcctc 9660ggcctcctgt ccctgccgct tggcgtccgc tggacacctg ctgcaggggc caccctggga 9720ccagtaggtg gccgtgtgcg ccggccgcgt tccctcggca ccgtgttctc aggagggctg 9780ttctccgagg gagcgggaac ccgggtcccc ccccccaagg gagcaggaat tggggtccct 9840ccccccgagc gatcaggaag cggggtccct ccaccaaggg acaggagtcg gggtccctcc 9900ccccgaggga tcaggaatcg gggtccctcc accaagggat caggagttgg ggtccctccc 9960ccaagggaca ggagtcaggg tccctccccc aagggacagg agtcggggtc cctccgccaa 10020gggacaggag tcggggtccg tccccccgag ggagcaggaa taggcccccc gaggcagcgg 10080ggatcgccct gcacgtccat cgaggggcac tcgcccccac tgcgcgcccc ccctggcggc 10140gaggggcacc tgcgggcgga cgtgcgcggc ggcggcggcg gcgaaggtcg cgcggggccc 10200ctccgggcgc gggatggggg gcccgacggc agggcgacac ccgctgtctg ggcagcgcgc 10260tgacccggcc ccctgctccc gccgcgccgc cccactggcc ctggcccgcg cctgctcctc 10320ccggtgcggc ctcgcgagcc cccgcgccgg ctgtgccgcg gcacctggca cctgggggtc 10380actgtccccc gtgtgtaggg aggggcaggg cggggcccga gggacaggga gctcgggcag 10440ctgcagcccg ctgacccggg cccccgtgga gcctgcgggc tccccccgcg cccccagggc 10500tgccgacccg agcccgggct gcaggcgggc gcccagctga tcccccccgc cccccccccc 10560cccgggctgg gcctgtcgcg cccccgggtc ccgagcgccg ccccgcggtg ccagcagcgg 10620cgggtcggcg cggcgggagc gctgcaggtg cgccgggacc gggcggggcc ctccctctgg 10680gtgcccctcc aggcggcccc tgcactcggg ctgcgcaggg cggggcgggg gagcttcccg 10740gagggggtgc ggcctgtctg tcgcctgggc gcgactcggc gatgggacac gttcaggtcc 10800tgacaccggg gggggggggg ggagcggggg ggcttcccgc gtattcgggg cctcccgagt 10860gactttcagc aatgttccgt gactttccgt gcacacgccc tgcacgtcct ccgctacgtg 10920tattcctagg gatgtaattg cacgtgatcc cgatttgaat aaaattattc aatcagttag 10980ttaactgatc aattaatcag tttcgtttga ggttcgtcgc cgctgccgcg tggaaaaccc 11040cctaatttct gcacgttggt ctcatatcct gaagtttgtt gaattcacta actctgccgg 11100tgtttcgtga attctttagg actttttctg tgtaaggtta tgtcatctga gaacacagat 11160ggttttcctt cttcctttcc aatttaaata ccctttattt ctttctcttg 
catcattgct 11220ctagccagga tttccattac aatgtcggtt agaggcaggg aaagcgcgga ttcttgttcc 11280tgattagggg aaaagctttc agtcctccac cagtgagtat gaccttagct atgggtattt 11340cataaatgcc ctttattatg tttggtgatt ccccttctat tcctagtttg ctgagtgttt 11400tttgtcagga aaaggtggat tttagtcaat gctttttctg catgaatcaa gacagtcatg 11460tgggtttttc cccctttatt ttattaacgt agagttattt tcttaagttg aagcatcttt 11520gtattcctgg gacagttcct ttttggacat gaaatgtcac ttttataatg tactgctgga 11580ttccgtctgc taaaatcatt tgaggatttc tgcacctata ttctttttta aagatttttt 11640aggtactatt tgagagacca tgaatgatca gggggcggag ggagaggatg aagcagactc 11700cccactgagc agggagcctg acgcgggcct cgatctcccg acccgggatc atgacctgag 11760ctgaaagcgg atgcttcacc gactgagcca ccaggcaccc gggcaaaaca atttcttaca 11820acattctgca cgatactgta atgctgatgt gtcatcatat aacacacact gattcctatt 11880ctagtgtatt caagcataca caaagcctag gagatcattt taaactttcc gtagcctgta 11940cgccatggtt taaaccgagg ccttctgagt aggtgttcct tttttattta aagattttat 12000ttatttattc atgagagaca cagagagaga gagagaggca gagacccagg cagagggaga 12060agcaggctcc atgcagggag cccgacgcgg ggctcgatcc caggtctcca ggttcatgcc 12120ctgggccgaa ggcaggtgcc aagccactga gccccccagg gatcccctga ctaggtgttc 12180ctatcacatt tctcaaactg tgttcccttt cctttgaaga tgccgtgtac tttctctgca 12240ccctagactg ctcaaggtcc gaacccccac atgttggatg ttaacacgtg tcttacaaat 12300ccatacacaa ggaatcatta attaaagcct cacagttcat gcacatgtgc acacacacac 12360acacacagag agagaccaca gtcttggaag attatcctga ggccagggtg gtagggtggt 12420gcctgccagc accctctcag atgtggaaca gggcccgaca tacaggactt ctagctacga 12480cggttgtatg tgagcgctgc gtgctgtcga gagaagcaca aagcaaaatt agagggaaga 12540tgcaatgggg agcaattgct ttgtaccctg tctgcacctg gcatgtacct gtgctaatcc 12600ctccccacag gtcttctttg gcaacgtgga ttcatctggg atcaaacaca atatttttaa 12660ccctccgatt attgctcagt acatccgttt gcacccaacc cattacagca tccgcagcac 12720tcttcgcatg gagctcttgg gctgtgactt caacagtaag tgcccagtca tcacgtgccc 12780ttccgtgtcc cagccccggg tgggatgaat gactgtccta gtcttctcga gggcagggcg 12840atgtcccagg acacagaacc acgaatgcta agagcagcgc agtcccgagc aaacgcaggc 
12900
cttggtcatt gtaaccatgg gattccctag gggcagccac ctcctccggc actcttaagg 12960
tcaaagtgcc cccgaactga gaagagctga ccagaaggcg cggggcag 13008

SEQ ID NO: 5 (length 1116, DNA, Homo sapiens)
atggcggcag cggctgcagg cctgggcggc ggcggcgccg gcccgggacc cgaggccggg 60
gacttcctgg cccgctaccg gctggtatcg aacaagctga agaagcggtt cctgcggaag 120
ccgaacgtgg cggaggccgg cgagcagttc ggacagctgg gccgggagct gcgcgcccag 180
gagtgtctgc cctacgcggc ctggtgccag ctggcggtgg cgcgctgcca gcaggcgctc 240
ttccacgggc ccggggaggc gctggccctc accgaggccg cccgcctctt cctgcggcag 300
gagcgcgacg cgcgccagcg cctggtctgc cccgccgcct acggggagcc gctgcaggcc 360
gccgccagcg ccctgggcgc cgcggtgcgt ctgcacctcg agctgggcca gccggccgcc 420
gccgccgccc tctgcctcga gctggccgcc gccctgcgcg acctgggcca gccggccgcc 480
gccgccggtc acttccagcg cgccgcccag ctccagctgc cccagctgcc cctggccgcg 540
ctgcaggcgc ttggcgaggc cgcctcctgc cagctgctgg cgcgcgacta caccggcgcc 600
ctggcggtct tcacgcgcat gcagcgcctg gcgcgggagc acggcagcca cccggtgcag 660
tcactgccgc cgcccccgcc gccggcaccc cagcccgggc ccggggcgac gcccgcccta 720
ccggccgcgc tgcttcctcc gaactccggc tcggcggcgc cctctcccgc cgccctgggc 780
gccttctcgg acgtgctggt ccgctgcgag gtgtcccgcg tgctgctgct gctcctcctg 840
caaccaccgc ccgccaagct gctgccggag cacgcccaga ccctggagaa gtactcctgg 900
gaggcttttg acagccacgg gcaggagagc agcggccagc ttcccgagga gctctttctg 960
ctgctccagt ctttggtcat ggctacccac gaaaaggaca cggaagccat caagtcgctg 1020
caggtggaga tgtggccact gttgactgct gagcagaacc acctccttca cctcgttctg 1080
caagaaacca tctccccctc aggacaggga gtctga 1116

SEQ ID NO: 6 (length 1349, DNA, Canis lupus familiaris)
atgcgcgggg tccgcggggg ggcggggccc gcagggggcg gggccggggg acgcgggcac 60
gtcgggcgtg cgcgcgtgcg ggcaagatgg cggcggcggc ggcgggcggc ccgggcggcg 120
gcggcggcgg cggcccgggc ggcggcggcg gcggcggcgg cggcggcgcg ggcccggggc 180
ccgaggccgg ggacttcctg gcccgctacc ggcaggtgtc gagcaagctg cgtaagcggt 240
tcctgcggaa gccgaacgtg gcggaggcgg gcgagcagtt cgcgcagctg ggccgggagc 300
tgcgcgcgca ggagtgcctg ccgtacgccg cgtggtgcca gctggccgtg gcgcgctgcc 360
agcaggcgct gttccacggg cccggggagg cgctggcgct gaccgaggcc gcgcgcctct 420
tcctgcggca ggagcgcgac gcccgccagc
gcctcgcctg ccccgcggcc tacggggagc 480
cgctgcaggc cgccgccagc gcgctgggcg ccgccgtgcg cctgcacctg gagctgggcc 540
agccggccgc cgccgccggc ctgtgcctcg agctggccgc cgccctgcgc gacctgggcc 600
agccggccgc cgccgccggc cacttcctgc gcgccgcgca gctgcacctg ccgcagctgc 660
ccctggccgc gctgcaggcg ctgggcgacg ccgcctcctg ccagctgctg gcgcgcgact 720
acagcggcgc cctggccgtc ttcacgcgca tgcagcgcct ggcgcgggag cacggcagcc 780
acccgctgcg gcagccgccc ccggccccgc cgccgcccgc cccgccgctg cccctggccc 840
cgccggcggc ggcgtcgacc tcgtcggcct cggcctcgac ctcgtcggcc tcggcctcgt 900
cctcgtcggc ctccgccccg ctgggccccg ccgccgccgc cccggccctg cccgccgccc 960
tgctgcctgc cgcgcccgcg cccacgccca cgcccgcgcc cgccacgctg ggcgccttct 1020
cggacgtgct ggtccgctgc gaggtgtccc gcgtgctgct gctgctcctg ctgcagccgc 1080
cgcccgccaa gctgctgccc gagcacgccc acacgctgga gaagtactcc tgggaggcct 1140
tcgacggcca cgggcccgac ggcgccggcc agcttcccga cgagctgttc ctgctgctcc 1200
agtccctggt catggccacc cacgagaagg acacggaggc cgtcaagtcg ctgcaggtgg 1260
acatgtggcc gctgctcagt gcggagcaga accacctgct gcacctcgtt ctgcaggaag 1320
ccgtgtcccc gtcggggcag ggcgtctga 1349

SEQ ID NO: 7 (length 22, DNA, Homo sapiens; misc_feature (22)..(22), n = c or g)
ctcccaggag tacttctcca gn 22

SEQ ID NO: 8 (length 22, DNA, Homo sapiens; misc_feature (22)..(22), n = c or g)
gagggtcctc atgaagaggt cn 22

SEQ ID NO: 9 (length 22, DNA, Homo sapiens)
ctcccaggag tacttctcca gg 22

SEQ ID NO: 10 (length 22, DNA, Homo sapiens)
gagggtcctc atgaagaggt cc 22

SEQ ID NO: 11 (length 22, DNA, Canis lupus familiaris)
ctcccaggag tacttctcca gc 22

SEQ ID NO: 12 (length 22, DNA, Canis lupus familiaris)
gagggtcctc atgaagaggt cg 22

SEQ ID NO: 13 (length 354, PRT, Artificial sequence; Synthesized)
Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe
1 5 10 15
Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro His Gln Ser
20 25 30
Val Lys Phe Lys His Glu Leu Arg Leu Gly Phe Ser Val Gly Gln Lys
35 40 45
Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val
50 55 60
Gly Tyr Val Arg Asp Lys Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln
65 70 75 80
Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys
85 90 95
Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu
100 105 110
Pro Ser Ala Lys Glu Ser Pro Asp
Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Pro Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ala Asp Gly Ser Ile Tyr Ala Ala Ile Gln Pro Glu Gln Ser Ser 210 215 220Lys Phe Lys His Thr Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Arg Asp Ala Gly Ser Val Ser His Tyr Val Leu Ser Glu Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro14354PRTArtificial sequenceSynthesized 14Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro Arg Gln Asp 20 25 30Leu Lys Phe Lys His Glu Leu Arg Leu Gly Phe Ser Val Gly Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Lys Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln65 70 75 80Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Pro Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser
Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ser Asp Gly Ser Ile Tyr Ala Cys Ile Glu Pro Gln Gln Arg Met 210 215 220Lys Phe Lys His Ala Leu Arg Leu Ser Phe Asn Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Arg Asp Asn Gly Ser Val Ser His Tyr Ser Leu Ser Glu Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro15354PRTArtificial sequenceSynthesized 15Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro Ala Gln Asp 20 25 30Ser Lys Phe Lys His Asn Leu Arg Leu Gly Phe Ser Val Gly Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Lys Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln65 70 75 80Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Pro Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ala Asp Gly Ser Ile Tyr Ala Thr Ile Gln Pro Glu Gln Ser Gly 210 215 220Lys Phe Lys His Thr Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Phe 
Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Arg Asp Ala Gly Ser Val Ser His Tyr Val Leu Ser Glu Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro16354PRTArtificial sequenceSynthesized 16Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro Ala Gln Asp 20 25 30Ser Lys Phe Lys His Asn Leu Arg Leu Gly Phe Ser Val Gly Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Lys Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln65 70 75 80Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Pro Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ala Asp Gly Ser Ile Tyr Ala Thr Ile Glu Pro Glu Gln Asp Val 210 215 220Lys Phe Lys His Thr Leu Arg Leu Val Phe Asn Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Leu Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Asn Asp Lys Gly Ser Val Ser Ser Tyr Thr Leu Ser Lys Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp 
Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro17354PRTArtificial sequenceSynthesized 17Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro Ala Gln Asp 20 25 30Ser Lys Phe Lys His Asn Leu Arg Leu Gly Phe Ser Val Gly Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Lys Gly Ser Val Ser Glu Tyr Leu Leu Ser Glu65 70 75 80Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Pro Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ser Asp Gly Ser Ile Tyr Ala Ser Ile Glu Pro Asn Gln Ser Pro 210 215 220Lys Phe Lys His Tyr Leu Arg Leu Ala Phe Asn Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Arg Asp Gln Gly Ser Val Ser His Tyr Thr Leu Ser Gln Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro18354PRTArtificial sequenceSynthesized 18Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly 
Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro Ala Gln Asp 20 25 30Ser Lys Phe Lys His Asn Leu Arg Leu Gly Phe Ser Val Gly Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Lys Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln65 70 75 80Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Pro Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ser Asp Gly Ser Ile Tyr Ala Cys Ile Glu Pro Gln Gln Arg Val 210 215 220Lys Phe Lys His Ala Leu Arg Leu Ser Phe Asn Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Arg Asp Asn Gly Ser Val Ser His Tyr Gln Leu Ser Glu Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro19354PRTArtificial sequenceSynthesized 19Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro His Gln Asp 20 25 30Leu Lys Phe Lys His Glu Leu Arg Leu Gly Phe Glu Val Ser Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Arg Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln65 70 75 80Ile 
Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Pro Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ala Asp Gly Ser Ile Tyr Ala Ala Ile Gln Pro Asp Gln Glu Val 210 215 220Lys Phe Lys His Val Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Arg Asp Ala Ser Ser Val Ser His Tyr Val Leu Ser Glu Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro20354PRTArtificial sequenceSynthesized 20Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro His Gln Asp 20 25 30Leu Lys Phe Lys His Glu Leu Arg Leu Gly Phe Glu Val Ser Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Arg Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln65 70 75 80Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val 
Arg Ala Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Pro Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ala Asp Gly Ser Ile Tyr Ala Ala Ile Gln Pro Asp Gln Glu Val 210 215 220Lys Phe Lys His Val Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Arg Asp Ala Gly Ser Val Ser His Tyr Val Leu Ser Glu Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro21354PRTArtificial sequenceSynthesized 21Met Asn Thr Lys Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe1 5 10 15Val Asp Gly Asp Gly Ser Ile Tyr Ala Ser Ile Lys Pro His Gln Asp 20 25 30Leu Lys Phe Lys His Glu Leu Arg Leu Gly Phe Glu Val Gly Gln Lys 35 40 45Thr Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val 50 55 60Gly Tyr Val Arg Asp Arg Gly Ser Val Ser His Tyr Leu Leu Ser Gln65 70 75 80Ile Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys 85 90 95Leu Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu 100 105 110Pro Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 115 120 125Val Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr 130 135 140Ser Glu Thr Val Arg Ala
Val Leu Asp Ser Leu Pro Gly Ser Val Gly145 150 155 160Gly Leu Ser Ser Ser Gln Ala Ser Ser Ala Ala Ser Ser Ala Ser Ser 165 170 175Ser Pro Gly Ser Gly Ile Ser Glu Ala Leu Arg Ala Gly Ala Gly Ser 180 185 190Gly Thr Gly Tyr Asn Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val 195 200 205Asp Ala Asp Gly Ser Ile Tyr Ala Ala Ile Gln Pro Asp Gln Lys Val 210 215 220Lys Phe Lys His Val Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr225 230 235 240Gln Arg Arg Trp Phe Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly 245 250 255Tyr Val Arg Asp Ala Tyr Pro Cys Ala His Tyr Val Leu Ser Glu Ile 260 265 270Lys Pro Leu His Asn Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 275 280 285Lys Gln Lys Gln Ala Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro 290 295 300Ser Ala Lys Glu Ser Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val305 310 315 320Asp Gln Ile Ala Ala Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser 325 330 335Glu Thr Val Arg Ala Val Leu Asp Ser Leu Ser Glu Lys Lys Lys Ser 340 345 350Ser Pro22147PRTArtificial sequenceSynthesized 22Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro His Gln Ser Val Lys Phe Lys His Glu 20 25 30Leu Arg Leu Gly Phe Ser Val Gly Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Lys 50 55 60Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14523147PRTArtificial sequenceSynthesized 23Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro Arg Gln Asp Leu Lys Phe Lys His Glu 20 25 30Leu Arg Leu Gly Phe Ser Val Gly Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp 
Lys 50 55 60Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14524147PRTArtificial sequenceSynthesized 24Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro Ala Gln Asp Ser Lys Phe Lys His Asn 20 25 30Leu Arg Leu Gly Phe Ser Val Gly Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Lys 50 55 60Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14525147PRTArtificial sequenceSynthesized 25Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro Ala Gln Asp Ser Lys Phe Lys His Asn 20 25 30Leu Arg Leu Gly Phe Ser Val Gly Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Lys 50 55 60Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14526147PRTArtificial sequenceSynthesized 26Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro Ala Gln Asp Ser Lys Phe Lys His 
Asn 20 25 30Leu Arg Leu Gly Phe Ser Val Gly Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Lys 50 55 60Gly Ser Val Ser Glu Tyr Leu Leu Ser Glu Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14527147PRTArtificial sequenceSynthesized 27Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro Ala Gln Asp Ser Lys Phe Lys His Asn 20 25 30Leu Arg Leu Gly Phe Ser Val Gly Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Lys 50 55 60Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14528147PRTArtificial sequenceSynthesized 28Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro His Gln Asp Leu Lys Phe Lys His Glu 20 25 30Leu Arg Leu Gly Phe Glu Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Arg 50 55 60Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14529147PRTArtificial 
sequenceSynthesized 29Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro His Gln Asp Leu Lys Phe Lys His Glu 20 25 30Leu Arg Leu Gly Phe Glu Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Arg 50 55 60Gly Ser Val Ser Glu Tyr Leu Leu Ser Gln Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14530147PRTArtificial sequenceSynthesized 30Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Gly Asp Gly Ser1 5 10 15Ile Tyr Ala Ser Ile Lys Pro His Gln Asp Leu Lys Phe Lys His Glu 20 25 30Leu Arg Leu Gly Phe Glu Val Gly Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Arg 50 55 60Gly Ser Val Ser His Tyr Leu Leu Ser Gln Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14531147PRTArtificial sequenceSynthesized 31Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ala Asp Gly Ser1 5 10 15Ile Tyr Ala Ala Ile Gln Pro Glu Gln Ser Ser Lys Phe Lys His Thr 20 25 30Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Ala 50 55 60Gly Ser Val Ser His Tyr Val Leu Ser Glu Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp 
Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14532147PRTArtificial sequenceSynthesized 32Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ser Asp Gly Ser1 5 10 15Ile Tyr Ala Cys Ile Glu Pro Gln Gln Arg Met Lys Phe Lys His Ala 20 25 30Leu Arg Leu Ser Phe Asn Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Asn 50 55 60Gly Ser Val Ser His Tyr Ser Leu Ser Glu Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14533147PRTArtificial sequenceSynthesized 33Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ala Asp Gly Ser1 5 10 15Ile Tyr Ala Thr Ile Gln Pro Glu Gln Ser Gly Lys Phe Lys His Thr 20 25 30Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Ala 50 55 60Gly Ser Val Ser His Tyr Val Leu Ser Glu Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140Val Leu Asp14534147PRTArtificial sequenceSynthesized 34Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ala Asp Gly Ser1 5 10 15Ile Tyr Ala Thr Ile Glu Pro Glu Gln Asp Val Lys Phe Lys His Thr 20 25 30Leu Arg Leu Val Phe Asn Val Ser Gln Lys Thr Gln Arg Arg Trp Leu 35 40 45Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Asn Asp Lys 50 55 60Gly Ser Val Ser Ser Tyr Thr Leu Ser Lys Ile Lys Pro Leu His Asn65 70 75 80Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu 
Lys Gln Lys Gln Ala 85 90 95
Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110
Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125
Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140
Val Leu Asp 145

SEQ ID NO: 35; length: 147; type: PRT; organism: Artificial sequence; other information: Synthesized
Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ser Asp Gly Ser 1 5 10 15
Ile Tyr Ala Ser Ile Glu Pro Asn Gln Ser Pro Lys Phe Lys His Tyr 20 25 30
Leu Arg Leu Ala Phe Asn Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45
Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Gln 50 55 60
Gly Ser Val Ser His Tyr Thr Leu Ser Gln Ile Lys Pro Leu His Asn 65 70 75 80
Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95
Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110
Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125
Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140
Val Leu Asp 145

SEQ ID NO: 36; length: 147; type: PRT; organism: Artificial sequence; other information: Synthesized
Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ser Asp Gly Ser 1 5 10 15
Ile Tyr Ala Cys Ile Glu Pro Gln Gln Arg Val Lys Phe Lys His Ala 20 25 30
Leu Arg Leu Ser Phe Asn Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45
Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Asn 50 55 60
Gly Ser Val Ser His Tyr Gln Leu Ser Glu Ile Lys Pro Leu His Asn 65 70 75 80
Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95
Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110
Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125
Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140
Val Leu Asp 145

SEQ ID NO: 37; length: 147; type: PRT; organism: Artificial sequence; other information: Synthesized
Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ala Asp Gly Ser 1 5 10 15
Ile Tyr Ala Ala Ile Gln Pro Asp Gln Glu Val Lys Phe Lys His Val 20 25 30
Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45
Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Ala 50 55 60
Ser Ser Val Ser His Tyr Val Leu Ser Glu Ile Lys Pro Leu His Asn 65 70 75 80
Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95
Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110
Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125
Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140
Val Leu Asp 145

SEQ ID NO: 38; length: 147; type: PRT; organism: Artificial sequence; other information: Synthesized
Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ala Asp Gly Ser 1 5 10 15
Ile Tyr Ala Ala Ile Gln Pro Asp Gln Glu Val Lys Phe Lys His Val 20 25 30
Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45
Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Ala 50 55 60
Gly Ser Val Ser His Tyr Val Leu Ser Glu Ile Lys Pro Leu His Asn 65 70 75 80
Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95
Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110
Pro Asp Lys Phe Leu Glu Val Cys Thr Trp Val Asp Gln Ile Ala Ala 115 120 125
Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140
Val Leu Asp 145

SEQ ID NO: 39; length: 147; type: PRT; organism: Artificial sequence; other information: Synthesized
Lys Glu Phe Leu Leu Tyr Leu Ala Gly Phe Val Asp Ala Asp Gly Ser 1 5 10 15
Ile Tyr Ala Ala Ile Gln Pro Asp Gln Lys Val Lys Phe Lys His Val 20 25 30
Leu Arg Leu Cys Phe Asp Val Ser Gln Lys Thr Gln Arg Arg Trp Phe 35 40 45
Leu Asp Lys Leu Val Asp Glu Ile Gly Val Gly Tyr Val Arg Asp Ala 50 55 60
Tyr Pro Cys Ala His Tyr Val Leu Ser Glu Ile Lys Pro Leu His Asn 65 70 75 80
Phe Leu Thr Gln Leu Gln Pro Phe Leu Lys Leu Lys Gln Lys Gln Ala 85 90 95
Asn Leu Val Leu Lys Ile Ile Glu Gln Leu Pro Ser Ala Lys Glu Ser 100 105 110
Pro Asp Lys Phe Leu Glu Val Cys Thr Trp
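Val Asp Gln Ile Ala Ala 115 120 125
Leu Asn Asp Ser Lys Thr Arg Lys Thr Thr Ser Glu Thr Val Arg Ala 130 135 140
Val Leu Asp 145

SEQ ID NO: 40; length: 38; type: DNA; organism: Artificial sequence; other information: Synthesized
cccttacagt tattaactac tctcatgagg ttcattcc 38

SEQ ID NO: 41; length: 39; type: DNA; organism: Artificial sequence; other information: Synthesized
ggccctacaa ccattctgcc tttcactttc agtgcaata 39

SEQ ID NO: 42; length: 36; type: DNA; organism: Artificial sequence; other information: Synthesized
cacaaggggg aagagtgtga gggtgtggga taagaa 36

SEQ ID NO: 43; length: 18; type: DNA; organism: Artificial sequence; other information: Synthesized
gagccagttg tgtaccat 18

SEQ ID NO: 44; length: 20; type: DNA; organism: Artificial sequence; other information: Synthesized
acgtgatccc gatttgaata 20

SEQ ID NO: 45; length: 18; type: DNA; organism: Artificial sequence; other information: Synthesized
gcaccttact gtcctgat 18

SEQ ID NO: 46; length: 22; type: DNA; organism: Homo sapiens
ttcccagtag tacttctcca gt 22

SEQ ID NO: 47; length: 17; type: DNA; organism: Artificial sequence; other information: Synthesized
tctggatggc aacaagt 17

SEQ ID NO: 48; length: 9; type: DNA; organism: Artificial sequence; other information: Synthesized
tgccaaaga 9

SEQ ID NO: 49; length: 16; type: DNA; organism: Artificial sequence; other information: Synthesized
agaccattaa ggtccc 16

SEQ ID NO: 50; length: 19; type: DNA; organism: Artificial sequence; other information: Synthesized
tgatcccaga tgaatccac 19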

* * * * *

US20200299658A1 – US 20200299658 A1