CRISPR/Cas9-based Repressors for Silencing Gene Targets In Vivo and Methods of Use
Gersbach; Charles A.; et al. [Duke University]

Patent Application Summary

U.S. patent application number 16/093272 was published on 2019-05-02 as publication number 20190127713 for CRISPR/Cas9-based repressors for silencing gene targets in vivo and methods of use. The applicant listed for this patent application is Duke University. The invention is credited to Charles A. Gersbach and Pratiksha I. Thakore.

Publication Number	20190127713
Application Number	16/093272
Family ID	60041921
Publication Date	2019-05-02

United States Patent Application	20190127713
Kind Code	A1
Gersbach; Charles A. ; et al.	May 2, 2019

CRISPR/CAS9-BASED REPRESSORS FOR SILENCING GENE TARGETS IN VIVO AND METHODS OF USE

Abstract

The present disclosure provides CRISPR/Cas9-based repressors for silencing gene targets in vivo and methods of use.

Inventors:

Gersbach; Charles A.; (Durham, NC) ; Thakore; Pratiksha I.; (Durham, NC)

Applicant:

Name	City	State	Country	Type
Duke University	Durham	NC	US

Family ID:

60041921

Appl. No.:

16/093272

Filed:

April 13, 2017

PCT Filed:

April 13, 2017

PCT NO:

PCT/US17/27490

371 Date:

October 12, 2018

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
62321947	Apr 13, 2016
62369248	Aug 1, 2016

Current U.S. Class:	1/1
Current CPC Class:	A61K 48/00 20130101; C12N 2800/80 20130101; C07K 2319/80 20130101; C12N 15/113 20130101; C07K 14/4703 20130101; C12N 15/11 20130101; C12N 2320/32 20130101; C12N 15/63 20130101; A61P 9/00 20180101; C12N 9/22 20130101; C07K 2319/09 20130101; C12N 2750/14143 20130101; A61K 9/0019 20130101; C12N 2310/20 20170501; C12N 7/00 20130101; C07K 2319/71 20130101
International Class:	C12N 9/22 20060101 C12N009/22; C12N 15/11 20060101 C12N015/11; C12N 7/00 20060101 C12N007/00; C07K 14/47 20060101 C07K014/47; A61K 9/00 20060101 A61K009/00; A61P 9/00 20060101 A61P009/00

Government Interests

STATEMENT OF GOVERNMENT INTEREST

[0002] This invention was made with Government support under Federal Grant Nos. 1 R01 DA036865 and 1 DP2 OD008586 awarded by the NIH. The Government has certain rights to this invention.

Claims

1. A method of modulating expression of a gene, in vivo, in a subject comprising administering to, or providing in, the subject: (a) (i) a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; or (ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and (b) (i) a gRNA which targets the fusion molecule to the gene; or (ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, in an amount sufficient to modulate expression of the gene.

2. The method of claim 1, comprising administering to, or providing in, the subject any of: (a)(ii) and (b)(ii), (a)(i) and (b)(i), (a)(i) and (b)(ii), or (a)(ii) and (b)(i).

3. The method of claim 1, comprising administering to, or providing in, the subject: (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene.

4. The method of claim 1, wherein the nucleic acid of (a)(ii) comprises DNA.

5. The method of claim 1, wherein the nucleic acid of (b)(ii) comprises DNA.

6. The method of claim 1, wherein the nucleic acid of (a)(ii) comprises RNA.

7. The method of claim 1, wherein the nucleic acid of (b)(ii) comprises RNA.

8. The method of claim 1, wherein one or both of (a) and (b) are packaged in a viral vector.

9. The method of claim 1, wherein (a) is packaged in a viral vector.

10. The method of claim 1, wherein (b) is packaged in a viral vector.

11. The method of claim 1, wherein (a) and (b) are packaged in the same viral vector.

12. The method of claim 8, wherein the viral vector comprises an AAV vector.

13. The method of claim 8, wherein the viral vector comprises a lentiviral vector.

14. The method of claim 1, wherein (a) is packaged in a first viral vector and (b) is packaged in a second viral vector.

15. The method of claim 14, wherein the first viral vector comprises an AAV vector and the second viral vector comprises an AAV vector.

16. The method of claim 1, wherein the dCas9 molecule comprises a gRNA binding domain of a Cas9 molecule.

17. The method of claim 1, wherein the dCas9 molecule comprises one, two or all of: a Rec1 domain, a bridge helix domain, or a PAM interacting domain, of a Cas9 molecule.

18. The method of claim 1, wherein the dCas9 molecule is a mutant of a wild-type Cas9 molecule, e.g., in which the Cas9 nuclease activity is inactivated.

19. The method of claim 1, wherein the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a DNA-cleavage domain of a Cas9 molecule.

20. The method of claim 1, wherein the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a RuvC domain and/or a mutation in a HNH domain.

21. The method of claim 1, wherein the dCas9 molecule comprises a Staphylococcus aureus dCas9 molecule, a Streptococcus pyogenes dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (e.g., strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9 molecule, a Campylobacter lari (e.g., strain CF89-12) dCas9 molecule, or a Streptococcus thermophilus (e.g., strain LMD-9) dCas9 molecule.

22. The method of claim 1, wherein the dCas9 molecule comprises an S. aureus dCas9 molecule, e.g., comprising an S. aureus dCas9 sequence described herein.

23. The method of claim 1, wherein the S. aureus dCas9 molecule comprises a mutation at an amino acid position, corresponding to position 10, 580, or both (e.g., D10A, N580A, or both), relative to a wild-type S. aureus dCas9 molecule, numbered according to SEQ ID NO: 25.

24. The method of claim 1, wherein the S. aureus dCas9 molecule comprises the amino acid sequence of SEQ ID NO: 35 or 36, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 35 or 36, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 35 or 36, or any fragment thereof.

25. The method of claim 1, wherein the dCas9 molecule comprises an S. pyogenes dCas9 molecule, e.g., comprising an S. pyogenes dCas9 sequence described herein.

26. The method of claim 1, wherein the S. pyogenes dCas9 molecule comprises a mutation at an amino acid position, corresponding to position 10, 840, or both (e.g., D10A, H840A, or both), relative to a wild-type S. pyogenes dCas9 molecule, numbered according to SEQ ID NO: 24.

27. The method of claim 1, wherein the dCas9 molecule is less than 1400, 1300, 1200, 1100, 1000, 900, 800, 700, 600, or 500 amino acids in length.

28. The method of claim 1, wherein the dCas9 molecule is 500-1300, 600-1200, 700-1100, 800-1000, 500-1200, 500-1000, 500-800, 500-600, 1000-1200, 800-1200, or 600-1200 amino acids in length.

29. The method of claim 1, wherein the dCas9 molecule has a size that is less than 90%, 80%, 70%, 60%, 50%, 40%, or 30% of the size of a wild-type Cas9 molecule, e.g., a wild-type S. pyogenes Cas9 molecule or a wild-type S. aureus dCas9 molecule.

30. The method of claim 1, wherein the modulator of gene expression comprises a modulator of gene expression described herein.

31. The method of claim 1, wherein the modulator of gene expression comprises a repressor of gene expression, e.g., a Kruppel associated box (KRAB) molecule, an mSin3 interaction domain (SID) molecule, four concatenated mSin3 interaction domains (SID4X), MAX-interacting protein 1 (MXI1), or any fragment thereof.

32. The method of claim 1, wherein the modulator of gene expression comprises a Kruppel associated box (KRAB) molecule comprising the sequence of SEQ ID NO: 34, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, or any fragment thereof.

33. The method of claim 1, wherein the modulator of gene expression comprises an activator of gene expression, e.g., a VP16 transcription activation domain, a VP64 transcriptional activation domain, a p65 activation domain, an Epstein-Barr virus R transactivator Rta molecule, a VP64-p65-Rta fusion (VPR), Ldb1 self-association domain, or any fragment thereof.

34. The method of claim 1, wherein the modulator of gene expression comprises a modulator of epigenetic modification, e.g., a histone acetyltransferase (e.g., p300 catalytic domain), a histone deacetylase, a histone methyltransferase (e.g., SUV39H1 or G9a (EHMT2)), a histone demethylase (e.g., Lys-specific histone demethylase 1 (LSD1)), a DNA methyltransferase (e.g., DNMT3a or DNMT3a-DNMT3L), a DNA demethylase (e.g., TET1 catalytic domain or TDG), or fragment thereof.

35. The method of claim 1, wherein the modulator of gene expression is fused to the C-terminus, N-terminus, or both, of the dCas9 molecule.

36. The method of claim 1, wherein the modulator of gene expression is fused to the dCas9 molecule directly.

37. The method of claim 1, wherein the modulator of gene expression is fused to the dCas9 molecule indirectly, e.g., via a non-modulator or a linker, or a second modulator.

38. The method of claim 1, wherein a plurality of modulators of gene expression, e.g., two or more identical, substantially identical, or different modulators, are fused to the dCas9 molecule.

39. The method of claim 1, wherein the fusion molecule further comprises a nuclear localization sequence.

40. The method of claim 39, wherein one or more nuclear localization sequences are fused to the C-terminus, N-terminus, or both, of the dCas9 molecule, e.g., directly or indirectly, e.g., via a linker.

41. The method of claim 40, wherein the one or more nuclear localization sequences comprise the amino acid sequence of SEQ ID NO: 37 or 38, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 37 or 38, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 37 or 38, or any fragment thereof.

42. The method of claim 1, wherein the fusion molecule comprises the amino acid sequence of SEQ ID NO: 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 39, 40, or 41, or any fragment thereof.

43. The method of claim 1, wherein the nucleic acid that encodes the fusion molecule comprises the sequence of SEQ ID NO: 23, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 23, or a sequence having one, two, three, four, five or more changes, e.g., substitutions, insertions, or deletions, relative to SEQ ID NO: 23, or any fragment thereof.

44. The method of claim 1, wherein the gRNA comprises a unimolecular gRNA.

45. The method of claim 1, wherein the gRNA comprises a bimolecular gRNA.

46. The method of claim 1, wherein the gRNA comprises a gRNA sequence described herein.

47. The method of claim 1, wherein gene expression is modulated in a cell, tissue, or organ described herein, e.g., Table 2 or 3.

48. The method of claim 1, wherein gene expression is modulated in the liver.

49. The method of claim 1, wherein the modulation is sufficient to alter a function of the gene, or a symptom of a disorder associated with the gene, as described herein, e.g., in Table 2 or 3.

50. The method of claim 1, wherein the modulation comprises modulation of transcription.

51. The method of claim 1, wherein the modulation comprises down-regulation of transcription.

52. The method of claim 1, wherein the modulation comprises up-regulation of transcription.

53. The method of claim 1, wherein the modulation comprises modulating the temporal pattern of expression of the gene.

54. The method of claim 1, wherein the modulation comprises modulating the spatial pattern of expression of the gene.

55. The method of claim 1, wherein the modulation comprises modulating a post-transcriptional or co-transcriptional modification, e.g., splicing, 5' capping, 3' cleavage, 3' polyadenylation, or RNA export.

56. The method of claim 1, wherein the modulation comprises modulating the expression of an isoform, e.g., an increase or decrease in the expression of an isoform, or an increase or decrease in the expression of a first isoform over a second isoform.

57. The method of claim 1, wherein the modulation comprises modulating chromatin structure, e.g., increasing or decreasing methylation, acetylation, phosphorylation, or ubiquitination, e.g., at a preselected site, or altering the spatial pattern, cell specificity, or temporal occurrence of methylation, acetylation, phosphorylation, or ubiquitination.

58. The method of claim 1, wherein the modulation comprises modulating a post-translational modification (e.g., indirectly), e.g., glycosylation, lipidation, acetylation, phosphorylation, amidation, hydroxylation, methylation, ubiquitination, sulfation, nitrosylation, or proteolysis.

59. The method of claim 1, wherein the modulation does not comprise cleaving the subject's DNA.

60. The method of claim 1, wherein the modulation comprises an inducible modulation.

61. The method of claim 1, wherein the gene is selected from Table 2, optionally wherein the method down-regulates the expression of the gene.

62. The method of any of claims 1-60, wherein the gene is selected from Table 3, optionally wherein the method up-regulates the expression of the gene.

63. The method of claim 1, wherein the gene comprises PCSK9.

64. The method of claim 1, wherein the dCas9 molecule does not cleave the genome of the subject.

65. A method of modulating expression of a gene, in vivo, in a subject comprising administering to, or providing in, the subject: (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, and wherein one or both of (a)(ii) and (b)(ii) are packaged in an AAV vector.

66. The method of claim 65, wherein the fusion molecule (e.g., a fusion molecule described herein) comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

67. The method of claim 65, wherein the gRNA comprises a gRNA sequence described herein.

68. The method of claim 65, wherein the gene is selected from Table 2 or 3.

69. The method of claim 65, wherein the gene comprises PCSK9.

70. The method of claim 65, wherein (a)(ii) and (b)(ii) are packaged in different AAV vectors.

71. The method of claim 65, wherein (a)(ii) and (b)(ii) are packaged in the same AAV vector.

72. A pharmaceutical composition, or unit dosage form, comprising, in an amount sufficient for modulating a gene in a human subject, or in an amount sufficient for a therapeutic effect in a human subject, (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and/or (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, wherein one or both of (a)(ii) and (b)(ii) are packaged in a viral vector.

73. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the fusion molecule (e.g., a fusion molecule described herein) comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

74. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gRNA comprises a gRNA sequence described herein.

75. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gene is selected from Table 2 or 3.

76. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gene comprises PCSK9.

77. The pharmaceutical composition, or unit dosage form, of claim 72, wherein (a)(ii) and (b)(ii) are packaged in the same viral vector, e.g., an AAV vector.

78. The pharmaceutical composition, or unit dosage form, of claim 72, wherein (a)(ii) and (b)(ii) are packaged in different viral vectors, e.g., AAV vectors.

79. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the viral vector (e.g., AAV vector) comprising (a)(ii), and the viral vector (e.g., AAV vector) comprising (b)(ii), are provided in separate containers.

80. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the viral vector (e.g., AAV vector) comprising (a)(ii) and the viral vector (e.g., AAV vector) comprising (b)(ii), are provided in the same container.

81. The pharmaceutical composition, or unit dosage form, of claim 72, which is formulated for administration, e.g., oral, parenteral, sublingual, transdermal, rectal, transmucosal, topical, intrapleural, intravenous, intraarterial, intraperitoneal, subcutaneous, intramuscular, intranasal, intrathecal, or intraarticular administration, or administration via inhalation or via buccal administration, or any combination thereof, to the subject.

82. The pharmaceutical composition, or unit dosage form, of claim 72, which is formulated for intravenous administration to the subject.

83. The pharmaceutical composition, or unit dosage form, of claim 72, which is disposed in a device suitable for administration, e.g., oral, parenteral, sublingual, transdermal, rectal, transmucosal, topical, intrapleural, intravenous, intraarterial, intraperitoneal, subcutaneous, intramuscular, intranasal, intrathecal, or intraarticular administration, or administration via inhalation or via buccal administration, or any combination thereof, to the subject.

84. The pharmaceutical composition, or unit dosage form, of claim 72, which is disposed in a device suitable for intravenous administration to the subject.

85. The pharmaceutical composition, or unit dosage form, of claim 72, which is disposed in a volume of at least 1, 2, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, or 500 ml.

86. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the nucleic acid of (a)(ii) comprises DNA.

87. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the nucleic acid of (b)(ii) comprises DNA.

88. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the nucleic acid of (a)(ii) comprises RNA.

89. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the nucleic acid of (b)(ii) comprises RNA.

90. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule comprises a gRNA binding domain of a Cas9 molecule.

91. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule comprises one, two or all of: a Rec1 domain, a bridge helix domain, or a PAM interacting domain, of a Cas9 molecule.

92. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule is a mutant of a wild-type Cas9 molecule, e.g., in which the Cas9 nuclease activity is inactivated.

93. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a DNA-cleavage domain of a Cas9 molecule.

94. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a RuvC domain and/or a mutation in a HNH domain.

95. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule comprises a Staphylococcus aureus dCas9 molecule, a Streptococcus pyogenes dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (e.g., strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9 molecule, a Campylobacter lari (e.g., strain CF89-12) dCas9 molecule, or a Streptococcus thermophilus (e.g., strain LMD-9) dCas9 molecule.

96. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule comprises an S. aureus dCas9 molecule, e.g., comprising an S. aureus dCas9 sequence described herein.

97. The pharmaceutical composition, or unit dosage form, of claim 96, wherein the S. aureus dCas9 molecule comprises a mutation at an amino acid position, corresponding to position 10, 580, or both (e.g., D10A, N580A, or both), relative to a wild-type S. aureus dCas9 molecule, numbered according to SEQ ID NO: 25.

98. The pharmaceutical composition, or unit dosage form, of claim 96, wherein the S. aureus dCas9 molecule comprises the amino acid sequence of SEQ ID NO: 35 or 36, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 35 or 36, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 35 or 36, or any fragment thereof.

99. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule comprises an S. pyogenes dCas9 molecule, e.g., comprising an S. pyogenes dCas9 sequence described herein.

100. The pharmaceutical composition, or unit dosage form, of claim 99, wherein the S. pyogenes dCas9 molecule comprises a mutation at an amino acid position, corresponding to position 10, 840, or both (e.g., D10A, H840A, or both), relative to a wild-type S. pyogenes dCas9 molecule, numbered according to SEQ ID NO: 24.

101. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule is less than 1400, 1300, 1200, 1100, 1000, 900, 800, 700, 600, or 500 amino acids in length.

102. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule is 500-1300, 600-1200, 700-1100, 800-1000, 500-1200, 500-1000, 500-800, 500-600, 1000-1200, 800-1200, or 600-1200 amino acids in length.

103. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 molecule has a size that is less than 90%, 80%, 70%, 60%, 50%, 40%, or 30% of the size of a wild-type Cas9 molecule, e.g., a wild-type S. pyogenes Cas9 molecule or a wild-type S. aureus dCas9 molecule.

104. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulator of gene expression comprises a modulator of gene expression described herein.

105. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulator of gene expression comprises a KRAB molecule, e.g., comprising the sequence of SEQ ID NO: 34, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, or any fragment thereof.

106. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gRNA comprises a unimolecular gRNA.

107. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gRNA comprises a bimolecular gRNA.

108. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gRNA comprises a gRNA sequence described herein.

109. The pharmaceutical composition, or unit dosage form, of claim 72, wherein gene expression is modulated in a cell, tissue, or organ described herein, e.g., Table 2 or 3.

110. The pharmaceutical composition, or unit dosage form, of claim 72, wherein gene expression is modulated in the liver.

111. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation is sufficient to alter a function of the gene, or a symptom of a disorder associated with the gene, as described herein, e.g., in Table 2 or 3.

112. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises modulation of transcription.

113. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises down-regulation of transcription.

114. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises up-regulation of transcription.

115. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises modulating the temporal pattern of expression of the gene.

116. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises modulating the spatial pattern of expression of the gene.

117. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises modulating a post-transcriptional or co-transcriptional modification, e.g., splicing, 5' capping, 3' cleavage, 3' polyadenylation, or RNA export.

118. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises modulating the expression of an isoform, e.g., an increase or decrease in the expression of an isoform, or an increase or decrease in the expression of a first isoform over a second isoform.

119. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises modulating chromatin structure, e.g., increasing or decreasing methylation, acetylation, phosphorylation, or ubiquitination, e.g., at a preselected site, or altering the spatial pattern, cell specificity, or temporal occurrence of methylation, acetylation, phosphorylation, or ubiquitination.

120. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the modulation comprises modulating a post-translational modification (e.g., indirectly), e.g., glycosylation, lipidation, acetylation, phosphorylation, amidation, hydroxylation, methylation, ubiquitination, sulfation, nitrosylation, or proteolysis.

121. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gene is selected from Table 2, optionally wherein the method down-regulates the expression of the gene.

122. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gene is selected from Table 3, optionally wherein the method up-regulates the expression of the gene.

123. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the gene comprises PCSK9.

124. The pharmaceutical composition, or unit dosage form, of claim 72, wherein the dCas9 does not cleave the genome of the subject.

125. A pharmaceutical composition, or unit dosage form, comprising, in an amount sufficient for modulating a gene in a human subject, or in an amount sufficient for a therapeutic effect in a human subject, (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and/or (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, wherein one or both of (a)(ii) and (b)(ii) are packaged in a viral vector.

126. The pharmaceutical composition, or unit dosage form, of claim 125, wherein the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

127. The pharmaceutical composition, or unit dosage form, of claim 125, wherein the gRNA comprises a gRNA sequence described herein.

128. The pharmaceutical composition, or unit dosage form, of claim 125, wherein the gene is selected from Table 2 or 3.

129. The pharmaceutical composition, or unit dosage form, of claim 125, wherein the gene comprises PCSK9.

130. The pharmaceutical composition, or unit dosage form, of claim 125, wherein (a)(ii) and (b)(ii) are packaged in different AAV vectors.

131. The pharmaceutical composition, or unit dosage form, of claim 125, wherein (a)(ii) and (b)(ii) are packaged in the same AAV vector.

132. A viral vector comprising: (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and/or (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to a gene.

133. The viral vector of claim 132, wherein the viral vector is an AAV vector, the fusion molecule comprises a fusion molecule described herein, the dCas9 molecule comprises a dCas9 molecule described herein (e.g., an S. aureus dCas9 molecule), and/or the modulator of gene expression comprises a modulator described herein.

134. The viral vector of claim 132, comprising: (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to PCSK9, wherein one or both of (a)(ii) and (b)(ii) are packaged in an AAV vector.

135. The viral vector of claim 132, wherein the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

136. The viral vector of claim 132, wherein the gRNA comprises a gRNA sequence described herein.

137. The viral vector of claim 132, wherein the gene is selected from Table 2 or 3.

138. The viral vector of claim 132, wherein the gene comprises PCSK9.

139. A method of treating a disorder, comprising administering to a subject: (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to a gene associated with the disorder, thereby treating the disorder.

140. The method of claim 139, wherein the disorder is selected from Table 2 or 3, the fusion molecule comprises a fusion molecule described herein, the dCas9 molecule comprises a dCas9 molecule described herein, the modulator of gene expression comprises a modulator described herein, and/or the gRNA comprises a gRNA sequence described herein.

141. The method of claim 139, wherein the gene is selected from Table 2 or 3.

142. The method of claim 139, wherein one or both of (a)(ii) and (b)(ii) are provided in an AAV vector.

143. A method of treating a cardiovascular disease, comprising administering to a subject: (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to a PCSK9 gene, thereby treating the cardiovascular disease.

144. The method of claim 143, wherein the fusion molecule comprises a fusion molecule described herein, the dCas9 molecule comprises a dCas9 molecule described herein, e.g., an S. aureus dCas9 molecule, and/or the modulator of gene expression comprises a modulator described herein.

145. The method of claim 143, wherein the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

146. The method of claim 143, wherein the gRNA comprises a gRNA sequence described herein.

147. The method of claim 143, wherein one or both of (a)(ii) and (b)(ii) are provided in an AAV vector.

Description

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application claims the benefit of U.S. Provisional Application No. 62/321,947, filed Apr. 13, 2016, and U.S. Provisional Application No. 62/369,248, filed Aug. 1, 2016. The contents of the aforesaid applications are hereby incorporated by reference in their entirety.

BACKGROUND

[0003] Engineered DNA-binding proteins that can be customized to target any gene in mammalian cells have enabled rapid advances in biomedical research and are a promising platform for gene therapies. The RNA-guided CRISPR-Cas9 system has emerged as a promising platform for programmable targeted gene regulation. Fusion of catalytically inactive, "dead" Cas9 (dCas9) to the Kruppel-associated box (KRAB) domain generates a synthetic repressor capable of highly specific and potent silencing of target genes in cell culture experiments. However, a technology to deliver CRISPR/Cas9-based gene repressors in vivo has not been developed. Adeno-associated virus (AAV) vectors have been proposed for gene delivery of CRISPR-Cas9 components for in vivo studies and therapeutic applications. AAV vectors provide stable gene expression with low risk of mutagenic integration events. AAV vectors can be engineered to target tissues of interest in vivo, and are already in use in humans in clinical trials. However, gene delivery of S. pyogenes dCas9-KRAB in vivo is challenging because the size of the S. pyogenes dCas9 and KRAB domain fusion exceeds the packaging limits of standard AAV vectors.
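The packaging constraint described above can be illustrated with rough arithmetic. The following is a minimal sketch, not part of the disclosure. It assumes commonly cited figures (about 4.7 kb AAV genome capacity, 1368 amino acids for wild-type S. pyogenes Cas9, 1053 amino acids for wild-type S. aureus Cas9) and hypothetical placeholder sizes for the KRAB domain and the regulatory elements. The names `cassette_size_bp` and `fits_in_single_aav` are illustrative only.

```python
# Illustrative packaging arithmetic. Every number below is an assumption
# (commonly cited or a placeholder), not a value taken from this disclosure.

AAV_CAPACITY_BP = 4700  # approximate AAV genome capacity, ITRs included

# Commonly cited wild-type Cas9 lengths in amino acids.
CAS9_LENGTH_AA = {"S. pyogenes": 1368, "S. aureus": 1053}

KRAB_LENGTH_AA = 75    # hypothetical placeholder for the KRAB domain
REGULATORY_BP = 1090   # hypothetical: two ITRs + promoter + polyA signal


def cassette_size_bp(cas9_aa, effector_aa=KRAB_LENGTH_AA,
                     regulatory_bp=REGULATORY_BP):
    """Return the approximate expression cassette size in base pairs."""
    # Three bases per codon, plus one stop codon.
    coding_bp = (cas9_aa + effector_aa) * 3 + 3
    return coding_bp + regulatory_bp


def fits_in_single_aav(species):
    """True if the assumed cassette fits within the assumed AAV capacity."""
    return cassette_size_bp(CAS9_LENGTH_AA[species]) <= AAV_CAPACITY_BP


for species in CAS9_LENGTH_AA:
    print(species, cassette_size_bp(CAS9_LENGTH_AA[species]),
          fits_in_single_aav(species))
```

Under these assumed sizes, the S. pyogenes cassette exceeds the assumed capacity while the S. aureus cassette fits. Actual constructs depend on the real promoter, linker, and nuclear localization sequence lengths.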

SUMMARY

[0004] In an aspect, the disclosure features a method of modulating expression of a gene, in vivo, in a subject comprising administering to, or providing in, the subject:

[0005] (a) (i) a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; or (ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and

[0006] (b) (i) a gRNA which targets the fusion molecule to the gene; or (ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, in an amount sufficient to modulate expression of the gene.

[0007] In an embodiment, the method comprises administering to, or providing in, the subject any of: (a)(ii) and (b)(ii), (a)(i) and (b)(i), (a)(i) and (b)(ii), or (a)(ii) and (b)(i).

[0008] In an embodiment, the method comprises administering to, or providing in, the subject:

[0009] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and

[0010] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene.

[0011] In an embodiment, the nucleic acid of (a)(ii) comprises DNA. In an embodiment, the nucleic acid of (b)(ii) comprises DNA. In an embodiment, the nucleic acid of (a)(ii) comprises RNA. In an embodiment, the nucleic acid of (b)(ii) comprises RNA.

[0012] In an embodiment, one or both of (a) and (b) are packaged in a viral vector. In an embodiment, (a) is packaged in a viral vector. In an embodiment, (b) is packaged in a viral vector. In an embodiment, (a) and (b) are packaged in the same viral vector.

[0013] In an embodiment, the viral vector comprises an AAV vector. In an embodiment, the viral vector comprises a lentiviral vector.

[0014] In an embodiment, (a) is packaged in a first viral vector and (b) is packaged in a second viral vector. In an embodiment, the first viral vector comprises an AAV vector and the second viral vector comprises an AAV vector.

[0015] In an embodiment, the dCas9 molecule comprises a gRNA binding domain of a Cas9 molecule. In an embodiment, the dCas9 molecule comprises one, two or all of: a Rec1 domain, a bridge helix domain, or a PAM interacting domain, of a Cas9 molecule.

[0016] In an embodiment, the dCas9 molecule is a mutant of a wild-type Cas9 molecule, e.g., in which the Cas9 nuclease activity is inactivated. In an embodiment, the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a DNA-cleavage domain of a Cas9 molecule. In an embodiment, the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a RuvC domain and/or a mutation in a HNH domain.

[0017] In an embodiment, the dCas9 molecule comprises a Staphylococcus aureus dCas9 molecule, a Streptococcus pyogenes dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (e.g., strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9 molecule, a Campylobacter lari (e.g., strain CF89-12) dCas9 molecule, or a Streptococcus thermophilus (e.g., strain LMD-9) dCas9 molecule.

[0018] In an embodiment, the dCas9 molecule comprises an S. aureus dCas9 molecule, e.g., comprising an S. aureus dCas9 sequence described herein.

[0019] In an embodiment, the S. aureus dCas9 molecule comprises a mutation at an amino acid position corresponding to position 10, 580, or both (e.g., D10A, N580A, or both), relative to a wild-type S. aureus Cas9 molecule, numbered according to SEQ ID NO: 25.

[0020] In an embodiment, the S. aureus dCas9 molecule comprises the amino acid sequence of SEQ ID NO: 35 or 36, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 35 or 36, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 35 or 36, or any fragment thereof.
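The percent-identity thresholds recited above can be made concrete with a small calculation. The following is a minimal sketch that assumes the two sequences have already been aligned to equal length, with `-` marking gaps. Counting matches over aligned columns is an illustrative convention chosen here, not the definition of identity used in this disclosure, and the function names are hypothetical.

```python
# Illustrative percent-identity calculation over a pre-aligned sequence pair.
# The counting convention is an assumption made for this sketch.

THRESHOLDS_PCT = (80, 85, 90, 92, 95, 97, 98, 99)


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned columns in which both sequences share a residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b and a != gap)
    return 100.0 * matches / len(columns)


def thresholds_met(identity_pct):
    """Return the recited thresholds that the identity value satisfies."""
    return [t for t in THRESHOLDS_PCT if identity_pct >= t]
```

For example, two aligned ten-residue sequences that differ at one position give 90% identity under this convention, which satisfies the 80%, 85%, and 90% thresholds.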

[0021] In an embodiment, the dCas9 molecule comprises an S. pyogenes dCas9 molecule, e.g., comprising an S. pyogenes dCas9 sequence described herein.

[0022] In an embodiment, the S. pyogenes dCas9 molecule comprises a mutation at an amino acid position corresponding to position 10, 840, or both (e.g., D10A, H840A, or both), relative to a wild-type S. pyogenes Cas9 molecule, numbered according to SEQ ID NO: 24.
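The substitution notation used for the inactivating mutations (e.g., D10A, N580A, H840A) reads as wild-type residue, 1-based position, replacement residue. The following is a minimal sketch of applying such substitutions to a protein sequence while checking the expected wild-type residue. The helper name `apply_substitutions` and the short toy peptide are hypothetical; the peptide is not SEQ ID NO: 24 or 25.

```python
import re

# Matches notation such as "D10A": wild-type residue, position, new residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions (e.g., ["D10A"]) using 1-based positions."""
    residues = list(sequence)
    for sub in substitutions:
        match = _SUBSTITUTION.match(sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, "
                f"found {residues[position - 1]}")
        residues[position - 1] = replacement
    return "".join(residues)


# Hypothetical toy peptide with an aspartate (D) at position 10.
toy_peptide = "MKRNYILGLDIGITS"
mutant = apply_substitutions(toy_peptide, ["D10A"])
```

Checking the wild-type residue before substituting guards against numbering a mutation against the wrong reference sequence.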

[0023] In an embodiment, the dCas9 molecule is less than 1400, 1300, 1200, 1100, 1000, 900, 800, 700, 600, or 500 amino acids in length. In an embodiment, the dCas9 molecule is 500-1300, 600-1200, 700-1100, 800-1000, 500-1200, 500-1000, 500-800, 500-600, 1000-1200, 800-1200, or 600-1200 amino acids in length.

[0024] In an embodiment, the dCas9 molecule has a size that is less than 90%, 80%, 70%, 60%, 50%, 40%, or 30% of the size of a wild-type Cas9 molecule, e.g., a wild-type S. pyogenes Cas9 molecule or a wild-type S. aureus Cas9 molecule.
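The absolute and relative size limits recited above can be checked with simple arithmetic. The following is a minimal sketch, assuming commonly cited lengths of 1053 amino acids for S. aureus Cas9 and 1368 amino acids for S. pyogenes Cas9. Those lengths are not stated in this passage, and the function name `caps_satisfied` is hypothetical.

```python
# Recited upper bounds on dCas9 length (amino acids) and on size relative
# to a wild-type Cas9 reference (percent).
LENGTH_CAPS_AA = (1400, 1300, 1200, 1100, 1000, 900, 800, 700, 600, 500)
RELATIVE_CAPS_PCT = (90, 80, 70, 60, 50, 40, 30)


def caps_satisfied(length_aa, reference_aa):
    """Return the length caps and relative-size caps that a molecule meets."""
    length_caps = [cap for cap in LENGTH_CAPS_AA if length_aa < cap]
    relative_pct = 100.0 * length_aa / reference_aa
    relative_caps = [cap for cap in RELATIVE_CAPS_PCT if relative_pct < cap]
    return length_caps, relative_caps


# Assumed lengths: S. aureus Cas9 (1053 aa) against S. pyogenes Cas9 (1368 aa).
length_caps, relative_caps = caps_satisfied(1053, 1368)
```

With these assumed lengths, the smaller molecule is roughly 77% of the reference size, so it meets the "less than 90%" and "less than 80%" limits but not the "less than 70%" limit.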

[0025] In an embodiment, the modulator of gene expression comprises a modulator of gene expression described herein.

[0026] In an embodiment, the modulator of gene expression comprises a repressor of gene expression, e.g., a Kruppel associated box (KRAB) molecule, an mSin3 interaction domain (SID) molecule, four concatenated mSin3 interaction domains (SID4X), MAX-interacting protein 1 (MXI1), or any fragment thereof.

[0027] In an embodiment, the modulator of gene expression comprises a Kruppel associated box (KRAB) molecule comprising the sequence of SEQ ID NO: 34, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, or any fragment thereof.

[0028] In an embodiment, the modulator of gene expression comprises an activator of gene expression, e.g., a VP16 transcription activation domain, a VP64 transcriptional activation domain, a p65 activation domain, an Epstein-Barr virus R transactivator Rta molecule, a VP64-p65-Rta fusion (VPR), Ldb1 self-association domain, or any fragment thereof.

[0029] In an embodiment, the modulator of gene expression comprises a modulator of epigenetic modification, e.g., a histone acetyltransferase (e.g., p300 catalytic domain), a histone deacetylase, a histone methyltransferase (e.g., SUV39H1 or G9a (EHMT2)), a histone demethylase (e.g., Lys-specific histone demethylase 1 (LSD1)), a DNA methyltransferase (e.g., DNMT3a or DNMT3a-DNMT3L), a DNA demethylase (e.g., TET1 catalytic domain or TDG), or any fragment thereof.

[0030] In an embodiment, the modulator of gene expression is fused to the C-terminus, N-terminus, or both, of the dCas9 molecule.

[0031] In an embodiment, the modulator of gene expression is fused to the dCas9 molecule directly. In an embodiment, the modulator of gene expression is fused to the dCas9 molecule indirectly, e.g., via a non-modulator or a linker, or a second modulator.

[0032] In an embodiment, a plurality of modulators of gene expression, e.g., two or more identical, substantially identical, or different modulators, are fused to the dCas9 molecule.

[0033] In an embodiment, the fusion molecule further comprises a nuclear localization sequence.

[0034] In an embodiment, one or more nuclear localization sequences are fused to the C-terminus, N-terminus, or both, of the dCas9 molecule, e.g., directly or indirectly, e.g., via a linker.
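The fusion architectures described above (modulators and nuclear localization sequences at the N-terminus, C-terminus, or both, joined directly or via a linker) can be modeled as an ordered list of parts. The following is a minimal sketch in which every name and sequence string is a hypothetical placeholder rather than a sequence from this disclosure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """A named segment of a fusion molecule (placeholder sequence)."""
    name: str
    sequence: str


def assemble_fusion(n_terminal, core, c_terminal, linker=""):
    """Join parts in N-to-C order; an empty linker models a direct fusion."""
    ordered = [*n_terminal, core, *c_terminal]
    layout = "-".join(part.name for part in ordered)
    sequence = linker.join(part.sequence for part in ordered)
    return layout, sequence


# Hypothetical placeholders, not real domain sequences.
nls = Part("NLS", "PKK")
dcas9 = Part("dCas9", "MAAA")
krab = Part("KRAB", "RTLV")

layout, sequence = assemble_fusion([nls], dcas9, [krab], linker="GS")
```

Representing the fusion as ordered parts makes it straightforward to enumerate alternative layouts, such as placing the modulator at the N-terminus or adding a second nuclear localization sequence at the C-terminus.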

[0035] In an embodiment, the one or more nuclear localization sequences comprise the amino acid sequence of SEQ ID NO: 37 or 38, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 37 or 38, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 37 or 38, or any fragment thereof.

[0036] In an embodiment, the fusion molecule comprises the amino acid sequence of SEQ ID NO: 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 39, 40, or 41, or any fragment thereof.

[0037] In an embodiment, the nucleic acid that encodes the fusion molecule comprises the sequence of SEQ ID NO: 23, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 23, or a sequence having one, two, three, four, five or more changes, e.g., substitutions, insertions, or deletions, relative to SEQ ID NO: 23, or any fragment thereof.

[0038] In an embodiment, the gRNA comprises a unimolecular gRNA. In an embodiment, the gRNA comprises a bimolecular gRNA.

[0039] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0040] In an embodiment, gene expression is modulated in a cell, tissue, or organ described herein, e.g., Table 2 or 3. In an embodiment, gene expression is modulated in the liver.

[0041] In an embodiment, the modulation is sufficient to alter a function of the gene, or a symptom of a disorder associated with the gene, as described herein, e.g., in Table 2 or 3.

[0042] In an embodiment, the modulation comprises modulation of transcription. In an embodiment, the modulation comprises down-regulation of transcription. In an embodiment, the modulation comprises up-regulation of transcription.

[0043] In an embodiment, the modulation comprises modulating the temporal pattern of expression of the gene. In an embodiment, the modulation comprises modulating the spatial pattern of expression of the gene.

[0044] In an embodiment, the modulation comprises modulating a post-transcriptional or co-transcriptional modification, e.g., splicing, 5' capping, 3' cleavage, 3' polyadenylation, or RNA export.

[0045] In an embodiment, the modulation comprises modulating the expression of an isoform, e.g., an increase or decrease in the expression of an isoform, or an increase or decrease in the expression of a first isoform over a second isoform.

[0046] In an embodiment, the modulation comprises modulating chromatin structure, e.g., increasing or decreasing methylation, acetylation, phosphorylation, or ubiquitination, e.g., at a preselected site, or altering the spatial pattern, cell specificity, or temporal occurrence of methylation, acetylation, phosphorylation, or ubiquitination.

[0047] In an embodiment, the modulation comprises modulating a post-translational modification (e.g., indirectly), e.g., glycosylation, lipidation, acetylation, phosphorylation, amidation, hydroxylation, methylation, ubiquitination, sulfation, nitrosylation, or proteolysis.

[0048] In an embodiment, the modulation does not comprise cleaving the subject's DNA.

[0049] In an embodiment, the modulation comprises an inducible modulation.

[0050] In an embodiment, the gene is selected from Table 2, optionally wherein the method down-regulates the expression of the gene.

[0051] In an embodiment, the gene is selected from Table 3, optionally wherein the method up-regulates the expression of the gene.

[0052] In an embodiment, the gene comprises PCSK9.

[0053] In an embodiment, the dCas9 molecule does not cleave the genome of the subject.

[0054] In another aspect, the disclosure features a method of modulating expression of a gene, in vivo, in a subject comprising administering to, or providing in, the subject:

[0055] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and

[0056] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, and wherein one or both of (a)(ii) and (b)(ii) are packaged in an AAV vector.

[0057] In an embodiment, the fusion molecule comprises a fusion molecule described herein.

[0058] In an embodiment, the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

[0059] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0060] In an embodiment, the gene is selected from Table 2 or 3. In an embodiment, the gene comprises PCSK9.

[0061] In an embodiment, (a)(ii) and (b)(ii) are packaged in different AAV vectors. In an embodiment, (a)(ii) and (b)(ii) are packaged in the same AAV vector.

[0062] In another aspect, the disclosure features a pharmaceutical composition, or unit dosage form, comprising, in an amount sufficient for modulating a gene in a human subject, or in an amount sufficient for a therapeutic effect in a human subject,

[0063] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule, e.g., an S. aureus dCas9 molecule, fused to a modulator of gene expression; and/or

[0064] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene,

[0065] wherein one or both of (a)(ii) and (b)(ii) are packaged in a viral vector.

[0066] In an embodiment, the fusion molecule comprises a fusion molecule described herein.

[0067] In an embodiment, the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

[0068] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0069] In an embodiment, the gene is selected from Table 2 or 3. In an embodiment, the gene comprises PCSK9.

[0070] In an embodiment, one or both of (a)(ii) and (b)(ii) are packaged in an AAV vector.

[0071] In an embodiment, (a)(ii) and (b)(ii) are packaged in the same viral vector, e.g., an AAV vector. In an embodiment, (a)(ii) and (b)(ii) are packaged in different viral vectors, e.g., AAV vectors.

[0072] In an embodiment, the viral vector (e.g., AAV vector) comprising (a)(ii), and the viral vector (e.g., AAV vector) comprising (b)(ii), are provided in separate containers.

[0073] In an embodiment, the viral vector (e.g., AAV vector) comprising (a)(ii) and the viral vector (e.g., AAV vector) comprising (b)(ii), are provided in the same container.

[0074] In an embodiment, the pharmaceutical composition, or unit dosage form, is formulated for administration, e.g., oral, parenteral, sublingual, transdermal, rectal, transmucosal, topical, intrapleural, intravenous, intraarterial, intraperitoneal, subcutaneous, intramuscular, intranasal, intrathecal, or intraarticular administration, or administration via inhalation or via buccal administration, or any combination thereof, to the subject.

[0075] In an embodiment, the pharmaceutical composition, or unit dosage form, is formulated for intravenous administration to the subject.

[0076] In an embodiment, the pharmaceutical composition, or unit dosage form, is disposed in a device suitable for administration, e.g., oral, parenteral, sublingual, transdermal, rectal, transmucosal, topical, intrapleural, intravenous, intraarterial, intraperitoneal, subcutaneous, intramuscular, intranasal, intrathecal, or intraarticular administration, or administration via inhalation or via buccal administration, or any combination thereof, to the subject.

[0077] In an embodiment, the pharmaceutical composition, or unit dosage form, is disposed in a device suitable for intravenous administration to the subject.

[0078] In an embodiment, the pharmaceutical composition, or unit dosage form, is disposed in a volume of at least 1, 2, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, or 500 ml.

[0079] In an embodiment, the nucleic acid of (a)(ii) comprises DNA. In an embodiment, the nucleic acid of (b)(ii) comprises DNA. In an embodiment, the nucleic acid of (a)(ii) comprises RNA. In an embodiment, the nucleic acid of (b)(ii) comprises RNA.

[0080] In an embodiment, the dCas9 molecule comprises a gRNA binding domain of a Cas9 molecule.

[0081] In an embodiment, the dCas9 molecule comprises one, two or all of: a Rec1 domain, a bridge helix domain, or a PAM interacting domain, of a Cas9 molecule. In an embodiment, the dCas9 molecule is a mutant of a wild-type Cas9 molecule, e.g., in which the Cas9 nuclease activity is inactivated. In an embodiment, the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a DNA-cleavage domain of a Cas9 molecule. In an embodiment, the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a RuvC domain and/or a mutation in a HNH domain.

[0082] In an embodiment, the dCas9 molecule comprises a Staphylococcus aureus dCas9 molecule, a Streptococcus pyogenes dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (e.g., strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9 molecule, a Campylobacter lari (e.g., strain CF89-12) dCas9 molecule, or a Streptococcus thermophilus (e.g., strain LMD-9) dCas9 molecule.

[0083] In an embodiment, the dCas9 molecule comprises an S. aureus dCas9 molecule, e.g., comprising an S. aureus dCas9 sequence described herein. In an embodiment, the S. aureus dCas9 molecule comprises a mutation at an amino acid position corresponding to position 10, 580, or both (e.g., D10A, N580A, or both), relative to a wild-type S. aureus Cas9 molecule, numbered according to SEQ ID NO: 25.

[0084] In an embodiment, the S. aureus dCas9 molecule comprises the amino acid sequence of SEQ ID NO: 35 or 36, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 35 or 36, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 35 or 36, or any fragment thereof.

[0085] In an embodiment, the dCas9 molecule comprises an S. pyogenes dCas9 molecule, e.g., comprising an S. pyogenes dCas9 sequence described herein. In an embodiment, the S. pyogenes dCas9 molecule comprises a mutation at an amino acid position corresponding to position 10, 840, or both (e.g., D10A, H840A, or both), relative to a wild-type S. pyogenes Cas9 molecule, numbered according to SEQ ID NO: 24.

[0086] In an embodiment, the dCas9 molecule is less than 1400, 1300, 1200, 1100, 1000, 900, 800, 700, 600, or 500 amino acids in length.

[0087] In an embodiment, the dCas9 molecule is 500-1300, 600-1200, 700-1100, 800-1000, 500-1200, 500-1000, 500-800, 500-600, 1000-1200, 800-1200, or 600-1200 amino acids in length.

[0088] In an embodiment, the dCas9 molecule has a size that is less than 90%, 80%, 70%, 60%, 50%, 40%, or 30% of the size of a wild-type Cas9 molecule, e.g., a wild-type S. pyogenes Cas9 molecule or a wild-type S. aureus Cas9 molecule.

[0089] In an embodiment, the modulator of gene expression comprises a modulator of gene expression described herein.

[0090] In an embodiment, the modulator of gene expression comprises a KRAB molecule, e.g., comprising the sequence of SEQ ID NO: 34, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, or any fragment thereof.

[0091] In an embodiment, the gRNA comprises a unimolecular gRNA. In an embodiment, the gRNA comprises a bimolecular gRNA. In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0092] In an embodiment, gene expression is modulated in a cell, tissue, or organ described herein, e.g., Table 2 or 3. In an embodiment, gene expression is modulated in the liver.

[0093] In an embodiment, the modulation is sufficient to alter a function of the gene, or a symptom of a disorder associated with the gene, as described herein, e.g., in Table 2 or 3.

[0094] In an embodiment, the modulation comprises modulation of transcription. In an embodiment, the modulation comprises down-regulation of transcription. In an embodiment, the modulation comprises up-regulation of transcription.

[0095] In an embodiment, the modulation comprises modulating the temporal pattern of expression of the gene. In an embodiment, the modulation comprises modulating the spatial pattern of expression of the gene.

[0096] In an embodiment, the modulation comprises modulating a post-transcriptional or co-transcriptional modification, e.g., splicing, 5' capping, 3' cleavage, 3' polyadenylation, or RNA export.

[0097] In an embodiment, the modulation comprises modulating the expression of an isoform, e.g., an increase or decrease in the expression of an isoform, or an increase or decrease in the expression of a first isoform over a second isoform.

[0098] In an embodiment, the modulation comprises modulating chromatin structure, e.g., increasing or decreasing methylation, acetylation, phosphorylation, or ubiquitination, e.g., at a preselected site, or altering the spatial pattern, cell specificity, or temporal occurrence of methylation, acetylation, phosphorylation, or ubiquitination.

[0099] In an embodiment, the modulation comprises modulating a post-translational modification (e.g., indirectly), e.g., glycosylation, lipidation, acetylation, phosphorylation, amidation, hydroxylation, methylation, ubiquitination, sulfation, nitrosylation, or proteolysis.

[0100] In an embodiment, the gene is selected from Table 2, optionally wherein the method down-regulates the expression of the gene. In an embodiment, the gene is selected from Table 3, optionally wherein the method up-regulates the expression of the gene. In an embodiment, the gene comprises PCSK9.

[0101] In an embodiment, the dCas9 does not cleave the genome of the subject.

[0102] In another aspect, the disclosure features a pharmaceutical composition, or unit dosage form, comprising, in an amount sufficient for modulating a gene in a human subject, or in an amount sufficient for a therapeutic effect in a human subject,

[0103] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and/or

[0104] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene,

[0105] wherein one or both of (a)(ii) and (b)(ii) are packaged in a viral vector.

[0106] In an embodiment, the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

[0107] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0108] In an embodiment, the gene is selected from Table 2 or 3. In an embodiment, the gene comprises PCSK9.

[0109] In an embodiment, one or both of (a)(ii) and (b)(ii) are packaged in an AAV vector.

[0110] In an embodiment, (a)(ii) and (b)(ii) are packaged in different AAV vectors. In an embodiment, (a)(ii) and (b)(ii) are packaged in the same AAV vector.

[0111] In another aspect, the disclosure features a viral vector comprising:

[0112] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and/or

[0113] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to a gene.

[0114] In an embodiment, the viral vector is an AAV vector.

[0115] In an embodiment, the fusion molecule comprises a fusion molecule described herein.

[0116] In an embodiment, the dCas9 molecule comprises a dCas9 molecule described herein, e.g., an S. aureus dCas9 molecule.

[0117] In an embodiment, the modulator of gene expression comprises a modulator described herein.

[0118] In an embodiment, the gene is a gene described herein.

[0119] In an embodiment, the viral vector comprises:

[0120] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and

[0121] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to PCSK9,

[0122] wherein one or both of (a)(ii) and (b)(ii) are packaged in an AAV vector.

[0123] In an embodiment, the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

[0124] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0125] In an embodiment, the gene is selected from Table 2 or 3. In an embodiment, the gene comprises PCSK9.

[0126] In an embodiment, the disclosure features a method of treating a disorder, comprising administering to a subject:

[0127] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and

[0128] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to a gene associated with the disorder, thereby treating the disorder.

[0129] In an embodiment, the disorder is selected from Table 2 or 3. In an embodiment, the gene is selected from Table 2 or 3.

[0130] In an embodiment, one or both of (a)(ii) and (b)(ii) are provided in an AAV vector.

[0131] In an embodiment, the fusion molecule comprises a fusion molecule described herein.

[0132] In an embodiment, the dCas9 molecule comprises a dCas9 molecule described herein.

[0133] In an embodiment, the modulator of gene expression comprises a modulator described herein.

[0134] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0135] In an embodiment, the disclosure features a method of treating a cardiovascular disease, comprising administering to a subject: [0136] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and [0137] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to a PCSK9 gene, thereby treating the cardiovascular disease.

[0138] In an embodiment, the fusion molecule comprises a fusion molecule described herein.

[0139] In an embodiment, the dCas9 molecule comprises a dCas9 molecule described herein.

[0140] In an embodiment, the modulator of gene expression comprises a modulator described herein.

[0141] In an embodiment, the dCas9 molecule is an S. aureus dCas9 molecule.

[0142] In an embodiment, the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

[0143] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0144] In an embodiment, one or both of (a)(ii) and (b)(ii) are provided in an AAV vector.

[0145] In another aspect, the disclosure features: [0146] (a) (i) a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; or (ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and [0147] (b) (i) a gRNA which targets the fusion molecule to a gene; or (ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, for use in a method of modulating expression of the gene, in vivo, in a subject.

[0148] In an embodiment, the fusion molecule comprises a fusion molecule described herein.

[0149] In an embodiment, the dCas9 molecule comprises a dCas9 molecule described herein.

[0150] In an embodiment, the modulator of gene expression comprises a modulator described herein.

[0151] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0152] In an embodiment, the gene is a gene described herein.

[0153] In some embodiments, the method comprises a method described herein.

[0154] In another aspect, the disclosure features: [0155] (a) (i) a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; or (ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and [0156] (b) (i) a gRNA which targets the fusion molecule to a gene; or (ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, for use in a method of treating or preventing a disorder associated with the gene, in vivo, in a subject.

[0157] In an embodiment, the fusion molecule comprises a fusion molecule described herein.

[0158] In an embodiment, the dCas9 molecule comprises a dCas9 molecule described herein.

[0159] In an embodiment, the modulator of gene expression comprises a modulator described herein.

[0160] In an embodiment, the gRNA comprises a gRNA sequence described herein.

[0161] In an embodiment, the gene is a gene described herein.

[0162] In some embodiments, the disorder is a disorder described herein.

[0163] The present disclosure addresses these shortcomings by creating a modified programmable RNA-guided dCas9-based repressor for efficient packaging in AAV and in vivo gene regulation. This gene delivery system can be customized to target any endogenous gene by designing a new guide RNA molecule, enabling potent and stable gene repression in animal models and therapeutic use.

[0164] One aspect of the present disclosure provides a fusion protein comprising, consisting of, or consisting essentially of three heterologous polypeptide domains, wherein the first polypeptide domain comprises, consists of, or consists essentially of a dead Clustered Regularly Interspaced Short Palindromic Repeats associated (dCas) protein, the second polypeptide domain comprises, consists of, or consists essentially of a Kruppel-associated box (KRAB), and the third polypeptide domain has an activity selected from the group consisting of transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, nuclease activity, nucleic acid association activity, methylase activity, and demethylase activity.

[0165] Another aspect of the present disclosure provides a gene therapy construct comprising, consisting of, or consisting essentially of a polynucleotide encoding a fusion protein comprising three heterologous polypeptide domains, wherein the first polypeptide domain comprises, consists of, or consists essentially of a dead Clustered Regularly Interspaced Short Palindromic Repeats associated (dCas) protein, the second polypeptide domain comprises, consists of, or consists essentially of a Kruppel-associated box (KRAB), and the third polypeptide domain has an activity selected from the group consisting of transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, nuclease activity, nucleic acid association activity, methylase activity, and demethylase activity.

[0166] In some embodiments, the gene therapy construct comprises a vector system. In certain embodiments, the vector system comprises an AAV vector system.

[0167] In another embodiment, the gene therapy construct further comprises a first and second AAV inverted terminal repeat (ITR) sequence flanking the polynucleotide encoding the fusion protein.

[0168] Another aspect of the present disclosure provides a pharmaceutical composition comprising the gene therapy construct as described herein in a biocompatible pharmaceutical carrier.

[0169] In some embodiments, the Cas protein comprises Cas9.

[0170] In some embodiments, the gene therapy construct is designed for the targeted reduction of the PCSK9 gene. In some embodiments, the gene therapy construct is designed for the targeted reduction of the expression of the PCSK9 gene.

[0171] Another aspect of the present disclosure provides a method of suppressing the expression of a gene in a cell in vivo comprising, consisting of, or consisting essentially of administering to a cell a therapeutically effective amount of a gene therapy construct as described herein such that the gene expression is suppressed.

[0172] Another aspect of the present disclosure provides a method of suppressing a gene in vivo in a subject comprising, consisting of, or consisting essentially of administering to the subject a therapeutically effective amount of a gene therapy construct as described herein such that the gene is suppressed.

[0173] In some embodiments, the method is designed for the targeted reduction of the PCSK9 gene. In some embodiments, the method is designed for the targeted reduction of the expression of the PCSK9 gene.

[0174] Another aspect of the present disclosure provides a kit for the suppression of a gene in vivo comprising a gene therapy construct or pharmaceutical composition as described herein and instructions for use.

[0175] Yet another aspect of the present disclosure provides all that is described and illustrated herein.

BRIEF DESCRIPTION OF THE DRAWINGS

[0176] The foregoing aspects and other features of the disclosure are explained in the following description, taken in connection with the accompanying drawings, wherein:

[0177] FIGS. 1A-1D are graphs showing the adaptation of SaCas9 for transcriptional repression. FIG. 1A is a schematic showing the introduction of inactivating mutations D10A and N580A into the cleavage domains of SaCas9 to generate a nuclease-null dSaCas9 DNA-binding domain. FIG. 1B is a schematic showing a single lentiviral vector with puromycin resistance used to express dSaCas9-KRAB and a U6-gRNA cassette for in vitro testing of dSaCas9 repressors. FIGS. 1C and 1D are bar graphs showing that multiple gRNAs against the synthetic CAG promoter effected potent repression of mRNA by qPCR (FIG. 1C) and protein via luciferase bioluminescence (FIG. 1D) in primary mouse fibroblasts expressing a CAG-luciferase reporter cassette. * indicates p<0.05 by Student's t-test compared to non-treated (NT) controls (n=2 independent experiments).

[0178] FIGS. 2A and 2B are graphs showing the silencing of endogenous genes with the dSaCas9-KRAB repressor. In FIG. 2A, eight gRNAs were designed to target the skeletal muscle DNase-hypersensitivity peak upstream of the transcription start site in the endogenous mouse Acvr2b gene locus. FIG. 2B is a bar graph showing that several single gRNAs effected strong repression of Acvr2b when delivered with dSaCas9-KRAB, compared to no lentivirus (No LV) and dSaCas9-KRAB only (No gRNA) controls. * indicates p<0.05 by Student's t-test compared to No LV controls (n=2 independent experiments).

[0179] FIGS. 3A-3E are graphs showing the targeting of Acvr2b with AAV-dSaCas9-KRAB in vivo. FIG. 3A is a schematic showing a two-vector AAV9 expression system used to deliver dSaCas9-KRAB and Acvr2b gRNA intramuscularly to the right tibialis anterior muscle (TA) of adult wild-type mice. FIGS. 3B and 3D are bar graphs showing that dSaCas9 was efficiently expressed as measured by qPCR in the injected TA at 4 and 8 weeks, respectively, after injection. FIGS. 3C and 3E are bar graphs showing Acvr2b expression in the injected TA as assayed by qPCR at 4 and 8 weeks, respectively, post-AAV treatment. (n=3 mice, * indicates p<0.05 compared to PBS sham controls)

[0180] FIG. 4 is a bar graph showing the analysis of AAV-gRNA vector genome signal in intramuscularly injected mice. For PBS sham, AAV-dSaCas9-KRAB only, and AAV-dSaCas9-KRAB and AAV-Acvr2b-gRNA treated mice, the bars from left to right show the presence of the AAV-U6-gRNA vector, as measured by qPCR, in the liver, heart, right tibialis anterior (TA), left TA, right gastrocnemius (gastroc), and left gastroc, respectively.

[0181] FIGS. 5A-5D are graphs showing the silencing of endogenous genes in vivo with AAV-dSaCas9-KRAB. FIGS. 5A and 5C are bar graphs showing that intramuscular delivery of AAV9 expressing dSaCas9-KRAB results in efficient transgene expression in the liver and heart, respectively, 8 weeks after transduction in adult wild-type mice. FIGS. 5B and 5D are bar graphs showing that delivery of dSaCas9-KRAB with Acvr2b gRNA reduces target gene expression in the liver and heart, respectively, at 8 weeks after treatment. (n=3 mice, * indicates p<0.05 by Student's t-test compared to PBS sham controls)

[0182] FIG. 6 is a graph showing a restriction map of a lentiviral vector encoding S. aureus Cas9 KRAB-based repressor.

[0183] FIG. 7 is a graph showing a restriction map of an AAV vector encoding S. aureus Cas9 KRAB-based repressor.

[0184] FIG. 8 is a graph showing a restriction map of an AAV vector encoding S. aureus Cas9 U6-gRNA.

[0185] FIG. 9 is a graph showing a restriction map of an AAV vector encoding S. aureus Cas9 U6-gRNA.

[0186] FIGS. 10A-10C are schematics showing an AAV-based gene delivery system for CRISPR/Cas9-based synthetic repressors. In FIG. 10A, a nuclease-null S. aureus dCas9 DNA-binding domain was generated by introducing two catalytically inactivating mutations to the nuclease domains of Cas9. dCas9 derived from S. aureus was fused to a KRAB synthetic repressor to create a synthetic repressor for in vivo gene delivery. Dual vector (FIG. 10B) and single AAV vector (FIG. 10C) platforms were designed to efficiently express dCas9-KRAB and a custom guide RNA target molecule in vivo.

[0187] FIGS. 11A-11C are graphs showing targeted reduction of the PCSK9 gene in vivo with engineered synthetic repressors. FIG. 11A is a schematic showing vectors used for targeted reduction of PCSK9 expression. S. aureus dCas9-KRAB (dCas9-KRAB) was targeted to the mouse PCSK9 gene and delivered in a dual-vector AAV system intravenously in C57Bl/6 wild-type 7-week old mice. At 2 weeks post-systemic treatment, circulating PCSK9 (FIG. 11B) and total cholesterol levels (FIG. 11C) are significantly repressed in the serum compared to sham PBS-injected controls and dCas9-KRAB-treated controls without a guide RNA (* indicates p<0.05 by Student's t-test compared to PBS sham controls, n=4 mice per condition).

[0188] FIGS. 12A-12E are graphs showing results from a study in which mice were intravenously administered with PBS, or AAV vectors encoding dSaCas9-KRAB (dCK) alone, or low-dose dSaCas9-KRAB (dCK) and PCSK9 guide RNA (gRNA). FIG. 12A is a graph showing serum PCSK9 levels for the three treatment groups as measured by ELISA. FIG. 12B is a bar graph showing relative PCSK9 mRNA levels in the liver, as normalized to GAPDH mRNA levels, for the three treatment groups. FIG. 12C is a graph showing data from an RNA-Seq study comparing the RNA levels in the liver in the dSaCas9-KRAB and gRNA treatment group with those in the dSaCas9-KRAB alone treatment group. The dot representing PCSK9 RNA levels is labeled in the figure. FIGS. 12D and 12E are graphs showing the serum levels of total and LDL cholesterol for the three treatment groups as measured in a colorimetric assay.

[0189] FIGS. 13A-13F are graphs showing results from a study in which mice were intravenously administered with PBS, or AAV vectors encoding dSaCas9-KRAB (dCK) alone, PCSK9 guide RNA (gRNA) alone, or moderate-dose dSaCas9-KRAB (dCK) and PCSK9 guide RNA (gRNA). FIGS. 13A and 13B are graphs showing serum PCSK9 levels for the three treatment groups as measured by ELISA. In FIG. 13B, the serum PCSK9 levels are normalized to the levels at Day 0. FIGS. 13C and 13D are graphs showing total cholesterol levels in the serum for the three treatment groups as measured in a colorimetric assay. In FIG. 13D, the serum total cholesterol levels are normalized to the levels at Day 0. FIGS. 13E and 13F are graphs showing LDL cholesterol levels in the serum for the three treatment groups as measured in a colorimetric assay. In FIG. 13F, the serum LDL cholesterol levels are normalized to the levels at Day 0.

[0190] FIGS. 14A-14C are graphs showing results from a study in which mice were intravenously administered with PBS, moderate-dose, or high-dose of AAV vectors encoding dSaCas9-KRAB and PCSK9 gRNA. FIG. 14A is a graph showing serum PCSK9 levels for the three treatment groups as measured by ELISA. FIGS. 14B and 14C are graphs showing total cholesterol levels in the serum. FIG. 14D is a graph showing LDL cholesterol levels in the serum.

DETAILED DESCRIPTION

[0191] For the purposes of promoting an understanding of the principles of the present disclosure, reference will now be made to preferred embodiments and specific language will be used to describe the same. It will nevertheless be understood that no limitation of the scope of the disclosure is thereby intended, such alteration and further modifications of the disclosure as illustrated herein, being contemplated as would normally occur to one skilled in the art to which the disclosure relates.

[0192] Articles "a" and "an" are used herein to refer to one or to more than one (i.e. at least one) of the grammatical object of the article. By way of example, "an element" means at least one element and can include more than one element.

[0193] Unless otherwise defined, all technical terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.

A. Definitions

[0194] As used herein, the term "coding sequence" or "encoding nucleic acid" means the nucleic acids (RNA or DNA molecule) that comprise a nucleotide sequence which encodes a protein. The coding sequence can further include initiation and termination signals operably linked to regulatory elements including a promoter and polyadenylation signal capable of directing expression in the cells of an individual or mammal to which the nucleic acid is administered. The coding sequence may be codon optimized.

[0195] The term "complement" or "complementary" as used herein with reference to a nucleic acid can mean Watson-Crick (e.g., A-T/U and C-G) or Hoogsteen base pairing between nucleotides or nucleotide analogs of nucleic acid molecules. "Complementarity" refers to a property shared between two nucleic acid sequences, such that when they are aligned antiparallel to each other, the nucleotide bases at each position will be complementary.
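By way of non-limiting illustration only, the antiparallel Watson-Crick complementarity described above may be sketched in Python as follows. The function names `reverse_complement` and `are_complementary` are hypothetical and are not part of the disclosure; Hoogsteen pairing and nucleotide analogs are not modeled in this sketch.

```python
# Illustrative sketch only: Watson-Crick pairing (A-T/U and C-G).
# Hoogsteen pairing and nucleotide analogs are NOT modeled here.
PAIRS = {"A": "T", "T": "A", "U": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of a DNA or RNA sequence."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if every position pairs when the strands are aligned antiparallel.

    Both strands are written 5' to 3'; U is treated as equivalent to T.
    """
    normalized_b = strand_b.upper().replace("U", "T")
    return normalized_b == reverse_complement(strand_a)


if __name__ == "__main__":
    print(reverse_complement("ATGC"))
    print(are_complementary("ATGC", "GCAU"))
```

In this sketch, a DNA strand and an RNA strand can be compared directly because uracil is normalized to thymine before the comparison.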

[0196] The terms "correcting", "genome editing" and "restoring" refer to changing a mutant gene that encodes a mutant protein, a truncated protein or no protein at all, such that a full-length functional or partially full-length functional protein expression is obtained. Correcting or restoring a mutant gene may include replacing the region of the gene that has the mutation or replacing the entire mutant gene with a copy of the gene that does not have the mutation with a repair mechanism such as homology-directed repair (HDR). Correcting or restoring a mutant gene may also include repairing a frameshift mutation that causes a premature stop codon, an aberrant splice acceptor site or an aberrant splice donor site, by generating a double stranded break in the gene that is then repaired using non-homologous end joining (NHEJ). NHEJ may add or delete at least one base pair during repair which may restore the proper reading frame and eliminate the premature stop codon. Correcting or restoring a mutant gene may also include disrupting an aberrant splice acceptor site or splice donor sequence. Correcting or restoring a mutant gene may also include deleting a non-essential gene segment by the simultaneous action of two nucleases on the same DNA strand in order to restore the proper reading frame by removing the DNA between the two nuclease target sites and repairing the DNA break by NHEJ.

[0197] As used herein, the terms "donor DNA", "donor template" and "repair template" refer to a double-stranded DNA fragment or molecule that includes at least a portion of the gene of interest. The donor DNA may encode a full-functional protein or a partially-functional protein.

[0198] As used herein, the terms "frameshift" and "frameshift mutation" are used interchangeably and refer to a type of gene mutation wherein the addition or deletion of one or more nucleotides causes a shift in the reading frame of the codons in the mRNA. The shift in reading frame may lead to an alteration in the amino acid sequence at protein translation, such as a missense mutation or a premature stop codon.
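By way of non-limiting illustration only, the effect of a frameshift on the reading frame may be sketched in Python as follows. The toy sequence and the helper names `codons` and `first_stop_index` are hypothetical and chosen solely for this example; they do not correspond to any sequence of the disclosure.

```python
from typing import List, Optional

# Stop codons written in DNA (coding strand) notation.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def codons(seq: str) -> List[str]:
    """Split a coding sequence into complete triplets, starting at position 0."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop_index(seq: str) -> Optional[int]:
    """Return the codon index of the first in-frame stop codon, or None."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


# Hypothetical toy open reading frame: ATG GCA TTG ACC GGA AAA TAA
reference = "ATGGCATTGACCGGAAAATAA"

# Deleting a single nucleotide (index 3) shifts every downstream codon.
mutant = reference[:3] + reference[4:]

if __name__ == "__main__":
    print(codons(reference), first_stop_index(reference))
    print(codons(mutant), first_stop_index(mutant))
```

In this toy example the single-nucleotide deletion both changes the downstream codons and brings a stop codon into frame earlier than in the reference sequence, which is one way a frameshift can yield a premature stop codon.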

[0199] As used herein, the terms "functional" and "full-functional" describe a protein that has biological activity. A "functional gene" refers to a gene transcribed to mRNA, which is translated to a functional protein.

[0200] As used herein, the term "fusion protein" refers to a chimeric protein created through the covalent or non-covalent joining of two or more genes, directly or indirectly, that originally coded for separate proteins. In some embodiments, the translation of the fusion gene results in a single polypeptide with functional properties derived from each of the original proteins.

[0201] As used herein, the term "genetic construct" refers to the DNA or RNA molecules that comprise a nucleotide sequence that encodes a protein. The coding sequence includes initiation and termination signals operably linked to regulatory elements including a promoter and polyadenylation signal capable of directing expression in cells.

[0202] The term "Homology-directed repair" or "HDR" as used interchangeably herein refers to a mechanism in cells to repair double strand DNA lesions when a homologous piece of DNA is present in the nucleus, mostly in G2 and S phase of the cell cycle. HDR uses a donor DNA template to guide repair and may be used to create specific sequence changes to the genome, including the targeted addition of whole genes. If a donor template is provided along with the site specific nuclease, such as with a CRISPR/Cas9-based system, then the cellular machinery will repair the break by homologous recombination, which is enhanced several orders of magnitude in the presence of DNA cleavage. When the homologous DNA piece is absent, nonhomologous end joining may take place instead.

[0203] The term "genome editing" as used herein refers to changing a gene. Genome editing may include correcting or restoring a mutant gene. Genome editing may include knocking out a gene, such as a mutant gene or a normal gene. Genome editing may be used to treat disease or enhance muscle repair by changing the gene of interest.

[0204] The term "identical" or "identity" as used herein in the context of two or more nucleic acids or polypeptide sequences means that the sequences have a specified percentage of residues that are the same over a specified region. The percentage may be calculated by optimally aligning the two sequences, comparing the two sequences over the specified region, determining the number of positions at which the identical residue occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the specified region, and multiplying the result by 100 to yield the percentage of sequence identity. In cases where the two sequences are of different lengths or the alignment produces one or more staggered ends and the specified region of comparison includes only a single sequence, the residues of the single sequence are included in the denominator but not the numerator of the calculation. When comparing DNA and RNA, thymine (T) and uracil (U) may be considered equivalent. Identity may be determined manually or by using a computer sequence algorithm such as BLAST or BLAST 2.0. Identity of related peptides can be readily calculated by known methods. Such methods include, but are not limited to, those described in Computational Molecular Biology, Lesk, A. M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D. W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A. M., and Griffin, H. G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M. Stockton Press, New York, 1991; and Carillo et al., SIAM J. Applied Math. 48, 1073 (1988), herein incorporated by reference in their entirety.
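By way of non-limiting illustration only, the percentage calculation described above may be sketched in Python as follows. The sketch assumes that the two sequences have already been optimally aligned (for example by an external alignment tool) and that gaps or staggered ends are written as "-"; the function name `percent_identity` and its `nucleic` parameter are hypothetical. It is not a substitute for BLAST or any other alignment algorithm.

```python
def percent_identity(aligned_a: str, aligned_b: str, nucleic: bool = True) -> float:
    """Percent identity over a pre-aligned region.

    Matched positions are divided by the total number of positions in the
    region and multiplied by 100. Positions where only one sequence has a
    residue (written as '-') count in the denominator but never in the
    numerator. For nucleic acids, U is treated as equivalent to T.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("the compared region must not be empty")

    a, b = aligned_a.upper(), aligned_b.upper()
    if nucleic:
        a, b = a.replace("U", "T"), b.replace("U", "T")

    matched = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matched / len(a)


if __name__ == "__main__":
    # Hypothetical toy alignment: a DNA strand against a longer RNA strand.
    print(percent_identity("ATGCATGC--", "AUGCTTGCAA"))
```

In the toy alignment, seven of the ten positions match, the one mismatch and the two staggered-end positions remain in the denominator, and the function returns 70.0.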

[0205] As used herein, the terms "mutant gene" and "mutated gene" are used interchangeably and refer to a gene that has undergone a detectable mutation. A mutant gene has undergone a change, such as the loss, gain, or exchange of genetic material, which affects the normal transmission and expression of the gene. A "disrupted gene" as used herein refers to a mutant gene that has a mutation that causes a premature stop codon. The disrupted gene product is truncated relative to a full-length undisrupted gene product.

[0206] The term "non-homologous end joining (NHEJ) pathway" as used herein refers to a pathway that repairs double-strand breaks in DNA by directly ligating the break ends without the need for a homologous template. The template-independent re-ligation of DNA ends by NHEJ is a stochastic, error-prone repair process that introduces random micro-insertions and micro-deletions (indels) at the DNA breakpoint. This method may be used to intentionally disrupt, delete, or alter the reading frame of targeted gene sequences. NHEJ typically uses short homologous DNA sequences called microhomologies to guide repair. These microhomologies are often present in single-stranded overhangs on the end of double-strand breaks. When the overhangs are perfectly compatible, NHEJ usually repairs the break accurately. Imprecise repair leading to loss of nucleotides may also occur, and is much more common when the overhangs are not compatible.

[0207] The term "normal gene" as used herein refers to a gene that has not undergone a change, such as a loss, gain, or exchange of genetic material. The normal gene undergoes normal gene transmission and gene expression.

[0208] The term "nuclease mediated NHEJ" as used herein refers to NHEJ that is initiated after a nuclease, such as a Cas9, cuts double stranded DNA.

[0209] As used herein, the term "nucleic acid" or "oligonucleotide" or "polynucleotide" means at least two nucleotides covalently linked together. The depiction of a single strand also defines the sequence of the complementary strand. Thus, a nucleic acid also encompasses the complementary strand of a depicted single strand. Many variants of a nucleic acid may be used for the same purpose as a given nucleic acid. Thus, a nucleic acid also encompasses substantially identical nucleic acids and complements thereof. A single strand provides a probe that may hybridize to a target sequence under stringent hybridization conditions. Thus, a nucleic acid also encompasses a probe that hybridizes under stringent hybridization conditions. Nucleic acids may be single stranded or double stranded, or may contain portions of both double stranded and single stranded sequence. The nucleic acid may be DNA, both genomic and cDNA, RNA, or a hybrid, where the nucleic acid may contain combinations of deoxyribo- and ribo-nucleotides, and combinations of bases including uracil, adenine, thymine, cytosine, guanine, inosine, xanthine, hypoxanthine, isocytosine and isoguanine. Nucleic acids may be obtained by chemical synthesis methods or by recombinant methods.

[0210] As used herein, the term "operably linked" means that expression of a gene is under the control of a promoter with which it is spatially connected. A promoter may be positioned 5' (upstream) or 3' (downstream) of a gene under its control. The distance between the promoter and a gene may be approximately the same as the distance between that promoter and the gene it controls in the gene from which the promoter is derived. As is known in the art, variation in this distance may be accommodated without loss of promoter function.

[0211] The term "partially-functional" as used herein describes a protein that is encoded by a mutant gene and has less biological activity than a functional protein but more than a non-functional protein. In one embodiment, a partially-functional protein shows a biological activity that is less than 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, or 30% of that of a corresponding functional protein.

[0212] The term "premature stop codon" or "out-of-frame stop codon" as used interchangeably herein refers to a nonsense mutation in a sequence of DNA, which results in a stop codon at a location not normally found in the wild-type gene. A premature stop codon may cause a protein to be truncated or shorter compared to the full-length version of the protein.

[0213] The term "promoter" as used herein means a synthetic or naturally-derived molecule which is capable of conferring, activating or enhancing expression of a nucleic acid in a cell. A promoter may comprise one or more specific transcriptional regulatory sequences to further enhance expression and/or to alter the spatial expression and/or temporal expression of a nucleic acid. A promoter may also comprise distal enhancer or repressor elements, which may be located as much as several thousand base pairs from the start site of transcription. A promoter may be derived from sources including viral, bacterial, fungal, plants, insects, and animals. A promoter may regulate the expression of a gene component constitutively, or differentially with respect to the cell, tissue or organ in which expression occurs, or with respect to the developmental stage at which expression occurs, or in response to external stimuli such as physiological stresses, pathogens, metal ions, or inducing agents. Representative examples of promoters include the bacteriophage T7 promoter, bacteriophage T3 promoter, SP6 promoter, lac operator-promoter, tac promoter, SV40 late promoter, SV40 early promoter, RSV-LTR promoter, and CMV IE promoter.

[0214] The term "target gene" as used herein refers to any nucleotide sequence encoding a known or putative gene product. The target gene may be a mutated gene involved in a genetic disease or disorder.

[0215] The term "target region" as used herein refers to the region of the target gene to which the site-specific nuclease is designed to bind.

[0216] As used herein, the term "transgene" refers to a gene or genetic material containing a gene sequence that has been isolated from one organism and is introduced into a different organism. Alternatively, the term "transgene" also refers to a gene or genetic material that is chemically synthesized and introduced into an organism. This non-native segment of DNA may retain the ability to produce RNA or protein in the transgenic organism, or it may alter the normal function of the transgenic organism's genetic code. The introduction of a transgene has the potential to change the phenotype of an organism.

[0217] As used herein, the term "variant" when used with respect to a nucleic acid means (i) a portion or fragment of a referenced nucleotide sequence; (ii) the complement of a referenced nucleotide sequence or portion thereof; (iii) a nucleic acid that is substantially identical to a referenced nucleic acid or the complement thereof; or (iv) a nucleic acid that hybridizes under stringent conditions to the referenced nucleic acid, complement thereof, or a sequence substantially identical thereto. "Variant" with respect to a peptide or polypeptide means a peptide or polypeptide that differs in amino acid sequence by the insertion, deletion, or conservative substitution of amino acids, but retains at least one biological activity. Variant may also mean a protein with an amino acid sequence that is substantially identical to a referenced protein with an amino acid sequence that retains at least one biological activity. A conservative substitution of an amino acid, i.e., replacing an amino acid with a different amino acid of similar properties (e.g., hydrophilicity, degree and distribution of charged regions) is recognized in the art as typically involving a minor change. These minor changes may be identified, in part, by considering the hydropathic index of amino acids, as understood in the art. Kyte et al., J. Mol. Biol. 157: 105-132 (1982), incorporated herein by reference in its entirety. The hydropathic index of an amino acid is based on a consideration of its hydrophobicity and charge. It is known in the art that amino acids of similar hydropathic indexes may be substituted and still retain protein function. In one aspect, amino acids having hydropathic indexes of ±2 are substituted. The hydrophilicity of amino acids may also be used to reveal substitutions that would result in proteins retaining biological function. A consideration of the hydrophilicity of amino acids in the context of a peptide permits calculation of the greatest local average hydrophilicity of that peptide. Substitutions may be performed with amino acids having hydrophilicity values within ±2 of each other. Both the hydrophobicity index and the hydrophilicity value of amino acids are influenced by the particular side chain of that amino acid. Consistent with that observation, amino acid substitutions that are compatible with biological function are understood to depend on the relative similarity of the amino acids, and particularly the side chains of those amino acids, as revealed by the hydrophobicity, hydrophilicity, charge, size, and other properties.
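By way of non-limiting illustration only, the hydropathic index comparison described above may be sketched in Python as follows, using the hydropathy values published by Kyte et al. (1982). The function name `within_hydropathic_tolerance` and the default tolerance argument are hypothetical; the sketch considers the hydropathic index alone and does not account for charge, size, or other side chain properties, nor does it predict whether a given substitution retains biological activity.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
KYTE_DOOLITTLE = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5,
    "M": 1.9, "A": 1.8, "G": -0.4, "T": -0.7, "S": -0.8,
    "W": -0.9, "Y": -1.3, "P": -1.6, "H": -3.2, "E": -3.5,
    "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9, "R": -4.5,
}


def within_hydropathic_tolerance(original: str, replacement: str,
                                 tolerance: float = 2.0) -> bool:
    """True if the two residues' hydropathic indexes differ by at most `tolerance`."""
    difference = KYTE_DOOLITTLE[original.upper()] - KYTE_DOOLITTLE[replacement.upper()]
    return abs(difference) <= tolerance


if __name__ == "__main__":
    for pair in (("I", "V"), ("K", "R"), ("L", "D")):
        print(pair, within_hydropathic_tolerance(*pair))
```

In this sketch, isoleucine to valine and lysine to arginine fall within the ±2 window, whereas leucine to aspartate does not.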

[0218] As used herein, the term "vector" means a nucleic acid sequence containing an origin of replication. A vector may be a viral vector, bacteriophage, bacterial artificial chromosome or yeast artificial chromosome. A vector may be a DNA or RNA vector. A vector may be a self-replicating extrachromosomal vector, such as a DNA plasmid.

[0219] As used herein, the terms "gene transfer," "gene delivery," and "gene transduction" refer to methods or systems for reliably inserting a particular nucleotide sequence (e.g., DNA or RNA), fusion protein, polypeptide and the like into targeted cells. The vector may also comprise an adeno-associated virus (AAV) vector. As used herein, the terms "adenoviral associated virus (AAV) vector," "AAV gene therapy vector," and "gene therapy vector" refer to a vector having functional or partly functional ITR sequences and transgenes. As used herein, the term "ITR" refers to inverted terminal repeats (ITR). The ITR sequences may be derived from an adeno-associated virus serotype, including without limitation, AAV-1, AAV-2, AAV-3, AAV-4, AAV-5, and AAV-6. The ITRs, however, need not be the wild-type nucleotide sequences, and may be altered (e.g., by the insertion, deletion or substitution of nucleotides), so long as the sequences retain function to provide for functional rescue, replication and packaging. AAV vectors may have one or more of the AAV wild-type genes deleted in whole or part, preferably the rep and/or cap genes, but retain functional flanking ITR sequences. Functional ITR sequences function to, for example, rescue, replicate and package the AAV virion or particle. Thus, an "AAV vector" is defined herein to include at least those sequences required for insertion of the transgene into a subject's cells. Optionally included are those sequences necessary in cis for replication and packaging (e.g., functional ITRs) of the virus.

[0220] As used herein, the term "gene therapy" refers to a method of treating a patient wherein polypeptides or nucleic acid sequences are transferred into cells of a patient such that activity and/or the expression of a particular gene is modulated. In certain embodiments, the expression of the gene is suppressed. In certain embodiments, the expression of the gene is enhanced. In certain embodiments, the temporal or spatial pattern of the expression of the gene is modulated.

[0221] The terms "adeno-associated virus inverted terminal repeats" or "AAV ITRs" refer to the palindromic regions found at each end of the AAV genome which function together in cis as origins of DNA replication and as packaging signals for the virus. For use in some embodiments of the present invention, flanking AAV ITRs are positioned 5' and 3' of one or more selected heterologous nucleotide sequences. Optionally, the ITRs together with the rep coding region or the Rep expression products provide for the integration of the selected sequences into the genome of a target cell.

[0222] As used herein, the term "AAV rep coding region" refers to the region of the AAV genome that encodes the replication proteins Rep 78, Rep 68, Rep 52 and Rep 40. These Rep expression products have been shown to possess many functions, including recognition, binding and nicking of the AAV origin of DNA replication, DNA helicase activity and modulation of transcription from AAV (or other heterologous) promoters. The Rep expression products are collectively required for replicating the AAV genome. Muzyczka (Muzyczka, Curr. Top. Microbiol. Immunol., 158:97-129 (1992)) and Kotin (Kotin, Hum. Gene Ther., 5:793-801 (1994)), incorporated herein by reference in their entirety, provide additional descriptions of the AAV rep coding region, as well as the cap coding region described below. Suitable homologues of the AAV rep coding region include the human herpesvirus 6 (HHV-6) rep gene which is also known to mediate AAV-2 DNA replication (Thomson et al., Virol., 204:304-311 (1994), incorporated herein by reference in its entirety).

[0223] As used herein, the term "AAV cap coding region" refers to the region of the AAV genome that encodes the capsid proteins VP1, VP2, and VP3, or functional homologues thereof. These cap expression products supply the packaging functions, which are collectively required for packaging the viral genome. In some embodiments, AAV2 Cap proteins may be used.

[0224] As used herein, the term "AAV helper function" refers to AAV coding regions capable of being expressed in a host cell to complement AAV viral functions missing from the AAV vector. Typically, the AAV helper functions include the AAV rep coding region and the AAV cap coding region. The helper functions may be contained in a "helper plasmid" or "helper construct." An AAV helper construct as used herein, refers to a molecule that provides all or part of the elements necessary for AAV replication and packaging. Such AAV helper constructs may be a plasmid, virus or genes integrated into cell lines or into the cells of a subject. It may be provided as DNA, RNA, or protein. The elements do not have to be arranged co-linearly (i.e., in the same molecule). For example, rep78 and rep68 may be on different molecules. An "AAV helper construct" may be, for example, a vector containing AAV coding regions required to complement AAV viral functions missing from the AAV vector (e.g., the AAV rep coding region and the AAV cap coding region).

[0225] As used herein, the terms "accessory functions" and "accessory factors" refer to functions and factors that are required by AAV for replication, but are not provided by the AAV vector or AAV helper construct. Thus, these accessory functions and factors must be provided by the host cell, a virus (e.g., adenovirus or herpes simplex virus), or another expression vector that is co-expressed in the same cell. Generally, the E1, E2A, E4 and VA coding regions of adenovirus are used to supply the necessary accessory function required for AAV replication and packaging (Matsushita et al., Gene Therapy 5:938 (1998), incorporated herein by reference in its entirety).

[0226] Portions of the AAV genome have the capability of integrating into the DNA of cells to which it is introduced. As used herein, "integrate," refers to portions of the genetic construct that become covalently bound to the genome of the cell to which it is administered, for example through the mechanism of action mediated by the AAV Rep protein and the AAV ITRs. For example, the AAV virus has been shown to integrate at 19q13.3-qter in the human genome. The minimal elements for AAV integration are the inverted terminal repeat (ITR) sequences and a functional Rep 78/68 protein. In some embodiments, the present invention incorporates the ITR sequences into a vector for integration to facilitate the integration of the transgene into the host cell genome for sustained transgene expression. The genetic transcript may also integrate into other chromosomes if the chromosomes contain the AAV integration site.

[0227] The predictability of insertion site reduces the danger of random insertional events into the cellular genome that may activate or inactivate host genes or interrupt coding sequences, consequences that limit the use of vectors whose integration is random, e.g., retroviruses. The Rep protein mediates the integration of the genetic construct containing the AAV ITRs and the transgene. The use of AAV is advantageous for its predictable integration site and because it has not been associated with human or non-human primate diseases, thus obviating many of the concerns that have been raised with virus-derived gene therapy vectors.

[0228] "Portion of the genetic construct integrates into a chromosome" refers to the portion of the genetic construct that will become covalently bound to the genome of the cell upon introduction of the genetic construct into the cell via administration of the gene therapy particle. The integration is mediated by the AAV ITRs flanking the transgene and the AAV Rep protein. Portions of the genetic construct that may be integrated into the genome include the transgene and the AAV ITRs.

[0229] The "transgene" may contain a transgenic sequence or a native or wild-type DNA sequence. The transgene may become part of the genome of the primate subject. A transgenic sequence can be partly or entirely species-heterologous, i.e., the transgenic sequence, or a portion thereof, can be from a species which is different from the cell into which it is introduced.

[0230] As used herein, the term "stably maintained" refers to characteristics of a transgenic subject (e.g., a human or non-human primate) that maintains at least one of its transgenic elements (i.e., the element that is desired) through multiple generations of cells. For example, it is intended that the term encompass many cell division cycles of the originally transfected cell. The term "stable transfection" or "stably transfected" refers to the introduction and integration of foreign DNA into the genome of the cell. The term "stable transfectant" refers to a cell that has stably integrated foreign DNA into the genomic DNA.

[0231] As used herein, the terms "transgene encoding," "nucleic acid molecule encoding," "DNA sequence encoding," and "DNA encoding" refer to the order or sequence of deoxyribonucleotides along a strand of deoxyribonucleic acid. The order of these deoxyribonucleotides may, for example, determine the order of amino acids along the polypeptide (protein) chain. The DNA sequence thus may code for the amino acid sequence.

[0232] As used herein, the term "wild type" (wt) refers to a gene or gene product which has the characteristics of that gene or gene product when isolated from a naturally occurring source. A wild-type gene is that which is most frequently observed in a population and is thus arbitrarily designated the "normal" or "wild-type" form of the gene. In contrast, the term "modified" or "mutant" refers to a gene or gene product that displays modifications in sequence and/or functional properties (i.e., altered characteristics) when compared to the wild-type gene or gene product. It is noted that naturally occurring mutants may be isolated, which are identified by the acquisition of altered characteristics when compared to the wild-type gene or gene product.

[0233] As used herein, the terms "AAV virion," "AAV particle," "AAV gene therapy particle," "AAV gene therapy vector," and "rAAV gene therapy vector" refer to a complete virus unit, such as a wt AAV virus particle (comprising a linear, single-stranded AAV nucleic acid genome associated with at least one AAV capsid protein coat). In this regard, single-stranded AAV nucleic acid molecules of either complementary sense (e.g., "sense" or "antisense" strands) can be packaged into any one AAV virion and both strands are equally infectious. Also included are infectious viral particles containing a heterologous DNA molecule of interest (e.g., CFTR or a biologically active portion thereof), which is flanked on both sides by AAV ITRs.

[0234] As used herein, the term "transfection" refers to the uptake of a foreign nucleic acid (e.g., DNA or RNA) by a cell. A cell has been "transfected" when an exogenous nucleic acid (DNA or RNA) has been introduced inside the cell membrane. A number of transfection techniques are generally known in the art (See, e.g., Graham et al., Virol., 52:456 (1973); Sambrook et al., Molecular Cloning, a Laboratory Manual, Cold Spring Harbor Laboratories, N.Y. (1989); Davis et al., Basic Methods in Molecular Biology, Elsevier, (1986); and Chu et al., Gene 13:197 (1981), incorporated herein by reference in their entirety). Such techniques may be used to introduce one or more exogenous DNA moieties, such as a gene transfer vector and other nucleic acid molecules, into suitable recipient cells.

[0235] As used herein, the terms "stable transfection" and "stably transfected" refer to the introduction and integration of foreign DNA into the genome of the transfected cell. The term "stable transfectant" refers to a cell which has stably integrated foreign DNA into the genomic DNA.

[0236] As used herein, the term "transient transfection" or "transiently transfected" refers to the introduction of foreign DNA into a cell wherein the foreign DNA fails to integrate into the genome of the transfected cell and is maintained as an episome. During this time the foreign DNA is subject to the regulatory controls that govern the expression of endogenous genes in the chromosomes. The term "transient transfectant" refers to cells which have taken up foreign DNA but have failed to integrate this DNA. As used herein, the term "transduction" denotes the delivery of a DNA molecule to a recipient cell either in vivo or in vitro, via a replication-defective viral vector, such as via a recombinant AAV virion.

[0237] As used herein, the term "recipient cell" refers to a cell which has been transfected or transduced, or is capable of being transfected or transduced, by a nucleic acid construct or vector bearing a selected nucleotide sequence of interest. The term includes the progeny of the parent cell, whether or not the progeny are identical in morphology or in genetic make-up to the original parent, so long as the selected nucleotide sequence is present. The recipient cell may be the cells of a subject to which the gene therapy particles and/or gene therapy vector has been administered.

[0238] As used herein, the term "recombinant DNA molecule" refers to a DNA molecule which is comprised of segments of DNA joined together by means of molecular biological techniques.

[0239] As used herein, the term "regulatory element" refers to a genetic element which can control the expression of nucleic acid sequences. For example, a promoter is a regulatory element that facilitates the initiation of transcription of an operably linked coding region. Other regulatory elements are splicing signals, polyadenylation signals, termination signals, etc.

[0240] The term DNA "control sequences" refers collectively to regulatory elements such as promoter sequences, polyadenylation signals, transcription termination sequences, upstream regulatory domains, origins of replication, internal ribosome entry sites ("IRES"), enhancers, and the like, which collectively provide for the replication, transcription and translation of a coding sequence in a recipient cell. Not all of these control sequences need be present.

[0241] Transcriptional control signals in eukaryotes generally comprise "promoter" and "enhancer" elements. Promoters and enhancers consist of short arrays of DNA sequences that interact specifically with cellular proteins involved in transcription (Maniatis et al., Science 236:1237 (1987), incorporated herein by reference in its entirety). Promoter and enhancer elements have been isolated from a variety of eukaryotic sources including genes in yeast, insect and mammalian cells and viruses (analogous control sequences, i.e., promoters, are also found in prokaryotes). The selection of a particular promoter and enhancer depends on the recipient cell type. Some eukaryotic promoters and enhancers have a broad host range while others are functional in a limited subset of cell types (See e.g., Voss et al., Trends Biochem. Sci., 11:287 (1986); and Maniatis et al., supra, for reviews, incorporated herein by reference in their entirety). For example, the SV40 early gene enhancer is very active in a variety of cell types from many mammalian species and has been used to express proteins in a broad range of mammalian cells (Dijkema et al, EMBO J. 4:761 (1985), incorporated herein by reference in its entirety). Promoter and enhancer elements derived from the human elongation factor 1-alpha gene (Uetsuki et al., J. Biol. Chem., 264:5791 (1989); Kim et al., Gene 91:217 (1990); and Mizushima and Nagata, Nucl. Acids. Res., 18:5322 (1990)), the long terminal repeats of the Rous sarcoma virus (Gorman et al., Proc. Natl. Acad. Sci. U.S.A. 79:6777 (1982)), and the human cytomegalovirus (Boshart et al., Cell 41:521 (1985)) are also of utility for expression of proteins in diverse mammalian cell types, incorporated herein by reference in their entirety. Promoters and enhancers can be found naturally, alone or together. For example, retroviral long terminal repeats comprise both promoter and enhancer elements. 
Generally promoters and enhancers act independently of the gene being transcribed or translated. Thus, the enhancer and promoter used can be "endogenous," "exogenous," or "heterologous" with respect to the gene to which they are operably linked. An "endogenous" enhancer/promoter is one which is naturally linked with a given gene in the genome. An "exogenous" or "heterologous" enhancer or promoter is one which is placed in juxtaposition to a gene by means of genetic manipulation (i.e., molecular biological techniques) such that transcription of that gene is directed by the linked enhancer/promoter.

[0242] As used herein, the term "CBA" promoter refers to a fusion of the chicken β-actin promoter and CMV immediate-early enhancer.

[0243] As used herein, the term "tissue specific" refers to regulatory elements or control sequences, such as a promoter, an enhancer, etc., wherein the expression of the nucleic acid sequence is substantially greater in a specific cell type(s) or tissue(s). In particularly preferred embodiments, the CB promoter (CB is the same as CBA defined above) displays good expression of human CFTR, rAAV5-CB-Δ264CFTR, rAAV5-CB-Δ27-264CFTR, or another biologically active portion of CFTR. It is not intended, however, that the present invention be limited to the CB promoter or to lung specific expression, as other tissue specific regulatory elements, or regulatory elements that display altered gene expression patterns, are encompassed within the invention.

[0244] The presence of "splicing signals" on an expression vector often results in higher levels of expression of the recombinant transcript. Splicing signals mediate the removal of introns from the primary RNA transcript and consist of a splice donor and acceptor site (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor Laboratory Press, New York (1989), pp. 16.7-16.8, incorporated herein by reference in its entirety). A commonly used splice donor and acceptor site is the splice junction from the 16S RNA of SV40.

[0245] Transcription termination signals are generally found downstream of a polyadenylation signal and are a few hundred nucleotides in length. The term "poly A site" or "poly A sequence" as used herein denotes a DNA sequence which directs both the termination and polyadenylation of the nascent RNA transcript. Efficient polyadenylation of the recombinant transcript is desirable as transcripts lacking a poly A tail are unstable and are rapidly degraded. The poly A signal utilized in an expression vector may be "heterologous" or "endogenous." An endogenous poly A signal is one that is found naturally at the 3' end of the coding region of a given gene in the genome. A heterologous poly A signal is one which has been isolated from one gene and operably linked to the 3' end of another gene. A commonly used heterologous poly A signal is the SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp BamHI/BclI restriction fragment and directs both termination and polyadenylation (Sambrook et al., supra, at 16.6-16.7, incorporated herein by reference in its entirety).

[0246] As used herein, the terms "subject" and "patient" are used interchangeably and refer to both human and nonhuman animals. The term "nonhuman animals" of the disclosure includes all vertebrates, e.g., mammals and non-mammals, such as nonhuman primates, sheep, dog, cat, horse, cow, chickens, amphibians, reptiles, and the like.

[0247] As defined herein, a "therapeutically effective amount" or "therapeutic effective dose" is an amount or dose of a fusion protein, polypeptide, nucleic acid, AAV particle(s), or virion(s) capable of producing sufficient amounts of a desired protein to modulate the activity of the protein in a desired manner, thus providing a palliative tool for clinical intervention. In some embodiments, a therapeutically effective amount or dose of a transfected fusion protein, polypeptide, nucleic acid, AAV particle(s), or virion(s) as described herein is enough to confer suppression of a gene targeted by the fusion protein/gene therapy construct.

[0248] As used herein, the term "treat", e.g., a disorder, means that a subject (e.g., a human) who has a disorder, is at risk of having a disorder, and/or experiences a symptom of a disorder, will, in an embodiment, suffer a less severe symptom and/or will recover faster, when a fusion molecule or a nucleic acid that encodes the fusion molecule, and/or a gRNA or a nucleic acid that encodes the gRNA, e.g., as described herein, is administered than if the fusion molecule or a nucleic acid that encodes the fusion molecule, and/or the gRNA or a nucleic acid that encodes the gRNA, were never administered.

B. CRISPR System

[0249] "Clustered Regularly Interspaced Short Palindromic Repeats" and "CRISPRs", as used interchangeably herein, refer to loci containing multiple short direct repeats that are found in the genomes of approximately 40% of sequenced bacteria and 90% of sequenced archaea. The CRISPR system is a microbial nuclease system involved in defense against invading phages and plasmids that provides a form of acquired immunity. The CRISPR loci in microbial hosts contain a combination of CRISPR-associated (Cas) genes as well as non-coding RNA elements capable of programming the specificity of the CRISPR-mediated nucleic acid cleavage. Short segments of foreign DNA, called spacers, are incorporated into the genome between CRISPR repeats, and serve as a `memory` of past exposures. Cas9 forms a complex with the 3' end of the single guide RNA (sgRNA), and the protein-RNA pair recognizes its genomic target by complementary base pairing between the 5' end of the sgRNA sequence and a predefined 20 bp DNA sequence, known as the protospacer. This complex is directed to homologous loci of pathogen DNA via regions encoded within the CRISPR RNA (crRNA), i.e., the protospacers, and protospacer-adjacent motifs (PAMs) within the pathogen genome. The non-coding CRISPR array is transcribed and cleaved within direct repeats into short crRNAs containing individual spacer sequences, which direct Cas nucleases to the target site (protospacer). By simply exchanging the 20 bp recognition sequence of the expressed sgRNA, the Cas9 nuclease can be directed to new genomic targets. CRISPR spacers are used to recognize and silence exogenous genetic elements in a manner analogous to RNAi in eukaryotic organisms.

[0250] Three classes of CRISPR systems (Types I, II and III effector systems) are known. The Type II effector system carries out targeted DNA double-strand break in four sequential steps, using a single effector enzyme, Cas9, to cleave dsDNA. Compared to the Type I and Type III effector systems, which require multiple distinct effectors acting as a complex, the Type II effector system may function in alternative contexts such as eukaryotic cells. The Type II effector system consists of a long pre-crRNA, which is transcribed from the spacer-containing CRISPR locus, the Cas9 protein, and a trans-encoded small RNA (tracrRNA), which is involved in pre-crRNA processing. The tracrRNAs hybridize to the repeat regions separating the spacers of the pre-crRNA, thus initiating dsRNA cleavage by endogenous RNase III. This cleavage is followed by a second cleavage event within each spacer by Cas9, producing mature crRNAs that remain associated with the tracrRNA and Cas9, forming a Cas9:crRNA-tracrRNA complex.

[0251] The Cas9:crRNA-tracrRNA complex unwinds the DNA duplex and searches for sequences matching the crRNA to cleave. Target recognition occurs upon detection of complementarity between a "protospacer" sequence in the target DNA and the remaining spacer sequence in the crRNA. Cas9 mediates cleavage of target DNA if a correct protospacer-adjacent motif (PAM) is also present at the 3' end of the protospacer. For protospacer targeting, the sequence must be immediately followed by the protospacer-adjacent motif (PAM), a short sequence recognized by the Cas9 nuclease that is required for DNA cleavage. Different Type II systems have differing PAM requirements. The PAM sequence for the S. pyogenes Cas9 (SpCas9) may be 5'-NRG-3', where R is either A or G, and the specificity of this system has been characterized in human cells. A unique capability of the CRISPR/Cas9 system is the straightforward ability to simultaneously target multiple distinct genomic loci by co-expressing a single Cas9 protein with two or more sgRNAs. For example, the Streptococcus pyogenes (S. pyogenes) Type II system naturally prefers to use an "NGG" sequence, where "N" can be any nucleotide, but also accepts other PAM sequences, such as "NAG" in engineered systems (Hsu et al., Nature Biotechnology (2013) doi: 10.1038/nbt.2647, incorporated herein by reference in its entirety). Similarly, the Cas9 derived from Neisseria meningitidis (NmCas9) normally has a native PAM of NNNNGATT, but has activity across a variety of PAMs, including a highly degenerate NNNNGNNN PAM (Esvelt et al., Nature Methods (2013) doi: 10.1038/nmeth.2681, incorporated herein by reference in its entirety).
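
As a non-limiting illustration of the PAM requirement described above, the following Python sketch scans one strand of a DNA sequence for candidate protospacers that are immediately followed by a given PAM. The function names are hypothetical, the default 20 bp protospacer length follows the description above, and only the strand supplied is searched; a practical tool would presumably also scan the reverse complement and assess off-target matches, which this sketch does not attempt.

```python
import re

# IUPAC nucleotide codes needed for the PAMs discussed above (NGG, NAG, NRG, NNNNGATT).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]", "R": "[AG]"}


def pam_regex(pam):
    """Translate a PAM written with IUPAC codes into a regular-expression fragment."""
    return "".join(IUPAC[base] for base in pam.upper())


def find_protospacers(dna, pam="NGG", length=20):
    """Return (start, protospacer, pam_found) for every site on the given strand
    where `length` bases are immediately followed by a sequence matching `pam`.

    A lookahead is used so that overlapping candidate sites are all reported.
    """
    pattern = re.compile(r"(?=([ACGT]{%d})(%s))" % (length, pam_regex(pam)))
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna.upper())]


if __name__ == "__main__":
    # Hypothetical toy target: 20 bases followed by a TAG trinucleotide.
    target = "A" * 20 + "TAG"
    print(find_protospacers(target, pam="NGG"))  # strict NGG PAM
    print(find_protospacers(target, pam="NRG"))  # relaxed NRG PAM
```

Passing `pam="NNNNGATT"` with the same function would apply the NmCas9 PAM mentioned above.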

C. CRISPR/CAS9-Based System

[0252] An engineered form of the Type II effector system of S. pyogenes was shown to function in human cells for genome engineering. In this system, the Cas9 protein was directed to genomic target sites by a synthetically reconstituted "guide RNA" ("gRNA", also used interchangeably herein as a chimeric single guide RNA ("sgRNA")), which is a crRNA-tracrRNA fusion that obviates the need for RNase III and crRNA processing in general. Provided herein are CRISPR/Cas9-based engineered systems for use in genome editing and treating genetic diseases. The CRISPR/Cas9-based engineered systems may be designed to target any gene, including genes involved in a genetic disease, aging, tissue regeneration, or wound healing. The CRISPR/Cas9-based systems may include a Cas9 protein or Cas9 fusion protein and at least one gRNA. The Cas9 fusion protein may, for example, include a domain that has a different activity from what is endogenous to Cas9, such as a transactivation domain.

[0253] The target gene may be involved in differentiation of a cell or any other process in which activation of a gene may be desired, or may have a mutation such as a frameshift mutation or a nonsense mutation. If the target gene has a mutation that causes a premature stop codon, an aberrant splice acceptor site or an aberrant splice donor site, the CRISPR/Cas9-based system may be designed to recognize and bind a nucleotide sequence upstream or downstream from the premature stop codon, the aberrant splice acceptor site or the aberrant splice donor site. The CRISPR/Cas9-based system may also be used to disrupt normal gene splicing by targeting splice acceptors and donors to induce skipping of premature stop codons or restore a disrupted reading frame. The CRISPR/Cas9-based system may or may not mediate off-target changes to protein-coding regions of the genome. In some embodiments, the expression of the target gene is to be suppressed.

D. Cas9

[0254] The CRISPR/Cas9-based system may include a Cas9 protein or a fragment thereof, a Cas9 fusion protein, a nucleic acid encoding a Cas9 protein or a fragment thereof, or a nucleic acid encoding a Cas9 fusion protein. As used herein, a "Cas9 molecule" may refer to a Cas9 protein, or a fragment thereof. Cas9 protein is an endonuclease that cleaves nucleic acid and is encoded by the CRISPR loci and is involved in the Type II CRISPR system. The Cas9 protein may be from any bacterial or archaea species, such as Streptococcus pyogenes. Cas9 sequences and structures from different species are known in the art, see, e.g., Ferretti et al., Proc Natl Acad Sci USA. (2001); 98(8): 4658-63; Deltcheva et al., Nature. 2011 Mar. 31; 471(7340):602-7; and Jinek et al., Science. (2012); 337(6096):816-21, incorporated herein by reference in their entirety. Exemplary S. pyogenes Cas9 sequence is available at the Uniprot database under accession number Q99ZW2. Exemplary Staphylococcus aureus (S. aureus) Cas9 sequence is available at the Uniprot database under accession number J7RUA5. Exemplary Cas9 sequences are also shown in Table 1.

[0255] S. pyogenes Cas9 is the most commonly studied Cas9 molecule. Notably, S. pyogenes Cas9 is quite large (the gene itself is over 4.1 Kb), making it challenging to package into certain delivery vectors. For example, an adeno-associated virus (AAV) vector has a packaging limit of 4.5 or 4.75 Kb. This means that Cas9 as well as regulatory elements such as a promoter and a transcription terminator all have to fit into the same viral vector. Constructs larger than 4.5 or 4.75 Kb will lead to significantly reduced virus production. One possibility is to use a functional fragment of S. pyogenes Cas9. Another possibility is to split Cas9 into its sub-portions (e.g., the N-terminal lobe and the C-terminal lobe of Cas9). Each sub-portion is expressed by a separate vector, and these sub-portions associate to form a functional Cas9. See, e.g., Chew et al., Nat Methods. 2016; 13:868-74; Truong et al., Nucleic Acids Res. 2015; 43: 6450-6458; and Fine et al., Sci Rep. 2015; 5:10777, incorporated by reference herein in their entirety.
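
The packaging constraint described above may be illustrated with a short calculation. The following Python sketch is offered under stated assumptions only: the Cas9 lengths of 1368 amino acids (S. pyogenes) and 1053 amino acids (S. aureus) are the commonly reported lengths for these proteins, while the promoter and polyadenylation signal sizes are hypothetical placeholders chosen for illustration and are not taken from this disclosure. Other elements a real construct may need (e.g., ITRs, a gRNA cassette, or a fused repressor domain) are not counted.

```python
# Packaging limits stated above for an AAV vector, in base pairs.
AAV_PACKAGING_LIMITS_BP = (4500, 4750)


def coding_length_bp(num_amino_acids, include_stop=True):
    """Length of an open reading frame: three bases per residue plus an optional stop codon."""
    return 3 * num_amino_acids + (3 if include_stop else 0)


def fits_in_aav(element_lengths_bp, limit_bp):
    """Return (total_length, fits) for a dict mapping element name -> length in bp."""
    total = sum(element_lengths_bp.values())
    return total, total <= limit_bp


if __name__ == "__main__":
    # Hypothetical regulatory element sizes, for illustration only.
    regulatory = {"promoter": 500, "polyA_signal": 200}
    for name, residues in (("SpCas9", 1368), ("SaCas9", 1053)):
        construct = dict(regulatory, cas9=coding_length_bp(residues))
        for limit in AAV_PACKAGING_LIMITS_BP:
            total, fits = fits_in_aav(construct, limit)
            print(name, limit, total, "fits" if fits else "exceeds limit")
```

Under these assumed element sizes, the S. pyogenes coding sequence alone (4107 bp) leaves little room for regulatory elements, which is consistent with the rationale given above for using fragments, split Cas9, or shorter orthologs.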

[0256] Alternatively, shorter Cas9 molecules from other species can be used in the compositions and methods disclosed herein, e.g., Cas9 molecules from Staphylococcus aureus, Campylobacter jejuni, Corynebacterium diphtheriae, Eubacterium ventriosum, Streptococcus pasteurianus, Lactobacillus farciminis, Sphaerochaeta globus, Azospirillum (strain B510), Gluconacetobacter diazotrophicus, Neisseria cinerea, Roseburia intestinalis, Parvibaculum lavamentivorans, Nitratifractor salsuginis (strain DSM 16511), Campylobacter lari (strain CF89-12), or Streptococcus thermophilus (strain LMD-9). Exemplary Cas9 sequences from these species are also shown in Table 1. In certain embodiments, the present disclosure provides an AAV vector comprising a nucleotide encoding a Cas9 molecule from Streptococcus pyogenes, Staphylococcus aureus, Campylobacter jejuni, Corynebacterium diphtheriae, Eubacterium ventriosum, Streptococcus pasteurianus, Lactobacillus farciminis, Sphaerochaeta globus, Azospirillum (strain B510), Gluconacetobacter diazotrophicus, Neisseria cinerea, Roseburia intestinalis, Parvibaculum lavamentivorans, Nitratifractor salsuginis (strain DSM 16511), Campylobacter lari (strain CF89-12), or Streptococcus thermophilus (strain LMD-9), or a fragment thereof.

TABLE-US-00001 TABLE 1 Exemplary Cas9 amino acid sequences SEQ ID NO: Description Sequence 24 S. pyogenes MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRH serotype M1 SIKKNLIGALLFDSGETAEATRLKRTARRRYTRRKNRICYLQ Cas9 (Q99ZW2) EIFSNEMAKVDDSFFHRLEESFLVEEDKKHERHPIFGNIVDE VAYHEKYPTIYHLRKKLVDSTDKADLRLIYLALAHMIKFRG HFLIEGDLNPDNSDVDKLFIQLVQTYNQLFEENPINASGVDA KAILSARLSKSRRLENLIAQLPGEKKNGLFGNLIALSLGLTPN FKSNFDLAEDAKLQLSKDTYDDDLDNLLAQIGDQYADLFLA AKNLSDAILLSDILRVNTEITKAPLSASMIKRYDEHHQDLTLL KALVRQQLPEKYKEIFFDQSKNGYAGYIDGGASQEEFYKFIK PILEKMDGTEELLVKLNREDLLRKQRTFDNGSIPHQIHLGEL HAILRRQEDFYPFLKDNREKIEKILTFRIPYYVGPLARGNSRF AWMTRKSEETITPWNFEEVVDKGASAQSFIERMTNFDKNLP NEKVLPKHSLLYEYFTVYNELTKVKYVTEGMRKPAFLSGEQ KKAIVDLLFKTNRKVTVKQLKEDYFKKIECFDSVEISGVEDR FNASLGTYHDLLKIIKDKDFLDNEENEDILEDIVLTLTLFEDR EMIEERLKTYAHLFDDKVMKQLKRRRYTGWGRLSRKLINGI RDKQSGKTILDFLKSDGFANRNFMQLIHDDSLTFKEDIQKAQ VSGQGDSLHEHIANLAGSPAIKKGILQTVKVVDELVKVMGR HKPENIVIEMARENQTTQKGQKNSRERMKRIEEGIKELGSQI LKEHPVENTQLQNEKLYLYYLQNGRDMYVDQELDINRLSD YDVDHIVPQSFLKDDSIDNKVLTRSDKNRGKSDNVPSEEVV KKMKNYWRQLLNAKLITQRKFDNLTKAERGGLSELDKAGF IKRQLVETRQITKHVAQILDSRMNTKYDENDKLIREVKVITL KSKLVSDFRKDFQFYKVREINNYHHAHDAYLNAVVGTALI KKYPKLESEFVYGDYKVYDVRKMIAKSEQEIGKATAKYFFY SNIMNFFKTEITLANGEIRKRPLIETNGETGEIVWDKGRDFAT VRKVLSMPQVNIVKKTEVQTGGFSKESILPKRNSDKLIARKK DWDPKKYGGFDSPTVAYSVLVVAKVEKGKSKKLKSVKELL GITIMERSSFEKNPIDFLEAKGYKEVKKDLIIKLPKYSLFELEN GRKRMLASAGELQKGNELALPSKYVNFLYLASHYEKLKGS PEDNEQKQLFVEQHKHYLDEIIEQISEFSKRVILADANLDKV LSAYNKHRDKPIREQAENIIHLFTLTNLGAPAAFKYFDTTIDR KRYTSTKEVLDATLIHQSITGLYETRIDLSQLGGD 25 S. 
aureus Cas9 MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVE (J7RUA5) NNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYNLLTDHSEL SGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEV EEDTGNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVR GSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTYIDLLET RRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSV KYAYNADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVF KQKKKPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTNLKVYH DIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSEL TQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIF NRLKLVPKKVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVI NAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQTNE RIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDL LNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEENSKKGNRTP FQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERD INRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVK VKSINGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFI FKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQEYKEIFI TPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDK GNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQT YQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPV IKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVY LDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAKKLKKIS NQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDIT YREYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKS KKHPQIIKKG 26 Eubacterium MGYTVGLDIGVASVGVAVLDENDNIVEAVSNIFDEADTSNN ventriosum KVRRTLREGRRTKRRQKTRIEDFKQLWETSGYIIPHKLHLNII Cas9 ELRNKGLTELLSLDELYCVLLSMLKHRGISYLEDADDGEKG (A5Z395) NAYKKGLAFNEKQLKEKMPCEIQLERMKKYGKYHGEFIIEI NDEKEYQSNVFTTKAYKKELEKIFETQRCNGNKINTKFIKKY MEIYERKREYYIGPGNEKSRTDYGIYTTRTDEEGNFIDEKNIF GKLIGKCSVYPEEYRASSASYTAQEFNLLNDLNNLKINNEKL TEFQKKEIVEIIKDASSVNMRKIIKKVIDEDIEQYSGARIDKK GKEIYHTFEIYRKLKKELKTINVDIDSFTREELDKTMDILTLN TERESIVKAFDEQKFVYEENLIKKLIEFRKNNQRLFSGWHSF SYKAMLQLIPVMYKEPKEQMQLLTEMNVFKSKKEKYVNY KYIPENEVVKEIYNPVVVKSIRTTVKILNALIKKYGYPESVVI EMPRDKNSDDEKEKIDMNQKKNQEEYEKILNKIYDEKGIEIT NKDYKKQKKLVLKLKLWNEQEGLCLYSGKKIAIEDLLNHP EFFEIDHIIPKSISLDDSRSNKVLVYKTENSIKENDTPYHYLTR INGKWGFDEYKANVLELRRRGKIDDKKVNNLLCMEDITKID VVKGFINRNLNDTRYASRVVLNEMQSFFESRKYCNTKVKVI RGSLTYQMRQDLHLKKNREESYSHHAVDAMLIAFSQKGYE AYRKIQKDCYDFETGEILDKEKWNKYIDDDEFDDILYKERM NEIRKKIIEAEEKVKYNYKIDKKCNRGLCNQTIYGTREKDGK 
IHKISSYNIYDDKECNSLKKMINSGKGSDLLMYNNDPKTYR DMLKILETYSSEKNPFVAYNKETGDYFRKYSKNHNGPKVEK VKYYSGQINSCIDISHKYGHAKNSKKVVLVSLNPYRTDVYY DNDTGKYYLVGVKYNHIKCVGNKYVIDSETYNELLRKEGV LNSDENLEDLNSKNITYKFSLYKNDIIQYEKGGEYYTERFLS RIKEQKNLIETKPINKPNFQRKNKKGEWENTRNQIALAKTK YVGKLVTDVLGNCYIVNMEKFSLVVDK 27 Azospirillum MARPAFRAPRREHVNGWTPDPHRISKPFFILVSWHLLSRVVI (strain B510) DSSSGCFPGTSRDHTDKFAEWECAVQPYRLSFDLGTNSIGW Cas9 GLLNLDRQGKPREIRALGSRIFSDGRDPQDKASLAVARRLA (D3NT09) RQMRRRRDRYLTRRTRLMGALVRFGLMPADPAARKRLEV AVDPYLARERATRERLEPFEIGRALFHLNQRRGYKPVRTAT KPDEEAGKVKEAVERLEAAIAAAGAPTLGAWFAWRKTRGE TLRARLAGKGKEAAYPFYPARRMLEAEFDTLWAEQARHHP DLLTAEAREILRHRIFHQRPLKPPPVGRCTLYPDDGRAPRAL PSAQRLRLFQELASLRVIHLDLSERPLTPAERDRIVAFVQGRP PKAGRKPGKVQKSVPFEKLRGLLELPPGTGFSLESDKRPELL GDETGARIAPAFGPGWTALPLEEQDALVELLLTEAEPERAIA ALTARWALDEATAAKLAGATLPDFHGRYGRRAVAELLPVL ERETRGDPDGRVRPIRLDEAVKLLRGGKDHSDFSREGALLD ALPYYGAVLERHVAFGTGNPADPEEKRVGRVANPTVHIAL NQLRHLVNAILARHGRPEEIVIELARDLKRSAEDRRREDKRQ ADNQKRNEERKRLILSLGERPTPRNLLKLRLWEEQGPVENR RCPYSGETISMRMLLSEQVDIDHILPFSVSLDDSAANKVVCL REANRIKRNRSPWEAFGHDSERWAGILARAEALPKNKRWR FAPDALEKLEGEGGLRARHLNDTRHLSRLAVEYLRCVCPKV RVSPGRLTALLRRRWGIDAILAEADGPPPEVPAETLDPSPAE KNRADHRHHALDAVVIGCIDRSMVQRVQLAAASAEREAAA REDNIRRVLEGFKEEPWDGFRAELERRARTIVVSHRPEHGIG GALHKETAYGPVDPPEEGFNLVVRKPIDGLSKDEINSVRDPR LRRALIDRLAIRRRDANDPATALAKAAEDLAAQPASRGIRR VRVLKKESNPIRVEHGGNPSGPRSGGPFHKLLLAGEVHHVD VALRADGRRWVGHWVTLFEAHGGRGADGAAAPPRLGDGE RFLMRLHKGDCLKLEHKGRVRVMQVVKLEPSSNSVVVVEP HQVKTDRSKHVKISCDQLRARGARRVTVDPLGRVRVHAPG ARVGIGGDAGRTAMEPAEDIS 28 Gluconacetobacter MGENMIDESLTFGIDLGIGSCGWAVLRRPSAFGRKGVIEGM diazotrophicus GSWCFDVPETSKERTPTNQIRRSNRLLRRVIRRRRNRMAAIR (strain ATCC RLLHAAGLLPSTDSDALKRPGHDPWELRARGLDKPLKPVEF 49037) Cas9 AVVLGHIAKRRGFKSAAKRKATNISSDDKKMLTALEATRER (A9HKP2) LGRYRTVGEMFARDPDFASRRRNREGKYDRTTARDDLEHE VHALFAAQRRLGQGFASPELEEAFTASAFHQRPMQDSERLV GFCPFERTEKRAAKLTPSFERFRLLARLLNLRITTPDGERPLT VDEIALVTRDLGKTAKLSIKRVRTLIGLEDNQRFTTIRPEDED RDIVARTGGAMTGTATLRKALGEALWTDMQERPEQLDAIV 
QVLSFFEANETITEKLREIGLTLAVLDVLLTALDAGVFAKFK GAAHISTKAARNLLPHLEQGRRYDEACTMAGYDHAASRLS HHGQIVAKTQFNALVTEIGESIANPIARKALIEGLKQIWAMR NHWGLPGSIHVELARDVGNSIEKRREIEKHIEKNTALRARER REVHDLLDLEDVNGDTLLRYRLWKEQGGKCLYTGKAIHIR QIAATDNSVQVDHILPWSRFGDDSFNNKTLCLASANQQKKR STPYEWLSGQTGDAWNAFVQRIETNKELRGFKKRNYLLKN AKEAEEKFRSRNLNDTRYAARLFAEAVKLLYAFGERQEKG GNRRVFTRPGALTAALRQAWGVESLKKQDGKRINDDRHHA LDALTVAAVDEAEIQRLTKSFHEWEQQGLGRPLRRVEPPWE SFRADVEATYPEVFVARPERRRARGEGHAATIRQVKERECT PIVFERKAVSSLKEADLERIKDGERNEAIVEAIRSWIATGRPA DAPPRSPRGDIITKIRLATTIKAAVPVRGGTAGRGEMVRADV FSKPNRRGKDEWYLVPVYPHQIMNRKAWPKPPMRSIVANK DEDEWTEVGPEHQFRFSLYPRSNIEIIRPSGEVIEGYFVGLHR NTGALTISAHNDPKSIHSGIGTKTLLAISKYQVDRFGRKSPVR KEVRTWHGEACISPTPPG 29 Neisseria MAAFKPNPMNYILGLDIGIASVGWAIVEIDEEENPIRLIDLGV cinerea Cas9 RVFERAEVPKTGDSLAAARRLARSVRRLTRRRAHRLLRARR (D0W2Z9) LLKREGVLQAADFDENGLIKSLPNTPWQLRAAALDRKLTPL EWSAVLLHLIKHRGYLSQRKNEGETADKELGALLKGVADN THALQTGDFRTPAELALNKFEKESGHIRNQRGDYSHTFNRK DLQAELNLLFEKQKEFGNPHVSDGLKEGIETLLMTQRPALS GDAVQKMLGHCTFEPTEPKAAKNTYTAERFVWLTKLNNLR ILEQGSERPLTDTERATLMDEPYRKSKLTYAQARKLLDLDD TAFFKGLRYGKDNAEASTLMEMKAYHAISRALEKEGLKDK KSPLNLSPELQDEIGTAFSLFKTDEDITGRLKDRVQPEILEAL LKHISFDKFVQISLKALRRIVPLMEQGNRYDEACTEIYGDHY GKKNTEEKIYLPPIPADEIRNPVVLRALSQARKVINGVVRRY GSPARIHIETAREVGKSFKDRKEIEKRQEENRKDREKSAAKF REYFPNFVGEPKSKDILKLRLYEQQHGKCLYSGKEINLGRLN EKGYVEIDHALPFSRTWDDSFNNKVLALGSENQNKGNQTP YEYFNGKDNSREWQEFKARVETSRFPRSKKQRILLQKFDED GFKERNLNDTRYINRFLCQFVADHMLLTGKGKRRVFASNG QITNLLRGFWGLRKVRAENDRHHALDAVVVACSTIAMQQK ITRFVRYKEMNAFDGKTIDKETGEVLHQKAHFPQPWEFFAQ EVMIRVFGKPDGKPEFEEADTPEKLRTLLAEKLSSRPEAVHK YVTPLFISRAPNRKMSGQGHMETVKSAKRLDEGISVLRVPL TQLKLKDLEKMVNREREPKLYEALKARLEAHKDDPAKAFA EPFYKYDKAGNRTQQVKAVRVEQVQKTGVWVHNHNGIAD NATIVRVDVFEKGGKYYLVPIYSWQVAKGILPDRAVVQGK DEEDWTVMDDSFEFKFVLYANDLIKLTAKKNEFLGYFVSLN RATGAIDIRTHDTDSTKGKNGIFQSVGVKTALSFQKYQIDEL GKEIRPCRLKKRPPVR 30 Roseburia MRENGSDERRRNMDEKMDYRIGLDIGIASVGWAVLQNNSD intestinalis DEPVRIVDLGVRIFDTAEIPKTGESLAGPRRAARTTRRRLRR Cas9 RKHRLDRIKWLFENQGLINIDDFLKRYNMAGLPDVYQLRYE 
(C7G697) ALDRKLTDEELAQVLLHIAKHRGFRSTRKAETAAKENGAVL KATDENQKRMQEKGYRTVGEMIYLDEAFRTGCSWSEKGYI LTPRNKAENYQHTMLRAMLVEEVKEIFSSQRRLGNEKATEE LEEKYLEIMTSQRSFDLGPGMQPDGKPSPYAMEGFSDRVGK CTFLGDQGELRGAKGTYTAEYFVALQKINHTKLVNQDGET RNFTEEERRALTLLLFTQKEVKYAAVRKKLGLPEDILFYNLN YKKAATKEEQQKENQNTEKAKFIGMPYYHDYKKCLEERVK YLTENEVRDLFDEIGMILTCYKNDDSRTERLAKLGLVPIEME GLLAYTPTKFQHLSMKAMRNIIPFLEKGMTYDKACEEAGYD FKADSKGTKQKLLTGENVNQTINEITNPVVKRSVSQTVKVIN AIIRTYGSPQAINIELAREMSKTFEERRKIKGDMEKRQKNNE DVKKQIQELGKLSPTGQDILKYRLWQEQQGICMYSGKTIPLE ELFKPGYDIDHILPYSITFDDSFRNKVLVTSQENRQKGNRTP YEYMGNDEQRWNEFETRVKTTIRDYKKQQKLLKKHFSEEE RSEFKERNLTDTKYITTVIYNMIRQNLEMAPLNRPEKKKQV RAVNGAITAYLRKRWGLPQKNRETDTHHAMDAVVIACCTD GMIQKISRYTKVRERCYSKGTEFVDAETGEIFRPEDYSRAEW DEIFGVHIPKPWETFRAELDVRMGDDPKGFLDTHSDVALEL DYPEYIYENLRPIFVSRMPNHKVTGAAHADTIRSPRHFKDEG IVLTKTALTDLKLDKDGEIDGYYNPQSDLLLYEALKKQLLL YGNDAKKAFAQDFHKPKADGTEGPVVRKVKIQKKQTMGV FVDSGNGIAENGGMVRIDVFRVNGKYYFVPVYTADVVKKV LPNRASTAHKPYGEWKVMEDKDFLFSLYSRDLIHIKSKKDIP IKMVNGGMEGIKETYAYYIGADISAANIQGIAHDSRYKFRGL GIQSLDVLEKCQIDVLGHVSVVRSEKRMGFS 31 Parvibaculum MERIFGFDIGTTSIGFSVIDYSSTQSAGNIQRLGVRIFPEARDP lavamentivorans DGTPLNQQRRQKRMMRRQLRRRRIRRKALNETLHEAGFLP (strain DS- AYGSADWPVVMADEPYELRRRGLEEGLSAYEFGRAIYHLA 1) Cas9 QHRHFKGRELEESDTPDPDVDDEKEAANERAATLKALKNE (A7HP89) QTTLGAWLARRPPSDRKRGIHAHRNVVAEEFERLWEVQSK FHPALKSEEMRARISDTIFAQRPVFWRKNTLGECRFMPGEPL CPKGSWLSQQRRMLEKLNNLAIAGGNARPLDAEERDAILSK LQQQASMSWPGVRSALKALYKQRGEPGAEKSLKFNLELGG ESKLLGNALEAKLADMFGPDWPAHPRKQEIRHAVHERLWA ADYGETPDKKRVIILSEKDRKAHREAAANSFVADFGITGEQ AAQLQALKLPTGWEPYSIPALNLFLAELEKGERFGALVNGP DWEGWRRTNFPHRNQPTGEILDKLPSPASKEERERISQLRNP TVVRTQNELRKVVNNLIGLYGKPDRIRIEVGRDVGKSKRER EEIQSGIRRNEKQRKKATEDLIKNGIANPSRDDVEKWILWKE GQERCPYTGDQIGFNALFREGRYEVEHIWPRSRSFDNSPRNK TLCRKDVNIEKGNRMPFEAFGHDEDRWSAIQIRLQGMVSAK GGTGMSPGKVKRFLAKTMPEDFAARQLNDTRYAAKQILAQ LKRLWPDMGPEAPVKVEAVTGQVTAQLRKLWTLNNILADD GEKTRADHRHHAIDALTVACTHPGMTNKLSRYWQLRDDPR AEKPALTPPWDTIRADAEKAVSEIVVSHRVRKKVSGPLHKE TTYGDTGTDIKTKSGTYRQFVTRKKIESLSKGELDEIRDPRIK 
EIVAAHVAGRGGDPKKAFPPYPCVSPGGPEIRKVRLTSKQQL NLMAQTGNGYADLGSNHHIAIYRLPDGKADFEIVSLFDASR RLAQRNPIVQRTRADGASFVMSLAAGEAIMIPEGSKKGIWIV QGVWASGQVVLERDTDADHSTTTRPMPNPILKDDAKKVSI DPIGRVRPSND 32 Nitratifractor MKKILGVDLGITSFGYAILQETGKDLYRCLDNSVVMRNNPY salsuginis DEKSGESSQSIRSTQKSMRRLIEKRKKRIRCVAQTMERYGIL (strain DSM DYSETMKINDPKNNPIKNRWQLRAVDAWKRPLSPQELFAIF 16511) Cas9 AHMAKHRGYKSIATEDLIYELELELGLNDPEKESEKKADER (E6WZS9) RQVYNALRHLEELRKKYGGETIAQTIHRAVEAGDLRSYRNH DDYEKMIRREDIEEEIEKVLLRQAELGALGLPEEQVSELIDEL KACITDQEMPTIDESLFGKCTFYKDELAAPAYSYLYDLYRL YKKLADLNIDGYEVTQEDREKVIEWVEKKIAQGKNLKKITH KDLRKILGLAPEQKIFGVEDERIVKGKKEPRTFVPFFFLADIA KFKELFASIQKHPDALQIFRELAEILQRSKTPQEALDRLRAL MAGKGIDTDDRELLELFKNKRSGTRELSHRYILEALPLFLEG YDEKEVQRILGFDDREDYSRYPKSLRHLHLREGNLFEKEEN PINNHAVKSLASWALGLIADLSWRYGPFDEIILETTRDALPE KIRKEIDKAMREREKALDKIIGKYKKEFPSIDKRLARKIQLW

ERQKGLDLYSGKVINLSQLLDGSADIEHIVPQSLGGLSTDYN TIVTLKSVNAAKGNRLPGDWLAGNPDYRERIGMLSEKGLID WKKRKNLLAQSLDEIYTENTHSKGIRATSYLEALVAQVLKR YYPFPDPELRKNGIGVRMIPGKVTSKTRSLLGIKSKSRETNFH HAEDALILSTLTRGWQNRLHRMLRDNYGKSEAELKELWKK YMPHIEGLTLADYIDEAFRRFMSKGEESLFYRDMFDTIRSISY WVDKKPLSASSHKETVYSSRHEVPTLRKNILEAFDSLNVIKD RHKLTTEEFMKRYDKEIRQKLWLHRIGNTNDESYRAVEERA TQIAQILTRYQLMDAQNDKEIDEKFQQALKELITSPIEVTGKL LRKMRFVYDKLNAMQIDRGLVETDKNMLGIHISKGPNEKLI FRRMDVNNAHELQKERSGILCYLNEMLFIFNKKGLIHYGCL RSYLEKGQGSKYIALFNPRFPANPKAQPSKFTSDSKIKQVGI GSATGIIKAHLDLDGHVRSYEVFGTLPEGSIEWFKEESGYGR VEDDPHH 33 Campylobacter MRILGFDIGINSIGWAFVENDELKDCGVRIFTKAENPKNKES lari Cas9 LALPRRNARSSRRRLKRRKARLIAIKRILAKELKLNYKDYVA (G1UFN3) ADGELPKAYEGSLASVYELRYKALTQNLETKDLARVILHIA KHRGYMNKNEKKSNDAKKGKILSALKNNALKLENYQSVG EYFYKEFFQKYKKNTKNFIKIRNTKDNYNNCVLSSDLEKEL KLILEKQKEFGYNYSEDFINEILKVAFFQRPLKDFSHLVGAC TFFEEEKRACKNSYSAWEFVALTKIINEIKSLEKISGEIVPTQT INEVLNLILDKGSITYKKFRSCINLHESISFKSLKYDKENAEN AKLIDFRKLVEFKKALGVHSLSRQELDQISTHITLIKDNVKL KTVLEKYNLSNEQINNLLEIEFNDYINLSFKALGMILPLMRE GKRYDEACEIANLKPKTVDEKKDFLPAFCDSIFAHELSNPVV NRAISEYRKVLNALLKKYGKVHKIHLELARDVGLSKKAREK IEKEQKENQAVNAWALKECENIGLKASAKNILKLKLWKEQ KEICIYSGNKISIEHLKDEKALEVDHIYPYSRSFDDSFINKVLV FTKENQEKLNKTPFEAFGKNIEKWSKIQTLAQNLPYKKKNKI LDENFKDKQQEDFISRNLNDTRYIATLIAKYTKEYLNFLLLS ENENANLKSGEKGSKIHVQTISGMLTSVLRHTWGFDKKDRN NHLHHALDAIIVAYSTNSIIKAFSDFRKNQELLKARFYAKEL TSDNYKHQVKFFEPFKSFREKILSKIDEIFVSKPPRKRARRAL HKDTFHSENKIIDKCSYNSKEGLQIALSCGRVRKIGTKYVEN DTIVRVDIFKKQNKFYAIPIYAMDFALGILPNKIVITGKDKNN NPKQWQTIDESYEFCFSLYKNDLILLQKKNMQEPEFAYYND FSISTSSICVEKHDNKFENLTSNQKLLFSNAKEGSVKVESLGI QNLKVFEKYIITPLGDKIKADFQPRENISLKTSKKYGLR
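By way of non-limiting illustration, the practical relevance of a shorter Cas9 molecule for an AAV vector can be estimated from the length of its coding sequence. The following Python sketch is illustrative only. The residue counts (1368 for S. pyogenes Cas9 and 1053 for S. aureus Cas9) and the approximate 4.7 kb packaging limit are assumptions supplied for the example and are not stated in the text above; the function names are hypothetical.

```python
# Rough estimate of how much of an AAV genome a Cas9 open reading frame occupies.
# Assumptions (not taken from the text above): the residue counts below and an
# approximate AAV packaging limit of 4700 bp.

AAV_CAPACITY_BP = 4700  # assumed approximate packaging limit, in base pairs


def coding_length_bp(n_residues: int) -> int:
    """Length in bp of an ORF encoding n_residues amino acids plus one stop codon."""
    return 3 * (n_residues + 1)


def remaining_capacity_bp(n_residues: int, capacity_bp: int = AAV_CAPACITY_BP) -> int:
    """Base pairs left for promoter, effector domain, gRNA cassette, and other elements."""
    return capacity_bp - coding_length_bp(n_residues)


# Assumed residue counts for two of the Cas9 molecules listed in Table 1.
cas9_lengths = {"S. pyogenes Cas9": 1368, "S. aureus Cas9": 1053}

for name, n_residues in cas9_lengths.items():
    print(name, coding_length_bp(n_residues), remaining_capacity_bp(n_residues))
```

Under these assumptions, the shorter S. aureus coding sequence leaves considerably more room in the vector for regulatory elements and a fused effector domain than the S. pyogenes coding sequence does.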

[0257] In one embodiment, Cas9 comprises one or more of the following domains: a Rec1 domain, a Rec2 domain, a bridge helix domain, a PAM interacting domain, an HNH nuclease domain, and a RuvC nuclease domain. Without wishing to be bound by theory, the Rec1 domain is responsible for binding guide RNA. The arginine-rich bridge helix domain plays an important role in initiating cleavage activity upon binding of target DNA. The PAM interacting domain confers PAM specificity and is therefore responsible for initiating binding to target DNA. The HNH and RuvC domains are nuclease domains that cut the single DNA strands complementary and noncomplementary to the guide RNA, respectively. See, e.g., Nishimasu et al., Cell (2014) 156:935-49; Anders et al., Nature (2014) 513: 569-73; Jinek et al., Science (2014) 343: 1247997; Sternberg et al., Nature (2014) 507: 62-7, incorporated by reference herein in their entirety.

E. dCas9

[0258] The Cas9 protein may be mutated so that the nuclease activity is inactivated. An inactivated Cas9 protein from S. pyogenes (iCas9, also referred to as "dCas9") with no endonuclease activity has recently been targeted to genes in bacteria, yeast, and human cells by gRNA to silence gene expression through steric hindrance. As used herein, a "dCas molecule" may refer to a dCas protein, or a fragment thereof. As used herein, a "dCas9 molecule" may refer to a dCas9 protein, or a fragment thereof. As used herein, the terms "iCas" and "dCas" are used interchangeably and refer to a catalytically inactive CRISPR associated protein. In one embodiment, the dCas molecule comprises one or more mutations in a DNA-cleavage domain. In one embodiment, the dCas molecule comprises one or more mutations in the RuvC or HNH domain. In one embodiment, the dCas molecule comprises one or more mutations in both the RuvC and HNH domains. In one embodiment, the dCas molecule is a fragment of a wild-type Cas molecule. In one embodiment, the dCas molecule comprises a functional domain from a wild-type Cas molecule, wherein the functional domain is chosen from a Rec1 domain, a bridge helix domain, or a PAM interacting domain. In one embodiment, the nuclease activity of the dCas molecule is reduced by at least 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, or 99% compared to that of a corresponding wild type Cas molecule.
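By way of non-limiting illustration, the percent reduction in nuclease activity relative to a corresponding wild type Cas molecule can be computed as 100 x (wild type activity - dCas activity) / wild type activity. The following Python sketch is illustrative only; the function names and the example activity values are hypothetical, and the units of activity are whatever a given cleavage assay reports.

```python
def percent_reduction(wild_type_activity: float, mutant_activity: float) -> float:
    """Percent reduction of mutant activity relative to wild type activity."""
    if wild_type_activity <= 0:
        raise ValueError("wild type activity must be positive")
    return 100.0 * (wild_type_activity - mutant_activity) / wild_type_activity


def meets_threshold(wild_type_activity: float,
                    mutant_activity: float,
                    threshold_percent: float = 40.0) -> bool:
    """True if activity is reduced by at least threshold_percent."""
    return percent_reduction(wild_type_activity, mutant_activity) >= threshold_percent


# Hypothetical assay readings in arbitrary units, for illustration only.
reduction = percent_reduction(200.0, 10.0)
print(reduction, meets_threshold(200.0, 10.0, threshold_percent=90.0))
```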

[0259] A suitable dCas molecule can be derived from a wild type Cas molecule. The Cas molecule can be from a type I, type II, or type III CRISPR-Cas system. In one embodiment, suitable dCas molecules can be derived from a Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, or Cas10 molecule. In one embodiment, the dCas molecule is derived from a Cas9 molecule. The dCas9 molecule can be obtained, for example, by introducing point mutations (e.g., substitutions, deletions, or additions) in the Cas9 molecule at the DNA-cleavage domain, e.g., the nuclease domain, e.g., the RuvC and/or HNH domain. See, e.g., Jinek et al., Science (2012) 337:816-21, incorporated by reference herein in its entirety. For example, introducing two point mutations in the RuvC and HNH domains reduces the Cas9 nuclease activity while retaining the Cas9 sgRNA and DNA binding activity. In one embodiment, the two point mutations within the RuvC and HNH active sites are D10A and H840A mutations of the S. pyogenes Cas9 molecule. Alternatively, D10 and H840 of the S. pyogenes Cas9 molecule can be deleted to abolish the Cas9 nuclease activity while retaining its sgRNA and DNA binding activity. In one embodiment, the two point mutations within the RuvC and HNH active sites are D10A and N580A mutations of the S. aureus Cas9 molecule. In one embodiment, the dCas molecule is an S. aureus dCas9 molecule comprising a mutation at D10 and/or N580, numbered according to SEQ ID NO: 25. In one embodiment, the dCas molecule is an S. aureus dCas9 molecule comprising D10A and/or N580A mutations, numbered according to SEQ ID NO: 25. In one embodiment, the dCas molecule is an S. aureus dCas9 molecule comprising the amino acid sequence of SEQ ID NO: 35 or 36, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 35 or 36, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 35 or 36, or any fragment thereof.

TABLE-US-00002 (exemplary S. aureus dCas9) SEQ ID NO: 35 KRNYILGLAIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSKR GARRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKLS EEEFSAALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYVA ELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDTY IDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYAY NADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIAK EILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQI AKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAIN LILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVVK RSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQT NERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNPF NYEVDHIIPRSVSFDNSFNNKVLVKQEEASKKGNRTPFQYLSSSDSKISY ETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLVDTRY ATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKHH AEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQEYK EIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDKGNTLI VNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDEK NPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNSR NKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEAK KLKKISNQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDITY REYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQIIK KG (exemplary S. 
aureus dCas9) SEQ ID NO: 36 MKRNYILGLAIGITSVGYGIIDYETRDVIDAGVRLFKEANVENNEGRRSK RGARRLKRRRRHRIQRVKKLLFDYNLLTDHSELSGINPYEARVKGLSQKL SEEEFSAALLHLAKRRGVHNVNEVEEDTGNELSTKEQISRNSKALEEKYV AELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLKVQKAYHQLDQSFIDT YIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMGHCTYFPEELRSVKYA YNADLYNALNDLNNLVITRDENEKLEYYEKFQIIENVFKQKKKPTLKQIA KEILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDITARKEIIENAELLDQ IAKILTIYQSSEDIQEELTNLNSELTQEEIEQISNLKGYTGTHNLSLKAI NLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKEIPTTLVDDFILSPVV KRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSKDAQKMINEMQKRNRQ TNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCLYSLEAIPLEDLLNNP FNYEVDHIIPRSVSFDNSFNNKVLVKQEEASKKGNRTPFQYLSSSDSKIS YETFKKHILNLAKGKGRISKTKKEYLLEERDINRFSVQKDFINRNLVDTR YATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLRRKWKFKKERNKGYKH HAEDALIIANADFIFKEWKKLDKAKKVMENQMFEEKQAESMPEIETEQEY KEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELINDTLYSTRKDDKGNTL IVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQTYQKLKLIMEQYGDE KNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGNKLNAHLDITDDYPNS RNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIKKENYYEVNSKCYEEA KKLKKISNQAEFIASFYNNDLIKINGELYRVIGVNNDLLNRIEVNMIDIT YREYLENMNDKRPPRIIKTIASKTQSIKKYSTDILGNLYEVKSKKHPQII KKG
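By way of non-limiting illustration, the substitution notation used in paragraph [0259] (e.g., D10A, meaning the aspartate at position 10 is replaced by alanine) can be applied to a sequence programmatically. The following Python sketch is illustrative only. The function name apply_substitutions is hypothetical, and N-terminal fragments of SEQ ID NO: 24 and SEQ ID NO: 25 are used in place of the full-length sequences for brevity; applying H840A or N580A would require the corresponding full-length sequence.

```python
import re


def apply_substitutions(sequence: str, substitutions: list[str]) -> str:
    """Apply substitutions such as 'D10A' (1-based positions) to a protein sequence.

    Raises ValueError if a position is out of range or if the residue found at
    that position does not match the expected wild type residue.
    """
    residues = list(sequence)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub!r}")
        expected, position, replacement = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != expected:
            raise ValueError(
                f"expected {expected} at position {position}, found {residues[position - 1]}"
            )
        residues[position - 1] = replacement
    return "".join(residues)


# N-terminal fragments copied from Table 1 (SEQ ID NO: 25 and SEQ ID NO: 24).
SA_CAS9_N_TERMINUS = "MKRNYILGLDIGITSVGYGIIDYETRDVIDAGVRLFKEANVE"
SP_CAS9_N_TERMINUS = "MDKKYSIGLDIGTNSVGWAVITDEYKVPSKKFKVLGNTDRH"

print(apply_substitutions(SA_CAS9_N_TERMINUS, ["D10A"]))
print(apply_substitutions(SP_CAS9_N_TERMINUS, ["D10A"]))
```

The check on the expected wild type residue guards against numbering errors, for example when a sequence lacks the initiator methionine, as SEQ ID NO: 35 does, so that every position is shifted by one.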

[0260] Similar mutations can also apply to any other naturally-occurring Cas9 (e.g., Cas9 from other species) or engineered Cas9 molecules. In certain embodiments, the dCas9 molecule comprises a Streptococcus pyogenes dCas9 molecule, a Staphylococcus aureus dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (strain DSM 16511) dCas9 molecule, a Campylobacter lari (strain CF89-12) dCas9 molecule, a Streptococcus thermophilus (strain LMD-9) dCas9 molecule, or fragment thereof. In certain embodiments, the present disclosure provides an AAV vector comprising a nucleotide sequence encoding a Streptococcus pyogenes dCas9 molecule, a Staphylococcus aureus dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (strain DSM 16511) dCas9 molecule, a Campylobacter lari (strain CF89-12) dCas9 molecule, a Streptococcus thermophilus (strain LMD-9) dCas9 molecule, or fragment thereof.

[0261] In one embodiment, as used herein, "iCas9" and "dCas9" both refer to a Cas9 protein that has the amino acid substitutions D10A and H840A and has its nuclease activity inactivated. In certain embodiments, the Cas9 protein comprises dCas9.

F. Cas9 Fusion Protein

[0262] The CRISPR/Cas9-based system may include a fusion protein. The fusion protein may comprise three heterologous polypeptide domains, wherein the first polypeptide domain comprises, consists of, or consists essentially of a dead Clustered Regularly Interspaced Short Palindromic Repeats associated (dCas) protein, the second polypeptide domain comprises, consists of, or consists essentially of a Kruppel-associated box (KRAB), and the third polypeptide domain has an activity selected from the group consisting of transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, nuclease activity, nucleic acid association activity, methylase activity, and demethylase activity.

(1) Transcription Activation Activity

[0263] The third polypeptide domain may have transcription activation activity, i.e., a transactivation domain. For example, gene expression of endogenous mammalian genes, such as human genes, may be achieved by targeting a fusion protein of iCas9 and a transactivation domain to mammalian promoters via combinations of gRNAs. The transactivation domain may include a VP16 protein, multiple VP16 proteins, such as a VP48 domain or VP64 domain, or p65 domain of NF kappa B transcription activator activity. For example, the fusion protein may be iCas9-VP64.

(2) Transcription Repression Activity

[0264] The third polypeptide domain may have transcription repression activity. The second polypeptide domain may have a Kruppel associated box activity, such as a KRAB domain, ERF repressor domain activity, Mxi1 repressor domain activity, SID4X repressor domain activity, Mad-SID repressor domain activity or TATA box binding protein activity. For example, the fusion protein may be dCas9-KRAB.

(3) Transcription Release Factor Activity

[0265] The third polypeptide domain may have transcription release factor activity. The second polypeptide domain may have eukaryotic release factor 1 (ERF1) activity or eukaryotic release factor 3 (ERF3) activity.

(4) Histone Modification Activity

[0266] The third polypeptide domain may have histone modification activity. The second polypeptide domain may have histone deacetylase, histone acetyltransferase, histone demethylase, or histone methyltransferase activity. The histone acetyltransferase may be p300 or CREB-binding protein (CBP) protein, or fragments thereof. For example, the fusion protein may be dCas9-p300.

(5) Nuclease Activity

[0267] The third polypeptide domain may have nuclease activity that is different from the nuclease activity of the Cas9 protein. A nuclease, or a protein having nuclease activity, is an enzyme capable of cleaving the phosphodiester bonds between the nucleotide subunits of nucleic acids. Nucleases are usually further divided into endonucleases and exonucleases, although some of the enzymes may fall in both categories. Well known nucleases are deoxyribonuclease and ribonuclease.

(6) Nucleic Acid Association Activity

[0268] The third polypeptide domain may have nucleic acid association activity, or may comprise a nucleic acid binding protein DNA-binding domain (DBD). A DBD is an independently folded protein domain that contains at least one motif that recognizes double- or single-stranded DNA. A DBD can recognize a specific DNA sequence (a recognition sequence) or have a general affinity to DNA. The nucleic acid association region may be selected from the group consisting of a helix-turn-helix region, a leucine zipper region, a winged helix region, a winged helix-turn-helix region, a helix-loop-helix region, an immunoglobulin fold, a B3 domain, a zinc finger, an HMG-box, a Wor3 domain, and a TAL effector DNA-binding domain.

(7) Methylase Activity

[0269] The third polypeptide domain may have methylase activity, which involves transferring a methyl group to DNA, RNA, protein, small molecule, cytosine or adenine. The second polypeptide domain may include a DNA methyltransferase.

(8) Demethylase Activity

[0270] The third polypeptide domain may have demethylase activity. The second polypeptide domain may include an enzyme that removes methyl (CH3-) groups from nucleic acids, proteins (in particular histones), and other molecules. Alternatively, the second polypeptide may convert the methyl group to hydroxymethylcytosine in a mechanism for demethylating DNA. The second polypeptide may catalyze this reaction. For example, the second polypeptide that catalyzes this reaction may be Tet1.

[0271] In one aspect, the CRISPR/Cas9-based system may include a dCas molecule and a modulator of gene expression, or a nucleic acid encoding a dCas molecule and a modulator of gene expression. In one embodiment, the dCas molecule and the modulator of gene expression are linked covalently. In one embodiment, the modulator of gene expression is covalently fused to the dCas molecule directly. In one embodiment, the modulator of gene expression is covalently fused to the dCas molecule indirectly, e.g., via a non-modulator or linker, or via a second modulator. In one embodiment, the modulator of gene expression is at the N-terminus and/or C-terminus of the dCas molecule. In one embodiment, the dCas molecule and the modulator of gene expression are linked non-covalently. In one embodiment, the dCas molecule is fused to a first tag, e.g., a first peptide tag. In one embodiment, the modulator of gene expression is fused to a second tag, e.g., a second peptide tag. In one embodiment, the first and second tag, e.g., the first peptide tag and the second peptide tag, non-covalently interact with each other, thereby bringing the dCas molecule and the modulator of gene expression into close proximity.
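By way of non-limiting illustration, the covalent arrangements described in paragraph [0271] (modulator at the N-terminus, the C-terminus, or both, fused directly or via a linker) can be enumerated as an ordered list of parts, read from N-terminus to C-terminus. The following Python sketch is illustrative only; the function name fusion_layout, the position codes "N", "C", and "both", and the part labels are hypothetical and do not represent actual sequences.

```python
from typing import List, Optional


def fusion_layout(dcas: str,
                  modulator: str,
                  position: str = "C",
                  linker: Optional[str] = None) -> List[str]:
    """Return the parts of a fusion molecule in N-terminal to C-terminal order.

    position: "N", "C", or "both", the terminus of the dCas molecule that
              carries the modulator of gene expression.
    linker:   optional part placed between the dCas molecule and each modulator;
              None means the modulator is fused directly.
    """
    if position not in ("N", "C", "both"):
        raise ValueError("position must be 'N', 'C', or 'both'")
    parts: List[str] = []
    if position in ("N", "both"):
        parts.append(modulator)
        if linker is not None:
            parts.append(linker)
    parts.append(dcas)
    if position in ("C", "both"):
        if linker is not None:
            parts.append(linker)
        parts.append(modulator)
    return parts


# Hypothetical labels for illustration only.
print(fusion_layout("dCas9", "KRAB", position="C"))
print(fusion_layout("dCas9", "KRAB", position="both", linker="linker"))
```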

[0272] In one embodiment, the CRISPR/Cas9-based system includes a fusion molecule or a nucleic acid encoding a fusion molecule. In one embodiment, the fusion molecule comprises a sequence comprising a dCas molecule fused to a modulator of gene expression. In one embodiment, the dCas molecule comprises a Streptococcus pyogenes dCas9 molecule, a Staphylococcus aureus dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (strain DSM 16511) dCas9 molecule, a Campylobacter lari (strain CF89-12) dCas9 molecule, a Streptococcus thermophilus (strain LMD-9) dCas9 molecule, or fragment thereof. In one embodiment, the modulator of gene expression is chosen from a repressor of gene expression, an activator of gene expression, or a modulator of epigenetic modification.

[0273] Different modulators of gene expression are known in the art, see, e.g., Thakore et al., Nat Methods. 2016; 13:127-37, incorporated by reference herein in its entirety.

(1) Repressor of Gene Expression

[0274] The repressor may be any known repressor of gene expression, for example, a repressor chosen from Kruppel associated box (KRAB) domain, mSin3 interaction domain (SID), MAX-interacting protein 1 (MXI1), a chromo shadow domain, an EAR-repression domain (SRDX), eukaryotic release factor 1 (ERF1), eukaryotic release factor 3 (ERF3), tetracycline repressor, the lacI repressor, Catharanthus roseus G-box binding factors 1 and 2, Drosophila Groucho, Tripartite motif-containing 28 (TRIM28), Nuclear receptor co-repressor 1, Nuclear receptor co-repressor 2, or fragment or fusion thereof.

Kruppel Associated Box (KRAB)

[0275] The KRAB domain is a type of transcriptional repression domain present in the N-terminal part of many zinc finger protein-based transcription factors. The KRAB domain functions as a transcriptional repressor when tethered to a target DNA by a DNA-binding domain. The KRAB domain is enriched in charged amino acids and can be divided into sub-domains A and B. The KRAB A and B sub-domains can be separated by variable spacer segments and many KRAB proteins contain only the A sub-domain. A sequence of 45 amino acids in the KRAB A sub-domain has been shown to be important for transcriptional repression. The B sub-domain does not repress transcription by itself but does potentiate the repression exerted by the KRAB A sub-domain. The KRAB domain recruits corepressors KAP1 (KRAB-associated protein-1, also known as transcription intermediary factor 1 beta, KRAB-A interacting protein and tripartite motif protein 28) and heterochromatin protein 1 (HP1), as well as other chromatin modulating proteins, leading to transcriptional repression through heterochromatin formation. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to a KRAB domain or fragment thereof. In one embodiment, the KRAB domain or fragment thereof is fused to the N-terminus of the dCas9 molecule. In one embodiment, the KRAB domain or fragment thereof is fused to the C-terminus of the dCas9 molecule. In one embodiment, the KRAB domain or fragment thereof is fused to both the N-terminus and the C-terminus of the dCas9 molecule. In one embodiment, the fusion molecule comprises a KRAB domain comprising the sequence of SEQ ID NO: 34, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, or any fragment thereof.

TABLE-US-00003 (exemplary KRAB) SEQ ID NO: 34 DAKSLTAWSRTLVTFKDVFVDFTREEWKLLDTAQQILYRNVMLENYKNLV SLGYQLTKPDVILRLEKGEEPWLVEREIHQETHPDSETAFEIKSSVPKKK RKV
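By way of non-limiting illustration, the number of changes and the percent identity of a candidate sequence relative to a reference such as SEQ ID NO: 34 can be estimated by position-by-position comparison. The following Python sketch is illustrative only and makes a simplifying assumption: it compares two sequences of equal length and therefore counts substitutions only. Sequences that differ by insertions or deletions would first need to be aligned with a sequence alignment tool. The function names and the variant sequence are hypothetical; the reference is the first 50 residues of SEQ ID NO: 34.

```python
def count_substitutions(reference: str, candidate: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must have equal length; align them first")
    return sum(a != b for a, b in zip(reference, candidate))


def percent_identity(reference: str, candidate: str) -> float:
    """Percent of positions that are identical (ungapped, equal-length comparison)."""
    if not reference:
        raise ValueError("reference sequence must not be empty")
    mismatches = count_substitutions(reference, candidate)
    return 100.0 * (len(reference) - mismatches) / len(reference)


# First 50 residues of SEQ ID NO: 34 (exemplary KRAB).
KRAB_FRAGMENT = "DAKSLTAWSRTLVTFKDVFVDFTREEWKLLDTAQQILYRNVMLENYKNLV"

# Hypothetical variant for illustration: the first two residues replaced by glycine.
variant = "GG" + KRAB_FRAGMENT[2:]

print(count_substitutions(KRAB_FRAGMENT, variant), percent_identity(KRAB_FRAGMENT, variant))
```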

mSin3 Interaction Domain (SID)

[0276] The mSin3 interaction domain (SID) is an interaction domain that is present on several transcription repressor proteins. It interacts with the paired amphipathic alpha-helix 2 (PAH2) domain of mSin3, a transcriptional repressor domain that is attached to transcription repressor proteins such as the mSin3A corepressor. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to an mSin3 interaction domain or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to four concatenated mSin3 interaction domains (SID4X). In one embodiment, the four concatenated mSin3 interaction domains (SID4X) are fused to the C-terminus of the dCas9 molecule.

MAX-Interacting Protein 1 (MXI1)

[0277] Mxi1 is a repressor of MYC. Mxi1 antagonizes MYC transcriptional activity possibly by competing for binding to MYC-associated factor X (MAX), which binds to MYC and is required for MYC to function. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to Mxi1 or fragment thereof. In one embodiment, Mxi1 is fused to the C-terminus of the dCas9 molecule.

(2) Activator of Gene Expression

[0278] The activator may be any known activator of gene expression, for example, a VP16 activation domain, a VP64 activation domain, a p65 activation domain, an Epstein-Barr virus R transactivator Rta molecule, or fragment thereof. Activators that can be used with a dCas9 molecule are known in the art. See, e.g., Chavez et al., Nat Methods. (2016) 13: 563-67, incorporated by reference herein in its entirety.

VP16, VP64, VP160

[0279] VP16 is a viral protein sequence of 16 amino acids that recruits transcriptional activators to promoters and enhancers. VP64 is a transcription activator comprising four copies of VP16, e.g., a molecule comprising four tandem copies of VP16 connected by Gly-Ser linkers. VP160 is a transcription activator comprising 10 copies of VP16. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more copies of VP16. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to VP64. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to VP160. In one embodiment, VP64 is fused to the C-terminus, the N-terminus, or both the N-terminus and the C-terminus of the dCas9 molecule.
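By way of non-limiting illustration, a tandem activator of the kind described in paragraph [0279] (several copies of VP16 connected by Gly-Ser linkers) can be assembled by joining repeated units with a linker. The following Python sketch is illustrative only. The function name tandem_repeat is hypothetical, the two-residue linker "GS" is an assumed choice of Gly-Ser linker, and the VP16 unit is represented by a placeholder of 16 unspecified residues ("X") rather than an actual sequence.

```python
def tandem_repeat(unit: str, copies: int, linker: str = "GS") -> str:
    """Join `copies` copies of `unit`, placing `linker` between adjacent copies."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return linker.join([unit] * copies)


# Placeholder for a 16-residue VP16 unit; "X" denotes an unspecified amino acid.
VP16_PLACEHOLDER = "X" * 16

vp64_like = tandem_repeat(VP16_PLACEHOLDER, copies=4)    # four copies, as for VP64
vp160_like = tandem_repeat(VP16_PLACEHOLDER, copies=10)  # ten copies, as for VP160

print(len(vp64_like), len(vp160_like))
```

With n copies there are n - 1 linkers, so the total length is n times the unit length plus (n - 1) times the linker length.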

p65 Activation Domain (p65AD)

[0280] p65AD is the principal transactivation domain of the 65 kDa polypeptide of the nuclear form of the NF-κB transcription factor. An exemplary sequence of human transcription factor p65 is available at the Uniprot database under accession number Q04206. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to p65 or fragment thereof, e.g., p65AD.

Epstein-Barr Virus (EBV) R Transactivator (Rta)

[0281] Rta, an immediate-early protein of EBV, is a transcriptional activator that induces lytic gene expression and triggers virus reactivation. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to Rta or fragment thereof.

VP64, p65, Rta Fusions

[0282] In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to VP64, p65, Rta, or any combination thereof. The tripartite activator VP64-p65-Rta (also known as VPR), in which the three transcription activation domains are fused using short amino acid linkers, can effectively up-regulate target gene expression when fused to a dCas9 molecule. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to VPR.

Synergistic Activation Mediators (SAM)

[0283] In one embodiment, the methods and compositions disclosed herein include a CRISPR-Cas system that comprises three components: (1) a dCas9-VP64 fusion, (2) a gRNA incorporating two MS2 RNA aptamers at the tetraloop and stem-loop, and (3) the MS2-P65-HSF1 activation helper protein. This system, named Synergistic Activation Mediators (SAM), brings together three activation domains (VP64, P65, and HSF1) and has been described in Konermann et al., Nature. 2015; 517:583-8, incorporated by reference herein in its entirety.

Ldb1 Self-Association Domain

[0284] In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to Ldb1 self-association domain. Ldb1 self-association domain recruits enhancer-associated endogenous Ldb1.

(3) Modulator of Epigenetic Modification

[0285] In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to a modulator of epigenetic modification. In one embodiment, the fusion molecule modulates target gene expression via epigenetic modification, e.g., via histone acetylation or methylation, or DNA methylation, at a regulatory element of the target gene, e.g., a promoter or enhancer. The modulator may be any known modulator of epigenetic modification, e.g., a histone acetyltransferase (e.g., p300 catalytic domain), a histone deacetylase, a histone methyltransferase (e.g., SUV39H1 or G9a (EHMT2)), a histone demethylase (e.g., LSD1), a DNA methyltransferase (e.g., DNMT3a or DNMT3a-DNMT3L), a DNA demethylase (e.g., TET1 catalytic domain or TDG), or fragment thereof.

[0286] In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to Lys-specific histone demethylase 1 (LSD1) or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to acetyltransferase p300 or fragment thereof, e.g., the catalytic core of p300. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to CREB-binding protein (CBP) or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to Ten-eleven translocation methylcytosine dioxygenase 1 (TET1) or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to thymine DNA glycosylase (TDG) or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to SUV39H1 or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to G9a (EHMT2) or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to DNMT3a or fragment thereof. In one embodiment, the methods and compositions disclosed herein include a fusion molecule comprising a dCas9 molecule fused to DNMT3a-DNMT3L or fragment thereof.

[0287] In one embodiment, the Cas9 fusion protein also comprises a nuclear localization sequence (NLS), e.g., a NLS fused to the N-terminus and/or C-terminus of Cas9. Nuclear localization sequences are known in the art. In one embodiment, the NLS comprises the amino acid sequence of SEQ ID NO: 37 or 38, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 37 or 38, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 37 or 38, or any fragment thereof.

TABLE-US-00004
(exemplary nuclear localization sequence) SEQ ID NO: 37	APKKKRKVGIHGVPAA
(exemplary nuclear localization sequence) SEQ ID NO: 38	KRPAATKKAGQAKKKK
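The "substantially identical" thresholds recited above (e.g., at least 80% identical) can be illustrated with a minimal sketch. It assumes an ungapped, position-by-position comparison of two equal-length sequences, which is a simplification: sequences that differ by insertions or deletions would first need an alignment. The function name `percent_identity` and the single-substitution variant are hypothetical and are not taken from the disclosure.

```python
# Exemplary NLS sequences as listed for SEQ ID NO: 37 and SEQ ID NO: 38.
SEQ_ID_NO_37 = "APKKKRKVGIHGVPAA"
SEQ_ID_NO_38 = "KRPAATKKAGQAKKKK"


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences.

    Assumption: no alignment is performed, so insertions and deletions
    are not handled by this sketch.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


# Hypothetical variant of SEQ ID NO: 37 carrying one substitution (K3R).
variant = "APRKKRKVGIHGVPAA"
identity = percent_identity(SEQ_ID_NO_37, variant)
print(f"{identity:.2f}% identical; meets 90% threshold: {identity >= 90.0}")
```

With 16 residues, a single substitution leaves 15 of 16 positions matching, so this variant would fall above the 90% threshold and below the 95% threshold.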

[0288] In one embodiment, the fusion molecule is a NLS-dSaCas9-NLS-KRAB fusion molecule comprising from the N-terminus to the C-terminus: a first NLS, an S. aureus dCas9 molecule, a second NLS, and a KRAB, fused directly or indirectly (e.g., via a linker). In one embodiment, the fusion molecule is a HA-NLS-dSaCas9-NLS-KRAB fusion molecule comprising from the N-terminus to the C-terminus: a HA tag, a first NLS, an S. aureus dCas9 molecule, a second NLS, and a KRAB, fused directly or indirectly (e.g., via a linker). In one embodiment, the fusion molecule is encoded by a nucleic acid comprising the sequence of SEQ ID NO: 23, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 23, or a sequence having one, two, three, four, five or more changes, e.g., substitutions, insertions, or deletions, relative to SEQ ID NO: 23, or any fragment thereof. In one embodiment, the fusion molecule comprises the amino acid sequence of SEQ ID NO: 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 39, 40, or 41, or any fragment thereof.

TABLE-US-00005 (exemplary HA-NLS-dSaCas9-NLS-KRAB sequence) SEQ ID NO: 39 MYPYDVPDYAAPKKKRKVGIHGVPAAKRNYILGLAIGITSVGYGIIDYET RDVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYN LLTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVE EDTGNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSD YVKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKD IKEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDENEKL EYYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTN LKVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSEL TQEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVP KKVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDII IELAREKNSKDAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIK LHDMQEGKCLYSLEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLV KQEEASKKGNRTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEY LLEERDINRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKS INGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAK KVMENQMFEEKQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSHRV DKKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPE KLLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGP VIKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKF VTVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKIN GELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKTQ SIKKYSTDILGNLYEVKSKKHPQIIKKGKRPAATKKAGQAKKKKGSDAKS LTAWSRTLVTFKDVFVDFTREEWKLLDTAQQILYRNVMLENYKNLVSLGY QLTKPDVILRLEKGEEPWLVEREIHQETHPDSETAFEIKSSVPKKKRKV (exemplary HA-NLS-dSaCas9-NLS-KRAB sequence) SEQ ID NO: 40 YPYDVPDYAAPKKKRKVGIHGVPAAKRNYILGLAIGITSVGYGIIDYETR DVIDAGVRLFKEANVENNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYNL LTDHSELSGINPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEE DTGNELSTKEQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDY VKEAKQLLKVQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDI KEWYEMLMGHCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDENEKLE YYEKFQIIENVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTNL KVYHDIKDITARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSELT QEEIEQISNLKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPK KVDLSQQKEIPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIII ELAREKNSKDAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKL HDMQEGKCLYSLEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVK 
QEEASKKGNRTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYL LEERDINRFSVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSI NGGFTSFLRRKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAKK VMENQMFEEKQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSHRVD KKPNRELINDTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEK LLMYHHDPQTYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPV IKKIKYYGNKLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKFV TVKNLDVIKKENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKING ELYRVIGVNNDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKTQS IKKYSTDILGNLYEVKSKKHPQIIKKGKRPAATKKAGQAKKKKGSDAKSL TAWSRTLVTFKDVFVDFTREEWKLLDTAQQILYRNVMLENYKNLVSLGYQ LTKPDVILRLEKGEEPWLVEREIHQETHPDSETAFEIKSSVPKKKRKV (exemplary NLS-dSaCas9-NLS-KRAB) SEQ ID NO: 41 APKKKRKVGIHGVPAAKRNYILGLAIGITSVGYGIIDYETRDVIDAGVRL FKEANVENNEGRRSKRGARRLKRRRRHRIQRVKKLLFDYNLLTDHSELSG INPYEARVKGLSQKLSEEEFSAALLHLAKRRGVHNVNEVEEDTGNELSTK EQISRNSKALEEKYVAELQLERLKKDGEVRGSINRFKTSDYVKEAKQLLK VQKAYHQLDQSFIDTYIDLLETRRTYYEGPGEGSPFGWKDIKEWYEMLMG HCTYFPEELRSVKYAYNADLYNALNDLNNLVITRDENEKLEYYEKFQIIE NVFKQKKKPTLKQIAKEILVNEEDIKGYRVTSTGKPEFTNLKVYHDIKDI TARKEIIENAELLDQIAKILTIYQSSEDIQEELTNLNSELTQEEIEQISN LKGYTGTHNLSLKAINLILDELWHTNDNQIAIFNRLKLVPKKVDLSQQKE IPTTLVDDFILSPVVKRSFIQSIKVINAIIKKYGLPNDIIIELAREKNSK DAQKMINEMQKRNRQTNERIEEIIRTTGKENAKYLIEKIKLHDMQEGKCL YSLEAIPLEDLLNNPFNYEVDHIIPRSVSFDNSFNNKVLVKQEEASKKGN RTPFQYLSSSDSKISYETFKKHILNLAKGKGRISKTKKEYLLEERDINRF SVQKDFINRNLVDTRYATRGLMNLLRSYFRVNNLDVKVKSINGGFTSFLR RKWKFKKERNKGYKHHAEDALIIANADFIFKEWKKLDKAKKVMENQMFEE KQAESMPEIETEQEYKEIFITPHQIKHIKDFKDYKYSHRVDKKPNRELIN DTLYSTRKDDKGNTLIVNNLNGLYDKDNDKLKKLINKSPEKLLMYHHDPQ TYQKLKLIMEQYGDEKNPLYKYYEETGNYLTKYSKKDNGPVIKKIKYYGN KLNAHLDITDDYPNSRNKVVKLSLKPYRFDVYLDNGVYKFVTVKNLDVIK KENYYEVNSKCYEEAKKLKKISNQAEFIASFYNNDLIKINGELYRVIGVN NDLLNRIEVNMIDITYREYLENMNDKRPPRIIKTIASKTQSIKKYSTDIL GNLYEVKSKKHPQIIKKGKRPAATKKAGQAKKKKGSDAKSLTAWSRTLVT FKDVFVDFTREEWKLLDTAQQILYRNVMLENYKNLVSLGYQLTKPDVILR LEKGEEPWLVEREIHQETHPDSETAFEIKSSVPKKKRKV
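The N-terminal to C-terminal domain order described for the HA-NLS-dSaCas9-NLS-KRAB fusion can be checked against a sequence with a simple motif search. The sketch below is illustrative only: it uses the first 50 residues of SEQ ID NO: 39 as listed above, the first exemplary NLS (SEQ ID NO: 37), and the commonly used HA epitope sequence YPYDVPDYA. The helper name `find_all` is hypothetical.

```python
HA_TAG = "YPYDVPDYA"          # commonly used HA epitope tag sequence
NLS_37 = "APKKKRKVGIHGVPAA"   # exemplary NLS, SEQ ID NO: 37

# First 50 residues of SEQ ID NO: 39 as listed above (excerpt only).
SEQ_39_N_TERMINUS = "MYPYDVPDYAAPKKKRKVGIHGVPAAKRNYILGLAIGITSVGYGIIDYET"


def find_all(sequence: str, motif: str) -> list[int]:
    """Return every 0-based start index at which motif occurs in sequence."""
    positions = []
    start = sequence.find(motif)
    while start != -1:
        positions.append(start)
        start = sequence.find(motif, start + 1)
    return positions


ha_sites = find_all(SEQ_39_N_TERMINUS, HA_TAG)
nls_sites = find_all(SEQ_39_N_TERMINUS, NLS_37)
# The HA tag should precede the first NLS in this construct.
print("HA tag at:", ha_sites, "first NLS at:", nls_sites)
```

The same search could be run on a full-length sequence with SEQ ID NO: 38 as the motif to locate the second NLS upstream of the KRAB domain.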

G. gRNA

[0289] As described above, the CRISPR/Cas9 system utilizes gRNA that provides the targeting of the CRISPR/Cas9-based system. The gRNA is a fusion of two noncoding RNAs: a crRNA and a tracrRNA. The gRNA may target any desired DNA sequence by exchanging the sequence encoding a 20 bp protospacer, which confers targeting specificity through complementary base pairing with the desired DNA target. gRNA mimics the naturally occurring crRNA:tracrRNA duplex involved in the Type II Effector system. This duplex, which may include, for example, a 42-nucleotide crRNA and a 75-nucleotide tracrRNA, acts as a guide for the Cas9 to cleave the target nucleic acid. The terms "target region", "target sequence" and "protospacer" as used interchangeably herein refer to the region of the target gene that the CRISPR/Cas9-based system targets. The CRISPR/Cas9-based system may include at least one gRNA, wherein the gRNAs target different DNA sequences. The target DNA sequences may be overlapping. The target sequence or protospacer is followed by a PAM sequence at the 3' end of the protospacer. Different Type II systems have differing PAM requirements. For example, the S. pyogenes Type II system uses an "NGG" sequence, where "N" can be any nucleotide.
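The relationship between a protospacer and its 3' PAM can be sketched as a simple sequence scan. The example below is a minimal illustration that assumes a 20-nucleotide protospacer and the S. pyogenes "NGG" PAM mentioned above; it scans only the strand it is given (a fuller search would also scan the reverse complement), and the function name and defaults are hypothetical.

```python
import re


def find_protospacers(dna: str, protospacer_len: int = 20,
                      pam_regex: str = "[ACGT]GG"):
    """List (start, protospacer, pam) for sites on the given strand.

    Assumptions: the PAM immediately follows the 3' end of the
    protospacer, and the default pattern encodes "NGG".
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping candidate sites are all found.
    pattern = re.compile(
        rf"(?=([ACGT]{{{protospacer_len}}})({pam_regex}))"
    )
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


# Hypothetical toy sequence: 20 nt followed by a TGG PAM.
sites = find_protospacers("ACGTACGTACGTACGTACGT" + "TGG")
for start, protospacer, pam in sites:
    print(start, protospacer, pam)
```

A different Type II system could be modeled by passing another `pam_regex`, since PAM requirements differ between systems.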

[0290] The number of gRNA administered to the cell may be at least 1 gRNA, at least 2 different gRNAs, at least 3 different gRNAs, at least 4 different gRNAs, at least 5 different gRNAs, at least 6 different gRNAs, at least 7 different gRNAs, at least 8 different gRNAs, at least 9 different gRNAs, at least 10 different gRNAs, at least 11 different gRNAs, at least 12 different gRNAs, at least 13 different gRNAs, at least 14 different gRNAs, at least 15 different gRNAs, at least 16 different gRNAs, at least 17 different gRNAs, at least 18 different gRNAs, at least 19 different gRNAs, at least 20 different gRNAs, at least 25 different gRNAs, at least 30 different gRNAs, at least 35 different gRNAs, at least 40 different gRNAs, at least 45 different gRNAs, or at least 50 different gRNAs. The number of gRNA administered to the cell may be between at least 1 gRNA to at least 50 different gRNAs, at least 1 gRNA to at least 45 different gRNAs, at least 1 gRNA to at least 40 different gRNAs, at least 1 gRNA to at least 35 different gRNAs, at least 1 gRNA to at least 30 different gRNAs, at least 1 gRNA to at least 25 different gRNAs, at least 1 gRNA to at least 20 different gRNAs, at least 1 gRNA to at least 16 different gRNAs, at least 1 gRNA to at least 12 different gRNAs, at least 1 gRNA to at least 8 different gRNAs, at least 1 gRNA to at least 4 different gRNAs, at least 4 gRNAs to at least 50 different gRNAs, at least 4 different gRNAs to at least 45 different gRNAs, at least 4 different gRNAs to at least 40 different gRNAs, at least 4 different gRNAs to at least 35 different gRNAs, at least 4 different gRNAs to at least 30 different gRNAs, at least 4 different gRNAs to at least 25 different gRNAs, at least 4 different gRNAs to at least 20 different gRNAs, at least 4 different gRNAs to at least 16 different gRNAs, at least 4 different gRNAs to at least 12 different gRNAs, at least 4 different gRNAs to at least 8 different gRNAs, at least 8 different gRNAs to at least 
50 different gRNAs, at least 8 different gRNAs to at least 45 different gRNAs, at least 8 different gRNAs to at least 40 different gRNAs, at least 8 different gRNAs to at least 35 different gRNAs, at least 8 different gRNAs to at least 30 different gRNAs, at least 8 different gRNAs to at least 25 different gRNAs, at least 8 different gRNAs to at least 20 different gRNAs, at least 8 different gRNAs to at least 16 different gRNAs, or at least 8 different gRNAs to at least 12 different gRNAs.

[0291] In one embodiment, the gRNA is selected to increase or decrease transcription of a target gene. In one embodiment, the gRNA targets a region upstream of the transcription start site of a target gene, e.g., between 0-1000 bp upstream of the transcription start site of a target gene. In one embodiment, the gRNA targets a region downstream of the transcription start site of a target gene, e.g., between 0-1000 bp downstream of the transcription start site of a target gene. In one embodiment, the gRNA targets a promoter region of a target gene. In one embodiment, the gRNA targets an enhancer region of a target gene.
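The positional windows described above (e.g., 0-1000 bp upstream or downstream of the transcription start site) can be expressed as a strand-aware offset check. This is a minimal sketch under stated assumptions: positions are genomic coordinates on a single chromosome, "upstream" is defined relative to the direction of transcription, and the function name, parameters, and coordinates are hypothetical.

```python
def within_tss_window(site_pos: int, tss_pos: int, strand: str = "+",
                      max_upstream: int = 1000,
                      max_downstream: int = 1000) -> bool:
    """Return True if a gRNA target site lies within the window around a TSS.

    The offset is negative for sites upstream of the TSS and positive for
    sites downstream, measured in the direction of transcription.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    offset = site_pos - tss_pos if strand == "+" else tss_pos - site_pos
    return -max_upstream <= offset <= max_downstream


# Hypothetical coordinates: a site 500 bp upstream of a plus-strand TSS.
print(within_tss_window(site_pos=9_500, tss_pos=10_000, strand="+"))
```

Restricting candidates to upstream sites only could be done by passing `max_downstream=0`.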

[0292] gRNA can be divided into a target binding region, a Cas9 binding region, and a transcription termination region. The target binding region hybridizes with a target region in a target gene. Methods for designing such target binding regions are known in the art, see, e.g., Doench et al., Nat Biotechnol. (2014) 32:1262-7; and Doench et al., Nat Biotechnol. (2016) 34:184-91, incorporated by reference herein in their entirety. Design tools are available at, e.g., Feng Zhang lab's Target Finder, Michael Boutros lab's Target Finder (E-CRISP), RGEN Tools (Cas-OFFinder), CasFinder, and CRISPR Optimal Target Finder. In certain embodiments, the target binding region can be between about 15 and about 50 nucleotides in length (about 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or about 50 nucleotides in length). In certain embodiments, the target binding region can be between about 19 and about 21 nucleotides in length. In one embodiment, the target binding region is 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, or 25 nucleotides in length.

[0293] In one embodiment, the target binding region is complementary, e.g., completely complementary, to the target region in the target gene. In one embodiment, the target binding region is substantially complementary to the target region in the target gene. In one embodiment, the target binding region comprises no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 nucleotides that are not complementary to the target region in the target gene.
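The notion of a target binding region with a bounded number of non-complementary nucleotides can be illustrated by counting mismatches in an antiparallel RNA:DNA pairing. The sketch below assumes both sequences are written 5' to 3', are of equal length, and pair by Watson-Crick rules only; the function name and toy sequences are hypothetical.

```python
# RNA base -> DNA base it pairs with under Watson-Crick rules.
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}


def count_mismatches(target_binding_rna: str, target_strand_dna: str) -> int:
    """Count positions where the gRNA target binding region does not pair
    with the DNA strand it hybridizes to.

    Both inputs are written 5'->3'; pairing is antiparallel, so the RNA is
    compared against the reversed DNA strand.
    """
    rna = target_binding_rna.upper()
    dna = target_strand_dna.upper()
    if len(rna) != len(dna):
        raise ValueError("sequences must have equal length")
    return sum(
        RNA_TO_DNA_PAIR.get(r) != d for r, d in zip(rna, reversed(dna))
    )


# Toy example: 5'-GACU-3' pairs fully with 5'-AGTC-3'.
print(count_mismatches("GACU", "AGTC"))
```

A design filter could then keep only candidates whose mismatch count does not exceed a chosen limit, such as one of the values listed above.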

[0294] In one embodiment, the target binding region is engineered to improve stability or extend half-life, e.g., by incorporating a non-natural nucleotide or a modified nucleotide in the target binding region, by removing or modifying an RNA destabilizing sequence element, by adding an RNA stabilizing sequence element, or by increasing the stability of the Cas9/gRNA complex. In one embodiment, the target binding region is engineered to enhance its transcription. In one embodiment, the target binding region is engineered to reduce secondary structure formation.

[0295] In one embodiment, the Cas9 binding region of gRNA is modified to enhance the transcription of the gRNA. In one embodiment, the Cas9 binding region of gRNA is modified to improve stability or assembly of the Cas9/gRNA complex.

H. Gene Therapy Construct

[0296] Another aspect of the present disclosure provides a gene therapy construct comprising, consisting of, or consisting essentially of a fusion protein comprising three heterologous polypeptide domains, wherein the first polypeptide domain comprises, consists of, or consists essentially of a dead Clustered Regularly Interspaced Short Palindromic Repeats associated (dCas) protein, the second polypeptide domain comprises, consists of, or consists essentially of a Kruppel-associated box (KRAB), and the third polypeptide domain has an activity selected from the group consisting of transcription activation activity, transcription repression activity, transcription release factor activity, histone modification activity, nuclease activity, nucleic acid association activity, methylase activity, and demethylase activity.

[0297] In one aspect, the present disclosure provides a nucleic acid encoding a fusion protein comprising a dCas9 molecule fused to a modulator of gene expression. In one embodiment, the nucleic acid contains a promoter operably linked to a polynucleotide encoding the fusion protein. In one embodiment, the promoter is constitutive. In one embodiment, the promoter is inducible. In one embodiment, the promoter is tissue specific. In one embodiment, the promoter is specific for liver expression. In one embodiment, the promoter for the polynucleotide encoding the fusion protein is selected to express an amount of the fusion protein that is proportional to the amount of gRNA, or amount of gRNA expression.

[0298] In another aspect, the present disclosure provides a nucleic acid encoding gRNA. In one embodiment, the nucleic acid contains a promoter operably linked to a polynucleotide encoding the gRNA. In one embodiment, the promoter is constitutive. In one embodiment, the promoter is inducible. In one embodiment, the promoter is tissue specific. In one embodiment, the promoter is specific for liver expression. In one embodiment, the promoter for the polynucleotide encoding the gRNA is selected to express an amount of the gRNA that is proportional to the amount of the fusion protein, or amount of fusion protein expression.

[0299] In some embodiments, the gene therapy construct comprises a vector system. In certain embodiments, the vector system comprises an AAV vector system.

[0300] In another embodiment, the gene therapy construct further comprises a first and second AAV inverted terminal repeat (ITR) sequence flanking the fusion protein.

[0301] In one embodiment, the vector system is a single viral vector system comprising a viral vector. In one embodiment, the vector is an adeno-associated virus (AAV) vector. In one embodiment, the adeno-associated virus is selected from the serotype 2, the serotype 5, the serotype 7, the serotype 8, and the serotype 9. In one embodiment, the vector comprises a first nucleic acid molecule that encodes a fusion molecule comprising a dCas9 molecule fused to a modulator that regulates the expression of a gene, and a second nucleic acid molecule that encodes a gRNA that targets the fusion molecule to the gene.

[0302] In one embodiment, the vector system comprises two or more viral vectors. In one embodiment, the vector system is a dual viral vector system comprising a first viral vector and a second viral vector. In one embodiment, the first and second vectors are adeno-associated virus (AAV) vectors. In one embodiment, the adeno-associated virus (AAV) vectors are the same or different AAV serotypes. In one embodiment, the adeno-associated virus is selected from the serotype 2, the serotype 5, the serotype 7, the serotype 8, and the serotype 9. In one embodiment, the first vector comprises a first nucleic acid molecule that encodes a fusion molecule comprising a dCas9 molecule fused to a modulator that regulates the expression of a gene; and the second vector comprises a second nucleic acid molecule that encodes a gRNA that targets the fusion molecule to the gene.
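The choice between the single and dual viral vector systems described above can be sketched as a packaging-size check. This is an illustration only: the default capacity of about 4,700 bp is an assumed approximate AAV packaging limit and is not stated in this disclosure, the cassette lengths are invented for the example, and the names `Cassette` and `plan_vectors` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cassette:
    name: str
    length_bp: int  # length of the expression cassette in base pairs


def plan_vectors(cassettes: list[Cassette],
                 capacity_bp: int = 4700) -> list[list[Cassette]]:
    """Group cassettes into one vector if they fit, else one per vector.

    capacity_bp is an assumed approximate packaging limit, not a value
    taken from the disclosure.
    """
    oversized = [c.name for c in cassettes if c.length_bp > capacity_bp]
    if oversized:
        raise ValueError(f"cassette(s) exceed capacity: {oversized}")
    if sum(c.length_bp for c in cassettes) <= capacity_bp:
        return [list(cassettes)]          # single vector system
    return [[c] for c in cassettes]       # e.g., dual vector system


# Hypothetical lengths for a fusion-protein cassette and a gRNA cassette.
plan = plan_vectors([Cassette("dCas9-KRAB fusion", 4500),
                     Cassette("gRNA", 400)])
print([[c.name for c in vector] for vector in plan])
```

In this invented case the two cassettes together exceed the assumed capacity, so the sketch assigns the fusion molecule and the gRNA to separate vectors.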

[0303] Different AAV capsids may be used in the compositions and methods described herein. For example, suitable AAV includes, but is not limited to, AAV8 (see, e.g., U.S. Pat. Nos. 7,790,449 and 7,282,199, incorporated by reference herein in their entirety), AAV9 (see, e.g., U.S. Pat. No. 7,906,111 and US 2011/0236353, incorporated by reference herein in their entirety), hu.37 (see, e.g., U.S. Pat. No. 7,906,111 and US 2011/0236353, incorporated by reference herein in their entirety), AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV6.2, AAV7, and AAV8 (see, e.g., U.S. Pat. Nos. 7,790,449 and 7,282,199, incorporated by reference herein in their entirety). The sequences of additional suitable AAV vectors and methods for generating them are disclosed in WO 2003/042397, WO 2005/033321, WO 2006/110689, U.S. Pat. Nos. 7,790,449, 7,282,199, and 7,588,772, incorporated by reference herein in their entirety. Still other AAV may be selected, optionally taking into consideration tissue preferences of the selected AAV capsid. A recombinant AAV vector (AAV viral particle) may comprise, packaged within an AAV capsid, a nucleic acid molecule containing a 5' AAV ITR, the expression cassettes described herein and a 3' AAV ITR. As described herein, an expression cassette may contain regulatory elements for an open reading frame(s) within each expression cassette and the nucleic acid molecule may optionally contain additional regulatory elements.

[0304] The AAV vector may contain a full-length AAV 5' inverted terminal repeat (ITR) and a full-length 3' ITR. A shortened version of the 5' ITR, termed ΔITR, has been described in which the D-sequence and terminal resolution site (trs) are deleted. The abbreviation "sc" refers to self-complementary. "Self-complementary AAV" refers to a construct in which a coding region carried by a recombinant AAV nucleic acid sequence has been designed to form an intra-molecular double-stranded DNA template. Upon infection, rather than waiting for cell mediated synthesis of the second strand, the two complementary halves of scAAV will associate to form one double stranded DNA (dsDNA) unit that is ready for immediate replication and transcription. See, e.g., McCarty et al, Gene Ther. 2001, 8:1248-54, incorporated by reference herein in its entirety. Self-complementary AAVs are described in, e.g., U.S. Pat. Nos. 6,596,535; 7,125,717, and 7,456,683, incorporated by reference herein in their entirety.

[0305] A single-stranded AAV viral vector may be used. Methods for generating and isolating AAV viral vectors suitable for delivery to a subject are known in the art. See, e.g., U.S. Pat. Nos. 7,790,449; 7,282,199; WO 2003/042397; WO 2005/033321; WO 2006/110689; and U.S. Pat. No. 7,588,772. In one system, a producer cell line is transiently transfected with a construct that encodes the transgene flanked by ITRs and a construct(s) that encodes rep and cap. In a second system, a packaging cell line that stably supplies rep and cap is transfected (transiently or stably) with a construct encoding the transgene flanked by ITRs. In each of these systems, AAV virions are produced in response to infection with helper adenovirus or herpesvirus, requiring the separation of the rAAVs from contaminating virus. More recently, systems have been developed that do not require infection with helper virus to recover the AAV--the required helper functions (i.e., adenovirus E1, E2a, VA, and E4 or herpesvirus UL5, UL8, UL52, and UL29, and herpesvirus polymerase) are also supplied, in trans, by the system. In these newer systems, the helper functions can be supplied by transient transfection of the cells with constructs that encode the required helper functions, or the cells can be engineered to stably contain genes encoding the helper functions, the expression of which can be controlled at the transcriptional or posttranscriptional level. In yet another system, the transgene flanked by ITRs and rep/cap genes are introduced into insect cells by infection with baculovirus-based vectors. For reviews on these production systems, see generally, e.g., Zhang et al., Hum Gene Ther. 2009; 20:922-9, incorporated by reference herein in its entirety. Methods of making and using these and other AAV production systems are also described in the following U.S. patents, incorporated by reference herein in their entirety: U.S. Pat. Nos. 5,139,941; 5,741,683; 6,057,152; 6,204,059; 6,268,213; 6,491,907; 6,660,514; 6,951,753; 7,094,604; 7,172,893; 7,201,898; 7,229,823; and 7,439,065.

[0306] In another embodiment, other viral vectors may be used, including integrating viruses, e.g., herpesvirus or lentivirus vectors. Suitably, where one of these other vectors is generated, it is produced as a replication-defective viral vector. In one embodiment, the genome of the viral vector does not include genes encoding the enzymes required to replicate (the genome can be engineered to be "gutless"--containing only the transgene of interest flanked by the signals required for amplification and packaging of the artificial genome), but these genes may be supplied during production.

[0307] In another embodiment, a non-viral delivery system may be used. For example, a composition disclosed herein comprising a nucleic acid may be formulated with nanoparticles, micelles, liposomes, cationic lipids, poly-glycans, polymers, lipids and/or cholesterols. See, e.g., Su et al., Mol. Pharmaceutics, 2011, 8, 774-787; WO 2013/182683, WO 2010/053572, and WO 2012/170930, incorporated by reference herein in their entirety.

[0308] Another aspect of the present disclosure provides a pharmaceutical composition comprising the gene therapy construct as described herein in a biocompatible pharmaceutical carrier.

[0309] In another aspect, the present disclosure provides a modified programmable RNA-guided dCas9-based repressor for efficient packaging in AAV and in vivo gene regulation. This gene delivery system can be customized to target any endogenous gene by designing a new guide RNA molecule, enabling potent and stable gene repression in animal models and therapeutic use.

[0310] In some embodiments, the Cas protein comprises Cas9.

[0311] In some embodiments, the gene therapy construct is designed for the targeted reduction of the PCSK9 gene.

I. Gene Therapy Target

[0312] The invention disclosed herein can be used to modulate the expression of a gene of interest. In one embodiment, the expression of the gene is down-regulated. In one embodiment, the expression of the gene is up-regulated. In one embodiment, the temporal pattern of the expression of the gene is modulated. In one embodiment, the spatial pattern of the expression of the gene is modulated. Exemplary genes, tissues expressing these genes, and relevant disease indications are disclosed in Tables 2 and 3. Table 2 provides genes, the expression of which can be down-regulated to treat diseases shown alongside the genes. Table 3 provides genes, the expression of which can be up-regulated to treat diseases shown alongside the genes.

TABLE-US-00006
TABLE 2 Exemplary genes for expression modulation (e.g., repression) and Exemplary Diseases and Tissues
Gene	Disease	Tissue
proprotein convertase subtilisin/kexin type 9 (PCSK9)	Hypercholesteremia	Liver
activin receptor type-2B (ACVR2B)	muscle weakness	Muscle
huntingtin gene (HTT)	Huntington's disease	Brain
superoxide dismutase 1 (SOD1)	Amyotrophic lateral sclerosis	Brain
transthyretin (TTR)	Hereditary ATTR amyloidosis	Liver
antithrombin	Hemophilia	Liver
complement component C5	Complement-mediated disease	Liver
aminolevulinic acid synthase 1	Hepatic porphyria	Liver
glycolate oxidase	Primary hyperoxaluria type 1	Liver
transmembrane protease, serine 6 (Tmprss6)	Beta thalassemia	Liver
alpha-antitrypsin (AAT)	Alpha-1 antitrypsin (AAT) deficiency	Liver
vascular endothelial growth factor (VEGF)	Age-related macular degeneration	Retina
C9orf72	Familial frontotemporal dementia (FTD) and amyotrophic lateral sclerosis (ALS)	Brain
KRAS	Cancer	tumor
human epidermal growth factor receptor 2 (HER2)	Cancer	tumor
Beta catenin	Cancer	tumor
angiopoietin-like 3 (ANGPTL3)	Hyperlipidemia	Liver
apolipoprotein C-III (apoCIII)	Hyperlipidemia	Liver
PD-L1	Chronic liver infection	Liver
HBV, HCV, HDV viral genomes	Hepatitis	Liver
vascular endothelial growth factor receptor 1 (VEGFR1)	Age-related macular degeneration	Retina
RTP801	Age-related macular degeneration	Retina
beta-2 adrenergic receptor (ADRB2)	Glaucoma, Ocular hypertension	Retina
Caspase 2	Glaucoma, Ocular hypertension	Retina
IKKbeta	Glaucoma	Retina
apolipoprotein A	Cardiovascular disease	Liver
factor 12	Hereditary angioedema	Liver
prekallikrein	Hereditary angioedema	Liver
apolipoprotein B-100	Hypercholesteremia	Liver
glucagon receptor	Diabetes	Liver
microRNA-103/107	Nonalcoholic steatohepatitis (NASH) in patients with type 2 diabetes	Liver
Diacylglycerol O-Acyltransferase 2 (DGAT2)	Nonalcoholic steatohepatitis (NASH) in patients with type 2 diabetes	Liver
Ube3a-ATS	Angelman Syndrome	Brain
TNFR	Autoimmune disease	Various - cartilage
FRG1	Facioscapulohumeral muscular dystrophy	Muscle
BCR-ABL	Chronic myelogenous leukemia	Blood tumor
TEL-AML1	Acute lymphoblastic leukemia	Blood tumor
PTEN	Cancer	Tumor
Other tumor suppressors	Cancer	Tumor
Mendelian disorders	Various	Various
Triggering receptor expressed on myeloid cells 2 (TREM-2)	Neurodegenerative disease, e.g., Alzheimer's disease, amyotrophic lateral sclerosis, and Parkinson's disease	CNS
APOE4	Alzheimer's disease	CNS
CD33	Alzheimer's disease	CNS
Other disease risk genes	Various	Various

TABLE-US-00007
TABLE 3 Exemplary genes for expression modulation (e.g., activation) and Exemplary Diseases and Tissues
Gene	Disease	Tissue
aromatic L-amino acid decarboxylase (AADC)	Parkinson's disease	Brain
triggering receptor expressed on myeloid cells 2 (TREM2)	Alzheimer's Disease	Brain
vascular endothelial growth factor (VEGF)	Tissue regeneration	Various - muscle
brain-derived neurotrophic factor (BDNF)	Neurological conditions	Brain
platelet-derived growth factor (PDGF)	Tissue regeneration	Various - muscle
utrophin	Muscular dystrophy	Skeletal and cardiac muscle
frataxin	Friedreich's ataxia	Brain
sodium voltage-gated channel alpha subunit 1 (SCN1A)	Dravet Syndrome	Brain
pigment epithelium-derived factor (PEDF)	Wet AMD, cancer	Eye, tumor
BCL2 Associated X (BAX)	Cancer	Tumor
mammary serine protease inhibitor (maspin)	Cancer	Tumor
p53	Cancer	Tumor
cystic fibrosis transmembrane conductance regulator (CFTR)	Cystic fibrosis	Lung
fragile X mental retardation 1 (FMR1)	Fragile X	Brain
methyl-CpG-binding protein 2 (MECP2)	Rett syndrome	Brain
ubiquitin-protein ligase E3A (Ube3a)	Angelman syndrome	Brain
ubiquitin-protein ligase E3A (Ube3a)	Prader-Willi syndrome	Brain
IL1RA	rheumatoid arthritis	Cartilage
HBG1/HBG2	sickle cell anemia	Blood
IL-10	Colitis, inflammatory bowel disease	Gut - T cells
IL-2	Various - graft versus host disease, rheumatoid arthritis, lupus, type 1 diabetes	Various
Growth factors (e.g., having a protective or regenerative function)	Various	Various

J. Methods

[0313] A variety of different diseases and conditions (e.g., one or more diseases described herein), e.g., diseases and conditions associated with one or more genes described herein, including, e.g., genetic deletions, insertions or mutations, can be treated using the method described herein. The compositions described herein can be delivered to any of the cells, tissues, or organs described herein to treat a disorder or condition associated with a gene described herein. Exemplary genes for expression modulation (e.g., repression or activation), and exemplary diseases and tissues, are described in Tables 2 and 3.

[0314] In one aspect, the present disclosure provides a method of suppressing the expression of a gene in a cell in vivo comprising, consisting of, or consisting essentially of administering to a cell a therapeutically effective amount of a gene therapy construct as described herein such that the gene expression is suppressed.

[0315] In one aspect, the present disclosure provides a method of suppressing the expression of a gene in vivo in a subject comprising, consisting of, or consisting essentially of administering to the subject a therapeutically effective amount of a gene therapy construct as described herein such that the gene expression is suppressed.

[0316] In some embodiments, the method is designed for the targeted reduction of the PCSK9 gene. In some embodiments, the method is designed for the targeted reduction of the expression of the PCSK9 gene.

[0317] In one aspect, the present disclosure provides a method of increasing the expression of a gene in a cell in vivo comprising, consisting of, or consisting essentially of administering to a cell a therapeutically effective amount of a gene therapy construct as described herein such that the gene expression is increased.

[0318] In one aspect, the present disclosure provides a method of increasing the expression of a gene in vivo in a subject comprising, consisting of, or consisting essentially of administering to the subject a therapeutically effective amount of a gene therapy construct as described herein such that the gene expression is increased.

[0319] In one embodiment, the aforementioned methods comprise administering to the cell or subject: a first nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression, and a second nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, in an amount sufficient to modulate expression of the gene. In one embodiment, the first and second nucleic acids are packaged in a same vector or different vectors. In one embodiment the first and second nucleic acids are packaged in a same AAV vector or different AAV vectors. In one embodiment, the first nucleic acid is a DNA. In one embodiment, the first nucleic acid is an mRNA.

[0320] In one embodiment, the aforementioned methods comprise administering to the cell or subject: a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression, and a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene, in an amount sufficient to modulate expression of the gene. In one embodiment, the nucleic acid is packaged in a viral vector, e.g., an AAV vector.

[0321] In one embodiment, the aforementioned methods comprise administering to the cell or subject: a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression, and a gRNA which targets the fusion molecule to the gene, in an amount sufficient to modulate expression of the gene.

[0322] In one embodiment, the aforementioned methods comprise administering to the cell or subject: a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression, and a gRNA which targets the fusion molecule to the gene, in an amount sufficient to modulate expression of the gene. In one embodiment, the nucleic acid is packaged in a viral vector, e.g., an AAV vector. In one embodiment, the nucleic acid is a DNA. In one embodiment, the nucleic acid is an mRNA.
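The four embodiments of paragraphs [0319] to [0322] differ only in whether the fusion molecule and the gRNA are each supplied directly or as an encoding nucleic acid. The following Python sketch is illustrative only and enumerates those pairings. The labels (a)(i), (a)(ii), (b)(i), and (b)(ii) follow the numbered paragraphs later in this disclosure; the dictionary and function names are hypothetical and are not part of the disclosure.

```python
from itertools import product

# Component (a): the dCas9-modulator fusion, supplied directly or as an encoding nucleic acid.
FUSION_FORMS = {
    "(a)(i)": "fusion molecule",
    "(a)(ii)": "nucleic acid encoding the fusion molecule",
}

# Component (b): the gRNA, supplied directly or as an encoding nucleic acid.
GRNA_FORMS = {
    "(b)(i)": "gRNA",
    "(b)(ii)": "nucleic acid encoding the gRNA",
}


def delivery_combinations():
    """Return every pairing of one fusion form with one gRNA form."""
    return [
        (a_key, b_key, f"{a_desc} + {b_desc}")
        for (a_key, a_desc), (b_key, b_desc) in product(
            FUSION_FORMS.items(), GRNA_FORMS.items()
        )
    ]


# Illustrative listing of the four pairings.
for a_key, b_key, description in delivery_combinations():
    print(a_key, b_key, description)
```

Under this labeling, paragraph [0319] corresponds to (a)(ii) with (b)(ii), paragraph [0320] to (a)(i) with (b)(ii), paragraph [0321] to (a)(i) with (b)(i), and paragraph [0322] to (a)(ii) with (b)(i).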

[0323] Different administration routes may be used for the methods disclosed herein. The compositions disclosed herein can be administered systemically or locally. In some embodiments, the compositions disclosed herein are administered intravenously, subcutaneously, orally, via inhalation, intranasally, intratracheally, intraarterially, intraocularly, or intramuscularly. In some embodiments, the compositions may be delivered in a single administration or multiple administrations. In one embodiment, two or more AAV vectors may be delivered, see, e.g., WO 2011/126808 and WO 2013/049493, incorporated by reference herein in their entirety.

[0324] In the case of AAV viral vectors, quantification of the genome copies ("GC") may be used as the measure of the dose contained in the formulation. Any method known in the art can be used to determine the genome copy (GC) number of the replication-defective virus compositions of the invention.
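When the dose is expressed in genome copies, the volume of formulation to administer follows from the measured titer. The following Python sketch is a minimal illustration of that arithmetic. The function name and the numerical values are hypothetical and are not taken from this disclosure; the titer itself would be determined by any method known in the art.

```python
def volume_for_dose_ml(dose_gc: float, titer_gc_per_ml: float) -> float:
    """Volume (mL) of an AAV formulation that contains the requested dose.

    dose_gc          -- desired dose in genome copies (GC)
    titer_gc_per_ml  -- measured titer of the formulation in GC per mL
    """
    if dose_gc <= 0 or titer_gc_per_ml <= 0:
        raise ValueError("dose and titer must both be positive")
    return dose_gc / titer_gc_per_ml


# Hypothetical values for illustration only.
example_titer = 1e13   # GC per mL (assumed)
example_dose = 5e11    # GC (assumed)
print(f"{volume_for_dose_ml(example_dose, example_titer):.3f} mL")
```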

[0325] Production of lentivirus is measured as IU per volume (e.g., mL). IU is infectious unit, or alternatively transduction units (TU); IU and TU can be used interchangeably as a quantitative measure of the titer of a viral vector particle preparation.
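One commonly used way to estimate a titer in TU per mL is to transduce a known number of cells with a diluted vector preparation and score the fraction of cells that are transduced. The following Python sketch illustrates that calculation. It is an assumption offered for illustration and is not a titration method specified in this disclosure; the function name, parameter names, and example values are hypothetical.

```python
def functional_titer_tu_per_ml(
    cells_at_transduction: float,
    fraction_transduced: float,
    dilution_factor: float,
    inoculum_volume_ml: float,
) -> float:
    """Estimate lentiviral titer in transducing units (TU) per mL.

    titer = (cells x fraction transduced x dilution factor) / inoculum volume
    """
    if not 0 < fraction_transduced <= 1:
        raise ValueError("fraction_transduced must be in (0, 1]")
    if cells_at_transduction <= 0 or dilution_factor <= 0 or inoculum_volume_ml <= 0:
        raise ValueError("cell count, dilution factor, and volume must be positive")
    return (
        cells_at_transduction * fraction_transduced * dilution_factor
    ) / inoculum_volume_ml


# Hypothetical values: 1e5 cells, 10% transduced, 100-fold dilution, 1 mL inoculum.
print(f"{functional_titer_tu_per_ml(1e5, 0.10, 100, 1.0):.2e} TU/mL")
```

The estimate is generally more reliable when the transduced fraction is low, because a cell that receives more than one vector particle is still counted only once.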

[0326] Any known RNA delivery method can be used in the methods disclosed herein, including but not limited to, delivering RNA using block copolymers (see, e.g., US 2011/0286957, EP2620161, and WO 2015/017519, incorporated by reference herein in their entirety), and delivering RNA using cationic complexes or liposomal formulations (see, e.g., Landen et al., Cancer Biol. Ther. (2006) 5(12); Khoury et al., Arthritis Rheumatol. (2006) 54: 1867-77, incorporated by reference herein in their entirety). Local administration to the liver has also been demonstrated by injecting double stranded RNA directly into the circulatory system surrounding the liver using renal vein catheterization, see, e.g., Hamar et al., PNAS (2004) 101: 14883-8, incorporated by reference herein in its entirety.

[0327] Other methods are disclosed in WO 2013/143555; US 2013/0323001; US 2012/0195917; Soutschek et al., Nature (2004) 432: 173-8; Morrissey et al., Hepatol. (2005) 41: 1349-56; and Uchida et al., PLoS ONE (2013) 8: e56220, incorporated by reference herein in their entirety.

K. Kits

[0328] Another aspect of the present disclosure provides a kit for the suppression of a gene in vivo comprising a gene therapy construct or pharmaceutical composition as described herein and instructions for use.

[0329] Yet another aspect of the present disclosure provides all that is described and illustrated herein.

[0330] The present invention may be defined in any of the following numbered paragraphs:
[0331] 1. A method of modulating expression of a gene, in vivo, in a subject comprising administering to, or providing in, the subject:
[0332] (a) (i) a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; or (ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and
[0333] (b) (i) a gRNA which targets the fusion molecule to the gene; or (ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene,
[0334] in an amount sufficient to modulate expression of the gene.
[0335] 2. The method of paragraph 1, comprising administering to, or providing in, the subject any of: (a)(ii) and (b)(ii), (a)(i) and (b)(i), (a)(i) and (b)(ii), or (a)(ii) and (b)(i).
[0336] 3. The method of paragraph 1 or 2, comprising administering to, or providing in, the subject:
[0337] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising a dCas9 molecule fused to a modulator of gene expression; and
[0338] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene.
[0339] 4. The method of any of the preceding paragraphs, wherein the nucleic acid of (a)(ii) comprises DNA.
[0340] 5. The method of any of the preceding paragraphs, wherein the nucleic acid of (b)(ii) comprises DNA.
[0341] 6. The method of any of the preceding paragraphs, wherein the nucleic acid of (a)(ii) comprises RNA.
[0342] 7. The method of any of the preceding paragraphs, wherein the nucleic acid of (b)(ii) comprises RNA.
[0343] 8. The method of any of the preceding paragraphs, wherein one or both of (a) and (b) are packaged in a viral vector.
[0344] 9. The method of any of the preceding paragraphs, wherein (a) is packaged in a viral vector.
[0345] 10. The method of any of the preceding paragraphs, wherein (b) is packaged in a viral vector.
[0346] 11. The method of any of the preceding paragraphs, wherein (a) and (b) are packaged in the same viral vector.
[0347] 12. The method of any of paragraphs 8-11, wherein the viral vector comprises an AAV vector.
[0348] 13. The method of any of paragraphs 8-11, wherein the viral vector comprises a lentiviral vector.
[0349] 14. The method of any of paragraphs 1-10, wherein (a) is packaged in a first viral vector and (b) is packaged in a second viral vector.
[0350] 15. The method of paragraph 14, wherein the first viral vector comprises an AAV vector and the second viral vector comprises an AAV vector.
[0351] 16. The method of any of the preceding paragraphs, wherein the dCas9 molecule comprises a gRNA binding domain of a Cas9 molecule.
[0352] 17. The method of any of the preceding paragraphs, wherein the dCas9 molecule comprises one, two or all of: a Rec1 domain, a bridge helix domain, or a PAM interacting domain, of a Cas9 molecule.
[0353] 18. The method of any of the preceding paragraphs, wherein the dCas9 molecule is a mutant of a wild-type Cas9 molecule, e.g., in which the Cas9 nuclease activity is inactivated.
[0354] 19. The method of any of the preceding paragraphs, wherein the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a DNA-cleavage domain of a Cas9 molecule.
[0355] 20. The method of any of the preceding paragraphs, wherein the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a RuvC domain and/or a mutation in a HNH domain.
[0356] 21. The method of any of the preceding paragraphs, wherein the dCas9 molecule comprises a Staphylococcus aureus dCas9 molecule, a Streptococcus pyogenes dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (e.g., strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9 molecule, a Campylobacter lari (e.g., strain CF89-12) dCas9 molecule, or a Streptococcus thermophilus (e.g., strain LMD-9) dCas9 molecule.
[0357] 22. The method of any of the preceding paragraphs, wherein the dCas9 molecule comprises an S. aureus dCas9 molecule, e.g., comprising an S. aureus dCas9 sequence described herein.
[0358] 23. The method of any of the preceding paragraphs, wherein the S. aureus dCas9 molecule comprises a mutation at an amino acid position, corresponding to position 10, 580, or both (e.g., D10A, N580A, or both), relative to a wild-type S. aureus dCas9 molecule, numbered according to SEQ ID NO: 25.
[0359] 24. The method of any of the preceding paragraphs, wherein the S. aureus dCas9 molecule comprises the amino acid sequence of SEQ ID NO: 35 or 36, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 35 or 36, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 35 or 36, or any fragment thereof.
[0360] 25. The method of any of paragraphs 1-20, wherein the dCas9 molecule comprises an S. pyogenes dCas9 molecule, e.g., comprising an S. pyogenes dCas9 sequence described herein.
[0361] 26. The method of any of paragraphs 1-20, wherein the S. pyogenes dCas9 molecule comprises a mutation at an amino acid position, corresponding to position 10, 840, or both (e.g., D10A, H840A, or both), relative to a wild-type S. pyogenes dCas9 molecule, numbered according to SEQ ID NO: 24.
[0362] 27. The method of any of the preceding paragraphs, wherein the dCas9 molecule is less than 1400, 1300, 1200, 1100, 1000, 900, 800, 700, 600, or 500 amino acids in length.
[0363] 28. The method of any of the preceding paragraphs, wherein the dCas9 molecule is 500-1300, 600-1200, 700-1100, 800-1000, 500-1200, 500-1000, 500-800, 500-600, 1000-1200, 800-1200, or 600-1200 amino acids in length.
[0364] 29. The method of any of the preceding paragraphs, wherein the dCas9 molecule has a size that is less than 90%, 80%, 70%, 60%, 50%, 40%, or 30% of the size of a wild-type Cas9 molecule, e.g., a wild-type S. pyogenes Cas9 molecule or a wild-type S. aureus dCas9 molecule.
[0365] 30. The method of any of the preceding paragraphs, wherein the modulator of gene expression comprises a modulator of gene expression described herein.
[0366] 31. The method of any of the preceding paragraphs, wherein the modulator of gene expression comprises a repressor of gene expression, e.g., a Kruppel associated box (KRAB) molecule, an mSin3 interaction domain (SID) molecule, four concatenated mSin3 interaction domains (SID4X), MAX-interacting protein 1 (MXI1), or any fragment thereof.
[0367] 32. The method of any of the preceding paragraphs, wherein the modulator of gene expression comprises a Kruppel associated box (KRAB) molecule comprising the sequence of SEQ ID NO: 34, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, or any fragment thereof.
[0368] 33. The method of any of the preceding paragraphs, wherein the modulator of gene expression comprises an activator of gene expression, e.g., a VP16 transcription activation domain, a VP64 transcriptional activation domain, a p65 activation domain, an Epstein-Barr virus R transactivator Rta molecule, a VP64-p65-Rta fusion (VPR), Ldb1 self-association domain, or any fragment thereof.
[0369] 34. The method of any of the preceding paragraphs, wherein the modulator of gene expression comprises a modulator of epigenetic modification, e.g., a histone acetyltransferase (e.g., p300 catalytic domain), a histone deacetylase, a histone methyltransferase (e.g., SUV39H1 or G9a (EHMT2)), a histone demethylase (e.g., Lys-specific histone demethylase 1 (LSD1)), a DNA methyltransferase (e.g., DNMT3a or DNMT3a-DNMT3L), a DNA demethylase (e.g., TET1 catalytic domain or TDG), or fragment thereof.
[0370] 35. The method of any of the preceding paragraphs, wherein the modulator of gene expression is fused to the C-terminus, N-terminus, or both, of the dCas9 molecule.
[0371] 36. The method of any of the preceding paragraphs, wherein the modulator of gene expression is fused to the dCas9 molecule directly.
[0372] 37. The method of any of paragraphs 1-34, wherein the modulator of gene expression is fused to the dCas9 molecule indirectly, e.g., via a non-modulator or a linker, or a second modulator.
[0373] 38. The method of any of the preceding paragraphs, wherein a plurality of modulators of gene expression, e.g., two or more identical, substantially identical, or different modulators, are fused to the dCas9 molecule.
[0374] 39. The method of any of the preceding paragraphs, wherein the fusion molecule further comprises a nuclear localization sequence.
[0375] 40. The method of paragraph 39, wherein one or more nuclear localization sequences are fused to the C-terminus, N-terminus, or both, of the dCas9 molecule, e.g., directly or indirectly, e.g., via a linker.
[0376] 41. The method of paragraph 40, wherein the one or more nuclear localization sequences comprise the amino acid sequence of SEQ ID NO: 37 or 38, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 37 or 38, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 37 or 38, or any fragment thereof.
[0377] 42. The method of any of the preceding paragraphs, wherein the fusion molecule comprises the amino acid sequence of SEQ ID NO: 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 39, 40, or 41, or any fragment thereof.
[0378] 43. The method of any of the preceding paragraphs, wherein the nucleic acid that encodes the fusion molecule comprises the sequence of SEQ ID NO: 23, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 23, or a sequence having one, two, three, four, five or more changes, e.g., substitutions, insertions, or deletions, relative to SEQ ID NO: 23, or any fragment thereof.
[0379] 44. The method of any of the preceding paragraphs, wherein the gRNA comprises a unimolecular gRNA.
[0380] 45. The method of any of paragraphs 1-43, wherein the gRNA comprises a bimolecular gRNA.
[0381] 46. The method of any of the preceding paragraphs, wherein the gRNA comprises a gRNA sequence described herein.
[0382] 47. The method of any of the preceding paragraphs, wherein gene expression is modulated in a cell, tissue, or organ described herein, e.g., Table 2 or 3.
[0383] 48. The method of any of the preceding paragraphs, wherein gene expression is modulated in the liver.
[0384] 49. The method of any of the preceding paragraphs, wherein the modulation is sufficient to alter a function of the gene, or a symptom of a disorder associated with the gene, as described herein, e.g., in Table 2 or 3.
[0385] 50. The method of any of the preceding paragraphs, wherein the modulation comprises modulation of transcription.
[0386] 51. The method of any of the preceding paragraphs, wherein the modulation comprises down-regulation of transcription.
[0387] 52. The method of any of the preceding paragraphs, wherein the modulation comprises up-regulation of transcription.
[0388] 53. The method of any of the preceding paragraphs, wherein the modulation comprises modulating the temporal pattern of expression of the gene.
[0389] 54. The method of any of the preceding paragraphs, wherein the modulation comprises modulating the spatial pattern of expression of the gene.
[0390] 55. The method of any of the preceding paragraphs, wherein the modulation comprises modulating a post-transcriptional or co-transcriptional modification, e.g., splicing, 5' capping, 3' cleavage, 3' polyadenylation, or RNA export.
[0391] 56. The method of any of the preceding paragraphs, wherein the modulation comprises modulating the expression of an isoform, e.g., an increase or decrease in the expression of an isoform, the increase or decrease in the expression of a first isoform over a second isoform.
[0392] 57. The method of any of the preceding paragraphs, wherein the modulation comprises modulating chromatin structure, e.g., increasing or decreasing methylation, acetylation, phosphorylation, or ubiquitination, e.g., at a preselected site, or altering the spatial pattern, cell specificity, or temporal occurrence of methylation, acetylation, phosphorylation, or ubiquitination.
[0393] 58. The method of any of the preceding paragraphs, wherein the modulation comprises modulating a post-translational modification (e.g., indirectly), e.g., glycosylation, lipidation, acetylation, phosphorylation, amidation, hydroxylation, methylation, ubiquitination, sulfation, nitrosylation, or proteolysis.
[0394] 59. The method of any of the preceding paragraphs, wherein the modulation does not comprise cleaving the subject's DNA.
[0395] 60. The method of any of the preceding paragraphs, wherein the modulation comprises an inducible modulation.
[0396] 61. The method of any of the preceding paragraphs, wherein the gene is selected from Table 2, optionally wherein the method down-regulates the expression of the gene.
[0397] 62. The method of any of paragraphs 1-60, wherein the gene is selected from Table 3, optionally wherein the method up-regulates the expression of the gene.
[0398] 63. The method of any of the preceding paragraphs, wherein the gene comprises PCSK9.
[0399] 64. The method of any of the preceding paragraphs, wherein the dCas9 molecule does not cleave the genome of the subject.
[0400] 65. A method of modulating expression of a gene, in vivo, in a subject comprising administering to, or providing in, the subject:
[0401] (a)(ii) a nucleic acid that encodes a fusion molecule (e.g., a fusion molecule described herein) comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and
[0402] (b)(ii) a nucleic acid that encodes a gRNA (e.g., a gRNA described herein) which targets the fusion molecule to the gene, and
[0403] wherein one or both of (a)(ii) and (b)(ii) are packaged in an AAV vector.
[0404] 66. The method of paragraph 65, wherein the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.
[0405] 67. The method of paragraph 65 or 66, wherein the gRNA comprises a gRNA sequence described herein.
[0406] 68. The method of any of paragraphs 65-67, wherein the gene is selected from Table 2 or 3.
[0407] 69. The method of any of paragraphs 65-68, wherein the gene comprises PCSK9.
[0408] 70. The method of any of paragraphs 65-69, wherein (a)(ii) and (b)(ii) are packaged in different AAV vectors.
[0409] 71. The method of any of paragraphs 65-70, wherein (a)(ii) and (b)(ii) are packaged in the same AAV vector.
[0410] 72. A pharmaceutical composition, or unit dosage form, comprising, in an amount sufficient for modulating a gene in a human subject, or in an amount sufficient for a therapeutic effect in a human subject,
[0411] (a)(ii) a nucleic acid that encodes a fusion molecule (e.g., a fusion molecule described herein) comprising a sequence comprising a dCas9 molecule, e.g., an S. aureus dCas9 molecule, fused to a modulator of gene expression (e.g., a modulator described herein); and/or
[0412] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene,
[0413] wherein one or both of (a)(ii) and (b)(ii) are packaged in a viral vector, e.g., an AAV vector.
[0414] 73. The pharmaceutical composition, or unit dosage form, of paragraph 72, wherein the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.
[0415] 74. The pharmaceutical composition, or unit dosage form, of paragraph 72 or 73, wherein the gRNA comprises a gRNA sequence described herein.
[0416] 75. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-74, wherein the gene is selected from Table 2 or 3.
[0417] 76. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-75, wherein the gene comprises PCSK9.
[0418] 77. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-76, wherein (a)(ii) and (b)(ii) are packaged in the same viral vector, e.g., an AAV vector.
[0419] 78. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-77, wherein (a)(ii) and (b)(ii) are packaged in different viral vectors, e.g., AAV vectors.
[0420] 79. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-78, wherein the viral vector (e.g., AAV vector) comprising (a)(ii), and the viral vector (e.g., AAV vector) comprising (b)(ii), are provided in separate containers.
[0421] 80. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-79, wherein the viral vector (e.g., AAV vector) comprising (a)(ii) and the viral vector (e.g., AAV vector) comprising (b)(ii), are provided in the same container.
[0422] 81. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-80, which is formulated for administration, e.g., oral, parenteral, sublingual, transdermal, rectal, transmucosal, topical, intrapleural, intravenous, intraarterial, intraperitoneal, subcutaneous, intramuscular, intranasal, intrathecal, or intraarticular administration, or administration via inhalation or via buccal administration, or any combination thereof, to the subject.
[0423] 82. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-81, which is formulated for intravenous administration to the subject.
[0424] 83. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-82, which is disposed in a device suitable for administration, e.g., oral, parenteral, sublingual, transdermal, rectal, transmucosal, topical, intrapleural, intravenous, intraarterial, intraperitoneal, subcutaneous, intramuscular, intranasal, intrathecal, or intraarticular administration, or administration via inhalation or via buccal administration, or any combination thereof, to the subject.
[0425] 84. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-83, which is disposed in a device suitable for intravenous administration to the subject.
[0426] 85. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-84, which is disposed in a volume of at least 1, 2, 5, 10, 15, 20, 25, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 400, or 500 ml.
[0427] 86. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-85, wherein the nucleic acid of (a)(ii) comprises DNA.
[0428] 87. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-86, wherein the nucleic acid of (b)(ii) comprises DNA.
[0429] 88. The pharmaceutical composition, or unit dosage form, of paragraphs 72-85 or 87, wherein the nucleic acid of (a)(ii) comprises RNA.
[0430] 89. The pharmaceutical composition, or unit dosage form, of paragraphs 72-86 or 88, wherein the nucleic acid of (b)(ii) comprises RNA.
[0431] 90. The pharmaceutical composition, or unit dosage form, of paragraphs 72-89, wherein the dCas9 molecule comprises a gRNA binding domain of a Cas9 molecule.
[0432] 91. The pharmaceutical composition, or unit dosage form, of paragraphs 72-90, wherein the dCas9 molecule comprises one, two or all of: a Rec1 domain, a bridge helix domain, or a PAM interacting domain, of a Cas9 molecule.
[0433] 92. The pharmaceutical composition, or unit dosage form, of paragraphs 72-91, wherein the dCas9 molecule is a mutant of a wild-type Cas9 molecule, e.g., in which the Cas9 nuclease activity is inactivated.
[0434] 93. The pharmaceutical composition, or unit dosage form, of paragraphs 72-90, wherein the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a DNA-cleavage domain of a Cas9 molecule.
[0435] 94. The pharmaceutical composition, or unit dosage form, of paragraphs 72-93, wherein the dCas9 molecule comprises a mutation that inactivates a Cas9 nuclease activity, e.g., a mutation in a RuvC domain and/or a mutation in a HNH domain.
[0436] 95. The pharmaceutical composition, or unit dosage form, of paragraphs 72-94, wherein the dCas9 molecule comprises a Staphylococcus aureus dCas9 molecule, a Streptococcus pyogenes dCas9 molecule, a Campylobacter jejuni dCas9 molecule, a Corynebacterium diphtheriae dCas9 molecule, a Eubacterium ventriosum dCas9 molecule, a Streptococcus pasteurianus dCas9 molecule, a Lactobacillus farciminis dCas9 molecule, a Sphaerochaeta globus dCas9 molecule, an Azospirillum (e.g., strain B510) dCas9 molecule, a Gluconacetobacter diazotrophicus dCas9 molecule, a Neisseria cinerea dCas9 molecule, a Roseburia intestinalis dCas9 molecule, a Parvibaculum lavamentivorans dCas9 molecule, a Nitratifractor salsuginis (e.g., strain DSM 16511) dCas9 molecule, a Campylobacter lari (e.g., strain CF89-12) dCas9 molecule, or a Streptococcus thermophilus (e.g., strain LMD-9) dCas9 molecule.
[0437] 96. The pharmaceutical composition, or unit dosage form, of paragraphs 72-95, wherein the dCas9 molecule comprises an S. aureus dCas9 molecule, e.g., comprising an S. aureus dCas9 sequence described herein.
[0438] 97. The pharmaceutical composition, or unit dosage form, of paragraph 96, wherein the S. aureus dCas9 molecule comprises a mutation at an amino acid position, corresponding to position 10, 580, or both (e.g., D10A, N580A, or both), relative to a wild-type S. aureus dCas9 molecule, numbered according to SEQ ID NO: 25.
[0439] 98. The pharmaceutical composition, or unit dosage form, of paragraph 96, wherein the S. aureus dCas9 molecule comprises the amino acid sequence of SEQ ID NO: 35 or 36, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 35 or 36, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 35 or 36, or any fragment thereof.
[0440] 99. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-95, wherein the dCas9 molecule comprises an S. pyogenes dCas9 molecule, e.g., comprising an S. pyogenes dCas9 sequence described herein.
[0441] 100. The pharmaceutical composition, or unit dosage form, of paragraph 99, wherein the S. pyogenes dCas9 molecule comprises a mutation at an amino acid position, corresponding to position 10, 840, or both (e.g., D10A, H840A, or both), relative to a wild-type S. pyogenes dCas9 molecule, numbered according to SEQ ID NO: 24.
[0442] 101. The pharmaceutical composition, or unit dosage form, of paragraphs 72-100, wherein the dCas9 molecule is less than 1400, 1300, 1200, 1100, 1000, 900, 800, 700, 600, or 500 amino acids in length.
[0443] 102. The pharmaceutical composition, or unit dosage form, of paragraphs 72-101, wherein the dCas9 molecule is 500-1300, 600-1200, 700-1100, 800-1000, 500-1200, 500-1000, 500-800, 500-600, 1000-1200, 800-1200, or 600-1200 amino acids in length.
[0444] 103. The pharmaceutical composition, or unit dosage form, of paragraphs 72-102, wherein the dCas9 molecule has a size that is less than 90%, 80%, 70%, 60%, 50%, 40%, or 30% of the size of a wild-type Cas9 molecule, e.g., a wild-type S. pyogenes Cas9 molecule or a wild-type S. aureus dCas9 molecule.
[0445] 104. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-103, wherein the modulator of gene expression comprises a modulator of gene expression described herein.
[0446] 105. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-104, wherein the modulator of gene expression comprises a KRAB molecule, e.g., comprising the sequence of SEQ ID NO: 34, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, or any fragment thereof.
[0447] 106. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-105, wherein the gRNA comprises a unimolecular gRNA.
[0448] 107. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-105, wherein the gRNA comprises a bimolecular gRNA.
[0449] 108. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-107, wherein the gRNA comprises a gRNA sequence described herein.
[0450] 109. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-108, wherein gene expression is modulated in a cell, tissue, or organ described herein, e.g., Table 2 or 3.
[0451] 110. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-109, wherein gene expression is modulated in the liver.
[0452] 111. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-110, wherein the modulation is sufficient to alter a function of the gene, or a symptom of a disorder associated with the gene, as described herein, e.g., in Table 2 or 3.
[0453] 112. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-111, wherein the modulation comprises modulation of transcription.
[0454] 113. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-112, wherein the modulation comprises down-regulation of transcription.
[0455] 114. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-113, wherein the modulation comprises up-regulation of transcription.
[0456] 115. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-114, wherein the modulation comprises modulating the temporal pattern of expression of the gene.
[0457] 116. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-115, wherein the modulation comprises modulating the spatial pattern of expression of the gene.
[0458] 117. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-116, wherein the modulation comprises modulating a post-transcriptional or co-transcriptional modification, e.g., splicing, 5' capping, 3' cleavage, 3' polyadenylation, or RNA export.
[0459] 118. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-117, wherein the modulation comprises modulating the expression of an isoform, e.g., an increase or decrease in the expression of an isoform, the increase or decrease in the expression of a first isoform over a second isoform.
[0460] 119. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-118, wherein the modulation comprises modulating chromatin structure, e.g., increasing or decreasing methylation, acetylation, phosphorylation, or ubiquitination, e.g., at a preselected site, or altering the spatial pattern, cell specificity, or temporal occurrence of methylation, acetylation, phosphorylation, or ubiquitination.
[0461] 120. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-119, wherein the modulation comprises modulating a post-translational modification (e.g., indirectly), e.g., glycosylation, lipidation, acetylation, phosphorylation, amidation, hydroxylation, methylation, ubiquitination, sulfation, nitrosylation, or proteolysis.
[0462] 121. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-120, wherein the gene is selected from Table 2, optionally wherein the method down-regulates the expression of the gene.
[0463] 122. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-120, wherein the gene is selected from Table 3, optionally wherein the method up-regulates the expression of the gene.
[0464] 123. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-122, wherein the gene comprises PCSK9.
[0465] 124. The pharmaceutical composition, or unit dosage form, of any of paragraphs 72-123, wherein the dCas9 does not cleave the genome of the subject.
[0466] 125. A pharmaceutical composition, or unit dosage form, comprising, in an amount sufficient for modulating a gene in a human subject, or in an amount sufficient for a therapeutic effect in a human subject,
[0467] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and/or
[0468] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to the gene,
[0469] wherein one or both of (a)(ii) and (b)(ii) are packaged in a viral vector, e.g., an AAV vector.
[0470] 126. 
The pharmaceutical composition, or unit dosage form, of paragraph 125, wherein the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof.

[0471] 127. The pharmaceutical composition, or unit dosage form, of paragraph 125 or 126, wherein the gRNA comprises a gRNA sequence described herein. [0472] 128. The pharmaceutical composition, or unit dosage form, of any of paragraphs 125-127, wherein the gene is selected from Table 2 or 3. [0473] 129. The pharmaceutical composition, or unit dosage form, of any of paragraphs 125-128, wherein the gene comprises PCSK9. [0474] 130. The pharmaceutical composition, or unit dosage form, of any of paragraphs 125-129, wherein (a)(ii) and (b)(ii) are packaged in different AAV vectors. [0475] 131. The pharmaceutical composition, or unit dosage form, of any of paragraphs 125-130, wherein (a)(ii) and (b)(ii) are packaged in the same AAV vector. [0476] 132. A viral vector comprising: [0477] (a)(ii) a nucleic acid that encodes a fusion molecule (e.g., a fusion molecule described herein) comprising a sequence comprising a dCas9 molecule (e.g., a dCas9 molecule described herein), e.g., an S. aureus dCas9 molecule, fused to a modulator of gene expression (e.g., a modulator described herein); and/or [0478] (b)(ii) a nucleic acid that encodes a gRNA (e.g., a gRNA described herein) which targets the fusion molecule to a gene (e.g., a gene described herein). [0479] 133. The viral vector of paragraph 132, which is an AAV vector. 134. The viral vector of paragraph 132 or 133, comprising: [0480] (a)(ii) a nucleic acid that encodes a fusion molecule comprising a sequence comprising an S. aureus dCas9 molecule fused to a KRAB molecule; and [0481] (b)(ii) a nucleic acid that encodes a gRNA which targets the fusion molecule to PCSK9, [0482] wherein one or both of (a)(ii) and (b)(ii) are packaged in an AAV vector. [0483] 135. 
The viral vector of any of paragraphs 132-134, wherein the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof. [0484] 136. The viral vector of any of paragraphs 132-135, wherein the gRNA comprises a gRNA sequence described herein. [0485] 137. The viral vector of any of paragraphs 132-136, wherein the gene is selected from Table 2 or 3. [0486] 138. The viral vector of any of paragraphs 132-137, wherein the gene comprises PCSK9. [0487] 139. A method of treating a disorder, comprising administering to a subject: [0488] (a)(ii) a nucleic acid that encodes a fusion molecule (e.g., a fusion molecule described herein) comprising a sequence comprising a dCas9 molecule (e.g., a dCas9 molecule described herein) fused to a modulator of gene expression (e.g., a modulator described herein); and [0489] (b)(ii) a nucleic acid that encodes a gRNA (e.g., a gRNA described herein) which targets the fusion molecule to a gene associated with the disorder, [0490] thereby treating the disorder. [0491] 140. The method of paragraph 139, wherein the disorder is selected from Table 2 or 3. [0492] 141. The method of paragraph 139 or 140, wherein the gene is selected from Table 2 or 3. [0493] 142. The method of any of paragraphs 139-140, wherein one or both of (a)(ii) and (b)(ii) are provided in an AAV vector. [0494] 143. 
A method of treating a cardiovascular disease, comprising administering to a subject: [0495] (a)(ii) a nucleic acid that encodes a fusion molecule (e.g., a fusion molecule described herein) comprising a sequence comprising a dCas9 molecule (e.g., a dCas9 molecule described herein) fused to a modulator of gene expression (e.g., a modulator described herein); and [0496] (b)(ii) a nucleic acid that encodes a gRNA (e.g., a gRNA described herein) which targets the fusion molecule to a PCSK9 gene, [0497] thereby treating the cardiovascular disease. [0498] 144. The method of paragraph 143, wherein the dCas9 molecule is an S. aureus dCas9 molecule. [0499] 145. The method of paragraph 143 or 144, wherein the fusion molecule comprises a sequence described herein, e.g., the amino acid sequence of SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, a sequence substantially identical (e.g., at least 80%, 85%, 90%, 92%, 95%, 97%, 98%, 99% or higher identical) to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or a sequence having one, two, three, four, five or more changes, e.g., amino acid substitutions, insertions, or deletions, relative to SEQ ID NO: 34, 35, 36, 37, 38, 39, 40, or 41, or any fragment thereof. [0500] 146. The method of any of paragraphs 143-145, wherein the gRNA comprises a gRNA sequence described herein. [0501] 147. The method of any of paragraphs 143-146, wherein one or both of (a)(ii) and (b)(ii) are provided in an AAV vector.

[0502] The following examples are provided by way of illustration and not by way of limitation.

EXAMPLES

[0503] 1. In Vivo Transcriptional Repression of Endogenous Genes Using S. aureus Cas9-Based Repressors.

1.1 Synopsis

[0504] RNA-guided dCas9-KRAB repressors have demonstrated promise in cell culture models for silencing target gene expression efficiently and specifically. An exciting application of this technology would be to study gene regulation in development and disease in animal models and to design novel gene therapies. However, a technology to deliver CRISPR/Cas9-based gene repressors in vivo has not been developed. AAV vectors have been used as a delivery platform for CRISPR/Cas9 nuclease components for in vivo studies and therapeutic applications (Ran, F. A. et al. Nature 520, 186-91 (2015), incorporated by reference herein in its entirety). Recently, a smaller Cas9 nuclease protein derived from S. aureus was described for AAV delivery and in vivo gene editing (Ran, F. A. et al. Nature 520, 186-91 (2015)). In this example, a KRAB repressor motif was fused to S. aureus nuclease-null dCas9 (dSaCas9), thereby generating a programmable RNA-guided repressor for in vivo gene regulation. dSaCas9-KRAB repressors efficiently silenced a reporter luciferase gene in primary fibroblasts and the myostatin receptor Acvr2b in a mouse myoblast cell line. When delivered intramuscularly via an AAV9 dual-vector expression system, dSaCas9-KRAB and Acvr2b gRNA were efficiently expressed in the injected tibialis anterior, heart, and liver tissues of adult wild-type mice. No appreciable silencing of Acvr2b was achieved in skeletal muscle, but dSaCas9-KRAB was biologically active and significantly silenced Acvr2b expression in heart and liver when delivered with a target guide RNA molecule. This gene delivery system can be customized to target any endogenous gene, enabling potent and stable gene repression in animal models and for therapeutic applications.

1.2 Introduction

[0505] RNA-guided gene regulation with the CRISPR/Cas9 system has enabled functional genomics studies in cell culture systems (Kearns, N. A. et al. Nat Methods (2015); Gilbert, L. A. et al. Cell 159, 647-61 (2014); Thakore, P. I. et al. Nat Methods 12, 1143-9 (2015); Konermann, S. et al. Nature 517, 583-8 (2015), incorporated by reference herein in their entirety). The potency and specificity of dCas9-KRAB epigenetic repressors, in particular, are promising for loss-of-function studies and guiding cell phenotype in vitro (Thakore, P. I., et al. Nat Methods 13, 127-37 (2016); Gilbert, L. A. et al. Cell 159, 647-61 (2014); Thakore, P. I. et al. Nat Methods 12, 1143-9 (2015), incorporated by reference herein in their entirety). Adapting programmable transcriptional modulators for use in vivo would allow for the study of gene regulation in complex organisms and enable the development of therapies to address aberrant gene regulation in disease.

[0506] The large packaging capacity of lentiviral vectors, a commonly used method to stably deliver CRISPR/Cas9 components in vitro, can accommodate the 4.2 kb S. pyogenes Cas9, epigenetic modulator fusions, a single gRNA, and associated regulatory elements required for expression. While efficacious for in vitro delivery, lentiviral delivery is, under certain circumstances, not suitable for in vivo gene regulation because of concerns about insertional mutagenesis. Adeno-associated viral (AAV) vectors are a promising gene delivery vehicle as they provide stable episomal gene expression with minimal integration and have been extensively engineered to target a variety of tissue types (Asokan, A., et al. Mol Ther 20, 699-708 (2012), incorporated by reference herein in its entirety). However, the packaging capacity of AAV is limited to 4.5 kb, precluding delivery of the 4.2 kb S. pyogenes dCas9 DNA-binding domain together with the KRAB repressor motif and associated regulatory elements. A smaller 3.2 kb Cas9 nuclease derived from S. aureus (SaCas9) has recently been identified and adapted for genome editing in vivo in the liver and skeletal muscle (Ran, F. A. et al. Nature 520, 186-91 (2015); Nelson, C. E. et al. Science 351, 403-7 (2016); Tabebordbar, M. et al. Science 351, 407-11 (2016), incorporated by reference herein in their entirety). A SaCas9-based transcriptional repressor was generated for AAV-based delivery and silencing of endogenous genes in vivo.
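The packaging argument above is simple arithmetic, and a minimal sketch may make it concrete. The sizes (4.5 kb AAV capacity, 4.2 kb S. pyogenes Cas9, 3.2 kb S. aureus Cas9) are taken from the preceding paragraph; the function name `remaining_capacity_kb` is hypothetical, and the sizes of the KRAB motif, promoter, and other regulatory elements are not stated here, so the sketch only reports the space left for them.

```python
# Sizes in kilobases, as stated in the preceding paragraph.
AAV_CAPACITY_KB = 4.5
CAS9_SIZES_KB = {"S. pyogenes Cas9": 4.2, "S. aureus Cas9": 3.2}


def remaining_capacity_kb(cas9_kb, capacity_kb=AAV_CAPACITY_KB):
    """Return the kb left over for the KRAB motif and regulatory elements."""
    return round(capacity_kb - cas9_kb, 2)


if __name__ == "__main__":
    for name, size_kb in CAS9_SIZES_KB.items():
        print(f"{name}: {remaining_capacity_kb(size_kb)} kb remaining")
```

Under these figures the S. pyogenes protein leaves 0.3 kb for everything else, whereas the S. aureus protein leaves 1.3 kb, which is the motivation for the dSaCas9-based design.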

[0507] The SaCas9-based transcriptional repressor was tested in vitro for silencing a luciferase reporter gene in primary fibroblasts. For in vivo gene regulation, the myostatin receptor, Acvr2b, was targeted. Inhibiting the myostatin signaling pathway is a potential method for treating skeletal muscle degeneration. Myostatin is a secreted protein that acts as a negative regulator of skeletal muscle growth by binding the activin type II receptor (Acvr2b) and activating TGF-β signaling pathways (Lee, S. J. Annu Rev Cell Dev Biol 20, 61-86 (2004), incorporated by reference herein in its entirety). Knockout animal models of myostatin and Acvr2b demonstrate a double muscling phenotype (Lee, S. J. Annu Rev Cell Dev Biol 20, 61-86 (2004); Lee, S. J. et al. Proc Natl Acad Sci USA 109, E2353-60 (2012), incorporated by reference herein in their entirety). Blocking myostatin signaling through systemic administration of blocking antibodies or soluble Acvr2b receptors has been tested in clinical trials for the treatment of muscular dystrophy, but has thus far shown limited efficacy and raised safety concerns over adverse side effects (Wagner, K. R. et al. Ann Neurol 63, 561-71 (2008); Smith, R. C. & Lin, B. K. Curr Opin Support Palliat Care 7, 352-60 (2013), incorporated by reference herein in their entirety). A more targeted strategy to localize myostatin inhibition to skeletal muscle may increase the efficacy and safety of this strategy for treating muscle disorders.

[0508] An AAV9 two-vector system was designed for expressing SaCas9 repressors and a targeting guide RNA (gRNA) molecule. AAV9 can provide stable and high transgene expression in skeletal and cardiac muscle (Asokan, A., et al. Mol Ther 20, 699-708 (2012); Zincarelli, C., et al. Mol Ther 16, 1073-80 (2008), incorporated by reference herein in their entirety) and is currently being evaluated in clinical trials for spinal muscular atrophy. When delivered intramuscularly in adult wild-type mice, SaCas9 repressors effected significant silencing of the endogenous Acvr2b gene in the heart and liver. These studies demonstrate that SaCas9-based repressors can regulate genes in animal models and will facilitate the development of gene-regulation based therapies.

1.3 Materials and Methods

1.3.1 Plasmid Constructs and AAV Design

[0509] An inactive version of SaCas9 (dSaCas9) was created by introducing D10A and N580A mutations (Ran, F. A. et al. Nature 520, 186-91 (2015), incorporated by reference herein in its entirety). dSaCas9 was cloned into a lentiviral vector driven by the human Ubiquitin C (hUbC) promoter, fused to a KRAB repressor motif, and linked to a puromycin resistance cassette via a T2A ribosome-skipping peptide. For sgRNA screening, oligonucleotides containing protospacer sequences were synthesized (IDT-DNA), hybridized, phosphorylated, and inserted into a phU6-SaCas9 gRNA plasmid using BbsI sites. U6-gRNA cassettes were then cloned in reverse orientation upstream of the hUbC promoter in dSaCas9-KRAB lentiviral vectors for stable expression.

[0510] A Staphylococcus aureus Cas9 (SaCas9) AAV expression plasmid (Addgene #61592) was received as a gift from the Zhang lab (Ran, F. A. et al. Nature 520, 186-91 (2015), incorporated by reference herein in its entirety). We replaced the nuclease-active SaCas9 with dSaCas9-KRAB. We also removed the C-terminal 3×HA epitope tag and incorporated a single N-terminal HA tag for tracking protein expression. For the AAV-U6 gRNA plasmid, a U6-Acvr2b gRNA cassette was cloned into a pTR-eGFP backbone replacing the CMV with the gRNA.

1.3.2 Cell Culture

[0511] C2C12 cells and HEK293T cells were obtained from the American Type Culture Collection (ATCC) through the Duke University Cancer Center Facilities. Primary fibroblasts were harvested from the tail and ear of adult mice expressing a CAG-Luciferase-P2A-GFP cassette (Jackson Laboratories). C2C12 cells were maintained in DMEM supplemented with 20% FBS and 1% penicillin-streptomycin. HEK293T cells were cultured in DMEM supplemented with 10% FBS and 1% penicillin-streptomycin. Mouse fibroblasts were cultured in DMEM supplemented with 10% FBS and 1% penicillin-streptomycin. All cells were cultured at 37° C. with 5% CO2.

1.3.3 Lentiviral Production

[0512] C2C12 cells and primary fibroblasts were transduced with lentivirus to stably express dSaCas9-KRAB and target gRNA molecules. To produce VSV-G pseudotyped lentivirus, HEK293T cells were plated at a density of 5.1×10^3 cells/cm^2 in high glucose DMEM supplemented with 10% FBS and 1% penicillin-streptomycin. The day after seeding, cells in 10-cm plates were co-transfected with the appropriate dSaCas9-KRAB lentiviral expression plasmid (20 µg), the second-generation packaging plasmid psPAX2 (Addgene #12260, 15 µg), and the envelope plasmid pMD2.G (Addgene #12259, 6 µg) by calcium phosphate precipitation (Salmon, P. & Trono, D. Curr Protoc Neurosci Chapter 4, Unit 4.21 (2006), incorporated by reference herein in its entirety). After 14-20 hours, transfection medium was exchanged for 10 mL of fresh 293T medium.

[0513] Conditioned medium containing lentivirus was collected 24 and 48 hours after the first media exchange. Residual producer cells were cleared from the lentiviral supernatant by filtration through 0.45 µm cellulose acetate filters, and the cleared supernatant was incubated overnight with Lenti-X. Concentrated virus was pelleted by centrifugation according to the manufacturer's protocol and resuspended at 20-fold concentration in PBS. Concentrated viral supernatant was snap-frozen in liquid nitrogen and stored at -80° C. for future use. For transduction, concentrated viral supernatant was diluted 1:20 with media. To facilitate transduction, the cationic polymer polybrene was added at a concentration of 4 µg/mL to the viral media. Non-transduced (NT) cells did not receive virus but were treated with polybrene as a control. The day after transduction, the medium was exchanged to remove the virus. Puromycin at 2 µg/mL (C2C12 cells) or 4 µg/mL (fibroblasts) was used to initiate selection for transduced cells approximately 48 hours after transduction.
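The concentration and dilution steps above can be checked with a short calculation. The following is a minimal sketch, assuming that the "1:20" dilution means a 20-fold dilution; the function names, the 2 mL media volume, and the 1000 µg/mL polybrene stock are hypothetical values chosen only for illustration and do not appear in the protocol.

```python
def effective_concentration(fold_concentrated, dilution_factor):
    """Concentration relative to the original supernatant after dilution."""
    return fold_concentrated / dilution_factor


def stock_volume_ul(final_ug_per_ml, media_ml, stock_ug_per_ml):
    """Microliters of stock needed to reach a final concentration in media."""
    return final_ug_per_ml * media_ml * 1000.0 / stock_ug_per_ml


if __name__ == "__main__":
    # 20-fold concentrated virus diluted 20-fold returns to ~1x supernatant.
    print(effective_concentration(20, 20))
    # Hypothetical: 2 mL of viral media and a 1000 ug/mL polybrene stock.
    print(stock_volume_ul(final_ug_per_ml=4, media_ml=2, stock_ug_per_ml=1000))
```

Under the stated assumption, the diluted virus is roughly equivalent in concentration to the original conditioned medium, so the concentration step mainly serves storage and medium exchange rather than dose escalation.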

1.3.4 AAV Production

[0514] ITRs were verified by SmaI digest before production. AAV-dSaCas9-KRAB and AAV-U6 Acvr2b gRNA were used to generate AAV9 in two separate batches by the Gene Transfer Vector Core at Schepens Eye Research Institute, Massachusetts Eye and Ear. Titers were provided at 5.3×10^13 vp/mL (AAV-dSaCas9-KRAB) and 1.6×10^13 vp/mL (AAV-U6 Acvr2b gRNA).

1.3.5 Animal Studies

[0515] Animal studies were conducted with adherence to the guidelines for the care and use of laboratory animals of the National Institutes of Health (NIH). All the experiments with animals were approved by the Institutional Animal Care and Use Committee (IACUC) at Duke University. 6-8 week old C57BL/6 mice (Jackson Labs) were anesthetized and maintained at 37° C. The right tibialis anterior muscle was prepared and injected with 30-40 µL of AAV solution (5.6×10^11-7.46×10^11 vp) or sterile PBS using a 30 G needle. Mice were injected with a saline control, a 5e11 vp dose of AAV-dSaCas9-KRAB alone, or a 1:1 mixture of AAV-dSaCas9-KRAB and AAV-U6 Acvr2b gRNA at a total dose of 1e12 vp. At 4 and 8 weeks post-injection, mice were euthanized by CO2 inhalation and tissue was collected into RNALater® (Life Technologies) for DNA and RNA or snap-frozen for protein analysis.
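The relationship between vector titer, injected dose, and injection volume can be sketched as follows. The titers are those reported in the AAV production section; the per-vector dose of 5e11 vp is an assumption here, taken as half of the stated 1e12 vp total for a 1:1 mixture. The function name `volume_ul_for_dose` is hypothetical, and the sketch ignores any dilution or buffer addition, which the text does not describe.

```python
# Titers (viral particles per mL) as reported in the AAV production section.
TITERS_VP_PER_ML = {
    "AAV-dSaCas9-KRAB": 5.3e13,
    "AAV-U6 Acvr2b gRNA": 1.6e13,
}


def volume_ul_for_dose(dose_vp, titer_vp_per_ml):
    """Microliters of undiluted vector stock that contain dose_vp particles."""
    return dose_vp / titer_vp_per_ml * 1000.0


if __name__ == "__main__":
    per_vector_dose = 5e11  # assumed: half of the 1e12 vp total (1:1 mixture)
    total_ul = 0.0
    for name, titer in TITERS_VP_PER_ML.items():
        vol = volume_ul_for_dose(per_vector_dose, titer)
        total_ul += vol
        print(f"{name}: {vol:.2f} uL")
    print(f"combined: {total_ul:.2f} uL")
```

Because the two stocks differ in titer by more than threefold, equal particle doses require unequal volumes, with the gRNA vector contributing most of the injected volume.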

1.3.6 qRT-PCR

[0516] Cells were harvested for total RNA isolation using the RNeasy Plus RNA isolation kit (Qiagen). Tissue samples were stored in RNALater (Ambion) and total RNA was isolated using the RNA Universal Plus Kit (Qiagen). cDNA synthesis was performed using the SuperScript VILO cDNA Synthesis Kit (Invitrogen). For genomic qPCR experiments, genomic DNA from tissue samples was isolated using a Blood and Tissue Kit (Qiagen). Quantitative real-time PCR (qRT-PCR) using QuantIT Perfecta Supermix was performed with the CFX96 Real-Time PCR Detection System (Bio-Rad) with oligonucleotide primers optimized for 90-110% amplification efficiency. The results are expressed as the fold change in mRNA expression of the gene of interest, normalized to Gapdh expression by the ΔΔCt method.
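For readers unfamiliar with the ΔΔCt method, the calculation can be sketched as below. This is a generic illustration and not the analysis code used in the study: it assumes 100% amplification efficiency (a doubling per cycle), whereas the primers above were optimized for 90-110%, and the Ct values in the usage example are invented for illustration only.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression of a target gene by the delta-delta-Ct method.

    The reference gene (Gapdh in the text) normalizes for input amount.
    Assumes perfect doubling per cycle (100% amplification efficiency).
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_treated - delta_control
    return 2.0 ** (-delta_delta_ct)


if __name__ == "__main__":
    # Hypothetical Ct values: the target comes up one cycle later in the
    # treated sample while the reference gene is unchanged.
    print(fold_change_ddct(26.0, 18.0, 25.0, 18.0))
```

A value below 1 indicates repression relative to the control sample, and a value of 0.5 corresponds to a 50% reduction in transcript abundance.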

1.3.7 Western Blot

[0517] Cells or minced tissue were lysed in RIPA buffer (Sigma), and the BCA assay (Pierce) was performed to quantify total protein. Lysates were mixed with LDS sample buffer (Invitrogen) and boiled for 5 min; equal amounts of total protein were run in NuPAGE Novex 4-12% Bis-Tris polyacrylamide gels (Life Technologies) and transferred to nitrocellulose membranes. Nonspecific antibody binding was blocked with 5% nonfat milk in TBS-T (50 mM Tris, 150 mM NaCl and 0.1% Tween-20) for 30 min. The membranes were then incubated with primary antibody in 5% milk in TBS-T: rabbit anti-ACTRIIB diluted 1:1000 overnight at 4° C., anti-HA diluted 1:1000 for 60 min at room temperature, or rabbit anti-GAPDH diluted 1:5000 for 60 min at room temperature. Membranes labeled with primary antibodies were incubated with anti-mouse (Santa Cruz, sc-2005) or anti-rabbit HRP-conjugated antibody (Sigma-Aldrich, A6154) diluted 1:5000 for 60 min and washed with TBS-T for 60 min. Membranes were visualized using the Immun-Star WesternC Chemiluminescence Kit (Bio-Rad) and images were captured using a ChemiDoc XRS+ system and processed using ImageLab software (Bio-Rad).

1.4 Results

[0518] 1.4.1 Generation of a Transcriptional Repressor from S. aureus Cas9

[0519] D10A and N580A mutations were introduced into the SaCas9 nuclease in order to abrogate catalytic activity and create a nuclease-null programmable DNA-binding domain (Ran, F. A. et al. Nature 520, 186-91 (2015), incorporated by reference herein in its entirety) (FIG. 1A). Fusion of a synthetic KRAB motif generated a dSaCas9 repressor. An N-terminal HA-tag was included to facilitate protein analysis, and N- and C-terminal nuclear localization sequences were included to enable trafficking of dSaCas9-KRAB into the cell nucleus.
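The substitution notation used above (D10A, N580A) encodes the wild-type residue, the 1-based position, and the replacement residue. A minimal sketch of applying such substitutions to a protein sequence in silico follows. The function name `apply_substitutions` is hypothetical, and the toy sequence is a placeholder constructed so that positions 10 and 580 carry D and N; it is not the SaCas9 sequence.

```python
import re


def apply_substitutions(protein, substitutions):
    """Apply point substitutions such as 'D10A' (1-based positions).

    Raises ValueError if the notation is malformed or if the residue found
    at a position does not match the stated wild-type residue.
    """
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"unrecognized substitution: {sub}")
        wild_type, mutant = match.group(1), match.group(3)
        position = int(match.group(2))
        if position < 1 or position > len(residues):
            raise ValueError(f"position out of range: {sub}")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[position - 1] = mutant
    return "".join(residues)


if __name__ == "__main__":
    # Toy placeholder sequence, NOT SaCas9: D at position 10, N at 580.
    toy = "M" + "G" * 8 + "D" + "G" * 569 + "N" + "G" * 20
    mutated = apply_substitutions(toy, ["D10A", "N580A"])
    print(mutated[9], mutated[579])
```

Checking the wild-type residue before substituting guards against numbering mismatches, for example when a tag or nuclear localization sequence has been added upstream of the reference sequence.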

[0520] For initial testing in vitro, dSaCas9-KRAB and single gRNAs were stably expressed using a lentiviral delivery system with puromycin selection (FIG. 1B). dSaCas9-KRAB was first tested in primary mouse fibroblasts expressing a luciferase reporter knocked in at chromosome 7 of the genome. Nine gRNAs to the synthetic CAG promoter driving transgene expression were designed, searching for base pair target sequences followed by the SaCas9 PAM, 5' NNGRRT 3' (SEQ ID NO: 1, wherein N is any nucleotide, and R is G or A). Multiple gRNAs exhibited robust repression of luciferase expression, as measured by qPCR and Western blot 7 days after transduction of fibroblasts (FIGS. 1C and 1D). These results confirmed that dSaCas9-KRAB repressors were effective at silencing a reporter gene in vitro.
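The target search described above, in which candidate protospacers are located immediately 5' of an NNGRRT PAM, can be sketched as a pattern scan over both strands. This is an illustrative sketch rather than the design procedure used in the study. The protospacer length is not stated in the paragraph above, so the default of 21 nucleotides is an assumption exposed as a parameter, and the function names and toy input are hypothetical.

```python
import re

# SaCas9 PAM 5'-NNGRRT-3' (N = any nucleotide, R = G or A).
PAM_REGEX = "[ACGT]{2}G[AG]{2}T"


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sacas9_targets(sequence, protospacer_len=21):
    """Return (strand, start, protospacer, pam) for every candidate site.

    protospacer_len=21 is an assumed default. Start coordinates on the '-'
    strand are relative to the reverse-complemented sequence. A lookahead
    is used so that overlapping sites are all reported.
    """
    sequence = sequence.upper()
    pattern = re.compile(
        rf"(?=([ACGT]{{{protospacer_len}}})({PAM_REGEX}))"
    )
    hits = []
    for strand, seq in (("+", sequence), ("-", reverse_complement(sequence))):
        for match in pattern.finditer(seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


if __name__ == "__main__":
    toy = "A" * 21 + "CCGAAT" + "TTT"  # toy input, not the CAG promoter
    for hit in find_sacas9_targets(toy):
        print(hit)
```

In practice, candidate sites found this way would still need to be filtered, for example by position relative to the promoter and by predicted off-target matches, which this sketch does not attempt.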

1.4.2 Silencing Endogenous Acvr2b in Myoblasts

[0521] SaCas9-based repressors were targeted to the myostatin receptor Acvr2b in C2C12 mouse myoblasts. gRNAs were targeted to the DNase I hypersensitivity site (DHS) containing the transcription start site (TSS) of Acvr2b according to DNase-seq data on mouse skeletal muscle from the ENCODE project (Consortium, E. P. et al. Nature 489, 57-74 (2012), incorporated by reference herein in its entirety) (FIG. 2A). dSaCas9-KRAB and a single gRNA were stably expressed using a lentiviral delivery system, and multiple gRNAs effected potent repression of endogenous Acvr2b by qPCR 7 days after transduction and selection in C2C12s (FIG. 2B).

1.4.3 Transcriptional Repression of the Acvr2b Gene In Vivo with AAV Delivery of S. aureus Cas9 Repressors

[0522] To accommodate the limited packaging capacity of AAV, a two-vector system was designed to deliver dSaCas9-KRAB and a single gRNA for targeted gene repression (FIG. 3A). AAV9 vectors expressing dSaCas9-KRAB and an Acvr2b gRNA were generated and purified by the Massachusetts General Hospital Ear and Eye Vector Core. The Cr4 Acvr2b gRNA was chosen for AAV in vivo studies. AAV9 is a muscle-tropic serotype capable of producing high levels of transgene expression (Zincarelli, C., et al. Mol Ther 16, 1073-80 (2008), incorporated by reference herein in its entirety).

[0523] Adult C57BL/6 wild-type mice were injected in the tibialis anterior of the right limb with a mixture of AAV-dSaCas9-KRAB and AAV-Acvr2b-gRNA, at 5e11 vector genome copies delivered per AAV per limb. Age-matched controls received a PBS sham injection or AAV-dSaCas9-KRAB injection without gRNA. At 4 and 8 weeks post-transduction, dSaCas9-KRAB was steadily expressed in the injected TA muscle, as measured by qPCR (FIGS. 3B and 3D). Acvr2b expression was not significantly affected by delivery of dSaCas9-KRAB alone or dSaCas9-KRAB with Acvr2b gRNA at 4 weeks post-treatment (FIG. 3C). At 8 weeks post-AAV delivery, Acvr2b mRNA expression was significantly reduced compared to sham-injected muscles in both AAV treatment groups (FIG. 3E). However, targeting dSaCas9-KRAB with Acvr2b gRNA resulted in stronger repression than delivery of dSaCas9-KRAB alone.

[0524] To determine if delivered AAV escaped the injected muscle and distributed systemically, vector genome signal was quantified in the liver, heart, and tibialis anterior muscles of treated mice at 8 weeks post-transduction. For AAV-Acvr2b-gRNA, the highest vector genome signals were found in the liver, heart, the right gastrocnemius muscle, and the injected tibialis anterior muscle (FIG. 4). Various AAV serotypes demonstrate tropism for the liver, and AAV9 can efficiently transduce cardiac muscle (Asokan, A., et al. Mol Ther 20, 699-708 (2012); Zincarelli, C., et al. Mol Ther 16, 1073-80 (2008), incorporated by reference herein in their entirety). dSaCas9-KRAB was expressed in the liver and heart at 4 and 8 weeks post-transduction, as measured by qPCR (FIG. 5). At 8 weeks post-transduction, Acvr2b expression in the heart was reduced by ~50% with delivery of dSaCas9-KRAB with gRNA. dSaCas9-KRAB alone did not have a significant effect on Acvr2b expression. Changes in Acvr2b expression in the liver were not statistically significant at 8 weeks post-transduction. These results indicate that dSaCas9-KRAB is biologically active in vivo and AAV delivery is a promising method for achieving targeted repression in animal models.

1.5 Discussion

[0525] The efficiency and specificity of CRISPR/Cas9 gene silencing have shown great preclinical promise. In this example, a platform was presented to translate RNA-guided gene repression in vivo in a wild-type mouse model. dSaCas9-KRAB potently silenced reporter and endogenous genes in vitro, and AAV9 delivery of CRISPR/Cas9 components in an adult wild-type mouse model resulted in efficient silencing of the Acvr2b gene in the heart.

[0526] Muscle tissue contains large and multinucleated fibers and a progenitor population capable of proliferation and regeneration. These are all factors that may have contributed to the lack of repression observed in skeletal muscle. dSaCas9-KRAB repression in muscle may have been limited by replication-mediated AAV dilution, diffusion of the repressor protein and delivered gRNA molecule along the myofiber, or inability of dSaCas9-KRAB to silence the majority of nuclei within a fiber. In contrast, cardiomyocytes of the heart are binucleated and post-mitotic, factors that may have contributed to the more efficient silencing observed in this tissue.

[0527] Interestingly, in some cases, it was observed that delivering dSaCas9-KRAB alone significantly downregulated Acvr2b expression. This unexpected biological effect may be related to potential host immune responses to high doses of AAV or to expression of foreign SaCas9-based proteins in mouse tissue. An influx of immune cells or inflammatory responses could lead to gene expression changes in AAV-treated tissues and apparent silencing of the target gene.

[0528] The CRISPR/Cas9 platform is highly flexible, and the AAV delivery system developed in this example can easily be adapted to target other gene products. The extent of immune response to foreign Cas9 proteins and synthetic gRNA molecules, as well as the specificity of SaCas9-based gene regulation, can also be evaluated. A major determinant of off-site target binding is the presence of a PAM sequence, and thus the more stringent PAM requirement of SaCas9 compared to SpCas9 may be indicative of at least comparable levels of specificity for gene regulation. Lastly, minimal and tissue-specific promoters may enable implementation of a single AAV vector system for future in vivo gene regulation applications.

1.6 Appendix

[0529] 1.6.1 Lentiviral S. aureus Cas9 KRAB-Based Repressor

[0530] A restriction map of a lentiviral vector encoding S. aureus Cas9 KRAB-based repressor is shown in FIG. 6. SEQ ID NO: 2 provides the nucleic acid sequence of the lentiviral vector encoding S. aureus Cas9 KRAB-based repressor.

TABLE-US-00008 SEQ ID NO: 2 GTCGACGGATCGGGAGATCTCCCGATCCCCTATGGTGCACTCTCAGTACA ATCTGCTCTGATGCCGCATAGTTAAGCCAGTATCTGCTCCCTGCTTGTGT GTTGGAGGTCGCTGAGTAGTGCGCGAGCAAAATTTAAGCTACAACAAGGC AAGGCTTGACCGACAATTGCATGAAGAATCTGCTTAGGGTTAGGCGTTTT GCGCTGCTTCGCGATGTACGGGCCAGATATACGCGTTGACATTGATTATT GACTAGTTATTAATAGTAATCAATTACGGGGTCATTAGTTCATAGCCCAT ATATGGAGTTCCGCGTTACATAACTTACGGTAAATGGCCCGCCTGGCTGA CCGCCCAACGACCCCCGCCCATTGACGTCAATAATGACGTATGTTCCCAT AGTAACGCCAATAGGGACTTTCCATTGACGTCAATGGGTGGAGTATTTAC GGTAAACTGCCCACTTGGCAGTACATCAAGTGTATCATATGCCAAGTACG CCCCCTATTGACGTCAATGACGGTAAATGGCCCGCCTGGCATTATGCCCA GTACATGACCTTATGGGACTTTCCTACTTGGCAGTACATCTACGTATTAG TCATCGCTATTACCATGGTGATGCGGTTTTGGCAGTACATCAATGGGCGT GGATAGCGGTTTGACTCACGGGGATTTCCAAGTCTCCACCCCATTGACGT CAATGGGAGTTTGTTTTGGCACCAAAATCAACGGGACTTTCCAAAATGTC GTAACAACTCCGCCCCATTGACGCAAATGGGCGGTAGGCGTGTACGGTGG GAGGTCTATATAAGCAGCGCGTTTTGCCTGTACTGGGTCTCTCTGGTTAG ACCAGATCTGAGCCTGGGAGCTCTCTGGCTAACTAGGGAACCCACTGCTT AAGCCTCAATAAAGCTTGCCTTGAGTGCTTCAAGTAGTGTGTGCCCGTCT GTTGTGTGACTCTGGTAACTAGAGATCCCTCAGACCCTTTTAGTCAGTGT GGAAAATCTCTAGCAGTGGCGCCCGAACAGGGACTTGAAAGCGAAAGGGA AACCAGAGGAGCTCTCTCGACGCAGGACTCGGCTTGCTGAAGCGCGCACG GCAAGAGGCGAGGGGCGGCGACTGGTGAGTACGCCAAAAATTTTGACTAG CGGAGGCTAGAAGGAGAGAGATGGGTGCGAGAGCGTCAGTATTAAGCGGG GGAGAATTAGATCGCGATGGGAAAAAATTCGGTTAAGGCCAGGGGGAAAG AAAAAATATAAATTAAAACATATAGTATGGGCAAGCAGGGAGCTAGAACG ATTCGCAGTTAATCCTGGCCTGTTAGAAACATCAGAAGGCTGTAGACAAA TACTGGGACAGCTACAACCATCCCTTCAGACAGGATCAGAAGAACTTAGA TCATTATATAATACAGTAGCAACCCTCTATTGTGTGCATCAAAGGATAGA GATAAAAGACACCAAGGAAGCTTTAGACAAGATAGAGGAAGAGCAAAACA AAAGTAAGACCACCGCACAGCAAGCGGCCGCTGATCTTCAGACCTGGAGG AGGAGATATGAGGGACAATTGGAGAAGTGAATTATATAAATATAAAGTAG TAAAAATTGAACCATTAGGAGTAGCACCCACCAAGGCAAAGAGAAGAGTG GTGCAGAGAGAAAAAAGAGCAGTGGGAATAGGAGCTTTGTTCCTTGGGTT CTTGGGAGCAGCAGGAAGCACTATGGGCGCAGCGTCAATGACGCTGACGG TACAGGCCAGACAATTATTGTCTGGTATAGTGCAGCAGCAGAACAATTTG CTGAGGGCTATTGAGGCGCAACAGCATCTGTTGCAACTCACAGTCTGGGG CATCAAGCAGCTCCAGGCAAGAATCCTGGCTGTGGAAAGATACCTAAAGG 
ATCAACAGCTCCTGGGGATTTGGGGTTGCTCTGGAAAACTCATTTGCACC ACTGCTGTGCCTTGGAATGCTAGTTGGAGTAATAAATCTCTGGAACAGAT TTGGAATCACACGACCTGGATGGAGTGGGACAGAGAAATTAACAATTACA CAAGCTTAATACACTCCTTAATTGAAGAATCGCAAAACCAGCAAGAAAAG AATGAACAAGAATTATTGGAATTAGATAAATGGGCAAGTTTGTGGAATTG GTTTAACATAACAAATTGGCTGTGGTATATAAAATTATTCATAATGATAG TAGGAGGCTTGGTAGGTTTAAGAATAGTTTTTGCTGTACTTTCTATAGTG AATAGAGTTAGGCAGGGATATTCACCATTATCGTTTCAGACCCACCTCCC AACCCCGAGGGGACCCGACAGGCCCGAAGGAATAGAAGAAGAAGGTGGAG AGAGAGACAGAGACAGATCCATTCGATTAGTGAACGGATCGGCACTGCGT GCGCCAATTCTGCAGACAAATGGCAGTATTCATCCACAATTTTAAAAGAA AAGGGGGGATTGGGGGGTACAGTGCAGGGGAAAGAATAGTAGACATAATA GCAACAGACATACAAACTAAAGAATTACAAAAACAAATTACAAAAATTCA AAATTTTCGGGTTTATTACAGGGACAGCAGAGATCCAGTTTGGTTAatTA AATAACTTCGTATAGCATACATTATACGAAGTTATGATAAGAGACGGTGG TGgcgccgctacagggcgcgtcccattcgccattcaggctgcgcaactgt tgggaagggcgatcggtgcgggcctcttcgctattacgccagctggcgaa agggggatgtgctgcaaggcgattaagttgggtaacgccagggttttccc agtcacgacgttgtaaaacgacggccagtgagcgcgcgtaatacgactca ctatagggcgaattgggtaccgggccccccctcgaggtcctccagctttt gttccctttagtgagggttaattgcgcgcttggcgtaatcatggtcatag ctgtttcctgtgtgaaattgttatccgctcacaattccacacaacatacg agccggaagcataaagtgtaaagcctggggtgcctaatgagtgagctaac tcacattaattgcgttgcgctcactgcccgctttccaCTGCATGACGTCT CCACAATTAatTAAgggtgcagcggcctccgcgccgggttttggcgcctc ccgcgggcgcccccctcctcacggcgagcgctgccacgtcagacgaaggg cgcaggagcgttcctgatccttccgcccggacgctcaggacagcggcccg ctgctcataagactcggccttagaaccccagtatcagcagaaggacattt taggacgggacttgggtgactctagggcactggttttctttccagagagc ggaacaggcgaggaaaagtagtcccttctcggcgattctgcggagggatc tccgtggggcggtgaacgccgatgattatataaggacgcgccgggtgtgg cacagctagttccgtcgcagccgggatttgggtcgcggttcttgtttgtg gatcgctgtgatcgtcacttggtgagttgcgggctgctgggctggccggg gctttcgtggccgccgggccgctcggtgggacggaagcgtgtggagagac cgccaagggctgtagtctgggtccgcgagcaaggttgccctgaactgggg gttggggggagcgcacaaaatggcggctgttcccgagtcttgaatggaag acgcttgtaaggcgggctgtgaggtcgttgaaacaaggtggggggcatgg tgggcggcaagaacccaaggtcttgaggccttcgctaatgcgggaaagct cttattcgggtgagatgggctggggcaccatctggggaccctgacgtgaa 
gtttgtcactgactggagaactcgggtttgtcgtctggttgcgggggcgg cagttatgcggtgccgttgggcagtgcacccgtacctttgggagcgcgcg cctcgtcgtgtcgtgacgtcacccgttctgttggcttataatgcagggtg gggccacctgccggtaggtgtgcggtaggcttttctccgtcgcaggacgc agggttcgggcctagggtaggctctcctgaatcgacaggcgccggacctc tggtgaggggagggataagtgaggcgtcagtttctttggtcggttttatg tacctatcttcttaagtagctgaagctccggttttgaactatgcgctcgg ggttggcgagtgtgttttgtgaagttttttaggcaccttttgaaatgtaa tcatttgggtcaatatgtaattttcagtgttagactagTaaattgtccgc taaattctggccgtttttggcttttttgttagacGAAGCTTGGGCTGCAG GTCGACTctagagccaccatgtacccatacgatgttccagattacgctAT GGCCCCAAAGAAGAAGCGGAAGGTCGGTATCCACGGAGTCCCAGCAGCCA AGCGGAACTACATCCTGGGCCTGGCCATCGGCATCACCAGCGTGGGCTAC GGCATCATCGACTACGAGACACGGGACGTGATCGATGCCGGCGTGCGGCT GTTCAAAGAGGCCAACGTGGAAAACAACGAGGGCAGGCGGAGCAAGAGAG GCGCCAGAAGGCTGAAGCGGCGGAGGCGGCATAGAATCCAGAGAGTGAAG AAGCTGCTGTTCGACTACAACCTGCTGACCGACCACAGCGAGCTGAGCGG CATCAACCCCTACGAGGCCAGAGTGAAGGGCCTGAGCCAGAAGCTGAGCG AGGAAGAGTTCTCTGCCGCCCTGCTGCACCTGGCCAAGAGAAGAGGCGTG CACAACGTGAACGAGGTGGAAGAGGACACCGGCAACGAGCTGTCCACCAA AGAGCAGATCAGCCGGAACAGCAAGGCCCTGGAAGAGAAATACGTGGCCG AACTGCAGCTGGAACGGCTGAAGAAAGACGGCGAAGTGCGGGGCAGCATC AACAGATTCAAGACCAGCGACTACGTGAAAGAAGCCAAACAGCTGCTGAA GGTGCAGAAGGCCTACCACCAGCTGGACCAGAGCTTCATCGACACCTACA TCGACCTGCTGGAAACCCGGCGGACCTACTATGAGGGACCTGGCGAGGGC AGCCCCTTCGGCTGGAAGGACATCAAAGAATGGTACGAGATGCTGATGGG CCACTGCACCTACTTCCCCGAGGAACTGCGGAGCGTGAAGTACGCCTACA ACGCCGACCTGTACAACGCCCTGAACGACCTGAACAATCTCGTGATCACC AGGGACGAGAACGAGAAGCTGGAATATTACGAGAAGTTCCAGATCATCGA GAACGTGTTCAAGCAGAAGAAGAAGCCCACCCTGAAGCAGATCGCCAAAG AAATCCTCGTGAACGAAGAGGATATTAAGGGCTACAGAGTGACCAGCACC GGCAAGCCCGAGTTCACCAACCTGAAGGTGTACCACGACATCAAGGACAT TACCGCCCGGAAAGAGATTATTGAGAACGCCGAGCTGCTGGATCAGATTG CCAAGATCCTGACCATCTACCAGAGCAGCGAGGACATCCAGGAAGAACTG ACCAATCTGAACTCCGAGCTGACCCAGGAAGAGATCGAGCAGATCTCTAA TCTGAAGGGCTATACCGGCACCCACAACCTGAGCCTGAAGGCCATCAACC TGATCCTGGACGAGCTGTGGCACACCAACGACAACCAGATCGCTATCTTC AACCGGCTGAAGCTGGTGCCCAAGAAGGTGGACCTGTCCCAGCAGAAAGA GATCCCCACCACCCTGGTGGACGACTTCATCCTGAGCCCCGTCGTGAAGA 
GAAGCTTCATCCAGAGCATCAAAGTGATCAACGCCATCATCAAGAAGTAC GGCCTGCCCAACGACATCATTATCGAGCTGGCCCGCGAGAAGAACTCCAA GGACGCCCAGAAAATGATCAACGAGATGCAGAAGCGGAACCGGCAGACCA ACGAGCGGATCGAGGAAATCATCCGGACCACCGGCAAAGAGAACGCCAAG TACCTGATCGAGAAGATCAAGCTGCACGACATGCAGGAAGGCAAGTGCCT GTACAGCCTGGAAGCCATCCCTCTGGAAGATCTGCTGAACAACCCCTTCA ACTATGAGGTGGACCACATCATCCCCAGAAGCGTGTCCTTCGACAACAGC TTCAACAACAAGGTGCTCGTGAAGCAGGAAGAAgcCAGCAAGAAGGGCAA CCGGACCCCATTCCAGTACCTGAGCAGCAGCGACAGCAAGATCAGCTACG

AAACCTTCAAGAAGCACATCCTGAATCTGGCCAAGGGCAAGGGCAGAATC AGCAAGACCAAGAAAGAGTATCTGCTGGAAGAACGGGACATCAACAGGTT CTCCGTGCAGAAAGACTTCATCAACCGGAACCTGGTGGATACCAGATACG CCACCAGAGGCCTGATGAACCTGCTGCGGAGCTACTTCAGAGTGAACAAC CTGGACGTGAAAGTGAAGTCCATCAATGGCGGCTTCACCAGCTTTCTGCG GCGGAAGTGGAAGTTTAAGAAAGAGCGGAACAAGGGGTACAAGCACCACG CCGAGGACGCCCTGATCATTGCCAACGCCGATTTCATCTTCAAAGAGTGG AAGAAACTGGACAAGGCCAAAAAAGTGATGGAAAACCAGATGTTCGAGGA AAAGCAGGCCGAGAGCATGCCCGAGATCGAAACCGAGCAGGAGTACAAAG AGATCTTCATCACCCCCCACCAGATCAAGCACATTAAGGACTTCAAGGAC TACAAGTACAGCCACCGGGTGGACAAGAAGCCTAATAGAGAGCTGATTAA CGACACCCTGTACTCCACCCGGAAGGACGACAAGGGCAACACCCTGATCG TGAACAATCTGAACGGCCTGTACGACAAGGACAATGACAAGCTGAAAAAG CTGATCAACAAGAGCCCCGAAAAGCTGCTGATGTACCACCACGACCCCCA GACCTACCAGAAACTGAAGCTGATTATGGAACAGTACGGCGACGAGAAGA ATCCCCTGTACAAGTACTACGAGGAAACCGGGAACTACCTGACCAAGTAC TCCAAAAAGGACAACGGCCCCGTGATCAAGAAGATTAAGTATTACGGCAA CAAACTGAACGCCCATCTGGACATCACCGACGACTACCCCAACAGCAGAA ACAAGGTCGTGAAGCTGTCCCTGAAGCCCTACAGATTCGACGTGTACCTG GACAATGGCGTGTACAAGTTCGTGACCGTGAAGAATCTGGATGTGATCAA AAAAGAAAACTACTACGAAGTGAATAGCAAGTGCTATGAGGAAGCTAAGA AGCTGAAGAAGATCAGCAACCAGGCCGAGTTTATCGCCTCCTTCTACAAC AACGATCTGATCAAGATCAACGGCGAGCTGTATAGAGTGATCGGCGTGAA CAACGACCTGCTGAACCGGATCGAAGTGAACATGATCGACATCACCTACC GCGAGTACCTGGAAAACATGAACGACAAGAGGCCCCCCAGGATCATTAAG ACAATCGCCTCCAAGACCCAGAGCATTAAGAAGTACAGCACAGACATTCT GGGCAACCTGTATGAAGTGAAATCTAAGAAGCACCCTCAGATCATCAAAA AGGGCAAAAGGCCGGCGGCCACGAAAAAGGCCGGCCAGGCAAAAAAGAAA AAGggatcCGATGCTAAGTCACTGACTGCCTGGTCCCGGACACTGGTGAC CTTCAAGGATGTGTTTGTGGACTTCACCAGGGAGGAGTGGAAGCTGCTGG ACACTGCTCAGCAGATCCTGTACAGAAATGTGATGCTGGAGAACTATAAG AACCTGGTTTCCTTGGGTTATCAGCTTACTAAGCCAGATGTGATCCTCCG GTTGGAGAAGGGAGAAGAGCCCTGGCTGGTGGAGAGAGAAATTCACCAAG AGACCCATCCTGATTCAGAGACTGCATTTGAAATCAAATCATCAGTTCCG AAAAAGAAACGCAAAGTTgctagCGAGGGCAGAGGAAGTCTTCTAACATG CGGTGACGTGGAGGAGAATCCCGGCCCTATGACCGAGTACAAGCCCACGG TGCGCCTCGCCACCCGCGACGACGTCCCCaGGGCCGTACGCACCCTCGCC GCCGCGTTCGCCGACTACCCCGCCACGCGCCACACCGTCGATCCGGACCG CCACATCGAGCGGGTCACCGAGCTGCAAGAACTCTTCCTCACGCGCGTCG 
GGCTCGACATCGGCAAGGTGTGGGTCGCGGACGACGGCGCCGCGGTGGCG GTCTGGACCACGCCGGAGAGCGTCGAAGCGGGGGCGGTGTTCGCCGAGAT CGGCCCGCGCATGGCCGAGTTGAGCGGTTCCCGGCTGGCCGCGCAGCAAC AGATGGAAGGCCTCCTGGCGCCGCACCGGCCCAAGGAGCCCGCGTGGTTC CTGGCCACCGTCGGCGTGTCGCCCGACCACCAGGGCAAGGGTCTGGGCAG CGCCGTCGTGCTCCCCGGAGTGGAGGCGGCCGAGCGCGCCGGGGTGCCCG CCTTCCTGGAGACCTCCGCGCCCCGCAACCTCCCCTTCTACGAGCGGCTC GGCTTCACCGTCACCGCCGACGTCGAGGTGCCCGAAGGACCGCGCACCTG GTGCATGACCCGCAAGCCCGGTGCCTGACCAGcacactggcggcCGTTAC TAGCTTCTGCAGCACGAccggTTGATAATAGATAACTTCGTATAGCATAC ATTATACGAAGTTATGaattCGATATCAAGCTTATCGATAATCAACCTCT GGATTACAAAATTTGTGAAAGATTGACTGGTATTCTTAACTATGTTGCTC CTTTTACGCTATGTGGATACGCTGCTTTAATGCCTTTGTATCATGCTATT GCTTCCCGTATGGCTTTCATTTTCTCCTCCTTGTATAAATCCTGGTTGCT GTCTCTTTATGAGGAGTTGTGGCCCGTTGTCAGGCAACGTGGCGTGGTGT GCACTGTGTTTGCTGACGCAACCCCCACTGGTTGGGGCATTGCCACCACC TGTCAGCTCCTTTCCGGGACTTTCGCTTTCCCCCTCCCTATTGCCACGGC GGAACTCATCGCCGCCTGCCTTGCCCGCTGCTGGACAGGGGCTCGGCTGT TGGGCACTGACAATTCCGTGGTGTTGTCGGGGAAATCATCGTCCTTTCCT TGGCTGCTCGCCTGTGTTGCCACCTGGATTCTGCGCGGGACGTCCTTCTG CTACGTCCCTTCGGCCCTCAATCCAGCGGACCTTCCTTCCCGCGGCCTGC TGCCGGCTCTGCGGCCTCTTCCGCGTCTTCGCCTTCGCCCTCAGACGAGT CGGATCTCCCTTTGGGCCGCCTCCCCGCATCGATACCGTCGACCTCGAGA CCTAGAAAAACATGGAGCAATCACAAGTAGCAATACAGCAGCTACCAATG CTGATTGTGCCTGGCTAGAAGCACAAGAGGAGGAGGAGGTGGGTTTTCCA GTCACACCTCAGGTACCTTTAAGACCAATGACTTACAAGGCAGCTGTAGA TCTTAGCCACTTTTTAAAAGAAAAGGGGGGACTGGAAGGGCTAATTCACT CCCAACGAAGACAAGATATCCTTGATCTGTGGATCTACCACACACAAGGC TACTTCCCTGATTGGCAGAACTACACACCAGGGCCAGGGATCAGATATCC ACTGACCTTTGGATGGTGCTACAAGCTAGTACCAGTTGAGCAAGAGAAGG TAGAAGAAGCCAATGAAGGAGAGAACACCCGCTTGTTACACCCTGTGAGC CTGCATGGGATGGATGACCCGGAGAGAGAAGTATTAGAGTGGAGGTTTGA CAGCCGCCTAGCATTTCATCACATGGCCCGAGAGCTGCATCCGGACTGTA CTGGGTCTCTCTGGTTAGACCAGATCTGAGCCTGGGAGCTCTCTGGCTAA CTAGGGAACCCACTGCTTAAGCCTCAATAAAGCTTGCCTTGAGTGCTTCA AGTAGTGTGTGCCCGTCTGTTGTGTGACTCTGGTAACTAGAGATCCCTCA GACCCTTTTAGTCAGTGTGGAAAATCTCTAGCAGGGCCCGTTTAAACCCG CTGATCAGCCTCGACTGTGCCTTCTAGTTGCCAGCCATCTGTTGTTTGCC CCTCCCCCGTGCCTTCCTTGACCCTGGAAGGTGCCACTCCCACTGTCCTT 
TCCTAATAAAATGAGGAAATTGCATCGCATTGTCTGAGTAGGTGTCATTC TATTCTGGGGGGTGGGGTGGGGCAGGACAGCAAGGGGGAGGATTGGGAAG ACAATAGCAGGCATGCTGGGGATGCGGTGGGCTCTATGGCTTCTGAGGCG GAAAGAACCAGCTGGGGCTCTAGGGGGTATCCCCACGCGCCCTGTAGCGG CGCATTAAGCGCGGCGGGTGTGGTGGTTACGCGCAGCGTGACCGCTACAC TTGCCAGCGCCCTAGCGCCCGCTCCTTTCGCTTTCTTCCCTTCCTTTCTC GCCACGTTCGCCGGCTTTCCCCGTCAAGCTCTAAATCGGGGGCTCCCTTT AGGGTTCCGATTTAGTGCTTTACGGCACCTCGACCCCAAAAAACTTGATT AGGGTGATGGTTCACGTAGTGGGCCATCGCCCTGATAGACGGTTTTTCGC CCTTTGACGTTGGAGTCCACGTTCTTTAATAGTGGACTCTTGTTCCAAAC TGGAACAACACTCAACCCTATCTCGGTCTATTCTTTTGATTTATAAGGGA TTTTGCCGATTTCGGCCTATTGGTTAAAAAATGAGCTGATTTAACAAAAA TTTAACGCGAATTAATTCTGTGGAATGTGTGTCAGTTAGGGTGTGGAAAG TCCCCAGGCTCCCCAGCAGGCAGAAGTATGCAAAGCATGCATCTCAATTA GTCAGCAACCAGGTGTGGAAAGTCCCCAGGCTCCCCAGCAGGCAGAAGTA TGCAAAGCATGCATCTCAATTAGTCAGCAACCATAGTCCCGCCCCTAACT CCGCCCATCCCGCCCCTAACTCCGCCCAGTTCCGCCCATTCTCCGCCCCA TGGCTGACTAATTTTTTTTATTTATGCAGAGGCCGAGGCCGCCTCTGCCT CTGAGCTATTCCAGAAGTAGTGAGGAGGCTTTTTTGGAGGCCTAGGCTTT TGCAAAAAGCTCCCGGGAGCTTGTATATCCATTTTCGGATCTGATCAGCA CGTGTTGACAATTAATCATCGGCATAGTATATCGGCATAGTATAATACGA CAAGGTGAGGAACTAAACCATGGCCAAGTTGACCAGTGCCGTTCCGGTGC TCACCGCGCGCGACGTCGCCGGAGCGGTCGAGTTCTGGACCGACCGGCTC GGGTTCTCCCGGGACTTCGTGGAGGACGACTTCGCCGGTGTGGTCCGGGA CGACGTGACCCTGTTCATCAGCGCGGTCCAGGACCAGGTGGTGCCGGACA ACACCCTGGCCTGGGTGTGGGTGCGCGGCCTGGACGAGCTGTACGCCGAG TGGTCGGAGGTCGTGTCCACGAACTTCCGGGACGCCTCCGGGCCGGCCAT GACCGAGATCGGCGAGCAGCCGTGGGGGCGGGAGTTCGCCCTGCGCGACC CGGCCGGCAACTGCGTGCACTTCGTGGCCGAGGAGCAGGACTGACACGTG CTACGAGATTTCGATTCCACCGCCGCCTTCTATGAAAGGTTGGGCTTCGG AATCGTTTTCCGGGACGCCGGCTGGATGATCCTCCAGCGCGGGGATCTCA TGCTGGAGTTCTTCGCCCACCCCAACTTGTTTATTGCAGCTTATAATGGT TACAAATAAAGCAATAGCATCACAAATTTCACAAATAAAGCATTTTTTTC ACTGCATTCTAGTTGTGGTTTGTCCAAACTCATCAATGTATCTTATCATG TCTGTATACCGTCGACCTCTAGCTAGAGCTTGGCGTAATCATGGTCATAG CTGTTTCCTGTGTGAAATTGTTATCCGCTCACAATTCCACACAACATACG AGCCGGAAGCATAAAGTGTAAAGCCTGGGGTGCCTAATGAGTGAGCTAAC TCACATTAATTGCGTTGCGCTCACTGCCCGCTTTCCAGTCGGGAAACCTG TCGTGCCAGCTGCATTAATGAATCGGCCAACGCGCGGGGAGAGGCGGTTT 
GCGTATTGGGCGCTCTTCCGCTTCCTCGCTCACTGACTCGCTGCGCTCGG TCGTTCGGCTGCGGCGAGCGGTATCAGCTCACTCAAAGGCGGTAATACGG TTATCCACAGAATCAGGGGATAACGCAGGAAAGAACATGTGAGCAAAAGG CCAGCAAAAGGCCAGGAACCGTAAAAAGGCCGCGTTGCTGGCGTTTTTCC ATAGGCTCCGCCCCCCTGACGAGCATCACAAAAATCGACGCTCAAGTCAG AGGTGGCGAAACCCGACAGGACTATAAAGATACCAGGCGTTTCCCCCTGG AAGCTCCCTCGTGCGCTCTCCTGTTCCGACCCTGCCGCTTACCGGATACC TGTCCGCCTTTCTCCCTTCGGGAAGCGTGGCGCTTTCTCATAGCTCACGC

TGTAGGTATCTCAGTTCGGTGTAGGTCGTTCGCTCCAAGCTGGGCTGTGT GCACGAACCCCCCGTTCAGCCCGACCGCTGCGCCTTATCCGGTAACTATC GTCTTGAGTCCAACCCGGTAAGACACGACTTATCGCCACTGGCAGCAGCC ACTGGTAACAGGATTAGCAGAGCGAGGTATGTAGGCGGTGCTACAGAGTT CTTGAAGTGGTGGCCTAACTACGGCTACACTAGAAGAACAGTATTTGGTA TCTGCGCTCTGCTGAAGCCAGTTACCTTCGGAAAAAGAGTTGGTAGCTCT TGATCCGGCAAACAAACCACCGCTGGTAGCGGTGGTTTTTTTGTTTGCAA GCAGCAGATTACGCGCAGAAAAAAAGGATCTCAAGAAGATCCTTTGATCT TTTCTACGGGGTCTGACGCTCAGTGGAACGAAAACTCACGTTAAGGGATT TTGGTCATGAGATTATCAAAAAGGATCTTCACCTAGATCCTTTTAAATTA AAAATGAAGTTTTAAATCAATCTAAAGTATATATGAGTAAACTTGGTCTG ACAGTTACCAATGCTTAATCAGTGAGGCACCTATCTCAGCGATCTGTCTA TTTCGTTCATCCATAGTTGCCTGACTCCCCGTCGTGTAGATAACTACGAT ACGGGAGGGCTTACCATCTGGCCCCAGTGCTGCAATGATACCGCGAGACC CACGCTCACCGGCTCCAGATTTATCAGCAATAAACCAGCCAGCCGGAAGG GCCGAGCGCAGAAGTGGTCCTGCAACTTTATCCGCCTCCATCCAGTCTAT TAATTGTTGCCGGGAAGCTAGAGTAAGTAGTTCGCCAGTTAATAGTTTGC GCAACGTTGTTGCCATTGCTACAGGCATCGTGGTGTCACGCTCGTCGTTT GGTATGGCTTCATTCAGCTCCGGTTCCCAACGATCAAGGCGAGTTACATG ATCCCCCATGTTGTGCAAAAAAGCGGTTAGCTCCTTCGGTCCTCCGATCG TTGTCAGAAGTAAGTTGGCCGCAGTGTTATCACTCATGGTTATGGCAGCA CTGCATAATTCTCTTACTGTCATGCCATCCGTAAGATGCTTTTCTGTGAC TGGTGAGTACTCAACCAAGTCATTCTGAGAATAGTGTATGCGGCGACCGA GTTGCTCTTGCCCGGCGTCAATACGGGATAATACCGCGCCACATAGCAGA ACTTTAAAAGTGCTCATCATTGGAAAACGTTCTTCGGGGCGAAAACTCTC AAGGATCTTACCGCTGTTGAGATCCAGTTCGATGTAACCCACTCGTGCAC CCAACTGATCTTCAGCATCTTTTACTTTCACCAGCGTTTCTGGGTGAGCA AAAACAGGAAGGCAAAATGCCGCAAAAAAGGGAATAAGGGCGACACGGAA ATGTTGAATACTCATACTCTTCCTTTTTCAATATTATTGAAGCATTTATC AGGGTTATTGTCTCATGAGCGGATACATATTTGAATGTATTTAGAAAAAT AAACAAATAGGGGTTCCGCGCACATTTCCCCGAAAAGTGCCACCTGAC

1.6.2 AAV S. aureus Cas9 KRAB-Based Repressor

[0531] A restriction map of an AAV vector encoding S. aureus Cas9 KRAB-based repressor is shown in FIG. 7. SEQ ID NO: 3 provides the nucleic acid sequence of the AAV vector encoding S. aureus Cas9 KRAB-based repressor.

TABLE-US-00009 SEQ ID NO: 3 gcaggaacccctagtgatggagttggccactccctctctgcgcgctcgct cgctcactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcc cgggcggcctcagtgagcgagcgagcgcgcagctgcctgcaggggcgcct gatgcggtattttctccttacgcatctgtgcggtatttcacaccgcatac gtcaaagcaaccatagtacgcgccctgtagcggcgcattaagcgcggcgg gtgtggtggttacgcgcagcgtgaccgctacacttgccagcgccctagcg cccgctcctttcgctttcttcccttcctttctcgccacgttcgccggctt tccccgtcaagctctaaatcgggggctccctttagggttccgatttagtg ctttacggcacctcgaccccaaaaaacttgatttgggtgatggttcacgt agtgggccatcgccctgatagacggtttttcgccctttgacgttggagtc cacgttctttaatagtggactcttgttccaaactggaacaacactcaacc ctatctcgggctattcttttgatttataagggattttgccgatttcggcc tattggttaaaaaatgagctgatttaacaaaaatttaacgcgaattttaa caaaatattaacgtttacaattttatggtgcactctcagtacaatctgct ctgatgccgcatagttaagccagccccgacacccgccaacacccgctgac gcgccctgacgggcttgtctgctcccggcatccgcttacagacaagctgt gaccgtctccgggagctgcatgtgtcagaggttttcaccgtcatcaccga aacgcgcgagacgaaagggcctcgtgatacgcctatttttataggttaat gtcatgataataatggtttcttagacgtcaggtggcacttttcggggaaa tgtgcgcggaacccctatttgtttatttttctaaatacattcaaatatgt atccgctcatgagacaataaccctgataaatgcttcaataatattgaaaa aggaagagtatgagtattcaacatttccgtgtcgcccttattcccttttt tgcggcattttgccttcctgtttttgctcacccagaaacgctggtgaaag taaaagatgctgaagatcagttgggtgcacgagtgggttacatcgaactg gatctcaacagcggtaagatccttgagagttttcgccccgaagaacgttt tccaatgatgagcacttttaaagttctgctatgtggcgcggtattatccc gtattgacgccgggcaagagcaactcggtcgccgcatacactattctcag aatgacttggttgagtactcaccagtcacagaaaagcatcttacggatgg catgacagtaagagaattatgcagtgctgccataaccatgagtgataaca ctgcggccaacttacttctgacaacgatcggaggaccgaaggagctaacc gcttttttgcacaacatgggggatcatgtaactcgccttgatcgttggga accggagctgaatgaagccataccaaacgacgagcgtgacaccacgatgc ctgtagcaatggcaacaacgttgcgcaaactattaactggcgaactactt actctagcttcccggcaacaattaatagactggatggaggcggataaagt tgcaggaccacttctgcgctcggcccttccggctggctggtttattgctg ataaatctggagccggtgagcgtggaagccgcggtatcattgcagcactg gggccagatggtaagccctcccgtatcgtagttatctacacgacggggag tcaggcaactatggatgaacgaaatagacagatcgctgagataggtgcct 
cactgattaagcattggtaactgtcagaccaagtttactcatatatactt tagattgatttaaaacttcatttttaatttaaaaggatctaggtgaagat cctttttgataatctcatgaccaaaatcccttaacgtgagttttcgttcc actgagcgtcagaccccgtagaaaagatcaaaggatcttcttgagatcct ttttttctgcgcgtaatctgctgcttgcaaacaaaaaaaccaccgctacc agcggtggtttgtttgccggatcaagagctaccaactctttttccgaagg taactggcttcagcagagcgcagataccaaatactgtccttctagtgtag ccgtagttaggccaccacttcaagaactctgtagcaccgcctacatacct cgctctgctaatcctgttaccagtggctgctgccagtggcgataagtcgt gtcttaccgggttggactcaagacgatagttaccggataaggcgcagcgg tcgggctgaacggggggttcgtgcacacagcccagcttggagcgaacgac ctacaccgaactgagatacctacagcgtgagctatgagaaagcgccacgc ttcccgaagggagaaaggcggacaggtatccggtaagcggcagggtcgga acaggagagcgcacgagggagcttccagggggaaacgcctggtatcttta tagtcctgtcgggtttcgccacctctgacttgagcgtcgatttttgtgat gctcgtcaggggggcggagcctatggaaaaacgccagcaacgcggccttt ttacggttcctggccttttgctggccttttgctcacatgtcctgcaggca gctgcgcgctcgctcgctcactgaggccgcccgggcgtcgggcgaccttt ggtcgcccggcctcagtgagcgagcgagcgcgcagagagggagtggccaa ctccatcactaggggttcctgcggcctctagactcgaggcgttgacattg attattgactagttattaatagtaatcaattacggggtcattagttcata gcccatatatggagttccgcgttacataacttacggtaaatggcccgcct ggctgaccgcccaacgacccccgcccattgacgtcaataatgacgtatgt tcccatagtaacgccaatagggactttccattgacgtcaatgggtggagt atttacggtaaactgcccacttggcagtacatcaagtgtatcatatgcca agtacgccccctattgacgtcaatgacggtaaatggcccgcctggcatta tgcccagtacatgaccttatgggactttcctacttggcagtacatctacg tattagtcatcgctattaccatggtgatgcggttttggcagtacatcaat gggcgtggatagcggtttgactcacggggatttccaagtctccaccccat tgacgtcaatgggagtttgttttggcaccaaaatcaacgggactttccaa aatgtcgtaacaactccgccccattgacgcaaatgggcggtaggcgtgta cggtgggaggtctatataagcagagctctctggctaactaccggtgccac catgtacccatacgatgttccagattacgctGCCCCAAAGAAGAAGCGGA AGGTCGGTATCCACGGAGTCCCAGCAGCCAAGCGGAACTACATCCTGGGC CTGGCCATCGGCATCACCAGCGTGGGCTACGGCATCATCGACTACGAGAC ACGGGACGTGatcgATGCCGGCGTGCGGCTGTTCAAAGAGGCCAACGTGG AAAACAACGAGGGCAGGCGGAGCAAGAGAGGCGCCAGAAGGCTGAAGCGG CGGAGGCGGCATAGAATCCAGAGAGTGAAGAAGCTGCTGTTCGACTACAA CCTGCTGACCGACCACAGCGAGCTGAGCGGCATCAACCCCTACGAGGCCA 
GAGTGAAGGGCCTGAGCCAGAAGCTGAGCGAGGAAGAGTTCTCTGCCGCC CTGCTGCACCTGGCCAAGAGAAGAGGCGTGCACAACGTGAACGAGGTGGA AGAGGACACCGGCAACGAGCTGTCCACCAAAGAGCAGATCAGCCGGAACA GCAAGGCCCTGGAAGAGAAATACGTGGCCGAACTGCAGCTGGAACGGCTG AAGAAAGACGGCGAAGTGCGGGGCAGCATCAACAGATTCAAGACCAGCGA CTACGTGAAAGAAGCCAAACAGCTGCTGAAGGTGCAGAAGGCCTACCACC AGCTGGACCAGAGCTTCATCGACACCTACATCGACCTGCTGGAAACCCGG CGGACCTACTATGAGGGACCTGGCGAGGGCAGCCCCTTCGGCTGGAAGGA CATCAAAGAATGGTACGAGATGCTGATGGGCCACTGCACCTACTTCCCCG AGGAACTGCGGAGCGTGAAGTACGCCTACAACGCCGACCTGTACAACGCC CTGAACGACCTGAACAATCTCGTGATCACCAGGGACGAGAACGAGAAGCT GGAATATTACGAGAAGTTCCAGATCATCGAGAACGTGTTCAAGCAGAAGA AGAAGCCCACCCTGAAGCAGATCGCCAAAGAAATCCTCGTGAACGAAGAG GATATTAAGGGCTACAGAGTGACCAGCACCGGCAAGCCCGAGTTCACCAA CCTGAAGGTGTACCACGACATCAAGGACATTACCGCCCGGAAAGAGATTA TTGAGAACGCCGAGCTGCTGGATCAGATTGCCAAGATCCTGACCATCTAC CAGAGCAGCGAGGACATCCAGGAAGAACTGACCAATCTGAACTCCGAGCT GACCCAGGAAGAGATCGAGCAGATCTCTAATCTGAAGGGCTATACCGGCA CCCACAACCTGAGCCTGAAGGCCATCAACCTGATCCTGGACGAGCTGTGG CACACCAACGACAACCAGATCGCTATCTTCAACCGGCTGAAGCTGGTGCC CAAGAAGGTGGACCTGTCCCAGCAGAAAGAGATCCCCACCACCCTGGTGG ACGACTTCATCCTGAGCCCCGTCGTGAAGAGAAGCTTCATCCAGAGCATC AAAGTGATCAACGCCATCATCAAGAAGTACGGCCTGCCCAACGACATCAT TATCGAGCTGGCCCGCGAGAAGAACTCCAAGGACGCCCAGAAAATGATCA ACGAGATGCAGAAGCGGAACCGGCAGACCAACGAGCGGATCGAGGAAATC ATCCGGACCACCGGCAAAGAGAACGCCAAGTACCTGATCGAGAAGATCAA GCTGCACGACATGCAGGAAGGCAAGTGCCTGTACAGCCTGGAAGCCATCC CTCTGGAAGATCTGCTGAACAACCCCTTCAACTATGAGGTGGACCACATC ATCCCCAGAAGCGTGTCCTTCGACAACAGCTTCAACAACAAGGTGCTCGT GAAGCAGGAAGAAgcCAGCAAGAAGGGCAACCGGACCCCATTCCAGTACC TGAGCAGCAGCGACAGCAAGATCAGCTACGAAACCTTCAAGAAGCACATC CTGAATCTGGCCAAGGGCAAGGGCAGAATCAGCAAGACCAAGAAAGAGTA TCTGCTGGAAGAACGGGACATCAACAGGTTCTCCGTGCAGAAAGACTTCA TCAACCGGAACCTGGTGGATACCAGATACGCCACCAGAGGCCTGATGAAC CTGCTGCGGAGCTACTTCAGAGTGAACAACCTGGACGTGAAAGTGAAGTC CATCAATGGCGGCTTCACCAGCTTTCTGCGGCGGAAGTGGAAGTTTAAGA AAGAGCGGAACAAGGGGTACAAGCACCACGCCGAGGACGCCCTGATCATT GCCAACGCCGATTTCATCTTCAAAGAGTGGAAGAAACTGGACAAGGCCAA AAAAGTGATGGAAAACCAGATGTTCGAGGAAAAGCAGGCCGAGAGCATGC 
CCGAGATCGAAACCGAGCAGGAGTACAAAGAGATCTTCATCACCCCCCAC CAGATCAAGCACATTAAGGACTTCAAGGACTACAAGTACAGCCACCGGGT GGACAAGAAGCCTAATAGAGAGCTGATTAACGACACCCTGTACTCCACCC GGAAGGACGACAAGGGCAACACCCTGATCGTGAACAATCTGAACGGCCTG TACGACAAGGACAATGACAAGCTGAAAAAGCTGATCAACAAGAGCCCCGA AAAGCTGCTGATGTACCACCACGACCCCCAGACCTACCAGAAACTGAAGC TGATTATGGAACAGTACGGCGACGAGAAGAATCCCCTGTACAAGTACTAC GAGGAAACCGGGAACTACCTGACCAAGTACTCCAAAAAGGACAACGGCCC CGTGATCAAGAAGATTAAGTATTACGGCAACAAACTGAACGCCCATCTGG

ACATCACCGACGACTACCCCAACAGCAGAAACAAGGTCGTGAAGCTGTCC CTGAAGCCCTACAGATTCGACGTGTACCTGGACAATGGCGTGTACAAGTT CGTGACCGTGAAGAATCTGGATGTGATCAAAAAAGAAAACTACTACGAAG TGAATAGCAAGTGCTATGAGGAAGCTAAGAAGCTGAAGAAGATCAGCAAC CAGGCCGAGTTTATCGCCTCCTTCTACAACAACGATCTGATCAAGATCAA CGGCGAGCTGTATAGAGTGATCGGCGTGAACAACGACCTGCTGAACCGGA TCGAAGTGAACATGATCGACATCACCTACCGCGAGTACCTGGAAAACATG AACGACAAGAGGCCCCCCAGGATCATTAAGACAATCGCCTCCAAGACCCA GAGCATTAAGAAGTACAGCACAGACATTCTGGGCAACCTGTATGAAGTGA AATCTAAGAAGCACCCTCAGATCATCAAAAAGGGCAAAAGGCCGGCGGCC ACGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGggatcCGATGCTAAGTC ACTGACTGCCTGGTCCCGGACACTGGTGACCTTCAAGGATGTGTTTGTGG ACTTCACCAGGGAGGAGTGGAAGCTGCTGGACACTGCTCAGCAGATCCTG TACAGAAATGTGATGCTGGAGAACTATAAGAACCTGGTTTCCTTGGGTTA TCAGCTTACTAAGCCAGATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGC CCTGGCTGGTGGAGAGAGAAATTCACCAAGAGACCCATCCTGATTCAGAG ACTGCATTTGAAATCAAATCATCAGTTCCGAAAAAGAAACGCAAAGttta aGaattcctagagctcgctgatcagcctcgactgtgccttctagttgcca gccatctgttgtttgcccctcccccgtgccttccttgaccctggaaggtg ccactcccactgtcctttcctaataaaatgaggaaattgcatcgcattgt ctgagtaggtgtcattctattctggggggtggggtggggcaggacagcaa gggggaggattgggaagagaatagcaggcatgctggggag

1.6.3 AAV S. aureus Cas9 U6-gRNA Vector with GFP-Kan Stuffer

[0532] A restriction map of an AAV vector encoding S. aureus Cas9 U6-gRNA is shown in FIG. 8. SEQ ID NO: 4 provides the nucleic acid sequence of the AAV vector encoding S. aureus Cas9 U6-gRNA (with a sample protospacer gRNA sequence).

TABLE-US-00010 SEQ ID NO: 4 ggggggggggggggggggttggccactccctctctgcgcgctcgctcgct cactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccggg cggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatc actaggggttcctagatctgaattcggtacCagatctaggaaCCTAGGgc ctatttcccatgattccttcatatttgcatatacgatacaaggctgttag agagataattggaattaatttgactgtaaacacaaagatattagtacaaa atacgtgacgtagaaagtaataatttcttgggtagtttgcagttttaaaa ttatgttttaaaatggactatcatatgcttaccgtaacttgaaagtattt cgatttcttggctttatatatcttgTGGAAAGGACGAAACACCgagcgcg ccccgcctagcccgttttagtactctggaaacagaatctactaaaacaag gcaaaatgccgtgtttatctcgtcaacttgttggcgagatttttttGCGG CCGCCCgcggtggagctccagcttttgttccctttagtgagggttaatTc tagaggatccggtactcgaggaactgaaaaaccagaaagttaactggtaa gtttagtctttttgtcttttatttcaggtcccggatccggtggtggtgca aatcaaagaactgctcctcagtggatgttgcctttacttctaggcctgta cggaagtgttacttctgctctaaaagctgcggaattgtacccgcggcccg ggatccaccggtcgccaccatggtgagcaagggcgaggagctgttcaccg gggtggtgcccatcctggtcgagctggacggcgacgtaaacggccacaag ttcagcgtgtccggcgagggcgagggcgatgccacctacggcaagctgac cctgaagttcatctgcaccaccggcaagctgcccgtgccctggcccaccc tcgtgaccaccctgacctacggcgtgcagtgcttcagccgctaccccgac cacatgaagcagcacgacttcttcaagtccgccatgcccgaaggctacgt ccaggagcgcaccatcttcttcaaggacgacggcaactacaagacccgcg ccgaggtgaagttcgagggcgacaccctggtgaaccgcatcgagctgaag ggcatcgacttcaaggaggacggcaacatcctggggcacaagctggagta caactacaacagccacaacgtctatatcatggccgacaagcagaagaacg gcatcaaggtgaacttcaagatccgccacaacatcgaggacggcagcgtg cagctcgccgaccactaccagcagaacacccccatcggcgacggccccgt gctgctgcccgacaaccactacctgagcacccagtccgccctgagcaaag accccaacgagaagcgcgatcacatggtcctgctggagttcgtgaccgcc gccgggatcactctcggcatggacgagctgtacaagtaaagcggccgcgg ggatccagacatgataagatacattgatgagtttggacaaaccacaacta gaatgcagtgaaaaaaatgctttatttgtgaaatttgtgatgctattgct ttatttgtaaccattataagctgcaataaacaagttaacaacaacaattg cattcattttatgtttcaggttcagggggaggtgtgggaggttttttagt cgacctcgagcagtgtggttttgcaagaggaagcaaaaagcctctccacc caggcctggaatgtttccacccaagtcgaaggcagtgtggttttgcaaga ggaagcaaaaagcctctccacccaggcctggaatgtttccacccaatgtc 
gagcaaccccgcccagcgtcttgtcattggcgaattcgaacacgcagatg cagtcggggcggcgcggtcccaggtccacttcgcatattaaggtgacgcg tgtggcctcgaacaccgagcgaccctgcagccaatatgggatcggccatt gaacaagatggattgcacgcaggttctccggccgcttgggtggagaggct attcggctatgactgggcacaacagacaatcggctgctctgatgccgccg tgttccggctgtcagcgcaggggcgcccggttctttttgtcaagaccgac ctgtccggtgccctgaatgaactgcaggacgaggcagcgcggctatcgtg gctggccacgacgggcgttccttgcgcagctgtgctcgacgttgtcactg aagcgggaagggactggctgctattgggcgaagtgccggggcaggatctc ctgtcatctcaccttgctcctgccgagaaagtatccatcatggctgatgc aatgcggcggctgcatacgcttgatccggctacctgcccattcgaccacc aagcgaaacatcgcatcgagcgagcacgtactcggatggaagccggtctt gtcgatcaggatgatctggacgaagagcatcaggggctcgcgccagccga actgttcgccaggctcaaggcgcgcatgcccgacggcgaggatctcgtcg tgacccatggcgatgcctgcttgccgaatatcatggtggaaaatggccgc ttttctggattcatcgactgtggccggctgggtgtggcggaccgctatca ggacatagcgttggctacccgtgatattgctgaagagcttggcggcgaat gggctgaccgcttcctcgtgctttacggtatcgccgctcccgattcgcag cgcatcgccttctatcgccttcttgacgagttcttctgaggggatccgtc gactagagctcgctgatcagcctcgactgtgccttctagttgccagccat ctgttgtttgcccctcccccgtgccttccttgaccctggaaggtgccact cccactgtcctttcctaataaaatgaggaaattgcatcgcattgtctgag taggtgtcattctattctggggggtggggtggggcaggacagcaaggggg aggattgggaagacaatagcaggcatgctggggagagatctaggaacccc tagtgatggagttggccactccctctctgcgcgctcgctcgctcactgag gccgcccgggcaaagcccgggcgtcgggcgacctttggtcgcccggcctc agtgagcgagcgagcgcgcagagagggagtggccaacccccccccccccc cccctgcagcccagctgcattaatgaatcggccaacgcgcggggagaggc ggtttgcgtattgggcgctcttccgcttcctcgctcactgactcgctgcg ctcggtcgttcggctgcggcgagcggtatcagctcactcaaaggcggtaa tacggttatccacagaatcaggggataacgcaggaaagaacatgtgagca aaaggccagcaaaaggccaggaaccgtaaaaaggccgcgttgctggcgtt tttccataggctccgcccccctgacgagcatcacaaaaatcgacgctcaa gtcagaggtggcgaaacccgacaggactataaagataccaggcgtttccc cctggaagctccctcgtgcgctctcctgttccgaccctgccgcttaccgg atacctgtccgcctttctcccttcgggaagcgtggcgctttctcaatgct cacgctgtaggtatctcagttcggtgtaggtcgttcgctccaagctgggc tgtgtgcacgaaccccccgttcagcccgaccgctgcgccttatccggtaa ctatcgtcttgagtccaacccggtaagacacgacttatcgccactggcag 
cagccactggtaacaggattagcagagcgaggtatgtaggcggtgctaca gagttcttgaagtggtggcctaactacggctacactagaaggacagtatt tggtatctgcgctctgctgaagccagttaccttcggaaaaagagttggta gctcttgatccggcaaacaaaccaccgctggtagcggtggtttttttgtt tgcaagcagcagattacgcgcagaaaaaaaggatctcaagaagatccttt gatcttttctacggggtctgacgctcagtggaacgaaaactcacgttaag ggattttggtcatgagattatcaaaaaggatcttcacctagatcctttta aattaaaaatgaagttttaaatcaatctaaagtatatatgagtaaacttg gtctgacagttaccaatgcttaatcagtgaggcacctatctcagcgatct gtctatttcgttcatccatagttgcctgactccccgtcgtgtagataact acgatacgggagggcttaccatctggccccagtgctgcaatgataccgcg agacccacgctcaccggctccagatttatcagcaataaaccagccagccg gaagggccgagcgcagaagtggtcctgcaactttatccgcctccatccag tctattaattgttgccgggaagctagagtaagtagttcgccagttaatag tttgcgcaacgttgttgccattgctacaggcatcgtggtgtcacgctcgt cgtttggtatggcttcattcagctccggttcccaacgatcaaggcgagtt acatgatcccccatgttgtgcaaaaaagcggttagctccttcggtcctcc gatcgttgtcagaagtaagttggccgcagtgttatcactcatggttatgg cagcactgcataattctcttactgtcatgccatccgtaagatgcttttct gtgactggtgagtactcaaccaagtcattctgagaatagtgtatgcggcg accgagttgctcttgcccggcgtcaatacgggataataccgcgccacata gcagaactttaaaagtgctcatcattggaaaacgttcttcggggcgaaaa ctctcaaggatcttaccgctgttgagatccagttcgatgtaacccactcg tgcacccaactgatcttcagcatcttttactttcaccagcgtttctgggt gagcaaaaacaggaaggcaaaatgccgcaaaaaagggaataagggcgaca cggaaatgttgaatactcatactcttcctttttcaatattattgaagcat ttatcagggttattgtctcatgagcggatacatatttgaatgtatttaga aaaataaacaaataggggttccgcgcacatttccccgaaaagtgccacct gacgtctaagaaaccattattatcatgacattaacctataaaaataggcg tatcacgaggccctttcgtctcgcgcgtttcggtgatgacggtgaaaacc tctgacacatgcagctcccggagacggtcacagcttgtctgtaagcggat gccgggagcagacaagcccgtcagggcgcgtcagcgggtgttggcgggtg tcggggctggcttaactatgcggcatcagagcagattgtactgagagtgc accatatgcggtgtgaaataccgcacagatgcgtaaggagaaaataccgc atcaggaaattgtaaacgttaatattttgttaaaattcgcgttaaatttt tgttaaatcagctcattttttaaccaataggccgaaatcggcaaaatccc ttataaatcaaaagaatagaccgagatagggttgagtgttgttccagttt ggaacaagagtccactattaaagaacgtggactccaacgtcaaagggcga aaaaccgtctatcagggcgatggcccactacgtgaaccatcaccctaatc 
aagttttttggggtcgaggtgccgtaaagcactaaatcggaaccctaaag ggagcccccgatttagagcttgacggggaaagccggcgaacgtggcgaga aaggaagggaagaaagcgaaaggagcgggcgctagggcgctggcaagtgt agcggtcacgctgcgcgtaaccaccacacccgccgcgcttaatgcgccgc tacagggcgcgtcgcgccattcgccattcaggctacgcaactgttgggaa gggcgatcggtgcgggcctcttcgctattacgccagctggctgca

1.6.5 AAV S. aureus Cas9 U6-gRNA Vector with GFP-Kan Stuffer

[0533] A restriction map of an AAV vector encoding S. aureus Cas9 U6-gRNA is shown in FIG. 9. SEQ ID NO: 5 provides the nucleic acid sequence of the AAV vector encoding S. aureus Cas9 U6-gRNA (the protospacer is cloned into the BbsI sites).

TABLE-US-00011 SEQ ID NO: 5 ggggggggggggggggggttggccactccctctctgcgcgctcgctcgct cactgaggccgggcgaccaaaggtcgcccgacgcccgggctttgcccggg cggcctcagtgagcgagcgagcgcgcagagagggagtggccaactccatc actaggggttcctagatctgaattcggtaccaagctTgcctatttcccat gattccttcatatttgcatatacgatacaaggctgttagagagataattg gaattaatttgactgtaaacacaaagatattagtacaaaatacgtgacgt agaaagtaataatttcttgggtagtttgcagttttaaaattatgttttaa aatggactatcatatgcttaccgtaacttgaaagtatttcgatttcttgg ctttatatatcttgTGGAAAGGACGAAACACCgggtcttcgagaagacct gttttagtactctggaaacagaatctactaaaacaaggcaaaatgccgtg tttatctcgtcaacttgttggcgagatttttttGCGGCCGCCCgcggtgg agctccagcttttgttccctttagtgagggttaatTctagAgagacgtac aaaaaagagcaagaagctaaaaaagatttaaaaattatttttagcgcagt taatggaacaggaactaaatttaccccaaaaatattacgtgaatcaggat ataacgttattgaggttgaagagcatgcatttgaagatgaaacatttaaa aatgttgtaaatccaaatccagaatttgatcctgcatgaaaaataccgct tgaatatggtattaaacatgatgcagatattattattatgaatgacccag atgctgacagatttggaatggcaataaaacatgatggtcattttgtaaga ttagatggaaatcaaacaggaccaattttaattgattgaaaattatcaaa tctaaaacgcttaaatagcattccaaaaaatccggctctatattcaagtt ttgtaacaagtgatttgggtgatagaatcgctcatgaaaaatatggagtt aatattgtaaaaactttaactggatttaaatgaatgggtagagaaattgc taaagaagaagataacggattaaattttgtttttgcttatgaagaaagtt atggatatgtaattgatgactcagctagagataaagatggaatacaagct tctatattaatagcagaggctgcttgattttataaaaaacaaaataaaac attagtagactatttagaagatttatttaaagaaatgggtgcatattaca ctttcactttaaacttgaattttaaaccagaagaaaagaaattaaaaatt gaaccattaatgaaatcattgagagcaacacccttaactcaaattgctgg acttaaagttgttaatgttgaagactacatcgatggaatgtataatatgc caggacaagacttactaaaattttatttagaagataagtcatgatttgct gttcgcccaagtggaactgaacctaaactaaaaatttattttataggtgt tggtgaatctgttcaaaacgctaaagttaaagtagacgaaattattaaag aattaaaattaaaaatgaatatataggagaaaaaatgaaactaaacaaat atatagatcacacattattaaaacaagatgctacgaaagctgaaattaaa caattatgtgatgaagcaattgaatttgattttgcaacagtttgtgttaa ttcatattgaacaagctattgtaaagaattattaaaaggcacaaatgtag gaataacaaatgttgtaggttttcctctaggtgcatgcacaacagctaca aaagcattcgaagtttctgaagcaattaaagatggtgcaacagaaattga 
tatggtattaaatattggtgcattaaaagacaaaaattatgaattagttt tagaagacatgaaagctgtaaaaaaagcagctggatcacatgttgttaaa tgtattatggaaaattgtttattaacaaaagaagaaatcatgaaagcttg tgaaatagctgttgaagctggattagaatttgttaaaacatcaacaggat tttcaaaatcaggtgcaacatttgaagatgttaaactaatgaagtcagtt gttaaagacaatgctttagttaaagcagctggtggagttagaacatttga agatgctcaaaaaatgattgaagcaggagctgaccgcttaggaacaagtg gtggagtagctattattaaaggtgaagaaaacaacgcgagttactaaaac tagcgtttttttattttgctcatttttattaaaagtttgcaaaaaggaac ataaaaattctaattattgatactaaagttattaaaaagaagattttggt tgattttataaaggtcatagaatataatattttagcatgtgtattttgtg tgctcatttacaaccgtctcGCggccgcggggatccagacatgataagat acattgatgagtttggacaaaccacaactagaatgcagtgaaaaaaatgc tttatttgtgaaatttgtgatgctattgctttatttgtaaccattataag ctgcaataaacaagttaacaacaacaattgcattcattttatgtttcagg ttcagggggaggtgtgggaggttttttagtcgacctcgagcagtgtggtt ttgcaagaggaagcaaaaagcctctccacccaggcctggaatgtttccac ccaagtcgaaggcagtgtggttttgcaagaggaagcaaaaagcctctcca cccaggcctggaatgtttccacccaatgtcgagcaaccccgcccagcgtc ttgtcattggcgaattcgaacacgcagatgcagtcggggcggcgcggtcc caggtccacttcgcatattaaggtgacgcgtgtggcctcgaacaccgagc gaccctgcagccaatatgggatcggccattgaacaagatggattgcacgc aggttctccggccgcttgggtggagaggctattcggctatgactgggcac aacagacaatcggctgctctgatgccgccgtgttccggctgtcagcgcag gggcgcccggttctttttgtcaagaccgacctgtccggtgccctgaatga actgcaggacgaggcagcgcggctatcgtggctggccacgacgggcgttc cttgcgcagctgtgctcgacgttgtcactgaagcgggaagggactggctg ctattgggcgaagtgccggggcaggatctcctgtcatctcaccttgctcc tgccgagaaagtatccatcatggctgatgcaatgcggcggctgcatacgc ttgatccggctacctgcccattcgaccaccaagcgaaacatcgcatcgag cgagcacgtactcggatggaagccggtcttgtcgatcaggatgatctgga cgaagagcatcaggggctcgcgccagccgaactgttcgccaggctcaagg cgcgcatgcccgacggcgaggatctcgtcgtgacccatggcgatgcctgc ttgccgaatatcatggtggaaaatggccgcttttctggattcatcgactg tggccggctgggtgtggcggaccgctatcaggacatagcgttggctaccc gtgatattgctgaagagcttggcggcgaatgggctgaccgcttcctcgtg ctttacggtatcgccgctcccgattcgcagcgcatcgccttctatcgcct tcttgacgagttcttctgaggggatccgtcgactagagctcgctgatcag cctcgactgtgccttctagttgccagccatctgttgtttgcccctccccc 
gtgccttccttgaccctggaaggtgccactcccactgtcctttcctaata aaatgaggaaattgcatcgcattgtctgagtaggtgtcattctattctgg ggggtggggtggggcaggacagcaagggggaggattgggaagacaatagc aggcatgctggggagagatctaggaacccctagtgatggagttggccact ccctctctgcgcgctcgctcgctcactgaggccgcccgggcaaagcccgg gcgtcgggcgacctttggtcgcccggcctcagtgagcgagcgagcgcgca gagagggagtggccaacccccccccccccccccctgcagcccagctgcat taatgaatcggccaacgcgcggggagaggcggtttgcgtattgggcgctc ttccgcttcctcgctcactgactcgctgcgctcggtcgttcggctgcggc gagcggtatcagctcactcaaaggcggtaatacggttatccacagaatca ggggataacgcaggaaagaacatgtgagcaaaaggccagcaaaaggccag gaaccgtaaaaaggccgcgttgctggcgtttttccataggctccgccccc ctgacgagcatcacaaaaatcgacgctcaagtcagaggtggcgaaacccg acaggactataaagataccaggcgtttccccctggaagctccctcgtgcg ctctcctgttccgaccctgccgcttaccggatacctgtccgcctttctcc cttcgggaagcgtggcgctttctcaatgctcacgctgtaggtatctcagt tcggtgtaggtcgttcgctccaagctgggctgtgtgcacgaaccccccgt tcagcccgaccgctgcgccttatccggtaactatcgtcttgagtccaacc cggtaagacacgacttatcgccactggcagcagccactggtaacaggatt agcagagcgaggtatgtaggcggtgctacagagttcttgaagtggtggcc taactacggctacactagaaggacagtatttggtatctgcgctctgctga agccagttaccttcggaaaaagagttggtagctcttgatccggcaaacaa accaccgctggtagcggtggtttttttgtttgcaagcagcagattacgcg cagaaaaaaaggatctcaagaagatcctttgatcttttctacggggtctg acgctcagtggaacgaaaactcacgttaagggattttggtcatgagatta tcaaaaaggatcttcacctagatccttttaaattaaaaatgaagttttaa atcaatctaaagtatatatgagtaaacttggtctgacagttaccaatgct taatcagtgaggcacctatctcagcgatctgtctatttcgttcatccata gttgcctgactccccgtcgtgtagataactacgatacgggagggcttacc atctggccccagtgctgcaatgataccgcgagacccacgctcaccggctc cagatttatcagcaataaaccagccagccggaagggccgagcgcagaagt ggtcctgcaactttatccgcctccatccagtctattaattgttgccggga agctagagtaagtagttcgccagttaatagtttgcgcaacgttgttgcca ttgctacaggcatcgtggtgtcacgctcgtcgtttggtatggcttcattc agctccggttcccaacgatcaaggcgagttacatgatcccccatgttgtg caaaaaagcggttagctccttcggtcctccgatcgttgtcagaagtaagt tggccgcagtgttatcactcatggttatggcagcactgcataattctctt actgtcatgccatccgtaagatgcttttctgtgactggtgagtactcaac caagtcattctgagaatagtgtatgcggcgaccgagttgctcttgcccgg 
cgtcaatacgggataataccgcgccacatagcagaactttaaaagtgctc atcattggaaaacgttcttcggggcgaaaactctcaaggatcttaccgct gttgagatccagttcgatgtaacccactcgtgcacccaactgatcttcag catcttttactttcaccagcgtttctgggtgagcaaaaacaggaaggcaa aatgccgcaaaaaagggaataagggcgacacggaaatgttgaatactcat actcttcctttttcaatattattgaagcatttatcagggttattgtctca tgagcggatacatatttgaatgtatttagaaaaataaacaaataggggtt ccgcgcacatttccccgaaaagtgccacctgacgtctaagaaaccattat tatcatgacattaacctataaaaataggcgtatcacgaggccctttcgtc

tcgcgcgtttcggtgatgacggtgaaaacctctgacacatgcagctcccg gagacggtcacagcttgtctgtaagcggatgccgggagcagacaagcccg tcagggcgcgtcagcgggtgttggcgggtgtcggggctggcttaactatg cggcatcagagcagattgtactgagagtgcaccatatgcggtgtgaaata ccgcacagatgcgtaaggagaaaataccgcatcaggaaattgtaaacgtt aatattttgttaaaattcgcgttaaatttttgttaaatcagctcattttt taaccaataggccgaaatcggcaaaatcccttataaatcaaaagaataga ccgagatagggttgagtgttgttccagtttggaacaagagtccactatta aagaacgtggactccaacgtcaaagggcgaaaaaccgtctatcagggcga tggcccactacgtgaaccatcaccctaatcaagttttttggggtcgaggt gccgtaaagcactaaatcggaaccctaaagggagcccccgatttagagct tgacggggaaagccggcgaacgtggcgagaaaggaagggaagaaagcgaa aggagcgggcgctagggcgctggcaagtgtagcggtcacgctgcgcgtaa ccaccacacccgccgcgcttaatgcgccgctacagggcgcgtcgcgccat tcgccattcaggctacgcaactgttgggaagggcgatcggtgcgggcctc ttcgctattacgccagctggctgca

1.6.6 Protospacer Sequences for gRNAs

TABLE-US-00012 TABLE 4 CAG Luciferase gRNAs

SEQ ID NO	Description	Sequence
6	SaCr1	GTCATTATTGACGTCAATGGGC
7	SaCr2	gtgctcagcaactcggggag
8	SaCr3	ctcggggaggggggtgcagg
9	SaCr4	ACTTTCCATTGACGTCAATGGG
10	SaCr5	CTTCGGGGGGGACGGGGCAGGG
11	SaCr6	cttcgccccgcgcccgctaga
12	SaCr7	tcggggaggggggtgcagg
13	SaCr8	tgctcagcaactcggggag
14	SaCr9	gcggggggtggcggcaggt

TABLE-US-00013 TABLE 5 Mouse Acvr2b gRNAs

SEQ ID NO	Description	Sequence
15	SaCr1	gctcctctgggacccctga
16	SaCr2	tgctatggagcccacgcta
17	SaCr3	ggcgcgctctccgagctgg
18	SaCr4	agcgcgccccgcctagccc
19	SaCr5	gcctctttgtatccaacat
20	SaCr6	gcacgctcctctgggacccctga
21	SaCr7	gtgggggaggggacctgaa
22	SaCr8	gaggggccatgaacggggg
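
SEQ ID NO: 1 of the sequence listing gives the PAM as nngrrt. As a minimal sketch of how candidate protospacers might be located next to such a PAM, the Python function below scans the forward strand of a DNA string for a fixed-length stretch immediately followed by NNGRRT. The function name, the default length of 21 nt, and the toy input are illustrative assumptions and are not taken from the disclosure; the reverse strand is not searched.

```python
import re


def find_sa_protospacers(seq, length=21):
    """Return (start, protospacer, pam) tuples for forward-strand hits.

    A hit is `length` bases followed directly by a PAM matching NNGRRT
    (N = any base, R = A or G). `start` is a 0-based index into `seq`.
    The default length is an illustrative assumption.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]{2}G[AG][AG]T))" % length)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Toy sequence (hypothetical): a 5-nt "protospacer" followed by the PAM CCGAAT.
hits = find_sa_protospacers("aaaaaccgaattt", length=5)
print(hits)
```

The protospacers listed in Tables 4 and 5 vary in length, so the `length` argument would need to be set per design.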

1.6.7 S. aureus Cas9-Based Repressor Gene Sequence

[0534] SEQ ID NO: 23 provides a nucleic acid sequence encoding HA-NLS-dSaCas9-NLS-KRAB. Residues 1-3 are a start codon. Residues 4-30 encode an HA tag. Residues 31-78 encode a first nuclear localization sequence (NLS). Residues 79-3234 encode S. aureus "dead" Cas9. Residues 103-105 encode the first inactivating mutation. Residues 1813-1815 encode the second inactivating mutation. Residues 3235-3282 encode a second NLS. Residues 3289-3597 encode KRAB. Residues 3598-3600 are a stop codon. All residues are numbered based on SEQ ID NO: 23.
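
The residue ranges above are 1-based and inclusive, so pulling a feature out of the sequence in code needs an off-by-one adjustment. The following Python sketch shows one way this could be done; the dictionary keys and helper names are hypothetical labels chosen for illustration, and only the first 30 nucleotides of SEQ ID NO: 23 are used as demonstration input.

```python
# Feature coordinates as stated for SEQ ID NO: 23 (1-based, inclusive).
FEATURES = {
    "start_codon": (1, 3),
    "HA_tag": (4, 30),
    "NLS_1": (31, 78),
    "dSaCas9": (79, 3234),
    "mutation_1": (103, 105),
    "mutation_2": (1813, 1815),
    "NLS_2": (3235, 3282),
    "KRAB": (3289, 3597),
    "stop_codon": (3598, 3600),
}


def extract(seq, start, end):
    """Slice a 1-based inclusive range out of a 0-based Python string."""
    return seq[start - 1:end]


def feature_length(start, end):
    """Number of residues in a 1-based inclusive range."""
    return end - start + 1


def codon_index(residue_start, segment_start):
    """1-based codon number of a codon within a segment starting at segment_start."""
    return (residue_start - segment_start) // 3 + 1


# First 30 nt of SEQ ID NO: 23 as listed (start codon plus HA tag).
prefix = "atgtacccatacgatgttccagattacgct"
print(extract(prefix, *FEATURES["start_codon"]))
print(extract(prefix, *FEATURES["HA_tag"]))

# Every stated feature spans a whole number of codons.
print(all(feature_length(s, e) % 3 == 0 for s, e in FEATURES.values()))

# Position of each mutated codon within the dSaCas9 segment.
print(codon_index(103, 79), codon_index(1813, 79))
```

With these coordinates the two mutated codons fall at positions 9 and 579 of the dSaCas9 segment. That would be consistent with the D10A and N580A mutations named in [0537] if the segment omits the native initiator methionine, although the disclosure does not state this explicitly. Residues 3283-3288 are not assigned to any feature in the paragraph above.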

TABLE-US-00014 SEQ ID NO: 23 atgtacccatacgatgttccagattacgctGCCCCAAAGAAGAAGCGGAA GGTCGGTATCCACGGAGTCCCAGCAGCCAAGCGGAACTACATCCTGGGCC TGGCCATCGGCATCACCAGCGTGGGCTACGGCATCATCGACTACGAGACA CGGGACGTGatcgATGCCGGCGTGCGGCTGTTCAAAGAGGCCAACGTGGA AAACAACGAGGGCAGGCGGAGCAAGAGAGGCGCCAGAAGGCTGAAGCGGC GGAGGCGGCATAGAATCCAGAGAGTGAAGAAGCTGCTGTTCGACTACAAC CTGCTGACCGACCACAGCGAGCTGAGCGGCATCAACCCCTACGAGGCCAG AGTGAAGGGCCTGAGCCAGAAGCTGAGCGAGGAAGAGTTCTCTGCCGCCC TGCTGCACCTGGCCAAGAGAAGAGGCGTGCACAACGTGAACGAGGTGGAA GAGGACACCGGCAACGAGCTGTCCACCAAAGAGCAGATCAGCCGGAACAG CAAGGCCCTGGAAGAGAAATACGTGGCCGAACTGCAGCTGGAACGGCTGA AGAAAGACGGCGAAGTGCGGGGCAGCATCAACAGATTCAAGACCAGCGAC TACGTGAAAGAAGCCAAACAGCTGCTGAAGGTGCAGAAGGCCTACCACCA GCTGGACCAGAGCTTCATCGACACCTACATCGACCTGCTGGAAACCCGGC GGACCTACTATGAGGGACCTGGCGAGGGCAGCCCCTTCGGCTGGAAGGAC ATCAAAGAATGGTACGAGATGCTGATGGGCCACTGCACCTACTTCCCCGA GGAACTGCGGAGCGTGAAGTACGCCTACAACGCCGACCTGTACAACGCCC TGAACGACCTGAACAATCTCGTGATCACCAGGGACGAGAACGAGAAGCTG GAATATTACGAGAAGTTCCAGATCATCGAGAACGTGTTCAAGCAGAAGAA GAAGCCCACCCTGAAGCAGATCGCCAAAGAAATCCTCGTGAACGAAGAGG ATATTAAGGGCTACAGAGTGACCAGCACCGGCAAGCCCGAGTTCACCAAC CTGAAGGTGTACCACGACATCAAGGACATTACCGCCCGGAAAGAGATTAT TGAGAACGCCGAGCTGCTGGATCAGATTGCCAAGATCCTGACCATCTACC AGAGCAGCGAGGACATCCAGGAAGAACTGACCAATCTGAACTCCGAGCTG ACCCAGGAAGAGATCGAGCAGATCTCTAATCTGAAGGGCTATACCGGCAC CCACAACCTGAGCCTGAAGGCCATCAACCTGATCCTGGACGAGCTGTGGC ACACCAACGACAACCAGATCGCTATCTTCAACCGGCTGAAGCTGGTGCCC AAGAAGGTGGACCTGTCCCAGCAGAAAGAGATCCCCACCACCCTGGTGGA CGACTTCATCCTGAGCCCCGTCGTGAAGAGAAGCTTCATCCAGAGCATCA AAGTGATCAACGCCATCATCAAGAAGTACGGCCTGCCCAACGACATCATT ATCGAGCTGGCCCGCGAGAAGAACTCCAAGGACGCCCAGAAAATGATCAA CGAGATGCAGAAGCGGAACCGGCAGACCAACGAGCGGATCGAGGAAATCA TCCGGACCACCGGCAAAGAGAACGCCAAGTACCTGATCGAGAAGATCAAG CTGCACGACATGCAGGAAGGCAAGTGCCTGTACAGCCTGGAAGCCATCCC TCTGGAAGATCTGCTGAACAACCCCTTCAACTATGAGGTGGACCACATCA TCCCCAGAAGCGTGTCCTTCGACAACAGCTTCAACAACAAGGTGCTCGTG AAGCAGGAAGAAgcCAGCAAGAAGGGCAACCGGACCCCATTCCAGTACCT GAGCAGCAGCGACAGCAAGATCAGCTACGAAACCTTCAAGAAGCACATCC 
TGAATCTGGCCAAGGGCAAGGGCAGAATCAGCAAGACCAAGAAAGAGTAT CTGCTGGAAGAACGGGACATCAACAGGTTCTCCGTGCAGAAAGACTTCAT CAACCGGAACCTGGTGGATACCAGATACGCCACCAGAGGCCTGATGAACC TGCTGCGGAGCTACTTCAGAGTGAACAACCTGGACGTGAAAGTGAAGTCC ATCAATGGCGGCTTCACCAGCTTTCTGCGGCGGAAGTGGAAGTTTAAGAA AGAGCGGAACAAGGGGTACAAGCACCAGCGCCAGGACGCCCTGATCATTG CCAACGCCGATTTCATCTTCAAAGAGTGGAAGAAACTGGACAAGGCCAAA AAAGTGATGGAAAACCAGATGTTCGAGGAAAAGCAGGCCGAGAGCATGCC CGAGATCGAAACCGAGCAGGAGTACAAAGAGATCTTCATCACCCCCCACC AGATCAAGCACATTAAGGACTTCAAGGACTACAAGTACAGCCACCGGGTG GACAAGAAGCCTAATAGAGAGCTGATTAACGACACCCTGTACTCCACCCG GAAGGACGACAAGGGCAACACCCTGATCGTGAACAATCTGAACGGCCTGT ACGACAAGGACAATGACAAGCTGAAAAAGCTGATCAACAAGAGCCCCGAA AAGCTGCTGATGTACCACCACGACCCCCAGACCTACCAGAAACTGAAGCT GATTATGGAACAGTACGGCGACGAGAAGAATCCCCTGTACAAGTACTACG AGGAAACCGGGAACTACCTGACCAAGTACTCCAAAAAGGACAACGGCCCC GTGATCAAGAAGATTAAGTATTACGGCAACAAACTGAACGCCCATCTGGA CATCACCGACGACTACCCCAACAGCAGAAACAAGGTCGTGAAGCTGTCCC TGAAGCCCTACAGATTCGACGTGTACCTGGACAATGGCGTGTACAAGTTC GTGACCGTGAAGAATCTGGATGTGATCAAAAAAGAAAACTACTACGAAGT GAATAGCAAGTGCTATGAGGAAGCTAAGAAGCTGAAGAAGATCAGCAACC AGGCCGAGTTTATCGCCTCCTTCTACAACAACGATCTGATCAAGATCAAC GGCGAGCTGTATAGAGTGATCGGCGTGAACAACGACCTGCTGAACCGGAT CGAAGTGAACATGATCGACATCACCTACCGCGAGTACCTGGAAAACATGA ACGACAAGAGGCCCCCCAGGATCATTAAGACAATCGCCTCCAAGACCCAG AGCATTAAGAAGTACAGCACAGACATTCTGGGCAACCTGTATGAAGTGAA ATCTAAGAAGCACCCTCAGATCATCAAAAAGGGCAAAAGGCCGGCGGCCA CGAAAAAGGCCGGCCAGGCAAAAAAGAAAAAGggatcCGATGCTAAGTCA CTGACTGCCTGGTCCCGGACACTGGTGACCTTCAAGGATGTGTTTGTGGA CTTCACCAGGGAGGAGTGGAAGCTGCTGGACACTGCTCAGCAGATCCTGT ACAGAAATGTGATGCTGGAGAACTATAAGAACCTGGTTTCCTTGGGTTAT CAGCTTACTAAGCCAGATGTGATCCTCCGGTTGGAGAAGGGAGAAGAGCC CTGGCTGGTGGAGAGAGAAATTCACCAAGAGACCCATCCTGATTCAGAGA CTGCATTTGAAATCAAATCATCAGTTCCGAAAAAGAAACGCAAAGtttaa

2. Additional Information

[0535] Engineered DNA-binding proteins that can be customized to target any gene in mammalian cells have enabled rapid advances in biomedical research and are a promising platform for gene therapies. The RNA-guided CRISPR/Cas9 system has emerged as a promising platform for programmable targeted gene regulation. Current Cas9 transcriptional repressors are based on Cas9 derived from the S. pyogenes bacterial strain. Fusion of catalytically inactive, "dead" Cas9 (dCas9) to the Kruppel-associated box (KRAB) domain generates a synthetic repressor capable of highly specific and potent silencing of target genes in cell culture experiments. However, a technology to deliver CRISPR/Cas9-based gene repressors in vivo has not been developed. Adeno-associated virus (AAV) vectors have been proposed for gene delivery of CRISPR/Cas9 components for in vivo studies and therapeutic applications. AAV vectors provide stable gene expression with low risk of mutagenic integration events, can be engineered to target tissues of interest in vivo, and are already in use in humans in clinical trials. However, gene delivery of S. pyogenes dCas9-KRAB in vivo is challenging because the size of the S. pyogenes dCas9 and KRAB domain fusion exceeds the packaging limit of standard AAV vectors. Recently, a smaller Cas9 nuclease protein derived from S. aureus was described for AAV delivery and in vivo gene editing. An S. aureus nuclease-null dCas9 was generated and fused to a synthetic KRAB repressor to create a programmable RNA-guided repressor for in vivo gene regulation (FIG. 10A). An AAV-based expression system was designed to deliver dCas9-KRAB fusion proteins and CRISPR gRNA targeting molecules in vivo (FIGS. 10B and 10C). When delivered intramuscularly using an AAV9 serotype vector, S. aureus dCas9-KRAB protein was expressed efficiently in skeletal muscle up to 8 weeks after delivery in a wild-type mouse model (FIG. 3D). Furthermore, it was demonstrated that S. 
aureus dCas9-KRAB is biologically active and can effectively silence an endogenous gene, Acvr2b, in the injected muscle, heart and liver when delivered with a target guide RNA molecule (FIGS. 3E, 5B and 5D). This gene delivery system can be customized to target any endogenous gene by designing a new guide RNA molecule, enabling potent and stable gene repression in animal models and for human use.
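
The packaging argument above can be illustrated with simple arithmetic. The Python sketch below is a rough estimate under stated assumptions, not a calculation from the disclosure: the 4,700-nt capacity is a commonly cited approximate AAV limit, the S. pyogenes Cas9 length of 1,368 amino acids is general background knowledge, and the 1,000-nt allowance for all non-coding elements (ITRs, promoter, polyadenylation signal) is a purely hypothetical placeholder. Only the 3,600-nt open reading frame and the 79-3234 dSaCas9 range come from SEQ ID NO: 23.

```python
AAV_CAPACITY_NT = 4700        # assumption: approximate packaging limit
SEQ23_ORF_NT = 3600           # from SEQ ID NO: 23 (residues 1-3600)
DSACAS9_NT = 3234 - 79 + 1    # dSaCas9 segment of SEQ ID NO: 23
SPCAS9_NT = 1368 * 3          # assumption: S. pyogenes Cas9 is 1,368 aa
OTHER_ELEMENTS_NT = 1000      # hypothetical: ITRs + promoter + polyA


def fits(orf_nt, other_nt=OTHER_ELEMENTS_NT, capacity=AAV_CAPACITY_NT):
    """True if the ORF plus the other vector elements is within capacity."""
    return orf_nt + other_nt <= capacity


# Same tags, NLSs and KRAB, but with the S. pyogenes Cas9 segment swapped in.
sp_orf_nt = SEQ23_ORF_NT - DSACAS9_NT + SPCAS9_NT

print("S. aureus fusion ORF:", SEQ23_ORF_NT, "nt, fits:", fits(SEQ23_ORF_NT))
print("S. pyogenes fusion ORF:", sp_orf_nt, "nt, fits:", fits(sp_orf_nt))
```

Under these assumptions the S. pyogenes fusion alone is already close to the assumed capacity before any regulatory elements are added, which is the qualitative point made in the text.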

3. Hypercholesterolemia

[0536] Hypercholesterolemia is a risk factor for cardiovascular disease, a leading cause of mortality in the United States. PCSK9 is a circulating protease that binds and facilitates degradation of low density lipoprotein (LDL) receptors. Individuals with naturally reduced PCSK9 demonstrate hypocholesterolemia, and silencing PCSK9 expression has been proposed as a mechanism to lower levels of harmful LDL cholesterol in the serum. RNA-guided CRISPR/Cas9-based transcriptional modulators can enable efficient and specific gene repression. An adeno-associated virus (AAV)-based gene modulation platform was developed using CRISPR/Cas9 repressors to enable targeted silencing of PCSK9 gene expression in vivo. To generate RNA-guided repressors, nuclease-inactive S. aureus Cas9 was fused to the KRAB domain, a motif found in mammalian transcription factors. CRISPR guide RNAs were targeted to the transcriptional start site region of the mouse PCSK9 gene. The dCas9-KRAB repressor and PCSK9 guide RNA (protospacer sequence: gagggaagggatacaggctgga (SEQ ID NO: 42); mm10 coordinates: chr4 106464536-106464557) were expressed on separate adeno-associated viral vectors and delivered intravenously to wild-type mice (FIG. 11A). Two weeks after treatment, mice expressing dCas9-KRAB and PCSK9 guide RNA had significantly reduced circulating PCSK9 and total cholesterol levels in serum, compared to sham-treated and dCas9-KRAB only-treated controls (FIGS. 11B and 11C). The magnitude of PCSK9 repression and cholesterol reduction depended on the dose of AAV administered. Overall these results demonstrate that RNA-guided CRISPR/dCas9-KRAB repressors can effectively silence target liver gene expression in mouse models and show the potential of this technology for basic research and clinical applications.
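
As a quick consistency check on the protospacer and its mm10 coordinates quoted above, the following Python sketch parses the coordinate string and compares the span with the protospacer length. It assumes the coordinates are 1-based and inclusive, which the disclosure does not state; the helper names are illustrative.

```python
def parse_region(text):
    """Parse 'chr4 106464536-106464557' into (chrom, start, end)."""
    chrom, coords = text.split()
    start, end = coords.split("-")
    return chrom, int(start), int(end)


def span_length(start, end):
    """Length of a 1-based inclusive interval (assumed convention)."""
    return end - start + 1


protospacer = "gagggaagggatacaggctgga"  # SEQ ID NO: 42 as quoted above
chrom, start, end = parse_region("chr4 106464536-106464557")

print(chrom, span_length(start, end), len(protospacer))
```

Both values come out as 22, so the quoted interval and the quoted protospacer are the same length under the inclusive-coordinate assumption.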

4. Regulation of PCSK9 Expression In Vivo

4.1 Materials and Methods

Plasmid Constructs and AAV Design

[0537] An inactive version of S. aureus Cas9 (dSaCas9) was created by introducing D10A and N580A mutations (Ran et al., Nature. 2015; 520:186-91, incorporated by reference herein in its entirety). A SaCas9 AAV expression plasmid (Addgene #61592) was received as a gift from the Zhang lab (Ran et al., Nature. 2015; 520:186-91, incorporated by reference herein in its entirety). The nuclease-active SaCas9 was replaced with dSaCas9-KRAB. The C-terminal 3×HA epitope tag was also removed and a single N-terminal HA tag was incorporated for tracking protein expression. For the AAV-U6 gRNA plasmid, a U6-PCSK9 gRNA cassette was cloned into a pTR-eGFP backbone, replacing the CMV with the gRNA cassette.

AAV Production

[0538] ITRs were verified by SmaI digest before production. AAV-dSaCas9-KRAB and AAV-U6 PCSK9 gRNA were used to generate AAV9 in two separate batches by the Gene Transfer Vector Core at Schepens Eye Research Institute, Massachusetts Eye and Ear.

Animal Studies

[0539] Animal studies were conducted with adherence to the guidelines for the care and use of laboratory animals of the National Institutes of Health (NIH). All the experiments with animals were approved by the Institutional Animal Care and Use Committee (IACUC) at Duke University. 6-8 week old C57BL/6 mice (Jackson Labs) were anesthetized and maintained at 37 °C. The tail vein was prepared and injected with 200 µL of AAV solution (2×10¹¹ to 4×10¹² viral genomes/total dose) or sterile PBS using a 31 G needle. Low dose treatment was defined as 2×10¹¹ viral genomes per vector per mouse (vg/v/m), and moderate dose was defined as 4×10¹¹ vg/v/m. Mice were injected with a saline control, AAV-dSaCas9-KRAB alone, AAV-U6 PCSK9 gRNA alone, or a 1:1 mixture of AAV-dSaCas9-KRAB and AAV-U6 PCSK9 gRNA. Mice were fasted for 12-14 hours and submandibular vein blood collections were performed every two weeks, starting on day 0 four to six hours prior to tail vein injection. At 6 and 14 weeks post-injection, mice were euthanized by CO₂ inhalation, perfused with PBS, and tissue was collected into RNAlater® (Life Technologies) for DNA and RNA, snap-frozen for protein analysis, or fixed in 4% PFA and embedded in OCT for histology.
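
The dose definitions above are given per vector per mouse, so the total dose for a two-vector mixture and the implied concentration of the injected solution follow by simple arithmetic. The Python sketch below works this through; the function names are illustrative, and the calculation assumes that a 1:1 mixture contains the stated per-vector dose of each of the two vectors in a single 200 µL injection.

```python
INJECTION_VOLUME_UL = 200  # from the protocol above

LOW_DOSE_VG_PER_VECTOR = 2 * 10**11       # low dose, vg/v/m
MODERATE_DOSE_VG_PER_VECTOR = 4 * 10**11  # moderate dose, vg/v/m


def total_dose(vg_per_vector, n_vectors):
    """Total viral genomes per mouse for n co-injected vectors."""
    return vg_per_vector * n_vectors


def concentration_vg_per_ul(total_vg, volume_ul=INJECTION_VOLUME_UL):
    """Viral genomes per microliter of injected solution."""
    return total_vg / volume_ul


for label, dose in [("low", LOW_DOSE_VG_PER_VECTOR),
                    ("moderate", MODERATE_DOSE_VG_PER_VECTOR)]:
    total = total_dose(dose, n_vectors=2)
    print(label, total, concentration_vg_per_ul(total))
```

Under this reading the two-vector totals are 4×10¹¹ and 8×10¹¹ viral genomes per mouse. These coincide with the values quoted for the 1:1 mixtures in the Results, although the Results label them per vector; the disclosure does not reconcile the two descriptions.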

qRT-PCR

[0540] Tissue samples were stored in RNAlater (Ambion) and total RNA was isolated using the RNA Universal Plus Kit (Qiagen). cDNA synthesis was performed using the SuperScript VILO cDNA Synthesis Kit (Invitrogen). For genomic qPCR experiments, genomic DNA from tissue samples was isolated using a Blood and Tissue Kit (Qiagen). Quantitative real-time PCR (qRT-PCR) using QuantIT Perfecta Supermix was performed with the CFX96 Real-Time PCR Detection System (Bio-Rad) with the oligonucleotide primers optimized for 90-110% amplification efficiency. The results are expressed as fold-increase in mRNA expression of the gene of interest normalized to Gapdh expression by the ΔΔCt method.
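
For readers unfamiliar with the ΔΔCt method named above, the following Python sketch shows the standard calculation. It assumes an amplification efficiency of exactly 100% (a doubling per cycle), which is the usual simplification when primers fall in the 90-110% range; the Ct values in the example are hypothetical and are not data from this disclosure.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control, efficiency=2.0):
    """Relative expression of a target gene by the delta-delta-Ct method.

    Each sample's target Ct is first normalized to the reference gene
    (for example Gapdh), then the treated sample is compared with the
    control. `efficiency` is the per-cycle amplification factor; 2.0
    corresponds to the assumed 100% efficiency.
    """
    delta_ct_treated = ct_target_treated - ct_ref_treated
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_treated - delta_ct_control
    return efficiency ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles later
# in the treated sample, with the reference gene unchanged.
print(fold_change(26.0, 18.0, 24.0, 18.0))
```

A result below 1 indicates lower target expression in the treated sample relative to the control, and a result above 1 indicates higher expression.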

RNA-Sequencing

[0541] mRNA was purified from total RNA using oligo(dT) Dynabeads (Invitrogen). First-strand cDNA was synthesized using the SuperScript VILO cDNA Synthesis Kit (Invitrogen) and second-strand cDNA was synthesized using DNA polymerase I (New England Biolabs). cDNA was purified using Agencourt AMPure XP beads (Beckman Coulter). Purified cDNA was treated with Nextera transposase (Illumina) for 5 min at 55 °C to simultaneously fragment and insert sequencing primers into the double-stranded cDNA. Transposase activity was halted using QG buffer (Qiagen) and fragmented cDNA was purified on AMPure XP beads. Indexed sequencing libraries were PCR-amplified and sequenced for 50-bp paired-end reads on an Illumina HiSeq 2000 instrument at the Duke Genome Sequencing Shared Resource. Reads aligned to the delivered AAV vector were removed from analysis. Filtered reads were then aligned to mouse RefSeq transcripts using Bowtie 2 (Langmead and Salzberg, Nat Methods. 2012; 9:357-9, incorporated by reference herein in its entirety). Statistical analysis, including multiple hypothesis testing, on three independent biological replicates was performed using DESeq (Anders and Huber, Genome Biol. 2010; 11:R106, incorporated by reference herein in its entirety).
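
The paragraph above mentions multiple hypothesis testing without naming the correction. As an illustration only, the Python sketch below implements the Benjamini-Hochberg adjustment, a widely used false discovery rate procedure; it is not claimed to be the exact computation performed in the study, and the p-values in the example are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay
    # monotone: each one is the minimum of p * n / rank over its own rank
    # and all larger ranks.
    for offset, idx in enumerate(reversed(order)):
        rank = n - offset
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four genes.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Genes whose adjusted p-value falls below a chosen threshold would then be called differentially expressed at that false discovery rate.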

Western Blot

[0542] Minced tissue was lysed in RIPA buffer (Sigma), and the BCA assay (Pierce) was performed to quantify total protein. Lysates were mixed with LDS sample buffer (Invitrogen) and boiled for 5 min; equal amounts of total protein were run in NuPAGE Novex 4-12% Bis-Tris polyacrylamide gels (Life Technologies) and transferred to nitrocellulose membranes. Nonspecific antibody binding was blocked with 5% nonfat milk in TBS-T (50 mM Tris, 150 mM NaCl and 0.1% Tween-20) for 30 min. The membranes were then incubated with primary antibody in 5% milk in TBS-T: rabbit anti-LDLR diluted 1:1000 overnight at 4 °C or rabbit anti-GAPDH diluted 1:5000 for 60 min at room temperature. Membranes labeled with primary antibodies were incubated with anti-rabbit HRP-conjugated antibody (Sigma-Aldrich, A6154) diluted 1:5000 for 60 min and washed with TBS-T for 60 min. Membranes were visualized using the Immun-Star WesternC Chemiluminescence Kit (Bio-Rad) and images were captured using a ChemiDoc XRS+ system and processed using ImageLab software (Bio-Rad).

Histology

[0543] A cross section of the median liver lobe was fixed overnight in 4% PFA and embedded in OCT using liquid nitrogen-cooled isopentane. Sections of 10 µm were cut onto pre-treated histological slides. Hematoxylin and eosin staining was used to reveal general liver histopathology.

Serum Analysis

[0544] After harvest, serum was stored in single-use aliquots at -80 °C. Total cholesterol and LDL cholesterol levels were measured from serum via a colorimetric assay according to the manufacturers' instructions (ThermoScientific Total Cholesterol Reagents #TR13421 and WakoChemical LDL Cholesterol #993-00404). PCSK9 serum protein levels were quantified by ELISA with a standard curve according to the manufacturer's instructions (R&D Systems #MPC900).
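
Quantification against a standard curve, as mentioned above, means converting each sample's absorbance into a concentration using readings from standards of known concentration. The Python sketch below uses simple linear interpolation between the two bracketing standards. This is a deliberate simplification chosen for illustration; the manufacturer's instructions may specify a different curve fit, and all numbers in the example are hypothetical.

```python
def interpolate_concentration(absorbance, standards):
    """Estimate concentration from absorbance by linear interpolation.

    `standards` is a list of (concentration, absorbance) pairs. The
    sample absorbance must lie within the range of the standards.
    """
    points = sorted(standards, key=lambda p: p[1])  # order by absorbance
    for (c0, a0), (c1, a1) in zip(points, points[1:]):
        if a0 <= absorbance <= a1:
            if a1 == a0:
                return c0  # flat segment: avoid dividing by zero
            return c0 + (absorbance - a0) * (c1 - c0) / (a1 - a0)
    raise ValueError("absorbance outside the range of the standards")


# Hypothetical standards: (concentration, absorbance).
standards = [(0.0, 0.05), (100.0, 0.25), (200.0, 0.45)]
print(interpolate_concentration(0.35, standards))
```

Samples that read outside the standards raise an error rather than being extrapolated, which mirrors the common practice of diluting and re-running such samples.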

4.2 Results

[0545] Three independent studies were conducted, in which dSaCas9-KRAB repressor and PCSK9 guide RNA were delivered by AAV vectors to mice.

[0546] In the first study, mice were administered PBS, AAV-dSaCas9-KRAB alone (1×10¹² total genomes/vector/mouse), or a low-dose 1:1 mixture of AAV-dSaCas9-KRAB and AAV-U6 PCSK9 gRNA (4×10¹¹ viral genomes/vector/mouse). Four mice were tested in each treatment group and followed for 6 weeks. As shown in FIG. 12A, low-dose treatment with dSaCas9-KRAB and PCSK9 gRNA effectively lowered the serum levels of PCSK9 as measured by ELISA for at least 42 days post-treatment. Treatment with dSaCas9-KRAB alone did not reduce the serum levels of PCSK9 (FIG. 12A). Consistent with the reduction of PCSK9 protein levels, a reduction of PCSK9 mRNA levels in the liver was also observed in a qRT-PCR analysis (FIG. 12B) as well as an RNA-seq analysis (FIG. 12C). Total cholesterol and LDL cholesterol levels in the serum were measured using a colorimetric assay. As shown in FIGS. 12D and 12E, both the total and LDL cholesterol levels were reduced over the course of 42 days by the low-dose treatment with dSaCas9-KRAB and PCSK9 gRNA, compared to the PBS treatment or the treatment with dSaCas9-KRAB alone.

[0547] In the second study, mice were administered PBS, AAV-dSaCas9-KRAB alone (4×10¹¹ total genomes/vector/mouse), AAV-U6 PCSK9 gRNA alone (4×10¹¹ total genomes/vector/mouse), or a moderate-dose 1:1 mixture of AAV-dSaCas9-KRAB and AAV-U6 PCSK9 gRNA (8×10¹¹ viral genomes/vector/mouse). Four mice were tested in each treatment group and followed for 6 weeks. Consistent with results from the low-dose study described above, treatment with a moderate dose of dSaCas9-KRAB and PCSK9 gRNA also reduced PCSK9 protein levels (FIGS. 13A and 13B), as well as total cholesterol levels (FIGS. 13C and 13D) and LDL cholesterol levels (FIGS. 13E and 13F) in the serum.

[0548] In the third study, mice were administered PBS, a low-dose 1:1 mixture of AAV-dSaCas9-KRAB and AAV-U6 PCSK9 gRNA (4×10¹¹ viral genomes/vector/mouse), or a moderate-dose 1:1 mixture of AAV-dSaCas9-KRAB and AAV-U6 PCSK9 gRNA (8×10¹¹ viral genomes/vector/mouse). Four mice were tested in each group and followed for 24 weeks. As shown in FIG. 14A, both the low-dose and moderate-dose treatments with dSaCas9-KRAB and PCSK9 gRNA significantly lowered the serum PCSK9 levels for at least 168 days post-treatment. Both treatments also reduced total (FIGS. 14B and 14C) and LDL (FIG. 14D) cholesterol levels in the serum.

[0549] Any patents or publications mentioned in this specification are indicative of the levels of those skilled in the art to which the invention pertains. These patents and publications are herein incorporated by reference to the same extent as if each individual publication was specifically and individually indicated to be incorporated by reference. In case of conflict, the present specification, including definitions, will control.

[0550] One skilled in the art will readily appreciate that the present invention is well adapted to carry out the objects and obtain the ends and advantages mentioned, as well as those inherent therein. The present disclosure described herein is presently representative of preferred embodiments, is exemplary, and is not intended as a limitation on the scope of the invention. Changes therein and other uses that are encompassed within the spirit of the invention, as defined by the scope of the claims, will occur to those skilled in the art.

Sequence CWU 1

1

4216DNAArtificial sequencePAM sequencemisc_feature(1)..(2)n is a, c, g, or t 1nngrrt 6214048DNAArtificial sequenceAAV Vector Construct 2gtcgacggat cgggagatct cccgatcccc tatggtgcac tctcagtaca atctgctctg 60atgccgcata gttaagccag tatctgctcc ctgcttgtgt gttggaggtc gctgagtagt 120gcgcgagcaa aatttaagct acaacaaggc aaggcttgac cgacaattgc atgaagaatc 180tgcttagggt taggcgtttt gcgctgcttc gcgatgtacg ggccagatat acgcgttgac 240attgattatt gactagttat taatagtaat caattacggg gtcattagtt catagcccat 300atatggagtt ccgcgttaca taacttacgg taaatggccc gcctggctga ccgcccaacg 360acccccgccc attgacgtca ataatgacgt atgttcccat agtaacgcca atagggactt 420tccattgacg tcaatgggtg gagtatttac ggtaaactgc ccacttggca gtacatcaag 480tgtatcatat gccaagtacg ccccctattg acgtcaatga cggtaaatgg cccgcctggc 540attatgccca gtacatgacc ttatgggact ttcctacttg gcagtacatc tacgtattag 600tcatcgctat taccatggtg atgcggtttt ggcagtacat caatgggcgt ggatagcggt 660ttgactcacg gggatttcca agtctccacc ccattgacgt caatgggagt ttgttttggc 720accaaaatca acgggacttt ccaaaatgtc gtaacaactc cgccccattg acgcaaatgg 780gcggtaggcg tgtacggtgg gaggtctata taagcagcgc gttttgcctg tactgggtct 840ctctggttag accagatctg agcctgggag ctctctggct aactagggaa cccactgctt 900aagcctcaat aaagcttgcc ttgagtgctt caagtagtgt gtgcccgtct gttgtgtgac 960tctggtaact agagatccct cagacccttt tagtcagtgt ggaaaatctc tagcagtggc 1020gcccgaacag ggacttgaaa gcgaaaggga aaccagagga gctctctcga cgcaggactc 1080ggcttgctga agcgcgcacg gcaagaggcg aggggcggcg actggtgagt acgccaaaaa 1140ttttgactag cggaggctag aaggagagag atgggtgcga gagcgtcagt attaagcggg 1200ggagaattag atcgcgatgg gaaaaaattc ggttaaggcc agggggaaag aaaaaatata 1260aattaaaaca tatagtatgg gcaagcaggg agctagaacg attcgcagtt aatcctggcc 1320tgttagaaac atcagaaggc tgtagacaaa tactgggaca gctacaacca tcccttcaga 1380caggatcaga agaacttaga tcattatata atacagtagc aaccctctat tgtgtgcatc 1440aaaggataga gataaaagac accaaggaag ctttagacaa gatagaggaa gagcaaaaca 1500aaagtaagac caccgcacag caagcggccg ctgatcttca gacctggagg aggagatatg 1560agggacaatt ggagaagtga attatataaa tataaagtag taaaaattga 
accattagga 1620gtagcaccca ccaaggcaaa gagaagagtg gtgcagagag aaaaaagagc agtgggaata 1680ggagctttgt tccttgggtt cttgggagca gcaggaagca ctatgggcgc agcgtcaatg 1740acgctgacgg tacaggccag acaattattg tctggtatag tgcagcagca gaacaatttg 1800ctgagggcta ttgaggcgca acagcatctg ttgcaactca cagtctgggg catcaagcag 1860ctccaggcaa gaatcctggc tgtggaaaga tacctaaagg atcaacagct cctggggatt 1920tggggttgct ctggaaaact catttgcacc actgctgtgc cttggaatgc tagttggagt 1980aataaatctc tggaacagat ttggaatcac acgacctgga tggagtggga cagagaaatt 2040aacaattaca caagcttaat acactcctta attgaagaat cgcaaaacca gcaagaaaag 2100aatgaacaag aattattgga attagataaa tgggcaagtt tgtggaattg gtttaacata 2160acaaattggc tgtggtatat aaaattattc ataatgatag taggaggctt ggtaggttta 2220agaatagttt ttgctgtact ttctatagtg aatagagtta ggcagggata ttcaccatta 2280tcgtttcaga cccacctccc aaccccgagg ggacccgaca ggcccgaagg aatagaagaa 2340gaaggtggag agagagacag agacagatcc attcgattag tgaacggatc ggcactgcgt 2400gcgccaattc tgcagacaaa tggcagtatt catccacaat tttaaaagaa aaggggggat 2460tggggggtac agtgcagggg aaagaatagt agacataata gcaacagaca tacaaactaa 2520agaattacaa aaacaaatta caaaaattca aaattttcgg gtttattaca gggacagcag 2580agatccagtt tggttaatta aataacttcg tatagcatac attatacgaa gttatgataa 2640gagacggtgg tggcgccgct acagggcgcg tcccattcgc cattcaggct gcgcaactgt 2700tgggaagggc gatcggtgcg ggcctcttcg ctattacgcc agctggcgaa agggggatgt 2760gctgcaaggc gattaagttg ggtaacgcca gggttttccc agtcacgacg ttgtaaaacg 2820acggccagtg agcgcgcgta atacgactca ctatagggcg aattgggtac cgggcccccc 2880ctcgaggtcc tccagctttt gttcccttta gtgagggtta attgcgcgct tggcgtaatc 2940atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg 3000agccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac tcacattaat 3060tgcgttgcgc tcactgcccg ctttccactg catgacgtct ccacaattaa ttaagggtgc 3120agcggcctcc gcgccgggtt ttggcgcctc ccgcgggcgc ccccctcctc acggcgagcg 3180ctgccacgtc agacgaaggg cgcaggagcg ttcctgatcc ttccgcccgg acgctcagga 3240cagcggcccg ctgctcataa gactcggcct tagaacccca gtatcagcag aaggacattt 3300taggacggga cttgggtgac 
tctagggcac tggttttctt tccagagagc ggaacaggcg 3360aggaaaagta gtcccttctc ggcgattctg cggagggatc tccgtggggc ggtgaacgcc 3420gatgattata taaggacgcg ccgggtgtgg cacagctagt tccgtcgcag ccgggatttg 3480ggtcgcggtt cttgtttgtg gatcgctgtg atcgtcactt ggtgagttgc gggctgctgg 3540gctggccggg gctttcgtgg ccgccgggcc gctcggtggg acggaagcgt gtggagagac 3600cgccaagggc tgtagtctgg gtccgcgagc aaggttgccc tgaactgggg gttgggggga 3660gcgcacaaaa tggcggctgt tcccgagtct tgaatggaag acgcttgtaa ggcgggctgt 3720gaggtcgttg aaacaaggtg gggggcatgg tgggcggcaa gaacccaagg tcttgaggcc 3780ttcgctaatg cgggaaagct cttattcggg tgagatgggc tggggcacca tctggggacc 3840ctgacgtgaa gtttgtcact gactggagaa ctcgggtttg tcgtctggtt gcgggggcgg 3900cagttatgcg gtgccgttgg gcagtgcacc cgtacctttg ggagcgcgcg cctcgtcgtg 3960tcgtgacgtc acccgttctg ttggcttata atgcagggtg gggccacctg ccggtaggtg 4020tgcggtaggc ttttctccgt cgcaggacgc agggttcggg cctagggtag gctctcctga 4080atcgacaggc gccggacctc tggtgagggg agggataagt gaggcgtcag tttctttggt 4140cggttttatg tacctatctt cttaagtagc tgaagctccg gttttgaact atgcgctcgg 4200ggttggcgag tgtgttttgt gaagtttttt aggcaccttt tgaaatgtaa tcatttgggt 4260caatatgtaa ttttcagtgt tagactagta aattgtccgc taaattctgg ccgtttttgg 4320cttttttgtt agacgaagct tgggctgcag gtcgactcta gagccaccat gtacccatac 4380gatgttccag attacgctat ggccccaaag aagaagcgga aggtcggtat ccacggagtc 4440ccagcagcca agcggaacta catcctgggc ctggccatcg gcatcaccag cgtgggctac 4500ggcatcatcg actacgagac acgggacgtg atcgatgccg gcgtgcggct gttcaaagag 4560gccaacgtgg aaaacaacga gggcaggcgg agcaagagag gcgccagaag gctgaagcgg 4620cggaggcggc atagaatcca gagagtgaag aagctgctgt tcgactacaa cctgctgacc 4680gaccacagcg agctgagcgg catcaacccc tacgaggcca gagtgaaggg cctgagccag 4740aagctgagcg aggaagagtt ctctgccgcc ctgctgcacc tggccaagag aagaggcgtg 4800cacaacgtga acgaggtgga agaggacacc ggcaacgagc tgtccaccaa agagcagatc 4860agccggaaca gcaaggccct ggaagagaaa tacgtggccg aactgcagct ggaacggctg 4920aagaaagacg gcgaagtgcg gggcagcatc aacagattca agaccagcga ctacgtgaaa 4980gaagccaaac agctgctgaa ggtgcagaag gcctaccacc agctggacca 
gagcttcatc 5040gacacctaca tcgacctgct ggaaacccgg cggacctact atgagggacc tggcgagggc 5100agccccttcg gctggaagga catcaaagaa tggtacgaga tgctgatggg ccactgcacc 5160tacttccccg aggaactgcg gagcgtgaag tacgcctaca acgccgacct gtacaacgcc 5220ctgaacgacc tgaacaatct cgtgatcacc agggacgaga acgagaagct ggaatattac 5280gagaagttcc agatcatcga gaacgtgttc aagcagaaga agaagcccac cctgaagcag 5340atcgccaaag aaatcctcgt gaacgaagag gatattaagg gctacagagt gaccagcacc 5400ggcaagcccg agttcaccaa cctgaaggtg taccacgaca tcaaggacat taccgcccgg 5460aaagagatta ttgagaacgc cgagctgctg gatcagattg ccaagatcct gaccatctac 5520cagagcagcg aggacatcca ggaagaactg accaatctga actccgagct gacccaggaa 5580gagatcgagc agatctctaa tctgaagggc tataccggca cccacaacct gagcctgaag 5640gccatcaacc tgatcctgga cgagctgtgg cacaccaacg acaaccagat cgctatcttc 5700aaccggctga agctggtgcc caagaaggtg gacctgtccc agcagaaaga gatccccacc 5760accctggtgg acgacttcat cctgagcccc gtcgtgaaga gaagcttcat ccagagcatc 5820aaagtgatca acgccatcat caagaagtac ggcctgccca acgacatcat tatcgagctg 5880gcccgcgaga agaactccaa ggacgcccag aaaatgatca acgagatgca gaagcggaac 5940cggcagacca acgagcggat cgaggaaatc atccggacca ccggcaaaga gaacgccaag 6000tacctgatcg agaagatcaa gctgcacgac atgcaggaag gcaagtgcct gtacagcctg 6060gaagccatcc ctctggaaga tctgctgaac aaccccttca actatgaggt ggaccacatc 6120atccccagaa gcgtgtcctt cgacaacagc ttcaacaaca aggtgctcgt gaagcaggaa 6180gaagccagca agaagggcaa ccggacccca ttccagtacc tgagcagcag cgacagcaag 6240atcagctacg aaaccttcaa gaagcacatc ctgaatctgg ccaagggcaa gggcagaatc 6300agcaagacca agaaagagta tctgctggaa gaacgggaca tcaacaggtt ctccgtgcag 6360aaagacttca tcaaccggaa cctggtggat accagatacg ccaccagagg cctgatgaac 6420ctgctgcgga gctacttcag agtgaacaac ctggacgtga aagtgaagtc catcaatggc 6480ggcttcacca gctttctgcg gcggaagtgg aagtttaaga aagagcggaa caaggggtac 6540aagcaccacg ccgaggacgc cctgatcatt gccaacgccg atttcatctt caaagagtgg 6600aagaaactgg acaaggccaa aaaagtgatg gaaaaccaga tgttcgagga aaagcaggcc 6660gagagcatgc ccgagatcga aaccgagcag gagtacaaag agatcttcat caccccccac 6720cagatcaagc acattaagga 
cttcaaggac tacaagtaca gccaccgggt ggacaagaag 6780cctaatagag agctgattaa cgacaccctg tactccaccc ggaaggacga caagggcaac 6840accctgatcg tgaacaatct gaacggcctg tacgacaagg acaatgacaa gctgaaaaag 6900ctgatcaaca agagccccga aaagctgctg atgtaccacc acgaccccca gacctaccag 6960aaactgaagc tgattatgga acagtacggc gacgagaaga atcccctgta caagtactac 7020gaggaaaccg ggaactacct gaccaagtac tccaaaaagg acaacggccc cgtgatcaag 7080aagattaagt attacggcaa caaactgaac gcccatctgg acatcaccga cgactacccc 7140aacagcagaa acaaggtcgt gaagctgtcc ctgaagccct acagattcga cgtgtacctg 7200gacaatggcg tgtacaagtt cgtgaccgtg aagaatctgg atgtgatcaa aaaagaaaac 7260tactacgaag tgaatagcaa gtgctatgag gaagctaaga agctgaagaa gatcagcaac 7320caggccgagt ttatcgcctc cttctacaac aacgatctga tcaagatcaa cggcgagctg 7380tatagagtga tcggcgtgaa caacgacctg ctgaaccgga tcgaagtgaa catgatcgac 7440atcacctacc gcgagtacct ggaaaacatg aacgacaaga ggccccccag gatcattaag 7500acaatcgcct ccaagaccca gagcattaag aagtacagca cagacattct gggcaacctg 7560tatgaagtga aatctaagaa gcaccctcag atcatcaaaa agggcaaaag gccggcggcc 7620acgaaaaagg ccggccaggc aaaaaagaaa aagggatccg atgctaagtc actgactgcc 7680tggtcccgga cactggtgac cttcaaggat gtgtttgtgg acttcaccag ggaggagtgg 7740aagctgctgg acactgctca gcagatcctg tacagaaatg tgatgctgga gaactataag 7800aacctggttt ccttgggtta tcagcttact aagccagatg tgatcctccg gttggagaag 7860ggagaagagc cctggctggt ggagagagaa attcaccaag agacccatcc tgattcagag 7920actgcatttg aaatcaaatc atcagttccg aaaaagaaac gcaaagttgc tagcgagggc 7980agaggaagtc ttctaacatg cggtgacgtg gaggagaatc ccggccctat gaccgagtac 8040aagcccacgg tgcgcctcgc cacccgcgac gacgtcccca gggccgtacg caccctcgcc 8100gccgcgttcg ccgactaccc cgccacgcgc cacaccgtcg atccggaccg ccacatcgag 8160cgggtcaccg agctgcaaga actcttcctc acgcgcgtcg ggctcgacat cggcaaggtg 8220tgggtcgcgg acgacggcgc cgcggtggcg gtctggacca cgccggagag cgtcgaagcg 8280ggggcggtgt tcgccgagat cggcccgcgc atggccgagt tgagcggttc ccggctggcc 8340gcgcagcaac agatggaagg cctcctggcg ccgcaccggc ccaaggagcc cgcgtggttc 8400ctggccaccg tcggcgtgtc gcccgaccac cagggcaagg gtctgggcag 
cgccgtcgtg 8460ctccccggag tggaggcggc cgagcgcgcc ggggtgcccg ccttcctgga gacctccgcg 8520ccccgcaacc tccccttcta cgagcggctc ggcttcaccg tcaccgccga cgtcgaggtg 8580cccgaaggac cgcgcacctg gtgcatgacc cgcaagcccg gtgcctgacc agcacactgg 8640cggccgttac tagcttctgc agcacgaccg gttgataata gataacttcg tatagcatac 8700attatacgaa gttatgaatt cgatatcaag cttatcgata atcaacctct ggattacaaa 8760atttgtgaaa gattgactgg tattcttaac tatgttgctc cttttacgct atgtggatac 8820gctgctttaa tgcctttgta tcatgctatt gcttcccgta tggctttcat tttctcctcc 8880ttgtataaat cctggttgct gtctctttat gaggagttgt ggcccgttgt caggcaacgt 8940ggcgtggtgt gcactgtgtt tgctgacgca acccccactg gttggggcat tgccaccacc 9000tgtcagctcc tttccgggac tttcgctttc cccctcccta ttgccacggc ggaactcatc 9060gccgcctgcc ttgcccgctg ctggacaggg gctcggctgt tgggcactga caattccgtg 9120gtgttgtcgg ggaaatcatc gtcctttcct tggctgctcg cctgtgttgc cacctggatt 9180ctgcgcggga cgtccttctg ctacgtccct tcggccctca atccagcgga ccttccttcc 9240cgcggcctgc tgccggctct gcggcctctt ccgcgtcttc gccttcgccc tcagacgagt 9300cggatctccc tttgggccgc ctccccgcat cgataccgtc gacctcgaga cctagaaaaa 9360catggagcaa tcacaagtag caatacagca gctaccaatg ctgattgtgc ctggctagaa 9420gcacaagagg aggaggaggt gggttttcca gtcacacctc aggtaccttt aagaccaatg 9480acttacaagg cagctgtaga tcttagccac tttttaaaag aaaagggggg actggaaggg 9540ctaattcact cccaacgaag acaagatatc cttgatctgt ggatctacca cacacaaggc 9600tacttccctg attggcagaa ctacacacca gggccaggga tcagatatcc actgaccttt 9660ggatggtgct acaagctagt accagttgag caagagaagg tagaagaagc caatgaagga 9720gagaacaccc gcttgttaca ccctgtgagc ctgcatggga tggatgaccc ggagagagaa 9780gtattagagt ggaggtttga cagccgccta gcatttcatc acatggcccg agagctgcat 9840ccggactgta ctgggtctct ctggttagac cagatctgag cctgggagct ctctggctaa 9900ctagggaacc cactgcttaa gcctcaataa agcttgcctt gagtgcttca agtagtgtgt 9960gcccgtctgt tgtgtgactc tggtaactag agatccctca gaccctttta gtcagtgtgg 10020aaaatctcta gcagggcccg tttaaacccg ctgatcagcc tcgactgtgc cttctagttg 10080ccagccatct gttgtttgcc cctcccccgt gccttccttg accctggaag gtgccactcc 10140cactgtcctt tcctaataaa 
atgaggaaat tgcatcgcat tgtctgagta ggtgtcattc 10200tattctgggg ggtggggtgg ggcaggacag caagggggag gattgggaag acaatagcag 10260gcatgctggg gatgcggtgg gctctatggc ttctgaggcg gaaagaacca gctggggctc 10320tagggggtat ccccacgcgc cctgtagcgg cgcattaagc gcggcgggtg tggtggttac 10380gcgcagcgtg accgctacac ttgccagcgc cctagcgccc gctcctttcg ctttcttccc 10440ttcctttctc gccacgttcg ccggctttcc ccgtcaagct ctaaatcggg ggctcccttt 10500agggttccga tttagtgctt tacggcacct cgaccccaaa aaacttgatt agggtgatgg 10560ttcacgtagt gggccatcgc cctgatagac ggtttttcgc cctttgacgt tggagtccac 10620gttctttaat agtggactct tgttccaaac tggaacaaca ctcaacccta tctcggtcta 10680ttcttttgat ttataaggga ttttgccgat ttcggcctat tggttaaaaa atgagctgat 10740ttaacaaaaa tttaacgcga attaattctg tggaatgtgt gtcagttagg gtgtggaaag 10800tccccaggct ccccagcagg cagaagtatg caaagcatgc atctcaatta gtcagcaacc 10860aggtgtggaa agtccccagg ctccccagca ggcagaagta tgcaaagcat gcatctcaat 10920tagtcagcaa ccatagtccc gcccctaact ccgcccatcc cgcccctaac tccgcccagt 10980tccgcccatt ctccgcccca tggctgacta atttttttta tttatgcaga ggccgaggcc 11040gcctctgcct ctgagctatt ccagaagtag tgaggaggct tttttggagg cctaggcttt 11100tgcaaaaagc tcccgggagc ttgtatatcc attttcggat ctgatcagca cgtgttgaca 11160attaatcatc ggcatagtat atcggcatag tataatacga caaggtgagg aactaaacca 11220tggccaagtt gaccagtgcc gttccggtgc tcaccgcgcg cgacgtcgcc ggagcggtcg 11280agttctggac cgaccggctc gggttctccc gggacttcgt ggaggacgac ttcgccggtg 11340tggtccggga cgacgtgacc ctgttcatca gcgcggtcca ggaccaggtg gtgccggaca 11400acaccctggc ctgggtgtgg gtgcgcggcc tggacgagct gtacgccgag tggtcggagg 11460tcgtgtccac gaacttccgg gacgcctccg ggccggccat gaccgagatc ggcgagcagc 11520cgtgggggcg ggagttcgcc ctgcgcgacc cggccggcaa ctgcgtgcac ttcgtggccg 11580aggagcagga ctgacacgtg ctacgagatt tcgattccac cgccgccttc tatgaaaggt 11640tgggcttcgg aatcgttttc cgggacgccg gctggatgat cctccagcgc ggggatctca 11700tgctggagtt cttcgcccac cccaacttgt ttattgcagc ttataatggt tacaaataaa 11760gcaatagcat cacaaatttc acaaataaag catttttttc actgcattct agttgtggtt 11820tgtccaaact catcaatgta tcttatcatg 
tctgtatacc gtcgacctct agctagagct 11880tggcgtaatc atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac 11940acaacatacg agccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac 12000tcacattaat tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg tcgtgccagc 12060tgcattaatg aatcggccaa cgcgcgggga gaggcggttt gcgtattggg cgctcttccg 12120cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc 12180actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga aagaacatgt 12240gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc 12300ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa 12360acccgacagg actataaaga taccaggcgt ttccccctgg aagctccctc gtgcgctctc 12420ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 12480cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 12540tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 12600gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 12660ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 12720acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttaccttcg 12780gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt 12840ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct 12900tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga 12960gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt tttaaatcaa 13020tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac 13080ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcccc gtcgtgtaga 13140taactacgat acgggagggc ttaccatctg gccccagtgc tgcaatgata ccgcgagacc 13200cacgctcacc ggctccagat ttatcagcaa taaaccagcc agccggaagg gccgagcgca 13260gaagtggtcc tgcaacttta tccgcctcca tccagtctat taattgttgc cgggaagcta 13320gagtaagtag ttcgccagtt aatagtttgc gcaacgttgt tgccattgct acaggcatcg 13380tggtgtcacg ctcgtcgttt ggtatggctt cattcagctc cggttcccaa cgatcaaggc 13440gagttacatg atcccccatg ttgtgcaaaa aagcggttag ctccttcggt cctccgatcg 13500ttgtcagaag taagttggcc gcagtgttat cactcatggt 
tatggcagca ctgcataatt 13560ctcttactgt catgccatcc gtaagatgct tttctgtgac tggtgagtac tcaaccaagt 13620cattctgaga atagtgtatg cggcgaccga gttgctcttg cccggcgtca atacgggata 13680ataccgcgcc acatagcaga actttaaaag tgctcatcat tggaaaacgt tcttcggggc 13740gaaaactctc aaggatctta ccgctgttga gatccagttc gatgtaaccc actcgtgcac 13800ccaactgatc ttcagcatct tttactttca ccagcgtttc tgggtgagca aaaacaggaa 13860ggcaaaatgc cgcaaaaaag ggaataaggg cgacacggaa atgttgaata ctcatactct 13920tcctttttca atattattga agcatttatc agggttattg tctcatgagc ggatacatat 13980ttgaatgtat ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc 14040cacctgac 1404837340DNAArtificial sequenceAAV Vector Construct 3gcaggaaccc ctagtgatgg agttggccac tccctctctg cgcgctcgct cgctcactga 60ggccgggcga ccaaaggtcg cccgacgccc gggctttgcc cgggcggcct cagtgagcga 120gcgagcgcgc agctgcctgc aggggcgcct gatgcggtat tttctcctta cgcatctgtg 180cggtatttca caccgcatac gtcaaagcaa ccatagtacg cgccctgtag cggcgcatta 240agcgcggcgg gtgtggtggt tacgcgcagc gtgaccgcta cacttgccag cgccctagcg 300cccgctcctt tcgctttctt cccttccttt ctcgccacgt tcgccggctt tccccgtcaa 360gctctaaatc gggggctccc tttagggttc cgatttagtg ctttacggca cctcgacccc 420aaaaaacttg atttgggtga tggttcacgt agtgggccat cgccctgata gacggttttt 480cgccctttga cgttggagtc cacgttcttt aatagtggac tcttgttcca aactggaaca 540acactcaacc ctatctcggg ctattctttt gatttataag ggattttgcc gatttcggcc 600tattggttaa aaaatgagct gatttaacaa aaatttaacg cgaattttaa caaaatatta 660acgtttacaa ttttatggtg cactctcagt acaatctgct ctgatgccgc atagttaagc 720cagccccgac acccgccaac acccgctgac gcgccctgac
gggcttgtct gctcccggca 780tccgcttaca gacaagctgt gaccgtctcc gggagctgca tgtgtcagag gttttcaccg 840tcatcaccga aacgcgcgag acgaaagggc ctcgtgatac gcctattttt ataggttaat 900gtcatgataa taatggtttc ttagacgtca ggtggcactt ttcggggaaa tgtgcgcgga 960acccctattt gtttattttt ctaaatacat tcaaatatgt atccgctcat gagacaataa 1020ccctgataaa tgcttcaata atattgaaaa aggaagagta tgagtattca acatttccgt 1080gtcgccctta ttcccttttt tgcggcattt tgccttcctg tttttgctca cccagaaacg 1140ctggtgaaag taaaagatgc tgaagatcag ttgggtgcac gagtgggtta catcgaactg 1200gatctcaaca gcggtaagat ccttgagagt tttcgccccg aagaacgttt tccaatgatg 1260agcactttta aagttctgct atgtggcgcg gtattatccc gtattgacgc cgggcaagag 1320caactcggtc gccgcataca ctattctcag aatgacttgg ttgagtactc accagtcaca 1380gaaaagcatc ttacggatgg catgacagta agagaattat gcagtgctgc cataaccatg 1440agtgataaca ctgcggccaa cttacttctg acaacgatcg gaggaccgaa ggagctaacc 1500gcttttttgc acaacatggg ggatcatgta actcgccttg atcgttggga accggagctg 1560aatgaagcca taccaaacga cgagcgtgac accacgatgc ctgtagcaat ggcaacaacg 1620ttgcgcaaac tattaactgg cgaactactt actctagctt cccggcaaca attaatagac 1680tggatggagg cggataaagt tgcaggacca cttctgcgct cggcccttcc ggctggctgg 1740tttattgctg ataaatctgg agccggtgag cgtggaagcc gcggtatcat tgcagcactg 1800gggccagatg gtaagccctc ccgtatcgta gttatctaca cgacggggag tcaggcaact 1860atggatgaac gaaatagaca gatcgctgag ataggtgcct cactgattaa gcattggtaa 1920ctgtcagacc aagtttactc atatatactt tagattgatt taaaacttca tttttaattt 1980aaaaggatct aggtgaagat cctttttgat aatctcatga ccaaaatccc ttaacgtgag 2040ttttcgttcc actgagcgtc agaccccgta gaaaagatca aaggatcttc ttgagatcct 2100ttttttctgc gcgtaatctg ctgcttgcaa acaaaaaaac caccgctacc agcggtggtt 2160tgtttgccgg atcaagagct accaactctt tttccgaagg taactggctt cagcagagcg 2220cagataccaa atactgtcct tctagtgtag ccgtagttag gccaccactt caagaactct 2280gtagcaccgc ctacatacct cgctctgcta atcctgttac cagtggctgc tgccagtggc 2340gataagtcgt gtcttaccgg gttggactca agacgatagt taccggataa ggcgcagcgg 2400tcgggctgaa cggggggttc gtgcacacag cccagcttgg agcgaacgac ctacaccgaa 2460ctgagatacc 
tacagcgtga gctatgagaa agcgccacgc ttcccgaagg gagaaaggcg 2520gacaggtatc cggtaagcgg cagggtcgga acaggagagc gcacgaggga gcttccaggg 2580ggaaacgcct ggtatcttta tagtcctgtc gggtttcgcc acctctgact tgagcgtcga 2640tttttgtgat gctcgtcagg ggggcggagc ctatggaaaa acgccagcaa cgcggccttt 2700ttacggttcc tggccttttg ctggcctttt gctcacatgt cctgcaggca gctgcgcgct 2760cgctcgctca ctgaggccgc ccgggcgtcg ggcgaccttt ggtcgcccgg cctcagtgag 2820cgagcgagcg cgcagagagg gagtggccaa ctccatcact aggggttcct gcggcctcta 2880gactcgaggc gttgacattg attattgact agttattaat agtaatcaat tacggggtca 2940ttagttcata gcccatatat ggagttccgc gttacataac ttacggtaaa tggcccgcct 3000ggctgaccgc ccaacgaccc ccgcccattg acgtcaataa tgacgtatgt tcccatagta 3060acgccaatag ggactttcca ttgacgtcaa tgggtggagt atttacggta aactgcccac 3120ttggcagtac atcaagtgta tcatatgcca agtacgcccc ctattgacgt caatgacggt 3180aaatggcccg cctggcatta tgcccagtac atgaccttat gggactttcc tacttggcag 3240tacatctacg tattagtcat cgctattacc atggtgatgc ggttttggca gtacatcaat 3300gggcgtggat agcggtttga ctcacgggga tttccaagtc tccaccccat tgacgtcaat 3360gggagtttgt tttggcacca aaatcaacgg gactttccaa aatgtcgtaa caactccgcc 3420ccattgacgc aaatgggcgg taggcgtgta cggtgggagg tctatataag cagagctctc 3480tggctaacta ccggtgccac catgtaccca tacgatgttc cagattacgc tgccccaaag 3540aagaagcgga aggtcggtat ccacggagtc ccagcagcca agcggaacta catcctgggc 3600ctggccatcg gcatcaccag cgtgggctac ggcatcatcg actacgagac acgggacgtg 3660atcgatgccg gcgtgcggct gttcaaagag gccaacgtgg aaaacaacga gggcaggcgg 3720agcaagagag gcgccagaag gctgaagcgg cggaggcggc atagaatcca gagagtgaag 3780aagctgctgt tcgactacaa cctgctgacc gaccacagcg agctgagcgg catcaacccc 3840tacgaggcca gagtgaaggg cctgagccag aagctgagcg aggaagagtt ctctgccgcc 3900ctgctgcacc tggccaagag aagaggcgtg cacaacgtga acgaggtgga agaggacacc 3960ggcaacgagc tgtccaccaa agagcagatc agccggaaca gcaaggccct ggaagagaaa 4020tacgtggccg aactgcagct ggaacggctg aagaaagacg gcgaagtgcg gggcagcatc 4080aacagattca agaccagcga ctacgtgaaa gaagccaaac agctgctgaa ggtgcagaag 4140gcctaccacc agctggacca gagcttcatc gacacctaca 
tcgacctgct ggaaacccgg 4200cggacctact atgagggacc tggcgagggc agccccttcg gctggaagga catcaaagaa 4260tggtacgaga tgctgatggg ccactgcacc tacttccccg aggaactgcg gagcgtgaag 4320tacgcctaca acgccgacct gtacaacgcc ctgaacgacc tgaacaatct cgtgatcacc 4380agggacgaga acgagaagct ggaatattac gagaagttcc agatcatcga gaacgtgttc 4440aagcagaaga agaagcccac cctgaagcag atcgccaaag aaatcctcgt gaacgaagag 4500gatattaagg gctacagagt gaccagcacc ggcaagcccg agttcaccaa cctgaaggtg 4560taccacgaca tcaaggacat taccgcccgg aaagagatta ttgagaacgc cgagctgctg 4620gatcagattg ccaagatcct gaccatctac cagagcagcg aggacatcca ggaagaactg 4680accaatctga actccgagct gacccaggaa gagatcgagc agatctctaa tctgaagggc 4740tataccggca cccacaacct gagcctgaag gccatcaacc tgatcctgga cgagctgtgg 4800cacaccaacg acaaccagat cgctatcttc aaccggctga agctggtgcc caagaaggtg 4860gacctgtccc agcagaaaga gatccccacc accctggtgg acgacttcat cctgagcccc 4920gtcgtgaaga gaagcttcat ccagagcatc aaagtgatca acgccatcat caagaagtac 4980ggcctgccca acgacatcat tatcgagctg gcccgcgaga agaactccaa ggacgcccag 5040aaaatgatca acgagatgca gaagcggaac cggcagacca acgagcggat cgaggaaatc 5100atccggacca ccggcaaaga gaacgccaag tacctgatcg agaagatcaa gctgcacgac 5160atgcaggaag gcaagtgcct gtacagcctg gaagccatcc ctctggaaga tctgctgaac 5220aaccccttca actatgaggt ggaccacatc atccccagaa gcgtgtcctt cgacaacagc 5280ttcaacaaca aggtgctcgt gaagcaggaa gaagccagca agaagggcaa ccggacccca 5340ttccagtacc tgagcagcag cgacagcaag atcagctacg aaaccttcaa gaagcacatc 5400ctgaatctgg ccaagggcaa gggcagaatc agcaagacca agaaagagta tctgctggaa 5460gaacgggaca tcaacaggtt ctccgtgcag aaagacttca tcaaccggaa cctggtggat 5520accagatacg ccaccagagg cctgatgaac ctgctgcgga gctacttcag agtgaacaac 5580ctggacgtga aagtgaagtc catcaatggc ggcttcacca gctttctgcg gcggaagtgg 5640aagtttaaga aagagcggaa caaggggtac aagcaccacg ccgaggacgc cctgatcatt 5700gccaacgccg atttcatctt caaagagtgg aagaaactgg acaaggccaa aaaagtgatg 5760gaaaaccaga tgttcgagga aaagcaggcc gagagcatgc ccgagatcga aaccgagcag 5820gagtacaaag agatcttcat caccccccac cagatcaagc acattaagga cttcaaggac 5880tacaagtaca 
gccaccgggt ggacaagaag cctaatagag agctgattaa cgacaccctg 5940tactccaccc ggaaggacga caagggcaac accctgatcg tgaacaatct gaacggcctg 6000tacgacaagg acaatgacaa gctgaaaaag ctgatcaaca agagccccga aaagctgctg 6060atgtaccacc acgaccccca gacctaccag aaactgaagc tgattatgga acagtacggc 6120gacgagaaga atcccctgta caagtactac gaggaaaccg ggaactacct gaccaagtac 6180tccaaaaagg acaacggccc cgtgatcaag aagattaagt attacggcaa caaactgaac 6240gcccatctgg acatcaccga cgactacccc aacagcagaa acaaggtcgt gaagctgtcc 6300ctgaagccct acagattcga cgtgtacctg gacaatggcg tgtacaagtt cgtgaccgtg 6360aagaatctgg atgtgatcaa aaaagaaaac tactacgaag tgaatagcaa gtgctatgag 6420gaagctaaga agctgaagaa gatcagcaac caggccgagt ttatcgcctc cttctacaac 6480aacgatctga tcaagatcaa cggcgagctg tatagagtga tcggcgtgaa caacgacctg 6540ctgaaccgga tcgaagtgaa catgatcgac atcacctacc gcgagtacct ggaaaacatg 6600aacgacaaga ggccccccag gatcattaag acaatcgcct ccaagaccca gagcattaag 6660aagtacagca cagacattct gggcaacctg tatgaagtga aatctaagaa gcaccctcag 6720atcatcaaaa agggcaaaag gccggcggcc acgaaaaagg ccggccaggc aaaaaagaaa 6780aagggatccg atgctaagtc actgactgcc tggtcccgga cactggtgac cttcaaggat 6840gtgtttgtgg acttcaccag ggaggagtgg aagctgctgg acactgctca gcagatcctg 6900tacagaaatg tgatgctgga gaactataag aacctggttt ccttgggtta tcagcttact 6960aagccagatg tgatcctccg gttggagaag ggagaagagc cctggctggt ggagagagaa 7020attcaccaag agacccatcc tgattcagag actgcatttg aaatcaaatc atcagttccg 7080aaaaagaaac gcaaagttta agaattccta gagctcgctg atcagcctcg actgtgcctt 7140ctagttgcca gccatctgtt gtttgcccct cccccgtgcc ttccttgacc ctggaaggtg 7200ccactcccac tgtcctttcc taataaaatg aggaaattgc atcgcattgt ctgagtaggt 7260gtcattctat tctggggggt ggggtggggc aggacagcaa gggggaggat tgggaagaga 7320atagcaggca tgctggggag 734046095DNAArtificial sequenceAAV Vector Construct 4gggggggggg ggggggggtt ggccactccc tctctgcgcg ctcgctcgct cactgaggcc 60gggcgaccaa aggtcgcccg acgcccgggc tttgcccggg cggcctcagt gagcgagcga 120gcgcgcagag agggagtggc caactccatc actaggggtt cctagatctg aattcggtac 180cagatctagg aacctagggc ctatttccca tgattccttc 
atatttgcat atacgataca 240aggctgttag agagataatt ggaattaatt tgactgtaaa cacaaagata ttagtacaaa 300atacgtgacg tagaaagtaa taatttcttg ggtagtttgc agttttaaaa ttatgtttta 360aaatggacta tcatatgctt accgtaactt gaaagtattt cgatttcttg gctttatata 420tcttgtggaa aggacgaaac accgagcgcg ccccgcctag cccgttttag tactctggaa 480acagaatcta ctaaaacaag gcaaaatgcc gtgtttatct cgtcaacttg ttggcgagat 540ttttttgcgg ccgcccgcgg tggagctcca gcttttgttc cctttagtga gggttaattc 600tagaggatcc ggtactcgag gaactgaaaa accagaaagt taactggtaa gtttagtctt 660tttgtctttt atttcaggtc ccggatccgg tggtggtgca aatcaaagaa ctgctcctca 720gtggatgttg cctttacttc taggcctgta cggaagtgtt acttctgctc taaaagctgc 780ggaattgtac ccgcggcccg ggatccaccg gtcgccacca tggtgagcaa gggcgaggag 840ctgttcaccg gggtggtgcc catcctggtc gagctggacg gcgacgtaaa cggccacaag 900ttcagcgtgt ccggcgaggg cgagggcgat gccacctacg gcaagctgac cctgaagttc 960atctgcacca ccggcaagct gcccgtgccc tggcccaccc tcgtgaccac cctgacctac 1020ggcgtgcagt gcttcagccg ctaccccgac cacatgaagc agcacgactt cttcaagtcc 1080gccatgcccg aaggctacgt ccaggagcgc accatcttct tcaaggacga cggcaactac 1140aagacccgcg ccgaggtgaa gttcgagggc gacaccctgg tgaaccgcat cgagctgaag 1200ggcatcgact tcaaggagga cggcaacatc ctggggcaca agctggagta caactacaac 1260agccacaacg tctatatcat ggccgacaag cagaagaacg gcatcaaggt gaacttcaag 1320atccgccaca acatcgagga cggcagcgtg cagctcgccg accactacca gcagaacacc 1380cccatcggcg acggccccgt gctgctgccc gacaaccact acctgagcac ccagtccgcc 1440ctgagcaaag accccaacga gaagcgcgat cacatggtcc tgctggagtt cgtgaccgcc 1500gccgggatca ctctcggcat ggacgagctg tacaagtaaa gcggccgcgg ggatccagac 1560atgataagat acattgatga gtttggacaa accacaacta gaatgcagtg aaaaaaatgc 1620tttatttgtg aaatttgtga tgctattgct ttatttgtaa ccattataag ctgcaataaa 1680caagttaaca acaacaattg cattcatttt atgtttcagg ttcaggggga ggtgtgggag 1740gttttttagt cgacctcgag cagtgtggtt ttgcaagagg aagcaaaaag cctctccacc 1800caggcctgga atgtttccac ccaagtcgaa ggcagtgtgg ttttgcaaga ggaagcaaaa 1860agcctctcca cccaggcctg gaatgtttcc acccaatgtc gagcaacccc gcccagcgtc 1920ttgtcattgg cgaattcgaa 
cacgcagatg cagtcggggc ggcgcggtcc caggtccact 1980tcgcatatta aggtgacgcg tgtggcctcg aacaccgagc gaccctgcag ccaatatggg 2040atcggccatt gaacaagatg gattgcacgc aggttctccg gccgcttggg tggagaggct 2100attcggctat gactgggcac aacagacaat cggctgctct gatgccgccg tgttccggct 2160gtcagcgcag gggcgcccgg ttctttttgt caagaccgac ctgtccggtg ccctgaatga 2220actgcaggac gaggcagcgc ggctatcgtg gctggccacg acgggcgttc cttgcgcagc 2280tgtgctcgac gttgtcactg aagcgggaag ggactggctg ctattgggcg aagtgccggg 2340gcaggatctc ctgtcatctc accttgctcc tgccgagaaa gtatccatca tggctgatgc 2400aatgcggcgg ctgcatacgc ttgatccggc tacctgccca ttcgaccacc aagcgaaaca 2460tcgcatcgag cgagcacgta ctcggatgga agccggtctt gtcgatcagg atgatctgga 2520cgaagagcat caggggctcg cgccagccga actgttcgcc aggctcaagg cgcgcatgcc 2580cgacggcgag gatctcgtcg tgacccatgg cgatgcctgc ttgccgaata tcatggtgga 2640aaatggccgc ttttctggat tcatcgactg tggccggctg ggtgtggcgg accgctatca 2700ggacatagcg ttggctaccc gtgatattgc tgaagagctt ggcggcgaat gggctgaccg 2760cttcctcgtg ctttacggta tcgccgctcc cgattcgcag cgcatcgcct tctatcgcct 2820tcttgacgag ttcttctgag gggatccgtc gactagagct cgctgatcag cctcgactgt 2880gccttctagt tgccagccat ctgttgtttg cccctccccc gtgccttcct tgaccctgga 2940aggtgccact cccactgtcc tttcctaata aaatgaggaa attgcatcgc attgtctgag 3000taggtgtcat tctattctgg ggggtggggt ggggcaggac agcaaggggg aggattggga 3060agacaatagc aggcatgctg gggagagatc taggaacccc tagtgatgga gttggccact 3120ccctctctgc gcgctcgctc gctcactgag gccgcccggg caaagcccgg gcgtcgggcg 3180acctttggtc gcccggcctc agtgagcgag cgagcgcgca gagagggagt ggccaacccc 3240cccccccccc cccctgcagc ccagctgcat taatgaatcg gccaacgcgc ggggagaggc 3300ggtttgcgta ttgggcgctc ttccgcttcc tcgctcactg actcgctgcg ctcggtcgtt 3360cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc cacagaatca 3420ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag gaaccgtaaa 3480aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca tcacaaaaat 3540cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca ggcgtttccc 3600cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 
atacctgtcc 3660gcctttctcc cttcgggaag cgtggcgctt tctcaatgct cacgctgtag gtatctcagt 3720tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt tcagcccgac 3780cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca cgacttatcg 3840ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg cggtgctaca 3900gagttcttga agtggtggcc taactacggc tacactagaa ggacagtatt tggtatctgc 3960gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc cggcaaacaa 4020accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg cagaaaaaaa 4080ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg gaacgaaaac 4140tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta gatcctttta 4200aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg gtctgacagt 4260taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg ttcatccata 4320gttgcctgac tccccgtcgt gtagataact acgatacggg agggcttacc atctggcccc 4380agtgctgcaa tgataccgcg agacccacgc tcaccggctc cagatttatc agcaataaac 4440cagccagccg gaagggccga gcgcagaagt ggtcctgcaa ctttatccgc ctccatccag 4500tctattaatt gttgccggga agctagagta agtagttcgc cagttaatag tttgcgcaac 4560gttgttgcca ttgctacagg catcgtggtg tcacgctcgt cgtttggtat ggcttcattc 4620agctccggtt cccaacgatc aaggcgagtt acatgatccc ccatgttgtg caaaaaagcg 4680gttagctcct tcggtcctcc gatcgttgtc agaagtaagt tggccgcagt gttatcactc 4740atggttatgg cagcactgca taattctctt actgtcatgc catccgtaag atgcttttct 4800gtgactggtg agtactcaac caagtcattc tgagaatagt gtatgcggcg accgagttgc 4860tcttgcccgg cgtcaatacg ggataatacc gcgccacata gcagaacttt aaaagtgctc 4920atcattggaa aacgttcttc ggggcgaaaa ctctcaagga tcttaccgct gttgagatcc 4980agttcgatgt aacccactcg tgcacccaac tgatcttcag catcttttac tttcaccagc 5040gtttctgggt gagcaaaaac aggaaggcaa aatgccgcaa aaaagggaat aagggcgaca 5100cggaaatgtt gaatactcat actcttcctt tttcaatatt attgaagcat ttatcagggt 5160tattgtctca tgagcggata catatttgaa tgtatttaga aaaataaaca aataggggtt 5220ccgcgcacat ttccccgaaa agtgccacct gacgtctaag aaaccattat tatcatgaca 5280ttaacctata aaaataggcg tatcacgagg ccctttcgtc tcgcgcgttt cggtgatgac 5340ggtgaaaacc tctgacacat 
gcagctcccg gagacggtca cagcttgtct gtaagcggat 5400gccgggagca gacaagcccg tcagggcgcg tcagcgggtg ttggcgggtg tcggggctgg 5460cttaactatg cggcatcaga gcagattgta ctgagagtgc accatatgcg gtgtgaaata 5520ccgcacagat gcgtaaggag aaaataccgc atcaggaaat tgtaaacgtt aatattttgt 5580taaaattcgc gttaaatttt tgttaaatca gctcattttt taaccaatag gccgaaatcg 5640gcaaaatccc ttataaatca aaagaataga ccgagatagg gttgagtgtt gttccagttt 5700ggaacaagag tccactatta aagaacgtgg actccaacgt caaagggcga aaaaccgtct 5760atcagggcga tggcccacta cgtgaaccat caccctaatc aagttttttg gggtcgaggt 5820gccgtaaagc actaaatcgg aaccctaaag ggagcccccg atttagagct tgacggggaa 5880agccggcgaa cgtggcgaga aaggaaggga agaaagcgaa aggagcgggc gctagggcgc 5940tggcaagtgt agcggtcacg ctgcgcgtaa ccaccacacc cgccgcgctt aatgcgccgc 6000tacagggcgc gtcgcgccat tcgccattca ggctacgcaa ctgttgggaa gggcgatcgg 6060tgcgggcctc ttcgctatta cgccagctgg ctgca 609557025DNAArtificial sequenceAAV Vector Construct 5gggggggggg ggggggggtt ggccactccc tctctgcgcg ctcgctcgct cactgaggcc 60gggcgaccaa aggtcgcccg acgcccgggc tttgcccggg cggcctcagt gagcgagcga 120gcgcgcagag agggagtggc caactccatc actaggggtt cctagatctg aattcggtac 180caagcttgcc tatttcccat gattccttca tatttgcata tacgatacaa ggctgttaga 240gagataattg gaattaattt gactgtaaac acaaagatat tagtacaaaa tacgtgacgt 300agaaagtaat aatttcttgg gtagtttgca gttttaaaat tatgttttaa aatggactat 360catatgctta ccgtaacttg aaagtatttc gatttcttgg ctttatatat cttgtggaaa 420ggacgaaaca ccgggtcttc gagaagacct gttttagtac tctggaaaca gaatctacta 480aaacaaggca aaatgccgtg tttatctcgt caacttgttg gcgagatttt tttgcggccg 540cccgcggtgg agctccagct tttgttccct ttagtgaggg ttaattctag agagacgtac 600aaaaaagagc aagaagctaa aaaagattta aaaattattt ttagcgcagt taatggaaca 660ggaactaaat ttaccccaaa aatattacgt gaatcaggat ataacgttat tgaggttgaa 720gagcatgcat ttgaagatga aacatttaaa aatgttgtaa atccaaatcc agaatttgat 780cctgcatgaa aaataccgct tgaatatggt attaaacatg atgcagatat tattattatg 840aatgacccag atgctgacag atttggaatg gcaataaaac atgatggtca ttttgtaaga 900ttagatggaa atcaaacagg accaatttta attgattgaa aattatcaaa 
tctaaaacgc 960ttaaatagca ttccaaaaaa tccggctcta tattcaagtt ttgtaacaag tgatttgggt 1020gatagaatcg ctcatgaaaa atatggagtt aatattgtaa aaactttaac tggatttaaa 1080tgaatgggta gagaaattgc taaagaagaa gataacggat taaattttgt ttttgcttat 1140gaagaaagtt atggatatgt aattgatgac tcagctagag ataaagatgg aatacaagct 1200tctatattaa tagcagaggc tgcttgattt tataaaaaac aaaataaaac attagtagac 1260tatttagaag atttatttaa agaaatgggt gcatattaca ctttcacttt aaacttgaat 1320tttaaaccag aagaaaagaa attaaaaatt gaaccattaa tgaaatcatt gagagcaaca 1380cccttaactc aaattgctgg acttaaagtt gttaatgttg aagactacat cgatggaatg 1440tataatatgc caggacaaga cttactaaaa ttttatttag aagataagtc atgatttgct 1500gttcgcccaa gtggaactga acctaaacta aaaatttatt ttataggtgt tggtgaatct 1560gttcaaaacg ctaaagttaa agtagacgaa attattaaag aattaaaatt aaaaatgaat 1620atataggaga aaaaatgaaa ctaaacaaat atatagatca cacattatta aaacaagatg 1680ctacgaaagc tgaaattaaa caattatgtg atgaagcaat tgaatttgat tttgcaacag 1740tttgtgttaa ttcatattga acaagctatt gtaaagaatt attaaaaggc acaaatgtag 1800gaataacaaa tgttgtaggt tttcctctag gtgcatgcac aacagctaca aaagcattcg 1860aagtttctga agcaattaaa gatggtgcaa cagaaattga tatggtatta aatattggtg 1920cattaaaaga caaaaattat gaattagttt tagaagacat gaaagctgta aaaaaagcag 1980ctggatcaca tgttgttaaa tgtattatgg aaaattgttt attaacaaaa gaagaaatca 2040tgaaagcttg tgaaatagct gttgaagctg gattagaatt tgttaaaaca tcaacaggat 2100tttcaaaatc aggtgcaaca tttgaagatg ttaaactaat gaagtcagtt gttaaagaca 2160atgctttagt taaagcagct ggtggagtta gaacatttga agatgctcaa aaaatgattg 2220aagcaggagc
tgaccgctta ggaacaagtg gtggagtagc tattattaaa ggtgaagaaa 2280acaacgcgag ttactaaaac tagcgttttt ttattttgct catttttatt aaaagtttgc 2340aaaaaggaac ataaaaattc taattattga tactaaagtt attaaaaaga agattttggt 2400tgattttata aaggtcatag aatataatat tttagcatgt gtattttgtg tgctcattta 2460caaccgtctc gcggccgcgg ggatccagac atgataagat acattgatga gtttggacaa 2520accacaacta gaatgcagtg aaaaaaatgc tttatttgtg aaatttgtga tgctattgct 2580ttatttgtaa ccattataag ctgcaataaa caagttaaca acaacaattg cattcatttt 2640atgtttcagg ttcaggggga ggtgtgggag gttttttagt cgacctcgag cagtgtggtt 2700ttgcaagagg aagcaaaaag cctctccacc caggcctgga atgtttccac ccaagtcgaa 2760ggcagtgtgg ttttgcaaga ggaagcaaaa agcctctcca cccaggcctg gaatgtttcc 2820acccaatgtc gagcaacccc gcccagcgtc ttgtcattgg cgaattcgaa cacgcagatg 2880cagtcggggc ggcgcggtcc caggtccact tcgcatatta aggtgacgcg tgtggcctcg 2940aacaccgagc gaccctgcag ccaatatggg atcggccatt gaacaagatg gattgcacgc 3000aggttctccg gccgcttggg tggagaggct attcggctat gactgggcac aacagacaat 3060cggctgctct gatgccgccg tgttccggct gtcagcgcag gggcgcccgg ttctttttgt 3120caagaccgac ctgtccggtg ccctgaatga actgcaggac gaggcagcgc ggctatcgtg 3180gctggccacg acgggcgttc cttgcgcagc tgtgctcgac gttgtcactg aagcgggaag 3240ggactggctg ctattgggcg aagtgccggg gcaggatctc ctgtcatctc accttgctcc 3300tgccgagaaa gtatccatca tggctgatgc aatgcggcgg ctgcatacgc ttgatccggc 3360tacctgccca ttcgaccacc aagcgaaaca tcgcatcgag cgagcacgta ctcggatgga 3420agccggtctt gtcgatcagg atgatctgga cgaagagcat caggggctcg cgccagccga 3480actgttcgcc aggctcaagg cgcgcatgcc cgacggcgag gatctcgtcg tgacccatgg 3540cgatgcctgc ttgccgaata tcatggtgga aaatggccgc ttttctggat tcatcgactg 3600tggccggctg ggtgtggcgg accgctatca ggacatagcg ttggctaccc gtgatattgc 3660tgaagagctt ggcggcgaat gggctgaccg cttcctcgtg ctttacggta tcgccgctcc 3720cgattcgcag cgcatcgcct tctatcgcct tcttgacgag ttcttctgag gggatccgtc 3780gactagagct cgctgatcag cctcgactgt gccttctagt tgccagccat ctgttgtttg 3840cccctccccc gtgccttcct tgaccctgga aggtgccact cccactgtcc tttcctaata 3900aaatgaggaa attgcatcgc attgtctgag taggtgtcat 
tctattctgg ggggtggggt 3960ggggcaggac agcaaggggg aggattggga agacaatagc aggcatgctg gggagagatc 4020taggaacccc tagtgatgga gttggccact ccctctctgc gcgctcgctc gctcactgag 4080gccgcccggg caaagcccgg gcgtcgggcg acctttggtc gcccggcctc agtgagcgag 4140cgagcgcgca gagagggagt ggccaacccc cccccccccc cccctgcagc ccagctgcat 4200taatgaatcg gccaacgcgc ggggagaggc ggtttgcgta ttgggcgctc ttccgcttcc 4260tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca 4320aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca 4380aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg 4440ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg 4500acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 4560ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 4620tctcaatgct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 4680tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 4740gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 4800agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 4860tacactagaa ggacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa 4920agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt 4980tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct 5040acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatta 5100tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa 5160agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc 5220tcagcgatct gtctatttcg ttcatccata gttgcctgac tccccgtcgt gtagataact 5280acgatacggg agggcttacc atctggcccc agtgctgcaa tgataccgcg agacccacgc 5340tcaccggctc cagatttatc agcaataaac cagccagccg gaagggccga gcgcagaagt 5400ggtcctgcaa ctttatccgc ctccatccag tctattaatt gttgccggga agctagagta 5460agtagttcgc cagttaatag tttgcgcaac gttgttgcca ttgctacagg catcgtggtg 5520tcacgctcgt cgtttggtat ggcttcattc agctccggtt cccaacgatc aaggcgagtt 5580acatgatccc ccatgttgtg caaaaaagcg gttagctcct tcggtcctcc gatcgttgtc 5640agaagtaagt 
tggccgcagt gttatcactc atggttatgg cagcactgca taattctctt 5700actgtcatgc catccgtaag atgcttttct gtgactggtg agtactcaac caagtcattc 5760tgagaatagt gtatgcggcg accgagttgc tcttgcccgg cgtcaatacg ggataatacc 5820gcgccacata gcagaacttt aaaagtgctc atcattggaa aacgttcttc ggggcgaaaa 5880ctctcaagga tcttaccgct gttgagatcc agttcgatgt aacccactcg tgcacccaac 5940tgatcttcag catcttttac tttcaccagc gtttctgggt gagcaaaaac aggaaggcaa 6000aatgccgcaa aaaagggaat aagggcgaca cggaaatgtt gaatactcat actcttcctt 6060tttcaatatt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa 6120tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct 6180gacgtctaag aaaccattat tatcatgaca ttaacctata aaaataggcg tatcacgagg 6240ccctttcgtc tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg 6300gagacggtca cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg 6360tcagcgggtg ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta 6420ctgagagtgc accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc 6480atcaggaaat tgtaaacgtt aatattttgt taaaattcgc gttaaatttt tgttaaatca 6540gctcattttt taaccaatag gccgaaatcg gcaaaatccc ttataaatca aaagaataga 6600ccgagatagg gttgagtgtt gttccagttt ggaacaagag tccactatta aagaacgtgg 6660actccaacgt caaagggcga aaaaccgtct atcagggcga tggcccacta cgtgaaccat 6720caccctaatc aagttttttg gggtcgaggt gccgtaaagc actaaatcgg aaccctaaag 6780ggagcccccg atttagagct tgacggggaa agccggcgaa cgtggcgaga aaggaaggga 6840agaaagcgaa aggagcgggc gctagggcgc tggcaagtgt agcggtcacg ctgcgcgtaa 6900ccaccacacc cgccgcgctt aatgcgccgc tacagggcgc gtcgcgccat tcgccattca 6960ggctacgcaa ctgttgggaa gggcgatcgg tgcgggcctc ttcgctatta cgccagctgg 7020ctgca 7025622DNAArtificial sequenceprotospace sequence for CAG Luciferase gRNAs 6gtcattattg acgtcaatgg gc 22720DNAArtificial sequenceprotospace sequence for CAG Luciferase gRNAs 7gtgctcagca actcggggag 20820DNAArtificial sequenceprotospace sequence for CAG Luciferase gRNAs 8ctcggggagg ggggtgcagg 20922DNAArtificial sequenceprotospace sequence for CAG Luciferase gRNAs 9actttccatt gacgtcaatg gg 
22
SEQ ID NO	Length	Type	Organism	Description	Sequence
10	22	DNA	Artificial sequence	protospace sequence for CAG Luciferase gRNAs	cttcgggggg gacggggcag gg
11	21	DNA	Artificial sequence	protospace sequence for CAG Luciferase gRNAs	cttcgccccg cgcccgctag a
12	19	DNA	Artificial sequence	protospace sequence for CAG Luciferase gRNAs	tcggggaggg gggtgcagg
13	19	DNA	Artificial sequence	protospace sequence for CAG Luciferase gRNAs	tgctcagcaa ctcggggag
14	19	DNA	Artificial sequence	protospace sequence for CAG Luciferase gRNAs	gcggggggtg gcggcaggt
15	19	DNA	Artificial sequence	protospace sequence for Mouse Acvr2b gRNAs	gctcctctgg gacccctga
16	19	DNA	Artificial sequence	protospace sequence for Mouse Acvr2b gRNAs	tgctatggag cccacgcta
17	19	DNA	Artificial sequence	protospace sequence for Mouse Acvr2b gRNAs	ggcgcgctct ccgagctgg
18	19	DNA	Artificial sequence	protospace sequence for Mouse Acvr2b gRNAs	agcgcgcccc gcctagccc
19	19	DNA	Artificial sequence	protospace sequence for Mouse Acvr2b gRNAs	gcctctttgt atccaacat
20	23	DNA	Artificial sequence	protospace sequence for Mouse Acvr2b gRNAs	gcacgctcct ctgggacccc tga
21	19	DNA	Artificial sequence	protospace sequence for Mouse Acvr2b gRNAs	gtgggggagg ggacctgaa
22	19	DNA	Artificial sequence	protospace sequence for Mouse Acvr2b gRNAs	gaggggccat gaacggggg
SEQ ID NO 23 (3600 nt, DNA, Artificial sequence, HA-NLS-dSaCas9-NLS-KRAB):
atgtacccat acgatgttcc agattacgct gccccaaaga agaagcggaa ggtcggtatc 60
cacggagtcc cagcagccaa gcggaactac atcctgggcc tggccatcgg catcaccagc 120
gtgggctacg gcatcatcga ctacgagaca cgggacgtga tcgatgccgg cgtgcggctg 180
ttcaaagagg ccaacgtgga aaacaacgag ggcaggcgga gcaagagagg cgccagaagg 240
ctgaagcggc ggaggcggca tagaatccag agagtgaaga agctgctgtt cgactacaac 300
ctgctgaccg accacagcga gctgagcggc atcaacccct acgaggccag agtgaagggc 360
ctgagccaga agctgagcga ggaagagttc tctgccgccc tgctgcacct ggccaagaga 420
agaggcgtgc acaacgtgaa cgaggtggaa gaggacaccg gcaacgagct gtccaccaaa 480
gagcagatca gccggaacag caaggccctg gaagagaaat acgtggccga actgcagctg 540
gaacggctga agaaagacgg cgaagtgcgg ggcagcatca acagattcaa gaccagcgac 600
tacgtgaaag 
aagccaaaca gctgctgaag gtgcagaagg cctaccacca gctggaccag 660agcttcatcg acacctacat cgacctgctg gaaacccggc ggacctacta tgagggacct 720ggcgagggca gccccttcgg ctggaaggac atcaaagaat ggtacgagat gctgatgggc 780cactgcacct acttccccga ggaactgcgg agcgtgaagt acgcctacaa cgccgacctg 840tacaacgccc tgaacgacct gaacaatctc gtgatcacca gggacgagaa cgagaagctg 900gaatattacg agaagttcca gatcatcgag aacgtgttca agcagaagaa gaagcccacc 960ctgaagcaga tcgccaaaga aatcctcgtg aacgaagagg atattaaggg ctacagagtg 1020accagcaccg gcaagcccga gttcaccaac ctgaaggtgt accacgacat caaggacatt 1080accgcccgga aagagattat tgagaacgcc gagctgctgg atcagattgc caagatcctg 1140accatctacc agagcagcga ggacatccag gaagaactga ccaatctgaa ctccgagctg 1200acccaggaag agatcgagca gatctctaat ctgaagggct ataccggcac ccacaacctg 1260agcctgaagg ccatcaacct gatcctggac gagctgtggc acaccaacga caaccagatc 1320gctatcttca accggctgaa gctggtgccc aagaaggtgg acctgtccca gcagaaagag 1380atccccacca ccctggtgga cgacttcatc ctgagccccg tcgtgaagag aagcttcatc 1440cagagcatca aagtgatcaa cgccatcatc aagaagtacg gcctgcccaa cgacatcatt 1500atcgagctgg cccgcgagaa gaactccaag gacgcccaga aaatgatcaa cgagatgcag 1560aagcggaacc ggcagaccaa cgagcggatc gaggaaatca tccggaccac cggcaaagag 1620aacgccaagt acctgatcga gaagatcaag ctgcacgaca tgcaggaagg caagtgcctg 1680tacagcctgg aagccatccc tctggaagat ctgctgaaca accccttcaa ctatgaggtg 1740gaccacatca tccccagaag cgtgtccttc gacaacagct tcaacaacaa ggtgctcgtg 1800aagcaggaag aagccagcaa gaagggcaac cggaccccat tccagtacct gagcagcagc 1860gacagcaaga tcagctacga aaccttcaag aagcacatcc tgaatctggc caagggcaag 1920ggcagaatca gcaagaccaa gaaagagtat ctgctggaag aacgggacat caacaggttc 1980tccgtgcaga aagacttcat caaccggaac ctggtggata ccagatacgc caccagaggc 2040ctgatgaacc tgctgcggag ctacttcaga gtgaacaacc tggacgtgaa agtgaagtcc 2100atcaatggcg gcttcaccag ctttctgcgg cggaagtgga agtttaagaa agagcggaac 2160aaggggtaca agcaccacgc cgaggacgcc ctgatcattg ccaacgccga tttcatcttc 2220aaagagtgga agaaactgga caaggccaaa aaagtgatgg aaaaccagat gttcgaggaa 2280aagcaggccg agagcatgcc cgagatcgaa accgagcagg agtacaaaga 
gatcttcatc 2340accccccacc agatcaagca cattaaggac ttcaaggact acaagtacag ccaccgggtg 2400gacaagaagc ctaatagaga gctgattaac gacaccctgt actccacccg gaaggacgac 2460aagggcaaca ccctgatcgt gaacaatctg aacggcctgt acgacaagga caatgacaag 2520ctgaaaaagc tgatcaacaa gagccccgaa aagctgctga tgtaccacca cgacccccag 2580acctaccaga aactgaagct gattatggaa cagtacggcg acgagaagaa tcccctgtac 2640aagtactacg aggaaaccgg gaactacctg accaagtact ccaaaaagga caacggcccc 2700gtgatcaaga agattaagta ttacggcaac aaactgaacg cccatctgga catcaccgac 2760gactacccca acagcagaaa caaggtcgtg aagctgtccc tgaagcccta cagattcgac 2820gtgtacctgg acaatggcgt gtacaagttc gtgaccgtga agaatctgga tgtgatcaaa 2880aaagaaaact actacgaagt gaatagcaag tgctatgagg aagctaagaa gctgaagaag 2940atcagcaacc aggccgagtt tatcgcctcc ttctacaaca acgatctgat caagatcaac 3000ggcgagctgt atagagtgat cggcgtgaac aacgacctgc tgaaccggat cgaagtgaac 3060atgatcgaca tcacctaccg cgagtacctg gaaaacatga acgacaagag gccccccagg 3120atcattaaga caatcgcctc caagacccag agcattaaga agtacagcac agacattctg 3180ggcaacctgt atgaagtgaa atctaagaag caccctcaga tcatcaaaaa gggcaaaagg 3240ccggcggcca cgaaaaaggc cggccaggca aaaaagaaaa agggatccga tgctaagtca 3300ctgactgcct ggtcccggac actggtgacc ttcaaggatg tgtttgtgga cttcaccagg 3360gaggagtgga agctgctgga cactgctcag cagatcctgt acagaaatgt gatgctggag 3420aactataaga acctggtttc cttgggttat cagcttacta agccagatgt gatcctccgg 3480ttggagaagg gagaagagcc ctggctggtg gagagagaaa ttcaccaaga gacccatcct 3540gattcagaga ctgcatttga aatcaaatca tcagttccga aaaagaaacg caaagtttaa 3600
SEQ ID NO 24; length 1368; type PRT; organism Streptococcus pyogenes
Met Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val1 5 10 15Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe 20 25 30Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile 35 40 45Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu 50 55 60Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys65 70 75 80Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser 85 90 95Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val 
Glu Glu Asp Lys Lys 100 105 110His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr 115 120 125His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp 130 135 140Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His145 150 155 160Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro 165 170 175Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr 180 185 190Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala 195 200 205Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn 210 215 220Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn225 230 235 240Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe 245 250 255Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp 260 265 270Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp 275 280 285Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp 290 295 300Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser305 310 315 320Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys 325 330 335Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe 340 345 350Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser 355 360 365Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp 370 375 380Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg385 390 395 400Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu 405 410 415Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe 420 425 430Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile 435 440 445Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp 450 455 460Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu465 470 475 480Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr 485 490 495Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser 500 505 510Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys 515 520 525Tyr 
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln 530 535 540Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr545 550 555 560Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp 565 570 575Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly 580 585 590Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp 595 600 605Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr 610 615 620Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala625 630 635 640His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr 645 650 655Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp 660 665 670Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe 675 680 685Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe 690 695 700Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu705 710 715 720His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly 725 730
735Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly 740 745 750Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln 755 760 765Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile 770 775 780Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro785 790 795 800Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu 805 810 815Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg 820 825 830Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys 835 840 845Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg 850 855 860Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys865 870 875 880Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys 885 890 895Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp 900 905 910Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr 915 920 925Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp 930 935 940Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser945 950 955 960Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg 965 970 975Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val 980 985 990Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe 995 1000 1005Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala 1010 1015 1020Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe 1025 1030 1035Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala 1040 1045 1050Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu 1055 1060 1065Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val 1070 1075 1080Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr 1085 1090 1095Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys 1100 1105 1110Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro 1115 1120 1125Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val 1130 1135 1140Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser 
Lys Lys Leu Lys 1145 1150 1155Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser 1160 1165 1170Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys 1175 1180 1185Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu 1190 1195 1200Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly 1205 1210 1215Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val 1220 1225 1230Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser 1235 1240 1245Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys 1250 1255 1260His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys 1265 1270 1275Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala 1280 1285 1290Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn 1295 1300 1305Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala 1310 1315 1320Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser 1325 1330 1335Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr 1340 1345 1350Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp 1355 1360 1365
SEQ ID NO 25; length 1053; type PRT; organism Staphylococcus aureus
Met Lys Arg Asn Tyr Ile Leu Gly Leu Asp Ile Gly Ile Thr Ser Val1 5 10 15Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly 20 25 30Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg 35 40 45Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile 50 55 60Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His65 70 75 80Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu 85 90 95Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu 100 105 110Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr 115 120 125Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala 130 135 140Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys145 150 155 160Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr 165 170 175Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln 180 185 190Leu Asp Gln 
Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg 195 200 205Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys 210 215 220Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe225 230 235 240Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr 245 250 255Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn 260 265 270Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe 275 280 285Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu 290 295 300Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys305 310 315 320Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr 325 330 335Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala 340 345 350Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu 355 360 365Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser 370 375 380Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile385 390 395 400Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala 405 410 415Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln 420 425 430Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro 435 440 445Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile 450 455 460Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg465 470 475 480Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys 485 490 495Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr 500 505 510Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp 515 520 525Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu 530 535 540Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro545 550 555 560Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys 565 570 575Gln Glu Glu Asn Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu 580 585 590Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile 595 600 605Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser 
Lys Thr Lys Lys Glu 610 615 620Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp625 630 635 640Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu 645 650 655Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys 660 665 670Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp 675 680 685Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp 690 695 700Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys705 710 715 720Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys 725 730 735Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu 740 745 750Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp 755 760 765Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Glu Leu Ile 770 775 780Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu785 790 795 800Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu 805 810 815Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His 820 825 830Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly 835 840 845Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr 850 855 860Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile865 870 875 880Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp 885 890 895Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr 900 905 910Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val 915 920 925Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser 930 935 940Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala945 950 955 960Glu Phe Ile Ala Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile Asn Gly 965 970 975Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile 980 985 990Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met 995 1000 1005Asn Asp Lys Arg Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys 1010 1015 1020Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu 1025 1030 1035Tyr 
Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly 1040 1045 1050
SEQ ID NO 26; length 1107; type PRT; organism Eubacterium ventriosum
Met Gly Tyr Thr Val Gly Leu Asp Ile Gly Val Ala Ser Val Gly Val1 5 10 15Ala Val Leu Asp Glu Asn Asp Asn Ile Val Glu Ala Val Ser Asn Ile 20 25 30Phe Asp Glu Ala Asp Thr Ser Asn Asn Lys Val Arg Arg Thr Leu Arg 35 40 45Glu Gly Arg Arg Thr Lys Arg Arg Gln Lys Thr Arg Ile Glu Asp Phe 50 55 60Lys Gln Leu Trp Glu Thr Ser Gly Tyr Ile Ile Pro His Lys Leu His65 70 75 80Leu Asn Ile Ile Glu Leu Arg Asn Lys Gly Leu Thr Glu Leu Leu Ser 85 90 95Leu Asp Glu Leu Tyr Cys Val Leu Leu Ser Met Leu Lys His Arg Gly 100 105 110Ile Ser Tyr Leu Glu Asp Ala Asp Asp Gly Glu Lys Gly Asn Ala Tyr 115 120 125Lys Lys Gly Leu Ala Phe Asn Glu Lys Gln Leu Lys Glu Lys Met Pro 130 135 140Cys Glu Ile Gln Leu Glu Arg Met Lys Lys Tyr Gly Lys Tyr His Gly145 150 155 160Glu Phe Ile Ile Glu Ile Asn Asp Glu Lys Glu Tyr Gln Ser Asn Val 165 170 175Phe Thr Thr Lys Ala Tyr Lys Lys Glu Leu Glu Lys Ile Phe Glu Thr 180 185 190Gln Arg Cys Asn Gly Asn Lys Ile Asn Thr Lys Phe Ile Lys Lys Tyr 195 200 205Met Glu Ile Tyr Glu Arg Lys Arg Glu Tyr Tyr Ile Gly Pro Gly Asn 210 215 220Glu Lys Ser Arg Thr Asp Tyr Gly Ile Tyr Thr Thr Arg Thr Asp Glu225 230 235 240Glu Gly Asn Phe Ile Asp Glu Lys Asn Ile Phe Gly Lys Leu Ile Gly 245 250 255Lys Cys Ser Val Tyr Pro Glu Glu Tyr Arg Ala Ser Ser Ala Ser Tyr 260 265 270Thr Ala Gln Glu Phe Asn Leu Leu Asn Asp Leu Asn Asn Leu Lys Ile 275 280 285Asn Asn Glu Lys Leu Thr Glu Phe Gln Lys Lys Glu Ile Val Glu Ile 290 295 300Ile Lys Asp Ala Ser Ser Val Asn Met Arg Lys Ile Ile Lys Lys Val305 310 315 320Ile Asp Glu Asp Ile Glu Gln Tyr Ser Gly Ala Arg Ile Asp Lys Lys 325 330 335Gly Lys Glu Ile Tyr His Thr Phe Glu Ile Tyr Arg Lys Leu Lys Lys 340 345 350Glu Leu Lys Thr Ile Asn Val Asp Ile Asp Ser Phe Thr Arg Glu Glu 355 360 365Leu Asp Lys Thr Met Asp Ile Leu Thr Leu Asn Thr Glu Arg Glu Ser 370 375 380Ile Val Lys Ala Phe Asp Glu Gln Lys Phe Val Tyr Glu Glu Asn Leu385 390 395 400Ile Lys Lys Leu Ile Glu 
Phe Arg Lys Asn Asn Gln Arg Leu Phe Ser 405 410 415Gly Trp His Ser Phe Ser Tyr Lys Ala Met Leu Gln Leu Ile Pro Val 420 425 430Met Tyr Lys Glu Pro Lys Glu Gln Met Gln Leu Leu Thr Glu Met Asn 435 440 445Val Phe Lys Ser Lys Lys Glu Lys Tyr Val Asn Tyr Lys Tyr Ile Pro 450 455 460Glu Asn Glu Val Val Lys Glu Ile Tyr Asn Pro Val Val Val Lys Ser465 470 475 480Ile Arg Thr Thr Val Lys Ile Leu Asn Ala Leu Ile Lys Lys Tyr Gly 485 490 495Tyr Pro Glu Ser Val Val Ile Glu Met Pro Arg Asp Lys Asn Ser Asp 500 505 510Asp Glu Lys Glu Lys Ile Asp Met Asn Gln Lys Lys Asn Gln Glu Glu 515 520 525Tyr Glu Lys Ile Leu Asn Lys Ile Tyr Asp Glu Lys Gly Ile Glu Ile 530 535 540Thr Asn Lys Asp Tyr Lys Lys Gln Lys Lys Leu Val Leu Lys Leu Lys545 550 555 560Leu Trp Asn Glu Gln Glu Gly Leu Cys Leu Tyr Ser Gly Lys Lys Ile 565 570 575Ala Ile Glu Asp Leu Leu Asn His Pro Glu Phe Phe Glu Ile Asp His 580 585 590Ile Ile Pro Lys Ser Ile Ser Leu Asp Asp Ser Arg Ser Asn Lys Val 595 600 605Leu Val Tyr Lys Thr Glu Asn Ser Ile Lys Glu Asn Asp Thr Pro Tyr 610 615 620His Tyr Leu Thr Arg Ile Asn Gly Lys Trp Gly Phe Asp Glu Tyr Lys625 630 635 640Ala Asn Val Leu Glu Leu Arg Arg Arg Gly Lys Ile Asp Asp Lys Lys 645 650 655Val Asn Asn Leu Leu Cys Met Glu Asp Ile Thr Lys Ile Asp Val Val 660 665 670Lys Gly Phe Ile Asn Arg Asn Leu Asn Asp Thr Arg Tyr Ala Ser Arg 675 680 685Val Val Leu Asn Glu Met Gln Ser Phe Phe Glu Ser Arg Lys Tyr Cys 690 695 700Asn Thr Lys Val Lys Val Ile Arg Gly Ser Leu Thr Tyr Gln Met Arg705 710 715 720Gln Asp Leu His Leu Lys Lys Asn Arg Glu Glu Ser Tyr Ser His His 725 730 735Ala Val Asp Ala Met Leu Ile Ala Phe Ser Gln Lys Gly Tyr Glu Ala 740 745 750Tyr Arg Lys Ile Gln Lys Asp Cys Tyr Asp Phe Glu Thr Gly Glu Ile 755 760 765Leu Asp Lys Glu
Lys Trp Asn Lys Tyr Ile Asp Asp Asp Glu Phe Asp 770 775 780Asp Ile Leu Tyr Lys Glu Arg Met Asn Glu Ile Arg Lys Lys Ile Ile785 790 795 800Glu Ala Glu Glu Lys Val Lys Tyr Asn Tyr Lys Ile Asp Lys Lys Cys 805 810 815Asn Arg Gly Leu Cys Asn Gln Thr Ile Tyr Gly Thr Arg Glu Lys Asp 820 825 830Gly Lys Ile His Lys Ile Ser Ser Tyr Asn Ile Tyr Asp Asp Lys Glu 835 840 845Cys Asn Ser Leu Lys Lys Met Ile Asn Ser Gly Lys Gly Ser Asp Leu 850 855 860Leu Met Tyr Asn Asn Asp Pro Lys Thr Tyr Arg Asp Met Leu Lys Ile865 870 875 880Leu Glu Thr Tyr Ser Ser Glu Lys Asn Pro Phe Val Ala Tyr Asn Lys 885 890 895Glu Thr Gly Asp Tyr Phe Arg Lys Tyr Ser Lys Asn His Asn Gly Pro 900 905 910Lys Val Glu Lys Val Lys Tyr Tyr Ser Gly Gln Ile Asn Ser Cys Ile 915 920 925Asp Ile Ser His Lys Tyr Gly His Ala Lys Asn Ser Lys Lys Val Val 930 935 940Leu Val Ser Leu Asn Pro Tyr Arg Thr Asp Val Tyr Tyr Asp Asn Asp945 950 955 960Thr Gly Lys Tyr Tyr Leu Val Gly Val Lys Tyr Asn His Ile Lys Cys 965 970 975Val Gly Asn Lys Tyr Val Ile Asp Ser Glu Thr Tyr Asn Glu Leu Leu 980 985 990Arg Lys Glu Gly Val Leu Asn Ser Asp Glu Asn Leu Glu Asp Leu Asn 995 1000 1005Ser Lys Asn Ile Thr Tyr Lys Phe Ser Leu Tyr Lys Asn Asp Ile 1010 1015 1020Ile Gln Tyr Glu Lys Gly Gly Glu Tyr Tyr Thr Glu Arg Phe Leu 1025 1030 1035Ser Arg Ile Lys Glu Gln Lys Asn Leu Ile Glu Thr Lys Pro Ile 1040 1045 1050Asn Lys Pro Asn Phe Gln Arg Lys Asn Lys Lys Gly Glu Trp Glu 1055 1060 1065Asn Thr Arg Asn Gln Ile Ala Leu Ala Lys Thr Lys Tyr Val Gly 1070 1075 1080Lys Leu Val Thr Asp Val Leu Gly Asn Cys Tyr Ile Val Asn Met 1085 1090 1095Glu Lys Phe Ser Leu Val Val Asp Lys 1100 1105
SEQ ID NO 27; length 1168; type PRT; organism Azospirillum
Met Ala Arg Pro Ala Phe Arg Ala Pro Arg Arg Glu His Val Asn Gly1 5 10 15Trp Thr Pro Asp Pro His Arg Ile Ser Lys Pro Phe Phe Ile Leu Val 20 25 30Ser Trp His Leu Leu Ser Arg Val Val Ile Asp Ser Ser Ser Gly Cys 35 40 45Phe Pro Gly Thr Ser Arg Asp His Thr Asp Lys Phe Ala Glu Trp Glu 50 55 60Cys Ala Val Gln Pro Tyr Arg Leu Ser Phe Asp Leu Gly Thr Asn Ser65 70 75 80Ile 
Gly Trp Gly Leu Leu Asn Leu Asp Arg Gln Gly Lys Pro Arg Glu 85 90 95Ile Arg Ala Leu Gly Ser Arg Ile Phe Ser Asp Gly Arg Asp Pro Gln 100 105 110Asp Lys Ala Ser Leu Ala Val Ala Arg Arg Leu Ala Arg Gln Met Arg 115 120 125Arg Arg Arg Asp Arg Tyr Leu Thr Arg Arg Thr Arg Leu Met Gly Ala 130 135 140Leu Val Arg Phe Gly Leu Met Pro Ala Asp Pro Ala Ala Arg Lys Arg145 150 155 160Leu Glu Val Ala Val Asp Pro Tyr Leu Ala Arg Glu Arg Ala Thr Arg 165 170 175Glu Arg Leu Glu Pro Phe Glu Ile Gly Arg Ala Leu Phe His Leu Asn 180 185 190Gln Arg Arg Gly Tyr Lys Pro Val Arg Thr Ala Thr Lys Pro Asp Glu 195 200 205Glu Ala Gly Lys Val Lys Glu Ala Val Glu Arg Leu Glu Ala Ala Ile 210 215 220Ala Ala Ala Gly Ala Pro Thr Leu Gly Ala Trp Phe Ala Trp Arg Lys225 230 235 240Thr Arg Gly Glu Thr Leu Arg Ala Arg Leu Ala Gly Lys Gly Lys Glu 245 250 255Ala Ala Tyr Pro Phe Tyr Pro Ala Arg Arg Met Leu Glu Ala Glu Phe 260 265 270Asp Thr Leu Trp Ala Glu Gln Ala Arg His His Pro Asp Leu Leu Thr 275 280 285Ala Glu Ala Arg Glu Ile Leu Arg His Arg Ile Phe His Gln Arg Pro 290 295 300Leu Lys Pro Pro Pro Val Gly Arg Cys Thr Leu Tyr Pro Asp Asp Gly305 310 315 320Arg Ala Pro Arg Ala Leu Pro Ser Ala Gln Arg Leu Arg Leu Phe Gln 325 330 335Glu Leu Ala Ser Leu Arg Val Ile His Leu Asp Leu Ser Glu Arg Pro 340 345 350Leu Thr Pro Ala Glu Arg Asp Arg Ile Val Ala Phe Val Gln Gly Arg 355 360 365Pro Pro Lys Ala Gly Arg Lys Pro Gly Lys Val Gln Lys Ser Val Pro 370 375 380Phe Glu Lys Leu Arg Gly Leu Leu Glu Leu Pro Pro Gly Thr Gly Phe385 390 395 400Ser Leu Glu Ser Asp Lys Arg Pro Glu Leu Leu Gly Asp Glu Thr Gly 405 410 415Ala Arg Ile Ala Pro Ala Phe Gly Pro Gly Trp Thr Ala Leu Pro Leu 420 425 430Glu Glu Gln Asp Ala Leu Val Glu Leu Leu Leu Thr Glu Ala Glu Pro 435 440 445Glu Arg Ala Ile Ala Ala Leu Thr Ala Arg Trp Ala Leu Asp Glu Ala 450 455 460Thr Ala Ala Lys Leu Ala Gly Ala Thr Leu Pro Asp Phe His Gly Arg465 470 475 480Tyr Gly Arg Arg Ala Val Ala Glu Leu Leu Pro Val Leu Glu Arg Glu 485 490 495Thr Arg Gly Asp Pro Asp Gly Arg Val Arg 
Pro Ile Arg Leu Asp Glu 500 505 510Ala Val Lys Leu Leu Arg Gly Gly Lys Asp His Ser Asp Phe Ser Arg 515 520 525Glu Gly Ala Leu Leu Asp Ala Leu Pro Tyr Tyr Gly Ala Val Leu Glu 530 535 540Arg His Val Ala Phe Gly Thr Gly Asn Pro Ala Asp Pro Glu Glu Lys545 550 555 560Arg Val Gly Arg Val Ala Asn Pro Thr Val His Ile Ala Leu Asn Gln 565 570 575Leu Arg His Leu Val Asn Ala Ile Leu Ala Arg His Gly Arg Pro Glu 580 585 590Glu Ile Val Ile Glu Leu Ala Arg Asp Leu Lys Arg Ser Ala Glu Asp 595 600 605Arg Arg Arg Glu Asp Lys Arg Gln Ala Asp Asn Gln Lys Arg Asn Glu 610 615 620Glu Arg Lys Arg Leu Ile Leu Ser Leu Gly Glu Arg Pro Thr Pro Arg625 630 635 640Asn Leu Leu Lys Leu Arg Leu Trp Glu Glu Gln Gly Pro Val Glu Asn 645 650 655Arg Arg Cys Pro Tyr Ser Gly Glu Thr Ile Ser Met Arg Met Leu Leu 660 665 670Ser Glu Gln Val Asp Ile Asp His Ile Leu Pro Phe Ser Val Ser Leu 675 680 685Asp Asp Ser Ala Ala Asn Lys Val Val Cys Leu Arg Glu Ala Asn Arg 690 695 700Ile Lys Arg Asn Arg Ser Pro Trp Glu Ala Phe Gly His Asp Ser Glu705 710 715 720Arg Trp Ala Gly Ile Leu Ala Arg Ala Glu Ala Leu Pro Lys Asn Lys 725 730 735Arg Trp Arg Phe Ala Pro Asp Ala Leu Glu Lys Leu Glu Gly Glu Gly 740 745 750Gly Leu Arg Ala Arg His Leu Asn Asp Thr Arg His Leu Ser Arg Leu 755 760 765Ala Val Glu Tyr Leu Arg Cys Val Cys Pro Lys Val Arg Val Ser Pro 770 775 780Gly Arg Leu Thr Ala Leu Leu Arg Arg Arg Trp Gly Ile Asp Ala Ile785 790 795 800Leu Ala Glu Ala Asp Gly Pro Pro Pro Glu Val Pro Ala Glu Thr Leu 805 810 815Asp Pro Ser Pro Ala Glu Lys Asn Arg Ala Asp His Arg His His Ala 820 825 830Leu Asp Ala Val Val Ile Gly Cys Ile Asp Arg Ser Met Val Gln Arg 835 840 845Val Gln Leu Ala Ala Ala Ser Ala Glu Arg Glu Ala Ala Ala Arg Glu 850 855 860Asp Asn Ile Arg Arg Val Leu Glu Gly Phe Lys Glu Glu Pro Trp Asp865 870 875 880Gly Phe Arg Ala Glu Leu Glu Arg Arg Ala Arg Thr Ile Val Val Ser 885 890 895His Arg Pro Glu His Gly Ile Gly Gly Ala Leu His Lys Glu Thr Ala 900 905 910Tyr Gly Pro Val Asp Pro Pro Glu Glu Gly Phe Asn Leu Val Val Arg 915 920 
925Lys Pro Ile Asp Gly Leu Ser Lys Asp Glu Ile Asn Ser Val Arg Asp 930 935 940Pro Arg Leu Arg Arg Ala Leu Ile Asp Arg Leu Ala Ile Arg Arg Arg945 950 955 960Asp Ala Asn Asp Pro Ala Thr Ala Leu Ala Lys Ala Ala Glu Asp Leu 965 970 975Ala Ala Gln Pro Ala Ser Arg Gly Ile Arg Arg Val Arg Val Leu Lys 980 985 990Lys Glu Ser Asn Pro Ile Arg Val Glu His Gly Gly Asn Pro Ser Gly 995 1000 1005Pro Arg Ser Gly Gly Pro Phe His Lys Leu Leu Leu Ala Gly Glu 1010 1015 1020Val His His Val Asp Val Ala Leu Arg Ala Asp Gly Arg Arg Trp 1025 1030 1035Val Gly His Trp Val Thr Leu Phe Glu Ala His Gly Gly Arg Gly 1040 1045 1050Ala Asp Gly Ala Ala Ala Pro Pro Arg Leu Gly Asp Gly Glu Arg 1055 1060 1065Phe Leu Met Arg Leu His Lys Gly Asp Cys Leu Lys Leu Glu His 1070 1075 1080Lys Gly Arg Val Arg Val Met Gln Val Val Lys Leu Glu Pro Ser 1085 1090 1095Ser Asn Ser Val Val Val Val Glu Pro His Gln Val Lys Thr Asp 1100 1105 1110Arg Ser Lys His Val Lys Ile Ser Cys Asp Gln Leu Arg Ala Arg 1115 1120 1125Gly Ala Arg Arg Val Thr Val Asp Pro Leu Gly Arg Val Arg Val 1130 1135 1140His Ala Pro Gly Ala Arg Val Gly Ile Gly Gly Asp Ala Gly Arg 1145 1150 1155Thr Ala Met Glu Pro Ala Glu Asp Ile Ser 1160 1165
SEQ ID NO 28; length 1050; type PRT; organism Gluconacetobacter diazotrophicus
Met Gly Glu Asn Met Ile Asp Glu Ser Leu Thr Phe Gly Ile Asp Leu1 5 10 15Gly Ile Gly Ser Cys Gly Trp Ala Val Leu Arg Arg Pro Ser Ala Phe 20 25 30Gly Arg Lys Gly Val Ile Glu Gly Met Gly Ser Trp Cys Phe Asp Val 35 40 45Pro Glu Thr Ser Lys Glu Arg Thr Pro Thr Asn Gln Ile Arg Arg Ser 50 55 60Asn Arg Leu Leu Arg Arg Val Ile Arg Arg Arg Arg Asn Arg Met Ala65 70 75 80Ala Ile Arg Arg Leu Leu His Ala Ala Gly Leu Leu Pro Ser Thr Asp 85 90 95Ser Asp Ala Leu Lys Arg Pro Gly His Asp Pro Trp Glu Leu Arg Ala 100 105 110Arg Gly Leu Asp Lys Pro Leu Lys Pro Val Glu Phe Ala Val Val Leu 115 120 125Gly His Ile Ala Lys Arg Arg Gly Phe Lys Ser Ala Ala Lys Arg Lys 130 135 140Ala Thr Asn Ile Ser Ser Asp Asp Lys Lys Met Leu Thr Ala Leu Glu145 150 155 160Ala Thr Arg Glu Arg Leu Gly Arg Tyr Arg Thr 
Val Gly Glu Met Phe 165 170 175Ala Arg Asp Pro Asp Phe Ala Ser Arg Arg Arg Asn Arg Glu Gly Lys 180 185 190Tyr Asp Arg Thr Thr Ala Arg Asp Asp Leu Glu His Glu Val His Ala 195 200 205Leu Phe Ala Ala Gln Arg Arg Leu Gly Gln Gly Phe Ala Ser Pro Glu 210 215 220Leu Glu Glu Ala Phe Thr Ala Ser Ala Phe His Gln Arg Pro Met Gln225 230 235 240Asp Ser Glu Arg Leu Val Gly Phe Cys Pro Phe Glu Arg Thr Glu Lys 245 250 255Arg Ala Ala Lys Leu Thr Pro Ser Phe Glu Arg Phe Arg Leu Leu Ala 260 265 270Arg Leu Leu Asn Leu Arg Ile Thr Thr Pro Asp Gly Glu Arg Pro Leu 275 280 285Thr Val Asp Glu Ile Ala Leu Val Thr Arg Asp Leu Gly Lys Thr Ala 290 295 300Lys Leu Ser Ile Lys Arg Val Arg Thr Leu Ile Gly Leu Glu Asp Asn305 310 315 320Gln Arg Phe Thr Thr Ile Arg Pro Glu Asp Glu Asp Arg Asp Ile Val 325 330 335Ala Arg Thr Gly Gly Ala Met Thr Gly Thr Ala Thr Leu Arg Lys Ala 340 345 350Leu Gly Glu Ala Leu Trp Thr Asp Met Gln Glu Arg Pro Glu Gln Leu 355 360 365Asp Ala Ile Val Gln Val Leu Ser Phe Phe Glu Ala Asn Glu Thr Ile 370 375 380Thr Glu Lys Leu Arg Glu Ile Gly Leu Thr Leu Ala Val Leu Asp Val385 390 395 400Leu Leu Thr Ala Leu Asp Ala Gly Val Phe Ala Lys Phe Lys Gly Ala 405 410 415Ala His Ile Ser Thr Lys Ala Ala Arg Asn Leu Leu Pro His Leu Glu 420 425 430Gln Gly Arg Arg Tyr Asp Glu Ala Cys Thr Met Ala Gly Tyr Asp His 435 440 445Ala Ala Ser Arg Leu Ser His His Gly Gln Ile Val Ala Lys Thr Gln 450 455 460Phe Asn Ala Leu Val Thr Glu Ile Gly Glu Ser Ile Ala Asn Pro Ile465 470 475 480Ala Arg Lys Ala Leu Ile Glu Gly Leu Lys Gln Ile Trp Ala Met Arg 485 490 495Asn His Trp Gly Leu Pro Gly Ser Ile His Val Glu Leu Ala Arg Asp 500 505 510Val Gly Asn Ser Ile Glu Lys Arg Arg Glu Ile Glu Lys His Ile Glu 515 520 525Lys Asn Thr Ala Leu Arg Ala Arg Glu Arg Arg Glu Val His Asp Leu 530 535 540Leu Asp Leu Glu Asp Val Asn Gly Asp Thr Leu Leu Arg Tyr Arg Leu545 550 555 560Trp Lys Glu Gln Gly Gly Lys Cys Leu Tyr Thr Gly Lys Ala Ile His 565 570 575Ile Arg Gln Ile Ala Ala Thr Asp Asn Ser Val Gln Val Asp His Ile 580 585 590Leu 
Pro Trp Ser Arg Phe Gly Asp Asp Ser Phe Asn Asn Lys Thr Leu 595 600 605Cys Leu Ala Ser Ala Asn Gln Gln Lys Lys Arg Ser Thr Pro Tyr Glu 610 615 620Trp Leu Ser Gly Gln Thr Gly Asp Ala Trp Asn Ala Phe Val Gln Arg625 630 635 640Ile Glu Thr Asn Lys Glu Leu Arg Gly Phe Lys Lys Arg Asn Tyr Leu 645 650 655Leu Lys Asn Ala Lys Glu Ala Glu Glu Lys Phe Arg Ser Arg Asn Leu 660 665 670Asn Asp Thr Arg Tyr Ala Ala Arg Leu Phe Ala Glu Ala Val Lys Leu 675 680 685Leu Tyr Ala Phe Gly Glu Arg Gln Glu Lys Gly Gly Asn Arg Arg Val 690 695 700Phe Thr Arg Pro Gly Ala Leu Thr Ala Ala Leu Arg Gln Ala Trp Gly705 710 715 720Val Glu Ser Leu Lys Lys Gln Asp Gly Lys Arg Ile Asn Asp Asp Arg 725 730 735His His Ala Leu Asp Ala Leu Thr Val Ala Ala Val Asp Glu Ala Glu 740 745 750Ile Gln Arg Leu Thr Lys Ser Phe His Glu Trp Glu Gln Gln Gly Leu 755 760 765Gly Arg Pro Leu Arg Arg Val Glu Pro Pro Trp Glu Ser Phe Arg Ala 770 775 780Asp Val Glu Ala Thr Tyr Pro Glu Val Phe Val Ala Arg Pro Glu Arg785 790 795 800Arg Arg Ala Arg Gly Glu Gly His Ala Ala Thr Ile Arg Gln Val Lys 805 810 815Glu Arg Glu Cys Thr Pro Ile Val Phe Glu Arg Lys Ala Val Ser Ser 820 825 830Leu Lys Glu Ala Asp Leu Glu Arg Ile Lys Asp Gly Glu Arg Asn Glu 835 840 845Ala Ile Val Glu Ala Ile Arg Ser Trp Ile Ala Thr Gly Arg Pro Ala 850 855 860Asp Ala Pro Pro Arg Ser Pro Arg Gly Asp Ile Ile Thr Lys Ile Arg865 870 875 880Leu Ala Thr Thr Ile Lys Ala Ala Val Pro Val Arg Gly Gly Thr Ala 885 890 895Gly Arg Gly Glu Met Val Arg Ala Asp Val Phe Ser Lys Pro Asn Arg 900 905 910Arg Gly Lys Asp Glu Trp Tyr Leu Val Pro Val Tyr Pro His Gln Ile 915 920 925Met Asn Arg Lys Ala Trp Pro Lys Pro Pro Met Arg Ser Ile Val Ala 930 935 940Asn Lys Asp Glu Asp Glu Trp Thr Glu Val Gly Pro Glu His Gln Phe945 950 955
960Arg Phe Ser Leu Tyr Pro Arg Ser Asn Ile Glu Ile Ile Arg Pro Ser 965 970 975Gly Glu Val Ile Glu Gly Tyr Phe Val Gly Leu His Arg Asn Thr Gly 980 985 990Ala Leu Thr Ile Ser Ala His Asn Asp Pro Lys Ser Ile His Ser Gly 995 1000 1005Ile Gly Thr Lys Thr Leu Leu Ala Ile Ser Lys Tyr Gln Val Asp 1010 1015 1020Arg Phe Gly Arg Lys Ser Pro Val Arg Lys Glu Val Arg Thr Trp 1025 1030 1035His Gly Glu Ala Cys Ile Ser Pro Thr Pro Pro Gly 1040 1045 1050
SEQ ID NO 29; length 1082; type PRT; organism Neisseria cinerea
Met Ala Ala Phe Lys Pro Asn Pro Met Asn Tyr Ile Leu Gly Leu Asp1 5 10 15Ile Gly Ile Ala Ser Val Gly Trp Ala Ile Val Glu Ile Asp Glu Glu 20 25 30Glu Asn Pro Ile Arg Leu Ile Asp Leu Gly Val Arg Val Phe Glu Arg 35 40 45Ala Glu Val Pro Lys Thr Gly Asp Ser Leu Ala Ala Ala Arg Arg Leu 50 55 60Ala Arg Ser Val Arg Arg Leu Thr Arg Arg Arg Ala His Arg Leu Leu65 70 75 80Arg Ala Arg Arg Leu Leu Lys Arg Glu Gly Val Leu Gln Ala Ala Asp 85 90 95Phe Asp Glu Asn Gly Leu Ile Lys Ser Leu Pro Asn Thr Pro Trp Gln 100 105 110Leu Arg Ala Ala Ala Leu Asp Arg Lys Leu Thr Pro Leu Glu Trp Ser 115 120 125Ala Val Leu Leu His Leu Ile Lys His Arg Gly Tyr Leu Ser Gln Arg 130 135 140Lys Asn Glu Gly Glu Thr Ala Asp Lys Glu Leu Gly Ala Leu Leu Lys145 150 155 160Gly Val Ala Asp Asn Thr His Ala Leu Gln Thr Gly Asp Phe Arg Thr 165 170 175Pro Ala Glu Leu Ala Leu Asn Lys Phe Glu Lys Glu Ser Gly His Ile 180 185 190Arg Asn Gln Arg Gly Asp Tyr Ser His Thr Phe Asn Arg Lys Asp Leu 195 200 205Gln Ala Glu Leu Asn Leu Leu Phe Glu Lys Gln Lys Glu Phe Gly Asn 210 215 220Pro His Val Ser Asp Gly Leu Lys Glu Gly Ile Glu Thr Leu Leu Met225 230 235 240Thr Gln Arg Pro Ala Leu Ser Gly Asp Ala Val Gln Lys Met Leu Gly 245 250 255His Cys Thr Phe Glu Pro Thr Glu Pro Lys Ala Ala Lys Asn Thr Tyr 260 265 270Thr Ala Glu Arg Phe Val Trp Leu Thr Lys Leu Asn Asn Leu Arg Ile 275 280 285Leu Glu Gln Gly Ser Glu Arg Pro Leu Thr Asp Thr Glu Arg Ala Thr 290 295 300Leu Met Asp Glu Pro Tyr Arg Lys Ser Lys Leu Thr Tyr Ala Gln Ala305 310 315 320Arg Lys Leu Leu Asp Leu Asp Asp Thr 
Ala Phe Phe Lys Gly Leu Arg 325 330 335Tyr Gly Lys Asp Asn Ala Glu Ala Ser Thr Leu Met Glu Met Lys Ala 340 345 350Tyr His Ala Ile Ser Arg Ala Leu Glu Lys Glu Gly Leu Lys Asp Lys 355 360 365Lys Ser Pro Leu Asn Leu Ser Pro Glu Leu Gln Asp Glu Ile Gly Thr 370 375 380Ala Phe Ser Leu Phe Lys Thr Asp Glu Asp Ile Thr Gly Arg Leu Lys385 390 395 400Asp Arg Val Gln Pro Glu Ile Leu Glu Ala Leu Leu Lys His Ile Ser 405 410 415Phe Asp Lys Phe Val Gln Ile Ser Leu Lys Ala Leu Arg Arg Ile Val 420 425 430Pro Leu Met Glu Gln Gly Asn Arg Tyr Asp Glu Ala Cys Thr Glu Ile 435 440 445Tyr Gly Asp His Tyr Gly Lys Lys Asn Thr Glu Glu Lys Ile Tyr Leu 450 455 460Pro Pro Ile Pro Ala Asp Glu Ile Arg Asn Pro Val Val Leu Arg Ala465 470 475 480Leu Ser Gln Ala Arg Lys Val Ile Asn Gly Val Val Arg Arg Tyr Gly 485 490 495Ser Pro Ala Arg Ile His Ile Glu Thr Ala Arg Glu Val Gly Lys Ser 500 505 510Phe Lys Asp Arg Lys Glu Ile Glu Lys Arg Gln Glu Glu Asn Arg Lys 515 520 525Asp Arg Glu Lys Ser Ala Ala Lys Phe Arg Glu Tyr Phe Pro Asn Phe 530 535 540Val Gly Glu Pro Lys Ser Lys Asp Ile Leu Lys Leu Arg Leu Tyr Glu545 550 555 560Gln Gln His Gly Lys Cys Leu Tyr Ser Gly Lys Glu Ile Asn Leu Gly 565 570 575Arg Leu Asn Glu Lys Gly Tyr Val Glu Ile Asp His Ala Leu Pro Phe 580 585 590Ser Arg Thr Trp Asp Asp Ser Phe Asn Asn Lys Val Leu Ala Leu Gly 595 600 605Ser Glu Asn Gln Asn Lys Gly Asn Gln Thr Pro Tyr Glu Tyr Phe Asn 610 615 620Gly Lys Asp Asn Ser Arg Glu Trp Gln Glu Phe Lys Ala Arg Val Glu625 630 635 640Thr Ser Arg Phe Pro Arg Ser Lys Lys Gln Arg Ile Leu Leu Gln Lys 645 650 655Phe Asp Glu Asp Gly Phe Lys Glu Arg Asn Leu Asn Asp Thr Arg Tyr 660 665 670Ile Asn Arg Phe Leu Cys Gln Phe Val Ala Asp His Met Leu Leu Thr 675 680 685Gly Lys Gly Lys Arg Arg Val Phe Ala Ser Asn Gly Gln Ile Thr Asn 690 695 700Leu Leu Arg Gly Phe Trp Gly Leu Arg Lys Val Arg Ala Glu Asn Asp705 710 715 720Arg His His Ala Leu Asp Ala Val Val Val Ala Cys Ser Thr Ile Ala 725 730 735Met Gln Gln Lys Ile Thr Arg Phe Val Arg Tyr Lys Glu Met Asn Ala 740 
745 750Phe Asp Gly Lys Thr Ile Asp Lys Glu Thr Gly Glu Val Leu His Gln 755 760 765Lys Ala His Phe Pro Gln Pro Trp Glu Phe Phe Ala Gln Glu Val Met 770 775 780Ile Arg Val Phe Gly Lys Pro Asp Gly Lys Pro Glu Phe Glu Glu Ala785 790 795 800Asp Thr Pro Glu Lys Leu Arg Thr Leu Leu Ala Glu Lys Leu Ser Ser 805 810 815Arg Pro Glu Ala Val His Lys Tyr Val Thr Pro Leu Phe Ile Ser Arg 820 825 830Ala Pro Asn Arg Lys Met Ser Gly Gln Gly His Met Glu Thr Val Lys 835 840 845Ser Ala Lys Arg Leu Asp Glu Gly Ile Ser Val Leu Arg Val Pro Leu 850 855 860Thr Gln Leu Lys Leu Lys Asp Leu Glu Lys Met Val Asn Arg Glu Arg865 870 875 880Glu Pro Lys Leu Tyr Glu Ala Leu Lys Ala Arg Leu Glu Ala His Lys 885 890 895Asp Asp Pro Ala Lys Ala Phe Ala Glu Pro Phe Tyr Lys Tyr Asp Lys 900 905 910Ala Gly Asn Arg Thr Gln Gln Val Lys Ala Val Arg Val Glu Gln Val 915 920 925Gln Lys Thr Gly Val Trp Val His Asn His Asn Gly Ile Ala Asp Asn 930 935 940Ala Thr Ile Val Arg Val Asp Val Phe Glu Lys Gly Gly Lys Tyr Tyr945 950 955 960Leu Val Pro Ile Tyr Ser Trp Gln Val Ala Lys Gly Ile Leu Pro Asp 965 970 975Arg Ala Val Val Gln Gly Lys Asp Glu Glu Asp Trp Thr Val Met Asp 980 985 990Asp Ser Phe Glu Phe Lys Phe Val Leu Tyr Ala Asn Asp Leu Ile Lys 995 1000 1005Leu Thr Ala Lys Lys Asn Glu Phe Leu Gly Tyr Phe Val Ser Leu 1010 1015 1020Asn Arg Ala Thr Gly Ala Ile Asp Ile Arg Thr His Asp Thr Asp 1025 1030 1035Ser Thr Lys Gly Lys Asn Gly Ile Phe Gln Ser Val Gly Val Lys 1040 1045 1050Thr Ala Leu Ser Phe Gln Lys Tyr Gln Ile Asp Glu Leu Gly Lys 1055 1060 1065Glu Ile Arg Pro Cys Arg Leu Lys Lys Arg Pro Pro Val Arg 1070 1075 1080301140PRTRoseburia intestinalis 30Met Arg Glu Asn Gly Ser Asp Glu Arg Arg Arg Asn Met Asp Glu Lys1 5 10 15Met Asp Tyr Arg Ile Gly Leu Asp Ile Gly Ile Ala Ser Val Gly Trp 20 25 30Ala Val Leu Gln Asn Asn Ser Asp Asp Glu Pro Val Arg Ile Val Asp 35 40 45Leu Gly Val Arg Ile Phe Asp Thr Ala Glu Ile Pro Lys Thr Gly Glu 50 55 60Ser Leu Ala Gly Pro Arg Arg Ala Ala Arg Thr Thr Arg Arg Arg Leu65 70 75 80Arg Arg Arg Lys His 
Arg Leu Asp Arg Ile Lys Trp Leu Phe Glu Asn 85 90 95Gln Gly Leu Ile Asn Ile Asp Asp Phe Leu Lys Arg Tyr Asn Met Ala 100 105 110Gly Leu Pro Asp Val Tyr Gln Leu Arg Tyr Glu Ala Leu Asp Arg Lys 115 120 125Leu Thr Asp Glu Glu Leu Ala Gln Val Leu Leu His Ile Ala Lys His 130 135 140Arg Gly Phe Arg Ser Thr Arg Lys Ala Glu Thr Ala Ala Lys Glu Asn145 150 155 160Gly Ala Val Leu Lys Ala Thr Asp Glu Asn Gln Lys Arg Met Gln Glu 165 170 175Lys Gly Tyr Arg Thr Val Gly Glu Met Ile Tyr Leu Asp Glu Ala Phe 180 185 190Arg Thr Gly Cys Ser Trp Ser Glu Lys Gly Tyr Ile Leu Thr Pro Arg 195 200 205Asn Lys Ala Glu Asn Tyr Gln His Thr Met Leu Arg Ala Met Leu Val 210 215 220Glu Glu Val Lys Glu Ile Phe Ser Ser Gln Arg Arg Leu Gly Asn Glu225 230 235 240Lys Ala Thr Glu Glu Leu Glu Glu Lys Tyr Leu Glu Ile Met Thr Ser 245 250 255Gln Arg Ser Phe Asp Leu Gly Pro Gly Met Gln Pro Asp Gly Lys Pro 260 265 270Ser Pro Tyr Ala Met Glu Gly Phe Ser Asp Arg Val Gly Lys Cys Thr 275 280 285Phe Leu Gly Asp Gln Gly Glu Leu Arg Gly Ala Lys Gly Thr Tyr Thr 290 295 300Ala Glu Tyr Phe Val Ala Leu Gln Lys Ile Asn His Thr Lys Leu Val305 310 315 320Asn Gln Asp Gly Glu Thr Arg Asn Phe Thr Glu Glu Glu Arg Arg Ala 325 330 335Leu Thr Leu Leu Leu Phe Thr Gln Lys Glu Val Lys Tyr Ala Ala Val 340 345 350Arg Lys Lys Leu Gly Leu Pro Glu Asp Ile Leu Phe Tyr Asn Leu Asn 355 360 365Tyr Lys Lys Ala Ala Thr Lys Glu Glu Gln Gln Lys Glu Asn Gln Asn 370 375 380Thr Glu Lys Ala Lys Phe Ile Gly Met Pro Tyr Tyr His Asp Tyr Lys385 390 395 400Lys Cys Leu Glu Glu Arg Val Lys Tyr Leu Thr Glu Asn Glu Val Arg 405 410 415Asp Leu Phe Asp Glu Ile Gly Met Ile Leu Thr Cys Tyr Lys Asn Asp 420 425 430Asp Ser Arg Thr Glu Arg Leu Ala Lys Leu Gly Leu Val Pro Ile Glu 435 440 445Met Glu Gly Leu Leu Ala Tyr Thr Pro Thr Lys Phe Gln His Leu Ser 450 455 460Met Lys Ala Met Arg Asn Ile Ile Pro Phe Leu Glu Lys Gly Met Thr465 470 475 480Tyr Asp Lys Ala Cys Glu Glu Ala Gly Tyr Asp Phe Lys Ala Asp Ser 485 490 495Lys Gly Thr Lys Gln Lys Leu Leu Thr Gly Glu Asn Val Asn 
Gln Thr 500 505 510Ile Asn Glu Ile Thr Asn Pro Val Val Lys Arg Ser Val Ser Gln Thr 515 520 525Val Lys Val Ile Asn Ala Ile Ile Arg Thr Tyr Gly Ser Pro Gln Ala 530 535 540Ile Asn Ile Glu Leu Ala Arg Glu Met Ser Lys Thr Phe Glu Glu Arg545 550 555 560Arg Lys Ile Lys Gly Asp Met Glu Lys Arg Gln Lys Asn Asn Glu Asp 565 570 575Val Lys Lys Gln Ile Gln Glu Leu Gly Lys Leu Ser Pro Thr Gly Gln 580 585 590Asp Ile Leu Lys Tyr Arg Leu Trp Gln Glu Gln Gln Gly Ile Cys Met 595 600 605Tyr Ser Gly Lys Thr Ile Pro Leu Glu Glu Leu Phe Lys Pro Gly Tyr 610 615 620Asp Ile Asp His Ile Leu Pro Tyr Ser Ile Thr Phe Asp Asp Ser Phe625 630 635 640Arg Asn Lys Val Leu Val Thr Ser Gln Glu Asn Arg Gln Lys Gly Asn 645 650 655Arg Thr Pro Tyr Glu Tyr Met Gly Asn Asp Glu Gln Arg Trp Asn Glu 660 665 670Phe Glu Thr Arg Val Lys Thr Thr Ile Arg Asp Tyr Lys Lys Gln Gln 675 680 685Lys Leu Leu Lys Lys His Phe Ser Glu Glu Glu Arg Ser Glu Phe Lys 690 695 700Glu Arg Asn Leu Thr Asp Thr Lys Tyr Ile Thr Thr Val Ile Tyr Asn705 710 715 720Met Ile Arg Gln Asn Leu Glu Met Ala Pro Leu Asn Arg Pro Glu Lys 725 730 735Lys Lys Gln Val Arg Ala Val Asn Gly Ala Ile Thr Ala Tyr Leu Arg 740 745 750Lys Arg Trp Gly Leu Pro Gln Lys Asn Arg Glu Thr Asp Thr His His 755 760 765Ala Met Asp Ala Val Val Ile Ala Cys Cys Thr Asp Gly Met Ile Gln 770 775 780Lys Ile Ser Arg Tyr Thr Lys Val Arg Glu Arg Cys Tyr Ser Lys Gly785 790 795 800Thr Glu Phe Val Asp Ala Glu Thr Gly Glu Ile Phe Arg Pro Glu Asp 805 810 815Tyr Ser Arg Ala Glu Trp Asp Glu Ile Phe Gly Val His Ile Pro Lys 820 825 830Pro Trp Glu Thr Phe Arg Ala Glu Leu Asp Val Arg Met Gly Asp Asp 835 840 845Pro Lys Gly Phe Leu Asp Thr His Ser Asp Val Ala Leu Glu Leu Asp 850 855 860Tyr Pro Glu Tyr Ile Tyr Glu Asn Leu Arg Pro Ile Phe Val Ser Arg865 870 875 880Met Pro Asn His Lys Val Thr Gly Ala Ala His Ala Asp Thr Ile Arg 885 890 895Ser Pro Arg His Phe Lys Asp Glu Gly Ile Val Leu Thr Lys Thr Ala 900 905 910Leu Thr Asp Leu Lys Leu Asp Lys Asp Gly Glu Ile Asp Gly Tyr Tyr 915 920 925Asn Pro Gln Ser 
Asp Leu Leu Leu Tyr Glu Ala Leu Lys Lys Gln Leu 930 935 940Leu Leu Tyr Gly Asn Asp Ala Lys Lys Ala Phe Ala Gln Asp Phe His945 950 955 960Lys Pro Lys Ala Asp Gly Thr Glu Gly Pro Val Val Arg Lys Val Lys 965 970 975Ile Gln Lys Lys Gln Thr Met Gly Val Phe Val Asp Ser Gly Asn Gly 980 985 990Ile Ala Glu Asn Gly Gly Met Val Arg Ile Asp Val Phe Arg Val Asn 995 1000 1005Gly Lys Tyr Tyr Phe Val Pro Val Tyr Thr Ala Asp Val Val Lys 1010 1015 1020Lys Val Leu Pro Asn Arg Ala Ser Thr Ala His Lys Pro Tyr Gly 1025 1030 1035Glu Trp Lys Val Met Glu Asp Lys Asp Phe Leu Phe Ser Leu Tyr 1040 1045 1050Ser Arg Asp Leu Ile His Ile Lys Ser Lys Lys Asp Ile Pro Ile 1055 1060 1065Lys Met Val Asn Gly Gly Met Glu Gly Ile Lys Glu Thr Tyr Ala 1070 1075 1080Tyr Tyr Ile Gly Ala Asp Ile Ser Ala Ala Asn Ile Gln Gly Ile 1085 1090 1095Ala His Asp Ser Arg Tyr Lys Phe Arg Gly Leu Gly Ile Gln Ser 1100 1105 1110Leu Asp Val Leu Glu Lys Cys Gln Ile Asp Val Leu Gly His Val 1115 1120 1125Ser Val Val Arg Ser Glu Lys Arg Met Gly Phe Ser 1130 1135 1140311037PRTParvibaculum lavamentivorans 31Met Glu Arg Ile Phe Gly Phe Asp Ile Gly Thr Thr Ser Ile Gly Phe1 5 10 15Ser Val Ile Asp Tyr Ser Ser Thr Gln Ser Ala Gly Asn Ile Gln Arg 20 25 30Leu Gly Val Arg Ile Phe Pro Glu Ala Arg Asp Pro Asp Gly Thr Pro 35 40 45Leu Asn Gln Gln Arg Arg Gln Lys Arg Met Met Arg Arg Gln Leu Arg 50 55 60Arg Arg Arg Ile Arg Arg Lys Ala Leu Asn Glu Thr Leu His Glu Ala65 70 75 80Gly Phe Leu Pro Ala Tyr Gly Ser Ala Asp Trp Pro Val Val Met Ala 85 90 95Asp Glu Pro Tyr Glu Leu Arg Arg Arg Gly Leu Glu Glu Gly Leu Ser 100 105 110Ala Tyr Glu Phe Gly Arg Ala Ile Tyr His Leu Ala Gln His Arg His 115 120 125Phe Lys Gly Arg Glu Leu Glu Glu Ser Asp Thr Pro Asp Pro Asp Val
130 135 140Asp Asp Glu Lys Glu Ala Ala Asn Glu Arg Ala Ala Thr Leu Lys Ala145 150 155 160Leu Lys Asn Glu Gln Thr Thr Leu Gly Ala Trp Leu Ala Arg Arg Pro 165 170 175Pro Ser Asp Arg Lys Arg Gly Ile His Ala His Arg Asn Val Val Ala 180 185 190Glu Glu Phe Glu Arg Leu Trp Glu Val Gln Ser Lys Phe His Pro Ala 195 200 205Leu Lys Ser Glu Glu Met Arg Ala Arg Ile Ser Asp Thr Ile Phe Ala 210 215 220Gln Arg Pro Val Phe Trp Arg Lys Asn Thr Leu Gly Glu Cys Arg Phe225 230 235 240Met Pro Gly Glu Pro Leu Cys Pro Lys Gly Ser Trp Leu Ser Gln Gln 245 250 255Arg Arg Met Leu Glu Lys Leu Asn Asn Leu Ala Ile Ala Gly Gly Asn 260 265 270Ala Arg Pro Leu Asp Ala Glu Glu Arg Asp Ala Ile Leu Ser Lys Leu 275 280 285Gln Gln Gln Ala Ser Met Ser Trp Pro Gly Val Arg Ser Ala Leu Lys 290 295 300Ala Leu Tyr Lys Gln Arg Gly Glu Pro Gly Ala Glu Lys Ser Leu Lys305 310 315 320Phe Asn Leu Glu Leu Gly Gly Glu Ser Lys Leu Leu Gly Asn Ala Leu 325 330 335Glu Ala Lys Leu Ala Asp Met Phe Gly Pro Asp Trp Pro Ala His Pro 340 345 350Arg Lys Gln Glu Ile Arg His Ala Val His Glu Arg Leu Trp Ala Ala 355 360 365Asp Tyr Gly Glu Thr Pro Asp Lys Lys Arg Val Ile Ile Leu Ser Glu 370 375 380Lys Asp Arg Lys Ala His Arg Glu Ala Ala Ala Asn Ser Phe Val Ala385 390 395 400Asp Phe Gly Ile Thr Gly Glu Gln Ala Ala Gln Leu Gln Ala Leu Lys 405 410 415Leu Pro Thr Gly Trp Glu Pro Tyr Ser Ile Pro Ala Leu Asn Leu Phe 420 425 430Leu Ala Glu Leu Glu Lys Gly Glu Arg Phe Gly Ala Leu Val Asn Gly 435 440 445Pro Asp Trp Glu Gly Trp Arg Arg Thr Asn Phe Pro His Arg Asn Gln 450 455 460Pro Thr Gly Glu Ile Leu Asp Lys Leu Pro Ser Pro Ala Ser Lys Glu465 470 475 480Glu Arg Glu Arg Ile Ser Gln Leu Arg Asn Pro Thr Val Val Arg Thr 485 490 495Gln Asn Glu Leu Arg Lys Val Val Asn Asn Leu Ile Gly Leu Tyr Gly 500 505 510Lys Pro Asp Arg Ile Arg Ile Glu Val Gly Arg Asp Val Gly Lys Ser 515 520 525Lys Arg Glu Arg Glu Glu Ile Gln Ser Gly Ile Arg Arg Asn Glu Lys 530 535 540Gln Arg Lys Lys Ala Thr Glu Asp Leu Ile Lys Asn Gly Ile Ala Asn545 550 555 560Pro Ser Arg Asp Asp 
Val Glu Lys Trp Ile Leu Trp Lys Glu Gly Gln 565 570 575Glu Arg Cys Pro Tyr Thr Gly Asp Gln Ile Gly Phe Asn Ala Leu Phe 580 585 590Arg Glu Gly Arg Tyr Glu Val Glu His Ile Trp Pro Arg Ser Arg Ser 595 600 605Phe Asp Asn Ser Pro Arg Asn Lys Thr Leu Cys Arg Lys Asp Val Asn 610 615 620Ile Glu Lys Gly Asn Arg Met Pro Phe Glu Ala Phe Gly His Asp Glu625 630 635 640Asp Arg Trp Ser Ala Ile Gln Ile Arg Leu Gln Gly Met Val Ser Ala 645 650 655Lys Gly Gly Thr Gly Met Ser Pro Gly Lys Val Lys Arg Phe Leu Ala 660 665 670Lys Thr Met Pro Glu Asp Phe Ala Ala Arg Gln Leu Asn Asp Thr Arg 675 680 685Tyr Ala Ala Lys Gln Ile Leu Ala Gln Leu Lys Arg Leu Trp Pro Asp 690 695 700Met Gly Pro Glu Ala Pro Val Lys Val Glu Ala Val Thr Gly Gln Val705 710 715 720Thr Ala Gln Leu Arg Lys Leu Trp Thr Leu Asn Asn Ile Leu Ala Asp 725 730 735Asp Gly Glu Lys Thr Arg Ala Asp His Arg His His Ala Ile Asp Ala 740 745 750Leu Thr Val Ala Cys Thr His Pro Gly Met Thr Asn Lys Leu Ser Arg 755 760 765Tyr Trp Gln Leu Arg Asp Asp Pro Arg Ala Glu Lys Pro Ala Leu Thr 770 775 780Pro Pro Trp Asp Thr Ile Arg Ala Asp Ala Glu Lys Ala Val Ser Glu785 790 795 800Ile Val Val Ser His Arg Val Arg Lys Lys Val Ser Gly Pro Leu His 805 810 815Lys Glu Thr Thr Tyr Gly Asp Thr Gly Thr Asp Ile Lys Thr Lys Ser 820 825 830Gly Thr Tyr Arg Gln Phe Val Thr Arg Lys Lys Ile Glu Ser Leu Ser 835 840 845Lys Gly Glu Leu Asp Glu Ile Arg Asp Pro Arg Ile Lys Glu Ile Val 850 855 860Ala Ala His Val Ala Gly Arg Gly Gly Asp Pro Lys Lys Ala Phe Pro865 870 875 880Pro Tyr Pro Cys Val Ser Pro Gly Gly Pro Glu Ile Arg Lys Val Arg 885 890 895Leu Thr Ser Lys Gln Gln Leu Asn Leu Met Ala Gln Thr Gly Asn Gly 900 905 910Tyr Ala Asp Leu Gly Ser Asn His His Ile Ala Ile Tyr Arg Leu Pro 915 920 925Asp Gly Lys Ala Asp Phe Glu Ile Val Ser Leu Phe Asp Ala Ser Arg 930 935 940Arg Leu Ala Gln Arg Asn Pro Ile Val Gln Arg Thr Arg Ala Asp Gly945 950 955 960Ala Ser Phe Val Met Ser Leu Ala Ala Gly Glu Ala Ile Met Ile Pro 965 970 975Glu Gly Ser Lys Lys Gly Ile Trp Ile Val Gln Gly Val 
Trp Ala Ser 980 985 990Gly Gln Val Val Leu Glu Arg Asp Thr Asp Ala Asp His Ser Thr Thr 995 1000 1005Thr Arg Pro Met Pro Asn Pro Ile Leu Lys Asp Asp Ala Lys Lys 1010 1015 1020Val Ser Ile Asp Pro Ile Gly Arg Val Arg Pro Ser Asn Asp1025 1030 1035321132PRTNitratifractor salsuginis 32Met Lys Lys Ile Leu Gly Val Asp Leu Gly Ile Thr Ser Phe Gly Tyr1 5 10 15Ala Ile Leu Gln Glu Thr Gly Lys Asp Leu Tyr Arg Cys Leu Asp Asn 20 25 30Ser Val Val Met Arg Asn Asn Pro Tyr Asp Glu Lys Ser Gly Glu Ser 35 40 45Ser Gln Ser Ile Arg Ser Thr Gln Lys Ser Met Arg Arg Leu Ile Glu 50 55 60Lys Arg Lys Lys Arg Ile Arg Cys Val Ala Gln Thr Met Glu Arg Tyr65 70 75 80Gly Ile Leu Asp Tyr Ser Glu Thr Met Lys Ile Asn Asp Pro Lys Asn 85 90 95Asn Pro Ile Lys Asn Arg Trp Gln Leu Arg Ala Val Asp Ala Trp Lys 100 105 110Arg Pro Leu Ser Pro Gln Glu Leu Phe Ala Ile Phe Ala His Met Ala 115 120 125Lys His Arg Gly Tyr Lys Ser Ile Ala Thr Glu Asp Leu Ile Tyr Glu 130 135 140Leu Glu Leu Glu Leu Gly Leu Asn Asp Pro Glu Lys Glu Ser Glu Lys145 150 155 160Lys Ala Asp Glu Arg Arg Gln Val Tyr Asn Ala Leu Arg His Leu Glu 165 170 175Glu Leu Arg Lys Lys Tyr Gly Gly Glu Thr Ile Ala Gln Thr Ile His 180 185 190Arg Ala Val Glu Ala Gly Asp Leu Arg Ser Tyr Arg Asn His Asp Asp 195 200 205Tyr Glu Lys Met Ile Arg Arg Glu Asp Ile Glu Glu Glu Ile Glu Lys 210 215 220Val Leu Leu Arg Gln Ala Glu Leu Gly Ala Leu Gly Leu Pro Glu Glu225 230 235 240Gln Val Ser Glu Leu Ile Asp Glu Leu Lys Ala Cys Ile Thr Asp Gln 245 250 255Glu Met Pro Thr Ile Asp Glu Ser Leu Phe Gly Lys Cys Thr Phe Tyr 260 265 270Lys Asp Glu Leu Ala Ala Pro Ala Tyr Ser Tyr Leu Tyr Asp Leu Tyr 275 280 285Arg Leu Tyr Lys Lys Leu Ala Asp Leu Asn Ile Asp Gly Tyr Glu Val 290 295 300Thr Gln Glu Asp Arg Glu Lys Val Ile Glu Trp Val Glu Lys Lys Ile305 310 315 320Ala Gln Gly Lys Asn Leu Lys Lys Ile Thr His Lys Asp Leu Arg Lys 325 330 335Ile Leu Gly Leu Ala Pro Glu Gln Lys Ile Phe Gly Val Glu Asp Glu 340 345 350Arg Ile Val Lys Gly Lys Lys Glu Pro Arg Thr Phe Val Pro Phe Phe 355 360 
365Phe Leu Ala Asp Ile Ala Lys Phe Lys Glu Leu Phe Ala Ser Ile Gln 370 375 380Lys His Pro Asp Ala Leu Gln Ile Phe Arg Glu Leu Ala Glu Ile Leu385 390 395 400Gln Arg Ser Lys Thr Pro Gln Glu Ala Leu Asp Arg Leu Arg Ala Leu 405 410 415Met Ala Gly Lys Gly Ile Asp Thr Asp Asp Arg Glu Leu Leu Glu Leu 420 425 430Phe Lys Asn Lys Arg Ser Gly Thr Arg Glu Leu Ser His Arg Tyr Ile 435 440 445Leu Glu Ala Leu Pro Leu Phe Leu Glu Gly Tyr Asp Glu Lys Glu Val 450 455 460Gln Arg Ile Leu Gly Phe Asp Asp Arg Glu Asp Tyr Ser Arg Tyr Pro465 470 475 480Lys Ser Leu Arg His Leu His Leu Arg Glu Gly Asn Leu Phe Glu Lys 485 490 495Glu Glu Asn Pro Ile Asn Asn His Ala Val Lys Ser Leu Ala Ser Trp 500 505 510Ala Leu Gly Leu Ile Ala Asp Leu Ser Trp Arg Tyr Gly Pro Phe Asp 515 520 525Glu Ile Ile Leu Glu Thr Thr Arg Asp Ala Leu Pro Glu Lys Ile Arg 530 535 540Lys Glu Ile Asp Lys Ala Met Arg Glu Arg Glu Lys Ala Leu Asp Lys545 550 555 560Ile Ile Gly Lys Tyr Lys Lys Glu Phe Pro Ser Ile Asp Lys Arg Leu 565 570 575Ala Arg Lys Ile Gln Leu Trp Glu Arg Gln Lys Gly Leu Asp Leu Tyr 580 585 590Ser Gly Lys Val Ile Asn Leu Ser Gln Leu Leu Asp Gly Ser Ala Asp 595 600 605Ile Glu His Ile Val Pro Gln Ser Leu Gly Gly Leu Ser Thr Asp Tyr 610 615 620Asn Thr Ile Val Thr Leu Lys Ser Val Asn Ala Ala Lys Gly Asn Arg625 630 635 640Leu Pro Gly Asp Trp Leu Ala Gly Asn Pro Asp Tyr Arg Glu Arg Ile 645 650 655Gly Met Leu Ser Glu Lys Gly Leu Ile Asp Trp Lys Lys Arg Lys Asn 660 665 670Leu Leu Ala Gln Ser Leu Asp Glu Ile Tyr Thr Glu Asn Thr His Ser 675 680 685Lys Gly Ile Arg Ala Thr Ser Tyr Leu Glu Ala Leu Val Ala Gln Val 690 695 700Leu Lys Arg Tyr Tyr Pro Phe Pro Asp Pro Glu Leu Arg Lys Asn Gly705 710 715 720Ile Gly Val Arg Met Ile Pro Gly Lys Val Thr Ser Lys Thr Arg Ser 725 730 735Leu Leu Gly Ile Lys Ser Lys Ser Arg Glu Thr Asn Phe His His Ala 740 745 750Glu Asp Ala Leu Ile Leu Ser Thr Leu Thr Arg Gly Trp Gln Asn Arg 755 760 765Leu His Arg Met Leu Arg Asp Asn Tyr Gly Lys Ser Glu Ala Glu Leu 770 775 780Lys Glu Leu Trp Lys Lys Tyr Met 
Pro His Ile Glu Gly Leu Thr Leu785 790 795 800Ala Asp Tyr Ile Asp Glu Ala Phe Arg Arg Phe Met Ser Lys Gly Glu 805 810 815Glu Ser Leu Phe Tyr Arg Asp Met Phe Asp Thr Ile Arg Ser Ile Ser 820 825 830Tyr Trp Val Asp Lys Lys Pro Leu Ser Ala Ser Ser His Lys Glu Thr 835 840 845Val Tyr Ser Ser Arg His Glu Val Pro Thr Leu Arg Lys Asn Ile Leu 850 855 860Glu Ala Phe Asp Ser Leu Asn Val Ile Lys Asp Arg His Lys Leu Thr865 870 875 880Thr Glu Glu Phe Met Lys Arg Tyr Asp Lys Glu Ile Arg Gln Lys Leu 885 890 895Trp Leu His Arg Ile Gly Asn Thr Asn Asp Glu Ser Tyr Arg Ala Val 900 905 910Glu Glu Arg Ala Thr Gln Ile Ala Gln Ile Leu Thr Arg Tyr Gln Leu 915 920 925Met Asp Ala Gln Asn Asp Lys Glu Ile Asp Glu Lys Phe Gln Gln Ala 930 935 940Leu Lys Glu Leu Ile Thr Ser Pro Ile Glu Val Thr Gly Lys Leu Leu945 950 955 960Arg Lys Met Arg Phe Val Tyr Asp Lys Leu Asn Ala Met Gln Ile Asp 965 970 975Arg Gly Leu Val Glu Thr Asp Lys Asn Met Leu Gly Ile His Ile Ser 980 985 990Lys Gly Pro Asn Glu Lys Leu Ile Phe Arg Arg Met Asp Val Asn Asn 995 1000 1005Ala His Glu Leu Gln Lys Glu Arg Ser Gly Ile Leu Cys Tyr Leu 1010 1015 1020Asn Glu Met Leu Phe Ile Phe Asn Lys Lys Gly Leu Ile His Tyr 1025 1030 1035Gly Cys Leu Arg Ser Tyr Leu Glu Lys Gly Gln Gly Ser Lys Tyr 1040 1045 1050Ile Ala Leu Phe Asn Pro Arg Phe Pro Ala Asn Pro Lys Ala Gln 1055 1060 1065Pro Ser Lys Phe Thr Ser Asp Ser Lys Ile Lys Gln Val Gly Ile 1070 1075 1080Gly Ser Ala Thr Gly Ile Ile Lys Ala His Leu Asp Leu Asp Gly 1085 1090 1095His Val Arg Ser Tyr Glu Val Phe Gly Thr Leu Pro Glu Gly Ser 1100 1105 1110Ile Glu Trp Phe Lys Glu Glu Ser Gly Tyr Gly Arg Val Glu Asp 1115 1120 1125Asp Pro His His 1130331003PRTCampylobacter lari 33Met Arg Ile Leu Gly Phe Asp Ile Gly Ile Asn Ser Ile Gly Trp Ala1 5 10 15Phe Val Glu Asn Asp Glu Leu Lys Asp Cys Gly Val Arg Ile Phe Thr 20 25 30Lys Ala Glu Asn Pro Lys Asn Lys Glu Ser Leu Ala Leu Pro Arg Arg 35 40 45Asn Ala Arg Ser Ser Arg Arg Arg Leu Lys Arg Arg Lys Ala Arg Leu 50 55 60Ile Ala Ile Lys Arg Ile Leu Ala Lys Glu 
Leu Lys Leu Asn Tyr Lys65 70 75 80Asp Tyr Val Ala Ala Asp Gly Glu Leu Pro Lys Ala Tyr Glu Gly Ser 85 90 95Leu Ala Ser Val Tyr Glu Leu Arg Tyr Lys Ala Leu Thr Gln Asn Leu 100 105 110Glu Thr Lys Asp Leu Ala Arg Val Ile Leu His Ile Ala Lys His Arg 115 120 125Gly Tyr Met Asn Lys Asn Glu Lys Lys Ser Asn Asp Ala Lys Lys Gly 130 135 140Lys Ile Leu Ser Ala Leu Lys Asn Asn Ala Leu Lys Leu Glu Asn Tyr145 150 155 160Gln Ser Val Gly Glu Tyr Phe Tyr Lys Glu Phe Phe Gln Lys Tyr Lys 165 170 175Lys Asn Thr Lys Asn Phe Ile Lys Ile Arg Asn Thr Lys Asp Asn Tyr 180 185 190Asn Asn Cys Val Leu Ser Ser Asp Leu Glu Lys Glu Leu Lys Leu Ile 195 200 205Leu Glu Lys Gln Lys Glu Phe Gly Tyr Asn Tyr Ser Glu Asp Phe Ile 210 215 220Asn Glu Ile Leu Lys Val Ala Phe Phe Gln Arg Pro Leu Lys Asp Phe225 230 235 240Ser His Leu Val Gly Ala Cys Thr Phe Phe Glu Glu Glu Lys Arg Ala 245 250 255Cys Lys Asn Ser Tyr Ser Ala Trp Glu Phe Val Ala Leu Thr Lys Ile 260 265 270Ile Asn Glu Ile Lys Ser Leu Glu Lys Ile Ser Gly Glu Ile Val Pro 275 280 285Thr Gln Thr Ile Asn Glu Val Leu Asn Leu Ile Leu Asp Lys Gly Ser 290 295 300Ile Thr Tyr Lys Lys Phe Arg Ser Cys Ile Asn Leu His Glu Ser Ile305 310 315 320Ser Phe Lys Ser Leu Lys Tyr Asp Lys Glu Asn Ala Glu Asn Ala Lys 325 330 335Leu Ile Asp Phe Arg Lys Leu Val Glu Phe Lys Lys Ala Leu Gly Val 340 345 350His Ser Leu Ser Arg Gln Glu Leu Asp Gln Ile Ser Thr His Ile Thr 355 360 365Leu Ile Lys Asp Asn Val Lys Leu Lys Thr Val Leu Glu Lys Tyr Asn 370 375 380Leu Ser Asn Glu Gln Ile Asn Asn Leu Leu Glu Ile Glu Phe Asn Asp385 390 395 400Tyr Ile Asn Leu Ser Phe Lys Ala Leu Gly Met Ile Leu Pro Leu Met 405 410 415Arg Glu Gly Lys Arg Tyr Asp Glu Ala Cys Glu Ile Ala Asn Leu Lys 420
425 430Pro Lys Thr Val Asp Glu Lys Lys Asp Phe Leu Pro Ala Phe Cys Asp 435 440 445Ser Ile Phe Ala His Glu Leu Ser Asn Pro Val Val Asn Arg Ala Ile 450 455 460Ser Glu Tyr Arg Lys Val Leu Asn Ala Leu Leu Lys Lys Tyr Gly Lys465 470 475 480Val His Lys Ile His Leu Glu Leu Ala Arg Asp Val Gly Leu Ser Lys 485 490 495Lys Ala Arg Glu Lys Ile Glu Lys Glu Gln Lys Glu Asn Gln Ala Val 500 505 510Asn Ala Trp Ala Leu Lys Glu Cys Glu Asn Ile Gly Leu Lys Ala Ser 515 520 525Ala Lys Asn Ile Leu Lys Leu Lys Leu Trp Lys Glu Gln Lys Glu Ile 530 535 540Cys Ile Tyr Ser Gly Asn Lys Ile Ser Ile Glu His Leu Lys Asp Glu545 550 555 560Lys Ala Leu Glu Val Asp His Ile Tyr Pro Tyr Ser Arg Ser Phe Asp 565 570 575Asp Ser Phe Ile Asn Lys Val Leu Val Phe Thr Lys Glu Asn Gln Glu 580 585 590Lys Leu Asn Lys Thr Pro Phe Glu Ala Phe Gly Lys Asn Ile Glu Lys 595 600 605Trp Ser Lys Ile Gln Thr Leu Ala Gln Asn Leu Pro Tyr Lys Lys Lys 610 615 620Asn Lys Ile Leu Asp Glu Asn Phe Lys Asp Lys Gln Gln Glu Asp Phe625 630 635 640Ile Ser Arg Asn Leu Asn Asp Thr Arg Tyr Ile Ala Thr Leu Ile Ala 645 650 655Lys Tyr Thr Lys Glu Tyr Leu Asn Phe Leu Leu Leu Ser Glu Asn Glu 660 665 670Asn Ala Asn Leu Lys Ser Gly Glu Lys Gly Ser Lys Ile His Val Gln 675 680 685Thr Ile Ser Gly Met Leu Thr Ser Val Leu Arg His Thr Trp Gly Phe 690 695 700Asp Lys Lys Asp Arg Asn Asn His Leu His His Ala Leu Asp Ala Ile705 710 715 720Ile Val Ala Tyr Ser Thr Asn Ser Ile Ile Lys Ala Phe Ser Asp Phe 725 730 735Arg Lys Asn Gln Glu Leu Leu Lys Ala Arg Phe Tyr Ala Lys Glu Leu 740 745 750Thr Ser Asp Asn Tyr Lys His Gln Val Lys Phe Phe Glu Pro Phe Lys 755 760 765Ser Phe Arg Glu Lys Ile Leu Ser Lys Ile Asp Glu Ile Phe Val Ser 770 775 780Lys Pro Pro Arg Lys Arg Ala Arg Arg Ala Leu His Lys Asp Thr Phe785 790 795 800His Ser Glu Asn Lys Ile Ile Asp Lys Cys Ser Tyr Asn Ser Lys Glu 805 810 815Gly Leu Gln Ile Ala Leu Ser Cys Gly Arg Val Arg Lys Ile Gly Thr 820 825 830Lys Tyr Val Glu Asn Asp Thr Ile Val Arg Val Asp Ile Phe Lys Lys 835 840 845Gln Asn Lys Phe Tyr Ala Ile 
Pro Ile Tyr Ala Met Asp Phe Ala Leu 850 855 860Gly Ile Leu Pro Asn Lys Ile Val Ile Thr Gly Lys Asp Lys Asn Asn865 870 875 880Asn Pro Lys Gln Trp Gln Thr Ile Asp Glu Ser Tyr Glu Phe Cys Phe 885 890 895Ser Leu Tyr Lys Asn Asp Leu Ile Leu Leu Gln Lys Lys Asn Met Gln 900 905 910Glu Pro Glu Phe Ala Tyr Tyr Asn Asp Phe Ser Ile Ser Thr Ser Ser 915 920 925Ile Cys Val Glu Lys His Asp Asn Lys Phe Glu Asn Leu Thr Ser Asn 930 935 940Gln Lys Leu Leu Phe Ser Asn Ala Lys Glu Gly Ser Val Lys Val Glu945 950 955 960Ser Leu Gly Ile Gln Asn Leu Lys Val Phe Glu Lys Tyr Ile Ile Thr 965 970 975Pro Leu Gly Asp Lys Ile Lys Ala Asp Phe Gln Pro Arg Glu Asn Ile 980 985 990Ser Leu Lys Thr Ser Lys Lys Tyr Gly Leu Arg 995 100034103PRTArtificial sequenceexemplary KRAB 34Asp Ala Lys Ser Leu Thr Ala Trp Ser Arg Thr Leu Val Thr Phe Lys1 5 10 15Asp Val Phe Val Asp Phe Thr Arg Glu Glu Trp Lys Leu Leu Asp Thr 20 25 30Ala Gln Gln Ile Leu Tyr Arg Asn Val Met Leu Glu Asn Tyr Lys Asn 35 40 45Leu Val Ser Leu Gly Tyr Gln Leu Thr Lys Pro Asp Val Ile Leu Arg 50 55 60Leu Glu Lys Gly Glu Glu Pro Trp Leu Val Glu Arg Glu Ile His Gln65 70 75 80Glu Thr His Pro Asp Ser Glu Thr Ala Phe Glu Ile Lys Ser Ser Val 85 90 95Pro Lys Lys Lys Arg Lys Val 100351052PRTArtificial sequenceexemplary KRAB 35Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val Gly1 5 10 15Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly Val 20 25 30Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg Ser 35 40 45Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile Gln 50 55 60Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His Ser65 70 75 80Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu Ser 85 90 95Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu Ala 100 105 110Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr Gly 115 120 125Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala Leu 130 135 140Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys Asp145 150 155 160Gly 
Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr Val 165 170 175Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln Leu 180 185 190Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg Arg 195 200 205Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys Asp 210 215 220Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe Pro225 230 235 240Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr Asn 245 250 255Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn Glu 260 265 270Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe Lys 275 280 285Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu Val 290 295 300Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys Pro305 310 315 320Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr Ala 325 330 335Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala Lys 340 345 350Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu Thr 355 360 365Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser Asn 370 375 380Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile Asn385 390 395 400Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala Ile 405 410 415Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln Gln 420 425 430Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro Val 435 440 445Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile Ile 450 455 460Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg Glu465 470 475 480Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys Arg 485 490 495Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr Gly 500 505 510Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp Met 515 520 525Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu Asp 530 535 540Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro Arg545 550 555 560Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys Gln 565 570 575Glu Glu Ala Ser Lys Lys Gly Asn Arg 
Thr Pro Phe Gln Tyr Leu Ser 580 585 590Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile Leu 595 600 605Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu Tyr 610 615 620Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp Phe625 630 635 640Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu Met 645 650 655Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys Val 660 665 670Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp Lys 675 680 685Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp Ala 690 695 700Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys Leu705 710 715 720Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys Gln 725 730 735Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu Ile 740 745 750Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp Tyr 755 760 765Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Glu Leu Ile Asn 770 775 780Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu Ile785 790 795 800Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu Lys 805 810 815Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His Asp 820 825 830Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly Asp 835 840 845Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr Leu 850 855 860Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile Lys865 870 875 880Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp Tyr 885 890 895Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr Arg 900 905 910Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val Lys 915 920 925Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser Lys 930 935 940Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala Glu945 950 955 960Phe Ile Ala Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile Asn Gly Glu 965 970 975Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile Glu 980 985 990Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met Asn 995 
1000 1005Asp Lys Arg Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys Thr 1010 1015 1020Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu Tyr 1025 1030 1035Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly 1040 1045 1050361053PRTArtificial sequencedCas9 sequence 36Met Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val1 5 10 15Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly 20 25 30Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg 35 40 45Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile 50 55 60Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His65 70 75 80Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu 85 90 95Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu 100 105 110Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr 115 120 125Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala 130 135 140Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys145 150 155 160Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr 165 170 175Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln 180 185 190Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg 195 200 205Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys 210 215 220Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe225 230 235 240Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr 245 250 255Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn 260 265 270Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe 275 280 285Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu 290 295 300Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys305 310 315 320Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr 325 330 335Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala 340 345 350Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu 355 360 365Thr Asn 
Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser 370 375 380Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile385 390 395 400Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala 405 410 415Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln 420 425 430Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro 435 440 445Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile 450 455 460Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg465 470 475 480Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys 485 490 495Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr 500 505 510Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp 515 520 525Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu 530 535 540Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro545 550 555 560Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys 565 570 575Gln Glu Glu Ala Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu 580 585 590Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile 595 600 605Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu 610 615 620Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp625 630 635 640Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu 645 650 655Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys 660 665 670Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp 675 680 685Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp 690 695 700Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys705 710 715 720Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met
Phe Glu Glu Lys 725 730 735Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu 740 745 750Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp 755 760 765Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Glu Leu Ile 770 775 780Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu785 790 795 800Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu 805 810 815Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His 820 825 830Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly 835 840 845Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr 850 855 860Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile865 870 875 880Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp 885 890 895Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr 900 905 910Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val 915 920 925Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser 930 935 940Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala945 950 955 960Glu Phe Ile Ala Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile Asn Gly 965 970 975Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile 980 985 990Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met 995 1000 1005Asn Asp Lys Arg Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys 1010 1015 1020Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu 1025 1030 1035Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly 1040 1045 10503716PRTArtificial sequenceexemplary nuclear localization sequence 37Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala Ala1 5 10 153816PRTArtificial sequenceexemplary nuclear localization sequence 38Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys1 5 10 15391199PRTArtificial sequenceexemplary HA-NLS-dSaCas9-NLS-KRAB sequence 39Met Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ala Pro Lys Lys Lys Arg1 5 10 15Lys Val Gly Ile His Gly Val Pro Ala Ala 
Lys Arg Asn Tyr Ile Leu 20 25 30Gly Leu Ala Ile Gly Ile Thr Ser Val Gly Tyr Gly Ile Ile Asp Tyr 35 40 45Glu Thr Arg Asp Val Ile Asp Ala Gly Val Arg Leu Phe Lys Glu Ala 50 55 60Asn Val Glu Asn Asn Glu Gly Arg Arg Ser Lys Arg Gly Ala Arg Arg65 70 75 80Leu Lys Arg Arg Arg Arg His Arg Ile Gln Arg Val Lys Lys Leu Leu 85 90 95Phe Asp Tyr Asn Leu Leu Thr Asp His Ser Glu Leu Ser Gly Ile Asn 100 105 110Pro Tyr Glu Ala Arg Val Lys Gly Leu Ser Gln Lys Leu Ser Glu Glu 115 120 125Glu Phe Ser Ala Ala Leu Leu His Leu Ala Lys Arg Arg Gly Val His 130 135 140Asn Val Asn Glu Val Glu Glu Asp Thr Gly Asn Glu Leu Ser Thr Lys145 150 155 160Glu Gln Ile Ser Arg Asn Ser Lys Ala Leu Glu Glu Lys Tyr Val Ala 165 170 175Glu Leu Gln Leu Glu Arg Leu Lys Lys Asp Gly Glu Val Arg Gly Ser 180 185 190Ile Asn Arg Phe Lys Thr Ser Asp Tyr Val Lys Glu Ala Lys Gln Leu 195 200 205Leu Lys Val Gln Lys Ala Tyr His Gln Leu Asp Gln Ser Phe Ile Asp 210 215 220Thr Tyr Ile Asp Leu Leu Glu Thr Arg Arg Thr Tyr Tyr Glu Gly Pro225 230 235 240Gly Glu Gly Ser Pro Phe Gly Trp Lys Asp Ile Lys Glu Trp Tyr Glu 245 250 255Met Leu Met Gly His Cys Thr Tyr Phe Pro Glu Glu Leu Arg Ser Val 260 265 270Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr Asn Ala Leu Asn Asp Leu Asn 275 280 285Asn Leu Val Ile Thr Arg Asp Glu Asn Glu Lys Leu Glu Tyr Tyr Glu 290 295 300Lys Phe Gln Ile Ile Glu Asn Val Phe Lys Gln Lys Lys Lys Pro Thr305 310 315 320Leu Lys Gln Ile Ala Lys Glu Ile Leu Val Asn Glu Glu Asp Ile Lys 325 330 335Gly Tyr Arg Val Thr Ser Thr Gly Lys Pro Glu Phe Thr Asn Leu Lys 340 345 350Val Tyr His Asp Ile Lys Asp Ile Thr Ala Arg Lys Glu Ile Ile Glu 355 360 365Asn Ala Glu Leu Leu Asp Gln Ile Ala Lys Ile Leu Thr Ile Tyr Gln 370 375 380Ser Ser Glu Asp Ile Gln Glu Glu Leu Thr Asn Leu Asn Ser Glu Leu385 390 395 400Thr Gln Glu Glu Ile Glu Gln Ile Ser Asn Leu Lys Gly Tyr Thr Gly 405 410 415Thr His Asn Leu Ser Leu Lys Ala Ile Asn Leu Ile Leu Asp Glu Leu 420 425 430Trp His Thr Asn Asp Asn Gln Ile Ala Ile Phe Asn Arg Leu Lys Leu 435 440 445Val Pro Lys Lys 
Val Asp Leu Ser Gln Gln Lys Glu Ile Pro Thr Thr 450 455 460Leu Val Asp Asp Phe Ile Leu Ser Pro Val Val Lys Arg Ser Phe Ile465 470 475 480Gln Ser Ile Lys Val Ile Asn Ala Ile Ile Lys Lys Tyr Gly Leu Pro 485 490 495Asn Asp Ile Ile Ile Glu Leu Ala Arg Glu Lys Asn Ser Lys Asp Ala 500 505 510Gln Lys Met Ile Asn Glu Met Gln Lys Arg Asn Arg Gln Thr Asn Glu 515 520 525Arg Ile Glu Glu Ile Ile Arg Thr Thr Gly Lys Glu Asn Ala Lys Tyr 530 535 540Leu Ile Glu Lys Ile Lys Leu His Asp Met Gln Glu Gly Lys Cys Leu545 550 555 560Tyr Ser Leu Glu Ala Ile Pro Leu Glu Asp Leu Leu Asn Asn Pro Phe 565 570 575Asn Tyr Glu Val Asp His Ile Ile Pro Arg Ser Val Ser Phe Asp Asn 580 585 590Ser Phe Asn Asn Lys Val Leu Val Lys Gln Glu Glu Ala Ser Lys Lys 595 600 605Gly Asn Arg Thr Pro Phe Gln Tyr Leu Ser Ser Ser Asp Ser Lys Ile 610 615 620Ser Tyr Glu Thr Phe Lys Lys His Ile Leu Asn Leu Ala Lys Gly Lys625 630 635 640Gly Arg Ile Ser Lys Thr Lys Lys Glu Tyr Leu Leu Glu Glu Arg Asp 645 650 655Ile Asn Arg Phe Ser Val Gln Lys Asp Phe Ile Asn Arg Asn Leu Val 660 665 670Asp Thr Arg Tyr Ala Thr Arg Gly Leu Met Asn Leu Leu Arg Ser Tyr 675 680 685Phe Arg Val Asn Asn Leu Asp Val Lys Val Lys Ser Ile Asn Gly Gly 690 695 700Phe Thr Ser Phe Leu Arg Arg Lys Trp Lys Phe Lys Lys Glu Arg Asn705 710 715 720Lys Gly Tyr Lys His His Ala Glu Asp Ala Leu Ile Ile Ala Asn Ala 725 730 735Asp Phe Ile Phe Lys Glu Trp Lys Lys Leu Asp Lys Ala Lys Lys Val 740 745 750Met Glu Asn Gln Met Phe Glu Glu Lys Gln Ala Glu Ser Met Pro Glu 755 760 765Ile Glu Thr Glu Gln Glu Tyr Lys Glu Ile Phe Ile Thr Pro His Gln 770 775 780Ile Lys His Ile Lys Asp Phe Lys Asp Tyr Lys Tyr Ser His Arg Val785 790 795 800Asp Lys Lys Pro Asn Arg Glu Leu Ile Asn Asp Thr Leu Tyr Ser Thr 805 810 815Arg Lys Asp Asp Lys Gly Asn Thr Leu Ile Val Asn Asn Leu Asn Gly 820 825 830Leu Tyr Asp Lys Asp Asn Asp Lys Leu Lys Lys Leu Ile Asn Lys Ser 835 840 845Pro Glu Lys Leu Leu Met Tyr His His Asp Pro Gln Thr Tyr Gln Lys 850 855 860Leu Lys Leu Ile Met Glu Gln Tyr Gly Asp Glu Lys 
Asn Pro Leu Tyr 865 870 875 880 Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr Leu Thr Lys Tyr Ser Lys Lys 885 890 895 Asp Asn Gly Pro Val Ile Lys Lys Ile Lys Tyr Tyr Gly Asn Lys Leu 900 905 910 Asn Ala His Leu Asp Ile Thr Asp Asp Tyr Pro Asn Ser Arg Asn Lys 915 920 925 Val Val Lys Leu Ser Leu Lys Pro Tyr Arg Phe Asp Val Tyr Leu Asp 930 935 940 Asn Gly Val Tyr Lys Phe Val Thr Val Lys Asn Leu Asp Val Ile Lys 945 950 955 960 Lys Glu Asn Tyr Tyr Glu Val Asn Ser Lys Cys Tyr Glu Glu Ala Lys 965 970 975 Lys Leu Lys Lys Ile Ser Asn Gln Ala Glu Phe Ile Ala Ser Phe Tyr 980 985 990 Asn Asn Asp Leu Ile Lys Ile Asn Gly Glu Leu Tyr Arg Val Ile Gly 995 1000 1005 Val Asn Asn Asp Leu Leu Asn Arg Ile Glu Val Asn Met Ile Asp 1010 1015 1020 Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met Asn Asp Lys Arg Pro 1025 1030 1035 Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys Thr Gln Ser Ile Lys 1040 1045 1050 Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu Tyr Glu Val Lys Ser 1055 1060 1065 Lys Lys His Pro Gln Ile Ile Lys Lys Gly Lys Arg Pro Ala Ala 1070 1075 1080 Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys Gly Ser Asp Ala 1085 1090 1095 Lys Ser Leu Thr Ala Trp Ser Arg Thr Leu Val Thr Phe Lys Asp 1100 1105 1110 Val Phe Val Asp Phe Thr Arg Glu Glu Trp Lys Leu Leu Asp Thr 1115 1120 1125 Ala Gln Gln Ile Leu Tyr Arg Asn Val Met Leu Glu Asn Tyr Lys 1130 1135 1140 Asn Leu Val Ser Leu Gly Tyr Gln Leu Thr Lys Pro Asp Val Ile 1145 1150 1155 Leu Arg Leu Glu Lys Gly Glu Glu Pro Trp Leu Val Glu Arg Glu 1160 1165 1170 Ile His Gln Glu Thr His Pro Asp Ser Glu Thr Ala Phe Glu Ile 1175 1180 1185 Lys Ser Ser Val Pro Lys Lys Lys Arg Lys Val 1190 1195

<210> 40 <211> 1198 <212> PRT <213> Artificial sequence <220> <223> exemplary HA-NLS-dSaCas9-NLS-KRAB sequence <400> 40
Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ala Pro Lys Lys Lys Arg Lys 1 5 10 15 Val Gly Ile His Gly Val Pro Ala Ala Lys Arg Asn Tyr Ile Leu Gly 20 25 30 Leu Ala Ile Gly Ile Thr Ser Val Gly Tyr Gly Ile Ile Asp Tyr Glu 35 40 45 Thr Arg Asp Val Ile Asp Ala Gly Val Arg Leu Phe Lys Glu Ala Asn 50 55 60 Val Glu Asn Asn Glu Gly Arg Arg Ser Lys Arg Gly Ala Arg Arg
Leu65 70 75 80Lys Arg Arg Arg Arg His Arg Ile Gln Arg Val Lys Lys Leu Leu Phe 85 90 95Asp Tyr Asn Leu Leu Thr Asp His Ser Glu Leu Ser Gly Ile Asn Pro 100 105 110Tyr Glu Ala Arg Val Lys Gly Leu Ser Gln Lys Leu Ser Glu Glu Glu 115 120 125Phe Ser Ala Ala Leu Leu His Leu Ala Lys Arg Arg Gly Val His Asn 130 135 140Val Asn Glu Val Glu Glu Asp Thr Gly Asn Glu Leu Ser Thr Lys Glu145 150 155 160Gln Ile Ser Arg Asn Ser Lys Ala Leu Glu Glu Lys Tyr Val Ala Glu 165 170 175Leu Gln Leu Glu Arg Leu Lys Lys Asp Gly Glu Val Arg Gly Ser Ile 180 185 190Asn Arg Phe Lys Thr Ser Asp Tyr Val Lys Glu Ala Lys Gln Leu Leu 195 200 205Lys Val Gln Lys Ala Tyr His Gln Leu Asp Gln Ser Phe Ile Asp Thr 210 215 220Tyr Ile Asp Leu Leu Glu Thr Arg Arg Thr Tyr Tyr Glu Gly Pro Gly225 230 235 240Glu Gly Ser Pro Phe Gly Trp Lys Asp Ile Lys Glu Trp Tyr Glu Met 245 250 255Leu Met Gly His Cys Thr Tyr Phe Pro Glu Glu Leu Arg Ser Val Lys 260 265 270Tyr Ala Tyr Asn Ala Asp Leu Tyr Asn Ala Leu Asn Asp Leu Asn Asn 275 280 285Leu Val Ile Thr Arg Asp Glu Asn Glu Lys Leu Glu Tyr Tyr Glu Lys 290 295 300Phe Gln Ile Ile Glu Asn Val Phe Lys Gln Lys Lys Lys Pro Thr Leu305 310 315 320Lys Gln Ile Ala Lys Glu Ile Leu Val Asn Glu Glu Asp Ile Lys Gly 325 330 335Tyr Arg Val Thr Ser Thr Gly Lys Pro Glu Phe Thr Asn Leu Lys Val 340 345 350Tyr His Asp Ile Lys Asp Ile Thr Ala Arg Lys Glu Ile Ile Glu Asn 355 360 365Ala Glu Leu Leu Asp Gln Ile Ala Lys Ile Leu Thr Ile Tyr Gln Ser 370 375 380Ser Glu Asp Ile Gln Glu Glu Leu Thr Asn Leu Asn Ser Glu Leu Thr385 390 395 400Gln Glu Glu Ile Glu Gln Ile Ser Asn Leu Lys Gly Tyr Thr Gly Thr 405 410 415His Asn Leu Ser Leu Lys Ala Ile Asn Leu Ile Leu Asp Glu Leu Trp 420 425 430His Thr Asn Asp Asn Gln Ile Ala Ile Phe Asn Arg Leu Lys Leu Val 435 440 445Pro Lys Lys Val Asp Leu Ser Gln Gln Lys Glu Ile Pro Thr Thr Leu 450 455 460Val Asp Asp Phe Ile Leu Ser Pro Val Val Lys Arg Ser Phe Ile Gln465 470 475 480Ser Ile Lys Val Ile Asn Ala Ile Ile Lys Lys Tyr Gly Leu Pro Asn 485 490 495Asp Ile Ile Ile Glu Leu 
Ala Arg Glu Lys Asn Ser Lys Asp Ala Gln 500 505 510Lys Met Ile Asn Glu Met Gln Lys Arg Asn Arg Gln Thr Asn Glu Arg 515 520 525Ile Glu Glu Ile Ile Arg Thr Thr Gly Lys Glu Asn Ala Lys Tyr Leu 530 535 540Ile Glu Lys Ile Lys Leu His Asp Met Gln Glu Gly Lys Cys Leu Tyr545 550 555 560Ser Leu Glu Ala Ile Pro Leu Glu Asp Leu Leu Asn Asn Pro Phe Asn 565 570 575Tyr Glu Val Asp His Ile Ile Pro Arg Ser Val Ser Phe Asp Asn Ser 580 585 590Phe Asn Asn Lys Val Leu Val Lys Gln Glu Glu Ala Ser Lys Lys Gly 595 600 605Asn Arg Thr Pro Phe Gln Tyr Leu Ser Ser Ser Asp Ser Lys Ile Ser 610 615 620Tyr Glu Thr Phe Lys Lys His Ile Leu Asn Leu Ala Lys Gly Lys Gly625 630 635 640Arg Ile Ser Lys Thr Lys Lys Glu Tyr Leu Leu Glu Glu Arg Asp Ile 645 650 655Asn Arg Phe Ser Val Gln Lys Asp Phe Ile Asn Arg Asn Leu Val Asp 660 665 670Thr Arg Tyr Ala Thr Arg Gly Leu Met Asn Leu Leu Arg Ser Tyr Phe 675 680 685Arg Val Asn Asn Leu Asp Val Lys Val Lys Ser Ile Asn Gly Gly Phe 690 695 700Thr Ser Phe Leu Arg Arg Lys Trp Lys Phe Lys Lys Glu Arg Asn Lys705 710 715 720Gly Tyr Lys His His Ala Glu Asp Ala Leu Ile Ile Ala Asn Ala Asp 725 730 735Phe Ile Phe Lys Glu Trp Lys Lys Leu Asp Lys Ala Lys Lys Val Met 740 745 750Glu Asn Gln Met Phe Glu Glu Lys Gln Ala Glu Ser Met Pro Glu Ile 755 760 765Glu Thr Glu Gln Glu Tyr Lys Glu Ile Phe Ile Thr Pro His Gln Ile 770 775 780Lys His Ile Lys Asp Phe Lys Asp Tyr Lys Tyr Ser His Arg Val Asp785 790 795 800Lys Lys Pro Asn Arg Glu Leu Ile Asn Asp Thr Leu Tyr Ser Thr Arg 805 810 815Lys Asp Asp Lys Gly Asn Thr Leu Ile Val Asn Asn Leu Asn Gly Leu 820 825 830Tyr Asp Lys Asp Asn Asp Lys Leu Lys Lys Leu Ile Asn Lys Ser Pro 835 840 845Glu Lys Leu Leu Met Tyr His His Asp Pro Gln Thr Tyr Gln Lys Leu 850 855 860Lys Leu Ile Met Glu Gln Tyr Gly Asp Glu Lys Asn

Pro Leu Tyr Lys 865 870 875 880 Tyr Tyr Glu Glu Thr Gly Asn Tyr Leu Thr Lys Tyr Ser Lys Lys Asp 885 890 895 Asn Gly Pro Val Ile Lys Lys Ile Lys Tyr Tyr Gly Asn Lys Leu Asn 900 905 910 Ala His Leu Asp Ile Thr Asp Asp Tyr Pro Asn Ser Arg Asn Lys Val 915 920 925 Val Lys Leu Ser Leu Lys Pro Tyr Arg Phe Asp Val Tyr Leu Asp Asn 930 935 940 Gly Val Tyr Lys Phe Val Thr Val Lys Asn Leu Asp Val Ile Lys Lys 945 950 955 960 Glu Asn Tyr Tyr Glu Val Asn Ser Lys Cys Tyr Glu Glu Ala Lys Lys 965 970 975 Leu Lys Lys Ile Ser Asn Gln Ala Glu Phe Ile Ala Ser Phe Tyr Asn 980 985 990 Asn Asp Leu Ile Lys Ile Asn Gly Glu Leu Tyr Arg Val Ile Gly Val 995 1000 1005 Asn Asn Asp Leu Leu Asn Arg Ile Glu Val Asn Met Ile Asp Ile 1010 1015 1020 Thr Tyr Arg Glu Tyr Leu Glu Asn Met Asn Asp Lys Arg Pro Pro 1025 1030 1035 Arg Ile Ile Lys Thr Ile Ala Ser Lys Thr Gln Ser Ile Lys Lys 1040 1045 1050 Tyr Ser Thr Asp Ile Leu Gly Asn Leu Tyr Glu Val Lys Ser Lys 1055 1060 1065 Lys His Pro Gln Ile Ile Lys Lys Gly Lys Arg Pro Ala Ala Thr 1070 1075 1080 Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys Gly Ser Asp Ala Lys 1085 1090 1095 Ser Leu Thr Ala Trp Ser Arg Thr Leu Val Thr Phe Lys Asp Val 1100 1105 1110 Phe Val Asp Phe Thr Arg Glu Glu Trp Lys Leu Leu Asp Thr Ala 1115 1120 1125 Gln Gln Ile Leu Tyr Arg Asn Val Met Leu Glu Asn Tyr Lys Asn 1130 1135 1140 Leu Val Ser Leu Gly Tyr Gln Leu Thr Lys Pro Asp Val Ile Leu 1145 1150 1155 Arg Leu Glu Lys Gly Glu Glu Pro Trp Leu Val Glu Arg Glu Ile 1160 1165 1170 His Gln Glu Thr His Pro Asp Ser Glu Thr Ala Phe Glu Ile Lys 1175 1180 1185 Ser Ser Val Pro Lys Lys Lys Arg Lys Val 1190 1195

<210> 41 <211> 1189 <212> PRT <213> Artificial sequence <220> <223> exemplary NLS-dSaCas9-NLS-KRAB <400> 41
Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala Ala 1 5 10 15 Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val Gly 20 25 30 Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly Val 35 40 45 Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg Ser 50 55 60 Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile Gln 65 70 75 80 Arg
Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His Ser 85 90 95Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu Ser 100 105 110Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu Ala 115 120 125Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr Gly 130 135 140Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala Leu145 150 155 160Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys Asp 165 170 175Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr Val 180 185 190Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln Leu 195 200 205Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg Arg 210 215 220Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys Asp225 230 235 240Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe Pro 245 250 255Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr Asn 260 265 270Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn Glu 275 280 285Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe Lys 290 295 300Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu Val305 310 315 320Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys Pro 325 330 335Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr Ala 340 345 350Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala Lys 355 360 365Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu Thr 370 375 380Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser Asn385 390 395 400Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile Asn 405 410 415Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala Ile 420 425 430Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln Gln 435 440 445Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro Val 450 455 460Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile Ile465 470 475 480Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg Glu 485 490 495Lys Asn Ser Lys Asp Ala Gln Lys Met Ile 
Asn Glu Met Gln Lys Arg 500 505 510Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr Gly 515 520 525Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp Met 530 535 540Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu Asp545 550 555 560Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp His Ile Ile Pro Arg 565 570 575Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys Gln 580 585 590Glu Glu Ala Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu Ser 595 600 605Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile Leu 610 615 620Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu Tyr625 630 635 640Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp Phe 645 650 655Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu Met 660 665 670Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys Val 675 680 685Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp Lys 690 695 700Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp Ala705 710 715 720Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys Leu 725 730 735Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys Gln 740 745 750Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu Ile 755 760 765Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp Tyr 770 775 780Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Glu Leu Ile Asn785 790 795 800Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu Ile 805 810 815Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu Lys 820 825 830Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His Asp 835 840 845Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly Asp 850 855 860Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr Leu865 870 875 880Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile Lys 885 890 895Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp Tyr 900 905 910Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr Arg 915 920 
925 Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val Lys 930 935 940 Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser Lys 945 950 955 960 Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala Glu 965 970 975 Phe Ile Ala Ser Phe Tyr Asn Asn Asp Leu Ile Lys Ile Asn Gly Glu 980 985 990 Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile Glu 995 1000 1005 Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met 1010 1015 1020 Asn Asp Lys Arg Pro Pro Arg Ile Ile Lys Thr Ile Ala Ser Lys 1025 1030 1035 Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu 1040 1045 1050 Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly 1055 1060 1065 Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys 1070 1075 1080 Lys Gly Ser Asp Ala Lys Ser Leu Thr Ala Trp Ser Arg Thr Leu 1085 1090 1095 Val Thr Phe Lys Asp Val Phe Val Asp Phe Thr Arg Glu Glu Trp 1100 1105 1110 Lys Leu Leu Asp Thr Ala Gln Gln Ile Leu Tyr Arg Asn Val Met 1115 1120 1125 Leu Glu Asn Tyr Lys Asn Leu Val Ser Leu Gly Tyr Gln Leu Thr 1130 1135 1140 Lys Pro Asp Val Ile Leu Arg Leu Glu Lys Gly Glu Glu Pro Trp 1145 1150 1155 Leu Val Glu Arg Glu Ile His Gln Glu Thr His Pro Asp Ser Glu 1160 1165 1170 Thr Ala Phe Glu Ile Lys Ser Ser Val Pro Lys Lys Lys Arg Lys 1175 1180 1185 Val

<210> 42 <211> 22 <212> DNA <213> Artificial sequence <220> <223> protospacer sequence <400> 42
gagggaaggg atacaggctg ga 22