U.S. patent application number 16/624500 was filed with the patent office on 2020-06-04 for biomarkers for the diagnosis and treatment of fibrotic lung disease.
The applicant listed for this patent is The Regents of the University of Colorado, A Body Corporate. Invention is credited to Christopher M. EVANS, Joyce S. LEE, David A. SCHWARTZ, Marvin I. SCHWARZ, Ivana V. YANG.
Application Number | 20200171024 16/624500 |
Family ID | 62976194 |
Filed Date | 2020-06-04 |
United States Patent Application | 20200171024
Kind Code | A1
SCHWARTZ; David A.; et al. | June 4, 2020
BIOMARKERS FOR THE DIAGNOSIS AND TREATMENT OF FIBROTIC LUNG
DISEASE
Abstract
The present disclosure provides a method of treating a fibrotic
lung disease in a subject comprising administering to the subject
an effective amount of a therapeutic agent, wherein the subject is
asymptomatic and wherein the subject is at risk of developing the
fibrotic lung disease.
Inventors: | SCHWARTZ; David A. (Aurora, CO); YANG; Ivana V. (Englewood, CO); LEE; Joyce S. (Cherry Hills Village, CO); EVANS; Christopher M. (Denver, CO); SCHWARZ; Marvin I. (Denver, CO)
Applicant: |
Name | City | State | Country | Type
The Regents of the University of Colorado, A Body Corporate | Denver | CO | US |
Family ID: | 62976194
Appl. No.: | 16/624500
Filed: | June 26, 2018
PCT Filed: | June 26, 2018
PCT NO: | PCT/US2018/039573
371 Date: | December 19, 2019
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
62525087 | Jun 26, 2017 |
62525088 | Jun 26, 2017 |
Current U.S. Class: | 1/1
Current CPC Class: | C12Q 1/6883 20130101; A61K 31/198 20130101; C12Q 2600/106 20130101; A61K 31/496 20130101; A61K 31/4418 20130101; C12Q 2600/156 20130101; A61K 31/4412 20130101; A61K 9/0056 20130101; A61P 11/00 20180101; A61K 31/197 20130101
International Class: | A61K 31/496 20060101 A61K031/496; A61K 31/197 20060101 A61K031/197; A61P 11/00 20060101 A61P011/00; A61K 9/00 20060101 A61K009/00; C12Q 1/6883 20180101 C12Q001/6883; A61K 31/4412 20060101 A61K031/4412
Claims
1. A method of treating a fibrotic lung disease in a subject
comprising administering to the subject an effective amount of a
therapeutic agent, wherein the subject is asymptomatic and wherein
the subject is at risk of developing the fibrotic lung disease.
2. The method of claim 1, wherein the subject presents radiographic
Usual Interstitial Pneumonia (UIP).
3. The method of claim 1 or 2, wherein the subject has fibrotic
interstitial lung disease (FILD).
4. The method of claim 1 or 2, wherein the subject has an
interstitial lung disease (ILD).
5. The method of claim 4, wherein the subject has rheumatoid
arthritis-associated interstitial lung disease (RA-ILD).
6. The method of any one of claims 1-3, wherein the subject has a
blood relative with familial interstitial pneumonia (FIP).
7. The method of claim 6, wherein the blood relative is a
sibling.
8. The method of any one of claims 1-7, wherein the subject has a
mutation in a sequence encoding Mucin 5B (MUC5B), Telomerase RNA
Component (TERC), Family with sequence similarity 13 member A
(FAM13A), Telomerase Reverse Transcriptase (TERT), Desmoplakin
(DSP), Zinc-alpha 2-Glycoprotein 1 (AZGP1),
Oligonucleotide/oligosaccharide-binding Fold Containing 1 (OBFC1),
ATPase Phospholipid Transporting 11A (ATP11A), Isovaleryl-CoA
dehydrogenase (IVD)/Dispatched RND Transporter Family Member 2
(DISP2), Dipeptidyl Peptidase 9 (DPP9), Sialic Acid Binding Ig-Like
Lectin 14 (SIGLEC14), Adrenomedullin 2 (ADM2), Tetraspanin 5
(TSPAN5), Calcium/Calmodulin-Dependent Protein Kinase Kinase 1
(CAMKK1), zinc finger with KRAB and SCAN domains 1 (ZKSCAN1),
isovaleryl-CoA dehydrogenase (IVD), ATPase phospholipid
transporting 11A (AK025511) or Matrix Metalloprotease-7
(MMP-7).
9. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding a MUC5B promoter.
10. The method of claim 9, wherein the polymorphism is
rs35705950.
11. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding a TERC 3' untranslated region
(UTR).
12. The method of claim 11, wherein the polymorphism is
rs2293607.
13. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding intronic FAM13A.
14. The method of claim 13, wherein the polymorphism is
rs2609260.
15. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding intronic TERT.
16. The method of claim 15, wherein the polymorphism is
rs4449583.
17. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding intronic DSP.
18. The method of claim 17, wherein the polymorphism is
rs2076295.
19. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding intronic ZKSCAN1.
20. The method of claim 19, wherein the polymorphism is
rs6963345.
21. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding intronic OBFC1.
22. The method of claim 21, wherein the polymorphism is
rs2488000.
23. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding an AK025511 3' UTR.
24. The method of claim 23, wherein the polymorphism is
rs1278769.
25. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding IVD.
26. The method of claim 25, wherein the polymorphism is
rs35700143.
27. The method of claim 8, wherein the mutation comprises a
polymorphism in a sequence encoding intronic DPP9.
28. The method of claim 27, wherein the polymorphism is
rs12610495.
29. The method of any one of claims 1-28, wherein the fibrotic lung
disease is pulmonary fibrosis, idiopathic pulmonary fibrosis (IPF),
an interstitial lung abnormality (ILA), an asymptomatic ILA,
interstitial lung disease (ILD) or rheumatoid arthritis-associated
interstitial lung disease (RA-ILD).
30. The method of any one of claims 1-28, wherein the fibrotic lung
disease is ILD or RA-ILD.
31. The method of any one of claims 1-28, wherein the fibrotic lung
disease is pulmonary fibrosis or IPF.
32. The method of any one of claims 1-28, wherein the fibrotic lung
disease is IPF.
33. The method of any one of claims 1-32, wherein the therapeutic
agent comprises a N-acetylcysteine, pirfenidone, and
nintedanib.
34. The method of any one of claims 1-32, wherein the therapeutic
agent comprises pirfenidone.
35. The method of claim 34, wherein the effective dosage is about
2400 mg/day.
36. The method of claim 35, wherein the effective dosage is
administered orally as a capsule or a tablet.
37. The method of claim 35 or 36, wherein the effective dosage is
administered three times per day.
38. The method of any one of claims 35-37, wherein the effective
dosage is administered according to an escalating dosage
regimen.
39. The method of claim 39, wherein the escalating dosage regimen
comprises (a) administering to the subject about 800 mg of
pirfenidone per day for a first week; (b) administering to the
subject about 1600 mg of pirfenidone per day for a second week; and
(c) administering to the subject about 2400 mg of pirfenidone per
day for the remainder of the treatment.
40. The method of claim 38, wherein the escalating dosage regimen
comprises (a) administering to the subject a capsule or tablet
comprising about 250 mg of pirfenidone three times a day for a
first week; (b) administering to the subject two capsules or
tablets comprising about 250 mg of pirfenidone three times a day
for a second week; and (c) administering to the subject three
capsules or tablets comprising about 250 mg of pirfenidone three
times a day for the remainder of the treatment.
41. The method of claim 40, wherein the capsule or tablet comprises
267 mg of pirfenidone.
42. The method of any one of claims 1-32, wherein the therapeutic
agent comprises nintedanib.
43. The method of claim 42, wherein the effective dosage is
administered orally as a capsule or a tablet.
44. The method of claim 43, wherein the effective dosage is about
300 mg/day.
45. The method of claim 42 or 43, wherein the effective dosage is
about 150 mg administered twice per day, wherein the daily doses
are administered about 12 hours apart from one another.
46. The method of claim 43, wherein the effective dosage is about
200 mg/day.
47. The method of claim 43 or 46, wherein the effective dosage is
about 100 mg administered twice per day, wherein the daily doses
are administered about 12 hours apart from one another.
48. The method of any one of claims 42-47, wherein the effective
dosage is administered according to a modified or interrupted
dosage regimen.
49. The method of claim 48, wherein the modified or interrupted
dosage regimen comprises (a) administering to the subject about 300
mg of nintedanib per day until the subject presents an elevated
level of liver enzymes compared to a control level of liver
enzymes; (b) administering to the subject about 200 mg of
nintedanib per day until the subject presents the control level of
liver enzymes; and (c) administering to the subject about 300 mg of
nintedanib per day for the remainder of the treatment; wherein the
control level of liver enzymes is a level detected in the subject
prior to an initiation of the treatment.
50. The method of claim 48, wherein the modified or interrupted
regimen comprises (a) administering to the subject a capsule or
tablet comprising about 150 mg of nintedanib twice per day until
the subject presents an elevated level of liver enzymes compared to
a control level of liver enzymes; (b) administering to the subject
two capsules or tablets comprising about 100 mg twice per day until
the subject presents an elevated level of liver enzymes compared to
a control level of liver enzymes; and (c) administering to the
subject a capsule or tablet comprising about 150 mg of nintedanib
twice per day for the remainder of the treatment; wherein the
control level of liver enzymes is a level detected in the subject
prior to an initiation of the treatment.
51. The method of any one of claims 1-50, wherein the therapeutic
agent prevents the onset or development of a sign or symptom of the
fibrotic lung disease.
52. The method of any one of claims 1-50, wherein the therapeutic
agent delays the onset or development of a sign or symptom of the
fibrotic lung disease when compared to the expected onset of the
sign or symptom in the absence of treatment with the therapeutic
agent.
53. The method of any one of claims 1-50, wherein the therapeutic
agent reduces the severity of a sign or symptom of the fibrotic
lung disease when compared to the expected severity of the sign
or symptom in the absence of treatment with the therapeutic
agent.
54. The method of any one of claims 51-53, wherein at least one
sign of the fibrotic lung disease is detectable before the subject
presents a symptom of the fibrotic lung disease.
55. The method of claim 54, wherein the at least one sign comprises
gradual or unintended weight loss, clubbing of the fingers or toes,
rapid and shallow breathing, fibrotic lesions in one or both lungs
detectable by radiography, or a cough.
56. The method of claim 54, wherein the symptom comprises shortness
of breath during exercise, shortness of breath at rest, a dry and
hacking cough, repeated bouts of coughing, and uncontrollable bouts
of coughing.
57. The method of any one of claims 1-56, wherein the method
prevents the onset of a secondary condition associated with a
severe form of the fibrotic lung disease.
58. The method of claim 57, wherein the secondary condition comprises a
collapsed lung, an infected lung, a blood clot in a lung, lung
cancer, respiratory failure, pulmonary hypertension, heart failure
or death.
59. A method of identifying a therapeutic agent or target thereof
for the treatment of a fibrotic lung disease, comprising
administering to a non-human subject a dose of a composition that
modifies transcription or translation of a sequence encoding Mucin
5B (MUC5B), Telomerase RNA Component (TERC), Family with sequence
similarity 13 member A (FAM13A), Telomerase Reverse Transcriptase
(TERT), Desmoplakin (DSP), Zinc-alpha 2-Glycoprotein 1 (AZGP1),
Oligonucleotide/oligosaccharide-binding Fold Containing 1 (OBFC1),
ATPase Phospholipid Transporting 11A (ATP11A), Isovaleryl-CoA
dehydrogenase (IVD)/Dispatched RND Transporter Family Member 2
(DISP2), Dipeptidyl Peptidase 9 (DPP9), Sialic Acid Binding Ig-Like
Lectin 14 (SIGLEC14), Adrenomedullin 2 (ADM2), Tetraspanin 5
(TSPAN5), Calcium/Calmodulin-Dependent Protein Kinase Kinase 1
(CAMKK1) or Matrix Metalloprotease-7 (MMP-7), wherein the dose of
the composition is tolerable to the non-human subject and wherein
the dose of the composition is therapeutically effective.
60. A method of identifying a therapeutic agent or target thereof
for the treatment of a fibrotic lung disease, comprising
administering to a non-human subject a composition that modifies an
activity of a product of a sequence encoding MUC5B, TERC, FAM13A,
TERT, DSP, AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2,
TSPAN5, CAMKK1 or MMP-7, wherein the dose of the composition is
tolerable to the non-human subject and wherein the dose of the
composition is therapeutically effective.
61. The method of claim 59, wherein the composition that modifies
transcription or translation decreases or inhibits transcription or
translation.
62. The method of claim 61, wherein the composition decreases or
inhibits transcription or translation of a sequence encoding a gene
selected from the group consisting of Leukotriene A4 Hydrolase
(LTA4H), Surfactant Protein B (SFTPB), Breast Cancer Anti-Estrogen
Resistance 3 (BCAR3), C-X-C motif Chemokine Ligand 13 (CXCL13), EPH
Receptor A2 (EPHA2), Serum Amyloid A1 (SAA1), Phospholipase A2
Group IIA (PLA2G2A), Insulin-Like Growth Factor Binding Protein 3
(IGFBP3), C-C Motif Chemokine Ligand 28 (CCL28), S100 Calcium
Binding Protein A12 (S100A12), Thromboxane A Synthase 1 (TBXAS1),
Leukocyte Cell Derived Chemotaxin 1 (LECT1), Complement C3 (C3),
Gastrin Releasing Peptide (GRP), C-Reactive Protein (CRP), Vitrin
(VIT), Insulin-Like Growth Factor Binding Protein 1 (IGFBP1),
Family with Sequence Similarity 173 Member A (FAM173A), Natriuretic
Peptide A (NPPA), Secreted Frizzled Related Protein 1 (SFRP1),
Ezrin (EZR), Inter-Alpha-Trypsin Inhibitor Heavy Chain Family
Member 5 (ITIH5), Pleckstrin and Sec7 Domain Containing 2 (PSD2),
Galectin 3 Binding Protein (LGALS3BP), Catenin Beta 1 (CTNNB1),
Chromodomain Y Like 2 (CDYL2), Matrix Metallopeptidase 7 (MMP7),
Apolipoprotein B (APOB), Proline and Arginine Rich End Leucine Rich
Repeat Protein (PRELP), Eukaryotic Translation Initiation Factor
1A, X-linked (EIF1AX), Mesencephalic Astrocyte Derived Neurotrophic
Factor (MANF), TNF Receptor Superfamily Member 13C (TNFRSF13C),
Deformed Epidermal Autoregulatory Factor 1 transcription factor
(DEAF1), Tumor Protein Translationally-Controlled 1 (TPT1), Unc-5
Netrin Receptor B (UNC5B), Phosphatidylethanolamine Binding Protein
1 (PEBP1), Syntaxin 8 (STX8), Polymeric Immunoglobulin Receptor
(PIGR), Adenine Phosphoribosyltransferase (APRT), Matrix
Metallopeptidase 3 (MMP3), Galectin 7 (LGALS7), Bruton Tyrosine
Kinase (BTK), NSFL1 Cofactor (NSFL1C), FER Tyrosine Kinase (FER),
Regenerating Family Member 1 Beta (REG1B), SMAD Family Member 2
(SMAD2), Interleukin 1 Receptor Like 1 (IL1RL1), C-C Motif
Chemokine Ligand 18 (CCL18), Acid Phosphatase 2 Lysosomal (ACP2),
Eukaryotic Translation Initiation Factor 4E Family Member 2
(EIF4E2), Neurexin 3 (NRXN3), IGF Like Family Member 1 (IGFL1),
NME/NM23 Nucleoside Diphosphate Kinase 1 (NME1), Potassium
Voltage-Gated Channel Isk-Related Family Member 1-Like (KCNE1L) or
Neurexophilin 2 (NXPH2).
63. The method of claim 59, wherein the composition that modifies
transcription or translation increases or activates transcription
or translation.
64. The method of claim 63, wherein the composition increases or
activates transcription or translation of a sequence encoding a
gene selected from the group consisting of Surfactant Protein D
(SFTPD), Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH), Histone
Cluster 1 H1 Family Member C (HIST1H1C), YTH Domain Containing 1
(YTHDC1), Plexin A1 (PLXNA1), Serine Peptidase Inhibitor Kazal Type
6 (SPINK6), LDL Receptor Related Protein Associated Protein 1
(LRPAP1), Secretoglobin Family 3A Member 1 (SCGB3A1), H2A Histone
Family Member Z (H2AFZ) or Chromosome 1 Open Reading Frame 162
(C1orf162).
65. The method of claim 60, wherein the composition that modifies
an activity decreases or inhibits the activity.
66. The method of claim 65, wherein the composition decreases or
inhibits the activity of a sequence encoding a gene selected from
Leukotriene A4 Hydrolase (LTA4H), Surfactant Protein B (SFTPB),
Breast Cancer Anti-Estrogen Resistance 3 (BCAR3), C-X-C motif
Chemokine Ligand 13 (CXCL13), EPH Receptor A2 (EPHA2), Serum
Amyloid A1 (SAA1), Phospholipase A2 Group IIA (PLA2G2A),
Insulin-Like Growth Factor Binding Protein 3 (IGFBP3), C-C Motif
Chemokine Ligand 28 (CCL28), S100 Calcium Binding Protein A12
(S100A12), Thromboxane A Synthase 1 (TBXAS1), Leukocyte Cell
Derived Chemotaxin 1 (LECT1), Complement C3 (C3), Gastrin Releasing
Peptide (GRP), C-Reactive Protein (CRP), Vitrin (VIT), Insulin-Like
Growth Factor Binding Protein 1 (IGFBP1), Family with Sequence
Similarity 173 Member A (FAM173A), Natriuretic Peptide A (NPPA),
Secreted Frizzled Related Protein 1 (SFRP1), Ezrin (EZR),
Inter-Alpha-Trypsin Inhibitor Heavy Chain Family Member 5 (ITIH5),
Pleckstrin and Sec7 Domain Containing 2 (PSD2), Galectin 3 Binding
Protein (LGALS3BP), Catenin Beta 1 (CTNNB1), Chromodomain Y Like 2
(CDYL2), Matrix Metallopeptidase 7 (MMP7), Apolipoprotein B (APOB),
Proline and Arginine Rich End Leucine Rich Repeat Protein (PRELP),
Eukaryotic Translation Initiation Factor 1A, X-linked (EIF1AX),
Mesencephalic Astrocyte Derived Neurotrophic Factor (MANF), TNF
Receptor Superfamily Member 13C (TNFRSF13C), Deformed Epidermal
Autoregulatory Factor 1 transcription factor (DEAF1), Tumor Protein
Translationally-Controlled 1 (TPT1), Unc-5 Netrin Receptor B
(UNC5B), Phosphatidylethanolamine Binding Protein 1 (PEBP1),
Syntaxin 8 (STX8), Polymeric Immunoglobulin Receptor (PIGR),
Adenine Phosphoribosyltransferase (APRT), Matrix Metallopeptidase 3
(MMP3), Galectin 7 (LGALS7), Bruton Tyrosine Kinase (BTK), NSFL1
Cofactor (NSFL1C), FER Tyrosine Kinase (FER), Regenerating Family
Member 1 Beta (REG1B), SMAD Family Member 2 (SMAD2), Interleukin 1
Receptor Like 1 (IL1RL1), C-C Motif Chemokine Ligand 18 (CCL18),
Acid Phosphatase 2 Lysosomal (ACP2), Eukaryotic Translation
Initiation Factor 4E Family Member 2 (EIF4E2), Neurexin 3 (NRXN3),
IGF Like Family Member 1 (IGFL1), NME/NM23 Nucleoside Diphosphate
Kinase 1 (NME1), Potassium Voltage-Gated Channel Isk-Related Family
Member 1-Like (KCNE1L) or Neurexophilin 2 (NXPH2).
67. The method of claim 60, wherein the composition that modifies
an activity increases or activates the activity.
68. The method of claim 67, wherein the composition increases or
activates the activity of a sequence encoding Surfactant Protein D
(SFTPD), Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH), Histone
Cluster 1 H1 Family Member C (HIST1H1C), YTH Domain Containing 1
(YTHDC1), Plexin A1 (PLXNA1), Serine Peptidase Inhibitor Kazal Type
6 (SPINK6), LDL Receptor Related Protein Associated Protein 1
(LRPAP1), Secretoglobin Family 3A Member 1 (SCGB3A1), H2A Histone
Family Member Z (H2AFZ) or Chromosome 1 Open Reading Frame 162
(C1orf162).
69. The method of any one of claims 60-68, wherein the non-human
subject is a mammal.
70. The method of any one of claims 60-69, wherein the mammal is
genetically-modified.
71. The method of claim 70, wherein the genetically-modified mammal
is a model organism for the fibrotic lung disease.
72. The method of any one of claims 60-71, wherein the fibrotic
lung disease is pulmonary fibrosis, idiopathic pulmonary fibrosis
(IPF), an interstitial lung abnormality (ILA), or an asymptomatic
ILA.
73. The method of any one of claims 60-71, wherein the fibrotic
lung disease is pulmonary fibrosis or IPF.
74. The method of any one of claims 60-71, wherein the fibrotic
lung disease is IPF.
75. The method of any one of claims 60-74, wherein the non-human
subject carries a mutation in a sequence encoding MUC5B.
76. The method of claim 75, wherein the mutation comprises a
polymorphism in a sequence encoding a MUC5B promoter.
77. The method of claim 76, wherein the polymorphism is
rs35705950.
78. The method of any one of claims 60-77, wherein the non-human
subject carries a mutation in a sequence encoding TERC, FAM13A,
TERT, DSP, AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2,
TSPAN5, CAMKK1 or MMP-7.
79. The method of any one of claims 60-78, wherein the composition
prevents the onset or development of a sign or symptom of the
fibrotic lung disease.
80. The method of any one of claims 60-78, wherein the composition
delays the onset or development of a sign or symptom of the
fibrotic lung disease when compared to the expected onset of the
sign or symptom in the absence of treatment with the
composition.
81. The method of claim 80, wherein the composition delays the
onset or development of a sign or symptom of the fibrotic lung
disease when compared to the expected onset of the sign or symptom
when treated using a standard therapeutic intervention.
82. The method of any one of claims 60-78, wherein the composition
reduces the severity of a sign or symptom of the fibrotic lung
disease when compared to the expected severity of the sign or
symptom in the absence of treatment with the composition.
83. The method of claim 24, wherein the composition reduces the
severity of a sign or symptom of the fibrotic lung disease when
compared to the expected severity of the sign or symptom when
treated using a standard therapeutic intervention.
84. The method of claim 81 or 83, wherein the standard therapeutic
intervention comprises a N-acetylcysteine, pirfenidone, and
nintedanib.
85. The method of claim 81 or 83, wherein the standard therapeutic
intervention comprises pirfenidone.
86. The method of claim 85, wherein an effective dosage of
pirfenidone is about 2400 mg/day.
87. The method of claim 86, wherein the effective dosage is
administered orally as a capsule or a tablet.
88. The method of claim 86 or 87, wherein the effective dosage is
administered three times per day.
89. The method of any one of claims 85-88, wherein the effective
dosage is administered according to an escalating dosage
regimen.
90. The method of claim 85, 86 or 89, wherein the escalating dosage
regimen comprises (a) administering to the non-human subject about
800 mg of pirfenidone per day for a first week; (b) administering
to the non-human subject about 1600 mg of pirfenidone per day for a
second week; and (c) administering to the non-human subject about
2400 mg of pirfenidone per day for the remainder of the
treatment.
91. The method of claim 85, 86 or 89, wherein the escalating dosage
regimen comprises (a) administering to the non-human subject a
capsule or tablet comprising about 250 mg of pirfenidone three
times a day for a first week; (b) administering to the non-human
subject two capsules or tablets comprising about 250 mg of
pirfenidone three times a day for a second week; and (c)
administering to the non-human subject three capsules or tablets
comprising about 250 mg of pirfenidone three times a day for the
remainder of the treatment.
92. The method of claim 91, wherein the capsule or tablet comprises
267 mg of pirfenidone.
93. The method of claim 81 or 83, wherein the standard therapeutic
intervention comprises nintedanib.
94. The method of claim 93, wherein an effective dosage of
nintedanib is administered orally as a capsule or a tablet.
95. The method of claim 94, wherein the effective dosage is about
300 mg/day.
96. The method of claim 94 or 95, wherein the effective dosage is
about 150 mg administered twice per day, wherein the daily doses
are administered about 12 hours apart from one another.
97. The method of claim 94, wherein the effective dosage is about
200 mg/day.
98. The method of claim 94 or 95, wherein the effective dosage is
about 100 mg administered twice per day, wherein the daily doses
are administered about 12 hours apart from one another.
99. The method of any one of claims 79-98, wherein the non-human
subject presents at least one sign of the fibrotic lung
disease.
100. The method of claim 99, wherein the at least one sign
comprises gradual or unintended weight loss, clubbing of the
fingers or toes, rapid and shallow breathing, fibrotic lesions in
one or both lungs detectable by radiography, or a cough.
101. The method of any one of claims 60-100, wherein the compound
prevents the onset of a secondary condition associated with a
severe form of the fibrotic lung disease.
102. The method of claim 100, wherein the compound prevents the
onset for at least 1 year, 2 years, 3 years, 4 years, 5 years or any
whole or fractional number of years in between.
103. The method of claim 101 or 102, wherein the secondary condition
comprises a collapsed lung, an infected lung, a blood clot in a
lung, lung cancer, respiratory failure, pulmonary hypertension,
heart failure or death.
104. A composition for the treatment of a fibrotic lung disease
identified by the method of any one of claims 60-103.
105. A method of treating a fibrotic lung disease in a human
subject comprising administering to the subject the composition of
claim 104, wherein the subject is asymptomatic and wherein the
subject is at risk of developing the fibrotic lung disease.
106. The method of claim 105, wherein the human subject presents
radiographic Usual Interstitial Pneumonia (UIP).
107. The method of claim 105 or 106, wherein the human subject has
fibrotic interstitial lung disease (FILD).
108. The method of any one of claims 105-107, wherein the human
subject has a blood relative with familial interstitial pneumonia
(FIP).
109. The method of claim 108, wherein the blood relative is a
sibling.
110. The method of any one of claims 105-109, wherein the human
subject has a mutation in a sequence encoding MUC5B, TERC, FAM13A,
TERT, DSP, AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2,
TSPAN5, CAMKK1 or MMP-7.
111. The method of claim 110, wherein the mutation comprises a
polymorphism in a sequence encoding a MUC5B promoter.
112. The method of claim 111, wherein the polymorphism is
rs35705950.
113. The method of any one of claims 105-112, wherein the fibrotic
lung disease is pulmonary fibrosis, idiopathic pulmonary fibrosis
(IPF), an interstitial lung abnormality (ILA), or an asymptomatic
ILA.
114. The method of any one of claims 105-112, wherein the fibrotic
lung disease is pulmonary fibrosis or IPF.
115. The method of any one of claims 105-112, wherein the fibrotic
lung disease is IPF.
116. The method of any one of claims 105-112, wherein the method
prevents the onset of a secondary condition associated with a
severe form of the fibrotic lung disease.
117. The method of claim 116, wherein the secondary condition comprises
a collapsed lung, an infected lung, a blood clot in a lung, lung
cancer, respiratory failure, pulmonary hypertension, heart failure
or death.
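The escalating pirfenidone regimen recited in claims 39-41 reduces to simple arithmetic, which the following sketch works through. It is a minimal illustration only: the 267 mg unit strength and the three-times-a-day frequency come from claims 40 and 41, while every name in the code (`Phase`, `SCHEDULE`, `daily_totals`) is a hypothetical label introduced here and does not appear in the application.

```python
from dataclasses import dataclass

# Values taken from claims 40-41; all identifiers are illustrative assumptions.
CAPSULE_MG = 267      # claim 41: each capsule or tablet comprises 267 mg of pirfenidone
DOSES_PER_DAY = 3     # claim 40: administered three times a day


@dataclass(frozen=True)
class Phase:
    label: str
    units_per_dose: int  # capsules or tablets taken at each administration

    def daily_mg(self) -> int:
        # Daily total = units per dose x doses per day x mg per unit.
        return self.units_per_dose * DOSES_PER_DAY * CAPSULE_MG


# Claim 40: one unit per dose in week one, two in week two, three thereafter.
SCHEDULE = [
    Phase("first week", 1),
    Phase("second week", 2),
    Phase("remainder of treatment", 3),
]


def daily_totals(schedule=SCHEDULE):
    """Return (phase label, total mg per day) for each phase of the schedule."""
    return [(phase.label, phase.daily_mg()) for phase in schedule]


if __name__ == "__main__":
    for label, mg in daily_totals():
        print(f"{label}: {mg} mg/day")
```

With 267 mg units the three phases total 801, 1602 and 2403 mg/day, which lines up with the "about 800", "about 1600" and "about 2400" mg per day stated in claim 39.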
Description
RELATED APPLICATIONS
[0001] This application claims the benefit of provisional
application U.S. Ser. No. 62/525,087, filed Jun. 26, 2017, and U.S.
Ser. No. 62/525,088, filed Jun. 26, 2017, the contents of each of
which are herein incorporated by reference in their entirety.
INCORPORATION OF SEQUENCE LISTING
[0002] The contents of the text file named
"UNCO-018_001WO_SeqList_ST25.txt," which was created on Jun. 21,
2018 and is 418 KB in size, are hereby incorporated by reference in
their entirety.
FIELD OF THE DISCLOSURE
[0003] The disclosure is directed to molecular biology, genetics,
and therapeutics for fibrotic lung disease.
BACKGROUND
[0004] Fibrotic pulmonary diseases are progressive and
irreversible. Standard therapies are merely palliative as they cannot
address the underlying disease mechanism once the subject has
progressed to a point at which symptoms are present. Thus, there is
a long-felt but unmet need in the field for a method of treating
asymptomatic subjects as well as those who are at risk of
developing fibrotic pulmonary diseases to prevent onset of the
disease, delay onset of the disease, or reduce the severity of
disease symptoms. The methods of the disclosure provide a
preventative or efficacious treatment, as opposed to a merely
palliative treatment, for asymptomatic subjects as well as those
subjects at risk of developing the disease.
SUMMARY
[0005] The disclosure provides a method of treating a fibrotic lung
disease in a subject comprising administering to the subject an
effective amount of a therapeutic agent, wherein the subject is
asymptomatic and wherein the subject is at risk of developing the
fibrotic lung disease.
[0006] In some embodiments of the methods of the disclosure, the
subject presents radiographic Usual Interstitial Pneumonia (UIP).
In some embodiments, the subject has fibrotic interstitial lung
disease (FILD). In some embodiments, the subject has a blood
relative with familial interstitial pneumonia (FIP). In some
embodiments, including those embodiments wherein the subject has a
blood relative with familial interstitial pneumonia (FIP), the
blood relative is a sibling. Alternatively, or in addition, in some
embodiments, the subject has a mutation in a sequence encoding
Mucin 5B (MUC5B), Telomerase RNA Component (TERC), Family with
sequence similarity 13 member A (FAM13A), Telomerase Reverse
Transcriptase (TERT), Desmoplakin (DSP), Zinc-alpha 2-Glycoprotein
1 (AZGP1), Oligonucleotide/oligosaccharide-binding Fold Containing
1 (OBFC1), ATPase Phospholipid Transporting 11A (ATP11A),
Isovaleryl-CoA dehydrogenase (IVD)/Dispatched RND Transporter
Family Member 2 (DISP2), Dipeptidyl Peptidase 9 (DPP9), Sialic Acid
Binding Ig-Like Lectin 14 (SIGLEC14), Adrenomedullin 2 (ADM2),
Tetraspanin 5 (TSPAN5), Calcium/Calmodulin-Dependent Protein Kinase
Kinase 1 (CAMKK1), zinc finger with KRAB and SCAN domains 1 (ZKSCAN1),
isovaleryl-CoA dehydrogenase (IVD), ATPase phospholipid
transporting 11A (AK025511) or Matrix Metalloprotease-7
(MMP-7).
[0007] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding MUC5B, TERC, FAM13A,
TERT, DSP, AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2,
TSPAN5, CAMKK1 or MMP-7.
[0008] In some embodiments of the methods of the disclosure, the
subject has a mutation in a nucleic acid or amino acid sequence
encoding a gene or gene product that is upregulated in a subject
having a fibrotic pulmonary disease of the disclosure. In some
embodiments of the methods of the disclosure, the subject has a
mutation in a nucleic acid or amino acid sequence encoding
Leukotriene A4 Hydrolase (LTA4H), Surfactant Protein B (SFTPB),
Breast Cancer Anti-Estrogen Resistance 3 (BCAR3), C-X-C motif
Chemokine Ligand 13 (CXCL13), EPH Receptor A2 (EPHA2), Serum
Amyloid A1 (SAA1), Phospholipase A2 Group IIA (PLA2G2A),
Insulin-Like Growth Factor Binding Protein 3 (IGFBP3), C-C Motif
Chemokine Ligand 28 (CCL28), S100 Calcium Binding Protein A12
(S100A12), Thromboxane A Synthase 1 (TBXAS1), Leukocyte Cell
Derived Chemotaxin 1 (LECT1), Complement C3 (C3), Gastrin Releasing
Peptide (GRP), C-Reactive Protein (CRP), Vitrin (VIT), Insulin-Like
Growth Factor Binding Protein 1 (IGFBP1), Family with Sequence
Similarity 173 Member A (FAM173A), Natriuretic Peptide A (NPPA),
Secreted Frizzled Related Protein 1 (SFRP1), Ezrin (EZR),
Inter-Alpha-Trypsin Inhibitor Heavy Chain Family Member 5 (ITIH5),
Pleckstrin and Sec7 Domain Containing 2 (PSD2), Galectin 3 Binding
Protein (LGALS3BP), Catenin Beta 1 (CTNNB1), Chromodomain Y Like 2
(CDYL2), Matrix Metallopeptidase 7 (MMP7), Apolipoprotein B (APOB),
Proline and Arginine Rich End Leucine Rich Repeat Protein (PRELP),
Eukaryotic Translation Initiation Factor 1A, X-linked (EIF1AX),
Mesencephalic Astrocyte Derived Neurotrophic Factor (MANF), TNF
Receptor Superfamily Member 13C (TNFRSF13C), Deformed Epidermal
Autoregulatory Factor 1 transcription factor (DEAF1), Tumor Protein
Translationally-Controlled 1 (TPT1), Unc-5 Netrin Receptor B
(UNC5B), Phosphatidylethanolamine Binding Protein 1 (PEBP1),
Syntaxin 8 (STX8), Polymeric Immunoglobulin Receptor (PIGR),
Adenine Phosphoribosyltransferase (APRT), Matrix Metallopeptidase 3
(MMP3), Galectin 7 (LGALS7), Bruton Tyrosine Kinase (BTK), NSFL1
Cofactor (NSFL1C), FER Tyrosine Kinase (FER), Regenerating Family
Member 1 Beta (REG1B), SMAD Family Member 2 (SMAD2), Interleukin 1
Receptor Like 1 (IL1RL1), C-C Motif Chemokine Ligand 18 (CCL18),
Acid Phosphatase 2 Lysosomal (ACP2), Eukaryotic Translation
Initiation Factor 4E Family Member 2 (EIF4E2), Neurexin 3 (NRXN3),
IGF Like Family Member 1 (IGFL1), NME/NM23 Nucleoside Diphosphate
Kinase 1 (NME1), Potassium Voltage-Gated Channel Isk-Related Family
Member 1-Like (KCNE1L) or Neurexophilin 2 (NXPH2).
[0009] In some embodiments of the methods of the disclosure, the
subject has a mutation in a nucleic acid or amino acid sequence
encoding a gene or gene product that is downregulated in a subject
having a fibrotic pulmonary disease of the disclosure. In some
embodiments of the methods of the disclosure, the subject has a
mutation in a nucleic acid or amino acid sequence encoding
Surfactant Protein D (SFTPD), Glyceraldehyde-3-Phosphate
Dehydrogenase (GAPDH), Histone Cluster 1 H1 Family Member C
(HIST1H1C), YTH Domain Containing 1 (YTHDC1), Plexin A1 (PLXNA1),
Serine Peptidase Inhibitor Kazal Type 6 (SPINK6), LDL Receptor
Related Protein Associated Protein 1 (LRPAP1), Secretoglobin Family
3A Member 1 (SCGB3A1), H2A Histone Family Member Z (H2AFZ) or
Chromosome 1 Open Reading Frame 162 (C1orf162).
[0010] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding MUC5B. In some
embodiments, the mutation is a polymorphism in a sequence encoding
a MUC5B promoter. In some embodiments, the polymorphism is
rs35705950 comprising (SEQ ID NO: 7).
[0011] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding TERC. In some
embodiments, the mutation is a polymorphism in a sequence encoding
TERC or a regulatory sequence thereof. In some embodiments the
polymorphism is rs6793295 comprising (SEQ ID NO: 1).
[0012] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding intronic
FAM13A. In some embodiments, the mutation is a polymorphism in a
sequence encoding intronic FAM13A or a regulatory sequence thereof.
In some embodiments, the polymorphism is rs2609260.
[0013] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding intronic TERT.
In some embodiments, the mutation is a polymorphism in a sequence
encoding intronic TERT or a regulatory sequence thereof. In some
embodiments, the polymorphism is rs4449583.
[0014] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding intronic DSP.
In some embodiments, the mutation is a polymorphism in a sequence
encoding intronic DSP or a regulatory sequence thereof. In some
embodiments, the polymorphism is rs2076295.
[0015] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding intronic
ZKSCAN1. In some embodiments, the mutation is a polymorphism in a
sequence encoding intronic ZKSCAN1 or a regulatory sequence
thereof. In some embodiments, the polymorphism is rs6963345.
[0016] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding intronic OBFC1.
In some embodiments, the mutation is a polymorphism in a sequence
encoding intronic OBFC1 or a regulatory sequence thereof. In some
embodiments, the polymorphism is rs2488000.
[0017] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding an AK025511 3'
UTR. In some embodiments, the mutation is a polymorphism in a
sequence encoding an AK025511 3' UTR or a regulatory sequence
thereof. In some embodiments, the polymorphism is rs1278769.
[0018] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding IVD. In some
embodiments, the mutation is a polymorphism in a sequence encoding
intronic IVD or a regulatory sequence thereof. In some embodiments,
the polymorphism is rs35700143.
[0019] In some embodiments of the methods of the disclosure, the
human subject has a mutation in a sequence encoding intronic DPP9.
In some embodiments, the mutation is a polymorphism in a sequence
encoding intronic DPP9 or a regulatory sequence thereof. In some
embodiments, the polymorphism is rs12610495.
[0020] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding FAM13A. In some
embodiments, the mutation is a polymorphism in a sequence encoding
FAM13A or a regulatory sequence thereof. In some embodiments the
polymorphism is rs2609255 comprising (SEQ ID NO: 2).
[0021] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding TERT. In some
embodiments, the mutation is a polymorphism in a sequence encoding
TERT or a regulatory sequence thereof. In some embodiments the
polymorphism is rs2736100 comprising (SEQ ID NO: 3).
[0022] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding DSP. In some
embodiments, the mutation is a polymorphism in a sequence encoding
DSP or a regulatory sequence thereof. In some embodiments the
polymorphism is rs2076295 comprising (SEQ ID NO: 4).
[0023] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding AZGP1. In some
embodiments, the mutation is a polymorphism in a sequence encoding
AZGP1 or a regulatory sequence thereof. In some embodiments the
polymorphism is rs4727443 comprising (SEQ ID NO: 5).
[0024] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding OBFC1. In some
embodiments, the mutation is a polymorphism in a sequence encoding
OBFC1 or a regulatory sequence thereof. In some embodiments the
polymorphism is rs11191865 comprising (SEQ ID NO: 6).
[0025] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding ATP11A. In some
embodiments, the mutation is a polymorphism in a sequence encoding
ATP11A or a regulatory sequence thereof. In some embodiments the
polymorphism is rs12787690 comprising (SEQ ID NO: 8).
[0026] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding IVD/DISP2. In some
embodiments, the mutation is a polymorphism in a sequence encoding
IVD/DISP2 or a regulatory sequence thereof. In some embodiments the
polymorphism is rs2034650 comprising (SEQ ID NO: 9).
[0027] In some embodiments of the methods of the disclosure, the
subject has a mutation in a sequence encoding DPP9. In some
embodiments, the mutation is a polymorphism in a sequence encoding
DPP9 or a regulatory sequence thereof. In some embodiments the
polymorphism is rs12610495 comprising (SEQ ID NO: 10).
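For orientation, the polymorphisms that the preceding paragraphs associate with a SEQ ID NO can be collected into a single lookup table. The sketch below is illustrative only: the table is transcribed from paragraphs [0010], [0011] and [0020] to [0027], while the dictionary layout and the helper name loci_for are assumptions introduced for the example and are not part of the disclosure.

```python
# (locus, rsID) keyed by SEQ ID NO, transcribed from the paragraphs above.
POLYMORPHISMS = {
    1: ("TERC", "rs6793295"),
    2: ("FAM13A", "rs2609255"),
    3: ("TERT", "rs2736100"),
    4: ("DSP", "rs2076295"),
    5: ("AZGP1", "rs4727443"),
    6: ("OBFC1", "rs11191865"),
    7: ("MUC5B", "rs35705950"),
    8: ("ATP11A", "rs12787690"),
    9: ("IVD/DISP2", "rs2034650"),
    10: ("DPP9", "rs12610495"),
}


def loci_for(rsids):
    """Hypothetical helper: return the loci whose listed rsID is in `rsids`."""
    wanted = set(rsids)
    return sorted(locus for locus, rsid in POLYMORPHISMS.values() if rsid in wanted)


# Example: a subject genotyped as carrying two of the listed variants.
print(loci_for(["rs35705950", "rs2736100"]))
```

The table covers only the variants given a SEQ ID NO; the intronic variants of paragraphs [0012] to [0019] are not included.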
[0028] In some embodiments of the methods of the disclosure, the
fibrotic lung disease is pulmonary fibrosis, idiopathic pulmonary
fibrosis (IPF), an interstitial lung abnormality (ILA), or an
asymptomatic ILA. In some embodiments, the fibrotic lung disease is
pulmonary fibrosis or IPF. In some embodiments, the fibrotic lung
disease is IPF.
[0029] In some embodiments of the methods of the disclosure, the
therapeutic agent comprises N-acetylcysteine, pirfenidone, and
nintedanib.
[0030] In some embodiments of the methods of the disclosure, the
therapeutic agent comprises pirfenidone. In some embodiments, the
effective dosage is administered orally as a capsule or a tablet.
In some embodiments, including those embodiments wherein the
therapeutic agent comprises pirfenidone, the effective dosage is
about 2400 mg/day. In some embodiments, the effective dosage is
administered according to an escalating dosage regimen. In some
embodiments, including those embodiments wherein the therapeutic
agent comprises pirfenidone, the escalating dosage regimen
comprises (a) administering to the subject about 800 mg of
pirfenidone per day for a first week; (b) administering to the
subject about 1600 mg of pirfenidone per day for a second week; and
(c) administering to the subject about 2400 mg of pirfenidone per
day for the remainder of the treatment. In some embodiments,
including those embodiments wherein the therapeutic agent comprises
pirfenidone, the escalating dosage regimen comprises (a)
administering to the subject a capsule or tablet comprising about
250 mg of pirfenidone three times a day for a first week; (b)
administering to the subject two capsules or tablets comprising
about 250 mg of pirfenidone three times a day for a second week;
and (c) administering to the subject three capsules or tablets
comprising about 250 mg of pirfenidone three times a day for the
remainder of the treatment. In some embodiments of the escalating
dosage regimen, the capsule or tablet comprises 267 mg of
pirfenidone.
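The arithmetic behind the escalating regimen can be checked with a short sketch. It is illustrative only and is not dosing guidance: the function names and the week-numbering convention are assumptions made for the example, and the 267 mg strength is the capsule or tablet strength named in the paragraph above.

```python
# Illustrative arithmetic only; not dosing guidance.
CAPSULE_MG = 267      # capsule/tablet strength named in the text
DOSES_PER_DAY = 3     # "three times a day"


def capsules_per_dose(week: int) -> int:
    """Capsules per administration: 1 in week 1, 2 in week 2, 3 thereafter."""
    if week < 1:
        raise ValueError("week numbering starts at 1")
    return min(week, 3)


def daily_dose_mg(week: int, capsule_mg: int = CAPSULE_MG) -> int:
    """Total pirfenidone per day for the given treatment week."""
    return capsules_per_dose(week) * DOSES_PER_DAY * capsule_mg


for week in (1, 2, 3, 10):
    print(week, daily_dose_mg(week))
```

With 267 mg units the daily totals are 801 mg, 1602 mg and 2403 mg, which is consistent with the "about 800 mg", "about 1600 mg" and "about 2400 mg" steps recited above.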
[0031] In some embodiments of the methods of the disclosure, the
therapeutic agent comprises nintedanib. In some embodiments, the
effective dosage is administered orally as a capsule or a tablet.
In some embodiments, including those embodiments wherein the
therapeutic agent comprises nintedanib, the effective dosage is
about 300 mg/day. In some embodiments, the effective dosage is
about 150 mg administered twice per day, wherein the daily doses
are administered about 12 hours apart from one another. In some
embodiments, including those embodiments wherein the therapeutic
agent comprises nintedanib, the effective dosage is about 200
mg/day. In some embodiments, the effective dosage is about 100 mg
administered twice per day, wherein the daily doses are
administered about 12 hours apart from one another. In some
embodiments, including those embodiments wherein the therapeutic
agent comprises nintedanib, the effective dosage is administered
according to a modified or interrupted dosage regimen. In some
embodiments, the modified or interrupted dosage regimen comprises
(a) administering to the subject about 300 mg of nintedanib per day
until the subject presents an elevated level of liver enzymes
compared to a control level of liver enzymes; (b) administering to
the subject about 200 mg of nintedanib per day until the subject
presents the control level of liver enzymes; and (c) administering
to the subject about 300 mg of nintedanib per day for the remainder
of the treatment; wherein the control level of liver enzymes is a
level detected in the subject prior to an initiation of the
treatment. In some embodiments, including those embodiments wherein
the therapeutic agent comprises nintedanib, the modified or
interrupted regimen comprises (a) administering to the subject a
capsule or tablet comprising about 150 mg of nintedanib twice per
day until the subject presents an elevated level of liver enzymes
compared to a control level of liver enzymes; (b) administering to
the subject two capsules or tablets comprising about 100 mg twice
per day until the subject presents an elevated level of liver
enzymes compared to a control level of liver enzymes; and (c)
administering to the subject a capsule or tablet comprising about
150 mg of nintedanib twice per day for the remainder of the
treatment; wherein the control level of liver enzymes is a level
detected in the subject prior to an initiation of the
treatment.
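The three-step modified regimen, in the form first recited above (300 mg/day, then 200 mg/day, then 300 mg/day), can be read as a small state machine. The sketch below is an illustration under stated assumptions, not dosing guidance: it assumes a single numeric liver-enzyme reading per assessment, treats any reading above the pre-treatment control level as "elevated", and uses hypothetical names (daily_doses, levels, control_level) that do not appear in the disclosure.

```python
from typing import Iterable, List

FULL_MG_PER_DAY = 300      # steps (a) and (c)
REDUCED_MG_PER_DAY = 200   # step (b)


def daily_doses(levels: Iterable[float], control_level: float) -> List[int]:
    """Return the daily dose selected at each liver-enzyme assessment.

    levels: hypothetical enzyme readings, one per assessment.
    control_level: the subject's reading before treatment began.
    """
    phase = "a"
    doses = []
    for level in levels:
        if phase == "a" and level > control_level:
            phase = "b"   # elevated relative to control: step down
        elif phase == "b" and level <= control_level:
            phase = "c"   # back at control: resume for the remainder
        doses.append(FULL_MG_PER_DAY if phase in ("a", "c") else REDUCED_MG_PER_DAY)
    return doses


print(daily_doses([40, 40, 90, 80, 40, 95], control_level=40))
```

Once step (c) is reached the sketch keeps the dose at 300 mg/day regardless of later readings, mirroring the phrase "for the remainder of the treatment"; a real protocol would need clinically defined thresholds.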
[0032] In some embodiments of the methods of the disclosure, the
therapeutic agent prevents the onset or development of a sign or
symptom of the fibrotic lung disease.
[0033] In some embodiments of the methods of the disclosure, the
therapeutic agent delays the onset or development of a sign or
symptom of the fibrotic lung disease when compared to the expected
onset of the sign or symptom in the absence of treatment with the
therapeutic agent.
[0034] In some embodiments of the methods of the disclosure, the
therapeutic agent reduces the severity of a sign or symptom of the
fibrotic lung disease when compared to the expected severity of the
sign or symptom in the absence of treatment with the therapeutic
agent.
[0036] In some embodiments of the methods of the disclosure, the at
least one sign of the fibrotic lung disease is detectable before
the subject presents a symptom of the fibrotic lung disease. In
some embodiments, the at least one sign comprises gradual or
unintended weight loss, clubbing of the fingers or toes, rapid and
shallow breathing, fibrotic lesions in one or both lungs detectable
by radiography, or a cough. In some embodiments, the symptom
comprises shortness of breath during exercise, shortness of breath
at rest, a dry and hacking cough, repeated bouts of coughing, and
uncontrollable bouts of coughing.
[0037] In some embodiments of the methods of the disclosure, the
method prevents the onset of a secondary condition associated with
a severe form of the fibrotic lung disease. In some embodiments, a
secondary condition comprises a collapsed lung, an infected lung, a
blood clot in a lung, lung cancer, respiratory failure, pulmonary
hypertension, heart failure or death.
[0038] The disclosure provides a method of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease, comprising administering to a non-human subject a
dose of a composition that modifies transcription or translation of
a sequence encoding Mucin 5B (MUC5B), Telomerase RNA Component
(TERC), Family with sequence similarity 13 member A (FAM13A),
Telomerase Reverse Transcriptase (TERT), Desmoplakin (DSP),
Zinc-alpha 2-Glycoprotein 1 (AZGP1),
Oligonucleotide/oligosaccharide-binding Fold Containing 1 (OBFC1),
ATPase Phospholipid Transporting 11A (ATP11A), Isovaleryl-CoA
dehydrogenase (IVD)/Dispatched RND Transporter Family Member 2
(DISP2), Dipeptidyl Peptidase 9 (DPP9), Sialic Acid Binding Ig-Like
Lectin 14 (SIGLEC14), Adrenomedullin 2 (ADM2), Tetraspanin 5
(TSPAN5), Calcium/Calmodulin-Dependent Protein Kinase Kinase 1
(CAMKK1) or Matrix Metalloprotease-7 (MMP-7), wherein the dose of
the composition is tolerable to the non-human subject and wherein
the dose of the composition is therapeutically effective.
[0039] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the method comprises
administering to a non-human subject a
composition that modifies an activity of a product of a sequence
encoding MUC5B, TERC, FAM13A, TERT, DSP, AZGP1, OBFC1, ATP11A,
IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or MMP-7, wherein
the dose of the composition is tolerable to the non-human subject
and wherein the dose of the composition is therapeutically
effective.
[0040] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition that modifies
transcription or translation decreases or inhibits transcription or
translation.
[0041] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition decreases or
inhibits transcription or translation of a sequence encoding a gene
selected from the group consisting of Leukotriene A4 Hydrolase
(LTA4H), Surfactant Protein B (SFTPB), Breast Cancer Anti-Estrogen
Resistance 3 (BCAR3), C-X-C motif Chemokine Ligand 13 (CXCL13),
EPH Receptor A2 (EPHA2), Serum Amyloid A1 (SAA1), Phospholipase A2
Group IIA (PLA2G2A), Insulin-Like Growth Factor Binding Protein 3
(IGFBP3), C-C Motif Chemokine Ligand 28 (CCL28), S100 Calcium
Binding Protein A12 (S100A12), Thromboxane A Synthase 1 (TBXAS1),
Leukocyte Cell Derived Chemotaxin 1 (LECT1), Complement C3 (C3),
Gastrin Releasing Peptide (GRP), C-Reactive Protein (CRP), Vitrin
(VIT), Insulin-Like Growth Factor Binding Protein 1 (IGFBP1),
Family with Sequence Similarity 173 Member A (FAM173A), Natriuretic
Peptide A (NPPA), Secreted Frizzled Related Protein 1 (SFRP1),
Ezrin (EZR), Inter-Alpha-Trypsin Inhibitor Heavy Chain Family
Member 5 (ITIH5), Pleckstrin and Sec7 Domain Containing 2 (PSD2),
Galectin 3 Binding Protein (LGALS3BP), Catenin Beta 1 (CTNNB1),
Chromodomain Y Like 2 (CDYL2), Matrix Metallopeptidase 7 (MMP7),
Apolipoprotein B (APOB), Proline and Arginine Rich End Leucine Rich
Repeat Protein (PRELP), Eukaryotic Translation Initiation Factor
1A, X-linked (EIF1AX), Mesencephalic Astrocyte Derived Neurotrophic
Factor (MANF), TNF Receptor Superfamily Member 13C (TNFRSF13C),
Deformed Epidermal Autoregulatory Factor 1 transcription factor
(DEAF1), Tumor Protein Translationally-Controlled 1 (TPT1), Unc-5
Netrin Receptor B (UNC5B), Phosphatidylethanolamine Binding Protein
1 (PEBP1), Syntaxin 8 (STX8), Polymeric Immunoglobulin Receptor
(PIGR), Adenine Phosphoribosyltransferase (APRT), Matrix
Metallopeptidase 3 (MMP3), Galectin 7 (LGALS7), Bruton Tyrosine
Kinase (BTK), NSFL1 Cofactor (NSFL1C), FER Tyrosine Kinase (FER),
Regenerating Family Member 1 Beta (REG1B), SMAD Family Member 2
(SMAD2), Interleukin 1 Receptor Like 1 (IL1RL1), C-C Motif
Chemokine Ligand 18 (CCL18), Acid Phosphatase 2 Lysosomal (ACP2),
Eukaryotic Translation Initiation Factor 4E Family Member 2
(EIF4E2), Neurexin 3 (NRXN3), IGF Like Family Member 1 (IGFL1),
NME/NM23 Nucleoside Diphosphate Kinase 1 (NME1), Potassium
Voltage-Gated Channel Isk-Related Family Member 1-Like (KCNE1L) or
Neurexophilin 2 (NXPH2).
[0042] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition that modifies
transcription or translation increases or activates transcription
or translation.
[0043] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition increases or
activates transcription or translation of a sequence encoding a
gene selected from the group consisting of Surfactant Protein D
(SFTPD), Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH), Histone
Cluster 1 H1 Family Member C (HIST1H1C), YTH Domain Containing 1
(YTHDC1), Plexin A1 (PLXNA1), Serine Peptidase Inhibitor Kazal Type
6 (SPINK6), LDL Receptor Related Protein Associated Protein 1
(LRPAP1), Secretoglobin Family 3A Member 1 (SCGB3A1), H2A Histone
Family Member Z (H2AFZ) or Chromosome 1 Open Reading Frame 162
(C1orf162).
[0044] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition that modifies an
activity decreases or inhibits the activity.
[0045] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition decreases or
inhibits the activity of a sequence encoding a gene selected from
Leukotriene A4 Hydrolase (LTA4H), Surfactant Protein B (SFTPB),
Breast Cancer Anti-Estrogen Resistance 3 (BCAR3), C-X-C motif
Chemokine Ligand 13 (CXCL13), EPH Receptor A2 (EPHA2), Serum
Amyloid A1 (SAA1), Phospholipase A2 Group IIA (PLA2G2A),
Insulin-Like Growth Factor Binding Protein 3 (IGFBP3), C-C Motif
Chemokine Ligand 28 (CCL28), S100 Calcium Binding Protein A12
(S100A12), Thromboxane A Synthase 1 (TBXAS1), Leukocyte Cell
Derived Chemotaxin 1 (LECT1), Complement C3 (C3), Gastrin Releasing
Peptide (GRP), C-Reactive Protein (CRP), Vitrin (VIT), Insulin-Like
Growth Factor Binding Protein 1 (IGFBP1), Family with Sequence
Similarity 173 Member A (FAM173A), Natriuretic Peptide A (NPPA),
Secreted Frizzled Related Protein 1 (SFRP1), Ezrin (EZR),
Inter-Alpha-Trypsin Inhibitor Heavy Chain Family Member 5 (ITIH5),
Pleckstrin and Sec7 Domain Containing 2 (PSD2), Galectin 3 Binding
Protein (LGALS3BP), Catenin Beta 1 (CTNNB1), Chromodomain Y Like 2
(CDYL2), Matrix Metallopeptidase 7 (MMP7), Apolipoprotein B (APOB),
Proline and Arginine Rich End Leucine Rich Repeat Protein (PRELP),
Eukaryotic Translation Initiation Factor 1A, X-linked (EIF1AX),
Mesencephalic Astrocyte Derived Neurotrophic Factor (MANF), TNF
Receptor Superfamily Member 13C (TNFRSF13C), Deformed Epidermal
Autoregulatory Factor 1 transcription factor (DEAF1), Tumor Protein
Translationally-Controlled 1 (TPT1), Unc-5 Netrin Receptor B
(UNC5B), Phosphatidylethanolamine Binding Protein 1 (PEBP1),
Syntaxin 8 (STX8), Polymeric Immunoglobulin Receptor (PIGR),
Adenine Phosphoribosyltransferase (APRT), Matrix Metallopeptidase 3
(MMP3), Galectin 7 (LGALS7), Bruton Tyrosine Kinase (BTK), NSFL1
Cofactor (NSFL1C), FER Tyrosine Kinase (FER), Regenerating Family
Member 1 Beta (REG1B), SMAD Family Member 2 (SMAD2), Interleukin 1
Receptor Like 1 (IL1RL1), C-C Motif Chemokine Ligand 18 (CCL18),
Acid Phosphatase 2 Lysosomal (ACP2), Eukaryotic Translation
Initiation Factor 4E Family Member 2 (EIF4E2), Neurexin 3 (NRXN3),
IGF Like Family Member 1 (IGFL1), NME/NM23 Nucleoside Diphosphate
Kinase 1 (NME1), Potassium Voltage-Gated Channel Isk-Related Family
Member 1-Like (KCNE1L) or Neurexophilin 2 (NXPH2).
[0046] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition that modifies an
activity increases or activates the activity.
[0047] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition increases or
activates the activity of a sequence encoding Surfactant Protein D
(SFTPD), Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH), Histone
Cluster 1 H1 Family Member C (HIST1H1C), YTH Domain Containing 1
(YTHDC1), Plexin A1 (PLXNA1), Serine Peptidase Inhibitor Kazal Type
6 (SPINK6), LDL Receptor Related Protein Associated Protein 1
(LRPAP1), Secretoglobin Family 3A Member 1 (SCGB3A1), H2A Histone
Family Member Z (H2AFZ) or Chromosome 1 Open Reading Frame 162
(C1orf162).
[0048] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the non-human subject is a
mammal.
[0049] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the mammal is
genetically-modified.
[0050] In some embodiments of the methods of the disclosure, the
genetically-modified mammal is a model organism for the fibrotic
lung disease.
[0051] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the fibrotic lung disease is
pulmonary fibrosis, idiopathic pulmonary fibrosis (IPF), an
interstitial lung abnormality (ILA), or an asymptomatic ILA.
[0052] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the fibrotic lung disease is
pulmonary fibrosis or IPF.
[0053] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the fibrotic lung disease is
IPF.
[0054] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the non-human subject carries a
mutation in a sequence encoding MUC5B.
[0055] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the mutation comprises a
polymorphism in a sequence encoding a MUC5B promoter.
[0056] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the polymorphism is rs35705950.
[0057] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the non-human subject carries a
mutation in a sequence encoding TERC, FAM13A, TERT, DSP, ZKSCAN1,
AZGP1, OBFC1, MUC5B, AK025511, ATP11A, IVD/DISP2, DPP9, SIGLEC14,
ADM2, TSPAN5, CAMKK1 or MMP-7.
[0058] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition prevents the onset
or development of a sign or symptom of the fibrotic lung
disease.
[0059] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition delays the onset or
development of a sign or symptom of the fibrotic lung disease when
compared to the expected onset of the sign or symptom in the
absence of treatment with the composition.
[0060] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition delays the onset or
development of a sign or symptom of the fibrotic lung disease when
compared to the expected onset of the sign or symptom when treated
using a standard therapeutic intervention.
[0061] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition reduces the
severity of a sign or symptom of the fibrotic lung disease when
compared to the expected severity of the sign or symptom in the
absence of treatment with the composition.
[0062] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the composition reduces the
severity of a sign or symptom of the fibrotic lung disease when
compared to the expected severity of the sign or symptom when
treated using a standard therapeutic intervention.
[0063] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the standard therapeutic
intervention comprises N-acetylcysteine, pirfenidone, and
nintedanib.
[0064] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the standard therapeutic
intervention comprises pirfenidone.
[0065] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, an effective dosage of pirfenidone
is about 2400 mg/day.
[0066] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the effective dosage is
administered orally as a capsule or a tablet.
[0067] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the effective dosage is
administered three times per day.
[0068] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the effective dosage is
administered according to an escalating dosage regimen.
[0069] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the escalating dosage regimen
comprises administering to the non-human subject about 800 mg of
pirfenidone per day for a first week; administering to the
non-human subject about 1600 mg of pirfenidone per day for a second
week; and administering to the non-human subject about 2400 mg of
pirfenidone per day for the remainder of the treatment.
[0070] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the escalating dosage regimen
comprises administering to the non-human subject a capsule or
tablet comprising about 250 mg of pirfenidone three times a day for
a first week; administering to the non-human subject two capsules
or tablets comprising about 250 mg of pirfenidone three times a day
for a second week; and administering to the non-human subject three
capsules or tablets comprising about 250 mg of pirfenidone three
times a day for the remainder of the treatment.
[0071] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the capsule or tablet comprises 267
mg of pirfenidone.
[0072] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the standard therapeutic
intervention comprises nintedanib.
[0073] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, an effective dosage of nintedanib
is administered orally as a capsule or a tablet.
[0074] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the effective dosage is about 300
mg/day.
[0075] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the effective dosage is about 150
mg administered twice per day, wherein the daily doses are
administered about 12 hours apart from one another.
[0076] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the effective dosage is about 200
mg/day.
[0077] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the effective dosage is about 100
mg administered twice per day, wherein the daily doses are
administered about 12 hours apart from one another.
[0078] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the non-human subject presents at
least one sign of the fibrotic lung disease.
[0079] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the at least one sign comprises
gradual or unintended weight loss, clubbing of the fingers or toes,
rapid and shallow breathing, fibrotic lesions in one or both lungs
detectable by radiography, or a cough.
[0080] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the compound prevents the onset of
a secondary condition associated with a severe form of the fibrotic
lung disease.
[0081] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the compound prevents the onset for
at least 1 year, 2 years, 3 years, 4 years, 5 years or any whole or
fractional number of years in between.
[0082] In some embodiments of the methods of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease of the disclosure, the secondary condition comprises a
collapsed lung, an infected lung, a blood clot in a lung, lung
cancer, respiratory failure, pulmonary hypertension, heart failure
or death.
[0083] The disclosure provides a composition for the treatment of a
fibrotic lung disease identified by a method of the disclosure,
including a method of identifying a therapeutic agent or target
thereof for the treatment of a fibrotic lung disease of the
disclosure.
[0084] The disclosure provides a method of treating fibrotic lung
disease in a human subject of the disclosure comprising
administering a therapeutically effective amount of a composition
identified by a method of the disclosure, wherein the subject is
asymptomatic and wherein the subject is at risk of developing the
fibrotic lung disease. In some embodiments, the subject is wild
type (e.g., does not comprise a mutation or a sequence variation)
with respect to a nucleic acid or amino acid sequence encoding one
or more of TERC, FAM13A, TERT, DSP, ZKSCAN1, AZGP1, OBFC1, MUC5B,
AK025511, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1
or MMP-7.
[0085] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, the human
subject presents radiographic Usual Interstitial Pneumonia
(UIP).
[0086] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, wherein the
human subject has fibrotic interstitial lung disease (FILD).
[0087] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, wherein the
human subject has a blood relative with familial interstitial
pneumonia (FIP).
[0088] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, wherein the
blood relative is a sibling.
[0089] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, wherein the
human subject has a mutation or a sequence variation in a nucleic
acid or an amino acid sequence encoding TERC, FAM13A, TERT, DSP,
ZKSCAN1, AZGP1, OBFC1, MUC5B, AK025511, ATP11A, IVD/DISP2, DPP9,
SIGLEC14, ADM2, TSPAN5, CAMKK1 or MMP-7.
[0090] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, the mutation
comprises a polymorphism in a sequence encoding a MUC5B
promoter.
[0091] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, the
polymorphism is rs35705950.
[0092] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, the fibrotic
lung disease is pulmonary fibrosis, idiopathic pulmonary fibrosis
(IPF), an interstitial lung abnormality (ILA), or an asymptomatic
ILA.
[0093] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, the fibrotic
lung disease is pulmonary fibrosis or IPF.
[0094] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, the fibrotic
lung disease is IPF.
[0095] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, the method
prevents the onset of a secondary condition associated with a
severe form of the fibrotic lung disease.
[0096] In some embodiments of the methods of treating fibrotic lung
disease in a human subject of the disclosure by administering a
composition identified by a method of the disclosure, a secondary
condition comprises a collapsed lung, an infected lung, a blood
clot in a lung, lung cancer, respiratory failure, pulmonary
hypertension, heart failure or death.
BRIEF DESCRIPTION OF THE DRAWINGS
[0097] The patent or application file contains at least one drawing
executed in color. Copies of this patent or patent application
publication with color drawing(s) will be provided by the Office
upon request and payment of the necessary fee.
[0098] FIG. 1 is a map depicting an exemplary hierarchical
clustering of differentially expressed genes for pre-pulmonary
fibrosis subjects and normal subjects.
[0099] FIG. 2A-B is a pair of volcano plots showing serum sample
quality control using Principal component analysis (PCA). FIG. 2A
shows before outlier exclusion and FIG. 2B shows after outlier
exclusion.
[0100] FIG. 3 is a volcano plot of 3315 plasma proteins, comparing
results from 70 patients with established IPF and 70 controls.
Solid red symbols represent 57 proteins that were significantly
up-regulated and solid blue symbols 12 proteins that were
significantly down-regulated in patients with IPF after controlling
for multiple comparisons and age/gender/smoking.
[0101] FIG. 4 is a survival plot showing receiver operator curves
of predictive models for PrePF in asymptomatic relatives from FIP
families. Area Under Curve (AUC) values for each model are as
follows: Gene Expression alone (red)=0.83, Clinical Predictors
(blue)=0.87, Clinical Predictors+MUC5B genotype (green)=0.87,
Clinical Predictors+Gene Expression Score (yellow)=0.95, Clinical
Predictors+MUC5B genotype+Gene Expression Score (black)=0.95,
indicating that a peripheral blood biomarker panel may improve the
diagnostic power of a predictive model for PrePF in an at-risk
population.
[0102] FIG. 5 is a graph showing MUC5B expression in IPF (N=203)
and unaffected subjects (N=139) stratified by MUC5B promoter
variant (rs35705950) genotype.
[0103] FIG. 6A is a microscopic image demonstrating that MUC5B is
produced in bronchoalveolar epithelia of patients with IPF (brown
staining in photomicrographs). Staining is increased in the airways
of patients positive for rs35705950 (TT) compared to WT (GG).
[0104] FIG. 6B is a graph showing the percentage of MUC5B positive
area of bronchiolar epithelium. Unbiased stereological assessment
of staining demonstrates that the volume fraction of stained
airways (% positive area) is significantly greater in both the GT
heterozygotes and the TT homozygotes.
[0105] FIG. 7A-B is a series of bar graphs showing that Scgb1a1- and
SFTPC promoter show significant worsening of fibrosis
(hydroxyproline) after bleomycin while Muc5b-/- mice are protected.
FIG. 7A is a series of graphs and FIG. 7B is a series of confocal
images showing that the concentration of Muc5b is directly related
to the fibroproliferative response to bleomycin. Representative
images from second harmonic generation (SHG) demonstrate increased
lung collagen (red) in transgenic mice following bleomycin
injury.
[0106] FIG. 8 is a bar graph showing the baseline expression of
ER stress genes in lung tissue from WT and Scgb1a1-Muc5bTg mice.
Muc5bTg mice have greater ER stress gene expression than their WT
littermates (all genes in the ER stress pathway, with p<0.05).
Bleomycin also induces ER stress (data not shown).
[0107] FIG. 9 is a pair of microscopic images showing enhanced CHOP
(Ddit3) protein in wild type (WT, top photograph) and
Scgb1a1-Muc5bTg mice (bottom photograph) after repeat
bleomycin.
[0108] FIG. 10 is a pair of microscopic images and corresponding
graphs showing the expanded mucus layer and decreased mucociliary
transport in SFTPC-Muc5bTg mice compared to littermate wild-type
mice. Statistical differences were assessed by Mann-Whitney U
Test.
[0109] FIG. 11 is a series of schematic diagrams showing that the
MUC5B variant and other biomarkers can identify an at-risk
population or those with PrePF, establishing the opportunity for
primary and secondary prevention of IPF. The 'at-risk' population
and the population with PrePF are large (19% with the MUC5B promoter
variant and 1.8% of individuals ≥50 years of age,
respectively); IPF is diagnosed in a small population with
established, end-stage disease and PrePF can be identified using
the MUC5B variant rs35705950. Results indicate that PrePF (detected
via chest CT scan) is associated with a poor prognosis suggesting
that PrePF may be a harbinger of IPF.
[0110] FIG. 12 is a schematic diagram showing a method of screening
at-risk populations (family members of patients with IPF) to
identify individuals with PrePF. Focus is placed on identifying the
genetic variants and biomarkers that increase the yield of PrePF on
HRCT scan, in addition to gender, age, and physiology scores.
[0111] FIG. 13 is a table describing the baseline characteristics
of patients with rheumatoid arthritis.
[0112] FIG. 14 is a table describing the genotypic association of
MUC5B rs35705950 single nucleotide polymorphism in patients with
RA, with and without interstitial lung disease.
[0113] FIG. 15 is a table describing the dominant genotypic
association of MUC5B rs35705950 single nucleotide polymorphism in
patients with RA-ILD and a usual interstitial pneumonia or possible
usual interstitial pneumonia pattern (RA-UIP) and in patients with
RA-ILD and a pattern inconsistent with usual interstitial pneumonia
(RA non-UIP).
[0114] FIG. 16A is a forest plot of odds ratios (OR) and 95%
confidence intervals (CI) depicting the lack of association of the
MUC5B rs35705950 promoter variant with RA without ILD (RA-noILD).
The boxes indicate OR, and the horizontal lines indicate 95% CI for
the best-fitting genetic model for each association test. The black
dotted line represents a mean OR value of 1. The red boxes and red
lines indicate the overall OR and 95% CI, respectively. For
comparisons between RA cases and controls, the associations were
adjusted for the country of origin and sex. For intra-RA cases
comparisons, the associations were adjusted for the country of
origin, sex, age at inclusion and smoking.
[0115] FIG. 16B is a forest plot of odds ratios (OR) and 95%
confidence intervals (CI) depicting the additive genotypic
association of the MUC5B rs35705950 promoter variant with RA-ILD.
The red dotted line represents the mean value of the overall OR.
The boxes indicate OR, and the horizontal lines indicate 95% CI for
the best-fitting genetic model for each association test. The black
dotted line represents a mean OR value of 1. The red boxes and red
lines indicate the overall OR and 95% CI, respectively. For
comparisons between RA cases and controls, the associations were
adjusted for the country of origin and sex. For intra-RA cases
comparisons, the associations were adjusted for the country of
origin, sex, age at inclusion and smoking.
[0116] FIG. 16C is a forest plot of odds ratios (OR) and 95%
confidence intervals (CI) depicting dominant genotypic association
of the MUC5B rs35705950 promoter variant with ILD among patients
with RA and those with the usual interstitial pneumonia or possible
usual interstitial pneumonia (UIP) pattern. The boxes indicate OR,
and the horizontal lines indicate 95% CI for the best-fitting
genetic model for each association test. The red dotted line
represents the mean value of the overall OR. The black dotted line
represents a mean OR value of 1. The red boxes and red lines
indicate the overall OR and 95% CI, respectively. For comparisons
between RA cases and controls, the associations were adjusted for
the country of origin and sex. For intra-RA cases comparisons, the
associations were adjusted for the country of origin, sex, age at
inclusion and smoking.
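As an illustration of the quantities plotted in FIG. 16A-C, the following Python sketch computes an unadjusted odds ratio and a 95% confidence interval from a 2x2 table of carrier counts. It is a simplified sketch under stated assumptions: the function name, the Woolf (log-based) interval, and the example counts are hypothetical and are not taken from the disclosure, and the associations described above were additionally adjusted for covariates such as country of origin and sex, which this sketch does not do.

```python
import math


def odds_ratio_ci(case_carriers, case_noncarriers,
                  control_carriers, control_noncarriers, z=1.96):
    """Return (odds ratio, lower bound, upper bound) for a 2x2 table.

    Uses the Woolf log-based interval; z=1.96 gives an approximate
    95% confidence interval. All four counts must be positive.
    """
    counts = (case_carriers, case_noncarriers,
              control_carriers, control_noncarriers)
    if min(counts) <= 0:
        raise ValueError("all four cell counts must be positive")

    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / n for n in counts))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only (not data from the disclosure).
or_value, ci_low, ci_high = odds_ratio_ci(30, 70, 15, 85)
print(f"OR={or_value:.2f}, 95% CI=({ci_low:.2f}, {ci_high:.2f})")
```

An interval whose lower bound lies above 1 corresponds to a box and horizontal line drawn entirely to the right of the black dotted reference line in a forest plot.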
[0117] FIG. 17 is a series of photographs depicting MUC5B
expression in explanted lung tissue from rheumatoid
arthritis-associated interstitial lung disease. Representative lung tissue
images from unaffected control (GG genotype, Panel A), RA-ILD case
#1 (GG genotype, Panel B), and RA-ILD case #2 (GT genotype, Panel
C). Low power views with high power view insets identified. Panel
A--low power view of normal lung; top and middle insets with high
power view of bronchiole with MUC5B staining; bottom inset with
high power view of alveolar epithelia. Panel B and C--low power
view of the usual interstitial pneumonia pattern in explanted lung
tissue of RA-ILD; top inset with high power view of bronchiole with
MUC5B staining; middle and bottom insets with high power view of
MUC5B staining in metaplastic epithelia lining honeycomb cysts and
MUC5B staining of mucous in honeycomb cysts.
[0118] FIG. 18 is a flow chart depicting the screening and
enrollment process for study subjects.
[0119] FIG. 19A-D is a series of photographs depicting
High-resolution CT (HRCT) images of: 19A) chest from a study
subject whose scan was read as normal, without signs of
interstitial lung disease or fibrosis. 19B) HRCT image from subject
who was categorized as having "Probable Fibrotic ILD." 19C)
Representative HRCT image from subject who was characterized as
having "Definite Fibrotic ILD." 19D) HRCT image from a case of
previously diagnosed, established Idiopathic Pulmonary Fibrosis
(IPF) in one of the study families.
[0120] FIG. 20 is a table depicting a summary of characteristics of
study subjects used in quantitative CT Analyses.
[0121] FIG. 21A-F is a series of photographs depicting
representative axial HRCT images visually assessed as "No Fibrosis"
(21A), "Probable Fibrotic ILD" (21C) and "Definite Fibrotic ILD"
(21E). Below each are the corresponding quantitative HRCT results for
the above scan: (21B) "No Fibrosis" fibrosis extent 1.7% (fibrosis
score=0.55), (21D) "Probable Fibrotic ILD" fibrosis extent 18.5%
(fibrosis score=2.92), (21F) "Definite Fibrotic ILD" fibrosis extent
35.5% (fibrosis score=3.60). Classification results color coded as
follows: green=normal lung, blue=airway, yellow=reticular
abnormality, magenta=ground glass opacity, red=honeycombing.
[0122] FIG. 22 is a table depicting Screening Cohort Subject
Characteristics. * DNA available on a total of 489 subjects (404 No
Fibrosis and 75 PrePF subjects). ** Odds ratios reported in this
table were calculated from a mixed effects logistic regression
model including age (as a continuous variable), male sex, ever
smoker (yes/no), and MUC5B promoter variant (rs35705950) genotype.
***In the reported model, rs35705950 coded as a dominant allele; in
log-additive genetic model, p=0.05, as well.
[0123] FIG. 23 is a table depicting patterns of CT abnormalities in
scans with probable or definite fibrotic ILD. * Because a confident
single diagnosis was relatively uncommon, most cases included
consideration of several patterns. For this reason, the percentages
add up to more than 100%.
[0124] FIG. 24 is a box plot depicting fibrosis score by visual
diagnosis. Boxplots of fibrosis scores based on quantitative HRCT
assessment for each visual diagnosis category. Fibrosis score means
were significantly different (ANOVA, p<0.0001) across groups
defined by visual diagnosis. Comparison of fibrosis score between
groups showed significant differences for all comparisons
(p<0.01 for all).
[0125] FIG. 25A-C is a series of graphs depicting Receiver
Operating Characteristic (ROC) curves for quantitative imaging
measures of Fibrosis and PrePF. FIG. 25A depicts ROC curves for
visual diagnosis compared to log HAA scores. FIG. 25B depicts ROC
curves for visual diagnosis compared to fibrosis scores. ROC
analysis showed that fibrosis score discriminates subjects with
visual diagnosis of PrePF. Average area under the curve (AUC) in
fivefold cross validation was 0.85 (range 0.83-0.87) and average
accuracy, sensitivity, and specificity in the test partitions were
0.83 (range 0.74-0.86), 0.74 (range 0.56-0.92) and 0.84 (range
0.76-0.89) respectively. Optimal threshold for fibrosis score
ranged from 1.40-1.42.
[0126] FIG. 25C depicts density plots of fibrosis scores for
visually diagnosed PrePF (pink) and No Fibrosis (blue) scans--the
fibrosis score optimal threshold is indicated with the red line
(1.40).
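The ROC quantities described for FIG. 25A-C can be illustrated with a short Python sketch. It computes the area under the curve as the probability that a randomly chosen PrePF scan has a higher fibrosis score than a randomly chosen No Fibrosis scan (the Mann-Whitney formulation), and the sensitivity and specificity at a fixed threshold. The function names and the example scores are hypothetical, the rule that a score at or above the threshold is called PrePF is an assumption of this sketch, and the fivefold cross validation described above is not reproduced.

```python
def auc_mann_whitney(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


def sensitivity_specificity(positive_scores, negative_scores, threshold):
    """Sensitivity and specificity when score >= threshold is called positive."""
    true_pos = sum(1 for s in positive_scores if s >= threshold)
    true_neg = sum(1 for s in negative_scores if s < threshold)
    return true_pos / len(positive_scores), true_neg / len(negative_scores)


# Hypothetical fibrosis scores, for illustration only.
prepf_scores = [1.2, 1.5, 1.9, 2.4, 3.0]
no_fibrosis_scores = [0.4, 0.7, 0.9, 1.1, 1.3, 1.45]

auc = auc_mann_whitney(prepf_scores, no_fibrosis_scores)
sens, spec = sensitivity_specificity(prepf_scores, no_fibrosis_scores, 1.40)
print(f"AUC={auc:.2f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

Sweeping the threshold across the observed score range and recording each sensitivity and specificity pair traces out the full ROC curve.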
[0127] FIG. 26 is a series of tables depicting Dyspnea
questionnaire data. FIG. 26A depicts breathlessness responses for
the cohort. FIG. 26B depicts breathlessness responses by Visual CT
diagnosis.
[0128] FIG. 27 is a graph that depicts the prevalence of PrePF in
FIP Siblings Cohort by Age and MUC5B Genotype. PrePF prevalence in
this FIP siblings cohort increases by age, as shown in this graph.
By age >60 years, the prevalence of PrePF differed significantly
based on MUC5B genotype (*p=0.02). Subjects with the variant are
depicted by the red line, while those without it are depicted with
the blue line.
[0129] FIG. 28 is a table depicting subject characteristics based
on Quantitative Fibrosis Score. Clinical characteristics and
genotype breakdown of subjects with quantitative HRCT analyses. The
cutoff of 1.4 for the logarithm of fibrosis score is based on
analyses presented in the text. * p-value compares characteristic
between groups. Linear regression values regress fibrosis score on
age, male sex, smoking history, and MUC5B promoter variant. **In
the reported model, rs35705950 coded as a dominant allele given
small number of TT subjects.
[0130] FIG. 29 is a table depicting an exploratory genetic
association study of 13 pulmonary fibrosis susceptibility variants
in RA-ILD.
DETAILED DESCRIPTION OF THE DISCLOSURE
[0131] The present disclosure provides a method of treating a
fibrotic lung disease in a subject comprising administering to the
subject an effective amount of a therapeutic agent, wherein the
subject is asymptomatic and wherein the subject is at risk of
developing the fibrotic lung disease.
Methods of Identifying a Therapeutic Agent of the Disclosure or
Target Thereof
[0132] The disclosure provides a method of identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease, comprising administering to a non-human subject a
dose of a composition that modifies transcription or translation of
a sequence encoding Mucin 5B (MUC5B), Telomerase RNA Component
(TERC), Family with sequence similarity 13 member A (FAM13A),
Telomerase Reverse Transcriptase (TERT), Desmoplakin (DSP),
Zinc-alpha 2-Glycoprotein 1 (AZGP1),
Oligonucleotide/oligosaccharide-binding Fold Containing 1 (OBFC1),
ATPase Phospholipid Transporting 11A (ATP11A), Isovaleryl-CoA
dehydrogenase (IVD)/Dispatched RND Transporter Family Member 2
(DISP2), Dipeptidyl Peptidase 9 (DPP9), Sialic Acid Binding Ig-Like
Lectin 14 (SIGLEC14), Adrenomedullin 2 (ADM2), Tetraspanin 5
(TSPAN5), Calcium/Calmodulin-Dependent Protein Kinase Kinase 1
(CAMKK1) or Matrix Metalloprotease-7 (MMP-7), wherein the dose of
the composition is tolerable to the non-human subject and wherein
the dose of the composition is therapeutically effective.
[0133] The disclosure provides a method of identifying a therapeutic
agent or target thereof for the treatment of a fibrotic lung
disease, comprising administering to a non-human subject a
composition that modifies an activity of a product of a sequence
encoding MUC5B, TERC, FAM13A, TERT, DSP, AZGP1, OBFC1, ATP11A,
IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or MMP-7, wherein
the dose of the composition is tolerable to the non-human subject
and wherein the dose of the composition is therapeutically
effective.
[0134] In some embodiments of the methods of the disclosure, the
composition that modifies transcription or translation decreases or
inhibits transcription or translation. In some embodiments, the
composition decreases or inhibits transcription or translation of a
sequence encoding a gene selected from the group consisting of
Leukotriene A4 Hydrolase (LTA4H), Surfactant Protein B (SFTPB),
Breast Cancer Anti-Estrogen Resistance 3 (BCAR3), C-X-C motif
Chemokine Ligand 13 (CXCL13), EPH Receptor A2 (EPHA2), Serum
Amyloid A1 (SAA1), Phospholipase A2 Group IIA (PLA2G2A),
Insulin-Like Growth Factor Binding Protein 3 (IGFBP3), C-C Motif
Chemokine Ligand 28 (CCL28), S100 Calcium Binding Protein A12
(S100A12), Thromboxane A Synthase 1 (TBXAS1), Leukocyte Cell
Derived Chemotaxin 1 (LECT1), Complement C3 (C3), Gastrin Releasing
Peptide (GRP), C-Reactive Protein (CRP), Vitrin (VIT), Insulin-Like
Growth Factor Binding Protein 1 (IGFBP1), Family with Sequence
Similarity 173 Member A (FAM173A), Natriuretic Peptide A (NPPA),
Secreted Frizzled Related Protein 1 (SFRP1), Ezrin (EZR),
Inter-Alpha-Trypsin Inhibitor Heavy Chain Family Member 5 (ITIH5),
Pleckstrin and Sec7 Domain Containing 2 (PSD2), Galectin 3 Binding
Protein (LGALS3BP), Catenin Beta 1 (CTNNB1), Chromodomain Y Like 2
(CDYL2), Matrix Metallopeptidase 7 (MMP7), Apolipoprotein B (APOB),
Proline and Arginine Rich End Leucine Rich Repeat Protein (PRELP),
Eukaryotic Translation Initiation Factor 1A, X-linked (EIF1AX),
Mesencephalic Astrocyte Derived Neurotrophic Factor (MANF), TNF
Receptor Superfamily Member 13C (TNFRSF13C), Deformed Epidermal
Autoregulatory Factor 1 transcription factor (DEAF1), Tumor Protein
Translationally-Controlled 1 (TPT1), Unc-5 Netrin Receptor B
(UNC5B), Phosphatidylethanolamine Binding Protein 1 (PEBP1),
Syntaxin 8 (STX8), Polymeric Immunoglobulin Receptor (PIGR),
Adenine Phosphoribosyltransferase (APRT), Matrix Metallopeptidase 3
(MMP3), Galectin 7 (LGALS7), Bruton Tyrosine Kinase (BTK), NSFL1
Cofactor (NSFL1C), FER Tyrosine Kinase (FER), Regenerating Family
Member 1 Beta (REG1B), SMAD Family Member 2 (SMAD2), Interleukin 1
Receptor Like 1 (IL1RL1), C-C Motif Chemokine Ligand 18 (CCL18),
Acid Phosphatase 2 Lysosomal (ACP2), Eukaryotic Translation
Initiation Factor 4E Family Member 2 (EIF4E2), Neurexin 3 (NRXN3),
IGF Like Family Member 1 (IGFL1), NME/NM23 Nucleoside Diphosphate
Kinase 1 (NME1), Potassium Voltage-Gated Channel Isk-Related Family
Member 1-Like (KCNE1L) or Neurexophilin 2 (NXPH2).
[0135] In some embodiments of the methods of the disclosure, the
composition that modifies transcription or translation increases or
activates transcription or translation. In some embodiments, the
composition increases or activates transcription or translation of
a sequence encoding a gene selected from the group consisting of
Surfactant Protein D (SFTPD), Glyceraldehyde-3-Phosphate
Dehydrogenase (GAPDH), Histone Cluster 1 H1 Family Member C
(HIST1H1C), YTH Domain Containing 1 (YTHDC1), Plexin A1 (PLXNA1),
Serine Peptidase Inhibitor Kazal Type 6 (SPINK6), LDL Receptor
Related Protein Associated Protein 1 (LRPAP1), Secretoglobin Family
3A Member 1 (SCGB3A1), H2A Histone Family Member Z (H2AFZ) or
Chromosome 1 Open Reading Frame 162 (C1orf162).
[0136] In some embodiments of the methods of the disclosure, the
composition that modifies an activity decreases or inhibits the
activity. In some embodiments, the composition decreases or
inhibits the activity of a sequence encoding a gene selected from
Leukotriene A4 Hydrolase (LTA4H), Surfactant Protein B (SFTPB),
Breast Cancer Anti-Estrogen Resistance 3 (BCAR3), C-X-C motif
Chemokine Ligand 13 (CXCL13), EPH Receptor A2 (EPHA2), Serum
Amyloid A1 (SAA1), Phospholipase A2 Group IIA (PLA2G2A),
Insulin-Like Growth Factor Binding Protein 3 (IGFBP3), C-C Motif
Chemokine Ligand 28 (CCL28), S100 Calcium Binding Protein A12
(S100A12), Thromboxane A Synthase 1 (TBXAS1), Leukocyte Cell
Derived Chemotaxin 1 (LECT1), Complement C3 (C3), Gastrin Releasing
Peptide (GRP), C-Reactive Protein (CRP), Vitrin (VIT), Insulin-Like
Growth Factor Binding Protein 1 (IGFBP1), Family with Sequence
Similarity 173 Member A (FAM173A), Natriuretic Peptide A (NPPA),
Secreted Frizzled Related Protein 1 (SFRP1), Ezrin (EZR),
Inter-Alpha-Trypsin Inhibitor Heavy Chain Family Member 5 (ITIH5),
Pleckstrin and Sec7 Domain Containing 2 (PSD2), Galectin 3 Binding
Protein (LGALS3BP), Catenin Beta 1 (CTNNB1), Chromodomain Y Like 2
(CDYL2), Matrix Metallopeptidase 7 (MMP7), Apolipoprotein B (APOB),
Proline and Arginine Rich End Leucine Rich Repeat Protein (PRELP),
Eukaryotic Translation Initiation Factor 1A, X-linked (EIF1AX),
Mesencephalic Astrocyte Derived Neurotrophic Factor (MANF), TNF
Receptor Superfamily Member 13C (TNFRSF13C), Deformed Epidermal
Autoregulatory Factor 1 transcription factor (DEAF1), Tumor Protein
Translationally-Controlled 1 (TPT1), Unc-5 Netrin Receptor B
(UNC5B), Phosphatidylethanolamine Binding Protein 1 (PEBP1),
Syntaxin 8 (STX8), Polymeric Immunoglobulin Receptor (PIGR),
Adenine Phosphoribosyltransferase (APRT), Matrix Metallopeptidase 3
(MMP3), Galectin 7 (LGALS7), Bruton Tyrosine Kinase (BTK), NSFL1
Cofactor (NSFL1C), FER Tyrosine Kinase (FER), Regenerating Family
Member 1 Beta (REG1B), SMAD Family Member 2 (SMAD2), Interleukin 1
Receptor Like 1 (IL1RL1), C-C Motif Chemokine Ligand 18 (CCL18),
Acid Phosphatase 2 Lysosomal (ACP2), Eukaryotic Translation
Initiation Factor 4E Family Member 2 (EIF4E2), Neurexin 3 (NRXN3),
IGF Like Family Member 1 (IGFL1), NME/NM23 Nucleoside Diphosphate
Kinase 1 (NME1), Potassium Voltage-Gated Channel Isk-Related Family
Member 1-Like (KCNE1L) or Neurexophilin 2 (NXPH2).
[0137] In some embodiments of the methods of the disclosure, the
composition that modifies an activity increases or activates the
activity. In some embodiments, the composition increases or
activates the activity of a sequence encoding Surfactant Protein D
(SFTPD), Glyceraldehyde-3-Phosphate Dehydrogenase (GAPDH), Histone
Cluster 1 H1 Family Member C (HIST1H1C), YTH Domain Containing 1
(YTHDC1), Plexin A1 (PLXNA1), Serine Peptidase Inhibitor Kazal Type
6 (SPINK6), LDL Receptor Related Protein Associated Protein 1
(LRPAP1), Secretoglobin Family 3A Member 1 (SCGB3A1), H2A Histone
Family Member Z (H2AFZ) or Chromosome 1 Open Reading Frame 162
(C1orf162).
[0138] In some embodiments of the methods of the disclosure, the
non-human subject is a mammal. In some embodiments, the mammal is
genetically-modified. In some embodiments, the genetically-modified
mammal is a model organism for the fibrotic lung disease.
[0139] In some embodiments of the methods of the disclosure, the
fibrotic lung disease is pulmonary fibrosis, idiopathic pulmonary
fibrosis (IPF), an interstitial lung abnormality (ILA), or an
asymptomatic ILA. In some embodiments, the fibrotic lung disease is
pulmonary fibrosis or IPF. In some embodiments, the fibrotic lung
disease is IPF.
[0140] In some embodiments of the methods of the disclosure, the
non-human subject carries a mutation in a sequence encoding MUC5B.
In some embodiments, the mutation comprises a polymorphism in a
sequence encoding a MUC5B promoter. In some embodiments, the
polymorphism is rs35705950. Alternatively, or in addition, in some
embodiments, the non-human subject carries a mutation in a sequence
encoding TERC, FAM13A, TERT, DSP, AZGP1, OBFC1, ATP11A, IVD/DISP2,
DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or MMP-7.
[0141] In some embodiments of the methods of the disclosure, the
composition prevents the onset or development of a sign or symptom
of the fibrotic lung disease.
[0142] In some embodiments of the methods of the disclosure, the
composition delays the onset or development of a sign or symptom of
the fibrotic lung disease when compared to the expected onset of
the sign or symptom in the absence of treatment with the
composition. In some embodiments, the composition delays the onset
or development of a sign or symptom of the fibrotic lung disease
when compared to the expected onset of the sign or symptom when
treated using a standard therapeutic intervention.
[0143] In some embodiments of the methods of the disclosure, the
composition reduces the severity of a sign or symptom of the
fibrotic lung disease when compared to the expected severity of the
sign or symptom in the absence of treatment with the composition.
In some embodiments, the composition reduces the severity of a sign
or symptom of the fibrotic lung disease when compared to the
expected severity of the sign or symptom when treated using a
standard therapeutic intervention.
[0144] In some embodiments of the methods of the disclosure, the
standard therapeutic intervention comprises N-acetylcysteine,
pirfenidone, and nintedanib.
[0145] In some embodiments of the methods of the disclosure, the
standard therapeutic intervention comprises pirfenidone. In some
embodiments, an effective dosage of pirfenidone is about 2400
mg/day. In some embodiments, the effective dosage is administered
orally as a capsule or a tablet. In some embodiments, the effective
dosage is administered three times per day. In some embodiments,
the effective dosage is administered according to an escalating
dosage regimen. In some embodiments, the escalating dosage regimen
comprises (a) administering to the non-human subject about 800 mg
of pirfenidone per day for a first week; (b) administering to the
non-human subject about 1600 mg of pirfenidone per day for a second
week; and (c) administering to the non-human subject about 2400 mg
of pirfenidone per day for the remainder of the treatment. In some
embodiments, the escalating dosage regimen comprises (a)
administering to the non-human subject a capsule or tablet
comprising about 250 mg of pirfenidone three times a day for a
first week; (b) administering to the non-human subject two capsules
or tablets comprising about 250 mg of pirfenidone three times a day
for a second week; and (c) administering to the non-human subject
three capsules or tablets comprising about 250 mg of pirfenidone
three times a day for the remainder of the treatment. In some
embodiments, the capsule or tablet comprises 267 mg of
pirfenidone.
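The escalating dosage regimen above reduces to simple arithmetic, sketched here in Python for illustration. The function name and the day-based week boundaries (days 1 to 7, days 8 to 14, day 15 onward) are assumptions of this sketch rather than part of the disclosure; the 267 mg capsule strength and the three-times-daily frequency are taken from the paragraph above.

```python
def pirfenidone_daily_dose_mg(day, capsule_mg=267, doses_per_day=3):
    """Total daily dose in mg on a given 1-indexed day of treatment.

    Week 1: one capsule per dose; week 2: two capsules per dose;
    thereafter: three capsules per dose.
    """
    if day < 1:
        raise ValueError("day must be 1 or greater")
    if day <= 7:
        capsules_per_dose = 1
    elif day <= 14:
        capsules_per_dose = 2
    else:
        capsules_per_dose = 3
    return capsules_per_dose * capsule_mg * doses_per_day


# Print the daily total for the first day of each phase of the schedule.
for day in (1, 8, 15):
    print(day, pirfenidone_daily_dose_mg(day))
```

With 267 mg capsules the three phases give 801, 1602 and 2403 mg per day, consistent with the "about 800", "about 1600" and "about 2400" mg per day figures recited above.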
[0146] In some embodiments of the methods of the disclosure, the
standard therapeutic intervention comprises nintedanib. In some
embodiments, an effective dosage of nintedanib is administered
orally as a capsule or a tablet. In some embodiments, the effective
dosage is about 300 mg/day. In some embodiments, the effective
dosage is about 150 mg administered twice per day, wherein the
daily doses are administered about 12 hours apart from one another.
In some embodiments, the effective dosage is about 200 mg/day. In
some embodiments, the effective dosage is about 100 mg administered
twice per day, wherein the daily doses are administered about 12
hours apart from one another.
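The twice-daily nintedanib schedule above can likewise be sketched in Python. The function names, the choice of start time, and the use of an exact 12-hour spacing (the text recites "about 12 hours") are assumptions made for illustration only.

```python
from datetime import datetime, timedelta


def twice_daily_schedule(first_dose, days, dose_mg=150, interval_hours=12):
    """Return a list of (datetime, dose in mg) for two doses per day."""
    return [
        (first_dose + timedelta(hours=interval_hours * i), dose_mg)
        for i in range(2 * days)
    ]


def daily_total_mg(dose_mg, doses_per_day=2):
    """Total daily dose in mg for a fixed per-dose amount."""
    return dose_mg * doses_per_day


# Hypothetical start time, for illustration only.
for when, mg in twice_daily_schedule(datetime(2020, 1, 1, 8, 0), days=2):
    print(when.isoformat(), mg)
```

Here `daily_total_mg(150)` and `daily_total_mg(100)` correspond to the about 300 mg/day and about 200 mg/day dosages recited above.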
[0147] In some embodiments of the methods of the disclosure, the
non-human subject presents at least one sign of the fibrotic lung
disease. In some embodiments, the at least one sign comprises
gradual or unintended weight loss, clubbing of the fingers or toes,
rapid and shallow breathing, fibrotic lesions in one or both lungs
detectable by radiography, or a cough.
[0148] In some embodiments of the methods of the disclosure, the
compound prevents the onset of a secondary condition associated
with a severe form of the fibrotic lung disease. In some
embodiments, the compound prevents the onset for at least 1 year, 2
years, 3 years, 4 years, 5 years or any whole or fractional number
of years in between. In some embodiments, the secondary condition
comprises a collapsed lung, an infected lung, a blood clot in a
lung, lung cancer, respiratory failure, pulmonary hypertension,
heart failure or death.
[0149] The disclosure provides a composition for the treatment of a
fibrotic lung disease identified by a method of the disclosure for
identifying a therapeutic agent or target thereof for the treatment
of a fibrotic lung disease.
Subjects of the Disclosure
[0150] The disclosure provides a method of treating a fibrotic lung
disease in a human subject comprising administering to the subject
the composition for the treatment of a fibrotic lung disease
identified by a method of the disclosure for identifying a
therapeutic agent or target thereof for the treatment of a fibrotic
lung disease, wherein the subject is asymptomatic and wherein the
subject is at risk of developing the fibrotic lung disease.
[0151] In some embodiments of the methods of treating a fibrotic
lung disease in a human subject of the disclosure, the human
subject presents radiographic Usual Interstitial Pneumonia (UIP).
In some embodiments, the human subject has fibrotic interstitial
lung disease (FILD). In some embodiments, the human subject has a
blood relative with familial interstitial pneumonia (FIP). In some
embodiments, the blood relative is a sibling. Alternatively, or in
addition, in some embodiments, the human subject has a mutation in
a sequence encoding MUC5B, TERC, FAM13A, TERT, DSP, AZGP1, OBFC1,
ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or MMP-7.
In some embodiments, the mutation comprises a polymorphism in a
sequence encoding a MUC5B promoter. In some embodiments, the
polymorphism is rs35705950.
[0152] In some embodiments of the methods of treating a fibrotic
lung disease in a human subject of the disclosure, the fibrotic
lung disease is pulmonary fibrosis, idiopathic pulmonary fibrosis
(IPF), an interstitial lung abnormality (ILA), or an asymptomatic
ILA. In some embodiments, the fibrotic lung disease is pulmonary
fibrosis or IPF. In some embodiments, the fibrotic lung disease is
IPF.
[0153] In some embodiments of the methods of treating a fibrotic
lung disease in a human subject of the disclosure, the method
prevents the onset of a secondary condition associated with a
severe form of the fibrotic lung disease. In some embodiments, the
secondary condition comprises a collapsed lung, an infected lung, a
blood clot in a lung, lung cancer, respiratory failure, pulmonary
hypertension, heart failure or death.
Idiopathic Pulmonary Fibrosis (IPF)
[0154] IPF is localized to the lung and is characterized by a
pattern of heterogeneous, subpleural patches of fibrotic, remodeled
lung, and often results in death within 3-5 years of diagnosis. IPF
affects 5 million people worldwide, disproportionately affects men,
is associated with cigarette smoking, increases with age, is
inexplicably increasing in prevalence, and is likely
underdiagnosed. Most patients with IPF are discovered in the
advanced stage when little can be done to influence survival. There
is a critical unmet need for early detection and prevention of
idiopathic pulmonary fibrosis (IPF). Earlier diagnosis of IPF
detects subjects with a lower burden of fibrotic lung disease
providing an opportunity for secondary prevention of this
progressive disease and changes the clinical approach to patients
with IPF from palliative to preventive.
[0155] Early detection and prevention of idiopathic pulmonary
fibrosis (IPF) is critical. As demonstrated herein, treatment of
subjects at risk for developing preclinical pulmonary fibrosis
(PrePF) is based on two central concepts: first, that understanding
PrePF is essential for primary and secondary prevention of IPF and,
second, that, similar to asymptomatic family members of patients
with familial IPF (FIP; ≥2 family members with IPF), asymptomatic
family members of patients with sporadic IPF represent an at-risk
population for PrePF. These central concepts are supported by the
observations that 1) IPF has a pre-symptomatic phase and PrePF
appears to be a harbinger of IPF, 2) familial and sporadic IPF are
similar etiologically, 3) the MUC5B promoter variant is critical to
early disease recognition and 4) identification of PrePF
represents an opportunity to prevent extensive lung fibrosis.
As shown herein, a common gain-of-function MUC5B promoter variant
rs35705950 is a strong risk factor (genetic and otherwise),
accounting for at least 30% of the total risk of developing IPF.
The MUC5B promoter variant rs35705950 may be used to identify
individuals with PrePF. MUC5B promoter variant rs35705950 is also
predictive of radiographic progression of PrePF and is present in
over 50% of non-Hispanic white patients with IPF and is also
associated with unique clinical and biological IPF phenotypes.
PrePF can be predicted using a combination of clinical risk
factors, the MUC5B promoter variant rs35705950, and a panel of
biomarkers. This disclosure provides methods of treating subjects
with Preclinical Pulmonary Fibrosis (PrePF) and who may also be at
risk for developing IPF. The methods of the disclosure fundamentally
change the clinical approach to treating subjects with IPF,
shifting the focus from a merely palliative to a proactive and
preventive therapy.
Rheumatoid Arthritis-Associated Interstitial Lung Disease
(RA-ILD)
[0156] Rheumatoid arthritis (RA) is a common inflammatory and
autoimmune disease that is associated with progressive impairment,
systemic complications and increased mortality. Interstitial lung
disease (RA-ILD) is detected in up to 60% of patients with RA on
high-resolution computed-tomography (HRCT), is clinically
significant in 10%, and is a leading cause of morbidity and
mortality in patients with RA.
[0157] RA-ILD shares several characteristics with idiopathic
pulmonary fibrosis (IPF), including common environmental risk
factors, the high prevalence of the usual interstitial pneumonia
(UIP) pattern, the progressive nature of the disease, and poor
survival. The hypothesis of a shared genetic background between IPF
and RA-ILD was recently suggested by a whole-exome sequencing (WES)
genetic association study in patients with RA-ILD, revealing an
excess of mutations in genes previously associated with familial
interstitial pneumonia (FIP), including TERT, RTEL1, PARN and
SFTPC.
[0158] The common gain-of-function promoter variant rs35705950 of
the gene encoding mucin5B (MUC5B) is the strongest genetic risk
factor for IPF, observed in at least 50% of the cases of IPF and
accounting for 30% of the risk of developing this disease. The
MUC5B promoter variant is associated with increased expression of
MUC5B in lung parenchyma of unaffected controls and cases of IPF.
Consequently, it is hypothesized that the MUC5B promoter variant
rs35705950 would also contribute to the occurrence of RA-ILD. To
test this hypothesis, a multi-ethnic association study of the MUC5B
promoter variant and RA-ILD in seven distinct case series was
performed.
[0159] The MUC5B promoter variant rs35705950, the strongest genetic
risk factor for IPF, is also a strong risk factor for RA-ILD,
especially among those with radiographic evidence of UIP. Of note,
the effect of the MUC5B promoter variant on the development of ILD
associated with RA was similar in magnitude and direction to that
observed in IPF.
[0160] The relationship between the MUC5B promoter variant and
RA-ILD may be specific to UIP and may not be generalizable to other
autoimmune conditions of the lung. The MUC5B promoter variant has
not been found to be associated with risk of ILDs linked to
systemic sclerosis or autoimmune myositis. Unlike these other types
of ILD, RA-ILD shares more characteristics with IPF, notably the
increased frequency of the UIP pattern (both radiologic and
histologic), an increased prevalence of male sex and older age, and
genetic susceptibility as assessed by an excess of mutations in
genes linked to FIP in a cohort of RA-ILD, and now the MUC5B
promoter variant rs35705950.
[0161] The disclosure demonstrates that the MUC5B promoter variant
is a risk factor for UIP, and not simply limited to IPF and RA-ILD.
In fact, emerging studies have identified the MUC5B promoter
variant as a risk factor for chronic hypersensitivity pneumonitis,
another condition known to have a sub-phenotype of UIP. Further,
since HRCT underestimates the presence of ILD and the UIP pattern
of fibrosis, our point estimates for association with the MUC5B
variant are likely conservative. Similar to IPF, early forms of
RA-ILD can be identified using the MUC5B promoter variant as a
biomarker.
[0162] The disclosure demonstrates that Muc5b is overexpressed by
the bronchoalveolar epithelia and MUC5B mRNA is co-expressed by
cells expressing surfactant protein C, as has been shown in IPF.
These findings suggest either that type 2 alveolar epithelial cells can
express MUC5B or that in patients with RA-ILD, the cells in the
distal airspace de-differentiate. Importantly, the disclosure
demonstrates for the first time that cells that overexpress MUC5B
are undergoing ER stress, a recognized mechanism of cell injury and
repair. In aggregate, these findings indicate that the
gain-of-function MUC5B promoter variant rs35705950 injures alveolar
epithelia by inducing ER stress.
[0163] RA-ILD is a complex genetic phenotype with the minor allele
of the MUC5B promoter variant rs35705950 identified as a risk
factor for the disease. The odds ratio for the association of
the MUC5B promoter variant with RA-ILD is equivalent to that
observed with IPF and substantively higher than those for most
other common risk variants for RA-ILD, including cigarette smoking
and the human leukocyte antigen locus for RA.
[0164] The MUC5B promoter variant is a risk factor for UIP in
general and may prove relevant beyond RA-ILD and IPF.
[0165] Expression of MUC5B in the bronchoalveolar epithelia
co-incident with markers of ER stress suggests that the MUC5B
promoter variant may be causing pulmonary fibrosis by initiating
microscopic foci of injury and repair.
[0166] The MUC5B promoter variant appears to predict ILD in the RA
population, identifying potential opportunities for early ILD
detection in patients with RA.
Preclinical Idiopathic Pulmonary Fibrosis
[0167] Better understanding and recognition of early pulmonary
fibrosis is critical because medical therapies have been shown to
slow progression, not to reverse or even stabilize established
fibrosis--therefore, intervention before irreversible fibrosis has
become extensive has the potential to improve quality of life and
decrease morbidity. While IPF affects approximately 5 million
people worldwide, between 1.8 and 14% of the general population
≥50 years of age have radiologic findings of undiagnosed
pulmonary fibrosis. Large cohort studies indicate that interstitial
lung abnormalities, postulated to represent early pulmonary
fibrosis, are associated with increased mortality, and that most of
these abnormalities progress over time. Members of families with 2
or more cases of pulmonary fibrosis (FIP, Familial Interstitial
Pneumonia) have been identified as an "at-risk" population. In a
previous study of FIP relatives, 14% had interstitial lung
abnormalities on high resolution computed tomography (HRCT), and
35% had an abnormal transbronchial biopsy indicating interstitial
lung disease.
[0168] HRCT provides visualization of the lung parenchyma and plays
a key role in the diagnosis of the Idiopathic Interstitial
Pneumonias (IIPs), including IPF. Currently, visual diagnosis by
thoracic radiologists, in conjunction with multidisciplinary
clinical conference, is the gold standard for diagnosing IIPs.
However, visual assessment is imprecise and hampered by
inter-observer variation. Quantitative HRCT (qHRCT) evaluation
provides measures of fibrosis extent that, in subjects diagnosed
with IPF, correlate with degree of physiologic impairment at
baseline, and may be more sensitive to subtle changes in disease
status than routinely used physiological metrics. The design and
utility of quantitative methods in the context of early forms of
fibrotic ILD requires further study. Deep learning methods have
been increasingly used in imaging to identify and classify CT
patterns, and may be particularly valuable in detection of early
lung fibrosis.
[0169] PrePF is prevalent among FIP relatives, and a texture-based
quantitative method of HRCT analyses is useful in identifying these
abnormalities in this population, and key risk factors, including
the MUC5B promoter variant, predict those at risk of this disease.
PrePF subjects are older, more likely to be male, and more likely
to have smoked than the unaffected subjects; additionally, the
gain-of-function MUC5B promoter variant rs35705950, which has been
shown in prior studies to be associated with pulmonary fibrosis, is
more common in PrePF subjects when compared to their unaffected
family members. Given the subtlety of the fibrotic change in many
of these cases of PrePF, the high prevalence of potential UIP
pattern on HRCT scan suggests that PrePF subjects may progress to
IPF over time.
Methods for Detecting a Genetic Variant
[0170] The present disclosure also provides methods of detecting
the biomarkers of the present disclosure. Methods of detecting a
genetic variant are further described in US Application US
2016-0060701A1 (the contents of which are incorporated herein by
reference in their entirety). The practice of the present
disclosure employs, unless otherwise indicated, conventional
methods of analytical biochemistry, microbiology, molecular biology
and recombinant DNA techniques within the skill of the art. Such
techniques are explained fully in the literature. (See, e.g.,
Sambrook, J. et al. Molecular Cloning: A Laboratory Manual. 3rd
ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, N. Y., 2000; DNA Cloning: A Practical
Approach, Vol. I & II (D. Glover, ed.); Oligonucleotide
Synthesis (M. Gait, ed., Current Edition); Nucleic Acid
Hybridization (B. Hames & S. Higgins, eds., Current Edition);
Transcription and Translation (B. Hames & S. Higgins, eds.,
Current Edition); CRC Handbook of Parvoviruses, Vol. I & II (P.
Tijssen, ed.); Fundamental Virology, 2nd Edition, Vol. I & II
(B. N. Fields and D. M. Knipe, eds.)).
[0171] The methods of the invention are not limited to any
particular way of detecting the presence or absence of a genetic
variant (e.g. SNP) and can employ any suitable method to detect the
presence or absence of a variant(s), of which numerous detection
methods are known in the art. Dynamic allele-specific hybridization
(DASH) can be used to detect a genetic variant. DASH genotyping
takes advantage of the differences in the melting temperature of
DNA that result from the instability of mismatched base pairs. The
process can be largely automated and relies on a few simple
principles. Thus, the aspects and embodiments described herein
provide methods for assessing the presence or absence of SNPs in a
sample (e.g. biological sample) from a subject suspected of having
or developing an interstitial lung disease (e.g., because of family
history). In certain embodiments, one or more SNPs are screened in
one or more samples from a subject. The SNPs can be associated with
one or more genes, e.g., one or more genes or other genes
associated with mucous secretions as disclosed herein.
[0172] Typically, the target genomic segment is amplified and
separated from non-target sequence, e.g., through use of a
biotinylated primer and chromatography. A probe that is specific
for the particular allele is added to the amplification product.
The probe can be designed to hybridize specifically to a variant
sequence or to the dominant allelic sequence. The probe can be
either labeled with or added in the presence of a molecule that
fluoresces when bound to double-stranded DNA. The signal intensity
is then measured as temperature is increased until the Tm can be
determined. A non-matching sequence (either genetic variant or
dominant allelic sequence, depending on probe design), will result
in a lower than expected Tm.
[0173] DASH genotyping relies on a quantifiable change in Tm, and
is thus capable of measuring many types of mutations, not just
SNPs. Other benefits of DASH include its ability to work with label
free probes and its simple design and performance conditions.
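The Tm readout described above can be sketched in code. The following is a minimal illustration only, assuming fluorescence readings taken at evenly spaced, increasing temperatures; the function names, the idealized sigmoid melt curves, and the 2 degree tolerance are hypothetical and are not taken from the disclosure.

```python
import math

def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the interval with the steepest signal drop."""
    best_tm, best_drop = None, 0.0
    for i in range(len(temps) - 1):
        # Rate of fluorescence loss per degree over this interval.
        drop = (fluorescence[i] - fluorescence[i + 1]) / (temps[i + 1] - temps[i])
        if drop > best_drop:
            best_tm, best_drop = (temps[i] + temps[i + 1]) / 2, drop
    return best_tm

def probe_matches(tm_observed, tm_expected, tolerance=2.0):
    """A mismatched duplex melts below the Tm expected for a perfect match."""
    return tm_observed >= tm_expected - tolerance

def synthetic_melt_curve(tm, temps, width=1.5):
    """Idealized sigmoid falling from ~1 to ~0 around tm (illustration only)."""
    return [1.0 / (1.0 + math.exp((t - tm) / width)) for t in temps]

temps = [40 + 0.5 * i for i in range(101)]  # 40.0 to 90.0 degrees C
tm_match = melting_temperature(temps, synthetic_melt_curve(66.25, temps))
tm_mismatch = melting_temperature(temps, synthetic_melt_curve(58.25, temps))

print(probe_matches(tm_match, tm_expected=66.25))     # True
print(probe_matches(tm_mismatch, tm_expected=66.25))  # False
```

In practice the expected Tm and the tolerance would be calibrated per probe against control samples of known genotype.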
[0174] Molecular beacons can also be used to detect a genetic
variant. This method makes use of a specifically engineered
single-stranded oligonucleotide probe. The oligonucleotide is
designed such that there are complementary regions at each end and
a probe sequence located in between. This design allows the probe
to take on a hairpin, or stem-loop, structure in its natural,
isolated state. Attached to one end of the probe is a fluorophore
and to the other end a fluorescence quencher. Because of the
stem-loop structure of the probe, the fluorophore is in close
proximity to the quencher, thus preventing the molecule from
emitting any fluorescence. The molecule is also engineered such
that only the probe sequence is complementary to the targeted
genomic DNA sequence.
[0175] If the probe sequence of the molecular beacon encounters its
target genomic DNA sequence during the assay, it will anneal and
hybridize. Because of the length of the probe sequence, the hairpin
segment of the probe will be denatured in favor of forming a
longer, more stable probe-target hybrid. This conformational change
permits the fluorophore and quencher to be free of their tight
proximity due to the hairpin association, allowing the molecule to
fluoresce.
[0176] If on the other hand, the probe sequence encounters a target
sequence with as little as one non-complementary nucleotide, the
molecular beacon will preferentially stay in its natural hairpin
state and no fluorescence will be observed, as the fluorophore
remains quenched. The unique design of these molecular beacons
allows for a simple diagnostic assay to identify SNPs at a given
location. If a molecular beacon is designed to match a wild-type
allele and another to match a mutant of the allele, the two can be
used to identify the genotype of an individual. If only the first
probe's fluorophore wavelength is detected during the assay then
the individual is homozygous for the wild type. If only the second
probe's wavelength is detected then the individual is homozygous for
the mutant allele. Finally, if both wavelengths are detected, then
both molecular beacons must be hybridizing to their complements and
thus the individual must contain both alleles and be
heterozygous.
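The two-beacon decision rule above can be expressed as a short function. This is a sketch under stated assumptions: the signals are taken to be background-corrected and normalized to a 0 to 1 scale, and the function name and the 0.5 detection threshold are hypothetical.

```python
def call_genotype(wild_type_signal, variant_signal, threshold=0.5):
    """Call a genotype from the fluorescence of two allele-specific beacons."""
    wild_type_detected = wild_type_signal >= threshold
    variant_detected = variant_signal >= threshold
    if wild_type_detected and variant_detected:
        return "heterozygous"
    if wild_type_detected:
        return "homozygous wild type"
    if variant_detected:
        return "homozygous variant"
    return "no call"  # neither beacon opened, e.g. failed amplification

print(call_genotype(0.92, 0.04))
print(call_genotype(0.55, 0.61))
print(call_genotype(0.03, 0.88))
```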
[0177] A microarray can also be used to detect genetic variants.
Hundreds of thousands of probes can be arrayed on a small chip,
allowing for many genetic variants or SNPs to be interrogated
simultaneously. Because SNP alleles only differ in one nucleotide
and because it is difficult to achieve optimal hybridization
conditions for all probes on the array, the target DNA has the
potential to hybridize to mismatched probes. This can be addressed
by using several redundant probes to interrogate each SNP. Probes
can be designed to have the SNP site in several different locations
as well as containing mismatches to the SNP allele. By comparing
the differential amount of hybridization of the target DNA to each
of these redundant probes, it is possible to determine specific
homozygous and heterozygous alleles.
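One simple way to combine redundant probes is to average the relative signal of the two alleles across all probe pairs for a SNP. The sketch below assumes each redundant probe pair yields one intensity for allele A and one for allele B; the function name, the example intensities, and the 0.25 and 0.75 cutoffs are hypothetical and chosen only for illustration.

```python
from statistics import mean

def call_from_redundant_probes(probe_pairs, low=0.25, high=0.75):
    """Call a SNP genotype from (allele_A, allele_B) intensity pairs."""
    # Fraction of total signal attributable to allele A, per probe pair.
    fractions = [a / (a + b) for a, b in probe_pairs if a + b > 0]
    if not fractions:
        return "no call"
    fraction_a = mean(fractions)
    if fraction_a >= high:
        return "AA"
    if fraction_a <= low:
        return "BB"
    return "AB"

print(call_from_redundant_probes([(900, 100), (850, 120), (950, 80)]))
print(call_from_redundant_probes([(500, 480), (450, 520)]))
```

Averaging over several probes reduces the influence of any single probe that hybridizes poorly or cross-hybridizes under the shared array conditions.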
[0178] Restriction fragment length polymorphism (RFLP) can be used
to detect genetic variants and SNPs. RFLP makes use of the many
different restriction endonucleases and their high affinity to
unique and specific restriction sites. By performing a digestion on
a genomic sample and determining fragment lengths through a gel
assay it is possible to ascertain whether or not the enzymes cut
the expected restriction sites. A failure to cut the genomic sample
results in an identifiably larger than expected fragment implying
that there is a mutation at the point of the restriction site which
is rendering it protected from nuclease activity.
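The fragment-length logic above can be illustrated with an in-silico digest. This is a sketch only: the sequences are invented, and the EcoRI recognition site (GAATTC, cut after the first G) is used purely as a familiar example; the disclosure does not name a particular enzyme.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the cut falls.
    """
    lengths, start = [], 0
    position = sequence.find(site)
    while position != -1:
        cut = position + cut_offset
        lengths.append(cut - start)
        start = cut
        position = sequence.find(site, position + 1)
    lengths.append(len(sequence) - start)
    return lengths

# Hypothetical 46-base amplicon with one restriction site in the reference allele.
reference = "ATGC" * 5 + "GAATTC" + "TTAGG" * 4
# A single-base change inside the site (A to G) abolishes recognition.
variant = "ATGC" * 5 + "GAGTTC" + "TTAGG" * 4

print(digest(reference, "GAATTC", 1))  # [21, 25]
print(digest(variant, "GAATTC", 1))    # [46]
```

The uncut 46-base fragment from the variant sequence corresponds to the larger than expected band described above.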
[0179] PCR- and amplification-based methods can be used to detect
genetic variants. For example, tetra-primer PCR employs two pairs
of primers to amplify two alleles in one PCR reaction. The primers
are designed such that the two primer pairs overlap at a SNP
location but each matches perfectly to only one of the possible
alleles. As a result, if a given allele is present in the PCR
reaction, the primer pair specific to that allele will produce
product, whereas the primer pair specific to the alternative allele
will not. The two primer pairs can be designed such that their PCR
products are of a significantly different length allowing for
easily distinguishable bands by gel electrophoresis, or such that
they are differently labeled.
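Reading a genotype from allele-specific band sizes can be sketched as follows. The product lengths (180 and 260 base pairs), the 5 base pair sizing tolerance, and the function name are hypothetical; bands that match neither allele-specific product are simply ignored.

```python
def genotype_from_bands(observed_bp, expected_bp, tolerance_bp=5):
    """Infer a genotype from gel band sizes.

    `expected_bp` maps each allele name to its allele-specific product length.
    """
    detected = [
        allele
        for allele, size in expected_bp.items()
        if any(abs(size - band) <= tolerance_bp for band in observed_bp)
    ]
    if len(detected) == 2:
        return "heterozygous"
    if len(detected) == 1:
        return f"homozygous {detected[0]}"
    return "no call"

expected = {"wild type": 180, "variant": 260}
print(genotype_from_bands([182], expected))
print(genotype_from_bands([181, 258], expected))
```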
[0180] Primer extension can also be used to detect genetic
variants. Primer extension first involves the hybridization of a
probe to the bases immediately upstream of the SNP nucleotide
followed by a `mini-sequencing` reaction, in which DNA polymerase
extends the hybridized primer by adding a base that is
complementary to the SNP nucleotide. The incorporated base that is
detected determines the presence or absence of the SNP allele.
Because primer extension is based on the highly accurate DNA
polymerase enzyme, the method is generally very reliable. Primer
extension is able to genotype most SNPs under very similar reaction
conditions making it also highly flexible. The primer extension
method is used in a number of assay formats, and can be detected
using e.g., fluorescent labels or mass spectrometry.
[0181] Primer extension can involve incorporation of either
fluorescently labeled dideoxynucleotides (ddNTP) or fluorescently
labeled deoxynucleotides (dNTP). With ddNTPs, probes hybridize to
the target DNA immediately upstream of the SNP nucleotide, and a
single ddNTP complementary to the SNP allele is added to the 3' end
of the probe (the missing 3'-hydroxyl in the dideoxynucleotide
prevents
further nucleotides from being added). Each ddNTP is labeled with a
different fluorescent signal allowing for the detection of all four
alleles in the same reaction. With dNTPs, allele-specific probes
have 3' bases which are complementary to each of the SNP alleles
being interrogated. If the target DNA contains an allele
complementary to the 3' base of the probe, the target DNA will
completely hybridize to the probe, allowing DNA polymerase to
extend from the 3' end of the probe. This is detected by the
incorporation of the fluorescently labeled dNTPs onto the end of
the probe. If the target DNA does not contain an allele
complementary to the probe's 3' base, the target DNA will produce a
mismatch at the 3' end of the probe and DNA polymerase will not be
able to extend from the 3' end of the probe.
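The ddNTP variant of primer extension can be modeled in a few lines. This is a simplified sketch: the template sequences are invented, the dye colors assigned to each ddNTP are hypothetical, and hybridization is modeled as exact reverse-complement matching rather than real annealing behavior.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hypothetical dye assignment, one distinct label per ddNTP.
DYE_FOR_DDNTP = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}

def reverse_complement(sequence):
    return "".join(COMPLEMENT[base] for base in reversed(sequence))

def single_base_extension(template, snp_index, primer):
    """Return (ddNTP, dye) added to the primer, or None if it does not anneal.

    `template` is written 5'->3'. The primer must pair with the template bases
    immediately 3' of the SNP, so its 3' end sits next to the SNP nucleotide.
    """
    binding_site = template[snp_index + 1 : snp_index + 1 + len(primer)]
    if reverse_complement(binding_site) != primer:
        return None
    ddntp = COMPLEMENT[template[snp_index]]  # base pairing with the SNP
    return ddntp, DYE_FOR_DDNTP[ddntp]

allele_1 = "ACGTGCATTGACCGTA"  # C at index 5
allele_2 = "ACGTGTATTGACCGTA"  # T at index 5
primer = reverse_complement(allele_1[6:14])

print(single_base_extension(allele_1, 5, primer))  # ('G', 'yellow')
print(single_base_extension(allele_2, 5, primer))  # ('A', 'green')
```

Because the same primer serves both alleles, the incorporated dye alone identifies which allele is present.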
[0182] The iPLEX® SNP genotyping method takes a slightly
different approach, and relies on detection by mass spectrometer.
Extension probes are designed in such a way that many different SNP
assays can be amplified and analyzed in a PCR cocktail. The
extension reaction uses ddNTPs as above, but the detection of the
SNP allele is dependent on the actual mass of the extension product
and not on a fluorescent molecule. This method is suited to low to
medium-high throughput, and is not intended for whole genome scanning.
[0183] Primer extension methods are, however, amenable to high
throughput analysis. Primer extension probes can be arrayed on
slides allowing for many SNPs to be genotyped at once. Broadly
referred to as arrayed primer extension (APEX), this technology has
several benefits over methods based on differential hybridization
of probes. Comparatively, APEX methods have greater discriminating
power than methods using differential hybridization, as it is often
impossible to obtain the optimal hybridization conditions for the
thousands of probes on DNA microarrays (usually this is addressed
by having highly redundant probes).
[0184] Oligonucleotide ligation assays can also be used to detect
genetic variants. DNA ligase catalyzes the ligation of the 3' end
of a DNA fragment to the 5' end of a directly adjacent DNA
fragment. This mechanism can be used to interrogate a SNP by
hybridizing two probes directly over the SNP polymorphic site,
whereby ligation can occur if the probes are identical to the
target DNA. For example, two probes can be designed; an
allele-specific probe which hybridizes to the target DNA so that
its 3' base is situated directly over the SNP nucleotide and a
second probe that hybridizes the template upstream (downstream in
the complementary strand) of the SNP polymorphic site providing a
5' end for the ligation reaction. If the allele-specific probe
matches the target DNA, it will fully hybridize to the target DNA
and ligation can occur. Ligation does not generally occur in the
presence of a mismatched 3' base. Ligated or unligated products can
be detected by gel electrophoresis, MALDI-TOF mass spectrometry or
by capillary electrophoresis.
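The ligation rule above can be sketched as a simple string comparison. As a simplifying assumption, the target is written as the strand in the same sense as the probes, so a probe hybridizes where its sequence equals that strand; the sequences and the function name are hypothetical.

```python
def ligation_occurs(sense_strand, snp_index, allele_probe, adjacent_probe):
    """Return True if both probes match and abut at the SNP position.

    The allele-specific probe must end (3' base) exactly on the SNP, and the
    adjacent probe must begin on the next base.
    """
    start = snp_index - len(allele_probe) + 1
    if start < 0:
        return False
    allele_ok = sense_strand[start : snp_index + 1] == allele_probe
    end = snp_index + 1 + len(adjacent_probe)
    adjacent_ok = sense_strand[snp_index + 1 : end] == adjacent_probe
    return allele_ok and adjacent_ok

target = "GATTACAGGCTTACGT"  # SNP nucleotide (G) at index 7
print(ligation_occurs(target, 7, "TTACAG", "GCTTAC"))  # True: 3' base matches
print(ligation_occurs(target, 7, "TTACAA", "GCTTAC"))  # False: 3' mismatch
```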
[0185] The 5'-nuclease activity of Taq DNA polymerase can be used
for detecting genetic variants. The assay is performed concurrently
with a PCR reaction and the results can be read in real-time. The
assay requires forward and reverse PCR primers that will amplify a
region that includes the SNP polymorphic site. Allele
discrimination is achieved using FRET, and one or two
allele-specific probes that hybridize to the SNP polymorphic site.
The probes have a fluorophore linked to their 5' end and a quencher
molecule linked to their 3' end. While the probe is intact, the
quencher will remain in close proximity to the fluorophore,
eliminating the fluorophore's signal. During the PCR amplification
step, if the allele-specific probe is perfectly complementary to
the SNP allele, it will bind to the target DNA strand and then get
degraded by 5'-nuclease activity of the Taq polymerase as it
extends the DNA from the PCR primers. The degradation of the probe
results in the separation of the fluorophore from the quencher
molecule, generating a detectable signal. If the allele-specific
probe is not perfectly complementary, it will have lower melting
temperature and not bind as efficiently. This prevents the nuclease
from acting on the probe.
[0186] Förster resonance energy transfer (FRET) detection can be
used for detection in primer extension and ligation reactions where
the two labels are brought into close proximity to each other. It
can also be used in the 5'-nuclease reaction, the molecular beacon
reaction, and the invasive cleavage reactions where the neighboring
donor/acceptor pair is separated by cleavage or disruption of the
stem-loop structure that holds them together. FRET occurs when two
conditions are met. First, the emission spectrum of the fluorescent
donor dye must overlap with the excitation wavelength of the
acceptor dye. Second, the two dyes must be in close proximity to
each other because energy transfer drops off quickly with distance.
The proximity requirement is what makes FRET a good detection
method for a number of allelic discrimination mechanisms.
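The steep distance dependence can be made concrete with the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. The sketch below assumes an R0 of 5 nm purely for illustration; the actual value depends on the dye pair and is not given in the disclosure.

```python
def fret_efficiency(distance_nm, forster_radius_nm):
    """Energy transfer efficiency from the standard Förster relation."""
    return 1.0 / (1.0 + (distance_nm / forster_radius_nm) ** 6)

R0 = 5.0  # assumed Förster radius in nm, for illustration only
for r in (2.5, 5.0, 10.0):
    print(f"r = {r:4.1f} nm  E = {fret_efficiency(r, R0):.3f}")
```

Halving the distance from R0 raises the efficiency to about 98%, while doubling it drops the efficiency below 2%, which is why cleavage or opening of a stem-loop produces a clear change in signal.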
[0187] A variety of dyes can be used for FRET, and are known in the
art. The most common ones are fluorescein, cyanine dyes (Cy3 to
Cy7), rhodamine dyes (e.g. rhodamine 6G), the Alexa series of dyes
(Alexa 405 to Alexa 730), and the Atto series of dyes (Atto-Tec
GmbH). Some of these dyes have been used in FRET networks (with
multiple donors and acceptors). Optics for imaging all of these
require detection from UV to near IR (e.g. Alexa 405 to Cy7). The
Alexa series of dyes from Invitrogen covers the whole spectral
range. They are very bright and photostable.
[0188] Example dye pairs for FRET labeling include
Alexa-405/Alexa-488, Alexa-488/Alexa-546, Alexa-532/Alexa-594,
Alexa-594/Alexa-680, Alexa-594/Alexa-700, Alexa-700/Alexa-790,
Cy3/Cy5, Cy3.5/Cy5.5, and Rhodamine-Green/Rhodamine-Red.
Fluorescent metal nanoparticles such as silver and gold
nanoclusters can also be used (Richards et al. (2008) J Am Chem Soc
130:5038-39; Vosch et al. (2007) Proc Natl Acad Sci USA
104:12616-21; Petty and Dickson (2003) J Am Chem Soc 125:7780-81).
Available filters, dichroics, multichroic mirrors and lasers can
affect the choice of dye.
In Vitro Complexes
[0189] Provided herein are nucleic acid complexes, e.g., formed in
in vitro assays to indicate the presence of a genetic variant
sequence. One of skill will understand that a nucleic acid complex
can also be formed to detect the presence of a dominant allelic
sequence, depending on the design of the probe or primer, e.g., in
assays to distinguish homozygous and heterozygous subjects.
[0190] In some embodiments, the complex comprises a first nucleic
acid hybridized to a genetic variant nucleic acid, wherein the
genetic variant nucleic acid is a genetic variant in a gene
selected from MUC5B, TERC, FAM13A, TERT, DSP, AZGP1, OBFC1, ATP11A,
IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or MMP-7. In some
embodiments, the genetic variant nucleic acid is an amplification
product. In some embodiments, the genetic variant nucleic acid is
on genomic DNA, e.g., from a subject that has or is suspected of
having an interstitial lung disease. In some embodiments, the first
nucleic acid is an amplification product or a primer extension
product. In some embodiments, the first nucleic acid is labeled. In
some embodiments, the nucleic acid complex further comprises a
second nucleic acid hybridized to the genetic variant nucleic acid.
In some embodiments, the second nucleic acid is labeled, e.g., with
a FRET or other fluorescent label. In some embodiments, the first
and second nucleic acids form a FRET pair when hybridized to a
genetic variant sequence.
[0191] In some embodiments, the nucleic acid complex further
comprises an enzyme, such as a DNA polymerase (e.g., standard DNA
polymerase or thermostable polymerase such as Taq) or ligase.
[0192] The present disclosure includes but is not limited to the
following embodiments:
[0193] A method for determining if an individual is predicted to
develop and/or progress rapidly with an interstitial pneumonia
comprising: detecting in a biological sample from the individual,
at least one of: a) the presence of a marker polymorphism selected
from the group consisting of: rs35705950; and/or, b) a level of
gene expression of a marker gene or plurality of marker genes
selected from the group consisting of: a marker gene having at
least 95% sequence identity with at least one sequence selected
from the group consisting of MUC5B, TERC, FAM13A, TERT, DSP, AZGP1,
OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or
MMP-7, or homologs or variants thereof; c) polypeptides encoded by
the marker genes of b); d) fragments of polypeptides of c); and e) a
polynucleotide which is fully complementary to at least a portion
of a marker gene of b); wherein the presence of the plurality of
markers is indicative of whether an individual will develop a
disease. In some embodiments, the genes detected share 100%
sequence identity with the corresponding marker gene in b). In some
embodiments, the presence or level of at least one of the plurality
of markers is determined and compared to a standard level or
reference set. In some embodiments, the standard level or reference
set is determined according to a statistical procedure for risk
prediction. In some embodiments, the statistical procedure for risk
prediction comprises using the sum of the gene expression of the
marker or markers or the presence or absence of a set of markers,
weighted by a Proportional Hazards coefficient. In some
embodiments, the presence of the at least one marker is determined
by detecting the presence or absence or expression level of a
polypeptide. In some embodiments, the method further comprises
detecting the presence of the polypeptide using a reagent that
specifically binds to the polypeptide or a fragment thereof. In
some embodiments, the reagent is selected from the group consisting
of an antibody, an antibody derivative, and an antibody fragment.
In some embodiments, the presence of the marker is determined by
obtaining the sequence of genomic DNA at the locus of the
polymorphism. In some embodiments, the presence of the marker is
determined by obtaining RNA from the biological sample; generating
cDNA from the RNA; amplifying the cDNA with probes or primers for
marker genes; obtaining from the amplified cDNA the expression
levels of the genes or gene expression products in the sample. In
some embodiments, the individual is a human.
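By way of illustration only, the weighted-sum procedure for risk
prediction described above can be sketched in Python as follows. The
marker names are taken from the list above; the coefficient values,
the subject measurements, and the function name risk_score are
hypothetical assumptions and are not values from the disclosure. In
practice the coefficients would be taken from a fitted proportional
hazards model.

```python
from typing import Mapping


def risk_score(values: Mapping[str, float],
               coefficients: Mapping[str, float]) -> float:
    """Sum marker values, each weighted by its per-marker coefficient.

    `values` holds an expression level (or 0/1 for absence/presence
    of a polymorphism) for each marker named in `coefficients`.
    """
    missing = set(coefficients) - set(values)
    if missing:
        raise KeyError(f"missing marker values: {sorted(missing)}")
    return sum(coefficients[m] * values[m] for m in coefficients)


# Hypothetical coefficients and measurements, for illustration only.
example_coefficients = {"MUC5B": 0.8, "MMP7": 0.5, "rs35705950": 1.2}
example_subject = {"MUC5B": 2.0, "MMP7": 1.0, "rs35705950": 1}
example_score = risk_score(example_subject, example_coefficients)
```

A score computed in this way would then be compared to the standard
level or reference set; the choice of threshold is not specified
here.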
[0194] In some embodiments, the method further comprises: a)
comparing the expression level of the marker gene or plurality of
marker genes in the biological sample to a control level of the
marker gene(s) selected from the group consisting of: a control
level of the marker gene that has been correlated with interstitial
lung disease, the risk of developing interstitial lung disease, or
having an interstitial lung disease; and a control level of the
marker that has been correlated with slow or no progression of
interstitial lung disease, or low risk of developing an
interstitial lung disease; and b) selecting the individual as being
predicted to progress rapidly in the development of interstitial
pneumonia, if the expression level of the marker gene in the
individual's biological sample is statistically similar to, or
greater than, the control level of expression of the marker gene
that has been correlated with interstitial lung disease, or c)
selecting the individual as being predicted to not develop
interstitial lung disease, or to progress slowly, if the level of
the marker gene in the individual's biological sample is
statistically less than the control level of the marker gene that
has been correlated with interstitial lung disease.
[0195] In some embodiments, the method further comprises comparing
the presence of a polymorphism in the biological sample to a set of
genetic variants or polymorphic markers from an individual or
control group having developed interstitial lung disease, and,
selecting the individual as being predicted to develop or to
progress with interstitial pneumonia if the polymorphic markers
present in the biological sample are identical to or statistically
similar to a set of polymorphic markers from the individual or
control group or, selecting the individual as being predicted to
develop or rapidly progress with interstitial pneumonia, if the
polymorphic markers present in the biological sample are not
identical to or statistically similar to the set of genetic
variants or polymorphic markers from the individual or control
group.
[0196] A method for monitoring the progression of interstitial lung
disease in a subject, comprising: i) measuring expression levels of
a plurality of gene markers in a first biological sample obtained
from the subject, wherein the plurality of markers comprise a
plurality of markers selected from the group consisting of: a
marker gene having at least 95% sequence identity with a sequence
selected from the group consisting of a) MUC5B, TERC, FAM13A, TERT,
DSP, AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5,
CAMKK1 or MMP-7, or homologs or variants thereof; b) polypeptides
encoded by the marker genes of a); c) fragments of polypeptides of
b); and d) a polynucleotide which is fully complementary to at
least a portion of a marker gene of a); ii) measuring expression
levels of the plurality of markers in a second biological sample
obtained from the subject; and iii) comparing the expression level
of the marker measured in the first sample with the level of the
marker measured in the second sample. In some embodiments, the
marker genes detected share 100% sequence identity with the
corresponding marker gene in a). In some embodiments, the method
further comprises performing a follow-up step selected from the
group consisting of CT scan of the chest and pathological
examination of lung tissues from the subject. In some embodiments,
the first biological sample from the subject is obtained at a time
t.sub.0, and the second biological sample from the subject is
obtained at a later time t.sub.1. In some embodiments, the first
biological sample and the second biological sample are obtained
from the subject more than once over a range of times.
[0197] A method of assessing the efficacy of a treatment for
interstitial lung disease or interstitial pneumonia in a subject,
the method comprising comparing: i) the expression level of a
marker measured in a first sample obtained from the subject at a
time t.sub.0, wherein the marker is selected from the group consisting
of a) a marker gene having at least 95% sequence identity with a
sequence selected from the group consisting of MUC5B, TERC, FAM13A,
TERT, DSP, AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2,
TSPAN5, CAMKK1 or MMP-7, or homologs or variants thereof; b)
polypeptides encoded by the marker genes of a);
c) fragments of polypeptides of b); and d) a polynucleotide which
is fully complementary to at least a portion of a marker gene of
a); ii) the level of the marker in a second sample obtained from
the subject at time t.sub.1; and, iii) performing a follow-up step
selected from CT scan of the chest and pathological examination of
lung tissues from the subject; wherein a decrease in the level of
the marker in the second sample relative to the first sample is an
indication that the treatment is efficacious for treating
interstitial pneumonia in the subject. In some embodiments, the
genes detected share 100% sequence identity with the corresponding
marker gene in a). In some embodiments, the time t.sub.0 is before the
treatment has been administered to the subject, and the time t.sub.1 is
after the treatment has been administered to the subject. In some
embodiments, the comparing is repeated over a range of times.
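By way of illustration only, the comparison of a marker level at
t.sub.0 and t.sub.1 can be sketched in Python as follows. The
function names and the optional min_decrease threshold are
hypothetical assumptions; the disclosure states only that a decrease
in the second sample relative to the first is an indication of
efficacy.

```python
def marker_change(level_t0: float, level_t1: float) -> float:
    """Fractional change in marker level from t0 to t1.

    Negative values mean the level decreased between samples.
    """
    if level_t0 <= 0:
        raise ValueError("baseline level must be positive")
    return (level_t1 - level_t0) / level_t0


def indicates_efficacy(level_t0: float, level_t1: float,
                       min_decrease: float = 0.0) -> bool:
    """True if the level fell by more than `min_decrease` (a fraction).

    `min_decrease` is an assumed tuning parameter, not a value from
    the disclosure; the default treats any decrease as an indication.
    """
    return marker_change(level_t0, level_t1) < -min_decrease
```

Where the comparing is repeated over a range of times, the same
function may be applied to each successive pair of samples.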
[0198] An assay system for predicting individual prognosis therapy
for interstitial pneumonia comprising a means to detect at least
one of: a) the presence of a marker polymorphism selected from the
group consisting of: rs35705950; and/or, b) a level of gene
expression of a marker gene or plurality of marker genes selected
from the group consisting of: a marker gene having at least 95%
sequence identity with a sequence selected from the group
consisting of MUC5B, TERC, FAM13A, TERT, DSP, AZGP1, OBFC1, ATP11A,
IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or MMP-7, or
homologs or variants thereof; c) polypeptides encoded by the marker
genes of b); d) fragments of polypeptides of c); and e) a
polynucleotide which is fully complementary to at least a portion
of a marker gene of b). In some embodiments, the means to detect
comprises nucleic acid probes comprising at least 10 to 50
contiguous nucleic acids of the marker polymorphisms or gene(s), or
complementary nucleic acid sequences thereof. In some embodiments,
the means to detect comprises binding ligands that specifically
detect polypeptides encoded by the marker genes. In some
embodiments, the genes detected share 100% sequence identity with
the corresponding marker gene in b). In some embodiments, the means
to detect comprises at least one of nucleic acid probe and binding
ligands disposed on an assay surface. In some embodiments, the
assay surface comprises a chip, array, or fluidity card. In some
embodiments, the probes comprise complementary nucleic acid
sequences to at least 10 to 50 nucleic acid sequences of the marker
genes. In some embodiments, the binding ligands comprise antibodies
or binding fragments thereof. In some embodiments, the assay system
further comprises: a control selected from information containing a
predetermined control level or set of genetic variants or
polymorphic markers that has been correlated with diagnosis,
development, progression, or life expectancy in interstitial lung
disease patients.
[0199] A method of detecting a level of gene expression of one or
more marker genes in a human subject with interstitial pneumonia,
comprising, optionally, obtaining a biological sample from a human
individual with interstitial pneumonia; detecting the level of
expression of a gene selected from MUC5B, TERC, FAM13A, TERT, DSP,
AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5,
CAMKK1 or MMP-7, or homologs or variants thereof, in one or more
cells from the biological sample from the individual. In some
embodiments, the method further comprises detecting the level of
expression of a gene selected from MUC5B, TERC, FAM13A, TERT, DSP,
AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5,
CAMKK1 or MMP-7, or homologs or variants thereof, in one or more
cells from the biological sample from the individual.
[0200] A method of treating an interstitial lung disease in a
subject in need of such treatment, comprising: detecting a level of
one or more marker genes selected from MUC5B, TERC, FAM13A, TERT,
DSP, AZGP1, OBFC1, ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5,
CAMKK1 or MMP-7, or homologs or variants thereof in a biological
sample obtained from the human subject; and, administering an
effective amount of an effective treatment. In some embodiments,
the method further comprises detecting the level of expression of a
gene selected from MUC5B, TERC, FAM13A, TERT, DSP, AZGP1, OBFC1,
ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or MMP-7,
or homologs or variants thereof, in one or more cells from the
biological sample from the individual.
Detection of Genetic Variants
[0201] Methods of detecting a genetic variant are further
described, for example, in U.S. Pat. No. 8,673,565 (the contents of
which are herein incorporated by reference in their entirety).
Genetic variations in the mucin genes are associated with pulmonary
diseases. These genetic variations can be found in any part of the
gene, e.g., in the regulatory regions, introns, or exons. Relevant
genetic variations may also be found in the intergenic regions, e.g.,
in sequences between mucin genes. Insertions, substitutions, and
deletions are included in genetic variants. Single nucleotide
polymorphisms (SNPs) are exemplary genetic variants.
[0202] In particular, 14 independent SNPs are associated with
pulmonary disorders (e.g. FIP or IPF). The studies disclosed herein
demonstrate that presence of one or more of these SNPs associated
with MUC5B can lead to predisposition to a pulmonary disorder. In
addition, in some embodiments, if present, some of these SNPs are
related to a transcription factor binding site. The transcription
factor binding site can effect modulation of MUC5B expression, for
example E2F3 loss, and HOXA9 and PAX-2 generation.
[0203] The disclosure thus provides methods for assessing the
presence or absence of SNPs in a sample from a subject suspected of
having or developing a pulmonary disorder (e.g., because of family
history). In certain embodiments, one or more SNPs are screened in
one or more samples from a subject. The SNPs can be associated with
one or more genes, e.g., one or more MUC genes or other genes
associated with mucous secretion. In some embodiments, a MUC gene
associated SNP is associated with MUC5B and/or another MUC gene,
such as MUC5AC or MUC1. SNPs contemplated for diagnosis,
treatment, or prognosis can include SNPs found within a MUC gene
and/or within a regulatory or promoter region associated with a MUC
gene. For example, one or more SNPs can include, but are not
limited to, detection of the SNPs of MUC5B alone or in combination
with other genetic variations or SNPs and/or other diagnostic or
prognostic methods.
[0204] Methods for detecting genetic variants such as a SNP are
known in the art, e.g., Southern or Northern blot, nucleotide
array, amplification methods, etc. Primers or probes are designed
to hybridize to a target sequence. For example, genomic DNA can be
screened for the presence of an identified genetic element using
a probe based upon one or more sequences, e.g., using a probe with
substantial identity to a subsequence of the MUC5B gene. Expressed
RNA can also be screened, but may not include all relevant genetic
variations. Various degrees of stringency of hybridization may be
employed in the assay. As the conditions for hybridization become
more stringent, there must be a greater degree of complementarity
between the probe and the target for duplex formation to occur.
Thus, high stringency conditions are typically used for detecting a
SNP.
[0205] Thus, in some embodiments, a genetic variant MUC5B gene in a
subject is detected by contacting a nucleic acid in a sample from
the subject with a probe having substantial identity to a
subsequence of the MUC5B gene, and determining whether the nucleic
acid indicates that the subject has a genetic variant MUC5B gene.
In some cases, the sample can be processed prior to amplification,
e.g., to separate genomic DNA from other sample components. In some
cases, the probe has at least 90, 92, 94, 95, 96, 98, 99, or 100%
identity to the MUC5B gene subsequence. Typically, the probe is
between 10-500 nucleotides in length, e.g., 10-100, 10-40, 10-20,
20-100, 100-400, etc. In the case of detecting a SNP, the probe can
be even shorter, e.g., 8-20 nucleotides in length. In some cases,
the MUC5B gene sequence to be detected includes at least 8
contiguous nucleotides, e.g., at least 10, 15, 20, 25, 30, 35 or
more contiguous nucleotides. In some embodiments, the sequence to
be detected includes 8 contiguous nucleotides, e.g., at least 10,
15, 20, 25, 30, 35 or more contiguous nucleotides.
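By way of illustration only, the percent identity between a probe
and a gene subsequence of equal length can be sketched in Python as
follows. The sequences used are short made-up strings, not MUC5B
sequence, and the function names are hypothetical. The comparison is
ungapped and position-by-position, which is a simplifying assumption;
alignment tools that allow gaps may report different values.

```python
from typing import Tuple


def percent_identity(probe: str, target: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and equal length")
    matches = sum(a == b for a, b in zip(probe.upper(), target.upper()))
    return 100.0 * matches / len(probe)


def best_window_identity(probe: str, sequence: str) -> Tuple[float, int]:
    """Slide the probe along `sequence`; return (best identity, offset).

    Ties are resolved in favour of the leftmost window.
    """
    n = len(probe)
    if n == 0 or n > len(sequence):
        raise ValueError("probe must be non-empty and fit in sequence")
    best_identity, best_offset = -1.0, -1
    for offset in range(len(sequence) - n + 1):
        identity = percent_identity(probe, sequence[offset:offset + n])
        if identity > best_identity:
            best_identity, best_offset = identity, offset
    return best_identity, best_offset
```

For example, a 10-nucleotide probe with a single mismatch against
its target window has 90% identity under this definition.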
[0206] The degree of stringency can be controlled by temperature,
ionic strength, pH and/or the presence of a partially denaturing
solvent such as formamide. For example, the stringency of
hybridization is conveniently varied by changing the concentration
of formamide within a range of up to about 50%. The degree of
complementarity (sequence identity) required for detectable binding
will vary in accordance with the stringency of the hybridization
medium and/or wash medium. In certain embodiments, in particular
for detection of a particular SNP, the degree of complementarity is
about 100 percent. In other embodiments, sequence variations can
result in <100% complementarity, <90% complementarity probes,
<80% complementarity probes, etc., in particular, in a sequence
that does not involve a SNP. In some examples, e.g., detection of
species homologs, primers may be compensated for by reducing the
stringency of the hybridization and/or wash medium.
[0207] High stringency conditions for nucleic acid hybridization
are well known in the art. For example, conditions may comprise low
salt and/or high temperature conditions, such as provided by about
0.02 M to about 0.15 M NaCl at temperatures of about 50.degree. C.
to about 70.degree. C. Other exemplary conditions are disclosed in
the following Examples. It is understood that the temperature and
ionic strength of a desired stringency are determined in part by
the length of the particular nucleic acid(s), the length and
nucleotide content of the target sequence(s), the charge
composition of the nucleic acid(s), and by the presence or
concentration of formamide, tetramethylammonium chloride or other
solvent(s) in a hybridization mixture. Nucleic acids can be
completely complementary to a target sequence or exhibit one or
more mismatches.
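By way of illustration only, the dependence of duplex stability on
salt concentration, nucleotide content, length, and formamide can be
sketched in Python using a commonly cited empirical approximation
for the melting temperature of long DNA:DNA hybrids. This formula is
not taken from the present disclosure, is an assumption introduced
here for illustration, and is not suited to short oligonucleotide
probes. The function name and the example inputs are hypothetical.

```python
import math


def estimate_tm_celsius(gc_percent: float, length_nt: int,
                        sodium_molar: float,
                        formamide_percent: float = 0.0) -> float:
    """Approximate melting temperature of a long DNA:DNA duplex.

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC)
         - 0.61*(%formamide) - 500/length
    """
    if not 0.0 <= gc_percent <= 100.0:
        raise ValueError("gc_percent must be between 0 and 100")
    if length_nt <= 0 or sodium_molar <= 0:
        raise ValueError("length and sodium concentration must be > 0")
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length_nt)
```

Under this approximation, lowering the salt concentration or adding
formamide lowers the estimated melting temperature, so a fixed
hybridization temperature becomes more stringent.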
[0208] Nucleic acids of interest can also be amplified using a
variety of known amplification techniques. For instance, polymerase
chain reaction (PCR) technology may be used to amplify target
sequences (e.g., genetic variants) directly from DNA, RNA, or cDNA.
In some embodiments, a stretch of nucleic acids is amplified using
primers on either side of a targeted genetic variation, and the
amplification product is then sequenced to detect the targeted
genetic variation (using, e.g., Sanger sequencing, Pyrosequencing,
Nextgen.RTM. sequencing technologies). For example, the primers can
be designed to hybridize to either side of the upstream regulatory
region of the MUC5B gene, and the intervening sequence determined
to detect a SNP in the promoter region. In some embodiments, one of
the primers can be designed to hybridize to the targeted genetic
variant. In some cases, a genetic variant nucleotide can be
identified using RT-PCR, e.g., using labeled nucleotide monomers.
In this way, the identity of the nucleotide at a given position can
be detected as it is added to the polymerizing nucleic acid. The
Scorpion.TM. system is a commercially available example of this
technology.
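By way of illustration only, the use of primers on either side of a
targeted genetic variation, followed by reading the intervening
sequence, can be sketched in Python as follows. The template and
primer sequences are short made-up strings, not MUC5B sequence, and
all function names are hypothetical. The sketch assumes exact primer
matches and a single binding site for each primer.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon(template: str, forward: str, reverse: str) -> Optional[str]:
    """Region of `template` bounded by the two primers, inclusive.

    `forward` matches the given strand; `reverse` anneals to it, so
    its reverse complement is searched for downstream of `forward`.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start < 0:
        return None
    site = reverse_complement(reverse)
    end = template.find(site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(site)]


def call_base(template: str, forward: str, reverse: str,
              offset: int) -> Optional[str]:
    """Base located `offset` positions after the forward primer."""
    product = amplicon(template, forward, reverse)
    if product is None:
        return None
    index = len(forward) + offset
    if not len(forward) <= index < len(product) - len(reverse):
        return None
    return product[index]


# Made-up sequences: the variable position is 2 bases past the primer.
reference_template = "TTACGGATCCAGTCAGGCATTCGAA"
variant_template = "TTACGGATCCAGGCAGGCATTCGAA"
fwd, rev = "ACGGATCC", "CGAATGCC"
```

Here call_base returns the nucleotide at the interrogated position
of the amplified region, which differs between the two example
templates.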
[0209] Thus, in some embodiments, a genetic variant MUC5B gene in a
subject is detected by amplifying a nucleic acid in a sample from
the subject to form an amplification product, and determining
whether the amplification product indicates a genetic variant MUC5B
gene. In some cases, the sample can be processed prior to
amplification, e.g., to separate genomic DNA from other sample
components. In some cases, amplifying comprises contacting the
sample with amplification primers having substantial identity to
MUC5B genomic subsequences, e.g., at least 90, 92, 94, 95, 96, 98,
99, or 100% identity. Typically, the sequence to be amplified is
between 30-1000 nucleotides in length, e.g., 50-500, 50-400,
100-400, 50-200, 100-300, etc. In some cases, the sequence to be
amplified or detected includes at least 8 contiguous nucleotides,
e.g., at least 10, 15, 20, 25, 30, 35 or more contiguous
nucleotides. In some embodiments, the sequence to be amplified or
detected includes 8 contiguous nucleotides, e.g., at least 10, 15,
20, 25, 30, 35 or more contiguous nucleotides. In some aspects, the
contiguous nucleotides include nucleotide 28.
[0210] Amplification techniques can also be useful for cloning
nucleic acid sequences, to make nucleic acids to use as probes for
detecting the presence of a target nucleic acid in samples, for
nucleic acid sequencing, for control samples, or for other
purposes. Probes and primers are also readily available from
commercial sources, e.g., from Invitrogen, Clontech, etc.
Detection of Expression Levels
[0211] Expression of a given gene, e.g., MUC5B or another mucin,
pulmonary disease marker, or standard (control), is typically
detected by detecting the amount of RNA (e.g., mRNA) or protein.
Sample levels can be compared to a control level.
[0212] Methods for detecting RNA are largely cumulative with the
nucleic acid detection assays described above. RNA to be detected
can include mRNA. In some embodiments, a reverse transcriptase
reaction is carried out and the targeted sequence is then amplified
using standard PCR. Quantitative PCR (qPCR) or real time PCR
(RT-PCR) is useful for determining relative expression levels, when
compared to a control. Quantitative PCR techniques and platforms
are known in the art, and commercially available (see, e.g., the
qPCR Symposium website, available at qpcrsymposium.com). Nucleic
acid arrays are also useful for detecting nucleic acid expression.
Customizable arrays are available from, e.g., Affymetrix. An
exemplary human MUC5B mRNA sequence, e.g., for probe and primer
design, can be found at GenBank Accession No. AF086604.1.
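By way of illustration only, relative expression from quantitative
PCR can be sketched in Python using the widely used comparative
threshold cycle (2 to the power of minus delta-delta-Ct)
calculation. This calculation is not specified in the present
disclosure and is an assumption introduced here; it further assumes
approximately 100% amplification efficiency and the availability of
a reference (housekeeping) gene, which is hypothetical. The Ct
values shown are made up.

```python
def relative_expression(ct_target_sample: float,
                        ct_reference_sample: float,
                        ct_target_control: float,
                        ct_reference_control: float) -> float:
    """Fold change of target in sample versus control (2^-ddCt).

    Each Ct is normalized to the reference gene measured in the same
    specimen before the sample and control are compared.
    """
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    return 2.0 ** -(delta_sample - delta_control)


# Made-up Ct values: target amplifies two cycles earlier in the
# sample than in the control after normalization.
example_fold = relative_expression(22.0, 18.0, 24.0, 18.0)
```

A result greater than 1 indicates higher normalized expression in
the sample than in the control, and a result less than 1 indicates
lower expression.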
[0213] Protein levels can be detected using antibodies or antibody
fragments specific for that protein, natural ligands, small
molecules, aptamers, etc. An exemplary human MUC5B sequence, e.g.,
for screening a targeting agent, can be found at UniProt Accession
No. O00446.
[0214] Antibody based techniques are known in the art, and
described, e.g., in Harlow & Lane (1988) Antibodies: A
Laboratory Manual and Harlow (1998) Using Antibodies: A Laboratory
Manual; Wild, The Immunoassay Handbook, 3d edition (2005) and Law,
Immunoassay: A Practical Guide (1996). The assay can be directed to
detection of a molecular target (e.g., protein or antigen), or a
cell, tissue, biological sample, liquid sample or surface suspected
of carrying an antibody or antibody target.
[0215] A non-exhaustive list of immunoassays includes: competitive
and non-competitive formats, enzyme-linked immunosorbent assays
(ELISA), microspot assays, Western blots, gel filtration and
chromatography, immunochromatography, immunohistochemistry, flow
cytometry or fluorescence activated cell sorting (FACS),
microarrays, and more. Such techniques can also be used in situ, ex
vivo, or in vivo, e.g., for diagnostic imaging.
[0216] Aptamers are nucleic acids that are designed to bind to a
wide variety of targets in a non-Watson Crick manner. An aptamer
can thus be used to detect or otherwise target nearly any molecule
of interest, including a pulmonary disease associated protein.
Methods of constructing and determining the binding characteristics
of aptamers are well known in the art. For example, such techniques
are described in U.S. Pat. Nos. 5,582,981, 5,595,877 and 5,637,459.
Aptamers are typically at least 5 nucleotides, 10, 20, 30 or 40
nucleotides in length, and can be composed of modified nucleic
acids to improve stability. Flanking sequences can be added for
structural stability, e.g., to form 3-dimensional structures in the
aptamer.
[0217] Protein detection agents described herein can also be used
as a treatment and/or diagnosis of pulmonary disease or predictor
of disease progression, e.g., propensity for survival, in a subject
having or suspected of developing a pulmonary disorder. In certain
embodiments, MUC5B antibodies can be used to assess MUC5B protein
levels in a subject having or suspected of developing a pulmonary
disorder. It is contemplated herein that antibodies or antibody
fragments may be used to modulate MUC5B production in a subject
having or suspected of developing a pulmonary disease. In certain
embodiments, one or more agents capable of modulating MUC5B may be
used to treat a subject having or suspected of developing a
pulmonary disorder. One or more antibodies or antibody fragments
may be generated to detect one or more of the SNPs disclosed herein
by any method known in the art.
[0218] In certain embodiments, MUC5B diagnostic tests may include,
but are not limited to, alone or in combination, analysis of
rs35705950 SNP in MUC5B gene, MUC5B mRNA levels, and/or MUC5B
protein levels.
Additional Pulmonary Disease Markers
[0219] The above methods of detection can be applied to additional
pulmonary disease markers. That is, the expression level or
presence of genetic variants of at least one additional pulmonary
disease marker gene can be determined, or the activity of the
marker protein can be determined, and compared to a standard
control for the pulmonary disease marker. The examination of
additional pulmonary disease markers can be used to confirm a
diagnosis of pulmonary disease, monitor disease progression, or
determine the efficacy of a course of treatment in a subject.
[0220] In some cases, pulmonary disease is indicated by an
increased number of lymphocytes, e.g., CD4+CD28- cells.
[0221] Genetic variations in the following genes are associated
with pulmonary disease: Surfactant Protein A2, Surfactant Protein
B, Surfactant Protein C, TERC, TERT, IL-1RN, IL-1.alpha.,
IL-1.beta., TNF, Lymphotoxin .alpha., TNF-RII, IL-10, IL-6, IL-12,
IFN.gamma., TGF.beta., CR1, ACE, IL-8, CXCR1, CXCR2, MUC1 (KL6), or
MUC5AC. Thus, the invention further includes methods of determining
whether the genome of a subject comprises a genetic variant of at
least one gene selected from these genes. The presence of a genetic
variant indicates that the subject has or is at risk of developing
pulmonary disease. Said determining can optionally be combined with
determining whether the genome of the subject comprises a genetic
variant MUC5B gene, or determining whether the subject has an
elevated level of MUC5B RNA or protein to confirm or strengthen the
diagnosis or prognosis.
[0222] Abnormal expression in the following genes can also be
indicative of pulmonary disease: Surfactant Protein A, Surfactant
Protein D, KL-6/MUC1, CC16, CK-19, Ca 19-9, SLX, MCP-1, MIP-1.alpha.,
ITAC, glutathione, type III procollagen peptide, sIL-2R, ACE,
neopterin, beta-glucuronidase, LDH, CCL-18, CCL-2, CXCL12, MMP7,
and osteopontin. Thus, the expression of one of these genes can be
detected and compared to a control, wherein an abnormal expression
level indicates that the subject has or is at risk of developing
pulmonary disease. Said determining can optionally be combined with
determining whether the genome of the subject comprises a genetic
variant MUC5B gene, or determining whether the subject has an
elevated level of MUC5B RNA or protein to confirm or strengthen the
diagnosis or prognosis.
Biomarkers
[0223] The present disclosure provides a peripheral blood biomarker
profile for IPF to demonstrate the use of a predictive biomarker
profile in cases of preclinical pulmonary fibrosis (PrePF) derived
from families with familial IPF. The present disclosure also
provides biomarker identification for association between each
genetic, epigenetic or protein (gene product) biomarker with PrePF
and the predictive value of the combination of biomarkers
associated with PrePF.
[0224] A large cohort of families with familial IPF for genetic
research was established, including 937 families with .gtoreq.2
cases of IPF, and 2375 family members that have been previously
phenotyped as unaffected. This study focuses on subjects with PrePF
to elucidate the processes active in early disease pathogenesis and
to predict or prevent the irreversible fibroproliferative process.
Genetic risk factors, especially the MUC5B promoter variant,
identify individuals with preclinical interstitial changes on
chest CT scan that progress and are associated with reduced
survival. Biomarkers may be used to identify those subjects with
PrePF among those at-risk for IPF. Given the irreversible nature of
IPF, even approved treatments (pirfenidone and nintedanib) only
modestly slow progression and have not been shown to alter the 3-5
year survival. Pirfenidone and nintedanib are effective in patients
with mild disease, suggesting that patients with PrePF may be
targeted for early intervention, before most of the lung has been
irreversibly remodeled.
[0225] Table 1 below shows additional gene expression changes
present in subjects with IPF compared to controls. Specifically,
the expression of the genes listed in Table 1 is upregulated in
IPF compared to the expression of these same genes in control
subjects. Accordingly, the discovery of elevated expression levels
of one or more genes listed in Table 1 compared to a control in an
asymptomatic subject may indicate that the subject has PrePF and/or
that the subject is at risk for developing IPF.
[0226] In some embodiments of the methods of the disclosure, the
subject has a mutation in a nucleic acid or amino acid sequence
encoding a gene or gene product that is upregulated in a subject
having a fibrotic pulmonary disease of the disclosure. In some
embodiments of the methods of the disclosure, the subject has a
mutation in a nucleic acid or amino acid sequence encoding
Leukotriene A4 Hydrolase (LTA4H), Surfactant Protein B (SFTPB),
Breast Cancer Anti-Estrogen Resistance 3 (BCAR3), C-X-C motif
Chemokine Ligand 13 (CXCL13), EPH Receptor A2 (EPHA2), Serum
Amyloid A1 (SAA1), Phospholipase A2 Group IIA (PLA2G2A),
Insulin-Like Growth Factor Binding Protein 3 (IGFBP3), C-C Motif
Chemokine Ligand 28 (CCL28), S100 Calcium Binding Protein A12
(S100A12), Thromboxane A Synthase 1 (TBXAS1), Leukocyte Cell
Derived Chemotaxin 1 (LECT1), Complement C3 (C3), Gastrin Releasing
Peptide (GRP), C-Reactive Protein (CRP), Vitrin (VIT), Insulin-Like
Growth Factor Binding Protein 1 (IGFBP1), Family with Sequence
Similarity 173 Member A (FAM173A), Natriuretic Peptide A (NPPA),
Secreted Frizzled Related Protein 1 (SFRP1), Ezrin (EZR),
Inter-Alpha-Trypsin Inhibitor Heavy Chain Family Member 5 (ITIH5),
Pleckstrin and Sec7 Domain Containing 2 (PSD2), Galectin 3 Binding
Protein (LGALS3BP), Catenin Beta 1 (CTNNB1), Chromodomain Y Like 2
(CDYL2), Matrix Metallopeptidase 7 (MMP7), Apolipoprotein B (APOB),
Proline and Arginine Rich End Leucine Rich Repeat Protein (PRELP),
Eukaryotic Translation Initiation Factor 1A, X-linked (EIF1AX),
Mesencephalic Astrocyte Derived Neurotrophic Factor (MANF), TNF
Receptor Superfamily Member 13C (TNFRSF13C), Deformed Epidermal
Autoregulatory Factor 1 transcription factor (DEAF1), Tumor Protein
Translationally-Controlled 1 (TPT1), Unc-5 Netrin Receptor B
(UNC5B), Phosphatidylethanolamine Binding Protein 1 (PEBP1),
Syntaxin 8 (STX8), Polymeric Immunoglobulin Receptor (PIGR),
Adenine Phosphoribosyltransferase (APRT), Matrix Metallopeptidase 3
(MMP3), Galectin 7 (LGALS7), Bruton Tyrosine Kinase (BTK), NSFL1
Cofactor (NSFL1C), FER Tyrosine Kinase (FER), Regenerating Family
Member 1 Beta (REG1B), SMAD Family Member 2 (SMAD2), Interleukin 1
Receptor Like 1 (IL1RL1), C-C Motif Chemokine Ligand 18 (CCL18),
Acid Phosphatase 2 Lysosomal (ACP2), Eukaryotic Translation
Initiation Factor 4E Family Member 2 (EIF4E2), Neurexin 3 (NRXN3),
IGF Like Family Member 1 (IGFL1), NME/NM23 Nucleoside Diphosphate
Kinase 1 (NME1), Potassium Voltage-Gated Channel Isk-Related Family
Member 1-Like (KCNE1L) or Neurexophilin 2 (NXPH2).
TABLE-US-00001 TABLE 1
TARGET_GENE_SYMBOL  ORGANISM  p-value    B-H q-value  Fold Change
LTA4H               Human     8.70E-43   3.13E-39     3.912
SFTPB               Human     1.17E-37   2.10E-34     3.399
BCAR3               Human     4.28E-25   3.85E-22     2.906
CXCL13              Human     1.30E-29   1.56E-26     2.904
EPHA2               Human     9.62E-23   6.93E-20     2.651
SAA1                Human     6.01E-07   7.84E-06     2.631
PLA2G2A             Human     8.19E-21   2.95E-18     2.171
Igfbp3              Mouse     1.18E-18   2.66E-16     2.149
CCL28               Human     1.22E-22   7.30E-20     2.135
S100A12             Human     1.06E-20   3.45E-18     2.125
TBXAS1              Human     1.60E-21   7.20E-19     2.11
LECT1               Human     4.17E-19   1.00E-16     2.082
C3                  Human     7.08E-07   8.95E-06     2.062
GRP                 Human     8.35E-09   1.66E-07     1.988
CRP                 Human     1.36E-08   2.61E-07     1.957
VIT                 Human     2.47E-17   4.45E-15     1.929
IGFBP1              Human     4.32E-11   1.56E-09     1.914
FAM173A             Human     2.19E-13   1.84E-11     1.904
NPPA                Human     5.02E-12   2.58E-10     1.877
SFRP1               Human     1.74E-20   5.23E-18     1.866
EZR                 Human     6.41E-10   1.72E-08     1.809
ITIH5               Human     5.11E-21   2.04E-18     1.705
PSD2                Human     5.38E-18   1.08E-15     1.689
LGALS3BP            Human     8.06E-22   4.15E-19     1.678
                              1.18E-05   0.000102     1.668
CTNNB1              Human     5.66E-12   2.87E-10     1.625
CDYL2               Human     4.11E-07   5.59E-06     1.622
MMP7                Human     1.56E-19   4.02E-17     1.621
APOB                Human     8.73E-13   6.42E-11     1.597
PRELP               Human     1.13E-10   3.53E-09     1.595
EIF1AX              Human     2.13E-06   2.31E-05     1.59
MANF                Human     0.00458    0.015006     1.585
TNFRSF13C           Human     1.77E-11   7.31E-10     1.573
C3                  Human     2.40E-16   3.93E-14     1.566
DEAF1               Human     0.000221   0.001192     1.565
TPT1                Human     1.22E-12   7.82E-11     1.548
UNC5B               Human     2.06E-34   2.18E-12     1.547
PEBP1               Human     4.92E-11   1.72E-09     1.544
STX8                Human     8.82E-12   4.13E-10     1.537
PIGR                Human     1.29E-09   3.19E-08     1.532
APRT                Human     1.51E-07   2.26E-06     1.525
MMP3                Human     9.50E-07   1.15E-05     1.524
LGALS7              Human     7.51E-05   0.000474     1.514
BTK                 Human     1.47E-09   3.52E-08     1.511
NSFL1C              Human     7.33E-11   2.40E-09     1.506
FER                 Human     2.24E-07   3.24E-06     1.503
REG1B               Human     6.68E-11   2.25E-09     1.502
SMAD2               Human     4.39E-10   1.25E-08     1.493
IL1RL1              Human     9.55E-07   1.15E-05     1.492
CCL18               Human     1.25E-13   1.07E-11     1.491
ACP2                Human     3.73E-08   6.33E-07     1.488
EIF4E2              Human     1.67E-12   1.02E-10     1.483
NRXN3               Human     2.33E-17   4.42E-15     1.48
IGFL1               Human     5.07E-10   1.40E-08     1.474
NME1                Human     1.43E-10   4.39E-09     1.463
KCNE1L              Human     3.93E-20   1.09E-17     1.462
NXPH2               Human     9.66E-30   2.47E-08     1.451
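By way of illustration only, the Benjamini-Hochberg (B-H) q-value
reported in Table 1 is conventionally obtained from a full list of
p-values by the step-up adjustment sketched in Python below. The
function name and the example p-values are hypothetical. The
q-values in Table 1 cannot be reproduced from the listed rows alone,
because the adjustment depends on the total number of genes tested,
which is not stated here.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, in the input order.

    Each p-value is scaled by m/rank (rank 1 = smallest p-value),
    then made monotone by taking a running minimum from the largest
    p-value downward.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, p_values[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


# Made-up p-values, for illustration only.
example_q = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A gene would then be reported as differentially expressed when its
q-value falls below a chosen false discovery rate.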
[0227] Table 2 below shows additional gene expression changes
present in subjects with IPF compared to controls. Specifically,
the expression of the genes listed in Table 2 is downregulated in
IPF compared to the expression of these same genes in control
subjects. Accordingly, the discovery of decreased expression levels
of one or more genes listed in Table 2 compared to a control in an
asymptomatic subject may indicate that the subject has PrePF and/or
that the subject is at risk for developing IPF.
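As a minimal illustration of the comparison described in paragraph [0227], the following Python sketch lists which Table 2 genes show a lower expression level in a subject than in a control. The function name, the dictionary-based data layout, and the rule that any decrease counts are assumptions made for illustration only; the disclosure does not specify a threshold or a software implementation.

```python
# Gene symbols copied from Table 2 (genes reported as downregulated in IPF).
TABLE_2_GENES = {
    "SFTPD", "GAPDH", "HIST1H1C", "YTHDC1", "PLXNA1",
    "SPINK6", "LRPAP1", "SCGB3A1", "H2AFZ", "C1orf162",
}


def decreased_table2_genes(subject_levels, control_levels):
    """Return the Table 2 genes whose level in the subject is below the control.

    Both arguments are hypothetical dicts mapping gene symbol -> expression
    level on a common scale. Genes missing from either dict are skipped.
    """
    decreased = []
    for gene in sorted(TABLE_2_GENES):
        if gene in subject_levels and gene in control_levels:
            if subject_levels[gene] < control_levels[gene]:
                decreased.append(gene)
    return decreased
```

A non-empty result would correspond to "decreased expression levels of one or more genes listed in Table 2"; any clinically meaningful cutoff would have to be chosen separately.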
[0228] In some embodiments of the methods of the disclosure, the
subject has a mutation in a nucleic acid or amino acid sequence
encoding a gene or gene product that is downregulated in a subject
having a fibrotic pulmonary disease of the disclosure. In some
embodiments of the methods of the disclosure, the subject has a
mutation in a nucleic acid or amino acid sequence encoding
Surfactant Protein D (SFTPD), Glyceraldehyde-3-Phosphate
Dehydrogenase (GAPDH), Histone Cluster 1 H1 Family Member C
(HIST1H1C), YTH Domain Containing 1 (YTHDC1), Plexin A1 (PLXNA1),
Serine Peptidase Inhibitor Kazal Type 6 (SPINK6), LDL Receptor
Related Protein Associated Protein 1 (LRPAP1), Secretoglobin Family
3A Member 1 (SCGB3A1), H2A Histone Family Member Z (H2AFZ) or
Chromosome 1 Open Reading Frame 162 (C1orf162).
TABLE-US-00002 TABLE 2
TARGET_GENE_SYMBOL | ORGANISM | p-value | B-H q-value | Fold Change
SFTPD | Human | 8.19E-15 | 9.83E-13 | -2.262
GAPDH | Human | 1.46E-09 | 3.52E-08 | -2.096
HIST1H1C | Human | 3.68E-18 | 7.80E-16 | -2.011
 | | 3.63E-16 | 5.69E-14 | -1.964
YTHDC1 | Human | 1.19E-11 | 5.38E-10 | -1.699
PLXNA1 | Human | 1.64E-12 | 1.02E-10 | -1.64
SPINK6 | Human | 3.68E-07 | 5.04E-06 | -1.635
LRPAP1 | Human | 2.65E-15 | 3.53E-13 | -1.521
SCGB3A1 | Human | 3.35E-07 | 4.61E-06 | -1.518
H2AFZ | Human | 3.91E-14 | 3.91E-12 | -1.501
 | | 2.95E-11 | 1.16E-09 | -1.493
C1orf162 | Human | 1.29E-84 | 7.52E-04 | -1.458
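The q-value column of Table 2 is labeled "B-H", which presumably refers to the Benjamini-Hochberg false discovery rate adjustment. The Python sketch below shows that standard procedure under this assumption; the function name is hypothetical, and the q-values in the table cannot be reproduced from the listed rows alone because the adjustment depends on the total number of genes tested, which is not stated here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values (q-values).

    The result is in the same order as the input. Each q-value is
    p * m / rank, made monotone by taking the running minimum from the
    largest p-value downward, and capped at 1.0.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values
```

A gene would then typically be reported as differentially expressed when its q-value falls below a chosen false discovery rate, for example 0.05; that cutoff is likewise an assumption and is not taken from the disclosure.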
[0229] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding MUC5B, TERC, FAM13A, TERT, DSP, AZGP1, OBFC1,
ATP11A, IVD/DISP2, DPP9, SIGLEC14, ADM2, TSPAN5, CAMKK1 or
MMP-7.
[0230] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Telomerase RNA Component (TERC). In some
embodiments the polymorphism is rs6793295 comprising (SEQ ID NO:
1).
TABLE-US-00003 (SEQ ID NO: 1) AGAAAGAAGT CATGAAAGTA GGAACCACAT
TTTTACTCAT CTTTCTGTCT CCAGCAAGCA GCTTACTGCT TTTCATACAC ATTTTGCTTT
TATTACTCAT GATTTCAAAG GTGTAATGGT TCAGCCACAT CAATGTAACA AACAGTTCAC
ACTGGGCTCT TATAGTCTGG CCTTTAAAAC CTTCACTATT TATGCTTTCA TCTTAACTAC
TTTGACCCTC ACAGGTTTAC TCACTAAGAA CTTGAGTTTC AAGAGAAAAG ATGACATGTT
TGCTGCTTAA ACAAGCAATA TCTAAAAGCA TATTTAGTTA TAAACGTCTT ACCAAGAATT
GATATAATTT TCATTTAAAC ATTTTTATAA ATAGTAGTTT ACAAGATATA GTAAGTACAT
CTCTAAAAAT ACAGTGTATT CATGTACCTT GACATAAACT TGTAGTAGTA CCTTAGTTTT
ATTCATGTTG TTATATTAAC TACCATCACT TTGAATACAT ACCTGTTCAC B GTACAGTATA
GGTCGGTTTA GGTTTATTGC CTTAATTGCT TGGTTTTGAG TTAGTACTGT AGCAAATGCT
ATCACACTTT GCATTCCCTA AAAACAGGTA AATTCATTAA GGAAACAGAC AAAGTATATA
ATAATCTCGC TACATAAATA TTTCAAGATC AGCTATCTGC ATTCTGATAA AATTGTTTTT
AAAATTTAAG CATTCCTTGG ACTTTGAATT GTAAGTTGAT CAAATTCAAA AATGAATTGT
TACTGTATTC TTCTCTCCTG GCCCTAAAAT CTATCTAAAA CATGGCATGG GGAGTTTCTT
AATGTTTCAG TGTCCATTTC CTGGGTGTTT CCCTCTAGGT TTTTTTTCCT CACCCCTCAA
GCTTCTATGT GGATCCCAGC TAGAGCTCAT ACTACTTATC CAACACACAT CATTGTGCAA
GCACTCTTTT ATATTCATAC TAGTACTTTT AAGTGTGTGT GCGGTGGGAA AAGGTTACCA
ATCACATTTT
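In SEQ ID NO: 1 the polymorphic position appears as a single letter set apart from the flanking bases (here "B"). Assuming this letter is an IUPAC nucleotide ambiguity code, as is conventional for SNP flanking sequences, the following Python sketch locates such positions and lists the alleles each code allows. The function name is hypothetical, and which allele is the risk allele is not determined by this code.

```python
# Standard IUPAC nucleotide ambiguity codes and the bases each one allows.
IUPAC_AMBIGUITY = {
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_variant_sites(flanking_sequence):
    """Return (0-based position, code, allowed bases) for each ambiguity code.

    Whitespace used to group the sequence into blocks is removed first, and
    lowercase bases are treated the same as uppercase bases.
    """
    seq = "".join(flanking_sequence.split()).upper()
    return [
        (pos, base, IUPAC_AMBIGUITY[base])
        for pos, base in enumerate(seq)
        if base in IUPAC_AMBIGUITY
    ]
```

For example, applying the function to the fragment "ACCTGTTCAC B GTACAGTATA" from SEQ ID NO: 1 reports one site with code "B", meaning C, G, or T at that position.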
[0231] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Family with sequence similarity 13 member A
(FAM13A). In some embodiments the polymorphism is rs2609255
comprising (SEQ ID NO: 2).
TABLE-US-00004 (SEQ ID NO: 2) GTATTCATCA ACTCCTATTT CATTCCCTCT
TCCTGTGCTC ACTGGAAGAT GACATTTCCC AGACTTCCAA GAATGTTACT GAGTTCTGGA
ATGTAAGTAG AAGGGATAAG TATCACTTCT GTGCTGTGGC GGTTATGGAC CTGTGAACTT
TGCACACGCC TTCTATCTTC TTTTTCAGTG TCCATTTCAG AGGGCATGTT TTCAGATGAA
ACCAGTAGAA GATGGAAGCA GCCTGTGACT AGAATCACTG CTTAGGGTCT TGCTGCCTAG
GAATCCCACT CTACCTGCAA CAGACTGTGA AAGAACCGAG AAATACACTG ATTTTGAACA
TAGCCCATAC TATAATGGGG ATGTTTGTTA CAGCAGTTAG CATTAAAAAC CTTGGCTAGG
CATTGGTCAT AATTGTAGAA CACAGCAAAT GAAGGGAAAC TGGAACATAG AGGCCAGTGA
GAACTTTAGG GTTAATGAAA AATGAGGGCA ACCAGGATAA TTTGGTTCTT K GCCAAATAGG
AAGGTGAAAC CAAAGGTAGA CTGGAGGTCA GAAAATCAGT CCAGCACATG TGATGTTTTC
ATTTAGTTGC CTGTATGTCT GTCTGGTCTC CAGCTCAGCC TGGCTCCTTG AGGTAAGAGG
CAGTGGCTGT TCACCTTTGC ATCCCAGCAC CTGGCATACA ATAGATGGGA TGAAATGTTC
AAACTGAGCC TAAGCTTCAG GGTGCTTATC AAAGCAGGGA AGATACACAA GAGGAGATGA
TTCAGGTCCA GGGCAGGTCA GGTATCTAAA CCCAGTCTCT TAGGAAGCTG GATCCTCCGA
ACCAGGGAGA ACAAGCTGGA TATGCACTGG ATTTCCCAGC AGTACTGATC TAGAGACTCT
CATAGAGTCC CTTTTATTCC TTGGCCTAGG GTTACAACTG CTTATAGCAT CTGGAAAGAC
TCAACACCTC AAAAGAGACT TTCAGTAGAT ACAGCAAATA CACTCATGGA ATTGATAATT
AAGCTTCAAT
[0232] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Telomerase Reverse Transcriptase (TERT). In
some embodiments the polymorphism is rs2736100 comprising (SEQ ID
NO: 3).
TABLE-US-00005 (SEQ ID NO: 3) ATTGTCGTTG TTTGCTTTTG TTTATTGAGA
CAGTCTCACT CTGTCACCCA GGCTGGAGTG TAATGGCACA ATCTCGGCTC ACTGCAACCT
CTGCCTCCTC GGTTCAAGCA GTTCTCATTC CTCAACCTCA TGAGTAGCTG GGATTACAGG
CGCCCACCAC CACGCCTGGC TAATTTTTGT ATTTTTAGTA GAGATAGGCT TTCACCATGT
TGGCCAGGCT GGTCTCAAAC TCCTGACCTC AAGTGATCTG CCCGCCTTGG CCTCCCACAG
TGCTGGGATT ACAGGTGCAA GCCACCGTGC CCGGCATACC TTGATCTTTT AAAATGAAGT
CTGAAACATT GCTACCCTTG TCCTGAGCAA TAAGACCCTT AGTGTATTTT AGCTCTGGCC
ACCCCCCAGC CTGTGTGCTG TTTTCCCTGC TGACTTAGTT CTATCTCAGG CATCTTGACA
CCCCCACAAG CTAAGCATTA TTAATATTGT TTTCCGTGTT GAGTGTTTCT K TAGCTTTGCC
CCCGCCCTGC TTTTCCTCCT TTGTTCCCCG TCTGTCTTCT GTCTCAGGCC CGCCGTCTGG
GGTCCCCTTC CTTGTCCTTT GCGTGGTTCT TCTGTCTTGT TATTGCTGGT AAACCCCAGC
TTTACCTGTG CTGGCCTCCA TGGCATCTAG CGACGTCCGG GGACCTCTGC TTATGATGCA
CAGATGAAGA TGTGGAGACT CACGAGGAGG GCGGTCATCT TGGCCCGTGA GTGTCTGGAG
CACCACGTGG CCAGCGTTCC TTAGCCAGTG AGTGACAGCA ACGTCCGCTC GGCCTGGGTT
CAGCCTGGAA AACCCCAGGC ATGTCGGGGT CTGGTGGCTC CGCGGTGTCG AGTTTGAAAT
CGCGCAAACC TGCGGTGTGG CGCCAGCTCT GACGGTGCTG CCTGGCGGGG GAGTGTCTGC
TTCCTCCCTT CTGCTTGGGA ACCAGGACAA AGGATGAGGC TCCGAGCCGT TGTCGCCCAA
CAGGAGCATG
[0233] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Desmoplakin (DSP). In some embodiments the
polymorphism is rs2076295 comprising (SEQ ID NO: 4).
TABLE-US-00006 (SEQ ID NO: 4) ATTTGGGAAC CTTTAAAAAA TATTCTGGCT
TCAAAAATAC TCCATATTTA CATCTTTGGT TCTATCTGAA GTAAAGCCGT GATGGTGTGC
GTAAGTGAAA CAGGTGCAAA GGGGCAACAA CAAAGGGCGC CTCTCTTTGT CTTTGTGTCG
CAGGCGGAGA TGGACATGGT GGCCTGGGGT GTGGACCTGG CCTCAGTGGA GCAGCACATT
AACAGCCACC GGGGCATCCA CAACTCCATC GGCGACTATC GCTGGCAGCT GGACAAAATC
AAAGCCGACC TGGTACTTGT CTGTGTTTCA TTTTAGAGTC TTCAAAATAT CTACCGAAGG
ATCGTGTAAT TACTCAATCC CAGGGAGTTT CTTCTGAAAC ATTGCTATTA TTTCTTTCCC
AGAAGACTGG AAATGTTTAG AAATCCCACT TCTTAAATGG GGAAGTGGAA TCAGTAGCCC
TATTAGAGAT TATGTTAACA CTTGAAGAGG AGTTAAACCA GAGGCTGAGG K TGTGCAAACA
CTCATTTGCA GTTTGTGAAT AAGTCTCTTT AGGGGTGGCA GTTTGTTTCT GCGGTAAGCA
GAACATCTTT TTGAATAGGG GAAATGCAAC AGTCTTATAC AGTAGTTTGT GTCATTGGTG
AATCCTTTCC TAGGTGGTAA TTAAAACATT ATTTCTACTG AGCAAAGCCA TATGTCATCC
CGACACCCGC TCCCATGCTG AAAAAAGTCA GACTTGAAAC TGGGTTGAGA ATTACAGCAT
AAAATCATAA CTGATCTTAA GTGCTTAGTT TCCCGCAGGT CTCTACACTT GTAAATCACT
AAACTTTTTT TTTTTTTTTT TACCTGAGAC CATAGCTTCT CATCCTCATT TCTTCTTCTG
GCTTTTTGGG GCTTACTTTT GTCCACCTGA GCCCCTGACC AACTTTCTCC TTCATTTCTC
TAAGACCTAG GGAATCCTAA ATGATGTCTT TAAACTTTAA GACAATTTTC TAACACGTGA
GTCTTTAAGT
[0234] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Zinc-alpha 2-Glycoprotein 1 (AZGP1). In some
embodiments the polymorphism is rs4727443 comprising (SEQ ID NO:
5).
TABLE-US-00007 (SEQ ID NO: 5) CCCAACCCAA ATAAGCACTA TAACCTCTTG
TTATTCACTT CTCATGCAAC CAGTCTTCTG TTCTCTGTGA GTCTTTAGGA AATGAGGAGC
ATGATCTTCT AGCAGTAAAA CACCTGTAGA GAATTGCCTT ATGTTTTTTG TTTGTTTATT
TGTTTGTGTG CTTTGGTTTG GTTTGCTTTT TTTTTTTTTT TTTTTTTTTT TTTGAGATGG
AGTCTCGCCC TGTTGCCCAG GCTGGAGTGT AGTGGCGAAA TCTCGGCTCA CTGCAACCTC
CACCTCCCTG GTTCAAGCAA TTCCCCTGTC TCAGCCTCCC GAGTAGCTGA GATTACAGGT
GCACACCACC ACGCCCGGCT AATTTTTTTG TATTTTTAGT AGAGATGGGG TTTCACCATG
TTGGCCAGAC TGGTCTCGAA CTTCTGACCT CAGGCAATCC GCCTGCCTCA GCCTCCCAAA
GCGCTGGGAT TACAGGCATG AGCCACTGCG CCCCGCCTCC ATGTTAATCA M TCTTTCTGAT
TTCAAATAAC TCATTATCCC CATGACCTTA TGGATTTGTT TTTCCTCTTC ATCCACAAAA
TTCTCCAGAG AAGTCTCCCT TGTTATCTCT TGGCTGTGCT TTCTATCTCA CCAGTTATCT
TTCTCCAAAG AGCTTCCTCT GCAAAGAAGC TTTGTATATG AAGACCATGT GGGGGCTGAA
TCAAGACCAA GTTTCACAAC CTAAAAGTAG TTCACAAAGC TTCCTTGCCT CTATTCTCTG
CAAATCTGTA AACTCTTCAG CTGACCCAAT TTCTCTCTTT AGCCTTCAGA GATTATTTTA
TTTTATTTTA TTTCATTTCA TTTCATTTCA TTTTGACAGA ATCTAGCTCT GTCGCCCAGG
CTGGAGTGCA GTGGCACCAT CTTTGCTCAC TGCAACCTCC CCCTCACAGG TTCAAGCAAC
TGTCCTGCCT CAGCCTCCCG AGTAGCTGGG ATTACAGGCG TGAGCCACCA CGCCCAGCTG
ATTTTTTTTT
[0235] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Oligonucleotide/oligosaccharide-binding Fold
Containing 1 (OBFC1). In some embodiments the polymorphism is
rs11191865 comprising (SEQ ID NO: 6).
TABLE-US-00008 (SEQ ID NO: 6) CCTCTACTGC CGTACACCCC ACCACTCAGC
CTTGGAGTGC CTGTGTGCAG AGCAGGGCTG AGGCATGGTG CTGCTTTGGT GGTCTAGGTT
TGCTGCAGGG CCAGGTGGCC TGAGCTCCAG GCAGGATCTC TGGCTGCACT CAGCCCTTTC
TGCCTCCCCA AATGCTCTAT ATCACTATTT GTACACTGAG CAGAGTAAAG TTAGAGAGAA
CTGTTTTATA GAATAGGGCT GGCCCCCGCT CCCCTGGCCT ACGTGATGGT CCTTCCTGGC
TGCCAGGTAC TTGTTTGTAT TAGAGACAGA CACTCCACAG GGTCTGTTGT GGCCCACAGC
ACATAGGCAA TCAGAGGCAG AAAGCAGAGC TGTTTGGACC CACAGAGGGC CGGCTGTCTG
CCACTGAAAT GTCTTTCCAG TTGGTTGAGA AGCAGCAGGA TGCTCTGCTG GTGATGTCTG
AAAGTCCCAG GATTCTTTGG GTCTCCAAGG AGATCCTAGC ATATACCACT R TCGTGGTTTT
AATAAAGAGC AAAAACACTT TCAGATGGGG AGAAGAGTGG AACAAAAGGT ATTCTTCCTG
GGTTGAAGTC TGGGGGAAAG GCATTGAGAA GACTGGGCTA ATGGCACAAA CCAATGAAGT
ACTCAAGTCA CCTGTGATGG AGGCCAGTCA TCCAATGGTA TCAACTTTGT ATGTGGCAAC
ACTTAATAAA AATCTGAACA GGTCTTCACT TGTGGACACA GTAGACTTTC TTGAAAAAGG
ACAGAAAAGT GAGCCCTGTG AATTTTCATC TCACGGACTG ACAACAATGA CTTGCCTTTA
AGGACAGTCA CTCAAGATGA AGATGCAACA AAACCCTTCC AGTTCCAAGT GGCTGATGAA
AAAAAAAAAA TCTTAAAAGC ATCACAGAAC AACGGAGAAA GAGATCAGAA GACTATAACA
GATAGTTTGA ATTTTAAAAC TCAGAGAAAA GCAACTGAGG AGGAAATACA CTGCTTAGAA
AGAAGAAACT
[0236] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Mucin 5B (MUC5B). In some embodiments the
polymorphism is rs35705950 comprising (SEQ ID NO: 7).
TABLE-US-00009 (SEQ ID NO: 7) TGGACGGCCT CTGAAGGGGT CTGTGGGGTC
CTGGACGGGT CCCCATTCAT GGCAGGATTA ACCCCCCTCG GGTTCTGTGT GGTCCAGGCC
GCCCCTTTGT CTCCACTGCC CCCTGGCCAG AATGAGGGAC AGTGACCCAC CCAGGGCTGG
GCCTGGCTCA GACTCCGTCA GAGCCGCAGG GCAAGTTCCT GGCACGTCCG AGGTGGGAGG
CTCCTCTGCG CTCCAGGAGG CTGTGCCTGG CCCCCCTTCC CGGCAGGAAC CGGCTGTGTC
CCTTTCCTTC CTTTATCTTC TGTTTTCAGC D CCTTCAACTG TGAAGAGGTG AACTCTTCAA
ACACGCTGAG CAAACAGGCC CGACTCCCAG GGCCGCATCC GGGATGTCTC AATAGCTGTG
GCCTTGACGT CCACCTCGGA CCCCTGCCCC GGACCCAGCC CAGTTCCCAA TGGGCCCTCT
GCCCGGGGAG GTGCCTAGTG GGAGGGACGA GGGCAAAGTC GGGGCCCCCA CTTGTTTGGT
GTCACTGTGT GCCAGCGGCC ACTGGCGGGC GAGGCTGTTC CAGGGTGGAG GCGGGGAGGG
TTGGACCACA GGCACTGAGC GGGGACAGAG
[0237] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding ATPase Phospholipid Transporting 11A (ATP11A).
In some embodiments the polymorphism is rs12787690 comprising (SEQ
ID NO: 8).
TABLE-US-00010 (SEQ ID NO: 8) GTCATTGGTC AAATGTGGCC TGTATCTAAA
TTCCAACTGT TAGAATCATA GACATCTAGA GCTTACGTCA GTTTTAGATA TTTCTTATGA
ATTCTCAGAA TTCATAGATT CTCATTTTTA TTCTTAGACT TCTCAGATAT TCCGTTTTTG
ATAGTATACC CTTCTGAGTC TAATATGTCC TAAAGTGCGA ACTTGTACAA TTTttttttt
tttttttttt tttttttttt t K tgataaggag ttttactctg tcacccaggc
tggagtgcag tgacccgatc tcggctcact gcaacctctg cctcccgggt tcaagtgatt
gtgatgtctc agtctcccaa gtagctggga ttacaggctc ctgccaccac atgcctagct
aattgttata ctttagtaga aatggggctt cgccgtgtta gtcaggctgg tcttgtactc
ctgacctcag ttgatctgcc taccttggcc cccaaggtgc tgggattaca ggcatgagcc
accgcgcctg accCAGCTTC TTAAATTATT CTGGGCCACC AGTAATGTGA ATCATGtaaa
ttaaaatata taattaaaCA AAATCATATA GCGATTAGAG ATAATAGTTG TGAAATGCTT
GAAAAATCAT AGGCATTTAA TAAATAGAAG CCATTCCAAT TAGGATTCTT CTTGATTTTT
TTTCAAGACC AAAAAAATAC TCttttaaat atttattata ataCTCCATG
[0238] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Isovaleryl-CoA dehydrogenase (IVD)/Dispatched
RND Transporter Family Member 2 (DISP2). In some embodiments the
polymorphism is rs2034650 comprising (SEQ ID NO: 9).
TABLE-US-00011 (SEQ ID NO: 9) aggctgcagt tagtcatgac tgcgcgctgc
actccagcct gggtgacaaa gtgaggccct gtctcaaaaa caataaaaaa TTTAAAAGAG
CTGAGCATGG AGGCcacttt gggaggctga ggcaggcaga tctcttaagc ccaggagtct
gagaccagcc tgggcgacat gatgaagccc catctctaca aaaaatacaa aaaaattagc
tgagctttat ggcaaatccc tgtaatccca gttacctagg aggcccaggc aggaagatgg
cttgagccca aaaggttgag gctgtagtga gctgtgatca tgaacagagt gagaccctgt
ttcaaaacaa aatgaaaaac aaacaaacaa aaaaaCCAAG AAAACAAGAA AACAAAAACT
ATACAATGAT GAGCCAAAAA GCAAGATATG GAAGAatata tatatatata tatatatata
tataGTATGA GTCCAGCTAT AGAAAGTTTG AAATCAGGCA ACCTAAACAA TATTGTTCAG
GGATCTATAC AGAGGCAGGA AGCCATTGAG AAAGGTAAGG GGAGGATTAT CACCAAATTC
AGGATGGTGG CTCCCCTGGG GAGAATATGT CAAGGAGGGG CACATGGGCT TGGAATACTG
TCTTCATTGA CCTGCGTGTT GGGTACACAG GAGTTTGTTA TTTTTCACAC TGCATATGTG
CATGTATATA CTCTCCCATA TATACCATGC ATTTCACACA AGAACACAAA GGCTGTGTGG
CTCTGCTCTG CCCCTTTCCC CTTCCAGCTC CCATTCTCGT C Y TCAGCTAGCA
GAGGAGGGTC AGGGTCTTTT AGCACAGCTT CCTTCTGTCT CTGAGTGGGT CAGAGGAGTA
CGGGGATGAG GGCCTCCCTT CTGCGGCTGG GCTCTGGCCA CTCCAGGGTG GGAAGGCCTG
GAGAAAACAG GGCCAGGCAA AGCCGGCTGG CCCTGCTGTT TCTGCCAATG CTGGGATTAG
GCCAGGGCTC TGGCCCACCT GTCATTTCAC TCATTCAGCA TGAACATAGC CACTGAGCAC
TTACTGTGAG CCCCGGGTGC TATTGGGAGA GTTCAGATAA GTGAGAGAGG GTCTTTGACC
TCAAAGATCT TACAGAGAGG ACCGTATACA CAAATAACAG TATACCAGCA AAATGTGAGC
TAAGTGTCAT GTGACTACTC atctactctt tcaataaata tttgttgtgc acctattaca
tgccaggaac tgtgctggat ggtgatcatg taaagacagt caaatcacag tcctagctct
cagattcaca gcctgcctaa tgctggggaa acTGGAAT
[0239] In some embodiments of the methods of the disclosure, the
subject having PrePF or at risk of developing IPF has a mutation in
a sequence encoding Dipeptidyl Peptidase 9 (DPP9). In some
embodiments the polymorphism is rs12610495 comprising (SEQ ID NO:
10).
TABLE-US-00012 (SEQ ID NO: 10) CCAGCCAGAA GGGGCGCAGT TTGTTAGTTC
AGCTCCTCCT GAGACAGAAA TAAAGACACG AACCAAAGGA CATCAGCACT TACAGGGCTC
TCAGGTCACA CACAGGATGT CCGCGCCCAC TGCAGAGCTG CAGGTCCCCT CCAGGGCAGT
GGGGAGCCAC AAGCAGCGTT AGGCAGCGGC TGGGACCAGG ACCGCCTGAG CACTCAAGAA
CCCCCACTGC CCCAAGCACT GCTGGCAGCA AGCCCAGAAA ACTGAGCCCG GGGAGCTCCT
CTGAGCGGCC TAAGCACCCC TCTAAGCTGT GCTGCCCCAA TTCAAGCCTG GCTCACGGCA
GCAAAGAAAA AATGTGACCT TCGGAGCTCC CAAAGGGGCC ACCCATAAGC TGAGAGCCTG
CCCGGAAGCA CTTATAGACC CGCGTGGCTT GTTTTCATTG CAAAGAACAA TAAAAATTAT
CTTGCCTCTG ATCACCACTG ATAGCCCAAG AAGCAAAAAT TCGATCCCGG D GATGAGAAAT
GAAATGAAAC ATCGCGAGAA ACTTCCAGGA ATCTTCTGGA TGTGGCTAGA CTCTTTAGCT
TGAGCTTCCA GACAGGCCGA GGCTTGGTGC TGGAGCCTGG CCCTCCGCTG ACCTCTCTTC
TACCCGGGGG CACAGCCCGG ATTGCAGAGA GGCTGGCGCA AGAGTGAGGG AGCGAGGGCT
AGCCTGTGAT GGGCTTTCTC CACCTAGCAC CACCCTATGC TGTGGCTCAG GGGAGTCAAG
AGTTTACACA GCTGCAGAGA TGGATTCCAG GCCACTTACT CAAGTCTACC TACTCCTTCC
TTCGGCCAAT CAGCTGGGTG CCTCTGCGGC CTGTGACACC ACCAGCAAAC AGCTCCAGAC
CTCCTAGCAT GGTCTCTGTC AAGGCTGGGT GGCAGATCTG TGATCTCCTT TTTAAATTTT
TCATTTTTTT TAAGAGATGG GGTCTTGCTA TATTGCCCAG GCTGGTCTCA AACTCCTGGG
CTCCAGCGAT
[0240] In some embodiments of the methods of the disclosure, the
wild type human MUC5B gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_002458.2):
TABLE-US-00013 (SEQ ID NO: 11) 1 cacccggccc ggctccctcc ctgcccgtcc
ccgtcccccc acccgtgcca gcccccagga 61 tgggtgcccc gagcgcgtgc
cggacgctgg tgttggctct ggcggccatg ctcgtggtgc 121 cgcaggcaga
gacccagggc cctgtggagc cgagctggga gaatgcaggg cacaccatgg 181
atggcggtgc cccgacgtcc tcgcccaccc ggcgcgtgag ctttgttcca cccgtcactg
241 tcttccccag cctgagcccc ctgaacccgg cgcacaatgg gcgggtgtgc
agcacctggg 301 gtgacttcca ctacaagacc ttcgacggcg acgtcttccg
cttccctggc ctttgcaact 361 acgtgttctc tgagcactgc cgcgccgcct
acgaggactt caacgtccag ctacgccgag 421 gcctagtggg ctccaggcct
gtggtcaccc gtgttgtcat caaggcccag gggctggtgc 481 tggaggcgtc
caacggctcc gtcctcatca atgggcagcg ggaggagctg ccttacagcc 541
gcactggcct cctggtggag cagagcgggg actacatcaa ggtcagcatc cggctggtgc
601 tgacattcct gtggaacgga gaggacagtg ccctgctgga gctggatccc
aaatacgcca 661 accagacctg tggcctgtgt ggggacttca acggcctccc
ggccttcaac gagttctatg 721 cccacaacgc caggctgacc ccgctccagt
ttgggaacct gcagaagttg gatgggccca 781 cggagcagtg cccggacccg
ctgcccttgc cggccggcaa ctgcacggac gaggagggca 841 tctgccaccg
caccctgctg gggccggcct ttgcggagtg ccacgcactg gtggacagca 901
ctgcgtacct ggccgcctgc gcccaggacc tgtgccgctg ccccacctgc ccgtgtgcca
961 cctttgtgga atactcacgc cagtgcgccc acgcgggggg ccagccgcgg
aactggaggt 1021 gccctgagct ctgcccccgg acctgccccc tcaacatgca
gcaccaggag tgtggctcac 1081 cctgcacgga cacctgctcc aacccccagc
gcgcgcagct ctgcgaggac cactgtgtgg 1141 acggctgctt ctgcccccca
ggcacggtgc tggatgacat cacgcactct ggctgcctgc 1201 ccctcgggca
gtgcccctgc acccacggcg gccgcaccta cagcccgggc acctccttca 1261
acaccacctg cagctcctgc acctgctccg gggggctatg gcagtgccag gacctgccgt
1321 gccctggcac ctgctctgtg cagggcgggg cccacatctc cacctatgat
gagaaactct 1381 acgacctgca tggtgactgc agctacgttc tgtccaagaa
atgtgccgac agcagcttca 1441 ccgtgctggc tgagctgcgg aagtgcggcc
tgacggacaa cgagaactgc ctgaaagcgg 1501 tgacgctcag cctggacggc
ggggacacgg ccatccgggt ccaagcggac ggcggcgtgt 1561 tcctcaactc
catctacacg cagctgcccc tgtcggcagc caacatcacc ctgttcacac 1621
cctcgagctt cttcatcgtg gtgcagacag gcctggggct gcagctgctg gtgcagctgg
1681 tgccactcat gcaggtgttt gtcaggctgg accccgccca ccagggccag
atgtgcggcc 1741 tgtgtgggaa cttcaaccag aaccaggctg acgacttcac
ggccctcagc ggggtggtgg 1801 aggccacggg cgcagccttc gccaacacct
ggaaggccca ggctgcctgt gccaatgcca 1861 ggaacagctt tgaggacccc
tgctccctca gtgtggagaa tgagaactac gcccggcact 1921 ggtgctcgcg
cctgaccgat cccaacagtg ccttctcgcg ctgccactcc atcatcaacc 1981
ccaagccctt ccactcgaac tgcatgtttg acacctgcaa ctgtgagcgg agcgaggact
2041 gcctgtgcgc cgcgctgtcc tcctatgtgc acgcctgtgc cgccaagggc
gtacagctca 2101 gcgactggag ggacggcgtc tgcaccaagt acatgcagaa
ctgccccaag tcccagcgct 2161 acgcctacgt ggtggatgcc tgccagccca
cttgccgcgg cctgagtgag gccgacgtca 2221 cctgcagcgt ttccttcgtg
cctgtggacg gctgcacctg ccccgcgggc accttcctca 2281 atgacgcggg
cgcctgtgtg cccgcccagg agtgcccctg ctacgctcac ggcaccgtgc 2341
tggctcctgg agaggtggtg cacgacgagg gcgccgtgtg ttcatgtacg ggtgggaagc
2401 taagctgcct gggagcctct ctgcagaaaa gcacagggtg tgcagccccc
atggtgtacc 2461 tggactgcag caacagctcg gcgggcaccc ctggggccga
gtgcctccgg agctgccaca 2521 cgctggacgt gggctgtttc agcacacact
gcgtgtccgg ctgtgtctgt cccccggggc 2581 tggtgtcgga tgggagtggg
ggctgcattg ccgaggagga ctgcccctgt gtgcacaacg 2641 aggccaccta
caagcctgga gagaccatca gggtcgactg caacacctgc acctgcagga 2701
accggaggtg ggagtgcagc caccggctct gcctgggcac ctgcgtggcc tacggggatg
2761 gccacttcat cacctttgat ggcgatcgct acagctttga aggcagctgc
gagtacatct 2821 tggcccagga ctactgtggg gacaacacca cccacgggac
cttccgcatc gtcaccgaga 2881 acatcccctg tgggaccacc ggcaccacct
gctccaaggc catcaagctc ttcgtggaga 2941 gctacgagct gatcctccaa
gaggggacct ttaaggcggt ggcgagaggg ccgggtgggg 3001 acccacccta
caagatacgc tacatgggga tcttcctggt catcgagacc cacgggatgg 3061
ccgtgtcctg ggaccggaag accagcgtgt tcatccgact gcaccaggac tacaagggca
3121 gggtctgcgg cctgtgcggg aacttcgacg acaatgccat caatgacttt
gccacgcgta 3181 gccggtccgt ggtgggggac gcactggagt ttgggaacag
ctggaagctc tccccctcct 3241 gcccggacgc cctggcaccc aaggacccct
gcacggccaa ccccttccgc aagtcctggg 3301 cccagaagca gtgcagcatc
ctccacggcc ccaccttcgc cgcctgccgc tcccaggttg 3361 actccaccaa
gtactacgag gcctgcgtga acgacgcgtg tgcctgcgac tcgggtggcg 3421
actgcgagtg tttctgcacg gctgtggctg cctacgccca ggcctgccac gacgcgggcc
3481 tgtgtgtgtc ctggcggact ccggacacct gccccttgtt ctgtgacttc
tacaacccac 3541 atgggggctg tgagtggcac taccagccct gcggggcacc
ctgcctaaaa acctgccgga 3601 accccagtgg gcactgcctg gtggacctgc
ctggcctgga aggctgctac ccgaagtgcc 3661 cacccagcca gcccttcttc
aatgaggacc agatgaagtg cgtggcccag tgtggctgct 3721 acgacaagga
cggaaactac tatgacgtcg gtgcaagggt ccccacagcg gagaactgcc 3781
agagctgtaa ctgcacaccc agtggcatcc agtgcgctca cagccttgag gcctgcacct
3841 gcacctatga ggacaggacc tacagctacc aggacgtcat ctacaacacc
accgatgggc 3901 ttggcgcctg cttgatcgcc atctgcggaa gcaacggcac
catcatcagg aaggctgtgg 3961 catgtcctgg aactccagcc acaacgccat
tcaccttcac caccgcctgg gtcccccact 4021 ccacgacaag cccggccctc
ccggtctcca ccgtgtgtgt ccgcgaggtc tgccgctggt 4081 ccagctggta
caatgggcac cgcccagagc ccggcctggg aggcggagac tttgagacgt 4141
ttgaaaacct gaggcagaga gggtaccagg tatgccctgt gctggctgac atcgagtgcc
4201 gggcggcgca gcttcccgac atgccgctgg aggagctggg ccagcaggtg
gactgtgacc 4261 gcatgcgggg gctgatgtgc gccaacagcc aacagagtcc
cccgctctgt cacgactacg 4321 agctgcgggt tctctgctgc gaatacgtgc
cctgtggccc ctccccggcc ccaggcacca 4381 gccctcagcc ctccctcagt
gccagcacgg agcctgctgt gcctacccca acccagacca 4441 cagcaaccga
aaagaccacc ctatgggtga ccccgagcat ccggtcgacg gcggccctca 4501
cctcgcagac tgggtccagc tcaggccccg tgacggtcac cccctcggcc ccaggtacca
4561 ccacctgcca gccccggtgt cagtggacag agtggtttga tgaggactac
cccaagtctg 4621 aacaacttgg aggggacgtt gagtcctacg ataagatcag
ggccgctgga gggcacttat 4681 gccagcagcc taaggacata gagtgccagg
ccgagagctt ccccaactgg accctggcac 4741 aggtggggca gaaggtgcac
tgtgacgtcc acttcggcct ggtgtgcagg aactgggagc 4801 aggagggcgt
cttcaagatg tgctacaact acaggatccg ggtcctctgc tgcagtgacg 4861
accactgcag gggacgtgcc acaaccccgc caccgaccac agagctggag acggccacca
4921 ccaccaccac ccaggccctg ttctcaacgc cgcagcctac gagtagcccg
gggctgacca 4981 gggctccccc ggccagcacc acagcagtcc ccaccctctc
agaaggactg acatccccca 5041 gatacacaag cacccttggt acagccacca
cgggaggccc cacgacgcct gcaggctcca 5101 cagaacccac tgtcccaggg
gtggccacat ccacccttcc aacacgctca gcccttccag 5161 ggacgacggg
gagcttgggc acatggcgcc cctcacagcc acccacgctg gccccaacaa 5221
caatggcaac ctccagagct cgcccgacag gcacagccag caccgcttcc aaagagccgc
5281 tgaccacgag cctggcgcca acactcacga gcgagctgtc cacctctcag
gccgagacca 5341 gcacgcccag gacagagacg acaatgagcc ccttgactaa
caccaccacc agccagggca 5401 cgacccgctg tcaaccgaag tgtgagtgga
cagagtggtt tgacgtggac ttcccaacct 5461 caggggttgc aggcggggac
atggaaactt ttgaaaacat cagggctgct gggggcaaga 5521 tgtgctgggc
accaaagagc atagagtgcc gggcggagaa ctaccccgag gtaagcatcg 5581
accaggtcgg gcaggtgctg acctgcagcc tggagacggg gctgacctgc aagaacgaag
5641 accagacagg caggttcaac atgtgcttca actacaacgt gcgtgtgctt
tgctgtgacg 5701 actacagcca ctgccccagt accccagcca ccagctccac
ggccacgccc tcctcaactc 5761 cggggacgac ctggatcctc acaaagccga
ccacaacagc cactacgact gcgtccactg 5821 gatccacggc caccccgacc
tccaccctga gaacagctcc ccctcccaaa gtgctgacca 5881 ccacggccac
cacacccaca gtcaccagct ccaaagccac tccctcctcc agtccaggga 5941
ctgcaaccgc ccttccagca ctgagaagca cagccaccac acccacagct accagcgtta
6001 cacccatccc ctcttcctcc ctgggcacca cctggacccg cctatcacag
accaccacac 6061 ccacggccac catgtccaca gccacaccct cctccactcc
agagactgcc cacacctcca 6121 cagtgcttac cgccacggcc accacaactg
gggccaccgg ctctgtggcc accccctcct 6181 ccaccccagg aacagctcac
actaccaaag tgccaactac cacaaccacg ggcttcacag 6241 ccaccccctc
ctccagccca gggacggcac tcacgcctcc agtgtggatc agcacaacca 6301
ccacacccac aaccagaggc tccacggtga ccccctcctc catcccgggg accacccaca
6361 ccgccacagt gctgaccacc accaccacaa ctgtggccac tggttctatg
gcaacaccct 6421 cctctagcac acagaccagt ggtactcccc catcactgac
caccacggcc actacgatca 6481 cggccaccgg ctccaccacc aacccctcct
caactcctgg gacaactccc atccccccag 6541 tgctgaccac caccgccacc
acacctgcag ccaccagcaa cacagtgact ccctcctctg 6601 ccctagggac
cacccacaca cccccagtgc cgaacaccat ggccaccaca cacgggcgat 6661
ccctgccccc cagcagtccc cacacggtgc gcacagcctg gacttcggcc acctcgggca
6721 tcttgggcac cacccacatc acagagcctt ccacggtgac ttcccacacc
ctagcagcaa 6781 ccaccggtac cacccagcac tcgactccag ccctttccag
ccctcaccct agcagcagaa 6841 ccaccgagtc acccccttct ccagggacga
ccaccccggg ccacaccacg gccacctcca 6901 ggaccacagc cacggccaca
cccagcaaga cccgcacctc gaccctgctg cccagcagcc 6961 ccacatcggc
ccccataacc acggtggtga ccatgggctg tgagccccag tgtgcctggt 7021
cagagtggct ggactacagc taccccatgc cggggccctc tggcggggac tttgacacct
7081 actccaacat ccgtgcggcc ggaggggccg tctgtgagca gcccctgggc
ctcgagtgcc 7141 gtgcccaggc ccagcctggt gtccccctgc gggagttggg
ccaggtcgtg gaatgcagcc 7201 tggactttgg cctggtctgc aggaaccgtg
agcaggtggg gaagttcaag atgtgcttca 7261 actatgaaat ccgtgtgttc
tgctgcaact acggccactg ccccagcacc ccggccacca 7321 gctctacggc
catgccctcc tccactccgg ggacgacctg gatcctcaca gagctgacca 7381
caacagccac tacgactgag tccactggat ccacggccac cccgtcctcc accccaggga
7441 ccacctggat cctcacagag ccgagcacta cagccaccgt gacggtgccc
accggatcca
7501 cggccaccgc ctcctccacc caggcaactg ctggcacccc acatgtgagc
accacggcca 7561 cgacacccac agtcaccagc tccaaagcca ctcccttctc
cagtccaggg actgcaaccg 7621 cccttccagc actgagaagc acagccacca
cacccacagc taccagcttt acagccatcc 7681 cctcctcctc cctgggcacc
acctggaccc gcctatcaca gaccaccaca cccacggcca 7741 ccatgtccac
agccacaccc tcctccactc cagagactgt ccacacctcc acagtgctta 7801
ccaccacggc caccacaacc ggggccaccg gctctgtggc caccccctcc tccaccccag
7861 gaacagctca cactaccaaa gtgctgacta ccacaaccac gggcttcaca
gccaccccct 7921 cctccagccc agggacggca cgcacgcttc cagtgtggat
cagcacaacc accacaccca 7981 caaccagagg ttccacggtg accccctcct
ccatcccggg gaccacccac acccccacag 8041 tgctgaccac caccaccaca
actgtggcca ctggttctat ggcaacaccc tcctctagca 8101 cacagaccag
tggtactccc ccatcactga ccaccacggc cactacgatc acggccaccg 8161
gctccaccac caacccctcc tcaactccag ggacaacacc tatcccccca gtgctgacca
8221 ccaccgccac cacacctgca gccaccagca gcacagtgac tccctcctct
gccctaggga 8281 ccacccacac acccccagtg ccgaacacca cggccaccac
acacgggcga tccctgtccc 8341 ccagcagtcc ccacacggtg cgcacagcct
ggacttcggc cacctcaggc accttgggca 8401 ccacccacat cacagagcct
tccacgggga cttcccacac cccagcagca accaccggta 8461 ccacccagca
ctcgactcca gccctgtcca gccctcaccc tagcagcagg accaccgagt 8521
cacccccttc tccagggacg accaccccgg gccacaccag ggccacctcc aggaccacgg
8581 ccacggccac acccagcaag acccgcacct cgaccctgct gcccagcagc
cccacatcgg 8641 ccccaataac cacggtggtg accatgggct gtgagcccca
gtgtgcctgg tcagagtggc 8701 tggactacag ctaccccatg ccggggccct
ctggcgggga ctttgacacc tactccaaca 8761 tccgtgcggc cggaggggcc
gtctgtgagc agcccctggg cctcgagtgc cgtgcccagg 8821 cccagcctgg
tgtccccctg cgggagttgg gccaggtcgt ggaatgcagc ctggactttg 8881
gcctggtctg caggaaccgt gagcaggtgg ggaagttcaa gatgtgcttc aactatgaaa
8941 tccgtgtgtt ctgctgcaac tacggccact gccccagcac cccggccacc
agctctacgg 9001 ccacgccctc ctccactcca gggacgacct ggatcctcac
agagcagacc acagcagcca 9061 ctacgaccgc aaccactgga tccacggcca
tcccgtcctc caccccggga acagctcccc 9121 ctcccaaagt gctgaccagc
acggccacca cacccacagc caccagttcc aaagccactt 9181 cctcctccag
tccaaggact gcaaccaccc ttccagtgct gacaagcaca gccaccaaat 9241
ccacagctac cagctttaca cccatcccct ccttcaccct tgggaccacc gggaccctcc
9301 cagaacagac caccacaccc atggccacca tgtccacaat ccacccctcc
tccactccgg 9361 agaccaccca cacctccaca gtgctgacca cgaaggccac
cacgacaagg gccaccagtt 9421 ccatgtccac cccctcctcc actccgggga
cgacctggat cctcacagag ctgaccacag 9481 cagccactac aactgcagcc
actggcccca cggccacccc gtcctccacc ccagggacca 9541 cctggatcct
cacagagccc agcactacag ccaccgtgac ggtgcccacc ggatccacgg 9601
ccaccgcctc ctccacccgg gcaactgctg gcaccctcaa agtgctgacc agcacggcca
9661 ccacacccac agtcatcagc tccagagcca ctccctcctc cagtccaggg
actgcaaccg 9721 cccttccagc actgagaagc acagccacca cacccacagc
taccagcgtt acagccatcc 9781 cctcttcctc cctgggcacc gcctggaccc
gcctatcaca gaccaccaca cccacggcca 9841 ccatgtccac agccacaccc
tcctctactc cagagactgt ccacacctcc acagtgctta 9901 ccaccacgac
caccacaacc agggccaccg gctctgtggc caccccctcc tccaccccag 9961
gaacagctca cactaccaaa gtgccgacta ccacaaccac gggcttcaca gccaccccct
10021 cctccagccc agggacggca ctcacgcctc cagtgtggat cagcacaacc
accacaccca 10081 caaccagagg ctccacggtg accccctcct ccatcccggg
gaccacccac accgccacag 10141 tgctgaccac caccaccaca actgtggcca
ctggttctat ggcaacaccc tcctctagca 10201 cacagaccag tggtactccc
ccatcactga ccaccacggc cactacgatc acagccaccg 10261 gctccaccac
caacccctcc tcaactccag ggacaactcc catcccccca gtgctgacca 10321
ccaccgccac cacacctgca gccaccagca gcacagtgac tccctcctct gccctaggga
10381 ccacccacac acccccagtg ccgaacacca cggccaccac acacgggcgg
tccctgcccc 10441 ccagcagtcc ccacacggtg cgcacagcct ggacttcggc
cacctcgggc atcttgggca 10501 ccacccacat cacagagcct tccacggtga
cttcccacac cccagcagca accaccagta 10561 ccacccagca ctcgactcca
gccctgtcca gccctcaccc tagcagcagg accaccgagt 10621 cacccccttc
tccagggacg accaccccgg gccacaccag gggcacctcc aggaccacag 10681
ccacagccac acccagcaag acccgcacct cgaccctgct gcccagcagc cccacatcgg
10741 cccccataac cacggtggtg accacgggct gtgagcccca gtgtgcctgg
tcagagtggc 10801 tggactacag ctaccccatg ccggggccct ctggcgggga
ctttgacacc tactccaaca 10861 tccgtgcggc cggaggggca gtctgtgagc
agcccctggg cctcgagtgc cgtgcccagg 10921 cccagcctgg tgtccccctg
cgggagttgg gccaggtcgt ggaatgcagc ctggactttg 10981 gcctggtctg
caggaaccgt gagcaggtgg ggaagttcaa gatgtgcttc aactatgaaa 11041
tccgtgtgtt ctgctgcaac tacggccact gccccagcac cccggccacc agctctacgg
11101 ccacgccctc ctcaactccg gggacgacct ggatcctcac aaagctgacc
acaacagcca 11161 ctacgactga gtccactgga tccacggcca ccccgtcctc
caccccaggg accacctgga 11221 tcctcacaga gccgagcact acagccaccg
tgacggtgcc caccggatcc acggccaccg 11281 cctcctccac ccaggcaact
gctggcaccc cacatgtgag caccacggcc acgacaccca 11341 cagtcaccag
ctccaaagcc actcccttct ccagtccagg gactgcaacc gcccttccag 11401
cactgagaag cacagccacc acacccacag ctaccagctt tacagccatc ccctcctcct
11461 ccctgggcac cacctggacc cgcctatcac agaccaccac acccacggcc
accatgtcca 11521 cagccacacc ctcctccact ccagagactg cccacacctc
cacagtgctt accaccacgg 11581 ccaccacaac cagggccacc ggctctgtgg
ccaccccctc ttccacccca ggaacagctc 11641 acactaccaa agtgccgact
accacaacca cgggcttcac agtcaccccc tcctccagcc 11701 cagggacggc
acgcacgcct ccagtgtgga tcagcacaac caccacaccc acaaccagtg 11761
gctccacggt gaccccctcc tccgtcccgg ggaccaccca cacccccaca gtgctgacca
11821 ccaccaccac aactgtggcc actggttcta tggcaacacc ctcctctagc
acacagacca 11881 gtggtactcc cccatcactg atcaccacgg ccactacgat
cacggccacc ggctccacca 11941 ccaacccctc ctcaactcca gggacaacac
ctatcccccc agtgctgacc accaccgcca 12001 ccacacctgc agccaccagc
agcacagtga ctccctcctc tgccctaggg accacccaca 12061 cacccccagt
gccgaacacc acggccacca cacacgggcg atccctgtcc cccagcagtc 12121
cccacacggt gcgcacagcc tggacttcgg ccacctcagg caccttgggc accacccaca
12181 tcacagagcc ttccacgggg acttcccaca ccccagcagc aaccaccggt
accacccagc 12241 actcgactcc agccctgtcc agccctcacc ctagcagcag
gaccaccgag tcaccccctt 12301 ccccagggac gaccaccccg ggccacacca
cggccacctc caggaccacg gccacggcca 12361 cacccagcaa gacccgcacc
tcgaccctgc tgcccagcag ccccacatcg gcccccataa 12421 ccacggtggt
gaccacgggc tgtgagcccc agtgtgcctg gtcagagtgg ctggactaca 12481
gctaccccat gccggggccc tctggcgggg actttgacac ctactccaac atccgtgcgg
12541 ccggaggggc cgtctgtgag cagcccctgg gcctcgagtg ccgtgcccag
gcccagcctg 12601 gtgtccccct gggggagttg ggccaggtcg tggaatgcag
cctggacttt ggcctggtct 12661 gcaggaaccg tgagcaggtg gggaagttca
agatgtgctt caactatgaa atccgtgtgt 12721 tctgctgcaa ctacggccac
tgccccagca ccccggccac cagctctacg gccatgccct 12781 cctccactcc
ggggacgacc tggatcctca cagagctgac cacaacagcc actacgactg 12841
catccactgg atccacggcc accccgtcct ccaccccggg aacagctccc cctcccaaag
12901 tgctgaccag cccggccacc acacccacag ccaccagttc caaagccact
tcctcctcca 12961 gtccaaggac tgcaaccacc cttccagtgc tgacaagcac
agccaccaaa tccacagcta 13021 ccagcgttac acccatcccc tcctccaccc
ttgggaccac cgggaccctc ccagaacaga 13081 ccaccacacc cgtggccacc
atgtccacaa tccacccctc ctccactccg gagaccaccc 13141 acacctccac
agtgctgacc acgaaggcca ccacgacaag ggccaccagt tccacgtcca 13201
ccccctcctc cactccgggg acgacctgga tcctcacaga gctgaccaca gcagccacta
13261 caactgcagc cactggcccc acggccaccc cgtcctccac cccagggacc
acctggatcc 13321 tcacagagct gaccacaaca gccactacga ctgcgtccac
tggatccacg gccaccccgt 13381 cctccacccc agggaccacc tggatcctca
cagagccgag cactacagcc accgtgacgg 13441 tgcccaccgg atccacggcc
accgcctcct ccacccaggc aactgctggc accccacatg 13501 tgagcaccac
ggccacgaca cccacagtca ccagctccaa agccactccc tcctccagtc 13561
cagggactgc aactgccctt ccagcactga gaagcacagc caccacaccc acagctacca
13621 gctttacagc catcccctcc tcctccctgg gcaccacctg gacccgccta
tcacagacca 13681 ccacacccac ggccaccatg tccacagcca caccctcctc
cactccagag actgtccaca 13741 cctccacagt gcttaccgcc acggccacca
caaccggggc caccggctct gtggccaccc 13801 cctcctccac cccaggaaca
gctcacacta ccaaagtgcc gactaccaca accacgggct 13861 tcacagccac
cccctcctcc agcccaggga cggcactcac gcctccagtg tggatcagca 13921
caaccaccac acccacaacc accacaccca caaccagtgg ctccacggtg accccctcct
13981 ccatcccggg gaccacccac accgccagag tgctgaccac caccaccaca
actgtggcca 14041 ctggttctat ggcaacaccc tcctctagca cacagaccag
tggtactccc ccatcactga 14101 ccaccacggc cactacgatc acggccaccg
gctccaccac caacccctcc tcaactccag 14161 ggacaacacc catcacccca
gtgctgacca gcacggccac cacacccgca gccaccagct 14221 ccaaagccac
ttcctcctcc agtccaagga ctgcaaccac ccttccagtg ctgacaagca 14281
cagccacaaa atccacagct accagcttta cacccatccc ctcctccacc ctgtggacca
14341 cgtggaccgt cccagcacag accaccacac ccatgtccac catgtccaca
atccacacct 14401 cctctactcc agagaccacc cacacctcca cagtgctgac
caccacagcc accatgacaa 14461 gggccaccaa ttccacggcc acaccctcct
ccactctggg gacgacccgg atcctcactg 14521 agctgaccac aacagccact
acaactgcag ccactggatc cacggccacc ctgtcctcca 14581 ccccagggac
cacctggatc ctcacagagc cgagcactat agccaccgtg atggtgccca 14641
ccggttccac ggccaccgcc tcctccactc tgggaacagc tcacaccccc aaagtggtga
14701 ccaccatggc cactatgccc acagccactg cctccacggt tcccagctcg
tccaccgtgg 14761 ggaccacccg cacccctgca gtgctcccca gcagcctgcc
aaccttcagc gtgtccactg 14821 tgtcctcctc agtcctcacc accctgagac
ccactggctt ccccagctcc cacttctcta 14881 ctccctgctt ctgcagggca
tttggacagt ttttctcgcc cggggaagtc atctacaata 14941 agaccgaccg
agccggctgc catttctacg cagtgtgcaa tcagcactgt gacattgacc
15001 gcttccaggg cgcctgtccc acctccccac cgccagtgtc ctccgccccg
ctgtcctcgc 15061 cctcccctgc ccctggctgt gacaatgcca tccctctccg
gcaggtgaat gagacctgga 15121 ccctggagaa ctgcacggtg gccaggtgcg
tgggtgacaa ccgtgtcgtc ctgctggacc 15181 caaagcctgt ggccaacgtc
acctgcgtga acaagcacct gcccatcaaa gtgtcggacc 15241 cgagccagcc
ctgtgacttc cactatgagt gcgagtgcat ctgcagcatg tggggcggct 15301
cccactattc cacctttgac ggcacctctt acaccttccg gggcaactgc acctatgtcc
15361 tcatgagaga gatccatgca cgctttggga atctcagcct ctacctggac
aaccactact 15421 gcacggcctc tgccactgcc gctgccgccc gctgcccccg
cgccctcagc atccactaca 15481 agtccatgga tatcgtcctc actgtcacca
tggtgcatgg gaaggaggag ggcctgatcc 15541 tgtttgacca aattccggtg
agcagcggtt tcagcaagaa cggcgtgctt gtgtctgtgc 15601 tggggaccac
caccatgcgt gtggacattc ctgccctggg cgtgagcgtc accttcaatg 15661
gccaagtctt ccaggcccgg ctgccctaca gcctcttcca caacaacacc gagggccagt
15721 gcggcacctg caccaacaac cagagggacg actgtctcca gcgggacgga
accactgccg 15781 ccagttgcaa ggacatggcc aagacgtggc tggtccccga
cagcagaaag gatggctgct 15841 gggccccgac tggcacaccc cccactgcca
gccccgcagc cccggtgtct agcacaccca 15901 cccccacccc atgcccacca
cagccgctct gtgatctgat gctgagccag gtctttgctg 15961 agtgccacaa
ccttgtgccc ccgggcccat tcttcaacgc ctgcatcagc gaccactgca 16021
ggggccgcct tgaggtgccc tgccagagcc tggaggctta cgcagagctc tgccgcgccc
16081 ggggagtgtg cagtgactgg cgaggtgcaa ccggtggcct gtgcgacctc
acctgcccac 16141 ccaccaaagt gtacaagcca tgcggcccca tacagcctgc
cacctgcaac tctaggaacc 16201 agagcccaca gctggagggg atggcggagg
gctgcttctg ccctgaggac cagatcctct 16261 tcaacgcaca catgggcatc
tgcgtgcagg cctgcccctg cgtgggaccc gatgggtttc 16321 ctaaatttcc
cggggagcgg tgggtcagca actgccagtc ctgcgtgtgt gacgagggtt 16381
cagtgtcggt gcagtgcaag cccctgccct gtgacgccca gggtcagccc ccgccgtgca
16441 accgtcccgg cttcgtaacc gtgaccaggc cccgggccga gaacccctgc
tgccccgaga 16501 cggtgtgcgt gtgcaacaca accacctgcc cccagagcct
gcctgtgtgc ccgccagggc 16561 aggagtccat ctgcacccag gaggagggcg
actgctgtcc caccttccgc tgcagacctc 16621 agctgtgttc gtacaatggc
accttctacg gggttggtgc aaccttccca ggcgcccttc 16681 cctgccacat
gtgtacctgc ctctctgggg acacccagga cccaacggtg caatgtcagg 16741
aggatgcctg caacaatact acctgtcccc agggctttga gtacaagaga gtggccgggc
16801 agtgctgtgg ggagtgcgtc cagaccgcct gcctcacgcc cgatggccag
ccagtccagc 16861 tgaatgaaac ctgggtcaac agccatgtgg acaactgcac
cgtgtacctc tgtgaggctg 16921 agggtggagt ccatttgctg accccacagc
ctgcatcctg cccagatgtg tccagctgca 16981 gggggagcct caggaaaacc
ggctgctgct actcctgtga ggaggactcc tgtcaagtcc 17041 gcatcaacac
gaccatcctg tggcaccagg gctgcgagac cgaggtcaac atcaccttct 17101
gcgagggctc ctgccccgga gcgtccaagt actcagcaga ggcccaggcc atgcagcacc
17161 agtgcacctg ctgccaggag aggcgggtcc acgaggagac ggtgcccttg
cactgtccta 17221 acggctcagc catcctgcac acctacaccc acgtggatga
gtgtggctgc acgcccttct 17281 gtgtccctgc gcccatggct cccccacaca
cccgtggctt cccggcccag gaggccactg 17341 ctgtctgaga acgttctgcc
tccatcccca tgctctgtcc acctggagcc aggatgtgca 17401 ttgtctgatc
atgaaaacct tgggcctcct ctgcggagcc ccccggcctg tgtgtggcac 17461
cccgcgctcc gtgctcctgc tgcccacccc gtgggtgaaa ccggccccag aagggtgagg
17521 ggccagcagg acccctttcg ggagggcgcc actcaggagt cctaccctgg
gagagcctgt 17581 ggcccacctt ggccttgccc ctccctgatg tcactgggac
gccctggaac aaactaagca 17641 tgtgcgggcc tatgtgtccc tgccacggcc
ggagcgcccg cgcagcacgg attccagctg 17701 gccacgtccg gccgctgggg
cagacaggct ggtccaggca aggccagctg ctgccaggaa 17761 gctgcgacag
gcaaggcggc cgcctgtcca tgcctgctgc agggtaactc agggctgagg 17821
tcgcaacggc caggtcagag aggggtcagc atcccaaagc cccctctgct caacccagcc
17881 cagttttgca aataaaccct gagcattgag tacgtt
[0241] In some embodiments of the methods of the disclosure, the
wild type human MUC5B gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_002449.2):
TABLE-US-00014 (SEQ ID NO: 12) 1 mgapsacrtl vlalaamlvv pqaetqgpve
pswenaghtm dggaptsspt rrvsfvppvt 61 vfpslspinp ahngrvcstw
gdfhyktfdg dvfrfpglcn yvfsehcraa yedfnvqlrr 121 glvgsrpvvt
rvvikaqglv leasngsvli ngqreelpys rtgllveqsg dyikvsirlv 181
ltflwngeds alleldpkya nqtcglcgdf nglpafnefy ahnarltplq fgnlqkldgp
241 teqcpdplpl pagnctdeeg ichrtllgpa faechalvds taylaacaqd
lcrcptcpca 301 tfveysrqca haggqprnwr cpelcprtcp lnmqhqecgs
pctdtcsnpq raqlcedhcv 361 dgcfcppgtv lddithsgcl plgqcpcthg
grtyspgtsf nttcssctcs gglwqcqdlp 421 cpgtcsvqgg ahistydekl
ydlhgdcsyv lskkcadssf tvlaelrkcg ltdnenclka 481 vtlsldggdt
airvqadggv flnsiytqlp lsaanitlft pssffivvqt glglqllvql 541
vplmqvfvrl dpahqgqmcg lcgnfnqnqa ddftalsgvv eatgaafant wkaqaacana
601 rnsfedpcsl svenenyarh wcsrltdpns afsrchsiin pkpfhsncmf
dtcncersed 661 cicaalssyv hacaakgvql sdwrdgvctk ymqncpksqr
yayvvdacqp tcrglseadv 721 tcsysfvpvd gctcpagtfl ndagacvpaq
ecpcyahgtv lapgevvhde gavcsctggk 781 lsclgaslqk stgcaapmvy
ldcsnssagt pgaeclrsch tldvgcfsth cvsgcvcppg 841 lvsdgsggci
aeedcpcvhn eatykpgeti rvdcntctcr nrrwecshrl clgtcvaygd 901
ghfitfdgdr ysfegsceyi laqdycgdnt thgtfrivte nipcgttgtt cskaiklfve
961 syelilqegt fkavargpgg dppykirymg iflviethgm ayswdrktsv
firlhqdykg 1021 rvcglcgnfd dnaindfatr srsvvgdale fgnswklsps
cpdalapkdp ctanpfrksw 1081 aqkqcsilhg ptfaacrsqv dstkyyeacv
ndacacdsgg dcecfctava ayaqachdag 1141 lcvswrtpdt cplfcdfynp
hggcewhyqp cgapclktcr npsghclvdl pglegcypkc 1201 ppsqpffned
qmkcvaqcgc ydkdgnyydv garvptaenc qscnctpsgi qcahsleact 1261
ctyedrtysy qdviynttdg lgacliaicg sngtiirkav acpgtpattp ftfttawvph
1321 sttspalpvs tvcvrevcrw sswynghrpe pglgggdfet fenlrqrgyq
vcpvladiec 1381 raaqlpdmpl eelgqqvdcd rmrglmcans qqspplchdy
elrvlcceyv pcgpspapgt 1441 spqpslsast epavptptqt tatekttlwv
tpsirstaal tsqtgsssgp vtvtpsapgt 1501 ttcqprcqwt ewfdedypks
eqlggdvesy dkiraagghl cqqpkdiecq aesfpnwtla 1561 qvgqkvhcdv
hfglvcrnwe qegvfkmcyn yrirvlccsd dhcrgrattp pptteletat 1621
ttttqalfst pqptsspglt rappasttav ptlsegltsp rytstlgtat tggpttpags
1681 teptvpgvat stlptrsalp gttgslgtwr psqpptlapt tmatsrarpt
gtastaskep 1741 lttslaptlt selstsqaet stprtettms pltntttsqg
ttrcqpkcew tewfdvdfpt 1801 sgvaggdmet feniraaggk mcwapksiec
raenypevsi dqvgqvltcs letgltckne 1861 dqtgrfnmcf nynvrvlccd
dyshcpstpa tsstatpsst pgttwiltkp tttatttast 1921 gstatptstl
rtapppkvlt ttattptvts skatpssspg tatalpalrs tattptatsv 1981
tpipssslgt twtrlsqttt ptatmstatp sstpetahts tvltatattt gatgsvatps
2041 stpgtahttk vptttttgft atpssspgta ltppvwistt ttpttrgstv
tpssipgtth 2101 tatvlttttt tvatgsmatp ssstqtsgtp psltttatti
tatgsttnps stpgttpipp 2161 vltttattpa atsntvtpss algtthtppv
pntmatthgr slppssphtv rtawtsatsg 2221 ilgtthitep stvtshtlaa
ttgttqhstp alssphpssr ttesppspgt ttpghttats 2281 rttatatpsk
trtstllpss ptsapittvv tmgcepqcaw sewldysypm pgpsggdfdt 2341
ysniraagga vceqplglec raqaqpgvpl relgqvvecs ldfglvcrnr eqvgkfkmcf
2401 nyeirvfccn yghcpstpat sstampsstp gttwiltelt ttatttestg
statpsstpg 2461 ttwiltepst tatvtvptgs tatasstqat agtphvstta
ttptvtsska tpfsspgtat 2521 alpalrstat tptatsftai pssslgttwt
rlsqtttpta tmstatpsst petvhtstvl 2581 tttatttgat gsvatpsstp
gtahttkvlt ttttgftatp ssspgtartl pvwisttttp 2641 ttrgstvtps
sipgtthtpt vlttttttva tgsmatpsss tqtsgtppsl tttattitat 2701
gsttnpsstp gttpippvlt ttattpaats stvtpssalg tthtppvpnt tatthgrsls
2761 pssphtvrta wtsatsgtlg tthitepstg tshtpaattg ttqhstpals
sphpssrtte 2821 sppspgtttp ghtratsrtt atatpsktrt stllpsspts
apittvvtmg cepqcawsew 2881 ldysypmpgp sggdfdtysn iraaggavce
qplglecraq aqpgvplrel gqvvecsldf 2941 glvcrnreqv gkfkmcfnye
irvfccnygh cpstpatsst atpsstpgtt wilteqttaa 3001 tttattgsta
ipsstpgtap ppkvltstat tptatsskat ssssprtatt lpvltstatk 3061
statsftpip sftlgttgtl peqtttpmat mstihpsstp etthtstvlt tkatttrats
3121 smstpsstpg ttwilteltt aatttaatgp tatpsstpgt twiltepstt
atvtvptgst 3181 atasstrata gtlkvltsta ttptvissra tpssspgtat
alpalrstat tptatsvtai 3241 pssslgtawt rlsqtttpta tmstatpsst
petvhtstvl tttttttrat gsvatpsstp 3301 gtahttkvpt ttttgftatp
ssspgtaltp pvwisttttp ttrgstvtps sipgtthtat 3361 vlttttttva
tgsmatpsss tqtsgtppsl tttattitat gsttnpsstp gttpippvlt 3421
ttattpaats stvtpssalg tthtppvpnt tatthgrslp pssphtvrta wtsatsgilg
3481 tthitepstv tshtpaatts ttqhstpals sphpssrtte sppspgtttp
ghtrgtsrtt 3541 atatpsktrt stllpsspts apittvvttg cepqcawsew
ldysypmpgp sggdfdtysn 3601 iraaggavce qplglecraq aqpgvplrel
gqvvecsldf glvcrnreqv gkfkmcfnye 3661 irvfccnygh cpstpatsst
atpsstpgtt wiltklttta tttestgsta tpsstpgttw 3721 iltepsttat
vtvptgstat asstqatagt phvsttattp tvtsskatpf sspgtatalp 3781
alrstattpt atsftaipss slgttwtrls qtttptatms tatpsstpet ahtstvlttt
3841 atttratgsv atpsstpgta httkvptttt tgftvtpsss pgtartppvw
isttttptts 3901 gstvtpssvp gtthtptvlt tttttvatgs matpssstqt
sgtppslitt attitatgst 3961 tnpsstpgtt pippvlttta ttpaatsstv
tpssalgtth tppvpnttat thgrslspss 4021 phtvrtawts atsgtlgtth
itepstgtsh tpaattgttq hstpalssph pssrttespp 4081 spgtttpght
tatsrttata tpsktrtstl lpssptsapi ttvvttgcep qcawsewldy 4141
sypmpgpsgg dfdtysnira aggavceqpl glecraqaqp gvplgelgqv vecsldfglv
4201 crnreqvgkf kmcfnyeirv fccnyghcps tpatsstamp sstpgttwil
teltttattt 4261 astgstatps stpgtapppk vltspattpt atsskatsss
sprtattlpv ltstatksta 4321 tsvtpipsst lgttgtlpeq tttpvatmst
ihpsstpett htstvlttka tttratssts 4381 tpsstpgttw iltelttaat
ttaatgptat psstpgttwi lteltttatt tastgstatp 4441 sstpgttwil
tepsttatvt vptgstatas stqatagtph vsttattptv tsskatpsss 4501
pgtatalpal rstattptat sftaipsssl gttwtrlsqt ttptatmsta tpsstpetvh
4561 tstvltatat ttgatgsvat psstpgtaht tkvptttttg ftatpssspg
taltppvwis 4621 ttttpttttp ttsgstvtps sipgtthtar vlttttttva
tgsmatpsss tqtsgtppsl 4681 tttattitat gsttnpsstp gttpitpvlt
stattpaats skatsssspr tattlpvlts 4741 tatkstatsf tpipsstlwt
twtvpaqttt pmstmstiht sstpetthts tvltttatmt 4801 ratnstatps
stlgttrilt eltttattta atgstatlss tpgttwilte pstiatvmvp 4861
tgstatasst lgtahtpkvv ttmatmptat astvpssstv gttrtpavlp sslptfsvst
4921 vsssvlttlr ptgfpsshfs tpcfcrafgq ffspgeviyn ktdragchfy
avcnqhcdid 4981 rfqgacptsp ppvssaplss pspapgcdna iplrqvnetw
tlenctvarc vgdnrvvlld 5041 pkpvanvtcv nkhlpikvsd psqpcdfhye
cecicsmwgg shystfdgts ytfrgnctyv 5101 lmreiharfg nlslyldnhy
ctasataaaa rcpralsihy ksmdivltvt mvhgkeegli 5161 lfdqipvssg
fskngvlvsv lgtttmrvdi palgvsvtfn gqvfqarlpy slfhnntegq 5221
cgtctnnqrd dclqrdgtta asckdmaktw lvpdsrkdgc waptgtppta spaapvsstp
5281 tptpcppqpl cdlmlsqvfa echnlvppgp ffnacisdhc rgrlevpcqs
leayaelcra 5341 rgvcsdwrga tgglcdltcp ptkvykpcgp iqpatcnsrn
qspqlegmae gcfcpedqil 5401 fnahmgicvq acpcvgpdgf pkfpgerwvs
ncqscvcdeg svsvqckplp cdaqgqpppc 5461
tvcvcntttc pqslpvcppg qesictqeeg dccptfrcrp 5521 qlcsyngtfy
gvgatfpgal pchmctclsg dtqdptvqcq edacnnttcp qgfeykrvag 5581
qccgecvqta cltpdgqpvq lnetwvnshv dnctvylcea eggvhlltpq pascpdvssc
5641 rgslrktgcc ysceedscqv rinttilwhq gcetevnitf cegscpgask
ysaeaqamqh 5701 qctccqerrv heetvplhcp ngsailhtyt hvdecgctpf
cvpapmapph trgfpaqeat 5761 av
[0242] In some embodiments of the methods of the disclosure, the
wild type human TERT gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_198253.2, transcript variant 1):
TABLE-US-00015 (SEQ ID NO: 13) 1 caggcagcgc tgcgtcctgc tgcgcacgtg
ggaagccctg gccccggcca cccccgcgat 61 gccgcgcgct ccccgctgcc
gagccgtgcg ctccctgctg cgcagccact accgcgaggt 121 gctgccgctg
gccacgttcg tgcggcgcct ggggccccag ggctggcggc tggtgcagcg 181
cggggacccg gcggctttcc gcgcgctggt ggcccagtgc ctggtgtgcg tgccctggga
241 cgcacggccg ccccccgccg ccccctcctt ccgccaggtg tcctgcctga
aggagctggt 301 ggcccgagtg ctgcagaggc tgtgcgagcg cggcgcgaag
aacgtgctgg ccttcggctt 361 cgcgctgctg gacggggccc gcgggggccc
ccccgaggcc ttcaccacca gcgtgcgcag 421 ctacctgccc aacacggtga
ccgacgcact gcgggggagc ggggcgtggg ggctgctgct 481 gcgccgcgtg
ggcgacgacg tgctggttca cctgctggca cgctgcgcgc tctttgtgct 541
ggtggctccc agctgcgcct accaggtgtg cgggccgccg ctgtaccagc tcggcgctgc
601 cactcaggcc cggcccccgc cacacgctag tggaccccga aggcgtctgg
gatgcgaacg 661 ggcctggaac catagcgtca gggaggccgg ggtccccctg
ggcctgccag ccccgggtgc 721 gaggaggcgc gggggcagtg ccagccgaag
tctgccgttg cccaagaggc ccaggcgtgg 781 cgctgcccct gagccggagc
ggacgcccgt tgggcagggg tcctgggccc acccgggcag 841 gacgcgtgga
ccgagtgacc gtggtttctg tgtggtgtca cctgccagac ccgccgaaga 901
agccacctct ttggagggtg cgctctctgg cacgcgccac tcccacccat ccgtgggccg
961 ccagcaccac gcgggccccc catccacatc gcggccacca cgtccctggg
acacgccttg 1021 tcccccggtg tacgccgaga ccaagcactt cctctactcc
tcaggcgaca aggagcagct 1081 gcggccctcc ttcctactca gctctctgag
gcccagcctg actggcgctc ggaggctcgt 1141 ggagaccatc tttctgggtt
ccaggccctg gatgccaggg actccccgca ggttgccccg 1201 cctgccccag
cgctactggc aaatgcggcc cctgtttctg gagctgcttg ggaaccacgc 1261
gcagtgcccc tacggggtgc tcctcaagac gcactgcccg ctgcgagctg cggtcacccc
1321 agcagccggt gtctgtgccc gggagaagcc ccagggctct gtggcggccc
ccgaggagga 1381 ggacacagac ccccgtcgcc tggtgcagct gctccgccag
cacagcagcc cctggcaggt 1441 gtacggcttc gtgcgggcct gcctgcgccg
gctggtgccc ccaggcctct ggggctccag 1501 gcacaacgaa cgccgcttcc
tcaggaacac caagaagttc atctccctgg ggaagcatgc 1561 caagctctcg
ctgcaggagc tgacgtggaa gatgagcgtg cgggactgcg cttggctgcg 1621
caggagccca ggggttggct gtgttccggc cgcagagcac cgtctgcgtg aggagatcct
1681 ggccaagttc ctgcactggc tgatgagtgt gtacgtcgtc gagctgctca
ggtctttctt 1741 ttatgtcacg gagaccacgt ttcaaaagaa caggctcttt
ttctaccgga agagtgtctg 1801 gagcaagttg caaagcattg gaatcagaca
gcacttgaag agggtgcagc tgcgggagct 1861 gtcggaagca gaggtcaggc
agcatcggga agccaggccc gccctgctga cgtccagact 1921 ccgcttcatc
cccaagcctg acgggctgcg gccgattgtg aacatggact acgtcgtggg 1981
agccagaacg ttccgcagag aaaagagggc cgagcgtctc acctcgaggg tgaaggcact
2041 gttcagcgtg ctcaactacg agcgggcgcg gcgccccggc ctcctgggcg
cctctgtgct 2101 gggcctggac gatatccaca gggcctggcg caccttcgtg
ctgcgtgtgc gggcccagga 2161 cccgccgcct gagctgtact ttgtcaaggt
ggatgtgacg ggcgcgtacg acaccatccc 2221 ccaggacagg ctcacggagg
tcatcgccag catcatcaaa ccccagaaca cgtactgcgt 2281 gcgtcggtat
gccgtggtcc agaaggccgc ccatgggcac gtccgcaagg ccttcaagag 2341
ccacgtctct accttgacag acctccagcc gtacatgcga cagttcgtgg ctcacctgca
2401 ggagaccagc ccgctgaggg atgccgtcgt catcgagcag agctcctccc
tgaatgaggc 2461 cagcagtggc ctcttcgacg tcttcctacg cttcatgtgc
caccacgccg tgcgcatcag 2521 gggcaagtcc tacgtccagt gccaggggat
cccgcagggc tccatcctct ccacgctgct 2581 ctgcagcctg tgctacggcg
acatggagaa caagctgttt gcggggattc ggcgggacgg 2641 gctgctcctg
cgtttggtgg atgatttctt gttggtgaca cctcacctca cccacgcgaa 2701
aaccttcctc aggaccctgg tccgaggtgt ccctgagtat ggctgcgtgg tgaacttgcg
2761 gaagacagtg gtgaacttcc ctgtagaaga cgaggccctg ggtggcacgg
cttttgttca 2821 gatgccggcc cacggcctat tcccctggtg cggcctgctg
ctggataccc ggaccctgga 2881 ggtgcagagc gactactcca gctatgcccg
gacctccatc agagccagtc tcaccttcaa 2941 ccgcggcttc aaggctggga
ggaacatgcg tcgcaaactc tttggggtct tgcggctgaa 3001 gtgtcacagc
ctgtttctgg atttgcaggt gaacagcctc cagacggtgt gcaccaacat 3061
ctacaagatc ctcctgctgc aggcgtacag gtttcacgca tgtgtgctgc agctcccatt
3121 tcatcagcaa gtttggaaga accccacatt tttcctgcgc gtcatctctg
acacggcctc 3181 cctctgctac tccatcctga aagccaagaa cgcagggatg
tcgctggggg ccaagggcgc 3241 cgccggccct ctgccctccg aggccgtgca
gtggctgtgc caccaagcat tcctgctcaa 3301 gctgactcga caccgtgtca
cctacgtgcc actcctgggg tcactcagga cagcccagac 3361 gcagctgagt
cggaagctcc cggggacgac gctgactgcc ctggaggccg cagccaaccc 3421
ggcactgccc tcagacttca agaccatcct ggactgatgg ccacccgccc acagccaggc
3481 cgagagcaga caccagcagc cctgtcacgc cgggctctac gtcccaggga
gggaggggcg 3541 gcccacaccc aggcccgcac cgctgggagt ctgaggcctg
agtgagtgtt tggccgaggc 3601 ctgcatgtcc ggctgaaggc tgagtgtccg
gctgaggcct gagcgagtgt ccagccaagg 3661 gctgagtgtc cagcacacct
gccgtcttca cttccccaca ggctggcgct cggctccacc 3721 ccagggccag
cttttcctca ccaggagccc ggcttccact ccccacatag gaatagtcca 3781
tccccagatt cgccattgtt cacccctcgc cctgccctcc tttgccttcc acccccacca
3841 tccaggtgga gaccctgaga aggaccctgg gagctctggg aatttggagt
gaccaaaggt 3901 gtgccctgta cacaggcgag gaccctgcac ctggatgggg
gtccctgtgg gtcaaattgg 3961 ggggaggtgc tgtgggagta aaatactgaa
tatatgagtt tttcagtttt gaaaaaaa
[0243] In some embodiments of the methods of the disclosure, the
wild type human TERT gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_937983.2, transcript variant 1):
TABLE-US-00016 (SEQ ID NO: 14) 1 mpraprcrav rsllrshyre vlplatfvrr
lgpqgwrlvq rgdpaafral vaqclvcvpw 61 darpppaaps frqvsclkel
varvlqrlce rgaknvlafg falldgargg ppeafttsvr 121 sylpntvtda
lrgsgawgll lrrvgddvlv hllarcalfv lvapscayqv cgpplyqlga 181
atqarpppha sgprrrlgce rawnhsvrea gvplglpapg arrrggsasr slplpkrprr
241 gaapepertp vgqgswahpg rtrgpsdrgf cvvsparpae eatslegals
gtrhshpsvg 301 rqhhagppst srpprpwdtp cppvyaetkh flyssgdkeq
lrpsfllssl rpsltgarrl 361 vetiflgsrp wmpgtprrlp rlpqrywqmr
plflellgnh aqcpygvllk thcplraavt 421 paagvcarek pqgsvaapee
edtdprrlvq llrqhsspwq vygfvraclr rlvppglwgs 481 rhnerrflrn
tkkfislgkh aklslqeltw kmsvrdcawl rrspgvgcvp aaehrlreei 541
lakflhwlms vyvvellrsf fyvtettfqk nrlffyrksv wsklqsigir qhlkrvqlre
601 lseaevrqhr earpalltsr lrfipkpdgl rpivnmdyvv gartfrrekr
aerltsrvka 661 lfsvlnyera rrpgllgasv lglddihraw rtfvlrvraq
dpppelyfvk vdvtgaydti 721 pqdrltevia siikpqntyc vrryavvqka
ahghvrkafk shvstltdlq pymrqfvahl 781 qetsplrdav vieqssslne
assglfdvfl rfmchhavri rgksyvqcqg ipqgsilstl 841 lcslcygdme
nklfagirrd glllrlvddf llvtphltha ktflrtlvrg vpeygcvvnl 901
rktvvnfpve dealggtafv qmpahglfpw cgllldtrtl evqsdyssya rtsirasltf
961 nrgfkagrnm rrklfgvlrl kchslfldlq vnslqtvctn iykilllqay
rfhacvlqlp 1021 fhqqvwknpt fflrvisdta slcysilkak nagmslgakg
aagplpseav qwlchqafll 1081 kltrhrvtyv pllgslrtaq tqlsrklpgt
tltaleaaan palpsdfkti ld
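The listings in this section are printed as ten-residue blocks interleaved with running position counters. As an illustrative aid only, and not part of the disclosed methods, the following minimal Python sketch shows one way such a listing might be joined back into a contiguous sequence. The function name `flatten_listing` and the rule for flagging irregular tokens are assumptions made for this example.

```python
import re


def flatten_listing(text):
    """Return (sequence, suspects) for a numbered, block-formatted listing.

    Pure-digit tokens are treated as position counters and dropped,
    pure-letter tokens are concatenated in order, and any other token
    (for example a block containing a stray digit) is reported rather
    than silently repaired.
    """
    residues = []
    suspects = []
    for token in text.lower().split():
        if token.isdigit():
            continue  # position counter such as "61" or "121"
        if re.fullmatch(r"[a-z]+", token):
            residues.append(token)
        else:
            suspects.append(token)  # mixed token; needs manual review
    return "".join(residues), suspects


# Opening blocks of SEQ ID NO: 14 as printed above.
listing = (
    "1 mpraprcrav rsllrshyre vlplatfvrr lgpqgwrlvq rgdpaafral vaqclvcvpw "
    "61 darpppaaps frqvsclkel"
)
sequence, suspects = flatten_listing(listing)
print(len(sequence), sequence[:10], suspects)
```

Reporting irregular tokens instead of discarding them is a deliberate choice in this sketch: a block that mixes digits and letters usually indicates a transcription problem that should be checked against the cited Genbank record.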
[0244] In some embodiments of the methods of the disclosure, the
wild type human TERT gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001193376.1, transcript variant 2):
TABLE-US-00017 (SEQ ID NO: 15) 1 caggcagcgc tgcgtcctgc tgcgcacgtg
ggaagccctg gccccggcca cccccgcgat 61 gccgcgcgct ccccgctgcc
gagccgtgcg ctccctgctg cgcagccact accgcgaggt 121 gctgccgctg
gccacgttcg tgcggcgcct ggggccccag ggctggcggc tggtgcagcg 181
cggggacccg gcggctttcc gcgcgctggt ggcccagtgc ctggtgtgcg tgccctggga
241 cgcacggccg ccccccgccg ccccctcctt ccgccaggtg tcctgcctga
aggagctggt 301 ggcccgagtg ctgcagaggc tgtgcgagcg cggcgcgaag
aacgtgctgg ccttcggctt 361 cgcgctgctg gacggggccc gcgggggccc
ccccgaggcc ttcaccacca gcgtgcgcag 421 ctacctgccc aacacggtga
ccgacgcact gcgggggagc ggggcgtggg ggctgctgct 481 gcgccgcgtg
ggcgacgacg tgctggttca cctgctggca cgctgcgcgc tctttgtgct 541
ggtggctccc agctgcgcct accaggtgtg cgggccgccg ctgtaccagc tcggcgctgc
601 cactcaggcc cggcccccgc cacacgctag tggaccccga aggcgtctgg
gatgcgaacg 661 ggcctggaac catagcgtca gggaggccgg ggtccccctg
ggcctgccag ccccgggtgc 721 gaggaggcgc gggggcagtg ccagccgaag
tctgccgttg cccaagaggc ccaggcgtgg 781 cgctgcccct gagccggagc
ggacgcccgt tgggcagggg tcctgggccc acccgggcag 841 gacgcgtgga
ccgagtgacc gtggtttctg tgtggtgtca cctgccagac ccgccgaaga 901
agccacctct ttggagggtg cgctctctgg cacgcgccac tcccacccat ccgtgggccg
961 ccagcaccac gcgggccccc catccacatc gcggccacca cgtccctggg
acacgccttg 1021 tcccccggtg tacgccgaga ccaagcactt cctctactcc
tcaggcgaca aggagcagct 1081 gcggccctcc ttcctactca gctctctgag
gcccagcctg actggcgctc ggaggctcgt 1141 ggagaccatc tttctgggtt
ccaggccctg gatgccaggg actccccgca ggttgccccg 1201 cctgccccag
cgctactggc aaatgcggcc cctgtttctg gagctgcttg ggaaccacgc 1261
gcagtgcccc tacggggtgc tcctcaagac gcactgcccg ctgcgagctg cggtcacccc
1321 agcagccggt gtctgtgccc gggagaagcc ccagggctct gtggcggccc
ccgaggagga 1381 ggacacagac ccccgtcgcc tggtgcagct gctccgccag
cacagcagcc cctggcaggt 1441 gtacggcttc gtgcgggcct gcctgcgccg
gctggtgccc ccaggcctct ggggctccag 1501 gcacaacgaa cgccgcttcc
tcaggaacac caagaagttc atctccctgg ggaagcatgc 1561 caagctctcg
ctgcaggagc tgacgtggaa gatgagcgtg cgggactgcg cttggctgcg 1621
caggagccca ggggttggct gtgttccggc cgcagagcac cgtctgcgtg aggagatcct
1681 ggccaagttc ctgcactggc tgatgagtgt gtacgtcgtc gagctgctca
ggtctttctt 1741 ttatgtcacg gagaccacgt ttcaaaagaa caggctcttt
ttctaccgga agagtgtctg 1801 gagcaagttg caaagcattg gaatcagaca
gcacttgaag agggtgcagc tgcgggagct 1861 gtcggaagca gaggtcaggc
agcatcggga agccaggccc gccctgctga cgtccagact 1921 ccgcttcatc
cccaagcctg acgggctgcg gccgattgtg aacatggact acgtcgtggg 1981
agccagaacg ttccgcagag aaaagagggc cgagcgtctc acctcgaggg tgaaggcact
2041 gttcagcgtg ctcaactacg agcgggcgcg gcgccccggc ctcctgggcg
cctctgtgct 2101 gggcctggac gatatccaca gggcctggcg caccttcgtg
ctgcgtgtgc gggcccagga 2161 cccgccgcct gagctgtact ttgtcaaggt
ggatgtgacg ggcgcgtacg acaccatccc 2221 ccaggacagg ctcacggagg
tcatcgccag catcatcaaa ccccagaaca cgtactgcgt 2281 gcgtcggtat
gccgtggtcc agaaggccgc ccatgggcac gtccgcaagg ccttcaagag 2341
ccacgtctct accttgacag acctccagcc gtacatgcga cagttcgtgg ctcacctgca
2401 ggagaccagc ccgctgaggg atgccgtcgt catcgagcag agctcctccc
tgaatgaggc 2461 cagcagtggc ctcttcgacg tcttcctacg cttcatgtgc
caccacgccg tgcgcatcag 2521 gggcaagtcc tacgtccagt gccaggggat
cccgcagggc tccatcctct ccacgctgct 2581 ctgcagcctg tgctacggcg
acatggagaa caagctgttt gcggggattc ggcgggacgg 2641 gctgctcctg
cgtttggtgg atgatttctt gttggtgaca cctcacctca cccacgcgaa 2701
aaccttcctc agctatgccc ggacctccat cagagccagt ctcaccttca accgcggctt
2761 caaggctggg aggaacatgc gtcgcaaact ctttggggtc ttgcggctga
agtgtcacag 2821 cctgtttctg gatttgcagg tgaacagcct ccagacggtg
tgcaccaaca tctacaagat 2881 cctcctgctg caggcgtaca ggtttcacgc
atgtgtgctg cagctcccat ttcatcagca 2941 agtttggaag aaccccacat
ttttcctgcg cgtcatctct gacacggcct ccctctgcta 3001 ctccatcctg
aaagccaaga acgcagggat gtcgctgggg gccaagggcg ccgccggccc 3061
tctgccctcc gaggccgtgc agtggctgtg ccaccaagca ttcctgctca agctgactcg
3121 acaccgtgtc acctacgtgc cactcctggg gtcactcagg acagcccaga
cgcagctgag 3181 tcggaagctc ccggggacga cgctgactgc cctggaggcc
gcagccaacc cggcactgcc 3241 ctcagacttc aagaccatcc tggactgatg
gccacccgcc cacagccagg ccgagagcag 3301 acaccagcag ccctgtcacg
ccgggctcta cgtcccaggg agggaggggc ggcccacacc 3361 caggcccgca
ccgctgggag tctgaggcct gagtgagtgt ttggccgagg cctgcatgtc 3421
cggctgaagg ctgagtgtcc ggctgaggcc tgagcgagtg tccagccaag ggctgagtgt
3481 ccagcacacc tgccgtcttc acttccccac aggctggcgc tcggctccac
cccagggcca 3541 gcttttcctc accaggagcc cggcttccac tccccacata
ggaatagtcc atccccagat 3601 tcgccattgt tcacccctcg ccctgccctc
ctttgccttc cacccccacc atccaggtgg 3661 agaccctgag aaggaccctg
ggagctctgg gaatttggag tgaccaaagg tgtgccctgt 3721 acacaggcga
ggaccctgca cctggatggg ggtccctgtg ggtcaaattg gggggaggtg 3781
ctgtgggagt aaaatactga atatatgagt ttttcagttt tgaaaaaaa
[0245] In some embodiments of the methods of the disclosure, the
wild type human TERT gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001180305.1, transcript variant 2):
TABLE-US-00018 (SEQ ID NO: 16) 1 mpraprcrav rsllrshyre vlplatfvrr
lgpqgwrlvq rgdpaafral vaqclvcvpw 61 darpppaaps frqvsclkel
varvlqrlce rgaknvlafg falldgargg ppeafttsvr 121 sylpntvtda
lrgsgawgll lrrvgddvlv hllarcalfv lvapscayqv cgpplyqlga 181
atqarpppha sgprrrlgce rawnhsvrea gvplglpapg arrrggsasr slplpkrprr
241 gaapepertp vgqgswahpg rtrgpsdrgf cvvsparpae eatslegals
gtrhshpsvg 301 rqhhagppst srpprpwdtp cppvyaetkh flyssgdkeq
lrpsfllssl rpsltgarrl 361 vetiflgsrp wmpgtprrlp rlpqrywqmr
plflellgnh aqcpygvllk thcplraavt 421 paagvcarek pqgsvaapee
edtdprrlvq llrqhsspwq vygfvraclr rlvppglwgs 481 rhnerrflrn
tkkfislgkh aklslqeltw kmsvrdcawl rrspgvgcvp aaehrlreei 541
lakflhwlms vyvvellrsf fyvtettfqk nrlffyrksv wsklqsigir qhlkrvqlre
601 lseaevrqhr earpalltsr lrfipkpdgl rpivnmdyvv gartfrrekr
aerltsrvka 661 lfsvlnyera rrpgllgasv lglddihraw rtfvlrvraq
dpppelyfvk vdvtgaydti 721 pqdrltevia siikpqntyc vrryavvqka
ahghvrkafk shvstltdlq pymrqfvahl 781 qetsplrdav vieqssslne
assglfdvfl rfmchhavri rgksyvqcqg ipqgsilstl 841 lcslcygdme
nklfagirrd glllrlvddf llvtphltha ktflsyarts irasltfnrg 901
fkagrnmrrk lfgvlrlkch slfldlqvns lqtvctniyk illlqayrfh acvlqlpfhq
961 qvwknptffl rvisdtaslc ysilkaknag mslgakgaag plpseavqwl
chqafllklt 1021 rhrvtyvpll gslrtaqtql srklpgttlt aleaaanpal
psdfktild
[0246] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_014883.3, transcript variant 1):
TABLE-US-00019 (SEQ ID NO: 36) 1 atcaaatttc aactccaggc agtccttcca
gccatgtggg ttcagcggaa agagaagcaa 61 aaccactctt cctaaaatgt
tagaagctgc tcttcgctta ccttggggcc tttgcattgg 121 gagctgtttt
tcacatcaaa gaatatgtgc tgaatggaat tttagtattt tgctgtcgtt 181
ttaatatttt cgtctggtct tcctcagttc ttccagacgc tttctgagag aatgggggca
241 ggagctctag ccatctgtca aagtaaagca gcggttcggc tgaaagaaga
catgaaaaag 301 atagtggcag tgccattaaa tgaacagaag gattttacct
atcagaagtt atttggagtc 361 agtctccaag aacttgaacg gcaggggctc
accgagaatg gcattccagc agtagtgtgg 421 aatatagtgg aatatttgac
gcagcatgga cttacccaag aaggtctttt tagggtgaat 481 ggtaacgtga
aggtggtgga acaacttcga ctgaagttcg agagtggagt gcccgtggag 541
ctcgggaagg acggtgatgt ctgctcagca gccagtctgt tgaagctgtt tctgagggag
601 ctgcctgaca gtctgatcac ctcagcgttg cagcctcgat tcattcaact
ctttcaggat 661 ggcagaaatg atgttcagga gagtagctta agagacttaa
taaaagagct gccagacacc 721 cactactgcc tcctcaagta cctttgccag
ttcttgacaa aagtagccaa gcatcatgtg 781 cagaatcgca tgaatgttca
caatctcgcc actgtatttg ggccaaattg ctttcatgtg 841 ccacctgggc
ttgaaggcat gaaggaacag gacctgtgca acaagataat ggctaaaatt 901
ctagaaaatt acaataccct gtttgaagta gagtatacag aaaatgatca tctgagatgt
961 gaaaacctgg ctaggcttat catagtaaaa gaggtctatt ataagaactc
cctgcccatc 1021 cttttaacaa gaggcttaga aagagacatg ccaaaaccac
ctccaaaaac caagatccca 1081 aaatccagga gtgagggatc tattcaggcc
cacagagtac tgcaaccaga gctatctgat 1141 ggcattcctc agctcagctt
gcggctaagt tatagaaaag cctgcttgga agacatgaat 1201 tcagcagagg
gtgctattag tgccaagttg gtacccagtt cacaggaaga tgaaagacct 1261
ctgtcacctt tctatttgag tgctcatgta ccccaagtca gcaatgtgtc tgcaaccgga
1321 gaactcttag aaagaaccat ccgatcagct gtagaacaac atctttttga
tgttaataac 1381 tctggaggtc aaagttcaga ggactcagaa tctggaacac
tatcagcatc ttctgccaca 1441 tctgccagac agcgccgccg ccagtccaag
gagcaggatg aagttcgaca tgggagagac 1501 aagggactta tcaacaaaga
aaatactcct tctgggttca accaccttga tgattgtatt 1561 ttgaatactc
aggaagtcga aaaggtacac aaaaatactt ttggttgtgc tggagaaagg 1621
agcaagccta aacgtcagaa atccagtact aaactttctg agcttcatga caatcaggac
1681 ggtcttgtga atatggaaag tctcaattcc acacgatctc atgagagaac
tggacctgat 1741 gattttgaat ggatgtctga tgaaaggaaa ggaaatgaaa
aagatggtgg acacactcag 1801 cattttgaga gccccacaat gaagatccag
gagcatccca gcctatctga caccaaacag 1861 cagagaaatc aagatgccgg
tgaccaggag gagagctttg tctccgaagt gccccagtcg 1921 gacctgactg
cattgtgtga tgaaaagaac tgggaagagc ctatccctgc tttctcctcc 1981
tggcagcggg agaacagtga ctctgatgaa gcccacctct cgccgcaggc tgggcgcctg
2041 atccgtcagc tgctggacga agacagcgac cccatgctct ctcctcggtt
ctacgcttat 2101 gggcagagca ggcaatacct ggatgacaca gaagtgcctc
cttccccacc aaactcccat 2161 tctttcatga ggcggcgaag ctcctctctg
gggtcctatg atgatgagca agaggacctg 2221 acacctgccc agctcacacg
aaggattcag agccttaaaa agaagatccg gaagtttgaa 2281 gatagattcg
aagaagagaa gaagtacaga ccttcccaca gtgacaaagc agccaatccg 2341
gaggttctga aatggacaaa tgaccttgcc aaattccgga gacaacttaa agaatcaaaa
2401 ctaaagatat ctgaagagga cctaactccc aggatgcggc agcgaagcaa
cacactcccc 2461 aagagttttg gttcccaact tgagaaagaa gatgagaaga
agcaagagct ggtggataaa 2521 gcaataaagc ccagtgttga agccacattg
gaatctattc agaggaagct ccaggagaag 2581 cgagcggaaa gcagccgccc
tgaggacatt aaggatatga ccaaagacca gattgctaat 2641 gagaaagtgg
ctctgcagaa agctctgtta tattatgaaa gcattcatgg acggccggta 2701
acaaagaacg aacggcaggt gatgaagcca ctatacgaca ggtaccggct ggtcaaacag
2761 atcctctccc gagctaacac catacccatc attggttccc cctccagcaa
gcggagaagc 2821 cctttgctgc agccaattat cgagggcgaa actgcttcct
tcttcaagga gataaaggaa 2881 gaagaggagg ggtcagaaga cgatagcaat
gtgaagccag acttcatggt cactctgaaa 2941 accgatttca gtgcacgatg
ctttctggac caattcgaag atgacgctga tggatttatt 3001 tccccaatgg
atgataaaat accatcaaaa tgcagccagg acacagggct ttcaaatctc 3061
catgctgcct caatacctga actcctggaa cacctccagg aaatgagaga agaaaagaaa
3121 aggattcgaa agaaacttcg ggattttgaa gacaactttt tcagacagaa
tggaagaaat 3181 gtccagaagg aagaccgcac tcctatggct gaagaataca
gtgaatataa gcacataaag 3241 gcgaaactga ggctcctgga ggtgctcatc
agcaagagag acactgattc caagtccatg 3301 tgaggggcat ggccaagcac
agggggctgg cagctgcggt gagagtttac tgtccccaga 3361 gaaagtgcag
ctctggaagg cagccttggg gctggccctg caaagcatgc agcccttctg 3421
cctctagacc atttggcatc ggctcctgtt tccattgcct gccttagaaa ctggctggaa
3481 gaagacaatg tgacctgact taggcatttt gtaattggaa agtcaagact
gcagtatgtg 3541 cacatgcgca cgcgcatgca cgcacacaca cacacagtag
tggagctttc ctaacactag 3601 cagagattaa tcactacatt agacaacact
catctacaga gaatatacac tgttcttccc 3661 tggataactg agaaacaaga
gaccattctc tgtctaactg tgataaaaac aagctcagga 3721 ctttattcta
tagagcaaac ttgctgtgga gggccatgct ctccttggac ccagttaact 3781
gcaaacgtgc attggagccc tatttgctgc cgctgccatt ctagtgacct ttccacagag
3841 ctgcgccttc ctcacgtgtg tgaaaggttt tccccttcag ccctcaggta
gatggaagct 3901 gcatctgccc acgatggcag tgcagtcatc atcttcagga
tgtttcttca ggacttcctc 3961 agctgacaag gaattttggt ccctgcctag
gaccgggtca tctgcagagg acagagagat 4021 ggtaagcagc tgtatgaatg
ctgattttaa aaccaggtca tgggagaaga gcctggagat 4081 tctttcctga
acactgactg cacttaccag tctgatttta tcgtcaaaca ccaagccagg 4141
ctagcatgct catggcaatc tgtttggggc tgttttgttg tggcactagc caaacataaa
4201 ggggcttaag tcagcctgca tacagaggat cggggagaga aggggcctgt
gttctcagcc 4261 tcctgagtac ttaccagagt ttaatttttt taaaaaaaat
ctgcactaaa atccccaaac 4321 tgacaggtaa atgtagccct cagagctcag
cccaaggcag aatctaaatc acactatttt 4381 cgagatcatg tataaaaaga
aaaaaaagaa gtcatgctgt gtggccaatt ataatttttt 4441 tcaaagactt
tgtcacaaaa ctgtctatat tagacatttt ggagggacca ggaaatgtaa 4501
gacaccaaat cctccatctc ttcagtgtgc ctgatgtcac ctcatgattt gctgttactt
4561 ttttaactcc tgcgccaagg acagtgggtt ctgtgtccac ctttgtgctt
tgcgaggccg 4621 agcccaggca tctgctcgcc tgccacggct gaccagagaa
ggtgcttcag gagctctgcc 4681 ttagacgacg tgttacagta tgaacacaca
gcagaggcac cctcgtatgt tttgaaagtt 4741 gccttctgaa agggcacagt
tttaaggaaa agaaaaagaa tgtaaaacta tactgacccg 4801 ttttcagttt
taaagggtcg tgagaaactg gctggtccaa tgggatttac agcaacattt 4861
tccattgctg aagtgaggta gcagctctct tctgtcagct gaatgttaag gatggggaaa
4921 aagaatgcct ttaagtttgc tcttaatcgt atggaagctt gagctatgtg
ttggaagtgc 4981 cctggtttta atccatacac aaagacggta cataatccta
caggtttaaa tgtacataaa 5041 aatatagttt ggaattcttt gctctactgt
ttacattgca gattgctata atttcaagga 5101 gtgagattat aaataaaatg
atgcacttta ggatgtttcc tatttttgaa atctgaacat 5161 gaatcattca
catgaccaaa aattgtgttt ttttaaaaat acatgtctag tctgtccttt 5221
aatagctctc ttaaataagc tatgatatta atcagatcat taccagttag cttttaaagc
5281 acatttgttt aagactatgt ttttggaaaa atacgctaca gaattttttt
ttaagctaca 5341 aataaatgag atgctactaa ttgttttgga atctgttgtt
tctgccaaag gtaaattaac 5401 taaagattta ttcaggaatc cccatttgaa
tttgtatgat tcaataaaag aaaacaccaa 5461 gtaagttata taaaataaat
tgtgtatgag atgttgtgtt ttcctttgta atttccacta 5521 actaactaac
taacttatat tcttcatgga atggagccca gaagaaatga gaggaagccc 5581
ttttcacact agatcttatt tgaagaaatg tttgttagtc agtcagtcag tggtttctgg
5641 ctctgccgag ggagatgtgt tccccagcaa ccatttctgc agcccagaat
ctcaaggcac 5701 tagaggcggt gtcttaatta attggcttca caaagacaaa
atgctctgga ctgggatttt 5761 tcctttgctg tgttgggaat atgtgtttat
taattagcac atgccaacaa aataaatgtc 5821 aagagttatt tcataagtgt
aagtaaactt aagaattaaa gagtgcagac ttataatttt 5881 ca
[0247] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_055698.2, transcript variant 1):
TABLE-US-00020 (SEQ ID NO: 37) 1 mgagalaicq skaavrlked mkkivavpln
eqkdftyqkl fgvslqeler qgltengipa 61 vvwniveylt qhgltqeglf
rvngnvkvve qlrlkfesgv pvelgkdgdv csaasllklf 121 lrelpdslit
salqprfiql fqdgrndvqe sslrdlikel pdthycllky lcqfltkvak 181
hhvqnrmnvh nlatvfgpnc fhvppglegm keqdlcnkim akilenyntl feveytendh
241 lrcenlarli ivkevyykns lpilltrgle rdmpkpppkt kipksrsegs
iqahrvlqpe 301 lsdgipqlsl rlsyrkacle dmnsaegais aklvpssqed
erplspfyls ahvpqvsnvs 361 atgellerti rsaveqhlfd vnnsggqsse
dsesgtlsas satsarqrrr qskeqdevrh 421 grdkglinke ntpsgfnhld
dcilntqeve kvhkntfgca gerskpkrqk sstklselhd 481 nqdglvnmes
lnstrshert gpddfewmsd erkgnekdgg htqhfesptm kiqehpslsd 541
tkqqrnqdag dqeesfvsev pqsdltalcd eknweepipa fsswqrensd sdeahlspqa
601 grlirqllde dsdpmlsprf yaygqsrqyl ddtevppspp nshsfmrrrs
sslgsyddeq 661 edltpaqltr riqslkkkir kfedrfeeek kyrpshsdka
anpevlkwtn dlakfrrqlk 721 esklkiseed ltprmrqrsn tlpksfgsql
ekedekkqel vdkaikpsve atlesiqrkl 781 qekraessrp edikdmtkdq
ianekvalqk allyyesihg rpvtknerqv mkplydryrl 841 vkqilsrant
ipiigspssk rrspllqpii egetasffke ikeeeegsed dsnvkpdfmv 901
tlktdfsarc fldqfeddad gfispmddki pskcsqdtgl snlhaasipe llehlqemre
961 ekkrirkklr dfednffrqn grnvqkedrt pmaeeyseyk hikaklrlle
vliskrdtds 1021 ksm
[0248] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001015045.2, transcript variant 2):
TABLE-US-00021 (SEQ ID NO: 17) 1 attgaggagc agaaggagta gggtgcgggg
gaggaggagg agcgccttta gtgctgcagc 61 agctgctgct ctgattggcc
cggtggttca gctgcttccc tggaacaaaa ggtcaaagtg 121 gactgcagtg
taaatgtaga gaagcagccg ataaaatagc attgcctgaa gaagtttgga 181
ggctgagagc agcagtagac tggccaactg cagagcaagt tgtttctcca gccgtgcggt
241 gcagcctcat gcccccaacc cagcttagcc actgtaagaa gacgttcact
gtacagacga 301 ccaaacttgc cgtggaagag acagttgtga gattcccttg
caaatttaca tacgagaatg 361 gcttgtgaaa tcatgcctct gcaaagttca
caggaagatg aaagacctct gtcacctttc 421 tatttgagtg ctcatgtacc
ccaagtcagc aatgtgtctg caaccggaga actcttagaa 481 agaaccatcc
gatcagctgt agaacaacat ctttttgatg ttaataactc tggaggtcaa 541
agttcagagg actcagaatc tggaacacta tcagcatctt ctgccacatc tgccagacag
601 cgccgccgcc agtccaagga gcaggatgaa gttcgacatg ggagagacaa
gggacttatc 661 aacaaagaaa atactccttc tgggttcaac caccttgatg
attgtatttt gaatactcag 721 gaagtcgaaa aggtacacaa aaatactttt
ggttgtgctg gagaaaggag caagcctaaa 781 cgtcagaaat ccagtactaa
actttctgag cttcatgaca atcaggacgg tcttgtgaat 841 atggaaagtc
tcaattccac acgatctcat gagagaactg gacctgatga ttttgaatgg 901
atgtctgatg aaaggaaagg aaatgaaaaa gatggtggac acactcagca ttttgagagc
961 cccacaatga agatccagga gcatcccagc ctatctgaca ccaaacagca
gagaaatcaa 1021 gatgccggtg accaggagga gagctttgtc tccgaagtgc
cccagtcgga cctgactgca 1081 ttgtgtgatg aaaagaactg ggaagagcct
atccctgctt tctcctcctg gcagcgggag 1141 aacagtgact ctgatgaagc
ccacctctcg ccgcaggctg ggcgcctgat ccgtcagctg 1201 ctggacgaag
acagcgaccc catgctctct cctcggttct acgcttatgg gcagagcagg 1261
caatacctgg atgacacaga agtgcctcct tccccaccaa actcccattc tttcatgagg
1321 cggcgaagct cctctctggg gtcctatgat gatgagcaag aggacctgac
acctgcccag 1381 ctcacacgaa ggattcagag ccttaaaaag aagatccgga
agtttgaaga tagattcgaa 1441 gaagagaaga agtacagacc ttcccacagt
gacaaagcag ccaatccgga ggttctgaaa 1501 tggacaaatg accttgccaa
attccggaga caacttaaag aatcaaaact aaagatatct 1561 gaagaggacc
taactcccag gatgcggcag cgaagcaaca cactccccaa gagttttggt 1621
tcccaacttg agaaagaaga tgagaagaag caagagctgg tggataaagc aataaagccc
1681 agtgttgaag ccacattgga atctattcag aggaagctcc aggagaagcg
agcggaaagc 1741 agccgccctg aggacattaa ggatatgacc aaagaccaga
ttgctaatga gaaagtggct 1801 ctgcagaaag ctctgttata ttatgaaagc
attcatggac ggccggtaac aaagaacgaa 1861 cggcaggtga tgaagccact
atacgacagg taccggctgg tcaaacagat cctctcccga 1921 gctaacacca
tacccatcat tggttccccc tccagcaagc ggagaagccc tttgctgcag 1981
ccaattatcg agggcgaaac tgcttccttc ttcaaggaga taaaggaaga agaggagggg
2041 tcagaagacg atagcaatgt gaagccagac ttcatggtca ctctgaaaac
cgatttcagt 2101 gcacgatgct ttctggacca attcgaagat gacgctgatg
gatttatttc cccaatggat 2161 gataaaatac catcaaaatg cagccaggac
acagggcttt caaatctcca tgctgcctca 2221 atacctgaac tcctggaaca
cctccaggaa atgagagaag aaaagaaaag gattcgaaag 2281 aaacttcggg
attttgaaga caactttttc agacagaatg gaagaaatgt ccagaaggaa 2341
gaccgcactc ctatggctga agaatacagt gaatataagc acataaaggc gaaactgagg
2401 ctcctggagg tgctcatcag caagagagac actgattcca agtccatgtg
aggggcatgg 2461 ccaagcacag ggggctggca gctgcggtga gagtttactg
tccccagaga aagtgcagct 2521 ctggaaggca gccttggggc tggccctgca
aagcatgcag cccttctgcc tctagaccat 2581 ttggcatcgg ctcctgtttc
cattgcctgc cttagaaact ggctggaaga agacaatgtg 2641 acctgactta
ggcattttgt aattggaaag tcaagactgc agtatgtgca catgcgcacg 2701
cgcatgcacg cacacacaca cacagtagtg gagctttcct aacactagca gagattaatc
2761 actacattag acaacactca tctacagaga atatacactg ttcttccctg
gataactgag 2821 aaacaagaga ccattctctg tctaactgtg ataaaaacaa
gctcaggact ttattctata 2881 gagcaaactt gctgtggagg gccatgctct
ccttggaccc agttaactgc aaacgtgcat 2941 tggagcccta tttgctgccg
ctgccattct agtgaccttt ccacagagct gcgccttcct 3001 cacgtgtgtg
aaaggttttc cccttcagcc ctcaggtaga tggaagctgc atctgcccac 3061
gatggcagtg cagtcatcat cttcaggatg tttcttcagg acttcctcag ctgacaagga
3121 attttggtcc ctgcctagga ccgggtcatc tgcagaggac agagagatgg
taagcagctg 3181 tatgaatgct gattttaaaa ccaggtcatg ggagaagagc
ctggagattc tttcctgaac 3241 actgactgca cttaccagtc tgattttatc
gtcaaacacc aagccaggct agcatgctca 3301 tggcaatctg tttggggctg
ttttgttgtg gcactagcca aacataaagg ggcttaagtc 3361 agcctgcata
cagaggatcg gggagagaag gggcctgtgt tctcagcctc ctgagtactt 3421
accagagttt aattttttta aaaaaaatct gcactaaaat ccccaaactg acaggtaaat
3481 gtagccctca gagctcagcc caaggcagaa tctaaatcac actattttcg
agatcatgta 3541 taaaaagaaa aaaaagaagt catgctgtgt ggccaattat
aatttttttc aaagactttg 3601 tcacaaaact gtctatatta gacattttgg
agggaccagg aaatgtaaga caccaaatcc 3661 tccatctctt cagtgtgcct
gatgtcacct catgatttgc tgttactttt ttaactcctg 3721 cgccaaggac
agtgggttct gtgtccacct ttgtgctttg cgaggccgag cccaggcatc 3781
tgctcgcctg ccacggctga ccagagaagg tgcttcagga gctctgcctt agacgacgtg
3841 ttacagtatg aacacacagc agaggcaccc tcgtatgttt tgaaagttgc
cttctgaaag 3901 ggcacagttt taaggaaaag aaaaagaatg taaaactata
ctgacccgtt ttcagtttta 3961 aagggtcgtg agaaactggc tggtccaatg
ggatttacag caacattttc cattgctgaa 4021 gtgaggtagc agctctcttc
tgtcagctga atgttaagga tggggaaaaa gaatgccttt 4081 aagtttgctc
ttaatcgtat ggaagcttga gctatgtgtt ggaagtgccc tggttttaat 4141
ccatacacaa agacggtaca taatcctaca ggtttaaatg tacataaaaa tatagtttgg
4201 aattctttgc tctactgttt acattgcaga ttgctataat ttcaaggagt
gagattataa 4261 ataaaatgat gcactttagg atgtttccta tttttgaaat
ctgaacatga atcattcaca 4321 tgaccaaaaa ttgtgttttt ttaaaaatac
atgtctagtc tgtcctttaa tagctctctt 4381 aaataagcta tgatattaat
cagatcatta ccagttagct tttaaagcac atttgtttaa 4441 gactatgttt
ttggaaaaat acgctacaga attttttttt aagctacaaa taaatgagat 4501
gctactaatt gttttggaat ctgttgtttc tgccaaaggt aaattaacta aagatttatt
4561 caggaatccc catttgaatt tgtatgattc aataaaagaa aacaccaagt
aagttatata 4621 aaataaattg tgtatgagat gttgtgtttt cctttgtaat
ttccactaac taactaacta 4681 acttatattc ttcatggaat ggagcccaga
agaaatgaga ggaagccctt ttcacactag 4741 atcttatttg aagaaatgtt
tgttagtcag tcagtcagtg gtttctggct ctgccgaggg 4801 agatgtgttc
cccagcaacc atttctgcag cccagaatct caaggcacta gaggcggtgt 4861
cttaattaat tggcttcaca aagacaaaat gctctggact gggatttttc ctttgctgtg
4921 ttgggaatat gtgtttatta attagcacat gccaacaaaa taaatgtcaa
gagttatttc 4981 ataagtgtaa gtaaacttaa gaattaaaga gtgcagactt
ataattttca
[0249] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001015045.1, transcript variant 2):
TABLE-US-00022 (SEQ ID NO: 18) 1 maceimplqs sqederplsp fylsahvpqv
snvsatgell ertirsaveq hlfdvnnsgg 61 qssedsesgt lsassatsar
qrrrqskeqd evrhgrdkgl inkentpsgf nhlddcilnt 121 qevekvhknt
fgcagerskp krqksstkls elhdnqdglv nmeslnstrs hertgpddfe 181
wmsderkgne kdgghtqhfe sptmkiqehp slsdtkqqrn qdagdqeesf vsevpqsdlt
241 alcdeknwee pipafsswqr ensdsdeahl spqagrlirq lldedsdpml
sprfyaygqs 301 rqylddtevp psppnshsfm rrrssslgsy ddeqedltpa
qltrriqslk kkirkfedrf 361 eeekkyrpsh sdkaanpevl kwtndlakfr
rqlkesklki seedltprmr qrsntlpksf 421 gsqlekedek kqelvdkaik
psveatlesi qrklqekrae ssrpedikdm tkdqianekv 481 alqkallyye
sihgrpvtkn erqvmkplyd ryrlvkqils rantipiigs psskrrspll 541
qpiiegetas ffkeikeeee gseddsnvkp dfmvtlktdf sarcfldqfe ddadgfispm
601 ddkipskcsq dtglsnlhaa sipellehlq emreekkrir kklrdfednf
frqngrnvqk 661 edrtpmaeey seykhikakl rllevliskr dtdsksm
[0250] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001265578.1, transcript variant 3):
TABLE-US-00023 (SEQ ID NO: 38) 1 attgaggagc agaaggagta gggtgcgggg
gaggaggagg agcgccttta gtgctgcagc 61 agctgctgct ctgattggcc
cggtggttca gctgcttccc tggaacaaaa ggtcaaagtg 121 gactgcagtg
taaatgtaga gaagcagccg ataaaatagc attgcctgaa gaagtttgga 181
ggctgagagc agcagtagac tggccaactg cagagcaagt tgtttctcca gccgtgcggt
241 gcagcctcat gcccccaacc cagcttagcc actgtaagaa gacgttcact
gtacagacga 301 ccaaacttgc cgtggaagag acagttgtga gattcccttg
caaatttaca tacgagaatg 361 gcttgtgaaa tcatgcctct gcaaagtgct
catgtacccc aagtcagcaa tgtgtctgca 421 accggagaac tcttagaaag
aaccatccga tcagctgtag aacaacatct ttttgatgtt 481 aataactctg
gaggtcaaag ttcagaggac tcagaatctg gaacactatc agcatcttct 541
gccacatctg ccagacagcg ccgccgccag tccaaggagc aggatgaagt tcgacatggg
601 agagacaagg gacttatcaa caaagaaaat actccttctg ggttcaacca
ccttgatgat 661 tgtattttga atactcagga agtcgaaaag gtacacaaaa
atacttttgg ttgtgctgga 721 gaaaggagca agcctaaacg tcagaaatcc
agtactaaac tttctgagct tcatgacaat 781 caggacggtc ttgtgaatat
ggaaagtctc aattccacac gatctcatga gagaactgga 841 cctgatgatt
ttgaatggat gtctgatgaa aggaaaggaa atgaaaaaga tggtggacac 901
actcagcatt ttgagagccc cacaatgaag atccaggagc atcccagcct atctgacacc
961 aaacagcaga gaaatcaaga tgccggtgac caggaggaga gctttgtctc
cgaagtgccc 1021 cagtcggacc tgactgcatt gtgtgatgaa aagaactggg
aagagcctat ccctgctttc 1081 tcctcctggc agcgggagaa cagtgactct
gatgaagccc acctctcgcc gcaggctggg 1141 cgcctgatcc gtcagctgct
ggacgaagac agcgacccca tgctctctcc tcggttctac 1201 gcttatgggc
agagcaggca atacctggat gacacagaag tgcctccttc cccaccaaac 1261
tcccattctt tcatgaggcg gcgaagctcc tctctggggt cctatgatga tgagcaagag
1321 gacctgacac ctgcccagct cacacgaagg attcagagcc ttaaaaagaa
gatccggaag 1381 tttgaagata gattcgaaga agagaagaag tacagacctt
cccacagtga caaagcagcc 1441 aatccggagg ttctgaaatg gacaaatgac
cttgccaaat tccggagaca acttaaagaa 1501 tcaaaactaa agatatctga
agaggaccta actcccagga tgcggcagcg aagcaacaca 1561 ctccccaaga
gttttggttc ccaacttgag aaagaagatg agaagaagca agagctggtg 1621
gataaagcaa taaagcccag tgttgaagcc acattggaat ctattcagag gaagctccag
1681 gagaagcgag cggaaagcag ccgccctgag gacattaagg atatgaccaa
agaccagatt 1741 gctaatgaga aagtggctct gcagaaagct ctgttatatt
atgaaagcat tcatggacgg 1801 ccggtaacaa agaacgaacg gcaggtgatg
aagccactat acgacaggta ccggctggtc 1861 aaacagatcc tctcccgagc
taacaccata cccatcattg gttccccctc cagcaagcgg 1921 agaagccctt
tgctgcagcc aattatcgag ggcgaaactg cttccttctt caaggagata 1981
aaggaagaag aggaggggtc agaagacgat agcaatgtga agccagactt catggtcact
2041 ctgaaaaccg atttcagtgc acgatgcttt ctggaccaat tcgaagatga
cgctgatgga 2101 tttatttccc caatggatga taaaatacca tcaaaatgca
gccaggacac agggctttca 2161 aatctccatg ctgcctcaat acctgaactc
ctggaacacc tccaggaaat gagagaagaa 2221 aagaaaagga ttcgaaagaa
acttcgggat tttgaagaca actttttcag acagaatgga 2281 agaaatgtcc
agaaggaaga ccgcactcct atggctgaag aatacagtga atataagcac 2341
ataaaggcga aactgaggct cctggaggtg ctcatcagca agagagacac tgattccaag
2401 tccatgtgag gggcatggcc aagcacaggg ggctggcagc tgcggtgaga
gtttactgtc 2461 cccagagaaa gtgcagctct ggaaggcagc cttggggctg
gccctgcaaa gcatgcagcc 2521 cttctgcctc tagaccattt ggcatcggct
cctgtttcca ttgcctgcct tagaaactgg 2581 ctggaagaag acaatgtgac
ctgacttagg cattttgtaa ttggaaagtc aagactgcag 2641 tatgtgcaca
tgcgcacgcg catgcacgca cacacacaca cagtagtgga gctttcctaa 2701
cactagcaga gattaatcac tacattagac aacactcatc tacagagaat atacactgtt
2761 cttccctgga taactgagaa acaagagacc attctctgtc taactgtgat
aaaaacaagc 2821 tcaggacttt attctataga gcaaacttgc tgtggagggc
catgctctcc ttggacccag 2881 ttaactgcaa acgtgcattg gagccctatt
tgctgccgct gccattctag tgacctttcc 2941 acagagctgc gccttcctca
cgtgtgtgaa aggttttccc cttcagccct caggtagatg 3001 gaagctgcat
ctgcccacga tggcagtgca gtcatcatct tcaggatgtt tcttcaggac 3061
ttcctcagct gacaaggaat tttggtccct gcctaggacc gggtcatctg cagaggacag
3121 agagatggta agcagctgta tgaatgctga ttttaaaacc aggtcatggg
agaagagcct 3181 ggagattctt tcctgaacac tgactgcact taccagtctg
attttatcgt caaacaccaa 3241 gccaggctag catgctcatg gcaatctgtt
tggggctgtt ttgttgtggc actagccaaa 3301 cataaagggg cttaagtcag
cctgcataca gaggatcggg gagagaaggg gcctgtgttc 3361 tcagcctcct
gagtacttac cagagtttaa tttttttaaa aaaaatctgc actaaaatcc 3421
ccaaactgac aggtaaatgt agccctcaga gctcagccca aggcagaatc taaatcacac
3481 tattttcgag atcatgtata aaaagaaaaa aaagaagtca tgctgtgtgg
ccaattataa 3541 tttttttcaa agactttgtc acaaaactgt ctatattaga
cattttggag ggaccaggaa 3601 atgtaagaca ccaaatcctc catctcttca
gtgtgcctga tgtcacctca tgatttgctg 3661 ttactttttt aactcctgcg
ccaaggacag tgggttctgt gtccaccttt gtgctttgcg 3721 aggccgagcc
caggcatctg ctcgcctgcc acggctgacc agagaaggtg cttcaggagc 3781
tctgccttag acgacgtgtt acagtatgaa cacacagcag aggcaccctc gtatgttttg
3841 aaagttgcct tctgaaaggg cacagtttta aggaaaagaa aaagaatgta
aaactatact 3901 gacccgtttt cagttttaaa gggtcgtgag aaactggctg
gtccaatggg atttacagca 3961 acattttcca ttgctgaagt gaggtagcag
ctctcttctg tcagctgaat gttaaggatg 4021 gggaaaaaga atgcctttaa
gtttgctctt aatcgtatgg aagcttgagc tatgtgttgg 4081 aagtgccctg
gttttaatcc atacacaaag acggtacata atcctacagg tttaaatgta 4141
cataaaaata tagtttggaa ttctttgctc tactgtttac attgcagatt gctataattt
4201 caaggagtga gattataaat aaaatgatgc actttaggat gtttcctatt
tttgaaatct 4261 gaacatgaat cattcacatg accaaaaatt gtgttttttt
aaaaatacat gtctagtctg 4321 tcctttaata gctctcttaa ataagctatg
atattaatca gatcattacc agttagcttt 4381 taaagcacat ttgtttaaga
ctatgttttt ggaaaaatac gctacagaat ttttttttaa 4441 gctacaaata
aatgagatgc tactaattgt tttggaatct gttgtttctg ccaaaggtaa 4501
attaactaaa gatttattca ggaatcccca tttgaatttg tatgattcaa taaaagaaaa
4561 caccaagtaa gttatataaa ataaattgtg tatgagatgt tgtgttttcc
tttgtaattt 4621 ccactaacta actaactaac ttatattctt catggaatgg
agcccagaag aaatgagagg 4681 aagccctttt cacactagat cttatttgaa
gaaatgtttg ttagtcagtc agtcagtggt 4741 ttctggctct gccgagggag
atgtgttccc cagcaaccat ttctgcagcc cagaatctca 4801 aggcactaga
ggcggtgtct taattaattg gcttcacaaa gacaaaatgc tctggactgg 4861
gatttttcct ttgctgtgtt gggaatatgt gtttattaat tagcacatgc caacaaaata
4921 aatgtcaaga gttatttcat aagtgtaagt aaacttaaga attaaagagt
gcagacttat 4981 aattttca
[0251] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001252507.1, transcript variant 3):
TABLE-US-00024 (SEQ ID NO: 39) 1 maceimplqs ahvpqvsnvs atgellerti
rsaveqhlfd vnnsggqsse dsesgtlsas 61 satsarqrrr qskeqdevrh
grdkglinke ntpsgfnhld dcilntqeve kvhkntfgca 121 gerskpkrqk
sstklselhd nqdglvnmes lnstrshert gpddfewmsd erkgnekdgg 181
htqhfesptm kiqehpslsd tkqqrnqdag dqeesfvsev pqsdltalcd eknweepipa
241 fsswqrensd sdeahlspqa grlirqllde dsdpmlsprf yaygqsrqyl
ddtevppspp 301 nshsfmrrrs sslgsyddeq edltpaqltr riqslkkkir
kfedrfeeek kyrpshsdka 361 anpevlkwtn dlakfrrqlk esklkiseed
ltprmrqrsn tlpksfgsql ekedekkqel 421 vdkaikpsve atlesiqrkl
qekraessrp edikdmtkdq ianekvalqk allyyesihg 481 rpvtknerqv
mkplydryrl vkqilsrant ipiigspssk rrspllqpii egetasffke 541
ikeeeegsed dsnvkpdfmv tlktdfsarc fldqfeddad gfispmddki pskcsqdtgl
601 snlhaasipe llehlqemre ekkrirkklr dfednffrqn grnvqkedrt
pmaeeyseyk 661 hikaklrlle vliskrdtds ksm
[0252] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001265579.1, transcript variant 4):
TABLE-US-00025 (SEQ ID NO: 40) 1 attgaggagc agaaggagta gggtgcgggg
gaggaggagg agcgccttta gtgctgcagc 61 agctgctgct ctgattggcc
cggtggttca gctgcttccc tggaacaaaa ggtcaaagtg 121 gactgcagtg
taaatgtaga gaagcagccg ataaaatagc attgcctgaa gaagtttgga 181
ggctgagagc agcagtagac tggccaactg cagagcaagt tgtttctcca gccgtgcggt
241 gcagcctcat gcccccaacc cagcttagcc actgtaagaa gacgttcact
gtacagacga 301 ccaaacttgc cgtggaagag acagttgtga gattcccttg
caaatttaca tacgagaatg 361 gcttgtgaaa tcatgcctct gcaaagttca
caggaagatg aaagacctct gtcacctttc 421 tatttgagtg ctcatgtacc
ccaagtcagc aatgtgtctg caaccggaga actcttagaa 481 agaaccatcc
gatcagctgt agaacaacat ctttttgatg ttaataactc tggaggtcaa 541
agttcagagg actcagaatc tggaacacta tcagcatctt ctgccacatc tgccagacag
601 cgccgccgcc agtccaagga gcaggatgaa gttcgacatg ggagagacaa
gggacttatc 661 aacaaagaaa atactccttc tgggttcaac caccttgatg
attgtatttt gaatactcag 721 gaagtcgaaa aggtacacaa aaatactttt
ggttgtgctg gagaaaggag caagcctaaa 781 cgtcagaaat ccagtactaa
actttctgag cttcatgaca atcaggacgg tcttgtgaat 841 atggaaagtc
tcaattccac acgatctcat gagagaactg gacctgatga ttttgaatgg 901
atgtctgatg aaaggaaagg aaatgaaaaa gatggtggac acactcagca ttttgagagc
961 cccacaatga agatccagga gcatcccagc ctatctgaca ccaaacagca
gagaaatcaa 1021 gatgccggtg accaggagga gagctttgtc tccgaagtgc
cccagtcgga cctgactgca 1081 ttgtgtgatg aaaagaactg ggaagagcct
atccctgctt tctcctcctg gcagcgggag 1141 aacagtgact ctgatgaagc
ccacctctcg ccgcaggctg ggcgcctgat ccgtcagctg 1201 ctggacgaag
acagcgaccc catgctctct cctcggttct acgcttatgg gcagagcagg 1261
caatacctgg atgacacaga agtgcctcct tccccaccaa actcccattc tttcatgagg
1321 cggcgaagct cctctctggg gtcctatgat gatgagcaag aggacctgac
acctgcccag 1381 ctcacacgaa ggattcagag ccttaaaaag aagatccgga
agtttgaaga tagattcgaa 1441 gaagagaaga agtacagacc ttcccacagt
gacaaagcag ccaatccgga ggttctgaaa 1501 tggacaaatg accttgccaa
attccggaga caacttaaag aatcaaaact aaagatatct 1561 gaagaggacc
taactcccag gatgcggcag cgaagcaaca cactccccaa gagttttggt 1621
tcccaacttg agaaagaaga tgagaagaag caagagctgg tggataaagc aataaagccc
1681 agtgttgaag ccacattgga atctattcag aggaagctcc aggagaagcg
agcggaaagc 1741 agccgccctg aggacattaa ggatatgacc aaagaccaga
ttgctaatga gaaagtggct 1801 ctgcagaaag ctctgttata ttatgaaagc
attcatggac ggccggtaac aaagaacgaa 1861 cggcaggtga tgaagccact
atacgacagg taccggctgg tcaaacagat cctctcccga 1921 gctaacacca
tacccatcat tgaagaagag gaggggtcag aagacgatag caatgtgaag 1981
ccagacttca tggtcactct gaaaaccgat ttcagtgcac gatgctttct ggaccaattc
2041 gaagatgacg ctgatggatt tatttcccca atggatgata aaataccatc
aaaatgcagc 2101 caggacacag ggctttcaaa tctccatgct gcctcaatac
ctgaactcct ggaacacctc 2161 caggaaatga gagaagaaaa gaaaaggatt
cgaaagaaac ttcgggattt tgaagacaac 2221 tttttcagac agaatggaag
aaatgtccag aaggaagacc gcactcctat ggctgaagaa 2281 tacagtgaat
ataagcacat aaaggcgaaa ctgaggctcc tggaggtgct catcagcaag 2341
agagacactg attccaagtc catgtgaggg gcatggccaa gcacaggggg ctggcagctg
2401 cggtgagagt ttactgtccc cagagaaagt gcagctctgg aaggcagcct
tggggctggc 2461 cctgcaaagc atgcagccct tctgcctcta gaccatttgg
catcggctcc tgtttccatt 2521 gcctgcctta gaaactggct ggaagaagac
aatgtgacct gacttaggca ttttgtaatt 2581 ggaaagtcaa gactgcagta
tgtgcacatg cgcacgcgca tgcacgcaca cacacacaca 2641 gtagtggagc
tttcctaaca ctagcagaga ttaatcacta cattagacaa cactcatcta 2701
cagagaatat acactgttct tccctggata actgagaaac aagagaccat tctctgtcta
2761 actgtgataa aaacaagctc aggactttat tctatagagc aaacttgctg
tggagggcca 2821 tgctctcctt ggacccagtt aactgcaaac gtgcattgga
gccctatttg ctgccgctgc 2881 cattctagtg acctttccac agagctgcgc
cttcctcacg tgtgtgaaag gttttcccct 2941 tcagccctca ggtagatgga
agctgcatct gcccacgatg gcagtgcagt catcatcttc 3001 aggatgtttc
ttcaggactt cctcagctga caaggaattt tggtccctgc ctaggaccgg 3061
gtcatctgca gaggacagag agatggtaag cagctgtatg aatgctgatt ttaaaaccag
3121 gtcatgggag aagagcctgg agattctttc ctgaacactg actgcactta
ccagtctgat 3181 tttatcgtca aacaccaagc caggctagca tgctcatggc
aatctgtttg gggctgtttt 3241 gttgtggcac tagccaaaca taaaggggct
taagtcagcc tgcatacaga ggatcgggga 3301 gagaaggggc ctgtgttctc
agcctcctga gtacttacca gagtttaatt tttttaaaaa 3361 aaatctgcac
taaaatcccc aaactgacag gtaaatgtag ccctcagagc tcagcccaag 3421
gcagaatcta aatcacacta ttttcgagat catgtataaa aagaaaaaaa agaagtcatg
3481 ctgtgtggcc aattataatt tttttcaaag actttgtcac aaaactgtct
atattagaca 3541 ttttggaggg accaggaaat gtaagacacc aaatcctcca
tctcttcagt gtgcctgatg 3601 tcacctcatg atttgctgtt acttttttaa
ctcctgcgcc aaggacagtg ggttctgtgt 3661 ccacctttgt gctttgcgag
gccgagccca ggcatctgct cgcctgccac ggctgaccag 3721 agaaggtgct
tcaggagctc tgccttagac gacgtgttac agtatgaaca cacagcagag 3781
gcaccctcgt atgttttgaa agttgccttc tgaaagggca cagttttaag gaaaagaaaa
3841 agaatgtaaa actatactga cccgttttca gttttaaagg gtcgtgagaa
actggctggt 3901 ccaatgggat ttacagcaac attttccatt gctgaagtga
ggtagcagct ctcttctgtc 3961 agctgaatgt taaggatggg gaaaaagaat
gcctttaagt ttgctcttaa tcgtatggaa 4021 gcttgagcta tgtgttggaa
gtgccctggt tttaatccat acacaaagac ggtacataat 4081 cctacaggtt
taaatgtaca taaaaatata gtttggaatt ctttgctcta ctgtttacat 4141
tgcagattgc tataatttca aggagtgaga ttataaataa aatgatgcac tttaggatgt
4201 ttcctatttt tgaaatctga acatgaatca ttcacatgac caaaaattgt
gtttttttaa 4261 aaatacatgt ctagtctgtc ctttaatagc tctcttaaat
aagctatgat attaatcaga 4321 tcattaccag ttagctttta aagcacattt
gtttaagact atgtttttgg aaaaatacgc 4381 tacagaattt ttttttaagc
tacaaataaa tgagatgcta ctaattgttt tggaatctgt 4441 tgtttctgcc
aaaggtaaat taactaaaga tttattcagg aatccccatt tgaatttgta 4501
tgattcaata aaagaaaaca ccaagtaagt tatataaaat aaattgtgta tgagatgttg
4561 tgttttcctt tgtaatttcc actaactaac taactaactt atattcttca
tggaatggag 4621 cccagaagaa atgagaggaa gcccttttca cactagatct
tatttgaaga aatgtttgtt 4681 agtcagtcag tcagtggttt ctggctctgc
cgagggagat gtgttcccca gcaaccattt 4741 ctgcagccca gaatctcaag
gcactagagg cggtgtctta attaattggc ttcacaaaga 4801 caaaatgctc
tggactggga tttttccttt gctgtgttgg gaatatgtgt ttattaatta 4861
gcacatgcca acaaaataaa tgtcaagagt tatttcataa gtgtaagtaa acttaagaat
4921 taaagagtgc agacttataa ttttca
[0253] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001252508.1, transcript variant 4):
TABLE-US-00026 (SEQ ID NO: 41) 1 maceimplqs sqederplsp fylsahvpqv
snvsatgell ertirsaveq hlfdvnnsgg 61 qssedsesgt lsassatsar
qrrrqskeqd evrhgrdkgl inkentpsgf nhlddcilnt 121 qevekvhknt
fgcagerskp krqksstkls elhdnqdglv nmeslnstrs hertgpddfe 181
wmsderkgne kdgghtqhfe sptmkiqehp slsdtkqqrn qdagdqeesf vsevpqsdlt
241 alcdeknwee pipafsswqr ensdsdeahl spqagrlirq lldedsdpml
sprfyaygqs 301 rqylddtevp psppnshsfm rrrssslgsy ddeqedltpa
qltrriqslk kkirkfedrf 361 eeekkyrpsh sdkaanpevl kwtndlakfr
rqlkesklki seedltprmr qrsntlpksf 421 gsqlekedek kqelvdkaik
psveatlesi qrklqekrae ssrpedikdm tkdqianekv 481 alqkallyye
sihgrpvtkn erqvmkplyd ryrlvkqils rantipiiee eegseddsnv 541
kpdfmvtlkt dfsarcfldq feddadgfis pmddkipskc sqdtglsnlh aasipelleh
601 lqemreekkr irkklrdfed nffrqngrnv qkedrtpmae eyseykhika
klrllevlis 661 krdtdsksm
[0254] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001265580.1, transcript variant 5):
TABLE-US-00027 (SEQ ID NO: 42) 1 attgaggagc agaaggagta gggtgcgggg
gaggaggagg agcgccttta gtgctgcagc 61 agctgctgct ctgattggcc
cggtggttca gctgcttccc tggaacaaaa ggtcaaagtg 121 gactgcagtg
taaatgtaga gaagcagccg ataaaatagc attgcctgaa gaagtttgga 181
ggctgagagc agcagtagac tggccaactg cagagcaagt tgtttctcca gccgtgcggt
241 gcagcctcat gcccccaacc cagcttagcc actgtaagaa gacgttcact
gtacagacga 301 ccaaacttgc cgtggaagag acagttgtga gattcccttg
caaatttaca tacgagaatg 361 gcttgtgaaa tcatgcctct gcaaagactc
ttagaaagaa ccatccgatc agctgtagaa 421 caacatcttt ttgatgttaa
taactctgga ggtcaaagtt cagaggactc agaatctgga 481 acactatcag
catcttctgc cacatctgcc agacagcgcc gccgccagtc caaggagcag 541
gatgaagttc gacatgggag agacaaggga cttatcaaca aagaaaatac tccttctggg
601 ttcaaccacc ttgatgattg tattttgaat actcaggaag tcgaaaaggt
acacaaaaat 661 acttttggtt gtgctggaga aaggagcaag cctaaacgtc
agaaatccag tactaaactt 721 tctgagcttc atgacaatca ggacggtctt
gtgaatatgg aaagtctcaa ttccacacga 781 tctcatgaga gaactggacc
tgatgatttt gaatggatgt ctgatgaaag gaaaggaaat 841 gaaaaagatg
gtggacacac tcagcatttt gagagcccca caatgaagat ccaggagcat 901
cccagcctat ctgacaccaa acagcagaga aatcaagatg ccggtgacca ggaggagagc
961 tttgtctccg aagtgcccca gtcggacctg actgcattgt gtgatgaaaa
gaactgggaa 1021 gagcctatcc ctgctttctc ctcctggcag cgggagaaca
gtgactctga tgaagcccac 1081 ctctcgccgc aggctgggcg cctgatccgt
cagctgctgg acgaagacag cgaccccatg 1141 ctctctcctc ggttctacgc
ttatgggcag agcaggcaat acctggatga cacagaagtg 1201 cctccttccc
caccaaactc ccattctttc atgaggcggc gaagctcctc tctggggtcc 1261
tatgatgatg agcaagagga cctgacacct gcccagctca cacgaaggat tcagagcctt
1321 aaaaagaaga tccggaagtt tgaagataga ttcgaagaag agaagaagta
cagaccttcc 1381 cacagtgaca aagcagccaa tccggaggtt ctgaaatgga
caaatgacct tgccaaattc 1441 cggagacaac ttaaagaatc aaaactaaag
atatctgaag aggacctaac tcccaggatg 1501 cggcagcgaa gcaacacact
ccccaagagt tttggttccc aacttgagaa agaagatgag 1561 aagaagcaag
agctggtgga taaagcaata aagcccagtg ttgaagccac attggaatct 1621
attcagagga agctccagga gaagcgagcg gaaagcagcc gccctgagga cattaaggat
1681 atgaccaaag accagattgc taatgagaaa gtggctctgc agaaagctct
gttatattat 1741 gaaagcattc atggacggcc ggtaacaaag aacgaacggc
aggtgatgaa gccactatac 1801 gacaggtacc ggctggtcaa acagatcctc
tcccgagcta acaccatacc catcattggt 1861 tccccctcca gcaagcggag
aagccctttg ctgcagccaa ttatcgaggg cgaaactgct 1921 tccttcttca
aggagataaa ggaagaagag gaggggtcag aagacgatag caatgtgaag 1981
ccagacttca tggtcactct gaaaaccgat ttcagtgcac gatgctttct ggaccaattc
2041 gaagatgacg ctgatggatt tatttcccca atggatgata aaataccatc
aaaatgcagc 2101 caggacacag ggctttcaaa tctccatgct gcctcaatac
ctgaactcct ggaacacctc 2161 caggaaatga gagaagaaaa gaaaaggatt
cgaaagaaac ttcgggattt tgaagacaac 2221 tttttcagac agaatggaag
aaatgtccag aaggaagacc gcactcctat ggctgaagaa 2281 tacagtgaat
ataagcacat aaaggcgaaa ctgaggctcc tggaggtgct catcagcaag 2341
agagacactg attccaagtc catgtgaggg gcatggccaa gcacaggggg ctggcagctg
2401 cggtgagagt ttactgtccc cagagaaagt gcagctctgg aaggcagcct
tggggctggc 2461 cctgcaaagc atgcagccct tctgcctcta gaccatttgg
catcggctcc tgtttccatt 2521 gcctgcctta gaaactggct ggaagaagac
aatgtgacct gacttaggca ttttgtaatt 2581 ggaaagtcaa gactgcagta
tgtgcacatg cgcacgcgca tgcacgcaca cacacacaca 2641 gtagtggagc
tttcctaaca ctagcagaga ttaatcacta cattagacaa cactcatcta 2701
cagagaatat acactgttct tccctggata actgagaaac aagagaccat tctctgtcta
2761 actgtgataa aaacaagctc aggactttat tctatagagc aaacttgctg
tggagggcca 2821 tgctctcctt ggacccagtt aactgcaaac gtgcattgga
gccctatttg ctgccgctgc 2881 cattctagtg acctttccac agagctgcgc
cttcctcacg tgtgtgaaag gttttcccct 2941 tcagccctca ggtagatgga
agctgcatct gcccacgatg gcagtgcagt catcatcttc 3001 aggatgtttc
ttcaggactt cctcagctga caaggaattt tggtccctgc ctaggaccgg 3061
gtcatctgca gaggacagag agatggtaag cagctgtatg aatgctgatt ttaaaaccag
3121 gtcatgggag aagagcctgg agattctttc ctgaacactg actgcactta
ccagtctgat 3181 tttatcgtca aacaccaagc caggctagca tgctcatggc
aatctgtttg gggctgtttt 3241 gttgtggcac tagccaaaca taaaggggct
taagtcagcc tgcatacaga ggatcgggga 3301 gagaaggggc ctgtgttctc
agcctcctga gtacttacca gagtttaatt tttttaaaaa 3361 aaatctgcac
taaaatcccc aaactgacag gtaaatgtag ccctcagagc tcagcccaag 3421
gcagaatcta aatcacacta ttttcgagat catgtataaa aagaaaaaaa agaagtcatg
3481 ctgtgtggcc aattataatt tttttcaaag actttgtcac aaaactgtct
atattagaca 3541 ttttggaggg accaggaaat gtaagacacc aaatcctcca
tctcttcagt gtgcctgatg 3601 tcacctcatg atttgctgtt acttttttaa
ctcctgcgcc aaggacagtg ggttctgtgt 3661 ccacctttgt gctttgcgag
gccgagccca ggcatctgct cgcctgccac ggctgaccag 3721 agaaggtgct
tcaggagctc tgccttagac gacgtgttac agtatgaaca cacagcagag 3781
gcaccctcgt atgttttgaa agttgccttc tgaaagggca cagttttaag gaaaagaaaa
3841 agaatgtaaa actatactga cccgttttca gttttaaagg gtcgtgagaa
actggctggt 3901 ccaatgggat ttacagcaac attttccatt gctgaagtga
ggtagcagct ctcttctgtc 3961 agctgaatgt taaggatggg gaaaaagaat
gcctttaagt ttgctcttaa tcgtatggaa 4021 gcttgagcta tgtgttggaa
gtgccctggt tttaatccat acacaaagac ggtacataat 4081 cctacaggtt
taaatgtaca taaaaatata gtttggaatt ctttgctcta ctgtttacat 4141
tgcagattgc tataatttca aggagtgaga ttataaataa aatgatgcac tttaggatgt
4201 ttcctatttt tgaaatctga acatgaatca ttcacatgac caaaaattgt
gtttttttaa 4261 aaatacatgt ctagtctgtc ctttaatagc tctcttaaat
aagctatgat attaatcaga 4321 tcattaccag ttagctttta aagcacattt
gtttaagact atgtttttgg aaaaatacgc 4381 tacagaattt ttttttaagc
tacaaataaa tgagatgcta ctaattgttt tggaatctgt 4441 tgtttctgcc
aaaggtaaat taactaaaga tttattcagg aatccccatt tgaatttgta 4501
tgattcaata aaagaaaaca ccaagtaagt tatataaaat aaattgtgta tgagatgttg
4561 tgttttcctt tgtaatttcc actaactaac taactaactt atattcttca
tggaatggag 4621 cccagaagaa atgagaggaa gcccttttca cactagatct
tatttgaaga aatgtttgtt 4681 agtcagtcag tcagtggttt ctggctctgc
cgagggagat gtgttcccca gcaaccattt 4741 ctgcagccca gaatctcaag
gcactagagg cggtgtctta attaattggc ttcacaaaga 4801 caaaatgctc
tggactggga tttttccttt gctgtgttgg gaatatgtgt ttattaatta 4861
gcacatgcca acaaaataaa tgtcaagagt tatttcataa gtgtaagtaa acttaagaat
4921 taaagagtgc agacttataa ttttca
[0255] In some embodiments of the methods of the disclosure, the
wild type human FAM13A gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001252509.1, transcript variant 5):
TABLE-US-00028 (SEQ ID NO: 43) 1 maceimplqr llertirsav eqhlfdvnns
ggqssedses gtlsassats arqrrrqske 61 qdevrhgrdk glinkentps
gfnhlddcil ntqevekvhk ntfgcagers kpkrqksstk 121 lselhdnqdg
lvnmeslnst rshertgpdd fewmsderkg nekdgghtqh fesptmkiqe 181
hpslsdtkqq rnqdagdqee sfvsevpqsd ltalcdeknw eepipafssw qrensdsdea
241 hlspqagrli rqlldedsdp mlsprfyayg qsrqylddte vppsppnshs
fmrrrssslg 301 syddeqedlt paqltrriqs lkkkirkfed rfeeekkyrp
shsdkaanpe vlkwtndlak 361 frrqlkeskl kiseedltpr mrqrsntlpk
sfgsqleked ekkqelvdka ikpsveatle 421 siqrklqekr aessrpedik
dmtkdqiane kvalqkally yesihgrpvt knerqvmkpl 481 ydryrlvkqi
lsrantipii gspsskrrsp llqpiieget asffkeikee eegseddsnv 541
kpdfmvtlkt dfsarcfldq feddadgfis pmddkipskc sqdtglsnlh aasipelleh
601 lqemreekkr irkklrdfed nffrqngrnv qkedrtpmae eyseykhika
klrllevlis 661 krdtdsksm
[0256] In some embodiments of the methods of the disclosure, the
wild type human DSP gene of the disclosure consists of or comprises
the nucleic acid sequence (Genbank Accession number: NM_004415.3,
transcript variant 1):
TABLE-US-00029 (SEQ ID NO: 44) 1 aagaaaccgg ccaggtgtgg cctaggcgcc
cagtgccagc ggggaggaga ctcgctccgc 61 cgccgaccaa caccaacacc
cagctccgac gcagctcctc tgcgcccttg ccgccctccg 121 agccacagct
ttcctcccgc tcctgccccc ggcccgtcgc cgtctccgcg ctcgcagcgg 181
cctcgggagg gcccaggtag cgagcagcga cctcgcgagc cttccgcact cccgcccggt
241 tccccggccg tccgcctatc cttggccccc tccgctttct ccgcgccggc
ccgcctcgct 301 tatgcctcgg cgctgagccg ctctcccgat tgcccgccga
catgagctgc aacggaggct 361 cccacccgcg gatcaacact ctgggccgca
tgatccgcgc cgagtctggc ccggacctgc 421 gctacgaggt gaccagcggc
ggcgggggca ccagcaggat gtactattct cggcgcggcg 481 tgatcaccga
ccagaactcg gacggctact gtcaaaccgg cacgatgtcc aggcaccaga 541
accagaacac catccaggag ctgctgcaga actgctccga ctgcttgatg cgagcagagc
601 tcatcgtgca gcctgaattg aagtatggag atggaataca actgactcgg
agtcgagaat 661 tggatgagtg ttttgcccag gccaatgacc aaatggaaat
cctcgacagc ttgatcagag 721 agatgcggca gatgggccag ccctgtgatg
cttaccagaa aaggcttctt cagctccaag 781 agcaaatgcg agccctttat
aaagccatca gtgtccctcg agtccgcagg gccagctcca 841 agggtggtgg
aggctacact tgtcagagtg gctctggctg ggatgagttc accaaacatg 901
tcaccagtga atgtttgggg tggatgaggc agcaaagggc ggagatggac atggtggcct
961 ggggtgtgga cctggcctca gtggagcagc acattaacag ccaccggggc
atccacaact 1021 ccatcggcga ctatcgctgg cagctggaca aaatcaaagc
cgacctgcgc gagaaatctg 1081 cgatctacca gttggaggag gagtatgaaa
acctgctgaa agcgtccttt gagaggatgg 1141 atcacctgcg acagctgcag
aacatcattc aggccacgtc cagggagatc atgtggatca 1201 atgactgcga
ggaggaggag ctgctgtacg actggagcga caagaacacc aacatcgctc 1261
agaaacagga ggccttctcc atacgcatga gtcaactgga agttaaagaa aaagagctca
1321 ataagctgaa acaagaaagt gaccaacttg tcctcaatca gcatccagct
tcagacaaaa 1381 ttgaggccta tatggacact ctgcagacgc agtggagttg
gattcttcag atcaccaagt 1441 gcattgatgt tcatctgaaa gaaaatgctg
cctactttca gttttttgaa gaggcgcagt 1501 ctactgaagc atacctgaag
gggctccagg actccatcag gaagaagtac ccctgcgaca 1561 agaacatgcc
cctgcagcac ctgctggaac agatcaagga gctggagaaa gaacgagaga 1621
aaatccttga atacaagcgt caggtgcaga acttggtaaa caagtctaag aagattgtac
1681 agctgaagcc tcgtaaccca gactacagaa gcaataaacc cattattctc
agagctctct 1741 gtgactacaa acaagatcag aaaatcgtgc ataaggggga
tgagtgtatc ctgaaggaca 1801 acaacgagcg cagcaagtgg tacgtgacgg
gcccgggagg cgttgacatg cttgttccct 1861 ctgtggggct gatcatccct
cctccgaacc cactggccgt ggacctctct tgcaagattg 1921 agcagtacta
cgaagccatc ttggctctgt ggaaccagct ctacatcaac atgaagagcc 1981
tggtgtcctg gcactactgc atgattgaca tagagaagat cagggccatg acaatcgcca
2041 agctgaaaac aatgcggcag gaagattaca tgaagacgat agccgacctt
gagttacatt 2101 accaagagtt catcagaaat agccaaggct cagagatgtt
tggagatgat gacaagcgga 2161 aaatacagtc tcagttcacc gatgcccaga
agcattacca gaccctggtc attcagctcc 2221 ctggctatcc ccagcaccag
acagtgacca caactgaaat cactcatcat ggaacctgcc 2281 aagatgtcaa
ccataataaa gtaattgaaa ccaacagaga aaatgacaag caagaaacat 2341
ggatgctgat ggagctgcag aagattcgca ggcagataga gcactgcgag ggcaggatga
2401 ctctcaaaaa cctccctcta gcagaccagg gatcttctca ccacatcaca
gtgaaaatta 2461 acgagcttaa gagtgtgcag aatgattcac aagcaattgc
tgaggttctc aaccagctta 2521 aagatatgct tgccaacttc agaggttctg
aaaagtactg ctatttacag aatgaagtat 2581 ttggactatt tcagaaactg
gaaaatatca atggtgttac agatggctac ttaaatagct 2641 tatgcacagt
aagggcactg ctccaggcta ttctccaaac agaagacatg ttaaaggttt 2701
atgaagccag gctcactgag gaggaaactg tctgcctgga cctggataaa gtggaagctt
2761 accgctgtgg actgaagaaa ataaaaaatg acttgaactt gaagaagtcg
ttgttggcca 2821 ctatgaagac agaactacag aaagcccagc agatccactc
tcagacttca cagcagtatc 2881 cactttatga tctggacttg ggcaagttcg
gtgaaaaagt cacacagctg acagaccgct 2941 ggcaaaggat agataaacag
atcgacttta ggttatggga cctggagaaa caaatcaagc 3001 aattgaggaa
ttatcgtgat aactatcagg ctttctgcaa gtggctctat gatgctaaac 3061
gccgccagga ttccttagaa tccatgaaat ttggagattc caacacagtc atgcggtttt
3121 tgaatgagca gaagaacttg cacagtgaaa tatctggcaa acgagacaaa
tcagaggaag 3181 tacaaaaaat tgctgaactt tgcgccaatt caattaagga
ttatgagctc cagctggcct 3241 catacacctc aggactggaa actctgctga
acatacctat caagaggacc atgattcagt 3301 ccccttctgg ggtgattctg
caagaggctg cagatgttca tgctcggtac attgaactac 3361 ttacaagatc
tggagactat tacaggttct taagtgagat gctgaagagt ttggaagatc 3421
tgaagctgaa aaataccaag atcgaagttt tggaagagga gctcagactg gcccgagatg
3481 ccaactcgga aaactgtaat aagaacaaat tcctggatca gaacctgcag
aaataccagg 3541 cagagtgttc ccagttcaaa gcgaagcttg cgagcctgga
ggagctgaag agacaggctg 3601 agctggatgg gaagtcggct aagcaaaatc
tagacaagtg ctacggccaa ataaaagaac 3661 tcaatgagaa gatcacccga
ctgacttatg agattgaaga tgaaaagaga agaagaaaat 3721 ctgtggaaga
cagatttgac caacagaaga atgactatga ccaactgcag aaagcaaggc 3781
aatgtgaaaa ggagaacctt ggttggcaga aattagagtc tgagaaagcc atcaaggaga
3841 aggagtacga gattgaaagg ttgagggttc tactgcagga agaaggcacc
cggaagagag 3901 aatatgaaaa tgagctggca aaggtaagaa accactataa
tgaggagatg agtaatttaa 3961 ggaacaagta tgaaacagag attaacatta
cgaagaccac catcaaggag atatccatgc 4021 aaaaagagga tgattccaaa
aatcttagaa accagcttga tagactttca agggaaaatc 4081 gagatctgaa
ggatgaaatt gtcaggctca atgacagcat cttgcaggcc actgagcagc 4141
gaaggcgagc tgaagaaaac gcccttcagc aaaaggcctg tggctctgag ataatgcaga
4201 agaagcagca tctggagata gaactgaagc aggtcatgca gcagcgctct
gaggacaatg 4261 cccggcacaa gcagtccctg gaggaggctg ccaagaccat
tcaggacaaa aataaggaga 4321 tcgagagact caaagctgag tttcaggagg
aggccaagcg ccgctgggaa tatgaaaatg 4381 aactgagtaa ggtaagaaac
aattatgatg aggagatcat tagcttaaaa aatcagtttg 4441 agaccgagat
caacatcacc aagaccacca tccaccagct caccatgcag aaggaagagg 4501
ataccagtgg ctaccgggct cagatagaca atctcacccg agaaaacagg agcttatctg
4561 aagaaataaa gaggctgaag aacactctaa cccagaccac agagaatctc
aggagggtgg 4621 aagaagacat ccaacagcaa aaggccactg gctctgaggt
gtctcagagg aaacagcagc 4681 tggaggttga gctgagacaa gtcactcaga
tgcgaacaga ggagagcgta agatataagc 4741 aatctcttga tgatgctgcc
aaaaccatcc aggataaaaa caaggagata gaaaggttaa 4801 aacaactgat
cgacaaagaa acaaatgacc ggaaatgcct ggaagatgaa aacgcgagat 4861
tacaaagggt ccagtatgac ctgcagaaag caaacagtag tgcgacggag acaataaaca
4921 aactgaaggt tcaggagcaa gaactgacac gcctgaggat cgactatgaa
agggtttccc 4981 aggagaggac tgtgaaggac caggatatca cgcggttcca
gaactctctg aaagagctgc 5041 agctgcagaa gcagaaggtg gaagaggagc
tgaatcggct gaagaggacc gcgtcagaag 5101 actcctgcaa gaggaagaag
ctggaggaag agctggaagg catgaggagg tcgctgaagg 5161 agcaagccat
caaaatcacc aacctgaccc agcagctgga gcaggcatcc attgttaaga 5221
agaggagtga ggatgacctc cggcagcaga gggacgtgct ggatggccac ctgagggaaa
5281 agcagaggac ccaggaagag ctgaggaggc tctcttctga ggtcgaggcc
ctgaggcggc 5341 agttactcca ggaacaggaa agtgtcaaac aagctcactt
gaggaatgag catttccaga 5401 aggcgataga agataaaagc agaagcttaa
atgaaagcaa aatagaaatt gagaggctgc 5461 agtctctcac agagaacctg
accaaggagc acttgatgtt agaagaagaa ctgcggaacc 5521 tgaggctgga
gtacgatgac ctgaggagag gacgaagcga agcggacagt gataaaaatg 5581
caaccatctt ggaactaagg agccagctgc agatcagcaa caaccggacc ctggaactgc
5641 aggggctgat taatgattta cagagagaga gggaaaattt gagacaggaa
attgagaaat 5701 tccaaaagca ggctttagag gcatctaata ggattcagga
atcaaagaat cagtgtactc 5761 aggtggtaca ggaaagagag agccttctgg
tgaaaatcaa agtcctggag caagacaagg 5821 caaggctgca gaggctggag
gatgagctga atcgtgcaaa atcaactcta gaggcagaaa 5881 ccagggtgaa
acagcgcctg gagtgtgaga aacagcaaat tcagaatgac ctgaatcagt 5941
ggaagactca atattcccgc aaggaggagg ctattaggaa gatagaatcg gaaagagaaa
6001 agagtgagag agagaagaac agtcttagga gtgagatcga aagactccaa
gcagagatca 6061 agagaattga agagaggtgc aggcgtaagc tggaggattc
taccagggag acacagtcac 6121 agttagaaac agaacgctcc cgatatcaga
gggagattga taaactcaga cagcgcccat 6181 atgggtccca tcgagagacc
cagactgagt gtgagtggac cgttgacacc tccaagctgg 6241 tgtttgatgg
gctgaggaag aaggtgacag caatgcagct ctatgagtgt cagctgatcg 6301
acaaaacaac cttggacaaa ctattgaagg ggaagaagtc agtggaagaa gttgcttctg
6361 aaatccagcc attccttcgg ggtgcaggat ctatcgctgg agcatctgct
tctcctaagg 6421 aaaaatactc tttggtagag gccaagagaa agaaattaat
cagcccagaa tccacagtca 6481 tgcttctgga ggcccaggca gctacaggtg
gtataattga tccccatcgg aatgagaagc 6541 tgactgtcga cagtgccata
gctcgggacc tcattgactt cgatgaccgt cagcagatat 6601 atgcagcaga
aaaagctatc actggttttg atgatccatt ttcaggcaag acagtatctg 6661
tttcagaagc catcaagaaa aatttgattg atagagaaac cggaatgcgc ctgctggaag
6721 cccagattgc ttcagggggt gtagtagacc ctgtgaacag tgtctttttg
ccaaaagatg 6781 tcgccttggc ccgggggctg attgatagag atttgtatcg
atccctgaat gatccccgag 6841 atagtcagaa aaactttgtg gatccagtca
ccaaaaagaa ggtcagttac gtgcagctga 6901 aggaacggtg cagaatcgaa
ccacatactg gtctgctctt gctttcagta cagaagagaa 6961 gcatgtcctt
ccaaggaatc agacaacctg tgaccgtcac tgagctagta gattctggta 7021
tattgagacc gtccactgtc aatgaactgg aatctggtca gatttcttat gacgaggttg
7081 gtgagagaat taaggacttc ctccagggtt caagctgcat agcaggcata
tacaatgaga 7141 ccacaaaaca gaagcttggc atttatgagg ccatgaaaat
tggcttagtc cgacctggta 7201 ctgctctgga gttgctggaa gcccaagcag
ctactggctt tatagtggat cctgttagca 7261 acttgaggtt accagtggag
gaagcctaca agagaggtct ggtgggcatt gagttcaaag 7321 agaagctcct
gtctgcagaa cgagctgtca ctgggtataa tgatcctgaa acaggaaaca 7381
tcatctcttt gttccaagcc atgaataagg aactcatcga aaagggccac ggtattcgct
7441 tattagaagc acagatcgca accgggggga tcattgaccc aaaggagagc
catcgtttac
7501 cagttgacat agcatataag aggggctatt tcaatgagga actcagtgag
attctctcag 7561 atccaagtga tgataccaaa ggattttttg accccaacac
tgaagaaaat cttacctatc 7621 tgcaactaaa agaaagatgc attaaggatg
aggaaacagg gctctgtctt ctgcctctga 7681 aagaaaagaa gaaacaggtg
cagacatcac aaaagaatac cctcaggaag cgtagagtgg 7741 tcatagttga
cccagaaacc aataaagaaa tgtctgttca ggaggcctac aagaagggcc 7801
taattgatta tgaaaccttc aaagaactgt gtgagcagga atgtgaatgg gaagaaataa
7861 ccatcacggg atcagatggc tccaccaggg tggtcctggt agatagaaag
acaggcagtc 7921 agtatgatat tcaagatgct attgacaagg gccttgttga
caggaagttc tttgatcagt 7981 accgatccgg cagcctcagc ctcactcaat
ttgctgacat gatctccttg aaaaatggtg 8041 tcggcaccag cagcagcatg
ggcagtggtg tcagcgatga tgtttttagc agctcccgac 8101 atgaatcagt
aagtaagatt tccaccatat ccagcgtcag gaatttaacc ataaggagca 8161
gctctttttc agacaccctg gaagaatcga gccccattgc agccatcttt gacacagaaa
8221 acctggagaa aatctccatt acagaaggta tagagcgggg catcgttgac
agcatcacgg 8281 gtcagaggct tctggaggct caggcctgca caggtggcat
catccaccca accacgggcc 8341 agaagctgtc acttcaggac gcagtctccc
agggtgtgat tgaccaagac atggccacca 8401 ggctgaagcc tgctcagaaa
gccttcatag gcttcgaggg tgtgaaggga aagaagaaga 8461 tgtcagcagc
agaggcagtg aaagaaaaat ggctcccgta tgaggctggc cagcgcttcc 8521
tggagttcca gtacctcacg ggaggtcttg ttgacccgga agtgcatggg aggataagca
8581 ccgaagaagc catccggaag gggttcatag atggccgcgc cgcacagagg
ctgcaagaca 8641 ccagcagcta tgccaaaatc ctgacctgcc ccaaaaccaa
attaaaaata tcctataagg 8701 atgccataaa tcgctccatg gtagaagata
tcactgggct gcgccttctg gaagccgcct 8761 ccgtgtcgtc caagggctta
cccagccctt acaacatgtc ttcggctccg gggtcccgct 8821 ccggctcccg
ctcgggatct cgctccggat ctcgctccgg gtcccgcagt gggtcccgga 8881
gaggaagctt tgacgccaca gggaattctt cctactctta ttcctactca tttagcagta
8941 gttctattgg gcactagtag tcagttggga gtggttgcta taccttgact
tcatttatat 9001 gaatttccac tttattaaat aatagaaaag aaaatcccgg
tgcttgcagt agagtgatag 9061 gacattctat gcttacagaa aatatagcca
tgattgaaat caaatagtaa aggctgttct 9121 ggctttttat cttcttagct
catcttaaat aagcagtaca cttggatgca gtgcgtctga 9181 agtgctaatc
agttgtaaca atagcacaaa tcgaacttag gatttgtttc ttctcttctg 9241
tgtttcgatt tttgatcaat tctttaattt tggaagccta taatacagtt ttctattctt
9301 ggagataaaa attaaatgga tcactgatat tttagtcatt ctgcttctca
tctaaatatt 9361 tccatattct gtattaggag aaaattaccc tcccagcacc
agcccccctc tcaaaccccc 9421 aacccaaaac caagcatttt ggaatgagtc
tcctttagtt tcagagtgtg gattgtataa 9481 cccatatact cttcgatgta
cttgtttggt ttggtattaa tttgactgtg catgacagcg 9541 gcaatctttt
ctttggtcaa agttttctgt ttattttgct tgtcatattc gatgtacttt 9601
aaggtgtctt tatgaagttt gctattctgg caataaactt ttagactttt gaagtgtttg
9661 tgttttaatt taatatgttt ataagcatgt ataaacattt agcatatttt
tatcataggt 9721 ctaaaaatat ttgtttacta aatacctgtg aagaaatacc
attaaaaaac tatttggttc 9781 tgaattctta ctagaaaaaa aa
[0257] In some embodiments of the methods of the disclosure, the
wild type human DSP gene of the disclosure encodes a protein that
consists of or comprises the amino acid sequence (Genbank Accession
number: NP_004406.2, encoded by transcript variant 1):
TABLE-US-00030 (SEQ ID NO: 45) 1 mscnggshpr intlgrmira esgpdlryev
tsggggtsrm yysrrgvitd qnsdgycqtg 61 tmsrhqnqnt iqellqncsd
clmraelivq pelkygdgiq ltrsreldec faqandqmei 121 ldsliremrq
mgqpcdayqk rllqlqeqmr alykaisvpr vrrasskggg gytcqsgsgw 181
deftkhvtse clgwmrqqra emdmvawgvd lasveqhins hrgihnsigd yrwqldkika
241 dlreksaiyq leeeyenllk asfermdhlr qlqniiqats reimwindce
eeellydwsd 301 kntniaqkqe afsirmsqle vkekelnklk qesdqlvlnq
hpasdkieay mdtlqtqwsw 361 ilqitkcidv hlkenaayfq ffeeaqstea
ylkglqdsir kkypcdknmp lqhlleqike 421 lekerekile ykrqvqnlvn
kskkivqlkp rnpdyrsnkp iilralcdyk qdqkivhkgd 481 ecilkdnner
skwyvtgpgg vdmlvpsvgl iipppnplav dlsckieqyy eailalwnql 541
yinmkslvsw hycmidieki ramtiaklkt mrqedymkti adlelhyqef irnsqgsemf
601 gdddkrkiqs qftdaqkhyq tlviqlpgyp qhqtvtttei thhgtcqdvn
hnkvietnre 661 ndkqetwmlm elqkirrqie hcegrmtlkn lpladqgssh
hitvkinelk svqndsqaia 721 evlnqlkdml anfrgsekyc ylqnevfglf
qkleningvt dgylnslctv rallqailqt 781 edmlkvyear lteeetvcld
ldkveayrcg lkkikndlnl kksllatmkt elqkaqqihs 841 qtsqqyplyd
ldlgkfgekv tqltdrwqri dkqidfrlwd lekqikqlrn yrdnyqafck 901
wlydakrrqd slesmkfgds ntvmrflneq knlhseisgk rdkseevqki aelcansikd
961 yelqlasyts gletllnipi krtmiqspsg vilqeaadvh aryielltrs
gdyyrflsem 1021 lksledlklk ntkievleee lrlardanse ncnknkfldq
nlqkyqaecs qfkaklasle 1081 elkrqaeldg ksakqnldkc ygqikelnek
itrltyeied ekrrrksved rfdqqkndyd 1141 qlqkarqcek enlgwqkles
ekaikekeye ierlrvllqe egtrkreyen elakvrnhyn 1201 eemsnlrnky
eteinitktt ikeismqked dsknlrnqld rlsrenrdlk deivrlndsi 1261
lqateqrrra eenalqqkac gseimqkkqh leielkqvmq qrsednarhk qsleeaakti
1321 qdknkeierl kaefqeeakr rweyenelsk vrnnydeeii slknqfetei
nitkttihql 1381 tmqkeedtsg yraqidnltr enrslseeik rlkntltqtt
enlrrveedi qqqkatgsev 1441 sqrkqqleve lrqvtqmrte esvrykqsld
daaktiqdkn keierlkqli dketndrkcl 1501 edenarlqrv qydlqkanss
atetinklkv qeqeltrlri dyervsqert vkdqditrfq 1561 nslkelqlqk
qkveeelnrl krtasedsck rkkleeeleg mrrslkeqai kitnltqqle 1621
qasivkkrse ddlrqqrdvl dghlrekqrt qeelrrlsse vealrrqllq eqesvkqahl
1681 rnehfqkaie dksrslnesk ieierlqslt enltkehlml eeelrnlrle
yddlrrgrse 1741 adsdknatil elrsqlqisn nrtlelqgli ndlqrerenl
rqeiekfqkq aleasnriqe 1801 sknqctqvvq eresllvkik vleqdkarlq
rledelnrak stleaetrvk qrlecekqqi 1861 qndlnqwktq ysrkeeairk
ieserekser eknslrseie rlqaeikrie ercrrkleds 1921 tretqsqlet
ersryqreid klrqrpygsh retqtecewt vdtsklvfdg lrkkvtamql 1981
yecqlidktt ldkllkgkks veevaseiqp flrgagsiag asaspkekys lveakrkkli
2041 spestvmlle aqaatggiid phrnekltvd saiardlidf ddrqqiyaae
kaitgfddpf 2101 sgktvsvsea ikknlidret gmrlleaqia sggvvdpvns
vflpkdvala rglidrdlyr 2161 slndprdsqk nfvdpvtkkk vsyvqlkerc
riephtglll lsvqkrsmsf qgirqpvtvt 2221 elvdsgilrp stvnelesgq
isydevgeri kdflqgssci agiynettkq klgiyeamki 2281 glvrpgtale
lleaqaatgf ivdpvsnlrl pveeaykrgl vgiefkekll saeravtgyn 2341
dpetgniisl fqamnkelie kghgirllea qiatggiidp keshrlpvdi aykrgyfnee
2401 lseilsdpsd dtkgffdpnt eenltylqlk ercikdeetg lcllplkekk
kqvqtsqknt 2461 lrkrrvvivd petnkemsvq eaykkglidy etfkelceqe
ceweeititg sdgstrvvlv 2521 drktgsqydi qdaidkglvd rkffdqyrsg
slsltqfadm islkngvgts ssmgsgvsdd 2581 vfsssrhesv skistissvr
nltirsssfs dtleesspia aifdtenlek isitegierg 2641 ivdsitgqrl
leaqactggi ihpttgqkls lqdavsqgvi dqdmatrlkp aqkafigfeg 2701
vkgkkkmsaa eavkekwlpy eagqrflefq yltgglvdpe vhgristeea irkgfidgra
2761 aqrlqdtssy akiltcpktk lkisykdain rsmveditgl rlleaasvss
kglpspynms 2821 sapgsrsgsr sgsrsgsrsg srsgsrrgsf datgnssysy
sysfssssig h
[0258] In some embodiments of the methods of the disclosure, the
wild type human DSP gene of the disclosure consists of or comprises
the nucleic acid sequence (Genbank Accession number:
NM_001008844.2, transcript variant 2):
TABLE-US-00031 (SEQ ID NO: 19) 1 aagaaaccgg ccaggtgtgg cctaggcgcc
cagtgccagc ggggaggaga ctcgctccgc 61 cgccgaccaa caccaacacc
cagctccgac gcagctcctc tgcgcccttg ccgccctccg 121 agccacagct
ttcctcccgc tcctgccccc ggcccgtcgc cgtctccgcg ctcgcagcgg 181
cctcgggagg gcccaggtag cgagcagcga cctcgcgagc cttccgcact cccgcccggt
241 tccccggccg tccgcctatc cttggccccc tccgctttct ccgcgccggc
ccgcctcgct 301 tatgcctcgg cgctgagccg ctctcccgat tgcccgccga
catgagctgc aacggaggct 361 cccacccgcg gatcaacact ctgggccgca
tgatccgcgc cgagtctggc ccggacctgc 421 gctacgaggt gaccagcggc
ggcgggggca ccagcaggat gtactattct cggcgcggcg 481 tgatcaccga
ccagaactcg gacggctact gtcaaaccgg cacgatgtcc aggcaccaga 541
accagaacac catccaggag ctgctgcaga actgctccga ctgcttgatg cgagcagagc
601 tcatcgtgca gcctgaattg aagtatggag atggaataca actgactcgg
agtcgagaat 661 tggatgagtg ttttgcccag gccaatgacc aaatggaaat
cctcgacagc ttgatcagag 721 agatgcggca gatgggccag ccctgtgatg
cttaccagaa aaggcttctt cagctccaag 781 agcaaatgcg agccctttat
aaagccatca gtgtccctcg agtccgcagg gccagctcca 841 agggtggtgg
aggctacact tgtcagagtg gctctggctg ggatgagttc accaaacatg 901
tcaccagtga atgtttgggg tggatgaggc agcaaagggc ggagatggac atggtggcct
961 ggggtgtgga cctggcctca gtggagcagc acattaacag ccaccggggc
atccacaact 1021 ccatcggcga ctatcgctgg cagctggaca aaatcaaagc
cgacctgcgc gagaaatctg 1081 cgatctacca gttggaggag gagtatgaaa
acctgctgaa agcgtccttt gagaggatgg 1141 atcacctgcg acagctgcag
aacatcattc aggccacgtc cagggagatc atgtggatca 1201 atgactgcga
ggaggaggag ctgctgtacg actggagcga caagaacacc aacatcgctc 1261
agaaacagga ggccttctcc atacgcatga gtcaactgga agttaaagaa aaagagctca
1321 ataagctgaa acaagaaagt gaccaacttg tcctcaatca gcatccagct
tcagacaaaa 1381 ttgaggccta tatggacact ctgcagacgc agtggagttg
gattcttcag atcaccaagt 1441 gcattgatgt tcatctgaaa gaaaatgctg
cctactttca gttttttgaa gaggcgcagt 1501 ctactgaagc atacctgaag
gggctccagg actccatcag gaagaagtac ccctgcgaca 1561 agaacatgcc
cctgcagcac ctgctggaac agatcaagga gctggagaaa gaacgagaga 1621
aaatccttga atacaagcgt caggtgcaga acttggtaaa caagtctaag aagattgtac
1681 agctgaagcc tcgtaaccca gactacagaa gcaataaacc cattattctc
agagctctct 1741 gtgactacaa acaagatcag aaaatcgtgc ataaggggga
tgagtgtatc ctgaaggaca 1801 acaacgagcg cagcaagtgg tacgtgacgg
gcccgggagg cgttgacatg cttgttccct 1861 ctgtggggct gatcatccct
cctccgaacc cactggccgt ggacctctct tgcaagattg 1921 agcagtacta
cgaagccatc ttggctctgt ggaaccagct ctacatcaac atgaagagcc 1981
tggtgtcctg gcactactgc atgattgaca tagagaagat cagggccatg acaatcgcca
2041 agctgaaaac aatgcggcag gaagattaca tgaagacgat agccgacctt
gagttacatt 2101 accaagagtt catcagaaat agccaaggct cagagatgtt
tggagatgat gacaagcgga 2161 aaatacagtc tcagttcacc gatgcccaga
agcattacca gaccctggtc attcagctcc 2221 ctggctatcc ccagcaccag
acagtgacca caactgaaat cactcatcat ggaacctgcc 2281 aagatgtcaa
ccataataaa gtaattgaaa ccaacagaga aaatgacaag caagaaacat 2341
ggatgctgat ggagctgcag aagattcgca ggcagataga gcactgcgag ggcaggatga
2401 ctctcaaaaa cctccctcta gcagaccagg gatcttctca ccacatcaca
gtgaaaatta 2461 acgagcttaa gagtgtgcag aatgattcac aagcaattgc
tgaggttctc aaccagctta 2521 aagatatgct tgccaacttc agaggttctg
aaaagtactg ctatttacag aatgaagtat 2581 ttggactatt tcagaaactg
gaaaatatca atggtgttac agatggctac ttaaatagct 2641 tatgcacagt
aagggcactg ctccaggcta ttctccaaac agaagacatg ttaaaggttt 2701
atgaagccag gctcactgag gaggaaactg tctgcctgga cctggataaa gtggaagctt
2761 accgctgtgg actgaagaaa ataaaaaatg acttgaactt gaagaagtcg
ttgttggcca 2821 ctatgaagac agaactacag aaagcccagc agatccactc
tcagacttca cagcagtatc 2881 cactttatga tctggacttg ggcaagttcg
gtgaaaaagt cacacagctg acagaccgct 2941 ggcaaaggat agataaacag
atcgacttta ggttatggga cctggagaaa caaatcaagc 3001 aattgaggaa
ttatcgtgat aactatcagg ctttctgcaa gtggctctat gatgctaaac 3061
gccgccagga ttccttagaa tccatgaaat ttggagattc caacacagtc atgcggtttt
3121 tgaatgagca gaagaacttg cacagtgaaa tatctggcaa acgagacaaa
tcagaggaag 3181 tacaaaaaat tgctgaactt tgcgccaatt caattaagga
ttatgagctc cagctggcct 3241 catacacctc aggactggaa actctgctga
acatacctat caagaggacc atgattcagt 3301 ccccttctgg ggtgattctg
caagaggctg cagatgttca tgctcggtac attgaactac 3361 ttacaagatc
tggagactat tacaggttct taagtgagat gctgaagagt ttggaagatc 3421
tgaagctgaa aaataccaag atcgaagttt tggaagagga gctcagactg gcccgagatg
3481 ccaactcgga aaactgtaat aagaacaaat tcctggatca gaacctgcag
aaataccagg 3541 cagagtgttc ccagttcaaa gcgaagcttg cgagcctgga
ggagctgaag agacaggctg 3601 agctggatgg gaagtcggct aagcaaaatc
tagacaagtg ctacggccaa ataaaagaac 3661 tcaatgagaa gatcacccga
ctgacttatg agattgaaga tgaaaagaga agaagaaaat 3721 ctgtggaaga
cagatttgac caacagaaga atgactatga ccaactgcag aaagcaaggc 3781
aatgtgaaaa ggagaacctt ggttggcaga aattagagtc tgagaaagcc atcaaggaga
3841 aggagtacga gattgaaagg ttgagggttc tactgcagga agaaggcacc
cggaagagag 3901 aatatgaaaa tgagctggca aaggcatcta ataggattca
ggaatcaaag aatcagtgta 3961 ctcaggtggt acaggaaaga gagagccttc
tggtgaaaat caaagtcctg gagcaagaca 4021 aggcaaggct gcagaggctg
gaggatgagc tgaatcgtgc aaaatcaact ctagaggcag 4081 aaaccagggt
gaaacagcgc ctggagtgtg agaaacagca aattcagaat gacctgaatc 4141
agtggaagac tcaatattcc cgcaaggagg aggctattag gaagatagaa tcggaaagag
4201 aaaagagtga gagagagaag aacagtctta ggagtgagat cgaaagactc
caagcagaga 4261 tcaagagaat tgaagagagg tgcaggcgta agctggagga
ttctaccagg gagacacagt 4321 cacagttaga aacagaacgc tcccgatatc
agagggagat tgataaactc agacagcgcc 4381 catatgggtc ccatcgagag
acccagactg agtgtgagtg gaccgttgac acctccaagc 4441 tggtgtttga
tgggctgagg aagaaggtga cagcaatgca gctctatgag tgtcagctga 4501
tcgacaaaac aaccttggac aaactattga aggggaagaa gtcagtggaa gaagttgctt
4561 ctgaaatcca gccattcctt cggggtgcag gatctatcgc tggagcatct
gcttctccta 4621 aggaaaaata ctctttggta gaggccaaga gaaagaaatt
aatcagccca gaatccacag 4681 tcatgcttct ggaggcccag gcagctacag
gtggtataat tgatccccat cggaatgaga 4741 agctgactgt cgacagtgcc
atagctcggg acctcattga cttcgatgac cgtcagcaga 4801 tatatgcagc
agaaaaagct atcactggtt ttgatgatcc attttcaggc aagacagtat 4861
ctgtttcaga agccatcaag aaaaatttga ttgatagaga aaccggaatg cgcctgctgg
4921 aagcccagat tgcttcaggg ggtgtagtag accctgtgaa cagtgtcttt
ttgccaaaag 4981 atgtcgcctt ggcccggggg ctgattgata gagatttgta
tcgatccctg aatgatcccc 5041 gagatagtca gaaaaacttt gtggatccag
tcaccaaaaa gaaggtcagt tacgtgcagc 5101 tgaaggaacg gtgcagaatc
gaaccacata ctggtctgct cttgctttca gtacagaaga 5161 gaagcatgtc
cttccaagga atcagacaac ctgtgaccgt cactgagcta gtagattctg 5221
gtatattgag accgtccact gtcaatgaac tggaatctgg tcagatttct tatgacgagg
5281 ttggtgagag aattaaggac ttcctccagg gttcaagctg catagcaggc
atatacaatg 5341 agaccacaaa acagaagctt ggcatttatg aggccatgaa
aattggctta gtccgacctg 5401 gtactgctct ggagttgctg gaagcccaag
cagctactgg ctttatagtg gatcctgtta 5461 gcaacttgag gttaccagtg
gaggaagcct acaagagagg tctggtgggc attgagttca 5521 aagagaagct
cctgtctgca gaacgagctg tcactgggta taatgatcct gaaacaggaa 5581
acatcatctc tttgttccaa gccatgaata aggaactcat cgaaaagggc cacggtattc
5641 gcttattaga agcacagatc gcaaccgggg ggatcattga cccaaaggag
agccatcgtt 5701 taccagttga catagcatat aagaggggct atttcaatga
ggaactcagt gagattctct 5761 cagatccaag tgatgatacc aaaggatttt
ttgaccccaa cactgaagaa aatcttacct 5821 atctgcaact aaaagaaaga
tgcattaagg atgaggaaac agggctctgt cttctgcctc 5881 tgaaagaaaa
gaagaaacag gtgcagacat cacaaaagaa taccctcagg aagcgtagag 5941
tggtcatagt tgacccagaa accaataaag aaatgtctgt tcaggaggcc tacaagaagg
6001 gcctaattga ttatgaaacc ttcaaagaac tgtgtgagca ggaatgtgaa
tgggaagaaa 6061 taaccatcac gggatcagat ggctccacca gggtggtcct
ggtagataga aagacaggca 6121 gtcagtatga tattcaagat gctattgaca
agggccttgt tgacaggaag ttctttgatc 6181 agtaccgatc cggcagcctc
agcctcactc aatttgctga catgatctcc ttgaaaaatg 6241 gtgtcggcac
cagcagcagc atgggcagtg gtgtcagcga tgatgttttt agcagctccc 6301
gacatgaatc agtaagtaag atttccacca tatccagcgt caggaattta accataagga
6361 gcagctcttt ttcagacacc ctggaagaat cgagccccat tgcagccatc
tttgacacag 6421 aaaacctgga gaaaatctcc attacagaag gtatagagcg
gggcatcgtt gacagcatca 6481 cgggtcagag gcttctggag gctcaggcct
gcacaggtgg catcatccac ccaaccacgg 6541 gccagaagct gtcacttcag
gacgcagtct cccagggtgt gattgaccaa gacatggcca 6601 ccaggctgaa
gcctgctcag aaagccttca taggcttcga gggtgtgaag ggaaagaaga 6661
agatgtcagc agcagaggca gtgaaagaaa aatggctccc gtatgaggct ggccagcgct
6721 tcctggagtt ccagtacctc acgggaggtc ttgttgaccc ggaagtgcat
gggaggataa 6781 gcaccgaaga agccatccgg aaggggttca tagatggccg
cgccgcacag aggctgcaag 6841 acaccagcag ctatgccaaa atcctgacct
gccccaaaac caaattaaaa atatcctata 6901 aggatgccat aaatcgctcc
atggtagaag atatcactgg gctgcgcctt ctggaagccg 6961 cctccgtgtc
gtccaagggc ttacccagcc cttacaacat gtcttcggct ccggggtccc 7021
gctccggctc ccgctcggga tctcgctccg gatctcgctc cgggtcccgc agtgggtccc
7081 ggagaggaag ctttgacgcc acagggaatt cttcctactc ttattcctac
tcatttagca 7141 gtagttctat tgggcactag tagtcagttg ggagtggttg
ctataccttg acttcattta 7201 tatgaatttc cactttatta aataatagaa
aagaaaatcc cggtgcttgc agtagagtga 7261 taggacattc tatgcttaca
gaaaatatag ccatgattga aatcaaatag taaaggctgt 7321 tctggctttt
tatcttctta gctcatctta aataagcagt acacttggat gcagtgcgtc 7381
tgaagtgcta atcagttgta acaatagcac aaatcgaact taggatttgt ttcttctctt
7441 ctgtgtttcg atttttgatc aattctttaa ttttggaagc ctataataca
gttttctatt
7501 cttggagata aaaattaaat ggatcactga tattttagtc attctgcttc
tcatctaaat 7561 atttccatat tctgtattag gagaaaatta ccctcccagc
accagccccc ctctcaaacc 7621 cccaacccaa aaccaagcat tttggaatga
gtctccttta gtttcagagt gtggattgta 7681 taacccatat actcttcgat
gtacttgttt ggtttggtat taatttgact gtgcatgaca 7741 gcggcaatct
tttctttggt caaagttttc tgtttatttt gcttgtcata ttcgatgtac 7801
tttaaggtgt ctttatgaag tttgctattc tggcaataaa cttttagact tttgaagtgt
7861 ttgtgtttta atttaatatg tttataagca tgtataaaca tttagcatat
ttttatcata 7921 ggtctaaaaa tatttgttta ctaaatacct gtgaagaaat
accattaaaa aactatttgg 7981 ttctgaattc ttactagaaa aaaaa
[0259] In some embodiments of the methods of the disclosure, the
wild type human DSP gene of the disclosure encodes a protein that
consists of or comprises the amino acid sequence (Genbank Accession
number: NP_001008844.1, encoded by transcript variant 2):
TABLE-US-00032 (SEQ ID NO: 20) 1 mscnggshpr intlgrmira esgpdlryev
tsggggtsrm yysrrgvitd qnsdgycqtg 61 tmsrhqnqnt iqellqncsd
clmraelivq pelkygdgiq ltrsreldec faqandqmei 121 ldsliremrq
mgqpcdayqk rllqlqeqmr alykaisvpr vrrasskggg gytcqsgsgw 181
deftkhvtse clgwmrqqra emdmvawgvd lasveqhins hrgihnsigd yrwqldkika
241 dlreksaiyq leeeyenllk asfermdhlr qlqniiqats reimwindce
eeellydwsd 301 kntniaqkqe afsirmsqle vkekelnklk qesdqlvlnq
hpasdkieay mdtlqtqwsw 361 ilqitkcidv hlkenaayfq ffeeaqstea
ylkglqdsir kkypcdknmp lqhlleqike 421 lekerekile ykrqvqnlvn
kskkivqlkp rnpdyrsnkp iilralcdyk qdqkivhkgd 481 ecilkdnner
skwyvtgpgg vdmlvpsvgl iipppnplav dlsckieqyy eailalwnql 541
yinmkslvsw hycmidieki ramtiaklkt mrqedymkti adlelhyqef irnsqgsemf
601 gdddkrkiqs qftdaqkhyq tlviqlpgyp qhqtvtttei thhgtcqdvn
hnkvietnre 661 ndkqetwmlm elqkirrqie hcegrmtlkn lpladqgssh
hitvkinelk svqndsqaia 721 evlnqlkdml anfrgsekyc ylqnevfglf
qkleningvt dgylnslctv rallqailqt 781 edmlkvyear lteeetvcld
ldkveayrcg lkkikndlnl kksllatmkt elqkaqqihs 841 qtsqqyplyd
ldlgkfgekv tqltdrwqri dkqidfrlwd lekqikqlrn yrdnyqafck 901
wlydakrrqd slesmkfgds ntvmrflneq knlhseisgk rdkseevqki aelcansikd
961 yelqlasyts gletllnipi krtmiqspsg vilqeaadvh aryielltrs
gdyyrflsem 1021 lksledlklk ntkievleee lrlardanse ncnknkfldq
nlqkyqaecs qfkaklasle 1081 elkrqaeldg ksakqnldkc ygqikelnek
itrltyeied ekrrrksved rfdqqkndyd 1141 qlqkarqcek enlgwqkles
ekaikekeye ierlrvllqe egtrkreyen elakasnriq 1201 esknqctqvv
qeresllvki kvleqdkarl qrledelnra kstleaetrv kqrlecekqq 1261
iqndlnqwkt qysrkeeair kieserekse reknslrsei erlqaeikri eercrrkled
1321 stretqsqle tersryqrei dklrqrpygs hretqtecew tvdtsklvfd
glrkkvtamq 1381 lyecqlidkt tldkllkgkk sveevaseiq pflrgagsia
gasaspkeky slveakrkkl 1441 ispestvmll eaqaatggii dphrnekltv
dsaiardlid fddrqqiyaa ekaitgfddp 1501 fsgktvsvse aikknlidre
tgmrlleaqi asggvvdpvn svflpkdval arglidrdly 1561 rslndprdsq
knfvdpvtkk kvsyvqlker criephtgll llsvqkrsms fqgirqpvtv 1621
telvdsgilr pstvnelesg qisydevger ikdflqgssc iagiynettk qklgiyeamk
1681 iglvrpgtal elleaqaatg fivdpvsnlr lpveeaykrg lvgiefkekl
lsaeravtgy 1741 ndpetgniis lfqamnkeli ekghgirlle aqiatggiid
pkeshrlpvd iaykrgyfne 1801 elseilsdps ddtkgffdpn teenltylql
kercikdeet glcllplkek kkqvqtsqkn 1861 tlrkrrvviv dpetnkemsv
qeaykkglid yetfkelceq eceweeitit gsdgstrvvl 1921 vdrktgsqyd
iqdaidkglv drkffdqyrs gslsltqfad mislkngvgt sssmgsgvsd 1981
dvfsssrhes vskistissv rnltirsssf sdtleesspi aaifdtenle kisitegier
2041 givdsitgqr lleaqactgg iihpttgqkl slqdavsqgv idqdmatrlk
paqkafigfe 2101 gvkgkkkmsa aeavkekwlp yeagqrflef qyltgglvdp
evhgristee airkgfidgr 2161 aaqrlqdtss yakiltcpkt klkisykdai
nrsmveditg lrlleaasvs skglpspynm 2221 ssapgsrsgs rsgsrsgsrs
gsrsgsrrgs fdatgnssys ysysfssssi gh
[0260] In some embodiments of the methods of the disclosure, the
wild type human DSP gene of the disclosure consists of or comprises
the nucleic acid sequence (Genbank Accession number:
NM_001319034.1, transcript variant 3):
TABLE-US-00033 (SEQ ID NO: 46) 1 aagaaaccgg ccaggtgtgg cctaggcgcc
cagtgccagc ggggaggaga ctcgctccgc 61 cgccgaccaa caccaacacc
cagctccgac gcagctcctc tgcgcccttg ccgccctccg 121 agccacagct
ttcctcccgc tcctgccccc ggcccgtcgc cgtctccgcg ctcgcagcgg 181
cctcgggagg gcccaggtag cgagcagcga cctcgcgagc cttccgcact cccgcccggt
241 tccccggccg tccgcctatc cttggccccc tccgctttct ccgcgccggc
ccgcctcgct 301 tatgcctcgg cgctgagccg ctctcccgat tgcccgccga
catgagctgc aacggaggct 361 cccacccgcg gatcaacact ctgggccgca
tgatccgcgc cgagtctggc ccggacctgc 421 gctacgaggt gaccagcggc
ggcgggggca ccagcaggat gtactattct cggcgcggcg 481 tgatcaccga
ccagaactcg gacggctact gtcaaaccgg cacgatgtcc aggcaccaga 541
accagaacac catccaggag ctgctgcaga actgctccga ctgcttgatg cgagcagagc
601 tcatcgtgca gcctgaattg aagtatggag atggaataca actgactcgg
agtcgagaat 661 tggatgagtg ttttgcccag gccaatgacc aaatggaaat
cctcgacagc ttgatcagag 721 agatgcggca gatgggccag ccctgtgatg
cttaccagaa aaggcttctt cagctccaag 781 agcaaatgcg agccctttat
aaagccatca gtgtccctcg agtccgcagg gccagctcca 841 agggtggtgg
aggctacact tgtcagagtg gctctggctg ggatgagttc accaaacatg 901
tcaccagtga atgtttgggg tggatgaggc agcaaagggc ggagatggac atggtggcct
961 ggggtgtgga cctggcctca gtggagcagc acattaacag ccaccggggc
atccacaact 1021 ccatcggcga ctatcgctgg cagctggaca aaatcaaagc
cgacctgcgc gagaaatctg 1081 cgatctacca gttggaggag gagtatgaaa
acctgctgaa agcgtccttt gagaggatgg 1141 atcacctgcg acagctgcag
aacatcattc aggccacgtc cagggagatc atgtggatca 1201 atgactgcga
ggaggaggag ctgctgtacg actggagcga caagaacacc aacatcgctc 1261
agaaacagga ggccttctcc atacgcatga gtcaactgga agttaaagaa aaagagctca
1321 ataagctgaa acaagaaagt gaccaacttg tcctcaatca gcatccagct
tcagacaaaa 1381 ttgaggccta tatggacact ctgcagacgc agtggagttg
gattcttcag atcaccaagt 1441 gcattgatgt tcatctgaaa gaaaatgctg
cctactttca gttttttgaa gaggcgcagt 1501 ctactgaagc atacctgaag
gggctccagg actccatcag gaagaagtac ccctgcgaca 1561 agaacatgcc
cctgcagcac ctgctggaac agatcaagga gctggagaaa gaacgagaga 1621
aaatccttga atacaagcgt caggtgcaga acttggtaaa caagtctaag aagattgtac
1681 agctgaagcc tcgtaaccca gactacagaa gcaataaacc cattattctc
agagctctct 1741 gtgactacaa acaagatcag aaaatcgtgc ataaggggga
tgagtgtatc ctgaaggaca 1801 acaacgagcg cagcaagtgg tacgtgacgg
gcccgggagg cgttgacatg cttgttccct 1861 ctgtggggct gatcatccct
cctccgaacc cactggccgt ggacctctct tgcaagattg 1921 agcagtacta
cgaagccatc ttggctctgt ggaaccagct ctacatcaac atgaagagcc 1981
tggtgtcctg gcactactgc atgattgaca tagagaagat cagggccatg acaatcgcca
2041 agctgaaaac aatgcggcag gaagattaca tgaagacgat agccgacctt
gagttacatt 2101 accaagagtt catcagaaat agccaaggct cagagatgtt
tggagatgat gacaagcgga 2161 aaatacagtc tcagttcacc gatgcccaga
agcattacca gaccctggtc attcagctcc 2221 ctggctatcc ccagcaccag
acagtgacca caactgaaat cactcatcat ggaacctgcc 2281 aagatgtcaa
ccataataaa gtaattgaaa ccaacagaga aaatgacaag caagaaacat 2341
ggatgctgat ggagctgcag aagattcgca ggcagataga gcactgcgag ggcaggatga
2401 ctctcaaaaa cctccctcta gcagaccagg gatcttctca ccacatcaca
gtgaaaatta 2461 acgagcttaa gagtgtgcag aatgattcac aagcaattgc
tgaggttctc aaccagctta 2521 aagatatgct tgccaacttc agaggttctg
aaaagtactg ctatttacag aatgaagtat 2581 ttggactatt tcagaaactg
gaaaatatca atggtgttac agatggctac ttaaatagct 2641 tatgcacagt
aagggcactg ctccaggcta ttctccaaac agaagacatg ttaaaggttt 2701
atgaagccag gctcactgag gaggaaactg tctgcctgga cctggataaa gtggaagctt
2761 accgctgtgg actgaagaaa ataaaaaatg acttgaactt gaagaagtcg
ttgttggcca 2821 ctatgaagac agaactacag aaagcccagc agatccactc
tcagacttca cagcagtatc 2881 cactttatga tctggacttg ggcaagttcg
gtgaaaaagt cacacagctg acagaccgct 2941 ggcaaaggat agataaacag
atcgacttta ggttatggga cctggagaaa caaatcaagc 3001 aattgaggaa
ttatcgtgat aactatcagg ctttctgcaa gtggctctat gatgctaaac 3061
gccgccagga ttccttagaa tccatgaaat ttggagattc caacacagtc atgcggtttt
3121 tgaatgagca gaagaacttg cacagtgaaa tatctggcaa acgagacaaa
tcagaggaag 3181 tacaaaaaat tgctgaactt tgcgccaatt caattaagga
ttatgagctc cagctggcct 3241 catacacctc aggactggaa actctgctga
acatacctat caagaggacc atgattcagt 3301 ccccttctgg ggtgattctg
caagaggctg cagatgttca tgctcggtac attgaactac 3361 ttacaagatc
tggagactat tacaggttct taagtgagat gctgaagagt ttggaagatc 3421
tgaagctgaa aaataccaag atcgaagttt tggaagagga gctcagactg gcccgagatg
3481 ccaactcgga aaactgtaat aagaacaaat tcctggatca gaacctgcag
aaataccagg 3541 cagagtgttc ccagttcaaa gcgaagcttg cgagcctgga
ggagctgaag agacaggctg 3601 agctggatgg gaagtcggct aagcaaaatc
tagacaagtg ctacggccaa ataaaagaac 3661 tcaatgagaa gatcacccga
ctgacttatg agattgaaga tgaaaagaga agaagaaaat 3721 ctgtggaaga
cagatttgac caacagaaga atgactatga ccaactgcag aaagcaaggc 3781
aatgtgaaaa ggagaacctt ggttggcaga aattagagtc tgagaaagcc atcaaggaga
3841 aggagtacga gattgaaagg ttgagggttc tactgcagga agaaggcacc
cggaagagag 3901 aatatgaaaa tgagctggca aaggtaagaa accactataa
tgaggagatg agtaatttaa 3961 ggaacaagta tgaaacagag attaacatta
cgaagaccac catcaaggag atatccatgc 4021 aaaaagagga tgattccaaa
aatcttagaa accagcttga tagactttca agggaaaatc 4081 gagatctgaa
ggatgaaatt gtcaggctca atgacagcat cttgcaggcc actgagcagc 4141
gaaggcgagc tgaagaaaac gcccttcagc aaaaggcctg tggctctgag ataatgcaga
4201 agaagcagca tctggagata gaactgaagc aggtcatgca gcagcgctct
gaggacaatg 4261 cccggcacaa gcagtccctg gaggaggctg ccaagaccat
tcaggacaaa aataaggaga 4321 tcgagagact caaagctgag tttcaggagg
aggccaagcg ccgctgggaa tatgaaaatg 4381 aactgagtaa ggcatctaat
aggattcagg aatcaaagaa tcagtgtact caggtggtac 4441 aggaaagaga
gagccttctg gtgaaaatca aagtcctgga gcaagacaag gcaaggctgc 4501
agaggctgga ggatgagctg aatcgtgcaa aatcaactct agaggcagaa accagggtga
4561 aacagcgcct ggagtgtgag aaacagcaaa ttcagaatga cctgaatcag
tggaagactc 4621 aatattcccg caaggaggag gctattagga agatagaatc
ggaaagagaa aagagtgaga 4681 gagagaagaa cagtcttagg agtgagatcg
aaagactcca agcagagatc aagagaattg 4741 aagagaggtg caggcgtaag
ctggaggatt ctaccaggga gacacagtca cagttagaaa 4801 cagaacgctc
ccgatatcag agggagattg ataaactcag acagcgccca tatgggtccc 4861
atcgagagac ccagactgag tgtgagtgga ccgttgacac ctccaagctg gtgtttgatg
4921 ggctgaggaa gaaggtgaca gcaatgcagc tctatgagtg tcagctgatc
gacaaaacaa 4981 ccttggacaa actattgaag gggaagaagt cagtggaaga
agttgcttct gaaatccagc 5041 cattccttcg gggtgcagga tctatcgctg
gagcatctgc ttctcctaag gaaaaatact 5101 ctttggtaga ggccaagaga
aagaaattaa tcagcccaga atccacagtc atgcttctgg 5161 aggcccaggc
agctacaggt ggtataattg atccccatcg gaatgagaag ctgactgtcg 5221
acagtgccat agctcgggac ctcattgact tcgatgaccg tcagcagata tatgcagcag
5281 aaaaagctat cactggtttt gatgatccat tttcaggcaa gacagtatct
gtttcagaag 5341 ccatcaagaa aaatttgatt gatagagaaa ccggaatgcg
cctgctggaa gcccagattg 5401 cttcaggggg tgtagtagac cctgtgaaca
gtgtcttttt gccaaaagat gtcgccttgg 5461 cccgggggct gattgataga
gatttgtatc gatccctgaa tgatccccga gatagtcaga 5521 aaaactttgt
ggatccagtc accaaaaaga aggtcagtta cgtgcagctg aaggaacggt 5581
gcagaatcga accacatact ggtctgctct tgctttcagt acagaagaga agcatgtcct
5641 tccaaggaat cagacaacct gtgaccgtca ctgagctagt agattctggt
atattgagac 5701 cgtccactgt caatgaactg gaatctggtc agatttctta
tgacgaggtt ggtgagagaa 5761 ttaaggactt cctccagggt tcaagctgca
tagcaggcat atacaatgag accacaaaac 5821 agaagcttgg catttatgag
gccatgaaaa ttggcttagt ccgacctggt actgctctgg 5881 agttgctgga
agcccaagca gctactggct ttatagtgga tcctgttagc aacttgaggt 5941
taccagtgga ggaagcctac aagagaggtc tggtgggcat tgagttcaaa gagaagctcc
6001 tgtctgcaga acgagctgtc actgggtata atgatcctga aacaggaaac
atcatctctt 6061 tgttccaagc catgaataag gaactcatcg aaaagggcca
cggtattcgc ttattagaag 6121 cacagatcgc aaccgggggg atcattgacc
caaaggagag ccatcgttta ccagttgaca 6181 tagcatataa gaggggctat
ttcaatgagg aactcagtga gattctctca gatccaagtg 6241 atgataccaa
aggatttttt gaccccaaca ctgaagaaaa tcttacctat ctgcaactaa 6301
aagaaagatg cattaaggat gaggaaacag ggctctgtct tctgcctctg aaagaaaaga
6361 agaaacaggt gcagacatca caaaagaata ccctcaggaa gcgtagagtg
gtcatagttg 6421 acccagaaac caataaagaa atgtctgttc aggaggccta
caagaagggc ctaattgatt 6481 atgaaacctt caaagaactg tgtgagcagg
aatgtgaatg ggaagaaata accatcacgg 6541 gatcagatgg ctccaccagg
gtggtcctgg tagatagaaa gacaggcagt cagtatgata 6601 ttcaagatgc
tattgacaag ggccttgttg acaggaagtt ctttgatcag taccgatccg 6661
gcagcctcag cctcactcaa tttgctgaca tgatctcctt gaaaaatggt gtcggcacca
6721 gcagcagcat gggcagtggt gtcagcgatg atgtttttag cagctcccga
catgaatcag 6781 taagtaagat ttccaccata tccagcgtca ggaatttaac
cataaggagc agctcttttt 6841 cagacaccct ggaagaatcg agccccattg
cagccatctt tgacacagaa aacctggaga 6901 aaatctccat tacagaaggt
atagagcggg gcatcgttga cagcatcacg ggtcagaggc 6961 ttctggaggc
tcaggcctgc acaggtggca tcatccaccc aaccacgggc cagaagctgt 7021
cacttcagga cgcagtctcc cagggtgtga ttgaccaaga catggccacc aggctgaagc
7081 ctgctcagaa agccttcata ggcttcgagg gtgtgaaggg aaagaagaag
atgtcagcag 7141 cagaggcagt gaaagaaaaa tggctcccgt atgaggctgg
ccagcgcttc ctggagttcc 7201 agtacctcac gggaggtctt gttgacccgg
aagtgcatgg gaggataagc accgaagaag 7261 ccatccggaa ggggttcata
gatggccgcg ccgcacagag gctgcaagac accagcagct 7321 atgccaaaat
cctgacctgc cccaaaacca aattaaaaat atcctataag gatgccataa 7381
atcgctccat ggtagaagat atcactgggc tgcgccttct ggaagccgcc tccgtgtcgt
7441 ccaagggctt acccagccct tacaacatgt cttcggctcc ggggtcccgc
tccggctccc
7501 gctcgggatc tcgctccgga tctcgctccg ggtcccgcag tgggtcccgg
agaggaagct 7561 ttgacgccac agggaattct tcctactctt attcctactc
atttagcagt agttctattg 7621 ggcactagta gtcagttggg agtggttgct
ataccttgac ttcatttata tgaatttcca 7681 ctttattaaa taatagaaaa
gaaaatcccg gtgcttgcag tagagtgata ggacattcta 7741 tgcttacaga
aaatatagcc atgattgaaa tcaaatagta aaggctgttc tggcttttta 7801
tcttcttagc tcatcttaaa taagcagtac acttggatgc agtgcgtctg aagtgctaat
7861 cagttgtaac aatagcacaa atcgaactta ggatttgttt cttctcttct
gtgtttcgat 7921 ttttgatcaa ttctttaatt ttggaagcct ataatacagt
tttctattct tggagataaa 7981 aattaaatgg atcactgata ttttagtcat
tctgcttctc atctaaatat ttccatattc 8041 tgtattagga gaaaattacc
ctcccagcac cagcccccct ctcaaacccc caacccaaaa 8101 ccaagcattt
tggaatgagt ctcctttagt ttcagagtgt ggattgtata acccatatac 8161
tcttcgatgt acttgtttgg tttggtatta atttgactgt gcatgacagc ggcaatcttt
8221 tctttggtca aagttttctg tttattttgc ttgtcatatt cgatgtactt
taaggtgtct 8281 ttatgaagtt tgctattctg gcaataaact tttagacttt
tgaagtgttt gtgttttaat 8341 ttaatatgtt tataagcatg tataaacatt
tagcatattt ttatcatagg tctaaaaata 8401 tttgtttact aaatacctgt
gaagaaatac cattaaaaaa ctatttggtt ctgaattctt 8461 actagaaaaa aaa
[0261] In some embodiments of the methods of the disclosure, the
wild type human DSP gene of the disclosure consists of or comprises
the amino acid sequence (Genbank Accession number: NP_001305963.1,
transcript variant 3):
TABLE-US-00034 (SEQ ID NO: 47) 1 mscnggshpr intlgrmira esgpdlryev
tsggggtsrm yysrrgvitd qnsdgycqtg 61 tmsrhqnqnt iqellqncsd
clmraelivq pelkygdgiq ltrsreldec faqandqmei 121 ldsliremrq
mgqpcdayqk rllqlqeqmr alykaisvpr vrrasskggg gytcqsgsgw 181
deftkhvtse clgwmrqqra emdmvawgvd lasveqhins hrgihnsigd yrwqldkika
241 dlreksaiyq leeeyenllk asfermdhlr qlqniiqats reimwindce
eeellydwsd 301 kntniaqkqe afsirmsqle vkekelnklk qesdqlvlnq
hpasdkieay mdtlqtqwsw 361 ilqitkcidv hlkenaayfq ffeeaqstea
ylkglqdsir kkypcdknmp lqhlleqike 421 lekerekile ykrqvqnlvn
kskkivqlkp rnpdyrsnkp iilralcdyk qdqkivhkgd 481 ecilkdnner
skwyvtgpgg vdmlvpsvgl iipppnplav dlsckieqyy eailalwnql 541
yinmkslvsw hycmidieki ramtiaklkt mrqedymkti adlelhyqef irnsqgsemf
601 gdddkrkiqs qftdaqkhyq tlviqlpgyp qhqtvtttei thhgtcqdvn
hnkvietnre 661 ndkqetwmlm elqkirrqie hcegrmtlkn lpladqgssh
hitvkinelk svqndsqaia 721 evlnqlkdml anfrgsekyc ylqnevfglf
qkleningvt dgylnslctv rallqailqt 781 edmlkvyear lteeetvcld
ldkveayrcg lkkikndlnl kksllatmkt elqkaqqihs 841 qtsqqyplyd
ldlgkfgekv tqltdrwqri dkqidfrlwd lekqikqlrn yrdnyqafck 901
wlydakrrqd slesmkfgds ntvmrflneq knlhseisgk rdkseevqki aelcansikd
961 yelqlasyts gletllnipi krtmiqspsg vilqeaadvh aryielltrs
gdyyrflsem 1021 lksledlklk ntkievleee lrlardanse ncnknkfldq
nlqkyqaecs qfkaklasle 1081 elkrqaeldg ksakqnldkc ygqikelnek
itrltyeied ekrrrksved rfdqqkndyd 1141 qlqkarqcek enlgwqkles
ekaikekeye ierlrvllqe egtrkreyen elakvrnhyn 1201 eemsnlrnky
eteinitktt ikeismqked dsknlrnqld rlsrenrdlk deivrlndsi 1261
lqateqrrra eenalqqkac gseimqkkqh leielkqvmq qrsednarhk qsleeaakti
1321 qdknkeierl kaefqeeakr rweyenelsk asnriqeskn qctqvvqere
sllvkikvle 1381 qdkarlqrle delnrakstl eaetrvkqrl ecekqqiqnd
lnqwktqysr keeairkies 1441 erekserekn slrseierlq aeikrieerc
rrkledstre tqsqleters ryqreidklr 1501 qrpygshret qtecewtvdt
sklvfdglrk kvtamqlyec qlidkttldk llkgkksvee 1561 vaseiqpflr
gagsiagasa spkekyslve akrkklispe stvmlleaqa atggiidphr 1621
nekltvdsai ardlidfddr qqiyaaekai tgfddpfsgk tvsyseaikk nlidretgmr
1681 lleaqiasgg vvdpvnsvfl pkdvalargl idrdlyrsln dprdsqknfv
dpvtkkkvsy 1741 vqlkercrie phtgllllsv qkrsmsfqgi rqpvtvtelv
dsgilrpstv nelesgqisy 1801 devgerikdf lqgssciagi ynettkqklg
iyeamkiglv rpgtalelle aqaatgfivd 1861 pvsnlrlpve eaykrglvgi
efkekllsae ravtgyndpe tgniislfqa mnkeliekgh 1921 girlleaqia
tggiidpkes hrlpvdiayk rgyfneelse ilsdpsddtk gffdpnteen 1981
ltylqlkerc ikdeetglcl lplkekkkqv qtsqkntlrk rrvvivdpet nkemsvqeay
2041 kkglidyetf kelceqecew eeititgsdg strvvlvdrk tgsqydiqda
idkglvdrkf 2101 fdqyrsgsls ltqfadmisl kngvgtsssm gsgvsddvfs
ssrhesvski stissvrnlt 2161 irsssfsdtl eesspiaaif dtenlekisi
tegiergivd sitgqrllea qactggiihp 2221 ttgqklslqd aysqgvidqd
matrlkpaqk afigfegvkg kkkmsaaeav kekwlpyeag 2281 qrflefqylt
gglvdpevhg risteeairk gfidgraaqr lqdtssyaki ltcpktklki 2341
sykdainrsm veditglrll eaasysskgl pspynmssap gsrsgsrsgs rsgsrsgsrs
2401 gsrrgsfdat gnssysysys fssssigh
[0262] In some embodiments of the methods of the disclosure, the
wild type human AZGP1 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001185.3):
TABLE-US-00035 (SEQ ID NO: 21) 1 ccattggcct gtagattcac ctcccctggg
cagggcccca ggacccagga taatatctgt 61 gcctcctgcc cagaaccctc
caagcagaca caatggtaag aatggtgcct gtcctgctgt 121 ctctgctgct
gcttctgggt cctgctgtcc cccaggagaa ccaagatggt cgttactctc 181
tgacctatat ctacactggg ctgtccaagc atgttgaaga cgtccccgcg tttcaggccc
241 ttggctcact caatgacctc cagttcttta gatacaacag taaagacagg
aagtctcagc 301 ccatgggact ctggagacag gtggaaggaa tggaggattg
gaagcaggac agccaacttc 361 agaaggccag ggaggacatc tttatggaga
ccctgaaaga catcgtggag tattacaacg 421 acagtaacgg gtctcacgta
ttgcagggaa ggtttggttg tgagatcgag aataacagaa 481 gcagcggagc
attctggaaa tattactatg atggaaagga ctacattgaa ttcaacaaag 541
aaatcccagc ctgggtcccc ttcgacccag cagcccagat aaccaagcag aagtgggagg
601 cagaaccagt ctacgtgcag cgggccaagg cttacctgga ggaggagtgc
cctgcgactc 661 tgcggaaata cctgaaatac agcaaaaata tcctggaccg
gcaagatcct ccctctgtgg 721 tggtcaccag ccaccaggcc ccaggagaaa
agaagaaact gaagtgcctg gcctacgact 781 tctacccagg gaaaattgat
gtgcactgga ctcgggccgg cgaggtgcag gagcctgagt 841 tacggggaga
tgttcttcac aatggaaatg gcacttacca gtcctgggtg gtggtggcag 901
tgcccccgca ggacacagcc ccctactcct gccacgtgca gcacagcagc ctggcccagc
961 ccctcgtggt gccctgggag gccagctagg aagcaagggt tggaggcaat
gtgggatctc 1021 agacccagta gctgcccttc ctgcctgatg tgggagctga
accacagaaa tcacagtcaa 1081 tggatccaca aggcctgagg agcagtgtgg
ggggacagac aggaggtgga tttggagacc 1141 gaagactggg atgcctgtct
tgagtagact tggacccaaa aaatcatctc accttgagcc 1201 cacccccacc
ccattgtcta atctgtagaa gctaataaat aatcatccct ccttgcctag 1261
cataaaaaaa aaaaaaaa
[0263] In some embodiments of the methods of the disclosure, the
wild type human AZGP1 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001176.1):
TABLE-US-00036 (SEQ ID NO: 22) 1 mvrmvpvlls lllllgpavp qenqdgrysl
tyiytglskh vedvpafqal gslndlqffr 61 ynskdrksqp mglwrqvegm
edwkqdsqlq karedifmet lkdiveyynd sngshvlqgr 121 fgceiennrs
sgafwkyyyd gkdyiefnke ipawvpfdpa aqitkqkwea epvyvqraka 181
yleeecpatl rkylkyskni ldrqdppsvv vtshqapgek kklkclaydf ypgkidvhwt
241 ragevqepel rgdvlhngng tyqswvvvav ppqdtapysc hvqhsslaqp
lvvpweas
[0264] In some embodiments of the methods of the disclosure, the
wild type human OBFC1 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_024928):
TABLE-US-00037 (SEQ ID NO: 23) 1 aaatgcgctg gcggggagac cggggttggt
ccctggcggg gcagggggcg ggctcaggcc 61 ggaactccag agacgacctc
agccaactgc tcctgcgccg ggcggggtcg tcgccgccag 121 cggctccgag
cgccggaagg gccaggtctc agggctcctg gagctgcagg cggcgggagg 181
ggctacaaat gcttgactca gtgatgcaga acctttcaga gttagctgga agccacagcc
241 ctgcctcttg atgcagcctg gatccagccg gtgtgaagag gagacccctt
ccctcttgtg 301 gggtttggat cctgtgtttc tagcctttgc aaaactctac
atcagggata tcctggacat 361 gaaggagtcc cgccaggtgc caggtgtatt
tttgtacaat ggacatccaa taaaacaggt 421 agatgtcttg ggaactgtca
ttggagtgag agaaagagat gctttctaca gttatggagt 481 ggatgacagc
actggagtta taaactgcat ctgctggaaa aagttgaata ctgagtctgt 541
atcagctgct ccaagtgcag caagagagct cagcttaacc tcacaactta agaagctaca
601 agagaccatt gagcagaaaa caaagataga gatcggggac acgatccgag
tcagaggcag 661 tatccgcaca tacagagaag agcgagagat tcatgccacc
acttactata aagtggacga 721 cccagtgtgg aacattcaaa ttgcaaggat
gcttgagctg cccactatct acaggaaagt 781 ttatgaccag ccttttcaca
gctcagccct agagaaagaa gaggcactaa gcaatccagg 841 cgccctggac
ctccccagtc tcacgagttt gctgagtgaa aaagccaaag aattcctcat 901
ggagaacaga gtgcagagct tttaccagca ggagctggaa atggtggagt ctttgctgtc
961 ccttgccaat cagcctgtga ttcacagtgc ctcctccgac caagtgaatt
ttaagaagga 1021 caccacttcc aaggcaattc atagtatatt taagaatgct
atacaactgc tgcaggaaaa 1081 aggacttgtt ttccagaaag atgatggttt
tgataaccta tactatgtaa ccagagaaga 1141 caaagacctg cacagaaaga
tccaccggat cattcagcag gactgccaga aaccaaatca 1201 catggagaag
ggctgtcact tcctgcacat cttggcctgt gctcgcctga gcatccgccc 1261
gggcctgagc gaggctgtgc tgcagcaagt tctggagctc ctggaggacc agagtgacat
1321 tgtcagcaca atggagcact actacacagc gttctgagca gagacacgca
gaccagctga 1381 ggaggacaaa gataaggtgg cattcacccc caggctctga
ctttcagcat catgcagggg 1441 cttatctgtc tggaggcagt tacctcataa
taaactataa aatatagtca tcttgggaat 1501 gggatttggc ataaatgttg
ttggctccct tctgtccact atgtccttgg tgtacaatga 1561 ctttgatctc
agccatgaca caacaagaaa accctccctg ttgagctcct ggctggactg 1621
tgcgttgttc gcagagcaga atggggagga aacagtgttg gcagcttaac tgatgtgtgt
1681 ggttggagtc tcttccatgg caaagggaca ccacagggta gtgaacattc
aggaactgag 1741 gggcatatgg cctgatcaca cagttctaag cttttcaaaa
cttcaggtta tcagagacct 1801 tcctgtgggc ctctcttgct ggctaagaac
cggtttaggg gagtagttct ccctggatga 1861 gtgcttacag tttctgtggc
tcagttacca gcagtggggt tgagacctgg gtcgatgctc 1921 tttacaggcc
tgcccagaga tgggaataaa cagggatcca cagcgtgact atgtgtttgt 1981
cattttcctt ttatttcctt gggaatcgaa aggtgtccca gtacatttcc ctgcacttac
2041 agaggtgcat gactaaatac attgtccctc gatgcccctg aagatcacgg
aggcagtcag 2101 ccaattgcct ggcaggtggt agatgttatt ttcagggttg
ccgctgagtg tgcaggatgt 2161 gctgacacca tccagacaaa gactcggtat
gtgcccagac aggtgatgga gtcatgcttt 2221 tgctcagaat gacaaggtaa
aggaaaaaca tctgaggtat gttgtaggcc tgttctgaca 2281 gcaaaatgac
aaatccagcc agcaaaaata aagtgtggag aaagatttgg agttaattac 2341
agtcatttca cagaaggcac tgccttcgtc tgctgcattt gctcttgatg tgataagctc
2401 ttcgtggctc agctggagat cctttaggcc tggagagttg ctcctctctc
cgtggaaaca 2461 ggacagtctt tatacgcaga agtccgctgc agctcgatac
gtcaggctga gagctagaac 2521 cagtagattg cctcctgtca tagacttttg
taatgatgca aacctttgct gatttctaac 2581 agtgattatg tagtggctgc
cctgcatctt ctctgtgtac agaagggtcc ctagcataga 2641 gtctgcctgg
aatgatgtcc tgggcagttc ttccttgagg tcagcagctg ttccacgttg 2701
aatgcatctg attagtgggg ctgcccagga aggagttcag aatcagaagg taaaaagggc
2761 atacccttgc ctatagcaac tctgctctta ggggtttatc tcaaggagat
ggctacacaa 2821 gtgtgaaagg atggttgcac aaggtgttca ttgctgtata
atctagaatt ctatattggg 2881 gaaaatacct atagggaaaa agttaattac
ggttcttggg cacaatgaaa tactatgcag 2941 ctatgaaaaa aatgatgaaa
gcagacagac agtgttgcca tggcacactg tccctagtag 3001 atttagtggg
aagtagatag agttatagat ctgtttctat agtataacac cattatctac 3061
agctccctgt gtgtatgtat atatccgtag agagagtgta tatttctgca tggaggtctt
3121 tataaatgta gcacatgtac atatatatat atatacacac acacagtcga
ccactccctt 3181 ctcctggaag tactttccgc gtttggcttt caggacacca
agctctctgg ttgctccttc 3241 tcaggttcct ttgttcagtg ctctgcctcc
ctgaggactc agtcccagac ctcttttcta 3301 tctggcttgc tcactggggt
gtctccagca gccacatgga ttataccatc tacatgctgt 3361 ctaacacctc
agtttaaacc cagaatgggc ctcttccctg aactgcagac ccctatattc 3421
agtttgctac tgacatctcc acttaggtct ctaatggaca tctcagattt cacaggccca
3481 aagccaggct cccaattact cctgacccca ggcttgctcc tgatagtgac
atgaggcagc 3541 caaatgccta ggcagagagg ggagggtccc aaatgaaacc
ccacgttcaa gcaaagatca 3601 gcctgaaggc taaaagacca gattgctggt
cctggatgaa acccaccacg cagagtggga 3661 acttctgttc ctgtttgccc
accctttccc aattgttctt tctgaataac gccttaacca 3721 atcgaatgtt
gccttttcca gtaataccta cagcctgccc ctccccccat tctgagccca 3781
taaaaagacc cagactcccc catattaagg ggactttcct gcctttgggt agggggacca
3841 cccccacgtc tcctctctgt tgaaaactgt ttcatcactc aataaaactc
ccagctttgc 3901 tcactcttcc actgtcagca cattctcatt cttctttggt
gctgggcaag aactcaacca 3961 gtgtggaagc catacttggc ccaggcgggt
gaagtgggcg ggccgtctcc tgcagcaggt 4021 agcatggtca agcgaggccc
aggtgggccg tcaccagcca gaggtccctg gcttgcaaag 4081 tgaccgagaa
aaaaatcctg tgccactcct ttggaaaatg tccctgattc aggaagaggt 4141
agctccatcc agttgctcaa accaaatcca ttggcttctt tctttctatc atacctcaca
4201 tccaatctgt ctgcaagtct tttggctcta ccttcagaat atctccagaa
tcttaactgc 4261 ttcaccctcc tccccggcct cctcagtcct ctctgcttcc
gccctggccc ctcttgggct 4321 gttcacagca cagcagctgt tgccaccctg
ttaatgctcc cactctccta cagccttcgg 4381 tcttgcccca ggtaggagcc
tgaggctgca cagaggtcag cacggccccg cttaccctgc 4441 cctcccagcc
cagccgcacg ggccttgcac acatgcctcg gcatattcct gccttagggc 4501
tggtgctcct gctatttcct cttcccaggt aaccatgtga agtgcctccc tctgccctct
4561 ttccagcctt tacttgagtg tcaccttctc agtgaggcct gccctcattc
ctctttcgct 4621 gtttgcaacc catctcctgt cccccttccc agaactccct
ttcctacttc gtttttcttc 4681 acagtacttg atactgccta acacactcca
tggtttctta cttgccctgt ttattatttt 4741 cccccaatag acagaatgtt
ccatgatggc agaattctct gttttgtttc cttccatgtc 4801 cccagcacct
agaacagtgc ctgacgcatc tcctaagcaa tacgaccaat aagtatgtgt 4861
ctggctgcct tccggctgcc agtgtctgcc tctttcctag gggcagtggt tgcgggggtg
4921 ctttctcaca tgtcttagta ggctgtgcag gctggaagtg ctcagaagtc
acacccccag 4981 ggagcagcct cagccaacag caccttggct gtaaatgccc
cagctccctc gccctcaggt 5041 aagcattgct gaggcacacg ttccatactc
ttttccacag ttcctccgtg ggactgagca 5101 ccacccagcc acccacagga
gcagctaacc tgataaccac cagcctcacc ctccctgcct 5161 tacttccccg
ctccccttta ccacatgctg acctcccaga tgcatttctt gctttccggt 5221
ctctgtctca ggattggctc ctggatgaac acaaactaac actatgttca caaatatatt
5281 tgggaaatgc tggatgaata attatacaca tcagacagat tactagaaat
tctcaccaaa 5341 gggatgcaca tgttacctct gcatggtgag atctcaggtg
ctttttaccc cacatagcta 5401 tcctttggca tttttataat tagcaagtgc
tcactcttcc actgtcagta cattctcatt 5461 cttcttgggc gctggacaag
aattcaaccg gtgtgtaagc cagactcggc ccgggcagtc 5521 tcaaactcct
gactccttat ataatttcta caaaaattat aaagctattt cccactcccc 5581
accccacatt catgtaacct gaagcatgag taaaccaaga atgaggtagg cctctgtctt
5641 ctaagcaaca tcagaactct aagaacatga gggactctta gaaaactctc
tggagctaac 5701 cacagctggg tcactgctca tgtactgaag accagccaga
gggttcccct gaaaaggagg 5761 gaaactgagc aaacattctc cagttctctt
agtgtgcaca tgtttcagga ggtgtgaacc 5821 ccacatgtag cttgtgtagg
caagaagaca aatagtgcta ctgtctggtc aaggatttgt 5881 ttgaagagcc
atgattatgc ccatatggta agccaccagt gctccccatc cctgtaagac 5941
acttctttct cattattttc tcctctgatg gtgtgccagg atgctggcca agagaagcca
6001 agtggaaaga aggctgttca gtgacaagga acctaagact tagtgccaag
gactgaaacc 6061 aagtaaactt gtaattttcc atgatggaaa catctacact
ttctcattag tggcctctac 6121 agcagttgcc ccaaagaagc gtctcattgt
ttttttacta catttatgtg aagcatacag 6181 gcaaactcag aaagactgtg
ataaggctcg ccagagatgc ctgcacaggt gctgggggaa 6241 aagcaggacc
atcctgaagg gagatggtgt ctgtggacaa agaactctgc agtggttctt 6301
atttgcatga tttctgctgg tggaggctgt aaatgtgagc tcaaactccc acataagtga
6361 gttttcattg taatccagaa tgtttttaaa tcaccctact tctattgaac
ttgcactatc 6421 atctgttaac ctctactgta tttattaaat aaacctgaat
aggtaaatca cagtacagca 6481 aaa
[0265] In some embodiments of the methods of the disclosure, the
wild type human OBFC1 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_079204.2):
TABLE-US-00038 (SEQ ID NO: 24) 1 mqpgssrcee etpsllwgld pvflafakly
irdildmkes rqvpgvflyn ghpikqvdvl 61 gtvigvrerd afysygvdds
tgvincicwk klntesvsaa psaarelslt sqlkklqeti 121 eqktkieigd
tirvrgsirt yreereihat tyykvddpvw niqiarmlel ptiyrkvydq 181
pfhssaleke ealsnpgald lpsltsllse kakeflmenr vqsfyqqele mvesllslan
241 qpvihsassd qvnfkkdtts kaihsifkna iqllqekglv fqkddgfdnl
yyvtredkdl 301 hrkihriiqq dcqkpnhmek gchflhilac arlsirpgls
eavlqqvlel ledqsdivst 361 mehyytaf
[0266] In some embodiments of the methods of the disclosure, the
wild type human ATP11A gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_015205.2, transcript variant 1):
TABLE-US-00039 (SEQ ID NO: 25) 1 gcggccgcac tagtaccccg gagcccatgg
gcgcgccgag ccgggcgcgg gggcgctgaa 61 cggcggagcg ggagcggccg
gaggagccat ggactgcagc ctcgtgcgga cgctcgtgca 121 cagatactgt
gcaggagaag agaattgggt ggacagcagg accatctacg tgggacacag 181
ggagccacct ccgggcgcag aggcctacat cccacagaga tacccagaca acaggatcgt
241 ctcgtccaag tacacatttt ggaactttat acccaagaat ttatttgaac
aattcagaag 301 agtagccaac ttttatttcc ttatcatatt tctggtgcag
ttgattattg atacacccac 361 aagtccagtg acaagcggac ttccactctt
ctttgtcatt actgtgacgg ctatcaaaca 421 gggttatgaa gactggcttc
gacataaagc agacaatgcc atgaaccagt gtcctgttca 481 tttcattcag
cacggcaagc tcgttcggaa acaaagtcga aagctgcgag ttggggacat 541
tgtcatggtt aaggaggacg agacctttcc ctgcgacttg atcttccttt ccagcaaccg
601 gggagatggg acgtgccacg tcaccaccgc cagcttggat ggagaatcca
gccataaaac 661 gcattacgcg gtccaggaca ccaaaggctt ccacacagag
gaggatatcg gcggacttca 721 cgccaccatc gagtgtgagc agccccagcc
cgacctctac aagttcgtgg gtcgcatcaa 781 cgtttacagt gacctgaatg
accccgtggt gaggccctta ggatcggaaa acctgctgct 841 tagaggagct
acactgaaga acactgagaa aatctttggt gtggctattt acacgggaat 901
ggaaaccaag atggcattaa attatcaatc aaaatctcag aagcgatctg ccgtggaaaa
961 atcgatgaat gcgttcctca ttgtgtatct ctgcattctg atcagcaaag
ccctgataaa 1021 cactgtgctg aaatacatgt ggcagagtga gccctttcgg
gatgagccgt ggtataatca 1081 gaaaacggag tcggaaaggc agaggaatct
gttcctcaag gcattcacgg acttcctggc 1141 cttcatggtc ctctttaact
acatcatccc tgtgtccatg tacgtcacgg tcgagatgca 1201 gaagttcctc
ggctcttact tcatcacctg ggacgaagac atgtttgacg aggagactgg 1261
cgaggggcct ctggtgaaca cgtcggacct caatgaagag ctgggacagg tggagtacat
1321 cttcacagac aagaccggca ccctcacgga aaacaacatg gagttcaagg
agtgctgcat 1381 cgaaggccat gtctacgtgc cccacgtcat ctgcaacggg
caggtcctcc cagagtcgtc 1441 aggaatcgac atgattgact cgtcccccag
cgtcaacggg agggagcgcg aggagctgtt 1501 tttccgggcc ctctgtctct
gccacaccgt ccaggtgaaa gacgatgaca gcgtagacgg 1561 ccccaggaaa
tcgccggacg gggggaaatc ctgtgtgtac atctcatcct cgcccgacga 1621
ggtggcgctg gtcgaaggtg tccagagact tggctttacc tacctaaggc tgaaggacaa
1681 ttacatggag atattaaaca gggagaacca catcgaaagg tttgaattgc
tggaaatttt 1741 gagttttgac tcagtcagaa ggagaatgag tgtaattgta
aaatctgcta caggagaaat 1801 ttatctgttt tgcaaaggag cagattcttc
gatattcccc cgagtgatag aaggcaaagt 1861 tgaccagatc cgagccagag
tggagcgtaa cgcagtggag gggctccgaa ctttgtgtgt 1921 tgcttataaa
aggctgatcc aagaagaata tgaaggcatt tgtaagctgc tgcaggctgc 1981
caaagtggcc cttcaagatc gagagaaaaa gttagcagaa gcctatgagc aaatagagaa
2041 agatcttact ctgcttggtg ctacagctgt tgaggaccgg ctgcaggaga
aagctgcaga 2101 caccatcgag gccctgcaga aggccgggat caaagtctgg
gttctcacgg gagacaagat 2161 ggagacggcc gcggccacgt gctacgcctg
caagctcttc cgcaggaaca cgcagctgct 2221 ggagctgacc accaagagga
tcgaggagca gagcctgcac gacgtcctgt tcgagctgag 2281 caagacggtc
ctgcgccaca gcgggagcct gaccagagac aacctgtccg gactttcagc 2341
agatatgcag gactacggtt taattatcga cggagctgca ctgtctctga taatgaagcc
2401 tcgagaagac gggagttccg gcaactacag ggagctcttc ctggaaatct
gccggagctg 2461 cagcgcggtg ctctgctgcc gcatggcgcc cttgcagaag
gctcagattg ttaaattaat 2521 caaattttca aaagagcacc caatcacgtt
agcaattggc gatggtgcaa atgatgtcag 2581 catgattctg gaagcgcacg
tgggcatagg tgtcatcggc aaggaaggcc gccaggctgc 2641 caggaacagc
gactatgcaa tcccaaagtt taagcatttg aagaagatgc tgcttgttca 2701
cgggcatttt tattacatta ggatctctga gctcgtgcag tacttcttct ataagaacgt
2761 ctgcttcatc ttccctcagt ttttatacca gttcttctgt gggttttcac
aacagacttt 2821 gtacgacacc gcgtatctga ccctctacaa catcagcttc
acctccctcc ccatcctcct 2881 gtacagcctc atggagcagc atgttggcat
tgacgtgctc aagagagacc cgaccctgta 2941 cagggacgtc gccaagaatg
ccctgctgcg ctggcgcgtg ttcatctact ggacgctcct 3001 gggactgttt
gacgcactgg tgttcttctt tggtgcttat ttcgtgtttg aaaatacaac 3061
tgtgacaagc aacgggcaga tatttggaaa ctggacgttt ggaacgctgg tattcaccgt
3121 gatggtgttc acagttacac taaagcttgc attggacaca cactactgga
cttggatcaa 3181 ccattttgtc atctgggggt cgctgctgtt ctacgttgtc
ttttcgcttc tctggggagg 3241 agtgatctgg ccgttcctca actaccagag
gatgtactac gtgttcatcc agatgctgtc 3301 cagcgggccc gcctggctgg
ccatcgtgct gctggtgacc atcagcctcc ttcccgacgt 3361 cctcaagaaa
gtcctgtgcc ggcagctgtg gccaacagca acagagagag tccagactaa 3421
gagccagtgc ctttctgtcg agcagtcaac catctttatg ctttctcaga cttccagcag
3481 cctgagtttc tgatggaaca agagcccagg ctaccagagc acctgtccct
cggccgcctg 3541 gtacagctcc cactctcagc aggtgacact cgcggcctgg
aaggagaagg tgtccacgga 3601 gcccccaccc atcctcggcg gttcccatca
ccactgcagt tccatcccaa gtcacagctg 3661 ccctaggtcc cgtgtgggaa
tgctcgtgtg atggatggtc ctaagcctgt ggagactgtg 3721 cacgtgcctc
ttcctggccc ccagcaggca aggagggggg tcacaggcct tgccctcgag 3781
catggcaccc tggccgcctg gacccagcac tgtggttgtt gagccacacc agtggcctct
3841 gggcattcgg ctcaacgcag gagggacatt ctgctggccc accctgcgcg
ctgtcatgca 3901 gaggccattc ccccaggcct gtgtcttcac ccacctgcca
tcattggcct ttgctgtcac 3961 tgggagagaa gagccgtcca gggacccatg
gtggcccaca tgtggatgcc acatgctgct 4021 gtttcctgct tgcccggcca
ccacccatgc cctccatagg gtgaggtgga gccatggtgg 4081 tgcgtccttt
actcaacaac cctccaatcc ggatgctgtg ggaagggccg ggtcactcgg 4141
ataccatcat ccctgcggat gcaccgccgt accctgctca tctgggagtg gtttccctgc
4201 ggttacgtcc aagcccgcct gccctgtgtg ttggggctgg ctgagtttcg
gtctccccat 4261 caccggccgc ctcgtggaga aggcagtgcc acgtgggagg
acaaggccac gccggcagct 4321 tccagccctg ccgcagaagt gccaggatgt
ccatcagcca ctcgccaggg cacggagccg 4381 tcagtccact gttacgggag
aatgttgatt tcgcgggtgc gagggccggg agacagatac 4441 ttggctgtga
tgagcagaca tcctctgtcc ccgtggaggg gtcaacacca aggtggtgtt 4501
cgtgcaccag aacctgtctc gggctgacgg gggtggcaca caggacacgg gtggatccca
4561 acaggcagca ccgcacctct gcccgcctcc cgcactgcag ctccgcccgc
cgggctctgc 4621 gtccccacgt cccctcgtcc catccccacg tcccctcatc
ccgtcacctc gtccccacat 4681 ccccttgccc cgtcacctcg tcctcatgtc
cccttgtcct gtcacctcgt ccccacgtcc 4741 cctcgtctcc tcatccccac
gtcctctcgt ccccttgtcc cgtccccaca taccctcgtc 4801 cccatgtccc
cacgcagggc tctccttcgt cttaggatct gtccagcgct gctctgggtg 4861
ggttagcaac cccagggctg ctgtgatagg aagtccctgt tgttctccgt actggcattt
4921 ctatttctag aaataatatt tgacatagcc ttaatggtcc ttaaagaaga
catttcagtg 4981 tgagattcag acttcagacg ctgaaactgc tgcctttcag
gaaagcacca ccaacgctgg 5041 aggaggagcc ggccctcacg cccgccccgc
gccacgctgt ggaacggggc tccggcaagt 5101 gaaacccaga gggtgtttcc
gaggtgctcg acagtaggta tttttggaag ctcagatttc 5161 accatttgat
tgtataatct tttacctata aaatatttat ttgaagtaga gggtaaatca 5221
gcggtaagaa cagtgaacac agtggttggg ataaaataag gtgacaaaca tcacaccaaa
5281 gatgagggta gcgagcaact ggcttgagca gacagaacgg ggaagactcc
actctgtccc 5341 gaggggccag ccgcaggcgt ccccagggcc accctgccct
gaggtccttg tgtggccgcc 5401 ctggcttggc agccctgccc acgctgcccc
cgcaaacaat ggtgtgtgcg tttttacagc 5461 cctttttagg aacccaatat
gggcataaat gtaacacctg tagcgggggc agattctctg 5521 tatgttcagt
taacaaatta tttgtaatgt atttttttag aaatcttaaa attgcctttg 5581
cactgaagta ttttcatagc tgtttatatc tcttttattc atttatttaa catactgtct
5641 aattttaaaa ataggttttt aaagctttca tttttaagtt tatgaaattt
tggccacttt 5701 acatttagat tctggtgaga gttttgactg aatgttccaa
tctctgatga atgcgaattt 5761 tcagatttga ttttattctc tacacacacc
tcttcttttc ttggtatttc tggtggcagt 5821 gattagttga acagcacatt
taaggcacga taatttgcta cactttttct ttacaatttg 5881 ttgcaatttc
atctgctttc tatgtttcat tgttaattgc catccttcag ccttaaaaat 5941
agaagattct cacgtgaagg tttagtaagt tgggtcccag ctctgcctgt gtggagatag
6001 tcaccatgta cctctgacaa caagttttag tgtgaaagtc actaaacttt
tacacactcc 6061 caaacgtctt tttaaaaatt gcttgggaaa ttattaaatg
aatgtgcctg atgatttgaa 6121 atagacaagg ggcacgagat aaaaaagaaa
aggatgagaa gatcctcagt gaatgacgtt 6181 gcagggtctt catgcaattt
tccacctcgc agtagttagt atttacttgc cttaaactaa 6241 ctttgaagca
agtaatgtca actttgagca ctttgttgag ttttgaaaaa tcttatttgt 6301
tgctgcacag gttaataaat tatcaatttg taattcagca tgttggtcag agacacggtc
6361 actgattcac acccagtccc tgccacagac cgtctcagac acgcacagtg
ggcctgctgc 6421 atgattcaca cccagtccct gccacagacc gtctcagaca
cgcacagtgg gcctgctgca 6481 tgattcacac ccagtccctg ccacagaccg
tctcagacac gcacagtggg cctgctgcat 6541 gcgtgttacc tggcttttgg
ctccacgctc actcatagcc atgtccacat gggggcttgc 6601 acacaggatc
actcacatat gtacatgtac ccaccacaaa cgtgcaagct cctgcacaca 6661
tgcatgcaca caaacgtgta cacaagtgtg agctcctaca cgcatacaca cacacacgtg
6721 tacatgcacc aaagcatgtg tgacctacag acatgcagaa catgcacgtg
tacacatacc 6781 acagacacgc gtgtgcatgc tcctacacaa tacatatgca
catatcatga acagcgtaag 6841 ttcctacaca cggacgtgtg atacacacat
gcatgtacag gtaagcacac atgtacaagc 6901 tcctacaggc ttgctctcac
acacgtgtat gcacagcaga gagacgtatg agcttctact 6961 gcacacatgc
acacacacac gcacacgtac attcactaca aacgtgcagc ctcctgcaca 7021
cgtgcacatt catgtgtaca ccacaaatga gttcccagac gtgtaaacac acgtgcacac
7081 atcgtacaca tgtgagctcc cacacgtaca cacagatgca catggacaca
ccccaaacac 7141 gcacaggctc ctacacacat gcacacacgt gtacaccaca
aacgagctcc cagacatgta 7201 aacacacgtc tcccacacgt gagctcccac
acgtacacat gcacatgtac gcaccacaaa 7261 cacatgcgca ggctcctgca
ggcgtgaata cacacatgca cacacatata cacacatgtg 7321 ccacaaacaa
gtgcacactg tcctggtgtc ctgcactgca tcctgcctcc ttgctgaggg 7381
gcccctgtga gaggcctctg gatgggcatg ggaagatggg ctccctggcc cccagcccat
7441 gcctccctgg gatgaagagt ccccctcctg gcagaatgtc tgggctttgc
agagcaggcc
7501 ccgggggtga agtcgcagct tcacttacac cagctgctct gtgagcaagg
cttggtgccc 7561 tggacaaggc ccttcccctt tagggaggtc cagcctcgca
agctgaaacc tcccctcggc 7621 tcagccctat accaggcggc cacagcagga
ctggccacac ccacgccgca cctcatccgt 7681 gcacgcgtcg gagcacggcc
agccttccgc cacgagccag ctgggaaggg ccgcggccgc 7741 ctaaagcccc
agtcaaccca gcctgtgtct gagcagacag ggcgaacaag caggccacac 7801
cgtctcgagg gaggaggcca gatgcggcca gcgtctccaa cagggtgacc atccgctcgg
7861 cttgctgagc gtttaaacaa atgtttagac aggctgtggg gactcccctg
agttgagcct 7921 tggccagggg tccggtgctg tcgcgggaaa cctccagcct
tgttcttcaa accactcagc 7981 tcatgtgttt tgcactgact agtactgaat
aatacaacca ctcttattta atgttagtat 8041 tatttatttg acaactcagt
gtctaacagc ttgatatgca ggtccttgca tcctacattt 8101 ctttaggaag
ttacccattt gtaactttaa aaacaggaaa aatatcagtt ggcaaatgca 8161
atcttttttt tttttaagct aaaggtgggt gaactggaat gaaaatcttt ctgatgttgt
8221 gtctataagc agccttgatg ggatatgtta gaagtgtcat gaaagtgtga
ttctactttt 8281 gcagaaaaat ctaaagatca atttatatag ctttattttt
tactttatca aagtatacag 8341 aattttaata tgcatatatt gtgtctgact
taaaattata atgtctgcgt caccatttaa 8401 aatgtctgtt cattatgtaa
tgtaataaaa gaaggtcttc aaaaatgtat ttaacatgaa 8461 tggtatccat
agttgtcatc atcataaata ctggagttta tttttaaatt attaaacata 8521
gtaggtgcat taacataaat cagtctccac acagtaacat ttaactgata attcattaat
8581 cagctttgaa aaattaaatt gttaattaaa ccaatctaac atttcagtaa
agtttatttt 8641 gtatgcttct gtttttaact tttatttctg tagataaact
gactggataa tattatattg 8701 gacttttctc tagattatct aagcaggaga
cctgaatctg cttgcaataa agaataaaag 8761 tctgcttcag tttctttata
aagaaactca cacaa
[0267] In some embodiments of the methods of the disclosure, the
wild type human ATP11A gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_056020.2, transcript variant 1):
TABLE-US-00040 (SEQ ID NO: 26) 1 mdcslvrtlv hrycageenw vdsrtiyvgh
repppgaeay ipqrypdnri vsskytfwnf 61 ipknlfeqfr rvanfyflii
flvqliidtp tspvtsglpl ffvitvtaik qgyedwlrhk 121 adnamnqcpv
hfiqhgklvr kqsrklrvgd ivmvkedetf pcdliflssn rgdgtchvtt 181
asldgesshk thyavqdtkg fhteediggl hatieceqpq pdlykfvgri nvysdlndpv
241 vrplgsenll lrgatlknte kifgvaiytg metkmalnyq sksqkrsave
ksmnaflivy 301 lciliskali ntvlkymwqs epfrdepwyn qkteserqrn
lflkaftdfl afmvlfnyii 361 pvsmyvtvem qkflgsyfit wdedmfdeet
gegplvntsd lneelgqvey iftdktgtlt 421 ennmefkecc ieghvyvphv
icngqvlpes sgidmidssp svngrereel ffralclcht 481 vqvkdddsvd
gprkspdggk scvyissspd evalvegvqr lgftylrlkd nymeilnren 541
hierfellei lsfdsvrrrm svivksatge iylfckgads sifprviegk vdqirarver
601 naveglrtlc vaykrliqee yegickllqa akvalqdrek klaeayeqie
kdltllgata 661 vedrlqekaa dtiealqkag ikvwvltgdk metaaatcya
cklfrrntql lelttkriee 721 qslhdvlfel sktvlrhsgs ltrdnlsgls
admqdyglii dgaalslimk predgssgny 781 relfleicrs csavlccrma
plqkaqivkl ikfskehpit laigdgandv smileahvgi 841 gvigkegrqa
arnsdyaipk fkhlkkmllv hghfyyiris elvqyffykn vcfifpqfly 901
qffcgfsqqt lydtayltly nisftslpil lyslmeqhvg idvlkrdptl yrdvaknall
961 rwrvfiywtl lglfdalvff fgayfvfent tvtsngqifg nwtfgtlvft
vmvftvtlkl 1021 aldthywtwi nhfviwgsll fyvvfsllwg gviwpflnyq
rmyyvfiqml ssgpawlaiv 1081 llvtisllpd vlkkvlcrql wptatervqt
ksqclsveqs tifmlsqtss slsf
[0269] In some embodiments of the methods of the disclosure, the
wild type human ATP11A gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_032189.3, transcript variant 2):
TABLE-US-00041 (SEQ ID NO: 48) 1 gcggccgcac tagtaccccg gagcccatgg
gcgcgccgag ccgggcgcgg gggcgctgaa 61 cggcggagcg ggagcggccg
gaggagccat ggactgcagc ctcgtgcgga cgctcgtgca 121 cagatactgt
gcaggagaag agaattgggt ggacagcagg accatctacg tgggacacag 181
ggagccacct ccgggcgcag aggcctacat cccacagaga tacccagaca acaggatcgt
241 ctcgtccaag tacacatttt ggaactttat acccaagaat ttatttgaac
aattcagaag 301 agtagccaac ttttatttcc ttatcatatt tctggtgcag
ttgattattg atacacccac 361 aagtccagtg acaagcggac ttccactctt
ctttgtcatt actgtgacgg ctatcaaaca 421 gggttatgaa gactggcttc
gacataaagc agacaatgcc atgaaccagt gtcctgttca 481 tttcattcag
cacggcaagc tcgttcggaa acaaagtcga aagctgcgag ttggggacat 541
tgtcatggtt aaggaggacg agacctttcc ctgcgacttg atcttccttt ccagcaaccg
601 gggagatggg acgtgccacg tcaccaccgc cagcttggat ggagaatcca
gccataaaac 661 gcattacgcg gtccaggaca ccaaaggctt ccacacagag
gaggatatcg gcggacttca 721 cgccaccatc gagtgtgagc agccccagcc
cgacctctac aagttcgtgg gtcgcatcaa 781 cgtttacagt gacctgaatg
accccgtggt gaggccctta ggatcggaaa acctgctgct 841 tagaggagct
acactgaaga acactgagaa aatctttggt gtggctattt acacgggaat 901
ggaaaccaag atggcattaa attatcaatc aaaatctcag aagcgatctg ccgtggaaaa
961 atcgatgaat gcgttcctca ttgtgtatct ctgcattctg atcagcaaag
ccctgataaa 1021 cactgtgctg aaatacatgt ggcagagtga gccctttcgg
gatgagccgt ggtataatca 1081 gaaaacggag tcggaaaggc agaggaatct
gttcctcaag gcattcacgg acttcctggc 1141 cttcatggtc ctctttaact
acatcatccc tgtgtccatg tacgtcacgg tcgagatgca 1201 gaagttcctc
ggctcttact tcatcacctg ggacgaagac atgtttgacg aggagactgg 1261
cgaggggcct ctggtgaaca cgtcggacct caatgaagag ctgggacagg tggagtacat
1321 cttcacagac aagaccggca ccctcacgga aaacaacatg gagttcaagg
agtgctgcat 1381 cgaaggccat gtctacgtgc cccacgtcat ctgcaacggg
caggtcctcc cagagtcgtc 1441 aggaatcgac atgattgact cgtcccccag
cgtcaacggg agggagcgcg aggagctgtt 1501 tttccgggcc ctctgtctct
gccacaccgt ccaggtgaaa gacgatgaca gcgtagacgg 1561 ccccaggaaa
tcgccggacg gggggaaatc ctgtgtgtac atctcatcct cgcccgacga 1621
ggtggcgctg gtcgaaggtg tccagagact tggctttacc tacctaaggc tgaaggacaa
1681 ttacatggag atattaaaca gggagaacca catcgaaagg tttgaattgc
tggaaatttt 1741 gagttttgac tcagtcagaa ggagaatgag tgtaattgta
aaatctgcta caggagaaat 1801 ttatctgttt tgcaaaggag cagattcttc
gatattcccc cgagtgatag aaggcaaagt 1861 tgaccagatc cgagccagag
tggagcgtaa cgcagtggag gggctccgaa ctttgtgtgt 1921 tgcttataaa
aggctgatcc aagaagaata tgaaggcatt tgtaagctgc tgcaggctgc 1981
caaagtggcc cttcaagatc gagagaaaaa gttagcagaa gcctatgagc aaatagagaa
2041 agatcttact ctgcttggtg ctacagctgt tgaggaccgg ctgcaggaga
aagctgcaga 2101 caccatcgag gccctgcaga aggccgggat caaagtctgg
gttctcacgg gagacaagat 2161 ggagacggcc gcggccacgt gctacgcctg
caagctcttc cgcaggaaca cgcagctgct 2221 ggagctgacc accaagagga
tcgaggagca gagcctgcac gacgtcctgt tcgagctgag 2281 caagacggtc
ctgcgccaca gcgggagcct gaccagagac aacctgtccg gactttcagc 2341
agatatgcag gactacggtt taattatcga cggagctgca ctgtctctga taatgaagcc
2401 tcgagaagac gggagttccg gcaactacag ggagctcttc ctggaaatct
gccggagctg 2461 cagcgcggtg ctctgctgcc gcatggcgcc cttgcagaag
gctcagattg ttaaattaat 2521 caaattttca aaagagcacc caatcacgtt
agcaattggc gatggtgcaa atgatgtcag 2581 catgattctg gaagcgcacg
tgggcatagg tgtcatcggc aaggaaggcc gccaggctgc 2641 caggaacagc
gactatgcaa tcccaaagtt taagcatttg aagaagatgc tgcttgttca 2701
cgggcatttt tattacatta ggatctctga gctcgtgcag tacttcttct ataagaacgt
2761 ctgcttcatc ttccctcagt ttttatacca gttcttctgt gggttttcac
aacagacttt 2821 gtacgacacc gcgtatctga ccctctacaa catcagcttc
acctccctcc ccatcctcct 2881 gtacagcctc atggagcagc atgttggcat
tgacgtgctc aagagagacc cgaccctgta 2941 cagggacgtc gccaagaatg
ccctgctgcg ctggcgcgtg ttcatctact ggacgctcct 3001 gggactgttt
gacgcactgg tgttcttctt tggtgcttat ttcgtgtttg aaaatacaac 3061
tgtgacaagc aacgggcaga tatttggaaa ctggacgttt ggaacgctgg tattcaccgt
3121 gatggtgttc acagttacac taaagcttgc attggacaca cactactgga
cttggatcaa 3181 ccattttgtc atctgggggt cgctgctgtt ctacgttgtc
ttttcgcttc tctggggagg 3241 agtgatctgg ccgttcctca actaccagag
gatgtactac gtgttcatcc agatgctgtc 3301 cagcgggccc gcctggctgg
ccatcgtgct gctggtgacc atcagcctcc ttcccgacgt 3361 cctcaagaaa
gtcctgtgcc ggcagctgtg gccaacagca acagagagag tccagaatgg 3421
gtgcgcacag cctcgggacc gcgactcaga attcacccct cttgcctctc tgcagagccc
3481 aggctaccag agcacctgtc cctcggccgc ctggtacagc tcccactctc
agcaggtgac 3541 actcgcggcc tggaaggaga aggtgtccac ggagccccca
cccatcctcg gcggttccca 3601 tcaccactgc agttccatcc caagtcacag
ctgccctagg tcccgtgtgg gaatgctcgt 3661 gtgatggatg gtcctaagcc
tgtggagact gtgcacgtgc ctcttcctgg cccccagcag 3721 gcaaggaggg
gggtcacagg ccttgccctc gagcatggca ccctggccgc ctggacccag 3781
cactgtggtt gttgagccac accagtggcc tctgggcatt cggctcaacg caggagggac
3841 attctgctgg cccaccctgc gcgctgtcat gcagaggcca ttcccccagg
cctgtgtctt 3901 cacccacctg ccatcattgg cctttgctgt cactgggaga
gaagagccgt ccagggaccc 3961 atggtggccc acatgtggat gccacatgct
gctgtttcct gcttgcccgg ccaccaccca 4021 tgccctccat agggtgaggt
ggagccatgg tggtgcgtcc tttactcaac aaccctccaa 4081 tccggatgct
gtgggaaggg ccgggtcact cggataccat catccctgcg gatgcaccgc 4141
cgtaccctgc tcatctggga gtggtttccc tgcggttacg tccaagcccg cctgccctgt
4201 gtgttggggc tggctgagtt tcggtctccc catcaccggc cgcctcgtgg
agaaggcagt 4261 gccacgtggg aggacaaggc cacgccggca gcttccagcc
ctgccgcaga agtgccagga 4321 tgtccatcag ccactcgcca gggcacggag
ccgtcagtcc actgttacgg gagaatgttg 4381 atttcgcggg tgcgagggcc
gggagacaga tacttggctg tgatgagcag acatcctctg 4441 tccccgtgga
ggggtcaaca ccaaggtggt gttcgtgcac cagaacctgt ctcgggctga 4501
cgggggtggc acacaggaca cgggtggatc ccaacaggca gcaccgcacc tctgcccgcc
4561 tcccgcactg cagctccgcc cgccgggctc tgcgtcccca cgtcccctcg
tcccatcccc 4621 acgtcccctc atcccgtcac ctcgtcccca catccccttg
ccccgtcacc tcgtcctcat 4681 gtccccttgt cctgtcacct cgtccccacg
tcccctcgtc tcctcatccc cacgtcctct 4741 cgtccccttg tcccgtcccc
acataccctc gtccccatgt ccccacgcag ggctctcctt 4801 cgtcttagga
tctgtccagc gctgctctgg gtgggttagc aaccccaggg ctgctgtgat 4861
aggaagtccc tgttgttctc cgtactggca tttctatttc tagaaataat atttgacata
4921 gccttaatgg tccttaaaga agacatttca gtgtgagatt cagacttcag
acgctgaaac 4981 tgctgccttt caggaaagca ccaccaacgc tggaggagga
gccggccctc acgcccgccc 5041 cgcgccacgc tgtggaacgg ggctccggca
agtgaaaccc agagggtgtt tccgaggtgc 5101 tcgacagtag gtatttttgg
aagctcagat ttcaccattt gattgtataa tcttttacct 5161 ataaaatatt
tatttgaagt agagggtaaa tcagcggtaa gaacagtgaa cacagtggtt 5221
gggataaaat aaggtgacaa acatcacacc aaagatgagg gtagcgagca actggcttga
5281 gcagacagaa cggggaagac tccactctgt cccgaggggc cagccgcagg
cgtccccagg 5341 gccaccctgc cctgaggtcc ttgtgtggcc gccctggctt
ggcagccctg cccacgctgc 5401 ccccgcaaac aatggtgtgt gcgtttttac
agcccttttt aggaacccaa tatgggcata 5461 aatgtaacac ctgtagcggg
ggcagattct ctgtatgttc agttaacaaa ttatttgtaa 5521 tgtatttttt
tagaaatctt aaaattgcct ttgcactgaa gtattttcat agctgtttat 5581
atctctttta ttcatttatt taacatactg tctaatttta aaaataggtt tttaaagctt
5641 tcatttttaa gtttatgaaa ttttggccac tttacattta gattctggtg
agagttttga 5701 ctgaatgttc caatctctga tgaatgcgaa ttttcagatt
tgattttatt ctctacacac 5761 acctcttctt ttcttggtat ttctggtggc
agtgattagt tgaacagcac atttaaggca 5821 cgataatttg ctacactttt
tctttacaat ttgttgcaat ttcatctgct ttctatgttt 5881 cattgttaat
tgccatcctt cagccttaaa aatagaagat tctcacgtga aggtttagta 5941
agttgggtcc cagctctgcc tgtgtggaga tagtcaccat gtacctctga caacaagttt
6001 tagtgtgaaa gtcactaaac ttttacacac tcccaaacgt ctttttaaaa
attgcttggg 6061 aaattattaa atgaatgtgc ctgatgattt gaaatagaca
aggggcacga gataaaaaag 6121 aaaaggatga gaagatcctc agtgaatgac
gttgcagggt cttcatgcaa ttttccacct 6181 cgcagtagtt agtatttact
tgccttaaac taactttgaa gcaagtaatg tcaactttga 6241 gcactttgtt
gagttttgaa aaatcttatt tgttgctgca caggttaata aattatcaat 6301
ttgtaattca gcatgttggt cagagacacg gtcactgatt cacacccagt ccctgccaca
6361 gaccgtctca gacacgcaca gtgggcctgc tgcatgattc acacccagtc
cctgccacag 6421 accgtctcag acacgcacag tgggcctgct gcatgattca
cacccagtcc ctgccacaga 6481 ccgtctcaga cacgcacagt gggcctgctg
catgcgtgtt acctggcttt tggctccacg 6541 ctcactcata gccatgtcca
catgggggct tgcacacagg atcactcaca tatgtacatg 6601 tacccaccac
aaacgtgcaa gctcctgcac acatgcatgc acacaaacgt gtacacaagt 6661
gtgagctcct acacgcatac acacacacac gtgtacatgc accaaagcat gtgtgaccta
6721 cagacatgca gaacatgcac gtgtacacat accacagaca cgcgtgtgca
tgctcctaca 6781 caatacatat gcacatatca tgaacagcgt aagttcctac
acacggacgt gtgatacaca 6841 catgcatgta caggtaagca cacatgtaca
agctcctaca ggcttgctct cacacacgtg 6901 tatgcacagc agagagacgt
atgagcttct actgcacaca tgcacacaca cacgcacacg 6961 tacattcact
acaaacgtgc agcctcctgc acacgtgcac attcatgtgt acaccacaaa 7021
tgagttccca gacgtgtaaa cacacgtgca cacatcgtac acatgtgagc tcccacacgt
7081 acacacagat gcacatggac acaccccaaa cacgcacagg ctcctacaca
catgcacaca 7141 cgtgtacacc acaaacgagc tcccagacat gtaaacacac
gtctcccaca cgtgagctcc 7201 cacacgtaca catgcacatg tacgcaccac
aaacacatgc gcaggctcct gcaggcgtga 7261 atacacacat gcacacacat
atacacacat gtgccacaaa caagtgcaca ctgtcctggt 7321 gtcctgcact
gcatcctgcc tccttgctga ggggcccctg tgagaggcct ctggatgggc 7381
atgggaagat gggctccctg gcccccagcc catgcctccc tgggatgaag agtccccctc
7441 ctggcagaat gtctgggctt tgcagagcag gccccggggg tgaagtcgca
gcttcactta
7501 caccagctgc tctgtgagca aggcttggtg ccctggacaa ggcccttccc
ctttagggag 7561 gtccagcctc gcaagctgaa acctcccctc ggctcagccc
tataccaggc ggccacagca 7621 ggactggcca cacccacgcc gcacctcatc
cgtgcacgcg tcggagcacg gccagccttc 7681 cgccacgagc cagctgggaa
gggccgcggc cgcctaaagc cccagtcaac ccagcctgtg 7741 tctgagcaga
cagggcgaac aagcaggcca caccgtctcg agggaggagg ccagatgcgg 7801
ccagcgtctc caacagggtg accatccgct cggcttgctg agcgtttaaa caaatgttta
7861 gacaggctgt ggggactccc ctgagttgag ccttggccag gggtccggtg
ctgtcgcggg 7921 aaacctccag ccttgttctt caaaccactc agctcatgtg
ttttgcactg actagtactg 7981 aataatacaa ccactcttat ttaatgttag
tattatttat ttgacaactc agtgtctaac 8041 agcttgatat gcaggtcctt
gcatcctaca tttctttagg aagttaccca tttgtaactt 8101 taaaaacagg
aaaaatatca gttggcaaat gcaatctttt ttttttttaa gctaaaggtg 8161
ggtgaactgg aatgaaaatc tttctgatgt tgtgtctata agcagccttg atgggatatg
8221 ttagaagtgt catgaaagtg tgattctact tttgcagaaa aatctaaaga
tcaatttata 8281 tagctttatt ttttacttta tcaaagtata cagaatttta
atatgcatat attgtgtctg 8341 acttaaaatt ataatgtctg cgtcaccatt
taaaatgtct gttcattatg taatgtaata 8401 aaagaaggtc ttcaaaaatg
tatttaacat gaatggtatc catagttgtc atcatcataa 8461 atactggagt
ttatttttaa attattaaac atagtaggtg cattaacata aatcagtctc 8521
cacacagtaa catttaactg ataattcatt aatcagcttt gaaaaattaa attgttaatt
8581 aaaccaatct aacatttcag taaagtttat tttgtatgct tctgttttta
acttttattt 8641 ctgtagataa actgactgga taatattata ttggactttt
ctctagatta tctaagcagg 8701 agacctgaat ctgcttgcaa taaagaataa
aagtctgctt cagtttcttt ataaagaaac 8761 tcacacaa
[0270] In some embodiments of the methods of the disclosure, the
wild type human ATP11A gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_115565.3, transcript variant 2):
TABLE-US-00042 (SEQ ID NO: 49) 1 mdcslvrtlv hrycageenw vdsrtiyvgh
repppgaeay ipqrypdnri vsskytfwnf 61 ipknlfeqfr rvanfyflii
flvqliidtp tspvtsglpl ffvitvtaik qgyedwlrhk 121 adnamnqcpv
hfiqhgklvr kqsrklrvgd ivmvkedetf pcdliflssn rgdgtchvtt 181
asldgesshk thyavqdtkg fhteediggl hatieceqpq pdlykfvgri nvysdlndpv
241 vrplgsenll lrgatlknte kifgvaiytg metkmalnyq sksqkrsave
ksmnaflivy 301 lciliskali ntvlkymwqs epfrdepwyn qkteserqrn
lflkaftdfl afmvlfnyii 361 pvsmyvtvem qkflgsyfit wdedmfdeet
gegplvntsd lneelgqvey iftdktgtlt 421 ennmefkecc ieghvyvphv
icngqvlpes sgidmidssp svngrereel ffralclcht 481 vqvkdddsvd
gprkspdggk scvyissspd evalvegvqr lgftylrlkd nymeilnren 541
hierfellei lsfdsvrrrm svivksatge iylfckgads sifprviegk vdqirarver
601 naveglrtlc vaykrliqee yegickllqa akvalqdrek klaeayeqie
kdltllgata 661 vedrlqekaa dtiealqkag ikvwvltgdk metaaatcya
cklfrrntql lelttkriee 721 qslhdvlfel sktvlrhsgs ltrdnlsgls
admqdyglii dgaalslimk predgssgny 781 relfleicrs csavlccrma
plqkaqivkl ikfskehpit laigdgandv smileahvgi 841 gvigkegrqa
arnsdyaipk fkhlkkmllv hghfyyiris elvqyffykn vcfifpqfly 901
qffcgfsqqt lydtayltly nisftslpil lyslmeqhvg idvlkrdptl yrdvaknall
961 rwrvfiywtl lglfdalvff fgayfvfent tvtsngqifg nwtfgtlvft
vmvftvtlkl 1021 aldthywtwi nhfviwgsll fyvvfsllwg gviwpflnyq
rmyyvfiqml ssgpawlaiv 1081 llvtisllpd vlkkvlcrql wptatervqn
gcaqprdrds eftplaslqs pgyqstcpsa 1141 awysshsqqv tlaawkekvs
tepppilggs hhhcssipsh scprsrvgml v
[0272] In some embodiments of the methods of the disclosure, the
wild type human IVD/DISP2 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_002225.3, transcript variant 1):
TABLE-US-00043 (SEQ ID NO: 50) 1 tttccgcagt taggggctgc tatttcaacg
cagggagata aaaagaaaaa aacacttgct 61 cttctacccc gctaaaaaca
ctcatcctag ggagcacgcc agcatttgca gcgttcgggg 121 cagggccact
cggcctgcgg ccgttgcact ggctggaagc tggcaggcga tcacggttga 181
ttggctcggg tgcggtccaa gggcagcaac gccttcggcg ggccgcctag ggtgattggc
241 tgctgcagcc caccccctag ccggtttggt gggcggcgaa gcctggattg
gtggagctaa 301 gagctggctc agtttcagcg ctggctcttc gtgcatggca
gagatggcga ctgcgactcg 361 gctgctgggg tggcgtgtgg cgagctggag
gctgcggccg ccgcttgccg gcttcgtttc 421 ccagcgggcc cactcgcttt
tgcccgtgga cgatgcaatc aatgggctaa gcgaggagca 481 gaggcagctt
cgtcagacca tggctaagtt ccttcaggag cacctggccc ccaaggccca 541
ggagatcgat cgcagcaatg agttcaagaa cctgcgagaa ttttggaagc agctggggaa
601 cctgggcgta ttgggcatca cagcccctgt tcagtatggc ggctccggcc
tgggctacct 661 ggagcatgtg ctggtgatgg aggagatatc ccgagcttcc
ggagcagtgg ggctcagtta 721 cggtgcccac tccaacctct gcatcaacca
gcttgtacgc aatgggaatg aggcccagaa 781 agagaagtat ctcccgaagc
tgatcagtgg tgagtacatc ggagccctgg ccatgagtga 841 gcccaatgca
ggctctgatg ttgtctctat gaagctcaaa gcggaaaaga aaggaaatca 901
ctacatcctg aatggcaaca agttctggat cactaatggc cctgatgctg acgtcctgat
961 tgtctatgcc aagacagatc tggctgctgt gccagcttct cggggcatca
cagccttcat 1021 tgtggagaag ggtatgcctg gctttagcac ctctaagaag
ctggacaagc tggggatgag 1081 gggctctaac acctgtgagc taatctttga
agactgcaag attcctgctg ccaacatcct 1141 gggccatgag aataagggtg
tctacgtgct gatgagtggg ctggacctgg agcggctggt 1201 gctggccggg
gggcctcttg ggctcatgca agcggtcctg gaccacacca ttccctacct 1261
gcacgtgagg gaagcctttg gccagaagat cggccacttc cagttgatgc aggggaagat
1321 ggctgacatg tacacccgcc tcatggcgtg tcggcagtat gtctacaatg
tcgccaaggc 1381 ctgcgatgag ggccattgca ctgctaagga ctgtgcaggt
gtgattcttt actcagctga 1441 gtgtgccaca caggtagccc tggacggcat
tcagtgtttt ggtggcaatg gctacatcaa 1501 tgactttccc atgggccgct
ttcttcgaga tgccaagctg tatgagatag gggctgggac 1561 cagcgaggtg
aggcggctgg tcatcggcag agccttcaat gcagactttc actagtcctg 1621
agacccttcg cccccttttc ctgcacctag tggcctttct tgggaagtag agatgtggcg
1681 gctttcccac cctgcccaca gcaggccctc ctgcccagct gctcttgtca
gccctctggc 1741 ctctggatga ggttgagttc tccacaacag ctcccaagca
tcatgggcct cgcagccggg 1801 cctgtgccac ggctagtgtt gtgtgattta
aaatggactc agcaggaagc atattgtctg 1861 gggattgttg ggacaggttt
tggtgactct gtgcccttgc tctctaactt ctgagcccac 1921 ctcccagggt
aggcacctgg gggcatgcag gtgcccacct cccagggtag gcacctgggg 1981
gcatgcaggt acccacctct ttctcttggg tgaggctctg gcaaggagat ctctctgctc
2041 aagcacagca gaatcatggc ccctctccat gaattggaac ttggtacagg
ttaagtatcc 2101 ctaatcctga aatctgaaac acttgtggtt ccaagcattt
tggataaggc aaattcaact 2161 ttcagtctct tttctggggg aaaaaaataa
taaacctagc ctagccaggc gtggtggctc 2221 atgcttgtaa tcccagcact
tcaggaggct gagatgggtg gatcacctga ggtcaggagt 2281 tcaagaccag
cctggccaac atgtggaaac ctcgcctcaa ctaaaaatag aaaaaaatta 2341
gttgggcatg gtggtgggca cctgtaatcc cagctacttc aggaggctga ggcaggagaa
2401 ttacttgaac ccaggaggcg gacgttgcag tgagccgagc ttgtgccatt
gcactccagc 2461 ctgggcgaca agagcaaaac tcttcaaaaa acaaaacaaa
acaaaaaaac cctggccctt 2521 gtttcttcca gtttctagag gtatcagctc
ctagcagctt atgaacacat atgcttgctt 2581 ggccaggcaa ggtggtgtgt
gcctgtaatc ccagcacttt gggaggccaa ggcaggtgga 2641 tcacttgcag
tcaggagttc aagaccagcc tgtccaacgt ggtgaaaccc catctctact 2701
aaaaatacaa aaattagcca ggggtggtgg tgcacgtctg taatcccagc tactcaggag
2761 gctgaggcag gagaatcact tgaacccggg aggtggaggt tgcaatgagc
caatatgaca 2821 ccgctgcagt ccagcctggg ccatagagtg agactctgtc
tcaaaaaagg aaagaaaaat 2881 aggctgggca cagtgactca tgcctgtaat
cccaacactt tgggaggccg aggcaggtgg 2941 atcacgaggt caggagttca
agaccagcct ggccaagatg gtaaaacctc gtctctacta 3001 aaaatacaaa
aattagccag gtgtggtggc aggctcctgt aatcccagct actcaggagg 3061
ctgaggcaga gaattgcttg aacccgggag gcagagtttg cagtgagcca agatcacacc
3121 actgcactcc agcttggacg acagagcgag actctgtctc aaaaaataat
aggccaggca 3181 tggtggctca acgtctgtaa tcccagcact ttgggaggcc
gaggcgggca gatcacaagg 3241 tcaggagttc gagaccagcc tgacgaccaa
catggtgaaa cctcgtctct actaaaaata 3301 caaaaattag ccaggcctgg
tggcacgcgc ctgtaatccc agttacacag aagactgagg 3361 caggagaatc
gcttgaacgc aggaggcaga ggttgcagga gctgagatcg cgccattgca 3421
ctccagcctg ggcaacagag tgagactctg tctcaaaaaa taataataaa ataaatgaac
3481 acacatgctg ctgagtccgc agggggggca gagcagagga cagcgtgctt
ttgtgtactg 3541 ttggaagact ggctcctcct gtacagcacc tctgagccct
tgtgcaccgc cctgccacgg 3601 gcaccatcca gtcctggccg tgtgaccacc
cacagctgac tgggcagcag gcacaggccc 3661 tacccgagca ggccggagtt
ggctcgcatg actccagctg aggctgcctg tgtacatttc 3721 tccagatacc
ctatggctaa ttttgttata actgcacagt ggctgctgcc attttgtatt 3781
aaatatattg tgaaacaaac ctatctgggg agaagcaatc tacttgccgc tgcttcctgt
3841 ctggatccag cttgtgtcct tggagagtgg ctggcccagg tcctattcct
gtcctccagc 3901 ccgttctttc atgagggaca ggaaggtaaa atcagccctt
aggagagagg tctcagcctc 3961 cctttcccag atctcccagt gagttttaaa
ggaagcaggg agcccagagt gctaagttct 4021 tacagccaga aggaagctta
tagatttctg aaaaccgccc ctttgttttt aaaaagatca 4081 acacaatttg
actttctcaa ggtcaaaacg aactagaatc cagatctgct catggcaaaa 4141
atgggggtgt tctgagaatt ccagctttgg gccgcactgt acagcagtct ggatagagtg
4201 tgatctgaga agggaatggg tctgggttgt tccacccctt ccgagttcca
aaaagaggga 4261 actggttttc ttggttctca gcccagcagc acctatcctg
gctcttggtc ctggcctgca 4321 gccaagtgct gttcctagcc tgaggcttga
gacaggtggg gttggctcct caccaacccc 4381 agttccgtcc catcctgagg
gcaagatcct gggctcatag gcagtccctt tcacttcctt 4441 gtcttgctcc
ctgctatgtt ggagatgaat gtgactaaaa gggccatctt gctggcttaa 4501
tgtgtggctg gagagaccag cctggagaca atgtggcaaa atggggcgct tcatccagtc
4561 tgtctaagcc ctgtcgactt ggggaggtga tttctttcct ggttctatat
gtgaagcaaa 4621 ataaatgttt taaaattaaa agcaaaaaaa acaaaatgaa
ccatgaaaaa aaa
[0273] In some embodiments of the methods of the disclosure, the
wild type human IVD/DISP2 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
transcript variant 1):
TABLE-US-00044 (SEQ ID NO: 51) 1 maematatrl lgwrvaswrl rpplagfvsq
rahsllpvdd ainglseeqr qlrqtmakfl 61 qehlapkaqe idrsnefknl
refwkqlgnl gvlgitapvq yggsglgyle hvlvmeeisr 121 asgavglsyg
ahsnlcinql vrngneaqke kylpklisge yigalamsep nagsdvvsmk 181
lkaekkgnhy ilngnkfwit ngpdadvliv yaktdlaavp asrgitafiv ekgmpgfsts
241 kkldklgmrg sntcelifed ckipaanilg henkgvyvlm sgldlerlvl
aggplglmqa 301 vldhtipylh vreafgqkig hfqlmqgkma dmytrlmacr
qyvynvakac deghctakdc 361 agvilysaec atqvaldgiq cfggngyind
fpmgrflrda klyeigagts evrrlvigra 421 fnadfh
[0274] In some embodiments of the methods of the disclosure, the
wild type human IVD/DISP2 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001159508.1, transcript variant 2):
TABLE-US-00045 (SEQ ID NO: 27) 1 tttccgcagt taggggctgc tatttcaacg
cagggagata aaaagaaaaa aacacttgct 61 cttctacccc gctaaaaaca
ctcatcctag ggagcacgcc agcatttgca gcgttcgggg 121 cagggccact
cggcctgcgg ccgttgcact ggctggaagc tggcaggcga tcacggttga 181
ttggctcggg tgcggtccaa gggcagcaac gccttcggcg ggccgcctag ggtgattggc
241 tgctgcagcc caccccctag ccggtttggt gggcggcgaa gcctggattg
gtggagctaa 301 gagctggctc agtttcagcg ctggctcttc gtgcatggca
gagatggcga ctgcgactcg 361 gctgctgggg tggcgtgtgg cgagctggag
gctgcggccg ccgcttgccg gcttcgtttc 421 ccagcgggcc cactcgcttt
tgcccgtgga cgatgcaatc aatgggctaa gcgaggagca 481 gaggcaggaa
ttttggaagc agctggggaa cctgggcgta ttgggcatca cagcccctgt 541
tcagtatggc ggctccggcc tgggctacct ggagcatgtg ctggtgatgg aggagatatc
601 ccgagcttcc ggagcagtgg ggctcagtta cggtgcccac tccaacctct
gcatcaacca 661 gcttgtacgc aatgggaatg aggcccagaa agagaagtat
ctcccgaagc tgatcagtgg 721 tgagtacatc ggagccctgg ccatgagtga
gcccaatgca ggctctgatg ttgtctctat 781 gaagctcaaa gcggaaaaga
aaggaaatca ctacatcctg aatggcaaca agttctggat 841 cactaatggc
cctgatgctg acgtcctgat tgtctatgcc aagacagatc tggctgctgt 901
gccagcttct cggggcatca cagccttcat tgtggagaag ggtatgcctg gctttagcac
961 ctctaagaag ctggacaagc tggggatgag gggctctaac acctgtgagc
taatctttga 1021 agactgcaag attcctgctg ccaacatcct gggccatgag
aataagggtg tctacgtgct 1081 gatgagtggg ctggacctgg agcggctggt
gctggccggg gggcctcttg ggctcatgca 1141 agcggtcctg gaccacacca
ttccctacct gcacgtgagg gaagcctttg gccagaagat 1201 cggccacttc
cagttgatgc aggggaagat ggctgacatg tacacccgcc tcatggcgtg 1261
tcggcagtat gtctacaatg tcgccaaggc ctgcgatgag ggccattgca ctgctaagga
1321 ctgtgcaggt gtgattcttt actcagctga gtgtgccaca caggtagccc
tggacggcat 1381 tcagtgtttt ggtggcaatg gctacatcaa tgactttccc
atgggccgct ttcttcgaga 1441 tgccaagctg tatgagatag gggctgggac
cagcgaggtg aggcggctgg tcatcggcag 1501 agccttcaat gcagactttc
actagtcctg agacccttcg cccccttttc ctgcacctag 1561 tggcctttct
tgggaagtag agatgtggcg gctttcccac cctgcccaca gcaggccctc 1621
ctgcccagct gctcttgtca gccctctggc ctctggatga ggttgagttc tccacaacag
1681 ctcccaagca tcatgggcct cgcagccggg cctgtgccac ggctagtgtt
gtgtgattta 1741 aaatggactc agcaggaagc atattgtctg gggattgttg
ggacaggttt tggtgactct 1801 gtgcccttgc tctctaactt ctgagcccac
ctcccagggt aggcacctgg gggcatgcag 1861 gtgcccacct cccagggtag
gcacctgggg gcatgcaggt acccacctct ttctcttggg 1921 tgaggctctg
gcaaggagat ctctctgctc aagcacagca gaatcatggc ccctctccat 1981
gaattggaac ttggtacagg ttaagtatcc ctaatcctga aatctgaaac acttgtggtt
2041 ccaagcattt tggataaggc aaattcaact ttcagtctct tttctggggg
aaaaaaataa 2101 taaacctagc ctagccaggc gtggtggctc atgcttgtaa
tcccagcact tcaggaggct 2161 gagatgggtg gatcacctga ggtcaggagt
tcaagaccag cctggccaac atgtggaaac 2221 ctcgcctcaa ctaaaaatag
aaaaaaatta gttgggcatg gtggtgggca cctgtaatcc 2281 cagctacttc
aggaggctga ggcaggagaa ttacttgaac ccaggaggcg gacgttgcag 2341
tgagccgagc ttgtgccatt gcactccagc ctgggcgaca agagcaaaac tcttcaaaaa
2401 acaaaacaaa acaaaaaaac cctggccctt gtttcttcca gtttctagag
gtatcagctc 2461 ctagcagctt atgaacacat atgcttgctt ggccaggcaa
ggtggtgtgt gcctgtaatc 2521 ccagcacttt gggaggccaa ggcaggtgga
tcacttgcag tcaggagttc aagaccagcc 2581 tgtccaacgt ggtgaaaccc
catctctact aaaaatacaa aaattagcca ggggtggtgg 2641 tgcacgtctg
taatcccagc tactcaggag gctgaggcag gagaatcact tgaacccggg 2701
aggtggaggt tgcaatgagc caatatgaca ccgctgcagt ccagcctggg ccatagagtg
2761 agactctgtc tcaaaaaagg aaagaaaaat aggctgggca cagtgactca
tgcctgtaat 2821 cccaacactt tgggaggccg aggcaggtgg atcacgaggt
caggagttca agaccagcct 2881 ggccaagatg gtaaaacctc gtctctacta
aaaatacaaa aattagccag gtgtggtggc 2941 aggctcctgt aatcccagct
actcaggagg ctgaggcaga gaattgcttg aacccgggag 3001 gcagagtttg
cagtgagcca agatcacacc actgcactcc agcttggacg acagagcgag 3061
actctgtctc aaaaaataat aggccaggca tggtggctca acgtctgtaa tcccagcact
3121 ttgggaggcc gaggcgggca gatcacaagg tcaggagttc gagaccagcc
tgacgaccaa 3181 catggtgaaa cctcgtctct actaaaaata caaaaattag
ccaggcctgg tggcacgcgc 3241 ctgtaatccc agttacacag aagactgagg
caggagaatc gcttgaacgc aggaggcaga 3301 ggttgcagga gctgagatcg
cgccattgca ctccagcctg ggcaacagag tgagactctg 3361 tctcaaaaaa
taataataaa ataaatgaac acacatgctg ctgagtccgc agggggggca 3421
gagcagagga cagcgtgctt ttgtgtactg ttggaagact ggctcctcct gtacagcacc
3481 tctgagccct tgtgcaccgc cctgccacgg gcaccatcca gtcctggccg
tgtgaccacc 3541 cacagctgac tgggcagcag gcacaggccc tacccgagca
ggccggagtt ggctcgcatg 3601 actccagctg aggctgcctg tgtacatttc
tccagatacc ctatggctaa ttttgttata 3661 actgcacagt ggctgctgcc
attttgtatt aaatatattg tgaaacaaac ctatctgggg 3721 agaagcaatc
tacttgccgc tgcttcctgt ctggatccag cttgtgtcct tggagagtgg 3781
ctggcccagg tcctattcct gtcctccagc ccgttctttc atgagggaca ggaaggtaaa
3841 atcagccctt aggagagagg tctcagcctc cctttcccag atctcccagt
gagttttaaa 3901 ggaagcaggg agcccagagt gctaagttct tacagccaga
aggaagctta tagatttctg 3961 aaaaccgccc ctttgttttt aaaaagatca
acacaatttg actttctcaa ggtcaaaacg 4021 aactagaatc cagatctgct
catggcaaaa atgggggtgt tctgagaatt ccagctttgg 4081 gccgcactgt
acagcagtct ggatagagtg tgatctgaga agggaatggg tctgggttgt 4141
tccacccctt ccgagttcca aaaagaggga actggttttc ttggttctca gcccagcagc
4201 acctatcctg gctcttggtc ctggcctgca gccaagtgct gttcctagcc
tgaggcttga 4261 gacaggtggg gttggctcct caccaacccc agttccgtcc
catcctgagg gcaagatcct 4321 gggctcatag gcagtccctt tcacttcctt
gtcttgctcc ctgctatgtt ggagatgaat 4381 gtgactaaaa gggccatctt
gctggcttaa tgtgtggctg gagagaccag cctggagaca 4441 atgtggcaaa
atggggcgct tcatccagtc tgtctaagcc ctgtcgactt ggggaggtga 4501
tttctttcct ggttctatat gtgaagcaaa ataaatgttt taaaattaaa agcaaaaaaa
4561 acaaaatgaa ccatg
[0275] In some embodiments of the methods of the disclosure, the
wild type human IVD/DISP2 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001152980.1, transcript variant 2):
TABLE-US-00046 (SEQ ID NO: 28) 1 maematatrl lgwrvaswrl rpplagfvsq
rahsllpvdd ainglseeqr qefwkqlgnl 61 gvlgitapvq yggsglgyle
hvlvmeeisr asgavglsyg ahsnlcinql vrngneaqke 121 kylpklisge
yigalamsep nagsdvvsmk lkaekkgnhy ilngnkfwit ngpdadvliv 181
yaktdlaavp asrgitafiv ekgmpgfsts kkldklgmrg sntcelifed ckipaanilg
241 henkgvyvlm sgldlerlvl aggplglmqa vldhtipylh vreafgqkig
hfqlmqgkma 301 dmytrlmacr qyvynvakac deghctakdc agvilysaec
atqvaldgiq cfggngyind 361 fpmgrflrda klyeigagts evrrlvigra
fnadfh
[0276] In some embodiments of the methods of the disclosure, the
wild type human DPP9 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_139159.4):
TABLE-US-00047 (SEQ ID NO: 29) 1 caacttccgg gtcaaaggtg cctgagccgg
cgggtcccct gtgtccgccg cggctgtcgt 61 cccccgctcc cgccacttcc
ggggtcgcag tcccgggcat ggagccgcga ccgtgaggcg 121 ccgctggacc
cgggacgacc tgcccagtcc ggccgccgcc ccacgtcccg gtctgtgtcc 181
cacgcctgca gctggaatgg aggctctctg gaccctttag aaggcacccc tgccctcctg
241 aggtcagctg agcggttaat gcggaaggtt aagaaactgc gcctggacaa
ggagaacacc 301 ggaagttgga gaagcttctc gctgaattcc gagggggctg
agaggatggc caccaccggg 361 accccaacgg ccgaccgagg cgacgcagcc
gccacagatg acccggccgc ccgcttccag 421 gtgcagaagc actcgtggga
cgggctccgg agcatcatcc acggcagccg caagtactcg 481 ggcctcattg
tcaacaaggc gccccacgac ttccagtttg tgcagaagac ggatgagtct 541
gggccccact cccaccgcct ctactacctg ggaatgccat atggcagccg agagaactcc
601 ctcctctact ctgagattcc caagaaggtc cggaaagagg ctctgctgct
cctgtcctgg 661 aagcagatgc tggatcattt ccaggccacg ccccaccatg
gggtctactc tcgggaggag 721 gagctgctga gggagcggaa acgcctgggg
gtcttcggca tcacctccta cgacttccac 781 agcgagagtg gcctcttcct
cttccaggcc agcaacagcc tcttccactg ccgcgacggc 841 ggcaagaacg
gcttcatggt gtcccctatg aaaccgctgg aaatcaagac ccagtgctca 901
gggccccgga tggaccccaa aatctgccct gccgaccctg ccttcttctc cttcatcaat
961 aacagcgacc tgtgggtggc caacatcgag acaggcgagg agcggcggct
gaccttctgc 1021 caccaaggtt tatccaatgt cctggatgac cccaagtctg
cgggtgtggc caccttcgtc 1081 atacaggaag agttcgaccg cttcactggg
tactggtggt gccccacagc ctcctgggaa 1141 ggttcagagg gcctcaagac
gctgcgaatc ctgtatgagg aagtcgatga gtccgaggtg 1201 gaggtcattc
acgtcccctc tcctgcgcta gaagaaagga agacggactc gtatcggtac 1261
cccaggacag gcagcaagaa tcccaagatt gccttgaaac tggctgagtt ccagactgac
1321 agccagggca agatcgtctc gacccaggag aaggagctgg tgcagccctt
cagctcgctg 1381 ttcccgaagg tggagtacat cgccagggcc gggtggaccc
gggatggcaa atacgcctgg 1441 gccatgttcc tggaccggcc ccagcagtgg
ctccagctcg tcctcctccc cccggccctg 1501 ttcatcccga gcacagagaa
tgaggagcag cggctagcct ctgccagagc tgtccccagg 1561 aatgtccagc
cgtatgtggt gtacgaggag gtcaccaacg tctggatcaa tgttcatgac 1621
atcttctatc ccttccccca atcagaggga gaggacgagc tctgctttct ccgcgccaat
1681 gaatgcaaga ccggcttctg ccatttgtac aaagtcaccg ccgttttaaa
atcccagggc 1741 tacgattgga gtgagccctt cagccccggg gaagatgaat
ttaagtgccc cattaaggaa 1801 gagattgctc tgaccagcgg tgaatgggag
gttttggcga ggcacggctc caagatctgg 1861 gtcaatgagg agaccaagct
ggtgtacttc cagggcacca aggacacgcc gctggagcac 1921 cacctctacg
tggtcagcta tgaggcggcc ggcgagatcg tacgcctcac cacgcccggc 1981
ttctcccata gctgctccat gagccagaac ttcgacatgt tcgtcagcca ctacagcagc
2041 gtgagcacgc cgccctgcgt gcacgtctac aagctgagcg gccccgacga
cgaccccctg 2101 cacaagcagc cccgcttctg ggctagcatg atggaggcag
ccagctgccc cccggattat 2161 gttcctccag agatcttcca tttccacacg
cgctcggatg tgcggctcta cggcatgatc 2221 tacaagcccc acgccttgca
gccagggaag aagcacccca ccgtcctctt tgtatatgga 2281 ggcccccagg
tgcagctggt gaataactcc ttcaaaggca tcaagtactt gcggctcaac 2341
acactggcct ccctgggcta cgccgtggtt gtgattgacg gcaggggctc ctgtcagcga
2401 gggcttcggt tcgaaggggc cctgaaaaac caaatgggcc aggtggagat
cgaggaccag 2461 gtggagggcc tgcagttcgt ggccgagaag tatggcttca
tcgacctgag ccgagttgcc 2521 atccatggct ggtcctacgg gggcttcctc
tcgctcatgg ggctaatcca caagccccag 2581 gtgttcaagg tggccatcgc
gggtgccccg gtcaccgtct ggatggccta cgacacaggg 2641 tacactgagc
gctacatgga cgtccctgag aacaaccagc acggctatga ggcgggttcc 2701
gtggccctgc acgtggagaa gctgcccaat gagcccaacc gcttgcttat cctccacggc
2761 ttcctggacg aaaacgtgca ctttttccac acaaacttcc tcgtctccca
actgatccga 2821 gcagggaaac cttaccagct ccagatctac cccaacgaga
gacacagtat tcgctgcccc 2881 gagtcgggcg agcactatga agtcacgttg
ctgcactttc tacaggaata cctctgagcc 2941 tgcccaccgg gagccgccac
atcacagcac aagtggctgc agcctccgcg gggaaccagg 3001 cgggagggac
tgagtggccc gcgggcccca gtgaggcact ttgtcccgcc cagcgctggc 3061
cagccccgag gagccgctgc cttcaccgcc ccgacgcctt ttatcctttt ttaaacgctc
3121 ttgggtttta tgtccgctgc ttcttggttg ccgagacaga gagatggtgg
tctcgggcca 3181 gcccctcctc tccccgcctt ctgggaggag gaggtcacac
gctgatgggc actggagagg 3241 ccagaagaga ctcagaggag cgggctgcct
tccgcctggg gctccctgtg acctctcagt 3301 cccctggccc ggccagccac
cgtccccagc acccaagcat gcaattgcct gtcccccccg 3361 gccagcctcc
ccaacttgat gtttgtgttt tgtttggggg gatatttttc ataattattt 3421
aaaagacagg ccgggcgcgg tggctcacgt ctgtaatccc agcactttgg gaggctgagg
3481 cgggcggatc acctgaggtt gggagttcaa gaccagcctg gccaacatgg
ggaaaccccg 3541 tctctactaa aaatacaaaa aattagccgg gtgtggtggc
gcgtgcctat aatcccagct 3601 actcgggagg ctgaggcagg agaatcgctt
gaacccggga ggtggaggtt gcggtgagcc 3661 aagatcgcac cattgcactc
cagcctgggc aacaagagcg aaactctgtc tcaaaataaa 3721 taaaaaataa
aagacagaaa gcaaggggtg cctaaatcta gacttggggt ccacaccggg 3781
cagcggggtt gcaacccagc acctggtagg ctccatttct tcccaagccc gagcagaggg
3841 tcatgcgggc cccacaggag aagcggccag ggcccgcggg gggcaccacc
tgtggacagc 3901 cctcctgtcc ccaagctttc aggcaggcac tgaaacgcac
cgaacttcca cgctctgctg 3961 gtcagtggcg gctgtcccct ccccagccca
gccgcccagc cacatgtgtc tgcctgaccc 4021 gtacacacca ggggttccgg
ggttgggagc tgaaccatcc ccacctcagg gttatatttc 4081 cctctcccct
tccctccccg ccaagagctc tgccaggggc gggcaaaaaa aaaagtaaaa 4141
agaaaagaaa aaaaaaaaaa agaaacaaac cacctctaca tattatggaa agaaaatatt
4201 tttgtcgatt cttattcttt tataattatg cgtggaagaa gtagacacat
taaacgattc 4261 cagttggaaa aaaaaaaaaa aaaaaa
[0277] In some embodiments of the methods of the disclosure, the
wild type human DPP9 gene of the disclosure consists of or comprises the
amino acid sequence (Genbank Accession number: NP_631898.3):
TABLE-US-00048 (SEQ ID NO: 30) 1 mrkvkklrld kentgswrsf slnsegaerm
attgtptadr gdaaatddpa arfqvqkhsw 61 dglrsiihgs rkysglivnk
aphdfqfvqk tdesgphshr lyylgmpygs rensllysei 121 pkkvrkeall
llswkqmldh fqatphhgvy sreeellrer krlgvfgits ydfhsesglf 181
lfqasnslfh crdggkngfm vspmkpleik tqcsgprmdp kicpadpaff sfinnsdlwv
241 anietgeerr ltfchqglsn vlddpksagv atfviqeefd rftgywwcpt
aswegseglk 301 tlrilyeevd esevevihvp spaleerktd syryprtgsk
npkialklae fqtdsqgkiv 361 stqekelvqp fsslfpkvey iaragwtrdg
kyawamfldr pqqwlqlvll ppalfipste 421 neeqrlasar avprnvqpyv
vyeevtnvwi nvhdifypfp qsegedelcf lranecktgf 481 chlykvtavl
ksqgydwsep fspgedefkc pikeeialts gewevlarhg skiwvneetk 541
lvyfqgtkdt plehhlyvvs yeaageivrl ttpgfshscs msqnfdmfvs hyssvstppc
601 vhvyklsgpd ddplhkqprf wasmmeaasc ppdyvppeif hfhtrsdvrl
ygmiykphal 661 qpgkkhptvl fvyggpqvql vnnsfkgiky lrlntlaslg
yavvvidgrg scqrglrfeg 721 alknqmgqve iedqveglqf vaekygfidl
srvaihgwsy ggflslmgli hkpqvfkvai 781 agapvtvwma ydtgyterym
dvpennqhgy eagsvalhve klpnepnrll ilhgfldenv 841 hffhtnflvs
qliragkpyq lqiypnerhs ircpesgehy evtllhflqe yl
[0279] In some embodiments of the methods of the disclosure, the
wild type human SIGLEC14 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001098612.1):
TABLE-US-00049 (SEQ ID NO: 31) 1 actcaccctc cggcttcctg tcggggcttt
ctcagcccca ccccacgttt ggacatttgg 61 agcatttcct tccctgacag
ccggacctgg gactgggctg gggccctggc ggatggagac 121 atgctgcccc
tgctgctgct gcccctgctg tggggggggt ccctgcagga gaagccagtg 181
tacgagctgc aagtgcagaa gtcggtgacg gtgcaggagg gcctgtgcgt ccttgtgccc
241 tgctccttct cttacccctg gagatcctgg tattcctctc ccccactcta
cgtctactgg 301 ttccgggacg gggagatccc atactacgct gaggttgtgg
ccacaaacaa cccagacaga 361 agagtgaagc cagagaccca gggccgattc
cgcctccttg gggatgtcca gaagaagaac 421 tgctccctga gcatcggaga
tgccagaatg gaggacacgg gaagctattt cttccgcgtg 481 gagagaggaa
gggatgtaaa atatagctac caacagaata agctgaactt ggaggtgaca 541
gccctgatag agaaacccga catccacttt ctggagcctc tggagtccgg ccgccccaca
601 aggctgagct gcagccttcc aggatcctgt gaagcgggac cacctctcac
attctcctgg 661 acggggaatg ccctcagccc cctggacccc gagaccaccc
gctcctcgga gctcaccctc 721 acccccaggc ccgaggacca tggcaccaac
ctcacctgtc aggtgaaacg ccaaggagct 781 caggtgacca cggagagaac
tgtccagctc aatgtctcct atgctccaca gaacctcgcc 841 atcagcatct
tcttcagaaa tggcacaggc acagccctgc ggatcctgag caatggcatg 901
tcggtgccca tccaggaggg ccagtccctg ttcctcgcct gcacagttga cagcaacccc
961 cctgcctcac tgagctggtt ccgggaggga aaagccctca atccttccca
gacctcaatg 1021 tctgggaccc tggagctgcc taacatagga gctagagagg
gaggggaatt cacctgccgg 1081 gttcagcatc cgctgggctc ccagcacctg
tccttcatcc tttctgtgca gagaagctcc 1141 tcttcctgca tatgtgtaac
tgagaaacag cagggctcct ggcccctcgt cctcaccctg 1201 atcagggggg
ctctcatggg ggctggcttc ctcctcacct atggcctcac ctggatctac 1261
tataccaggt gtggaggccc ccagcagagc agggctgaga ggcctggctg agcccctccc
1321 gctcaagaca gaactgaggt gtggacactt agccctgtgg gacacatgca
ggacatcact 1381 gtcagcttct ttctggaagc tcacatccca ctgactaccc
ctcttttcct tcctgcccca 1441 taccccttct acttattccc ctctgcttgt
gagtcttgcc ccaccacacc tgcatcccca 1501 tctgcacccc atcccctctc
cacctgccct tctcttccct ctccatccac catctccagc 1561 cctgtgaagg
gaatgtactt tcggtcttat acccccatta cccattaccc aaaagttacc 1621
tttttttttt tttttttttt ttgagacaga gtctcactct gttgcacagg ctggagttca
1681 gtggcacaat ctccgttcac tgcaacctcc acctctgggg ttcaagcaat
tctcctgcct 1741 cagcctccct agtagctggg attacaggtg cctgccacca
catccagtta attttttttt 1801 tttgtatgtt agtagagatg gggttttacc
atgttggcca ggtctcgaac tcctgacctc 1861 aagcaatcca ctgcattggc
ctcccaaagt gctggcatta caggtatgag ccaccgtgcc 1921 tggctgccaa
aagttacctt cttaacactt gaatttctgg tctcctcagc ttccctatcc 1981
atataggcac agagaggcag catttgtttt ccagttaaaa ctctacctca ttgtgattat
2041 tatccaatac aattgttaca aaataagtaa aacttttatg aaacaataca
acataactga 2101 ttttactctt taa
[0280] In some embodiments of the methods of the disclosure, the
wild type human SIGLEC14 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001092082.1):
TABLE-US-00050 (SEQ ID NO: 32) 1 mlpllllpll wggslqekpv yelqvqksvt
vqeglcvlvp csfsypwrsw ysspplyvyw 61 frdgeipyya evvatnnpdr
rvkpetqgrf rllgdvqkkn cslsigdarm edtgsyffrv 121 ergrdvkysy
qqnklnlevt aliekpdihf leplesgrpt rlscslpgsc eagppltfsw 181
tgnalspldp ettrsseltl tprpedhgtn ltcqvkrqga qvttertvql nvsyapqnla
241 isiffrngtg talrilsngm svpiqegqsl flactvdsnp paslswfreg
kalnpsqtsm 301 sgtlelpnig areggeftcr vqhplgsqhl sfilsvqrss
sscicvtekq qgswplvltl 361 irgalmgagf lltygltwiy ytrcggpqqs
raerpg
[0281] In some embodiments of the methods of the disclosure, the
wild type human ADM2 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_001253845.1):
TABLE-US-00051 (SEQ ID NO: 33) 1 cgcccacgcc cggcgccccg accgcggagg
actccccgag ccccgcccgc catggcccgg 61 atcccgacgg ccgccctggg
ttgcatcagc ctcctctgcc tgcagctccc tggctcgctg 121 tcccgcagcc
tgggcgggga cccgcgaccc gtcaaaccca gggagccccc agcccggagc 181
ccttccagca gcctgcagcc caggcacccc gcaccccgac ctgtggtctg gaagcttcac
241 cgggccctcc aggcacagag gggtgccggc ctggcccctg ttatgggtca
gcctctccgg 301 gatggtggcc gccaacactc gggcccccga agacactcgg
gcccccgcag gacccaagcc 361 cagctcctgc gagtgggctg tgtgctgggc
acctgccagg tgcagaatct cagccaccgc 421 ctgtggcaac tcatgggacc
ggccggccgg caggactcag ctcctgtgga ccccagcagc 481 ccccacagct
atggctgagg tggggccggg ccacacccct gcccatccca gccagggtgc 541
tgtgcccccg tccagagctg cagctgagcc ccatctgaag cccagtccct cggagctgca
601 gacagcaggt cctgcagcaa caatacctgc acggctttgc acacgtaaac
ctaggctggt 661 ctacacgcag tgctggtacg tcaaggagcc taaacaccct
gaaattgtga ccccctgggg 721 gacagctgcc agacacagct ggcggcagca
ccagatgcta agcgcttcag agaggaggtg 781 tctgcccaga gatgtggagc
agaagctggg ccctgaacac acggggccat gtctggacga 841 gcaggggaga
gaggctgaac tggccagaag tggcccctcc gctgctggtc cagtcagact 901
gaagcccggc cttgtgcctg ggctgttcct gctctcatgc acaaccagcc cttccacgtg
961 cctgcctgtg ggacaggagg gggagcgtgg gatgctgtag cccccggggt
tgggcaaggg 1021 aaggatggtg gccctccaga ggtcatgaag ggacctctgt
ggctccagct gccaaccctg 1081 gagcccagac cgaggtggcc atggagactc
cacctggatc ccctgtagga ggccagggag 1141 gggaactcag cagttcagga
gccaccccaa accattctgg gacagggaca cccctttcta 1201 ccccagggca
gggcagggct gggtggggca agatccccca gcccgactag acccacctca 1261
cctgaagggg gtgagaccct tgttggcagc cagacaaggg tggggctcca caggcagcac
1321 aggcgcccca ccaccaccca gtttggggac ccagtgggac caggtgcggg
ggcagagggt 1381 gacttaccaa gagccaggga gggcagccca ggcccaagtg
acagcaagaa caagaaccac 1441 tgccggcgtg cacagacttg gtgtgtgtcc
ttccctgggg ggacggggga ctcacatgtg 1501 cctgccactg gagcctctca
accgtccagc agaacacggg gttcagaaag ggctccttct 1561 gctatttagc
gaacactgag catttaattt acaaatgttt gctagggtca ccctctcggc 1621
catcccacga gggtcgccat gatcacccca actctagagg ccgcagcaga gctcaggaca
1681 ttcccccaca gagcttgccc ctcagttcct acctccaagg gggagggtcc
tggaagcgcc 1741 cacccaggcg ccgcccctgt gcttgctccc cgagctcagg
gattgccgag tccacgtaac 1801 tgacctgtac tccacgaggc cctgtgggaa
cggtccaggc tggtcctgcc ctgtggaggc 1861 ctccgtgcac tgagagatgt
actaggattg cagcaaaggt ggtcagggtg atgggccgca 1921 cagcgaggca
gtcaaggcca gctccctggg agaagcactg ggtcaggtga ggtctgagga 1981
cagcaggcct tccctagggg aaggagctgg gagtgccaag gccccaggtg cacaggaggc
2041 gtggctgctg agaggctgca gggtggaggg gcctcggcct cagagtcatg
tgccctgtga 2101 ccactgaagg gtgtcagcag agcacacggc atgaggacag
agggaggggc acggggagtg 2161 aaggaggggg ccctggggca aggctcgggg
gtcaggagct cagcgtccgc tactcagccc 2221 agccaaaacc ctcccagacg
tctcctctcc tgcctgggca aagtccagct tggcaccccg 2281 tctggggcct
gcctgtggtc agggccaagt gttccctcct ccaggaaagc ctttaccctc 2341
ctcatgccct gtagtcagga ggccgcctgc tgtaaccctc cgtgtcgcct cgggtgcgaa
2401 atcagaccca cctgacacca tcacgcggag gcccagcagc acctgcaccc
acttccagct 2461 gctctggcca aaatctccgc tcggccaggc cccgtggctc
acacctgtaa tcctagcaca 2521 ttgggaggcc aaggcaggca catcacctga
gttcaggagt tcaagaccag cctggccaac 2581 atggtgaaat cccgtctcta
ctaaaaacag aaaattatcc gggcgtggtg gcacatgact 2641 gtaatcccag
ctactcagga ggctgaggca ggaggatcac ttgaacctgg gaggcggagg 2701
ttgcagtgag ctgagattgc gccattgcac tccagcctgg gcaacaagag caaaattctg
2761 cctcaaaaaa aaaaatagta ataatacaaa aattagctgg gcgtggtggc
acatgccagt 2821 aattccatct actcgggagg ctgaggcagg agaatcgtct
aagcccggga ggtggaggtt 2881 gcagtgagcc cagatggcgc tgctgcactc
aagcttggat gacagagcaa gactccgttt 2941 caaaaaaaaa aaacctcctc
tcttccttca caccttcctc tgaatcccac ccggtcccac 3001 ctcctgaacc
tatccagaca ccttctcctg acccaggcac cacctgcttt cggggcgatg 3061
gccgtagcct cctcccaggc acctgtctgc atccctctgg ccagtgcatg ctgagcacgt
3121 gacctacccg tgttgggaca cgtgaggata cagccttgac ccccaggggc
tgacattcta 3181 gggggagata gaaggagaca aacgtagaag gtagaataag
tgggtggtgg agtggcaggg 3241 agtgctgagt gccacaggaa gtcagacaag
gaaggagagt gtggggcagg tgccgtttaa 3301 atggggggcg ctggggtctc
ctcacagttg cttctcagct cagctgtgcc aggatcttgt 3361 tgagtcaggt
cagctgccca cagccctctt gcctgacccc tgaagcccag aactctgatc 3421
ttcacagccc taggtatggc cccagcaccc cactgccctc tctcctgccc cagccgactg
3481 ctgttcccag acttccctgg ccacgctcca agacgccagc tctgccgcgg
gcactttgtt 3541 ctcacggtgt cctccatgcc tgcagggccc atgcatggga
agttgcgttg gcggcctggg 3601 tgttggcggt tccgtgcctg ctccaactct
ccgtgaggcc cctctcccag agcctgacac 3661 actctgtggc cgaactctag
gcaggtgccc ctgagtcctt tcctcgacga ggcctgaccc 3721 catccccatc
ctcgctgggc ccgccgaccc cggtgttagc aagaatcctc taaatcagtt 3781
tatggagaat tacccaccct cgatatctga tcccattcct catctcccac ccttgatctc
3841 atcaccctgc cggcctcctg caagatcctc attgagccac tccagtgaga
atccccctac 3901 cctcgaaggc cgccctaaca acttcccatc cgctgacccc
tccaacgcca tcaatctcca 3961 gctgtggttg ttgaactcgg aggtgagctc
ctctcaccac tctcttgaat aaagcttttc 4021 tcaccatttt aaaaaaaaaa
aaaaa
[0282] In some embodiments of the methods of the disclosure, the
wild type human ADM2 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_001240774.1):
TABLE-US-00052 (SEQ ID NO: 34) 1 mariptaalg cisllclqlp gslsrslggd
prpvkprepp arspssslqp rhpaprpvvw 61 klhralqaqr gaglapvmgq
plrdggrqhs gprrhsgprr tqaqllrvgc vlgtcqvqnl 121 shrlwqlmgp
agrqdsapvd pssphsyg
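As an illustrative aid only, the correspondence between a nucleic acid listing and its paired amino acid listing can be checked with a short script. The following Python sketch strips the position numbers from a flattened listing and translates it with the standard genetic code. The function names and the choice of the first "atg" as the start codon are assumptions made for this example; the authoritative coding-region coordinates are those annotated in the cited Genbank record.

```python
import re

# Standard genetic code in the conventional t, c, a, g ordering
# (lowercase one-letter residues to match the listings; "*" marks a stop).
BASES = "tcag"
AMINO = "ffllssssyy**cc*wllllpppphhqqrrrriiimttttnnkkssrrvvvvaaaaddeegggg"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def clean_sequence(text: str) -> str:
    """Remove position numbers and whitespace from a flattened listing.

    Assumes the text passed in contains sequence data only.
    """
    return re.sub(r"[^a-z]", "", text.lower())


def translate(nucleotides: str, start: int = 0) -> str:
    """Translate from a 0-based offset until the first stop codon."""
    residues = []
    for pos in range(start, len(nucleotides) - 2, 3):
        # Codons with unexpected letters are reported as "x".
        residue = CODON_TABLE.get(nucleotides[pos:pos + 3], "x")
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# First 120 nucleotides of SEQ ID NO: 33 as printed above.
FRAGMENT = (
    "1 cgcccacgcc cggcgccccg accgcggagg actccccgag ccccgcccgc catggcccgg "
    "61 atcccgacgg ccgccctggg ttgcatcagc ctcctctgcc tgcagctccc tggctcgctg"
)

nucleotides = clean_sequence(FRAGMENT)
# Assumption for illustration: translation starts at the first "atg".
protein = translate(nucleotides, nucleotides.find("atg"))
print(protein)  # prints: mariptaalgcisllclqlpgsl
```

The printed residues agree with the first 23 residues of SEQ ID NO: 34.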
[0283] In some embodiments of the methods of the disclosure, the
wild type human TSPAN5 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_005723.3):
TABLE-US-00053 (SEQ ID NO: 35) 1 aggcgggcgg agcgaggggt gggagggcgc
gcgcgaacgg gcgggcgagc aagcgagcgg 61 cgtctccacc agcatctgcc
gcggccgcct ttgcccgaag cccggggacg aaccgacgga 121 ccgaccgcct
ggcgcacgga cgcgggcgct cgctttgtgt tcggggctag cgtcggcgag 181
gcttgagctt gcagcgcgcg gcttccctgc tttctcgcgg ccaccccggc tccggcggcc
241 tcggcgcgcg aggggctgga ggtgcgggag ccgctctccg ccggtcggtc
cccgcgcggc 301 tgagcccagg ccgccagcgc cgcggccccg tgcggtgtcc
ctgagctcct gctccccgcc 361 gggctgctcc gagcaacggt gcttcggagc
tccaaactcg ggctgccggg gcaagtgtct 421 tcatgaaccc agaggatgtc
cgggaagcac tacaagggtc ctgaagtcag ttgttgcatc 481 aaatacttca
tatttggctt caatgtcata ttttggtttt tgggaataac atttcttgga 541
attggactgt gggcatggaa tgaaaaagga gttctgtcca acatctcttc catcaccgat
601 ctcggcggct ttgacccagt ttggctcttc cttgtggtgg gaggagtgat
gttcattttg 661 ggatttgcag ggtgcattgg agcgctacgg gaaaacactt
tccttctcaa gtttttttct 721 gtgttcctgg gaattatttt cttcctggag
ctcactgccg gagttctagc atttgttttc 781 aaagactgga tcaaagacca
gctgtatttc tttataaaca acaacatcag agcatatcgg 841 gatgacattg
atttgcaaaa cctcatagac ttcacccagg aatattggca gtgctgtggg 901
gcttttggag ctgatgattg gaacctaaat atttacttca attgcacaga ttccaatgca
961 agtcgagagc gatgtggcgt tccattctcc tgctgcacta aagatcccgc
agaagatgtc 1021 atcaacactc agtgtggcta tgatgccagg caaaaaccag
aagttgacca gcagattgta 1081 atctacacga aaggctgtgt gccccagttt
gagaagtggt tgcaggacaa tttaaccatc 1141 gttgctggta ttttcatagg
cattgcattg ctgcagatat ttgggatatg cctggcccag 1201 aatttggtta
gcgatatcga agctgtcagg gcgagctggt agaccccctg caaccgctgc 1261
tgcaagacac tggacagacc cagctttcgg gaccctcccg cgtgccgaac tgatcttcga
1321 gctgcatgga cctaatcaca gatgcagcct gcagtctcgc ctaatggagc
tgccattagg 1381 ggagtgtaaa actgggaaat gctgctcact gacagaatta
aaaaaaaaaa taaccagtat 1441 gaaagtcgtt gcgccgtgaa tctctactgt
agccatgaat ttatggacag ttagatgctt 1501 accaaaaaag aaaaaaaggg
agggtagggg acccagatgt acttgaatgt gcagaaaata 1561 cattcttgtc
ctcatcttcc gtaattggag ggctgggaga ggcagctttg ctcttcacca 1621
caccttggac ggaccacctt ctttctgttc catggcctga aggagtgcat ctcctcaaag
1681 actcagcccc tcacctggga gggcagtggt ttgtgggcat ccctccatgt
acattttagg 1741 aaacacttgc aactctcatc tgaagaagaa aacaactcat
ctttgggttc agattttgtg 1801 atggtattca gcaagtcact tgggcgagca
cacttggtct atcctggaaa gtctccttat 1861 aagagaagtt gtgtatttca
tgtgcaccga gcaagggcat tggaagacgt catgaggctg 1921 tattttagca
ggactgatcg tttttctaag tagacctgag ctttgtttat cagtgaaatt 1981
caaggagaaa atgaggttaa tgaagaggta tcagttaaat atccccttct tctcaccctg
2041 ccaaaattag cagttggatt tttggaaact ctggaatatt ctgggtcatt
ttgttttgta 2101 tgtttgttgt ttttcgtctt ccaaaggtga aagctatgat
acagttccac ttaaatttta 2161 gtgttttctt actcagctca agcattaatt
tttgattaag tcttaatctg catgacctgt 2221 gaatctgaat ccatcatctc
cctttcctgc cagcttttct acaaacattg aaatatgtta 2281 tttggtcagc
acttatttcc taggttcaca gccttgggag gttgtggcat gtcctcccag 2341
tctggctggg aagagaccag ctgtaccatc caaatgcttc cctggtcttg atgatctctt
2401 ccagagtcga tctgagtggc cttttctgca ccctcccctt ctttctcttt
gaatggaatt 2461 aaacccaatt tggaaacaac attgacccag tcaaaagctt
ctaatggttt ctttttcttc 2521 ctccagtttt agtttgcttt tattaaaaaa
agaaaatagt gcatggccat agctccttca 2581 gttctcttat tgcagactaa
ccatcaggat ggtatcaaag cacaaatact ttggagggga 2641 atgcgttgaa
ctggggcaag tactctgtaa cacaaagtgg gaaaccactt cctggtgctg 2701
ccgctcctgc ccccacttta ggtgggaggg acgagttttg ccctctagat tttaatccag
2761 ctggtgtcca ccggatgttg ccctcctggg gagcagatat cagtctgtgg
aactctggga 2821 aaaccacagg cacatttttc ggtgcggaca gatttgccag
cacataactg ggcagccagc 2881 tagaatactt tgtggaaatt aagcgaggtt
ttccatttca gccccatggt gcatggtggt 2941 ggccgatgaa tgtgtcagtc
tgctcagaga aaggacaaaa aggaaattat tttcaaaact 3001 gtgttcactg
tttgggtgtg tgtatggctc tgcatgtgtg tgtttttgtc tctgtatagg 3061
tagaggtatt cacatcttac tccgactgta aggttgtctt acttcatctc tgcccccacc
3121 acagttgcca ttttgtaatg tccttccaac atggagaaga cacgagctct
ctccagttgg 3181 catcatttgt cttttttgtt gattgcctca ttctccagtg
aactccatct ggccaattga 3241 ttcagaatca ggcaagatcc ctgccctttg
gcacatccac tgaaaggcca aacagcaagt 3301 ccgagtgagt tttaaatatt
aattaatcac cctttatttt ttacacttga gagtgattgt 3361 aataaaggct
gtcattaata aacttggttc taccttaaaa aaaaaa
[0284] In some embodiments of the methods of the disclosure, the
wild type human TSPAN5 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_005714.2):
TABLE-US-00054 (SEQ ID NO: 52) 1 msgkhykgpe vsccikyfif gfnvifwflg
itflgiglwa wnekgvlsni ssitdlggfd 61 pvwlflvvgg vmfilgfagc
igalrentfl lkffsvflgi iffleltagv lafvfkdwik 121 dqlyffinnn
irayrddidl qnlidftqey wqccgafgad dwnlniyfnc tdsnasrerc 181
gvpfscctkd paedvintqc gydarqkpev dqqiviytkg cvpqfekwlq dnltivagif
241 igiallqifg iclaqnlvsd ieavrasw
[0285] In some embodiments of the methods of the disclosure, the
wild type human CAMKK1 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_032294.2, transcript variant 1):
TABLE-US-00055 (SEQ ID NO: 53) 1 ctgggcccca gcgaggcggt ggggcggggc
ggggcggggc ggggcgcgca gcaggagcga 61 gtggggccgc ccgccgggcc
gcggacactg tcgcccggcg cccaggttcc caacaaggct 121 acgcagaaga
acccccttga ctgaagcaat ggaggggggt ccagctgtct gctgccagga 181
tcctcgggca gagctggtag aacgggtggc agccatcgat gtgactcact tggaggaggc
241 agatggtggc ccagagccta ctagaaacgg tgtggacccc ccaccacggg
ccagagctgc 301 ctctgtgatc cctggcagta cttcaagact gctcccagcc
cggcctagcc tctcagccag 361 gaagctttcc ctacaggagc ggccagcagg
aagctatctg gaggcgcagg ctgggcctta 421 tgccacgggg cctgccagcc
acatctcccc ccgggcctgg cggaggccca ccatcgagtc 481 ccaccacgtg
gccatctcag atgcagagga ctgcgtgcag ctgaaccagt acaagctgca 541
gagtgagatt ggcaagggtg cctacggtgt ggtgaggctg gcctacaacg aaagtgaaga
601 cagacactat gcaatgaaag tcctttccaa aaagaagtta ctgaagcagt
atggctttcc 661 acgtcgccct cccccgagag ggtcccaggc tgcccaggga
ggaccagcca agcagctgct 721 gcccctggag cgggtgtacc aggagattgc
catcctgaag aagctggacc acgtgaatgt 781 ggtcaaactg atcgaggtcc
tggatgaccc agctgaggac aacctctatt tggtgtttga 841 cctcctgaga
aaggggcccg tcatggaagt gccctgtgac aagcccttct cggaggagca 901
agctcgcctc tacctgcggg acgtcatcct gggcctcgag tacttgcact gccagaagat
961 cgtccacagg gacatcaagc catccaacct gctcctgggg gatgatgggc
acgtgaagat 1021 cgccgacttt ggcgtcagca accagtttga ggggaacgac
gctcagctgt ccagcacggc 1081 gggaacccca gcattcatgg cccccgaggc
catttctgat tccggccaga gcttcagtgg 1141 gaaggccttg gatgtatggg
ccactggcgt cacgttgtac tgctttgtct atgggaagtg 1201 cccattcatc
gacgatttca tcctggccct ccacaggaag atcaagaatg agcccgtggt 1261
gtttcctgag gagccagaaa tcagcgagga gctcaaggac ctgatcctga agatgttaga
1321 caagaatccc gagacgagaa ttggggtgcc agacatcaag ttgcaccctt
gggtgaccaa 1381 gaacggggag gagccccttc cttcggagga ggagcactgc
agcgtggtgg aggtgacaga 1441 ggaggaggtt aagaactcag tcaggctcat
ccccagctgg accacggtga tcctggtgaa 1501 gtccatgctg aggaagcgtt
cctttgggaa cccgtttgag ccccaagcac ggagggaaga 1561 gcgatccatg
tctgctccag gaaacctact ggtgaaagaa gggtttggtg aagggggcaa 1621
gagcccagag ctccccggcg tccaggaaga cgaggctgca tcctgagccc ctgcatgcac
1681 ccagggccac ccggcagcac actcatcccg cgcctccaga ggcccacccc
tcatgcaaca 1741 gccgcccccg caggcagggg gctggggact gcagccccac
tcccgcccct cccccatcgt 1801 gctgcatgac ctccacgcac gcacgtccag
ggacagactg gaatgtatgt catttggggt 1861 cttgggggca gggctcccac
gaggccatcc tcctcttctt ggacctcctt ggcctgaccc 1921 attctgtggg
gaaaccgggt gcccatggag cctcagaaat gccacccggc tggttggcat 1981
ggcctggggc aggaggcaga ggcaggagac caagatggca ggtggaggcc aggcttacca
2041 caacggaaga gacctcccgc tggggccggg caggcctggc tcagctgcca
caggcatatg 2101 gtggagaggg gggtaccctg cccaccttgg ggtggtggca
ccagagctct tgtctattca 2161 gacgctggta tgggggctcg gacccctcac
tggggacagg gccagtgttg gagaattctg 2221 attccttttt tgttgtcttt
tacttttgtt tttaacctgg gggttcgggg agaggccctg 2281 cttgggaaca
tctcacgagc tttcctacat cttccgtggt tcccagcaca gcccaagatt 2341
atttggcagc caagtggatg gaactaactt tcctggactg tgtttcgcat tcggcgttat
2401 ctggaaagtg gactgaacgg aatcaagctc tgagcagagg cctgaagcgg
aagcaccaca 2461 tcgtccctgc ccatctcact ctctcccttg atgatgcccc
tagagctgag gctggagaag 2521 acaccagggc tgactttgac cgagggccat
ggacgcgaca ggcctgtggc cctgcgcatg 2581 ctgaaataac tggaacccag
cctctcctcc tacaccggcc tacccatctg ggcccaagag 2641 ctgcactcac
actcctacaa cgaaggacaa actgtccagg tcggagggat cacgagacac 2701
agaacctgga ggggtgtgca cgctggcagg tggcctctgc ggcaattgcc tcaccctgag
2761 gacatcagca gtcagcctgc tcagagcggg ggtgctggag cgcgtgcaga
cacagctctt 2821 ccggagcagc cttcaccttc tctctgggat cagtgtccgg
ctggccgacg tggcatttgc 2881 tgaccgaatg ctcatagagg ttgaccccca
cagggtcacg caggactcgg acactgccct 2941 ggaaacatgg atggacaagg
gcttttggcc acaggtgtgg gtgtcctgtt ggaggagggc 3001 ttgtttggag
aagggaggct ggctggggga gaaacccgga tcccgctgca tctccgcgcc 3061
tgtgggtgca tgtcgcgtgc tcatctgttg cacacagctc actcgtatgt cctgcactgg
3121 tacatgcatc tgtaatacag tttctacgtc tatttaaggc taggagccga
atgtgcccca 3181 ttgtcagtgg gtccacgttt ctccccggct cctctgggct
aaggcagtgt ggcccgaagc 3241 ttaaaaagtt actcggtact gtttttaaga
acacttttat agagttagtg gaaggcaagt 3301 taagagccaa tcactgatcc
ccaagtgttt cttgagcatc tggtctgggg ggaccacttt 3361 gatcggaccc
acccttggaa agctcagggg taggcccagg tgggatgctc accctgtcac 3421
tgagggtttt ggttggcatc gttgtttttg aatgtagcac aagcgatgag caaactctat
3481 aagagtgttt taaaaattaa cttcccagga agtgagttaa aaacaataaa
agccctttct 3541 tgagttaaaa agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa
[0286] In some embodiments of the methods of the disclosure, the
wild type human CAMKK1 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_115670.1, transcript variant 1):
TABLE-US-00056 (SEQ ID NO: 54) 1 meggpavccq dpraelverv aaidvthlee
adggpeptrn gvdppprara asvipgstsr 61 llparpslsa rklslqerpa
gsyleaqagp yatgpashis prawrrptie shhvaisdae 121 dcvqlnqykl
qseigkgayg vvrlaynese drhyamkvls kkkllkqygf prrppprgsq 181
aaqggpakql lplervyqei ailkkldhvn vvklievldd paednlylvf dllrkgpvme
241 vpcdkpfsee qarlylrdvi lgleylhcqk ivhrdikpsn lllgddghvk
iadfgvsnqf 301 egndaqlsst agtpafmape aisdsgqsfs gkaldvwatg
vtlycfvygk cpfiddfila 361 lhrkiknepv vfpeepeise elkdlilkml
dknpetrigv pdiklhpwvt kngeeplpse 421 eehcsvvevt eeevknsvrl
ipswttvilv ksmlrkrsfg npfepqarre ersmsapgnl 481 lvkegfgegg
kspelpgvqe deaas
[0287] In some embodiments of the methods of the disclosure, the
wild type human CAMKK1 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_172206.1, transcript variant 2):
TABLE-US-00057 (SEQ ID NO: 55) 1 agcagaacag agtatgcaat ttgggaagct
gtggtgtggc tgcagtggag agttcccaac 61 aaggctacgc agaagaaccc
ccttgactga agcaatggag gggggtccag ctgtctgctg 121 ccaggatcct
cgggcagagc tggtagaacg ggtggcagcc atcgatgtga ctcacttgga 181
ggaggcagat ggtggcccag agcctactag aaacggtgtg gaccccccac cacgggccag
241 agctgcctct gtgatccctg gcagtacttc aagactgctc ccagcccggc
ctagcctctc 301 agccaggaag ctttccctac aggagcggcc agcaggaagc
tatctggagg cgcaggctgg 361 gccttatgcc acggggcctg ccagccacat
ctccccccgg gcctggcgga ggcccaccat 421 cgagtcccac cacgtggcca
tctcagatgc agaggactgc gtgcagctga accagtacaa 481 gctgcagagt
gagattggca agggtgccta cggtgtggtg aggctggcct acaacgaaag 541
tgaagacaga cactatgcaa tgaaagtcct ttccaaaaag aagttactga agcagtatgg
601 ctttccacgt cgccctcccc cgagagggtc ccaggctgcc cagggaggac
cagccaagca 661 gctgctgccc ctggagcggg tgtaccagga gattgccatc
ctgaagaagc tggaccacgt 721 gaatgtggtc aaactgatcg aggtcctgga
tgacccagct gaggacaacc tctatttggt 781 gtttgacctc ctgagaaagg
ggcccgtcat ggaagtgccc tgtgacaagc ccttctcgga 841 ggagcaagct
cgcctctacc tgcgggacgt catcctgggc ctcgagtact tgcactgcca 901
gaagatcgtc cacagggaca tcaagccatc caacctgctc ctgggggatg atgggcacgt
961 gaagatcgcc gactttggcg tcagcaacca gtttgagggg aacgacgctc
agctgtccag 1021 cacggcggga accccagcat tcatggcccc cgaggccatt
tctgattccg gccagagctt 1081 cagtgggaag gccttggatg tatgggccac
tggcgtcacg ttgtactgct ttgtctatgg 1141 gaagtgccca ttcatcgacg
atttcatcct ggccctccac aggaagatca agaatgagcc 1201 cgtggtgttt
cctgaggagc cagaaatcag cgaggagctc aaggacctga tcctgaagat 1261
gttagacaag aatcccgaga cgagaattgg ggtgccagac atcaagttgc acccttgggt
1321 gaccaagaac ggggaggagc cccttccttc ggaggaggag cactgcagcg
tggtggaggt 1381 gacagaggag gaggttaaga actcagtcag gctcatcccc
agctggacca cggtgatcct 1441 ggtgaagtcc atgctgagga agcgttcctt
tgggaacccg tttgagcccc aagcacggag 1501 ggaagagcga tccatgtctg
ctccaggaaa cctactggtg aaagaagggt ttggtgaagg 1561 gggcaagagc
ccagagctcc ccggcgtcca ggaagacgag gctgcatcct gagcccctgc 1621
atgcacccag ggccacccgg cagcacactc atcccgcgcc tccagaggcc cacccctcat
1681 gcaacagccg cccccgcagg cagggggctg gggactgcag ccccactccc
gcccctcccc 1741 catcgtgctg catgacctcc acgcacgcac gtccagggac
agactggaat gtatgtcatt 1801 tggggtcttg ggggcagggc tcccacgagg
ccatcctcct cttcttggac ctccttggcc 1861 tgacccattc tgtggggaaa
ccgggtgccc atggagcctc agaaatgcca cccggctggt 1921 tggcatggcc
tggggcagga ggcagaggca ggagaccaag atggcaggtg gaggccaggc 1981
ttaccacaac ggaagagacc tcccgctggg gccgggcagg cctggctcag ctgccacagg
2041 catatggtgg agaggggggt accctgccca ccttggggtg gtggcaccag
agctcttgtc 2101 tattcagacg ctggtatggg ggctcggacc cctcactggg
gacagggcca gtgttggaga 2161 attctgattc cttttttgtt gtcttttact
tttgttttta acctgggggt tcggggagag 2221 gccctgcttg ggaacatctc
acgagctttc ctacatcttc cgtggttccc agcacagccc 2281 aagattattt
ggcagccaag tggatggaac taactttcct ggactgtgtt tcgcattcgg 2341
cgttatctgg aaagtggact gaacggaatc aagctctgag cagaggcctg aagcggaagc
2401 accacatcgt ccctgcccat ctcactctct cccttgatga tgcccctaga
gctgaggctg 2461 gagaagacac cagggctgac tttgaccgag ggccatggac
gcgacaggcc tgtggccctg 2521 cgcatgctga aataactgga acccagcctc
tcctcctaca ccggcctacc catctgggcc 2581 caagagctgc actcacactc
ctacaacgaa ggacaaactg tccaggtcgg agggatcacg 2641 agacacagaa
cctggagggg tgtgcacgct ggcaggtggc ctctgcggca attgcctcac 2701
cctgaggaca tcagcagtca gcctgctcag agcgggggtg ctggagcgcg tgcagacaca
2761 gctcttccgg agcagccttc accttctctc tgggatcagt gtccggctgg
ccgacgtggc 2821 atttgctgac cgaatgctca tagaggttga cccccacagg
gtcacgcagg actcggacac 2881 tgccctggaa acatggatgg acaagggctt
ttggccacag gtgtgggtgt cctgttggag 2941 gagggcttgt ttggagaagg
gaggctggct gggggagaaa cccggatccc gctgcatctc 3001 cgcgcctgtg
ggtgcatgtc gcgtgctcat ctgttgcaca cagctcactc gtatgtcctg 3061
cactggtaca tgcatctgta atacagtttc tacgtctatt taaggctagg agccgaatgt
3121 gccccattgt cagtgggtcc acgtttctcc ccggctcctc tgggctaagg
cagtgtggcc 3181 cgaagcttaa aaagttactc ggtactgttt ttaagaacac
ttttatagag ttagtggaag 3241 gcaagttaag agccaatcac tgatccccaa
gtgtttcttg agcatctggt ctggggggac 3301 cactttgatc ggacccaccc
ttggaaagct caggggtagg cccaggtggg atgctcaccc 3361 tgtcactgag
ggttttggtt ggcatcgttg tttttgaatg tagcacaagc gatgagcaaa 3421
ctctataaga gtgttttaaa aattaacttc ccaggaagtg agttaaaaac aataaaagcc
3481 ctttcttgag ttaaaaagaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaa
[0288] In some embodiments of the methods of the disclosure, the
wild type human CAMKK1 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_757343.2, transcript variant 2):
TABLE-US-00058 (SEQ ID NO: 56) 1 mqfgklwcgc sgefptrlrr rtplteameg
gpavccqdpr aelvervaai dvthleeadg 61 gpeptrngvd pppraraasv
ipgstsrllp arpslsarkl slqerpagsy leaqagpyat 121 gpashispra
wrrptieshh vaisdaedcv qlnqyklqse igkgaygvvr laynesedrh 181
yamkvlskkk llkqygfprr ppprgsqaaq ggpakqllpl ervyqeiail kkldhvnvvk
241 lievlddpae dnlylvfdll rkgpvmevpc dkpfseeqar lylrdvilgl
eylhcqkivh 301 rdikpsnlll gddghvkiad fgvsnqfegn daqlsstagt
pafmapeais dsgqsfsgka 361 ldvwatgvtl ycfvygkcpf iddfilalhr
kiknepvvfp eepeiseelk dlilkmldkn 421 petrigvpdi klhpwvtkng
eeplpseeeh csvvevteee vknsvrlips wttvilvksm 481 lrkrsfgnpf
epqarreers msapgnllvk egfgeggksp elpgvqedea as
[0289] In some embodiments of the methods of the disclosure, the
wild type human CAMKK1 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_172207.2, transcript variant 3):
TABLE-US-00059 (SEQ ID NO: 57) 1 ctgggcccca gcgaggcggt ggggcggggc
ggggcggggc ggggcgcgca gcaggagcga 61 gtggggccgc ccgccgggcc
gcggacactg tcgcccggcg cccaggttcc caacaaggct 121 acgcagaaga
acccccttga ctgaagcaat ggaggggggt ccagctgtct gctgccagga 181
tcctcgggca gagctggtag aacgggtggc agccatcgat gtgactcact tggaggaggc
241 agatggtggc ccagagccta ctagaaacgg tgtggacccc ccaccacggg
ccagagctgc 301 ctctgtgatc cctggcagta cttcaagact gctcccagcc
cggcctagcc tctcagccag 361 gaagctttcc ctacaggagc ggccagcagg
aagctatctg gaggcgcagg ctgggcctta 421 tgccacgggg cctgccagcc
acatctcccc ccgggcctgg cggaggccca ccatcgagtc 481 ccaccacgtg
gccatctcag atgcagagga ctgcgtgcag ctgaaccagt acaagctgca 541
gagtgagatt ggcaagggtg cctacggtgt ggtgaggctg gcctacaacg aaagtgaaga
601 cagacactat gcaatgaaag tcctttccaa aaagaagtta ctgaagcagt
atggctttcc 661 acgtcgccct cccccgagag ggtcccaggc tgcccaggga
ggaccagcca agcagctgct 721 gcccctggag cgggtgtacc aggagattgc
catcctgaag aagctggacc acgtgaatgt 781 ggtcaaactg atcgaggtcc
tggatgaccc agctgaggac aacctctatt tggccctgca 841 gaaccaggcc
cagaatatcc agttagattc aacaaatatc gccaagcccc actccctgct 901
tccctctgag cagcaagaca gtggatccac gtgggctgcg cgctcagtgt ttgacctcct
961 gagaaagggg cccgtcatgg aagtgccctg tgacaagccc ttctcggagg
agcaagctcg 1021 cctctacctg cgggacgtca tcctgggcct cgagtacttg
cactgccaga agatcgtcca 1081 cagggacatc aagccatcca acctgctcct
gggggatgat gggcacgtga agatcgccga 1141 ctttggcgtc agcaaccagt
ttgaggggaa cgacgctcag ctgtccagca cggcgggaac 1201 cccagcattc
atggcccccg aggccatttc tgattccggc cagagcttca gtgggaaggc 1261
cttggatgta tgggccactg gcgtcacgtt gtactgcttt gtctatggga agtgcccatt
1321 catcgacgat ttcatcctgg ccctccacag gaagatcaag aatgagcccg
tggtgtttcc 1381 tgaggagcca gaaatcagcg aggagctcaa ggacctgatc
ctgaagatgt tagacaagaa 1441 tcccgagacg agaattgggg tgccagacat
caagttgcac ccttgggtga ccaagaacgg 1501 ggaggagccc cttccttcgg
aggaggagca ctgcagcgtg gtggaggtga cagaggagga 1561 ggttaagaac
tcagtcaggc tcatccccag ctggaccacg gtgatcctgg tgaagtccat 1621
gctgaggaag cgttcctttg ggaacccgtt tgagccccaa gcacggaggg aagagcgatc
1681 catgtctgct ccaggaaacc tactggtgta agtactggtg ggccagggac
tgccgggcac 1741 tccctggagt tgggtgggga ggtctgaggc ccatcctccc
actctcactg tcgttgggcc 1801 aaggccagag cctggggact tggccaggtc
tcggtgttgg ccccatttgc atctctgtcc 1861 ccaaggttag tcggggctag
aagggacctt ttgggcccag ctcttgcttc attcctgggg 1921 ccagcatccc
tcacacacac acttccaggg atgaggagct cacgcagccc ctccatggga 1981
caggaagacc cttcttccat gcagcttgat gtcactctct cactgggtcc agcccctctg
2041 gggcttcaaa tctgtggccc cctcagccct tggcagcctg gcagaggttt
gcagacaggc 2101 tgatgttggc ttcctgtagg aggctggcgg gctgtagagg
aggggtgctg gcccctctgc 2161 ctggccctgg ggactgttgg ctgctctccc
aagtggccca ggctgcctgc agccattgct 2221 ggggctctgt gcccagtcag
cactttgtga gtgcttgttc agtgagtaag cagggacagg 2281 ctggccggtg
gaccacggga gaggaacccg cattggccga gggctcccta tggtgagcca 2341
cgcctgtggg ttcaccacct cctaggaggg tccagaaaag cagctcccca agcctgtgcg
2401 cctcgtcctc agcagatcca ccttcttcac tataataaaa gccagtctgg
gatgctaaaa 2461 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa 2521 aaaaaaaaaa aaaaa
[0290] In some embodiments of the methods of the disclosure, the
wild type human CAMKK1 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_757344.2, transcript variant 3):
TABLE-US-00060 (SEQ ID NO: 58) 1 meggpavccq dpraelverv aaidvthlee
adggpeptrn gvdppprara asvipgstsr 61 llparpslsa rklslqerpa
gsyleaqagp yatgpashis prawrrptie shhvaisdae 121 dcvqlnqykl
qseigkgayg vvrlaynese drhyamkvls kkkllkqygf prrppprgsq 181
aaqggpakql lplervyqei ailkkldhvn vvklievldd paednlylal qnqaqniqld
241 stniakphsl lpseqqdsgs twaarsvfdl lrkgpvmevp cdkpfseeqa
rlylrdvilg 301 leylhcqkiv hrdikpsnll lgddghvkia dfgvsnqfeg
ndaqlsstag tpafmapeai 361 sdsgqsfsgk aldvwatgvt lycfvygkcp
fiddfilalh rkiknepvvf peepeiseel 421 kdlilkmldk npetrigvpd
iklhpwvtkn geeplpseee hcsvvevtee evknsvrlip 481 swttvilvks
mlrkrsfgnp fepqarreer smsapgnllv
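As an illustrative aid only, the region in which the transcript variant 3 protein departs from the transcript variant 1 protein can be located by comparing shared leading and trailing residues. The following Python sketch does this for short fragments copied from the two listings. The function name and the fragment boundaries are assumptions chosen for this example, and the approach handles only a single contiguous difference.

```python
def locate_insertion(reference: str, variant: str) -> tuple[int, str]:
    """Return (shared prefix length, variant-specific segment).

    Residues are compared from both ends; whatever remains in the
    variant between the shared prefix and shared suffix is returned.
    """
    limit = min(len(reference), len(variant))
    prefix = 0
    while prefix < limit and reference[prefix] == variant[prefix]:
        prefix += 1
    suffix = 0
    # Never let the shared suffix overlap the shared prefix.
    while (suffix < limit - prefix
           and reference[-1 - suffix] == variant[-1 - suffix]):
        suffix += 1
    return prefix, variant[prefix:len(variant) - suffix]


# Fragments copied from the variant 1 and variant 3 protein listings,
# with spaces removed.
VARIANT_1 = "dnlylvfdllrkgpvme"
VARIANT_3 = "dnlylalqnqaqniqldstniakphsllpseqqdsgstwaarsvfdllrkgpvme"

offset, segment = locate_insertion(VARIANT_1, VARIANT_3)
print(offset, segment)
```

For these fragments the variant-specific segment is the 38-residue stretch beginning "alqnqaqniq". The two full-length proteins also differ at the carboxy terminus, which this simple comparison does not address.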
[0291] In some embodiments of the methods of the disclosure, the
wild type human MMP7 gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NM_002423.4):
TABLE-US-00061 (SEQ ID NO: 59) 1 gaaaacacca aatcaaccat aggtccaaga
acaattgtct ctggacggca gctatgcgac 61 tcaccgtgct gtgtgctgtg
tgcctgctgc ctggcagcct ggccctgccg ctgcctcagg 121 aggcgggagg
catgagtgag ctacagtggg aacaggctca ggactatctc aagagatttt 181
atctctatga ctcagaaaca aaaaatgcca acagtttaga agccaaactc aaggagatgc
241 aaaaattctt tggcctacct ataactggaa tgttaaactc ccgcgtcata
gaaataatgc 301 agaagcccag atgtggagtg ccagatgttg cagaatactc
actatttcca aatagcccaa 361 aatggacttc caaagtggtc acctacagga
tcgtatcata tactcgagac ttaccgcata 421 ttacagtgga tcgattagtg
tcaaaggctt taaacatgtg gggcaaagag atccccctgc 481 atttcaggaa
agttgtatgg ggaactgctg acatcatgat tggctttgcg cgaggagctc 541
atggggactc ctacccattt gatgggccag gaaacacgct ggctcatgcc tttgcgcctg
601 ggacaggtct cggaggagat gctcacttcg atgaggatga acgctggacg
gatggtagca 661 gtctagggat taacttcctg tatgctgcaa ctcatgaact
tggccattct ttgggtatgg 721 gacattcctc tgatcctaat gcagtgatgt
atccaaccta tggaaatgga gatccccaaa 781 attttaaact ttcccaggat
gatattaaag gcattcagaa actatatgga aagagaagta 841 attcaagaaa
gaaatagaaa cttcaggcag aacatccatt cattcattca ttggattgta 901
tatcattgtt gcacaatcag aattgataag cactgttcct ccactccatt tagcaattat
961 gtcacccttt tttattgcag ttggtttttg aatgtctttc actcctttta
aggataaact 1021 cctttatggt gtgactgtgt cttattcatc tatacttgca
gtgggtagat gtcaataaat 1081 gttacataca caaataaata aaatgtttat
tccatggtaa atttaaaaaa aaaaaaaaaa 1141 aaaaaaaaaa aaa
[0292] In some embodiments of the methods of the disclosure, the
wild type human MMP7 gene of the disclosure consists of or
comprises the amino acid sequence (Genbank Accession number:
NP_002414.1):
TABLE-US-00062 (SEQ ID NO: 60) 1 mrltvlcavc llpgslalpl pqeaggmsel
qweqaqdylk rfylydsetk nansleaklk 61 emqkffglpi tgmlnsrvie
imqkprcgvp dvaeyslfpn spkwtskvvt yrivsytrdl 121 phitvdrlvs
kalnmwgkei plhfrkvvwg tadimigfar gahgdsypfd gpgntlahaf 181
apgtglggda hfdederwtd gsslginfly aathelghsl gmghssdpna vmyptygngd
241 pqnfklsqdd ikgiqklygk rsnsrkk
[0293] In some embodiments of the methods of the disclosure, the
wild type human TERC gene of the disclosure consists of or
comprises the nucleic acid sequence (Genbank Accession number:
NR_001566.1):
TABLE-US-00063 (SEQ ID NO: 61) 1 gggttgcgga gggtgggcct gggaggggtg
gtggccattt tttgtctaac cctaactgag 61 aagggcgtag gcgccgtgct
tttgctcccc gcgcgctgtt tttctcgctg actttcagcg 121 ggcggaaaag
cctcggcctg ccgccttcca ccgttcattc tagagcaaac aaaaaatgtc 181
agctgctggc ccgttcgccc ctcccgggga cctgcggcgg gtcgcctgcc cagcccccga
241 accccgcctg gaggccgcgg tcggcccggg gcttctccgg aggcacccac
tgccaccgcg 301 aagagttggg ctctgtcagc cgcgggtctc tcgggggcga
gggcgaggtt caggcctttc 361 aggccgcagg aagaggaacg gagcgagtcc
ccgcgcgcgg cgcgattccc tgagctgtgg 421 gacgtgcacc caggactcgg
ctcacacatg c
Definitions
[0294] The following definitions are included for the purpose of
understanding the present subject matter and for constructing the
appended patent claims. Abbreviations used herein have their
conventional meaning within the chemical and biological arts.
[0295] Unless defined otherwise, technical and scientific terms
used herein have the same meaning as commonly understood by a
person of ordinary skill in the art. Any methods, devices and
materials similar or equivalent to those described herein can be
used in the practice of this disclosure. The following definitions
are provided to facilitate understanding of certain terms used
frequently herein and are not meant to limit the scope of the
present disclosure.
[0296] As used herein, the term "FILD" refers to fibrotic
interstitial lung disease.
[0297] As used herein, the term "FIP" refers to Familial
Interstitial Pneumonia.
[0298] As used herein, the term "HRCT" refers to high-resolution
computed tomography.
[0299] As used herein, the term "ILA" refers to asymptomatic
interstitial lung abnormalities.
[0300] As used herein, the term "IPF" refers to idiopathic
pulmonary fibrosis.
[0301] As used herein, the term "PBMC" refers to peripheral blood
mononuclear cell.
[0302] As used herein, the term "alleviate" is meant to describe a
process by which the severity of a sign or symptom of a disorder is
decreased. Importantly, a sign or symptom can be alleviated without
being eliminated. In a preferred embodiment, the administration of
pharmaceutical compositions disclosed herein leads to the
elimination of a sign or symptom; however, elimination is not
required. Effective dosages are expected to decrease the severity
of a sign or symptom. A sign is an objective indication of a
medical condition that is observable or detectable by a medical
professional or lay person (e.g., a family member). For example,
with respect to fibrotic pulmonary disease, signs include, but are
not limited to, changes in body weight, changes in body temperature,
and the presence of a fibrotic lesion in one or both lungs
detectable by radiography.
[0303] A symptom is an indication of disease that may be a sign but
may also be exclusively observable or subjectively experienced by
the subject (for example, with respect to fibrotic pulmonary
disease, symptoms may include, but are not limited to, a dry or
hacking cough, a sore throat, a tight chest, shortness of breath,
and a feeling of exhaustion or malaise).
[0304] In one aspect, the terms "co-administered" and
"co-administration" as relating to a subject refer to administering
to the subject a compound of the invention or salt thereof along
with a compound that may also treat the disorders or diseases
contemplated within the invention. In one embodiment, the
co-administered compounds are administered separately, or in any
kind of combination as part of a single therapeutic approach. The
co-administered compound may be formulated in any kind of
combinations as mixtures of solids and liquids under a variety of
solid, gel, and liquid formulations, and as a solution.
[0305] As used herein, the term "composition" or "pharmaceutical
composition" refers to a mixture of at least one compound useful
within the disclosure with a pharmaceutically acceptable carrier.
The pharmaceutical composition facilitates administration of the
compound to a patient or subject. Multiple techniques of
administering a compound exist in the art including, but not
limited to, intravenous, oral, aerosol, parenteral, ophthalmic,
nasal, pulmonary and topical administration.
[0306] A "disease" as used herein is a state of health of an animal
or subject wherein the animal or subject cannot maintain
homeostasis, and wherein if the disease is not ameliorated then the
animal's or subject's health continues to deteriorate.
[0307] A "disorder" as used herein in an animal is a state of
health in which the animal or subject is able to maintain
homeostasis, but in which the animal's or subject's state of health
is less favorable than it would be in the absence of the disorder.
Left untreated, a disorder does not necessarily cause a further
decrease in the animal's or subject's state of health.
[0308] As used herein, the terms "effective amount,"
"pharmaceutically effective amount" and "therapeutically effective
amount" refer to a nontoxic but sufficient amount of an agent to
provide the desired biological result. That result may be reduction
and/or alleviation of the signs, symptoms, or causes of a disease,
or any other desired alteration of a biological system. An
appropriate therapeutic amount in any individual case may be
determined by one of ordinary skill in the art using routine
experimentation.
[0309] As used herein, the term "fibrotic lung disease" or "fibroid
lung disease" or "pulmonary fibrosis" or "scarring of the lung"
refers to a group of diseases characterized by the formation or
development of excess fibrous connective tissue (fibrosis) in the
lungs. Symptoms of pulmonary fibrosis are mainly: shortness of
breath, particularly with exertion; chronic dry, hacking cough;
fatigue and weakness; chest discomfort; and loss of appetite and
rapid weight loss. Pulmonary fibrosis may be a secondary effect of
other diseases, most of them being classified as interstitial lung
diseases, such as autoimmune disorders, viral infections or other
microscopic injuries to the lung. Pulmonary fibrosis can also
appear without any known cause ("idiopathic"). Idiopathic pulmonary
fibrosis is a diagnosis of exclusion of a characteristic set of
histologic/pathologic features known as usual interstitial
pneumonia (UIP).
[0310] Diseases and conditions that may cause pulmonary fibrosis as
a secondary effect include: inhalation of environmental and
occupational pollutants (asbestosis, silicosis and gas exposure);
hypersensitivity pneumonitis, most often resulting from inhaling
dust contaminated with bacterial, fungal, or animal products;
cigarette smoking; connective tissue diseases such as rheumatoid
arthritis, SLE, scleroderma, sarcoidosis and Wegener's
granulomatosis; infections; medications such as amiodarone,
bleomycin (pingyangmycin), busulfan, methotrexate, apomorphine and
nitrofurantoin; and radiation therapy to the chest.
[0311] As used herein, a "subject in need thereof" is a subject
suffering from fibrotic lung disease, or a subject having an
increased risk of developing fibrotic lung disease relative to the
population at large. For example, the subject is a patient who is,
or is about to be, administered an effective amount of a
therapeutic agent. For example, the subject
is asymptomatic and is at risk of developing the fibrotic lung
disease. A "subject" includes a mammal. The mammal can be e.g., any
mammal, e.g., a human, primate, bird, mouse, rat, fowl, dog, cat,
cow, horse, goat, camel, sheep or pig. Preferably, the mammal is a
human.
[0312] As used herein, the term "pharmaceutically acceptable"
refers to a material, such as a carrier or diluent, which does not
abrogate the biological activity or properties of the compound, and
is relatively non-toxic, i.e., the material may be administered to
an individual without causing undesirable biological effects or
interacting in a deleterious manner with any of the components of
the composition in which it is contained.
[0313] As used herein, the term "pharmaceutically acceptable
carrier" means a pharmaceutically acceptable material, composition
or carrier, such as a liquid or solid filler, stabilizer,
dispersing agent, suspending agent, diluent, excipient, thickening
agent, solvent or encapsulating material, involved in carrying or
transporting a compound useful within the invention within or to
the patient such that it may perform its intended function.
Typically, such constructs are carried or transported from one
organ, or portion of the body, to another organ, or portion of the
body. Each carrier must be "acceptable" in the sense of being
compatible with the other ingredients of the formulation, including
the compound useful within the invention, and not injurious to the
patient. Some examples of materials that may serve as
pharmaceutically acceptable carriers include: sugars, such as
lactose, glucose and sucrose; starches, such as corn starch and
potato starch; cellulose, and its derivatives, such as sodium
carboxymethyl cellulose, ethyl cellulose and cellulose acetate;
powdered tragacanth; malt; gelatin; talc; excipients, such as cocoa
butter and suppository waxes; oils, such as peanut oil, cottonseed
oil, safflower oil, sesame oil, olive oil, corn oil and soybean
oil; glycols, such as propylene glycol; polyols, such as glycerin,
sorbitol, mannitol and polyethylene glycol; esters, such as ethyl
oleate and ethyl laurate; agar; buffering agents, such as magnesium
hydroxide and aluminum hydroxide; surface active agents; alginic
acid; pyrogen-free water; isotonic saline; Ringer's solution; ethyl
alcohol; phosphate buffer solutions; and other non-toxic compatible
substances employed in pharmaceutical formulations.
[0314] Pharmaceutically acceptable carriers of the disclosure
include, but are not limited to, pharmaceutically acceptable
materials, compositions or carriers, such as a liquid or solid
fillers, stabilizers, dispersing agents, suspending agents,
diluents, excipients, thickening agents, solvents or encapsulating
materials, involved in carrying or transporting a compound useful
within the invention within or to the patient such that it may
perform its intended function.
[0315] Suitable forms for administration include forms suitable for
systemic administration and oral administration, for example by a
capsule or tablet. Once formulated, the compositions of the
disclosure can be administered directly to the subject.
[0316] The term "prevent," "preventing" or "prevention," as used
herein, means avoiding or delaying the onset of symptoms associated
with a disease or condition in a subject that has not developed
such symptoms at the time the administering of an agent or compound
commences.
Compounds and Compositions
[0317] In some embodiments, compounds known to be useful in
treating pulmonary fibrosis are useful within the methods of the
invention. Non-limiting examples of such compounds are pirfenidone
(5-methyl-1-phenylpyridin-2-one, or a salt or solvate thereof) and
nintedanib (methyl
(3Z)-3-{[(4-{methyl[(4-methylpiperazin-1-yl)acetyl]amino}phenyl)amino](phenyl)methylidene}-2-oxo-2,3-dihydro-1H-indole-6-carboxylate,
or a salt or solvate thereof).
[0318] In some embodiments, the subject identified as having MUC5B
promoter polymorphism rs35705950 is administered a compound
contemplated within the disclosure. In some embodiments, the
subject is a mammal. In other embodiments, the mammal is a
human.
Administration/Dosage/Formulations
[0319] The regimen of administration may affect what constitutes an
effective amount. The therapeutic formulations may be administered
to the subject either prior to or after the onset of a disease or
disorder contemplated in the invention. Further, several divided
dosages, as well as staggered dosages may be administered daily or
sequentially, or the dose may be continuously infused, or may be a
bolus injection. Further, the dosages of the therapeutic
formulations may be proportionally increased or decreased as
indicated by the exigencies of the therapeutic or prophylactic
situation.
[0320] Administration of the compositions of the present disclosure
to a patient, preferably a mammal, more preferably a human, may be
carried out using known procedures, at dosages and for periods of
time effective to treat a disease or disorder contemplated in the
invention. An effective amount of the therapeutic compound
necessary to achieve a therapeutic effect may vary according to
factors such as the state of the disease or disorder in the
patient; the age, sex, and weight of the patient; and the ability
of the therapeutic compound to treat a disease or disorder
contemplated in the invention. Dosage regimens may be adjusted to
provide the optimum therapeutic response. For example, several
divided doses may be administered daily or the dose may be
proportionally reduced as indicated by the exigencies of the
therapeutic situation. A non-limiting example of an effective dose
range for a therapeutic compound of the invention is from about 1
to about 5,000 mg/kg of body weight per day. One of ordinary skill in
the art would be able to study the relevant factors and make the
determination regarding the effective amount of the therapeutic
compound without undue experimentation. Actual dosage levels of the
active ingredients in the pharmaceutical compositions of this
invention may be varied so as to obtain an amount of the active
ingredient that is effective to achieve the desired therapeutic
response for a particular patient, composition, and mode of
administration, without being toxic to the patient.
[0321] The precise therapeutically effective amount for a human
subject will depend upon the severity of the disease state, the
general health of the subject, the age, weight and gender of the
subject, diet, time and frequency of administration, drug
combination(s), reaction sensitivities and tolerance/response to
therapy. This amount can be determined by routine experimentation
and is within the judgement of the clinician.
[0322] A medical doctor, e.g., physician or veterinarian, having
ordinary skill in the art may readily determine and prescribe the
effective amount of the pharmaceutical composition required. For
example, the physician or veterinarian could start doses of the
compounds of the invention employed in the pharmaceutical
composition at levels lower than that required in order to achieve
the desired therapeutic effect and gradually increase the dosage
until the desired effect is achieved.
[0323] A suitable dose of a compound of the disclosure may be in
the range of from about 0.01 mg to about 5,000 mg per day, such as
from about 0.1 mg to about 1,000 mg, for example, from about 1 mg
to about 500 mg, such as about 5 mg to about 250 mg per day. The
dose may be administered in a single dosage or in multiple dosages,
for example from 1 to 4 or more times per day. When multiple
dosages are used, the amount of each dosage may be the same or
different. For example, a dose of 1 mg per day may be administered
as two 0.5 mg doses, with about a 12-hour interval between
doses.
[0324] In some embodiments of the methods of the disclosure, the
therapeutic agent comprises pirfenidone. In some embodiments, the
effective dosage is administered orally as a capsule or a tablet.
In some embodiments, including those embodiments wherein the
therapeutic agent comprises pirfenidone, the effective dosage is
about 2400 mg/day. In some embodiments, the effective dosage is
administered according to an escalating dosage regimen. In some
embodiments, including those embodiments wherein the therapeutic
agent comprises pirfenidone, the escalating dosage regimen
comprises (a) administering to the subject about 800 mg of
pirfenidone per day for a first week; (b) administering to the
subject about 1600 mg of pirfenidone per day for a second week; and
(c) administering to the subject about 2400 mg of pirfenidone per
day for the remainder of the treatment. In some embodiments,
including those embodiments wherein the therapeutic agent comprises
pirfenidone, the escalating dosage regimen comprises (a)
administering to the subject a capsule or tablet comprising about
250 mg of pirfenidone three times a day for a first week; (b)
administering to the subject two capsules or tablets comprising
about 250 mg of pirfenidone three times a day for a second week;
and (c) administering to the subject three capsules or tablets
comprising about 250 mg of pirfenidone three times a day for the
remainder of the treatment. In some embodiments of the escalating
dosage regimen, the capsule or tablet comprises 267 mg of
pirfenidone.
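The arithmetic behind the escalating pirfenidone regimen can be checked with a short sketch. It assumes the 267 mg capsule strength named above and three doses per day; the function and constant names are illustrative and are not part of the disclosure.

```python
# Illustrative sketch only: names and structure are assumptions.
CAPSULE_MG = 267      # capsule strength named in the text
DOSES_PER_DAY = 3     # "three times a day"


def capsules_per_dose(week: int) -> int:
    """Capsules taken at each dose: 1 in week 1, 2 in week 2, 3 thereafter."""
    if week < 1:
        raise ValueError("week numbering starts at 1")
    return min(week, 3)


def daily_dose_mg(week: int, capsule_mg: int = CAPSULE_MG) -> int:
    """Total pirfenidone per day (mg) for the given treatment week."""
    return capsules_per_dose(week) * DOSES_PER_DAY * capsule_mg


for week in (1, 2, 3, 10):
    print(week, daily_dose_mg(week))   # 801, 1602, 2403, 2403 mg/day
```

With 267 mg capsules the daily totals come to 801, 1602 and 2403 mg, which correspond to the "about 800", "about 1600" and "about 2400" mg per day steps recited above.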
[0325] In some embodiments of the methods of the disclosure, the
therapeutic agent comprises nintedanib. In some embodiments, the
effective dosage is administered orally as a capsule or a tablet.
In some embodiments, including those embodiments wherein the
therapeutic agent comprises nintedanib, the effective dosage is
about 300 mg/day. In some embodiments, the effective dosage is
about 150 mg administered twice per day, wherein the daily doses
are administered about 12 hours apart from one another. In some
embodiments, including those embodiments wherein the therapeutic
agent comprises nintedanib, the effective dosage is about 200
mg/day. In some embodiments, the effective dosage is about 100 mg
administered twice per day, wherein the daily doses are
administered about 12 hours apart from one another. In some
embodiments, including those embodiments wherein the therapeutic
agent comprises nintedanib, the effective dosage is administered
according to a modified or interrupted dosage regimen. In some
embodiments, the modified or interrupted dosage regimen comprises
(a) administering to the subject about 300 mg of nintedanib per day
until the subject presents an elevated level of liver enzymes
compared to a control level of liver enzymes; (b) administering to
the subject about 200 mg of nintedanib per day until the subject
presents the control level of liver enzymes; and (c) administering
to the subject about 300 mg of nintedanib per day for the remainder
of the treatment; wherein the control level of liver enzymes is a
level detected in the subject prior to an initiation of the
treatment. In some embodiments, including those embodiments wherein
the therapeutic agent comprises nintedanib, the modified or
interrupted regimen comprises (a) administering to the subject a
capsule or tablet comprising about 150 mg of nintedanib twice per
day until the subject presents an elevated level of liver enzymes
compared to a control level of liver enzymes; (b) administering to
the subject a capsule or tablet comprising about 100 mg of
nintedanib twice per day until the subject presents the control
level of liver enzymes; and (c)
administering to the subject a capsule or tablet comprising about
150 mg of nintedanib twice per day for the remainder of the
treatment; wherein the control level of liver enzymes is a level
detected in the subject prior to an initiation of the
treatment.
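The three-step modified nintedanib regimen, steps (a) to (c) with daily doses of about 300, 200 and 300 mg, can be read as a small state machine. The sketch below is one literal reading under stated assumptions: liver enzyme status is reduced to a single hypothetical boolean per assessment (elevated or not, relative to the pre-treatment control level), and all names are illustrative.

```python
# Illustrative sketch only: the boolean "elevated" flag and all names are
# assumptions; the text does not define how elevation is measured.
FULL_DOSE_MG = 300      # about 150 mg twice per day
REDUCED_DOSE_MG = 200   # about 100 mg twice per day


def daily_doses(enzymes_elevated):
    """Return the daily dose (mg) chosen at each assessment.

    enzymes_elevated: sequence of booleans, True when liver enzymes are
    elevated compared with the pre-treatment control level.
    """
    phase = "a"                       # (a) full dose until enzymes rise
    doses = []
    for elevated in enzymes_elevated:
        if phase == "a" and elevated:
            phase = "b"               # (b) reduced dose until back to control
        elif phase == "b" and not elevated:
            phase = "c"               # (c) full dose for the remainder
        doses.append(REDUCED_DOSE_MG if phase == "b" else FULL_DOSE_MG)
    return doses


print(daily_doses([False, False, True, True, False, False]))
# [300, 300, 200, 200, 300, 300]
```

As literally recited, step (c) continues at the full dose for the remainder of the treatment, so the sketch does not return to step (b) once step (c) has been reached.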
[0326] In some embodiments, the compositions of the invention are
formulated using one or more pharmaceutically acceptable excipients
or carriers. In one embodiment, the pharmaceutical compositions of
the invention comprise a therapeutically effective amount of a
compound of the invention and a pharmaceutically acceptable
carrier.
[0327] The carrier may be a solvent or dispersion medium
containing, for example, water, ethanol, polyol (for example,
glycerol, propylene glycol, and liquid polyethylene glycol, and the
like), suitable mixtures thereof, and vegetable oils. The proper
fluidity may be maintained, for example, by the use of a coating
such as lecithin, by the maintenance of the required particle size
in the case of dispersion and by the use of surfactants. Prevention
of the action of microorganisms may be achieved by various
antibacterial and antifungal agents, for example, parabens,
chlorobutanol, phenol, ascorbic acid, thimerosal and the like. In
many cases, it is preferable to include isotonic agents, for
example, sugars, sodium chloride, or polyalcohols such as mannitol
and sorbitol, in the composition.
[0328] It is also noted that the term "comprising" is intended to
be open and permits but does not require the inclusion of
additional elements or steps. When the term "comprising" is used
herein, the terms "consisting essentially of" and "consisting of"
are thus also encompassed and disclosed. Throughout the
description, where compositions or combinations are described as
having, including, or comprising specific components or steps, it
is contemplated that compositions or combinations also consist
essentially of, or consist of, the recited components. Similarly,
where methods or processes are described as having, including, or
comprising specific process steps, the processes also consist
essentially of, or consist of, the recited processing steps.
[0329] All publications and patent documents cited herein are
incorporated herein by reference as if each such publication or
document was specifically and individually indicated to be
incorporated herein by reference.
EXAMPLES
[0330] In order that the invention disclosed herein may be more
efficiently understood, examples are provided below. It should be
understood that these examples are for illustrative purposes only
and are not to be construed as limiting the invention in any
manner.
Example 1: Genetic Background of Asymptomatic Siblings of FIP
Subjects
[0331] Asymptomatic siblings (>50 years old) of patients with
established FIP underwent HRCT scan of the chest. HRCT scans were
assessed for FILD by blinded thoracic radiologists; when possible,
specific radiographic patterns were identified. PBMC RNA and DNA
were isolated. Genotyping for rs35705950 and microarray analysis
were performed (SurePrint G3 Human Gene Expression Microarray).
Data were analyzed using Partek Genomics Suite and RStudio.
Four hundred eighty-eight FIP siblings from 271 families were
evaluated; 25 HRCT scans were excluded due to technical
inadequacy, leaving 463 to be interpreted. Of these, 19% (n=88) met
criteria for FILD. A subset of the positive FILD scans (n=58) were
evaluated for specific interstitial patterns: the predominant
radiographic finding was Usual Interstitial Pneumonia (UIP),
documented as possible (n=37), probable (n=6), or definite (n=5) in
82.8% of these cases. DNA was available for 443 subjects (358
without and 85 with FILD). The minor allele (T) frequency (MAF) of
rs35705950 was higher among those with evidence of FILD (MAF=0.29)
than among those with normal appearing HRCT scans (MAF=0.21,
p=0.005). The rs35705950 variant was associated with the presence
of FILD (OR=1.90, 95% CI 1.10-3.30, p=0.02), and FILD was
associated with age (OR=1.09, 95% CI 1.06-1.12, p=7.24×10^-9),
male sex (OR=1.81, 95% CI 1.04-3.16, p=0.04), and history of
smoking (OR=1.94, 95% CI 1.11-3.40, p=0.02). Microarray analysis on
PBMC RNA from 40 subjects with FILD and 105 unaffected siblings
revealed 1,272 differentially expressed genes (FDR<0.05,
fold-change>2); hierarchical clustering performed on the top 194
differentially expressed probes illustrates segregation of FILD
subjects from unaffected siblings (FIG. 1).
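The counts and percentages reported in Example 1 are internally consistent, as the following short check shows. It uses only the figures quoted above; the variable names are illustrative.

```python
# Counts quoted in Example 1; variable names are illustrative only.
siblings_scanned = 488
scans_excluded = 25
scans_interpreted = siblings_scanned - scans_excluded

fild_positive = 88
pattern_subset = 58
uip_possible, uip_probable, uip_definite = 37, 6, 5
uip_total = uip_possible + uip_probable + uip_definite

dna_without_fild, dna_with_fild = 358, 85


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(scans_interpreted)                               # 463
print(percent(fild_positive, scans_interpreted))       # 19.0
print(uip_total, percent(uip_total, pattern_subset))   # 48 82.8
print(dna_without_fild + dna_with_fild)                # 443
```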
Example 2: Role of MUC5B in Pathogenesis of IPF
[0332] Common genetic variants play major and similar roles in the
development of both familial and sporadic IPF (Table 3), indicating
a similar etiology for familial and sporadic IPF. A common
gain-of-function MUC5B promoter variant, rs35705950, is a strong risk
factor (genetic and otherwise), accounting for at least 30% of the
total risk of developing IPF (10), and has been confirmed in 10
independent studies, including a GWAS (OR for T (minor) allele=4.51;
95% CI=3.91-5.21; P=7.21×10^-95). rs35705950 may be used to
identify individuals with PrePF and is predictive of radiographic
progression of PrePF. MUC5B promoter variant rs35705950 is present
in over 50% of non-Hispanic white (NHW) patients with IPF and is
associated with unique biological and clinical IPF phenotypes.
PrePF can be predicted using a combination of clinical risk
factors, the MUC5B promoter variant rs35705950, and a panel of
biomarkers.
TABLE 1 Common IPF risk variants identified by targeted sequencing
of risk loci in 3,642 IPF cases and 4,442 unaffected controls

Chrm | Common Variant | Nearest Gene | Annotation^a | Minor Allele | MAF in cases | OR Aa vs AA (95% CI) | OR aa vs AA (95% CI) | P^b
3q26 | rs2293607 | TERC | 3' UTR | C | 0.2999 | 1.30 (1.18-1.43) | 1.79 (1.49-2.15) | 9.11×10^-13
4q22 | rs2609260 | FAM13A | Intronic | C | 0.2289 | 1.35 (1.22-1.50) | 1.96 (1.56-2.47) | 1.03×10^-13
5p15 | rs4449583 | TERT | Intronic | T | 0.2641 | 0.68 (0.62-0.75) | 0.46 (0.39-0.55) | 2.67×10^-25
6p24 | rs2076295 | DSP | Intronic | G | 0.5428 | 1.27 (1.14-1.42) | 2.08 (1.83-2.37) | 1.11×10^-29
7q22 | rs6963345 | ZKSCAN1 | Intronic | A | 0.4444 | 1.35 (1.22-1.50) | 1.73 (1.51-1.99) | 1.89×10^-15
10q24 | rs2488000 | OBFC1 | Intronic | T | 0.08 | 0.70 (0.62-0.79)^c | | 7.13×10^-9
11p15 | rs35705950 | MUC5B | Promoter | T | 0.3533 | 5.45 (4.91-6.06) | 18.68 (13.34-6.17) | 9.60×10^-295
13q34 | rs1278769 | AK025511 | 3' UTR | A | 0.1996 | 0.77 (0.70-0.85) | 0.69 (0.56-0.86) | 7.48×10^-8
15q15 | rs35700143 | IVD | -- | C | 0.4118 | 0.76 (0.68-0.84) | 0.63 (0.55-0.71) | 3.44×10^-12
19p13 | rs12610495 | DPP9 | Intronic | G | 0.3398 | 1.22 (1.11-1.35) | 1.59 (1.36-1.87) | 3.11×10^-9

OR, odds ratio. The minor allele is defined as the minor allele in
the combined case and control group. ^a Based on SNPDOC; ^b P value
adjusted for sex; ^c OR resulting from dominant test.
[0333] MUC5B is predicted to be involved in the pathogenesis of IPF.
FIG. 5 shows that the MUC5B promoter variant is associated with
enhanced MUC5B expression in both unaffected subjects and
patients with IPF. In IPF, MUC5B message and protein are
expressed in bronchoalveolar epithelia (FIG. 6) and honeycomb
cysts. In mice, the concentration of Muc5b is directly related to
the fibroproliferative response to bleomycin (FIG. 7), Muc5b
protein is expressed in the injured lung following bleomycin
challenge, and enhanced production of Muc5b in mice appears to
initiate endoplasmic reticulum (ER) stress in peripheral airways
(FIGS. 8 and 9). Preliminary studies also show that mucociliary
clearance is decreased in mice that over-express Muc5b
(SFTPC-Muc5b^Tg) and in humans with IPF (FIG. 10).
[0334] Interstitial lung abnormalities on HRCT scans are seen in
asymptomatic relatives of patients with familial IPF and in the
elderly. As in patients with IPF, interstitial lung
abnormalities in asymptomatic subjects are associated with advanced
age, cigarette smoking, reduced lung volume and decreased exercise
tolerance. Moreover, the MUC5B promoter variant rs35705950 is
associated with a higher prevalence of interstitial lung
abnormalities on HRCT scan and is predictive of radiographic
progression, suggesting that interstitial lung abnormalities on
HRCT scan are a precursor of IPF. However, interstitial lung
abnormalities are not specific and include non-fibrotic and
fibrotic HRCT defects, and consequently, the prevalence of
interstitial lung abnormalities (>5% in the general population
≥50 years of age) is orders of magnitude higher than that of IPF.
[0335] To address the non-specificity of interstitial lung
abnormalities, a novel entity, Preclinical Pulmonary Fibrosis
(PrePF), was used. PrePF is reported more frequently among smokers
and in families with two or more cases of pulmonary fibrosis. In
the Framingham population, data show that PrePF is present in 1.8%
of the general population ≥50 years of age (in contrast,
interstitial lung abnormalities were seen in 6.7%) and that the
MUC5B promoter variant rs35705950 is predictive of those with PrePF
(OR=6.3 per allele [95% CI 3.1-12.7]). As shown herein, among
asymptomatic first-degree family members of patients with familial
interstitial pneumonia (FIP), 14% have fibrotic interstitial changes
on CT scan and 35% have interstitial abnormalities on transbronchial
biopsy. Moreover, in the Framingham population, it is shown that
rs35705950 is predictive of radiographic progression of PrePF
(OR=2.8 per allele [95% CI 1.8-4.4]), which is associated with a
greater FVC decline (P=0.0001) and an increased risk of death
(HR=3.7 [95% CI 1.3-10.7]; P=0.02), indicating that in addition to
having radiographic features of IPF, PrePF has similar risk factors
(age, gender, smoking, and MUC5B variant) and a progressive clinical
course. While the MUC5B promoter variant is predictive of PrePF,
rs35705950 is present in ≈19% (minor allele frequency
(MAF)=0.09) of the NHW population; however, IPF occurs infrequently
(<0.1%). Thus, additional biomarkers may be used in combination
with rs35705950 to identify PrePF within at-risk populations.
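The relationship between a minor allele frequency and the share of a population carrying at least one copy of the variant can be sketched as follows. This assumes Hardy-Weinberg equilibrium, which the text does not state, and the function name is illustrative.

```python
# Illustrative sketch assuming Hardy-Weinberg equilibrium (an assumption
# not stated in the text); the function name is hypothetical.
def carrier_frequency(maf: float) -> float:
    """Fraction of individuals carrying one or two copies of the minor allele."""
    if not 0.0 <= maf <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return 1.0 - (1.0 - maf) ** 2


for maf in (0.09, 0.10):
    print(maf, round(100 * carrier_frequency(maf), 1))
```

Under this assumption a MAF of 0.09 corresponds to roughly 17% carriers and a MAF of 0.10 to 19%, so the ≈19% figure quoted above is of the expected order for a MAF near 0.09 to 0.10.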
[0336] The data provided herein suggest that 1) IPF is
under-diagnosed; 2) PrePF is prevalent in at-risk populations; 3)
approximately 75% of the cases of PrePF are progressive; 4)
radiographic progression of PrePF is associated with increased
morbidity and mortality; and 5) MUC5B variant rs35705950,
peripheral blood biomarkers, clinical/biological, and radiographic
screening should be useful in identifying those with PrePF (FIG.
11). While IPF takes years to develop, most patients with IPF are
diagnosed in the advanced stage when little can be done to
influence survival. Once the lung has undergone remodeling, the
non-compliant, stiff lung matrix causes additional remodeling
through activation of myofibroblasts, resulting in a feed-forward
loop of lung remodeling. Earlier diagnosis of IPF detects subjects
with a lower burden of fibrotic lung disease.
[0337] This disclosure provides a strategic approach to screening
for early forms of IPF (FIG. 11). While the
MUC5B promoter variant is predictive of PrePF (defined as chest
HRCT consistent with probable or definite fibrosis (e.g., bilateral
subpleural reticular changes, honeycombing, or traction
bronchiectasis) occurring in asymptomatic subjects ≥40 years
of age that emerge from at-risk populations), the MUC5B promoter
variant is present in ≈19% of the NHW population and IPF
occurs infrequently (<0.1%). To study at-risk populations
(asymptomatic siblings ≥40 years of age of patients with
familial or sporadic IPF), identification of genetic variants and
biomarkers that increase the yield of patients with PrePF is used
to establish screening tools and approaches that identify early
stages of IPF. This approach changes the way IPF is diagnosed and
treated, and is critical to developing interventions to prevent
PrePF progression to established IPF. The methods provided in this
disclosure fundamentally alter the clinical approach to patients
with IPF from palliative to preventive (FIG. 10).
Example 3: Predictive Biomarker Profile for Established IPF
[0338] To address the development of a peripheral blood biomarker
profile for IPF, an assay of the expression levels of >3700
plasma proteins was performed on plasma from 70 patients with
established IPF and 70 controls. After controlling for multiple
comparisons and appropriate co-variables, 57 proteins were
up-regulated >1.5-fold (including surfactant proteins, MMPI, and
C3) in the plasma of patients with IPF and 12 were significantly
down-regulated (FIG. 2).
Example 4: Predictive Biomarker Profile for Early IPF
[0339] To evaluate a predictive biomarker profile in cases of
preclinical pulmonary fibrosis (PrePF) derived from families with
familial IPF (≥2 cases of IPF in a family), HRCT scans were
performed on 496 asymptomatic family members ≥40 years of
age previously phenotyped as unaffected from 263 families with
familial IPF. PrePF, consistent with the operational definition
(defined as abnormalities on chest HRCT consistent with probable or
definite fibrosis (e.g., bilateral subpleural reticular changes,
honeycombing, or traction bronchiectasis) occurring in asymptomatic
subjects ≥40 years that emerge from at-risk populations),
was present in 77 (15.5%) of 496 asymptomatic individuals from
families with familial IPF. The minor allele frequency (MAF) of the
MUC5B promoter variant was 0.29 in those with PrePF versus 0.21 in
those without fibrosis (P=0.025). Preliminary analysis of PBMC gene
expression profiles evaluated by microarrays from 38 cases of PrePF
and 187 subjects without fibrosis identified 16 genes significantly
differentially expressed between the two groups (p-value <0.05
and >1.5 fold change). Among genes differentially expressed in
PrePF are those involved in innate immunity and inflammatory
responses (SIGLEC14), antibacterial effects (ADM2), growth and
motility (TSPAN5), and protein phosphorylation (CAMKK1). Moreover,
PBMC gene expression appears to contribute to the ability to
predict PrePF in an at-risk population (FIG. 3).
[0340] Additionally, RNA-sequencing analysis was performed on 40
PrePF subjects and 80 subjects with a normal HRCT scan.
PolyA-enriched libraries were prepared using Illumina TruSeq
reagents and sequenced, multiplexing 10 samples on each lane of a
HiSeq 4000, to obtain on average 35-40 million reads per sample. This high
coverage allows for the consideration of a broad dynamic range of
mRNA transcripts for biomarker selection. Platform selection of
serum and plasma samples from the same subjects are used for
proteomic analysis.
Example 5: Biomarker Identification
[0341] To examine for association between each biomarker and
PrePF, a multivariable logistic regression model for PrePF is
constructed, with biomarkers and covariates considered for inclusion
using a step-wise forward selection procedure. Variables stay in the
model if associated at P.ltoreq.0.01 after adjustment for the
variables already in the model. Protein biomarkers that are
significantly associated with established IPF and the top 20
differentially expressed genes in PrePF are considered for
inclusion in a multivariable model. The number of potential
biomarkers allowed in the joint model is restricted to
approximately 20 given the number of cases of PrePF expected.
Secondarily, interactions between MUC5B genotype and the other
biomarkers are tested for, which allow for the possibility that
different biomarker profiles are diagnostic in IPF patients
with/without the MUC5B risk allele.
Example 6: Predictive Ability of Biomarkers
[0342] To test the predictive value of the combination of
biomarkers associated with PrePF, the observed expression and other
biomarker values from those associated with PrePF in the siblings
of FIP patients are used to obtain the probability, for each
sibling, of having PrePF.
[0343] Following this, receiver operating characteristic
(ROC) curves (see M. S. Pepe et al., Phases of biomarker
development for early detection of cancer. Journal of the National
Cancer Institute 93, 1054-1061 (2001)) are constructed and used to choose the
probability threshold that maximizes the area under the ROC curve.
This probability threshold is used to classify each individual as
predicted to have PrePF or not, allowing calculation of the
sensitivity, specificity, positive predictive value, and negative
predictive value of the predictive model. The properties of the
predictive model(s) in the independent set of siblings of patients
with IPF are evaluated. Different aliquots are run for 10 samples
for each assay each time the assay is run in order to use those
10 samples to evaluate the need for standardization of the absolute
values for each assay over time. Either the raw or standardized
values, for a given model, are used to observe biomarker values in
the PrePF siblings and non-PrePF siblings to obtain the probability
of being in the disease group based on the model parameters
developed using the FIP siblings. The thresholds identified among
the FIP siblings are used to classify each individual as predicted
to have PrePF or not. This categorization allows for the
calculation of the sensitivity, specificity, positive predictive
value, and negative predictive value of the predictive model among
the siblings of independent cases of IPF, for comparison to that
observed in the siblings of FIP cases.
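The classification step described above can be sketched as follows.
This is a minimal illustration only: the function name, the
probabilities, and the 0.5 threshold are hypothetical and are not
taken from the disclosure; the reference status is assumed to be the
HRCT-based PrePF determination.

```python
def classification_metrics(probabilities, has_prepf, threshold):
    """Classify a subject as predicted PrePF when probability >= threshold,
    then summarize agreement with the reference PrePF status."""
    tp = fp = tn = fn = 0
    for p, truth in zip(probabilities, has_prepf):
        predicted = p >= threshold
        if predicted and truth:
            tp += 1
        elif predicted and not truth:
            fp += 1
        elif truth:
            fn += 1
        else:
            tn += 1

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


# Hypothetical model outputs for eight siblings (illustrative values only).
probs = [0.91, 0.72, 0.35, 0.64, 0.10, 0.22, 0.05, 0.48]
truth = [True, True, True, False, False, False, False, False]
metrics = classification_metrics(probs, truth, threshold=0.5)
```

Applying the same function with a threshold fixed in one sibling set
to a second, independent sibling set would give the comparison of
predictive properties described above.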
[0344] Power is calculated to detect differences between those with
and without PrePF assuming 500 siblings and 10% (N=50) with PrePF.
Assuming .alpha.=0.00005 (conservatively correcting for up to 1000
independent tests), we have 80% (90%) power to detect differences
in protein or expression level of 0.74 (0.80) standard deviation
between PrePF and unaffected siblings. These differences are larger
than previously-observed protein and gene-expression levels in IPF
patients and controls (see I. V. Yang et al., The peripheral blood
transcriptome identifies the presence and extent of disease in
idiopathic pulmonary fibrosis. PLoS One 7, e37708 (2012)). With 50
PrePF and 450 unaffected, there is 90% power to bound the
sensitivity of the biomarker-based classification of PrePF with a
margin of error of 11% if the sensitivity is 65%, and 6.5% if the
sensitivity is 95%; the margins of error for 65% and 95%
sensitivity are 4.5% and 2.5%, respectively.
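The power figures above can be approximated with a short sketch. The
disclosure does not state which method was used; the code below
assumes a two-sided two-sample z-test under a normal approximation,
with .alpha.=0.05/1000=0.00005 as the Bonferroni-style threshold. The
function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_group_power(effect_sd, n_cases, n_controls, alpha):
    """Approximate power of a two-sided two-sample z-test to detect a
    mean difference of `effect_sd` standard deviations."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Scale the effect by the effective sample size n1*n2/(n1+n2).
    shift = effect_sd * sqrt(n_cases * n_controls / (n_cases + n_controls))
    # The rejection probability in the opposite tail is negligible
    # here and is ignored.
    return std_normal.cdf(shift - z_crit)


# 50 PrePF and 450 unaffected siblings, as assumed in the text.
power_074 = two_group_power(0.74, 50, 450, 0.00005)
power_080 = two_group_power(0.80, 50, 450, 0.00005)
```

Under these assumptions the two values come out close to the 80% and
90% figures stated above; an exact t-based calculation would differ
slightly.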
Example 7: MUC5B Promoter Variant rs35705950 is a Risk Factor for
Rheumatoid Arthritis--Interstitial Lung Disease
Methods
Study Cohorts
[0345] This study included a discovery cohort and multi-ethnic
replication cohorts. The discovery cohort included patients with
RA, with and without ILD (RA-noILD) as assessed by chest HRCT, and
controls, from the French RA-ILD network. The multi-ethnic
replication cohorts were obtained from six countries (China,
Greece, Japan, Mexico, the Netherlands and United States). This
included patients with RA-ILD and RA-noILD patients, and controls.
All cases fulfilled the 2010 European League Against
Rheumatism-American College of Rheumatology (EULAR-ACR) and/or 1987
ACR revised criteria for RA. The ILD status of patients with RA was
established by chest HRCT images that were centrally reviewed by
experienced readers for each participating cohort. The one
exception was the RA-noILD cases from the USA-1 cohort, whose status
was determined by self-report. The chest HRCT ILD pattern was
classified as UIP, possible UIP or inconsistent with UIP according
to international criteria and all readers were blinded to the
clinical and genetic data. The institutional review boards at each
institution approved all protocols, and all patients provided
written informed consent.
Genotyping
[0346] Genotyping of the MUC5B rs35705950 single nucleotide
polymorphism (SNP) was performed using Taqman Genotyping Assays
(Applied Biosystems, Foster City, Calif., USA) as previously
reported, by direct Sanger sequencing, or by imputation from
genome-wide association study data.
[0347] The additional common IPF risk variants on 3q26, 4q22, 5p15,
6p21.3, 6p24, 7q22, 10q24, 11p15.5, 13q34, 15q14-15, and 19p13 were
genotyped by Taqman qPCR (Thermo Fisher Scientific, California) per
the manufacturer's instructions.
Lung Tissue Analysis
[0348] In order to determine if MUC5B was expressed in RA-ILD lung
tissue, lung tissue was analyzed from nine patients
with RA-ILD undergoing lung transplantation (University of
California, San Francisco) compared to six unaffected controls with
ILD (NHLBI Lung Tissue Research Consortium; https://ltrcpublic.com)
or concordant expression of other relevant markers of pulmonary
fibrosis. The tissue was formalin fixed, paraffin embedded and cut
in 4 um sections. Tissue sections were deparaffinized in xylene,
followed by dehydration in series of ethanol. Following citrate
buffer antigen retrieval, slides were incubated overnight with
primary antibodies against MUC5B (1:4000, Santa Cruz, Dallas,
Tex.). Secondary antibody diluted 1:1000 tagged with HRP (Life
Technologies) was visualized using an Aperio CS2 slide scanner
(Leica, Buffalo Grove, Ill.).
Results
Study Cohorts
[0349] This case-control genetic study included 620 RA-ILD cases,
614 RA-noILD cases and 5448 unaffected controls. The discovery
cohort included 118 RA-ILD cases, 105 RA-noILD cases and 1229
unaffected controls. The multi-ethnic replication sample included
502 RA-ILD, 509 RA-noILD cases and 4219 unaffected controls.
Characteristics of the Discovery Cohort
[0350] As compared with RA-noILD, patients with RA-ILD were more
frequently male, older and more frequently smoked cigarettes (54.7%
versus 36.1%) (FIG. 13). However, after adjusting for sex, the
relationship between RA-ILD and cigarette smoking was no longer
statistically significant (FIG. 13). After adjustment, RA-ILD and
RA-noILD patients did not differ in rheumatoid factor (RF) and/or
anti-citrullinated protein antibody (ACPA) positivity, erosive
status of RA, exposure to methotrexate or the mean RA duration from
diagnosis at inclusion in the cohort. Overall, 41% of patients with
RA-ILD had a UIP or possible UIP HRCT pattern.
MUC5B Promoter Variant and Risk of Rheumatoid Arthritis-Associated
Interstitial Lung Disease
[0351] Comparison of RA-noILD and controls revealed that none of
the cohorts (discovery cohort and multi-ethnic cohorts)
demonstrated a significant difference in the frequency of the MUC5B
promoter variant (FIG. 14; FIG. 16A), suggesting a lack of
association between the MUC5B promoter variant and RA. In the
discovery cohort, the minor allele frequency (MAF) of the MUC5B
promoter variant was 10.9% in unaffected controls and 32.6% in
cases of RA-ILD; this variant was in Hardy-Weinberg equilibrium
(HWE) in both study groups. In the discovery population, after
controlling for sex we detected a significant association between
the MUC5B promoter variant and RA-ILD when compared to non-RA
controls (ORadj=3.8; 95% CI, 2.8 to 5.2; P=9.7.times.10.sup.-17) (FIG.
14). Similar to the discovery population, the MUC5B promoter
variant was significantly over-represented among the
cases of RA-ILD compared to unaffected non-RA controls in all of
the multi-ethnic study case series, except in the two Asian case
series (FIG. 14). Given that the MUC5B promoter variant is
under-represented in Asian populations compared to non-Hispanic
whites (FIG. 14;
www.ncbi.nlm.nih.gov/projects/SNP/snp_refcgi?rs=35705950), a likely
explanation, especially given the consistent point estimates, for
the absence of a significant relationship between the MUC5B
promoter variant and RA-ILD is that the analysis of the two Asian
case series is likely underpowered. The relationship between the
MUC5B promoter variant and RA-ILD in combined multi-ethnic study
case series (ORadj=4.7; 95% CI, 3.9 to 5.8; P=1.3.times.10.sup.-49)
(FIG. 14) (FIG. 16B) validated the observed association between the
MUC5B promoter variant and RA-ILD in the discovery study
population. In addition, the cases of RA-ILD in the study populations
from Greece and USA-1 were not in HWE, suggesting (as has been
observed in cases of IPF 14), that the MUC5B promoter variant
and/or common variants in high or complete linkage disequilibrium
with the MUC5B promoter variant should be considered as causative
in these cases of RA-ILD. For the comparison with non-RA controls,
the best-fitting genetic model for the three study populations
(discovery population, combined multi-ethnic case series, and
combined analysis) for the association of the MUC5B
MUC5B rs35705950 and Risk of Interstitial Lung Disease Among Patients
with Rheumatoid Arthritis
[0352] To further investigate whether the MUC5B promoter variant
rs35705950 contributes to the risk of ILD among patients with RA,
we compared RA-ILD and RA-noILD patients, adjusting for sex, age at
inclusion and cigarette smoking. In the discovery cohort, the MUC5B
variant was associated with RA-ILD (ORadj=3.1; 95% CI, 1.6 to 6.3;
P=9.4.times.10.sup.-4), and this finding was replicated in the
aggregate multi-ethnic cohort (ORadj=2.9; 95% CI, 1.1 to 8.4;
P=0.04) and the combined analysis (ORadj, 3.1; 95% CI, 1.8 to 5.4;
P=7.4.times.10.sup.-5) (FIG. 14; FIG. 16C). For the comparison of
RA-ILD with RA-noILD, the best-fitting genetic model for the three
study cohorts (discovery population, combined multi-ethnic case
series, and combined analysis) was dominant. After adjusting for
covariates, no association between tobacco smoking and the risk of
ILD among patients with RA was found and no interaction of tobacco
smoke exposure with the MUC5B promoter variant was observed
(ORadj=0.7; 95% CI, 0.3 to 1.9; P=0.51).
MUC5B rs35705950 and UIP on HRCT Scan
[0353] Limiting the RA-ILD cases to those with radiographic
evidence of definite or possible UIP on HRCT scan, the association
observed in the discovery cohort (ORadj=5.0; 95% CI, 2.1 to 12.3;
P=3.0.times.10.sup.-4), was replicated in the combined multi-ethnic
cohort (ORadj=9.2; 95% CI, 2.3 to 38.7; P=1.8.times.10.sup.-3)
(FIG. 16C), and was observed in the combined cohort analysis
(ORadj=6.1; 95% CI, 2.9 to 13.1; P=2.5.times.10.sup.-6) (FIG. 16C).
In the combined analysis, the comparison of odds ratios for UIP
RA-ILD vs RA-noILD (ORadj=6.1; 95% CI, 2.9 to 13.1;
P=2.5.times.10.sup.-6) to non-UIP RA-ILD vs RA-noILD (ORadj=1.3;
95% CI, 0.6 to 2.8; P=0.46) was statistically significant (P=0.02),
suggesting that the effect of the MUC5B promoter variant was
restricted to the UIP RA-ILD sub-phenotype (FIG. 16C). Finally,
consistent with our previous findings, the MUC5B promoter variant
was found to increase the risk of developing a UIP pattern among
patients with RA-ILD through a dominant model in the discovery,
replication and combined analysis; the odds of having a UIP and
possible UIP pattern for patients with RA-ILD carrying at least one
MUC5B rs35705950 T risk allele were 2.9 times greater than
individuals having the GG genotype (ORadj=2.9; 95% CI, 1.7 to 4.8;
P=5.1.times.10.sup.-5) (FIG. 15; FIG. 16C). After adjusting for
covariates, tobacco smoking exposure did not contribute to a
specific HRCT pattern for RA-ILD and no interaction with the MUC5B
rs35705950 variant was detected.
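The dominant genetic model referred to above compares carriers of at
least one T risk allele (GT or TT) with GG homozygotes. A minimal
sketch follows. It computes an unadjusted odds ratio with a Woolf
(log-based) 95% confidence interval, whereas the odds ratios reported
above are adjusted for covariates; the function name and genotype
counts are hypothetical and are not study data.

```python
from math import exp, log, sqrt


def dominant_odds_ratio(case_genotypes, comparison_genotypes, risk_allele="T"):
    """Unadjusted odds ratio for carrying >=1 risk allele (dominant
    coding), with a Woolf 95% confidence interval."""
    def split(genotypes):
        carriers = sum(1 for g in genotypes if risk_allele in g)
        return carriers, len(genotypes) - carriers

    a, b = split(case_genotypes)        # cases: carriers, non-carriers
    c, d = split(comparison_genotypes)  # comparison: carriers, non-carriers
    if 0 in (a, b, c, d):
        raise ValueError("Woolf interval is undefined when a cell is zero")
    odds_ratio = (a * d) / (b * c)
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - 1.96 * se_log_or)
    upper = exp(log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical genotype lists (illustrative only).
uip_cases = ["GT"] * 30 + ["TT"] * 5 + ["GG"] * 25
non_uip_cases = ["GT"] * 20 + ["GG"] * 40
or_value, ci_lower, ci_upper = dominant_odds_ratio(uip_cases, non_uip_cases)
```

An adjusted estimate, as reported above, would instead come from a
logistic regression with the carrier indicator and the covariates as
predictors.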
Sites of MUC5B Expression in RA-ILD
[0354] We performed immunohistochemical staining for MUC5B in nine
RA-ILD lung tissue explants (5 GG and 4 GT) and 6 unaffected
controls (3 GG and 3 GT). Similar to what has been reported in IPF,
RA-ILD lung tissue demonstrated MUC5B in the cytoplasm of the
bronchioles and in areas of microscopic honeycombing, including
staining of the metaplastic epithelia lining the honeycomb cysts
and the mucus within the cyst (FIG. 17). The controls demonstrated
MUC5B expression in the bronchioles only. There were no obvious
differences in MUC5B expression by genotype.
Exploratory Genetic Association Study of 12 Common IPF Risk
Variants in RA-ILD
[0355] Having provided evidence for the contribution of the
dominant IPF genetic risk variant, i.e. the MUC5B promoter variant,
to RA-ILD, we decided to test the association of 12 additional
common IPF risk variants with RA-ILD (FIG. 29). This exploratory
study included 272 RA-ILD and 242 RA-noILD patients from the
France, USA-1 and Mexico case series. Taking into account the
relatively small sample size and related low power of detection,
the corresponding P-values, odds ratios and 95% CIs for the 12
candidate variants were considered descriptive and Bonferroni
correction was therefore not applied (Table 4). Comparison between RA-ILD and
RA-noILD revealed that 2 common IPF risk variants, TOLLIP rs5743890
and IVD rs2034650, were significantly associated with RA-ILD. The
TOLLIP rs5743890 minor allele was associated with increased risk of
RA-ILD and the IVD rs2034650 minor allele was associated with
decreased risk of RA-ILD (ORadj=2.13; 95% CI, 1.13 to 4.10; P=0.02
and ORadj=0.59; 95% CI, 0.38 to 0.89; P=0.01, respectively) and the
directionality of these relationships is consistent with what has
been observed for IPF.16,17 No association with RA-ILD was detected
for the 10 other IPF risk variants (FIG. 29).
Example 8: MUC5B Promoter Variant is Associated with Visually and
Quantitatively Detected Preclinical Pulmonary Fibrosis
[0356] Better understanding and recognition of early pulmonary
fibrosis is critical because medical therapies have been shown to
slow progression, not to reverse or even stabilize established
fibrosis--therefore, intervention before irreversible fibrosis has
become extensive has the potential to improve quality of life and
decrease morbidity. While IPF affects approximately 5 million
people worldwide, between 1.8 and 14% of the general population
.gtoreq.50 years of age have radiologic findings of undiagnosed
pulmonary fibrosis. Large cohort studies indicate that interstitial
lung abnormalities, postulated to represent early pulmonary
fibrosis, are associated with increased mortality, and that most of
these abnormalities progress over time. Members of families with 2
or more cases of pulmonary fibrosis (FIP, Familial Interstitial
Pneumonia) have been identified as an "at-risk" population. In a
previous study of FIP relatives, 14% had interstitial lung
abnormalities on high resolution computed tomography (HRCT), and
35% had an abnormal transbronchial biopsy indicating interstitial
lung disease.
[0357] HRCT provides visualization of the lung parenchyma and plays
a key role in the diagnosis of the Idiopathic Interstitial
Pneumonias (IIPs), including IPF. Currently, visual diagnosis by
thoracic radiologists, in conjunction with multidisciplinary
clinical conference, is the gold standard for diagnosing IIPs.
However, visual assessment is imprecise and hampered by
inter-observer variation. Quantitative HRCT (qHRCT) evaluation
provides measures of fibrosis extent that, in subjects diagnosed
with IPF, correlate with degree of physiologic impairment at
baseline, and may be more sensitive to subtle changes in disease
status than routinely used physiological metrics. The design and
utility of quantitative methods in the context of early forms of
fibrotic ILD requires further study. Deep learning methods have
been increasingly used in imaging to identify and classify CT
patterns, and may be particularly valuable in detection of early
lung fibrosis.
[0358] This study aims to: (1) examine risk factors, including two
common fibrosis-associated genetic variants in MUC5B and TERT, for
undiagnosed pulmonary fibrosis (PrePF) in FIP first-degree
relatives; and (2) determine the utility of a deep learning,
texture-based qHRCT method in the detection of early fibrosis in
this cohort.
Materials and Methods
FIP Relatives Screening:
[0359] As part of a study of FIP conducted at the University of
Colorado, National Jewish Health, and Vanderbilt University (COMIRB
#15-1147; NJH IRB 1441a; Vanderbilt IRB #020343), non-Hispanic
white (NHW) relatives of FIP patients, defined as those in families
with two or more cases of pulmonary fibrosis, were contacted for
enrollment. First-degree relatives without a known prior diagnosis
of pulmonary fibrosis and greater than 40 years of age were offered
HRCT scans of the chest and asked to undergo peripheral blood draw.
Study subjects younger than 40 years of age, and those older than 40
years of age who reported on pre-study questionnaires being personally
affected by pulmonary fibrosis, were excluded (FIG. 18).
Visual CT Review:
[0360] HRCT scans were interpreted by study radiologists and
examined for the presence of fibrotic ILD. "PrePF" was defined as
the presence of "probable" or "definite" fibrotic ILD on HRCT in
FIP relatives who had no known diagnosis of pulmonary fibrosis at
the time of study enrollment (FIGS. 18, 19).
Quantitative CT:
[0361] Inspiratory HRCT series with slice thickness .ltoreq.1.25 mm
and spacing .ltoreq.20.0 mm were selected for quantitative analysis.
This included 212 volumetric series with thin, contiguous sections
(slice thickness and spacing both <=1.25 mm) and 191
non-volumetric scans (56 with slice spacing >1.25 mm and <10
mm, 65 with slice spacing of 10 mm and 70 with slice spacing=20
mm). Scans identified as technically inadequate were omitted. In
addition, 100 inspiratory volumetric HRCT of never-smoking control
subjects from the COPDGene cohort were analyzed (FIG. 20). The
lungs were segmented in a semi-automatic fashion using open source
software followed by manual editing, if necessary, performed by
trained analysts. Examples of the categorization of different parts
of CT scans are shown in FIG. 21. Some studies were acquired with
contiguous thin axial sections while others used 1 or 2 cm
intervals. Also, reconstruction kernel, a parameter that affects
image sharpness and noise, was not standardized.
[0362] Fibrosis quantification on CT scans was performed using a
deep learning technique, with a convolutional neural network (CNN)
algorithm trained with image regions of normal and abnormal lung
identified by expert radiologists. Training data and an earlier
algorithm version were described previously. Here, a more complex
CNN architecture was employed that classifies image regions using
pixel and texture features extracted by multiple convolutional
layers at different scales. Classification categories included
normal lung, airways, reticular abnormality, honeycombing and
ground glass. An additional category, "not normal", was also
included for lung regions not classified into any of the named
categories. Further, pixels in the "not normal" category were split
into two subcategories: "not normal" low density and "not normal"
high density using the threshold value of -650 Hounsfield Units
(HU). Subject level scores were computed as the percentage of total
lung volume classified in each category. HRCT fibrosis score was
defined as the sum of CNN classification scores for reticular
abnormality, honeycombing, ground glass, and "not normal high
density" (FIG. 21).
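The scoring arithmetic described above can be sketched as follows,
taking the per-voxel CNN classification as given. This is a sketch
under stated assumptions: all voxels are assumed to have equal volume,
the category labels and function names are hypothetical, and a voxel
at exactly -650 HU is arbitrarily assigned to the high density
subcategory because the text does not specify the boundary case.

```python
from collections import Counter

# Categories summed into the HRCT fibrosis score, per the text above.
FIBROSIS_CATEGORIES = (
    "reticular", "honeycombing", "ground_glass", "not_normal_high_density",
)


def subject_scores(voxels, density_threshold_hu=-650):
    """voxels: iterable of (category, hounsfield_units) pairs for one
    subject's segmented lung. Returns percent of lung per category."""
    counts = Counter()
    for category, hu in voxels:
        if category == "not_normal":
            # Split the residual class by density at the threshold.
            suffix = "high_density" if hu >= density_threshold_hu else "low_density"
            category = f"not_normal_{suffix}"
        counts[category] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no lung voxels supplied")
    return {cat: 100.0 * n / total for cat, n in counts.items()}


def fibrosis_score(scores):
    """Sum of the category percentages that make up the fibrosis score."""
    return sum(scores.get(cat, 0.0) for cat in FIBROSIS_CATEGORIES)


# Hypothetical 100-voxel lung (illustrative only).
example_voxels = (
    [("normal", -850)] * 90
    + [("airways", -950)] * 2
    + [("reticular", -500)] * 3
    + [("honeycombing", -700)] * 1
    + [("ground_glass", -600)] * 1
    + [("not_normal", -500)] * 2
    + [("not_normal", -800)] * 1
)
scores = subject_scores(example_voxels)
score = fibrosis_score(scores)
```

In this toy example the fibrosis score is the 3% reticular, 1%
honeycombing, 1% ground glass, and 2% "not normal" high density
fractions added together.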
[0363] A simpler previously described densitometric analysis of
HRCTs was also performed for comparison. Percent high attenuation
area (% HAA), the percentage of total lung volume with HRCT pixel
intensity greater than -600 HU and less than -250 HU, has been used
as a measure of interstitial lung disease on CT.
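The % HAA measure defined above reduces to a simple count over the
segmented lung. The sketch below assumes equal voxel volumes and
strict inequalities at both bounds, as worded above; the function name
and sample values are hypothetical.

```python
def percent_high_attenuation_area(lung_hu, lower=-600, upper=-250):
    """Percent of lung voxels with intensity strictly between
    `lower` and `upper` Hounsfield Units."""
    lung_hu = list(lung_hu)
    if not lung_hu:
        raise ValueError("no lung voxels supplied")
    in_range = sum(1 for hu in lung_hu if lower < hu < upper)
    return 100.0 * in_range / len(lung_hu)


# Hypothetical voxel intensities; only -550 and -400 fall in range.
sample_hu = [-900, -850, -700, -600, -550, -400, -250, -100]
haa = percent_high_attenuation_area(sample_hu)
```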
Statistical Analysis:
[0364] Analysis of the effect of specific alleles on PrePF risk was
performed using minor allele frequency (MAF) for comparison of
variant prevalence in the study groups; statistical significance
was determined utilizing either a z-score test for proportions or a
mixed effects logistic regression model when controlling for other
clinical factors (age, sex, and history of smoking) and family
(random effect) in both dominant and log-additive models.
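A z-score test for proportions of the kind mentioned above can be
sketched as follows for minor allele frequencies. The sketch assumes a
pooled two-sided test on allele counts (two alleles per subject) and
treats alleles as independent, so it does not account for family
relatedness, which the mixed effects model above handles. The function
name and the counts are hypothetical, chosen only to resemble
frequencies near 0.29 and 0.21.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(minor_a, total_a, minor_b, total_b):
    """Two-sided pooled z-test comparing two minor allele frequencies.
    Arguments are allele counts, not subject counts."""
    p_a, p_b = minor_a / total_a, minor_b / total_b
    pooled = (minor_a + minor_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 45 of 154 alleles versus 176 of 838 alleles.
z_stat, p_val = two_proportion_z_test(45, 154, 176, 838)
```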
[0365] Distribution of qHRCT fibrosis scores was left skewed as was
% HAA, and therefore these values were log transformed prior to
analyses. Log of qHRCT fibrosis score (hereafter, "fibrosis score")
and log (% HAA) were compared with visual scores using ANOVA and
Tukey's honest significant difference (HSD) test. To determine the
ability of qHRCT scores to predict visual diagnosis of PrePF,
receiver-operating characteristic (ROC) analysis was performed.
Optimal threshold for discriminating visual diagnosis of fibrotic
ILD was determined with Youden's method. Five-fold cross-validation
was performed to test detection accuracy, sensitivity and
specificity, and consistency of optimal threshold. Linear
regression was performed to test association between the MUC5B
genotype and qHRCT fibrosis score and log (% HAA).
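Youden's method selects the threshold that maximizes J = sensitivity +
specificity - 1 along the ROC curve. A minimal sketch is given below,
assuming that a subject is called positive when the score is at or
above the threshold and that only observed score values are tried as
candidates. The function name and the data are hypothetical.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1,
    calling a subject positive when score >= threshold."""
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    best_threshold, best_j = None, -1.0
    for candidate in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y and s >= candidate)
        tn = sum(1 for s, y in zip(scores, labels) if not y and s < candidate)
        j = tp / positives + tn / negatives - 1
        if j > best_j:  # ties keep the lowest candidate threshold
            best_threshold, best_j = candidate, j
    return best_threshold, best_j


# Hypothetical log fibrosis scores and visual PrePF reads.
log_scores = [0.2, 0.6, 0.9, 1.1, 1.3, 1.5, 1.8, 2.4]
visual_prepf = [False, False, False, False, True, False, True, True]
threshold, j_stat = youden_threshold(log_scores, visual_prepf)
```

In a five-fold cross-validation, the same selection would be repeated
on each training partition and the chosen threshold applied to the
held-out partition, which also shows how stable the threshold is.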
[0366] A p-value of <0.05 was considered statistically
significant for differences between groups as well as for
associations between individual variables and outcomes in linear
and logistic regression modeling. Statistical analyses were
performed using RStudio (Version 0.99.473).
Results
Study Cohort Characteristics
[0367] A total of 1,090 FIP relatives were contacted, and 523
eligible subjects were recruited and underwent HRCT screening (FIG.
18). Of the 523 subjects, 26 were excluded due to technical
inadequacy of images and one for an equivocal consensus read by
study radiologists. The remaining 496 subjects from 263 families
were included in the final analyses. The mean age of study subjects
was 57 years (95% CI: 56.5-58), 189 (38%) were male, and 148 (29%)
were either current or former smokers. The minor allele (T)
frequency of the MUC5B promoter polymorphism rs35705950 was 0.22 in
this cohort; 45% of the subjects in this cohort had one or two
copies of the minor allele (FIG. 22). The minor allele (C)
frequency of the TERT variant rs2736100 was 0.47 in the entire
cohort; 69% of the subjects in the cohort had one or two copies
of the minor allele (FIG. 22).
Prevalence of Preclinical Pulmonary Fibrosis (PrePF) in FIP
Relatives
[0368] Of the 496 HRCT scans, 401 showed no CT evidence of
interstitial lung disease (ILD), and 95 showed evidence of ILD,
either fibrotic (27 probable and 50 definite) or non-fibrotic
(n=18). Therefore, among these 496 subjects who reported being
personally unaffected by pulmonary fibrosis, the PrePF prevalence
was 15.5% (n=77) (FIG. 18).
[0369] The CT patterns noted in PrePF subjects (FIG. 23) show that
possible, probable, or definite UIP pattern was the most commonly
considered (n=59, 77% of all PrePF cases). NSIP was considered in
45 subjects (58% of all PrePF cases). The fibrotic changes were
most commonly lower-lobe predominant and subpleural in nature,
consistent with a UIP pattern (FIG. 23). Non-fibrotic ILD scans, on
the other hand, generally had more diffuse, upper-lobe predominant
abnormalities.
[0370] There were 402 study subjects with HRCT scans that were
technically adequate for quantitative assessment. 212 of the scans
had both slice thickness and spacing <=1.25 mm (thin,
contiguous); of the remaining 191 scans, 56 had slice spacing
>1.25 mm and <10 mm, 65 had slice spacing=10 mm, and 70 had
slice spacing=20 mm. Volumetric HRCT scans on an additional 100
COPDGene subjects were included as normal controls. Fibrosis score
means were significantly different (p<0.0001) across groups
defined by visual diagnosis (FIG. 24). Comparison of means showed
fibrosis scores were significantly different between each pair of
groups (all between-group comparisons p<0.01). Means of log (% HAA)
scores were also significantly different across visual scoring
groups (p<0.0001), and individual between-group comparisons
showed log (% HAA) was significantly different in most comparisons
(p<0.0001), except between the "probable" and "definite" visual
scores (p=0.35).
[0371] ROC analysis showed that fibrosis score discriminates
subjects with visual diagnosis of PrePF (FIG. 25B). Average area
under the curve (AUC) in five-fold cross validation was 0.85 (range
0.83-0.87) and average accuracy, sensitivity, and specificity in
the test partitions were 0.83 (range 0.74-0.86), 0.74 (range
0.56-0.92), and 0.84 (range 0.76-0.89), respectively. Optimal
threshold for fibrosis score ranged from 1.40-1.42, corresponding
to 4.1% fibrotic area in examined lung. Utilizing a cutoff of 1.40
for fibrosis score on the entire dataset, the sensitivity was 74%,
specificity was 82%, and accuracy was 81%; the negative predictive
value of this test was 95%, exceeding its positive predictive value
(42%) (FIG. 25C).
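The gap between the negative and positive predictive values follows
from the low prevalence of PrePF, and the threshold can be read back
on the percent scale. The sketch below assumes that the log transform
is the natural logarithm and that the 15.5% prevalence reported for
the visually read cohort applies; the prevalence in the quantitatively
analyzed subset may differ, so the results are approximate. The
function name is hypothetical.

```python
from math import exp


def predictive_values(sensitivity, specificity, prevalence):
    """Positive and negative predictive values from sensitivity,
    specificity, and disease prevalence (Bayes' rule)."""
    tp = sensitivity * prevalence
    fp = (1 - specificity) * (1 - prevalence)
    tn = specificity * (1 - prevalence)
    fn = (1 - sensitivity) * prevalence
    return tp / (tp + fp), tn / (tn + fn)


# Sensitivity and specificity as stated above; prevalence assumed 15.5%.
ppv, npv = predictive_values(0.74, 0.82, 0.155)

# Back-transform of the 1.40 cutoff, assuming a natural-log scale.
fibrotic_percent = exp(1.40)
```

Under these assumptions the computed values are near the reported
positive and negative predictive values, and the back-transformed
cutoff is near 4.1%.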
[0372] Compared to the classification achieved with the CNN as
described above, ROC analysis of log % HAA had lower mean AUC 0.80
(range 0.79-0.81) and average accuracy, sensitivity, and
specificity of 0.67 (range 0.63-0.70), 0.82 (range 0.75-0.91) and
0.64 (range 0.62-0.70) respectively (FIG. 25A). Optimal threshold
for log % HAA ranged from 1.49-1.57. Utilizing a cutoff of 1.49 for
log % HAA, the sensitivity was 88%, specificity was 55%, and
accuracy was 60%; the negative predictive value of this test was
96%, exceeding its positive predictive value (26%).
Risk Factors for PrePF
[0373] Subjects with PrePF were older (mean age 65.8 years, 95% CI
63.5-68.1) than those without fibrosis (mean age 55.8, 95% CI
54.9-56.6, p=6.36.times.10.sup.-13); they were also more likely to
have ever smoked (43% versus 27%, p=0.007), and to be male (48%
versus 36%, p=0.05). However, there was no difference in
breathlessness between subjects with PrePF and subjects without
fibrosis (mean score 0.5 versus 0.6, p=0.24, FIG. 26). When fibrosis was
defined by quantitative fibrosis score cutoff (1.4), there was a
significant difference between groups in terms of mean
breathlessness score (0.39 versus 0.78, p=0.003). Quantitative
fibrosis score was positively associated with breathlessness score
(p=0.001), even after controlling for age (p=1.9.times.10.sup.-9),
male sex (p=0.7), and smoking history (p=0.8).
[0374] Screening for autoantibodies in this cohort revealed
no differences between PrePF and No Fibrosis subjects in
terms of overall seropositivity or individual antibody testing.
For quantitatively defined fibrosis, there was no
significant difference between groups in terms of auto-antibody
testing, with similar overall seropositivity rates (11% versus 16%,
p=0.30).
[0375] The MUC5B promoter polymorphism rs35705950 was associated
with the visual diagnosis of PrePF (present in 40% of those without
fibrosis versus 53% with PrePF; MAF 0.21 versus 0.29, respectively,
p=0.03, FIG. 22). After age 60, there was a statistically
significant difference in the proportion of subjects with visually
diagnosed PrePF when the cohort was stratified by MUC5B genotype
(23.8% versus 39.8% prevalence, p=0.02) (FIG. 27).
[0376] MUC5B variant carriers, regardless of their visual CT
diagnosis, had significantly higher qHRCT fibrosis scores (1.3 [95%
CI 1.2-1.5] versus 1.1 [95% CI 1.0-1.2], p=0.02). The association
between MUC5B genotype and fibrosis score was significant even when
controlling for age and male sex in linear regression (p=0.03, FIG.
28). Age was significantly associated with fibrosis score
(p=2.17.times.10.sup.-9), but male sex (p=0.63) and smoking
(p=0.94) were not. To determine whether individual textural
components were driving the association of the composite fibrosis
score with genotype, each score component was tested individually
for association with the MUC5B variant, controlling for age and
sex. Quantitative scores for reticulation, honeycombing, and ground
glass were significantly associated with the MUC5B variant (p=0.02,
p=0.02, p=0.04, respectively), while "not normal high density" was
not (p=0.18). The simpler quantitative scoring method, log % HAA,
was not significantly different in MUC5B variant carriers
(p=0.4).
[0377] In contrast to the MUC5B variant, the common IPF-associated
TERT polymorphism (rs2736100) was not significantly associated with
PrePF assessed either qualitatively (MAF 0.47 in PrePF versus 0.46
in unaffected, p=0.77) or quantitatively (MAF 0.50 fibrotic versus
0.47 not fibrotic, p=0.40).
[0378] To examine these factors individually for their
contributions to risk of PrePF in our study cohort, we used a mixed
effects logistic regression model to test the independent effects
of age, sex, smoking, and MUC5B or TERT genotypes while controlling
for family. Age and the MUC5B genotype remained statistically
significantly associated with PrePF (OR 1.15, 95% CI 1.09-1.22,
p=7.34.times.10.sup.-7 and OR 2.18, 95% CI 1.00-4.73, p=0.05,
respectively) (FIG. 22). The common TERT polymorphism (rs2736100)
associated with fibrotic idiopathic interstitial pneumonia (29) was
not significantly associated with PrePF (MAF was 0.45 in PrePF
versus 0.45 in unaffected, p=0.88) or in a log-additive model
controlling for age, sex, and smoking history (p=0.57).
[0379] Given the presence of non-fibrotic ILD (n=18, FIG. 18) in
the "No Fibrosis" cohort, secondary analyses were performed that
(1) excluded non-fibrotic ILDs and (2) compared all ILD (inclusive
of non-fibrotic ILD) to those without any ILD. When non-fibrotic
ILDs were excluded from analyses, PrePF subjects were older
(p=4.7.times.10.sup.-13), more commonly male (p=0.04), more often
had a smoking history (p=0.003), and had a higher prevalence of the
MUC5B promoter variant (MAF 0.29 versus 0.20, p=0.02). However,
when controlling for family relatedness and the other risk factors
in a mixed effects logistic regression, only age and the MUC5B
promoter variant were significantly associated with PrePF with odds
ratios 1.15 (95% CI 1.09-1.22, p=9.5.times.10.sup.-7) and 2.16 (95%
CI 1.00-4.75, p=0.05), respectively. Another secondary analysis of
the data was performed in which all subjects with CT findings of
ILD (fibrotic or non-fibrotic) were compared to those without any
evidence of ILD. Those with CT evidence of ILD were older (mean age
64.3 years, 95% CI 62.2-66.3) compared to those without any
evidence of ILD (mean age 55.7 years, 95% CI 54.8-56.6,
p=4.1.times.10.sup.-12), more likely to be male (p=0.01), more
likely to have smoked (p=0.0003), and more likely to carry the
MUC5B promoter variant (MAF 0.30 versus 0.21, p=0.006). When
controlling for family relatedness in a mixed effects logistic
regression model, age (OR 1.10, 95% CI 1.07-1.14,
p=1.21.times.10.sup.-9), smoking history (OR 1.72, 95% CI
1.00-2.99, p=0.04), and the MUC5B promoter variant (OR 1.73, 95% CI
1.08-2.76, p=0.02) were significantly associated with risk of
ILD.
OTHER EMBODIMENTS
[0380] It is to be understood that while the disclosure has been
described in conjunction with the detailed description thereof, the
foregoing description is intended to illustrate and not limit the
scope of the disclosure, which is defined by the scope of the
appended claims. Other aspects, advantages, and modifications are
within the scope of the following claims.
Sequence CWU 1
1
6111001DNAHomo sapiens 1agaaagaagt catgaaagta ggaaccacat ttttactcat
ctttctgtct ccagcaagca 60gcttactgct tttcatacac attttgcttt tattactcat
gatttcaaag gtgtaatggt 120tcagccacat caatgtaaca aacagttcac
actgggctct tatagtctgg cctttaaaac 180cttcactatt tatgctttca
tcttaactac tttgaccctc acaggtttac tcactaagaa 240cttgagtttc
aagagaaaag atgacatgtt tgctgcttaa acaagcaata tctaaaagca
300tatttagtta taaacgtctt accaagaatt gatataattt tcatttaaac
atttttataa 360atagtagttt acaagatata gtaagtacat ctctaaaaat
acagtgtatt catgtacctt 420gacataaact tgtagtagta ccttagtttt
attcatgttg ttatattaac taccatcact 480ttgaatacat acctgttcac
bgtacagtat aggtcggttt aggtttattg ccttaattgc 540ttggttttga
gttagtactg tagcaaatgc tatcacactt tgcattccct aaaaacaggt
600aaattcatta aggaaacaga caaagtatat aataatctcg ctacataaat
atttcaagat 660cagctatctg cattctgata aaattgtttt taaaatttaa
gcattccttg gactttgaat 720tgtaagttga tcaaattcaa aaatgaattg
ttactgtatt cttctctcct ggccctaaaa 780tctatctaaa acatggcatg
gggagtttct taatgtttca gtgtccattt cctgggtgtt 840tccctctagg
ttttttttcc tcacccctca agcttctatg tggatcccag ctagagctca
900tactacttat ccaacacaca tcattgtgca agcactcttt tatattcata
ctagtacttt 960taagtgtgtg tgcggtggga aaaggttacc aatcacattt t
100121001DNAHomo sapiens 2gtattcatca actcctattt cattccctct
tcctgtgctc actggaagat gacatttccc 60agacttccaa gaatgttact gagttctgga
atgtaagtag aagggataag tatcacttct 120gtgctgtggc ggttatggac
ctgtgaactt tgcacacgcc ttctatcttc tttttcagtg 180tccatttcag
agggcatgtt ttcagatgaa accagtagaa gatggaagca gcctgtgact
240agaatcactg cttagggtct tgctgcctag gaatcccact ctacctgcaa
cagactgtga 300aagaaccgag aaatacactg attttgaaca tagcccatac
tataatgggg atgtttgtta 360cagcagttag cattaaaaac cttggctagg
cattggtcat aattgtagaa cacagcaaat 420gaagggaaac tggaacatag
aggccagtga gaactttagg gttaatgaaa aatgagggca 480accaggataa
tttggttctt kgccaaatag gaaggtgaaa ccaaaggtag actggaggtc
540agaaaatcag tccagcacat gtgatgtttt catttagttg cctgtatgtc
tgtctggtct 600ccagctcagc ctggctcctt gaggtaagag gcagtggctg
ttcacctttg catcccagca 660cctggcatac aatagatggg atgaaatgtt
caaactgagc ctaagcttca gggtgcttat 720caaagcaggg aagatacaca
agaggagatg attcaggtcc agggcaggtc aggtatctaa 780acccagtctc
ttaggaagct ggatcctccg aaccagggag aacaagctgg atatgcactg
840gatttcccag cagtactgat ctagagactc tcatagagtc ccttttattc
cttggcctag 900ggttacaact gcttatagca tctggaaaga ctcaacacct
caaaagagac tttcagtaga 960tacagcaaat acactcatgg aattgataat
taagcttcaa t 100131001DNAHomo sapiens 3attgtcgttg tttgcttttg
tttattgaga cagtctcact ctgtcaccca ggctggagtg 60taatggcaca atctcggctc
actgcaacct ctgcctcctc ggttcaagca gttctcattc 120ctcaacctca
tgagtagctg ggattacagg cgcccaccac cacgcctggc taatttttgt
180atttttagta gagataggct ttcaccatgt tggccaggct ggtctcaaac
tcctgacctc 240aagtgatctg cccgccttgg cctcccacag tgctgggatt
acaggtgcaa gccaccgtgc 300ccggcatacc ttgatctttt aaaatgaagt
ctgaaacatt gctacccttg tcctgagcaa 360taagaccctt agtgtatttt
agctctggcc accccccagc ctgtgtgctg ttttccctgc 420tgacttagtt
ctatctcagg catcttgaca cccccacaag ctaagcatta ttaatattgt
480tttccgtgtt gagtgtttct ktagctttgc ccccgccctg cttttcctcc
tttgttcccc 540gtctgtcttc tgtctcaggc ccgccgtctg gggtcccctt
ccttgtcctt tgcgtggttc 600ttctgtcttg ttattgctgg taaaccccag
ctttacctgt gctggcctcc atggcatcta 660gcgacgtccg gggacctctg
cttatgatgc acagatgaag atgtggagac tcacgaggag 720ggcggtcatc
ttggcccgtg agtgtctgga gcaccacgtg gccagcgttc cttagccagt
780gagtgacagc aacgtccgct cggcctgggt tcagcctgga aaaccccagg
catgtcgggg 840tctggtggct ccgcggtgtc gagtttgaaa tcgcgcaaac
ctgcggtgtg gcgccagctc 900tgacggtgct gcctggcggg ggagtgtctg
cttcctccct tctgcttggg aaccaggaca 960aaggatgagg ctccgagccg
ttgtcgccca acaggagcat g 100141001DNAHomo sapiens 4atttgggaac
ctttaaaaaa tattctggct tcaaaaatac tccatattta catctttggt 60tctatctgaa
gtaaagccgt gatggtgtgc gtaagtgaaa caggtgcaaa ggggcaacaa
120caaagggcgc ctctctttgt ctttgtgtcg caggcggaga tggacatggt
ggcctggggt 180gtggacctgg cctcagtgga gcagcacatt aacagccacc
ggggcatcca caactccatc 240ggcgactatc gctggcagct ggacaaaatc
aaagccgacc tggtacttgt ctgtgtttca 300ttttagagtc ttcaaaatat
ctaccgaagg atcgtgtaat tactcaatcc cagggagttt 360cttctgaaac
attgctatta tttctttccc agaagactgg aaatgtttag aaatcccact
420tcttaaatgg ggaagtggaa tcagtagccc tattagagat tatgttaaca
cttgaagagg 480agttaaacca gaggctgagg ktgtgcaaac actcatttgc
agtttgtgaa taagtctctt 540taggggtggc agtttgtttc tgcggtaagc
agaacatctt tttgaatagg ggaaatgcaa 600cagtcttata cagtagtttg
tgtcattggt gaatcctttc ctaggtggta attaaaacat 660tatttctact
gagcaaagcc atatgtcatc ccgacacccg ctcccatgct gaaaaaagtc
720agacttgaaa ctgggttgag aattacagca taaaatcata actgatctta
agtgcttagt 780ttcccgcagg tctctacact tgtaaatcac taaacttttt
tttttttttt ttacctgaga 840ccatagcttc tcatcctcat ttcttcttct
ggctttttgg ggcttacttt tgtccacctg 900agcccctgac caactttctc
cttcatttct ctaagaccta gggaatccta aatgatgtct 960ttaaacttta
agacaatttt ctaacacgtg agtctttaag t 10015941DNAHomo sapiens
5ttctctgtga gtctttagga aatgaggagc atgatcttct agcagtaaaa cacctgtaga
60gaattgcctt atgttttttg tttgtttatt tgtttgtgtg ctttggtttg gtttgctttt
120tttttttttt tttttttttt tttgagatgg agtctcgccc tgttgcccag
gctggagtgt 180agtggcgaaa tctcggctca ctgcaacctc cacctccctg
gttcaagcaa ttcccctgtc 240tcagcctccc gagtagctga gattacaggt
gcacaccacc acgcccggct aatttttttg 300tatttttagt agagatgggg
tttcaccatg ttggccagac tggtctcgaa cttctgacct 360caggcaatcc
gcctgcctca gcctcccaaa gcgctgggat tacaggcatg agccactgcg
420ccccgcctcc atgttaatca mtctttctga tttcaaataa ctcattatcc
ccatgacctt 480atggatttgt ttttcctctt catccacaaa attctccaga
gaagtctccc ttgttatctc 540ttggctgtgc tttctatctc accagttatc
tttctccaaa gagcttcctc tgcaaagaag 600ctttgtatat gaagaccatg
tgggggctga atcaagacca agtttcacaa cctaaaagta 660gttcacaaag
cttccttgcc tctattctct gcaaatctgt aaactcttca gctgacccaa
720tttctctctt tagccttcag agattatttt attttatttt atttcatttc
atttcatttc 780attttgacag aatctagctc tgtcgcccag gctggagtgc
agtggcacca tctttgctca 840ctgcaacctc cccctcacag gttcaagcaa
ctgtcctgcc tcagcctccc gagtagctgg 900gattacaggc gtgagccacc
acgcccagct gatttttttt t 94161001DNAHomo sapiens 6cctctactgc
cgtacacccc accactcagc cttggagtgc ctgtgtgcag agcagggctg 60aggcatggtg
ctgctttggt ggtctaggtt tgctgcaggg ccaggtggcc tgagctccag
120gcaggatctc tggctgcact cagccctttc tgcctcccca aatgctctat
atcactattt 180gtacactgag cagagtaaag ttagagagaa ctgttttata
gaatagggct ggcccccgct 240cccctggcct acgtgatggt ccttcctggc
tgccaggtac ttgtttgtat tagagacaga 300cactccacag ggtctgttgt
ggcccacagc acataggcaa tcagaggcag aaagcagagc 360tgtttggacc
cacagagggc cggctgtctg ccactgaaat gtctttccag ttggttgaga
420agcagcagga tgctctgctg gtgatgtctg aaagtcccag gattctttgg
gtctccaagg 480agatcctagc atataccact rtcgtggttt taataaagag
caaaaacact ttcagatggg 540gagaagagtg gaacaaaagg tattcttcct
gggttgaagt ctgggggaaa ggcattgaga 600agactgggct aatggcacaa
accaatgaag tactcaagtc acctgtgatg gaggccagtc 660atccaatggt
atcaactttg tatgtggcaa cacttaataa aaatctgaac aggtcttcac
720ttgtggacac agtagacttt cttgaaaaag gacagaaaag tgagccctgt
gaattttcat 780ctcacggact gacaacaatg acttgccttt aaggacagtc
actcaagatg aagatgcaac 840aaaacccttc cagttccaag tggctgatga
aaaaaaaaaa atcttaaaag catcacagaa 900caacggagaa agagatcaga
agactataac agatagtttg aattttaaaa ctcagagaaa 960agcaactgag
gaggaaatac actgcttaga aagaagaaac t 10017601DNAHomo sapiens
7tggacggcct ctgaaggggt ctgtggggtc ctggacgggt ccccattcat ggcaggatta
60acccccctcg ggttctgtgt ggtccaggcc gcccctttgt ctccactgcc ccctggccag
120aatgagggac agtgacccac ccagggctgg gcctggctca gactccgtca
gagccgcagg 180gcaagttcct ggcacgtccg aggtgggagg ctcctctgcg
ctccaggagg ctgtgcctgg 240ccccccttcc cggcaggaac cggctgtgtc
cctttccttc ctttatcttc tgttttcagc 300dccttcaact gtgaagaggt
gaactcttca aacacgctga gcaaacaggc ccgactccca 360gggccgcatc
cgggatgtct caatagctgt ggccttgacg tccacctcgg acccctgccc
420cggacccagc ccagttccca atgggccctc tgcccgggga ggtgcctagt
gggagggacg 480agggcaaagt cggggccccc acttgtttgg tgtcactgtg
tgccagcggc cactggcggg 540cgaggctgtt ccagggtgga ggcggggagg
gttggaccac aggcactgag cggggacaga 600g 6018732DNAHomo sapiens
8gtcattggtc aaatgtggcc tgtatctaaa ttccaactgt tagaatcata gacatctaga
60gcttacgtca gttttagata tttcttatga attctcagaa ttcatagatt ctcattttta
120ttcttagact tctcagatat tccgtttttg atagtatacc cttctgagtc
taatatgtcc 180taaagtgcga acttgtacaa tttttttttt tttttttttt
tttttttttt tktgataagg 240agttttactc tgtcacccag gctggagtgc
agtgacccga tctcggctca ctgcaacctc 300tgcctcccgg gttcaagtga
ttgtgatgtc tcagtctccc aagtagctgg gattacaggc 360tcctgccacc
acatgcctag ctaattgtta tactttagta gaaatggggc ttcgccgtgt
420tagtcaggct ggtcttgtac tcctgacctc agttgatctg cctaccttgg
cccccaaggt 480gctgggatta caggcatgag ccaccgcgcc tgacccagct
tcttaaatta ttctgggcca 540ccagtaatgt gaatcatgta aattaaaata
tataattaaa caaaatcata tagcgattag 600agataatagt tgtgaaatgc
ttgaaaaatc ataggcattt aataaataga agccattcca 660attaggattc
ttcttgattt tttttcaaga ccaaaaaaat actcttttaa atatttatta
720taatactcca tg 73291320DNAHomo sapiens 9aggctgcagt tagtcatgac
tgcgcgctgc actccagcct gggtgacaaa gtgaggccct 60gtctcaaaaa caataaaaaa
tttaaaagag ctgagcatgg aggccacttt gggaggctga 120ggcaggcaga
tctcttaagc ccaggagtct gagaccagcc tgggcgacat gatgaagccc
180catctctaca aaaaatacaa aaaaattagc tgagctttat ggcaaatccc
tgtaatccca 240gttacctagg aggcccaggc aggaagatgg cttgagccca
aaaggttgag gctgtagtga 300gctgtgatca tgaacagagt gagaccctgt
ttcaaaacaa aatgaaaaac aaacaaacaa 360aaaaaccaag aaaacaagaa
aacaaaaact atacaatgat gagccaaaaa gcaagatatg 420gaagaatata
tatatatata tatatatata tatagtatga gtccagctat agaaagtttg
480aaatcaggca acctaaacaa tattgttcag ggatctatac agaggcagga
agccattgag 540aaaggtaagg ggaggattat caccaaattc aggatggtgg
ctcccctggg gagaatatgt 600caaggagggg cacatgggct tggaatactg
tcttcattga cctgcgtgtt gggtacacag 660gagtttgtta tttttcacac
tgcatatgtg catgtatata ctctcccata tataccatgc 720atttcacaca
agaacacaaa ggctgtgtgg ctctgctctg cccctttccc cttccagctc
780ccattctcgt cytcagctag cagaggaggg tcagggtctt ttagcacagc
ttccttctgt 840ctctgagtgg gtcagaggag tacggggatg agggcctccc
ttctgcggct gggctctggc 900cactccaggg tgggaaggcc tggagaaaac
agggccaggc aaagccggct ggccctgctg 960tttctgccaa tgctgggatt
aggccagggc tctggcccac ctgtcatttc actcattcag 1020catgaacata
gccactgagc acttactgtg agccccgggt gctattggga gagttcagat
1080aagtgagaga gggtctttga cctcaaagat cttacagaga ggaccgtata
cacaaataac 1140agtataccag caaaatgtga gctaagtgtc atgtgactac
tcatctactc tttcaataaa 1200tatttgttgt gcacctatta catgccagga
actgtgctgg atggtgatca tgtaaagaca 1260gtcaaatcac agtcctagct
ctcagattca cagcctgcct aatgctgggg aaactggaat 1320101001DNAHomo
sapiens 10ccagccagaa ggggcgcagt ttgttagttc agctcctcct gagacagaaa
taaagacacg 60aaccaaagga catcagcact tacagggctc tcaggtcaca cacaggatgt
ccgcgcccac 120tgcagagctg caggtcccct ccagggcagt ggggagccac
aagcagcgtt aggcagcggc 180tgggaccagg accgcctgag cactcaagaa
cccccactgc cccaagcact gctggcagca 240agcccagaaa actgagcccg
gggagctcct ctgagcggcc taagcacccc tctaagctgt 300gctgccccaa
ttcaagcctg gctcacggca gcaaagaaaa aatgtgacct tcggagctcc
360caaaggggcc acccataagc tgagagcctg cccggaagca cttatagacc
cgcgtggctt 420gttttcattg caaagaacaa taaaaattat cttgcctctg
atcaccactg atagcccaag 480aagcaaaaat tcgatcccgg dgatgagaaa
tgaaatgaaa catcgcgaga aacttccagg 540aatcttctgg atgtggctag
actctttagc ttgagcttcc agacaggccg aggcttggtg 600ctggagcctg
gccctccgct gacctctctt ctacccgggg gcacagcccg gattgcagag
660aggctggcgc aagagtgagg gagcgagggc tagcctgtga tgggctttct
ccacctagca 720ccaccctatg ctgtggctca ggggagtcaa gagtttacac
agctgcagag atggattcca 780ggccacttac tcaagtctac ctactccttc
cttcggccaa tcagctgggt gcctctgcgg 840cctgtgacac caccagcaaa
cagctccaga cctcctagca tggtctctgt caaggctggg 900tggcagatct
gtgatctcct ttttaaattt ttcatttttt ttaagagatg gggtcttgct
960atattgccca ggctggtctc aaactcctgg gctccagcga t 10011117916DNAHomo
sapiens 11cacccggccc ggctccctcc ctgcccgtcc ccgtcccccc acccgtgcca
gcccccagga 60tgggtgcccc gagcgcgtgc cggacgctgg tgttggctct ggcggccatg
ctcgtggtgc 120cgcaggcaga gacccagggc cctgtggagc cgagctggga
gaatgcaggg cacaccatgg 180atggcggtgc cccgacgtcc tcgcccaccc
ggcgcgtgag ctttgttcca cccgtcactg 240tcttccccag cctgagcccc
ctgaacccgg cgcacaatgg gcgggtgtgc agcacctggg 300gtgacttcca
ctacaagacc ttcgacggcg acgtcttccg cttccctggc ctttgcaact
360acgtgttctc tgagcactgc cgcgccgcct acgaggactt caacgtccag
ctacgccgag 420gcctagtggg ctccaggcct gtggtcaccc gtgttgtcat
caaggcccag gggctggtgc 480tggaggcgtc caacggctcc gtcctcatca
atgggcagcg ggaggagctg ccttacagcc 540gcactggcct cctggtggag
cagagcgggg actacatcaa ggtcagcatc cggctggtgc 600tgacattcct
gtggaacgga gaggacagtg ccctgctgga gctggatccc aaatacgcca
660accagacctg tggcctgtgt ggggacttca acggcctccc ggccttcaac
gagttctatg 720cccacaacgc caggctgacc ccgctccagt ttgggaacct
gcagaagttg gatgggccca 780cggagcagtg cccggacccg ctgcccttgc
cggccggcaa ctgcacggac gaggagggca 840tctgccaccg caccctgctg
gggccggcct ttgcggagtg ccacgcactg gtggacagca 900ctgcgtacct
ggccgcctgc gcccaggacc tgtgccgctg ccccacctgc ccgtgtgcca
960cctttgtgga atactcacgc cagtgcgccc acgcgggggg ccagccgcgg
aactggaggt 1020gccctgagct ctgcccccgg acctgccccc tcaacatgca
gcaccaggag tgtggctcac 1080cctgcacgga cacctgctcc aacccccagc
gcgcgcagct ctgcgaggac cactgtgtgg 1140acggctgctt ctgcccccca
ggcacggtgc tggatgacat cacgcactct ggctgcctgc 1200ccctcgggca
gtgcccctgc acccacggcg gccgcaccta cagcccgggc acctccttca
1260acaccacctg cagctcctgc acctgctccg gggggctatg gcagtgccag
gacctgccgt 1320gccctggcac ctgctctgtg cagggcgggg cccacatctc
cacctatgat gagaaactct 1380acgacctgca tggtgactgc agctacgttc
tgtccaagaa atgtgccgac agcagcttca 1440ccgtgctggc tgagctgcgg
aagtgcggcc tgacggacaa cgagaactgc ctgaaagcgg 1500tgacgctcag
cctggacggc ggggacacgg ccatccgggt ccaagcggac ggcggcgtgt
1560tcctcaactc catctacacg cagctgcccc tgtcggcagc caacatcacc
ctgttcacac 1620cctcgagctt cttcatcgtg gtgcagacag gcctggggct
gcagctgctg gtgcagctgg 1680tgccactcat gcaggtgttt gtcaggctgg
accccgccca ccagggccag atgtgcggcc 1740tgtgtgggaa cttcaaccag
aaccaggctg acgacttcac ggccctcagc ggggtggtgg 1800aggccacggg
cgcagccttc gccaacacct ggaaggccca ggctgcctgt gccaatgcca
1860ggaacagctt tgaggacccc tgctccctca gtgtggagaa tgagaactac
gcccggcact 1920ggtgctcgcg cctgaccgat cccaacagtg ccttctcgcg
ctgccactcc atcatcaacc 1980ccaagccctt ccactcgaac tgcatgtttg
acacctgcaa ctgtgagcgg agcgaggact 2040gcctgtgcgc cgcgctgtcc
tcctatgtgc acgcctgtgc cgccaagggc gtacagctca 2100gcgactggag
ggacggcgtc tgcaccaagt acatgcagaa ctgccccaag tcccagcgct
2160acgcctacgt ggtggatgcc tgccagccca cttgccgcgg cctgagtgag
gccgacgtca 2220cctgcagcgt ttccttcgtg cctgtggacg gctgcacctg
ccccgcgggc accttcctca 2280atgacgcggg cgcctgtgtg cccgcccagg
agtgcccctg ctacgctcac ggcaccgtgc 2340tggctcctgg agaggtggtg
cacgacgagg gcgccgtgtg ttcatgtacg ggtgggaagc 2400taagctgcct
gggagcctct ctgcagaaaa gcacagggtg tgcagccccc atggtgtacc
2460tggactgcag caacagctcg gcgggcaccc ctggggccga gtgcctccgg
agctgccaca 2520cgctggacgt gggctgtttc agcacacact gcgtgtccgg
ctgtgtctgt cccccggggc 2580tggtgtcgga tgggagtggg ggctgcattg
ccgaggagga ctgcccctgt gtgcacaacg 2640aggccaccta caagcctgga
gagaccatca gggtcgactg caacacctgc acctgcagga 2700accggaggtg
ggagtgcagc caccggctct gcctgggcac ctgcgtggcc tacggggatg
2760gccacttcat cacctttgat ggcgatcgct acagctttga aggcagctgc
gagtacatct 2820tggcccagga ctactgtggg gacaacacca cccacgggac
cttccgcatc gtcaccgaga 2880acatcccctg tgggaccacc ggcaccacct
gctccaaggc catcaagctc ttcgtggaga 2940gctacgagct gatcctccaa
gaggggacct ttaaggcggt ggcgagaggg ccgggtgggg 3000acccacccta
caagatacgc tacatgggga tcttcctggt catcgagacc cacgggatgg
3060ccgtgtcctg ggaccggaag accagcgtgt tcatccgact gcaccaggac
tacaagggca 3120gggtctgcgg cctgtgcggg aacttcgacg acaatgccat
caatgacttt gccacgcgta 3180gccggtccgt ggtgggggac gcactggagt
ttgggaacag ctggaagctc tccccctcct 3240gcccggacgc cctggcaccc
aaggacccct gcacggccaa ccccttccgc aagtcctggg 3300cccagaagca
gtgcagcatc ctccacggcc ccaccttcgc cgcctgccgc tcccaggttg
3360actccaccaa gtactacgag gcctgcgtga acgacgcgtg tgcctgcgac
tcgggtggcg 3420actgcgagtg tttctgcacg gctgtggctg cctacgccca
ggcctgccac gacgcgggcc 3480tgtgtgtgtc ctggcggact ccggacacct
gccccttgtt ctgtgacttc tacaacccac 3540atgggggctg tgagtggcac
taccagccct gcggggcacc ctgcctaaaa acctgccgga 3600accccagtgg
gcactgcctg gtggacctgc ctggcctgga aggctgctac ccgaagtgcc
3660cacccagcca gcccttcttc aatgaggacc agatgaagtg cgtggcccag
tgtggctgct 3720acgacaagga cggaaactac tatgacgtcg gtgcaagggt
ccccacagcg gagaactgcc 3780agagctgtaa ctgcacaccc agtggcatcc
agtgcgctca cagccttgag gcctgcacct 3840gcacctatga ggacaggacc
tacagctacc aggacgtcat ctacaacacc accgatgggc 3900ttggcgcctg
cttgatcgcc atctgcggaa gcaacggcac catcatcagg aaggctgtgg
3960catgtcctgg aactccagcc acaacgccat tcaccttcac caccgcctgg
gtcccccact 4020ccacgacaag cccggccctc ccggtctcca ccgtgtgtgt
ccgcgaggtc tgccgctggt 4080ccagctggta caatgggcac cgcccagagc
ccggcctggg aggcggagac tttgagacgt 4140ttgaaaacct gaggcagaga
gggtaccagg tatgccctgt gctggctgac atcgagtgcc 4200gggcggcgca
gcttcccgac atgccgctgg aggagctggg ccagcaggtg gactgtgacc
4260gcatgcgggg gctgatgtgc gccaacagcc aacagagtcc cccgctctgt
cacgactacg 4320agctgcgggt tctctgctgc gaatacgtgc cctgtggccc
ctccccggcc ccaggcacca 4380gccctcagcc ctccctcagt gccagcacgg
agcctgctgt gcctacccca acccagacca 4440cagcaaccga aaagaccacc
ctatgggtga ccccgagcat ccggtcgacg gcggccctca 4500cctcgcagac
tgggtccagc tcaggccccg tgacggtcac cccctcggcc ccaggtacca
4560ccacctgcca gccccggtgt cagtggacag agtggtttga tgaggactac
cccaagtctg 4620aacaacttgg aggggacgtt gagtcctacg ataagatcag
ggccgctgga gggcacttat 4680gccagcagcc taaggacata gagtgccagg
ccgagagctt ccccaactgg accctggcac 4740aggtggggca gaaggtgcac
tgtgacgtcc acttcggcct ggtgtgcagg aactgggagc 4800aggagggcgt
cttcaagatg tgctacaact acaggatccg ggtcctctgc tgcagtgacg
4860accactgcag gggacgtgcc acaaccccgc caccgaccac agagctggag
acggccacca 4920ccaccaccac ccaggccctg ttctcaacgc cgcagcctac
gagtagcccg gggctgacca 4980gggctccccc ggccagcacc
acagcagtcc ccaccctctc agaaggactg acatccccca 5040gatacacaag
cacccttggt acagccacca cgggaggccc cacgacgcct gcaggctcca
5100cagaacccac tgtcccaggg gtggccacat ccacccttcc aacacgctca
gcccttccag 5160ggacgacggg gagcttgggc acatggcgcc cctcacagcc
acccacgctg gccccaacaa 5220caatggcaac ctccagagct cgcccgacag
gcacagccag caccgcttcc aaagagccgc 5280tgaccacgag cctggcgcca
acactcacga gcgagctgtc cacctctcag gccgagacca 5340gcacgcccag
gacagagacg acaatgagcc ccttgactaa caccaccacc agccagggca
5400cgacccgctg tcaaccgaag tgtgagtgga cagagtggtt tgacgtggac
ttcccaacct 5460caggggttgc aggcggggac atggaaactt ttgaaaacat
cagggctgct gggggcaaga 5520tgtgctgggc accaaagagc atagagtgcc
gggcggagaa ctaccccgag gtaagcatcg 5580accaggtcgg gcaggtgctg
acctgcagcc tggagacggg gctgacctgc aagaacgaag 5640accagacagg
caggttcaac atgtgcttca actacaacgt gcgtgtgctt tgctgtgacg
5700actacagcca ctgccccagt accccagcca ccagctccac ggccacgccc
tcctcaactc 5760cggggacgac ctggatcctc acaaagccga ccacaacagc
cactacgact gcgtccactg 5820gatccacggc caccccgacc tccaccctga
gaacagctcc ccctcccaaa gtgctgacca 5880ccacggccac cacacccaca
gtcaccagct ccaaagccac tccctcctcc agtccaggga 5940ctgcaaccgc
ccttccagca ctgagaagca cagccaccac acccacagct accagcgtta
6000cacccatccc ctcttcctcc ctgggcacca cctggacccg cctatcacag
accaccacac 6060ccacggccac catgtccaca gccacaccct cctccactcc
agagactgcc cacacctcca 6120cagtgcttac cgccacggcc accacaactg
gggccaccgg ctctgtggcc accccctcct 6180ccaccccagg aacagctcac
actaccaaag tgccaactac cacaaccacg ggcttcacag 6240ccaccccctc
ctccagccca gggacggcac tcacgcctcc agtgtggatc agcacaacca
6300ccacacccac aaccagaggc tccacggtga ccccctcctc catcccgggg
accacccaca 6360ccgccacagt gctgaccacc accaccacaa ctgtggccac
tggttctatg gcaacaccct 6420cctctagcac acagaccagt ggtactcccc
catcactgac caccacggcc actacgatca 6480cggccaccgg ctccaccacc
aacccctcct caactcctgg gacaactccc atccccccag 6540tgctgaccac
caccgccacc acacctgcag ccaccagcaa cacagtgact ccctcctctg
6600ccctagggac cacccacaca cccccagtgc cgaacaccat ggccaccaca
cacgggcgat 6660ccctgccccc cagcagtccc cacacggtgc gcacagcctg
gacttcggcc acctcgggca 6720tcttgggcac cacccacatc acagagcctt
ccacggtgac ttcccacacc ctagcagcaa 6780ccaccggtac cacccagcac
tcgactccag ccctttccag ccctcaccct agcagcagaa 6840ccaccgagtc
acccccttct ccagggacga ccaccccggg ccacaccacg gccacctcca
6900ggaccacagc cacggccaca cccagcaaga cccgcacctc gaccctgctg
cccagcagcc 6960ccacatcggc ccccataacc acggtggtga ccatgggctg
tgagccccag tgtgcctggt 7020cagagtggct ggactacagc taccccatgc
cggggccctc tggcggggac tttgacacct 7080actccaacat ccgtgcggcc
ggaggggccg tctgtgagca gcccctgggc ctcgagtgcc 7140gtgcccaggc
ccagcctggt gtccccctgc gggagttggg ccaggtcgtg gaatgcagcc
7200tggactttgg cctggtctgc aggaaccgtg agcaggtggg gaagttcaag
atgtgcttca 7260actatgaaat ccgtgtgttc tgctgcaact acggccactg
ccccagcacc ccggccacca 7320gctctacggc catgccctcc tccactccgg
ggacgacctg gatcctcaca gagctgacca 7380caacagccac tacgactgag
tccactggat ccacggccac cccgtcctcc accccaggga 7440ccacctggat
cctcacagag ccgagcacta cagccaccgt gacggtgccc accggatcca
7500cggccaccgc ctcctccacc caggcaactg ctggcacccc acatgtgagc
accacggcca 7560cgacacccac agtcaccagc tccaaagcca ctcccttctc
cagtccaggg actgcaaccg 7620cccttccagc actgagaagc acagccacca
cacccacagc taccagcttt acagccatcc 7680cctcctcctc cctgggcacc
acctggaccc gcctatcaca gaccaccaca cccacggcca 7740ccatgtccac
agccacaccc tcctccactc cagagactgt ccacacctcc acagtgctta
7800ccaccacggc caccacaacc ggggccaccg gctctgtggc caccccctcc
tccaccccag 7860gaacagctca cactaccaaa gtgctgacta ccacaaccac
gggcttcaca gccaccccct 7920cctccagccc agggacggca cgcacgcttc
cagtgtggat cagcacaacc accacaccca 7980caaccagagg ttccacggtg
accccctcct ccatcccggg gaccacccac acccccacag 8040tgctgaccac
caccaccaca actgtggcca ctggttctat ggcaacaccc tcctctagca
8100cacagaccag tggtactccc ccatcactga ccaccacggc cactacgatc
acggccaccg 8160gctccaccac caacccctcc tcaactccag ggacaacacc
tatcccccca gtgctgacca 8220ccaccgccac cacacctgca gccaccagca
gcacagtgac tccctcctct gccctaggga 8280ccacccacac acccccagtg
ccgaacacca cggccaccac acacgggcga tccctgtccc 8340ccagcagtcc
ccacacggtg cgcacagcct ggacttcggc cacctcaggc accttgggca
8400ccacccacat cacagagcct tccacgggga cttcccacac cccagcagca
accaccggta 8460ccacccagca ctcgactcca gccctgtcca gccctcaccc
tagcagcagg accaccgagt 8520cacccccttc tccagggacg accaccccgg
gccacaccag ggccacctcc aggaccacgg 8580ccacggccac acccagcaag
acccgcacct cgaccctgct gcccagcagc cccacatcgg 8640ccccaataac
cacggtggtg accatgggct gtgagcccca gtgtgcctgg tcagagtggc
8700tggactacag ctaccccatg ccggggccct ctggcgggga ctttgacacc
tactccaaca 8760tccgtgcggc cggaggggcc gtctgtgagc agcccctggg
cctcgagtgc cgtgcccagg 8820cccagcctgg tgtccccctg cgggagttgg
gccaggtcgt ggaatgcagc ctggactttg 8880gcctggtctg caggaaccgt
gagcaggtgg ggaagttcaa gatgtgcttc aactatgaaa 8940tccgtgtgtt
ctgctgcaac tacggccact gccccagcac cccggccacc agctctacgg
9000ccacgccctc ctccactcca gggacgacct ggatcctcac agagcagacc
acagcagcca 9060ctacgaccgc aaccactgga tccacggcca tcccgtcctc
caccccggga acagctcccc 9120ctcccaaagt gctgaccagc acggccacca
cacccacagc caccagttcc aaagccactt 9180cctcctccag tccaaggact
gcaaccaccc ttccagtgct gacaagcaca gccaccaaat 9240ccacagctac
cagctttaca cccatcccct ccttcaccct tgggaccacc gggaccctcc
9300cagaacagac caccacaccc atggccacca tgtccacaat ccacccctcc
tccactccgg 9360agaccaccca cacctccaca gtgctgacca cgaaggccac
cacgacaagg gccaccagtt 9420ccatgtccac cccctcctcc actccgggga
cgacctggat cctcacagag ctgaccacag 9480cagccactac aactgcagcc
actggcccca cggccacccc gtcctccacc ccagggacca 9540cctggatcct
cacagagccc agcactacag ccaccgtgac ggtgcccacc ggatccacgg
9600ccaccgcctc ctccacccgg gcaactgctg gcaccctcaa agtgctgacc
agcacggcca 9660ccacacccac agtcatcagc tccagagcca ctccctcctc
cagtccaggg actgcaaccg 9720cccttccagc actgagaagc acagccacca
cacccacagc taccagcgtt acagccatcc 9780cctcttcctc cctgggcacc
gcctggaccc gcctatcaca gaccaccaca cccacggcca 9840ccatgtccac
agccacaccc tcctctactc cagagactgt ccacacctcc acagtgctta
9900ccaccacgac caccacaacc agggccaccg gctctgtggc caccccctcc
tccaccccag 9960gaacagctca cactaccaaa gtgccgacta ccacaaccac
gggcttcaca gccaccccct 10020cctccagccc agggacggca ctcacgcctc
cagtgtggat cagcacaacc accacaccca 10080caaccagagg ctccacggtg
accccctcct ccatcccggg gaccacccac accgccacag 10140tgctgaccac
caccaccaca actgtggcca ctggttctat ggcaacaccc tcctctagca
10200cacagaccag tggtactccc ccatcactga ccaccacggc cactacgatc
acagccaccg 10260gctccaccac caacccctcc tcaactccag ggacaactcc
catcccccca gtgctgacca 10320ccaccgccac cacacctgca gccaccagca
gcacagtgac tccctcctct gccctaggga 10380ccacccacac acccccagtg
ccgaacacca cggccaccac acacgggcgg tccctgcccc 10440ccagcagtcc
ccacacggtg cgcacagcct ggacttcggc cacctcgggc atcttgggca
10500ccacccacat cacagagcct tccacggtga cttcccacac cccagcagca
accaccagta 10560ccacccagca ctcgactcca gccctgtcca gccctcaccc
tagcagcagg accaccgagt 10620cacccccttc tccagggacg accaccccgg
gccacaccag gggcacctcc aggaccacag 10680ccacagccac acccagcaag
acccgcacct cgaccctgct gcccagcagc cccacatcgg 10740cccccataac
cacggtggtg accacgggct gtgagcccca gtgtgcctgg tcagagtggc
10800tggactacag ctaccccatg ccggggccct ctggcgggga ctttgacacc
tactccaaca 10860tccgtgcggc cggaggggca gtctgtgagc agcccctggg
cctcgagtgc cgtgcccagg 10920cccagcctgg tgtccccctg cgggagttgg
gccaggtcgt ggaatgcagc ctggactttg 10980gcctggtctg caggaaccgt
gagcaggtgg ggaagttcaa gatgtgcttc aactatgaaa 11040tccgtgtgtt
ctgctgcaac tacggccact gccccagcac cccggccacc agctctacgg
11100ccacgccctc ctcaactccg gggacgacct ggatcctcac aaagctgacc
acaacagcca 11160ctacgactga gtccactgga tccacggcca ccccgtcctc
caccccaggg accacctgga 11220tcctcacaga gccgagcact acagccaccg
tgacggtgcc caccggatcc acggccaccg 11280cctcctccac ccaggcaact
gctggcaccc cacatgtgag caccacggcc acgacaccca 11340cagtcaccag
ctccaaagcc actcccttct ccagtccagg gactgcaacc gcccttccag
11400cactgagaag cacagccacc acacccacag ctaccagctt tacagccatc
ccctcctcct 11460ccctgggcac cacctggacc cgcctatcac agaccaccac
acccacggcc accatgtcca 11520cagccacacc ctcctccact ccagagactg
cccacacctc cacagtgctt accaccacgg 11580ccaccacaac cagggccacc
ggctctgtgg ccaccccctc ttccacccca ggaacagctc 11640acactaccaa
agtgccgact accacaacca cgggcttcac agtcaccccc tcctccagcc
11700cagggacggc acgcacgcct ccagtgtgga tcagcacaac caccacaccc
acaaccagtg 11760gctccacggt gaccccctcc tccgtcccgg ggaccaccca
cacccccaca gtgctgacca 11820ccaccaccac aactgtggcc actggttcta
tggcaacacc ctcctctagc acacagacca 11880gtggtactcc cccatcactg
atcaccacgg ccactacgat cacggccacc ggctccacca 11940ccaacccctc
ctcaactcca gggacaacac ctatcccccc agtgctgacc accaccgcca
12000ccacacctgc agccaccagc agcacagtga ctccctcctc tgccctaggg
accacccaca 12060cacccccagt gccgaacacc acggccacca cacacgggcg
atccctgtcc cccagcagtc 12120cccacacggt gcgcacagcc tggacttcgg
ccacctcagg caccttgggc accacccaca 12180tcacagagcc ttccacgggg
acttcccaca ccccagcagc aaccaccggt accacccagc 12240actcgactcc
agccctgtcc agccctcacc ctagcagcag gaccaccgag tcaccccctt
12300ccccagggac gaccaccccg ggccacacca cggccacctc caggaccacg
gccacggcca 12360cacccagcaa gacccgcacc tcgaccctgc tgcccagcag
ccccacatcg gcccccataa 12420ccacggtggt gaccacgggc tgtgagcccc
agtgtgcctg gtcagagtgg ctggactaca 12480gctaccccat gccggggccc
tctggcgggg actttgacac ctactccaac atccgtgcgg 12540ccggaggggc
cgtctgtgag cagcccctgg gcctcgagtg ccgtgcccag gcccagcctg
12600gtgtccccct gggggagttg ggccaggtcg tggaatgcag cctggacttt
ggcctggtct 12660gcaggaaccg tgagcaggtg gggaagttca agatgtgctt
caactatgaa atccgtgtgt 12720tctgctgcaa ctacggccac tgccccagca
ccccggccac cagctctacg gccatgccct 12780cctccactcc ggggacgacc
tggatcctca cagagctgac cacaacagcc actacgactg 12840catccactgg
atccacggcc accccgtcct ccaccccggg aacagctccc cctcccaaag
12900tgctgaccag cccggccacc acacccacag ccaccagttc caaagccact
tcctcctcca 12960gtccaaggac tgcaaccacc cttccagtgc tgacaagcac
agccaccaaa tccacagcta 13020ccagcgttac acccatcccc tcctccaccc
ttgggaccac cgggaccctc ccagaacaga 13080ccaccacacc cgtggccacc
atgtccacaa tccacccctc ctccactccg gagaccaccc 13140acacctccac
agtgctgacc acgaaggcca ccacgacaag ggccaccagt tccacgtcca
13200ccccctcctc cactccgggg acgacctgga tcctcacaga gctgaccaca
gcagccacta 13260caactgcagc cactggcccc acggccaccc cgtcctccac
cccagggacc acctggatcc 13320tcacagagct gaccacaaca gccactacga
ctgcgtccac tggatccacg gccaccccgt 13380cctccacccc agggaccacc
tggatcctca cagagccgag cactacagcc accgtgacgg 13440tgcccaccgg
atccacggcc accgcctcct ccacccaggc aactgctggc accccacatg
13500tgagcaccac ggccacgaca cccacagtca ccagctccaa agccactccc
tcctccagtc 13560cagggactgc aactgccctt ccagcactga gaagcacagc
caccacaccc acagctacca 13620gctttacagc catcccctcc tcctccctgg
gcaccacctg gacccgccta tcacagacca 13680ccacacccac ggccaccatg
tccacagcca caccctcctc cactccagag actgtccaca 13740cctccacagt
gcttaccgcc acggccacca caaccggggc caccggctct gtggccaccc
13800cctcctccac cccaggaaca gctcacacta ccaaagtgcc gactaccaca
accacgggct 13860tcacagccac cccctcctcc agcccaggga cggcactcac
gcctccagtg tggatcagca 13920caaccaccac acccacaacc accacaccca
caaccagtgg ctccacggtg accccctcct 13980ccatcccggg gaccacccac
accgccagag tgctgaccac caccaccaca actgtggcca 14040ctggttctat
ggcaacaccc tcctctagca cacagaccag tggtactccc ccatcactga
14100ccaccacggc cactacgatc acggccaccg gctccaccac caacccctcc
tcaactccag 14160ggacaacacc catcacccca gtgctgacca gcacggccac
cacacccgca gccaccagct 14220ccaaagccac ttcctcctcc agtccaagga
ctgcaaccac ccttccagtg ctgacaagca 14280cagccacaaa atccacagct
accagcttta cacccatccc ctcctccacc ctgtggacca 14340cgtggaccgt
cccagcacag accaccacac ccatgtccac catgtccaca atccacacct
14400cctctactcc agagaccacc cacacctcca cagtgctgac caccacagcc
accatgacaa 14460gggccaccaa ttccacggcc acaccctcct ccactctggg
gacgacccgg atcctcactg 14520agctgaccac aacagccact acaactgcag
ccactggatc cacggccacc ctgtcctcca 14580ccccagggac cacctggatc
ctcacagagc cgagcactat agccaccgtg atggtgccca 14640ccggttccac
ggccaccgcc tcctccactc tgggaacagc tcacaccccc aaagtggtga
14700ccaccatggc cactatgccc acagccactg cctccacggt tcccagctcg
tccaccgtgg 14760ggaccacccg cacccctgca gtgctcccca gcagcctgcc
aaccttcagc gtgtccactg 14820tgtcctcctc agtcctcacc accctgagac
ccactggctt ccccagctcc cacttctcta 14880ctccctgctt ctgcagggca
tttggacagt ttttctcgcc cggggaagtc atctacaata 14940agaccgaccg
agccggctgc catttctacg cagtgtgcaa tcagcactgt gacattgacc
15000gcttccaggg cgcctgtccc acctccccac cgccagtgtc ctccgccccg
ctgtcctcgc 15060cctcccctgc ccctggctgt gacaatgcca tccctctccg
gcaggtgaat gagacctgga 15120ccctggagaa ctgcacggtg gccaggtgcg
tgggtgacaa ccgtgtcgtc ctgctggacc 15180caaagcctgt ggccaacgtc
acctgcgtga acaagcacct gcccatcaaa gtgtcggacc 15240cgagccagcc
ctgtgacttc cactatgagt gcgagtgcat ctgcagcatg tggggcggct
15300cccactattc cacctttgac ggcacctctt acaccttccg gggcaactgc
acctatgtcc 15360tcatgagaga gatccatgca cgctttggga atctcagcct
ctacctggac aaccactact 15420gcacggcctc tgccactgcc gctgccgccc
gctgcccccg cgccctcagc atccactaca 15480agtccatgga tatcgtcctc
actgtcacca tggtgcatgg gaaggaggag ggcctgatcc 15540tgtttgacca
aattccggtg agcagcggtt tcagcaagaa cggcgtgctt gtgtctgtgc
15600tggggaccac caccatgcgt gtggacattc ctgccctggg cgtgagcgtc
accttcaatg 15660gccaagtctt ccaggcccgg ctgccctaca gcctcttcca
caacaacacc gagggccagt 15720gcggcacctg caccaacaac cagagggacg
actgtctcca gcgggacgga accactgccg 15780ccagttgcaa ggacatggcc
aagacgtggc tggtccccga cagcagaaag gatggctgct 15840gggccccgac
tggcacaccc cccactgcca gccccgcagc cccggtgtct agcacaccca
15900cccccacccc atgcccacca cagccgctct gtgatctgat gctgagccag
gtctttgctg 15960agtgccacaa ccttgtgccc ccgggcccat tcttcaacgc
ctgcatcagc gaccactgca 16020ggggccgcct tgaggtgccc tgccagagcc
tggaggctta cgcagagctc tgccgcgccc 16080ggggagtgtg cagtgactgg
cgaggtgcaa ccggtggcct gtgcgacctc acctgcccac 16140ccaccaaagt
gtacaagcca tgcggcccca tacagcctgc cacctgcaac tctaggaacc
16200agagcccaca gctggagggg atggcggagg gctgcttctg ccctgaggac
cagatcctct 16260tcaacgcaca catgggcatc tgcgtgcagg cctgcccctg
cgtgggaccc gatgggtttc 16320ctaaatttcc cggggagcgg tgggtcagca
actgccagtc ctgcgtgtgt gacgagggtt 16380cagtgtcggt gcagtgcaag
cccctgccct gtgacgccca gggtcagccc ccgccgtgca 16440accgtcccgg
cttcgtaacc gtgaccaggc cccgggccga gaacccctgc tgccccgaga
16500cggtgtgcgt gtgcaacaca accacctgcc cccagagcct gcctgtgtgc
ccgccagggc 16560aggagtccat ctgcacccag gaggagggcg actgctgtcc
caccttccgc tgcagacctc 16620agctgtgttc gtacaatggc accttctacg
gggttggtgc aaccttccca ggcgcccttc 16680cctgccacat gtgtacctgc
ctctctgggg acacccagga cccaacggtg caatgtcagg 16740aggatgcctg
caacaatact acctgtcccc agggctttga gtacaagaga gtggccgggc
16800agtgctgtgg ggagtgcgtc cagaccgcct gcctcacgcc cgatggccag
ccagtccagc 16860tgaatgaaac ctgggtcaac agccatgtgg acaactgcac
cgtgtacctc tgtgaggctg 16920agggtggagt ccatttgctg accccacagc
ctgcatcctg cccagatgtg tccagctgca 16980gggggagcct caggaaaacc
ggctgctgct actcctgtga ggaggactcc tgtcaagtcc 17040gcatcaacac
gaccatcctg tggcaccagg gctgcgagac cgaggtcaac atcaccttct
17100gcgagggctc ctgccccgga gcgtccaagt actcagcaga ggcccaggcc
atgcagcacc 17160agtgcacctg ctgccaggag aggcgggtcc acgaggagac
ggtgcccttg cactgtccta 17220acggctcagc catcctgcac acctacaccc
acgtggatga gtgtggctgc acgcccttct 17280gtgtccctgc gcccatggct
cccccacaca cccgtggctt cccggcccag gaggccactg 17340ctgtctgaga
acgttctgcc tccatcccca tgctctgtcc acctggagcc aggatgtgca
17400ttgtctgatc atgaaaacct tgggcctcct ctgcggagcc ccccggcctg
tgtgtggcac 17460cccgcgctcc gtgctcctgc tgcccacccc gtgggtgaaa
ccggccccag aagggtgagg 17520ggccagcagg acccctttcg ggagggcgcc
actcaggagt cctaccctgg gagagcctgt 17580ggcccacctt ggccttgccc
ctccctgatg tcactgggac gccctggaac aaactaagca 17640tgtgcgggcc
tatgtgtccc tgccacggcc ggagcgcccg cgcagcacgg attccagctg
17700gccacgtccg gccgctgggg cagacaggct ggtccaggca aggccagctg
ctgccaggaa 17760gctgcgacag gcaaggcggc cgcctgtcca tgcctgctgc
agggtaactc agggctgagg 17820tcgcaacggc caggtcagag aggggtcagc
atcccaaagc cccctctgct caacccagcc 17880cagttttgca aataaaccct
gagcattgag tacgtt 17916125762PRTHomo sapiens 12Met Gly Ala Pro Ser
Ala Cys Arg Thr Leu Val Leu Ala Leu Ala Ala1 5 10 15Met Leu Val Val
Pro Gln Ala Glu Thr Gln Gly Pro Val Glu Pro Ser 20 25 30Trp Glu Asn
Ala Gly His Thr Met Asp Gly Gly Ala Pro Thr Ser Ser 35 40 45Pro Thr
Arg Arg Val Ser Phe Val Pro Pro Val Thr Val Phe Pro Ser 50 55 60Leu
Ser Pro Leu Asn Pro Ala His Asn Gly Arg Val Cys Ser Thr Trp65 70 75
80Gly Asp Phe His Tyr Lys Thr Phe Asp Gly Asp Val Phe Arg Phe Pro
85 90 95Gly Leu Cys Asn Tyr Val Phe Ser Glu His Cys Arg Ala Ala Tyr
Glu 100 105 110Asp Phe Asn Val Gln Leu Arg Arg Gly Leu Val Gly Ser
Arg Pro Val 115 120 125Val Thr Arg Val Val Ile Lys Ala Gln Gly Leu
Val Leu Glu Ala Ser 130 135 140Asn Gly Ser Val Leu Ile Asn Gly Gln
Arg Glu Glu Leu Pro Tyr Ser145 150 155 160Arg Thr Gly Leu Leu Val
Glu Gln Ser Gly Asp Tyr Ile Lys Val Ser 165 170 175Ile Arg Leu Val
Leu Thr Phe Leu Trp Asn Gly Glu Asp Ser Ala Leu 180 185 190Leu Glu
Leu Asp Pro Lys Tyr Ala Asn Gln Thr Cys Gly Leu Cys Gly 195 200
205Asp Phe Asn Gly Leu Pro Ala Phe Asn Glu Phe Tyr Ala His Asn Ala
210 215 220Arg Leu Thr Pro Leu Gln Phe Gly Asn Leu Gln Lys Leu Asp
Gly Pro225 230 235 240Thr Glu Gln Cys Pro Asp Pro Leu Pro Leu Pro
Ala Gly Asn Cys Thr 245 250 255Asp Glu Glu Gly Ile Cys His Arg Thr
Leu Leu Gly Pro Ala Phe Ala 260 265 270Glu Cys His Ala Leu Val Asp
Ser Thr Ala Tyr Leu Ala Ala Cys Ala 275 280 285Gln Asp Leu Cys Arg
Cys Pro Thr Cys Pro Cys Ala Thr Phe Val Glu 290 295 300Tyr Ser Arg
Gln Cys Ala His Ala Gly Gly Gln Pro Arg Asn Trp Arg305 310 315
320Cys Pro Glu Leu Cys Pro Arg Thr Cys Pro Leu Asn Met Gln His Gln
325 330 335Glu Cys Gly Ser Pro Cys Thr Asp Thr Cys Ser Asn Pro Gln
Arg Ala
340 345 350Gln Leu Cys Glu Asp His Cys Val Asp Gly Cys Phe Cys Pro
Pro Gly 355 360 365Thr Val Leu Asp Asp Ile Thr His Ser Gly Cys Leu
Pro Leu Gly Gln 370 375 380Cys Pro Cys Thr His Gly Gly Arg Thr Tyr
Ser Pro Gly Thr Ser Phe385 390 395 400Asn Thr Thr Cys Ser Ser Cys
Thr Cys Ser Gly Gly Leu Trp Gln Cys 405 410 415Gln Asp Leu Pro Cys
Pro Gly Thr Cys Ser Val Gln Gly Gly Ala His 420 425 430Ile Ser Thr
Tyr Asp Glu Lys Leu Tyr Asp Leu His Gly Asp Cys Ser 435 440 445Tyr
Val Leu Ser Lys Lys Cys Ala Asp Ser Ser Phe Thr Val Leu Ala 450 455
460Glu Leu Arg Lys Cys Gly Leu Thr Asp Asn Glu Asn Cys Leu Lys
Ala465 470 475 480Val Thr Leu Ser Leu Asp Gly Gly Asp Thr Ala Ile
Arg Val Gln Ala 485 490 495Asp Gly Gly Val Phe Leu Asn Ser Ile Tyr
Thr Gln Leu Pro Leu Ser 500 505 510Ala Ala Asn Ile Thr Leu Phe Thr
Pro Ser Ser Phe Phe Ile Val Val 515 520 525Gln Thr Gly Leu Gly Leu
Gln Leu Leu Val Gln Leu Val Pro Leu Met 530 535 540Gln Val Phe Val
Arg Leu Asp Pro Ala His Gln Gly Gln Met Cys Gly545 550 555 560Leu
Cys Gly Asn Phe Asn Gln Asn Gln Ala Asp Asp Phe Thr Ala Leu 565 570
575Ser Gly Val Val Glu Ala Thr Gly Ala Ala Phe Ala Asn Thr Trp Lys
580 585 590Ala Gln Ala Ala Cys Ala Asn Ala Arg Asn Ser Phe Glu Asp
Pro Cys 595 600 605Ser Leu Ser Val Glu Asn Glu Asn Tyr Ala Arg His
Trp Cys Ser Arg 610 615 620Leu Thr Asp Pro Asn Ser Ala Phe Ser Arg
Cys His Ser Ile Ile Asn625 630 635 640Pro Lys Pro Phe His Ser Asn
Cys Met Phe Asp Thr Cys Asn Cys Glu 645 650 655Arg Ser Glu Asp Cys
Leu Cys Ala Ala Leu Ser Ser Tyr Val His Ala 660 665 670Cys Ala Ala
Lys Gly Val Gln Leu Ser Asp Trp Arg Asp Gly Val Cys 675 680 685Thr
Lys Tyr Met Gln Asn Cys Pro Lys Ser Gln Arg Tyr Ala Tyr Val 690 695
700Val Asp Ala Cys Gln Pro Thr Cys Arg Gly Leu Ser Glu Ala Asp
Val705 710 715 720Thr Cys Ser Val Ser Phe Val Pro Val Asp Gly Cys
Thr Cys Pro Ala 725 730 735Gly Thr Phe Leu Asn Asp Ala Gly Ala Cys
Val Pro Ala Gln Glu Cys 740 745 750Pro Cys Tyr Ala His Gly Thr Val
Leu Ala Pro Gly Glu Val Val His 755 760 765Asp Glu Gly Ala Val Cys
Ser Cys Thr Gly Gly Lys Leu Ser Cys Leu 770 775 780Gly Ala Ser Leu
Gln Lys Ser Thr Gly Cys Ala Ala Pro Met Val Tyr785 790 795 800Leu
Asp Cys Ser Asn Ser Ser Ala Gly Thr Pro Gly Ala Glu Cys Leu 805 810
815Arg Ser Cys His Thr Leu Asp Val Gly Cys Phe Ser Thr His Cys Val
820 825 830Ser Gly Cys Val Cys Pro Pro Gly Leu Val Ser Asp Gly Ser
Gly Gly 835 840 845Cys Ile Ala Glu Glu Asp Cys Pro Cys Val His Asn
Glu Ala Thr Tyr 850 855 860Lys Pro Gly Glu Thr Ile Arg Val Asp Cys
Asn Thr Cys Thr Cys Arg865 870 875 880Asn Arg Arg Trp Glu Cys Ser
His Arg Leu Cys Leu Gly Thr Cys Val 885 890 895Ala Tyr Gly Asp Gly
His Phe Ile Thr Phe Asp Gly Asp Arg Tyr Ser 900 905 910Phe Glu Gly
Ser Cys Glu Tyr Ile Leu Ala Gln Asp Tyr Cys Gly Asp 915 920 925Asn
Thr Thr His Gly Thr Phe Arg Ile Val Thr Glu Asn Ile Pro Cys 930 935
940Gly Thr Thr Gly Thr Thr Cys Ser Lys Ala Ile Lys Leu Phe Val
Glu945 950 955 960Ser Tyr Glu Leu Ile Leu Gln Glu Gly Thr Phe Lys
Ala Val Ala Arg 965 970 975Gly Pro Gly Gly Asp Pro Pro Tyr Lys Ile
Arg Tyr Met Gly Ile Phe 980 985 990Leu Val Ile Glu Thr His Gly Met
Ala Val Ser Trp Asp Arg Lys Thr 995 1000 1005Ser Val Phe Ile Arg
Leu His Gln Asp Tyr Lys Gly Arg Val Cys 1010 1015 1020Gly Leu Cys
Gly Asn Phe Asp Asp Asn Ala Ile Asn Asp Phe Ala 1025 1030 1035Thr
Arg Ser Arg Ser Val Val Gly Asp Ala Leu Glu Phe Gly Asn 1040 1045
1050Ser Trp Lys Leu Ser Pro Ser Cys Pro Asp Ala Leu Ala Pro Lys
1055 1060 1065Asp Pro Cys Thr Ala Asn Pro Phe Arg Lys Ser Trp Ala
Gln Lys 1070 1075 1080Gln Cys Ser Ile Leu His Gly Pro Thr Phe Ala
Ala Cys Arg Ser 1085 1090 1095Gln Val Asp Ser Thr Lys Tyr Tyr Glu
Ala Cys Val Asn Asp Ala 1100 1105 1110Cys Ala Cys Asp Ser Gly Gly
Asp Cys Glu Cys Phe Cys Thr Ala 1115 1120 1125Val Ala Ala Tyr Ala
Gln Ala Cys His Asp Ala Gly Leu Cys Val 1130 1135 1140Ser Trp Arg
Thr Pro Asp Thr Cys Pro Leu Phe Cys Asp Phe Tyr 1145 1150 1155Asn
Pro His Gly Gly Cys Glu Trp His Tyr Gln Pro Cys Gly Ala 1160 1165
1170Pro Cys Leu Lys Thr Cys Arg Asn Pro Ser Gly His Cys Leu Val
1175 1180 1185Asp Leu Pro Gly Leu Glu Gly Cys Tyr Pro Lys Cys Pro
Pro Ser 1190 1195 1200Gln Pro Phe Phe Asn Glu Asp Gln Met Lys Cys
Val Ala Gln Cys 1205 1210 1215Gly Cys Tyr Asp Lys Asp Gly Asn Tyr
Tyr Asp Val Gly Ala Arg 1220 1225 1230Val Pro Thr Ala Glu Asn Cys
Gln Ser Cys Asn Cys Thr Pro Ser 1235 1240 1245Gly Ile Gln Cys Ala
His Ser Leu Glu Ala Cys Thr Cys Thr Tyr 1250 1255 1260Glu Asp Arg
Thr Tyr Ser Tyr Gln Asp Val Ile Tyr Asn Thr Thr 1265 1270 1275Asp
Gly Leu Gly Ala Cys Leu Ile Ala Ile Cys Gly Ser Asn Gly 1280 1285
1290Thr Ile Ile Arg Lys Ala Val Ala Cys Pro Gly Thr Pro Ala Thr
1295 1300 1305Thr Pro Phe Thr Phe Thr Thr Ala Trp Val Pro His Ser
Thr Thr 1310 1315 1320Ser Pro Ala Leu Pro Val Ser Thr Val Cys Val
Arg Glu Val Cys 1325 1330 1335Arg Trp Ser Ser Trp Tyr Asn Gly His
Arg Pro Glu Pro Gly Leu 1340 1345 1350Gly Gly Gly Asp Phe Glu Thr
Phe Glu Asn Leu Arg Gln Arg Gly 1355 1360 1365Tyr Gln Val Cys Pro
Val Leu Ala Asp Ile Glu Cys Arg Ala Ala 1370 1375 1380Gln Leu Pro
Asp Met Pro Leu Glu Glu Leu Gly Gln Gln Val Asp 1385 1390 1395Cys
Asp Arg Met Arg Gly Leu Met Cys Ala Asn Ser Gln Gln Ser 1400 1405
1410Pro Pro Leu Cys His Asp Tyr Glu Leu Arg Val Leu Cys Cys Glu
1415 1420 1425Tyr Val Pro Cys Gly Pro Ser Pro Ala Pro Gly Thr Ser
Pro Gln 1430 1435 1440Pro Ser Leu Ser Ala Ser Thr Glu Pro Ala Val
Pro Thr Pro Thr 1445 1450 1455Gln Thr Thr Ala Thr Glu Lys Thr Thr
Leu Trp Val Thr Pro Ser 1460 1465 1470Ile Arg Ser Thr Ala Ala Leu
Thr Ser Gln Thr Gly Ser Ser Ser 1475 1480 1485Gly Pro Val Thr Val
Thr Pro Ser Ala Pro Gly Thr Thr Thr Cys 1490 1495 1500Gln Pro Arg
Cys Gln Trp Thr Glu Trp Phe Asp Glu Asp Tyr Pro 1505 1510 1515Lys
Ser Glu Gln Leu Gly Gly Asp Val Glu Ser Tyr Asp Lys Ile 1520 1525
1530Arg Ala Ala Gly Gly His Leu Cys Gln Gln Pro Lys Asp Ile Glu
1535 1540 1545Cys Gln Ala Glu Ser Phe Pro Asn Trp Thr Leu Ala Gln
Val Gly 1550 1555 1560Gln Lys Val His Cys Asp Val His Phe Gly Leu
Val Cys Arg Asn 1565 1570 1575Trp Glu Gln Glu Gly Val Phe Lys Met
Cys Tyr Asn Tyr Arg Ile 1580 1585 1590Arg Val Leu Cys Cys Ser Asp
Asp His Cys Arg Gly Arg Ala Thr 1595 1600 1605Thr Pro Pro Pro Thr
Thr Glu Leu Glu Thr Ala Thr Thr Thr Thr 1610 1615 1620Thr Gln Ala
Leu Phe Ser Thr Pro Gln Pro Thr Ser Ser Pro Gly 1625 1630 1635Leu
Thr Arg Ala Pro Pro Ala Ser Thr Thr Ala Val Pro Thr Leu 1640 1645
1650Ser Glu Gly Leu Thr Ser Pro Arg Tyr Thr Ser Thr Leu Gly Thr
1655 1660 1665Ala Thr Thr Gly Gly Pro Thr Thr Pro Ala Gly Ser Thr
Glu Pro 1670 1675 1680Thr Val Pro Gly Val Ala Thr Ser Thr Leu Pro
Thr Arg Ser Ala 1685 1690 1695Leu Pro Gly Thr Thr Gly Ser Leu Gly
Thr Trp Arg Pro Ser Gln 1700 1705 1710Pro Pro Thr Leu Ala Pro Thr
Thr Met Ala Thr Ser Arg Ala Arg 1715 1720 1725Pro Thr Gly Thr Ala
Ser Thr Ala Ser Lys Glu Pro Leu Thr Thr 1730 1735 1740Ser Leu Ala
Pro Thr Leu Thr Ser Glu Leu Ser Thr Ser Gln Ala 1745 1750 1755Glu
Thr Ser Thr Pro Arg Thr Glu Thr Thr Met Ser Pro Leu Thr 1760 1765
1770Asn Thr Thr Thr Ser Gln Gly Thr Thr Arg Cys Gln Pro Lys Cys
1775 1780 1785Glu Trp Thr Glu Trp Phe Asp Val Asp Phe Pro Thr Ser
Gly Val 1790 1795 1800Ala Gly Gly Asp Met Glu Thr Phe Glu Asn Ile
Arg Ala Ala Gly 1805 1810 1815Gly Lys Met Cys Trp Ala Pro Lys Ser
Ile Glu Cys Arg Ala Glu 1820 1825 1830Asn Tyr Pro Glu Val Ser Ile
Asp Gln Val Gly Gln Val Leu Thr 1835 1840 1845Cys Ser Leu Glu Thr
Gly Leu Thr Cys Lys Asn Glu Asp Gln Thr 1850 1855 1860Gly Arg Phe
Asn Met Cys Phe Asn Tyr Asn Val Arg Val Leu Cys 1865 1870 1875Cys
Asp Asp Tyr Ser His Cys Pro Ser Thr Pro Ala Thr Ser Ser 1880 1885
1890Thr Ala Thr Pro Ser Ser Thr Pro Gly Thr Thr Trp Ile Leu Thr
1895 1900 1905Lys Pro Thr Thr Thr Ala Thr Thr Thr Ala Ser Thr Gly
Ser Thr 1910 1915 1920Ala Thr Pro Thr Ser Thr Leu Arg Thr Ala Pro
Pro Pro Lys Val 1925 1930 1935Leu Thr Thr Thr Ala Thr Thr Pro Thr
Val Thr Ser Ser Lys Ala 1940 1945 1950Thr Pro Ser Ser Ser Pro Gly
Thr Ala Thr Ala Leu Pro Ala Leu 1955 1960 1965Arg Ser Thr Ala Thr
Thr Pro Thr Ala Thr Ser Val Thr Pro Ile 1970 1975 1980Pro Ser Ser
Ser Leu Gly Thr Thr Trp Thr Arg Leu Ser Gln Thr 1985 1990 1995Thr
Thr Pro Thr Ala Thr Met Ser Thr Ala Thr Pro Ser Ser Thr 2000 2005
2010Pro Glu Thr Ala His Thr Ser Thr Val Leu Thr Ala Thr Ala Thr
2015 2020 2025Thr Thr Gly Ala Thr Gly Ser Val Ala Thr Pro Ser Ser
Thr Pro 2030 2035 2040Gly Thr Ala His Thr Thr Lys Val Pro Thr Thr
Thr Thr Thr Gly 2045 2050 2055Phe Thr Ala Thr Pro Ser Ser Ser Pro
Gly Thr Ala Leu Thr Pro 2060 2065 2070Pro Val Trp Ile Ser Thr Thr
Thr Thr Pro Thr Thr Arg Gly Ser 2075 2080 2085Thr Val Thr Pro Ser
Ser Ile Pro Gly Thr Thr His Thr Ala Thr 2090 2095 2100Val Leu Thr
Thr Thr Thr Thr Thr Val Ala Thr Gly Ser Met Ala 2105 2110 2115Thr
Pro Ser Ser Ser Thr Gln Thr Ser Gly Thr Pro Pro Ser Leu 2120 2125
2130Thr Thr Thr Ala Thr Thr Ile Thr Ala Thr Gly Ser Thr Thr Asn
2135 2140 2145Pro Ser Ser Thr Pro Gly Thr Thr Pro Ile Pro Pro Val
Leu Thr 2150 2155 2160Thr Thr Ala Thr Thr Pro Ala Ala Thr Ser Asn
Thr Val Thr Pro 2165 2170 2175Ser Ser Ala Leu Gly Thr Thr His Thr
Pro Pro Val Pro Asn Thr 2180 2185 2190Met Ala Thr Thr His Gly Arg
Ser Leu Pro Pro Ser Ser Pro His 2195 2200 2205Thr Val Arg Thr Ala
Trp Thr Ser Ala Thr Ser Gly Ile Leu Gly 2210 2215 2220Thr Thr His
Ile Thr Glu Pro Ser Thr Val Thr Ser His Thr Leu 2225 2230 2235Ala
Ala Thr Thr Gly Thr Thr Gln His Ser Thr Pro Ala Leu Ser 2240 2245
2250Ser Pro His Pro Ser Ser Arg Thr Thr Glu Ser Pro Pro Ser Pro
2255 2260 2265Gly Thr Thr Thr Pro Gly His Thr Thr Ala Thr Ser Arg
Thr Thr 2270 2275 2280Ala Thr Ala Thr Pro Ser Lys Thr Arg Thr Ser
Thr Leu Leu Pro 2285 2290 2295Ser Ser Pro Thr Ser Ala Pro Ile Thr
Thr Val Val Thr Met Gly 2300 2305 2310Cys Glu Pro Gln Cys Ala Trp
Ser Glu Trp Leu Asp Tyr Ser Tyr 2315 2320 2325Pro Met Pro Gly Pro
Ser Gly Gly Asp Phe Asp Thr Tyr Ser Asn 2330 2335 2340Ile Arg Ala
Ala Gly Gly Ala Val Cys Glu Gln Pro Leu Gly Leu 2345 2350 2355Glu
Cys Arg Ala Gln Ala Gln Pro Gly Val Pro Leu Arg Glu Leu 2360 2365
2370Gly Gln Val Val Glu Cys Ser Leu Asp Phe Gly Leu Val Cys Arg
2375 2380 2385Asn Arg Glu Gln Val Gly Lys Phe Lys Met Cys Phe Asn
Tyr Glu 2390 2395 2400Ile Arg Val Phe Cys Cys Asn Tyr Gly His Cys
Pro Ser Thr Pro 2405 2410 2415Ala Thr Ser Ser Thr Ala Met Pro Ser
Ser Thr Pro Gly Thr Thr 2420 2425 2430Trp Ile Leu Thr Glu Leu Thr
Thr Thr Ala Thr Thr Thr Glu Ser 2435 2440 2445Thr Gly Ser Thr Ala
Thr Pro Ser Ser Thr Pro Gly Thr Thr Trp 2450 2455 2460Ile Leu Thr
Glu Pro Ser Thr Thr Ala Thr Val Thr Val Pro Thr 2465 2470 2475Gly
Ser Thr Ala Thr Ala Ser Ser Thr Gln Ala Thr Ala Gly Thr 2480 2485
2490Pro His Val Ser Thr Thr Ala Thr Thr Pro Thr Val Thr Ser Ser
2495 2500 2505Lys Ala Thr Pro Phe Ser Ser Pro Gly Thr Ala Thr Ala
Leu Pro 2510 2515 2520Ala Leu Arg Ser Thr Ala Thr Thr Pro Thr Ala
Thr Ser Phe Thr 2525 2530 2535Ala Ile Pro Ser Ser Ser Leu Gly Thr
Thr Trp Thr Arg Leu Ser 2540 2545 2550Gln Thr Thr Thr Pro Thr Ala
Thr Met Ser Thr Ala Thr Pro Ser 2555 2560 2565Ser Thr Pro Glu Thr
Val His Thr Ser Thr Val Leu Thr Thr Thr 2570 2575 2580Ala Thr Thr
Thr Gly Ala Thr Gly Ser Val Ala Thr Pro Ser Ser 2585 2590 2595Thr
Pro Gly Thr Ala His Thr Thr Lys Val Leu Thr Thr Thr Thr 2600 2605
2610Thr Gly Phe Thr Ala Thr Pro Ser Ser Ser Pro Gly Thr Ala Arg
2615 2620 2625Thr Leu Pro Val Trp Ile Ser Thr Thr Thr Thr Pro Thr
Thr Arg 2630 2635 2640Gly Ser Thr Val Thr Pro Ser Ser Ile Pro Gly
Thr Thr His Thr 2645 2650 2655Pro Thr Val Leu Thr Thr Thr Thr Thr
Thr Val Ala Thr Gly Ser 2660 2665 2670Met Ala Thr Pro Ser Ser Ser
Thr Gln Thr Ser Gly Thr Pro Pro 2675 2680 2685Ser Leu Thr Thr Thr
Ala Thr Thr Ile Thr Ala Thr Gly Ser Thr 2690 2695 2700Thr Asn Pro
Ser Ser Thr Pro Gly Thr Thr Pro Ile Pro Pro Val 2705 2710 2715Leu
Thr Thr Thr Ala Thr Thr Pro Ala Ala Thr Ser Ser Thr Val 2720 2725
2730Thr Pro Ser Ser Ala Leu Gly Thr Thr His Thr Pro Pro Val Pro
2735 2740 2745Asn Thr Thr Ala Thr Thr His Gly Arg Ser Leu Ser Pro
Ser Ser 2750 2755 2760Pro His Thr Val Arg Thr Ala Trp Thr Ser Ala
Thr Ser Gly Thr 2765 2770 2775Leu Gly Thr Thr His Ile Thr Glu Pro
Ser Thr Gly Thr Ser His 2780 2785
2790Thr Pro Ala Ala Thr Thr Gly Thr Thr Gln His Ser Thr Pro Ala
2795 2800 2805Leu Ser Ser Pro His Pro Ser Ser Arg Thr Thr Glu Ser
Pro Pro 2810 2815 2820Ser Pro Gly Thr Thr Thr Pro Gly His Thr Arg
Ala Thr Ser Arg 2825 2830 2835Thr Thr Ala Thr Ala Thr Pro Ser Lys
Thr Arg Thr Ser Thr Leu 2840 2845 2850Leu Pro Ser Ser Pro Thr Ser
Ala Pro Ile Thr Thr Val Val Thr 2855 2860 2865Met Gly Cys Glu Pro
Gln Cys Ala Trp Ser Glu Trp Leu Asp Tyr 2870 2875 2880Ser Tyr Pro
Met Pro Gly Pro Ser Gly Gly Asp Phe Asp Thr Tyr 2885 2890 2895Ser
Asn Ile Arg Ala Ala Gly Gly Ala Val Cys Glu Gln Pro Leu 2900 2905
2910Gly Leu Glu Cys Arg Ala Gln Ala Gln Pro Gly Val Pro Leu Arg
2915 2920 2925Glu Leu Gly Gln Val Val Glu Cys Ser Leu Asp Phe Gly
Leu Val 2930 2935 2940Cys Arg Asn Arg Glu Gln Val Gly Lys Phe Lys
Met Cys Phe Asn 2945 2950 2955Tyr Glu Ile Arg Val Phe Cys Cys Asn
Tyr Gly His Cys Pro Ser 2960 2965 2970Thr Pro Ala Thr Ser Ser Thr
Ala Thr Pro Ser Ser Thr Pro Gly 2975 2980 2985Thr Thr Trp Ile Leu
Thr Glu Gln Thr Thr Ala Ala Thr Thr Thr 2990 2995 3000Ala Thr Thr
Gly Ser Thr Ala Ile Pro Ser Ser Thr Pro Gly Thr 3005 3010 3015Ala
Pro Pro Pro Lys Val Leu Thr Ser Thr Ala Thr Thr Pro Thr 3020 3025
3030Ala Thr Ser Ser Lys Ala Thr Ser Ser Ser Ser Pro Arg Thr Ala
3035 3040 3045Thr Thr Leu Pro Val Leu Thr Ser Thr Ala Thr Lys Ser
Thr Ala 3050 3055 3060Thr Ser Phe Thr Pro Ile Pro Ser Phe Thr Leu
Gly Thr Thr Gly 3065 3070 3075Thr Leu Pro Glu Gln Thr Thr Thr Pro
Met Ala Thr Met Ser Thr 3080 3085 3090Ile His Pro Ser Ser Thr Pro
Glu Thr Thr His Thr Ser Thr Val 3095 3100 3105Leu Thr Thr Lys Ala
Thr Thr Thr Arg Ala Thr Ser Ser Met Ser 3110 3115 3120Thr Pro Ser
Ser Thr Pro Gly Thr Thr Trp Ile Leu Thr Glu Leu 3125 3130 3135Thr
Thr Ala Ala Thr Thr Thr Ala Ala Thr Gly Pro Thr Ala Thr 3140 3145
3150Pro Ser Ser Thr Pro Gly Thr Thr Trp Ile Leu Thr Glu Pro Ser
3155 3160 3165Thr Thr Ala Thr Val Thr Val Pro Thr Gly Ser Thr Ala
Thr Ala 3170 3175 3180Ser Ser Thr Arg Ala Thr Ala Gly Thr Leu Lys
Val Leu Thr Ser 3185 3190 3195Thr Ala Thr Thr Pro Thr Val Ile Ser
Ser Arg Ala Thr Pro Ser 3200 3205 3210Ser Ser Pro Gly Thr Ala Thr
Ala Leu Pro Ala Leu Arg Ser Thr 3215 3220 3225Ala Thr Thr Pro Thr
Ala Thr Ser Val Thr Ala Ile Pro Ser Ser 3230 3235 3240Ser Leu Gly
Thr Ala Trp Thr Arg Leu Ser Gln Thr Thr Thr Pro 3245 3250 3255Thr
Ala Thr Met Ser Thr Ala Thr Pro Ser Ser Thr Pro Glu Thr 3260 3265
3270Val His Thr Ser Thr Val Leu Thr Thr Thr Thr Thr Thr Thr Arg
3275 3280 3285Ala Thr Gly Ser Val Ala Thr Pro Ser Ser Thr Pro Gly
Thr Ala 3290 3295 3300His Thr Thr Lys Val Pro Thr Thr Thr Thr Thr
Gly Phe Thr Ala 3305 3310 3315Thr Pro Ser Ser Ser Pro Gly Thr Ala
Leu Thr Pro Pro Val Trp 3320 3325 3330Ile Ser Thr Thr Thr Thr Pro
Thr Thr Arg Gly Ser Thr Val Thr 3335 3340 3345Pro Ser Ser Ile Pro
Gly Thr Thr His Thr Ala Thr Val Leu Thr 3350 3355 3360Thr Thr Thr
Thr Thr Val Ala Thr Gly Ser Met Ala Thr Pro Ser 3365 3370 3375Ser
Ser Thr Gln Thr Ser Gly Thr Pro Pro Ser Leu Thr Thr Thr 3380 3385
3390Ala Thr Thr Ile Thr Ala Thr Gly Ser Thr Thr Asn Pro Ser Ser
3395 3400 3405Thr Pro Gly Thr Thr Pro Ile Pro Pro Val Leu Thr Thr
Thr Ala 3410 3415 3420Thr Thr Pro Ala Ala Thr Ser Ser Thr Val Thr
Pro Ser Ser Ala 3425 3430 3435Leu Gly Thr Thr His Thr Pro Pro Val
Pro Asn Thr Thr Ala Thr 3440 3445 3450Thr His Gly Arg Ser Leu Pro
Pro Ser Ser Pro His Thr Val Arg 3455 3460 3465Thr Ala Trp Thr Ser
Ala Thr Ser Gly Ile Leu Gly Thr Thr His 3470 3475 3480Ile Thr Glu
Pro Ser Thr Val Thr Ser His Thr Pro Ala Ala Thr 3485 3490 3495Thr
Ser Thr Thr Gln His Ser Thr Pro Ala Leu Ser Ser Pro His 3500 3505
3510Pro Ser Ser Arg Thr Thr Glu Ser Pro Pro Ser Pro Gly Thr Thr
3515 3520 3525Thr Pro Gly His Thr Arg Gly Thr Ser Arg Thr Thr Ala
Thr Ala 3530 3535 3540Thr Pro Ser Lys Thr Arg Thr Ser Thr Leu Leu
Pro Ser Ser Pro 3545 3550 3555Thr Ser Ala Pro Ile Thr Thr Val Val
Thr Thr Gly Cys Glu Pro 3560 3565 3570Gln Cys Ala Trp Ser Glu Trp
Leu Asp Tyr Ser Tyr Pro Met Pro 3575 3580 3585Gly Pro Ser Gly Gly
Asp Phe Asp Thr Tyr Ser Asn Ile Arg Ala 3590 3595 3600Ala Gly Gly
Ala Val Cys Glu Gln Pro Leu Gly Leu Glu Cys Arg 3605 3610 3615Ala
Gln Ala Gln Pro Gly Val Pro Leu Arg Glu Leu Gly Gln Val 3620 3625
3630Val Glu Cys Ser Leu Asp Phe Gly Leu Val Cys Arg Asn Arg Glu
3635 3640 3645Gln Val Gly Lys Phe Lys Met Cys Phe Asn Tyr Glu Ile
Arg Val 3650 3655 3660Phe Cys Cys Asn Tyr Gly His Cys Pro Ser Thr
Pro Ala Thr Ser 3665 3670 3675Ser Thr Ala Thr Pro Ser Ser Thr Pro
Gly Thr Thr Trp Ile Leu 3680 3685 3690Thr Lys Leu Thr Thr Thr Ala
Thr Thr Thr Glu Ser Thr Gly Ser 3695 3700 3705Thr Ala Thr Pro Ser
Ser Thr Pro Gly Thr Thr Trp Ile Leu Thr 3710 3715 3720Glu Pro Ser
Thr Thr Ala Thr Val Thr Val Pro Thr Gly Ser Thr 3725 3730 3735Ala
Thr Ala Ser Ser Thr Gln Ala Thr Ala Gly Thr Pro His Val 3740 3745
3750Ser Thr Thr Ala Thr Thr Pro Thr Val Thr Ser Ser Lys Ala Thr
3755 3760 3765Pro Phe Ser Ser Pro Gly Thr Ala Thr Ala Leu Pro Ala
Leu Arg 3770 3775 3780Ser Thr Ala Thr Thr Pro Thr Ala Thr Ser Phe
Thr Ala Ile Pro 3785 3790 3795Ser Ser Ser Leu Gly Thr Thr Trp Thr
Arg Leu Ser Gln Thr Thr 3800 3805 3810Thr Pro Thr Ala Thr Met Ser
Thr Ala Thr Pro Ser Ser Thr Pro 3815 3820 3825Glu Thr Ala His Thr
Ser Thr Val Leu Thr Thr Thr Ala Thr Thr 3830 3835 3840Thr Arg Ala
Thr Gly Ser Val Ala Thr Pro Ser Ser Thr Pro Gly 3845 3850 3855Thr
Ala His Thr Thr Lys Val Pro Thr Thr Thr Thr Thr Gly Phe 3860 3865
3870Thr Val Thr Pro Ser Ser Ser Pro Gly Thr Ala Arg Thr Pro Pro
3875 3880 3885Val Trp Ile Ser Thr Thr Thr Thr Pro Thr Thr Ser Gly
Ser Thr 3890 3895 3900Val Thr Pro Ser Ser Val Pro Gly Thr Thr His
Thr Pro Thr Val 3905 3910 3915Leu Thr Thr Thr Thr Thr Thr Val Ala
Thr Gly Ser Met Ala Thr 3920 3925 3930Pro Ser Ser Ser Thr Gln Thr
Ser Gly Thr Pro Pro Ser Leu Ile 3935 3940 3945Thr Thr Ala Thr Thr
Ile Thr Ala Thr Gly Ser Thr Thr Asn Pro 3950 3955 3960Ser Ser Thr
Pro Gly Thr Thr Pro Ile Pro Pro Val Leu Thr Thr 3965 3970 3975Thr
Ala Thr Thr Pro Ala Ala Thr Ser Ser Thr Val Thr Pro Ser 3980 3985
3990Ser Ala Leu Gly Thr Thr His Thr Pro Pro Val Pro Asn Thr Thr
3995 4000 4005Ala Thr Thr His Gly Arg Ser Leu Ser Pro Ser Ser Pro
His Thr 4010 4015 4020Val Arg Thr Ala Trp Thr Ser Ala Thr Ser Gly
Thr Leu Gly Thr 4025 4030 4035Thr His Ile Thr Glu Pro Ser Thr Gly
Thr Ser His Thr Pro Ala 4040 4045 4050Ala Thr Thr Gly Thr Thr Gln
His Ser Thr Pro Ala Leu Ser Ser 4055 4060 4065Pro His Pro Ser Ser
Arg Thr Thr Glu Ser Pro Pro Ser Pro Gly 4070 4075 4080Thr Thr Thr
Pro Gly His Thr Thr Ala Thr Ser Arg Thr Thr Ala 4085 4090 4095Thr
Ala Thr Pro Ser Lys Thr Arg Thr Ser Thr Leu Leu Pro Ser 4100 4105
4110Ser Pro Thr Ser Ala Pro Ile Thr Thr Val Val Thr Thr Gly Cys
4115 4120 4125Glu Pro Gln Cys Ala Trp Ser Glu Trp Leu Asp Tyr Ser
Tyr Pro 4130 4135 4140Met Pro Gly Pro Ser Gly Gly Asp Phe Asp Thr
Tyr Ser Asn Ile 4145 4150 4155Arg Ala Ala Gly Gly Ala Val Cys Glu
Gln Pro Leu Gly Leu Glu 4160 4165 4170Cys Arg Ala Gln Ala Gln Pro
Gly Val Pro Leu Gly Glu Leu Gly 4175 4180 4185Gln Val Val Glu Cys
Ser Leu Asp Phe Gly Leu Val Cys Arg Asn 4190 4195 4200Arg Glu Gln
Val Gly Lys Phe Lys Met Cys Phe Asn Tyr Glu Ile 4205 4210 4215Arg
Val Phe Cys Cys Asn Tyr Gly His Cys Pro Ser Thr Pro Ala 4220 4225
4230Thr Ser Ser Thr Ala Met Pro Ser Ser Thr Pro Gly Thr Thr Trp
4235 4240 4245Ile Leu Thr Glu Leu Thr Thr Thr Ala Thr Thr Thr Ala
Ser Thr 4250 4255 4260Gly Ser Thr Ala Thr Pro Ser Ser Thr Pro Gly
Thr Ala Pro Pro 4265 4270 4275Pro Lys Val Leu Thr Ser Pro Ala Thr
Thr Pro Thr Ala Thr Ser 4280 4285 4290Ser Lys Ala Thr Ser Ser Ser
Ser Pro Arg Thr Ala Thr Thr Leu 4295 4300 4305Pro Val Leu Thr Ser
Thr Ala Thr Lys Ser Thr Ala Thr Ser Val 4310 4315 4320Thr Pro Ile
Pro Ser Ser Thr Leu Gly Thr Thr Gly Thr Leu Pro 4325 4330 4335Glu
Gln Thr Thr Thr Pro Val Ala Thr Met Ser Thr Ile His Pro 4340 4345
4350Ser Ser Thr Pro Glu Thr Thr His Thr Ser Thr Val Leu Thr Thr
4355 4360 4365Lys Ala Thr Thr Thr Arg Ala Thr Ser Ser Thr Ser Thr
Pro Ser 4370 4375 4380Ser Thr Pro Gly Thr Thr Trp Ile Leu Thr Glu
Leu Thr Thr Ala 4385 4390 4395Ala Thr Thr Thr Ala Ala Thr Gly Pro
Thr Ala Thr Pro Ser Ser 4400 4405 4410Thr Pro Gly Thr Thr Trp Ile
Leu Thr Glu Leu Thr Thr Thr Ala 4415 4420 4425Thr Thr Thr Ala Ser
Thr Gly Ser Thr Ala Thr Pro Ser Ser Thr 4430 4435 4440Pro Gly Thr
Thr Trp Ile Leu Thr Glu Pro Ser Thr Thr Ala Thr 4445 4450 4455Val
Thr Val Pro Thr Gly Ser Thr Ala Thr Ala Ser Ser Thr Gln 4460 4465
4470Ala Thr Ala Gly Thr Pro His Val Ser Thr Thr Ala Thr Thr Pro
4475 4480 4485Thr Val Thr Ser Ser Lys Ala Thr Pro Ser Ser Ser Pro
Gly Thr 4490 4495 4500Ala Thr Ala Leu Pro Ala Leu Arg Ser Thr Ala
Thr Thr Pro Thr 4505 4510 4515Ala Thr Ser Phe Thr Ala Ile Pro Ser
Ser Ser Leu Gly Thr Thr 4520 4525 4530Trp Thr Arg Leu Ser Gln Thr
Thr Thr Pro Thr Ala Thr Met Ser 4535 4540 4545Thr Ala Thr Pro Ser
Ser Thr Pro Glu Thr Val His Thr Ser Thr 4550 4555 4560Val Leu Thr
Ala Thr Ala Thr Thr Thr Gly Ala Thr Gly Ser Val 4565 4570 4575Ala
Thr Pro Ser Ser Thr Pro Gly Thr Ala His Thr Thr Lys Val 4580 4585
4590Pro Thr Thr Thr Thr Thr Gly Phe Thr Ala Thr Pro Ser Ser Ser
4595 4600 4605Pro Gly Thr Ala Leu Thr Pro Pro Val Trp Ile Ser Thr
Thr Thr 4610 4615 4620Thr Pro Thr Thr Thr Thr Pro Thr Thr Ser Gly
Ser Thr Val Thr 4625 4630 4635Pro Ser Ser Ile Pro Gly Thr Thr His
Thr Ala Arg Val Leu Thr 4640 4645 4650Thr Thr Thr Thr Thr Val Ala
Thr Gly Ser Met Ala Thr Pro Ser 4655 4660 4665Ser Ser Thr Gln Thr
Ser Gly Thr Pro Pro Ser Leu Thr Thr Thr 4670 4675 4680Ala Thr Thr
Ile Thr Ala Thr Gly Ser Thr Thr Asn Pro Ser Ser 4685 4690 4695Thr
Pro Gly Thr Thr Pro Ile Thr Pro Val Leu Thr Ser Thr Ala 4700 4705
4710Thr Thr Pro Ala Ala Thr Ser Ser Lys Ala Thr Ser Ser Ser Ser
4715 4720 4725Pro Arg Thr Ala Thr Thr Leu Pro Val Leu Thr Ser Thr
Ala Thr 4730 4735 4740Lys Ser Thr Ala Thr Ser Phe Thr Pro Ile Pro
Ser Ser Thr Leu 4745 4750 4755Trp Thr Thr Trp Thr Val Pro Ala Gln
Thr Thr Thr Pro Met Ser 4760 4765 4770Thr Met Ser Thr Ile His Thr
Ser Ser Thr Pro Glu Thr Thr His 4775 4780 4785Thr Ser Thr Val Leu
Thr Thr Thr Ala Thr Met Thr Arg Ala Thr 4790 4795 4800Asn Ser Thr
Ala Thr Pro Ser Ser Thr Leu Gly Thr Thr Arg Ile 4805 4810 4815Leu
Thr Glu Leu Thr Thr Thr Ala Thr Thr Thr Ala Ala Thr Gly 4820 4825
4830Ser Thr Ala Thr Leu Ser Ser Thr Pro Gly Thr Thr Trp Ile Leu
4835 4840 4845Thr Glu Pro Ser Thr Ile Ala Thr Val Met Val Pro Thr
Gly Ser 4850 4855 4860Thr Ala Thr Ala Ser Ser Thr Leu Gly Thr Ala
His Thr Pro Lys 4865 4870 4875Val Val Thr Thr Met Ala Thr Met Pro
Thr Ala Thr Ala Ser Thr 4880 4885 4890Val Pro Ser Ser Ser Thr Val
Gly Thr Thr Arg Thr Pro Ala Val 4895 4900 4905Leu Pro Ser Ser Leu
Pro Thr Phe Ser Val Ser Thr Val Ser Ser 4910 4915 4920Ser Val Leu
Thr Thr Leu Arg Pro Thr Gly Phe Pro Ser Ser His 4925 4930 4935Phe
Ser Thr Pro Cys Phe Cys Arg Ala Phe Gly Gln Phe Phe Ser 4940 4945
4950Pro Gly Glu Val Ile Tyr Asn Lys Thr Asp Arg Ala Gly Cys His
4955 4960 4965Phe Tyr Ala Val Cys Asn Gln His Cys Asp Ile Asp Arg
Phe Gln 4970 4975 4980Gly Ala Cys Pro Thr Ser Pro Pro Pro Val Ser
Ser Ala Pro Leu 4985 4990 4995Ser Ser Pro Ser Pro Ala Pro Gly Cys
Asp Asn Ala Ile Pro Leu 5000 5005 5010Arg Gln Val Asn Glu Thr Trp
Thr Leu Glu Asn Cys Thr Val Ala 5015 5020 5025Arg Cys Val Gly Asp
Asn Arg Val Val Leu Leu Asp Pro Lys Pro 5030 5035 5040Val Ala Asn
Val Thr Cys Val Asn Lys His Leu Pro Ile Lys Val 5045 5050 5055Ser
Asp Pro Ser Gln Pro Cys Asp Phe His Tyr Glu Cys Glu Cys 5060 5065
5070Ile Cys Ser Met Trp Gly Gly Ser His Tyr Ser Thr Phe Asp Gly
5075 5080 5085Thr Ser Tyr Thr Phe Arg Gly Asn Cys Thr Tyr Val Leu
Met Arg 5090 5095 5100Glu Ile His Ala Arg Phe Gly Asn Leu Ser Leu
Tyr Leu Asp Asn 5105 5110 5115His Tyr Cys Thr Ala Ser Ala Thr Ala
Ala Ala Ala Arg Cys Pro 5120 5125 5130Arg Ala Leu Ser Ile His Tyr
Lys Ser Met Asp Ile Val Leu Thr 5135 5140 5145Val Thr Met Val His
Gly Lys Glu Glu Gly Leu Ile Leu Phe Asp 5150 5155 5160Gln Ile Pro
Val Ser Ser Gly Phe Ser Lys Asn Gly Val Leu Val 5165 5170 5175Ser
Val Leu Gly Thr Thr Thr Met Arg Val Asp Ile Pro Ala Leu 5180 5185
5190Gly Val Ser Val Thr Phe Asn Gly Gln Val Phe Gln Ala Arg Leu
5195 5200 5205Pro Tyr Ser Leu Phe His Asn Asn Thr Glu Gly Gln Cys
Gly Thr 5210 5215 5220Cys Thr Asn Asn Gln Arg Asp Asp Cys
Leu Gln Arg Asp Gly Thr 5225 5230 5235Thr Ala Ala Ser Cys Lys Asp
Met Ala Lys Thr Trp Leu Val Pro 5240 5245 5250Asp Ser Arg Lys Asp
Gly Cys Trp Ala Pro Thr Gly Thr Pro Pro 5255 5260 5265Thr Ala Ser
Pro Ala Ala Pro Val Ser Ser Thr Pro Thr Pro Thr 5270 5275 5280Pro
Cys Pro Pro Gln Pro Leu Cys Asp Leu Met Leu Ser Gln Val 5285 5290
5295Phe Ala Glu Cys His Asn Leu Val Pro Pro Gly Pro Phe Phe Asn
5300 5305 5310Ala Cys Ile Ser Asp His Cys Arg Gly Arg Leu Glu Val
Pro Cys 5315 5320 5325Gln Ser Leu Glu Ala Tyr Ala Glu Leu Cys Arg
Ala Arg Gly Val 5330 5335 5340Cys Ser Asp Trp Arg Gly Ala Thr Gly
Gly Leu Cys Asp Leu Thr 5345 5350 5355Cys Pro Pro Thr Lys Val Tyr
Lys Pro Cys Gly Pro Ile Gln Pro 5360 5365 5370Ala Thr Cys Asn Ser
Arg Asn Gln Ser Pro Gln Leu Glu Gly Met 5375 5380 5385Ala Glu Gly
Cys Phe Cys Pro Glu Asp Gln Ile Leu Phe Asn Ala 5390 5395 5400His
Met Gly Ile Cys Val Gln Ala Cys Pro Cys Val Gly Pro Asp 5405 5410
5415Gly Phe Pro Lys Phe Pro Gly Glu Arg Trp Val Ser Asn Cys Gln
5420 5425 5430Ser Cys Val Cys Asp Glu Gly Ser Val Ser Val Gln Cys
Lys Pro 5435 5440 5445Leu Pro Cys Asp Ala Gln Gly Gln Pro Pro Pro
Cys Asn Arg Pro 5450 5455 5460Gly Phe Val Thr Val Thr Arg Pro Arg
Ala Glu Asn Pro Cys Cys 5465 5470 5475Pro Glu Thr Val Cys Val Cys
Asn Thr Thr Thr Cys Pro Gln Ser 5480 5485 5490Leu Pro Val Cys Pro
Pro Gly Gln Glu Ser Ile Cys Thr Gln Glu 5495 5500 5505Glu Gly Asp
Cys Cys Pro Thr Phe Arg Cys Arg Pro Gln Leu Cys 5510 5515 5520Ser
Tyr Asn Gly Thr Phe Tyr Gly Val Gly Ala Thr Phe Pro Gly 5525 5530
5535Ala Leu Pro Cys His Met Cys Thr Cys Leu Ser Gly Asp Thr Gln
5540 5545 5550Asp Pro Thr Val Gln Cys Gln Glu Asp Ala Cys Asn Asn
Thr Thr 5555 5560 5565Cys Pro Gln Gly Phe Glu Tyr Lys Arg Val Ala
Gly Gln Cys Cys 5570 5575 5580Gly Glu Cys Val Gln Thr Ala Cys Leu
Thr Pro Asp Gly Gln Pro 5585 5590 5595Val Gln Leu Asn Glu Thr Trp
Val Asn Ser His Val Asp Asn Cys 5600 5605 5610Thr Val Tyr Leu Cys
Glu Ala Glu Gly Gly Val His Leu Leu Thr 5615 5620 5625Pro Gln Pro
Ala Ser Cys Pro Asp Val Ser Ser Cys Arg Gly Ser 5630 5635 5640Leu
Arg Lys Thr Gly Cys Cys Tyr Ser Cys Glu Glu Asp Ser Cys 5645 5650
5655Gln Val Arg Ile Asn Thr Thr Ile Leu Trp His Gln Gly Cys Glu
5660 5665 5670Thr Glu Val Asn Ile Thr Phe Cys Glu Gly Ser Cys Pro
Gly Ala 5675 5680 5685Ser Lys Tyr Ser Ala Glu Ala Gln Ala Met Gln
His Gln Cys Thr 5690 5695 5700Cys Cys Gln Glu Arg Arg Val His Glu
Glu Thr Val Pro Leu His 5705 5710 5715Cys Pro Asn Gly Ser Ala Ile
Leu His Thr Tyr Thr His Val Asp 5720 5725 5730Glu Cys Gly Cys Thr
Pro Phe Cys Val Pro Ala Pro Met Ala Pro 5735 5740 5745Pro His Thr
Arg Gly Phe Pro Ala Gln Glu Ala Thr Ala Val 5750 5755
5760
13 4018 DNA Homo sapiens
13 caggcagcgc tgcgtcctgc tgcgcacgtg
ggaagccctg gccccggcca cccccgcgat 60gccgcgcgct ccccgctgcc gagccgtgcg
ctccctgctg cgcagccact accgcgaggt 120gctgccgctg gccacgttcg
tgcggcgcct ggggccccag ggctggcggc tggtgcagcg 180cggggacccg
gcggctttcc gcgcgctggt ggcccagtgc ctggtgtgcg tgccctggga
240cgcacggccg ccccccgccg ccccctcctt ccgccaggtg tcctgcctga
aggagctggt 300ggcccgagtg ctgcagaggc tgtgcgagcg cggcgcgaag
aacgtgctgg ccttcggctt 360cgcgctgctg gacggggccc gcgggggccc
ccccgaggcc ttcaccacca gcgtgcgcag 420ctacctgccc aacacggtga
ccgacgcact gcgggggagc ggggcgtggg ggctgctgct 480gcgccgcgtg
ggcgacgacg tgctggttca cctgctggca cgctgcgcgc tctttgtgct
540ggtggctccc agctgcgcct accaggtgtg cgggccgccg ctgtaccagc
tcggcgctgc 600cactcaggcc cggcccccgc cacacgctag tggaccccga
aggcgtctgg gatgcgaacg 660ggcctggaac catagcgtca gggaggccgg
ggtccccctg ggcctgccag ccccgggtgc 720gaggaggcgc gggggcagtg
ccagccgaag tctgccgttg cccaagaggc ccaggcgtgg 780cgctgcccct
gagccggagc ggacgcccgt tgggcagggg tcctgggccc acccgggcag
840gacgcgtgga ccgagtgacc gtggtttctg tgtggtgtca cctgccagac
ccgccgaaga 900agccacctct ttggagggtg cgctctctgg cacgcgccac
tcccacccat ccgtgggccg 960ccagcaccac gcgggccccc catccacatc
gcggccacca cgtccctggg acacgccttg 1020tcccccggtg tacgccgaga
ccaagcactt cctctactcc tcaggcgaca aggagcagct 1080gcggccctcc
ttcctactca gctctctgag gcccagcctg actggcgctc ggaggctcgt
1140ggagaccatc tttctgggtt ccaggccctg gatgccaggg actccccgca
ggttgccccg 1200cctgccccag cgctactggc aaatgcggcc cctgtttctg
gagctgcttg ggaaccacgc 1260gcagtgcccc tacggggtgc tcctcaagac
gcactgcccg ctgcgagctg cggtcacccc 1320agcagccggt gtctgtgccc
gggagaagcc ccagggctct gtggcggccc ccgaggagga 1380ggacacagac
ccccgtcgcc tggtgcagct gctccgccag cacagcagcc cctggcaggt
1440gtacggcttc gtgcgggcct gcctgcgccg gctggtgccc ccaggcctct
ggggctccag 1500gcacaacgaa cgccgcttcc tcaggaacac caagaagttc
atctccctgg ggaagcatgc 1560caagctctcg ctgcaggagc tgacgtggaa
gatgagcgtg cgggactgcg cttggctgcg 1620caggagccca ggggttggct
gtgttccggc cgcagagcac cgtctgcgtg aggagatcct 1680ggccaagttc
ctgcactggc tgatgagtgt gtacgtcgtc gagctgctca ggtctttctt
1740ttatgtcacg gagaccacgt ttcaaaagaa caggctcttt ttctaccgga
agagtgtctg 1800gagcaagttg caaagcattg gaatcagaca gcacttgaag
agggtgcagc tgcgggagct 1860gtcggaagca gaggtcaggc agcatcggga
agccaggccc gccctgctga cgtccagact 1920ccgcttcatc cccaagcctg
acgggctgcg gccgattgtg aacatggact acgtcgtggg 1980agccagaacg
ttccgcagag aaaagagggc cgagcgtctc acctcgaggg tgaaggcact
2040gttcagcgtg ctcaactacg agcgggcgcg gcgccccggc ctcctgggcg
cctctgtgct 2100gggcctggac gatatccaca gggcctggcg caccttcgtg
ctgcgtgtgc gggcccagga 2160cccgccgcct gagctgtact ttgtcaaggt
ggatgtgacg ggcgcgtacg acaccatccc 2220ccaggacagg ctcacggagg
tcatcgccag catcatcaaa ccccagaaca cgtactgcgt 2280gcgtcggtat
gccgtggtcc agaaggccgc ccatgggcac gtccgcaagg ccttcaagag
2340ccacgtctct accttgacag acctccagcc gtacatgcga cagttcgtgg
ctcacctgca 2400ggagaccagc ccgctgaggg atgccgtcgt catcgagcag
agctcctccc tgaatgaggc 2460cagcagtggc ctcttcgacg tcttcctacg
cttcatgtgc caccacgccg tgcgcatcag 2520gggcaagtcc tacgtccagt
gccaggggat cccgcagggc tccatcctct ccacgctgct 2580ctgcagcctg
tgctacggcg acatggagaa caagctgttt gcggggattc ggcgggacgg
2640gctgctcctg cgtttggtgg atgatttctt gttggtgaca cctcacctca
cccacgcgaa 2700aaccttcctc aggaccctgg tccgaggtgt ccctgagtat
ggctgcgtgg tgaacttgcg 2760gaagacagtg gtgaacttcc ctgtagaaga
cgaggccctg ggtggcacgg cttttgttca 2820gatgccggcc cacggcctat
tcccctggtg cggcctgctg ctggataccc ggaccctgga 2880ggtgcagagc
gactactcca gctatgcccg gacctccatc agagccagtc tcaccttcaa
2940ccgcggcttc aaggctggga ggaacatgcg tcgcaaactc tttggggtct
tgcggctgaa 3000gtgtcacagc ctgtttctgg atttgcaggt gaacagcctc
cagacggtgt gcaccaacat 3060ctacaagatc ctcctgctgc aggcgtacag
gtttcacgca tgtgtgctgc agctcccatt 3120tcatcagcaa gtttggaaga
accccacatt tttcctgcgc gtcatctctg acacggcctc 3180cctctgctac
tccatcctga aagccaagaa cgcagggatg tcgctggggg ccaagggcgc
3240cgccggccct ctgccctccg aggccgtgca gtggctgtgc caccaagcat
tcctgctcaa 3300gctgactcga caccgtgtca cctacgtgcc actcctgggg
tcactcagga cagcccagac 3360gcagctgagt cggaagctcc cggggacgac
gctgactgcc ctggaggccg cagccaaccc 3420ggcactgccc tcagacttca
agaccatcct ggactgatgg ccacccgccc acagccaggc 3480cgagagcaga
caccagcagc cctgtcacgc cgggctctac gtcccaggga gggaggggcg
3540gcccacaccc aggcccgcac cgctgggagt ctgaggcctg agtgagtgtt
tggccgaggc 3600ctgcatgtcc ggctgaaggc tgagtgtccg gctgaggcct
gagcgagtgt ccagccaagg 3660gctgagtgtc cagcacacct gccgtcttca
cttccccaca ggctggcgct cggctccacc 3720ccagggccag cttttcctca
ccaggagccc ggcttccact ccccacatag gaatagtcca 3780tccccagatt
cgccattgtt cacccctcgc cctgccctcc tttgccttcc acccccacca
3840tccaggtgga gaccctgaga aggaccctgg gagctctggg aatttggagt
gaccaaaggt 3900gtgccctgta cacaggcgag gaccctgcac ctggatgggg
gtccctgtgg gtcaaattgg 3960ggggaggtgc tgtgggagta aaatactgaa
tatatgagtt tttcagtttt gaaaaaaa 4018
14 1132 PRT Homo sapiens
14 Met Pro
Arg Ala Pro Arg Cys Arg Ala Val Arg Ser Leu Leu Arg Ser1 5 10 15His
Tyr Arg Glu Val Leu Pro Leu Ala Thr Phe Val Arg Arg Leu Gly 20 25
30Pro Gln Gly Trp Arg Leu Val Gln Arg Gly Asp Pro Ala Ala Phe Arg
35 40 45Ala Leu Val Ala Gln Cys Leu Val Cys Val Pro Trp Asp Ala Arg
Pro 50 55 60Pro Pro Ala Ala Pro Ser Phe Arg Gln Val Ser Cys Leu Lys
Glu Leu65 70 75 80Val Ala Arg Val Leu Gln Arg Leu Cys Glu Arg Gly
Ala Lys Asn Val 85 90 95Leu Ala Phe Gly Phe Ala Leu Leu Asp Gly Ala
Arg Gly Gly Pro Pro 100 105 110Glu Ala Phe Thr Thr Ser Val Arg Ser
Tyr Leu Pro Asn Thr Val Thr 115 120 125Asp Ala Leu Arg Gly Ser Gly
Ala Trp Gly Leu Leu Leu Arg Arg Val 130 135 140Gly Asp Asp Val Leu
Val His Leu Leu Ala Arg Cys Ala Leu Phe Val145 150 155 160Leu Val
Ala Pro Ser Cys Ala Tyr Gln Val Cys Gly Pro Pro Leu Tyr 165 170
175Gln Leu Gly Ala Ala Thr Gln Ala Arg Pro Pro Pro His Ala Ser Gly
180 185 190Pro Arg Arg Arg Leu Gly Cys Glu Arg Ala Trp Asn His Ser
Val Arg 195 200 205Glu Ala Gly Val Pro Leu Gly Leu Pro Ala Pro Gly
Ala Arg Arg Arg 210 215 220Gly Gly Ser Ala Ser Arg Ser Leu Pro Leu
Pro Lys Arg Pro Arg Arg225 230 235 240Gly Ala Ala Pro Glu Pro Glu
Arg Thr Pro Val Gly Gln Gly Ser Trp 245 250 255Ala His Pro Gly Arg
Thr Arg Gly Pro Ser Asp Arg Gly Phe Cys Val 260 265 270Val Ser Pro
Ala Arg Pro Ala Glu Glu Ala Thr Ser Leu Glu Gly Ala 275 280 285Leu
Ser Gly Thr Arg His Ser His Pro Ser Val Gly Arg Gln His His 290 295
300Ala Gly Pro Pro Ser Thr Ser Arg Pro Pro Arg Pro Trp Asp Thr
Pro305 310 315 320Cys Pro Pro Val Tyr Ala Glu Thr Lys His Phe Leu
Tyr Ser Ser Gly 325 330 335Asp Lys Glu Gln Leu Arg Pro Ser Phe Leu
Leu Ser Ser Leu Arg Pro 340 345 350Ser Leu Thr Gly Ala Arg Arg Leu
Val Glu Thr Ile Phe Leu Gly Ser 355 360 365Arg Pro Trp Met Pro Gly
Thr Pro Arg Arg Leu Pro Arg Leu Pro Gln 370 375 380Arg Tyr Trp Gln
Met Arg Pro Leu Phe Leu Glu Leu Leu Gly Asn His385 390 395 400Ala
Gln Cys Pro Tyr Gly Val Leu Leu Lys Thr His Cys Pro Leu Arg 405 410
415Ala Ala Val Thr Pro Ala Ala Gly Val Cys Ala Arg Glu Lys Pro Gln
420 425 430Gly Ser Val Ala Ala Pro Glu Glu Glu Asp Thr Asp Pro Arg
Arg Leu 435 440 445Val Gln Leu Leu Arg Gln His Ser Ser Pro Trp Gln
Val Tyr Gly Phe 450 455 460Val Arg Ala Cys Leu Arg Arg Leu Val Pro
Pro Gly Leu Trp Gly Ser465 470 475 480Arg His Asn Glu Arg Arg Phe
Leu Arg Asn Thr Lys Lys Phe Ile Ser 485 490 495Leu Gly Lys His Ala
Lys Leu Ser Leu Gln Glu Leu Thr Trp Lys Met 500 505 510Ser Val Arg
Asp Cys Ala Trp Leu Arg Arg Ser Pro Gly Val Gly Cys 515 520 525Val
Pro Ala Ala Glu His Arg Leu Arg Glu Glu Ile Leu Ala Lys Phe 530 535
540Leu His Trp Leu Met Ser Val Tyr Val Val Glu Leu Leu Arg Ser
Phe545 550 555 560Phe Tyr Val Thr Glu Thr Thr Phe Gln Lys Asn Arg
Leu Phe Phe Tyr 565 570 575Arg Lys Ser Val Trp Ser Lys Leu Gln Ser
Ile Gly Ile Arg Gln His 580 585 590Leu Lys Arg Val Gln Leu Arg Glu
Leu Ser Glu Ala Glu Val Arg Gln 595 600 605His Arg Glu Ala Arg Pro
Ala Leu Leu Thr Ser Arg Leu Arg Phe Ile 610 615 620Pro Lys Pro Asp
Gly Leu Arg Pro Ile Val Asn Met Asp Tyr Val Val625 630 635 640Gly
Ala Arg Thr Phe Arg Arg Glu Lys Arg Ala Glu Arg Leu Thr Ser 645 650
655Arg Val Lys Ala Leu Phe Ser Val Leu Asn Tyr Glu Arg Ala Arg Arg
660 665 670Pro Gly Leu Leu Gly Ala Ser Val Leu Gly Leu Asp Asp Ile
His Arg 675 680 685Ala Trp Arg Thr Phe Val Leu Arg Val Arg Ala Gln
Asp Pro Pro Pro 690 695 700Glu Leu Tyr Phe Val Lys Val Asp Val Thr
Gly Ala Tyr Asp Thr Ile705 710 715 720Pro Gln Asp Arg Leu Thr Glu
Val Ile Ala Ser Ile Ile Lys Pro Gln 725 730 735Asn Thr Tyr Cys Val
Arg Arg Tyr Ala Val Val Gln Lys Ala Ala His 740 745 750Gly His Val
Arg Lys Ala Phe Lys Ser His Val Ser Thr Leu Thr Asp 755 760 765Leu
Gln Pro Tyr Met Arg Gln Phe Val Ala His Leu Gln Glu Thr Ser 770 775
780Pro Leu Arg Asp Ala Val Val Ile Glu Gln Ser Ser Ser Leu Asn
Glu785 790 795 800Ala Ser Ser Gly Leu Phe Asp Val Phe Leu Arg Phe
Met Cys His His 805 810 815Ala Val Arg Ile Arg Gly Lys Ser Tyr Val
Gln Cys Gln Gly Ile Pro 820 825 830Gln Gly Ser Ile Leu Ser Thr Leu
Leu Cys Ser Leu Cys Tyr Gly Asp 835 840 845Met Glu Asn Lys Leu Phe
Ala Gly Ile Arg Arg Asp Gly Leu Leu Leu 850 855 860Arg Leu Val Asp
Asp Phe Leu Leu Val Thr Pro His Leu Thr His Ala865 870 875 880Lys
Thr Phe Leu Arg Thr Leu Val Arg Gly Val Pro Glu Tyr Gly Cys 885 890
895Val Val Asn Leu Arg Lys Thr Val Val Asn Phe Pro Val Glu Asp Glu
900 905 910Ala Leu Gly Gly Thr Ala Phe Val Gln Met Pro Ala His Gly
Leu Phe 915 920 925Pro Trp Cys Gly Leu Leu Leu Asp Thr Arg Thr Leu
Glu Val Gln Ser 930 935 940Asp Tyr Ser Ser Tyr Ala Arg Thr Ser Ile
Arg Ala Ser Leu Thr Phe945 950 955 960Asn Arg Gly Phe Lys Ala Gly
Arg Asn Met Arg Arg Lys Leu Phe Gly 965 970 975Val Leu Arg Leu Lys
Cys His Ser Leu Phe Leu Asp Leu Gln Val Asn 980 985 990Ser Leu Gln
Thr Val Cys Thr Asn Ile Tyr Lys Ile Leu Leu Leu Gln 995 1000
1005Ala Tyr Arg Phe His Ala Cys Val Leu Gln Leu Pro Phe His Gln
1010 1015 1020Gln Val Trp Lys Asn Pro Thr Phe Phe Leu Arg Val Ile
Ser Asp 1025 1030 1035Thr Ala Ser Leu Cys Tyr Ser Ile Leu Lys Ala
Lys Asn Ala Gly 1040 1045 1050Met Ser Leu Gly Ala Lys Gly Ala Ala
Gly Pro Leu Pro Ser Glu 1055 1060 1065Ala Val Gln Trp Leu Cys His
Gln Ala Phe Leu Leu Lys Leu Thr 1070 1075 1080Arg His Arg Val Thr
Tyr Val Pro Leu Leu Gly Ser Leu Arg Thr 1085 1090 1095Ala Gln Thr
Gln Leu Ser Arg Lys Leu Pro Gly Thr Thr Leu Thr 1100 1105 1110Ala
Leu Glu Ala Ala Ala Asn Pro Ala Leu Pro Ser Asp Phe Lys 1115 1120
1125Thr Ile Leu Asp 1130
15 3829 DNA Homo sapiens
15 caggcagcgc
tgcgtcctgc tgcgcacgtg ggaagccctg gccccggcca cccccgcgat 60gccgcgcgct
ccccgctgcc gagccgtgcg ctccctgctg cgcagccact accgcgaggt
120gctgccgctg gccacgttcg tgcggcgcct ggggccccag ggctggcggc
tggtgcagcg 180cggggacccg gcggctttcc gcgcgctggt ggcccagtgc
ctggtgtgcg tgccctggga 240cgcacggccg ccccccgccg ccccctcctt
ccgccaggtg tcctgcctga aggagctggt 300ggcccgagtg ctgcagaggc
tgtgcgagcg cggcgcgaag aacgtgctgg ccttcggctt 360cgcgctgctg
gacggggccc gcgggggccc ccccgaggcc ttcaccacca gcgtgcgcag
420ctacctgccc aacacggtga ccgacgcact gcgggggagc ggggcgtggg
ggctgctgct 480gcgccgcgtg ggcgacgacg tgctggttca cctgctggca
cgctgcgcgc tctttgtgct 540ggtggctccc agctgcgcct accaggtgtg
cgggccgccg ctgtaccagc tcggcgctgc 600cactcaggcc cggcccccgc
cacacgctag tggaccccga aggcgtctgg gatgcgaacg 660ggcctggaac
catagcgtca gggaggccgg ggtccccctg ggcctgccag ccccgggtgc
720gaggaggcgc gggggcagtg ccagccgaag tctgccgttg cccaagaggc
ccaggcgtgg 780cgctgcccct gagccggagc ggacgcccgt tgggcagggg
tcctgggccc acccgggcag 840gacgcgtgga ccgagtgacc gtggtttctg
tgtggtgtca cctgccagac ccgccgaaga 900agccacctct ttggagggtg
cgctctctgg cacgcgccac tcccacccat ccgtgggccg 960ccagcaccac
gcgggccccc catccacatc gcggccacca cgtccctggg acacgccttg
1020tcccccggtg tacgccgaga ccaagcactt cctctactcc tcaggcgaca
aggagcagct 1080gcggccctcc ttcctactca gctctctgag gcccagcctg
actggcgctc ggaggctcgt 1140ggagaccatc tttctgggtt ccaggccctg
gatgccaggg actccccgca ggttgccccg 1200cctgccccag cgctactggc
aaatgcggcc cctgtttctg gagctgcttg ggaaccacgc 1260gcagtgcccc
tacggggtgc tcctcaagac gcactgcccg ctgcgagctg cggtcacccc
1320agcagccggt gtctgtgccc gggagaagcc ccagggctct gtggcggccc
ccgaggagga 1380ggacacagac ccccgtcgcc tggtgcagct gctccgccag
cacagcagcc cctggcaggt 1440gtacggcttc gtgcgggcct gcctgcgccg
gctggtgccc ccaggcctct ggggctccag 1500gcacaacgaa cgccgcttcc
tcaggaacac caagaagttc atctccctgg ggaagcatgc 1560caagctctcg
ctgcaggagc tgacgtggaa gatgagcgtg cgggactgcg cttggctgcg
1620caggagccca ggggttggct gtgttccggc cgcagagcac cgtctgcgtg
aggagatcct 1680ggccaagttc ctgcactggc tgatgagtgt gtacgtcgtc
gagctgctca ggtctttctt 1740ttatgtcacg gagaccacgt ttcaaaagaa
caggctcttt ttctaccgga agagtgtctg 1800gagcaagttg caaagcattg
gaatcagaca gcacttgaag agggtgcagc tgcgggagct 1860gtcggaagca
gaggtcaggc agcatcggga agccaggccc gccctgctga cgtccagact
1920ccgcttcatc cccaagcctg acgggctgcg gccgattgtg aacatggact
acgtcgtggg 1980agccagaacg ttccgcagag aaaagagggc cgagcgtctc
acctcgaggg tgaaggcact 2040gttcagcgtg ctcaactacg agcgggcgcg
gcgccccggc ctcctgggcg cctctgtgct 2100gggcctggac gatatccaca
gggcctggcg caccttcgtg ctgcgtgtgc gggcccagga 2160cccgccgcct
gagctgtact ttgtcaaggt ggatgtgacg ggcgcgtacg acaccatccc
2220ccaggacagg ctcacggagg tcatcgccag catcatcaaa ccccagaaca
cgtactgcgt 2280gcgtcggtat gccgtggtcc agaaggccgc ccatgggcac
gtccgcaagg ccttcaagag 2340ccacgtctct accttgacag acctccagcc
gtacatgcga cagttcgtgg ctcacctgca 2400ggagaccagc ccgctgaggg
atgccgtcgt catcgagcag agctcctccc tgaatgaggc 2460cagcagtggc
ctcttcgacg tcttcctacg cttcatgtgc caccacgccg tgcgcatcag
2520gggcaagtcc tacgtccagt gccaggggat cccgcagggc tccatcctct
ccacgctgct 2580ctgcagcctg tgctacggcg acatggagaa caagctgttt
gcggggattc ggcgggacgg 2640gctgctcctg cgtttggtgg atgatttctt
gttggtgaca cctcacctca cccacgcgaa 2700aaccttcctc agctatgccc
ggacctccat cagagccagt ctcaccttca accgcggctt 2760caaggctggg
aggaacatgc gtcgcaaact ctttggggtc ttgcggctga agtgtcacag
2820cctgtttctg gatttgcagg tgaacagcct ccagacggtg tgcaccaaca
tctacaagat 2880cctcctgctg caggcgtaca ggtttcacgc atgtgtgctg
cagctcccat ttcatcagca 2940agtttggaag aaccccacat ttttcctgcg
cgtcatctct gacacggcct ccctctgcta 3000ctccatcctg aaagccaaga
acgcagggat gtcgctgggg gccaagggcg ccgccggccc 3060tctgccctcc
gaggccgtgc agtggctgtg ccaccaagca ttcctgctca agctgactcg
3120acaccgtgtc acctacgtgc cactcctggg gtcactcagg acagcccaga
cgcagctgag 3180tcggaagctc ccggggacga cgctgactgc cctggaggcc
gcagccaacc cggcactgcc 3240ctcagacttc aagaccatcc tggactgatg
gccacccgcc cacagccagg ccgagagcag 3300acaccagcag ccctgtcacg
ccgggctcta cgtcccaggg agggaggggc ggcccacacc 3360caggcccgca
ccgctgggag tctgaggcct gagtgagtgt ttggccgagg cctgcatgtc
3420cggctgaagg ctgagtgtcc ggctgaggcc tgagcgagtg tccagccaag
ggctgagtgt 3480ccagcacacc tgccgtcttc acttccccac aggctggcgc
tcggctccac cccagggcca 3540gcttttcctc accaggagcc cggcttccac
tccccacata ggaatagtcc atccccagat 3600tcgccattgt tcacccctcg
ccctgccctc ctttgccttc cacccccacc atccaggtgg 3660agaccctgag
aaggaccctg ggagctctgg gaatttggag tgaccaaagg tgtgccctgt
3720acacaggcga ggaccctgca cctggatggg ggtccctgtg ggtcaaattg
gggggaggtg 3780ctgtgggagt aaaatactga atatatgagt ttttcagttt
tgaaaaaaa 3829
16 1069 PRT Homo sapiens
16 Met Pro Arg Ala Pro Arg Cys
Arg Ala Val Arg Ser Leu Leu Arg Ser1 5 10 15His Tyr Arg Glu Val Leu
Pro Leu Ala Thr Phe Val Arg Arg Leu Gly 20 25 30Pro Gln Gly Trp Arg
Leu Val Gln Arg Gly Asp Pro Ala Ala Phe Arg 35 40 45Ala Leu Val Ala
Gln Cys Leu Val Cys Val Pro Trp Asp Ala Arg Pro 50 55 60Pro Pro Ala
Ala Pro Ser Phe Arg Gln Val Ser Cys Leu Lys Glu Leu65 70 75 80Val
Ala Arg Val Leu Gln Arg Leu Cys Glu Arg Gly Ala Lys Asn Val 85 90
95Leu Ala Phe Gly Phe Ala Leu Leu Asp Gly Ala Arg Gly Gly Pro Pro
100 105 110Glu Ala Phe Thr Thr Ser Val Arg Ser Tyr Leu Pro Asn Thr
Val Thr 115 120 125Asp Ala Leu Arg Gly Ser Gly Ala Trp Gly Leu Leu
Leu Arg Arg Val 130 135 140Gly Asp Asp Val Leu Val His Leu Leu Ala
Arg Cys Ala Leu Phe Val145 150 155 160Leu Val Ala Pro Ser Cys Ala
Tyr Gln Val Cys Gly Pro Pro Leu Tyr 165 170 175Gln Leu Gly Ala Ala
Thr Gln Ala Arg Pro Pro Pro His Ala Ser Gly 180 185 190Pro Arg Arg
Arg Leu Gly Cys Glu Arg Ala Trp Asn His Ser Val Arg 195 200 205Glu
Ala Gly Val Pro Leu Gly Leu Pro Ala Pro Gly Ala Arg Arg Arg 210 215
220Gly Gly Ser Ala Ser Arg Ser Leu Pro Leu Pro Lys Arg Pro Arg
Arg225 230 235 240Gly Ala Ala Pro Glu Pro Glu Arg Thr Pro Val Gly
Gln Gly Ser Trp 245 250 255Ala His Pro Gly Arg Thr Arg Gly Pro Ser
Asp Arg Gly Phe Cys Val 260 265 270Val Ser Pro Ala Arg Pro Ala Glu
Glu Ala Thr Ser Leu Glu Gly Ala 275 280 285Leu Ser Gly Thr Arg His
Ser His Pro Ser Val Gly Arg Gln His His 290 295 300Ala Gly Pro Pro
Ser Thr Ser Arg Pro Pro Arg Pro Trp Asp Thr Pro305 310 315 320Cys
Pro Pro Val Tyr Ala Glu Thr Lys His Phe Leu Tyr Ser Ser Gly 325 330
335Asp Lys Glu Gln Leu Arg Pro Ser Phe Leu Leu Ser Ser Leu Arg Pro
340 345 350Ser Leu Thr Gly Ala Arg Arg Leu Val Glu Thr Ile Phe Leu
Gly Ser 355 360 365Arg Pro Trp Met Pro Gly Thr Pro Arg Arg Leu Pro
Arg Leu Pro Gln 370 375 380Arg Tyr Trp Gln Met Arg Pro Leu Phe Leu
Glu Leu Leu Gly Asn His385 390 395 400Ala Gln Cys Pro Tyr Gly Val
Leu Leu Lys Thr His Cys Pro Leu Arg 405 410 415Ala Ala Val Thr Pro
Ala Ala Gly Val Cys Ala Arg Glu Lys Pro Gln 420 425 430Gly Ser Val
Ala Ala Pro Glu Glu Glu Asp Thr Asp Pro Arg Arg Leu 435 440 445Val
Gln Leu Leu Arg Gln His Ser Ser Pro Trp Gln Val Tyr Gly Phe 450 455
460Val Arg Ala Cys Leu Arg Arg Leu Val Pro Pro Gly Leu Trp Gly
Ser465 470 475 480Arg His Asn Glu Arg Arg Phe Leu Arg Asn Thr Lys
Lys Phe Ile Ser 485 490 495Leu Gly Lys His Ala Lys Leu Ser Leu Gln
Glu Leu Thr Trp Lys Met 500 505 510Ser Val Arg Asp Cys Ala Trp Leu
Arg Arg Ser Pro Gly Val Gly Cys 515 520 525Val Pro Ala Ala Glu His
Arg Leu Arg Glu Glu Ile Leu Ala Lys Phe 530 535 540Leu His Trp Leu
Met Ser Val Tyr Val Val Glu Leu Leu Arg Ser Phe545 550 555 560Phe
Tyr Val Thr Glu Thr Thr Phe Gln Lys Asn Arg Leu Phe Phe Tyr 565 570
575Arg Lys Ser Val Trp Ser Lys Leu Gln Ser Ile Gly Ile Arg Gln His
580 585 590Leu Lys Arg Val Gln Leu Arg Glu Leu Ser Glu Ala Glu Val
Arg Gln 595 600 605His Arg Glu Ala Arg Pro Ala Leu Leu Thr Ser Arg
Leu Arg Phe Ile 610 615 620Pro Lys Pro Asp Gly Leu Arg Pro Ile Val
Asn Met Asp Tyr Val Val625 630 635 640Gly Ala Arg Thr Phe Arg Arg
Glu Lys Arg Ala Glu Arg Leu Thr Ser 645 650 655Arg Val Lys Ala Leu
Phe Ser Val Leu Asn Tyr Glu Arg Ala Arg Arg 660 665 670Pro Gly Leu
Leu Gly Ala Ser Val Leu Gly Leu Asp Asp Ile His Arg 675 680 685Ala
Trp Arg Thr Phe Val Leu Arg Val Arg Ala Gln Asp Pro Pro Pro 690 695
700Glu Leu Tyr Phe Val Lys Val Asp Val Thr Gly Ala Tyr Asp Thr
Ile705 710 715 720Pro Gln Asp Arg Leu Thr Glu Val Ile Ala Ser Ile
Ile Lys Pro Gln 725 730 735Asn Thr Tyr Cys Val Arg Arg Tyr Ala Val
Val Gln Lys Ala Ala His 740 745 750Gly His Val Arg Lys Ala Phe Lys
Ser His Val Ser Thr Leu Thr Asp 755 760 765Leu Gln Pro Tyr Met Arg
Gln Phe Val Ala His Leu Gln Glu Thr Ser 770 775 780Pro Leu Arg Asp
Ala Val Val Ile Glu Gln Ser Ser Ser Leu Asn Glu785 790 795 800Ala
Ser Ser Gly Leu Phe Asp Val Phe Leu Arg Phe Met Cys His His 805 810
815Ala Val Arg Ile Arg Gly Lys Ser Tyr Val Gln Cys Gln Gly Ile Pro
820 825 830Gln Gly Ser Ile Leu Ser Thr Leu Leu Cys Ser Leu Cys Tyr
Gly Asp 835 840 845Met Glu Asn Lys Leu Phe Ala Gly Ile Arg Arg Asp
Gly Leu Leu Leu 850 855 860Arg Leu Val Asp Asp Phe Leu Leu Val Thr
Pro His Leu Thr His Ala865 870 875 880Lys Thr Phe Leu Ser Tyr Ala
Arg Thr Ser Ile Arg Ala Ser Leu Thr 885 890 895Phe Asn Arg Gly Phe
Lys Ala Gly Arg Asn Met Arg Arg Lys Leu Phe 900 905 910Gly Val Leu
Arg Leu Lys Cys His Ser Leu Phe Leu Asp Leu Gln Val 915 920 925Asn
Ser Leu Gln Thr Val Cys Thr Asn Ile Tyr Lys Ile Leu Leu Leu 930 935
940Gln Ala Tyr Arg Phe His Ala Cys Val Leu Gln Leu Pro Phe His
Gln945 950 955 960Gln Val Trp Lys Asn Pro Thr Phe Phe Leu Arg Val
Ile Ser Asp Thr 965 970 975Ala Ser Leu Cys Tyr Ser Ile Leu Lys Ala
Lys Asn Ala Gly Met Ser 980 985 990Leu Gly Ala Lys Gly Ala Ala Gly
Pro Leu Pro Ser Glu Ala Val Gln 995 1000 1005Trp Leu Cys His Gln
Ala Phe Leu Leu Lys Leu Thr Arg His Arg 1010 1015 1020Val Thr Tyr
Val Pro Leu Leu Gly Ser Leu Arg Thr Ala Gln Thr 1025 1030 1035Gln
Leu Ser Arg Lys Leu Pro Gly Thr Thr Leu Thr Ala Leu Glu 1040 1045
1050Ala Ala Ala Asn Pro Ala Leu Pro Ser Asp Phe Lys Thr Ile Leu
1055 1060 1065Asp
17 5030 DNA Homo sapiens
17 attgaggagc agaaggagta
gggtgcgggg gaggaggagg agcgccttta gtgctgcagc 60agctgctgct ctgattggcc
cggtggttca gctgcttccc tggaacaaaa ggtcaaagtg 120gactgcagtg
taaatgtaga gaagcagccg ataaaatagc attgcctgaa gaagtttgga
180ggctgagagc agcagtagac tggccaactg cagagcaagt tgtttctcca
gccgtgcggt 240gcagcctcat gcccccaacc cagcttagcc actgtaagaa
gacgttcact gtacagacga 300ccaaacttgc cgtggaagag acagttgtga
gattcccttg caaatttaca tacgagaatg 360gcttgtgaaa tcatgcctct
gcaaagttca caggaagatg aaagacctct gtcacctttc 420tatttgagtg
ctcatgtacc ccaagtcagc aatgtgtctg caaccggaga actcttagaa
480agaaccatcc gatcagctgt agaacaacat ctttttgatg ttaataactc
tggaggtcaa 540agttcagagg actcagaatc tggaacacta tcagcatctt
ctgccacatc tgccagacag 600cgccgccgcc agtccaagga gcaggatgaa
gttcgacatg ggagagacaa gggacttatc 660aacaaagaaa atactccttc
tgggttcaac caccttgatg attgtatttt gaatactcag 720gaagtcgaaa
aggtacacaa aaatactttt ggttgtgctg gagaaaggag caagcctaaa
780cgtcagaaat ccagtactaa actttctgag cttcatgaca atcaggacgg
tcttgtgaat 840atggaaagtc tcaattccac acgatctcat gagagaactg
gacctgatga ttttgaatgg 900atgtctgatg aaaggaaagg aaatgaaaaa
gatggtggac acactcagca ttttgagagc 960cccacaatga agatccagga
gcatcccagc ctatctgaca ccaaacagca gagaaatcaa 1020gatgccggtg
accaggagga gagctttgtc tccgaagtgc cccagtcgga cctgactgca
1080ttgtgtgatg aaaagaactg ggaagagcct atccctgctt tctcctcctg
gcagcgggag 1140aacagtgact ctgatgaagc ccacctctcg ccgcaggctg
ggcgcctgat ccgtcagctg 1200ctggacgaag acagcgaccc catgctctct
cctcggttct acgcttatgg gcagagcagg 1260caatacctgg atgacacaga
agtgcctcct tccccaccaa actcccattc tttcatgagg 1320cggcgaagct
cctctctggg gtcctatgat gatgagcaag aggacctgac acctgcccag
1380ctcacacgaa ggattcagag ccttaaaaag aagatccgga agtttgaaga
tagattcgaa 1440gaagagaaga agtacagacc ttcccacagt gacaaagcag
ccaatccgga ggttctgaaa 1500tggacaaatg accttgccaa attccggaga
caacttaaag aatcaaaact aaagatatct 1560gaagaggacc taactcccag
gatgcggcag cgaagcaaca cactccccaa gagttttggt 1620tcccaacttg
agaaagaaga tgagaagaag caagagctgg tggataaagc aataaagccc
1680agtgttgaag ccacattgga atctattcag aggaagctcc aggagaagcg
agcggaaagc 1740agccgccctg aggacattaa ggatatgacc aaagaccaga
ttgctaatga gaaagtggct 1800ctgcagaaag ctctgttata ttatgaaagc
attcatggac ggccggtaac aaagaacgaa 1860cggcaggtga tgaagccact
atacgacagg taccggctgg tcaaacagat cctctcccga 1920gctaacacca
tacccatcat tggttccccc tccagcaagc ggagaagccc tttgctgcag
1980ccaattatcg agggcgaaac tgcttccttc ttcaaggaga taaaggaaga
agaggagggg 2040tcagaagacg atagcaatgt gaagccagac ttcatggtca
ctctgaaaac cgatttcagt 2100gcacgatgct ttctggacca attcgaagat
gacgctgatg gatttatttc cccaatggat 2160gataaaatac catcaaaatg
cagccaggac acagggcttt caaatctcca tgctgcctca 2220atacctgaac
tcctggaaca cctccaggaa atgagagaag aaaagaaaag gattcgaaag
2280aaacttcggg attttgaaga caactttttc agacagaatg gaagaaatgt
ccagaaggaa 2340gaccgcactc ctatggctga agaatacagt gaatataagc
acataaaggc gaaactgagg 2400ctcctggagg tgctcatcag caagagagac
actgattcca agtccatgtg aggggcatgg 2460ccaagcacag ggggctggca
gctgcggtga gagtttactg tccccagaga aagtgcagct 2520ctggaaggca
gccttggggc tggccctgca aagcatgcag cccttctgcc tctagaccat
2580ttggcatcgg ctcctgtttc cattgcctgc cttagaaact ggctggaaga
agacaatgtg 2640acctgactta ggcattttgt aattggaaag tcaagactgc
agtatgtgca catgcgcacg 2700cgcatgcacg cacacacaca cacagtagtg
gagctttcct aacactagca gagattaatc 2760actacattag acaacactca
tctacagaga atatacactg ttcttccctg gataactgag 2820aaacaagaga
ccattctctg tctaactgtg ataaaaacaa gctcaggact ttattctata
2880gagcaaactt gctgtggagg gccatgctct ccttggaccc agttaactgc
aaacgtgcat 2940tggagcccta tttgctgccg ctgccattct agtgaccttt
ccacagagct gcgccttcct 3000cacgtgtgtg aaaggttttc cccttcagcc
ctcaggtaga tggaagctgc atctgcccac 3060gatggcagtg cagtcatcat
cttcaggatg tttcttcagg acttcctcag ctgacaagga 3120attttggtcc
ctgcctagga ccgggtcatc tgcagaggac agagagatgg taagcagctg
3180tatgaatgct gattttaaaa ccaggtcatg ggagaagagc ctggagattc
tttcctgaac 3240actgactgca cttaccagtc tgattttatc gtcaaacacc
aagccaggct agcatgctca 3300tggcaatctg tttggggctg ttttgttgtg
gcactagcca aacataaagg ggcttaagtc 3360agcctgcata cagaggatcg
gggagagaag gggcctgtgt tctcagcctc ctgagtactt 3420accagagttt
aattttttta aaaaaaatct gcactaaaat ccccaaactg acaggtaaat
3480gtagccctca gagctcagcc caaggcagaa tctaaatcac actattttcg
agatcatgta 3540taaaaagaaa aaaaagaagt catgctgtgt ggccaattat
aatttttttc aaagactttg 3600tcacaaaact gtctatatta gacattttgg
agggaccagg aaatgtaaga caccaaatcc 3660tccatctctt cagtgtgcct
gatgtcacct catgatttgc tgttactttt ttaactcctg 3720cgccaaggac
agtgggttct gtgtccacct ttgtgctttg cgaggccgag cccaggcatc
3780tgctcgcctg ccacggctga ccagagaagg tgcttcagga gctctgcctt
agacgacgtg 3840ttacagtatg aacacacagc agaggcaccc tcgtatgttt
tgaaagttgc cttctgaaag 3900ggcacagttt taaggaaaag aaaaagaatg
taaaactata ctgacccgtt ttcagtttta 3960aagggtcgtg agaaactggc
tggtccaatg ggatttacag caacattttc cattgctgaa 4020gtgaggtagc
agctctcttc tgtcagctga atgttaagga tggggaaaaa gaatgccttt
4080aagtttgctc ttaatcgtat ggaagcttga gctatgtgtt ggaagtgccc
tggttttaat 4140ccatacacaa agacggtaca taatcctaca ggtttaaatg
tacataaaaa tatagtttgg 4200aattctttgc tctactgttt acattgcaga
ttgctataat ttcaaggagt gagattataa 4260ataaaatgat gcactttagg
atgtttccta tttttgaaat ctgaacatga atcattcaca 4320tgaccaaaaa
ttgtgttttt ttaaaaatac atgtctagtc tgtcctttaa tagctctctt
4380aaataagcta tgatattaat cagatcatta ccagttagct tttaaagcac
atttgtttaa 4440gactatgttt ttggaaaaat acgctacaga attttttttt
aagctacaaa taaatgagat 4500gctactaatt gttttggaat ctgttgtttc
tgccaaaggt aaattaacta aagatttatt 4560caggaatccc catttgaatt
tgtatgattc aataaaagaa aacaccaagt aagttatata 4620aaataaattg
tgtatgagat gttgtgtttt cctttgtaat ttccactaac taactaacta
4680acttatattc ttcatggaat ggagcccaga agaaatgaga ggaagccctt
ttcacactag 4740atcttatttg aagaaatgtt tgttagtcag tcagtcagtg
gtttctggct ctgccgaggg 4800agatgtgttc cccagcaacc atttctgcag
cccagaatct caaggcacta gaggcggtgt 4860cttaattaat tggcttcaca
aagacaaaat gctctggact gggatttttc ctttgctgtg 4920ttgggaatat
gtgtttatta attagcacat gccaacaaaa taaatgtcaa gagttatttc
4980ataagtgtaa gtaaacttaa gaattaaaga gtgcagactt ataattttca
5030
18 697 PRT Homo sapiens
18 Met Ala Cys Glu Ile Met Pro Leu Gln Ser
Ser Gln Glu Asp Glu Arg1 5 10 15Pro Leu Ser Pro Phe Tyr Leu Ser Ala
His Val Pro Gln Val Ser Asn 20 25 30Val Ser Ala Thr Gly Glu Leu Leu
Glu Arg Thr Ile Arg Ser Ala Val 35 40 45Glu Gln His Leu Phe Asp Val
Asn Asn Ser Gly Gly Gln Ser Ser Glu 50 55
60Asp Ser Glu Ser Gly Thr Leu Ser Ala Ser Ser Ala Thr Ser Ala Arg65
70 75 80Gln Arg Arg Arg Gln Ser Lys Glu Gln Asp Glu Val Arg His Gly
Arg 85 90 95Asp Lys Gly Leu Ile Asn Lys Glu Asn Thr Pro Ser Gly Phe
Asn His 100 105 110Leu Asp Asp Cys Ile Leu Asn Thr Gln Glu Val Glu
Lys Val His Lys 115 120 125Asn Thr Phe Gly Cys Ala Gly Glu Arg Ser
Lys Pro Lys Arg Gln Lys 130 135 140Ser Ser Thr Lys Leu Ser Glu Leu
His Asp Asn Gln Asp Gly Leu Val145 150 155 160Asn Met Glu Ser Leu
Asn Ser Thr Arg Ser His Glu Arg Thr Gly Pro 165 170 175Asp Asp Phe
Glu Trp Met Ser Asp Glu Arg Lys Gly Asn Glu Lys Asp 180 185 190Gly
Gly His Thr Gln His Phe Glu Ser Pro Thr Met Lys Ile Gln Glu 195 200
205His Pro Ser Leu Ser Asp Thr Lys Gln Gln Arg Asn Gln Asp Ala Gly
210 215 220Asp Gln Glu Glu Ser Phe Val Ser Glu Val Pro Gln Ser Asp
Leu Thr225 230 235 240Ala Leu Cys Asp Glu Lys Asn Trp Glu Glu Pro
Ile Pro Ala Phe Ser 245 250 255Ser Trp Gln Arg Glu Asn Ser Asp Ser
Asp Glu Ala His Leu Ser Pro 260 265 270Gln Ala Gly Arg Leu Ile Arg
Gln Leu Leu Asp Glu Asp Ser Asp Pro 275 280 285Met Leu Ser Pro Arg
Phe Tyr Ala Tyr Gly Gln Ser Arg Gln Tyr Leu 290 295 300Asp Asp Thr
Glu Val Pro Pro Ser Pro Pro Asn Ser His Ser Phe Met305 310 315
320Arg Arg Arg Ser Ser Ser Leu Gly Ser Tyr Asp Asp Glu Gln Glu Asp
325 330 335Leu Thr Pro Ala Gln Leu Thr Arg Arg Ile Gln Ser Leu Lys
Lys Lys 340 345 350Ile Arg Lys Phe Glu Asp Arg Phe Glu Glu Glu Lys
Lys Tyr Arg Pro 355 360 365Ser His Ser Asp Lys Ala Ala Asn Pro Glu
Val Leu Lys Trp Thr Asn 370 375 380Asp Leu Ala Lys Phe Arg Arg Gln
Leu Lys Glu Ser Lys Leu Lys Ile385 390 395 400Ser Glu Glu Asp Leu
Thr Pro Arg Met Arg Gln Arg Ser Asn Thr Leu 405 410 415Pro Lys Ser
Phe Gly Ser Gln Leu Glu Lys Glu Asp Glu Lys Lys Gln 420 425 430Glu
Leu Val Asp Lys Ala Ile Lys Pro Ser Val Glu Ala Thr Leu Glu 435 440
445Ser Ile Gln Arg Lys Leu Gln Glu Lys Arg Ala Glu Ser Ser Arg Pro
450 455 460Glu Asp Ile Lys Asp Met Thr Lys Asp Gln Ile Ala Asn Glu
Lys Val465 470 475 480Ala Leu Gln Lys Ala Leu Leu Tyr Tyr Glu Ser
Ile His Gly Arg Pro 485 490 495Val Thr Lys Asn Glu Arg Gln Val Met
Lys Pro Leu Tyr Asp Arg Tyr 500 505 510Arg Leu Val Lys Gln Ile Leu
Ser Arg Ala Asn Thr Ile Pro Ile Ile 515 520 525Gly Ser Pro Ser Ser
Lys Arg Arg Ser Pro Leu Leu Gln Pro Ile Ile 530 535 540Glu Gly Glu
Thr Ala Ser Phe Phe Lys Glu Ile Lys Glu Glu Glu Glu545 550 555
560Gly Ser Glu Asp Asp Ser Asn Val Lys Pro Asp Phe Met Val Thr Leu
565 570 575Lys Thr Asp Phe Ser Ala Arg Cys Phe Leu Asp Gln Phe Glu
Asp Asp 580 585 590Ala Asp Gly Phe Ile Ser Pro Met Asp Asp Lys Ile
Pro Ser Lys Cys 595 600 605Ser Gln Asp Thr Gly Leu Ser Asn Leu His
Ala Ala Ser Ile Pro Glu 610 615 620Leu Leu Glu His Leu Gln Glu Met
Arg Glu Glu Lys Lys Arg Ile Arg625 630 635 640Lys Lys Leu Arg Asp
Phe Glu Asp Asn Phe Phe Arg Gln Asn Gly Arg 645 650 655Asn Val Gln
Lys Glu Asp Arg Thr Pro Met Ala Glu Glu Tyr Ser Glu 660 665 670Tyr
Lys His Ile Lys Ala Lys Leu Arg Leu Leu Glu Val Leu Ile Ser 675 680
685Lys Arg Asp Thr Asp Ser Lys Ser Met 690 695
19 8005 DNA Homo sapiens
19 aagaaaccgg ccaggtgtgg cctaggcgcc cagtgccagc ggggaggaga ctcgctccgc
60cgccgaccaa caccaacacc cagctccgac gcagctcctc tgcgcccttg ccgccctccg
120agccacagct ttcctcccgc tcctgccccc ggcccgtcgc cgtctccgcg
ctcgcagcgg 180cctcgggagg gcccaggtag cgagcagcga cctcgcgagc
cttccgcact cccgcccggt 240tccccggccg tccgcctatc cttggccccc
tccgctttct ccgcgccggc ccgcctcgct 300tatgcctcgg cgctgagccg
ctctcccgat tgcccgccga catgagctgc aacggaggct 360cccacccgcg
gatcaacact ctgggccgca tgatccgcgc cgagtctggc ccggacctgc
420gctacgaggt gaccagcggc ggcgggggca ccagcaggat gtactattct
cggcgcggcg 480tgatcaccga ccagaactcg gacggctact gtcaaaccgg
cacgatgtcc aggcaccaga 540accagaacac catccaggag ctgctgcaga
actgctccga ctgcttgatg cgagcagagc 600tcatcgtgca gcctgaattg
aagtatggag atggaataca actgactcgg agtcgagaat 660tggatgagtg
ttttgcccag gccaatgacc aaatggaaat cctcgacagc ttgatcagag
720agatgcggca gatgggccag ccctgtgatg cttaccagaa aaggcttctt
cagctccaag 780agcaaatgcg agccctttat aaagccatca gtgtccctcg
agtccgcagg gccagctcca 840agggtggtgg aggctacact tgtcagagtg
gctctggctg ggatgagttc accaaacatg 900tcaccagtga atgtttgggg
tggatgaggc agcaaagggc ggagatggac atggtggcct 960ggggtgtgga
cctggcctca gtggagcagc acattaacag ccaccggggc atccacaact
1020ccatcggcga ctatcgctgg cagctggaca aaatcaaagc cgacctgcgc
gagaaatctg 1080cgatctacca gttggaggag gagtatgaaa acctgctgaa
agcgtccttt gagaggatgg 1140atcacctgcg acagctgcag aacatcattc
aggccacgtc cagggagatc atgtggatca 1200atgactgcga ggaggaggag
ctgctgtacg actggagcga caagaacacc aacatcgctc 1260agaaacagga
ggccttctcc atacgcatga gtcaactgga agttaaagaa aaagagctca
1320ataagctgaa acaagaaagt gaccaacttg tcctcaatca gcatccagct
tcagacaaaa 1380ttgaggccta tatggacact ctgcagacgc agtggagttg
gattcttcag atcaccaagt 1440gcattgatgt tcatctgaaa gaaaatgctg
cctactttca gttttttgaa gaggcgcagt 1500ctactgaagc atacctgaag
gggctccagg actccatcag gaagaagtac ccctgcgaca 1560agaacatgcc
cctgcagcac ctgctggaac agatcaagga gctggagaaa gaacgagaga
1620aaatccttga atacaagcgt caggtgcaga acttggtaaa caagtctaag
aagattgtac 1680agctgaagcc tcgtaaccca gactacagaa gcaataaacc
cattattctc agagctctct 1740gtgactacaa acaagatcag aaaatcgtgc
ataaggggga tgagtgtatc ctgaaggaca 1800acaacgagcg cagcaagtgg
tacgtgacgg gcccgggagg cgttgacatg cttgttccct 1860ctgtggggct
gatcatccct cctccgaacc cactggccgt ggacctctct tgcaagattg
1920agcagtacta cgaagccatc ttggctctgt ggaaccagct ctacatcaac
atgaagagcc 1980tggtgtcctg gcactactgc atgattgaca tagagaagat
cagggccatg acaatcgcca 2040agctgaaaac aatgcggcag gaagattaca
tgaagacgat agccgacctt gagttacatt 2100accaagagtt catcagaaat
agccaaggct cagagatgtt tggagatgat gacaagcgga 2160aaatacagtc
tcagttcacc gatgcccaga agcattacca gaccctggtc attcagctcc
2220ctggctatcc ccagcaccag acagtgacca caactgaaat cactcatcat
ggaacctgcc 2280aagatgtcaa ccataataaa gtaattgaaa ccaacagaga
aaatgacaag caagaaacat 2340ggatgctgat ggagctgcag aagattcgca
ggcagataga gcactgcgag ggcaggatga 2400ctctcaaaaa cctccctcta
gcagaccagg gatcttctca ccacatcaca gtgaaaatta 2460acgagcttaa
gagtgtgcag aatgattcac aagcaattgc tgaggttctc aaccagctta
2520aagatatgct tgccaacttc agaggttctg aaaagtactg ctatttacag
aatgaagtat 2580ttggactatt tcagaaactg gaaaatatca atggtgttac
agatggctac ttaaatagct 2640tatgcacagt aagggcactg ctccaggcta
ttctccaaac agaagacatg ttaaaggttt 2700atgaagccag gctcactgag
gaggaaactg tctgcctgga cctggataaa gtggaagctt 2760accgctgtgg
actgaagaaa ataaaaaatg acttgaactt gaagaagtcg ttgttggcca
2820ctatgaagac agaactacag aaagcccagc agatccactc tcagacttca
cagcagtatc 2880cactttatga tctggacttg ggcaagttcg gtgaaaaagt
cacacagctg acagaccgct 2940ggcaaaggat agataaacag atcgacttta
ggttatggga cctggagaaa caaatcaagc 3000aattgaggaa ttatcgtgat
aactatcagg ctttctgcaa gtggctctat gatgctaaac 3060gccgccagga
ttccttagaa tccatgaaat ttggagattc caacacagtc atgcggtttt
3120tgaatgagca gaagaacttg cacagtgaaa tatctggcaa acgagacaaa
tcagaggaag 3180tacaaaaaat tgctgaactt tgcgccaatt caattaagga
ttatgagctc cagctggcct 3240catacacctc aggactggaa actctgctga
acatacctat caagaggacc atgattcagt 3300ccccttctgg ggtgattctg
caagaggctg cagatgttca tgctcggtac attgaactac 3360ttacaagatc
tggagactat tacaggttct taagtgagat gctgaagagt ttggaagatc
3420tgaagctgaa aaataccaag atcgaagttt tggaagagga gctcagactg
gcccgagatg 3480ccaactcgga aaactgtaat aagaacaaat tcctggatca
gaacctgcag aaataccagg 3540cagagtgttc ccagttcaaa gcgaagcttg
cgagcctgga ggagctgaag agacaggctg 3600agctggatgg gaagtcggct
aagcaaaatc tagacaagtg ctacggccaa ataaaagaac 3660tcaatgagaa
gatcacccga ctgacttatg agattgaaga tgaaaagaga agaagaaaat
3720ctgtggaaga cagatttgac caacagaaga atgactatga ccaactgcag
aaagcaaggc 3780aatgtgaaaa ggagaacctt ggttggcaga aattagagtc
tgagaaagcc atcaaggaga 3840aggagtacga gattgaaagg ttgagggttc
tactgcagga agaaggcacc cggaagagag 3900aatatgaaaa tgagctggca
aaggcatcta ataggattca ggaatcaaag aatcagtgta 3960ctcaggtggt
acaggaaaga gagagccttc tggtgaaaat caaagtcctg gagcaagaca
4020aggcaaggct gcagaggctg gaggatgagc tgaatcgtgc aaaatcaact
ctagaggcag 4080aaaccagggt gaaacagcgc ctggagtgtg agaaacagca
aattcagaat gacctgaatc 4140agtggaagac tcaatattcc cgcaaggagg
aggctattag gaagatagaa tcggaaagag 4200aaaagagtga gagagagaag
aacagtctta ggagtgagat cgaaagactc caagcagaga 4260tcaagagaat
tgaagagagg tgcaggcgta agctggagga ttctaccagg gagacacagt
4320cacagttaga aacagaacgc tcccgatatc agagggagat tgataaactc
agacagcgcc 4380catatgggtc ccatcgagag acccagactg agtgtgagtg
gaccgttgac acctccaagc 4440tggtgtttga tgggctgagg aagaaggtga
cagcaatgca gctctatgag tgtcagctga 4500tcgacaaaac aaccttggac
aaactattga aggggaagaa gtcagtggaa gaagttgctt 4560ctgaaatcca
gccattcctt cggggtgcag gatctatcgc tggagcatct gcttctccta
4620aggaaaaata ctctttggta gaggccaaga gaaagaaatt aatcagccca
gaatccacag 4680tcatgcttct ggaggcccag gcagctacag gtggtataat
tgatccccat cggaatgaga 4740agctgactgt cgacagtgcc atagctcggg
acctcattga cttcgatgac cgtcagcaga 4800tatatgcagc agaaaaagct
atcactggtt ttgatgatcc attttcaggc aagacagtat 4860ctgtttcaga
agccatcaag aaaaatttga ttgatagaga aaccggaatg cgcctgctgg
4920aagcccagat tgcttcaggg ggtgtagtag accctgtgaa cagtgtcttt
ttgccaaaag 4980atgtcgcctt ggcccggggg ctgattgata gagatttgta
tcgatccctg aatgatcccc 5040gagatagtca gaaaaacttt gtggatccag
tcaccaaaaa gaaggtcagt tacgtgcagc 5100tgaaggaacg gtgcagaatc
gaaccacata ctggtctgct cttgctttca gtacagaaga 5160gaagcatgtc
cttccaagga atcagacaac ctgtgaccgt cactgagcta gtagattctg
5220gtatattgag accgtccact gtcaatgaac tggaatctgg tcagatttct
tatgacgagg 5280ttggtgagag aattaaggac ttcctccagg gttcaagctg
catagcaggc atatacaatg 5340agaccacaaa acagaagctt ggcatttatg
aggccatgaa aattggctta gtccgacctg 5400gtactgctct ggagttgctg
gaagcccaag cagctactgg ctttatagtg gatcctgtta 5460gcaacttgag
gttaccagtg gaggaagcct acaagagagg tctggtgggc attgagttca
5520aagagaagct cctgtctgca gaacgagctg tcactgggta taatgatcct
gaaacaggaa 5580acatcatctc tttgttccaa gccatgaata aggaactcat
cgaaaagggc cacggtattc 5640gcttattaga agcacagatc gcaaccgggg
ggatcattga cccaaaggag agccatcgtt 5700taccagttga catagcatat
aagaggggct atttcaatga ggaactcagt gagattctct 5760cagatccaag
tgatgatacc aaaggatttt ttgaccccaa cactgaagaa aatcttacct
5820atctgcaact aaaagaaaga tgcattaagg atgaggaaac agggctctgt
cttctgcctc 5880tgaaagaaaa gaagaaacag gtgcagacat cacaaaagaa
taccctcagg aagcgtagag 5940tggtcatagt tgacccagaa accaataaag
aaatgtctgt tcaggaggcc tacaagaagg 6000gcctaattga ttatgaaacc
ttcaaagaac tgtgtgagca ggaatgtgaa tgggaagaaa 6060taaccatcac
gggatcagat ggctccacca gggtggtcct ggtagataga aagacaggca
6120gtcagtatga tattcaagat gctattgaca agggccttgt tgacaggaag
ttctttgatc 6180agtaccgatc cggcagcctc agcctcactc aatttgctga
catgatctcc ttgaaaaatg 6240gtgtcggcac cagcagcagc atgggcagtg
gtgtcagcga tgatgttttt agcagctccc 6300gacatgaatc agtaagtaag
atttccacca tatccagcgt caggaattta accataagga 6360gcagctcttt
ttcagacacc ctggaagaat cgagccccat tgcagccatc tttgacacag
6420aaaacctgga gaaaatctcc attacagaag gtatagagcg gggcatcgtt
gacagcatca 6480cgggtcagag gcttctggag gctcaggcct gcacaggtgg
catcatccac ccaaccacgg 6540gccagaagct gtcacttcag gacgcagtct
cccagggtgt gattgaccaa gacatggcca 6600ccaggctgaa gcctgctcag
aaagccttca taggcttcga gggtgtgaag ggaaagaaga 6660agatgtcagc
agcagaggca gtgaaagaaa aatggctccc gtatgaggct ggccagcgct
6720tcctggagtt ccagtacctc acgggaggtc ttgttgaccc ggaagtgcat
gggaggataa 6780gcaccgaaga agccatccgg aaggggttca tagatggccg
cgccgcacag aggctgcaag 6840acaccagcag ctatgccaaa atcctgacct
gccccaaaac caaattaaaa atatcctata 6900aggatgccat aaatcgctcc
atggtagaag atatcactgg gctgcgcctt ctggaagccg 6960cctccgtgtc
gtccaagggc ttacccagcc cttacaacat gtcttcggct ccggggtccc
7020gctccggctc ccgctcggga tctcgctccg gatctcgctc cgggtcccgc
agtgggtccc 7080ggagaggaag ctttgacgcc acagggaatt cttcctactc
ttattcctac tcatttagca 7140gtagttctat tgggcactag tagtcagttg
ggagtggttg ctataccttg acttcattta 7200tatgaatttc cactttatta
aataatagaa aagaaaatcc cggtgcttgc agtagagtga 7260taggacattc
tatgcttaca gaaaatatag ccatgattga aatcaaatag taaaggctgt
7320tctggctttt tatcttctta gctcatctta aataagcagt acacttggat
gcagtgcgtc 7380tgaagtgcta atcagttgta acaatagcac aaatcgaact
taggatttgt ttcttctctt 7440ctgtgtttcg atttttgatc aattctttaa
ttttggaagc ctataataca gttttctatt 7500cttggagata aaaattaaat
ggatcactga tattttagtc attctgcttc tcatctaaat 7560atttccatat
tctgtattag gagaaaatta ccctcccagc accagccccc ctctcaaacc
7620cccaacccaa aaccaagcat tttggaatga gtctccttta gtttcagagt
gtggattgta 7680taacccatat actcttcgat gtacttgttt ggtttggtat
taatttgact gtgcatgaca 7740gcggcaatct tttctttggt caaagttttc
tgtttatttt gcttgtcata ttcgatgtac 7800tttaaggtgt ctttatgaag
tttgctattc tggcaataaa cttttagact tttgaagtgt 7860ttgtgtttta
atttaatatg tttataagca tgtataaaca tttagcatat ttttatcata
7920ggtctaaaaa tatttgttta ctaaatacct gtgaagaaat accattaaaa
aactatttgg 7980ttctgaattc ttactagaaa aaaaa 8005
20 2272 PRT Homo sapiens 20
Met Ser Cys Asn Gly Gly Ser His Pro Arg Ile Asn Thr Leu
Gly Arg1 5 10 15Met Ile Arg Ala Glu Ser Gly Pro Asp Leu Arg Tyr Glu
Val Thr Ser 20 25 30Gly Gly Gly Gly Thr Ser Arg Met Tyr Tyr Ser Arg
Arg Gly Val Ile 35 40 45Thr Asp Gln Asn Ser Asp Gly Tyr Cys Gln Thr
Gly Thr Met Ser Arg 50 55 60His Gln Asn Gln Asn Thr Ile Gln Glu Leu
Leu Gln Asn Cys Ser Asp65 70 75 80Cys Leu Met Arg Ala Glu Leu Ile
Val Gln Pro Glu Leu Lys Tyr Gly 85 90 95Asp Gly Ile Gln Leu Thr Arg
Ser Arg Glu Leu Asp Glu Cys Phe Ala 100 105 110Gln Ala Asn Asp Gln
Met Glu Ile Leu Asp Ser Leu Ile Arg Glu Met 115 120 125Arg Gln Met
Gly Gln Pro Cys Asp Ala Tyr Gln Lys Arg Leu Leu Gln 130 135 140Leu
Gln Glu Gln Met Arg Ala Leu Tyr Lys Ala Ile Ser Val Pro Arg145 150
155 160Val Arg Arg Ala Ser Ser Lys Gly Gly Gly Gly Tyr Thr Cys Gln
Ser 165 170 175Gly Ser Gly Trp Asp Glu Phe Thr Lys His Val Thr Ser
Glu Cys Leu 180 185 190Gly Trp Met Arg Gln Gln Arg Ala Glu Met Asp
Met Val Ala Trp Gly 195 200 205Val Asp Leu Ala Ser Val Glu Gln His
Ile Asn Ser His Arg Gly Ile 210 215 220His Asn Ser Ile Gly Asp Tyr
Arg Trp Gln Leu Asp Lys Ile Lys Ala225 230 235 240Asp Leu Arg Glu
Lys Ser Ala Ile Tyr Gln Leu Glu Glu Glu Tyr Glu 245 250 255Asn Leu
Leu Lys Ala Ser Phe Glu Arg Met Asp His Leu Arg Gln Leu 260 265
270Gln Asn Ile Ile Gln Ala Thr Ser Arg Glu Ile Met Trp Ile Asn Asp
275 280 285Cys Glu Glu Glu Glu Leu Leu Tyr Asp Trp Ser Asp Lys Asn
Thr Asn 290 295 300Ile Ala Gln Lys Gln Glu Ala Phe Ser Ile Arg Met
Ser Gln Leu Glu305 310 315 320Val Lys Glu Lys Glu Leu Asn Lys Leu
Lys Gln Glu Ser Asp Gln Leu 325 330 335Val Leu Asn Gln His Pro Ala
Ser Asp Lys Ile Glu Ala Tyr Met Asp 340 345 350Thr Leu Gln Thr Gln
Trp Ser Trp Ile Leu Gln Ile Thr Lys Cys Ile 355 360 365Asp Val His
Leu Lys Glu Asn Ala Ala Tyr Phe Gln Phe Phe Glu Glu 370 375 380Ala
Gln Ser Thr Glu Ala Tyr Leu Lys Gly Leu Gln Asp Ser Ile Arg385 390
395 400Lys Lys Tyr Pro Cys Asp Lys Asn Met Pro Leu Gln His Leu Leu
Glu 405 410 415Gln Ile Lys Glu Leu Glu Lys Glu Arg Glu Lys Ile Leu
Glu Tyr Lys 420 425 430Arg Gln Val Gln Asn Leu Val Asn Lys Ser Lys
Lys Ile Val Gln Leu 435 440 445Lys Pro Arg Asn Pro Asp Tyr Arg Ser
Asn Lys Pro Ile Ile Leu Arg 450 455 460Ala Leu Cys Asp Tyr Lys Gln
Asp Gln Lys Ile Val His Lys Gly Asp465 470 475 480Glu Cys Ile Leu
Lys Asp Asn Asn Glu Arg Ser Lys Trp Tyr Val Thr 485 490 495Gly Pro
Gly Gly Val Asp Met Leu Val Pro Ser Val Gly Leu Ile Ile 500 505
510Pro Pro Pro Asn Pro Leu Ala Val Asp Leu Ser Cys Lys Ile Glu Gln 515 520
525Tyr Tyr Glu Ala Ile Leu Ala Leu Trp Asn Gln Leu Tyr Ile Asn Met
530 535 540Lys Ser Leu Val Ser Trp His Tyr Cys Met Ile Asp Ile Glu
Lys Ile545 550 555 560Arg Ala Met Thr Ile Ala Lys Leu Lys Thr Met
Arg Gln Glu Asp Tyr 565 570 575Met Lys Thr Ile Ala Asp Leu Glu Leu
His Tyr Gln Glu Phe Ile Arg 580 585 590Asn Ser Gln Gly Ser Glu Met
Phe Gly Asp Asp Asp Lys Arg Lys Ile 595 600 605Gln Ser Gln Phe Thr
Asp Ala Gln Lys His Tyr Gln Thr Leu Val Ile 610 615 620Gln Leu Pro
Gly Tyr Pro Gln His Gln Thr Val Thr Thr Thr Glu Ile625 630 635
640Thr His His Gly Thr Cys Gln Asp Val Asn His Asn Lys Val Ile Glu
645 650 655Thr Asn Arg Glu Asn Asp Lys Gln Glu Thr Trp Met Leu Met
Glu Leu 660 665 670Gln Lys Ile Arg Arg Gln Ile Glu His Cys Glu Gly
Arg Met Thr Leu 675 680 685Lys Asn Leu Pro Leu Ala Asp Gln Gly Ser
Ser His His Ile Thr Val 690 695 700Lys Ile Asn Glu Leu Lys Ser Val
Gln Asn Asp Ser Gln Ala Ile Ala705 710 715 720Glu Val Leu Asn Gln
Leu Lys Asp Met Leu Ala Asn Phe Arg Gly Ser 725 730 735Glu Lys Tyr
Cys Tyr Leu Gln Asn Glu Val Phe Gly Leu Phe Gln Lys 740 745 750Leu
Glu Asn Ile Asn Gly Val Thr Asp Gly Tyr Leu Asn Ser Leu Cys 755 760
765Thr Val Arg Ala Leu Leu Gln Ala Ile Leu Gln Thr Glu Asp Met Leu
770 775 780Lys Val Tyr Glu Ala Arg Leu Thr Glu Glu Glu Thr Val Cys
Leu Asp785 790 795 800Leu Asp Lys Val Glu Ala Tyr Arg Cys Gly Leu
Lys Lys Ile Lys Asn 805 810 815Asp Leu Asn Leu Lys Lys Ser Leu Leu
Ala Thr Met Lys Thr Glu Leu 820 825 830Gln Lys Ala Gln Gln Ile His
Ser Gln Thr Ser Gln Gln Tyr Pro Leu 835 840 845Tyr Asp Leu Asp Leu
Gly Lys Phe Gly Glu Lys Val Thr Gln Leu Thr 850 855 860Asp Arg Trp
Gln Arg Ile Asp Lys Gln Ile Asp Phe Arg Leu Trp Asp865 870 875
880Leu Glu Lys Gln Ile Lys Gln Leu Arg Asn Tyr Arg Asp Asn Tyr Gln
885 890 895Ala Phe Cys Lys Trp Leu Tyr Asp Ala Lys Arg Arg Gln Asp
Ser Leu 900 905 910Glu Ser Met Lys Phe Gly Asp Ser Asn Thr Val Met
Arg Phe Leu Asn 915 920 925Glu Gln Lys Asn Leu His Ser Glu Ile Ser
Gly Lys Arg Asp Lys Ser 930 935 940Glu Glu Val Gln Lys Ile Ala Glu
Leu Cys Ala Asn Ser Ile Lys Asp945 950 955 960Tyr Glu Leu Gln Leu
Ala Ser Tyr Thr Ser Gly Leu Glu Thr Leu Leu 965 970 975Asn Ile Pro
Ile Lys Arg Thr Met Ile Gln Ser Pro Ser Gly Val Ile 980 985 990Leu
Gln Glu Ala Ala Asp Val His Ala Arg Tyr Ile Glu Leu Leu Thr 995
1000 1005Arg Ser Gly Asp Tyr Tyr Arg Phe Leu Ser Glu Met Leu Lys
Ser 1010 1015 1020Leu Glu Asp Leu Lys Leu Lys Asn Thr Lys Ile Glu
Val Leu Glu 1025 1030 1035Glu Glu Leu Arg Leu Ala Arg Asp Ala Asn
Ser Glu Asn Cys Asn 1040 1045 1050Lys Asn Lys Phe Leu Asp Gln Asn
Leu Gln Lys Tyr Gln Ala Glu 1055 1060 1065Cys Ser Gln Phe Lys Ala
Lys Leu Ala Ser Leu Glu Glu Leu Lys 1070 1075 1080Arg Gln Ala Glu
Leu Asp Gly Lys Ser Ala Lys Gln Asn Leu Asp 1085 1090 1095Lys Cys
Tyr Gly Gln Ile Lys Glu Leu Asn Glu Lys Ile Thr Arg 1100 1105
1110Leu Thr Tyr Glu Ile Glu Asp Glu Lys Arg Arg Arg Lys Ser Val
1115 1120 1125Glu Asp Arg Phe Asp Gln Gln Lys Asn Asp Tyr Asp Gln
Leu Gln 1130 1135 1140Lys Ala Arg Gln Cys Glu Lys Glu Asn Leu Gly
Trp Gln Lys Leu 1145 1150 1155Glu Ser Glu Lys Ala Ile Lys Glu Lys
Glu Tyr Glu Ile Glu Arg 1160 1165 1170Leu Arg Val Leu Leu Gln Glu
Glu Gly Thr Arg Lys Arg Glu Tyr 1175 1180 1185Glu Asn Glu Leu Ala
Lys Ala Ser Asn Arg Ile Gln Glu Ser Lys 1190 1195 1200Asn Gln Cys
Thr Gln Val Val Gln Glu Arg Glu Ser Leu Leu Val 1205 1210 1215Lys
Ile Lys Val Leu Glu Gln Asp Lys Ala Arg Leu Gln Arg Leu 1220 1225
1230Glu Asp Glu Leu Asn Arg Ala Lys Ser Thr Leu Glu Ala Glu Thr
1235 1240 1245Arg Val Lys Gln Arg Leu Glu Cys Glu Lys Gln Gln Ile
Gln Asn 1250 1255 1260Asp Leu Asn Gln Trp Lys Thr Gln Tyr Ser Arg
Lys Glu Glu Ala 1265 1270 1275Ile Arg Lys Ile Glu Ser Glu Arg Glu
Lys Ser Glu Arg Glu Lys 1280 1285 1290Asn Ser Leu Arg Ser Glu Ile
Glu Arg Leu Gln Ala Glu Ile Lys 1295 1300 1305Arg Ile Glu Glu Arg
Cys Arg Arg Lys Leu Glu Asp Ser Thr Arg 1310 1315 1320Glu Thr Gln
Ser Gln Leu Glu Thr Glu Arg Ser Arg Tyr Gln Arg 1325 1330 1335Glu
Ile Asp Lys Leu Arg Gln Arg Pro Tyr Gly Ser His Arg Glu 1340 1345
1350Thr Gln Thr Glu Cys Glu Trp Thr Val Asp Thr Ser Lys Leu Val
1355 1360 1365Phe Asp Gly Leu Arg Lys Lys Val Thr Ala Met Gln Leu
Tyr Glu 1370 1375 1380Cys Gln Leu Ile Asp Lys Thr Thr Leu Asp Lys
Leu Leu Lys Gly 1385 1390 1395Lys Lys Ser Val Glu Glu Val Ala Ser
Glu Ile Gln Pro Phe Leu 1400 1405 1410Arg Gly Ala Gly Ser Ile Ala
Gly Ala Ser Ala Ser Pro Lys Glu 1415 1420 1425Lys Tyr Ser Leu Val
Glu Ala Lys Arg Lys Lys Leu Ile Ser Pro 1430 1435 1440Glu Ser Thr
Val Met Leu Leu Glu Ala Gln Ala Ala Thr Gly Gly 1445 1450 1455Ile
Ile Asp Pro His Arg Asn Glu Lys Leu Thr Val Asp Ser Ala 1460 1465
1470Ile Ala Arg Asp Leu Ile Asp Phe Asp Asp Arg Gln Gln Ile Tyr
1475 1480 1485Ala Ala Glu Lys Ala Ile Thr Gly Phe Asp Asp Pro Phe
Ser Gly 1490 1495 1500Lys Thr Val Ser Val Ser Glu Ala Ile Lys Lys
Asn Leu Ile Asp 1505 1510 1515Arg Glu Thr Gly Met Arg Leu Leu Glu
Ala Gln Ile Ala Ser Gly 1520 1525 1530Gly Val Val Asp Pro Val Asn
Ser Val Phe Leu Pro Lys Asp Val 1535 1540 1545Ala Leu Ala Arg Gly
Leu Ile Asp Arg Asp Leu Tyr Arg Ser Leu 1550 1555 1560Asn Asp Pro
Arg Asp Ser Gln Lys Asn Phe Val Asp Pro Val Thr 1565 1570 1575Lys
Lys Lys Val Ser Tyr Val Gln Leu Lys Glu Arg Cys Arg Ile 1580 1585
1590Glu Pro His Thr Gly Leu Leu Leu Leu Ser Val Gln Lys Arg Ser
1595 1600 1605Met Ser Phe Gln Gly Ile Arg Gln Pro Val Thr Val Thr
Glu Leu 1610 1615 1620Val Asp Ser Gly Ile Leu Arg Pro Ser Thr Val
Asn Glu Leu Glu 1625 1630 1635Ser Gly Gln Ile Ser Tyr Asp Glu Val
Gly Glu Arg Ile Lys Asp 1640 1645 1650Phe Leu Gln Gly Ser Ser Cys
Ile Ala Gly Ile Tyr Asn Glu Thr 1655 1660 1665Thr Lys Gln Lys Leu
Gly Ile Tyr Glu Ala Met Lys Ile Gly Leu 1670 1675 1680Val Arg Pro
Gly Thr Ala Leu Glu Leu Leu Glu Ala Gln Ala Ala 1685 1690 1695Thr
Gly Phe Ile Val Asp Pro Val Ser Asn Leu Arg Leu Pro Val 1700 1705
1710Glu Glu Ala Tyr Lys Arg Gly Leu Val Gly Ile Glu Phe Lys Glu
1715 1720 1725Lys Leu Leu Ser Ala Glu Arg Ala Val Thr Gly Tyr Asn
Asp Pro 1730 1735 1740Glu Thr Gly Asn Ile Ile Ser Leu Phe Gln Ala
Met Asn Lys Glu 1745 1750 1755Leu Ile Glu Lys Gly His Gly Ile Arg
Leu Leu Glu Ala Gln Ile 1760 1765 1770Ala Thr Gly Gly Ile Ile Asp
Pro Lys Glu Ser His Arg Leu Pro 1775 1780 1785Val Asp Ile Ala Tyr
Lys Arg Gly Tyr Phe Asn Glu Glu Leu Ser 1790 1795 1800Glu Ile Leu
Ser Asp Pro Ser Asp Asp Thr Lys Gly Phe Phe Asp 1805 1810 1815Pro
Asn Thr Glu Glu Asn Leu Thr Tyr Leu Gln Leu Lys Glu Arg 1820 1825
1830Cys Ile Lys Asp Glu Glu Thr Gly Leu Cys Leu Leu Pro Leu Lys
1835 1840 1845Glu Lys Lys Lys Gln Val Gln Thr Ser Gln Lys Asn Thr
Leu Arg 1850 1855 1860Lys Arg Arg Val Val Ile Val Asp Pro Glu Thr
Asn Lys Glu Met 1865 1870 1875Ser Val Gln Glu Ala Tyr Lys Lys Gly
Leu Ile Asp Tyr Glu Thr 1880 1885 1890Phe Lys Glu Leu Cys Glu Gln
Glu Cys Glu Trp Glu Glu Ile Thr 1895 1900 1905Ile Thr Gly Ser Asp
Gly Ser Thr Arg Val Val Leu Val Asp Arg 1910 1915 1920Lys Thr Gly
Ser Gln Tyr Asp Ile Gln Asp Ala Ile Asp Lys Gly 1925 1930 1935Leu
Val Asp Arg Lys Phe Phe Asp Gln Tyr Arg Ser Gly Ser Leu 1940 1945
1950Ser Leu Thr Gln Phe Ala Asp Met Ile Ser Leu Lys Asn Gly Val
1955 1960 1965Gly Thr Ser Ser Ser Met Gly Ser Gly Val Ser Asp Asp
Val Phe 1970 1975 1980Ser Ser Ser Arg His Glu Ser Val Ser Lys Ile
Ser Thr Ile Ser 1985 1990 1995Ser Val Arg Asn Leu Thr Ile Arg Ser
Ser Ser Phe Ser Asp Thr 2000 2005 2010Leu Glu Glu Ser Ser Pro Ile
Ala Ala Ile Phe Asp Thr Glu Asn 2015 2020 2025Leu Glu Lys Ile Ser
Ile Thr Glu Gly Ile Glu Arg Gly Ile Val 2030 2035 2040Asp Ser Ile
Thr Gly Gln Arg Leu Leu Glu Ala Gln Ala Cys Thr 2045 2050 2055Gly
Gly Ile Ile His Pro Thr Thr Gly Gln Lys Leu Ser Leu Gln 2060 2065
2070Asp Ala Val Ser Gln Gly Val Ile Asp Gln Asp Met Ala Thr Arg
2075 2080 2085Leu Lys Pro Ala Gln Lys Ala Phe Ile Gly Phe Glu Gly
Val Lys 2090 2095 2100Gly Lys Lys Lys Met Ser Ala Ala Glu Ala Val
Lys Glu Lys Trp 2105 2110 2115Leu Pro Tyr Glu Ala Gly Gln Arg Phe
Leu Glu Phe Gln Tyr Leu 2120 2125 2130Thr Gly Gly Leu Val Asp Pro
Glu Val His Gly Arg Ile Ser Thr 2135 2140 2145Glu Glu Ala Ile Arg
Lys Gly Phe Ile Asp Gly Arg Ala Ala Gln 2150 2155 2160Arg Leu Gln
Asp Thr Ser Ser Tyr Ala Lys Ile Leu Thr Cys Pro 2165 2170 2175Lys
Thr Lys Leu Lys Ile Ser Tyr Lys Asp Ala Ile Asn Arg Ser 2180 2185
2190Met Val Glu Asp Ile Thr Gly Leu Arg Leu Leu Glu Ala Ala Ser
2195 2200 2205Val Ser Ser Lys Gly Leu Pro Ser Pro Tyr Asn Met Ser
Ser Ala 2210 2215 2220Pro Gly Ser Arg Ser Gly Ser Arg Ser Gly Ser
Arg Ser Gly Ser 2225 2230 2235Arg Ser Gly Ser Arg Ser Gly Ser Arg
Arg Gly Ser Phe Asp Ala 2240 2245 2250Thr Gly Asn Ser Ser Tyr Ser
Tyr Ser Tyr Ser Phe Ser Ser Ser 2255 2260 2265Ser Ile Gly His 2270
21 1278 DNA Homo sapiens 21
ccattggcct gtagattcac ctcccctggg
cagggcccca ggacccagga taatatctgt 60gcctcctgcc cagaaccctc caagcagaca
caatggtaag aatggtgcct gtcctgctgt 120ctctgctgct gcttctgggt
cctgctgtcc cccaggagaa ccaagatggt cgttactctc 180tgacctatat
ctacactggg ctgtccaagc atgttgaaga cgtccccgcg tttcaggccc
240ttggctcact caatgacctc cagttcttta gatacaacag taaagacagg
aagtctcagc 300ccatgggact ctggagacag gtggaaggaa tggaggattg
gaagcaggac agccaacttc 360agaaggccag ggaggacatc tttatggaga
ccctgaaaga catcgtggag tattacaacg 420acagtaacgg gtctcacgta
ttgcagggaa ggtttggttg tgagatcgag aataacagaa 480gcagcggagc
attctggaaa tattactatg atggaaagga ctacattgaa ttcaacaaag
540aaatcccagc ctgggtcccc ttcgacccag cagcccagat aaccaagcag
aagtgggagg 600cagaaccagt ctacgtgcag cgggccaagg cttacctgga
ggaggagtgc cctgcgactc 660tgcggaaata cctgaaatac agcaaaaata
tcctggaccg gcaagatcct ccctctgtgg 720tggtcaccag ccaccaggcc
ccaggagaaa agaagaaact gaagtgcctg gcctacgact 780tctacccagg
gaaaattgat gtgcactgga ctcgggccgg cgaggtgcag gagcctgagt
840tacggggaga tgttcttcac aatggaaatg gcacttacca gtcctgggtg
gtggtggcag 900tgcccccgca ggacacagcc ccctactcct gccacgtgca
gcacagcagc ctggcccagc 960ccctcgtggt gccctgggag gccagctagg
aagcaagggt tggaggcaat gtgggatctc 1020agacccagta gctgcccttc
ctgcctgatg tgggagctga accacagaaa tcacagtcaa 1080tggatccaca
aggcctgagg agcagtgtgg ggggacagac aggaggtgga tttggagacc
1140gaagactggg atgcctgtct tgagtagact tggacccaaa aaatcatctc
accttgagcc 1200cacccccacc ccattgtcta atctgtagaa gctaataaat
aatcatccct ccttgcctag 1260cataaaaaaa aaaaaaaa 127822298PRTHomo
sapiens 22Met Val Arg Met Val Pro Val Leu Leu Ser Leu Leu Leu Leu
Leu Gly1 5 10 15Pro Ala Val Pro Gln Glu Asn Gln Asp Gly Arg Tyr Ser
Leu Thr Tyr 20 25 30Ile Tyr Thr Gly Leu Ser Lys His Val Glu Asp Val
Pro Ala Phe Gln 35 40 45Ala Leu Gly Ser Leu Asn Asp Leu Gln Phe Phe
Arg Tyr Asn Ser Lys 50 55 60Asp Arg Lys Ser Gln Pro Met Gly Leu Trp
Arg Gln Val Glu Gly Met65 70 75 80Glu Asp Trp Lys Gln Asp Ser Gln
Leu Gln Lys Ala Arg Glu Asp Ile 85 90 95Phe Met Glu Thr Leu Lys Asp
Ile Val Glu Tyr Tyr Asn Asp Ser Asn 100 105 110Gly Ser His Val Leu
Gln Gly Arg Phe Gly Cys Glu Ile Glu Asn Asn 115 120 125Arg Ser Ser
Gly Ala Phe Trp Lys Tyr Tyr Tyr Asp Gly Lys Asp Tyr 130 135 140Ile
Glu Phe Asn Lys Glu Ile Pro Ala Trp Val Pro Phe Asp Pro Ala145 150
155 160Ala Gln Ile Thr Lys Gln Lys Trp Glu Ala Glu Pro Val Tyr Val
Gln 165 170 175Arg Ala Lys Ala Tyr Leu Glu Glu Glu Cys Pro Ala Thr
Leu Arg Lys 180 185 190Tyr Leu Lys Tyr Ser Lys Asn Ile Leu Asp Arg
Gln Asp Pro Pro Ser 195 200 205Val Val Val Thr Ser His Gln Ala Pro
Gly Glu Lys Lys Lys Leu Lys 210 215 220Cys Leu Ala Tyr Asp Phe Tyr
Pro Gly Lys Ile Asp Val His Trp Thr225 230 235 240Arg Ala Gly Glu
Val Gln Glu Pro Glu Leu Arg Gly Asp Val Leu His 245 250 255Asn Gly
Asn Gly Thr Tyr Gln Ser Trp Val Val Val Ala Val Pro Pro 260 265
270Gln Asp Thr Ala Pro Tyr Ser Cys His Val Gln His Ser Ser Leu Ala
Gln Pro Leu Val Val Pro Trp Glu Ala Ser 290 295
23 6483 DNA Homo sapiens 23
aaatgcgctg gcggggagac cggggttggt
ccctggcggg gcagggggcg ggctcaggcc 60ggaactccag agacgacctc agccaactgc
tcctgcgccg ggcggggtcg tcgccgccag 120cggctccgag cgccggaagg
gccaggtctc agggctcctg gagctgcagg cggcgggagg 180ggctacaaat
gcttgactca gtgatgcaga acctttcaga gttagctgga agccacagcc
240ctgcctcttg atgcagcctg gatccagccg gtgtgaagag gagacccctt
ccctcttgtg 300gggtttggat cctgtgtttc tagcctttgc aaaactctac
atcagggata tcctggacat 360gaaggagtcc cgccaggtgc caggtgtatt
tttgtacaat ggacatccaa taaaacaggt 420agatgtcttg ggaactgtca
ttggagtgag agaaagagat gctttctaca gttatggagt 480ggatgacagc
actggagtta taaactgcat ctgctggaaa aagttgaata ctgagtctgt
540atcagctgct ccaagtgcag caagagagct cagcttaacc tcacaactta
agaagctaca 600agagaccatt gagcagaaaa caaagataga gatcggggac
acgatccgag tcagaggcag 660tatccgcaca tacagagaag agcgagagat
tcatgccacc acttactata aagtggacga 720cccagtgtgg aacattcaaa
ttgcaaggat gcttgagctg cccactatct acaggaaagt 780ttatgaccag
ccttttcaca gctcagccct agagaaagaa gaggcactaa gcaatccagg
840cgccctggac ctccccagtc tcacgagttt gctgagtgaa aaagccaaag
aattcctcat 900ggagaacaga gtgcagagct tttaccagca ggagctggaa
atggtggagt ctttgctgtc 960ccttgccaat cagcctgtga ttcacagtgc
ctcctccgac caagtgaatt ttaagaagga 1020caccacttcc aaggcaattc
atagtatatt taagaatgct atacaactgc tgcaggaaaa 1080aggacttgtt ttccagaaag atgatggttt
tgataaccta tactatgtaa ccagagaaga 1140caaagacctg cacagaaaga
tccaccggat cattcagcag gactgccaga aaccaaatca 1200catggagaag
ggctgtcact tcctgcacat cttggcctgt gctcgcctga gcatccgccc
1260gggcctgagc gaggctgtgc tgcagcaagt tctggagctc ctggaggacc
agagtgacat 1320tgtcagcaca atggagcact actacacagc gttctgagca
gagacacgca gaccagctga 1380ggaggacaaa gataaggtgg cattcacccc
caggctctga ctttcagcat catgcagggg 1440cttatctgtc tggaggcagt
tacctcataa taaactataa aatatagtca tcttgggaat 1500gggatttggc
ataaatgttg ttggctccct tctgtccact atgtccttgg tgtacaatga
1560ctttgatctc agccatgaca caacaagaaa accctccctg ttgagctcct
ggctggactg 1620tgcgttgttc gcagagcaga atggggagga aacagtgttg
gcagcttaac tgatgtgtgt 1680ggttggagtc tcttccatgg caaagggaca
ccacagggta gtgaacattc aggaactgag 1740gggcatatgg cctgatcaca
cagttctaag cttttcaaaa cttcaggtta tcagagacct 1800tcctgtgggc
ctctcttgct ggctaagaac cggtttaggg gagtagttct ccctggatga
1860gtgcttacag tttctgtggc tcagttacca gcagtggggt tgagacctgg
gtcgatgctc 1920tttacaggcc tgcccagaga tgggaataaa cagggatcca
cagcgtgact atgtgtttgt 1980cattttcctt ttatttcctt gggaatcgaa
aggtgtccca gtacatttcc ctgcacttac 2040agaggtgcat gactaaatac
attgtccctc gatgcccctg aagatcacgg aggcagtcag 2100ccaattgcct
ggcaggtggt agatgttatt ttcagggttg ccgctgagtg tgcaggatgt
2160gctgacacca tccagacaaa gactcggtat gtgcccagac aggtgatgga
gtcatgcttt 2220tgctcagaat gacaaggtaa aggaaaaaca tctgaggtat
gttgtaggcc tgttctgaca 2280gcaaaatgac aaatccagcc agcaaaaata
aagtgtggag aaagatttgg agttaattac 2340agtcatttca cagaaggcac
tgccttcgtc tgctgcattt gctcttgatg tgataagctc 2400ttcgtggctc
agctggagat cctttaggcc tggagagttg ctcctctctc cgtggaaaca
2460ggacagtctt tatacgcaga agtccgctgc agctcgatac gtcaggctga
gagctagaac 2520cagtagattg cctcctgtca tagacttttg taatgatgca
aacctttgct gatttctaac 2580agtgattatg tagtggctgc cctgcatctt
ctctgtgtac agaagggtcc ctagcataga 2640gtctgcctgg aatgatgtcc
tgggcagttc ttccttgagg tcagcagctg ttccacgttg 2700aatgcatctg
attagtgggg ctgcccagga aggagttcag aatcagaagg taaaaagggc
2760atacccttgc ctatagcaac tctgctctta ggggtttatc tcaaggagat
ggctacacaa 2820gtgtgaaagg atggttgcac aaggtgttca ttgctgtata
atctagaatt ctatattggg 2880gaaaatacct atagggaaaa agttaattac
ggttcttggg cacaatgaaa tactatgcag 2940ctatgaaaaa aatgatgaaa
gcagacagac agtgttgcca tggcacactg tccctagtag 3000atttagtggg
aagtagatag agttatagat ctgtttctat agtataacac cattatctac
3060agctccctgt gtgtatgtat atatccgtag agagagtgta tatttctgca
tggaggtctt 3120tataaatgta gcacatgtac atatatatat atatacacac
acacagtcga ccactccctt 3180ctcctggaag tactttccgc gtttggcttt
caggacacca agctctctgg ttgctccttc 3240tcaggttcct ttgttcagtg
ctctgcctcc ctgaggactc agtcccagac ctcttttcta 3300tctggcttgc
tcactggggt gtctccagca gccacatgga ttataccatc tacatgctgt
3360ctaacacctc agtttaaacc cagaatgggc ctcttccctg aactgcagac
ccctatattc 3420agtttgctac tgacatctcc acttaggtct ctaatggaca
tctcagattt cacaggccca 3480aagccaggct cccaattact cctgacccca
ggcttgctcc tgatagtgac atgaggcagc 3540caaatgccta ggcagagagg
ggagggtccc aaatgaaacc ccacgttcaa gcaaagatca 3600gcctgaaggc
taaaagacca gattgctggt cctggatgaa acccaccacg cagagtggga
3660acttctgttc ctgtttgccc accctttccc aattgttctt tctgaataac
gccttaacca 3720atcgaatgtt gccttttcca gtaataccta cagcctgccc
ctccccccat tctgagccca 3780taaaaagacc cagactcccc catattaagg
ggactttcct gcctttgggt agggggacca 3840cccccacgtc tcctctctgt
tgaaaactgt ttcatcactc aataaaactc ccagctttgc 3900tcactcttcc
actgtcagca cattctcatt cttctttggt gctgggcaag aactcaacca
3960gtgtggaagc catacttggc ccaggcgggt gaagtgggcg ggccgtctcc
tgcagcaggt 4020agcatggtca agcgaggccc aggtgggccg tcaccagcca
gaggtccctg gcttgcaaag 4080tgaccgagaa aaaaatcctg tgccactcct
ttggaaaatg tccctgattc aggaagaggt 4140agctccatcc agttgctcaa
accaaatcca ttggcttctt tctttctatc atacctcaca 4200tccaatctgt
ctgcaagtct tttggctcta ccttcagaat atctccagaa tcttaactgc
4260ttcaccctcc tccccggcct cctcagtcct ctctgcttcc gccctggccc
ctcttgggct 4320gttcacagca cagcagctgt tgccaccctg ttaatgctcc
cactctccta cagccttcgg 4380tcttgcccca ggtaggagcc tgaggctgca
cagaggtcag cacggccccg cttaccctgc 4440cctcccagcc cagccgcacg
ggccttgcac acatgcctcg gcatattcct gccttagggc 4500tggtgctcct
gctatttcct cttcccaggt aaccatgtga agtgcctccc tctgccctct
4560ttccagcctt tacttgagtg tcaccttctc agtgaggcct gccctcattc
ctctttcgct 4620gtttgcaacc catctcctgt cccccttccc agaactccct
ttcctacttc gtttttcttc 4680acagtacttg atactgccta acacactcca
tggtttctta cttgccctgt ttattatttt 4740cccccaatag acagaatgtt
ccatgatggc agaattctct gttttgtttc cttccatgtc 4800cccagcacct
agaacagtgc ctgacgcatc tcctaagcaa tacgaccaat aagtatgtgt
4860ctggctgcct tccggctgcc agtgtctgcc tctttcctag gggcagtggt
tgcgggggtg 4920ctttctcaca tgtcttagta ggctgtgcag gctggaagtg
ctcagaagtc acacccccag 4980ggagcagcct cagccaacag caccttggct
gtaaatgccc cagctccctc gccctcaggt 5040aagcattgct gaggcacacg
ttccatactc ttttccacag ttcctccgtg ggactgagca 5100ccacccagcc
acccacagga gcagctaacc tgataaccac cagcctcacc ctccctgcct
5160tacttccccg ctccccttta ccacatgctg acctcccaga tgcatttctt
gctttccggt 5220ctctgtctca ggattggctc ctggatgaac acaaactaac
actatgttca caaatatatt 5280tgggaaatgc tggatgaata attatacaca
tcagacagat tactagaaat tctcaccaaa 5340gggatgcaca tgttacctct
gcatggtgag atctcaggtg ctttttaccc cacatagcta 5400tcctttggca
tttttataat tagcaagtgc tcactcttcc actgtcagta cattctcatt
5460cttcttgggc gctggacaag aattcaaccg gtgtgtaagc cagactcggc
ccgggcagtc 5520tcaaactcct gactccttat ataatttcta caaaaattat
aaagctattt cccactcccc 5580accccacatt catgtaacct gaagcatgag
taaaccaaga atgaggtagg cctctgtctt 5640ctaagcaaca tcagaactct
aagaacatga gggactctta gaaaactctc tggagctaac 5700cacagctggg
tcactgctca tgtactgaag accagccaga gggttcccct gaaaaggagg
5760gaaactgagc aaacattctc cagttctctt agtgtgcaca tgtttcagga
ggtgtgaacc 5820ccacatgtag cttgtgtagg caagaagaca aatagtgcta
ctgtctggtc aaggatttgt 5880ttgaagagcc atgattatgc ccatatggta
agccaccagt gctccccatc cctgtaagac 5940acttctttct cattattttc
tcctctgatg gtgtgccagg atgctggcca agagaagcca 6000agtggaaaga
aggctgttca gtgacaagga acctaagact tagtgccaag gactgaaacc
6060aagtaaactt gtaattttcc atgatggaaa catctacact ttctcattag
tggcctctac 6120agcagttgcc ccaaagaagc gtctcattgt ttttttacta
catttatgtg aagcatacag 6180gcaaactcag aaagactgtg ataaggctcg
ccagagatgc ctgcacaggt gctgggggaa 6240aagcaggacc atcctgaagg
gagatggtgt ctgtggacaa agaactctgc agtggttctt 6300atttgcatga
tttctgctgg tggaggctgt aaatgtgagc tcaaactccc acataagtga
6360gttttcattg taatccagaa tgtttttaaa tcaccctact tctattgaac
ttgcactatc 6420atctgttaac ctctactgta tttattaaat aaacctgaat
aggtaaatca cagtacagca 6480aaa 6483
24 368 PRT Homo sapiens 24
Met Gln
Pro Gly Ser Ser Arg Cys Glu Glu Glu Thr Pro Ser Leu Leu1 5 10 15Trp
Gly Leu Asp Pro Val Phe Leu Ala Phe Ala Lys Leu Tyr Ile Arg 20 25
30Asp Ile Leu Asp Met Lys Glu Ser Arg Gln Val Pro Gly Val Phe Leu
35 40 45Tyr Asn Gly His Pro Ile Lys Gln Val Asp Val Leu Gly Thr Val
Ile 50 55 60Gly Val Arg Glu Arg Asp Ala Phe Tyr Ser Tyr Gly Val Asp
Asp Ser65 70 75 80Thr Gly Val Ile Asn Cys Ile Cys Trp Lys Lys Leu
Asn Thr Glu Ser 85 90 95Val Ser Ala Ala Pro Ser Ala Ala Arg Glu Leu
Ser Leu Thr Ser Gln 100 105 110Leu Lys Lys Leu Gln Glu Thr Ile Glu
Gln Lys Thr Lys Ile Glu Ile 115 120 125Gly Asp Thr Ile Arg Val Arg
Gly Ser Ile Arg Thr Tyr Arg Glu Glu 130 135 140Arg Glu Ile His Ala
Thr Thr Tyr Tyr Lys Val Asp Asp Pro Val Trp145 150 155 160Asn Ile
Gln Ile Ala Arg Met Leu Glu Leu Pro Thr Ile Tyr Arg Lys 165 170
175Val Tyr Asp Gln Pro Phe His Ser Ser Ala Leu Glu Lys Glu Glu Ala
180 185 190Leu Ser Asn Pro Gly Ala Leu Asp Leu Pro Ser Leu Thr Ser
Leu Leu 195 200 205Ser Glu Lys Ala Lys Glu Phe Leu Met Glu Asn Arg
Val Gln Ser Phe 210 215 220Tyr Gln Gln Glu Leu Glu Met Val Glu Ser
Leu Leu Ser Leu Ala Asn225 230 235 240Gln Pro Val Ile His Ser Ala
Ser Ser Asp Gln Val Asn Phe Lys Lys 245 250 255Asp Thr Thr Ser Lys
Ala Ile His Ser Ile Phe Lys Asn Ala Ile Gln 260 265 270Leu Leu Gln
Glu Lys Gly Leu Val Phe Gln Lys Asp Asp Gly Phe Asp 275 280 285Asn
Leu Tyr Tyr Val Thr Arg Glu Asp Lys Asp Leu His Arg Lys Ile 290 295
300His Arg Ile Ile Gln Gln Asp Cys Gln Lys Pro Asn His Met Glu
Lys305 310 315 320Gly Cys His Phe Leu His Ile Leu Ala Cys Ala Arg
Leu Ser Ile Arg 325 330 335Pro Gly Leu Ser Glu Ala Val Leu Gln Gln
Val Leu Glu Leu Leu Glu 340 345 350Asp Gln Ser Asp Ile Val Ser Thr
Met Glu His Tyr Tyr Thr Ala Phe 355 360 365
25 8795 DNA Homo sapiens 25
gcggccgcac tagtaccccg gagcccatgg gcgcgccgag ccgggcgcgg gggcgctgaa
60cggcggagcg ggagcggccg gaggagccat ggactgcagc ctcgtgcgga cgctcgtgca
120cagatactgt gcaggagaag agaattgggt ggacagcagg accatctacg
tgggacacag 180ggagccacct ccgggcgcag aggcctacat cccacagaga
tacccagaca acaggatcgt 240ctcgtccaag tacacatttt ggaactttat
acccaagaat ttatttgaac aattcagaag 300agtagccaac ttttatttcc
ttatcatatt tctggtgcag ttgattattg atacacccac 360aagtccagtg
acaagcggac ttccactctt ctttgtcatt actgtgacgg ctatcaaaca
420gggttatgaa gactggcttc gacataaagc agacaatgcc atgaaccagt
gtcctgttca 480tttcattcag cacggcaagc tcgttcggaa acaaagtcga
aagctgcgag ttggggacat 540tgtcatggtt aaggaggacg agacctttcc
ctgcgacttg atcttccttt ccagcaaccg 600gggagatggg acgtgccacg
tcaccaccgc cagcttggat ggagaatcca gccataaaac 660gcattacgcg
gtccaggaca ccaaaggctt ccacacagag gaggatatcg gcggacttca
720cgccaccatc gagtgtgagc agccccagcc cgacctctac aagttcgtgg
gtcgcatcaa 780cgtttacagt gacctgaatg accccgtggt gaggccctta
ggatcggaaa acctgctgct 840tagaggagct acactgaaga acactgagaa
aatctttggt gtggctattt acacgggaat 900ggaaaccaag atggcattaa
attatcaatc aaaatctcag aagcgatctg ccgtggaaaa 960atcgatgaat
gcgttcctca ttgtgtatct ctgcattctg atcagcaaag ccctgataaa
1020cactgtgctg aaatacatgt ggcagagtga gccctttcgg gatgagccgt
ggtataatca 1080gaaaacggag tcggaaaggc agaggaatct gttcctcaag
gcattcacgg acttcctggc 1140cttcatggtc ctctttaact acatcatccc
tgtgtccatg tacgtcacgg tcgagatgca 1200gaagttcctc ggctcttact
tcatcacctg ggacgaagac atgtttgacg aggagactgg 1260cgaggggcct
ctggtgaaca cgtcggacct caatgaagag ctgggacagg tggagtacat
1320cttcacagac aagaccggca ccctcacgga aaacaacatg gagttcaagg
agtgctgcat 1380cgaaggccat gtctacgtgc cccacgtcat ctgcaacggg
caggtcctcc cagagtcgtc 1440aggaatcgac atgattgact cgtcccccag
cgtcaacggg agggagcgcg aggagctgtt 1500tttccgggcc ctctgtctct
gccacaccgt ccaggtgaaa gacgatgaca gcgtagacgg 1560ccccaggaaa
tcgccggacg gggggaaatc ctgtgtgtac atctcatcct cgcccgacga
1620ggtggcgctg gtcgaaggtg tccagagact tggctttacc tacctaaggc
tgaaggacaa 1680ttacatggag atattaaaca gggagaacca catcgaaagg
tttgaattgc tggaaatttt 1740gagttttgac tcagtcagaa ggagaatgag
tgtaattgta aaatctgcta caggagaaat 1800ttatctgttt tgcaaaggag
cagattcttc gatattcccc cgagtgatag aaggcaaagt 1860tgaccagatc
cgagccagag tggagcgtaa cgcagtggag gggctccgaa ctttgtgtgt
1920tgcttataaa aggctgatcc aagaagaata tgaaggcatt tgtaagctgc
tgcaggctgc 1980caaagtggcc cttcaagatc gagagaaaaa gttagcagaa
gcctatgagc aaatagagaa 2040agatcttact ctgcttggtg ctacagctgt
tgaggaccgg ctgcaggaga aagctgcaga 2100caccatcgag gccctgcaga
aggccgggat caaagtctgg gttctcacgg gagacaagat 2160ggagacggcc
gcggccacgt gctacgcctg caagctcttc cgcaggaaca cgcagctgct
2220ggagctgacc accaagagga tcgaggagca gagcctgcac gacgtcctgt
tcgagctgag 2280caagacggtc ctgcgccaca gcgggagcct gaccagagac
aacctgtccg gactttcagc 2340agatatgcag gactacggtt taattatcga
cggagctgca ctgtctctga taatgaagcc 2400tcgagaagac gggagttccg
gcaactacag ggagctcttc ctggaaatct gccggagctg 2460cagcgcggtg
ctctgctgcc gcatggcgcc cttgcagaag gctcagattg ttaaattaat
2520caaattttca aaagagcacc caatcacgtt agcaattggc gatggtgcaa
atgatgtcag 2580catgattctg gaagcgcacg tgggcatagg tgtcatcggc
aaggaaggcc gccaggctgc 2640caggaacagc gactatgcaa tcccaaagtt
taagcatttg aagaagatgc tgcttgttca 2700cgggcatttt tattacatta
ggatctctga gctcgtgcag tacttcttct ataagaacgt 2760ctgcttcatc
ttccctcagt ttttatacca gttcttctgt gggttttcac aacagacttt
2820gtacgacacc gcgtatctga ccctctacaa catcagcttc acctccctcc
ccatcctcct 2880gtacagcctc atggagcagc atgttggcat tgacgtgctc
aagagagacc cgaccctgta 2940cagggacgtc gccaagaatg ccctgctgcg
ctggcgcgtg ttcatctact ggacgctcct 3000gggactgttt gacgcactgg
tgttcttctt tggtgcttat ttcgtgtttg aaaatacaac 3060tgtgacaagc
aacgggcaga tatttggaaa ctggacgttt ggaacgctgg tattcaccgt
3120gatggtgttc acagttacac taaagcttgc attggacaca cactactgga
cttggatcaa 3180ccattttgtc atctgggggt cgctgctgtt ctacgttgtc
ttttcgcttc tctggggagg 3240agtgatctgg ccgttcctca actaccagag
gatgtactac gtgttcatcc agatgctgtc 3300cagcgggccc gcctggctgg
ccatcgtgct gctggtgacc atcagcctcc ttcccgacgt 3360cctcaagaaa
gtcctgtgcc ggcagctgtg gccaacagca acagagagag tccagactaa
3420gagccagtgc ctttctgtcg agcagtcaac catctttatg ctttctcaga
cttccagcag 3480cctgagtttc tgatggaaca agagcccagg ctaccagagc
acctgtccct cggccgcctg 3540gtacagctcc cactctcagc aggtgacact
cgcggcctgg aaggagaagg tgtccacgga 3600gcccccaccc atcctcggcg
gttcccatca ccactgcagt tccatcccaa gtcacagctg 3660ccctaggtcc
cgtgtgggaa tgctcgtgtg atggatggtc ctaagcctgt ggagactgtg
3720cacgtgcctc ttcctggccc ccagcaggca aggagggggg tcacaggcct
tgccctcgag 3780catggcaccc tggccgcctg gacccagcac tgtggttgtt
gagccacacc agtggcctct 3840gggcattcgg ctcaacgcag gagggacatt
ctgctggccc accctgcgcg ctgtcatgca 3900gaggccattc ccccaggcct
gtgtcttcac ccacctgcca tcattggcct ttgctgtcac 3960tgggagagaa
gagccgtcca gggacccatg gtggcccaca tgtggatgcc acatgctgct
4020gtttcctgct tgcccggcca ccacccatgc cctccatagg gtgaggtgga
gccatggtgg 4080tgcgtccttt actcaacaac cctccaatcc ggatgctgtg
ggaagggccg ggtcactcgg 4140ataccatcat ccctgcggat gcaccgccgt
accctgctca tctgggagtg gtttccctgc 4200ggttacgtcc aagcccgcct
gccctgtgtg ttggggctgg ctgagtttcg gtctccccat 4260caccggccgc
ctcgtggaga aggcagtgcc acgtgggagg acaaggccac gccggcagct
4320tccagccctg ccgcagaagt gccaggatgt ccatcagcca ctcgccaggg
cacggagccg 4380tcagtccact gttacgggag aatgttgatt tcgcgggtgc
gagggccggg agacagatac 4440ttggctgtga tgagcagaca tcctctgtcc
ccgtggaggg gtcaacacca aggtggtgtt 4500cgtgcaccag aacctgtctc
gggctgacgg gggtggcaca caggacacgg gtggatccca 4560acaggcagca
ccgcacctct gcccgcctcc cgcactgcag ctccgcccgc cgggctctgc
4620gtccccacgt cccctcgtcc catccccacg tcccctcatc ccgtcacctc
gtccccacat 4680ccccttgccc cgtcacctcg tcctcatgtc cccttgtcct
gtcacctcgt ccccacgtcc 4740cctcgtctcc tcatccccac gtcctctcgt
ccccttgtcc cgtccccaca taccctcgtc 4800cccatgtccc cacgcagggc
tctccttcgt cttaggatct gtccagcgct gctctgggtg 4860ggttagcaac
cccagggctg ctgtgatagg aagtccctgt tgttctccgt actggcattt
4920ctatttctag aaataatatt tgacatagcc ttaatggtcc ttaaagaaga
catttcagtg 4980tgagattcag acttcagacg ctgaaactgc tgcctttcag
gaaagcacca ccaacgctgg 5040aggaggagcc ggccctcacg cccgccccgc
gccacgctgt ggaacggggc tccggcaagt 5100gaaacccaga gggtgtttcc
gaggtgctcg acagtaggta tttttggaag ctcagatttc 5160accatttgat
tgtataatct tttacctata aaatatttat ttgaagtaga gggtaaatca
5220gcggtaagaa cagtgaacac agtggttggg ataaaataag gtgacaaaca
tcacaccaaa 5280gatgagggta gcgagcaact ggcttgagca gacagaacgg
ggaagactcc actctgtccc 5340gaggggccag ccgcaggcgt ccccagggcc
accctgccct gaggtccttg tgtggccgcc 5400ctggcttggc agccctgccc
acgctgcccc cgcaaacaat ggtgtgtgcg tttttacagc 5460cctttttagg
aacccaatat gggcataaat gtaacacctg tagcgggggc agattctctg
5520tatgttcagt taacaaatta tttgtaatgt atttttttag aaatcttaaa
attgcctttg 5580cactgaagta ttttcatagc tgtttatatc tcttttattc
atttatttaa catactgtct 5640aattttaaaa ataggttttt aaagctttca
tttttaagtt tatgaaattt tggccacttt 5700acatttagat tctggtgaga
gttttgactg aatgttccaa tctctgatga atgcgaattt 5760tcagatttga
ttttattctc tacacacacc tcttcttttc ttggtatttc tggtggcagt
5820gattagttga acagcacatt taaggcacga taatttgcta cactttttct
ttacaatttg 5880ttgcaatttc atctgctttc tatgtttcat tgttaattgc
catccttcag ccttaaaaat 5940agaagattct cacgtgaagg tttagtaagt
tgggtcccag ctctgcctgt gtggagatag 6000tcaccatgta cctctgacaa
caagttttag tgtgaaagtc actaaacttt tacacactcc 6060caaacgtctt
tttaaaaatt gcttgggaaa ttattaaatg aatgtgcctg atgatttgaa
6120atagacaagg ggcacgagat aaaaaagaaa aggatgagaa gatcctcagt
gaatgacgtt 6180gcagggtctt catgcaattt tccacctcgc agtagttagt
atttacttgc cttaaactaa 6240ctttgaagca agtaatgtca actttgagca
ctttgttgag ttttgaaaaa tcttatttgt 6300tgctgcacag gttaataaat
tatcaatttg taattcagca tgttggtcag agacacggtc 6360actgattcac
acccagtccc tgccacagac cgtctcagac acgcacagtg ggcctgctgc
6420atgattcaca cccagtccct gccacagacc gtctcagaca cgcacagtgg
gcctgctgca 6480tgattcacac ccagtccctg ccacagaccg tctcagacac
gcacagtggg cctgctgcat 6540gcgtgttacc tggcttttgg ctccacgctc
actcatagcc atgtccacat gggggcttgc 6600acacaggatc actcacatat
gtacatgtac ccaccacaaa cgtgcaagct cctgcacaca 6660tgcatgcaca
caaacgtgta cacaagtgtg agctcctaca cgcatacaca cacacacgtg
6720tacatgcacc aaagcatgtg tgacctacag acatgcagaa catgcacgtg
tacacatacc 6780acagacacgc gtgtgcatgc tcctacacaa tacatatgca
catatcatga acagcgtaag 6840ttcctacaca cggacgtgtg atacacacat
gcatgtacag gtaagcacac atgtacaagc 6900tcctacaggc ttgctctcac
acacgtgtat gcacagcaga gagacgtatg agcttctact 6960gcacacatgc
acacacacac gcacacgtac attcactaca aacgtgcagc ctcctgcaca
7020cgtgcacatt catgtgtaca ccacaaatga gttcccagac gtgtaaacac
acgtgcacac 7080atcgtacaca tgtgagctcc cacacgtaca cacagatgca
catggacaca ccccaaacac 7140gcacaggctc ctacacacat gcacacacgt
gtacaccaca aacgagctcc cagacatgta 7200aacacacgtc tcccacacgt
gagctcccac acgtacacat gcacatgtac gcaccacaaa 7260cacatgcgca
ggctcctgca ggcgtgaata cacacatgca
cacacatata cacacatgtg 7320ccacaaacaa gtgcacactg tcctggtgtc
ctgcactgca tcctgcctcc ttgctgaggg 7380gcccctgtga gaggcctctg
gatgggcatg ggaagatggg ctccctggcc cccagcccat 7440gcctccctgg
gatgaagagt ccccctcctg gcagaatgtc tgggctttgc agagcaggcc
7500ccgggggtga agtcgcagct tcacttacac cagctgctct gtgagcaagg
cttggtgccc 7560tggacaaggc ccttcccctt tagggaggtc cagcctcgca
agctgaaacc tcccctcggc 7620tcagccctat accaggcggc cacagcagga
ctggccacac ccacgccgca cctcatccgt 7680gcacgcgtcg gagcacggcc
agccttccgc cacgagccag ctgggaaggg ccgcggccgc 7740ctaaagcccc
agtcaaccca gcctgtgtct gagcagacag ggcgaacaag caggccacac
7800cgtctcgagg gaggaggcca gatgcggcca gcgtctccaa cagggtgacc
atccgctcgg 7860cttgctgagc gtttaaacaa atgtttagac aggctgtggg
gactcccctg agttgagcct 7920tggccagggg tccggtgctg tcgcgggaaa
cctccagcct tgttcttcaa accactcagc 7980tcatgtgttt tgcactgact
agtactgaat aatacaacca ctcttattta atgttagtat 8040tatttatttg
acaactcagt gtctaacagc ttgatatgca ggtccttgca tcctacattt
8100ctttaggaag ttacccattt gtaactttaa aaacaggaaa aatatcagtt
ggcaaatgca 8160atcttttttt tttttaagct aaaggtgggt gaactggaat
gaaaatcttt ctgatgttgt 8220gtctataagc agccttgatg ggatatgtta
gaagtgtcat gaaagtgtga ttctactttt 8280gcagaaaaat ctaaagatca
atttatatag ctttattttt tactttatca aagtatacag 8340aattttaata
tgcatatatt gtgtctgact taaaattata atgtctgcgt caccatttaa
8400aatgtctgtt cattatgtaa tgtaataaaa gaaggtcttc aaaaatgtat
ttaacatgaa 8460tggtatccat agttgtcatc atcataaata ctggagttta
tttttaaatt attaaacata 8520gtaggtgcat taacataaat cagtctccac
acagtaacat ttaactgata attcattaat 8580cagctttgaa aaattaaatt
gttaattaaa ccaatctaac atttcagtaa agtttatttt 8640gtatgcttct
gtttttaact tttatttctg tagataaact gactggataa tattatattg
8700gacttttctc tagattatct aagcaggaga cctgaatctg cttgcaataa
agaataaaag 8760tctgcttcag tttctttata aagaaactca cacaa
8795
26 1134 PRT Homo sapiens
26 Met Asp Cys Ser Leu Val Arg Thr Leu Val
His Arg Tyr Cys Ala Gly1 5 10 15Glu Glu Asn Trp Val Asp Ser Arg Thr
Ile Tyr Val Gly His Arg Glu 20 25 30Pro Pro Pro Gly Ala Glu Ala Tyr
Ile Pro Gln Arg Tyr Pro Asp Asn 35 40 45Arg Ile Val Ser Ser Lys Tyr
Thr Phe Trp Asn Phe Ile Pro Lys Asn 50 55 60Leu Phe Glu Gln Phe Arg
Arg Val Ala Asn Phe Tyr Phe Leu Ile Ile65 70 75 80Phe Leu Val Gln
Leu Ile Ile Asp Thr Pro Thr Ser Pro Val Thr Ser 85 90 95Gly Leu Pro
Leu Phe Phe Val Ile Thr Val Thr Ala Ile Lys Gln Gly 100 105 110Tyr
Glu Asp Trp Leu Arg His Lys Ala Asp Asn Ala Met Asn Gln Cys 115 120
125Pro Val His Phe Ile Gln His Gly Lys Leu Val Arg Lys Gln Ser Arg
130 135 140Lys Leu Arg Val Gly Asp Ile Val Met Val Lys Glu Asp Glu
Thr Phe145 150 155 160Pro Cys Asp Leu Ile Phe Leu Ser Ser Asn Arg
Gly Asp Gly Thr Cys 165 170 175His Val Thr Thr Ala Ser Leu Asp Gly
Glu Ser Ser His Lys Thr His 180 185 190Tyr Ala Val Gln Asp Thr Lys
Gly Phe His Thr Glu Glu Asp Ile Gly 195 200 205Gly Leu His Ala Thr
Ile Glu Cys Glu Gln Pro Gln Pro Asp Leu Tyr 210 215 220Lys Phe Val
Gly Arg Ile Asn Val Tyr Ser Asp Leu Asn Asp Pro Val225 230 235
240Val Arg Pro Leu Gly Ser Glu Asn Leu Leu Leu Arg Gly Ala Thr Leu
245 250 255Lys Asn Thr Glu Lys Ile Phe Gly Val Ala Ile Tyr Thr Gly
Met Glu 260 265 270Thr Lys Met Ala Leu Asn Tyr Gln Ser Lys Ser Gln
Lys Arg Ser Ala 275 280 285Val Glu Lys Ser Met Asn Ala Phe Leu Ile
Val Tyr Leu Cys Ile Leu 290 295 300Ile Ser Lys Ala Leu Ile Asn Thr
Val Leu Lys Tyr Met Trp Gln Ser305 310 315 320Glu Pro Phe Arg Asp
Glu Pro Trp Tyr Asn Gln Lys Thr Glu Ser Glu 325 330 335Arg Gln Arg
Asn Leu Phe Leu Lys Ala Phe Thr Asp Phe Leu Ala Phe 340 345 350Met
Val Leu Phe Asn Tyr Ile Ile Pro Val Ser Met Tyr Val Thr Val 355 360
365Glu Met Gln Lys Phe Leu Gly Ser Tyr Phe Ile Thr Trp Asp Glu Asp
370 375 380Met Phe Asp Glu Glu Thr Gly Glu Gly Pro Leu Val Asn Thr
Ser Asp385 390 395 400Leu Asn Glu Glu Leu Gly Gln Val Glu Tyr Ile
Phe Thr Asp Lys Thr 405 410 415Gly Thr Leu Thr Glu Asn Asn Met Glu
Phe Lys Glu Cys Cys Ile Glu 420 425 430Gly His Val Tyr Val Pro His
Val Ile Cys Asn Gly Gln Val Leu Pro 435 440 445Glu Ser Ser Gly Ile
Asp Met Ile Asp Ser Ser Pro Ser Val Asn Gly 450 455 460Arg Glu Arg
Glu Glu Leu Phe Phe Arg Ala Leu Cys Leu Cys His Thr465 470 475
480Val Gln Val Lys Asp Asp Asp Ser Val Asp Gly Pro Arg Lys Ser Pro
485 490 495Asp Gly Gly Lys Ser Cys Val Tyr Ile Ser Ser Ser Pro Asp
Glu Val 500 505 510Ala Leu Val Glu Gly Val Gln Arg Leu Gly Phe Thr
Tyr Leu Arg Leu 515 520 525Lys Asp Asn Tyr Met Glu Ile Leu Asn Arg
Glu Asn His Ile Glu Arg 530 535 540Phe Glu Leu Leu Glu Ile Leu Ser
Phe Asp Ser Val Arg Arg Arg Met545 550 555 560Ser Val Ile Val Lys
Ser Ala Thr Gly Glu Ile Tyr Leu Phe Cys Lys 565 570 575Gly Ala Asp
Ser Ser Ile Phe Pro Arg Val Ile Glu Gly Lys Val Asp 580 585 590Gln
Ile Arg Ala Arg Val Glu Arg Asn Ala Val Glu Gly Leu Arg Thr 595 600
605Leu Cys Val Ala Tyr Lys Arg Leu Ile Gln Glu Glu Tyr Glu Gly Ile
610 615 620Cys Lys Leu Leu Gln Ala Ala Lys Val Ala Leu Gln Asp Arg
Glu Lys625 630 635 640Lys Leu Ala Glu Ala Tyr Glu Gln Ile Glu Lys
Asp Leu Thr Leu Leu 645 650 655Gly Ala Thr Ala Val Glu Asp Arg Leu
Gln Glu Lys Ala Ala Asp Thr 660 665 670Ile Glu Ala Leu Gln Lys Ala
Gly Ile Lys Val Trp Val Leu Thr Gly 675 680 685Asp Lys Met Glu Thr
Ala Ala Ala Thr Cys Tyr Ala Cys Lys Leu Phe 690 695 700Arg Arg Asn
Thr Gln Leu Leu Glu Leu Thr Thr Lys Arg Ile Glu Glu705 710 715
720Gln Ser Leu His Asp Val Leu Phe Glu Leu Ser Lys Thr Val Leu Arg
725 730 735His Ser Gly Ser Leu Thr Arg Asp Asn Leu Ser Gly Leu Ser
Ala Asp 740 745 750Met Gln Asp Tyr Gly Leu Ile Ile Asp Gly Ala Ala
Leu Ser Leu Ile 755 760 765Met Lys Pro Arg Glu Asp Gly Ser Ser Gly
Asn Tyr Arg Glu Leu Phe 770 775 780Leu Glu Ile Cys Arg Ser Cys Ser
Ala Val Leu Cys Cys Arg Met Ala785 790 795 800Pro Leu Gln Lys Ala
Gln Ile Val Lys Leu Ile Lys Phe Ser Lys Glu 805 810 815His Pro Ile
Thr Leu Ala Ile Gly Asp Gly Ala Asn Asp Val Ser Met 820 825 830Ile
Leu Glu Ala His Val Gly Ile Gly Val Ile Gly Lys Glu Gly Arg 835 840
845Gln Ala Ala Arg Asn Ser Asp Tyr Ala Ile Pro Lys Phe Lys His Leu
850 855 860Lys Lys Met Leu Leu Val His Gly His Phe Tyr Tyr Ile Arg
Ile Ser865 870 875 880Glu Leu Val Gln Tyr Phe Phe Tyr Lys Asn Val
Cys Phe Ile Phe Pro 885 890 895Gln Phe Leu Tyr Gln Phe Phe Cys Gly
Phe Ser Gln Gln Thr Leu Tyr 900 905 910Asp Thr Ala Tyr Leu Thr Leu
Tyr Asn Ile Ser Phe Thr Ser Leu Pro 915 920 925Ile Leu Leu Tyr Ser
Leu Met Glu Gln His Val Gly Ile Asp Val Leu 930 935 940Lys Arg Asp
Pro Thr Leu Tyr Arg Asp Val Ala Lys Asn Ala Leu Leu945 950 955
960Arg Trp Arg Val Phe Ile Tyr Trp Thr Leu Leu Gly Leu Phe Asp Ala
965 970 975Leu Val Phe Phe Phe Gly Ala Tyr Phe Val Phe Glu Asn Thr
Thr Val 980 985 990Thr Ser Asn Gly Gln Ile Phe Gly Asn Trp Thr Phe
Gly Thr Leu Val 995 1000 1005Phe Thr Val Met Val Phe Thr Val Thr
Leu Lys Leu Ala Leu Asp 1010 1015 1020Thr His Tyr Trp Thr Trp Ile
Asn His Phe Val Ile Trp Gly Ser 1025 1030 1035Leu Leu Phe Tyr Val
Val Phe Ser Leu Leu Trp Gly Gly Val Ile 1040 1045 1050Trp Pro Phe
Leu Asn Tyr Gln Arg Met Tyr Tyr Val Phe Ile Gln 1055 1060 1065Met
Leu Ser Ser Gly Pro Ala Trp Leu Ala Ile Val Leu Leu Val 1070 1075
1080Thr Ile Ser Leu Leu Pro Asp Val Leu Lys Lys Val Leu Cys Arg
1085 1090 1095Gln Leu Trp Pro Thr Ala Thr Glu Arg Val Gln Thr Lys
Ser Gln 1100 1105 1110Cys Leu Ser Val Glu Gln Ser Thr Ile Phe Met
Leu Ser Gln Thr 1115 1120 1125Ser Ser Ser Leu Ser Phe
1130
27 4575 DNA Homo sapiens
27 tttccgcagt taggggctgc tatttcaacg
cagggagata aaaagaaaaa aacacttgct 60cttctacccc gctaaaaaca ctcatcctag
ggagcacgcc agcatttgca gcgttcgggg 120cagggccact cggcctgcgg
ccgttgcact ggctggaagc tggcaggcga tcacggttga 180ttggctcggg
tgcggtccaa gggcagcaac gccttcggcg ggccgcctag ggtgattggc
240tgctgcagcc caccccctag ccggtttggt gggcggcgaa gcctggattg
gtggagctaa 300gagctggctc agtttcagcg ctggctcttc gtgcatggca
gagatggcga ctgcgactcg 360gctgctgggg tggcgtgtgg cgagctggag
gctgcggccg ccgcttgccg gcttcgtttc 420ccagcgggcc cactcgcttt
tgcccgtgga cgatgcaatc aatgggctaa gcgaggagca 480gaggcaggaa
ttttggaagc agctggggaa cctgggcgta ttgggcatca cagcccctgt
540tcagtatggc ggctccggcc tgggctacct ggagcatgtg ctggtgatgg
aggagatatc 600ccgagcttcc ggagcagtgg ggctcagtta cggtgcccac
tccaacctct gcatcaacca 660gcttgtacgc aatgggaatg aggcccagaa
agagaagtat ctcccgaagc tgatcagtgg 720tgagtacatc ggagccctgg
ccatgagtga gcccaatgca ggctctgatg ttgtctctat 780gaagctcaaa
gcggaaaaga aaggaaatca ctacatcctg aatggcaaca agttctggat
840cactaatggc cctgatgctg acgtcctgat tgtctatgcc aagacagatc
tggctgctgt 900gccagcttct cggggcatca cagccttcat tgtggagaag
ggtatgcctg gctttagcac 960ctctaagaag ctggacaagc tggggatgag
gggctctaac acctgtgagc taatctttga 1020agactgcaag attcctgctg
ccaacatcct gggccatgag aataagggtg tctacgtgct 1080gatgagtggg
ctggacctgg agcggctggt gctggccggg gggcctcttg ggctcatgca
1140agcggtcctg gaccacacca ttccctacct gcacgtgagg gaagcctttg
gccagaagat 1200cggccacttc cagttgatgc aggggaagat ggctgacatg
tacacccgcc tcatggcgtg 1260tcggcagtat gtctacaatg tcgccaaggc
ctgcgatgag ggccattgca ctgctaagga 1320ctgtgcaggt gtgattcttt
actcagctga gtgtgccaca caggtagccc tggacggcat 1380tcagtgtttt
ggtggcaatg gctacatcaa tgactttccc atgggccgct ttcttcgaga
1440tgccaagctg tatgagatag gggctgggac cagcgaggtg aggcggctgg
tcatcggcag 1500agccttcaat gcagactttc actagtcctg agacccttcg
cccccttttc ctgcacctag 1560tggcctttct tgggaagtag agatgtggcg
gctttcccac cctgcccaca gcaggccctc 1620ctgcccagct gctcttgtca
gccctctggc ctctggatga ggttgagttc tccacaacag 1680ctcccaagca
tcatgggcct cgcagccggg cctgtgccac ggctagtgtt gtgtgattta
1740aaatggactc agcaggaagc atattgtctg gggattgttg ggacaggttt
tggtgactct 1800gtgcccttgc tctctaactt ctgagcccac ctcccagggt
aggcacctgg gggcatgcag 1860gtgcccacct cccagggtag gcacctgggg
gcatgcaggt acccacctct ttctcttggg 1920tgaggctctg gcaaggagat
ctctctgctc aagcacagca gaatcatggc ccctctccat 1980gaattggaac
ttggtacagg ttaagtatcc ctaatcctga aatctgaaac acttgtggtt
2040ccaagcattt tggataaggc aaattcaact ttcagtctct tttctggggg
aaaaaaataa 2100taaacctagc ctagccaggc gtggtggctc atgcttgtaa
tcccagcact tcaggaggct 2160gagatgggtg gatcacctga ggtcaggagt
tcaagaccag cctggccaac atgtggaaac 2220ctcgcctcaa ctaaaaatag
aaaaaaatta gttgggcatg gtggtgggca cctgtaatcc 2280cagctacttc
aggaggctga ggcaggagaa ttacttgaac ccaggaggcg gacgttgcag
2340tgagccgagc ttgtgccatt gcactccagc ctgggcgaca agagcaaaac
tcttcaaaaa 2400acaaaacaaa acaaaaaaac cctggccctt gtttcttcca
gtttctagag gtatcagctc 2460ctagcagctt atgaacacat atgcttgctt
ggccaggcaa ggtggtgtgt gcctgtaatc 2520ccagcacttt gggaggccaa
ggcaggtgga tcacttgcag tcaggagttc aagaccagcc 2580tgtccaacgt
ggtgaaaccc catctctact aaaaatacaa aaattagcca ggggtggtgg
2640tgcacgtctg taatcccagc tactcaggag gctgaggcag gagaatcact
tgaacccggg 2700aggtggaggt tgcaatgagc caatatgaca ccgctgcagt
ccagcctggg ccatagagtg 2760agactctgtc tcaaaaaagg aaagaaaaat
aggctgggca cagtgactca tgcctgtaat 2820cccaacactt tgggaggccg
aggcaggtgg atcacgaggt caggagttca agaccagcct 2880ggccaagatg
gtaaaacctc gtctctacta aaaatacaaa aattagccag gtgtggtggc
2940aggctcctgt aatcccagct actcaggagg ctgaggcaga gaattgcttg
aacccgggag 3000gcagagtttg cagtgagcca agatcacacc actgcactcc
agcttggacg acagagcgag 3060actctgtctc aaaaaataat aggccaggca
tggtggctca acgtctgtaa tcccagcact 3120ttgggaggcc gaggcgggca
gatcacaagg tcaggagttc gagaccagcc tgacgaccaa 3180catggtgaaa
cctcgtctct actaaaaata caaaaattag ccaggcctgg tggcacgcgc
3240ctgtaatccc agttacacag aagactgagg caggagaatc gcttgaacgc
aggaggcaga 3300ggttgcagga gctgagatcg cgccattgca ctccagcctg
ggcaacagag tgagactctg 3360tctcaaaaaa taataataaa ataaatgaac
acacatgctg ctgagtccgc agggggggca 3420gagcagagga cagcgtgctt
ttgtgtactg ttggaagact ggctcctcct gtacagcacc 3480tctgagccct
tgtgcaccgc cctgccacgg gcaccatcca gtcctggccg tgtgaccacc
3540cacagctgac tgggcagcag gcacaggccc tacccgagca ggccggagtt
ggctcgcatg 3600actccagctg aggctgcctg tgtacatttc tccagatacc
ctatggctaa ttttgttata 3660actgcacagt ggctgctgcc attttgtatt
aaatatattg tgaaacaaac ctatctgggg 3720agaagcaatc tacttgccgc
tgcttcctgt ctggatccag cttgtgtcct tggagagtgg 3780ctggcccagg
tcctattcct gtcctccagc ccgttctttc atgagggaca ggaaggtaaa
3840atcagccctt aggagagagg tctcagcctc cctttcccag atctcccagt
gagttttaaa 3900ggaagcaggg agcccagagt gctaagttct tacagccaga
aggaagctta tagatttctg 3960aaaaccgccc ctttgttttt aaaaagatca
acacaatttg actttctcaa ggtcaaaacg 4020aactagaatc cagatctgct
catggcaaaa atgggggtgt tctgagaatt ccagctttgg 4080gccgcactgt
acagcagtct ggatagagtg tgatctgaga agggaatggg tctgggttgt
4140tccacccctt ccgagttcca aaaagaggga actggttttc ttggttctca
gcccagcagc 4200acctatcctg gctcttggtc ctggcctgca gccaagtgct
gttcctagcc tgaggcttga 4260gacaggtggg gttggctcct caccaacccc
agttccgtcc catcctgagg gcaagatcct 4320gggctcatag gcagtccctt
tcacttcctt gtcttgctcc ctgctatgtt ggagatgaat 4380gtgactaaaa
gggccatctt gctggcttaa tgtgtggctg gagagaccag cctggagaca
4440atgtggcaaa atggggcgct tcatccagtc tgtctaagcc ctgtcgactt
ggggaggtga 4500tttctttcct ggttctatat gtgaagcaaa ataaatgttt
taaaattaaa agcaaaaaaa 4560acaaaatgaa ccatg 4575
28 396 PRT Homo sapiens
28 Met Ala Glu Met Ala Thr Ala Thr Arg Leu Leu Gly Trp Arg Val Ala1
5 10 15Ser Trp Arg Leu Arg Pro Pro Leu Ala Gly Phe Val Ser Gln Arg
Ala 20 25 30His Ser Leu Leu Pro Val Asp Asp Ala Ile Asn Gly Leu Ser
Glu Glu 35 40 45Gln Arg Gln Glu Phe Trp Lys Gln Leu Gly Asn Leu Gly
Val Leu Gly 50 55 60Ile Thr Ala Pro Val Gln Tyr Gly Gly Ser Gly Leu
Gly Tyr Leu Glu65 70 75 80His Val Leu Val Met Glu Glu Ile Ser Arg
Ala Ser Gly Ala Val Gly 85 90 95Leu Ser Tyr Gly Ala His Ser Asn Leu
Cys Ile Asn Gln Leu Val Arg 100 105 110Asn Gly Asn Glu Ala Gln Lys
Glu Lys Tyr Leu Pro Lys Leu Ile Ser 115 120 125Gly Glu Tyr Ile Gly
Ala Leu Ala Met Ser Glu Pro Asn Ala Gly Ser 130 135 140Asp Val Val
Ser Met Lys Leu Lys Ala Glu Lys Lys Gly Asn His Tyr145 150 155
160Ile Leu Asn Gly Asn Lys Phe Trp Ile Thr Asn Gly Pro Asp Ala Asp
165 170 175Val Leu Ile Val Tyr Ala Lys Thr Asp Leu Ala Ala Val Pro
Ala Ser 180 185 190Arg Gly Ile Thr Ala Phe Ile Val Glu Lys Gly Met
Pro Gly Phe Ser 195 200 205Thr Ser Lys Lys Leu Asp Lys Leu Gly Met
Arg Gly Ser Asn Thr Cys 210 215 220Glu Leu Ile Phe Glu Asp Cys Lys
Ile Pro Ala Ala Asn Ile Leu Gly225 230 235 240His Glu Asn Lys Gly
Val Tyr Val Leu Met Ser Gly Leu Asp Leu Glu 245 250 255Arg Leu Val
Leu Ala Gly Gly Pro Leu Gly Leu Met Gln Ala Val Leu 260 265 270Asp
His Thr Ile Pro Tyr Leu His Val Arg Glu Ala Phe Gly Gln Lys 275 280
285Ile Gly His Phe Gln Leu Met Gln Gly Lys Met Ala Asp Met Tyr Thr
290 295 300Arg Leu Met Ala Cys Arg Gln Tyr Val Tyr Asn Val Ala Lys
Ala Cys305 310 315
320Asp Glu Gly His Cys Thr Ala Lys Asp Cys Ala Gly Val Ile Leu Tyr
325 330 335Ser Ala Glu Cys Ala Thr Gln Val Ala Leu Asp Gly Ile Gln
Cys Phe 340 345 350Gly Gly Asn Gly Tyr Ile Asn Asp Phe Pro Met Gly
Arg Phe Leu Arg 355 360 365Asp Ala Lys Leu Tyr Glu Ile Gly Ala Gly
Thr Ser Glu Val Arg Arg 370 375 380Leu Val Ile Gly Arg Ala Phe Asn
Ala Asp Phe His385 390 395
29 4286 DNA Homo sapiens
29 caacttccgg
gtcaaaggtg cctgagccgg cgggtcccct gtgtccgccg cggctgtcgt 60cccccgctcc
cgccacttcc ggggtcgcag tcccgggcat ggagccgcga ccgtgaggcg
120ccgctggacc cgggacgacc tgcccagtcc ggccgccgcc ccacgtcccg
gtctgtgtcc 180cacgcctgca gctggaatgg aggctctctg gaccctttag
aaggcacccc tgccctcctg 240aggtcagctg agcggttaat gcggaaggtt
aagaaactgc gcctggacaa ggagaacacc 300ggaagttgga gaagcttctc
gctgaattcc gagggggctg agaggatggc caccaccggg 360accccaacgg
ccgaccgagg cgacgcagcc gccacagatg acccggccgc ccgcttccag
420gtgcagaagc actcgtggga cgggctccgg agcatcatcc acggcagccg
caagtactcg 480ggcctcattg tcaacaaggc gccccacgac ttccagtttg
tgcagaagac ggatgagtct 540gggccccact cccaccgcct ctactacctg
ggaatgccat atggcagccg agagaactcc 600ctcctctact ctgagattcc
caagaaggtc cggaaagagg ctctgctgct cctgtcctgg 660aagcagatgc
tggatcattt ccaggccacg ccccaccatg gggtctactc tcgggaggag
720gagctgctga gggagcggaa acgcctgggg gtcttcggca tcacctccta
cgacttccac 780agcgagagtg gcctcttcct cttccaggcc agcaacagcc
tcttccactg ccgcgacggc 840ggcaagaacg gcttcatggt gtcccctatg
aaaccgctgg aaatcaagac ccagtgctca 900gggccccgga tggaccccaa
aatctgccct gccgaccctg ccttcttctc cttcatcaat 960aacagcgacc
tgtgggtggc caacatcgag acaggcgagg agcggcggct gaccttctgc
1020caccaaggtt tatccaatgt cctggatgac cccaagtctg cgggtgtggc
caccttcgtc 1080atacaggaag agttcgaccg cttcactggg tactggtggt
gccccacagc ctcctgggaa 1140ggttcagagg gcctcaagac gctgcgaatc
ctgtatgagg aagtcgatga gtccgaggtg 1200gaggtcattc acgtcccctc
tcctgcgcta gaagaaagga agacggactc gtatcggtac 1260cccaggacag
gcagcaagaa tcccaagatt gccttgaaac tggctgagtt ccagactgac
1320agccagggca agatcgtctc gacccaggag aaggagctgg tgcagccctt
cagctcgctg 1380ttcccgaagg tggagtacat cgccagggcc gggtggaccc
gggatggcaa atacgcctgg 1440gccatgttcc tggaccggcc ccagcagtgg
ctccagctcg tcctcctccc cccggccctg 1500ttcatcccga gcacagagaa
tgaggagcag cggctagcct ctgccagagc tgtccccagg 1560aatgtccagc
cgtatgtggt gtacgaggag gtcaccaacg tctggatcaa tgttcatgac
1620atcttctatc ccttccccca atcagaggga gaggacgagc tctgctttct
ccgcgccaat 1680gaatgcaaga ccggcttctg ccatttgtac aaagtcaccg
ccgttttaaa atcccagggc 1740tacgattgga gtgagccctt cagccccggg
gaagatgaat ttaagtgccc cattaaggaa 1800gagattgctc tgaccagcgg
tgaatgggag gttttggcga ggcacggctc caagatctgg 1860gtcaatgagg
agaccaagct ggtgtacttc cagggcacca aggacacgcc gctggagcac
1920cacctctacg tggtcagcta tgaggcggcc ggcgagatcg tacgcctcac
cacgcccggc 1980ttctcccata gctgctccat gagccagaac ttcgacatgt
tcgtcagcca ctacagcagc 2040gtgagcacgc cgccctgcgt gcacgtctac
aagctgagcg gccccgacga cgaccccctg 2100cacaagcagc cccgcttctg
ggctagcatg atggaggcag ccagctgccc cccggattat 2160gttcctccag
agatcttcca tttccacacg cgctcggatg tgcggctcta cggcatgatc
2220tacaagcccc acgccttgca gccagggaag aagcacccca ccgtcctctt
tgtatatgga 2280ggcccccagg tgcagctggt gaataactcc ttcaaaggca
tcaagtactt gcggctcaac 2340acactggcct ccctgggcta cgccgtggtt
gtgattgacg gcaggggctc ctgtcagcga 2400gggcttcggt tcgaaggggc
cctgaaaaac caaatgggcc aggtggagat cgaggaccag 2460gtggagggcc
tgcagttcgt ggccgagaag tatggcttca tcgacctgag ccgagttgcc
2520atccatggct ggtcctacgg gggcttcctc tcgctcatgg ggctaatcca
caagccccag 2580gtgttcaagg tggccatcgc gggtgccccg gtcaccgtct
ggatggccta cgacacaggg 2640tacactgagc gctacatgga cgtccctgag
aacaaccagc acggctatga ggcgggttcc 2700gtggccctgc acgtggagaa
gctgcccaat gagcccaacc gcttgcttat cctccacggc 2760ttcctggacg
aaaacgtgca ctttttccac acaaacttcc tcgtctccca actgatccga
2820gcagggaaac cttaccagct ccagatctac cccaacgaga gacacagtat
tcgctgcccc 2880gagtcgggcg agcactatga agtcacgttg ctgcactttc
tacaggaata cctctgagcc 2940tgcccaccgg gagccgccac atcacagcac
aagtggctgc agcctccgcg gggaaccagg 3000cgggagggac tgagtggccc
gcgggcccca gtgaggcact ttgtcccgcc cagcgctggc 3060cagccccgag
gagccgctgc cttcaccgcc ccgacgcctt ttatcctttt ttaaacgctc
3120ttgggtttta tgtccgctgc ttcttggttg ccgagacaga gagatggtgg
tctcgggcca 3180gcccctcctc tccccgcctt ctgggaggag gaggtcacac
gctgatgggc actggagagg 3240ccagaagaga ctcagaggag cgggctgcct
tccgcctggg gctccctgtg acctctcagt 3300cccctggccc ggccagccac
cgtccccagc acccaagcat gcaattgcct gtcccccccg 3360gccagcctcc
ccaacttgat gtttgtgttt tgtttggggg gatatttttc ataattattt
3420aaaagacagg ccgggcgcgg tggctcacgt ctgtaatccc agcactttgg
gaggctgagg 3480cgggcggatc acctgaggtt gggagttcaa gaccagcctg
gccaacatgg ggaaaccccg 3540tctctactaa aaatacaaaa aattagccgg
gtgtggtggc gcgtgcctat aatcccagct 3600actcgggagg ctgaggcagg
agaatcgctt gaacccggga ggtggaggtt gcggtgagcc 3660aagatcgcac
cattgcactc cagcctgggc aacaagagcg aaactctgtc tcaaaataaa
3720taaaaaataa aagacagaaa gcaaggggtg cctaaatcta gacttggggt
ccacaccggg 3780cagcggggtt gcaacccagc acctggtagg ctccatttct
tcccaagccc gagcagaggg 3840tcatgcgggc cccacaggag aagcggccag
ggcccgcggg gggcaccacc tgtggacagc 3900cctcctgtcc ccaagctttc
aggcaggcac tgaaacgcac cgaacttcca cgctctgctg 3960gtcagtggcg
gctgtcccct ccccagccca gccgcccagc cacatgtgtc tgcctgaccc
4020gtacacacca ggggttccgg ggttgggagc tgaaccatcc ccacctcagg
gttatatttc 4080cctctcccct tccctccccg ccaagagctc tgccaggggc
gggcaaaaaa aaaagtaaaa 4140agaaaagaaa aaaaaaaaaa agaaacaaac
cacctctaca tattatggaa agaaaatatt 4200tttgtcgatt cttattcttt
tataattatg cgtggaagaa gtagacacat taaacgattc 4260cagttggaaa
aaaaaaaaaa aaaaaa 4286
30 892 PRT Homo sapiens
30 Met Arg Lys Val Lys
Lys Leu Arg Leu Asp Lys Glu Asn Thr Gly Ser1 5 10 15Trp Arg Ser Phe
Ser Leu Asn Ser Glu Gly Ala Glu Arg Met Ala Thr 20 25 30Thr Gly Thr
Pro Thr Ala Asp Arg Gly Asp Ala Ala Ala Thr Asp Asp 35 40 45Pro Ala
Ala Arg Phe Gln Val Gln Lys His Ser Trp Asp Gly Leu Arg 50 55 60Ser
Ile Ile His Gly Ser Arg Lys Tyr Ser Gly Leu Ile Val Asn Lys65 70 75
80Ala Pro His Asp Phe Gln Phe Val Gln Lys Thr Asp Glu Ser Gly Pro
85 90 95His Ser His Arg Leu Tyr Tyr Leu Gly Met Pro Tyr Gly Ser Arg
Glu 100 105 110Asn Ser Leu Leu Tyr Ser Glu Ile Pro Lys Lys Val Arg
Lys Glu Ala 115 120 125Leu Leu Leu Leu Ser Trp Lys Gln Met Leu Asp
His Phe Gln Ala Thr 130 135 140Pro His His Gly Val Tyr Ser Arg Glu
Glu Glu Leu Leu Arg Glu Arg145 150 155 160Lys Arg Leu Gly Val Phe
Gly Ile Thr Ser Tyr Asp Phe His Ser Glu 165 170 175Ser Gly Leu Phe
Leu Phe Gln Ala Ser Asn Ser Leu Phe His Cys Arg 180 185 190Asp Gly
Gly Lys Asn Gly Phe Met Val Ser Pro Met Lys Pro Leu Glu 195 200
205Ile Lys Thr Gln Cys Ser Gly Pro Arg Met Asp Pro Lys Ile Cys Pro
210 215 220Ala Asp Pro Ala Phe Phe Ser Phe Ile Asn Asn Ser Asp Leu
Trp Val225 230 235 240Ala Asn Ile Glu Thr Gly Glu Glu Arg Arg Leu
Thr Phe Cys His Gln 245 250 255Gly Leu Ser Asn Val Leu Asp Asp Pro
Lys Ser Ala Gly Val Ala Thr 260 265 270Phe Val Ile Gln Glu Glu Phe
Asp Arg Phe Thr Gly Tyr Trp Trp Cys 275 280 285Pro Thr Ala Ser Trp
Glu Gly Ser Glu Gly Leu Lys Thr Leu Arg Ile 290 295 300Leu Tyr Glu
Glu Val Asp Glu Ser Glu Val Glu Val Ile His Val Pro305 310 315
320Ser Pro Ala Leu Glu Glu Arg Lys Thr Asp Ser Tyr Arg Tyr Pro Arg
325 330 335Thr Gly Ser Lys Asn Pro Lys Ile Ala Leu Lys Leu Ala Glu
Phe Gln 340 345 350Thr Asp Ser Gln Gly Lys Ile Val Ser Thr Gln Glu
Lys Glu Leu Val 355 360 365Gln Pro Phe Ser Ser Leu Phe Pro Lys Val
Glu Tyr Ile Ala Arg Ala 370 375 380Gly Trp Thr Arg Asp Gly Lys Tyr
Ala Trp Ala Met Phe Leu Asp Arg385 390 395 400Pro Gln Gln Trp Leu
Gln Leu Val Leu Leu Pro Pro Ala Leu Phe Ile 405 410 415Pro Ser Thr
Glu Asn Glu Glu Gln Arg Leu Ala Ser Ala Arg Ala Val 420 425 430Pro
Arg Asn Val Gln Pro Tyr Val Val Tyr Glu Glu Val Thr Asn Val 435 440
445Trp Ile Asn Val His Asp Ile Phe Tyr Pro Phe Pro Gln Ser Glu Gly
450 455 460Glu Asp Glu Leu Cys Phe Leu Arg Ala Asn Glu Cys Lys Thr
Gly Phe465 470 475 480Cys His Leu Tyr Lys Val Thr Ala Val Leu Lys
Ser Gln Gly Tyr Asp 485 490 495Trp Ser Glu Pro Phe Ser Pro Gly Glu
Asp Glu Phe Lys Cys Pro Ile 500 505 510Lys Glu Glu Ile Ala Leu Thr
Ser Gly Glu Trp Glu Val Leu Ala Arg 515 520 525His Gly Ser Lys Ile
Trp Val Asn Glu Glu Thr Lys Leu Val Tyr Phe 530 535 540Gln Gly Thr
Lys Asp Thr Pro Leu Glu His His Leu Tyr Val Val Ser545 550 555
560Tyr Glu Ala Ala Gly Glu Ile Val Arg Leu Thr Thr Pro Gly Phe Ser
565 570 575His Ser Cys Ser Met Ser Gln Asn Phe Asp Met Phe Val Ser
His Tyr 580 585 590Ser Ser Val Ser Thr Pro Pro Cys Val His Val Tyr
Lys Leu Ser Gly 595 600 605Pro Asp Asp Asp Pro Leu His Lys Gln Pro
Arg Phe Trp Ala Ser Met 610 615 620Met Glu Ala Ala Ser Cys Pro Pro
Asp Tyr Val Pro Pro Glu Ile Phe625 630 635 640His Phe His Thr Arg
Ser Asp Val Arg Leu Tyr Gly Met Ile Tyr Lys 645 650 655Pro His Ala
Leu Gln Pro Gly Lys Lys His Pro Thr Val Leu Phe Val 660 665 670Tyr
Gly Gly Pro Gln Val Gln Leu Val Asn Asn Ser Phe Lys Gly Ile 675 680
685Lys Tyr Leu Arg Leu Asn Thr Leu Ala Ser Leu Gly Tyr Ala Val Val
690 695 700Val Ile Asp Gly Arg Gly Ser Cys Gln Arg Gly Leu Arg Phe
Glu Gly705 710 715 720Ala Leu Lys Asn Gln Met Gly Gln Val Glu Ile
Glu Asp Gln Val Glu 725 730 735Gly Leu Gln Phe Val Ala Glu Lys Tyr
Gly Phe Ile Asp Leu Ser Arg 740 745 750Val Ala Ile His Gly Trp Ser
Tyr Gly Gly Phe Leu Ser Leu Met Gly 755 760 765Leu Ile His Lys Pro
Gln Val Phe Lys Val Ala Ile Ala Gly Ala Pro 770 775 780Val Thr Val
Trp Met Ala Tyr Asp Thr Gly Tyr Thr Glu Arg Tyr Met785 790 795
800Asp Val Pro Glu Asn Asn Gln His Gly Tyr Glu Ala Gly Ser Val Ala
805 810 815Leu His Val Glu Lys Leu Pro Asn Glu Pro Asn Arg Leu Leu
Ile Leu 820 825 830His Gly Phe Leu Asp Glu Asn Val His Phe Phe His
Thr Asn Phe Leu 835 840 845Val Ser Gln Leu Ile Arg Ala Gly Lys Pro
Tyr Gln Leu Gln Ile Tyr 850 855 860Pro Asn Glu Arg His Ser Ile Arg
Cys Pro Glu Ser Gly Glu His Tyr865 870 875 880Glu Val Thr Leu Leu
His Phe Leu Gln Glu Tyr Leu 885 890
31 2113 DNA Homo sapiens
31 actcaccctc cggcttcctg tcggggcttt ctcagcccca ccccacgttt ggacatttgg
60agcatttcct tccctgacag ccggacctgg gactgggctg gggccctggc ggatggagac
120atgctgcccc tgctgctgct gcccctgctg tggggggggt ccctgcagga
gaagccagtg 180tacgagctgc aagtgcagaa gtcggtgacg gtgcaggagg
gcctgtgcgt ccttgtgccc 240tgctccttct cttacccctg gagatcctgg
tattcctctc ccccactcta cgtctactgg 300ttccgggacg gggagatccc
atactacgct gaggttgtgg ccacaaacaa cccagacaga 360agagtgaagc
cagagaccca gggccgattc cgcctccttg gggatgtcca gaagaagaac
420tgctccctga gcatcggaga tgccagaatg gaggacacgg gaagctattt
cttccgcgtg 480gagagaggaa gggatgtaaa atatagctac caacagaata
agctgaactt ggaggtgaca 540gccctgatag agaaacccga catccacttt
ctggagcctc tggagtccgg ccgccccaca 600aggctgagct gcagccttcc
aggatcctgt gaagcgggac cacctctcac attctcctgg 660acggggaatg
ccctcagccc cctggacccc gagaccaccc gctcctcgga gctcaccctc
720acccccaggc ccgaggacca tggcaccaac ctcacctgtc aggtgaaacg
ccaaggagct 780caggtgacca cggagagaac tgtccagctc aatgtctcct
atgctccaca gaacctcgcc 840atcagcatct tcttcagaaa tggcacaggc
acagccctgc ggatcctgag caatggcatg 900tcggtgccca tccaggaggg
ccagtccctg ttcctcgcct gcacagttga cagcaacccc 960cctgcctcac
tgagctggtt ccgggaggga aaagccctca atccttccca gacctcaatg
1020tctgggaccc tggagctgcc taacatagga gctagagagg gaggggaatt
cacctgccgg 1080gttcagcatc cgctgggctc ccagcacctg tccttcatcc
tttctgtgca gagaagctcc 1140tcttcctgca tatgtgtaac tgagaaacag
cagggctcct ggcccctcgt cctcaccctg 1200atcagggggg ctctcatggg
ggctggcttc ctcctcacct atggcctcac ctggatctac 1260tataccaggt
gtggaggccc ccagcagagc agggctgaga ggcctggctg agcccctccc
1320gctcaagaca gaactgaggt gtggacactt agccctgtgg gacacatgca
ggacatcact 1380gtcagcttct ttctggaagc tcacatccca ctgactaccc
ctcttttcct tcctgcccca 1440taccccttct acttattccc ctctgcttgt
gagtcttgcc ccaccacacc tgcatcccca 1500tctgcacccc atcccctctc
cacctgccct tctcttccct ctccatccac catctccagc 1560cctgtgaagg
gaatgtactt tcggtcttat acccccatta cccattaccc aaaagttacc
1620tttttttttt tttttttttt ttgagacaga gtctcactct gttgcacagg
ctggagttca 1680gtggcacaat ctccgttcac tgcaacctcc acctctgggg
ttcaagcaat tctcctgcct 1740cagcctccct agtagctggg attacaggtg
cctgccacca catccagtta attttttttt 1800tttgtatgtt agtagagatg
gggttttacc atgttggcca ggtctcgaac tcctgacctc 1860aagcaatcca
ctgcattggc ctcccaaagt gctggcatta caggtatgag ccaccgtgcc
1920tggctgccaa aagttacctt cttaacactt gaatttctgg tctcctcagc
ttccctatcc 1980atataggcac agagaggcag catttgtttt ccagttaaaa
ctctacctca ttgtgattat 2040tatccaatac aattgttaca aaataagtaa
aacttttatg aaacaataca acataactga 2100ttttactctt taa
2113
32 396 PRT Homo sapiens
32 Met Leu Pro Leu Leu Leu Leu Pro Leu Leu
Trp Gly Gly Ser Leu Gln1 5 10 15Glu Lys Pro Val Tyr Glu Leu Gln Val
Gln Lys Ser Val Thr Val Gln 20 25 30Glu Gly Leu Cys Val Leu Val Pro
Cys Ser Phe Ser Tyr Pro Trp Arg 35 40 45Ser Trp Tyr Ser Ser Pro Pro
Leu Tyr Val Tyr Trp Phe Arg Asp Gly 50 55 60Glu Ile Pro Tyr Tyr Ala
Glu Val Val Ala Thr Asn Asn Pro Asp Arg65 70 75 80Arg Val Lys Pro
Glu Thr Gln Gly Arg Phe Arg Leu Leu Gly Asp Val 85 90 95Gln Lys Lys
Asn Cys Ser Leu Ser Ile Gly Asp Ala Arg Met Glu Asp 100 105 110Thr
Gly Ser Tyr Phe Phe Arg Val Glu Arg Gly Arg Asp Val Lys Tyr 115 120
125Ser Tyr Gln Gln Asn Lys Leu Asn Leu Glu Val Thr Ala Leu Ile Glu
130 135 140Lys Pro Asp Ile His Phe Leu Glu Pro Leu Glu Ser Gly Arg
Pro Thr145 150 155 160Arg Leu Ser Cys Ser Leu Pro Gly Ser Cys Glu
Ala Gly Pro Pro Leu 165 170 175Thr Phe Ser Trp Thr Gly Asn Ala Leu
Ser Pro Leu Asp Pro Glu Thr 180 185 190Thr Arg Ser Ser Glu Leu Thr
Leu Thr Pro Arg Pro Glu Asp His Gly 195 200 205Thr Asn Leu Thr Cys
Gln Val Lys Arg Gln Gly Ala Gln Val Thr Thr 210 215 220Glu Arg Thr
Val Gln Leu Asn Val Ser Tyr Ala Pro Gln Asn Leu Ala225 230 235
240Ile Ser Ile Phe Phe Arg Asn Gly Thr Gly Thr Ala Leu Arg Ile Leu
245 250 255Ser Asn Gly Met Ser Val Pro Ile Gln Glu Gly Gln Ser Leu
Phe Leu 260 265 270Ala Cys Thr Val Asp Ser Asn Pro Pro Ala Ser Leu
Ser Trp Phe Arg 275 280 285Glu Gly Lys Ala Leu Asn Pro Ser Gln Thr
Ser Met Ser Gly Thr Leu 290 295 300Glu Leu Pro Asn Ile Gly Ala Arg
Glu Gly Gly Glu Phe Thr Cys Arg305 310 315 320Val Gln His Pro Leu
Gly Ser Gln His Leu Ser Phe Ile Leu Ser Val 325 330 335Gln Arg Ser
Ser Ser Ser Cys Ile Cys Val Thr Glu Lys Gln Gln Gly 340 345 350Ser
Trp Pro Leu Val Leu Thr Leu Ile Arg Gly Ala Leu Met Gly Ala 355 360
365Gly Phe Leu Leu Thr Tyr Gly Leu Thr Trp Ile Tyr Tyr Thr Arg Cys
370 375 380Gly Gly Pro Gln Gln Ser Arg Ala Glu Arg Pro Gly385 390
395
33 4045 DNA Homo sapiens
33 cgcccacgcc cggcgccccg accgcggagg
actccccgag ccccgcccgc catggcccgg 60atcccgacgg ccgccctggg ttgcatcagc
ctcctctgcc tgcagctccc tggctcgctg 120tcccgcagcc tgggcgggga
cccgcgaccc gtcaaaccca gggagccccc agcccggagc 180ccttccagca
gcctgcagcc caggcacccc gcaccccgac ctgtggtctg gaagcttcac
240cgggccctcc aggcacagag gggtgccggc ctggcccctg ttatgggtca
gcctctccgg 300gatggtggcc gccaacactc gggcccccga agacactcgg
gcccccgcag gacccaagcc 360cagctcctgc gagtgggctg tgtgctgggc
acctgccagg tgcagaatct cagccaccgc 420ctgtggcaac tcatgggacc
ggccggccgg caggactcag ctcctgtgga ccccagcagc 480ccccacagct
atggctgagg tggggccggg ccacacccct gcccatccca gccagggtgc
540tgtgcccccg tccagagctg cagctgagcc ccatctgaag cccagtccct
cggagctgca 600gacagcaggt cctgcagcaa caatacctgc acggctttgc
acacgtaaac ctaggctggt 660ctacacgcag tgctggtacg tcaaggagcc
taaacaccct gaaattgtga ccccctgggg 720gacagctgcc agacacagct
ggcggcagca ccagatgcta agcgcttcag agaggaggtg 780tctgcccaga
gatgtggagc agaagctggg ccctgaacac acggggccat gtctggacga
840gcaggggaga gaggctgaac tggccagaag tggcccctcc gctgctggtc
cagtcagact 900gaagcccggc cttgtgcctg ggctgttcct gctctcatgc
acaaccagcc cttccacgtg 960cctgcctgtg ggacaggagg gggagcgtgg
gatgctgtag cccccggggt tgggcaaggg 1020aaggatggtg gccctccaga
ggtcatgaag ggacctctgt ggctccagct gccaaccctg 1080gagcccagac
cgaggtggcc atggagactc cacctggatc ccctgtagga ggccagggag
1140gggaactcag cagttcagga gccaccccaa accattctgg gacagggaca
cccctttcta 1200ccccagggca gggcagggct gggtggggca agatccccca
gcccgactag acccacctca 1260cctgaagggg gtgagaccct tgttggcagc
cagacaaggg tggggctcca caggcagcac 1320aggcgcccca ccaccaccca
gtttggggac ccagtgggac caggtgcggg ggcagagggt 1380gacttaccaa
gagccaggga gggcagccca ggcccaagtg acagcaagaa caagaaccac
1440tgccggcgtg cacagacttg gtgtgtgtcc ttccctgggg ggacggggga
ctcacatgtg 1500cctgccactg gagcctctca accgtccagc agaacacggg
gttcagaaag ggctccttct 1560gctatttagc gaacactgag catttaattt
acaaatgttt gctagggtca ccctctcggc 1620catcccacga gggtcgccat
gatcacccca actctagagg ccgcagcaga gctcaggaca 1680ttcccccaca
gagcttgccc ctcagttcct acctccaagg gggagggtcc tggaagcgcc
1740cacccaggcg ccgcccctgt gcttgctccc cgagctcagg gattgccgag
tccacgtaac 1800tgacctgtac tccacgaggc cctgtgggaa cggtccaggc
tggtcctgcc ctgtggaggc 1860ctccgtgcac tgagagatgt actaggattg
cagcaaaggt ggtcagggtg atgggccgca 1920cagcgaggca gtcaaggcca
gctccctggg agaagcactg ggtcaggtga ggtctgagga 1980cagcaggcct
tccctagggg aaggagctgg gagtgccaag gccccaggtg cacaggaggc
2040gtggctgctg agaggctgca gggtggaggg gcctcggcct cagagtcatg
tgccctgtga 2100ccactgaagg gtgtcagcag agcacacggc atgaggacag
agggaggggc acggggagtg 2160aaggaggggg ccctggggca aggctcgggg
gtcaggagct cagcgtccgc tactcagccc 2220agccaaaacc ctcccagacg
tctcctctcc tgcctgggca aagtccagct tggcaccccg 2280tctggggcct
gcctgtggtc agggccaagt gttccctcct ccaggaaagc ctttaccctc
2340ctcatgccct gtagtcagga ggccgcctgc tgtaaccctc cgtgtcgcct
cgggtgcgaa 2400atcagaccca cctgacacca tcacgcggag gcccagcagc
acctgcaccc acttccagct 2460gctctggcca aaatctccgc tcggccaggc
cccgtggctc acacctgtaa tcctagcaca 2520ttgggaggcc aaggcaggca
catcacctga gttcaggagt tcaagaccag cctggccaac 2580atggtgaaat
cccgtctcta ctaaaaacag aaaattatcc gggcgtggtg gcacatgact
2640gtaatcccag ctactcagga ggctgaggca ggaggatcac ttgaacctgg
gaggcggagg 2700ttgcagtgag ctgagattgc gccattgcac tccagcctgg
gcaacaagag caaaattctg 2760cctcaaaaaa aaaaatagta ataatacaaa
aattagctgg gcgtggtggc acatgccagt 2820aattccatct actcgggagg
ctgaggcagg agaatcgtct aagcccggga ggtggaggtt 2880gcagtgagcc
cagatggcgc tgctgcactc aagcttggat gacagagcaa gactccgttt
2940caaaaaaaaa aaacctcctc tcttccttca caccttcctc tgaatcccac
ccggtcccac 3000ctcctgaacc tatccagaca ccttctcctg acccaggcac
cacctgcttt cggggcgatg 3060gccgtagcct cctcccaggc acctgtctgc
atccctctgg ccagtgcatg ctgagcacgt 3120gacctacccg tgttgggaca
cgtgaggata cagccttgac ccccaggggc tgacattcta 3180gggggagata
gaaggagaca aacgtagaag gtagaataag tgggtggtgg agtggcaggg
3240agtgctgagt gccacaggaa gtcagacaag gaaggagagt gtggggcagg
tgccgtttaa 3300atggggggcg ctggggtctc ctcacagttg cttctcagct
cagctgtgcc aggatcttgt 3360tgagtcaggt cagctgccca cagccctctt
gcctgacccc tgaagcccag aactctgatc 3420ttcacagccc taggtatggc
cccagcaccc cactgccctc tctcctgccc cagccgactg 3480ctgttcccag
acttccctgg ccacgctcca agacgccagc tctgccgcgg gcactttgtt
3540ctcacggtgt cctccatgcc tgcagggccc atgcatggga agttgcgttg
gcggcctggg 3600tgttggcggt tccgtgcctg ctccaactct ccgtgaggcc
cctctcccag agcctgacac 3660actctgtggc cgaactctag gcaggtgccc
ctgagtcctt tcctcgacga ggcctgaccc 3720catccccatc ctcgctgggc
ccgccgaccc cggtgttagc aagaatcctc taaatcagtt 3780tatggagaat
tacccaccct cgatatctga tcccattcct catctcccac ccttgatctc
3840atcaccctgc cggcctcctg caagatcctc attgagccac tccagtgaga
atccccctac 3900cctcgaaggc cgccctaaca acttcccatc cgctgacccc
tccaacgcca tcaatctcca 3960gctgtggttg ttgaactcgg aggtgagctc
ctctcaccac tctcttgaat aaagcttttc 4020tcaccatttt aaaaaaaaaa aaaaa
4045
34 148 PRT Homo sapiens
34 Met Ala Arg Ile Pro Thr Ala Ala Leu Gly
Cys Ile Ser Leu Leu Cys1 5 10 15Leu Gln Leu Pro Gly Ser Leu Ser Arg
Ser Leu Gly Gly Asp Pro Arg 20 25 30Pro Val Lys Pro Arg Glu Pro Pro
Ala Arg Ser Pro Ser Ser Ser Leu 35 40 45Gln Pro Arg His Pro Ala Pro
Arg Pro Val Val Trp Lys Leu His Arg 50 55 60Ala Leu Gln Ala Gln Arg
Gly Ala Gly Leu Ala Pro Val Met Gly Gln65 70 75 80Pro Leu Arg Asp
Gly Gly Arg Gln His Ser Gly Pro Arg Arg His Ser 85 90 95Gly Pro Arg
Arg Thr Gln Ala Gln Leu Leu Arg Val Gly Cys Val Leu 100 105 110Gly
Thr Cys Gln Val Gln Asn Leu Ser His Arg Leu Trp Gln Leu Met 115 120
125Gly Pro Ala Gly Arg Gln Asp Ser Ala Pro Val Asp Pro Ser Ser Pro
130 135 140His Ser Tyr Gly145
35 3406 DNA Homo sapiens
35 aggcgggcgg
agcgaggggt gggagggcgc gcgcgaacgg gcgggcgagc aagcgagcgg 60cgtctccacc
agcatctgcc gcggccgcct ttgcccgaag cccggggacg aaccgacgga
120ccgaccgcct ggcgcacgga cgcgggcgct cgctttgtgt tcggggctag
cgtcggcgag 180gcttgagctt gcagcgcgcg gcttccctgc tttctcgcgg
ccaccccggc tccggcggcc 240tcggcgcgcg aggggctgga ggtgcgggag
ccgctctccg ccggtcggtc cccgcgcggc 300tgagcccagg ccgccagcgc
cgcggccccg tgcggtgtcc ctgagctcct gctccccgcc 360gggctgctcc
gagcaacggt gcttcggagc tccaaactcg ggctgccggg gcaagtgtct
420tcatgaaccc agaggatgtc cgggaagcac tacaagggtc ctgaagtcag
ttgttgcatc 480aaatacttca tatttggctt caatgtcata ttttggtttt
tgggaataac atttcttgga 540attggactgt gggcatggaa tgaaaaagga
gttctgtcca acatctcttc catcaccgat 600ctcggcggct ttgacccagt
ttggctcttc cttgtggtgg gaggagtgat gttcattttg 660ggatttgcag
ggtgcattgg agcgctacgg gaaaacactt tccttctcaa gtttttttct
720gtgttcctgg gaattatttt cttcctggag ctcactgccg gagttctagc
atttgttttc 780aaagactgga tcaaagacca gctgtatttc tttataaaca
acaacatcag agcatatcgg 840gatgacattg atttgcaaaa cctcatagac
ttcacccagg aatattggca gtgctgtggg 900gcttttggag ctgatgattg
gaacctaaat atttacttca attgcacaga ttccaatgca 960agtcgagagc
gatgtggcgt tccattctcc tgctgcacta aagatcccgc agaagatgtc
1020atcaacactc agtgtggcta tgatgccagg caaaaaccag aagttgacca
gcagattgta 1080atctacacga aaggctgtgt gccccagttt gagaagtggt
tgcaggacaa tttaaccatc 1140gttgctggta ttttcatagg cattgcattg
ctgcagatat ttgggatatg cctggcccag 1200aatttggtta gcgatatcga
agctgtcagg gcgagctggt agaccccctg caaccgctgc 1260tgcaagacac
tggacagacc cagctttcgg gaccctcccg cgtgccgaac tgatcttcga
1320gctgcatgga cctaatcaca gatgcagcct gcagtctcgc ctaatggagc
tgccattagg 1380ggagtgtaaa actgggaaat gctgctcact gacagaatta
aaaaaaaaaa taaccagtat 1440gaaagtcgtt gcgccgtgaa tctctactgt
agccatgaat ttatggacag ttagatgctt 1500accaaaaaag aaaaaaaggg
agggtagggg acccagatgt acttgaatgt gcagaaaata 1560cattcttgtc
ctcatcttcc gtaattggag ggctgggaga ggcagctttg ctcttcacca
1620caccttggac ggaccacctt ctttctgttc catggcctga aggagtgcat
ctcctcaaag 1680actcagcccc tcacctggga gggcagtggt ttgtgggcat
ccctccatgt acattttagg 1740aaacacttgc aactctcatc tgaagaagaa
aacaactcat ctttgggttc agattttgtg 1800atggtattca gcaagtcact
tgggcgagca cacttggtct atcctggaaa gtctccttat 1860aagagaagtt
gtgtatttca tgtgcaccga gcaagggcat tggaagacgt catgaggctg
1920tattttagca ggactgatcg tttttctaag tagacctgag ctttgtttat
cagtgaaatt 1980caaggagaaa atgaggttaa tgaagaggta tcagttaaat
atccccttct tctcaccctg 2040ccaaaattag cagttggatt tttggaaact
ctggaatatt ctgggtcatt ttgttttgta 2100tgtttgttgt ttttcgtctt
ccaaaggtga aagctatgat acagttccac ttaaatttta 2160gtgttttctt
actcagctca agcattaatt tttgattaag tcttaatctg catgacctgt
2220gaatctgaat ccatcatctc cctttcctgc cagcttttct acaaacattg
aaatatgtta 2280tttggtcagc acttatttcc taggttcaca gccttgggag
gttgtggcat gtcctcccag 2340tctggctggg aagagaccag ctgtaccatc
caaatgcttc cctggtcttg atgatctctt 2400ccagagtcga tctgagtggc
cttttctgca ccctcccctt ctttctcttt gaatggaatt 2460aaacccaatt
tggaaacaac attgacccag tcaaaagctt ctaatggttt ctttttcttc
2520ctccagtttt agtttgcttt tattaaaaaa agaaaatagt gcatggccat
agctccttca 2580gttctcttat tgcagactaa ccatcaggat ggtatcaaag
cacaaatact ttggagggga 2640atgcgttgaa ctggggcaag tactctgtaa
cacaaagtgg gaaaccactt cctggtgctg 2700ccgctcctgc ccccacttta
ggtgggaggg acgagttttg ccctctagat tttaatccag 2760ctggtgtcca
ccggatgttg ccctcctggg gagcagatat cagtctgtgg aactctggga
2820aaaccacagg cacatttttc ggtgcggaca gatttgccag cacataactg
ggcagccagc 2880tagaatactt tgtggaaatt aagcgaggtt ttccatttca
gccccatggt gcatggtggt 2940ggccgatgaa tgtgtcagtc tgctcagaga
aaggacaaaa aggaaattat tttcaaaact 3000gtgttcactg tttgggtgtg
tgtatggctc tgcatgtgtg tgtttttgtc tctgtatagg 3060tagaggtatt
cacatcttac tccgactgta aggttgtctt acttcatctc tgcccccacc
3120acagttgcca ttttgtaatg tccttccaac atggagaaga cacgagctct
ctccagttgg 3180catcatttgt cttttttgtt gattgcctca ttctccagtg
aactccatct ggccaattga 3240ttcagaatca ggcaagatcc ctgccctttg
gcacatccac tgaaaggcca aacagcaagt 3300ccgagtgagt tttaaatatt
aattaatcac cctttatttt ttacacttga gagtgattgt 3360aataaaggct
gtcattaata aacttggttc taccttaaaa aaaaaa 3406
36 5882 DNA Homo sapiens
36 atcaaatttc aactccaggc agtccttcca gccatgtggg ttcagcggaa agagaagcaa
60aaccactctt cctaaaatgt tagaagctgc tcttcgctta ccttggggcc tttgcattgg
120gagctgtttt tcacatcaaa gaatatgtgc tgaatggaat tttagtattt
tgctgtcgtt 180ttaatatttt cgtctggtct tcctcagttc ttccagacgc
tttctgagag aatgggggca 240ggagctctag ccatctgtca aagtaaagca
gcggttcggc tgaaagaaga catgaaaaag 300atagtggcag tgccattaaa
tgaacagaag gattttacct atcagaagtt atttggagtc 360agtctccaag
aacttgaacg gcaggggctc accgagaatg gcattccagc agtagtgtgg
420aatatagtgg aatatttgac gcagcatgga cttacccaag aaggtctttt
tagggtgaat 480ggtaacgtga aggtggtgga acaacttcga ctgaagttcg
agagtggagt gcccgtggag 540ctcgggaagg acggtgatgt ctgctcagca
gccagtctgt tgaagctgtt tctgagggag 600ctgcctgaca gtctgatcac
ctcagcgttg cagcctcgat tcattcaact ctttcaggat 660ggcagaaatg
atgttcagga gagtagctta agagacttaa taaaagagct gccagacacc
720cactactgcc tcctcaagta cctttgccag ttcttgacaa aagtagccaa
gcatcatgtg 780cagaatcgca tgaatgttca caatctcgcc actgtatttg
ggccaaattg ctttcatgtg 840ccacctgggc ttgaaggcat gaaggaacag
gacctgtgca acaagataat ggctaaaatt 900ctagaaaatt acaataccct
gtttgaagta gagtatacag aaaatgatca tctgagatgt 960gaaaacctgg
ctaggcttat catagtaaaa gaggtctatt ataagaactc cctgcccatc
1020cttttaacaa gaggcttaga aagagacatg ccaaaaccac ctccaaaaac
caagatccca 1080aaatccagga gtgagggatc tattcaggcc cacagagtac
tgcaaccaga gctatctgat 1140ggcattcctc agctcagctt gcggctaagt
tatagaaaag cctgcttgga agacatgaat 1200tcagcagagg gtgctattag
tgccaagttg gtacccagtt cacaggaaga tgaaagacct 1260ctgtcacctt
tctatttgag tgctcatgta ccccaagtca gcaatgtgtc tgcaaccgga
1320gaactcttag aaagaaccat ccgatcagct gtagaacaac atctttttga
tgttaataac 1380tctggaggtc aaagttcaga ggactcagaa tctggaacac
tatcagcatc ttctgccaca 1440tctgccagac agcgccgccg ccagtccaag
gagcaggatg aagttcgaca tgggagagac 1500aagggactta tcaacaaaga
aaatactcct tctgggttca accaccttga tgattgtatt 1560ttgaatactc
aggaagtcga aaaggtacac aaaaatactt ttggttgtgc tggagaaagg
1620agcaagccta aacgtcagaa atccagtact aaactttctg agcttcatga
caatcaggac 1680ggtcttgtga atatggaaag tctcaattcc acacgatctc
atgagagaac tggacctgat 1740gattttgaat ggatgtctga tgaaaggaaa
ggaaatgaaa aagatggtgg acacactcag 1800cattttgaga gccccacaat
gaagatccag gagcatccca gcctatctga caccaaacag 1860cagagaaatc
aagatgccgg tgaccaggag gagagctttg tctccgaagt gccccagtcg
1920gacctgactg cattgtgtga tgaaaagaac tgggaagagc ctatccctgc
tttctcctcc 1980tggcagcggg agaacagtga ctctgatgaa gcccacctct
cgccgcaggc tgggcgcctg 2040atccgtcagc tgctggacga agacagcgac
cccatgctct ctcctcggtt ctacgcttat 2100gggcagagca ggcaatacct
ggatgacaca gaagtgcctc cttccccacc aaactcccat 2160tctttcatga
ggcggcgaag ctcctctctg gggtcctatg atgatgagca agaggacctg
2220acacctgccc agctcacacg aaggattcag agccttaaaa agaagatccg
gaagtttgaa 2280gatagattcg aagaagagaa gaagtacaga ccttcccaca
gtgacaaagc agccaatccg 2340gaggttctga aatggacaaa tgaccttgcc
aaattccgga gacaacttaa agaatcaaaa 2400ctaaagatat ctgaagagga
cctaactccc aggatgcggc agcgaagcaa cacactcccc 2460aagagttttg
gttcccaact tgagaaagaa gatgagaaga agcaagagct ggtggataaa
2520gcaataaagc ccagtgttga agccacattg gaatctattc agaggaagct
ccaggagaag 2580cgagcggaaa gcagccgccc tgaggacatt aaggatatga
ccaaagacca gattgctaat 2640gagaaagtgg ctctgcagaa agctctgtta
tattatgaaa gcattcatgg acggccggta 2700acaaagaacg aacggcaggt
gatgaagcca ctatacgaca ggtaccggct ggtcaaacag 2760atcctctccc
gagctaacac catacccatc attggttccc cctccagcaa gcggagaagc
2820cctttgctgc agccaattat cgagggcgaa actgcttcct tcttcaagga
gataaaggaa 2880gaagaggagg ggtcagaaga cgatagcaat gtgaagccag
acttcatggt cactctgaaa 2940accgatttca gtgcacgatg ctttctggac
caattcgaag atgacgctga tggatttatt 3000tccccaatgg atgataaaat
accatcaaaa tgcagccagg acacagggct ttcaaatctc 3060catgctgcct
caatacctga actcctggaa cacctccagg aaatgagaga agaaaagaaa
3120aggattcgaa agaaacttcg ggattttgaa gacaactttt tcagacagaa
tggaagaaat 3180gtccagaagg aagaccgcac tcctatggct gaagaataca
gtgaatataa gcacataaag 3240gcgaaactga ggctcctgga ggtgctcatc
agcaagagag acactgattc caagtccatg 3300tgaggggcat ggccaagcac
agggggctgg cagctgcggt gagagtttac tgtccccaga 3360gaaagtgcag
ctctggaagg cagccttggg gctggccctg caaagcatgc agcccttctg
3420cctctagacc atttggcatc ggctcctgtt tccattgcct gccttagaaa
ctggctggaa 3480gaagacaatg tgacctgact taggcatttt gtaattggaa
agtcaagact gcagtatgtg 3540cacatgcgca cgcgcatgca cgcacacaca
cacacagtag tggagctttc ctaacactag 3600cagagattaa tcactacatt
agacaacact catctacaga gaatatacac tgttcttccc 3660tggataactg
agaaacaaga gaccattctc tgtctaactg tgataaaaac aagctcagga
3720ctttattcta tagagcaaac ttgctgtgga gggccatgct ctccttggac
ccagttaact 3780gcaaacgtgc attggagccc tatttgctgc cgctgccatt
ctagtgacct ttccacagag 3840ctgcgccttc ctcacgtgtg tgaaaggttt
tccccttcag ccctcaggta gatggaagct 3900gcatctgccc acgatggcag
tgcagtcatc atcttcagga tgtttcttca ggacttcctc 3960agctgacaag
gaattttggt ccctgcctag gaccgggtca tctgcagagg acagagagat
4020ggtaagcagc tgtatgaatg ctgattttaa aaccaggtca tgggagaaga
gcctggagat 4080tctttcctga acactgactg cacttaccag tctgatttta
tcgtcaaaca ccaagccagg 4140ctagcatgct catggcaatc tgtttggggc
tgttttgttg tggcactagc caaacataaa 4200ggggcttaag tcagcctgca
tacagaggat cggggagaga aggggcctgt gttctcagcc 4260tcctgagtac
ttaccagagt ttaatttttt taaaaaaaat ctgcactaaa atccccaaac
4320tgacaggtaa atgtagccct cagagctcag cccaaggcag aatctaaatc
acactatttt 4380cgagatcatg tataaaaaga aaaaaaagaa gtcatgctgt
gtggccaatt ataatttttt 4440tcaaagactt tgtcacaaaa ctgtctatat
tagacatttt ggagggacca ggaaatgtaa 4500gacaccaaat cctccatctc
ttcagtgtgc ctgatgtcac ctcatgattt gctgttactt 4560ttttaactcc
tgcgccaagg acagtgggtt ctgtgtccac ctttgtgctt tgcgaggccg
4620agcccaggca tctgctcgcc tgccacggct gaccagagaa ggtgcttcag
gagctctgcc 4680ttagacgacg tgttacagta tgaacacaca gcagaggcac
cctcgtatgt tttgaaagtt 4740gccttctgaa agggcacagt tttaaggaaa
agaaaaagaa tgtaaaacta tactgacccg 4800ttttcagttt taaagggtcg
tgagaaactg gctggtccaa tgggatttac agcaacattt 4860tccattgctg
aagtgaggta gcagctctct tctgtcagct gaatgttaag gatggggaaa
4920aagaatgcct ttaagtttgc tcttaatcgt atggaagctt gagctatgtg
ttggaagtgc 4980cctggtttta atccatacac aaagacggta cataatccta
caggtttaaa tgtacataaa 5040aatatagttt ggaattcttt gctctactgt
ttacattgca gattgctata atttcaagga 5100gtgagattat aaataaaatg
atgcacttta ggatgtttcc tatttttgaa atctgaacat 5160gaatcattca
catgaccaaa aattgtgttt ttttaaaaat acatgtctag tctgtccttt
5220aatagctctc ttaaataagc tatgatatta atcagatcat taccagttag
cttttaaagc 5280acatttgttt aagactatgt ttttggaaaa atacgctaca
gaattttttt ttaagctaca 5340aataaatgag atgctactaa ttgttttgga
atctgttgtt tctgccaaag gtaaattaac 5400taaagattta ttcaggaatc
cccatttgaa tttgtatgat tcaataaaag aaaacaccaa 5460gtaagttata
taaaataaat tgtgtatgag atgttgtgtt ttcctttgta atttccacta
5520actaactaac taacttatat tcttcatgga atggagccca gaagaaatga
gaggaagccc 5580ttttcacact agatcttatt tgaagaaatg tttgttagtc
agtcagtcag tggtttctgg 5640ctctgccgag ggagatgtgt tccccagcaa
ccatttctgc agcccagaat ctcaaggcac 5700tagaggcggt gtcttaatta
attggcttca caaagacaaa atgctctgga ctgggatttt 5760tcctttgctg
tgttgggaat atgtgtttat taattagcac atgccaacaa aataaatgtc
5820aagagttatt tcataagtgt aagtaaactt aagaattaaa gagtgcagac
ttataatttt 5880ca 5882
37 1023 PRT Homo sapiens
37 Met Gly Ala Gly Ala
Leu Ala Ile Cys Gln Ser Lys Ala Ala Val Arg1 5 10 15Leu Lys Glu Asp
Met Lys Lys Ile Val Ala Val Pro Leu Asn Glu Gln 20 25 30Lys Asp Phe
Thr Tyr Gln Lys Leu Phe Gly Val Ser Leu Gln Glu Leu 35 40 45Glu Arg
Gln Gly Leu Thr Glu Asn Gly Ile Pro Ala Val Val Trp Asn 50 55 60Ile
Val Glu Tyr Leu Thr Gln His Gly Leu Thr Gln Glu Gly Leu Phe65 70 75
80Arg Val Asn Gly Asn Val Lys Val Val Glu Gln Leu Arg Leu Lys Phe
85 90 95Glu Ser Gly Val Pro Val Glu Leu Gly Lys Asp Gly Asp Val Cys
Ser 100 105 110Ala Ala Ser Leu Leu Lys Leu Phe Leu Arg Glu Leu Pro
Asp Ser Leu 115 120 125Ile Thr Ser Ala Leu Gln Pro Arg Phe Ile Gln
Leu Phe Gln Asp Gly 130 135 140Arg Asn Asp Val Gln Glu Ser Ser Leu Arg Asp Leu Ile Lys Glu
Leu145 150 155 160Pro Asp Thr His Tyr Cys Leu Leu Lys Tyr Leu Cys
Gln Phe Leu Thr 165 170 175Lys Val Ala Lys His His Val Gln Asn Arg
Met Asn Val His Asn Leu 180 185 190Ala Thr Val Phe Gly Pro Asn Cys
Phe His Val Pro Pro Gly Leu Glu 195 200 205Gly Met Lys Glu Gln Asp
Leu Cys Asn Lys Ile Met Ala Lys Ile Leu 210 215 220Glu Asn Tyr Asn
Thr Leu Phe Glu Val Glu Tyr Thr Glu Asn Asp His225 230 235 240Leu
Arg Cys Glu Asn Leu Ala Arg Leu Ile Ile Val Lys Glu Val Tyr 245 250
255Tyr Lys Asn Ser Leu Pro Ile Leu Leu Thr Arg Gly Leu Glu Arg Asp
260 265 270Met Pro Lys Pro Pro Pro Lys Thr Lys Ile Pro Lys Ser Arg
Ser Glu 275 280 285Gly Ser Ile Gln Ala His Arg Val Leu Gln Pro Glu
Leu Ser Asp Gly 290 295 300Ile Pro Gln Leu Ser Leu Arg Leu Ser Tyr
Arg Lys Ala Cys Leu Glu305 310 315 320Asp Met Asn Ser Ala Glu Gly
Ala Ile Ser Ala Lys Leu Val Pro Ser 325 330 335Ser Gln Glu Asp Glu
Arg Pro Leu Ser Pro Phe Tyr Leu Ser Ala His 340 345 350Val Pro Gln
Val Ser Asn Val Ser Ala Thr Gly Glu Leu Leu Glu Arg 355 360 365Thr
Ile Arg Ser Ala Val Glu Gln His Leu Phe Asp Val Asn Asn Ser 370 375
380Gly Gly Gln Ser Ser Glu Asp Ser Glu Ser Gly Thr Leu Ser Ala
Ser385 390 395 400Ser Ala Thr Ser Ala Arg Gln Arg Arg Arg Gln Ser
Lys Glu Gln Asp 405 410 415Glu Val Arg His Gly Arg Asp Lys Gly Leu
Ile Asn Lys Glu Asn Thr 420 425 430Pro Ser Gly Phe Asn His Leu Asp
Asp Cys Ile Leu Asn Thr Gln Glu 435 440 445Val Glu Lys Val His Lys
Asn Thr Phe Gly Cys Ala Gly Glu Arg Ser 450 455 460Lys Pro Lys Arg
Gln Lys Ser Ser Thr Lys Leu Ser Glu Leu His Asp465 470 475 480Asn
Gln Asp Gly Leu Val Asn Met Glu Ser Leu Asn Ser Thr Arg Ser 485 490
495His Glu Arg Thr Gly Pro Asp Asp Phe Glu Trp Met Ser Asp Glu Arg
500 505 510Lys Gly Asn Glu Lys Asp Gly Gly His Thr Gln His Phe Glu
Ser Pro 515 520 525Thr Met Lys Ile Gln Glu His Pro Ser Leu Ser Asp
Thr Lys Gln Gln 530 535 540Arg Asn Gln Asp Ala Gly Asp Gln Glu Glu
Ser Phe Val Ser Glu Val545 550 555 560Pro Gln Ser Asp Leu Thr Ala
Leu Cys Asp Glu Lys Asn Trp Glu Glu 565 570 575Pro Ile Pro Ala Phe
Ser Ser Trp Gln Arg Glu Asn Ser Asp Ser Asp 580 585 590Glu Ala His
Leu Ser Pro Gln Ala Gly Arg Leu Ile Arg Gln Leu Leu 595 600 605Asp
Glu Asp Ser Asp Pro Met Leu Ser Pro Arg Phe Tyr Ala Tyr Gly 610 615
620Gln Ser Arg Gln Tyr Leu Asp Asp Thr Glu Val Pro Pro Ser Pro
Pro625 630 635 640Asn Ser His Ser Phe Met Arg Arg Arg Ser Ser Ser
Leu Gly Ser Tyr 645 650 655Asp Asp Glu Gln Glu Asp Leu Thr Pro Ala
Gln Leu Thr Arg Arg Ile 660 665 670Gln Ser Leu Lys Lys Lys Ile Arg
Lys Phe Glu Asp Arg Phe Glu Glu 675 680 685Glu Lys Lys Tyr Arg Pro
Ser His Ser Asp Lys Ala Ala Asn Pro Glu 690 695 700Val Leu Lys Trp
Thr Asn Asp Leu Ala Lys Phe Arg Arg Gln Leu Lys705 710 715 720Glu
Ser Lys Leu Lys Ile Ser Glu Glu Asp Leu Thr Pro Arg Met Arg 725 730
735Gln Arg Ser Asn Thr Leu Pro Lys Ser Phe Gly Ser Gln Leu Glu Lys
740 745 750Glu Asp Glu Lys Lys Gln Glu Leu Val Asp Lys Ala Ile Lys
Pro Ser 755 760 765Val Glu Ala Thr Leu Glu Ser Ile Gln Arg Lys Leu
Gln Glu Lys Arg 770 775 780Ala Glu Ser Ser Arg Pro Glu Asp Ile Lys
Asp Met Thr Lys Asp Gln785 790 795 800Ile Ala Asn Glu Lys Val Ala
Leu Gln Lys Ala Leu Leu Tyr Tyr Glu 805 810 815Ser Ile His Gly Arg
Pro Val Thr Lys Asn Glu Arg Gln Val Met Lys 820 825 830Pro Leu Tyr
Asp Arg Tyr Arg Leu Val Lys Gln Ile Leu Ser Arg Ala 835 840 845Asn
Thr Ile Pro Ile Ile Gly Ser Pro Ser Ser Lys Arg Arg Ser Pro 850 855
860Leu Leu Gln Pro Ile Ile Glu Gly Glu Thr Ala Ser Phe Phe Lys
Glu865 870 875 880Ile Lys Glu Glu Glu Glu Gly Ser Glu Asp Asp Ser
Asn Val Lys Pro 885 890 895Asp Phe Met Val Thr Leu Lys Thr Asp Phe
Ser Ala Arg Cys Phe Leu 900 905 910Asp Gln Phe Glu Asp Asp Ala Asp
Gly Phe Ile Ser Pro Met Asp Asp 915 920 925Lys Ile Pro Ser Lys Cys
Ser Gln Asp Thr Gly Leu Ser Asn Leu His 930 935 940Ala Ala Ser Ile
Pro Glu Leu Leu Glu His Leu Gln Glu Met Arg Glu945 950 955 960Glu
Lys Lys Arg Ile Arg Lys Lys Leu Arg Asp Phe Glu Asp Asn Phe 965 970
975Phe Arg Gln Asn Gly Arg Asn Val Gln Lys Glu Asp Arg Thr Pro Met
980 985 990Ala Glu Glu Tyr Ser Glu Tyr Lys His Ile Lys Ala Lys Leu
Arg Leu 995 1000 1005Leu Glu Val Leu Ile Ser Lys Arg Asp Thr Asp
Ser Lys Ser Met 1010 1015 1020
38 4988 DNA Homo sapiens
38 attgaggagc
agaaggagta gggtgcgggg gaggaggagg agcgccttta gtgctgcagc 60agctgctgct
ctgattggcc cggtggttca gctgcttccc tggaacaaaa ggtcaaagtg
120gactgcagtg taaatgtaga gaagcagccg ataaaatagc attgcctgaa
gaagtttgga 180ggctgagagc agcagtagac tggccaactg cagagcaagt
tgtttctcca gccgtgcggt 240gcagcctcat gcccccaacc cagcttagcc
actgtaagaa gacgttcact gtacagacga 300ccaaacttgc cgtggaagag
acagttgtga gattcccttg caaatttaca tacgagaatg 360gcttgtgaaa
tcatgcctct gcaaagtgct catgtacccc aagtcagcaa tgtgtctgca
420accggagaac tcttagaaag aaccatccga tcagctgtag aacaacatct
ttttgatgtt 480aataactctg gaggtcaaag ttcagaggac tcagaatctg
gaacactatc agcatcttct 540gccacatctg ccagacagcg ccgccgccag
tccaaggagc aggatgaagt tcgacatggg 600agagacaagg gacttatcaa
caaagaaaat actccttctg ggttcaacca ccttgatgat 660tgtattttga
atactcagga agtcgaaaag gtacacaaaa atacttttgg ttgtgctgga
720gaaaggagca agcctaaacg tcagaaatcc agtactaaac tttctgagct
tcatgacaat 780caggacggtc ttgtgaatat ggaaagtctc aattccacac
gatctcatga gagaactgga 840cctgatgatt ttgaatggat gtctgatgaa
aggaaaggaa atgaaaaaga tggtggacac 900actcagcatt ttgagagccc
cacaatgaag atccaggagc atcccagcct atctgacacc 960aaacagcaga
gaaatcaaga tgccggtgac caggaggaga gctttgtctc cgaagtgccc
1020cagtcggacc tgactgcatt gtgtgatgaa aagaactggg aagagcctat
ccctgctttc 1080tcctcctggc agcgggagaa cagtgactct gatgaagccc
acctctcgcc gcaggctggg 1140cgcctgatcc gtcagctgct ggacgaagac
agcgacccca tgctctctcc tcggttctac 1200gcttatgggc agagcaggca
atacctggat gacacagaag tgcctccttc cccaccaaac 1260tcccattctt
tcatgaggcg gcgaagctcc tctctggggt cctatgatga tgagcaagag
1320gacctgacac ctgcccagct cacacgaagg attcagagcc ttaaaaagaa
gatccggaag 1380tttgaagata gattcgaaga agagaagaag tacagacctt
cccacagtga caaagcagcc 1440aatccggagg ttctgaaatg gacaaatgac
cttgccaaat tccggagaca acttaaagaa 1500tcaaaactaa agatatctga
agaggaccta actcccagga tgcggcagcg aagcaacaca 1560ctccccaaga
gttttggttc ccaacttgag aaagaagatg agaagaagca agagctggtg
1620gataaagcaa taaagcccag tgttgaagcc acattggaat ctattcagag
gaagctccag 1680gagaagcgag cggaaagcag ccgccctgag gacattaagg
atatgaccaa agaccagatt 1740gctaatgaga aagtggctct gcagaaagct
ctgttatatt atgaaagcat tcatggacgg 1800ccggtaacaa agaacgaacg
gcaggtgatg aagccactat acgacaggta ccggctggtc 1860aaacagatcc
tctcccgagc taacaccata cccatcattg gttccccctc cagcaagcgg
1920agaagccctt tgctgcagcc aattatcgag ggcgaaactg cttccttctt
caaggagata 1980aaggaagaag aggaggggtc agaagacgat agcaatgtga
agccagactt catggtcact 2040ctgaaaaccg atttcagtgc acgatgcttt
ctggaccaat tcgaagatga cgctgatgga 2100tttatttccc caatggatga
taaaatacca tcaaaatgca gccaggacac agggctttca 2160aatctccatg
ctgcctcaat acctgaactc ctggaacacc tccaggaaat gagagaagaa
2220aagaaaagga ttcgaaagaa acttcgggat tttgaagaca actttttcag
acagaatgga 2280agaaatgtcc agaaggaaga ccgcactcct atggctgaag
aatacagtga atataagcac 2340ataaaggcga aactgaggct cctggaggtg
ctcatcagca agagagacac tgattccaag 2400tccatgtgag gggcatggcc
aagcacaggg ggctggcagc tgcggtgaga gtttactgtc 2460cccagagaaa
gtgcagctct ggaaggcagc cttggggctg gccctgcaaa gcatgcagcc
2520cttctgcctc tagaccattt ggcatcggct cctgtttcca ttgcctgcct
tagaaactgg 2580ctggaagaag acaatgtgac ctgacttagg cattttgtaa
ttggaaagtc aagactgcag 2640tatgtgcaca tgcgcacgcg catgcacgca
cacacacaca cagtagtgga gctttcctaa 2700cactagcaga gattaatcac
tacattagac aacactcatc tacagagaat atacactgtt 2760cttccctgga
taactgagaa acaagagacc attctctgtc taactgtgat aaaaacaagc
2820tcaggacttt attctataga gcaaacttgc tgtggagggc catgctctcc
ttggacccag 2880ttaactgcaa acgtgcattg gagccctatt tgctgccgct
gccattctag tgacctttcc 2940acagagctgc gccttcctca cgtgtgtgaa
aggttttccc cttcagccct caggtagatg 3000gaagctgcat ctgcccacga
tggcagtgca gtcatcatct tcaggatgtt tcttcaggac 3060ttcctcagct
gacaaggaat tttggtccct gcctaggacc gggtcatctg cagaggacag
3120agagatggta agcagctgta tgaatgctga ttttaaaacc aggtcatggg
agaagagcct 3180ggagattctt tcctgaacac tgactgcact taccagtctg
attttatcgt caaacaccaa 3240gccaggctag catgctcatg gcaatctgtt
tggggctgtt ttgttgtggc actagccaaa 3300cataaagggg cttaagtcag
cctgcataca gaggatcggg gagagaaggg gcctgtgttc 3360tcagcctcct
gagtacttac cagagtttaa tttttttaaa aaaaatctgc actaaaatcc
3420ccaaactgac aggtaaatgt agccctcaga gctcagccca aggcagaatc
taaatcacac 3480tattttcgag atcatgtata aaaagaaaaa aaagaagtca
tgctgtgtgg ccaattataa 3540tttttttcaa agactttgtc acaaaactgt
ctatattaga cattttggag ggaccaggaa 3600atgtaagaca ccaaatcctc
catctcttca gtgtgcctga tgtcacctca tgatttgctg 3660ttactttttt
aactcctgcg ccaaggacag tgggttctgt gtccaccttt gtgctttgcg
3720aggccgagcc caggcatctg ctcgcctgcc acggctgacc agagaaggtg
cttcaggagc 3780tctgccttag acgacgtgtt acagtatgaa cacacagcag
aggcaccctc gtatgttttg 3840aaagttgcct tctgaaaggg cacagtttta
aggaaaagaa aaagaatgta aaactatact 3900gacccgtttt cagttttaaa
gggtcgtgag aaactggctg gtccaatggg atttacagca 3960acattttcca
ttgctgaagt gaggtagcag ctctcttctg tcagctgaat gttaaggatg
4020gggaaaaaga atgcctttaa gtttgctctt aatcgtatgg aagcttgagc
tatgtgttgg 4080aagtgccctg gttttaatcc atacacaaag acggtacata
atcctacagg tttaaatgta 4140cataaaaata tagtttggaa ttctttgctc
tactgtttac attgcagatt gctataattt 4200caaggagtga gattataaat
aaaatgatgc actttaggat gtttcctatt tttgaaatct 4260gaacatgaat
cattcacatg accaaaaatt gtgttttttt aaaaatacat gtctagtctg
4320tcctttaata gctctcttaa ataagctatg atattaatca gatcattacc
agttagcttt 4380taaagcacat ttgtttaaga ctatgttttt ggaaaaatac
gctacagaat ttttttttaa 4440gctacaaata aatgagatgc tactaattgt
tttggaatct gttgtttctg ccaaaggtaa 4500attaactaaa gatttattca
ggaatcccca tttgaatttg tatgattcaa taaaagaaaa 4560caccaagtaa
gttatataaa ataaattgtg tatgagatgt tgtgttttcc tttgtaattt
4620ccactaacta actaactaac ttatattctt catggaatgg agcccagaag
aaatgagagg 4680aagccctttt cacactagat cttatttgaa gaaatgtttg
ttagtcagtc agtcagtggt 4740ttctggctct gccgagggag atgtgttccc
cagcaaccat ttctgcagcc cagaatctca 4800aggcactaga ggcggtgtct
taattaattg gcttcacaaa gacaaaatgc tctggactgg 4860gatttttcct
ttgctgtgtt gggaatatgt gtttattaat tagcacatgc caacaaaata
4920aatgtcaaga gttatttcat aagtgtaagt aaacttaaga attaaagagt
gcagacttat 4980aattttca 4988
39 683 PRT Homo sapiens
39 Met Ala Cys Glu
Ile Met Pro Leu Gln Ser Ala His Val Pro Gln Val1 5 10 15Ser Asn Val
Ser Ala Thr Gly Glu Leu Leu Glu Arg Thr Ile Arg Ser 20 25 30Ala Val
Glu Gln His Leu Phe Asp Val Asn Asn Ser Gly Gly Gln Ser 35 40 45Ser
Glu Asp Ser Glu Ser Gly Thr Leu Ser Ala Ser Ser Ala Thr Ser 50 55
60Ala Arg Gln Arg Arg Arg Gln Ser Lys Glu Gln Asp Glu Val Arg His65
70 75 80Gly Arg Asp Lys Gly Leu Ile Asn Lys Glu Asn Thr Pro Ser Gly
Phe 85 90 95Asn His Leu Asp Asp Cys Ile Leu Asn Thr Gln Glu Val Glu
Lys Val 100 105 110His Lys Asn Thr Phe Gly Cys Ala Gly Glu Arg Ser
Lys Pro Lys Arg 115 120 125Gln Lys Ser Ser Thr Lys Leu Ser Glu Leu
His Asp Asn Gln Asp Gly 130 135 140Leu Val Asn Met Glu Ser Leu Asn
Ser Thr Arg Ser His Glu Arg Thr145 150 155 160Gly Pro Asp Asp Phe
Glu Trp Met Ser Asp Glu Arg Lys Gly Asn Glu 165 170 175Lys Asp Gly
Gly His Thr Gln His Phe Glu Ser Pro Thr Met Lys Ile 180 185 190Gln
Glu His Pro Ser Leu Ser Asp Thr Lys Gln Gln Arg Asn Gln Asp 195 200
205Ala Gly Asp Gln Glu Glu Ser Phe Val Ser Glu Val Pro Gln Ser Asp
210 215 220Leu Thr Ala Leu Cys Asp Glu Lys Asn Trp Glu Glu Pro Ile
Pro Ala225 230 235 240Phe Ser Ser Trp Gln Arg Glu Asn Ser Asp Ser
Asp Glu Ala His Leu 245 250 255Ser Pro Gln Ala Gly Arg Leu Ile Arg
Gln Leu Leu Asp Glu Asp Ser 260 265 270Asp Pro Met Leu Ser Pro Arg
Phe Tyr Ala Tyr Gly Gln Ser Arg Gln 275 280 285Tyr Leu Asp Asp Thr
Glu Val Pro Pro Ser Pro Pro Asn Ser His Ser 290 295 300Phe Met Arg
Arg Arg Ser Ser Ser Leu Gly Ser Tyr Asp Asp Glu Gln305 310 315
320Glu Asp Leu Thr Pro Ala Gln Leu Thr Arg Arg Ile Gln Ser Leu Lys
325 330 335Lys Lys Ile Arg Lys Phe Glu Asp Arg Phe Glu Glu Glu Lys
Lys Tyr 340 345 350Arg Pro Ser His Ser Asp Lys Ala Ala Asn Pro Glu
Val Leu Lys Trp 355 360 365Thr Asn Asp Leu Ala Lys Phe Arg Arg Gln
Leu Lys Glu Ser Lys Leu 370 375 380Lys Ile Ser Glu Glu Asp Leu Thr
Pro Arg Met Arg Gln Arg Ser Asn385 390 395 400Thr Leu Pro Lys Ser
Phe Gly Ser Gln Leu Glu Lys Glu Asp Glu Lys 405 410 415Lys Gln Glu
Leu Val Asp Lys Ala Ile Lys Pro Ser Val Glu Ala Thr 420 425 430Leu
Glu Ser Ile Gln Arg Lys Leu Gln Glu Lys Arg Ala Glu Ser Ser 435 440
445Arg Pro Glu Asp Ile Lys Asp Met Thr Lys Asp Gln Ile Ala Asn Glu
450 455 460Lys Val Ala Leu Gln Lys Ala Leu Leu Tyr Tyr Glu Ser Ile
His Gly465 470 475 480Arg Pro Val Thr Lys Asn Glu Arg Gln Val Met
Lys Pro Leu Tyr Asp 485 490 495Arg Tyr Arg Leu Val Lys Gln Ile Leu
Ser Arg Ala Asn Thr Ile Pro 500 505 510Ile Ile Gly Ser Pro Ser Ser
Lys Arg Arg Ser Pro Leu Leu Gln Pro 515 520 525Ile Ile Glu Gly Glu
Thr Ala Ser Phe Phe Lys Glu Ile Lys Glu Glu 530 535 540Glu Glu Gly
Ser Glu Asp Asp Ser Asn Val Lys Pro Asp Phe Met Val545 550 555
560Thr Leu Lys Thr Asp Phe Ser Ala Arg Cys Phe Leu Asp Gln Phe Glu
565 570 575Asp Asp Ala Asp Gly Phe Ile Ser Pro Met Asp Asp Lys Ile
Pro Ser 580 585 590Lys Cys Ser Gln Asp Thr Gly Leu Ser Asn Leu His
Ala Ala Ser Ile 595 600 605Pro Glu Leu Leu Glu His Leu Gln Glu Met
Arg Glu Glu Lys Lys Arg 610 615 620Ile Arg Lys Lys Leu Arg Asp Phe
Glu Asp Asn Phe Phe Arg Gln Asn625 630 635 640Gly Arg Asn Val Gln
Lys Glu Asp Arg Thr Pro Met Ala Glu Glu Tyr 645 650 655Ser Glu Tyr
Lys His Ile Lys Ala Lys Leu Arg Leu Leu Glu Val Leu 660 665 670Ile
Ser Lys Arg Asp Thr Asp Ser Lys Ser Met 675 680
40 4946 DNA Homo sapiens
40 attgaggagc agaaggagta gggtgcgggg gaggaggagg agcgccttta
gtgctgcagc 60agctgctgct ctgattggcc cggtggttca gctgcttccc tggaacaaaa
ggtcaaagtg 120gactgcagtg taaatgtaga gaagcagccg ataaaatagc
attgcctgaa gaagtttgga 180ggctgagagc agcagtagac tggccaactg
cagagcaagt tgtttctcca gccgtgcggt 240gcagcctcat gcccccaacc
cagcttagcc actgtaagaa gacgttcact gtacagacga 300ccaaacttgc
cgtggaagag acagttgtga gattcccttg caaatttaca tacgagaatg
360gcttgtgaaa tcatgcctct gcaaagttca caggaagatg
aaagacctct gtcacctttc 420tatttgagtg ctcatgtacc ccaagtcagc
aatgtgtctg caaccggaga actcttagaa 480agaaccatcc gatcagctgt
agaacaacat ctttttgatg ttaataactc tggaggtcaa 540agttcagagg
actcagaatc tggaacacta tcagcatctt ctgccacatc tgccagacag
600cgccgccgcc agtccaagga gcaggatgaa gttcgacatg ggagagacaa
gggacttatc 660aacaaagaaa atactccttc tgggttcaac caccttgatg
attgtatttt gaatactcag 720gaagtcgaaa aggtacacaa aaatactttt
ggttgtgctg gagaaaggag caagcctaaa 780cgtcagaaat ccagtactaa
actttctgag cttcatgaca atcaggacgg tcttgtgaat 840atggaaagtc
tcaattccac acgatctcat gagagaactg gacctgatga ttttgaatgg
900atgtctgatg aaaggaaagg aaatgaaaaa gatggtggac acactcagca
ttttgagagc 960cccacaatga agatccagga gcatcccagc ctatctgaca
ccaaacagca gagaaatcaa 1020gatgccggtg accaggagga gagctttgtc
tccgaagtgc cccagtcgga cctgactgca 1080ttgtgtgatg aaaagaactg
ggaagagcct atccctgctt tctcctcctg gcagcgggag 1140aacagtgact
ctgatgaagc ccacctctcg ccgcaggctg ggcgcctgat ccgtcagctg
1200ctggacgaag acagcgaccc catgctctct cctcggttct acgcttatgg
gcagagcagg 1260caatacctgg atgacacaga agtgcctcct tccccaccaa
actcccattc tttcatgagg 1320cggcgaagct cctctctggg gtcctatgat
gatgagcaag aggacctgac acctgcccag 1380ctcacacgaa ggattcagag
ccttaaaaag aagatccgga agtttgaaga tagattcgaa 1440gaagagaaga
agtacagacc ttcccacagt gacaaagcag ccaatccgga ggttctgaaa
1500tggacaaatg accttgccaa attccggaga caacttaaag aatcaaaact
aaagatatct 1560gaagaggacc taactcccag gatgcggcag cgaagcaaca
cactccccaa gagttttggt 1620tcccaacttg agaaagaaga tgagaagaag
caagagctgg tggataaagc aataaagccc 1680agtgttgaag ccacattgga
atctattcag aggaagctcc aggagaagcg agcggaaagc 1740agccgccctg
aggacattaa ggatatgacc aaagaccaga ttgctaatga gaaagtggct
1800ctgcagaaag ctctgttata ttatgaaagc attcatggac ggccggtaac
aaagaacgaa 1860cggcaggtga tgaagccact atacgacagg taccggctgg
tcaaacagat cctctcccga 1920gctaacacca tacccatcat tgaagaagag
gaggggtcag aagacgatag caatgtgaag 1980ccagacttca tggtcactct
gaaaaccgat ttcagtgcac gatgctttct ggaccaattc 2040gaagatgacg
ctgatggatt tatttcccca atggatgata aaataccatc aaaatgcagc
2100caggacacag ggctttcaaa tctccatgct gcctcaatac ctgaactcct
ggaacacctc 2160caggaaatga gagaagaaaa gaaaaggatt cgaaagaaac
ttcgggattt tgaagacaac 2220tttttcagac agaatggaag aaatgtccag
aaggaagacc gcactcctat ggctgaagaa 2280tacagtgaat ataagcacat
aaaggcgaaa ctgaggctcc tggaggtgct catcagcaag 2340agagacactg
attccaagtc catgtgaggg gcatggccaa gcacaggggg ctggcagctg
2400cggtgagagt ttactgtccc cagagaaagt gcagctctgg aaggcagcct
tggggctggc 2460cctgcaaagc atgcagccct tctgcctcta gaccatttgg
catcggctcc tgtttccatt 2520gcctgcctta gaaactggct ggaagaagac
aatgtgacct gacttaggca ttttgtaatt 2580ggaaagtcaa gactgcagta
tgtgcacatg cgcacgcgca tgcacgcaca cacacacaca 2640gtagtggagc
tttcctaaca ctagcagaga ttaatcacta cattagacaa cactcatcta
2700cagagaatat acactgttct tccctggata actgagaaac aagagaccat
tctctgtcta 2760actgtgataa aaacaagctc aggactttat tctatagagc
aaacttgctg tggagggcca 2820tgctctcctt ggacccagtt aactgcaaac
gtgcattgga gccctatttg ctgccgctgc 2880cattctagtg acctttccac
agagctgcgc cttcctcacg tgtgtgaaag gttttcccct 2940tcagccctca
ggtagatgga agctgcatct gcccacgatg gcagtgcagt catcatcttc
3000aggatgtttc ttcaggactt cctcagctga caaggaattt tggtccctgc
ctaggaccgg 3060gtcatctgca gaggacagag agatggtaag cagctgtatg
aatgctgatt ttaaaaccag 3120gtcatgggag aagagcctgg agattctttc
ctgaacactg actgcactta ccagtctgat 3180tttatcgtca aacaccaagc
caggctagca tgctcatggc aatctgtttg gggctgtttt 3240gttgtggcac
tagccaaaca taaaggggct taagtcagcc tgcatacaga ggatcgggga
3300gagaaggggc ctgtgttctc agcctcctga gtacttacca gagtttaatt
tttttaaaaa 3360aaatctgcac taaaatcccc aaactgacag gtaaatgtag
ccctcagagc tcagcccaag 3420gcagaatcta aatcacacta ttttcgagat
catgtataaa aagaaaaaaa agaagtcatg 3480ctgtgtggcc aattataatt
tttttcaaag actttgtcac aaaactgtct atattagaca 3540ttttggaggg
accaggaaat gtaagacacc aaatcctcca tctcttcagt gtgcctgatg
3600tcacctcatg atttgctgtt acttttttaa ctcctgcgcc aaggacagtg
ggttctgtgt 3660ccacctttgt gctttgcgag gccgagccca ggcatctgct
cgcctgccac ggctgaccag 3720agaaggtgct tcaggagctc tgccttagac
gacgtgttac agtatgaaca cacagcagag 3780gcaccctcgt atgttttgaa
agttgccttc tgaaagggca cagttttaag gaaaagaaaa 3840agaatgtaaa
actatactga cccgttttca gttttaaagg gtcgtgagaa actggctggt
3900ccaatgggat ttacagcaac attttccatt gctgaagtga ggtagcagct
ctcttctgtc 3960agctgaatgt taaggatggg gaaaaagaat gcctttaagt
ttgctcttaa tcgtatggaa 4020gcttgagcta tgtgttggaa gtgccctggt
tttaatccat acacaaagac ggtacataat 4080cctacaggtt taaatgtaca
taaaaatata gtttggaatt ctttgctcta ctgtttacat 4140tgcagattgc
tataatttca aggagtgaga ttataaataa aatgatgcac tttaggatgt
4200ttcctatttt tgaaatctga acatgaatca ttcacatgac caaaaattgt
gtttttttaa 4260aaatacatgt ctagtctgtc ctttaatagc tctcttaaat
aagctatgat attaatcaga 4320tcattaccag ttagctttta aagcacattt
gtttaagact atgtttttgg aaaaatacgc 4380tacagaattt ttttttaagc
tacaaataaa tgagatgcta ctaattgttt tggaatctgt 4440tgtttctgcc
aaaggtaaat taactaaaga tttattcagg aatccccatt tgaatttgta
4500tgattcaata aaagaaaaca ccaagtaagt tatataaaat aaattgtgta
tgagatgttg 4560tgttttcctt tgtaatttcc actaactaac taactaactt
atattcttca tggaatggag 4620cccagaagaa atgagaggaa gcccttttca
cactagatct tatttgaaga aatgtttgtt 4680agtcagtcag tcagtggttt
ctggctctgc cgagggagat gtgttcccca gcaaccattt 4740ctgcagccca
gaatctcaag gcactagagg cggtgtctta attaattggc ttcacaaaga
4800caaaatgctc tggactggga tttttccttt gctgtgttgg gaatatgtgt
ttattaatta 4860gcacatgcca acaaaataaa tgtcaagagt tatttcataa
gtgtaagtaa acttaagaat 4920taaagagtgc agacttataa ttttca
4946
41 669 PRT Homo sapiens 41
Met Ala Cys Glu Ile Met Pro Leu Gln Ser
Ser Gln Glu Asp Glu Arg1 5 10 15Pro Leu Ser Pro Phe Tyr Leu Ser Ala
His Val Pro Gln Val Ser Asn 20 25 30Val Ser Ala Thr Gly Glu Leu Leu
Glu Arg Thr Ile Arg Ser Ala Val 35 40 45Glu Gln His Leu Phe Asp Val
Asn Asn Ser Gly Gly Gln Ser Ser Glu 50 55 60Asp Ser Glu Ser Gly Thr
Leu Ser Ala Ser Ser Ala Thr Ser Ala Arg65 70 75 80Gln Arg Arg Arg
Gln Ser Lys Glu Gln Asp Glu Val Arg His Gly Arg 85 90 95Asp Lys Gly
Leu Ile Asn Lys Glu Asn Thr Pro Ser Gly Phe Asn His 100 105 110Leu
Asp Asp Cys Ile Leu Asn Thr Gln Glu Val Glu Lys Val His Lys 115 120
125Asn Thr Phe Gly Cys Ala Gly Glu Arg Ser Lys Pro Lys Arg Gln Lys
130 135 140Ser Ser Thr Lys Leu Ser Glu Leu His Asp Asn Gln Asp Gly
Leu Val145 150 155 160Asn Met Glu Ser Leu Asn Ser Thr Arg Ser His
Glu Arg Thr Gly Pro 165 170 175Asp Asp Phe Glu Trp Met Ser Asp Glu
Arg Lys Gly Asn Glu Lys Asp 180 185 190Gly Gly His Thr Gln His Phe
Glu Ser Pro Thr Met Lys Ile Gln Glu 195 200 205His Pro Ser Leu Ser
Asp Thr Lys Gln Gln Arg Asn Gln Asp Ala Gly 210 215 220Asp Gln Glu
Glu Ser Phe Val Ser Glu Val Pro Gln Ser Asp Leu Thr225 230 235
240Ala Leu Cys Asp Glu Lys Asn Trp Glu Glu Pro Ile Pro Ala Phe Ser
245 250 255Ser Trp Gln Arg Glu Asn Ser Asp Ser Asp Glu Ala His Leu
Ser Pro 260 265 270Gln Ala Gly Arg Leu Ile Arg Gln Leu Leu Asp Glu
Asp Ser Asp Pro 275 280 285Met Leu Ser Pro Arg Phe Tyr Ala Tyr Gly
Gln Ser Arg Gln Tyr Leu 290 295 300Asp Asp Thr Glu Val Pro Pro Ser
Pro Pro Asn Ser His Ser Phe Met305 310 315 320Arg Arg Arg Ser Ser
Ser Leu Gly Ser Tyr Asp Asp Glu Gln Glu Asp 325 330 335Leu Thr Pro
Ala Gln Leu Thr Arg Arg Ile Gln Ser Leu Lys Lys Lys 340 345 350Ile
Arg Lys Phe Glu Asp Arg Phe Glu Glu Glu Lys Lys Tyr Arg Pro 355 360
365Ser His Ser Asp Lys Ala Ala Asn Pro Glu Val Leu Lys Trp Thr Asn
370 375 380Asp Leu Ala Lys Phe Arg Arg Gln Leu Lys Glu Ser Lys Leu
Lys Ile385 390 395 400Ser Glu Glu Asp Leu Thr Pro Arg Met Arg Gln
Arg Ser Asn Thr Leu 405 410 415Pro Lys Ser Phe Gly Ser Gln Leu Glu
Lys Glu Asp Glu Lys Lys Gln 420 425 430Glu Leu Val Asp Lys Ala Ile
Lys Pro Ser Val Glu Ala Thr Leu Glu 435 440 445Ser Ile Gln Arg Lys
Leu Gln Glu Lys Arg Ala Glu Ser Ser Arg Pro 450 455 460Glu Asp Ile
Lys Asp Met Thr Lys Asp Gln Ile Ala Asn Glu Lys Val465 470 475
480Ala Leu Gln Lys Ala Leu Leu Tyr Tyr Glu Ser Ile His Gly Arg Pro
485 490 495Val Thr Lys Asn Glu Arg Gln Val Met Lys Pro Leu Tyr Asp
Arg Tyr 500 505 510Arg Leu Val Lys Gln Ile Leu Ser Arg Ala Asn Thr
Ile Pro Ile Ile 515 520 525Glu Glu Glu Glu Gly Ser Glu Asp Asp Ser
Asn Val Lys Pro Asp Phe 530 535 540Met Val Thr Leu Lys Thr Asp Phe
Ser Ala Arg Cys Phe Leu Asp Gln545 550 555 560Phe Glu Asp Asp Ala
Asp Gly Phe Ile Ser Pro Met Asp Asp Lys Ile 565 570 575Pro Ser Lys
Cys Ser Gln Asp Thr Gly Leu Ser Asn Leu His Ala Ala 580 585 590Ser
Ile Pro Glu Leu Leu Glu His Leu Gln Glu Met Arg Glu Glu Lys 595 600
605Lys Arg Ile Arg Lys Lys Leu Arg Asp Phe Glu Asp Asn Phe Phe Arg
610 615 620Gln Asn Gly Arg Asn Val Gln Lys Glu Asp Arg Thr Pro Met
Ala Glu625 630 635 640Glu Tyr Ser Glu Tyr Lys His Ile Lys Ala Lys
Leu Arg Leu Leu Glu 645 650 655Val Leu Ile Ser Lys Arg Asp Thr Asp
Ser Lys Ser Met 660 665
42 4946 DNA Homo sapiens 42
attgaggagc
agaaggagta gggtgcgggg gaggaggagg agcgccttta gtgctgcagc 60agctgctgct
ctgattggcc cggtggttca gctgcttccc tggaacaaaa ggtcaaagtg
120gactgcagtg taaatgtaga gaagcagccg ataaaatagc attgcctgaa
gaagtttgga 180ggctgagagc agcagtagac tggccaactg cagagcaagt
tgtttctcca gccgtgcggt 240gcagcctcat gcccccaacc cagcttagcc
actgtaagaa gacgttcact gtacagacga 300ccaaacttgc cgtggaagag
acagttgtga gattcccttg caaatttaca tacgagaatg 360gcttgtgaaa
tcatgcctct gcaaagactc ttagaaagaa ccatccgatc agctgtagaa
420caacatcttt ttgatgttaa taactctgga ggtcaaagtt cagaggactc
agaatctgga 480acactatcag catcttctgc cacatctgcc agacagcgcc
gccgccagtc caaggagcag 540gatgaagttc gacatgggag agacaaggga
cttatcaaca aagaaaatac tccttctggg 600ttcaaccacc ttgatgattg
tattttgaat actcaggaag tcgaaaaggt acacaaaaat 660acttttggtt
gtgctggaga aaggagcaag cctaaacgtc agaaatccag tactaaactt
720tctgagcttc atgacaatca ggacggtctt gtgaatatgg aaagtctcaa
ttccacacga 780tctcatgaga gaactggacc tgatgatttt gaatggatgt
ctgatgaaag gaaaggaaat 840gaaaaagatg gtggacacac tcagcatttt
gagagcccca caatgaagat ccaggagcat 900cccagcctat ctgacaccaa
acagcagaga aatcaagatg ccggtgacca ggaggagagc 960tttgtctccg
aagtgcccca gtcggacctg actgcattgt gtgatgaaaa gaactgggaa
1020gagcctatcc ctgctttctc ctcctggcag cgggagaaca gtgactctga
tgaagcccac 1080ctctcgccgc aggctgggcg cctgatccgt cagctgctgg
acgaagacag cgaccccatg 1140ctctctcctc ggttctacgc ttatgggcag
agcaggcaat acctggatga cacagaagtg 1200cctccttccc caccaaactc
ccattctttc atgaggcggc gaagctcctc tctggggtcc 1260tatgatgatg
agcaagagga cctgacacct gcccagctca cacgaaggat tcagagcctt
1320aaaaagaaga tccggaagtt tgaagataga ttcgaagaag agaagaagta
cagaccttcc 1380cacagtgaca aagcagccaa tccggaggtt ctgaaatgga
caaatgacct tgccaaattc 1440cggagacaac ttaaagaatc aaaactaaag
atatctgaag aggacctaac tcccaggatg 1500cggcagcgaa gcaacacact
ccccaagagt tttggttccc aacttgagaa agaagatgag 1560aagaagcaag
agctggtgga taaagcaata aagcccagtg ttgaagccac attggaatct
1620attcagagga agctccagga gaagcgagcg gaaagcagcc gccctgagga
cattaaggat 1680atgaccaaag accagattgc taatgagaaa gtggctctgc
agaaagctct gttatattat 1740gaaagcattc atggacggcc ggtaacaaag
aacgaacggc aggtgatgaa gccactatac 1800gacaggtacc ggctggtcaa
acagatcctc tcccgagcta acaccatacc catcattggt 1860tccccctcca
gcaagcggag aagccctttg ctgcagccaa ttatcgaggg cgaaactgct
1920tccttcttca aggagataaa ggaagaagag gaggggtcag aagacgatag
caatgtgaag 1980ccagacttca tggtcactct gaaaaccgat ttcagtgcac
gatgctttct ggaccaattc 2040gaagatgacg ctgatggatt tatttcccca
atggatgata aaataccatc aaaatgcagc 2100caggacacag ggctttcaaa
tctccatgct gcctcaatac ctgaactcct ggaacacctc 2160caggaaatga
gagaagaaaa gaaaaggatt cgaaagaaac ttcgggattt tgaagacaac
2220tttttcagac agaatggaag aaatgtccag aaggaagacc gcactcctat
ggctgaagaa 2280tacagtgaat ataagcacat aaaggcgaaa ctgaggctcc
tggaggtgct catcagcaag 2340agagacactg attccaagtc catgtgaggg
gcatggccaa gcacaggggg ctggcagctg 2400cggtgagagt ttactgtccc
cagagaaagt gcagctctgg aaggcagcct tggggctggc 2460cctgcaaagc
atgcagccct tctgcctcta gaccatttgg catcggctcc tgtttccatt
2520gcctgcctta gaaactggct ggaagaagac aatgtgacct gacttaggca
ttttgtaatt 2580ggaaagtcaa gactgcagta tgtgcacatg cgcacgcgca
tgcacgcaca cacacacaca 2640gtagtggagc tttcctaaca ctagcagaga
ttaatcacta cattagacaa cactcatcta 2700cagagaatat acactgttct
tccctggata actgagaaac aagagaccat tctctgtcta 2760actgtgataa
aaacaagctc aggactttat tctatagagc aaacttgctg tggagggcca
2820tgctctcctt ggacccagtt aactgcaaac gtgcattgga gccctatttg
ctgccgctgc 2880cattctagtg acctttccac agagctgcgc cttcctcacg
tgtgtgaaag gttttcccct 2940tcagccctca ggtagatgga agctgcatct
gcccacgatg gcagtgcagt catcatcttc 3000aggatgtttc ttcaggactt
cctcagctga caaggaattt tggtccctgc ctaggaccgg 3060gtcatctgca
gaggacagag agatggtaag cagctgtatg aatgctgatt ttaaaaccag
3120gtcatgggag aagagcctgg agattctttc ctgaacactg actgcactta
ccagtctgat 3180tttatcgtca aacaccaagc caggctagca tgctcatggc
aatctgtttg gggctgtttt 3240gttgtggcac tagccaaaca taaaggggct
taagtcagcc tgcatacaga ggatcgggga 3300gagaaggggc ctgtgttctc
agcctcctga gtacttacca gagtttaatt tttttaaaaa 3360aaatctgcac
taaaatcccc aaactgacag gtaaatgtag ccctcagagc tcagcccaag
3420gcagaatcta aatcacacta ttttcgagat catgtataaa aagaaaaaaa
agaagtcatg 3480ctgtgtggcc aattataatt tttttcaaag actttgtcac
aaaactgtct atattagaca 3540ttttggaggg accaggaaat gtaagacacc
aaatcctcca tctcttcagt gtgcctgatg 3600tcacctcatg atttgctgtt
acttttttaa ctcctgcgcc aaggacagtg ggttctgtgt 3660ccacctttgt
gctttgcgag gccgagccca ggcatctgct cgcctgccac ggctgaccag
3720agaaggtgct tcaggagctc tgccttagac gacgtgttac agtatgaaca
cacagcagag 3780gcaccctcgt atgttttgaa agttgccttc tgaaagggca
cagttttaag gaaaagaaaa 3840agaatgtaaa actatactga cccgttttca
gttttaaagg gtcgtgagaa actggctggt 3900ccaatgggat ttacagcaac
attttccatt gctgaagtga ggtagcagct ctcttctgtc 3960agctgaatgt
taaggatggg gaaaaagaat gcctttaagt ttgctcttaa tcgtatggaa
4020gcttgagcta tgtgttggaa gtgccctggt tttaatccat acacaaagac
ggtacataat 4080cctacaggtt taaatgtaca taaaaatata gtttggaatt
ctttgctcta ctgtttacat 4140tgcagattgc tataatttca aggagtgaga
ttataaataa aatgatgcac tttaggatgt 4200ttcctatttt tgaaatctga
acatgaatca ttcacatgac caaaaattgt gtttttttaa 4260aaatacatgt
ctagtctgtc ctttaatagc tctcttaaat aagctatgat attaatcaga
4320tcattaccag ttagctttta aagcacattt gtttaagact atgtttttgg
aaaaatacgc 4380tacagaattt ttttttaagc tacaaataaa tgagatgcta
ctaattgttt tggaatctgt 4440tgtttctgcc aaaggtaaat taactaaaga
tttattcagg aatccccatt tgaatttgta 4500tgattcaata aaagaaaaca
ccaagtaagt tatataaaat aaattgtgta tgagatgttg 4560tgttttcctt
tgtaatttcc actaactaac taactaactt atattcttca tggaatggag
4620cccagaagaa atgagaggaa gcccttttca cactagatct tatttgaaga
aatgtttgtt 4680agtcagtcag tcagtggttt ctggctctgc cgagggagat
gtgttcccca gcaaccattt 4740ctgcagccca gaatctcaag gcactagagg
cggtgtctta attaattggc ttcacaaaga 4800caaaatgctc tggactggga
tttttccttt gctgtgttgg gaatatgtgt ttattaatta 4860gcacatgcca
acaaaataaa tgtcaagagt tatttcataa gtgtaagtaa acttaagaat
4920taaagagtgc agacttataa ttttca 4946
43 669 PRT Homo sapiens 43
Met Ala
Cys Glu Ile Met Pro Leu Gln Arg Leu Leu Glu Arg Thr Ile1 5 10 15Arg
Ser Ala Val Glu Gln His Leu Phe Asp Val Asn Asn Ser Gly Gly 20 25
30Gln Ser Ser Glu Asp Ser Glu Ser Gly Thr Leu Ser Ala Ser Ser Ala
35 40 45Thr Ser Ala Arg Gln Arg Arg Arg Gln Ser Lys Glu Gln Asp Glu
Val 50 55 60Arg His Gly Arg Asp Lys Gly Leu Ile Asn Lys Glu Asn Thr
Pro Ser65 70 75 80Gly Phe Asn His Leu Asp Asp Cys Ile Leu Asn Thr
Gln Glu Val Glu 85 90 95Lys Val His Lys Asn Thr Phe Gly Cys Ala Gly
Glu Arg Ser Lys Pro 100 105 110Lys Arg Gln Lys Ser Ser Thr Lys Leu
Ser Glu Leu His Asp Asn Gln 115 120 125Asp Gly Leu Val Asn Met Glu
Ser Leu Asn Ser Thr Arg Ser His Glu 130 135 140Arg Thr Gly Pro Asp
Asp Phe Glu Trp Met Ser Asp Glu Arg Lys Gly145 150 155 160Asn Glu
Lys Asp Gly Gly His Thr Gln His Phe Glu Ser Pro Thr Met 165 170
175Lys Ile Gln Glu His Pro Ser Leu Ser Asp Thr Lys Gln Gln Arg Asn
180 185 190Gln Asp Ala Gly Asp Gln Glu Glu Ser Phe Val Ser Glu Val
Pro Gln 195 200 205Ser Asp Leu Thr Ala Leu Cys Asp Glu Lys Asn Trp
Glu Glu Pro Ile 210 215 220Pro Ala Phe Ser Ser
Trp Gln Arg Glu Asn Ser Asp Ser Asp Glu Ala225 230 235 240His Leu
Ser Pro Gln Ala Gly Arg Leu Ile Arg Gln Leu Leu Asp Glu 245 250
255Asp Ser Asp Pro Met Leu Ser Pro Arg Phe Tyr Ala Tyr Gly Gln Ser
260 265 270Arg Gln Tyr Leu Asp Asp Thr Glu Val Pro Pro Ser Pro Pro
Asn Ser 275 280 285His Ser Phe Met Arg Arg Arg Ser Ser Ser Leu Gly
Ser Tyr Asp Asp 290 295 300Glu Gln Glu Asp Leu Thr Pro Ala Gln Leu
Thr Arg Arg Ile Gln Ser305 310 315 320Leu Lys Lys Lys Ile Arg Lys
Phe Glu Asp Arg Phe Glu Glu Glu Lys 325 330 335Lys Tyr Arg Pro Ser
His Ser Asp Lys Ala Ala Asn Pro Glu Val Leu 340 345 350Lys Trp Thr
Asn Asp Leu Ala Lys Phe Arg Arg Gln Leu Lys Glu Ser 355 360 365Lys
Leu Lys Ile Ser Glu Glu Asp Leu Thr Pro Arg Met Arg Gln Arg 370 375
380Ser Asn Thr Leu Pro Lys Ser Phe Gly Ser Gln Leu Glu Lys Glu
Asp385 390 395 400Glu Lys Lys Gln Glu Leu Val Asp Lys Ala Ile Lys
Pro Ser Val Glu 405 410 415Ala Thr Leu Glu Ser Ile Gln Arg Lys Leu
Gln Glu Lys Arg Ala Glu 420 425 430Ser Ser Arg Pro Glu Asp Ile Lys
Asp Met Thr Lys Asp Gln Ile Ala 435 440 445Asn Glu Lys Val Ala Leu
Gln Lys Ala Leu Leu Tyr Tyr Glu Ser Ile 450 455 460His Gly Arg Pro
Val Thr Lys Asn Glu Arg Gln Val Met Lys Pro Leu465 470 475 480Tyr
Asp Arg Tyr Arg Leu Val Lys Gln Ile Leu Ser Arg Ala Asn Thr 485 490
495Ile Pro Ile Ile Gly Ser Pro Ser Ser Lys Arg Arg Ser Pro Leu Leu
500 505 510Gln Pro Ile Ile Glu Gly Glu Thr Ala Ser Phe Phe Lys Glu
Ile Lys 515 520 525Glu Glu Glu Glu Gly Ser Glu Asp Asp Ser Asn Val
Lys Pro Asp Phe 530 535 540Met Val Thr Leu Lys Thr Asp Phe Ser Ala
Arg Cys Phe Leu Asp Gln545 550 555 560Phe Glu Asp Asp Ala Asp Gly
Phe Ile Ser Pro Met Asp Asp Lys Ile 565 570 575Pro Ser Lys Cys Ser
Gln Asp Thr Gly Leu Ser Asn Leu His Ala Ala 580 585 590Ser Ile Pro
Glu Leu Leu Glu His Leu Gln Glu Met Arg Glu Glu Lys 595 600 605Lys
Arg Ile Arg Lys Lys Leu Arg Asp Phe Glu Asp Asn Phe Phe Arg 610 615
620Gln Asn Gly Arg Asn Val Gln Lys Glu Asp Arg Thr Pro Met Ala
Glu625 630 635 640Glu Tyr Ser Glu Tyr Lys His Ile Lys Ala Lys Leu
Arg Leu Leu Glu 645 650 655Val Leu Ile Ser Lys Arg Asp Thr Asp Ser
Lys Ser Met 660 665
44 9802 DNA Homo sapiens 44
aagaaaccgg
cctaggcgcc cagtgccagc ggggaggaga ctcgctccgc 60cgccgaccaa caccaacacc
cagctccgac gcagctcctc tgcgcccttg ccgccctccg 120agccacagct
ttcctcccgc tcctgccccc ggcccgtcgc cgtctccgcg ctcgcagcgg
180cctcgggagg gcccaggtag cgagcagcga cctcgcgagc cttccgcact
cccgcccggt 240tccccggccg tccgcctatc cttggccccc tccgctttct
ccgcgccggc ccgcctcgct 300tatgcctcgg cgctgagccg ctctcccgat
tgcccgccga catgagctgc aacggaggct 360cccacccgcg gatcaacact
ctgggccgca tgatccgcgc cgagtctggc ccggacctgc 420gctacgaggt
gaccagcggc ggcgggggca ccagcaggat gtactattct cggcgcggcg
480tgatcaccga ccagaactcg gacggctact gtcaaaccgg cacgatgtcc
aggcaccaga 540accagaacac catccaggag ctgctgcaga actgctccga
ctgcttgatg cgagcagagc 600tcatcgtgca gcctgaattg aagtatggag
atggaataca actgactcgg agtcgagaat 660tggatgagtg ttttgcccag
gccaatgacc aaatggaaat cctcgacagc ttgatcagag 720agatgcggca
gatgggccag ccctgtgatg cttaccagaa aaggcttctt cagctccaag
780agcaaatgcg agccctttat aaagccatca gtgtccctcg agtccgcagg
gccagctcca 840agggtggtgg aggctacact tgtcagagtg gctctggctg
ggatgagttc accaaacatg 900tcaccagtga atgtttgggg tggatgaggc
agcaaagggc ggagatggac atggtggcct 960ggggtgtgga cctggcctca
gtggagcagc acattaacag ccaccggggc atccacaact 1020ccatcggcga
ctatcgctgg cagctggaca aaatcaaagc cgacctgcgc gagaaatctg
1080cgatctacca gttggaggag gagtatgaaa acctgctgaa agcgtccttt
gagaggatgg 1140atcacctgcg acagctgcag aacatcattc aggccacgtc
cagggagatc atgtggatca 1200atgactgcga ggaggaggag ctgctgtacg
actggagcga caagaacacc aacatcgctc 1260agaaacagga ggccttctcc
atacgcatga gtcaactgga agttaaagaa aaagagctca 1320ataagctgaa
acaagaaagt gaccaacttg tcctcaatca gcatccagct tcagacaaaa
1380ttgaggccta tatggacact ctgcagacgc agtggagttg gattcttcag
atcaccaagt 1440gcattgatgt tcatctgaaa gaaaatgctg cctactttca
gttttttgaa gaggcgcagt 1500ctactgaagc atacctgaag gggctccagg
actccatcag gaagaagtac ccctgcgaca 1560agaacatgcc cctgcagcac
ctgctggaac agatcaagga gctggagaaa gaacgagaga 1620aaatccttga
atacaagcgt caggtgcaga acttggtaaa caagtctaag aagattgtac
1680agctgaagcc tcgtaaccca gactacagaa gcaataaacc cattattctc
agagctctct 1740gtgactacaa acaagatcag aaaatcgtgc ataaggggga
tgagtgtatc ctgaaggaca 1800acaacgagcg cagcaagtgg tacgtgacgg
gcccgggagg cgttgacatg cttgttccct 1860ctgtggggct gatcatccct
cctccgaacc cactggccgt ggacctctct tgcaagattg 1920agcagtacta
cgaagccatc ttggctctgt ggaaccagct ctacatcaac atgaagagcc
1980tggtgtcctg gcactactgc atgattgaca tagagaagat cagggccatg
acaatcgcca 2040agctgaaaac aatgcggcag gaagattaca tgaagacgat
agccgacctt gagttacatt 2100accaagagtt catcagaaat agccaaggct
cagagatgtt tggagatgat gacaagcgga 2160aaatacagtc tcagttcacc
gatgcccaga agcattacca gaccctggtc attcagctcc 2220ctggctatcc
ccagcaccag acagtgacca caactgaaat cactcatcat ggaacctgcc
2280aagatgtcaa ccataataaa gtaattgaaa ccaacagaga aaatgacaag
caagaaacat 2340ggatgctgat ggagctgcag aagattcgca ggcagataga
gcactgcgag ggcaggatga 2400ctctcaaaaa cctccctcta gcagaccagg
gatcttctca ccacatcaca gtgaaaatta 2460acgagcttaa gagtgtgcag
aatgattcac aagcaattgc tgaggttctc aaccagctta 2520aagatatgct
tgccaacttc agaggttctg aaaagtactg ctatttacag aatgaagtat
2580ttggactatt tcagaaactg gaaaatatca atggtgttac agatggctac
ttaaatagct 2640tatgcacagt aagggcactg ctccaggcta ttctccaaac
agaagacatg ttaaaggttt 2700atgaagccag gctcactgag gaggaaactg
tctgcctgga cctggataaa gtggaagctt 2760accgctgtgg actgaagaaa
ataaaaaatg acttgaactt gaagaagtcg ttgttggcca 2820ctatgaagac
agaactacag aaagcccagc agatccactc tcagacttca cagcagtatc
2880cactttatga tctggacttg ggcaagttcg gtgaaaaagt cacacagctg
acagaccgct 2940ggcaaaggat agataaacag atcgacttta ggttatggga
cctggagaaa caaatcaagc 3000aattgaggaa ttatcgtgat aactatcagg
ctttctgcaa gtggctctat gatgctaaac 3060gccgccagga ttccttagaa
tccatgaaat ttggagattc caacacagtc atgcggtttt 3120tgaatgagca
gaagaacttg cacagtgaaa tatctggcaa acgagacaaa tcagaggaag
3180tacaaaaaat tgctgaactt tgcgccaatt caattaagga ttatgagctc
cagctggcct 3240catacacctc aggactggaa actctgctga acatacctat
caagaggacc atgattcagt 3300ccccttctgg ggtgattctg caagaggctg
cagatgttca tgctcggtac attgaactac 3360ttacaagatc tggagactat
tacaggttct taagtgagat gctgaagagt ttggaagatc 3420tgaagctgaa
aaataccaag atcgaagttt tggaagagga gctcagactg gcccgagatg
3480ccaactcgga aaactgtaat aagaacaaat tcctggatca gaacctgcag
aaataccagg 3540cagagtgttc ccagttcaaa gcgaagcttg cgagcctgga
ggagctgaag agacaggctg 3600agctggatgg gaagtcggct aagcaaaatc
tagacaagtg ctacggccaa ataaaagaac 3660tcaatgagaa gatcacccga
ctgacttatg agattgaaga tgaaaagaga agaagaaaat 3720ctgtggaaga
cagatttgac caacagaaga atgactatga ccaactgcag aaagcaaggc
3780aatgtgaaaa ggagaacctt ggttggcaga aattagagtc tgagaaagcc
atcaaggaga 3840aggagtacga gattgaaagg ttgagggttc tactgcagga
agaaggcacc cggaagagag 3900aatatgaaaa tgagctggca aaggtaagaa
accactataa tgaggagatg agtaatttaa 3960ggaacaagta tgaaacagag
attaacatta cgaagaccac catcaaggag atatccatgc 4020aaaaagagga
tgattccaaa aatcttagaa accagcttga tagactttca agggaaaatc
4080gagatctgaa ggatgaaatt gtcaggctca atgacagcat cttgcaggcc
actgagcagc 4140gaaggcgagc tgaagaaaac gcccttcagc aaaaggcctg
tggctctgag ataatgcaga 4200agaagcagca tctggagata gaactgaagc
aggtcatgca gcagcgctct gaggacaatg 4260cccggcacaa gcagtccctg
gaggaggctg ccaagaccat tcaggacaaa aataaggaga 4320tcgagagact
caaagctgag tttcaggagg aggccaagcg ccgctgggaa tatgaaaatg
4380aactgagtaa ggtaagaaac aattatgatg aggagatcat tagcttaaaa
aatcagtttg 4440agaccgagat caacatcacc aagaccacca tccaccagct
caccatgcag aaggaagagg 4500ataccagtgg ctaccgggct cagatagaca
atctcacccg agaaaacagg agcttatctg 4560aagaaataaa gaggctgaag
aacactctaa cccagaccac agagaatctc aggagggtgg 4620aagaagacat
ccaacagcaa aaggccactg gctctgaggt gtctcagagg aaacagcagc
4680tggaggttga gctgagacaa gtcactcaga tgcgaacaga ggagagcgta
agatataagc 4740aatctcttga tgatgctgcc aaaaccatcc aggataaaaa
caaggagata gaaaggttaa 4800aacaactgat cgacaaagaa acaaatgacc
ggaaatgcct ggaagatgaa aacgcgagat 4860tacaaagggt ccagtatgac
ctgcagaaag caaacagtag tgcgacggag acaataaaca 4920aactgaaggt
tcaggagcaa gaactgacac gcctgaggat cgactatgaa agggtttccc
4980aggagaggac tgtgaaggac caggatatca cgcggttcca gaactctctg
aaagagctgc 5040agctgcagaa gcagaaggtg gaagaggagc tgaatcggct
gaagaggacc gcgtcagaag 5100actcctgcaa gaggaagaag ctggaggaag
agctggaagg catgaggagg tcgctgaagg 5160agcaagccat caaaatcacc
aacctgaccc agcagctgga gcaggcatcc attgttaaga 5220agaggagtga
ggatgacctc cggcagcaga gggacgtgct ggatggccac ctgagggaaa
5280agcagaggac ccaggaagag ctgaggaggc tctcttctga ggtcgaggcc
ctgaggcggc 5340agttactcca ggaacaggaa agtgtcaaac aagctcactt
gaggaatgag catttccaga 5400aggcgataga agataaaagc agaagcttaa
atgaaagcaa aatagaaatt gagaggctgc 5460agtctctcac agagaacctg
accaaggagc acttgatgtt agaagaagaa ctgcggaacc 5520tgaggctgga
gtacgatgac ctgaggagag gacgaagcga agcggacagt gataaaaatg
5580caaccatctt ggaactaagg agccagctgc agatcagcaa caaccggacc
ctggaactgc 5640aggggctgat taatgattta cagagagaga gggaaaattt
gagacaggaa attgagaaat 5700tccaaaagca ggctttagag gcatctaata
ggattcagga atcaaagaat cagtgtactc 5760aggtggtaca ggaaagagag
agccttctgg tgaaaatcaa agtcctggag caagacaagg 5820caaggctgca
gaggctggag gatgagctga atcgtgcaaa atcaactcta gaggcagaaa
5880ccagggtgaa acagcgcctg gagtgtgaga aacagcaaat tcagaatgac
ctgaatcagt 5940ggaagactca atattcccgc aaggaggagg ctattaggaa
gatagaatcg gaaagagaaa 6000agagtgagag agagaagaac agtcttagga
gtgagatcga aagactccaa gcagagatca 6060agagaattga agagaggtgc
aggcgtaagc tggaggattc taccagggag acacagtcac 6120agttagaaac
agaacgctcc cgatatcaga gggagattga taaactcaga cagcgcccat
6180atgggtccca tcgagagacc cagactgagt gtgagtggac cgttgacacc
tccaagctgg 6240tgtttgatgg gctgaggaag aaggtgacag caatgcagct
ctatgagtgt cagctgatcg 6300acaaaacaac cttggacaaa ctattgaagg
ggaagaagtc agtggaagaa gttgcttctg 6360aaatccagcc attccttcgg
ggtgcaggat ctatcgctgg agcatctgct tctcctaagg 6420aaaaatactc
tttggtagag gccaagagaa agaaattaat cagcccagaa tccacagtca
6480tgcttctgga ggcccaggca gctacaggtg gtataattga tccccatcgg
aatgagaagc 6540tgactgtcga cagtgccata gctcgggacc tcattgactt
cgatgaccgt cagcagatat 6600atgcagcaga aaaagctatc actggttttg
atgatccatt ttcaggcaag acagtatctg 6660tttcagaagc catcaagaaa
aatttgattg atagagaaac cggaatgcgc ctgctggaag 6720cccagattgc
ttcagggggt gtagtagacc ctgtgaacag tgtctttttg ccaaaagatg
6780tcgccttggc ccgggggctg attgatagag atttgtatcg atccctgaat
gatccccgag 6840atagtcagaa aaactttgtg gatccagtca ccaaaaagaa
ggtcagttac gtgcagctga 6900aggaacggtg cagaatcgaa ccacatactg
gtctgctctt gctttcagta cagaagagaa 6960gcatgtcctt ccaaggaatc
agacaacctg tgaccgtcac tgagctagta gattctggta 7020tattgagacc
gtccactgtc aatgaactgg aatctggtca gatttcttat gacgaggttg
7080gtgagagaat taaggacttc ctccagggtt caagctgcat agcaggcata
tacaatgaga 7140ccacaaaaca gaagcttggc atttatgagg ccatgaaaat
tggcttagtc cgacctggta 7200ctgctctgga gttgctggaa gcccaagcag
ctactggctt tatagtggat cctgttagca 7260acttgaggtt accagtggag
gaagcctaca agagaggtct ggtgggcatt gagttcaaag 7320agaagctcct
gtctgcagaa cgagctgtca ctgggtataa tgatcctgaa acaggaaaca
7380tcatctcttt gttccaagcc atgaataagg aactcatcga aaagggccac
ggtattcgct 7440tattagaagc acagatcgca accgggggga tcattgaccc
aaaggagagc catcgtttac 7500cagttgacat agcatataag aggggctatt
tcaatgagga actcagtgag attctctcag 7560atccaagtga tgataccaaa
ggattttttg accccaacac tgaagaaaat cttacctatc 7620tgcaactaaa
agaaagatgc attaaggatg aggaaacagg gctctgtctt ctgcctctga
7680aagaaaagaa gaaacaggtg cagacatcac aaaagaatac cctcaggaag
cgtagagtgg 7740tcatagttga cccagaaacc aataaagaaa tgtctgttca
ggaggcctac aagaagggcc 7800taattgatta tgaaaccttc aaagaactgt
gtgagcagga atgtgaatgg gaagaaataa 7860ccatcacggg atcagatggc
tccaccaggg tggtcctggt agatagaaag acaggcagtc 7920agtatgatat
tcaagatgct attgacaagg gccttgttga caggaagttc tttgatcagt
7980accgatccgg cagcctcagc ctcactcaat ttgctgacat gatctccttg
aaaaatggtg 8040tcggcaccag cagcagcatg ggcagtggtg tcagcgatga
tgtttttagc agctcccgac 8100atgaatcagt aagtaagatt tccaccatat
ccagcgtcag gaatttaacc ataaggagca 8160gctctttttc agacaccctg
gaagaatcga gccccattgc agccatcttt gacacagaaa 8220acctggagaa
aatctccatt acagaaggta tagagcgggg catcgttgac agcatcacgg
8280gtcagaggct tctggaggct caggcctgca caggtggcat catccaccca
accacgggcc 8340agaagctgtc acttcaggac gcagtctccc agggtgtgat
tgaccaagac atggccacca 8400ggctgaagcc tgctcagaaa gccttcatag
gcttcgaggg tgtgaaggga aagaagaaga 8460tgtcagcagc agaggcagtg
aaagaaaaat ggctcccgta tgaggctggc cagcgcttcc 8520tggagttcca
gtacctcacg ggaggtcttg ttgacccgga agtgcatggg aggataagca
8580ccgaagaagc catccggaag gggttcatag atggccgcgc cgcacagagg
ctgcaagaca 8640ccagcagcta tgccaaaatc ctgacctgcc ccaaaaccaa
attaaaaata tcctataagg 8700atgccataaa tcgctccatg gtagaagata
tcactgggct gcgccttctg gaagccgcct 8760ccgtgtcgtc caagggctta
cccagccctt acaacatgtc ttcggctccg gggtcccgct 8820ccggctcccg
ctcgggatct cgctccggat ctcgctccgg gtcccgcagt gggtcccgga
8880gaggaagctt tgacgccaca gggaattctt cctactctta ttcctactca
tttagcagta 8940gttctattgg gcactagtag tcagttggga gtggttgcta
taccttgact tcatttatat 9000gaatttccac tttattaaat aatagaaaag
aaaatcccgg tgcttgcagt agagtgatag 9060gacattctat gcttacagaa
aatatagcca tgattgaaat caaatagtaa aggctgttct 9120ggctttttat
cttcttagct catcttaaat aagcagtaca cttggatgca gtgcgtctga
9180agtgctaatc agttgtaaca atagcacaaa tcgaacttag gatttgtttc
ttctcttctg 9240tgtttcgatt tttgatcaat tctttaattt tggaagccta
taatacagtt ttctattctt 9300ggagataaaa attaaatgga tcactgatat
tttagtcatt ctgcttctca tctaaatatt 9360tccatattct gtattaggag
aaaattaccc tcccagcacc agcccccctc tcaaaccccc 9420aacccaaaac
caagcatttt ggaatgagtc tcctttagtt tcagagtgtg gattgtataa
9480cccatatact cttcgatgta cttgtttggt ttggtattaa tttgactgtg
catgacagcg 9540gcaatctttt ctttggtcaa agttttctgt ttattttgct
tgtcatattc gatgtacttt 9600aaggtgtctt tatgaagttt gctattctgg
caataaactt ttagactttt gaagtgtttg 9660tgttttaatt taatatgttt
ataagcatgt ataaacattt agcatatttt tatcataggt 9720ctaaaaatat
ttgtttacta aatacctgtg aagaaatacc attaaaaaac tatttggttc
9780tgaattctta ctagaaaaaa aa 9802
45 2871 PRT Homo sapiens 45
Met Ser
Cys Asn Gly Gly Ser His Pro Arg Ile Asn Thr Leu Gly Arg1 5 10 15Met
Ile Arg Ala Glu Ser Gly Pro Asp Leu Arg Tyr Glu Val Thr Ser 20 25
30Gly Gly Gly Gly Thr Ser Arg Met Tyr Tyr Ser Arg Arg Gly Val Ile
35 40 45Thr Asp Gln Asn Ser Asp Gly Tyr Cys Gln Thr Gly Thr Met Ser
Arg 50 55 60His Gln Asn Gln Asn Thr Ile Gln Glu Leu Leu Gln Asn Cys
Ser Asp65 70 75 80Cys Leu Met Arg Ala Glu Leu Ile Val Gln Pro Glu
Leu Lys Tyr Gly 85 90 95Asp Gly Ile Gln Leu Thr Arg Ser Arg Glu Leu
Asp Glu Cys Phe Ala 100 105 110Gln Ala Asn Asp Gln Met Glu Ile Leu
Asp Ser Leu Ile Arg Glu Met 115 120 125Arg Gln Met Gly Gln Pro Cys
Asp Ala Tyr Gln Lys Arg Leu Leu Gln 130 135 140Leu Gln Glu Gln Met
Arg Ala Leu Tyr Lys Ala Ile Ser Val Pro Arg145 150 155 160Val Arg
Arg Ala Ser Ser Lys Gly Gly Gly Gly Tyr Thr Cys Gln Ser 165 170
175Gly Ser Gly Trp Asp Glu Phe Thr Lys His Val Thr Ser Glu Cys Leu
180 185 190Gly Trp Met Arg Gln Gln Arg Ala Glu Met Asp Met Val Ala
Trp Gly 195 200 205Val Asp Leu Ala Ser Val Glu Gln His Ile Asn Ser
His Arg Gly Ile 210 215 220His Asn Ser Ile Gly Asp Tyr Arg Trp Gln
Leu Asp Lys Ile Lys Ala225 230 235 240Asp Leu Arg Glu Lys Ser Ala
Ile Tyr Gln Leu Glu Glu Glu Tyr Glu 245 250 255Asn Leu Leu Lys Ala
Ser Phe Glu Arg Met Asp His Leu Arg Gln Leu 260 265 270Gln Asn Ile
Ile Gln Ala Thr Ser Arg Glu Ile Met Trp Ile Asn Asp 275 280 285Cys
Glu Glu Glu Glu Leu Leu Tyr Asp Trp Ser Asp Lys Asn Thr Asn 290 295
300Ile Ala Gln Lys Gln Glu Ala Phe Ser Ile Arg Met Ser Gln Leu
Glu305 310 315 320Val Lys Glu Lys Glu Leu Asn Lys Leu Lys Gln Glu
Ser Asp Gln Leu 325 330 335Val Leu Asn Gln His Pro Ala Ser Asp Lys
Ile Glu Ala Tyr Met Asp 340 345 350Thr Leu Gln Thr Gln Trp Ser Trp
Ile Leu Gln Ile Thr Lys Cys Ile 355 360 365Asp Val His Leu Lys Glu
Asn Ala Ala Tyr Phe Gln Phe Phe Glu Glu 370 375 380Ala Gln Ser Thr
Glu Ala Tyr Leu Lys Gly Leu Gln Asp Ser Ile Arg385 390 395 400Lys
Lys Tyr Pro Cys Asp Lys Asn Met Pro Leu Gln His Leu Leu
Glu 405 410 415Gln Ile Lys Glu Leu Glu Lys Glu Arg Glu Lys Ile Leu
Glu Tyr Lys 420 425 430Arg Gln Val Gln Asn Leu Val Asn Lys Ser Lys
Lys Ile Val Gln Leu 435 440 445Lys Pro Arg Asn Pro Asp Tyr Arg Ser
Asn Lys Pro Ile Ile Leu Arg 450 455 460Ala Leu Cys Asp Tyr Lys Gln
Asp Gln Lys Ile Val His Lys Gly Asp465 470 475 480Glu Cys Ile Leu
Lys Asp Asn Asn Glu Arg Ser Lys Trp Tyr Val Thr 485 490 495Gly Pro
Gly Gly Val Asp Met Leu Val Pro Ser Val Gly Leu Ile Ile 500 505
510Pro Pro Pro Asn Pro Leu Ala Val Asp Leu Ser Cys Lys Ile Glu Gln
515 520 525Tyr Tyr Glu Ala Ile Leu Ala Leu Trp Asn Gln Leu Tyr Ile
Asn Met 530 535 540Lys Ser Leu Val Ser Trp His Tyr Cys Met Ile Asp
Ile Glu Lys Ile545 550 555 560Arg Ala Met Thr Ile Ala Lys Leu Lys
Thr Met Arg Gln Glu Asp Tyr 565 570 575Met Lys Thr Ile Ala Asp Leu
Glu Leu His Tyr Gln Glu Phe Ile Arg 580 585 590Asn Ser Gln Gly Ser
Glu Met Phe Gly Asp Asp Asp Lys Arg Lys Ile 595 600 605Gln Ser Gln
Phe Thr Asp Ala Gln Lys His Tyr Gln Thr Leu Val Ile 610 615 620Gln
Leu Pro Gly Tyr Pro Gln His Gln Thr Val Thr Thr Thr Glu Ile625 630
635 640Thr His His Gly Thr Cys Gln Asp Val Asn His Asn Lys Val Ile
Glu 645 650 655Thr Asn Arg Glu Asn Asp Lys Gln Glu Thr Trp Met Leu
Met Glu Leu 660 665 670Gln Lys Ile Arg Arg Gln Ile Glu His Cys Glu
Gly Arg Met Thr Leu 675 680 685Lys Asn Leu Pro Leu Ala Asp Gln Gly
Ser Ser His His Ile Thr Val 690 695 700Lys Ile Asn Glu Leu Lys Ser
Val Gln Asn Asp Ser Gln Ala Ile Ala705 710 715 720Glu Val Leu Asn
Gln Leu Lys Asp Met Leu Ala Asn Phe Arg Gly Ser 725 730 735Glu Lys
Tyr Cys Tyr Leu Gln Asn Glu Val Phe Gly Leu Phe Gln Lys 740 745
750Leu Glu Asn Ile Asn Gly Val Thr Asp Gly Tyr Leu Asn Ser Leu Cys
755 760 765Thr Val Arg Ala Leu Leu Gln Ala Ile Leu Gln Thr Glu Asp
Met Leu 770 775 780Lys Val Tyr Glu Ala Arg Leu Thr Glu Glu Glu Thr
Val Cys Leu Asp785 790 795 800Leu Asp Lys Val Glu Ala Tyr Arg Cys
Gly Leu Lys Lys Ile Lys Asn 805 810 815Asp Leu Asn Leu Lys Lys Ser
Leu Leu Ala Thr Met Lys Thr Glu Leu 820 825 830Gln Lys Ala Gln Gln
Ile His Ser Gln Thr Ser Gln Gln Tyr Pro Leu 835 840 845Tyr Asp Leu
Asp Leu Gly Lys Phe Gly Glu Lys Val Thr Gln Leu Thr 850 855 860Asp
Arg Trp Gln Arg Ile Asp Lys Gln Ile Asp Phe Arg Leu Trp Asp865 870
875 880Leu Glu Lys Gln Ile Lys Gln Leu Arg Asn Tyr Arg Asp Asn Tyr
Gln 885 890 895Ala Phe Cys Lys Trp Leu Tyr Asp Ala Lys Arg Arg Gln
Asp Ser Leu 900 905 910Glu Ser Met Lys Phe Gly Asp Ser Asn Thr Val
Met Arg Phe Leu Asn 915 920 925Glu Gln Lys Asn Leu His Ser Glu Ile
Ser Gly Lys Arg Asp Lys Ser 930 935 940Glu Glu Val Gln Lys Ile Ala
Glu Leu Cys Ala Asn Ser Ile Lys Asp945 950 955 960Tyr Glu Leu Gln
Leu Ala Ser Tyr Thr Ser Gly Leu Glu Thr Leu Leu 965 970 975Asn Ile
Pro Ile Lys Arg Thr Met Ile Gln Ser Pro Ser Gly Val Ile 980 985
990Leu Gln Glu Ala Ala Asp Val His Ala Arg Tyr Ile Glu Leu Leu Thr
995 1000 1005Arg Ser Gly Asp Tyr Tyr Arg Phe Leu Ser Glu Met Leu
Lys Ser 1010 1015 1020Leu Glu Asp Leu Lys Leu Lys Asn Thr Lys Ile
Glu Val Leu Glu 1025 1030 1035Glu Glu Leu Arg Leu Ala Arg Asp Ala
Asn Ser Glu Asn Cys Asn 1040 1045 1050Lys Asn Lys Phe Leu Asp Gln
Asn Leu Gln Lys Tyr Gln Ala Glu 1055 1060 1065Cys Ser Gln Phe Lys
Ala Lys Leu Ala Ser Leu Glu Glu Leu Lys 1070 1075 1080Arg Gln Ala
Glu Leu Asp Gly Lys Ser Ala Lys Gln Asn Leu Asp 1085 1090 1095Lys
Cys Tyr Gly Gln Ile Lys Glu Leu Asn Glu Lys Ile Thr Arg 1100 1105
1110Leu Thr Tyr Glu Ile Glu Asp Glu Lys Arg Arg Arg Lys Ser Val
1115 1120 1125Glu Asp Arg Phe Asp Gln Gln Lys Asn Asp Tyr Asp Gln
Leu Gln 1130 1135 1140Lys Ala Arg Gln Cys Glu Lys Glu Asn Leu Gly
Trp Gln Lys Leu 1145 1150 1155Glu Ser Glu Lys Ala Ile Lys Glu Lys
Glu Tyr Glu Ile Glu Arg 1160 1165 1170Leu Arg Val Leu Leu Gln Glu
Glu Gly Thr Arg Lys Arg Glu Tyr 1175 1180 1185Glu Asn Glu Leu Ala
Lys Val Arg Asn His Tyr Asn Glu Glu Met 1190 1195 1200Ser Asn Leu
Arg Asn Lys Tyr Glu Thr Glu Ile Asn Ile Thr Lys 1205 1210 1215Thr
Thr Ile Lys Glu Ile Ser Met Gln Lys Glu Asp Asp Ser Lys 1220 1225
1230Asn Leu Arg Asn Gln Leu Asp Arg Leu Ser Arg Glu Asn Arg Asp
1235 1240 1245Leu Lys Asp Glu Ile Val Arg Leu Asn Asp Ser Ile Leu
Gln Ala 1250 1255 1260Thr Glu Gln Arg Arg Arg Ala Glu Glu Asn Ala
Leu Gln Gln Lys 1265 1270 1275Ala Cys Gly Ser Glu Ile Met Gln Lys
Lys Gln His Leu Glu Ile 1280 1285 1290Glu Leu Lys Gln Val Met Gln
Gln Arg Ser Glu Asp Asn Ala Arg 1295 1300 1305His Lys Gln Ser Leu
Glu Glu Ala Ala Lys Thr Ile Gln Asp Lys 1310 1315 1320Asn Lys Glu
Ile Glu Arg Leu Lys Ala Glu Phe Gln Glu Glu Ala 1325 1330 1335Lys
Arg Arg Trp Glu Tyr Glu Asn Glu Leu Ser Lys Val Arg Asn 1340 1345
1350Asn Tyr Asp Glu Glu Ile Ile Ser Leu Lys Asn Gln Phe Glu Thr
1355 1360 1365Glu Ile Asn Ile Thr Lys Thr Thr Ile His Gln Leu Thr
Met Gln 1370 1375 1380Lys Glu Glu Asp Thr Ser Gly Tyr Arg Ala Gln
Ile Asp Asn Leu 1385 1390 1395Thr Arg Glu Asn Arg Ser Leu Ser Glu
Glu Ile Lys Arg Leu Lys 1400 1405 1410Asn Thr Leu Thr Gln Thr Thr
Glu Asn Leu Arg Arg Val Glu Glu 1415 1420 1425Asp Ile Gln Gln Gln
Lys Ala Thr Gly Ser Glu Val Ser Gln Arg 1430 1435 1440Lys Gln Gln
Leu Glu Val Glu Leu Arg Gln Val Thr Gln Met Arg 1445 1450 1455Thr
Glu Glu Ser Val Arg Tyr Lys Gln Ser Leu Asp Asp Ala Ala 1460 1465
1470Lys Thr Ile Gln Asp Lys Asn Lys Glu Ile Glu Arg Leu Lys Gln
1475 1480 1485Leu Ile Asp Lys Glu Thr Asn Asp Arg Lys Cys Leu Glu
Asp Glu 1490 1495 1500Asn Ala Arg Leu Gln Arg Val Gln Tyr Asp Leu
Gln Lys Ala Asn 1505 1510 1515Ser Ser Ala Thr Glu Thr Ile Asn Lys
Leu Lys Val Gln Glu Gln 1520 1525 1530Glu Leu Thr Arg Leu Arg Ile
Asp Tyr Glu Arg Val Ser Gln Glu 1535 1540 1545Arg Thr Val Lys Asp
Gln Asp Ile Thr Arg Phe Gln Asn Ser Leu 1550 1555 1560Lys Glu Leu
Gln Leu Gln Lys Gln Lys Val Glu Glu Glu Leu Asn 1565 1570 1575Arg
Leu Lys Arg Thr Ala Ser Glu Asp Ser Cys Lys Arg Lys Lys 1580 1585
1590Leu Glu Glu Glu Leu Glu Gly Met Arg Arg Ser Leu Lys Glu Gln
1595 1600 1605Ala Ile Lys Ile Thr Asn Leu Thr Gln Gln Leu Glu Gln
Ala Ser 1610 1615 1620Ile Val Lys Lys Arg Ser Glu Asp Asp Leu Arg
Gln Gln Arg Asp 1625 1630 1635Val Leu Asp Gly His Leu Arg Glu Lys
Gln Arg Thr Gln Glu Glu 1640 1645 1650Leu Arg Arg Leu Ser Ser Glu
Val Glu Ala Leu Arg Arg Gln Leu 1655 1660 1665Leu Gln Glu Gln Glu
Ser Val Lys Gln Ala His Leu Arg Asn Glu 1670 1675 1680His Phe Gln
Lys Ala Ile Glu Asp Lys Ser Arg Ser Leu Asn Glu 1685 1690 1695Ser
Lys Ile Glu Ile Glu Arg Leu Gln Ser Leu Thr Glu Asn Leu 1700 1705
1710Thr Lys Glu His Leu Met Leu Glu Glu Glu Leu Arg Asn Leu Arg
1715 1720 1725Leu Glu Tyr Asp Asp Leu Arg Arg Gly Arg Ser Glu Ala
Asp Ser 1730 1735 1740Asp Lys Asn Ala Thr Ile Leu Glu Leu Arg Ser
Gln Leu Gln Ile 1745 1750 1755Ser Asn Asn Arg Thr Leu Glu Leu Gln
Gly Leu Ile Asn Asp Leu 1760 1765 1770Gln Arg Glu Arg Glu Asn Leu
Arg Gln Glu Ile Glu Lys Phe Gln 1775 1780 1785Lys Gln Ala Leu Glu
Ala Ser Asn Arg Ile Gln Glu Ser Lys Asn 1790 1795 1800Gln Cys Thr
Gln Val Val Gln Glu Arg Glu Ser Leu Leu Val Lys 1805 1810 1815Ile
Lys Val Leu Glu Gln Asp Lys Ala Arg Leu Gln Arg Leu Glu 1820 1825
1830Asp Glu Leu Asn Arg Ala Lys Ser Thr Leu Glu Ala Glu Thr Arg
1835 1840 1845Val Lys Gln Arg Leu Glu Cys Glu Lys Gln Gln Ile Gln
Asn Asp 1850 1855 1860Leu Asn Gln Trp Lys Thr Gln Tyr Ser Arg Lys
Glu Glu Ala Ile 1865 1870 1875Arg Lys Ile Glu Ser Glu Arg Glu Lys
Ser Glu Arg Glu Lys Asn 1880 1885 1890Ser Leu Arg Ser Glu Ile Glu
Arg Leu Gln Ala Glu Ile Lys Arg 1895 1900 1905Ile Glu Glu Arg Cys
Arg Arg Lys Leu Glu Asp Ser Thr Arg Glu 1910 1915 1920Thr Gln Ser
Gln Leu Glu Thr Glu Arg Ser Arg Tyr Gln Arg Glu 1925 1930 1935Ile
Asp Lys Leu Arg Gln Arg Pro Tyr Gly Ser His Arg Glu Thr 1940 1945
1950Gln Thr Glu Cys Glu Trp Thr Val Asp Thr Ser Lys Leu Val Phe
1955 1960 1965Asp Gly Leu Arg Lys Lys Val Thr Ala Met Gln Leu Tyr
Glu Cys 1970 1975 1980Gln Leu Ile Asp Lys Thr Thr Leu Asp Lys Leu
Leu Lys Gly Lys 1985 1990 1995Lys Ser Val Glu Glu Val Ala Ser Glu
Ile Gln Pro Phe Leu Arg 2000 2005 2010Gly Ala Gly Ser Ile Ala Gly
Ala Ser Ala Ser Pro Lys Glu Lys 2015 2020 2025Tyr Ser Leu Val Glu
Ala Lys Arg Lys Lys Leu Ile Ser Pro Glu 2030 2035 2040Ser Thr Val
Met Leu Leu Glu Ala Gln Ala Ala Thr Gly Gly Ile 2045 2050 2055Ile
Asp Pro His Arg Asn Glu Lys Leu Thr Val Asp Ser Ala Ile 2060 2065
2070Ala Arg Asp Leu Ile Asp Phe Asp Asp Arg Gln Gln Ile Tyr Ala
2075 2080 2085Ala Glu Lys Ala Ile Thr Gly Phe Asp Asp Pro Phe Ser
Gly Lys 2090 2095 2100Thr Val Ser Val Ser Glu Ala Ile Lys Lys Asn
Leu Ile Asp Arg 2105 2110 2115Glu Thr Gly Met Arg Leu Leu Glu Ala
Gln Ile Ala Ser Gly Gly 2120 2125 2130Val Val Asp Pro Val Asn Ser
Val Phe Leu Pro Lys Asp Val Ala 2135 2140 2145Leu Ala Arg Gly Leu
Ile Asp Arg Asp Leu Tyr Arg Ser Leu Asn 2150 2155 2160Asp Pro Arg
Asp Ser Gln Lys Asn Phe Val Asp Pro Val Thr Lys 2165 2170 2175Lys
Lys Val Ser Tyr Val Gln Leu Lys Glu Arg Cys Arg Ile Glu 2180 2185
2190Pro His Thr Gly Leu Leu Leu Leu Ser Val Gln Lys Arg Ser Met
2195 2200 2205Ser Phe Gln Gly Ile Arg Gln Pro Val Thr Val Thr Glu
Leu Val 2210 2215 2220Asp Ser Gly Ile Leu Arg Pro Ser Thr Val Asn
Glu Leu Glu Ser 2225 2230 2235Gly Gln Ile Ser Tyr Asp Glu Val Gly
Glu Arg Ile Lys Asp Phe 2240 2245 2250Leu Gln Gly Ser Ser Cys Ile
Ala Gly Ile Tyr Asn Glu Thr Thr 2255 2260 2265Lys Gln Lys Leu Gly
Ile Tyr Glu Ala Met Lys Ile Gly Leu Val 2270 2275 2280Arg Pro Gly
Thr Ala Leu Glu Leu Leu Glu Ala Gln Ala Ala Thr 2285 2290 2295Gly
Phe Ile Val Asp Pro Val Ser Asn Leu Arg Leu Pro Val Glu 2300 2305
2310Glu Ala Tyr Lys Arg Gly Leu Val Gly Ile Glu Phe Lys Glu Lys
2315 2320 2325Leu Leu Ser Ala Glu Arg Ala Val Thr Gly Tyr Asn Asp
Pro Glu 2330 2335 2340Thr Gly Asn Ile Ile Ser Leu Phe Gln Ala Met
Asn Lys Glu Leu 2345 2350 2355Ile Glu Lys Gly His Gly Ile Arg Leu
Leu Glu Ala Gln Ile Ala 2360 2365 2370Thr Gly Gly Ile Ile Asp Pro
Lys Glu Ser His Arg Leu Pro Val 2375 2380 2385Asp Ile Ala Tyr Lys
Arg Gly Tyr Phe Asn Glu Glu Leu Ser Glu 2390 2395 2400Ile Leu Ser
Asp Pro Ser Asp Asp Thr Lys Gly Phe Phe Asp Pro 2405 2410 2415Asn
Thr Glu Glu Asn Leu Thr Tyr Leu Gln Leu Lys Glu Arg Cys 2420 2425
2430Ile Lys Asp Glu Glu Thr Gly Leu Cys Leu Leu Pro Leu Lys Glu
2435 2440 2445Lys Lys Lys Gln Val Gln Thr Ser Gln Lys Asn Thr Leu
Arg Lys 2450 2455 2460Arg Arg Val Val Ile Val Asp Pro Glu Thr Asn
Lys Glu Met Ser 2465 2470 2475Val Gln Glu Ala Tyr Lys Lys Gly Leu
Ile Asp Tyr Glu Thr Phe 2480 2485 2490Lys Glu Leu Cys Glu Gln Glu
Cys Glu Trp Glu Glu Ile Thr Ile 2495 2500 2505Thr Gly Ser Asp Gly
Ser Thr Arg Val Val Leu Val Asp Arg Lys 2510 2515 2520Thr Gly Ser
Gln Tyr Asp Ile Gln Asp Ala Ile Asp Lys Gly Leu 2525 2530 2535Val
Asp Arg Lys Phe Phe Asp Gln Tyr Arg Ser Gly Ser Leu Ser 2540 2545
2550Leu Thr Gln Phe Ala Asp Met Ile Ser Leu Lys Asn Gly Val Gly
2555 2560 2565Thr Ser Ser Ser Met Gly Ser Gly Val Ser Asp Asp Val
Phe Ser 2570 2575 2580Ser Ser Arg His Glu Ser Val Ser Lys Ile Ser
Thr Ile Ser Ser 2585 2590 2595Val Arg Asn Leu Thr Ile Arg Ser Ser
Ser Phe Ser Asp Thr Leu 2600 2605 2610Glu Glu Ser Ser Pro Ile Ala
Ala Ile Phe Asp Thr Glu Asn Leu 2615 2620 2625Glu Lys Ile Ser Ile
Thr Glu Gly Ile Glu Arg Gly Ile Val Asp 2630 2635 2640Ser Ile Thr
Gly Gln Arg Leu Leu Glu Ala Gln Ala Cys Thr Gly 2645 2650 2655Gly
Ile Ile His Pro Thr Thr Gly Gln Lys Leu Ser Leu Gln Asp 2660 2665
2670Ala Val Ser Gln Gly Val Ile Asp Gln Asp Met Ala Thr Arg Leu
2675 2680 2685Lys Pro Ala Gln Lys Ala Phe Ile Gly Phe Glu Gly Val
Lys Gly 2690 2695 2700Lys Lys Lys Met Ser Ala Ala Glu Ala Val Lys
Glu Lys Trp Leu 2705 2710 2715Pro Tyr Glu Ala Gly Gln Arg Phe Leu
Glu Phe Gln Tyr Leu Thr 2720 2725 2730Gly Gly Leu Val Asp Pro Glu
Val His Gly Arg Ile Ser Thr Glu 2735 2740 2745Glu Ala Ile Arg Lys
Gly Phe Ile Asp Gly Arg Ala Ala Gln Arg 2750 2755 2760Leu Gln Asp
Thr Ser Ser Tyr Ala Lys Ile Leu Thr Cys Pro Lys 2765 2770 2775Thr
Lys Leu Lys Ile Ser Tyr Lys Asp Ala Ile Asn Arg Ser Met 2780 2785
2790Val Glu Asp Ile Thr Gly Leu Arg Leu Leu Glu Ala Ala Ser Val
2795 2800 2805Ser Ser Lys Gly Leu Pro Ser Pro Tyr Asn Met Ser Ser
Ala Pro 2810 2815 2820Gly Ser Arg Ser Gly Ser Arg Ser Gly Ser Arg
Ser Gly Ser Arg 2825 2830 2835Ser Gly Ser Arg Ser Gly Ser Arg Arg
Gly Ser Phe Asp Ala Thr 2840 2845
2850Gly Asn Ser Ser Tyr Ser Tyr Ser Tyr Ser Phe Ser Ser Ser Ser
2855 2860 2865Ile Gly His 2870
46 8473 DNA Homo sapiens 46
aagaaaccgg
ccaggtgtgg cctaggcgcc cagtgccagc ggggaggaga ctcgctccgc 60cgccgaccaa
caccaacacc cagctccgac gcagctcctc tgcgcccttg ccgccctccg
120agccacagct ttcctcccgc tcctgccccc ggcccgtcgc cgtctccgcg
ctcgcagcgg 180cctcgggagg gcccaggtag cgagcagcga cctcgcgagc
cttccgcact cccgcccggt 240tccccggccg tccgcctatc cttggccccc
tccgctttct ccgcgccggc ccgcctcgct 300tatgcctcgg cgctgagccg
ctctcccgat tgcccgccga catgagctgc aacggaggct 360cccacccgcg
gatcaacact ctgggccgca tgatccgcgc cgagtctggc ccggacctgc
420gctacgaggt gaccagcggc ggcgggggca ccagcaggat gtactattct
cggcgcggcg 480tgatcaccga ccagaactcg gacggctact gtcaaaccgg
cacgatgtcc aggcaccaga 540accagaacac catccaggag ctgctgcaga
actgctccga ctgcttgatg cgagcagagc 600tcatcgtgca gcctgaattg
aagtatggag atggaataca actgactcgg agtcgagaat 660tggatgagtg
ttttgcccag gccaatgacc aaatggaaat cctcgacagc ttgatcagag
720agatgcggca gatgggccag ccctgtgatg cttaccagaa aaggcttctt
cagctccaag 780agcaaatgcg agccctttat aaagccatca gtgtccctcg
agtccgcagg gccagctcca 840agggtggtgg aggctacact tgtcagagtg
gctctggctg ggatgagttc accaaacatg 900tcaccagtga atgtttgggg
tggatgaggc agcaaagggc ggagatggac atggtggcct 960ggggtgtgga
cctggcctca gtggagcagc acattaacag ccaccggggc atccacaact
1020ccatcggcga ctatcgctgg cagctggaca aaatcaaagc cgacctgcgc
gagaaatctg 1080cgatctacca gttggaggag gagtatgaaa acctgctgaa
agcgtccttt gagaggatgg 1140atcacctgcg acagctgcag aacatcattc
aggccacgtc cagggagatc atgtggatca 1200atgactgcga ggaggaggag
ctgctgtacg actggagcga caagaacacc aacatcgctc 1260agaaacagga
ggccttctcc atacgcatga gtcaactgga agttaaagaa aaagagctca
1320ataagctgaa acaagaaagt gaccaacttg tcctcaatca gcatccagct
tcagacaaaa 1380ttgaggccta tatggacact ctgcagacgc agtggagttg
gattcttcag atcaccaagt 1440gcattgatgt tcatctgaaa gaaaatgctg
cctactttca gttttttgaa gaggcgcagt 1500ctactgaagc atacctgaag
gggctccagg actccatcag gaagaagtac ccctgcgaca 1560agaacatgcc
cctgcagcac ctgctggaac agatcaagga gctggagaaa gaacgagaga
1620aaatccttga atacaagcgt caggtgcaga acttggtaaa caagtctaag
aagattgtac 1680agctgaagcc tcgtaaccca gactacagaa gcaataaacc
cattattctc agagctctct 1740gtgactacaa acaagatcag aaaatcgtgc
ataaggggga tgagtgtatc ctgaaggaca 1800acaacgagcg cagcaagtgg
tacgtgacgg gcccgggagg cgttgacatg cttgttccct 1860ctgtggggct
gatcatccct cctccgaacc cactggccgt ggacctctct tgcaagattg
1920agcagtacta cgaagccatc ttggctctgt ggaaccagct ctacatcaac
atgaagagcc 1980tggtgtcctg gcactactgc atgattgaca tagagaagat
cagggccatg acaatcgcca 2040agctgaaaac aatgcggcag gaagattaca
tgaagacgat agccgacctt gagttacatt 2100accaagagtt catcagaaat
agccaaggct cagagatgtt tggagatgat gacaagcgga 2160aaatacagtc
tcagttcacc gatgcccaga agcattacca gaccctggtc attcagctcc
2220ctggctatcc ccagcaccag acagtgacca caactgaaat cactcatcat
ggaacctgcc 2280aagatgtcaa ccataataaa gtaattgaaa ccaacagaga
aaatgacaag caagaaacat 2340ggatgctgat ggagctgcag aagattcgca
ggcagataga gcactgcgag ggcaggatga 2400ctctcaaaaa cctccctcta
gcagaccagg gatcttctca ccacatcaca gtgaaaatta 2460acgagcttaa
gagtgtgcag aatgattcac aagcaattgc tgaggttctc aaccagctta
2520aagatatgct tgccaacttc agaggttctg aaaagtactg ctatttacag
aatgaagtat 2580ttggactatt tcagaaactg gaaaatatca atggtgttac
agatggctac ttaaatagct 2640tatgcacagt aagggcactg ctccaggcta
ttctccaaac agaagacatg ttaaaggttt 2700atgaagccag gctcactgag
gaggaaactg tctgcctgga cctggataaa gtggaagctt 2760accgctgtgg
actgaagaaa ataaaaaatg acttgaactt gaagaagtcg ttgttggcca
2820ctatgaagac agaactacag aaagcccagc agatccactc tcagacttca
cagcagtatc 2880cactttatga tctggacttg ggcaagttcg gtgaaaaagt
cacacagctg acagaccgct 2940ggcaaaggat agataaacag atcgacttta
ggttatggga cctggagaaa caaatcaagc 3000aattgaggaa ttatcgtgat
aactatcagg ctttctgcaa gtggctctat gatgctaaac 3060gccgccagga
ttccttagaa tccatgaaat ttggagattc caacacagtc atgcggtttt
3120tgaatgagca gaagaacttg cacagtgaaa tatctggcaa acgagacaaa
tcagaggaag 3180tacaaaaaat tgctgaactt tgcgccaatt caattaagga
ttatgagctc cagctggcct 3240catacacctc aggactggaa actctgctga
acatacctat caagaggacc atgattcagt 3300ccccttctgg ggtgattctg
caagaggctg cagatgttca tgctcggtac attgaactac 3360ttacaagatc
tggagactat tacaggttct taagtgagat gctgaagagt ttggaagatc
3420tgaagctgaa aaataccaag atcgaagttt tggaagagga gctcagactg
gcccgagatg 3480ccaactcgga aaactgtaat aagaacaaat tcctggatca
gaacctgcag aaataccagg 3540cagagtgttc ccagttcaaa gcgaagcttg
cgagcctgga ggagctgaag agacaggctg 3600agctggatgg gaagtcggct
aagcaaaatc tagacaagtg ctacggccaa ataaaagaac 3660tcaatgagaa
gatcacccga ctgacttatg agattgaaga tgaaaagaga agaagaaaat
3720ctgtggaaga cagatttgac caacagaaga atgactatga ccaactgcag
aaagcaaggc 3780aatgtgaaaa ggagaacctt ggttggcaga aattagagtc
tgagaaagcc atcaaggaga 3840aggagtacga gattgaaagg ttgagggttc
tactgcagga agaaggcacc cggaagagag 3900aatatgaaaa tgagctggca
aaggtaagaa accactataa tgaggagatg agtaatttaa 3960ggaacaagta
tgaaacagag attaacatta cgaagaccac catcaaggag atatccatgc
4020aaaaagagga tgattccaaa aatcttagaa accagcttga tagactttca
agggaaaatc 4080gagatctgaa ggatgaaatt gtcaggctca atgacagcat
cttgcaggcc actgagcagc 4140gaaggcgagc tgaagaaaac gcccttcagc
aaaaggcctg tggctctgag ataatgcaga 4200agaagcagca tctggagata
gaactgaagc aggtcatgca gcagcgctct gaggacaatg 4260cccggcacaa
gcagtccctg gaggaggctg ccaagaccat tcaggacaaa aataaggaga
4320tcgagagact caaagctgag tttcaggagg aggccaagcg ccgctgggaa
tatgaaaatg 4380aactgagtaa ggcatctaat aggattcagg aatcaaagaa
tcagtgtact caggtggtac 4440aggaaagaga gagccttctg gtgaaaatca
aagtcctgga gcaagacaag gcaaggctgc 4500agaggctgga ggatgagctg
aatcgtgcaa aatcaactct agaggcagaa accagggtga 4560aacagcgcct
ggagtgtgag aaacagcaaa ttcagaatga cctgaatcag tggaagactc
4620aatattcccg caaggaggag gctattagga agatagaatc ggaaagagaa
aagagtgaga 4680gagagaagaa cagtcttagg agtgagatcg aaagactcca
agcagagatc aagagaattg 4740aagagaggtg caggcgtaag ctggaggatt
ctaccaggga gacacagtca cagttagaaa 4800cagaacgctc ccgatatcag
agggagattg ataaactcag acagcgccca tatgggtccc 4860atcgagagac
ccagactgag tgtgagtgga ccgttgacac ctccaagctg gtgtttgatg
4920ggctgaggaa gaaggtgaca gcaatgcagc tctatgagtg tcagctgatc
gacaaaacaa 4980ccttggacaa actattgaag gggaagaagt cagtggaaga
agttgcttct gaaatccagc 5040cattccttcg gggtgcagga tctatcgctg
gagcatctgc ttctcctaag gaaaaatact 5100ctttggtaga ggccaagaga
aagaaattaa tcagcccaga atccacagtc atgcttctgg 5160aggcccaggc
agctacaggt ggtataattg atccccatcg gaatgagaag ctgactgtcg
5220acagtgccat agctcgggac ctcattgact tcgatgaccg tcagcagata
tatgcagcag 5280aaaaagctat cactggtttt gatgatccat tttcaggcaa
gacagtatct gtttcagaag 5340ccatcaagaa aaatttgatt gatagagaaa
ccggaatgcg cctgctggaa gcccagattg 5400cttcaggggg tgtagtagac
cctgtgaaca gtgtcttttt gccaaaagat gtcgccttgg 5460cccgggggct
gattgataga gatttgtatc gatccctgaa tgatccccga gatagtcaga
5520aaaactttgt ggatccagtc accaaaaaga aggtcagtta cgtgcagctg
aaggaacggt 5580gcagaatcga accacatact ggtctgctct tgctttcagt
acagaagaga agcatgtcct 5640tccaaggaat cagacaacct gtgaccgtca
ctgagctagt agattctggt atattgagac 5700cgtccactgt caatgaactg
gaatctggtc agatttctta tgacgaggtt ggtgagagaa 5760ttaaggactt
cctccagggt tcaagctgca tagcaggcat atacaatgag accacaaaac
5820agaagcttgg catttatgag gccatgaaaa ttggcttagt ccgacctggt
actgctctgg 5880agttgctgga agcccaagca gctactggct ttatagtgga
tcctgttagc aacttgaggt 5940taccagtgga ggaagcctac aagagaggtc
tggtgggcat tgagttcaaa gagaagctcc 6000tgtctgcaga acgagctgtc
actgggtata atgatcctga aacaggaaac atcatctctt 6060tgttccaagc
catgaataag gaactcatcg aaaagggcca cggtattcgc ttattagaag
6120cacagatcgc aaccgggggg atcattgacc caaaggagag ccatcgttta
ccagttgaca 6180tagcatataa gaggggctat ttcaatgagg aactcagtga
gattctctca gatccaagtg 6240atgataccaa aggatttttt gaccccaaca
ctgaagaaaa tcttacctat ctgcaactaa 6300aagaaagatg cattaaggat
gaggaaacag ggctctgtct tctgcctctg aaagaaaaga 6360agaaacaggt
gcagacatca caaaagaata ccctcaggaa gcgtagagtg gtcatagttg
6420acccagaaac caataaagaa atgtctgttc aggaggccta caagaagggc
ctaattgatt 6480atgaaacctt caaagaactg tgtgagcagg aatgtgaatg
ggaagaaata accatcacgg 6540gatcagatgg ctccaccagg gtggtcctgg
tagatagaaa gacaggcagt cagtatgata 6600ttcaagatgc tattgacaag
ggccttgttg acaggaagtt ctttgatcag taccgatccg 6660gcagcctcag
cctcactcaa tttgctgaca tgatctcctt gaaaaatggt gtcggcacca
6720gcagcagcat gggcagtggt gtcagcgatg atgtttttag cagctcccga
catgaatcag 6780taagtaagat ttccaccata tccagcgtca ggaatttaac
cataaggagc agctcttttt 6840cagacaccct ggaagaatcg agccccattg
cagccatctt tgacacagaa aacctggaga 6900aaatctccat tacagaaggt
atagagcggg gcatcgttga cagcatcacg ggtcagaggc 6960ttctggaggc
tcaggcctgc acaggtggca tcatccaccc aaccacgggc cagaagctgt
7020cacttcagga cgcagtctcc cagggtgtga ttgaccaaga catggccacc
aggctgaagc 7080ctgctcagaa agccttcata ggcttcgagg gtgtgaaggg
aaagaagaag atgtcagcag 7140cagaggcagt gaaagaaaaa tggctcccgt
atgaggctgg ccagcgcttc ctggagttcc 7200agtacctcac gggaggtctt
gttgacccgg aagtgcatgg gaggataagc accgaagaag 7260ccatccggaa
ggggttcata gatggccgcg ccgcacagag gctgcaagac accagcagct
7320atgccaaaat cctgacctgc cccaaaacca aattaaaaat atcctataag
gatgccataa 7380atcgctccat ggtagaagat atcactgggc tgcgccttct
ggaagccgcc tccgtgtcgt 7440ccaagggctt acccagccct tacaacatgt
cttcggctcc ggggtcccgc tccggctccc 7500gctcgggatc tcgctccgga
tctcgctccg ggtcccgcag tgggtcccgg agaggaagct 7560ttgacgccac
agggaattct tcctactctt attcctactc atttagcagt agttctattg
7620ggcactagta gtcagttggg agtggttgct ataccttgac ttcatttata
tgaatttcca 7680ctttattaaa taatagaaaa gaaaatcccg gtgcttgcag
tagagtgata ggacattcta 7740tgcttacaga aaatatagcc atgattgaaa
tcaaatagta aaggctgttc tggcttttta 7800tcttcttagc tcatcttaaa
taagcagtac acttggatgc agtgcgtctg aagtgctaat 7860cagttgtaac
aatagcacaa atcgaactta ggatttgttt cttctcttct gtgtttcgat
7920ttttgatcaa ttctttaatt ttggaagcct ataatacagt tttctattct
tggagataaa 7980aattaaatgg atcactgata ttttagtcat tctgcttctc
atctaaatat ttccatattc 8040tgtattagga gaaaattacc ctcccagcac
cagcccccct ctcaaacccc caacccaaaa 8100ccaagcattt tggaatgagt
ctcctttagt ttcagagtgt ggattgtata acccatatac 8160tcttcgatgt
acttgtttgg tttggtatta atttgactgt gcatgacagc ggcaatcttt
8220tctttggtca aagttttctg tttattttgc ttgtcatatt cgatgtactt
taaggtgtct 8280ttatgaagtt tgctattctg gcaataaact tttagacttt
tgaagtgttt gtgttttaat 8340ttaatatgtt tataagcatg tataaacatt
tagcatattt ttatcatagg tctaaaaata 8400tttgtttact aaatacctgt
gaagaaatac cattaaaaaa ctatttggtt ctgaattctt 8460actagaaaaa aaa
8473
47 2428 PRT Homo sapiens 47
Met Ser Cys Asn Gly Gly Ser His Pro Arg
Ile Asn Thr Leu Gly Arg1 5 10 15Met Ile Arg Ala Glu Ser Gly Pro Asp
Leu Arg Tyr Glu Val Thr Ser 20 25 30Gly Gly Gly Gly Thr Ser Arg Met
Tyr Tyr Ser Arg Arg Gly Val Ile 35 40 45Thr Asp Gln Asn Ser Asp Gly
Tyr Cys Gln Thr Gly Thr Met Ser Arg 50 55 60His Gln Asn Gln Asn Thr
Ile Gln Glu Leu Leu Gln Asn Cys Ser Asp65 70 75 80Cys Leu Met Arg
Ala Glu Leu Ile Val Gln Pro Glu Leu Lys Tyr Gly 85 90 95Asp Gly Ile
Gln Leu Thr Arg Ser Arg Glu Leu Asp Glu Cys Phe Ala 100 105 110Gln
Ala Asn Asp Gln Met Glu Ile Leu Asp Ser Leu Ile Arg Glu Met 115 120
125Arg Gln Met Gly Gln Pro Cys Asp Ala Tyr Gln Lys Arg Leu Leu Gln
130 135 140Leu Gln Glu Gln Met Arg Ala Leu Tyr Lys Ala Ile Ser Val
Pro Arg145 150 155 160Val Arg Arg Ala Ser Ser Lys Gly Gly Gly Gly
Tyr Thr Cys Gln Ser 165 170 175Gly Ser Gly Trp Asp Glu Phe Thr Lys
His Val Thr Ser Glu Cys Leu 180 185 190Gly Trp Met Arg Gln Gln Arg
Ala Glu Met Asp Met Val Ala Trp Gly 195 200 205Val Asp Leu Ala Ser
Val Glu Gln His Ile Asn Ser His Arg Gly Ile 210 215 220His Asn Ser
Ile Gly Asp Tyr Arg Trp Gln Leu Asp Lys Ile Lys Ala225 230 235
240Asp Leu Arg Glu Lys Ser Ala Ile Tyr Gln Leu Glu Glu Glu Tyr Glu
245 250 255Asn Leu Leu Lys Ala Ser Phe Glu Arg Met Asp His Leu Arg
Gln Leu 260 265 270Gln Asn Ile Ile Gln Ala Thr Ser Arg Glu Ile Met
Trp Ile Asn Asp 275 280 285Cys Glu Glu Glu Glu Leu Leu Tyr Asp Trp
Ser Asp Lys Asn Thr Asn 290 295 300Ile Ala Gln Lys Gln Glu Ala Phe
Ser Ile Arg Met Ser Gln Leu Glu305 310 315 320Val Lys Glu Lys Glu
Leu Asn Lys Leu Lys Gln Glu Ser Asp Gln Leu 325 330 335Val Leu Asn
Gln His Pro Ala Ser Asp Lys Ile Glu Ala Tyr Met Asp 340 345 350Thr
Leu Gln Thr Gln Trp Ser Trp Ile Leu Gln Ile Thr Lys Cys Ile 355 360
365Asp Val His Leu Lys Glu Asn Ala Ala Tyr Phe Gln Phe Phe Glu Glu
370 375 380Ala Gln Ser Thr Glu Ala Tyr Leu Lys Gly Leu Gln Asp Ser
Ile Arg385 390 395 400Lys Lys Tyr Pro Cys Asp Lys Asn Met Pro Leu
Gln His Leu Leu Glu 405 410 415Gln Ile Lys Glu Leu Glu Lys Glu Arg
Glu Lys Ile Leu Glu Tyr Lys 420 425 430Arg Gln Val Gln Asn Leu Val
Asn Lys Ser Lys Lys Ile Val Gln Leu 435 440 445Lys Pro Arg Asn Pro
Asp Tyr Arg Ser Asn Lys Pro Ile Ile Leu Arg 450 455 460Ala Leu Cys
Asp Tyr Lys Gln Asp Gln Lys Ile Val His Lys Gly Asp465 470 475
480Glu Cys Ile Leu Lys Asp Asn Asn Glu Arg Ser Lys Trp Tyr Val Thr
485 490 495Gly Pro Gly Gly Val Asp Met Leu Val Pro Ser Val Gly Leu
Ile Ile 500 505 510Pro Pro Pro Asn Pro Leu Ala Val Asp Leu Ser Cys
Lys Ile Glu Gln 515 520 525Tyr Tyr Glu Ala Ile Leu Ala Leu Trp Asn
Gln Leu Tyr Ile Asn Met 530 535 540Lys Ser Leu Val Ser Trp His Tyr
Cys Met Ile Asp Ile Glu Lys Ile545 550 555 560Arg Ala Met Thr Ile
Ala Lys Leu Lys Thr Met Arg Gln Glu Asp Tyr 565 570 575Met Lys Thr
Ile Ala Asp Leu Glu Leu His Tyr Gln Glu Phe Ile Arg 580 585 590Asn
Ser Gln Gly Ser Glu Met Phe Gly Asp Asp Asp Lys Arg Lys Ile 595 600
605Gln Ser Gln Phe Thr Asp Ala Gln Lys His Tyr Gln Thr Leu Val Ile
610 615 620Gln Leu Pro Gly Tyr Pro Gln His Gln Thr Val Thr Thr Thr
Glu Ile625 630 635 640Thr His His Gly Thr Cys Gln Asp Val Asn His
Asn Lys Val Ile Glu 645 650 655Thr Asn Arg Glu Asn Asp Lys Gln Glu
Thr Trp Met Leu Met Glu Leu 660 665 670Gln Lys Ile Arg Arg Gln Ile
Glu His Cys Glu Gly Arg Met Thr Leu 675 680 685Lys Asn Leu Pro Leu
Ala Asp Gln Gly Ser Ser His His Ile Thr Val 690 695 700Lys Ile Asn
Glu Leu Lys Ser Val Gln Asn Asp Ser Gln Ala Ile Ala705 710 715
720Glu Val Leu Asn Gln Leu Lys Asp Met Leu Ala Asn Phe Arg Gly Ser
725 730 735Glu Lys Tyr Cys Tyr Leu Gln Asn Glu Val Phe Gly Leu Phe
Gln Lys 740 745 750Leu Glu Asn Ile Asn Gly Val Thr Asp Gly Tyr Leu
Asn Ser Leu Cys 755 760 765Thr Val Arg Ala Leu Leu Gln Ala Ile Leu
Gln Thr Glu Asp Met Leu 770 775 780Lys Val Tyr Glu Ala Arg Leu Thr
Glu Glu Glu Thr Val Cys Leu Asp785 790 795 800Leu Asp Lys Val Glu
Ala Tyr Arg Cys Gly Leu Lys Lys Ile Lys Asn 805 810 815Asp Leu Asn
Leu Lys Lys Ser Leu Leu Ala Thr Met Lys Thr Glu Leu 820 825 830Gln
Lys Ala Gln Gln Ile His Ser Gln Thr Ser Gln Gln Tyr Pro Leu 835 840
845Tyr Asp Leu Asp Leu Gly Lys Phe Gly Glu Lys Val Thr Gln Leu Thr
850 855 860Asp Arg Trp Gln Arg Ile Asp Lys Gln Ile Asp Phe Arg Leu
Trp Asp865 870 875 880Leu Glu Lys Gln Ile Lys Gln Leu Arg Asn Tyr
Arg Asp Asn Tyr Gln 885 890 895Ala Phe Cys Lys Trp Leu Tyr Asp Ala
Lys Arg Arg Gln Asp Ser Leu 900 905 910Glu Ser Met Lys Phe Gly Asp
Ser Asn Thr Val Met Arg Phe Leu Asn 915 920 925Glu Gln Lys Asn Leu
His Ser Glu Ile Ser Gly Lys Arg Asp Lys Ser 930 935 940Glu Glu Val
Gln Lys Ile Ala Glu Leu Cys Ala Asn Ser Ile Lys Asp945 950 955
960Tyr Glu Leu Gln Leu Ala Ser Tyr Thr Ser Gly Leu Glu Thr Leu Leu
965 970 975Asn Ile Pro Ile Lys Arg Thr Met Ile Gln Ser Pro Ser Gly
Val Ile 980 985 990Leu Gln Glu Ala Ala Asp Val His Ala Arg Tyr Ile
Glu Leu Leu Thr 995 1000 1005Arg Ser Gly Asp Tyr Tyr Arg Phe Leu
Ser Glu Met Leu Lys Ser 1010 1015 1020Leu Glu Asp Leu Lys Leu Lys
Asn Thr Lys Ile Glu Val Leu Glu 1025 1030 1035Glu Glu Leu Arg Leu
Ala Arg Asp Ala Asn Ser Glu Asn Cys
Asn 1040 1045 1050Lys Asn Lys Phe Leu Asp Gln Asn Leu Gln Lys Tyr
Gln Ala Glu 1055 1060 1065Cys Ser Gln Phe Lys Ala Lys Leu Ala Ser
Leu Glu Glu Leu Lys 1070 1075 1080Arg Gln Ala Glu Leu Asp Gly Lys
Ser Ala Lys Gln Asn Leu Asp 1085 1090 1095Lys Cys Tyr Gly Gln Ile
Lys Glu Leu Asn Glu Lys Ile Thr Arg 1100 1105 1110Leu Thr Tyr Glu
Ile Glu Asp Glu Lys Arg Arg Arg Lys Ser Val 1115 1120 1125Glu Asp
Arg Phe Asp Gln Gln Lys Asn Asp Tyr Asp Gln Leu Gln 1130 1135
1140Lys Ala Arg Gln Cys Glu Lys Glu Asn Leu Gly Trp Gln Lys Leu
1145 1150 1155Glu Ser Glu Lys Ala Ile Lys Glu Lys Glu Tyr Glu Ile
Glu Arg 1160 1165 1170Leu Arg Val Leu Leu Gln Glu Glu Gly Thr Arg
Lys Arg Glu Tyr 1175 1180 1185Glu Asn Glu Leu Ala Lys Val Arg Asn
His Tyr Asn Glu Glu Met 1190 1195 1200Ser Asn Leu Arg Asn Lys Tyr
Glu Thr Glu Ile Asn Ile Thr Lys 1205 1210 1215Thr Thr Ile Lys Glu
Ile Ser Met Gln Lys Glu Asp Asp Ser Lys 1220 1225 1230Asn Leu Arg
Asn Gln Leu Asp Arg Leu Ser Arg Glu Asn Arg Asp 1235 1240 1245Leu
Lys Asp Glu Ile Val Arg Leu Asn Asp Ser Ile Leu Gln Ala 1250 1255
1260Thr Glu Gln Arg Arg Arg Ala Glu Glu Asn Ala Leu Gln Gln Lys
1265 1270 1275Ala Cys Gly Ser Glu Ile Met Gln Lys Lys Gln His Leu
Glu Ile 1280 1285 1290Glu Leu Lys Gln Val Met Gln Gln Arg Ser Glu
Asp Asn Ala Arg 1295 1300 1305His Lys Gln Ser Leu Glu Glu Ala Ala
Lys Thr Ile Gln Asp Lys 1310 1315 1320Asn Lys Glu Ile Glu Arg Leu
Lys Ala Glu Phe Gln Glu Glu Ala 1325 1330 1335Lys Arg Arg Trp Glu
Tyr Glu Asn Glu Leu Ser Lys Ala Ser Asn 1340 1345 1350Arg Ile Gln
Glu Ser Lys Asn Gln Cys Thr Gln Val Val Gln Glu 1355 1360 1365Arg
Glu Ser Leu Leu Val Lys Ile Lys Val Leu Glu Gln Asp Lys 1370 1375
1380Ala Arg Leu Gln Arg Leu Glu Asp Glu Leu Asn Arg Ala Lys Ser
1385 1390 1395Thr Leu Glu Ala Glu Thr Arg Val Lys Gln Arg Leu Glu
Cys Glu 1400 1405 1410Lys Gln Gln Ile Gln Asn Asp Leu Asn Gln Trp
Lys Thr Gln Tyr 1415 1420 1425Ser Arg Lys Glu Glu Ala Ile Arg Lys
Ile Glu Ser Glu Arg Glu 1430 1435 1440Lys Ser Glu Arg Glu Lys Asn
Ser Leu Arg Ser Glu Ile Glu Arg 1445 1450 1455Leu Gln Ala Glu Ile
Lys Arg Ile Glu Glu Arg Cys Arg Arg Lys 1460 1465 1470Leu Glu Asp
Ser Thr Arg Glu Thr Gln Ser Gln Leu Glu Thr Glu 1475 1480 1485Arg
Ser Arg Tyr Gln Arg Glu Ile Asp Lys Leu Arg Gln Arg Pro 1490 1495
1500Tyr Gly Ser His Arg Glu Thr Gln Thr Glu Cys Glu Trp Thr Val
1505 1510 1515Asp Thr Ser Lys Leu Val Phe Asp Gly Leu Arg Lys Lys
Val Thr 1520 1525 1530Ala Met Gln Leu Tyr Glu Cys Gln Leu Ile Asp
Lys Thr Thr Leu 1535 1540 1545Asp Lys Leu Leu Lys Gly Lys Lys Ser
Val Glu Glu Val Ala Ser 1550 1555 1560Glu Ile Gln Pro Phe Leu Arg
Gly Ala Gly Ser Ile Ala Gly Ala 1565 1570 1575Ser Ala Ser Pro Lys
Glu Lys Tyr Ser Leu Val Glu Ala Lys Arg 1580 1585 1590Lys Lys Leu
Ile Ser Pro Glu Ser Thr Val Met Leu Leu Glu Ala 1595 1600 1605Gln
Ala Ala Thr Gly Gly Ile Ile Asp Pro His Arg Asn Glu Lys 1610 1615
1620Leu Thr Val Asp Ser Ala Ile Ala Arg Asp Leu Ile Asp Phe Asp
1625 1630 1635Asp Arg Gln Gln Ile Tyr Ala Ala Glu Lys Ala Ile Thr
Gly Phe 1640 1645 1650Asp Asp Pro Phe Ser Gly Lys Thr Val Ser Val
Ser Glu Ala Ile 1655 1660 1665Lys Lys Asn Leu Ile Asp Arg Glu Thr
Gly Met Arg Leu Leu Glu 1670 1675 1680Ala Gln Ile Ala Ser Gly Gly
Val Val Asp Pro Val Asn Ser Val 1685 1690 1695Phe Leu Pro Lys Asp
Val Ala Leu Ala Arg Gly Leu Ile Asp Arg 1700 1705 1710Asp Leu Tyr
Arg Ser Leu Asn Asp Pro Arg Asp Ser Gln Lys Asn 1715 1720 1725Phe
Val Asp Pro Val Thr Lys Lys Lys Val Ser Tyr Val Gln Leu 1730 1735
1740Lys Glu Arg Cys Arg Ile Glu Pro His Thr Gly Leu Leu Leu Leu
1745 1750 1755Ser Val Gln Lys Arg Ser Met Ser Phe Gln Gly Ile Arg
Gln Pro 1760 1765 1770Val Thr Val Thr Glu Leu Val Asp Ser Gly Ile
Leu Arg Pro Ser 1775 1780 1785Thr Val Asn Glu Leu Glu Ser Gly Gln
Ile Ser Tyr Asp Glu Val 1790 1795 1800Gly Glu Arg Ile Lys Asp Phe
Leu Gln Gly Ser Ser Cys Ile Ala 1805 1810 1815Gly Ile Tyr Asn Glu
Thr Thr Lys Gln Lys Leu Gly Ile Tyr Glu 1820 1825 1830Ala Met Lys
Ile Gly Leu Val Arg Pro Gly Thr Ala Leu Glu Leu 1835 1840 1845Leu
Glu Ala Gln Ala Ala Thr Gly Phe Ile Val Asp Pro Val Ser 1850 1855
1860Asn Leu Arg Leu Pro Val Glu Glu Ala Tyr Lys Arg Gly Leu Val
1865 1870 1875Gly Ile Glu Phe Lys Glu Lys Leu Leu Ser Ala Glu Arg
Ala Val 1880 1885 1890Thr Gly Tyr Asn Asp Pro Glu Thr Gly Asn Ile
Ile Ser Leu Phe 1895 1900 1905Gln Ala Met Asn Lys Glu Leu Ile Glu
Lys Gly His Gly Ile Arg 1910 1915 1920Leu Leu Glu Ala Gln Ile Ala
Thr Gly Gly Ile Ile Asp Pro Lys 1925 1930 1935Glu Ser His Arg Leu
Pro Val Asp Ile Ala Tyr Lys Arg Gly Tyr 1940 1945 1950Phe Asn Glu
Glu Leu Ser Glu Ile Leu Ser Asp Pro Ser Asp Asp 1955 1960 1965Thr
Lys Gly Phe Phe Asp Pro Asn Thr Glu Glu Asn Leu Thr Tyr 1970 1975
1980Leu Gln Leu Lys Glu Arg Cys Ile Lys Asp Glu Glu Thr Gly Leu
1985 1990 1995Cys Leu Leu Pro Leu Lys Glu Lys Lys Lys Gln Val Gln
Thr Ser 2000 2005 2010Gln Lys Asn Thr Leu Arg Lys Arg Arg Val Val
Ile Val Asp Pro 2015 2020 2025Glu Thr Asn Lys Glu Met Ser Val Gln
Glu Ala Tyr Lys Lys Gly 2030 2035 2040Leu Ile Asp Tyr Glu Thr Phe
Lys Glu Leu Cys Glu Gln Glu Cys 2045 2050 2055Glu Trp Glu Glu Ile
Thr Ile Thr Gly Ser Asp Gly Ser Thr Arg 2060 2065 2070Val Val Leu
Val Asp Arg Lys Thr Gly Ser Gln Tyr Asp Ile Gln 2075 2080 2085Asp
Ala Ile Asp Lys Gly Leu Val Asp Arg Lys Phe Phe Asp Gln 2090 2095
2100Tyr Arg Ser Gly Ser Leu Ser Leu Thr Gln Phe Ala Asp Met Ile
2105 2110 2115Ser Leu Lys Asn Gly Val Gly Thr Ser Ser Ser Met Gly
Ser Gly 2120 2125 2130Val Ser Asp Asp Val Phe Ser Ser Ser Arg His
Glu Ser Val Ser 2135 2140 2145Lys Ile Ser Thr Ile Ser Ser Val Arg
Asn Leu Thr Ile Arg Ser 2150 2155 2160Ser Ser Phe Ser Asp Thr Leu
Glu Glu Ser Ser Pro Ile Ala Ala 2165 2170 2175Ile Phe Asp Thr Glu
Asn Leu Glu Lys Ile Ser Ile Thr Glu Gly 2180 2185 2190Ile Glu Arg
Gly Ile Val Asp Ser Ile Thr Gly Gln Arg Leu Leu 2195 2200 2205Glu
Ala Gln Ala Cys Thr Gly Gly Ile Ile His Pro Thr Thr Gly 2210 2215
2220Gln Lys Leu Ser Leu Gln Asp Ala Val Ser Gln Gly Val Ile Asp
2225 2230 2235Gln Asp Met Ala Thr Arg Leu Lys Pro Ala Gln Lys Ala
Phe Ile 2240 2245 2250Gly Phe Glu Gly Val Lys Gly Lys Lys Lys Met
Ser Ala Ala Glu 2255 2260 2265Ala Val Lys Glu Lys Trp Leu Pro Tyr
Glu Ala Gly Gln Arg Phe 2270 2275 2280Leu Glu Phe Gln Tyr Leu Thr
Gly Gly Leu Val Asp Pro Glu Val 2285 2290 2295His Gly Arg Ile Ser
Thr Glu Glu Ala Ile Arg Lys Gly Phe Ile 2300 2305 2310Asp Gly Arg
Ala Ala Gln Arg Leu Gln Asp Thr Ser Ser Tyr Ala 2315 2320 2325Lys
Ile Leu Thr Cys Pro Lys Thr Lys Leu Lys Ile Ser Tyr Lys 2330 2335
2340Asp Ala Ile Asn Arg Ser Met Val Glu Asp Ile Thr Gly Leu Arg
2345 2350 2355Leu Leu Glu Ala Ala Ser Val Ser Ser Lys Gly Leu Pro
Ser Pro 2360 2365 2370Tyr Asn Met Ser Ser Ala Pro Gly Ser Arg Ser
Gly Ser Arg Ser 2375 2380 2385Gly Ser Arg Ser Gly Ser Arg Ser Gly
Ser Arg Ser Gly Ser Arg 2390 2395 2400Arg Gly Ser Phe Asp Ala Thr
Gly Asn Ser Ser Tyr Ser Tyr Ser 2405 2410 2415Tyr Ser Phe Ser Ser
Ser Ser Ile Gly His 2420 2425
48 8768 DNA Homo sapiens 48
gcggccgcac
tagtaccccg gagcccatgg gcgcgccgag ccgggcgcgg gggcgctgaa 60cggcggagcg
ggagcggccg gaggagccat ggactgcagc ctcgtgcgga cgctcgtgca
120cagatactgt gcaggagaag agaattgggt ggacagcagg accatctacg
tgggacacag 180ggagccacct ccgggcgcag aggcctacat cccacagaga
tacccagaca acaggatcgt 240ctcgtccaag tacacatttt ggaactttat
acccaagaat ttatttgaac aattcagaag 300agtagccaac ttttatttcc
ttatcatatt tctggtgcag ttgattattg atacacccac 360aagtccagtg
acaagcggac ttccactctt ctttgtcatt actgtgacgg ctatcaaaca
420gggttatgaa gactggcttc gacataaagc agacaatgcc atgaaccagt
gtcctgttca 480tttcattcag cacggcaagc tcgttcggaa acaaagtcga
aagctgcgag ttggggacat 540tgtcatggtt aaggaggacg agacctttcc
ctgcgacttg atcttccttt ccagcaaccg 600gggagatggg acgtgccacg
tcaccaccgc cagcttggat ggagaatcca gccataaaac 660gcattacgcg
gtccaggaca ccaaaggctt ccacacagag gaggatatcg gcggacttca
720cgccaccatc gagtgtgagc agccccagcc cgacctctac aagttcgtgg
gtcgcatcaa 780cgtttacagt gacctgaatg accccgtggt gaggccctta
ggatcggaaa acctgctgct 840tagaggagct acactgaaga acactgagaa
aatctttggt gtggctattt acacgggaat 900ggaaaccaag atggcattaa
attatcaatc aaaatctcag aagcgatctg ccgtggaaaa 960atcgatgaat
gcgttcctca ttgtgtatct ctgcattctg atcagcaaag ccctgataaa
1020cactgtgctg aaatacatgt ggcagagtga gccctttcgg gatgagccgt
ggtataatca 1080gaaaacggag tcggaaaggc agaggaatct gttcctcaag
gcattcacgg acttcctggc 1140cttcatggtc ctctttaact acatcatccc
tgtgtccatg tacgtcacgg tcgagatgca 1200gaagttcctc ggctcttact
tcatcacctg ggacgaagac atgtttgacg aggagactgg 1260cgaggggcct
ctggtgaaca cgtcggacct caatgaagag ctgggacagg tggagtacat
1320cttcacagac aagaccggca ccctcacgga aaacaacatg gagttcaagg
agtgctgcat 1380cgaaggccat gtctacgtgc cccacgtcat ctgcaacggg
caggtcctcc cagagtcgtc 1440aggaatcgac atgattgact cgtcccccag
cgtcaacggg agggagcgcg aggagctgtt 1500tttccgggcc ctctgtctct
gccacaccgt ccaggtgaaa gacgatgaca gcgtagacgg 1560ccccaggaaa
tcgccggacg gggggaaatc ctgtgtgtac atctcatcct cgcccgacga
1620ggtggcgctg gtcgaaggtg tccagagact tggctttacc tacctaaggc
tgaaggacaa 1680ttacatggag atattaaaca gggagaacca catcgaaagg
tttgaattgc tggaaatttt 1740gagttttgac tcagtcagaa ggagaatgag
tgtaattgta aaatctgcta caggagaaat 1800ttatctgttt tgcaaaggag
cagattcttc gatattcccc cgagtgatag aaggcaaagt 1860tgaccagatc
cgagccagag tggagcgtaa cgcagtggag gggctccgaa ctttgtgtgt
1920tgcttataaa aggctgatcc aagaagaata tgaaggcatt tgtaagctgc
tgcaggctgc 1980caaagtggcc cttcaagatc gagagaaaaa gttagcagaa
gcctatgagc aaatagagaa 2040agatcttact ctgcttggtg ctacagctgt
tgaggaccgg ctgcaggaga aagctgcaga 2100caccatcgag gccctgcaga
aggccgggat caaagtctgg gttctcacgg gagacaagat 2160ggagacggcc
gcggccacgt gctacgcctg caagctcttc cgcaggaaca cgcagctgct
2220ggagctgacc accaagagga tcgaggagca gagcctgcac gacgtcctgt
tcgagctgag 2280caagacggtc ctgcgccaca gcgggagcct gaccagagac
aacctgtccg gactttcagc 2340agatatgcag gactacggtt taattatcga
cggagctgca ctgtctctga taatgaagcc 2400tcgagaagac gggagttccg
gcaactacag ggagctcttc ctggaaatct gccggagctg 2460cagcgcggtg
ctctgctgcc gcatggcgcc cttgcagaag gctcagattg ttaaattaat
2520caaattttca aaagagcacc caatcacgtt agcaattggc gatggtgcaa
atgatgtcag 2580catgattctg gaagcgcacg tgggcatagg tgtcatcggc
aaggaaggcc gccaggctgc 2640caggaacagc gactatgcaa tcccaaagtt
taagcatttg aagaagatgc tgcttgttca 2700cgggcatttt tattacatta
ggatctctga gctcgtgcag tacttcttct ataagaacgt 2760ctgcttcatc
ttccctcagt ttttatacca gttcttctgt gggttttcac aacagacttt
2820gtacgacacc gcgtatctga ccctctacaa catcagcttc acctccctcc
ccatcctcct 2880gtacagcctc atggagcagc atgttggcat tgacgtgctc
aagagagacc cgaccctgta 2940cagggacgtc gccaagaatg ccctgctgcg
ctggcgcgtg ttcatctact ggacgctcct 3000gggactgttt gacgcactgg
tgttcttctt tggtgcttat ttcgtgtttg aaaatacaac 3060tgtgacaagc
aacgggcaga tatttggaaa ctggacgttt ggaacgctgg tattcaccgt
3120gatggtgttc acagttacac taaagcttgc attggacaca cactactgga
cttggatcaa 3180ccattttgtc atctgggggt cgctgctgtt ctacgttgtc
ttttcgcttc tctggggagg 3240agtgatctgg ccgttcctca actaccagag
gatgtactac gtgttcatcc agatgctgtc 3300cagcgggccc gcctggctgg
ccatcgtgct gctggtgacc atcagcctcc ttcccgacgt 3360cctcaagaaa
gtcctgtgcc ggcagctgtg gccaacagca acagagagag tccagaatgg
3420gtgcgcacag cctcgggacc gcgactcaga attcacccct cttgcctctc
tgcagagccc 3480aggctaccag agcacctgtc cctcggccgc ctggtacagc
tcccactctc agcaggtgac 3540actcgcggcc tggaaggaga aggtgtccac
ggagccccca cccatcctcg gcggttccca 3600tcaccactgc agttccatcc
caagtcacag ctgccctagg tcccgtgtgg gaatgctcgt 3660gtgatggatg
gtcctaagcc tgtggagact gtgcacgtgc ctcttcctgg cccccagcag
3720gcaaggaggg gggtcacagg ccttgccctc gagcatggca ccctggccgc
ctggacccag 3780cactgtggtt gttgagccac accagtggcc tctgggcatt
cggctcaacg caggagggac 3840attctgctgg cccaccctgc gcgctgtcat
gcagaggcca ttcccccagg cctgtgtctt 3900cacccacctg ccatcattgg
cctttgctgt cactgggaga gaagagccgt ccagggaccc 3960atggtggccc
acatgtggat gccacatgct gctgtttcct gcttgcccgg ccaccaccca
4020tgccctccat agggtgaggt ggagccatgg tggtgcgtcc tttactcaac
aaccctccaa 4080tccggatgct gtgggaaggg ccgggtcact cggataccat
catccctgcg gatgcaccgc 4140cgtaccctgc tcatctggga gtggtttccc
tgcggttacg tccaagcccg cctgccctgt 4200gtgttggggc tggctgagtt
tcggtctccc catcaccggc cgcctcgtgg agaaggcagt 4260gccacgtggg
aggacaaggc cacgccggca gcttccagcc ctgccgcaga agtgccagga
4320tgtccatcag ccactcgcca gggcacggag ccgtcagtcc actgttacgg
gagaatgttg 4380atttcgcggg tgcgagggcc gggagacaga tacttggctg
tgatgagcag acatcctctg 4440tccccgtgga ggggtcaaca ccaaggtggt
gttcgtgcac cagaacctgt ctcgggctga 4500cgggggtggc acacaggaca
cgggtggatc ccaacaggca gcaccgcacc tctgcccgcc 4560tcccgcactg
cagctccgcc cgccgggctc tgcgtcccca cgtcccctcg tcccatcccc
4620acgtcccctc atcccgtcac ctcgtcccca catccccttg ccccgtcacc
tcgtcctcat 4680gtccccttgt cctgtcacct cgtccccacg tcccctcgtc
tcctcatccc cacgtcctct 4740cgtccccttg tcccgtcccc acataccctc
gtccccatgt ccccacgcag ggctctcctt 4800cgtcttagga tctgtccagc
gctgctctgg gtgggttagc aaccccaggg ctgctgtgat 4860aggaagtccc
tgttgttctc cgtactggca tttctatttc tagaaataat atttgacata
4920gccttaatgg tccttaaaga agacatttca gtgtgagatt cagacttcag
acgctgaaac 4980tgctgccttt caggaaagca ccaccaacgc tggaggagga
gccggccctc acgcccgccc 5040cgcgccacgc tgtggaacgg ggctccggca
agtgaaaccc agagggtgtt tccgaggtgc 5100tcgacagtag gtatttttgg
aagctcagat ttcaccattt gattgtataa tcttttacct 5160ataaaatatt
tatttgaagt agagggtaaa tcagcggtaa gaacagtgaa cacagtggtt
5220gggataaaat aaggtgacaa acatcacacc aaagatgagg gtagcgagca
actggcttga 5280gcagacagaa cggggaagac tccactctgt cccgaggggc
cagccgcagg cgtccccagg 5340gccaccctgc cctgaggtcc ttgtgtggcc
gccctggctt ggcagccctg cccacgctgc 5400ccccgcaaac aatggtgtgt
gcgtttttac agcccttttt aggaacccaa tatgggcata 5460aatgtaacac
ctgtagcggg ggcagattct ctgtatgttc agttaacaaa ttatttgtaa
5520tgtatttttt tagaaatctt aaaattgcct ttgcactgaa gtattttcat
agctgtttat 5580atctctttta ttcatttatt taacatactg tctaatttta
aaaataggtt tttaaagctt 5640tcatttttaa gtttatgaaa ttttggccac
tttacattta gattctggtg agagttttga 5700ctgaatgttc caatctctga
tgaatgcgaa ttttcagatt tgattttatt ctctacacac 5760acctcttctt
ttcttggtat ttctggtggc agtgattagt tgaacagcac atttaaggca
5820cgataatttg ctacactttt tctttacaat ttgttgcaat ttcatctgct
ttctatgttt 5880cattgttaat tgccatcctt cagccttaaa aatagaagat
tctcacgtga aggtttagta 5940agttgggtcc cagctctgcc tgtgtggaga
tagtcaccat gtacctctga caacaagttt 6000tagtgtgaaa gtcactaaac
ttttacacac tcccaaacgt ctttttaaaa attgcttggg 6060aaattattaa
atgaatgtgc ctgatgattt gaaatagaca aggggcacga gataaaaaag
6120aaaaggatga gaagatcctc agtgaatgac gttgcagggt cttcatgcaa
ttttccacct 6180cgcagtagtt agtatttact tgccttaaac taactttgaa
gcaagtaatg tcaactttga 6240gcactttgtt gagttttgaa aaatcttatt
tgttgctgca caggttaata aattatcaat 6300ttgtaattca gcatgttggt
cagagacacg gtcactgatt cacacccagt ccctgccaca 6360gaccgtctca
gacacgcaca gtgggcctgc tgcatgattc acacccagtc cctgccacag
6420accgtctcag acacgcacag tgggcctgct gcatgattca cacccagtcc
ctgccacaga 6480ccgtctcaga cacgcacagt
gggcctgctg catgcgtgtt acctggcttt tggctccacg 6540ctcactcata
gccatgtcca catgggggct tgcacacagg atcactcaca tatgtacatg
6600tacccaccac aaacgtgcaa gctcctgcac acatgcatgc acacaaacgt
gtacacaagt 6660gtgagctcct acacgcatac acacacacac gtgtacatgc
accaaagcat gtgtgaccta 6720cagacatgca gaacatgcac gtgtacacat
accacagaca cgcgtgtgca tgctcctaca 6780caatacatat gcacatatca
tgaacagcgt aagttcctac acacggacgt gtgatacaca 6840catgcatgta
caggtaagca cacatgtaca agctcctaca ggcttgctct cacacacgtg
6900tatgcacagc agagagacgt atgagcttct actgcacaca tgcacacaca
cacgcacacg 6960tacattcact acaaacgtgc agcctcctgc acacgtgcac
attcatgtgt acaccacaaa 7020tgagttccca gacgtgtaaa cacacgtgca
cacatcgtac acatgtgagc tcccacacgt 7080acacacagat gcacatggac
acaccccaaa cacgcacagg ctcctacaca catgcacaca 7140cgtgtacacc
acaaacgagc tcccagacat gtaaacacac gtctcccaca cgtgagctcc
7200cacacgtaca catgcacatg tacgcaccac aaacacatgc gcaggctcct
gcaggcgtga 7260atacacacat gcacacacat atacacacat gtgccacaaa
caagtgcaca ctgtcctggt 7320gtcctgcact gcatcctgcc tccttgctga
ggggcccctg tgagaggcct ctggatgggc 7380atgggaagat gggctccctg
gcccccagcc catgcctccc tgggatgaag agtccccctc 7440ctggcagaat
gtctgggctt tgcagagcag gccccggggg tgaagtcgca gcttcactta
7500caccagctgc tctgtgagca aggcttggtg ccctggacaa ggcccttccc
ctttagggag 7560gtccagcctc gcaagctgaa acctcccctc ggctcagccc
tataccaggc ggccacagca 7620ggactggcca cacccacgcc gcacctcatc
cgtgcacgcg tcggagcacg gccagccttc 7680cgccacgagc cagctgggaa
gggccgcggc cgcctaaagc cccagtcaac ccagcctgtg 7740tctgagcaga
cagggcgaac aagcaggcca caccgtctcg agggaggagg ccagatgcgg
7800ccagcgtctc caacagggtg accatccgct cggcttgctg agcgtttaaa
caaatgttta 7860gacaggctgt ggggactccc ctgagttgag ccttggccag
gggtccggtg ctgtcgcggg 7920aaacctccag ccttgttctt caaaccactc
agctcatgtg ttttgcactg actagtactg 7980aataatacaa ccactcttat
ttaatgttag tattatttat ttgacaactc agtgtctaac 8040agcttgatat
gcaggtcctt gcatcctaca tttctttagg aagttaccca tttgtaactt
8100taaaaacagg aaaaatatca gttggcaaat gcaatctttt ttttttttaa
gctaaaggtg 8160ggtgaactgg aatgaaaatc tttctgatgt tgtgtctata
agcagccttg atgggatatg 8220ttagaagtgt catgaaagtg tgattctact
tttgcagaaa aatctaaaga tcaatttata 8280tagctttatt ttttacttta
tcaaagtata cagaatttta atatgcatat attgtgtctg 8340acttaaaatt
ataatgtctg cgtcaccatt taaaatgtct gttcattatg taatgtaata
8400aaagaaggtc ttcaaaaatg tatttaacat gaatggtatc catagttgtc
atcatcataa 8460atactggagt ttatttttaa attattaaac atagtaggtg
cattaacata aatcagtctc 8520cacacagtaa catttaactg ataattcatt
aatcagcttt gaaaaattaa attgttaatt 8580aaaccaatct aacatttcag
taaagtttat tttgtatgct tctgttttta acttttattt 8640ctgtagataa
actgactgga taatattata ttggactttt ctctagatta tctaagcagg
8700agacctgaat ctgcttgcaa taaagaataa aagtctgctt cagtttcttt
ataaagaaac 8760tcacacaa 8768491191PRTHomo sapiens 49Met Asp Cys Ser
Leu Val Arg Thr Leu Val His Arg Tyr Cys Ala Gly1 5 10 15Glu Glu Asn
Trp Val Asp Ser Arg Thr Ile Tyr Val Gly His Arg Glu 20 25 30Pro Pro
Pro Gly Ala Glu Ala Tyr Ile Pro Gln Arg Tyr Pro Asp Asn 35 40 45Arg
Ile Val Ser Ser Lys Tyr Thr Phe Trp Asn Phe Ile Pro Lys Asn 50 55
60Leu Phe Glu Gln Phe Arg Arg Val Ala Asn Phe Tyr Phe Leu Ile Ile65
70 75 80Phe Leu Val Gln Leu Ile Ile Asp Thr Pro Thr Ser Pro Val Thr
Ser 85 90 95Gly Leu Pro Leu Phe Phe Val Ile Thr Val Thr Ala Ile Lys
Gln Gly 100 105 110Tyr Glu Asp Trp Leu Arg His Lys Ala Asp Asn Ala
Met Asn Gln Cys 115 120 125Pro Val His Phe Ile Gln His Gly Lys Leu
Val Arg Lys Gln Ser Arg 130 135 140Lys Leu Arg Val Gly Asp Ile Val
Met Val Lys Glu Asp Glu Thr Phe145 150 155 160Pro Cys Asp Leu Ile
Phe Leu Ser Ser Asn Arg Gly Asp Gly Thr Cys 165 170 175His Val Thr
Thr Ala Ser Leu Asp Gly Glu Ser Ser His Lys Thr His 180 185 190Tyr
Ala Val Gln Asp Thr Lys Gly Phe His Thr Glu Glu Asp Ile Gly 195 200
205Gly Leu His Ala Thr Ile Glu Cys Glu Gln Pro Gln Pro Asp Leu Tyr
210 215 220Lys Phe Val Gly Arg Ile Asn Val Tyr Ser Asp Leu Asn Asp
Pro Val225 230 235 240Val Arg Pro Leu Gly Ser Glu Asn Leu Leu Leu
Arg Gly Ala Thr Leu 245 250 255Lys Asn Thr Glu Lys Ile Phe Gly Val
Ala Ile Tyr Thr Gly Met Glu 260 265 270Thr Lys Met Ala Leu Asn Tyr
Gln Ser Lys Ser Gln Lys Arg Ser Ala 275 280 285Val Glu Lys Ser Met
Asn Ala Phe Leu Ile Val Tyr Leu Cys Ile Leu 290 295 300Ile Ser Lys
Ala Leu Ile Asn Thr Val Leu Lys Tyr Met Trp Gln Ser305 310 315
320Glu Pro Phe Arg Asp Glu Pro Trp Tyr Asn Gln Lys Thr Glu Ser Glu
325 330 335Arg Gln Arg Asn Leu Phe Leu Lys Ala Phe Thr Asp Phe Leu
Ala Phe 340 345 350Met Val Leu Phe Asn Tyr Ile Ile Pro Val Ser Met
Tyr Val Thr Val 355 360 365Glu Met Gln Lys Phe Leu Gly Ser Tyr Phe
Ile Thr Trp Asp Glu Asp 370 375 380Met Phe Asp Glu Glu Thr Gly Glu
Gly Pro Leu Val Asn Thr Ser Asp385 390 395 400Leu Asn Glu Glu Leu
Gly Gln Val Glu Tyr Ile Phe Thr Asp Lys Thr 405 410 415Gly Thr Leu
Thr Glu Asn Asn Met Glu Phe Lys Glu Cys Cys Ile Glu 420 425 430Gly
His Val Tyr Val Pro His Val Ile Cys Asn Gly Gln Val Leu Pro 435 440
445Glu Ser Ser Gly Ile Asp Met Ile Asp Ser Ser Pro Ser Val Asn Gly
450 455 460Arg Glu Arg Glu Glu Leu Phe Phe Arg Ala Leu Cys Leu Cys
His Thr465 470 475 480Val Gln Val Lys Asp Asp Asp Ser Val Asp Gly
Pro Arg Lys Ser Pro 485 490 495Asp Gly Gly Lys Ser Cys Val Tyr Ile
Ser Ser Ser Pro Asp Glu Val 500 505 510Ala Leu Val Glu Gly Val Gln
Arg Leu Gly Phe Thr Tyr Leu Arg Leu 515 520 525Lys Asp Asn Tyr Met
Glu Ile Leu Asn Arg Glu Asn His Ile Glu Arg 530 535 540Phe Glu Leu
Leu Glu Ile Leu Ser Phe Asp Ser Val Arg Arg Arg Met545 550 555
560Ser Val Ile Val Lys Ser Ala Thr Gly Glu Ile Tyr Leu Phe Cys Lys
565 570 575Gly Ala Asp Ser Ser Ile Phe Pro Arg Val Ile Glu Gly Lys
Val Asp 580 585 590Gln Ile Arg Ala Arg Val Glu Arg Asn Ala Val Glu
Gly Leu Arg Thr 595 600 605Leu Cys Val Ala Tyr Lys Arg Leu Ile Gln
Glu Glu Tyr Glu Gly Ile 610 615 620Cys Lys Leu Leu Gln Ala Ala Lys
Val Ala Leu Gln Asp Arg Glu Lys625 630 635 640Lys Leu Ala Glu Ala
Tyr Glu Gln Ile Glu Lys Asp Leu Thr Leu Leu 645 650 655Gly Ala Thr
Ala Val Glu Asp Arg Leu Gln Glu Lys Ala Ala Asp Thr 660 665 670Ile
Glu Ala Leu Gln Lys Ala Gly Ile Lys Val Trp Val Leu Thr Gly 675 680
685Asp Lys Met Glu Thr Ala Ala Ala Thr Cys Tyr Ala Cys Lys Leu Phe
690 695 700Arg Arg Asn Thr Gln Leu Leu Glu Leu Thr Thr Lys Arg Ile
Glu Glu705 710 715 720Gln Ser Leu His Asp Val Leu Phe Glu Leu Ser
Lys Thr Val Leu Arg 725 730 735His Ser Gly Ser Leu Thr Arg Asp Asn
Leu Ser Gly Leu Ser Ala Asp 740 745 750Met Gln Asp Tyr Gly Leu Ile
Ile Asp Gly Ala Ala Leu Ser Leu Ile 755 760 765Met Lys Pro Arg Glu
Asp Gly Ser Ser Gly Asn Tyr Arg Glu Leu Phe 770 775 780Leu Glu Ile
Cys Arg Ser Cys Ser Ala Val Leu Cys Cys Arg Met Ala785 790 795
800Pro Leu Gln Lys Ala Gln Ile Val Lys Leu Ile Lys Phe Ser Lys Glu
805 810 815His Pro Ile Thr Leu Ala Ile Gly Asp Gly Ala Asn Asp Val
Ser Met 820 825 830Ile Leu Glu Ala His Val Gly Ile Gly Val Ile Gly
Lys Glu Gly Arg 835 840 845Gln Ala Ala Arg Asn Ser Asp Tyr Ala Ile
Pro Lys Phe Lys His Leu 850 855 860Lys Lys Met Leu Leu Val His Gly
His Phe Tyr Tyr Ile Arg Ile Ser865 870 875 880Glu Leu Val Gln Tyr
Phe Phe Tyr Lys Asn Val Cys Phe Ile Phe Pro 885 890 895Gln Phe Leu
Tyr Gln Phe Phe Cys Gly Phe Ser Gln Gln Thr Leu Tyr 900 905 910Asp
Thr Ala Tyr Leu Thr Leu Tyr Asn Ile Ser Phe Thr Ser Leu Pro 915 920
925Ile Leu Leu Tyr Ser Leu Met Glu Gln His Val Gly Ile Asp Val Leu
930 935 940Lys Arg Asp Pro Thr Leu Tyr Arg Asp Val Ala Lys Asn Ala
Leu Leu945 950 955 960Arg Trp Arg Val Phe Ile Tyr Trp Thr Leu Leu
Gly Leu Phe Asp Ala 965 970 975Leu Val Phe Phe Phe Gly Ala Tyr Phe
Val Phe Glu Asn Thr Thr Val 980 985 990Thr Ser Asn Gly Gln Ile Phe
Gly Asn Trp Thr Phe Gly Thr Leu Val 995 1000 1005Phe Thr Val Met
Val Phe Thr Val Thr Leu Lys Leu Ala Leu Asp 1010 1015 1020Thr His
Tyr Trp Thr Trp Ile Asn His Phe Val Ile Trp Gly Ser 1025 1030
1035Leu Leu Phe Tyr Val Val Phe Ser Leu Leu Trp Gly Gly Val Ile
1040 1045 1050Trp Pro Phe Leu Asn Tyr Gln Arg Met Tyr Tyr Val Phe
Ile Gln 1055 1060 1065Met Leu Ser Ser Gly Pro Ala Trp Leu Ala Ile
Val Leu Leu Val 1070 1075 1080Thr Ile Ser Leu Leu Pro Asp Val Leu
Lys Lys Val Leu Cys Arg 1085 1090 1095Gln Leu Trp Pro Thr Ala Thr
Glu Arg Val Gln Asn Gly Cys Ala 1100 1105 1110Gln Pro Arg Asp Arg
Asp Ser Glu Phe Thr Pro Leu Ala Ser Leu 1115 1120 1125Gln Ser Pro
Gly Tyr Gln Ser Thr Cys Pro Ser Ala Ala Trp Tyr 1130 1135 1140Ser
Ser His Ser Gln Gln Val Thr Leu Ala Ala Trp Lys Glu Lys 1145 1150
1155Val Ser Thr Glu Pro Pro Pro Ile Leu Gly Gly Ser His His His
1160 1165 1170Cys Ser Ser Ile Pro Ser His Ser Cys Pro Arg Ser Arg
Val Gly 1175 1180 1185Met Leu Val 1190504673DNAHomo sapiens
50tttccgcagt taggggctgc tatttcaacg cagggagata aaaagaaaaa aacacttgct
60cttctacccc gctaaaaaca ctcatcctag ggagcacgcc agcatttgca gcgttcgggg
120cagggccact cggcctgcgg ccgttgcact ggctggaagc tggcaggcga
tcacggttga 180ttggctcggg tgcggtccaa gggcagcaac gccttcggcg
ggccgcctag ggtgattggc 240tgctgcagcc caccccctag ccggtttggt
gggcggcgaa gcctggattg gtggagctaa 300gagctggctc agtttcagcg
ctggctcttc gtgcatggca gagatggcga ctgcgactcg 360gctgctgggg
tggcgtgtgg cgagctggag gctgcggccg ccgcttgccg gcttcgtttc
420ccagcgggcc cactcgcttt tgcccgtgga cgatgcaatc aatgggctaa
gcgaggagca 480gaggcagctt cgtcagacca tggctaagtt ccttcaggag
cacctggccc ccaaggccca 540ggagatcgat cgcagcaatg agttcaagaa
cctgcgagaa ttttggaagc agctggggaa 600cctgggcgta ttgggcatca
cagcccctgt tcagtatggc ggctccggcc tgggctacct 660ggagcatgtg
ctggtgatgg aggagatatc ccgagcttcc ggagcagtgg ggctcagtta
720cggtgcccac tccaacctct gcatcaacca gcttgtacgc aatgggaatg
aggcccagaa 780agagaagtat ctcccgaagc tgatcagtgg tgagtacatc
ggagccctgg ccatgagtga 840gcccaatgca ggctctgatg ttgtctctat
gaagctcaaa gcggaaaaga aaggaaatca 900ctacatcctg aatggcaaca
agttctggat cactaatggc cctgatgctg acgtcctgat 960tgtctatgcc
aagacagatc tggctgctgt gccagcttct cggggcatca cagccttcat
1020tgtggagaag ggtatgcctg gctttagcac ctctaagaag ctggacaagc
tggggatgag 1080gggctctaac acctgtgagc taatctttga agactgcaag
attcctgctg ccaacatcct 1140gggccatgag aataagggtg tctacgtgct
gatgagtggg ctggacctgg agcggctggt 1200gctggccggg gggcctcttg
ggctcatgca agcggtcctg gaccacacca ttccctacct 1260gcacgtgagg
gaagcctttg gccagaagat cggccacttc cagttgatgc aggggaagat
1320ggctgacatg tacacccgcc tcatggcgtg tcggcagtat gtctacaatg
tcgccaaggc 1380ctgcgatgag ggccattgca ctgctaagga ctgtgcaggt
gtgattcttt actcagctga 1440gtgtgccaca caggtagccc tggacggcat
tcagtgtttt ggtggcaatg gctacatcaa 1500tgactttccc atgggccgct
ttcttcgaga tgccaagctg tatgagatag gggctgggac 1560cagcgaggtg
aggcggctgg tcatcggcag agccttcaat gcagactttc actagtcctg
1620agacccttcg cccccttttc ctgcacctag tggcctttct tgggaagtag
agatgtggcg 1680gctttcccac cctgcccaca gcaggccctc ctgcccagct
gctcttgtca gccctctggc 1740ctctggatga ggttgagttc tccacaacag
ctcccaagca tcatgggcct cgcagccggg 1800cctgtgccac ggctagtgtt
gtgtgattta aaatggactc agcaggaagc atattgtctg 1860gggattgttg
ggacaggttt tggtgactct gtgcccttgc tctctaactt ctgagcccac
1920ctcccagggt aggcacctgg gggcatgcag gtgcccacct cccagggtag
gcacctgggg 1980gcatgcaggt acccacctct ttctcttggg tgaggctctg
gcaaggagat ctctctgctc 2040aagcacagca gaatcatggc ccctctccat
gaattggaac ttggtacagg ttaagtatcc 2100ctaatcctga aatctgaaac
acttgtggtt ccaagcattt tggataaggc aaattcaact 2160ttcagtctct
tttctggggg aaaaaaataa taaacctagc ctagccaggc gtggtggctc
2220atgcttgtaa tcccagcact tcaggaggct gagatgggtg gatcacctga
ggtcaggagt 2280tcaagaccag cctggccaac atgtggaaac ctcgcctcaa
ctaaaaatag aaaaaaatta 2340gttgggcatg gtggtgggca cctgtaatcc
cagctacttc aggaggctga ggcaggagaa 2400ttacttgaac ccaggaggcg
gacgttgcag tgagccgagc ttgtgccatt gcactccagc 2460ctgggcgaca
agagcaaaac tcttcaaaaa acaaaacaaa acaaaaaaac cctggccctt
2520gtttcttcca gtttctagag gtatcagctc ctagcagctt atgaacacat
atgcttgctt 2580ggccaggcaa ggtggtgtgt gcctgtaatc ccagcacttt
gggaggccaa ggcaggtgga 2640tcacttgcag tcaggagttc aagaccagcc
tgtccaacgt ggtgaaaccc catctctact 2700aaaaatacaa aaattagcca
ggggtggtgg tgcacgtctg taatcccagc tactcaggag 2760gctgaggcag
gagaatcact tgaacccggg aggtggaggt tgcaatgagc caatatgaca
2820ccgctgcagt ccagcctggg ccatagagtg agactctgtc tcaaaaaagg
aaagaaaaat 2880aggctgggca cagtgactca tgcctgtaat cccaacactt
tgggaggccg aggcaggtgg 2940atcacgaggt caggagttca agaccagcct
ggccaagatg gtaaaacctc gtctctacta 3000aaaatacaaa aattagccag
gtgtggtggc aggctcctgt aatcccagct actcaggagg 3060ctgaggcaga
gaattgcttg aacccgggag gcagagtttg cagtgagcca agatcacacc
3120actgcactcc agcttggacg acagagcgag actctgtctc aaaaaataat
aggccaggca 3180tggtggctca acgtctgtaa tcccagcact ttgggaggcc
gaggcgggca gatcacaagg 3240tcaggagttc gagaccagcc tgacgaccaa
catggtgaaa cctcgtctct actaaaaata 3300caaaaattag ccaggcctgg
tggcacgcgc ctgtaatccc agttacacag aagactgagg 3360caggagaatc
gcttgaacgc aggaggcaga ggttgcagga gctgagatcg cgccattgca
3420ctccagcctg ggcaacagag tgagactctg tctcaaaaaa taataataaa
ataaatgaac 3480acacatgctg ctgagtccgc agggggggca gagcagagga
cagcgtgctt ttgtgtactg 3540ttggaagact ggctcctcct gtacagcacc
tctgagccct tgtgcaccgc cctgccacgg 3600gcaccatcca gtcctggccg
tgtgaccacc cacagctgac tgggcagcag gcacaggccc 3660tacccgagca
ggccggagtt ggctcgcatg actccagctg aggctgcctg tgtacatttc
3720tccagatacc ctatggctaa ttttgttata actgcacagt ggctgctgcc
attttgtatt 3780aaatatattg tgaaacaaac ctatctgggg agaagcaatc
tacttgccgc tgcttcctgt 3840ctggatccag cttgtgtcct tggagagtgg
ctggcccagg tcctattcct gtcctccagc 3900ccgttctttc atgagggaca
ggaaggtaaa atcagccctt aggagagagg tctcagcctc 3960cctttcccag
atctcccagt gagttttaaa ggaagcaggg agcccagagt gctaagttct
4020tacagccaga aggaagctta tagatttctg aaaaccgccc ctttgttttt
aaaaagatca 4080acacaatttg actttctcaa ggtcaaaacg aactagaatc
cagatctgct catggcaaaa 4140atgggggtgt tctgagaatt ccagctttgg
gccgcactgt acagcagtct ggatagagtg 4200tgatctgaga agggaatggg
tctgggttgt tccacccctt ccgagttcca aaaagaggga 4260actggttttc
ttggttctca gcccagcagc acctatcctg gctcttggtc ctggcctgca
4320gccaagtgct gttcctagcc tgaggcttga gacaggtggg gttggctcct
caccaacccc 4380agttccgtcc catcctgagg gcaagatcct gggctcatag
gcagtccctt tcacttcctt 4440gtcttgctcc ctgctatgtt ggagatgaat
gtgactaaaa gggccatctt gctggcttaa 4500tgtgtggctg gagagaccag
cctggagaca atgtggcaaa atggggcgct tcatccagtc 4560tgtctaagcc
ctgtcgactt ggggaggtga tttctttcct ggttctatat gtgaagcaaa
4620ataaatgttt taaaattaaa agcaaaaaaa acaaaatgaa ccatgaaaaa aaa
4673
51 426 PRT Homo sapiens
51 Met Ala Glu Met Ala Thr Ala Thr Arg Leu
Leu Gly Trp Arg Val Ala1 5 10 15Ser Trp Arg Leu Arg Pro Pro Leu Ala
Gly Phe Val Ser Gln Arg Ala 20 25 30His Ser Leu Leu Pro Val Asp Asp
Ala Ile Asn Gly Leu Ser Glu Glu 35 40 45Gln Arg Gln Leu Arg Gln Thr
Met Ala Lys Phe Leu Gln Glu His Leu 50 55 60Ala Pro Lys Ala Gln Glu
Ile Asp Arg Ser Asn Glu Phe Lys Asn Leu65 70 75 80Arg Glu Phe Trp
Lys Gln Leu Gly Asn Leu Gly Val Leu Gly Ile Thr 85 90 95Ala Pro Val
Gln Tyr Gly Gly Ser Gly Leu Gly Tyr Leu Glu His Val 100 105 110Leu
Val Met Glu Glu Ile Ser Arg Ala Ser Gly Ala Val Gly Leu Ser
115 120 125Tyr Gly Ala His Ser Asn Leu Cys Ile Asn Gln Leu Val Arg
Asn Gly 130 135 140Asn Glu Ala Gln Lys Glu Lys Tyr Leu Pro Lys Leu
Ile Ser Gly Glu145 150 155 160Tyr Ile Gly Ala Leu Ala Met Ser Glu
Pro Asn Ala Gly Ser Asp Val 165 170 175Val Ser Met Lys Leu Lys Ala
Glu Lys Lys Gly Asn His Tyr Ile Leu 180 185 190Asn Gly Asn Lys Phe
Trp Ile Thr Asn Gly Pro Asp Ala Asp Val Leu 195 200 205Ile Val Tyr
Ala Lys Thr Asp Leu Ala Ala Val Pro Ala Ser Arg Gly 210 215 220Ile
Thr Ala Phe Ile Val Glu Lys Gly Met Pro Gly Phe Ser Thr Ser225 230
235 240Lys Lys Leu Asp Lys Leu Gly Met Arg Gly Ser Asn Thr Cys Glu
Leu 245 250 255Ile Phe Glu Asp Cys Lys Ile Pro Ala Ala Asn Ile Leu
Gly His Glu 260 265 270Asn Lys Gly Val Tyr Val Leu Met Ser Gly Leu
Asp Leu Glu Arg Leu 275 280 285Val Leu Ala Gly Gly Pro Leu Gly Leu
Met Gln Ala Val Leu Asp His 290 295 300Thr Ile Pro Tyr Leu His Val
Arg Glu Ala Phe Gly Gln Lys Ile Gly305 310 315 320His Phe Gln Leu
Met Gln Gly Lys Met Ala Asp Met Tyr Thr Arg Leu 325 330 335Met Ala
Cys Arg Gln Tyr Val Tyr Asn Val Ala Lys Ala Cys Asp Glu 340 345
350Gly His Cys Thr Ala Lys Asp Cys Ala Gly Val Ile Leu Tyr Ser Ala
355 360 365Glu Cys Ala Thr Gln Val Ala Leu Asp Gly Ile Gln Cys Phe
Gly Gly 370 375 380Asn Gly Tyr Ile Asn Asp Phe Pro Met Gly Arg Phe
Leu Arg Asp Ala385 390 395 400Lys Leu Tyr Glu Ile Gly Ala Gly Thr
Ser Glu Val Arg Arg Leu Val 405 410 415Ile Gly Arg Ala Phe Asn Ala
Asp Phe His 420 425
52 268 PRT Homo sapiens
52 Met Ser Gly Lys His Tyr
Lys Gly Pro Glu Val Ser Cys Cys Ile Lys1 5 10 15Tyr Phe Ile Phe Gly
Phe Asn Val Ile Phe Trp Phe Leu Gly Ile Thr 20 25 30Phe Leu Gly Ile
Gly Leu Trp Ala Trp Asn Glu Lys Gly Val Leu Ser 35 40 45Asn Ile Ser
Ser Ile Thr Asp Leu Gly Gly Phe Asp Pro Val Trp Leu 50 55 60Phe Leu
Val Val Gly Gly Val Met Phe Ile Leu Gly Phe Ala Gly Cys65 70 75
80Ile Gly Ala Leu Arg Glu Asn Thr Phe Leu Leu Lys Phe Phe Ser Val
85 90 95Phe Leu Gly Ile Ile Phe Phe Leu Glu Leu Thr Ala Gly Val Leu
Ala 100 105 110Phe Val Phe Lys Asp Trp Ile Lys Asp Gln Leu Tyr Phe
Phe Ile Asn 115 120 125Asn Asn Ile Arg Ala Tyr Arg Asp Asp Ile Asp
Leu Gln Asn Leu Ile 130 135 140Asp Phe Thr Gln Glu Tyr Trp Gln Cys
Cys Gly Ala Phe Gly Ala Asp145 150 155 160Asp Trp Asn Leu Asn Ile
Tyr Phe Asn Cys Thr Asp Ser Asn Ala Ser 165 170 175Arg Glu Arg Cys
Gly Val Pro Phe Ser Cys Cys Thr Lys Asp Pro Ala 180 185 190Glu Asp
Val Ile Asn Thr Gln Cys Gly Tyr Asp Ala Arg Gln Lys Pro 195 200
205Glu Val Asp Gln Gln Ile Val Ile Tyr Thr Lys Gly Cys Val Pro Gln
210 215 220Phe Glu Lys Trp Leu Gln Asp Asn Leu Thr Ile Val Ala Gly
Ile Phe225 230 235 240Ile Gly Ile Ala Leu Leu Gln Ile Phe Gly Ile
Cys Leu Ala Gln Asn 245 250 255Leu Val Ser Asp Ile Glu Ala Val Arg
Ala Ser Trp 260 265
53 3583 DNA Homo sapiens
53 ctgggcccca gcgaggcggt
ggggcggggc ggggcggggc ggggcgcgca gcaggagcga 60gtggggccgc ccgccgggcc
gcggacactg tcgcccggcg cccaggttcc caacaaggct 120acgcagaaga
acccccttga ctgaagcaat ggaggggggt ccagctgtct gctgccagga
180tcctcgggca gagctggtag aacgggtggc agccatcgat gtgactcact
tggaggaggc 240agatggtggc ccagagccta ctagaaacgg tgtggacccc
ccaccacggg ccagagctgc 300ctctgtgatc cctggcagta cttcaagact
gctcccagcc cggcctagcc tctcagccag 360gaagctttcc ctacaggagc
ggccagcagg aagctatctg gaggcgcagg ctgggcctta 420tgccacgggg
cctgccagcc acatctcccc ccgggcctgg cggaggccca ccatcgagtc
480ccaccacgtg gccatctcag atgcagagga ctgcgtgcag ctgaaccagt
acaagctgca 540gagtgagatt ggcaagggtg cctacggtgt ggtgaggctg
gcctacaacg aaagtgaaga 600cagacactat gcaatgaaag tcctttccaa
aaagaagtta ctgaagcagt atggctttcc 660acgtcgccct cccccgagag
ggtcccaggc tgcccaggga ggaccagcca agcagctgct 720gcccctggag
cgggtgtacc aggagattgc catcctgaag aagctggacc acgtgaatgt
780ggtcaaactg atcgaggtcc tggatgaccc agctgaggac aacctctatt
tggtgtttga 840cctcctgaga aaggggcccg tcatggaagt gccctgtgac
aagcccttct cggaggagca 900agctcgcctc tacctgcggg acgtcatcct
gggcctcgag tacttgcact gccagaagat 960cgtccacagg gacatcaagc
catccaacct gctcctgggg gatgatgggc acgtgaagat 1020cgccgacttt
ggcgtcagca accagtttga ggggaacgac gctcagctgt ccagcacggc
1080gggaacccca gcattcatgg cccccgaggc catttctgat tccggccaga
gcttcagtgg 1140gaaggccttg gatgtatggg ccactggcgt cacgttgtac
tgctttgtct atgggaagtg 1200cccattcatc gacgatttca tcctggccct
ccacaggaag atcaagaatg agcccgtggt 1260gtttcctgag gagccagaaa
tcagcgagga gctcaaggac ctgatcctga agatgttaga 1320caagaatccc
gagacgagaa ttggggtgcc agacatcaag ttgcaccctt gggtgaccaa
1380gaacggggag gagccccttc cttcggagga ggagcactgc agcgtggtgg
aggtgacaga 1440ggaggaggtt aagaactcag tcaggctcat ccccagctgg
accacggtga tcctggtgaa 1500gtccatgctg aggaagcgtt cctttgggaa
cccgtttgag ccccaagcac ggagggaaga 1560gcgatccatg tctgctccag
gaaacctact ggtgaaagaa gggtttggtg aagggggcaa 1620gagcccagag
ctccccggcg tccaggaaga cgaggctgca tcctgagccc ctgcatgcac
1680ccagggccac ccggcagcac actcatcccg cgcctccaga ggcccacccc
tcatgcaaca 1740gccgcccccg caggcagggg gctggggact gcagccccac
tcccgcccct cccccatcgt 1800gctgcatgac ctccacgcac gcacgtccag
ggacagactg gaatgtatgt catttggggt 1860cttgggggca gggctcccac
gaggccatcc tcctcttctt ggacctcctt ggcctgaccc 1920attctgtggg
gaaaccgggt gcccatggag cctcagaaat gccacccggc tggttggcat
1980ggcctggggc aggaggcaga ggcaggagac caagatggca ggtggaggcc
aggcttacca 2040caacggaaga gacctcccgc tggggccggg caggcctggc
tcagctgcca caggcatatg 2100gtggagaggg gggtaccctg cccaccttgg
ggtggtggca ccagagctct tgtctattca 2160gacgctggta tgggggctcg
gacccctcac tggggacagg gccagtgttg gagaattctg 2220attccttttt
tgttgtcttt tacttttgtt tttaacctgg gggttcgggg agaggccctg
2280cttgggaaca tctcacgagc tttcctacat cttccgtggt tcccagcaca
gcccaagatt 2340atttggcagc caagtggatg gaactaactt tcctggactg
tgtttcgcat tcggcgttat 2400ctggaaagtg gactgaacgg aatcaagctc
tgagcagagg cctgaagcgg aagcaccaca 2460tcgtccctgc ccatctcact
ctctcccttg atgatgcccc tagagctgag gctggagaag 2520acaccagggc
tgactttgac cgagggccat ggacgcgaca ggcctgtggc cctgcgcatg
2580ctgaaataac tggaacccag cctctcctcc tacaccggcc tacccatctg
ggcccaagag 2640ctgcactcac actcctacaa cgaaggacaa actgtccagg
tcggagggat cacgagacac 2700agaacctgga ggggtgtgca cgctggcagg
tggcctctgc ggcaattgcc tcaccctgag 2760gacatcagca gtcagcctgc
tcagagcggg ggtgctggag cgcgtgcaga cacagctctt 2820ccggagcagc
cttcaccttc tctctgggat cagtgtccgg ctggccgacg tggcatttgc
2880tgaccgaatg ctcatagagg ttgaccccca cagggtcacg caggactcgg
acactgccct 2940ggaaacatgg atggacaagg gcttttggcc acaggtgtgg
gtgtcctgtt ggaggagggc 3000ttgtttggag aagggaggct ggctggggga
gaaacccgga tcccgctgca tctccgcgcc 3060tgtgggtgca tgtcgcgtgc
tcatctgttg cacacagctc actcgtatgt cctgcactgg 3120tacatgcatc
tgtaatacag tttctacgtc tatttaaggc taggagccga atgtgcccca
3180ttgtcagtgg gtccacgttt ctccccggct cctctgggct aaggcagtgt
ggcccgaagc 3240ttaaaaagtt actcggtact gtttttaaga acacttttat
agagttagtg gaaggcaagt 3300taagagccaa tcactgatcc ccaagtgttt
cttgagcatc tggtctgggg ggaccacttt 3360gatcggaccc acccttggaa
agctcagggg taggcccagg tgggatgctc accctgtcac 3420tgagggtttt
ggttggcatc gttgtttttg aatgtagcac aagcgatgag caaactctat
3480aagagtgttt taaaaattaa cttcccagga agtgagttaa aaacaataaa
agccctttct 3540tgagttaaaa agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa
3583
54 505 PRT Homo sapiens
54 Met Glu Gly Gly Pro Ala Val Cys Cys Gln
Asp Pro Arg Ala Glu Leu1 5 10 15Val Glu Arg Val Ala Ala Ile Asp Val
Thr His Leu Glu Glu Ala Asp 20 25 30Gly Gly Pro Glu Pro Thr Arg Asn
Gly Val Asp Pro Pro Pro Arg Ala 35 40 45Arg Ala Ala Ser Val Ile Pro
Gly Ser Thr Ser Arg Leu Leu Pro Ala 50 55 60Arg Pro Ser Leu Ser Ala
Arg Lys Leu Ser Leu Gln Glu Arg Pro Ala65 70 75 80Gly Ser Tyr Leu
Glu Ala Gln Ala Gly Pro Tyr Ala Thr Gly Pro Ala 85 90 95Ser His Ile
Ser Pro Arg Ala Trp Arg Arg Pro Thr Ile Glu Ser His 100 105 110His
Val Ala Ile Ser Asp Ala Glu Asp Cys Val Gln Leu Asn Gln Tyr 115 120
125Lys Leu Gln Ser Glu Ile Gly Lys Gly Ala Tyr Gly Val Val Arg Leu
130 135 140Ala Tyr Asn Glu Ser Glu Asp Arg His Tyr Ala Met Lys Val
Leu Ser145 150 155 160Lys Lys Lys Leu Leu Lys Gln Tyr Gly Phe Pro
Arg Arg Pro Pro Pro 165 170 175Arg Gly Ser Gln Ala Ala Gln Gly Gly
Pro Ala Lys Gln Leu Leu Pro 180 185 190Leu Glu Arg Val Tyr Gln Glu
Ile Ala Ile Leu Lys Lys Leu Asp His 195 200 205Val Asn Val Val Lys
Leu Ile Glu Val Leu Asp Asp Pro Ala Glu Asp 210 215 220Asn Leu Tyr
Leu Val Phe Asp Leu Leu Arg Lys Gly Pro Val Met Glu225 230 235
240Val Pro Cys Asp Lys Pro Phe Ser Glu Glu Gln Ala Arg Leu Tyr Leu
245 250 255Arg Asp Val Ile Leu Gly Leu Glu Tyr Leu His Cys Gln Lys
Ile Val 260 265 270His Arg Asp Ile Lys Pro Ser Asn Leu Leu Leu Gly
Asp Asp Gly His 275 280 285Val Lys Ile Ala Asp Phe Gly Val Ser Asn
Gln Phe Glu Gly Asn Asp 290 295 300Ala Gln Leu Ser Ser Thr Ala Gly
Thr Pro Ala Phe Met Ala Pro Glu305 310 315 320Ala Ile Ser Asp Ser
Gly Gln Ser Phe Ser Gly Lys Ala Leu Asp Val 325 330 335Trp Ala Thr
Gly Val Thr Leu Tyr Cys Phe Val Tyr Gly Lys Cys Pro 340 345 350Phe
Ile Asp Asp Phe Ile Leu Ala Leu His Arg Lys Ile Lys Asn Glu 355 360
365Pro Val Val Phe Pro Glu Glu Pro Glu Ile Ser Glu Glu Leu Lys Asp
370 375 380Leu Ile Leu Lys Met Leu Asp Lys Asn Pro Glu Thr Arg Ile
Gly Val385 390 395 400Pro Asp Ile Lys Leu His Pro Trp Val Thr Lys
Asn Gly Glu Glu Pro 405 410 415Leu Pro Ser Glu Glu Glu His Cys Ser
Val Val Glu Val Thr Glu Glu 420 425 430Glu Val Lys Asn Ser Val Arg
Leu Ile Pro Ser Trp Thr Thr Val Ile 435 440 445Leu Val Lys Ser Met
Leu Arg Lys Arg Ser Phe Gly Asn Pro Phe Glu 450 455 460Pro Gln Ala
Arg Arg Glu Glu Arg Ser Met Ser Ala Pro Gly Asn Leu465 470 475
480Leu Val Lys Glu Gly Phe Gly Glu Gly Gly Lys Ser Pro Glu Leu Pro
Gly Val Gln Glu Asp Glu Ala Ala Ser 500 505
55 3529 DNA Homo sapiens
55 agcagaacag agtatgcaat ttgggaagct gtggtgtggc tgcagtggag
agttcccaac 60aaggctacgc agaagaaccc ccttgactga agcaatggag gggggtccag
ctgtctgctg 120ccaggatcct cgggcagagc tggtagaacg ggtggcagcc
atcgatgtga ctcacttgga 180ggaggcagat ggtggcccag agcctactag
aaacggtgtg gaccccccac cacgggccag 240agctgcctct gtgatccctg
gcagtacttc aagactgctc ccagcccggc ctagcctctc 300agccaggaag
ctttccctac aggagcggcc agcaggaagc tatctggagg cgcaggctgg
360gccttatgcc acggggcctg ccagccacat ctccccccgg gcctggcgga
ggcccaccat 420cgagtcccac cacgtggcca tctcagatgc agaggactgc
gtgcagctga accagtacaa 480gctgcagagt gagattggca agggtgccta
cggtgtggtg aggctggcct acaacgaaag 540tgaagacaga cactatgcaa
tgaaagtcct ttccaaaaag aagttactga agcagtatgg 600ctttccacgt
cgccctcccc cgagagggtc ccaggctgcc cagggaggac cagccaagca
660gctgctgccc ctggagcggg tgtaccagga gattgccatc ctgaagaagc
tggaccacgt 720gaatgtggtc aaactgatcg aggtcctgga tgacccagct
gaggacaacc tctatttggt 780gtttgacctc ctgagaaagg ggcccgtcat
ggaagtgccc tgtgacaagc ccttctcgga 840ggagcaagct cgcctctacc
tgcgggacgt catcctgggc ctcgagtact tgcactgcca 900gaagatcgtc
cacagggaca tcaagccatc caacctgctc ctgggggatg atgggcacgt
960gaagatcgcc gactttggcg tcagcaacca gtttgagggg aacgacgctc
agctgtccag 1020cacggcggga accccagcat tcatggcccc cgaggccatt
tctgattccg gccagagctt 1080cagtgggaag gccttggatg tatgggccac
tggcgtcacg ttgtactgct ttgtctatgg 1140gaagtgccca ttcatcgacg
atttcatcct ggccctccac aggaagatca agaatgagcc 1200cgtggtgttt
cctgaggagc cagaaatcag cgaggagctc aaggacctga tcctgaagat
1260gttagacaag aatcccgaga cgagaattgg ggtgccagac atcaagttgc
acccttgggt 1320gaccaagaac ggggaggagc cccttccttc ggaggaggag
cactgcagcg tggtggaggt 1380gacagaggag gaggttaaga actcagtcag
gctcatcccc agctggacca cggtgatcct 1440ggtgaagtcc atgctgagga
agcgttcctt tgggaacccg tttgagcccc aagcacggag 1500ggaagagcga
tccatgtctg ctccaggaaa cctactggtg aaagaagggt ttggtgaagg
1560gggcaagagc ccagagctcc ccggcgtcca ggaagacgag gctgcatcct
gagcccctgc 1620atgcacccag ggccacccgg cagcacactc atcccgcgcc
tccagaggcc cacccctcat 1680gcaacagccg cccccgcagg cagggggctg
gggactgcag ccccactccc gcccctcccc 1740catcgtgctg catgacctcc
acgcacgcac gtccagggac agactggaat gtatgtcatt 1800tggggtcttg
ggggcagggc tcccacgagg ccatcctcct cttcttggac ctccttggcc
1860tgacccattc tgtggggaaa ccgggtgccc atggagcctc agaaatgcca
cccggctggt 1920tggcatggcc tggggcagga ggcagaggca ggagaccaag
atggcaggtg gaggccaggc 1980ttaccacaac ggaagagacc tcccgctggg
gccgggcagg cctggctcag ctgccacagg 2040catatggtgg agaggggggt
accctgccca ccttggggtg gtggcaccag agctcttgtc 2100tattcagacg
ctggtatggg ggctcggacc cctcactggg gacagggcca gtgttggaga
2160attctgattc cttttttgtt gtcttttact tttgttttta acctgggggt
tcggggagag 2220gccctgcttg ggaacatctc acgagctttc ctacatcttc
cgtggttccc agcacagccc 2280aagattattt ggcagccaag tggatggaac
taactttcct ggactgtgtt tcgcattcgg 2340cgttatctgg aaagtggact
gaacggaatc aagctctgag cagaggcctg aagcggaagc 2400accacatcgt
ccctgcccat ctcactctct cccttgatga tgcccctaga gctgaggctg
2460gagaagacac cagggctgac tttgaccgag ggccatggac gcgacaggcc
tgtggccctg 2520cgcatgctga aataactgga acccagcctc tcctcctaca
ccggcctacc catctgggcc 2580caagagctgc actcacactc ctacaacgaa
ggacaaactg tccaggtcgg agggatcacg 2640agacacagaa cctggagggg
tgtgcacgct ggcaggtggc ctctgcggca attgcctcac 2700cctgaggaca
tcagcagtca gcctgctcag agcgggggtg ctggagcgcg tgcagacaca
2760gctcttccgg agcagccttc accttctctc tgggatcagt gtccggctgg
ccgacgtggc 2820atttgctgac cgaatgctca tagaggttga cccccacagg
gtcacgcagg actcggacac 2880tgccctggaa acatggatgg acaagggctt
ttggccacag gtgtgggtgt cctgttggag 2940gagggcttgt ttggagaagg
gaggctggct gggggagaaa cccggatccc gctgcatctc 3000cgcgcctgtg
ggtgcatgtc gcgtgctcat ctgttgcaca cagctcactc gtatgtcctg
3060cactggtaca tgcatctgta atacagtttc tacgtctatt taaggctagg
agccgaatgt 3120gccccattgt cagtgggtcc acgtttctcc ccggctcctc
tgggctaagg cagtgtggcc 3180cgaagcttaa aaagttactc ggtactgttt
ttaagaacac ttttatagag ttagtggaag 3240gcaagttaag agccaatcac
tgatccccaa gtgtttcttg agcatctggt ctggggggac 3300cactttgatc
ggacccaccc ttggaaagct caggggtagg cccaggtggg atgctcaccc
3360tgtcactgag ggttttggtt ggcatcgttg tttttgaatg tagcacaagc
gatgagcaaa 3420ctctataaga gtgttttaaa aattaacttc ccaggaagtg
agttaaaaac aataaaagcc 3480ctttcttgag ttaaaaagaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaa 3529
<210> 56 <211> 532 <212> PRT <213> Homo sapiens <400> 56
Met Gln Phe Gly Lys
Leu Trp Cys Gly Cys Ser Gly Glu Phe Pro Thr1 5 10 15Arg Leu Arg Arg
Arg Thr Pro Leu Thr Glu Ala Met Glu Gly Gly Pro 20 25 30Ala Val Cys
Cys Gln Asp Pro Arg Ala Glu Leu Val Glu Arg Val Ala 35 40 45Ala Ile
Asp Val Thr His Leu Glu Glu Ala Asp Gly Gly Pro Glu Pro 50 55 60Thr
Arg Asn Gly Val Asp Pro Pro Pro Arg Ala Arg Ala Ala Ser Val65 70 75
80Ile Pro Gly Ser Thr Ser Arg Leu Leu Pro Ala Arg Pro Ser Leu Ser
85 90 95Ala Arg Lys Leu Ser Leu Gln Glu Arg Pro Ala Gly Ser Tyr Leu
Glu 100 105 110Ala Gln Ala Gly Pro Tyr Ala Thr Gly Pro Ala Ser His
Ile Ser Pro 115 120 125Arg Ala Trp Arg Arg Pro Thr Ile Glu Ser His
His Val Ala Ile Ser 130 135 140Asp Ala Glu Asp Cys Val Gln Leu Asn
Gln Tyr Lys Leu Gln Ser Glu145 150 155 160Ile Gly Lys Gly Ala Tyr
Gly Val Val Arg Leu Ala Tyr Asn Glu Ser 165 170 175Glu Asp Arg His
Tyr Ala Met Lys Val Leu Ser Lys Lys Lys Leu Leu 180 185 190Lys Gln
Tyr Gly Phe Pro Arg Arg Pro Pro Pro Arg Gly Ser Gln Ala 195
200 205Ala Gln Gly Gly Pro Ala Lys Gln Leu Leu Pro Leu Glu Arg Val
Tyr 210 215 220Gln Glu Ile Ala Ile Leu Lys Lys Leu Asp His Val Asn
Val Val Lys225 230 235 240Leu Ile Glu Val Leu Asp Asp Pro Ala Glu
Asp Asn Leu Tyr Leu Val 245 250 255Phe Asp Leu Leu Arg Lys Gly Pro
Val Met Glu Val Pro Cys Asp Lys 260 265 270Pro Phe Ser Glu Glu Gln
Ala Arg Leu Tyr Leu Arg Asp Val Ile Leu 275 280 285Gly Leu Glu Tyr
Leu His Cys Gln Lys Ile Val His Arg Asp Ile Lys 290 295 300Pro Ser
Asn Leu Leu Leu Gly Asp Asp Gly His Val Lys Ile Ala Asp305 310 315
320Phe Gly Val Ser Asn Gln Phe Glu Gly Asn Asp Ala Gln Leu Ser Ser
325 330 335Thr Ala Gly Thr Pro Ala Phe Met Ala Pro Glu Ala Ile Ser
Asp Ser 340 345 350Gly Gln Ser Phe Ser Gly Lys Ala Leu Asp Val Trp
Ala Thr Gly Val 355 360 365Thr Leu Tyr Cys Phe Val Tyr Gly Lys Cys
Pro Phe Ile Asp Asp Phe 370 375 380Ile Leu Ala Leu His Arg Lys Ile
Lys Asn Glu Pro Val Val Phe Pro385 390 395 400Glu Glu Pro Glu Ile
Ser Glu Glu Leu Lys Asp Leu Ile Leu Lys Met 405 410 415Leu Asp Lys
Asn Pro Glu Thr Arg Ile Gly Val Pro Asp Ile Lys Leu 420 425 430His
Pro Trp Val Thr Lys Asn Gly Glu Glu Pro Leu Pro Ser Glu Glu 435 440
445Glu His Cys Ser Val Val Glu Val Thr Glu Glu Glu Val Lys Asn Ser
450 455 460Val Arg Leu Ile Pro Ser Trp Thr Thr Val Ile Leu Val Lys
Ser Met465 470 475 480Leu Arg Lys Arg Ser Phe Gly Asn Pro Phe Glu
Pro Gln Ala Arg Arg 485 490 495Glu Glu Arg Ser Met Ser Ala Pro Gly
Asn Leu Leu Val Lys Glu Gly 500 505 510Phe Gly Glu Gly Gly Lys Ser
Pro Glu Leu Pro Gly Val Gln Glu Asp 515 520 525Glu Ala Ala Ser
530
<210> 57 <211> 2535 <212> DNA <213> Homo sapiens <400> 57
ctgggcccca gcgaggcggt ggggcggggc
ggggcggggc ggggcgcgca gcaggagcga 60gtggggccgc ccgccgggcc gcggacactg
tcgcccggcg cccaggttcc caacaaggct 120acgcagaaga acccccttga
ctgaagcaat ggaggggggt ccagctgtct gctgccagga 180tcctcgggca
gagctggtag aacgggtggc agccatcgat gtgactcact tggaggaggc
240agatggtggc ccagagccta ctagaaacgg tgtggacccc ccaccacggg
ccagagctgc 300ctctgtgatc cctggcagta cttcaagact gctcccagcc
cggcctagcc tctcagccag 360gaagctttcc ctacaggagc ggccagcagg
aagctatctg gaggcgcagg ctgggcctta 420tgccacgggg cctgccagcc
acatctcccc ccgggcctgg cggaggccca ccatcgagtc 480ccaccacgtg
gccatctcag atgcagagga ctgcgtgcag ctgaaccagt acaagctgca
540gagtgagatt ggcaagggtg cctacggtgt ggtgaggctg gcctacaacg
aaagtgaaga 600cagacactat gcaatgaaag tcctttccaa aaagaagtta
ctgaagcagt atggctttcc 660acgtcgccct cccccgagag ggtcccaggc
tgcccaggga ggaccagcca agcagctgct 720gcccctggag cgggtgtacc
aggagattgc catcctgaag aagctggacc acgtgaatgt 780ggtcaaactg
atcgaggtcc tggatgaccc agctgaggac aacctctatt tggccctgca
840gaaccaggcc cagaatatcc agttagattc aacaaatatc gccaagcccc
actccctgct 900tccctctgag cagcaagaca gtggatccac gtgggctgcg
cgctcagtgt ttgacctcct 960gagaaagggg cccgtcatgg aagtgccctg
tgacaagccc ttctcggagg agcaagctcg 1020cctctacctg cgggacgtca
tcctgggcct cgagtacttg cactgccaga agatcgtcca 1080cagggacatc
aagccatcca acctgctcct gggggatgat gggcacgtga agatcgccga
1140ctttggcgtc agcaaccagt ttgaggggaa cgacgctcag ctgtccagca
cggcgggaac 1200cccagcattc atggcccccg aggccatttc tgattccggc
cagagcttca gtgggaaggc 1260cttggatgta tgggccactg gcgtcacgtt
gtactgcttt gtctatggga agtgcccatt 1320catcgacgat ttcatcctgg
ccctccacag gaagatcaag aatgagcccg tggtgtttcc 1380tgaggagcca
gaaatcagcg aggagctcaa ggacctgatc ctgaagatgt tagacaagaa
1440tcccgagacg agaattgggg tgccagacat caagttgcac ccttgggtga
ccaagaacgg 1500ggaggagccc cttccttcgg aggaggagca ctgcagcgtg
gtggaggtga cagaggagga 1560ggttaagaac tcagtcaggc tcatccccag
ctggaccacg gtgatcctgg tgaagtccat 1620gctgaggaag cgttcctttg
ggaacccgtt tgagccccaa gcacggaggg aagagcgatc 1680catgtctgct
ccaggaaacc tactggtgta agtactggtg ggccagggac tgccgggcac
1740tccctggagt tgggtgggga ggtctgaggc ccatcctccc actctcactg
tcgttgggcc 1800aaggccagag cctggggact tggccaggtc tcggtgttgg
ccccatttgc atctctgtcc 1860ccaaggttag tcggggctag aagggacctt
ttgggcccag ctcttgcttc attcctgggg 1920ccagcatccc tcacacacac
acttccaggg atgaggagct cacgcagccc ctccatggga 1980caggaagacc
cttcttccat gcagcttgat gtcactctct cactgggtcc agcccctctg
2040gggcttcaaa tctgtggccc cctcagccct tggcagcctg gcagaggttt
gcagacaggc 2100tgatgttggc ttcctgtagg aggctggcgg gctgtagagg
aggggtgctg gcccctctgc 2160ctggccctgg ggactgttgg ctgctctccc
aagtggccca ggctgcctgc agccattgct 2220ggggctctgt gcccagtcag
cactttgtga gtgcttgttc agtgagtaag cagggacagg 2280ctggccggtg
gaccacggga gaggaacccg cattggccga gggctcccta tggtgagcca
2340cgcctgtggg ttcaccacct cctaggaggg tccagaaaag cagctcccca
agcctgtgcg 2400cctcgtcctc agcagatcca ccttcttcac tataataaaa
gccagtctgg gatgctaaaa 2460aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2520aaaaaaaaaa aaaaa
2535
<210> 58 <211> 520 <212> PRT <213> Homo sapiens <400> 58
Met Glu Gly Gly Pro Ala Val Cys Cys Gln
Asp Pro Arg Ala Glu Leu1 5 10 15Val Glu Arg Val Ala Ala Ile Asp Val
Thr His Leu Glu Glu Ala Asp 20 25 30Gly Gly Pro Glu Pro Thr Arg Asn
Gly Val Asp Pro Pro Pro Arg Ala 35 40 45Arg Ala Ala Ser Val Ile Pro
Gly Ser Thr Ser Arg Leu Leu Pro Ala 50 55 60Arg Pro Ser Leu Ser Ala
Arg Lys Leu Ser Leu Gln Glu Arg Pro Ala65 70 75 80Gly Ser Tyr Leu
Glu Ala Gln Ala Gly Pro Tyr Ala Thr Gly Pro Ala 85 90 95Ser His Ile
Ser Pro Arg Ala Trp Arg Arg Pro Thr Ile Glu Ser His 100 105 110His
Val Ala Ile Ser Asp Ala Glu Asp Cys Val Gln Leu Asn Gln Tyr 115 120
125Lys Leu Gln Ser Glu Ile Gly Lys Gly Ala Tyr Gly Val Val Arg Leu
130 135 140Ala Tyr Asn Glu Ser Glu Asp Arg His Tyr Ala Met Lys Val
Leu Ser145 150 155 160Lys Lys Lys Leu Leu Lys Gln Tyr Gly Phe Pro
Arg Arg Pro Pro Pro 165 170 175Arg Gly Ser Gln Ala Ala Gln Gly Gly
Pro Ala Lys Gln Leu Leu Pro 180 185 190Leu Glu Arg Val Tyr Gln Glu
Ile Ala Ile Leu Lys Lys Leu Asp His 195 200 205Val Asn Val Val Lys
Leu Ile Glu Val Leu Asp Asp Pro Ala Glu Asp 210 215 220Asn Leu Tyr
Leu Ala Leu Gln Asn Gln Ala Gln Asn Ile Gln Leu Asp225 230 235
240Ser Thr Asn Ile Ala Lys Pro His Ser Leu Leu Pro Ser Glu Gln Gln
245 250 255Asp Ser Gly Ser Thr Trp Ala Ala Arg Ser Val Phe Asp Leu
Leu Arg 260 265 270Lys Gly Pro Val Met Glu Val Pro Cys Asp Lys Pro
Phe Ser Glu Glu 275 280 285Gln Ala Arg Leu Tyr Leu Arg Asp Val Ile
Leu Gly Leu Glu Tyr Leu 290 295 300His Cys Gln Lys Ile Val His Arg
Asp Ile Lys Pro Ser Asn Leu Leu305 310 315 320Leu Gly Asp Asp Gly
His Val Lys Ile Ala Asp Phe Gly Val Ser Asn 325 330 335Gln Phe Glu
Gly Asn Asp Ala Gln Leu Ser Ser Thr Ala Gly Thr Pro 340 345 350Ala
Phe Met Ala Pro Glu Ala Ile Ser Asp Ser Gly Gln Ser Phe Ser 355 360
365Gly Lys Ala Leu Asp Val Trp Ala Thr Gly Val Thr Leu Tyr Cys Phe
370 375 380Val Tyr Gly Lys Cys Pro Phe Ile Asp Asp Phe Ile Leu Ala
Leu His385 390 395 400Arg Lys Ile Lys Asn Glu Pro Val Val Phe Pro
Glu Glu Pro Glu Ile 405 410 415Ser Glu Glu Leu Lys Asp Leu Ile Leu
Lys Met Leu Asp Lys Asn Pro 420 425 430Glu Thr Arg Ile Gly Val Pro
Asp Ile Lys Leu His Pro Trp Val Thr 435 440 445Lys Asn Gly Glu Glu
Pro Leu Pro Ser Glu Glu Glu His Cys Ser Val 450 455 460Val Glu Val
Thr Glu Glu Glu Val Lys Asn Ser Val Arg Leu Ile Pro465 470 475
480Ser Trp Thr Thr Val Ile Leu Val Lys Ser Met Leu Arg Lys Arg Ser
485 490 495Phe Gly Asn Pro Phe Glu Pro Gln Ala Arg Arg Glu Glu Arg
Ser Met 500 505 510Ser Ala Pro Gly Asn Leu Leu Val 515
520
<210> 59 <211> 1153 <212> DNA <213> Homo sapiens <400> 59
gaaaacacca aatcaaccat aggtccaaga
acaattgtct ctggacggca gctatgcgac 60tcaccgtgct gtgtgctgtg tgcctgctgc
ctggcagcct ggccctgccg ctgcctcagg 120aggcgggagg catgagtgag
ctacagtggg aacaggctca ggactatctc aagagatttt 180atctctatga
ctcagaaaca aaaaatgcca acagtttaga agccaaactc aaggagatgc
240aaaaattctt tggcctacct ataactggaa tgttaaactc ccgcgtcata
gaaataatgc 300agaagcccag atgtggagtg ccagatgttg cagaatactc
actatttcca aatagcccaa 360aatggacttc caaagtggtc acctacagga
tcgtatcata tactcgagac ttaccgcata 420ttacagtgga tcgattagtg
tcaaaggctt taaacatgtg gggcaaagag atccccctgc 480atttcaggaa
agttgtatgg ggaactgctg acatcatgat tggctttgcg cgaggagctc
540atggggactc ctacccattt gatgggccag gaaacacgct ggctcatgcc
tttgcgcctg 600ggacaggtct cggaggagat gctcacttcg atgaggatga
acgctggacg gatggtagca 660gtctagggat taacttcctg tatgctgcaa
ctcatgaact tggccattct ttgggtatgg 720gacattcctc tgatcctaat
gcagtgatgt atccaaccta tggaaatgga gatccccaaa 780attttaaact
ttcccaggat gatattaaag gcattcagaa actatatgga aagagaagta
840attcaagaaa gaaatagaaa cttcaggcag aacatccatt cattcattca
ttggattgta 900tatcattgtt gcacaatcag aattgataag cactgttcct
ccactccatt tagcaattat 960gtcacccttt tttattgcag ttggtttttg
aatgtctttc actcctttta aggataaact 1020cctttatggt gtgactgtgt
cttattcatc tatacttgca gtgggtagat gtcaataaat 1080gttacataca
caaataaata aaatgtttat tccatggtaa atttaaaaaa aaaaaaaaaa
1140aaaaaaaaaa aaa 1153
<210> 60 <211> 267 <212> PRT <213> Homo sapiens <400> 60
Met Arg Leu Thr Val
Leu Cys Ala Val Cys Leu Leu Pro Gly Ser Leu1 5 10 15Ala Leu Pro Leu
Pro Gln Glu Ala Gly Gly Met Ser Glu Leu Gln Trp 20 25 30Glu Gln Ala
Gln Asp Tyr Leu Lys Arg Phe Tyr Leu Tyr Asp Ser Glu 35 40 45Thr Lys
Asn Ala Asn Ser Leu Glu Ala Lys Leu Lys Glu Met Gln Lys 50 55 60Phe
Phe Gly Leu Pro Ile Thr Gly Met Leu Asn Ser Arg Val Ile Glu65 70 75
80Ile Met Gln Lys Pro Arg Cys Gly Val Pro Asp Val Ala Glu Tyr Ser
85 90 95Leu Phe Pro Asn Ser Pro Lys Trp Thr Ser Lys Val Val Thr Tyr
Arg 100 105 110Ile Val Ser Tyr Thr Arg Asp Leu Pro His Ile Thr Val
Asp Arg Leu 115 120 125Val Ser Lys Ala Leu Asn Met Trp Gly Lys Glu
Ile Pro Leu His Phe 130 135 140Arg Lys Val Val Trp Gly Thr Ala Asp
Ile Met Ile Gly Phe Ala Arg145 150 155 160Gly Ala His Gly Asp Ser
Tyr Pro Phe Asp Gly Pro Gly Asn Thr Leu 165 170 175Ala His Ala Phe
Ala Pro Gly Thr Gly Leu Gly Gly Asp Ala His Phe 180 185 190Asp Glu
Asp Glu Arg Trp Thr Asp Gly Ser Ser Leu Gly Ile Asn Phe 195 200
205Leu Tyr Ala Ala Thr His Glu Leu Gly His Ser Leu Gly Met Gly His
210 215 220Ser Ser Asp Pro Asn Ala Val Met Tyr Pro Thr Tyr Gly Asn
Gly Asp225 230 235 240Pro Gln Asn Phe Lys Leu Ser Gln Asp Asp Ile
Lys Gly Ile Gln Lys 245 250 255Leu Tyr Gly Lys Arg Ser Asn Ser Arg
Lys Lys 260 265
<210> 61 <211> 451 <212> DNA <213> Homo sapiens <400> 61
gggttgcgga gggtgggcct
gggaggggtg gtggccattt tttgtctaac cctaactgag 60aagggcgtag gcgccgtgct
tttgctcccc gcgcgctgtt tttctcgctg actttcagcg 120ggcggaaaag
cctcggcctg ccgccttcca ccgttcattc tagagcaaac aaaaaatgtc
180agctgctggc ccgttcgccc ctcccgggga cctgcggcgg gtcgcctgcc
cagcccccga 240accccgcctg gaggccgcgg tcggcccggg gcttctccgg
aggcacccac tgccaccgcg 300aagagttggg ctctgtcagc cgcgggtctc
tcgggggcga gggcgaggtt caggcctttc 360aggccgcagg aagaggaacg
gagcgagtcc ccgcgcgcgg cgcgattccc tgagctgtgg 420gacgtgcacc
caggactcgg ctcacacatg c 451
* * * * *
References