MARKs as Modifiers of the p53 Pathway and Methods of Use

Friedman; Lori; et al.

Patent Application Summary

U.S. patent application number 11/627976, for MARKs as modifiers of the p53 pathway and methods of use, was published on 2008-07-10. This patent application is currently assigned to EXELIXIS, INC. The invention is credited to Marcia Belvin, Helen Francis-Lang, Lori Friedman, Roel P. Funke, Danxi Li, Mario N. Lioubin, and Gregory D. Plowman.

Application Number: 11/627976
Publication Number: 20080166709
Family ID: 27404397
Publication Date: 2008-07-10

United States Patent Application 20080166709
Kind Code A1
Friedman; Lori; et al. July 10, 2008

Abstract

Human MARK genes are identified as modulators of the p53 pathway, and thus are therapeutic targets for disorders associated with defective p53 function. Methods for identifying modulators of p53, comprising screening for agents that modulate the activity of MARK, are provided.


Inventors: Friedman; Lori; (San Francisco, CA) ; Plowman; Gregory D.; (San Carlos, CA) ; Belvin; Marcia; (Albany, CA) ; Francis-Lang; Helen; (San Francisco, CA) ; Li; Danxi; (San Francisco, CA) ; Funke; Roel P.; (South San Francisco, CA) ; Lioubin; Mario N.; (San Mateo, CA)
Correspondence Address:
    MCDONNELL BOEHNEN HULBERT & BERGHOFF LLP
    300 SOUTH WACKER DRIVE, SUITE 3100
    CHICAGO
    IL
    60606
    US
Assignee: EXELIXIS, INC.
South San Francisco
CA

Family ID: 27404397
Appl. No.: 11/627976
Filed: January 28, 2007

Related U.S. Patent Documents

Application Number    Filing Date     Patent Number
10161565              Jun 3, 2002
11627976
60296076              Jun 5, 2001
60328605              Oct 10, 2001
60357253              Feb 15, 2002

Current U.S. Class: 435/6.16 ; 435/375
Current CPC Class: G01N 33/57423 20130101; G01N 33/5748 20130101; G01N 33/574 20130101; G01N 33/6872 20130101; G01N 2500/00 20130101; G01N 2500/04 20130101; G01N 33/5308 20130101; G01N 2500/10 20130101; G01N 33/573 20130101; A61P 43/00 20180101; C12Q 1/527 20130101; C12Q 1/42 20130101; C12Q 1/485 20130101; G01N 33/57415 20130101; G01N 33/57419 20130101; G01N 2333/82 20130101; C12Q 2600/158 20130101; G01N 2333/912 20130101; G01N 33/57496 20130101; G01N 2333/988 20130101; G01N 33/57449 20130101; G01N 33/5017 20130101; G01N 2333/4739 20130101; C12Q 1/6886 20130101; A61P 35/00 20180101; G01N 33/5011 20130101; G01N 2333/62 20130101; G01N 2510/00 20130101; G01N 33/57484 20130101; G01N 2333/705 20130101
Class at Publication: 435/6 ; 435/375
International Class: C12Q 1/68 20060101 C12Q001/68; C12N 5/00 20060101 C12N005/00

Claims



1. A method of identifying a candidate p53 pathway modulating agent, said method comprising the steps of: (a) providing an assay system comprising a purified MARK polypeptide or nucleic acid or a functionally active fragment or derivative thereof, (b) contacting the assay system with a test agent under conditions whereby, but for the presence of the test agent, the system provides a reference activity; and (c) detecting a test agent-biased activity of the assay system, wherein a difference between the test agent-biased activity and the reference activity identifies the test agent as a candidate p53 pathway modulating agent.

2. The method of claim 1 wherein the assay system comprises cultured cells that express the MARK polypeptide.

3. The method of claim 2 wherein the cultured cells additionally have defective p53 function.

4. The method of claim 1 wherein the assay system includes a screening assay comprising a MARK polypeptide, and the candidate test agent is a small molecule modulator.

5. The method of claim 4 wherein the assay is a kinase assay.

6. The method of claim 1 wherein the assay system is selected from the group consisting of an apoptosis assay system, a cell proliferation assay system, an angiogenesis assay system, and a hypoxic induction assay system.

7. The method of claim 1 wherein the assay system includes a binding assay comprising a MARK polypeptide and the candidate test agent is an antibody.

8. The method of claim 1 wherein the assay system includes an expression assay comprising a MARK nucleic acid and the candidate test agent is a nucleic acid modulator.

9. The method of claim 8 wherein the nucleic acid modulator is an antisense oligomer.

10. The method of claim 8 wherein the nucleic acid modulator is a PMO.

11. The method of claim 1 additionally comprising: (d) administering the candidate p53 pathway modulating agent identified in (c) to a model system comprising cells defective in p53 function, and detecting a phenotypic change in the model system that indicates that the p53 function is restored.

12. The method of claim 11 wherein the model system is a mouse model with defective p53 function.

13. A method for modulating a p53 pathway of a cell comprising contacting a cell defective in p53 function with a candidate modulator that specifically binds to a MARK polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs:24, 25, 26, 27, 28, and 29, whereby p53 function is restored.

14. The method of claim 13 wherein the candidate modulator is administered to a vertebrate animal predetermined to have a disease or disorder resulting from a defect in p53 function.

15. The method of claim 13 wherein the candidate modulator is selected from the group consisting of an antibody and a small molecule.

16. The method of claim 1, comprising the additional steps of: (d) providing a secondary assay system comprising cultured cells or a non-human animal expressing MARK, (e) contacting the secondary assay system with the test agent of (b) or an agent derived therefrom under conditions whereby, but for the presence of the test agent or agent derived therefrom, the system provides a reference activity; and (f) detecting an agent-biased activity of the second assay system, wherein a difference between the agent-biased activity and the reference activity of the second assay system confirms the test agent or agent derived therefrom as a candidate p53 pathway modulating agent, and wherein the second assay detects an agent-biased change in the p53 pathway.

17. The method of claim 16 wherein the secondary assay system comprises cultured cells.

18. The method of claim 16 wherein the secondary assay system comprises a non-human animal.

19. The method of claim 18 wherein the non-human animal mis-expresses a p53 pathway gene.

20. A method of modulating p53 pathway in a mammalian cell comprising contacting the cell with an agent that specifically binds a MARK polypeptide or nucleic acid.

21. The method of claim 20 wherein the agent is administered to a mammalian animal predetermined to have a pathology associated with the p53 pathway.

22. The method of claim 20 wherein the agent is a small molecule modulator, a nucleic acid modulator, or an antibody.

23. A method for diagnosing a disease in a patient comprising: (a) obtaining a biological sample from the patient; (b) contacting the sample with a probe for MARK expression; (c) comparing results from step (b) with a control; (d) determining whether step (c) indicates a likelihood of disease.

24. The method of claim 23 wherein said disease is cancer.

25. The method according to claim 24, wherein said cancer is a cancer as shown in Table 1 as having >25% expression level.
Description



REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation of U.S. patent application Ser. No. 10/161,565 filed Jun. 6, 2002, which claims priority to U.S. provisional patent applications 60/296,076 filed Jun. 5, 2001, 60/328,605 filed Oct. 10, 2001, and 60/357,253 filed Feb. 15, 2002. The contents of the prior applications are hereby incorporated in their entirety.

BACKGROUND OF THE INVENTION

[0002] The p53 gene is mutated in over 50 different types of human cancers, including familial and spontaneous cancers, and is believed to be the most commonly mutated gene in human cancer (Zambetti and Levine, FASEB (1993) 7:855-865; Hollstein, et al., Nucleic Acids Res. (1994) 22:3551-3555). More than 90% of mutations in the p53 gene are missense mutations that alter a single amino acid and thereby inactivate p53 function. Aberrant forms of human p53 are associated with poor prognosis, more aggressive tumors, metastasis, and shorter survival (Mitsudomi et al., Clin Cancer Res 2000 Oct; 6(10):4055-63; Koshland, Science (1993) 262:1953).

[0003] The human p53 protein normally functions as a central integrator of signals including DNA damage, hypoxia, nucleotide deprivation, and oncogene activation (Prives, Cell (1998) 95:5-8). In response to these signals, p53 protein levels are greatly increased with the result that the accumulated p53 activates cell cycle arrest or apoptosis depending on the nature and strength of these signals. Indeed, multiple lines of experimental evidence have pointed to a key role for p53 as a tumor suppressor (Levine, Cell (1997) 88:323-331). For example, homozygous p53 "knockout" mice are developmentally normal but exhibit nearly 100% incidence of neoplasia in the first year of life (Donehower et al., Nature (1992) 356:215-221).

[0004] The biochemical mechanisms and pathways through which p53 functions in normal and cancerous cells are not fully understood, but one clearly important aspect of p53 function is its activity as a gene-specific transcriptional activator. Among the genes with known p53-response elements are several with well-characterized roles in either regulation of the cell cycle or apoptosis, including GADD45, p21/Waf1/Cip1, cyclin G, Bax, IGF-BP3, and MDM2 (Levine, Cell (1997) 88:323-331).

[0005] Microtubules have a central role in the regulation of cell shape and polarity during differentiation, chromosome partitioning at mitosis, and intracellular transport. Microtubules undergo rearrangements involving rapid transitions between stable and dynamic states during these processes. Microtubule affinity regulating kinases (MARK) are a novel family of protein kinases that phosphorylate microtubule-associated proteins and trigger microtubule disruption (Drewes, G., et al. (1997) Cell 89: 297-308).

[0006] Microtubule affinity regulating kinase 1 (MARK1) is a serine/threonine kinase that phosphorylates microtubule-associated protein tau, leading to disruption of microtubules. It shares 90% amino acid homology with the rat version of MARK1, and demonstrates ubiquitous expression with highest levels in testis and brain (Nagase, T. et al. (2000) DNA Res. 7: 143-150).

[0007] EMK1 (MARK2) is a serine/threonine protein kinase with two isoforms, which differ by the presence or absence of a 162-bp alternative exon (Espinosa, L. and Navarro, E. (1998) Cytogenet. Cell Genet. 81:278-282). Both human isoforms are coexpressed in a number of cell lines and tissues, with the highest expression found in heart, brain, placenta, skeletal muscle, and pancreas, and at lower levels in lung, liver, and kidney (Inglis, J. et al. (1993) Mammalian Genome 4: 401-403). Due to the physical location of this gene, 11q12-q13, EMK1 is a candidate gene for carcinogenic events (Courseaux, A. et al. (1995) Mammalian Genome 6: 311-312), and has been associated with colon and prostate cancer (Moore, T. M., et al. (2000) J Biol Chem 275:4311-22; Navarro, E., et al. (1999) Biochim Biophys Acta 1450: 254-64).

[0008] Microtubule affinity regulating kinase 3 (MARK3) was originally identified as a marker (KP78) induced by treatment with DNA damaging agents. The loss of MARK3 was associated with carcinogenesis in the pancreas (Parsa, I. (1988) Cell Growth Differ. 9: 197-208). MARK3 may be involved in cell cycle regulation, and alterations in the MARK3 gene may lead to carcinogenesis. MARK3 is ubiquitously expressed throughout human tissues, with an additional 3.0 Kb transcript present in the heart (Peng, C. et al. (1998) Cell Growth Differ. 9: 197-208).

[0009] MAP/microtubule affinity-regulating kinase like 1 (MARKL1) has two isoforms (Nagase, T. et al. (2001) DNA Res. 8: 85-95), is activated by the beta-catenin/Tcf complex in hepatic cell lines, and may be involved in hepatic carcinogenesis (Kato, T. et al. (2001) Neoplasia 3:4-9).

[0010] The ability to manipulate the genomes of model organisms such as Drosophila provides a powerful means to analyze biochemical processes that, due to significant evolutionary conservation, have direct relevance to more complex vertebrate organisms. Due to a high level of gene and pathway conservation, the strong similarity of cellular processes, and the functional conservation of genes between these model organisms and mammals, identification of the involvement of novel genes in particular pathways and their functions in such model organisms can directly contribute to the understanding of the correlative pathways and methods of modulating them in mammals (see, for example, Mechler B M et al., 1985 EMBO J 4:1551-1557; Gateff E. 1982 Adv. Cancer Res. 37: 33-74; Watson K L., et al., 1994 J Cell Sci. 18: 19-33; Miklos G L, and Rubin G M. 1996 Cell 86:521-529; Wassarman D A, et al., 1995 Curr Opin Gen Dev 5: 44-50; and Booth D R. 1999 Cancer Metastasis Rev. 18: 261-284). For example, a genetic screen can be carried out in an invertebrate model organism having underexpression (e.g. knockout) or overexpression of a gene (referred to as a "genetic entry point") that yields a visible phenotype. Additional genes are mutated in a random or targeted manner. When a gene mutation changes the original phenotype caused by the mutation in the genetic entry point, the gene is identified as a "modifier" involved in the same or overlapping pathway as the genetic entry point. When the genetic entry point is an ortholog of a human gene implicated in a disease pathway, such as p53, modifier genes can be identified that may be attractive candidate targets for novel therapeutics.

[0011] All references cited herein, including sequence information in referenced Genbank identifier numbers and website references, are incorporated herein in their entireties.

SUMMARY OF THE INVENTION

[0012] We have discovered genes that modify the p53 pathway in Drosophila, and identified their human orthologs, hereinafter referred to as MARK. The invention provides methods for utilizing these p53 modifier genes and polypeptides to identify candidate therapeutic agents that can be used in the treatment of disorders associated with defective p53 function. Preferred MARK-modulating agents specifically bind to MARK polypeptides and restore p53 function. Other preferred MARK-modulating agents are nucleic acid modulators such as antisense oligomers and RNAi that repress MARK gene expression or product activity by, for example, binding to and inhibiting the respective nucleic acid (i.e. DNA or mRNA).

[0013] MARK-specific modulating agents may be evaluated by any convenient in vitro or in vivo assay for molecular interaction with a MARK polypeptide or nucleic acid. In one embodiment, candidate p53 modulating agents are tested with an assay system comprising a MARK polypeptide or nucleic acid. Candidate agents that produce a change in the activity of the assay system relative to controls are identified as candidate p53 modulating agents. The assay system may be cell-based or cell-free. MARK-modulating agents include MARK-related proteins (e.g. dominant negative mutants, and biotherapeutics); MARK-specific antibodies; MARK-specific antisense oligomers and other nucleic acid modulators; and chemical agents that specifically bind MARK or compete with a MARK binding target. In one specific embodiment, a small molecule modulator is identified using a kinase assay. In specific embodiments, the screening assay system is selected from a binding assay, an apoptosis assay, a cell proliferation assay, an angiogenesis assay, and a hypoxic induction assay.
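As a rough illustration of the comparison described above, the sketch below flags a test agent when the mean assay activity in its presence differs from the mean reference activity by at least a chosen fraction. The function name, the use of replicate lists, and the 20% cutoff are hypothetical assumptions; the text does not specify how large a difference must be.

```python
from statistics import mean
from typing import Sequence


def is_candidate_modulator(
    reference_activity: Sequence[float],
    test_activity: Sequence[float],
    min_relative_change: float = 0.2,  # hypothetical cutoff, not from the text
) -> bool:
    """Return True if the test agent shifts mean activity away from the reference.

    reference_activity: replicate readouts without the test agent.
    test_activity: replicate readouts with the test agent.
    """
    ref = mean(reference_activity)
    test = mean(test_activity)
    if ref == 0:
        raise ValueError("reference activity must be non-zero")
    # Either an increase or a decrease counts as a difference.
    return abs(test - ref) / abs(ref) >= min_relative_change
```

In practice a screen would usually also apply a statistical test across replicates; the fixed cutoff here is only meant to make the "difference from reference" logic concrete.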

[0014] In another embodiment, candidate p53 pathway modulating agents are further tested using a second assay system that detects changes in the p53 pathway, such as angiogenic, apoptotic, or cell proliferation changes produced by the originally identified candidate agent or an agent derived from the original agent. The second assay system may use cultured cells or non-human animals. In specific embodiments, the secondary assay system uses non-human animals, including animals predetermined to have a disease or disorder implicating the p53 pathway, such as an angiogenic, apoptotic, or cell proliferation disorder (e.g. cancer).

[0015] The invention further provides methods for modulating the p53 pathway in a mammalian cell by contacting the mammalian cell with an agent that specifically binds a MARK polypeptide or nucleic acid. The agent may be a small molecule modulator, a nucleic acid modulator, or an antibody and may be administered to a mammalian animal predetermined to have a pathology associated with the p53 pathway.

DETAILED DESCRIPTION OF THE INVENTION

[0016] Genetic screens were designed to identify modifiers of the p53 pathway in Drosophila in which p53 was overexpressed in the wing (Ollmann M, et al., Cell 2000 101: 91-101). The KP78a gene was identified as a modifier of the p53 pathway. Accordingly, vertebrate orthologs of these modifiers, and preferably the human orthologs, microtubule affinity regulating kinase (MARK) genes (i.e., nucleic acids and polypeptides), are attractive drug targets for the treatment of pathologies associated with a defective p53 signaling pathway, such as cancer.

[0017] In vitro and in vivo methods of assessing MARK function are provided herein. Modulation of the MARKs or their respective binding partners is useful for understanding the association of the p53 pathway and its members in normal and disease conditions and for developing diagnostics and therapeutic modalities for p53 related pathologies. MARK-modulating agents that act by inhibiting or enhancing MARK expression, directly or indirectly, for example, by affecting a MARK function such as enzymatic (e.g., catalytic) or binding activity, can be identified using methods provided herein. MARK modulating agents are useful in diagnosis, therapy and pharmaceutical development.

Nucleic Acids and Polypeptides of the Invention

[0018] Sequences related to MARK nucleic acids and polypeptides that can be used in the invention are disclosed in Genbank (referenced by Genbank identifier (GI) number) as GI#s 9845486 (SEQ ID NO:1), 9845488 (SEQ ID NO:2), 18578044 (SEQ ID NO:3), 14250621 (SEQ ID NO:6), 15042610 (SEQ ID NO:7), 8923921 (SEQ ID NO:8), 17445805 (SEQ ID NO:9), 7959214 (SEQ ID NO:11), 14042208 (SEQ ID NO:12), 3089348 (SEQ ID NO:13), 4505102 (SEQ ID NO:14), 5714635 (SEQ ID NO:15), 18448970 (SEQ ID NO:18), 13366083 (SEQ ID NO:19), 14017936 (SEQ ID NO:22), and 16555377 (SEQ ID NO:23) for nucleic acid, and GI#s 9845487 (SEQ ID NO:24), 8923922 (SEQ ID NO:25), 3089349 (SEQ ID NO:26), 4505103 (SEQ ID NO:27), 13366084 (SEQ ID NO:28) and 13899225 (SEQ ID NO:29) for polypeptides. Additionally, nucleic acid sequences of SEQ ID NOs:4, 5, 16, 17, 20, 21, and the novel nucleic acid sequence of SEQ ID NO:10 can also be used in the invention.

[0019] MARKs are kinase proteins with kinase and UBA/TS-N domains. The term "MARK polypeptide" refers to a full-length MARK protein or a functionally active fragment or derivative thereof. A "functionally active" MARK fragment or derivative exhibits one or more functional activities associated with a full-length, wild-type MARK protein, such as antigenic or immunogenic activity, enzymatic activity, ability to bind natural cellular substrates, etc. The functional activity of MARK proteins, derivatives and fragments can be assayed by various methods known to one skilled in the art (Current Protocols in Protein Science (1998) Coligan et al., eds., John Wiley & Sons, Inc., Somerset, N.J.) and as further discussed below. For purposes herein, functionally active fragments also include those fragments that comprise one or more structural domains of a MARK, such as a kinase domain or a binding domain. Protein domains can be identified using the PFAM program (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2; http://pfam.wustl.edu). For example, the protein kinase domains of MARKs from GI#s 9845487 (SEQ ID NO:24), 8923922 (SEQ ID NO:25), 4505103 (SEQ ID NO:27), and 13899225 (SEQ ID NO:29) are located at approximately amino acid residues 20 to 271, 60 to 311, 56 to 307, and 59 to 310, respectively (PFAM 00069). Further, the ubiquitin associated (UBA/TS-N) domains of MARKs from GI#s 9845487 (SEQ ID NO:24), 8923922 (SEQ ID NO:25), 4505103 (SEQ ID NO:27), and 13899225 (SEQ ID NO:29) are located at approximately amino acid residues 291 to 330, 331 to 370, 327 to 366, and 330 to 369, respectively (PFAM 00627). Methods for obtaining MARK polypeptides are also further described below. In some embodiments, preferred fragments are functionally active, domain-containing fragments comprising at least 25 contiguous amino acids, preferably at least 50, more preferably 75, and most preferably at least 100 contiguous amino acids of any one of SEQ ID NOs:24, 25, 26, 27, 28, or 29 (a MARK). In further preferred embodiments, the fragment comprises the entire kinase (functionally active) domain.
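The domain coordinates and length thresholds quoted above can be expressed as a simple check on a candidate fragment. The following is a minimal sketch, assuming 1-based inclusive residue numbering; the dictionary layout and function names are illustrative and not part of the disclosure.

```python
from typing import Optional, Tuple

# Approximate domain boundaries quoted in the text, keyed by SEQ ID NO
# (1-based, inclusive). The data layout is an illustrative assumption.
KINASE_DOMAIN = {24: (20, 271), 25: (60, 311), 27: (56, 307), 29: (59, 310)}
UBA_DOMAIN = {24: (291, 330), 25: (331, 370), 27: (327, 366), 29: (330, 369)}

# Contiguous-length thresholds named in the text (25, 50, 75, 100 residues).
LENGTH_TIERS = (25, 50, 75, 100)


def fragment_length(start: int, end: int) -> int:
    """Length of a fragment given 1-based inclusive coordinates."""
    return end - start + 1


def contains_domain(start: int, end: int, domain: Tuple[int, int]) -> bool:
    """True if the fragment spans the entire domain."""
    return start <= domain[0] and end >= domain[1]


def highest_length_tier(start: int, end: int) -> Optional[int]:
    """Largest length threshold the fragment meets, or None if under 25."""
    met = [t for t in LENGTH_TIERS if fragment_length(start, end) >= t]
    return max(met) if met else None
```

For example, a fragment covering residues 1 to 300 of SEQ ID NO:24 contains the whole kinase domain (20 to 271), whereas a fragment starting at residue 30 does not.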

[0020] The term "MARK nucleic acid" refers to a DNA or RNA molecule that encodes a MARK polypeptide. Preferably, the MARK polypeptide or nucleic acid or fragment thereof is from a human, but can also be an ortholog, or derivative thereof with at least 70% sequence identity, preferably at least 80%, more preferably 85%, still more preferably 90%, and most preferably at least 95% sequence identity with MARK. Normally, orthologs in different species retain the same function, due to presence of one or more protein motifs and/or 3-dimensional structures. Orthologs are generally identified by sequence homology analysis, such as BLAST analysis, usually using protein bait sequences. Sequences are assigned as a potential ortholog if the best hit sequence from the forward BLAST result retrieves the original query sequence in the reverse BLAST (Huynen M A and Bork P, Proc Natl Acad Sci (1998) 95:5849-5856; Huynen M A et al, Genome Research (2000) 10:1204-1210). Programs for multiple sequence alignment, such as CLUSTAL (Thompson J D et al, 1994, Nucleic Acids Res 22:4673-4680) may be used to highlight conserved regions and/or residues of orthologous proteins and to generate phylogenetic trees. In a phylogenetic tree representing multiple homologous sequences from diverse species (e.g., retrieved through BLAST analysis), orthologous sequences from two species generally appear closest on the tree with respect to all other sequences from these two species. Structural threading or other analysis of protein folding (e.g., using software by ProCeryon Biosciences, Salzburg, Austria) may also identify potential orthologs. In evolution, when a gene duplication event follows speciation, a single gene in one species, such as Drosophila, may correspond to multiple genes (paralogs) in another, such as human. As used herein, the term "orthologs" encompasses paralogs.
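The forward/reverse BLAST criterion described above amounts to a reciprocal-best-hit test. The sketch below assumes the best hits have already been extracted from BLAST output into two dictionaries; the function name and the sequence identifiers in the usage note are hypothetical.

```python
from typing import Dict, List, Tuple


def reciprocal_best_hits(
    forward: Dict[str, str], reverse: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Pair each query with its best hit when the hit points back to the query.

    forward: query sequence -> best hit in the other species (forward BLAST).
    reverse: that hit -> its best hit back in the query species (reverse BLAST).
    """
    pairs = []
    for query, hit in forward.items():
        # Keep the pair only if the reverse search retrieves the original query.
        if reverse.get(hit) == query:
            pairs.append((query, hit))
    return sorted(pairs)
```

With `forward = {"fly_1": "human_1", "fly_2": "human_2"}` and `reverse = {"human_1": "fly_1", "human_2": "fly_9"}`, only the pair `("fly_1", "human_1")` is returned. Because a strict one-to-one test keeps a single best hit per query, it would not by itself capture the case noted above in which one Drosophila gene corresponds to several human paralogs.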
As used herein, "percent (%) sequence identity" with respect to a subject sequence, or a specified portion of a subject sequence, is defined as the percentage of nucleotides or amino acids in the candidate derivative sequence identical with the nucleotides or amino acids in the subject sequence (or specified portion thereof), after aligning the sequences and introducing gaps, if necessary to achieve the maximum percent sequence identity, as generated by the program WU-BLAST-2.0a19 (Altschul et al., J. Mol. Biol. (1997) 215:403-410; http://blast.wustl.edu/blast/README.html) with all the search parameters set to default values. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. A % identity value is determined by the number of matching identical nucleotides or amino acids divided by the sequence length for which the percent identity is being reported. "Percent (%) amino acid sequence similarity" is determined by doing the same calculation as for determining % amino acid sequence identity, but including conservative amino acid substitutions in addition to identical amino acids in the computation.
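The percent identity definition above can be illustrated on a pair of sequences that have already been aligned. This is a minimal sketch and not the WU-BLAST procedure itself: it assumes gaps are written as "-" and takes the ungapped length of the subject sequence as the "sequence length for which the percent identity is being reported"; the function name is illustrative.

```python
def percent_identity(aligned_subject: str, aligned_candidate: str) -> float:
    """Percent of subject residues matched by an identical candidate residue.

    Both arguments are rows of one pairwise alignment, with '-' for gaps.
    """
    if len(aligned_subject) != len(aligned_candidate):
        raise ValueError("aligned sequences must have equal length")
    # Count columns where both rows carry the same residue (gaps never match).
    matches = sum(
        1
        for s, c in zip(aligned_subject, aligned_candidate)
        if s == c and s != "-"
    )
    # Denominator: ungapped length of the subject sequence (an assumption).
    subject_length = sum(1 for s in aligned_subject if s != "-")
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    return 100.0 * matches / subject_length
```

For the toy alignment `MKT-LV` against `MKTALI`, four of the five subject residues are matched by identical residues, giving 80% identity.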

[0021] A conservative amino acid substitution is one in which an amino acid is substituted for another amino acid having similar properties such that the folding or activity of the protein is not significantly affected. Aromatic amino acids that can be substituted for each other are phenylalanine, tryptophan, and tyrosine; interchangeable hydrophobic amino acids are leucine, isoleucine, methionine, and valine; interchangeable polar amino acids are glutamine and asparagine; interchangeable basic amino acids are arginine, lysine and histidine; interchangeable acidic amino acids are aspartic acid and glutamic acid; and interchangeable small amino acids are alanine, serine, threonine, cysteine and glycine.
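The interchangeable groups listed above can be encoded as a lookup table and used to compute percent similarity on an existing alignment. The sketch below is illustrative only: the one-letter codes, the function names, and the choice of the ungapped subject length as the denominator are assumptions.

```python
# Interchangeable amino acid groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = {
    "aromatic": "FWY",      # phenylalanine, tryptophan, tyrosine
    "hydrophobic": "LIMV",  # leucine, isoleucine, methionine, valine
    "polar": "QN",          # glutamine, asparagine
    "basic": "RKH",         # arginine, lysine, histidine
    "acidic": "DE",         # aspartic acid, glutamic acid
    "small": "ASTCG",       # alanine, serine, threonine, cysteine, glycine
}
GROUP_OF = {aa: name for name, members in CONSERVATIVE_GROUPS.items() for aa in members}


def is_conservative(a: str, b: str) -> bool:
    """True if a and b are different residues from the same group."""
    return a != b and a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b)


def percent_similarity(aligned_subject: str, aligned_candidate: str) -> float:
    """Percent of subject residues aligned to an identical or conservative residue."""
    if len(aligned_subject) != len(aligned_candidate):
        raise ValueError("aligned sequences must have equal length")
    similar = sum(
        1
        for s, c in zip(aligned_subject, aligned_candidate)
        if s != "-" and (s == c or is_conservative(s, c))
    )
    subject_length = sum(1 for s in aligned_subject if s != "-")
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    return 100.0 * similar / subject_length
```

Residues that appear in none of the listed groups, such as proline, are counted only when identical.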

[0022] Alternatively, an alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman (Smith and Waterman, 1981, Advances in Applied Mathematics 2:482-489; database: European Bioinformatics Institute http://www.ebi.ac.uk/MPsrch/; Smith and Waterman, 1981, J. of Molec. Biol., 147:195-197; Nicholas et al., 1998, "A Tutorial on Searching Sequence Databases and Sequence Scoring Methods" (www.psc.edu) and references cited therein; W. R. Pearson, 1991, Genomics 11:635-650). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff (Dayhoff: Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., USA), and normalized by Gribskov (Gribskov 1986 Nucl. Acids Res. 14(6):6745-6763). The Smith-Waterman algorithm may be employed where default parameters are used for scoring (for example, gap open penalty of 12, gap extension penalty of two). From the data generated, the "Match" value reflects "sequence identity."
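A local alignment score with the example gap penalties above can be sketched as follows. This is an affine-gap variant of Smith-Waterman that returns only the best score, with no traceback. Several points are assumptions made for illustration: a simple match/mismatch score stands in for the Dayhoff matrix, and a gap of length k is charged gap_open for its first position plus gap_extend for each further position.

```python
def smith_waterman_score(
    a: str,
    b: str,
    match: int = 5,       # assumed score for identical residues
    mismatch: int = -4,   # assumed score for differing residues
    gap_open: int = 12,   # example value from the text
    gap_extend: int = 2,  # example value from the text
) -> int:
    """Best local alignment score between a and b with affine gap penalties."""
    n, m = len(a), len(b)
    neg = -10**9  # stands in for minus infinity
    # h: best score of an alignment ending at (i, j)
    # e: best score ending with a gap in a (consumes a residue of b)
    # f: best score ending with a gap in b (consumes a residue of a)
    h = [[0] * (m + 1) for _ in range(n + 1)]
    e = [[neg] * (m + 1) for _ in range(n + 1)]
    f = [[neg] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            e[i][j] = max(h[i][j - 1] - gap_open, e[i][j - 1] - gap_extend)
            f[i][j] = max(h[i - 1][j] - gap_open, f[i - 1][j] - gap_extend)
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores never drop below zero.
            h[i][j] = max(0, diag, e[i][j], f[i][j])
            best = max(best, h[i][j])
    return best
```

As a usage example, aligning `AAAAACCCCC` with `AAAAAGGCCCCC` yields ten matches and one two-residue gap, for a score of 10 x 5 - (12 + 2) = 36 under these assumed parameters.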

[0023] Derivative nucleic acid molecules of the subject nucleic acid molecules include sequences that hybridize to the nucleic acid sequence of any of SEQ ID NOs:1 through 23. The stringency of hybridization can be controlled by temperature, ionic strength, pH, and the presence of denaturing agents such as formamide during hybridization and washing. Conditions routinely used are set out in readily available procedure texts (e.g., Current Protocols in Molecular Biology, Vol. 1, Chap. 2.10, John Wiley & Sons, Publishers (1994); Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). In some embodiments, a nucleic acid molecule of the invention is capable of hybridizing to a nucleic acid molecule containing the nucleotide sequence of any one of SEQ ID NOs:1 through 23 under stringent hybridization conditions that comprise: prehybridization of filters containing nucleic acid for 8 hours to overnight at 65° C. in a solution comprising 6× single strength citrate (SSC) (1× SSC is 0.15 M NaCl, 0.015 M Na citrate; pH 7.0), 5× Denhardt's solution, 0.05% sodium pyrophosphate and 100 µg/ml herring sperm DNA; hybridization for 18-20 hours at 65° C. in a solution containing 6× SSC, 1× Denhardt's solution, 100 µg/ml yeast tRNA and 0.05% sodium pyrophosphate; and washing of filters at 65° C. for 1 h in a solution containing 0.2× SSC and 0.1% SDS (sodium dodecyl sulfate).
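The fold strengths of SSC quoted above translate directly into molar concentrations using the stated composition of 1× SSC. The sketch below shows the arithmetic; the function names are illustrative, and the 20× stock used in the usage note is an assumption about laboratory practice rather than something stated in the text.

```python
from typing import Dict, Tuple

# Composition of 1x SSC as stated in the text (molar concentrations).
ONE_X_SSC = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc_molarity(fold: float) -> Dict[str, float]:
    """Molar concentration of each SSC component at the given fold strength."""
    return {name: round(conc * fold, 6) for name, conc in ONE_X_SSC.items()}


def dilution_volumes(
    stock_fold: float, target_fold: float, final_volume_ml: float
) -> Tuple[float, float]:
    """Volumes (ml) of stock and diluent needed, from C1 * V1 = C2 * V2."""
    if target_fold > stock_fold:
        raise ValueError("target strength cannot exceed stock strength")
    stock_ml = final_volume_ml * target_fold / stock_fold
    return stock_ml, final_volume_ml - stock_ml
```

Under these definitions, the 6× SSC hybridization solution is 0.9 M NaCl and 0.09 M Na citrate, and the 0.2× SSC wash is 0.03 M NaCl and 0.003 M Na citrate. Preparing 100 ml of 6× SSC from a hypothetical 20× stock would take 30 ml of stock and 70 ml of diluent.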

[0024] In other embodiments, moderately stringent hybridization conditions are used that comprise: pretreatment of filters containing nucleic acid for 6 h at 40° C. in a solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml denatured salmon sperm DNA; hybridization for 18-20 h at 40° C. in a solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 µg/ml salmon sperm DNA, and 10% (wt/vol) dextran sulfate; followed by washing twice for 1 hour at 55° C. in a solution containing 2× SSC and 0.1% SDS.

[0025] Alternatively, low stringency conditions can be used that comprise: incubation for 8 hours to overnight at 37° C. in a solution comprising 20% formamide, 5× SSC, 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured sheared salmon sperm DNA; hybridization in the same buffer for 18 to 20 hours; and washing of filters in 1× SSC at about 37° C. for 1 hour.

Isolation, Production, Expression, and Mis-expression of MARK Nucleic Acids and Polypeptides

[0026] MARK nucleic acids and polypeptides are useful for identifying and testing agents that modulate MARK function and for other applications related to the involvement of MARK in the p53 pathway. MARK nucleic acids and derivatives and orthologs thereof may be obtained using any available method. For instance, techniques for isolating cDNA or genomic DNA sequences of interest by screening DNA libraries or by using polymerase chain reaction (PCR) are well known in the art. In general, the particular use for the protein will dictate the particulars of expression, production, and purification methods. For instance, production of proteins for use in screening for modulating agents may require methods that preserve specific biological activities of these proteins, whereas production of proteins for antibody generation may require structural integrity of particular epitopes. Expression of proteins to be purified for screening or antibody production may require the addition of specific tags (e.g., generation of fusion proteins). Overexpression of a MARK protein for assays used to assess MARK function, such as involvement in cell cycle regulation or hypoxic response, may require expression in eukaryotic cell lines capable of these cellular activities. Techniques for the expression, production, and purification of proteins are well known in the art; any suitable means therefor may be used (e.g., Higgins S J and Hames B D (eds.) Protein Expression: A Practical Approach, Oxford University Press Inc., New York 1999; Stanbury P F et al., Principles of Fermentation Technology, 2nd edition, Elsevier Science, New York, 1995; Doonan S (ed.) Protein Purification Protocols, Humana Press, New Jersey, 1996; Coligan J E et al, Current Protocols in Protein Science (eds.), 1999, John Wiley & Sons, New York). In particular embodiments, recombinant MARK is expressed in a cell line known to have defective p53 function (e.g. SAOS-2 osteoblasts, H1299 lung cancer cells, C33A and HT3 cervical cancer cells, HT-29 and DLD-1 colon cancer cells, among others, available from American Type Culture Collection (ATCC), Manassas, Va.). The recombinant cells are used in cell-based screening assay systems of the invention, as described further below.

[0027] The nucleotide sequence encoding a MARK polypeptide can be inserted into any appropriate expression vector. The necessary transcriptional and translational signals, including promoter/enhancer element, can derive from the native MARK gene and/or its flanking regions or can be heterologous. A variety of host-vector expression systems may be utilized, such as mammalian cell systems infected with virus (e.g. vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g. baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage, plasmid, or cosmid DNA. A host cell strain that modulates the expression of, modifies, and/or specifically processes the gene product may be used.

[0028] To detect expression of the MARK gene product, the expression vector can comprise a promoter operably linked to a MARK gene nucleic acid, one or more origins of replication, and one or more selectable markers (e.g. thymidine kinase activity, resistance to antibiotics, etc.). Alternatively, recombinant expression vectors can be identified by assaying for the expression of the MARK gene product based on the physical or functional properties of the MARK protein in in vitro assay systems (e.g. immunoassays).

[0029] The MARK protein, fragment, or derivative may be optionally expressed as a fusion, or chimeric protein product (i.e. it is joined via a peptide bond to a heterologous protein sequence of a different protein), for example to facilitate purification or detection. A chimeric product can be made by ligating the appropriate nucleic acid sequences encoding the desired amino acid sequences to each other using standard methods and expressing the chimeric product. A chimeric product may also be made by protein synthetic techniques, e.g. by use of a peptide synthesizer (Hunkapiller et al., Nature (1984) 310:105-111).

[0030] Once a recombinant cell that expresses the MARK gene sequence is identified, the gene product can be isolated and purified using standard methods (e.g. ion exchange, affinity, and gel exclusion chromatography; centrifugation; differential solubility; electrophoresis). Alternatively, native MARK proteins can be purified from natural sources, by standard methods (e.g. immunoaffinity purification). Once a protein is obtained, it may be quantified and its activity measured by appropriate methods, such as immunoassay, bioassay, or other measurements of physical properties, such as crystallography.

[0031] The methods of this invention may also use cells that have been engineered for altered expression (mis-expression) of MARK or other genes associated with the p53 pathway. As used herein, mis-expression encompasses ectopic expression, over-expression, under-expression, and non-expression (e.g. by gene knock-out or blocking expression that would otherwise normally occur).

Genetically Modified Animals

[0032] Animal models that have been genetically modified to alter MARK expression may be used in in vivo assays to test for activity of a candidate p53 modulating agent, or to further assess the role of MARK in a p53 pathway process such as apoptosis or cell proliferation. Preferably, the altered MARK expression results in a detectable phenotype, such as decreased or increased levels of cell proliferation, angiogenesis, or apoptosis compared to control animals having normal MARK expression. The genetically modified animal may additionally have altered p53 expression (e.g. p53 knockout). Preferred genetically modified animals are mammals such as primates, rodents (preferably mice), cows, horses, goats, sheep, pigs, dogs and cats. Preferred non-mammalian species include zebrafish, C. elegans, and Drosophila. Preferred genetically modified animals are transgenic animals having a heterologous nucleic acid sequence present as an extrachromosomal element in a portion of its cells, i.e. mosaic animals (see, for example, techniques described by Jakobovits, 1994, Curr. Biol. 4:761-763.) or stably integrated into its germ line DNA (i.e., in the genomic sequence of most or all of its cells). Heterologous nucleic acid is introduced into the germ line of such transgenic animals by genetic manipulation of, for example, embryos or embryonic stem cells of the host animal.

[0033] Methods of making transgenic animals are well-known in the art (for transgenic mice see Brinster et al., Proc. Nat. Acad. Sci. USA 82: 4438-4442 (1985), U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al., and Hogan, B., Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1986); for particle bombardment see U.S. Pat. No. 4,945,050, by Sandford et al.; for transgenic Drosophila see Rubin and Spradling, Science (1982) 218:348-53 and U.S. Pat. No. 4,670,388; for transgenic insects see Berghammer A. J. et al., A Universal Marker for Transgenic Insects (1999) Nature 402:370-371; for transgenic Zebrafish see Lin S., Transgenic Zebrafish, Methods Mol Biol. (2000) 136:375-383; for microinjection procedures for fish, amphibian eggs and birds see Houdebine and Chourrout, Experientia (1991) 47:897-905; for transgenic rats see Hammer et al., Cell (1990) 63:1099-1112; and for culturing of embryonic stem (ES) cells and the subsequent production of transgenic animals by the introduction of DNA into ES cells using methods such as electroporation, calcium phosphate/DNA precipitation and direct injection see, e.g., Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed., IRL Press (1987)). Clones of the nonhuman transgenic animals can be produced according to available methods (see Wilmut, I. et al. (1997) Nature 385:810-813; and PCT International Publication Nos. WO 97/07668 and WO 97/07669).

[0034] In one embodiment, the transgenic animal is a "knock-out" animal having a heterozygous or homozygous alteration in the sequence of an endogenous MARK gene that results in a decrease of MARK function, preferably such that MARK expression is undetectable or insignificant. Knock-out animals are typically generated by homologous recombination with a vector comprising a transgene having at least a portion of the gene to be knocked out. Typically a deletion, addition or substitution has been introduced into the transgene to functionally disrupt it. The transgene can be a human gene (e.g., from a human genomic clone) but more preferably is an ortholog of the human gene derived from the transgenic host species. For example, a mouse MARK gene is used to construct a homologous recombination vector suitable for altering an endogenous MARK gene in the mouse genome. Detailed methodologies for homologous recombination in mice are available (see Capecchi, Science (1989) 244:1288-1292; Joyner et al., Nature (1989) 338:153-156). Procedures for the production of non-rodent transgenic mammals and other animals are also available (Houdebine and Chourrout, supra; Pursel et al., Science (1989) 244:1281-1288; Simms et al., Bio/Technology (1988) 6:179-183). In a preferred embodiment, knock-out animals, such as mice harboring a knockout of a specific gene, may be used to produce antibodies against the human counterpart of the gene that has been knocked out (Claesson MH et al., (1994) Scan J Immunol 40:257-264; Declerck P J et al., (1995) J Biol Chem. 270:8397-400).

[0035] In another embodiment, the transgenic animal is a "knock-in" animal having an alteration in its genome that results in altered expression (e.g., increased (including ectopic) or decreased expression) of the MARK gene, e.g., by introduction of additional copies of MARK, or by operatively inserting a regulatory sequence that provides for altered expression of an endogenous copy of the MARK gene. Such regulatory sequences include inducible, tissue-specific, and constitutive promoters and enhancer elements. The knock-in can be homozygous or heterozygous.

[0036] Transgenic nonhuman animals can also be produced that contain selected systems allowing for regulated expression of the transgene. One example of such a system that may be produced is the cre/loxP recombinase system of bacteriophage P1 (Lakso et al., PNAS (1992) 89:6232-6236; U.S. Pat. No. 4,959,317). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of "double" transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355; U.S. Pat. No. 5,654,182). In a preferred embodiment, both Cre-LoxP and Flp-Frt are used in the same system to regulate expression of the transgene, and for sequential deletion of vector sequences in the same cell (Sun X et al (2000) Nat Genet 25:83-6).

[0037] The genetically modified animals can be used in genetic studies to further elucidate the p53 pathway, as animal models of disease and disorders implicating defective p53 function, and for in vivo testing of candidate therapeutic agents, such as those identified in screens described below. The candidate therapeutic agents are administered to a genetically modified animal having altered MARK function and phenotypic changes are compared with appropriate control animals such as genetically modified animals that receive placebo treatment, and/or animals with unaltered MARK expression that receive candidate therapeutic agent.

[0038] In addition to the above-described genetically modified animals having altered MARK function, animal models having defective p53 function (and otherwise normal MARK function), can be used in the methods of the present invention. For example, a p53 knockout mouse can be used to assess, in vivo, the activity of a candidate p53 modulating agent identified in one of the in vitro assays described below. p53 knockout mice are described in the literature (Jacks et al., Nature 2001;410:1111-1116, 1043-1044; Donehower et al., supra). Preferably, the candidate p53 modulating agent when administered to a model system with cells defective in p53 function, produces a detectable phenotypic change in the model system indicating that the p53 function is restored, i.e., the cells exhibit normal cell cycle progression.

Modulating Agents

[0039] The invention provides methods to identify agents that interact with and/or modulate the function of MARK and/or the p53 pathway. Such agents are useful in a variety of diagnostic and therapeutic applications associated with the p53 pathway, as well as in further analysis of the MARK protein and its contribution to the p53 pathway. Accordingly, the invention also provides methods for modulating the p53 pathway comprising the step of specifically modulating MARK activity by administering a MARK-interacting or -modulating agent.

[0040] In a preferred embodiment, MARK-modulating agents inhibit or enhance MARK activity or otherwise affect normal MARK function, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In a further preferred embodiment, the candidate p53 pathway-modulating agent specifically modulates the function of the MARK. The phrases "specific modulating agent", "specifically modulates", etc., are used herein to refer to modulating agents that directly bind to the MARK polypeptide or nucleic acid, and preferably inhibit, enhance, or otherwise alter, the function of the MARK. The term also encompasses modulating agents that alter the interaction of the MARK with a binding partner or substrate (e.g. by binding to a binding partner of a MARK, or to a protein/binding partner complex, and inhibiting function).

[0041] Preferred MARK-modulating agents include small molecule compounds; MARK-interacting proteins, including antibodies and other biotherapeutics; and nucleic acid modulators such as antisense and RNA inhibitors. The modulating agents may be formulated in pharmaceutical compositions, for example, as compositions that may comprise other active ingredients, as in combination therapy, and/or suitable carriers or excipients. Techniques for formulation and administration of the compounds may be found in "Remington's Pharmaceutical Sciences" Mack Publishing Co., Easton, Pa., 19th edition.

Small Molecule Modulators

[0042] Small molecules are often preferred to modulate function of proteins with enzymatic function, and/or containing protein interaction domains. Chemical agents, referred to in the art as "small molecule" compounds, are typically organic, non-peptide molecules, having a molecular weight less than 10,000, preferably less than 5,000, more preferably less than 1,000, and most preferably less than 500. This class of modulators includes chemically synthesized molecules, for instance, compounds from combinatorial chemical libraries. Synthetic compounds may be rationally designed or identified based on known or inferred properties of the MARK protein or may be identified by screening compound libraries. Alternative appropriate modulators of this class are natural products, particularly secondary metabolites from organisms such as plants or fungi, which can also be identified by screening compound libraries for MARK-modulating activity. Methods for generating and obtaining compounds are well known in the art (Schreiber S L, Science (2000) 151: 1964-1969; Radmann J and Gunther J, Science (2000) 151:1947-1948).

[0043] Small molecule modulators identified from screening assays, as described below, can be used as lead compounds from which candidate clinical compounds may be designed, optimized, and synthesized. Such clinical compounds may have utility in treating pathologies associated with the p53 pathway. The activity of candidate small molecule modulating agents may be improved several-fold through iterative secondary functional validation, as further described below, structure determination, and candidate modulator modification and testing. Additionally, candidate clinical compounds are generated with specific regard to clinical and pharmacological properties. For example, the reagents may be derivatized and re-screened using in vitro and in vivo assays to optimize activity and minimize toxicity for pharmaceutical development.

[0044] Protein Modulators

[0045] Specific MARK-interacting proteins are useful in a variety of diagnostic and therapeutic applications related to the p53 pathway and related disorders, as well as in validation assays for other MARK-modulating agents. In a preferred embodiment, MARK-interacting proteins affect normal MARK function, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In another embodiment, MARK-interacting proteins are useful in detecting and providing information about the function of MARK proteins, as is relevant to p53 related disorders, such as cancer (e.g., for diagnostic means).

[0046] A MARK-interacting protein may be endogenous, i.e. one that naturally interacts genetically or biochemically with a MARK, such as a member of the MARK pathway that modulates MARK expression, localization, and/or activity. MARK-modulators include dominant negative forms of MARK-interacting proteins and of MARK proteins themselves. Yeast two-hybrid and variant screens offer preferred methods for identifying endogenous MARK-interacting proteins (Finley, R. L. et al. (1996) in DNA Cloning-Expression Systems: A Practical Approach, eds. Glover D. & Hames B. D (Oxford University Press, Oxford, England), pp. 169-203; Fashena S J et al., Gene (2000) 250:1-14; Drees BL Curr Opin Chem Biol (1999) 3:64-70; Vidal M and Legrain P Nucleic Acids Res (1999) 27:919-29; and U.S. Pat. No. 5,928,868). Mass spectrometry is an alternative preferred method for the elucidation of protein complexes (reviewed in, e.g., Pandey A and Mann M, Nature (2000) 405:837-846; Yates J R 3rd, Trends Genet (2000) 16:5-8).

[0047] A MARK-interacting protein may be an exogenous protein, such as a MARK-specific antibody or a T-cell antigen receptor (see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory; Harlow and Lane (1999) Using antibodies: a laboratory manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press). MARK antibodies are further discussed below.

[0048] In preferred embodiments, a MARK-interacting protein specifically binds a MARK protein. In alternative preferred embodiments, a MARK-modulating agent binds a MARK substrate, binding partner, or cofactor.

Antibodies

[0049] In another embodiment, the protein modulator is a MARK specific antibody agonist or antagonist. The antibodies have therapeutic and diagnostic utilities, and can be used in screening assays to identify MARK modulators. The antibodies can also be used in dissecting the portions of the MARK pathway responsible for various cellular responses and in the general processing and maturation of the MARK.

[0050] Antibodies that specifically bind MARK polypeptides can be generated using known methods. Preferably the antibody is specific to a mammalian ortholog of MARK polypeptide, and more preferably, to human MARK. Antibodies may be polyclonal, monoclonal (mAbs), humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above. Epitopes of MARK which are particularly antigenic can be selected, for example, by routine screening of MARK polypeptides for antigenicity or by applying a theoretical method for selecting antigenic regions of a protein (Hopp and Wood (1981), Proc. Natl. Acad. Sci. U.S.A. 78:3824-28; Hopp and Wood, (1983) Mol. Immunol. 20:483-89; Sutcliffe et al., (1983) Science 219:660-66) to the amino acid sequence shown in any of SEQ ID NOs:24, 25, 26, 27, 28, or 29. Monoclonal antibodies with affinities of 10^8 M^-1, preferably 10^9 M^-1 to 10^10 M^-1, or stronger can be made by standard procedures as described (Harlow and Lane, supra; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed) Academic Press, New York; and U.S. Pat. Nos. 4,381,292; 4,451,570; and 4,618,577). Antibodies may be generated against crude cell extracts of MARK or substantially purified fragments thereof. If MARK fragments are used, they preferably comprise at least 10, and more preferably, at least 20 contiguous amino acids of a MARK protein. In a particular embodiment, MARK-specific antigens and/or immunogens are coupled to carrier proteins that stimulate the immune response. For example, the subject polypeptides are covalently coupled to the keyhole limpet hemocyanin (KLH) carrier, and the conjugate is emulsified in Freund's complete adjuvant, which enhances the immune response. An appropriate immune system such as a laboratory rabbit or mouse is immunized according to conventional protocols.
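
The affinities above are stated as association constants. As a reading aid, the following minimal sketch converts an association constant (M^-1) to the equivalent dissociation constant, using the standard relation Kd = 1/Ka. The function name `ka_to_kd_nanomolar` is hypothetical and is not part of the disclosure.

```python
def ka_to_kd_nanomolar(ka_per_molar: float) -> float:
    """Convert an association constant (M^-1) to a dissociation constant in nM.

    Uses Kd = 1 / Ka; the result is scaled from molar to nanomolar.
    """
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    kd_molar = 1.0 / ka_per_molar
    return kd_molar * 1e9


# Affinity tiers named in the text: 10^8, 10^9 and 10^10 M^-1.
for ka in (1e8, 1e9, 1e10):
    print(f"Ka = {ka:.0e} M^-1 corresponds to Kd = {ka_to_kd_nanomolar(ka):g} nM")
```

On this reading, a higher association constant corresponds to a lower dissociation constant, which is why the text treats 10^10 M^-1 as stronger than 10^8 M^-1.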

[0051] The presence of MARK-specific antibodies is assayed by an appropriate assay such as a solid phase enzyme-linked immunosorbent assay (ELISA) using immobilized corresponding MARK polypeptides. Other assays, such as radioimmunoassays or fluorescent assays might also be used.

[0052] Chimeric antibodies specific to MARK polypeptides can be made that contain different portions from different animal species. For instance, a human immunoglobulin constant region may be linked to a variable region of a murine mAb, such that the antibody derives its biological activity from the human antibody, and its binding specificity from the murine fragment. Chimeric antibodies are produced by splicing together genes that encode the appropriate regions from each species (Morrison et al., Proc. Natl. Acad. Sci. (1984) 81:6851-6855; Neuberger et al., Nature (1984) 312:604-608; Takeda et al., Nature (1985) 314:452-454). Humanized antibodies, which are a form of chimeric antibodies, can be generated by grafting complementarity-determining regions (CDRs) (Carlos, T. M., J. M. Harlan. 1994. Blood 84:2068-2101) of mouse antibodies into a background of human framework regions and constant regions by recombinant DNA technology (Riechmann L M, et al., 1988 Nature 332: 323-327). Humanized antibodies contain 10% murine sequences and 90% human sequences, and thus further reduce or eliminate immunogenicity, while retaining the antibody specificities (Co MS, and Queen C. 1991 Nature 351: 501-502; Morrison S L. 1992 Ann. Rev. Immun. 10:239-265). Humanized antibodies and methods of their production are well-known in the art (U.S. Pat. Nos. 5,530,101, 5,585,089, 5,693,762, and 6,180,370).

[0053] MARK-specific single chain antibodies which are recombinant, single chain polypeptides formed by linking the heavy and light chain fragments of the Fv regions via an amino acid bridge, can be produced by methods known in the art (U.S. Pat. No. 4,946,778; Bird, Science (1988) 242:423-426; Huston et al., Proc. Natl. Acad. Sci. USA (1988) 85:5879-5883; and Ward et al., Nature (1989) 334:544-546).

[0054] Other suitable techniques for antibody production involve in vitro exposure of lymphocytes to the antigenic polypeptides or alternatively to selection of libraries of antibodies in phage or similar vectors (Huse et al., Science (1989) 246:1275-1281). As used herein, T-cell antigen receptors are included within the scope of antibody modulators (Harlow and Lane, 1988, supra).

[0055] The polypeptides and antibodies of the present invention may be used with or without modification. Frequently, antibodies will be labeled by joining, either covalently or non-covalently, a substance that provides for a detectable signal, or that is toxic to cells that express the targeted protein (Menard S, et al., Int J. Biol Markers (1989) 4:131-134). A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, fluorescent emitting lanthanide metals, chemiluminescent moieties, bioluminescent moieties, magnetic particles, and the like (U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241). Also, recombinant immunoglobulins may be produced (U.S. Pat. No. 4,816,567). Antibodies to cytoplasmic polypeptides may be delivered and reach their targets by conjugation with membrane-penetrating toxin proteins (U.S. Pat. No. 6,086,900).

[0056] When used therapeutically in a patient, the antibodies of the subject invention are typically administered parenterally, when possible at the target site, or intravenously. The therapeutically effective dose and dosage regimen is determined by clinical studies. Typically, the amount of antibody administered is in the range of about 0.1 mg/kg to about 10 mg/kg of patient weight. For parenteral administration, the antibodies are formulated in a unit dosage injectable form (e.g., solution, suspension, emulsion) in association with a pharmaceutically acceptable vehicle. Such vehicles are inherently nontoxic and non-therapeutic. Examples are water, saline, Ringer's solution, dextrose solution, and 5% human serum albumin. Nonaqueous vehicles such as fixed oils, ethyl oleate, or liposome carriers may also be used. The vehicle may contain minor amounts of additives, such as buffers and preservatives, which enhance isotonicity and chemical stability or otherwise enhance therapeutic potential. The antibodies' concentrations in such vehicles are typically in the range of about 1 mg/ml to about 10 mg/ml. Immunotherapeutic methods are further described in the literature (U.S. Pat. No. 5,859,206; WO0073469).
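
The dose and concentration ranges above relate to an injected volume by simple unit arithmetic. The sketch below illustrates only that arithmetic; the function name `injection_volume_ml` and the 70 kg body weight are hypothetical, and the actual dose and regimen are, as stated above, determined by clinical studies.

```python
def injection_volume_ml(weight_kg: float, dose_mg_per_kg: float,
                        conc_mg_per_ml: float) -> float:
    """Volume (ml) of formulated antibody needed to deliver a weight-based dose.

    total dose (mg) = weight (kg) * dose (mg/kg)
    volume (ml)     = total dose (mg) / concentration (mg/ml)
    """
    if min(weight_kg, dose_mg_per_kg, conc_mg_per_ml) <= 0:
        raise ValueError("all inputs must be positive")
    total_dose_mg = weight_kg * dose_mg_per_kg
    return total_dose_mg / conc_mg_per_ml


# Hypothetical 70 kg subject, 1 mg/kg dose, 10 mg/ml formulation.
volume = injection_volume_ml(weight_kg=70, dose_mg_per_kg=1, conc_mg_per_ml=10)
print(f"{volume} ml")
```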

Nucleic Acid Modulators

[0057] Other preferred MARK-modulating agents comprise nucleic acid molecules, such as antisense oligomers or double stranded RNA (dsRNA), which generally inhibit MARK activity. Preferred nucleic acid modulators interfere with the function of the MARK nucleic acid such as DNA replication, transcription, translocation of the MARK RNA to the site of protein translation, translation of protein from the MARK RNA, splicing of the MARK RNA to yield one or more mRNA species, or catalytic activity which may be engaged in or facilitated by the MARK RNA.

[0058] In one embodiment, the antisense oligomer is an oligonucleotide that is sufficiently complementary to a MARK mRNA to bind to and prevent translation, preferably by binding to the 5' untranslated region. MARK-specific antisense oligonucleotides preferably range from at least 6 to about 200 nucleotides. In some embodiments the oligonucleotide is preferably at least 10, 15, or 20 nucleotides in length. In other embodiments, the oligonucleotide is preferably less than 50, 40, or 30 nucleotides in length. The oligonucleotide can be DNA or RNA or a chimeric mixture or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone. The oligonucleotide may include other appending groups such as peptides, agents that facilitate transport across the cell membrane, hybridization-triggered cleavage agents, and intercalating agents.
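
Complementarity to a target region of an mRNA can be illustrated by computing the reverse complement of that region. The following is a minimal sketch, assuming an unmodified RNA oligonucleotide and perfect Watson-Crick pairing; the function name `antisense_oligo` and the example sequence are hypothetical and are not taken from any MARK sequence. The length check mirrors the range of 6 to about 200 nucleotides given above.

```python
# Watson-Crick complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_oligo(mrna: str, start: int, length: int) -> str:
    """Return the RNA antisense oligo (written 5'->3') for mrna[start:start+length].

    DNA-style input is accepted: any T is read as U.
    """
    if not 6 <= length <= 200:
        raise ValueError("length outside the 6 to 200 nucleotide range")
    sequence = mrna.upper().replace("T", "U")
    target = sequence[start:start + length]
    if len(target) != length:
        raise ValueError("target region runs past the end of the sequence")
    if set(target) - set("ACGU"):
        raise ValueError("sequence contains characters other than A, C, G, U/T")
    # Complement each base, then reverse so the oligo reads 5'->3'.
    return target.translate(COMPLEMENT)[::-1]


# Hypothetical 5' untranslated region fragment, for illustration only.
utr = "GGCAUCGGAGCUUACGGAUCC"
print(antisense_oligo(utr, start=0, length=20))
```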

[0059] In another embodiment, the antisense oligomer is a phosphorodiamidate morpholino oligomer (PMO). PMOs are assembled from four different morpholino subunits, each of which contains one of four genetic bases (A, C, G, or T) linked to a six-membered morpholine ring. Polymers of these subunits are joined by non-ionic phosphorodiamidate intersubunit linkages. Details of how to make and use PMOs and other antisense oligomers are well known in the art (e.g. see WO99/18193; Probst J C, Antisense Oligodeoxynucleotide and Ribozyme Design, Methods. (2000) 22(3):271-281; Summerton J, and Weller D. 1997 Antisense Nucleic Acid Drug Dev. 7:187-95; U.S. Pat. No. 5,235,033; and U.S. Pat. No. 5,378,841).

[0060] Alternative preferred MARK nucleic acid modulators are double-stranded RNA species mediating RNA interference (RNAi). RNAi is the process of sequence-specific, post-transcriptional gene silencing in animals and plants, initiated by double-stranded RNA (dsRNA) that is homologous in sequence to the silenced gene. Methods relating to the use of RNAi to silence genes in C. elegans, Drosophila, plants, and humans are known in the art (Fire A, et al., 1998 Nature 391:806-811; Fire, A. Trends Genet. 15, 358-363 (1999); Sharp, P. A. RNA interference 2001. Genes Dev. 15, 485-490 (2001); Hammond, S. M., et al., Nature Rev. Genet. 2, 110-119 (2001); Tuschl, T. Chem. Biochem. 2, 239-245 (2001); Hamilton, A. et al., Science 286, 950-952 (1999); Hammond, S. M., et al., Nature 404, 293-296 (2000); Zamore, P. D., et al., Cell 101, 25-33 (2000); Bernstein, E., et al., Nature 409, 363-366 (2001); Elbashir, S. M., et al., Genes Dev. 15, 188-200 (2001); WO129058; WO9932619; Elbashir S M, et al., 2001 Nature 411:494-498).

[0061] Nucleic acid modulators are commonly used as research reagents, diagnostics, and therapeutics. For example, antisense oligonucleotides, which are able to inhibit gene expression with exquisite specificity, are often used to elucidate the function of particular genes (see, for example, U.S. Pat. No. 6,165,790). Nucleic acid modulators are also used, for example, to distinguish between functions of various members of a biological pathway. For example, antisense oligomers have been employed as therapeutic moieties in the treatment of disease states in animals and man and have been demonstrated in numerous clinical trials to be safe and effective (Milligan J F, et al, Current Concepts in Antisense Drug Design, J Med Chem. (1993) 36:1923-1937; Tonkinson J L et al., Antisense Oligodeoxynucleotides as Clinical Therapeutic Agents, Cancer Invest. (1996) 14:54-65). Accordingly, in one aspect of the invention, a MARK-specific nucleic acid modulator is used in an assay to further elucidate the role of the MARK in the p53 pathway, and/or its relationship to other members of the pathway. In another aspect of the invention, a MARK-specific antisense oligomer is used as a therapeutic agent for treatment of p53-related disease states.

Assay Systems

[0062] The invention provides assay systems and screening methods for identifying specific modulators of MARK activity. As used herein, an "assay system" encompasses all the components required for performing and analyzing results of an assay that detects and/or measures a particular event. In general, primary assays are used to identify or confirm a modulator's specific biochemical or molecular effect with respect to the MARK nucleic acid or protein. In general, secondary assays further assess the activity of a MARK modulating agent identified by a primary assay and may confirm that the modulating agent affects MARK in a manner relevant to the p53 pathway. In some cases, MARK modulators will be directly tested in a secondary assay.

[0063] In a preferred embodiment, the screening method comprises contacting a suitable assay system comprising a MARK polypeptide with a candidate agent under conditions whereby, but for the presence of the agent, the system provides a reference activity (e.g. kinase activity), which is based on the particular molecular event the screening method detects. A statistically significant difference between the agent-biased activity and the reference activity indicates that the candidate agent modulates MARK activity, and hence the p53 pathway.
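
The text does not specify how statistical significance is assessed. As one possible illustration, the sketch below applies an exact two-sided permutation test to replicate activity readings with and without the candidate agent. The choice of test, the function name `permutation_p_value`, and the replicate values are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(reference, treated):
    """Exact two-sided permutation test on the difference of group means.

    Every way of splitting the pooled readings into groups of the original
    sizes is enumerated; the p-value is the fraction of splits whose absolute
    mean difference is at least as large as the observed one.
    """
    pooled = list(reference) + list(treated)
    n_ref = len(reference)
    n_trt = len(pooled) - n_ref
    observed = abs(mean(treated) - mean(reference))
    total = sum(pooled)
    extreme = 0
    splits = 0
    for idx in combinations(range(len(pooled)), n_ref):
        ref_sum = sum(pooled[i] for i in idx)
        diff = abs((total - ref_sum) / n_trt - ref_sum / n_ref)
        if diff >= observed - 1e-9:  # tolerance for floating-point rounding
            extreme += 1
        splits += 1
    return extreme / splits


# Hypothetical kinase activity readings (arbitrary units), five replicates each.
reference_activity = [100, 98, 103, 101, 99]   # no candidate agent
agent_activity = [60, 64, 58, 62, 61]          # with candidate agent
p = permutation_p_value(reference_activity, agent_activity)
print(f"p = {p:.4f}")
```

Exhaustive enumeration is practical only for small replicate counts; a larger screen would more plausibly use a sampled permutation test or a parametric test.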

Primary Assays

[0064] The type of modulator tested generally determines the type of primary assay.

Primary Assays for Small Molecule Modulators

[0065] For small molecule modulators, screening assays are used to identify candidate modulators. Screening assays may be cell-based or may use a cell-free system that recreates or retains the relevant biochemical reaction of the target protein (reviewed in Sittampalam G S et al., Curr Opin Chem Biol (1997) 1:384-91 and accompanying references). As used herein the term "cell-based" refers to assays using live cells, dead cells, or a particular cellular fraction, such as a membrane, endoplasmic reticulum, or mitochondrial fraction. The term "cell free" encompasses assays using substantially purified protein (either endogenous or recombinantly produced), partially purified or crude cellular extracts. Screening assays may detect a variety of molecular events, including protein-DNA interactions, protein-protein interactions (e.g., receptor-ligand binding), transcriptional activity (e.g., using a reporter gene), enzymatic activity (e.g., via a property of the substrate), activity of second messengers, immunogenicity and changes in cellular morphology or other cellular characteristics. Appropriate screening assays may use a wide range of detection methods including fluorescent, radioactive, colorimetric, spectrophotometric, and amperometric methods, to provide a read-out for the particular molecular event detected.

[0066] Cell-based screening assays usually require systems for recombinant expression of MARK and any auxiliary proteins demanded by the particular assay. Appropriate methods for generating recombinant proteins produce sufficient quantities of proteins that retain their relevant biological activities and are of sufficient purity to optimize activity and assure assay reproducibility. Yeast two-hybrid and variant screens, and mass spectrometry provide preferred methods for determining protein-protein interactions and elucidation of protein complexes. In certain applications, when MARK-interacting proteins are used in screens to identify small molecule modulators, the binding specificity of the interacting protein to the MARK protein may be assayed by various known methods such as substrate processing (e.g. ability of the candidate MARK-specific binding agents to function as negative effectors in MARK-expressing cells), binding equilibrium constants (usually at least about 10^7 M^-1, preferably at least about 10^8 M^-1, more preferably at least about 10^9 M^-1), and immunogenicity (e.g. ability to elicit MARK specific antibody in a heterologous host such as a mouse, rat, goat or rabbit). For enzymes and receptors, binding may be assayed by, respectively, substrate and ligand processing.

[0067] The screening assay may measure a candidate agent's ability to specifically bind to or modulate activity of a MARK polypeptide, a fusion protein thereof, or to cells or membranes bearing the polypeptide or fusion protein. The MARK polypeptide can be full length or a fragment thereof that retains functional MARK activity. The MARK polypeptide may be fused to another polypeptide, such as a peptide tag for detection or anchoring, or to another tag. The MARK polypeptide is preferably human MARK, or is an ortholog or derivative thereof as described above. In a preferred embodiment, the screening assay detects candidate agent-based modulation of MARK interaction with a binding target, such as an endogenous or exogenous protein or other substrate that has MARK-specific binding activity, and can be used to assess normal MARK gene function.

[0068] Suitable assay formats that may be adapted to screen for MARK modulators are known in the art. Preferred screening assays are high throughput or ultra high throughput and thus provide automated, cost-effective means of screening compound libraries for lead compounds (Fernandes P B, Curr Opin Chem Biol (1998) 2:597-603; Sundberg S A, Curr Opin Biotechnol 2000, 11:47-53). In one preferred embodiment, screening assays use fluorescence technologies, including fluorescence polarization, time-resolved fluorescence, and fluorescence resonance energy transfer. These systems offer means to monitor protein-protein or DNA-protein interactions in which the intensity of the signal emitted from dye-labeled molecules depends upon their interactions with partner molecules (e.g., Selvin P R, Nat Struct Biol (2000) 7:730-4; Fernandes P B, supra; Hertzberg R P and Pope A J, Curr Opin Chem Biol (2000) 4:445-451).

[0069] A variety of suitable assay systems may be used to identify candidate MARK and p53 pathway modulators (e.g. U.S. Pat. No. 6,165,992 (kinase assays); U.S. Pat. Nos. 5,550,019 and 6,133,437 (apoptosis assays); U.S. Pat. No. 6,020,135 (p53 modulation), among others). Specific preferred assays are described in more detail below.

[0070] Kinase assays. In some preferred embodiments the screening assay detects the ability of the test agent to modulate the kinase activity of a MARK polypeptide. In further embodiments, a cell-free kinase assay system is used to identify a candidate p53 modulating agent, and a secondary, cell-based assay, such as an apoptosis or hypoxic induction assay (described below), may be used to further characterize the candidate p53 modulating agent. Many different assays for kinases have been reported in the literature and are well known to those skilled in the art (e.g. U.S. Pat. No. 6,165,992; Zhu et al., Nature Genetics (2000) 26:283-289; and WO0073469). Radioassays, which monitor the transfer of a gamma phosphate, are frequently used. For instance, a scintillation assay for p56 (lck) kinase activity monitors the transfer of the gamma phosphate from gamma-³³P ATP to a biotinylated peptide substrate; the substrate is captured on a streptavidin coated bead that transmits the signal (Beveridge M et al., J Biomol Screen (2000) 5:205-212). This assay uses the scintillation proximity assay (SPA), in which only radio-ligand bound to receptors tethered to the surface of an SPA bead is detected by the scintillant immobilized within it, allowing binding to be measured without separation of bound from free ligand.

[0071] Other assays for protein kinase activity may use antibodies that specifically recognize phosphorylated substrates. For instance, the kinase receptor activation (KIRA) assay measures receptor tyrosine kinase activity by ligand stimulating the intact receptor in cultured cells, then capturing solubilized receptor with specific antibodies and quantifying phosphorylation via phosphotyrosine ELISA (Sadick M D, Dev Biol Stand (1999) 97:121-133).

[0072] Another example of antibody based assays for protein kinase activity is TRF (time-resolved fluorometry). This method utilizes europium chelate-labeled anti-phosphotyrosine antibodies to detect phosphate transfer to a polymeric substrate coated onto microtiter plate wells. The amount of phosphorylation is then detected using time-resolved, dissociation-enhanced fluorescence (Braunwalder AF, et al., Anal Biochem 1996 Jul 1;238(2):159-64).

[0073] Apoptosis assays. Assays for apoptosis may be performed by terminal deoxynucleotidyl transferase-mediated digoxigenin-11-dUTP nick end labeling (TUNEL) assay. The TUNEL assay is used to measure nuclear DNA fragmentation characteristic of apoptosis (Lazebnik et al., 1994, Nature 371, 346), by following the incorporation of fluorescein-dUTP (Yonehara et al., 1989, J. Exp. Med. 169, 1747). Apoptosis may further be assayed by acridine orange staining of tissue culture cells (Lucas, R., et al., 1998, Blood 15:4730-41). An apoptosis assay system may comprise a cell that expresses a MARK, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the apoptosis assay system, and changes in induction of apoptosis relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, an apoptosis assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using a cell-free assay system. An apoptosis assay may also be used to test whether MARK function plays a direct role in apoptosis. For example, an apoptosis assay may be performed on cells that over- or under-express MARK relative to wild type cells. Differences in apoptotic response compared to wild type cells suggest that the MARK plays a direct role in the apoptotic response. Apoptosis assays are described further in U.S. Pat. No. 6,133,437.

[0074] Cell proliferation and cell cycle assays. Cell proliferation may be assayed via bromodeoxyuridine (BRDU) incorporation. This assay identifies a cell population undergoing DNA synthesis by incorporation of BRDU into newly-synthesized DNA. Newly-synthesized DNA may then be detected using an anti-BRDU antibody (Hoshino et al., 1986, Int. J. Cancer 38, 369; Campana et al., 1988, J. Immunol. Meth. 107, 79), or by other means.

[0075] Cell proliferation may also be examined using [³H]-thymidine incorporation (Chen, J., 1996, Oncogene 13:1395-403; Jeoung, J., 1995, J. Biol. Chem. 270:18367-73). This assay allows for quantitative characterization of S-phase DNA synthesis. In this assay, cells synthesizing DNA will incorporate [³H]-thymidine into newly synthesized DNA. Incorporation can then be measured by standard techniques such as by counting of radioisotope in a scintillation counter (e.g., Beckman LS 3800 Liquid Scintillation Counter).

[0076] Cell proliferation may also be assayed by colony formation in soft agar (Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). For example, cells transformed with MARK are seeded in soft agar plates, and colonies are measured and counted after two weeks of incubation.

[0077] Involvement of a gene in the cell cycle may be assayed by flow cytometry (Gray J W et al. (1986) Int J Radiat Biol Relat Stud Phys Chem Med 49:237-55). Cells transfected with a MARK may be stained with propidium iodide and evaluated in a flow cytometer (available from Becton Dickinson).

[0078] Accordingly, a cell proliferation or cell cycle assay system may comprise a cell that expresses a MARK, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the assay system, and changes in cell proliferation or cell cycle relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the cell proliferation or cell cycle assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system such as a cell-free kinase assay system. A cell proliferation assay may also be used to test whether MARK function plays a direct role in cell proliferation or cell cycle. For example, a cell proliferation or cell cycle assay may be performed on cells that over- or under-express MARK relative to wild type cells. Differences in proliferation or cell cycle compared to wild type cells suggest that the MARK plays a direct role in cell proliferation or cell cycle.

[0079] Angiogenesis. Angiogenesis may be assayed using various human endothelial cell systems, such as umbilical vein, coronary artery, or dermal cells. Suitable assays include Alamar Blue based assays (available from Biosource International) to measure proliferation; migration assays using fluorescent molecules, such as the use of Becton Dickinson Falcon HTS FluoroBlock cell culture inserts to measure migration of cells through membranes in the presence or absence of angiogenesis enhancers or suppressors; and tubule formation assays based on the formation of tubular structures by endothelial cells on Matrigel® (Becton Dickinson). Accordingly, an angiogenesis assay system may comprise a cell that expresses a MARK, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the angiogenesis assay system, and changes in angiogenesis relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the angiogenesis assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system. An angiogenesis assay may also be used to test whether MARK function plays a direct role in angiogenesis. For example, an angiogenesis assay may be performed on cells that over- or under-express MARK relative to wild type cells. Differences in angiogenesis compared to wild type cells suggest that the MARK plays a direct role in angiogenesis.

[0080] Hypoxic induction. The alpha subunit of the transcription factor, hypoxia inducible factor-1 (HIF-1), is upregulated in tumor cells following exposure to hypoxia in vitro. Under hypoxic conditions, HIF-1 stimulates the expression of genes known to be important in tumor cell survival, such as those encoding glycolytic enzymes and VEGF. Induction of such genes by hypoxic conditions may be assayed by growing cells transfected with MARK in hypoxic conditions (such as with 0.1% O₂, 5% CO₂, and balance N₂, generated in a Napco 7001 incubator (Precision Scientific)) and normoxic conditions, followed by assessment of gene activity or expression by TaqMan®. For example, a hypoxic induction assay system may comprise a cell that expresses a MARK, and that optionally has a mutated p53 (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the hypoxic induction assay system, and changes in hypoxic response relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the hypoxic induction assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system. A hypoxic induction assay may also be used to test whether MARK function plays a direct role in the hypoxic response. For example, a hypoxic induction assay may be performed on cells that over- or under-express MARK relative to wild type cells. Differences in hypoxic response compared to wild type cells suggest that the MARK plays a direct role in hypoxic induction.

[0081] Cell adhesion. Cell adhesion assays measure adhesion of cells to purified adhesion proteins, or adhesion of cells to each other, in the presence or absence of candidate modulating agents. Cell-protein adhesion assays measure the ability of agents to modulate the adhesion of cells to purified proteins. For example, recombinant proteins are produced, diluted to 2.5 µg/mL in PBS, and used to coat the wells of a microtiter plate. The wells used for negative control are not coated. Coated wells are then washed, blocked with 1% BSA, and washed again. Compounds are diluted to 2× final test concentration and added to the blocked, coated wells. Cells are then added to the wells, and the unbound cells are washed off. Retained cells are labeled directly on the plate by adding a membrane-permeable fluorescent dye, such as calcein-AM, and the signal is quantified in a fluorescence microplate reader.

[0082] Cell-cell adhesion assays measure the ability of agents to modulate binding of cell adhesion proteins with their native ligands. These assays use cells that naturally or recombinantly express the adhesion protein of choice. In an exemplary assay, cells expressing the cell adhesion protein are plated in wells of a multiwell plate. Cells expressing the ligand are labeled with a membrane-permeable fluorescent dye, such as BCECF, and allowed to adhere to the monolayers in the presence of candidate agents. Unbound cells are washed off, and bound cells are detected using a fluorescence plate reader.

[0083] High-throughput cell adhesion assays have also been described. In one such assay, small molecule ligands and peptides are bound to the surface of microscope slides using a microarray spotter, intact cells are then contacted with the slides, and unbound cells are washed off. In this assay, not only is the binding specificity of the peptides and modulators against cell lines determined, but the functional cell signaling of attached cells is also measured using immunofluorescence techniques in situ on the microchip (Falsey J R et al., Bioconjug Chem. 2001 May-Jun;12(3):346-53).

[0084] Primary Assays for Antibody Modulators

[0085] For antibody modulators, an appropriate primary assay is a binding assay that tests the antibody's affinity to and specificity for the MARK protein. Methods for testing antibody affinity and specificity are well known in the art (Harlow and Lane, 1988, 1999, supra). The enzyme-linked immunosorbent assay (ELISA) is a preferred method for detecting MARK-specific antibodies; others include FACS assays, radioimmunoassays, and fluorescent assays.

[0086] Primary Assays for Nucleic Acid Modulators

[0087] For nucleic acid modulators, primary assays may test the ability of the nucleic acid modulator to inhibit or enhance MARK gene expression, preferably mRNA expression. In general, expression analysis comprises comparing MARK expression in like populations of cells (e.g., two pools of cells that endogenously or recombinantly express MARK) in the presence and absence of the nucleic acid modulator. Methods for analyzing mRNA and protein expression are well known in the art. For instance, Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR (e.g., using TaqMan®, PE Applied Biosystems), or microarray analysis may be used to confirm that MARK mRNA expression is reduced in cells treated with the nucleic acid modulator (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm D H and Guiseppi-Elie A, Curr Opin Biotechnol 2001, 12:41-47). Protein expression may also be monitored. Proteins are most commonly detected with specific antibodies or antisera directed against either the MARK protein or specific peptides. A variety of means, including Western blotting, ELISA, or in situ detection, are available (Harlow E and Lane D, 1988 and 1999, supra).

[0088] Secondary Assays

[0089] Secondary assays may be used to further assess the activity of a MARK-modulating agent identified by any of the above methods to confirm that the modulating agent affects MARK in a manner relevant to the p53 pathway. As used herein, MARK-modulating agents encompass candidate clinical compounds or other agents derived from previously identified modulating agents. Secondary assays can also be used to test the activity of a modulating agent on a particular genetic or biochemical pathway or to test the specificity of the modulating agent's interaction with MARK.

[0090] Secondary assays generally compare like populations of cells or animals (e.g., two pools of cells or animals that endogenously or recombinantly express MARK) in the presence and absence of the candidate modulator. In general, such assays test whether treatment of cells or animals with a candidate MARK-modulating agent results in changes in the p53 pathway in comparison to untreated (or mock- or placebo-treated) cells or animals. Certain assays use "sensitized genetic backgrounds", which, as used herein, describe cells or animals engineered for altered expression of genes in the p53 or interacting pathways.

[0091] Cell-Based Assays

[0092] Cell based assays may use a variety of mammalian cell lines known to have defective p53 function (e.g. SAOS-2 osteoblasts, H1299 lung cancer cells, C33A and HT3 cervical cancer cells, HT-29 and DLD-1 colon cancer cells, among others, available from American Type Culture Collection (ATCC), Manassas, Va.). Cell based assays may detect endogenous p53 pathway activity or may rely on recombinant expression of p53 pathway components. Any of the aforementioned assays may be used in this cell-based format. Candidate modulators are typically added to the cell media but may also be injected into cells or delivered by any other efficacious means.

Animal Assays

[0093] A variety of non-human animal models of normal or defective p53 pathway may be used to test candidate MARK modulators. Models for defective p53 pathway typically use genetically modified animals that have been engineered to mis-express (e.g., over-express or lack expression in) genes involved in the p53 pathway. Assays generally require systemic delivery of the candidate modulators, such as by oral administration, injection, etc.

[0094] In a preferred embodiment, p53 pathway activity is assessed by monitoring neovascularization and angiogenesis. Animal models with defective and normal p53 are used to test the candidate modulator's effect on MARK in Matrigel® assays. Matrigel® is an extract of basement membrane proteins, and is composed primarily of laminin, collagen IV, and heparan sulfate proteoglycan. It is provided as a sterile liquid at 4°C, but rapidly forms a solid gel at 37°C. Liquid Matrigel® is mixed with various angiogenic agents, such as bFGF and VEGF, or with human tumor cells which over-express the MARK. The mixture is then injected subcutaneously (SC) into female athymic nude mice (Taconic, Germantown, N.Y.) to support an intense vascular response. Mice with Matrigel® pellets may be dosed via oral (PO), intraperitoneal (IP), or intravenous (IV) routes with the candidate modulator. Mice are euthanized 5-12 days post-injection, and the Matrigel® pellet is harvested for hemoglobin analysis (Sigma plasma hemoglobin kit). Hemoglobin content of the gel is found to correlate with the degree of neovascularization in the gel.

[0095] In another preferred embodiment, the effect of the candidate modulator on MARK is assessed via tumorigenicity assays. In one example, xenograft human tumors are implanted SC into female athymic mice, 6-7 weeks old, as single cell suspensions either from a pre-existing tumor or from in vitro culture. The tumors which express the MARK endogenously are injected in the flank, 1×10⁵ to 1×10⁷ cells per mouse in a volume of 100 µL using a 27 gauge needle. Mice are then ear tagged and tumors are measured twice weekly. Candidate modulator treatment is initiated on the day the mean tumor weight reaches 100 mg. Candidate modulator is delivered IV, SC, IP, or PO by bolus administration. Depending upon the pharmacokinetics of each unique candidate modulator, dosing can be performed multiple times per day. The tumor weight is assessed by measuring perpendicular diameters with a caliper and calculated by multiplying the measurements of diameters in two dimensions. At the end of the experiment, the excised tumors may be utilized for biomarker identification or further analyses. For immunohistochemistry staining, xenograft tumors are fixed in 4% paraformaldehyde, 0.1M phosphate, pH 7.2, for 6 hours at 4°C, immersed in 30% sucrose in PBS, and rapidly frozen in isopentane cooled with liquid nitrogen.

Diagnostic and Therapeutic Uses

[0096] Specific MARK-modulating agents are useful in a variety of diagnostic and therapeutic applications where disease or disease prognosis is related to defects in the p53 pathway, such as angiogenic, apoptotic, or cell proliferation disorders. Accordingly, the invention also provides methods for modulating the p53 pathway in a cell, preferably a cell pre-determined to have defective p53 function, comprising the step of administering an agent to the cell that specifically modulates MARK activity. Preferably, the modulating agent produces a detectable phenotypic change in the cell indicating that the p53 function is restored; for example, the cell undergoes normal proliferation or progression through the cell cycle.

[0097] The discovery that MARK is implicated in the p53 pathway provides for a variety of methods that can be employed for the diagnostic and prognostic evaluation of diseases and disorders involving defects in the p53 pathway and for the identification of subjects having a predisposition to such diseases and disorders.

[0098] Various expression analysis methods can be used to diagnose whether MARK expression occurs in a particular sample, including Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR, and microarray analysis (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm and Guiseppi-Elie, Curr Opin Biotechnol 2001, 12:41-47). Tissues having a disease or disorder implicating defective p53 signaling that express a MARK are identified as amenable to treatment with a MARK modulating agent. In a preferred application, the p53 defective tissue overexpresses a MARK relative to normal tissue. For example, a Northern blot analysis of mRNA from tumor and normal cell lines, or from tumor and matching normal tissue samples from the same patient, using full or partial MARK cDNA sequences as probes, can determine whether particular tumors express or overexpress MARK. Alternatively, TaqMan® is used for quantitative RT-PCR analysis of MARK expression in cell lines, normal tissues and tumor samples (PE Applied Biosystems).

[0099] Various other diagnostic methods may be performed, for example, utilizing reagents such as the MARK oligonucleotides, and antibodies directed against a MARK, as described above for: (1) the detection of the presence of MARK gene mutations, or the detection of either over- or under-expression of MARK mRNA relative to the non-disorder state; (2) the detection of either an over- or an under-abundance of MARK gene product relative to the non-disorder state; and (3) the detection of perturbations or abnormalities in the signal transduction pathway mediated by MARK.

[0100] Thus, in a specific embodiment, the invention is drawn to a method for diagnosing a disease in a patient, the method comprising: a) obtaining a biological sample from the patient; b) contacting the sample with a probe for MARK expression; c) comparing results from step (b) with a control; and d) determining whether step (c) indicates a likelihood of disease. Preferably, the disease is cancer, most preferably a cancer as shown in TABLE 1. The probe may be either DNA or protein, including an antibody.

EXAMPLES

[0101] The following experimental section and examples are offered by way of illustration and not by way of limitation.

[0102] I. Drosophila p53 screen

[0103] The Drosophila p53 gene was overexpressed specifically in the wing using the vestigial margin quadrant enhancer. Increasing quantities of Drosophila p53 (titrated using different strength transgenic inserts in 1 or 2 copies) caused deterioration of normal wing morphology from mild to strong, with phenotypes including disruption of pattern and polarity of wing hairs, shortening and thickening of wing veins, progressive crumpling of the wing and appearance of dark "death" inclusions in the wing blade. In a screen designed to identify enhancers and suppressors of Drosophila p53, homozygous females carrying two copies of p53 were crossed to 5663 males carrying random insertions of a piggyBac transposon (Fraser M et al., Virology (1985) 145:356-361). Progeny containing insertions were compared to non-insertion-bearing sibling progeny for enhancement or suppression of the p53 phenotypes. Sequence information surrounding the piggyBac insertion site was used to identify the modifier genes. Modifiers of the wing phenotype were identified as members of the p53 pathway. kp78a was a suppressor of the wing phenotype. Human orthologs of the modifiers are referred to herein as MARK.

[0104] BLAST analysis (Altschul et al., supra) was employed to identify Targets from Drosophila modifiers. For example, representative sequences from MARK GI# 9845487 (SEQ ID NO:24), GI# 8923922 (SEQ ID NO:25), GI# 4505103 (SEQ ID NO:27), and GI# 13899225 (SEQ ID NO:29) share 43%, 65%, 65% and 45% amino acid identity, respectively, with the Drosophila kp78a.
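For illustration only, percent amino acid identity of the kind reported above can be computed from a pairwise alignment as the number of identical aligned residues divided by the alignment length, which is the convention BLAST uses when it reports identities. The sketch below assumes the two sequences are already aligned with "-" as the gap character; the function name and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pairwise alignment.

    Both inputs must be the same length, with '-' marking gaps.
    Identity = identical residue columns / total alignment columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    # A column counts as a match only if both residues agree and are not gaps
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# Toy alignment (hypothetical): 5 identical columns out of 7
print(round(percent_identity("MKT-LLA", "MRTALLA"), 1))
```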

[0105] Various domains, signals, and functional subunits in proteins were analyzed using the PSORT (Nakai K., and Horton P., Trends Biochem Sci, 1999, 24:34-6; Kenta Nakai, Protein sorting signals and prediction of subcellular localization, Adv. Protein Chem. 54, 277-344 (2000)), PFAM (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2; http://pfam.wustl.edu), SMART (Ponting C P, et al., SMART: identification and annotation of domains from signaling and extracellular protein sequences. Nucleic Acids Res. 1999 Jan 1;27(1):229-32), TM-HMM (Erik L. L. Sonnhammer, Gunnar von Heijne, and Anders Krogh: A hidden Markov model for predicting transmembrane helices in protein sequences. In Proc. of Sixth Int. Conf. on Intelligent Systems for Molecular Biology, 175-182 Ed J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen Menlo Park, Calif.: AAAI Press, 1998), and clust (Remm M, and Sonnhammer E. Classification of transmembrane protein families in the Caenorhabditis elegans genome and identification of human orthologs. Genome Res. 2000 Nov; 10(11): 1679-89) programs. For example, the protein kinase domains of MARKs from GI#s 9845487 (SEQ ID NO:24), 8923922 (SEQ ID NO:25), 4505103 (SEQ ID NO:27), and 13899225 (SEQ ID NO:29) are located at approximately amino acid residues 20 to 271, 60 to 311, 56 to 307, and 59 to 310, respectively (PFAM 00069). Further, the ubiquitin associated (UBA/TS-N) domains of MARKs from GI#s 9845487 (SEQ ID NO:24), 8923922 (SEQ ID NO:25), 4505103 (SEQ ID NO:27), and 13899225 (SEQ ID NO:29) are located at approximately amino acid residues 291 to 330, 331 to 370, 327 to 366, and 330 to 369, respectively (PFAM 00627). Still further, the kinase associated domains from MARKs of GI#s 9845487 (SEQ ID NO:24), 8923922 (SEQ ID NO:25), and 4505103 (SEQ ID NO:27) are located at approximately amino acid residues 696 to 745, 746 to 795, and 664 to 713, respectively (PFAM 02149).
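The approximate domain coordinates listed above can be collected into a small lookup structure. The following is a minimal sketch, assuming the residue numbers are 1-based and inclusive; the dictionary layout, the short domain labels and the helper names are hypothetical conveniences, and only the coordinates themselves come from the text.

```python
# Approximate domain boundaries (1-based, inclusive) as stated in the text.
# The keys "kinase", "UBA" and "KA" are illustrative labels only.
DOMAINS = {
    "GI#9845487": {"kinase": (20, 271), "UBA": (291, 330), "KA": (696, 745)},
    "GI#8923922": {"kinase": (60, 311), "UBA": (331, 370), "KA": (746, 795)},
    "GI#4505103": {"kinase": (56, 307), "UBA": (327, 366), "KA": (664, 713)},
    "GI#13899225": {"kinase": (59, 310), "UBA": (330, 369)},
}


def domain_length(start: int, end: int) -> int:
    """Number of residues in an inclusive 1-based range."""
    return end - start + 1


def extract_domain(sequence: str, start: int, end: int) -> str:
    """Return the residues from start to end (1-based, inclusive)."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("domain coordinates fall outside the sequence")
    return sequence[start - 1:end]


for gi, domains in DOMAINS.items():
    for name, (start, end) in domains.items():
        print(gi, name, start, end, domain_length(start, end))
```

With these coordinates, each kinase domain spans 252 residues, each UBA domain 40 residues, and each kinase associated domain 50 residues.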

II. High-Throughput In Vitro Fluorescence Polarization Assay

[0106] Fluorescently-labeled MARK peptide/substrate is added to each well of a 96-well microtiter plate, along with a test agent in a test buffer (10 mM HEPES, 10 mM NaCl, 6 mM magnesium chloride, pH 7.6). Changes in fluorescence polarization, determined by using a Fluorolite FPM-2 Fluorescence Polarization Microtiter System (Dynatech Laboratories, Inc), relative to control values indicate that the test compound is a candidate modifier of MARK activity.
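As an illustrative sketch of how such a readout might be evaluated, polarization is conventionally computed from the emission intensities measured parallel and perpendicular to the excitation plane, and a well is flagged when its polarization differs from the control by more than a chosen margin. The instrument correction factor (G-factor), the 20 mP margin and all function names below are assumptions made for the example; the text does not specify them.

```python
def polarization_mp(i_parallel: float, i_perpendicular: float,
                    g_factor: float = 1.0) -> float:
    """Fluorescence polarization in millipolarization units (mP).

    P = (I_par - G * I_perp) / (I_par + G * I_perp), reported as 1000 * P.
    """
    denominator = i_parallel + g_factor * i_perpendicular
    if denominator <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - g_factor * i_perpendicular) / denominator


def is_candidate(test_mp: float, control_mp: float,
                 margin_mp: float = 20.0) -> bool:
    """Flag a test agent whose polarization shifts beyond the margin.

    The 20 mP default is an arbitrary illustrative threshold.
    """
    return abs(test_mp - control_mp) > margin_mp


# Hypothetical intensity readings for a control well and a test well
control = polarization_mp(1500.0, 1000.0)
test = polarization_mp(1400.0, 1000.0)
print(control, test, is_candidate(test, control))
```

In practice the margin would be set from the spread of replicate control wells rather than fixed in advance.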

III. High-Throughput In Vitro Binding Assay

[0107] ³³P-labeled MARK peptide is added in an assay buffer (100 mM KCl, 20 mM HEPES pH 7.6, 1 mM MgCl₂, 1% glycerol, 0.5% NP-40, 50 mM beta-mercaptoethanol, 1 mg/ml BSA, cocktail of protease inhibitors) along with a test agent to the wells of a Neutralite-avidin coated assay plate and incubated at 25°C for 1 hour. Biotinylated substrate is then added to each well and incubated for 1 hour. Reactions are stopped by washing with PBS, and counted in a scintillation counter. Test agents that cause a difference in activity relative to control without test agent are identified as candidate p53 modulating agents.

IV. Immunoprecipitations and Immunoblotting

[0108] For coprecipitation of transfected proteins, 3×10⁶ appropriate recombinant cells containing the MARK proteins are plated on 10-cm dishes and transfected on the following day with expression constructs. The total amount of DNA is kept constant in each transfection by adding empty vector. After 24 h, cells are collected, washed once with phosphate-buffered saline and lysed for 20 min on ice in 1 ml of lysis buffer containing 50 mM Hepes, pH 7.9, 250 mM NaCl, 20 mM β-glycerophosphate, 1 mM sodium orthovanadate, 5 mM p-nitrophenyl phosphate, 2 mM dithiothreitol, protease inhibitors (complete, Roche Molecular Biochemicals), and 1% Nonidet P-40. Cellular debris is removed by centrifugation twice at 15,000 × g for 15 min. The cell lysate is incubated with 25 µl of M2 beads (Sigma) for 2 h at 4°C with gentle rocking.

[0109] After extensive washing with lysis buffer, proteins bound to the beads are solubilized by boiling in SDS sample buffer, fractionated by SDS-polyacrylamide gel electrophoresis, transferred to polyvinylidene difluoride membrane and blotted with the indicated antibodies. The reactive bands are visualized with horseradish peroxidase coupled to the appropriate secondary antibodies and the enhanced chemiluminescence (ECL) Western blotting detection system (Amersham Pharmacia Biotech).

V. Kinase Assay

[0110] A purified or partially purified MARK is diluted in a suitable reaction buffer, e.g., 50 mM Hepes, pH 7.5, containing magnesium chloride or manganese chloride (1-20 mM) and a peptide or polypeptide substrate, such as myelin basic protein or casein (1-10 µg/ml). The final concentration of the kinase is 1-20 nM. The enzyme reaction is conducted in microtiter plates to facilitate optimization of reaction conditions by increasing assay throughput. A 96-well microtiter plate is employed using a final volume of 30-100 µl. The reaction is initiated by the addition of ³³P-gamma-ATP (0.5 µCi/ml) and incubated for 0.5 to 3 hours at room temperature. Negative controls are provided by the addition of EDTA, which chelates the divalent cation (Mg²⁺ or Mn²⁺) required for enzymatic activity. Following the incubation, the enzyme reaction is quenched using EDTA. Samples of the reaction are transferred to a 96-well glass fiber filter plate (MultiScreen, Millipore). The filters are subsequently washed with phosphate-buffered saline, dilute phosphoric acid (0.5%) or other suitable medium to remove excess radiolabeled ATP. Scintillation cocktail is added to the filter plate and the incorporated radioactivity is quantitated by scintillation counting (Wallac/Perkin Elmer). Activity is defined by the amount of radioactivity detected following subtraction of the negative control reaction value (EDTA quench).
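The activity definition at the end of the preceding paragraph amounts to a background subtraction. The sketch below shows that step, followed by one common way of expressing a test well relative to an untreated well. Averaging replicate EDTA wells and reporting percent of control are assumptions added for illustration, as are the function names and the counts used in the example.

```python
from statistics import mean


def kinase_activity(sample_cpm: float, edta_control_cpm: list[float]) -> float:
    """Counts remaining after subtracting the EDTA negative-control value.

    Replicate EDTA wells are averaged (an assumption; the text refers
    only to "the negative control reaction value").
    """
    return sample_cpm - mean(edta_control_cpm)


def percent_of_control(test_cpm: float, untreated_cpm: float,
                       edta_control_cpm: list[float]) -> float:
    """Background-corrected activity of a test well as % of untreated."""
    untreated = kinase_activity(untreated_cpm, edta_control_cpm)
    if untreated <= 0:
        raise ValueError("untreated activity must exceed background")
    return 100.0 * kinase_activity(test_cpm, edta_control_cpm) / untreated


# Hypothetical scintillation counts (cpm)
edta_wells = [200.0, 220.0, 180.0]
print(kinase_activity(5200.0, edta_wells))             # untreated well
print(percent_of_control(1450.0, 5200.0, edta_wells))  # well with test agent
```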

VI. Expression Analysis

[0111] All cell lines used in the following experiments are NCI (National Cancer Institute) lines, and are available from ATCC (American Type Culture Collection, Manassas, Va. 20110-2209). Normal and tumor tissues were obtained from Impath, UC Davis, Clontech, Stratagene, and Ambion.

[0112] TaqMan analysis was used to assess expression levels of the disclosed genes in various samples.

[0113] RNA was extracted from each tissue sample using Qiagen (Valencia, Calif.) RNeasy kits, following manufacturer's protocols, to a final concentration of 50 ng/µl. Single stranded cDNA was then synthesized by reverse transcribing the RNA samples using random hexamers and 500 ng of total RNA per reaction, following protocol 4304965 of Applied Biosystems (Foster City, Calif., http://www.appliedbiosystems.com/).

[0114] Primers for expression analysis using TaqMan assay (Applied Biosystems, Foster City, Calif.) were prepared according to the TaqMan protocols, and the following criteria: a) primer pairs were designed to span introns to eliminate genomic contamination, and b) each primer pair produced only one product.

[0115] TaqMan reactions were carried out following manufacturer's protocols, in 25 µl total volume for 96-well plates and 10 µl total volume for 384-well plates, using 300 nM primer and 250 nM probe, and approximately 25 ng of cDNA. The standard curve for result analysis was prepared using a universal pool of human cDNA samples, which is a mixture of cDNAs from a wide variety of tissues so that the chance that a target will be present in appreciable amounts is good. The raw data were normalized using 18S rRNA (universally expressed in all tissues and cells).
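The standard-curve and normalization steps can be sketched as follows. This is a generic illustration of relative quantification against a dilution series, not the analysis software actually used: it assumes the threshold cycle (Ct) is linear in the base-10 logarithm of input quantity, fits that line by least squares, and divides the interpolated target quantity by the interpolated 18S rRNA quantity. All function names and numeric values are hypothetical.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate input quantity from a Ct."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, target_curve, ref_ct, ref_curve):
    """Target quantity divided by reference (e.g. 18S rRNA) quantity."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    ref_qty = quantity_from_ct(ref_ct, *ref_curve)
    return target_qty / ref_qty


# Hypothetical dilution series of the universal cDNA pool (ng, Ct)
curve = fit_standard_curve([100, 10, 1, 0.1], [20.0, 23.3, 26.6, 29.9])
print(curve)
print(quantity_from_ct(23.3, *curve))
```

A separate curve would normally be fitted for the target gene and for 18S rRNA, since their amplification efficiencies need not match.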

[0116] For each expression analysis, tumor tissue samples were compared with matched normal tissues from the same patient. A gene was considered overexpressed in a tumor when the level of expression of the gene was 2-fold or higher in the tumor compared with its matched normal sample.
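
The 2-fold criterion of paragraph [0116] amounts to a simple ratio test on normalized expression values from a matched tumor/normal pair. The sketch below uses a hypothetical function name and hypothetical values.

```python
def is_overexpressed(tumor_level, normal_level, fold_threshold=2.0):
    """Return True when tumor expression is at least fold_threshold times normal.

    Both levels are assumed to be normalized expression values from the
    same patient; a non-positive normal level is rejected.
    """
    if normal_level <= 0:
        raise ValueError("normal_level must be positive")
    return tumor_level / normal_level >= fold_threshold


# Hypothetical normalized expression values.
print(is_overexpressed(4.0, 2.0))
print(is_overexpressed(3.0, 2.0))
```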

[0117] Results are shown in Table 1. Data presented in bold indicate that greater than 50% of tested tumor samples of the tissue type indicated in row 1 exhibited overexpression of the gene listed in column 1, relative to normal samples. Underlined data indicate that between 25% and 49% of tested tumor samples exhibited overexpression. A modulator identified by an assay described herein can be further validated for therapeutic effect by administration to a tumor in which the gene is overexpressed. A decrease in tumor growth confirms therapeutic utility of the modulator. Prior to treating a patient with the modulator, the likelihood that the patient will respond to treatment can be diagnosed by obtaining a tumor sample from the patient, and assaying for expression of the gene targeted by the modulator. The expression data for the gene(s) can also be used as a diagnostic marker for disease progression. The assay can be performed by expression analysis as described above, by antibody directed to the gene target, or by any other available detection method.

TABLE 1

                                breast     colon      lung       ovary
GI#9845486 (SEQ ID NO: 1)       7   11     8   30     8   13     5   7
GI#9845488 (SEQ ID NO: 2)       1   11     4   30     0   13     1   7
GI#8923921 (SEQ ID NO: 8)       2   11     7   30     6   13     0   7
GI#3089348 (SEQ ID NO: 13)      2   11     2   30     0   13     2   7
GI#13366083 (SEQ ID NO: 19)     2   11     2   30     5   13     1   7
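
The bold/underline thresholds described in paragraph [0117] can be sketched as a small classifier. Two assumptions are made here and are not stated in the application: each pair of numbers in Table 1 is read as (tumors showing overexpression, tumors tested), and fractions above 49% up to and including 50%, which the text does not address, are grouped with the underlined band.

```python
def highlight_class(overexpressed, tested):
    """Classify a Table 1 entry by the fraction of tumors overexpressing.

    Returns "bold" for > 50%, "underlined" for 25% through 50%
    (the upper edge is an assumption), otherwise "none".
    """
    if tested <= 0:
        raise ValueError("tested must be positive")
    fraction = overexpressed / tested
    if fraction > 0.5:
        return "bold"
    if fraction >= 0.25:
        return "underlined"
    return "none"


# Entries for SEQ ID NO: 1 (breast, colon) and SEQ ID NO: 2 (breast),
# read under the pairing assumption described above.
labels = [highlight_class(7, 11), highlight_class(8, 30), highlight_class(1, 11)]
print(labels)
```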

Sequence CWU 1

1

2912946DNAHomo sapiens 1tcctggaatt gcacgcgctt cctgaccacc aggctctggc ccttgagaag ccagcggggc 60tttgtccctg ttgctctcct tgccaaaccc agtctctctg ctagtggtgg tttcggttgc 120gacaccgtcc aggttcccag gcaggaaccg ctcggcctgg ctgcttagct acttttcact 180gaggaggtgg tggaaggtgt cgcctgctct ggctgagtaa gggtggctgg ctgagccggc 240agcccccgcc ctaggcctgg ctcttcccgg cctctgtact ttgccctcgc tgcctgacag 300gttctgctgt gggctctgct gaatggaagt cgctggtagt ccttttccct ttctccagtc 360ggcccacctt gggacacctt gactccaagc ccagcagtaa gtccaacatg attcggggcc 420gcaactcagc cacctctgct gatgagcagc cccacattgg aaactaccgg ctcctcaaga 480ccattggcaa gggtaatttt gccaaggtga agttggcccg acacatcctg actgggaaag 540aggtagctgt gaagatcatt gacaagactc aactgaactc ctccagcctc cagaaactat 600tccgcgaagt aagaataatg aaggttttga atcatcccaa catagttaaa ttatttgaag 660tgattgagac tgagaaaacg ctctaccttg tcatggagta cgctagtggc ggagaggtat 720ttgattacct agtggctcat ggcaggatga aagaaaaaga ggctcgagcc aaattccgcc 780agatagtgtc tgctgtgcag tactgtcacc agaagtttat tgtccataga gacttaaagg 840cagaaaacct gctcttggat gctgatatga acatcaagat tgcagacttt ggcttcagca 900atgaattcac ctttgggaac aagctggaca ccttctgtgg cagtccccct tatgctgccc 960cagaactctt ccagggcaaa aaatatgatg gacccgaggt ggatgtgtgg agcctaggag 1020ttatcctcta tacactggtc agcggatccc tgccttttga tggacagaac ctcaaggagc 1080tgcgggaacg ggtactgagg gggaaatacc gtattccatt ctacatgtcc acggactgtg 1140aaaacctgct taagaaattt ctcatcctta atcccagcaa gagaggcact ttagagcaaa 1200tcatgaaaga tcgatggatg aatgtgggtc acgaagatga tgaactaaag ccttacgtgg 1260agccactccc tgactacaag gacccccggc ggacagagct gatggtgtcc atgggttata 1320cacgggaaga gatccaggac tcgctggtgg gccagagata caacgaggtg atggccacct 1380atctgctcct gggctacaag agctccgagc tggaaggcga caccatcacc ctgaaacccc 1440ggccttcagc tgatctaacc aatagcagcg cccaattccc atcccacaag gtacagcgaa 1500gcgtgtcggc caatcccaag cagcggcgct tcagcgacca ggctggtcct gccattccca 1560cctctaattc ttactctaag aagactcaga gtaacaacgc agaaaataag cggcctgagg 1620aggaccggga gtcagggcgg aaagccagca gcacagccaa ggtgcctgcc agccccctgc 1680ccggtctgga gaggaagaag accaccccaa 
ccccctccac gaacagcgtc ctctccacca 1740gcacaaatcg aagcaggaat tccccacttt tggagcgggc cagcctcggc caggcctcca 1800tccagaatgg caaagacagc ctaaccatgc cagggtcccg ggcctccacg gcttctgctt 1860ctgccgcagt ctctgcggcc cggccccgcc agcaccagaa atccatgtcg gcctccgtgc 1920accccaacaa ggcctctggg ctgcccccca cggagagtaa ctgtgaggtg ccgcggccca 1980gcacagcccc ccagcgtgtc cctgttgcct ccccatccgc ccacaacatc agcagcagtg 2040gtggagcccc agaccgaact aacttccccc ggggtgtgtc cagccgaagc accttccatg 2100ctgggcagct ccgacaggtg cgggaccagc agaatttgcc ctacggtgtg accccagcct 2160ctccctctgg ccacagccag ggccggcggg gggcctctgg gagcatcttc agcaagttca 2220cctccaagtt tgtacgcagg aacctgaatg aacctgaaag caaagaccga gtggagacgc 2280tcagacctca cgtggtgggc agtggcggca acgacaaaga aaaggaagaa tttcgggagg 2340ccaagccccg ctccctccgc ttcacgtgga gtatgaagac cacgagctcc atggagccca 2400acgagatgat gcgggagatc cgcaaggtgc tggacgcgaa cagctgccag agcgagctgc 2460atgagaagta catgctgctg tgcatgcacg gcacgccggg ccacgaggac ttcgtgcagt 2520gggagatgga ggtgtgcaaa ctgccgcggc tctctctcaa cggggttcga tttaagcgga 2580tatcgggcac ctccatggcc ttcaaaaaca ttgcctccaa aatagccaac gagctgaagc 2640tttaacaggc tgccaggagc gggggcggcg ggggcgggcc agctggacgg gctgccggcc 2700gtgcgccgcc ccacctgggc gagactgcag cgatggattg gtgtgtctcc ctgctggcac 2760ttctcccctc cctggccctt ctcagttttc tcccacattc acccctgccc agagattccc 2820ccttctcctc tcccctactg gaggcaaagg aaggggaggg tggatggggg ggcagggctc 2880cccctcggta ctgcggttgc acagagtatt tcgcctaaac caagaaattt tttattacca 2940aaaaga 294622784DNAHomo sapiens 2tcctggaatt gcacgcgctt cctgaccacc aggctctggc ccttgagaag ccagcggggc 60tttgtccctg ttgctctcct tgccaaaccc agtctctctg ctagtggtgg tttcggttgc 120gacaccgtcc aggttcccag gcaggaaccg ctcggcctgg ctgcttagct acttttcact 180gaggaggtgg tggaaggtgt cgcctgctct ggctgagtaa gggtggctgg ctgagccggc 240agcccccgcc ctaggcctgg ctcttcccgg cctctgtact ttgccctcgc tgcctgacag 300gttctgctgt gggctctgct gaatggaagt cgctggtagt ccttttccct ttctccagtc 360ggcccacctt gggacacctt gactccaagc ccagcagtaa gtccaacatg attcggggcc 420gcaactcagc cacctctgct gatgagcagc cccacattgg 
aaactaccgg ctcctcaaga 480ccattggcaa gggtaatttt gccaaggtga agttggcccg acacatcctg actgggaaag 540aggtagctgt gaagatcatt gacaagactc aactgaactc ctccagcctc cagaaactat 600tccgcgaagt aagaataatg aaggttttga atcatcccaa catagttaaa ttatttgaag 660tgattgagac tgagaaaacg ctctaccttg tcatggagta cgctagtggc ggagaggtat 720ttgattacct agtggctcat ggcaggatga aagaaaaaga ggctcgagcc aaattccgcc 780agatagtgtc tgctgtgcag tactgtcacc agaagtttat tgtccataga gacttaaagg 840cagaaaacct gctcttggat gctgatatga acatcaagat tgcagacttt ggcttcagca 900atgaattcac ctttgggaac aagctggaca ccttctgtgg cagtccccct tatgctgccc 960cagaactctt ccagggcaaa aaatatgatg gacccgaggt ggatgtgtgg agcctaggag 1020ttatcctcta tacactggtc agcggatccc tgccttttga tggacagaac ctcaaggagc 1080tgcgggaacg ggtactgagg gggaaatacc gtattccatt ctacatgtcc acggactgtg 1140aaaacctgct taagaaattt ctcatcctta atcccagcaa gagaggcact ttagagcaaa 1200tcatgaaaga tcgatggatg aatgtgggtc acgaagatga tgaactaaag ccttacgtgg 1260agccactccc tgactacaag gacccccggc ggacagagct gatggtgtcc atgggttata 1320cacgggaaga gatccaggac tcgctggtgg gccagagata caacgaggtg atggccacct 1380atctgctcct gggctacaag agctccgagc tggaaggcga caccatcacc ctgaaacccc 1440ggccttcagc tgatctaacc aatagcagcg cccaattccc atcccacaag gtacagcgaa 1500gcgtgtcggc caatcccaag cagcggcgct tcagcgacca ggctggtcct gccattccca 1560cctctaattc ttactctaag aagactcaga gtaacaacgc agaaaataag cggcctgagg 1620aggaccggga gtcagggcgg aaagccagca gcacagccaa ggtgcctgcc agccccctgc 1680ccggtctgga gaggaagaag accaccccaa ccccctccac gaacagcgtc ctctccacca 1740gcacaaatcg aagcaggaat tccccacttt tggagcgggc cagcctcggc caggcctcca 1800tccagaatgg caaagacagc acagcccccc agcgtgtccc tgttgcctcc ccatccgccc 1860acaacatcag cagcagtggt ggagccccag accgaactaa cttcccccgg ggtgtgtcca 1920gccgaagcac cttccatgct gggcagctcc gacaggtgcg ggaccagcag aatttgccct 1980acggtgtgac cccagcctct ccctctggcc acagccaggg ccggcggggg gcctctggga 2040gcatcttcag caagttcacc tccaagtttg tacgcaggaa cctgaatgaa cctgaaagca 2100aagaccgagt ggagacgctc agacctcacg tggtgggcag tggcggcaac gacaaagaaa 2160aggaagaatt tcgggaggcc 
aagccccgct ccctccgctt cacgtggagt atgaagacca 2220cgagctccat ggagcccaac gagatgatgc gggagatccg caaggtgctg gacgcgaaca 2280gctgccagag cgagctgcat gagaagtaca tgctgctgtg catgcacggc acgccgggcc 2340acgaggactt cgtgcagtgg gagatggagg tgtgcaaact gccgcggctc tctctcaacg 2400gggttcgatt taagcggata tcgggcacct ccatggcctt caaaaacatt gcctccaaaa 2460tagccaacga gctgaagctt taacaggctg ccaggagcgg gggcggcggg ggcgggccag 2520ctggacgggc tgccggccgt gcgccgcccc acctgggcga gactgcagcg atggattggt 2580gtgtctccct gctggcactt ctcccctccc tggcccttct cagttttctc ccacattcac 2640ccctgcccag agattccccc ttctcctctc ccctactgga ggcaaaggaa ggggagggtg 2700gatggggggg cagggctccc cctcggtact gcggttgcac agagtatttc gcctaaacca 2760agaaattttt tattaccaaa aaga 278433103DNAHomo sapiensmisc_feature(2941)..(2941)"n" is A, C, G, or T 3cggtggtggc ggccatgttg ggagcagcag gtccggcggc ggctgcctgt gtgccgggcg 60cggagcagtg ccgctgaggg caggggagga gcgaggcagg cggccggctg cggcggcaga 120gagtaggcgg agcggcgcgg cccggccgaa aggcggcaca gcccagccgg gggtcggggg 180ggtgcggtcc ggagccgctc ggagccggcg cggcctagcc cgagcggcgc atccccgggc 240tggcgtgagc ggctgcccgg cctccccgca cccccggccg gggcccatgc ggcgggtgct 300cctgctgtga gaagccccgc ccggccgggc tccgcgcctt cccttccctc ccttcctcca 360agcttctcgg ttccctcccc cgagataccg gcgccatgtc cagcgctcgg acccccctac 420ccacgctgaa cgagagggac acggagcagc ccaccttggg acaccttgac tccaagccca 480gcagtaagtc caacatgatt cggggccgca actcagccac ctctgctgat gagcagcccc 540acattggaaa ctaccggctc ctcaagacca ttggcaaggg taattttgcc aaggtgaagt 600tggcccgaca catcctgact gggaaagagg tagctgtgaa gatcattgac aagactcaac 660tgaactcctc cagcctccag aaactattcc gcgaagtaag aataatgaag gttttgaatc 720atcccaacat agttaaatta tttgaagtga ttgagactga gaaaacgctc taccttgtca 780tggagtacgc tagtggcgga gaggtatttg attacctagt ggctcatggc aggatgaaag 840aaaaagaggc tcgagccaaa ttccgccaga tagtgtctgc tgtgcagtac tgtcaccaga 900agtttattgt ccatagagac ttaaaggcag aaaacctgct cttggatgct gatatgaaca 960tcaagattgc agactttggc ttcagcaatg aattcacctt tgggaacaag ctggacacct 1020tctgtggcag tcccccttat gctgccccag aactcttcca 
gggcaaaaaa tatgatggac 1080ccgaggtgga tgtgtggagc ctaggagtta tcctctatac actggtcagc ggatccctgc 1140cttttgatgg acagaacctc aaggagctgc gggaacgggt actgagggga aaataccgta 1200ttccattcta catgtccacg gactgtgaaa acctgcttaa gaaatttctc attcttaatc 1260ccagcaagag aggcacttta gagcaaatca tgaaagatcg atggatgaat gtgggtcacg 1320aagatgatga actaaagcct tacgtggagc cactccctga ctacaaggac ccccggcgga 1380cagagctgat ggtgtccatg ggttatacac gggaagagat ccaggactcg ctggtgggcc 1440agagatacaa cgaggtgatg gccacctatc tgctcctggg ctacaagagc tccgagctgg 1500aaggcgacac catcaccctg aaaccccggc cttcagctga tctgaccaat agcagcgccc 1560catccccatc ccacaaggta cagcgcagcg tgtcggccaa tcccaagcag cggcgcttca 1620gcgaccaggc tggtcctgcc attcccacct ctaattctta ctctaagaag actcagagta 1680acaacgcaga aaataagcgg cctgaggagg accgggagtc agggcggaaa gccagcagca 1740cagccaaggt gcctgccagc cccctgcccg gtctggagag gaagaagacc accccaaccc 1800cctccacgga acagcgtcct ctccaccagc acaaatcgaa gcaggaattc cccacttttg 1860gagcgggcca gcctcggcca ggcctccatc cagaatggca aagacagcct aaccatgcca 1920gggtcccggg cctccacggc ttctgcttct gccgcagtct ctgcggcccg gccccgccag 1980caccagaaat ccatgtcggc ctccgtgcac cccaacaagg cctctgggct gccccccacg 2040gagagtaact gtgaggtgcc gcggcccagc acagcccccc agcgtgtccc tgttgcctcc 2100ccatccgccc acaacatcag cagcagtggt ggagccccag accgaactaa cttcccccgg 2160ggtgtgtcca gccgaagcac cttccatgct gggcagctcc gacaggtgcg ggaccagcag 2220aatttgccct acggtgtgac cccagcctct ccctctggcc acagccaggg ccggcggggg 2280gcctctggga gcatcttcag caagttcacc tccaagtttg tacgcagaaa tctgtctttc 2340aggtttgcca gaaggaacct gaatgaacct gaaagcaaag accgagtgga gacgctcaga 2400cctcacgtgg tgggcagtgg cggcaacgac aaagaaaagg aagaatttcg ggaggccaag 2460ccccgctccc tccgcttcac gtggagtatg aagaccacga gctccatgga gcccaacgag 2520atgatgcggg agatccgcaa ggtgctggac gcgaacagct gccagagcga gctgcatgag 2580aagtacatgc tgctgtgcat gcacggcacg ccgggccacg aggacttcgt gcagtgggag 2640atggaggtgt gcaaactgcc gcggctctct ctcaacgggg ttcgatttaa gcggatatcg 2700ggcacctcca tggccttcaa aaacattgcc tccaaaatag ccaacgagct gaagctttaa 2760caggctgcca 
ggagcggggg cggcgggggg cgggccagct ggacgggctg ccggccgctg 2820cgccgcccca cctgggcgag actgcagcga tggattggtg tgtctcccct gctggcactt 2880ctcccctccc tggcccttct cagttttctc ttacatgttt gtggggggtg ggagattgtt 2940ntccagcacc ccacattcac ccctgcccag agattccccc ttctcctctc ccctactgga 3000ggcaaaggaa ggggagggtg gatggggggg cagggctccc cctcggtact gcggttgcac 3060agagtatttc gcctaaacca agaaattttt tattaccaaa aag 310342086DNAHomo sapiens 4agtccaacat gattcggggc cgcaactcag ccacctctgc tgatgagcag ccccacattg 60gaaactaccg gctcctcaag accattggca agggtaattt tgccaaggtg aagttggccc 120gacacatcct gactgggaaa gaggtagctg tgaagatcat tgacaagact caactgaact 180cctccagcct ccagaaacta ttccgcgaag taagaataat gaaggttttg aatcatccca 240acatagttaa attatttgaa gtgattgaga ctgagaaaac gctctacctt gtcatggagt 300acgctagtgg cggagaggta tttgattacc tagtggctca tggcaggatg aaagaaaaag 360aggctcgagc caaattccgc cagatagtgt ctgctgtgca gtactgtcac cagaagttta 420ttgtccatag agacttaaag gcagaaaacc tgctcttgga tgctgatatg aacatcaaga 480ttgcagactt tggcttcagc aatgaattca cctttgggaa caagctggac accttctgtg 540gcagtccccc ttatgctgcc ccagaactct tccagggcaa aaaatatgat ggacccgagg 600tggatgtgtg gagcctagga gttatcctct atacactggt cagcggatcc ctgccttttg 660atggacagaa cctcaaggag ctgcgggaac gggtactgag gggaaaatac cgtattccat 720tctacatgtc cacggactgt gaaaacctgc ttaagaaatt tctcattctt aatcccagca 780agagaggcac tttagagcaa atcatgaaag atcgatggat gaatgtgggt cacgaagatg 840atgaactaaa gccttacgtg gagccactcc ctgactacaa ggacccccgg cggacagagc 900tgatggtgtc catgggttat acacgggaag agatccagga ctcgctggtg ggccagagat 960acaacgaggt gatggccacc tatctgctcc tgggctacaa gagctccgag ctggaaggcg 1020acaccatcac cctgaaaccc cggccttcag ctgatctgac caatagcagc gccccatccc 1080catcccacaa ggtacagcgc agcgtgtcgg ccaatcccaa gcagcggcgc ttcagcgacc 1140aggctggtcc tgccattccc acctctaatt cttactctaa gaagactcag agtaacaacg 1200cagaaaataa gcggcctgag gaggaccggg agtcagggcg gaaagccagc agcacagcca 1260aggtgcctgc cagccccctg cccggtctgg agaggaagaa gaccacccca accccctcca 1320cgaacagcgt cctctccacc agcacaaatc gaagcaggaa ttccccactt 
ttggagcggg 1380ccagcctcgg ccaggcctcc atccagaatg gcaaagacag cacagccccc cagcgtgtcc 1440ctgttgcctc cccatccgcc cacaacatca gcagcagtgg tggagcccca gaccgaacta 1500acttcccccg gggtgtgtcc agccgaagca ccttccatgc tgggcagctc cgacaggtgc 1560gggaccagca gaatttgccc tacggtgtga ccccagcctc tccctctggc cacagccagg 1620gccggcgggg ggcctctggg agcatcttca gcaagttcac ctccaagttt gtacgcagga 1680acctgaatga acctgaaagc aaagaccgag tggagacgct cagacctcac gtggtgggca 1740gtggcggcaa cgacaaagaa aaggaagaat ttcgggaggc caagccccgc tccctccgct 1800tcacgtggag tatgaagacc acgagctcca tggagcccaa cgagatgatg cgggagatcc 1860gcaaggtgct ggacgcgaac agctgccaga gcgagctgca tgagaagtac atgctgctgt 1920gcatgcacgg cacgccgggc cacgaggact tcgtgcagtg ggagatggag gtgtgcaaac 1980tgccgcggct ctctctcaac ggggttcgat ttaagcggat atcgggcacc tccatggcct 2040tcaaaaacat tgcctccaaa atagccaacg agctgaagct ttaaca 208652248DNAHomo sapiens 5agtccaacat gattcggggc cgcaactcag ccacctctgc tgatgagcag ccccacattg 60gaaactaccg gctcctcaag accattggca agggtaattt tgccaaggtg aagttggccc 120gacacatcct gactgggaaa gaggtagctg tgaagatcat tgacaagact caactgaact 180cctccagcct ccagaaacta ttccgcgaag taagaataat gaaggttttg aatcatccca 240acatagttaa attatttgaa gtgattgaga ctgagaaaac gctctacctt gtcatggagt 300acgctagtgg cggagaggta tttgattacc tagtggctca tggcaggatg aaagaaaaag 360aggctcgagc caaattccgc cagatagtgt ctgctgtgca gtactgtcac cagaagttta 420ttgtccatag agacttaaag gcagaaaacc tgctcttgga tgctgatatg aacatcaaga 480ttgcagactt tggcttcagc aatgaattca cctttgggaa caagctggac accttctgtg 540gcagtccccc ttatgctgcc ccagaactct tccagggcaa aaaatatgat ggacccgagg 600tggatgtgtg gagcctagga gttatcctct atacactggt cagcggatcc ctgccttttg 660atggacagaa cctcaaggag ctgcgggaac gggtactgag gggaaaatac cgtattccat 720tctacatgtc cacggactgt gaaaacctgc ttaagaaatt tctcattctt aatcccagca 780agagaggcac tttagagcaa atcatgaaag atcgatggat gaatgtgggt cacgaagatg 840atgaactaaa gccttacgtg gagccactcc ctgactacaa ggacccccgg cggacagagc 900tgatggtgtc catgggttat acacgggaag agatccagga ctcgctggtg ggccagagat 960acaacgaggt gatggccacc tatctgctcc 
tgggctacaa gagctccgag ctggaaggcg 1020acaccatcac cctgaaaccc cggccttcag ctgatctgac caatagcagc gccccatccc 1080catcccacaa ggtacagcgc agcgtgtcgg ccaatcccaa gcagcggcgc ttcagcgacc 1140aggctggtcc tgccattccc acctctaatt cttactctaa gaagactcag agtaacaacg 1200cagaaaataa gcggcctgag gaggaccggg agtcagggcg gaaagccagc agcacagcca 1260aggtgcctgc cagccccctg cccggtctgg agaggaagaa gaccacccca accccctcca 1320cgaacagcgt cctctccacc agcacaaatc gaagcaggaa ttccccactt ttggagcggg 1380ccagcctcgg ccaggcctcc atccagaatg gcaaagacag cctaaccatg ccagggtccc 1440gggcctccac ggcttctgct tctgccgcag tctctgcggc ccggccccgc cagcaccaga 1500aatccatgtc ggcctccgtg caccccaaca aggcctctgg gctgcccccc acggagagta 1560actgtgaggt gccgcggccc agcacagccc cccagcgtgt ccctgttgcc tccccatccg 1620cccacaacat cagcagcagt ggtggagccc cagaccgaac taacttcccc cggggtgtgt 1680ccagccgaag caccttccat gctgggcagc tccgacaggt gcgggaccag cagaatttgc 1740cctacggtgt gaccccagcc tctccctctg gccacagcca gggccggcgg ggggcctctg 1800ggagcatctt cagcaagttc acctccaagt ttgtacgcag gaacctgaat gaacctgaaa 1860gcaaagaccg agtggagacg ctcagacctc acgtggtggg cagtggcggc aacgacaaag 1920aaaaggaaga atttcgggag gccaagcccc gctccctccg cttcacgtgg agtatgaaga 1980ccacgagctc catggagccc aacgagatga tgcgggagat ccgcaaggtg ctggacgcga 2040acagctgcca gagcgagctg catgagaagt acatgctgct gtgcatgcac ggcacgccgg 2100gccacgagga cttcgtgcag tgggagatgg aggtgtgcaa actgccgcgg ctctctctca 2160acggggttcg atttaagcgg atatcgggca cctccatggc cttcaaaaac attgcctcca 2220aaatagccaa cgagctgaag ctttaaca 224862701DNAHomo sapiens 6ggcacgaggg ctgaacgaga gggacacgga gcagcccacc ttgggacacc ttgactccaa 60gcccagcagt aagtccaaca tgattcgggg ccgcaactca gccacctctg ctgatgagca 120gccccacatt ggaaactacc ggctcctcaa gaccattggc aagggtaatt ttgccaaggt 180gaagttggcc cgacacatcc tgactgggaa agaggtagct gtgaagatca ttgacaagac 240tcaactgaac tcctccagcc tccagaaact attccgcgaa gtaagaataa tgaaggtttt 300gaatcatccc aacatagtta aattatttga agtgattgag actgagaaaa cgctctacct 360tgtcatggag tacgctagtg gcggagaggt atttgattac ctagtggctc atggcaggat 420gaaagaaaaa gaggctcgag 
ccaaattccg ccagatagtg tctgctgtgc agtactgtca 480ccagaagttt attgtccata gagacttaaa ggcagaaaac ctgctcttgg atgctgatat 540gaacatcaag attgcagact ttggcttcag caatgaattc acctttggga acaagctgga 600caccttctgt ggcagtcccc cttatgctgc cccagaactc ttccagggca aaaaatatga 660tggacccgag gtggatgtgt ggagcctagg agttatcctc tatacactgg tcagcggatc 720cctgcctttt gatggacaga acctcaagga gctgcgggaa cgggtactga ggggaaaata 780ccgtattcca ttctacatgt ccacggactg tgaaaacctg cttaagaaat ttctcattct 840taatcccagc aagagaggca ctttagagca aatcatgaaa gatcgatgga tgaatgtggg 900tcacgaagat gatgaactaa agccttacgt ggagccactc cctgactaca aggacccccg 960gcggacagag ctgatggtgt ccatgggtta tacacgggaa gagatccagg actcgctggt 1020gggccagaga tacaacgagg tgatggccac ctatctgctc ctgggctaca agagctccga 1080gctggaaggc gacaccatca ccctgaaacc ccggccttca gctgatctga ccaatagcag 1140cgccccatcc ccatcccaca aggtacagcg cagcgtgtcg gccaatccca agcagcggcg 1200cttcagcgac caggcagctg gtcctgccat tcccacctct aattcttact ctaagaagac 1260tcagagtaac aacgcagaaa ataagcggcc tgaggaggac cgggagtcag ggcggaaagc 1320cagcagcaca gccaaggtgc ctgccagccc cctgcccggt ctggagagga agaagaccac 1380cccaaccccc tccacgaaca gcgtcctctc caccagcaca aatcgaagca ggaattcccc 1440acttttggag cgggccagcc tcggtcaggc ctccatccag aatggcaaag acagcctaac 1500catgccaggg tcccgggcct ccacggcttc tgcttctgcc gcagtctctg cggcccggcc 1560ccgccagcac

cagaaatcca tgtcggcctc cgtgcacccc aacaaggcct ctgggctgcc 1620ccccacggag agtaactgtg aggtgccgcg gcccagcaca gccccccagc gtgtccctgt 1680tgcctcccca tccgcccaca acatcagcag cagtggtgga gccccagacc gaactaactt 1740cccccggggt gtgtccagcc gaagcacctt ccatgctggg cagctccgac aggtgcggga 1800ccagcagaat ttgccctacg gtgtgacccc agcctctccc tctggccaca gccagggccg 1860gcggggggcc tctgggagca tcttcagcaa gttcacctcc aagtttgtac gcagaaatct 1920gtctttcagg tttgccagaa ggaacctgaa tgaacctgaa agcaaagacc gagtggagac 1980gctcagacct cacgtggtgg gcagtggcgg caacgacaaa gaaaaggaag aatttcggga 2040ggccaagccc cgctccctcc gcttcacgtg gagtatgaag accacgagct ccatggagcc 2100caacgagatg atgcgggaga tccgcaaggt gctggacgcg aacagctgcc agagcgagct 2160gcatgagaag tacatgctgc tgtgcatgca cggcacgccg ggccacgagg acttcgtgca 2220gtgggagatg gaggtgtgca aactgccgcg gctctctctc aacggggttc gatttaagcg 2280gatatcgggc acctccatgg ccttcaaaaa cattgcctcc aaaatagcca acgagctgaa 2340gctttaacag gctgccagga gcgggggcgg cgggggcggg ccagctggac gggctgccgg 2400ccgctgcgcc gccccacctg ggcgagactg cagcgatgga ttggtgtgtc tcccctgctg 2460gcacttctcc cctccctggc ccttctcagt tttctcttac atgtttgtgg ggggtgggag 2520attgttctcc agccccccac attcacccct gcccagagat tcccccttct cctctcccct 2580actggaggca aaggaagggg agggtggatg ggggggcagg gctccccctc ggtactgcgg 2640ttgcacagag tatttcgcct aaaccaagaa attttttatt accaaaaaaa aaaaaaaaaa 2700a 270172112DNAHomo sapiens 7cccagcagta agtccaacat gattcggggc cgcaactcag ccacctctgc tgatgagcag 60ccccacattg gaaactaccg gctcctcaag accattggca agggtaattt tgccaaggtg 120aagttggccc gacacatcct gactgggaaa gaggtagctg tgaagatcat tgacaagact 180caactgaact cctccagcct ccagaaacta ttccgcgaag taagaataat gaaggttttg 240aatcatccca acatagttaa attatttgaa gtgattgaga ctgagaaaac gctctacctt 300gtcatggagt acgctagtgg cggagaggta tttgattacc tagtggctca tggcaggatg 360aaagaaaaag aggctcgagc caaattccgc cagatagtgt ctgctgtgca gtactgtcac 420cagaagttta ttgtccatag agacttaaag gcagaaaacc tgctcttgga tgctgatatg 480aacatcaaga ttgcagactt tggcttcagc aatgaattca cctttgggaa caagctggac 540accttctgtg gcagtccccc ttatgctgcc 
ccagaactct tccagggcaa aaaatatgat 600ggacccgagg tggatgtgtg gagcctagga gttatcctct atacactggt cagcggatcc 660ctgccttttg atggacagaa cctcaaggag ctgcgggaac gggtactgag gggaaaatac 720cgtattccat tctacatgtc cacggactgt gaaaacctgc ttaagaaatt tctcattctt 780aatcccagca agagaggcac tttagagcaa atcatgaaag atcgatggat gaatgtgggt 840cacgaagatg atgaactaaa gccttacgtg gagccactcc ctgactacaa ggacccccgg 900cggacagagc tgatggtgtc catgggttat acacgggaag agatccagga ctcgctggtg 960ggccagagat acaacgaggt gatggccacc tatctgctcc tgggctacaa gagctccgag 1020ctggaaggcg acaccatcac cctgaaaccc cggccttcag ctgatctgac caatagcagc 1080gccccatccc catcccacaa ggtacagcgc agcgtgtcgg ccaatcccaa gcagcggcgc 1140ttcagcgacc aggctggtcc tgccattccc acctctaatt cttactctaa gaagactcag 1200agtaacaacg cagaaaataa gcggcctgag gaggaccggg agtcagggcg gaaagccagc 1260agcacagcca aggtgcctgc cagccccctg cccggtctgg agaggaagaa gaccacccca 1320accccctcca cgaacagcgt cctctccacc agcacaaatc gaagcaggaa ttccccactt 1380ttggagcggg ccagcctcgg ccaggcctcc atccagaatg gcaaagacag cacagccccc 1440cagcgtgtcc ctgttgcctc cccatccgcc cacaacatca gcagcagtgg tggagcccca 1500gaccgaacta acttcccccg gggtgtgtcc agccgaagca ccttccatgc tgggcagctc 1560cgacaggtgc gggaccagca gaatttgccc tacggtgtga ccccagcctc tccctctggc 1620cacagccagg gccggcgggg ggcctctggg agcatcttca gcaagttcac ctccaagttt 1680gtacgcagga acctgaatga acctgaaagc aaagaccgag tggagacgct cagacctcac 1740gtggtgggca gtggcggcaa cgacaaagaa aaggaagaat ttcgggaggc caagccccgc 1800tccctccgct tcacgtggag tatgaagacc acgagctcca tggagcccaa cgagatgatg 1860cgggagatcc gcaaggtgct ggacgcgaac agctgccaga gcgagctgca tgagaagtac 1920atgctgctgt gcatgcacgg cacgccgggc cacgaggact tcgtgcagtg ggagatggag 1980gtgtgcaaac tgccgcggct ctctctcaac ggggttcgat ttaagcggat atcgggcacc 2040tccatggcct tcaaaaacat tgcctccaaa atagccaacg agctgaagct ttaacaggct 2100gccaggagcg gg 211282965DNAHomo sapiens 8cgggcaaccg cctcgcccga agccctccct cgttactgtc cgcatacccc ggcggcgccg 60ccgcggaaag cggctccccc tcctcttact ccgcgtcctc ttccctcttt cccccgccgg 120ggcacgcttg ttgcaccgtc ccgcggcctg 
cgggagccgc tcgccccgga cttgagctcg 180cgtacgaccc atttcctgtc gccccccgga gcccgcacca cagcccggcc ggtctagacc 240ccggcagacc ccgctggccg cacaaaatgt cggcccggac gccattgccg acggtgaacg 300agcgggacac ggtaaatcat acgactgtgg atggatatac tgaaccacac atccagccta 360ccaagtcgag tagcagacag aacatccccc ggtgtagaaa ctccattacg tcagcaacag 420atgaacagcc tcacattgga aattaccgtt tacaaaaaac aatagggaag ggaaattttg 480ccaaagtcaa attggcaaga cacgttctaa ctggtagaga ggttgctgtg aaaataatag 540acaaaactca gctaaatcct accagtctac aaaagttatt tcgagaagta cgaataatga 600agatactgaa tcatcctaat atagtaaaat tgtttgaagt tattgaaaca gagaagactc 660tctatttagt catggaatac gcgagtgggg gtgaagtatt tgattactta gttgcccatg 720gaagaatgaa agagaaagag gcccgtgcaa aatttaggca gattgtatct gctgtacagt 780attgtcatca aaagtacatt gttcaccgtg atcttaaggc tgaaaacctt ctccttgatg 840gtgatatgaa tattaaaatt gctgactttg gttttagtaa tgaatttaca gttgggaaca 900aattggacac attttgtgga agcccaccct atgctgctcc cgagcttttc caaggaaaga 960agtatgatgg gcctgaagtg gatgtgtgga gtctgggcgt cattctctat acattagtca 1020gtggctcctt gcctttcgat ggccagaatt taaaggaact gcgagagcga gttttacgag 1080ggaagtaccg tattcccttc tatatgtcca cagactgtga aaatcttctg aagaaattat 1140tagtcctgaa tccaataaag agaggcagct tggaacaaat aatgaaagat cgatggatga 1200atgttggtca tgaagaggaa gaactaaagc catatactga gcctgatccg gatttcaatg 1260acacaaaaag aatagacatt atggtcacca tgggctttgc acgagatgaa ataaatgatg 1320ccttaataaa tcagaagtat gatgaagtta tggctactta tattcttcta ggtagaaaac 1380cacctgaatt tgaaggtggt gaatcgttat ccagtggaaa cttgtgtcag aggtcccggc 1440ccagtagtga cttaaacaac agcactcttc agtcccctgc tcacctgaag gtccagagaa 1500gtatctcagc aaatcagaag cagcggcgtt tcagtgatca tgctggtcca tccattcctc 1560ctgctgtatc atataccaaa agacctcagg ctaacagtgt ggaaagtgaa cagaaagagg 1620agtgggacaa agatgtggct cgaaaacttg gcagcacaac agttggatca aaaagcgaga 1680tgactgcaag ccctcttgta gggccagaga ggaaaaaatc ttcaactatt ccaagtaaca 1740atgtgtattc tggaggtagc atggcaagaa ggaatacata tgtctgtgaa aggaccacag 1800atcgatacgt agcattgcag aatggaaaag acagcagcct tacggagatg tctgtgagta 1860gcatatcttc 
tgcaggctct tctgtggcct ctgctgtccc ctcagcacga ccccgccacc 1920agaagtccat gtccacttct ggtcatccta ttaaagtcac actgccaacc attaaagacg 1980gctctgaagc ttaccggcct ggtacaaccc agagagtgcc tgctgcttcc ccatctgctc 2040acagtattag tactgcgact ccagaccgga cccgttttcc ccgagggagc tcaagccgaa 2100gcactttcca tggtgaacag ctccgggagc gacgcagcgt tgcttataat gggccacctg 2160cttcaccatc ccatgaaacg ggtgcatttg cacatgccag aaggggaacg tcaactggta 2220taataagcaa aatcacatcc aaatttgttc gcagggatcc aagtgaaggc gaagccagtg 2280gcagaaccga cacctcaaga agtacatcag gggaaccaaa agaaagagac aaggaagagg 2340gtaaagattc taagccgcgt tctttgcggt tcacatggag tatgaagacc actagttcaa 2400tggaccctaa tgacatgatg agagaaatcc gaaaagtgtt agatgcaaat aactgtgatt 2460atgagcaaaa agagagattt ttgcttttct gtgtccatgg agacgctaga caggatagcc 2520tcgtgcagtg ggagatggaa gtctgcaagt tgccacgact gtcacttaat ggggttcgct 2580tcaagcgaat atctgggaca tctattgcct ttaagaacat tgcatcaaaa atagcaaatg 2640agcttaagct gtaaagaagt ccaaatttac aggttcaggg aagatacata catatatgag 2700gtacagtttt tgaatgtact ggtaatgcct aatgtggtct gcctgtgaat ctccccatgt 2760agaatttgcc cttaatgcaa taaggttata catagttatg aactgtaaaa ttaaagtcag 2820tatgaactat aataaatatc tgtagcttaa aaagtaggtt cacatgtaca ggtaagtata 2880ttgtgtattt ctgttcattt tctgttcata gagttgtata ataaaacatg attgcttaaa 2940aacttgaaaa aaaaaaaaaa aaaaa 296593210DNAHomo sapiens 9ggcgcggcgg cggcggtggc tgtgaccgcg cggaccgagc cgagacattc gcgccggggg 60atcgggcgcc gccgccgctg ggccccgggc gcgtggatgc ggctgggtcg ggcggcgccg 120tacacctgag gcggagaacg gggcgcggcg cgggtgacgc tgtcagggcc gcggttcctg 180acgcccaggc gctcgccagg acgagccagg cagtgatttg aggcaccggc ttcaccttca 240cccatggtcc ggagagccta gcggggctcg ccaccgcctc ccggctcccc ttccacgcct 300catcctgcca gcctcgccgc cccgccagcg ccgggcaacc gcctcgcccg aagccctccc 360tcgttactgt ccgcataccc cggcggcgcc gccgcgggaa gcggctcccc ctcctcttcc 420tccgcgtcct cttccctctt tcccccgccg gggccgcttg ttgcaccgcc ccgcggcctg 480cgggagccgc tcgccccggc cttgtgctcg cgtccgcacc cctttcctgt cgccccccgg 540ggcccgcacc acagcccggc cggcgagacc ccggccagac cccgctgccc gcacaaaatg 
600tcggcccgga cgccattgcc gacggtgaac gagcgggaca cggaaaatca tacatctgtg 660gatggatata ctgaaccaca catccagcct accaagtcga gtagcagaca gaacatcccc 720cggtgtagaa actccattac gtcagcaaca gatgaacagc ctcacattgg aaattaccgt 780ttacaaaaaa caatagggaa gggaaatttt gccaaagtca aattggcaag acacgttcta 840actggtagag aggttgctgt gaaaataata gacaaaactc agctaaatcc taccagtcta 900caaaagttat ttcgagaagt acgaataatg aagatactga atcatcctaa tataggtgaa 960gtatttgatt acttagttgc ccatggaaga atgaaagaga aagaggcccg tgcaaaattt 1020aggcagattg tatctgctgt acagtattgt catcaaaagt acattgttca ccgtgatctt 1080aagctgaaaa ccttctcctt gatggtgata tgaatattaa aattgctgac tttggtttta 1140gtaatgaatt tacagttggg aacaaattgg acacattttg tggaagccca ccctatgctg 1200ctcccgagct tttccaagga aagaagtatg atgggcctga agtggatgtg tggagtctgg 1260gcgtcattct ctatacatta gtcagtggct ccttgccttt cgatggccag aatttaaagg 1320aactgcgaga gcgagtttta cgagggaagt accgtattcc cttctatatg tccacagact 1380gtgaaaatct tctgaagaaa ttattagtcc tgaatccaat aaagagaggc agcttggaac 1440aaataatgaa agatcgatgg atgaatgttg gtcatgaaga ggaagaacta aagccatata 1500ctgagcctga tccggatttc aatgacacaa aaagaataga cattatggtc accatgggct 1560ttgcacgaga tgaaataaat gatgccttaa taaatcagaa gtatgatgaa gttatggcta 1620cttatattct tctaggtaga aaaccacctg aatttgaagg tggtgaatcg ttatccagtg 1680gaaacttgtg tcagaggtcc cggcccagta gtgacttaaa caacagcact cttcagtccc 1740ctgctcacct gaaggtccag agaagtatct cagcaaatca gaagcagcgg cgtttcagtg 1800atcatgctgg tccatccatt cctcctgctg tatcatatac caaaagacct caggctaaca 1860gtgtggaaag tgaacagaaa gaggagtggg acaaagatgt ggctcgaaaa cttggcagca 1920caacagttgg atcaaaaagc gagatgactg caagccctct tgtagggcca gagaggaaaa 1980aatcttcaac tattccaagt aacaatgtgt attctggagg tagcatggca agaaggaata 2040catatgtctg tgaaaggacc acagatcgat acgtagcatt gcagaatgga aaagacagca 2100gccttacgga gatgtctgtg agtagcatat cttctgcagg ctcttctgtg gcctctgctg 2160tcccctcagc acgaccccgc caccagaagt ccatgtccac ttctggtcat cctattaaag 2220tcacactgcc aaccattaaa gacggctctg aagcttaccg gcctggtaca acccagagag 2280tgcctgctgc ttccccatct gctcacagta 
ttagtactgc gactccagac cggacccgtt 2340ttccccgagg gagctcaagc cgaagcactt tccatggtga acagctccgg gagcgacgca 2400gcgttgctta taatgggcca cctgcttcac catcccatga aacgggtgca tttgcacatg 2460ccagaagggg aacgtcaact ggtataataa gcaaaatcac atccaaattt gttcgcaggg 2520atccaagtga aggcgaagcc agtggcagaa ccgacacctc aagaagtaca tcaggggaac 2580caaaagaaag agacaaggaa gagggtaaag attctaagcc gcgttctttg cggttcacat 2640ggagtatgaa gaccactagt tcaatggacc ctaatgacat gatgagagaa atccgaaaag 2700tgttagatgc aaataactgt gattatgagc aaaaagagag atttttgctt ttctgtgtcc 2760atggagacgc tagacaggat agcctcgtgc agtgggagat ggaagtctgc aagttgccac 2820gactgtcact taatggggtt cgcttcaagc gaatatctgg gacatctatt gcctttaaga 2880acattgcatc aaaaatagca aatgagctta agctgtaaag aagtccaaat ttacaggttc 2940agggaagata catacatata tgaggtacag tttttgaatg tactggtaat gcctaatgtg 3000gtctgcctgt gaatctcccc atgtagaatt tgcccttaat gcaataaggt tatacatagt 3060tatgaactgt aaaattaaag tcagtatgaa ctataataaa tatctgtagc ttaaaaagta 3120ggttcacatg tacaggtaag tatattgtgt atttctgttc attttctgtt catagagttg 3180tataataaaa catgattgct taaaaacttg 3210102505DNAHomo sapiens 10gctggccgca caaaatgtcg gcccggacgc cattgccgac ggtgaacgag cgggacacgg 60aaaatcatac atctgtggat ggatatactg aaccacacat ccagcctacc aagtcgagta 120gcagacagaa catcccccgg tgtagaaact ccattacgtc agcaacagat gaacagcctc 180acattggaaa ttaccgttta caaaaaacaa tagggaaggg aaattttgcc aaagtcaaat 240tggcaagaca cgttctaact ggtagagagg ttgctgtgaa aataatagac aaaactcagc 300taaatcctac cagtctacaa aagttatttc gagaagtacg aataatgaag atactgaatc 360atcctaatat agtaaaattg tttgaagtta ttgaaacaga gaagactctc tatttagtca 420tggaatacgc gagtgggggt gaagtatttg attacttagt tgcccatgga agaatgaaag 480agaaagaggc ccgtgcaaaa tttaggcaga ttgtatctgc tgtacagtat tgtcatcaaa 540agtacattgt tcaccgtgat cttaaggctg aaaaccttct ccttgatggt gatatgaata 600ttaaaattgc tgactttggt tttagtaatg aatttacagt tgggaacaaa ttggacacat 660tttgtggaag cccaccctat gctgctcccg agcttttcca aggaaagaag tatgatgggc 720ctgaagtgga tgtgtggagt ctgggcgtca ttctctatac attagtcagt ggctccttgc 780ctttcgatgg ccagaattta 
aaggaactgc gagagcgagt tttacgaggg aagtaccgta 840ttcccttcta tatgtccaca gactgtgaaa atcttctgaa gaaattatta gtcctgaatc 900caataaagag aggcagcttg gaacaaataa tgaaagatcg atggatgaat gttggtcatg 960aagaggaaga actaaagcca tatactgagc ctgatccgga tttcaatgac acaaaaagaa 1020tagacattat ggtcaccatg ggctttgcac gagatgaaat aaatgatgcc ttaataaatc 1080agaagtatga tgaagttatg gctacttata ttcttctagg tagaaaacca cctgaatttg 1140aaggtggtga atcgttatcc agtggaaact tgtgtcagag gtcccggccc agtagtgact 1200taaacaacag cactcttcag tcccctgctc acctgaaggt ccagagaagt atctcagcaa 1260atcagaagca gcggcgtttc agtgatcatg ctggtccatc cattcctcct gctgtatcat 1320ataccaaaag acctcaggct aacagtgtgg aaagtgaaca gaaagaggag tgggacaaag 1380atgtggctcg aaaacttggc agcacaacag ttggatcaaa aagcgagatg actgcaagcc 1440ctcttgtagg gccagagagg aaaaaatctt caactattcc aagtaacaat gtgtattctg 1500gaggtagcat ggcaagaagg aatacatatg tctgtgaaag gaccacagat cgatacgtag 1560cattgcagaa tggaaaagac agcagcctta cggagatgtc tgtgagtagc atatcttctg 1620caggctcttc tgtggcctct gctgtcccct cagcacgacc ccgccaccag aagtccatgt 1680ccacttctgg tcatcctatt aaagtcacac tgccaaccat taaagacggc tctgaagctt 1740accggcctgg tacaacccag agagtgcctg ctgcttcccc atctgctcac agtattagta 1800ctgcgactcc agaccggacc cgttttcccc gagggagctc aagccgaagc actttccatg 1860gtgaacagct ccgggagcga cgcagcgttg cttataatgg gccacctgct tcaccatccc 1920atgaaacggg tgcatttgca catgccagaa ggggaacgtc aactggtata ataagcaaaa 1980tcacatccaa atttgttcgc agggatccaa gtgaaggcga agccagtggc agaaccgaca 2040cctcaagaag tacatcaggg gaaccaaaag aaagagacaa ggaagagggt aaagattcta 2100agccgcgttc tttgcggttc acatggagta tgaagaccac tagttcaatg gaccctaatg 2160acatgatgag agaaatccga aaagtgttag atgcaaataa ctgtgattat gagcaaaaag 2220agagattttt gcttttctgt gtccatggag acgctagaca ggatagcctc gtgcagtggg 2280agatggaagt ctgcaagttg ccacgactgt cacttaatgg ggttcgcttc aagcgaatat 2340ctgggacatc tattgccttt aagaacattg catcaaaaat agcaaatgag cttaagctgt 2400aaagaagtcc aaatttacag gttcagggaa gatacataca tatatgaggt acagtttttg 2460aatgtactgg taatgcctaa tgtggtctgc ctgtgaatct cccca 
2505114638DNAHomo sapiens 11ggcgcggcgg cggcggtggc tgtgaccgcg cggaccgagc cgagacattc gcgccggggg 60atcgggcgcc gccgccgctg ggccccgggc gcgtggatgc ggctgggtcg ggcggcgccg 120tacacctgag gcggagaacg gggcgcggcg cgggtgacgc tgtcagggcc gcggttcctg 180acgcccaggc gctcgccagg acgagccagg cagtgatttg aggcaccggc ttcaccttca 240cccatggtcc ggagagccta gcggggctcg ccaccgcctc ccggctcccc ttccacgcct 300catcctgcca gcctcgccgc cccgccagcg ccgggcaacc gcctcgcccg aagccctccc 360tcgttactgt ccgcataccc cggcggcgcc gccgcgggaa gcggctcccc ctcctcttcc 420tccgcgtcct cttccctctt tcccccgccg gggccgcttg ttgcaccgcc ccgcggcctg 480cgggagccgc tcgccccggc cttgtgctcg cgtccgcacc cctttcctgt cgccccccgg 540ggcccgcacc acagcccggc cggcgagacc ccggccagac cccgctgccc gcacaaaatg 600tcggcccgga cgccattgcc gacggtgaac gagcgggaca cggaaaatca tacatctgtg 660gatggatata ctgaaccaca catccagcct accaagtcga gtagcagaca gaacatcccc 720cggtgtagaa actccattac gtcagcaaca gatgaacagc ctcacattgg aaattaccgt 780ttacaaaaaa caatagggaa gggaaatttt gccaaagtca aattggcaag acacgttcta 840actggtagag aggttgctgt gaaaataata gacaaaactc agctaaatcc taccagtcta 900caaaagttat ttcgagaagt acgaataatg aagatactga atcatcctaa tataggtgaa 960gtatttgatt acttagttgc ccatggaaga atgaaagaga aagaggcccg tgcaaaattt 1020aggcagattg tatctgctgt acagtattgt catcaaaagt acattgttca ccgtgatctt 1080aaggctgaaa accttctcct tgatggtgat atgaatatta aaattgctga ctttggtttt 1140agtaatgaat ttacagttgg gaacaaattg gacacatttt gtggaagccc accctatgct 1200gctcccgagc ttttccaagg aaagaagtat gatgggcctg aagtggatgt gtggagtctg 1260ggcgtcattc tctatacatt agtcagtggc tccttgcctt tcgatggcca gaatttaaag 1320gaactgcgag agcgagtttt acgagggaag taccgtattc ccttctatat gtccacagac 1380tgtgaaaatc ttctgaagaa attattagtc ctgaatccaa taaagagagg cagcttggaa 1440caaataatga aagatcgatg gatgaatgtt ggtcatgaag aggaagaact aaagccatat 1500actgagcctg atccggattt caatgacaca aaaagaatag acattatggt caccatgggc 1560tttgcacgag atgaaataaa tgatgcctta ataaatcaga agtatgatga agttatggct 1620acttatattc ttctaggtag aaaaccacct gaatttgaag gtggtgaatc gttatccagt 1680ggaaacttgt gtcagaggtc 
ccggcccagt agtgacttaa acaacagcac tcttcagtcc 1740cctgctcacc tgaaggtcca gagaagtatc tcagcaaatc agaagcagcg gcgtttcagt 1800gatcatgctg gtccatccat tcctcctgct gtatcatata ccaaaagacc tcaggctaac 1860agtgtggaaa gtgaacagaa agaggagtgg gacaaagatg tggctcgaaa acttggcagc 1920acaacagttg gatcaaaaag cgagatgact gcaagccctc ttgtagggcc agagaggaaa 1980aaatcttcaa ctattccaag taacaatgtg tattctggag gtagcatggc aagaaggaat 2040acatatgtct gtgaaaggac cacagatcga tacgtagcat tgcagaatgg aaaagacagc 2100agccttacgg agatgtctgt gagtagcata tcttctgcag gctcttctgt ggcctctgct 2160gtcccctcag cacgaccccg ccaccagaag tccatgtcca cttctggtca tcctattaaa 2220gtcacactgc caaccattaa agacggctct gaagcttacc ggcctggtac aacccagaga 2280gtgcctgctg cttccccatc tgctcacagt attagtactg cgactccaga ccggacccgt 2340tttccccgag ggagctcaag ccgaagcact ttccatggtg aacagctccg ggagcgacgc 2400agcgttgctt ataatgggcc acctgcttca ccatcccatg aaacgggtgc atttgcacat 2460gccagaaggg gaacgtcaac tggtataata agcaaaatca catccaaatt tgttcgcaga 2520agtacatcag gggaaccaaa agaaagagac aaggaagagg gtaaagattc taagccgcgt 2580tctttgcggt tcacatggag tatgaagacc actagttcaa tggaccctaa tgacatgatg 2640agagaaatcc gaaaagtgtt agatgcaaat aactgtgatt atgagcaaaa agagagattt 2700ttgcttttct gtgtccatgg agacgctaga caggatagcc tcgtgcagtg ggagatggaa 2760gtctgcaagt tgccacgact gtcacttaat ggggttcgct tcaagcgaat atctgggaca 2820tctattgcct ttaagaacat
tgcatcaaaa atagcaaatg agcttatgct gtaaagaagt 2880ccaaatttac aggttcaggg aagatacata catatatgag gtacagtttt tgaatgtact 2940ggtaatgcct aatgtggtct gcctgtgaat ctccccatgt agaatttgcc cttaatgcaa 3000taaggttata catagttatg aactgtaaaa ttaaagtcag tatgaactat aataaatatc 3060tgtagcttaa aaagtaggtt cacatgtaca ggtaagtata ttgtgtattt ctgttcattt 3120tctgttcata gagttgtata ataaaacatg attgcttaaa aacttgtata gttgtctaga 3180tttctgcacc tgaatgtatg tttgatgctt tgatttgaaa atgttcttcc ctgttattta 3240cattctggtg ggtttttaaa attcttacct ccatcatgca attttgaaaa ttgtgtccag 3300aattaaaagt gcatagaaat agcctttaca attgtagcat ggacctttaa aaattgtttt 3360aaaatcttat ttaaatttaa accagaagct gaaaaataga tcagctttat tatacacaaa 3420attattactg cttatctttg ctcttttcct tgttatcccg caaggtttag ttgagaagat 3480acaaaatgtt tacagtgttg gcacttagag tttttaaatt caagtacatg aaattcagta 3540atagcattgc cttgagctaa ctaggaagta ccgggaaaaa agttaaatct acatcaagtt 3600tcttttgaac tttgaagtgt tttctgaccc actgctaact gtagcaacaa aatttaaaag 3660aaaaaaaaca tactttatct ggctattata acataaactg tcacgtaggt ttgctgcctt 3720cagaataccg caatttaatt gcgggaatat aataatattg ggactgtttc acagcacaaa 3780ctcatcttta cagtgttgat caatgcatca gttaagaaat aatgccacct caggaattaa 3840ctggcattgg gaacatttgc ctcattctcc tgctatcctc ttcattcacc cctgccactg 3900taatatctat aagtacttaa gagacttgtg agcaaaacat actatttata acagtatatg 3960attgatttat gcttatgtgg ttgttcagtt tgttcccatg taactcgttt gttttaaata 4020ttttgccaga tttcttgtat ttattccaca tcattatgcc tataatgtgc cgctttgtga 4080ttgggcattt gcctactttt ctttcataat tagtgatata tgcgatgtaa aaccactagt 4140aaaggtacat tttaatactt gttattttat actgaattag ccttggaggt tgactgtgca 4200atgttattta ctgttgtaat tactgtaata ccaacatatg ggccccatct gcacactcct 4260gaaaaacaga aagtgtattc aaattttatc agtttaaaga aaataaagct gtgataaata 4320ctgtaattcc aacctacatt agaaggtcta agtgtaggtg atgtgccatt ccataatggc 4380ttccagacta gggtgaattt tatgttctgt actgtactgt gatgtagctt tcttctgtaa 4440cagttatgtt ttaaaattaa gtgagttttt tttttgcctt agcaaagggt ggtgtttgaa 4500aaaaaaaatg tgtagcccct ttttaaccta gtgttcattc aaaaaaaaat 
tgatgcaaat 4560ctttattcac tttcactggt gcacactgaa attttacttg aacagttctc ataataaagc 4620acttgtcttt tgctcttt 4638122720DNAHomo sapiens 12tcatggaata cgcgagtggg ggtgaagtat ttgattactt agttgcccat ggaagaatga 60aagagaaaga ggcccgtgca aaatttaggc agattgtatc tgctgtacag tattgtcatc 120aaaagtacat tgttcaccgt gatcttaagg ctgaaaacct tctccttgat ggtgatatga 180atattaaaat tgctgacttt ggttttagta atgaatttac agttgggaac aaattggaca 240cattttgtgg aagcccaccc tatgctgctc ccgagctttt ccaaggaaag aagtatgatg 300gtcctgaagt ggatgtgtgg agtctgggcg tcattctcta tacattagtc agtggctcct 360tgcctttcga tggccagaat ttaaaggaac tgcgagagcg agttttacga gggaagtacc 420gtattccctt ctatatgtcc acagactgtg aaaatcttct gaagaaatta ttagtcctga 480atccaataaa gagaggcagc ttggaacaaa taatgaaaga tcgatggatg aatgttggtc 540atgaagagga agaactaaag ccatatactg agcctgatcc ggatttcaat gacacaaaaa 600gaatagacat tatggtcacc atgggctttg cacgagatga aataaatgat gccttaataa 660atcagaagta tgatgaagtt atggctactt atattcttct aggtagaaaa ccacctgaat 720ttgaaggtgg tgaatcgtta tccagtggaa acttgtgtca gaggtcccgg cccagtagtg 780acttaaacaa cagcactctt cagtcccctg ctcacctgaa ggtccagaga agtatctcag 840caaatcagaa gcagcggcgt ttcagtgatc atgctggtcc atccattcct cctgctgtat 900catataccaa aagacctcag gctaacagtg tggaaagtga acagaaagag gagtgggaca 960aagatgtggc tcgaaaactt ggcagcacaa cagttggatc aaaaagcgag atgactgcaa 1020gccctcttgt agggccagag aggaaaaaat cttcaactat tccaagtaac aatgtgtatt 1080ctggaggtag catggcaaga aggaatacat atgtctgtga aaggaccaca gatcgatacg 1140tagcattgca gaatggaaaa aacagcagcc ttacggagat gtctgtgagt agcatatctt 1200ctgcaggctc ttctgtggcc tctgctgccc cctcagcacg accccgccac cagaagtcca 1260tgtccacttc tggtcatcct attaaagtca cactgccaac cattaaagac ggctctgaag 1320cttaccggcc tggtacaacc cagagagtgc ctgctgcttc cccatctgct cacagtatta 1380gtactgcgac tccagaccgg acccgttttc cccgagggag ctcaagccga agcactttcc 1440atggtgaaca gctccgggag cgacgcagcg ttgcttataa tgggccacct gcttcaccat 1500cccatgaaac gggtgcattt gcacatgcca gaaggggaac gtcaactggt ataataagca 1560aaatcacatc caaatttgtt cgcagaagta catcagggga accaaaagaa 
agagacaagg 1620aagagggtaa agattctaag ccgcgttctt tgcggttcac atggagtatg aagaccacta 1680gttcaatgga ccctaatgac atgatgagag aaatccgaaa agtgttagat gcaaataact 1740gtgattatga gcaaaaagag agatttttgc ttttctgtgt ccatggagac gctagacagg 1800atagcctcgt gcagtgggag atggaagtct gcaagttgca cgactgtcac ttaatggggt 1860tcgcttcaag cgaatatctg ggacatctat tgcctttaag aacattgcat caaaaatagc 1920aaatgagctt aagctgtaaa gaagtccaaa tttacaggtt cagggaagat acatacatat 1980atgaggtaca gtttttgaat gtactggtaa tgcctaatgt ggtctgcctg tgaatctccc 2040catgtagaat ttgcccttaa tgcaataagg ttatacatag ttatgaactg taaaattaaa 2100gtcagtatga actataataa atatctgtag cttaaaaagt aggttcacat gtacaggtaa 2160gtatattgtg tatttctgtt cattttctgt tcatagagtt gtataataaa acatgattgc 2220ttaaaaactt gtatagttgt ctagatttct gcacctgaat gtatgtttga tgctttgatt 2280tgaaaatgtt cttccctgtt atttacattc cggtgggttt ttaaaattct tacctccatc 2340atgcaatttt gaaaattgtg tccagaatta aaagtgcata gaaatagcct ttacaattgt 2400agcatggacc tttaaaaatt gttttaaaat cttatttaaa tttaaaccag aagctgaaaa 2460atagatcagc tttattatac acaaaattat tactgcttat ctttgctctt ttccttgtta 2520tcccgcaagg tttagttgag aagatacaaa atgtttacag tgttggcact tagagttttt 2580aaattcaagt acatgaaatt cagtaatagc attgccttga gctaactagg aagtaccggg 2640aaaaaagtta aatctacatc aagtttcttt tgaactttga agtgttttct gacccactgc 2700taactgtagc aacaaaattt 2720132698DNAHomo sapiens 13gagctgaaat tcgcggtgcg acgggaggga gtggagaagg aggtgagggg gcccaggatc 60gcggggcgcc ctgaggcaag gggacgccgg tgggtcgaag cgcagcccgc cgcccgcagg 120ctcggctccg ccactgccgc cctcccggtc tcctcgcctc gggcgccgag gcagggagag 180aatgagcccc gggacccgcc gggggacggc ccgggccagg cccgggatct agaacggccg 240tagggggaag ggagccgccc tccccacggc gccttttcgg aactgccgtg gactcgagga 300cgctggtcgc cggcctccta gggctgtgct gttttgtttt gaccctcgca ttgtgcagaa 360ttaaagtgca gtaaaatgtc cactaggacc ccattgccaa cggtgaatga acgagacact 420gaaaaccaca cgtcacatgg agatgggcgt caagaagtta cctctcgtac cagccgctca 480ggagctcggt gtagaaactc tatagcctcc tgtgcagatg aacaacctca catcggaaac 540tacagactgt tgaaaacaat cggcaagggg aattttgcaa aagtaaaatt 
ggcaagacat 600atccttacag gcagagaggt tgcaataaaa ataattgaca aaactcagtt gaatccaaca 660agtctacaaa agctcttcag agaagtaaga ataatgaaga ttttaaatca tcccaatata 720gtgaagttat tcgaagtcat tgaaactgaa aaaacactct acctaatcat ggaatatgca 780agtggaggtg aagtatttga ctatttggtt gcacatggca ggatgaagga aaaagaagca 840agatctaaat ttagacagat tgtgtctgca gttcaatact gccatcagaa acggatcgta 900catcgagacc tcaaggctga aaatctattg ttagatgccg atatgaacat taaaatagca 960gatttcggtt ttagcaatga atttactgtt ggcggtaaac tcgacacgtt ttgtggcagt 1020cctccatacg cagcacctga gctcttccag ggcaagaaat atgacgggcc agaagtggat 1080gtgtggagtc tgggggtcat tttatacaca ctagtcagtg gctcacttcc ctttgatggg 1140caaaacctaa aggaactgag agagagagta ttaagaggga aatacagaat tcccttctac 1200atgtctacag actgtgaaaa ccttctcaaa cgtttcctgg tgctaaatcc aattaaacgc 1260ggcactctag agcaaatcat gaaggacagg tggatcaatg cagggcatga agaagatgaa 1320ctcaaaccat ttgttgaacc agagctagac atctcagacc aaaaaagaat agatattatg 1380gtgggaatgg gatattcaca agaagaaatt caagaatctc ttagtaagat gaaatacgat 1440gaaatcacag ctacatattt gttattgggg agaaaatctt cagagctgga tgctagtgat 1500tccagttcta gcagcaatct ttcacttgct aaggttaggc cgagcagtga tctcaacaac 1560agtactggcc agtctcctca ccacaaagtg cagagaagtg tttcttcaag ccaaaagcaa 1620agacgctaca gtgaccatgc tggaccagct attccttctg ttgtggcgta tccgaaaagg 1680agtcagacaa gcactgcaga tggtgacctc aaagaagatg gaatttcctc ccggaaatca 1740agtggcagtg ctgttggagg aaagggaatt gctccagcca gtcccatgct tgggaatgca 1800agtaatccta ataaggcgga tattcctgaa cgcaagaaaa gctccactgt ccctagtagt 1860aacacagcat ctggtggaat gacacgacga aatacttatg tttgcagtga gagaactaca 1920gctgatagac actcagtgat tcagaatggc aaagaaaaca gcactattcc tgatcagaga 1980actccagttg cttcaacaca cagtatcagt agtgcagcca ccccagatcg aatccgcttc 2040ccaagaggca ctgccagtcg tagcactttc cacggccagc cccgggaacg gcgaaccgca 2100acatataatg gccctcctgc ctctcccagc ctgtcccatg aagccacacc attgtcccag 2160actcgaagcc gaggctccac taatctcttt agtaaattaa cttcaaaact cacaaggagt 2220cgcaatgtat ctgctgagca aaaagatgaa aacaaagaag caaagcctcg atccctacgc 2280ttcacctgga gcatgaaaac 
cactagttca atggatcccg gggacatgat gcgggaaatc 2340cgcaaagtgt tggacgccaa taactgcgac tatgagcaga gggagcgctt cttgctcttc 2400tgcgtccacg gagatgggca cgcggagaac ctcgtgcagt gggaaatgga agtgtgcaag 2460ctgccaagac tgtctctgaa cggggtccgg tttaagcgga tatcggggac atccatagcc 2520ttcaaaaata ttgcttccaa aattgccaat gagctaaagc tgtaacccag tgattatgat 2580gtaaattaag tagcaagtaa agtgttttcc tgaacactga tggaaatgta tagaataata 2640tttaggcaat aacgtctgca tcttctaaat catgaaatta aagtctgagg acgagagc 2698142914DNAHomo sapiens 14gacggcccgg gccaggcccg ggatctagaa cggccgtagg gggaagggag ccgccctccc 60cacggcgcct tttcggaact gccgtggact cgaggacgct ggtcgccggc ctcctagggc 120tgtgctgttt tgttttgacc ctcgcattgt gcagaattaa agtgcagtaa aatgtccact 180aggaccccat tgccaacggt gaatgaacga gacactgaaa accacacgtc acatggagat 240gggcgtcaag aagttacctc tcgtaccagc cgctcaggag ctcggtgtag aaactctata 300gcctcctgtg cagatgaaca acctcacatc ggaaactaca gactgttgaa aacaatcggc 360aaggggaatt ttgcaaaagt aaaattggca agacatatcc ttacaggcag agaggttgca 420ataaaaataa ttgacaaaac tcagttgaat ccaacaagtc tacaaaagct cttcagagaa 480gtaagaataa tgaagatttt aaatcatccc aatatagtga agttattcga agtcattgaa 540actcaaaaaa cactctacct aatcatggaa tatgcaagtg gaggtaaagt atttgactat 600ttggttgcac atggcaggat gaaggaaaaa gaagcaagat ctaaatttag acagattgtg 660tctgcagttc aatactgcca tcagaaacgg atcgtacatc gagacctcaa ggctgaaaat 720ctattgttag atgccgatat gaacattaaa atagcagatt tcggttttag caatgaattt 780actgttggcg gtaaactcga cacgttttgt ggcagtcctc catacgcagc acctgagctc 840ttccagggca agaaatatga cgggccagaa gtggatgtgt ggagtctggg ggtcatttta 900tacacactag tcagtggctc acttcccttt gatgggcaaa acctaaagga actgagagag 960agagtattaa gagggaaata cagaattccc ttctacatgt ctacagactg tgaaaacctt 1020ctcaaacgtt tcctggtgct aaatccaatt aaacgcggca ctctagagca aatcatgaag 1080gacaggtgga tcaatgcagg gcatgaagaa gatgaactca aaccatttgt tgaaccagag 1140ctagacatct cagaccaaaa aagaatagat attatggtgg gaatgggata ttcacaagaa 1200gaaattcaag aatctcttag taagatgaaa tacgatgaaa tcacagctac atatttgtta 1260ttggggagaa aatcttcaga ggttaggccg agcagtgatc tcaacaacag 
tactggccag 1320tctcctcacc acaaagtgca gagaagtgtt tcttcaagcc aaaagcaaag acgctacagt 1380gaccatgctg gaccaggtat tccttctgtt gtggcgtatc cgaaaaggag tcagaccagc 1440actgcagata gtgacctcaa agaagatgga atttcctccc ggaaatcaac tggcagtgct 1500gttggaggaa agggaattgc tccagccagt cccatgcttg ggaatgcaag taatcctaat 1560aaggcggata ttcctgaacg caagaaaagc tccactgtcc ctagtagtaa cacagcatct 1620ggtggaatga cacgacgaaa tacttatgtt tgcagtgaga gaactacaga tgatagacac 1680tcagtgattc agaatggcaa agaaaacagc actattcctg atcagagaac tccagttgct 1740tcaacacaca gtatcagtag tgcagccacc ccagatcgaa tccgcttccc aagaggcact 1800gccagtcgta gcactttcca cggccagccc cgggaacggc gaaccgcaac atataatggc 1860cctcctgcct ctcccagcct gtcccatgaa gccacaccat tgtcccagac tcgaagccga 1920ggctccacta ctctctttag taaattaact tcaaaactca caaggagtcg caatgtatct 1980gctaagcaaa aagatgaaaa caaagaagca aagcctcgat ccctacgctt cacctggagc 2040atgaaaacca ctagttcaat ggatcccggg gacatgatgc gggaaatccg caaagtgttg 2100gacgccaata actgcgacta tgagcagagg gagcgcttct tgctcttctg cgtccacgga 2160gatgggcacg cggagaacct cgtgcagtgg gaaatggaag tgtgcaagct gccaagactg 2220tctctgaacg gggtccggtt taagcggata tcggggacat ccatagcctt caaaaatatt 2280gcttccaaaa ttgccaatga gctaaagctg taacccagtg attatgatgt aaattaagta 2340gcaagtaaag tgttttcctg aacactgatg gaaatgtata gaataatatt taggcaataa 2400cgtctgcatc ttctaaatca tgaaattaaa gtctgaggac gagagcacgc ctgggagcga 2460aagctggcct tttttctacg aatgcactac attaaagatg tgcaacctat gcgccccctg 2520ccctacttcc gttaccctga gagtcggcgt gtggccccat ctccatgtgc ctcccgtctg 2580ggtgggtgtg agagtggacg gtatgtgtgt gaagtggtgt atatggaagc atctccctac 2640actggcagcc agtcattact agtacctctg cgggagatca tccggtgcta aaacattaca 2700gttgccaagg aggaaaatac tgaatgactg ctaagaatta accttaagac cagttcatag 2760ttaatacagg tttacagttc atgcctgtgg ttttgtgttt gttgttttgt gtttttttag 2820tgcaaaaggt ttaaatttat agttgtgaac attgcttgtg tgtgtttttc taagtagatt 2880cacaagataa ttaaaaattc actttttctc aggt 2914153895DNAHomo sapiens 15ctgcaggaat tccgatcctt ccgcaggttc acctacggaa accttgttac gacttttact 60tcctctagat agtcaagttc 
gaccgtcttc tcagcgctcc gccagggccg tgggccgacc 120ccggcggggc cgatccgagg gcctcactaa accatccaat cggtagtagc gacgggcggt 180gtgtacaaag ggcagggact taatcaacgc aagcttatga cccgcactta ctgggaattc 240ctcgttcatg gggaataatt gcaatccccg atccccatca cgaatggggt tcaacgggtt 300acccgcgcct gccggcgtag ggtaggcaca cgctgagcca gtcagtgtag cgcgcgtgca 360gccccggaca tctaagggca tcacagacct gttattgctc aatctcgggt ggctgaacgc 420cacttgtccc tctaagaagt tgggggacgc cgaccgctcg ggggtcgcgt aactagttag 480catgccagag tctcgttcgt tatcggaatt aaccagacaa atcgctccac caactaagaa 540cggccatgca ccaccaccca cggaatcgag aaagagctat caatctgtca atcctgtccg 600tgtccgggcc gggtgaggtt tcccgtgttg agtcaaatta agccgcaggc tccactcctg 660gtggtgccct tccgtcaatt cctttaagtt tcagctttgc aaccatactc cccccggaac 720ccaaagactt tggtttcccg gaagctgccc ggcgggtcat gggaataacg ccgccgcatc 780gccggtcggc atcgtttatg gtcggaacta cgacggtatc tgatcgtctt cgaacctccg 840actttcgttc ttgattaatg aaaacattct tggcaaatgc tttcgctctg gtccgtcttg 900cgccggtcca agaatttcgg aattccgcag cggcggccag cagggcggag gctgaggcag 960caagctcgct agagagggag aagcagtcgg gcgcaggcgc ctcctccgca gcccgctcca 1020tggtcggcgc ccacagcccg cggcggcctg tcttgcgctc cacttccttc acatcctcct 1080ccgcctcctc gttttcaggc gccgccggcg gcgctgtgtg gaggcccgcg agctgaaatt 1140cgcggtgcga cgggagggag tggagaagga ggtgaggggg cccaggatcg cggggcgccc 1200tgaggcaagg ggacgccggc gggccgaagc gcagcccgcc gcccgcaggc tcggctccgc 1260cactgccgcc ctcccggtct cctcgcctcg gccgccgagg cagggagaga atgagccccg 1320ggacccgccg ggggacggcc cgggccaggc ccgggatcta gacggccgta gggggaaggg 1380agccgccctc cccacggcgc cttttcggaa ctgccgtgga ctcgaggacg ctggtcgccg 1440gcctcctagg gctgtgctgt tttgttttga ccctcgcatt gtgcagaatt aaagtgcagt 1500aaaatgtcca ctaggacccc attgccaacg gtgaatgaac gagacactga aaaccacacg 1560tcacatggag atgggcgtca agaagttacc tctcgtacca gccgctcagg agctcggtgt 1620agaaactcta tagcctcctg tgcagatgaa caacctcaca tcggaaacta cagactgttg 1680aaaacaatcg gcaaggggaa ttttgcaaaa gtaaaattgg caagacatat ccttacaggc 1740agagaggttg caataaaaat aattgacaaa actcagttga atccaacaag tctacaaaag 
1800ctcttcagag aagtaagaat aatgaagatt ttaaatcatc ccaatatagt gaagttattc 1860gaagtcattg aaactgaaaa aacactctac ctaatcatgg aatatgcaag tggaggtgaa 1920gtatttgact atttggttgc acatggcaag atgaaggaaa aagaagcaag atctaaattt 1980agacagggtt gtcaagctgg acagactatt aaagttcaag tctcctttga tttgcttagt 2040ctgatgttta catttattgt gtctgcagtt caatactgcc atcagaaacg gatcgtacat 2100cgagacctca aggctgaaaa tctattgtta gatgccgata tgaacattaa aatagcagat 2160ttcggtttta gcaatgaatt tactgttggc ggtaaactcg acacgttttg tggcagtcct 2220ccatacgcag cacctgagct cttccagggc aagaaatatg acgggccaga agtggatgtg 2280tggagtctgg gggtcatttt atacacacta gtcagtggct cacttccctt tgatgggcaa 2340aacctaaagg aactgagaga gagagtatta agagggaaat acagaattcc cttctacatg 2400tctacagact gtgaaaacct tctcaaacgt ttcctggtgc taaatccaat taaacgcggc 2460actctagagc aaatcatgaa ggacaggtgg atcaatgcag ggcatgaaga agatgaactc 2520aaaccatttg ttgaaccaga gctagacatc tcagaccaaa aaagaataga tattatggtg 2580ggaatgggat attcacaaga agaaattcaa gaatctctta gtaagatgaa atacgatgaa 2640atcacagcta catatttgtt attggggaga aaatcttcag agctggatgc tagtgattcc 2700agttctagca gcaatctttc acttgctaag gttaggccga gcagtgatct caacaacagt 2760actggccagt ctcctcacca caaagtgcag agaagtgttt cttcaagcca aaagcaaaga 2820cgctacagtg accatgctgg accagctatt ccttctgttg tggcgtatcc gaaaaggagt 2880cagacaagca ctgcagatgg tgacctcaaa gaagatggaa tttcctcccg gaaatcaagt 2940ggcagtgctg ttggaggaaa gggaattgct ccagccagtc ccatgcttgg gaatgcaagt 3000aatcctaata aggcggatat tcctgaacgc aagaaaagct ccactgtccc tagtagtaac 3060acagcatctg gtggaatgac acgacgaaat acttatgttt gcagtgagag aactacagct 3120gatagacact cagtgattca gaatggcaaa gaaaacagca ctattcctga tcagagaact 3180ccagttgctt caacacacag tatcagtagt gcagccaccc cagatcgaat ccgcttccca 3240agaggcactg ccagtcgtag cactttccac ggccagcccc gggaacggcg aaccgcaaca 3300tataatggcc ctcctgcctc tcccagcctg tcccatgaag ccacaccatt gtcccagact 3360cgaagccgag gctccactaa tctctttagt aaattaactt caaaactcac aaggagtcgc 3420aatgtatctg ctgagcaaaa agatgaaaac aaagaagcaa agcctcgatc cctacgcttc 3480acctggagca tgaaaaccac tagttcaatg 
gatcccgggg acatgatgcg ggaaatccgc 3540aaagtgttgg acgccaataa ctgcgactat gagcagaggg agcgcttctt gctcttctgc 3600gtccacggag atgggcacgc ggagaacctc gtgcagtggg aaatggaagt gtgcaagctg 3660ccaagactgt ctctgaacgg ggtccggttt aagcggatat cggggacatc catagccttc 3720aaaaatattg cttccaaaat tgccaatgag ctaaagctgt aacccagtga ttatgatgta 3780aattaagtag caagtaaagt gttttcctga acactgatgg aaatgtatag aataatattt 3840aggcaataac gtctgcatct tctaaatcat gaaattaaag tctgaggacg agagc 3895162145DNAHomo sapiens 16atgtccacta ggaccccatt gccaacggtg aatgaacgag acactgaaaa ccacacgtca 60catggagatg ggcgtcaaga agttacctct cgtaccagcc gctcaggagc tcggtgtaga 120aactctatag cctcctgtgc agatgaacaa cctcacatcg gaaactacag actgttgaaa 180acaatcggca aggggaattt tgcaaaagta aaattggcaa gacatatcct tacaggcaga 240gaggttgcaa taaaaataat tgacaaaact cagttgaatc caacaagtct acaaaagctc 300ttcagagaag taagaataat gaagatttta aatcatccca atatagtgaa gttattcgaa 360gtcattgaaa ctgaaaaaac actctaccta atcatggaat atgcaagtgg aggtgaagta 420tttgactatt tggttgcaca tggcaggatg aaggaaaaag aagcaagatc taaatttaga 480cagattgtgt ctgcagttca atactgccat cagaaacgga tcgtacatcg agacctcaag 540gctgaaaatc tattgttaga tgccgatatg aacattaaaa tagcagattt cggttttagc 600aatgaattta ctgttggcgg taaactcgac acgttttgtg gcagtcctcc atacgcagca 660cctgagctct tccagggcaa gaaatatgac gggccagaag tggatgtgtg gagtctgggg 720gtcattttat acacactagt cagtggctca cttccctttg atgggcaaaa cctaaaggaa 780ctgagagaga gagtattaag agggaaatac agaattccct
tctacatgtc tacagactgt 840gaaaaccttc tcaaacgttt cctggtgcta aatccaatta aacgcggcac tctagagcaa 900atcatgaagg acaggtggat caatgcaggg catgaagaag atgaactcaa accatttgtt 960gaaccagagc tagacatctc agaccaaaaa agaatagata ttatggtggg aatgggatat 1020tcacaagaag aaattcaaga atctcttagt aagatgaaat acgatgaaat cacagctaca 1080tatttgttat tggggagaaa atcttcagag gttaggccga gcagtgatct caacaacagt 1140actggccagt ctcctcacca caaagtgcag agaagtgttt cttcaagcca aaagcaaaga 1200cgctacagtg accatgctgg accagctatt ccttctgttg tggcgtatcc gaaaaggagt 1260cagaccagca ctgcagatag tgacctcaaa gaagatggaa tttcctcccg gaaatcaagt 1320ggcagtgctg ttggaggaaa gggaattgct ccagccagtc ccatgcttgg gaatgcaagt 1380aatcctaata aggcggatat tcctgaacgc aagaaaagct ccactgtccc tagtagtaac 1440acagcatctg gtggaatgac acgacgaaat acttatgttt gcagtgagag aactacagct 1500gatagacact cagtgattca gaatggcaaa gaaaacagca ctattcctga tcagagaact 1560ccagttgctt caacacacag tatcagtagt gcagccaccc cagatcgaat ccgcttccca 1620agaggcactg ccagtcgtag cactttccac ggccagcccc gggaacggcg aaccgcaaca 1680tataatggcc ctcctgcctc tcccagcctg tcccatgaag ccacaccatt gtcccagact 1740cgaagccgag gctccactaa tctctttagt aaattaactt caaaactcac aaggagtcgc 1800aatgtatctg ctgagcaaaa agatgaaaac aaagaagcaa agcctcgatc cctacgcttc 1860acctggagca tgaaaaccac tagttcaatg gatcccgggg acatgatgcg ggaaatccgc 1920aaagtgttgg acgccaataa ctgcgactat gagcagaggg agcgcttctt gctcttctgc 1980gtccacggag atgggcacgc ggagaacctc gtgcagtggg aaatggaagt gtgcaagctg 2040ccaagactgt ctctgaacgg ggtccggttt aagcggatat cggggacatc catagccttc 2100aaaaatattg cttccaaaat tgccaatgag ctaaagctgt aaccc 2145172193DNAHomo sapiens 17atgtccacta ggaccccatt gccaacggtg aatgaacgag acactgaaaa ccacacgtca 60catggagatg ggcgtcaaga agttacctct cgtaccagcc gctcaggagc tcggtgtaga 120aactctatag cctcctgtgc agatgaacaa cctcacatcg gaaactacag actgttgaaa 180acaatcggca aggggaattt tgcaaaagta aaattggcaa gacatatcct tacaggcaga 240gaggttgcaa taaaaataat tgacaaaact cagttgaatc caacaagtct acaaaagctc 300ttcagagaag taagaataat gaagatttta aatcatccca atatagtgaa gttattcgaa 360gtcattgaaa 
ctgaaaaaac actctaccta atcatggaat atgcaagtgg aggtgaagta 420tttgactatt tggttgcaca tggcaggatg aaggaaaaag aagcaagatc taaatttaga 480cagattgtgt ctgcagttca atactgccat cagaaacgga tcgtacatcg agacctcaag 540gctgaaaatc tattgttaga tgccgatatg aacattaaaa tagcagattt cggttttagc 600aatgaattta ctgttggcgg taaactcgac acgttttgtg gcagtcctcc atacgcagca 660cctgagctct tccagggcaa gaaatatgac gggccagaag tggatgtgtg gagtctgggg 720gtcattttat acacactagt cagtggctca cttccctttg atgggcaaaa cctaaaggaa 780ctgagagaga gagtattaag agggaaatac agaattccct tctacatgtc tacagactgt 840gaaaaccttc tcaaacgttt cctggtgcta aatccaatta aacgcggcac tctagagcaa 900atcatgaagg acaggtggat caatgcaggg catgaagaag atgaactcaa accatttgtt 960gaaccagagc tagacatctc agaccaaaaa agaatagata ttatggtggg aatgggatat 1020tcacaagaag aaattcaaga atctcttagt aagatgaaat acgatgaaat cacagctaca 1080tatttgttat tggggagaaa atcttcagag ctggatgcta gtgattccag ttctagcagc 1140aatctttcac ttgctaaggt taggccgagc agtgatctca acaacagtac tggccagtct 1200cctcaccaca aagtgcagag aagtgtttct tcaagccaaa agcaaagacg ctacagtgac 1260catgctggac cagctattcc ttctgttgtg gcgtatccga aaaggagtca gaccagcact 1320gcagatagtg acctcaaaga agatggaatt tcctcccgga aatcaagtgg cagtgctgtt 1380ggaggaaagg gaattgctcc agccagtccc atgcttggga atgcaagtaa tcctaataag 1440gcggatattc ctgaacgcaa gaaaagctcc actgtcccta gtagtaacac agcatctggt 1500ggaatgacac gacgaaatac ttatgtttgc agtgagagaa ctacagctga tagacactca 1560gtgattcaga atggcaaaga aaacagcact attcctgatc agagaactcc agttgcttca 1620acacacagta tcagtagtgc agccacccca gatcgaatcc gcttcccaag aggcactgcc 1680agtcgtagca ctttccacgg ccagccccgg gaacggcgaa ccgcaacata taatggccct 1740cctgcctctc ccagcctgtc ccatgaagcc acaccattgt cccagactcg aagccgaggc 1800tccactaatc tctttagtaa attaacttca aaactcacaa ggagtcgcaa tgtatctgct 1860gagcaaaaag atgaaaacaa agaagcaaag cctcgatccc tacgcttcac ctggagcatg 1920aaaaccacta gttcaatgga tcccggggac atgatgcggg aaatccgcaa agtgttggac 1980gccaataact gcgactatga gcagagggag cgcttcttgc tcttctgcgt ccacggagat 2040gggcacgcgg agaacctcgt gcagtgggaa atggaagtgt gcaagctgcc 
aagactgtct 2100ctgaacgggg tccggtttaa gcggatatcg gggacatcca tagccttcaa aaatattgct 2160tccaaaattg ccaatgagct aaagctgtaa ccc 2193183373DNAHomo sapiens 18caggcgcctc ctccgcagcc cgctccatgg tcggcgccca cagcccgcgg cggcctgtct 60tgcgctccac ttccttcaca tcctcctccg cctcctcgtt ttcaggcgcc gccggcggcg 120ctgtgtggag gcccgcgagc tgaaattcgc ggtgcgacgg gagggagtgg agaaggaggt 180gagggggccc aggatcgcgg ggcgccctga ggcaagggga cgccggcggg ccgaagcgca 240gcccgccgcc cgcaggctcg gctccgccac tgccgccctc ccggtctcct cgcctcggcc 300gccgaggcag ggagagaatg agccccggga cccgccgggg acggcccggg ccaggcccgg 360gatctagaac ggccgtaggg ggaagggagc cgccctcccc acggcgcctt ttcggaactg 420ccgtggactc gaggacgctg gtcgccggcc tcctagggct gtgctgtttt gttttgaccc 480tcgcattgtg cagaattaaa gtgcagtaaa atgtccacta ggaccccatt gccaacggtg 540aatgaacgag acactgaaaa ccacacgtca catggagatg ggcgtcaaga agttacctct 600cgtaccagcc gctcaggagc tcggtgtaga aactctatag cctcctgtgc agatgaacaa 660cctcacatcg gaaactacag actgttgaaa acaatcggca aggggaattt tgcaaaagta 720aaattggcaa gacatatcct tacaggcaga gaggttgcaa taaaaataat tgacaaaact 780cagttgaatc caacaagtct acaaaagctc ttcagagaag taagaataat gaagatttta 840aatcatccca atatagtgaa gttattcgaa gtcattgaaa ctgaaaaaac actctaccta 900atcatggaat atgcaagtgg aggtaaagta tttgactatt tggttgcaca tggcaggatg 960aaggaaaaag aagcaagatc taaatttaga cagattgtgt ctgcagttca atactgccat 1020cagaaacgga tcgtacatcg agacctcaag gctgaaaatc tattgttaga tgccgatatg 1080aacattaaaa tagcagattt cggttttagc aatgaattta ctgttggcgg taaactcgac 1140acgttttgtg gcagtcctcc atacgcagca cctgagctct tccagggcaa gaaatatgac 1200gggccagaag tggatgtgtg gagtctgggg gtcattttat acacactagt cagtggctca 1260cttccctttg atgggcaaaa cctaaaggaa ctgagagaga gagtattaag agggaaatac 1320agaattccct tctacatgtc tacagactgt gaaaaccttc tcaaacgttt cctggtgcta 1380aatccaatta aacgcggcac tctagagcaa atcatgaagg acaggtggat caatgcaggg 1440catgaagaag atgaactcaa accatttgtt gaaccagagc tagacatctc agaccaaaaa 1500agaatagata ttatggtggg aatgggatat tcacaagaag aaattcaaga atctcttagt 1560aagatgaaat acgatgaaat cacagctaca tatttgttat 
tggggagaaa atcttcagag 1620ctggatgcta gtgattccag ttctagcagc aatctttcac ttgctaaggt taggccgagc 1680agtgatctca acaacagtac tggccagtct cctcaccaca aagtgcagag aagtgtttct 1740tcaagccaaa agcaaagacg ctacagtgac catgctggac cagctattcc ttctgttgtg 1800gcgtatccga aaaggagtca gaccagcact gcagatagtg acctcaaaga agatggaatt 1860tcctcccgga aatcaagtgg cagtgctgtt ggaggaaagg gaattgctcc agccagtccc 1920atgcttggga atgcaagtaa tcctaataag gcggatattc ctgaacgcaa gaaaagctcc 1980actgtcccta gtagtaacac agcatctggt ggaatgacac gacgaaatac ttatgtttgc 2040agtgagagaa ctacagctga tagacactca gtgattcaga atggcaaaga aaacagcact 2100attcctgatc agagaactcc agttgcttca acacacagta tcagtagtgc agccacccca 2160gatcgaatcc gcttcccaag aggcactgcc agtcgtagca ctttccacgg ccagccccgg 2220gaacggcgaa ccgcaacata taatggccct cctgcctctc ccagcctgtc ccatgaagcc 2280acaccattgt cccagactcg aagccgaggc tccactaatc tctttagtaa attaacttca 2340aaactcacaa ggagaaacat gtcattcagg tttatcaaaa ggcttccaac tgaatatgag 2400aggaacggga gatatgaggg ctcaagtcgc aatgtatctg ctgagcaaaa agatgaaaac 2460aaagaagcaa agcctcgatc cctacgcttc acctggagca tgaaaaccac tagttcaatg 2520gatcccgggg acatgatgcg ggaaatccgc aaagtgttgg acgccaataa ctgcgactat 2580gagcagaggg agcgcttctt gctcttctgc gtccacggag atgggcacgc ggagaacctc 2640gtgcagtggg aaatggaagt gtgcaagctg ccaagactgt ctctgaacgg ggtccggttt 2700aagcggatat cggggacatc catagccttc aaaaatattg cttccaaaat tgccaatgag 2760ctaaagctgt aacccagtga ttatgatgta aattaagtag caagtaaagt gttttcctga 2820acactgatgg aaatgtatag aataatattt aggcaataac gtctgcatct tctaaatcat 2880gaaattaaag tctgaggacg agagcacgcc tgggagcgaa agctggcctt ttttctacga 2940atgcactaca ttaaagatgt gcaacctatg cgccccctgc cctacttccg ttaccctgag 3000agtcggcgtg tggccccatc tccatgtgcc tcccgtctgg gtgggtgtga gagtggacgg 3060tatgtgtgtg aagtggtgta tatggaagca tctccctaca ctggcagcca gtcattacta 3120gtacctctgc gggagatcat ccggtgctaa aacattacag ttgccaagga ggaaaatact 3180gaatgactgc taagaattaa ccttaagacc agttcatagt taatacaggt ttacagttca 3240tgcctgtggt tttgtgtttg ttgttttgtg tttttttagt gcaaaaggtt taaatttata 3300gttgtgaaca 
ttgcttgtgt gtgtttttct aagtagattc acaagataat taaaaattca 3360ctttttctca ggt 3373193609DNAHomo sapiensmisc_feature(3606)..(3606)"n" is A, C, G, or T 19cgcctccctc cgccgccgct tgggccggct ccgcgccccc tccgcggccc ccgcccgccc 60gcctgcccgc cgcccccatg gcgcccgggg tccccgctgc acggggccac taggaccctc 120ggcgtccctt cccctccccc gccctgcccc ctctcccgcc gcgcggaccc gggcgttctc 180ggcgcccagc ttttgagctc gcgtccccag gccggcgggg ggggagggga agagagggga 240ccctgggacc cccgcccccc ccacccggcc gcccctgccc cccgggaccc ggagaagatg 300tcttcgcgga cggtgctggc cccgggcaac gatcggaact cggacacgca tggcaccttg 360ggcagtggcc gctcctcgga caaaggcccg tcctggtcca gccgctcact gggtgcccgt 420tgccggaact ccatcgcctc ctgtcccgag gagcagcccc acgtgggcaa ctaccgcctg 480ctgaggacca ttgggaaggg caactctgcc aaagtcaagc tggctcggca catcctcact 540ggtcgggagg ttgccatcaa gattatcgac aaaacccagc tgaatcccag cagcctgcag 600aagctgttcc gagaagtccg catcatgaag ggcctaaacc accccaacat cgtgaagctc 660tttgaggtga ttgagactga gaagacgctg tacctggtga tggagtacgc aagtgctgga 720gaagtgtttg actacctcgt gtcgcatggc cgcatgaagg agaaggaagc tcgagccaag 780ttccgacaga ttgtttcggc tgtgcactat tgtcaccaga aaaatattgt acacagggac 840ctgaaggctg agaacctctt gctggatgcc gaggccaaca tcaagattgc tgactttggc 900ttcagcaacg agttcacgct gggatcgaag ctggacacgt tctgcgggag ccccccatat 960gccgccccgg agctgtttca gggcaagaag tacgacgggc cggaggtgga catctggagc 1020ctgggagtca tcctgtacac cctcgtcagc ggctccctgc ccttcgacgg gcacaacctc 1080aaggagctgc gggagcgagt actcagaggg aagtaccggg tccctttcta catgtcaaca 1140gactgtgaga gcatcctgcg gagatttttg gtgctgaacc cagctaaacg ctgtactctc 1200gagcaaatca tgaaagacaa atggatcaac atcggctatg agggtgagga gttgaagcca 1260tacacagagc ccgaggagga cttcggggac accaagagaa ttgaggtgat ggtgggtatg 1320ggctacacac gggaagaaat caaagagtcc ttgaccagcc agaagtacaa cgaagtgacc 1380gccacctacc tcctgctggg caggaagact gaggagggtg gggaccgggg cgccccaggg 1440ctggccctgg cacgggtgcg ggcgcccagc gacaccacca acggaacaag ttccagcaaa 1500ggcaccagcc acagcaaagg gcagcggagt tcctcttcca cctaccaccg ccagcgcagg 1560catagcgatt tctgtggccc atcccctgca cccctgcacc 
ccaaacgcag cccgacgagc 1620acgggggagg cggagctgaa ggaggagcgg ctgccaggcc ggaaggcgag ctgcagcacc 1680gcggggagtg ggagtcgagg gctgcccccc tccagcccca tggtcagcag cgcccacaac 1740cccaacaagg cagagatccc agagcggcgg aaggacagca cgagcacccc caacaacctc 1800cctcctagca tgatgacccg cagaaacacc tacgtttgca cagaacgccc gggggctgag 1860cgcccgtcac tgttgccaaa tgggaaagaa aacagctcag gcaccccacg ggtgccccct 1920gcctccccct ccagtcacag cctggcaccc ccatcagggg agcggagccg cctggcacgc 1980ggttccacca tccgcagcac cttccatggt ggccaggtcc gggaccggcg ggcagggggt 2040gggggtggtg ggggtgtgca gaatgggccc cctgcctctc ccacactggc ccatgaggct 2100gcacccctgc ccgccgggcg gccccgcccc accaccaacc tcttcaccaa gctgacctcc 2160aaactgaccc gaagggttac cctcgatccc tctaaacggc agaactctaa ccgctgtgtt 2220tcgggcgcct ctctgcccca gggatccaag atcaggtcgc agacgaacct gagagaatcg 2280ggggacctga ggtcacaagt tgccatctac cttgggatca aacggaaacc gccccccggc 2340tgctccgatt cccctggagt gtgaagctga ccagctcgcg ccctcctgag gccctgatgg 2400cagctctgcg ccaggccaca gcagccgccc gctgccgctg ccgccagcca cagccgttcc 2460tgctggcctg cctgcacggg ggtgcgggcg ggcccgagcc cctgtcccac ttcgaagtgg 2520aggtctgcca gctgccccgg ccaggcttgc ggggagttct cttccgccgt gtggcgggca 2580ccgccctggc cttccgcacc ctcgtcaccc gcatctccaa cgacctcgag ctctgagcca 2640ccacggtccc agggccctta ctcttcctct cccttgtcgc cttcacttct acaggagggg 2700aaggggccag ggaggggatt ctccctttat catcacctca gtttccctga attatatttg 2760ggggcaaaga ttgtcccctc tgctgttctc tggggccgct cagcacagaa gaaggatgag 2820ggggctcagc ggggggagct ggcaccttcc tggagcctcc agccagtcct gtcctccctc 2880gccctaccaa gagggcacct gaggagactt tggggacagg gcaggggcag ggagggaaac 2940tgaggaaatc ttccattcct cccaacagct caaaattagg ccttgggcag gggcagggag 3000agctgctgag cctaaagact ggagaatctg ggggactggg agtgggggtc agagaggcag 3060attccttccc ctcccgtccc ctcacgctca aacccccact tcctgcccca ggctggcgcg 3120gggcactttg tacaaatcct tgtaaatacc ccacaccctc ccctctgcaa aggtctcttg 3180aggagctgcc gctgtcacct acggttttta agttattaca ccccgaccct cctcctgtca 3240gccccctcac ctgcagcctg ttgcccaata aatttaagag agtccccccc tccccaatgc 3300tgaccctagg 
attttccttc cctgccctca cctgcaaatg agttaaagaa gaggcgtggg 3360aatccaggca gtggtttttc ctttcggagc ctcggttttc tcatctgcag aatgggagcg 3420gtgggggtgg gaaggtaagg atggtcgtgg aagaaggcag gatggaactc ggcctcatcc 3480ccgaggcccc agttcctata tcgggccccc cattcatcca ctcacactcc cagccaccat 3540gttacactgg actctaagcc acttcttact ccagtagtaa atttattgca ataaacaatc 3600attganccc 3609202085DNAHomo sapiens 20agatgtcttc gcggacggtg ctggccccgg gcaacgatcg gaactcggac acgcatggca 60ccttgggcag tggccgctcc tcggacaaag gcccgtcctg gtccagccgc tcactgggtg 120cccgttgccg gaactccatc gcctcctgtc ccgaggagca gccccacgtg ggcaactacc 180gcctgctgag gaccattggg aagggcaact ttgccaaagt caagctggct cggcacatcc 240tcactggtcg ggaggttgcc atcaagatta tcgacaaaac ccagctgaat cccagcagcc 300tgcagaagct gttccgagaa gtccgcatca tgaagggcct aaaccacccc aacatcgtga 360agctctttga ggtgattgag actgagaaga cgctgtacct ggtgatggag tacgcaagtg 420ctggagaagt gtttgactac ctcgtgtcgc atggccgcat gaaggagaag gaagctcgag 480ccaagttccg acagattgtt tcggctgtgc actattgtca ccagaaaaat attgtacaca 540gggacctgaa ggctgagaac ctcttgctgg atgccgaggc caacatcaag attgctgact 600ttggcttcag caacgagttc acgctgggat cgaagctgga cacgttctgc gggagccccc 660catatgccgc cccggagctg tttcagggca agaagtacga cgggccggag gtggacatct 720ggagcctggg agtcatcctg tacaccctcg tcagcggctc cctgcccttc gacgggcaca 780acctcaagga gctgcgggag cgagtactca gagggaagta ccgggtccct ttctacatgt 840caacagactg tgagagcatc ctgcggagat ttttggtgct gaacccagct aaacgctgta 900ctctcgagca aatcatgaaa gacaaatgga tcaacatcgg ctatgagggt gaggagttga 960agccatacac agagcccgag gaggacttcg gggacaccaa gagaattgag gtgatggtgg 1020gtatgggcta cacacgggaa gaaatcaaag agtccttgac cagccagaag tacaacgaag 1080tgaccgccac ctacctcctg ctgggcagga agactgagga gggtggggac cggggcgccc 1140cagggctggc cctggcacgg gtgcgggcgc ccagcgacac caccaacgga acaagttcca 1200gcaaaggcac cagccacagc aaagggcagc ggagttcctc ttccacctac caccgccagc 1260gcaggcatag cgatttctgt ggcccatccc ctgcacccct gcaccccaaa cgcagcccga 1320cgagcacggg ggaggcggag ctgaaggagg agcggctgcc aggccggaag gcgagctgca 1380gcaccgcggg gagtgggagt 
cgagggctgc ccccctccag ccccatggtc agcagcgccc 1440acaaccccaa caaggcagag atcccagagc ggcggaagga cagcacgagc acccccaaca 1500acctccctcc tagcatgatg acccgcagaa acacctacgt ttgcacagaa cgcccggggg 1560ctgagcgccc gtcactgttg ccaaatggga aagaaaacag ctcaggcacc ccacgggtgc 1620cccctgcctc cccctccagt cacagcctgg cacccccatc aggggagcgg agccgcctgg 1680cacgcggttc caccatccgc agcaccttcc atggtggcca ggtccgggac cggcgggcag 1740ggggtggggg tggtgggggt gtgcagaatg ggccccctgc ctctcccaca ctggcccatg 1800aggctgcacc cctgcccgcc gggcggcccc gccccaccac caacctcttc accaagctga 1860cctccaaact gacccgaagg gttaccctcg atccctctaa acggcagaac tctaaccgct 1920gtgtttcggg cgcctctctg ccccagggat ccaagatcag gtcgcagacg aacctgagag 1980aatcggggga cctgaggtca caagttgcca tctaccttgg gatcaaacgg aaaccgcccc 2040ccggctgctc cgattcccct ggagtgtgaa gctgaccagc tcgcg 2085212278DNAHomo sapiens 21agatgtcttc gcggacggtg ctggccccgg gcaacgatcg gaactcggac acgcatggca 60ccttgggcag tggccgctcc tcggacaaag gcccgtcctg gtccagccgc tcactgggtg 120cccgttgccg gaactccatc gcctcctgtc ccgaggagca gccccacgtg ggcaactacc 180gcctgctgag gaccattggg aagggcaact ttgccaaagt caagctggct cggcacatcc 240tcactggtcg ggaggttgcc atcaagatta tcgacaaaac ccagctgaat cccagcagcc 300tgcagaagct gttccgagaa gtccgcatca tgaagggcct aaaccacccc aacatcgtga 360agctctttga ggtgattgag actgagaaga cgctgtacct ggtgatggag tacgcaagtg 420ctggagaagt gtttgactac ctcgtgtcgc atggccgcat gaaggagaag gaagctcgag 480ccaagttccg acagattgtt tcggctgtgc actattgtca ccagaaaaat attgtacaca 540gggacctgaa ggctgagaac ctcttgctgg atgccgaggc caacatcaag attgctgact 600ttggcttcag caacgagttc acgctgggat cgaagctgga cacgttctgc gggagccccc 660catatgccgc cccggagctg tttcagggca agaagtacga cgggccggag gtggacatct 720ggagcctggg agtcatcctg tacaccctcg tcagcggctc cctgcccttc gacgggcaca 780acctcaagga gctgcgggag cgagtactca gagggaagta ccgggtccct ttctacatgt 840caacagactg tgagagcatc ctgcggagat ttttggtgct gaacccagct aaacgctgta 900ctctcgagca aatcatgaaa gacaaatgga tcaacatcgg ctatgagggt gaggagttga 960agccatacac agagcccgag gaggacttcg gggacaccaa gagaattgag gtgatggtgg 
1020gtatgggcta cacacgggaa gaaatcaaag agtccttgac cagccagaag tacaacgaag 1080tgaccgccac ctacctcctg ctgggcagga agactgagga gggtggggac cggggcgccc 1140cagggctggc cctggcacgg gtgcgggcgc ccagcgacac caccaacgga acaagttcca 1200gcaaaggcac cagccacagc aaagggcagc ggagttcctc ttccacctac caccgccagc 1260gcaggcatag cgatttctgt ggcccatccc ctgcacccct gcaccccaaa cgcagcccga 1320cgagcacggg ggaggcggag ctgaaggagg agcggctgcc aggccggaag gcgagctgca 1380gcaccgcggg gagtgggagt cgagggctgc ccccctccag ccccatggtc agcagcgccc 1440acaaccccaa caaggcagag atcccagagc ggcggaagga cagcacgagc acccccaaca 1500acctccctcc tagcatgatg acccgcagaa acacctacgt ttgcacagaa cgcccggggg 1560ctgagcgccc gtcactgttg ccaaatggga aagaaaacag ctcaggcacc ccacgggtgc 1620cccctgcctc cccctccagt cacagcctgg cacccccatc aggggagcgg agccgcctgg 1680cacgcggttc caccatccgc agcaccttcc atggtggcca ggtccgggac cggcgggcag 1740ggggtggggg tggtgggggt gtgcagaatg ggccccctgc ctctcccaca ctggcccatg 1800aggctgcacc cctgcccgcc gggcggcccc gccccaccac caacctcttc accaagctga 1860cctccaaact gacccgaagg gtcgcagacg aacctgagag aatcggggga cctgaggtca 1920caagttgcca tctaccttgg gatcaaacgg aaaccgcccc ccggctgctc cgattcccct 1980ggagtgtgaa gctgaccagc tcgcgccctc ctgaggccct gatggcagct ctgcgccagg 2040ccacagcagc cgcccgctgc cgctgccgcc agccacagcc gttcctgctg gcctgcctgc 2100acgggggtgc gggcgggccc gagcccctgt cccacttcga agtggaggtc tgccagctgc 2160cccggccagg cttgcgggga
gttctcttcc gccgtgtggc gggcaccgcc ctggccttcc 2220gcaccctcgt cacccgcatc tccaacgacc tcgagctctg agccaccacg gtcccagg 2278224917DNAHomo sapiens 22agaagatgtc ttcgcggacg gtgctggccc cgggcaacga tcggaactcg gacacgcatg 60gcaccttggg cagtggccgc tcctcggaca aaggcccgtc ctggtccagc cgctcactgg 120gtgcccgttg ccggaactcc atcgcctcct gtcccgagga gcagccccac gtgggcaact 180accgcctgct gaggaccatt gggaagggca actttgccaa agtcaagctg gctcggcaca 240tcctcactgg tcgggaggtt gccatcaaga ttatcgacaa aacccagctg aatcccagca 300gcctgcagaa gctgttccga gaagtccgca tcatgaaggg cctaaaccac cccaacatcg 360tgaagctctt tgaggtgatt gagactgaga agacgctgta cctggtgatg gagtacgcaa 420gtgctggaga agtgtttgac tacctcgtgt cgcatggccg catgaaggag aaggaagctc 480gagccaagtt ccgacagatt gtttcggctg tgcactattg tcaccagaaa aatattgtac 540acagggacct gaaggctgag aacctcttgc tggatgccga ggccaacatc aagattgctg 600actttggctt cagcaacgag ttcacgctgg gatcgaagct ggacacgttc tgcgggagcc 660ccccatatgc cgccccggag ctgtttcagg gcaagaagta cgacgggccg gaggtggaca 720tctggagcct gggagtcatc ctgtacaccc tcgtcagcgg ctccctgccc ttcgacgggc 780acaacctcaa ggagctgcgg gagcgagtac tcagagggaa gtaccgggtc cctttctaca 840tgtcaacaga ctgtgagagc atcctgcgga gatttttggt gctgaaccca gctaaacgct 900gtactctcga gcaaatcatg aaagacaaat ggatcaacat cggctatgag ggtgaggagt 960tgaagccata cacagagccc gaggaggact tcggggacac caagagaatt gaggtgatgg 1020tgggtatggg ctacacacgg gaagaaatca aagagtcctt gaccagccag aagtacaacg 1080aagtgaccgc cacctacctc ctgctgggca ggaagactga ggagggtggg gaccggggcg 1140ccccagggct ggccctggca cgggtgcggg cgcccagcga caccaccaac ggaacaagtt 1200ccagcaaagg caccagccac agcaaagggc agcggagttc ctcttccacc taccaccgcc 1260agcgcaggca tagcgatttc tgtggcccat cccctgcacc cctgcacccc aaacgcagcc 1320cgacgagcac gggggaggcg gagctgaagg aggagcggct gccaggccgg aaggcgagct 1380gcagcaccgc ggggagtggg agtcgagggc tgcccccctc cagccccatg gtcagcagcg 1440cccacaaccc caacaaggca gagatcccag agcggcggaa ggacagcacg agcaccccca 1500acaacctccc tcctagcatg atgacccgca gaaacaccta cgtttgcaca gaacgcccgg 1560gggctgagcg cccgtcactg ttgccaaatg ggaaagaaaa cagctcaggc 
accccacggg 1620tgccccctgc ctccccctcc agtcacagcc tggcaccccc atcaggggag cggagccgcc 1680tggcacgcgg ttccaccatc cgcagcacct tccatggtgg ccaggtccgg gaccggcggg 1740cagggggtgg gggtggtggg ggtgtgcaga atgggccccc tgcctctccc acactggccc 1800atgaggctgc acccctgccc gccgggcggc cccgccccac caccaacctc ttcaccaagc 1860tgacctccaa actgacccga agggttaccc tcgatccctc taaacggcag aactctaacc 1920gctgtgtttc gggcgcctct ctgccccagg gatccaagat caggtcgcag acgaacctga 1980gagaatcggg ggacctgagg tcacaagttg ccatctacct tgggatcaaa cggaaaccgc 2040cccccggctg ctccgattcc cctggagtgt gaagctgacc agctcgcgcc ctcctgaggc 2100cctgatggca gctctgcgcc aggccacagc agccgcccgc tgccgctgcc gccagccaca 2160gccgttcctg ctggcctgcc tgcacggggg tgcgggcggg cccgagcccc tgtcccactt 2220cgaagtggag gtctgccagc tgccccggcc aggcttgcgg ggagttctct tccgccgtgt 2280ggcgggcacc gccctggcct tccgcaccct cgtcacccgc atctccaacg acctcgagct 2340ctgagccacc acggtcccag ggcccttact cttcctctcc cttgtcgcct tcacttctac 2400aggaggggaa ggggccaggg aggggattct ccctttatca tcacctcagt ttccctgaat 2460tatatttggg ggcaaagatt gtcccctctg ctgttctctg gggccgctca gcacagaaga 2520aggatgaggg ggctcagcgg ggggagctgg caccttcctg gagcctccag ccagtcctgt 2580cctccctcgc cctaccaaga gggcacctga ggagactttg gggacagggc aggggcaggg 2640agggaaactg aggaaatctt ccattcctcc caacagctca aaattaggcc ttgggcaggg 2700gcagggagag ctgctgagcc taaagactgg agaatctggg ggactgggag tgggggtcag 2760agaggcagat tccttcccct cccgtcccct cacgctcaaa cccccacttc ctgccccagg 2820ctggcgcggg gcactttgta caaatccttg taaatacccc acaccctccc ctctgcaaag 2880gtctcttgag gagctgccgc tgtcacctac ggtttttaag ttattacacc ccgaccctcc 2940tcctgtcagc cccctcacct gcagcctgtt gcccaataaa tttaagagag tccccccctc 3000cccaatgctg accctaggat tttccttccc tgccctcacc tgcaaatgag ttaaagaaga 3060ggcgtgggaa tccaggcagt ggtttttcct ttcggagcct cggttttctc atctgcagaa 3120tgggagcggt gggggtggga aggtaaggat ggtcgtggaa gaaggcagga tggaactcgg 3180cctcatcccc gaggccccag ttcctatatc gggcccccca ttcatccact cacactccca 3240gccaccatgt tacactggac tctaagccac ttcttactcc agtagtaaat ttattcaata 3300aacaatcatt gacccatgcc 
tactccatgc caggcccagt gctggacaca gagacatgaa 3360gctctgtctg tgggagacag ggattctgac acagacaccg gacaaaccat tgtcttgggg 3420agcccagaag agaaagtggg cagggtgggg tcattgggga agatgctcta gaggaattaa 3480tgctggaatg gggtgttgaa ggatgagtag gagttagtta ggcattgagt ttgccctggg 3540caaaagccca gaagtgggag tatgtggtat atcttcagag aactgggtaa tttcagtgtg 3600gctgctgtgt tgggcatgga tggagaatca gcaagagaaa tgctgtatta ggactaataa 3660tccatctacg ctgcttaagc aaaaaggtat ttgttggttt atgttactta atagtccagg 3720ggcacctggc ttcaggtagg tttgatccag gcatcaggcc attgcatcta ttttttcagt 3780gtaagttgaa ttctagtaat ttttatcaag taagggctcc tttcctggtg gcacagatga 3840cttcagcagt tagaagtttc tatccctcca gctttctgca gcagaaagac cctcattgtc 3900agtttcccag caaaagtccc agggcagact ctcattggcc caaatgggcc atgtgatttt 3960ctctaaacca atcactgtga ctctagagtg gccagactca gagctgcact tagtaggggt 4020tcctcaaagg aaggtcaagt gtcatgagca ggagaaaagg catgggagct ggacagatta 4080tagtggttga agtctgtgca gtacagaagg gcggagctta ttcacacagc acctttgggg 4140ccaaaatgaa taagctggac tttctcccca tggcactggg gaaccatgga agttcaggga 4200acttcaggga agaggcttgg tcaattcctg agagcatcct ctgtgctggg gacacagtgg 4260taatcaagac agccccaaca ctgccctcat agagctcaca gtccaatgga ggaggcagat 4320gtgtcctcag gcagcgactg ggcagggctg gtatagggga gtccagaggt gatgcctgcc 4380tcagccaggg agggcttcct ggaggagaag gagccagcta gacatggata ggagtgcgtt 4440ttaggcacag caaatggcac atacaagggc cagggagcaa gagagaggac aggtcctcaa 4500caaatggcat gtgactttgt aagtgtagaa ttgctgtgag gtatggggct aggggcgtca 4560gtagggcctt gaaggttatg gacaggggcc tgggctttct tccaagggca ctgggggagc 4620catggcaagg ttgtaggtag ggtagagatg ggcgggtttg tgctatgtgc agggtggaag 4680ggagggaagt tgacaggtca gaagatcagg aaagaggtcg gggctggaca gatggggaga 4740gcgcagatag atttaagaga gtcctgtgag gcaaagtggg caggacctgg taacaggtgt 4800ctggactgtg gctttggctg gctcagaagg tccccactgg cgtgtgtggt ctatgtagcc 4860tctgggtgtg gagctgggat cttcaactgg ggacagtaca gtaaagaaca tcacagc 4917233226DNAHomo sapiens 23gacccggaga agatgtcttc gcggacggtg ctggccccgg gcaacgatcg gaactcggac 60acgcatggca ccttgggcag tggccgctcc 
tcggacaaag gcccgtcctg gtccagccgc 120tcactgggtg cccgttgccg gaactccatc gcctcctgtc ccgaggagca gccccacgtg 180ggcaactacc gcctgctgag gaccattggg aagggcaact ttgccaaagt caagctggct 240cggcacatcc tcactggtcg ggaggttgcc atcaagatta tcgacaaaac ccagctgaat 300cccagcagcc tgcagaagct gttccgagaa gtccgcatca tgaagggcct aaaccacccc 360aacatcgtga agctctttga ggtgattgag actgagaaga cgctgtacct ggtgatggag 420tacgcaagtg ctggagaagt gtttgactac ctcgtgtcgc atggccgcat gaaggagaag 480gaagctcgag ccaagttccg acagattgtt tcggctgtgc actattgtca ccagaaaaat 540attgtacaca gggacctgaa ggctgagaac ctcttgctgg atgccgaggc caacatcaag 600attgctgact ttggcttcag caacgagttc acgctgggat cgaagctgga cacgttctgc 660gggagccccc catatgccgc cccggagctg tttcagggca agaagtacga cgggccggag 720gtggacatct ggagcctggg agtcatcctg tacaccctcg tcagcggctc cctgcccttc 780gacgggcaca acctcaagga gctgcgggag cgagtactca gagggaagta ccgggtccct 840ttctacatgt caacagactg tgagagcatc ctgcggagat ttttggtgct gaacccagct 900aaacgctgta ctctcgagca aatcatgaaa gacaaatgga tcaacatcgg ctatgagggt 960gaggagttga agccatacac agagcccgag gaggacttcg gggacaccaa gagaattgag 1020gtgatggtgg gtatgggcta cacacgggaa gaaatcaaag agtccttgac cagccagaag 1080tacaacgaag tgaccgccac ctacctcctg ctgggcagga agactgagga gggtggggac 1140cggggcgccc cagggctggc cctggcacgg gtgcgggcgc ccagcgacac caccaacgga 1200acaagttcca gcaaaggcac cagccacagc aaagggcagc ggagttcctc ttccacctac 1260caccgccagc gcaggcatag cgatttctgt ggcccatccc ctgcacccct gcaccccaaa 1320cgcagcccga cgagcacggg ggaggcggag ctgaaggagg agcggctgcc aggccggaag 1380gcgagctgca gcaccgcggg gagtgggagt cgagggctgc ccccctccag ccccatggtc 1440agcagcgccc acaaccccaa caaggcagag atcccagagc ggcggaagga cagcacgagc 1500acccccaaca acctccctcc tagcatgatg acccgcagaa acacctacgt ttgcacagaa 1560cgcccggggg ctgagcgccc gtcactgttg ccaaatggga aagaaaacag ctcaggcacc 1620ccacgggtgc cccctgcctc cccctccagt cacagcctgg cacccccatc aggggagcgg 1680agccgcctgg cacgcggttc caccatccgc agcaccttcc atggtggcca ggtccgggac 1740cggcgggcag ggggtggggg tggtgggggt gtgcagaatg ggccccctgc ctctcccaca 1800ctggcccatg 
aggctgcacc cctgcccgcc gggcggcccc gccccaccac caacctcttc 1860accaagctga cctccaaact gacccgaagg gtcgcagacg aacctgagag aatcggggga 1920cctgaggtca caagttgcca tctaccttgg gatcaaacgg aaaccgcccc ccggctgctc 1980cgattcccct ggagtgtgaa gctgaccagc tcgcgccctc ctgaggccct gatggcagct 2040ctgcgccagg ccacagcagc cgcccgctgc cgctgccgcc agccacagcc gttcctgctg 2100gcctgcctgc acgggggtgc gggcgggccc gagcccctgt cccacttcga agtggaggtc 2160tgccagctgc cccggccagg cttgcgggga gttctcttcc gccgtgtggc gggcaccgcc 2220ctggccttcc gcaccctcgt cacccgcatc tccaacgacc tcgagctctg agccaccacg 2280gtcccaggcc cttatcttct ctcccttgtc gcttcacttc tacaggaggg gaaggggcca 2340gggaggggat tctcccttta tcatcacctc agtttccctg aattatattt gggggcaaag 2400attgtcccct ctgctgttct ctggggccgc tcagcacaga agaaggatga gggggctcag 2460cggggggagc tggcaccttc ctggagcctc cagccagtcc tgtcctccct cgccctacca 2520agagggcacc tgaggagact ttggggacag ggcaggggca gggagggaaa ctgaggaaat 2580cttccattcc tcccaacagc tcaaaattag gccttgggca ggggcaggga gagctgctga 2640gcctaaagac tggagaatct gggggactgg gagtgggggt cagagaggca gattccttcc 2700cctcccgtcc cctcacgctc aaacccccac ttcctgcccc aggctggcgc ggggcacttt 2760gtacaaatcc ttgtaaatac cccacacctt cccttctgca aaggtctctt gaggagctgc 2820cgctgtcacc tacggttttt aagttattac accccgaccc tcctcctgtc agccccctca 2880cgtgcagcct gttgcccaat aaatttagga gagtcccccc ctccccaatg ctgaccctag 2940gattttcctt ccctgccctc acctgcaaat gagttaaaga agaggcgtgg gaatccaggc 3000agtggttttt cctttcggag cctcggtttt ctcatctgca gaatgggagc ggtgggggtg 3060ggaaggtaag gatggtcgtc caagaaggca ggatggaact cggcctcatc cccgaggccc 3120cagttcctat atcgggcccc ccattcatcc actcacactc ccagccacca tgttacactg 3180gactttaagc catttcttac tccagtagta aatttattca ataaac 322624745PRTHomo sapiens 24Met Ile Arg Gly Arg Asn Ser Ala Thr Ser Ala Asp Glu Gln Pro His1 5 10 15Ile Gly Asn Tyr Arg Leu Leu Lys Thr Ile Gly Lys Gly Asn Phe Ala 20 25 30Lys Val Lys Leu Ala Arg His Ile Leu Thr Gly Lys Glu Val Ala Val 35 40 45Lys Ile Ile Asp Lys Thr Gln Leu Asn Ser Ser Ser Leu Gln Lys Leu 50 55 60Phe Arg Glu Val Arg Ile Met Lys 
Val Leu Asn His Pro Asn Ile Val65 70 75 80Lys Leu Phe Glu Val Ile Glu Thr Glu Lys Thr Leu Tyr Leu Val Met 85 90 95Glu Tyr Ala Ser Gly Gly Glu Val Phe Asp Tyr Leu Val Ala His Gly 100 105 110Arg Met Lys Glu Lys Glu Ala Arg Ala Lys Phe Arg Gln Ile Val Ser 115 120 125Ala Val Gln Tyr Cys His Gln Lys Phe Ile Val His Arg Asp Leu Lys 130 135 140Ala Glu Asn Leu Leu Leu Asp Ala Asp Met Asn Ile Lys Ile Ala Asp145 150 155 160Phe Gly Phe Ser Asn Glu Phe Thr Phe Gly Asn Lys Leu Asp Thr Phe 165 170 175Cys Gly Ser Pro Pro Tyr Ala Ala Pro Glu Leu Phe Gln Gly Lys Lys 180 185 190Tyr Asp Gly Pro Glu Val Asp Val Trp Ser Leu Gly Val Ile Leu Tyr 195 200 205Thr Leu Val Ser Gly Ser Leu Pro Phe Asp Gly Gln Asn Leu Lys Glu 210 215 220Leu Arg Glu Arg Val Leu Arg Gly Lys Tyr Arg Ile Pro Phe Tyr Met225 230 235 240Ser Thr Asp Cys Glu Asn Leu Leu Lys Lys Phe Leu Ile Leu Asn Pro 245 250 255Ser Lys Arg Gly Thr Leu Glu Gln Ile Met Lys Asp Arg Trp Met Asn 260 265 270Val Gly His Glu Asp Asp Glu Leu Lys Pro Tyr Val Glu Pro Leu Pro 275 280 285Asp Tyr Lys Asp Pro Arg Arg Thr Glu Leu Met Val Ser Met Gly Tyr 290 295 300Thr Arg Glu Glu Ile Gln Asp Ser Leu Val Gly Gln Arg Tyr Asn Glu305 310 315 320Val Met Ala Thr Tyr Leu Leu Leu Gly Tyr Lys Ser Ser Glu Leu Glu 325 330 335Gly Asp Thr Ile Thr Leu Lys Pro Arg Pro Ser Ala Asp Leu Thr Asn 340 345 350Ser Ser Ala Gln Phe Pro Ser His Lys Val Gln Arg Ser Val Ser Ala 355 360 365Asn Pro Lys Gln Arg Arg Phe Ser Asp Gln Ala Gly Pro Ala Ile Pro 370 375 380Thr Ser Asn Ser Tyr Ser Lys Lys Thr Gln Ser Asn Asn Ala Glu Asn385 390 395 400Lys Arg Pro Glu Glu Asp Arg Glu Ser Gly Arg Lys Ala Ser Ser Thr 405 410 415Ala Lys Val Pro Ala Ser Pro Leu Pro Gly Leu Glu Arg Lys Lys Thr 420 425 430Thr Pro Thr Pro Ser Thr Asn Ser Val Leu Ser Thr Ser Thr Asn Arg 435 440 445Ser Arg Asn Ser Pro Leu Leu Glu Arg Ala Ser Leu Gly Gln Ala Ser 450 455 460Ile Gln Asn Gly Lys Asp Ser Leu Thr Met Pro Gly Ser Arg Ala Ser465 470 475 480Thr Ala Ser Ala Ser Ala Ala Val Ser Ala Ala Arg Pro Arg Gln His 485 
490 495Gln Lys Ser Met Ser Ala Ser Val His Pro Asn Lys Ala Ser Gly Leu 500 505 510Pro Pro Thr Glu Ser Asn Cys Glu Val Pro Arg Pro Ser Thr Ala Pro 515 520 525Gln Arg Val Pro Val Ala Ser Pro Ser Ala His Asn Ile Ser Ser Ser 530 535 540Gly Gly Ala Pro Asp Arg Thr Asn Phe Pro Arg Gly Val Ser Ser Arg545 550 555 560Ser Thr Phe His Ala Gly Gln Leu Arg Gln Val Arg Asp Gln Gln Asn 565 570 575Leu Pro Tyr Gly Val Thr Pro Ala Ser Pro Ser Gly His Ser Gln Gly 580 585 590Arg Arg Gly Ala Ser Gly Ser Ile Phe Ser Lys Phe Thr Ser Lys Phe 595 600 605Val Arg Arg Asn Leu Asn Glu Pro Glu Ser Lys Asp Arg Val Glu Thr 610 615 620Leu Arg Pro His Val Val Gly Ser Gly Gly Asn Asp Lys Glu Lys Glu625 630 635 640Glu Phe Arg Glu Ala Lys Pro Arg Ser Leu Arg Phe Thr Trp Ser Met 645 650 655Lys Thr Thr Ser Ser Met Glu Pro Asn Glu Met Met Arg Glu Ile Arg 660 665 670Lys Val Leu Asp Ala Asn Ser Cys Gln Ser Glu Leu His Glu Lys Tyr 675 680 685Met Leu Leu Cys Met His Gly Thr Pro Gly His Glu Asp Phe Val Gln 690 695 700Trp Glu Met Glu Val Cys Lys Leu Pro Arg Leu Ser Leu Asn Gly Val705 710 715 720Arg Phe Lys Arg Ile Ser Gly Thr Ser Met Ala Phe Lys Asn Ile Ala 725 730 735Ser Lys Ile Ala Asn Glu Leu Lys Leu 740 74525795PRTHomo sapiens 25Met Ser Ala Arg Thr Pro Leu Pro Thr Val Asn Glu Arg Asp Thr Val1 5 10 15Asn His Thr Thr Val Asp Gly Tyr Thr Glu Pro His Ile Gln Pro Thr 20 25 30Lys Ser Ser Ser Arg Gln Asn Ile Pro Arg Cys Arg Asn Ser Ile Thr 35 40 45Ser Ala Thr Asp Glu Gln Pro His Ile Gly Asn Tyr Arg Leu Gln Lys 50 55 60Thr Ile Gly Lys Gly Asn Phe Ala Lys Val Lys Leu Ala Arg His Val65 70 75 80Leu Thr Gly Arg Glu Val Ala Val Lys Ile Ile Asp Lys Thr Gln Leu 85 90 95Asn Pro Thr Ser Leu Gln Lys Leu Phe Arg Glu Val Arg Ile Met Lys 100 105 110Ile Leu Asn His Pro Asn Ile Val Lys Leu Phe Glu Val Ile Glu Thr 115 120 125Glu Lys Thr Leu Tyr Leu Val Met Glu Tyr Ala Ser Gly Gly Glu Val 130 135 140Phe Asp Tyr Leu Val Ala His Gly Arg Met Lys Glu Lys Glu Ala Arg145 150 155 160Ala Lys Phe Arg Gln Ile Val Ser Ala Val Gln Tyr Cys His 
Gln Lys 165 170 175Tyr Ile Val His Arg Asp Leu Lys Ala Glu Asn Leu Leu Leu Asp Gly 180 185 190Asp Met Asn Ile Lys Ile Ala Asp Phe Gly Phe Ser Asn Glu Phe Thr 195 200 205Val Gly Asn Lys Leu Asp Thr Phe Cys Gly Ser Pro Pro Tyr Ala Ala 210 215 220Pro Glu Leu Phe Gln Gly Lys Lys Tyr Asp Gly Pro Glu Val Asp Val225 230 235 240Trp Ser Leu Gly Val Ile Leu Tyr Thr Leu Val Ser Gly Ser Leu Pro 245 250 255Phe Asp Gly Gln Asn Leu Lys Glu Leu Arg Glu Arg Val Leu Arg Gly 260 265 270Lys Tyr Arg Ile Pro Phe Tyr Met Ser Thr Asp Cys Glu Asn Leu Leu 275 280 285Lys Lys Leu Leu Val Leu Asn Pro Ile Lys Arg Gly Ser Leu Glu Gln 290 295 300Ile Met Lys Asp Arg Trp Met Asn Val Gly His Glu Glu Glu Glu Leu305 310 315 320Lys Pro Tyr Thr Glu Pro Asp Pro Asp Phe Asn Asp Thr Lys Arg Ile 325 330 335Asp Ile Met Val Thr Met Gly Phe Ala Arg Asp Glu Ile Asn Asp Ala 340 345 350Leu Ile Asn Gln Lys Tyr Asp Glu Val Met Ala
Thr Tyr Ile Leu Leu 355 360 365Gly Arg Lys Pro Pro Glu Phe Glu Gly Gly Glu Ser Leu Ser Ser Gly 370 375 380Asn Leu Cys Gln Arg Ser Arg Pro Ser Ser Asp Leu Asn Asn Ser Thr385 390 395 400Leu Gln Ser Pro Ala His Leu Lys Val Gln Arg Ser Ile Ser Ala Asn 405 410 415Gln Lys Gln Arg Arg Phe Ser Asp His Ala Gly Pro Ser Ile Pro Pro 420 425 430Ala Val Ser Tyr Thr Lys Arg Pro Gln Ala Asn Ser Val Glu Ser Glu 435 440 445Gln Lys Glu Glu Trp Asp Lys Asp Val Ala Arg Lys Leu Gly Ser Thr 450 455 460Thr Val Gly Ser Lys Ser Glu Met Thr Ala Ser Pro Leu Val Gly Pro465 470 475 480Glu Arg Lys Lys Ser Ser Thr Ile Pro Ser Asn Asn Val Tyr Ser Gly 485 490 495Gly Ser Met Ala Arg Arg Asn Thr Tyr Val Cys Glu Arg Thr Thr Asp 500 505 510Arg Tyr Val Ala Leu Gln Asn Gly Lys Asp Ser Ser Leu Thr Glu Met 515 520 525Ser Val Ser Ser Ile Ser Ser Ala Gly Ser Ser Val Ala Ser Ala Val 530 535 540Pro Ser Ala Arg Pro Arg His Gln Lys Ser Met Ser Thr Ser Gly His545 550 555 560Pro Ile Lys Val Thr Leu Pro Thr Ile Lys Asp Gly Ser Glu Ala Tyr 565 570 575Arg Pro Gly Thr Thr Gln Arg Val Pro Ala Ala Ser Pro Ser Ala His 580 585 590Ser Ile Ser Thr Ala Thr Pro Asp Arg Thr Arg Phe Pro Arg Gly Ser 595 600 605Ser Ser Arg Ser Thr Phe His Gly Glu Gln Leu Arg Glu Arg Arg Ser 610 615 620Val Ala Tyr Asn Gly Pro Pro Ala Ser Pro Ser His Glu Thr Gly Ala625 630 635 640Phe Ala His Ala Arg Arg Gly Thr Ser Thr Gly Ile Ile Ser Lys Ile 645 650 655Thr Ser Lys Phe Val Arg Arg Asp Pro Ser Glu Gly Glu Ala Ser Gly 660 665 670Arg Thr Asp Thr Ser Arg Ser Thr Ser Gly Glu Pro Lys Glu Arg Asp 675 680 685Lys Glu Glu Gly Lys Asp Ser Lys Pro Arg Ser Leu Arg Phe Thr Trp 690 695 700Ser Met Lys Thr Thr Ser Ser Met Asp Pro Asn Asp Met Met Arg Glu705 710 715 720Ile Arg Lys Val Leu Asp Ala Asn Asn Cys Asp Tyr Glu Gln Lys Glu 725 730 735Arg Phe Leu Leu Phe Cys Val His Gly Asp Ala Arg Gln Asp Ser Leu 740 745 750Val Gln Trp Glu Met Glu Val Cys Lys Leu Pro Arg Leu Ser Leu Asn 755 760 765Gly Val Arg Phe Lys Arg Ile Ser Gly Thr Ser Ile Ala Phe Lys Asn 770 775 780Ile 
Ala Ser Lys Ile Ala Asn Glu Leu Lys Leu785 790 79526729PRTHomo sapiens 26Met Ser Thr Arg Thr Pro Leu Pro Thr Val Asn Glu Arg Asp Thr Glu1 5 10 15Asn His Thr Ser His Gly Asp Gly Arg Gln Glu Val Thr Ser Arg Thr 20 25 30Ser Arg Ser Gly Ala Arg Cys Arg Asn Ser Ile Ala Ser Cys Ala Asp 35 40 45Glu Gln Pro His Ile Gly Asn Tyr Arg Leu Leu Lys Thr Ile Gly Lys 50 55 60Gly Asn Phe Ala Lys Val Lys Leu Ala Arg His Ile Leu Thr Gly Arg65 70 75 80Glu Val Ala Ile Lys Ile Ile Asp Lys Thr Gln Leu Asn Pro Thr Ser 85 90 95Leu Gln Lys Leu Phe Arg Glu Val Arg Ile Met Lys Ile Leu Asn His 100 105 110Pro Asn Ile Val Lys Leu Phe Glu Val Ile Glu Thr Glu Lys Thr Leu 115 120 125Tyr Leu Ile Met Glu Tyr Ala Ser Gly Gly Glu Val Phe Asp Tyr Leu 130 135 140Val Ala His Gly Arg Met Lys Glu Lys Glu Ala Arg Ser Lys Phe Arg145 150 155 160Gln Ile Val Ser Ala Val Gln Tyr Cys His Gln Lys Arg Ile Val His 165 170 175Arg Asp Leu Lys Ala Glu Asn Leu Leu Leu Asp Ala Asp Met Asn Ile 180 185 190Lys Ile Ala Asp Phe Gly Phe Ser Asn Glu Phe Thr Val Gly Gly Lys 195 200 205Leu Asp Thr Phe Cys Gly Ser Pro Pro Tyr Ala Ala Pro Glu Leu Phe 210 215 220Gln Gly Lys Lys Tyr Asp Gly Pro Glu Val Asp Val Trp Ser Leu Gly225 230 235 240Val Ile Leu Tyr Thr Leu Val Ser Gly Ser Leu Pro Phe Asp Gly Gln 245 250 255Asn Leu Lys Glu Leu Arg Glu Arg Val Leu Arg Gly Lys Tyr Arg Ile 260 265 270Pro Phe Tyr Met Ser Thr Asp Cys Glu Asn Leu Leu Lys Arg Phe Leu 275 280 285Val Leu Asn Pro Ile Lys Arg Gly Thr Leu Glu Gln Ile Met Lys Asp 290 295 300Arg Trp Ile Asn Ala Gly His Glu Glu Asp Glu Leu Lys Pro Phe Val305 310 315 320Glu Pro Glu Leu Asp Ile Ser Asp Gln Lys Arg Ile Asp Ile Met Val 325 330 335Gly Met Gly Tyr Ser Gln Glu Glu Ile Gln Glu Ser Leu Ser Lys Met 340 345 350Lys Tyr Asp Glu Ile Thr Ala Thr Tyr Leu Leu Leu Gly Arg Lys Ser 355 360 365Ser Glu Leu Asp Ala Ser Asp Ser Ser Ser Ser Ser Asn Leu Ser Leu 370 375 380Ala Lys Val Arg Pro Ser Ser Asp Leu Asn Asn Ser Thr Gly Gln Ser385 390 395 400Pro His His Lys Val Gln Arg Ser Val Ser Ser Ser Gln Lys 
Gln Arg 405 410 415Arg Tyr Ser Asp His Ala Gly Pro Ala Ile Pro Ser Val Val Ala Tyr 420 425 430Pro Lys Arg Ser Gln Thr Ser Thr Ala Asp Gly Asp Leu Lys Glu Asp 435 440 445Gly Ile Ser Ser Arg Lys Ser Ser Gly Ser Ala Val Gly Gly Lys Gly 450 455 460Ile Ala Pro Ala Ser Pro Met Leu Gly Asn Ala Ser Asn Pro Asn Lys465 470 475 480Ala Asp Ile Pro Glu Arg Lys Lys Ser Ser Thr Val Pro Ser Ser Asn 485 490 495Thr Ala Ser Gly Gly Met Thr Arg Arg Asn Thr Tyr Val Cys Ser Glu 500 505 510Arg Thr Thr Ala Asp Arg His Ser Val Ile Gln Asn Gly Lys Glu Asn 515 520 525Ser Thr Ile Pro Asp Gln Arg Thr Pro Val Ala Ser Thr His Ser Ile 530 535 540Ser Ser Ala Ala Thr Pro Asp Arg Ile Arg Phe Pro Arg Gly Thr Ala545 550 555 560Ser Arg Ser Thr Phe His Gly Gln Pro Arg Glu Arg Arg Thr Ala Thr 565 570 575Tyr Asn Gly Pro Pro Ala Ser Pro Ser Leu Ser His Glu Ala Thr Pro 580 585 590Leu Ser Gln Thr Arg Ser Arg Gly Ser Thr Asn Leu Phe Ser Lys Leu 595 600 605Thr Ser Lys Leu Thr Arg Ser Arg Asn Val Ser Ala Glu Gln Lys Asp 610 615 620Glu Asn Lys Glu Ala Lys Pro Arg Ser Leu Arg Phe Thr Trp Ser Met625 630 635 640Lys Thr Thr Ser Ser Met Asp Pro Gly Asp Met Met Arg Glu Ile Arg 645 650 655Lys Val Leu Asp Ala Asn Asn Cys Asp Tyr Glu Gln Arg Glu Arg Phe 660 665 670Leu Leu Phe Cys Val His Gly Asp Gly His Ala Glu Asn Leu Val Gln 675 680 685Trp Glu Met Glu Val Cys Lys Leu Pro Arg Leu Ser Leu Asn Gly Val 690 695 700Arg Phe Lys Arg Ile Ser Gly Thr Ser Ile Ala Phe Lys Asn Ile Ala705 710 715 720Ser Lys Ile Ala Asn Glu Leu Lys Leu 72527713PRTHomo sapiens 27Met Ser Thr Arg Thr Pro Leu Pro Thr Val Asn Glu Arg Asp Thr Glu1 5 10 15Asn His Thr Ser His Gly Asp Gly Arg Gln Glu Val Thr Ser Arg Thr 20 25 30Ser Arg Ser Gly Ala Arg Cys Arg Asn Ser Ile Ala Ser Cys Ala Asp 35 40 45Glu Gln Pro His Ile Gly Asn Tyr Arg Leu Leu Lys Thr Ile Gly Lys 50 55 60Gly Asn Phe Ala Lys Val Lys Leu Ala Arg His Ile Leu Thr Gly Arg65 70 75 80Glu Val Ala Ile Lys Ile Ile Asp Lys Thr Gln Leu Asn Pro Thr Ser 85 90 95Leu Gln Lys Leu Phe Arg Glu Val Arg Ile Met Lys 
Ile Leu Asn His 100 105 110Pro Asn Ile Val Lys Leu Phe Glu Val Ile Glu Thr Gln Lys Thr Leu 115 120 125Tyr Leu Ile Met Glu Tyr Ala Ser Gly Gly Lys Val Phe Asp Tyr Leu 130 135 140Val Ala His Gly Arg Met Lys Glu Lys Glu Ala Arg Ser Lys Phe Arg145 150 155 160Gln Ile Val Ser Ala Val Gln Tyr Cys His Gln Lys Arg Ile Val His 165 170 175Arg Asp Leu Lys Ala Glu Asn Leu Leu Leu Asp Ala Asp Met Asn Ile 180 185 190Lys Ile Ala Asp Phe Gly Phe Ser Asn Glu Phe Thr Val Gly Gly Lys 195 200 205Leu Asp Thr Phe Cys Gly Ser Pro Pro Tyr Ala Ala Pro Glu Leu Phe 210 215 220Gln Gly Lys Lys Tyr Asp Gly Pro Glu Val Asp Val Trp Ser Leu Gly225 230 235 240Val Ile Leu Tyr Thr Leu Val Ser Gly Ser Leu Pro Phe Asp Gly Gln 245 250 255Asn Leu Lys Glu Leu Arg Glu Arg Val Leu Arg Gly Lys Tyr Arg Ile 260 265 270Pro Phe Tyr Met Ser Thr Asp Cys Glu Asn Leu Leu Lys Arg Phe Leu 275 280 285Val Leu Asn Pro Ile Lys Arg Gly Thr Leu Glu Gln Ile Met Lys Asp 290 295 300Arg Trp Ile Asn Ala Gly His Glu Glu Asp Glu Leu Lys Pro Phe Val305 310 315 320Glu Pro Glu Leu Asp Ile Ser Asp Gln Lys Arg Ile Asp Ile Met Val 325 330 335Gly Met Gly Tyr Ser Gln Glu Glu Ile Gln Glu Ser Leu Ser Lys Met 340 345 350Lys Tyr Asp Glu Ile Thr Ala Thr Tyr Leu Leu Leu Gly Arg Lys Ser 355 360 365Ser Glu Val Arg Pro Ser Ser Asp Leu Asn Asn Ser Thr Gly Gln Ser 370 375 380Pro His His Lys Val Gln Arg Ser Val Ser Ser Ser Gln Lys Gln Arg385 390 395 400Arg Tyr Ser Asp His Ala Gly Pro Gly Ile Pro Ser Val Val Ala Tyr 405 410 415Pro Lys Arg Ser Gln Thr Ser Thr Ala Asp Ser Asp Leu Lys Glu Asp 420 425 430Gly Ile Ser Ser Arg Lys Ser Thr Gly Ser Ala Val Gly Gly Lys Gly 435 440 445Ile Ala Pro Ala Ser Pro Met Leu Gly Asn Ala Ser Asn Pro Asn Lys 450 455 460Ala Asp Ile Pro Glu Arg Lys Lys Ser Ser Thr Val Pro Ser Ser Asn465 470 475 480Thr Ala Ser Gly Gly Met Thr Arg Arg Asn Thr Tyr Val Cys Ser Glu 485 490 495Arg Thr Thr Asp Asp Arg His Ser Val Ile Gln Asn Gly Lys Glu Asn 500 505 510Ser Thr Ile Pro Asp Gln Arg Thr Pro Val Ala Ser Thr His Ser Ile 515 520 525Ser Ser 
Ala Ala Thr Pro Asp Arg Ile Arg Phe Pro Arg Gly Thr Ala 530 535 540Ser Arg Ser Thr Phe His Gly Gln Pro Arg Glu Arg Arg Thr Ala Thr545 550 555 560Tyr Asn Gly Pro Pro Ala Ser Pro Ser Leu Ser His Glu Ala Thr Pro 565 570 575Leu Ser Gln Thr Arg Ser Arg Gly Ser Thr Thr Leu Phe Ser Lys Leu 580 585 590Thr Ser Lys Leu Thr Arg Ser Arg Asn Val Ser Ala Lys Gln Lys Asp 595 600 605Glu Asn Lys Glu Ala Lys Pro Arg Ser Leu Arg Phe Thr Trp Ser Met 610 615 620Lys Thr Thr Ser Ser Met Asp Pro Gly Asp Met Met Arg Glu Ile Arg625 630 635 640Lys Val Leu Asp Ala Asn Asn Cys Asp Tyr Glu Gln Arg Glu Arg Phe 645 650 655Leu Leu Phe Cys Val His Gly Asp Gly His Ala Glu Asn Leu Val Gln 660 665 670Trp Glu Met Glu Val Cys Lys Leu Pro Arg Leu Ser Leu Asn Gly Val 675 680 685Arg Phe Lys Arg Ile Ser Gly Thr Ser Ile Ala Phe Lys Asn Ile Ala 690 695 700Ser Lys Ile Ala Asn Glu Leu Lys Leu705 71028688PRTHomo sapiens 28Met Ser Ser Arg Thr Val Leu Ala Pro Gly Asn Asp Arg Asn Ser Asp1 5 10 15Thr His Gly Thr Leu Gly Ser Gly Arg Ser Ser Asp Lys Gly Pro Ser 20 25 30Trp Ser Ser Arg Ser Leu Gly Ala Arg Cys Arg Asn Ser Ile Ala Ser 35 40 45Cys Pro Glu Glu Gln Pro His Val Gly Asn Tyr Arg Leu Leu Arg Thr 50 55 60Ile Gly Lys Gly Asn Ser Ala Lys Val Lys Leu Ala Arg His Ile Leu65 70 75 80Thr Gly Arg Glu Val Ala Ile Lys Ile Ile Asp Lys Thr Gln Leu Asn 85 90 95Pro Ser Ser Leu Gln Lys Leu Phe Arg Glu Val Arg Ile Met Lys Gly 100 105 110Leu Asn His Pro Asn Ile Val Lys Leu Phe Glu Val Ile Glu Thr Glu 115 120 125Lys Thr Leu Tyr Leu Val Met Glu Tyr Ala Ser Ala Gly Glu Val Phe 130 135 140Asp Tyr Leu Val Ser His Gly Arg Met Lys Glu Lys Glu Ala Arg Ala145 150 155 160Lys Phe Arg Gln Ile Val Ser Ala Val His Tyr Cys His Gln Lys Asn 165 170 175Ile Val His Arg Asp Leu Lys Ala Glu Asn Leu Leu Leu Asp Ala Glu 180 185 190Ala Asn Ile Lys Ile Ala Asp Phe Gly Phe Ser Asn Glu Phe Thr Leu 195 200 205Gly Ser Lys Leu Asp Thr Phe Cys Gly Ser Pro Pro Tyr Ala Ala Pro 210 215 220Glu Leu Phe Gln Gly Lys Lys Tyr Asp Gly Pro Glu Val Asp Ile Trp225 230 235 
240Ser Leu Gly Val Ile Leu Tyr Thr Leu Val Ser Gly Ser Leu Pro Phe 245 250 255Asp Gly His Asn Leu Lys Glu Leu Arg Glu Arg Val Leu Arg Gly Lys 260 265 270Tyr Arg Val Pro Phe Tyr Met Ser Thr Asp Cys Glu Ser Ile Leu Arg 275 280 285Arg Phe Leu Val Leu Asn Pro Ala Lys Arg Cys Thr Leu Glu Gln Ile 290 295 300Met Lys Asp Lys Trp Ile Asn Ile Gly Tyr Glu Gly Glu Glu Leu Lys305 310 315 320Pro Tyr Thr Glu Pro Glu Glu Asp Phe Gly Asp Thr Lys Arg Ile Glu 325 330 335Val Met Val Gly Met Gly Tyr Thr Arg Glu Glu Ile Lys Glu Ser Leu 340 345 350Thr Ser Gln Lys Tyr Asn Glu Val Thr Ala Thr Tyr Leu Leu Leu Gly 355 360 365Arg Lys Thr Glu Glu Gly Gly Asp Arg Gly Ala Pro Gly Leu Ala Leu 370 375 380Ala Arg Val Arg Ala Pro Ser Asp Thr Thr Asn Gly Thr Ser Ser Ser385 390 395 400Lys Gly Thr Ser His Ser Lys Gly Gln Arg Ser Ser Ser Ser Thr Tyr 405 410 415His Arg Gln Arg Arg His Ser Asp Phe Cys Gly Pro Ser Pro Ala Pro 420 425 430Leu His Pro Lys Arg Ser Pro Thr Ser Thr Gly Glu Ala Glu Leu Lys 435 440 445Glu Glu Arg Leu Pro Gly Arg Lys Ala Ser Cys Ser Thr Ala Gly Ser 450 455 460Gly Ser Arg Gly Leu Pro Pro Ser Ser Pro Met Val Ser Ser Ala His465 470 475 480Asn Pro Asn Lys Ala Glu Ile Pro Glu Arg Arg Lys Asp Ser Thr Ser 485 490 495Thr Pro Asn Asn Leu Pro Pro Ser Met Met Thr Arg Arg Asn Thr Tyr 500 505 510Val Cys Thr Glu Arg Pro Gly Ala Glu Arg Pro Ser Leu Leu Pro Asn 515 520 525Gly Lys Glu Asn Ser Ser Gly Thr Pro Arg Val Pro Pro Ala Ser Pro 530 535 540Ser Ser His Ser Leu Ala Pro Pro Ser Gly Glu Arg Ser Arg Leu Ala545 550 555 560Arg Gly Ser Thr Ile Arg Ser Thr Phe His Gly Gly Gln Val Arg Asp 565 570 575Arg Arg Ala Gly Gly Gly Gly Gly Gly Gly Val Gln Asn Gly Pro Pro 580
585 590
Ala Ser Pro Thr Leu Ala His Glu Ala Ala Pro Leu Pro Ala Gly Arg
        595                 600                 605
Pro Arg Pro Thr Thr Asn Leu Phe Thr Lys Leu Thr Ser Lys Leu Thr
    610                 615                 620
Arg Arg Val Thr Leu Asp Pro Ser Lys Arg Gln Asn Ser Asn Arg Cys
625                 630                 635                 640
Val Ser Gly Ala Ser Leu Pro Gln Gly Ser Lys Ile Arg Ser Gln Thr
                645                 650                 655
Asn Leu Arg Glu Ser Gly Asp Leu Arg Ser Gln Val Ala Ile Tyr Leu
            660                 665                 670
Gly Ile Lys Arg Lys Pro Pro Pro Gly Cys Ser Asp Ser Pro Gly Val
        675                 680                 685

SEQ ID NO: 29
LENGTH: 688
TYPE: PRT
ORGANISM: Homo sapiens
SEQUENCE: 29

Met Ser Ser Arg Thr Val Leu Ala Pro Gly Asn Asp Arg Asn Ser Asp
1               5                   10                  15
Thr His Gly Thr Leu Gly Ser Gly Arg Ser Ser Asp Lys Gly Pro Ser
            20                  25                  30
Trp Ser Ser Arg Ser Leu Gly Ala Arg Cys Arg Asn Ser Ile Ala Ser
        35                  40                  45
Cys Pro Glu Glu Gln Pro His Val Gly Asn Tyr Arg Leu Leu Arg Thr
    50                  55                  60
Ile Gly Lys Gly Asn Ser Ala Lys Val Lys Leu Ala Arg His Ile Leu
65                  70                  75                  80
Thr Gly Arg Glu Val Ala Ile Lys Ile Ile Asp Lys Thr Gln Leu Asn
                85                  90                  95
Pro Ser Ser Leu Gln Lys Leu Phe Arg Glu Val Arg Ile Met Lys Gly
            100                 105                 110
Leu Asn His Pro Asn Ile Val Lys Leu Phe Glu Val Ile Glu Thr Glu
        115                 120                 125
Lys Thr Leu Tyr Leu Val Met Glu Tyr Ala Ser Ala Gly Glu Val Phe
    130                 135                 140
Asp Tyr Leu Val Ser His Gly Arg Met Lys Glu Lys Glu Ala Arg Ala
145                 150                 155                 160
Lys Phe Arg Gln Ile Val Ser Ala Val His Tyr Cys His Gln Lys Asn
                165                 170                 175
Ile Val His Arg Asp Leu Lys Ala Glu Asn Leu Leu Leu Asp Ala Glu
            180                 185                 190
Ala Asn Ile Lys Ile Ala Asp Phe Gly Phe Ser Asn Glu Phe Thr Leu
        195                 200                 205
Gly Ser Lys Leu Asp Thr Phe Cys Gly Ser Pro Pro Tyr Ala Ala Pro
    210                 215                 220
Glu Leu Phe Gln Gly Lys Lys Tyr Asp Gly Pro Glu Val Asp Ile Trp
225                 230                 235                 240
Ser Leu Gly Val Ile Leu Tyr Thr Leu Val Ser Gly Ser Leu Pro Phe
                245                 250                 255
Asp Gly His Asn Leu Lys Glu Leu Arg Glu Arg Val Leu Arg Gly Lys
            260                 265                 270
Tyr Arg Val Pro Phe Tyr Met Ser Thr Asp Cys Glu Ser Ile Leu Arg
        275                 280                 285
Arg Phe Leu Val Leu Asn Pro Ala Lys Arg Cys Thr Leu Glu Gln Ile
    290                 295                 300
Met Lys Asp Lys Trp Ile Asn Ile Gly Tyr Glu Gly Glu Glu Leu Lys
305                 310                 315                 320
Pro Tyr Thr Glu Pro Glu Glu Asp Phe Gly Asp Thr Lys Arg Ile Glu
                325                 330                 335
Val Met Val Gly Met Gly Tyr Thr Arg Glu Glu Ile Lys Glu Ser Leu
            340                 345                 350
Thr Ser Gln Lys Tyr Asn Glu Val Thr Ala Thr Tyr Leu Leu Leu Gly
        355                 360                 365
Arg Lys Thr Glu Glu Gly Gly Asp Arg Gly Ala Pro Gly Leu Ala Leu
    370                 375                 380
Ala Arg Val Arg Ala Pro Ser Asp Thr Thr Asn Gly Thr Ser Ser Ser
385                 390                 395                 400
Lys Gly Thr Ser His Ser Lys Gly Gln Arg Ser Ser Ser Ser Thr Tyr
                405                 410                 415
His Arg Gln Arg Arg His Ser Asp Phe Cys Gly Pro Ser Pro Ala Pro
            420                 425                 430
Leu His Pro Lys Arg Ser Pro Thr Ser Thr Gly Glu Ala Glu Leu Lys
        435                 440                 445
Glu Glu Arg Leu Pro Gly Arg Lys Ala Ser Cys Ser Thr Ala Gly Ser
    450                 455                 460
Gly Ser Arg Gly Leu Pro Pro Ser Ser Pro Met Val Ser Ser Ala His
465                 470                 475                 480
Asn Pro Asn Lys Ala Glu Ile Pro Glu Arg Arg Lys Asp Ser Thr Ser
                485                 490                 495
Thr Pro Asn Asn Leu Pro Pro Ser Met Met Thr Arg Arg Asn Thr Tyr
            500                 505                 510
Val Cys Thr Glu Arg Pro Gly Ala Glu Arg Pro Ser Leu Leu Pro Asn
        515                 520                 525
Gly Lys Glu Asn Ser Ser Gly Thr Pro Arg Val Pro Pro Ala Ser Pro
    530                 535                 540
Ser Ser His Ser Leu Ala Pro Pro Ser Gly Glu Arg Ser Arg Leu Ala
545                 550                 555                 560
Arg Gly Ser Thr Ile Arg Ser Thr Phe His Gly Gly Gln Val Arg Asp
                565                 570                 575
Arg Arg Ala Gly Gly Gly Gly Gly Gly Gly Val Gln Asn Gly Pro Pro
            580                 585                 590
Ala Ser Pro Thr Leu Ala His Glu Ala Ala Pro Leu Pro Ala Gly Arg
        595                 600                 605
Pro Arg Pro Thr Thr Asn Leu Phe Thr Lys Leu Thr Ser Lys Leu Thr
    610                 615                 620
Arg Arg Val Thr Leu Asp Pro Ser Lys Arg Gln Asn Ser Asn Arg Cys
625                 630                 635                 640
Val Ser Gly Ala Ser Leu Pro Gln Gly Ser Lys Ile Arg Ser Gln Thr
                645                 650                 655
Asn Leu Arg Glu Ser Gly Asp Leu Arg Ser Gln Val Ala Ile Tyr Leu
            660                 665                 670
Gly Ile Lys Arg Lys Pro Pro Pro Gly Cys Ser Asp Ser Pro Gly Val
        675                 680                 685

* * * * *
