LRRCAPs as modifiers of the p53 pathway and methods of use

Belvin, Marcia ; et al.

Patent Application Summary

U.S. patent application number 10/274583 was filed with the patent office on 2003-07-24 for LRRCAPs as modifiers of the p53 pathway and methods of use. Invention is credited to Belvin, Marcia, Francis-Lang, Helen, Funke, Roel P., Li, Danxi, Lioubin, Mario N., Plowman, Gregory D., Schleithoff, Lothar.

Application Number	20030138431 10/274583
Family ID	27407290
Filed Date	2003-07-24

United States Patent Application	20030138431
Kind Code	A1
Belvin, Marcia ; et al.	July 24, 2003

LRRCAPs as modifiers of the p53 pathway and methods of use

Abstract

Human LRRCAPS genes are identified as modulators of the p53 pathway, and thus are therapeutic targets for disorders associated with defective p53 function. Methods for identifying modulators of p53, comprising screening for agents that modulate the activity of LRRCAPS are provided.

Inventors:	Belvin, Marcia; (Albany, CA) ; Schleithoff, Lothar; (Tuebingen, DE) ; Plowman, Gregory D.; (San Carlos, CA) ; Funke, Roel P.; (South San, CA) ; Lioubin, Mario N.; (San Mateo, CA) ; Li, Danxi; (San Francisco, CA) ; Francis-Lang, Helen; (San Francisco, CA)
Correspondence Address:	JAN P. BRUNELLE EXELIXIS, INC. 170 HARBOR WAY P.O. BOX 511 SOUTH SAN FRANCISCO CA 94083-0511 US
Family ID:	27407290
Appl. No.:	10/274583
Filed:	October 21, 2002

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
60338733	Oct 22, 2001
60357600	Feb 15, 2002
60361196	Mar 1, 2002

Current U.S. Class:	424/155.1 ; 435/6.16; 435/7.23; 514/44A
Current CPC Class:	G01N 33/5011 20130101; G01N 33/573 20130101; C12Q 1/485 20130101; G01N 33/574 20130101; C12Q 1/527 20130101; G01N 33/57419 20130101; G01N 33/57449 20130101; G01N 2333/988 20130101; G01N 2500/10 20130101; G01N 33/57496 20130101; G01N 2333/82 20130101; G01N 33/5308 20130101; G01N 2333/4739 20130101; G01N 33/57484 20130101; G01N 2500/00 20130101; G01N 2510/00 20130101; G01N 2333/90212 20130101; G01N 33/57415 20130101; G01N 33/6872 20130101; G01N 2333/912 20130101; C12Q 1/42 20130101; G01N 33/57423 20130101
Class at Publication:	424/155.1 ; 514/44; 435/6; 435/7.23
International Class:	A61K 039/395; C12Q 001/68; G01N 033/574; A61K 048/00

Claims

What is claimed is:

1. A method of identifying a candidate p53 pathway modulating agent, said method comprising the steps of: a. providing an assay system comprising a purified LRRCAPS polypeptide or nucleic acid or a functionally active fragment or derivative thereof; b. contacting the assay system with a test agent under conditions whereby, but for the presence of the test agent, the system provides a reference activity; and c. detecting a test agent-biased activity of the assay system, wherein a difference between the test agent-biased activity and the reference activity identifies the test agent as a candidate p53 pathway modulating agent.

2. The method of claim 1 wherein the assay system comprises cultured cells that express the LRRCAPS polypeptide.

3. The method of claim 2 wherein the cultured cells additionally have defective p53 function.

4. The method of claim 1 wherein the assay system includes a screening assay comprising a LRRCAPS polypeptide, and the candidate test agent is a small molecule modulator.

5. The method of claim 4 wherein the assay is a binding assay.

6. The method of claim 1 wherein the assay system is selected from the group consisting of an apoptosis assay system, a cell proliferation assay system, an angiogenesis assay system, and a hypoxic induction assay system.

7. The method of claim 1 wherein the assay system includes a binding assay comprising a LRRCAPS polypeptide and the candidate test agent is an antibody.

8. The method of claim 1 wherein the assay system includes an expression assay comprising a LRRCAPS nucleic acid and the candidate test agent is a nucleic acid modulator.

9. The method of claim 8 wherein the nucleic acid modulator is an antisense oligomer.

10. The method of claim 8 wherein the nucleic acid modulator is a PMO.

11. The method of claim 1 additionally comprising: d. administering the candidate p53 pathway modulating agent identified in (c) to a model system comprising cells defective in p53 function, and detecting a phenotypic change in the model system that indicates that the p53 function is restored.

12. The method of claim 11 wherein the model system is a mouse model with defective p53 function.

13. A method for modulating a p53 pathway of a cell comprising contacting a cell defective in p53 function with a candidate modulator that specifically binds to a LRRCAPS polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID NOs: 19, 20, 21, 22, 23, and 24, whereby p53 function is restored.

14. The method of claim 13 wherein the candidate modulator is administered to a vertebrate animal predetermined to have a disease or disorder resulting from a defect in p53 function.

15. The method of claim 13 wherein the candidate modulator is selected from the group consisting of an antibody and a small molecule.

16. The method of claim 1, comprising the additional steps of: e. providing a secondary assay system comprising cultured cells or a non-human animal expressing LRRCAPS, f. contacting the secondary assay system with the test agent of (b) or an agent derived therefrom under conditions whereby, but for the presence of the test agent or agent derived therefrom, the system provides a reference activity; and g. detecting an agent-biased activity of the second assay system, h. wherein a difference between the agent-biased activity and the reference activity of the second assay system confirms the test agent or agent derived therefrom as a candidate p53 pathway modulating agent, i. and wherein the second assay detects an agent-biased change in the p53 pathway.

17. The method of claim 16 wherein the secondary assay system comprises cultured cells.

18. The method of claim 16 wherein the secondary assay system comprises a non-human animal.

19. The method of claim 18 wherein the non-human animal mis-expresses a p53 pathway gene.

20. A method of modulating p53 pathway in a mammalian cell comprising contacting the cell with an agent that specifically binds a LRRCAPS polypeptide or nucleic acid.

21. The method of claim 20 wherein the agent is administered to a mammalian animal predetermined to have a pathology associated with the p53 pathway.

22. The method of claim 20 wherein the agent is a small molecule modulator, a nucleic acid modulator, or an antibody.

23. A method for diagnosing a disease in a patient comprising: a. obtaining a biological sample from the patient; b. contacting the sample with a probe for LRRCAPS expression; c. comparing results from step (b) with a control; d. determining whether step (c) indicates a likelihood of disease.

24. The method of claim 23 wherein said disease is cancer.

25. The method according to claim 24, wherein said cancer is a cancer as shown in Table 2 as having >25% expression level.

Description

REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. provisional patent applications 60/338,733 filed Oct. 22, 2001, 60/357,600 filed Feb. 15, 2002, and 60/361,196 filed Mar. 1, 2002. The contents of the prior applications are hereby incorporated in their entirety.

BACKGROUND OF THE INVENTION

[0002] The p53 gene is mutated in over 50 different types of human cancers, including familial and spontaneous cancers, and is believed to be the most commonly mutated gene in human cancer (Zambetti and Levine, FASEB (1993) 7:855-865; Hollstein, et al., Nucleic Acids Res. (1994) 22:3551-3555). Greater than 90% of mutations in the p53 gene are missense mutations that alter a single amino acid that inactivates p53 function. Aberrant forms of human p53 are associated with poor prognosis, more aggressive tumors, metastasis, and short survival rates (Mitsudomi et al., Clin Cancer Res 2000 October; 6(10):4055-63; Koshland, Science (1993) 262:1953).

[0003] The human p53 protein normally functions as a central integrator of signals including DNA damage, hypoxia, nucleotide deprivation, and oncogene activation (Prives, Cell (1998) 95:5-8). In response to these signals, p53 protein levels are greatly increased with the result that the accumulated p53 activates cell cycle arrest or apoptosis depending on the nature and strength of these signals. Indeed, multiple lines of experimental evidence have pointed to a key role for p53 as a tumor suppressor (Levine, Cell (1997) 88:323-331). For example, homozygous p53 "knockout" mice are developmentally normal but exhibit nearly 100% incidence of neoplasia in the first year of life (Donehower et al., Nature (1992) 356:215-221).

[0004] The biochemical mechanisms and pathways through which p53 functions in normal and cancerous cells are not fully understood, but one clearly important aspect of p53 function is its activity as a gene-specific transcriptional activator. Among the genes with known p53-response elements are several with well-characterized roles in either regulation of the cell cycle or apoptosis, including GADD45, p21/Waf1/Cip1, cyclin G, Bax, IGF-BP3, and MDM2 (Levine, Cell (1997) 88:323-331).

[0005] Leucine-rich repeats (LRRs) are short motifs of 22-28 residues in length and are found in various cytoplasmic, membrane, and extracellular proteins (Rothberg, J. et al. (1990) Genes Dev (12A): 2169-87). These proteins play diverse roles, with protein-protein interactions being the most common property. In vitro studies of a synthetic LRR from Drosophila Toll protein have implied that the peptides form gels by adopting beta-sheet structures that form extended filaments. These results support the idea that LRRs mediate protein-protein interactions and cellular adhesion (Gay, N. (1991) FEBS Lett; 291(1): 87-91). Other functions of LRR-containing proteins include the binding of enzymes (Tan, F. et al. (1990) J Biol Chem; 265(1): 13-9), vascular repair (Hickey, M. (1989) Proc Natl Acad Sci USA; 86(17): 6773-7), and neuronal pathfinding and synapse formation (Taniguchi H et al (2000) J Neurobiol 42:104-106). The 3-D structure of ribonuclease inhibitor, a protein containing 15 LRRs, has been determined (Kobe, B. and Deisenhofer, J. (1993) Nature; 366(6457): 751-6) demonstrating LRRs to be a new class of alpha/beta fold. LRRs form elongated non-globular structures and are often flanked by cysteine rich domains.

[0006] D2S448 is a melanoma associated gene, a tumor antigen, and possibly a peroxidase, which may be involved in p53-dependent apoptosis and immune responses. It also shows promise as a potential immunogenic peptide for cancer vaccination (Horikoshi, N. et al. (1999) Biochem Biophys Res Commun; 261(3): 864-9).

[0007] Glioma amplified on chromosome 1 protein (GAC1) is a member of the leucine rich repeat (LRR) superfamily and may play a role in signal transduction or cell adhesion. Gene amplification of this gene is seen in glioma and retinoblastoma tumors (Almeida, A. et al. (1998) Oncogene 16: 2997-3002). GAC1 contains 12 full-length LRR motifs, and its LRR block is flanked by cysteine-rich sequences. GAC1 is expressed in adult brain and at much lower levels in adult heart and kidney (Almeida, A. et al. (1998) supra).

[0008] Trophoblast glycoprotein (TPBG) is a protein that is expressed by all types of trophoblasts as early as 9 weeks of development, and was originally identified as a cell surface antigen defined by monoclonal antibody 5T4. TPBG plays a role in cell adhesion and motility and may be involved in metastasis, placentation, and trophoblast invasion. Expression of TPBG in gastric and colon cancers is associated with tumor metastasis and poor prognosis (Boyle, J. et al. (1990) Hum. Genet 84: 455-458). TPBG is expressed in several tumor cell lines (Myers, K. et al. (1994) Biol. Chem. 269: 9319-9324).

[0009] The ability to manipulate the genomes of model organisms such as Drosophila provides a powerful means to analyze biochemical processes that, due to significant evolutionary conservation, have direct relevance to more complex vertebrate organisms. Due to a high level of gene and pathway conservation, the strong similarity of cellular processes, and the functional conservation of genes between these model organisms and mammals, identification of the involvement of novel genes in particular pathways and their functions in such model organisms can directly contribute to the understanding of the correlative pathways and methods of modulating them in mammals (see, for example, Mechler B M et al., 1985 EMBO J 4:1551-1557; Gateff E. 1982 Adv. Cancer Res. 37: 33-74; Watson K L., et al., 1994 J Cell Sci. 18: 19-33; Miklos G L, and Rubin G M. 1996 Cell 86:521-529; Wassarman D A, et al., 1995 Curr Opin Gen Dev 5: 44-50; and Booth D R. 1999 Cancer Metastasis Rev. 18: 261-284). For example, a genetic screen can be carried out in an invertebrate model organism having underexpression (e.g. knockout) or overexpression of a gene (referred to as a "genetic entry point") that yields a visible phenotype. Additional genes are mutated in a random or targeted manner. When a gene mutation changes the original phenotype caused by the mutation in the genetic entry point, the gene is identified as a "modifier" involved in the same or overlapping pathway as the genetic entry point. When the genetic entry point is an ortholog of a human gene implicated in a disease pathway, such as p53, modifier genes can be identified that may be attractive candidate targets for novel therapeutics.

[0010] All references cited herein, including patents, patent applications, publications, and sequence information in referenced Genbank identifier numbers, are incorporated herein in their entireties.

SUMMARY OF THE INVENTION

[0011] We have discovered genes that modify the p53 pathway in Drosophila, and identified their human orthologs, hereinafter referred to as LRRCAPS (Leucine rich repeat, capricious related). The invention provides methods for utilizing these p53 modifier genes and polypeptides to identify LRRCAPS-modulating agents that are candidate therapeutic agents that can be used in the treatment of disorders associated with defective or impaired p53 function and/or LRRCAPS function. Preferred LRRCAPS-modulating agents specifically bind to LRRCAPS polypeptides and restore p53 function. Other preferred LRRCAPS-modulating agents are nucleic acid modulators such as antisense oligomers and RNAi that repress LRRCAPS gene expression or product activity by, for example, binding to and inhibiting the respective nucleic acid (i.e. DNA or mRNA).

[0012] LRRCAPS modulating agents may be evaluated by any convenient in vitro or in vivo assay for molecular interaction with an LRRCAPS polypeptide or nucleic acid. In one embodiment, candidate LRRCAPS modulating agents are tested with an assay system comprising a LRRCAPS polypeptide or nucleic acid. Agents that produce a change in the activity of the assay system relative to controls are identified as candidate p53 modulating agents. The assay system may be cell-based or cell-free. LRRCAPS-modulating agents include LRRCAPS related proteins (e.g. dominant negative mutants, and biotherapeutics); LRRCAPS-specific antibodies; LRRCAPS-specific antisense oligomers and other nucleic acid modulators; and chemical agents that specifically bind to or interact with LRRCAPS or compete with LRRCAPS binding partner (e.g. by binding to an LRRCAPS binding partner). In one specific embodiment, a small molecule modulator is identified using a binding assay. In specific embodiments, the screening assay system is selected from an apoptosis assay, a cell proliferation assay, an angiogenesis assay, and a hypoxic induction assay.
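The comparison of assay activity against controls described above can be sketched as follows. This is a minimal illustration only: the function name `is_candidate_modulator`, the use of replicate means, and the `min_fold_change` cutoff are assumptions made for the example and are not specified in this application.

```python
from statistics import mean


def is_candidate_modulator(test_readings, reference_readings, min_fold_change=1.5):
    """Flag a test agent whose assay activity differs from the reference activity.

    min_fold_change is an illustrative cutoff, not a value from the application.
    """
    test = mean(test_readings)
    reference = mean(reference_readings)
    if reference <= 0 or test <= 0:
        raise ValueError("activities must be positive to compute a fold change")
    fold = test / reference
    # A change in either direction (enhancement or inhibition) counts as a difference.
    return fold >= min_fold_change or fold <= 1.0 / min_fold_change
```

In practice the decision rule would depend on the assay format (binding, apoptosis, proliferation, and so on) and on its measured variability; the fold-change test here only stands in for "a change in the activity of the assay system relative to controls".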

[0013] In another embodiment, candidate p53 pathway modulating agents are further tested using a second assay system that detects changes in the p53 pathway, such as angiogenic, apoptotic, or cell proliferation changes produced by the originally identified candidate agent or an agent derived from the original agent. The second assay system may use cultured cells or non-human animals. In specific embodiments, the secondary assay system uses non-human animals, including animals predetermined to have a disease or disorder implicating the p53 pathway, such as an angiogenic, apoptotic, or cell proliferation disorder (e.g. cancer).

[0014] The invention further provides methods for modulating the LRRCAPS function and/or the p53 pathway in a mammalian cell by contacting the mammalian cell with an agent that specifically binds a LRRCAPS polypeptide or nucleic acid. The agent may be a small molecule modulator, a nucleic acid modulator, or an antibody and may be administered to a mammalian animal predetermined to have a pathology associated with the p53 pathway.

DETAILED DESCRIPTION OF THE INVENTION

[0015] A genetic modifier screen was designed to identify modifiers of the p53 pathway in Drosophila; in this screen, p53 was overexpressed in the wing (Ollmann M, et al., Cell 2000 101: 91-101). The CAPS gene was identified as a modifier of the p53 pathway. Accordingly, vertebrate orthologs of this modifier, and preferably the human orthologs, LRRCAPS (Leucine rich repeat, capricious related) genes (i.e., nucleic acids and polypeptides) are attractive drug targets for the treatment of pathologies associated with a defective p53 signaling pathway, such as cancer.

[0016] In vitro and in vivo methods of assessing LRRCAPS function are provided herein. Modulation of the LRRCAPS or their respective binding partners is useful for understanding the association of the p53 pathway and its members in normal and disease conditions and for developing diagnostics and therapeutic modalities for p53 related pathologies. LRRCAPS-modulating agents that act by inhibiting or enhancing LRRCAPS expression, directly or indirectly, for example, by affecting an LRRCAPS function such as binding activity, can be identified using methods provided herein. LRRCAPS modulating agents are useful in diagnosis, therapy and pharmaceutical development.

[0017] Nucleic Acids and Polypeptides of the Invention

[0018] Sequences related to LRRCAPS nucleic acids and polypeptides that can be used in the invention are disclosed in Genbank (referenced by Genbank identifier (GI) number) as GI#s 18073097 (SEQ ID NO: 1), 16157510 (SEQ ID NO: 3), 14758125 (SEQ ID NO: 4), 14149931 (SEQ ID NO: 5), 20547335 (SEQ ID NO: 7), 5453655 (SEQ ID NO: 9), 3253212 (SEQ ID NO: 10), 21734210 (SEQ ID NO: 12), 21706505 (SEQ ID NO: 13), 14764197 (SEQ ID NO: 14), 5729717 (SEQ ID NO: 15), and 435654 (SEQ ID NO: 18) for nucleic acid, and GI#s 11877257 (SEQ ID NO: 19), 16157511 (SEQ ID NO: 20), 14758126 (SEQ ID NO: 21), 5453656 (SEQ ID NO: 22), 14764198 (SEQ ID NO: 23), and 5729718 (SEQ ID NO: 24) for polypeptides. Further, nucleic acid sequences of SEQ ID NOs: 2, 6, 8, 11, 16, and 17 can also be used in the methods of the invention.

[0019] The term "LRRCAPS polypeptide" refers to a full-length LRRCAPS protein or a functionally active fragment or derivative thereof. A "functionally active" LRRCAPS fragment or derivative exhibits one or more functional activities associated with a full-length, wild-type LRRCAPS protein, such as antigenic or immunogenic activity, ability to bind natural cellular substrates, etc. The functional activity of LRRCAPS proteins, derivatives and fragments can be assayed by various methods known to one skilled in the art (Current Protocols in Protein Science (1998) Coligan et al., eds., John Wiley & Sons, Inc., Somerset, N.J.) and as further discussed below. In one embodiment, a functionally active LRRCAPS polypeptide is a LRRCAPS derivative capable of rescuing defective endogenous LRRCAPS activity, such as in cell based or animal assays; the rescuing derivative may be from the same or a different species. For purposes herein, functionally active fragments also include those fragments that comprise one or more structural domains of an LRRCAPS, such as a binding domain. Protein domains can be identified using the PFAM program (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2). For example, the approximate amino acid locations of LRR (leucine rich repeat) and IG domains of LRRCAPS from GI#s 11877257, 16157511, 14758126, 5453656, 14764198, and 5729718 (SEQ ID NOs: 19, 20, 21, 22, 23, and 24, respectively) are listed in Table 1 further below in Example 1. Methods for obtaining LRRCAPS polypeptides are also further described below. In some embodiments, preferred fragments are functionally active, domain-containing fragments comprising at least 25 contiguous amino acids, preferably at least 50, more preferably 75, and most preferably at least 100 contiguous amino acids of any one of SEQ ID NOs: 19, 20, 21, 22, 23, and 24 (an LRRCAPS). In further preferred embodiments, the fragment comprises the entire functionally active domain.

[0020] The term "LRRCAPS nucleic acid" refers to a DNA or RNA molecule that encodes a LRRCAPS polypeptide. Preferably, the LRRCAPS polypeptide or nucleic acid or fragment thereof is from a human, but can also be an ortholog, or derivative thereof with at least 70% sequence identity, preferably at least 80%, more preferably 85%, still more preferably 90%, and most preferably at least 95% sequence identity with human LRRCAPS. Methods of identifying orthologs are known in the art. Normally, orthologs in different species retain the same function, due to the presence of one or more protein motifs and/or 3-dimensional structures. Orthologs are generally identified by sequence homology analysis, such as BLAST analysis, usually using protein bait sequences. Sequences are assigned as a potential ortholog if the best hit sequence from the forward BLAST result retrieves the original query sequence in the reverse BLAST (Huynen M A and Bork P, Proc Natl Acad Sci (1998) 95:5849-5856; Huynen M A et al., Genome Research (2000) 10:1204-1210). Programs for multiple sequence alignment, such as CLUSTAL (Thompson J D et al, 1994, Nucleic Acids Res 22:4673-4680) may be used to highlight conserved regions and/or residues of orthologous proteins and to generate phylogenetic trees. In a phylogenetic tree representing multiple homologous sequences from diverse species (e.g., retrieved through BLAST analysis), orthologous sequences from two species generally appear closest on the tree with respect to all other sequences from these two species. Structural threading or other analysis of protein folding (e.g., using software by ProCeryon, Biosciences, Salzburg, Austria) may also identify potential orthologs. In evolution, when a gene duplication event follows speciation, a single gene in one species, such as Drosophila, may correspond to multiple genes (paralogs) in another, such as human. As used herein, the term "orthologs" encompasses paralogs.
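The forward/reverse BLAST criterion for assigning potential orthologs can be illustrated with a short sketch. It assumes the best hits have already been parsed from BLAST output into two dictionaries; the function name and all sequence identifiers used with it are hypothetical.

```python
def reciprocal_best_hits(forward_best, reverse_best):
    """Return (query, hit) pairs where each sequence is the other's best BLAST hit.

    forward_best maps each query ID (species A) to its best hit in species B;
    reverse_best maps each species-B ID to its best hit in species A.
    Both dicts are assumed to be pre-parsed from BLAST results.
    """
    pairs = []
    for query, hit in forward_best.items():
        # Keep the pair only if the reverse search retrieves the original query.
        if reverse_best.get(hit) == query:
            pairs.append((query, hit))
    return pairs
```

A strict reciprocal test of this kind returns at most one partner per query, so cases where one Drosophila gene corresponds to several human paralogs would need additional handling, for example inspection of a phylogenetic tree.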
As used herein, "percent (%) sequence identity" with respect to a subject sequence, or a specified portion of a subject sequence, is defined as the percentage of nucleotides or amino acids in the candidate derivative sequence identical with the nucleotides or amino acids in the subject sequence (or specified portion thereof), after aligning the sequences and introducing gaps, if necessary to achieve the maximum percent sequence identity, as generated by the program WU-BLAST-2.0a19 (Altschul et al., J. Mol. Biol. (1997) 215:403-410) with all the search parameters set to default values. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. A % identity value is determined by the number of matching identical nucleotides or amino acids divided by the sequence length for which the percent identity is being reported. "Percent (%) amino acid sequence similarity" is determined by doing the same calculation as for determining % amino acid sequence identity, but including conservative amino acid substitutions in addition to identical amino acids in the computation.
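The percent identity calculation defined above (identical positions divided by the length over which identity is reported) can be sketched on a pair of already aligned sequences. This sketch does not reproduce WU-BLAST; it assumes the alignment is supplied as two equal-length strings with `-` marking gaps, and, when no length is given, it assumes identity is reported over the ungapped subject sequence.

```python
def percent_identity(aligned_candidate, aligned_subject, reported_length=None):
    """Percent identity between two aligned sequences ('-' marks a gap)."""
    if len(aligned_candidate) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    # Count aligned columns holding the same residue in both sequences.
    matches = sum(
        1 for a, b in zip(aligned_candidate, aligned_subject) if a == b and a != "-"
    )
    if reported_length is None:
        # Assumption: report over the ungapped length of the subject sequence.
        reported_length = sum(1 for b in aligned_subject if b != "-")
    if reported_length == 0:
        raise ValueError("reported length must be positive")
    return 100.0 * matches / reported_length
```

Passing `reported_length` explicitly covers the case where identity is reported over a specified portion of the subject sequence rather than its full length.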

[0021] A conservative amino acid substitution is one in which an amino acid is substituted for another amino acid having similar properties such that the folding or activity of the protein is not significantly affected. Aromatic amino acids that can be substituted for each other are phenylalanine, tryptophan, and tyrosine; interchangeable hydrophobic amino acids are leucine, isoleucine, methionine, and valine; interchangeable polar amino acids are glutamine and asparagine; interchangeable basic amino acids are arginine, lysine and histidine; interchangeable acidic amino acids are aspartic acid and glutamic acid; and interchangeable small amino acids are alanine, serine, threonine, cysteine and glycine.
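The substitution groups listed above can be written as a lookup and used in a percent similarity calculation that counts conservative substitutions together with identical residues. This is a sketch under stated assumptions: one-letter amino acid codes, an alignment supplied as two equal-length strings with `-` for gaps, and similarity reported over the ungapped subject length. The function names are hypothetical.

```python
# Interchangeable residue groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("FWY"),    # aromatic: phenylalanine, tryptophan, tyrosine
    set("LIMV"),   # hydrophobic: leucine, isoleucine, methionine, valine
    set("QN"),     # polar: glutamine, asparagine
    set("RKH"),    # basic: arginine, lysine, histidine
    set("DE"),     # acidic: aspartic acid, glutamic acid
    set("ASTCG"),  # small: alanine, serine, threonine, cysteine, glycine
]


def is_conservative(a, b):
    """True if a and b are different residues belonging to the same group."""
    return a != b and any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_similarity(aligned_candidate, aligned_subject):
    """Percent similarity: identical plus conservative positions over subject length."""
    if len(aligned_candidate) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    similar = 0
    for a, b in zip(aligned_candidate, aligned_subject):
        if a == "-" or b == "-":
            continue  # gap columns contribute neither identity nor similarity
        if a == b or is_conservative(a, b):
            similar += 1
    # Assumption: report over the ungapped length of the subject sequence.
    subject_length = sum(1 for b in aligned_subject if b != "-")
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    return 100.0 * similar / subject_length
```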

[0022] Alternatively, an alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman (Smith and Waterman, 1981, Advances in Applied Mathematics 2:482-489; database: European Bioinformatics Institute; Smith and Waterman, 1981, J. of Molec. Biol., 147:195-197; Nicholas et al., 1998, "A Tutorial on Searching Sequence Databases and Sequence Scoring Methods" (www.psc.edu) and references cited therein.; W. R. Pearson, 1991, Genomics 11:635-650). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff (Dayhoff: Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., U.S.A.), and normalized by Gribskov (Gribskov 1986 Nucl. Acids Res. 14(6):6745-6763). The Smith-Waterman algorithm may be employed where default parameters are used for scoring (for example, gap open penalty of 12, gap extension penalty of two). From the data generated, the "Match" value reflects "sequence identity."
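A local alignment score in the style of Smith and Waterman, using the example gap penalties above (gap open 12, gap extension 2), can be sketched as follows. Several choices are assumptions made for illustration: the match and mismatch scores (+5 and -4), the convention that the first gapped position costs `gap_open` and each further position costs `gap_extend`, and the function name. For amino acid sequences the match/mismatch test would be replaced by a lookup in a scoring matrix.

```python
def smith_waterman_score(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=2):
    """Best local alignment score of a and b with affine gap penalties."""
    neg_inf = float("-inf")
    n, m = len(a), len(b)
    # H: best score of a local alignment ending at (i, j).
    # E: best score ending with a gap in a (a residue of b left unpaired).
    # F: best score ending with a gap in b (a residue of a left unpaired).
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    F = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    best = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Either open a new gap or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # The floor of 0 is what makes the alignment local.
            H[i][j] = max(0, diag, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best
```

The sketch returns only the score; recovering the aligned residues (and from them the "Match" value) would require a traceback through the three matrices.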

[0023] Derivative nucleic acid molecules of the subject nucleic acid molecules include sequences that hybridize to the nucleic acid sequence of any of SEQ ID NOs: 1-18. The stringency of hybridization can be controlled by temperature, ionic strength, pH, and the presence of denaturing agents such as formamide during hybridization and washing. Conditions routinely used are set out in readily available procedure texts (e.g., Current Protocols in Molecular Biology, Vol. 1, Chap. 2.10, John Wiley & Sons, Publishers (1994); Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). In some embodiments, a nucleic acid molecule of the invention is capable of hybridizing to a nucleic acid molecule containing the nucleotide sequence of any one of SEQ ID NOs: 1-18 under high stringency hybridization conditions that comprise: prehybridization of filters containing nucleic acid for 8 hours to overnight at 65°C in a solution comprising 6× single strength citrate (SSC) (1× SSC is 0.15 M NaCl, 0.015 M Na citrate; pH 7.0), 5× Denhardt's solution, 0.05% sodium pyrophosphate and 100 µg/ml herring sperm DNA; hybridization for 18-20 hours at 65°C in a solution containing 6× SSC, 1× Denhardt's solution, 100 µg/ml yeast tRNA and 0.05% sodium pyrophosphate; and washing of filters at 65°C for 1 h in a solution containing 0.1× SSC and 0.1% SDS (sodium dodecyl sulfate).
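Because the SSC solutions above are given as multiples of a defined single-strength solution, the corresponding molar concentrations follow by simple scaling. The short sketch below works that arithmetic from the stated composition of single-strength SSC (0.15 M NaCl, 0.015 M Na citrate); the names `SSC_1X` and `ssc_molarity` are hypothetical.

```python
from fractions import Fraction

# Single-strength SSC as defined in the text: 0.15 M NaCl, 0.015 M sodium citrate.
SSC_1X = {"NaCl_M": Fraction("0.15"), "Na_citrate_M": Fraction("0.015")}


def ssc_molarity(strength):
    """Molar concentrations for SSC at the given strength (e.g. 6 for six-fold)."""
    # Fractions avoid floating-point drift when scaling decimal concentrations.
    factor = Fraction(str(strength))
    return {name: float(conc * factor) for name, conc in SSC_1X.items()}
```

For example, `ssc_molarity(6)` gives 0.9 M NaCl and 0.09 M Na citrate for the hybridization solution, and `ssc_molarity(0.1)` gives 0.015 M NaCl and 0.0015 M Na citrate for the high stringency wash.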

[0024] In other embodiments, moderately stringent hybridization conditions are used that comprise: pretreatment of filters containing nucleic acid for 6 h at 40°C in a solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml denatured salmon sperm DNA; hybridization for 18-20 h at 40°C in a solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 µg/ml salmon sperm DNA, and 10% (wt/vol) dextran sulfate; followed by washing twice for 1 hour at 55°C in a solution containing 2× SSC and 0.1% SDS.

[0025] Alternatively, low stringency conditions can be used that comprise: incubation for 8 hours to overnight at 37°C in a solution comprising 20% formamide, 5× SSC, 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured sheared salmon sperm DNA; hybridization in the same buffer for 18 to 20 hours; and washing of filters in 1× SSC at about 37°C for 1 hour.

[0026] Isolation, Production, Expression, and Mis-expression of LRRCAPS Nucleic Acids and Polypeptides

[0027] LRRCAPS nucleic acids and polypeptides are useful for identifying and testing agents that modulate LRRCAPS function and for other applications related to the involvement of LRRCAPS in the p53 pathway. LRRCAPS nucleic acids and derivatives and orthologs thereof may be obtained using any available method. For instance, techniques for isolating cDNA or genomic DNA sequences of interest by screening DNA libraries or by using polymerase chain reaction (PCR) are well known in the art. In general, the particular use for the protein will dictate the particulars of expression, production, and purification methods. For instance, production of proteins for use in screening for modulating agents may require methods that preserve specific biological activities of these proteins, whereas production of proteins for antibody generation may require structural integrity of particular epitopes. Expression of proteins to be purified for screening or antibody production may require the addition of specific tags (e.g., generation of fusion proteins). Overexpression of an LRRCAPS protein for assays used to assess LRRCAPS function, such as involvement in cell cycle regulation or hypoxic response, may require expression in eukaryotic cell lines capable of these cellular activities. Techniques for the expression, production, and purification of proteins are well known in the art; any suitable means therefor may be used (e.g., Higgins S J and Hames B D (eds.) Protein Expression: A Practical Approach, Oxford University Press Inc., New York 1999; Stanbury P F et al., Principles of Fermentation Technology, 2nd edition, Elsevier Science, New York, 1995; Doonan S (ed.) Protein Purification Protocols, Humana Press, New Jersey, 1996; Coligan J E et al, Current Protocols in Protein Science (eds.), 1999, John Wiley & Sons, New York). In particular embodiments, recombinant LRRCAPS is expressed in a cell line known to have defective p53 function (e.g. SAOS-2 osteoblasts, H1299 lung cancer cells, C33A and HT3 cervical cancer cells, HT-29 and DLD-1 colon cancer cells, among others, available from American Type Culture Collection (ATCC), Manassas, Va.). The recombinant cells are used in cell-based screening assay systems of the invention, as described further below.

[0028] The nucleotide sequence encoding an LRRCAPS polypeptide can be inserted into any appropriate expression vector. The necessary transcriptional and translational signals, including promoter/enhancer element, can derive from the native LRRCAPS gene and/or its flanking regions or can be heterologous. A variety of host-vector expression systems may be utilized, such as mammalian cell systems infected with virus (e.g. vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g. baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage, plasmid, or cosmid DNA. An isolated host cell strain that modulates the expression of, modifies, and/or specifically processes the gene product may be used.

[0029] To detect expression of the LRRCAPS gene product, the expression vector can comprise a promoter operably linked to an LRRCAPS gene nucleic acid, one or more origins of replication, and one or more selectable markers (e.g. thymidine kinase activity, resistance to antibiotics, etc.). Alternatively, recombinant expression vectors can be identified by assaying for the expression of the LRRCAPS gene product based on the physical or functional properties of the LRRCAPS protein in in vitro assay systems (e.g. immunoassays).

[0030] The LRRCAPS protein, fragment, or derivative may be optionally expressed as a fusion, or chimeric protein product (i.e. it is joined via a peptide bond to a heterologous protein sequence of a different protein), for example to facilitate purification or detection. A chimeric product can be made by ligating the appropriate nucleic acid sequences encoding the desired amino acid sequences to each other using standard methods and expressing the chimeric product. A chimeric product may also be made by protein synthetic techniques, e.g. by use of a peptide synthesizer (Hunkapiller et al., Nature (1984) 310:105-111).

[0031] Once a recombinant cell that expresses the LRRCAPS gene sequence is identified, the gene product can be isolated and purified using standard methods (e.g. ion exchange, affinity, and gel exclusion chromatography; centrifugation; differential solubility; electrophoresis). Alternatively, native LRRCAPS proteins can be purified from natural sources, by standard methods (e.g. immunoaffinity purification). Once a protein is obtained, it may be quantified and its activity measured by appropriate methods, such as immunoassay, bioassay, or other measurements of physical properties, such as crystallography.

[0032] The methods of this invention may also use cells that have been engineered for altered expression (mis-expression) of LRRCAPS or other genes associated with the p53 pathway. As used herein, mis-expression encompasses ectopic expression, over-expression, under-expression, and non-expression (e.g. by gene knock-out or blocking expression that would otherwise normally occur).

[0033] Genetically Modified Animals

[0034] Animal models that have been genetically modified to alter LRRCAPS expression may be used in in vivo assays to test for activity of a candidate p53 modulating agent, or to further assess the role of LRRCAPS in a p53 pathway process such as apoptosis or cell proliferation. Preferably, the altered LRRCAPS expression results in a detectable phenotype, such as decreased or increased levels of cell proliferation, angiogenesis, or apoptosis compared to control animals having normal LRRCAPS expression. The genetically modified animal may additionally have altered p53 expression (e.g. p53 knockout). Preferred genetically modified animals are mammals such as primates, rodents (preferably mice or rats), among others. Preferred non-mammalian species include zebrafish, C. elegans, and Drosophila. Preferred genetically modified animals are transgenic animals having a heterologous nucleic acid sequence present as an extrachromosomal element in a portion of its cells, i.e. mosaic animals (see, for example, techniques described by Jakobovits, 1994, Curr. Biol. 4:761-763.) or stably integrated into its germ line DNA (i.e., in the genomic sequence of most or all of its cells). Heterologous nucleic acid is introduced into the germ line of such transgenic animals by genetic manipulation of, for example, embryos or embryonic stem cells of the host animal.

[0035] Methods of making transgenic animals are well-known in the art (for transgenic mice see Brinster et al., Proc. Nat. Acad. Sci. USA 82: 4438-4442 (1985), U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al., and Hogan, B., Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1986); for particle bombardment see U.S. Pat. No. 4,945,050, by Sandford et al.; for transgenic Drosophila see Rubin and Spradling, Science (1982) 218:348-53 and U.S. Pat. No. 4,670,388; for transgenic insects see Berghammer A. J. et al., A Universal Marker for Transgenic Insects (1999) Nature 402:370-371; for transgenic Zebrafish see Lin S., Transgenic Zebrafish, Methods Mol Biol. (2000) 136:375-383; for microinjection procedures for fish, amphibian eggs and birds see Houdebine and Chourrout, Experientia (1991) 47:897-905; for transgenic rats see Hammer et al., Cell (1990) 63:1099-1112; and for culturing of embryonic stem (ES) cells and the subsequent production of transgenic animals by the introduction of DNA into ES cells using methods such as electroporation, calcium phosphate/DNA precipitation and direct injection see, e.g., Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed., IRL Press (1987)). Clones of the nonhuman transgenic animals can be produced according to available methods (see Wilmut, I. et al. (1997) Nature 385:810-813; and PCT International Publication Nos. WO 97/07668 and WO 97/07669).

[0036] In one embodiment, the transgenic animal is a "knock-out" animal having a heterozygous or homozygous alteration in the sequence of an endogenous LRRCAPS gene that results in a decrease of LRRCAPS function, preferably such that LRRCAPS expression is undetectable or insignificant. Knock-out animals are typically generated by homologous recombination with a vector comprising a transgene having at least a portion of the gene to be knocked out. Typically a deletion, addition or substitution has been introduced into the transgene to functionally disrupt it. The transgene can be a human gene (e.g., from a human genomic clone) but more preferably is an ortholog of the human gene derived from the transgenic host species. For example, a mouse LRRCAPS gene is used to construct a homologous recombination vector suitable for altering an endogenous LRRCAPS gene in the mouse genome. Detailed methodologies for homologous recombination in mice are available (see Capecchi, Science (1989) 244:1288-1292; Joyner et al., Nature (1989) 338:153-156). Procedures for the production of non-rodent transgenic mammals and other animals are also available (Houdebine and Chourrout, supra; Pursel et al., Science (1989) 244:1281-1288; Simms et al., Bio/Technology (1988) 6:179-183). In a preferred embodiment, knock-out animals, such as mice harboring a knockout of a specific gene, may be used to produce antibodies against the human counterpart of the gene that has been knocked out (Claesson M H et al., (1994) Scan J Immunol 40:257-264; Declerck P J et al., (1995) J Biol Chem. 270:8397-400).

[0037] In another embodiment, the transgenic animal is a "knock-in" animal having an alteration in its genome that results in altered expression (e.g., increased (including ectopic) or decreased expression) of the LRRCAPS gene, e.g., by introduction of additional copies of LRRCAPS, or by operatively inserting a regulatory sequence that provides for altered expression of an endogenous copy of the LRRCAPS gene. Such regulatory sequences include inducible, tissue-specific, and constitutive promoters and enhancer elements. The knock-in can be homozygous or heterozygous.

[0038] Transgenic nonhuman animals can also be produced that contain selected systems allowing for regulated expression of the transgene. One example of such a system that may be produced is the cre/loxP recombinase system of bacteriophage P1 (Lakso et al., PNAS (1992) 89:6232-6236; U.S. Pat. No. 4,959,317). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of "double" transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355; U.S. Pat. No. 5,654,182). In a preferred embodiment, both Cre-LoxP and Flp-Frt are used in the same system to regulate expression of the transgene, and for sequential deletion of vector sequences in the same cell (Sun X et al (2000) Nat Genet 25:83-6).

[0039] The genetically modified animals can be used in genetic studies to further elucidate the p53 pathway, as animal models of disease and disorders implicating defective p53 function, and for in vivo testing of candidate therapeutic agents, such as those identified in screens described below. The candidate therapeutic agents are administered to a genetically modified animal having altered LRRCAPS function and phenotypic changes are compared with appropriate control animals such as genetically modified animals that receive placebo treatment, and/or animals with unaltered LRRCAPS expression that receive candidate therapeutic agent.

[0040] In addition to the above-described genetically modified animals having altered LRRCAPS function, animal models having defective p53 function (and otherwise normal LRRCAPS function), can be used in the methods of the present invention. For example, a p53 knockout mouse can be used to assess, in vivo, the activity of a candidate p53 modulating agent identified in one of the in vitro assays described below. p53 knockout mice are described in the literature (Jacks et al., Nature 2001;410:1111-1116, 1043-1044; Donehower et al., supra). Preferably, the candidate p53 modulating agent when administered to a model system with cells defective in p53 function, produces a detectable phenotypic change in the model system indicating that the p53 function is restored, i.e., the cells exhibit normal cell cycle progression.

[0041] Modulating Agents

[0042] The invention provides methods to identify agents that interact with and/or modulate the function of LRRCAPS and/or the p53 pathway. Modulating agents identified by the methods are also part of the invention. Such agents are useful in a variety of diagnostic and therapeutic applications associated with the p53 pathway, as well as in further analysis of the LRRCAPS protein and its contribution to the p53 pathway. Accordingly, the invention also provides methods for modulating the p53 pathway comprising the step of specifically modulating LRRCAPS activity by administering a LRRCAPS-interacting or -modulating agent.

[0043] As used herein, an "LRRCAPS-modulating agent" is any agent that modulates LRRCAPS function, for example, an agent that interacts with LRRCAPS to inhibit or enhance LRRCAPS activity or otherwise affect normal LRRCAPS function. LRRCAPS function can be affected at any level, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In a preferred embodiment, the LRRCAPS-modulating agent specifically modulates the function of the LRRCAPS. The phrases "specific modulating agent", "specifically modulates", etc., are used herein to refer to modulating agents that directly bind to the LRRCAPS polypeptide or nucleic acid, and preferably inhibit, enhance, or otherwise alter, the function of the LRRCAPS. These phrases also encompass modulating agents that alter the interaction of the LRRCAPS with a binding partner, substrate, or cofactor (e.g. by binding to a binding partner of an LRRCAPS, or to a protein/binding partner complex, and altering LRRCAPS function). In a further preferred embodiment, the LRRCAPS-modulating agent is a modulator of the p53 pathway (e.g. it restores and/or upregulates p53 function) and thus is also a p53-modulating agent.

[0044] Preferred LRRCAPS-modulating agents include small molecule compounds; LRRCAPS-interacting proteins, including antibodies and other biotherapeutics; and nucleic acid modulators such as antisense and RNA inhibitors. The modulating agents may be formulated in pharmaceutical compositions, for example, as compositions that may comprise other active ingredients, as in combination therapy, and/or suitable carriers or excipients. Techniques for formulation and administration of the compounds may be found in "Remington's Pharmaceutical Sciences" Mack Publishing Co., Easton, Pa., 19th edition.

[0045] Small Molecule Modulators

[0046] Small molecules are often preferred to modulate function of proteins with enzymatic function, and/or containing protein interaction domains. Chemical agents, referred to in the art as "small molecule" compounds are typically organic, non-peptide molecules, having a molecular weight less than 10,000, preferably less than 5,000, more preferably less than 1,000, and most preferably less than 500. This class of modulators includes chemically synthesized molecules, for instance, compounds from combinatorial chemical libraries. Synthetic compounds may be rationally designed or identified based on known or inferred properties of the LRRCAPS protein or may be identified by screening compound libraries. Alternative appropriate modulators of this class are natural products, particularly secondary metabolites from organisms such as plants or fungi, which can also be identified by screening compound libraries for LRRCAPS-modulating activity. Methods for generating and obtaining compounds are well known in the art (Schreiber S L, Science (2000) 151: 1964-1969; Radmann J and Gunther J, Science (2000) 151:1947-1948).
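The molecular weight tiers in the preceding paragraph can be expressed as a simple lookup. The following is a minimal sketch only: the function name and tier labels are hypothetical, and the unit is assumed to be daltons, which the text does not state.

```python
# Molecular-weight cutoffs quoted in the text: <10,000, <5,000, <1,000, <500.
# ASSUMPTION: values are in daltons; the labels are illustrative, not the source's.
TIERS = [
    (500.0, "most preferred"),
    (1000.0, "more preferred"),
    (5000.0, "preferred"),
    (10000.0, "typical small molecule"),
]


def classify_by_molecular_weight(mw):
    """Return the narrowest tier whose cutoff is strictly above mw."""
    if mw <= 0:
        raise ValueError("molecular weight must be positive")
    for cutoff, label in TIERS:
        if mw < cutoff:
            return label
    return "outside small-molecule range"


if __name__ == "__main__":
    for weight in (180.16, 750.0, 4200.0, 12000.0):
        print(weight, classify_by_molecular_weight(weight))
```

Because each cutoff is a strict "less than", a compound of exactly 500 falls into the next tier up.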

[0047] Small molecule modulators identified from screening assays, as described below, can be used as lead compounds from which candidate clinical compounds may be designed, optimized, and synthesized. Such clinical compounds may have utility in treating pathologies associated with the p53 pathway. The activity of candidate small molecule modulating agents may be improved several-fold through iterative secondary functional validation, as further described below, structure determination, and candidate modulator modification and testing. Additionally, candidate clinical compounds are generated with specific regard to clinical and pharmacological properties. For example, the reagents may be derivatized and re-screened using in vitro and in vivo assays to optimize activity and minimize toxicity for pharmaceutical development.

[0048] Protein Modulators

[0049] Specific LRRCAPS-interacting proteins are useful in a variety of diagnostic and therapeutic applications related to the p53 pathway and related disorders, as well as in validation assays for other LRRCAPS-modulating agents. In a preferred embodiment, LRRCAPS-interacting proteins affect normal LRRCAPS function, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In another embodiment, LRRCAPS-interacting proteins are useful in detecting and providing information about the function of LRRCAPS proteins, as is relevant to p53 related disorders, such as cancer (e.g., for diagnostic means).

[0050] An LRRCAPS-interacting protein may be endogenous, i.e. one that naturally interacts genetically or biochemically with an LRRCAPS, such as a member of the LRRCAPS pathway that modulates LRRCAPS expression, localization, and/or activity. LRRCAPS-modulators include dominant negative forms of LRRCAPS-interacting proteins and of LRRCAPS proteins themselves. Yeast two-hybrid and variant screens offer preferred methods for identifying endogenous LRRCAPS-interacting proteins (Finley, R. L. et al. (1996) in DNA Cloning-Expression Systems: A Practical Approach, eds. Glover D. & Hames B. D (Oxford University Press, Oxford, England), pp. 169-203; Fashema S F et al., Gene (2000) 250:1-14; Drees B L Curr Opin Chem Biol (1999) 3:64-70; Vidal M and Legrain P Nucleic Acids Res (1999) 27:919-29; and U.S. Pat. No. 5,928,868). Mass spectrometry is an alternative preferred method for the elucidation of protein complexes (reviewed in, e.g., Pandey A and Mann M, Nature (2000) 405:837-846; Yates J R 3rd, Trends Genet (2000) 16:5-8).

[0051] An LRRCAPS-interacting protein may be an exogenous protein, such as an LRRCAPS-specific antibody or a T-cell antigen receptor (see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory; Harlow and Lane (1999) Using antibodies: a laboratory manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press). LRRCAPS antibodies are further discussed below.

[0052] In preferred embodiments, an LRRCAPS-interacting protein specifically binds an LRRCAPS protein. In alternative preferred embodiments, an LRRCAPS-modulating agent binds an LRRCAPS substrate, binding partner, or cofactor.

[0053] Antibodies

[0054] In another embodiment, the protein modulator is an LRRCAPS specific antibody agonist or antagonist.

[0055] The antibodies have therapeutic and diagnostic utilities, and can be used in screening assays to identify LRRCAPS modulators. The antibodies can also be used in dissecting the portions of the LRRCAPS pathway responsible for various cellular responses and in the general processing and maturation of the LRRCAPS.

[0056] Antibodies that specifically bind LRRCAPS polypeptides can be generated using known methods. Preferably the antibody is specific to a mammalian ortholog of LRRCAPS polypeptide, and more preferably, to human LRRCAPS. Antibodies may be polyclonal, monoclonal (mAbs), humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above. Epitopes of LRRCAPS, which are particularly antigenic, can be selected, for example, by routine screening of LRRCAPS polypeptides for antigenicity or by applying a theoretical method for selecting antigenic regions of a protein (Hopp and Wood (1981), Proc. Natl. Acad. Sci. U.S.A. 78:3824-28; Hopp and Wood, (1983) Mol. Immunol. 20:483-89, Sutcliffe et al., (1983) Science 219:660-66) to the amino acid sequence shown in any of SEQ ID NOs: 19-24. Monoclonal antibodies with affinities of 10^8 M^-1, preferably 10^9 M^-1 to 10^10 M^-1, or stronger can be made by standard procedures as described (Harlow and Lane, supra; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed) Academic Press, New York; and U.S. Pat. Nos. 4,381,292, 4,451,570, and 4,618,577). Antibodies may be generated against crude cell extracts of LRRCAPS or substantially purified fragments thereof. If LRRCAPS fragments are used, they preferably comprise at least 10, and more preferably, at least 20 contiguous amino acids of an LRRCAPS protein. In a particular embodiment, LRRCAPS-specific antigens and/or immunogens are coupled to carrier proteins that stimulate the immune response. For example, the subject polypeptides are covalently coupled to the keyhole limpet hemocyanin (KLH) carrier, and the conjugate is emulsified in Freund's complete adjuvant, which enhances the immune response. An appropriate immune system such as a laboratory rabbit or mouse is immunized according to conventional protocols.
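The "theoretical method for selecting antigenic regions" cited above (Hopp and Wood) averages per-residue hydrophilicity over a sliding window and takes the highest-scoring stretch as a candidate epitope. The sketch below illustrates that idea under stated assumptions: the scale values are those commonly tabulated for the Hopp-Woods scale and should be checked against the publication, the six-residue window is the conventional choice, and the example sequence is hypothetical rather than any of SEQ ID NOs: 19-24.

```python
# Hopp-Woods hydrophilicity values as commonly tabulated.
# ASSUMPTION: verify against Hopp and Wood (1981) before relying on them.
HOPP_WOODS = {
    "R": 3.0, "D": 3.0, "E": 3.0, "K": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "P": 0.0, "T": -0.4, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "I": -1.8, "L": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def most_hydrophilic_window(sequence, window=6):
    """Return (start index, peptide, mean score) of the best-scoring window.

    Ties are resolved in favor of the first window encountered.
    """
    seq = sequence.upper()
    if len(seq) < window:
        raise ValueError("sequence is shorter than the window")
    best_start, best_score = 0, float("-inf")
    for start in range(len(seq) - window + 1):
        residues = seq[start:start + window]
        score = sum(HOPP_WOODS[aa] for aa in residues) / window
        if score > best_score:
            best_start, best_score = start, score
    return best_start, seq[best_start:best_start + window], best_score


if __name__ == "__main__":
    # Hypothetical sequence used only to exercise the function.
    print(most_hydrophilic_window("LLLLLLKDEKRDLLLLLL"))
```

A high-scoring window is only a candidate; the paragraph also allows selecting epitopes by routine antigenicity screening.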

[0057] The presence of LRRCAPS-specific antibodies is assayed by an appropriate assay such as a solid phase enzyme-linked immunosorbent assay (ELISA) using immobilized corresponding LRRCAPS polypeptides. Other assays, such as radioimmunoassays or fluorescent assays might also be used.

[0058] Chimeric antibodies specific to LRRCAPS polypeptides can be made that contain different portions from different animal species. For instance, a human immunoglobulin constant region may be linked to a variable region of a murine mAb, such that the antibody derives its biological activity from the human antibody, and its binding specificity from the murine fragment. Chimeric antibodies are produced by splicing together genes that encode the appropriate regions from each species (Morrison et al., Proc. Natl. Acad. Sci. (1984) 81:6851-6855; Neuberger et al., Nature (1984) 312:604-608; Takeda et al., Nature (1985) 31:452-454). Humanized antibodies, which are a form of chimeric antibodies, can be generated by grafting complementarity-determining regions (CDRs) (Carlos, T. M., J. M. Harlan. 1994. Blood 84:2068-2101) of mouse antibodies into a background of human framework regions and constant regions by recombinant DNA technology (Riechmann L M, et al., 1988 Nature 323: 323-327). Humanized antibodies contain about 10% murine sequences and about 90% human sequences, and thus further reduce or eliminate immunogenicity, while retaining the antibody specificities (Co M S, and Queen C. 1991 Nature 351: 501-501; Morrison S L. 1992 Ann. Rev. Immun. 10:239-265). Humanized antibodies and methods of their production are well-known in the art (U.S. Pat. Nos. 5,530,101, 5,585,089, 5,693,762, and 6,180,370).

[0059] LRRCAPS-specific single chain antibodies which are recombinant, single chain polypeptides formed by linking the heavy and light chain fragments of the Fv regions via an amino acid bridge, can be produced by methods known in the art (U.S. Pat. No. 4,946,778; Bird, Science (1988) 242:423-426; Huston et al., Proc. Natl. Acad. Sci. USA (1988) 85:5879-5883; and Ward et al., Nature (1989) 334:544-546).

[0060] Other suitable techniques for antibody production involve in vitro exposure of lymphocytes to the antigenic polypeptides or alternatively to selection of libraries of antibodies in phage or similar vectors (Huse et al., Science (1989) 246:1275-1281). As used herein, T-cell antigen receptors are included within the scope of antibody modulators (Harlow and Lane, 1988, supra).

[0061] The polypeptides and antibodies of the present invention may be used with or without modification. Frequently, antibodies will be labeled by joining, either covalently or non-covalently, a substance that provides for a detectable signal, or that is toxic to cells that express the targeted protein (Menard S, et al., Int J. Biol Markers (1989) 4:131-134). A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, fluorescent emitting lanthanide metals, chemiluminescent moieties, bioluminescent moieties, magnetic particles, and the like (U.S. Pat. Nos. 3,817,837, 3,850,752, 3,939,350, 3,996,345, 4,277,437, 4,275,149, and 4,366,241). Also, recombinant immunoglobulins may be produced (U.S. Pat. No. 4,816,567). Antibodies to cytoplasmic polypeptides may be delivered and reach their targets by conjugation with membrane-penetrating toxin proteins (U.S. Pat. No. 6,086,900).

[0062] When used therapeutically in a patient, the antibodies of the subject invention are typically administered parenterally, when possible at the target site, or intravenously. The therapeutically effective dose and dosage regimen is determined by clinical studies. Typically, the amount of antibody administered is in the range of about 0.1 mg/kg to about 10 mg/kg of patient weight. For parenteral administration, the antibodies are formulated in a unit dosage injectable form (e.g., solution, suspension, emulsion) in association with a pharmaceutically acceptable vehicle. Such vehicles are inherently nontoxic and non-therapeutic. Examples are water, saline, Ringer's solution, dextrose solution, and 5% human serum albumin. Nonaqueous vehicles such as fixed oils, ethyl oleate, or liposome carriers may also be used. The vehicle may contain minor amounts of additives, such as buffers and preservatives, which enhance isotonicity and chemical stability or otherwise enhance therapeutic potential. The antibodies' concentrations in such vehicles are typically in the range of about 1 mg/ml to about 10 mg/ml. Immunotherapeutic methods are further described in the literature (U.S. Pat. No. 5,859,206; WO0073469).
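The dose and concentration ranges in the preceding paragraph imply a simple calculation: total dose is patient weight times dose per kilogram, and injection volume is total dose divided by formulation concentration. The following sketch shows only that arithmetic. The function names are hypothetical, and treating the "about" ranges as hard limits is an assumption made for illustration.

```python
# Ranges quoted in the text. The text says "about", so treating them as
# hard limits here is an ASSUMPTION for illustration.
DOSE_RANGE_MG_PER_KG = (0.1, 10.0)
CONCENTRATION_RANGE_MG_PER_ML = (1.0, 10.0)


def antibody_dose_mg(weight_kg, dose_mg_per_kg):
    """Total antibody dose in mg for a given patient weight."""
    low, high = DOSE_RANGE_MG_PER_KG
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if not low <= dose_mg_per_kg <= high:
        raise ValueError("dose per kg is outside the quoted range")
    return weight_kg * dose_mg_per_kg


def injection_volume_ml(dose_mg, concentration_mg_per_ml):
    """Volume of formulated antibody needed to deliver dose_mg."""
    low, high = CONCENTRATION_RANGE_MG_PER_ML
    if not low <= concentration_mg_per_ml <= high:
        raise ValueError("concentration is outside the quoted range")
    return dose_mg / concentration_mg_per_ml


if __name__ == "__main__":
    dose = antibody_dose_mg(weight_kg=70.0, dose_mg_per_kg=1.0)
    print(dose, injection_volume_ml(dose, concentration_mg_per_ml=10.0))
```

For a 70 kg patient at 1 mg/kg the total is 70 mg, which corresponds to 7 ml of a 10 mg/ml formulation.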

[0063] Specific Biotherapeutics

[0064] In a preferred embodiment, an LRRCAPS-interacting protein may have biotherapeutic applications. Biotherapeutic agents formulated in pharmaceutically acceptable carriers and dosages may be used to activate or inhibit signal transduction pathways. This modulation may be accomplished by binding a ligand, thus inhibiting the activity of the pathway; or by binding a receptor, either to inhibit activation of, or to activate, the receptor. Alternatively, the biotherapeutic may itself be a ligand capable of activating or inhibiting a receptor. Biotherapeutic agents and methods of producing them are described in detail in U.S. Pat. No. 6,146,628.

[0065] LRRCAPS, its ligand(s), antibodies to the ligand(s) or the LRRCAPS itself may be used as biotherapeutics to modulate the activity of LRRCAPS in the p53 pathway.

[0066] Nucleic Acid Modulators

[0067] Other preferred LRRCAPS-modulating agents comprise nucleic acid molecules, such as antisense oligomers or double stranded RNA (dsRNA), which generally inhibit LRRCAPS activity. Preferred nucleic acid modulators interfere with the function of the LRRCAPS nucleic acid such as DNA replication, transcription, translocation of the LRRCAPS RNA to the site of protein translation, translation of protein from the LRRCAPS RNA, splicing of the LRRCAPS RNA to yield one or more mRNA species, or catalytic activity which may be engaged in or facilitated by the LRRCAPS RNA.

[0068] In one embodiment, the antisense oligomer is an oligonucleotide that is sufficiently complementary to an LRRCAPS mRNA to bind to and prevent translation, preferably by binding to the 5' untranslated region. LRRCAPS-specific antisense oligonucleotides preferably range from at least 6 to about 200 nucleotides. In some embodiments the oligonucleotide is preferably at least 10, 15, or 20 nucleotides in length. In other embodiments, the oligonucleotide is preferably less than 50, 40, or 30 nucleotides in length. The oligonucleotide can be DNA or RNA or a chimeric mixture or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone. The oligonucleotide may include other appending groups such as peptides, agents that facilitate transport across the cell membrane, hybridization-triggered cleavage agents, and intercalating agents.
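An antisense oligonucleotide that is fully complementary to a stretch of mRNA is the reverse complement of that stretch. The sketch below derives a DNA antisense sequence for a chosen window and enforces the 6 to 200 nucleotide range given above. The function name and the example mRNA are hypothetical, and perfect complementarity is an assumption (the text requires only that the oligomer be "sufficiently complementary").

```python
# Watson-Crick partners, written as DNA bases for a DNA antisense oligo.
# Both U (mRNA) and T (cDNA-style input) are accepted in the target.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A", "T": "A"}

MIN_LENGTH, MAX_LENGTH = 6, 200  # length range quoted in the text


def antisense_dna_oligo(mrna, start, length):
    """Return the 5'->3' DNA oligo complementary to mrna[start:start+length]."""
    if not MIN_LENGTH <= length <= MAX_LENGTH:
        raise ValueError("length is outside the 6-200 nucleotide range")
    target = mrna.upper()[start:start + length]
    if len(target) != length:
        raise ValueError("window runs past the end of the mRNA")
    # Reverse the target, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(target))


if __name__ == "__main__":
    # Hypothetical sequence, not an LRRCAPS sequence.
    example_mrna = "AUGGCCAUUGUAAUGGGCCGC"
    print(antisense_dna_oligo(example_mrna, start=0, length=10))
```

In practice a candidate would also be checked for specificity and secondary structure, which this sketch does not attempt.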

[0069] In another embodiment, the antisense oligomer is a phosphorodiamidate morpholino oligomer (PMO). PMOs are assembled from four different morpholino subunits, each of which contains one of four genetic bases (A, C, G, or T) linked to a six-membered morpholine ring. Polymers of these subunits are joined by non-ionic phosphorodiamidate intersubunit linkages. Details of how to make and use PMOs and other antisense oligomers are well known in the art (e.g. see WO99/18193; Probst J C, Antisense Oligodeoxynucleotide and Ribozyme Design, Methods. (2000) 22(3):271-281; Summerton J, and Weller D. 1997 Antisense Nucleic Acid Drug Dev. 7:187-95; U.S. Pat. No. 5,235,033 and U.S. Pat. No. 5,378,841).

[0070] Alternative preferred LRRCAPS nucleic acid modulators are double-stranded RNA species mediating RNA interference (RNAi). RNAi is the process of sequence-specific, post-transcriptional gene silencing in animals and plants, initiated by double-stranded RNA (dsRNA) that is homologous in sequence to the silenced gene. Methods relating to the use of RNAi to silence genes in C. elegans, Drosophila, plants, and humans are known in the art (Fire A, et al., 1998 Nature 391:806-811; Fire, A. Trends Genet. 15, 358-363 (1999); Sharp, P. A. RNA interference 2001. Genes Dev. 15, 485-490 (2001); Hammond, S. M., et al., Nature Rev. Genet. 2, 110-119 (2001); Tuschl, T. Chem. Biochem. 2, 239-245 (2001); Hamilton, A. et al., Science 286, 950-952 (1999); Hammond, S. M., et al., Nature 404, 293-296 (2000); Zamore, P. D., et al., Cell 101, 25-33 (2000); Bernstein, E., et al., Nature 409, 363-366 (2001); Elbashir, S. M., et al., Genes Dev. 15, 188-200 (2001); WO0129058; WO9932619; Elbashir S M, et al., 2001 Nature 411:494-498).

[0071] Nucleic acid modulators are commonly used as research reagents, diagnostics, and therapeutics. For example, antisense oligonucleotides, which are able to inhibit gene expression with exquisite specificity, are often used to elucidate the function of particular genes (see, for example, U.S. Pat. No. 6,165,790). Nucleic acid modulators are also used, for example, to distinguish between functions of various members of a biological pathway. For example, antisense oligomers have been employed as therapeutic moieties in the treatment of disease states in animals and man and have been demonstrated in numerous clinical trials to be safe and effective (Milligan J F, et al, Current Concepts in Antisense Drug Design, J Med Chem. (1993) 36:1923-1937; Tonkinson J L et al., Antisense Oligodeoxynucleotides as Clinical Therapeutic Agents, Cancer Invest. (1996) 14:54-65). Accordingly, in one aspect of the invention, an LRRCAPS-specific nucleic acid modulator is used in an assay to further elucidate the role of the LRRCAPS in the p53 pathway, and/or its relationship to other members of the pathway. In another aspect of the invention, an LRRCAPS-specific antisense oligomer is used as a therapeutic agent for treatment of p53-related disease states.

[0072] Assay Systems

[0073] The invention provides assay systems and screening methods for identifying specific modulators of LRRCAPS activity. As used herein, an "assay system" encompasses all the components required for performing and analyzing results of an assay that detects and/or measures a particular event. In general, primary assays are used to identify or confirm a modulator's specific biochemical or molecular effect with respect to the LRRCAPS nucleic acid or protein. In general, secondary assays further assess the activity of a LRRCAPS modulating agent identified by a primary assay and may confirm that the modulating agent affects LRRCAPS in a manner relevant to the p53 pathway. In some cases, LRRCAPS modulators will be directly tested in a secondary assay.

[0074] In a preferred embodiment, the screening method comprises contacting a suitable assay system comprising an LRRCAPS polypeptide or nucleic acid with a candidate agent under conditions whereby, but for the presence of the agent, the system provides a reference activity (e.g. binding activity), which is based on the particular molecular event the screening method detects. A statistically significant difference between the agent-biased activity and the reference activity indicates that the candidate agent modulates LRRCAPS activity, and hence the p53 pathway. The LRRCAPS polypeptide or nucleic acid used in the assay may comprise any of the nucleic acids or polypeptides described above.
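The comparison between agent-biased activity and reference activity described above can be illustrated with a two-sided permutation test on replicate measurements. This is a sketch under stated assumptions: the text does not specify a statistical test, so the permutation test, the 0.05 threshold, the function name, and the replicate values are all illustrative choices.

```python
import random


def permutation_p_value(reference, treated, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference of group means.

    Returns the fraction of random relabelings whose absolute mean
    difference is at least as large as the observed one.
    """
    rng = random.Random(seed)
    n_ref = len(reference)
    n_trt = len(treated)
    observed = abs(sum(treated) / n_trt - sum(reference) / n_ref)
    pooled = list(reference) + list(treated)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(sum(pooled[n_ref:]) / n_trt - sum(pooled[:n_ref]) / n_ref)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_permutations + 1)


if __name__ == "__main__":
    # Hypothetical normalized binding activities from six replicate wells.
    reference_activity = [1.00, 0.98, 1.03, 1.01, 0.99, 1.02]
    agent_activity = [0.61, 0.58, 0.65, 0.60, 0.63, 0.59]
    p = permutation_p_value(reference_activity, agent_activity)
    print("candidate modulator" if p < 0.05 else "no significant effect")
```

A permutation test is used here because it makes no distributional assumption about the activity read-out; a t-test or another method could be substituted where its assumptions hold.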

[0075] Primary Assays

[0076] The type of modulator tested generally determines the type of primary assay.

[0077] Primary Assays for Small Molecule Modulators

[0078] For small molecule modulators, screening assays are used to identify candidate modulators. Screening assays may be cell-based or may use a cell-free system that recreates or retains the relevant biochemical reaction of the target protein (reviewed in Sittampalam G S et al., Curr Opin Chem Biol (1997) 1:384-91 and accompanying references). As used herein the term "cell-based" refers to assays using live cells, dead cells, or a particular cellular fraction, such as a membrane, endoplasmic reticulum, or mitochondrial fraction. The term "cell free" encompasses assays using substantially purified protein (either endogenous or recombinantly produced), partially purified or crude cellular extracts. Screening assays may detect a variety of molecular events, including protein-DNA interactions, protein-protein interactions (e.g., receptor-ligand binding), transcriptional activity (e.g., using a reporter gene), enzymatic activity (e.g., via a property of the substrate), activity of second messengers, immunogenicity and changes in cellular morphology or other cellular characteristics. Appropriate screening assays may use a wide range of detection methods including fluorescent, radioactive, colorimetric, spectrophotometric, and amperometric methods, to provide a read-out for the particular molecular event detected.

[0079] Cell-based screening assays usually require systems for recombinant expression of LRRCAPS and any auxiliary proteins demanded by the particular assay. Appropriate methods for generating recombinant proteins produce sufficient quantities of proteins that retain their relevant biological activities and are of sufficient purity to optimize activity and assure assay reproducibility. Yeast two-hybrid and variant screens, and mass spectrometry provide preferred methods for determining protein-protein interactions and elucidation of protein complexes. In certain applications, when LRRCAPS-interacting proteins are used in screens to identify small molecule modulators, the binding specificity of the interacting protein to the LRRCAPS protein may be assayed by various known methods such as substrate processing (e.g. ability of the candidate LRRCAPS-specific binding agents to function as negative effectors in LRRCAPS-expressing cells), binding equilibrium constants (usually at least about 10^7 M^-1, preferably at least about 10^8 M^-1, more preferably at least about 10^9 M^-1), and immunogenicity (e.g. ability to elicit LRRCAPS specific antibody in a heterologous host such as a mouse, rat, goat or rabbit). For enzymes and receptors, binding may be assayed by, respectively, substrate and ligand processing.
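The binding equilibrium constants above are association constants (units of M^-1), so the corresponding dissociation constant is the reciprocal: an association constant of 10^8 M^-1 is a dissociation constant of 10 nM. The sketch below makes that conversion and applies the three thresholds from the text. The function names and tier labels are hypothetical.

```python
def dissociation_constant_molar(ka_per_molar):
    """Convert an association constant (M^-1) to a dissociation constant (M)."""
    if ka_per_molar <= 0:
        raise ValueError("association constant must be positive")
    return 1.0 / ka_per_molar


def affinity_tier(ka_per_molar):
    """Classify an association constant against the thresholds in the text.

    Thresholds: at least about 1e7, 1e8, and 1e9 M^-1. Labels are illustrative.
    """
    if ka_per_molar >= 1e9:
        return "more preferred"
    if ka_per_molar >= 1e8:
        return "preferred"
    if ka_per_molar >= 1e7:
        return "usual minimum"
    return "below usual minimum"


if __name__ == "__main__":
    for ka in (5e6, 2e7, 5e8, 3e9):
        kd_nm = dissociation_constant_molar(ka) * 1e9  # molar to nanomolar
        print(f"Ka={ka:.1e} M^-1  Kd={kd_nm:.2f} nM  {affinity_tier(ka)}")
```

Higher association constants mean tighter binding, so the tiers run in the opposite direction from dissociation constants.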

[0080] The screening assay may measure a candidate agent's ability to specifically bind to or modulate activity of a LRRCAPS polypeptide, a fusion protein thereof, or to cells or membranes bearing the polypeptide or fusion protein. The LRRCAPS polypeptide can be full length or a fragment thereof that retains functional LRRCAPS activity. The LRRCAPS polypeptide may be fused to another polypeptide, such as a peptide tag for detection or anchoring, or to another tag. The LRRCAPS polypeptide is preferably human LRRCAPS, or is an ortholog or derivative thereof as described above. In a preferred embodiment, the screening assay detects candidate agent-based modulation of LRRCAPS interaction with a binding target, such as an endogenous or exogenous protein or other substrate that has LRRCAPS-specific binding activity, and can be used to assess normal LRRCAPS gene function.

[0081] Suitable assay formats that may be adapted to screen for LRRCAPS modulators are known in the art. Preferred screening assays are high throughput or ultra high throughput and thus provide automated, cost-effective means of screening compound libraries for lead compounds (Fernandes P B, Curr Opin Chem Biol (1998) 2:597-603; Sundberg S A, Curr Opin Biotechnol 2000, 11:47-53). In one preferred embodiment, screening assays use fluorescence technologies, including fluorescence polarization, time-resolved fluorescence, and fluorescence resonance energy transfer. These systems offer means to monitor protein-protein or DNA-protein interactions in which the intensity of the signal emitted from dye-labeled molecules depends upon their interactions with partner molecules (e.g., Selvin P R, Nat Struct Biol (2000) 7:730-4; Fernandes P B, supra; Hertzberg R P and Pope A J, Curr Opin Chem Biol (2000) 4:445-451).

[0082] A variety of suitable assay systems may be used to identify candidate LRRCAPS and p53 pathway modulators (e.g. U.S. Pat. Nos. 5,550,019 and 6,133,437 (apoptosis assays), U.S. Pat. No. 6,020,135 (p53 modulation), U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434 (angiogenesis assays), among others). Specific preferred assays are described in more detail below.

[0083] Apoptosis Assays

[0084] Assays for apoptosis may be performed by terminal deoxynucleotidyl transferase-mediated digoxigenin-11-dUTP nick end labeling (TUNEL) assay. The TUNEL assay is used to measure nuclear DNA fragmentation characteristic of apoptosis (Lazebnik et al., 1994, Nature 371, 346), by following the incorporation of fluorescein-dUTP (Yonehara et al., 1989, J. Exp. Med. 169, 1747). Apoptosis may further be assayed by acridine orange staining of tissue culture cells (Lucas, R., et al., 1998, Blood 15:4730-41). An apoptosis assay system may comprise a cell that expresses an LRRCAPS, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the apoptosis assay system, and changes in induction of apoptosis relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, an apoptosis assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using a cell-free assay system. An apoptosis assay may also be used to test whether LRRCAPS function plays a direct role in apoptosis. For example, an apoptosis assay may be performed on cells that over- or under-express LRRCAPS relative to wild type cells. Differences in apoptotic response compared to wild type cells suggest that the LRRCAPS plays a direct role in the apoptotic response. Apoptosis assays are described further in U.S. Pat. No. 6,133,437.

[0085] Cell Proliferation and Cell Cycle Assays

[0086] Cell proliferation may be assayed via bromodeoxyuridine (BRDU) incorporation. This assay identifies a cell population undergoing DNA synthesis by incorporation of BRDU into newly-synthesized DNA. Newly-synthesized DNA may then be detected using an anti-BRDU antibody (Hoshino et al., 1986, Int. J. Cancer 38, 369; Campana et al., 1988, J. Immunol. Meth. 107, 79), or by other means. Cell proliferation may also be examined using [.sup.3H]-thymidine incorporation (Chen, J., 1996, Oncogene 13:1395-403; Jeoung, J., 1995, J. Biol. Chem. 270:18367-73). This assay allows for quantitative characterization of S-phase DNA synthesis. In this assay, cells synthesizing DNA will incorporate [.sup.3H]-thymidine into newly synthesized DNA. Incorporation can then be measured by standard techniques such as by counting of radioisotope in a scintillation counter (e.g., Beckman LS 3800 Liquid Scintillation Counter). Another proliferation assay uses the dye Alamar Blue (available from Biosource International), which fluoresces when reduced in living cells and provides an indirect measurement of cell number (Voytik-Harbin S L et al., 1998, In Vitro Cell Dev Biol Anim 34:239-46).

[0087] Cell proliferation may also be assayed by colony formation in soft agar (Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). For example, cells transformed with LRRCAPS are seeded in soft agar plates, and colonies are measured and counted after two weeks incubation.

[0088] Involvement of a gene in the cell cycle may be assayed by flow cytometry (Gray J W et al. (1986) Int J Radiat Biol Relat Stud Phys Chem Med 49:237-55). Cells transfected with an LRRCAPS may be stained with propidium iodide and evaluated in a flow cytometer (available from Becton Dickinson), which indicates accumulation of cells in different stages of the cell cycle.

[0089] Accordingly, a cell proliferation or cell cycle assay system may comprise a cell that expresses an LRRCAPS, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the assay system, and changes in cell proliferation or cell cycle relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the cell proliferation or cell cycle assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system such as a cell-free assay system. A cell proliferation assay may also be used to test whether LRRCAPS function plays a direct role in cell proliferation or cell cycle. For example, a cell proliferation or cell cycle assay may be performed on cells that over- or under-express LRRCAPS relative to wild type cells. Differences in proliferation or cell cycle compared to wild type cells suggest that the LRRCAPS plays a direct role in cell proliferation or cell cycle.

[0090] Angiogenesis

[0091] Angiogenesis may be assayed using various human endothelial cell systems, such as umbilical vein, coronary artery, or dermal cells. Suitable assays include Alamar Blue based assays (available from Biosource International) to measure proliferation; migration assays using fluorescent molecules, such as the use of Becton Dickinson Falcon HTS FluoroBlok cell culture inserts to measure migration of cells through membranes in the presence or absence of angiogenesis enhancers or suppressors; and tubule formation assays based on the formation of tubular structures by endothelial cells on Matrigel.RTM. (Becton Dickinson). Accordingly, an angiogenesis assay system may comprise a cell that expresses an LRRCAPS, and that optionally has defective p53 function (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the angiogenesis assay system, and changes in angiogenesis relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the angiogenesis assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system. An angiogenesis assay may also be used to test whether LRRCAPS function plays a direct role in angiogenesis. For example, an angiogenesis assay may be performed on cells that over- or under-express LRRCAPS relative to wild type cells. Differences in angiogenesis compared to wild type cells suggest that the LRRCAPS plays a direct role in angiogenesis. Angiogenesis assays are described further in U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434, among others.

[0092] Hypoxic Induction

[0093] The alpha subunit of the transcription factor, hypoxia inducible factor-1 (HIF-1), is upregulated in tumor cells following exposure to hypoxia in vitro. Under hypoxic conditions, HIF-1 stimulates the expression of genes known to be important in tumour cell survival, such as those encoding glycolytic enzymes and VEGF. Induction of such genes by hypoxic conditions may be assayed by growing cells transfected with LRRCAPS in hypoxic conditions (such as with 0.1% O2, 5% CO2, and balance N2, generated in a Napco 7001 incubator (Precision Scientific)) and normoxic conditions, followed by assessment of gene activity or expression by Taqman.RTM.. For example, a hypoxic induction assay system may comprise a cell that expresses an LRRCAPS, and that optionally has a mutated p53 (e.g. p53 is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the hypoxic induction assay system, and changes in hypoxic response relative to controls where no test agent is added identify candidate p53 modulating agents. In some embodiments of the invention, the hypoxic induction assay may be used as a secondary assay to test a candidate p53 modulating agent that is initially identified using another assay system. A hypoxic induction assay may also be used to test whether LRRCAPS function plays a direct role in the hypoxic response. For example, a hypoxic induction assay may be performed on cells that over- or under-express LRRCAPS relative to wild type cells. Differences in hypoxic response compared to wild type cells suggest that the LRRCAPS plays a direct role in hypoxic induction.

[0094] Cell Adhesion

[0095] Cell adhesion assays measure adhesion of cells to purified adhesion proteins, or adhesion of cells to each other, in the presence or absence of candidate modulating agents. Cell-protein adhesion assays measure the ability of agents to modulate the adhesion of cells to purified proteins. For example, recombinant proteins are produced, diluted to 2.5 .mu.g/mL in PBS, and used to coat the wells of a microtiter plate. The wells used for negative control are not coated. Coated wells are then washed, blocked with 1% BSA, and washed again. Compounds are diluted to 2.times. final test concentration and added to the blocked, coated wells. Cells are then added to the wells, and the unbound cells are washed off. Retained cells are labeled directly on the plate by adding a membrane-permeable fluorescent dye, such as calcein-AM, and the signal is quantified in a fluorescent microplate reader.
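
As an illustration of how the plate-reader signal from such a cell-protein adhesion assay might be reduced to a single number, the sketch below subtracts the signal of the uncoated negative-control wells and expresses adhesion in compound-treated wells as a percentage of wells without compound. The function name, the use of replicate means, and the example readings are assumptions made for illustration and are not specified in the text.

```python
from statistics import mean


def percent_adhesion(treated, no_compound, uncoated):
    """Adhesion in treated wells as a percentage of wells without compound.

    All arguments are lists of fluorescence readings (arbitrary units).
    The mean of the uncoated wells is used as background (assumption).
    """
    background = mean(uncoated)
    reference = mean(no_compound) - background
    if reference <= 0:
        raise ValueError("reference signal does not exceed background")
    return 100.0 * (mean(treated) - background) / reference


# Hypothetical readings, for illustration only.
print(percent_adhesion([600, 640], [1100, 1140], [100, 140]))
```

A value well below 100 would suggest that the compound reduces adhesion to the coated protein, and a value above 100 would suggest enhancement.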

[0096] Cell-cell adhesion assays measure the ability of agents to modulate binding of cell adhesion proteins with their native ligands. These assays use cells that naturally or recombinantly express the adhesion protein of choice. In an exemplary assay, cells expressing the cell adhesion protein are plated in wells of a multiwell plate. Cells expressing the ligand are labeled with a membrane-permeable fluorescent dye, such as BCECF, and allowed to adhere to the monolayers in the presence of candidate agents. Unbound cells are washed off, and bound cells are detected using a fluorescence plate reader.

[0097] High-throughput cell adhesion assays have also been described. In one such assay, small molecule ligands and peptides are bound to the surface of microscope slides using a microarray spotter, intact cells are then contacted with the slides, and unbound cells are washed off. In this assay, not only the binding specificity of the peptides and modulators against cell lines are determined, but also the functional cell signaling of attached cells using immunofluorescence techniques in situ on the microchip is measured (Falsey J R et al., Bioconjug Chem. 2001 May-Jun, 12(3):346-53).

[0098] Cell Migration

[0099] An invasion/migration assay (also called a migration assay) tests the ability of cells to overcome a physical barrier and to migrate towards pro-angiogenic signals. Migration assays are known in the art (e.g., Paik J H et al., 2001, J Biol Chem 276:11830-11837). In a typical experimental set-up, cultured endothelial cells are seeded onto a matrix-coated porous lamina, with pore sizes generally smaller than typical cell size. The matrix generally simulates the environment of the extracellular matrix, as described above. The lamina is typically a membrane, such as the transwell polycarbonate membrane (Corning Costar Corporation, Cambridge, Mass.), and is generally part of an upper chamber that is in fluid contact with a lower chamber containing pro-angiogenic stimuli. Migration is generally assayed after an overnight incubation with stimuli, but longer or shorter time frames may also be used. Migration is assessed as the number of cells that crossed the lamina, and may be detected by staining cells with hematoxylin solution (VWR Scientific, South San Francisco, Calif.), or by any other method for determining cell number. In another exemplary set up, cells are fluorescently labeled and migration is detected using fluorescent readings, for instance using the Falcon HTS FluoroBlok (Becton Dickinson). While some migration is observed in the absence of stimulus, migration is greatly increased in response to pro-angiogenic factors. As described above, a preferred assay system for migration/invasion assays comprises testing an LRRCAPS's response to a variety of pro-angiogenic factors, including tumor angiogenic and inflammatory angiogenic agents, and culturing the cells in serum free medium.

[0100] Sprouting Assay

[0101] A sprouting assay is a three-dimensional in vitro angiogenesis assay that uses a cell-number defined spheroid aggregation of endothelial cells ("spheroid"), embedded in a collagen gel-based matrix. The spheroid can serve as a starting point for the sprouting of capillary-like structures by invasion into the extracellular matrix (termed "cell sprouting") and the subsequent formation of complex anastomosing networks (Korff and Augustin, 1999, J Cell Sci 112:3249-58). In an exemplary experimental set-up, spheroids are prepared by pipetting 400 human umbilical vein endothelial cells into individual wells of a nonadhesive 96-well plate to allow overnight spheroidal aggregation (Korff and Augustin: J Cell Biol 143: 1341-52, 1998). Spheroids are harvested and seeded in 900 .mu.l of methocel-collagen solution and pipetted into individual wells of a 24 well plate to allow collagen gel polymerization. Test agents are added after 30 min by pipetting 100 .mu.l of 10-fold concentrated working dilution of the test substances on top of the gel. Plates are incubated at 37.degree. C. for 24 h. Dishes are fixed at the end of the experimental incubation period by addition of paraformaldehyde. Sprouting intensity of endothelial cells can be quantitated by an automated image analysis system to determine the cumulative sprout length per spheroid.
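
The volumes in this set-up imply a simple dilution: 100 .mu.l of a 10-fold concentrated working dilution layered on 900 .mu.l of gel gives a nominal 1-fold final concentration. The sketch below works through that arithmetic and the cumulative sprout length read-out. It assumes that the test agent equilibrates through the full gel volume, and the function names and sprout lengths are hypothetical.

```python
def final_fold_concentration(stock_fold: float, added_ul: float, gel_ul: float) -> float:
    """Nominal fold concentration once the added volume has mixed with the gel.

    Assumes complete equilibration through the gel (illustrative assumption).
    """
    return stock_fold * added_ul / (added_ul + gel_ul)


def cumulative_sprout_length(sprout_lengths_um):
    """Sum of all sprout lengths measured for one spheroid."""
    return sum(sprout_lengths_um)


# Volumes from the text: 100 ul of 10-fold stock on top of 900 ul of gel.
print(final_fold_concentration(10.0, 100.0, 900.0))

# Hypothetical sprout lengths (micrometers) for a single spheroid.
print(cumulative_sprout_length([50.0, 75.5, 20.0]))
```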

[0102] Primary Assays for Antibody Modulators

[0103] For antibody modulators, an appropriate primary assay is a binding assay that tests the antibody's affinity to and specificity for the LRRCAPS protein. Methods for testing antibody affinity and specificity are well known in the art (Harlow and Lane, 1988, 1999, supra). The enzyme-linked immunosorbent assay (ELISA) is a preferred method for detecting LRRCAPS-specific antibodies; others include FACS assays, radioimmunoassays, and fluorescent assays.

[0104] In some cases, screening assays described for small molecule modulators may also be used to test antibody modulators.

[0105] Primary Assays for Nucleic Acid Modulators

[0106] For nucleic acid modulators, primary assays may test the ability of the nucleic acid modulator to inhibit or enhance LRRCAPS gene expression, preferably mRNA expression. In general, expression analysis comprises comparing LRRCAPS expression in like populations of cells (e.g., two pools of cells that endogenously or recombinantly express LRRCAPS) in the presence and absence of the nucleic acid modulator. Methods for analyzing mRNA and protein expression are well known in the art. For instance, Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR (e.g., using the TaqMan.RTM., PE Applied Biosystems), or microarray analysis may be used to confirm that LRRCAPS mRNA expression is reduced in cells treated with the nucleic acid modulator (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm D H and Guiseppi-Elie, A Curr Opin Biotechnol 2001, 12:41-47). Protein expression may also be monitored. Proteins are most commonly detected with specific antibodies or antisera directed against either the LRRCAPS protein or specific peptides. A variety of means including Western blotting, ELISA, or in situ detection, are available (Harlow E and Lane D, 1988 and 1999, supra).
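
One common way to express the outcome of a quantitative RT-PCR comparison between treated and untreated cell populations is the comparative threshold-cycle calculation sketched below. The text does not prescribe this calculation; the use of a reference gene, the assumption of roughly 100% amplification efficiency, and all names and Ct values are illustrative assumptions.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change in target mRNA (treated vs. control) by the 2^-ddCt approach.

    Each target Ct is normalized to a reference gene measured in the same
    sample; a doubling of product per cycle is assumed.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** (-(delta_treated - delta_control))


# Hypothetical Ct values: the target crosses threshold two cycles later in
# treated cells while the reference gene is unchanged.
print(relative_expression(26.0, 18.0, 24.0, 18.0))
```

A result below 1 would be consistent with reduced LRRCAPS mRNA in cells treated with the nucleic acid modulator, and a result above 1 with enhanced expression.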

[0107] In some cases, screening assays described for small molecule modulators, particularly in assay systems that involve LRRCAPS mRNA expression, may also be used to test nucleic acid modulators.

[0108] Secondary Assays

[0109] Secondary assays may be used to further assess the activity of an LRRCAPS-modulating agent identified by any of the above methods to confirm that the modulating agent affects LRRCAPS in a manner relevant to the p53 pathway. As used herein, LRRCAPS-modulating agents encompass candidate clinical compounds or other agents derived from previously identified modulating agents. Secondary assays can also be used to test the activity of a modulating agent on a particular genetic or biochemical pathway or to test the specificity of the modulating agent's interaction with LRRCAPS.

[0110] Secondary assays generally compare like populations of cells or animals (e.g., two pools of cells or animals that endogenously or recombinantly express LRRCAPS) in the presence and absence of the candidate modulator. In general, such assays test whether treatment of cells or animals with a candidate LRRCAPS-modulating agent results in changes in the p53 pathway in comparison to untreated (or mock- or placebo-treated) cells or animals. Certain assays use "sensitized genetic backgrounds", which, as used herein, describe cells or animals engineered for altered expression of genes in the p53 or interacting pathways.

[0111] Cell-Based Assays

[0112] Cell based assays may use a variety of mammalian cell lines known to have defective p53 function (e.g. SAOS-2 osteoblasts, H1299 lung cancer cells, C33A and HT3 cervical cancer cells, HT-29 and DLD-1 colon cancer cells, among others, available from American Type Culture Collection (ATCC), Manassas, Va.). Cell based assays may detect endogenous p53 pathway activity or may rely on recombinant expression of p53 pathway components. Any of the aforementioned assays may be used in this cell-based format. Candidate modulators are typically added to the cell media but may also be injected into cells or delivered by any other efficacious means.

[0113] Animal Assays

[0114] A variety of non-human animal models of normal or defective p53 pathway may be used to test candidate LRRCAPS modulators. Models for defective p53 pathway typically use genetically modified animals that have been engineered to mis-express (e.g., over-express or lack expression in) genes involved in the p53 pathway. Assays generally require systemic delivery of the candidate modulators, such as by oral administration, injection, etc.

[0115] In a preferred embodiment, p53 pathway activity is assessed by monitoring neovascularization and angiogenesis. Animal models with defective and normal p53 are used to test the candidate modulator's effect on LRRCAPS in Matrigel.RTM. assays. Matrigel.RTM. is an extract of basement membrane proteins, and is composed primarily of laminin, collagen IV, and heparan sulfate proteoglycan. It is provided as a sterile liquid at 4.degree. C., but rapidly forms a solid gel at 37.degree. C. Liquid Matrigel.RTM. is mixed with various angiogenic agents, such as bFGF and VEGF, or with human tumor cells which over-express the LRRCAPS. The mixture is then injected subcutaneously (SC) into female athymic nude mice (Taconic, Germantown, N.Y.) to support an intense vascular response. Mice with Matrigel.RTM. pellets may be dosed via oral (PO), intraperitoneal (IP), or intravenous (IV) routes with the candidate modulator. Mice are euthanized 5-12 days post-injection, and the Matrigel.RTM. pellet is harvested for hemoglobin analysis (Sigma plasma hemoglobin kit). Hemoglobin content of the gel is found to correlate with the degree of neovascularization in the gel.

[0116] In another preferred embodiment, the effect of the candidate modulator on LRRCAPS is assessed via tumorigenicity assays. In one example, a xenograft comprising human cells from a pre-existing tumor or a tumor cell line is used. Tumor xenograft assays are known in the art (see, e.g., Ogawa K et al., 2000, Oncogene 19:6043-6052). Xenografts are typically implanted SC into female athymic mice, 6-7 weeks old, as single cell suspensions either from a pre-existing tumor or from in vitro culture. The tumors which express the LRRCAPS endogenously are injected in the flank, 1.times.10.sup.5 to 1.times.10.sup.7 cells per mouse in a volume of 100 .mu.L using a 27-gauge needle. Mice are then ear tagged and tumors are measured twice weekly. Candidate modulator treatment is initiated on the day the mean tumor weight reaches 100 mg. Candidate modulator is delivered IV, SC, IP, or PO by bolus administration. Depending upon the pharmacokinetics of each unique candidate modulator, dosing can be performed multiple times per day. The tumor weight is assessed by measuring perpendicular diameters with a caliper and calculated by multiplying the measurements of diameters in two dimensions. At the end of the experiment, the excised tumors may be utilized for biomarker identification or further analyses. For immunohistochemistry staining, xenograft tumors are fixed in 4% paraformaldehyde, 0.1M phosphate, pH 7.2, for 6 hours at 4.degree. C., immersed in 30% sucrose in PBS, and rapidly frozen in isopentane cooled with liquid nitrogen.
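
The caliper-based tumor weight estimate and the 100 mg treatment threshold can be sketched as follows. The text states only that diameters in two dimensions are multiplied; the conversion to milligrams used here, (length x width squared) / 2 at an assumed density of 1 mg per cubic millimeter, is a widely used convention adopted as an assumption, and the function names and measurements are hypothetical.

```python
from statistics import mean


def estimated_weight_mg(diameter_a_mm: float, diameter_b_mm: float) -> float:
    """Estimate tumor weight from two perpendicular caliper diameters.

    Assumption: weight (mg) ~ (long diameter * short diameter^2) / 2,
    taking tumor density as 1 mg per cubic millimeter.
    """
    long_d = max(diameter_a_mm, diameter_b_mm)
    short_d = min(diameter_a_mm, diameter_b_mm)
    return long_d * short_d ** 2 / 2.0


def first_treatment_day(measurements, threshold_mg: float = 100.0):
    """Return the first measurement day on which the mean weight reaches the threshold.

    `measurements` maps a study day to a list of (diameter, diameter) pairs,
    one pair per mouse. Returns None if the threshold is never reached.
    """
    for day in sorted(measurements):
        weights = [estimated_weight_mg(a, b) for a, b in measurements[day]]
        if mean(weights) >= threshold_mg:
            return day
    return None


# Hypothetical twice-weekly measurements for two mice.
study = {3: [(5.0, 4.0), (6.0, 4.0)], 7: [(7.0, 5.0), (8.0, 6.0)]}
print(first_treatment_day(study))
```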

[0117] In another preferred embodiment, tumorigenicity is monitored using a hollow fiber assay, which is described in U.S. Pat. No. 5,698,413. Briefly, the method comprises implanting into a laboratory animal a biocompatible, semi-permeable encapsulation device containing target cells, treating the laboratory animal with a candidate modulating agent, and evaluating the target cells for reaction to the candidate modulator. Implanted cells are generally human cells from a pre-existing tumor or a tumor cell line. After an appropriate period of time, generally around six days, the implanted samples are harvested for evaluation of the candidate modulator. Tumorigenicity and modulator efficacy may be evaluated by assaying the quantity of viable cells present in the macrocapsule, which can be determined by tests known in the art, for example, MTT dye conversion assay, neutral red dye uptake, trypan blue staining, viable cell counts, the number of colonies formed in soft agar, the capacity of the cells to recover and replicate in vitro, etc.

[0118] In another preferred embodiment, a tumorigenicity assay uses a transgenic animal, usually a mouse, carrying a dominant oncogene or tumor suppressor gene knockout under the control of tissue specific regulatory sequences; these assays are generally referred to as transgenic tumor assays. In a preferred application, tumor development in the transgenic model is well characterized or is controlled. In an exemplary model, the "RIP1-Tag2" transgene, comprising the SV40 large T-antigen oncogene under control of the insulin gene regulatory regions, is expressed in pancreatic beta cells and results in islet cell carcinomas (Hanahan D, 1985, Nature 315:115-122; Parangi S et al, 1996, Proc Natl Acad Sci USA 93: 2002-2007; Bergers G et al, 1999, Science 284:808-812). An "angiogenic switch" occurs at approximately five weeks, as normally quiescent capillaries in a subset of hyperproliferative islets become angiogenic. The RIP1-Tag2 mice die by age 14 weeks. Candidate modulators may be administered at a variety of stages, including just prior to the angiogenic switch (e.g., for a model of tumor prevention), during the growth of small tumors (e.g., for a model of intervention), or during the growth of large and/or invasive tumors (e.g., for a model of regression). Tumorigenicity and modulator efficacy can be evaluated by assessing life-span extension and/or tumor characteristics, including number of tumors, tumor size, tumor morphology, vessel density, apoptotic index, etc.

[0119] Diagnostic and Therapeutic Uses

[0120] Specific LRRCAPS-modulating agents are useful in a variety of diagnostic and therapeutic applications where disease or disease prognosis is related to defects in the p53 pathway, such as angiogenic, apoptotic, or cell proliferation disorders. Accordingly, the invention also provides methods for modulating the p53 pathway in a cell, preferably a cell pre-determined to have defective or impaired p53 function (e.g. due to overexpression, underexpression, or misexpression of p53, or due to gene mutations), comprising the step of administering an agent to the cell that specifically modulates LRRCAPS activity. Preferably, the modulating agent produces a detectable phenotypic change in the cell indicating that the p53 function is restored. The phrase "function is restored", and equivalents, as used herein, means that the desired phenotype is achieved, or is brought closer to normal compared to untreated cells. For example, with restored p53 function, cell proliferation and/or progression through cell cycle may normalize, or be brought closer to normal relative to untreated cells. The invention also provides methods for treating disorders or disease associated with impaired p53 function by administering a therapeutically effective amount of an LRRCAPS-modulating agent that modulates the p53 pathway. The invention further provides methods for modulating LRRCAPS function in a cell, preferably a cell pre-determined to have defective or impaired LRRCAPS function, by administering an LRRCAPS-modulating agent. Additionally, the invention provides a method for treating disorders or disease associated with impaired LRRCAPS function by administering a therapeutically effective amount of an LRRCAPS-modulating agent.

[0121] The discovery that LRRCAPS is implicated in the p53 pathway provides for a variety of methods that can be employed for the diagnostic and prognostic evaluation of diseases and disorders involving defects in the p53 pathway and for the identification of subjects having a predisposition to such diseases and disorders.

[0122] Various expression analysis methods can be used to diagnose whether LRRCAPS expression occurs in a particular sample, including Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR, and microarray analysis (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm and Guiseppi-Elie, Curr Opin Biotechnol 2001, 12:41-47). Tissues having a disease or disorder implicating defective p53 signaling that express an LRRCAPS are identified as amenable to treatment with an LRRCAPS modulating agent. In a preferred application, the p53 defective tissue overexpresses an LRRCAPS relative to normal tissue. For example, a Northern blot analysis of mRNA from tumor and normal cell lines, or from tumor and matching normal tissue samples from the same patient, using full or partial LRRCAPS cDNA sequences as probes, can determine whether particular tumors express or overexpress LRRCAPS. Alternatively, the TaqMan.RTM. is used for quantitative RT-PCR analysis of LRRCAPS expression in cell lines, normal tissues and tumor samples (PE Applied Biosystems).

[0123] Various other diagnostic methods may be performed, for example, utilizing reagents such as the LRRCAPS oligonucleotides, and antibodies directed against an LRRCAPS, as described above for: (1) the detection of the presence of LRRCAPS gene mutations, or the detection of either over- or under-expression of LRRCAPS mRNA relative to the non-disorder state; (2) the detection of either an over- or an under-abundance of LRRCAPS gene product relative to the non-disorder state; and (3) the detection of perturbations or abnormalities in the signal transduction pathway mediated by LRRCAPS.

[0124] Thus, in a specific embodiment, the invention is drawn to a method for diagnosing a disease or disorder in a patient that is associated with alterations in LRRCAPS expression, the method comprising: a) obtaining a biological sample from the patient; b) contacting the sample with a probe for LRRCAPS expression; c) comparing results from step (b) with a control; and d) determining whether step (c) indicates a likelihood of the disease or disorder. Preferably, the disease is cancer, most preferably a cancer as shown in TABLE 2. The probe may be either DNA or protein, including an antibody.

EXAMPLES

[0125] The following experimental section and examples are offered by way of illustration and not by way of limitation.

[0126] I. Drosophila p53 Screen

[0127] The Drosophila p53 gene was overexpressed specifically in the wing using the vestigial margin quadrant enhancer. Increasing quantities of Drosophila p53 (titrated using different strength transgenic inserts in 1 or 2 copies) caused deterioration of normal wing morphology from mild to strong, with phenotypes including disruption of pattern and polarity of wing hairs, shortening and thickening of wing veins, progressive crumpling of the wing and appearance of dark "death" inclusions in wing blade. In a screen designed to identify enhancers and suppressors of Drosophila p53, homozygous females carrying two copies of p53 were crossed to 5663 males carrying random insertions of a piggyBac transposon (Fraser M et al., Virology (1985) 145:356-361).

[0128] Progeny containing insertions were compared to non-insertion-bearing sibling progeny for enhancement or suppression of the p53 phenotypes. Sequence information surrounding the piggyBac insertion site was used to identify the modifier genes. Modifiers of the wing phenotype were identified as members of the p53 pathway. CAPS was an enhancer of the wing phenotype. Human orthologs of the modifiers are referred to herein as LRRCAPS.

[0129] BLAST analysis (Altschul et al., supra) was employed to identify Targets from Drosophila modifiers. Various domains, signals, and functional subunits in proteins were analyzed using the PSORT (Nakai K., and Horton P., Trends Biochem Sci, 1999, 24:34-6; Kenta Nakai, Protein sorting signals and prediction of subcellular localization, Adv. Protein Chem. 54, 277-344 (2000)), PFAM (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2), SMART (Ponting C P, et al., SMART: identification and annotation of domains from signaling and extracellular protein sequences. Nucleic Acids Res. Jan. 1, 1999; 27(1):229-32), TM-HMM (Erik L. L. Sonnhammer, Gunnar von Heijne, and Anders Krogh: A hidden Markov model for predicting transmembrane helices in protein sequences. In Proc. of Sixth Int. Conf. on Intelligent Systems for Molecular Biology, p 175-182 Ed J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen Menlo Park, Calif.: AAAI Press, 1998), and dust (Remm M, and Sonnhammer E. Classification of transmembrane protein families in the Caenorhabditis elegans genome and identification of human orthologs. Genome Res. 2000 Nov; 10(11):1679-89) programs. For example, PFAM was employed to determine approximate amino acid locations for the LRR and IG domains of GI#s 11877257, 16157511, 14758126, 5453656, 14764198, and 5729718 (SEQ ID NOs: 19, 20, 21, 22, 23, and 24, respectively), as shown in Table 1.

TABLE 1 Approximate amino acid locations for various domains of LRRCAPS polypeptides:

LRRCAPS GI #	LRRCAPS SEQ ID NO:	LRR domain (PFAM01462, PFAM00560, PFAM01463)	IG domain (PFAM00047)
11877257	19	28 to 59, 339 to 372, 63 to 86, 87 to 110, 111 to 134, 135 to 158, 159 to 181, 182 to 203, 376 to 399, 400 to 423, 424 to 447, 448 to 471, 472 to 494, 495 to 519, 216 to 264, 529 to 579	
16157511	20	63 to 86, 87 to 110, 111 to 134, 135 to 158, 159 to 182, 119 to 244	260 to 319, 356 to 414, 447 to 504, 539 to 596
14758126	21	22 to 45, 46 to 69, 70 to 93, 94 to 117, 118 to 140, 141 to 162, 321 to 341, 345 to 368, 369 to 392, 393 to 416, 417 to 440, 175 to 225, 474 to 524	
5453656	22	70 to 93, 94 to 117, 118 to 141, 142 to 165, 166 to 189, 190 to 213, 214 to 237, 238 to 261, 262 to 285, 286 to 310, 311 to 335, 336 to 359, 360 to 383, 369 to 421	438 to 499
14764198	23	3 to 26, 27 to 50, 51 to 74, 76 to 99, 100 to 123, 124 to 144, 165 to 210	227 to 285
5729718	24	61 to 90, 92 to 115, 119 to 142, 143 to 166, 211 to 234, 235 to 258, 259 to 282, 294 to 345	

[0130] II. High-Throughput In Vitro Fluorescence Polarization Assay

[0131] Fluorescently-labeled LRRCAPS peptide/substrate is added to each well of a 96-well microtiter plate, along with a test agent in a test buffer (10 mM HEPES, 10 mM NaCl, 6 mM magnesium chloride, pH 7.6). Changes in fluorescence polarization relative to control values, determined using a Fluorolite FPM-2 Fluorescence Polarization Microtiter System (Dynatech Laboratories, Inc), indicate that the test compound is a candidate modifier of LRRCAPS activity.
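By way of illustration only, the comparison of polarization values against control described above may be sketched as follows. The sketch assumes that parallel and perpendicular emission intensities are available for each well and uses the standard definition of polarization, P = (I_par - G x I_perp) / (I_par + G x I_perp). The function names, the G-factor default, and the 20 mP cutoff are hypothetical and are not taken from the disclosure.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Return fluorescence polarization in millipolarization units (mP)."""
    denom = i_parallel + g_factor * i_perpendicular
    if denom <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - g_factor * i_perpendicular) / denom


def flag_candidates(wells, control_mp, threshold_mp=20.0):
    """Return IDs of wells whose polarization differs from control.

    wells maps a well ID to (parallel, perpendicular) intensities.
    threshold_mp is an assumed cutoff; the source does not specify one.
    """
    hits = []
    for well_id, (i_par, i_perp) in wells.items():
        if abs(polarization_mp(i_par, i_perp) - control_mp) > threshold_mp:
            hits.append(well_id)
    return hits
```

In practice the cutoff would be chosen from the variability of the control wells on the same plate.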

[0132] III. High-Throughput In Vitro Binding Assay

[0133] ³³P-labeled LRRCAPS peptide is added in an assay buffer (100 mM KCl, 20 mM HEPES pH 7.6, 1 mM MgCl₂, 1% glycerol, 0.5% NP-40, 50 mM beta-mercaptoethanol, 1 mg/ml BSA, cocktail of protease inhibitors) along with a test agent to the wells of a Neutralite-avidin coated assay plate and incubated at 25° C. for 1 hour. Biotinylated substrate is then added to each well and incubated for 1 hour. Reactions are stopped by washing with PBS, and the wells are counted in a scintillation counter. Test agents that cause a difference in activity relative to a control without test agent are identified as candidate p53 modulating agents.
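As a non-limiting sketch, the comparison of scintillation counts from test wells against control wells without test agent may be expressed as a percentage of control. The function names, the background subtraction, and the 30% tolerance are assumptions made for illustration and are not specified in the disclosure.

```python
from statistics import mean


def percent_of_control(test_cpm, control_cpms, background_cpm=0.0):
    """Express a test well's counts as a percentage of the mean control counts."""
    control = mean(control_cpms) - background_cpm
    if control <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (test_cpm - background_cpm) / control


def is_candidate(test_cpm, control_cpms, background_cpm=0.0, tolerance_pct=30.0):
    """Flag a test agent whose signal deviates from control in either direction.

    tolerance_pct is a hypothetical cutoff, not a value from the source.
    """
    pct = percent_of_control(test_cpm, control_cpms, background_cpm)
    return abs(pct - 100.0) > tolerance_pct
```

The test is two-sided because the paragraph above refers to any difference in activity, so both increases and decreases in bound counts are flagged.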

[0134] IV. Immunoprecipitations and Immunoblotting

[0135] For coprecipitation of transfected proteins, 3×10⁶ appropriate recombinant cells containing the LRRCAPS proteins are plated on 10-cm dishes and transfected on the following day with expression constructs. The total amount of DNA is kept constant in each transfection by adding empty vector. After 24 h, cells are collected, washed once with phosphate-buffered saline and lysed for 20 min on ice in 1 ml of lysis buffer containing 50 mM Hepes, pH 7.9, 250 mM NaCl, 20 mM β-glycerophosphate, 1 mM sodium orthovanadate, 5 mM p-nitrophenyl phosphate, 2 mM dithiothreitol, protease inhibitors (complete, Roche Molecular Biochemicals), and 1% Nonidet P-40. Cellular debris is removed by centrifugation twice at 15,000×g for 15 min. The cell lysate is incubated with 25 µl of M2 beads (Sigma) for 2 h at 4° C. with gentle rocking.

[0136] After extensive washing with lysis buffer, proteins bound to the beads are solubilized by boiling in SDS sample buffer, fractionated by SDS-polyacrylamide gel electrophoresis, transferred to polyvinylidene difluoride membrane and blotted with the indicated antibodies. The reactive bands are visualized with horseradish peroxidase coupled to the appropriate secondary antibodies and the enhanced chemiluminescence (ECL) Western blotting detection system (Amersham Pharmacia Biotech).

[0137] V. Expression Analysis

[0138] All cell lines used in the following experiments are NCI (National Cancer Institute) lines, and are available from ATCC (American Type Culture Collection, Manassas, Va. 20110-2209). Normal and tumor tissues were obtained from Impath, UC Davis, Clontech, Stratagene, and Ambion.

[0139] TaqMan analysis was used to assess expression levels of the disclosed genes in various samples.

[0140] RNA was extracted from each tissue sample using Qiagen (Valencia, Calif.) RNeasy kits, following manufacturer's protocols, to a final concentration of 50 ng/µl. Single stranded cDNA was then synthesized by reverse transcribing the RNA samples using random hexamers and 500 ng of total RNA per reaction, following protocol 4304965 of Applied Biosystems (Foster City, Calif.).

[0141] Primers for expression analysis using TaqMan assay (Applied Biosystems, Foster City, Calif.) were prepared according to the TaqMan protocols, and the following criteria: a) primer pairs were designed to span introns to eliminate genomic contamination, and b) each primer pair produced only one product.

[0142] TaqMan reactions were carried out following manufacturer's protocols, in 25 µl total volume for 96-well plates and 10 µl total volume for 384-well plates, using 300 nM primer and 250 nM probe, and approximately 25 ng of cDNA. The standard curve for result analysis was prepared using a universal pool of human cDNA samples, which is a mixture of cDNAs from a wide variety of tissues, making it likely that a target will be present in appreciable amounts. The raw data were normalized using 18S rRNA (universally expressed in all tissues and cells).
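For illustration, one common way to apply a standard curve and an 18S rRNA normalization is sketched below. The sketch assumes that threshold cycle (Ct) values are available and that Ct is linear in log10 of the input quantity over a dilution series of the universal cDNA pool. The function names and the dilution values in the usage note are hypothetical, and the source does not state that this exact calculation was used.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to get a relative quantity from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct, target_curve, rrna_ct, rrna_curve):
    """Relative target quantity divided by relative 18S rRNA quantity."""
    target_qty = quantity_from_ct(target_ct, *target_curve)
    rrna_qty = quantity_from_ct(rrna_ct, *rrna_curve)
    return target_qty / rrna_qty
```

For example, a curve fitted to hypothetical pool dilutions of 100, 10 and 1 relative units with Ct values of 24, 27 and 30 has a slope of -3 and an intercept of 30, so a sample Ct of 27 maps back to 10 relative units before normalization.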

[0143] For each expression analysis, tumor tissue samples were compared with matched normal tissues from the same patient. A gene was considered overexpressed in a tumor when the level of expression of the gene was 2 fold or higher in the tumor compared with its matched normal sample. In cases where normal tissue was not available, a universal pool of cDNA samples was used instead. In these cases, a gene was considered overexpressed in a tumor sample when the difference of expression levels between a tumor sample and the average of all normal samples from the same tissue type was greater than 2 times the standard deviation of all normal samples (i.e., Tumor - average(all normal samples) > 2 × STDEV(all normal samples)).
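The two overexpression criteria above may be sketched as follows. The function names are hypothetical, expression values are taken to be normalized relative quantities, and STDEV is assumed to mean the sample standard deviation, which the source does not specify.

```python
from statistics import mean, stdev


def overexpressed_vs_matched(tumor, matched_normal, fold=2.0):
    """Criterion 1: tumor expression is 2 fold or higher than matched normal."""
    if matched_normal <= 0:
        raise ValueError("matched normal expression must be positive")
    return tumor / matched_normal >= fold


def overexpressed_vs_normals(tumor, normals, n_sd=2.0):
    """Criterion 2: Tumor - average(normals) > 2 x STDEV(normals).

    Uses the sample standard deviation (an assumption) and therefore
    needs at least two normal samples.
    """
    return tumor - mean(normals) > n_sd * stdev(normals)
```

For normal samples with values 1.0, 2.0 and 3.0 (mean 2.0, sample standard deviation 1.0), a tumor value of 4.5 meets the second criterion, while a value of exactly 4.0 does not because the inequality is strict.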

[0144] Results are shown in Table 2. Number of pairs of tumor samples and matched normal tissue from the same patient are shown for each tumor type. Percentage of the samples with at least two-fold overexpression for each tumor type is provided. "ND" means not done. A modulator identified by an assay described herein can be further validated for therapeutic effect by administration to a tumor in which the gene is overexpressed. A decrease in tumor growth confirms therapeutic utility of the modulator. Prior to treating a patient with the modulator, the likelihood that the patient will respond to treatment can be diagnosed by obtaining a tumor sample from the patient, and assaying for expression of the gene targeted by the modulator. The expression data for the gene(s) can also be used as a diagnostic marker for disease progression. The assay can be performed by expression analysis as described above, by antibody directed to the gene target, or by any other available detection method.

TABLE 2

SEQ ID NO	Breast	# of Pairs	Colon	# of Pairs	Head and Neck	# of Pairs	Kidney	# of Pairs	Lung	# of Pairs
1	5.3%	19	12.1%	33	12.5%	8	26.1%	23	10.0%	20
3	21.2%	33	30.3%	33	ND	0	70.0%	20	25.0%	32
4	36.8%	19	9.1%	33	37.5%	8	45.8%	24	45.0%	20
10	42.1%	19	9.1%	33	12.5%	8	21.7%	23	28.6%	21
14	31.6%	19	21.2%	33	50.0%	8	36.4%	22	30.0%	20
16	33.3%	18	32.3%	31	100.0%	8	26.1%	23	57.9%	19
15	50.0%	12	33.3%	30	ND	0	ND	0	71.4%	14

SEQ ID NO	Ovary	# of Pairs	Uterus	# of Pairs	Prostate	# of Pairs	Skin	# of Pairs
1	18.2%	11	10.5%	19	33.3%	12	66.7%	3
3	30.0%	6	15.8%	19	0.0%	12	ND	0
4	27.3%	11	26.3%	19	33.3%	12	0.0%	3
10	66.7%	12	35.3%	17	0.0%	12	0.0%	3
14	27.3%	11	10.5%	19	16.7%	12	33.3%	3
16	36.4%	11	15.8%	19	8.3%	12	0.0%	3
15	42.9%	7	ND	0	ND	0	ND	0
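As an illustrative sketch only, each percentage in Table 2 can be read as the number of tumor/normal pairs showing at least two-fold overexpression divided by the number of pairs tested, with "ND" reported when no pairs were tested. The underlying counts of overexpressing pairs are not given in the source, so the function name and the counts used below are hypothetical.

```python
def percent_overexpressed(n_overexpressed, n_pairs):
    """Format the fraction of overexpressing pairs as in Table 2.

    Returns "ND" (not done) when no pairs were tested.
    """
    if n_pairs == 0:
        return "ND"
    if not 0 <= n_overexpressed <= n_pairs:
        raise ValueError("count must be between 0 and the number of pairs")
    return f"{100.0 * n_overexpressed / n_pairs:.1f}%"
```

Under this reading, one overexpressing pair out of 19 would be reported as 5.3%, and eight out of eight as 100.0%.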

[0145]

Sequence CWU 1

1

24 1 8580 DNA Homo sapiens 1 ggtgtccagc tcggatggct gactgctctt tcacttgtcg ttaacttcca cccagtagtc 60 ttagtctccc cttggacatg gtaccgaccg tgtgccattc ttatttcact tgaccgagcc 120 cgagctgcag cccagggtgg aagccgcaag ggctgggcag agttccaggc aagagggacc 180 gagaggcgtg agcagtgctc cgcagcgctt gtcagagaaa tggagcagcg gctcctttgt 240 gggctcttac atccccgtcc tgggtgcaca gcccggatat ttgatctgtg gggtgattga 300 taagtgaggg agagggggga cgcgatccct ccctccctcc tccctccctc ctccctcctc 360 actccctcct ccctcctccc tccctcctcc ctcctccctc ctccctccct tcttcccctc 420 tcctccctcc cgctttactt ccctcctccc tctcgcctcc ctccctcctt cctgcaagaa 480 gcgttgcccg ttggctagct gctcggtggg gatctgcctg ccctgggggc gccgcccgcg 540 ctccccgcgg tgctctcgct cctgggctgc gccagtccga ggcggtgccg gctcctttgc 600 ctccccgagt cgcagatgct gcgggcgcct ccgggaaaag atctgggcgg cgcgctcgct 660 cggtaagttc tgagcactca gggacgcggt ggcgacgcgg ccagtgagcc ggctttcctc 720 agtccgttgc ctttcccggc cacctctcct tgcgaggggc accagcgtgg ggaggctggg 780 cgccatccgc ggagggcagc tcgctggcgg ccgccctcta ccctcaatcc ccactggaga 840 ttcctcaccc cgggaccgtc cgcgcgggcg tggtcgggct ccgcgcctcg cgcagcgggg 900 tggcacaggc ggccagggag ggcccacgca cccggcgcga gctagaagcc tccggtcggc 960 ctgcagtgcc caagtcccat ggcgagggca gcccgagtgg ccgtcgcggc tgtaggtccg 1020 catgccgggc accgcaccag gcgtctagca ggtaggggca gggaaggtag ggctgcgctg 1080 gcggccggtg cccagttacc ggcgatcggg gacgctcgga gacacctggt ctcccggaag 1140 cgcccttcgg aaatgggatt cgacccggct tgcgggcggc gggtgtttag aagaagaggc 1200 tgcgggcaag cagtgccccc tctctggttc cccggactcc tcttagcccc ctcgtggcct 1260 gatgggcggc cgggacggag gtgggctgta agcccgccgg cacccaccgt gtcctgtgga 1320 aggcttggag acctgagcca ggctctactc ggttagatgc gagtgagaca ggcgacggag 1380 ttcgtcttta agcctccctt gcctcagcaa ggagagggag cgttttcctt attttaatga 1440 ccgcctttcc ctcccttggg gtcccagttc accctgaacc tttccacagc cttaaaagcc 1500 cgctctccct agcggagtgc tgcggcagtt cttcagaaga cacggggcca aatgaaactt 1560 acttaaagtg gttatcgcgt ttcaggctga tttgttccta gtaactacgt tttggaaagc 1620 agctgtggac tctcaaggac aggcaggaac gaagacctcc ttagggtccg agtgtcgctt 1680 
cccacagtta gattacccat gaatttcctt gattctgtag gtctcaagaa tctcatgagc 1740 ccccaaccac ctccaccttt ctctctgaat ctctctcctg tctctgaaat tcttcgcaaa 1800 aataatctgt cctcaaggaa tattgaaagc gtcatcaatg tcttgaagga aattactgta 1860 tctgagaacc gaaactagta tttgatgttt cactttcaaa tacatttttt tttaataaaa 1920 aggatacctt taagtaaaac actaccactg ctatttacgt ttaagtagat ttttaattca 1980 tattaagcag tagtgtgctt tttggaaaga tgatggactg ggactcataa tccctgggtt 2040 ttgtttctat ttgtagtttc ttgagtaagt cagcactttc cctggttaag aaatgacctc 2100 atctgtaaaa tgaacctgac ttctaaggtc ttgttcggct ctcacgtctt tacttcagtt 2160 aaatattgac ataatacatg tttgttgaat gaatgactga atgaataaat cttattgctc 2220 taggaactat gtgcttttaa cctttggaaa tacatagaaa caaatgcctc ctttgctagg 2280 aaagggagct taattgtgct cccactgcat cagactgctt ccatctaatg atgcaattgc 2340 aatacagggt gggaggccat ccacagggct gtccctctgc ctcagagctc atctcaagtt 2400 tgcccttctc tatggagaag gaaaatttga gtctccagaa ggaaagtatt ttgctaaacg 2460 ccaagcaaga attagagggt aagggggctt ctgatgcctg gtccaaggct cttagaagaa 2520 gaagaaggag aaagggagaa agctcaagaa aataatgcac agatcaccag ctacaggtgg 2580 ccctagtgcc taagtttata aactaccccc gcccttcccg gagagaaagg agcttgtata 2640 aagggaacaa tcctagaacc taggcttaac aaggaaggag aaggggagaa gcaagggtca 2700 ttctgttctc aagtggtttg ttgtatctgt gtgtttatct attccacatt tggccaggca 2760 gtgggactcc gggaaagcaa gctgaagtgt tttgtgggac aggcacatca tttggtcaac 2820 tccttttccc tggttgattc cctccttttc ctgacccatc tcccccttcc taccagaggc 2880 aaccaaggcc tttccaggca acagcagaca ttatttaccc cttgcgaatt gattccacaa 2940 tggaagactc ttgggtccaa ggagctgtaa gataattaag gagaaaacag ataatgaagg 3000 caaaaatcca acacctggcc attagcaatg ctctggaaag gctatttaaa cccgcattgg 3060 atatcagagc tgggagggcc catacagtct acctacctgc cttttgcaga tggacacagg 3120 aagatccaga agctagtggc acatctagca acagagccag atcagaaccc aggtaagctc 3180 ggtctcaggc caggattcct tttccatctc ctctctttct caggagccag tcttcctgca 3240 ccagcttcct cttttctcct agctcccctg cccctgcagc ctggagggct caaccaccct 3300 tcctttggct cccactccca gctgaggctc agcctggcag tgcttttctg acacccactt 3360 cttttctcct 
tcctccaggc aagaagtgca cgtttaaacc tcattatctg gctaagtata 3420 aacctcaggg agaaaagggg tttgtttttg tttttctcag tctataagct aacattgagc 3480 tgactttcag agtccaggca aatatgtttg gtgggtcacc acccagaaag gaattctctt 3540 agcactgaat cagggctctg tatgtaaagt ataaatcccc taagagaact cctcaccctc 3600 ctacacagac acattggtgc atgcacacac accccactct ctgcacagag agcatctgag 3660 ctaggagctg gcaagtgggg caccagtcct gtagcaaaga ggcacaccac acacacacac 3720 acacacacac acacacactc tctctctctc tctctctcac tcacacacac acacacacac 3780 acacacacac acacacacgc ttctcccttg cttgggaatc tgggagaaga accccccacc 3840 cccacccctg ccctccatag gcattgtgta ggtgagagaa agagggagga gtgagagaga 3900 acacacagag gggcccaagg aggtgccagg ccattagcag ggcccctcct tgagaaaccc 3960 ctctgcagga gcttctcctg ccgccagcca ggttggaggt ggagtagttc agaatcaact 4020 gacgcagccg ggaattgagc tttgcaaagc cacttgcaag gaagggaagc atctgcccaa 4080 ccctccccca ccgcgcgccc tggatcctct gcctgccccc tcccccgtga cgtcacccta 4140 gtcctgtccc ggggagcctg caaagcctct cagattcaaa ctgctagacg cactgctgcc 4200 accgccaccg aattggaaac gcgcgcccag gctccgtcgt cgccttcgcc cgccgaccgg 4260 gccagccggc tctccgacct ccctacagaa tcgcacccca gtccctccct ggcagctcgg 4320 cttccctcag ctccaactct tctcttccgc tcctgcctcc tgtcggattt ttaatttctg 4380 cgcaccccca gtcaaattaa atcaaccaac aaaaagcagg gcatcccccc tggaagcagc 4440 gtcttatttt accttgttct cccacttcct gaagatgcta aactcctggt ggactgcaga 4500 ggagagggat tcagtcttct cctgatgtgt gagtaacccc cacctcgcac tgtctttccc 4560 atctctatct ctcctctaca cctgccgcca gccccctgat tcctgatttt cccaccccct 4620 ttttgcgctt tttttttttt tcctaaagcg attgcgattt ctgctgggag ctcaagacgg 4680 gcgagctgcc cgagatctct tcgagatacc ccaggggagg aggagatggg caggatttag 4740 taggacaact cggttactaa tgacttggcg gctggctgcg accccccggg aaatcaggtg 4800 caagcatgtg tgttcccggg gcgcgtgtgt gggtggcctc gggatggggg aatagggagc 4860 ggagaaaaga aaccgctttg gaaaatgcaa tgatttcatt ctgccgtgtt gctaaccccc 4920 tcattctccc tcgctcccac tccgcctcct tgctttacca tttttaatcg ggcttccttg 4980 tttctttctt ccctgcaccc gcttcttccc cctgccccca cctaaggttt gcctgtaggt 5040 acctgagttg acaccgaagg 
tgcctaaaga tgctgagcgg cgtttggttc ctcagtgtgt 5100 taaccgtggc cgggatctta cagacagaga gtcgcaaaac tgccaaagac atttgcaaga 5160 tccgctgtct gtgcgaagaa aaggaaaacg tactgaatat caactgtgag aacaaaggat 5220 ttacaacagt tagcctgctc cagccccccc agtatcgaat ctatcagctt tttctcaatg 5280 gaaacctctt gacaagactg tatccaaacg aatttgtcaa ttactccaac gcggtgactc 5340 ttcacctagg taacaacggg ttacaggaga tccgaacggg ggcattcagt ggcctgaaaa 5400 ctctcaaaag actgcatctc aacaacaaca agcttgagat attgagggag gacaccttcc 5460 taggcctgga gagcctggag tatctccagg ccgactacaa ttacatcagt gccatcgagg 5520 ctggggcatt cagcaaactt aacaagctca aagtgctcat cctgaatgac aaccttctgc 5580 tttcactgcc cagcaatgtg ttccgctttg tcctgctgac ccacttagac ctcaggggga 5640 ataggctaaa agtaatgcct tttgctggcg tccttgaaca tattggaggg atcatggaga 5700 ttcagctgga ggaaaatcca tggaattgca cttgtgactt acttcctctc aaggcctggc 5760 tagacaccat aactgttttt gtgggagaga ttgtctgtga gactcccttt aggttgcatg 5820 ggaaagacgt gacccagctg accaggcaag acctctgtcc cagaaaaagt gccagtgatt 5880 ccagtcagag gggcagccat gctgacaccc acgtccaaag gctgtcacct acaatgaatc 5940 ctgctctcaa cccaaccagg gctccgaaag ccagccggcc gcccaaaatg agaaatcgtc 6000 caactccccg agtgactgtg tcaaaggaca ggcaaagttt tggacccatc atggtgtacc 6060 agaccaagtc tcctgtgcct ctcacctgtc ccagcagctg tgtctgcacc tctcagagct 6120 cagacaatgg tctgaatgta aactgccaag aaaggaagtt cactaatatc tctgacctgc 6180 agcccaaacc gaccagtcca aagaaactct acctaacagg gaactatctt caaactgtct 6240 ataagaatga cctcttagaa tacagttctt tggacttact gcacttagga aacaacagga 6300 ttgcagtcat tcaggaaggt gcctttacaa acctgaccag tttacgcaga ctttatctga 6360 atggcaatta ccttgaagtg ctgtaccctt ctatgtttga tggactgcag agcttgcaat 6420 atctctattt agagtataat gtcattaagg aaattaagcc tctgaccttt gatgctttga 6480 ttaacctaca gctactgttt ctgaacaaca accttcttcg gtccttacct gataatatat 6540 ttggggggac ggccctaacc aggctgaatc tgagaaacaa ccatttttct cacctgcccg 6600 tgaaaggggt tctggatcag ctcccggctt tcatccagat agatctgcag gagaacccct 6660 gggactgtac ctgtgacatc atggggctga aagactggac agaacatgcc aattcccctg 6720 tcatcattaa tgaggtgact tgcgaatctc 
ctgctaagca tgcaggggag atactaaaat 6780 ttctggggag ggaggctatc tgtccagaca gcccaaactt gtcagatgga accgtcttgt 6840 caatgaatca caatacagac acacctcggt cgcttagtgt gtctcctagt tcctatcctg 6900 aactacacac tgaagttcca ctgtctgtct taattctggg attgcttgtt gttttcatct 6960 tatctgtctg ttttggggct ggtttattcg tctttgtctt gaaacgccga aagggagtgc 7020 cgagcgttcc caggaatacc aacaacttag acgtaagctc ctttcaatta cagtatgggt 7080 cttacaacac tgagactcac gataaaacag acggccatgt ctacaactat atccccccac 7140 ctgtgggtca gatgtgccaa aaccccatct acatgcagaa ggaaggagac ccagtagcct 7200 attaccgaaa cctgcaagag ttcagctata gcaacctgga ggagaaaaaa gaagagccag 7260 ccacacctgc ttacacaata agtgccactg agctgctaga aaagcaggcc acaccaagag 7320 agcctgagct gctgtatcaa aatattgctg agcgagtcaa ggaacttccc agcgcaggcc 7380 tagtccacta taacttttgt accttaccta aaaggcagtt tgccccttcc tatgaatctc 7440 gacgccaaaa ccaagacaga atcaataaaa ccgttttata tggaactccc aggaaatgct 7500 ttgtggggca gtcaaaaccc aaccaccctt tactgcaagc taagccgcaa tcagaaccgg 7560 actacctcga agttctggaa aaacaaactg caatcagtca gctgtgaagg gaaatcattt 7620 acaaccctaa ggcatcagag gatgctgctc cgaactgttg gaaacaagga cattagcttt 7680 tgtgtttgtt tttgttctcc ctttcccagt gttaatgggg gactttgaaa atgtttggga 7740 gataggatga agtcatgatt ttgcttttgc aagttttcct ttaaattatt tctctctcgc 7800 tctcctcccc tccttttttt tttttttttt ttctttttcc cttctcttct taggaaccat 7860 cagtggacat gaatgtttct acaatgcatt tcttcataga ttttgtttat ggttttgttt 7920 cttttttctt ctttgttttt cagtgtggga gtgggaagag gagattatag tgactgaaga 7980 aagaataggc aaacttttca aatgaaaatg gatatttagt gtattttgta gaagatctcc 8040 aaagatcttt tgtgactaca acttcttttg taaataatga tatatggtat ttccatcgtc 8100 agttaccgag tatagccact gggtatcact actttgtgtt aaagtgcctt cgcactttaa 8160 gtacattact taaatgttgc ttttagcttt gataaattga aaatatttta atgtgttgta 8220 tttttgaaat tgaaaacact gtaaaataga ttgatgtgtc agctatatta agtcaacgta 8280 cagtttgctt gagttataga aaccagcctg tcatcaaatg attctagttc taggactttg 8340 taggcttaac tataaaatat ttcctttcct ctgggtttaa gtgattttat ttaagtcaac 8400 taaggggatt taacagtgga ctagaggtaa taagccacct 
cagtcaggat taataattca 8460 ttaataaaat atatttaacc caatatcaga gtgaattgag caattaatgc ccttccgtaa 8520 atcattattt tacactaaca tggtgagtgt tttagattat tttcctaatt aaaagaacgt 8580 2 2667 DNA Homo sapiens 2 ttcttccctg cacccgcttc ttccccctgc ccccacctaa ggtttgcctg taggtacctg 60 agttgacacc gaaggtgcct aaagatgctg agcggcgttt ggttcctcag tgtgttaacc 120 gtggccggga tcttacagac agagagtcgc aaaactgcca aagacatttg caagatccgc 180 tgtctgtgcg aagaaaagga aaacgtactg aatatcaact gtgagaacaa aggatttaca 240 acagttagcc tgctccagcc cccccagtat cgaatctatc agctttttct caatggaaac 300 ctcttgacaa gactgtatcc aaacgaattt gtcaattact ccaacgcggt gactcttcac 360 ctaggtaaca acgggttaca ggagatccga acgggggcat tcagtggcct gaaaactctc 420 aaaagactgc atctcaacaa caacaagctt gagatattga gggaggacac cttcctaggc 480 ctggagagcc tggagtatct ccaggccgac tacaattaca tcagtgccat cgaggctggg 540 gcattcagca aacttaacaa gctcaaagtg ctcatcctga atgacaacct tctgctttca 600 ctgcccagca atgtgttccg ctttgtcctg ctgacccact tagacctcag ggggaatagg 660 ctaaaagtaa tgccttttgc tggcgtcctt gaacatattg gagggatcat ggagattcag 720 ctggaggaaa atccatggaa ttgcacttgt gacttacttc ctctcaaggc ctggctagac 780 accataactg tttttgtggg agagattgtc tgtgagactc cctttaggtt gcatgggaaa 840 gacgtgaccc agctgaccag gcaagacctc tgtcccagaa aaagtgccag tgattccagt 900 cagaggggca gccatgctga cacccacgtc caaaggctgt cacctacaat gaatcctgct 960 ctcaacccaa ccagggctcc gaaagccagc cggccgccca aaatgagaaa tcgtccaact 1020 ccccgagtga ctgtgtcaaa ggacaggcaa agttttggac ccatcatggt gtaccagacc 1080 aagtctcctg tgcctctcac ctgtcccagc agctgtgtct gcacctctca gagctcagac 1140 aatggtctga atgtaaactg ccaagaaagg aagttcacta atatctctga cctgcagccc 1200 aaaccgacca gtccaaagaa actctaccta acagggaact atcttcaaac tgtctataag 1260 aatgacctct tagaatacag ttctttggac ttactgcact taggaaacaa caggattgca 1320 gtcattcagg aaggtgcctt tacaaacctg accagtttac gcagacttta tctgaatggc 1380 aattaccttg aagtgctgta cccttctatg tttgatggac tgcagagctt gcaatatctc 1440 tatttagagt ataatgtcat taaggaaatt aagcctctga cctttgatgc tttgattaac 1500 ctacagctac tgtttctgaa caacaacctt cttcggtcct 
tacctgataa tatatttggg 1560 gggacggccc taaccaggct gaatctgaga aacaaccatt tttctcacct gcccgtgaaa 1620 ggggttctgg atcagctccc ggctttcatc cagatagatc tgcaggagaa cccctgggac 1680 tgtacctgtg acatcatggg gctgaaagac tggacagaac atgccaattc ccctgtcatc 1740 attaatgagg tgacttgcga atctcctgct aagcatgcag gggagatact aaaatttctg 1800 gggagggagg ctatctgtcc agacagccca aacttgtcag atggaaccgt cttgtcaatg 1860 aatcacaata cagacacacc tcggtcgctt agtgtgtctc ctagttccta tcctgaacta 1920 cacactgaag ttccactgtc tgtcttaatt ctgggattgc ttgttgtttt catcttatct 1980 gtctgttttg gggctggttt attcgtcttt gtcttgaaac gccgaaaggg agtgccgagc 2040 gttcccagga ataccaacaa cttagacgta agctcctttc aattacagta tgggtcttac 2100 aacactgaga ctcacgataa aacagacggc catgtctaca actatatccc cccacctgtg 2160 ggtcagatgt gccaaaaccc catctacatg cagaaggaag gagacccagt agcctattac 2220 cgaaacctgc aagagttcag ctatagcaac ctggaggaga aaaaagaaga gccagccaca 2280 cctgcttaca caataagtgc cactgagctg ctagaaaagc aggccacacc aagagagcct 2340 gagctgctgt atcaaaatat tgctgagcga gtcaaggaac ttcccagcgc aggcctagtc 2400 cactataact tttgtacctt acctaaaagg cagtttgccc cttcctatga atctcgacgc 2460 caaaaccaag acagaatcaa taaaaccgtt ttatatggaa ctcccaggaa atgctttgtg 2520 gggcagtcaa aacccaacca ccctttactg caagctaagc cgcaatcaga accggactac 2580 ctcgaagttc tggaaaaaca aactgcaatc agtcagctgt gaagggaaat catttacaac 2640 cctaaggcat cagaggatgc tgctccg 2667 3 6801 DNA Homo sapiens 3 agccggccgt ggtggctccg tgcgtccgag cgtccgtccg cgccgtcggc catggccaag 60 cgctccaggg gccccgggcg ccgctgcctg ttggcgctcg tgctgttctg cgcctggggg 120 acgctggccg tggtggccca gaagccgggc gcagggtgtc cgagccgctg cctgtgcttc 180 cgcaccaccg tgcgctgcat gcatctgctg ctggaggccg tgcccgccgt ggcgccgcag 240 acctccatcc tagatcttcg ctttaacaga atcagagaga tccaacctgg ggcattcagg 300 cggctgagga acttgaacac attgcttctc aataataatc agatcaagag gatacctagt 360 ggagcatttg aagacttgga aaatttaaaa tatctctatc tgtacaagaa tgagatccag 420 tcaattgaca ggcaagcatt taagggactt gcctctctag agcaactata cctgcacttt 480 aatcagatag aaactttgga cccagattcg ttccagcatc tcccgaagct cgagaggcta 540 
tttttgcata acaaccggat tacacattta gttccaggga catttaatca cttggaatct 600 atgaagagat tgcgactgga ctcaaacaca cttcactgcg actgtgaaat cctgtggttg 660 gcggatttgc tgaaaaccta cgcggagtcg gggaacgcgc aggcagcggc catctgtgaa 720 tatcccagac gcatccaggg acgctcagtg gcaaccatca ccccggaaga gctgaactgt 780 gaaaggcccc ggatcacctc cgagccccag gacgcagatg tgacctcggg gaacaccgtg 840 tacttcacct gcagagccga aggcaacccc aagcctgaga tcatctggct gcgaaacaat 900 aatgagctga gcatgaagac agattcccgc ctaaacttgc tggacgatgg gaccctgatg 960 atccagaaca cacaggagac agaccagggt atctaccagt gcatggcaaa gaacgtggcc 1020 ggagaggtga agacgcaaga ggtgaccctc aggtacttcg ggtctccagc tcgacccact 1080 tttgtaatcc agccacagaa tacagaggtg ctggttgggg agagcgtcac gctggagtgc 1140 agcgccacag gccacccccc gccgcggatc tcctggacga gaggtgaccg cacacccttg 1200 ccagttgacc cgcgggtgaa catcacgcct tctggcgggc tttacataca gaacgtcgta 1260 cagggggaca gcggagagta tgcgtgctct gcgaccaaca acattgacag cgtccatgcc 1320 accgctttca tcatcgtcca ggctcttcct cagttcactg tgacgcctca ggacagagtc 1380 gttattgagg gccagaccgt ggatttccag tgtgaagcca agggcaaccc gccgcccgtc 1440 atcgcctgga ccaagggagg gagccagctc tccgtggacc ggcggcacct ggtcctgtca 1500 tcgggaacac ttagaatctc tggtgttgcc ctccacgacc agggccagta cgaatgccag 1560 gctgtcaaca tcatcggctc ccagaaggtc gtggcccacc tgactgtgca gcccagagtc 1620 accccagtgt ttgccagcat tcccagcgac acaacagtgg aggtgggcgc caatgtgcag 1680 ctcccgtgca gctcccaggg cgagcccgag ccagccatca cctggaacaa ggatggggtt 1740 caggtgacag aaagtggaaa atttcacatc agccctgaag gattcttgac catcaatgac 1800 gttggccctg cagacgcagg tcgctatgag tgtgtggccc ggaacaccat tgggtcggcc 1860 tcggtgagca tggtgctcag tgtgaatgac gtcagtcgaa atggagatcc gtttgtagct 1920 acctccatcg tggaagcgat tgcgactgtt gacagagcta taaactcaac ccgaacacat 1980 ttgtttgaca gccgtcctcg ttctccaaat gatttgctgg ccttgttccg gtatccgagg 2040 gatccttaca cagttgaaca ggcacgggcg ggagaaatct ttgaacggac attgcagctc 2100 attcaggagc atgtacagca tggcttgatg gtcgacctca acggaacaag ttaccactac 2160 aacgacctgg tgtctccaca gtacctgaac ctcatcgcaa acctgtcggg ctgtaccgcc 2220 caccggcgcg 
tgaacaactg ctcggacatg tgcttccacc agaagtaccg gacgcacgac 2280 ggcacctgta acaacctgca gcaccccatg tggggcgcct cgctgaccgc cttcgagcgc 2340 ctgctgaaat ccgtgtacga gaatggcttc aacacccctc ggggcatcaa cccccaccga 2400 ctgtacaacg ggcacgccct tcccatgccg cgcctggtgt ccaccaccct gatcgggacg 2460 gagaccgtca cacccgacga gcagttcacc cacatgctga tgcagtgggg ccagttcctg 2520 gaccacgacc tcgactccac ggtggtggcc ctgagccagg cacgcttctc cgacggacag 2580 cactgcagca acgtgtgcag caacgacccc ccctgcttct ctgtcatgat cccccccaat 2640 gactcccggg ccaggagcgg ggcccgctgc atgttcttcg tgcgctccag ccctgtgtgc 2700 ggcagcggca tgacttcgct gctcatgaac tccgtgtacc cgcgggagca gatcaaccag 2760 ctcacctcct acatagacgc atccaacgtg tacgggagca cggagcatga ggcccgcagc 2820 atccgcgacc tggccagcca ccgcggcctg ctgcggcagg gcatcgtgca gcggtccggg 2880 aagccgctgc tccccttcgc caccgggccg cccacggagt gcatgcggga cgagaacgag 2940 agccccatcc cctgcttcct ggccggggac caccgcgcca acgagcagct gggcctgacc 3000 agcatgcaca cgctgtggtt ccgcgagcac aaccgcattg ccacggagct gctcaagctg 3060 aacccgcact gggacggcga caccatctac tatgagacca ggaagatcgt gggtgcggag 3120 atccagcaca tcacctacca gcactggctc ccgaagatcc tgggggaggt gggcatgagg 3180 acgctgggag agtaccacgg ctacgacccc ggcatcaatg ctggcatctt caacgccttc 3240 gccaccgcgg ccttcaggtt tggccacacg cttgtcaacc cactgcttta ccggctggac 3300 gagaacttcc agcccattgc acaagatcac ctcccccttc acaaagcttt cttctctccc 3360 ttccggattg tgaatgaggg cggcatcgat ccgcttctca gggggctgtt cggggtggcg 3420 gggaaaatgc gtgtgccctc gcagctgctg aacacggagc tcacggagcg gctgttctcc 3480 atggcacaca cggtggctct ggacctggcg gccatcaaca tccagcgggg ccgggaccac 3540 gggatcccac cctaccacga ctacagggtc tactgcaatc tatcggcggc acacacgttc 3600 gaggacctga aaaatgagat taaaaaccct gagatccggg agaaactgaa aaggttgtat 3660 ggctcgacac tcaacatcga cctgtttccg gcgctcgtgg tggaggacct

ggtgcctggc 3720 agccggctgg gccccaccct gatgtgtctt ctcagcacac agttcaagcg cctgcgagat 3780 ggggacaggt tgtggtatga gaaccctggg gtgttctccc cggcccagct gactcagatc 3840 aagcagacgt cgctggccag gatcctatgc gacaacgcgg acaacatcac ccgggtgcag 3900 agcgacgtgt tcagggtggc ggagttccct cacggctacg gcagctgtga cgagatcccc 3960 agggtagacc tccgggtgtg gcaggactgc tgtgaagact gtaggaccag ggggcagttc 4020 aatgcctttt cctatcattt ccgaggcaga cggtctcttg agttcagcta ccaggaggac 4080 aagccgacca agaaaacaag accacggaaa atacccagtg ttgggagaca gggggaacat 4140 ctcagcaaca gcacctcagc cttcagcaca cgctcagatg catctgggac aaatgacttc 4200 agagagtttg ttctggaaat gcagaagacc atcacagacc tcagaacaca gataaagaaa 4260 cttgaatcac ggctcagtac cacagagtgc gtggatgccg ggggcgaatc tcacgccaac 4320 aacaccaagt ggaaaaaaga tgcatgcacc atttgtgaat gcaaagacgg gcaggtcacc 4380 tgcttcgtgg aagcttgccc ccctgccacc tgtgctgtcc ccgtgaacat cccaggggcc 4440 tgctgtccag tctgcttaca gaagagggcg gaggaaaagc cctaggctcc tgggaggctc 4500 ctcagagttt gtctgctgtg ccatcgtgag atcgggtggc cgatggcagg gagctgcgga 4560 ctgcagacca ggaaacaccc agaactcgtg acatttcatg acaacgtcca gctggtgctg 4620 ttacagaagg cagtgcagga ggcttccaac cagagcatct gcggagaagg aggcacagca 4680 ggtgcctgaa gggaagcagg caggagtcct agcttcacgt tagacttctc aggtttttat 4740 ttaattcttt taaaatgaaa aattggtgct actattaaat tgcacagttg aatcatttag 4800 gcgcctaaat tgattttgcc tcccaacacc atttcttttt aaataaagca ggatacctct 4860 atatgtcagc cttgccttgt tcagatgcca ggagccggca gacctgtcac ccgcaggtgg 4920 ggtgagtctt ggagctgcca gaggggctca ccgaaatcgg ggttccatca caagctatgt 4980 ttaaaaagaa aattggtgtt tggcaaacgg aacagaacct ttgatgagag cgttcacagg 5040 gacactgtct gggggtgcag tgcaagcccc cggcctcttc cctgggaacc tctgaactcc 5100 tccttcctct gggctctctg taacatttca ccacacgtca gcatctaatc ccaagacaaa 5160 cattcccgct gctcgaagca gctgtatagc ctgtgactct ccgtgtgtca gctccttcca 5220 cacctgatta gaacattcat aagccacatt tagaaacagg tttgctttca gctgtcactt 5280 gcacacatac tgcctagttg tgaaccaaat gtgaaaaaac ctccttcatc ccattgtgta 5340 tctgatacct gccgagggcc aagggtgtgt gttgacaacg ccgctcccag ccggccctgg 
5400 ttgcgtccac gtcctgaaca agagccgctt ccggatggct cttcccaagg gaggaggagc 5460 tcaagtgtcg ggaactgtct aacttcaggt tgtgtgagtg cgttaaaaaa aaaaaaaaaa 5520 aagaatccct atacctcatt tgtattttta aaatgcgtga tgttttatga aattgtgtcc 5580 attttttagg tattagatat ggcagaaaaa ccatttccac tatgcaaagt tcttttagac 5640 gtcagtgaaa atcaactctc atacctcatg gtctctcttt aattgaccaa aaccttccat 5700 ttttctctaa atacaaagcg atctgtgttc tgagcaacct ttccccgaac acacagcttc 5760 agtgcagcac gctgacctga gtatccacca tgtgccaggc acagtgctgg gcacacgagg 5820 caccaaggtc cgggccacct gcccgcagca aggcccagct gaggtggtgg agggagcccc 5880 tgaggtcagg ggccgtttcg gttcagggtg gcaggtgtcc agcactgggg tatggcgtcg 5940 aggcttccat ggggtggggg aggccagctt ccttctgaca ggatgggcgc atacagtgcc 6000 tggtgtgatt tgtgcacaac ccgtgttcca ggtgcacatc ctcccaagga gacacccaga 6060 cccttccagc acgggccggc caagttgctg cggcggaggc agcatttcag ctgtgaggaa 6120 ggtcattgga ttcatgtgtt ttatctgtaa aaatggttgt cttaacttct taacctcata 6180 ttggtaagtg attgataaaa attggttggt gtttcatgac atgtggactt cttttgaaat 6240 agcaagtcaa atgtagtgac caaattgtgg aagagatttc tgtcaaatag gaaatgtgta 6300 agttcgtcta aaagctgatg gttatgtaag ttgctcaggc actcagatga cagcagattc 6360 tgggttctgg gagtgttctg tgcctcttac atgccctgga ggcctcatgg tctcagtgct 6420 gaggcggcac acctgtagca cacctgcgta atgtgcggtc tgggccagtc acaaggaatt 6480 gtgttgtcta agccaaaggg ggaagctgac tgtgatttac caaaaaaaat tctgtaattc 6540 aaaccaaaat gtctgcggaa tcaccagttt gatactctct gtaatcagaa cagtgggcag 6600 tgcctgggtg aacgtgtcta gcagccactg tgcgggatcg ctgtaacagg agtggaatgt 6660 acatatttat ttacttttct aactgctcca acagccaaat gcctttttta tgaccattgt 6720 attcagttca ttaccaaaga aatgtttgca ctttgtaatg atgcctttca gttcaaataa 6780 atgggtcaca ttttcaaatg g 6801 4 2593 DNA Homo sapiens 4 actcccaaac tccagtgctc tcatccagag gctcttgtga ttctctttgc aattgtgagg 60 aaaaagatgg cacaatgcta ataaattgtg aagcaaaagg tatcaagatg gtatctgaaa 120 taagtgtgcc accatcacga cctttccaac taagcttatt aaataacggc ttgacgatgc 180 ttcacacaaa tgacttttct gggcttacca atgctatttc aatacacctt ggatttaaca 240 atattgcaga tattgagata 
ggtgcattta atggccttgg cctcctgaaa caacttcata 300 tcaatcacaa ttctttagaa attcttaaag aggatacttt ccatggactg gaaaacctgg 360 aattcctgca agcagataac aattttatca cagtgattga accaagtgcc tttagcaagc 420 tcaacagact caaagtgtta attttaaatg acaatgctat tgagagtctt cctccaaaca 480 tcttccgatt tgttccttta acccatctag atcttcgtgg aaatcaatta caaacattgc 540 cttatgttgg ttttctcgaa cacattggcc gaatattgga tcttcagttg gaggacaaca 600 aatgggcctg caattgtgac ttattgcagt taaaaacttg gttggagaac atgcctccac 660 agtctataat tggtgatgtt gtctgcaaca gccctccatt ttttaaagga agtatactca 720 gtagactaaa gaaggaatct atttgcccta ctccaccagt gtatgaagaa catgaggatc 780 cttcaggatc attacatctg gcagcaacat cttcaataaa tgatagtcgc atgtcaacta 840 agaccacgtc cattctaaaa ctacccacca aagcaccagg tttgatacct tatattacaa 900 agccatccac tcaacttcca ggaccttact gccctattcc ttgtaactgc aaagtcctat 960 ccccatcagg acttctaata cattgtcagg agcgcaacat tgaaagctta tcagatctga 1020 gacctcctcc gcaaaatcct agaaagctca ttctagcggg aaatattatt cacagtttaa 1080 tgaagtctga tctagtggaa tatttcactt tggaaatgct tcacttggga aacaatcgta 1140 ttgaagttct tgaagaagga tcgtttatga acctaacgag attacaaaaa ctctatctaa 1200 atggtaacca cctgaccaaa ttaagtaaag gcatgttcct tggtctccat aatcttgaat 1260 acttatatct tgaatacaat gccattaagg aaatactgcc aggaaccttt aatccaatgc 1320 ctaaacttaa agtcctgtat ttaaataaca acctcctcca agttttacca ccacatattt 1380 tttcaggggt tcctctaact aaggtaaatc ttaaaacaaa ccagtttacc catctacctg 1440 taagtaatat tttggatgat cttgatttgc taacccagat tgaccttgag gataacccct 1500 gggactgctc ctgtgacctg gttggactgc agcaatggat acaaaagtta agcaagaaca 1560 cagtgacaga tgacatcctc tgcacttccc ccgggcatct cgacaaaaag gaattgaaag 1620 ccctaaatag tgaaattctc tgtccaggtt tagtaaataa cccatccatg ccaacacaga 1680 ctagttacct tatggtcacc actcctgcaa caacaacaaa tacggctgat actattttac 1740 gatctcttac ggacgctgtg ccactgtctg ttctaatatt gggacttctg attatgttca 1800 tcactattgt tttctgtgct gcagggatag tggttcttgt tcttcaccgc aggagaagat 1860 acaaaaagaa acaagtagat gagcaaatga gagacaacag tcctgtgcat cttcagtaca 1920 gcatgtatgg ccataaaacc actcatcaca ctactgaaag 
accctctgcc tcactctatg 1980 aacagcacat ggtgagcccc atggttcatg tctatagaag tccatccttt ggtccaaagc 2040 atctggaaga ggaagaagag aggaatgaga aagaaggaag tgatgcaaaa catctccaaa 2100 gaagtctttt ggaacaggaa aatcattcac cactcacagg gtcaaatatg aaatacaaaa 2160 ccacgaacca atcaacagaa tttttatcct tccaagatgc cagctcattg tacagaaaca 2220 ttttagaaaa agaaagggaa cttcagcaac tgggaatcac agaataccta aggaaaaaca 2280 ttgctcagct ccagcctgat atggaggcac attatcctgg agcccacgaa gagctgaagt 2340 taatggaaac attaatgtac tcacgtccaa ggaaggtatt agtggaacag acaaaaaatg 2400 agtattttga acttaaagct aatttacatg ctgaacctga ctatttagaa gtcctggagc 2460 agcaaacata gatggagagt ttgagggctt tcgcagaaat gctgtgattc tgttttaagt 2520 ccataccttg taaataagtg ccttacgtga gtgtgtcatc aatcagaacc taagcacagc 2580 agtaaactat ggg 2593 5 2606 DNA Homo sapiens 5 actcccaaac tccagtgctc tcatccagag gctcttgtga ttctctttgc aattgtgagg 60 aaaaagatgg cacaatgcta ataaattgtg aagcaaaagg tatcaagatg gtatctgaaa 120 taagtgtgcc accatcacga cctttccaac taagcttatt aaataacggc ttgacgatgc 180 ttcacacaaa tgacttttct gggcttacca atgctatttc aatacacctt ggatttaaca 240 atattgcaga tattgagata ggtgcattta atggccttgg cctcctgaaa caacttcata 300 tcaatcacaa ttctttagaa attcttaaag aggatacttt ccatggactg gaaaacctgg 360 aattcctgca agcagataac aattttatca cagtgattga accaagtgcc tttagcaagc 420 tcaacagact caaagtgtta attttaaatg acaatgctat tgagagtctt cctccaaaca 480 tcttccgatt tgttccttta acccatctag atcttcgtgg aaatcaatta caaacattgc 540 cttatgttgg ttttctcgaa cacattggcc gaatattgga tcttcagttg gaggacaaca 600 aatgggcctg caattgtgac ttattgcagt taaaaacttg gttggagaac atgcctccac 660 agtctataat tggtgatgtt gtctgcaaca gccctccatt ttttaaagga agtatactca 720 gtagactaaa gaaggaatct atttgcccta ctccaccagt gtatgaagaa catgaggatc 780 cttcaggatc attacatctg gcagcaacat cttcaataaa tgatagtcgc atgtcaacta 840 agaccacgtc cattctaaaa ctacccacca aagcaccagg tttgatacct tatattacaa 900 agccatccac tcaacttcca ggaccttact gccctattcc ttgtaactgc aaagtcctat 960 ccccatcagg acttctaata cattgtcagg agcgcaacat tgaaagctta tcagatctga 1020 gacctcctcc gcaaaatcct 
agaaagctca ttctagcggg aaatattatt cacagtttaa 1080 tgaagtctga tctagtggaa tatttcactt tggaaatgct tcacttggga aacaatcgta 1140 ttgaagttct tgaagaagga tcgtttatga acctaacgag attacaaaaa ctctatctaa 1200 atggtaacca cctgaccaaa ttaagtaaag gcatgttcct tggtctccat aatcttgaat 1260 acttatatct tgaatacaat gccattaagg aaatactgcc aggaaccttt aatccaatgc 1320 ctaaacttaa agtcctgtat ttaaataaca cctcctccaa gttttaccac cacatatttt 1380 ttcaggggtt cctctaacta aggtaaatct taaacaaacc agtttaccca tctacctgta 1440 agtaatattt ggatgatctt gatttactaa cccagattga ccttgaggat aacccctggg 1500 ctgctcctgt gacctggttg gactgcagca atggatacaa aagttaagca agaacacagt 1560 gacagatgac atcctctgca cttcccccgg gcatctcgac aaaaaggaat tgaaagccct 1620 aaatagtgaa attctctgtc caggtttagt aaataaccca tccatgccaa cacagactag 1680 ttaccttatg gtcaccactc ctgcaacaac aacaaatacg gctgatacta ttttacgatc 1740 tcttacggac gctgtgccac tgtctgttct aatattggga cttctgatta tgttcatcac 1800 tattgttttc tgtgctgcag ggatagtggt tcttgttctt caccgcagga gaagatacaa 1860 aaagaaacaa gtagatgagc aaatgagaga caacagtcct gtgcatcttc agtacagcat 1920 gtatggccat aaaaccactc atcacactac tgaaagaccc tctgcctcac tctatgaaca 1980 gcacatggtg agccccatgg ttcatgtcta tagaagtcca tcctttggtc caaagcatct 2040 ggaagaggaa gaagagagga atgagaaaga aggaagtgat gcaaaacatc tccaaagaag 2100 tcttttggaa caggaaaatc attcaccact cacagggtca aatatgaaat acaaaaccac 2160 gaaccaatca acagaatttt tatccttcca agatgccagc tcattgtaca gaaacatttt 2220 agaaaaagaa agggaacttc agcaactggg aatcacagaa tacctaagga aaaacattgc 2280 tcagctccag cctgatatgg aggcacatta tcctggagcc cacgaagagc tgaagttaat 2340 ggaaacatta atgtactcac gtccaaggaa ggtattagtg gaacagacaa aaaatgagta 2400 ttttgaactt aaagctaatt tacatgctga acctgactat ttagaagtcc tggagcagca 2460 aacatagatg gagagtttga gggctttcgc agaaatgctg tgattctgtt ttaagtccat 2520 accttgtaaa taagtgcctt acgtgagtgt gtcatcaatc agaacctaag cacagcagtt 2580 aacttgggaa aaaaaaaaaa aaaaaa 2606 6 2574 DNA Homo sapiens 6 tcatcacatg acaacatgaa gctgtggatt catctctttt attcatctct ccttgcctgt 60 atatctttac actcccaaac tccagtgctc tcatccagag 
gctcttgtga ttctctttgc 120 aattgtgagg aaaaagatgg cacaatgcta ataaattgtg aagcaaaagg tatcaagatg 180 gtatctgaaa taagtgtgcc accatcacga cctttccaac taagcttatt aaataacggc 240 ttgacgatgc ttcacacaaa tgacttttct gggcttacca atgctatttc aatacacctt 300 ggatttaaca atattgcaga tattgagata ggtgcattta atggccttgg cctcctgaaa 360 caacttcata tcaatcacaa ttctttagaa attcttaaag aggatacttt ccatggactg 420 gaaaacctgg aattcctgca agcagataac aattttatca cagtgattga accaagtgcc 480 tttagcaagc tcaacagact caaagtgtta attttaaatg acaatgctat tgagagtctt 540 cctccaaaca tcttccgatt tgttccttta acccatctag atcttcgtgg aaatcaatta 600 caaacattgc cttatgttgg ttttctcgaa cacattggcc gaatattgga tcttcagttg 660 gaggacaaca aatgggcctg caattgtgac ttattgcagt taaaaacttg gttggagaac 720 atgcctccac agtctataat tggtgatgtt gtctgcaaca gccctccatt ttttaaagga 780 agtatactca gtagactaaa gaaggaatct atttgcccta ctccaccagt gtatgaagaa 840 catgaggatc cttcaggatc attacatctg gcagcaacat cttcaataaa tgatagtcgc 900 atgtcaacta agaccacgtc cattctaaaa ctacccacca aagcaccagg tttgatacct 960 tatattacaa agccatccac tcaacttcca ggaccttact gccctattcc ttgtaactgc 1020 aaagtcctat ccccatcagg acttctaata cattgtcagg agcgcaacat tgaaagctta 1080 tcagatctga gacctcctcc gcaaaatcct agaaagctca ttctagcggg aaatattatt 1140 cacagtttaa tgaagtctga tctagtggaa tatttcactt tggaaatgct tcacttggga 1200 aacaatcgta ttgaagttct tgaagaagga tcgtttatga acctaacgag attacaaaaa 1260 ctctatctaa atggtaacca cctgaccaaa ttaagtaaag gcatgttcct tggtctccat 1320 aatcttgaat acttatatct tgaatacaat gccattaagg aaatactgcc aggaaccttt 1380 aatccaatgc ctaaacttaa agtcctgtat ttaaataaca acctcctcca agttttacca 1440 ccacatattt tttcaggggt tcctctaact aaggtaaatc ttaaaacaaa ccagtttacc 1500 catctacctg taagtaatat tttggatgat cttgatttac taacccagat tgaccttgag 1560 gataacccct gggactgctc ctgtgacctg gttggactgc agcaatggat acaaaagtta 1620 agcaagaaca cagtgacaga tgacatcctc tgcacttccc ccgggcatct cgacaaaaag 1680 gaattgaaag ccctaaatag tgaaattctc tgtccaggtt tagtaaataa cccatccatg 1740 ccaacacaga ctagttacct tatggtcacc actcctgcaa caacaacaaa tacggctgat 1800 
actattttac gatctcttac ggacgctgtg ccactgtctg ttctaatatt gggacttctg 1860 attatgttca tcactattgt tttctgtgct gcagggatag tggttcttgt tcttcaccgc 1920 aggagaagat acaaaaagaa acaagtagat gagcaaatga gagacaacag tcctgtgcat 1980 cttcagtaca gcatgtatgg ccataaaacc actcatcaca ctactgaaag accctctgcc 2040 tcactctatg aacagcacat ggtgagcccc atggttcatg tctatagaag tccatccttt 2100 ggtccaaagc atctggaaga ggaagaagag aggaatgaga aagaaggaag tgatgcaaaa 2160 catctccaaa gaagtctttt ggaacaggaa aatcattcac cactcacagg gtcaaatatg 2220 aaatacaaaa ccacgaacca atcaacagaa tttttatcct tccaagatgc cagctcattg 2280 tacagaaaca ttttagaaaa agaaagggaa cttcagcaac tgggaatcac agaataccta 2340 aggaaaaaca ttgctcagct ccagcctgat atggaggcac attatcctgg agcccacgaa 2400 gagctgaagt taatggaaac attaatgtac tcacgtccaa ggaaggtatt agtggaacag 2460 acaaaaaatg agtattttga acttaaagct aatttacatg ctgaacctga ctatttagaa 2520 gtcctggagc agcaaacata gatggagagt ttgagggctt tcgcagaaat gctg 2574 7 1504 DNA Homo sapiens 7 attgccttat gttggttttc tcgaacacat tggccgaata ttggatcttc agttggagga 60 caacaaatgg gcctgcaatt gtgacttatt gcagttaaaa acttggttgg agaacatgcc 120 tccacagtct ataattggtg atgttgtctg caacagccct ccatttttta aaggaagtat 180 actcagtaga ctaaagaagg aatctatttg ccctactcca ccagtgtatg aagaacatga 240 ggatccttca ggatcattac atctggcagc aacatcttca ataaatgata gtcgcatgtc 300 aactaagacc acgtccattc taaaactacc caccaaagca ccaggtttga taccttatat 360 tacaaagcca tccactcaac ttccaggacc ttactgccct attccttgta actgcaaagt 420 cctatcccca tcaggacttc taatacattg tcaggagcgc aacattgaaa gcttatcaga 480 tctgagacct cctccgcaaa atcctagaaa gctcattcta gcgggaaata ttattcacag 540 tttaatgaat ccatcctttg gtccaaagca tctggaagag gaagaagaga ggaatgagaa 600 agaaggaagt gatgcaaaac atctccaaag aagtcttttg gaacaggaaa atcattcacc 660 actcacaggg tcaaatatga aatacaaaac cacgaaccaa tcaacagaat ttttatcctt 720 ccaagatgcc agctcattgt acagaaacat tttagaaaaa gaaagggaac ttcagcaact 780 gggaatcaca gaatacctaa ggaaaaacat tgctcagctc cagcctgata tggaggcaca 840 ttatcctgga gcccacgaag agctgaagtt aatggaaaca ttaatgtact cacgtccaag 900 
gaaggtatta gtggaacaga caaaaaatga gtattttgaa cttaaagcta atttacatgc 960 tgaacctgac tatttagaag tcctggagca gcaaacatag atggagagtt tgagggcttt 1020 cgcagaaatg ctgtgattct gttttaagtc cataccttgt aaataagtgc cttacgtgag 1080 tgtgtcatca atcagaacct aagcacagca gtaaactatg gggaaaaaaa aagaagaaga 1140 aaaagaaact cagggatcac tgggagaagc catggcatta tcttcaggca atttagtctg 1200 tcccaaataa aataaatcct tgcatgtaaa tcattcaagg attatagtaa tatttcatat 1260 actgaaaagt gtctcatagg agtcctcttg cacatctaaa aaggctgaac atttaagtat 1320 cccgaatttt cttgaattgc tttccctata gattaattac aattggattt catcatttaa 1380 aaaccatact tgtatatgta gttataatat gtaaggaata cattgtttat aaccagtatg 1440 tacttcaaaa atgtgtattg tcaaacatac ctaactttct tgcaataaat gcaaaagaaa 1500 ctgg 1504 8 3131 DNA Homo sapiens misc_feature (1)..(3131) "n" is A, C, G, or T 8 agcgtcgaca acaagaaata ctagaaaagg aggaaggaga acattgctgc agcttggatc 60 tacaacctaa gaaagcaaga gtgatcaatc tcagctctgt taaacatctt gtttacttac 120 tgcattcagc agcttgcaaa tggttaacta tatgcaaaaa agtcagcata gctgtgaagt 180 atgccgtgaa ttttaattga gggaaaaagg gacaattgct tcaggatgct ctagtatgca 240 ctctgcttga aatattttca atgaaatgct cagtattcta tctttgacca gaggttttaa 300 ctttatgaag ctatgggact tgacaaaaag tgatatttga gaagaaagta cgcagtggtt 360 ggtgttttct tttttttaat aaaggaattg aattactttg aacacctctt ccagctgtgc 420 attacagata acgtcaggaa gagtctctgc tttacaggta atcggatttc atcacatgac 480 aacatgaagc tgtggattca tctcttttat tcatctctcc tngcctgtat atctttacac 540 tcccaaactc cagtgctctc atccagaggc tcttgtgatt ctctttgcaa ttgtgaggaa 600 aaagatggca caatgctaat aaattgtgaa gcaaaaggta tcaagatggt atctgaaata 660 agtgtgccac catcacgacc tttccaacta agcttattaa ataacggctt gacgatgctt 720 cacacaaatg acttttctgg gcttaccaat gctatttcaa tacaccttgg atttaacaat 780 attgcagata ttgagatagg tgcatttaat ggccttggcc tcctgaaaca acttcatatc 840 aatcacaatt ctttagaaat tcttaaagag gatactttcc atggactgga aaacctggaa 900 ttcctgcaag cagataacaa ttttatcaca gtgattgaac caagtgcctt tagcaagctc 960 aacagactca aagtgttaat tttaaatgac aatgctattg agagtcttcc tccaaacatc 1020 ttccgatttg 
ttcctttaac ccatctagat cttcgtggaa atcaattaca aacattgcct 1080 tatgttggtt ttctcgaaca cattggccga atattggatc ttcagttgga ggacaacaaa 1140 tgggcctgca attgtgactt attgcagtta aaaacttggt tggagaacat gcctccacag 1200 tctataattg gtgatgttgt ctgcaacagc cctccatttt ttaaaggaag tatactcagt 1260 agactaaaga aggaatctat ttgccctact ccaccagtgt atgaagaaca tgaggatcct 1320 tcaggatcat tacatctggc agcaacatct tcaataaatg atagtcgcat gtcaactaag 1380 accacgtcca ttctaaaact acccaccaaa gcaccaggtt tgatacctta tattacaaag 1440 ccatccactc aacttccagg accttactgc cctattcctt gtaactgcaa agtcctatcc 1500 ccatcaggac ttctaataca ttgtcaggag cgcaacattg aaagcttatc agatctgaga 1560 cctcctccgc aaaatcctag aaagctcatt ctagcgggaa atattattca cagtttaatg 1620 aagtctgatc tagtggaata tttcactttg gaaatgcttc acttgggaaa caatcgtatt 1680 gaagttcttg aagaaggatc gtttatgaac ctaacgagat tacaaaaact ctatctaaat 1740 ggtaaccacc tgaccaaatt aagtaaaggc atgttccttg gtctccataa tcttgaatac 1800 ttatatcttg aatacaatgc cattaaggaa atactgccag gaacctttaa tccaatgcct 1860 aaacttaaag tcctgtattt aaataacaac ctcctccaag ttttaccacc acatattttt 1920 tcaggggttc ctctaactaa ggtaaatctt aaaacaaacc agtttaccca tctacctgta 1980 agtaatattt tggatgatct tgatttgcta acccagattg accttgagga taacccctgg 2040 gactgctcct gtgacctggt tggactgcag caatggatac aaaagttaag caagaacaca 2100 gtgacagatg acatcctctg cacttccccc gggcatctcg acaaaaagga attgaaagcc 2160 ctaaatagtg aaattctctg tccaggttta gtaaataacc catccatgcc aacacagact 2220 agttacctta tggtcaccac tcctgcaaca acaacaaata cggctgatac tattttacga 2280 tctcttacgg acgctgtgcc actgtctgtt ctaatattgg gacttctgat tatgttcatc 2340
actattgttt tctgtgctgc agggatagtg gttcttgttc ttcaccgcag gagaagatac 2400 aaaaagaaac aagtagatga gcaaatgaga gacaacagtc ctgtgcatct tcagtacagc 2460 atgtatggcc ataaaaccac tcatcacact actgaaagac cctctgcctc actctatgaa 2520 cagcacatgg tgagccccat ggttcatgtc tatagaagtc catcctttgg tccaaagcat 2580 ctggaagagg aagaagagag gaatgagaaa gaaggaagtg atgcaaaaca tctccaaaga 2640 agtcttttgg aacaggaaaa tcattcacca ctcacagggt caaatatgaa atacaaaacc 2700 acgaaccaat caacagaatt tttatccttc caagatgcca gctcattgta cagaaacatt 2760 ttagaaaaag aaagggaact tcagcaactg ggaatcacag aatacctaag gaaaaacatt 2820 gctcagctcc agcctgatat ggaggcacat tatcctggag cccacgaaga gctgaagtta 2880 atggaaacat taatgtactc acgtccaagg aaggtattag tggaacagac aaaaaatgag 2940 tattttgaac ttaaagctaa tttacatgct gaacctgact atttagaagt cctggagcag 3000 caaacataga tggagagttt gagggctttc gcagaaatgc tgtgattctg ttttaagtcc 3060 ataccttgta aataagtgcc ttacgtgagt gtgtcatcaa tcagaaccta agcacagcag 3120 taaactatgg g 3131 9 3227 DNA Homo sapiens 9 cagcagacgg cagacggtgc ccggcgccca cgggaagctg aagatacagc ggtatacaaa 60 cctccatgtc tcaagctgac cacacttgta tctggacttg ccagctgatt agctctgttc 120 ccaccaggct cgttcaaaga cccacagctt gagggggcag agggctgccc ctgatggggc 180 ctggcaatga ctgagcaggc ccagccccag aggacaagga agagaaggca tattgaggag 240 ggcaagaagt gacgcccggt gtagaatgac tgccctggga gggtggttcc ttgggccctg 300 gcagggttgc tgacccttac cctgcaaaac acaaagagca ggactccaga ctcttcttgt 360 gaatggtccc ctgccctgca gctccaccat gaggcttctc gtggccccac tcttgctagc 420 ttgggtggct ggtgccactg ccgctgtgcc cgtggtaccc tggcatgttc cctgcccccc 480 tcagtgtgcc tgccagatcc ggccctggta tacgccccgc tcgtcctacc gcgaggctac 540 cactgtggac tgcaatgacc tattcctgac ggcagtcccc ccggcactcc ccgcaggcac 600 acagaccctg ctcctgcaga gcaacagcat tgtccgtgtg gaccagagtg agctgggcta 660 cctggccaat ctcacagagc tggacctgtc ccagaacagc ttttcggatg cccgagactg 720 tgatttccat gcccttcccc agctgctgag cctgcaccta gaggagaacc agctgacccg 780 gctggaggac cacagctttg cagggctggc cagcctacag gaactctatc tcaaccacaa 840 ccagctctac cgcatcgccc ccagggcctt ttctggcctc 
agcaacttgc tgcggctgca 900 cctcaactcc aacctcctga gggccattga cagccgctgg tttgaaatgc tgcccaactt 960 ggagatactc atgattggcg gcaacaaggt agatgccatc ctggacatga acttccggcc 1020 cctggccaac ctgcgtagcc tggtgctagc aggcatgaac ctgcgggaga tctccgacta 1080 tgccctggag gggctgcaaa gcctggagag cctctccttc tatgacaacc agctggcccg 1140 ggtgcccagg cgggcactgg aacaggtgcc cgggctcaag ttcctagacc tcaacaagaa 1200 cccgctccag cgggtagggc cgggggactt tgccaacatg ctgcacctta aggagctggg 1260 actgaacaac atggaggagc tggtctccat cgacaagttt gccctggtga acctccccga 1320 gctgaccaag ctggacatca ccaataaccc acggctgtcc ttcatccacc cccgcgcctt 1380 ccaccacctg ccccagatgg agaccctcat gctcaacaac aacgctctca gtgccttgca 1440 ccagcagacg gtggagtccc tgcccaacct gcaggaggta ggtctccacg gcaaccccat 1500 ccgctgtgac tgtgtcatcc gctgggccaa tgccacgggc acccgtgtcc gcttcatcga 1560 gccgcaatcc accctgtgtg cggagcctcc ggacctccag cgcctcccgg tccgtgaggt 1620 gcccttccgg gagatgacgg accactgttt gcccctcatc tccccacgaa gcttcccccc 1680 aagcctccag gtagccagtg gagagagcat ggtgctgcat tgccgggcac tggccgaacc 1740 cgaacccgag atctactggg tcactccagc tgggcttcga ctgacacctg cccatgcagg 1800 caggaggtgc cgggtgtacc ccgaggggac cctggagctg cggagggtga cagcagaaga 1860 ggcagggcta tacacctgtg tggcccagaa cctggtgggg gctgacacta agacggttag 1920 tgtggttgtg ggccgtgctc tcctccagcc aggcagggac gaaggacagg ggctggagct 1980 ccgggtgcag gagacccacc cctatcacat cctgctatct tgggtcaccc cacccaacac 2040 agtgtccacc aacctcacct ggtccagtgc ctcctccctc cggggccagg gggccacagc 2100 tctggcccgc ctgcctcggg gaacccacag ctacaacatt acccgcctcc ttcaggccac 2160 ggagtactgg gcctgcctgc aagtggcctt tgctgatgcc cacacccagt tggcttgtgt 2220 atgggccagg accaaagagg ccacttcttg ccacagagcc ttaggggatc gtcctgggct 2280 cattgccatc ctggctctcg ctgtccttct cctggcagct gggctagcgg cccaccttgg 2340 cacaggccaa cccaggaagg gtgtgggtgg gaggcggcct ctccctccag cctgggcttt 2400 ctggggctgg agtgcccctt ctgtccgggt tgtgtctgct cccctcgtcc tgccctggaa 2460 tccagggagg aagctgccca gatcctcaga aggggagaca ctgttgccac cattgtctca 2520 aaattcttga agctcagcct gttctcagca gtagagaaat cactaggact 
actttttacc 2580 aaaagagaag cagtctgggc cagatgccct gccaggaaag ggacatggac ccacgtgctt 2640 gaggcctggc agctgggcca agacagatgg ggctttgtgg ccctgggggt gcttctgcag 2700 ccttgaaaaa gttgccctta cctcctaggg tcacctctgc tgccattctg aggaacatct 2760 ccaaggaacg ggagggactt tggctagagc ctcctgcctc cccatcttct ctctgcccag 2820 aggctcctgg gcctggcttg gctgtcccct acctgtgtcc ccgggctgca ccccttcctc 2880 ttctctttct ctgtacagtc tcagttgctt gctcttgtgc ctcctgggca agggctgaag 2940 gaggccactc catctcacct cggggggctg ccctcaatgt gggagtgacc ccagccagat 3000 ctgaaggaca tttgggagag ggatgcccag gaacgcctca tctcagcagc ctgggctcgg 3060 cattccgaag ctgactttct ataggcaatt ttgtaccttt gtggagaaat gtgtcacctc 3120 ccccaacccg attcactctt ttctcctgtt ttgtaaaaaa taaaaataaa taataacaat 3180 aatacggggg aaaggaacga aaggaactaa aaaaaaaaaa aaaaaaa 3227 10 3227 DNA Homo sapiens 10 cagcagacgg cagacggtgc ccggcgccca cgggaagctg aagatacagc ggtatacaaa 60 cctccatgtc tcaagctgac cacacttgta tctggacttg ccagctgatt agctctgttc 120 ccaccaggct cgttcaaaga cccacagctt gagggggcag agggctgccc ctgatggggc 180 ctggcaatga ctgagcaggc ccagccccag aggacaagga agagaaggca tattgaggag 240 ggcaagaagt gacgcccggt gtagaatgac tgccctggga gggtggttcc ttgggccctg 300 gcagggttgc tgacccttac cctgcaaaac acaaagagca ggactccaga ctcttcttgt 360 gaatggtccc ctgccctgca gctccaccat gaggcttctc gtggccccac tcttgctagc 420 ttgggtggct ggtgccactg ccgctgtgcc cgtggtaccc tggcatgttc cctgcccccc 480 tcagtgtgcc tgccagatcc ggccctggta tacgccccgc tcgtcctacc gcgaggctac 540 cactgtggac tgcaatgacc tattcctgac ggcagtcccc ccggcactcc ccgcaggcac 600 acagaccctg ctcctgcaga gcaacagcat tgtccgtgtg gaccagagtg agctgggcta 660 cctggccaat ctcacagagc tggacctgtc ccagaacagc ttttcggatg cccgagactg 720 tgatttccat gcccttcccc agctgctgag cctgcaccta gaggagaacc agctgacccg 780 gctggaggac cacagctttg cagggctggc cagcctacag gaactctatc tcaaccacaa 840 ccagctctac cgcatcgccc ccagggcctt ttctggcctc agcaacttgc tgcggctgca 900 cctcaactcc aacctcctga gggccattga cagccgctgg tttgaaatgc tgcccaactt 960 ggagatactc atgattggcg gcaacaaggt agatgccatc ctggacatga acttccggcc 
1020 cctggccaac ctgcgtagcc tggtgctagc aggcatgaac ctgcgggaga tctccgacta 1080 tgccctggag gggctgcaaa gcctggagag cctctccttc tatgacaacc agctggcccg 1140 ggtgcccagg cgggcactgg aacaggtgcc cgggctcaag ttcctagacc tcaacaagaa 1200 cccgctccag cgggtagggc cgggggactt tgccaacatg ctgcacctta aggagctggg 1260 actgaacaac atggaggagc tggtctccat cgacaagttt gccctggtga acctccccga 1320 gctgaccaag ctggacatca ccaataaccc acggctgtcc ttcatccacc cccgcgcctt 1380 ccaccacctg ccccagatgg agaccctcat gctcaacaac aacgctctca gtgccttgca 1440 ccagcagacg gtggagtccc tgcccaacct gcaggaggta ggtctccacg gcaaccccat 1500 ccgctgtgac tgtgtcatcc gctgggccaa tgccacgggc acccgtgtcc gcttcatcga 1560 gccgcaatcc accctgtgtg cggagcctcc ggacctccag cgcctcccgg tccgtgaggt 1620 gcccttccgg gagatgacgg accactgttt gcccctcatc tccccacgaa gcttcccccc 1680 aagcctccag gtagccagtg gagagagcat ggtgctgcat tgccgggcac tggccgaacc 1740 cgaacccgag atctactggg tcactccagc tgggcttcga ctgacacctg cccatgcagg 1800 caggaggtgc cgggtgtacc ccgaggggac cctggagctg cggagggtga cagcagaaga 1860 ggcagggcta tacacctgtg tggcccagaa cctggtgggg gctgacacta agacggttag 1920 tgtggttgtg ggccgtgctc tcctccagcc aggcagggac gaaggacagg ggctggagct 1980 ccgggtgcag gagacccacc cctatcacat cctgctatct tgggtcaccc cacccaacac 2040 agtgtccacc aacctcacct ggtccagtgc ctcctccctc cggggccagg gggccacagc 2100 tctggcccgc ctgcctcggg gaacccacag ctacaacatt acccgcctcc ttcaggccac 2160 ggagtactgg gcctgcctgc aagtggcctt tgctgatgcc cacacccagt tggcttgtgt 2220 atgggccagg accaaagagg ccacttcttg ccacagagcc ttaggggatc gtcctgggct 2280 cattgccatc ctggctctcg ctgtccttct cctggcagct gggctagcgg cccaccttgg 2340 cacaggccaa cccaggaagg gtgtgggtgg gaggcggcct ctccctccag cctgggcttt 2400 ctggggctgg agtgcccctt ctgtccgggt tgtgtctgct cccctcgtcc tgccctggaa 2460 tccagggagg aagctgccca gatcctcaga aggggagaca ctgttgccac cattgtctca 2520 aaattcttga agctcagcct gttctcagca gtagagaaat cactaggact actttttacc 2580 aaaagagaag cagtctgggc cagatgccct gccaggaaag ggacatggac ccacgtgctt 2640 gaggcctggc agctgggcca agacagatgg ggctttgtgg ccctgggggt gcttctgcag 2700 
ccttgaaaaa gttgccctta cctcctaggg tcacctctgc tgccattctg aggaacatct 2760 ccaaggaacg ggagggactt tggctagagc ctcctgcctc cccatcttct ctctgcccag 2820 aggctcctgg gcctggcttg gctgtcccct acctgtgtcc ccgggctgca ccccttcctc 2880 ttctctttct ctgtacagtc tcagttgctt gctcttgtgc ctcctgggca agggctgaag 2940 gaggccactc catctcacct cggggggctg ccctcaatgt gggagtgacc ccagccagat 3000 ctgaaggaca tttgggagag ggatgcccag gaacgcctca tctcagcagc ctgggctcgg 3060 cattccgaag ctgactttct ataggcaatt ttgtaccttt gtggagaaat gtgtcacctc 3120 ccccaacccg attcactctt ttctcctgtt ttgtaaaaaa taaaaataaa taataacaat 3180 aatacggggg aaaggaacga aaggaactaa aaaaaaaaaa aaaaaaa 3227 11 592 DNA Homo sapiens misc_feature (1)..(592) "n" is A, C, G, or T 11 ttanntntcc tagcagtatt tagcaccttt ttgccacctt ggtgaacaga aaattgtatt 60 ttcctgtctt tcatggctga aaacaaaagt aatgggaatt ttaaatacgt ttgcnganac 120 tgcccctccc ctcattgagg gtcactgctc aagagtgcag gagtggactc tccactgatg 180 ggtctccctc cccatcctgg tttccacccc gggctggcta gctctgttgg tttgnagact 240 ganagccagc ctggctcatt ctcattattg gctagttagc tttctttatc aacctgctca 300 ctcacaaatg tgtgccctca gccagagagt aagaaagccc aaatctgtta cagcttctaa 360 aaaaatagat ttctaatttg tcctactcat gttaggagca ttatctttga aggtaaaaca 420 tagtgtatca ttgtgtaaac tcccaggctt gatgtagcag aagagatcat ttctggaggc 480 ttcagcaatg ggaatttagc attataagag agatttggac naaccagtcc aaagtggtcc 540 gagttcttaa aatcccaggg tagggnaact ccactccttc ctttcttctn tg 592 12 5036 DNA Homo sapiens 12 gaacaggcac agggcctgca gctgcagcat tcagtgcatc ccagaccctc cagaggccca 60 gggcagggtc agatcagaca gacctggcag cacagacgtg aagtgtaaac ggcacacagc 120 cccctggact tgggttgatt cttgtatctt gctgctcacc agctatgtga cttgagccaa 180 gttctttaac ctcttcagga cctcagtctc ttaatctgca aactggaggt attagtgctt 240 atctcatgta cctgttatga gagtcatatg aaaaaagaca ttcagagtct tcaaagagtc 300 ctgagcaatg gtaggtgctc agtaagtgtt tgtgcatttc cttgacactt gtagctaaaa 360 acttacctgg aatcagactt ccactcagga gtcagaccag gctcaagaag gcaacacagt 420 ctggcccctt tttgtttcca tctttggttc taggccaagg ctggacttgc ctcatggccc 480 tggcatctga gagagaaact 
ggataggttc ctcatgctca ggatggagga aacagcagtg 540 gcttgcattg ttgattgcag gggtgatgtc taatctcctg aagaaaggaa ctctggagtc 600 acagagattg attcatacaa tcaacacaca ctgggcaccc atcccgtact acaggcacaa 660 gaggcaagac aggtctctag ttagagacgt tttcccacgg tgtggtcaga actgtgacag 720 taggaagcat gagtactgca agagcataaa gccagcccct tagtcagcca ggtggggagg 780 ttttggagaa gacttctgca ggtggtgact gaacttaaag ttgagcgttg aagaacaagt 840 aggagttgtc tggaagacaa tgaggaggag acatcagaag gggtggcatg tcacggtaac 900 ctcaacccaa aaggtttgtg tgcagcgggt aagtggggtg gtgacaagag aagccaggtg 960 aaggggccgg gccaggtcac agaaggtctt acctgcacag ctgaggagct cagacccagg 1020 cttcttaact ccactgcctc ctcctgggca acactgccag gctgttggga aaggagaaac 1080 actcctgcct ccagggtggg gaagttggca aacaccaaag caatatcccc acccagaaga 1140 tcctgggcag cagaacaccc cagcctgcct ggatgtgtca ggtctgctgc ctttcacctc 1200 tagggtcctg gggagcccat cccagatttt agaagctgta cttatttcca ccaaccctca 1260 cccacctccc acctacactg ccagaggaat ttctggaaac ttatcatttt atcttccatt 1320 tacccagtga gccaaggagg ctgggaggaa agaggtaaga aaggttagag aacctacctc 1380 acatctctct gggctcagaa ggactctgaa gataacaata atttcagccc atccactctc 1440 cttccctccc aaacacacat gtgcatgtac acacacacat acacacacat acaccttcct 1500 ctccttcact gaagactcac agtcactcac tctgtgagca ggtcatagaa aaggacacta 1560 aagccttaag gacgggcctg gccattacct ctgcagctcc tttggcttgt tgagtcaaaa 1620 aacatgggag gggccaggca cggtgactca cacctgtaat cccagcattt tgggagaccg 1680 aggtgagcag atcacttgag gtcaggagtt cgagaccagc ctggccaaca tggagaaacc 1740 cccatctcta ctaaaaatac aaaaattagc caggagtggt ggcaggtgcc tgtaatccca 1800 gctactcagg tggctgagcc aggagaatcg cttgaatcca ggaggcggag gatgcagtca 1860 gctgagtgca ccgctgcact ccagcctggg tgacagaatg agactctgtc tcaaacaaac 1920 aaacacggga ggaggggtag atactgcttc tctgcaacct ccttaactct gcatcctctt 1980 cttccagggc tgcccctgat ggggcctggc aatgactgag caggcccagc cccagaggac 2040 aaggaagaga aggcatattg aggagggcaa gaagtgacgc ccggtgtaga atgactgccc 2100 tgggagggtg gttccttggg ccctggcagg gttgctgacc cttaccctgc aaaacacaaa 2160 gagcaggact ccagactctt cttgtgaatg 
gtcccctgcc ctgcagctcc accatgaggc 2220 ttctcgtggc cccactcttg ctagcttggg tggctggtgc cactgccgct gtgcccgtgg 2280 taccctggcg tgttccctgc ccccctcagt gtgcctgcca gatccggccc tggtatacgc 2340 cccgctcgtc ctaccgcgag gctaccactg tggactgcaa tgacctattc ctgacggcag 2400 tccccccggc actccccgca ggcacacaga ccctgctcct gcagagcaac agcattgtcc 2460 gtgtggacca gagtgagctg ggctaccagg ccaatctcac agagctggac ctgtcccaga 2520 acagcttttc ggatgcccga gactgtgatt tccatgccct gccccagctg ctgagcctgc 2580 acctagagga gaaccagctg acccggctgg aggaccacag ctttgcaggg ctggccagcc 2640 tacaggaact ctatctcaac cacaaccagc tctaccgcat cgcccccagg gccttttctg 2700 gcctcagcaa cttgctgcgg ctgcacctca actccaacct cctgagggcc attgacagcc 2760 gctggtttga aatgctgccc aacttggaga tactcatgat tggcggcaac aaggtagatg 2820 ccatcctgga catgaacttc cggcccctgg ccaacctgcg tagcctggtg ctagcaggca 2880 tgaacctgcg ggagatctcc gactatgccc tggaggggct gcaaagcctg gagagcctct 2940 ccttctatga caaccagctg gcccgggtgc ccaggcgggc actggaacag gtgcccgggc 3000 tcaagttcct agacctcaac aagaacccgc tccagcgggt agggccgggg gactttgcca 3060 acatgctgca ccttaaggag ctgggactga acaacatgga ggagctggtc tccatcgaca 3120 agtttgccct ggtgaacctc cccgagctga ccaagctgga catcaccaat aacccacggc 3180 tgtccttcat ccacccccgc gccttccacc acctgcccca gatggagacc ctcatgctca 3240 acaacaacgc tctcagtgcc ttgcaccagc agacggtgga gtccctgccc aacctgcagg 3300 aggtaggtct ccacggcaac cccatccgct gtgactgtgt catccgctgg gccaatgcca 3360 cgggcacccg tgtccgcttc atcgagccgc aatccaccct gtgtgcggag cctccggacc 3420 tccagcgcct cccggtccgt gaggtgccct tccgggagat gacggaccac tgtttgcccc 3480 tcatctcccc acgaagcttc cccccaagcc tccaggtagc cagtggagag agcatggtgc 3540 tgcattgccg ggcactggcc gaacccgaac ccgagatcta ctgggtcact ccagctgggc 3600 ttcgactgac acctgcccat gcaggcagga ggtaccgggt gtaccccgag gggaccctgg 3660 agctgcggag ggtgacagca gaagaggcag ggctatacac ctgtgtggcc cagaacctgg 3720 tgggggctga cactaagacg gttagtgtgg ttgtgggccg tgctctcctc cagccaggca 3780 gggacgaagg acaggggctg gagctccggg tgcaggagac ccacccctat cacatcctgc 3840 tatcttgggt caccccaccc aacacagtgt ccaccaacct 
cacctggtcc agtgcctcct 3900 ccctccgggg ccagggggcc acagctctgg cccgcctgcc tcggggaacc cacagctaca 3960 acattacccg cctccttcag gccacggagt actgggcctg cctgcaagtg gcctttgctg 4020 atgcccacac ccagttggct tgtgtatggg ccaggaccaa agaggccact tcttgccaca 4080 gagccttagg ggatcgtcct gggctcattg ccatcctggc tctcgctgtc cttctcctgg 4140 cagctgggct agcggcccac cttggcacag gccaacccag gaagggtgtg ggtgggaggc 4200 ggcctctccc tccagcctgg gctttctggg gctggagtgc cccttctgtc cgggttgtgt 4260 ctgctcccct cgtcctgccc tggaatccag ggaggaagct gcccagatcc tcagaagggg 4320 agacactgtt gccaccattg tctcaaaatt cttgaagctc agcctgttct cagcagtaga 4380 gaaatcacta ggactacttt ttaccaaaag agaagcagtc tgggccagat gccctgccag 4440 gaaagggaca tggacccacg tgcttgaggc ctggcagctg ggccaagaca gatggggctt 4500 tgtggccctg ggggtgcttc tgcagccttg aaaaagttgc ccttacctcc tagggtcacc 4560 tctgctgcca ttctgaggaa catctccaag gaacaggagg gactttggct agagcctcct 4620 gcctccccat cttctctctg cccagaggct cctgggcctg gcttggctgt cccctacctg 4680 tgtccccggg ctgcacccct tcctcttctc tttctctgta cagtctcagt tgcttgctct 4740 tgtgcctcct gggcaagggc tgaaggaggc cactccatct cacctcgggg ggctgccctc 4800 aatgtgggag tgaccccagc cagatctgaa ggacatttgg gagagggatg cccaggaacg 4860 cctcatctca gcagcctggg ctcggcattc cgaagctgac tttctatagg caattttgta 4920 cctttgtgga gaaatgtgtc acctccccca acccgattca ctcttttctc ctgttttgta 4980 aaaaataaaa ataaataata acaataaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa 5036 13 3207 DNA Homo sapiens 13 gagggcgccc gccgagcctc cccggcctgt gcagacggcg cgcgcggcgg gagggcgcgg 60 accagcgtcc ccagcccggc cccgggcgga aggcgagcgg agcgcggccg cgcgggcagc 120 agacggcaga cggtgcccgg cgcccacggg ggctgcccct gatggggcct ggcaatgact 180 gagcaggccc agccccagag gacaaggaag agaaggcata ttgaggaggg caagaagtga 240 cgcccggtgt agaatgactg ccctgggagg gtggttcctt gggccctggc agggttgctg 300 acccttaccc tgcaaaacac aaagagcagg actccagact cttcttgtga atggtcccct 360 gccctgcagc tccaccatga ggcttctcgt ggccccactc ttgctagctt gggtggctgg 420 tgccactgcc gctgtgcccg tggtaccctg gcatgttccc tgcccccctc agtgtgcctg 480 ccagatccgg ccctggtata cgccccgctc 
gtcctaccgc gaggctacca ctgtggactg 540 caatgaccta ttcctgacgg cagtcccccc ggcactcccc gcaggcacac agaccctgct 600 cctgcagagc aacagcattg tccgtgtgga ccagagtgag ctgggctacc tggccaatct 660 cacagagctg gacctgtccc agaacagctt ttcggatgcc cgagactgtg atttccatgc 720 cctgccccag ctgctgagcc tgcacctaga ggagaaccag ctgacccggc tggaggacca 780 cagctttgca gggctggcca gcctacagga actctatctc aaccacaacc agctctaccg 840 catcgccccc agggcctttt ctggcctcag caacttgctg cggctgcacc tcaactccaa 900 cctcctgagg gccattgaca gccgctggtt tgaaatgctg cccaacttgg agatactcat 960 gattggcggc aacaaggtag atgccatcct ggacatgaac ttccggcccc tggccaacct 1020 gcgtagcctg gtgctagcag gcatgaacct gcgggagatc tccgactatg ccctggaggg 1080 gctgcaaagc ctggagagcc tctccttcta tgacaaccag ctggcccggg tgcccaggcg 1140 ggcactggaa caggtgcccg ggctcaagtt cctagacctc aacaagaacc cgctccagcg 1200 ggtagggccg ggggactttg ccaacatgct gcaccttaag gagctgggac tgaacaacat 1260 ggaggagctg gtctccatcg acaagtttgc cctggtgaac ctccccgagc tgaccaagct 1320 ggacatcacc aataacccac ggctgtcctt catccacccc cgcgccttcc accacctgcc 1380 ccagatggag accctcatgc tcaacaacaa cgctctcagt gccttgcacc agcagacggc 1440 ggagtccctg cccaacctgc aggaggtagg tctccacggc aaccccatcc gctgtgactg 1500 tgtcatccgc tgggccaatg ccacgggcac ccgtgtccgc ttcatcgagc cgcaatccac 1560 cctgtgtgcg gagcctccgg acctccagcg cctcccggtc cgtgaggtgc ccttccggga 1620 gatgacggac cactgtttgc ccctcatctc cccacgaagc ttccccccaa gcctccaggt 1680 agccagtgga gagagcatgg tgctgcattg ccgggcactg gccgaacccg aacccgagat 1740 ctactgggtc actccagctg ggcttcgact gacacctgcc catgcaggca ggaggtaccg 1800 ggtgtacccc gaggggaccc tggagctgcg gagggtgaca gcagaagagg cggggctata 1860 cacctgtgtg gcccagaacc tggtgggggc tgacactaag acggttagtg tggttgtggg 1920 ccgtgctctc ctccagccag
gcagggacga aggacagggg ctggagctcc gggtgcagga 1980 gacccacccc tatcacatcc tgctatcttg ggtcacccca cccaacacag tgtccaccaa 2040 cctcacctgg tccagtgcct cctccctccg gggccagggg gccacagctc tggcccgcct 2100 gcctcgggga acccacagct acaacattac ccgcctcctt caggccacgg agtactgggc 2160 ctgcctgcaa gtggcctttg ctgatgccca cacccagttg gcttgtgtat gggccaggac 2220 caaagaggcc acttcttgcc acagagcctt aggggaccgt cctgggctca ttgccatcct 2280 ggctctcgct gtccttctcc tggcagctgg gctagcggcc caccttggca caggccaacc 2340 caggaagggt gtgggtggga ggcggcctct ccctccagcc tgggctttct ggggctggag 2400 tcccccttct gtccgggttg tgtctgctcc cctcgtcctg ccctggaatc cagggaggaa 2460 gctgcccaga tcctcagaag gggagacact gttgccacca ttgtctcaaa attcttgaag 2520 ctcagcctgt tctcagcagt agagaaatca ctaggactac tttttaccaa aagagaagca 2580 gtctgggcca gatgccctgc caggaaaggg acatggaccc acgtgcttga ggcctggcag 2640 ctgggccaag acagatgggg ctttgtggcc ctgggggtgc ttctgcagcc tcgaaaaagt 2700 tgcccttacc tcctagggtc acctctgctg ccattctgag gaacatctcc aaggaacagg 2760 agggactttg gctagagcct cctgcctccc catcttctct ctgcccagag gctcctgggc 2820 ctggcttggc tgtcccctac ctgtgtcccc gggctgcacc ccttcctctt ctctttctct 2880 gtacagtctc agttgcttgc tcttgtgcct cctgggcaag ggctgaagga ggccactcca 2940 tctcacctcg gggggctgcc ctcaatgtgg gagtgacccc agccagatct gaaggacatt 3000 tgggagaggg atgcccagga acgcctcatc tcagcagcct gggctcggca ttccgaagct 3060 gactttctat aggcaatttt gtacctttgt ggagaaatgt gtcacctccc ccaacccgat 3120 tcactctttt ctcctgtttt gtaaaaaata aaaataaata ataacaataa tacgggggaa 3180 aggaacgaaa ggaaaaaaaa aaaaaaa 3207 14 3170 DNA Homo sapiens 14 ggctcaccga caacttcatc gccgccgtgc gccgccgaga cttcgccaac atgaccagcc 60 tggtgcacct cactctctcc cggaacacca tcggccaggt ggcagctggc gccttcgccg 120 acctgcgtgc cctccgggcc ctgcacctgg acagcaaccg cctggcggag gtgcgcggcg 180 accagctccg cggcctgggc aacctccgcc acctgatcct tggaaacaac cagatccgcc 240 gggtggagtc ggcggccttt gacgccttcc tgtccaccgt ggaggacctg gatctgtcct 300 acaacaacct ggaggccctg ccgtgggagg cggtgggcca gatggtgaac ctaaacaccc 360 tcacgctgga ccacaacctc atcgaccaca tcgcggaggg 
gaccttcgtg cagcttcaca 420 agctggtccg tctggacatg acctccaacc gcctgcataa actcccgccc gacgggctct 480 tcctgaggtc gcagggcacc gggcccaagc cgcccacccc gctgaccgtc agcttcggcg 540 gcaaccccct gcactgcaac tgcgagctgc tctggctgcg gcggctgacc cgcgaggacg 600 acttagagac ctgcgccacg cccgaacacc tcaccgaccg ctacttctgg tccatccccg 660 aggaggagtt cctgtgtgag cccccgctga tcacacggca ggcggggggc cgggccctgg 720 tggtggaagg ccaggcggtg agcctgcgct gccgagcggt gggtgacccc gagccggtgg 780 tgcactgggt ggcacctgat gggcggctgc tggggaactc cagccggacc cgggtccggg 840 gggacgggac gctggatgtg accatcacca ccttgaggga cagtggcacc ttcacttgta 900 tcgcctccaa tgctgctggg gaagcgacgg cgcccgtgga ggtgtgcgtg gtacctctgc 960 ctctgatggc acccccgccg gctgccccgc cgcctctcac cgagcccggc tcctctgaca 1020 tcgccacgcc gggcagacca ggtgccaacg attctgcggc tgagcgtcgg ctcgtggcag 1080 ccgagctcac ctcgaactcc gtgctcatcc gctggccagc ccagaggcct gtgcccggaa 1140 tacgcatgta ccaggttcag tacaacagtt ccgttgatga ctccctcgtc tacaggatga 1200 tcccgtccac cagtcagacc ttcctggtga atgacctggc ggcgggccgt gcctacgact 1260 tgtgcgtgct ggcggtctac gacgacgggg ccacagcgct gccggcaacg cgagtggtgg 1320 gctgtgtaca gttcaccacc gctggggatc cggcgccctg ccgcccgctg agggcccatt 1380 tcttgggcgg caccatgatc atcgccatcg ggggcgtcat cgtcgcctcg gtcctcgtct 1440 tcatcgttct gctcatgatc cgctataagg tgtatggcga cggggacagc cgccgcgtca 1500 agggctccag gtcgctcccg cgggtcagcc acgtgtgctc gcagaccaac ggcgcaggca 1560 caggcgcggc acaggccccg gccctgccgg cccaggacca ctacgaggcg ctgcgcgagg 1620 tggagtccca ggctgccccc gccgtcgccg tcgaggccaa ggccatggag gccgagacgg 1680 catccgcgga gccggaggtg gtccttggac gttctctggg cggctcggcc acctcgctgt 1740 gcctgctgcc atccgaggaa acttccgggg aggagtctcg ggccgcggtg ggccctcgaa 1800 ggagccgatc cggcgccctg gagccaccaa cctcggcgcc ccctactcta gctctagttc 1860 ctgggggagc cgcggcccgg ccgaggccgc agcagcgcta ttcgttcgac ggggactacg 1920 gggcactatt ccagagccac agttacccgc gccgcgcccg gcggacaaag cgccaccggt 1980 ccacgccgca cctggacggg gctggagggg gcgcggccgg ggaggatgga gacctggggc 2040 tgggctccgc cagggcgtgc ctggctttca ccagcaccga gtggatgctg gagagtaccg 
2100 tgtgagcggc gggcgggcgc cgggacgcct gggtgccgca gaccaaacgc ccagccgcac 2160 ggacgctggg gcgggactgg gagaaagcgc agcgccaaga cattggacca gagtggagac 2220 gcgcccttgt ccccgggagg gggcggggca gcctcgggct gcggctcgag gccacgcccc 2280 cgtgcccagg gcggggttcg gggaccggct gccggcctcc cttcccctat ggactcctcg 2340 acccccctcc tacccctccc ctcgcgcgct cgcggacctc gctggagccg gtgccttaca 2400 cagcgaagcg cggggagggg cagggccccc tgacactgca gcactgagac acgagccccc 2460 tcccccagcc cgtcacccgg ggccggggcg aggggcccat ttcttgtatc tggctggact 2520 agatcctatt ctgtcccgcg gcggcctcca aagcctccca ccccacccca cgcacattcc 2580 tggtccggtc gggtctggct tggggtcccc ctttctctgt ttccctcgtt tgtctctatc 2640 ccgccctctt gtcgtctctc tgtagtgcct gtctttccct atttgcctct cctttctctc 2700 tgtcctgtcg tctcttgtcc ctcggccctc cctggttttg tctagtctcc ctgtctctcc 2760 tgatttcttc tctttactca ttctcccggg caggtcccac tggaaggacc agactctccc 2820 aaataaatcc ccacacgaac aaaatccaaa accaaatccc cctccctacc ggagccggga 2880 ccctccgccg cagcagaatt aaactttttt ctgtgtctga ggccctgctg acctgtgtgt 2940 gtgtctgtat gtgtgtccgc gtgtagtgtg tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 3000 tgtgtgttgg gggagggtga cctagattgc agcataagga ctctaagtga gactgaagga 3060 agatgggaag atgactaact ggggccggag gagactggca gacaggcttt tatcctctga 3120 gagacttaga ggtggggaat aatcacaaaa ataaaatgat cataatagct 3170 15 2053 DNA Homo sapiens 15 ccggctcgcg ccctccgggc ccagcctccc gagccttcgg agcgggcgcc gtcccagccc 60 agctccgggg aaacgcgagc cgcgatgcct ggggggtgct cccggggccc cgccgccggg 120 gacgggcgtc tgcggctggc gcgactagcg ctggtactcc tgggctgggt ctcctcgtct 180 tctcccacct cctcggcatc ctccttctcc tcctcggcgc cgttcctggc ttccgccgtg 240 tccgcccagc ccccgctgcc ggaccagtgc cccgcgctgt gcgagtgctc cgaggcagcg 300 cgcacagtca agtgcgttaa ccgcaatctg accgaggtgc ccacggacct gcccgcctac 360 gtgcgcaacc tcttccttac cggcaaccag ctggccgtgc tccctgccgg cgccttcgcc 420 cgccggccgc cgctggcgga gctggccgcg ctcaacctca gcggcagccg cctggacgag 480 gtgcgcgcgg gcgccttcga gcatctgccc agcctgcgcc agctcgacct cagccacaac 540 ccactggccg acctcagtcc cttcgctttc tcgggcagca atgccagcgt ctcggccccc 600 
agtccccttg tggaactgat cctgaaccac atcgtgcccc ctgaagatga gcggcagaac 660 cggagcttcg agggcatggt ggtggcggcc ctgctggcgg gccgtgcact gcaggggctc 720 cgccgcttgg agctggccag caaccacttc ctttacctgc cgcgggatgt gctggcccaa 780 ctgcccagcc tcaggcacct ggacttaagt aataattcgc tggtgagcct gacctacgtg 840 tccttccgca acctgacaca tctagaaagc ctccacctgg aggacaatgc cctcaaggtc 900 cttcacaatg gcaccctggc tgagttgcaa ggtctacccc acattagggt tttcctggac 960 aacaatccct gggtctgcga ctgccacatg gcagacatgg tgacctggct caaggaaaca 1020 gaggtagtgc agggcaaaga ccggctcacc tgtgcatatc cggaaaaaat gaggaatcgg 1080 gtcctcttgg aactcaacag tgctgacctg gactgtgacc cgattcttcc cccatccctg 1140 caaacctctt atgtcttcct gggtattgtt ttagccctga taggcgctat tttcctcctg 1200 gttttgtatt tgaaccgcaa ggggataaaa aagtggatgc ataacatcag agatgcctgc 1260 agggatcaca tggaagggta tcattacaga tatgaaatca atgcggaccc cagattaaca 1320 aacctcagtt ctaactcgga tgtctgagaa atattagagg acagaccaag gacaactctg 1380 catgagatgt agacttaagc tttatcccta ctaggcttgc tccactttca tcctccacta 1440 tagatacaac ggactttgac taaaagcagt gaaggggatt tgcttccttg ttatgtaaag 1500 tttctcggtg tgttctgtta atgtaagacg atgaacagtt gtgtatagtg ttttaccctc 1560 ttctttttct tggaactcct caacacgtat ggagggattt ttcaggtttc agcatgaaca 1620 tgggcttctt gctgtctgtc tctctctcag tacagttcaa ggtgtagcaa gtgtacccac 1680 acagatagca ttcaacaaaa gctgcctcaa ctttttcgag aaaaatactt tattcataaa 1740 tatcagtttt attctcatgt acctaagttg tggagaaaat aattgcatcc tataaactgc 1800 ctgcagacgt tagcaggctc ttcaaaataa ctccatggtg cacaggagca cctgcatcca 1860 agagcatgct tacattttac tgttctgcat attacaaaaa ataacttgca acttcataac 1920 ttctttgaca aagtaaatta cttttttgat tgcagtttat atgaaaatgt actgattttt 1980 ttttaataaa ctgcatcgag atccaaccga ctgaattgtt aaaaaaaaaa aaaaataaag 2040 attcttaaaa gaa 2053 16 973 DNA Homo sapiens 16 cagcccagct ccggggaaac gcgagccgcg atgcctgggg ggtgctcccg gggccccgcc 60 gccggggacg ggcgtctgcg gctggcgcga ctagcgctgg tactcctggg ctgggtctcc 120 tcgtcttctc ccacctcctc ggcatcctcc ttctcctcct cggcgccgtt cctggcttcc 180 gccgtgtccg cccagccccc gctgccggac cagtgccccg 
cgctgtgcga gtgctccgag 240 gcagcgcgca cagtcaagtg cgttaaccgc aatctgaccg aggtgcccac ggacctgccc 300 gcctacgtgc gcaacctctt ccttaccagc aaccacttcc tttacctgcc gcgggatgtg 360 ctggcccaac tgcccagcct caggcacctg gacttaagta ataattcgct ggtgagcctg 420 acctacgtgt ccttccgcaa cctgacacat ctagaaagcc tccacctgga ggacaatgcc 480 ctcaaggtcc ttcacaatgg caccctggct gagttgcaag gtctacccca cattagggtt 540 ttcctggaca acaatccctg ggtctgcgac tgccacatgg cagacatggt gacctggctc 600 aaggaaacag aggtagtgca gggcaaagac cggctcacct gtgcatatcc ggaaaaaatg 660 aggaatcggg tcctcttgga actcaacagt gctgacctgg actgtgaccc gattcttccc 720 ccatccctgc aaacctctta tgtcttcctg ggtattgttt tagccctgat aggcgctatt 780 ttcctcctgg ttttgtattt gaaccgcaag gggataaaaa agtggatgca taacatcaga 840 gatgcctgca gggatcacat ggaagggtat cattacagat atgaaatcaa tgcggacccc 900 agattaacga acctcagttc taactcggat gtctgagaaa tattagagga cagaccaagg 960 acaactctgc atg 973 17 1331 DNA Homo sapiens 17 cagcccagct ccggggaaac gcgagccgcg atgcctgggg ggtgctcccg gggccccgcc 60 gccggggacg ggcgtctgcg gctggcgcga ctagcgctgg tactcctggg ctgggtctcc 120 tcgtcttctc ccacctcctc ggcatcctcc ttctcctcct cggcgccgtt cctggcttcc 180 gccgtgtccg cccagccccc gctgccggac cagtgccccg cgctgtgcga gtgctccgag 240 gcagcgcgca cagtcaagtg cgttaaccgc aatctgaccg aggtgcccac ggacctgccc 300 gcctacgtgc gcaacctctt ccttaccggc aaccagctgg ccgtgctccc tgccggcgcc 360 ttcgcccgcc ggccgccgct ggcggagctg gccgcgctca acctcagcgg cagccgcctg 420 gacgaggtgc gcgcgggcgc cttcgagcat ctgcccagcc tgcgccagct cgacctcagc 480 cacaacccac tggccgacct cagtcccttc gctttctcgg gcagcaatgc cagcgtctcg 540 gcccccagtc cccttgtgga actgatcctg aaccacatcg tgccccctga agatgagcgg 600 cagaaccgga gcttcgaggg catggtggtg gcggccctgc tggcgggccg tgcactgcag 660 gggctccgcc gcttggagct ggccagcaac cacttccttt acctgccgcg ggatgtgctg 720 gcccaactgc ccagcctcag gcacctggac ttaagtaata attcgctggt gagcctgacc 780 tacgtgtcct tccgcaacct gacacatcta gaaagcctcc acctggagga caatgccctc 840 aaggtccttc acaatggcac cctggctgag ttgcaaggtc taccccacat tagggttttc 900 ctggacaaca atccctgggt ctgcgactgc 
cacatggcag acatggtgac ctggctcaag 960 gaaacagagg tagtgcaggg caaagaccgg ctcacctgtg catatccgga aaaaatgagg 1020 aatcgggtcc tcttggaact caacagtgct gacctggact gtgacccgat tcttccccca 1080 tccctgcaaa cctcttatgt cttcctgggt attgttttag ccctgatagg cgctattttc 1140 ctcctggttt tgtatttgaa ccgcaagggg ataaaaaagt ggatgcataa catcagagat 1200 gcctgcaggg atcacatgga agggtatcat tacagatatg aaatcaatgc ggaccccaga 1260 ttaacaaacc tcagttctaa ctcggatgtc tgagaaatat tagaggacag accaaggaca 1320 actctgcatg a 1331 18 2053 DNA Homo sapiens 18 ccggctcgcg ccctccgggc ccagcctccc gagccttcgg agcgggcgcc gtcccagccc 60 agctccgggg aaacgcgagc cgcgatgcct ggggggtgct cccggggccc cgccgccggg 120 gacgggcgtc tgcggctggc gcgactagcg ctggtactcc tgggctgggt ctcctcgtct 180 tctcccacct cctcggcatc ctccttctcc tcctcggcgc cgttcctggc ttccgccgtg 240 tccgcccagc ccccgctgcc ggaccagtgc cccgcgctgt gcgagtgctc cgaggcagcg 300 cgcacagtca agtgcgttaa ccgcaatctg accgaggtgc ccacggacct gcccgcctac 360 gtgcgcaacc tcttccttac cggcaaccag ctggccgtgc tccctgccgg cgccttcgcc 420 cgccggccgc cgctggcgga gctggccgcg ctcaacctca gcggcagccg cctggacgag 480 gtgcgcgcgg gcgccttcga gcatctgccc agcctgcgcc agctcgacct cagccacaac 540 ccactggccg acctcagtcc cttcgctttc tcgggcagca atgccagcgt ctcggccccc 600 agtccccttg tggaactgat cctgaaccac atcgtgcccc ctgaagatga gcggcagaac 660 cggagcttcg agggcatggt ggtggcggcc ctgctggcgg gccgtgcact gcaggggctc 720 cgccgcttgg agctggccag caaccacttc ctttacctgc cgcgggatgt gctggcccaa 780 ctgcccagcc tcaggcacct ggacttaagt aataattcgc tggtgagcct gacctacgtg 840 tccttccgca acctgacaca tctagaaagc ctccacctgg aggacaatgc cctcaaggtc 900 cttcacaatg gcaccctggc tgagttgcaa ggtctacccc acattagggt tttcctggac 960 aacaatccct gggtctgcga ctgccacatg gcagacatgg tgacctggct caaggaaaca 1020 gaggtagtgc agggcaaaga ccggctcacc tgtgcatatc cggaaaaaat gaggaatcgg 1080 gtcctcttgg aactcaacag tgctgacctg gactgtgacc cgattcttcc cccatccctg 1140 caaacctctt atgtcttcct gggtattgtt ttagccctga taggcgctat tttcctcctg 1200 gttttgtatt tgaaccgcaa ggggataaaa aagtggatgc ataacatcag agatgcctgc 1260 agggatcaca 
tggaagggta tcattacaga tatgaaatca atgcggaccc cagattaaca 1320 aacctcagtt ctaactcgga tgtctgagaa atattagagg acagaccaag gacaactctg 1380 catgagatgt agacttaagc tttatcccta ctaggcttgc tccactttca tcctccacta 1440 tagatacaac ggactttgac taaaagcagt gaaggggatt tgcttccttg ttatgtaaag 1500 tttctcggtg tgttctgtta atgtaagacg atgaacagtt gtgtatagtg ttttaccctc 1560 ttctttttct tggaactcct caacacgtat ggagggattt ttcaggtttc agcatgaaca 1620 tgggcttctt gctgtctgtc tctctctcag tacagttcaa ggtgtagcaa gtgtacccac 1680 acagatagca ttcaacaaaa gctgcctcaa ctttttcgag aaaaatactt tattcataaa 1740 tatcagtttt attctcatgt acctaagttg tggagaaaat aattgcatcc tataaactgc 1800 ctgcagacgt tagcaggctc ttcaaaataa ctccatggtg cacaggagca cctgcatcca 1860 agagcatgct tacattttac tgttctgcat attacaaaaa ataacttgca acttcataac 1920 ttctttgaca aagtaaatta cttttttgat tgcagtttat atgaaaatgt actgattttt 1980 ttttaataaa ctgcatcgag atccaaccga ctgaattgtt aaaaaaaaaa aaaaataaag 2040 attcttaaaa gaa 2053 19 845 PRT Homo sapiens 19 Met Leu Ser Gly Val Trp Phe Leu Ser Val Leu Thr Val Ala Gly Ile 1 5 10 15 Leu Gln Thr Glu Ser Arg Lys Thr Ala Lys Asp Ile Cys Lys Ile Arg 20 25 30 Cys Leu Cys Glu Glu Lys Glu Asn Val Leu Asn Ile Asn Cys Glu Asn 35 40 45 Lys Gly Phe Thr Thr Val Ser Leu Leu Gln Pro Pro Gln Tyr Arg Ile 50 55 60 Tyr Gln Leu Phe Leu Asn Gly Asn Leu Leu Thr Arg Leu Tyr Pro Asn 65 70 75 80 Glu Phe Val Asn Tyr Ser Asn Ala Val Thr Leu His Leu Gly Asn Asn 85 90 95 Gly Leu Gln Glu Ile Arg Thr Gly Ala Phe Ser Gly Leu Lys Thr Leu 100 105 110 Lys Arg Leu His Leu Asn Asn Asn Lys Leu Glu Ile Leu Arg Glu Asp 115 120 125 Thr Phe Leu Gly Leu Glu Ser Leu Glu Tyr Leu Gln Ala Asp Tyr Asn 130 135 140 Tyr Ile Ser Ala Ile Glu Ala Gly Ala Phe Ser Lys Leu Asn Lys Leu 145 150 155 160 Lys Val Leu Ile Leu Asn Asp Asn Leu Leu Leu Ser Leu Pro Ser Asn 165 170 175 Val Phe Arg Phe Val Leu Leu Thr His Leu Asp Leu Arg Gly Asn Arg 180 185 190 Leu Lys Val Met Pro Phe Ala Gly Val Leu Glu His Ile Gly Gly Ile 195 200 205 Met Glu Ile Gln Leu Glu Glu Asn Pro Trp Asn Cys Thr Cys Asp 
Leu 210 215 220 Leu Pro Leu Lys Ala Trp Leu Asp Thr Ile Thr Val Phe Val Gly Glu 225 230 235 240 Ile Val Cys Glu Thr Pro Phe Arg Leu His Gly Lys Asp Val Thr Gln 245 250 255 Leu Thr Arg Gln Asp Leu Cys Pro Arg Lys Ser Ala Ser Asp Ser Ser 260 265 270 Gln Arg Gly Ser His Ala Asp Thr His Val Gln Arg Leu Ser Pro Thr 275 280 285 Met Asn Pro Ala Leu Asn Pro Thr Arg Ala Pro Lys Ala Ser Arg Pro 290 295 300 Pro Lys Met Arg Asn Arg Pro Thr Pro Arg Val Thr Val Ser Lys Asp 305 310 315 320 Arg Gln Ser Phe Gly Pro Ile Met Val Tyr Gln Thr Lys Ser Pro Val 325 330 335 Pro Leu Thr Cys Pro Ser Ser Cys Val Cys Thr Ser Gln Ser Ser Asp 340 345 350 Asn Gly Leu Asn Val Asn Cys Gln Glu Arg Lys Phe Thr Asn Ile Ser 355 360 365 Asp Leu Gln Pro Lys Pro Thr Ser Pro Lys Lys Leu Tyr Leu Thr Gly 370 375 380 Asn Tyr Leu Gln Thr Val Tyr Lys Asn Asp Leu Leu Glu Tyr Ser Ser 385 390 395 400 Leu Asp Leu Leu His Leu Gly Asn Asn Arg Ile Ala Val Ile Gln Glu 405 410 415 Gly Ala Phe Thr Asn Leu Thr Ser Leu Arg Arg Leu Tyr Leu Asn Gly 420 425 430 Asn Tyr Leu Glu Val Leu Tyr Pro Ser Met Phe Asp Gly Leu Gln Ser 435 440 445 Leu Gln Tyr Leu Tyr Leu Glu Tyr Asn Val Ile Lys Glu Ile Lys Pro 450 455 460 Leu Thr Phe Asp Ala Leu Ile Asn Leu Gln Leu Leu Phe Leu Asn Asn 465 470 475 480 Asn Leu Leu Arg Ser Leu Pro Asp Asn Ile Phe Gly Gly Thr Ala Leu 485 490 495 Thr Arg Leu Asn Leu Arg Asn Asn His Phe Ser His Leu Pro Val Lys 500 505 510 Gly Val Leu Asp Gln Leu Pro Ala Phe Ile Gln Ile Asp Leu Gln Glu 515 520 525 Asn Pro Trp Asp Cys Thr Cys Asp Ile Met Gly Leu Lys Asp Trp Thr 530 535 540 Glu His Ala Asn Ser Pro Val Ile Ile Asn Glu Val Thr Cys Glu Ser 545 550 555 560 Pro Ala Lys His Ala Gly Glu Ile Leu Lys Phe Leu Gly Arg Glu Ala 565 570 575 Ile Cys Pro Asp Ser Pro Asn Leu Ser Asp Gly Thr Val Leu Ser Met 580 585 590 Asn His Asn Thr Asp Thr Pro Arg Ser Leu Ser Val Ser Pro Ser Ser 595 600 605 Tyr Pro Glu Leu His Thr Glu Val Pro Leu Ser Val Leu Ile Leu Gly 610 615
620 Leu Leu Val Val Phe Ile Leu Ser Val Cys Phe Gly Ala Gly Leu Phe 625 630 635 640 Val Phe Val Leu Lys Arg Arg Lys Gly Val Pro Ser Val Pro Arg Asn 645 650 655 Thr Asn Asn Leu Asp Val Ser Ser Phe Gln Leu Gln Tyr Gly Ser Tyr 660 665 670 Asn Thr Glu Thr His Asp Lys Thr Asp Gly His Val Tyr Asn Tyr Ile 675 680 685 Pro Pro Pro Val Gly Gln Met Cys Gln Asn Pro Ile Tyr Met Gln Lys 690 695 700 Glu Gly Asp Pro Val Ala Tyr Tyr Arg Asn Leu Gln Glu Phe Ser Tyr 705 710 715 720 Ser Asn Leu Glu Glu Lys Lys Glu Glu Pro Ala Thr Pro Ala Tyr Thr 725 730 735 Ile Ser Ala Thr Glu Leu Leu Glu Lys Gln Ala Thr Pro Arg Glu Pro 740 745 750 Glu Leu Leu Tyr Gln Asn Ile Ala Glu Arg Val Lys Glu Leu Pro Ser 755 760 765 Ala Gly Leu Val His Tyr Asn Phe Cys Thr Leu Pro Lys Arg Gln Phe 770 775 780 Ala Pro Ser Tyr Glu Ser Arg Arg Gln Asn Gln Asp Arg Ile Asn Lys 785 790 795 800 Thr Val Leu Tyr Gly Thr Pro Arg Lys Cys Phe Val Gly Gln Ser Lys 805 810 815 Pro Asn His Pro Leu Leu Gln Ala Lys Pro Gln Ser Glu Pro Asp Tyr 820 825 830 Leu Glu Val Leu Glu Lys Gln Thr Ala Ile Ser Gln Leu 835 840 845 20 1477 PRT Homo sapiens 20 Met Ala Lys Arg Ser Arg Gly Pro Gly Arg Arg Cys Leu Leu Ala Leu 1 5 10 15 Val Leu Phe Cys Ala Trp Gly Thr Leu Ala Val Val Ala Gln Lys Pro 20 25 30 Gly Ala Gly Cys Pro Ser Arg Cys Leu Cys Phe Arg Thr Thr Val Arg 35 40 45 Cys Met His Leu Leu Leu Glu Ala Val Pro Ala Val Ala Pro Gln Thr 50 55 60 Ser Ile Leu Asp Leu Arg Phe Asn Arg Ile Arg Glu Ile Gln Pro Gly 65 70 75 80 Ala Phe Arg Arg Leu Arg Asn Leu Asn Thr Leu Leu Leu Asn Asn Asn 85 90 95 Gln Ile Lys Arg Ile Pro Ser Gly Ala Phe Glu Asp Leu Glu Asn Leu 100 105 110 Lys Tyr Leu Tyr Leu Tyr Lys Asn Glu Ile Gln Ser Ile Asp Arg Gln 115 120 125 Ala Phe Lys Gly Leu Ala Ser Leu Glu Gln Leu Tyr Leu His Phe Asn 130 135 140 Gln Ile Glu Thr Leu Asp Pro Asp Ser Phe Gln His Leu Pro Lys Leu 145 150 155 160 Glu Arg Leu Phe Leu His Asn Asn Arg Ile Thr His Leu Val Pro Gly 165 170 175 Thr Phe Asn His Leu Glu Ser Met Lys Arg Leu Arg Leu Asp Ser Asn 180 185 190 
Thr Leu His Cys Asp Cys Glu Ile Leu Trp Leu Ala Asp Leu Leu Lys 195 200 205 Thr Tyr Ala Glu Ser Gly Asn Ala Gln Ala Ala Ala Ile Cys Glu Tyr 210 215 220 Pro Arg Arg Ile Gln Gly Arg Ser Val Ala Thr Ile Thr Pro Glu Glu 225 230 235 240 Leu Asn Cys Glu Arg Pro Arg Ile Thr Ser Glu Pro Gln Asp Ala Asp 245 250 255 Val Thr Ser Gly Asn Thr Val Tyr Phe Thr Cys Arg Ala Glu Gly Asn 260 265 270 Pro Lys Pro Glu Ile Ile Trp Leu Arg Asn Asn Asn Glu Leu Ser Met 275 280 285 Lys Thr Asp Ser Arg Leu Asn Leu Leu Asp Asp Gly Thr Leu Met Ile 290 295 300 Gln Asn Thr Gln Glu Thr Asp Gln Gly Ile Tyr Gln Cys Met Ala Lys 305 310 315 320 Asn Val Ala Gly Glu Val Lys Thr Gln Glu Val Thr Leu Arg Tyr Phe 325 330 335 Gly Ser Pro Ala Arg Pro Thr Phe Val Ile Gln Pro Gln Asn Thr Glu 340 345 350 Val Leu Val Gly Glu Ser Val Thr Leu Glu Cys Ser Ala Thr Gly His 355 360 365 Pro Pro Pro Arg Ile Ser Trp Thr Arg Gly Asp Arg Thr Pro Leu Pro 370 375 380 Val Asp Pro Arg Val Asn Ile Thr Pro Ser Gly Gly Leu Tyr Ile Gln 385 390 395 400 Asn Val Val Gln Gly Asp Ser Gly Glu Tyr Ala Cys Ser Ala Thr Asn 405 410 415 Asn Ile Asp Ser Val His Ala Thr Ala Phe Ile Ile Val Gln Ala Leu 420 425 430 Pro Gln Phe Thr Val Thr Pro Gln Asp Arg Val Val Ile Glu Gly Gln 435 440 445 Thr Val Asp Phe Gln Cys Glu Ala Lys Gly Asn Pro Pro Pro Val Ile 450 455 460 Ala Trp Thr Lys Gly Gly Ser Gln Leu Ser Val Asp Arg Arg His Leu 465 470 475 480 Val Leu Ser Ser Gly Thr Leu Arg Ile Ser Gly Val Ala Leu His Asp 485 490 495 Gln Gly Gln Tyr Glu Cys Gln Ala Val Asn Ile Ile Gly Ser Gln Lys 500 505 510 Val Val Ala His Leu Thr Val Gln Pro Arg Val Thr Pro Val Phe Ala 515 520 525 Ser Ile Pro Ser Asp Thr Thr Val Glu Val Gly Ala Asn Val Gln Leu 530 535 540 Pro Cys Ser Ser Gln Gly Glu Pro Glu Pro Ala Ile Thr Trp Asn Lys 545 550 555 560 Asp Gly Val Gln Val Thr Glu Ser Gly Lys Phe His Ile Ser Pro Glu 565 570 575 Gly Phe Leu Thr Ile Asn Asp Val Gly Pro Ala Asp Ala Gly Arg Tyr 580 585 590 Glu Cys Val Ala Arg Asn Thr Ile Gly Ser Ala Ser Val Ser Met Val 595 600 605 Leu 
Ser Val Asn Asp Val Ser Arg Asn Gly Asp Pro Phe Val Ala Thr 610 615 620 Ser Ile Val Glu Ala Ile Ala Thr Val Asp Arg Ala Ile Asn Ser Thr 625 630 635 640 Arg Thr His Leu Phe Asp Ser Arg Pro Arg Ser Pro Asn Asp Leu Leu 645 650 655 Ala Leu Phe Arg Tyr Pro Arg Asp Pro Tyr Thr Val Glu Gln Ala Arg 660 665 670 Ala Gly Glu Ile Phe Glu Arg Thr Leu Gln Leu Ile Gln Glu His Val 675 680 685 Gln His Gly Leu Met Val Asp Leu Asn Gly Thr Ser Tyr His Tyr Asn 690 695 700 Asp Leu Val Ser Pro Gln Tyr Leu Asn Leu Ile Ala Asn Leu Ser Gly 705 710 715 720 Cys Thr Ala His Arg Arg Val Asn Asn Cys Ser Asp Met Cys Phe His 725 730 735 Gln Lys Tyr Arg Thr His Asp Gly Thr Cys Asn Asn Leu Gln His Pro 740 745 750 Met Trp Gly Ala Ser Leu Thr Ala Phe Glu Arg Leu Leu Lys Ser Val 755 760 765 Tyr Glu Asn Gly Phe Asn Thr Pro Arg Gly Ile Asn Pro His Arg Leu 770 775 780 Tyr Asn Gly His Ala Leu Pro Met Pro Arg Leu Val Ser Thr Thr Leu 785 790 795 800 Ile Gly Thr Glu Thr Val Thr Pro Asp Glu Gln Phe Thr His Met Leu 805 810 815 Met Gln Trp Gly Gln Phe Leu Asp His Asp Leu Asp Ser Thr Val Val 820 825 830 Ala Leu Ser Gln Ala Arg Phe Ser Asp Gly Gln His Cys Ser Asn Val 835 840 845 Cys Ser Asn Asp Pro Pro Cys Phe Ser Val Met Ile Pro Pro Asn Asp 850 855 860 Ser Arg Ala Arg Ser Gly Ala Arg Cys Met Phe Phe Val Arg Ser Ser 865 870 875 880 Pro Val Cys Gly Ser Gly Met Thr Ser Leu Leu Met Asn Ser Val Tyr 885 890 895 Pro Arg Glu Gln Ile Asn Gln Leu Thr Ser Tyr Ile Asp Ala Ser Asn 900 905 910 Val Tyr Gly Ser Thr Glu His Glu Ala Arg Ser Ile Arg Asp Leu Ala 915 920 925 Ser His Arg Gly Leu Leu Arg Gln Gly Ile Val Gln Arg Ser Gly Lys 930 935 940 Pro Leu Leu Pro Phe Ala Thr Gly Pro Pro Thr Glu Cys Met Arg Asp 945 950 955 960 Glu Asn Glu Ser Pro Ile Pro Cys Phe Leu Ala Gly Asp His Arg Ala 965 970 975 Asn Glu Gln Leu Gly Leu Thr Ser Met His Thr Leu Trp Phe Arg Glu 980 985 990 His Asn Arg Ile Ala Thr Glu Leu Leu Lys Leu Asn Pro His Trp Asp 995 1000 1005 Gly Asp Thr Ile Tyr Tyr Glu Thr Arg Lys Ile Val Gly Ala Glu 1010 1015 1020 Ile 
Gln His Ile Thr Tyr Gln His Trp Leu Pro Lys Ile Leu Gly 1025 1030 1035 Glu Val Gly Met Arg Thr Leu Gly Glu Tyr His Gly Tyr Asp Pro 1040 1045 1050 Gly Ile Asn Ala Gly Ile Phe Asn Ala Phe Ala Thr Ala Ala Phe 1055 1060 1065 Arg Phe Gly His Thr Leu Val Asn Pro Leu Leu Tyr Arg Leu Asp 1070 1075 1080 Glu Asn Phe Gln Pro Ile Ala Gln Asp His Leu Pro Leu His Lys 1085 1090 1095 Ala Phe Phe Ser Pro Phe Arg Ile Val Asn Glu Gly Gly Ile Asp 1100 1105 1110 Pro Leu Leu Arg Gly Leu Phe Gly Val Ala Gly Lys Met Arg Val 1115 1120 1125 Pro Ser Gln Leu Leu Asn Thr Glu Leu Thr Glu Arg Leu Phe Ser 1130 1135 1140 Met Ala His Thr Val Ala Leu Asp Leu Ala Ala Ile Asn Ile Gln 1145 1150 1155 Arg Gly Arg Asp His Gly Ile Pro Pro Tyr His Asp Tyr Arg Val 1160 1165 1170 Tyr Cys Asn Leu Ser Ala Ala His Thr Phe Glu Asp Leu Lys Asn 1175 1180 1185 Glu Ile Lys Asn Pro Glu Ile Arg Glu Lys Leu Lys Arg Leu Tyr 1190 1195 1200 Gly Ser Thr Leu Asn Ile Asp Leu Phe Pro Ala Leu Val Val Glu 1205 1210 1215 Asp Leu Val Pro Gly Ser Arg Leu Gly Pro Thr Leu Met Cys Leu 1220 1225 1230 Leu Ser Thr Gln Phe Lys Arg Leu Arg Asp Gly Asp Arg Leu Trp 1235 1240 1245 Tyr Glu Asn Pro Gly Val Phe Ser Pro Ala Gln Leu Thr Gln Ile 1250 1255 1260 Lys Gln Thr Ser Leu Ala Arg Ile Leu Cys Asp Asn Ala Asp Asn 1265 1270 1275 Ile Thr Arg Val Gln Ser Asp Val Phe Arg Val Ala Glu Phe Pro 1280 1285 1290 His Gly Tyr Gly Ser Cys Asp Glu Ile Pro Arg Val Asp Leu Arg 1295 1300 1305 Val Trp Gln Asp Cys Cys Glu Asp Cys Arg Thr Arg Gly Gln Phe 1310 1315 1320 Asn Ala Phe Ser Tyr His Phe Arg Gly Arg Arg Ser Leu Glu Phe 1325 1330 1335 Ser Tyr Gln Glu Asp Lys Pro Thr Lys Lys Thr Arg Pro Arg Lys 1340 1345 1350 Ile Pro Ser Val Gly Arg Gln Gly Glu His Leu Ser Asn Ser Thr 1355 1360 1365 Ser Ala Phe Ser Thr Arg Ser Asp Ala Ser Gly Thr Asn Asp Phe 1370 1375 1380 Arg Glu Phe Val Leu Glu Met Gln Lys Thr Ile Thr Asp Leu Arg 1385 1390 1395 Thr Gln Ile Lys Lys Leu Glu Ser Arg Leu Ser Thr Thr Glu Cys 1400 1405 1410 Val Asp Ala Gly Gly Glu Ser His Ala Asn Asn Thr Lys 
Trp Lys 1415 1420 1425 Lys Asp Ala Cys Thr Ile Cys Glu Cys Lys Asp Gly Gln Val Thr 1430 1435 1440 Cys Phe Val Glu Ala Cys Pro Pro Ala Thr Cys Ala Val Pro Val 1445 1450 1455 Asn Ile Pro Gly Ala Cys Cys Pro Val Cys Leu Gln Lys Arg Ala 1460 1465 1470 Glu Glu Lys Pro 1475 21 798 PRT Homo sapiens 21 Met Leu Ile Asn Cys Glu Ala Lys Gly Ile Lys Met Val Ser Glu Ile 1 5 10 15 Ser Val Pro Pro Ser Arg Pro Phe Gln Leu Ser Leu Leu Asn Asn Gly 20 25 30 Leu Thr Met Leu His Thr Asn Asp Phe Ser Gly Leu Thr Asn Ala Ile 35 40 45 Ser Ile His Leu Gly Phe Asn Asn Ile Ala Asp Ile Glu Ile Gly Ala 50 55 60 Phe Asn Gly Leu Gly Leu Leu Lys Gln Leu His Ile Asn His Asn Ser 65 70 75 80 Leu Glu Ile Leu Lys Glu Asp Thr Phe His Gly Leu Glu Asn Leu Glu 85 90 95 Phe Leu Gln Ala Asp Asn Asn Phe Ile Thr Val Ile Glu Pro Ser Ala 100 105 110 Phe Ser Lys Leu Asn Arg Leu Lys Val Leu Ile Leu Asn Asp Asn Ala 115 120 125 Ile Glu Ser Leu Pro Pro Asn Ile Phe Arg Phe Val Pro Leu Thr His 130 135 140 Leu Asp Leu Arg Gly Asn Gln Leu Gln Thr Leu Pro Tyr Val Gly Phe 145 150 155 160 Leu Glu His Ile Gly Arg Ile Leu Asp Leu Gln Leu Glu Asp Asn Lys 165 170 175 Trp Ala Cys Asn Cys Asp Leu Leu Gln Leu Lys Thr Trp Leu Glu Asn 180 185 190 Met Pro Pro Gln Ser Ile Ile Gly Asp Val Val Cys Asn Ser Pro Pro 195 200 205 Phe Phe Lys Gly Ser Ile Leu Ser Arg Leu Lys Lys Glu Ser Ile Cys 210 215 220 Pro Thr Pro Pro Val Tyr Glu Glu His Glu Asp Pro Ser Gly Ser Leu 225 230 235 240 His Leu Ala Ala Thr Ser Ser Ile Asn Asp Ser Arg Met Ser Thr Lys 245 250 255 Thr Thr Ser Ile Leu Lys Leu Pro Thr Lys Ala Pro Gly Leu Ile Pro 260 265 270 Tyr Ile Thr Lys Pro Ser Thr Gln Leu Pro Gly Pro Tyr Cys Pro Ile 275 280 285 Pro Cys Asn Cys Lys Val Leu Ser Pro Ser Gly Leu Leu Ile His Cys 290 295 300 Gln Glu Arg Asn Ile Glu Ser Leu Ser Asp Leu Arg Pro Pro Pro Gln 305 310 315 320 Asn Pro Arg Lys Leu Ile Leu Ala Gly Asn Ile Ile His Ser Leu Met 325 330 335 Lys Ser Asp Leu Val Glu Tyr Phe Thr Leu Glu Met Leu His Leu Gly 340 345 350 Asn Asn Arg Ile Glu Val Leu Glu 
Glu Gly Ser Phe Met Asn Leu Thr 355 360 365 Arg Leu Gln Lys Leu Tyr Leu Asn Gly Asn His Leu Thr Lys Leu Ser 370 375 380 Lys Gly Met Phe Leu Gly Leu His Asn Leu Glu Tyr Leu Tyr Leu Glu 385 390 395 400 Tyr Asn Ala Ile Lys Glu Ile Leu Pro Gly Thr Phe Asn Pro Met Pro 405 410 415 Lys Leu Lys Val Leu Tyr Leu Asn Asn Asn Leu Leu Gln Val Leu Pro 420 425 430 Pro His Ile Phe Ser Gly Val Pro Leu Thr Lys Val Asn Leu Lys Thr 435 440 445 Asn Gln Phe Thr His Leu Pro Val Ser Asn Ile Leu Asp Asp Leu Asp 450 455 460 Leu Leu Thr Gln Ile Asp Leu Glu Asp Asn Pro Trp Asp Cys Ser Cys 465 470 475 480 Asp Leu Val Gly Leu Gln Gln Trp Ile Gln Lys Leu Ser Lys Asn Thr 485 490 495 Val Thr Asp Asp Ile Leu Cys Thr Ser Pro Gly His Leu Asp Lys Lys 500 505 510 Glu Leu Lys Ala Leu Asn Ser Glu Ile Leu Cys Pro Gly Leu Val Asn 515 520 525 Asn Pro Ser Met Pro Thr Gln Thr Ser Tyr Leu Met Val Thr Thr Pro 530 535 540 Ala Thr Thr Thr Asn Thr Ala Asp Thr Ile Leu Arg Ser Leu Thr Asp 545 550 555 560 Ala Val Pro Leu Ser Val Leu Ile Leu Gly Leu Leu Ile Met Phe Ile 565 570 575 Thr Ile Val Phe Cys Ala Ala Gly Ile Val Val Leu Val Leu His Arg 580 585 590 Arg Arg Arg Tyr Lys Lys Lys Gln Val Asp Glu Gln Met Arg Asp Asn 595 600 605 Ser Pro Val His Leu Gln Tyr Ser Met Tyr Gly His Lys Thr Thr His 610 615 620 His Thr Thr Glu Arg Pro Ser Ala Ser Leu Tyr Glu Gln His Met Val 625 630 635 640 Ser Pro Met Val His Val Tyr Arg Ser Pro Ser Phe Gly Pro Lys His 645 650 655 Leu Glu Glu Glu Glu Glu Arg Asn Glu Lys Glu Gly Ser Asp Ala Lys 660 665 670 His Leu Gln Arg Ser Leu Leu Glu Gln Glu Asn His Ser Pro Leu Thr 675 680 685 Gly Ser Asn Met Lys Tyr Lys Thr Thr Asn Gln Ser Thr Glu Phe Leu 690 695 700 Ser Phe Gln Asp Ala Ser Ser Leu Tyr Arg Asn Ile Leu Glu Lys Glu 705 710 715 720 Arg Glu Leu Gln Gln Leu Gly Ile Thr Glu Tyr Leu Arg Lys Asn Ile
725 730 735 Ala Gln Leu Gln Pro Asp Met Glu Ala His Tyr Pro Gly Ala His Glu 740 745 750 Glu Leu Lys Leu Met Glu Thr Leu Met Tyr Ser Arg Pro Arg Lys Val 755 760 765 Leu Val Glu Gln Thr Lys Asn Glu Tyr Phe Glu Leu Lys Ala Asn Leu 770 775 780 His Ala Glu Pro Asp Tyr Leu Glu Val Leu Glu Gln Gln Thr 785 790 795 22 713 PRT Homo sapiens 22 Met Arg Leu Leu Val Ala Pro Leu Leu Leu Ala Trp Val Ala Gly Ala 1 5 10 15 Thr Ala Ala Val Pro Val Val Pro Trp His Val Pro Cys Pro Pro Gln 20 25 30 Cys Ala Cys Gln Ile Arg Pro Trp Tyr Thr Pro Arg Ser Ser Tyr Arg 35 40 45 Glu Ala Thr Thr Val Asp Cys Asn Asp Leu Phe Leu Thr Ala Val Pro 50 55 60 Pro Ala Leu Pro Ala Gly Thr Gln Thr Leu Leu Leu Gln Ser Asn Ser 65 70 75 80 Ile Val Arg Val Asp Gln Ser Glu Leu Gly Tyr Leu Ala Asn Leu Thr 85 90 95 Glu Leu Asp Leu Ser Gln Asn Ser Phe Ser Asp Ala Arg Asp Cys Asp 100 105 110 Phe His Ala Leu Pro Gln Leu Leu Ser Leu His Leu Glu Glu Asn Gln 115 120 125 Leu Thr Arg Leu Glu Asp His Ser Phe Ala Gly Leu Ala Ser Leu Gln 130 135 140 Glu Leu Tyr Leu Asn His Asn Gln Leu Tyr Arg Ile Ala Pro Arg Ala 145 150 155 160 Phe Ser Gly Leu Ser Asn Leu Leu Arg Leu His Leu Asn Ser Asn Leu 165 170 175 Leu Arg Ala Ile Asp Ser Arg Trp Phe Glu Met Leu Pro Asn Leu Glu 180 185 190 Ile Leu Met Ile Gly Gly Asn Lys Val Asp Ala Ile Leu Asp Met Asn 195 200 205 Phe Arg Pro Leu Ala Asn Leu Arg Ser Leu Val Leu Ala Gly Met Asn 210 215 220 Leu Arg Glu Ile Ser Asp Tyr Ala Leu Glu Gly Leu Gln Ser Leu Glu 225 230 235 240 Ser Leu Ser Phe Tyr Asp Asn Gln Leu Ala Arg Val Pro Arg Arg Ala 245 250 255 Leu Glu Gln Val Pro Gly Leu Lys Phe Leu Asp Leu Asn Lys Asn Pro 260 265 270 Leu Gln Arg Val Gly Pro Gly Asp Phe Ala Asn Met Leu His Leu Lys 275 280 285 Glu Leu Gly Leu Asn Asn Met Glu Glu Leu Val Ser Ile Asp Lys Phe 290 295 300 Ala Leu Val Asn Leu Pro Glu Leu Thr Lys Leu Asp Ile Thr Asn Asn 305 310 315 320 Pro Arg Leu Ser Phe Ile His Pro Arg Ala Phe His His Leu Pro Gln 325 330 335 Met Glu Thr Leu Met Leu Asn Asn Asn Ala Leu Ser Ala Leu His Gln 340 
345 350 Gln Thr Val Glu Ser Leu Pro Asn Leu Gln Glu Val Gly Leu His Gly 355 360 365 Asn Pro Ile Arg Cys Asp Cys Val Ile Arg Trp Ala Asn Ala Thr Gly 370 375 380 Thr Arg Val Arg Phe Ile Glu Pro Gln Ser Thr Leu Cys Ala Glu Pro 385 390 395 400 Pro Asp Leu Gln Arg Leu Pro Val Arg Glu Val Pro Phe Arg Glu Met 405 410 415 Thr Asp His Cys Leu Pro Leu Ile Ser Pro Arg Ser Phe Pro Pro Ser 420 425 430 Leu Gln Val Ala Ser Gly Glu Ser Met Val Leu His Cys Arg Ala Leu 435 440 445 Ala Glu Pro Glu Pro Glu Ile Tyr Trp Val Thr Pro Ala Gly Leu Arg 450 455 460 Leu Thr Pro Ala His Ala Gly Arg Arg Cys Arg Val Tyr Pro Glu Gly 465 470 475 480 Thr Leu Glu Leu Arg Arg Val Thr Ala Glu Glu Ala Gly Leu Tyr Thr 485 490 495 Cys Val Ala Gln Asn Leu Val Gly Ala Asp Thr Lys Thr Val Ser Val 500 505 510 Val Val Gly Arg Ala Leu Leu Gln Pro Gly Arg Asp Glu Gly Gln Gly 515 520 525 Leu Glu Leu Arg Val Gln Glu Thr His Pro Tyr His Ile Leu Leu Ser 530 535 540 Trp Val Thr Pro Pro Asn Thr Val Ser Thr Asn Leu Thr Trp Ser Ser 545 550 555 560 Ala Ser Ser Leu Arg Gly Gln Gly Ala Thr Ala Leu Ala Arg Leu Pro 565 570 575 Arg Gly Thr His Ser Tyr Asn Ile Thr Arg Leu Leu Gln Ala Thr Glu 580 585 590 Tyr Trp Ala Cys Leu Gln Val Ala Phe Ala Asp Ala His Thr Gln Leu 595 600 605 Ala Cys Val Trp Ala Arg Thr Lys Glu Ala Thr Ser Cys His Arg Ala 610 615 620 Leu Gly Asp Arg Pro Gly Leu Ile Ala Ile Leu Ala Leu Ala Val Leu 625 630 635 640 Leu Leu Ala Ala Gly Leu Ala Ala His Leu Gly Thr Gly Gln Pro Arg 645 650 655 Lys Gly Val Gly Gly Arg Arg Pro Leu Pro Pro Ala Trp Ala Phe Trp 660 665 670 Gly Trp Ser Ala Pro Ser Val Arg Val Val Ser Ala Pro Leu Val Leu 675 680 685 Pro Trp Asn Pro Gly Arg Lys Leu Pro Arg Ser Ser Glu Gly Glu Thr 690 695 700 Leu Leu Pro Pro Leu Ser Gln Asn Ser 705 710 23 684 PRT Homo sapiens 23 Met Thr Ser Leu Val His Leu Thr Leu Ser Arg Asn Thr Ile Gly Gln 1 5 10 15 Val Ala Ala Gly Ala Phe Ala Asp Leu Arg Ala Leu Arg Ala Leu His 20 25 30 Leu Asp Ser Asn Arg Leu Ala Glu Val Arg Gly Asp Gln Leu Arg Gly 35 40 45 Leu Gly Asn 
Leu Arg His Leu Ile Leu Gly Asn Asn Gln Ile Arg Arg 50 55 60 Val Glu Ser Ala Ala Phe Asp Ala Phe Leu Ser Thr Val Glu Asp Leu 65 70 75 80 Asp Leu Ser Tyr Asn Asn Leu Glu Ala Leu Pro Trp Glu Ala Val Gly 85 90 95 Gln Met Val Asn Leu Asn Thr Leu Thr Leu Asp His Asn Leu Ile Asp 100 105 110 His Ile Ala Glu Gly Thr Phe Val Gln Leu His Lys Leu Val Arg Leu 115 120 125 Asp Met Thr Ser Asn Arg Leu His Lys Leu Pro Pro Asp Gly Leu Phe 130 135 140 Leu Arg Ser Gln Gly Thr Gly Pro Lys Pro Pro Thr Pro Leu Thr Val 145 150 155 160 Ser Phe Gly Gly Asn Pro Leu His Cys Asn Cys Glu Leu Leu Trp Leu 165 170 175 Arg Arg Leu Thr Arg Glu Asp Asp Leu Glu Thr Cys Ala Thr Pro Glu 180 185 190 His Leu Thr Asp Arg Tyr Phe Trp Ser Ile Pro Glu Glu Glu Phe Leu 195 200 205 Cys Glu Pro Pro Leu Ile Thr Arg Gln Ala Gly Gly Arg Ala Leu Val 210 215 220 Val Glu Gly Gln Ala Val Ser Leu Arg Cys Arg Ala Val Gly Asp Pro 225 230 235 240 Glu Pro Val Val His Trp Val Ala Pro Asp Gly Arg Leu Leu Gly Asn 245 250 255 Ser Ser Arg Thr Arg Val Arg Gly Asp Gly Thr Leu Asp Val Thr Ile 260 265 270 Thr Thr Leu Arg Asp Ser Gly Thr Phe Thr Cys Ile Ala Ser Asn Ala 275 280 285 Ala Gly Glu Ala Thr Ala Pro Val Glu Val Cys Val Val Pro Leu Pro 290 295 300 Leu Met Ala Pro Pro Pro Ala Ala Pro Pro Pro Leu Thr Glu Pro Gly 305 310 315 320 Ser Ser Asp Ile Ala Thr Pro Gly Arg Pro Gly Ala Asn Asp Ser Ala 325 330 335 Ala Glu Arg Arg Leu Val Ala Ala Glu Leu Thr Ser Asn Ser Val Leu 340 345 350 Ile Arg Trp Pro Ala Gln Arg Pro Val Pro Gly Ile Arg Met Tyr Gln 355 360 365 Val Gln Tyr Asn Ser Ser Val Asp Asp Ser Leu Val Tyr Arg Met Ile 370 375 380 Pro Ser Thr Ser Gln Thr Phe Leu Val Asn Asp Leu Ala Ala Gly Arg 385 390 395 400 Ala Tyr Asp Leu Cys Val Leu Ala Val Tyr Asp Asp Gly Ala Thr Ala 405 410 415 Leu Pro Ala Thr Arg Val Val Gly Cys Val Gln Phe Thr Thr Ala Gly 420 425 430 Asp Pro Ala Pro Cys Arg Pro Leu Arg Ala His Phe Leu Gly Gly Thr 435 440 445 Met Ile Ile Ala Ile Gly Gly Val Ile Val Ala Ser Val Leu Val Phe 450 455 460 Ile Val Leu Leu Met Ile 
Arg Tyr Lys Val Tyr Gly Asp Gly Asp Ser 465 470 475 480 Arg Arg Val Lys Gly Ser Arg Ser Leu Pro Arg Val Ser His Val Cys 485 490 495 Ser Gln Thr Asn Gly Ala Gly Thr Gly Ala Ala Gln Ala Pro Ala Leu 500 505 510 Pro Ala Gln Asp His Tyr Glu Ala Leu Arg Glu Val Glu Ser Gln Ala 515 520 525 Ala Pro Ala Val Ala Val Glu Ala Lys Ala Met Glu Ala Glu Thr Ala 530 535 540 Ser Ala Glu Pro Glu Val Val Leu Gly Arg Ser Leu Gly Gly Ser Ala 545 550 555 560 Thr Ser Leu Cys Leu Leu Pro Ser Glu Glu Thr Ser Gly Glu Glu Ser 565 570 575 Arg Ala Ala Val Gly Pro Arg Arg Ser Arg Ser Gly Ala Leu Glu Pro 580 585 590 Pro Thr Ser Ala Pro Pro Thr Leu Ala Leu Val Pro Gly Gly Ala Ala 595 600 605 Ala Arg Pro Arg Pro Gln Gln Arg Tyr Ser Phe Asp Gly Asp Tyr Gly 610 615 620 Ala Leu Phe Gln Ser His Ser Tyr Pro Arg Arg Ala Arg Arg Thr Lys 625 630 635 640 Arg His Arg Ser Thr Pro His Leu Asp Gly Ala Gly Gly Gly Ala Ala 645 650 655 Gly Glu Asp Gly Asp Leu Gly Leu Gly Ser Ala Arg Ala Cys Leu Ala 660 665 670 Phe Thr Ser Thr Glu Trp Met Leu Glu Ser Thr Val 675 680 24 420 PRT Homo sapiens 24 Met Pro Gly Gly Cys Ser Arg Gly Pro Ala Ala Gly Asp Gly Arg Leu 1 5 10 15 Arg Leu Ala Arg Leu Ala Leu Val Leu Leu Gly Trp Val Ser Ser Ser 20 25 30 Ser Pro Thr Ser Ser Ala Ser Ser Phe Ser Ser Ser Ala Pro Phe Leu 35 40 45 Ala Ser Ala Val Ser Ala Gln Pro Pro Leu Pro Asp Gln Cys Pro Ala 50 55 60 Leu Cys Glu Cys Ser Glu Ala Ala Arg Thr Val Lys Cys Val Asn Arg 65 70 75 80 Asn Leu Thr Glu Val Pro Thr Asp Leu Pro Ala Tyr Val Arg Asn Leu 85 90 95 Phe Leu Thr Gly Asn Gln Leu Ala Val Leu Pro Ala Gly Ala Phe Ala 100 105 110 Arg Arg Pro Pro Leu Ala Glu Leu Ala Ala Leu Asn Leu Ser Gly Ser 115 120 125 Arg Leu Asp Glu Val Arg Ala Gly Ala Phe Glu His Leu Pro Ser Leu 130 135 140 Arg Gln Leu Asp Leu Ser His Asn Pro Leu Ala Asp Leu Ser Pro Phe 145 150 155 160 Ala Phe Ser Gly Ser Asn Ala Ser Val Ser Ala Pro Ser Pro Leu Val 165 170 175 Glu Leu Ile Leu Asn His Ile Val Pro Pro Glu Asp Glu Arg Gln Asn 180 185 190 Arg Ser Phe Glu Gly Met Val Val Ala 
Ala Leu Leu Ala Gly Arg Ala 195 200 205 Leu Gln Gly Leu Arg Arg Leu Glu Leu Ala Ser Asn His Phe Leu Tyr 210 215 220 Leu Pro Arg Asp Val Leu Ala Gln Leu Pro Ser Leu Arg His Leu Asp 225 230 235 240 Leu Ser Asn Asn Ser Leu Val Ser Leu Thr Tyr Val Ser Phe Arg Asn 245 250 255 Leu Thr His Leu Glu Ser Leu His Leu Glu Asp Asn Ala Leu Lys Val 260 265 270 Leu His Asn Gly Thr Leu Ala Glu Leu Gln Gly Leu Pro His Ile Arg 275 280 285 Val Phe Leu Asp Asn Asn Pro Trp Val Cys Asp Cys His Met Ala Asp 290 295 300 Met Val Thr Trp Leu Lys Glu Thr Glu Val Val Gln Gly Lys Asp Arg 305 310 315 320 Leu Thr Cys Ala Tyr Pro Glu Lys Met Arg Asn Arg Val Leu Leu Glu 325 330 335 Leu Asn Ser Ala Asp Leu Asp Cys Asp Pro Ile Leu Pro Pro Ser Leu 340 345 350 Gln Thr Ser Tyr Val Phe Leu Gly Ile Val Leu Ala Leu Ile Gly Ala 355 360 365 Ile Phe Leu Leu Val Leu Tyr Leu Asn Arg Lys Gly Ile Lys Lys Trp 370 375 380 Met His Asn Ile Arg Asp Ala Cys Arg Asp His Met Glu Gly Tyr His 385 390 395 400 Tyr Arg Tyr Glu Ile Asn Ala Asp Pro Arg Leu Thr Asn Leu Ser Ser 405 410 415 Asn Ser Asp Val 420

* * * * *