PFKs as Modifiers of the IGFR Pathway and Methods of Use Friedman; Lori S. ; et al. [Exelixis, Inc.;]

Patent Application Summary

U.S. patent application number 13/862082 was filed with the patent office on 2013-08-15 for PFKs as modifiers of the IGFR pathway and methods of use. This patent application is currently assigned to EXELIXIS, INC. The applicant listed for this patent is Exelixis, Inc. Invention is credited to Lynn Margaret Bjerke, Helen Francis-Lang, Lori S. Friedman, Timothy S. Heuer, Annette L. Parks, and Kenneth James Shaw.

Application Number	20130212716 13/862082
Family ID	35785569
Filed Date	2013-08-15

United States Patent Application	20130212716
Kind Code	A1
Friedman; Lori S. ; et al.	August 15, 2013

PFKs as Modifiers of the IGFR Pathway and Methods of Use

Abstract

Human PFK genes are identified as modulators of the IGFR pathway, and thus are therapeutic targets for disorders associated with defective IGFR function. Methods for identifying modulators of IGFR, comprising screening for agents that modulate the activity of PFK, are provided.

Inventors:

Friedman; Lori S.; (San Carlos, CA) ; Francis-Lang; Helen; (San Francisco, CA) ; Parks; Annette L.; (Newton, MA) ; Shaw; Kenneth James; (Brisbane, CA) ; Bjerke; Lynn Margaret; (Sutton, GB) ; Heuer; Timothy S.; (El Granada, CA)

Applicant:

Name	City	State	Country	Type
Exelixis, Inc.;			US

Assignee:

EXELIXIS, INC.
South San Francisco
CA

Family ID:

35785569

Appl. No.:

13/862082

Filed:

April 12, 2013

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
11628557	Sep 16, 2008
PCT/US2005/021614	Jun 20, 2005
13862082
60581696	Jun 21, 2004

Current U.S. Class:	800/3 ; 424/158.1; 424/9.2; 435/15; 435/375; 435/6.11; 435/6.13; 435/7.21; 506/11; 506/9
Current CPC Class:	A61P 35/00 20180101; C12Q 1/68 20130101; C12N 9/0071 20130101; G01N 2500/00 20130101; C12Q 1/26 20130101; C12Y 114/11002 20130101; G01N 33/74 20130101; C12Q 1/6886 20130101; A61K 49/0008 20130101; Y10T 436/143333 20150115; A61P 43/00 20180101; G01N 33/574 20130101; A61P 31/00 20180101; C12Q 1/485 20130101
Class at Publication:	800/3 ; 424/158.1; 424/9.2; 435/15; 435/7.21; 435/6.13; 435/375; 435/6.11; 506/9; 506/11
International Class:	C12Q 1/48 20060101 C12Q001/48; G01N 33/74 20060101 G01N033/74; C12Q 1/68 20060101 C12Q001/68; A61K 49/00 20060101 A61K049/00

Claims

1. A method of identifying a candidate IGFR pathway modulating agent, said method comprising the steps of: (a) providing an assay system comprising a PFK polypeptide or nucleic acid; (b) contacting the assay system with a test agent under conditions whereby, but for the presence of the test agent, the system provides a reference activity; and (c) detecting a test agent-biased activity of the assay system, wherein a difference between the test agent-biased activity and the reference activity identifies the test agent as a candidate IGFR pathway modulating agent.

2. The method of claim 1 wherein the assay system comprises cultured cells that express the PFK polypeptide.

3. The method of claim 2 wherein the cultured cells additionally have defective IGFR function.

4. The method of claim 1 wherein the assay system includes a screening assay comprising a PFK polypeptide, and the candidate test agent is a small molecule modulator.

5. The method of claim 4 wherein the assay is a kinase assay.

6. The method of claim 1 wherein the assay system is selected from the group consisting of an apoptosis assay system, a cell proliferation assay system, an angiogenesis assay system, and a hypoxic induction assay system.

7. The method of claim 1 wherein the assay system includes a binding assay comprising a PFK polypeptide and the candidate test agent is an antibody.

8. The method of claim 1 wherein the assay system includes an expression assay comprising a PFK nucleic acid and the candidate test agent is a nucleic acid modulator.

9. The method of claim 8 wherein the nucleic acid modulator is an antisense oligomer.

10. The method of claim 8 wherein the nucleic acid modulator is a PMO.

11. The method of claim 1 additionally comprising: (d) administering the candidate IGFR pathway modulating agent identified in (c) to a model system comprising cells defective in IGFR function and detecting a phenotypic change in the model system that indicates that the IGFR function is restored.

12. The method of claim 11 wherein the model system is a mouse model with defective IGFR function.

13. A method for modulating an IGFR pathway of a cell comprising contacting a cell defective in IGFR function with a candidate modulator that specifically binds to a PFK polypeptide, whereby IGFR function is restored.

14. The method of claim 13 wherein the candidate modulator is administered to a vertebrate animal predetermined to have a disease or disorder resulting from a defect in IGFR function.

15. The method of claim 13 wherein the candidate modulator is selected from the group consisting of an antibody and a small molecule.

16. The method of claim 1, comprising the additional steps of: (d) providing a secondary assay system comprising cultured cells or a non-human animal expressing PFK, (e) contacting the secondary assay system with the test agent of (b) or an agent derived therefrom under conditions whereby, but for the presence of the test agent or agent derived therefrom, the system provides a reference activity; and (f) detecting an agent-biased activity of the second assay system, wherein a difference between the agent-biased activity and the reference activity of the second assay system confirms the test agent or agent derived therefrom as a candidate IGFR pathway modulating agent, and wherein the second assay detects an agent-biased change in the IGFR pathway.

17. The method of claim 16 wherein the secondary assay system comprises cultured cells.

18. The method of claim 16 wherein the secondary assay system comprises a non-human animal.

19. The method of claim 18 wherein the non-human animal mis-expresses an IGFR pathway gene.

20. A method of modulating the IGFR pathway in a mammalian cell comprising contacting the cell with an agent that specifically binds a PFK polypeptide or nucleic acid.

21. The method of claim 20 wherein the agent is administered to a mammalian animal predetermined to have a pathology associated with the IGFR pathway.

22. The method of claim 20 wherein the agent is a small molecule modulator, a nucleic acid modulator, or an antibody.

23. A method for diagnosing a disease in a patient comprising: (a) obtaining a biological sample from the patient; (b) contacting the sample with a probe for PFK expression; (c) comparing results from step (b) with a control; and (d) determining whether step (c) indicates a likelihood of disease.

24. The method of claim 23 wherein said disease is cancer.

25. The method according to claim 24, wherein said cancer is a cancer as shown in Table 1 as having >25% expression level.

Description

REFERENCE TO RELATED APPLICATIONS

[0001] This application claims priority to U.S. provisional patent application 60/581,696 filed Jun. 21, 2004. The contents of the prior application are hereby incorporated in their entirety.

BACKGROUND OF THE INVENTION

[0002] Somatic mutations in the PTEN (Phosphatase and Tensin homolog deleted on chromosome 10) gene are known to cause tumors in a variety of human tissues. In addition, germline mutations in PTEN are the cause of human diseases (Cowden disease and Bannayan-Zonana syndrome) associated with increased risk of breast and thyroid cancer (Nelen M R et al. (1997) Hum Mol Genet, 8:1383-1387; Liaw D et al. (1997) Nat Genet, 1:64-67; Marsh D J et al. (1998) Hum Mol Genet, 3:507-515). PTEN is thought to act as a tumor suppressor by regulating several signaling pathways through the second messenger phosphatidylinositol 3,4,5 triphosphate (PIP3). PTEN dephosphorylates the D3 position of PIP3 and downregulates signaling events dependent on PIP3 levels (Maehama T and Dixon J E (1998) J Biol Chem, 22, 13375-8). In particular, pro-survival pathways downstream of the insulin-like growth factor (IGF) pathway are regulated by PTEN activity. Stimulation of the IGF pathway, or loss of PTEN function, elevates PIP3 levels and activates pro-survival pathways associated with tumorigenesis (Stambolic V et al. (1998) Cell, 95:29-39). Consistent with this model, elevated levels of insulin-like growth factors I and II correlate with increased risk of cancer (Yu H et al (1999) J Natl Cancer Inst 91:151-156) and poor prognosis (Takanami I et al, 1996, J Surg Oncol 61(3):205-8). In addition, increased levels or activity of positive effectors of the IGF pathway, such as Akt and PI(3) kinase, have been implicated in several types of human cancer (Nicholson K M and Anderson N G (2002) Cellular Signalling, 14:381-395).

[0003] In Drosophila melanogaster, as in vertebrates, the Insulin Growth Factor Receptor (IGFR) pathway includes the positive effectors PI(3) kinase, Akt, and PDK and the inhibitor, PTEN. These proteins have been implicated in multiple processes, including the regulation of cell growth and size as well as cell division and survival (Oldham S and Hafen E. (2003) Trends Cell Biol. 13:79-85; Garafolo R S. (2002) Trends Endocr. Metab. 13:156-162; Backman S A et al. (2002) Curr. Op. Neurobio. 12:1-7; Tapon N et al. (2001) Curr Op. Cell Biol. 13:731-737). Activation of the pathway in Drosophila can result in increases in cell size, cell number and organ size (Oldham S et al. (2002) Dev. 129:4103-4109; Prober D A and Edgar B A. (2002) Genes & Dev. 16:2286-2299; Potter C J et al. (2001) Cell 105:357-368; Verdu J et al. (1999) Cell Biol. 1:500-506).

[0004] Phosphofructokinase (PFK) is a tetrameric enzyme that catalyzes a key step in glycolysis, namely the conversion of D-fructose 6-phosphate to D-fructose 1,6-bisphosphate. Separate genes encode a muscle subunit (M) and a liver subunit (L). Muscle Phosphofructokinase (PFKM) is a homotetramer of M subunits, liver type Phosphofructokinase (PFKL) is a homotetramer of L-subunits, while platelet type Phosphofructokinase (PFKP) can be composed of any tetrameric combination of M and L subunits.

[0005] The ability to manipulate the genomes of model organisms such as Drosophila provides a powerful means to analyze biochemical processes that, due to significant evolutionary conservation, have direct relevance to more complex vertebrate organisms. Due to a high level of gene and pathway conservation, the strong similarity of cellular processes, and the functional conservation of genes between these model organisms and mammals, identification of the involvement of novel genes in particular pathways and their functions in such model organisms can directly contribute to the understanding of the correlative pathways and methods of modulating them in mammals (see, for example, Mechler B M et al., 1985 EMBO J 4:1551-1557; Gateff E. 1982 Adv. Cancer Res. 37: 33-74; Watson K L., et al., 1994 J Cell Sci. 18: 19-33; Miklos G L, and Rubin G M. 1996 Cell 86:521-529; Wassarman D A, et al., 1995 Curr Opin Gen Dev 5: 44-50; and Booth D R. 1999 Cancer Metastasis Rev. 18: 261-284). For example, a genetic screen can be carried out in an invertebrate model organism having underexpression (e.g. knockout) or overexpression of a gene (referred to as a "genetic entry point") that yields a visible phenotype. Additional genes are mutated in a random or targeted manner. When a gene mutation changes the original phenotype caused by the mutation in the genetic entry point, the gene is identified as a "modifier" involved in the same or overlapping pathway as the genetic entry point. When the genetic entry point is an ortholog of a human gene implicated in a disease pathway, such as IGFR, modifier genes can be identified that may be attractive candidate targets for novel therapeutics.

[0006] All references cited herein, including patents, patent applications, publications, and sequence information in referenced Genbank identifier numbers, are incorporated herein in their entireties.

SUMMARY OF THE INVENTION

[0007] We have discovered genes that modify the IGFR pathway in Drosophila, and identified their human orthologs, hereinafter referred to as Phosphofructokinase (PFK). The invention provides methods for utilizing these IGFR modifier genes and polypeptides to identify PFK-modulating agents that are candidate therapeutic agents that can be used in the treatment of disorders associated with defective or impaired IGFR function and/or PFK function. Preferred PFK-modulating agents specifically bind to PFK polypeptides and restore IGFR function. Other preferred PFK-modulating agents are nucleic acid modulators such as antisense oligomers and RNAi that repress PFK gene expression or product activity by, for example, binding to and inhibiting the respective nucleic acid (i.e. DNA or mRNA).

[0008] PFK modulating agents may be evaluated by any convenient in vitro or in vivo assay for molecular interaction with a PFK polypeptide or nucleic acid. In one embodiment, candidate PFK modulating agents are tested with an assay system comprising a PFK polypeptide or nucleic acid. Agents that produce a change in the activity of the assay system relative to controls are identified as candidate IGFR modulating agents. The assay system may be cell-based or cell-free. PFK-modulating agents include PFK related proteins (e.g. dominant negative mutants, and biotherapeutics); PFK-specific antibodies; PFK-specific antisense oligomers and other nucleic acid modulators; and chemical agents that specifically bind to or interact with PFK or compete with a PFK binding partner (e.g. by binding to a PFK binding partner). In one specific embodiment, a small molecule modulator is identified using a kinase assay. In specific embodiments, the screening assay system is selected from a binding assay, an apoptosis assay, a cell proliferation assay, an angiogenesis assay, and a hypoxic induction assay.

[0009] In another embodiment, candidate IGFR pathway modulating agents are further tested using a second assay system that detects changes in the IGFR pathway, such as angiogenic, apoptotic, or cell proliferation changes produced by the originally identified candidate agent or an agent derived from the original agent. The second assay system may use cultured cells or non-human animals. In specific embodiments, the secondary assay system uses non-human animals, including animals predetermined to have a disease or disorder implicating the IGFR pathway, such as an angiogenic, apoptotic, or cell proliferation disorder (e.g. cancer).

[0010] The invention further provides methods for modulating the PFK function and/or the IGFR pathway in a mammalian cell by contacting the mammalian cell with an agent that specifically binds a PFK polypeptide or nucleic acid. The agent may be a small molecule modulator, a nucleic acid modulator, or an antibody and may be administered to a mammalian animal predetermined to have a pathology associated with the IGFR pathway.

DETAILED DESCRIPTION OF THE INVENTION

[0011] A dominant loss of function screen was carried out in Drosophila to identify genes that interact with or modulate the IGFR signaling pathway. Modifiers of the IGFR pathway and their orthologs were identified. The Drosophila PFK gene was identified as a modifier of the IGFR pathway. Accordingly, vertebrate orthologs of this modifier, and preferably the human orthologs, PFK genes (i.e., nucleic acids and polypeptides) are attractive drug targets for the treatment of pathologies associated with a defective IGFR signaling pathway, such as cancer.

[0012] In vitro and in vivo methods of assessing PFK function are provided herein. Modulation of the PFK or their respective binding partners is useful for understanding the association of the IGFR pathway and its members in normal and disease conditions and for developing diagnostics and therapeutic modalities for IGFR related pathologies. PFK-modulating agents that act by inhibiting or enhancing PFK expression, directly or indirectly, for example, by affecting a PFK function such as enzymatic (e.g., catalytic) or binding activity, can be identified using methods provided herein. PFK modulating agents are useful in diagnosis, therapy and pharmaceutical development.

[0013] Nucleic Acids and Polypeptides of the Invention

[0014] Sequences related to PFK nucleic acids and polypeptides that can be used in the invention are disclosed in Genbank (referenced by Genbank identifier (GI) number) as GI#s 50346003 (SEQ ID NO:1), 20560995 (SEQ ID NO:2), 21361069 (SEQ ID NO:3), 14602831 (SEQ ID NO:4), 17390696 (SEQ ID NO:5), 13623608 (SEQ ID NO:6), 14043100 (SEQ ID NO:7), 35391 (SEQ ID NO:8), 35393 (SEQ ID NO:9), 35395 (SEQ ID NO:10), 35397 (SEQ ID NO:11), 35399 (SEQ ID NO:12), 35400 (SEQ ID NO:13), 35402 (SEQ ID NO:14), 35405 (SEQ ID NO:15), 35407 (SEQ ID NO:16), 35409 (SEQ ID NO:17), 35411 (SEQ ID NO:18), 35413 (SEQ ID NO:19), 35415 (SEQ ID NO:20), 35417 (SEQ ID NO:21), 35419 (SEQ ID NO:22), 35421 (SEQ ID NO:23), 35423 (SEQ ID NO:24), 35424 (SEQ ID NO:25), 35426 (SEQ ID NO:26), 35428 (SEQ ID NO:27), 21749869 (SEQ ID NO:28), 50346004 (SEQ ID NO:29), 39725712 (SEQ ID NO:30), 4505748 (SEQ ID NO:31), 12653524 (SEQ ID NO:32), 15342052 (SEQ ID NO:33), 15215396 (SEQ ID NO:34), 18203736 (SEQ ID NO:35), 188633 (SEQ ID NO:36), 188634 (SEQ ID NO:37), 188635 (SEQ ID NO:38), 188636 (SEQ ID NO:39), 188637 (SEQ ID NO:40), 188638 (SEQ ID NO:41), 188639 (SEQ ID NO:42), 188640 (SEQ ID NO:43), 188641 (SEQ ID NO:44), 188642 (SEQ ID NO:45), 188643 (SEQ ID NO:46), 188644 (SEQ ID NO:47), 188645 (SEQ ID NO:48), 188646 (SEQ ID NO:49), 188647 (SEQ ID NO:50), 188648 (SEQ ID NO:51), 188649 (SEQ ID NO:52), 188650 (SEQ ID NO:53), 188651 (SEQ ID NO:54), 188652 (SEQ ID NO:55), 188653 (SEQ ID NO:56), 188654 (SEQ ID NO:57), 188655 (SEQ ID NO:58), 3964478 (SEQ ID NO:59), 41352062 (SEQ ID NO:60), 11321600 (SEQ ID NO:61), 12803424 (SEQ ID NO:62), and 20810529 (SEQ ID NO:63) for nucleic acid, and GI#s 21361070 (SEQ ID NO:64), 4505749 (SEQ ID NO:65), and 11321601 (SEQ ID NO:66) for polypeptide sequences.

[0015] The term "PFK polypeptide" refers to a full-length PFK protein or a functionally active fragment or derivative thereof. A "functionally active" PFK fragment or derivative exhibits one or more functional activities associated with a full-length, wild-type PFK protein, such as antigenic or immunogenic activity, enzymatic activity, ability to bind natural cellular substrates, etc. The functional activity of PFK proteins, derivatives and fragments can be assayed by various methods known to one skilled in the art (Current Protocols in Protein Science (1998) Coligan et al., eds., John Wiley & Sons, Inc., Somerset, N.J.) and as further discussed below. In one embodiment, a functionally active PFK polypeptide is a PFK derivative capable of rescuing defective endogenous PFK activity, such as in cell based or animal assays; the rescuing derivative may be from the same or a different species. For purposes herein, functionally active fragments also include those fragments that comprise one or more structural domains of a PFK, such as a kinase domain or a binding domain. Protein domains can be identified using the PFAM program (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2). For example, the Phosphofructokinase domain (PFAM 00365) of PFK from GI#21361070 (SEQ ID NO:64) is located at approximately amino acid residues 76 to 373 and 448 to 735; the Phosphofructokinase domain of PFK from GI#4505749 (SEQ ID NO:65) is located at approximately amino acid residues 16 to 326 and 401 to 689; and the Phosphofructokinase domain of PFK from GI#11321601 (SEQ ID NO:66) is located at approximately amino acid residues 25 to 335 and 412 to 699. Methods for obtaining PFK polypeptides are also further described below. In some embodiments, preferred fragments are functionally active, domain-containing fragments comprising at least 25 contiguous amino acids, preferably at least 50, more preferably 75, and most preferably at least 100 contiguous amino acids of a PFK. In further preferred embodiments, the fragment comprises the entire functionally active domain.
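By way of illustration only, the fragment criteria above can be expressed as a short Python sketch. The domain boundaries are the approximate residue ranges quoted in this paragraph; the function names, the assumption of 1-based inclusive residue numbering, and the example fragment are hypothetical and serve only to show the containment and length checks.

```python
# Approximate Phosphofructokinase domain boundaries (PFAM 00365) quoted in the
# text, keyed by SEQ ID NO. Residue numbers are assumed 1-based and inclusive.
PFK_DOMAINS = {
    64: [(76, 373), (448, 735)],
    65: [(16, 326), (401, 689)],
    66: [(25, 335), (412, 699)],
}


def domains_fully_contained(seq_id, frag_start, frag_end):
    """Return the listed domains that lie entirely inside the fragment."""
    return [
        (start, end)
        for (start, end) in PFK_DOMAINS[seq_id]
        if frag_start <= start and end <= frag_end
    ]


def meets_minimum_length(frag_start, frag_end, minimum=25):
    """Check the 'at least N contiguous amino acids' criterion (default 25)."""
    return (frag_end - frag_start + 1) >= minimum


# Hypothetical fragment spanning residues 1-350 of SEQ ID NO:65.
print(domains_fully_contained(65, 1, 350))
print(meets_minimum_length(1, 350, minimum=100))
```

The sketch checks sequence coordinates only; whether a fragment is functionally active must still be established by assay.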

[0016] The term "PFK nucleic acid" refers to a DNA or RNA molecule that encodes a PFK polypeptide. Preferably, the PFK polypeptide or nucleic acid or fragment thereof is from a human, but can also be an ortholog, or derivative thereof with at least 70% sequence identity, preferably at least 80%, more preferably 85%, still more preferably 90%, and most preferably at least 95% sequence identity with human PFK. Methods of identifying orthologs are known in the art. Normally, orthologs in different species retain the same function, due to presence of one or more protein motifs and/or 3-dimensional structures. Orthologs are generally identified by sequence homology analysis, such as BLAST analysis, usually using protein bait sequences. Sequences are assigned as a potential ortholog if the best hit sequence from the forward BLAST result retrieves the original query sequence in the reverse BLAST (Huynen M A and Bork P, Proc Natl Acad Sci (1998) 95:5849-5856; Huynen M A et al., Genome Research (2000) 10:1204-1210). Programs for multiple sequence alignment, such as CLUSTAL (Thompson J D et al, 1994, Nucleic Acids Res 22:4673-4680) may be used to highlight conserved regions and/or residues of orthologous proteins and to generate phylogenetic trees. In a phylogenetic tree representing multiple homologous sequences from diverse species (e.g., retrieved through BLAST analysis), orthologous sequences from two species generally appear closest on the tree with respect to all other sequences from these two species. Structural threading or other analysis of protein folding (e.g., using software by ProCeryon Biosciences, Salzburg, Austria) may also identify potential orthologs. In evolution, when a gene duplication event follows speciation, a single gene in one species, such as Drosophila, may correspond to multiple genes (paralogs) in another, such as human. As used herein, the term "orthologs" encompasses paralogs.
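By way of illustration only, the reciprocal-BLAST criterion described above can be sketched in Python as follows. The sketch assumes that the best hit for each sequence has already been extracted from the forward and reverse BLAST results into two dictionaries; the function name and all sequence identifiers are hypothetical and are not taken from any BLAST tool or database.

```python
def reciprocal_best_hits(forward_best, reverse_best):
    """Return (query, hit) pairs in which each sequence is the other's best hit.

    forward_best: maps each query ID (species A) to its best-hit ID in species B.
    reverse_best: maps each species-B ID to its best-hit ID in species A.
    """
    pairs = []
    for query, hit in forward_best.items():
        # Keep the pair only if the reverse search retrieves the original query.
        if reverse_best.get(hit) == query:
            pairs.append((query, hit))
    return sorted(pairs)


# Hypothetical identifiers, for illustration only.
forward_best = {"fly_Pfk": "human_PFKM", "fly_geneX": "human_geneY"}
reverse_best = {"human_PFKM": "fly_Pfk", "human_geneY": "fly_geneZ"}
print(reciprocal_best_hits(forward_best, reverse_best))
```

A strict one-to-one rule of this kind returns at most one partner per query, so paralogs arising from a duplication after speciation would require a relaxed criterion that this sketch does not attempt.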
As used herein, "percent (%) sequence identity" with respect to a subject sequence, or a specified portion of a subject sequence, is defined as the percentage of nucleotides or amino acids in the candidate derivative sequence identical with the nucleotides or amino acids in the subject sequence (or specified portion thereof), after aligning the sequences and introducing gaps, if necessary to achieve the maximum percent sequence identity, as generated by the program WU-BLAST-2.0a19 (Altschul et al., J. Mol. Biol. (1997) 215:403-410) with all the search parameters set to default values. The HSP S and HSP S2 parameters are dynamic values and are established by the program itself depending upon the composition of the particular sequence and composition of the particular database against which the sequence of interest is being searched. A % identity value is determined by the number of matching identical nucleotides or amino acids divided by the sequence length for which the percent identity is being reported. "Percent (%) amino acid sequence similarity" is determined by doing the same calculation as for determining % amino acid sequence identity, but including conservative amino acid substitutions in addition to identical amino acids in the computation.

[0017] A conservative amino acid substitution is one in which an amino acid is substituted for another amino acid having similar properties such that the folding or activity of the protein is not significantly affected. Aromatic amino acids that can be substituted for each other are phenylalanine, tryptophan, and tyrosine; interchangeable hydrophobic amino acids are leucine, isoleucine, methionine, and valine; interchangeable polar amino acids are glutamine and asparagine; interchangeable basic amino acids are arginine, lysine and histidine; interchangeable acidic amino acids are aspartic acid and glutamic acid; and interchangeable small amino acids are alanine, serine, threonine, cysteine and glycine.
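By way of illustration only, the percent identity and percent similarity calculations defined above can be sketched in Python as follows. The sketch assumes the two sequences have already been aligned (for example, by a BLAST-type program) and are supplied as equal-length strings with "-" marking gaps, and it assumes the denominator is the ungapped length of the subject sequence. The substitution groups are those listed in the preceding paragraph; the function names and example sequences are hypothetical.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("FWY"),    # aromatic: phenylalanine, tryptophan, tyrosine
    set("LIMV"),   # hydrophobic: leucine, isoleucine, methionine, valine
    set("QN"),     # polar: glutamine, asparagine
    set("RKH"),    # basic: arginine, lysine, histidine
    set("DE"),     # acidic: aspartic acid, glutamic acid
    set("ASTCG"),  # small: alanine, serine, threonine, cysteine, glycine
]


def is_conservative(a, b):
    """True if residues a and b fall in the same substitution group."""
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


def percent_identity_and_similarity(aligned_candidate, aligned_subject):
    """Return (% identity, % similarity) for two rows of one pairwise alignment."""
    if len(aligned_candidate) != len(aligned_subject):
        raise ValueError("aligned sequences must have equal length")
    # Assumed denominator: length of the subject sequence, gaps excluded.
    subject_length = sum(1 for s in aligned_subject if s != "-")
    if subject_length == 0:
        raise ValueError("subject sequence is empty")
    identical = 0
    similar = 0
    for c, s in zip(aligned_candidate, aligned_subject):
        if c == "-" or s == "-":
            continue  # gapped columns count as neither identical nor similar
        if c == s:
            identical += 1
            similar += 1
        elif is_conservative(c, s):
            similar += 1
    return 100.0 * identical / subject_length, 100.0 * similar / subject_length


# Hypothetical aligned peptide fragments.
identity, similarity = percent_identity_and_similarity("MKV-LDE", "MRVALDD")
print(round(identity, 1), round(similarity, 1))
```

Other choices of denominator (alignment length, or the shorter sequence) give different percentages, so the convention should be stated alongside any reported value.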

[0018] Alternatively, an alignment for nucleic acid sequences is provided by the local homology algorithm of Smith and Waterman (Smith and Waterman, 1981, Advances in Applied Mathematics 2:482-489; database: European Bioinformatics Institute; Smith and Waterman, 1981, J. of Molec. Biol., 147:195-197; Nicholas et al.; 1998, "A Tutorial on Searching Sequence Databases and Sequence Scoring Methods" (www.psc.edu) and references cited therein; W. R. Pearson, 1991, Genomics 11:635-650). This algorithm can be applied to amino acid sequences by using the scoring matrix developed by Dayhoff (Dayhoff: Atlas of Protein Sequences and Structure, M. O. Dayhoff ed., 5 suppl. 3:353-358, National Biomedical Research Foundation, Washington, D.C., USA), and normalized by Gribskov (Gribskov 1986 Nucl. Acids Res. 14(6):6745-6763). The Smith-Waterman algorithm may be employed where default parameters are used for scoring (for example, gap open penalty of 12, gap extension penalty of two). From the data generated, the "Match" value reflects "sequence identity."
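By way of illustration only, a local alignment score of the Smith-Waterman type with the gap penalties mentioned above (gap open 12, gap extension 2) can be computed as sketched below in Python. The sketch uses the affine-gap recurrence commonly attributed to Gotoh and returns only the best score, without traceback. The match and mismatch scores are placeholder assumptions standing in for a scoring matrix, and the sketch assumes a gap of length k costs gap_open + (k - 1) * gap_extend; other implementations may define the penalties differently.

```python
def smith_waterman_score(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=2):
    """Best local alignment score of sequences a and b with affine gap penalties."""
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # H: best score of a local alignment ending at cell (i, j).
    # E: best score ending with a gap in `a` (horizontal move).
    # F: best score ending with a gap in `b` (vertical move).
    H = [[0] * cols for _ in range(rows)]
    E = [[neg_inf] * cols for _ in range(rows)]
    F = [[neg_inf] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Either open a new gap from H or extend an existing one.
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            pair_score = match if a[i - 1] == b[j - 1] else mismatch
            diagonal = H[i - 1][j - 1] + pair_score
            # The floor of 0 is what makes the alignment local.
            H[i][j] = max(0, diagonal, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


# Hypothetical nucleotide sequences; the second carries a 3-base insertion.
print(smith_waterman_score("AAAAAAAATTTTTTTT", "AAAAAAAACCCTTTTTTTT"))
```

For amino acid sequences, the match and mismatch arguments would be replaced by a lookup in a substitution matrix such as the Dayhoff matrix referenced above.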

[0019] Derivative nucleic acid molecules of the subject nucleic acid molecules include sequences that hybridize to the nucleic acid sequence of a PFK. The stringency of hybridization can be controlled by temperature, ionic strength, pH, and the presence of denaturing agents such as formamide during hybridization and washing. Conditions routinely used are set out in readily available procedure texts (e.g., Current Protocol in Molecular Biology, Vol. 1, Chap. 2.10, John Wiley & Sons, Publishers (1994); Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). In some embodiments, a nucleic acid molecule of the invention is capable of hybridizing to a nucleic acid molecule containing the nucleotide sequence of a PFK under high stringency hybridization conditions that are: prehybridization of filters containing nucleic acid for 8 hours to overnight at 65 °C in a solution comprising 6× single strength citrate (SSC) (1× SSC is 0.15 M NaCl, 0.015 M Na citrate; pH 7.0), 5× Denhardt's solution, 0.05% sodium pyrophosphate and 100 µg/ml herring sperm DNA; hybridization for 18-20 hours at 65 °C in a solution containing 6× SSC, 1× Denhardt's solution, 100 µg/ml yeast tRNA and 0.05% sodium pyrophosphate; and washing of filters at 65 °C for 1 h in a solution containing 0.1× SSC and 0.1% SDS (sodium dodecyl sulfate).

[0020] In other embodiments, moderately stringent hybridization conditions are used that are: pretreatment of filters containing nucleic acid for 6 h at 40 °C in a solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.1% PVP, 0.1% Ficoll, 1% BSA, and 500 µg/ml denatured salmon sperm DNA; hybridization for 18-20 h at 40 °C in a solution containing 35% formamide, 5× SSC, 50 mM Tris-HCl (pH 7.5), 5 mM EDTA, 0.02% PVP, 0.02% Ficoll, 0.2% BSA, 100 µg/ml salmon sperm DNA, and 10% (wt/vol) dextran sulfate; followed by washing twice for 1 hour at 55 °C in a solution containing 2× SSC and 0.1% SDS.

[0021] Alternatively, low stringency conditions can be used that are: incubation for 8 hours to overnight at 37 °C in a solution comprising 20% formamide, 5× SSC, 50 mM sodium phosphate (pH 7.6), 5× Denhardt's solution, 10% dextran sulfate, and 20 µg/ml denatured sheared salmon sperm DNA; hybridization in the same buffer for 18 to 20 hours; and washing of filters in 1× SSC at about 37 °C for 1 hour.
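By way of illustration only, the salt concentrations implied by the SSC strengths in the three hybridization protocols above can be worked out with a short Python sketch. The 1× composition (0.15 M NaCl, 0.015 M Na citrate) and the wash strengths and temperatures are taken from the text; the function and variable names are hypothetical.

```python
# Composition of 1x SSC as stated in the text (molar concentrations).
SSC_1X = {"NaCl": 0.15, "Na citrate": 0.015}


def ssc_molarity(fold):
    """Molar composition of an SSC solution at the given fold concentration."""
    return {solute: round(fold * molar, 6) for solute, molar in SSC_1X.items()}


# Wash steps described above: (SSC fold concentration, temperature in deg C).
WASH_CONDITIONS = {
    "high stringency": (0.1, 65),
    "moderate stringency": (2.0, 55),
    "low stringency": (1.0, 37),
}

for label, (fold, temp_c) in WASH_CONDITIONS.items():
    print(label, temp_c, ssc_molarity(fold))
```

The high stringency wash pairs the lowest salt concentration with the highest temperature, which is consistent with the statement above that stringency is controlled by temperature and ionic strength.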

[0022] Isolation, Production, Expression, and Mis-Expression of PFK Nucleic Acids and Polypeptides

[0023] PFK nucleic acids and polypeptides are useful for identifying and testing agents that modulate PFK function and for other applications related to the involvement of PFK in the IGFR pathway. PFK nucleic acids and derivatives and orthologs thereof may be obtained using any available method. For instance, techniques for isolating cDNA or genomic DNA sequences of interest by screening DNA libraries or by using polymerase chain reaction (PCR) are well known in the art. In general, the particular use for the protein will dictate the particulars of expression, production, and purification methods. For instance, production of proteins for use in screening for modulating agents may require methods that preserve specific biological activities of these proteins, whereas production of proteins for antibody generation may require structural integrity of particular epitopes. Expression of proteins to be purified for screening or antibody production may require the addition of specific tags (e.g., generation of fusion proteins). Overexpression of a PFK protein for assays used to assess PFK function, such as involvement in cell cycle regulation or hypoxic response, may require expression in eukaryotic cell lines capable of these cellular activities. Techniques for the expression, production, and purification of proteins are well known in the art; any suitable means therefor may be used (e.g., Higgins S J and Hames B D (eds.) Protein Expression: A Practical Approach, Oxford University Press Inc., New York 1999; Stanbury P F et al., Principles of Fermentation Technology, 2nd edition, Elsevier Science, New York, 1995; Doonan S (ed.) Protein Purification Protocols, Humana Press, New Jersey, 1996; Coligan J E et al, Current Protocols in Protein Science (eds.), 1999, John Wiley & Sons, New York). In particular embodiments, recombinant PFK is expressed in a cell line known to have defective IGFR function. The recombinant cells are used in cell-based screening assay systems of the invention, as described further below.

[0024] The nucleotide sequence encoding a PFK polypeptide can be inserted into any appropriate expression vector. The necessary transcriptional and translational signals, including promoter/enhancer element, can derive from the native PFK gene and/or its flanking regions or can be heterologous. A variety of host-vector expression systems may be utilized, such as mammalian cell systems infected with virus (e.g. vaccinia virus, adenovirus, etc.); insect cell systems infected with virus (e.g. baculovirus); microorganisms such as yeast containing yeast vectors, or bacteria transformed with bacteriophage, plasmid, or cosmid DNA. An isolated host cell strain that modulates the expression of, modifies, and/or specifically processes the gene product may be used.

[0025] To detect expression of the PFK gene product, the expression vector can comprise a promoter operably linked to a PFK gene nucleic acid, one or more origins of replication, and one or more selectable markers (e.g. thymidine kinase activity, resistance to antibiotics, etc.). Alternatively, recombinant expression vectors can be identified by assaying for the expression of the PFK gene product based on the physical or functional properties of the PFK protein in in vitro assay systems (e.g. immunoassays).

[0026] The PFK protein, fragment, or derivative may be optionally expressed as a fusion, or chimeric protein product (i.e. it is joined via a peptide bond to a heterologous protein sequence of a different protein), for example to facilitate purification or detection. A chimeric product can be made by ligating the appropriate nucleic acid sequences encoding the desired amino acid sequences to each other using standard methods and expressing the chimeric product. A chimeric product may also be made by protein synthetic techniques, e.g. by use of a peptide synthesizer (Hunkapiller et al., Nature (1984) 310:105-111).

[0027] Once a recombinant cell that expresses the PFK gene sequence is identified, the gene product can be isolated and purified using standard methods (e.g. ion exchange, affinity, and gel exclusion chromatography; centrifugation; differential solubility; electrophoresis). Alternatively, native PFK proteins can be purified from natural sources, by standard methods (e.g. immunoaffinity purification). Once a protein is obtained, it may be quantified and its activity measured by appropriate methods, such as immunoassay, bioassay, or other measurements of physical properties, such as crystallography.

[0028] The methods of this invention may also use cells that have been engineered for altered expression (mis-expression) of PFK or other genes associated with the IGFR pathway. As used herein, mis-expression encompasses ectopic expression, over-expression, under-expression, and non-expression (e.g. by gene knock-out or blocking expression that would otherwise normally occur).

[0029] Genetically Modified Animals

[0030] Animal models that have been genetically modified to alter PFK expression may be used in in vivo assays to test for activity of a candidate IGFR modulating agent, or to further assess the role of PFK in an IGFR pathway process such as apoptosis or cell proliferation. Preferably, the altered PFK expression results in a detectable phenotype, such as decreased or increased levels of cell proliferation, angiogenesis, or apoptosis compared to control animals having normal PFK expression. The genetically modified animal may additionally have altered IGFR expression (e.g. IGFR knockout). Preferred genetically modified animals are mammals such as primates and rodents (preferably mice or rats), among others. Preferred non-mammalian species include zebrafish, C. elegans, and Drosophila. Preferred genetically modified animals are transgenic animals having a heterologous nucleic acid sequence present as an extrachromosomal element in a portion of their cells, i.e. mosaic animals (see, for example, techniques described by Jakobovits, 1994, Curr. Biol. 4:761-763) or stably integrated into their germ line DNA (i.e., in the genomic sequence of most or all of their cells). Heterologous nucleic acid is introduced into the germ line of such transgenic animals by genetic manipulation of, for example, embryos or embryonic stem cells of the host animal.

[0031] Methods of making transgenic animals are well-known in the art (for transgenic mice see Brinster et al., Proc. Nat. Acad. Sci. USA 82: 4438-4442 (1985), U.S. Pat. Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Pat. No. 4,873,191 by Wagner et al., and Hogan, B., Manipulating the Mouse Embryo, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., (1986); for particle bombardment see U.S. Pat. No. 4,945,050, by Sandford et al.; for transgenic Drosophila see Rubin and Spradling, Science (1982) 218:348-53 and U.S. Pat. No. 4,670,388; for transgenic insects see Berghammer A. J. et al., A Universal Marker for Transgenic Insects (1999) Nature 402:370-371; for transgenic Zebrafish see Lin S., Transgenic Zebrafish, Methods Mol Biol. (2000) 136:375-383; for microinjection procedures for fish, amphibian eggs and birds see Houdebine and Chourrout, Experientia (1991) 47:897-905; for transgenic rats see Hammer et al., Cell (1990) 63:1099-1112; and for culturing of embryonic stem (ES) cells and the subsequent production of transgenic animals by the introduction of DNA into ES cells using methods such as electroporation, calcium phosphate/DNA precipitation and direct injection see, e.g., Teratocarcinomas and Embryonic Stem Cells, A Practical Approach, E. J. Robertson, ed., IRL Press (1987)). Clones of the nonhuman transgenic animals can be produced according to available methods (see Wilmut, I. et al. (1997) Nature 385:810-813; and PCT International Publication Nos. WO 97/07668 and WO 97/07669).

[0032] In one embodiment, the transgenic animal is a "knock-out" animal having a heterozygous or homozygous alteration in the sequence of an endogenous PFK gene that results in a decrease of PFK function, preferably such that PFK expression is undetectable or insignificant. Knock-out animals are typically generated by homologous recombination with a vector comprising a transgene having at least a portion of the gene to be knocked out. Typically a deletion, addition or substitution has been introduced into the transgene to functionally disrupt it. The transgene can be a human gene (e.g., from a human genomic clone) but more preferably is an ortholog of the human gene derived from the transgenic host species. For example, a mouse PFK gene is used to construct a homologous recombination vector suitable for altering an endogenous PFK gene in the mouse genome. Detailed methodologies for homologous recombination in mice are available (see Capecchi, Science (1989) 244:1288-1292; Joyner et al., Nature (1989) 338:153-156). Procedures for the production of non-rodent transgenic mammals and other animals are also available (Houdebine and Chourrout, supra; Pursel et al., Science (1989) 244:1281-1288; Simms et al., Bio/Technology (1988) 6:179-183). In a preferred embodiment, knock-out animals, such as mice harboring a knockout of a specific gene, may be used to produce antibodies against the human counterpart of the gene that has been knocked out (Claesson M H et al., (1994) Scand J Immunol 40:257-264; Declerck P J et al., (1995) J Biol Chem. 270:8397-400).

[0033] In another embodiment, the transgenic animal is a "knock-in" animal having an alteration in its genome that results in altered expression (e.g., increased (including ectopic) or decreased expression) of the PFK gene, e.g., by introduction of additional copies of PFK, or by operatively inserting a regulatory sequence that provides for altered expression of an endogenous copy of the PFK gene. Such regulatory sequences include inducible, tissue-specific, and constitutive promoters and enhancer elements. The knock-in can be homozygous or heterozygous.

[0034] Transgenic nonhuman animals can also be produced that contain selected systems allowing for regulated expression of the transgene. One example of such a system that may be produced is the cre/loxP recombinase system of bacteriophage P1 (Lakso et al., PNAS (1992) 89:6232-6236; U.S. Pat. No. 4,959,317). If a cre/loxP recombinase system is used to regulate expression of the transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein are required. Such animals can be provided through the construction of "double" transgenic animals, e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and the other containing a transgene encoding a recombinase. Another example of a recombinase system is the FLP recombinase system of Saccharomyces cerevisiae (O'Gorman et al. (1991) Science 251:1351-1355; U.S. Pat. No. 5,654,182). In a preferred embodiment, both Cre-LoxP and Flp-Frt are used in the same system to regulate expression of the transgene, and for sequential deletion of vector sequences in the same cell (Sun X et al (2000) Nat Genet 25:83-6).

[0035] The genetically modified animals can be used in genetic studies to further elucidate the IGFR pathway, as animal models of disease and disorders implicating defective IGFR function, and for in vivo testing of candidate therapeutic agents, such as those identified in screens described below. The candidate therapeutic agents are administered to a genetically modified animal having altered PFK function and phenotypic changes are compared with appropriate control animals such as genetically modified animals that receive placebo treatment, and/or animals with unaltered PFK expression that receive candidate therapeutic agent.

[0036] In addition to the above-described genetically modified animals having altered PFK function, animal models having defective IGFR function (and otherwise normal PFK function) can be used in the methods of the present invention. For example, an IGFR knockout mouse can be used to assess, in vivo, the activity of a candidate IGFR modulating agent identified in one of the in vitro assays described below. Preferably, the candidate IGFR modulating agent, when administered to a model system with cells defective in IGFR function, produces a detectable phenotypic change in the model system indicating that the IGFR function is restored, i.e., the cells exhibit normal cell cycle progression.

[0037] Modulating Agents

[0038] The invention provides methods to identify agents that interact with and/or modulate the function of PFK and/or the IGFR pathway. Modulating agents identified by the methods are also part of the invention. Such agents are useful in a variety of diagnostic and therapeutic applications associated with the IGFR pathway, as well as in further analysis of the PFK protein and its contribution to the IGFR pathway. Accordingly, the invention also provides methods for modulating the IGFR pathway comprising the step of specifically modulating PFK activity by administering a PFK-interacting or -modulating agent.

[0039] As used herein, a "PFK-modulating agent" is any agent that modulates PFK function, for example, an agent that interacts with PFK to inhibit or enhance PFK activity or otherwise affect normal PFK function. PFK function can be affected at any level, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In a preferred embodiment, the PFK-modulating agent specifically modulates the function of the PFK. The phrases "specific modulating agent", "specifically modulates", etc., are used herein to refer to modulating agents that directly bind to the PFK polypeptide or nucleic acid, and preferably inhibit, enhance, or otherwise alter, the function of the PFK. These phrases also encompass modulating agents that alter the interaction of the PFK with a binding partner, substrate, or cofactor (e.g. by binding to a binding partner of a PFK, or to a protein/binding partner complex, and altering PFK function). In a further preferred embodiment, the PFK-modulating agent is a modulator of the IGFR pathway (e.g. it restores and/or upregulates IGFR function) and thus is also an IGFR-modulating agent.

[0040] Preferred PFK-modulating agents include small molecule compounds; PFK-interacting proteins, including antibodies and other biotherapeutics; and nucleic acid modulators such as antisense and RNA inhibitors. The modulating agents may be formulated in pharmaceutical compositions, for example, as compositions that may comprise other active ingredients, as in combination therapy, and/or suitable carriers or excipients. Techniques for formulation and administration of the compounds may be found in "Remington's Pharmaceutical Sciences" Mack Publishing Co., Easton, Pa., 19.sup.th edition.

[0041] Small Molecule Modulators

[0042] Small molecules are often preferred to modulate the function of proteins that have enzymatic function and/or contain protein interaction domains. Chemical agents, referred to in the art as "small molecule" compounds, are typically organic, non-peptide molecules, having a molecular weight up to 10,000, preferably up to 5,000, more preferably up to 1,000, and most preferably up to 500 daltons. This class of modulators includes chemically synthesized molecules, for instance, compounds from combinatorial chemical libraries. Synthetic compounds may be rationally designed or identified based on known or inferred properties of the PFK protein or may be identified by screening compound libraries. Alternative appropriate modulators of this class are natural products, particularly secondary metabolites from organisms such as plants or fungi, which can also be identified by screening compound libraries for PFK-modulating activity. Methods for generating and obtaining compounds are well known in the art (Schreiber S L, Science (2000) 287:1964-1969; Radmann J and Gunther J, Science (2000) 287:1947-1948).

[0043] Small molecule modulators identified from screening assays, as described below, can be used as lead compounds from which candidate clinical compounds may be designed, optimized, and synthesized. Such clinical compounds may have utility in treating pathologies associated with the IGFR pathway. The activity of candidate small molecule modulating agents may be improved several-fold through iterative secondary functional validation, as further described below, structure determination, and candidate modulator modification and testing. Additionally, candidate clinical compounds are generated with specific regard to clinical and pharmacological properties. For example, the reagents may be derivatized and re-screened using in vitro and in vivo assays to optimize activity and minimize toxicity for pharmaceutical development.

[0044] Protein Modulators

[0045] Specific PFK-interacting proteins are useful in a variety of diagnostic and therapeutic applications related to the IGFR pathway and related disorders, as well as in validation assays for other PFK-modulating agents. In a preferred embodiment, PFK-interacting proteins affect normal PFK function, including transcription, protein expression, protein localization, and cellular or extra-cellular activity. In another embodiment, PFK-interacting proteins are useful in detecting and providing information about the function of PFK proteins, as is relevant to IGFR-related disorders, such as cancer (e.g., for diagnostic means).

[0046] A PFK-interacting protein may be endogenous, i.e. one that naturally interacts genetically or biochemically with a PFK, such as a member of the PFK pathway that modulates PFK expression, localization, and/or activity. PFK-modulators include dominant negative forms of PFK-interacting proteins and of PFK proteins themselves. Yeast two-hybrid and variant screens offer preferred methods for identifying endogenous PFK-interacting proteins (Finley, R. L. et al. (1996) in DNA Cloning-Expression Systems: A Practical Approach, eds. Glover D. & Hames B. D (Oxford University Press, Oxford, England), pp. 169-203; Fashena S J et al., Gene (2000) 250:1-14; Drees B L Curr Opin Chem Biol (1999) 3:64-70; Vidal M and Legrain P Nucleic Acids Res (1999) 27:919-29; and U.S. Pat. No. 5,928,868). Mass spectrometry is an alternative preferred method for the elucidation of protein complexes (reviewed in, e.g., Pandey A and Mann M, Nature (2000) 405:837-846; Yates J R 3.sup.rd, Trends Genet (2000) 16:5-8).

[0047] A PFK-interacting protein may be an exogenous protein, such as a PFK-specific antibody or a T-cell antigen receptor (see, e.g., Harlow and Lane (1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory; Harlow and Lane (1999) Using antibodies: a laboratory manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press). PFK antibodies are further discussed below.

[0048] In preferred embodiments, a PFK-interacting protein specifically binds a PFK protein. In alternative preferred embodiments, a PFK-modulating agent binds a PFK substrate, binding partner, or cofactor.

[0049] Antibodies

[0050] In another embodiment, the protein modulator is a PFK specific antibody agonist or antagonist. The antibodies have therapeutic and diagnostic utilities, and can be used in screening assays to identify PFK modulators. The antibodies can also be used in dissecting the portions of the PFK pathway responsible for various cellular responses and in the general processing and maturation of the PFK.

[0051] Antibodies that specifically bind PFK polypeptides can be generated using known methods. Preferably the antibody is specific to a mammalian ortholog of PFK polypeptide, and more preferably, to human PFK. Antibodies may be polyclonal, monoclonal (mAbs), humanized or chimeric antibodies, single chain antibodies, Fab fragments, F(ab').sub.2 fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any of the above. Epitopes of PFK which are particularly antigenic can be selected, for example, by routine screening of PFK polypeptides for antigenicity or by applying a theoretical method for selecting antigenic regions of a protein (Hopp and Wood (1981), Proc. Natl. Acad. Sci. U.S.A. 78:3824-28; Hopp and Wood, (1983) Mol. Immunol. 20:483-89; Sutcliffe et al., (1983) Science 219:660-66) to the amino acid sequence of a PFK. Monoclonal antibodies with affinities of 10.sup.8 M.sup.-1, preferably 10.sup.9 M.sup.-1 to 10.sup.10 M.sup.-1, or stronger can be made by standard procedures as described (Harlow and Lane, supra; Goding (1986) Monoclonal Antibodies Principles and Practice (2d ed) Academic Press, New York; and U.S. Pat. Nos. 4,381,292; 4,451,570; and 4,618,577). Antibodies may be generated against crude cell extracts of PFK or substantially purified fragments thereof. If PFK fragments are used, they preferably comprise at least 10, and more preferably, at least 20 contiguous amino acids of a PFK protein. In a particular embodiment, PFK-specific antigens and/or immunogens are coupled to carrier proteins that stimulate the immune response. For example, the subject polypeptides are covalently coupled to the keyhole limpet hemocyanin (KLH) carrier, and the conjugate is emulsified in Freund's complete adjuvant, which enhances the immune response. An appropriate host, such as a laboratory rabbit or mouse, is immunized according to conventional protocols.

[0052] The presence of PFK-specific antibodies is assayed by an appropriate assay such as a solid phase enzyme-linked immunosorbent assay (ELISA) using immobilized corresponding PFK polypeptides. Other assays, such as radioimmunoassays or fluorescent assays, might also be used.

[0053] Chimeric antibodies specific to PFK polypeptides can be made that contain different portions from different animal species. For instance, a human immunoglobulin constant region may be linked to a variable region of a murine mAb, such that the antibody derives its biological activity from the human antibody, and its binding specificity from the murine fragment. Chimeric antibodies are produced by splicing together genes that encode the appropriate regions from each species (Morrison et al., Proc. Natl. Acad. Sci. (1984) 81:6851-6855; Neuberger et al., Nature (1984) 312:604-608; Takeda et al., Nature (1985) 314:452-454). Humanized antibodies, which are a form of chimeric antibodies, can be generated by grafting complementarity-determining regions (CDRs) (Carlos, T. M., J. M. Harlan. 1994. Blood 84:2068-2101) of mouse antibodies into a background of human framework regions and constant regions by recombinant DNA technology (Riechmann L M, et al., 1988 Nature 332: 323-327). Humanized antibodies contain .about.10% murine sequences and .about.90% human sequences, and thus further reduce or eliminate immunogenicity, while retaining the antibody specificities (Co M S, and Queen C. 1991 Nature 351: 501-502; Morrison S L. 1992 Ann. Rev. Immun. 10:239-265). Humanized antibodies and methods of their production are well-known in the art (U.S. Pat. Nos. 5,530,101, 5,585,089, 5,693,762, and 6,180,370).

[0054] PFK-specific single chain antibodies, which are recombinant, single chain polypeptides formed by linking the heavy and light chain fragments of the Fv regions via an amino acid bridge, can be produced by methods known in the art (U.S. Pat. No. 4,946,778; Bird, Science (1988) 242:423-426; Huston et al., Proc. Natl. Acad. Sci. USA (1988) 85:5879-5883; and Ward et al., Nature (1989) 334:544-546).

[0055] Other suitable techniques for antibody production involve in vitro exposure of lymphocytes to the antigenic polypeptides or, alternatively, selection of libraries of antibodies in phage or similar vectors (Huse et al., Science (1989) 246:1275-1281). As used herein, T-cell antigen receptors are included within the scope of antibody modulators (Harlow and Lane, 1988, supra).

[0056] The polypeptides and antibodies of the present invention may be used with or without modification. Frequently, antibodies will be labeled by joining, either covalently or non-covalently, a substance that provides for a detectable signal, or that is toxic to cells that express the targeted protein (Menard S, et al., Int J. Biol Markers (1989) 4:131-134). A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, fluorescent emitting lanthanide metals, chemiluminescent moieties, bioluminescent moieties, magnetic particles, and the like (U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241). Also, recombinant immunoglobulins may be produced (U.S. Pat. No. 4,816,567). Antibodies to cytoplasmic polypeptides may be delivered and reach their targets by conjugation with membrane-penetrating toxin proteins (U.S. Pat. No. 6,086,900).

[0057] When used therapeutically in a patient, the antibodies of the subject invention are typically administered parenterally, when possible at the target site, or intravenously. The therapeutically effective dose and dosage regimen is determined by clinical studies. Typically, the amount of antibody administered is in the range of about 0.1 mg/kg to about 10 mg/kg of patient weight. For parenteral administration, the antibodies are formulated in a unit dosage injectable form (e.g., solution, suspension, emulsion) in association with a pharmaceutically acceptable vehicle. Such vehicles are inherently nontoxic and non-therapeutic. Examples are water, saline, Ringer's solution, dextrose solution, and 5% human serum albumin. Nonaqueous vehicles such as fixed oils, ethyl oleate, or liposome carriers may also be used. The vehicle may contain minor amounts of additives, such as buffers and preservatives, which enhance isotonicity and chemical stability or otherwise enhance therapeutic potential. The antibody concentrations in such vehicles are typically in the range of about 1 mg/ml to about 10 mg/ml. Immunotherapeutic methods are further described in the literature (U.S. Pat. No. 5,859,206; WO0073469).
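
As a worked illustration of the ranges in this paragraph, the following sketch computes the antibody mass for a given patient weight and the injection volume at a given formulation concentration. The function name, the 70 kg patient, and the chosen dose and concentration are hypothetical values for illustration only; actual doses and regimens are determined by clinical studies, as stated above.

```python
def antibody_dose(weight_kg, dose_mg_per_kg, conc_mg_per_ml):
    """Return (total_mg, volume_ml) for a weight-based antibody dose.

    The range checks mirror the typical ranges stated in the text:
    about 0.1 to 10 mg/kg, formulated at about 1 to 10 mg/ml.
    """
    if not 0.1 <= dose_mg_per_kg <= 10:
        raise ValueError("dose outside the typical 0.1-10 mg/kg range")
    if not 1 <= conc_mg_per_ml <= 10:
        raise ValueError("concentration outside the typical 1-10 mg/ml range")
    total_mg = weight_kg * dose_mg_per_kg      # mass of antibody to give
    volume_ml = total_mg / conc_mg_per_ml      # volume of formulated product
    return total_mg, volume_ml


# Hypothetical example: 70 kg patient, 2 mg/kg dose, 5 mg/ml formulation.
total_mg, volume_ml = antibody_dose(70, 2, 5)
print(f"{total_mg:.0f} mg in {volume_ml:.0f} ml")
```

Under these assumed inputs the dose is 140 mg, delivered in 28 ml.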

[0058] Nucleic Acid Modulators

[0059] Other preferred PFK-modulating agents comprise nucleic acid molecules, such as antisense oligomers or double stranded RNA (dsRNA), which generally inhibit PFK activity. Preferred nucleic acid modulators interfere with the function of the PFK nucleic acid such as DNA replication, transcription, translocation of the PFK RNA to the site of protein translation, translation of protein from the PFK RNA, splicing of the PFK RNA to yield one or more mRNA species, or catalytic activity which may be engaged in or facilitated by the PFK RNA.

[0060] In one embodiment, the antisense oligomer is an oligonucleotide that is sufficiently complementary to a PFK mRNA to bind to and prevent translation, preferably by binding to the 5' untranslated region. PFK-specific antisense oligonucleotides preferably range from at least 6 to about 200 nucleotides. In some embodiments the oligonucleotide is preferably at least 10, 15, or 20 nucleotides in length. In other embodiments, the oligonucleotide is preferably less than 50, 40, or 30 nucleotides in length. The oligonucleotide can be DNA or RNA or a chimeric mixture or derivatives or modified versions thereof, single-stranded or double-stranded. The oligonucleotide can be modified at the base moiety, sugar moiety, or phosphate backbone. The oligonucleotide may include other appending groups such as peptides, agents that facilitate transport across the cell membrane, hybridization-triggered cleavage agents, and intercalating agents.
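
The complementarity and length constraints described above can be sketched in a few lines. This is a minimal illustration, assuming an unmodified DNA oligonucleotide that is fully complementary to its target; the function name and the 20-nucleotide target segment are hypothetical and are not taken from any PFK sequence.

```python
# Watson-Crick pairing from an RNA target base to a DNA antisense base.
RNA_TO_ANTISENSE_DNA = {"A": "T", "C": "G", "G": "C", "U": "A"}


def antisense_dna(mrna_segment, min_len=6, max_len=200):
    """Return the DNA antisense oligo (5'->3') for an mRNA segment (5'->3').

    The default length limits follow the range given in the text
    (at least 6 to about 200 nucleotides).
    """
    seq = mrna_segment.upper()
    if not min_len <= len(seq) <= max_len:
        raise ValueError("target length outside the 6-200 nucleotide range")
    if set(seq) - set(RNA_TO_ANTISENSE_DNA):
        raise ValueError("target must contain only A, C, G, and U")
    # Reverse the target, then complement each base, so that the
    # result is written 5'->3' and pairs antiparallel with the mRNA.
    return "".join(RNA_TO_ANTISENSE_DNA[base] for base in reversed(seq))


# Hypothetical 20-nucleotide target segment (illustrative only).
target = "AUGGCUACCGAUUGCAAGCU"
print(antisense_dna(target))
```

A real design would also weigh target accessibility, off-target matches, and any backbone or sugar modifications, none of which this sketch addresses.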

[0061] In another embodiment, the antisense oligomer is a phosphorodiamidate morpholino oligomer (PMO). PMOs are assembled from four different morpholino subunits, each of which contains one of four genetic bases (A, C, G, or T) linked to a six-membered morpholine ring. Polymers of these subunits are joined by non-ionic phosphorodiamidate intersubunit linkages. Details of how to make and use PMOs and other antisense oligomers are well known in the art (e.g. see WO99/18193; Probst J C, Antisense Oligodeoxynucleotide and Ribozyme Design, Methods. (2000) 22(3):271-281; Summerton J, and Weller D. 1997 Antisense Nucleic Acid Drug Dev.: 7:187-95; U.S. Pat. No. 5,235,033; and U.S. Pat. No. 5,378,841).

[0062] Alternative preferred PFK nucleic acid modulators are double-stranded RNA species mediating RNA interference (RNAi). RNAi is the process of sequence-specific, post-transcriptional gene silencing in animals and plants, initiated by double-stranded RNA (dsRNA) that is homologous in sequence to the silenced gene. Methods relating to the use of RNAi to silence genes in C. elegans, Drosophila, plants, and humans are known in the art (Fire A, et al., 1998 Nature 391:806-811; Fire, A. Trends Genet. 15, 358-363 (1999); Sharp, P. A. RNA interference 2001. Genes Dev. 15, 485-490 (2001); Hammond, S. M., et al., Nature Rev. Genet. 2, 110-119 (2001); Tuschl, T. ChemBioChem 2, 239-245 (2001); Hamilton, A. et al., Science 286, 950-952 (1999); Hammond, S. M., et al., Nature 404, 293-296 (2000); Zamore, P. D., et al., Cell 101, 25-33 (2000); Bernstein, E., et al., Nature 409, 363-366 (2001); Elbashir, S. M., et al., Genes Dev. 15, 188-200 (2001); WO0129058; WO9932619; Elbashir S M, et al., 2001 Nature 411:494-498; Novina C D and Sharp P. 2004 Nature 430:161-164; Soutschek J et al 2004 Nature 432:173-178; Hsieh A C et al. (2004) NAR 32(3):893-901).

[0063] Nucleic acid modulators are commonly used as research reagents, diagnostics, and therapeutics. For example, antisense oligonucleotides, which are able to inhibit gene expression with exquisite specificity, are often used to elucidate the function of particular genes (see, for example, U.S. Pat. No. 6,165,790). Nucleic acid modulators are also used, for example, to distinguish between functions of various members of a biological pathway. For example, antisense oligomers have been employed as therapeutic moieties in the treatment of disease states in animals and man and have been demonstrated in numerous clinical trials to be safe and effective (Milligan J F, et al, Current Concepts in Antisense Drug Design, J Med Chem. (1993) 36:1923-1937; Tonkinson J L et al., Antisense Oligodeoxynucleotides as Clinical Therapeutic Agents, Cancer Invest. (1996) 14:54-65). Accordingly, in one aspect of the invention, a PFK-specific nucleic acid modulator is used in an assay to further elucidate the role of the PFK in the IGFR pathway, and/or its relationship to other members of the pathway. In another aspect of the invention, a PFK-specific antisense oligomer is used as a therapeutic agent for treatment of IGFR-related disease states.

[0064] Assay Systems

[0065] The invention provides assay systems and screening methods for identifying specific modulators of PFK activity. As used herein, an "assay system" encompasses all the components required for performing and analyzing results of an assay that detects and/or measures a particular event. In general, primary assays are used to identify or confirm a modulator's specific biochemical or molecular effect with respect to the PFK nucleic acid or protein. In general, secondary assays further assess the activity of a PFK modulating agent identified by a primary assay and may confirm that the modulating agent affects PFK in a manner relevant to the IGFR pathway. In some cases, PFK modulators will be directly tested in a secondary assay.

[0066] In a preferred embodiment, the screening method comprises contacting a suitable assay system comprising a PFK polypeptide or nucleic acid with a candidate agent under conditions whereby, but for the presence of the agent, the system provides a reference activity (e.g. kinase activity), which is based on the particular molecular event the screening method detects. A statistically significant difference between the agent-biased activity and the reference activity indicates that the candidate agent modulates PFK activity, and hence the IGFR pathway. The PFK polypeptide or nucleic acid used in the assay may comprise any of the nucleic acids or polypeptides described above.
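
The comparison between agent-biased and reference activity can be illustrated with a small statistical sketch. The text does not prescribe a particular test; the exact two-sided permutation test below, the 0.05 threshold, the function names, and the replicate readings are all assumptions chosen for illustration.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(agent, reference):
    """Exact two-sided permutation p-value for a difference in mean activity."""
    pooled = list(agent) + list(reference)
    n_agent = len(agent)
    n_ref = len(pooled) - n_agent
    total = sum(pooled)
    observed = abs(mean(agent) - mean(reference))
    extreme = 0
    count = 0
    # Enumerate every way of relabeling n_agent readings as "agent".
    for idx in combinations(range(len(pooled)), n_agent):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_agent - (total - group_sum) / n_ref)
        if diff >= observed - 1e-12:   # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


def modulates(agent, reference, alpha=0.05):
    """True if the agent-biased activity differs significantly from reference."""
    return permutation_p_value(agent, reference) < alpha


# Hypothetical replicate readings (arbitrary activity units).
reference_activity = [100, 98, 103, 101]   # system without the agent
agent_activity = [62, 58, 65, 60]          # system with the candidate agent
print(permutation_p_value(agent_activity, reference_activity))
print(modulates(agent_activity, reference_activity))
```

With four replicates per group the smallest attainable two-sided p-value is 2/70, so a screen that needs a stricter threshold would need more replicates or a parametric test.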

[0067] Primary Assays

[0068] The type of modulator tested generally determines the type of primary assay.

[0069] Primary Assays for Small Molecule Modulators

[0070] For small molecule modulators, screening assays are used to identify candidate modulators. Screening assays may be cell-based or may use a cell-free system that recreates or retains the relevant biochemical reaction of the target protein (reviewed in Sittampalam G S et al., Curr Opin Chem Biol (1997) 1:384-91 and accompanying references). As used herein the term "cell-based" refers to assays using live cells, dead cells, or a particular cellular fraction, such as a membrane, endoplasmic reticulum, or mitochondrial fraction. The term "cell free" encompasses assays using substantially purified protein (either endogenous or recombinantly produced), partially purified or crude cellular extracts. Screening assays may detect a variety of molecular events, including protein-DNA interactions, protein-protein interactions (e.g., receptor-ligand binding), transcriptional activity (e.g., using a reporter gene), enzymatic activity (e.g., via a property of the substrate), activity of second messengers, immunogenicity and changes in cellular morphology or other cellular characteristics. Appropriate screening assays may use a wide range of detection methods including fluorescent, radioactive, colorimetric, spectrophotometric, and amperometric methods, to provide a read-out for the particular molecular event detected.

[0071] Cell-based screening assays usually require systems for recombinant expression of PFK and any auxiliary proteins demanded by the particular assay. Appropriate methods for generating recombinant proteins produce sufficient quantities of proteins that retain their relevant biological activities and are of sufficient purity to optimize activity and assure assay reproducibility. Yeast two-hybrid and variant screens, and mass spectrometry provide preferred methods for determining protein-protein interactions and elucidation of protein complexes. In certain applications, when PFK-interacting proteins are used in screens to identify small molecule modulators, the binding specificity of the interacting protein to the PFK protein may be assayed by various known methods such as substrate processing (e.g. ability of the candidate PFK-specific binding agents to function as negative effectors in PFK-expressing cells), binding equilibrium constants (usually at least about 10.sup.7 M.sup.-1, preferably at least about 10.sup.8 M.sup.-1, more preferably at least about 10.sup.9 M.sup.-1), and immunogenicity (e.g. ability to elicit PFK specific antibody in a heterologous host such as a mouse, rat, goat or rabbit). For enzymes and receptors, binding may be assayed by, respectively, substrate and ligand processing.
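
The binding equilibrium constants above are association constants in units of M^-1; the dissociation constant is simply the reciprocal. The following sketch converts between the two and maps a measured value onto the tiers stated in the text. The function names, tier labels, and example value are illustrative assumptions.

```python
def dissociation_constant(ka_per_molar):
    """Return Kd in molar, the reciprocal of the association constant Ka (M^-1)."""
    return 1.0 / ka_per_molar


def affinity_tier(ka_per_molar):
    """Classify Ka against the thresholds given in the text."""
    if ka_per_molar >= 1e9:
        return "more preferred"
    if ka_per_molar >= 1e8:
        return "preferred"
    if ka_per_molar >= 1e7:
        return "usual minimum"
    return "below usual minimum"


# Hypothetical measurement: Ka = 5e8 M^-1.
ka = 5e8
kd_nanomolar = dissociation_constant(ka) * 1e9
print(f"Kd = {kd_nanomolar:.1f} nM, tier: {affinity_tier(ka)}")
```

The three thresholds of 10^7, 10^8, and 10^9 M^-1 correspond to dissociation constants of 100 nM, 10 nM, and 1 nM respectively.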

[0072] The screening assay may measure a candidate agent's ability to specifically bind to or modulate activity of a PFK polypeptide, a fusion protein thereof, or to cells or membranes bearing the polypeptide or fusion protein. The PFK polypeptide can be full length or a fragment thereof that retains functional PFK activity. The PFK polypeptide may be fused to another polypeptide, such as a peptide tag for detection or anchoring, or to another tag. The PFK polypeptide is preferably human PFK, or is an ortholog or derivative thereof as described above. In a preferred embodiment, the screening assay detects candidate agent-based modulation of PFK interaction with a binding target, such as an endogenous or exogenous protein or other substrate that has PFK-specific binding activity, and can be used to assess normal PFK gene function.

[0073] Suitable assay formats that may be adapted to screen for PFK modulators are known in the art. Preferred screening assays are high throughput or ultra high throughput and thus provide automated, cost-effective means of screening compound libraries for lead compounds (Fernandes P B, Curr Opin Chem Biol (1998) 2:597-603; Sundberg S A, Curr Opin Biotechnol 2000, 11:47-53). In one preferred embodiment, screening assays use fluorescence technologies, including fluorescence polarization, time-resolved fluorescence, and fluorescence resonance energy transfer. These systems offer means to monitor protein-protein or DNA-protein interactions in which the intensity of the signal emitted from dye-labeled molecules depends upon their interactions with partner molecules (e.g., Selvin P R, Nat Struct Biol (2000) 7:730-4; Fernandes P B, supra; Hertzberg R P and Pope A J, Curr Opin Chem Biol (2000) 4:445-451).
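
For the fluorescence polarization readout mentioned above, the signal is conventionally computed from the emission intensities measured parallel and perpendicular to the polarized excitation. The sketch below uses the standard definitions of polarization and anisotropy; the function names, the instrument G-factor default of 1.0, and the intensity values are assumptions for illustration and do not come from the source.

```python
def polarization_mp(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence polarization in millipolarization units (mP)."""
    perp = g_factor * i_perpendicular   # correct for detector bias
    return 1000.0 * (i_parallel - perp) / (i_parallel + perp)


def anisotropy(i_parallel, i_perpendicular, g_factor=1.0):
    """Fluorescence anisotropy, an alternative expression of the same data."""
    perp = g_factor * i_perpendicular
    return (i_parallel - perp) / (i_parallel + 2.0 * perp)


# Hypothetical intensities for a dye-labeled tracer, free and partner-bound.
free_mp = polarization_mp(1100.0, 1000.0)
bound_mp = polarization_mp(1500.0, 1000.0)
print(f"free: {free_mp:.1f} mP, bound: {bound_mp:.1f} mP")
```

A small labeled molecule tumbles more slowly once bound to a larger partner, so its polarization rises; a candidate agent that disrupts the interaction would be expected to shift the reading back toward the free value.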

[0074] A variety of suitable assay systems may be used to identify candidate PFK and IGFR pathway modulators (e.g. U.S. Pat. No. 6,165,992 and U.S. Pat. No. 6,720,162 (kinase assays); U.S. Pat. Nos. 5,550,019 and 6,133,437 (apoptosis assays); and U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434 (angiogenesis assays), among others). Specific preferred assays are described in more detail below.

[0075] Kinases, key signal transduction proteins that may be either membrane-associated or intracellular, catalyze the transfer of gamma phosphate from adenosine triphosphate (ATP) to a serine, threonine or tyrosine residue in a protein substrate. Radioassays, which monitor the transfer from [gamma-.sup.32P or -.sup.33P]ATP, are frequently used to assay kinase activity. For instance, a scintillation assay for p56 (lck) kinase activity monitors the transfer of the gamma phosphate from [gamma-.sup.33P] ATP to a biotinylated peptide substrate. The substrate is captured on a streptavidin coated bead that transmits the signal (Beveridge M et al., J Biomol Screen (2000) 5:205-212). This assay uses the scintillation proximity assay (SPA), in which only radio-ligand bound to receptors tethered to the surface of an SPA bead is detected by the scintillant immobilized within it, allowing binding to be measured without separation of bound from free ligand. Other assays for protein kinase activity may use antibodies that specifically recognize phosphorylated substrates. For instance, the kinase receptor activation (KIRA) assay measures receptor tyrosine kinase activity by stimulating the intact receptor with ligand in cultured cells, then capturing solubilized receptor with specific antibodies and quantifying phosphorylation via phosphotyrosine ELISA (Sadick M D, Dev Biol Stand (1999) 97:121-133). Another example of antibody based assays for protein kinase activity is TRF (time-resolved fluorometry). This method utilizes europium chelate-labeled anti-phosphotyrosine antibodies to detect phosphate transfer to a polymeric substrate coated onto microtiter plate wells. The amount of phosphorylation is then detected using time-resolved, dissociation-enhanced fluorescence (Braunwalder A F, et al., Anal Biochem 1996 Jul. 1; 238(2):159-64).
Yet other assays for kinases involve uncoupled, pH sensitive assays that can be used for high-throughput screening of potential inhibitors or for determining substrate specificity. Since kinases catalyze the transfer of a gamma-phosphoryl group from ATP to an appropriate hydroxyl acceptor with the release of a proton, a pH sensitive assay is based on the detection of this proton using an appropriately matched buffer/indicator system (Chapman E and Wong C H (2002) Bioorg Med Chem. 10:551-5).

[0076] Apoptosis Assays.

[0077] Apoptosis, or programmed cell death, is a suicide program activated within the cell, leading to fragmentation of DNA, shrinkage of the cytoplasm, membrane changes and cell death. Apoptosis is mediated by proteolytic enzymes of the caspase family. Many of the altering parameters of a cell are measurable during apoptosis. Assays for apoptosis may be performed by terminal deoxynucleotidyl transferase-mediated digoxigenin-11-dUTP nick end labeling (TUNEL) assay. The TUNEL assay is used to measure nuclear DNA fragmentation characteristic of apoptosis (Lazebnik et al., 1994, Nature 371, 346), by following the incorporation of fluorescein-dUTP (Yonehara et al., 1989, J. Exp. Med. 169, 1747). Apoptosis may further be assayed by acridine orange staining of tissue culture cells (Lucas, R., et al., 1998, Blood 15:4730-41). Other cell-based apoptosis assays include the caspase-3/7 assay and the cell death nucleosome ELISA assay. The caspase 3/7 assay is based on the activation of the caspase cleavage activity as part of a cascade of events that occur during programmed cell death in many apoptotic pathways. In the caspase 3/7 assay (commercially available Apo-ONE.TM. Homogeneous Caspase-3/7 assay from Promega, cat#67790), lysis buffer and caspase substrate are mixed and added to cells. The caspase substrate becomes fluorescent when cleaved by active caspase 3/7. The nucleosome ELISA assay is a general cell death assay known to those skilled in the art, and available commercially (Roche, Cat#1774425). This assay is a quantitative sandwich-enzyme-immunoassay which uses monoclonal antibodies directed against DNA and histones respectively, thus specifically determining the amount of mono- and oligonucleosomes in the cytoplasmic fraction of cell lysates. Mono- and oligonucleosomes are enriched in the cytoplasm during apoptosis because DNA fragmentation occurs several hours before the plasma membrane breaks down, allowing for accumulation in the cytoplasm.
Nucleosomes are not present in the cytoplasmic fraction of cells that are not undergoing apoptosis. The Phospho-histone H2B assay is another apoptosis assay, based on phosphorylation of histone H2B as a result of apoptosis. Fluorescent dyes that are associated with phosphohistone H2B may be used to measure the increase of phosphohistone H2B as a result of apoptosis. Apoptosis assays that simultaneously measure multiple parameters associated with apoptosis have also been developed. In such assays, various cellular parameters that can be associated with antibodies or fluorescent dyes, and that mark various stages of apoptosis are labeled, and the results are measured using instruments such as Cellomics.TM. ArrayScan.RTM. HCS System. The measurable parameters and their markers include anti-active caspase-3 antibody which marks intermediate stage apoptosis, anti-PARP-p85 antibody (cleaved PARP) which marks late stage apoptosis, Hoechst labels which label the nucleus and are used to measure nuclear swelling as a measure of early apoptosis and nuclear condensation as a measure of late apoptosis, TOTO-3 fluorescent dye which labels DNA of dead cells with high cell membrane permeability, and anti-alpha-tubulin or F-actin labels, which assess cytoskeletal changes in cells and correlate well with TOTO-3 label.

[0078] An apoptosis assay system may comprise a cell that expresses a PFK, and that optionally has defective IGFR function (e.g. IGFR is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the apoptosis assay system, and changes in induction of apoptosis relative to controls where no test agent is added identify candidate IGFR modulating agents. In some embodiments of the invention, an apoptosis assay may be used as a secondary assay to test a candidate IGFR modulating agent that is initially identified using a cell-free assay system. An apoptosis assay may also be used to test whether PFK function plays a direct role in apoptosis. For example, an apoptosis assay may be performed on cells that over- or under-express PFK relative to wild type cells. Differences in apoptotic response compared to wild type cells suggest that the PFK plays a direct role in the apoptotic response. Apoptosis assays are described further in U.S. Pat. No. 6,133,437.

[0079] Cell Proliferation and Cell Cycle Assays.

[0080] Cell proliferation may be assayed via bromodeoxyuridine (BRDU) incorporation. This assay identifies a cell population undergoing DNA synthesis by incorporation of BRDU into newly-synthesized DNA. Newly-synthesized DNA may then be detected using an anti-BRDU antibody (Hoshino et al., 1986, Int. J. Cancer 38, 369; Campana et al., 1988, J. Immunol. Meth. 107, 79), or by other means.

[0081] Cell proliferation is also assayed via phospho-histone H3 staining, which identifies a cell population undergoing mitosis by phosphorylation of histone H3. Phosphorylation of histone H3 at serine 10 is detected using an antibody specific to the phosphorylated form of the serine 10 residue of histone H3 (Chadlee, D. N. 1995, J. Biol. Chem 270:20098-105). Cell proliferation may also be examined using [.sup.3H]-thymidine incorporation (Chen, J., 1996, Oncogene 13:1395-403; Jeoung, J., 1995, J. Biol. Chem. 270:18367-73). This assay allows for quantitative characterization of S-phase DNA synthesis. In this assay, cells synthesizing DNA will incorporate [.sup.3H]-thymidine into newly synthesized DNA. Incorporation can then be measured by standard techniques such as by counting of radioisotope in a scintillation counter (e.g., Beckman LS 3800 Liquid Scintillation Counter). Another proliferation assay uses the dye Alamar Blue (available from Biosource International), which fluoresces when reduced in living cells and provides an indirect measurement of cell number (Voytik-Harbin S L et al., 1998, In Vitro Cell Dev Biol Anim 34:239-46). Yet another proliferation assay, the MTS assay, is based on in vitro cytotoxicity assessment of industrial chemicals, and uses the soluble tetrazolium salt, MTS. MTS assays are commercially available, for example, the Promega CellTiter 96.RTM. AQueous Non-Radioactive Cell Proliferation Assay (Cat. #G5421).

[0082] Cell proliferation may also be assayed by colony formation in soft agar, or clonogenic survival assay (Sambrook et al., Molecular Cloning, Cold Spring Harbor (1989)). For example, cells transformed with PFK are seeded in soft agar plates, and colonies are measured and counted after two weeks of incubation.

[0083] Cell proliferation may also be assayed by measuring ATP levels as an indicator of metabolically active cells. Such assays are commercially available, for example Cell Titer-Glo.TM., which is a luminescent homogeneous assay available from Promega.

[0084] Involvement of a gene in the cell cycle may be assayed by flow cytometry (Gray J W et al. (1986) Int J Radiat Biol Relat Stud Phys Chem Med 49:237-55). Cells transfected with a PFK may be stained with propidium iodide and evaluated in a flow cytometer (available from Becton Dickinson), which indicates accumulation of cells in different stages of the cell cycle.

[0085] Involvement of a gene in the cell cycle may also be assayed by FOXO nuclear translocation assays. The FOXO family of transcription factors are mediators of various cellular functions including cell cycle progression and cell death, and are negatively regulated by activation of the PI3 kinase pathway. Akt phosphorylation of FOXO family members leads to FOXO sequestration in the cytoplasm and transcriptional inactivation (Medema, R. H et al (2000) Nature 404: 782-787). PTEN is a negative regulator of the PI3 kinase pathway. Activation of PTEN, or loss of PI3 kinase or AKT, prevents phosphorylation of FOXO, leading to accumulation of FOXO in the nucleus, transcriptional activation of FOXO regulated genes, and apoptosis. Alternatively, loss of PTEN leads to pathway activation and cell survival (Nakamura, N. et al (2000) Mol Cell Biol 20: 8969-8982). FOXO translocation into the cytoplasm is used in assays and screens to identify members and/or modulators of the PTEN pathway. FOXO translocation assays using GFP or luciferase as detection reagents are known in the art (e.g., Zhang X et al (2002) J Biol Chem 277:45276-45284; and Li et al (2003) Mol Cell Biol 23:104-118).

[0086] Accordingly, a cell proliferation or cell cycle assay system may comprise a cell that expresses a PFK, and that optionally has defective IGFR function (e.g. IGFR is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the assay system, and changes in cell proliferation or cell cycle relative to controls where no test agent is added identify candidate IGFR modulating agents. In some embodiments of the invention, the cell proliferation or cell cycle assay may be used as a secondary assay to test a candidate IGFR modulating agent that is initially identified using another assay system such as a cell-free assay system. A cell proliferation assay may also be used to test whether PFK function plays a direct role in cell proliferation or cell cycle. For example, a cell proliferation or cell cycle assay may be performed on cells that over- or under-express PFK relative to wild type cells. Differences in proliferation or cell cycle compared to wild type cells suggest that the PFK plays a direct role in cell proliferation or cell cycle.

[0087] Angiogenesis.

[0088] Angiogenesis may be assayed using various human endothelial cell systems, such as umbilical vein, coronary artery, or dermal cells. Suitable assays include Alamar Blue based assays (available from Biosource International) to measure proliferation; migration assays using fluorescent molecules, such as the use of Becton Dickinson Falcon HTS FluoroBlock cell culture inserts to measure migration of cells through membranes in the presence or absence of angiogenesis enhancers or suppressors; and tubule formation assays based on the formation of tubular structures by endothelial cells on Matrigel.RTM. (Becton Dickinson). Accordingly, an angiogenesis assay system may comprise a cell that expresses a PFK, and that optionally has defective IGFR function (e.g. IGFR is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the angiogenesis assay system, and changes in angiogenesis relative to controls where no test agent is added identify candidate IGFR modulating agents. In some embodiments of the invention, the angiogenesis assay may be used as a secondary assay to test a candidate IGFR modulating agent that is initially identified using another assay system. An angiogenesis assay may also be used to test whether PFK function plays a direct role in angiogenesis. For example, an angiogenesis assay may be performed on cells that over- or under-express PFK relative to wild type cells. Differences in angiogenesis compared to wild type cells suggest that the PFK plays a direct role in angiogenesis. U.S. Pat. Nos. 5,976,782, 6,225,118 and 6,444,434, among others, describe various angiogenesis assays.

[0089] Hypoxic Induction.

[0090] The alpha subunit of the transcription factor, hypoxia inducible factor-1 (HIF-1), is upregulated in tumor cells following exposure to hypoxia in vitro. Under hypoxic conditions, HIF-1 stimulates the expression of genes known to be important in tumor cell survival, such as those encoding glycolytic enzymes and VEGF. Induction of such genes by hypoxic conditions may be assayed by growing cells transfected with PFK in hypoxic conditions (such as with 0.1% O2, 5% CO2, and balance N2, generated in a Napco 7001 incubator (Precision Scientific)) and normoxic conditions, followed by assessment of gene activity or expression by Taqman.RTM.. For example, a hypoxic induction assay system may comprise a cell that expresses a PFK, and that optionally has defective IGFR function (e.g. IGFR is over-expressed or under-expressed relative to wild-type cells). A test agent can be added to the hypoxic induction assay system, and changes in hypoxic response relative to controls where no test agent is added identify candidate IGFR modulating agents. In some embodiments of the invention, the hypoxic induction assay may be used as a secondary assay to test a candidate IGFR modulating agent that is initially identified using another assay system. A hypoxic induction assay may also be used to test whether PFK function plays a direct role in the hypoxic response. For example, a hypoxic induction assay may be performed on cells that over- or under-express PFK relative to wild type cells. Differences in hypoxic response compared to wild type cells suggest that the PFK plays a direct role in hypoxic induction.

[0091] Cell Adhesion.

[0092] Cell adhesion assays measure adhesion of cells to purified adhesion proteins, or adhesion of cells to each other, in presence or absence of candidate modulating agents. Cell-protein adhesion assays measure the ability of agents to modulate the adhesion of cells to purified proteins. For example, recombinant proteins are produced, diluted to 2.5 g/mL in PBS, and used to coat the wells of a microtiter plate. The wells used for negative control are not coated. Coated wells are then washed, blocked with 1% BSA, and washed again. Compounds are diluted to 2.times. final test concentration and added to the blocked, coated wells. Cells are then added to the wells, and the unbound cells are washed off. Retained cells are labeled directly on the plate by adding a membrane-permeable fluorescent dye, such as calcein-AM, and the signal is quantified in a fluorescent microplate reader.

[0093] Cell-cell adhesion assays measure the ability of agents to modulate binding of cell adhesion proteins with their native ligands. These assays use cells that naturally or recombinantly express the adhesion protein of choice. In an exemplary assay, cells expressing the cell adhesion protein are plated in wells of a multiwell plate. Cells expressing the ligand are labeled with a membrane-permeable fluorescent dye, such as BCECF, and allowed to adhere to the monolayers in the presence of candidate agents. Unbound cells are washed off, and bound cells are detected using a fluorescence plate reader.

[0094] High-throughput cell adhesion assays have also been described. In one such assay, small molecule ligands and peptides are bound to the surface of microscope slides using a microarray spotter, intact cells are then contacted with the slides, and unbound cells are washed off. In this assay, not only is the binding specificity of the peptides and modulators against cell lines determined, but the functional cell signaling of attached cells is also measured using immunofluorescence techniques in situ on the microchip (Falsey J R et al., Bioconjug Chem. 2001 May-June; 12(3):346-53).

[0095] Primary Assays for Antibody Modulators

[0096] For antibody modulators, an appropriate primary assay is a binding assay that tests the antibody's affinity to and specificity for the PFK protein. Methods for testing antibody affinity and specificity are well known in the art (Harlow and Lane, 1988, 1999, supra). The enzyme-linked immunosorbent assay (ELISA) is a preferred method for detecting PFK-specific antibodies; others include FACS assays, radioimmunoassays, and fluorescent assays.

[0097] In some cases, screening assays described for small molecule modulators may also be used to test antibody modulators.

[0098] Primary Assays for Nucleic Acid Modulators

[0099] For nucleic acid modulators, primary assays may test the ability of the nucleic acid modulator to inhibit or enhance PFK gene expression, preferably mRNA expression. In general, expression analysis comprises comparing PFK expression in like populations of cells (e.g., two pools of cells that endogenously or recombinantly express PFK) in the presence and absence of the nucleic acid modulator. Methods for analyzing mRNA and protein expression are well known in the art. For instance, Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR (e.g., using TaqMan.RTM., PE Applied Biosystems), or microarray analysis may be used to confirm that PFK mRNA expression is reduced in cells treated with the nucleic acid modulator (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm D H and Guiseppi-Elie A, Curr Opin Biotechnol 2001, 12:41-47). Protein expression may also be monitored. Proteins are most commonly detected with specific antibodies or antisera directed against either the PFK protein or specific peptides. A variety of means, including Western blotting, ELISA, or in situ detection, are available (Harlow E and Lane D, 1988 and 1999, supra).
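The comparison of PFK mRNA levels in the presence and absence of a nucleic acid modulator can be sketched for quantitative RT-PCR data as follows. The sketch uses the widely used comparative threshold-cycle (delta-delta Ct) calculation, which is an assumption here and is not recited in the specification; it further assumes approximately 100% amplification efficiency, and the reference gene and all Ct values are hypothetical.

```python
def relative_expression(ct_target_treated: float, ct_ref_treated: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Fold change of target mRNA in treated versus control cells (2^-ddCt).

    Each target Ct is first normalized to a reference (housekeeping) gene
    measured in the same sample; a result below 1.0 indicates reduced expression.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_treated - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values: PFK signal appears two cycles later after treatment.
    fold = relative_expression(26.0, 18.0, 24.0, 18.0)
    print(f"PFK mRNA relative to untreated cells: {fold:.2f}")
```

With these hypothetical values the treated pool retains one quarter of the control PFK mRNA level, consistent with a modulator that inhibits expression.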

[0100] In some cases, screening assays described for small molecule modulators, particularly in assay systems that involve PFK mRNA expression, may also be used to test nucleic acid modulators.

[0101] Secondary Assays

[0102] Secondary assays may be used to further assess the activity of a PFK-modulating agent identified by any of the above methods to confirm that the modulating agent affects PFK in a manner relevant to the IGFR pathway. As used herein, PFK-modulating agents encompass candidate clinical compounds or other agents derived from previously identified modulating agents. Secondary assays can also be used to test the activity of a modulating agent on a particular genetic or biochemical pathway or to test the specificity of the modulating agent's interaction with PFK.

[0103] Secondary assays generally compare like populations of cells or animals (e.g., two pools of cells or animals that endogenously or recombinantly express PFK) in the presence and absence of the candidate modulator. In general, such assays test whether treatment of cells or animals with a candidate PFK-modulating agent results in changes in the IGFR pathway in comparison to untreated (or mock- or placebo-treated) cells or animals. Certain assays use "sensitized genetic backgrounds", which, as used herein, describe cells or animals engineered for altered expression of genes in the IGFR or interacting pathways.

Cell-Based Assays

[0104] Cell based assays may detect endogenous IGFR pathway activity or may rely on recombinant expression of IGFR pathway components. Any of the aforementioned assays may be used in this cell-based format. Candidate modulators are typically added to the cell media but may also be injected into cells or delivered by any other efficacious means.

Animal Assays

[0105] A variety of non-human animal models of normal or defective IGFR pathway may be used to test candidate PFK modulators. Models for defective IGFR pathway typically use genetically modified animals that have been engineered to mis-express (e.g., over-express or lack expression in) genes involved in the IGFR pathway. Assays generally require systemic delivery of the candidate modulators, such as by oral administration, injection, etc.

[0106] In a preferred embodiment, IGFR pathway activity is assessed by monitoring neovascularization and angiogenesis. Animal models with defective and normal IGFR are used to test the candidate modulator's effect on PFK in Matrigel.RTM. assays. Matrigel.RTM. is an extract of basement membrane proteins, and is composed primarily of laminin, collagen IV, and heparan sulfate proteoglycan. It is provided as a sterile liquid at 4.degree. C., but rapidly forms a solid gel at 37.degree. C. Liquid Matrigel.RTM. is mixed with various angiogenic agents, such as bFGF and VEGF, or with human tumor cells which over-express the PFK. The mixture is then injected subcutaneously (SC) into female athymic nude mice (Taconic, Germantown, N.Y.) to support an intense vascular response. Mice with Matrigel.RTM. pellets may be dosed via oral (PO), intraperitoneal (IP), or intravenous (IV) routes with the candidate modulator. Mice are euthanized 5-12 days post-injection, and the Matrigel.RTM. pellet is harvested for hemoglobin analysis (Sigma plasma hemoglobin kit). Hemoglobin content of the gel is found to correlate with the degree of neovascularization in the gel.

[0107] In another preferred embodiment, the effect of the candidate modulator on PFK is assessed via tumorigenicity assays. Tumor xenograft assays are known in the art (see, e.g., Ogawa K et al., 2000, Oncogene 19:6043-6052). Xenografts are typically implanted SC into female athymic mice, 6-7 weeks old, as single cell suspensions either from a pre-existing tumor or from in vitro culture. The tumors which express the PFK endogenously are injected in the flank, 1.times.10.sup.5 to 1.times.10.sup.7 cells per mouse in a volume of 100 .mu.L using a 27 gauge needle. Mice are then ear tagged and tumors are measured twice weekly. Candidate modulator treatment is initiated on the day the mean tumor weight reaches 100 mg. Candidate modulator is delivered IV, SC, IP, or PO by bolus administration. Depending upon the pharmacokinetics of each unique candidate modulator, dosing can be performed multiple times per day. The tumor weight is assessed by measuring perpendicular diameters with a caliper and calculated by multiplying the measurements of diameters in two dimensions. At the end of the experiment, the excised tumors may be utilized for biomarker identification or further analyses. For immunohistochemistry staining, xenograft tumors are fixed in 4% paraformaldehyde, 0.1 M phosphate, pH 7.2, for 6 hours at 4.degree. C., immersed in 30% sucrose in PBS, and rapidly frozen in isopentane cooled with liquid nitrogen.
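The caliper-based measurement and the 100 mg treatment trigger described above can be sketched as follows. The specification states only that two perpendicular diameters are multiplied; the conversion to milligrams shown here (length times width squared, divided by two, at an assumed density of 1 mg per cubic millimeter) is a commonly used approximation introduced as an assumption, and all function names and measurements are hypothetical.

```python
from statistics import mean
from typing import Dict, List, Optional


def diameter_product(d1_mm: float, d2_mm: float) -> float:
    """Product of two perpendicular caliper diameters (mm^2), as described in the text."""
    return d1_mm * d2_mm


def ellipsoid_weight_mg(length_mm: float, width_mm: float) -> float:
    """Assumed approximation: weight (mg) = length * width^2 / 2, taking 1 mm^3 as 1 mg."""
    return length_mm * width_mm ** 2 / 2.0


def treatment_start_day(weights_by_day: Dict[int, List[float]],
                        threshold_mg: float = 100.0) -> Optional[int]:
    """Return the first measurement day on which the mean tumor weight reaches the threshold."""
    for day in sorted(weights_by_day):
        if mean(weights_by_day[day]) >= threshold_mg:
            return day
    return None


if __name__ == "__main__":
    # Hypothetical twice-weekly measurements (mg) for a two-mouse group.
    group = {1: [40.0, 60.0], 4: [90.0, 120.0], 8: [150.0, 170.0]}
    print(ellipsoid_weight_mg(8.0, 5.0), treatment_start_day(group))
```

Keeping the raw diameter product alongside the derived weight allows either quantity to be used as the tumor burden index.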

[0108] In another preferred embodiment, tumorigenicity is monitored using a hollow fiber assay, which is described in U.S. Pat. No. 5,698,413. Briefly, the method comprises implanting into a laboratory animal a biocompatible, semi-permeable encapsulation device containing target cells, treating the laboratory animal with a candidate modulating agent, and evaluating the target cells for reaction to the candidate modulator. Implanted cells are generally human cells from a pre-existing tumor or a tumor cell line. After an appropriate period of time, generally around six days, the implanted samples are harvested for evaluation of the candidate modulator. Tumorigenicity and modulator efficacy may be evaluated by assaying the quantity of viable cells present in the macrocapsule, which can be determined by tests known in the art, for example, MTT dye conversion assay, neutral red dye uptake, trypan blue staining, viable cell counts, the number of colonies formed in soft agar, the capacity of the cells to recover and replicate in vitro, etc.

[0109] In another preferred embodiment, a tumorigenicity assay uses a transgenic animal, usually a mouse, carrying a dominant oncogene or tumor suppressor gene knockout under the control of tissue specific regulatory sequences; these assays are generally referred to as transgenic tumor assays. In a preferred application, tumor development in the transgenic model is well characterized or is controlled. In an exemplary model, the "RIP1-Tag2" transgene, comprising the SV40 large T-antigen oncogene under control of the insulin gene regulatory regions, is expressed in pancreatic beta cells and results in islet cell carcinomas (Hanahan D, 1985, Nature 315:115-122; Parangi S et al, 1996, Proc Natl Acad Sci USA 93: 2002-2007; Bergers G et al, 1999, Science 284:808-812). An "angiogenic switch" occurs at approximately five weeks, as normally quiescent capillaries in a subset of hyperproliferative islets become angiogenic. The RIP1-Tag2 mice die by age 14 weeks. Candidate modulators may be administered at a variety of stages, including just prior to the angiogenic switch (e.g., for a model of tumor prevention), during the growth of small tumors (e.g., for a model of intervention), or during the growth of large and/or invasive tumors (e.g., for a model of regression). Tumorigenicity and modulator efficacy can be evaluated by assessing life-span extension and/or tumor characteristics, including number of tumors, tumor size, tumor morphology, vessel density, apoptotic index, etc.

Diagnostic and Therapeutic Uses

[0110] Specific PFK-modulating agents are useful in a variety of diagnostic and therapeutic applications where disease or disease prognosis is related to defects in the IGFR pathway, such as angiogenic, apoptotic, or cell proliferation disorders. Accordingly, the invention also provides methods for modulating the IGFR pathway in a cell, preferably a cell pre-determined to have defective or impaired IGFR function (e.g. due to overexpression, underexpression, or misexpression of IGFR, or due to gene mutations), comprising the step of administering an agent to the cell that specifically modulates PFK activity. Preferably, the modulating agent produces a detectable phenotypic change in the cell indicating that the IGFR function is restored. The phrase "function is restored", and equivalents, as used herein, means that the desired phenotype is achieved, or is brought closer to normal compared to untreated cells. For example, with restored IGFR function, cell proliferation and/or progression through cell cycle may normalize, or be brought closer to normal relative to untreated cells. The invention also provides methods for treating disorders or disease associated with impaired IGFR function by administering a therapeutically effective amount of a PFK-modulating agent that modulates the IGFR pathway. The invention further provides methods for modulating PFK function in a cell, preferably a cell pre-determined to have defective or impaired PFK function, by administering a PFK-modulating agent. Additionally, the invention provides a method for treating disorders or disease associated with impaired PFK function by administering a therapeutically effective amount of a PFK-modulating agent.

[0111] The discovery that PFK is implicated in the IGFR pathway provides for a variety of methods that can be employed for the diagnostic and prognostic evaluation of diseases and disorders involving defects in the IGFR pathway and for the identification of subjects having a predisposition to such diseases and disorders.

[0112] Various expression analysis methods can be used to diagnose whether PFK expression occurs in a particular sample, including Northern blotting, slot blotting, ribonuclease protection, quantitative RT-PCR, and microarray analysis (e.g., Current Protocols in Molecular Biology (1994) Ausubel F M et al., eds., John Wiley & Sons, Inc., chapter 4; Freeman W M et al., Biotechniques (1999) 26:112-125; Kallioniemi O P, Ann Med 2001, 33:142-147; Blohm and Guiseppi-Elie, Curr Opin Biotechnol 2001, 12:41-47). Tissues having a disease or disorder implicating defective IGFR signaling that express a PFK are identified as amenable to treatment with a PFK modulating agent. In a preferred application, the IGFR defective tissue overexpresses a PFK relative to normal tissue. For example, a Northern blot analysis of mRNA from tumor and normal cell lines, or from tumor and matching normal tissue samples from the same patient, using full or partial PFK cDNA sequences as probes, can determine whether particular tumors express or overexpress PFK. Alternatively, TaqMan.RTM. is used for quantitative RT-PCR analysis of PFK expression in cell lines, normal tissues and tumor samples (PE Applied Biosystems).

[0113] Various other diagnostic methods may be performed, for example, utilizing reagents such as the PFK oligonucleotides, and antibodies directed against a PFK, as described above for: (1) the detection of the presence of PFK gene mutations, or the detection of either over- or under-expression of PFK mRNA relative to the non-disorder state; (2) the detection of either an over- or an under-abundance of PFK gene product relative to the non-disorder state; and (3) the detection of perturbations or abnormalities in the signal transduction pathway mediated by PFK.

[0114] Kits for detecting expression of PFK in various samples, comprising at least one antibody specific to PFK, all reagents and/or devices suitable for the detection of antibodies, the immobilization of antibodies, and the like, and instructions for using such kits in diagnosis or therapy are also provided.

[0115] Thus, in a specific embodiment, the invention is drawn to a method for diagnosing a disease or disorder in a patient that is associated with alterations in PFK expression, the method comprising: a) obtaining a biological sample from the patient; b) contacting the sample with a probe for PFK expression; c) comparing results from step (b) with a control; and d) determining whether step (c) indicates a likelihood of the disease or disorder. Preferably, the disease is cancer, most preferably a cancer as shown in TABLE 1. The probe may be either DNA or protein, including an antibody.

EXAMPLES

[0116] The following experimental section and examples are offered by way of illustration and not by way of limitation.

I. Drosophila IGFR Screen

[0117] A dominant loss of function screen was carried out in Drosophila to identify genes that interact with or modulate the IGFR signaling pathway. Activation of the pathway by overexpression of IGFR at early stages in the developing Drosophila eye leads to an increase in cell number which results in a larger and rougher adult eye (Potter C J et al. (2001) Cell 105:357-368; Huang et al., 1999. Dev. 126:5365-5372). We generated a fly stock with an enlarged eye due to overexpression of IGFR and identified modifiers of this phenotype. We then identified human orthologues of these modifiers.

[0118] The screening stock carried two transgenes. The genotype is as follows:

[0119] +; +; P {DmIGFR-pExp-UAS)} P {Gal4-pExp-1Xey}/TM6B

[0120] Screening stock females of the above genotype were crossed to males from a collection of 3 classes of piggyBac-based transposons. The resulting progeny, which contain both the transgenes and the transposon, were scored for the effect of the transposon on the eye overgrowth phenotype (either enhancement, suppression or no effect). All data were recorded and all modifiers were retested with a repeat of the original cross. Modifiers of the eye phenotype were identified as members of the IGFR pathway. Drosophila PFK was a suppressor of the eye phenotype. Orthologs of the modifiers are referred to herein as PFK.

[0121] BLAST analysis (Altschul et al., supra) was employed to identify orthologs of Drosophila modifiers. For example, representative sequences from PFK, GI#s 21361070 (SEQ ID NO:64), 4505749 (SEQ ID NO:65), and 11321601 (SEQ ID NO:66) share 55%, 58%, and 57% amino acid identity, respectively, with the Drosophila PFK.
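The percent identity figures above come from pairwise alignments. As a minimal sketch of how such a figure is computed, the following assumes two already-aligned sequences of equal length with "-" marking gaps, and divides identical columns by the total alignment length (the convention BLAST uses for its reported identities). The function name and the toy sequences are hypothetical and are not taken from the sequences of this application.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over all alignment columns (gap columns count as mismatches)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    identical = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != "-"  # a gap never counts as an identity
    )
    return 100.0 * identical / len(aligned_a)


# Hypothetical toy alignment: 7 identical columns out of 9.
toy_identity = percent_identity("MKT-AYIAK", "MKTQAYLAK")
```

Other tools may divide by the shorter sequence length or by ungapped columns only, so identity values are comparable only when the same convention is used.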

[0122] Various domains, signals, and functional subunits in proteins were analyzed using the PSORT (Nakai K., and Horton P., Trends Biochem Sci, 1999, 24:34-6; Kenta Nakai, Protein sorting signals and prediction of subcellular localization, Adv. Protein Chem. 54, 277-344 (2000)), PFAM (Bateman A., et al., Nucleic Acids Res, 1999, 27:260-2), SMART (Ponting C P, et al., SMART: identification and annotation of domains from signaling and extracellular protein sequences. Nucleic Acids Res. 1999 Jan. 1; 27(1):229-32), TM-HMM (Erik L. L. Sonnhammer, Gunnar von Heijne, and Anders Krogh: A hidden Markov model for predicting transmembrane helices in protein sequences. In Proc. of Sixth Int. Conf. on Intelligent Systems for Molecular Biology, p 175-182 Ed. J. Glasgow, T. Littlejohn, F. Major, R. Lathrop, D. Sankoff, and C. Sensen Menlo Park, CA: AAAI Press, 1998), and clust (Remm M, and Sonnhammer E. Classification of transmembrane protein families in the Caenorhabditis elegans genome and identification of human orthologs. Genome Res. 2000 November; 10(11):1679-89) programs. For example, the Phosphofructokinase domain (PFAM 00365) of PFK from GI#21361070 (SEQ ID NO:64) is located at approximately amino acid residues 76 to 373 and 448 to 735; the Phosphofructokinase domain of PFK from GI#4505749 (SEQ ID NO:65) is located at approximately amino acid residues 16 to 326 and 401 to 689; and the Phosphofructokinase domain of PFK from GI#11321601 (SEQ ID NO:66) is located at approximately amino acid residues 25 to 335 and 412 to 699.

II. High-Throughput In Vitro Fluorescence Polarization Assay

[0123] Fluorescently-labeled PFK peptide/substrate is added to each well of a 96-well microtiter plate, along with a test agent in a test buffer (10 mM HEPES, 10 mM NaCl, 6 mM magnesium chloride, pH 7.6). Changes in fluorescence polarization relative to control values, determined using a Fluorolite FPM-2 Fluorescence Polarization Microtiter System (Dynatech Laboratories, Inc), indicate that the test compound is a candidate modifier of PFK activity.
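The text does not state how polarization values are computed or what size of change counts as a hit. As an illustrative sketch only, the following uses the standard definition of fluorescence polarization from parallel and perpendicular emission intensities, expressed in millipolarization units (mP). The G-factor default, the function names, and the three-standard-deviation cutoff are assumptions introduced here, not parameters of the described assay.

```python
from statistics import mean, stdev


def polarization_mp(i_parallel: float, i_perpendicular: float, g_factor: float = 1.0) -> float:
    """Fluorescence polarization in mP: 1000 * (I_par - G*I_perp) / (I_par + G*I_perp)."""
    corrected = g_factor * i_perpendicular
    total = i_parallel + corrected
    if total <= 0:
        raise ValueError("total intensity must be positive")
    return 1000.0 * (i_parallel - corrected) / total


def is_candidate(test_mp: float, control_mps: list[float], n_sd: float = 3.0) -> bool:
    """Flag a well whose polarization differs from the control mean by more than
    n_sd sample standard deviations (hypothetical cutoff)."""
    return abs(test_mp - mean(control_mps)) > n_sd * stdev(control_mps)
```

A cutoff based on control-well variability is one common choice for plate screens; a fixed mP difference would serve equally well if the assay window is known.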

III. High-Throughput In Vitro Binding Assay

[0124] ³³P-labeled PFK peptide is added in an assay buffer (100 mM KCl, 20 mM HEPES pH 7.6, 1 mM MgCl₂, 1% glycerol, 0.5% NP-40, 50 mM beta-mercaptoethanol, 1 mg/ml BSA, cocktail of protease inhibitors) along with a test agent to the wells of a Neutralite-avidin coated assay plate and incubated at 25 °C for 1 hour. Biotinylated substrate is then added to each well and incubated for 1 hour. Reactions are stopped by washing with PBS, and the wells are counted in a scintillation counter. Test agents that cause a difference in activity relative to control without test agent are identified as candidate IGFR modulating agents.

IV. Immunoprecipitations and Immunoblotting

[0125] For coprecipitation of transfected proteins, 3×10⁶ appropriate recombinant cells containing the PFK proteins are plated on 10-cm dishes and transfected on the following day with expression constructs. The total amount of DNA is kept constant in each transfection by adding empty vector. After 24 h, cells are collected, washed once with phosphate-buffered saline and lysed for 20 min on ice in 1 ml of lysis buffer containing 50 mM HEPES, pH 7.9, 250 mM NaCl, 20 mM β-glycerophosphate, 1 mM sodium orthovanadate, 5 mM p-nitrophenyl phosphate, 2 mM dithiothreitol, protease inhibitors (complete, Roche Molecular Biochemicals), and 1% Nonidet P-40. Cellular debris is removed by centrifugation twice at 15,000×g for 15 min. The cell lysate is incubated with 25 µl of M2 beads (Sigma) for 2 h at 4 °C with gentle rocking.

[0126] After extensive washing with lysis buffer, proteins bound to the beads are solubilized by boiling in SDS sample buffer, fractionated by SDS-polyacrylamide gel electrophoresis, transferred to polyvinylidene difluoride membrane and blotted with the indicated antibodies. The reactive bands are visualized with horseradish peroxidase coupled to the appropriate secondary antibodies and the enhanced chemiluminescence (ECL) Western blotting detection system (Amersham Pharmacia Biotech).

V. Kinase Assay

[0127] A purified or partially purified PFK is diluted in a suitable reaction buffer, e.g., 50 mM HEPES, pH 7.5, containing magnesium chloride or manganese chloride (1-20 mM) and a peptide or polypeptide substrate, such as myelin basic protein or casein (1-10 µg/ml). The final concentration of the kinase is 1-20 nM. The enzyme reaction is conducted in microtiter plates to facilitate optimization of reaction conditions by increasing assay throughput. A 96-well microtiter plate is employed using a final volume of 30-100 µl. The reaction is initiated by the addition of ³³P-gamma-ATP (0.5 µCi/ml) and incubated for 0.5 to 3 hours at room temperature. Negative controls are provided by the addition of EDTA, which chelates the divalent cation (Mg²⁺ or Mn²⁺) required for enzymatic activity. Following the incubation, the enzyme reaction is quenched using EDTA. Samples of the reaction are transferred to a 96-well glass fiber filter plate (MultiScreen, Millipore). The filters are subsequently washed with phosphate-buffered saline, dilute phosphoric acid (0.5%) or other suitable medium to remove excess radiolabeled ATP. Scintillation cocktail is added to the filter plate and the incorporated radioactivity is quantitated by scintillation counting (Wallac/Perkin Elmer). Activity is defined by the amount of radioactivity detected following subtraction of the negative control reaction value (EDTA quench).
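The activity definition at the end of the preceding paragraph, detected radioactivity minus the negative control value, can be sketched as follows. Averaging replicate wells and reporting counts per minute (cpm) are assumptions made for illustration; the function name and example counts are hypothetical.

```python
from statistics import mean


def kinase_activity(sample_cpm: list[float], edta_control_cpm: list[float]) -> float:
    """Background-corrected activity: mean sample counts minus mean counts
    of the EDTA negative-control wells."""
    if not sample_cpm or not edta_control_cpm:
        raise ValueError("at least one sample well and one control well are required")
    return mean(sample_cpm) - mean(edta_control_cpm)


# Hypothetical replicate counts for one reaction condition.
activity = kinase_activity([1200.0, 1300.0], [200.0, 300.0])
```

A result near zero indicates counts indistinguishable from the EDTA background under these assumptions.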

VI. Expression Analysis

[0128] All cell lines used in the following experiments are NCI (National Cancer Institute) lines, and are available from ATCC (American Type Culture Collection, Manassas, Va. 20110-2209). Normal and tumor tissues were obtained from Impath, UC Davis, Clontech, Stratagene, Ardais, Genome Collaborative, and Ambion.

[0129] TaqMan® analysis was used to assess expression levels of the disclosed genes in various samples.

[0130] RNA was extracted from each tissue sample using Qiagen (Valencia, Calif.) RNeasy kits, following the manufacturer's protocols, to a final concentration of 50 ng/µl. Single stranded cDNA was then synthesized by reverse transcribing the RNA samples using random hexamers and 500 ng of total RNA per reaction, following protocol 4304965 of Applied Biosystems (Foster City, Calif.).

[0131] Primers for expression analysis using the TaqMan® assay (Applied Biosystems, Foster City, Calif.) were prepared according to the TaqMan® protocols and the following criteria: a) primer pairs were designed to span introns to eliminate genomic contamination, and b) each primer pair produced only one product. Expression analysis was performed using a 7900HT instrument.

[0132] TaqMan® reactions were carried out following the manufacturer's protocols, in 25 µl total volume for 96-well plates and 10 µl total volume for 384-well plates, using 300 nM primer and 250 nM probe, and approximately 25 ng of cDNA. The standard curve for result analysis was prepared using a universal pool of human cDNA samples, which is a mixture of cDNAs from a wide variety of tissues, so that a target is likely to be present in appreciable amounts. The raw data were normalized using 18S rRNA (universally expressed in all tissues and cells).
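The standard-curve and 18S normalization steps described above can be sketched as follows. This is a minimal illustration that assumes threshold-cycle (Ct) readings, a dilution series of the universal cDNA pool with known relative quantities, and a least-squares line of Ct against log10(quantity). All function names and example values are hypothetical and do not come from the instrument software.

```python
import math


def fit_standard_curve(quantities: list[float], ct_values: list[float]) -> tuple[float, float]:
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_ct(ct: float, slope: float, intercept: float) -> float:
    """Invert the standard curve to get a relative quantity from a Ct value."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(target_ct: float, ref_ct: float,
                          target_curve: tuple[float, float],
                          ref_curve: tuple[float, float]) -> float:
    """Target quantity divided by the 18S rRNA quantity of the same sample."""
    return quantity_from_ct(target_ct, *target_curve) / quantity_from_ct(ref_ct, *ref_curve)


# Hypothetical 10-fold dilution series with an ideal slope of -3 cycles per decade.
curve = fit_standard_curve([1.0, 10.0, 100.0], [30.0, 27.0, 24.0])
```

Separate curves are fitted for the target gene and for 18S rRNA because amplification efficiency can differ between primer sets.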

[0133] For each expression analysis, tumor tissue samples were compared with matched normal tissues from the same patient. A gene was considered overexpressed in a tumor when the level of expression of the gene was 2-fold or higher in the tumor compared with its matched normal sample. In cases where normal tissue was not available, a universal pool of cDNA samples was used instead. In these cases, a gene was considered overexpressed in a tumor sample when the difference of expression levels between a tumor sample and the average of all normal samples from the same tissue type was greater than 2 times the standard deviation of all normal samples (i.e., Tumor - average(all normal samples) > 2 × STDEV(all normal samples)).
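The two overexpression criteria in the preceding paragraph can be written as simple predicates. This sketch assumes normalized expression values on a linear scale and takes STDEV to mean the sample standard deviation; both are assumptions, and the function names are hypothetical.

```python
from statistics import mean, stdev


def overexpressed_vs_matched(tumor: float, matched_normal: float, fold: float = 2.0) -> bool:
    """Matched-pair rule: tumor expression is at least `fold` times the matched normal."""
    return tumor >= fold * matched_normal


def overexpressed_vs_normals(tumor: float, normals: list[float], k: float = 2.0) -> bool:
    """Unmatched rule: Tumor - average(normals) > k * STDEV(normals).
    Requires at least two normal samples for the sample standard deviation."""
    return tumor - mean(normals) > k * stdev(normals)
```

For example, with normal values of 1.0, 2.0 and 3.0 (mean 2.0, standard deviation 1.0), a tumor value must exceed 4.0 to be called overexpressed under the second rule.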

[0134] Results are shown in Table 1. Number of pairs of tumor samples and matched normal tissue from the same patient are shown for each tumor type. Percentage of the samples with at least two-fold overexpression for each tumor type is provided. A modulator identified by an assay described herein can be further validated for therapeutic effect by administration to a tumor in which the gene is overexpressed. A decrease in tumor growth confirms therapeutic utility of the modulator. Prior to treating a patient with the modulator, the likelihood that the patient will respond to treatment can be diagnosed by obtaining a tumor sample from the patient, and assaying for expression of the gene targeted by the modulator. The expression data for the gene(s) can also be used as a diagnostic marker for disease progression. The assay can be performed by expression analysis as described above, by antibody directed to the gene target, or by any other available detection method.

TABLE 1
SEQ ID NO	2	31	61
Breast	19%	17%	22%
# of Pairs	36	36	36
Colon	10%	30%	18%
# of Pairs	40	40	40
Head And Neck	31%	15%	69%
# of Pairs	13	13	13
Liver	22%	44%	67%
# of Pairs	9	9	9
Lung	15%	20%	68%
# of Pairs	40	40	40
Lymphoma	0%	75%	50%
# of Pairs	4	4	4
Ovary	37%	16%	58%
# of Pairs	19	19	19
Pancreas	75%	67%	67%
# of Pairs	12	12	12
Prostate	8%	4%	0%
# of Pairs	24	24	24
Skin	29%	71%	57%
# of Pairs	7	7	7
Stomach	9%	27%	55%
# of Pairs	11	11	11
Testis	12%	0%	25%
# of Pairs	8	8	8
Thyroid Gland	0%	43%	21%
# of Pairs	14	14	14
Uterus	35%	0%	30%
# of Pairs	23	23	23

VII. PFK Functional Assays

[0135] RNAi experiments were carried out to knock down expression of PFK (SEQ ID NO:2) in various cell lines using small interfering RNAs (siRNA, Elbashir et al., supra).

[0136] Effect of PFK RNAi on Cell Proliferation and Growth.

[0137] Standard colony growth assays, as described above, were employed to study the effects of decreased PFK expression on cell growth. The results of this experiment indicated that RNAi of PFK decreased proliferation in A549 lung cancer and A2780 ovarian cancer cells.

[0138] [.sup.3H]-thymidine incorporation assay, as described above, was also employed to study the effects of decreased PFK expression on cell proliferation. The results of this experiment indicated that RNAi of PFK decreased proliferation in A549 lung cancer cells.

[0139] Effect of PFK RNAi on Apoptosis.

[0140] The Phospho-histone H2B assay, as described above, was employed to study the effects of decreased PFK expression on apoptosis. The results of this experiment indicated that RNAi of PFK of SEQ ID NO:2 increased apoptosis in 231T breast cancer cells, A549 lung cancer cells, and U87MG glioblastoma cells. Further, RNAi of PFK of SEQ ID NO:2 also caused a decrease in cell count in 231T breast cancer cells.

[0141] Involvement in PTEN/IGFR Pathway:

[0142] PFK FOXO nuclear translocation assays. FOXO nuclear translocation assays, as described above, were employed to assess involvement of PFK in the PTEN/IGFR pathway. In these experiments, cells with reduced expression of PFK by RNAi were transiently transfected with a plasmid expressing GFP-tagged FOXO. Automated imaging of cellular components, such as nucleus and cytoplasm, was then carried out to assess translocation of FOXO. Alternatively, cells were co-transfected with siRNA directed to PFK along with a plasmid containing FOXO, and a cassette containing a promoter, a FOXO response element, and luciferase. Cells were then analyzed for luciferase activity and compared with cells with no siRNA. Results indicated that reduced expression of PFK led to translocation of FOXO to the cytoplasm in PC3 prostate cancer cells, A549 lung cancer cells, and A2780 ovarian cancer cells. These results suggest involvement of PFK in the PTEN/IGFR pathway.

[0143] Pan-AKT Assays.

[0144] This assay was developed to detect involvement of PFK in the PTEN/IGFR pathway. The assay detects changes in phosphorylation for several substrates of AKT, such as PRAS40, BAD, 4EBP1, and RPS6. For this experiment, antibodies were raised against phosphorylated AKT substrates, including the consensus phosphorylated AKT substrate sequence RxRxxS/T. Expression levels of phosphorylated substrates were then quantitated at normal levels, in the presence of a negative control, in the presence of a positive control (AKT), and then with PFK knockdown. For example, when AKT levels were reduced, expression of all its substrates was also reduced. Results indicated that the effect of reduced expression of PFK of SEQ ID NO:2 was similar to that of reduced AKT levels in 231T breast cancer and A549 lung cancer cells.

[0145] We used an RPS6 assay for one subset of experiments. RPS6 is an IGF-dependent substrate of AKT. IGF-1 treatment increases cytoplasmic RPS6 levels. Conversely, the Lilly compound LY294002, a PI3K inhibitor, reduces AKT and cytoplasmic RPS6 levels. Cells were plated in 96-well plates, transfected with RNAi for PFK, fixed, treated with RPS6 antibody, and stained. Measurements were based on the percentage of the population of cells with increased or decreased staining compared with negative or positive control cells. Results of this experiment showed that reduced expression of PFK caused an alteration in the level of phospho-RPS6 protein in 231T breast cancer and PC3 prostate cancer cells, thus suggesting an involvement in the IGFR pathway.

[0146] We used PRAS40 as the substrate for another subset of experiments. For this substrate, pathway inhibition causes decreased cytoplasmic staining and increased nuclear and perinuclear staining. Cells were plated in 96-well plates, transfected with RNAi for PFK, fixed, treated with PRAS40 antibody, and stained. Measurements were based on the percentage of the population of cells with an increased or decreased nuclear/cytoplasmic staining ratio compared with negative or positive control cells. Results of this experiment showed that reduced expression of PFK altered the level of phospho-PRAS40 protein in 231T breast cancer cells, A549 lung cancer cells, and PC3 prostate cancer cells, thus suggesting an involvement in the IGFR pathway.

[0147] We used BAD as the substrate for another subset of the experiments. For this substrate, AKT pathway inhibition causes decreased cytoplasmic staining and unchanged or increased nuclear staining. Cells were plated in 96-well plates, transfected with RNAi for PFK, fixed, permeabilized and stained with anti-phospho-BAD antibody. Measurements were based on the percentage of the population of cells with a decreased cytoplasmic/nuclear staining ratio compared with negative or positive control cells. Results of this experiment showed that reduced expression of PFK caused a reduction in the level of phospho-BAD protein in the cytoplasm in 231T breast cancer cells, A549 lung cancer cells, and PC3 prostate cancer cells, thus suggesting an involvement in the IGFR pathway. Taken together, the results of the pan-AKT assay suggest involvement of PFK in the PTEN/IGFR pathway.
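The population-based readout used in the substrate assays above, the percentage of cells whose staining ratio is shifted relative to control cells, can be sketched as follows. The sketch assumes per-cell cytoplasmic and nuclear intensities from automated imaging and a ratio cutoff chosen from negative-control wells. The cutoff, the function name, and the example intensities are hypothetical; the text does not specify how the comparison with control cells was made.

```python
def percent_cells_below(cells: list[tuple[float, float]], ratio_cutoff: float) -> float:
    """Percentage of cells whose cytoplasmic/nuclear intensity ratio is below the cutoff.

    `cells` holds (cytoplasmic_intensity, nuclear_intensity) pairs, one per cell.
    Cells with a non-positive nuclear intensity are skipped as unmeasurable.
    """
    ratios = [cyto / nuc for cyto, nuc in cells if nuc > 0]
    if not ratios:
        raise ValueError("no measurable cells")
    below = sum(1 for r in ratios if r < ratio_cutoff)
    return 100.0 * below / len(ratios)


# Hypothetical well with four imaged cells and a cutoff of 1.0.
shifted = percent_cells_below([(2.0, 1.0), (1.0, 2.0), (1.0, 4.0), (3.0, 1.0)], 1.0)
```

Comparing this percentage between knockdown wells and negative or positive control wells gives the kind of population shift described for BAD; the nuclear/cytoplasmic ratio used for PRAS40 is simply the reciprocal.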

Sequence CWU 1

1

6612924DNAHomo sapiens 1gacggcgacg cggcgcaggc ggcgggagtg cgagctgggc ccgtgtttcg gccgccgcca 60tggccgcggt ggacctggag aagctgcggg cgtcgggcgc gggcaaggcc atcggcgtcc 120tgaccagcgg cggcgacgcg caaggcatga acgctgctgt ccgggctgtg acgcgcatgg 180gcatttatgt gggtgccaaa gtcttcctca tctacgaggg ctatgagggc ctcgtggagg 240gaggtgagaa catcaagcag gccaactggc tgagcgtctc caacatcatc cagctgggcg 300gcactatcat tggcagcgct cgctgcaagg cctttaccac cagggagggg cgccgggcag 360cggcctacaa cctggtccag cacggcatca ccaacctgtg cgtcatcggc ggggatggca 420gcctcacagg tgccaacatc ttccgcagcg agtggggcag cctgctggag gagctggtgg 480cggaaggtaa gatctcagag actacagccc ggacctactc gcacctgaac atcgcgggcc 540tagtgggctc catcgataac gacttctgcg gcaccgacat gaccatcggc acggactcgg 600ccctccaccg catcatggag gtcatcgatg ccatcaccac cactgcccag agccaccaga 660ggaccttcgt gctggaagtg atgggccggc actgcgggta cctggcgctg gtatctgcac 720tggcctcagg ggccgactgg ctgttcatcc ccgaggctcc acccgaggac ggctgggaga 780acttcatgtg tgagaggctg ggtgagactc ggagccgtgg gtcccgactg aacatcatca 840tcatcgctga gggtgccatt gaccgcaacg ggaagcccat ctcgtccagc tacgtgaagg 900acctggtggt tcagaggctg ggcttcgaca cccgtgtaac tgtgctgggc cacgtgcagc 960ggggagggac gccctctgcc ttcgaccgga tcctgagcag caagatgggc atggaggcgg 1020tgatggcgct gctggaagcc acgcctgaca cgccggcctg cgtggtcacc ctctcgggga 1080accagtcagt gcggctgccc ctcatggagt gcgtgcagat gaccaaggaa gtgcagaaag 1140ccatggatga caagaggttt gacgaggcca cccagctccg tggtgggagc ttcgagaaca 1200actggaacat ttacaagctc ctcgcccacc agaagccccc caaggagaag tctaacttct 1260ccctggccat cctgaatgtg ggggccccgg cggctggcat gaatgcggcc gtgcgctcgg 1320cggtgcggac cggcatctcc catggacaca cagtatacgt ggtgcacgat ggcttcgaag 1380gcctagccaa gggtcaggtg caagaagtag gctggcacga cgtggccggc tggttggggc 1440gtggtggctc catgctgggg accaagagga ccctgcccaa gggccagctg gagtccattg 1500tggagaacat ccgcatctat ggtattcacg ccctgctggt ggtcggtggg tttgaggcct 1560atgaaggggt gctgcagctg gtggaggctc gcgggcgcta cgaggagctc tgcatcgtca 1620tgtgtgtcat cccagccacc atcagcaaca acgtccctgg caccgacttc agcctgggct 1680ccgacactgc tgtaaatgcc gccatggaga 
gctgtgaccg catcaaacag tctgcctcgg 1740ggaccaagcg ccgtgtgttc atcgtggaga ccatgggggg ttactgtggc tacctggcca 1800ccgtgactgg cattgctgtg ggggccgacg ccgcctacgt cttcgaggac cctttcaaca 1860tccacgactt aaaggtcaac gtggagcaca tgacggagaa gatgaagaca gacattcaga 1920ggggcctggt gctgcggaac gagaagtgcc atgactacta caccacggag ttcctgtaca 1980acctgtactc atcagagggc aagggcgtct tcgactgcag gaccaatgtc ctgggccacc 2040tgcagcaggg tggcgctcca accccctttg accggaacta tgggaccaag ctgggggtga 2100aggccatgct gtggttgtcg gagaagctgc gcgaggttta ccgcaaggga cgggtgttcg 2160ccaatgcccc agactcggcc tgcgtgatcg gcctgaagaa gaaggcggtg gccttcagcc 2220ccgtcactga gctcaagaaa gacactgatt tcgagcaccg catgccacgg gagcagtggt 2280ggctgagcct gcggctcatg ctgaagatgc tggcacaata ccgcatcagt atggccgcct 2340acgtgtcagg ggagctggag cacgtgaccc gccgcaccct gagcatggac aagggcttct 2400gaggccagcc atgcccacgc ccctccccag cccccaccca tgccagcgca gcgccagggc 2460tcagatgggg cctgggctgt tgtgtctgga gcctgcaggc aggtgggggc tgcgtccctg 2520ctcagcccat cccctgcctc tatccctggc cacctgccag gcctccctcg ggctggtgtc 2580ttgagaccag cctgccaggc cctccagcag gaggacagag tgccctgggg catccacctt 2640cctgcccagg ggacgtggcg ctgtcggtgt ttggaggctg ctgccccctg gctttggcgc 2700cccatgggcc ctcagcgtct ccccatgctg ggctcactac atgggccagc ccttgctcta 2760cctggccggt aggctgctgg cgcctaggtt gtgttgagag ggggatgccc ctggccctgc 2820ctcactgtga cctgctcctg cccacgtgca gcacctgtca ccttttctag aaataaaatc 2880accctgactg tggggtgcat cggtctccgg agaaaacaaa aaaa 292423384DNAHomo sapiens 2gcgacgcggc gcaggcggcg ggagtgcgag ctgggcccgt gtttcggccg ccgccatggc 60cgcggtggac ctggagaagc tgcgggcgtc gggcgcgggc aaggccatcg gcgtcctgac 120cagcggcggc gacgcgcaag gtcccctgac aagcccacca ggccccctgc tgagatggct 180gtgaccctgg gctgacccgc ccagtggcac attgactccg cctggagctg gggagaccag 240agaggccctg tggttggacg gtggcctggg tgcgctgctc ctgccctctc cttgccctgc 300ctcagctgct gcctgccaga ggcgtggcac ctcacctcac acctgctccc tgctgctgag 360ccccacgcca agctggagag cggatgagaa gcatgtgtaa ccagggtaga ggtcgagagt 420cctctcgtgg gggtctccat gttcaaggga gctgccgagg cttgagcagg agcccccagc 
480aggaaactgg ctttgccaag gcccccgctg ggacagactg tttctttcac tgcagtcctg 540ggagccgagg gcaaggggac aggaaagagg aagtgacctc agagcctggt ggcaccagca 600tcatgtccag gctggggggc atgaacgctg ctgtccgggc tgtgacgcgc atgggcattt 660atgtgggtgc caaagtcttc ctcatctacg agggctatga gggcctcgtg gagggaggtg 720agaacatcaa gcaggccaac tggctgagcg tctccaacat catccagctg ggcggcacta 780tcattggcag cgctcgctgc aaggccttta ccaccaggga ggggcgccgg gcagcggcct 840acaacctggt ccagcacggc atcaccaacc tgtgcgtcat cggcggggat ggcagcctta 900caggtgccaa catcttccgc agcgagtggg gcagcctgct ggaggagctg gtggcggaag 960gtaagatctc agagactaca gcccggacct actcgcacct gaacatcgcg ggcctagtgg 1020gctccatcga taacgacttc tgcggcaccg acatgaccat cggcacggac tcggccctcc 1080accgcatcat ggaggtcatc gatgccatca ccaccactgc ccagagccac cagaggacct 1140tcgtgctgga agtgatgggc cggcactgcg ggtacctggc gctggtatct gcactggcct 1200caggggccga ctggctgttc atccccgagg ctccacccga ggacggctgg gagaacttca 1260tgtgtgagag gctgggtgag actcggagcc gtgggtcccg actgaacatc atcatcatcg 1320ctgagggtgc cattgaccgc aacgggaagc ccatctcgtc cagctacgtg aaggacctgg 1380tggttcagag gctgggcttc gacacccgtg taactgtgct gggccacgtg cagcggggag 1440ggacgccctc tgccttcgac cggatcctga gcagcaagat gggcatggag gcggtgatgg 1500cgctgctgga agccacgcct gacacgccgg cctgcgtggt caccctctcg gggaaccagt 1560cagtgcggct gcccctcatg gagtgcgtgc agatgaccaa ggaagtgcag aaagccatgg 1620atgacaagag gtttgacgag gccacccagc tccgtggtgg gagcttcgag aacaactgga 1680acatttacaa gctcctcgcc caccagaagc cccccaagga gaagtctaac ttctccctgg 1740ccatcctgaa tgtgggggcc ccggcggctg gcatgaatgc ggccgtgcgc tcggcggtgc 1800ggaccggcat ctcccatgga cacacagtat acgtggtgca cgatggcttc gaaggcctag 1860ccaagggtca ggtgcaagaa gtaggctggc acgacgtggc cggctggttg gggcgtggtg 1920gctccatgct ggggaccaag aggaccctgc ccaagggcca gctggagtcc attgtggaga 1980acatccgcat ctatggtatt cacgccctgc tggtggtcgg tgggtttgag gcctatgaag 2040gggtgctgca gctggtggag gctcgcgggc gctacgagga gctctgcatc gtcatgtgtg 2100tcatcccagc caccatcagc aacaacgtcc ctggcaccga cttcagcctg ggctccgaca 2160ctgctgtaaa tgccgccatg gagagctgtg accgcatcaa 
acagtctgcc tcggggacca 2220agcgccgtgt gttcatcgtg gagaccatgg ggggttactg tggctacctg gccaccgtga 2280ctggcattgc tgtgggggcc gacgccgcct acgtcttcga ggaccctttc aacatccacg 2340acttaaaggt caacgtggag cacatgacgg agaagatgaa gacagacatt cagaggggcc 2400tggtgctgcg gaacgagaag tgccatgact actacaccac ggagttcctg tacaacctgt 2460actcatcaga gggcaagggc gtcttcgact gcaggaccaa tgtcctgggc cacctgcagc 2520agggtggcgc tccaaccccc tttgaccgga actatgggac caagctgggg gtgaaggcca 2580tgctgtggtt gtcggagaag ctgcgcgagg tttaccgcaa gggacgggtg ttcgccaatg 2640ccccagactc ggcctgcgtg atcggcctga agaagaaggc ggtggccttc agccccgtca 2700ctgagctcaa gaaagacact gatttcgagc accgcatgcc acgggagcag tggtggctga 2760gcctgcggct catgctgaag atgctggcac aataccgcat cagtatggcc gcctacgtgt 2820caggggagct ggagcacgtg acccgccgca ccctgagcat ggacaagggc ttctgaggcc 2880agccatgccc acgcccctcc ccagccccca cccatgccag cgcagcgcca gggctcagat 2940ggggcctggg ctgttgtgtc tggagcctgc aggcaggtgg gggctgcgtc cctgctcagc 3000ccatcccctg cctctatccc tggccacctg ccaggcctcc ctcgggctgg tgtcttgaga 3060ccagcctgcc aggccctcca gcaggaggac agagtgccct ggggcatcca ccttcctgcc 3120caggggacgt ggcgctgtcg gtgtttggag gctgctgccc cctggctttg gcgccccatg 3180ggccctcagc gtctccccat gctgggctca ctacatgggc cagcccttgc tctacctggc 3240cggtaggctg ctggcgccta ggttgtgttg agagggggat gcccctggcc ctgcctcact 3300gtgacctgct cctgcccacg tgcagcacct gtcacctttt ctagaaataa aatcaccctg 3360actgtggggt gcatcggtct ccgg 338433385DNAHomo sapiens 3cgtgtttcgg ccgccgccat ggccgcggtg gacctggaga agctgcgggc gtcgggcgcg 60ggcaaggcca tcggcgtcct gaccagcggc ggcgacgcgc aaggtcccct gacaagccca 120ccaggccccc tgctgagatg gctgtgaccc tgggctgacc cgcccagtgg cacattgact 180ccgcctggag ctggggagac cagagaggcc ctgtggttgg acggtggcct gggtgcgctg 240ctcctgccct ctccttgccc tgcctcagct gctgcctgcc agaggcgtgg cacctcacct 300cacacctgct ccctgctgct gagccccacg ccaagctgga gagcggatga gaagcatgtg 360taaccagggt agaggtcgag agtcctctcg tgggggtctc catgttcaag ggagctgccg 420aggcttgagc aggagccccc agcaggaaac tggctttgcc aaggcccccg ctgggacaga 480ctgtttcttt cactgcagtc ctgggagccg 
agggcaaggg gacaggaaag aggaagtgac 540ctcagagcct ggtggcacca gcatcatgtc caggctgggg ggcatgaacg ctgctgtccg 600ggctgtgacg cgcatgggca tttatgtggg tgccaaagtc ttcctcatct acgagggcta 660tgagggcctc gtggagggag gtgagaacat caagcaggcc aactggctga gcgtctccaa 720catcatccag ctgggcggca ctatcattgg cagcgctcgc tgcaaggcct ttaccaccag 780ggaggggcgc cgggcagcgg cctacaacct ggtccagcac ggcatcacca acctgtgcgt 840catcggcggg gatggcagcc ttacaggtgc caacatcttc cgcagcgagt ggggcagcct 900gctggaggag ctggtggcgg aaggtaagat ctcagagact acagcccgga cctactcgca 960cctgaacatc gcgggcctag tgggctccat cgataacgac ttctgcggca ccgacatgac 1020catcggcacg gactcggccc tccaccgcat catggaggtc atcgatgcca tcaccaccac 1080tgcccagagc caccagagga ccttcgtgct ggaagtgatg ggccggcact gcgggtacct 1140ggcgctggta tctgcactgg cctcaggggc cgactggctg ttcatccccg aggctccacc 1200cgaggacggc tgggagaact tcatgtgtga gaggctgggt gagactcgga gccgtgggtc 1260ccgactgaac atcatcatca tcgctgaggg tgccattgac cgcaacggga agcccatctc 1320gtccagctac gtgaaggacc tggtggttca gaggctgggc ttcgacaccc gtgtaactgt 1380gctgggccac gtgcagcggg gagggacgcc ctctgccttc gaccggatcc tgagcagcaa 1440gatgggcatg gaggcggtga tggcgctgct ggaagccacg cctgacacgc cggcctgcgt 1500ggtcaccctc tcggggaacc agtcagtgcg gctgcccctc atggagtgcg tgcagatgac 1560caaggaagtg cagaaagcca tggatgacaa gaggtttgac gaggccaccc agctccgtgg 1620tgggagcttc gagaacaact ggaacattta caagctcctc gcccaccaga agccccccaa 1680ggagaagtct aacttctccc tggccatcct gaatgtgggg gccccggcgg ctggcatgaa 1740tgcggccgtg cgctcggcgg tgcggaccgg catctcccat ggacacacag tatacgtggt 1800gcacgatggc ttcgaaggcc tagccaaggg tcaggtgcaa gaagtaggct ggcacgacgt 1860ggccggctgg ttggggcgtg gtggctccat gctggggacc aagaggaccc tgcccaaggg 1920ccagctggag tccattgtgg agaacatccg catctatggt attcacgccc tgctggtggt 1980cggtgggttt gaggcctatg aaggggtgct gcagctggtg gaggctcgcg ggcgctacga 2040ggagctctgc atcgtcatgt gtgtcatccc agccaccatc agcaacaacg tccctggcac 2100cgacttcagc ctgggctccg acactgctgt aaatgccgcc atggagagct gtgaccgcat 2160caaacagtct gcctcgggga ccaagcgccg tgtgttcatc gtggagacca tggggggtta 2220ctgtggctac 
ctggccaccg tgactggcat tgctgtgggg gccgacgccg cctacgtctt 2280cgaggaccct ttcaacatcc acgacttaaa ggtcaacgtg gagcacatga cggagaagat 2340gaagacagac attcagaggg gcctggtgct gcggaacgag aagtgccatg actactacac 2400cacggagttc ctgtacaacc tgtactcatc agagggcaag ggcgtcttcg actgcaggac 2460caatgtcctg ggccacctgc agcagggtgg cgctccaacc ccctttgacc ggaactatgg 2520gaccaagctg ggggtgaagg ccatgctgtg gttgtcggag aagctgcgcg aggtttaccg 2580caagggacgg gtgttcgcca atgccccaga ctcggcctgc gtgatcggcc tgaagaagaa 2640ggcggtggcc ttcagccccg tcactgagct caagaaagac actgatttcg agcaccgcat 2700gccacgggag cagtggtggc tgagcctgcg gctcatgctg aagatgctgg cacaataccg 2760catcagtatg gccgcctacg tgtcagggga gctggagcac gtgacccgcc gcaccctgag 2820catggacaag ggcttctgag gccagccatg cccacgcccc tccccagccc ccacccatgc 2880cagcgcagcg ccagggctca gatggggcct gggctgttgt gtctggagcc tgcaggcagg 2940tgggggctgc gtccctgctc agcccatccc ctgcctctat ccctggccac ctgccaggcc 3000tccctccggc tggtgtcttg agaccagcct gccaggccct ccagcaggag gacagagtgc 3060cctggggcat ccaccttcct gcccagggga cgtggcgctg tcggtgtttg gaggctgctg 3120ccccctggct ttggcgcccc atgggccctc agcgtctccc catgctgggc tcactacatg 3180ggccagccct tgctctacct ggccggtagg ctgctggcgc ctaggttgtg ttgagagggg 3240gatgcccctg gccctgcctc actgtgacct gctcctgccc acgtgcagca cctgtcacct 3300tttctagaaa taaaatcacc ctgactgtgg ggtgcatcgg tctccggaaa aaaaaaaaaa 3360aaaaaaaaaa aaaaaaaaaa aaaaa 338542920DNAHomo sapiens 4ggcacgaggc ccgtgtttcg gccgccgcca tggccgcggt ggacctggag aagctgcggg 60cgtcgggcgc gggcaaggcc atcggcgtcc tgaccagcgg cggcgacgcg caaggcatga 120acgctgctgt ccgggctgtg acgcgcatgg gcatttatgt gggtgccaaa gtcttcctca 180tctacgaggg ctatgagggc ctcgtggagg gaggtgagaa catcaagcag gccaactggc 240tgagcgtctc caacatcatc cagctgggcg gcactatcat tggcagcgct cgctgcaagg 300cctttaccac cagggagggg cgccgggcag cggcctacaa cctggtccag cacggcatca 360ccaacctgtg cgtcatcggc ggggatggca gccttacagg tgccaacatc ttccgcagcg 420agtggggcag cctgctggag gagctggtgg cggaaggtaa gatctcagag actacagccc 480ggacctactc gcacctgaac atcgcgggcc tagtgggctc catcgataac gacttctgcg 
540gcaccgacat gaccatcggc acggactcgg ccctccaccg catcatggag gtcatcgatg 600ccatcaccac cactgcccag agccaccaga ggaccttcgt gctggaagtg atgggccggc 660actgcgggta cctggcgctg gtatctgcac tggcctcagg ggccgactgg ctgttcatcc 720ccgaggctcc acccgaggac ggctgggaga acttcatgtg tgagaggctg ggtgagactc 780ggagccgtgg gtcccgactg aacatcatca tcatcgctga gggtgccatt gaccgcaacg 840ggaagcccat ctcgtccagc tacgtgaagg acctggtggt tcagaggctg ggcttcgaca 900cccgtgtaac tgtgctgggc cacgtgcagc ggggagggac gccctctgcc ttcgaccgga 960tcctgagcag caagatgggc atggaggcgg tgatggcgct gctggaagcc acgcctgaca 1020cgccggcctg cgtggtcacc ctctcgggga accagtcagt gcggctgccc ctcatggagt 1080gcgtgcagat gaccaaggaa gtgcagaaag ccatggatga caagaggttt gacgaggcca 1140cccagctccg tggtgggagc ttcgagaaca actggaacat ttacaagctc ctcgcccacc 1200agaagccccc caaggagaag tctaacttct ccctggccat cctgaatgtg ggggccccgg 1260cggctggcat gaatgcggcc gtgcgctcgg cggtgcggac cggcatctcc catggacaca 1320cagtatacgt ggtgcacgat ggcttcgaag gcctagccaa gggtcaggtg caagaagtag 1380gctggcacga cgtggccggc tggttggggc gtggtggctc catgctgggg accaagagga 1440ccctgcccaa gggccagctg gagtccattg tggagaacat ccgcatctat ggtattcacg 1500ccctgctggt ggtcggtggg tttgaggcct atgaaggggt gctgcagctg gtggaggctc 1560gcgggcgcta cgaggagctc tgcatcgtca tgtgtgtcat cccagccacc atcagcaaca 1620acgtccctgg caccgacttc agcctgggct ccgacactgc tgtaaatgcc gccatggaga 1680gctgtgaccg catcaaacag tctgcctcgg ggaccaagcg ccgtgtgttc atcgtggaga 1740ccatgggggg ttactgtggc tacctggcca ccgtgactgg cattgctgtg ggggccgacg 1800ccgcctacgt cttcgaggac cctttcaaca tccacgactt aaaggtcaac gtggagcaca 1860tgacggagaa gatgaagaca gacattcaga ggggcctggt gctgcggaac gagaagtgcc 1920atgactacta caccacggag ttcctgtaca acctgtactc atcagagggc aacggcgtct 1980tcgactgcag gaccaatgtc ctgggccacc tgcagcaggg tggcgctcca accccctttg 2040accggaacta tgggaccaag ctgggggtga aggccatgct gtggttgtcg gagaagctgc 2100gcgaggttta ccgcaaggga cgggtgttcg ccaatgcccc agactcggcc tgcgtgatcg 2160gcctgaagaa gaaggcggcg gccttcagcc ccgtcactga gctcaagaaa gacactgatt 2220tcgagcaccg catgccacgg gagcagtggt ggctgagcct 
gcggctcatg ctgaagatgc 2280tggcacaata ccgcatcagt atggccgcct acgtgtcagg ggagctggag cacgtgaccc 2340gccgcaccct gagcatggac aagggcttct gaggccagcc atgcccacgc ccctccccag 2400cccccaccca tgccagcgca gcgccagggc tcagatgggg cctgggctgt tgtgtctgga 2460gcctgcaggc aggtgggggc tgcgtccctg ctcagcccat cccctgcctc tatccctggc 2520cacctgccag gcctccctcc ggctggtgtc ttgagaccag cctgccaggc cctccagcag 2580gaggacagag tgccctgggg catccacctt cctgcccagg ggacgtggcg ctgtcggtgt 2640ttggaggctg ctgccccctg gctttggcgc cccatgggcc ctcagcgtct ccccatgctg 2700ggctcactac atgggccagc ccttgctcta cctggccggt aggctgctgg cgcctaggtt 2760gtgttgagag ggggatgccc ctggccctgc ctcactgtga cctgctcctg cccacgtgca 2820gcacctgtca ccttttctag aaataaaatc accctgactg tggggtgcat cggtctccgg 2880aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 292052255DNAHomo sapiens 5gcgggtacct ggcgctggta tctgcactgg cctcaggggc cgactggctg ttcatccccg 60aggctccacc cgaggacggc tgggagaact tcatgtgtga gaggctgggt gagactcgga 120gccgtgggtc ccgactgaac atcatcatca tcgctgaggg tgccattgac cgcaacggga 180agcccatctc gtccagctac gtgaaggacc tggtggttca gaggctgggc ttcgacaccc 240gtgtaactgt gctgggccac gtgcagcggg gagggacgcc ctctgccttc gaccggatcc 300tgagcagcaa gatgggcatg gaggcggtga tggcgctgct ggaagccacg cctgacacgc 360cggcctgcgt ggtcaccctc tcggggaacc agtcagtgcg gctgcccctc atggagtgcg 420tgcagatgac caaggaagtg cagaaagcca tggatgacaa gaggtttgac gaggccaccc 480agctccgtgg tgggagcttc gagaacaact ggaacattta caagctcctc gcccaccaga 540agccccccaa ggagaagtct aacttctccc tggccatcct gaatgtgggg gccccggcgg 600ctggcatgaa tgcggccgtg cgctcggcgg tgcggaccgg catctcccat ggacacacag 660tatacgtggt gcacgatggc ttcgaaggcc tagccaaggg tcaggtgcaa gaagtaggct 720ggcacgacgt ggccggctgg ttggggcgtg gtggctccat gctggggacc aagaggaccc 780tgcccaaggg ccagctggag tccattgtgg agaacatccg catctatggt attcacgccc 840tgctggtggt cggtgggttt gaggcctatg aaggggtgct gcagctggtg gaggctcgcg 900ggcgctacga ggagctctgc atcgtcatgt gtgtcatccc agccaccatc agcaacaacg 960tccctggcac cgacttcagc ctgggctccg acactgctgt aaatgccgcc atggagagct 1020gtgaccgcat caaacagtct 
gcctcgggga ccaagcgccg tgtgttcatc gtggagacca 1080tggggggtta ctgtggctac ctggccaccg tgactggcat tgctgtgggg gccgacgccg 1140cctacgtctt cgaggaccct ttcaacatcc acgacttaaa ggtcaacgtg gagcacatga 1200cggagaagat gaagacagac attcagaggg gcctggtgct gcggaacgag aagtgccatg 1260actactacac cacggagttc ctgtacaacc tgtactcatc agagggcaag ggcgtcttcg 1320actgcaggac caatgtcctg ggccacctgc agcagggtgg cgctccaacc ccctttgacc 1380ggaactatgg gaccaagctg ggggtgaagg ccatgctgtg gttgtcggag aagctgcgcg 1440aggtttaccg caagggacgg gtgttcgcca atgccccaga ctcggcctgc gtgatcggcc 1500tgaagaagaa ggcggtggcc ttcagccccg tcactgagct caagaaagac actgatttcg 1560agcaccgcat gccacgggag cagtggtggc tgagcctgcg gctcatgctg aagatgctgg 1620cacaataccg catcagtatg gccgcctacg tgtcagggga gctggagcac gtgacccgcc 1680gcaccctgag catggacaag ggcttctgag gccagccatg cccacgcccc tccccagccc 1740ccacccatgc cagcgcagcg ccagggctca gatggggcct gggctgttgt gtctggagcc 1800tgcaggcagg tgggggctgc gtccctgctc agcccatccc ctgcctctat ccctggccac 1860ctgccaggcc tccctccggc tggtgtcttg agaccagcct gccaggccct ccagcaggag 1920gacagagtgc cctggggcat ccaccttcct gcccagggga cgtggcgctg tcggtgtttg 1980gaggctgctg ccccctggct ttggcgcccc atgggccctc agcgtctccc catgctgggc 2040tcactacatg ggccagccct tgctctacct ggccggtagg ctgctggcgc ctaggttgtg 2100ttgagagggg gatgcccctg gccctgcctc actgtgacct gctcctgccc acgtgcagca 2160cctgtcacct tttctagaaa taaaatcacc ctgactgtgg ggtgcatcgg tctccggaga 2220aaaaaaaaaa aaaaaaaaaa
aaaaaaaaaa aaaaa 225563385DNAHomo sapiens 6cgtgtttcgg ccgccgccat ggccgcggtg gacctggaga agctgcgggc gtcgggcgcg 60ggcaaggcca tcggcgtcct gaccagcggc ggcgacgcgc aaggtcccct gacaagccca 120ccaggccccc tgctgagatg gctgtgaccc tgggctgacc cgcccagtgg cacattgact 180ccgcctggag ctggggagac cagagaggcc ctgtggttgg acggtggcct gggtgcgctg 240ctcctgccct ctccttgccc tgcctcagct gctgcctgcc agaggcgtgg cacctcacct 300cacacctgct ccctgctgct gagccccacg ccaagctgga gagcggatga gaagcatgtg 360taaccagggt agaggtcgag agtcctctcg tgggggtctc catgttcaag ggagctgccg 420aggcttgagc aggagccccc agcaggaaac tggctttgcc aaggcccccg ctgggacaga 480ctgtttcttt cactgcagtc ctgggagccg agggcaaggg gacaggaaag aggaagtgac 540ctcagagcct ggtggcacca gcatcatgtc caggctgggg ggcatgaacg ctgctgtccg 600ggctgtgacg cgcatgggca tttatgtggg tgccaaagtc ttcctcatct acgagggcta 660tgagggcctc gtggagggag gtgagaacat caagcaggcc aactggctga gcgtctccaa 720catcatccag ctgggcggca ctatcattgg cagcgctcgc tgcaaggcct ttaccaccag 780ggaggggcgc cgggcagcgg cctacaacct ggtccagcac ggcatcacca acctgtgcgt 840catcggcggg gatggcagcc ttacaggtgc caacatcttc cgcagcgagt ggggcagcct 900gctggaggag ctggtggcgg aaggtaagat ctcagagact acagcccgga cctactcgca 960cctgaacatc gcgggcctag tgggctccat cgataacgac ttctgcggca ccgacatgac 1020catcggcacg gactcggccc tccaccgcat catggaggtc atcgatgcca tcaccaccac 1080tgcccagagc caccagagga ccttcgtgct ggaagtgatg ggccggcact gcgggtacct 1140ggcgctggta tctgcactgg cctcaggggc cgactggctg ttcatccccg aggctccacc 1200cgaggacggc tgggagaact tcatgtgtga gaggctgggt gagactcgga gccgtgggtc 1260ccgactgaac atcatcatca tcgctgaggg tgccattgac cgcaacggga agcccatctc 1320gtccagctac gtgaaggacc tggtggttca gaggctgggc ttcgacaccc gtgtaactgt 1380gctgggccac gtgcagcggg gagggacgcc ctctgccttc gaccggatcc tgagcagcaa 1440gatgggcatg gaggcggtga tggcgctgct ggaagccacg cctgacacgc cggcctgcgt 1500ggtcaccctc tcggggaacc agtcagtgcg gctgcccctc atggagtgcg tgcagatgac 1560caaggaagtg cagaaagcca tggatgacaa gaggtttgac gaggccaccc agctccgtgg 1620tgggagcttc gagaacaact ggaacattta caagctcctc gcccaccaga agccccccaa 1680ggagaagtct 
aacttctccc tggccatcct gaatgtgggg gccccggcgg ctggcatgaa 1740tgcggccgtg cgctcggcgg tgcggaccgg catctcccat ggacacacag tatacgtggt 1800gcacgatggc ttcgaaggcc tagccaaggg tcaggtgcaa gaagtaggct ggcacgacgt 1860ggccggctgg ttggggcgtg gtggctccat gctggggacc aagaggaccc tgcccaaggg 1920ccagctggag tccattgtgg agaacatccg catctatggt attcacgccc tgctggtggt 1980cggtgggttt gaggcctatg aaggggtgct gcagctggtg gaggctcgcg ggcgctacga 2040ggagctctgc atcgtcatgt gtgtcatccc agccaccatc agcaacaacg tccctggcac 2100cgacttcagc ctgggctccg acactgctgt aaatgccgcc atggagagct gtgaccgcat 2160caaacagtct gcctcgggga ccaagcgccg tgtgttcatc gtggagacca tggggggtta 2220ctgtggctac ctggccaccg tgactggcat tgctgtgggg gccgacgccg cctacgtctt 2280cgaggaccct ttcaacatcc acgacttaaa ggtcaacgtg gagcacatga cggagaagat 2340gaagacagac attcagaggg gcctggtgct gcggaacgag aagtgccatg actactacac 2400cacggagttc ctgtacaacc tgtactcatc agagggcaag ggcgtcttcg actgcaggac 2460caatgtcctg ggccacctgc agcagggtgg cgctccaacc ccctttgacc ggaactatgg 2520gaccaagctg ggggtgaagg ccatgctgtg gttgtcggag aagctgcgcg aggtttaccg 2580caagggacgg gtgttcgcca atgccccaga ctcggcctgc gtgatcggcc tgaagaagaa 2640ggcggtggcc ttcagccccg tcactgagct caagaaagac actgatttcg agcaccgcat 2700gccacgggag cagtggtggc tgagcctgcg gctcatgctg aagatgctgg cacaataccg 2760catcagtatg gccgcctacg tgtcagggga gctggagcac gtgacccgcc gcaccctgag 2820catggacaag ggcttctgag gccagccatg cccacgcccc tccccagccc ccacccatgc 2880cagcgcagcg ccagggctca gatggggcct gggctgttgt gtctggagcc tgcaggcagg 2940tgggggctgc gtccctgctc agcccatccc ctgcctctat ccctggccac ctgccaggcc 3000tccctccggc tggtgtcttg agaccagcct gccaggccct ccagcaggag gacagagtgc 3060cctggggcat ccaccttcct gcccagggga cgtggcgctg tcggtgtttg gaggctgctg 3120ccccctggct ttggcgcccc atgggccctc agcgtctccc catgctgggc tcactacatg 3180ggccagccct tgctctacct ggccggtagg ctgctggcgc ctaggttgtg ttgagagggg 3240gatgcccctg gccctgcctc actgtgacct gctcctgccc acgtgcagca cctgtcacct 3300tttctagaaa taaaatcacc ctgactgtgg ggtgcatcgg tctccggaaa aaaaaaaaaa 3360aaaaaaaaaa aaaaaaaaaa aaaaa 338572879DNAHomo 
sapiens 7gtttcggccg ccgccatggc cgcggtggac ctggagaagc tgcgggcgtc gggcgcgggc 60aaggccatcg gcgtcctgac cagcggcggc gacgcgcaag gcatgaacgc tgctgtccgg 120gctgtgacgc gcatgggcat ttatgtgggt gccaaagtct tcctcatcta cgagggctat 180gagggcctcg tggagggagg tgagaacatc aagcaggcca actggctgag cgtctccaac 240atcatccagc tgggcggcac tatcattggc agcgctcgct gcaaggcctt taccaccagg 300gaggggcgcc gggcagcggc ctacaacctg gtccagcacg gcatcaccaa cctgtgcgtc 360atcggcgggg atggcagcct tacaggtgcc aacatcttcc gcagcgagtg gggcagcctg 420ctggaggagc tggtggcgga aggtaagatc tcagagacta cagcccggac ctactcgcac 480ctgaacatcg cgggcctagt gggctccatc gataacgact tctgcggcac cgacatgacc 540atcggcacgg actcggccct ccaccgcatc atggaggtca tcgatgccat caccaccact 600gcccagagcc accagaggac cttcgtgctg gaagtgatgg gccggcactg cgggtacctg 660gcgctggtat ctgcactggc ctcaggggcc gactggctgt tcatccccga ggctccaccc 720gaggacggct gggagaactt catgtgtgag aggctgggtg agactcggag ccgtgggtcc 780cgactgaaca tcatcatcat cgctgagggt gccattgacc gcaacgggaa gcccatctcg 840tccagctacg tgaaggacct ggtggttcag aggctgggct tcgacacccg tgtaactgtg 900ctgggccacg tgcagcgggg agggacgccc tctgccttcg accggatcct gagcagcaag 960atgggcatgg aggcggtgat ggcgctgctg gaagccacgc ctgacacgcc ggcctgcgtg 1020gtcaccctct cggggaacca gtcagtgcgg ctgcccctca tggagtgcgt gcagatgacc 1080aaggaagtgc agaaagccat ggatgacaag aggtttgacg aggccaccca gctccgtggt 1140gggagcttcg agaacaactg gaacatttac aagctcctcg cccaccagaa gccccccaag 1200gagaagtcta acttctccct ggccatcctg aatgtggggg ccccggcggc tggcatgaat 1260gcggccgtgc gctcggcggt gcggaccggc atctcccatg gacacacagt atacgtggtg 1320cacgatggct tcgaaggcct agccaagggt caggtgcaag aagtaggctg gcacgacgtg 1380gccggctggt tggggcgtgg tggctccatg ctggggacca agaggaccct gcccaagggc 1440cagctggagt ccattgtgga gaacatccgc atctatggta ttcacgccct gctggtggtc 1500ggtgggtttg aggcctatga aggggtgctg cagctggtgg aggctcgcgg gcgctacgag 1560gagctctgca tcgtcatgtg tgtcatccca gccaccatca gcaacaacgt ccctggcacc 1620gacttcagcc tgggctccga cactgctgta aatgccgcca tggagagctg tgaccgcatc 1680aaacagtctg cctcggggac caagcgccgt gtgttcatcg 
tggagaccat ggggggttac 1740tgtggctacc tggccaccgt gactggcatt gctgtggggg ccgacgccgc ctacgtcttc 1800gaggaccctt tcaacatcca cgacttaaag gtcaacgtgg agcacatgac ggagaagatg 1860aagacagaca ttcagagggg cctggtgctg cggaacgaga agtgccatga ctactacacc 1920acggagttcc tgtacaacct gtactcatca gagggcaagg gcgtcttcga ctgcaggacc 1980aatgtcctgg gccacctgca gcagggtggc gctccaaccc cctttgaccg gaactatggg 2040accaagctgg gggtgaaggc catgctgtgg ttgtcggaga agctgcgcga ggtttaccgc 2100aagggacggg tgttcgccaa tgccccagac tcggcctgcg tgatcggcct gaagaagaag 2160gcggtggcct tcagccccgt cactgagctc aagaaagaca ctgatttcga gcaccgcatg 2220ccacgggagc agtggtggct gagcctgcgg ctcatgctga agatgctggc acaataccgc 2280atcagtatgg ccgcctacgt gtcaggggag ctggagcacg tgacccgccg caccctgagc 2340atggacaagg gcttctgagg ccagccatgc ccacgcccct ccccagcccc cacccatgcc 2400agcgcagcgc cagggctcag atggggcctg ggctgttgtg tctggagcct gcaggcaggt 2460gggggctgcg tccctgctca gcccatcccc tgcctctatc cctggccacc tgccaggcct 2520ccctccggct ggtgtcttga gaccagcctg ccaggccctc cagcaggagg acagagtgcc 2580ctggggcatc caccttcctg cccaggggac gtggcgctgt cggtgtttgg aggctgctgc 2640cccctggctt tggcgcccca tgggccctca gcgtctcccc atgctgggct cactacatgg 2700gccagccctt gctctacctg gccggtaggc tgctggcgcc taggttgtgt tgagaggggg 2760atgcccctgg ccctgcctca ctgtgacctg ctcctgccca cgtgcagcac ctgtcacctt 2820ttctagaaat aaaatcaccc tgactgtggg gtgcatcggt caaaaaaaaa aaaaaaaaa 28798149DNAHomo sapiens 8acgcggcgca ggcggcggga gtgcgagctg ggcccgtgtt tcggccgccg ccatggccgc 60ggtggacctg gagaagctgc gggcgtcggg cgcgggcaag gccatcggcg tcctgaccag 120cggcggcgac cggcaaggtg gggcggggg 1499177DNAHomo sapiens 9tcctctgaga tggggagggt gtcagggcct tgcttctcag cgtgggagct gacaggtttg 60ccctgacctc cacaggcatg aacgctgctg tccgggctgt gacgcgcatg ggcatttatg 120tgggtgccaa agtcttcctc atctacgagg taaggccaag gtgggctgtg tgtgtgc 17710168DNAHomo sapiens 10agggacttgc tgccggccgc catgggttcc cctatctcat gcctgcccac tcttgattca 60gggctatgag ggcctcgtgg agggaggtga gaacatcaag caggccaact ggctgagcgt 120atccaacatc atccagctgg tgaggcctgg gaacgcggat gcatgttg 
16811299DNAHomo sapiens 11ccagtcctgg gtccctctgg tgatcccagg gctgtctgcc gcctgccatc tctcctgaag 60tttctggtct cctctgtgca gggcgcgact atcattggca cggctcgctc gaaggccttt 120accaccaggg aggggcgccg ggcagcggct aacaacctgg tccagcacgg catcaccaac 180ctgtgcgtca tcggcgggga tggcagcctt acaggtgcca acatcttccg cagcgagtgg 240ggcagcctgc tggaggagct ggtggcggaa ggtgggtctg tgcccggcgc actgtaggc 29912349DNAHomo sapiens 12tgccacaggg ttcccaggca ggaggaggcc tgagcctgga actcccgggc cctgccgggc 60tgcacgcctg ggatgcgggg gaaggggtgg caggagaggg gtctctggcc ctgtggtggg 120gccagtggag cctcagccag gtcctcctgc tgctcctggc ccaggtaaga tctcagagac 180tacagcccgg acctactcgc acctgaacat cgcgggccta gtgggctcca tcgataacga 240cttctgcggc accgacatga ccatcggcac ggactcggcc ctccaccgca tcatggaggt 300catcgatgcc atcaccacca ctgcccagag gtgagtgagg ctggccgcc 34913606DNAHomo sapiens 13ctagtggatc ccaaggtctg ttttagctca gagcctgggc atgagaaggg gctgtccctg 60cctgcctctc catccactgg gtcccttgag caccccgcag aatcgggctg gcagggcgtg 120tggctggcac tgatgcatcc tcctgttcca tctccacagc caccagagga ccttcgtgct 180ggaagtgatg ggccggcact gcgggtgagg aggggcttct ggcccgctgg gtggcccggg 240tgctgctggg gaccgcagtg acaggtgtgg catatttatg ctagggctca gttaatgcca 300tgggtgtgag agagccgggt gggggcctga gcaggcaggc gctcgctcct ccaggtacct 360ggcgctggta tctgcactgg cctcaggggc cgactggctg ttcatccccg aggctccacc 420cgaggctcca cccgaggacg gctgggagaa cttcatgtgt gagaggctgg gtgaggtggg 480tgccgtccag cctgctgggg gccgcaggtg tcctggtgca ctgggtagcg cccctggggt 540tttgggacca gccttgacca actcatctat cactcatggg ttcatcagca gctgcaggtg 600cccctt 606141226DNAHomo sapiens 14tcactgagag tctgtttccc cctgtaacat gcgggttccc gccttgccca ccacccaggg 60tgcactgggg cagggagggt ggcacagctc tcctgctgga cctgcagccc atcttggccc 120tgtctgacct cacctgcact ggcgccacct cctccaggta gccctccact cccaccacac 180ccgggccctg tctgagctgc tccaaccctg tccctgtagg aggcattgtg gcccgtggtg 240ggcaaacatg gggcttgtgg ctcagagcga gtcccacaaa tgttggctgg gtcattgaag 300tggacggccg aaaagctctt tcagttccca gagagctggg ccgggctggt gtgtggtgct 360gaggcagcag cctgaggctg gcagagggga tggtgtgtcc 
tatgtgcacc agtgtggacc 420cacagtggct tcagggtgca gggtgtccgg ggagccctgg ctggtgccag cgatgcgggg 480cccctggtta taaggatgag cagatggaag ctcacagggg ccccaggtac ctggcccagg 540ctagcccggg caggcccctc tgccctgtgc cccgtggagc tgcagccctg tgcgtcttcc 600cgcctggaag cgttcaccag acacaagggc ccagcccagc tgtgtgtgtg ctgggcccag 660cccgagccgg cggggttggt ggggtgctgc ccttcccagg cgggcgggca gagctcttgt 720tcaccgcctc caggagggcg gggggtccta gtgggcccag cctcatctct gccctcgcgc 780tccaggcctg ctttcctcct gccgggtggg gattgtgccc tggcccatgt gggttgggaa 840tggtggcccc agggagggct cctggcaccc gggggctgtg tccggggcag gtttccctgc 900ctggcagctg agccctgtcg tgtctttgac ccagactcgg agccgtgggt cccgactgaa 960catcatcatc atcgctgagg gtgccattga ccgcaacggg aagcccatct cgtccagcta 1020cgtgaaggac gtgcgtgtgg gcctgggggt ggccactggg cacctgctcc tctaggccgt 1080gtgggctggg gctcagggct ggtccttccc actgtcctgc agctggtggt tcagaggctg 1140ggcttcgaca cccgtgtaac tgtgctgggc cacgtgcagc ggggagggac gccctctgcc 1200ttcgaccgga tcctggtaag tggcca 122615583DNAHomo sapiensmisc_feature(18)..(18)n is a, c, g, or t 15gctctagaac tagtggancc cccgggctgc aggaattcga tatcaagctt cgtcagcgtg 60gggggctctg gaagctgggg tttgcacatc tacagaggat gggcatgtgg cttggggtag 120agggaccaag tgggtgtgcc agcctgaacc cttccccaca gagcagcaag atgggcatgg 180aggcggtgat ggcgctgctg gaagccacgc ctgacacgcc ggcctgcgtg gtcaccctct 240cggggaacca gtcagtgcgg ctgcccctca tggagtgcgt gcagatggta agccctgggc 300cccccccatc agaaccgcct ggcccctctc cccagtcccc actcacaggc cccactgctc 360tctgggggcc cccagcactg tgagcaccgg aggcagggcc tcgtggctgg cccagggcat 420cccaggtctc cagggagggg agggatgtga gcacatccct gggtgggacg tngggacctg 480ggacgttccc caggaggtgt gtcggagctg cagggagcct ggtgagcatg ggaagtcaca 540ggggtccact gccactgagc ttatgtaggc agtggtggga gtt 58316239DNAHomo sapiens 16gtttctcttc cttaagacca aggaagtgca gaaagccatg gatgacaaga ggtttgacga 60ggccacccag ctccgtggtg ggtaagcccc ctcatgatac ccctgcactc ttacatggat 120gggtcccggt gccaggcagc atgtgctcga gtggcgctat gcacgcctgg cctgggtcat 180ccttctaggc accgcgtctg aagatcgagg gaggaagggg cctgcgggtg gacaggagg 
23917103DNAHomo sapiensmisc_feature(6)..(6)n is a, c, g, or t 17tgtgcntggt gtgacccgaa tcccttccag gagcttcgag aacaactgga acatttacag 60gctcctcacc caccagaagc cccccaagga gaaggtgagg cag 10318231DNAHomo sapiensmisc_feature(30)..(31)n is a, c, g, or t 18cctcctgtgc aggttggggg tcccctcccn nggctgtgcc tcacgctnat ctccccttct 60ctctgaagtc taacttctcc ctggccatcc tgaatgtggg ggccccggcg gctggcatga 120atgcggccgt gcgctcggcg gtgcggaccg gcatctccca tggacacaca gtatacgtgg 180tgcacgatgg cttcgaaggc ctagccaagg gtcaggtggg tccggccggg g 23119181DNAHomo sapiens 19cccggcaaca ggcccaaccc tggggggaat tggccagagg ctcaggctgg cccctgaagc 60tgcatgtcct cctggcaggt gcaagaagta ggctggcacg acgtggccgg ctggttgggg 120cgtggtggct ccatgctggg gaccaagagg tgagctgcct gctgcgggta cctggggacc 180t 18120110DNAHomo sapiens 20tttcttccgc aggaccctgc ccaagggcca gctggagtcc attgtggaga acatccgcat 60ctatggtatt cacgccctgc tggtggtcgg tgggtttgag gtgagagctg 11021208DNAHomo sapiensmisc_feature(10)..(10)n is a, c, g, or t 21tcggtgctgn ccttgacctg ccccgtccct actgctgcag gcctatgaag gggtgctgca 60gctggtggag gctcgcgggc gctacgagga gctctgcatc gtcatgtgtg tcatcccagc 120caccatcagc aacaacgtcc ctggcaccga cttcagcctg ggctccgaca ctgctgtaaa 180tgccgccatg gaggtacggn ctcctgga 20822199DNAHomo sapiens 22accccccctt gtcccccaga gctgtgaccg catcaaacag tctgcctcgg ggaccaagcg 60ccgtgtgttc atcgtggaga ccatgggggg ttactgtggc tacctggcca ccgtgactgg 120cattgctgtg ggggccgacg ccgcctacgt cttcgaggac cctttcaaca tccacgactt 180aaaggtgagc ccagcccag 19923174DNAHomo sapiens 23tgctcctgct ggccccggat cgccggtcag cctggaattc cctccccaca gtctccggct 60catccgtgtc cgcccctccc gcaggtcaac gtggagcaca tgacggagaa gatgaagaca 120gacattcaga ggggcctggt gctgcggtga ggctgccgtg ggtccctggc caca 17424144DNAHomo sapiensmisc_feature(9)..(9)n is a, c, g, or t 24gactcaggnc ctgntgcccc ctctcaggaa cgagaagtgc catgactact acaccacgga 60gttcctgtac aacctgtact catcagaggg caagggcgtc ttcgactgca ggaccaatgt 120cctgggccac ctgcagcagg tgtg 14425207DNAHomo sapiens 25gatccccgat cctgtctgca ctggcgttgg ccttggccag gcagcccagg ggagtccagg 
60gaaccgggcc tcacctgttt ccagggtggc gctccaaccc cctttgaccg gaactatggg 120accaagctgg gggtgaaggc catgctgtgg ttgtcggaga agctgcgcga ggtttaccgc 180aagggtaggt ggtgggtgcg acccgag 20726327DNAHomo sapiensmisc_feature(191)..(191)n is a, c, g, or t 26cctccctggg gcagggcctc accatggagg gctgccacgt gcctctgttt gcaggacggg 60tgttcgccaa tgccccagac tcggcctgcg tgatcggcct gaagaagaag gcggtggcct 120tcagccccgt cactgagctc aagaaagaca ctgatttcga gtgagttcca ccaaagcctc 180gtggaggcgg ngtggggctg aggggtggcc cagaccttcc ctgaggccgg tgtgccagac 240ccagccccac tggcaccctg accccgcaag cctcctggcc ccatgtccag gtccccccag 300gccgtggaga gcagggacca tgcccaa 327271080DNAHomo sapiens 27gtggctgaag agctgccctg acccctgact ccccatcatc ctcccatccc cgtcctgcac 60aggcaccgca tgccacggga gcagtggtgg ctgagcctgc ggctcatgct gaagatgctg 120gcacaatacc gcatcagtat ggccgcctac gtgtcagggg agctggagca cgtgacccgc 180cgcaccctga gcatggacaa gggcttctga ggccagccat gcccgagctg cccctcccca 240gcccccaccc atgccagcgc acgcgccagg gctcagatgg ggcctgggct gttgtgtctg 300gagcctgcag gcaggtgggg gctgcgtccc tgctcagccc atcccctgcc tctactccct 360ggccacctgc caggcctccc tccggctggt gtcttgagac cagcctgcca ggcctccagc 420aggaggacag agtgccctgg ggcatccacc ttcctgccca ggggacgtgg cgctgtcggt 480gtttggaggc tgctgccccc tggctttggc gccccatggg ccctcagcgt ctccccatgc 540tgggctcact acatgggcca gcccttgctc tacctggccg gtaggctgct ggcgcctagg 600ttgtgttgag agggggatgc ccctggccct gcctcactgt gacctgctcc tgcccacgtg 660cagcacctgt caccttttct agaaataaaa tcaccctgac tgtggggtgc catcggtctc 720cggagagcac agctgcagaa ctcctcaccg agcggccacg ccagtggcag cagccccagg 780ggtggaggcc ctctggccag tgcctgggac aggtcaaagg gacatgtgcc ctgagaggcc 840acaggtgctc ttcaggactc cctgggggcc accagggtga cctgagcccc tcctggtcct 900cccctggggg cagaagggta cagcctcact cctctgtctc ccaacctcag cctgagtggg 960ggtctccaac ctgcaggctg gtggctggct tgagccagta tccaggagac attgatggtg 1020gtctcgcaag aggggaaaag aaggcacggc agagctgcgc ccaccagggg ctagggctga 1080282815DNAHomo sapiens 28gtttcggccg ccgccatggc cgcggtggac ctggagaagc tgcgggcgtc gggcgcgggc 60aaggccatcg gcgtcctgac 
cagcggcggc gacgcgcaag gcatgaacgc tgctgtccgg 120gctgtgacgc gcatgggcat ttatgtgggt gccaaagtct tcctcatcta cgagggctat 180gagggcctcg tggagggagg tgagaacatc aagcaggcca actggctgag cgtctccaac 240atcatccagc tgggcggcac tatcattggc agcgctcgct gcaaggcctt taccaccaac 300ctgtgcgtca tcggcgggga tggcagcctc acaggtgcca acatcttccg cagcgagtgg 360ggcagcctgc tggaggagct ggtggcggaa ggtaagatct cagagactac agcccggacc 420tactcgcacc tgaacatcgc gggcctagtg ggctccatcg ataacgactt ctgcggcacc 480gacatgacca tcggcacgga ctcggccctc caccgcatca tggaggtcat cgatgccatc 540accaccactg cccagagcca ccagaggacc ttcgtgctgg aagtgatggg ccggcactgc 600gggtacctgg cgctggtatc tgcactggcc tcaggggccg actggctgtt catccccgag 660gctccacccg aggacggctg ggagaacttc atgtgtgaga ggctgggtga gactcggagc 720cgtgggtccc gactgaacat catcatcatc gctgagggtg ccattgaccg caacgggaag 780cccatctcgt ccagctacgt gaaggacctg gtggttcaga
ggctgggctt cgacacccgt 840gtaactgtgc tgggccacgt gcagcgggga gggacgccct ctgccttcga ccggatcctg 900agcagcaaga tgggcatgga ggcggttatg gcgctgctgg aagccacgcc tgacacgccg 960gcctgcgtgg tcaccctctc ggggaaccag tcagtgcggc tgcccctcat ggagtgcgtg 1020cagatgacca aggaagtgca gaaagccatg gatgacaaga ggtttgacga ggccacccag 1080ctccgtggtg ggagcttcga gaacaactgg aacatttaca agctcctcgc ccaccagaag 1140ccccccaagg agaagtctaa cttctccctg gccatcctga atgtgggggc cccggcggct 1200ggcatgaatg cggccgtgcg ctcggcggtg cggaccggca tctcccatgg acacacagta 1260tacgtggtgc acgatggctt cgaaggccta gccaagggtc aggtgcaaga agtaggctgg 1320cacgacctgg ccggctggtt ggggcgtggt ggctccatgc tggggaccaa gaggaccctg 1380cccaagggcc agctggagtc cattgtggag aacatccgca tctatggtat tcacgccctg 1440ctggtggtcg gtgggtttga ggcctatgaa ggggtgctgc agctggtgga ggctcgcggg 1500cgctacgagg agctctgcat cgtcatgtgt gtcatcccag ccaccatcag caacaacgtc 1560cctggcaccg acttcagcct gggctccgac actgctgtaa atgccgccat ggagagctgt 1620gaccgcatca aacagtctgc ctcggggacc aagcgccgtg tgttcatcgt ggagaccatg 1680gggggttact gtggctacct ggccaccgtg actggcattg ctgtgggggc cgacgccgcc 1740tacgtcttcg aggacccttt caacatccac gacttaaagg tcaacgtgga gcacatgacg 1800gagaagatga agacagacat tcagaggggc ctggtgctgc ggaacgagaa gtgccatgac 1860tactacacca cggagttcct gtacaacctg tactcatcag agggcaaggg cgtcttcgac 1920tgcaggacca atgtcctggg ccacctgcag cagggtggcg ctccaacccc ctttgaccgg 1980aactatggga ccaagctggg ggtggaggcc atgctgtggt tgtcggagaa gctgcgcgag 2040gtttaccgca agggacgggt gttcgccaat gccccagact cggcctgcgt gatcggcctg 2100aagaagaagg cggtggcctt cagccccgtc actgagctca agaaagacac tgatttcgag 2160caccgcatgc cacgggagca gtggtggctg agcctgcggc tcatgctgaa gatgctggca 2220caataccgca tcagtatggc cgcctacgtg tcaggggagc tggagcacgt gacccgccgc 2280accctgagca tggacaaggg cttctgaggc cagccatgcc cacgcccctc cccagccccc 2340acccatgcca gcgcagcgcc agggctcaga tggggcctgg gctgttgtgt ctggagcctg 2400caggcaggtg ggggctgcgt ccctgctcag cccatcccct gcctctatcc ctggccacct 2460gccaggcctc cctccggctg gtgtcttgag accagcctgc caggccctcc agcaggagga 2520cagagtgccc 
tggggcatcc accttcctgc ccaggggacg tggcgctgtc ggtgtttgga 2580ggctgctgcc ccctggcttt ggcgccccat gggccctcag cgtctcccca tgctgggctc 2640actacatggg ccagcccttg ctctacctgg ccggtaggct gctggcgcct aggttgtgtt 2700gagaggggga tgcccctggc cctgcctcac tgtgacctgc tcctgcccac gtgcagcacc 2760tgtcaccttt tctagaaata aaatcaccct gactgtgggg tgcatcggtc tccgg 2815293402DNAHomo sapiens 29gacggcgacg cggcgcaggc ggcgggagtg cgagctgggc ccgtgtttcg gccgccgcca 60tggccgcggt ggacctggag aagctgcggg cgtcgggcgc gggcaaggcc atcggcgtcc 120tgaccagcgg cggcgacgcg caaggtcccc tgacaagccc accaggcccc ctgctgagat 180ggctgtgacc ctgggctgac ccgcccagtg gcacattgac tccgcctgga gctggggaga 240ccagagaggc cctgtggttg gacggtggcc tgggtgcgct gctcctgccc tctccttgcc 300ctgcctcagc tgctgcctgc cagaggcgtg gcacctcacc tcacacctgc tccctgctgc 360tgagccccac gccaagctgg agagcggatg agaagcatgt gtaaccaggg tagaggtcga 420gagtcctctc gtgggggtct ccatgttcaa gggagctgcc gaggcttgag caggagcccc 480cagcaggaaa ctggctttgc caaggccccc gctgggacag actgtttctt tcactgcagt 540cctgggagcc gagggcaagg ggacaggaaa gaggaagtga cctcagagcc tggtggcacc 600agcatcatgt ccaggctggg gggcatgaac gctgctgtcc gggctgtgac gcgcatgggc 660atttatgtgg gtgccaaagt cttcctcatc tacgagggct atgagggcct cgtggaggga 720ggtgagaaca tcaagcaggc caactggctg agcgtctcca acatcatcca gctgggcggc 780actatcattg gcagcgctcg ctgcaaggcc tttaccacca gggaggggcg ccgggcagcg 840gcctacaacc tggtccagca cggcatcacc aacctgtgcg tcatcggcgg ggatggcagc 900ctcacaggtg ccaacatctt ccgcagcgag tggggcagcc tgctggagga gctggtggcg 960gaaggtaaga tctcagagac tacagcccgg acctactcgc acctgaacat cgcgggccta 1020gtgggctcca tcgataacga cttctgcggc accgacatga ccatcggcac ggactcggcc 1080ctccaccgca tcatggaggt catcgatgcc atcaccacca ctgcccagag ccaccagagg 1140accttcgtgc tggaagtgat gggccggcac tgcgggtacc tggcgctggt atctgcactg 1200gcctcagggg ccgactggct gttcatcccc gaggctccac ccgaggacgg ctgggagaac 1260ttcatgtgtg agaggctggg tgagactcgg agccgtgggt cccgactgaa catcatcatc 1320atcgctgagg gtgccattga ccgcaacggg aagcccatct cgtccagcta cgtgaaggac 1380ctggtggttc agaggctggg cttcgacacc cgtgtaactg 
tgctgggcca cgtgcagcgg 1440ggagggacgc cctctgcctt cgaccggatc ctgagcagca agatgggcat ggaggcggtg 1500atggcgctgc tggaagccac gcctgacacg ccggcctgcg tggtcaccct ctcggggaac 1560cagtcagtgc ggctgcccct catggagtgc gtgcagatga ccaaggaagt gcagaaagcc 1620atggatgaca agaggtttga cgaggccacc cagctccgtg gtgggagctt cgagaacaac 1680tggaacattt acaagctcct cgcccaccag aagcccccca aggagaagtc taacttctcc 1740ctggccatcc tgaatgtggg ggccccggcg gctggcatga atgcggccgt gcgctcggcg 1800gtgcggaccg gcatctccca tggacacaca gtatacgtgg tgcacgatgg cttcgaaggc 1860ctagccaagg gtcaggtgca agaagtaggc tggcacgacg tggccggctg gttggggcgt 1920ggtggctcca tgctggggac caagaggacc ctgcccaagg gccagctgga gtccattgtg 1980gagaacatcc gcatctatgg tattcacgcc ctgctggtgg tcggtgggtt tgaggcctat 2040gaaggggtgc tgcagctggt ggaggctcgc gggcgctacg aggagctctg catcgtcatg 2100tgtgtcatcc cagccaccat cagcaacaac gtccctggca ccgacttcag cctgggctcc 2160gacactgctg taaatgccgc catggagagc tgtgaccgca tcaaacagtc tgcctcgggg 2220accaagcgcc gtgtgttcat cgtggagacc atggggggtt actgtggcta cctggccacc 2280gtgactggca ttgctgtggg ggccgacgcc gcctacgtct tcgaggaccc tttcaacatc 2340cacgacttaa aggtcaacgt ggagcacatg acggagaaga tgaagacaga cattcagagg 2400ggcctggtgc tgcggaacga gaagtgccat gactactaca ccacggagtt cctgtacaac 2460ctgtactcat cagagggcaa gggcgtcttc gactgcagga ccaatgtcct gggccacctg 2520cagcagggtg gcgctccaac cccctttgac cggaactatg ggaccaagct gggggtgaag 2580gccatgctgt ggttgtcgga gaagctgcgc gaggtttacc gcaagggacg ggtgttcgcc 2640aatgccccag actcggcctg cgtgatcggc ctgaagaaga aggcggtggc cttcagcccc 2700gtcactgagc tcaagaaaga cactgatttc gagcaccgca tgccacggga gcagtggtgg 2760ctgagcctgc ggctcatgct gaagatgctg gcacaatacc gcatcagtat ggccgcctac 2820gtgtcagggg agctggagca cgtgacccgc cgcaccctga gcatggacaa gggcttctga 2880ggccagccat gcccacgccc ctccccagcc cccacccatg ccagcgcagc gccagggctc 2940agatggggcc tgggctgttg tgtctggagc ctgcaggcag gtgggggctg cgtccctgct 3000cagcccatcc cctgcctcta tccctggcca cctgccaggc ctccctcggg ctggtgtctt 3060gagaccagcc tgccaggccc tccagcagga ggacagagtg ccctggggca tccaccttcc 3120tgcccagggg 
acgtggcgct gtcggtgttt ggaggctgct gccccctggc tttggcgccc 3180catgggccct cagcgtctcc ccatgctggg ctcactacat gggccagccc ttgctctacc 3240tggccggtag gctgctggcg cctaggttgt gttgagaggg ggatgcccct ggccctgcct 3300cactgtgacc tgctcctgcc cacgtgcagc acctgtcacc ttttctagaa ataaaatcac 3360cctgactgtg gggtgcatcg gtctccggag aaaacaaaaa aa 3402302812DNAHomo sapiens 30ctaaaagagt ggatcatgac ccatgaagag caccatgcag ccaaaaccct ggggattggc 60aaagccattg ctgtcttaac ctctggtgga gatgcccaag gtatgaatgc tgctgtcagg 120gctgtggttc gagttggtat cttcaccggt gcccgtgtct tctttgtcca tgagggttat 180caaggcctgg tggatggtgg agatcacatc aaggaagcca cctgggagag cgtttcgatg 240atgcttcagc tgggaggcac ggtgattgga agtgcccggt gcaaggactt tcgggaacga 300gaaggacgac tccgagctgc ctacaacctg gtgaagcgtg ggatcaccaa tctctgtgtc 360attgggggtg atggcagcct cactggggct gacaccttcc gttctgagtg gagtgacttg 420ttgagtgacc tccagaaagc aggtaagatc acagatgagg aggctacgaa gtccagctac 480ctgaacattg tgggcctggt tgggtcaatt gacaatgact tctgtggcac cgatatgacc 540attggcactg actctgccct gcatcggatc atggaaattg tagatgccat cactaccact 600gcccagagcc accagaggac atttgtgtta gaagtaatgg gccgccactg tggatacctg 660gcccttgtca cctctctgtc ctgtggggcc gactgggttt ttattcctga atgtccacca 720gatgacgact gggaggaaca cctttgtcgc cgactcagcg agacaaggac ccgtggttct 780cgtctcaaca tcatcattgt ggctgagggt gcaattgaca agaatggaaa accaatcacc 840tcagaagaca tcaagaatct ggtggttaag cgtctgggat atgacacccg ggttactgtc 900ttggggcatg tgcagagggg tgggacgcca tcagcctttg acagaattct gggcagcagg 960atgggtgtgg aagcagtgat ggcacttttg gaggggaccc cagatacccc agcctgtgta 1020gtgagcctct ctggtaacca ggctgtgcgc ctgcccctca tggaatgtgt ccaggtgacc 1080aaagatgtga ccaaggccat ggatgagaag aaatttgacg aagccctgaa gctgagaggc 1140cggagcttca tgaacaactg ggaggtgtac aagcttctag ctcatgtcag acccccggta 1200tctaagagtg gttcgcacac agtggctgtg atgaacgtgg gggctccggc tgcaggcatg 1260aatgctgctg ttcgctccac tgtgaggatt ggccttatcc agggcaaccg agtgctcgtt 1320gtccatgatg gtttcgaggg cctggccaag gggcagatag aggaagctgg ctggagctat 1380gttgggggct ggactggcca aggtggctct aaacttggga ctaaaaggac 
tctacccaag 1440aagagctttg aacagatcag tgccaatata actaagttta acattcaggg ccttgtcatc 1500attgggggct ttgaggctta cacagggggc ctggaactga tggagggcag gaagcagttt 1560gatgagctct gcatcccatt tgtggtcatt cctgctacag tctccaacaa tgtccctggc 1620tcagacttca gcgttggggc tgacacagca ctcaatacta tctgcacaac ctgtgaccgc 1680atcaagcagt cagcagctgg caccaagcgt cgggtgttta tcattgagac tatgggtggc 1740tactgtggct acctggctac catggctgga ctggcagctg gggccgatgc tgcctacatt 1800tttgaggagc ccttcaccat tcgagacctg caggcaaatg ttgaacatct ggtgcaaaag 1860atgaaaacaa ctgtgaaaag gggcttggtg ttaaggaatg aaaagtgcaa tgagaactat 1920accactgact tcattttcaa cctgtactct gaggagggga agggcatctt cgacagcagg 1980aagaatgtgc ttggtcacat gcagcagggt gggagcccaa ccccatttga taggaatttt 2040gccactaaga tgggcgccaa ggctatgaac tggatgtctg ggaaaatcaa agagagttac 2100cgtaatgggc ggatctttgc caatactcca gattcgggct gtgttctggg gatgcgtaag 2160agggctctgg tcttccaacc agtggctgag ctgaaggacc agacagattt tgagcatcga 2220atccccaagg aacagtggtg gctgaaactg aggcccatcc tcaaaatcct agccaagtac 2280gagattgact tggacacttc agaccatgcc cacctggagc acatcacccg gaagcggtcc 2340ggggaagctg ccgtctaaac ctctctggag tgaggggaat agattacctg atcatggtca 2400gctcacaccc taataagtcc acatcttctc agtgttttag ctgttttttt cattaggttt 2460ccttttattc tgtaccttgc agccatgacc agttctggcc aggagctgga ggagcaggca 2520gtgggtggga gctcctttta ggtagaattt aacatgactt ctgccccagc tttatctgtc 2580acacaaggct gggcacctct agtgctactg ctagatatca cttactcagt tagaattttc 2640ctaaaaataa gctttattta tttctttgtg ataacaaaga gtcttggttc ctctactact 2700tttactacag tgacaaattg taactacact aataaatgcc aactggtcac tgtgaaaaaa 2760aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 2812312759DNAHomo sapiens 31gcctgactga gagtggatca tgacccatga agagcaccat gcagccaaaa ccctggggat 60tggcaaagcc attgctgtct taacctctgg tggagatgcc caaggtatga atgctgctgt 120cagggctgtg gttcgagttg gtatcttcac cggtgcccgt gtcttctttg tccatgaggg 180ttatcaaggc ctggtggatg gtggagatca catcaaggaa gccacctggg agagcgtttc 240gatgatgctt cagctgggag gcacggtgat tggaagtgcc cggtgcaagg actttcggga 300acgagaagga 
cgactccgag ctgcctacaa cctggtgaag cgtgggatca ccaatctctg 360tgtcattggg ggtgatggca gcctcactgg ggctgacacc ttccgttctg agtggagtga 420cttgttgagt gacctccaga aagcaggtaa gatcacagat gaggaggcta cgaagtccag 480ctacctgaac attgtgggcc tggttgggtc aattgacaat gacttctgtg gcactgatat 540gaccattggc actgactctg ccctgcatcg gatcatggaa attgtagatg ccatcactac 600cactgcccag agccaccaga ggacatttgt gttagaagta atgggccgcc actgtggata 660cctggccctt gtcacctctc tgtcctgtgg ggccgactgg gtttttattc ctgaatgtcc 720accagatgac gactgggagg aacacctttg tcgccgactc agcgagacaa ggacccgtgg 780ttctcgtctc aacatcatca ttgtggctga gggtgcaatt gacaagaatg gaaaaccaat 840cacctcagaa gacatcaaga atctggtggt taagcgtctg ggatatgaca cccgggttac 900tgtcttgggg catgtgcaga ggggtgggac gccatcagcc tttgacagaa ttctgggcag 960caggatgggt gtggaagcag tgatggcact tttggagggg accccagata ccccagcctg 1020tgtagtgagc ctctctggta accaggctgt gcgcctgccc ctcatggaat gtgtccaggt 1080gaccaaagat gtgaccaagg ccatggatga gaagaaattt gacgaagccc tgaagctgag 1140aggccggagc ttcatgaaca actgggaggt gtacaagctt ctagctcatg tcagaccccc 1200ggtatctaag agtggttcgc acacagtggc tgtgatgaac gtgggggctc cggctgcagg 1260catgaatgct gctgttcgct ccactgtgag gattggcctt atccagggca accgagtgct 1320cgttgtccat gatggtttcg agggcctggc caaggggcag atagaggaag ctggctggag 1380ctatgttggg ggctggactg gccaaggtgg ctctaaactt gggactaaaa ggactctacc 1440caagaagagc tttgaacaga tcagtgccaa tataactaag tttaacattc agggccttgt 1500catcattggg ggctttgagg cttacacagg gggcctggaa ctgatggagg gcaggaagca 1560gtttgatgag ctctgcatcc catttgtggt cattcctgct acagtctcca acaatgtccc 1620tggctcagac ttcagcgttg gggctgacac agcactcaat actatctgca caacctgtga 1680ccgcatcaag cagtcagcag ctggcaccaa gcgtcgggtg tttatcattg agactatggg 1740tggctactgt ggctacctgg ctaccatggc tggactggca gctggggccg atgctgccta 1800catttttgag gagcccttca ccattcgaga cctgcaggca aatgttgaac atctggtgca 1860aaagatgaaa acaactgtga aaaggggctt ggtgttaagg aatgaaaagt gcaatgagaa 1920ctataccact gacttcattt tcaacctgta ctctgaggag gggaagggca tcttcgacag 1980caggaagaat gtgcttggtc acatgcagca gggtgggagc ccaaccccat 
ttgataggaa 2040ttttgccact aagatgggcg ccaaggctat gaactggatg tctgggaaaa tcaaagagag 2100ttaccgtaat gggcggatct ttgccaatac tccagattcg ggctgtgttc tggggatgcg 2160taagagggct ctggtcttcc aaccagtggc tgagctgaag gaccagacag attttgagca 2220tcgaatcccc aaggaacagt ggtggctgaa actgaggccc atcctcaaaa tcctagccaa 2280gtacgagatt gacttggaca cttcagacca tgcccacctg gagcacatca cccggaagcg 2340gtccggggaa gctgccgtct aaacctctct ggagtgaggg gaatagatta cctgatcatg 2400gtcagctcac accctaataa gtccacatct tctcagtgtt ttagctgttt ttttcattag 2460gtttcctttt attctgtacc ttgcagccat gaccagttct ggccaggagc tggaggagca 2520ggcagtgggt gggagctcct tttaggtaga atttaacatg acttctgccc cagctttatc 2580tgtcacacaa ggctgggcac ctctagtgct actgctagat atcacttact cagttagaat 2640tttcctaaaa ataagcttta tttatttctt tgtgataaca aagagtcttg gttcctctac 2700tacttttact acagtgacaa attgtaacta cactaataaa tgccaactgg tcactgtga 2759322821DNAHomo sapiens 32ggcacgaggc taaaagagtg gatcatgacc catgaagagc accatgcagc caaaaccctg 60gggattggca aagccattgc tgtcttaacc tctggtggag atgcccaagg tatgaatgct 120gctgtcaggg ctgtggttcg agttggtatc ttcaccggtg cccgtgtctt ctttgtccat 180gagggttatc aaggcctggt ggatggtgga gatcacatca aggaagccac ctgggagagc 240gtttcgatga tgcttcagct gggaggcacg gtgattggaa gtgcccggtg caaggacttt 300cgggaacgag aaggacgact ccgagctgcc tacaacctgg tgaagcgtgg gatcaccaat 360ctctgtgtca ttgggggtga tggcagcctc actggggctg acaccttccg ttctgagtgg 420agtgacttgt tgagtgacct ccagaaagca ggtaagatca cagatgagga ggctacgaag 480tccagctacc tgaacattgt gggcctggtt gggtcaattg acaatgactt ctgtggcacc 540gatatgacca ttggcactga ctctgccctg catcggatca tggaaattgt agatgccatc 600actaccactg cccagagcca ccagaggaca tttgtgttag aagtaatggg ccgccactgt 660ggatacctgg cccttgtcac ctctctgtcc tgtggggccg actgggtttt tattcctgaa 720tgtccaccag atgacgactg ggaggaacac ctttgtcgcc gactcagcga gacaaggacc 780cgtggttctc gtctcaacat catcattgtg gctgagggtg caattgacaa gaatggaaaa 840ccaatcacct cagaagacat caagaatctg gtggttaagc gtctgggata tgacacccgg 900gttactgtct tggggcatgt gcagaggggt gggacgccat cagcctttga cagaattctg 960ggcagcagga 
tgggtgtgga agcagtgatg gcacttttgg aggggacccc agatacccca 1020gcctgtgtag tgagcctctc tggtaaccag gctgtgcgcc tgcccctcat ggaatgtgtc 1080caggtgacca aagatgtgac caaggccatg gatgagaaga aatttgacga agccctgaag 1140ctgagaggcc ggagcttcat gaacaactgg gaggtgtaca agcttctagc tcatgtcaga 1200cccccggtat ctaagagtgg ttcgcacaca gtggctgtga tgaacgtggg ggctccggct 1260gcaggcatga atgctgctgt tcgctccact gtgaggattg gccttatcca gggcaaccga 1320gtgctcgttg tccatgatgg tttcgagggc ctggccaagg ggcagataga ggaagctggc 1380tggagctatg ttgggggctg gactggccaa ggtggctcta aacttgggac taaaaggact 1440ctacccaaga agagctttga acagatcagt gccaatataa ctaagtttaa cattcagggc 1500cttgtcatca ttgggggctt tgaggcttac acagggggcc tggaactgat ggagggcagg 1560aagcagtttg atgagctctg catcccattt gtggtcattc ctgctacagt ctccaacaat 1620gtccctggct cagacttcag cgttggggct gacacagcac tcaatactat ctgcacaacc 1680tgtgaccgca tcaagcagtc agcagctggc accaagcgtc gggtgtttat cattgagact 1740atgggtggct actgtggcta cctggctacc atggctggac tggcagctgg ggccgatgct 1800gcctacattt ttgaggagcc cttcaccatt cgagacctgc aggcaaatgt tgaacatctg 1860gtgcaaaaga tgaaaacaac tgtgaaaagg ggcttggtgt taaggaatga aaagtgcaat 1920gagaactata ccactgactt cattttcaac ctgtactctg aggaggggaa gggcatcttc 1980gacagcagga agaatgtgct tggtcacatg cagcagggtg ggagcccaac cccatttgat 2040aggaattttg ccactaagat gggcgccaag gctatgaact ggatgtctgg gaaaatcaaa 2100gagagttacc gtaatgggcg gatctttgcc aatactccag attcgggctg tgttctgggg 2160atgcgtaaga gggctctggt cttccaacca gtggctgagc tgaaggacca gacagatttt 2220gagcatcgaa tccccaagga acagtggtgg ctgaaactga ggcccatcct caaaatccta 2280gccaagtacg agattgactt ggacacttca gaccatgccc acctggagca catcacccgg 2340aagcggtccg gggaagctgc cgtctaaacc tctctggagt gaggggaata gattacctga 2400tcatggtcag ctcacaccct aataagtcca catcttctca gtgttttagc tgtttttttc 2460attaggtttc cttttattct gtaccttgca gccatgacca gttctggcca ggagctggag 2520gagcaggcag tgggtgggag ctccttttag gtagaattta acatgacttc tgccccagct 2580ttatctgtca cacaaggctg ggcacctcta gtgctactgc tagatatcac ttactcagtt 2640agaattttcc taaaaataag ctttatttat ttctttgtga 
taacaaagag tcttggttcc 2700tctactactt ttactacagt gacaaattgt aactacacta ataaatgcca actggtcact 2760gtgaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2820a 2821332821DNAHomo sapiens 33ggcacgaggc taaaagagtg gatcatgacc catgaagagc accatgcagc caaaaccctg 60gggattggca aagccattgc tgtcttaacc tctggtggag atgcccaagg tatgaatgct 120gctgtcaggg ctgtggttcg agttggtatc ttcaccggtg cccgtgtctt ctttgtccat 180gagggttatc aaggcctggt ggatggtgga gatcacatca aggaagccac ctgggagagc 240gtttcgatga tgcttcagct gggaggcacg gtgattggaa gtgcccggtg caaggacttt 300cgggaacgag aaggacgact ccgagctgcc tacaacctgg tgaagcgtgg gatcaccaat 360ctctgtgtca ttgggggtga tggcagcctc actggggctg acaccttccg ttctgagtgg 420agtgacttgt tgagtgacct ccagaaagca ggtaagatca cagatgagga ggctacgaag 480tccagctacc tgaacattgt gggcctggtt gggtcaattg acaatgactt ctgtggcacc 540gatatgacca ttggcactga ctctgccctg catcggatca tggaaattgt agatgccatc 600actaccactg cccagagcca ccagaggaca tttgtgttag aagtaatggg ccgccactgt 660ggatacctgg cccttgtcac ctctctgtcc tgtggggccg actgggtttt tattcctgaa 720tgtccaccag atgacgactg ggaggaacac ctttgtcgcc gactcagcga gacaaggacc 780cgtggttctc gtctcaacat catcattgtg gctgagggtg caattgacaa gaatggaaaa 840ccaatcacct cagaagacat caagaatctg gtggttaagc gtctgggata tgacacccgg 900gttactgtct tggggcatgt gcagaggggt gggacgccat cagcctttga cagaattctg 960ggcagcagga tgggtgtgga agcagtgatg gcacttttgg aggggacccc agatacccca 1020gcctgtgtag tgagcctctc tggtaaccag gctgtgcgcc tgcccctcat ggaatgtgtc
1080caggtgacca aagatgtgac caaggccatg gatgagaaga aatttgacga agccctgaag 1140ctgagaggcc ggagcttcat gaacaactgg gaggtgtaca agcttctagc tcatgtcaga 1200cccccggtat ctaagagtgg ttcgcacaca gtggctgtga tgaacgtggg ggctccggct 1260gcaggcatga atgctgctgt tcgctccact gtgaggattg gccttatcca gggcaaccga 1320gtgctcgttg tccatgatgg tttcgagggc ctggccaagg ggcagataga ggaagctggc 1380tggagctatg ttgggggctg gactggccaa ggtggctcta aacttgggac taaaaggact 1440ctacccaaga agagctttga acagatcagt gccaatataa ctaagtttaa cattcagggc 1500cttgtcatca ttgggggctt tgaggcttac acagggggcc tggaactgat ggagggcagg 1560aagcagtttg atgagctctg catcccattt gtggtcattc ctgctacagt ctccaacaat 1620gtccctggct cagacttcag cgttggggct gacacagcac tcaatactat ctgcacaacc 1680tgtgaccgca tcaagcagtc agcagctggc accaagcgtc gggtgtttat cattgagact 1740atgggtggct actgtggcta cctggctacc atggctggac tggcagctgg ggccgatgct 1800gcctacattt ttgaggagcc cttcaccatt cgagacctgc aggcaaatgt tgaacatctg 1860gtgcaaaaga tgaaaacaac tgtgaaaagg ggcttggtgt taaggaatga aaagtgcaat 1920gagaactata ccactgactt cattttcaac ctgtactctg aggaggggaa gggcatcttc 1980gacagcagga agaatgtgct tggtcacatg cagcagggtg ggagcccaac cccatttgat 2040aggaattttg ccactaagat gggcgccaag gctatgaact ggatgtctgg gaaaatcaaa 2100gagagttacc gtaatgggcg gatctttgcc aatactccag attcgggctg tgttctgggg 2160atgcgtaaga gggctctggt cttccaacca gtggctgagc tgaaggacca gacagatttt 2220gagcatcgaa tccccaagga acagtggtgg ctgaaactga ggcccatcct caaaatccta 2280gccaagtacg agattgactt ggacacttca gaccatgccc acctggagca catcacccgg 2340aagcggtccg gggaagctgc cgtctaaacc tctctggagt gaggggaata gattacctga 2400tcatggtcag ctcacaccct aataagtcca catcttctca gtgttttagc tgtttttttc 2460attaggtttc cttttattct gtaccttgca gccatgacca gttctggcca ggagctggag 2520gagcaggcag tgggtgggag ctccttttag gtagaattta acatgacttc tgccccagct 2580ttatctgtca cacaaggctg ggcacctcta gtgctactgc tagatatcac ttactcagtt 2640agaattttcc taaaaataag ctttatttat ttctttgtga taacaaagag tcttggttcc 2700tctactactt ttactacagt gacaaattgt aactacacta ataaatgcca actggtcact 2760gtgaaaaaaa aaaaaaaaaa aaaaaaaaaa 
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2820a 2821342821DNAHomo sapiens 34ggcacgaggc taaaagagtg gatcatgacc catgaagagc accatgcagc caaaaccctg 60gggattggca aagccattgc tgtcttaacc tctggtggag atgcccaagg tatgaatgct 120gctgtcaggg ctgtggttcg agttggtatc ttcaccggtg cccgtgtctt ctttgtccat 180gagggttatc aaggcctggt ggatggtgga gatcacatca aggaagccac ctgggagagc 240gtttcgatga tgcttcagct gggaggcacg gtgattggaa gtgcccggtg caaggacttt 300cgggaacgag aaggacgact ccgagctgcc tacaacctgg tgaagcgtgg gatcaccaat 360ctctgtgtca ttgggggtga tggcagcctc actggggctg acaccttccg ttctgagtgg 420agtgacttgt tgagtgacct ccagaaagca ggtaagatca cagatgagga ggctacgaag 480tccagctacc tgaacattgt gggcctggtt gggtcaattg acaatgactt ctgtggcacc 540gatatgacca ttggcactga ctctgccctg catcggatca tggaaattgt agatgccatc 600actaccactg cccagagcca ccagaggaca tttgtgttag aagtaatggg ccgccactgt 660ggatacctgg cccttgtcac ctctctgtcc tgtggggccg actgggtttt tattcctgaa 720tgtccaccag atgacgactg ggaggaacac ctttgtcgcc gactcagcga gacaaggacc 780cgtggttctc gtctcaacat catcattgtg gctgagggtg caattgacaa gaatggaaaa 840ccaatcacct cagaagacat caagaatctg gtggttaagc gtctgggata tgacacccgg 900gttactgtct tggggcatgt gcagaggggt gggacgccat cagcctttga cagaattctg 960ggcagcagga tgggtgtgga agcagtgatg gcacttttgg aggggacccc agatacccca 1020gcctgtgtag tgagcctctc tggtaaccag gctgtgcgcc tgcccctcat ggaatgtgtc 1080caggtgacca aagatgtgac caaggccatg gatgagaaga aatttgacga agccctgaag 1140ctgagaggcc ggagcttcat gaacaactgg gaggtgtaca agcttctagc tcatgtcaga 1200cccccggtat ctaagagtgg ttcgcacaca gtggctgtga tgaacgtggg ggctccggct 1260gcaggcatga atgctgctgt tcgctccact gtgaggattg gccttatcca gggcaaccga 1320gtgctcgttg tccatgatgg tttcgagggc ctggccaagg ggcagataga ggaagctggc 1380tggagctatg ttgggggctg gactggccaa ggtggctcta aacttgggac taaaaggact 1440ctacccaaga agagctttga acagatcagt gccaatataa ctaagtttaa cattcagggc 1500cttgtcatca ttgggggctt tgaggcttac acagggggcc tggaactgat ggagggcagg 1560aagcagtttg atgagctctg catcccattt gtggtcattc ctgctacagt ctccaacaat 1620gtccctggct cagacttcag cgttggggct gacacagcac tcaatactat 
ctgcacaacc 1680tgtgaccgca tcaagcagtc agcagctggc accaagcgtc gggtgtttat cattgagact 1740atgggtggct actgtggcta cctggctacc atggctggac tggcagctgg ggccgatgct 1800gcctacattt ttgaggagcc cttcaccatt cgagacctgc aggcaaatgt tgaacatctg 1860gtgcaaaaga tgaaaacaac tgtgaaaagg ggcttggtgt taaggaatga aaagtgcaat 1920gagaactata ccactgactt cattttcaac ctgtactctg aggaggggaa gggcatcttc 1980gacagcagga agaatgtgct tggtcacatg cagcagggtg ggagcccaac cccatttgat 2040aggaattttg ccactaagat gggcgccaag gctatgaact ggatgtctgg gaaaatcaaa 2100gagagttacc gtaatgggcg gatctttgcc aatactccag attcgggctg tgttctgggg 2160atgcgtaaga gggctctggt cttccaacca gtggctgagc tgaaggacca gacagatttt 2220gagcatcgaa tccccaagga acagtggtgg ctgaaactga ggcccatcct caaaatccta 2280gccaagtacg agattgactt ggacacttca gaccatgccc acctggagca catcacccgg 2340aagcggtccg gggaagctgc cgtctaaacc tctctggagt gaggggaata gattacctga 2400tcatggtcag ctcacaccct aataagtcca catcttctca gtgttttagc tgtttttttc 2460attaggtttc cttttattct gtaccttgca gccatgacca gttctggcca ggagctggag 2520gagcaggcag tgggtgggag ctccttttag gtagaattta acatgacttc tgccccagct 2580ttatctgtca cacaaggctg ggcacctcta gtgctactgc tagatatcac ttactcagtt 2640agaattttcc taaaaataag ctttatttat ttctttgtga taacaaagag tcttggttcc 2700tctactactt ttactacagt gacaaattgt aactacacta ataaatgcca actggtcact 2760gtgaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2820a 2821352819DNAHomo sapiens 35ggcacgaggc agcggcggag gagagctaag actaaaagag tggatcatga cccatgaaga 60gcaccatgca gccaaaaccc tggggattgg caaagccatt gctgtcttaa cctctggtgg 120agatgcccaa ggtatgaatg ctgctgtcag ggctgtggtt cgagttggta tcttcaccgg 180tgcccgtgtc ttctttgtcc atgagggtta tcaaggcctg gtggatggtg gagatcacat 240caaggaagcc acctgggaga gcgtttcgat gatgcttcag ctgggaggca cggtgattgg 300aagtgcccgg tgcaaggact ttcgggaacg agaaggacga ctccgagctg cctacaacct 360ggtgaagcgt gggatcacca atctctgtgt cattgggggt gatggcagcc tcactggggc 420tgacaccttc cgttctgagt ggagtgactt gttgagtgac ctccagaaag caggtaagat 480cacagatgag gaggctacga agtccagcta cctgaacatt gtgggcctgg ttgggtcaat 
540tgacaatgac ttctgtggca ccgatatgac cattggcact gactctgccc tgcatcggat 600catggaaatt gtagatgcca tcactaccac tgcccagagc caccagagga catttgtgtt 660agaagtaatg ggccgccact gtggatacct ggcccttgtc acctctctgt cctgtggggc 720cgactgggtt tttattcctg aatgtccacc agatgacgac tgggaggaac acctttgtcg 780ccgactcagc gagacaagga cccgtggttc tcgtctcaac atcatcattg tggctgaggg 840tgcaattgac aagaatggaa aaccaatcac ctcagaagac atcaagaatc tggtggttaa 900gcgtctggga tatgacaccc gggttactgt cttggggcat gtgcagaggg gtgggacgcc 960atcagccttt gacagaattc tgggcagcag gatgggtgtg gaagcagtga tggcactttt 1020ggaggggacc ccagataccc cagcctgtgt agtgagcctc tctggtaacc aggctgtgcg 1080cctgcccctc atggaatgtg tccaggtgac caaagatgtg accaaggcca tggatgagaa 1140gaaatttgac gaagccctga agctgagagg ccggagcttc atgaacaact gggaggtgta 1200caagcttcta gctcatgtca gacccccggt atctaagagt ggttcgcaca cagtggctgt 1260gatgaacgtg ggggctccgg ctgcaggcat gaatgctgct gttcgctcca ctgtgaggat 1320tggccttatc cagggcaacc gagtgctcgt tgtccatgat ggtttcgagg gcctggccaa 1380ggggcagata gaggaagctg gctggagcta tgttgggggc tggactggcc aaggtggctc 1440taaacttggg actaaaagga ctctacccaa gaagagcttt gaacagatca gtgccaatat 1500aactaagttt aacattcagg gccttgtcat cattgggggc tttgaggctt acacaggggg 1560cctggaactg atggagggca ggaagcagtt tgatgagctc tgcatcccat ttgtggtcat 1620tcctgctaca gtctccaaca atgtccctgg ctcagacttc agcgttgggg ctgacacagc 1680actcaatact atctgcacaa cctgtgaccg catcaagcag tcagcagctg gcaccaagcg 1740tcgggtgttt atcattgaga ctatgggtgg ctactgtggc tacctggcta ccatggctgg 1800actggcagct ggggccgatg ctgcctacat ttttgaggag cccttcacca ttcgagacct 1860gcaggcaaat gttgaacatc tggtgcaaaa gatgaaaaca actgtgaaaa ggggcttggt 1920gttaaggaat gaaaagtgca atgagaacta taccactgac ttcattttca acctgtactc 1980tgaggagggg aagggcatct tcgacagcag gaagaatgtg cttggtcaca tgcagcaggg 2040tgggagccca accccatttg ataggaattt tgccactaag atgggcgcca aggctatgaa 2100ctggatgtct gggaaaatca aagagagtta ccgtaatggg cggatctttg ccaatactcc 2160agattcgggc tgtgttctgg ggatgcgtaa gagggctctg gtcttccaac cagtggctga 2220gctgaaggac cagacagatt ttgagcatcg aatccccaag 
gaacagtggt ggctgaaact 2280gaggcccatc ctcaaaatcc tagccaagta cgagattgac ttggacactt cagaccatgc 2340ccacctggag cacatcaccc ggaagcggtc cggggaagcg gccgtctaaa cctctctgga 2400gtgaggggaa tagattacct gatcatggtc agctcacacc ctaataagtc cacatcttct 2460cagtgtttta gctgtttttt tcattaggtt tccttttatt ctgtaccttg cagccatgac 2520cagttctggc caggagctgg aggagcaggc agtgggtggg agctcctttt aggtagaatt 2580taacatgact tctgccccag ctttatctgt cacacaaggc tgggcacctc tagtgctact 2640gctagatatc acttactcag ttagaatttt cctaaaaata agctttattt atttctttgt 2700gataacaaag agtcttggtt cctctactac ttttactaca gtgacaaatt gtaactacac 2760taataaatgc caactggtca ctgtgaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaa 281936751DNAHomo sapiens 36ttcactgcca gaatcccgcc cccatccctc cccactgcag gcttgagaag tgggcctcac 60agacacgacc cttacctctt gacactttca ttccaaatag ctagcttttc tgtactttct 120ctgagaaagg taatgtaaat gtaaaagaag tttttgaggc tattctatag aagatgtcat 180aattcaatgt tcctcaaagg cagggactat atttattttt ttaaacctct cggcacctaa 240tgcaaaatct agcacacaga ctcccaagaa atgtttgctt aaagacggat agagggccac 300aggggcaggt tttattagca gttgctgaag atgttggaaa tgcaaggtgt ctcaggcagt 360ctggactagg aaaaagttct gtgttgaaaa caagagaagc aacgaaataa tttattattc 420ccattaaata aagatgctga tggctctcga gatagatatg aggaccccag gactctgggg 480agtgaaggaa ctaagccgct gccatatctg agctgagtac ttagggggag gaggaagagg 540aggagaaagg caagcaggag gaggcggact tcttgtcagc atctgttagt ggaggttggg 600aagcctctcc tccttccccc tccctctttg cctccacctg gctcctcccc atgttcgtcc 660atcacccctc ccccctttcc caaggacaat ctgcaagaaa gcagcggcgg aggagagcta 720agactaaaag gcaagagggg ccattgagtg a 75137435DNAHomo sapiens 37ctttttctcc tccctttctc atttctattc agtcaacttc tcttttccct gaccttagtt 60tattccccaa ataggttcca atcggggtgg gagggatcag ggaggaatta ggaccttagt 120agtcttgggc ttgattacat gacatttcag cttgtcagtc tacaagggtg tggctttcct 180ctggaagaag tccaaagctc tcaggctgca aagctcagac ttggtatagt gggagagcct 240gactgaggtg gctctagcca gtctaattgc cgttccttta gctagtggca tcttgattcc 300tgctgtgtct taactgacca ttgtcttaaa ttctagagtg gatcatgacc catgaagagc 360accatgcagc 
caaaaccctg gggattggca aagccattgc tgtcttaacc tctggtggag 420atgcccaagg taagg 4353895DNAHomo sapiens 38tgtccctcct ttcaggtatg aatgctgctg tcagggctgt ggttcgagtt ggtatcttca 60ccggtgcccg tgtcttcttt gtccatgagg ttggt 953999DNAHomo sapiens 39taatgtgtca cacagggtta tcaaggcctg gtggatggtg gagatcacat caaggaagcc 60acctgggaga gcgtttcgat gatgcttcag ctggtatgt 9940211DNAHomo sapiens 40ctacctcttc caaagggagg cacggtgatt ggaagtgccc ggtgcaagga ctttcgggaa 60cgagaaggac gactccgagc tgcctacaac ctggtgaagc gtgggatcac caatctctgt 120gtcattgggg gtgatggcag cctcactggg gctgacacct tccgttctga gtggagtgac 180ttgttgagtg acctccagaa agcaggtaag a 21141187DNAHomo sapiens 41gcttctcatt gtcaggtaag atcacagatg aggaggctac gaagtccagc tacctgaaca 60ttgtgggcct ggttgggtca attgacaatg acttctgtgg cactgatatg accattggca 120ctgactctgc cctgcatcgg atcatggaaa ttgtagatgc catcactacc actgcccaga 180ggtaagg 1874266DNAHomo sapiens 42gtctatctct tgcagccacc agaggacatt tgtgttagaa gtaatgggcc gccactgtgg 60gtaaga 6643129DNAHomo sapiens 43gactctcatc tcagatacct ggcccttgtc acctctctgt cctgtggggc cgactgggtt 60tttattcctg aatgtccacc agatgacgac tgggaggaac acctttgtcg ccgactcagc 120gaggtactt 12944117DNAHomo sapiens 44tttgttctca accagacaag gacccgtggt tctcgtctca acatcatcat tgtggctgag 60ggtgcaattg acaagaatgg aaaaccaatc acctcagaag acatcaagaa tgttcgt 11745114DNAHomo sapiens 45tgttggtccc ttcagctggt ggttaagcgt ctgggatatg acacccgggt tactgtcttg 60gggcatgtgc agaggggtgg gacgccatca gcctttgaca gaattctggt aagt 11446147DNAHomo sapiens 46tctgggctcc tgcagggcag caggatgggt gtggaagcag tgatggcact tttggagggg 60accccagata ccccagcctg tgtagtgagc ctctctggta accaggctgt gcgcctgccc 120ctcatggaat gtgtccaggt ggtaagt 1474786DNAHomo sapiens 47gtgctccccc ctcagaccaa agatgtgacc aaggccatgg atgagaagaa atttgacgaa 60gccctgaagc tgagaggccg gtgagg 864885DNAHomo sapiens 48ttgtacttcc tacaggagct tcatgaacaa ctgggaggtg tacaagcttc tagctcatgt 60cagacccccg gtatctaagg tactg 8549171DNAHomo sapiens 49cttcctcctg tatagagtgg ttcgcacaca gtggctgtga tgaacgtggg ggctccggct 60gcaggcatga atgctgctgt tcgctccact 
gtgaggattg gccttatcca gggcaaccga 120gtgctcgttg tccatgatgg tttcgagggc ctggccaagg ggcaggtatg g 1715092DNAHomo sapiens 50ggcttatccc cacagataga ggaagctggc tggagctatg ttgggggctg gactggccaa 60ggtggctcta aacttgggac taaaaggtaa gt 9251109DNAHomo sapiens 51tcactgatca actaggactc tacccaagaa gagctttgaa cagatcagtg ccaatataac 60taagtttaac attcagggcc ttgtcatcat tgggggcttt gaggtgagt 10952174DNAHomo sapiens 52ctctcttctt cttaggctta cacagggggc ctggaactga tggagggcag gaagcagttt 60gatgagctct gcatcccatt tgtggtcatt cctgctacag tctccaacaa tgtccctggc 120tcagacttca gcgttggggc tgacacagca ctcaatacta tctgcacagt gaga 17453186DNAHomo sapiens 53ccattgtcct tgcagacctg tgaccgcatc aagcagtcag cagctggcac caagcgtcgg 60gtgtttatca ttgagactat gggtggctac tgtggctacc tggctaccat ggctggactg 120gcagctgggg ccgatgctgc ctacattttt gaggagccct tcaccattcg agacctgcag 180gtagct 1865483DNAHomo sapiens 54ctctttcatt ttcaggcaaa tgttgaacat ctggtgcaaa agatgaaaac aactgtgaaa 60aggggcttgg tgttaaggta cct 8355133DNAHomo sapiens 55ttctccacct ggcaggaatg aaaagtgcaa tgagaactat accactgact tcattttcaa 60cctgtactct gaggagggga agggcatctt cgacagcagg aagaatgtgc ttggtcacat 120gcagcaggta ggg 13356121DNAHomo sapiens 56tgtttctttc tccagggtgg gagcccaacc ccatttgata ggaattttgc cactaagatg 60ggcgccaagg ctatgaactg gatgtctggg aaaatcaaag agagttaccg taatggtagg 120t 12157127DNAHomo sapiens 57catccctcat tgcagggcgg atctttgcca atactccaga ttcgggctgt gttctgggga 60tgcgtaagag ggctctggtc ttccaaccag tggctgagct gaaggaccag acagattttg 120agtgagt 12758556DNAHomo sapiens 58ctcattcctc tgtaggcatc gaatccccaa ggaacagtgg tggctgaaac tgaggcccat 60cctcaaaatc ctagccaagt acgagattga cttggacact tcagaccatg cccacctgga 120gcacatcacc cggaagcggt ccggggaagc tgccgtctaa acctctctgg agtgagggga 180atagattacc tgatcatggt cagctcacac cctaataagt ccacatcttc tcagtgtttt 240agctgttttt ttcattaggt ttccttttat tctgtacctt gcagccatga ccagttctgg 300ccaggagctg gaggagcagg cagtgggtgg gagctccttt taggtagaat ttaacatgac 360ttctgcccca gctttatctg tcacacaagg ctgggcacct ctagtgctac tgctagatat 420cacttactca gttagaattt 
tcctaaaaat aagctttatt tatttctttg tgataacaaa 480gagtcttggt tcctctacta cttttactac agtgacaaat tgtaactaca ctaataaatg 540ccaactggtc actgtg 556592661DNAHomo sapiens 59ccatgaagag caccatgcag ccaaaaccct ggggattggc aaagccattg ctgtcttaac 60ctctggtgga gatgcccaag gtatgaatgc tgctgtcagg gctgtggttc gagttggtat 120cttcaccggt gcccgtgtct tctttgtcca tgagggttat caaggcctgg tggatggtgg 180agatcacatc aaggaagcca cctgggagag cgtttcgatg atgcttcagc tgggaggcac 240ggtgattgga agtgcccggt gcaaggactt tcgggaacga gaaggacgac tccgagctgc 300ctacaacctg gtgaagcgtg ggatcaccaa tctctgtgtc attgggggtg atggcagcct 360cactggggct gacaccttcc gttctgagtg gagtgacttg ttgagtgacc tccagaaagc 420aggtaagatc acagatgagg aggctacgaa gtccagctac ctgaacattg tgggcctggt 480tgggtcaatt gacaatgact tctgtggcac cgatatgacc attggcactg actctgccct 540gcatcggatc atggaaattg tagatgccat cactaccact gcccagagcc accagaggac 600atttgtgtta gaagtaatgg gccgccactg tggatacctg gcccttgtca cctctctgtc 660ctgtggggcc gactgggttt ttattcctga atgtccacca gatgacgact gggaggaaca 720cctttgtcgc cgactcagcg agacaaggac ccgtggttct cgtctcaaca tcatcattgt 780ggctgagggt gcaattgaca agaatggaaa accaatcacc tcagaagaca tcaagaatgg 840cagcaggatg ggtgtggaag cagtgatggc acttttggag gggaccccag ataccccagc 900ctgtgtagtg agcctctctg gtaaccaggc tgtgcgcctg cccctcatgg aatgtgtcca 960ggtgaccaaa gatgtgacca aggccatgga tgagaagaaa tttgacgaag ccctgaagct 1020gagaggccgg agcttcatga acaactggga ggtgtacaag cttctagctc atgtcagacc 1080cccggtatct aagagtggtt cgcacacagt ggctgtgatg aacgtggggg ctccggctgc 1140aggcatgaat gctgctgttc gctccactgt gaggattggc cttatccagg gcaaccgagt 1200gctcgttgtc catgatggtt tcgagggcct ggccaagggg cagatagagg aagctggctg 1260gagctatgtt gggggctgga ctggccaagg tggctctaaa cttgggacta aaaggactct 1320acccaagaag agctttgaac agatcagtgc caatataact aagtttaaca ttcagggcct 1380tgtcatcatt gggggctttg aggcttacac agggggcctg gaactgatgg agggcaggaa 1440gcagtttgat gagctctgca tcccatttgt ggtcattcct gctacagtct ccaacaatgt 1500ccctggctca gacttcagcg ttggggctga cacagcactc aatactatct gcacaacctg 1560tgaccgcatc aagcagtcag cagctggcac 
caagcgtcgg gtgtttatca ttgagactat 1620gggtggctac tgtggctacc tggctaccat ggctggactg gcagctgggg ccgatgctgc 1680ctacattttt gaggagccct tcaccattcg agacctgcag gcaaatgttg aacatctggt 1740gcaaaagatg aaaacaactg tgaaaagggg cttggtgtta aggaatgaaa agtgcaatga 1800gaactatacc actgacttca ttttcaacct gtactctgag gaggggaagg gcatcttcga 1860cagcaggaag aatgtgcttg gtcacatgca gcagggtggg agcccaaccc catttgatag 1920gaattttgcc actaagatgg gcgccaaggc tatgaactgg atgtctggga aaatcaaaga 1980gagttaccgt aatgggcgga
tctttgccaa tactccagat tcgggctgtg ttctggggat 2040gcgtaagagg gctctggtct tccaaccagt ggctgagctg aaggaccaga cagattttga 2100gcatcgaatc cccaaggaac agtggtggct gaaactgagg cccatcctca aaatcctagc 2160caagtacgag attgacttgg acacttcaga ccatgcccac ctggagcaca tcacccggaa 2220gcggtccggg gaagcggccg tctaaacctc tctggagtga ggggaataga ttacctgatc 2280atggtcagct cacaccctaa taagtccaca tcttctcagt gttttagctg tttttttcat 2340taggtttcct tttattctgt accttgcagc catgaccagt tctggccagg agctggagga 2400gcaggcagtg ggtgggagct ccttttaggt agaatttaac atgacttctg ccccagcttt 2460atctgtcaca caaggctggg cacctctagt gctactgcta gatatcactt actcagttag 2520aattttccta aaaataagct ttatttattt ctttgtgata acaaagagtc ttggttcctc 2580tactactttt actacagtga caaattgtaa ctacactaat aaatgccaac tggtcactgt 2640gaaaaaaaaa aaaaaaaaaa a 2661602628DNAHomo sapiens 60gcacccggac gtgcggctcc cctcggcctc ctcgccatgg acgcggacga ctcccgggcc 60cccaagggct ccttgcggaa gttcctggag cacctctccg gggccggcaa ggccatcggc 120gtgctgacca gcggcgggga tgctcaaggt atgaacgctg ccgtccgtgc cgtggtgcgc 180atgggtatct acgtgggggc caaggtgtac ttcatctacg agggctacca gggcatggtg 240gacggaggct caaacatcgc agaggccgac tgggagagtg tctccagcat cctgcaagtg 300ggcgggacga tcattggcag tgcgcggtgc caggccttcc gcacgcggga aggccgcctg 360aaggctgctt gcaacctgct gcagcgcggc atcaccaacc tgtgtgtgat cggcggggac 420gggagcctca ccggggccaa cctcttccgg aaggagtgga gtgggctgct ggaggagctg 480gccaggaacg gccagatcga taaggaggcc gtgcagaagt acgcctacct caacgtggtg 540ggcatggtgg gctccatcga caatgatttc tgcggcaccg acatgaccat cggcacggac 600tccgccctgc acaggatcat cgaggtcgtc gacgccatca tgaccacggc ccagagccac 660cagaggacct tcgttctgga ggtgatggga cgacactgtg ggtacctggc cctggtgagt 720gccttggcct gcggtgcgga ctgggtgttc cttccagaat ctccaccaga ggaaggctgg 780gaggagcaga tgtgtgtcaa actctcggag aaccgtgccc ggaaaaaaag gctgaatatt 840attattgtgg ctgaaggagc aattgatacc caaaataaac ccatcacctc tgagaaaatc 900aaagagcttg tcgtcacgca gctgggctat gacacacgtg tgaccatcct cgggcacgtg 960cagagaggag ggaccccttc ggcattcgac aggatcttgg ccagccgcat gggagtggag 1020gcagtcatcg ccttgctaga 
ggccaccccg gacaccccag cttgcgtcgt gtcactgaac 1080gggaaccacg ccgtgcgcct gccgctgatg gagtgcgtgc agatgactca ggatgtgcag 1140aaggcgatgg acgagaggag atttcaagat gcggttcgac tccgagggag gagctttgcg 1200ggcaacctga acacctacaa gcgacttgcc atcaagctgc cggatgatca gatcccaaag 1260accaattgca acgtagctgt catcaacgtg ggggcacccg cggctgggat gaacgcagcc 1320gtacgctcag ctgtgcgcgt gggcattgcc gacggccaca ggatgctcgc catctatgat 1380ggctttgacg gcttcgccaa gggccagatc aaagaaatcg gctggacaga tgtcgggggc 1440tggaccggcc aaggaggctc cattcttggg acaaaacgcg ttctcccggg gaagtacttg 1500gaagagatcg ccacacagat gcgcacgcac agcatcaacg cgctgctgat catcggtgga 1560ttcgaggcct acctgggact cctggagctg tcagccgccc gggagaagca cgaggagttc 1620tgtgtcccca tggtcatggt tcccgctact gtgtccaaca atgtgccggg ttccgatttc 1680agcatcgggg cagacaccgc cctgaacact atcaccgaca cctgcgaccg catcaagcag 1740tccgccagcg gaaccaagcg gcgcgtgttc atcatcgaga ccatgggcgg ctactgtggc 1800tacctggcca acatgggggg gctcgcggcc ggagctgatg ccgcatacat tttcgaagag 1860cccttcgaca tcagggatct gcagtccaac gtggagcacc tgacggagaa aatgaagacc 1920accatccaga gaggccttgt gctcagaaat gagagctgca gtgaaaacta caccaccgac 1980ttcatttacc agctgtattc agaagagggc aaaggcgtgt ttgactgcag gaagaacgtg 2040ctgggtcaca tgcagcaggg tggggcaccc tctccatttg atagaaactt tggaaccaaa 2100atctctgcca gagctatgga gtggatcact gcaaaactca aggaggcccg gggcagagga 2160aaaaaattta ccaccgatga ttccatttgt gtgctgggaa taagcaaaag aaacgttatt 2220tttcaacctg tggcagagct gaagaagcaa acggattttg agcacaggat tcccaaagaa 2280cagtggtggc tcaagctacg gcccctcatg aaaatcctgg ccaagtacaa ggccagctat 2340gacgtgtcgg actcaggcca gctggaacat gtgcagccct ggagtgtctg acccagtccc 2400gcctgcatgt gcctgcagcc accgtggact gtctgttttt gtaacactta agttatttta 2460tcagcacttt atgcacgtat tattgacatt aatacctaat cggcgagtgc ccatctgccc 2520cacctgctcc agtgcgtgct gtctgtggag tgtgtctcat gctttcagat gtgcatatga 2580gcagaattaa ttaaacattt gcctatgact ccaacaaaaa aaaaaaaa 2628612591DNAHomo sapiens 61cccggacgtg cggctcccct cggcctcctc gccatggacg cggacgactc ccgggccccc 60aagggctcct tgcggaagtt cctggagcac ctctccgggg 
ccggcaaggc catcggcgtg 120ctgaccagcg gcggggatgc tcaaggtatg aacgctgccg tccgtgccgt ggtgcgcatg 180ggtatctacg tgggggccaa ggtgtacttc atctacgagg gctaccaggg catggtggac 240ggaggctcaa acatcgcaga ggccgactgg gagagtgtct ccagcatcct gcaagtgggc 300gggacgatca ttggcagtgc gcggtgccag gccttccgca cgcgggaagg ccgcctgaag 360gctgcttgca acctgctgca gcgcggcatc accaacctgt gtgtgatcgg cggggacggg 420agcctcaccg gggccaacct cttccggaag gagtggagtg ggctgctgga ggagctggcc 480aggaacggcc agatcgataa ggaggccgtg cagaagtacg cctacctcaa cgtggtgggc 540atggtgggct ccatcgacaa tgatttctgc ggcaccgaca tgaccatcgg cacggactcc 600gccctgcaca ggatcatcga ggtcgtcgac gccatcatga ccacggccca gagccaccag 660aggaccttcg ttctggaggt gatgggacga cactgtgggt acctggccct ggtgagtgcc 720ttggcctgcg gtgcggactg ggtgttcctt ccagaatctc caccagagga aggctgggag 780gagcagatgt gtgtcaaact ctcggagaac cgtgcccgga aaaaaaggct gaatattatt 840attgtggctg aaggagcaat tgatacccaa aataaaccca tcacctctga gaaaatcaaa 900gagcttgtcg tcacgcagct gggctatgac acacgtgtga ccatcctcgg gcacgtgcag 960agaggaggga ccccttcggc attcgacagg atcttggcca gccgcatggg agtggaggca 1020gtcatcgcct tgctagaggc caccccggac accccagctt gcgtcgtgtc actgaacggg 1080aaccacgccg tgcgcctgcc gctgatggag tgcgtgcaga tgactcagga tgtgcagaag 1140gcgatggacg agaggagatt tcaagatgcg gttcgactcc gagggaggag ctttgcgggc 1200aacctgaaca cctacaagcg acttgccatc aagctgccgg atgatcagat cccaaagacc 1260aattgcaacg tagctgtcat caacgtgggg gcacccgcgg ctgggatgaa cgcggccgta 1320cgctcagctg tgcgcgtggg cattgccgac ggccacagga tgctcgccat ctatgatggc 1380tttgacggct tcgccaaggg ccagatcaaa gaaatcggct ggacagatgt cgggggctgg 1440accggccaag gaggctccat tcttgggaca aaacgcgttc tcccggggaa gtacttggaa 1500gagatcgcca cacagatgcg cacgcacagc atcaacgcgc tgctgatcat cggtggattc 1560gaggcctacc tgggactcct ggagctgtca gccgcccggg agaagcacga ggagttctgt 1620gtccccatgg tcatggttcc cgctactgtg tccaacaatg tgccgggttc cgatttcagc 1680atcggggcag acaccgccct gaacactatc accgacacct gcgaccgcat caagcagtcc 1740gccagcggaa ccaagcggcg cgtgttcatc atcgagacca tgggcggcta ctgtggctac 1800ctggccaaca tgggggggct 
cgcggccgga gctgatgccg catacatttt cgaagagccc 1860ttcgacatca gggatctgca gtccaacgtg gagcacctga cggagaaaat gaagaccacc 1920atccagagag gccttgtgct cagaaatgag agctgcagtg aaaactacac caccgacttc 1980atttaccagc tgtattcaga agagggcaaa ggcgtgtttg actgcaggaa gaacgtgctg 2040ggtcacatgc agcagggtgg ggcaccctct ccatttgata gaaactttgg aaccaaaatc 2100tctgccagag ctatggagtg gatcactgca aaactcaagg aggcccgggg cagaggaaaa 2160aaatttacca ccgatgattc catttgtgtg ctgggaataa gcaaaagaaa cgttattttt 2220caacctgtgg cagagctgaa gaagcaaacg gattttgagc acaggattcc caaagaacag 2280tggtggctca agctacggcc cctcatgaaa atcctggcca agtacaaggc cagctatgac 2340gtgtcggact caggccagct ggaacatgtg cagccctgga gtgtctgacc cagtcccgcc 2400tgcatgtgcc tgcagccacc gtggactgtc tgtttttgta acacttaagt tattttatca 2460gcactttatg cacgtattat tgacattaat acctaatcgg cgagtgccca tctgccccac 2520cagctccagt gcgtgctgtc tgtggagtgt gtctcatgct ttcagatgtg catatgagca 2580gaattaatta a 2591622636DNAHomo sapiens 62gcacccggac gtgcggctcc cctcggcctc ctcgccatgg acgcggacga ctcccgggcc 60cccaagggct ccttgcggaa gttcctggag cacctctccg gggccggcaa ggccatcggc 120gtgctgacca gcggcgggga tgctcaaggt atgaacgctg ccgtccgtgc cgtggtgcgc 180atgggtatct acgtgggggc caaggtgtac ttcatctacg agggctacca gggcatggtg 240gacggaggct caaacatcgc agaggccgac tgggagagtg tctccagcat cctgcaagtg 300ggcgggacga tcattggcag tgcgcggtgc caggccttcc gcacgcggga aggccgcctg 360aaggctgctt gcaacctgct gcagcgcggc atcaccaacc tgtgtgtgat cggcggggac 420gggagcctca ccggggccaa cctcttccgg aaggagtgga gtgggctgct ggaggagctg 480gccaggaacg gccagatcga taaggaggcc gtgcagaagt acgcctacct caacgtggtg 540ggcatggtgg gctccatcga caatgatttc tgcggcaccg acatgaccat cggcacggac 600tccgccctgc acaggatcat cgaggtcgtc gacgccatca tgaccacggc ccagagccac 660cagaggacct tcgttctgga ggtgatggga cgacactgtg ggtacctggc cctggtgagt 720gccttggcct gcggtgcgga ctgggtgttc cttccagaat ctccaccaga ggaaggctgg 780gaggagcaga tgtgtgtcaa actctcggag aaccgtgccc ggaaaaaaag gctgaatatt 840attattgtgg ctgaaggagc aattgatacc caaaataaac ccatcacctc tgagaaaatc 900aaagagcttg tcgtcacgca gctgggctat 
gacacacgtg tgaccatcct cgggcacgtg 960cagagaggag ggaccccttc ggcattcgac aggatcttgg ccagccgcat gggagtggag 1020gcagtcatcg ccttgctaga ggccaccccg gacaccccag cttgcgtcgt gtcactgaac 1080gggaaccacg ccgtgcgcct gccgctgatg gagtgcgtgc agatgactca ggatgtgcag 1140aaggcgatgg acgagaggag atttcaagat gcggttcgac tccgagggag gagctttgcg 1200ggcaacctga acacctacaa gcgacttgcc atcaagctgc cggatgatca gatcccaaag 1260accaattgca acgtagctgt catcaacgtg ggggcacccg cggctgggat gaacgcggcc 1320gtacgctcag ctgtgcgcgt gggcattgcc gacggccaca ggatgctcgc catctatgat 1380ggctttgacg gcttcgccaa gggccagatc aaagaaatcg gctggacaga tgtcgggggc 1440tggaccggcc aaggaggctc cattcttggg acaaaacgcg ttctcccggg gaagtacttg 1500gaagagatcg ccacacagat gcgcacgcac agcatcaacg cgctgctgat catcggtgga 1560ttcgaggcct acctgggact cctggagctg tcagccgccc gggagaagca cgaggagttc 1620tgtgtcccca tggtcatggt tcccgctact gtgtccaaca atgtgccggg ttccgatttc 1680agcatcgggg cagacaccgc cctgaacact atcaccgaca cctgcgaccg catcaagcag 1740tccgccagcg gaaccaagcg gcgcgtgttc atcatcgaga ccatgggcgg ctactgtggc 1800tacctggcca acatgggggg gctcgcggcc ggagctgatg ccgcatacat tttcgaagag 1860cccttcgaca tcagggatct gcagtccaac gtggagcacc tgacggagaa aatgaagacc 1920accatccaga gaggccttgt gctcagaaat gagagctgca gtgaaaacta caccaccgac 1980ttcatttacc agctgtattc agaagagggc aaaggcgtgt ttgactgcag gaagaacgtg 2040ctgggtcaca tgcagcaggg tggggcaccc tctccatttg atagaaactt tggaaccaaa 2100atctctgcca gagctatgga gtggatcact gcaaaactca aggaggcccg gggcagagga 2160aaaaaattta ccaccgatga ttccatttgt gtgctgggaa taagcaaaag aaacgttatt 2220tttcaacctg tggcagagct gaagaagcaa acggattttg agcacaggat tcccaaagaa 2280cagtggtggc tcaagctacg gcccctcatg aaaatcctgg ccaagtacaa ggccagctat 2340gacgtgtcgg actcaggcca gctggaacat gtgcagccct ggagtgtctg acccagtccc 2400gcctgcatgt gcctgcagcc accgtggact gtctgttttt gtaacactta agttatttta 2460tcagcacttt atgcacgtat tattgacatt aatacctaat cggcgagtgc ccatctgccc 2520caccagctcc agtgcgtgct gtctgtggag tgtgtctcat gctttcagat gtgcatatga 2580gcagaattaa ttaaacattt gcctatgact ccaaaaaaaa aaaaaaaaaa aaaaaa 
2636632671DNAHomo sapiens 63gagtcaggcg cgcgcgggca gggtccccat tgcctgctgc gcacccggac gtgcggctcc 60cctcggcctc ctcgccatgg acgcggacga ctcccgggcc cccaagggct ccttgcggaa 120gttcctggag cacctctccg gggccggcaa ggccatcggc gtgctgacca gcggcgggga 180tgctcaaggt atgaacgctg ccgtccgtgc cgtggtgcgc atgggtatct acgtgggggc 240caaggtgtac ttcatctacg agggctacca gggcatggtg gacggaggct caaacatcgc 300agaggccgac tgggagagtg tctccagcat cctgcaagtg ggcgggacga tcattggcag 360tgcgcggtgc caggccttcc gcacgcggga aggccgcctg aaggctgctt gcaacctgct 420gcagcgcggc atcaccaacc tgtgtgtgat cggcggggac gggagcctca ccggggccaa 480cctcttccgg aaggagtgga gtgggctgct ggaggagctg gccaggaacg gccagatcga 540taaggaggcc gtgcagaaat acgcctacct caacgtggtg ggcatggtgg gctccatcga 600caatgatttc tgcggcaccg acatgaccat cggcacggac tccgccctgc acaggatcat 660cgaggtcgtc gacgccatca tgaccacggc ccagagccac cagaggacct tcgttctgga 720ggtgatggga cgacactgtg ggtacctggc cctggtgagt gccttggcct gcggtgcgga 780ctgggtgttc cttccagaat ctccaccaga ggaaggctgg gaggagcaga tgtgtgtcaa 840actctcggag aaccgtgccc ggaaaaaaag gctgaatatt attattgtgg ctgaaggagc 900aattgatacc caaaataaac ccatcacctc tgagaaaatc aaagagcttg tcgtcacgca 960gctgggctat gacacacgtg tgaccatcct cgggcacgtg cagagaggag ggaccccttc 1020ggcattcgac aggatcttgg ccagccgcat gggagtggag gcagtcatcg ccttgctaga 1080ggccaccccg gacaccccag cttgcgtcgt gtcactgaac gggaaccacg ccgtgcgcct 1140gccgctgatg gagtgcgtgc agatgactca ggatgtgcag aaggcgatgg acgagaggag 1200atttcaagat gcggttcgac tccgagggag gagctttgcg ggcaacctga acacctacaa 1260gcgacttgcc atcaagctgc cggatgatca gatcccaaag accaattgca acgtagctgt 1320catcaacgtg ggggcacccg cggctgggat gaacgcggcc gtacgctcag ctgtgcgcgt 1380gggcattgcc gacggccaca ggatgctcgc catctatgat ggctttgacg gcttcgccaa 1440gggccagatc aaagaaatcg gctggacaga tgtcgggggc tggaccggcc aaggaggctc 1500cattcttggg acaaaacgcg ttctcccggg gaagtacttg gaagagatcg ccacacagat 1560gcgcacgcac agcatcaacg cgctgctgat catcggtgga ttcgaggcct acctgggact 1620cctggagctg tcagccgccc gggagaagca cgaggagttc tgtgtcccca tggtcatggt 1680tcccgctact gtgtccaaca 
atgtgccggg ttccgatttc agcatcgggg cagacaccgc 1740cctgaacact atcaccgaca cctgcgaccg catcaagcag tccgccagcg gaaccaagcg 1800gcgcgtgttc atcatcgaga ccatgggcgg ctactgtggc tacctggcca acatgggggg 1860gctcgcggct ggagctgatg ccgcatacat tttcgaagag cccttcgaca tcagggatct 1920gcagtccaac gtggagcacc tgacggagaa aatgaagacc accatccaga gaggccttgt 1980gctcagaaat gagagctgca gtgaaaacta caccaccgac ttcatttacc agctgtattc 2040agaagagggc aaaggcgtgt ttgactgcag gaagaacgtg ctgggtcaca tgcagcaggg 2100tggggcaccc tctccatttg atagaaactt tggaaccaaa atctctgcca gagctatgga 2160gtggatcact gcaaaactca aggaggcccg gggcagagga aaaaaattta ccaccgatga 2220ttccatttgt gtgctgggaa taagcaaaag aaacgttatt tttcaacctg tggcagagct 2280gaagaagcaa acggattttg agcacaggat tcccaaagaa cagtggtggc tcaagctacg 2340gcccctcatg aaaatcctgg ccaagtacaa ggccagctat gacgtgtcgg actcaggcca 2400gctggaacat gtgcagccct ggagtgtctg acccagtccc gcctgcatgt gcctgcagcc 2460accgtggact gtctcttttt gtaacactta agttatttta tcagcacttt atgcacgtat 2520tattgacatt aatacctaat cggcgagtgc ccatctgccc caccagcccc agtgcgtgct 2580gtctgtggag tgtgtctcat gctttcagat gtgcatatga gcagaattaa ttaaacattt 2640gcctacaaaa aaaaaaaaaa aaaaaaaaaa a 267164827PRTHomo sapiens 64Met Cys Asn Gln Gly Arg Gly Arg Glu Ser Ser Arg Gly Gly Leu His 1 5 10 15 Val Gln Gly Ser Cys Arg Gly Leu Ser Arg Ser Pro Gln Gln Glu Thr 20 25 30 Gly Phe Ala Lys Ala Pro Ala Gly Thr Asp Cys Phe Phe His Cys Ser 35 40 45 Pro Gly Ser Arg Gly Gln Gly Asp Arg Lys Glu Glu Val Thr Ser Glu 50 55 60 Pro Gly Gly Thr Ser Ile Met Ser Arg Leu Gly Gly Met Asn Ala Ala 65 70 75 80 Val Arg Ala Val Thr Arg Met Gly Ile Tyr Val Gly Ala Lys Val Phe 85 90 95 Leu Ile Tyr Glu Gly Tyr Glu Gly Leu Val Glu Gly Gly Glu Asn Ile 100 105 110 Lys Gln Ala Asn Trp Leu Ser Val Ser Asn Ile Ile Gln Leu Gly Gly 115 120 125 Thr Ile Ile Gly Ser Ala Arg Cys Lys Ala Phe Thr Thr Arg Glu Gly 130 135 140 Arg Arg Ala Ala Ala Tyr Asn Leu Val Gln His Gly Ile Thr Asn Leu 145 150 155 160 Cys Val Ile Gly Gly Asp Gly Ser Leu Thr Gly Ala Asn Ile Phe Arg 165 170 175 Ser Glu Trp 
Gly Ser Leu Leu Glu Glu Leu Val Ala Glu Gly Lys Ile 180 185 190 Ser Glu Thr Thr Ala Arg Thr Tyr Ser His Leu Asn Ile Ala Gly Leu 195 200 205 Val Gly Ser Ile Asp Asn Asp Phe Cys Gly Thr Asp Met Thr Ile Gly 210 215 220 Thr Asp Ser Ala Leu His Arg Ile Met Glu Val Ile Asp Ala Ile Thr 225 230 235 240 Thr Thr Ala Gln Ser His Gln Arg Thr Phe Val Leu Glu Val Met Gly 245 250 255 Arg His Cys Gly Tyr Leu Ala Leu Val Ser Ala Leu Ala Ser Gly Ala 260 265 270 Asp Trp Leu Phe Ile Pro Glu Ala Pro Pro Glu Asp Gly Trp Glu Asn 275 280 285 Phe Met Cys Glu Arg Leu Gly Glu Thr Arg Ser Arg Gly Ser Arg Leu 290 295 300 Asn Ile Ile Ile Ile Ala Glu Gly Ala Ile Asp Arg Asn Gly Lys Pro 305 310 315 320 Ile Ser Ser Ser Tyr Val Lys Asp Leu Val Val Gln Arg Leu Gly Phe 325 330 335 Asp Thr Arg Val Thr Val Leu Gly His Val Gln Arg Gly Gly Thr Pro 340 345 350 Ser Ala Phe Asp Arg Ile Leu Ser Ser Lys Met Gly Met Glu Ala Val 355 360 365 Met Ala Leu Leu Glu Ala Thr Pro Asp Thr Pro Ala Cys Val Val Thr 370 375 380 Leu Ser Gly Asn Gln Ser Val Arg Leu Pro Leu Met Glu Cys Val Gln 385 390 395 400 Met Thr Lys Glu Val Gln Lys Ala Met Asp Asp Lys Arg Phe Asp Glu 405 410 415 Ala Thr Gln Leu Arg Gly Gly Ser Phe Glu Asn Asn Trp Asn Ile Tyr 420 425 430 Lys Leu Leu Ala His Gln Lys Pro Pro Lys Glu Lys Ser Asn Phe Ser 435 440 445 Leu Ala Ile Leu Asn Val Gly Ala Pro Ala Ala Gly Met Asn Ala Ala 450 455 460 Val Arg Ser Ala Val Arg Thr Gly Ile Ser His Gly His Thr Val Tyr 465 470 475 480 Val Val His Asp Gly Phe Glu Gly Leu Ala Lys Gly Gln Val Gln Glu 485 490 495 Val Gly Trp His Asp Val Ala Gly Trp Leu Gly Arg Gly Gly Ser Met 500 505 510 Leu Gly Thr Lys Arg Thr Leu Pro Lys Gly Gln Leu Glu Ser Ile Val 515 520 525 Glu Asn Ile Arg Ile Tyr Gly Ile His Ala Leu Leu Val Val Gly Gly 530 535 540 Phe Glu Ala Tyr Glu Gly Val Leu Gln Leu Val Glu Ala Arg Gly Arg 545 550
555 560 Tyr Glu Glu Leu Cys Ile Val Met Cys Val Ile Pro Ala Thr Ile Ser 565 570 575 Asn Asn Val Pro Gly Thr Asp Phe Ser Leu Gly Ser Asp Thr Ala Val 580 585 590 Asn Ala Ala Met Glu Ser Cys Asp Arg Ile Lys Gln Ser Ala Ser Gly 595 600 605 Thr Lys Arg Arg Val Phe Ile Val Glu Thr Met Gly Gly Tyr Cys Gly 610 615 620 Tyr Leu Ala Thr Val Thr Gly Ile Ala Val Gly Ala Asp Ala Ala Tyr 625 630 635 640 Val Phe Glu Asp Pro Phe Asn Ile His Asp Leu Lys Val Asn Val Glu 645 650 655 His Met Thr Glu Lys Met Lys Thr Asp Ile Gln Arg Gly Leu Val Leu 660 665 670 Arg Asn Glu Lys Cys His Asp Tyr Tyr Thr Thr Glu Phe Leu Tyr Asn 675 680 685 Leu Tyr Ser Ser Glu Gly Lys Gly Val Phe Asp Cys Arg Thr Asn Val 690 695 700 Leu Gly His Leu Gln Gln Gly Gly Ala Pro Thr Pro Phe Asp Arg Asn 705 710 715 720 Tyr Gly Thr Lys Leu Gly Val Lys Ala Met Leu Trp Leu Ser Glu Lys 725 730 735 Leu Arg Glu Val Tyr Arg Lys Gly Arg Val Phe Ala Asn Ala Pro Asp 740 745 750 Ser Ala Cys Val Ile Gly Leu Lys Lys Lys Ala Val Ala Phe Ser Pro 755 760 765 Val Thr Glu Leu Lys Lys Asp Thr Asp Phe Glu His Arg Met Pro Arg 770 775 780 Glu Gln Trp Trp Leu Ser Leu Arg Leu Met Leu Lys Met Leu Ala Gln 785 790 795 800 Tyr Arg Ile Ser Met Ala Ala Tyr Val Ser Gly Glu Leu Glu His Val 805 810 815 Thr Arg Arg Thr Leu Ser Met Asp Lys Gly Phe 820 825 65780PRTHomo sapiens 65Met Thr His Glu Glu His His Ala Ala Lys Thr Leu Gly Ile Gly Lys 1 5 10 15 Ala Ile Ala Val Leu Thr Ser Gly Gly Asp Ala Gln Gly Met Asn Ala 20 25 30 Ala Val Arg Ala Val Val Arg Val Gly Ile Phe Thr Gly Ala Arg Val 35 40 45 Phe Phe Val His Glu Gly Tyr Gln Gly Leu Val Asp Gly Gly Asp His 50 55 60 Ile Lys Glu Ala Thr Trp Glu Ser Val Ser Met Met Leu Gln Leu Gly 65 70 75 80 Gly Thr Val Ile Gly Ser Ala Arg Cys Lys Asp Phe Arg Glu Arg Glu 85 90 95 Gly Arg Leu Arg Ala Ala Tyr Asn Leu Val Lys Arg Gly Ile Thr Asn 100 105 110 Leu Cys Val Ile Gly Gly Asp Gly Ser Leu Thr Gly Ala Asp Thr Phe 115 120 125 Arg Ser Glu Trp Ser Asp Leu Leu Ser Asp Leu Gln Lys Ala Gly Lys 130 135 140 Ile Thr Asp Glu 
Glu Ala Thr Lys Ser Ser Tyr Leu Asn Ile Val Gly 145 150 155 160 Leu Val Gly Ser Ile Asp Asn Asp Phe Cys Gly Thr Asp Met Thr Ile 165 170 175 Gly Thr Asp Ser Ala Leu His Arg Ile Met Glu Ile Val Asp Ala Ile 180 185 190 Thr Thr Thr Ala Gln Ser His Gln Arg Thr Phe Val Leu Glu Val Met 195 200 205 Gly Arg His Cys Gly Tyr Leu Ala Leu Val Thr Ser Leu Ser Cys Gly 210 215 220 Ala Asp Trp Val Phe Ile Pro Glu Cys Pro Pro Asp Asp Asp Trp Glu 225 230 235 240 Glu His Leu Cys Arg Arg Leu Ser Glu Thr Arg Thr Arg Gly Ser Arg 245 250 255 Leu Asn Ile Ile Ile Val Ala Glu Gly Ala Ile Asp Lys Asn Gly Lys 260 265 270 Pro Ile Thr Ser Glu Asp Ile Lys Asn Leu Val Val Lys Arg Leu Gly 275 280 285 Tyr Asp Thr Arg Val Thr Val Leu Gly His Val Gln Arg Gly Gly Thr 290 295 300 Pro Ser Ala Phe Asp Arg Ile Leu Gly Ser Arg Met Gly Val Glu Ala 305 310 315 320 Val Met Ala Leu Leu Glu Gly Thr Pro Asp Thr Pro Ala Cys Val Val 325 330 335 Ser Leu Ser Gly Asn Gln Ala Val Arg Leu Pro Leu Met Glu Cys Val 340 345 350 Gln Val Thr Lys Asp Val Thr Lys Ala Met Asp Glu Lys Lys Phe Asp 355 360 365 Glu Ala Leu Lys Leu Arg Gly Arg Ser Phe Met Asn Asn Trp Glu Val 370 375 380 Tyr Lys Leu Leu Ala His Val Arg Pro Pro Val Ser Lys Ser Gly Ser 385 390 395 400 His Thr Val Ala Val Met Asn Val Gly Ala Pro Ala Ala Gly Met Asn 405 410 415 Ala Ala Val Arg Ser Thr Val Arg Ile Gly Leu Ile Gln Gly Asn Arg 420 425 430 Val Leu Val Val His Asp Gly Phe Glu Gly Leu Ala Lys Gly Gln Ile 435 440 445 Glu Glu Ala Gly Trp Ser Tyr Val Gly Gly Trp Thr Gly Gln Gly Gly 450 455 460 Ser Lys Leu Gly Thr Lys Arg Thr Leu Pro Lys Lys Ser Phe Glu Gln 465 470 475 480 Ile Ser Ala Asn Ile Thr Lys Phe Asn Ile Gln Gly Leu Val Ile Ile 485 490 495 Gly Gly Phe Glu Ala Tyr Thr Gly Gly Leu Glu Leu Met Glu Gly Arg 500 505 510 Lys Gln Phe Asp Glu Leu Cys Ile Pro Phe Val Val Ile Pro Ala Thr 515 520 525 Val Ser Asn Asn Val Pro Gly Ser Asp Phe Ser Val Gly Ala Asp Thr 530 535 540 Ala Leu Asn Thr Ile Cys Thr Thr Cys Asp Arg Ile Lys Gln Ser Ala 545 550 555 560 Ala Gly Thr Lys 
Arg Arg Val Phe Ile Ile Glu Thr Met Gly Gly Tyr 565 570 575 Cys Gly Tyr Leu Ala Thr Met Ala Gly Leu Ala Ala Gly Ala Asp Ala 580 585 590 Ala Tyr Ile Phe Glu Glu Pro Phe Thr Ile Arg Asp Leu Gln Ala Asn 595 600 605 Val Glu His Leu Val Gln Lys Met Lys Thr Thr Val Lys Arg Gly Leu 610 615 620 Val Leu Arg Asn Glu Lys Cys Asn Glu Asn Tyr Thr Thr Asp Phe Ile 625 630 635 640 Phe Asn Leu Tyr Ser Glu Glu Gly Lys Gly Ile Phe Asp Ser Arg Lys 645 650 655 Asn Val Leu Gly His Met Gln Gln Gly Gly Ser Pro Thr Pro Phe Asp 660 665 670 Arg Asn Phe Ala Thr Lys Met Gly Ala Lys Ala Met Asn Trp Met Ser 675 680 685 Gly Lys Ile Lys Glu Ser Tyr Arg Asn Gly Arg Ile Phe Ala Asn Thr 690 695 700 Pro Asp Ser Gly Cys Val Leu Gly Met Arg Lys Arg Ala Leu Val Phe 705 710 715 720 Gln Pro Val Ala Glu Leu Lys Asp Gln Thr Asp Phe Glu His Arg Ile 725 730 735 Pro Lys Glu Gln Trp Trp Leu Lys Leu Arg Pro Ile Leu Lys Ile Leu 740 745 750 Ala Lys Tyr Glu Ile Asp Leu Asp Thr Ser Asp His Ala His Leu Glu 755 760 765 His Ile Thr Arg Lys Arg Ser Gly Glu Ala Ala Val 770 775 780 66784PRTHomo sapiens 66Met Asp Ala Asp Asp Ser Arg Ala Pro Lys Gly Ser Leu Arg Lys Phe 1 5 10 15 Leu Glu His Leu Ser Gly Ala Gly Lys Ala Ile Gly Val Leu Thr Ser 20 25 30 Gly Gly Asp Ala Gln Gly Met Asn Ala Ala Val Arg Ala Val Val Arg 35 40 45 Met Gly Ile Tyr Val Gly Ala Lys Val Tyr Phe Ile Tyr Glu Gly Tyr 50 55 60 Gln Gly Met Val Asp Gly Gly Ser Asn Ile Ala Glu Ala Asp Trp Glu 65 70 75 80 Ser Val Ser Ser Ile Leu Gln Val Gly Gly Thr Ile Ile Gly Ser Ala 85 90 95 Arg Cys Gln Ala Phe Arg Thr Arg Glu Gly Arg Leu Lys Ala Ala Cys 100 105 110 Asn Leu Leu Gln Arg Gly Ile Thr Asn Leu Cys Val Ile Gly Gly Asp 115 120 125 Gly Ser Leu Thr Gly Ala Asn Leu Phe Arg Lys Glu Trp Ser Gly Leu 130 135 140 Leu Glu Glu Leu Ala Arg Asn Gly Gln Ile Asp Lys Glu Ala Val Gln 145 150 155 160 Lys Tyr Ala Tyr Leu Asn Val Val Gly Met Val Gly Ser Ile Asp Asn 165 170 175 Asp Phe Cys Gly Thr Asp Met Thr Ile Gly Thr Asp Ser Ala Leu His 180 185 190 Arg Ile Ile Glu Val Val Asp Ala 
Ile Met Thr Thr Ala Gln Ser His 195 200 205 Gln Arg Thr Phe Val Leu Glu Val Met Gly Arg His Cys Gly Tyr Leu 210 215 220 Ala Leu Val Ser Ala Leu Ala Cys Gly Ala Asp Trp Val Phe Leu Pro 225 230 235 240 Glu Ser Pro Pro Glu Glu Gly Trp Glu Glu Gln Met Cys Val Lys Leu 245 250 255 Ser Glu Asn Arg Ala Arg Lys Lys Arg Leu Asn Ile Ile Ile Val Ala 260 265 270 Glu Gly Ala Ile Asp Thr Gln Asn Lys Pro Ile Thr Ser Glu Lys Ile 275 280 285 Lys Glu Leu Val Val Thr Gln Leu Gly Tyr Asp Thr Arg Val Thr Ile 290 295 300 Leu Gly His Val Gln Arg Gly Gly Thr Pro Ser Ala Phe Asp Arg Ile 305 310 315 320 Leu Ala Ser Arg Met Gly Val Glu Ala Val Ile Ala Leu Leu Glu Ala 325 330 335 Thr Pro Asp Thr Pro Ala Cys Val Val Ser Leu Asn Gly Asn His Ala 340 345 350 Val Arg Leu Pro Leu Met Glu Cys Val Gln Met Thr Gln Asp Val Gln 355 360 365 Lys Ala Met Asp Glu Arg Arg Phe Gln Asp Ala Val Arg Leu Arg Gly 370 375 380 Arg Ser Phe Ala Gly Asn Leu Asn Thr Tyr Lys Arg Leu Ala Ile Lys 385 390 395 400 Leu Pro Asp Asp Gln Ile Pro Lys Thr Asn Cys Asn Val Ala Val Ile 405 410 415 Asn Val Gly Ala Pro Ala Ala Gly Met Asn Ala Ala Val Arg Ser Ala 420 425 430 Val Arg Val Gly Ile Ala Asp Gly His Arg Met Leu Ala Ile Tyr Asp 435 440 445 Gly Phe Asp Gly Phe Ala Lys Gly Gln Ile Lys Glu Ile Gly Trp Thr 450 455 460 Asp Val Gly Gly Trp Thr Gly Gln Gly Gly Ser Ile Leu Gly Thr Lys 465 470 475 480 Arg Val Leu Pro Gly Lys Tyr Leu Glu Glu Ile Ala Thr Gln Met Arg 485 490 495 Thr His Ser Ile Asn Ala Leu Leu Ile Ile Gly Gly Phe Glu Ala Tyr 500 505 510 Leu Gly Leu Leu Glu Leu Ser Ala Ala Arg Glu Lys His Glu Glu Phe 515 520 525 Cys Val Pro Met Val Met Val Pro Ala Thr Val Ser Asn Asn Val Pro 530 535 540 Gly Ser Asp Phe Ser Ile Gly Ala Asp Thr Ala Leu Asn Thr Ile Thr 545 550 555 560 Asp Thr Cys Asp Arg Ile Lys Gln Ser Ala Ser Gly Thr Lys Arg Arg 565 570 575 Val Phe Ile Ile Glu Thr Met Gly Gly Tyr Cys Gly Tyr Leu Ala Asn 580 585 590 Met Gly Gly Leu Ala Ala Gly Ala Asp Ala Ala Tyr Ile Phe Glu Glu 595 600 605 Pro Phe Asp Ile Arg Asp Leu Gln Ser 
Asn Val Glu His Leu Thr Glu 610 615 620 Lys Met Lys Thr Thr Ile Gln Arg Gly Leu Val Leu Arg Asn Glu Ser 625 630 635 640 Cys Ser Glu Asn Tyr Thr Thr Asp Phe Ile Tyr Gln Leu Tyr Ser Glu 645 650 655 Glu Gly Lys Gly Val Phe Asp Cys Arg Lys Asn Val Leu Gly His Met 660 665 670 Gln Gln Gly Gly Ala Pro Ser Pro Phe Asp Arg Asn Phe Gly Thr Lys 675 680 685 Ile Ser Ala Arg Ala Met Glu Trp Ile Thr Ala Lys Leu Lys Glu Ala 690 695 700 Arg Gly Arg Gly Lys Lys Phe Thr Thr Asp Asp Ser Ile Cys Val Leu 705 710 715 720 Gly Ile Ser Lys Arg Asn Val Ile Phe Gln Pro Val Ala Glu Leu Lys 725 730 735 Lys Gln Thr Asp Phe Glu His Arg Ile Pro Lys Glu Gln Trp Trp Leu 740 745 750 Lys Leu Arg Pro Leu Met Lys Ile Leu Ala Lys Tyr Lys Ala Ser Tyr 755 760 765 Asp Val Ser Asp Ser Gly Gln Leu Glu His Val Gln Pro Trp Ser Val 770 775 780

* * * * *