Pancreas-specific proteins Dohrmann, Cord ; et al. [Develogen Aktiengesellschaft fuer Entwicklungsbiologische Forschung]

Patent Application Summary

U.S. patent application number 10/998197 was filed with the patent office on 2004-11-29 and published on 2005-10-06 for pancreas-specific proteins. This patent application is currently assigned to Develogen Aktiengesellschaft fuer Entwicklungsbiologische Forschung. Invention is credited to Matthias Austen and Cord Dohrmann.

Application Number	20050222070 10/998197
Family ID	35055157
Filed Date	2004-11-29

United States Patent Application	20050222070
Kind Code	A1
Dohrmann, Cord ; et al.	October 6, 2005

Pancreas-specific proteins

Abstract

The present invention discloses polynucleotides which identify and encode DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 as well as novel functions for these proteins of the invention. The invention provides for compositions for disorders associated with the expression of the proteins of the invention, such as for the treatment, alleviation and/or prevention of pancreatic dysfunction (for example diabetes, hyperglycemia, and impaired glucose tolerance), and related disorders, and other diseases and disorders.

Inventors:	Dohrmann, Cord; (Goettingen, DE) ; Austen, Matthias; (Goettingen, DE)
Correspondence Address:	ROTHWELL, FIGG, ERNST & MANBECK, P.C. 1425 K STREET, N.W. SUITE 800 WASHINGTON DC 20005 US
Assignee:	Develogen Aktiengesellschaft fuer Entwicklungsbiologische Forschung Goettingen DE
Family ID:	35055157
Appl. No.:	10/998197
Filed:	November 29, 2004

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
10998197	Nov 29, 2004
PCT/EP03/05700	May 30, 2003

Current U.S. Class:	514/44R
Current CPC Class:	C07K 14/47 20130101; A01K 2227/40 20130101; A01K 2267/03 20130101; A01K 2217/075 20130101; A01K 2217/05 20130101; G01N 2800/042 20130101; A01K 67/0276 20130101; A01K 2227/105 20130101
Class at Publication:	514/044
International Class:	A61K 048/00

Foreign Application Data

Date	Code	Application Number
May 29, 2002	EP	02011963.2
Sep 17, 2002	EP	02020829.4

Claims

1. Use of a nucleic acid molecule selected from DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 or a polypeptide encoded thereby or a fragment or variant of said nucleic acid molecule or said polypeptide or an effector/modulator of said nucleic acid or said polypeptide for the manufacture of a pharmaceutical agent.

2. The use of claim 1 wherein the nucleic acid molecule is a vertebrate nucleic acid, particularly a human nucleic acid, or a fragment thereof or variant thereof.

3. The use of claim 1, wherein said nucleic acid molecule (a) hybridizes under stringent conditions to the nucleic acid molecule of SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41, 43 and/or the complementary strand thereof, (b) is degenerate with respect to the nucleic acid molecule of (a), (c) encodes a polypeptide which is at least 80% identical to SEQ ID NO: 2, 4, 6, 8, 10, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44 or (d) differs from the nucleic acid molecule of (a) to (c) by mutation and wherein said mutation causes an alteration, deletion, duplication or premature stop in the encoded polypeptide.

4. The use of claim 1, wherein the nucleic acid molecule is a DNA molecule.

5. The use of claim 1, wherein the nucleic acid molecule encodes a protein of the invention specifically expressed in pancreatic tissues or other tissues.

6. The use of claim 1, wherein said nucleic acid molecule is a recombinant nucleic acid molecule.

7. The use of claim 6, wherein said recombinant nucleic acid molecule is a vector, particularly an expression vector.

8. The use of claim 1, wherein said polypeptide is a recombinant polypeptide.

9. The use of claim 8, wherein said polypeptide is a fusion polypeptide.

10. The use of claim 1, wherein said nucleic acid molecule is selected from hybridization probes, primers, anti-sense oligonucleotides and aptamers.

11. The use of claim 1 for diagnostic applications.

12. The use of claim 1 for therapeutic applications.

13. The use of claim 1 for the manufacture of an agent for diagnosis, monitoring, prevention or treatment of pancreatic disorders, such as diabetes and related disorders, as well as neurodegenerative disorders, and other diseases.

14. The use of claim 13 for detecting and/or verifying, for the treatment, alleviation and/or prevention of a pancreatic dysfunction (for example diabetes, hyperglycemia, and impaired glucose tolerance), and related disorders including obesity, and neurodegenerative disorders, and others, in cells, cell masses, organs and/or subjects.

15. The use of claim 13 for promoting the differentiation and/or function of beta-cells in vitro and/or in vivo.

16. The use of claim 13 for the regeneration of beta-cells in vitro and/or in vivo.

17. Use of a nucleic acid molecule selected from DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 or a polypeptide encoded thereby or a fragment or variant of said nucleic acid molecule or said polypeptide or an effector/modulator of said nucleic acid molecule or said polypeptide as defined in claim 1 for monitoring and/or controlling the function of a gene and/or a gene product which is influenced and/or modified by a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide.

18. Use of a nucleic acid molecule selected from DP119, DP444, DP810, DP685, WE474, or RA977 or a polypeptide encoded thereby or a fragment or variant of said nucleic acid molecule or said polypeptide or an effector/modulator of said nucleic acid molecule or said polypeptide as defined in claim 1 for identifying substances capable of interacting with a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide.

19. A non-human transgenic animal exhibiting a modified expression of a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide.

20. The animal of claim 19, wherein the expression of the DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide is increased and/or reduced.

21. A recombinant host cell exhibiting a modified expression of a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide.

22. The cell of claim 21 which is a human cell.

23. A method of identifying a (poly)peptide involved in a metabolic disorder or metabolic syndrome, particularly in pancreatic dysfunction, in a mammal comprising the steps of (a) contacting a collection of test (poly)peptides with a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide or a fragment thereof under conditions that allow binding of said test (poly)peptide; (b) removing test (poly)peptides which do not bind and (c) identifying test (poly)peptides that bind to said DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide or the fragment thereof.

24. A method of screening for an agent which modulates the interaction of a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide or a fragment thereof with a binding target/agent, comprising the steps of (a) incubating a mixture comprising (aa) a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide or a fragment thereof; (ab) a binding target/agent of said DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide or fragment thereof; and (ac) a candidate agent under conditions whereby said DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide or fragment thereof specifically binds to said binding target/agent at a reference affinity; (b) detecting the binding affinity of said DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide or fragment thereof to said binding target to determine a (candidate) agent-biased affinity; and (c) determining a difference between (candidate) agent-biased affinity and the reference affinity.

25. A method of screening for an agent which modulates the activity of a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide, comprising the steps of (a) incubating a mixture comprising (aa) a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 polypeptide or a fragment thereof; and (ab) a candidate agent

26. A method of producing a composition comprising mixing a (poly)peptide identified by the method of claim 23 with a pharmaceutically acceptable carrier and/or diluent.

27. The method of claim 26 wherein said composition is a pharmaceutical composition for preventing, alleviating or treating a pancreatic dysfunction (for example diabetes, hyperglycemia, and impaired glucose tolerance), and related disorders including obesity, and neurodegenerative disorders, and others.

28. Use of a (poly)peptide as identified by the method of claim 23 for the preparation of a pharmaceutical composition for the treatment, alleviation and/or prevention of a pancreatic dysfunction (for example diabetes, hyperglycemia, and impaired glucose tolerance), and related disorders including obesity, and neurodegenerative disorders, and others.

29. Use of a nucleic acid molecule of the DP119, DP444, DP810, DP685, WE474, DP160, RA977 or RA770 gene family or of a fragment thereof for the preparation of a non-human animal which over- or underexpresses the DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 gene product.

30. A kit comprising at least one of (a) a DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 nucleic acid molecule or a fragment thereof; (b) a vector comprising the nucleic acid of (a); (c) a host cell comprising the nucleic acid molecule of (a) or the vector of (b); (d) a polypeptide encoded by the nucleic acid molecule of (a); (e) a fusion polypeptide encoded by the nucleic acid molecule of (a); (f) an antibody, an aptamer or another receptor recognizing the nucleic acid molecule of (a) or the polypeptide of (d) or (e) and (g) an anti-sense oligonucleotide of the nucleic acid molecule of (a).

31. A method of producing a composition comprising mixing an agent identified by the method of claim 24 with a pharmaceutically acceptable carrier and/or diluent.

32. Use of an agent as identified by the method of claim 24 for the preparation of a pharmaceutical composition for the treatment, alleviation and/or prevention of a pancreatic dysfunction (for example diabetes, hyperglycemia, and impaired glucose tolerance), and related disorders including obesity, and neurodegenerative disorders, and others.
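Claim 3(c) recites a polypeptide that is "at least 80% identical" to a listed sequence. The claims do not state how identity is computed, so the following is only a minimal sketch under stated assumptions: the two sequences are already aligned to equal length, "-" marks a gap, and gap columns are excluded from the count (one of several possible conventions). The function names and the toy sequences are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: the inputs come from some pairwise alignment step that is
    not shown here; '-' denotes a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gap columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 80.0) -> bool:
    """True if the pair reaches the threshold (80% as recited in claim 3(c))."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy, made-up peptides: 9 of 10 aligned residues match.
print(percent_identity("MKTAYIAKQR", "MKTAYIAKQL"))
print(meets_identity_threshold("MKTAYIAKQR", "MKTAYIAKQL"))
```

Different alignment methods and gap conventions can give different identity values for the same pair, so any real comparison would need to fix both in advance.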

Description

[0001] This invention relates to the use of nucleic acid and amino acid sequences of proteins specifically expressed in certain tissues including pancreatic tissues and to the use of effectors/modulators in the diagnosis, study, prevention, and treatment of diseases and disorders, for example, but not limited to, those of the pancreas, including metabolic disorders such as diabetes and related disorders like obesity, adipositas, and/or metabolic syndrome, as well as liver diseases, neurodegenerative disorders, and others. In addition, these sequences can be used for beta cell regeneration.

[0002] Worldwide, more than 151 million people have diabetes, 10% of them in the United States and about 20% in Europe (see, for example, Zimmet et al., 2001, Nature 414:782-787). Diabetes is among the leading causes of death and considered to be one of the main threats to human health in the 21st century. There are two main forms of diabetes. Type I autoimmune diabetes (IDDM) results from the destruction of insulin producing beta-cells in the pancreatic islets of Langerhans. The adult pancreas has very limited regenerative potential, and so these islets are not replaced after they are destroyed. The patient's survival then depends on exogenous administration of insulin. The risk of developing type I diabetes is higher than for virtually all other severe chronic diseases of childhood. Type II diabetes is characterized by a progression from moderate to severe insulin-resistance and glucose intolerance, leading eventually to beta cell failure and dependence on exogenous insulin. High body weight and a sedentary lifestyle are major risk factors for type II diabetes. Recently, LADA (latent autoimmune diabetes in adults) has been recognized as a form of diabetes distinct from Type I and Type II diabetes. Patients with LADA are usually first diagnosed later than most Type I diabetics, are initially not dependent on exogenous insulin and are characterized by the presence of islet autoantibodies, particularly against GAD65. It is estimated that about 10% of all patients who are currently diagnosed as Type II diabetics are actually LADA patients.

[0003] In about 4% of all pregnancies, elevated blood glucose levels can be observed in the mother. While this type of diabetes ("gestational diabetes") usually resolves after birth, it represents a health risk for both mother and baby and therefore needs to be treated.

[0004] It should be noted that not only early phase type II diabetics but also type I and LADA patients retain some beta cell activity. Therefore, in most if not all forms of diabetes, beneficial treatments can be obtained by improving insulin secretion by the beta cells still present in the patient.

[0005] Although diabetes has no longer been an acutely life-threatening disease since injectable insulin became available, it imposes a significant burden on the patient. This is because administration of insulin and other agents cannot prevent excursions to high or low blood glucose levels. Acute hypoglycemia can lead to coma and death. Frequent hyperglycemia causes complications, including diabetic ketoacidosis, end-stage renal disease, diabetic neuropathy, diabetic retinopathy and amputation. There are also a host of related conditions, such as obesity, hypertension, heart disease, peripheral vascular disease, and infections, for which persons with diabetes are at substantially increased risk. These and other complications account for a major proportion of the high cost of treating diabetic patients and contribute to overall lower quality of life and a reduced life expectancy. In order to cure diabetes, the lost beta cells would have to be replaced. This is currently done during islet or pancreas transplantation. However, donor organs are not available in sufficient numbers to transplant even a significant proportion of insulin dependent diabetic patients. Furthermore, patients have to undergo immunosuppressive therapy after transplantation, leading to a different set of side effects and long term complications.

[0006] Transplantable material could be generated from stem cells differentiated in vitro before transplantation into the patient. Progress has been made towards the differentiation of beta cells in vitro; however, additional factors promoting differentiation will have to be identified in order to enhance the performance of the differentiated cells.

[0007] A different approach can be regeneration through differentiation of somatic stem cells contained within the patient's body. These stem cells could be those which mediate the normal replacement of lost beta cells within the pancreas. However, it is also possible to treat diabetes by appropriate differentiation of stem cells in other tissues such as the liver, the intestine, or other organs.

[0008] Thus, there is a need in the art for the identification of novel factors which can promote the differentiation and/or function of beta cells in vitro and/or in vivo.

[0009] The pancreas is an essential organ possessing both an exocrine function involved in the delivery of enzymes into the digestive tract and an endocrine function by which various hormones are secreted into the blood stream. The exocrine function is assured by acinar and centroacinar cells that produce various digestive enzymes (for example, amylase, proteases, nuclease, etc.) and intercalated ducts that transport these enzymes in alkaline solution to the duodenum. The functional unit of the endocrine pancreas is the islet of Langerhans. Islets are scattered throughout the exocrine portion of the pancreas and are composed of four cell types: alpha-, beta-, delta- and PP-cells, reviewed for example in Kim & Hebrok, 2001, Genes & Development 15:111-127, and in Slack, Development 121 (1995), 1569-1580. Beta-cells produce insulin, represent the majority of the endocrine cells and form the core of the islets, while alpha-cells secrete glucagon and are located in the periphery. Delta-cells and PP-cells are less numerous and secrete somatostatin and pancreatic polypeptide, respectively.

[0010] Early pancreatic development has been well studied in different species, including chicken, zebrafish, and mice (for a detailed review, see Kim & Hebrok, 2001, supra). The pancreas develops from distinct dorsal and ventral anlagen. Pancreas development requires specification of the pancreas anlage along both anterior-posterior and dorsal-ventral axes. Within the developing anlage, a number of regulatory factors important for proper organ development have been described, although a recapitulation of the different developmental programs in vitro has so far proven to be difficult.

[0011] Later in life, the acinar and ductal cells retain a significant proliferative capacity that can ensure cell renewal and growth, whereas the islet cells become mostly mitotically inactive. During embryonic development, and probably later in life, pancreatic islets of Langerhans originate from differentiating epithelial stem cells. These stem cells are situated in the pancreatic ducts or appear to form duct-like structures during development but are otherwise poorly characterized. The early progenitor cells to the pancreatic islets are multipotential and coactivate an early endocrine gene expression program. As development proceeds, expression of islet-specific hormones becomes restricted to the pattern of expression characteristic for mature islet cells. Pancreatic islet formation is dynamic and responds to changes in insulin demand, such as during pregnancy, or during childhood and adolescence.

[0012] Many pancreatic diseases are associated with defects in pancreatic architecture or insufficient cellular regeneration, but the molecular mechanisms underlying these defects are largely unknown. However, studies have identified a number of signaling pathways which influence pancreatic cell fate as well as the morphogenesis of pancreatic structures, for example FGF signaling, activin signaling, the Hedgehog pathway, Notch signaling, VEGF signaling, and the TGF-beta signaling pathway. There is a need in the art for the identification of candidate genes that are specifically expressed in early development in certain pancreatic tissues. These genes and the proteins encoded thereby can provide tools for the diagnosis and treatment of severe pancreatic disorders and related diseases. Therefore, this invention describes proteins that are specifically expressed in pancreatic tissues early in development. The invention relates to the use of these genes and proteins in the diagnosis, prevention and/or treatment of pancreatic dysfunctions, such as diabetes, and other diseases.

[0013] So far, a function in the regulation of metabolic diseases such as diabetes has not been described in the prior art for the proteins of the invention. This invention describes novel functions for the DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 genes and proteins encoded thereby (referred to as proteins of the invention herein) that are involved in the development of the pancreas.

[0014] The identification of polynucleotides encoding molecules specifically expressed in the pancreatic tissues such as embryonic pancreatic epithelium, islet cells of the pancreas, pancreatic mesenchyme, as well as other tissues like forebrain, hindbrain, ganglia, branchial arches, stomach, intestinal region, lung, and mesonephros, and the molecules themselves, presents the opportunity to investigate diseases and disorders of the pancreas, including diabetes. The identification of the proteins of the invention and antibodies against these proteins as well as effector molecules of said polypeptides or proteins, e.g. aptamers or other receptors, satisfies a need in the art by providing new compositions useful in diagnosis, treatment, and prognosis of pancreatic diseases, adipositas and other metabolic disorders, as well as neurodegenerative disorders and other diseases.

[0015] DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 proteins and nucleic acid molecules coding therefor are obtainable from vertebrate species, e.g. mammals or birds. Particularly preferred are human homolog nucleic acids or polypeptides (see FIGS. 2, 4, 6, 8, 10, 12, 14, or 16, respectively). Also particularly preferred are chicken nucleic acids and polypeptides encoded thereby (see FIGS. 2, 4, 6, 8, 10, 12, 14, or 16, respectively).

[0016] Accordingly, the invention features a substantially purified protein which has the amino acid sequence shown in SEQ ID NO: 2, 4, 6, 8, 10, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42 or 44 respectively. One aspect of the invention features isolated and substantially purified polynucleotides that encode the proteins of the invention. In a particular aspect, the polynucleotide is the nucleotide sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41 or 43. The invention also relates to a polynucleotide sequence comprising the complement of SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41 or 43, or variants thereof. In addition, the invention features polynucleotide sequences which hybridize under stringent conditions to SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41 or 43 and/or the complement thereof. The invention additionally features polypeptides or peptides comprising fragments or portions of the above amino acid sequences and polynucleotides or oligonucleotides comprising fragments or portions of the above nucleic acid sequences and nucleic acid analogs, e.g. peptide nucleic acids (PNA), morpholinonucleic acids, locked nucleic acids (LNA), or antisense molecules thereof, and expression vectors and host cells comprising polynucleotides that encode the proteins of the invention. The length of polypeptide or peptide fragments is preferably at least 5, more preferably at least 6 and most preferably at least 8 amino acids. The length of nucleic acid fragments and nucleic acid analogs is preferably at least 10, more preferably at least 15 and most preferably at least 20 nucleotides.
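The sequence relationships in the preceding paragraph (the complement of a polynucleotide, and the preferred minimum fragment lengths) can be illustrated with a short Python sketch. It is illustrative only: the helper names are hypothetical, the complementary strand is written 5' to 3' (i.e. as the reverse complement, an assumed convention), only unambiguous DNA bases are handled, and the example sequence is made up rather than taken from any SEQ ID NO.

```python
# Translation table for unambiguous DNA bases (assumption: no IUPAC codes).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complementary strand of a DNA sequence, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


# Minimum lengths as stated in the paragraph above.
NUCLEIC_MIN = {"preferred": 10, "more preferred": 15, "most preferred": 20}
PEPTIDE_MIN = {"preferred": 5, "more preferred": 6, "most preferred": 8}


def fragment_tiers(length: int, minimums: dict) -> list:
    """Return the preference tiers whose minimum length the fragment meets."""
    return [tier for tier, n in minimums.items() if length >= n]


print(reverse_complement("ATGGCC"))          # made-up 6-mer
print(fragment_tiers(15, NUCLEIC_MIN))       # a 15-nucleotide fragment
print(fragment_tiers(4, PEPTIDE_MIN))        # a 4-residue peptide
```

A 15-nucleotide fragment meets the first two tiers but not the 20-nucleotide tier, while a 4-residue peptide falls below all three peptide minimums.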

[0017] The present invention also features antibodies which bind specifically to the proteins of the invention, and pharmaceutical compositions comprising substantially purified proteins of the invention. The invention also features the use of effectors, e.g. agonists and antagonists of the proteins of the invention. Effectors are preferably selected from antibodies, aptamers, low molecular weight molecules, antisense molecules, and ribozymes capable of modulating the function of the nucleic acids and proteins of the invention. The nucleic acids that encode the proteins of the invention are used in identifying homologous or related genes; in producing compositions that modulate the expression or function of the encoded proteins; for gene therapy; in mapping functional regions of the proteins; and in characterizing associated physiological pathways.

[0018] Before the present proteins, nucleotide sequences, and methods are described, it is understood that this invention is not limited to the particular methodology, protocols, cell lines, vectors, and reagents described as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention which will be limited only by the appended claims.

[0019] It must be noted that as used herein and in the appended claims, the singular forms "a", "an", and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, reference to "a host cell" includes a plurality of such host cells, reference to the "antibody" is a reference to one or more antibodies and equivalents thereof known to those skilled in the art, and so forth. Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, the preferred methods, devices, and materials are now described. All publications mentioned herein are incorporated herein by reference for the purpose of describing and disclosing the cell lines, vectors, and methodologies which are reported in the publications which might be used in connection with the invention. Nothing herein is to be construed as an admission that the invention is not entitled to antedate such disclosure by virtue of prior invention.

[0020] The invention is based on the finding of novel functions for DP119, DP444, DP810, DP685, WE474, DP160, RA977, or RA770 proteins and particularly based on the finding that these proteins are expressed specifically in early pancreatic tissues and in other tissues.

[0021] The invention is further based on polynucleotides encoding the proteins of the invention, functional fragments of said genes, polypeptides encoded by said genes or fragments thereof, and effectors/modulators, e.g. antibodies, biologically active nucleic acids, such as antisense molecules, RNAi molecules or ribozymes, aptamers, peptides or low-molecular weight organic compounds recognizing said polynucleotides or polypeptides, and the use of these compositions for the diagnosis, study, prevention, or treatment of diseases and disorders related to such cells, including metabolic diseases, such as diabetes and obesity, neurodegenerative disorders, heart diseases, intestinal diseases, liver disorders, and others.

[0022] Nucleic acids encoding the chicken proteins of the present invention were first identified from the pancreas tissue cDNA library (day 6) through a whole-mount in situ screen for genes expressed in the embryonic pancreatic bud (see EXAMPLES).

[0023] Zebrafish have gained importance as a model organism in recent years. The embryos of this species are transparent and available in large numbers, develop quickly outside of their mother and allow both forward and reverse genetic analysis of gene function. Published data on pancreatic development in zebrafish show that islet formation occurs extremely rapidly (within 24 hrs) and suggest that this process requires the same regulatory genes as in mammals (see Biemar et al., Dev Biol. 2001 Feb. 15; 230(2):189-203). Suppressing gene function in zebrafish embryos using morpholino antisense oligonucleotides (MOs), modified Peptide Nucleic Acids (mPNAs) or other antisense compounds with good efficiency and specificity yields phenotypes which are usually indistinguishable from genetic mutants in the same gene (Nasevicius et al., Nat Genet. 2000 October; 26(2):216-20; Effimov et al., NAR 26; 566-575; Urtishak et al., 5th international conference on zebrafish development and genetics, Madison/Wis. 2002, abstr. #17). Therefore, this approach allows rapid assessment of gene function in a model vertebrate.

[0024] Microarrays are analytical tools routinely used in bioanalysis. A microarray has molecules distributed over, and stably associated with, the surface of a solid support. The term "microarray" refers to an arrangement of a plurality of polynucleotides, polypeptides, antibodies, or other chemical compounds on a substrate. Microarrays of polypeptides, polynucleotides, and/or antibodies have been developed and find use in a variety of applications, such as monitoring gene expression, drug discovery, gene sequencing, gene mapping, bacterial identification, and combinatorial chemistry. One area in particular in which microarrays find use is in gene expression analysis (see Example 4). Array technology can be used to explore the expression of a single polymorphic gene or the expression profile of a large number of related or unrelated genes. When the expression of a single gene is examined, arrays are employed to detect the expression of a specific gene or its variants. When an expression profile is examined, arrays provide a platform for identifying genes that are tissue specific, are affected by a substance being tested in a toxicology assay, are part of a signaling cascade, carry out housekeeping functions, or are specifically related to a particular genetic predisposition, condition, disease, or disorder.

[0025] Microarrays may be prepared, used, and analyzed using methods known in the art (see for example, Brennan, T. M. et al. (1995) U.S. Pat. No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/25116; Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R. A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-2155; Heller, M. J. et al. (1997) U.S. Pat. No. 5,605,662). Various types of microarrays are well known and thoroughly described in Schena, M., ed. (1999; DNA Microarrays: A Practical Approach, Oxford University Press, London).

[0026] In further embodiments, oligonucleotides or longer fragments derived from any of the polynucleotides described herein may be used as elements on a microarray. The microarray can be used in transcript imaging techniques which monitor the relative expression levels of large numbers of genes simultaneously as described below. The microarray may also be used to identify genetic variants, mutations, and polymorphisms. This information may be used to determine gene function, to understand the genetic basis of a disorder, to diagnose a disorder, to monitor progression/regression of disease as a function of gene expression, and to develop and monitor the activities of therapeutic agents in the treatment of disease. In particular, this information may be used to develop a pharmacogenomic profile of a patient in order to select the most appropriate and effective treatment regimen for that patient. For example, therapeutic agents which are highly effective and display the fewest side effects may be selected for a patient based on his/her pharmacogenomic profile.
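The use of expression profiles to identify tissue-specific genes, as described in the microarray paragraphs above, can be sketched in a few lines of Python. This is a simplified illustration and not a method taken from the application: the gene names, signal values, and the four-fold cutoff are all hypothetical, and real array analysis would also involve normalization, replicates, and statistical testing.

```python
from statistics import mean


def tissue_specific_genes(expression, target_tissue, min_fold=4.0):
    """Return genes whose signal in target_tissue is at least min_fold times
    the mean signal across all other tissues.

    expression: {gene: {tissue: signal}}; min_fold is an assumed cutoff.
    """
    hits = []
    for gene, by_tissue in expression.items():
        others = [v for t, v in by_tissue.items() if t != target_tissue]
        if target_tissue not in by_tissue or not others:
            continue  # cannot compare without both target and background
        background = mean(others)
        if background > 0 and by_tissue[target_tissue] / background >= min_fold:
            hits.append(gene)
    return sorted(hits)


# Made-up signal intensities for two hypothetical genes.
profile = {
    "geneA": {"pancreas": 800, "liver": 100, "brain": 100},
    "geneB": {"pancreas": 300, "liver": 280, "brain": 320},
}
print(tissue_specific_genes(profile, "pancreas"))
```

With these toy values, only geneA exceeds the cutoff (eight-fold over the mean of the other tissues), while geneB is expressed at roughly the same level everywhere.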

[0027] DP119: In one embodiment, the invention encompasses the chicken DP119 protein, a polypeptide comprising the amino acid sequence of SEQ ID NO:2, as presented using the one-letter code in FIG. 2B. In situ hybridization experiments using the DP119 protein described in this invention were done on whole mounts of 5-day-old chick embryos (FIG. 1A), on sectioned pancreatic bud tissue (FIG. 1B), and on a cross-section through the dorsal part of a day 5 chicken embryo (FIG. 1C). The hybridizations show that DP119 transcripts are exclusively expressed in the ganglia along the neural tube (nt), on the outside of the developing stomach (st) and in the dorsal and ventral pancreatic buds (dpb, vpb), in pancreatic islets (is), and in some cells of the pancreatic epithelium and duct cells (du) (see FIG. 1).

[0028] The predicted amino acid sequence was searched in the publicly available GenBank database. In a search of sequence databases, it was found, for example, that DP119 has homology with a human hypothetical protein (GenBank Accession Number AL050137.1 for the cDNA and CAB43286.1 for the protein) and with a mouse hypothetical protein (GenBank Accession Number BC025654.1 for the cDNA and AAH25654.1 for the protein, see FIG. 2). Based upon homology, DP119 protein and each homologous protein or peptide may share at least some activity.

[0029] The C-terminus of DP119 contains an olfactomedin-like domain; the N-terminus is characterized by a cysteine-rich domain reminiscent of certain cytokines. These two domains may represent functional subdomains of the protein.

[0030] DP444: In one embodiment, the invention encompasses the chicken DP444 protein, a polypeptide comprising the amino acid sequence of SEQ ID NO: 8, as presented using the one-letter code in FIG. 4B. In situ hybridization experiments using the DP444 protein described in this invention were done on whole mounts of 3.5- (FIG. 3A), 4- (FIG. 3B), and 5-day-old chick embryos (FIG. 3C) and on sectioned pancreatic bud tissue (FIG. 3D). The hybridizations show that DP444 transcripts are exclusively expressed in dorsal and ventral pancreatic buds, along the neural tube, in somites, the developing intestine, in the dorsal hindbrain, the stomach, and in pancreatic islets (see FIG. 3).

[0031] The predicted amino acid sequence was searched in the publicly available GenBank database. In search of sequence databases, it was found, for example, that DP444 has homology with the human protein BAC03521, nucleotide GenBank Accession no. AK090815 (see EXAMPLE 10 for more detail). Highly homologous mouse and fish proteins could also be identified (see FIG. 4K). Search of public domain databases (e.g. SMART at http://smart.embl-heidelberg.de/ or RPS-BLAST at the NCBI) revealed that there are no known protein domains within DP444. DP444, its human, mouse and fish homologs and the proteins F25C8.3 (Anopheles gambiae, gi|19572386), F25C8.3.p (C. elegans, gi|17560138) and the CG18437 gene product (Drosophila melanogaster, gi|7301616) form a novel family of unknown function (FIG. 4K).

[0032] Knockdown of DP444 gene function in zebrafish using antisense Morpholino oligos specific for DP444 leads to an islet convergence defect in 20-30% of all injected embryos (see FIG. 3E). A similar defect can be observed when the zebrafish homolog of the neural adhesion molecule DM-GRASP/neurolin/BEN/CD166 is functionally suppressed by the same method. Suppression of both genes at the same time does not lead to an additive effect, suggesting that CD166 and DP444 might act in the same pathway. The CD166 gene has, besides its role in neural pathfinding and T-cell activation, been implicated in pancreatic development. A link between CD166 function and expression of the key pancreatic regulatory gene Pdx1 has been suggested (see Stephan et al., Developmental Biology 212, 264-277). Thus, DP444 may be involved in Pdx1 regulation.

[0033] Expression analysis in adult mouse tissues reveals that DP444 transcripts are restricted to brain (particularly hypothalamus) and islets, suggesting an important function of DP444 in beta cells.

[0034] DP810: In one embodiment, the invention encompasses the chicken DP810-like protein, a polypeptide comprising the amino acid sequence of SEQ ID NO: 18, as presented using the one-letter code in FIG. 6B. In situ hybridization experiments using the DP810 protein described in this invention were done on whole mounts of 5-day-old chick embryos (FIGS. 5A and 5B) and on sectioned pancreatic bud tissue (FIGS. 5C and 5D). The hybridizations show that DP810 transcripts of the invention are exclusively expressed in the periphery of islets (is, FIG. 5) and in the surrounding pancreatic mesenchyme (pm, FIG. 5).

[0035] The predicted amino acid sequence was searched in the publicly available GenBank database. In search of sequence databases, it was found, for example, that DP810 has homology with the human likely ortholog of mouse polydom protein (GenBank Accession Number NM_024500.1 for the cDNA (FIG. 6C, SEQ ID NO: 19) and NP_078776.1 for the protein (FIG. 6D, SEQ ID NO: 20)). Based upon homology, DP810 protein and each homologous protein or peptide may share at least some activity.

[0036] Polydom was first described in 2000 (Gilges D. et al., 2000, Biochem J. 352 Pt 1:49-59). It was shown that a C-terminally tagged form of the protein is secreted when expressed in Cos7 cells. Sites for N-glycosylation in the primary sequence and a slightly reduced mobility on SDS-PAGE gels suggest posttranslational modification by glycosylation. Strong expression of polydom was found in human placenta and lung; weaker expression was seen in spleen, skeletal muscle and heart. Pancreatic expression was not analyzed. The human homolog of Polydom was mapped by FISH to chromosome 9q32. Polydom contains a number of protein domains. Most notable are EGF (epidermal growth factor)-like repeats, a von Willebrand factor type A domain, and 34 complement control protein (CCP) modules, suggesting a potential function in cell signalling or cell adhesion.

[0037] DP685: In one embodiment, the invention encompasses the chicken DP685 protein, a polypeptide encoded by the nucleic acid sequence of SEQ ID NO: 21, as presented in FIG. 8A. In situ hybridization experiments using the DP685 protein described in this invention were done on whole mounts of 4- (FIG. 7A) and 5-day-old chick embryos (FIG. 7B). The hybridizations show that transcripts are expressed in the dorsal pancreatic bud and in the developing stomach, and in the dorsal neural tube, the dorsal forebrain, hindbrain, branchial arches, hindlimb and forelimb.

[0038] The predicted amino acid sequence was searched in the publicly available GenBank database. In search of sequence databases, it was found, for example, that DP685 has homology with human autotaxin-t (synonym Ectonucleotide pyrophosphatase/Phosphodiesterase 2 (ENPP2); GenBank Accession Number L46720.1 and AAB00855.1; SEQ ID NO: 23 and 24). Based upon homology, DP685 protein and each homologous protein or peptide may share at least some activity.

[0039] The bifunctional enzyme phosphodiesterase I (EC 3.1.4.1)/nucleotide pyrophosphatase (EC 3.6.1.9) (referred to as PD-I (alpha)) was cloned from rat brain by Narita et al. (1994) J. Biol. Chem. 269: 28235-28242. The human PD-I alpha homologue is an 863-amino acid protein with 89% identity to the rat protein (Kawagoe et al. (1995) Genomics 30: 380-384). Northern blot analysis detected a 3-kb transcript in brain, placenta, kidney and lung. An apparent splice variant of PD-I (alpha) lacking 52 amino acids, but otherwise identical, has been described as autotaxin, a tumor cell motility-stimulating factor (Murata et al., 1994 J. Biol. Chem. 269: 30479-30484). Kawagoe et al. (1995), supra, obtained a genomic clone for the 5'-end of the gene which contained a variety of potential DNA-binding sites as well as intron 1.

[0040] However, two recent publications have identified that autotaxin has lysophospholipase D activity and that it synthesizes lysophosphatidic acid (LPA) (Tokumura et al., 2002, J Biol Chem. 2002 Aug 9; Umezu-Goto et al., 2002, J Cell Biol. 158(2):227-33; reviewed in Moolenaar, 2002, J Cell Biol. 158(2):197-9). LPA is a potent signalling compound with effects on cytoskeletal organization, cell proliferation and cell migration. Its activity is mediated by a family of G-protein coupled receptors belonging to the edg-family. The different members of this family show differences in expression and downstream signalling partners (reviewed e.g. in Takuwa et al., 2002, J Biochem (Tokyo). 131(6):767-71).

[0041] As shown in this invention, the expression pattern of autotaxin in the day 4 and day 5 chicken embryo suggests that autotaxin and/or LPA synthesized by autotaxin plays an important and up to now unknown role in animal development. This is especially striking when the patterning of the limbs, the central nervous system and growth, differentiation and morphogenesis of the pancreas are considered (see FIG. 3).

[0042] The expression of autotaxin in the embryonic pancreatic bud suggests a novel function and a use of autotaxin, LPA, or other reaction products generated by autotaxin in the generation of insulin-secreting cells from other cell types such as stem cells.

[0043] The expression of autotaxin in neural tissues, e.g. the neural tube and the brain, and in the limbs suggests a novel function and a use of autotaxin, LPA, or other reaction products generated by autotaxin in the generation of neural cells and cells of the motility apparatus from other cell types such as stem cells.

[0044] It also raises the possibility that agonists specific for LPA-receptors expressed in specific cell types or their precursors can modulate the growth, differentiation, or organ-specific organization of these cells. For example, stimulation of an LPA-receptor more or less specifically expressed in certain cell types such as pancreatic stem cells, other stem cells or other cells that can be used to generate new insulin-secreting cells might yield relatively specific responses in spite of the many effects described in the literature for LPA.

[0045] WE474: In one embodiment, the invention encompasses the chicken WE474 protein, a polypeptide comprising the amino acid sequence of SEQ ID NO:28, as presented using the one-letter code in FIG. 10B. In situ hybridization experiments using the WE474 protein described in this invention were done on whole mounts of 5-day-old chick embryos. The hybridizations show that WE474 transcripts are exclusively expressed in the liver (li) and in the intestinal region (in) including the developing pancreas (FIG. 9A).

[0046] The predicted amino acid sequence was searched in the publicly available GenBank database. In search of sequence databases, it was found, for example, that WE474 has homology with a human collectin sub-family member 10 (GenBank Accession Number NM_006438.2 for the cDNA and NP_006429.1 for the protein; SEQ ID NO: 29 and 30). Based upon homology, WE474 protein and each homologous protein or peptide may share at least some activity.

[0047] Collectins are a C-lectin family with collagen-like sequences and carbohydrate recognition domains. These proteins can bind to carbohydrate antigens of microorganisms and inhibit their infection by direct neutralization and agglutination, the activation of complement through the lectin pathway, and opsonization by collectin receptors (Ohtani K. et al., 1999, J Biol Chem 274(19):13681-13689). A cDNA encoding human collectin from liver (CL-L1 (collectin liver 1)) has typical collectin structural characteristics, consisting of an N-terminal cysteine-rich domain, a collagen-like domain, a neck domain, and a carbohydrate recognition domain. This collectin has a unique repeat of four lysine residues in its C-terminal area. CL-L1 is present mainly in liver as a cytosolic protein and at low levels in placenta. More sensitive analyses showed that most tissues (except skeletal muscle) have CL-L1 mRNA. Zoo-blot analysis indicated that CL-L1 is limited to mammals and birds. A chromosomal localization study indicated that the CL-L1 gene localizes to chromosome 8q23-q24.1. CL-L1 binds mannose weakly (see, for example, Ohtani K. et al., 1999, J Biol Chem 274(19):13681-13689). Analysis of the WE474 protein sequence using suitable software (such as SignalP, Nielsen et al., Protein Engineering 10, 1-6) reveals the presence of a secretion signal. Thus, WE474 is likely to have a role in cell-cell or autocrine signalling.

[0048] DP160: In one embodiment, the invention encompasses the chicken DP160 protein, a polypeptide comprising the amino acid sequence of SEQ ID NO:32, as presented using the one-letter code in FIG. 12B. In situ hybridization experiments using the DP160 protein described in this invention were done on whole mounts of 5-day-old chick embryos (FIG. 11A) and on a cross-section through the developing pancreas of a 5-day-old chick embryo (FIG. 11A). The hybridizations show that DP160 transcripts are exclusively expressed in the ganglia along the neural tube (nt), on the outside of the developing stomach (st), in the mesonephros, in the dorsal and ventral pancreatic buds (dpb, vpb), in pancreatic islets (is), and in some cells of the pancreatic epithelium (see FIG. 11).

[0049] The predicted amino acid sequence was searched in the publicly available GenBank database. In search of sequence databases, it was found, for example, that DP160 has homology with a human CCR4 carbon catabolite repression 4-like protein (CCRN4L; Nocturnin) (GenBank Accession Number XP_003343.3 and XP_003343.2; SEQ ID NO: 33 and 34). Based upon homology, DP160 protein and each homologous protein or peptide may share at least some activity.

[0050] Nocturnin was originally identified by differential display as a circadian clock regulated gene with high expression at night in photoreceptors of the African clawed frog, Xenopus laevis. Although encoding a novel protein, the nocturnin cDNA had strong sequence similarity with a C-terminal domain of the yeast transcription factor CCR4, and with mouse and human ESTs. Since its original identification, several homologues of nocturnin/CCR4 have been cloned, including from human and mouse. Northern analysis of mRNA in C3H/He and C57/Bl6 mice revealed that the mNoc gene is expressed in a broad range of tissues, with greatest abundance in liver, kidney and testis as well as in multiple brain regions. Furthermore, mNoc exhibits circadian rhythmicity of mRNA abundance with peak levels at the time of light offset in the retina, spleen, heart, kidney and liver (Wang et al., 2001, BMC Dev Biol 1(1):9).

[0051] RA977: In one embodiment, the invention encompasses the chicken RA977 protein, a polypeptide comprising the amino acid sequence of SEQ ID NO:36, as presented using the one-letter code in FIG. 14B. In situ hybridization experiments using the RA977 protein described in this invention were done on whole mounts of 5-day-old chick embryos. The hybridizations show that RA977 transcripts are exclusively expressed in dorsal pancreatic bud (see FIGS. 13A and 13B).

[0052] The predicted amino acid sequence was searched in the publicly available GenBank database. In search of sequence databases, it was found, for example, that RA977 has homology with a human epithelial membrane protein 2 (EMP2; GenBank Accession Number XM_030218.1 for the cDNA and P54851 for the protein; SEQ ID NO: 37 and 38, see FIG. 14). Based upon homology, RA977 protein and each homologous protein or peptide may share at least some activity.

[0053] The epithelial membrane protein-2 (EMP-2) is a member of the peripheral myelin protein 22 gene family (PMP22/EMP/MP20 gene family). Mutations affecting the PMP22 gene are associated with hereditary motor and sensory neuropathies. In human, EMP-2 mRNA transcripts are found in most tissues including liver. EMP-2 is most prominently expressed in the adult ovary, heart, lung and intestine and in fetal lung. Since PMP22 has been implicated in the regulation of cell proliferation and apoptosis, it appears likely that EMP-2 is involved in similar regulatory processes in a variety of tissues (Taylor V. and Suter U., 1996, Gene 175(1-2):115-120).

[0054] Charcot-Marie-Tooth (CMT) neuropathy represents a genetically heterogeneous group of diseases affecting the peripheral nervous system. Autosomal dominant CMT type 1C (CMT1C) was mapped genetically to chromosome 16p13.1-p12.3. The epithelial membrane protein 2 gene (EMP2), which maps to chromosome 16p13.2, is a candidate gene for CMT1C (Street V. A., 2002, Am J Hum Genet 70(1):244-250).

[0055] Epithelial membrane protein 2, a 4-transmembrane protein, might suppress B-cell lymphoma tumorigenicity through a functional tumor suppressor phenotype (Wang C. X., 2001, Blood 97(12):3890-3895).

[0056] RA770: In one embodiment, the invention encompasses the chicken RA770-like protein, a polypeptide comprising the amino acid sequence of SEQ ID NO: 40, as presented using the one-letter code in FIG. 16B. In situ hybridization experiments using the RA770 protein described in this invention were done on whole mounts of 5-day-old chick embryos (FIG. 15A). The hybridizations show that RA770 transcripts of the invention are exclusively expressed in the duodenum (dd) and ventral pancreatic bud (vpb), in the stomach region (st), lung (lu) and dorsal pancreatic bud (dpb) (FIG. 15).

[0057] The predicted amino acid sequence was searched in the publicly available GenBank database. In search of sequence databases, it was found, for example, that RA770 has homology with human neurturin precursor (GenBank Accession Number NM_004558 (FIG. 16C, SEQ ID NO: 41, FIG. 16D, SEQ ID NO: 42)) and with mouse neurturin precursor (GenBank Accession Number NM_008738 (FIG. 16E, SEQ ID NO: 43, FIG. 16F, SEQ ID NO: 44)). Based upon homology, RA770 protein and each homologous protein or peptide may share at least some activity.

[0058] Neurturin (or NRTN), a potent neurotrophic factor, was purified from Chinese hamster ovary cell-conditioned media by Kotzbauer et al. (1996) Nature 384: 467-470. The protein is closely related to glial cell line-derived neurotrophic factor (GDNF). Neurturin and GDNF form a distinct TGF-beta subfamily, referred to as TRNs (for 'TGF-beta-related neurotrophins'; see review by Takahashi, 2001, Cytokine Growth Factor Rev 12(4):361-73). Members of this protein family signal through a unique multicomponent receptor system consisting of RET tyrosine kinase and a glycosyl-phosphatidylinositol-anchored coreceptor (GFRalpha1-4). These neurotrophic factors promote the survival of various neurons including peripheral autonomic and sensory neurons as well as central motor and dopamine neurons, and are expected to be useful as therapeutic agents for neurodegenerative diseases. In addition, GDNF/RET signaling plays a crucial role in renal development and regulation of spermatogonia differentiation. RET mutations cause several human diseases such as papillary thyroid carcinoma, multiple endocrine neoplasia types 2A and 2B, and Hirschsprung's disease. The mutations result in RET activation or inactivation by various mechanisms, and the biological properties of the mutant proteins appear to be correlated with disease phenotypes. The signaling pathways activated by GDNF or mutant RET are being extensively investigated to understand the molecular mechanisms of disease development and the physiological roles of the GDNF family ligands.

[0059] Heuckeroth et al. (1997) Genomics 44:137-140 stated that inactivating mutations in GDNF or Ret in knockout mice cause intestinal aganglionosis and renal dysplasia. Neurturin also signals through RET and a GPI-linked coreceptor. Like GDNF, neurturin can promote the survival of numerous neuronal populations, including sympathetic, nodose, and dorsal root ganglion sensory neurons. Heuckeroth et al. (1997), supra, isolated mouse and human genomic neurturin clones and showed that preproneurturin is encoded by 2 exons. Mouse and human clones have common intron/exon boundaries. They used interspecific backcross analysis to localize neurturin to mouse chromosome 17 and fluorescence in situ hybridization to localize human neurturin to the syntenic region of 19p13.3.

[0060] Considering that RET and glial cell line-derived neurotrophic factor mutations had been reported in Hirschsprung disease, Doray et al. (1998) Hum. Molec. Genet. 7: 1449-1452 regarded the other RET ligand, neurturin, as an attractive candidate gene, especially as it shares large homologies with GDNF. Doray et al. (1998), supra, reported a heterozygous missense Neurturin mutation in a large nonconsanguineous family including 4 children affected with a severe aganglionosis phenotype extending up to the small intestine. It appeared that the Neurturin mutation they found was not sufficient to cause HSCR, and this multiplex family also segregated a RET mutation. This cascade of independent and additive genetic events fits well with the multigenic pattern of inheritance expected in HSCR, and further supports the role of RET ligands in the development of the enteric nervous system.

[0061] The invention also encompasses variants of the proteins of the invention. A preferred variant is one having at least 80%, and more preferably 90%, amino acid sequence similarity to the amino acid sequence of the proteins of the invention (SEQ ID NO: 2, 4, 6, 8, 10, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42 or 44 respectively). A most preferred variant is one having at least 95% amino acid sequence similarity to SEQ ID NO: 2, 4, 6, 8, 10, 14, 16, 18, 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42 or 44 respectively.
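The percentage thresholds in the preceding paragraph can be illustrated with a minimal Python sketch. It is an illustration only and rests on several assumptions: the two sequences are taken to be already aligned (gaps written as "-"), plain percent identity is used as a simplified stand-in for the "similarity" named in the text (a similarity score would normally also credit conservative substitutions through a scoring matrix), and the function names and the short peptides in the usage note are hypothetical and are not sequences of the invention.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns; '-' marks a gap (assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Columns in which both sequences carry a gap hold no information; skip them.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def variant_tier(percent: float) -> str:
    """Map a percentage onto the 80% / 90% / 95% tiers named in the text."""
    if percent >= 95.0:
        return "most preferred"
    if percent >= 90.0:
        return "more preferred"
    if percent >= 80.0:
        return "preferred"
    return "below threshold"
```

For instance, two hypothetical ten-residue peptides that differ at one position, `percent_identity("MKTAYIAKQR", "MKTAYIAKQA")`, give 90.0, which `variant_tier` places in the "more preferred" tier.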

[0062] The invention also encompasses polynucleotides which encode the proteins of the invention. Accordingly, any nucleic acid sequence which encodes the amino acid sequence of the proteins of the invention can be used to generate recombinant molecules which express the proteins of the invention. In a particular embodiment, the invention encompasses the polynucleotide comprising the nucleic acid sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41 or 43. It will be appreciated by those skilled in the art that as a result of the degeneracy of the genetic code, a multitude of nucleotide sequences encoding the proteins of the invention, some bearing minimal homology to the nucleotide sequences of any known and naturally occurring gene, may be produced. Thus, the invention contemplates each and every possible variation of nucleotide sequence that could be made by selecting combinations based on possible codon choices.
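The scale of the "multitude of nucleotide sequences" that follows from the degeneracy of the genetic code can be made concrete with a short Python sketch. It assumes the standard genetic code and simply multiplies, residue by residue, the number of codons available for each amino acid. The function name `count_encodings` and the short peptides in the usage note are hypothetical illustrations, not sequences of the invention.

```python
from collections import Counter
from itertools import product
from math import prod

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
# Number of synonymous codons per amino acid ('*' denotes a stop codon).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Number of distinct coding sequences for a peptide (stop codon not counted)."""
    for aa in peptide:
        if aa == "*" or aa not in CODONS_PER_RESIDUE:
            raise ValueError(f"unsupported residue: {aa!r}")
    return prod(CODONS_PER_RESIDUE[aa] for aa in peptide)
```

A four-residue peptide such as Met-Leu-Ser-Arg already has 1 x 6 x 6 x 6 = 216 possible coding sequences under this count, so the number grows very quickly with protein length.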

[0063] Also encompassed by the invention are polynucleotide sequences that are capable of hybridizing to the claimed nucleotide sequences, and in particular, those shown in SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41 or 43, and/or the complement thereof under various conditions of stringency. Hybridization conditions are based on the melting temperature (Tm) of the nucleic acid binding complex or probe, as taught in Wahl, G. M. and S. L. Berger (1987, Methods Enzymol. 152:399-407) and Kimmel, A. R. (1987, Methods Enzymol. 152:507-511), and may be used at a defined stringency. Preferably, hybridization under stringent conditions means that after washing for 1 h with 1× SSC and 0.1% SDS at 50° C., preferably at 55° C., more preferably at 62° C. and most preferably at 68° C., particularly for 1 h in 0.2× SSC and 0.1% SDS at 50° C., preferably at 55° C., more preferably at 62° C. and most preferably at 68° C., a positive hybridization signal is observed. Altered nucleic acid sequences encoding the proteins of the invention which are encompassed by the invention include deletions, insertions, or substitutions of different nucleotides resulting in polynucleotides that encode the same or functionally equivalent proteins of the invention. The encoded proteins may also contain deletions, insertions, or substitutions of amino acid residues which produce a silent change and result in a functionally equivalent protein of the invention.
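The wash conditions named in the preceding paragraph (1× or 0.2× SSC, at 50, 55, 62 or 68° C.) can be laid out as a small Python data structure. This is a sketch under stated assumptions: the composition of 1× SSC is not given in the text and is taken here to be the commonly used recipe of 150 mM sodium chloride plus 15 mM trisodium citrate; the ranking rule (lower sodium concentration first, then higher temperature) is a simplification of how stringency is usually compared; and the names `Wash`, `WASHES` and `most_stringent` are hypothetical.

```python
from dataclasses import dataclass

# Assumed composition of 1x SSC (not stated in the text).
NACL_MM_PER_1X_SSC = 150.0
CITRATE_MM_PER_1X_SSC = 15.0  # trisodium citrate contributes 3 Na+ per molecule


@dataclass(frozen=True)
class Wash:
    ssc_fold: float  # e.g. 1.0 for 1x SSC, 0.2 for 0.2x SSC
    temp_c: float    # wash temperature in degrees Celsius

    def sodium_mm(self) -> float:
        """Approximate Na+ concentration (mM) contributed by the SSC."""
        return self.ssc_fold * (NACL_MM_PER_1X_SSC + 3 * CITRATE_MM_PER_1X_SSC)


# The eight combinations listed in the text; each wash lasts 1 h with 0.1% SDS.
WASHES = [Wash(ssc, t) for ssc in (1.0, 0.2) for t in (50, 55, 62, 68)]


def most_stringent(washes):
    """Simplified ranking: lowest salt first, then highest temperature."""
    return min(washes, key=lambda w: (w.sodium_mm(), -w.temp_c))
```

Under these assumptions, 1× SSC corresponds to roughly 195 mM sodium ions, and `most_stringent(WASHES)` selects the 0.2× SSC wash at 68° C., which matches the "most preferably" end of the range in the text.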

[0064] Also included within the scope of the present invention are alleles of the genes encoding the proteins of the invention. As used herein, an "allele" or "allelic sequence" is an alternative form of the gene which may result from at least one mutation in the nucleic acid sequence. Alleles may result in altered mRNAs or polypeptides whose structures or function may or may not be altered. Any given gene may have none, one, or many allelic forms. Common mutational changes which give rise to alleles are generally ascribed to natural deletions, additions, or substitutions of nucleotides. Each of these types of changes may occur alone, or in combination with the others, one or more times in a given sequence. Methods for DNA sequencing which are well known and generally available in the art may be used to practice any embodiments of the invention. The nucleic acid sequences encoding the proteins of the invention may be extended utilizing a partial nucleotide sequence and employing various methods known in the art to detect upstream sequences such as promoters and regulatory elements. For example, one method which may be employed, "restriction-site" PCR, uses universal primers to retrieve unknown sequence adjacent to a known locus (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). In particular, genomic DNA is first amplified in the presence of a primer to a linker sequence and a primer specific to the known region. The amplified sequences are then subjected to a second round of PCR with the same linker primer and another specific primer internal to the first one. Products of each round of PCR are transcribed with an appropriate RNA polymerase and sequenced using reverse transcriptase. Inverse PCR may also be used to amplify or extend sequences using divergent primers based on a known region (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). The primers may be designed using OLIGO 4.06 primer analysis software (National Biosciences Inc., Plymouth, Minn.), or another appropriate program, to be 22-30 nucleotides in length, to have a GC content of 50% or more, and to anneal to the target sequence at temperatures of about 68° C.-72° C. The method uses several restriction enzymes to generate a suitable fragment. The fragment is then circularized by intramolecular ligation and used as a PCR template.
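The two sequence-based primer criteria stated above (22-30 nucleotides in length, GC content of 50% or more) can be checked with a few lines of Python. This is only a sketch: it does not reproduce the OLIGO 4.06 software, it does not estimate the annealing temperature, and the function names and test sequences are hypothetical.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_primer_criteria(seq: str) -> bool:
    """True if the primer is 22-30 nt long and has a GC content of at least 50%."""
    return 22 <= len(seq) <= 30 and gc_fraction(seq) >= 0.5
```

A candidate that passes this filter would still need its annealing behaviour evaluated separately, for example with dedicated primer analysis software as described in the text.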

[0065] Another method which may be used is capture PCR, which involves PCR amplification of DNA fragments adjacent to a known sequence in human and yeast artificial chromosome DNA (Lagerstrom, M. et al., PCR Methods Applic. 1:111-119). In this method, multiple restriction enzyme digestions and ligations may also be used to place an engineered double-stranded sequence into an unknown portion of the DNA molecule before performing PCR. Another method which may be used to retrieve unknown sequences is that of Parker, J. D. et al. (1991; Nucleic Acids Res. 19:3055-3060). Additionally, one may use PCR, nested primers, and PROMOTERFINDER libraries to walk in genomic DNA (Clontech, Palo Alto, Calif.). This process avoids the need to screen libraries and is useful in finding intron/exon junctions. When screening for full-length cDNAs, it is preferable to use libraries that have been size-selected to include larger cDNAs. Also, random-primed libraries are preferable, in that they will contain more sequences which contain the 5' regions of genes. Use of a randomly primed library may be especially preferable for situations in which an oligo d(T) library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence into the 5' and 3' non-transcribed regulatory regions. Capillary electrophoresis systems which are commercially available may be used to analyze the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary sequencing may employ flowable polymers for electrophoretic separation, four different fluorescent dyes (one for each nucleotide) which are laser activated, and detection of the emitted wavelengths by a charge-coupled device camera. Output/light intensity may be converted to electrical signal using appropriate software (e.g. GENOTYPER and SEQUENCE NAVIGATOR, Perkin Elmer) and the entire process from loading of samples to computer analysis and electronic data display may be computer controlled. Capillary electrophoresis is especially preferable for the sequencing of small pieces of DNA which might be present in limited amounts in a particular sample.

[0066] In another embodiment of the invention, polynucleotide sequences or functional fragments thereof which encode the proteins of the invention, or fusion proteins or functional equivalents thereof, may be used in recombinant DNA molecules to direct expression of the proteins of the invention in appropriate host cells. Due to the inherent degeneracy of the genetic code, other DNA sequences which encode substantially the same or a functionally equivalent amino acid sequence may be produced and these sequences may be used to clone and express the proteins of the invention. As will be understood by those of skill in the art, it may be advantageous to produce protein-encoding nucleotide sequences possessing non-naturally occurring codons. For example, codons preferred by a particular prokaryotic or eukaryotic host can be selected to increase the rate of protein expression or to produce a recombinant RNA transcript having desirable properties, such as a half-life which is longer than that of a transcript generated from the naturally occurring sequence. The nucleotide sequences of the present invention can be engineered using methods generally known in the art in order to alter the sequences encoding the proteins of the invention for a variety of reasons, including but not limited to alterations which modify the cloning, processing, and/or expression of the gene product. DNA shuffling by random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be used to engineer the nucleotide sequences. For example, site-directed mutagenesis may be used to insert new restriction sites, alter glycosylation patterns, change codon preference, produce splice variants, or introduce mutations, and so forth. Such mutated genes may be used to study structure-function relationships of the proteins of the invention, or to alter properties of the proteins that affect their function or regulation.
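When codons are exchanged for host-preferred ones, the recoded sequence must still encode the same protein. The Python sketch below shows one way such a check might be written, assuming the standard genetic code. It does not choose preferred codons for any particular host, since no codon usage table is given in the text; the function names and the six-nucleotide test sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon ('*' marks a stop codon)."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def encodes_same_protein(original: str, recoded: str) -> bool:
    """True if both coding sequences translate to the identical protein."""
    return translate(original) == translate(recoded)
```

For example, `encodes_same_protein("ATGCTG", "ATGTTA")` is true because CTG and TTA both encode leucine, whereas replacing CTG with CCG (proline) changes the protein and the check fails.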

[0067] In another embodiment of the invention, natural, modified, or recombinant nucleic acid sequences encoding the proteins of the invention may be ligated to a heterologous sequence to encode a fusion protein. For example, to screen peptide libraries for inhibitors of the activity of the proteins of the invention, it may be useful to encode chimeric proteins of the invention that can be recognized by a commercially available antibody. A fusion protein may also be engineered to contain a cleavage site located between the sequence encoding the proteins of the invention and the heterologous protein sequence, so that the proteins of the invention may be cleaved and purified away from the heterologous moiety. A fusion protein between the DP444 protein and a protein transduction peptide (reviewed e.g. in Lindsay, M. A.; Curr Opin Pharmacol 2002 October; 2(5):587-94) may be engineered to allow the uptake of recombinant fusion protein by mammalian cells. In another embodiment, sequences encoding the proteins of the invention may be synthesized, in whole or in part, using chemical methods well known in the art (see Caruthers, M. H. et al. (1980) Nucl. Acids Res. Symp. Ser. 7:215-223, Horn, T. et al. (1980) Nucl. Acids Res. Symp. Ser. 7:225-232). Alternatively, the protein itself may be produced using chemical methods to synthesize the amino acid sequence of the proteins of the invention, or a portion thereof. For example, peptide synthesis can be performed using various solid-phase techniques (Roberge, J. Y. et al. (1995) Science 269:202-204) and automated synthesis may be achieved, for example, using the ABI 431A peptide synthesizer (Perkin Elmer). The newly synthesized peptide may be substantially purified by preparative high performance liquid chromatography (e.g. Creighton, T. (1983) Proteins, Structures and Molecular Principles, WH Freeman and Co., New York, N.Y.). The composition of the synthetic peptides may be confirmed by amino acid analysis or sequencing (e.g. the Edman degradation procedure; Creighton, supra). Additionally, the amino acid sequence of the proteins of the invention, or any part thereof, may be altered during direct synthesis and/or combined using chemical methods with sequences from other proteins, or any part thereof, to produce a variant polypeptide.

[0068] In order to express a biologically active protein of the invention, the nucleotide sequences encoding the proteins of the invention, or functional equivalents, may be inserted into an appropriate expression vector, i.e. a vector which contains the necessary elements for the transcription and translation of the inserted coding sequence. Methods which are well known to those skilled in the art may be used to construct expression vectors containing sequences encoding the proteins of the invention and appropriate transcriptional and translational control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques, and in vivo genetic recombination. Such techniques are described in Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press, Plainview, N.Y., and Ausubel, F. M. et al. (1989) Current Protocols in Molecular Biology, John Wiley & Sons, New York, N.Y.

[0069] A variety of expression vector/host systems may be utilized to contain and express sequences encoding the proteins of the invention. These include, but are not limited to, microorganisms such as bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors; yeast transformed with yeast expression vectors; insect cell systems infected with virus expression vectors (e.g. baculovirus); plant cell systems transformed with virus expression vectors (e.g. cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or with bacterial expression vectors (e.g. Ti or pBR322 plasmids); or animal cell systems.

[0070] The presence of polynucleotide sequences encoding the proteins of the invention can be detected by DNA-DNA or DNA-RNA hybridization and/or amplification using probes or portions or functional fragments of polynucleotides encoding the proteins of the invention. Nucleic acid amplification based assays involve the use of oligonucleotides or oligomers based on the sequences encoding the proteins of the invention to detect transformants containing DNA or RNA encoding the proteins of the invention. As used herein "oligonucleotides" or "oligomers" refer to a nucleic acid sequence of at least about 10 nucleotides and as many as about 60 nucleotides, preferably about 15 to 30 nucleotides, and more preferably about 20-25 nucleotides, which can be used as a probe or amplimer.
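
The length ranges given above for oligonucleotides can be illustrated with a minimal sketch. The function name and the hard cut-offs are assumptions made for illustration only; the text qualifies every bound with "about", so the limits are not meant to be rigid.

```python
def classify_oligo_length(sequence: str) -> str:
    """Classify a probe or amplimer by the length ranges named in the text.

    Hypothetical helper: the exact cut-offs are an assumption, since the
    text gives every bound as an approximate ("about") value.
    """
    n = len(sequence)
    if 20 <= n <= 25:
        return "more preferred"   # about 20-25 nucleotides
    if 15 <= n <= 30:
        return "preferred"        # about 15 to 30 nucleotides
    if 10 <= n <= 60:
        return "acceptable"       # at least about 10, as many as about 60
    return "outside range"
```

Such a check would only screen candidate lengths; it says nothing about the specificity of a given probe.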

[0071] A variety of protocols for detecting and measuring the expression of the proteins of the invention, using either polyclonal or monoclonal antibodies specific for the protein are known in the art. Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), and fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes on the proteins of the invention is preferred, but a competitive binding assay may be employed. These and other assays are described, among other places, in Hampton, R. et al. (1990; Serological Methods, a Laboratory Manual, APS Press, St Paul, Minn.) and Maddox, D. E. et al. (1983; J. Exp. Med. 158:1211-1216).

[0072] Compounds that bind the proteins of the invention, e.g. antibodies, are useful for the identification or enrichment of cells which are positive for the expression of the proteins of the invention from complex cell mixtures. Such cell populations are useful in transplantation, for experimental evaluation, and as a source of lineage and cell specific products, including mRNA species useful in identifying genes specifically expressed in these cells, and as a target for the identification of factors or molecules that can affect them. The pancreatic progenitor cell population, which is positive for the expression of the proteins of the invention, is useful in transplantation to provide a recipient with pancreatic islet cells, including insulin producing beta cells; for drug screening; experimental models of islet differentiation and interaction with other cell types; in vitro screening assays to define growth and differentiation factors, and to additionally characterize genes involved in islet development and regulation; and the like. The native cells may be used for these purposes, or they may be genetically modified to provide altered capabilities. Cells from a regenerating pancreas, from embryonic foregut, stomach and duodenum, or other sources of pancreatic progenitor cells may be used as a starting population. The progenitor cells may be obtained from any mammalian species, e.g. equine, bovine, porcine, canine, feline, rodent (e.g. mice, rats, hamsters), primate, etc., particularly human.

[0073] A wide variety of labels and conjugation techniques are known by those skilled in the art and may be used in various nucleic acid and amino acid assays. Means for producing labeled hybridization or PCR probes for detecting sequences related to polynucleotides encoding the proteins of the invention include oligolabeling, nick translation, end-labeling or PCR amplification using a labeled nucleotide.

[0074] Alternatively, the sequences encoding the proteins of the invention, or any portions thereof, may be cloned into a vector for the production of an mRNA probe. Such vectors are known in the art, are commercially available, and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety of commercially available kits (Pharmacia & Upjohn (Kalamazoo, Mich.); Promega (Madison, Wis.); and U.S. Biochemical Corp. (Cleveland, Ohio)). Suitable reporter molecules or labels which may be used include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents as well as substrates, cofactors, inhibitors, magnetic particles, and the like.

[0075] Host cells transformed with nucleotide sequences encoding the proteins of the invention may be cultured under conditions suitable for the expression and recovery of the protein from cell culture. The protein produced by a recombinant cell may be secreted or contained intracellularly depending on the sequence and/or the vector used. As will be understood by those of skill in the art, expression vectors containing polynucleotides which encode the proteins of the invention may be designed to contain signal sequences which direct secretion of the proteins of the invention through a prokaryotic or eukaryotic cell membrane. Other recombinant constructions may be used to join sequences encoding the proteins of the invention to a nucleotide sequence encoding a polypeptide domain which will facilitate purification of soluble proteins. Such purification facilitating domains include, but are not limited to, metal chelating peptides such as histidine-tryptophan modules that allow purification on immobilized metals, protein A domains that allow purification on immobilized immunoglobulin, and the domain utilized in the FLAG extension/affinity purification system (Immunex Corp., Seattle, Wash.). The inclusion of cleavable linker sequences such as those specific for Factor Xa or enterokinase (Invitrogen, San Diego, Calif.) between the purification domain and the proteins of the invention may be used to facilitate purification. One such expression vector provides for expression of a fusion protein containing the proteins of the invention and a nucleic acid encoding 6 histidine residues preceding a thioredoxin or an enterokinase cleavage site. The histidine residues facilitate purification on IMAC (immobilized metal ion affinity chromatography), as described in Porath, J. et al. (1992, Prot. Exp. Purif. 3:263-281), while the enterokinase cleavage site provides a means for purifying the proteins of the invention from the fusion protein. 
A discussion of vectors which contain fusion proteins is provided in Kroll, D. J. et al. (1993; DNA Cell Biol. 12:441-453). In addition to recombinant production, fragments of the proteins of the invention may be produced by direct peptide synthesis using solid-phase techniques (Merrifield, J. (1963) J. Am. Chem. Soc. 85:2149-2154). Protein synthesis may be performed using manual techniques or by automation. Automated synthesis may be achieved, for example, using the Applied Biosystems 431A peptide synthesizer (Perkin Elmer). Various fragments of the proteins of the invention may be chemically synthesized separately and combined using chemical methods to produce the full length molecule.

[0076] The nucleic acids encoding the proteins of the invention can be used to generate transgenic animals or site specific gene modifications in cell lines. Transgenic animals may be made through homologous recombination, where the normal locus of the genes encoding the proteins of the invention is altered. Alternatively, a nucleic acid construct is randomly integrated into the genome. Vectors for stable integration include plasmids, retroviruses and other animal viruses, YACs, and the like. The modified cells or animals are useful in the study of the function and regulation of the proteins of the invention. For example, a series of small deletions and/or substitutions may be made in the genes that encode the proteins of the invention to determine the role of particular domains of the protein, functions in pancreatic differentiation, etc. Specific constructs of interest include anti-sense molecules, which will block the expression of the proteins of the invention, or expression of dominant negative mutations. A detectable marker, such as lacZ, may be introduced in the locus of the genes of the invention, where upregulation of expression of the genes of the invention will result in an easily detected change in phenotype. One may also provide for expression of the genes of the invention or variants thereof in cells or tissues where they are not normally expressed or at abnormal times of development. In addition, by providing expression of the proteins of the invention in cells in which they are not normally produced, one can induce changes in cell behavior. DNA constructs for homologous recombination will comprise at least portions of the genes of the invention with the desired genetic modification, and will include regions of homology to the target locus. DNA constructs for random integration need not include regions of homology to mediate recombination. Conveniently, markers for positive and negative selection are included. 
Methods for generating cells having targeted gene modifications through homologous recombination are known in the art. For embryonic stem (ES) cells, an ES cell line may be employed, or embryonic cells may be obtained freshly from a host, e.g. mouse, rat, guinea pig etc. Such cells are grown on an appropriate fibroblast-feeder layer or grown in the presence of leukemia inhibitory factor (LIF). When ES or embryonic cells have been transformed, they may be used to produce transgenic animals. After transformation, the cells are plated onto a feeder layer in an appropriate medium. Cells containing the construct may be detected by employing a selective medium. After sufficient time for colonies to grow, they are picked and analyzed for the occurrence of homologous recombination or integration of the construct. Those colonies that are positive may then be used for embryo manipulation and blastocyst injection. Blastocysts are obtained from 4 to 6 week old superovulated females. The ES cells are trypsinized, and the modified cells are injected into the blastocoel of the blastocyst. After injection, the blastocysts are returned to each uterine horn of pseudopregnant females. Females are then allowed to go to term and the resulting offspring screened for the construct. By providing for a different phenotype of the blastocyst and the genetically modified cells, chimeric progeny can be readily detected. The chimeric animals are screened for the presence of the modified gene, and males and females having the modification are mated to produce homozygous progeny. If the gene alterations cause lethality at some point in development, tissues or organs can be maintained as allogeneic or congenic grafts or transplants, or in in vitro culture. The transgenic animals may be any non-human mammal, such as laboratory animals, domestic animals, etc. The transgenic animals may be used in functional studies, drug screening, etc.

[0077] Diagnostics and Therapeutics

[0078] From the in situ expression patterns obtained by using the proteins of this invention it can be concluded that the proteins described in this invention are specifically expressed in pancreatic cells such as islet cells (for example DP685; DP160; RA770), pancreatic mesenchyme (RA770), cells of the pancreatic epithelium (for example DP685; DP160), pancreatic duct cells (DP160) as well as in other cells such as ganglia along the neural tube (DP160; DP444), somites (DP444), dorsal hindbrain (DP444), liver (DP685), heart (DP685), stomach (DP444) and intestinal cells (DP685; DP444). Therefore, the nucleic acids and proteins of the invention and effectors/modulators thereof are useful in diagnostic and therapeutic applications implicated, for example but not limited to, in metabolic disorders and dysfunctions associated with the above organs or tissues, such as diabetes and obesity, liver diseases and neural diseases, e.g. neurodegenerative disorders, and other diseases and disorders. Hence the proteins of the invention could be useful as diagnostic markers or as targets for small molecule screening, and in the prevention or treatment of diabetes and/or obesity and other metabolic disorders and other diseases such as neurodegenerative disorders, heart, liver, stomach, or intestinal disorders.

[0079] Therapeutic uses for the invention(s) are, for example but not limited to, the following: (i) tissue regeneration in vitro and in vivo (regeneration for all these tissues and cell types composing these tissues and cell types derived from these tissues); (ii) protein therapeutic, (iii) small molecule drug target, (iv) antibody target (therapeutic, diagnostic, drug targeting/cytotoxic antibody), (v) diagnostic and/or prognostic marker, (vi) gene therapy (gene delivery/gene ablation), and (vii) research tools.

[0080] The nucleic acids and proteins of the invention are useful in therapeutic applications implicated in various diseases and disorders described below and/or other pathologies and disorders. For example, but not limited to, a cDNA encoding one of the proteins of the invention may be useful in gene therapy, and the proteins of the invention may be useful when administered to a subject in need thereof. By way of non-limiting example, the compositions of the present invention will have efficacy for treatment of patients suffering from, for example, but not limited to, metabolic disorders like diabetes and obesity, and other diseases and disorders. The novel nucleic acids encoding the proteins of the invention, or functional fragments thereof, may further be useful in diagnostic applications, wherein the presence or amount of the nucleic acid or the protein is to be assessed. These materials are further useful in the generation of antibodies that bind immunospecifically to the novel substances of the invention for use in therapeutic or diagnostic methods. In other embodiments of the invention, the compositions of the invention, e.g. the proteins or functional fragments thereof, may be used for therapeutic purposes. For example, the compositions, such as the pancreas specific proteins described in this invention, can be used for promoting the differentiation and/or function of β-cells in vitro and/or in vivo. Further, the compositions, such as the proteins, can be used for the regeneration of β-cells, e.g. of partially or completely dysfunctional β-cells in vitro and/or in vivo.

[0081] For example, in one aspect, antibodies which are specific for the proteins of the invention may be used directly as an antagonist, or indirectly as a targeting or delivery mechanism for bringing a pharmaceutical agent to cells or tissue which express the proteins of the invention. The antibodies may be generated using methods that are well known in the art. Such antibodies may include, but are not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and fragments produced by a Fab expression library. Neutralizing antibodies (i.e. those which inhibit biological function) are especially preferred for therapeutic use.

[0082] For the production of antibodies, various hosts including goats, rabbits, rats, mice, humans, and others, may be immunized by injection with the proteins of the invention or any fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species, various adjuvants may be used to increase immunological response. It is preferred that the peptides, fragments or oligopeptides used to induce antibodies to the proteins of the invention have an amino acid sequence consisting of at least five amino acids, and more preferably at least 10 amino acids.

[0083] Monoclonal antibodies to the proteins of the invention may be prepared using any technique which provides for the production of antibody molecules by continuous cell lines in culture. These include, but are not limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma technique (Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol. Methods 81:31-42; Cote, R. J. et al. (1983) Proc. Natl. Acad. Sci. 80:2026-2030; Cole, S. P. et al. (1984) Mol. Cell Biol. 62:109-120). In addition, techniques developed for the production of "chimeric antibodies", i.e. the splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate antigen specificity and biological activity, can be used (Morrison, S. L. et al. (1984) Proc. Natl. Acad. Sci. 81:6851-6855; Neuberger, M. S. et al. (1984) Nature 312:604-608; Takeda, S. et al. (1985) Nature 314:452-454). Alternatively, techniques described for the production of single chain antibodies may be adapted, using methods known in the art, to produce single chain antibodies specific for the proteins of the invention. Antibodies with related specificity, but of distinct idiotypic composition, may be generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton, D. R. (1991) Proc. Natl. Acad. Sci. 88:11120-3). Antibodies may also be produced by inducing in vivo production in the lymphocyte population or by screening recombinant immunoglobulin libraries or panels of highly specific binding reagents as disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. 86:3833-3837; Winter, G. et al. (1991) Nature 349:293-299).

[0084] Antibody fragments which contain specific binding sites for the proteins of the invention may also be generated. For example, such fragments include, but are not limited to, the F(ab')2 fragments which can be produced by pepsin digestion of the antibody molecule and the Fab fragments which can be generated by reducing the disulfide bridges of F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity (Huse, W. D. et al. (1989) Science 254:1275-1281).

[0085] Various immunoassays may be used for screening to identify antibodies having the desired specificity. Numerous protocols for competitive binding and immunoradiometric assays using either polyclonal or monoclonal antibodies with established specificities are well known in the art. Such immunoassays typically involve the measurement of complex formation between the proteins of the invention and their specific antibodies. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies reactive to two non-interfering epitopes on the proteins of the invention is preferred, but a competitive binding assay may also be employed (Maddox, supra).

[0086] In another embodiment of the invention, the polynucleotides, or any fragment thereof, such as aptamers, antisense molecules, RNAi molecules or ribozymes may be used for therapeutic purposes. In one aspect, aptamers i.e. nucleic acid molecules which are capable of binding to a protein of the invention and modulating its activity, may be generated by a screening and selection procedure involving the use of combinatorial nucleic acid libraries.

[0087] In a further aspect, antisense molecules to the polynucleotide encoding the proteins of the invention may be used in situations in which it would be desirable to block the transcription of the mRNA. In particular, cells may be transformed with sequences complementary to polynucleotides encoding the proteins of the invention. Thus, antisense molecules may be used to modulate the activity of the proteins of the invention, or to achieve regulation of gene function. Such technology is now well known in the art, and sense or antisense oligomers or larger fragments can be designed from various locations along the coding or control regions of sequences encoding the proteins of the invention. Expression vectors derived from retroviruses, adenoviruses, herpes or vaccinia viruses, or from various bacterial plasmids may be used for delivery of nucleotide sequences to the targeted organ, tissue or cell population. Methods which are well known to those skilled in the art can be used to construct recombinant vectors which will express antisense molecules complementary to the polynucleotides of the gene encoding the proteins of the invention. These techniques are described both in Sambrook et al. (supra) and in Ausubel et al. (supra). Genes encoding the proteins of the invention can be turned off by transforming a cell or tissue with expression vectors which express high levels of a polynucleotide or fragment thereof which encodes the proteins of the invention. Such constructs may be used to introduce untranslatable sense or antisense sequences into a cell. Even in the absence of integration into the DNA, such vectors may continue to transcribe RNA molecules until they are disabled by endogenous nucleases. Transient expression may last for a month or more with a non-replicating vector and even longer if appropriate replication elements are part of the vector system.

[0088] As mentioned above, modifications of gene expression can be obtained by designing antisense molecules, DNA, RNA, or nucleic acid analogues such as PNA, to the control regions of the gene encoding the proteins of the invention, i.e., the promoters, enhancers, and introns. Oligonucleotides derived from the transcription initiation site, e.g. between positions -10 and +10 from the start site, are preferred. Similarly, inhibition can be achieved using "triple helix" base-pairing methodology. Triple helix pairing is useful because it inhibits the ability of the double helix to open sufficiently for the binding of polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using triplex DNA have been described in the literature (Gee, J. E. et al. (1994) In: Huber, B. E. and B. I. Carr, Molecular and Immunologic Approaches, Futura Publishing Co., Mt. Kisco, N.Y.). The antisense molecules may also be designed to block translation of mRNA by preventing the transcript from binding to ribosomes.
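
By way of illustration only, deriving an antisense oligonucleotide from the region between positions -10 and +10 of the start site can be sketched as follows. The function names are hypothetical, and the sketch assumes the usual numbering convention in which +1 is the first transcribed nucleotide and there is no position 0, so that the window spans 20 nucleotides.

```python
# Watson-Crick complement table for DNA (A<->T, C<->G).
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def start_site_window(sense_strand: str, plus_one_index: int,
                      upstream: int = 10, downstream: int = 10) -> str:
    """Return positions -upstream..+downstream of the sense strand.

    plus_one_index is the 0-based index of the +1 nucleotide (an
    assumption of this sketch; there is no position 0).
    """
    start = plus_one_index - upstream
    end = plus_one_index + downstream
    if start < 0 or end > len(sense_strand):
        raise ValueError("window extends beyond the supplied sequence")
    return sense_strand[start:end].upper()


def antisense_oligo(sense_strand: str, plus_one_index: int) -> str:
    """Antisense oligonucleotide complementary to the -10..+10 window."""
    return reverse_complement(start_site_window(sense_strand, plus_one_index))
```

For a hypothetical sense strand of ten A residues followed by ten C residues with +1 at the first C, the sketch returns ten G residues followed by ten T residues.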

[0089] Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme molecule to complementary target RNA, followed by endonucleolytic cleavage. Examples which may be used include engineered hammerhead motif ribozyme molecules that can specifically and efficiently catalyze endonucleolytic cleavage of sequences encoding the proteins of the invention. Specific ribozyme cleavage sites within any potential RNA target are initially identified by scanning the target molecule for ribozyme cleavage sites which include the following sequences: GUA, GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides corresponding to the region of the target gene containing the cleavage site may be evaluated for secondary structural features which may render the oligonucleotide inoperable. The suitability of candidate targets may also be evaluated by testing accessibility to hybridization with complementary oligonucleotides using ribonuclease protection assays.
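
The scanning step described above can be sketched as follows. This is a minimal illustration, not a validated design tool: the function names are hypothetical, centering the window on the triplet is an assumption, and the evaluation of secondary structure is not attempted.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")


def _as_rna(sequence: str) -> str:
    """Upper-case the input and read any T as U (accepts cDNA input)."""
    return sequence.upper().replace("T", "U")


def find_cleavage_sites(target: str) -> list:
    """Return (0-based position, triplet) for every GUA, GUU or GUC."""
    rna = _as_rna(target)
    return [(i, rna[i:i + 3]) for i in range(len(rna) - 2)
            if rna[i:i + 3] in CLEAVAGE_TRIPLETS]


def candidate_regions(target: str, length: int = 17) -> list:
    """Return (start, region) pairs of 15-20 nt containing each site.

    The window is roughly centered on the triplet and shifted inward
    at the ends of the target (an assumption of this sketch).
    """
    if not 15 <= length <= 20:
        raise ValueError("length must be between 15 and 20 ribonucleotides")
    rna = _as_rna(target)
    if len(rna) < length:
        return []
    regions = []
    for pos, _triplet in find_cleavage_sites(rna):
        start = pos - (length - 3) // 2
        start = max(0, min(start, len(rna) - length))
        regions.append((start, rna[start:start + length]))
    return regions
```

Each region returned would still have to be assessed for secondary structure and accessibility as described in the text.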

[0090] Effector nucleic acid molecules, e.g. antisense molecules and ribozymes of the invention, may be prepared by any method known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively, RNA molecules may be generated by in vitro and in vivo transcription of DNA sequences encoding the proteins of the invention. Such DNA sequences may be incorporated into a variety of vectors with suitable RNA polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize antisense RNA constitutively or inducibly can be introduced into cell lines, cells, or tissues. RNA molecules may be modified to increase intracellular stability and half-life. Possible modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' ends of the molecule or the use of phosphorothioate or 2' O-methyl rather than phosphodiester linkages within the backbone of the molecule. This concept is inherent in the production of PNAs and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine, queuosine, and wybutosine, as well as acetyl-, methyl-, thio- and similarly modified forms of adenine, cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous endonucleases.

[0091] Gene function can also be suppressed using small interfering RNAs. These are short (18 to 25 bp) RNA duplexes (the RNA may be modified for stabilization). The small interfering RNAs can be made either synthetically, by in vitro transcription procedures or using suitable vectors which express the desired RNA duplex as a hairpin structure inside the target cell. Applications include functional gene suppression in tissue culture, in model organisms such as mice or therapeutically (see e.g. Shi, Y. Trends Genet 19(1):9-12; Shuey, D. J., Drug Discov Today. 7(20):1040-6). The presence of longer (>30 bp) antisense RNAs inside of eukaryotic cells can also lead to gene silencing under certain circumstances.
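
As a rough sketch of the duplex structure described above, the two strands of a small interfering RNA directed against a chosen target site can be written out as follows. The function name and the default length of 21 are assumptions for illustration; the sketch produces a blunt duplex and does not model overhangs, stabilizing modifications, or hairpin loops.

```python
# Watson-Crick complement table for RNA (A<->U, C<->G).
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def sirna_duplex(mrna: str, start: int, length: int = 21) -> tuple:
    """Return (sense, antisense) strands, both written 5' to 3'.

    Hypothetical helper: start is the 0-based position of the target
    site in the mRNA, and length must lie in the 18 to 25 range.
    """
    if not 18 <= length <= 25:
        raise ValueError("duplex length must be between 18 and 25")
    rna = mrna.upper().replace("T", "U")  # accept cDNA input as well
    if start < 0 or start + length > len(rna):
        raise ValueError("target site lies outside the supplied sequence")
    sense = rna[start:start + length]
    antisense = sense.translate(RNA_COMPLEMENT)[::-1]
    return sense, antisense
```

The antisense strand returned is the reverse complement of the sense strand, so the two anneal over their full length.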

[0092] Many methods for introducing vectors into cells or tissues are available and equally suitable for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells taken from the patient and clonally propagated for autologous transplant back into that same patient. Delivery by transfection and by liposome injections may be achieved using methods which are well known in the art. Any of the therapeutic methods described above may be applied to any suitable subject including, for example, mammals such as dogs, cats, cows, horses, rabbits, monkeys, and most preferably, humans.

[0093] An additional embodiment of the invention relates to the administration of a pharmaceutical composition, in conjunction with a pharmaceutically acceptable carrier, for any of the therapeutic effects discussed above. Such pharmaceutical compositions may consist of the proteins of the invention, antibodies to the proteins of the invention, mimetics, agonists, antagonists, or inhibitors of the proteins of the invention. The compositions may be administered alone or in combination with at least one other agent, such as stabilizing compound, which may be administered in any sterile, biocompatible pharmaceutical carrier, including, but not limited to, saline, buffered saline, dextrose, and water. The compositions may be administered to a patient alone, or in combination with other agents, drugs or hormones. The pharmaceutical compositions utilized in this invention may be administered by any number of routes including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

[0094] In addition to the active ingredients, these pharmaceutical compositions may contain suitable pharmaceutically-acceptable carriers comprising excipients and auxiliaries which facilitate processing of the active compounds into preparations which can be used pharmaceutically. Further details on techniques for formulation and administration may be found in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing Co., Easton, Pa.).

[0095] The pharmaceutical compositions of the present invention may be manufactured in a manner that is known in the art, e.g. by means of conventional mixing, dissolving, granulating, dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing processes. After pharmaceutical compositions have been prepared, they can be placed in an appropriate container and labeled for treatment of an indicated condition. For administration of the proteins of the invention, such labeling would include amount, frequency, and method of administration.

[0096] Pharmaceutical compositions suitable for use in the invention include compositions wherein the active ingredients are contained in an effective amount to achieve the intended purpose. The determination of an effective dose is well within the capability of those skilled in the art. For any compound, the therapeutically effective dose can be estimated initially either in cell culture assays, e.g. of preadipocytic cell lines, or in animal models, usually mice, rabbits, dogs, or pigs. The animal model may also be used to determine the appropriate concentration range and route of administration. Such information can then be used to determine useful doses and routes for administration in humans. A therapeutically effective dose refers to that amount of active ingredient, for example the proteins of the invention or fragments thereof, or antibodies to the proteins of the invention, which is effective for the treatment of a specific condition. Therapeutic efficacy and toxicity may be determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g. ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 50% of the population). The dose ratio between therapeutic and toxic effects is the therapeutic index, and it can be expressed as the ratio LD50/ED50. Pharmaceutical compositions which exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are used in formulating a range of dosage for human use. The dosage contained in such compositions is preferably within a range of circulating concentrations that include the ED50 with little or no toxicity. The dosage varies within this range depending upon the dosage form employed, sensitivity of the patient, and the route of administration. The exact dosage will be determined by the practitioner, in light of factors related to the subject that requires treatment. 
Dosage and administration are adjusted to provide sufficient levels of the active moiety or to maintain the desired effect. Factors which may be taken into account include the severity of the disease state, general health of the subject, age, weight, and gender of the subject, diet, time and frequency of administration, drug combination(s), reaction sensitivities, and tolerance/response to therapy. Long-acting pharmaceutical compositions may be administered every 3 to 4 days, every week, or once every two weeks depending on half-life and clearance rate of the particular formulation. Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of about 1 g, depending upon the route of administration. Guidance as to particular dosages and methods of delivery is provided in the literature and generally available to practitioners in the art. Those skilled in the art employ different formulations for nucleotides than for proteins or their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells, conditions, locations, etc.
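
The therapeutic index defined above as the ratio LD50/ED50 can be illustrated with a short calculation. The function name and the numerical values are hypothetical and serve only to show the arithmetic; they are not data for any composition of the invention.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index, i.e. the ratio LD50/ED50.

    Both doses must be positive and expressed in the same units.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical example values (same units, e.g. mg/kg), for illustration only.
example_index = therapeutic_index(ld50=500.0, ed50=5.0)
```

With these illustrative values the index is 100; a larger index corresponds to a wider margin between the effective and the lethal dose.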

[0097] In another embodiment, antibodies which specifically bind the proteins of the invention may be used for the diagnosis of conditions or diseases characterized by expression of the proteins of the invention, or in assays to monitor patients being treated with the proteins of the invention, agonists, antagonists or inhibitors. The antibodies useful for diagnostic purposes may be prepared in the same manner as those described above for therapeutics. Diagnostic assays for the proteins of the invention include methods which utilize the antibody and a label to detect the proteins of the invention in human body fluids or extracts of cells or tissues. The antibodies may be used with or without modification, and may be labeled by joining them, either covalently or non-covalently, with a reporter molecule. A wide variety of reporter molecules which are known in the art may be used, several of which are described above.

[0098] A variety of protocols including ELISA, RIA, and FACS for measuring the proteins of the invention are known in the art and provide a basis for diagnosing altered or abnormal levels of expression of the proteins of the invention. Normal or standard values for expression of the proteins of the invention are established by combining body fluids or cell extracts taken from normal mammalian subjects, preferably human, with antibody to the proteins of the invention under conditions suitable for complex formation. The amount of standard complex formation may be quantified by various methods, but preferably by photometric means. Quantities of the proteins of the invention expressed in control and disease samples from biopsied tissues, for example, are compared with the standard values. Deviation between standard and subject values establishes the parameters for diagnosing disease.

[0099] In another embodiment of the invention, the polynucleotides of the invention may be used for diagnostic purposes. The polynucleotides which may be used include oligonucleotide sequences, antisense RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect and quantitate gene expression in biopsied tissues in which expression of the proteins of the invention may be correlated with disease. The diagnostic assay may be used to distinguish between absence, presence, and excess expression of the proteins of the invention, and to monitor regulation of the levels of the proteins of the invention during therapeutic intervention.

[0100] In one aspect, hybridization with PCR probes which are capable of detecting polynucleotide sequences, including genomic sequences, encoding the proteins of the invention or closely related molecules, may be used to identify nucleic acid sequences which encode the proteins of the invention. The specificity of the probe, whether it is made from a highly specific region, or a less specific region, and the stringency of the hybridization or amplification (maximal, high, intermediate, or low) will determine whether the probe identifies only naturally occurring sequences encoding the proteins of the invention, alleles, or related sequences. Probes may also be used for the detection of related sequences, and should preferably contain at least 50% of the nucleotides from any of the sequences encoding the proteins of the invention. The hybridization probes of the subject invention may be DNA or RNA and derived from the nucleotide sequence of SEQ ID NO: 1, 3, 5, 7, 9, 11, 12, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39, 41 or 43, or from a genomic sequence including promoter, enhancer elements, and introns of the naturally occurring genes encoding the proteins of the invention. Means for producing specific hybridization probes for DNAs encoding the proteins of the invention include the cloning of nucleic acid sequences encoding the proteins of the invention or derivatives of the proteins of the invention into vectors for the production of mRNA probes. Such vectors are known in the art, commercially available, and may be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a variety of reporter groups, for example, radionuclides such as ³²P or ³⁵S, or enzymatic labels, such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

[0101] Polynucleotide sequences may be used for the diagnosis of conditions or diseases which are associated with expression of the proteins of the invention. Examples of such conditions or diseases include, but are not limited to, pancreatic diseases and disorders, including diabetes. Polynucleotide sequences may also be used to monitor the progress of patients receiving treatment for pancreatic diseases and disorders, including diabetes. The polynucleotide sequences may be used in Southern or northern analysis, dot blot, or other membrane-based technologies; in PCR technologies; or in dip stick, pin, ELISA or chip assays utilizing fluids or tissues from patient biopsies to detect altered expression of the proteins of the invention. Such qualitative or quantitative methods are well known in the art.

[0102] In a particular aspect, the nucleotide sequences may be useful in assays that detect activation or induction of various pancreatic diseases and disorders, including diabetes, particularly those mentioned above. The nucleotide sequences may be labeled by standard methods, and added to a fluid or tissue sample from a patient under conditions suitable for the formation of hybridization complexes. After a suitable incubation period, the sample is washed and the signal is quantitated and compared with a standard value. The presence of altered levels of nucleotide sequences in the sample compared to the standard, e.g. a control sample indicates the presence of the associated disease. Such assays may also be used to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical trials, or in monitoring the treatment of an individual patient.

[0103] In order to provide a basis for the diagnosis of disease associated with expression of the proteins of the invention, a normal or standard profile for expression is established. This may be accomplished by combining body fluids or cell extracts taken from normal subjects, either animal or human, with a sequence, or a fragment thereof, which encodes the proteins of the invention, under conditions suitable for hybridization or amplification. Standard hybridization may be quantified by comparing the values obtained from normal subjects with those from an experiment where a known amount of a substantially purified polynucleotide is used. Standard values obtained from normal samples may be compared with values obtained from samples from patients who are symptomatic for disease. Deviation between standard and subject values is used to establish the presence of disease. Once disease is established and a treatment protocol is initiated, hybridization assays may be repeated on a regular basis to evaluate whether the level of expression in the patient begins to approximate that which is observed in the normal patient. The results obtained from successive assays may be used to show the efficacy of treatment over a period ranging from several days to months.
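The comparison of subject values against a standard profile described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the standard is summarized as the mean of the normal samples plus or minus k sample standard deviations; the function names and the choice k=2 are hypothetical and are not taken from the specification.

```python
from statistics import mean, stdev


def reference_range(normal_values, k=2.0):
    """Return (low, high) as mean -/+ k sample standard deviations.

    The cut-off k is an illustrative assumption, not a value from the text.
    """
    m = mean(normal_values)
    s = stdev(normal_values)
    return (m - k * s, m + k * s)


def deviates(value, ref_range):
    """True if a subject value falls outside the standard range."""
    low, high = ref_range
    return value < low or value > high


# Hypothetical expression values measured in normal subjects (arbitrary units).
normal = [10.0, 11.0, 9.0, 10.0, 10.0]
standard = reference_range(normal)
flagged = deviates(14.0, standard)
```

Repeating the same comparison on samples taken at successive time points would correspond to the monitoring of treatment described in the paragraph.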

[0104] With respect to pancreatic diseases and disorders, including diabetes, the presence of a relatively high amount of transcript in biopsied tissue from an individual may indicate a predisposition for the development of the disease, or may provide a means for detecting the disease prior to the appearance of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals to employ preventative measures or aggressive treatment earlier thereby preventing the development or further progression of the pancreatic diseases and disorders. Additional diagnostic uses for oligonucleotides designed from the sequences encoding the proteins of the invention may involve the use of PCR. Such oligomers may be chemically synthesized, generated enzymatically, or produced from a recombinant source. Oligomers will preferably consist of two nucleotide sequences, one with sense orientation (5'→3') and another with antisense (3'←5'), employed under optimized conditions for identification of a specific gene or condition. The same two oligomers, nested sets of oligomers, or even a degenerate pool of oligomers may be employed under less stringent conditions for detection and/or quantitation of closely related DNA or RNA sequences.
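The relationship between the sense and antisense oligomers of a PCR pair can be sketched in Python as follows. This is only an illustration of strand orientation: the helper names and the fixed oligomer length are hypothetical, and practical primer design would also consider melting temperature, specificity and secondary structure, none of which is modelled here.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_pair(template, length=20):
    """Derive a naive sense/antisense oligomer pair from a template.

    The sense oligomer is the first `length` bases of the template; the
    antisense oligomer is the reverse complement of the last `length` bases,
    so both are written 5'->3'. `length` is an illustrative assumption.
    """
    if len(template) < 2 * length:
        raise ValueError("template too short for two non-overlapping oligomers")
    sense = template[:length]
    antisense = reverse_complement(template[-length:])
    return sense, antisense


# Toy template, not a sequence from the specification.
sense, antisense = primer_pair("ATGCCCGGGTAA", length=3)
```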

[0105] Methods which may also be used to quantitate the expression of the proteins of the invention include various labels, e.g. radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules, particles, e.g. magnetic particles or the like. Specific binding molecules include pairs, such as biotin and streptavidin, digoxin and antidigoxin etc. For the specific binding members, the complementary member would normally be labeled with a molecule that provides for detection, in accordance with known procedures. The methods include coamplification of a control nucleic acid, and standard curves onto which the experimental results are interpolated (Melby, P. C. et al. (1993) J. Immunol. Methods, 159:235-244; Duplaa, C. et al. (1993) Anal. Biochem. 212:229-236). The speed of quantitation of multiple samples may be accelerated by running the assay in an ELISA format where the oligomer of interest is presented in various dilutions and a spectrophotometric or colorimetric response gives rapid quantitation.
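Interpolation of an experimental result onto a standard curve can be sketched as below. The sketch assumes piecewise-linear interpolation between standards of known amount and a signal that increases with amount; the function name and the example values are hypothetical, and the cited references may use a different curve model.

```python
from bisect import bisect_left


def interpolate_on_standard_curve(signal, standards):
    """Estimate an amount from a measured signal.

    `standards` is a list of (known_amount, measured_signal) pairs.
    Linear interpolation between neighbouring standards is an assumption
    made for illustration only.
    """
    points = sorted(standards, key=lambda p: p[1])
    signals = [s for _, s in points]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the range of the standard curve")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return points[i][0]
    (a0, s0), (a1, s1) = points[i - 1], points[i]
    return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)


# Hypothetical dilution series: (amount, photometric reading).
curve = [(0.0, 0.0), (10.0, 1.0), (20.0, 2.0)]
estimate = interpolate_on_standard_curve(1.5, curve)
```

Readings outside the range covered by the standards are rejected rather than extrapolated, which is a deliberately conservative choice in this sketch.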

[0106] In another embodiment of the invention, the nucleic acid sequences which encode the proteins of the invention may also be used to generate hybridization probes which are useful for mapping the naturally occurring genomic sequence. The sequences may be mapped to a particular chromosome or to a specific region of the chromosome using well known techniques. Such techniques include FISH, FACS, or artificial chromosome constructions, such as yeast artificial chromosomes, bacterial artificial chromosomes, bacterial P1 constructions or single chromosome cDNA libraries as reviewed in Price, C. M. (1993) Blood Rev. 7:127-134, and Trask, B. J. (1991) Trends Genet. 7:149-154. FISH (as described in Verma et al. (1988) Human Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York, N.Y.) may be correlated with other physical chromosome mapping techniques and genetic map data. Examples of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f). Correlation between the location of the gene encoding the proteins of the invention on a physical chromosomal map and a specific disease, or predisposition to a specific disease, may help delimit the region of DNA associated with that genetic disease.

[0107] The nucleotide sequences of the subject invention may be used to detect differences in gene sequences between normal, carrier, or affected individuals. In situ hybridization of chromosomal preparations and physical mapping techniques such as linkage analysis using established chromosomal markers may be used for extending genetic maps. Often the placement of a gene on the chromosome of another mammalian species, such as mouse, may reveal associated markers even if the number or arm of a particular human chromosome is not known. New sequences can be assigned to chromosomal arms, or parts thereof, by physical mapping. This provides valuable information to investigators searching for disease genes using positional cloning or other gene discovery techniques. Once the disease or syndrome has been crudely localized by genetic linkage to a particular genomic region, for example, AT to 11q22-23 (Gatti, R. A. et al. (1988) Nature 336:577-580), any sequences mapping to that area may represent associated or regulatory genes for further investigation. The nucleotide sequence of the subject invention may also be used to detect differences in the chromosomal location due to translocation, inversion, etc. among normal, carrier, or affected individuals.

[0108] In another embodiment of the invention, the proteins of the invention, its catalytic or immunogenic fragments or oligopeptides thereof, an in vitro model, a genetically altered cell or animal, can be used for screening libraries of compounds in any of a variety of drug screening techniques. One can identify ligands or substrates that bind to, modulate or mimic the action of one or more of the proteins of the invention. A protein of the invention or a fragment thereof employed in such screening may be free in solution, affixed to a solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes, between the proteins of the invention and the agent tested, may be measured. Of particular interest are screening assays for agents that have a low toxicity for mammalian cells. The term "agent" as used herein describes any molecule, e.g. protein, peptide or pharmaceutical, with the capability of altering or mimicking the physiological function of one or more of the proteins of the invention. Candidate agents encompass numerous chemical classes, though typically they are organic molecules, preferably small organic compounds having a molecular weight of more than 50 and less than about 2,500 Daltons. Candidate agents comprise functional groups necessary for structural interaction with proteins, particularly hydrogen bonding, and typically include at least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional chemical groups. The candidate agents often comprise cyclical carbon or heterocyclic structures and/or aromatic or polyaromatic structures substituted with one or more of the above functional groups. Candidate agents are also found among biomolecules including peptides, saccharides, fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof. 
Candidate agents are obtained from a wide variety of sources including libraries of synthetic or natural compounds. For example, numerous means are available for random and directed synthesis of a wide variety of organic compounds and biomolecules, including expression of randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant and animal extracts are available or readily produced. Additionally, natural or synthetically produced libraries and compounds are readily modified through conventional chemical, physical and biochemical means, and may be used to produce combinatorial libraries. Known pharmacological agents may be subjected to directed or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc. to produce structural analogs. Where the screening assay is a binding assay, one or more of the molecules may be joined to a label, where the label can directly or indirectly provide a detectable signal.
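The preferred properties of candidate agents stated above (molecular weight of more than 50 and less than about 2,500 Daltons, and at least one, preferably at least two, of the amine, carbonyl, hydroxyl or carboxyl groups) can be expressed as a simple library pre-filter. The Python sketch below is hypothetical: the `Candidate` record, the treatment of "about 2,500" as a hard limit, and the example compounds are illustrative assumptions, not part of the described screening methods.

```python
from dataclasses import dataclass, field

# Functional groups named in the text as typical of candidate agents.
FUNCTIONAL_GROUPS = {"amine", "carbonyl", "hydroxyl", "carboxyl"}


@dataclass
class Candidate:
    name: str
    mol_weight: float  # Daltons
    groups: set = field(default_factory=set)


def passes_prefilter(candidate, min_groups=1):
    """Check molecular weight range and number of listed functional groups.

    min_groups=1 mirrors "at least" one group; min_groups=2 mirrors the
    stated preference for at least two. The 2,500 Da limit is treated as a
    hard cut-off here, which is an assumption.
    """
    in_range = 50 < candidate.mol_weight < 2500
    n_groups = len(candidate.groups & FUNCTIONAL_GROUPS)
    return in_range and n_groups >= min_groups


# Hypothetical library entries.
library = [
    Candidate("compound_a", 300.0, {"amine", "hydroxyl"}),
    Candidate("compound_b", 3000.0, {"amine"}),
    Candidate("compound_c", 180.0, {"ether"}),
]
selected = [c.name for c in library if passes_prefilter(c, min_groups=2)]
```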

[0109] Another technique for drug screening which may be used provides for high throughput screening of compounds having suitable binding affinity to the protein of interest as described in published PCT application WO84/03564. In this method, as applied to the proteins of the invention, large numbers of different small test compounds are provided or synthesized on a solid substrate, such as plastic pins or some other surface. The test compounds are reacted with the proteins of the invention, or fragments thereof, and washed. Bound proteins of the invention are then detected by methods well known in the art. Purified proteins of the invention can also be coated directly onto plates for use in the aforementioned drug screening techniques. Alternatively, non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support. In another embodiment, one may use competitive drug screening assays in which neutralizing antibodies capable of binding the proteins of the invention specifically compete with a test compound for binding the proteins of the invention. In this manner, the antibodies can be used to detect the presence of any peptide which shares one or more antigenic determinants with the proteins of the invention. In additional embodiments, the nucleotide sequences which encode the proteins of the invention may be used in any molecular biology techniques that have yet to be developed, provided the new techniques rely on properties of nucleotide sequences that are currently known, including, but not limited to, such properties as the triplet genetic code and specific base pair interactions.

[0110] The nucleic acids encoding the proteins of the invention can be used to generate transgenic cell lines and animals. These transgenic non-human animals are useful in the study of the function and regulation of the proteins of the invention in vivo. Transgenic animals, particularly mammalian transgenic animals, can serve as a model system for the investigation of many developmental and cellular processes common to humans. A variety of non-human models of metabolic disorders can be used to test modulators of the protein of the invention. Misexpression (for example, overexpression or lack of expression) of the protein of the invention, particular feeding conditions, and/or administration of biologically active compounds can create models of metabolic disorders.

[0111] In one embodiment of the invention, such assays use mouse models of insulin resistance and/or diabetes, such as mice carrying gene knockouts in the leptin pathway (for example, ob (leptin) or db (leptin receptor) mice). Such mice develop typical symptoms of diabetes, show hepatic lipid accumulation and frequently have increased plasma lipid levels (see Bruning et al, 1998, Mol. Cell. 2:449-569). Susceptible wild type mice (for example C57Bl/6) show similar symptoms if fed a high fat diet. In addition to testing the expression of the proteins of the invention in such mouse strains, these mice could be used to test whether administration of a candidate modulator alters, for example, lipid accumulation in the liver, in plasma, or adipose tissues using standard assays well known in the art, such as FPLC, colorimetric assays, blood glucose level tests, insulin tolerance tests and others.

[0112] Transgenic animals may be made through homologous recombination in non-human embryonic stem cells, where the normal locus of the gene encoding the protein of the invention is mutated. Alternatively, a nucleic acid construct encoding the protein is injected into oocytes and is randomly integrated into the genome. One may also express the genes of the invention or variants thereof in tissues where they are not normally expressed or at abnormal times of development. Furthermore, variants of the genes of the invention like specific constructs expressing anti-sense molecules or expression of dominant negative mutations, which will block or alter the expression of the proteins of the invention may be randomly integrated into the genome. A detectable marker, such as lac Z or luciferase may be introduced into the locus of the genes of the invention, where upregulation of expression of the genes of the invention will result in an easily detectable change in phenotype. Vectors for stable integration include plasmids, retroviruses and other animal viruses, yeast artificial chromosomes (YACs), and the like.

[0113] DNA constructs for homologous recombination will contain at least portions of the genes of the invention with the desired genetic modification, and will include regions of homology to the target locus. Conveniently, markers for positive and negative selection are included. DNA constructs for random integration do not need to contain regions of homology to mediate recombination. DNA constructs for random integration will consist of the nucleic acids encoding the proteins of the invention, a regulatory element (promoter), an intron and a poly-adenylation signal. Methods for generating cells having targeted gene modifications through homologous recombination are known in the field. For embryonic stem (ES) cells, an ES cell line may be employed, or embryonic cells may be obtained freshly from a host, e.g. mouse, rat, guinea pig, etc. Such cells are grown on an appropriate fibroblast-feeder layer and are grown in the presence of leukemia inhibiting factor (LIF).

[0114] When ES or embryonic cells or somatic pluripotent stem cells have been transformed, they may be used to produce transgenic animals. After transformation, the cells are plated onto a feeder layer in an appropriate medium. Cells containing the construct may be selected by employing a selective medium. After sufficient time for colonies to grow, they are picked and analyzed for the occurrence of homologous recombination or integration of the construct. Those colonies that are positive may then be used for embryo transfection and blastocyst injection. Blastocysts are obtained from 4 to 6 week old superovulated females. The ES cells are trypsinized, and the modified cells are injected into the blastocoel of the blastocyst. After injection, the blastocysts are returned to each uterine horn of pseudopregnant females. Females are then allowed to go to term and the resulting offspring are screened for the construct. By providing for a different phenotype of the blastocyst and the genetically modified cells, chimeric progeny can be readily detected. The chimeric animals are screened for the presence of the modified gene and males and females having the modification are mated to produce homozygous progeny. If the gene alterations cause lethality at some point in development, tissues or organs can be maintained as allogenic or congenic grafts or transplants, or in vitro culture. The transgenic animals may be any non-human mammal, such as laboratory animals, domestic animals, etc. The transgenic animals may be used in functional studies, drug screening, etc.

[0115] Finally, the invention also relates to a kit comprising at least one of

[0116] (a) a nucleic acid molecule or a functional fragment thereof;

[0117] (b) an amino acid molecule or a functional fragment or an isoform thereof;

[0118] (c) a vector comprising the nucleic acid of (a);

[0119] (d) a host cell comprising the nucleic acid of (a) or the vector of (b);

[0120] (e) a polypeptide encoded by the nucleic acid of (a);

[0121] (f) a fusion polypeptide encoded by the nucleic acid of (a);

[0122] (g) an antibody, an aptamer or another receptor against the nucleic acid of (a) or the polypeptide of (d) or (e) and

[0123] (h) an anti-sense oligonucleotide of the nucleic acid of (a).

[0124] The kit may be used for diagnostic or therapeutic purposes or for screening applications as described above. The kit may further contain user instructions.

BRIEF DESCRIPTION OF THE FIGURES

[0125] FIG. 1: In situ hybridization results for the DP119 protein.

[0126] FIG. 1A shows whole-mount in situ hybridizations on chick embryos (day 5). dpb=dorsal pancreatic bud, vpb=ventral pancreatic bud, st=stomach, nt=neural tube; FIG. 1B shows in situ hybridizations on developing pancreatic tissue sections (DP293 positive cells are shown in blue colour; insulin is stained in brown). Expression can be seen in islets (is) and some cells of the pancreatic epithelium and duct cells (du). FIG. 1C shows a cross-section through the dorsal part of a day 5 chicken embryo stained for DP119 expression by in situ hybridization. Staining is evident in scattered neural tube (nt) cells and in ganglionic cells surrounding the neural tube.

[0127] FIG. 1B shows the expression of the human DP119. Shown is the quantitative analysis of DP119 expression in human abdominal adipocyte cells, during the differentiation from preadipocytes to mature adipocytes.

[0128] FIG. 2: DP119 sequences.

[0129] FIG. 2A: Nucleic acid sequence (SEQ ID NO:1) containing the 3' of a chicken gene homologous to human DKFZp586L151. Underlined is the 3' untranslated region; the stop codon is shown in bold.

[0130] FIG. 2B: protein sequence (SEQ ID NO:2) encoded by the coding sequence shown in FIG. 2A.

[0131] FIG. 2C: Nucleic acid sequence (SEQ ID NO:3) encoding the human homolog protein, (GenBank Accession Number AL050137.1).

[0132] FIG. 2D: protein sequence (SEQ ID NO:4) encoded by the coding sequence shown in FIG. 2C (GenBank Accession Number CAB43286.1).

[0133] FIG. 2E: Nucleic acid sequence (SEQ ID NO:5) encoding the mouse homolog protein, (GenBank Accession Number BC025654.1).

[0134] FIG. 2F: protein sequence (SEQ ID NO:6) encoded by the coding sequence shown in FIG. 2E (GenBank Accession Number AAH25654.1).

[0135] FIG. 2G: Alignment of DP119 from different species (Mm, mouse; Hs, Homo sapiens; Dr, Danio rerio; Gg, chicken)

[0136] FIG. 3: Expression of DP444.

[0137] FIG. 3A: Whole mount in situ hybridization using a day 3.5 chicken embryo and a DP444 probe. Expression is seen along the neural tube (nt) and in somites, the developing intestine (in) and in branchial arches.

[0138] FIG. 3B: Whole mount in situ hybridization using a day 4 chicken embryo and a DP444 probe. Expression is seen along the neural tube (nt) and in somites, the developing intestine (in) and in the dorsal hindbrain (hb).

[0139] FIG. 3C: Whole mount in situ hybridization using a day 5 chicken embryo and a DP444 probe. Expression domains in the stomach (st) and the pancreatic buds (dpb, vpb) are indicated.

[0140] FIG. 3D: Double labelling on a section through developing pancreas (chicken day 5). Insulin is stained brown, DP444 expression is stained purple. Expression of DP444 can be seen in islets (is) strongly overlapping with insulin expression.

[0141] FIG. 3E: Loss of DP444 function leads to islet defects in zebrafish. FIG. 3Ea shows a 24 h old embryo injected with control antisense oligo, FIG. 3Eb shows a 24 h old fish embryo injected with antisense oligo blocking the translation of DP444. Insulin expression is stained purple.

[0142] FIG. 4: DP444 sequences.

[0143] FIG. 4A: Nucleic acid sequence (SEQ ID NO:7). The stop codon is in bold and the 3'UTR is underlined.

[0144] FIG. 4B: Amino acid sequence of DP444 (SEQ ID NO:8).

[0145] FIG. 4C: Nucleic acid sequence of the human homolog QV2-NN2006-230401-628-d06 NN2006, SEQ ID NO:9 (GenBank Accession Number BI035296).

[0146] FIG. 4D: Amino acid sequence of the human homolog of DP444 (SEQ ID NO:10)(Translation of SEQ ID NO:9).

[0147] FIG. 4E: Nucleic acid sequence of GenBank Accession Number BF951817 (QV1-NN0228-091100-436-g05 NN0228 Homo sapiens, SEQ ID NO:11).

[0148] FIG. 4F: Nucleic acid sequence of GenBank Accession Number AI214480.1; (qg69c12.x1 Soares_NFL_T_GBC_S1 Homo sapiens, SEQ ID NO:12).

[0149] FIG. 4G: GenBank Accession Number Hs2_5191_28_4_1 predicted mRNA, (SEQ ID NO:13).

[0150] FIG. 4H: GenBank Accession Number Hs2_5191_28_4_1 predicted protein, (SEQ ID NO:14).

[0151] FIG. 4I: GenBank Accession Number Hs2_5191_28_4_3 predicted mRNA, (SEQ ID NO:15).

[0152] FIG. 4J: GenBank Accession Number Hs2_5191_28_4_3 predicted protein, (SEQ ID NO:16).

[0153] FIG. 4K: Alignment of DP444 from different species (Dr, zebrafish; Mm, mouse; Hs, Homo sapiens; Gg, chicken)

[0154] FIG. 5: In situ hybridization results for the DP810 protein.

[0155] FIG. 5A and FIG. 5B show whole-mount in situ hybridizations on chick embryos (day 5). li=liver, ht=heart, dpb=dorsal pancreatic bud; FIG. 5C and FIG. 5D show in situ hybridizations on sections through developing pancreas (5-day-old chicken). pe=pancreatic epithelium, is=islet, pm=pancreatic mesenchyme.

[0156] FIG. 6: DP810 sequences.

[0157] FIG. 6A: DP810-protein. The 3' untranslated region is underlined and the stop codon is in bold font. (SEQ ID NO: 17)

[0158] FIG. 6B: protein sequence (SEQ ID NO: 18) encoded by the coding sequence shown in FIG. 6A.

[0159] FIG. 6C: Nucleic acid sequence (SEQ ID NO:19) encoding the human homolog DP810-protein, (GenBank Accession Number NM_02400.1; polydom).

[0160] FIG. 6D: protein sequence (SEQ ID NO:20) encoded by the coding sequence shown in FIG. 6C (GenBank Accession Number NP_078776.1).

[0161] FIG. 7: Expression of DP685 protein.

[0162] FIG. 7A and FIG. 7B show whole-mount in situ hybridizations on chick embryos (A: day 4; B: day 5). In FIG. 7A, expression is seen along the dorsal neural tube (nt), in the dorsal forebrain (fb) and hindbrain (hb), in branchial arches (ba) and the anterior part of the developing hindlimb (ahl). A strong signal is also seen in the region of the developing stomach (st). In FIG. 7B, expression is seen in the developing stomach (st) and in the dorsal pancreatic bud (dpb).

[0163] FIG. 7C shows the expression of the human DP685. Shown is the quantitative analysis of DP685 expression in human abdominal adipocyte cells, during the differentiation from preadipocytes to mature adipocytes.

[0164] FIG. 8: DP685 sequences.

[0165] FIG. 8A: Nucleic acid sequence (SEQ ID NO:21) encoding the chicken DP685 protein.

[0166] FIG. 8B: Protein sequence (SEQ ID NO: 22) encoded by the coding sequence shown in FIG. 8A.

[0167] FIG. 8C: Nucleic acid sequence (SEQ ID NO:23) encoding the human homolog DP685 protein (autotaxin).

[0168] FIG. 8D: protein sequence (SEQ ID NO:24) encoded by the coding sequence shown in FIG. 8C.

[0169] FIG. 8E: Nucleic acid sequence (SEQ ID NO:25) encoding the mouse homolog DP685 protein.

[0170] FIG. 8F: Protein sequence (SEQ ID NO:26) encoded by the coding sequence shown in FIG. 8E.

[0171] FIG. 9: In situ hybridization results for the WE474 protein.

[0172] FIG. 9A shows whole-mount in situ hybridizations on chick embryos (day 5). in=intestine, li=liver anlage.

[0173] FIG. 10: WE474 sequences.

[0174] FIG. 10A: Nucleic acid sequence (SEQ ID NO:27) consisting of the 3' untranslated region of chicken collectin.

[0175] FIG. 10B: protein sequence (SEQ ID NO:28) encoded by the coding sequence shown in FIG. 10A.

[0176] FIG. 10C: Nucleic acid sequence (SEQ ID NO:29) encoding the human homolog collectin COLEC10-protein, (GenBank Accession Number NM_006438.2).

[0177] FIG. 10D: protein sequence (SEQ ID NO:30) encoded by the coding sequence shown in FIG. 10C (GenBank Accession Number NP_006429.1).

[0178] FIG. 11: In situ hybridization results for the DP160 protein.

[0179] FIG. 11A shows whole-mount in situ hybridizations on chick embryos (day 5). DP160 is expressed along the neural tube (nt), in the mesonephros (mn) and in the developing gastrointestinal tract (stomach: st; dorsal and ventral pancreatic buds: dpb, vpb).

[0180] FIG. 11B shows a double labelling on a section through developing pancreas (day 5). Insulin is stained in brown, DP160 expression is stained purple. Expression can be seen in islets (is) and in cells of the pancreatic epithelium.

[0181] FIG. 12: DP160 sequences.

[0182] FIG. 12A: Nucleic acid sequence (SEQ ID NO:31)

[0183] FIG. 12B: protein sequence (SEQ ID NO:32) encoded by the coding sequence shown in FIG. 12A.

[0184] FIG. 12C: Nucleic acid sequence (SEQ ID NO:33) encoding the human homolog protein.

[0185] FIG. 12D: protein sequence (SEQ ID NO:34) encoded by the coding sequence shown in FIG. 12C.

[0186] FIG. 13: Expression of RA977.

[0187] FIG. 13A and FIG. 13B: Whole mount in situ hybridization using a day 5 chicken embryo and a RA977 probe. Expression of RA977 is observed in the dorsal pancreatic bud (dpb). The strong signal seen in the stomach (st) is due to nonspecific probe trapping. Same embryo is shown at two different magnifications.

[0188] FIG. 14: RA977 sequences.

[0189] FIG. 14A: Nucleic acid sequence (SEQ ID NO: 35) of RA977. Stop and start codons are in bold and the UTRs are underlined.

[0190] FIG. 14B: Amino acid sequence of RA977 (SEQ ID NO:36).

[0191] FIG. 14C: Nucleic acid sequence of Homo sapiens epithelial membrane protein 2 (EMP2), mRNA (GenBank Accession Number XM_030218.1; SEQ ID NO: 37).

[0192] FIG. 14D: Amino acid sequence of EMP2_HUMAN Epithelial membrane protein-2 (EMP-2) (XMP protein)(GenBank Accession Number P54851; SEQ ID NO: 38).

[0193] FIG. 15: In situ hybridization results for the RA770 protein.

[0194] FIG. 15A shows whole-mount in situ hybridizations on chick embryos (day 5). dpb=dorsal pancreatic bud; vpb=ventral pancreatic bud; lu=lung, st=stomach region; dd=duodenum

[0195] FIG. 16: RA770 sequences.

[0196] FIG. 16A: Nucleic acid sequence (SEQ ID NO:39) encoding the chicken RA770-protein.

[0197] FIG. 16B: Protein sequence (SEQ ID NO: 40) encoded by the coding sequence shown in FIG. 16A.

[0198] FIG. 16C: Nucleic acid sequence (SEQ ID NO:42) encoding the human homolog RA770 protein (GenBank Accession Number NM_004558.1; Neurturin).

[0199] FIG. 16D: Protein sequence (SEQ ID NO:43) encoded by the coding sequence shown in FIG. 16C (GenBank Accession Number NP_004549.1).

[0200] FIG. 16E: Nucleic acid sequence (SEQ ID NO:44) encoding the mouse homolog RA770 protein (GenBank Accession Number NM_008738.1; Neurturin).

[0201] FIG. 16F: Protein sequence (SEQ ID NO:44) encoded by the coding sequence shown in FIG. 16E (GenBank Accession Number NP_032764.1).

[0202] FIG. 17 shows the structure of the mouse mDG770 transgenic construct. Shown is the rIP promoter (0.8 kb rat insulin II promoter) as a thin line, the mouse DG770 cDNA (mDG770) as white box, the hybrid-intron structure (hybrid-intron) as grey box and the polyadenylation signal (bgh-polyA) as black box.

[0203] FIG. 18 shows pancreatic islets of mDG770 transgenic mice with ectopic mDG770 expression. Taqman expression analysis on islet cDNA isolated from two wild type and two transgenic littermates using a mDG770 specific primer/probe pair. The data are presented as fold mDG770 induction relative to wild type mDG770 expression in islets.

[0204] FIG. 19 shows the growth curves of DG770 transgenic mice (rIP-mDG770) compared to wild type mice (wt) on high fat (HF) diet. Data are presented as mean bodyweight in g over time +/- standard deviation. DG770 transgenic mice have an increased body weight compared to wt mice on HF diet.

[0205] FIG. 20 shows the lean and fat body mass in mDG770 transgenic mice compared to wild type mice (wt) on HF diet. After 4 weeks on HF diet, lean and fat body mass of individual male mDG770 transgenic mice (dark grey bars, N=6) and male littermate controls (light grey bars, N=5) were measured using NMR analysis. The data are expressed as mean organ weight as % of bodyweight +/- standard deviation. mDG770 transgenic mice have an increased fat body mass compared to wt mice on HF diet.

[0206] FIG. 21 shows body length of mDG770 transgenic mice compared to wild type mice (wt) on HF diet. Body length of 4 weeks old male wild type mice (light grey bar, N=5) and mDG770 transgenic mice (dark grey bar, N=6). The data are expressed as mean body length in cm+/-standard deviation. mDG770 transgenic mice have a normal body length.

[0207] FIG. 22 shows the analysis of DG770 expression in mammalian (mouse) tissues.

[0208] FIG. 22A shows the real-time PCR analysis of DG770 expression in wild type mouse tissues (referred to as wt-mice) and in tissues of mice fed with a control diet (referred to as controldiet).

[0209] FIG. 22B shows the real-time PCR analysis of DG770 expression in fasted mice (referred to as fasted-mice) and genetically obese mice (referred to as ob/ob-mice) compared to wild-type mice, and in mice fed with a high fat diet (referred to as high fat diet) compared to mice fed with a control diet.

[0210] The examples below are provided to illustrate the subject invention and are not included for the purpose of limiting the invention.

EXAMPLES

Example 1

DPd6 Chick cDNA Library Construction

[0211] The chick DPd6 cDNA library was constructed from dorsal pancreatic buds dissected from 6-day-old chick embryos. The frozen tissue was homogenized and lysed using a Brinkmann POLYTRON homogenizer PT-3000 (Brinkmann Instruments, Westbury, N.J.) in guanidinium isothiocyanate solution. The lysates were centrifuged over a 5.7 M CsCl cushion using a Beckman SW28 rotor in a Beckman L8-70M ultracentrifuge (Beckman Instruments, Fullerton, Calif.) for 18 hours at 25,000 rpm at ambient temperature. The RNA was extracted with acid phenol pH 4.7, precipitated using 0.3 M sodium acetate and 2.5 volumes of ethanol, resuspended in RNase-free water, and DNase treated at 37° C. The RNA extraction was repeated with acid phenol pH 4.7 and precipitated with sodium acetate and ethanol as before. The mRNA was then isolated using the Micro-FastTrack 2.0 mRNA isolation kit (Invitrogen, Groningen, Netherlands) and used to construct the cDNA libraries. The mRNAs were handled according to the recommended protocols in the SUPERSCRIPT cDNA synthesis and plasmid cloning system (Gibco/BRL). Following transformation into DH10B host cells, single colonies were picked and then subjected to PCR in order to amplify the cloned cDNA insert. Amplified PCR fragments representing single cDNA inserts were subsequently transcribed in vitro to generate digoxigenin-labelled RNA probes (Roche). The RNA probes were used in a whole-mount in situ screen to determine the expression of their respective gene products in early chick embryos. Plasmids containing the genes encoding the proteins of the invention were identified because of their high expression in pancreatic tissues.

Example 2

In Situ Hybridizations

[0212] Whole-mount in situ hybridizations were performed according to standard protocols as known to those skilled in the art and as described previously (for example, Pelton, R. W. et al., (1990) Development 110, 609-620; Belo, J. A. et al., (1997) Mech. Dev. 68, 45-57).

Example 3

Isolation and Sequencing of cDNA Clones

[0213] Plasmid DNA was released from the cells and purified using the REAL PREP 96-well plasmid isolation kit (QIAGEN). This kit enabled the simultaneous purification of 96 samples in a 96-well block using multi-channel reagent dispensers. The protocol recommended by the manufacturer was employed except for the following changes: (i) the bacteria were cultured in 1 ml of sterile Terrific Broth (LIFE TECHNOLOGIES™, Gaithersburg, Md., USA) with carbenicillin at 25 mg/L and glycerol at 0.4%; (ii) after inoculation, the cultures were incubated for 19 hours and, at the end of incubation, the cells were lysed with 0.3 ml of lysis buffer; and (iii) following isopropanol precipitation, the plasmid DNA pellet was resuspended in 0.1 ml of distilled water. After the last step in the protocol, samples were transferred to a 96-well block for storage at 4° C. The cDNAs were sequenced by GATC Biotech AG (Konstanz, Germany) according to standard protocols known to those skilled in the art.

Example 4

Homology Searching of cDNA Clones and Their Deduced Proteins

[0214] After the reading frame was determined, the nucleotide sequences of the invention as well as the amino acid sequences deduced from them were used as query sequences against databases such as GenBank, SwissProt, BLOCKS, and Pima II. These databases, which contain previously identified and annotated sequences, were searched for regions of homology (similarity) using BLAST, which stands for Basic Local Alignment Search Tool (Altschul S. F. (1993) J. Mol. Evol. 36:290-300; Altschul, S. F. et al. (1990) J. Mol. Biol. 215:403-10). BLAST produced alignments of both nucleotide and amino acid sequences to determine sequence similarity. Because of the local nature of the alignments, BLAST was especially useful in determining exact matches or in identifying homologs which may be of prokaryotic (bacterial) or eukaryotic (animal, fungal, or plant) origin. Other algorithms such as the one described in Smith et al. (1992, Protein Engineering 5:35-51), incorporated herein by reference, could have been used when dealing with primary sequence patterns and secondary structure gap penalties. The BLAST approach, as detailed in Karlin et al. (supra) and incorporated herein by reference, searched for matches between a query sequence and a database sequence. BLAST evaluated the statistical significance of any matches found, and reported only those matches that satisfied the user-selected threshold of significance. In this application, the threshold was set at 10⁻²⁵ for nucleotides and 10⁻¹⁴ for peptides. Nucleotide sequences were searched against the GenBank databases for primate, rodent, and other mammalian sequences; and deduced amino acid sequences from the same clones were then searched against GenBank functional protein databases (mammalian, vertebrate, and eukaryote) for homology.
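
The significance filtering described in this example can be sketched as follows. This is a minimal illustration only, reading the stated thresholds as 1e-25 for nucleotide queries and 1e-14 for peptide queries; the `Hit` record, its field names, and the example values are hypothetical and are not output of any search described in this application.

```python
from dataclasses import dataclass

# Thresholds as stated in the text: 10^-25 (nucleotides), 10^-14 (peptides).
THRESHOLDS = {"nucleotide": 1e-25, "peptide": 1e-14}


@dataclass
class Hit:
    """One database match; the field names are hypothetical."""
    subject_id: str
    significance: float  # probability-style value reported for the match


def significant_hits(hits, query_type):
    """Keep only hits whose significance value is at or below the cutoff."""
    cutoff = THRESHOLDS[query_type]
    return [h for h in hits if h.significance <= cutoff]


# Made-up values for illustration, not results from the application.
example = [Hit("hit_A", 3e-40), Hit("hit_B", 2e-20), Hit("hit_C", 5e-10)]
print([h.subject_id for h in significant_hits(example, "nucleotide")])
print([h.subject_id for h in significant_hits(example, "peptide")])
```

With these made-up values, only the strongest match passes the nucleotide cutoff, while the peptide cutoff also admits the second match.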

Example 5

Extension of Polynucleotides to Full Length or to Recover Regulatory Sequences

[0215] Full length nucleic acid sequences encoding the proteins of the invention are used to design oligonucleotide primers for extending a partial nucleotide sequence to full length or for obtaining 5' or 3', intron or other control sequences from genomic libraries. One primer is synthesized to initiate extension in the antisense direction and the other is synthesized to extend sequence in the sense direction. Primers are used to facilitate the extension of the known sequence "outward", generating amplicons containing new, unknown nucleotide sequence for the region of interest. The initial primers are designed from the cDNA using OLIGO 4.06 primer analysis software (National Biosciences), or another appropriate program, to be 22-30 nucleotides in length, to have a GC content of 50% or more, and to anneal to the target sequence at temperatures of about 68° C.-72° C. Any stretch of nucleotides which would result in hairpin structures or primer dimerization is avoided. The original, selected cDNA libraries, or a human genomic library, are used to extend the sequence; the latter is most useful to obtain 5' upstream regions. If more extension is necessary or desired, additional sets of primers are designed to further extend the known region. By following the instructions for the XL-PCR kit (Perkin Elmer) and thoroughly mixing the enzyme and reaction mix, high fidelity amplification is obtained. Beginning with 40 pmol of each primer and the recommended concentrations of all other components of the kit, PCR is performed using the Peltier thermal cycler (PTC200; M. J. Research, Watertown, Mass.) and the following parameters:

[0216] Step 1 94° C. for 1 min (initial denaturation)

[0217] Step 2 65° C. for 1 min

[0218] Step 3 68° C. for 6 min

[0219] Step 4 94° C. for 15 sec

[0220] Step 5 65° C. for 1 min

[0221] Step 6 68° C. for 7 min

[0222] Step 7 Repeat steps 4-6 for 15 additional cycles

[0223] Step 8 94° C. for 15 sec

[0224] Step 9 65° C. for 1 min

[0225] Step 10 68° C. for 7-15 min

[0226] Step 11 Repeat steps 8-10 for 12 cycles

[0227] Step 12 72° C. for 8 min

[0228] Step 13 4° C. (and holding)
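
The two numeric primer design criteria stated in paragraph [0215] (22-30 nucleotides in length, GC content of 50% or more) can be checked with a short routine such as the one below. This is a sketch only; the function names and the example sequences are hypothetical, and the annealing temperature window and the avoidance of hairpin-forming stretches, which OLIGO 4.06 or a comparable program would evaluate, are deliberately not modelled here.

```python
def gc_fraction(primer: str) -> float:
    """Fraction of G and C bases in a primer (spaces are ignored)."""
    seq = primer.replace(" ", "").upper()
    if not seq:
        raise ValueError("empty primer sequence")
    return sum(base in "GC" for base in seq) / len(seq)


def meets_design_criteria(primer: str, min_len: int = 22,
                          max_len: int = 30, min_gc: float = 0.50) -> bool:
    """Check only the length and GC-content criteria from the text."""
    seq = primer.replace(" ", "").upper()
    return min_len <= len(seq) <= max_len and gc_fraction(seq) >= min_gc


# Hypothetical sequences for illustration; not primers from this application.
print(meets_design_criteria("GCCTGGTCACCGATGCAGGTCTCA"))  # 24 nt, 15 of 24 G/C
print(meets_design_criteria("ATGCATGCATGC"))              # too short
```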

[0229] A 5-10 µl aliquot of the reaction mixture is analyzed by electrophoresis on a low concentration (about 0.6-0.8%) agarose mini-gel to determine which reactions were successful in extending the sequence. Bands thought to contain the largest products are selected and removed from the gel. Further purification involves using a commercial gel extraction method such as the QIAQUICK DNA purification kit (QIAGEN). After recovery of the DNA, Klenow enzyme is used to trim single-stranded nucleotide overhangs, creating blunt ends which facilitate religation and cloning. After ethanol precipitation, the products are redissolved in 13 µl of ligation buffer; 1 µl T4-DNA ligase (15 units) and 1 µl T4 polynucleotide kinase are added, and the mixture is incubated at room temperature for 2-3 hours or overnight at 16° C. Competent E. coli cells (in 40 µl of appropriate media) are transformed with 3 µl of ligation mixture and cultured in 80 µl of SOC medium (Sambrook et al., supra). After incubation for one hour at 37° C., the whole transformation mixture is plated on Luria Bertani (LB)-agar (Sambrook et al., supra) containing 2× Carb. The following day, several colonies are randomly picked from each plate and cultured in 150 µl of liquid LB/2× Carb medium placed in an individual well of an appropriate, commercially available, sterile 96-well microtiter plate. The following day, 5 µl of each overnight culture is transferred into a non-sterile 96-well plate and, after dilution 1:10 with water, 5 µl of each sample is transferred into a PCR array. For PCR amplification, 18 µl of concentrated PCR reaction mix (3.3×) containing 4 units of rTth DNA polymerase, a vector primer, and one or both of the gene specific primers used for the extension reaction is added to each well. Amplification is performed using the following conditions:

[0230] Step 1 94° C. for 60 sec

[0231] Step 2 94° C. for 20 sec

[0232] Step 3 55° C. for 30 sec

[0233] Step 4 72° C. for 90 sec

[0234] Step 5 Repeat steps 2-4 for an additional 29 cycles

[0235] Step 6 72° C. for 180 sec

[0236] Step 7 4° C. (and holding)
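
The amplification program of paragraphs [0230] to [0236] can be written out as data and expanded into its individual temperature steps, which makes the total cycle count and the programmed hold time explicit. This is a sketch only; the variable and function names are hypothetical, ramp times between temperatures are ignored, and the final 4° C. hold is open-ended and therefore excluded.

```python
# (temperature in deg C, duration in seconds), taken from the listed steps.
INITIAL = [(94, 60)]                      # Step 1
CYCLE = [(94, 20), (55, 30), (72, 90)]    # Steps 2-4
ADDITIONAL_REPEATS = 29                   # Step 5: repeats after the first pass
FINAL = [(72, 180)]                       # Step 6


def expand_program():
    """Return the full list of timed steps, excluding the final 4 deg C hold."""
    total_cycles = 1 + ADDITIONAL_REPEATS  # first pass plus 29 repeats = 30
    return INITIAL + CYCLE * total_cycles + FINAL


def total_seconds(steps):
    """Sum of programmed hold times, ignoring ramping between temperatures."""
    return sum(seconds for _, seconds in steps)


steps = expand_program()
print(len(steps), "timed steps,", total_seconds(steps) / 60, "minutes")
```

Under these assumptions the program comprises 30 cycles and 74 minutes of programmed hold time.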

[0237] Aliquots of the PCR reactions are run on agarose gels together with molecular weight markers. The sizes of the PCR products are compared to the original partial cDNAs, and appropriate clones are selected, ligated into plasmid, and sequenced.

Example 6

Labeling and Use of Hybridization Probes

[0238] Hybridization probes derived from nucleic acids described in this invention are employed to screen cDNAs, genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base-pairs, is specifically described, essentially the same procedure is used with larger cDNA fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06 primer analysis software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of [γ-³²P] adenosine triphosphate (Amersham), and T4 polynucleotide kinase (DuPont NEN®, Boston, Mass.). The labelled oligonucleotides are substantially purified with a SEPHADEX G-25 superfine resin column (Pharmacia & Upjohn). A portion containing 10⁷ counts per minute of each of the sense and antisense oligonucleotides is used in a typical membrane-based hybridization analysis of human genomic DNA digested with one of the following endonucleases (Ase I, Bgl II, EcoRI, Pst I, Xba I, or Pvu II; DuPont NEN®). The DNA from each digest is fractionated on a 0.7 percent agarose gel and transferred to nylon membranes (NYTRAN PLUS membrane, Schleicher & Schuell, Durham, N.H.). Hybridization is carried out for 16 hours at 40° C. To remove nonspecific signals, blots are sequentially washed at room temperature under increasingly stringent conditions up to 0.1× saline sodium citrate (SSC) and 0.5% sodium dodecyl sulfate. After XOMAT AR autoradiography film (Kodak, Rochester, N.Y.) is exposed to the blots, or the blots are placed in a PHOSPHOIMAGER (Molecular Dynamics, Sunnyvale, Calif.) for several hours, hybridization patterns are compared visually.

Example 7

Antisense Molecules

[0239] Antisense molecules to the sequences encoding proteins of the invention, or any part thereof, are used to inhibit in vivo or in vitro expression of the naturally occurring proteins of the invention. Although use of antisense oligonucleotides, comprising about 20 base-pairs, is specifically described, essentially the same procedure is used with larger cDNA fragments. An oligonucleotide is used to inhibit expression of naturally occurring proteins of the invention. Antisense oligonucleotides can inhibit gene function in multiple ways. They can bind to the 5' UTR of a transcript and block translation. Alternatively, binding of the antisense oligonucleotide can induce cleavage of the transcript by RNase H. Antisense oligonucleotides have also been shown to block splicing of a pre-mRNA, thereby either blocking formation of specific splice forms or leading to the accumulation of unspliced messages which cannot give rise to mature protein, are unstable, or both. The mechanism of action of a particular antisense oligonucleotide is determined by the chemical composition of the oligonucleotide and/or by the binding site within the targeted transcript.

[0240] Antisense oligonucleotides can be applied to tissue culture cells, used in animals, or used therapeutically in humans. Injection into early zebrafish or Xenopus embryos allows convenient analysis of gene function in these species.
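
As an illustration of how the sequence of an antisense oligonucleotide of about 20 bases follows from a target sequence, the sketch below takes a 20-nucleotide window and returns its reverse complement. The function names are hypothetical, and the input used here is simply the first 20 nucleotides of SEQ ID NO:1 as printed in the sequence listing; the output is a worked example of the base-pairing rule, not an oligonucleotide that this application describes as designed or tested.

```python
# Watson-Crick complement table for the DNA alphabet (either case).
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse, so the result reads 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def antisense_oligo(target: str, start: int, length: int = 20) -> str:
    """Antisense sequence against target[start:start + length]."""
    region = target[start:start + length]
    if len(region) < length:
        raise ValueError("target window runs past the end of the sequence")
    return reverse_complement(region)


# First 20 nt of SEQ ID NO:1 as printed in the sequence listing.
sense_window = "acgtcgtctacaacgggtcc"
print(antisense_oligo(sense_window, 0))
```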

Example 8

Expression of the Proteins of the Invention

[0241] Expression of the proteins of the invention and of homologous proteins is accomplished by subcloning the cDNAs into appropriate vectors and transforming the vectors into host cells. In this case, the cloning vector, PSPORT 1, previously used for the generation of the cDNA library, is used to express the proteins of the invention in E. coli. Upstream of the cloning site, this vector contains a promoter for β-galactosidase, followed by sequence containing the amino-terminal Met and the subsequent seven residues of β-galactosidase. Immediately following these eight residues is a bacteriophage promoter useful for transcription and a linker containing a number of unique restriction sites. Induction of an isolated, transformed bacterial strain with IPTG using standard methods produces a fusion protein which consists of the first eight residues of β-galactosidase, about 5 to 15 residues of linker, and the full length protein. The signal residues direct the secretion of the proteins of the invention into the bacterial growth media, which can be used directly in the following assay for activity.

Example 9

Production of Antibodies Specific for the Proteins of the Invention

[0242] The proteins of the invention, substantially purified using PAGE electrophoresis (Sambrook, supra) or other purification techniques, are used to immunize rabbits and to produce antibodies using standard protocols. The amino acid sequences are analyzed using DNASTAR software (DNASTAR Inc) to determine regions of high immunogenicity, and a corresponding oligopeptide is synthesized and used to raise antibodies by means known to those of skill in the art. Selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, is described by Ausubel et al. (supra), and others.

[0243] Typically, the oligopeptides are 15 residues in length, synthesized using an Applied Biosystems 431A peptide synthesizer using Fmoc-chemistry, and coupled to keyhole limpet hemocyanin (KLH, Sigma, St. Louis, Mo.) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS; Ausubel et al., supra). Rabbits are immunized with the oligopeptide-KLH complex in complete Freund's adjuvant. The resulting antisera are tested for antipeptide activity, for example, by binding the peptide to plastic, blocking with 1% BSA, reacting with rabbit antisera, washing, and reacting with radioiodinated goat anti-rabbit IgG.

[0244] The proteins of the invention or biologically active fragments thereof are labeled with ¹²⁵I Bolton-Hunter reagent (Bolton et al. (1973) Biochem. J. 133:529). Candidate molecules previously arrayed in the wells of a multi-well plate are incubated with the labeled proteins of the invention and washed, and any wells containing a complex with the labeled proteins of the invention are assayed. Data obtained using different concentrations of proteins of the invention are used to calculate values for the number, affinity, and association of proteins of the invention with the candidate molecules. All publications and patents mentioned in the above specification are herein incorporated by reference.

Example 10

Identification of Human Homologous Genes and Proteins

[0245] Homologous proteins and nucleic acid molecules coding therefor are obtainable from insect or vertebrate species, e.g. mammals or birds. Sequences homologous to the chicken proteins and nucleic acid molecules were identified by searching the non-redundant protein database of the National Center for Biotechnology Information (NCBI) with the publicly available program BLASTP 2.2.3 (see Altschul et al., 1997, Nucleic Acids Res. 25:3389-3402).

[0246] Chicken DP119 (SEQ ID NO: 2) showed 93% identities and 98% homologies to amino acids 251 to 432 of human CAB43286.1 (SEQ ID NO: 4; encoded by AL050137.1; SEQ ID NO: 3) and 93% identities and 97% homologies to amino acids 565 to 746 of mouse AAH25654.1 (SEQ ID NO: 5; encoded by BC025654.1; SEQ ID NO: 6). BLAST searches in the Derwent GenSeq Database using human CAB43286.1 or mouse AAH25654.1 as queries revealed the following entries: WO200153312-A1, whose claimed applications include diseases of the peripheral nervous system, immune system suppression, and others; WO200018922-A2, describing novel carbohydrate-associated proteins used for the prevention and treatment of autoimmune/inflammatory disorders and disorders of the gastrointestinal and reproductive systems; and WO200155320-A2, with uses in prevention and treatment of reproductive system disorders, including cancer.

[0247] Chicken DP444 (SEQ ID: 8 encoded by SEQ ID: 7) showed 93% identity and 97% homology to the polypeptide encoded by human BI035296 (SEQ ID: 9, FIG. 4C); 91% identity and 94% homology to the polypeptide encoded by human BF951817 (SEQ ID: 11, FIG. 4E); and 92% identity and 95% homology to the polypeptide encoded by human A1214480.1 (SEQ ID: 12, FIG. 4F). Search of the Derwent GenSeq database revealed no matches.

[0248] Chicken DP810 (SEQ ID NO: 17, see FIG. 6) encodes a polypeptide (SEQ ID NO: 18) showing 55% identities and 66% homologies to amino acids 3082 to 3566 of mouse polydom protein (NP_073725.1). Homology is especially high for amino acids 3346 to 3566 of mouse polydom (84% identities, 94% homology). The partial version of the human homolog of polydom is encoded by NP_078776.1 (SEQ ID NO: 19 and SEQ ID NO: 20). Search of the Derwent GenSeq database revealed no match.

[0249] Chicken DP685 (SEQ ID NO:22, see FIG. 8) showed 85% identities and 92% homologies between amino acids 1 to 735 and amino acids 125 to 863 of human autotaxin-t (SEQ ID NO:24). BLAST searches in the Derwent GenSeq Database using human autotaxin-t (GenBank Accession Numbers AAB00855.1 and L46720.1) as query identified Accession Number AAR86596, in patent application WO 95/32221 describing an autotaxin motility stimulating protein, used in cancer diagnosis and therapy.

[0250] Chicken WE474 (SEQ ID NO: 27 encoding SEQ ID NO: 28, see FIG. 10) showed 69% identities and 81% homologies to human collectin sub-family member 10 (C-type lectin), Accession Numbers NM_006438.2 (nucleotide) and NP_006429.1 (amino acids), SEQ ID NOs: 29 and 30, respectively. Search of the Derwent GenSeq database using human NP_006429.1 found patent applications WO9946281-A2, targeting blood coagulation disorders, cancers and cellular adhesion disorders, and WO200168848-A2, targeting applications in the diagnosis of a wide range of tumours.

[0251] Chicken DP160 (SEQ ID NO:32, see FIG. 12) showed 78% identities and 85% homologies between amino acids 3 to 140 and amino acids 386 to 799 of human CCR4 carbon catabolite repression 4-like (CCRN4L) (GenBank Accession Number XM_003343.2) and to amino acids 386 to 799 of human CCR4 carbon catabolite repression 4-like (CCRN4L) (GenBank Accession Number NM_912118.1). BLAST searches in the Derwent GenSeq Database using human CCR4 carbon catabolite repression 4-like (CCRN4L) (GenBank Accession Numbers XP_003343.3 and XM_003343.2) as query identified Accession Number AAZ15795, describing human gene expression product cDNA sequence SEQ ID NO:3264 in patent application WO9938972-A2, used in cancer therapy.

[0252] Chicken RA977 (SEQ ID NO: 35; encoded protein SEQ ID NO: 36, see FIG. 14) showed 70% identities and 83% homology to human EMP-2 (XM_030218.1; SEQ ID NO: 37 for nucleotide; P54851; SEQ ID NO: 38 for protein sequence). Search of the Derwent GenSeq database revealed matches to patent applications WO200194629-A2, claiming applications for cancer diagnostics, and WO200229086-A2, claiming applications for cancer diagnostics and therapy.

[0253] Chicken RA770 (SEQ ID NO:40, see FIG. 16) showed 67% identities and 87% homologies between amino acids 5 to 94 and the C-terminal amino acids 108 to 197 of human neurturin precursor (SEQ ID NO:42). Chicken RA770 (SEQ ID NO:2) showed 64% identities and 84% homologies between amino acids 5 to 94 and the C-terminal amino acids 106 to 195 of mouse neurturin precursor (SEQ ID NO:44). BLAST searches in the Derwent GenSeq Database using human neurturin precursor (GenBank Accession Numbers NP_004549.1 and NM_004558.1) as query identified Accession Number AAY16637, disclosed as SEQ ID NO:7 in patent application WO 99/14235, describing a new isolated persephin growth factor used to promote neuronal growth. The persephin GF polypeptides or polynucleotides can be used for preventing or treating cellular degeneration or insufficiency, and can also be used for treating, e.g., peripheral nerve trauma or injury, exposure to neurotoxins, metabolic diseases such as diabetes or renal dysfunctions, and damage caused by infectious agents. In addition, patent application WO 97/08196 describes Accession Number AAW13716, encoding human pre-pro-neurturin, as the novel growth factor neurturin used to treat neuro-degenerative and haematopoietic cell degeneration diseases. The same protein was also disclosed in WO9906064-A1 as a new neurturin neurotrophic factor protein product useful for treating sensorineural hearing loss as well as lesions and disturbances to the vestibular apparatus.
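
The "identities" figures quoted in this example are percentages of aligned positions at which both sequences carry the same residue. A minimal sketch of that calculation is given below, using as input the first ten residues of chicken DP119 (SEQ ID NO: 2) and residues 251 to 260 of the human sequence (SEQ ID NO: 4) as printed in the sequence listing, written in one-letter code. The function name is hypothetical. The "homologies" figures additionally count conservative substitutions, which requires a substitution matrix and is not computed here.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical residues.

    Both inputs must already be aligned to equal length; '-' marks a gap
    and never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(a == b and a != "-" for a, b in zip(aligned_a, aligned_b))
    return 100.0 * matches / len(aligned_a)


# Ten-residue windows taken from the sequence listing (one-letter code).
chicken_dp119 = "VVYNGSFYYN"   # SEQ ID NO: 2, residues 1-10
human_homolog = "VVYNGAFYYN"   # SEQ ID NO: 4, residues 251-260
print(percent_identity(chicken_dp119, human_homolog))
```

Over this short window nine of ten positions are identical; the 93% figure quoted in paragraph [0246] refers to the full 182-residue alignment.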

Example 11

Generation of a mDG770 Transgenic Construct

[0254] A complete mDG770 Open Reading Frame (ORF) was cloned under the control of the rat insulin promoter II (Lomedico et al., (1979) Cell 18: 545-558) using the Gateway system (Invitrogen). For the structure of the transgenic construct, see also FIG. 17.

Example 12

Generation of rIP-mDG770 Transgenic Mice

[0255] Transgenic construct DNA (see Example 11) was injected into C57/BL6 × CBA embryos (Harlan Winkelmann, Borchen, Germany) using standard techniques (see, for example, Brinster et al. (1985), Proc. Natl. Acad. Sci. USA 82: 4438-4442). The mDG770 transgene (see Example 11) was expressed under the control of the rat insulin promoter II (Lomedico et al., supra) using techniques known to those skilled in the art (for example, see Gunnig et al. (1987), Proc. Natl. Acad. Sci. USA 84, 4831-4835). Using this technique, several independent founder lines were generated.

Example 13

Genotype Analysis of rIP-mDG770 Transgenic Mice

[0256] Genotyping was performed by PCR using genomic DNA isolated from the tail tip. To detect the mDG770 transgene, a transgene specific forward primer (5' tgc tat ctg tct gga tgt gcc 3') and a mDG770 transgene specific reverse primer (5' aag gac acc tcg tcc tca tag 3') were used.

Example 14

mDG770 Expression Analysis Via TaqMan Analysis

[0257] The expression of the mDG770 transgene in islets was monitored by TaqMan analysis. For this analysis, 25 ng cDNA derived from pancreatic islet RNA isolated from transgenic mice and their littermates and a mDG770 specific primer/probe pair were used to detect endogenous as well as transgenic mDG770 expression (mDG770-1 forward primer: 5' GCC TAT GAG GAC GAG GTG TCC 3', mDG770 reverse primer: 5' AGC TCT TGC AGC GTG TGG T 3', mDG770 probe: 5' TCC TGG ACG TGC ACA GCC GC 3'). TaqMan analysis was performed using standard techniques known to those skilled in the art. Ectopic transgene expression was detected in 3 of 4 rIP-mDG770 transgenic founder lines analysed. The two founder lines showing highest transgene expression levels were used for further analysis.
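
Expression data of this kind are commonly reported as fold induction relative to wild type, as in FIG. 18. The sketch below shows one widely used way to obtain such a value from threshold cycle (Ct) readings, the comparative Ct calculation. It is an assumption that this calculation applies here: the application does not state which quantification method or reference gene was used, the function name is hypothetical, the Ct values are invented for illustration, and the formula presumes that the amplified product doubles with every cycle.

```python
def fold_induction(ct_target_tg: float, ct_ref_tg: float,
                   ct_target_wt: float, ct_ref_wt: float) -> float:
    """Comparative Ct estimate of target expression, transgenic vs wild type.

    Each sample's target Ct is first normalised to a reference gene Ct,
    then the transgenic sample is compared with the wild type sample.
    Assumes doubling of product per cycle (100% amplification efficiency).
    """
    delta_tg = ct_target_tg - ct_ref_tg
    delta_wt = ct_target_wt - ct_ref_wt
    return 2 ** -(delta_tg - delta_wt)


# Invented Ct values for illustration only; not data from this application.
print(fold_induction(ct_target_tg=22.0, ct_ref_tg=18.0,
                     ct_target_wt=27.0, ct_ref_wt=18.0))
```

With these invented readings the target crosses the threshold five cycles earlier in the transgenic sample, which corresponds to a 32-fold induction under the stated assumptions.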

Example 15

Bodyweight, Body Length and NMR Analysis in mDG770 Transgenic Mice

[0258] 3 to 6 mice were housed per cage. Growth curves were generated by measuring the bodyweight of individual mDG770 transgenic mice and their wild-type littermates on a weekly basis using a normal balance. The body length was measured from nose to anus by placing a ruler along the middle axis of the mouse. At selected time points the lean and fat body mass was measured using non-invasive NMR analysis: individual mice were placed into a Bruker Minispec NMR machine (Bruker, USA) and the lean and fat body mass was estimated.
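
The growth curves of FIG. 19 present weekly bodyweights as mean and standard deviation per group. A minimal sketch of that summary is given below. The function names and the weights are hypothetical, and the use of the sample standard deviation (n-1 denominator) is an assumption, since the application does not state which form was used.

```python
from statistics import mean, stdev


def summarize(weights_g):
    """Mean and sample standard deviation of one group's bodyweights (g)."""
    return mean(weights_g), stdev(weights_g)


def growth_curve(weights_by_week):
    """Map each week to (mean, standard deviation), in week order."""
    return {week: summarize(w) for week, w in sorted(weights_by_week.items())}


# Invented bodyweights in grams for illustration; not data from this application.
hypothetical = {1: [20.0, 22.0, 24.0], 2: [23.0, 25.0, 27.0]}
for week, (m, sd) in growth_curve(hypothetical).items():
    print(f"week {week}: {m:.1f} g +/- {sd:.1f}")
```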

[0259] Various modifications and variations of the described method and system of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention which are obvious to those skilled in molecular biology or related fields are intended to be within the scope of the following claims.

Sequence CWU 1

1

56 1 1350 DNA Gallus gallus 1 acgtcgtcta caacgggtcc ttctactaca accgggcctt cacccgcaac atcatcaaat 60 acgacctgaa gcagcggtac gtggccgcct gggccatgct gcacgacgtg gcctacgagg 120 agtccacccc gtggcgatgg cgcggccatt ccgatgtgga cttcgccgtg gacgagaacg 180 gcctgtgggt catttacccg gccatcagct acgagggctt caatcaggag gtgatcgtgc 240 tgagcaagct gaacgcagcc gacctcagca cccagaaaga gacgacgtgg aggacgggcc 300 tgcggaagaa cttctatggg aactgcttcg tcatctgcgg ggtcctgtac gcggtcgaca 360 gctacaacaa gaggaacgcc aacatctcct acgcctttga cacgcacacc aacactcaga 420 tcatcccccg gctgctcttt gagaatgagt acgcctacac cacgcagata gactataacc 480 ccaaggaccg cctgctctac gcttgggaca atggccacca ggtcacctac cacgtcatct 540 ttgcctactg agcgccccgg gatggggcac tgcgagcgag gggccaccag cacctttcat 600 tgttgttatt tttattatta ttattattat tattttgtac aaatcaaaga gtacgtgatg 660 ggtttttgtc tcaggctgtt tagatggcgg attgtagatc gatccccagg ccaggaccac 720 ccctttgtcc ccggtgtgac cttgcctctg tgctcgaggg cagtgcggcg gggcccgtgg 780 cagcagggct gctcctttgg ggggacgctg aggaggaggt ggccctgaca taaccctgct 840 gatgtttttt tagatgaaag ccatcagcgc ttaaccccag gcccagtgca aagctggcct 900 ttctgctgca ggcaccggct cctgtggcag gacggtggtg tccacccgtc cccgtggagg 960 ggtgcattgt cccctcgggg ggccaccctc ccacccgaca gtcagcgggt gcttgggaga 1020 tcctgctgta caacacgcac agccccggtg ctggcactta gctgaggact gtcccctctc 1080 cccctgactc tgccctttgc agcctgccct gggggctcca tctggcctgg gggggggctg 1140 tgggtgccgg gctgggtgct ggcagtggga ggggggcact gtaaatatgt gtagatgact 1200 tctgtttgtg cgttttgtaa ccaaaatagt ccccatttgg tatctgcctc gcggaggtcc 1260 cagcctccgt ccctccagcc tggcaccgcc ttgtatttac ccgctgttaa taataaaaga 1320 tcaagtacct ttgcaaaaaa aaaaaaaaaa 1350 2 182 PRT Gallus gallus 2 Val Val Tyr Asn Gly Ser Phe Tyr Tyr Asn Arg Ala Phe Thr Arg Asn 1 5 10 15 Ile Ile Lys Tyr Asp Leu Lys Gln Arg Tyr Val Ala Ala Trp Ala Met 20 25 30 Leu His Asp Val Ala Tyr Glu Glu Ser Thr Pro Trp Arg Trp Arg Gly 35 40 45 His Ser Asp Val Asp Phe Ala Val Asp Glu Asn Gly Leu Trp Val Ile 50 55 60 Tyr Pro Ala Ile Ser Tyr Glu Gly Phe Asn Gln Glu Val Ile Val Leu 65 
70 75 80 Ser Lys Leu Asn Ala Ala Asp Leu Ser Thr Gln Lys Glu Thr Thr Trp 85 90 95 Arg Thr Gly Leu Arg Lys Asn Phe Tyr Gly Asn Cys Phe Val Ile Cys 100 105 110 Gly Val Leu Tyr Ala Val Asp Ser Tyr Asn Lys Arg Asn Ala Asn Ile 115 120 125 Ser Tyr Ala Phe Asp Thr His Thr Asn Thr Gln Ile Ile Pro Arg Leu 130 135 140 Leu Phe Glu Asn Glu Tyr Ala Tyr Thr Thr Gln Ile Asp Tyr Asn Pro 145 150 155 160 Lys Asp Arg Leu Leu Tyr Ala Trp Asp Asn Gly His Gln Val Thr Tyr 165 170 175 His Val Ile Phe Ala Tyr 180 3 1801 DNA Homo sapiens 3 gtgagttttt cagcggtgac aatggagtgg atttgctgat tgaagatcag ctcctgagac 60 acaacggcct gatgaccagt gtcacccgga ggcctgcagc cacccgtcag ggacacagca 120 ctgctgtgac aagcgacctg aacgctcgga ccgcaccctg gtcctcagca ctgccacagc 180 cctcgacctc agatcccagc atcgccaacc atgcctcagt gggaccaaca ctccaaacaa 240 cctcggtgtc tccagatccc acaagggagt cagtcctgca gccttctcct caggtaccag 300 ccaccactgt ggcccacaca gccacccagc aaccagcagc cccagctcct ccggcagtgt 360 ctcccaggga ggcattgatg gaagctatgc acacagtccc agtgcctccc accacagtca 420 gaacagactc gctggggaaa gatgctcctg ctgggcgggg aacaacccct gccagcccca 480 cgctgagccc cgaagaagaa gatgacatcc ggaatgtcat aggaaggtgc aaggacactc 540 tctccacaat cacggggccg accacccaga acacatatgg gcggaatgaa ggggcctgga 600 tgaaggaccc cctggccaag gatgagcgga tttacgtaac caactattac tacggcaaca 660 ccctggtaga gttccggaac ctggagaact tcaaacaagg tcgctggagc aattcctaca 720 agctcccgta cagctggatc ggcacaggcc acgtggtata caatggcgcc ttctactaca 780 atcgcgcctt cacccgcaac atcatcaagt acgacctgaa gcagcgctac gtggctgcct 840 gggccatgct gcatgacgtg gcctacgagg aggccacccc ctggcgatgg cagggccact 900 cagacgtgga ctttgctgtg gacgagaatg gcctatggct catctacccg gccctggacg 960 atgagggctt cagccaggag gtcattgtcc tgagcaagct caatgccgcg gacctgagca 1020 cacagaagga gaccacatgg cgcacggggc tccggaggaa tttctacggc aactgcttcg 1080 tcatctgtgg ggtgctgtat gccgtggata gctacaacca gcggaatgcc aacatctcct 1140 acgctttcga cacccacacc aacacacaga tcgtccccag gctgctgttc gagaatgagt 1200 attcctatac gacccagata gactacaacc ccaaggaccg cctgctctat gcctgggaca 1260 
atggccacca ggtcacttac catgtcatct ttgcctactg acacccttgt ccccacaagc 1320 agaagcacag aggggtcact agcaccttgt gtgtatgtgt gtgcgtgcac gtgtgtgtag 1380 gtgggtatgt gttgtttaaa aatatatatt attttgtata atattgcaaa tgtaaaatga 1440 caatttgggt ctattttttt atatggattg tagatcaatc catacgtgta tgtgctggtc 1500 tcatcctccc cagtttatat ttttgtgcaa atgaacttct ccttttgacc agtaaccacc 1560 ttccttcaag ccttcagccc ctccagctcc aagtctcaga tctcgaccat tgaaaaggtt 1620 tcttcatctg ggtcttgcag gaggcaggca acaccaggag cagaaatgaa agaggcaaga 1680 aagaagtgct atgtggcgag aaaaaaagtt ttaatgtatt ggagaagttt taaaaaaccc 1740 agaaaaacgc tttttttttt ttaataaaga agaaatttaa aatcaaaaaa aaaaaaaaaa 1800 a 1801 4 432 PRT Homo sapiens 4 Glu Phe Phe Ser Gly Asp Asn Gly Val Asp Leu Leu Ile Glu Asp Gln 1 5 10 15 Leu Leu Arg His Asn Gly Leu Met Thr Ser Val Thr Arg Arg Pro Ala 20 25 30 Ala Thr Arg Gln Gly His Ser Thr Ala Val Thr Ser Asp Leu Asn Ala 35 40 45 Arg Thr Ala Pro Trp Ser Ser Ala Leu Pro Gln Pro Ser Thr Ser Asp 50 55 60 Pro Ser Ile Ala Asn His Ala Ser Val Gly Pro Thr Leu Gln Thr Thr 65 70 75 80 Ser Val Ser Pro Asp Pro Thr Arg Glu Ser Val Leu Gln Pro Ser Pro 85 90 95 Gln Val Pro Ala Thr Thr Val Ala His Thr Ala Thr Gln Gln Pro Ala 100 105 110 Ala Pro Ala Pro Pro Ala Val Ser Pro Arg Glu Ala Leu Met Glu Ala 115 120 125 Met His Thr Val Pro Val Pro Pro Thr Thr Val Arg Thr Asp Ser Leu 130 135 140 Gly Lys Asp Ala Pro Ala Gly Arg Gly Thr Thr Pro Ala Ser Pro Thr 145 150 155 160 Leu Ser Pro Glu Glu Glu Asp Asp Ile Arg Asn Val Ile Gly Arg Cys 165 170 175 Lys Asp Thr Leu Ser Thr Ile Thr Gly Pro Thr Thr Gln Asn Thr Tyr 180 185 190 Gly Arg Asn Glu Gly Ala Trp Met Lys Asp Pro Leu Ala Lys Asp Glu 195 200 205 Arg Ile Tyr Val Thr Asn Tyr Tyr Tyr Gly Asn Thr Leu Val Glu Phe 210 215 220 Arg Asn Leu Glu Asn Phe Lys Gln Gly Arg Trp Ser Asn Ser Tyr Lys 225 230 235 240 Leu Pro Tyr Ser Trp Ile Gly Thr Gly His Val Val Tyr Asn Gly Ala 245 250 255 Phe Tyr Tyr Asn Arg Ala Phe Thr Arg Asn Ile Ile Lys Tyr Asp Leu 260 265 270 Lys Gln Arg Tyr Val Ala Ala Trp Ala Met 
Leu His Asp Val Ala Tyr 275 280 285 Glu Glu Ala Thr Pro Trp Arg Trp Gln Gly His Ser Asp Val Asp Phe 290 295 300 Ala Val Asp Glu Asn Gly Leu Trp Leu Ile Tyr Pro Ala Leu Asp Asp 305 310 315 320 Glu Gly Phe Ser Gln Glu Val Ile Val Leu Ser Lys Leu Asn Ala Ala 325 330 335 Asp Leu Ser Thr Gln Lys Glu Thr Thr Trp Arg Thr Gly Leu Arg Arg 340 345 350 Asn Phe Tyr Gly Asn Cys Phe Val Ile Cys Gly Val Leu Tyr Ala Val 355 360 365 Asp Ser Tyr Asn Gln Arg Asn Ala Asn Ile Ser Tyr Ala Phe Asp Thr 370 375 380 His Thr Asn Thr Gln Ile Val Pro Arg Leu Leu Phe Glu Asn Glu Tyr 385 390 395 400 Ser Tyr Thr Thr Gln Ile Asp Tyr Asn Pro Lys Asp Arg Leu Leu Tyr 405 410 415 Ala Trp Asp Asn Gly His Gln Val Thr Tyr His Val Ile Phe Ala Tyr 420 425 430 5 2863 DNA Mus musculus 5 ccacgcgtcc gagtgaagcc gccttccagc ctgtctttgc tgagacctcc gacccaaggt 60 ggtctctgta gggactaaag tccctactgt cgcatctctc atggcctatc ccctgccatt 120 ggttctctgc tttgctctgg tggtggcaca ggtctggggg tccactacac ctcccacagg 180 gacaagcgag ccccctgatg tgcaaacagt ggagcccacg gaagatgaca ttctgcaaaa 240 cgaggcggac aaccaggaga acgttttatc tcagctgctg ggagactatg acaaggtcaa 300 ggctgtgtct gagggctctg actgtcagtg caaatgtgtg gtgagaccgc tgggccgaga 360 tgcctgccag aggatcaacc agggggcttc caggaaggaa gacttctaca ctgtggaaac 420 catcacctcg ggctcatcct gtaaatgtgc ttgtgttgct cctccgtctg ccgtcaatcc 480 ctgtgaggga gacttcaggc tccagaagct tcgggaggct gacagccgag atttgaagct 540 gtctacaatt atagacatgt tggaaggtgc tttctacggc ctggacctcc taaagctgca 600 ttcggttacc actaaactcg tggggcgagt ggataaactg gaggaggaag tctctaagaa 660 cctcaccaag gagaatgagc aaatcaaaga ggacgtggaa gaaatccgaa cggagctgaa 720 caagcgaggc aaggagaact gctctgacaa caccctagag agcatgccag acatccgctc 780 agccctgcag agggatgcgg ctgcagccta cgcccaccca gagtatgaag aacggtttct 840 gcaggaggaa actgtgtcac agcagatcaa ctccatcgaa ctcctgagga cgcagccact 900 ggtccctcct gcagcgatga agccgcagcg gcccctgcag agacaggtgc acctgagagg 960 tcggctggcc tccaagccca ccgtcatcag gggaatcacc tactataaag ccaaggtctc 1020 tgaggaggaa aatgacatag aagagcagca cgatgagctt ttcagtggcg 
acagtggagt 1080 ggacttgctg atagaagatc agcttctaag acaggaagac ctactgacaa gtgccacccg 1140 gaggccagca accactcgtc acactgctgc tgtcacgact gatgcgagca ttcaggccgc 1200 agcctcatcc tcagagcctg cacaggcctc tgcctcagca tccagctttg ttgagcctgc 1260 tcctcaggcc tccgatagag agctcttggc aaccccacag actaccacag tgtttccaga 1320 gcccacgggg gtgatgcctt ctacccaagt ctcacccacc accgtggccc acacagctgt 1380 ccagccactt ccagcaatgg ttcctgggga catatttgtg gaagctctac ccttggtccc 1440 tctgttacct gacacagttg ggacagacat gccagaggaa gaggggactg cagggcagga 1500 agcaacctct gctggtccca tcctgagccc tgaagaagaa gatgacattc ggaatgtgat 1560 aggaaggtgc aaggacaccc tctctacaat cacaggaccg accacccaga acacatatgg 1620 acggaatgaa ggggcctgga tgaaggaccc cctagccaag gacgaccgca tttacgtaac 1680 caactattac tatggcaaca cactggtcga gttccgaaac ctggagaact tcaaacaagg 1740 tcgctggagc aattcctaca agcttccata cagctggatc ggcacgggtc acgtggtcta 1800 caacggcgcc ttctactata accgggcctt cacccgaaac atcatcaagt atgacctgaa 1860 gcagcgttat gtggctgcct gggccatgct gcacgatgtg gcctatgagg aggccactcc 1920 ttggcggtgg cagggtcact cggatgtgga ctttgctgtg gatgagaatg gcctgtggct 1980 tatctaccca gctctggatg atgaaggttt caaccaggag gtcattgtcc tgagcaagct 2040 caatgccgtg gacctgagca cgcagaagga gaccacgtgg cgcactgggc tccggaggaa 2100 tttctatggc aactgctttg tcatctgtgg ggtactatat gctgtggaca gctataacca 2160 gaggaatgcc aacatctcct atgcctttga cacacacacc aacacacaga ttgtccctag 2220 gctgctgttt gagaatgaat attcgtacac cacccagata gactacaacc ccaaggaccg 2280 cctcctctat gcctgggaca atggccacca ggtcacctac catgtcatct ttgcctactg 2340 acacacttga ccctgcaaaa agaagcacag tggggccact agcaccttgt gtgtgtctgt 2400 gtgcatgtct gtctgtgaga ttgtgcaggt gggtgtgtgt tgttttaaaa tatattattt 2460 tgtataatat tacaagtgta aaatgacagt ttgggtctat tttttttata tggattgtag 2520 atcaatccat atgtgtatgt gctggtctca tccttcacaa tttatatttt tgtgcaaatg 2580 aacttctcct tctgaccagt aactaccttc tttcgtgctc tgaacctctg gctcctgagg 2640 tcaagggctg gagggtttct tcctccaggt cttgcagcca ggagcaggag tgtggggctc 2700 aggaaaaagt gctaagtggc ggcaaagttt ttatgtatta gagaagttct taaaactcag 
2760 aaaaaaatac tttttttaaa taaaggagat attttaagac ccttaaaaaa aaaaaaaaaa 2820 aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa 2863 6 746 PRT Mus musculus 6 Met Ala Tyr Pro Leu Pro Leu Val Leu Cys Phe Ala Leu Val Val Ala 1 5 10 15 Gln Val Trp Gly Ser Thr Thr Pro Pro Thr Gly Thr Ser Glu Pro Pro 20 25 30 Asp Val Gln Thr Val Glu Pro Thr Glu Asp Asp Ile Leu Gln Asn Glu 35 40 45 Ala Asp Asn Gln Glu Asn Val Leu Ser Gln Leu Leu Gly Asp Tyr Asp 50 55 60 Lys Val Lys Ala Val Ser Glu Gly Ser Asp Cys Gln Cys Lys Cys Val 65 70 75 80 Val Arg Pro Leu Gly Arg Asp Ala Cys Gln Arg Ile Asn Gln Gly Ala 85 90 95 Ser Arg Lys Glu Asp Phe Tyr Thr Val Glu Thr Ile Thr Ser Gly Ser 100 105 110 Ser Cys Lys Cys Ala Cys Val Ala Pro Pro Ser Ala Val Asn Pro Cys 115 120 125 Glu Gly Asp Phe Arg Leu Gln Lys Leu Arg Glu Ala Asp Ser Arg Asp 130 135 140 Leu Lys Leu Ser Thr Ile Ile Asp Met Leu Glu Gly Ala Phe Tyr Gly 145 150 155 160 Leu Asp Leu Leu Lys Leu His Ser Val Thr Thr Lys Leu Val Gly Arg 165 170 175 Val Asp Lys Leu Glu Glu Glu Val Ser Lys Asn Leu Thr Lys Glu Asn 180 185 190 Glu Gln Ile Lys Glu Asp Val Glu Glu Ile Arg Thr Glu Leu Asn Lys 195 200 205 Arg Gly Lys Glu Asn Cys Ser Asp Asn Thr Leu Glu Ser Met Pro Asp 210 215 220 Ile Arg Ser Ala Leu Gln Arg Asp Ala Ala Ala Ala Tyr Ala His Pro 225 230 235 240 Glu Tyr Glu Glu Arg Phe Leu Gln Glu Glu Thr Val Ser Gln Gln Ile 245 250 255 Asn Ser Ile Glu Leu Leu Arg Thr Gln Pro Leu Val Pro Pro Ala Ala 260 265 270 Met Lys Pro Gln Arg Pro Leu Gln Arg Gln Val His Leu Arg Gly Arg 275 280 285 Leu Ala Ser Lys Pro Thr Val Ile Arg Gly Ile Thr Tyr Tyr Lys Ala 290 295 300 Lys Val Ser Glu Glu Glu Asn Asp Ile Glu Glu Gln His Asp Glu Leu 305 310 315 320 Phe Ser Gly Asp Ser Gly Val Asp Leu Leu Ile Glu Asp Gln Leu Leu 325 330 335 Arg Gln Glu Asp Leu Leu Thr Ser Ala Thr Arg Arg Pro Ala Thr Thr 340 345 350 Arg His Thr Ala Ala Val Thr Thr Asp Ala Ser Ile Gln Ala Ala Ala 355 360 365 Ser Ser Ser Glu Pro Ala Gln Ala Ser Ala Ser Ala Ser Ser Phe Val 370 375 380 Glu Pro Ala Pro Gln Ala 
Ser Asp Arg Glu Leu Leu Ala Thr Pro Gln 385 390 395 400 Thr Thr Thr Val Phe Pro Glu Pro Thr Gly Val Met Pro Ser Thr Gln 405 410 415 Val Ser Pro Thr Thr Val Ala His Thr Ala Val Gln Pro Leu Pro Ala 420 425 430 Met Val Pro Gly Asp Ile Phe Val Glu Ala Leu Pro Leu Val Pro Leu 435 440 445 Leu Pro Asp Thr Val Gly Thr Asp Met Pro Glu Glu Glu Gly Thr Ala 450 455 460 Gly Gln Glu Ala Thr Ser Ala Gly Pro Ile Leu Ser Pro Glu Glu Glu 465 470 475 480 Asp Asp Ile Arg Asn Val Ile Gly Arg Cys Lys Asp Thr Leu Ser Thr 485 490 495 Ile Thr Gly Pro Thr Thr Gln Asn Thr Tyr Gly Arg Asn Glu Gly Ala 500 505 510 Trp Met Lys Asp Pro Leu Ala Lys Asp Asp Arg Ile Tyr Val Thr Asn 515 520 525 Tyr Tyr Tyr Gly Asn Thr Leu Val Glu Phe Arg Asn Leu Glu Asn Phe 530 535 540 Lys Gln Gly Arg Trp Ser Asn Ser Tyr Lys Leu Pro Tyr Ser Trp Ile 545 550 555 560 Gly Thr Gly His Val Val Tyr Asn Gly Ala Phe Tyr Tyr Asn Arg Ala 565 570 575 Phe Thr Arg Asn Ile Ile Lys Tyr Asp Leu Lys Gln Arg Tyr Val Ala 580 585 590 Ala Trp Ala Met Leu His Asp Val Ala Tyr Glu Glu Ala Thr Pro Trp 595 600 605 Arg Trp Gln Gly His Ser Asp Val Asp Phe Ala Val Asp Glu Asn Gly 610 615 620 Leu Trp Leu Ile Tyr Pro Ala Leu Asp Asp Glu Gly Phe Asn Gln Glu 625 630 635 640 Val Ile Val Leu Ser Lys Leu Asn Ala Val Asp Leu Ser Thr Gln Lys 645 650 655 Glu Thr Thr Trp Arg Thr Gly Leu Arg Arg Asn Phe Tyr Gly Asn Cys 660 665 670 Phe Val Ile Cys Gly Val Leu Tyr Ala Val Asp Ser Tyr Asn Gln Arg 675 680 685 Asn Ala Asn Ile Ser Tyr Ala Phe Asp Thr His Thr Asn Thr Gln Ile 690 695 700 Val Pro Arg Leu Leu Phe Glu Asn Glu Tyr Ser Tyr Thr Thr Gln Ile 705 710 715 720 Asp Tyr Asn Pro Lys Asp Arg Leu Leu Tyr Ala Trp Asp Asn Gly His 725 730 735 Gln Val Thr Tyr His Val Ile Phe Ala Tyr 740 745 7 2317 DNA Gallus gallus 7 ccacgcgtcc gcccacgcgt ccggaaagag ttttggtaga gaacaagctt catggacttt 60 ctccagctct ctctgaagcc atccagagca tttctcgctg ggaacttgtc caggctgcgc 120 tcccacatgt gctacactgc actgcaacat tgctctccaa ccgaaacaag ctaggtcatc 180 aggataaact tggagtagct gaaacaaagc ttcttcacac 
tcttcactgg atgctgttgg 240 aggcccctca ggactgcagc aatgaccgat ttggaggaga cagaggttct agctggggag 300 ggagcagtag tgcctttatc caccaggctg aaaaccaggg atcaccggga catccccgac 360 ccagcaccac gaatgatgag gacgagaaca acagaaggaa gttctttcag aactccatgg 420 ccaccgtgga
gctctttgtg ttcctctttg ctcctctggt tcacaggatt aaagaatctg 480 acctgacgtt tcgattggct agtggccttg ttatttggca gcctatgtgg gaacacaggc 540 aacctgaagt gtctgccttc aatgccctcg taaaaccaat caggaacatt gttacagcta 600 aaagaagttc tcctaccaac aatcagagtg tgacttgtga atccctaaat ctggacagtg 660 gtcatacaga gggactgcag gtggtctgtg agacgaccct gcccgattct gtaccttcaa 720 agcccactgt ttcagcatgt catcgtggaa attccttgga aggaagcgtg tcctctcaaa 780 cctctcagga gagaggtact ccacatccca gagtgtccat ggtgatccca ccatgccaga 840 agtctcgcta tgccacttac tttgatgtgg cagtactgcg ctgcttgctg cagcctcact 900 ggtctgagga gggcacacag tggtcactga tgtattacct gcagagactg aggcatatgc 960 tacaggaaaa gcccgagaaa ccacctgagc cagagatcac ccctttgcca agacttcgca 1020 gtagctccat ggtggctgct gcaccctctc tggtgaatac ccacaaaact caggatctca 1080 caatgaaatg taatgaggaa gaaaaatcac taagcacaga agcgttttcc aaggtttcac 1140 tgaccaactt gcgtaggcca gcggttccag atctctccac agatctgggg atgaacatct 1200 tcaaaaagtt taaaagccgc aaagaggaca gagagcgtga acgcaaaggg tcaattcctt 1260 tccaccatac tgggaagaag cgtcaacgga gaatggggat gcccttcctt ctccatgagg 1320 accatttgga tgtttcaccc actcggagca ctttttcatt tggcagtttt tctggcattg 1380 gagaggaccg acgtggcatt gagagaggag gatggcaaac caccatattg ggaaagttca 1440 ccagacgggg gagctctgac acagcaacgg agatggaaag cctgagcgct aggcactcac 1500 actctcacca cactcttgtc tctgatatgc cagaccactc aaacagccat ggagagaaca 1560 cagtcaaaga agttcggtcc cagatctcta ccatcactgt ggccaccttc aacactaccc 1620 tggcttcgtt caatgtgggc tatgctgatt tcttcagtga gcacatgagg aagctttgca 1680 atcaggtgcc catccctgag atgccccacg agcctcttgc gtgtgccaac ctcccacgga 1740 gcctgacaga ctcatgcatc aattacagtt gcttggagga tacggatcac attgatggaa 1800 ccaacaactt tgtccacaag aacggcatgc tggatctctc ggtaaatggc aaggaatgag 1860 gaaagccagg tccctcttct gtcaatatag tggtaccatt gagatcaggg tgttgatggg 1920 cttttcctcc acctctttat atgacttctc tcagcagtac ataaaggtag tcctgaaggc 1980 tgtttacttg gtcctgaacc atgacatcag ctccaggatt tgtgatgtgg cactgaacat 2040 tgtggagtgc ttgcttcagc ttggagtggt gccatctgta gagaaagtcc ggaggaagag 2100 cgagaacaaa gaaaatgaag 
cccctgaaaa gagaccaaat gagggatctt ttcaactcaa 2160 agcttctgga ggttcggctt gtggatttgg gcctcctcca gtcagtggaa ctggagatgg 2220 aggagaagaa ggaggcggtg gaagtggtgg aggaggaagc gatggaggtg gtggaggagg 2280 agggccgtat gagaagaatg acaaaaaaaa aaaaaaa 2317 8 618 PRT Gallus gallus 8 Thr Arg Pro Pro Thr Arg Pro Glu Arg Val Leu Val Glu Asn Lys Leu 1 5 10 15 His Gly Leu Ser Pro Ala Leu Ser Glu Ala Ile Gln Ser Ile Ser Arg 20 25 30 Trp Glu Leu Val Gln Ala Ala Leu Pro His Val Leu His Cys Thr Ala 35 40 45 Thr Leu Leu Ser Asn Arg Asn Lys Leu Gly His Gln Asp Lys Leu Gly 50 55 60 Val Ala Glu Thr Lys Leu Leu His Thr Leu His Trp Met Leu Leu Glu 65 70 75 80 Ala Pro Gln Asp Cys Ser Asn Asp Arg Phe Gly Gly Asp Arg Gly Ser 85 90 95 Ser Trp Gly Gly Ser Ser Ser Ala Phe Ile His Gln Ala Glu Asn Gln 100 105 110 Gly Ser Pro Gly His Pro Arg Pro Ser Thr Thr Asn Asp Glu Asp Glu 115 120 125 Asn Asn Arg Arg Lys Phe Phe Gln Asn Ser Met Ala Thr Val Glu Leu 130 135 140 Phe Val Phe Leu Phe Ala Pro Leu Val His Arg Ile Lys Glu Ser Asp 145 150 155 160 Leu Thr Phe Arg Leu Ala Ser Gly Leu Val Ile Trp Gln Pro Met Trp 165 170 175 Glu His Arg Gln Pro Glu Val Ser Ala Phe Asn Ala Leu Val Lys Pro 180 185 190 Ile Arg Asn Ile Val Thr Ala Lys Arg Ser Ser Pro Thr Asn Asn Gln 195 200 205 Ser Val Thr Cys Glu Ser Leu Asn Leu Asp Ser Gly His Thr Glu Gly 210 215 220 Leu Gln Val Val Cys Glu Thr Thr Leu Pro Asp Ser Val Pro Ser Lys 225 230 235 240 Pro Thr Val Ser Ala Cys His Arg Gly Asn Ser Leu Glu Gly Ser Val 245 250 255 Ser Ser Gln Thr Ser Gln Glu Arg Gly Thr Pro His Pro Arg Val Ser 260 265 270 Met Val Ile Pro Pro Cys Gln Lys Ser Arg Tyr Ala Thr Tyr Phe Asp 275 280 285 Val Ala Val Leu Arg Cys Leu Leu Gln Pro His Trp Ser Glu Glu Gly 290 295 300 Thr Gln Trp Ser Leu Met Tyr Tyr Leu Gln Arg Leu Arg His Met Leu 305 310 315 320 Gln Glu Lys Pro Glu Lys Pro Pro Glu Pro Glu Ile Thr Pro Leu Pro 325 330 335 Arg Leu Arg Ser Ser Ser Met Val Ala Ala Ala Pro Ser Leu Val Asn 340 345 350 Thr His Lys Thr Gln Asp Leu Thr Met Lys Cys Asn Glu Glu Glu Lys 
355 360 365 Ser Leu Ser Thr Glu Ala Phe Ser Lys Val Ser Leu Thr Asn Leu Arg 370 375 380 Arg Pro Ala Val Pro Asp Leu Ser Thr Asp Leu Gly Met Asn Ile Phe 385 390 395 400 Lys Lys Phe Lys Ser Arg Lys Glu Asp Arg Glu Arg Glu Arg Lys Gly 405 410 415 Ser Ile Pro Phe His His Thr Gly Lys Lys Arg Gln Arg Arg Met Gly 420 425 430 Met Pro Phe Leu Leu His Glu Asp His Leu Asp Val Ser Pro Thr Arg 435 440 445 Ser Thr Phe Ser Phe Gly Ser Phe Ser Gly Ile Gly Glu Asp Arg Arg 450 455 460 Gly Ile Glu Arg Gly Gly Trp Gln Thr Thr Ile Leu Gly Lys Phe Thr 465 470 475 480 Arg Arg Gly Ser Ser Asp Thr Ala Thr Glu Met Glu Ser Leu Ser Ala 485 490 495 Arg His Ser His Ser His His Thr Leu Val Ser Asp Met Pro Asp His 500 505 510 Ser Asn Ser His Gly Glu Asn Thr Val Lys Glu Val Arg Ser Gln Ile 515 520 525 Ser Thr Ile Thr Val Ala Thr Phe Asn Thr Thr Leu Ala Ser Phe Asn 530 535 540 Val Gly Tyr Ala Asp Phe Phe Ser Glu His Met Arg Lys Leu Cys Asn 545 550 555 560 Gln Val Pro Ile Pro Glu Met Pro His Glu Pro Leu Ala Cys Ala Asn 565 570 575 Leu Pro Arg Ser Leu Thr Asp Ser Cys Ile Asn Tyr Ser Cys Leu Glu 580 585 590 Asp Thr Asp His Ile Asp Gly Thr Asn Asn Phe Val His Lys Asn Gly 595 600 605 Met Leu Asp Leu Ser Val Asn Gly Lys Glu 610 615 9 383 DNA Homo sapiens 9 ggaaattgac ccggcgaggc agttcagatg cagccactga gatggagagt ctgagcgcca 60 ggcattccca ctcccatcac accctggtaa gcgacctgcc ggacccctcc gacagccatg 120 gagaaaacac cgtcaaggaa gtgcgatctc agatctccac catcacagtt gcgaccttca 180 ataccacttt ggcgtcattc aacgtaggct atgcagactt tttcaatgag catatgagga 240 aactctgcaa ccaggtgcct atcccggaga tgccacatga acctctggca tgtgctaacc 300 tacctcgaag cctcacagac tcctgcataa actacagcta cctagaggac acagaacata 360 ttgacgggac caataacttt gtc 383 10 127 PRT Homo sapiens 10 Lys Leu Thr Arg Arg Gly Ser Ser Asp Ala Ala Thr Glu Met Glu Ser 1 5 10 15 Leu Ser Ala Arg His Ser His Ser His His Thr Leu Val Ser Asp Leu 20 25 30 Pro Asp Pro Ser Asp Ser His Gly Glu Asn Thr Val Lys Glu Val Arg 35 40 45 Ser Gln Ile Ser Thr Ile Thr Val Ala Thr Phe Asn Thr Thr Leu Ala 50 
55 60 Ser Phe Asn Val Gly Tyr Ala Asp Phe Phe Asn Glu His Met Arg Lys 65 70 75 80 Leu Cys Asn Gln Val Pro Ile Pro Glu Met Pro His Glu Pro Leu Ala 85 90 95 Cys Ala Asn Leu Pro Arg Ser Leu Thr Asp Ser Cys Ile Asn Tyr Ser 100 105 110 Tyr Leu Glu Asp Thr Glu His Ile Asp Gly Thr Asn Asn Phe Val 115 120 125 11 593 DNA Homo sapiens 11 tcggtaccgg gtcagtacgg atgtgaggtc agattccttg atcctggtac cagtggagca 60 aacagaaaca cgaagagctc cacagtagcc atggagttct ggaagatctt tcttcggttg 120 ttctcttctt cgtcattaga gctgctttgg caaggctgcc ctggagaacc ctggttttca 180 acctggtgga tgaaagcact gctgcttcca ccccagctgg agcctcggtc tgtaccccca 240 aaccgctcat tgttgcagtc ctggggggcc tccagaagca tccagtgtag agtgtgaagg 300 agctttgtct cagcaacacc caatttatcc tggtggccta gcttgtttcg gtttgaaagc 360 agggttgcag tgcagtggag gacatgaggc aaagcagctt gcaccagttc ccatccggaa 420 atgctctgga tggcttcaga gagagctgga gagaggccat gcagcttgtt ttctaccaac 480 actcgctcaa aggacacaca agaagcttca tattgcttcc ccagtttggg cctcaaaaat 540 gcactggttt gccggcacag gaaggtctgg atgggcaggg ggatgccgcg ggc 593 12 418 DNA Homo sapiens 12 tgcccatcca gaccttcctg tggcggcaaa ccagtccttt gagcgagtgt tggtagaaaa 60 caagctgcat ggcctctctc cagctctctc tgaagccatc cagagcattt ccagatggga 120 actggtgcaa gctgctttgc ctcatgtcct ccactgcact gcaaccctgc tttcaaaccg 180 aaacaagcta ggccaccagg ataaattggg tgttgctgag acaaagctcc ttcacactct 240 acactggatg cttctggagg ccccccagga ctgcaacaat gagcggtttg ggggtacaga 300 ccgaggctcc agctggggtg gaagcagcag tgctttcatc caccaggttg aaaaccaggg 360 ttctccaggg cagccttgcc aaagcagctc taatgacgaa gaagagaaca accgaaga 418 13 738 DNA Homo sapiens 13 atggtgaaga ggaagagctc cgagggccag gagcaggacg gcggccgcgg catccccctg 60 cccatccaga ccttcctgtg gcggcaaacc agtttttatt atgactgtac acgccaccag 120 gataaattgg gtgttgctga gacaaagctc cttcacactc tacactggat gcttctggag 180 gccccccagg actgcaacaa tgagcggttt gggggtacag accgaggctc cagctggggt 240 ggaagcagca gtgctttcat ccaccaggtt gaaaaccagg gttctccagg gcagccttgc 300 caaagcagct ctaatgacga agaagagaac aaccgaagaa agatcttcca gaactccatg 360 gctactgtgg 
agctcttcgt gtttctgttt gctcccctgg tacacaggat caaggaatct 420 gacctcacct tccgtctggc cagtgggctt gttatatggc agcccatgtg ggaacacaga 480 cagcccggag tctctggctt taccgcactg gtgaagccca tcaggaacat cattacagct 540 aagagaagtt ctcctatcaa cagtcaaagc cggacctgtg aatcaccaaa tcaagatgca 600 agacacttag aggtactact aacctggtgc ttctatttta gcctcatgct tctattcagt 660 tcacctctgt atgatgaatt cttgatgtgt aactctccta tagatactgg gtatggagat 720 gaaaaagaaa ataattaa 738 14 245 PRT Homo sapiens 14 Met Val Lys Arg Lys Ser Ser Glu Gly Gln Glu Gln Asp Gly Gly Arg 1 5 10 15 Gly Ile Pro Leu Pro Ile Gln Thr Phe Leu Trp Arg Gln Thr Ser Phe 20 25 30 Tyr Tyr Asp Cys Thr Arg His Gln Asp Lys Leu Gly Val Ala Glu Thr 35 40 45 Lys Leu Leu His Thr Leu His Trp Met Leu Leu Glu Ala Pro Gln Asp 50 55 60 Cys Asn Asn Glu Arg Phe Gly Gly Thr Asp Arg Gly Ser Ser Trp Gly 65 70 75 80 Gly Ser Ser Ser Ala Phe Ile His Gln Val Glu Asn Gln Gly Ser Pro 85 90 95 Gly Gln Pro Cys Gln Ser Ser Ser Asn Asp Glu Glu Glu Asn Asn Arg 100 105 110 Arg Lys Ile Phe Gln Asn Ser Met Ala Thr Val Glu Leu Phe Val Phe 115 120 125 Leu Phe Ala Pro Leu Val His Arg Ile Lys Glu Ser Asp Leu Thr Phe 130 135 140 Arg Leu Ala Ser Gly Leu Val Ile Trp Gln Pro Met Trp Glu His Arg 145 150 155 160 Gln Pro Gly Val Ser Gly Phe Thr Ala Leu Val Lys Pro Ile Arg Asn 165 170 175 Ile Ile Thr Ala Lys Arg Ser Ser Pro Ile Asn Ser Gln Ser Arg Thr 180 185 190 Cys Glu Ser Pro Asn Gln Asp Ala Arg His Leu Glu Val Leu Leu Thr 195 200 205 Trp Cys Phe Tyr Phe Ser Leu Met Leu Leu Phe Ser Ser Pro Leu Tyr 210 215 220 Asp Glu Phe Leu Met Cys Asn Ser Pro Ile Asp Thr Gly Tyr Gly Asp 225 230 235 240 Glu Lys Glu Asn Asn 245 15 6702 DNA Homo sapiens 15 atgctgtgct gcccctctga aagcttgatt gtcagtataa ttatcttctt tctgccatgg 60 aacagggcct ctcttgtgat acctccgtgc caaaggtccc gctatgccac ctactttgac 120 gttgctgttc tgcgctgcct acttcagccc cattggtctg aggaaggcac tcagtggtct 180 ctgatgtact atctacaaag gctgcgacac atgttggaag agaagccaga aaagcctccg 240 gagccagata ttcctctcct gcccagaccc aggagtagct ccatggtggc agcagctccc 300 
tcactagtga acacccacaa aacccaagat ctcaccatga agtgtaacga ggaggaaaaa 360 tctcttagct ctgaggcctt ttccaaggtt tcactgacca atctgcgtag atctgcagtc 420 ccagatcttt cttcagacct gggcatgaat atttttaaaa agttcaagag ccgcaaagaa 480 gaccgagaga ggaaaggctc cattccattc caccacacag gcaagaggag gccacggaga 540 atgggagtgc ccttcctgct tcacgaggac cacctggatg tgtcccccac gcgcagcaca 600 ttctcctttg gaagtttctc tgggctggga gaagacaggc gaggaattga gaaaggaggc 660 tggcaaacca ccattttagg gaaattgacc cggcgaggca gttcagatgc agccactgag 720 atggagagtc tgagcgccag gcattcccac tcccatcaca ccctggtaag cgacctgccg 780 gacccctcca acagccatgg agaaaacacc gtcaaggaag tgcgatctca gatctccacc 840 atcacagttg cgaccttcaa taccactttg gcgtcattca acgtaggcta tgcagacttt 900 ttcaatgagc atatgaggaa actctgcaac caggtgccta tcccggagat gccacatgaa 960 cctctggcat gtgctaacct acctcgaagc ctcacagact cctgcataaa ctacagctac 1020 ctagaggaca cagaacatat tgacgggacc aataactttg tccacaagaa tggaatgctt 1080 gatctttctg tagttctgaa ggctgtttat cttgtcctta atcatgacat cagctctcgt 1140 atctgtgacg tggcgctaaa cattgtggaa tgcttgcttc aacttggtgt ggtgccctgt 1200 gtagaaaaga atagaaagaa gagtgaaaac aaggaaaatg agaccttgga aaagaggcca 1260 agtgagggag ctttccaatt caaaggagta tctggaagtt ccacctgtgg attcggaggc 1320 cctgctgatg aaagtacacc tgtaagcaac cataggcttg ctctaacaat gctcatcaaa 1380 atagtgaagt ctttgggatg tgcctatggt tgtggtgaag gacaccgagg gctctctgga 1440 gatcgtctga gacaccaggt attccgagag aatgcccaga actgcctcac taagctatac 1500 aagctagata agatgcagtt ccgacaaacc atgagggact atgtgaacaa ggactctctc 1560 aataatgtag tggacttctt gcatgctttg ctaggatttt gtatggagcc ggtcactgac 1620 aacaaggctg ggtttggaaa taacttcacc acagtggaca acaaatccac agcccaaaat 1680 gtggaaggca ttatcgtcag cgccatgttt aaatccctca tcacacgctg cgcttcaacc 1740 acacatgaat tgcacagccc tgagaatctg ggactgtatt gtgacattcg tcagctggtc 1800 cagtttatca aagaggctca tgggaatgtc ttcaggagag tggccctcag cgctctgctt 1860 gacagtgccg agaagttagc accagggaaa aaggtggagg agaatgaaca ggaatctaag 1920 cctgcaggca gtaaaagcga tgaacaaatg caaggagcca acttggggcg gaaagatttc 1980 tggcgtaaga tgttcaagtc 
ccagagtgca gcaagtgaca ccagcagcca gtctgaacag 2040 gacacttcag aatgcacgac tgcccactca gggaccacct ctgaccgacg tgcccgctca 2100 cgatcccgca gaatttccct ccgaaagaag cttaaactcc ccatagggaa ctggctgaag 2160 agatcatccc tctcaggcct ggcagatggt gtggaggacc tcctggacat tagctctgtg 2220 gaccgactct ctttcatcag gcaaagctcc aaggtcaaat tcactagtgc tgtgaagctt 2280 tctgaaggtg ggccaggaag tggcatggaa aatggaagag atgaagagga gaatttcttc 2340 aagcgtcttg gttgccacag ttttgatgat catctctctc ccaaccaaga tggtggaaaa 2400 agcaaaaacg tggtgaatct tggagcaatc cgacaaggca tgaaacgctt ccaatttctg 2460 ttaaactgct gtgagccagg gacaattcct gatgcctcca tcctagcagc tgccttggat 2520 ctagaagccc ctgtggtggc cagagcagcc ttgttcctgg aatgtgctcg ttttgttcac 2580 cgctgcaacc gtggcaactg gccagagtgg atgaaagggc accacgtgaa catcaccaag 2640 aaaggacttt cccggggacg ctctcccatt gtgggcaaca agcgaaacca gaagctgcag 2700 tggaatgcag ccaagctctt ctaccaatgg ggagacgcaa ttggcgtccg attgaatgag 2760 ctgtgccacg gggaaagtga gagcccagcc aacctgctgg gtctcattta cgatgaagag 2820 accaagagga gacttagaaa ggaggatgag gaggaagact ttttagatga cagtaaggag 2880 actcccttta ctacaagaac ccctgcttgt actgtgaacc cctctaaatg cggttgcccc 2940 tttgccttga agatggcagc atgtcagctt cttctggaga ttaccacctt cctgcgagag 3000 accttttctt gcctgcccag acctcgcact gagcctctgg tggacttgga gagctgcaga 3060 cttcgtttgg atcccgagtt ggaccggcac agatatgaga ggaagatcag ctttgctggg 3120 gtcctggacg aaaatgaaga ctcaaaagat tctctccaca gcagcagcca cactctcaaa 3180 tcagatgcag gagtcgagga gaagaaagtt cccagcagga agatcaggat aggaggttct 3240 cgcctgctcc agattaaagg aacccgcagt ttccaggtga agaagggggg ttccttgtcc 3300 agcattcgcc gggtcggcag cttaaagagc agcaagttat cacggcagga ctcagagtct 3360 gaggctgagg agctgcagct gtcccagagc agggacactg tcactgacct agaagggagt 3420 ccttggagtg caagcgagcc cagcattgag ccagagggaa tgagtaatgc cggcgcggag 3480 gagaattacc acagaaacat gtcgtggctt catgtgatga tcttgctgtg caatcagcag 3540 agtttcatct gcactcacgt tgactactgc catccccact gctacctgca ccacagccgc 3600 tcctgtgccc gactggtcag agccatcaag ctactctatg gagacagtgt ggactccctg 3660 agggaaagca gcaacatcag cagtgtggct 
ctccggggca agaaacagaa agaatgctca 3720 gataagtcat gcctgaggac accttctcta aagaagagag tttcagatgc caatctggaa 3780 ggaaaaaaag attccggaat gctgaagtac atcagacttc aggtattgtt acctggatca 3840 gaaggattca tggaactttt aacagggagg ggactccaga cagcctattt actaatgttt 3900 gggacataca acatcagttg gtacagtgtt ggcataaagc cccttcagtt ggtgatgagc 3960 ttgtcgcctg ctcccttatc tctgttaatc aaggcagcac caattctgac agaggagatg 4020 tacggagaca tccagccagc tgcctgggag ctcctgctca gcatggatga gcacatggca 4080 ggggcagcag tgaaggtgcc tgaggccgtg tccgacatgc tgatgtcaga gttccaccac 4140 ccggagactg tgcagaggct gaacgctgtc ctcaagttcc acacgctctg gaggtttcgc 4200 tatcaggtct ggccccggat ggaggaaggg gcacagcaga tttttaagaa atccttttca 4260 gcccgggctg tgtcccgctc ccatcaaagg gcagaacaca tcttaaagaa cttgcagcag 4320 gaggaagaaa agaaacgact tggtagagaa gccagcctca tcactgccat ccccatcacc 4380 caggaggctt gctatgagcc cacatgcacg cccaactcag aaccggaaga agaagtagaa 4440 gaagtcacca atctggcatc ccgtcgactg tctgtgagtc catcctgcac ctccagcact 4500 tcccacagga attattcctt ccgccgcggg tcagtctggt cagtgcgttc agccgtcagt 4560 gctgaagatg aggaacatac cactgaacac acgccgaacc accatgtgcc tcagccccca 4620 caagcagtgt tcccagcatg catctgtgca
gcagtacttc ccattgttca tctgatggag 4680 gatggtgagg tgcgggaaga tggagtagca gtgagtgctg tggctcaaca agtcttatgg 4740 aactgtctaa ttgaagatcc atcaacggtt cttcgacatt ttctggaaaa actgaccatc 4800 agcaatagac aagatgagtt aatgtacatg ctgcgcaaac ttctcttgaa tattggagac 4860 tttcctgctc agacatctca catcctattc aactatttgg taggattaat catgtacttt 4920 gtgcggaccc cctgcgagtg ggggatggat gccatttcag ccaccctgac attcctgtgg 4980 gaggtggtgg gttacgtgga gggcctcttc ttcaaggatc tcaagcagac gatgaagaag 5040 gagcagtgtg aggtgaagct cctggtgacc gcttcaatgc caggtactaa aaccttggta 5100 gttcatggac agaatgagtg cgatatccca acccagttac cagtccatga agacactcaa 5160 tttgaagccc tgttgaagga gtgtctggag ttttttaata tcccagaatc ccagtcaaca 5220 cattattttc ttatggataa acgatggaac cttatccact acaataagac ctatgttcga 5280 gatatttatc ctttccggag gtcagtatct ccccagctga atcttgtaca tatgcatcca 5340 gagaagggac aggagctcat tcagaaacag gtgttcaccc gaaagctgga agaagtaggg 5400 cgggtgttgt ttctcatctc cctaacccag aagatcccca cagcccacaa acagtcccac 5460 gtctccatgc ttcaggaaga cctcctccgc ctgccctcat tccctcgtag tgctattgat 5520 gctgagtttt cactcttcag tgatcctcaa gctggaaagg aactgtttgg cctcgacact 5580 cttcagaaaa gcttgtggat ccagctgctg gaggaaatgt tcctgggcat gccgagcgag 5640 tttccatggg gagacgaaat catgcttttc ctcaacgttt ttaacggggc tctgatcctc 5700 cacccggaag acagtgccct gctcaggcag tatgctgcca ccgtcatcaa caccgcggtg 5760 cacttcaacc acctcttctc tctcagcggc taccagtgga ttctccccac catgctgcag 5820 gtgtactccg actatgaaag caatccccag ctgcgtcaag ccatcgaatt tgcctgtcac 5880 cagttctata ttctacaccg gaagcccttt gtgctccagc tgtttgctag tgtggcccct 5940 ctcctggaat ttcctgatgc tgccaataat gggcccagca aaggtgtgtc agctcagtgc 6000 ctgtttgact tgctgcagtc cctagaggga gagaccaccg acatattaga catcttagag 6060 ctggtcaaag ctgagaagcc tctcaagtca ttagatttct gctatggaaa cgaagatctg 6120 acattttcta tcagtgaagc cattaagctc tgtgtcactg tggtggcgta tgctcccgaa 6180 tcattcagaa gtcttcagat gctgatggtc ttagaagcct tagttccatg ttacctacaa 6240 aagctaaaga ggcagacatc acaggtggag acagtacctg ctgcccgaga ggagattgcg 6300 gccactgctg ctcttgcgac gtccctacag gcccttttgt 
acagtgtaga ggtcctcacc 6360 agggaaaacc ttcatttact ggaggaaggg caaggcattc ccagagagga actggatgaa 6420 cgaattgctc gggaagagtt cagaagaccc cgggagtcct tactgaatat ttgcactgag 6480 ttctataagc actgtgggcc acggctgaag atcttgcaaa atctggctgg ggagcctcgg 6540 gtcattgcct tggaactgct ggatgtgaag tctcacatga gtgtgctagg gaaaggcccc 6600 agaattactt ccctgtgcac tcgtatttcg tcttcctaca gagatgccat ttcacttgaa 6660 attcatgcta aaggccgtat ttgtgtttca aaaggaacgt ga 6702 16 2233 PRT Homo sapiens 16 Met Leu Cys Cys Pro Ser Glu Ser Leu Ile Val Ser Ile Ile Ile Phe 1 5 10 15 Phe Leu Pro Trp Asn Arg Ala Ser Leu Val Ile Pro Pro Cys Gln Arg 20 25 30 Ser Arg Tyr Ala Thr Tyr Phe Asp Val Ala Val Leu Arg Cys Leu Leu 35 40 45 Gln Pro His Trp Ser Glu Glu Gly Thr Gln Trp Ser Leu Met Tyr Tyr 50 55 60 Leu Gln Arg Leu Arg His Met Leu Glu Glu Lys Pro Glu Lys Pro Pro 65 70 75 80 Glu Pro Asp Ile Pro Leu Leu Pro Arg Pro Arg Ser Ser Ser Met Val 85 90 95 Ala Ala Ala Pro Ser Leu Val Asn Thr His Lys Thr Gln Asp Leu Thr 100 105 110 Met Lys Cys Asn Glu Glu Glu Lys Ser Leu Ser Ser Glu Ala Phe Ser 115 120 125 Lys Val Ser Leu Thr Asn Leu Arg Arg Ser Ala Val Pro Asp Leu Ser 130 135 140 Ser Asp Leu Gly Met Asn Ile Phe Lys Lys Phe Lys Ser Arg Lys Glu 145 150 155 160 Asp Arg Glu Arg Lys Gly Ser Ile Pro Phe His His Thr Gly Lys Arg 165 170 175 Arg Pro Arg Arg Met Gly Val Pro Phe Leu Leu His Glu Asp His Leu 180 185 190 Asp Val Ser Pro Thr Arg Ser Thr Phe Ser Phe Gly Ser Phe Ser Gly 195 200 205 Leu Gly Glu Asp Arg Arg Gly Ile Glu Lys Gly Gly Trp Gln Thr Thr 210 215 220 Ile Leu Gly Lys Leu Thr Arg Arg Gly Ser Ser Asp Ala Ala Thr Glu 225 230 235 240 Met Glu Ser Leu Ser Ala Arg His Ser His Ser His His Thr Leu Val 245 250 255 Ser Asp Leu Pro Asp Pro Ser Asn Ser His Gly Glu Asn Thr Val Lys 260 265 270 Glu Val Arg Ser Gln Ile Ser Thr Ile Thr Val Ala Thr Phe Asn Thr 275 280 285 Thr Leu Ala Ser Phe Asn Val Gly Tyr Ala Asp Phe Phe Asn Glu His 290 295 300 Met Arg Lys Leu Cys Asn Gln Val Pro Ile Pro Glu Met Pro His Glu 305 310 315 320 Pro Leu Ala Cys Ala 
Asn Leu Pro Arg Ser Leu Thr Asp Ser Cys Ile 325 330 335 Asn Tyr Ser Tyr Leu Glu Asp Thr Glu His Ile Asp Gly Thr Asn Asn 340 345 350 Phe Val His Lys Asn Gly Met Leu Asp Leu Ser Val Val Leu Lys Ala 355 360 365 Val Tyr Leu Val Leu Asn His Asp Ile Ser Ser Arg Ile Cys Asp Val 370 375 380 Ala Leu Asn Ile Val Glu Cys Leu Leu Gln Leu Gly Val Val Pro Cys 385 390 395 400 Val Glu Lys Asn Arg Lys Lys Ser Glu Asn Lys Glu Asn Glu Thr Leu 405 410 415 Glu Lys Arg Pro Ser Glu Gly Ala Phe Gln Phe Lys Gly Val Ser Gly 420 425 430 Ser Ser Thr Cys Gly Phe Gly Gly Pro Ala Asp Glu Ser Thr Pro Val 435 440 445 Ser Asn His Arg Leu Ala Leu Thr Met Leu Ile Lys Ile Val Lys Ser 450 455 460 Leu Gly Cys Ala Tyr Gly Cys Gly Glu Gly His Arg Gly Leu Ser Gly 465 470 475 480 Asp Arg Leu Arg His Gln Val Phe Arg Glu Asn Ala Gln Asn Cys Leu 485 490 495 Thr Lys Leu Tyr Lys Leu Asp Lys Met Gln Phe Arg Gln Thr Met Arg 500 505 510 Asp Tyr Val Asn Lys Asp Ser Leu Asn Asn Val Val Asp Phe Leu His 515 520 525 Ala Leu Leu Gly Phe Cys Met Glu Pro Val Thr Asp Asn Lys Ala Gly 530 535 540 Phe Gly Asn Asn Phe Thr Thr Val Asp Asn Lys Ser Thr Ala Gln Asn 545 550 555 560 Val Glu Gly Ile Ile Val Ser Ala Met Phe Lys Ser Leu Ile Thr Arg 565 570 575 Cys Ala Ser Thr Thr His Glu Leu His Ser Pro Glu Asn Leu Gly Leu 580 585 590 Tyr Cys Asp Ile Arg Gln Leu Val Gln Phe Ile Lys Glu Ala His Gly 595 600 605 Asn Val Phe Arg Arg Val Ala Leu Ser Ala Leu Leu Asp Ser Ala Glu 610 615 620 Lys Leu Ala Pro Gly Lys Lys Val Glu Glu Asn Glu Gln Glu Ser Lys 625 630 635 640 Pro Ala Gly Ser Lys Ser Asp Glu Gln Met Gln Gly Ala Asn Leu Gly 645 650 655 Arg Lys Asp Phe Trp Arg Lys Met Phe Lys Ser Gln Ser Ala Ala Ser 660 665 670 Asp Thr Ser Ser Gln Ser Glu Gln Asp Thr Ser Glu Cys Thr Thr Ala 675 680 685 His Ser Gly Thr Thr Ser Asp Arg Arg Ala Arg Ser Arg Ser Arg Arg 690 695 700 Ile Ser Leu Arg Lys Lys Leu Lys Leu Pro Ile Gly Asn Trp Leu Lys 705 710 715 720 Arg Ser Ser Leu Ser Gly Leu Ala Asp Gly Val Glu Asp Leu Leu Asp 725 730 735 Ile Ser Ser Val Asp Arg 
Leu Ser Phe Ile Arg Gln Ser Ser Lys Val 740 745 750 Lys Phe Thr Ser Ala Val Lys Leu Ser Glu Gly Gly Pro Gly Ser Gly 755 760 765 Met Glu Asn Gly Arg Asp Glu Glu Glu Asn Phe Phe Lys Arg Leu Gly 770 775 780 Cys His Ser Phe Asp Asp His Leu Ser Pro Asn Gln Asp Gly Gly Lys 785 790 795 800 Ser Lys Asn Val Val Asn Leu Gly Ala Ile Arg Gln Gly Met Lys Arg 805 810 815 Phe Gln Phe Leu Leu Asn Cys Cys Glu Pro Gly Thr Ile Pro Asp Ala 820 825 830 Ser Ile Leu Ala Ala Ala Leu Asp Leu Glu Ala Pro Val Val Ala Arg 835 840 845 Ala Ala Leu Phe Leu Glu Cys Ala Arg Phe Val His Arg Cys Asn Arg 850 855 860 Gly Asn Trp Pro Glu Trp Met Lys Gly His His Val Asn Ile Thr Lys 865 870 875 880 Lys Gly Leu Ser Arg Gly Arg Ser Pro Ile Val Gly Asn Lys Arg Asn 885 890 895 Gln Lys Leu Gln Trp Asn Ala Ala Lys Leu Phe Tyr Gln Trp Gly Asp 900 905 910 Ala Ile Gly Val Arg Leu Asn Glu Leu Cys His Gly Glu Ser Glu Ser 915 920 925 Pro Ala Asn Leu Leu Gly Leu Ile Tyr Asp Glu Glu Thr Lys Arg Arg 930 935 940 Leu Arg Lys Glu Asp Glu Glu Glu Asp Phe Leu Asp Asp Ser Lys Glu 945 950 955 960 Thr Pro Phe Thr Thr Arg Thr Pro Ala Cys Thr Val Asn Pro Ser Lys 965 970 975 Cys Gly Cys Pro Phe Ala Leu Lys Met Ala Ala Cys Gln Leu Leu Leu 980 985 990 Glu Ile Thr Thr Phe Leu Arg Glu Thr Phe Ser Cys Leu Pro Arg Pro 995 1000 1005 Arg Thr Glu Pro Leu Val Asp Leu Glu Ser Cys Arg Leu Arg Leu 1010 1015 1020 Asp Pro Glu Leu Asp Arg His Arg Tyr Glu Arg Lys Ile Ser Phe 1025 1030 1035 Ala Gly Val Leu Asp Glu Asn Glu Asp Ser Lys Asp Ser Leu His 1040 1045 1050 Ser Ser Ser His Thr Leu Lys Ser Asp Ala Gly Val Glu Glu Lys 1055 1060 1065 Lys Val Pro Ser Arg Lys Ile Arg Ile Gly Gly Ser Arg Leu Leu 1070 1075 1080 Gln Ile Lys Gly Thr Arg Ser Phe Gln Val Lys Lys Gly Gly Ser 1085 1090 1095 Leu Ser Ser Ile Arg Arg Val Gly Ser Leu Lys Ser Ser Lys Leu 1100 1105 1110 Ser Arg Gln Asp Ser Glu Ser Glu Ala Glu Glu Leu Gln Leu Ser 1115 1120 1125 Gln Ser Arg Asp Thr Val Thr Asp Leu Glu Gly Ser Pro Trp Ser 1130 1135 1140 Ala Ser Glu Pro Ser Ile Glu Pro Glu Gly 
Met Ser Asn Ala Gly 1145 1150 1155 Ala Glu Glu Asn Tyr His Arg Asn Met Ser Trp Leu His Val Met 1160 1165 1170 Ile Leu Leu Cys Asn Gln Gln Ser Phe Ile Cys Thr His Val Asp 1175 1180 1185 Tyr Cys His Pro His Cys Tyr Leu His His Ser Arg Ser Cys Ala 1190 1195 1200 Arg Leu Val Arg Ala Ile Lys Leu Leu Tyr Gly Asp Ser Val Asp 1205 1210 1215 Ser Leu Arg Glu Ser Ser Asn Ile Ser Ser Val Ala Leu Arg Gly 1220 1225 1230 Lys Lys Gln Lys Glu Cys Ser Asp Lys Ser Cys Leu Arg Thr Pro 1235 1240 1245 Ser Leu Lys Lys Arg Val Ser Asp Ala Asn Leu Glu Gly Lys Lys 1250 1255 1260 Asp Ser Gly Met Leu Lys Tyr Ile Arg Leu Gln Val Leu Leu Pro 1265 1270 1275 Gly Ser Glu Gly Phe Met Glu Leu Leu Thr Gly Arg Gly Leu Gln 1280 1285 1290 Thr Ala Tyr Leu Leu Met Phe Gly Thr Tyr Asn Ile Ser Trp Tyr 1295 1300 1305 Ser Val Gly Ile Lys Pro Leu Gln Leu Val Met Ser Leu Ser Pro 1310 1315 1320 Ala Pro Leu Ser Leu Leu Ile Lys Ala Ala Pro Ile Leu Thr Glu 1325 1330 1335 Glu Met Tyr Gly Asp Ile Gln Pro Ala Ala Trp Glu Leu Leu Leu 1340 1345 1350 Ser Met Asp Glu His Met Ala Gly Ala Ala Val Lys Val Pro Glu 1355 1360 1365 Ala Val Ser Asp Met Leu Met Ser Glu Phe His His Pro Glu Thr 1370 1375 1380 Val Gln Arg Leu Asn Ala Val Leu Lys Phe His Thr Leu Trp Arg 1385 1390 1395 Phe Arg Tyr Gln Val Trp Pro Arg Met Glu Glu Gly Ala Gln Gln 1400 1405 1410 Ile Phe Lys Lys Ser Phe Ser Ala Arg Ala Val Ser Arg Ser His 1415 1420 1425 Gln Arg Ala Glu His Ile Leu Lys Asn Leu Gln Gln Glu Glu Glu 1430 1435 1440 Lys Lys Arg Leu Gly Arg Glu Ala Ser Leu Ile Thr Ala Ile Pro 1445 1450 1455 Ile Thr Gln Glu Ala Cys Tyr Glu Pro Thr Cys Thr Pro Asn Ser 1460 1465 1470 Glu Pro Glu Glu Glu Val Glu Glu Val Thr Asn Leu Ala Ser Arg 1475 1480 1485 Arg Leu Ser Val Ser Pro Ser Cys Thr Ser Ser Thr Ser His Arg 1490 1495 1500 Asn Tyr Ser Phe Arg Arg Gly Ser Val Trp Ser Val Arg Ser Ala 1505 1510 1515 Val Ser Ala Glu Asp Glu Glu His Thr Thr Glu His Thr Pro Asn 1520 1525 1530 His His Val Pro Gln Pro Pro Gln Ala Val Phe Pro Ala Cys Ile 1535 1540 1545 Cys Ala Ala 
Val Leu Pro Ile Val His Leu Met Glu Asp Gly Glu 1550 1555 1560 Val Arg Glu Asp Gly Val Ala Val Ser Ala Val Ala Gln Gln Val 1565 1570 1575 Leu Trp Asn Cys Leu Ile Glu Asp Pro Ser Thr Val Leu Arg His 1580 1585 1590 Phe Leu Glu Lys Leu Thr Ile Ser Asn Arg Gln Asp Glu Leu Met 1595 1600 1605 Tyr Met Leu Arg Lys Leu Leu Leu Asn Ile Gly Asp Phe Pro Ala 1610 1615 1620 Gln Thr Ser His Ile Leu Phe Asn Tyr Leu Val Gly Leu Ile Met 1625 1630 1635 Tyr Phe Val Arg Thr Pro Cys Glu Trp Gly Met Asp Ala Ile Ser 1640 1645 1650 Ala Thr Leu Thr Phe Leu Trp Glu Val Val Gly Tyr Val Glu Gly 1655 1660 1665 Leu Phe Phe Lys Asp Leu Lys Gln Thr Met Lys Lys Glu Gln Cys 1670 1675 1680 Glu Val Lys Leu Leu Val Thr Ala Ser Met Pro Gly Thr Lys Thr 1685 1690 1695 Leu Val Val His Gly Gln Asn Glu Cys Asp Ile Pro Thr Gln Leu 1700 1705 1710 Pro Val His Glu Asp Thr Gln Phe Glu Ala Leu Leu Lys Glu Cys 1715 1720 1725 Leu Glu Phe Phe Asn Ile Pro Glu Ser Gln Ser Thr His Tyr Phe 1730 1735 1740 Leu Met Asp Lys Arg Trp Asn Leu Ile His Tyr Asn Lys Thr Tyr 1745 1750 1755 Val Arg Asp Ile Tyr Pro Phe Arg Arg Ser Val Ser Pro Gln Leu 1760 1765 1770 Asn Leu Val His Met His Pro Glu Lys Gly Gln Glu Leu Ile Gln 1775 1780 1785 Lys Gln Val Phe Thr Arg Lys Leu Glu Glu Val Gly Arg Val Leu 1790 1795 1800 Phe Leu Ile Ser Leu Thr Gln Lys Ile Pro Thr Ala His Lys Gln 1805 1810 1815 Ser His Val Ser Met Leu Gln Glu Asp Leu Leu Arg Leu Pro Ser 1820 1825 1830 Phe Pro Arg Ser Ala Ile Asp Ala Glu Phe Ser Leu Phe Ser Asp 1835 1840 1845 Pro Gln Ala Gly Lys Glu Leu Phe Gly Leu Asp Thr Leu Gln Lys 1850 1855 1860 Ser Leu Trp Ile Gln Leu Leu Glu Glu Met Phe Leu Gly Met Pro 1865 1870 1875 Ser Glu Phe Pro Trp Gly Asp Glu Ile Met Leu Phe Leu Asn Val 1880 1885 1890 Phe Asn Gly Ala Leu Ile Leu His Pro Glu Asp Ser Ala Leu Leu 1895 1900 1905 Arg Gln Tyr Ala Ala Thr Val Ile Asn Thr Ala Val His Phe Asn 1910 1915 1920 His Leu Phe Ser Leu Ser Gly Tyr Gln Trp Ile Leu Pro Thr Met 1925 1930 1935 Leu Gln Val Tyr Ser Asp Tyr Glu Ser Asn Pro Gln Leu Arg Gln 
1940 1945 1950 Ala Ile Glu Phe Ala Cys His Gln Phe Tyr Ile Leu His Arg Lys 1955 1960 1965 Pro Phe Val Leu Gln Leu Phe Ala Ser Val Ala Pro Leu Leu Glu 1970 1975 1980 Phe Pro Asp Ala Ala Asn Asn Gly Pro Ser Lys Gly Val Ser Ala 1985 1990 1995 Gln Cys Leu Phe Asp Leu Leu Gln Ser Leu Glu Gly Glu Thr Thr 2000 2005 2010 Asp Ile Leu Asp Ile Leu Glu Leu Val Lys Ala Glu Lys Pro Leu 2015 2020 2025 Lys Ser Leu Asp Phe Cys Tyr Gly Asn Glu Asp Leu Thr Phe Ser 2030 2035 2040 Ile Ser Glu Ala Ile Lys Leu Cys Val Thr Val Val Ala Tyr Ala 2045 2050 2055 Pro Glu Ser Phe Arg Ser Leu Gln Met Leu Met Val Leu Glu Ala 2060 2065 2070 Leu Val Pro Cys Tyr Leu Gln Lys Leu Lys Arg Gln Thr Ser Gln 2075 2080
2085 Val Glu Thr Val Pro Ala Ala Arg Glu Glu Ile Ala Ala Thr Ala 2090 2095 2100 Ala Leu Ala Thr Ser Leu Gln Ala Leu Leu Tyr Ser Val Glu Val 2105 2110 2115 Leu Thr Arg Glu Asn Leu His Leu Leu Glu Glu Gly Gln Gly Ile 2120 2125 2130 Pro Arg Glu Glu Leu Asp Glu Arg Ile Ala Arg Glu Glu Phe Arg 2135 2140 2145 Arg Pro Arg Glu Ser Leu Leu Asn Ile Cys Thr Glu Phe Tyr Lys 2150 2155 2160 His Cys Gly Pro Arg Leu Lys Ile Leu Gln Asn Leu Ala Gly Glu 2165 2170 2175 Pro Arg Val Ile Ala Leu Glu Leu Leu Asp Val Lys Ser His Met 2180 2185 2190 Ser Val Leu Gly Lys Gly Pro Arg Ile Thr Ser Leu Cys Thr Arg 2195 2200 2205 Ile Ser Ser Ser Tyr Arg Asp Ala Ile Ser Leu Glu Ile His Ala 2210 2215 2220 Lys Gly Arg Ile Cys Val Ser Lys Gly Thr 2225 2230 17 3962 DNA Gallus gallus misc_feature (832)..(846) n is a, c, g, or t 17 gaattcggca cgaggatcac ccacgtcata gtactcgggg acaacttcaa ctgtgagcac 60 aaacatcact ttgtcatgtg tagaaggcta cactctggtg ggagcaagca catccacgtg 120 caaggagagt ggcgtttgga tgccagagtt ttctgatgac atttgcattc ctgtgtcatg 180 tgggatccca gaatctccag agcacggatt tgtggttggc accaaattca gttacaaaga 240 tgtggttctt tataaatgtg atcctggcta cgaactacaa ggtgatacag aacggacttg 300 ccaagaagac aagctttgga gtggctcagt gccaacatgc agaagagtat cttgtgggcc 360 cccagaggtg atcgaaaatg gatctgttca aggagaagag ttcctgtttg gcagcgaggc 420 tttttacagc tgtgaccctg gtttcgaact gcagggacca agccgaagaa tttgccacgt 480 tgacaagaag tggagcccct ctgctcctgt gtgtaggcga attacttgcg ggctgcctcc 540 ttcaatagaa aaagcagagg ccatttctac aggaaacaca tacaaaagta atgtaacctt 600 tgtgtgcagc tctggttacc accttgttgg accgcagaat atcacatgtc ttgccaatgg 660 gagctggagt aagccattac cactgtgtga agagaccaga tgcaaactgc cactttcttt 720 gctgaatggg aaggcaattt atgaaaataa tacagttggc agtactgtag catatttctg 780 caagagcgga tacagtttgg aaggagaacc tacagcagag tgcacaaggg annnnnnnnn 840 nnnnnntcct ttgcctctct gtaaaccaaa cccttgtccc gtgcctttca taatcccaga 900 gaatgccctt ctctctgagg tggattttta cgtcgggcag aatgtgtcca tcaggtgcag 960 ggaaggctac cagttgaaag ggcaggctgt gatcacttgt aatgctgatg agacttggac 1020 
tccaacaaca gctaagtgtg aaaagatatc ttgcgggccc ccagctcaca tagagaatgc 1080 tttcatccgt ggtagcttct atcagtatgg agatatgatc acctactcat gctacagtgg 1140 ttatatgctg gagggacccc tgcggagcat ttgcttagaa aatggaacgt ggacaacacc 1200 acctacatgc aaagctgtct gtcggttccc atgtcagaat ggtggagtct gtgagcgacc 1260 aaatgcctgc tcgtgtccag atggctggat gggtcgtctc tgtgaagagc caatatgcat 1320 tttgccatgt ctcaatggag gtcgctgtgt ggctccttac aagtgtgact gcccccctgg 1380 atggactgga tcgcggtgcc atacagctgt ttgccagtca ccttgcttaa atggtgggaa 1440 gtgcatacga ccaaatcgat gttactgtcc ctcatcatgg actggacatg attgctcaag 1500 aaaacggaag gctggattct accacttcta acagcagagc aacagtttta cactcagaaa 1560 cctttcttca gcctagacag cggggctcag aatctaatgc attgtaaatc acatccattg 1620 cttcccttcc ccccacctcc tttgttttgt attttatttt gtgatatatt ttttctatac 1680 ctttcaattt ttaaagaaaa cctctgtatt ttccatttac aaaagtatta tcaaatatat 1740 gctgctatat acacaccata cacatacaaa agtgaagatc cctactgttc actgagaaag 1800 tggctgtgta cggtgaagtc cctcccattt cttacacccg gtaagctaat taaaacatgc 1860 tatactgcca gccatgatta aacmsamtgy kkcmgttctg cttatcatct gccaaagcat 1920 actgaaatcc agcaacttaa tggtaaggaa taattatgta aagctaattg aaccaccgaa 1980 ctttgcattg ggcttgtgtc atggttgtat aaattagaag tacatctgat aaagtcccaa 2040 ttgtagccag agttcctggt ggacgtaagt agattctgta atgttcatta tgtgacatta 2100 acgtcattgg aaagcgactt agatggaagg cagtggcaag aattttagcc atcagtaaaa 2160 tactcaaaag catgaaagag ttgagacaat gtctaggcaa taacagcctc tgaggatttt 2220 tggcatacag gcatttcagg tgtcatgatc agtctggata atccagaatg cagcagcgga 2280 cagcacagac cactgaaaac ttccccctgg taatggaact caccactact tgcctgcaac 2340 cagtagccct ttcctgtgtg atgatcaaat acacatccaa catcctcctg ccaggcaaat 2400 gtttttgaga catggggttt gggtcccaat gttttggccc tgcagtaggg agagaaggtg 2460 aagctttgct gtttgcttgc agaagagtgg tatttatgtt atgctgaacc ctcagagaac 2520 tggaaaaggc ctctcttgtg tacatgcaca ggcagaaata cctagctgag taagaaatgc 2580 tgagagcaca catgctgtcc gatttctctt tcgcacattg ttgatcccag tgcatctgag 2640 agtcacacat ggttgagtgc catcattcag ttgtgctcta atgagctgag atgctgagat 2700 ttaccgatgg 
gtacgtggtg tggcggaatt acaaggtgga aatcccagtc atgtgctgag 2760 gtcaaatgtt tgctaattat catcagatag taatgaagtc tagtctgtga aagaagattt 2820 tagagtgaga accattgatc gggagctcca tttttcccag tagcagcaga aaagcatgac 2880 tgtcagccca cactaggaaa gaagaaggaa tatgctctac actctgcagc attactgcgt 2940 agttaccctc ggggtcatga gcgtgcacac gctgccccca cctcccccct tccctcttta 3000 taaatataca ttccctttat gaatgcatga taggacaata aaaggagcta atggagggac 3060 tagggcgcta gtgaagactg acacatagct aatggctgtt aacccaagac cagaaatggg 3120 gaacaaacaa gtgaagctgt gaaccaggaa aagctggaag aaaaacaaac aggtgaagaa 3180 tatttgtcaa gggacgagct gaattcgaat gcagattcct tcccactggg agctgcaacc 3240 ggctgaagag ttgttctttc aactcccgta aatatatttt ttctgatgga ttctgctgac 3300 atgtaccaac agccatcagt gtttacagct ttggttcaag ttagcattca gtaaataata 3360 acacgtttca acccacggtc actgccatgt gtaggcactt tgttccctga ctcctgctgc 3420 tgtgcacagt ggggtgtaca gatgctgtag tgagcagctc gggatacctg aagggaaaga 3480 gtgcatcagt gggagaagtg gatttttatt tatatgtcat tctcatcttt tacaaagtag 3540 tcccattttc agtgtgcttc tctggtacgt gccctcacag ccctggcaat ctccagagca 3600 gagcagcagt gctttggaag gcgagcaggg ctggcaggag actgctgagc cttgggggcg 3660 agggccggct tttagcactg cagcttcaca ctagtgacta gtacatggag tttggggata 3720 tactcagtca atacgtttca taagctgatg tggtagaaag agtagctgaa actataggct 3780 gttatattag tgctgtgtat gatgctttga tacttgctgg aatattatcc cttccccatt 3840 ctgtgcggta ttgtcattta tgtcactgct tgttgtgtgt tttaaaggac ttctgtgtga 3900 tgcactttac actgtaaata aagttgcacc ctgtttagta ccwaaaaaaa aaaaaaaaaa 3960 aa 3962 18 499 PRT Gallus gallus 18 Tyr Ser Gly Thr Thr Ser Thr Val Ser Thr Asn Ile Thr Leu Ser Cys 1 5 10 15 Val Glu Gly Tyr Thr Leu Val Gly Ala Ser Thr Ser Thr Cys Lys Glu 20 25 30 Ser Gly Val Trp Met Pro Glu Phe Ser Asp Asp Ile Cys Ile Pro Val 35 40 45 Ser Cys Gly Ile Pro Glu Ser Pro Glu His Gly Phe Val Val Gly Thr 50 55 60 Lys Phe Ser Tyr Lys Asp Val Val Leu Tyr Lys Cys Asp Pro Gly Tyr 65 70 75 80 Glu Leu Gln Gly Asp Thr Glu Arg Thr Cys Gln Glu Asp Lys Leu Trp 85 90 95 Ser Gly Ser Val Pro Thr Cys Arg Arg Val 
Ser Cys Gly Pro Pro Glu 100 105 110 Val Ile Glu Asn Gly Ser Val Gln Gly Glu Glu Phe Leu Phe Gly Ser 115 120 125 Glu Ala Phe Tyr Ser Cys Asp Pro Gly Phe Glu Leu Gln Gly Pro Ser 130 135 140 Arg Arg Ile Cys His Val Asp Lys Lys Trp Ser Pro Ser Ala Pro Val 145 150 155 160 Cys Arg Arg Ile Thr Cys Gly Leu Pro Pro Ser Ile Glu Lys Ala Glu 165 170 175 Ala Ile Ser Thr Gly Asn Thr Tyr Lys Ser Asn Val Thr Phe Val Cys 180 185 190 Ser Ser Gly Tyr His Leu Val Gly Pro Gln Asn Ile Thr Cys Leu Ala 195 200 205 Asn Gly Ser Trp Ser Lys Pro Leu Pro Leu Cys Glu Glu Thr Arg Cys 210 215 220 Lys Leu Pro Leu Ser Leu Leu Asn Gly Lys Ala Ile Tyr Glu Asn Asn 225 230 235 240 Thr Val Gly Ser Thr Val Ala Tyr Phe Cys Lys Ser Gly Tyr Ser Leu 245 250 255 Glu Gly Glu Pro Thr Ala Glu Cys Thr Arg Asn Asn Asn Asn Asn Asn 260 265 270 Pro Leu Pro Leu Cys Lys Pro Asn Pro Cys Pro Val Pro Phe Ile Ile 275 280 285 Pro Glu Asn Ala Leu Leu Ser Glu Val Asp Phe Tyr Val Gly Gln Asn 290 295 300 Val Ser Ile Arg Cys Arg Glu Gly Tyr Gln Leu Lys Gly Gln Ala Val 305 310 315 320 Ile Thr Cys Asn Ala Asp Glu Thr Trp Thr Pro Thr Thr Ala Lys Cys 325 330 335 Glu Lys Ile Ser Cys Gly Pro Pro Ala His Ile Glu Asn Ala Phe Ile 340 345 350 Arg Gly Ser Phe Tyr Gln Tyr Gly Asp Met Ile Thr Tyr Ser Cys Tyr 355 360 365 Ser Gly Tyr Met Leu Glu Gly Pro Leu Arg Ser Ile Cys Leu Glu Asn 370 375 380 Gly Thr Trp Thr Thr Pro Pro Thr Cys Lys Ala Val Cys Arg Phe Pro 385 390 395 400 Cys Gln Asn Gly Gly Val Cys Glu Arg Pro Asn Ala Cys Ser Cys Pro 405 410 415 Asp Gly Trp Met Gly Arg Leu Cys Glu Glu Pro Ile Cys Ile Leu Pro 420 425 430 Cys Leu Asn Gly Gly Arg Cys Val Ala Pro Tyr Lys Cys Asp Cys Pro 435 440 445 Pro Gly Trp Thr Gly Ser Arg Cys His Thr Ala Val Cys Gln Ser Pro 450 455 460 Cys Leu Asn Gly Gly Lys Cys Ile Arg Pro Asn Arg Cys Tyr Cys Pro 465 470 475 480 Ser Ser Trp Thr Gly His Asp Cys Ser Arg Lys Arg Lys Ala Gly Phe 485 490 495 Tyr His Phe 19 1969 DNA Homo sapiens 19 tatgaatgca cagcttgccc atcggggaca tacaaacctg aagcctcacc aggaggaatc 60 agcagttgca 
ttccatgtcc cgatgaaaat cacacctctc cacctggaag cacatcccct 120 gaagactgtg tctgcagaga gggatacagg gcatctggcc agacctgtga acttgtccac 180 tgccctgccc tgaagcctcc cgaaaatggt tactttatcc aaaacacttg caacaaccac 240 ttcaatgcag cctgtggggt ccgatgtcac cctggatttg atcttgtggg aagcagcatc 300 atcttatgtc tacccaatgg tttgtggtcc ggttcagaga gctactgcag agtaagaaca 360 tgtcctcatc tccgccagcc gaaacatggc cacatcagct gttctacaag ggaaatgtta 420 tataagacaa catgtttggt tgcctgtgat gaagggtaca gactagaagg cagtgataag 480 cttacttgtc aaggaaacag ccagtgggat gggccagaac cccggtgtgt ggagcgccac 540 tgttccacct ttcagatgcc caaagatgtc atcatatccc cccacaactg tggcaagcag 600 ccagccaaat ttgggacgat ctgctatgta agttgccgcc aagggttcat tttatctgga 660 gtcaaagaaa tgctgagatg taccacttct ggaaaatgga atgtcggagt tcaggcagct 720 gtgtgtaaag acgtggaggc tcctcaaatc aactgtccta aggacataga ggctaagact 780 ctggaacagc aagattctgc caatgttacc tggcagattc caacagctaa agacaactct 840 ggtgaaaagg tgtcagtccg cgttcatcca gctttcaccc caccttacct tttcccaatt 900 ggagatgttg ctatcgtata cacggcaact gacctatccg gcaaccaggc cagctgcatt 960 ttccatatca aggttattga tgcagaacca cctgtcatag actggtgcag atctccacct 1020 cccgtccagg tctcggagaa ggtacatgcc gcaagctggg atgagcctca gttctcagac 1080 aactcagggg ctgaattggt cattaccaga agtcatacac aaggagacct tttccctcaa 1140 ggggagacta tagtacagta tacagccact gacccctcag gcaataacag gacatgtgat 1200 atccatattg tcataaaagg ttctccctgt gaaatcccat tcacacctgt aaatggggat 1260 tttatatgca ctccagataa tactggagtc aactgtacat taacttgctt ggagggctat 1320 gatttcacag aagggtctac tgacaagtat tattgtgctt atgaagatgg cgtctggaaa 1380 ccaacatata ccactgaatg gccagactgt gccaaaaaac gttttgcaaa ccacgggttc 1440 aagtcctttg agatgttcta caaagcagct cgttgtgatg acacagatct gatgaagaag 1500 ttttctgaag cattggagac gaccctggga aaaatggtcc catcattttg tagtgatgca 1560 gaggacattg actgcagact ggaggagaac ctgaccaaaa aatattgcct agaatataat 1620 tatgactatg aaaatggctt tgcaattggt aattaaattc tgtggcatcg gtagttggca 1680 agactaatct gcaaaataag aataattcca gaaaagtgag gcaaactaga aacattaact 1740 tctattaatt tattcatcaa gtattttagg 
atggctaaat aatttgataa tgtgctgaaa 1800 gatcattaag gttatatcaa attttagtaa caaataaatt atttaaaatt atttgccagg 1860 attcttaaaa atgacaaaaa ctaagaaaac taagtcacat atgctggtaa aattcaaatg 1920 ttgatgtatc ctaaaagaga atagtaataa agtcctaaca gcaactttt 1969 20 413 PRT Homo sapiens 20 Met Leu Tyr Lys Thr Thr Cys Leu Val Ala Cys Asp Glu Gly Tyr Arg 1 5 10 15 Leu Glu Gly Ser Asp Lys Leu Thr Cys Gln Gly Asn Ser Gln Trp Asp 20 25 30 Gly Pro Glu Pro Arg Cys Val Glu Arg His Cys Ser Thr Phe Gln Met 35 40 45 Pro Lys Asp Val Ile Ile Ser Pro His Asn Cys Gly Lys Gln Pro Ala 50 55 60 Lys Phe Gly Thr Ile Cys Tyr Val Ser Cys Arg Gln Gly Phe Ile Leu 65 70 75 80 Ser Gly Val Lys Glu Met Leu Arg Cys Thr Thr Ser Gly Lys Trp Asn 85 90 95 Val Gly Val Gln Ala Ala Val Cys Lys Asp Val Glu Ala Pro Gln Ile 100 105 110 Asn Cys Pro Lys Asp Ile Glu Ala Lys Thr Leu Glu Gln Gln Asp Ser 115 120 125 Ala Asn Val Thr Trp Gln Ile Pro Thr Ala Lys Asp Asn Ser Gly Glu 130 135 140 Lys Val Ser Val Arg Val His Pro Ala Phe Thr Pro Pro Tyr Leu Phe 145 150 155 160 Pro Ile Gly Asp Val Ala Ile Val Tyr Thr Ala Thr Asp Leu Ser Gly 165 170 175 Asn Gln Ala Ser Cys Ile Phe His Ile Lys Val Ile Asp Ala Glu Pro 180 185 190 Pro Val Ile Asp Trp Cys Arg Ser Pro Pro Pro Val Gln Val Ser Glu 195 200 205 Lys Val His Ala Ala Ser Trp Asp Glu Pro Gln Phe Ser Asp Asn Ser 210 215 220 Gly Ala Glu Leu Val Ile Thr Arg Ser His Thr Gln Gly Asp Leu Phe 225 230 235 240 Pro Gln Gly Glu Thr Ile Val Gln Tyr Thr Ala Thr Asp Pro Ser Gly 245 250 255 Asn Asn Arg Thr Cys Asp Ile His Ile Val Ile Lys Gly Ser Pro Cys 260 265 270 Glu Ile Pro Phe Thr Pro Val Asn Gly Asp Phe Ile Cys Thr Pro Asp 275 280 285 Asn Thr Gly Val Asn Cys Thr Leu Thr Cys Leu Glu Gly Tyr Asp Phe 290 295 300 Thr Glu Gly Ser Thr Asp Lys Tyr Tyr Cys Ala Tyr Glu Asp Gly Val 305 310 315 320 Trp Lys Pro Thr Tyr Thr Thr Glu Trp Pro Asp Cys Ala Lys Lys Arg 325 330 335 Phe Ala Asn His Gly Phe Lys Ser Phe Glu Met Phe Tyr Lys Ala Ala 340 345 350 Arg Cys Asp Asp Thr Asp Leu Met Lys Lys Phe Ser Glu Ala Leu Glu 
355 360 365 Thr Thr Leu Gly Lys Met Val Pro Ser Phe Cys Ser Asp Ala Glu Asp 370 375 380 Ile Asp Cys Arg Leu Glu Glu Asn Leu Thr Lys Lys Tyr Cys Leu Glu 385 390 395 400 Tyr Asn Tyr Asp Tyr Glu Asn Gly Phe Ala Ile Gly Asn 405 410 21 2693 DNA Gallus gallus misc_feature (2651)..(2651) n is a, c, g, or t 21 cttaagtaga ggagactgtt gcactaatta ccaggtcgtt tgcaaaggag aaacccactg 60 ggtcgatgat gactgtgaag agataaaaac tcctgaatgt ccagcaggct ttgttcgtcc 120 tcctttgatc atcttctctg ttgatggttt ccgtgcatca tatatgaaga aagggaacaa 180 ggtcatgccc aatattgaaa agctgagatc ttgtggaaca cattctcctt acatgaggcc 240 ggtctaccct acaaaaacct tccccaactt gtacaccctt gctactggac tctatcctga 300 atcacatgga atcgttggca attcaatgta tgacccagtg tttgatgcca gcttcagtct 360 tcgagggcga gagaaattca atcacagatg gtggggaggt caaccaattt ggattactgc 420 agccaagcaa ggggtgaaag ctggcacatt cttctggtct gttgtcatcc cccacgagcg 480 tagaatacta acaatactgc agtggctgac ccttccggat aacgaaaggc cttatgttta 540 tgctttctac tctgagcaac cagatgctgc tggccacaga tatggtcctt tcaactcaga 600 gatgatggta aatcccctga gagagattga caagacagta ggacaactaa tggatggact 660 gaaacagctg aaactgcatc gatgtgtcaa tgtcatattt gttggtgatc atgggatgga 720 agatactact tgtgaaagaa ctgaattttt gagcaactac ctgaccaacg tggaagatat 780 cattctgctg cctggatctt tagggagaat tcgccctagg tctagcaata acctgaaata 840 tgaccccaaa gtgattgttg ccaaccttac atgcaggaag ccagaccagc actttaagcc 900 atacttgaag catcaccttt ctaaacgctt gcactatgct tacaataggc gaattgagga 960 tgtccattta ctggttgagc gcaagtggca tgtagcaagg aaagctgtgg atgtttacaa 1020 gaaaccaaca ggaaagtgtt tcttccatgg agaccatggc tatgacaaca agataaacag 1080 catgcagact gtcttcatag gttatggacc tacattcaaa tacaagacca aagtaccgcc 1140 ttttgaaaac attgaacttt acaatgtcat gtgtgatctg cttggattaa agcctgctcc 1200 caataatggt acccacggaa gtttgaatca cctgctaaga gccaatgttt ataaaccaac 1260 tgtgccagat gaagttgcta agccacttta tcctgtagca ctaccttctg catcagattt 1320 tgatatagga tgtacatgtg atgataagaa caagttggat gaactcaaca agcgctttca 1380 tgtcaaggga acggaagaga agcatcttct gtacgggcgc cctgcagtgc tgtaccgcac 1440 
gaagtacaat atcttgcacc accatgactt tgaaagtggc tacagtgaaa cattcctgat 1500 gcctctctgg acatcctaca ctatttccaa acaggcagag gtatccggtg tcccagaaca 1560 cctggccagc tgcgtcaggc ccgatctccg catatctcca ggaaacagcc agagctgctc 1620 agcctacaga ggtgacaagc agctctccta cagcttcctc ttccctcctc aactaagttc 1680 ctctgcagaa gcaaagtatg atgcttttct aataacaaat atcattccaa tgtatcctgc 1740 tttcaaaaag gtatggaact atttccaaag ggttttagtg aagagatatg ccactgaacg 1800 aaatggagtc aatgttataa gtggaccaat ctttgactat gactatgatg gtttacatga 1860 cacacctgaa aaaatcaaac agtttgtgga aggcagtgcc atccctgttc ctactcatta 1920 ctatgccatc ataaccagct gtttagattt cactcagcca gccgacaagt gtgatggacc 1980 actctctgtt ctctcgtaca tccttcccca ccggcctgac aacgatgaga gctgcaatag 2040 catggaagat gaatcaaagt gggttgaaga tcttcttaag atgcacactg cacgggtgcg 2100 ggacattgag cagctcacaa gcttggactt cttccgaaag acgagtcgca gctacacaga 2160 aatcctctcc ctaaagacat acctgcatac atttgaaagt gaaatttagc tttctaacct 2220 tgctcagtgc attcttttat caactggtgt atatttttat attggtttta tatttattaa 2280 tttgaaacca ggacattaaa aatattagta ttttaatctt gtatcaaatc ttaaatatta 2340
aacccttgtg tcatttgttt tgtttctcta atgtttaata taggtatgtc tcttggttta 2400 tttagtagcg cttgtaatac tgcagcttaa gtccttactc caagctttta tctggtgctg 2460 cagaatttga tacgtgattc gaggaaatat taatttccca tgctccttta ccacactttt 2520 agtcctgtac tgtgtatcaa aatactgaac atgtaaaatt acattcattt actgttgact 2580 atgtgacaga catatttaaa ccctatagac aaatagcatc ttaaatataa taaaccacac 2640 attcagtttt naaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaa 2693 22 735 PRT Gallus gallus 22 Leu Ser Arg Gly Asp Cys Cys Thr Asn Tyr Gln Val Val Cys Lys Gly 1 5 10 15 Glu Thr His Trp Val Asp Asp Asp Cys Glu Glu Ile Lys Thr Pro Glu 20 25 30 Cys Pro Ala Gly Phe Val Arg Pro Pro Leu Ile Ile Phe Ser Val Asp 35 40 45 Gly Phe Arg Ala Ser Tyr Met Lys Lys Gly Asn Lys Val Met Pro Asn 50 55 60 Ile Glu Lys Leu Arg Ser Cys Gly Thr His Ser Pro Tyr Met Arg Pro 65 70 75 80 Val Tyr Pro Thr Lys Thr Phe Pro Asn Leu Tyr Thr Leu Ala Thr Gly 85 90 95 Leu Tyr Pro Glu Ser His Gly Ile Val Gly Asn Ser Met Tyr Asp Pro 100 105 110 Val Phe Asp Ala Ser Phe Ser Leu Arg Gly Arg Glu Lys Phe Asn His 115 120 125 Arg Trp Trp Gly Gly Gln Pro Ile Trp Ile Thr Ala Ala Lys Gln Gly 130 135 140 Val Lys Ala Gly Thr Phe Phe Trp Ser Val Val Ile Pro His Glu Arg 145 150 155 160 Arg Ile Leu Thr Ile Leu Gln Trp Leu Thr Leu Pro Asp Asn Glu Arg 165 170 175 Pro Tyr Val Tyr Ala Phe Tyr Ser Glu Gln Pro Asp Ala Ala Gly His 180 185 190 Arg Tyr Gly Pro Phe Asn Ser Glu Met Met Val Asn Pro Leu Arg Glu 195 200 205 Ile Asp Lys Thr Val Gly Gln Leu Met Asp Gly Leu Lys Gln Leu Lys 210 215 220 Leu His Arg Cys Val Asn Val Ile Phe Val Gly Asp His Gly Met Glu 225 230 235 240 Asp Thr Thr Cys Glu Arg Thr Glu Phe Leu Ser Asn Tyr Leu Thr Asn 245 250 255 Val Glu Asp Ile Ile Leu Leu Pro Gly Ser Leu Gly Arg Ile Arg Pro 260 265 270 Arg Ser Ser Asn Asn Leu Lys Tyr Asp Pro Lys Val Ile Val Ala Asn 275 280 285 Leu Thr Cys Arg Lys Pro Asp Gln His Phe Lys Pro Tyr Leu Lys His 290 295 300 His Leu Ser Lys Arg Leu His Tyr Ala Tyr Asn Arg Arg Ile Glu Asp 305 310 315 320 Val His Leu Leu Val Glu Arg Lys 
Trp His Val Ala Arg Lys Ala Val 325 330 335 Asp Val Tyr Lys Lys Pro Thr Gly Lys Cys Phe Phe His Gly Asp His 340 345 350 Gly Tyr Asp Asn Lys Ile Asn Ser Met Gln Thr Val Phe Ile Gly Tyr 355 360 365 Gly Pro Thr Phe Lys Tyr Lys Thr Lys Val Pro Pro Phe Glu Asn Ile 370 375 380 Glu Leu Tyr Asn Val Met Cys Asp Leu Leu Gly Leu Lys Pro Ala Pro 385 390 395 400 Asn Asn Gly Thr His Gly Ser Leu Asn His Leu Leu Arg Ala Asn Val 405 410 415 Tyr Lys Pro Thr Val Pro Asp Glu Val Ala Lys Pro Leu Tyr Pro Val 420 425 430 Ala Leu Pro Ser Ala Ser Asp Phe Asp Ile Gly Cys Thr Cys Asp Asp 435 440 445 Lys Asn Lys Leu Asp Glu Leu Asn Lys Arg Phe His Val Lys Gly Thr 450 455 460 Glu Glu Lys His Leu Leu Tyr Gly Arg Pro Ala Val Leu Tyr Arg Thr 465 470 475 480 Lys Tyr Asn Ile Leu His His His Asp Phe Glu Ser Gly Tyr Ser Glu 485 490 495 Thr Phe Leu Met Pro Leu Trp Thr Ser Tyr Thr Ile Ser Lys Gln Ala 500 505 510 Glu Val Ser Gly Val Pro Glu His Leu Ala Ser Cys Val Arg Pro Asp 515 520 525 Leu Arg Ile Ser Pro Gly Asn Ser Gln Ser Cys Ser Ala Tyr Arg Gly 530 535 540 Asp Lys Gln Leu Ser Tyr Ser Phe Leu Phe Pro Pro Gln Leu Ser Ser 545 550 555 560 Ser Ala Glu Ala Lys Tyr Asp Ala Phe Leu Ile Thr Asn Ile Ile Pro 565 570 575 Met Tyr Pro Ala Phe Lys Lys Val Trp Asn Tyr Phe Gln Arg Val Leu 580 585 590 Val Lys Arg Tyr Ala Thr Glu Arg Asn Gly Val Asn Val Ile Ser Gly 595 600 605 Pro Ile Phe Asp Tyr Asp Tyr Asp Gly Leu His Asp Thr Pro Glu Lys 610 615 620 Ile Lys Gln Phe Val Glu Gly Ser Ala Ile Pro Val Pro Thr His Tyr 625 630 635 640 Tyr Ala Ile Ile Thr Ser Cys Leu Asp Phe Thr Gln Pro Ala Asp Lys 645 650 655 Cys Asp Gly Pro Leu Ser Val Leu Ser Tyr Ile Leu Pro His Arg Pro 660 665 670 Asp Asn Asp Glu Ser Cys Asn Ser Met Glu Asp Glu Ser Lys Trp Val 675 680 685 Glu Asp Leu Leu Lys Met His Thr Ala Arg Val Arg Asp Ile Glu Gln 690 695 700 Leu Thr Ser Leu Asp Phe Phe Arg Lys Thr Ser Arg Ser Tyr Thr Glu 705 710 715 720 Ile Leu Ser Leu Lys Thr Tyr Leu His Thr Phe Glu Ser Glu Ile 725 730 735 23 3110 DNA Homo sapiens 23 agtgcactcc 
gtgaaggcaa agagaacacg ctgcaaaagg ctttccaata atcctcgaca 60 tggcaaggag gagctcgttc cagtcgtgtc agataatatc cctgttcact tttgccgttg 120 gagtcaatat ctgcttagga ttcactgcac atcgaattaa gagagcagaa ggatgggagg 180 aaggtcctcc tacagtgcta tcagactccc cctggaccaa catctccgga tcttgcaagg 240 gcaggtgctt tgaacttcaa gaggctggac ctcctgattg tcgctgtgac aacttgtgta 300 agagctatac cagttgctgc catgactttg atgagctgtg tttgaagaca gcccgtgcgt 360 gggagtgtac taaggacaga tgtggggaag tcagaaatga agaaaatgcc tgtcactgct 420 cagaggactg cttggccagg ggagactgct gtaccaatta ccaagtggtt tgcaaaggag 480 agtcgcattg ggttgatgat gactgtgagg aaataaaggc cgcagaatgc cctgcagggt 540 ttgttcgccc tccattaatc atcttctccg tggatggctt ccgtgcatca tacatgaaga 600 aaggcagcaa agtcatgcct aatattgaaa aactaaggtc ttgtggcaca cactctccct 660 acatgaggcc ggtgtaccca actaaaacct ttcctaactt atacactttg gccactgggc 720 tatatccaga atcacatgga attgttggca attcaatgta tgatcctgta tttgatgcca 780 cttttcatct gcgagggcga gagaaattta atcatagatg gtggggaggt caaccgctat 840 ggattacagc caccaagcaa ggggtgaaag ctggaacatt cttttggtct gttgtcatcc 900 ctcacgagcg gagaatatta accatattgc agtggctcac cctgccagat catgagaggc 960 cttcggtcta tgccttctat tctgagcaac ctgatttctc tggacacaaa tatggccctt 1020 tcggccctga gatgacaaat cctctgaggg aaatcgacaa aattgtgggg caattaatgg 1080 atggactgaa acaactaaaa ctgcatcggt gtgtcaacgt catctttgtc ggagaccatg 1140 gaatggaaga tgtcacatgt gatagaactg agttcttgag taattaccta actaatgtgg 1200 atgatattac tttagtgcct ggaactctag gaagaattcg atccaaattt agcaacaatg 1260 ctaaatatga ccccaaagcc attattgcca atctcacgtg taaaaaacca gatcagcact 1320 ttaagcctta cttgaaacag caccttccca aacgtttgca ctatgccaac aacagaagaa 1380 ttgaggatat ccatttattg gtggaacgca gatggcatgt tgcaaggaaa cctttggatg 1440 tttataagaa accatcagga aaatgctttt tccagggaga ccacggattt gataacaagg 1500 tcaacagcat gcagactgtt tttgtaggtt atggcccaac atttaagtac aagactaaag 1560 tgcctccatt tgaaaacatt gaactttaca atgttatgtg tgatctcctg ggattgaagc 1620 cagctcctaa taatgggacc catggaagtt tgaatcatct cctgcgcact aataccttca 1680 ggccaaccat gccagaggaa gttaccagac 
ccaattatcc agggattatg taccttcagt 1740 ctgattttga cctgggctgc acttgtgatg ataaggtaga gccaaagaac aagttggatg 1800 aactcaacaa acggcttcat acaaaagggt ctacagaaga gagacacctc ctctatgggc 1860 gacctgcagt gctttatcgg actagatatg atatcttata tcacactgac tttgaaagtg 1920 gttatagtga aatattccta atgccactct ggacatcata tactgtttcc aaacaggctg 1980 aggtttccag cgttcctgac catctgacca gttgcgtccg gcctgatgtc cgtgtttctc 2040 cgagtttcag tcagaactgt ttggcctaca aaaatgataa gcagatgtcc tacggattcc 2100 tctttcctcc ttatctgagc tcttcaccag aggctaaata tgatgcattc cttgtaacca 2160 atatggttcc aatgtatcct gctttcaaac gggtctggaa ttatttccaa agggtattgg 2220 tgaagaaata tgcttcggaa agaaatggag ttaacgtgat aagtggacca atcttcgact 2280 atgactatga tggcttacat gacacagaag acaaaataaa acagtacgtg gaaggcagtt 2340 ccattcctgt tccaactcac tactacagca tcatcaccag ctgtctggat ttcactcagc 2400 ctgccgacaa gtgtgacggc cctctctctg tgtcctcctt catcctgcct caccggcctg 2460 acaacgagga gagctgcaat agctcagagg acgaatcaaa atgggtagaa gaactcatga 2520 agatgcacac agctagggtg cgtgacattg aacatctcac cagcctggac ttcttccgaa 2580 agaccagccg cagctaccca gaaatcctga cactcaagac atacctgcat acatatgaga 2640 gcgagattta actttctgag catctgcagt acagtcttat caactggttg tatattttta 2700 tattgttttt gtatttatta atttgaaacc aggacattaa aaatgttagt attttaatcc 2760 tgtaccaaat ctgacatatt atgcctgaat gactccactg tttttctcta atgcttgatt 2820 taggtagcct tgtgttctga gtagagcttg taataaatac tgcagcttga gtttttagtg 2880 gaagcttcta aatggtgctg cagatttgat atttgcattg aggaaatatt aattttccaa 2940 tgcacagttg ccacatttag tcctgtactg tatggaaaca ctgattttgt aaagttgcct 3000 ttatttgctg ttaactgtta actatgacag atatatttaa gccttataaa ccaatcttaa 3060 acataataaa tcacacattc agttttttct ggtaaaaaaa aaaaaaaaaa 3110 24 863 PRT Homo sapiens 24 Met Ala Arg Arg Ser Ser Phe Gln Ser Cys Gln Ile Ile Ser Leu Phe 1 5 10 15 Thr Phe Ala Val Gly Val Asn Ile Cys Leu Gly Phe Thr Ala His Arg 20 25 30 Ile Lys Arg Ala Glu Gly Trp Glu Glu Gly Pro Pro Thr Val Leu Ser 35 40 45 Asp Ser Pro Trp Thr Asn Ile Ser Gly Ser Cys Lys Gly Arg Cys Phe 50 55 60 Glu Leu Gln Glu Ala 
Gly Pro Pro Asp Cys Arg Cys Asp Asn Leu Cys 65 70 75 80 Lys Ser Tyr Thr Ser Cys Cys His Asp Phe Asp Glu Leu Cys Leu Lys 85 90 95 Thr Ala Arg Ala Trp Glu Cys Thr Lys Asp Arg Cys Gly Glu Val Arg 100 105 110 Asn Glu Glu Asn Ala Cys His Cys Ser Glu Asp Cys Leu Ala Arg Gly 115 120 125 Asp Cys Cys Thr Asn Tyr Gln Val Val Cys Lys Gly Glu Ser His Trp 130 135 140 Val Asp Asp Asp Cys Glu Glu Ile Lys Ala Ala Glu Cys Pro Ala Gly 145 150 155 160 Phe Val Arg Pro Pro Leu Ile Ile Phe Ser Val Asp Gly Phe Arg Ala 165 170 175 Ser Tyr Met Lys Lys Gly Ser Lys Val Met Pro Asn Ile Glu Lys Leu 180 185 190 Arg Ser Cys Gly Thr His Ser Pro Tyr Met Arg Pro Val Tyr Pro Thr 195 200 205 Lys Thr Phe Pro Asn Leu Tyr Thr Leu Ala Thr Gly Leu Tyr Pro Glu 210 215 220 Ser His Gly Ile Val Gly Asn Ser Met Tyr Asp Pro Val Phe Asp Ala 225 230 235 240 Thr Phe His Leu Arg Gly Arg Glu Lys Phe Asn His Arg Trp Trp Gly 245 250 255 Gly Gln Pro Leu Trp Ile Thr Ala Thr Lys Gln Gly Val Lys Ala Gly 260 265 270 Thr Phe Phe Trp Ser Val Val Ile Pro His Glu Arg Arg Ile Leu Thr 275 280 285 Ile Leu Gln Trp Leu Thr Leu Pro Asp His Glu Arg Pro Ser Val Tyr 290 295 300 Ala Phe Tyr Ser Glu Gln Pro Asp Phe Ser Gly His Lys Tyr Gly Pro 305 310 315 320 Phe Gly Pro Glu Met Thr Asn Pro Leu Arg Glu Ile Asp Lys Ile Val 325 330 335 Gly Gln Leu Met Asp Gly Leu Lys Gln Leu Lys Leu His Arg Cys Val 340 345 350 Asn Val Ile Phe Val Gly Asp His Gly Met Glu Asp Val Thr Cys Asp 355 360 365 Arg Thr Glu Phe Leu Ser Asn Tyr Leu Thr Asn Val Asp Asp Ile Thr 370 375 380 Leu Val Pro Gly Thr Leu Gly Arg Ile Arg Ser Lys Phe Ser Asn Asn 385 390 395 400 Ala Lys Tyr Asp Pro Lys Ala Ile Ile Ala Asn Leu Thr Cys Lys Lys 405 410 415 Pro Asp Gln His Phe Lys Pro Tyr Leu Lys Gln His Leu Pro Lys Arg 420 425 430 Leu His Tyr Ala Asn Asn Arg Arg Ile Glu Asp Ile His Leu Leu Val 435 440 445 Glu Arg Arg Trp His Val Ala Arg Lys Pro Leu Asp Val Tyr Lys Lys 450 455 460 Pro Ser Gly Lys Cys Phe Phe Gln Gly Asp His Gly Phe Asp Asn Lys 465 470 475 480 Val Asn Ser Met Gln Thr 
Val Phe Val Gly Tyr Gly Pro Thr Phe Lys 485 490 495 Tyr Lys Thr Lys Val Pro Pro Phe Glu Asn Ile Glu Leu Tyr Asn Val 500 505 510 Met Cys Asp Leu Leu Gly Leu Lys Pro Ala Pro Asn Asn Gly Thr His 515 520 525 Gly Ser Leu Asn His Leu Leu Arg Thr Asn Thr Phe Arg Pro Thr Met 530 535 540 Pro Glu Glu Val Thr Arg Pro Asn Tyr Pro Gly Ile Met Tyr Leu Gln 545 550 555 560 Ser Asp Phe Asp Leu Gly Cys Thr Cys Asp Asp Lys Val Glu Pro Lys 565 570 575 Asn Lys Leu Asp Glu Leu Asn Lys Arg Leu His Thr Lys Gly Ser Thr 580 585 590 Glu Glu Arg His Leu Leu Tyr Gly Arg Pro Ala Val Leu Tyr Arg Thr 595 600 605 Arg Tyr Asp Ile Leu Tyr His Thr Asp Phe Glu Ser Gly Tyr Ser Glu 610 615 620 Ile Phe Leu Met Pro Leu Trp Thr Ser Tyr Thr Val Ser Lys Gln Ala 625 630 635 640 Glu Val Ser Ser Val Pro Asp His Leu Thr Ser Cys Val Arg Pro Asp 645 650 655 Val Arg Val Ser Pro Ser Phe Ser Gln Asn Cys Leu Ala Tyr Lys Asn 660 665 670 Asp Lys Gln Met Ser Tyr Gly Phe Leu Phe Pro Pro Tyr Leu Ser Ser 675 680 685 Ser Pro Glu Ala Lys Tyr Asp Ala Phe Leu Val Thr Asn Met Val Pro 690 695 700 Met Tyr Pro Ala Phe Lys Arg Val Trp Asn Tyr Phe Gln Arg Val Leu 705 710 715 720 Val Lys Lys Tyr Ala Ser Glu Arg Asn Gly Val Asn Val Ile Ser Gly 725 730 735 Pro Ile Phe Asp Tyr Asp Tyr Asp Gly Leu His Asp Thr Glu Asp Lys 740 745 750 Ile Lys Gln Tyr Val Glu Gly Ser Ser Ile Pro Val Pro Thr His Tyr 755 760 765 Tyr Ser Ile Ile Thr Ser Cys Leu Asp Phe Thr Gln Pro Ala Asp Lys 770 775 780 Cys Asp Gly Pro Leu Ser Val Ser Ser Phe Ile Leu Pro His Arg Pro 785 790 795 800 Asp Asn Glu Glu Ser Cys Asn Ser Ser Glu Asp Glu Ser Lys Trp Val 805 810 815 Glu Glu Leu Met Lys Met His Thr Ala Arg Val Arg Asp Ile Glu His 820 825 830 Leu Thr Ser Leu Asp Phe Phe Arg Lys Thr Ser Arg Ser Tyr Pro Glu 835 840 845 Ile Leu Thr Leu Lys Thr Tyr Leu His Thr Tyr Glu Ser Glu Ile 850 855 860 25 2772 DNA Mus musculus 25 cccacgcgtc cgcccacgcg tccggagaac accctgcaga ggttttccaa gaatccctcg 60 gcatggcaag acaaggctgt ttcgggtcat accaggtaat atccttgttc acttttgcca 120 tcggcgtcaa 
tctctgctta ggattcacag caagtcgaat taagagggcc gaatgggatg 180 aaggacctcc cacagtgtta tctgactctc catggaccaa cacatctgga tcctgcaaag 240 gtagatgctt tgagcttcaa gaggttggac ctcctgactg tcggtgtgac aacctatgta 300 agagctacag cagctgctgc catgattttg atgagctctg tttgaaaaca gctcgaggct 360 gggagtgcac caaagacaga tgtggggaag tacgaaatga ggaaaatgcc tgtcactgct 420 cagaagactg cttgtcccgg ggagactgct gtaccaacta tcaagtggtc tgcaaaggag 480 aatcacactg ggtagatgat gactgtgaag aaataagagt ccctgaatgc cctgcagggt 540 ttgtccgccc tccgttaatc atcttctctg tggatggatt ccgtgcatcg tacatgaaga 600 aaggcagcaa ggttatgccc aacattgaga aactgcggtc ctgtggcacc catgctccct 660 acatgaggcc tgtgtaccct acaaaaacct tccctaatct gtatacgctg gccactggtt 720 tatatccaga atcccatgga atcgttggca attcaatgta tgaccctgtc tttgatgcta 780 ctttccatct tcgagggcga gagaagttta accatagatg gtggggaggc caaccgctat 840 ggattacagc caccaagcaa ggggtgagag ccgggacatt cttttggtct gtgagcatcc 900 ctcacgagcg gagaatccta actatccttc agtggctttc cctgccagac aatgagaggc 960 cttcagttta tgccttctac tccgagcagc ctgatttttc tggacacaag tacggccctt 1020 ttggccctga gatgacaaat cctctgaggg agattgacaa gaccgtgggg cagttaatgg 1080 acggactgaa acaactcaag ctgcaccgtt gtgtgaatgt tatctttgtt ggagaccatg 1140 gaatggaaga cgtgacatgt gacagaactg agttcttgag caactatctg actaacgtgg 1200 atgatattac tttagtacct ggaactctag gaagaattcg acccaagatt cccaataatc 1260 ttaaatatga ccctaaagcc attattgcta acctcacgtg taaaaaacca gatcagcact 1320 ttaagcctta catgaaacag caccttccca aacgtttgca ctatgccaac aatcggagaa 1380 tcgaggatct ccatttattg gtggaacgca gatggcatgt tgcaaggaaa cctttggacg 1440 tttataagaa gccgtcagga aaatgttttt tccagggtga ccacggcttt gataacaagg 1500 tcaatagcat gcagactgtt tttgtaggtt atggcccaac ttttaagtac aggactaaag 1560 tgcctccatt tgaaaacatt gaactttata atgttatgtg cgatctccta ggcttgaagc 1620 cagctcccaa taatggaaca catggaagtt
tgaatcacct gctacgcaca aataccttta 1680 ggccaaccct accagaggaa gtcagcagac ccaattaccc agggattatg taccttcagt 1740 ctgattttga cctgggctgc acctgtgatg ataaggtaga gccaaagaac aaattggaag 1800 aactaaataa acgccttcat accaaaggat ctacagaaga gagacatctc ctgtatggac 1860 gacctgcagt gctttatcgg actagctatg atatcttata ccatacggac tttgaaagtg 1920 gttacagtga aatattctta atgcctctct ggacttctta taccatttct aagcaggctg 1980 aggtctctag catcccagag cacctgacca actgtgttcg ccctgatgtc cgtgtatctc 2040 ctggattcag tcagaactgt ttagcctata aaaatgataa acagatgtcc tatggattcc 2100 tttttcctcc ctatctgagc tcttccccag aagcgaaata tgatgcattc cttgtaacca 2160 acatggttcc aatgtaccct gccttcaaac gtgtttggac ttatttccaa agggtcttgg 2220 tgaagaaata tgcgtcagaa aggaatgggg tcaacgtaat aagtggaccg atctttgact 2280 acaattacga tggcttacgt gacattgagg atgaaattaa acagtatgtg gaaggcagct 2340 ctattcctgt ccctacccac tactacagca tcatcaccag ctgcctggac ttcactcagc 2400 ctgcagacaa gtgtgatggt cctctctctg tgtcttcttt catccttcct caccgacctg 2460 acaatgatga gagctgtaat agttccgagg atgagtcgaa gtgggtagag gaactcatga 2520 agatgcacac agctcgggtg agggacatcg agcatctcac cggtctggat ttctaccgga 2580 agactagccg tagctattcg gaaattctga ccctcaagac atacctgcat acatatgaga 2640 gcgagattta acttcctggg cctgggcagt gtagtcttag caactggtgt atatttttat 2700 atggtgtttg tatttattaa tttgaaacca ggacataaac aaacaaagaa acaaatgaaa 2760 aaaaaaaaaa aa 2772 26 862 PRT Mus musculus 26 Met Ala Arg Gln Gly Cys Phe Gly Ser Tyr Gln Val Ile Ser Leu Phe 1 5 10 15 Thr Phe Ala Ile Gly Val Asn Leu Cys Leu Gly Phe Thr Ala Ser Arg 20 25 30 Ile Lys Arg Ala Glu Trp Asp Glu Gly Pro Pro Thr Val Leu Ser Asp 35 40 45 Ser Pro Trp Thr Asn Thr Ser Gly Ser Cys Lys Gly Arg Cys Phe Glu 50 55 60 Leu Gln Glu Val Gly Pro Pro Asp Cys Arg Cys Asp Asn Leu Cys Lys 65 70 75 80 Ser Tyr Ser Ser Cys Cys His Asp Phe Asp Glu Leu Cys Leu Lys Thr 85 90 95 Ala Arg Gly Trp Glu Cys Thr Lys Asp Arg Cys Gly Glu Val Arg Asn 100 105 110 Glu Glu Asn Ala Cys His Cys Ser Glu Asp Cys Leu Ser Arg Gly Asp 115 120 125 Cys Cys Thr Asn Tyr Gln Val Val Cys Lys Gly 
Glu Ser His Trp Val 130 135 140 Asp Asp Asp Cys Glu Glu Ile Arg Val Pro Glu Cys Pro Ala Gly Phe 145 150 155 160 Val Arg Pro Pro Leu Ile Ile Phe Ser Val Asp Gly Phe Arg Ala Ser 165 170 175 Tyr Met Lys Lys Gly Ser Lys Val Met Pro Asn Ile Glu Lys Leu Arg 180 185 190 Ser Cys Gly Thr His Ala Pro Tyr Met Arg Pro Val Tyr Pro Thr Lys 195 200 205 Thr Phe Pro Asn Leu Tyr Thr Leu Ala Thr Gly Leu Tyr Pro Glu Ser 210 215 220 His Gly Ile Val Gly Asn Ser Met Tyr Asp Pro Val Phe Asp Ala Thr 225 230 235 240 Phe His Leu Arg Gly Arg Glu Lys Phe Asn His Arg Trp Trp Gly Gly 245 250 255 Gln Pro Leu Trp Ile Thr Ala Thr Lys Gln Gly Val Arg Ala Gly Thr 260 265 270 Phe Phe Trp Ser Val Ser Ile Pro His Glu Arg Arg Ile Leu Thr Ile 275 280 285 Leu Gln Trp Leu Ser Leu Pro Asp Asn Glu Arg Pro Ser Val Tyr Ala 290 295 300 Phe Tyr Ser Glu Gln Pro Asp Phe Ser Gly His Lys Tyr Gly Pro Phe 305 310 315 320 Gly Pro Glu Met Thr Asn Pro Leu Arg Glu Ile Asp Lys Thr Val Gly 325 330 335 Gln Leu Met Asp Gly Leu Lys Gln Leu Lys Leu His Arg Cys Val Asn 340 345 350 Val Ile Phe Val Gly Asp His Gly Met Glu Asp Val Thr Cys Asp Arg 355 360 365 Thr Glu Phe Leu Ser Asn Tyr Leu Thr Asn Val Asp Asp Ile Thr Leu 370 375 380 Val Pro Gly Thr Leu Gly Arg Ile Arg Pro Lys Ile Pro Asn Asn Leu 385 390 395 400 Lys Tyr Asp Pro Lys Ala Ile Ile Ala Asn Leu Thr Cys Lys Lys Pro 405 410 415 Asp Gln His Phe Lys Pro Tyr Met Lys Gln His Leu Pro Lys Arg Leu 420 425 430 His Tyr Ala Asn Asn Arg Arg Ile Glu Asp Leu His Leu Leu Val Glu 435 440 445 Arg Arg Trp His Val Ala Arg Lys Pro Leu Asp Val Tyr Lys Lys Pro 450 455 460 Ser Gly Lys Cys Phe Phe Gln Gly Asp His Gly Phe Asp Asn Lys Val 465 470 475 480 Asn Ser Met Gln Thr Val Phe Val Gly Tyr Gly Pro Thr Phe Lys Tyr 485 490 495 Arg Thr Lys Val Pro Pro Phe Glu Asn Ile Glu Leu Tyr Asn Val Met 500 505 510 Cys Asp Leu Leu Gly Leu Lys Pro Ala Pro Asn Asn Gly Thr His Gly 515 520 525 Ser Leu Asn His Leu Leu Arg Thr Asn Thr Phe Arg Pro Thr Leu Pro 530 535 540 Glu Glu Val Ser Arg Pro Asn Tyr Pro Gly Ile Met 
Tyr Leu Gln Ser 545 550 555 560 Asp Phe Asp Leu Gly Cys Thr Cys Asp Asp Lys Val Glu Pro Lys Asn 565 570 575 Lys Leu Glu Glu Leu Asn Lys Arg Leu His Thr Lys Gly Ser Thr Glu 580 585 590 Glu Arg His Leu Leu Tyr Gly Arg Pro Ala Val Leu Tyr Arg Thr Ser 595 600 605 Tyr Asp Ile Leu Tyr His Thr Asp Phe Glu Ser Gly Tyr Ser Glu Ile 610 615 620 Phe Leu Met Pro Leu Trp Thr Ser Tyr Thr Ile Ser Lys Gln Ala Glu 625 630 635 640 Val Ser Ser Ile Pro Glu His Leu Thr Asn Cys Val Arg Pro Asp Val 645 650 655 Arg Val Ser Pro Gly Phe Ser Gln Asn Cys Leu Ala Tyr Lys Asn Asp 660 665 670 Lys Gln Met Ser Tyr Gly Phe Leu Phe Pro Pro Tyr Leu Ser Ser Ser 675 680 685 Pro Glu Ala Lys Tyr Asp Ala Phe Leu Val Thr Asn Met Val Pro Met 690 695 700 Tyr Pro Ala Phe Lys Arg Val Trp Thr Tyr Phe Gln Arg Val Leu Val 705 710 715 720 Lys Lys Tyr Ala Ser Glu Arg Asn Gly Val Asn Val Ile Ser Gly Pro 725 730 735 Ile Phe Asp Tyr Asn Tyr Asp Gly Leu Arg Asp Ile Glu Asp Glu Ile 740 745 750 Lys Gln Tyr Val Glu Gly Ser Ser Ile Pro Val Pro Thr His Tyr Tyr 755 760 765 Ser Ile Ile Thr Ser Cys Leu Asp Phe Thr Gln Pro Ala Asp Lys Cys 770 775 780 Asp Gly Pro Leu Ser Val Ser Ser Phe Ile Leu Pro His Arg Pro Asp 785 790 795 800 Asn Asp Glu Ser Cys Asn Ser Ser Glu Asp Glu Ser Lys Trp Val Glu 805 810 815 Glu Leu Met Lys Met His Thr Ala Arg Val Arg Asp Ile Glu His Leu 820 825 830 Thr Gly Leu Asp Phe Tyr Arg Lys Thr Ser Arg Ser Tyr Ser Glu Ile 835 840 845 Leu Thr Leu Lys Thr Tyr Leu His Thr Tyr Glu Ser Glu Ile 850 855 860 27 610 DNA Gallus gallus misc_feature (484)..(484) n is a, c, g, or t 27 ccacgcgtcc ggctctcaat ctttgcactg cttcagttag cagagcattt atttttgatt 60 cagctgcatt tgttaagact gtaacaacga aaggcatttc ctgagaagct gcaaggatga 120 gcagaaagaa agaacaacag ctaaggaaat atgggaccct agtagtgctt ttcatcttcc 180 aagttcagat ttttggtttt gatgttgaca atcgacctac aacagatgtc tgctcgacac 240 acactatttt acctggacca aaaggggatg atggtgaaaa aggagataga ggagaagtgg 300 gcaaacaagg aaaagttgga ccaaaaggac ctaaaggaaa caaaggaact gtgggggatg 360 tcggtgacca gggaatgctt 
gggaaaatcg gtccgattgg aggaaaaggt gacaaaggag 420 ccaaaggcat atcaggggta tctggaaaaa aaggaaaagc aggcacagtc tgtgactgtg 480 gaangtccgc anagttgttg gacaactgaa tatcaatgtt gctcggctta acacatncat 540 caagtttgta aaagaatggt ttttgcnggc cttnaggggg accggtggaa aaattcttcc 600 tttttttggc 610 28 122 PRT Gallus gallus 28 Met Ser Arg Lys Lys Glu Gln Gln Leu Arg Lys Tyr Gly Thr Leu Val 1 5 10 15 Val Leu Phe Ile Phe Gln Val Gln Ile Phe Gly Phe Asp Val Asp Asn 20 25 30 Arg Pro Thr Thr Asp Val Cys Ser Thr His Thr Ile Leu Pro Gly Pro 35 40 45 Lys Gly Asp Asp Gly Glu Lys Gly Asp Arg Gly Glu Val Gly Lys Gln 50 55 60 Gly Lys Val Gly Pro Lys Gly Pro Lys Gly Asn Lys Gly Thr Val Gly 65 70 75 80 Asp Val Gly Asp Gln Gly Met Leu Gly Lys Ile Gly Pro Ile Gly Gly 85 90 95 Lys Gly Asp Lys Gly Ala Lys Gly Ile Ser Gly Val Ser Gly Lys Lys 100 105 110 Gly Lys Ala Gly Thr Val Cys Asp Cys Gly 115 120 29 1686 DNA Homo sapiens 29 aagcaggagg ttttatttaa aataaagctg tttatttggc atttctggga gacccttttc 60 tgaggaacca cagcaatgaa tggctttgca tccttgcttc gaagaaacca atttatcctc 120 ctggtactat ttcttttgca aattcagagt ctgggtctgg atattgatag ccgtcctacc 180 gctgaagtct gtgccacaca cacaatttca ccaggaccca aaggagatga tggtgaaaaa 240 ggagatccag gagaagaggg aaagcatggc aaagtgggac gcatggggcc gaaaggaatt 300 aaaggagaac tgggtgatat gggagatcgg ggcaatattg gcaagactgg gcccattggg 360 aagaagggtg acaaagggga aaaaggtttg cttggaatac ctggagaaaa aggcaaagca 420 ggtactgtct gtgattgtgg aagataccgg aaatttgttg gacaactgga tattagtatt 480 gcccggctca agacatctat gaagtttgtc aagaatgtga tagcagggat tagggaaact 540 gaagagaaat tctactacat cgtgcaggaa gagaagaact acagggaatc cctaacccac 600 tgcaggattc ggggtggaat gctagccatg cccaaggatg aagctgccaa cacactcatc 660 gctgactatg ttgccaagag tggcttcttt cgggtgttca ttggcgtgaa tgaccttgaa 720 agggagggac agtacatgtt cacagacaac actccactgc agaactatag caactggaat 780 gagggggaac ccagcgaccc ctatggtcat gaggactgtg tggagatgct gagctctggc 840 agatggaatg acacagagtg ccatcttacc atgtactttg tctgtgagtt catcaagaag 900 aaaaagtaac ttccctcatc ctacgtattt gctattttcc tgtgaccgtc 
attacagtta 960 ttgttatcca tccttttttt cctgattgta ctacatttga tctgagtcaa catagctaga 1020 aaatgctaaa ctgaggtatg gagcctccat catcatgctc ttttgtgatg attttcatat 1080 tttcacacat ggtatgttat tgacccaata actcgccagg ttacatgggt cttgagagag 1140 aattttaatt actaattgtg cacgagatag ttggttgtct atatgtcaaa tgagttgttc 1200 tcttggtatt tgctctacca tctctcccta gagcactctg tgtctatccc agtggataat 1260 ttcccagttt actggtgatg attaggaagg ttgttgatgg ttaggctaac ctgccctggc 1320 ccaaagccag acatgtacaa gggctttctg tgagcaatga taagatcttt gaatccaaga 1380 tgcccagatg ttttaccagt cacaccctat ggccatggct atacttggaa gttctccttg 1440 ttggcacaga catagaaatg ctttaacccc aagcctttat atgggggact tctagctttg 1500 tgtcttgttt cagaccatgt ggaatgataa atactctttt tgtgcttctg atctatcgat 1560 ttcactaaca tataccaagt aggtgctttg aacccctttc tgtaggctca caccttaatc 1620 tcaggcccct atatagtcac actttgattt aagaaaaatg gagctcttga aatcaaaaga 1680 aaaaaa 1686 30 277 PRT Homo sapiens 30 Met Asn Gly Phe Ala Ser Leu Leu Arg Arg Asn Gln Phe Ile Leu Leu 1 5 10 15 Val Leu Phe Leu Leu Gln Ile Gln Ser Leu Gly Leu Asp Ile Asp Ser 20 25 30 Arg Pro Thr Ala Glu Val Cys Ala Thr His Thr Ile Ser Pro Gly Pro 35 40 45 Lys Gly Asp Asp Gly Glu Lys Gly Asp Pro Gly Glu Glu Gly Lys His 50 55 60 Gly Lys Val Gly Arg Met Gly Pro Lys Gly Ile Lys Gly Glu Leu Gly 65 70 75 80 Asp Met Gly Asp Arg Gly Asn Ile Gly Lys Thr Gly Pro Ile Gly Lys 85 90 95 Lys Gly Asp Lys Gly Glu Lys Gly Leu Leu Gly Ile Pro Gly Glu Lys 100 105 110 Gly Lys Ala Gly Thr Val Cys Asp Cys Gly Arg Tyr Arg Lys Phe Val 115 120 125 Gly Gln Leu Asp Ile Ser Ile Ala Arg Leu Lys Thr Ser Met Lys Phe 130 135 140 Val Lys Asn Val Ile Ala Gly Ile Arg Glu Thr Glu Glu Lys Phe Tyr 145 150 155 160 Tyr Ile Val Gln Glu Glu Lys Asn Tyr Arg Glu Ser Leu Thr His Cys 165 170 175 Arg Ile Arg Gly Gly Met Leu Ala Met Pro Lys Asp Glu Ala Ala Asn 180 185 190 Thr Leu Ile Ala Asp Tyr Val Ala Lys Ser Gly Phe Phe Arg Val Phe 195 200 205 Ile Gly Val Asn Asp Leu Glu Arg Glu Gly Gln Tyr Met Phe Thr Asp 210 215 220 Asn Thr Pro Leu Gln Asn Tyr Ser Asn Trp 
Asn Glu Gly Glu Pro Ser 225 230 235 240 Asp Pro Tyr Gly His Glu Asp Cys Val Glu Met Leu Ser Ser Gly Arg 245 250 255 Trp Asn Asp Thr Glu Cys His Leu Thr Met Tyr Phe Val Cys Glu Phe 260 265 270 Ile Lys Lys Lys Lys 275 31 423 DNA Gallus gallus misc_feature (265)..(265) n is a, c, g, or t 31 tgcagcttgt tccatgggaa acagcaccag ccggctctac agcgcgctcg ccaagacgct 60 gagcagcagt gccgtgtccc agcaccagga ctgcctggag cagcccaact cggcgcagct 120 ggagcccata gaccccaagg acctactgga ggaatgccag ctcgttctgc agaaacggcc 180 acctcgcttc cagaggaact tcgtggacct gaagaaaaac acagccagta accaccgccc 240 catccgggtc atgcagtgga acatnctcgc ccaagctctc ggagaaggca aagacaactt 300 cgttcagtgc cccatggaag ctctgaagtg ggaggaaagg aagtgcctca tcctggagga 360 aatccttgcc tacaagccgg atatcttgtg cctgcaagaa gtcgaccact acttttacac 420 ctt 423 32 140 PRT Gallus gallus misc_feature (88)..(88) Xaa can be any naturally occurring amino acid 32 Ala Ala Cys Ser Met Gly Asn Ser Thr Ser Arg Leu Tyr Ser Ala Leu 1 5 10 15 Ala Lys Thr Leu Ser Ser Ser Ala Val Ser Gln His Gln Asp Cys Leu 20 25 30 Glu Gln Pro Asn Ser Ala Gln Leu Glu Pro Ile Asp Pro Lys Asp Leu 35 40 45 Leu Glu Glu Cys Gln Leu Val Leu Gln Lys Arg Pro Pro Arg Phe Gln 50 55 60 Arg Asn Phe Val Asp Leu Lys Lys Asn Thr Ala Ser Asn His Arg Pro 65 70 75 80 Ile Arg Val Met Gln Trp Asn Xaa Leu Ala Gln Ala Leu Gly Glu Gly 85 90 95 Lys Asp Asn Phe Val Gln Cys Pro Met Glu Ala Leu Lys Trp Glu Glu 100 105 110 Arg Lys Cys Leu Ile Leu Glu Glu Ile Leu Ala Tyr Lys Pro Asp Ile 115 120 125 Leu Cys Leu Gln Glu Val Asp His Tyr Phe Tyr Thr 130 135 140 33 1767 DNA Homo sapiens 33 ccgacgcagc ggtgttgcac ctccctctcc ggctctgctg cccgggattt ccccagaacc 60 tgcgccgcgc gagaaggagc ctgggagcat ccgcccacac tgcccggaca gtcggctcga 120 ctcggtgccc tcggccccag ccgggctccg ctcctcgggc gcgcgagggg ccgtggtggc 180 ggcggcgccc ggcatgtttc atagtccgcg gcggctctgc tcggccctgc tgcagaggga 240 cgcgcccggc ctgcgccgcc tgcccgcccc agggctgcgc cgcccgttgt ccccgccggc 300 tgctgttccc aggcccgcat ccccccggct gctggcggcg gcctcggcgg cctcgggcgc 360 cgcgaggtcg 
tgttcccgaa cagtgtgttc catgggaacc ggtacaagca gactctatag 420 tgctctcgcc aagacactga acagcagcgc tgcctcccag cacccagagt atttggtgtc 480 acctgaccca gagcatctgg agcccattga tcctaaagag cttcttgagg aatgcagggc 540 cgtcctgcac acccgacctc cccggttcca gagggatttt gtggatctga ggacagattg 600 ccctagtacc cacccaccta tcagggttat gcaatggaac atcctcgccc aagctcttgg 660 agaaggcaaa gacaactttg tacagtgccc tgttgaagca ctcaaatggg aagaaaggaa 720 atgtctcatc ctggaagaaa tcctggccta ccagcctgat atattgtgcc tccaagaggt 780 ggaccactat tttgacacct tccagccact cctcagtaga ctaggctatc aaggcacgtt 840 tttccccaaa ccctggtcac cttgtctaga tgtagaacac aacaatggac cagatggttg 900 tgccttattt tttcttcaaa accgattcaa gctagtcaac agtgccaata ttaggctgac 960 agccatgaca ttgaaaacca accaggtggc cattgcacag accctggagt gcaaggagtc 1020 aggccgacag ttctgcatcg ctgttaccca tctaaaagca cgcactggct gggagcggtt 1080 tcgatcagct caaggctgtg acctccttca gaacctgcaa aacatcaccc aaggagccaa 1140 gattcccctt attgtgtgtg gggacttcaa tgcagagcca acagaagagg tctacaaaca 1200 ctttgcttcc tccagcctca acctgaacag cgcctacaag ctgctgagtg ctgatgggca 1260 gtcagaaccc ccatacacta cctggaagat ccggacctca ggggagtgca ggcacaccct 1320 ggattacatc tggtattcta aacatgctct aaatgtaagg tcagctctcg atctgctcac 1380 tgaagaacag attggaccca acaggttacc ttccttcaat tatccttcag accacctgtc 1440 tctagtgtgt gacttcagct ttactgagga atctgatgga ctttcataaa tacttgcttt 1500 tgtcttttta atcacaggag tctatttttt tttttttttt tttttttttg agacagagtc 1560 tcgctctgtt gcctaggctg gagtacagtg gcctgatctc ggctcactgc aagatccgcc 1620 tcccgggttc atggcattct cctgcctcag cctccagagc aactgggaca acaggcgccc 1680 gtcaccacgc ccagctaatt ttttgtattt ttagtagaga cggggtttca ccgtgttagc 1740 caggatggtc tcgatctcct gaccttg 1767 34 431 PRT Homo sapiens 34 Met Phe His Ser Pro Arg Arg Leu Cys Ser Ala Leu Leu Gln Arg Asp 1 5 10 15 Ala Pro Gly Leu Arg Arg Leu Pro Ala Pro Gly Leu Arg Arg Pro Leu 20 25 30 Ser Pro Pro Ala Ala Val Pro Arg Pro Ala Ser Pro Arg Leu Leu Ala
35 40 45 Ala Ala Ser Ala Ala Ser Gly Ala Ala Arg Ser Cys Ser Arg Thr Val 50 55 60 Cys Ser Met Gly Thr Gly Thr Ser Arg Leu Tyr Ser Ala Leu Ala Lys 65 70 75 80 Thr Leu Asn Ser Ser Ala Ala Ser Gln His Pro Glu Tyr Leu Val Ser 85 90 95 Pro Asp Pro Glu His Leu Glu Pro Ile Asp Pro Lys Glu Leu Leu Glu 100 105 110 Glu Cys Arg Ala Val Leu His Thr Arg Pro Pro Arg Phe Gln Arg Asp 115 120 125 Phe Val Asp Leu Arg Thr Asp Cys Pro Ser Thr His Pro Pro Ile Arg 130 135 140 Val Met Gln Trp Asn Ile Leu Ala Gln Ala Leu Gly Glu Gly Lys Asp 145 150 155 160 Asn Phe Val Gln Cys Pro Val Glu Ala Leu Lys Trp Glu Glu Arg Lys 165 170 175 Cys Leu Ile Leu Glu Glu Ile Leu Ala Tyr Gln Pro Asp Ile Leu Cys 180 185 190 Leu Gln Glu Val Asp His Tyr Phe Asp Thr Phe Gln Pro Leu Leu Ser 195 200 205 Arg Leu Gly Tyr Gln Gly Thr Phe Phe Pro Lys Pro Trp Ser Pro Cys 210 215 220 Leu Asp Val Glu His Asn Asn Gly Pro Asp Gly Cys Ala Leu Phe Phe 225 230 235 240 Leu Gln Asn Arg Phe Lys Leu Val Asn Ser Ala Asn Ile Arg Leu Thr 245 250 255 Ala Met Thr Leu Lys Thr Asn Gln Val Ala Ile Ala Gln Thr Leu Glu 260 265 270 Cys Lys Glu Ser Gly Arg Gln Phe Cys Ile Ala Val Thr His Leu Lys 275 280 285 Ala Arg Thr Gly Trp Glu Arg Phe Arg Ser Ala Gln Gly Cys Asp Leu 290 295 300 Leu Gln Asn Leu Gln Asn Ile Thr Gln Gly Ala Lys Ile Pro Leu Ile 305 310 315 320 Val Cys Gly Asp Phe Asn Ala Glu Pro Thr Glu Glu Val Tyr Lys His 325 330 335 Phe Ala Ser Ser Ser Leu Asn Leu Asn Ser Ala Tyr Lys Leu Leu Ser 340 345 350 Ala Asp Gly Gln Ser Glu Pro Pro Tyr Thr Thr Trp Lys Ile Arg Thr 355 360 365 Ser Gly Glu Cys Arg His Thr Leu Asp Tyr Ile Trp Tyr Ser Lys His 370 375 380 Ala Leu Asn Val Arg Ser Ala Leu Asp Leu Leu Thr Glu Glu Gln Ile 385 390 395 400 Gly Pro Asn Arg Leu Pro Ser Phe Asn Tyr Pro Ser Asp His Leu Ser 405 410 415 Leu Val Cys Asp Phe Ser Phe Thr Glu Glu Ser Asp Gly Leu Ser 420 425 430 35 3075 DNA Gallus gallus misc_feature (762)..(775) n is a, c, g, or t 35 acgcacgcac ctctgcctct gcaggcggat gaggggcact tttgaaaatt attttctttc 60 cacacccaac 
cctcgtctga catcacttct gcaggaggga gggcgggaac agccccgctg 120 ccagaaggtc gcggagagct ccgccggccc ccgcgcacca tttgtctcaa actaaatact 180 cttcaaatca aggatgttga ttcttctggc tttcattatt atatttcaca taacttcagc 240 agcgctgttg ttcatctcaa ctattgacaa tgcctggtgg gtaggagata acttttctac 300 agatgtctgg agtgcatgtg ccacaaataa tagcacctgc acacctatta ctgttcaatt 360 cagagaatat caatcaattc aggctgttca ggcctgcatg gtcctatcta ctattttctg 420 ttgtgtggca tttctggttt tcattcttca acttttccgt ctaaagcaag gagaaagatt 480 tgtgttaacc tctattatcc agctcctgtc atgtctgtgc gttatgattg cagcttccat 540 ttacacagat aggcatgagg aactgcacaa gagcattgaa tatgccattg aagtttctaa 600 aggccaatat ggctattcct tcgtcttagc ctggattgca ttcgccttta ctctgatcag 660 tggtgttatg tacctagtat taaggaaacg taaataaatg ttggcagcta gttattactg 720 tcacggcagt acaaaaccaa attccagtaa ctattttgta tnnnnnnnnn nnnnnggttt 780 tgtagtaaag gtattgtttc tctaaaaatg tactgtgttc ttaatatgaa acagaataca 840 aaacaaaaaa caaccaacag caggtttaat ggaatgcctg gcattcggtc tgagcaagac 900 tgacccaagt tttcttttac ttatttcacc atcatcagtg gtgaaatggt gtctttcctt 960 ttctagacat taacagttct tggcctctgt cagattacta ttaaagtctt tgtaaattaa 1020 tttggaagca atgtgctaag catactcctg gcctggatct agccctttgg gatggataaa 1080 tacagggnnn nnnnnnnnnn nggccaggat cgtgatgcaa aagcaaacaa gtataaaagc 1140 ccaaagctgc actcaatgtt gctgttctag cagaggacga atgttctgct atttataatg 1200 tgcagtaagt gtcatcaagc ttttattaaa accacttgct ctgcaaaagt aaacaactcc 1260 tttttgtact ccagcaactg attctcttta tccttcttca cgtttaattt aagcatacag 1320 agcctttggc aggaaaagtt acaatcaaat tcgaaattca gtgcacaact tgagacagga 1380 gtagtctgag cagaaagagg tactccactc aagtcctgca gccctttatt tttgcattgt 1440 gcagtaccaa atttaacact tttttttcag ccaaactcag tatgtttatt acattgggct 1500 ctggctagat atatcatgtt ggctaatata tgatttagaa aaggctcttc ttttttgttt 1560 ttcctgtgtc tgctcactag gaaattggcc tttacaaaat tcattctaag ttcctatgtg 1620 gatttgactt gaataagaat tcctactaaa gaaatcagag tgtaactatt atgcatagga 1680 gttccaggat agttttaaga atttttggtg attctttctt ttcaataatt ctgtgagaga 1740 attactgtaa taccagattt aactgctcag 
caatataata ctggctttgg ctggtggtga 1800 tttcagggtt tggagaccag tgtggggaat gaattaagtg gctttttctg gttagtcaca 1860 cttctgatgt taaaatgtag atttgacttt gtaaaagcat taaccctgta ttcatttcat 1920 gatactcact gcagctgacc caatatatag gcaataaaaa taaatgaatt ttaaatgaga 1980 ttttacactt aatgtagaac aaaattccta ttacaaaata atgtagctct actaatgttg 2040 ataacttacc ctattacaca gcagctgata gtctgaccca ttgctgcagg tagttcatcc 2100 ttgagttctc acggaactgt ataggaattg tgtcggacat gagtaatggg tcatgctgtt 2160 ccatctccat tccctgaaca tcctaaaatg cactaacgag taatacttct attagggagc 2220 aaagaaagca caacaggact ggcaagaagt taattagaca actaagcaga acagcaaatt 2280 aatagtaaaa ataacagcag ttaaaaaaaa ccctcaataa atcagtctga gcgaaatgca 2340 ttctcacctt cccagtcttg catgatgcta atcttctgtt agtctttttt ctcttagtgg 2400 gaacactctg aatttcaggc attactaccc tactttttaa aaaagtgttt ctgctgtttg 2460 ctgaatacat ttcagattca aaacgtgaat tttgctagca agcaggattt gttttaaata 2520 aacagatgta ggtttaaggc tgaaagtaga tagtctgtaa gttgggtgtt tggctagtct 2580 tattcaaaca tgaaatatta agggtgaaat tctaaaacaa atgtgcattg aagctatttt 2640 atatctagaa gataatccta taacactgta aattaagctg aaatgccact gacttgaaga 2700 gatgcttctt cagtttcttg ccttaataat gcttaggtca tttatagagc aaatatttaa 2760 gataaagatg tatatataca tgaactcagc ttacttctac agtaaaagct ctgtcacttt 2820 agttagaagt gaaaagcaca cacagcagca tatacgtggt gccacacaga gaacatacgt 2880 caatattcga agtaccaaga aaataaatgc caaaaagttt ggacaagagt tttaacagga 2940 caaacatatt ttagaatatt ctttttatct gatatgcttt taaaatatac cattttctat 3000 gctctatata ttctgaaatt gtacatgaaa ataaagttaa aatgaattct tgtattgtaa 3060 aaaaaaaaaa aaaaa 3075 36 167 PRT Gallus gallus 36 Met Leu Ile Leu Leu Ala Phe Ile Ile Ile Phe His Ile Thr Ser Ala 1 5 10 15 Ala Leu Leu Phe Ile Ser Thr Ile Asp Asn Ala Trp Trp Val Gly Asp 20 25 30 Asn Phe Ser Thr Asp Val Trp Ser Ala Cys Ala Thr Asn Asn Ser Thr 35 40 45 Cys Thr Pro Ile Thr Val Gln Phe Arg Glu Tyr Gln Ser Ile Gln Ala 50 55 60 Val Gln Ala Cys Met Val Leu Ser Thr Ile Phe Cys Cys Val Ala Phe 65 70 75 80 Leu Val Phe Ile Leu Gln Leu Phe Arg Leu Lys Gln Gly 
Glu Arg Phe 85 90 95 Val Leu Thr Ser Ile Ile Gln Leu Leu Ser Cys Leu Cys Val Met Ile 100 105 110 Ala Ala Ser Ile Tyr Thr Asp Arg His Glu Glu Leu His Lys Ser Ile 115 120 125 Glu Tyr Ala Ile Glu Val Ser Lys Gly Gln Tyr Gly Tyr Ser Phe Val 130 135 140 Leu Ala Trp Ile Ala Phe Ala Phe Thr Leu Ile Ser Gly Val Met Tyr 145 150 155 160 Leu Val Leu Arg Lys Arg Lys 165 37 690 DNA Homo sapiens 37 cagcacatcc cgctctgggc tttaaacgtg acccctcgcc tcgactcgcc ctgccctgtg 60 aaaatgttgg tgcttcttgc tttcatcatc gccttccaca tcacctctgc agccttgctg 120 ttcattgcca ccgtcgacaa tgcctggtgg gtaggagatg agttttttgc agatgtctgg 180 agaatatgta ccaacaacac gaattgcaca gtcatcaatg acagctttca agagtactcc 240 acgctgcagg cggtccaggc caccatgatc ctctccacca ttctctgctg catcgccttc 300 ttcatcttcg tgctccagct cttccgcctg aagcagggag agaggtttgt cctaacctcc 360 atcatccagc taatgtcatg tctgtgtgtc atgattgcgg cctccattta tacagacagg 420 cgtgaagaca ttcacgacaa aaacgcgaaa ttctatcccg tgaccagaga aggcagctac 480 ggctactcct acatcctggc gtgggtggcc ttcgcctgca ccttcatcag cggcatgatg 540 tacctgatac tgaggaagcg caaatagagt tccggagctg ggttgcttct gctgcagtac 600 agaatccaca ttcagataac cattttgtat ataatcatta ttttttgagg tttttctagc 660 aaacgtattg tttcctttaa aagcccaaaa 690 38 167 PRT Homo sapiens 38 Met Leu Val Leu Leu Ala Phe Ile Ile Ala Phe His Ile Thr Ser Ala 1 5 10 15 Ala Leu Leu Phe Ile Ala Thr Val Asp Asn Ala Trp Trp Val Gly Asp 20 25 30 Glu Phe Phe Ala Asp Val Trp Arg Ile Cys Thr Asn Asn Thr Asn Cys 35 40 45 Thr Val Ile Asn Asp Ser Phe Gln Glu Tyr Ser Thr Leu Gln Ala Val 50 55 60 Gln Ala Thr Met Ile Leu Ser Thr Ile Leu Cys Cys Ile Ala Phe Phe 65 70 75 80 Ile Phe Val Leu Gln Leu Phe Arg Leu Lys Gln Gly Glu Arg Phe Val 85 90 95 Leu Thr Ser Ile Ile Gln Leu Met Ser Cys Leu Cys Val Met Ile Ala 100 105 110 Ala Ser Ile Tyr Thr Asp Arg Arg Glu Asp Ile His Asp Lys Asn Ala 115 120 125 Lys Phe Tyr Pro Val Thr Arg Glu Gly Ser Tyr Gly Tyr Ser Tyr Ile 130 135 140 Leu Ala Trp Val Ala Phe Ala Cys Thr Phe Ile Ser Gly Met Met Tyr 145 150 155 160 Leu Ile Leu Arg Lys Arg Lys 
165 39 551 DNA Gallus gallus 39 ggtcgaccca cgcgtccggg tgagcgtcag cgagttgggc ctgggctacg agtcggacga 60 gaccgtgttg ttccgctact gcagcggcac ctgcgacgcg gccgtcagga actacgacct 120 ctcgctgaag agcgtgcgca gccggaagaa gatcaggaag gagaaggtgc gcgcgcggcc 180 ctgctgcagg ccgctggcct acgatgatga cgtctccttc ttggatgcct acaaccgcta 240 ctacaccgtc aatgagctgt cggccaaaga gtgtggctgt gtgtgaaggg ccgggttggg 300 gggtggctca atggggccga agcccgtggt ggggatgggg atggaccccg caccgctgcc 360 cgccccatgg acctcccgtg tccagttgga ggaggagaga cgacccatgg acctaccatg 420 tccattggga agaggaaaga tgccccatgg accctccgtg tccattggga ggaggagaaa 480 tgccccacag accccccatg tccattggga agaggagaga tgccccatgg acccttcgtg 540 tctagtggga a 551 40 94 PRT Gallus gallus 40 Val Asp Pro Arg Val Arg Val Ser Val Ser Glu Leu Gly Leu Gly Tyr 1 5 10 15 Glu Ser Asp Glu Thr Val Leu Phe Arg Tyr Cys Ser Gly Thr Cys Asp 20 25 30 Ala Ala Val Arg Asn Tyr Asp Leu Ser Leu Lys Ser Val Arg Ser Arg 35 40 45 Lys Lys Ile Arg Lys Glu Lys Val Arg Ala Arg Pro Cys Cys Arg Pro 50 55 60 Leu Ala Tyr Asp Asp Asp Val Ser Phe Leu Asp Ala Tyr Asn Arg Tyr 65 70 75 80 Tyr Thr Val Asn Glu Leu Ser Ala Lys Glu Cys Gly Cys Val 85 90 41 594 DNA Homo sapiens 41 atgcagcgct ggaaggcggc ggccttggcc tcagtgctct gcagctccgt gctgtccatc 60 tggatgtgtc gagagggcct gcttctcagc caccgcctcg gacctgcgct ggtccccctg 120 caccgcctgc ctcgaaccct ggacgcccgg attgcccgcc tggcccagta ccgtgcactc 180 ctgcaggggg ccccggatgc gatggagctg cgcgagctga cgccctgggc tgggcggccc 240 ccaggtccgc gccgtcgggc ggggccccgg cggcggcgcg cgcgtgcgcg gttgggggcg 300 cggccttgcg ggctgcgcga gctggaggtg cgcgtgagcg agctgggcct gggctacgcg 360 tccgacgaga cggtgctgtt ccgctactgc gcaggcgcct gcgaggctgc cgcgcgcgtc 420 tacgacctcg ggctgcgacg actgcgccag cggcggcgcc tgcggcggga gcgggtgcgc 480 gcgcagccct gctgccgccc gacggcctac gaggacgagg tgtccttcct ggacgcgcac 540 agccgctacc acacggtgca cgagctgtcg gcgcgcgagt gcgcctgcgt gtga 594 42 197 PRT Homo sapiens 42 Met Gln Arg Trp Lys Ala Ala Ala Leu Ala Ser Val Leu Cys Ser Ser 1 5 10 15 Val Leu Ser Ile Trp Met Cys Arg Glu Gly 
Leu Leu Leu Ser His Arg 20 25 30 Leu Gly Pro Ala Leu Val Pro Leu His Arg Leu Pro Arg Thr Leu Asp 35 40 45 Ala Arg Ile Ala Arg Leu Ala Gln Tyr Arg Ala Leu Leu Gln Gly Ala 50 55 60 Pro Asp Ala Met Glu Leu Arg Glu Leu Thr Pro Trp Ala Gly Arg Pro 65 70 75 80 Pro Gly Pro Arg Arg Arg Ala Gly Pro Arg Arg Arg Arg Ala Arg Ala 85 90 95 Arg Leu Gly Ala Arg Pro Cys Gly Leu Arg Glu Leu Glu Val Arg Val 100 105 110 Ser Glu Leu Gly Leu Gly Tyr Ala Ser Asp Glu Thr Val Leu Phe Arg 115 120 125 Tyr Cys Ala Gly Ala Cys Glu Ala Ala Ala Arg Val Tyr Asp Leu Gly 130 135 140 Leu Arg Arg Leu Arg Gln Arg Arg Arg Leu Arg Arg Glu Arg Val Arg 145 150 155 160 Ala Gln Pro Cys Cys Arg Pro Thr Ala Tyr Glu Asp Glu Val Ser Phe 165 170 175 Leu Asp Ala His Ser Arg Tyr His Thr Val His Glu Leu Ser Ala Arg 180 185 190 Glu Cys Ala Cys Val 195 43 1023 DNA Mus musculus 43 ggagggagag cgcgcggtgg tttcgtccgt gtgccccgcg cccggcgctc ctcgcgtggc 60 cccgcgtcct gagcgcgctc cagcctccca cgcgcgccac cccggggttc actgagcccg 120 gcgagcccgg ggaagacaga gaaagagagg ccaggggggg aaccccatgg cccggcccgt 180 gtcccgcacc ctgtgcggtg gcctcctccg gcacggggtc cccgggtcgc ctccggtccc 240 cgcgatccgg atggcgcacg cagtggctgg ggccgggccg ggctcgggtg gtcggaggag 300 tcaccactga ccgggtcatc tggagcccgt ggcaggccga ggcccaggat gaggcgctgg 360 aaggcagcgg ccctggtgtc gctcatctgc agctccctgc tatctgtctg gatgtgccag 420 gagggtctgc tcttgggcca ccgcctggga cccgcgcttg ccccgctacg acgccctcca 480 cgcaccctgg acgcccgcat cgcccgcctg gcccagtatc gcgctctgct ccagggcgcc 540 cccgacgcgg tggagcttcg agaactttct ccctgggctg cccgcatccc gggaccgcgc 600 cgtcgagcgg gtccccggcg tcggcgggcg cggccggggg ctcggccttg tgggctgcgc 660 gagctcgagg tgcgcgtgag cgagctgggc ctgggctaca cgtcggatga gaccgtgctg 720 ttccgctact gcgcaggcgc gtgcgaggcg gccatccgca tctacgacct gggccttcgg 780 cgcctgcgcc agcggaggcg cgtgcgcaga gagcgggcgc gggcgcaccc gtgttgtcgc 840 ccgacggcct atgaggacga ggtgtccttc ctggacgtgc acagccgcta ccacacgctg 900 caagagctgt cggcgcggga gtgcgcgtgc gtgtgatgct acctcacgcc ccccgacctg 960 cgaaagggcc ctccctgccg accctcgctg 
agaactgact tcacataaag tgtgggaact 1020 ccc 1023 44 195 PRT Mus musculus 44 Met Arg Arg Trp Lys Ala Ala Ala Leu Val Ser Leu Ile Cys Ser Ser 1 5 10 15 Leu Leu Ser Val Trp Met Cys Gln Glu Gly Leu Leu Leu Gly His Arg 20 25 30 Leu Gly Pro Ala Leu Ala Pro Leu Arg Arg Pro Pro Arg Thr Leu Asp 35 40 45 Ala Arg Ile Ala Arg Leu Ala Gln Tyr Arg Ala Leu Leu Gln Gly Ala 50 55 60 Pro Asp Ala Val Glu Leu Arg Glu Leu Ser Pro Trp Ala Ala Arg Ile 65 70 75 80 Pro Gly Pro Arg Arg Arg Ala Gly Pro Arg Arg Arg Arg Ala Arg Pro 85 90 95 Gly Ala Arg Pro Cys Gly Leu Arg Glu Leu Glu Val Arg Val Ser Glu 100 105 110 Leu Gly Leu Gly Tyr Thr Ser Asp Glu Thr Val Leu Phe Arg Tyr Cys 115 120 125 Ala Gly Ala Cys Glu Ala Ala Ile Arg Ile Tyr Asp Leu Gly Leu Arg 130 135 140 Arg Leu Arg Gln Arg Arg Arg Val Arg Arg Glu Arg Ala Arg Ala His 145 150 155 160 Pro Cys Cys Arg Pro Thr Ala Tyr Glu Asp Glu Val Ser Phe Leu Asp 165 170 175 Val His Ser Arg Tyr His Thr Leu Gln Glu Leu Ser Ala Arg Glu Cys 180 185 190 Ala Cys Val 195 45 261 PRT Danio rerio 45 Ile Phe Gly Glu Pro Glu Pro Val Lys Met Ile Ser Glu Gly Ser Asp 1 5 10 15 Cys Arg Cys Lys Cys Val Met Arg Pro Leu Ser Ile Glu Ala Cys Ser 20 25 30 Arg Leu Arg Asp Gly Ser Leu Arg Val Asp Asp Phe Tyr Thr Val Glu 35 40 45 Thr Val Ser Ser Gly Ser Asp Cys Lys Cys Ser Cys Thr Ala Pro Pro 50 55 60 Ser Ser Leu Asn Pro Cys Glu Asn Glu Trp Arg Thr Glu Lys Leu Met 65 70 75 80 Lys Gln Ala Pro Glu Leu Leu Lys Leu His Ser Met Val Asp Leu Leu 85 90 95 Glu Gly Thr Leu Tyr Ser Met Asp Leu Met Lys Val His Ala Tyr Met 100 105 110 Asn Lys Val Val Ser Gln Met Asn Thr Leu Glu Glu Thr Ile Lys Thr 115 120 125 Asn Leu Thr Arg Glu Asn Glu Phe Val Arg Asp Ser Val Val Asn Leu 130 135 140 Ser Asn Gln Leu Lys Arg Tyr Glu Asn Tyr Ser Asp Ile Met Val Ser 145 150 155 160 Ile Lys Lys Glu Ile Ser Ser Leu Gly Leu Gln Leu Leu Gln Lys Asp 165 170 175 Ala Ala Ser Asp Ser Lys Ala Gln Gly Thr Glu Ser Lys Lys Ser Lys 180 185 190 Glu Ala Ile Lys Pro Pro Asn Lys Lys Pro Pro
Ala Val Lys Pro Pro 195 200 205 Pro Lys Gln Pro Lys Glu Lys Pro Val Lys Pro Lys Lys Glu Ala Pro 210 215 220 Ala Lys Ala Ala Lys Pro Ala Lys Pro Asp Pro Thr Thr Lys Thr Lys 225 230 235 240 Thr Ser Val His Gln Thr Gly Val Ile Arg Gly Ile Thr Tyr Tyr Lys 245 250 255 Ala Ser Lys Ser Glu 260 46 146 PRT Danio rerio 46 Met Trp Arg Ile Val Glu Leu Val Ala Cys Leu Leu Met Met Ser Ser 1 5 10 15 His Val Ser Ser Gln Ser Lys Ile Phe Gly Glu Glu Gln Val Arg Met 20 25 30 Thr Ser Glu Gly Ser Asp Cys Arg Cys Lys Cys Ile Met Arg Pro Leu 35 40 45 Thr Arg Asp Ala Cys Ala Arg Leu Arg Thr Gly Ser Val Arg Val Glu 50 55 60 Asp Phe Tyr Thr Val Glu Thr Val Ser Ser Gly Ala Asp Cys Lys Cys 65 70 75 80 Ser Cys Thr Ala Pro Pro Ser Ser Leu Asn Pro Cys Glu Asn Glu Trp 85 90 95 Lys Arg Glu Lys Leu Lys Lys Gln Ala Pro Glu Leu Leu Lys Leu Gln 100 105 110 Ser Met Val Asp Leu Leu Glu Gly Thr Leu Phe Ser Met Asp Leu Leu 115 120 125 Lys Val His Ser Tyr Ile Asn Lys Val Val Ser Gln Met Asn Asn Leu 130 135 140 Glu Glu 145 47 681 PRT Mus musculus 47 Met Glu Ala Ala Ala Val Leu Pro Arg Tyr Leu Gln Leu Arg Leu Leu 1 5 10 15 Leu Val Leu Leu Leu Leu Val Leu Leu Arg Ala Gly Pro Val Trp Pro 20 25 30 Asp Ser Lys Val Phe Ser Asp Leu Asp Gln Val Arg Met Thr Ser Glu 35 40 45 Gly Ser Asp Cys Arg Cys Lys Cys Ile Met Arg Pro Leu Ser Lys Asp 50 55 60 Ala Cys Ser Arg Val Arg Ser Gly Arg Ala Arg Val Glu Asp Phe Tyr 65 70 75 80 Thr Val Glu Thr Val Ser Ser Gly Ala Asp Cys Arg Cys Ser Cys Thr 85 90 95 Ala Pro Pro Ser Ser Leu Asn Pro Cys Glu Asn Glu Trp Lys Met Glu 100 105 110 Lys Leu Lys Lys Gln Ala Pro Glu Leu Leu Lys Leu Gln Ser Met Val 115 120 125 Asp Leu Leu Glu Gly Ala Leu Tyr Ser Met Asp Leu Met Lys Val His 130 135 140 Ala Tyr Ile Gln Lys Val Ala Ser Gln Met Asn Thr Leu Glu Glu Ser 145 150 155 160 Ile Lys Ala Asn Leu Ser Leu Glu Asn Lys Val Val Lys Asp Ser Val 165 170 175 His His Leu Ser Glu Gln Leu Lys Ser Tyr Glu Asn Gln Ser Ala Ile 180 185 190 Met Met Ser Ile Lys Lys Glu Leu Ser Ser Leu Gly Leu Gln Leu Leu 195 200 
205 Gln Arg Asp Ala Ala Ala Val Pro Ala Thr Ala Pro Ala Ser Ser Pro 210 215 220 Asp Ser Lys Ala Gln Asp Thr Ala Gly Gly Gln Gly Arg Asp Leu Asn 225 230 235 240 Lys Tyr Gly Ser Ile Gln Lys Ser Phe Ser Asp Lys Gly Leu Ala Lys 245 250 255 Pro Pro Lys Glu Lys Leu Leu Lys Val Glu Lys Leu Arg Lys Glu Ser 260 265 270 Ile Lys Gly Arg Ile Pro Gln Pro Thr Ala Arg Pro Arg Ala Leu Ala 275 280 285 Gln Gln Gln Ala Val Ile Arg Gly Phe Thr Tyr Tyr Lys Ala Gly Arg 290 295 300 Gln Glu Ala Arg Gln Glu Ala Arg Gln Glu Ala Pro Lys Ala Ala Ala 305 310 315 320 Asp Ser Thr Leu Lys Gly Thr Ser Trp Leu Glu Lys Leu Pro Pro Lys 325 330 335 Ile Glu Ala Lys Leu Pro Glu Pro Asn Ser Ala Lys His Asp Asp Val 340 345 350 Arg Leu Gln Ala Ser Glu Gly Gly Asn Leu Thr Pro Asp Ile Thr Thr 355 360 365 Thr Thr Thr Ser Thr Ser Ser Ser Thr Thr Thr Thr Thr Gly Thr Thr 370 375 380 Ser Thr Thr Ser Thr Thr Ser Thr Thr Ser Thr Thr Thr Pro Ser Pro 385 390 395 400 Ile Thr Thr Pro Trp Pro Thr Glu Pro Pro Leu His Pro Glu Val Pro 405 410 415 Ser Gln Gly Arg Glu Asp Ser Cys Glu Gly Thr Leu Arg Ala Val Asp 420 425 430 Pro Pro Val Lys His His Ser Tyr Gly Arg His Glu Gly Ala Trp Met 435 440 445 Lys Asp Pro Ala Ala Leu Asp Asp Arg Ile Tyr Val Thr Asn Tyr Tyr 450 455 460 Tyr Gly Asn Ser Leu Val Glu Phe Arg Asn Leu Glu Asn Phe Lys Gln 465 470 475 480 Gly Arg Trp Ser Asn Met Tyr Lys Leu Pro Tyr Asn Trp Ile Gly Thr 485 490 495 Gly His Val Val Tyr Gln Gly Ala Phe Tyr Tyr Asn Arg Ala Phe Thr 500 505 510 Lys Asn Ile Ile Lys Tyr Asp Leu Arg Gln Arg Phe Val Ala Ser Trp 515 520 525 Ala Leu Leu Pro Asp Val Val Tyr Glu Asp Thr Thr Pro Trp Lys Trp 530 535 540 Arg Gly His Ser Asp Ile Asp Phe Ala Val Asp Glu Ser Gly Leu Trp 545 550 555 560 Val Ile Tyr Pro Ala Val Asp Glu His Asp Glu Thr Gln His Glu Val 565 570 575 Ile Val Leu Ser Arg Leu Asp Pro Ala Asp Leu Ser Val His Arg Glu 580 585 590 Thr Thr Trp Lys Thr Arg Leu Arg Arg Asn Ser Tyr Gly Asn Cys Phe 595 600 605 Leu Val Cys Gly Ile Leu Tyr Thr Val Asp Thr Tyr Asn Gln His Glu 610 615 620 
Gly Gln Val Ala Tyr Ala Phe Asp Thr His Thr Gly Thr Asp Ala His 625 630 635 640 Pro Gln Leu Pro Phe Leu Asn Glu Tyr Ser Tyr Thr Thr Gln Val Asp 645 650 655 Tyr Asn Pro Lys Glu Arg Val Leu Tyr Ala Trp Asp Asn Gly His Gln 660 665 670 Leu Thr Tyr Thr Leu His Phe Val Val 675 680 48 704 PRT Homo sapiens 48 Met Ala Ala Ala Ala Leu Pro Pro Arg Pro Leu Leu Leu Leu Pro Leu 1 5 10 15 Val Leu Leu Leu Ser Gly Arg Pro Thr Arg Ala Asp Ser Lys Val Phe 20 25 30 Gly Asp Leu Asp Gln Val Arg Met Thr Ser Glu Gly Ser Asp Cys Arg 35 40 45 Cys Lys Cys Ile Met Arg Pro Leu Ser Lys Asp Ala Cys Ser Arg Val 50 55 60 Arg Ser Gly Arg Ala Arg Val Glu Asp Phe Tyr Thr Val Glu Thr Val 65 70 75 80 Ser Ser Gly Thr Asp Cys Arg Cys Ser Cys Thr Ala Pro Pro Ser Ser 85 90 95 Leu Asn Pro Cys Glu Asn Glu Trp Lys Met Glu Lys Leu Lys Lys Gln 100 105 110 Ala Pro Glu Leu Leu Lys Ser Ile Lys Ala Asn Leu Ser Arg Glu Asn 115 120 125 Glu Val Val Lys Asp Ser Val Arg His Leu Ser Glu Gln Leu Arg His 130 135 140 Tyr Glu Asn His Ser Ala Ile Met Leu Gly Ile Lys Lys Glu Leu Ser 145 150 155 160 Arg Leu Gly Leu Gln Leu Leu Gln Lys Asp Ala Ala Ala Ala Pro Ala 165 170 175 Thr Pro Ala Thr Gly Thr Gly Ser Lys Ala Gln Asp Thr Ala Arg Gly 180 185 190 Lys Gly Lys Asp Ile Ser Lys Tyr Gly Ser Val Gln Lys Ser Phe Ala 195 200 205 Asp Arg Gly Leu Pro Lys Pro Pro Lys Glu Lys Leu Leu Gln Val Glu 210 215 220 Lys Leu Arg Lys Glu Ser Gly Lys Gly Ser Phe Leu Gln Pro Thr Ala 225 230 235 240 Lys Pro Arg Ala Leu Ala Gln Gln Gln Ala Val Ile Arg Gly Phe Thr 245 250 255 Tyr Tyr Lys Ala Gly Lys Gln Glu Val Thr Glu Ala Val Ala Asp Asn 260 265 270 Ala Leu Gln Gly Thr Ser Trp Leu Glu Gln Leu Pro Pro Lys Val Glu 275 280 285 Gly Arg Ser Asn Ser Ala Glu Pro Asn Ser Ala Glu Gln Asp Glu Ala 290 295 300 Glu Pro Arg Ser Ser Glu Arg Val Asp Leu Ala Ser Gly Thr Thr His 305 310 315 320 Leu Ile Leu Pro Pro His Ser Leu His His His Ser Thr Pro Val Leu 325 330 335 Ala Thr Pro Ala Pro Phe His Leu Gln Cys His Asn Lys Pro Val Pro 340 345 350 Ser Pro Arg Arg Trp Gln 
Thr Thr Pro Ser Arg Ala Leu Pro Gly Trp 355 360 365 Ser Asn Cys Arg Pro Arg Trp Arg Ala Gly Pro Thr Pro Gln Ser Pro 370 375 380 Thr Pro Gln Ser Arg Met Arg Leu Ser Pro Gly Pro Pro Ser Glu Trp 385 390 395 400 Thr Trp Leu Leu Ala Pro His Phe Asn Pro Cys His His His His Arg 405 410 415 His Pro His Pro Gln Pro Pro Thr Thr Ser Leu Leu Pro Thr Glu Pro 420 425 430 Pro Ser Gly Pro Glu Val Ser Ser Gln Gly Arg Glu Ala Ser Cys Glu 435 440 445 Gly Thr Leu Arg Ala Val Asp Pro Pro Val Arg His His Ser Tyr Gly 450 455 460 Arg His Glu Gly Ala Trp Met Lys Asp Pro Ala Ala Arg Asp Asp Arg 465 470 475 480 Ile Tyr Val Thr Asn Tyr Tyr Tyr Gly Asn Ser Leu Val Glu Phe Arg 485 490 495 Asn Leu Glu Asn Phe Lys Gln Gly Arg Trp Ser Asn Met Tyr Lys Leu 500 505 510 Pro Tyr Asn Trp Ile Gly Thr Gly His Val Val Tyr Gln Gly Ala Phe 515 520 525 Tyr Tyr Asn Arg Ala Phe Thr Lys Asn Ile Ile Lys Tyr Asp Leu Arg 530 535 540 Gln Arg Phe Val Ala Ser Trp Ala Leu Leu Pro Asp Val Val Tyr Glu 545 550 555 560 Asp Thr Thr Pro Trp Lys Trp Arg Gly His Ser Asp Ile Asp Phe Ala 565 570 575 Val Asp Glu Ser Gly Leu Trp Val Ile Tyr Pro Ala Val Asp Asp Arg 580 585 590 Asp Glu Ala Gln Pro Glu Val Ile Val Leu Ser Arg Leu Asp Pro Gly 595 600 605 Asp Leu Ser Val His Arg Glu Thr Thr Trp Lys Thr Arg Leu Arg Arg 610 615 620 Asn Ser Tyr Gly Asn Cys Phe Leu Val Cys Gly Ile Leu Tyr Ala Val 625 630 635 640 Asp Thr Tyr Asn Gln Gln Glu Gly Gln Val Ala Tyr Ala Phe Asp Thr 645 650 655 His Thr Gly Thr Asp Ala Arg Pro Gln Leu Pro Phe Leu Asn Glu His 660 665 670 Ala Tyr Thr Thr Gln Ile Asp Tyr Asn Pro Lys Glu Arg Val Leu Tyr 675 680 685 Ala Trp Asp Asn Gly His Gln Leu Thr Tyr Thr Leu His Phe Val Val 690 695 700 49 831 PRT Rattus norvegicus 49 Met Ala Tyr Pro Leu Pro Leu Val Leu Cys Phe Ala Leu Val Val Ala 1 5 10 15 Arg Val Trp Gly Ser Ser Thr Pro Pro Thr Gly Thr Ser Glu Pro Pro 20 25 30 Asp Val Gln Thr Val Ala Pro Thr Glu Asp Asp Val Leu Gln Asn Glu 35 40 45 Ala Asp Asn Gln Glu Asn Val Leu Ser Gln Leu Leu Gly Asp Tyr Asp 50 55 60 Lys Val 
Lys Ala Val Ser Glu Gly Ser Asp Cys Gln Cys Lys Cys Val 65 70 75 80 Val Arg Pro Leu Gly Arg Asp Ala Cys Gln Arg Ile Asn Glu Gly Ala 85 90 95 Ser Arg Lys Glu Asp Phe Tyr Thr Val Glu Thr Ile Thr Ser Gly Ser 100 105 110 Ser Cys Lys Cys Ala Cys Val Ala Pro Pro Ser Ala Val Asn Pro Cys 115 120 125 Glu Gly Asp Phe Arg Leu Gln Lys Leu Arg Glu Ala Asp Ser Arg Asp 130 135 140 Leu Lys Leu Ser Thr Ile Ile Asp Met Leu Glu Gly Ala Phe Tyr Gly 145 150 155 160 Leu Asp Leu Leu Lys Leu His Ser Val Thr Thr Lys Leu Val Gly Arg 165 170 175 Val Asp Lys Leu Glu Glu Glu Val Ser Lys Asn Leu Thr Lys Glu Asn 180 185 190 Glu Gln Ile Lys Glu Asp Val Glu Glu Ile Arg Thr Glu Leu Asn Lys 195 200 205 Arg Gly Lys Glu Asn Cys Ser Asp Asn Ile Leu Gly Asn Met Pro Asp 210 215 220 Ile Arg Ser Ala Leu Gln Arg Asp Ala Ala Ala Ala Tyr Ala His Pro 225 230 235 240 Glu Glu Gln Tyr Glu Glu Arg Phe Leu Gln Glu Glu Thr Val Ser Gln 245 250 255 Gln Ile Asn Ser Ile Glu Leu Leu Arg Thr Gln Pro Leu Ala Pro Pro 260 265 270 Thr Val Met Lys Pro Arg Gln Pro Ser Gln Arg Gln Val His Leu Arg 275 280 285 Gly Arg Leu Ala Ser Lys Pro Thr Val Ile Arg Gly Ile Thr Tyr Tyr 290 295 300 Lys Ala Lys Val Ser Glu Glu Glu Asn Asp Ile Glu Asp Gln His Asp 305 310 315 320 Glu Leu Phe Ser Gly Asp Ser Gly Val Asp Leu Leu Ile Glu Asp Gln 325 330 335 Leu Leu Arg Gln Glu Asp Leu Leu Met Ser Ala Thr Arg Arg Pro Ala 340 345 350 Thr Thr Arg His Ala Ala Ala Val Ser Thr Asp Ala Ser Val Gln Ala 355 360 365 Thr Ala Leu Ser Ser Glu Pro Ala Gln Ala Ser Ala Ser Ala Pro Ser 370 375 380 Leu Val Asp Pro Ala Ser Gln Ala Pro Asp Arg Gln Leu Leu Ala Ser 385 390 395 400 Pro Gln Thr Thr Thr Val Ser Pro Glu Thr Met Gly Val Met Pro Ser 405 410 415 Thr Gln Val Ser Pro Thr Thr Val Ala His Thr Ala Ile Gln Pro Pro 420 425 430 Pro Ala Met Ile Pro Gly Asp Ile Phe Val Glu Ala Leu His Leu Val 435 440 445 Pro Met Ser Pro Asp Thr Val Gly Thr Asp Met Ala Glu Glu Glu Gly 450 455 460 Thr Ala Arg Gln Glu Ala Thr Ser Ala Ser Pro Ile Leu Ser Pro Glu 465 470 475 480 Glu Glu Asp 
Asp Ile Arg Asn Val Ile Gly Val Phe Lys Cys Ser Glu 485 490 495 Ala Pro His Ile Ser Ala Ala Phe Leu His Gly Leu Lys His Ser Val 500 505 510 Gly Phe Ser Val Pro Trp Arg His Val Glu Ile Cys Leu Lys Ile Arg 515 520 525 Val Ser Val Leu Leu Ser Leu Val Trp Gln Gly Leu Pro Gly Tyr Gln 530 535 540 Ala Ile Pro Lys Arg Tyr Phe Glu Glu Asn Gly Trp Ile Pro Ala Pro 545 550 555 560 Pro Arg Lys Thr Gly Val Leu Lys Glu Ala Leu Gln Leu Glu Cys Lys 565 570 575 Asp Thr Leu Ser Thr Ile Thr Gly Pro Thr Thr Gln Asn Thr Tyr Gly 580 585 590 Arg Asn Glu Gly Ala Trp Met Lys Asp Pro Leu Ala Lys Asp Asp Arg 595 600 605 Ile Tyr Val Thr Asn Tyr Tyr Tyr Gly Asn Thr Leu Val Glu Phe Arg 610 615 620 Asn Leu Glu Asn Phe Lys Gln Gly Arg Trp Ser Asn Ser Tyr Lys Leu 625 630 635 640 Pro Tyr Ser Trp Ile Gly Thr Gly His Val Val Tyr Asn Gly Ala Phe 645 650 655 Tyr Tyr Asn Arg Ala Phe Thr Arg Asn Ile Ile Lys Tyr Asp Leu Lys 660 665 670 Gln Arg Tyr Val Ala Ala Trp Ala Met Leu His Asp Val Ala Tyr Glu 675 680 685 Glu Thr Thr Pro Trp Arg Trp Gln Gly His Ser Asp Val Asp Phe Ala 690 695 700 Val Asp Glu Asn Gly Leu Trp Leu Ile Tyr Pro Ala Leu Asp Asp Glu 705 710 715 720 Gly Phe Ser Gln Glu Val Ile Val Leu Ser Lys Leu Asn Ala Val Asp 725 730 735 Leu Ser Thr Gln Lys Glu Thr Thr Trp Arg Thr Gly Leu Arg Arg Asn 740 745 750 Phe Tyr Gly Asn Cys Phe Val Ile Cys Gly Val Leu Tyr Ala Val Asp 755 760 765 Ser Tyr Asn Gln Arg Asn Ala Asn Ile Ser Tyr Ala Phe Asp Thr His 770 775 780 Thr Asn Thr Gln Ile Val Pro Arg Leu Leu Phe Glu Asn Glu Tyr Ser 785 790 795 800 Tyr Thr Thr Gln Ile Asp Tyr Asn Pro Lys Asp Arg Leu Leu Tyr Ala 805 810 815 Trp Asp Asn Gly His Gln Val Thr Tyr His Val Ile Phe Ala Tyr 820 825 830 50 860 PRT Homo sapiens 50 Met Ala Lys Pro Arg Leu Leu Val Leu Tyr Phe Ala Leu Ile Val Val 1 5
10 15 Pro Ala Trp Val Ser Ser Ile Val Leu Thr Gly Thr Ser Glu Pro Pro 20 25 30 Asp Ala Gln Thr Val Ala Pro Ala Glu Asp Glu Thr Leu Gln Asn Glu 35 40 45 Ala Asp Asn Gln Glu Asn Val Leu Ser Gln Leu Leu Gly Asp Tyr Asp 50 55 60 Lys Val Lys Ala Met Ser Glu Gly Ser Asp Cys Gln Cys Lys Cys Val 65 70 75 80 Val Arg Pro Leu Gly Arg Asp Ala Cys Gln Arg Ile Asn Ala Gly Ala 85 90 95 Ser Arg Lys Glu Asp Phe Tyr Thr Val Glu Thr Ile Thr Ser Gly Ser 100 105 110 Ser Cys Lys Cys Ala Cys Val Ala Pro Pro Ser Ala Leu Asn Pro Cys 115 120 125 Glu Gly Asp Phe Arg Leu Gln Lys Leu Arg Glu Ala Asp Ser Gln Asp 130 135 140 Leu Lys Val Gly Pro Gly Met Gly Gln Cys Leu Gly Arg Glu Gly Thr 145 150 155 160 Phe Glu Ile His Lys Ser Gly Lys Ala Met Val Glu Asp Ser Lys Pro 165 170 175 Phe Glu Glu Gly Leu Ser His Phe Leu Thr Gln Thr Phe Arg Lys Ala 180 185 190 Glu Cys Thr Tyr Thr Ile Val Leu Ala Tyr Ile Pro Val Tyr Thr Asn 195 200 205 Val Phe Leu Thr Ala Thr Ser Gln Phe Leu Ala Ser Gly Phe Pro Val 210 215 220 Glu Pro Pro Leu Ser Thr Ile Ile Asp Met Leu Glu Gly Ala Phe Tyr 225 230 235 240 Gly Leu Asp Leu Leu Lys Leu His Ser Val Thr Thr Lys Leu Val Gly 245 250 255 Arg Val Asp Lys Leu Glu Glu Met Leu Glu Gly Ala Phe Tyr Gly Leu 260 265 270 Asp Leu Leu Lys Leu His Ser Val Thr Thr Lys Leu Val Gly Arg Val 275 280 285 Asp Lys Leu Glu Glu Glu Val Ser Lys Asn Thr Lys Glu Asn Glu Gln 290 295 300 Ile Lys Glu Asp Met Glu Glu Ile Arg Thr Glu Met Asn Lys Arg Gly 305 310 315 320 Lys Glu Asn Cys Ser Glu Asn Ile Leu Asp Ser Met Pro Asp Ile Arg 325 330 335 Ser Ala Leu Gln Arg Asp Ala Ala Ala Ala Tyr Ala His Pro Glu Tyr 340 345 350 Glu Glu Arg Phe Leu Gln Glu Glu Thr Val Ser Gln Gln Ile Asn Ser 355 360 365 Ile Glu Leu Leu Gln Thr Arg Pro Leu Ala Leu Pro Glu Val Val Lys 370 375 380 Ser Gln Arg Pro Leu Gln Arg Gln Val His Leu Arg Gly Arg Pro Ala 385 390 395 400 Ser Gln Pro Thr Val Ile Arg Gly Ile Thr Tyr Tyr Lys Ala Lys Val 405 410 415 Ser Glu Glu Glu Asn Asp Ile Glu Glu Gln Gln Asp Glu Phe Phe Ser 420 425 430 Gly Asp Asn 
Gly Val Asp Leu Leu Ile Glu Asp Gln Leu Leu Arg His 435 440 445 Asn Gly Leu Met Thr Ser Val Thr Arg Arg Pro Ala Ala Thr Arg Gln 450 455 460 Gly His Ser Thr Ala Val Thr Ser Asp Leu Asn Ala Arg Thr Ala Pro 465 470 475 480 Trp Ser Ser Ala Leu Pro Gln Pro Ser Thr Ser Asp Pro Ser Ile Ala 485 490 495 Asn His Ala Ser Val Gly Pro Thr Leu Gln Thr Thr Ser Val Ser Pro 500 505 510 Asp Pro Thr Arg Glu Ser Val Leu Gln Pro Ser Pro Gln Val Pro Ala 515 520 525 Thr Thr Val Ala His Thr Ala Thr Gln Gln Pro Ala Ala Pro Ala Pro 530 535 540 Pro Ala Val Ser Pro Arg Glu Ala Leu Met Glu Ala Met His Thr Val 545 550 555 560 Pro Val Pro Pro Thr Thr Val Arg Thr Asp Ser Leu Gly Lys Asp Ala 565 570 575 Pro Ala Gly Trp Gly Thr Thr Pro Ala Ser Pro Thr Leu Ser Pro Glu 580 585 590 Glu Glu Asp Asp Ile Arg Asn Val Ile Gly Arg Cys Lys Asp Thr Leu 595 600 605 Ser Thr Ile Thr Gly Pro Thr Thr Gln Asn Thr Tyr Gly Arg Asn Glu 610 615 620 Gly Ala Trp Met Lys Asp Pro Leu Ala Lys Asp Glu Arg Ile Tyr Val 625 630 635 640 Thr Asn Tyr Tyr Tyr Gly Asn Thr Leu Val Glu Phe Arg Asn Leu Glu 645 650 655 Asn Phe Lys Gln Gly Arg Trp Ser Asn Ser Tyr Lys Leu Pro Tyr Ser 660 665 670 Trp Ile Gly Thr Gly His Val Val Tyr Asn Gly Ala Phe Tyr Tyr Asn 675 680 685 Arg Ala Phe Thr Arg Asn Ile Ile Lys Tyr Asp Leu Lys Gln Arg Tyr 690 695 700 Val Ala Ala Trp Ala Met Leu His Asp Val Ala Tyr Glu Glu Ala Thr 705 710 715 720 Pro Trp Arg Trp Gln Gly His Ser Asp Val Asp Phe Ala Val Asp Glu 725 730 735 Asn Gly Leu Trp Leu Ile Tyr Pro Ala Leu Asp Asp Glu Gly Phe Ser 740 745 750 Gln Glu Val Ile Val Leu Ser Lys Leu Asn Ala Ala Asp Leu Ser Thr 755 760 765 Gln Lys Glu Thr Thr Trp Arg Thr Gly Leu Arg Arg Asn Phe Tyr Gly 770 775 780 Asn Cys Phe Val Ile Cys Gly Val Leu Tyr Ala Val Asp Ser Tyr Asn 785 790 795 800 Gln Arg Asn Ala Asn Ile Ser Tyr Ala Phe Asp Thr His Thr Asn Thr 805 810 815 Gln Ile Val Pro Arg Leu Leu Phe Glu Asn Glu Tyr Ser Tyr Thr Thr 820 825 830 Gln Ile Asp Tyr Asn Pro Lys Asp Arg Leu Leu Tyr Ala Trp Asp Asn 835 840 845 Gly His Gln Val 
Thr Tyr His Val Ile Phe Ala Tyr 850 855 860 51 719 PRT Danio rerio 51 Met Thr Glu Met Lys Ile Trp Cys Val Leu Leu Met Ala Phe Ala Leu 1 5 10 15 Thr Ser Ala Ala Pro Lys Ser His Leu Arg Leu Glu Glu Lys Thr Lys 20 25 30 Asp Asn Asn Asp Thr Leu Gln Val Glu Ile Asp Asn Gln Glu His Ile 35 40 45 Leu Ser Gln Leu Leu Gly Asp Tyr Asp Lys Val Lys Ala Leu Ser Glu 50 55 60 Gly Ser Asp Cys Gly Cys Lys Cys Val Val Arg Pro Leu Ser Ala Ser 65 70 75 80 Ala Cys Gln Arg Ile Arg Glu Gly His Ala Thr Pro Gln Asp Phe Tyr 85 90 95 Thr Val Glu Thr Ile Thr Ser Gly Pro His Cys Lys Cys Ala Cys Ile 100 105 110 Ala Pro Pro Ser Ala Leu Asn Pro Cys Glu Gly Asp Phe Arg Leu Lys 115 120 125 Lys Leu Arg Gln Ala Gly Lys Asp Asn Ile Lys Leu Ser Thr Ile Leu 130 135 140 Glu Leu Leu Glu Gly Ser Phe Tyr Gly Met Asp Leu Leu Lys Leu His 145 150 155 160 Ser Val Thr Thr Lys Ile Leu Asp Arg Met Asp Thr Ile Glu Lys Met 165 170 175 Val Leu Asn Asn Gln Thr Glu Glu Lys Leu Asn Thr Ile Ser Thr Ser 180 185 190 Pro Asn Pro Gln Leu Ser Thr Ser Ser Pro Thr Thr Leu Pro Ser Val 195 200 205 Ile Gln Glu Lys Ser Thr Ser Leu Arg Gln Gln Asn Asp Glu Ala Ala 210 215 220 Ala Phe Gln His Met Glu Ser Lys Tyr Glu Glu Lys Phe Val Gly Asp 225 230 235 240 Ile Leu Asn Ser Gly Ser Asp Leu Asn Lys Ala Thr Thr Ala Leu Gln 245 250 255 Glu Gln Glu Gln Gln Gly Arg Lys Lys Gln Pro Lys Ile Thr Val Arg 260 265 270 Gly Ile Thr Tyr Tyr Arg Ser Asp Pro Val Asp Glu Met Asp Ser Glu 275 280 285 Lys Asn Leu Lys Glu Thr Ser Ala Ser Ser Val Thr Gln Thr Gly Ala 290 295 300 Leu Ile Lys Glu His Leu Lys Ala Ser Thr Gln Ser Thr Leu Asn Thr 305 310 315 320 Leu Thr Pro Ser Pro Thr Ser His Ser Asn Ala Leu Thr Val Thr Glu 325 330 335 Ser Ser Val Gly Ile Asn Ala His Lys Gly Glu Val Thr Thr Ile Val 340 345 350 Met Thr Ala Ser Val Thr Gly Ser Lys Thr Asp Ser Val Thr Asp Leu 355 360 365 Thr Gln Leu Ser Pro Arg Val Arg Glu Thr Leu Thr Thr Thr Arg Thr 370 375 380 Thr Thr Lys Thr Ala Thr Thr Ser Gln Pro Val Lys Arg Lys Tyr Ser 385 390 395 400 Ile Ser Trp Asp Glu Glu Glu 
Glu Ala Val Val Pro Glu Gln Val Glu 405 410 415 Glu Glu Lys Ala Val Lys Pro Val Val Glu Asp Lys Val Gly Glu Glu 420 425 430 Pro Gln Arg Lys Pro Gly Thr Ala His His Gln Ala Lys Thr Ile Ser 435 440 445 Thr Val Lys Gln Gln Ile Lys Phe Ser Leu Gly Met Cys Lys Asp Thr 450 455 460 Leu Ala Thr Ile Ser Glu Pro Ile Thr His Asn Thr Tyr Gly Arg Asn 465 470 475 480 Glu Gly Ala Trp Met Lys Asp Pro Leu Asp Gln Asp Asp Lys Ile Tyr 485 490 495 Val Thr Asn Tyr Tyr Tyr Gly Asn Asn Leu Leu Glu Phe Arg Asn Ile 500 505 510 Asp Val Phe Lys Gln Gly Arg Phe Thr Asn Ser Tyr Lys Leu Pro Tyr 515 520 525 Asn Trp Ile Gly Thr Gly His Val Val Tyr Lys Gly Ala Phe Tyr Tyr 530 535 540 Asn Arg Ala Phe Ser Arg Asp Ile Ile Lys Phe Asp Leu Arg Leu Arg 545 550 555 560 Tyr Val Ala Ala Trp Thr Met Leu His Asp Ala Val Phe Glu Asn Asp 565 570 575 Asp Val Ser Ser Trp Arg Trp Arg Gly Asn Ser Asp Met Asp Leu Ala 580 585 590 Ile Asp Glu Ser Gly Leu Trp Val Ile Tyr Pro Ala Leu Asp Asp Glu 595 600 605 Gly Phe Leu Gln Glu Val Ile Val Leu Ser Arg Leu Asn Pro Thr Asp 610 615 620 Leu Ser Met Lys Arg Glu Thr Thr Trp Arg Thr Gly Leu Arg Arg Asn 625 630 635 640 Arg Tyr Gly Asn Cys Phe Ile Val Cys Gly Val Leu Tyr Ala Thr Asp 645 650 655 Ser Tyr Asn Gln Gln Asp Thr Asn Leu Ser Tyr Ala Phe Asp Thr His 660 665 670 Thr Asn Thr Gln Val Ile Pro His Leu Pro Phe Ser Asn Asn Tyr Thr 675 680 685 Tyr Val Thr Gln Ile Asp Tyr Asn Pro Lys Glu Arg Val Leu Tyr Ala 690 695 700 Trp Asp Asn Gly His Gln Val Thr Tyr Asn Val Gln Phe Ala Tyr 705 710 715 52 656 PRT Danio rerio 52 Met Gly Leu Leu Leu Tyr Ile Phe Cys Cys Val Phe Cys Leu Thr Arg 1 5 10 15 Ala Asn Val Glu Gln Gln Ala Thr Asp Asn Thr Asp Asn Arg Ala Thr 20 25 30 Leu Glu Asp Glu Met Asp Asn Gln Glu Asn Ile Leu Thr Gln Leu Ile 35 40 45 Gly Asp Tyr Asp Lys Val Lys Thr Leu Ser Glu Gly Ser Asp Cys Gln 50 55 60 Cys Lys Cys Val Val Arg Pro Met Ser Arg Ser Ala Cys Lys Arg Ile 65 70 75 80 Glu Glu Ala Gln Ala Lys Ile Glu Asp Phe Tyr Thr Val Glu Pro Val 85 90 95 Thr Ala Gly Pro Asn Cys Lys Lys 
Cys Ala Cys Ile Ala Pro Pro Ser 100 105 110 Ala Leu Asn Pro Cys Glu Gly Asp Phe Arg Phe Lys Lys Leu Gln Lys 115 120 125 Thr Gly Gln Tyr Asp Ile Lys Leu Ser Asn Ile Met Asp Leu Leu Glu 130 135 140 Gly Ser Phe Tyr Gly Met Asp Leu Leu Lys Leu His Ser Val Thr Thr 145 150 155 160 Lys Leu Leu Glu Arg Val Asp Asn Ile Glu Lys Ser Phe Ser Gly Asn 165 170 175 Leu Thr Lys Glu Lys Val Ser Val Lys Gly Glu Lys Gly Gln Gly Lys 180 185 190 Gly Ala Arg Ser Asn Gln Arg Gln Glu Lys Lys Lys Arg Leu Ser Val 195 200 205 Leu Glu Pro Ser Leu Gln Lys Asn Ala Ala Ala Ala Phe Ala His Thr 210 215 220 Glu Val Gln Met Gln Gln Phe Ile Pro Asp Gln Arg Lys Tyr Glu Glu 225 230 235 240 Lys Phe Val Gly Asn Gln Gly Pro Ser Lys Pro Val Leu Lys Lys Ser 245 250 255 Lys Ser Glu Gly Gln Glu Glu Gln His Lys Pro Ala Lys Thr Lys Ala 260 265 270 Asp Ala Lys Asn Met Ser Leu Arg Ser Met Thr Phe Tyr Lys Ala Asn 275 280 285 Arg Met Glu Asp Ser Glu Gly Glu Glu Arg Met Asp Leu Ile Ile Glu 290 295 300 Asp Gln Leu His Lys Gln Gly Leu Asn Thr Pro Val Thr Thr Pro Glu 305 310 315 320 Ala Thr Val Thr Val Thr Gln Ser Thr Thr Ile Asn Leu Asn Thr Gln 325 330 335 Asn Phe Thr Thr Ala Arg Met Ser Asn Val Thr Lys Gln Thr Gln Gly 340 345 350 Gln Ser Val Lys Ala Met Met Ser Ser Thr Ile Thr Thr Glu Arg Pro 355 360 365 Thr Met Pro Thr Ser Thr Thr Ser Thr Ser Thr Met Thr Pro Gly Thr 370 375 380 Asn Thr Thr Thr Ile Ala Thr Pro Leu Val Val Pro Lys Gln Leu Ala 385 390 395 400 Arg Ile Cys Lys Asp Thr Leu Ala Ser Ile Ser Asp Pro Val Thr His 405 410 415 Asn Lys Tyr Gly Lys Asn Glu Gly Ala Trp Met Lys Asp Pro Lys Gly 420 425 430 Asn Gly Lys Val Val Tyr Val Thr Asp Tyr Tyr Tyr Gly Asn Gln Leu 435 440 445 Leu Glu Phe Arg Asp Ile Asp Thr Phe Lys Gln Gly Gln Val Ser Asn 450 455 460 Ser Tyr Lys Leu Pro Tyr Asn Trp Ile Gly Thr Gly His Val Val Tyr 465 470 475 480 Ser Gly Ser Phe Phe Tyr Asn Arg Ala Phe Ser Arg Asp Ile Ile Arg 485 490 495 Phe Asp Leu Arg Leu Arg Tyr Val Ala Ala Trp Thr Thr Leu His Asp 500 505 510 Ala Ile Leu Glu Glu Glu Glu Ala Pro 
Trp Thr Trp Gly Gly His Ser 515 520 525 Asp Ile Asp Phe Ser Val Asp Glu Ser Gly Leu Trp Leu Val Tyr Pro 530 535 540 Ala Leu Asp Asp Glu Gly Phe His Gln Glu Val Ile Ile Leu Ser Lys 545 550 555 560 Leu Arg Ala Ser Asp Leu Gln Lys Glu Lys Ser Trp Arg Thr Gly Leu 565 570 575 Arg Arg Asn Tyr Tyr Gly Asn Cys Phe Val Ile Cys Gly Val Leu Tyr 580 585 590 Ala Val Asp Ser Phe Glu Arg Thr His Ala Asn Ile Ser Tyr Ala Phe 595 600 605 Asp Thr His Thr His Thr Gln Met Ile Pro Arg Leu Pro Phe Ile Asn 610 615 620 Asn Tyr Thr Tyr Thr Thr Gln Ile Asp Tyr Asn Pro Lys Glu Arg Met 625 630 635 640 Leu Tyr Ala Trp Asp Asn Gly His Gln Val Thr Tyr Asp Val Ile Phe 645 650 655 53 699 PRT Homo sapiens 53 Met Val Lys Arg Lys Ser Ser Glu Gly Gln Glu Gln Asp Gly Gly Arg 1 5 10 15 Gly Ile Pro Leu Pro Ile Gln Thr Phe Leu Trp Arg Gln Thr Ser Ala 20 25 30 Phe Leu Arg Pro Lys Leu Gly Lys Gln Tyr Glu Ala Ser Cys Val Ser 35 40 45 Phe Glu Arg Val Leu Val Glu Asn Lys Leu His Gly Leu Ser Pro Ala 50 55 60 Leu Ser Glu Ala Ile Gln Ser Ile Ser Arg Trp Glu Leu Val Gln Ala 65 70 75 80 Ala Leu Pro His Val Leu His Cys Thr Ala Thr Leu Leu Ser Asn Arg 85 90 95 Asn Lys Leu Gly His Gln Asp Lys Leu Gly Val Ala Glu Thr Lys Leu 100 105 110 Leu His Thr Leu His Trp Met Leu Leu Glu Ala Pro Gln Asp Cys Asn 115 120 125 Asn Glu Arg Phe Gly Gly Thr Asp Arg Gly Ser Ser Trp Gly Gly Ser 130 135 140 Ser Ser Ala Phe Ile His Gln Val Glu Asn Gln Gly Ser Pro Gly Gln 145 150 155 160 Pro Cys Gln Ser Ser Ser Asn Asp Glu Glu Glu Asn Asn Arg Arg Lys 165 170 175 Ile Phe Gln Asn Ser Met Ala Thr Val Glu Leu Phe Val Phe Leu Phe 180 185 190 Ala Pro Leu Val His Arg Ile Lys Glu Ser Asp Leu Thr Phe Arg Leu 195 200 205 Ala Ser Gly Leu Val Ile Trp Gln Pro Met Trp Glu His Arg Gln Pro 210
215 220 Gly Val Ser Gly Phe Thr Ala Leu Val Lys Pro Ile Arg Asn Ile Ile 225 230 235 240 Thr Ala Lys Arg Ser Ser Pro Ile Asn Ser Gln Ser Arg Thr Cys Glu 245 250 255 Ser Pro Asn Gln Asp Ala Arg His Leu Glu Gly Leu Gln Val Val Cys 260 265 270 Glu Thr Phe Gln Ser Asp Ser Ile Ser Pro Lys Ala Thr Ile Ser Gly 275 280 285 Cys His Arg Gly Asn Ser Phe Asp Gly Ser Leu Ser Ser Gln Thr Ser 290 295 300 Gln Glu Arg Gly Pro Ser His Ser Arg Ala Ser Leu Val Ile Pro Pro 305 310 315 320 Cys Gln Arg Ser Arg Tyr Ala Thr Tyr Phe Asp Val Ala Val Leu Arg 325 330 335 Cys Leu Leu Gln Pro His Trp Ser Glu Glu Gly Thr Gln Trp Ser Leu 340 345 350 Met Tyr Tyr Leu Gln Arg Leu Arg His Met Leu Glu Glu Lys Pro Glu 355 360 365 Lys Pro Pro Glu Pro Asp Ile Pro Leu Leu Pro Arg Pro Arg Ser Ser 370 375 380 Ser Met Val Ala Ala Ala Pro Ser Leu Val Asn Thr His Lys Thr Gln 385 390 395 400 Asp Leu Thr Met Lys Cys Asn Glu Glu Glu Lys Ser Leu Ser Ser Glu 405 410 415 Ala Phe Ser Lys Val Ser Leu Thr Asn Leu Arg Arg Ser Ala Val Pro 420 425 430 Asp Leu Ser Ser Asp Leu Gly Met Asn Ile Phe Lys Lys Phe Lys Ser 435 440 445 Arg Lys Glu Asp Arg Glu Arg Lys Gly Ser Ile Pro Phe His His Thr 450 455 460 Gly Lys Arg Arg Pro Arg Arg Met Gly Val Pro Phe Leu Leu His Glu 465 470 475 480 Asp His Leu Asp Val Ser Pro Thr Arg Ser Thr Phe Ser Phe Gly Ser 485 490 495 Phe Ser Gly Leu Gly Glu Asp Arg Arg Gly Ile Glu Lys Gly Gly Trp 500 505 510 Gln Thr Thr Ile Leu Gly Lys Leu Thr Arg Arg Gly Ser Ser Asp Ala 515 520 525 Ala Thr Glu Met Glu Ser Leu Ser Ala Arg His Ser His Ser His His 530 535 540 Thr Leu Val Ser Asp Leu Pro Asp Pro Ser Asn Ser His Gly Glu Asn 545 550 555 560 Thr Val Lys Glu Val Arg Ser Gln Ile Ser Thr Ile Thr Val Ala Thr 565 570 575 Phe Asn Thr Thr Leu Ala Ser Phe Asn Val Gly Tyr Ala Asp Phe Phe 580 585 590 Asn Glu His Met Arg Lys Leu Cys Asn Gln Val Pro Ile Pro Glu Met 595 600 605 Pro His Glu Pro Leu Ala Cys Ala Asn Leu Pro Arg Ser Leu Thr Asp 610 615 620 Ser Cys Ile Asn Tyr Ser Tyr Leu Glu Asp Thr Glu His Ile Asp Gly 625 630 
635 640 Thr Asn Asn Phe Val His Lys Asn Gly Met Leu Asp Leu Ser Val Val 645 650 655 Leu Lys Ala Val Tyr Leu Val Leu Asn His Asp Ile Ser Ser Arg Ile 660 665 670 Cys Asp Val Ala Leu Asn Ile Val Glu Cys Leu Leu Gln Leu Gly Val 675 680 685 Val Pro Cys Val Glu Lys Asn Arg Lys Lys Ser 690 695 54 700 PRT Mus musculus 54 Met Val Lys Arg Lys Ser Ser Glu Gly Gln Glu Gln Asp Gly Gly Arg 1 5 10 15 Gly Ile Pro Leu Pro Ile Gln Thr Phe Leu Trp Arg Gln Thr Ser Ala 20 25 30 Phe Leu Arg Pro Lys Leu Gly Lys Gln Tyr Glu Ala Ser Cys Val Ser 35 40 45 Phe Glu Arg Val Leu Val Glu Asn Lys Leu His Gly Leu Ser Pro Ala 50 55 60 Leu Ser Glu Ala Ile Gln Ser Ile Ser Arg Trp Glu Leu Val Gln Ala 65 70 75 80 Ala Leu Pro His Val Leu His Cys Thr Ala Thr Leu Leu Ser Asn Arg 85 90 95 Asn Lys Leu Gly His Gln Asp Lys Leu Gly Val Ala Glu Thr Lys Leu 100 105 110 Leu His Thr Leu His Trp Met Leu Leu Glu Ala Pro Gln Asp Cys Asn 115 120 125 Asn Asp Gln Phe Gly Gly Thr Asp Arg Gly Ser Ser Trp Gly Gly Ser 130 135 140 Ser Ser Ala Phe Ile His Gln Ile Glu Asn Gln Gly Ser Pro Gly Gln 145 150 155 160 Pro Cys Arg Ser Ser Ser His Asp Glu Glu Glu Asn Asn Arg Arg Lys 165 170 175 Thr Phe Gln Asn Ser Met Ala Thr Val Glu Leu Phe Val Phe Leu Phe 180 185 190 Ala Pro Leu Val His Arg Ile Lys Glu Ser Asp Leu Thr Phe Arg Leu 195 200 205 Ala Ser Gly Leu Val Ile Trp Gln Pro Met Trp Glu His Arg Gln Pro 210 215 220 Glu Val Ser Gly Phe Thr Ala Leu Val Lys Pro Ile Arg Asn Ile Ile 225 230 235 240 Thr Ala Lys Arg Ser Ser Pro Ile Asn Ser Gln Ser Gln Thr Cys Glu 245 250 255 Ser Pro Asn Gln Asp Thr Arg Gln Gln Gly Glu Gly Leu Gln Val Val 260 265 270 Ser Glu Ala Leu Gln Ser Asp Ser Ile Ser Pro Lys Ala Thr Ile Ser 275 280 285 Gly Cys His Gln Gly Asn Ser Phe Asp Gly Ser Leu Ser Ser Gln Thr 290 295 300 Ser Gln Glu Arg Gly Pro Ser His Ser Arg Ala Ser Leu Val Ile Pro 305 310 315 320 Pro Cys Gln Arg Ser Arg Tyr Ala Thr Tyr Phe Asp Val Ala Val Leu 325 330 335 Arg Cys Leu Leu Gln Pro His Trp Ser Glu Glu Gly Thr Gln Trp Ser 340 345 350 Leu Met Tyr 
Tyr Leu Gln Arg Leu Arg His Met Leu Glu Glu Lys Pro 355 360 365 Glu Lys Thr Pro Asp Pro Asp Ile Pro Leu Leu Pro Arg Pro Arg Ser 370 375 380 Ser Ser Met Val Ala Ala Ala Pro Ser Leu Val Asn Thr His Lys Thr 385 390 395 400 Gln Asp Leu Thr Met Lys Cys Asn Glu Glu Glu Lys Ser Leu Ser Pro 405 410 415 Glu Ala Phe Ser Lys Val Ser Leu Thr Asn Leu Arg Arg Ser Ala Val 420 425 430 Pro Asp Leu Ser Ser Asp Leu Gly Met Asn Ile Phe Lys Lys Phe Lys 435 440 445 Ser Arg Lys Glu Asp Arg Glu Arg Lys Gly Ser Ile Pro Phe His His 450 455 460 Thr Gly Lys Arg Arg Pro Arg Arg Met Gly Val Pro Phe Leu Leu His 465 470 475 480 Glu Asp His Leu Asp Val Ser Pro Thr Arg Ser Thr Phe Ser Phe Gly 485 490 495 Ser Phe Ser Gly Leu Gly Glu Asp Arg Arg Gly Ile Glu Lys Gly Gly 500 505 510 Trp Gln Thr Thr Ile Leu Gly Lys Leu Thr Arg Arg Gly Ser Ser Asp 515 520 525 Ala Ala Thr Glu Met Glu Ser Leu Ser Ala Arg His Ser His Ser His 530 535 540 His Thr Leu Val Ser Asp Leu Pro Asp His Ser Asn Ser His Gly Glu 545 550 555 560 Asn Thr Val Lys Glu Val Arg Ser Gln Ile Ser Thr Ile Thr Val Ala 565 570 575 Thr Phe Asn Thr Thr Leu Ala Ser Phe Asn Val Gly Tyr Ala Asp Phe 580 585 590 Phe Ser Glu His Met Arg Lys Leu Cys Ser Gln Val Pro Ile Pro Glu 595 600 605 Met Pro His Glu Pro Leu Ala Cys Ala Asn Leu Pro Arg Ser Leu Thr 610 615 620 Asp Ser Cys Ile Asn Tyr Ser Tyr Leu Glu Asp Thr Glu His Ile Asp 625 630 635 640 Gly Thr Asn Asn Phe Val His Lys Asn Gly Met Leu Asp Leu Ser Val 645 650 655 Val Leu Lys Ala Val Tyr Leu Val Leu Asn His Asp Ile Ser Ser Arg 660 665 670 Ile Cys Asp Val Ala Leu Asn Ile Val Glu Cys Leu Leu Gln Leu Gly 675 680 685 Val Val Pro Cys Val Glu Lys Asn Arg Lys Lys Ser 690 695 700 55 618 PRT Gallus gallus 55 Thr Arg Pro Pro Thr Arg Pro Glu Arg Val Leu Val Glu Asn Lys Leu 1 5 10 15 His Gly Leu Ser Pro Ala Leu Ser Glu Ala Ile Gln Ser Ile Ser Arg 20 25 30 Trp Glu Leu Val Gln Ala Ala Leu Pro His Val Leu His Cys Thr Ala 35 40 45 Thr Leu Leu Ser Asn Arg Asn Lys Leu Gly His Gln Asp Lys Leu Gly 50 55 60 Val Ala Glu Thr 
Lys Leu Leu His Thr Leu His Trp Met Leu Leu Glu 65 70 75 80 Ala Pro Gln Asp Cys Ser Asn Asp Arg Phe Gly Gly Asp Arg Gly Ser 85 90 95 Ser Trp Gly Gly Ser Ser Ser Ala Phe Ile His Gln Ala Glu Asn Gln 100 105 110 Gly Ser Pro Gly His Pro Arg Pro Ser Thr Thr Asn Asp Glu Asp Glu 115 120 125 Asn Asn Arg Arg Lys Phe Phe Gln Asn Ser Met Ala Thr Val Glu Leu 130 135 140 Phe Val Phe Leu Phe Ala Pro Leu Val His Arg Ile Lys Glu Ser Asp 145 150 155 160 Leu Thr Phe Arg Leu Ala Ser Gly Leu Val Ile Trp Gln Pro Met Trp 165 170 175 Glu His Arg Gln Pro Glu Val Ser Ala Phe Asn Ala Leu Val Lys Pro 180 185 190 Ile Arg Asn Ile Val Thr Ala Lys Arg Ser Ser Pro Thr Asn Asn Gln 195 200 205 Ser Val Thr Cys Glu Ser Leu Asn Leu Asp Ser Gly His Thr Glu Gly 210 215 220 Leu Gln Val Val Cys Glu Thr Thr Leu Pro Asp Ser Val Pro Ser Lys 225 230 235 240 Pro Thr Val Ser Ala Cys His Arg Gly Asn Ser Leu Glu Gly Ser Val 245 250 255 Ser Ser Gln Thr Ser Gln Glu Arg Gly Thr Pro His Pro Arg Val Ser 260 265 270 Met Val Ile Pro Pro Cys Gln Lys Ser Arg Tyr Ala Thr Tyr Phe Asp 275 280 285 Val Ala Val Leu Arg Cys Leu Leu Gln Pro His Trp Ser Glu Glu Gly 290 295 300 Thr Gln Trp Ser Leu Met Tyr Tyr Leu Gln Arg Leu Arg His Met Leu 305 310 315 320 Gln Glu Lys Pro Glu Lys Pro Pro Glu Pro Glu Ile Thr Pro Leu Pro 325 330 335 Arg Leu Arg Ser Ser Ser Met Val Ala Ala Ala Pro Ser Leu Val Asn 340 345 350 Thr His Lys Thr Gln Asp Leu Thr Met Lys Cys Asn Glu Glu Glu Lys 355 360 365 Ser Leu Ser Thr Glu Ala Phe Ser Lys Val Ser Leu Thr Asn Leu Arg 370 375 380 Arg Pro Ala Val Pro Asp Leu Ser Thr Asp Leu Gly Met Asn Ile Phe 385 390 395 400 Lys Lys Phe Lys Ser Arg Lys Glu Asp Arg Glu Arg Glu Arg Lys Gly 405 410 415 Ser Ile Pro Phe His His Thr Gly Lys Arg Arg Gln Arg Arg Met Gly 420 425 430 Met Pro Phe Leu Leu His Glu Asp His Leu Asp Val Ser Pro Thr Arg 435 440 445 Ser Thr Phe Ser Phe Gly Ser Phe Ser Gly Leu Gly Glu Asp Arg Arg 450 455 460 Gly Ile Glu Arg Gly Gly Trp Gln Thr Thr Ile Leu Gly Lys Phe Thr 465 470 475 480 Arg Arg Gly Ser Ser 
Asp Thr Ala Thr Glu Met Glu Ser Leu Ser Ala 485 490 495 Arg His Ser His Ser His His Thr Leu Val Ser Asp Met Pro Asp His 500 505 510 Ser Asn Ser His Gly Glu Asn Thr Val Lys Glu Val Arg Ser Gln Ile 515 520 525 Ser Thr Ile Thr Val Ala Thr Phe Asn Thr Thr Leu Ala Ser Phe Asn 530 535 540 Val Gly Tyr Ala Asp Phe Phe Ser Glu His Met Arg Lys Leu Cys Asn 545 550 555 560 Gln Val Pro Ile Pro Glu Met Pro His Glu Pro Leu Ala Cys Ala Asn 565 570 575 Leu Pro Arg Ser Leu Thr Asp Ser Cys Ile Asn Tyr Ser Cys Leu Glu 580 585 590 Asp Thr Asp His Ile Asp Gly Thr Asn Asn Phe Val His Lys Asn Gly 595 600 605 Met Leu Asp Leu Ser Val Asn Gly Lys Glu 610 615 56 650 PRT Danio rerio 56 Met Val Lys Arg Lys Ser Leu Asp Asp Ser Asp Gln Glu Asn Cys Arg 1 5 10 15 Gly Ile Pro Phe Pro Ile Gln Thr Phe Leu Trp Arg Gln Thr Ser Ala 20 25 30 Phe Leu Arg Pro Lys Leu Gly Lys Gln Tyr Glu Ala Ser Cys Val Ser 35 40 45 Phe Glu Arg Val Leu Val Glu Asn Lys Leu His Gly Leu Ser Pro Ala 50 55 60 Leu Thr Glu Ala Ile Gln Ser Ile Ser Arg Trp Glu Leu Val Gln Ala 65 70 75 80 Ala Leu Pro His Val Leu His Cys Thr Ser Ile Leu Leu Ser Asn Arg 85 90 95 Asn Lys Leu Gly His Gln Asp Lys Leu Gly Val Ala Glu Thr Lys Leu 100 105 110 Leu His Thr Leu His Trp Met Leu Leu Glu Ala Ala Gln Glu Cys His 115 120 125 Gln Glu Pro Gly Leu Ile His Gly Trp Ser Gly Gly Ser Ser Gly Ser 130 135 140 Gly Ser Ala Tyr Leu Gln Pro Met Gly Asn Gln Gly Leu Thr Asp His 145 150 155 160 Asn Gly Ser Thr Pro Glu Glu Thr Glu Tyr Ala Arg Ala Lys Leu Tyr 165 170 175 His Lys Asn Met Ala Thr Val Glu Leu Phe Val Phe Leu Phe Ala Pro 180 185 190 Leu Ile Asn Arg Ile Lys Glu Ser Asp Leu Thr Phe Arg Leu Ala Gly 195 200 205 Gly Leu Val Ile Trp Gln Pro Met Trp Glu His Arg Gln Pro Asp Val 210 215 220 Pro Ala Phe Ser Ala Leu Ile Lys Pro Leu Arg Asn Ile Ile Thr Ala 225 230 235 240 Lys Arg Asn Ser Gln Met Asn Asn Gln Cys Ser Pro His Asp Ser Ser 245 250 255 Asn Pro Cys Pro Ala Val Val Cys Glu Ser Ala Leu Ser Asp Ser Ser 260 265 270 Ser Ser Pro Ser Met Thr Gly Gln Ser Cys Arg Arg 
Gly Asn Ser Leu 275 280 285 Glu Asn Gln Arg Ala Arg Tyr Ala Thr Tyr Phe Asp Val Ala Val Leu 290 295 300 Arg Cys Leu Met Gln Pro His Trp Thr Glu Glu Gly Val His Trp Ala 305 310 315 320 Leu Ile Tyr Tyr Leu Gln Arg Leu Arg Gln Ile Leu Gln Ile Thr Pro 325 330 335 Leu Pro Arg Pro Arg Ser Ser Ser Met Val Ala Ala Thr Pro Ser Leu 340 345 350 Val Asn Thr His Lys Thr Gln Pro His Asn Pro Phe Thr Arg Pro Arg 355 360 365 Ser Ser Ser Met Val Ala Ala Thr Pro Ser Leu Val Asn Thr His Lys 370 375 380 Thr Gln Asp Met Thr Leu Lys Cys Asn Glu Glu Ser Arg Ser Leu Ser 385 390 395 400 Ser Glu Thr Phe Ser Lys Val Ser Val Thr Asn Leu Arg Arg Gln Ala 405 410 415 Val Pro Asp Leu Ser Ser Glu Met Gly Met Asn Ile Phe Lys Lys Phe 420 425 430 Lys Asn Arg Arg Glu Asp Arg Glu Arg Lys Gly Ser Ile Pro Phe His 435 440 445 His Thr Gly Lys Lys Arg Gln Arg Arg Met Gly Val Pro Phe Leu Met 450 455 460 His Glu Asp His Leu Asp Val Ser Pro Thr Arg Ser Thr Phe Ser Phe 465 470 475 480 Gly Ser Phe Ser Gly Leu Gly Asp Asp Arg Arg Thr Leu Asp Arg Gly 485 490 495 Gly Trp Pro Ser Thr Ile Met Gly Lys Leu Thr Arg Arg Gly Ser Ser 500 505 510 Asp Thr Thr Gly Asp Val Asp Ser Leu Gly Ala Lys His Phe His Ser 515 520 525 His His Asn Leu Pro Glu His Ser Asn Ser His Ser Glu Asn Thr Ile 530 535 540 Lys Glu Gly Val Arg Ser Gln Ile Ser Thr Ile Thr Met Ala Thr Phe 545 550 555 560 Asn Thr Thr Val Ala Ser Phe Asn Val Gly Tyr Thr Asp Phe Phe Thr 565 570 575 Glu His Ile Lys Lys Leu Cys Asn Pro Ile Pro Ile Pro Glu Met Pro 580 585 590 Cys Glu Pro Leu Ala Cys Ser Asn Leu Pro Arg Ser Leu Thr Asp Ser 595 600 605 Cys Ile Asn Tyr Thr Ser Leu Glu Asp Arg Asp Thr Ile Glu Gly Thr 610 615 620 Asn Asn Phe Ile Leu Lys Asn Gly Met Leu Asp Leu Met Val Arg Gly 625 630 635 640 Lys Asn
Tyr Asn Arg Glu Thr Ile Lys Glu 645 650

* * * * *
