Compositions, Methods and Kits for Diagnosis of Lung Cancer Kearney; Paul Edward ; et al. [Integrated Diagnostics, Inc.]

Compositions, Methods and Kits for Diagnosis of Lung Cancer

Kearney; Paul Edward ; et al.

Patent Application Summary

U.S. patent application number 15/041775 was filed with the patent office on 2016-02-11 and published on 2016-06-09 for compositions, methods and kits for diagnosis of lung cancer. The applicant listed for this patent is Integrated Diagnostics, Inc. Invention is credited to Kenneth Charles Fang, Clive Hayward, Paul Edward Kearney, Xiao-Jun Li.

Application Number	20160161493 15/041775
Family ID	51352811
Filed Date	2016-02-11

United States Patent Application	20160161493
Kind Code	A1
Kearney; Paul Edward ; et al.	June 9, 2016

Compositions, Methods and Kits for Diagnosis of Lung Cancer

Abstract

The present invention provides methods for identifying biomarker proteins that exhibit differential expression in subjects with a first lung condition versus healthy subjects or subjects with a second lung condition. The present invention also provides compositions comprising these biomarker proteins and methods of using these biomarker proteins or panels thereof to diagnose, classify, and monitor various lung conditions. The methods and compositions provided herein may be used to diagnose or classify a subject as having lung cancer or a non-cancerous condition, and to distinguish between different types of cancer (e.g., malignant versus benign, SCLC versus NSCLC).

Inventors:

Kearney; Paul Edward; (Seattle, WA) ; Fang; Kenneth Charles; (San Francisco, CA) ; Li; Xiao-Jun; (Bellevue, WA) ; Hayward; Clive; (Seattle, WA)

Applicant:

Name	City	State	Country	Type
Integrated Diagnostics, Inc.	Seattle	WA	US

Family ID:

51352811

Appl. No.:

15/041775

Filed:

February 11, 2016

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
14341245	Jul 25, 2014	9297805
15041775
61858760	Jul 26, 2013

Current U.S. Class:	506/12 ; 702/19
Current CPC Class:	G01N 2333/988 20130101; G16B 20/00 20190201; G01N 2333/96433 20130101; G01N 2333/78 20130101; G01N 33/57488 20130101; G01N 2333/46 20130101; G01N 33/57423 20130101; G01N 2333/785 20130101
International Class:	G01N 33/574 20060101 G01N033/574

Claims

1. A method of determining that a lung condition in a subject is cancer comprising: (a) assessing the expression of a plurality of proteins comprising determining the protein expression level of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN from a biological sample obtained from the subject; (b) calculating a score from the protein expression of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN from the biological sample determined in step (a); and (c) comparing the score from the biological sample to a plurality of scores obtained from a reference population, wherein the comparison provides a determination that the lung condition is not cancer.

2. The method of claim 1, wherein the subject has a pulmonary nodule.

3. The method of claim 2, wherein the pulmonary nodule is 30 mm or less.

4. The method of claim 3, wherein the pulmonary nodule is between 8-30 mm.

5. The method of claim 1, wherein said lung condition is cancer or a non-cancerous lung condition.

6. The method of claim 1, wherein said cancer is non-small cell lung cancer.

7. The method of claim 1, wherein said non-cancerous lung condition is chronic obstructive pulmonary disease, hamartoma, fibroma, neurofibroma, granuloma, sarcoidosis, bacterial infection or fungal infection.

8. The method of claim 1, wherein the subject is a human.

9. The method of claim 1, wherein said biological sample is tissue, blood, plasma, serum, whole blood, urine, saliva, genital secretions, cerebrospinal fluid, sweat, excreta, or bronchoalveolar lavage.

10. The method of claim 1, wherein assessing the expression of a plurality of proteins further comprises determining the protein expression level of at least one of PEDF_HUMAN, MASP1_HUMAN, GELS_HUMAN, LUM_HUMAN, C163A_HUMAN and PTPRJ_HUMAN.

11. The method of claim 1, wherein determining the protein expression level of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN comprises fragmenting each protein to generate at least one peptide.

12. The method of claim 11, wherein the proteins are fragmented by trypsin digestion.

13. The method of claim 12, further comprising providing synthetic, modified, heavy peptides corresponding to each peptide generated from each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN.

14. The method of claim 13, wherein at least one of the synthetic peptides has an isotopic label attached.

15. The method of claim 1, wherein assessing the expression of a plurality of proteins is performed by mass spectrometry (MS), liquid chromatography-selected reaction monitoring/mass spectrometry (LC-SRM-MS), reverse transcriptase-polymerase chain reaction (RT-PCR), microarray, serial analysis of gene expression (SAGE), gene expression analysis by massively parallel signature sequencing (MPSS), immunoassays, immunohistochemistry (IHC), transcriptomics, or proteomics.

16. The method of claim 15, wherein assessing the expression of a plurality of proteins is performed by liquid chromatography-selected reaction monitoring/mass spectrometry (LC-SRM-MS).

17. The method of claim 11, wherein a transition for each peptide is determined by liquid chromatography-selected reaction monitoring/mass spectrometry (LC-SRM-MS).

18. The method of claim 17, wherein the peptide transitions comprise at least ALQASALK (401.25, 617.4), AVGLAGTFR (446.26, 721.4), GFLLLASLR (495.31, 559.4), LGGPEAGLGEYLFER (804.4, 1083.6), and VEIFYR (413.73, 598.3).

19. The method of claim 1, wherein said score is determined as $P_s = 1/[1 + \exp(-\alpha - \sum_{i=1}^{5} \beta_i \cdot \check{I}_{i,s} - \gamma \cdot \check{I}_{\mathrm{COIA1}} \cdot \check{I}_{\mathrm{FRIL}})]$, where $\check{I}_{i,s}$ is Box-Cox transformed and normalized intensity of transition i in said sample (s), $\beta_i$ is the corresponding logistic regression coefficient, $\alpha$ is a panel-specific constant, and $\gamma$ is a coefficient for the interaction term.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation application of U.S. application Ser. No. 14/341,245 filed Jul. 25, 2014, which claims priority to, and the benefit of, U.S. Ser. No. 61/858,760, filed Jul. 26, 2013, the entire contents of each of which are incorporated herein by reference in their entireties.

INCORPORATION-BY-REFERENCE OF SEQUENCE LISTING

[0002] The contents of the text file named "IDIA-009 Sequence listing_ST25.txt", which was created on Sep. 29, 2014, and is 108 KB in size, are hereby incorporated by reference in their entireties.

BACKGROUND OF THE INVENTION

[0003] Lung conditions and particularly lung cancer present significant diagnostic challenges. In many asymptomatic patients, radiological screens such as computed tomography (CT) scanning are a first step in the diagnostic paradigm. Pulmonary nodules (PNs) or indeterminate nodules are located in the lung and are often discovered either during screening of high-risk patients or incidentally. The number of PNs identified is expected to rise due to increased numbers of patients with access to health care, the rapid adoption of screening techniques and an aging population. It is estimated that over 3 million PNs are identified annually in the US. Although the majority of PNs are benign, some are malignant, leading to additional interventions. For patients considered low risk for malignant nodules, current medical practice dictates scans every three to six months for at least two years to monitor for lung cancer. The time period between identification of a PN and diagnosis is a time of medical surveillance or "watchful waiting" and may induce stress on the patient and lead to significant risk and expense due to repeated imaging studies. If a biopsy is performed on a patient who is found to have a benign nodule, the costs and potential for harm to the patient increase unnecessarily. Major surgery is indicated in order to excise a specimen for tissue biopsy and diagnosis. All of these procedures are associated with risk to the patient, including illness, injury and death, as well as high economic costs.

[0004] Frequently, PNs cannot be biopsied to determine if they are benign or malignant due to their size and/or location in the lung. However, PNs are connected to the circulatory system, and so if malignant, protein markers of cancer can enter the blood and provide a signal for determining if a PN is malignant or not.

[0005] Diagnostic methods that can replace or complement current diagnostic methods for patients presenting with PNs are needed to improve diagnostics, reduce costs and minimize invasive procedures and complications to patients.

SUMMARY OF THE INVENTION

[0006] The present invention provides novel compositions, methods and kits for identifying protein markers to identify, diagnose, classify and monitor lung conditions, particularly lung cancer. The present invention uses a multiplexed assay to distinguish benign pulmonary nodules from malignant pulmonary nodules to classify patients with or without lung cancer. The present invention may be used in patients who present with symptoms of lung cancer, but do not have pulmonary nodules.

[0007] The present invention provides a method of determining the likelihood that a lung condition in a subject is cancer by measuring the abundance of proteins in a sample obtained from the subject; calculating a probability of cancer score based on the protein abundance and a protein-protein (mathematical) interaction between FRIL_HUMAN and COIA1_HUMAN; and ruling out cancer for the subject if the score is lower than a pre-determined score. When cancer is ruled out, the subject does not receive a treatment protocol. Treatment protocols include, for example, a pulmonary function test (PFT), pulmonary imaging, a biopsy, a surgery, a chemotherapy, a radiotherapy, or any combination thereof. In some embodiments, the imaging is an x-ray, a chest computed tomography (CT) scan, or a positron emission tomography (PET) scan.

[0008] The present invention further provides a method of determining the likelihood of the presence of a lung condition in a subject by measuring the abundance of proteins in a sample obtained from the subject; calculating a probability of cancer score based on the protein abundance and a protein-protein (mathematical) interaction between FRIL_HUMAN and COIA1_HUMAN; and concluding the presence of said lung condition if the score is equal to or greater than a pre-determined score. The pre-determined score can be determined by scoring a plurality of subjects as part of a reference population. The lung condition is lung cancer such as, for example, non-small cell lung cancer (NSCLC). The subject is at risk of developing lung cancer. The likelihood of cancer can be determined by the sensitivity, specificity, negative predictive value or positive predictive value associated with the score.
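The four performance measures named above can be illustrated with a short sketch. It assumes a reference population in which each subject has a score and a known diagnosis, and it calls a subject "cancer" when the score is equal to or greater than the cutoff. The function name, the example scores and the cutoff are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, Sequence


def rule_out_metrics(scores: Sequence[float],
                     is_cancer: Sequence[bool],
                     cutoff: float) -> Dict[str, float]:
    """Sensitivity, specificity, NPV and PPV when score >= cutoff is called cancer."""
    tp = fp = tn = fn = 0
    for score, cancer in zip(scores, is_cancer):
        called_cancer = score >= cutoff
        if called_cancer and cancer:
            tp += 1          # cancer correctly called cancer
        elif called_cancer and not cancer:
            fp += 1          # benign nodule called cancer
        elif not called_cancer and not cancer:
            tn += 1          # benign nodule correctly ruled out
        else:
            fn += 1          # cancer wrongly ruled out

    def ratio(num: int, den: int) -> float:
        # Undefined ratios are reported as NaN rather than raising.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "npv": ratio(tn, tn + fn),
        "ppv": ratio(tp, tp + fp),
    }


# Hypothetical reference population (values are illustrative only).
example_scores = [0.05, 0.10, 0.15, 0.30, 0.60, 0.80, 0.20]
example_labels = [False, False, False, False, True, True, True]
metrics = rule_out_metrics(example_scores, example_labels, cutoff=0.25)
```

Note that NPV and PPV computed this way reflect the cancer prevalence of the reference population used, so they would differ in a population with a different prevalence.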

[0009] The present invention also provides methods of determining that a lung condition in a subject is cancer comprising assessing the expression of a plurality of proteins comprising determining the protein expression level of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN from a biological sample obtained from the subject; calculating a score from the protein expression of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN from the biological sample determined in the preceding step; and comparing the score from the biological sample to a plurality of scores obtained from a reference population, wherein the comparison provides a determination that the lung condition is not cancer.

[0010] The determination that a lung condition is not cancer can include assessing the expression of a plurality of proteins to determine the protein expression level of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN obtained from a biological sample from a subject. A score is calculated from these assessments and this score is further compared with a plurality of scores obtained from a reference population, wherein the comparison provides a determination that the lung condition is not cancer. The method can also include determining an interaction between FRIL_HUMAN and COIA1_HUMAN.

[0011] Comparing the score from the subject with the plurality of scores obtained from the reference population can provide a cancer probability. Preferably, when the comparison provides a cancer probability and the probability is 15% or less, the lung condition is classified as not cancer. More preferably, when the comparison provides a cancer probability and the probability is 10% or less, the lung condition is classified as not cancer. Most preferably, when the comparison provides a cancer probability and the probability is 5% or less, the lung condition is classified as not cancer.
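The three rule-out thresholds in this paragraph can be written as a simple decision rule. This is a minimal sketch; the function name and the constant names are illustrative, and the text does not say what happens to subjects whose probability is above the threshold, so the function only reports whether cancer is ruled out.

```python
# Thresholds from the paragraph above, expressed as probabilities.
PREFERRED_THRESHOLD = 0.15
MORE_PREFERRED_THRESHOLD = 0.10
MOST_PREFERRED_THRESHOLD = 0.05


def is_ruled_out(cancer_probability: float,
                 threshold: float = PREFERRED_THRESHOLD) -> bool:
    """Return True when the lung condition is classified as not cancer."""
    if not 0.0 <= cancer_probability <= 1.0:
        raise ValueError("cancer_probability must lie between 0 and 1")
    # "15% or less" is inclusive, hence <= rather than <.
    return cancer_probability <= threshold
```

For example, a subject with a cancer probability of 12% would be classified as not cancer under the 15% threshold but not under the 10% or 5% thresholds.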

[0012] The subject can be one that has or is suspected of having a pulmonary nodule. The pulmonary nodule can have a diameter of 30 mm or less. Preferably, the pulmonary nodule has a diameter of about 8 mm to 30 mm.

[0013] The subject can be suspected of having a cancerous or non-cancerous lung condition. A cancerous lung condition can include non-small cell lung cancer. A non-cancerous lung condition can include chronic obstructive pulmonary disease, hamartoma, fibroma, neurofibroma, granuloma, sarcoidosis, bacterial infection or fungal infection.

[0014] The subject can be a mammal. Preferably, the subject is a human.

[0015] The biological sample can be any sample obtained from the subject, e.g., tissue, cell, fluid. Preferably, the biological sample is tissue, blood, plasma, serum, whole blood, urine, saliva, genital secretions, cerebrospinal fluid, sweat, excreta, or bronchoalveolar lavage.

[0016] The methods of the present invention can also include assessing the expression of a plurality of proteins which comprises determining the protein expression level of at least one of PEDF_HUMAN, MASP1_HUMAN, GELS_HUMAN, LUM_HUMAN, C163A_HUMAN and PTPRJ_HUMAN.

[0017] Determining the protein level of at least one of, or each of, the proteins of the present invention can include fragmenting the protein to generate at least one peptide per protein. Preferably, the fragmentation of the protein is accomplished by trypsin digestion.

[0018] The methods of the present invention can further include normalizing the protein measurements. For example, the protein measurements can be normalized by one or more "housekeeping" proteins, e.g., proteins which do not have variable expression across different samples or subjects. Preferable normalizing proteins can include at least one of PEDF_HUMAN, MASP1_HUMAN, GELS_HUMAN, LUM_HUMAN, C163A_HUMAN and PTPRJ_HUMAN.

[0019] The invention further provides methods of using synthetic, modified, heavy peptides corresponding to at least one of, or each of, ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, COIA1_HUMAN, PEDF_HUMAN, MASP1_HUMAN, GELS_HUMAN, LUM_HUMAN, C163A_HUMAN or PTPRJ_HUMAN. At least one of, or each of, the synthetic peptides can have an isotopic label attached.

[0020] Methods to assess the expression of a plurality of proteins can include mass spectrometry (MS), liquid chromatography-selected reaction monitoring/mass spectrometry (LC-SRM-MS), reverse transcriptase-polymerase chain reaction (RT-PCR), microarray, serial analysis of gene expression (SAGE), gene expression analysis by massively parallel signature sequencing (MPSS), immunoassays, immunohistochemistry (IHC), transcriptomics, or proteomics. Preferably, the expression of a plurality of proteins is assessed by LC-SRM-MS. LC-SRM-MS can be used to determine transitions for each peptide analyzed. Preferably, peptide transitions can be determined for at least one of, or each of, ALQASALK (SEQ ID NO: 25), AVGLAGTFR (SEQ ID NO: 26), GFLLLASLR (SEQ ID NO: 27), LGGPEAGLGEYLFER (SEQ ID NO: 28) or VEIFYR (SEQ ID NO: 29). More preferably the peptide transitions include at least ALQASALK (SEQ ID NO: 25) (401.25, 617.4), AVGLAGTFR (SEQ ID NO: 26) (446.26, 721.4), GFLLLASLR (SEQ ID NO: 27) (495.31, 559.4), LGGPEAGLGEYLFER (SEQ ID NO: 28) (804.4, 1083.6), and VEIFYR (SEQ ID NO: 29) (413.73, 598.3).
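The number pairs listed with each peptide are not labeled in this paragraph. One reading, stated here as an assumption, is that the first value of each pair is the m/z of the doubly protonated precursor ion and the second is the m/z of a fragment (product) ion. The sketch below checks only the first part of that reading by computing the doubly charged precursor m/z from standard monoisotopic residue masses. The function and table names are illustrative.

```python
# Standard monoisotopic residue masses (Da) for the amino acids that occur
# in the five peptides above.
MONOISOTOPIC_RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "I": 113.08406,
    "Q": 128.05858, "K": 128.09496, "E": 129.04259, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333,
}
WATER_MASS = 18.01056    # added once per peptide for the free termini
PROTON_MASS = 1.007276   # added once per positive charge


def precursor_mz(sequence: str, charge: int = 2) -> float:
    """m/z of the unmodified peptide carrying `charge` protons."""
    neutral_mass = sum(MONOISOTOPIC_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS
    return (neutral_mass + charge * PROTON_MASS) / charge


# First value of each pair as listed in the text (assumed to be the precursor m/z).
LISTED_PRECURSOR_MZ = {
    "ALQASALK": 401.25,
    "AVGLAGTFR": 446.26,
    "GFLLLASLR": 495.31,
    "LGGPEAGLGEYLFER": 804.4,
    "VEIFYR": 413.73,
}

for peptide, listed in LISTED_PRECURSOR_MZ.items():
    print(f"{peptide}: computed {precursor_mz(peptide):.3f}, listed {listed}")
```

Under this assumption the computed values agree with the listed first values to within 0.01 m/z for all five peptides.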

[0021] The measuring step may also be performed using a compound that specifically binds the protein being detected or a peptide transition. For example, a compound that specifically binds to the protein being measured can be an antibody or an aptamer.

[0022] The score can be calculated from a logistic regression model applied to the protein measurements. For example, the score is determined as $P_s = 1/[1 + \exp(-\alpha - \sum_{i=1}^{5} \beta_i \cdot \check{I}_{i,s} - \gamma \cdot \check{I}_{\mathrm{COIA1}} \cdot \check{I}_{\mathrm{FRIL}})]$, where $\check{I}_{i,s}$ is Box-Cox transformed and normalized intensity of transition i in said sample (s), $\beta_i$ is the corresponding logistic regression coefficient, $\alpha$ is a panel-specific constant, and $\gamma$ is a coefficient for the interaction term.
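The logistic score with its interaction term can be sketched in a few lines. The function name and data layout are illustrative, and all coefficient and intensity values in the example are hypothetical placeholders, because this passage does not give the fitted values of the constant, the five coefficients or the interaction coefficient.

```python
import math
from typing import Mapping

PANEL = ("ALDOA_HUMAN", "FRIL_HUMAN", "LG3BP_HUMAN", "TSP1_HUMAN", "COIA1_HUMAN")


def cancer_probability(intensities: Mapping[str, float],
                       betas: Mapping[str, float],
                       alpha: float,
                       gamma: float) -> float:
    """P_s = 1 / (1 + exp(-(alpha + sum_i beta_i * I_i + gamma * I_COIA1 * I_FRIL))).

    `intensities` holds the transformed, normalized transition intensity
    for each of the five panel proteins in one sample.
    """
    linear = alpha
    for protein in PANEL:
        linear += betas[protein] * intensities[protein]
    # Mathematical interaction between the COIA1 and FRIL measurements.
    linear += gamma * intensities["COIA1_HUMAN"] * intensities["FRIL_HUMAN"]
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical values, for illustration only.
example_intensities = {protein: 1.0 for protein in PANEL}
example_betas = {protein: 0.5 for protein in PANEL}
example_score = cancer_probability(example_intensities, example_betas,
                                   alpha=-1.0, gamma=0.2)
```

With these placeholder values the linear term is -1.0 + 5 x 0.5 + 0.2 = 1.7, so the score is 1/(1 + exp(-1.7)).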

[0023] The reference population can include at least 100 subjects with a lung condition and wherein each subject in the reference population has been assigned a score based on the protein expression of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN obtained from a biological sample from the subject. The invention further provides methods for the treatment of a subject, wherein if the lung condition is not cancer the subject is treated based on clinical practice guidelines. Preferably, if a lung condition is not cancer the subject receives image monitoring for at least a 1 year period, for at least a 2 year period or at least a 3 year period. More preferably, if the lung condition is not cancer, the subject receives chest computed tomography scans for at least a 1 year period, for at least a 2 year period or at least a 3 year period.

[0024] The present invention also provides that at least one step of any disclosed method can be performed on a computer or computer system.

[0025] The patent and scientific literature referred to herein establishes the knowledge that is available to those with skill in the art. All United States patents and published or unpublished United States patent applications cited herein are incorporated by reference. All published foreign patents and patent applications cited herein are hereby incorporated by reference. GenBank and NCBI submissions indicated by accession number cited herein are hereby incorporated by reference. All other published references, documents, manuscripts and scientific literature cited herein are hereby incorporated by reference.

[0026] While this disclosure has been particularly shown and described with references to preferred embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the disclosure encompassed by the appended claims.

BRIEF DESCRIPTION OF THE DRAWINGS

[0027] FIG. 1 is a graph showing sample batches used in the experiments from three sites: UPenn, Laval and NYU.

[0028] FIG. 2 is a panel of graphs showing A) NPV and specificity of panel ID_465 and B) area under the curve for a receiver operating characteristic (ROC) curve for panel ID_465.

[0029] FIG. 3 is a panel of graphs showing A) NPV and specificity of panel ID_341 and B) area under the curve for a receiver operating characteristic (ROC) curve for panel ID_341.

[0030] FIG. 4 is a graph showing NPV and specificity of panel ID_465 plus COIA1×FRIL interaction (C4 Classifier).

[0031] FIG. 5 is a graph showing NPV and specificity of panel ID_341.

DETAILED DESCRIPTION OF THE INVENTION

[0032] The disclosed invention derives from the surprising discovery that, in patients presenting with pulmonary nodule(s), a small panel of protein markers in the blood is able to specifically identify and distinguish malignant and benign lung nodules with high negative predictive value (NPV). More importantly, at least two protein markers among the panel mathematically interact in the model for determining the probability score. Such protein-protein interaction surprisingly increases the specificity of the methods described herein. The classifier (C4 Classifier) described herein also demonstrates remarkable independence and accuracy. None of the clinical factors impact the classifier's score.

[0033] Accordingly the invention provides unique advantages to the patient associated with early detection of lung cancer in a patient, including increased life span, decreased morbidity and mortality, decreased exposure to radiation during screening and repeat screenings and a minimally invasive diagnostic model. Importantly, the methods of the invention allow for a patient to avoid invasive procedures.

[0034] The routine clinical use of chest computed tomography (CT) scans identifies millions of pulmonary nodules annually, of which only a small minority are malignant but contribute to the dismal 15% five-year survival rate for patients diagnosed with non-small cell lung cancer (NSCLC). The early diagnosis of lung cancer in patients with pulmonary nodules is a top priority, as decision-making based on clinical presentation, in conjunction with current non-invasive diagnostic options such as chest CT and positron emission tomography (PET) scans, and other invasive alternatives, has not altered the clinical outcomes of patients with Stage I NSCLC. The subgroup of pulmonary nodules between 8 mm and 20 mm in size is increasingly recognized as being "intermediate" relative to the lower rate of malignancies below 8 mm and the higher rate of malignancies above 20 mm. Invasive sampling of the lung nodule by biopsy using transthoracic needle aspiration or bronchoscopy may provide a cytopathologic diagnosis of NSCLC, but is also associated with both false-negative and non-diagnostic results. In summary, a key unmet clinical need for the management of pulmonary nodules is a non-invasive diagnostic test that discriminates between malignant and benign processes in patients with indeterminate pulmonary nodules (IPNs), especially between 8 mm and 20 mm in size.

[0035] The clinical decision to be more or less aggressive in treatment is based on risk factors, primarily nodule size, smoking history and age in addition to imaging. As these are not conclusive, there is a great need for a molecular-based blood test that would be both non-invasive and provide complementary information to risk factors and imaging.

[0036] Accordingly, these and related embodiments will find uses in screening methods for lung conditions, and particularly lung cancer diagnostics. More importantly, the invention finds use in determining the clinical management of a patient. That is, the method of the invention is useful in ruling in or ruling out a particular treatment protocol for an individual subject.

[0037] Cancer biology requires a molecular strategy to address the unmet medical need for an assessment of lung cancer risk. The field of diagnostic medicine has evolved with technology and assays that provide sensitive mechanisms for detection of changes in proteins. The methods described herein use an LC-SRM-MS technology for measuring the concentration of blood plasma proteins that are collectively changed in patients with a malignant PN. This protein signature is indicative of lung cancer. LC-SRM-MS is one method that provides for both quantification and identification of circulating proteins in plasma. Changes in protein expression levels, such as but not limited to signaling factors, growth factors, cleaved surface proteins and secreted proteins, can be detected using such a sensitive technology to assay cancer. Presented herein is a blood-based classification test to determine the likelihood that a patient presenting with a pulmonary nodule has a nodule that is benign or malignant. The present invention presents a classification algorithm that predicts the relative likelihood of the PN being benign or malignant.

[0038] More broadly, it is demonstrated that there are many variations on this invention that are also diagnostic tests for the likelihood that a PN is benign or malignant. These are variations on the panel of proteins, protein standards, measurement methodology and/or classification algorithm.

[0039] The present invention also provides methods of determining that a lung condition in a subject is cancer comprising assessing the expression of a plurality of proteins comprising determining the protein expression level of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN from a biological sample obtained from the subject; calculating a score from the protein expression of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN from the biological sample determined in the preceding step; and comparing the score from the biological sample to a plurality of scores obtained from a reference population, wherein the comparison provides a determination that the lung condition is not cancer.

[0040] The determination that a lung condition is not cancer can include assessing the expression of a plurality of proteins to determine the protein expression level of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN obtained from a biological sample from a subject. A score is calculated from these assessments and this score is further compared with a plurality of scores obtained from a reference population, wherein the comparison provides a determination that the lung condition is not cancer. The method can also include determining an interaction between FRIL_HUMAN and COIA1_HUMAN.

[0041] Comparing the score from the subject with the plurality of scores obtained from the reference population can provide a cancer probability. Preferably, when the comparison provides a cancer probability and the probability is 15% or less, the lung condition is classified as not cancer. More preferably, when the comparison provides a cancer probability and the probability is 10% or less, the lung condition is classified as not cancer. Most preferably, when the comparison provides a cancer probability and the probability is 5% or less, the lung condition is classified as not cancer.

[0042] The subject can be one that has or is suspected of having a pulmonary nodule. The pulmonary nodule can have a diameter of 30 mm or less. Preferably, the pulmonary nodule has a diameter of about 8 mm to 30 mm.

[0043] The subject can be suspected of having a cancerous or non-cancerous lung condition. A cancerous lung condition can include non-small cell lung cancer. A non-cancerous lung condition can include chronic obstructive pulmonary disease, hamartoma, fibroma, neurofibroma, granuloma, sarcoidosis, bacterial infection or fungal infection.

[0044] The subject can be a mammal. Preferably, the subject is a human.

[0045] The biological sample can be any sample obtained from the subject, e.g., tissue, cell, fluid. Preferably, the biological sample is tissue, blood, plasma, serum, whole blood, urine, saliva, genital secretions, cerebrospinal fluid, sweat, excreta, or bronchoalveolar lavage.

[0046] The methods of the present invention can also include assessing the expression of a plurality of proteins which comprises determining the protein expression level of at least one of PEDF_HUMAN, MASP1_HUMAN, GELS_HUMAN, LUM_HUMAN, C163A_HUMAN and PTPRJ_HUMAN.

[0047] Determining the protein level of at least one of, or each of, the proteins of the present invention can include fragmenting the protein to generate at least one peptide per protein. Preferably, the fragmentation of the protein is accomplished by trypsin digestion.

[0048] The methods of the present invention can further include normalizing the protein measurements. For example, the protein measurements can be normalized by one or more "housekeeping" proteins, e.g., proteins which do not have variable expression across different samples or subjects. Preferable normalizing proteins can include at least one of PEDF_HUMAN, MASP1_HUMAN, GELS_HUMAN, LUM_HUMAN, C163A_HUMAN and PTPRJ_HUMAN.

[0049] The invention further provides methods of using synthetic, modified, heavy peptides corresponding to at least one of, or each of, ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, COIA1_HUMAN, PEDF_HUMAN, MASP1_HUMAN, GELS_HUMAN, LUM_HUMAN, C163A_HUMAN or PTPRJ_HUMAN. At least one of, or each of, the synthetic peptides can have an isotopic label attached.

[0050] Methods to assess the expression of a plurality of proteins can include mass spectrometry (MS), liquid chromatography-selected reaction monitoring/mass spectrometry (LC-SRM-MS), reverse transcriptase-polymerase chain reaction (RT-PCR), microarray, serial analysis of gene expression (SAGE), gene expression analysis by massively parallel signature sequencing (MPSS), immunoassays, immunohistochemistry (IHC), transcriptomics, or proteomics. Preferably, the expression of a plurality of proteins is assessed by LC-SRM-MS. LC-SRM-MS can be used to determine transitions for each peptide analyzed. Preferably, peptide transitions can be determined for at least one of, or each of, ALQASALK (SEQ ID NO: 25), AVGLAGTFR (SEQ ID NO: 26), GFLLLASLR (SEQ ID NO: 27), LGGPEAGLGEYLFER (SEQ ID NO: 28) or VEIFYR (SEQ ID NO: 29). More preferably the peptide transitions include at least ALQASALK (SEQ ID NO: 25) (401.25, 617.4), AVGLAGTFR (SEQ ID NO: 26) (446.26, 721.4), GFLLLASLR (SEQ ID NO: 27) (495.31, 559.4), LGGPEAGLGEYLFER (SEQ ID NO: 28) (804.4, 1083.6), and VEIFYR (SEQ ID NO: 29) (413.73, 598.3).

[0051] The measuring step may also be performed using a compound that specifically binds the protein being detected or a peptide transition. For example, a compound that specifically binds to the protein being measured can be an antibody or an aptamer.

[0052] The score can be calculated from a logistic regression model applied to the protein measurements. For example, the score is determined as $P_s = 1/[1 + \exp(-\alpha - \sum_{i=1}^{5} \beta_i \cdot \check{I}_{i,s} - \gamma \cdot \check{I}_{\mathrm{COIA1}} \cdot \check{I}_{\mathrm{FRIL}})]$, where $\check{I}_{i,s}$ is Box-Cox transformed and normalized intensity of transition i in said sample (s), $\beta_i$ is the corresponding logistic regression coefficient, $\alpha$ is a panel-specific constant, and $\gamma$ is a coefficient for the interaction term.
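The Box-Cox transformation mentioned in this paragraph is a standard power transform for positive data. The sketch below shows its usual one-parameter form. The parameter value (called lam here) is not stated in this passage, so any value used with this function is an assumption, and the function name is illustrative.

```python
import math


def box_cox(x: float, lam: float) -> float:
    """One-parameter Box-Cox transform of a positive value x.

    (x**lam - 1) / lam for lam != 0, and log(x) for lam == 0,
    which is the limit of the first expression as lam approaches 0.
    """
    if x <= 0:
        raise ValueError("Box-Cox is defined only for positive values")
    if lam == 0:
        return math.log(x)
    return (x ** lam - 1.0) / lam
```

For instance, with lam = 0.5 a raw intensity of 4.0 maps to (2 - 1)/0.5 = 2.0, and with lam = 0 the transform reduces to the natural logarithm.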

[0053] The reference population can include at least 100 subjects with a lung condition and wherein each subject in the reference population has been assigned a score based on the protein expression of at least each of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN and COIA1_HUMAN obtained from a biological sample from the subject. The invention further provides methods for the treatment of a subject, wherein if the lung condition is not cancer the subject is treated based on clinical practice guidelines. Preferably, if a lung condition is not cancer the subject receives image monitoring for at least a 1 year period, for at least a 2 year period or at least a 3 year period. More preferably, if the lung condition is not cancer, the subject receives chest computed tomography scans for at least a 1 year period, for at least a 2 year period or at least a 3 year period.

[0054] The present invention also provides that at least one step of any disclosed method can be performed on a computer or computer system.

[0055] As disclosed herein, archival plasma samples from subjects presenting with PNs were analyzed for differential protein expression by mass spectrometry and the results were used to identify biomarker proteins and panels of biomarker proteins that are differentially expressed in conjunction with various lung conditions (cancer vs. non-cancer).

[0056] In one aspect of the invention, the panel comprises at least 2, 3, 4, 5, or more protein markers with at least one protein-protein interaction. In some embodiments, the panel comprises 5 protein markers with at least one protein-protein interaction. In some embodiments, the panel comprises ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN; and FRIL_HUMAN and COIA1_HUMAN interact in the model for determining the probability score of cancer. In some embodiments, the panel comprises 2, 3, or 4 biomarkers selected from the group consisting of ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN; and at least one protein-protein mathematical interaction exists among the biomarkers.

[0057] Additional biomarkers that can be used herein are described in WO 13/096845, the contents of which are incorporated herein by reference in their entirety.

[0058] The term "interact", "interacted", "interaction" or "protein-protein interaction" used herein refers to mathematical interaction between peptides (or peptide transitions) derived from two or more protein markers when calculating the probability score of cancer.

[0059] The term "pulmonary nodules" (PNs) refers to lung lesions that can be visualized by radiographic techniques. A pulmonary nodule is any nodule less than or equal to three centimeters in diameter. In one example, a pulmonary nodule has a diameter of about 0.8 cm to 2 cm.

[0060] The term "masses" or "pulmonary masses" refers to lung nodules that are greater than three centimeters maximal diameter.

[0061] The term "blood biopsy" refers to a diagnostic study of the blood to determine whether a patient presenting with a nodule has a condition that may be classified as either benign or malignant.

[0062] The term "acceptance criteria" refers to the set of criteria to which an assay, test, diagnostic or product should conform to be considered acceptable for its intended use. As used herein, acceptance criteria are a list of tests, references to analytical procedures, and appropriate measures, which are defined for an assay or product that will be used in a diagnostic. For example, the acceptance criteria for the classifier refer to a set of predetermined ranges of coefficients.

[0063] The term "average maximal AUC" refers to the methodology of calculating performance. For the present invention, in the process of defining the set of proteins that should be in a panel by forward or backward selection, proteins are removed or added one at a time. A plot can be generated with performance (AUC or partial AUC score) on the Y axis and proteins on the X axis; the point which maximizes performance indicates the number and set of proteins that gives the best result.

[0064] The term "partial AUC factor" or "pAUC factor" refers to the factor by which the partial AUC is greater than expected by random prediction. At sensitivity=0.90, the pAUC factor is the trapezoidal area under the ROC curve from 0.9 to 1.0 specificity, divided by (0.1*0.1/2).
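
By way of non-limiting illustration, the pAUC factor may be sketched as follows. The sketch assumes that "0.9 to 1.0 specificity" corresponds to a false positive rate of 0 to 0.1 on the ROC curve, and that the divisor 0.1*0.1/2 is the area under the chance diagonal over that same interval. The function name `pauc_factor` and its parameters are hypothetical and are not taken from the disclosure.

```python
def pauc_factor(fpr, tpr, max_fpr=0.1):
    """Partial AUC over FPR in [0, max_fpr], divided by the area expected
    from random prediction over that interval (max_fpr * max_fpr / 2).

    fpr, tpr: ROC coordinates (false/true positive rates), same length.
    Assumption: specificity 0.9..1.0 is read as FPR 0..0.1.
    """
    points = sorted(zip(fpr, tpr))
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 >= max_fpr:
            break  # the rest of the curve lies outside the interval
        if x1 > max_fpr:
            # Clip the last segment at max_fpr by linear interpolation.
            y1 = y0 + (y1 - y0) * (max_fpr - x0) / (x1 - x0)
            x1 = max_fpr
        area += (x1 - x0) * (y0 + y1) / 2.0  # trapezoid for this segment
    return area / (max_fpr * max_fpr / 2.0)


# A chance-level ROC curve (the diagonal) and a perfect ROC curve.
chance = pauc_factor([0.0, 1.0], [0.0, 1.0])
perfect = pauc_factor([0.0, 0.0, 1.0], [0.0, 1.0, 1.0])
```

Under these assumptions a factor near 1 indicates performance no better than random prediction in the high-specificity region, and larger values indicate better performance.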

[0065] The term "incremental information" refers to information that may be used with other diagnostic information to enhance diagnostic accuracy. Incremental information is independent of clinical factors such as nodule size, age, or gender.

[0066] The term "score" or "scoring" refers to calculating a probability likelihood for a sample. For the present invention, values closer to 1.0 represent the likelihood that a sample is cancer, and values closer to 0.0 represent the likelihood that a sample is benign.

[0067] The term "robust" refers to a test or procedure that is not seriously disturbed by violations of the assumptions on which it is based. For the present invention, a robust test is a test wherein the proteins or transitions of the mass spectrometry chromatograms have been manually reviewed and are "generally" free of interfering signals.

[0068] The term "coefficients" refers to the weight assigned to each protein used in the logistic regression model to score a sample.

[0069] In certain embodiments of the invention, it is contemplated that in terms of the logistic regression model of MC CV, the model coefficient and the coefficient of variation (CV) of each protein's model coefficient may increase or decrease, dependent upon the method (or model) of measurement of the protein classifier. For each of the listed proteins in the panels, there is about, at least, at least about, or at most about a 2-, 3-, 4-, 5-, 6-, 7-, 8-, 9-, or 10-fold, or any range derivable therein, for each of the coefficient and CV. Alternatively, it is contemplated that quantitative embodiments of the invention may be discussed in terms of about, at least, at least about, or at most about 10, 20, 30, 40, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99% or more, or any range derivable therein.

[0070] The term "best team players" refers to the proteins that rank the best in the random panel selection algorithm, i.e., perform well on panels. When combined into a classifier these proteins can segregate cancer from benign samples. "Best team player proteins" are synonymous with "cooperative proteins". The term "cooperative proteins" refers to proteins that appear more frequently on high performing panels of proteins than expected by chance. This gives rise to a protein's cooperative score, which measures how (in)frequently it appears on high performing panels. For example, a protein with a cooperative score of 1.5 appears on high performing panels 1.5 times more than would be expected by chance alone.
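
As a hedged illustration only, a cooperative score of this kind may be computed as the ratio of a protein's frequency on high performing panels to its frequency across all randomly generated panels. Taking the latter frequency as the "chance" expectation is an assumption of this sketch, as are the function name `cooperative_scores` and the placeholder protein labels; none of these is specified by the disclosure.

```python
from collections import Counter


def cooperative_scores(panels, high_performing):
    """Ratio of observed to chance frequency on high performing panels.

    panels: list of sets of protein names (randomly generated panels).
    high_performing: list of booleans, True where the panel performed well.
    Assumption: the chance frequency of a protein is its frequency
    across all panels.
    """
    n_all = len(panels)
    n_top = sum(1 for good in high_performing if good)
    if n_all == 0 or n_top == 0:
        raise ValueError("need at least one panel and one high performing panel")
    all_counts = Counter(p for panel in panels for p in panel)
    top_counts = Counter(
        p for panel, good in zip(panels, high_performing) if good for p in panel
    )
    scores = {}
    for protein, count in all_counts.items():
        expected = count / n_all              # frequency expected by chance
        observed = top_counts[protein] / n_top  # frequency on good panels
        scores[protein] = observed / expected
    return scores


# Placeholder protein labels A-D; not real biomarker data.
panels = [{"A", "B"}, {"A", "C"}, {"B", "C"}, {"A", "D"}]
scores = cooperative_scores(panels, [True, True, False, False])
```

In this toy input, protein A appears on every high performing panel but on only three of four panels overall, so its score exceeds 1, whereas protein D never appears on a high performing panel and scores 0.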

[0071] The term "classifying" as used herein with regard to a lung condition refers to the act of compiling and analyzing expression data using statistical techniques to provide a classification to aid in diagnosis of a lung condition, particularly lung cancer.

[0072] The term "classifier" as used herein refers to an algorithm that discriminates between disease states with a predetermined level of statistical significance. A two-class classifier is an algorithm that uses data points from measurements from a sample and classifies the data into one of two groups. In certain embodiments, the data used in the classifier is the relative expression of proteins in a biological sample. Protein expression levels in a subject can be compared to levels in patients previously diagnosed as disease free or with a specified condition. Table 4 lists a representative classifier (C4 Classifier).

[0073] The "classifier" maximizes the probability of distinguishing a randomly selected cancer sample from a randomly selected benign sample, i.e., the AUC of the ROC curve.

[0074] In addition to the classifier's constituent proteins with differential expression, it may also include proteins with minimal or no biologic variation to enable assessment of variability, or the lack thereof, within or between clinical specimens; these proteins may be termed endogenous proteins and serve as internal controls for the other classifier proteins.

[0075] The term "normalization" or "normalizer" as used herein refers to the expression of a differential value in terms of a standard value to adjust for effects which arise from technical variation due to sample handling, sample preparation and mass spectrometry measurement rather than biological variation of protein concentration in a sample. For example, when measuring the expression of a differentially expressed protein, the absolute value for the expression of the protein can be expressed in terms of an absolute value for the expression of a standard protein that is substantially constant in expression. This prevents the technical variation of sample preparation and mass spectrometry measurement from impeding the measurement of protein concentration levels in the sample.
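
A minimal sketch of such a normalization follows, assuming a single standard protein and a purely multiplicative technical effect that scales the target and the standard alike. The function name `normalized_log_ratio`, the use of a base-2 logarithm, and the intensity values are illustrative assumptions rather than part of the disclosure.

```python
import math


def normalized_log_ratio(target_intensity, standard_intensity):
    """Express a target protein's intensity relative to a standard protein
    whose expression is substantially constant, on a log2 scale.

    A technical factor that multiplies both intensities cancels in the ratio.
    """
    if target_intensity <= 0 or standard_intensity <= 0:
        raise ValueError("intensities must be positive")
    return math.log2(target_intensity / standard_intensity)


# Same sample measured twice; the second run has 1.5x higher overall signal.
run_1 = normalized_log_ratio(2000.0, 1000.0)
run_2 = normalized_log_ratio(3000.0, 1500.0)
```

Both runs yield the same normalized value because the run-to-run scaling affects the target and the standard equally.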

[0076] The term "condition" as used herein refers generally to a disease, event, or change in health status.

[0077] The term "treatment protocol" as used herein includes further diagnostic testing typically performed to determine whether a pulmonary nodule is benign or malignant. Treatment protocols include diagnostic tests typically used to diagnose pulmonary nodules or masses such as, for example, CT scan, positron emission tomography (PET) scan, bronchoscopy or tissue biopsy. Treatment protocol as used herein is also meant to include therapeutic treatments typically used to treat malignant pulmonary nodules and/or lung cancer such as, for example, chemotherapy, radiation or surgery.

[0078] The terms "diagnosis" and "diagnostics" also encompass the terms "prognosis" and "prognostics", respectively, as well as the applications of such procedures over two or more time points to monitor the diagnosis and/or prognosis over time, and statistical modeling based thereupon. Furthermore the term diagnosis includes: a. prediction (determining if a patient will likely develop a hyperproliferative disease) b. prognosis (predicting whether a patient will likely have a better or worse outcome at a pre-selected time in the future) c. therapy selection d. therapeutic drug monitoring e. relapse monitoring.

[0079] In some embodiments, for example, classification of a biological sample as being derived from a subject with a lung condition may refer to the results and related reports generated by a laboratory, while diagnosis may refer to the act of a medical professional in using the classification to identify or verify the lung condition.

[0080] The term "providing" as used herein with regard to a biological sample refers to directly or indirectly obtaining the biological sample from a subject. For example, "providing" may refer to the act of directly obtaining the biological sample from a subject (e.g., by a blood draw, tissue biopsy, lavage and the like). Likewise, "providing" may refer to the act of indirectly obtaining the biological sample. For example, providing may refer to the act of a laboratory receiving the sample from the party that directly obtained the sample, or to the act of obtaining the sample from an archive.

[0081] As used herein, "lung cancer" preferably refers to cancers of the lung, but may include any disease or other disorder of the respiratory system of a human or other mammal. Respiratory neoplastic disorders include, for example, small cell carcinoma or small cell lung cancer (SCLC), non-small cell carcinoma or non-small cell lung cancer (NSCLC), squamous cell carcinoma, adenocarcinoma, broncho-alveolar carcinoma, mixed pulmonary carcinoma, malignant pleural mesothelioma, undifferentiated large cell carcinoma, giant cell carcinoma, synchronous tumors, large cell neuroendocrine carcinoma, adenosquamous carcinoma, undifferentiated carcinoma; and small cell carcinoma, including oat cell cancer, mixed small cell/large cell carcinoma, and combined small cell carcinoma; as well as adenoid cystic carcinoma, hamartomas, mucoepidermoid tumors, typical carcinoid lung tumors, atypical carcinoid lung tumors, peripheral carcinoid lung tumors, central carcinoid lung tumors, pleural mesotheliomas, and undifferentiated pulmonary carcinoma and cancers that originate outside the lungs such as secondary cancers that have metastasized to the lungs from other parts of the body. Lung cancers may be of any stage or grade. Preferably the term may be used to refer collectively to any dysplasia, hyperplasia, neoplasia, or metastasis in which the protein biomarkers are expressed above normal levels as may be determined, for example, by comparison to adjacent healthy tissue.

[0082] Examples of non-cancerous lung condition include chronic obstructive pulmonary disease (COPD), benign tumors or masses of cells (e.g., hamartoma, fibroma, neurofibroma), granuloma, sarcoidosis, and infections caused by bacterial (e.g., tuberculosis) or fungal (e.g. histoplasmosis) pathogens. In certain embodiments, a lung condition may be associated with the appearance of radiographic PNs.

[0083] As used herein, "lung tissue" and "lung cancer" refer to tissue or cancer, respectively, of the lungs themselves, as well as the tissue adjacent to and/or within the strata underlying the lungs and supporting structures such as the pleura, intercostal muscles, ribs, and other elements of the respiratory system. The respiratory system itself is taken in this context as representing nasal cavity, sinuses, pharynx, larynx, trachea, bronchi, lungs, lung lobes, alveoli, alveolar ducts, alveolar sacs, alveolar capillaries, bronchioles, respiratory bronchioles, visceral pleura, parietal pleura, pleural cavity, diaphragm, epiglottis, adenoids, tonsils, mouth and tongue, and the like. The tissue or cancer may be from a mammal and is preferably from a human, although monkeys, apes, cats, dogs, cows, horses and rabbits are within the scope of the present invention. The term "lung condition" as used herein refers to a disease, event, or change in health status relating to the lung, including for example lung cancer and various non-cancerous conditions.

[0084] "Accuracy" refers to the degree of conformity of a measured or calculated quantity (a test reported value) to its actual (or true) value. Clinical accuracy relates to the proportion of true outcomes (true positives (TP) or true negatives (TN)) versus misclassified outcomes (false positives (FP) or false negatives (FN)), and may be stated as a sensitivity, specificity, positive predictive value (PPV) or negative predictive value (NPV), or as a likelihood or odds ratio, among other measures.

[0085] The term "biological sample" as used herein refers to any sample of biological origin potentially containing one or more biomarker proteins. Examples of biological samples include tissue, organs, or bodily fluids such as whole blood, plasma, serum, tissue, lavage or any other specimen used for detection of disease.

[0086] The term "subject" as used herein refers to a mammal, preferably a human.

[0087] The term "biomarker protein" as used herein refers to a polypeptide that is differentially expressed in a biological sample from a subject with a lung condition versus a biological sample from a control subject. A biomarker protein includes not only the polypeptide itself, but also minor variations thereof, including for example one or more amino acid substitutions or modifications such as glycosylation or phosphorylation.

[0088] The term "biomarker protein panel" as used herein refers to a plurality of biomarker proteins. In certain embodiments, the expression levels of the proteins in the panels can be correlated with the existence of a lung condition in a subject. In certain embodiments, biomarker protein panels comprise 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 60, 70, 80, 90 or 100 proteins. In certain embodiments, the biomarker protein panels comprise 2-5 proteins, 5-10 proteins, 10-20 proteins or more.

[0089] "Treating" or "treatment" as used herein with regard to a condition may refer to preventing the condition, slowing the onset or rate of development of the condition, reducing the risk of developing the condition, preventing or delaying the development of symptoms associated with the condition, reducing or ending symptoms associated with the condition, generating a complete or partial regression of the condition, or some combination thereof.

[0090] Biomarker levels may change due to treatment of the disease. The changes in biomarker levels may be measured by the present invention. Changes in biomarker levels may be used to monitor the progression of disease or therapy.

[0091] "Altered", "changed" or "significantly different" refer to a detectable change or difference from a reasonably comparable state, profile, measurement, or the like. One skilled in the art should be able to determine a reasonable measurable change. Such changes may be all or none. They may be incremental and need not be linear. They may be by orders of magnitude. A change may be an increase or decrease by 1%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99%, 100%, or more, or any value in between 0% and 100%. Alternatively the change may be 1-fold, 1.5-fold, 2-fold, 3-fold, 4-fold, 5-fold or more, or any value in between 1-fold and 5-fold. The change may be statistically significant with a p value of 0.1, 0.05, 0.001, or 0.0001.

[0092] Using the methods of the current invention, a clinical assessment of a patient is first performed. If there exists a higher likelihood for cancer, the clinician may rule in the disease, which will require the pursuit of diagnostic testing options yielding data which increase and/or substantiate the likelihood of the diagnosis. "Rule in" of a disease requires a test with a high specificity.

[0093] "FN" is false negative, which for a disease state test means classifying a disease subject incorrectly as non-disease or normal.

[0094] "FP" is false positive, which for a disease state test means classifying a normal subject incorrectly as having disease.

[0095] The term "rule in" refers to a diagnostic test with high specificity that, optionally coupled with a clinical assessment, indicates a higher likelihood for cancer. If the clinical assessment is a lower likelihood for cancer, the clinician may adopt a stance to rule out the disease, which will require diagnostic tests which yield data that decrease the likelihood of the diagnosis. "Rule out" requires a test with a high sensitivity. Accordingly, the term "ruling in" as used herein means that the subject is selected to receive a treatment protocol.

[0096] The term "rule out" refers to a diagnostic test with high sensitivity that, optionally coupled with a clinical assessment, indicates a lower likelihood for cancer. Accordingly, the term "ruling out" as used herein means that the subject is selected not to receive a treatment protocol.

[0097] The term "sensitivity of a test" refers to the probability that a patient with the disease will have a positive test result. This is derived from the number of patients with the disease who have a positive test result (true positive) divided by the total number of patients with the disease, including those with true positive results and those patients with the disease who have a negative result, i.e. false negative.

[0098] The term "specificity of a test" refers to the probability that a patient without the disease will have a negative test result. This is derived from the number of patients without the disease who have a negative test result (true negative) divided by all patients without the disease, including those with a true negative result and those patients without the disease who have a positive test result, i.e. false positive. While the sensitivity, specificity, true or false positive rate, and true or false negative rate of a test provide an indication of a test's performance, e.g. relative to other tests, to make a clinical decision for an individual patient based on the test's result, the clinician requires performance parameters of the test with respect to a given population.

[0099] The term "positive predictive value" (PPV) refers to the probability that a positive result correctly identifies a patient who has the disease, which is the number of true positives divided by the sum of true positives and false positives.

[0100] The term "negative predictive value" or "NPV" is calculated by TN/(TN+FN) or the true negative fraction of all negative test results. It also is inherently impacted by the prevalence of the disease and pre-test probability of the population intended to be tested. The term NPV refers to the probability that a negative test correctly identifies a patient without the disease, which is the number of true negatives divided by the sum of true negatives and false negatives. A positive result from a test with a sufficient PPV can be used to rule in the disease for a patient, while a negative result from a test with a sufficient NPV can be used to rule out the disease, if the disease prevalence for the given population, of which the patient can be considered a part, is known.
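
The four performance measures defined above, and the dependence of PPV and NPV on disease prevalence, may be illustrated by the following sketch. The function names and the example counts are hypothetical; the second function applies Bayes' rule, which is a standard identity and is not recited in the disclosure.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),  # TP / all patients with disease
        "specificity": tn / (tn + fp),  # TN / all patients without disease
        "ppv": tp / (tp + fp),          # TP / all positive results
        "npv": tn / (tn + fn),          # TN / all negative results
    }


def predictive_values(sensitivity, specificity, prevalence):
    """PPV and NPV for a population with the given disease prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


# Hypothetical counts: 100 diseased and 100 non-diseased subjects.
metrics = diagnostic_metrics(tp=90, fp=20, tn=80, fn=10)
# The same test applied to a population with 20% prevalence.
ppv_20, npv_20 = predictive_values(0.9, 0.8, 0.2)
```

In this example the NPV rises from 80/90 at 50% prevalence to 0.64/0.66 at 20% prevalence while sensitivity and specificity are unchanged, which is why a rule-out decision requires knowledge of the prevalence in the intended population.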

[0101] The term "disease prevalence" refers to the number of all new and old cases of a disease or occurrences of an event during a particular period. Prevalence is expressed as a ratio in which the number of events is the numerator and the population at risk is the denominator.

[0102] The term "disease incidence" refers to a measure of the risk of developing some new condition within a specified period of time. Although it may be stated as the number of new cases during some time period, it is better expressed as a proportion or a rate with a denominator.

[0103] Lung cancer risk according to the "National Lung Screening Trial" is classified by age and smoking history. High risk: age ≥55 and ≥30 pack-years smoking history; Moderate risk: age ≥50 and ≥20 pack-years smoking history; Low risk: age <50 or <20 pack-years smoking history.
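
The three risk categories above can be expressed as a simple rule, sketched below. The function name `nlst_risk_category` and the returned labels are illustrative assumptions; the thresholds are those stated in the preceding paragraph, applied in order from high to low.

```python
def nlst_risk_category(age, pack_years):
    """Classify lung cancer risk by age (years) and smoking history (pack-years)."""
    if age >= 55 and pack_years >= 30:
        return "high"
    if age >= 50 and pack_years >= 20:
        return "moderate"
    # Remaining subjects are younger than 50 or have under 20 pack-years.
    return "low"


examples = [
    nlst_risk_category(60, 35),  # meets both high-risk thresholds
    nlst_risk_category(52, 35),  # too young for high risk
    nlst_risk_category(45, 40),  # under age 50
]
```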

[0104] The clinician must decide on using a diagnostic test based on its intrinsic performance parameters, including sensitivity and specificity, and on its extrinsic performance parameters, such as positive predictive value and negative predictive value, which depend upon the disease's prevalence in a given population.

[0105] Additional parameters which may influence clinical assessment of disease likelihood include the prior frequency and closeness of a patient to a known agent, e.g. exposure risk, that directly or indirectly is associated with disease causation, e.g. second hand smoke, radiation, etc., and also the radiographic appearance or characterization of the pulmonary nodule exclusive of size. A nodule's description may include solid, semi-solid or ground glass which characterizes it based on the spectrum of relative gray scale density employed by the CT scan technology.

[0106] "Mass spectrometry" refers to a method comprising employing an ionization source to generate gas phase ions from an analyte presented on a sample presenting surface of a probe and detecting the gas phase ions with a mass spectrometer.

[0107] In an embodiment of the invention, a panel of 5 proteins (ALDOA, FRIL, LG3BP, TSP1, and COIA1) and one protein-protein interaction term (FRIL and COIA1) effectively distinguish between samples derived from patients with benign and malignant nodules less than 2 cm diameter.

[0108] Bioinformatic and biostatistical analyses were used first to identify individual proteins with statistically significant differential expression, and then to derive from these proteins one or more combinations of proteins or panels of proteins, which collectively demonstrated superior discriminatory performance compared to any individual protein. Bioinformatic and biostatistical methods are used to derive a coefficient (C) for each individual protein in the panel that reflects its relative expression level, i.e. increased or decreased, and its weight or importance with respect to the panel's net discriminatory ability, relative to the other proteins. The quantitative discriminatory ability of the panel can be expressed as a mathematical algorithm with a term for each of its constituent proteins being the product of its coefficient and the protein's plasma expression level (P) (as measured by LC-SRM-MS), e.g. $C \times P$, with an algorithm consisting of n proteins described as: $C_1 \times P_1 + C_2 \times P_2 + C_3 \times P_3 + \cdots + C_n \times P_n$. An algorithm that discriminates between disease states with a predetermined level of statistical significance may be referred to as a "disease classifier". In addition to the classifier's constituent proteins with differential expression, it may also include proteins with minimal or no biologic variation to enable assessment of variability, or the lack thereof, within or between clinical specimens; these proteins may be termed typical native proteins and serve as internal controls for the other classifier proteins.

[0109] In certain embodiments, expression levels are measured by MS. MS analyzes the mass spectrum produced by an ion after its production by the vaporization of its parent protein and its separation from other ions based on its mass-to-charge ratio. The most common modes of acquiring MS data are 1) full scan acquisition resulting in the typical total ion current plot (TIC), 2) selected ion monitoring (SIM), and 3) selected reaction monitoring (SRM).

[0110] In certain embodiments of the methods provided herein, biomarker protein expression levels are measured by LC-SRM-MS. LC-SRM-MS is a highly selective method of tandem mass spectrometry which has the potential to effectively filter out all molecules and contaminants except the desired analyte(s). This is particularly beneficial if the analysis sample is a complex mixture which may comprise several isobaric species within a defined analytical window. LC-SRM-MS methods may utilize a triple quadrupole mass spectrometer which, as is known in the art, includes three quadrupole rod sets. A first stage of mass selection is performed in the first quadrupole rod set, and the selectively transmitted ions are fragmented in the second quadrupole rod set. The resultant transition (product) ions are conveyed to the third quadrupole rod set, which performs a second stage of mass selection. The product ions transmitted through the third quadrupole rod set are measured by a detector, which generates a signal representative of the numbers of selectively transmitted product ions. The RF and DC potentials applied to the first and third quadrupoles are tuned to select (respectively) precursor and product ions that have m/z values lying within narrow specified ranges. By specifying the appropriate transitions (m/z values of precursor and product ions), a peptide corresponding to a targeted protein may be measured with high degrees of sensitivity and selectivity. Signal-to-noise ratio is superior to conventional tandem mass spectrometry (MS/MS) experiments, which select one mass window in the first quadrupole and then measure all generated transitions in the ion detector.

[0111] In certain embodiments, an SRM-MS assay for use in diagnosing or monitoring lung cancer as disclosed herein may utilize one or more peptides and/or peptide transitions derived from the proteins ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN. In certain embodiments, the peptides and/or peptide transitions derived from 2 or more proteins "interact" mathematically. In certain embodiments, the peptides and/or peptide transitions derived from FRIL and COIA1 mathematically interact in the model for determining the probability score of lung cancer.

[0112] The expression level of a biomarker protein can be measured using any suitable method known in the art, including but not limited to mass spectrometry (MS), reverse transcriptase-polymerase chain reaction (RT-PCR), microarray, serial analysis of gene expression (SAGE), gene expression analysis by massively parallel signature sequencing (MPSS), immunoassays (e.g., ELISA), immunohistochemistry (IHC), transcriptomics, and proteomics.

[0113] To evaluate the diagnostic performance of a particular set of peptide transitions, a ROC curve is generated for each significant transition.

[0114] An "ROC curve" as used herein refers to a plot of the true positive rate (sensitivity) against the false positive rate (1-specificity) for a binary classifier system as its discrimination threshold is varied. A ROC curve can be represented equivalently by plotting the fraction of true positives out of the positives (TPR=true positive rate) versus the fraction of false positives out of the negatives (FPR=false positive rate). Each point on the ROC curve represents a sensitivity/specificity pair corresponding to a particular decision threshold.

[0115] AUC represents the area under the ROC curve. The AUC is an overall indication of the diagnostic accuracy of 1) a biomarker or a panel of biomarkers and 2) a ROC curve. AUC is determined by the "trapezoidal rule." For a given curve, the data points are connected by straight line segments, perpendiculars are erected from the abscissa to each data point, and the sum of the areas of the triangles and trapezoids so constructed is computed. In certain embodiments of the methods provided herein, a biomarker protein has an AUC in the range of about 0.75 to 1.0. In certain of these embodiments, the AUC is in the range of about 0.8 to 0.8, 0.9 to 0.95, or 0.95 to 1.0.
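
For illustration only, the construction of ROC points from scored samples and the trapezoidal computation of the AUC may be sketched as follows. The function names, the label convention (1 for cancer, 0 for benign), and the example scores are assumptions of this sketch and are not data from the disclosure.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by lowering the decision threshold.

    scores: classifier scores, higher meaning more likely cancer.
    labels: 1 for cancer, 0 for benign (illustrative convention).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one cancer and one benign sample")
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with tied scores cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points


def trapezoidal_auc(points):
    """Sum of trapezoid areas between consecutive ROC points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical scores: one benign sample outranks one cancer sample.
auc = trapezoidal_auc(roc_points([0.9, 0.4, 0.6, 0.2], [1, 1, 0, 0]))
```

Here three of the four cancer/benign pairs are ranked correctly, so the AUC equals 0.75, consistent with the interpretation of AUC as the probability of correctly ranking a randomly selected cancer sample above a randomly selected benign sample.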

[0116] The methods provided herein are minimally invasive and pose little or no risk of adverse effects. As such, they may be used to diagnose, monitor and provide clinical management of subjects who do not exhibit any symptoms of a lung condition and subjects classified as low risk for developing a lung condition. For example, the methods disclosed herein may be used to diagnose lung cancer in a subject who does not present with a PN and/or has not presented with a PN in the past, but who is nonetheless deemed at risk of developing a PN and/or a lung condition. Similarly, the methods disclosed herein may be used as a strictly precautionary measure to diagnose healthy subjects who are classified as low risk for developing a lung condition.

[0117] The present invention provides a method of determining the likelihood that a lung condition in a subject is cancer by measuring an abundance of a panel of proteins in a sample obtained from the subject; calculating a probability of cancer score based on the protein measurements; and ruling out cancer for the subject if the score is lower than a pre-determined score. When cancer is ruled out, the subject does not receive a treatment protocol. Treatment protocols include, for example, pulmonary function test (PFT), pulmonary imaging, a biopsy, a surgery, a chemotherapy, a radiotherapy, or any combination thereof. In some embodiments, the imaging is an x-ray, a chest computed tomography (CT) scan, or a positron emission tomography (PET) scan.

[0118] The present invention further provides a method of ruling in the likelihood of cancer for a subject by measuring an abundance of a panel of proteins in a sample obtained from the subject, calculating a probability of cancer score based on the protein measurements and ruling in the likelihood of cancer for the subject if the score is higher than a pre-determined score.

[0119] In another aspect the invention further provides a method of determining the likelihood of the presence of a lung condition in a subject by measuring an abundance of a panel of proteins in a sample obtained from the subject, calculating a probability of cancer score based on the protein measurements and concluding the presence of this lung condition if the score is equal to or greater than a pre-determined score. The lung condition is lung cancer such as, for example, non-small cell lung cancer (NSCLC). The subject is at risk of developing lung cancer.

[0120] The panel includes 5 proteins ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN. Nucleic acid and amino acid sequences for these can be found in Table 6 and Table 7, respectively. Preferably, FRIL_HUMAN and COIA1_HUMAN mathematically interact in the model for determining the probability score.

[0121] In merely illustrative embodiments, the methods described herein include steps of (a) measuring the abundance (intensity) of one representative peptide transition derived from each of the proteins comprising ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN in a sample obtained from a subject; (b) determining the coefficient for each representative peptide transition; (c) calculating a sum of the products of each logarithmically transformed (and optionally normalized) intensity of each transition and its corresponding coefficient; (d) calculating a mathematical interaction between FRIL and COIA1 by multiplying their logarithmically transformed (and optionally normalized) intensity of their representative peptide transitions; and (e) calculating a probability of cancer score based on the sum calculated in step (c) and the mathematical interaction calculated in step (d).

[0122] In some embodiments, the representative peptide transitions for proteins ALDOA_HUMAN, COIA1_HUMAN, TSP1_HUMAN, FRIL_HUMAN, and LG3BP_HUMAN are ALQASALK (SEQ ID NO: 25) (401.25, 617.4), AVGLAGTFR (SEQ ID NO: 26) (446.26, 721.4), GFLLLASLR (SEQ ID NO: 27) (495.31, 559.4), LGGPEAGLGEYLFER (SEQ ID NO: 28) (804.4, 1083.6), and VEIFYR (SEQ ID NO: 29) (413.73, 598.3), respectively.

[0123] In some embodiments, the measuring step of any method described herein is performed by detecting transitions comprising ALQASALK (SEQ ID NO: 25) (401.25, 617.4), AVGLAGTFR (SEQ ID NO: 26) (446.26, 721.4), GFLLLASLR (SEQ ID NO: 27) (495.31, 559.4), LGGPEAGLGEYLFER (SEQ ID NO: 28) (804.4, 1083.6), and VEIFYR (SEQ ID NO: 29) (413.73, 598.3).

[0124] The subject has or is suspected of having a pulmonary nodule. The pulmonary nodule has a diameter of less than or equal to 3.0 cm. In one embodiment, the pulmonary nodule has a diameter of about 0.8 cm to 2.0 cm. The subject may have stage IA lung cancer (i.e., the tumor is smaller than 3 cm).

[0125] The probability score is calculated from a logistic regression model applied to the protein measurements. For example, the score is determined as $P_s = 1/\left[1 + \exp\left(-\alpha - \sum_{i=1}^{5} \beta_i \cdot I_{i,s} - \gamma \cdot I_{COIA1} \cdot I_{FRIL}\right)\right]$, where $I_{i,s}$ is logarithmically transformed and normalized intensity of transition $i$ in said sample ($s$), $\beta_i$ is the corresponding logistic regression coefficient, $\alpha$ is a panel-specific constant, and $\gamma$ is a coefficient for the interaction term. The score determined has a negative predictive value (NPV) of at least about 85%, at least 90% or higher (91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or higher).
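The logistic score of paragraph [0125] can be sketched in a few lines of Python. This is a minimal illustration rather than the applicant's implementation: the function name, the dictionary layout, and every numeric input below are hypothetical, and the intensities are assumed to be already log-transformed and normalized.

```python
import math


def probability_of_cancer(intensities, betas, alpha, gamma,
                          interaction=("COIA1", "FRIL")):
    """Return P_s = 1 / (1 + exp(-(alpha + sum(beta_i * I_i) + gamma * I_a * I_b))).

    intensities: dict of log-transformed, normalized transition intensities
    betas:       dict of per-protein logistic regression coefficients
    alpha:       panel-specific constant
    gamma:       coefficient of the interaction term
    """
    # Linear part: constant plus the weighted sum over the panel proteins.
    linear = alpha + sum(betas[name] * intensities[name] for name in betas)
    # Interaction part: product of the two interacting intensities.
    a, b = interaction
    linear += gamma * intensities[a] * intensities[b]
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical inputs for illustration only (not measured data).
example_intensities = {"ALDOA": 0.2, "COIA1": -0.1, "TSP1": 0.4,
                       "FRIL": 0.3, "LG3BP": -0.5}
example_betas = {"ALDOA": -0.5, "COIA1": -2.5, "TSP1": 0.3,
                 "FRIL": -0.9, "LG3BP": -0.9}
score = probability_of_cancer(example_intensities, example_betas,
                              alpha=-1.6, gamma=-1.2)
print(f"probability of cancer score: {score:.3f}")
```

Keeping the interaction pair as a parameter makes it straightforward to try other protein pairs without changing the scoring function.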

[0126] In various embodiments, the method of the present invention further comprises normalizing the protein measurements. For example, the protein measurements are normalized by one or more proteins selected from PEDF_HUMAN, MASP1_HUMAN, GELS_HUMAN, LUM_HUMAN, C163A_HUMAN and PTPRJ_HUMAN. Nucleic acid and amino acid sequences for these can be found in Table 8 and Table 9, respectively.

[0127] The biological sample includes, for example, tissue, blood, plasma, serum, whole blood, urine, saliva, genital secretion, cerebrospinal fluid, sweat and excreta.

[0128] In some embodiments, the likelihood of cancer is determined by the sensitivity, specificity, negative predictive value or positive predictive value associated with the score.

[0129] The measuring step is performed by selected reaction monitoring mass spectrometry, using a compound that specifically binds the protein being detected or a peptide transition. In one embodiment, the compound that specifically binds to the protein being measured is an antibody or an aptamer.

[0130] In specific embodiments, the diagnostic methods disclosed herein are used to rule out a treatment protocol for a subject by measuring the abundance of a panel of proteins in a sample obtained from the subject, calculating a probability of cancer score based on the protein measurements and protein-protein interaction and ruling out the treatment protocol for the subject if the score determined in the sample is lower than a pre-determined score. In some embodiments the panel contains ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN; and FRIL_HUMAN and COIA1_HUMAN interact in the model for determining the score.

[0131] In specific embodiments, the diagnostic methods disclosed herein are used to rule in a treatment protocol for a subject by measuring the abundance of a panel of proteins in a sample obtained from the subject, calculating a probability of cancer score based on the protein measurements and protein-protein interaction and ruling in the treatment protocol for the subject if the score determined in the sample is greater than a pre-determined score. In some embodiments the panel contains ALDOA_HUMAN, FRIL_HUMAN, LG3BP_HUMAN, TSP1_HUMAN, and COIA1_HUMAN; and FRIL_HUMAN and COIA1_HUMAN interact in the model for determining the score.

[0132] In certain embodiments, the diagnostic methods disclosed herein can be used in combination with other clinical assessment methods, including for example various radiographic and/or invasive methods. Similarly, in certain embodiments, the diagnostic methods disclosed herein can be used to identify candidates for other clinical assessment methods, or to assess the likelihood that a subject will benefit from other clinical assessment methods.

[0133] The high abundance of certain proteins in a biological sample such as plasma or serum can hinder the ability to assay a protein of interest, particularly where the protein of interest is expressed at relatively low concentrations. Several methods are available to circumvent this issue, including enrichment, separation, and depletion. Enrichment uses an affinity agent to extract proteins from the sample by class, e.g., removal of glycosylated proteins by glycocapture. Separation uses methods such as gel electrophoresis or isoelectric focusing to divide the sample into multiple fractions that largely do not overlap in protein content. Depletion typically uses affinity columns to remove the most abundant proteins in blood, such as albumin, by utilizing advanced technologies such as IgY14/Supermix (Sigma, St. Louis, Mo.) that enable the removal of the majority of the most abundant proteins.

[0134] In certain embodiments of the methods provided herein, a biological sample may be subjected to enrichment, separation, and/or depletion prior to assaying biomarker or putative biomarker protein expression levels. In certain of these embodiments, blood proteins may be initially processed by a glycocapture method, which enriches for glycosylated proteins, allowing quantification assays to detect proteins in the high pg/ml to low ng/ml concentration range. Exemplary methods of glycocapture are well known in the art (see, e.g., U.S. Pat. No. 7,183,188; U.S. Patent Appl. Publ. No. 2007/0099251; U.S. Patent Appl. Publ. No. 2007/0202539; U.S. Patent Appl. Publ. No. 2007/0269895; and U.S. Patent Appl. Publ. No. 2010/0279382). In other embodiments, blood proteins may be initially processed by a protein depletion method, which allows for detection of commonly obscured biomarkers in samples by removing abundant proteins. In one such embodiment, the protein depletion method is a Supermix (Sigma) depletion method.

[0135] In certain embodiments, a biomarker protein panel comprises two to 100 biomarker proteins. In certain of these embodiments, the panel comprises 2 to 5, 6 to 10, 11 to 15, 16 to 20, 21 to 25, 5 to 25, 26 to 30, 31 to 40, 41 to 50, 25 to 50, 51 to 75, or 76 to 100 biomarker proteins. In certain embodiments, a biomarker protein panel comprises one or more subpanels of biomarker proteins that each comprises at least two biomarker proteins. For example, a biomarker protein panel may comprise a first subpanel made up of biomarker proteins that are overexpressed in a particular lung condition and a second subpanel made up of biomarker proteins that are under-expressed in a particular lung condition.

[0136] In certain embodiments of the methods, compositions, and kits provided herein, a biomarker protein may be a protein that exhibits differential expression in conjunction with lung cancer.

[0137] In other embodiments, the diagnosis methods disclosed herein may be used to distinguish between two different lung conditions. For example, the methods may be used to classify a lung condition as malignant lung cancer versus benign lung cancer, NSCLC versus SCLC, or lung cancer versus non-cancer condition (e.g., inflammatory condition).

[0138] In certain embodiments, kits are provided for diagnosing a lung condition in a subject. These kits are used to detect expression levels of one or more biomarker proteins. Optionally, a kit may comprise instructions for use in the form of a label or a separate insert. The kits can contain reagents that specifically bind to proteins in the panels described herein. These reagents can include antibodies. The kits can also contain reagents that specifically bind to mRNA expressing proteins in the panels described herein. These reagents can include nucleotide probes. The kits can also include reagents for the detection of reagents that specifically bind to the proteins in the panels described herein. These reagents can include fluorophores.

[0139] The following examples are provided to better illustrate the claimed invention and are not to be interpreted as limiting the scope of the invention. To the extent that specific materials are mentioned, it is merely for purposes of illustration and is not intended to limit the invention. One skilled in the art may develop equivalent means or reactants without the exercise of inventive capacity and without departing from the scope of the invention.

EXAMPLES

Example 1

Identification of a Robust Classifier that Distinguishes Malignant and Benign Lung Nodule

[0140] Plasma samples from patients at three sites (UPenn, Laval and NYU) were divided into five experimental batches. Within each batch, four aliquots of a pooled human plasma standard (HPS) sample were processed. Plasma samples were immuno-depleted, denatured, reduced, trypsin-digested, and analyzed by LC-MRM-MS at Integrated Diagnostics using protocols developed in previous studies.

[0141] The 100 clinical samples were all from patients with lung nodules of 8-20 mm in size and age>40 years. Cancer and benign samples were matched on gender, age (+/-10 years) and nodule size (+/-8 mm). There was some bias between cancer and benign samples on smoking history and on smoking pack-years.

TABLE-US-00001 TABLE 1 Sources of samples and their assignment to five batches.
Batch	Center	Benign	Cancer	Total
S1	UPenn	10	10	20
S2	UPenn	10	10	20
S3	Laval	10	10	20
S4	NYU	10	10	20
S5	NYU	10	10	20
Total	3 Sites	50	50	100

[0142] Detailed procedures for sample preparation and data processing, including normalization of the raw data, can be found in PCT/US2012/071387 (WO13/096845), the contents of which are incorporated herein by reference in their entireties.

[0143] Among all the possible panels formed by the 13 proteins identified in WO13/096845, there were 28 panels with a cross-validated performance with partial AUC at specificity=0.9 greater than two-fold the number expected by random chance (0.1^2/2). These panels were retained, and 100,000 cross-validation models were used to get a more accurate measure of their logistic regression coefficients and to determine the coefficient of variability (CV) for the model coefficients. The CV of each protein coefficient was measured, and the NPV and specificity (SPC) performance of the median panel was reported at a prevalence of 20%.
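The two quantities used in paragraph [0143] can be illustrated with a short Python sketch. It assumes that the pAUC "factor" is the observed partial AUC divided by the chance-level partial AUC, and that NPV at a given prevalence follows the standard definition from sensitivity and specificity. The function names and the example sensitivity value are hypothetical.

```python
def random_pauc(specificity_floor):
    """Partial AUC of a chance-level classifier over FPR in [0, 1 - specificity_floor].

    Along the chance diagonal TPR equals FPR, so the area is max_fpr**2 / 2.
    """
    max_fpr = 1.0 - specificity_floor
    return max_fpr ** 2 / 2.0


def pauc_factor(observed_pauc, specificity_floor):
    """Fold improvement of an observed partial AUC over chance (assumed definition)."""
    return observed_pauc / random_pauc(specificity_floor)


def npv(sensitivity, specificity, prevalence):
    """Negative predictive value: true negatives / all negative calls."""
    true_negatives = specificity * (1.0 - prevalence)
    false_negatives = (1.0 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# Chance-level partial AUC at specificity = 0.9 is 0.1**2 / 2.
print(f"random pAUC: {random_pauc(0.9):.4f}")
# Hypothetical operating point evaluated at the 20% prevalence used in the text.
print(f"NPV: {npv(sensitivity=0.90, specificity=0.50, prevalence=0.20):.3f}")
```

Because NPV depends on prevalence, the same sensitivity and specificity give a different NPV in a population with a different cancer rate.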

TABLE-US-00002 TABLE 2 Robust 28 panels
Proteins	max_cv	max_cv_protein	ALPHA_CV	NPV	specificity	threshold	xv_pAUC_factor
ALDOA, TSP1, PRDX1, LG3BP	0.54	ALDOA	0.73	0.90	0.68	0.50	3.33
ALDOA, TSP1, LG3BP	0.58	TSP1	0.73	0.90	0.55	0.49	4.47
ALDOA, COIA1, TSP1, LG3BP	0.73	COIA1	0.62	0.90	0.55	0.49	4.17
ALDOA, COIA1, FRIL, LG3BP	0.62	COIA1	0.38	0.90	0.51	0.48	3.89
COIA1, LG3BP	0.78	COIA1	0.57	0.90	0.51	0.49	3.75
LG3BP	0.23	LG3BP	0.32	0.90	0.49	0.48	4.05
ALDOA, LG3BP	0.44	ALDOA	0.38	0.91	0.47	0.47	5.45
ALDOA, LRP1, LG3BP	0.54	LRP1	0.66	0.91	0.47	0.46	4.26
ALDOA, COIA1, PRDX1, LG3BP	0.73	ALDOA	0.75	0.90	0.45	0.45	3.82
COIA1, PRDX1, LG3BP	0.70	COIA1	0.89	0.90	0.43	0.45	3.35
ALDOA, COIA1, LG3BP	0.65	COIA1	0.52	0.90	0.38	0.45	5.26
ISLR, ALDOA, COIA1, TSP1, FRIL, PRDX1, LG3BP	6.85	COIA1	0.96	0.90	0.72	0.49	2.10
PRDX1, LG3BP	0.37	PRDX1	1.50	0.90	0.55	0.49	3.34
ALDOA, PRDX1, LG3BP	0.82	ALDOA	2.61	0.90	0.53	0.47	3.74
ISLR, ALDOA, TSP1, PRDX1, LG3BP	1.50	ISLR	2.00	0.90	0.53	0.48	3.31
ISLR, ALDOA, COIA1, TSP1, PRDX1, LG3BP	42.98	ISLR	4.48	0.90	0.53	0.48	2.90
ISLR, ALDOA, TSP1, LG3BP	1.13	ISLR	1.04	0.90	0.51	0.48	4.08
ISLR, ALDOA, COIA1, TSP1, LG3BP	4.33	ISLR	1.50	0.90	0.51	0.48	3.76
ISLR, ALDOA, PRDX1, LG3BP	1.17	ISLR	1.24	0.90	0.51	0.47	3.74
ISLR, LG3BP	1.18	ISLR	1.01	0.91	0.47	0.47	3.57
ISLR, COIA1, LG3BP	4.46	ISLR	1.43	0.91	0.47	0.48	3.30
ISLR, PRDX1, LG3BP	1.32	ISLR	1.46	0.91	0.47	0.46	3.28
ISLR, ALDOA, LG3BP	1.01	ISLR	0.89	0.90	0.45	0.46	4.91
ALDOA, COIA1, LRP1, LG3BP	0.83	COIA1	3.18	0.90	0.45	0.46	4.01
ISLR, ALDOA, COIA1, PRDX1, LG3BP	8.97	ISLR	2.14	0.90	0.45	0.45	3.58
ISLR, COIA1, PRDX1, LG3BP	20.54	ISLR	2.86	0.90	0.43	0.45	3.12
ISLR, ALDOA, COIA1, LG3BP	3.63	ISLR	1.27	0.90	0.38	0.44	4.71
ISLR, ALDOA, LRP1, LG3BP	0.95	ISLR	2.97	0.90	0.38	0.44	3.97

[0144] All possible panels of proteins ALDOA, COIA1, FRIL, LG3BP, LRP1, PRDX1, TSP1, TETN, and BGH3 were next generated. A set of 27 panels was selected to be carried forward by the following criteria:

[0145] Median Specificity>=0.5

[0146] Max Coefficient CV<=1.5

[0147] Maximum ALPHA CV<=1.5

[0148] Cross-validated pAUC at specificity=0.9 greater than one fold random.

[0149] A minimum of four proteins per panel.

[0150] The top 6 panels were carried forward.
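The panel generation and filtering of paragraphs [0144] to [0149] can be sketched as follows. This is an illustrative outline only: it assumes that "all possible panels" means every non-empty subset of the nine proteins, the dictionary keys are hypothetical names for the listed criteria, and the two example records are invented for demonstration rather than taken from the study.

```python
from itertools import combinations

PROTEINS = ["ALDOA", "COIA1", "FRIL", "LG3BP", "LRP1",
            "PRDX1", "TSP1", "TETN", "BGH3"]


def all_panels(proteins):
    """Enumerate every non-empty subset of the candidate proteins."""
    return [combo
            for size in range(1, len(proteins) + 1)
            for combo in combinations(proteins, size)]


def passes_criteria(stats):
    """Apply the five selection criteria listed in the text to one panel record."""
    return (stats["median_specificity"] >= 0.5   # Median Specificity >= 0.5
            and stats["max_coef_cv"] <= 1.5      # Max Coefficient CV <= 1.5
            and stats["alpha_cv"] <= 1.5         # Maximum ALPHA CV <= 1.5
            and stats["xv_pauc_factor"] > 1.0    # cross-validated pAUC above random
            and len(stats["proteins"]) >= 4)     # minimum of four proteins


print(len(all_panels(PROTEINS)))  # number of candidate panels from nine proteins

# Hypothetical panel records, for illustration only.
candidates = [
    {"proteins": ("ALDOA", "COIA1", "TSP1", "FRIL", "LG3BP"),
     "median_specificity": 0.60, "max_coef_cv": 0.70,
     "alpha_cv": 0.40, "xv_pauc_factor": 1.97},
    {"proteins": ("ALDOA", "LG3BP"),
     "median_specificity": 0.47, "max_coef_cv": 0.44,
     "alpha_cv": 0.38, "xv_pauc_factor": 5.45},
]
selected = [c for c in candidates if passes_criteria(c)]
print([c["proteins"] for c in selected])
```

The second record fails on both the specificity floor and the minimum panel size, so only the five-protein panel is kept.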

TABLE-US-00003 TABLE 3 Top 6 panels
Panel	Proteins	Size	Median Specificity	Rank	xv_Specificity
ID_341	ALDOA, TSP1, FRIL, PRDX1, LG3BP	5	0.62	3	0.32
ID_85	TSP1, FRIL, PRDX1, LG3BP	4	0.55	5	0.31
ID_340	ALDOA, TSP1, FRIL, PRDX1	4	0.66	1	0.29
ID_449	ALDOA, COIA1, TSP1, LG3BP	4	0.51	6	0.27
ID_465	ALDOA, COIA1, TSP1, FRIL, LG3BP	5	0.60	4	0.24
ID_469	ALDOA, COIA1, TSP1, FRIL, PRDX1, LG3BP	6	0.64	2	0.23

[0151] Representative NPV/Specificity plots for the ID_465 and ID_341 panels can be found in FIGS. 2 and 3, respectively.

[0152] All possible interaction pairs were added to panel 465. The cross validated performance (Specificity at NPV=0.90) and partial AUC were measured. The table below displays the performance:

Cross validated performance and partial AUC for panel 465.

TABLE-US-00004
ID_465 Name	Max_cv	Max_cv_protein	ALPHA_CV	Median NPV	Median specificity	threshold	pAUC_xv	xv_NPV	xv_Spec	xv_Threshold	ID465 xv-spec	ID465 xv_pAUC
ID_465 ALQASALK (SEQ ID NO: 25)_401.25_617.40_times_AVGLAGTFR (SEQ ID NO: 26)_446.26_721.40	0.981	Interaction term	0.429	0.901	0.617	0.483	1.751	0.900	0.182	0.346	0	0
ID_465 ALQASALK (SEQ ID NO: 25)_401.25_617.40_times_GFLLLASLR (SEQ ID NO: 27)_495.31_559.40	0.955	GFLLLASLR (SEQ ID NO: 27)_495.31_559.40	0.381	0.904	0.638	0.481	1.571	0.900	0.201	0.355	0	0
ID_465 ALQASALK (SEQ ID NO: 25)_401.25_617.40_times_LGGPEAGLGEYLFER (SEQ ID NO: 28)_804.40_1083.60	0.735	LGGPEAGLGEYLFER (SEQ ID NO: 28)_804.40_1083.60	0.529	0.901	0.681	0.501	1.944	0.900	0.240	0.375	0	0
ID_465 ALQASALK (SEQ ID NO: 25)_401.25_617.40_times_VEIFYR (SEQ ID NO: 29)_413.73_598.30	0.953	Interaction term	0.397	0.901	0.617	0.495	2.209	0.900	0.241	0.376	0	1
ID_465 AVGLAGTFR (SEQ ID NO: 26)_446.26_721.40_times_GFLLLASLR (SEQ ID NO: 27)_495.31_559.40	0.891	Interaction term	0.475	0.901	0.511	0.455	1.734	0.900	0.188	0.336	0	0
ID_465 AVGLAGTFR (SEQ ID NO: 26)_446.26_721.40_times_LGGPEAGLGEYLFER (SEQ ID NO: 28)_804.40_1083.60	0.466	LGGPEAGLGEYLFER (SEQ ID NO: 28)_804.40_1083.60	0.619	0.902	0.660	0.496	2.402	0.900	0.396	0.422	1	1
ID_465 AVGLAGTFR (SEQ ID NO: 26)_446.26_721.40_times_VEIFYR (SEQ ID NO: 29)_413.73_598.30	4.349	VEIFYR (SEQ ID NO: 29)_413.73_598.30	0.510	0.905	0.574	0.481	1.643	0.900	0.216	0.360	0	0
ID_465 GFLLLASLR (SEQ ID NO: 27)_495.31_559.40_times_LGGPEAGLGEYLFER (SEQ ID NO: 28)_804.40_1083.60	556.510	Interaction term	0.420	0.901	0.617	0.485	1.217	0.900	0.165	0.337	0	0
ID_465 GFLLLASLR (SEQ ID NO: 27)_495.31_559.40_times_VEIFYR (SEQ ID NO: 29)_413.73_598.30	0.806	AVGLAGTFR (SEQ ID NO: 26)_446.26_721.40	0.392	0.903	0.702	0.509	1.955	0.900	0.222	0.370	0	0
ID_465 LGGPEAGLGEYLFER (SEQ ID NO: 28)_804.40_1083.60_times_VEIFYR (SEQ ID NO: 29)_413.73_598.30	0.743	AVGLAGTFR (SEQ ID NO: 26)_446.26_721.40	0.387	0.902	0.660	0.496	1.947	0.900	0.283	0.392	1	0
ID_465	0.700	AVGLAGTFR (SEQ ID NO: 26)_446.26_721.40	0.404	0.903	0.596	0.482	1.974	0.900	0.246	0.381		

[0153] The panel including the interaction term from COIA1 and FRIL performed much better than the panel without interaction terms in both cross validated specificity at NPV=0.9 and cross validated partial AUC.
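Adding "all possible interaction pairs" to the five-protein panel 465 amounts to enumerating the ten unordered protein pairs and appending one product feature per candidate model. The sketch below illustrates that bookkeeping; the helper names and the sample intensities are hypothetical, and model fitting and cross-validation are not shown.

```python
from itertools import combinations

PANEL_465 = ["ALDOA", "COIA1", "TSP1", "FRIL", "LG3BP"]


def interaction_pairs(proteins):
    """List every unordered pair of proteins in the panel."""
    return list(combinations(proteins, 2))


def add_interaction(sample, pair):
    """Return a copy of one sample's features with the pair's product appended."""
    first, second = pair
    extended = dict(sample)
    extended[f"{first}_x_{second}"] = sample[first] * sample[second]
    return extended


pairs = interaction_pairs(PANEL_465)
print(len(pairs))  # number of candidate interaction terms for a 5-protein panel

# Hypothetical log-transformed, normalized intensities for one sample.
sample = {"ALDOA": 0.2, "COIA1": -0.5, "TSP1": 0.4, "FRIL": 0.8, "LG3BP": -0.3}
with_term = add_interaction(sample, ("COIA1", "FRIL"))
print(with_term["COIA1_x_FRIL"])
```

Each of the ten extended feature sets would then be fitted and cross-validated separately, which is how one interaction term can be compared against the others and against the panel without any interaction.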

TABLE-US-00005 TABLE 4 C4 Classifier
Protein	Compound Name	SEQ ID NO:	Precursor Ion	Product Ion	Coefficient
ALDOA_HUMAN	ALQASALK	25	401.25	617.4	-0.47459794 (Beta)
COIA1_HUMAN	AVGLAGTFR	26	446.26	721.4	-2.468073083 (Beta)
TSP1_HUMAN	GFLLLASLR	27	495.31	559.4	0.33223188 (Beta)
FRIL_HUMAN	LGGPEAGLGEYLFER	28	804.4	1083.6	-0.864887827
LG3BP_HUMAN	VEIFYR	29	413.73	598.3	-0.903170248
COIA1 x FRIL	Interaction				-1.227671396
ALPHA	Constant				-1.621210001

TABLE-US-00006 TABLE 5 Performance of C4 Classifier
Threshold	NPV	Specificity
0.48	0.85	0.55
0.37	0.90	0.28
0.27	0.95*	0.13
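One way to read Table 5 is as a lookup from a classifier score to the strictest rule-out claim it supports. The sketch below assumes, following paragraph [0130], that a score strictly lower than a threshold qualifies for rule-out at that threshold's NPV. The function name is hypothetical, and the asterisk on the 0.95 entry is dropped because its footnote does not appear in this section.

```python
# (threshold, NPV, specificity) rows as listed in Table 5.
C4_OPERATING_POINTS = [
    (0.48, 0.85, 0.55),
    (0.37, 0.90, 0.28),
    (0.27, 0.95, 0.13),
]


def best_rule_out_npv(score, operating_points=C4_OPERATING_POINTS):
    """Return the highest NPV whose threshold the score falls below, else None.

    A score at or above every threshold supports no rule-out call.
    """
    met = [npv for threshold, npv, _specificity in operating_points
           if score < threshold]
    return max(met) if met else None


for example_score in (0.10, 0.30, 0.45, 0.60):  # hypothetical scores
    print(example_score, best_rule_out_npv(example_score))
```

Lower thresholds give a higher NPV but a lower specificity, so fewer benign nodules fall under them.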

TABLE-US-00007 TABLE 6 Nucleotide sequences of proteins in high performing panels. Seq. Gene Name Nucleotide Sequence ID. ALDOA_HUMAN ATGCCCTACCAATATCCAGCACTGACCCCG 1 GAGCAGAAGAAGGAGCTGTCTGACATCGCT CACCGCATCGTGGCACCTGGCAAGGGCATC CTGGCTGCAGATGAGTCCACTGGGAGCATT GCCAAGCGGCTGCAGTCCATTGGCACCGAG AACACCGAGGAGAACCGGCGCTTCTACCGC CAGCTGCTGCTGACAGCTGACGACCGCGTG AACCCCTGCATTGGGGGTGTCATCCTCTTC CATGAGACACTCTACCAGAAGGCGGATGAT GGGCGTCCCTTCCCCCAAGTTATCAAATCC AAGGGCGGTGTTGTGGGCATCAAGGTAGAC AAGGGCGTGGTCCCCCTGGCAGGGACAAAT GGCGAGACTACCACCCAAGGGTTGGATGGG CTGTCTGAGCGCTGTGCCCAGTACAAGAAG GACGGAGCTGACTTCGCCAAGTGGCGTTGT GTGCTGAAGATTGGGGAACACACCCCCTCA GCCCTCGCCATCATGGAAAATGCCAATGTT CTGGCCCGTTATGCCAGTATCTGCCAGCAG AATGGCATTGTGCCCATCGTGGAGCCTGAG ATCCTCCCTGATGGGGACCATGACTTGAAG CGCTGCCAGTATGTGACCGAGAAGGTGCTG GCTGCTGTCTACAAGGCTCTGAGTGACCAC CACATCTACCTGGAAGGCACCTTGCTGAAG CCCAACATGGTCACCCCAGGCCATGCTTGC ACTCAGAAGTTTTCTCATGAGGAGATTGCC ATGGCGACCGTCACAGCGCTGCGCCGCACA GTGCCCCCCGCTGTCACTGGGATCACCTTC CTGTCTGGAGGCCAGAGTGAGGAGGAGGCG TCCATCAACCTCAATGCCATTAACAAGTGC CCCCTGCTGAAGCCCTGGGCCCTGACCTTC TCCTACGGCCGAGCCCTGCAGGCCTCTGCC CTGAAGGCCTGGGGCGGGAAGAAGGAGAAC CTGAAGGCTGCGCAGGAGGAGTATGTCAAG CGAGCCCTGGCCAACAGCCTTGCCTGTCAA GGAAAGTACACTCCGAGCGGTCAGGCTGGG GCTGCTGCCAGCGAGTCCCTCTTCGTCTCT AACCACGCCTATTAA ALDOA_HUMAN ATGGCAAGGCGCAAGCCAGAAGGGTCCAGC 2 (isoform 2) TTCAACATGACCCACCTGTCCATGGCTATG GCCTTTTCCTTTCCCCCAGTTGCCAGTGGG CAACTCCACCCTCAGCTGGGCAACACCCAG CACCAGACAGAGTTAGGAAAGGAACTTGCT ACTACCAGCACCATGCCCTACCAATATCCA GCACTGACCCCGGAGCAGAAGAAGGAGCTG TCTGACATCGCTCACCGCATCGTGGCACCT GGCAAGGGCATCCTGGCTGCAGATGAGTCC ACTGGGAGCATTGCCAAGCGGCTGCAGTCC ATTGGCACCGAGAACACCGAGGAGAACCGG CGCTTCTACCGCCAGCTGCTGCTGACAGCT GACGACCGCGTGAACCCCTGCATTGGGGGT GTCATCCTCTTCCATGAGACACTCTACCAG AAGGCGGATGATGGGCGTCCCTTCCCCCAA GTTATCAAATCCAAGGGCGGTGTTGTGGGC ATCAAGGTAGACAAGGGCGTGGTCCCCCTG GCAGGGACAAATGGCGAGACTACCACCCAA GGGTTGGATGGGCTGTCTGAGCGCTGTGCC CAGTACAAGAAGGACGGAGCTGACTTCGCC AAGTGGCGTTGTGTGCTGAAGATTGGGGAA CACACCCCCTCAGCCCTCGCCATCATGGAA 
AATGCCAATGTTCTGGCCCGTTATGCCAGT ATCTGCCAGCAGAATGGCATTGTGCCCATC GTGGAGCCTGAGATCCTCCCTGATGGGGAC CATGACTTGAAGCGCTGCCAGTATGTGACC GAGAAGGTGCTGGCTGCTGTCTACAAGGCT CTGAGTGACCACCACATCTACCTGGAAGGC ACCTTGCTGAAGCCCAACATGGTCACCCCA GGCCATGCTTGCACTCAGAAGTTTTCTCAT GAGGAGATTGCCATGGCGACCGTCACAGCG CTGCGCCGCACAGTGCCCCCCGCTGTCACT GGGATCACCTTCCTGTCTGGAGGCCAGAGT GAGGAGGAGGCGTCCATCAACCTCAATGCC ATTAACAAGTGCCCCCTGCTGAAGCCCTGG GCCCTGACCTTCTCCTACGGCCGAGCCCTG CAGGCCTCTGCCCTGAAGGCCTGGGGCGGG AAGAAGGAGAACCTGAAGGCTGCGCAGGAG GAGTATGTCAAGCGAGCCCTGGCCAACAGC CTTGCCTGTCAAGGAAAGTACACTCCGAGC GGTCAGGCTGGGGCTGCTGCCAGCGAGTCC CTCTTCGTCTCTAACCACGCCTATTAA FRIL_HUMAN ATGAGCTCCCAGATTCGTCAGAATTATTCC 3 ACCGACGTGGAGGCAGCCGTCAACAGCCTG GTCAATTTGTACCTGCAGGCCTCCTACACC TACCTCTCTCTGGGCTTCTATTTCGACCGC GATGATGTGGCTCTGGAAGGCGTGAGCCAC TTCTTCCGCGAATTGGCCGAGGAGAAGCGC GAGGGCTACGAGCGTCTCCTGAAGATGCAA AACCAGCGTGGCGGCCGCGCTCTCTTCCAG GACATCAAGAAGCCAGCTGAAGATGAGTGG GGTAAAACCCCAGACGCCATGAAAGCTGCC ATGGCCCTGGAGAAAAAGCTGAACCAGGCC CTTTTGGATCTTCATGCCCTGGGTTCTGCC CGCACGGACCCCCATCTCTGTGACTTCCTG GAGACTCACTTCCTAGATGAGGAAGTGAAG CTTATCAAGAAGATGGGTGACCACCTGACC AACCTCCACAGGCTGGGTGGCCCGGAGGCT GGGCTGGGCGAGTATCTCTTCGAAAGGCTC ACTCTCAAGCACGACTAA LG3BP_HUMAN ATGACCCCTCCGAGGCTCTTCTGGGTGTGG 4 CTGCTGGTTGCAGGAACCCAAGGCGTGAAC GATGGTGACATGCGGCTGGCCGATGGGGGC GCCACCAACCAGGGCCGCGTGGAGATCTTC TACAGAGGCCAGTGGGGCACTGTGTGTGAC AACCTGTGGGACCTGACTGATGCCAGCGTC GTCTGCCGGGCCCTGGGCTTCGAGAACGCC ACCCAGGCTCTGGGCAGAGCTGCCTTCGGG CAAGGATCAGGCGCGATGATGCTGGATGAG GTCCAGTGCACGGGAACCGAGGCCTCACTG GCCGAGTGCAAGTCCCTGGGCTGGCTGAAG AGCAACTGCAGGCACGAGAGAGACGCTGGT GTGGTCTGCACCAATGAAACCAGGAGCACC CACACCCTGGACCTCTCCAGGGAGCTCTCG GAGGCCCTTGGCCAGATCTTTGACAGCCAG CGGGGCTGCGACCTGTCCATCAGCGTGAAT GTGCAGGGCGAGGACGCCCTGGGCTTCTGT GGCCACACGGTCATCCTGACTGCCAACCTG GAGGCCCAGGCGCTGTGGAAGGAGCCGGGC AGCAATGTCACCATGAGTGTGGATGCTGAG TGTGTGCCGATGGTCAGGGACCTTCTCAGG TACTTCTACTCCCGAAGGATTGACATCACC CTGTCGTCAGTCAAGTGCTTCCACAAGCTG GCCTCTGCCTATGGGGCCAGGCAGCTGCAG GGCTACTGCGCAAGCCTCTTTGCCATCCTC CTCCCCCAGGACCCCTCGTTCCAGATGCCC 
CTGGACCTGTATGCCTATGCAGTGGCCACA GGGGACGCCCTGCTGGAGAAGCTCTGCCTA CAGTTCCTGGCCTGGAACTTCGAGGCCTTG ACGCAGGCCGAGGCCTGGCCCAGTGTCCCC ACAGACCTGCTCCAACTGCTGCTGCCCAGG AGCGACCTGGCGGTGCCCAGCGAGCTGGCC CTACTGAAGGCCGTGGACACCTGGAGCTGG GGGGAGCGTGCCTCCCATGAGGAGGTGGAG GGCTTGGTGGAGAAGATCGGCTTCCCCATG ATGCTCCCTGAGGAGCTCTTTGAGCTGCAG TTGAACCTGTCCGTGTACTGGAGCCACGAG GCCCTGTTCCAGAAGAAGACTCTGCAGGCC CTGGAATTCCACACTGTGCCCTTCCAGTTG CTGGCCCGGTACAAAGGCCTGAACCTCACC GAGGATACCTACAAGCCCCGGATTTACACC TCGCCCACCTGGAGTGCCTTTGTGACAGAC AGTTCCTGGAGTGCACGGAAGTCACAACTG GTCTATCAGTCCAGACGGGGGCCTTTGGTC AAATATTCTTCTGATTACTTCCAAGCCCCC TCTGACTACAGATACTACCCCTACCAGTCC TTCCAGACTCCACAACACCCCAGCTTCCTC TTCCAGGACAAGAGGGTGTCCTGGTCCCTG GTCTACCTCCCCACCATCCAGAGCTGCTGG AACTACGGCTTCTCCTGCTCCTCGGACGAG CTCCCTGTCCTGGGCCTCACCAAGTCTGGC GGCTCAGATCGCACCATTGCCTACGAAAAC AAAGCCCTGATGCTCTGCGAAGGGCTCTTC GTGGCAGACGTCACCGATTTCGAGGGCTGG AAGGCTGCGATTCCCAGTGCCCTGGACACC AACAGCTCGAAGAGCACCTCCTCCTTCCCC TGCCCGGCAGGGCACTTCAACGGCTTCCGC ACGGTCATCCGCCCCTTCTACCTGACCAAC TCCTCAGGTGTGGACTAG TSP1_HUMAN ATGGGGCTGGCCTGGGGACTAGGCGTCCTG 5 TTCCTGATGCATGTGTGTGGCACCAACCGC ATTCCAGAGTCTGGCGGAGACAACAGCGTG TTTGACATCTTTGAACTCACCGGGGCCGCC CGCAAGGGGTCTGGGCGCCGACTGGTGAAG GGCCCCGACCCTTCCAGCCCAGCTTTCCGC ATCGAGGATGCCAACCTGATCCCCCCTGTG CCTGATGACAAGTTCCAAGACCTGGTGGAT GCTGTGCGGGCAGAAAAGGGTTTCCTCCTT CTGGCATCCCTGAGGCAGATGAAGAAGACC CGGGGCACGCTGCTGGCCCTGGAGCGGAAA GACCACTCTGGCCAGGTCTTCAGCGTGGTG TCCAATGGCAAGGCGGGCACCCTGGACCTC AGCCTGACCGTCCAAGGAAAGCAGCACGTG GTGTCTGTGGAAGAAGCTCTCCTGGCAACC GGCCAGTGGAAGAGCATCACCCTGTTTGTG CAGGAAGACAGGGCCCAGCTGTACATCGAC TGTGAAAAGATGGAGAATGCTGAGTTGGAC GTCCCCATCCAAAGCGTCTTCACCAGAGAC CTGGCCAGCATCGCCAGACTCCGCATCGCA AAGGGGGGCGTCAATGACAATTTCCAGGGG GTGCTGCAGAATGTGAGGTTTGTCTTTGGA ACCACACCAGAAGACATCCTCAGGAACAAA GGCTGCTCCAGCTCTACCAGTGTCCTCCTC ACCCTTGACAACAACGTGGTGAATGGTTCC AGCCCTGCCATCCGCACTAACTACATTGGC CACAAGACAAAGGACTTGCAAGCCATCTGC GGCATCTCCTGTGATGAGCTGTCCAGCATG GTCCTGGAACTCAGGGGCCTGCGCACCATT GTGACCACGCTGCAGGACAGCATCCGCAAA GTGACTGAAGAGAACAAAGAGTTGGCCAAT 
GAGCTGAGGCGGCCTCCCCTATGCTATCAC AACGGAGTTCAGTACAGAAATAACGAGGAA TGGACTGTTGATAGCTGCACTGAGTGTCAC TGTCAGAACTCAGTTACCATCTGCAAAAAG GTGTCCTGCCCCATCATGCCCTGCTCCAAT GCCACAGTTCCTGATGGAGAATGCTGTCCT CGCTGTTGGCCCAGCGACTCTGCGGACGAT GGCTGGTCTCCATGGTCCGAGTGGACCTCC TGTTCTACGAGCTGTGGCAATGGAATTCAG CAGCGCGGCCGCTCCTGCGATAGCCTCAAC AACCGATGTGAGGGCTCCTCGGTCCAGACA CGGACCTGCCACATTCAGGAGTGTGACAAG AGATTTAAACAGGATGGTGGCTGGAGCCAC TGGTCCCCGTGGTCATCTTGTTCTGTGACA TGTGGTGATGGTGTGATCACAAGGATCCGG CTCTGCAACTCTCCCAGCCCCCAGATGAAC GGGAAACCCTGTGAAGGCGAAGCGCGGGAG ACCAAAGCCTGCAAGAAAGACGCCTGCCCC ATCAATGGAGGCTGGGGTCCTTGGTCACCA TGGGACATCTGTTCTGTCACCTGTGGAGGA GGGGTACAGAAACGTAGTCGTCTCTGCAAC AACCCCACACCCCAGTTTGGAGGCAAGGAC TGCGTTGGTGATGTAACAGAAAACCAGATC TGCAACAAGCAGGACTGTCCAATTGATGGA TGCCTGTCCAATCCCTGCTTTGCCGGCGTG AAGTGTACTAGCTACCCTGATGGCAGCTGG AAATGTGGTGCTTGTCCCCCTGGTTACAGT GGAAATGGCATCCAGTGCACAGATGTTGAT GAGTGCAAAGAAGTGCCTGATGCCTGCTTC AACCACAATGGAGAGCACCGGTGTGAGAAC ACGGACCCCGGCTACAACTGCCTGCCCTGC CCCCCACGCTTCACCGGCTCACAGCCCTTC GGCCAGGGTGTCGAACATGCCACGGCCAAC AAACAGGTGTGCAAGCCCCGTAACCCCTGC ACGGATGGGACCCACGACTGCAACAAGAAC GCCAAGTGCAACTACCTGGGCCACTATAGC GACCCCATGTACCGCTGCGAGTGCAAGCCT GGCTACGCTGGCAATGGCATCATCTGCGGG GAGGACACAGACCTGGATGGCTGGCCCAAT GAGAACCTGGTGTGCGTGGCCAATGCGACT TACCACTGCAAAAAGGATAATTGCCCCAAC CTTCCCAACTCAGGGCAGGAAGACTATGAC AAGGATGGAATTGGTGATGCCTGTGATGAT GACGATGACAATGATAAAATTCCAGATGAC AGGGACAACTGTCCATTCCATTACAACCCA GCTCAGTATGACTATGACAGAGATGATGTG GGAGACCGCTGTGACAACTGTCCCTACAAC CACAACCCAGATCAGGCAGACACAGACAAC AATGGGGAAGGAGACGCCTGTGCTGCAGAC ATTGATGGAGACGGTATCCTCAATGAACGG GACAACTGCCAGTACGTCTACAATGTGGAC CAGAGAGACACTGATATGGATGGGGTTGGA

GATCAGTGTGACAATTGCCCCTTGGAACAC AATCCGGATCAGCTGGACTCTGACTCAGAC CGCATTGGAGATACCTGTGACAACAATCAG GATATTGATGAAGATGGCCACCAGAACAAT CTGGACAACTGTCCCTATGTGCCCAATGCC AACCAGGCTGACCATGACAAAGATGGCAAG GGAGATGCCTGTGACCACGATGATGACAAC GATGGCATTCCTGATGACAAGGACAACTGC AGACTCGTGCCCAATCCCGACCAGAAGGAC TCTGACGGCGATGGTCGAGGTGATGCCTGC AAAGATGATTTTGACCATGACAGTGTGCCA GACATCGATGACATCTGTCCTGAGAATGTT GACATCAGTGAGACCGATTTCCGCCGATTC CAGATGATTCCTCTGGACCCCAAAGGGACA TCCCAAAATGACCCTAACTGGGTTGTACGC CATCAGGGTAAAGAACTCGTCCAGACTGTC AACTGTGATCCTGGACTCGCTGTAGGTTAT GATGAGTTTAATGCTGTGGACTTCAGTGGC ACCTTCTTCATCAACACCGAAAGGGACGAT GACTATGCTGGATTTGTCTTTGGCTACCAG TCCAGCAGCCGCTTTTATGTTGTGATGTGG AAGCAAGTCACCCAGTCCTACTGGGACACC AACCCCACGAGGGCTCAGGGATACTCGGGC CTTTCTGTGAAAGTTGTAAACTCCACCACA GGGCCTGGCGAGCACCTGCGGAACGCCCTG TGGCACACAGGAAACACCCCTGGCCAGGTG CGCACCCTGTGGCATGACCCTCGTCACATA GGCTGGAAAGATTTCACCGCCTACAGATGG CGTCTCAGCCACAGGCCAAAGACGGGTTTC ATTAGAGTGGTGATGTATGAAGGGAAGAAA ATCATGGCTGACTCAGGACCCATCTATGAT AAAACCTATGCTGGTGGTAGACTAGGGTTG TTTGTCTTCTCTCAAGAAATGGTGTTCTTC TCTGACCTGAAATACGAATGTAGAGATCCC TAA CO1A1_HUMAN ATGTTCAGCTTTGTGGACCTCCGGCTCCTG 6 CTCCTCTTAGCGGCCACCGCCCTCCTGACG CACGGCCAAGAGGAAGGCCAAGTCGAGGGC CAAGACGAAGACATCCCACCAATCACCTGC GTACAGAACGGCCTCAGGTACCATGACCGA GACGTGTGGAAACCCGAGCCCTGCCGGATC TGCGTCTGCGACAACGGCAAGGTGTTGTGC GATGACGTGATCTGTGACGAGACCAAGAAC TGCCCCGGCGCCGAAGTCCCCGAGGGCGAG TGCTGTCCCGTCTGCCCCGACGGCTCAGAG TCACCCACCGACCAAGAAACCACCGGCGTC GAGGGACCCAAGGGAGACACTGGCCCCCGA GGCCCAAGGGGACCCGCAGGCCCCCCTGGC CGAGATGGCATCCCTGGACAGCCTGGACTT CCCGGACCCCCCGGACCCCCCGGACCTCCC GGACCCCCTGGCCTCGGAGGAAACTTTGCT CCCCAGCTGTCTTATGGCTATGATGAGAAA TCAACCGGAGGAATTTCCGTGCCTGGCCCC ATGGGTCCCTCTGGTCCTCGTGGTCTCCCT GGCCCCCCTGGTGCACCTGGTCCCCAAGGC TTCCAAGGTCCCCCTGGTGAGCCTGGCGAG CCTGGAGCTTCAGGTCCCATGGGTCCCCGA GGTCCCCCAGGTCCCCCTGGAAAGAATGGA GATGATGGGGAAGCTGGAAAACCTGGTCGT CCTGGTGAGCGTGGGCCTCCTGGGCCTCAG GGTGCTCGAGGATTGCCCGGAACAGCTGGC CTCCCTGGAATGAAGGGACACAGAGGTTTC AGTGGTTTGGATGGTGCCAAGGGAGATGCT GGTCCTGCTGGTCCTAAGGGTGAGCCTGGC 
AGCCCTGGTGAAAATGGAGCTCCTGGTCAG ATGGGCCCCCGTGGCCTGCCTGGTGAGAGA GGTCGCCCTGGAGCCCCTGGCCCTGCTGGT GCTCGTGGAAATGATGGTGCTACTGGTGCT GCCGGGCCCCCTGGTCCCACCGGCCCCGCT GGTCCTCCTGGCTTCCCTGGTGCTGTTGGT GCTAAGGGTGAAGCTGGTCCCCAAGGGCCC CGAGGCTCTGAAGGTCCCCAGGGTGTGCGT GGTGAGCCTGGCCCCCCTGGCCCTGCTGGT GCTGCTGGCCCTGCTGGAAACCCTGGTGCT GATGGACAGCCTGGTGCTAAAGGTGCCAAT GGTGCTCCTGGTATTGCTGGTGCTCCTGGC TTCCCTGGTGCCCGAGGCCCCTCTGGACCC CAGGGCCCCGGCGGCCCTCCTGGTCCCAAG GGTAACAGCGGTGAACCTGGTGCTCCTGGC AGCAAAGGAGACACTGGTGCTAAGGGAGAG CCTGGCCCTGTTGGTGTTCAAGGACCCCCT GGCCCTGCTGGAGAGGAAGGAAAGCGAGGA GCTCGAGGTGAACCCGGACCCACTGGCCTG CCCGGACCCCCTGGCGAGCGTGGTGGACCT GGTAGCCGTGGTTTCCCTGGCGCAGATGGT GTTGCTGGTCCCAAGGGTCCCGCTGGTGAA CGTGGTTCTCCTGGCCCTGCTGGCCCCAAA GGATCTCCTGGTGAAGCTGGTCGTCCCGGT GAAGCTGGTCTGCCTGGTGCCAAGGGTCTG ACTGGAAGCCCTGGCAGCCCTGGTCCTGAT GGCAAAACTGGCCCCCCTGGTCCCGCCGGT CAAGATGGTCGCCCCGGACCCCCAGGCCCA CCTGGTGCCCGTGGTCAGGCTGGTGTGATG GGATTCCCTGGACCTAAAGGTGCTGCTGGA GAGCCCGGCAAGGCTGGAGAGCGAGGTGTT CCCGGACCCCCTGGCGCTGTCGGTCCTGCT GGCAAAGATGGAGAGGCTGGAGCTCAGGGA CCCCCTGGCCCTGCTGGTCCCGCTGGCGAG AGAGGTGAACAAGGCCCTGCTGGCTCCCCC GGATTCCAGGGTCTCCCTGGTCCTGCTGGT CCTCCAGGTGAAGCAGGCAAACCTGGTGAA CAGGGTGTTCCTGGAGACCTTGGCGCCCCT GGCCCCTCTGGAGCAAGAGGCGAGAGAGGT TTCCCTGGCGAGCGTGGTGTGCAAGGTCCC CCTGGTCCTGCTGGTCCCCGAGGGGCCAAC GGTGCTCCCGGCAACGATGGTGCTAAGGGT GATGCTGGTGCCCCTGGAGCTCCCGGTAGC CAGGGCGCCCCTGGCCTTCAGGGAATGCCT GGTGAACGTGGTGCAGCTGGTCTTCCAGGG CCTAAGGGTGACAGAGGTGATGCTGGTCCC AAAGGTGCTGATGGCTCTCCTGGCAAAGAT GGCGTCCGTGGTCTGACTGGCCCCATTGGT CCTCCTGGCCCTGCTGGTGCCCCTGGTGAC AAGGGTGAAAGTGGTCCCAGCGGCCCTGCT GGTCCCACTGGAGCTCGTGGTGCCCCCGGA GACCGTGGTGAGCCTGGTCCCCCCGGCCCT GCTGGCTTTGCTGGCCCCCCTGGTGCTGAC GGCCAACCTGGTGCTAAAGGCGAACCTGGT GATGCTGGTGCTAAAGGCGATGCTGGTCCC CCTGGCCCTGCCGGACCCGCTGGACCCCCT GGCCCCATTGGTAATGTTGGTGCTCCTGGA GCCAAAGGTGCTCGCGGCAGCGCTGGTCCC CCTGGTGCTACTGGTTTCCCTGGTGCTGCT GGCCGAGTCGGTCCTCCTGGCCCCTCTGGA AATGCTGGACCCCCTGGCCCTCCTGGTCCT GCTGGCAAAGAAGGCGGCAAAGGTCCCCGT GGTGAGACTGGCCCTGCTGGACGTCCTGGT GAAGTTGGTCCCCCTGGTCCCCCTGGCCCT 
GCTGGCGAGAAAGGATCCCCTGGTGCTGAT GGTCCTGCTGGTGCTCCTGGTACTCCCGGG CCTCAAGGTATTGCTGGACAGCGTGGTGTG GTCGGCCTGCCTGGTCAGAGAGGAGAGAGA GGCTTCCCTGGTCTTCCTGGCCCCTCTGGT GAACCTGGCAAACAAGGTCCCTCTGGAGCA AGTGGTGAACGTGGTCCCCCTGGTCCCATG GGCCCCCCTGGATTGGCTGGACCCCCTGGT GAATCTGGACGTGAGGGGGCTCCTGGTGCC GAAGGTTCCCCTGGACGAGACGGTTCTCCT GGCGCCAAGGGTGACCGTGGTGAGACCGGC CCCGCTGGACCCCCTGGTGCTCCTGGTGCT CCTGGTGCCCCTGGCCCCGTTGGCCCTGCT GGCAAGAGTGGTGATCGTGGTGAGACTGGT CCTGCTGGTCCCACCGGTCCTGTCGGCCCT GTTGGCGCCCGTGGCCCCGCCGGACCCCAA GGCCCCCGTGGTGACAAGGGTGAGACAGGC GAACAGGGCGACAGAGGCATAAAGGGTCAC CGTGGCTTCTCTGGCCTCCAGGGTCCCCCT GGCCCTCCTGGCTCTCCTGGTGAACAAGGT CCCTCTGGAGCCTCTGGTCCTGCTGGTCCC CGAGGTCCCCCTGGCTCTGCTGGTGCTCCT GGCAAAGATGGACTCAACGGTCTCCCTGGC CCCATTGGGCCCCCTGGTCCTCGCGGTCGC ACTGGTGATGCTGGTCCTGTTGGTCCCCCC GGCCCTCCTGGACCTCCTGGTCCCCCTGGT CCTCCCAGCGCTGGTTTCGACTTCAGCTTC CTGCCCCAGCCACCTCAAGAGAAGGCTCAC GATGGTGGCCGCTACTACCGGGCTGATGAT GCCAATGTGGTTCGTGACCGTGACCTCGAG GTGGACACCACCCTCAAGAGCCTGAGCCAG CAGATCGAGAACATCCGGAGCCCAGAGGGC AGCCGCAAGAACCCCGCCCGCACCTGCCGT GACCTCAAGATGTGCCACTCTGACTGGAAG AGTGGAGAGTACTGGATTGACCCCAACCAA GGCTGCAACCTGGATGCCATCAAAGTCTTC TGCAACATGGAGACTGGTGAGACCTGCGTG TACCCCACTCAGCCCAGTGTGGCCCAGAAG AACTGGTACATCAGCAAGAACCCCAAGGAC AAGAGGCATGTCTGGTTCGGCGAGAGCATG ACCGATGGATTCCAGTTCGAGTATGGCGGC CAGGGCTCCGACCCTGCCGATGTGGCCATC CAGCTGACCTTCCTGCGCCTGATGTCCACC GAGGCCTCCCAGAACATCACCTACCACTGC AAGAACAGCGTGGCCTACATGGACCAGCAG ACTGGCAACCTCAAGAAGGCCCTGCTCCTC CAGGGCTCCAACGAGATCGAGATCCGCGCC GAGGGCAACAGCCGCTTCACCTACAGCGTC ACTGTCGATGGCTGCACGAGTCACACCGGA GCCTGGGGCAAGACAGTGATTGAATACAAA ACCACCAAGACCTCCCGCCTGCCCATCATC GATGTGGCCCCCTTGGACGTTGGTGCCCCA GACCAGGAATTCGGCTTCGACGTTGGCCCT GTCTGCTTCCTGTAA

TABLE-US-00008 TABLE 7 Amino acid sequences of proteins in high performing panels. Protein Seq. Name Amino Acid Sequence ID. ALDOA_HUMAN MPYQYPALTPEQKKELSDIAHRIVAPGKGI 7 LAADESTGSIAKRLQSIGTENTEENRRFYR QLLLTADDRVNPCIGGVILFHETLYOKADD GRPFPOVIKSKGGVVGIKVDKGVVPLAGTN GETTTQGLDGLSERCAQYKKDGADFAKWRC VLKIGEHTPSALAIMENANVLARYASICQQ NGIVPIVEPEILPDGDHDLKRCQYVTEKVL AAVYKALSDHHIYLEGTLLKPNMVTPGHAC TQKFSHEEIAMATVTALRRTVPPAVTGITF LSGGQSEEEASINLNAINKOPLLKPWALTF SYGRALQASALKAWGGKKENLKAAQEEYVK RALANSLACQGKYTPSGQAGAAASESLFVS NHAY ALDOA_HUMAN MARRKPEGSSFNMTHLSMAMAFSFPPVASG 8 (isoform 2) QLHPQLGNTQHQTELGKELATTSTMPYQYP ALTPEQKKELSDIAHRIVAPGKGILAADES TGSTAKRLQSIGTENTEENRRFYRQLLLTA DDRVNPCIGGVILFHETLYQKADDGRPFPQ VIKSKGGVVGINVDKGVVPLAGTNGETTTQ GLDGLSERCAQYKKDGADFAKWRCVLKIGE HTPSAIAIMENANVLARYASICQQNGIVPI VEPEILPDGDHDLKRCQYVTEKVLAAVYKA LSDHHIYLEGTLLKPNMVTPGHACTQKFSH EEIAMATVTALRRTVPPAVTGITFLSGGQS EEEASINLNAINKCPLLKPWALTFSYGRAL QASALKAWGGKKENLKAAQEEYVKRALANS LACQGKYTPSGQAGAAASESLFVSNHAY FRIL_HUMAN MSSQIRQNYSTDVEAAVNSLVNLYLQASYT 9 YLSLGFYFDRDDVALEGVSHFFRELAEEKR EGYERLLKMQNQRGGRALFQDIKKPAEDEW GKTPDAMKAAMALEKKLNQALLDLHALGSA RTDPHLCDFLETHFLDEEVKLIKKMGDHLT NLHRLGGPEAGLGEYLFERLTLKHD LG3BP_HUMAN MTPPRLFWVWLLVAGTQGVNDGDMRLADGG 10 ATNQGRVEIFYRGQWGTVCDNLWDLTDASV VCRALGFENATQALGRAAFGQGSGPIMLDE VQCTGTEASLADCKSLGWLKSNCRHERDAG VVCTNETRSTHTLDLSRELSEALGQIFDSQ RGCDLSISVNVQGEDALGFCGHTVILTANL EAQALWKEPGSNVTMSVDAECVPMVRDLLR YFYSRRIDITLSSVKCFHKLASAYGARQLQ GYCASLFAILLPQDPSFQMPLDLYAYAVAT GDALLEKLCLQFLAWNFEALTQAEAWPSVP TDLLQLLLPRSDLAVPSELALLKAVDTWSW GERASHEEVEGLVEKIRFPMMLPEELFELQ FNLSLYWSHEALFQKKTLQALEFHTVPFQL LARYKGLNLTEDTYKPRIYTSPTWSAFVTD SSWSARKSQLVYQSRRGPLVKYSSDYFQAP SDYRYYPYQSFQTPQHPSFLFQDKRVSWSL VYLPTIQSCWNYGFSCSSDELPVLGLTKSG GSDRTIAYENKALMLCEGLFVADVTDFEGW KAAIPSALDTNSSKSTSSFPCPAGHFNGFR TVIRPFYLTNSSGVD TSP1_HUMAN MGLAWGLGVLFLMHVCGTNRIPESGGDNSV 11 FDIFELTGAARKGSGRRLVKGPDPSSPAFR IEDANLIPPVPDDKFQDLVDAVRAEKGFLL LASLRQMKKTRGTLLALERKDHSGQVFSVV SNGKAGTLDLSLTVQGKQHVVSVEEALLAT GQWKSITLFVQEDRAQLYIDCEKMENAELD 
VPIQSVFTRDLASIARLRIAKGGVNDNFQG VLQNVRFVFGTTPEDILRNKGCSSSTSVLL TLDNNVVNGSSPAIRTNYIGHKTKDLQAIC GISCDELSSMVLELRGLRTIVTTLQDSIRK VTEENKELANELRRPPLCYHNGVQYRNNEE WTVDSCTECHCQNSVTICKKVSCPIMPCSN ATVPDGECCPRCWPSDSADDGWSPWSEWTS CSTSCGNGIQQRGRSCDSLNNRCEGSSVQT RTCHIQECDKRFKQDGGWSHWSPWSSCSVT CGDGVITRIRLCNSPSPQMNGKPCEGEARE TKACKKDACPINGGWGPWSPWDICSVTCGG GVQKRSRLCNNPTPQFGGKDCVGDVTENQI CNKQDCPIDGCLSNPCFAGVKCTSYPDGSW KCGACPPGYSGNGIQCTDVDECKEVPDACF NHNGEHRCENTDPGYNCLPCPPRFTGSQPF GQGVEHATANKQVCKPRNPCTDGTHDCNKN AKCNYLGHYSDPMYRCECKPGYAGNGIICG EDTDLDGWPNENLVCVANATYHCKKDNCPN LPNSGQEDYDKDGIGDACDDDDDNDKIPDD RDNCPFHYNPAQYDYDRDDVGDRCDNCPYN HNPDQADTDNNGEGDACAADIDGDGILNER DNCQYVYNVDQRDTDMDGVGDQCDNCPLEH NPDQLDSDSDRIGDTCDNNQDIDEDGHQNN LDNCPYVPNANQADHDKDGKGDACDHDDDN DGIPDDKDNCRLVPNPDQKDSDGDGRGDAC KDDFDHDSVPDIDDICPENVDISETDFRRF QMIPLDPKGTSQNDPNWVVRHQGKELVQTV NCDPGLAVGYDEFNAVDFSGTFFINTERDD DYAGFVFGYQSSSRFYVVMWKQVTQSYWDT NPTRAQGYSGLSVKVVNSTTGPGEHLRNAL WHTGNTPGQVRTLWHDPRHIGWKDFTAYRW RLSHRPKTGFIRVVMYEGKKIMADSGPIYD KTYAGGRLGLFVFSQEMVFFSDLKYECRDP CO1A1_HUMAN MFSFVDLRLLLLLAATALLTHGQEEGQVEG 12 QDEDIPPITCVQNGLRYHDRDVWKPEPCRI CVCDNGKVLCDDVICDETKNCPGAEVPEGE CCPVCPDGSESPTDQETTGVEGPKGDTGPR GPRGPAGPPGRDGIPGQPGLPGPPGPPGPP GPPGLGGNFAPQLSYGYDEKSTGGISVPGP MGPSGPRGLPGPPGAPGPQGFQGPPGEPGE PGASGPMGPRGPPGPPGKNGDDGEAGKPGR PGERGPPGPQGARGLPGTAGLPGMKGHRGF SGLDGAKGDAGPAGPKGEPGSPGENGAPGQ MGPRGLPGERGRPGAPGPAGARGNDGATGA AGPPGPTGPAGPPGFPGAVGAKGEAGPQGP RGSEGPQGVRGEPGPPGPAGAAGPAGNPGA DGQPGAKGANGAPGIAGAPGFPGARGPSGP QGPGGPPGPKGNSGEPGAPGSKGDTGAKGE PGPVGVQGPPGPAGEEGKRGARGEPGPTGL PGPPGERGGPGSRGFPGADGVAGPKGPAGE RGSPGPAGPKGSPGEAGRPGEAGLPGAKGL TGSPGSPGPDGKTGPPGPAGQDGRPGPPGP PGARGQAGVMGFPGPKGAAGEPGKAGERGV PGPPGAVGPAGKDGEAGAQGPPGPAGPAGE RGEQGPAGSPGFQGLPGPAGPPGEAGKPGE QGVPGDLGAPGPSGARGERGFPGERGVQGP PGPAGPRGANGAPGNDGAKGDAGAPGAPGS QGAPGLQGMPGERGAAGLPGPKGDRGDAGP KGADGSPGKDGVRGLTGPIGPPGPAGAPGD KGESGPSGPAGPTGARGAPGDRGEPGPPGP AGFAGPPGADGQPGAKGEPGDAGAKGDAGP PGPAGPAGPPGPIGNVGAPGAKGARGSAGP PGATGFPGAAGRVGPPGPSGNAGPPGPPGP AGKEGGKGPRGETGPAGRPGEVGPPGPPGP 
AGEKGSPGADGPAGAPGTPGPQGIAGQRGV VGLPGQRGERGFPGLPGPSGEPGKQGPSGA SGERGPPGPMGPPGLAGPPGESGREGAPGA EGSPGRDGSPGAKGDRGETGPAGPPGAPGA PGAPGPVGPAGKSGDRGETGPAGPTGPVGP VGARGPAGPQGPRGDKGETGEQGDRGIKGH RGFSGLQGPPGPPGSPGEQGPSGASGPAGP RGPPGSAGAPGKDGLNGLPGPIGPPGPRGR TGDAGPVGPPGPPGPPGPPGPPSAGFDFSF LPQPPQEKAHDGGRYYRADDANVVRDRDLE VDTTLKSLSQQIENIRSPEGSRKNPARTCR DLKMCHSDWKSGEYWIDPNQGCNLDAIKVF CNMETGETCVYPTQPSVAQKNWYISKNPKD KRHVWFGESMTDGFQFEYGGQGSDPADVAI QLTFLRLMSTEASQNITYHCKNSVAYMDQQ TGNLKKALLLQGSNEIEIRAEGNSRFTYSV TVDGCTSHTGAWGKTVIEYKTTKTSRLPII DVAPLDVGAPDQEFGEDVGPVCFL

TABLE-US-00009 TABLE 8 Nucleotide sequences of normalizer proteins in panel. Seq. Gene Name Nucleotide Sequence ID. PEDF_HUMAN ATGCAGGCCCTGGTGCTACTCCTCTGCATT 13 GGAGCCCTCCTCGGGCACAGCAGCTGCCAG AACCCTGCCAGCCCCCCGGAGGAGGGCTCC CCAGACCCCGACAGCACAGGGGCGCTGGTG GAGGAGGAGGATCCTTTCTTCAAAGTCCCC GTGAACAAGCTGGCAGCGGCTGTCTCCAAC TTCGGCTATGACCTGTACCGGGTGCGATCC AGCACGAGCCCCACGACCAACGTGCTCCTG TCTCCTCTCAGTGTGGCCACGGCCCTCTCG GCCCTCTCGCTGGGAGCGGAGCAGCGAACA GAATCCATCATTCACCGGGCTCTCTACTAT GACTTGATCAGCAGCCCAGACATCCATGGT ACCTATAAGGAGCTCCTTGACACGGTCACT GCCCCCCAGAAGAACCTCAAGAGTGCCTCC CGGATCGTCTTTGAGAAGAAGCTGCGCATA AAATCCAGCTTTGTGGCACCTCTGGAAAAG TCATATGGGACCAGGCCCAGAGTCCTGACG GGCAACCCTCGCTTGGACCTGCAAGAGATC AACAACTGGGTGCAGGCGCAGATGAAAGGG AAGCTCGCCAGGTCCACAAAGGAAATTCCC GATGAGATCAGCATTCTCCTTCTCGGTGTG GCGCACTTCAAGGGGCAGTGGGTAACAAAG TTTGACTCCAGAAAGACTTCCCTCGAGGAT TTCTACTTGGATGAAGAGAGGACCGTGAGG GTCCCCATGATGTCGGACCCTAAGGCTGTT TTACGCTATGGCTTGGATTCAGATCTCAGC TGCAAGATTGCCCAGCTGCCCTTGACCGGA AGCATGAGTATCATCTTCTTCCTGCCCCTG AAAGTGACCCAGAATTTGACCTTGATAGAG GAGAGCCTCACCTCCGAGTTCATTCATGAC ATAGACCGAGAACTGAAGACCGTGCAGGCG GTCCTCACTGTCCCCAAGCTGAAGCTGAGT TATGAAGGCGAAGTCACCAAGTCCCTGCAG GAGATGAAGCTGCAATCCTTGTTTGATTCA CCAGACTTTAGCAAGATCACAGGCAAACCC ATCAAGCTGACTCAGGTGGAACACCGGGCT GGCTTTGAGTGGAACGAGGATGGGGCGGGA ACCACCCCCAGCCCAGGGCTGCAGCCTGCC CACCTCACCTTCCCGCTGGACTATCACCTT AACCAGCCTTTCATCTTCGTACTGAGGGAC ACAGACACAGGGGCCCTTCTCTTCATTGGC AAGATTCTGGACCCCAGGGGCCCCTAA MASP1_HUMAN ATGAGGTGGCTGCTTCTCTATTATGCTCTG 14 TGCTTCTCCCTGTCAAAGGCTTCAGCCCAC ACCGTGGAGCTAAACAATATGTTTGGCCAG ATCCAGTCGCCTGGTTATCCAGACTCCTAT CCCAGTGATTCAGAGGTGACTTGGAATATC ACTGTCCCAGATGGGTTTCGGATCAAGCTT TACTTCATGCACTTCAACTTGGAATCCTCC TACCTTTGTGAATATGACTATGTGAAGGTA GAAACTGAGGACCAGGTGCTGGCAACCTTC TGTGGCAGGGAGACCACAGACACAGAGCAG ACTCCCGGCCAGGAGGTGGTCCTCTCCCCT GGCTCCTTCATGTCCATCACTTTCCGGTCA GATTTCTCCAATGAGGAGCGTTTCACAGGC TTTGATGCCCACTACATGGCTGTGGATGTG GACGAGTGCAAGGAGAGGGAGGACGAGGAG CTGTCCTGTGACCACTACTGCCACAACTAC ATTGGCGGCTACTACTGCTCCTGCCGCTTC 
GGCTACATCCTCCACACAGACAACAGGACC TGCCGAGTGGAGTGCAGTGACAACCTCTTC ACTCAAAGGACTGGGGTGATCACCAGCCCT GACTTCCCAAACCCTTACCCCAAGAGCTCT GAATGCCTGTATACCATCGAGCTGGAGGAG GGTTTCATGGTCAACCTGCAGTTTGAGGAC ATATTTGACATTGAGGACCATCCTGAGGTG CCCTGCCCCTATGACTACATCAAGATCAAA GTTGGTCCAAAAGTTTTGGGGCCTTTCTGT GGAGAGAAAGCCCCAGAACCCATCAGCACC CAGAGCCACAGTGTCCTGATCCTGTTCCAT AGTGACAACTCGGGAGAGAACCGGGGCTGG AGGCTCTCATACAGGGCTGCAGGAAATGAG TGCCCAGAGCTACAGCCTCCTGTCCATGGG AAAATCGAGCCCTCCCAAGCCAAGTATTTC TTCAAAGACCAAGTGCTCGTCAGCTGTGAC ACAGGCTACAAAGTGCTGAAGGATAATGTG GAGATGGACACATTCCAGATTGAGTGTCTG AAGGATGGGACGTGGAGTAACAAGATTCCC ACCTGTAAAATTGTAGACTGTAGAGCCCCA GGAGAGCTGGAACACGGGCTGATCACCTTC TCTACAAGGAACAACCTCACCACATACAAG TCTGAGATCAAATACTCCTGTCAGGAGCCC TATTACAAGATGCTCAACAATAACACAGGT ATATATACCTGTTCTGCCCAAGGAGTCTGG ATGAATAAAGTATTGGGGAGAAGCCTACCC ACCTGCCTTCCAGTGTGTGGGCTCCCCAAG TTCTCCCGGAAGCTGATGGCCAGGATCTTC AATGGACGCCCAGCCCAGAAAGGCACCACT CCCTGGATTGCCATGCTGTCACACCTGAAT GGGCAGCCCTTCTGCGGAGGCTCCCTTCTA GGCTCCAGCTGGATCGTGACCGCCGCACAC TGCCTCCACCAGTCACTCGATCCGGAAGAT CCGACCCTACGTGATTCAGACTTGCTCAGC CCTTCTGACTTCAAAATCATCCTGGGCAAG CATTGGAGGCTCCGGTCAGATGAAAATGAA CAGCATCTCGGCGTCAAACACACCACTCTC CACCCCCAGTATGATCCCAACACATTCGAG AATGACGTGGCTCTGGTGGAGCTGTTGGAG AGCCCAGTGCTGAATGCCTTCGTGATGCCC ATCTGTCTGCCTGAGGGACCCCAGCAGGAA GGAGCCATGGTCATCGTCAGCGGCTGGGGG AAGCAGTTCTTGCAAAGGTTCCCAGAGACC CTGATGGAGATTGAAATCCCGATTGTTGAC CACAGCACCTGCCAGAAGGCTTATGCCCCG CTGAAGAAGAAAGTGACCAGGGACATGATC TGTGCTGGGGAGAAGGAAGGGGGAAAGGAC GCCTGTGCGGGTGACTCTGGAGGCCCCATG GTGACCCTGAATAGAGAAAGAGGCCAGTGG TACCTGGTGGGCACTGTGTCCTGGGGTGAT GACTGTGGGAAGAAGGACCGCTACGGAGTA TACTCTTACATCCACCACAACAAGGACTGG ATCCAGAGGGTCACCGGAGTGAGGAACTGA GELS_HUMAN ATGGCTCCGCACCGCCCCGCGCCCGCGCTG 15 CTTTGCGCGCTGTCCCTGGCGCTGTGCGCG CTGTCGCTGCCCGTCCGCGCGGCCACTGCG TCGCGGGGGGCGTCCCAGGCGGGGGCGCCC CAGGGGCGGGTGCCCGAGGCGCGGCCCAAC AGCATGGTGGTGGAACACCCCGAGTTCCTC AAGGCAGGGAAGGAGCCTGGCCTGCAGATC TGGCGTGTGGAGAAGTTCGATCTGGTGCCC GTGCCCACCAACCTTTATGGAGACTTCTTC ACGGGCGACGCCTACGTCATCCTGAAGACA GTGCAGCTGAGGAACGGAAATCTGCAGTAT 
GACCTCCACTACTGGCTGGGCAATGAGTGC AGCCAGGATGAGAGCGGGGCGGCCGCCATC TTTACCGTGCAGCTGGATGACTACCTGAAC GGCCGGGCCGTGCAGCACCGTGAGGTCCAG GGCTTCGAGTCGGCCACCTTCCTAGGCTAC TTCAAGTCTGGCCTGAAGTACAAGAAAGGA GGTGTGGCATCAGGATTCAAGCACGTGGTA CCCAACGAGGTGGTGGTGCAGAGACTCTTC CAGGTCAAAGGGCGGCGTGTGGTCCGTGCC ACCGAGGTACCTGTGTCCTGGGAGAGCTTC AACAATGGCGACTGCTTCATCCTGGACCTG GGCAACAACATCCACCAGTGGTGTGGTTCC AACAGCAATCGGTATGAAAGACTGAAGGCC ACACAGGTGTCCAAGGGCATCCGGGACAAC GAGCGGAGTGGCCGGGCCCGAGTGCACGTG TCTGAGGAGGGCACTGAGCCCGAGGCGATG CTCCAGGTGCTGGGCCCCAAGCCGGCTCTG CCTGCAGGTACCGAGGACACCGCCAAGGAG GATGCGGCCAACCGCAAGCTGGCCAAGCTC TACAAGGTCTCCAATGGTGCAGGGACCATG TCCGTCTCCCTCGTGGCTGATGAGAACCCC TTCGCCCAGGGGGCCCTGAAGTCAGAGGAC TGCTTCATCCTGGACCACGGCAAAGATGGG AAAATCTTTGTCTGGAAAGGCAAGCAGGCA AACACGGAGGAGAGGAAGGCTGCCCTCAAA ACAGCCTCTGACTTCATCACCAAGATGGAC TACCCCAAGCAGACTCAGGTCTCGGTCCTT CCTGAGGGCGGTGAGACCCCACTGTTCAAG CAGTTCTTCAAGAACTGGCGGGACCCAGAC CAGACAGATGGCCTGGGCTTGTCCTACCTT TCCAGCCATATCGCCAACGTGGAGCGGGTG CCCTTCGACGCCGCCACCCTGCACACCTCC ACTGCCATGGCCGCCCAGCACGGCATGGAT GACGATGGCACAGGCCAGAAACAGATCTGG AGAATCGAAGGTTCCAACAAGGTGCCCGTG GACCCTGCCACATATGGACAGTTCTATGGA GGCGACAGCTACATCATTCTGTACAACTAC CGCCATGGTGGCCGCCAGGGGCAGATAATC TATAACTGGCAGGGTGCCCAGTCTACCCAG GATGAGGTCGCTGCATCTGCCATCCTGACT GCTCAGCTGGATGAGGAGCTGGGAGGTACC CCTGTCCAGAGCCGTGTGGTCCAAGGCAAG GAGCCCGCCCACCTCATGAGCCTGTTTGGT GGGAAGCCCATGATCATCTACAAGGGCGGC ACCTCCCGCGAGGGCGGGCAGACAGCCCCT GCCAGCACCCGCCTCTTCCAGGTCCGCGCC AACAGCGCTGGAGCCACCCGGGCTGTTGAG GTATTGCCTAAGGCTGGTGCACTGAACTCC AACGATGCCTTTGTTCTGAAAACCCCCTCA GCCGCCTACCTGTGGGTGGGTACAGGAGCC AGCGAGGCAGAGAAGACGGGGGCCCAGGAG CTGCTCAGGGTGCTGCGGGCCCAACCTGTG CAGGTGGCAGAAGGCAGCGAGCCAGATGGC TTCTGGGAGGCCCTGGGCGGGAAGGCTGCC TACCGCACATCCCCACGGCTGAAGGACAAG AAGATGGATGCCCATCCTCCTCGCCTCTTT GCCTGCTCCAACAAGATTGGACGTTTTGTG ATCGAAGAGGTTCCTGGTGAGCTCATGCAG GAAGACCTGGCAACGGATGACGTCATGCTT CTGGACACCTGGGACCAGGTCTTTGTCTGG GTTGGAAAGGATTCTCAAGAAGAAGAAAAG ACAGAAGCCTTGACTTCTGCTAAGCGGTAC ATCGAGACGGACCCAGCCAATCGGGATCGG CGGACGCCCATCACCGTGGTGAAGCAAGGC 
TTTGAGCCTCCCTCCTTTGTGGGCTGGTTC CTTGGCTGGGATGATGATTACTGGTCTGTG GACCCCTTGGACAGGGCCATGGCTGAGCTG GCTGCCTGA LUM_HUMAN ATGAGTCTAAGTGCATTTACTCTCTTCCTG 16 GCATTGATTGGTGGTACCAGTGGCCAGTAC TATGATTATGATTTTCCCCTATCAATTTAT GGGCAATCATCACCAAACTGTGCACCAGAA TGTAACTGCCCTGAAAGCTACCCAAGTGCC ATGTACTGTGATGAGCTGAAATTGAAAAGT GTACCAATGGTGCCTCCTGGAATCAAGTAT CTTTACCTTAGGAATAACCAGATTGACCAT ATTGATGAAAAGGCCTTTGAGAATGTAACT GATCTGCAGTGGCTCATTCTAGATCACAAC CTTCTAGAAAACTCCAAGATAAAAGGGAGA GTTTTCTCTAAATTGAAACAACTGAAGAAG CTGCATATAAACCACAACAACCTGACAGAG TCTGTGGGCCCACTTCCCAAATCTCTGGAG GATCTGCAGCTTACTCATAACAAGATCACA AAGCTGGGCTCTTTTGAAGGATTGGTAAAC CTGACCTTCATCCATCTCCAGCACAATCGG CTGAAAGAGGATGCTGTTTCAGCTGCTTTT AAAGGTCTTAAATCACTCGAATACCTTGAC TTGAGCTTCAATCAGATAGCCAGACTGCCT TCTGGTCTCCCTGTCTCTCTTCTAACTCTC TACTTAGACAACAATAAGATCAGCAACATC CCTGATGAGTATTTCAAGCGTTTTAATGCA TTGCAGTATCTGCGTTTATCTCACAACGAA CTGGCTGATAGTGGAATACCTGGAAATTCT TTCAATGTGTCATCCCTGGTTGAGCTGGAT CTGTCCTATAACAAGCTTAAAAACATACCA ACTGTCAATGAAAACCTTGAAAACTATTAC CTGGAGGTCAATCAACTTGAGAAGTTTGAC ATAAAGAGCTTCTGCAAGATCCTGGGGCCA TTATCCTACTCCAAGATCAAGCATTTGCGT TTGGATGGCAATCGCATCTCAGAAACCAGT CTTCCACCGGATATGTATGAATGTCTACGT GTTGCTAACGAAGTCACTCTTAATTAA C163A_HUMAN ATGAGCAAACTCAGAATGGTGCTACTTGAA 17 GACTCTGGATCTGCTGACTTCAGAAGACAT TTTGTCAACTTGAGTCCCTTCACCATTACT GTGGTCTTACTTCTCAGTGCCTGTTTTGTC ACCAGTTCTCTTGGAGGAACAGACAAGGAG CTGAGGCTAGTGGATGGTGAAAACAAGTGT AGCGGGAGAGTGGAAGTGAAAGTCCAGGAG GAGTGGGGAACGGTGTGTAATAATGGCTGG AGCATGGAAGCGGTCTCTGTGATTTGTAAC CAGCTGGGATGTCCAACTGCTATCAAAGCC CCTGGATGGGCTAATTCCAGTGCAGGTTCT GGACGCATTTGGATGGATCATGTTTCTTGT CGTGGGAATGAGTCAGCTCTTTGGGATTGC AAACATGATGGATGGGGAAAGCATAGTAAC

TGTACTCACCAACAAGATGCTGGAGTGACC TGCTCAGATGGATCCAATTTGGAAATGAGG CTGACGCGTGGAGGGAATATGTGTTCTGGA AGAATAGAGATCAAATTCCAAGGACGGTGG GGAACAGTGTGTGATGATAACTTCAACATA GATCATGCATCTGTCATTTGTAGACAACTT GAATGTGGAAGTGCTGTCAGTTTCTCTGGT TCATCTAATTTTGGAGAAGGCTCTGGACCA ATCTGGTTTGATGATCTTATATGCAACGGA AATGAGTCAGCTCTCTGGAACTGCAAACAT CAAGGATGGGGAAAGCATAACTGTGATCAT GCTGAGGATGCTGGAGTGATTTGCTCAAAG GGAGCAGATCTGAGCCTGAGACTGGTAGAT GGAGTCACTGAATGTTCAGGAAGATTAGAA GTGAGATTCCAAGGAGAATGGGGGACAATA TGTGATGACGGCTGGGACAGTTACGATGCT GCTGTGGCATGCAAGCAACTGGGATGTCCA ACTGCCGTCACAGCCATTGGTCGAGTTAAC GCCAGTAAGGGATTTGGACACATCTGGCTT GACAGCGTTTCTTGCCAGGGACATGAACCT GCTATCTGGCAATGTAAACACCATGAATGG GGAAAGCATTATTGCAATCACAATGAAGAT GCTGGCGTGACATGTTCTGATGGATCAGAT CTGGAGCTAAGACTTAGAGGTGGAGGCAGC CGCTGTGCTGGGACAGTTGAGGTGGAGATT CAGAGACTGTTAGGGAAGGTGTGTGACAGA GGCTGGGGACTGAAAGAAGCTGATGTGGTT TGCAGGCAGCTGGGATGTGGATCTGCACTC AAAACATCTTATCAAGTGTACTCCAAAATC CAGGCAACAAACACATGGCTGTTTCTAAGT AGCTGTAACGGAAATGAAACTTCTCTTTGG GACTGCAAGAACTGGCAATGGGGTGGACTT ACCTGTGATCACTATGAAGAAGCCAAAATT ACCTGCTCAGCCCACAGGGAACCCAGACTG GTTGGAGGGGACATTCCCTGTTCTGGACGT GTTGAAGTGAAGCATGGTGACACGTGGGGC TCCATCTGTGATTCGGACTTCTCTCTGGAA GCTGCCAGCGTTCTATGCAGGGAATTACAG TGTGGCACAGTTGTCTCTATCCTGGGGGGA GCTCACTTTGGAGAGGGAAATGGACAGATC TGGGCTGAAGAATTCCAGTGTGAGGGACAT GAGTCCCATCTTTCACTCTGCCCAGTAGCA CCCCGCCCAGAAGGAACTTGTAGCCACAGC AGGGATGTTGGAGTAGTCTGCTCAAGATAC ACAGAAATTCGCTTGGTGAATGGCAAGACC CCGTGTGAGGGCAGAGTGGAGCTCAAAACG CTTGGTGCCTGGGGATCCCTCTGTAACTCT CACTGGGACATAGAAGATGCCCATGTTCTT TGCCAGCAGCTTAAATGTGGAGTTGCCCTT TCTACCCCAGGAGGAGCACGTTTTGGAAAA GGAAATGGTCAGATCTGGAGGCATATGTTT CACTGCACTGGGACTGAGCAGCACATGGGA GATTGTCCTGTAACTGCTCTAGGTGCTTCA TTATGTCCTTCAGAGCAAGTGGCCTCTGTA ATCTGCTCAGGAAACCAGTCCCAAACACTG TCCTCGTGCAATTCATCGTCTTTGGGCCCA ACAAGGCCTACCATTCCAGAAGAAAGTGCT GTGGCCTGCATAGAGAGTGGTCAACTTCGC CTGGTAAATGGAGGAGGTCGCTGTGCTGGG AGAGTAGAGATCTATCATGAGGGCTCCTGG GGCACCATCTGTGATGACAGCTGGGACCTG AGTGATGCCCACGTGGTTTGCAGACAGCTG GGCTGTGGAGAGGCCATTAATGCCACTGGT TCTGCTCATTTTGGGGAAGGAACAGGGCCC 
ATCTGGCTGGATGAGATGAAATGCAATGGA AAAGAATCCCGCATTTGGCAGTGCCATTCA CACGGCTGGGGGCAGCAAAATTGCAGGCAC AAGGAGGATGCGGGAGTTATCTGCTCAGAA TTCATGTCTCTGAGACTGACCAGTGAAGCC AGCAGAGAGGCCTGTGCAGGGCGTCTGGAA GTTTTTTACAATGGAGCTTGGGGCACTGTT GGCAAGAGTAGCATGTCTGAAACCACTGTG GGTGTGGTGTGCAGGCAGCTGGGCTGTGCA GACAAAGGGAAAATCAACCCTGCATCTTTA GACAAGGCCATGTCCATTCCCATGTGGGTG GACAATGTTCAGTGTCCAAAAGGACCTGAC ACGCTGTGGCAGTGCCCATCATCTCCATGG GAGAAGAGACTGGCCAGCCCCTCGGAGGAG ACCTGGATCACATGTGACAACAAGATAAGA CTTCAGGAAGGACCCACTTCCTGTTCTGGA CGTGTGGAGATCTGGCATGGAGGTTCCTGG GGGACAGTGTGTGATGACTCTTGGGACTTG GACGATGCTCAGGTGGTGTGTCAACAACTT GGCTGTGGTCCAGCTTTGAAAGCATTCAAA GAAGCAGAGTTTGGTCAGGGGACTGGACCG ATATGGCTCAATGAAGTGAAGTGCAAAGGG AATGAGTCTTCCTTGTGGGATTGTCCTGCC AGACGCTGGGGCCATAGTGAGTGTGGGCAC AAGGAAGACGCTGCAGTGAATTGCACAGAT ATTTCAGTGCAGAAAACCCCACAAAAAGCC ACAACAGGTCGCTCATCCCGTCAGTCATCC TTTATTGCAGTCGGGATCCTTGGGGTTGTT CTGTTGGCCATTTTCGTCGCATTATTCTTC TTGACTAAAAAGCGAAGACAGAGACAGCGG CTTGCAGTTTCCTCAAGAGGAGAGAACTTA GTCCACCAAATTCAATACCGGGAGATGAAT TCTTGCCTGAATGCAGATGATCTGGACCTA ATGAATTCCTCAGGAGGCCATTCTGAGCCA CACTGA PTPRJ_HUMAN ATGAAGCCGGCGGCGCGGGAGGCGCGGCTG 18 CCTCCGCGCTCGCCCGGGCTGCGCTGGGCG CTGCCGCTGCTGCTGCTGCTGCTGCGCCTG GGCCAGATCCTGTGCGCAGGTGGCACCCCT AGTCCAATTCCTGACCCTTCAGTAGCAACT GTTGCCACAGGGGAAAATGGCATAACGCAG ATCAGCAGTACAGCAGAATCCTTTCATAAA CAGAATGGAACTGGAACACCTCAGGTGGAA ACAAACACCAGTGAGGATGGTGAAAGCTCT GGAGCCAACGATAGTTTAAGAACACCTGAA CAAGGATCTAATGGGACTGATGGGGCATCT CAAAAAACTCCCAGTAGCACTGGGCCCAGT CCTGTGTTTGACATTAAAGCTGTTTCCATC AGTCCAACCAATGTGATCTTAACTTGGAAA AGTAATGACACAGCTGCTTCTGAGTACAAG TATGTAGTAAAGCATAAGATGGAAAATGAG AAGACAATTACTGTTGTGCATCAACCATGG TGTAACATCACAGGCTTACGTCCAGCGACT TCATATGTATTCTCCATCACTCCAGGAATA GGCAATGAGACTTGGGGAGATCCCAGAGTC ATAAAAGTCATCACAGAGCCGATCCCAGTT TCTGATCTCCGTGTTGCCCTCACGGGTGTG AGGAAGGCTGCTCTCTCCTGGAGCAATGGC AATGGCACTGCCTCCTGCCGGGTTCTTCTT GAAAGCATTGGAAGCCATGAGGAGTTGACT CAAGACTCAAGACTTCAGGTCAATATCTCG GGCCTGAAGCCAGGGGTTCAATACAACATC AACCCGTATCTTCTACAATCAAATAAGACA AAGGGAGACCCCTTGGGCACAGAAGGTGGC 
TTGGATGCCAGCAATACAGAGAGAAGCCGG GCAGGGAGCCCCACCGCCCCTGTGCATGAT GAGTCCCTCGTGGGACCTGTGGACCCATCC TCCGGCCAGCAGTCCCGAGACACGGAAGTC CTGCTTGTCGGGTTAGAGCCTGGCACCCGA TACAATGCCACCGTTTATTCCCAAGCAGCG AATGGCACAGAAGGACAGCCCCAGGCCATA GAGTTCAGGACAAATGCTATTCAGGTTTTT GACGTCACCGCTGTGAACATCAGTGCCACA AGCCTGACCCTGATCTGGAAAGTCAGCGAT AACGAGTCGTCATCTAACTATACCTACAAG ATACATGTGGCGGGGGAGACAGATTCTTCC AATCTCAACGTCAGTGAGCCTCGCGCTGTC ATCCCCGGACTCCGCTCCAGCACCTTCTAC AACATCACAGTGTGTCCTGTCCTAGGTGAC ATCGAGGGCACGCCGGGCTTCCTCCAAGTG CACACCCCCCCTGTTCCAGTTTCTGACTTC CGAGTGACAGTGGTCAGCACGACGGAGATC GGCTTAGCATGGAGCAGCCATGATGCAGAA TCATTTCAGATGCATATCACACAGGAGGGA GCTGGCAATTCTCGGGTAGAAATAACCACC AACCAAAGTATTATCATTGGTGGCTTGTTC CCTGGAACCAAGTATTGCTTTGAAATAGTT CCAAAAGGACCAAATGGGACTGAAGGGGCA TCTCGGACAGTTTGCAATAGAACTGGATGA

TABLE-US-00010 TABLE 9 Amino acid sequences of normalizer proteins in panel. Seq. Gene Name Amino Acid Sequence ID. PEDF_HUMAN MQALVILLCIGALLGHSSCQNPASPPEEGS 19 PDPDSTGALVEEEDPFFKVPVNKLAAAVSN FGYDLYRVRSSTSPTTNVLLSPLSVATALS ALSLGAEQRTESIIHRALYYDLISSPDIHG TYKELLDTVTAPQKNLKSASRIVFEKKLRI KSSFVAPLEKSYGTRPRVLTGNPRLDLQEI NNWVQAQMKGKLARSTKEIPDEISILLLGV AHFKGQWVTKFDSRKTSLEDFYLDEERTVR VPMMSDPKAVLRYGLDSDLSCKIAQLPLTG SMSIIFFLPLKVTQNLTLIEESLTSEFIHD IDRELKTVQAVLTVPKLKLSYEGEVTKSLQ EMKLQSLFDSPDFSKITGKPIKLTQVEHRA GFEWNEDGAGTTPSPGLQPAHLTFPLDYHL NQPFIFVLRDTDTGALLFIGKILDPRGP MASP1_HUMAN MRWLLLYYALCFSLSKASAHTVELNNMFGQ 20 IQSPGYPDSYPSDSEVTWNITVPDGFRIKL YFMHFNLESSYLCEYDYVKVETEDQVLATF CGRETTDTEQTPGQEVVLSPGSFMSITFRS DFSNEERFTGFDAHYMAVDVDECKEREDEE LSCDHYCHNYIGGYYCSCRFGYILHTDNRT CRVECSDNLFTQRTGVITSPDFPNPYPKSS ECLYTIELEEGFMVNLQFEDIFDIEDHPEV PCPYDYIKIKVGPKVLGPFCGEKAPEPIST QSHSVLILFHSDNSGENRGWRLSYRAAGNE CPELQPPVHGKIEPSQAKYFFKDQVLVSCD TGYKVLKDNVEMDTFQIECLKDGTWSNKIP TCKIVDCRAPGELEHGLITFSTRNNLTTYK SEIKYSCQEPYYKMLNNNTGIYTCSAQGVW MNKVLGRSLPTCLPVCGLPKFSRKLMARIF NGRPAQKGTTPWIAMLSHLNGQPFCGGSLL GSSWIVTAAHCLHQSLDPEDPTLRDSDLLS PSDFKITLGKHWRLRSDENEQHLGVKHTTL HPQYDPNTFENDVALVELLESPVLNAFVMP ICLPEGPQQEGAMVIVSGWGKQFLQRFPET LMEIEIPIVDHSTCQKAYAPLKKKVTRDMI CAGEKEGGKDACAGDSGGPMVTLNRERGQW YLVGTVSWGDDCGKKDRYGVYSYIHHNKDW IQRVTGVRN GELS_HUMAN MAPHRPAPALLCALSLALCALSLPVRAATA 21 SRGASQAGAPQGRVPEARPNSNVVEHPEFL KAGKEPGLQIWRVEKFDLVPVPTNLYGDFF TGDAYVILKTVQLRNGNLQYDLHYWLGNEC SQDESGAAAIFTVQLDDYLNGRAVQHREVQ GFESATFLGYFKSGLKYKKGGVASGFKHVV PNEVVVQRLFQVKGRRVVRATEVPVSWESF NNGDCFILDLGNNIHQWCGSNSNRYERLKA TQVSKGIRDNERSGRARVHVSEEGTEPEAM LQVLGPKPALPAGTEDTAKEDAANRKLAKL TASDFITKMDYPKQTQVSVLPEGGETPLFK QFFKNWRDPDQTDGLGLSYLSSHIANVERV PFDAATLHTSTAMAAQHGMDDDGTGQKQIW RIEGSNKVPVDPATYGQFYGGDSYIILYNY RHGGRQGQIIYNWQGAQSTQDEVAASAILT AQLDEELGGTPVQSRVVQGKEPAHLMSLFG GKPMITYKGGTSREGGQTAPASTRLFQVRA NSAGATRAVEVLPKAGALNSNDAFVLKTPS AAYLWVGTGASEAEKTGAQELLRVLRAQPV QVAEGSEPDGFWEALGGKAAYRTSPRLKDK KMDAHPPRLFACSNRIGRFVIEEVPGELMQ EDLATDDVMLLDTWDQVFVWVGKDSQEEEK 
TEALTSAKRYIETDPANRDRRTPITVVKQG FEPPSFVGWFLGWDDDYWSVDPLDRAMAEL AA LUM_HUMAN MSLSAFTLFLALIGGTSGQYYDYDFPLSIY 22 GQSSPNCAPECNCPESYPSAMYCDELKLKS VPMVPPGIKYLYLRNNQIDHIDEKAFENVT DLQWLILDHNLLENSKIKGRVFSKLKQLKK LHINHNNLTESVGPLPKSLEDLQLTHNKIT KLGSFEGLVNLTFIHLQHNRLKEDAVSAAF KGLKSLEYLDLSFNQIARLPSGLPVSLLTL YLDNNKISNIPDEYFKRFNALQYLRLSHNE LADSGIPGNSFNVSSLVELDLSYNKLKNIP TVNENLENYYLEVNQLEKFDIKSFCKILGP LSYSKIKHLRLDGNRISETSLPPDMYECLR VANEVTLN C163A_HUMAN MSKLRMVLLEDSGSADFRRHFVNLSPFTIT 23 VVLLLSACFVTSSLGGTDKELRLVDGENKC SGRVEVKVQEEWGTVCNNGWSMEAVSVICN QLGCPTAIKAPGWANSSAGSGRIWMDHVSC RGNESALWDCKHDGWGKHSNCTHQQDAGVT CSDGSNLEMRLTRGGNMCSGRIEIKFQGRW GTVCDDNFNIDHASVICRQLECGSAVSFSG SSNFGEGSGPIWFDDLICNGNESALWNCKH QGWGKHNCDHAEDAGVICSKGADLSLRLVD GVTECSGRLEVRFQGEWGTICDDGWDSYDA AVACKQLGCPTAVTAIGRVNASKGFGHIWL DSVSCQGHEPAIWQCKHHEWGKHYCNHNED AGVTCSDGSDLELRLRGGGSRCAGTVEVEI QRLLGKVCDRGWGLKEADVVCRQLGCGSAL KTSYQVYSKIQATNTWLFLSSCNGNETSLW DCKNWQWGGLTCDHYEEAKITCSAHREPRL VGGDIPCSGRVEVKHGDTWGSICDSDFSLE AASVLCRELQCGTVVSILGGAHFGEGNGQI WAEEFQCEGHESHLSLCPVAPRPEGTCSHS RDVGVVCSRYTEIRLVNGKTPCEGRVELKT LGAWGSLCNSHWDIEDAHVLCQQLKCGVAL STPGGARFGKGNGQIWRHMFHCTGTEQHMG DCPVTALGASLCPSEQVASVICSGNQSQTL SSCNSSSLGPTRPTIPEESAVACIESGQLR LVNGGGRCAGRVEIYHEGSWGTICDDSWDL SDAHVVCRQLGCGEAINATGSAHFGEGTGP IWLDEMKCNGKESRIWQCHSHGWGQQNCRH KEDAGVICSEFMSLRLTSEASREACAGRLE VFYNGAWGTVGKSSMSETTVGVVCRQLGCA DKGKINPASLDKAMSIPMWVDNVQCPKGPD TLWQCPSSPWEKRLASPSEETWITCDNKIR LQEGPTSCSGRVEIWHGGSWGTVCDDSWDL DDAQVVCQQLGCGPALKAFKEAEFGQGTGP IWLNEVKCKGNESSLWDCPARRWGHSECGH KEDAAVNCTDISVQKTPQKATTGRSSRQSS FIAVGILGVVLLAIFVALFFLTKKRRQRQR LAVSSRGENLVHQIQYREMNSCLNADDLDL MNSSENSHESADFSAAELISVSKFLPISGM EKEAILSHTEKENGNL PTPRJ_HUMAN MKPAAREARLPPRSPGLRWALPLLLLLLRL 24 GQILCAGGTPSPIPDPSVATVATGENGITQ ISSTAESFHKQNGTGTPQVETNTSEDGESS GANDSLRTPEQGSNGTDGASQKTPSSTGPS PVFDIKAVSISPTNVILTWKSNDTAASEYK YVVKHKMENEKTITVVHQPWCNITGLRPAT SYVFSITPGIGNETWGDPRVIKVITEPIPV SDLRVALTGVRKAALSWSNGNGTASCRVLL ESIGSHEELTQDSRLQVNISGLKPGVQYNI NPYLLQSNKTKGDPLGTEGGLDASNTERSR AGSPTAPVHDESLVGPVDPSSGQQSRDTEV 
LLVGLEPGTRYNATVYSQAANGTEGQPQAI EFRTNAIQVFDVTAVNISATSLTLIWKVSD NESSSNYTYKIHVAGETDSSNLNVSEPRAV IPGLRSSTFYNITVCPVLGDIEGTPGFLQV HTPPVPVSDFRVTVVSTTEIGLAWSSHDAE SFQMHITQEGAGNSRVEITTNQSIIIGGLF PGTKYCFEIVPKGPNGTEGASRTVCNRTVP SAVFDIHVVYVTTTEMWLDWKSPDGASEYV YHLVIESKHGSNHTSTYDKAITLQGLIPGT LYNITISPEVDHVWGDPNSTAQYTRPSNVS NIDVSTNTTAATLSWQNFDDASPTYSYCLL IEKAGNSSNATQVVTDIGITDATVTELIPG SSYTVEIFAQVGDGIKSLEPGRKSFCTDPA SMASFDCEVVPKEPALVLKWTCPPGANAGF ELEVSSGAWNNATHLESCSSENGTEYRTEV TYLNFSTSYNISITTVSCGKMAAPTRNTCT TGITDPPPPDGSPNITSVSHNSVKVKFSGF EASHGPIKAYAVILTTGEAGHPSADVLKYT YEDFKKGASDTYVTYLIRTEEKGRSQSLSE VLKYEIDVGNESTTLGYYNGKLEPLGSYRA CVAGFTNITFHPQNKGLIDGAESYVSFSRY SDAVSLPQDPGVICGAVFGCIFGALVIVTV GGFITORKKRKDAKNNEVSFSQIKPKKSKL IRVENFEAYFKKQQADSNCGFAEEYEDLKL VGISQPKYAAELAENRGKNRYNNVLPYDIS RVKLSVQTHSTDDYINANYMPGYHSKKDFI ATQGPLPNTLKDFWRMVWEKNVYAIIMLTK CVEQGRTKCEEYWPSKQAQDYGDITVAMTS EIVLPEWTIRDFTVKNIQTSESHPLRQFHF TSWPDHGVPDTTDLLINFRYLVRDYMKQSP PESPILVHCSAGVGRTGTFIAIDRLIYQIE NENTVDVYGIVYDLRMHRPLMVQTEDQYVF LNQCVLDIVRSQKDSKVDLIYQNTTAMTIY ENLAPVTTFGKTNGYIA
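The nucleotide sequences of Table 8 and the amino acid sequences of Table 9 describe the same normalizer proteins, so a coding sequence from Table 8 can be checked against its Table 9 entry by translating it codon by codon. The following is a minimal illustrative sketch, not part of the disclosed method: it assumes the standard genetic code, and the names `CODON_TABLE`, `translate`, and `LUM_CDS_START` are hypothetical helpers introduced only for this example. The test fragment is the first 30 nucleotides of the LUM_HUMAN entry in Table 8.

```python
from itertools import product

# Standard genetic code (assumption: the Table 8 sequences are ordinary
# nuclear coding sequences). Codons are enumerated with bases in the order
# T, C, A, G; "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate a coding DNA sequence into one-letter amino acid codes.

    Whitespace (such as the 30-character grouping used in the tables) is
    ignored. Translation stops at the first stop codon, which is not
    included in the result. Trailing bases that do not form a full codon
    are ignored.
    """
    seq = "".join(cds.split()).upper()
    protein = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First 30 nucleotides of the LUM_HUMAN coding sequence as listed in Table 8.
LUM_CDS_START = "ATGAGTCTAAGTGCATTTACTCTCTTCCTG"

if __name__ == "__main__":
    print(translate(LUM_CDS_START))
```

Applied to a full Table 8 entry, the translated string can be compared position by position with the corresponding Table 9 entry to locate any transcription differences between the two tables.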

Sequence Listing

2911095DNAHomo sapiens 1atgccctacc aatatccagc actgaccccg gagcagaaga aggagctgtc tgacatcgct 60caccgcatcg tggcacctgg caagggcatc ctggctgcag atgagtccac tgggagcatt 120gccaagcggc tgcagtccat tggcaccgag aacaccgagg agaaccggcg cttctaccgc 180cagctgctgc tgacagctga cgaccgcgtg aacccctgca ttgggggtgt catcctcttc 240catgagacac tctaccagaa ggcggatgat gggcgtccct tcccccaagt tatcaaatcc 300aagggcggtg ttgtgggcat caaggtagac aagggcgtgg tccccctggc agggacaaat 360ggcgagacta ccacccaagg gttggatggg ctgtctgagc gctgtgccca gtacaagaag 420gacggagctg acttcgccaa gtggcgttgt gtgctgaaga ttggggaaca caccccctca 480gccctcgcca tcatggaaaa tgccaatgtt ctggcccgtt atgccagtat ctgccagcag 540aatggcattg tgcccatcgt ggagcctgag atcctccctg atggggacca tgacttgaag 600cgctgccagt atgtgaccga gaaggtgctg gctgctgtct acaaggctct gagtgaccac 660cacatctacc tggaaggcac cttgctgaag cccaacatgg tcaccccagg ccatgcttgc 720actcagaagt tttctcatga ggagattgcc atggcgaccg tcacagcgct gcgccgcaca 780gtgccccccg ctgtcactgg gatcaccttc ctgtctggag gccagagtga ggaggaggcg 840tccatcaacc tcaatgccat taacaagtgc cccctgctga agccctgggc cctgaccttc 900tcctacggcc gagccctgca ggcctctgcc ctgaaggcct ggggcgggaa gaaggagaac 960ctgaaggctg cgcaggagga gtatgtcaag cgagccctgg ccaacagcct tgcctgtcaa 1020ggaaagtaca ctccgagcgg tcaggctggg gctgctgcca gcgagtccct cttcgtctct 1080aaccacgcct attaa 109521257DNAHomo sapiens 2atggcaaggc gcaagccaga agggtccagc ttcaacatga cccacctgtc catggctatg 60gccttttcct ttcccccagt tgccagtggg caactccacc ctcagctggg caacacccag 120caccagacag agttaggaaa ggaacttgct actaccagca ccatgcccta ccaatatcca 180gcactgaccc cggagcagaa gaaggagctg tctgacatcg ctcaccgcat cgtggcacct 240ggcaagggca tcctggctgc agatgagtcc actgggagca ttgccaagcg gctgcagtcc 300attggcaccg agaacaccga ggagaaccgg cgcttctacc gccagctgct gctgacagct 360gacgaccgcg tgaacccctg cattgggggt gtcatcctct tccatgagac actctaccag 420aaggcggatg atgggcgtcc cttcccccaa gttatcaaat ccaagggcgg tgttgtgggc 480atcaaggtag acaagggcgt ggtccccctg gcagggacaa atggcgagac taccacccaa 540gggttggatg ggctgtctga gcgctgtgcc cagtacaaga aggacggagc tgacttcgcc 
600aagtggcgtt gtgtgctgaa gattggggaa cacaccccct cagccctcgc catcatggaa 660aatgccaatg ttctggcccg ttatgccagt atctgccagc agaatggcat tgtgcccatc 720gtggagcctg agatcctccc tgatggggac catgacttga agcgctgcca gtatgtgacc 780gagaaggtgc tggctgctgt ctacaaggct ctgagtgacc accacatcta cctggaaggc 840accttgctga agcccaacat ggtcacccca ggccatgctt gcactcagaa gttttctcat 900gaggagattg ccatggcgac cgtcacagcg ctgcgccgca cagtgccccc cgctgtcact 960gggatcacct tcctgtctgg aggccagagt gaggaggagg cgtccatcaa cctcaatgcc 1020attaacaagt gccccctgct gaagccctgg gccctgacct tctcctacgg ccgagccctg 1080caggcctctg ccctgaaggc ctggggcggg aagaaggaga acctgaaggc tgcgcaggag 1140gagtatgtca agcgagccct ggccaacagc cttgcctgtc aaggaaagta cactccgagc 1200ggtcaggctg gggctgctgc cagcgagtcc ctcttcgtct ctaaccacgc ctattaa 12573528DNAHomo sapiens 3atgagctccc agattcgtca gaattattcc accgacgtgg aggcagccgt caacagcctg 60gtcaatttgt acctgcaggc ctcctacacc tacctctctc tgggcttcta tttcgaccgc 120gatgatgtgg ctctggaagg cgtgagccac ttcttccgcg aattggccga ggagaagcgc 180gagggctacg agcgtctcct gaagatgcaa aaccagcgtg gcggccgcgc tctcttccag 240gacatcaaga agccagctga agatgagtgg ggtaaaaccc cagacgccat gaaagctgcc 300atggccctgg agaaaaagct gaaccaggcc cttttggatc ttcatgccct gggttctgcc 360cgcacggacc cccatctctg tgacttcctg gagactcact tcctagatga ggaagtgaag 420cttatcaaga agatgggtga ccacctgacc aacctccaca ggctgggtgg cccggaggct 480gggctgggcg agtatctctt cgaaaggctc actctcaagc acgactaa 52841758DNAHomo sapiens 4atgacccctc cgaggctctt ctgggtgtgg ctgctggttg caggaaccca aggcgtgaac 60gatggtgaca tgcggctggc cgatgggggc gccaccaacc agggccgcgt ggagatcttc 120tacagaggcc agtggggcac tgtgtgtgac aacctgtggg acctgactga tgccagcgtc 180gtctgccggg ccctgggctt cgagaacgcc acccaggctc tgggcagagc tgccttcggg 240caaggatcag gccccatcat gctggatgag gtccagtgca cgggaaccga ggcctcactg 300gccgactgca agtccctggg ctggctgaag agcaactgca ggcacgagag agacgctggt 360gtggtctgca ccaatgaaac caggagcacc cacaccctgg acctctccag ggagctctcg 420gaggcccttg gccagatctt tgacagccag cggggctgcg acctgtccat cagcgtgaat 480gtgcagggcg aggacgccct gggcttctgt 
ggccacacgg tcatcctgac tgccaacctg 540gaggcccagg ccctgtggaa ggagccgggc agcaatgtca ccatgagtgt ggatgctgag 600tgtgtgccca tggtcaggga ccttctcagg tacttctact cccgaaggat tgacatcacc 660ctgtcgtcag tcaagtgctt ccacaagctg gcctctgcct atggggccag gcagctgcag 720ggctactgcg caagcctctt tgccatcctc ctcccccagg acccctcgtt ccagatgccc 780ctggacctgt atgcctatgc agtggccaca ggggacgccc tgctggagaa gctctgccta 840cagttcctgg cctggaactt cgaggccttg acgcaggccg aggcctggcc cagtgtcccc 900acagacctgc tccaactgct gctgcccagg agcgacctgg cggtgcccag cgagctggcc 960ctactgaagg ccgtggacac ctggagctgg ggggagcgtg cctcccatga ggaggtggag 1020ggcttggtgg agaagatccg cttccccatg atgctccctg aggagctctt tgagctgcag 1080ttcaacctgt ccctgtactg gagccacgag gccctgttcc agaagaagac tctgcaggcc 1140ctggaattcc acactgtgcc cttccagttg ctggcccggt acaaaggcct gaacctcacc 1200gaggatacct acaagccccg gatttacacc tcgcccacct ggagtgcctt tgtgacagac 1260agttcctgga gtgcacggaa gtcacaactg gtctatcagt ccagacgggg gcctttggtc 1320aaatattctt ctgattactt ccaagccccc tctgactaca gatactaccc ctaccagtcc 1380ttccagactc cacaacaccc cagcttcctc ttccaggaca agagggtgtc ctggtccctg 1440gtctacctcc ccaccatcca gagctgctgg aactacggct tctcctgctc ctcggacgag 1500ctccctgtcc tgggcctcac caagtctggc ggctcagatc gcaccattgc ctacgaaaac 1560aaagccctga tgctctgcga agggctcttc gtggcagacg tcaccgattt cgagggctgg 1620aaggctgcga ttcccagtgc cctggacacc aacagctcga agagcacctc ctccttcccc 1680tgcccggcag ggcacttcaa cggcttccgc acggtcatcc gccccttcta cctgaccaac 1740tcctcaggtg tggactag 175853513DNAHomo sapiens 5atggggctgg cctggggact aggcgtcctg ttcctgatgc atgtgtgtgg caccaaccgc 60attccagagt ctggcggaga caacagcgtg tttgacatct ttgaactcac cggggccgcc 120cgcaaggggt ctgggcgccg actggtgaag ggccccgacc cttccagccc agctttccgc 180atcgaggatg ccaacctgat cccccctgtg cctgatgaca agttccaaga cctggtggat 240gctgtgcggg cagaaaaggg tttcctcctt ctggcatccc tgaggcagat gaagaagacc 300cggggcacgc tgctggccct ggagcggaaa gaccactctg gccaggtctt cagcgtggtg 360tccaatggca aggcgggcac cctggacctc agcctgaccg tccaaggaaa gcagcacgtg 420gtgtctgtgg aagaagctct cctggcaacc 
ggccagtgga agagcatcac cctgtttgtg 480caggaagaca gggcccagct gtacatcgac tgtgaaaaga tggagaatgc tgagttggac 540gtccccatcc aaagcgtctt caccagagac ctggccagca tcgccagact ccgcatcgca 600aaggggggcg tcaatgacaa tttccagggg gtgctgcaga atgtgaggtt tgtctttgga 660accacaccag aagacatcct caggaacaaa ggctgctcca gctctaccag tgtcctcctc 720acccttgaca acaacgtggt gaatggttcc agccctgcca tccgcactaa ctacattggc 780cacaagacaa aggacttgca agccatctgc ggcatctcct gtgatgagct gtccagcatg 840gtcctggaac tcaggggcct gcgcaccatt gtgaccacgc tgcaggacag catccgcaaa 900gtgactgaag agaacaaaga gttggccaat gagctgaggc ggcctcccct atgctatcac 960aacggagttc agtacagaaa taacgaggaa tggactgttg atagctgcac tgagtgtcac 1020tgtcagaact cagttaccat ctgcaaaaag gtgtcctgcc ccatcatgcc ctgctccaat 1080gccacagttc ctgatggaga atgctgtcct cgctgttggc ccagcgactc tgcggacgat 1140ggctggtctc catggtccga gtggacctcc tgttctacga gctgtggcaa tggaattcag 1200cagcgcggcc gctcctgcga tagcctcaac aaccgatgtg agggctcctc ggtccagaca 1260cggacctgcc acattcagga gtgtgacaag agatttaaac aggatggtgg ctggagccac 1320tggtccccgt ggtcatcttg ttctgtgaca tgtggtgatg gtgtgatcac aaggatccgg 1380ctctgcaact ctcccagccc ccagatgaac gggaaaccct gtgaaggcga agcgcgggag 1440accaaagcct gcaagaaaga cgcctgcccc atcaatggag gctggggtcc ttggtcacca 1500tgggacatct gttctgtcac ctgtggagga ggggtacaga aacgtagtcg tctctgcaac 1560aaccccacac cccagtttgg aggcaaggac tgcgttggtg atgtaacaga aaaccagatc 1620tgcaacaagc aggactgtcc aattgatgga tgcctgtcca atccctgctt tgccggcgtg 1680aagtgtacta gctaccctga tggcagctgg aaatgtggtg cttgtccccc tggttacagt 1740ggaaatggca tccagtgcac agatgttgat gagtgcaaag aagtgcctga tgcctgcttc 1800aaccacaatg gagagcaccg gtgtgagaac acggaccccg gctacaactg cctgccctgc 1860cccccacgct tcaccggctc acagcccttc ggccagggtg tcgaacatgc cacggccaac 1920aaacaggtgt gcaagccccg taacccctgc acggatggga cccacgactg caacaagaac 1980gccaagtgca actacctggg ccactatagc gaccccatgt accgctgcga gtgcaagcct 2040ggctacgctg gcaatggcat catctgcggg gaggacacag acctggatgg ctggcccaat 2100gagaacctgg tgtgcgtggc caatgcgact taccactgca aaaaggataa ttgccccaac 2160cttcccaact 
cagggcagga agactatgac aaggatggaa ttggtgatgc ctgtgatgat 2220gacgatgaca atgataaaat tccagatgac agggacaact gtccattcca ttacaaccca 2280gctcagtatg actatgacag agatgatgtg ggagaccgct gtgacaactg tccctacaac 2340cacaacccag atcaggcaga cacagacaac aatggggaag gagacgcctg tgctgcagac 2400attgatggag acggtatcct caatgaacgg gacaactgcc agtacgtcta caatgtggac 2460cagagagaca ctgatatgga tggggttgga gatcagtgtg acaattgccc cttggaacac 2520aatccggatc agctggactc tgactcagac cgcattggag atacctgtga caacaatcag 2580gatattgatg aagatggcca ccagaacaat ctggacaact gtccctatgt gcccaatgcc 2640aaccaggctg accatgacaa agatggcaag ggagatgcct gtgaccacga tgatgacaac 2700gatggcattc ctgatgacaa ggacaactgc agactcgtgc ccaatcccga ccagaaggac 2760tctgacggcg atggtcgagg tgatgcctgc aaagatgatt ttgaccatga cagtgtgcca 2820gacatcgatg acatctgtcc tgagaatgtt gacatcagtg agaccgattt ccgccgattc 2880cagatgattc ctctggaccc caaagggaca tcccaaaatg accctaactg ggttgtacgc 2940catcagggta aagaactcgt ccagactgtc aactgtgatc ctggactcgc tgtaggttat 3000gatgagttta atgctgtgga cttcagtggc accttcttca tcaacaccga aagggacgat 3060gactatgctg gatttgtctt tggctaccag tccagcagcc gcttttatgt tgtgatgtgg 3120aagcaagtca cccagtccta ctgggacacc aaccccacga gggctcaggg atactcgggc 3180ctttctgtga aagttgtaaa ctccaccaca gggcctggcg agcacctgcg gaacgccctg 3240tggcacacag gaaacacccc tggccaggtg cgcaccctgt ggcatgaccc tcgtcacata 3300ggctggaaag atttcaccgc ctacagatgg cgtctcagcc acaggccaaa gacgggtttc 3360attagagtgg tgatgtatga agggaagaaa atcatggctg actcaggacc catctatgat 3420aaaacctatg ctggtggtag actagggttg tttgtcttct ctcaagaaat ggtgttcttc 3480tctgacctga aatacgaatg tagagatccc taa 351364395DNAHomo sapiens 6atgttcagct ttgtggacct ccggctcctg ctcctcttag cggccaccgc cctcctgacg 60cacggccaag aggaaggcca agtcgagggc caagacgaag acatcccacc aatcacctgc 120gtacagaacg gcctcaggta ccatgaccga gacgtgtgga aacccgagcc ctgccggatc 180tgcgtctgcg acaacggcaa ggtgttgtgc gatgacgtga tctgtgacga gaccaagaac 240tgccccggcg ccgaagtccc cgagggcgag tgctgtcccg tctgccccga cggctcagag 300tcacccaccg accaagaaac caccggcgtc gagggaccca agggagacac 
tggcccccga 360ggcccaaggg gacccgcagg cccccctggc cgagatggca tccctggaca gcctggactt 420cccggacccc ccggaccccc cggacctccc ggaccccctg gcctcggagg aaactttgct 480ccccagctgt cttatggcta tgatgagaaa tcaaccggag gaatttccgt gcctggcccc 540atgggtccct ctggtcctcg tggtctccct ggcccccctg gtgcacctgg tccccaaggc 600ttccaaggtc cccctggtga gcctggcgag cctggagctt caggtcccat gggtccccga 660ggtcccccag gtccccctgg aaagaatgga gatgatgggg aagctggaaa acctggtcgt 720cctggtgagc gtgggcctcc tgggcctcag ggtgctcgag gattgcccgg aacagctggc 780ctccctggaa tgaagggaca cagaggtttc agtggtttgg atggtgccaa gggagatgct 840ggtcctgctg gtcctaaggg tgagcctggc agccctggtg aaaatggagc tcctggtcag 900atgggccccc gtggcctgcc tggtgagaga ggtcgccctg gagcccctgg ccctgctggt 960gctcgtggaa atgatggtgc tactggtgct gccgggcccc ctggtcccac cggccccgct 1020ggtcctcctg gcttccctgg tgctgttggt gctaagggtg aagctggtcc ccaagggccc 1080cgaggctctg aaggtcccca gggtgtgcgt ggtgagcctg gcccccctgg ccctgctggt 1140gctgctggcc ctgctggaaa ccctggtgct gatggacagc ctggtgctaa aggtgccaat 1200ggtgctcctg gtattgctgg tgctcctggc ttccctggtg cccgaggccc ctctggaccc 1260cagggccccg gcggccctcc tggtcccaag ggtaacagcg gtgaacctgg tgctcctggc 1320agcaaaggag acactggtgc taagggagag cctggccctg ttggtgttca aggaccccct 1380ggccctgctg gagaggaagg aaagcgagga gctcgaggtg aacccggacc cactggcctg 1440cccggacccc ctggcgagcg tggtggacct ggtagccgtg gtttccctgg cgcagatggt 1500gttgctggtc ccaagggtcc cgctggtgaa cgtggttctc ctggccctgc tggccccaaa 1560ggatctcctg gtgaagctgg tcgtcccggt gaagctggtc tgcctggtgc caagggtctg 1620actggaagcc ctggcagccc tggtcctgat ggcaaaactg gcccccctgg tcccgccggt 1680caagatggtc gccccggacc cccaggccca cctggtgccc gtggtcaggc tggtgtgatg 1740ggattccctg gacctaaagg tgctgctgga gagcccggca aggctggaga gcgaggtgtt 1800cccggacccc ctggcgctgt cggtcctgct ggcaaagatg gagaggctgg agctcaggga 1860ccccctggcc ctgctggtcc cgctggcgag agaggtgaac aaggccctgc tggctccccc 1920ggattccagg gtctccctgg tcctgctggt cctccaggtg aagcaggcaa acctggtgaa 1980cagggtgttc ctggagacct tggcgcccct ggcccctctg gagcaagagg cgagagaggt 2040ttccctggcg agcgtggtgt gcaaggtccc 
cctggtcctg ctggtccccg aggggccaac 2100ggtgctcccg gcaacgatgg tgctaagggt gatgctggtg cccctggagc tcccggtagc 2160cagggcgccc ctggccttca gggaatgcct ggtgaacgtg gtgcagctgg tcttccaggg 2220cctaagggtg acagaggtga tgctggtccc aaaggtgctg atggctctcc tggcaaagat 2280ggcgtccgtg gtctgactgg ccccattggt cctcctggcc ctgctggtgc ccctggtgac 2340aagggtgaaa gtggtcccag cggccctgct ggtcccactg gagctcgtgg tgcccccgga 2400gaccgtggtg agcctggtcc ccccggccct gctggctttg ctggcccccc tggtgctgac 2460ggccaacctg gtgctaaagg cgaacctggt gatgctggtg ctaaaggcga tgctggtccc 2520cctggccctg ccggacccgc tggaccccct ggccccattg gtaatgttgg tgctcctgga 2580gccaaaggtg ctcgcggcag cgctggtccc cctggtgcta ctggtttccc tggtgctgct 2640ggccgagtcg gtcctcctgg cccctctgga aatgctggac cccctggccc tcctggtcct 2700gctggcaaag aaggcggcaa aggtccccgt ggtgagactg gccctgctgg acgtcctggt 2760gaagttggtc cccctggtcc ccctggccct gctggcgaga aaggatcccc tggtgctgat 2820ggtcctgctg gtgctcctgg tactcccggg cctcaaggta ttgctggaca gcgtggtgtg 2880gtcggcctgc ctggtcagag aggagagaga ggcttccctg gtcttcctgg cccctctggt 2940gaacctggca aacaaggtcc ctctggagca agtggtgaac gtggtccccc tggtcccatg 3000ggcccccctg gattggctgg accccctggt gaatctggac gtgagggggc tcctggtgcc 3060gaaggttccc ctggacgaga cggttctcct ggcgccaagg gtgaccgtgg tgagaccggc 3120cccgctggac cccctggtgc tcctggtgct cctggtgccc ctggccccgt tggccctgct 3180ggcaagagtg gtgatcgtgg tgagactggt cctgctggtc ccaccggtcc tgtcggccct 3240gttggcgccc gtggccccgc cggaccccaa ggcccccgtg gtgacaaggg tgagacaggc 3300gaacagggcg acagaggcat aaagggtcac cgtggcttct ctggcctcca gggtccccct 3360ggccctcctg gctctcctgg tgaacaaggt ccctctggag cctctggtcc tgctggtccc 3420cgaggtcccc ctggctctgc tggtgctcct ggcaaagatg gactcaacgg tctccctggc 3480cccattgggc cccctggtcc tcgcggtcgc actggtgatg ctggtcctgt tggtcccccc 3540ggccctcctg gacctcctgg tccccctggt cctcccagcg ctggtttcga cttcagcttc 3600ctgccccagc cacctcaaga gaaggctcac gatggtggcc gctactaccg ggctgatgat 3660gccaatgtgg ttcgtgaccg tgacctcgag gtggacacca ccctcaagag cctgagccag 3720cagatcgaga acatccggag cccagagggc agccgcaaga accccgcccg cacctgccgt 
3780gacctcaaga tgtgccactc tgactggaag agtggagagt actggattga ccccaaccaa 3840ggctgcaacc tggatgccat caaagtcttc tgcaacatgg agactggtga gacctgcgtg 3900taccccactc agcccagtgt ggcccagaag aactggtaca tcagcaagaa ccccaaggac 3960aagaggcatg tctggttcgg cgagagcatg accgatggat tccagttcga gtatggcggc 4020cagggctccg accctgccga tgtggccatc cagctgacct tcctgcgcct gatgtccacc 4080gaggcctccc agaacatcac ctaccactgc aagaacagcg tggcctacat ggaccagcag 4140actggcaacc tcaagaaggc cctgctcctc cagggctcca acgagatcga gatccgcgcc 4200gagggcaaca gccgcttcac ctacagcgtc actgtcgatg gctgcacgag tcacaccgga 4260gcctggggca agacagtgat tgaatacaaa accaccaaga cctcccgcct gcccatcatc 4320gatgtggccc ccttggacgt tggtgcccca gaccaggaat tcggcttcga cgttggccct 4380gtctgcttcc tgtaa 43957364PRTHomo sapiens 7Met Pro Tyr Gln Tyr Pro Ala Leu Thr Pro Glu Gln Lys Lys Glu Leu 1 5 10 15 Ser Asp Ile Ala His Arg Ile Val Ala Pro Gly Lys Gly Ile Leu Ala 20 25 30 Ala Asp Glu Ser Thr Gly Ser Ile Ala Lys Arg Leu Gln Ser Ile Gly 35 40 45 Thr Glu Asn Thr Glu Glu Asn Arg Arg Phe Tyr Arg Gln Leu Leu Leu 50 55 60 Thr Ala Asp Asp Arg Val Asn Pro Cys Ile Gly Gly Val Ile Leu Phe 65 70 75 80 His Glu Thr Leu Tyr Gln Lys Ala Asp Asp Gly Arg Pro Phe Pro Gln 85 90 95 Val Ile Lys Ser Lys Gly Gly Val Val Gly Ile Lys Val Asp Lys Gly 100 105 110 Val Val Pro Leu Ala Gly Thr Asn Gly Glu Thr Thr Thr Gln Gly Leu 115 120 125 Asp Gly Leu Ser Glu Arg Cys Ala Gln Tyr Lys Lys Asp Gly Ala Asp 130 135 140 Phe Ala Lys Trp Arg Cys Val Leu Lys Ile Gly Glu His Thr Pro Ser 145 150 155 160 Ala Leu Ala Ile Met Glu Asn Ala Asn Val Leu Ala Arg Tyr Ala Ser 165 170 175 Ile Cys Gln Gln Asn Gly Ile Val Pro Ile Val Glu Pro Glu Ile Leu 180 185 190 Pro Asp Gly Asp His Asp Leu Lys Arg Cys Gln Tyr Val Thr Glu Lys 195 200 205 Val Leu Ala Ala Val Tyr Lys Ala Leu Ser Asp His His Ile Tyr Leu 210 215 220 Glu Gly Thr Leu Leu Lys Pro Asn Met Val Thr Pro Gly His Ala Cys 225 230 235 240 Thr Gln Lys Phe Ser His Glu Glu Ile Ala Met Ala Thr Val Thr Ala 245 250 255 Leu Arg Arg Thr Val Pro Pro Ala Val Thr Gly 
Ile Thr Phe Leu Ser 260 265 270 Gly Gly Gln Ser Glu Glu Glu Ala Ser Ile Asn Leu Asn Ala Ile Asn 275 280 285 Lys Cys Pro Leu Leu Lys Pro Trp Ala Leu Thr Phe Ser Tyr Gly Arg 290 295 300 Ala Leu Gln Ala Ser Ala Leu Lys Ala Trp Gly Gly Lys Lys Glu Asn 305 310 315 320 Leu Lys Ala Ala Gln Glu Glu Tyr Val Lys Arg Ala Leu Ala Asn Ser 325 330 335 Leu Ala Cys
Gln Gly Lys Tyr Thr Pro Ser Gly Gln Ala Gly Ala Ala 340 345 350 Ala Ser Glu Ser Leu Phe Val Ser Asn His Ala Tyr 355 360 8418PRTHomo sapiens 8Met Ala Arg Arg Lys Pro Glu Gly Ser Ser Phe Asn Met Thr His Leu 1 5 10 15 Ser Met Ala Met Ala Phe Ser Phe Pro Pro Val Ala Ser Gly Gln Leu 20 25 30 His Pro Gln Leu Gly Asn Thr Gln His Gln Thr Glu Leu Gly Lys Glu 35 40 45 Leu Ala Thr Thr Ser Thr Met Pro Tyr Gln Tyr Pro Ala Leu Thr Pro 50 55 60 Glu Gln Lys Lys Glu Leu Ser Asp Ile Ala His Arg Ile Val Ala Pro 65 70 75 80 Gly Lys Gly Ile Leu Ala Ala Asp Glu Ser Thr Gly Ser Ile Ala Lys 85 90 95 Arg Leu Gln Ser Ile Gly Thr Glu Asn Thr Glu Glu Asn Arg Arg Phe 100 105 110 Tyr Arg Gln Leu Leu Leu Thr Ala Asp Asp Arg Val Asn Pro Cys Ile 115 120 125 Gly Gly Val Ile Leu Phe His Glu Thr Leu Tyr Gln Lys Ala Asp Asp 130 135 140 Gly Arg Pro Phe Pro Gln Val Ile Lys Ser Lys Gly Gly Val Val Gly 145 150 155 160 Ile Lys Val Asp Lys Gly Val Val Pro Leu Ala Gly Thr Asn Gly Glu 165 170 175 Thr Thr Thr Gln Gly Leu Asp Gly Leu Ser Glu Arg Cys Ala Gln Tyr 180 185 190 Lys Lys Asp Gly Ala Asp Phe Ala Lys Trp Arg Cys Val Leu Lys Ile 195 200 205 Gly Glu His Thr Pro Ser Ala Leu Ala Ile Met Glu Asn Ala Asn Val 210 215 220 Leu Ala Arg Tyr Ala Ser Ile Cys Gln Gln Asn Gly Ile Val Pro Ile 225 230 235 240 Val Glu Pro Glu Ile Leu Pro Asp Gly Asp His Asp Leu Lys Arg Cys 245 250 255 Gln Tyr Val Thr Glu Lys Val Leu Ala Ala Val Tyr Lys Ala Leu Ser 260 265 270 Asp His His Ile Tyr Leu Glu Gly Thr Leu Leu Lys Pro Asn Met Val 275 280 285 Thr Pro Gly His Ala Cys Thr Gln Lys Phe Ser His Glu Glu Ile Ala 290 295 300 Met Ala Thr Val Thr Ala Leu Arg Arg Thr Val Pro Pro Ala Val Thr 305 310 315 320 Gly Ile Thr Phe Leu Ser Gly Gly Gln Ser Glu Glu Glu Ala Ser Ile 325 330 335 Asn Leu Asn Ala Ile Asn Lys Cys Pro Leu Leu Lys Pro Trp Ala Leu 340 345 350 Thr Phe Ser Tyr Gly Arg Ala Leu Gln Ala Ser Ala Leu Lys Ala Trp 355 360 365 Gly Gly Lys Lys Glu Asn Leu Lys Ala Ala Gln Glu Glu Tyr Val Lys 370 375 380 Arg Ala Leu Ala Asn Ser Leu Ala Cys 
Gln Gly Lys Tyr Thr Pro Ser 385 390 395 400 Gly Gln Ala Gly Ala Ala Ala Ser Glu Ser Leu Phe Val Ser Asn His 405 410 415 Ala Tyr 9175PRTHomo sapiens 9Met Ser Ser Gln Ile Arg Gln Asn Tyr Ser Thr Asp Val Glu Ala Ala 1 5 10 15 Val Asn Ser Leu Val Asn Leu Tyr Leu Gln Ala Ser Tyr Thr Tyr Leu 20 25 30 Ser Leu Gly Phe Tyr Phe Asp Arg Asp Asp Val Ala Leu Glu Gly Val 35 40 45 Ser His Phe Phe Arg Glu Leu Ala Glu Glu Lys Arg Glu Gly Tyr Glu 50 55 60 Arg Leu Leu Lys Met Gln Asn Gln Arg Gly Gly Arg Ala Leu Phe Gln 65 70 75 80 Asp Ile Lys Lys Pro Ala Glu Asp Glu Trp Gly Lys Thr Pro Asp Ala 85 90 95 Met Lys Ala Ala Met Ala Leu Glu Lys Lys Leu Asn Gln Ala Leu Leu 100 105 110 Asp Leu His Ala Leu Gly Ser Ala Arg Thr Asp Pro His Leu Cys Asp 115 120 125 Phe Leu Glu Thr His Phe Leu Asp Glu Glu Val Lys Leu Ile Lys Lys 130 135 140 Met Gly Asp His Leu Thr Asn Leu His Arg Leu Gly Gly Pro Glu Ala 145 150 155 160 Gly Leu Gly Glu Tyr Leu Phe Glu Arg Leu Thr Leu Lys His Asp 165 170 175 10585PRTHomo sapiens 10Met Thr Pro Pro Arg Leu Phe Trp Val Trp Leu Leu Val Ala Gly Thr 1 5 10 15 Gln Gly Val Asn Asp Gly Asp Met Arg Leu Ala Asp Gly Gly Ala Thr 20 25 30 Asn Gln Gly Arg Val Glu Ile Phe Tyr Arg Gly Gln Trp Gly Thr Val 35 40 45 Cys Asp Asn Leu Trp Asp Leu Thr Asp Ala Ser Val Val Cys Arg Ala 50 55 60 Leu Gly Phe Glu Asn Ala Thr Gln Ala Leu Gly Arg Ala Ala Phe Gly 65 70 75 80 Gln Gly Ser Gly Pro Ile Met Leu Asp Glu Val Gln Cys Thr Gly Thr 85 90 95 Glu Ala Ser Leu Ala Asp Cys Lys Ser Leu Gly Trp Leu Lys Ser Asn 100 105 110 Cys Arg His Glu Arg Asp Ala Gly Val Val Cys Thr Asn Glu Thr Arg 115 120 125 Ser Thr His Thr Leu Asp Leu Ser Arg Glu Leu Ser Glu Ala Leu Gly 130 135 140 Gln Ile Phe Asp Ser Gln Arg Gly Cys Asp Leu Ser Ile Ser Val Asn 145 150 155 160 Val Gln Gly Glu Asp Ala Leu Gly Phe Cys Gly His Thr Val Ile Leu 165 170 175 Thr Ala Asn Leu Glu Ala Gln Ala Leu Trp Lys Glu Pro Gly Ser Asn 180 185 190 Val Thr Met Ser Val Asp Ala Glu Cys Val Pro Met Val Arg Asp Leu 195 200 205 Leu Arg Tyr Phe Tyr Ser Arg 
Arg Ile Asp Ile Thr Leu Ser Ser Val 210 215 220 Lys Cys Phe His Lys Leu Ala Ser Ala Tyr Gly Ala Arg Gln Leu Gln 225 230 235 240 Gly Tyr Cys Ala Ser Leu Phe Ala Ile Leu Leu Pro Gln Asp Pro Ser 245 250 255 Phe Gln Met Pro Leu Asp Leu Tyr Ala Tyr Ala Val Ala Thr Gly Asp 260 265 270 Ala Leu Leu Glu Lys Leu Cys Leu Gln Phe Leu Ala Trp Asn Phe Glu 275 280 285 Ala Leu Thr Gln Ala Glu Ala Trp Pro Ser Val Pro Thr Asp Leu Leu 290 295 300 Gln Leu Leu Leu Pro Arg Ser Asp Leu Ala Val Pro Ser Glu Leu Ala 305 310 315 320 Leu Leu Lys Ala Val Asp Thr Trp Ser Trp Gly Glu Arg Ala Ser His 325 330 335 Glu Glu Val Glu Gly Leu Val Glu Lys Ile Arg Phe Pro Met Met Leu 340 345 350 Pro Glu Glu Leu Phe Glu Leu Gln Phe Asn Leu Ser Leu Tyr Trp Ser 355 360 365 His Glu Ala Leu Phe Gln Lys Lys Thr Leu Gln Ala Leu Glu Phe His 370 375 380 Thr Val Pro Phe Gln Leu Leu Ala Arg Tyr Lys Gly Leu Asn Leu Thr 385 390 395 400 Glu Asp Thr Tyr Lys Pro Arg Ile Tyr Thr Ser Pro Thr Trp Ser Ala 405 410 415 Phe Val Thr Asp Ser Ser Trp Ser Ala Arg Lys Ser Gln Leu Val Tyr 420 425 430 Gln Ser Arg Arg Gly Pro Leu Val Lys Tyr Ser Ser Asp Tyr Phe Gln 435 440 445 Ala Pro Ser Asp Tyr Arg Tyr Tyr Pro Tyr Gln Ser Phe Gln Thr Pro 450 455 460 Gln His Pro Ser Phe Leu Phe Gln Asp Lys Arg Val Ser Trp Ser Leu 465 470 475 480 Val Tyr Leu Pro Thr Ile Gln Ser Cys Trp Asn Tyr Gly Phe Ser Cys 485 490 495 Ser Ser Asp Glu Leu Pro Val Leu Gly Leu Thr Lys Ser Gly Gly Ser 500 505 510 Asp Arg Thr Ile Ala Tyr Glu Asn Lys Ala Leu Met Leu Cys Glu Gly 515 520 525 Leu Phe Val Ala Asp Val Thr Asp Phe Glu Gly Trp Lys Ala Ala Ile 530 535 540 Pro Ser Ala Leu Asp Thr Asn Ser Ser Lys Ser Thr Ser Ser Phe Pro 545 550 555 560 Cys Pro Ala Gly His Phe Asn Gly Phe Arg Thr Val Ile Arg Pro Phe 565 570 575 Tyr Leu Thr Asn Ser Ser Gly Val Asp 580 585 111170PRTHomo sapiens 11Met Gly Leu Ala Trp Gly Leu Gly Val Leu Phe Leu Met His Val Cys 1 5 10 15 Gly Thr Asn Arg Ile Pro Glu Ser Gly Gly Asp Asn Ser Val Phe Asp 20 25 30 Ile Phe Glu Leu Thr Gly Ala Ala Arg Lys Gly 
Ser Gly Arg Arg Leu 35 40 45 Val Lys Gly Pro Asp Pro Ser Ser Pro Ala Phe Arg Ile Glu Asp Ala 50 55 60 Asn Leu Ile Pro Pro Val Pro Asp Asp Lys Phe Gln Asp Leu Val Asp 65 70 75 80 Ala Val Arg Ala Glu Lys Gly Phe Leu Leu Leu Ala Ser Leu Arg Gln 85 90 95 Met Lys Lys Thr Arg Gly Thr Leu Leu Ala Leu Glu Arg Lys Asp His 100 105 110 Ser Gly Gln Val Phe Ser Val Val Ser Asn Gly Lys Ala Gly Thr Leu 115 120 125 Asp Leu Ser Leu Thr Val Gln Gly Lys Gln His Val Val Ser Val Glu 130 135 140 Glu Ala Leu Leu Ala Thr Gly Gln Trp Lys Ser Ile Thr Leu Phe Val 145 150 155 160 Gln Glu Asp Arg Ala Gln Leu Tyr Ile Asp Cys Glu Lys Met Glu Asn 165 170 175 Ala Glu Leu Asp Val Pro Ile Gln Ser Val Phe Thr Arg Asp Leu Ala 180 185 190 Ser Ile Ala Arg Leu Arg Ile Ala Lys Gly Gly Val Asn Asp Asn Phe 195 200 205 Gln Gly Val Leu Gln Asn Val Arg Phe Val Phe Gly Thr Thr Pro Glu 210 215 220 Asp Ile Leu Arg Asn Lys Gly Cys Ser Ser Ser Thr Ser Val Leu Leu 225 230 235 240 Thr Leu Asp Asn Asn Val Val Asn Gly Ser Ser Pro Ala Ile Arg Thr 245 250 255 Asn Tyr Ile Gly His Lys Thr Lys Asp Leu Gln Ala Ile Cys Gly Ile 260 265 270 Ser Cys Asp Glu Leu Ser Ser Met Val Leu Glu Leu Arg Gly Leu Arg 275 280 285 Thr Ile Val Thr Thr Leu Gln Asp Ser Ile Arg Lys Val Thr Glu Glu 290 295 300 Asn Lys Glu Leu Ala Asn Glu Leu Arg Arg Pro Pro Leu Cys Tyr His 305 310 315 320 Asn Gly Val Gln Tyr Arg Asn Asn Glu Glu Trp Thr Val Asp Ser Cys 325 330 335 Thr Glu Cys His Cys Gln Asn Ser Val Thr Ile Cys Lys Lys Val Ser 340 345 350 Cys Pro Ile Met Pro Cys Ser Asn Ala Thr Val Pro Asp Gly Glu Cys 355 360 365 Cys Pro Arg Cys Trp Pro Ser Asp Ser Ala Asp Asp Gly Trp Ser Pro 370 375 380 Trp Ser Glu Trp Thr Ser Cys Ser Thr Ser Cys Gly Asn Gly Ile Gln 385 390 395 400 Gln Arg Gly Arg Ser Cys Asp Ser Leu Asn Asn Arg Cys Glu Gly Ser 405 410 415 Ser Val Gln Thr Arg Thr Cys His Ile Gln Glu Cys Asp Lys Arg Phe 420 425 430 Lys Gln Asp Gly Gly Trp Ser His Trp Ser Pro Trp Ser Ser Cys Ser 435 440 445 Val Thr Cys Gly Asp Gly Val Ile Thr Arg Ile Arg Leu Cys Asn 
Ser 450 455 460 Pro Ser Pro Gln Met Asn Gly Lys Pro Cys Glu Gly Glu Ala Arg Glu 465 470 475 480 Thr Lys Ala Cys Lys Lys Asp Ala Cys Pro Ile Asn Gly Gly Trp Gly 485 490 495 Pro Trp Ser Pro Trp Asp Ile Cys Ser Val Thr Cys Gly Gly Gly Val 500 505 510 Gln Lys Arg Ser Arg Leu Cys Asn Asn Pro Thr Pro Gln Phe Gly Gly 515 520 525 Lys Asp Cys Val Gly Asp Val Thr Glu Asn Gln Ile Cys Asn Lys Gln 530 535 540 Asp Cys Pro Ile Asp Gly Cys Leu Ser Asn Pro Cys Phe Ala Gly Val 545 550 555 560 Lys Cys Thr Ser Tyr Pro Asp Gly Ser Trp Lys Cys Gly Ala Cys Pro 565 570 575 Pro Gly Tyr Ser Gly Asn Gly Ile Gln Cys Thr Asp Val Asp Glu Cys 580 585 590 Lys Glu Val Pro Asp Ala Cys Phe Asn His Asn Gly Glu His Arg Cys 595 600 605 Glu Asn Thr Asp Pro Gly Tyr Asn Cys Leu Pro Cys Pro Pro Arg Phe 610 615 620 Thr Gly Ser Gln Pro Phe Gly Gln Gly Val Glu His Ala Thr Ala Asn 625 630 635 640 Lys Gln Val Cys Lys Pro Arg Asn Pro Cys Thr Asp Gly Thr His Asp 645 650 655 Cys Asn Lys Asn Ala Lys Cys Asn Tyr Leu Gly His Tyr Ser Asp Pro 660 665 670 Met Tyr Arg Cys Glu Cys Lys Pro Gly Tyr Ala Gly Asn Gly Ile Ile 675 680 685 Cys Gly Glu Asp Thr Asp Leu Asp Gly Trp Pro Asn Glu Asn Leu Val 690 695 700 Cys Val Ala Asn Ala Thr Tyr His Cys Lys Lys Asp Asn Cys Pro Asn 705 710 715 720 Leu Pro Asn Ser Gly Gln Glu Asp Tyr Asp Lys Asp Gly Ile Gly Asp 725 730 735 Ala Cys Asp Asp Asp Asp Asp Asn Asp Lys Ile Pro Asp Asp Arg Asp 740 745 750 Asn Cys Pro Phe His Tyr Asn Pro Ala Gln Tyr Asp Tyr Asp Arg Asp 755 760 765 Asp Val Gly Asp Arg Cys Asp Asn Cys Pro Tyr Asn His Asn Pro Asp 770 775 780 Gln Ala Asp Thr Asp Asn Asn Gly Glu Gly Asp Ala Cys Ala Ala Asp 785 790 795 800 Ile Asp Gly Asp Gly Ile Leu Asn Glu Arg Asp Asn Cys Gln Tyr Val 805 810 815 Tyr Asn Val Asp Gln Arg Asp Thr Asp Met Asp Gly Val Gly Asp Gln 820 825 830 Cys Asp Asn Cys Pro Leu Glu His Asn Pro Asp Gln Leu Asp Ser Asp 835 840 845 Ser Asp Arg Ile Gly Asp Thr Cys Asp Asn Asn Gln Asp Ile Asp Glu 850 855 860 Asp Gly His Gln Asn Asn Leu Asp Asn Cys Pro Tyr Val Pro Asn Ala 
865 870 875 880 Asn Gln Ala Asp His Asp Lys Asp Gly Lys Gly Asp Ala Cys Asp His 885 890 895 Asp Asp Asp Asn Asp Gly Ile Pro Asp Asp Lys Asp Asn Cys Arg Leu 900 905 910 Val Pro Asn Pro Asp Gln Lys Asp Ser Asp Gly Asp Gly Arg Gly Asp 915 920 925 Ala Cys Lys Asp Asp Phe Asp His Asp Ser Val Pro Asp Ile Asp Asp 930 935 940 Ile Cys Pro Glu Asn Val Asp Ile Ser Glu Thr Asp Phe Arg Arg Phe 945 950 955 960 Gln Met Ile Pro Leu Asp Pro Lys Gly Thr Ser Gln Asn Asp Pro Asn 965 970 975 Trp Val Val Arg His Gln Gly Lys Glu Leu Val Gln Thr Val Asn Cys 980 985 990 Asp Pro Gly Leu Ala Val Gly Tyr Asp Glu Phe Asn Ala Val Asp Phe 995 1000 1005 Ser Gly Thr Phe Phe Ile Asn Thr Glu Arg Asp Asp Asp Tyr Ala 1010 1015 1020 Gly Phe Val Phe Gly Tyr Gln Ser Ser Ser Arg Phe Tyr Val Val 1025 1030 1035 Met Trp Lys Gln Val Thr Gln Ser Tyr Trp Asp Thr Asn Pro Thr 1040 1045 1050 Arg Ala Gln Gly Tyr Ser Gly Leu Ser Val Lys Val Val Asn Ser 1055 1060 1065 Thr Thr Gly Pro Gly Glu His Leu Arg Asn Ala Leu Trp His Thr 1070 1075 1080 Gly Asn Thr Pro Gly
Gln Val Arg Thr Leu Trp His Asp Pro Arg 1085 1090 1095 His Ile Gly Trp Lys Asp Phe Thr Ala Tyr Arg Trp Arg Leu Ser 1100 1105 1110 His Arg Pro Lys Thr Gly Phe Ile Arg Val Val Met Tyr Glu Gly 1115 1120 1125 Lys Lys Ile Met Ala Asp Ser Gly Pro Ile Tyr Asp Lys Thr Tyr 1130 1135 1140 Ala Gly Gly Arg Leu Gly Leu Phe Val Phe Ser Gln Glu Met Val 1145 1150 1155 Phe Phe Ser Asp Leu Lys Tyr Glu Cys Arg Asp Pro 1160 1165 1170 121464PRTHomo sapiens 12Met Phe Ser Phe Val Asp Leu Arg Leu Leu Leu Leu Leu Ala Ala Thr 1 5 10 15 Ala Leu Leu Thr His Gly Gln Glu Glu Gly Gln Val Glu Gly Gln Asp 20 25 30 Glu Asp Ile Pro Pro Ile Thr Cys Val Gln Asn Gly Leu Arg Tyr His 35 40 45 Asp Arg Asp Val Trp Lys Pro Glu Pro Cys Arg Ile Cys Val Cys Asp 50 55 60 Asn Gly Lys Val Leu Cys Asp Asp Val Ile Cys Asp Glu Thr Lys Asn 65 70 75 80 Cys Pro Gly Ala Glu Val Pro Glu Gly Glu Cys Cys Pro Val Cys Pro 85 90 95 Asp Gly Ser Glu Ser Pro Thr Asp Gln Glu Thr Thr Gly Val Glu Gly 100 105 110 Pro Lys Gly Asp Thr Gly Pro Arg Gly Pro Arg Gly Pro Ala Gly Pro 115 120 125 Pro Gly Arg Asp Gly Ile Pro Gly Gln Pro Gly Leu Pro Gly Pro Pro 130 135 140 Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Leu Gly Gly Asn Phe Ala 145 150 155 160 Pro Gln Leu Ser Tyr Gly Tyr Asp Glu Lys Ser Thr Gly Gly Ile Ser 165 170 175 Val Pro Gly Pro Met Gly Pro Ser Gly Pro Arg Gly Leu Pro Gly Pro 180 185 190 Pro Gly Ala Pro Gly Pro Gln Gly Phe Gln Gly Pro Pro Gly Glu Pro 195 200 205 Gly Glu Pro Gly Ala Ser Gly Pro Met Gly Pro Arg Gly Pro Pro Gly 210 215 220 Pro Pro Gly Lys Asn Gly Asp Asp Gly Glu Ala Gly Lys Pro Gly Arg 225 230 235 240 Pro Gly Glu Arg Gly Pro Pro Gly Pro Gln Gly Ala Arg Gly Leu Pro 245 250 255 Gly Thr Ala Gly Leu Pro Gly Met Lys Gly His Arg Gly Phe Ser Gly 260 265 270 Leu Asp Gly Ala Lys Gly Asp Ala Gly Pro Ala Gly Pro Lys Gly Glu 275 280 285 Pro Gly Ser Pro Gly Glu Asn Gly Ala Pro Gly Gln Met Gly Pro Arg 290 295 300 Gly Leu Pro Gly Glu Arg Gly Arg Pro Gly Ala Pro Gly Pro Ala Gly 305 310 315 320 Ala Arg Gly Asn Asp Gly Ala Thr Gly Ala 
Ala Gly Pro Pro Gly Pro 325 330 335 Thr Gly Pro Ala Gly Pro Pro Gly Phe Pro Gly Ala Val Gly Ala Lys 340 345 350 Gly Glu Ala Gly Pro Gln Gly Pro Arg Gly Ser Glu Gly Pro Gln Gly 355 360 365 Val Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Ala Ala Gly Pro 370 375 380 Ala Gly Asn Pro Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Ala Asn 385 390 395 400 Gly Ala Pro Gly Ile Ala Gly Ala Pro Gly Phe Pro Gly Ala Arg Gly 405 410 415 Pro Ser Gly Pro Gln Gly Pro Gly Gly Pro Pro Gly Pro Lys Gly Asn 420 425 430 Ser Gly Glu Pro Gly Ala Pro Gly Ser Lys Gly Asp Thr Gly Ala Lys 435 440 445 Gly Glu Pro Gly Pro Val Gly Val Gln Gly Pro Pro Gly Pro Ala Gly 450 455 460 Glu Glu Gly Lys Arg Gly Ala Arg Gly Glu Pro Gly Pro Thr Gly Leu 465 470 475 480 Pro Gly Pro Pro Gly Glu Arg Gly Gly Pro Gly Ser Arg Gly Phe Pro 485 490 495 Gly Ala Asp Gly Val Ala Gly Pro Lys Gly Pro Ala Gly Glu Arg Gly 500 505 510 Ser Pro Gly Pro Ala Gly Pro Lys Gly Ser Pro Gly Glu Ala Gly Arg 515 520 525 Pro Gly Glu Ala Gly Leu Pro Gly Ala Lys Gly Leu Thr Gly Ser Pro 530 535 540 Gly Ser Pro Gly Pro Asp Gly Lys Thr Gly Pro Pro Gly Pro Ala Gly 545 550 555 560 Gln Asp Gly Arg Pro Gly Pro Pro Gly Pro Pro Gly Ala Arg Gly Gln 565 570 575 Ala Gly Val Met Gly Phe Pro Gly Pro Lys Gly Ala Ala Gly Glu Pro 580 585 590 Gly Lys Ala Gly Glu Arg Gly Val Pro Gly Pro Pro Gly Ala Val Gly 595 600 605 Pro Ala Gly Lys Asp Gly Glu Ala Gly Ala Gln Gly Pro Pro Gly Pro 610 615 620 Ala Gly Pro Ala Gly Glu Arg Gly Glu Gln Gly Pro Ala Gly Ser Pro 625 630 635 640 Gly Phe Gln Gly Leu Pro Gly Pro Ala Gly Pro Pro Gly Glu Ala Gly 645 650 655 Lys Pro Gly Glu Gln Gly Val Pro Gly Asp Leu Gly Ala Pro Gly Pro 660 665 670 Ser Gly Ala Arg Gly Glu Arg Gly Phe Pro Gly Glu Arg Gly Val Gln 675 680 685 Gly Pro Pro Gly Pro Ala Gly Pro Arg Gly Ala Asn Gly Ala Pro Gly 690 695 700 Asn Asp Gly Ala Lys Gly Asp Ala Gly Ala Pro Gly Ala Pro Gly Ser 705 710 715 720 Gln Gly Ala Pro Gly Leu Gln Gly Met Pro Gly Glu Arg Gly Ala Ala 725 730 735 Gly Leu Pro Gly Pro Lys Gly Asp Arg Gly Asp 
Ala Gly Pro Lys Gly 740 745 750 Ala Asp Gly Ser Pro Gly Lys Asp Gly Val Arg Gly Leu Thr Gly Pro 755 760 765 Ile Gly Pro Pro Gly Pro Ala Gly Ala Pro Gly Asp Lys Gly Glu Ser 770 775 780 Gly Pro Ser Gly Pro Ala Gly Pro Thr Gly Ala Arg Gly Ala Pro Gly 785 790 795 800 Asp Arg Gly Glu Pro Gly Pro Pro Gly Pro Ala Gly Phe Ala Gly Pro 805 810 815 Pro Gly Ala Asp Gly Gln Pro Gly Ala Lys Gly Glu Pro Gly Asp Ala 820 825 830 Gly Ala Lys Gly Asp Ala Gly Pro Pro Gly Pro Ala Gly Pro Ala Gly 835 840 845 Pro Pro Gly Pro Ile Gly Asn Val Gly Ala Pro Gly Ala Lys Gly Ala 850 855 860 Arg Gly Ser Ala Gly Pro Pro Gly Ala Thr Gly Phe Pro Gly Ala Ala 865 870 875 880 Gly Arg Val Gly Pro Pro Gly Pro Ser Gly Asn Ala Gly Pro Pro Gly 885 890 895 Pro Pro Gly Pro Ala Gly Lys Glu Gly Gly Lys Gly Pro Arg Gly Glu 900 905 910 Thr Gly Pro Ala Gly Arg Pro Gly Glu Val Gly Pro Pro Gly Pro Pro 915 920 925 Gly Pro Ala Gly Glu Lys Gly Ser Pro Gly Ala Asp Gly Pro Ala Gly 930 935 940 Ala Pro Gly Thr Pro Gly Pro Gln Gly Ile Ala Gly Gln Arg Gly Val 945 950 955 960 Val Gly Leu Pro Gly Gln Arg Gly Glu Arg Gly Phe Pro Gly Leu Pro 965 970 975 Gly Pro Ser Gly Glu Pro Gly Lys Gln Gly Pro Ser Gly Ala Ser Gly 980 985 990 Glu Arg Gly Pro Pro Gly Pro Met Gly Pro Pro Gly Leu Ala Gly Pro 995 1000 1005 Pro Gly Glu Ser Gly Arg Glu Gly Ala Pro Gly Ala Glu Gly Ser 1010 1015 1020 Pro Gly Arg Asp Gly Ser Pro Gly Ala Lys Gly Asp Arg Gly Glu 1025 1030 1035 Thr Gly Pro Ala Gly Pro Pro Gly Ala Pro Gly Ala Pro Gly Ala 1040 1045 1050 Pro Gly Pro Val Gly Pro Ala Gly Lys Ser Gly Asp Arg Gly Glu 1055 1060 1065 Thr Gly Pro Ala Gly Pro Thr Gly Pro Val Gly Pro Val Gly Ala 1070 1075 1080 Arg Gly Pro Ala Gly Pro Gln Gly Pro Arg Gly Asp Lys Gly Glu 1085 1090 1095 Thr Gly Glu Gln Gly Asp Arg Gly Ile Lys Gly His Arg Gly Phe 1100 1105 1110 Ser Gly Leu Gln Gly Pro Pro Gly Pro Pro Gly Ser Pro Gly Glu 1115 1120 1125 Gln Gly Pro Ser Gly Ala Ser Gly Pro Ala Gly Pro Arg Gly Pro 1130 1135 1140 Pro Gly Ser Ala Gly Ala Pro Gly Lys Asp Gly Leu Asn Gly Leu 
1145 1150 1155 Pro Gly Pro Ile Gly Pro Pro Gly Pro Arg Gly Arg Thr Gly Asp 1160 1165 1170 Ala Gly Pro Val Gly Pro Pro Gly Pro Pro Gly Pro Pro Gly Pro 1175 1180 1185 Pro Gly Pro Pro Ser Ala Gly Phe Asp Phe Ser Phe Leu Pro Gln 1190 1195 1200 Pro Pro Gln Glu Lys Ala His Asp Gly Gly Arg Tyr Tyr Arg Ala 1205 1210 1215 Asp Asp Ala Asn Val Val Arg Asp Arg Asp Leu Glu Val Asp Thr 1220 1225 1230 Thr Leu Lys Ser Leu Ser Gln Gln Ile Glu Asn Ile Arg Ser Pro 1235 1240 1245 Glu Gly Ser Arg Lys Asn Pro Ala Arg Thr Cys Arg Asp Leu Lys 1250 1255 1260 Met Cys His Ser Asp Trp Lys Ser Gly Glu Tyr Trp Ile Asp Pro 1265 1270 1275 Asn Gln Gly Cys Asn Leu Asp Ala Ile Lys Val Phe Cys Asn Met 1280 1285 1290 Glu Thr Gly Glu Thr Cys Val Tyr Pro Thr Gln Pro Ser Val Ala 1295 1300 1305 Gln Lys Asn Trp Tyr Ile Ser Lys Asn Pro Lys Asp Lys Arg His 1310 1315 1320 Val Trp Phe Gly Glu Ser Met Thr Asp Gly Phe Gln Phe Glu Tyr 1325 1330 1335 Gly Gly Gln Gly Ser Asp Pro Ala Asp Val Ala Ile Gln Leu Thr 1340 1345 1350 Phe Leu Arg Leu Met Ser Thr Glu Ala Ser Gln Asn Ile Thr Tyr 1355 1360 1365 His Cys Lys Asn Ser Val Ala Tyr Met Asp Gln Gln Thr Gly Asn 1370 1375 1380 Leu Lys Lys Ala Leu Leu Leu Gln Gly Ser Asn Glu Ile Glu Ile 1385 1390 1395 Arg Ala Glu Gly Asn Ser Arg Phe Thr Tyr Ser Val Thr Val Asp 1400 1405 1410 Gly Cys Thr Ser His Thr Gly Ala Trp Gly Lys Thr Val Ile Glu 1415 1420 1425 Tyr Lys Thr Thr Lys Thr Ser Arg Leu Pro Ile Ile Asp Val Ala 1430 1435 1440 Pro Leu Asp Val Gly Ala Pro Asp Gln Glu Phe Gly Phe Asp Val 1445 1450 1455 Gly Pro Val Cys Phe Leu 1460 131257DNAHomo sapiens 13atgcaggccc tggtgctact cctctgcatt ggagccctcc tcgggcacag cagctgccag 60aaccctgcca gccccccgga ggagggctcc ccagaccccg acagcacagg ggcgctggtg 120gaggaggagg atcctttctt caaagtcccc gtgaacaagc tggcagcggc tgtctccaac 180ttcggctatg acctgtaccg ggtgcgatcc agcacgagcc ccacgaccaa cgtgctcctg 240tctcctctca gtgtggccac ggccctctcg gccctctcgc tgggagcgga gcagcgaaca 300gaatccatca ttcaccgggc tctctactat gacttgatca gcagcccaga catccatggt 360acctataagg 
agctccttga cacggtcact gccccccaga agaacctcaa gagtgcctcc 420cggatcgtct ttgagaagaa gctgcgcata aaatccagct ttgtggcacc tctggaaaag 480tcatatggga ccaggcccag agtcctgacg ggcaaccctc gcttggacct gcaagagatc 540aacaactggg tgcaggcgca gatgaaaggg aagctcgcca ggtccacaaa ggaaattccc 600gatgagatca gcattctcct tctcggtgtg gcgcacttca aggggcagtg ggtaacaaag 660tttgactcca gaaagacttc cctcgaggat ttctacttgg atgaagagag gaccgtgagg 720gtccccatga tgtcggaccc taaggctgtt ttacgctatg gcttggattc agatctcagc 780tgcaagattg cccagctgcc cttgaccgga agcatgagta tcatcttctt cctgcccctg 840aaagtgaccc agaatttgac cttgatagag gagagcctca cctccgagtt cattcatgac 900atagaccgag aactgaagac cgtgcaggcg gtcctcactg tccccaagct gaagctgagt 960tatgaaggcg aagtcaccaa gtccctgcag gagatgaagc tgcaatcctt gtttgattca 1020ccagacttta gcaagatcac aggcaaaccc atcaagctga ctcaggtgga acaccgggct 1080ggctttgagt ggaacgagga tggggcggga accaccccca gcccagggct gcagcctgcc 1140cacctcacct tcccgctgga ctatcacctt aaccagcctt tcatcttcgt actgagggac 1200acagacacag gggcccttct cttcattggc aagattctgg accccagggg cccctaa 1257142100DNAHomo sapiens 14atgaggtggc tgcttctcta ttatgctctg tgcttctccc tgtcaaaggc ttcagcccac 60accgtggagc taaacaatat gtttggccag atccagtcgc ctggttatcc agactcctat 120cccagtgatt cagaggtgac ttggaatatc actgtcccag atgggtttcg gatcaagctt 180tacttcatgc acttcaactt ggaatcctcc tacctttgtg aatatgacta tgtgaaggta 240gaaactgagg accaggtgct ggcaaccttc tgtggcaggg agaccacaga cacagagcag 300actcccggcc aggaggtggt cctctcccct ggctccttca tgtccatcac tttccggtca 360gatttctcca atgaggagcg tttcacaggc tttgatgccc actacatggc tgtggatgtg 420gacgagtgca aggagaggga ggacgaggag ctgtcctgtg accactactg ccacaactac 480attggcggct actactgctc ctgccgcttc ggctacatcc tccacacaga caacaggacc 540tgccgagtgg agtgcagtga caacctcttc actcaaagga ctggggtgat caccagccct 600gacttcccaa acccttaccc caagagctct gaatgcctgt ataccatcga gctggaggag 660ggtttcatgg tcaacctgca gtttgaggac atatttgaca ttgaggacca tcctgaggtg 720ccctgcccct atgactacat caagatcaaa gttggtccaa aagttttggg gcctttctgt 780ggagagaaag ccccagaacc catcagcacc cagagccaca 
gtgtcctgat cctgttccat 840agtgacaact cgggagagaa ccggggctgg aggctctcat acagggctgc aggaaatgag 900tgcccagagc tacagcctcc tgtccatggg aaaatcgagc cctcccaagc caagtatttc 960ttcaaagacc aagtgctcgt cagctgtgac acaggctaca aagtgctgaa ggataatgtg 1020gagatggaca cattccagat tgagtgtctg aaggatggga cgtggagtaa caagattccc 1080acctgtaaaa ttgtagactg tagagcccca ggagagctgg aacacgggct gatcaccttc 1140tctacaagga acaacctcac cacatacaag tctgagatca aatactcctg tcaggagccc 1200tattacaaga tgctcaacaa taacacaggt atatatacct gttctgccca aggagtctgg 1260atgaataaag tattggggag aagcctaccc acctgccttc cagtgtgtgg gctccccaag 1320ttctcccgga agctgatggc caggatcttc aatggacgcc cagcccagaa aggcaccact 1380ccctggattg ccatgctgtc acacctgaat gggcagccct tctgcggagg ctcccttcta 1440ggctccagct ggatcgtgac cgccgcacac tgcctccacc agtcactcga tccggaagat 1500ccgaccctac gtgattcaga cttgctcagc ccttctgact tcaaaatcat cctgggcaag 1560cattggaggc tccggtcaga tgaaaatgaa cagcatctcg gcgtcaaaca caccactctc 1620cacccccagt atgatcccaa cacattcgag aatgacgtgg ctctggtgga gctgttggag 1680agcccagtgc tgaatgcctt cgtgatgccc atctgtctgc ctgagggacc ccagcaggaa 1740ggagccatgg tcatcgtcag cggctggggg aagcagttct tgcaaaggtt cccagagacc 1800ctgatggaga ttgaaatccc gattgttgac cacagcacct gccagaaggc ttatgccccg 1860ctgaagaaga aagtgaccag ggacatgatc tgtgctgggg agaaggaagg gggaaaggac 1920gcctgtgcgg gtgactctgg aggccccatg gtgaccctga atagagaaag aggccagtgg 1980tacctggtgg gcactgtgtc ctggggtgat gactgtggga agaaggaccg ctacggagta 2040tactcttaca tccaccacaa caaggactgg atccagaggg tcaccggagt gaggaactga 2100152349DNAHomo sapiens 15atggctccgc accgccccgc gcccgcgctg ctttgcgcgc tgtccctggc gctgtgcgcg 60ctgtcgctgc ccgtccgcgc ggccactgcg tcgcgggggg cgtcccaggc gggggcgccc 120caggggcggg tgcccgaggc gcggcccaac agcatggtgg tggaacaccc cgagttcctc 180aaggcaggga aggagcctgg cctgcagatc tggcgtgtgg agaagttcga tctggtgccc 240gtgcccacca acctttatgg agacttcttc acgggcgacg cctacgtcat cctgaagaca 300gtgcagctga ggaacggaaa tctgcagtat gacctccact actggctggg caatgagtgc 360agccaggatg agagcggggc ggccgccatc tttaccgtgc agctggatga ctacctgaac 
420ggccgggccg tgcagcaccg tgaggtccag ggcttcgagt cggccacctt cctaggctac 480ttcaagtctg gcctgaagta caagaaagga ggtgtggcat caggattcaa gcacgtggta 540cccaacgagg tggtggtgca gagactcttc caggtcaaag ggcggcgtgt ggtccgtgcc 600accgaggtac ctgtgtcctg ggagagcttc aacaatggcg actgcttcat cctggacctg 660ggcaacaaca tccaccagtg gtgtggttcc aacagcaatc ggtatgaaag actgaaggcc 720acacaggtgt ccaagggcat ccgggacaac gagcggagtg gccgggcccg agtgcacgtg 780tctgaggagg gcactgagcc cgaggcgatg ctccaggtgc tgggccccaa gccggctctg 840cctgcaggta ccgaggacac cgccaaggag gatgcggcca accgcaagct ggccaagctc 900tacaaggtct ccaatggtgc agggaccatg tccgtctccc tcgtggctga tgagaacccc 960ttcgcccagg gggccctgaa gtcagaggac tgcttcatcc tggaccacgg caaagatggg 1020aaaatctttg tctggaaagg caagcaggca aacacggagg agaggaaggc tgccctcaaa 1080acagcctctg acttcatcac caagatggac taccccaagc agactcaggt ctcggtcctt 1140cctgagggcg gtgagacccc actgttcaag cagttcttca agaactggcg ggacccagac 1200cagacagatg gcctgggctt gtcctacctt tccagccata tcgccaacgt ggagcgggtg 1260cccttcgacg ccgccaccct gcacacctcc actgccatgg ccgcccagca cggcatggat 1320gacgatggca
caggccagaa acagatctgg agaatcgaag gttccaacaa ggtgcccgtg 1380gaccctgcca catatggaca gttctatgga ggcgacagct acatcattct gtacaactac 1440cgccatggtg gccgccaggg gcagataatc tataactggc agggtgccca gtctacccag 1500gatgaggtcg ctgcatctgc catcctgact gctcagctgg atgaggagct gggaggtacc 1560cctgtccaga gccgtgtggt ccaaggcaag gagcccgccc acctcatgag cctgtttggt 1620gggaagccca tgatcatcta caagggcggc acctcccgcg agggcgggca gacagcccct 1680gccagcaccc gcctcttcca ggtccgcgcc aacagcgctg gagccacccg ggctgttgag 1740gtattgccta aggctggtgc actgaactcc aacgatgcct ttgttctgaa aaccccctca 1800gccgcctacc tgtgggtggg tacaggagcc agcgaggcag agaagacggg ggcccaggag 1860ctgctcaggg tgctgcgggc ccaacctgtg caggtggcag aaggcagcga gccagatggc 1920ttctgggagg ccctgggcgg gaaggctgcc taccgcacat ccccacggct gaaggacaag 1980aagatggatg cccatcctcc tcgcctcttt gcctgctcca acaagattgg acgttttgtg 2040atcgaagagg ttcctggtga gctcatgcag gaagacctgg caacggatga cgtcatgctt 2100ctggacacct gggaccaggt ctttgtctgg gttggaaagg attctcaaga agaagaaaag 2160acagaagcct tgacttctgc taagcggtac atcgagacgg acccagccaa tcgggatcgg 2220cggacgccca tcaccgtggt gaagcaaggc tttgagcctc cctcctttgt gggctggttc 2280cttggctggg atgatgatta ctggtctgtg gaccccttgg acagggccat ggctgagctg 2340gctgcctga 2349161017DNAHomo sapiens 16atgagtctaa gtgcatttac tctcttcctg gcattgattg gtggtaccag tggccagtac 60tatgattatg attttcccct atcaatttat gggcaatcat caccaaactg tgcaccagaa 120tgtaactgcc ctgaaagcta cccaagtgcc atgtactgtg atgagctgaa attgaaaagt 180gtaccaatgg tgcctcctgg aatcaagtat ctttacctta ggaataacca gattgaccat 240attgatgaaa aggcctttga gaatgtaact gatctgcagt ggctcattct agatcacaac 300cttctagaaa actccaagat aaaagggaga gttttctcta aattgaaaca actgaagaag 360ctgcatataa accacaacaa cctgacagag tctgtgggcc cacttcccaa atctctggag 420gatctgcagc ttactcataa caagatcaca aagctgggct cttttgaagg attggtaaac 480ctgaccttca tccatctcca gcacaatcgg ctgaaagagg atgctgtttc agctgctttt 540aaaggtctta aatcactcga ataccttgac ttgagcttca atcagatagc cagactgcct 600tctggtctcc ctgtctctct tctaactctc tacttagaca acaataagat cagcaacatc 660cctgatgagt atttcaagcg 
ttttaatgca ttgcagtatc tgcgtttatc tcacaacgaa 720ctggctgata gtggaatacc tggaaattct ttcaatgtgt catccctggt tgagctggat 780ctgtcctata acaagcttaa aaacatacca actgtcaatg aaaaccttga aaactattac 840ctggaggtca atcaacttga gaagtttgac ataaagagct tctgcaagat cctggggcca 900ttatcctact ccaagatcaa gcatttgcgt ttggatggca atcgcatctc agaaaccagt 960cttccaccgg atatgtatga atgtctacgt gttgctaacg aagtcactct taattaa 1017173366DNAHomo sapiens 17atgagcaaac tcagaatggt gctacttgaa gactctggat ctgctgactt cagaagacat 60tttgtcaact tgagtccctt caccattact gtggtcttac ttctcagtgc ctgttttgtc 120accagttctc ttggaggaac agacaaggag ctgaggctag tggatggtga aaacaagtgt 180agcgggagag tggaagtgaa agtccaggag gagtggggaa cggtgtgtaa taatggctgg 240agcatggaag cggtctctgt gatttgtaac cagctgggat gtccaactgc tatcaaagcc 300cctggatggg ctaattccag tgcaggttct ggacgcattt ggatggatca tgtttcttgt 360cgtgggaatg agtcagctct ttgggattgc aaacatgatg gatggggaaa gcatagtaac 420tgtactcacc aacaagatgc tggagtgacc tgctcagatg gatccaattt ggaaatgagg 480ctgacgcgtg gagggaatat gtgttctgga agaatagaga tcaaattcca aggacggtgg 540ggaacagtgt gtgatgataa cttcaacata gatcatgcat ctgtcatttg tagacaactt 600gaatgtggaa gtgctgtcag tttctctggt tcatctaatt ttggagaagg ctctggacca 660atctggtttg atgatcttat atgcaacgga aatgagtcag ctctctggaa ctgcaaacat 720caaggatggg gaaagcataa ctgtgatcat gctgaggatg ctggagtgat ttgctcaaag 780ggagcagatc tgagcctgag actggtagat ggagtcactg aatgttcagg aagattagaa 840gtgagattcc aaggagaatg ggggacaata tgtgatgacg gctgggacag ttacgatgct 900gctgtggcat gcaagcaact gggatgtcca actgccgtca cagccattgg tcgagttaac 960gccagtaagg gatttggaca catctggctt gacagcgttt cttgccaggg acatgaacct 1020gctatctggc aatgtaaaca ccatgaatgg ggaaagcatt attgcaatca caatgaagat 1080gctggcgtga catgttctga tggatcagat ctggagctaa gacttagagg tggaggcagc 1140cgctgtgctg ggacagttga ggtggagatt cagagactgt tagggaaggt gtgtgacaga 1200ggctggggac tgaaagaagc tgatgtggtt tgcaggcagc tgggatgtgg atctgcactc 1260aaaacatctt atcaagtgta ctccaaaatc caggcaacaa acacatggct gtttctaagt 1320agctgtaacg gaaatgaaac ttctctttgg gactgcaaga actggcaatg 
gggtggactt 1380acctgtgatc actatgaaga agccaaaatt acctgctcag cccacaggga acccagactg 1440gttggagggg acattccctg ttctggacgt gttgaagtga agcatggtga cacgtggggc 1500tccatctgtg attcggactt ctctctggaa gctgccagcg ttctatgcag ggaattacag 1560tgtggcacag ttgtctctat cctgggggga gctcactttg gagagggaaa tggacagatc 1620tgggctgaag aattccagtg tgagggacat gagtcccatc tttcactctg cccagtagca 1680ccccgcccag aaggaacttg tagccacagc agggatgttg gagtagtctg ctcaagatac 1740acagaaattc gcttggtgaa tggcaagacc ccgtgtgagg gcagagtgga gctcaaaacg 1800cttggtgcct ggggatccct ctgtaactct cactgggaca tagaagatgc ccatgttctt 1860tgccagcagc ttaaatgtgg agttgccctt tctaccccag gaggagcacg ttttggaaaa 1920ggaaatggtc agatctggag gcatatgttt cactgcactg ggactgagca gcacatggga 1980gattgtcctg taactgctct aggtgcttca ttatgtcctt cagagcaagt ggcctctgta 2040atctgctcag gaaaccagtc ccaaacactg tcctcgtgca attcatcgtc tttgggccca 2100acaaggccta ccattccaga agaaagtgct gtggcctgca tagagagtgg tcaacttcgc 2160ctggtaaatg gaggaggtcg ctgtgctggg agagtagaga tctatcatga gggctcctgg 2220ggcaccatct gtgatgacag ctgggacctg agtgatgccc acgtggtttg cagacagctg 2280ggctgtggag aggccattaa tgccactggt tctgctcatt ttggggaagg aacagggccc 2340atctggctgg atgagatgaa atgcaatgga aaagaatccc gcatttggca gtgccattca 2400cacggctggg ggcagcaaaa ttgcaggcac aaggaggatg cgggagttat ctgctcagaa 2460ttcatgtctc tgagactgac cagtgaagcc agcagagagg cctgtgcagg gcgtctggaa 2520gttttttaca atggagcttg gggcactgtt ggcaagagta gcatgtctga aaccactgtg 2580ggtgtggtgt gcaggcagct gggctgtgca gacaaaggga aaatcaaccc tgcatcttta 2640gacaaggcca tgtccattcc catgtgggtg gacaatgttc agtgtccaaa aggacctgac 2700acgctgtggc agtgcccatc atctccatgg gagaagagac tggccagccc ctcggaggag 2760acctggatca catgtgacaa caagataaga cttcaggaag gacccacttc ctgttctgga 2820cgtgtggaga tctggcatgg aggttcctgg gggacagtgt gtgatgactc ttgggacttg 2880gacgatgctc aggtggtgtg tcaacaactt ggctgtggtc cagctttgaa agcattcaaa 2940gaagcagagt ttggtcaggg gactggaccg atatggctca atgaagtgaa gtgcaaaggg 3000aatgagtctt ccttgtggga ttgtcctgcc agacgctggg gccatagtga gtgtgggcac 3060aaggaagacg ctgcagtgaa 
ttgcacagat atttcagtgc agaaaacccc acaaaaagcc 3120acaacaggtc gctcatcccg tcagtcatcc tttattgcag tcgggatcct tggggttgtt 3180ctgttggcca ttttcgtcgc attattcttc ttgactaaaa agcgaagaca gagacagcgg 3240cttgcagttt cctcaagagg agagaactta gtccaccaaa ttcaataccg ggagatgaat 3300tcttgcctga atgcagatga tctggaccta atgaattcct caggaggcca ttctgagcca 3360cactga 3366181620DNAHomo sapiens 18atgaagccgg cggcgcggga ggcgcggctg cctccgcgct cgcccgggct gcgctgggcg 60ctgccgctgc tgctgctgct gctgcgcctg ggccagatcc tgtgcgcagg tggcacccct 120agtccaattc ctgacccttc agtagcaact gttgccacag gggaaaatgg cataacgcag 180atcagcagta cagcagaatc ctttcataaa cagaatggaa ctggaacacc tcaggtggaa 240acaaacacca gtgaggatgg tgaaagctct ggagccaacg atagtttaag aacacctgaa 300caaggatcta atgggactga tggggcatct caaaaaactc ccagtagcac tgggcccagt 360cctgtgtttg acattaaagc tgtttccatc agtccaacca atgtgatctt aacttggaaa 420agtaatgaca cagctgcttc tgagtacaag tatgtagtaa agcataagat ggaaaatgag 480aagacaatta ctgttgtgca tcaaccatgg tgtaacatca caggcttacg tccagcgact 540tcatatgtat tctccatcac tccaggaata ggcaatgaga cttggggaga tcccagagtc 600ataaaagtca tcacagagcc gatcccagtt tctgatctcc gtgttgccct cacgggtgtg 660aggaaggctg ctctctcctg gagcaatggc aatggcactg cctcctgccg ggttcttctt 720gaaagcattg gaagccatga ggagttgact caagactcaa gacttcaggt caatatctcg 780ggcctgaagc caggggttca atacaacatc aacccgtatc ttctacaatc aaataagaca 840aagggagacc ccttgggcac agaaggtggc ttggatgcca gcaatacaga gagaagccgg 900gcagggagcc ccaccgcccc tgtgcatgat gagtccctcg tgggacctgt ggacccatcc 960tccggccagc agtcccgaga cacggaagtc ctgcttgtcg ggttagagcc tggcacccga 1020tacaatgcca ccgtttattc ccaagcagcg aatggcacag aaggacagcc ccaggccata 1080gagttcagga caaatgctat tcaggttttt gacgtcaccg ctgtgaacat cagtgccaca 1140agcctgaccc tgatctggaa agtcagcgat aacgagtcgt catctaacta tacctacaag 1200atacatgtgg cgggggagac agattcttcc aatctcaacg tcagtgagcc tcgcgctgtc 1260atccccggac tccgctccag caccttctac aacatcacag tgtgtcctgt cctaggtgac 1320atcgagggca cgccgggctt cctccaagtg cacacccccc ctgttccagt ttctgacttc 1380cgagtgacag tggtcagcac gacggagatc ggcttagcat 
ggagcagcca tgatgcagaa 1440tcatttcaga tgcatatcac acaggaggga gctggcaatt ctcgggtaga aataaccacc 1500aaccaaagta ttatcattgg tggcttgttc cctggaacca agtattgctt tgaaatagtt 1560ccaaaaggac caaatgggac tgaaggggca tctcggacag tttgcaatag aactggatga 162019418PRTHomo sapiens 19Met Gln Ala Leu Val Leu Leu Leu Cys Ile Gly Ala Leu Leu Gly His 1 5 10 15 Ser Ser Cys Gln Asn Pro Ala Ser Pro Pro Glu Glu Gly Ser Pro Asp 20 25 30 Pro Asp Ser Thr Gly Ala Leu Val Glu Glu Glu Asp Pro Phe Phe Lys 35 40 45 Val Pro Val Asn Lys Leu Ala Ala Ala Val Ser Asn Phe Gly Tyr Asp 50 55 60 Leu Tyr Arg Val Arg Ser Ser Thr Ser Pro Thr Thr Asn Val Leu Leu 65 70 75 80 Ser Pro Leu Ser Val Ala Thr Ala Leu Ser Ala Leu Ser Leu Gly Ala 85 90 95 Glu Gln Arg Thr Glu Ser Ile Ile His Arg Ala Leu Tyr Tyr Asp Leu 100 105 110 Ile Ser Ser Pro Asp Ile His Gly Thr Tyr Lys Glu Leu Leu Asp Thr 115 120 125 Val Thr Ala Pro Gln Lys Asn Leu Lys Ser Ala Ser Arg Ile Val Phe 130 135 140 Glu Lys Lys Leu Arg Ile Lys Ser Ser Phe Val Ala Pro Leu Glu Lys 145 150 155 160 Ser Tyr Gly Thr Arg Pro Arg Val Leu Thr Gly Asn Pro Arg Leu Asp 165 170 175 Leu Gln Glu Ile Asn Asn Trp Val Gln Ala Gln Met Lys Gly Lys Leu 180 185 190 Ala Arg Ser Thr Lys Glu Ile Pro Asp Glu Ile Ser Ile Leu Leu Leu 195 200 205 Gly Val Ala His Phe Lys Gly Gln Trp Val Thr Lys Phe Asp Ser Arg 210 215 220 Lys Thr Ser Leu Glu Asp Phe Tyr Leu Asp Glu Glu Arg Thr Val Arg 225 230 235 240 Val Pro Met Met Ser Asp Pro Lys Ala Val Leu Arg Tyr Gly Leu Asp 245 250 255 Ser Asp Leu Ser Cys Lys Ile Ala Gln Leu Pro Leu Thr Gly Ser Met 260 265 270 Ser Ile Ile Phe Phe Leu Pro Leu Lys Val Thr Gln Asn Leu Thr Leu 275 280 285 Ile Glu Glu Ser Leu Thr Ser Glu Phe Ile His Asp Ile Asp Arg Glu 290 295 300 Leu Lys Thr Val Gln Ala Val Leu Thr Val Pro Lys Leu Lys Leu Ser 305 310 315 320 Tyr Glu Gly Glu Val Thr Lys Ser Leu Gln Glu Met Lys Leu Gln Ser 325 330 335 Leu Phe Asp Ser Pro Asp Phe Ser Lys Ile Thr Gly Lys Pro Ile Lys 340 345 350 Leu Thr Gln Val Glu His Arg Ala Gly Phe Glu Trp Asn Glu Asp Gly 355 
360 365 Ala Gly Thr Thr Pro Ser Pro Gly Leu Gln Pro Ala His Leu Thr Phe 370 375 380 Pro Leu Asp Tyr His Leu Asn Gln Pro Phe Ile Phe Val Leu Arg Asp 385 390 395 400 Thr Asp Thr Gly Ala Leu Leu Phe Ile Gly Lys Ile Leu Asp Pro Arg 405 410 415 Gly Pro 20699PRTHomo sapiens 20Met Arg Trp Leu Leu Leu Tyr Tyr Ala Leu Cys Phe Ser Leu Ser Lys 1 5 10 15 Ala Ser Ala His Thr Val Glu Leu Asn Asn Met Phe Gly Gln Ile Gln 20 25 30 Ser Pro Gly Tyr Pro Asp Ser Tyr Pro Ser Asp Ser Glu Val Thr Trp 35 40 45 Asn Ile Thr Val Pro Asp Gly Phe Arg Ile Lys Leu Tyr Phe Met His 50 55 60 Phe Asn Leu Glu Ser Ser Tyr Leu Cys Glu Tyr Asp Tyr Val Lys Val 65 70 75 80 Glu Thr Glu Asp Gln Val Leu Ala Thr Phe Cys Gly Arg Glu Thr Thr 85 90 95 Asp Thr Glu Gln Thr Pro Gly Gln Glu Val Val Leu Ser Pro Gly Ser 100 105 110 Phe Met Ser Ile Thr Phe Arg Ser Asp Phe Ser Asn Glu Glu Arg Phe 115 120 125 Thr Gly Phe Asp Ala His Tyr Met Ala Val Asp Val Asp Glu Cys Lys 130 135 140 Glu Arg Glu Asp Glu Glu Leu Ser Cys Asp His Tyr Cys His Asn Tyr 145 150 155 160 Ile Gly Gly Tyr Tyr Cys Ser Cys Arg Phe Gly Tyr Ile Leu His Thr 165 170 175 Asp Asn Arg Thr Cys Arg Val Glu Cys Ser Asp Asn Leu Phe Thr Gln 180 185 190 Arg Thr Gly Val Ile Thr Ser Pro Asp Phe Pro Asn Pro Tyr Pro Lys 195 200 205 Ser Ser Glu Cys Leu Tyr Thr Ile Glu Leu Glu Glu Gly Phe Met Val 210 215 220 Asn Leu Gln Phe Glu Asp Ile Phe Asp Ile Glu Asp His Pro Glu Val 225 230 235 240 Pro Cys Pro Tyr Asp Tyr Ile Lys Ile Lys Val Gly Pro Lys Val Leu 245 250 255 Gly Pro Phe Cys Gly Glu Lys Ala Pro Glu Pro Ile Ser Thr Gln Ser 260 265 270 His Ser Val Leu Ile Leu Phe His Ser Asp Asn Ser Gly Glu Asn Arg 275 280 285 Gly Trp Arg Leu Ser Tyr Arg Ala Ala Gly Asn Glu Cys Pro Glu Leu 290 295 300 Gln Pro Pro Val His Gly Lys Ile Glu Pro Ser Gln Ala Lys Tyr Phe 305 310 315 320 Phe Lys Asp Gln Val Leu Val Ser Cys Asp Thr Gly Tyr Lys Val Leu 325 330 335 Lys Asp Asn Val Glu Met Asp Thr Phe Gln Ile Glu Cys Leu Lys Asp 340 345 350 Gly Thr Trp Ser Asn Lys Ile Pro Thr Cys Lys Ile Val Asp 
Cys Arg 355 360 365 Ala Pro Gly Glu Leu Glu His Gly Leu Ile Thr Phe Ser Thr Arg Asn 370 375 380 Asn Leu Thr Thr Tyr Lys Ser Glu Ile Lys Tyr Ser Cys Gln Glu Pro 385 390 395 400 Tyr Tyr Lys Met Leu Asn Asn Asn Thr Gly Ile Tyr Thr Cys Ser Ala 405 410 415 Gln Gly Val Trp Met Asn Lys Val Leu Gly Arg Ser Leu Pro Thr Cys 420 425 430 Leu Pro Val Cys Gly Leu Pro Lys Phe Ser Arg Lys Leu Met Ala Arg 435 440 445 Ile Phe Asn Gly Arg Pro Ala Gln Lys Gly Thr Thr Pro Trp Ile Ala 450 455 460 Met Leu Ser His Leu Asn Gly Gln Pro Phe Cys Gly Gly Ser Leu Leu 465 470 475 480 Gly Ser Ser Trp Ile Val Thr Ala Ala His Cys Leu His Gln Ser Leu 485 490 495 Asp Pro Glu Asp Pro Thr Leu Arg Asp Ser Asp Leu Leu Ser Pro Ser 500 505 510 Asp Phe Lys Ile Ile Leu Gly Lys His Trp Arg Leu Arg Ser Asp Glu 515 520 525 Asn Glu Gln His Leu Gly Val Lys His Thr Thr Leu His Pro Gln Tyr 530 535 540 Asp Pro Asn Thr Phe Glu Asn Asp Val Ala Leu Val Glu Leu Leu Glu 545 550 555 560 Ser Pro Val Leu Asn Ala Phe Val Met Pro Ile Cys Leu Pro Glu Gly 565 570 575 Pro Gln Gln Glu Gly Ala Met Val Ile Val Ser Gly Trp Gly Lys Gln 580 585 590 Phe Leu Gln Arg Phe Pro Glu Thr Leu Met Glu Ile Glu Ile Pro Ile 595 600 605 Val Asp His Ser Thr Cys Gln Lys Ala Tyr Ala Pro Leu Lys Lys Lys 610 615 620 Val Thr Arg Asp Met Ile Cys Ala Gly Glu Lys Glu Gly Gly Lys Asp 625 630 635 640 Ala Cys Ala Gly Asp Ser Gly Gly Pro Met Val Thr Leu Asn Arg Glu 645 650 655 Arg Gly Gln Trp Tyr Leu Val Gly Thr Val Ser Trp Gly Asp Asp Cys 660 665 670 Gly Lys Lys Asp Arg Tyr Gly Val Tyr Ser Tyr Ile His His Asn Lys 675 680 685 Asp Trp Ile Gln Arg Val Thr Gly Val Arg Asn 690 695 21722PRTHomo sapiens 21Met Ala Pro His Arg Pro Ala Pro Ala Leu Leu Cys Ala Leu Ser Leu 1 5 10 15 Ala Leu Cys Ala Leu Ser Leu Pro Val Arg Ala Ala Thr Ala Ser Arg 20 25 30 Gly Ala Ser Gln Ala Gly Ala Pro Gln Gly Arg Val Pro Glu Ala Arg 35 40 45 Pro Asn Ser Met Val Val Glu His Pro Glu Phe Leu Lys Ala Gly Lys 50 55 60 Glu Pro Gly Leu Gln Ile Trp Arg Val Glu Lys Phe Asp Leu Val Pro 65 70 75
80 Val Pro Thr Asn Leu Tyr Gly Asp Phe Phe Thr Gly Asp Ala Tyr Val 85 90 95 Ile Leu Lys Thr Val Gln Leu Arg Asn Gly Asn Leu Gln Tyr Asp Leu 100 105 110 His Tyr Trp Leu Gly Asn Glu Cys Ser Gln Asp Glu Ser Gly Ala Ala 115 120 125 Ala Ile Phe Thr Val Gln Leu Asp Asp Tyr Leu Asn Gly Arg Ala Val 130 135 140 Gln His Arg Glu Val Gln Gly Phe Glu Ser Ala Thr Phe Leu Gly Tyr 145 150 155 160 Phe Lys Ser Gly Leu Lys Tyr Lys Lys Gly Gly Val Ala Ser Gly Phe 165 170 175 Lys His Val Val Pro Asn Glu Val Val Val Gln Arg Leu Phe Gln Val 180 185 190 Lys Gly Arg Arg Val Val Arg Ala Thr Glu Val Pro Val Ser Trp Glu 195 200 205 Ser Phe Asn Asn Gly Asp Cys Phe Ile Leu Asp Leu Gly Asn Asn Ile 210 215 220 His Gln Trp Cys Gly Ser Asn Ser Asn Arg Tyr Glu Arg Leu Lys Ala 225 230 235 240 Thr Gln Val Ser Lys Gly Ile Arg Asp Asn Glu Arg Ser Gly Arg Ala 245 250 255 Arg Val His Val Ser Glu Glu Gly Thr Glu Pro Glu Ala Met Leu Gln 260 265 270 Val Leu Gly Pro Lys Pro Ala Leu Pro Ala Gly Thr Glu Asp Thr Ala 275 280 285 Lys Glu Asp Ala Ala Asn Arg Lys Leu Ala Lys Leu Thr Ala Ser Asp 290 295 300 Phe Ile Thr Lys Met Asp Tyr Pro Lys Gln Thr Gln Val Ser Val Leu 305 310 315 320 Pro Glu Gly Gly Glu Thr Pro Leu Phe Lys Gln Phe Phe Lys Asn Trp 325 330 335 Arg Asp Pro Asp Gln Thr Asp Gly Leu Gly Leu Ser Tyr Leu Ser Ser 340 345 350 His Ile Ala Asn Val Glu Arg Val Pro Phe Asp Ala Ala Thr Leu His 355 360 365 Thr Ser Thr Ala Met Ala Ala Gln His Gly Met Asp Asp Asp Gly Thr 370 375 380 Gly Gln Lys Gln Ile Trp Arg Ile Glu Gly Ser Asn Lys Val Pro Val 385 390 395 400 Asp Pro Ala Thr Tyr Gly Gln Phe Tyr Gly Gly Asp Ser Tyr Ile Ile 405 410 415 Leu Tyr Asn Tyr Arg His Gly Gly Arg Gln Gly Gln Ile Ile Tyr Asn 420 425 430 Trp Gln Gly Ala Gln Ser Thr Gln Asp Glu Val Ala Ala Ser Ala Ile 435 440 445 Leu Thr Ala Gln Leu Asp Glu Glu Leu Gly Gly Thr Pro Val Gln Ser 450 455 460 Arg Val Val Gln Gly Lys Glu Pro Ala His Leu Met Ser Leu Phe Gly 465 470 475 480 Gly Lys Pro Met Ile Ile Tyr Lys Gly Gly Thr Ser Arg Glu Gly Gly 485 490 495 Gln 
Thr Ala Pro Ala Ser Thr Arg Leu Phe Gln Val Arg Ala Asn Ser 500 505 510 Ala Gly Ala Thr Arg Ala Val Glu Val Leu Pro Lys Ala Gly Ala Leu 515 520 525 Asn Ser Asn Asp Ala Phe Val Leu Lys Thr Pro Ser Ala Ala Tyr Leu 530 535 540 Trp Val Gly Thr Gly Ala Ser Glu Ala Glu Lys Thr Gly Ala Gln Glu 545 550 555 560 Leu Leu Arg Val Leu Arg Ala Gln Pro Val Gln Val Ala Glu Gly Ser 565 570 575 Glu Pro Asp Gly Phe Trp Glu Ala Leu Gly Gly Lys Ala Ala Tyr Arg 580 585 590 Thr Ser Pro Arg Leu Lys Asp Lys Lys Met Asp Ala His Pro Pro Arg 595 600 605 Leu Phe Ala Cys Ser Asn Lys Ile Gly Arg Phe Val Ile Glu Glu Val 610 615 620 Pro Gly Glu Leu Met Gln Glu Asp Leu Ala Thr Asp Asp Val Met Leu 625 630 635 640 Leu Asp Thr Trp Asp Gln Val Phe Val Trp Val Gly Lys Asp Ser Gln 645 650 655 Glu Glu Glu Lys Thr Glu Ala Leu Thr Ser Ala Lys Arg Tyr Ile Glu 660 665 670 Thr Asp Pro Ala Asn Arg Asp Arg Arg Thr Pro Ile Thr Val Val Lys 675 680 685 Gln Gly Phe Glu Pro Pro Ser Phe Val Gly Trp Phe Leu Gly Trp Asp 690 695 700 Asp Asp Tyr Trp Ser Val Asp Pro Leu Asp Arg Ala Met Ala Glu Leu 705 710 715 720 Ala Ala 22338PRTHomo sapiens 22Met Ser Leu Ser Ala Phe Thr Leu Phe Leu Ala Leu Ile Gly Gly Thr 1 5 10 15 Ser Gly Gln Tyr Tyr Asp Tyr Asp Phe Pro Leu Ser Ile Tyr Gly Gln 20 25 30 Ser Ser Pro Asn Cys Ala Pro Glu Cys Asn Cys Pro Glu Ser Tyr Pro 35 40 45 Ser Ala Met Tyr Cys Asp Glu Leu Lys Leu Lys Ser Val Pro Met Val 50 55 60 Pro Pro Gly Ile Lys Tyr Leu Tyr Leu Arg Asn Asn Gln Ile Asp His 65 70 75 80 Ile Asp Glu Lys Ala Phe Glu Asn Val Thr Asp Leu Gln Trp Leu Ile 85 90 95 Leu Asp His Asn Leu Leu Glu Asn Ser Lys Ile Lys Gly Arg Val Phe 100 105 110 Ser Lys Leu Lys Gln Leu Lys Lys Leu His Ile Asn His Asn Asn Leu 115 120 125 Thr Glu Ser Val Gly Pro Leu Pro Lys Ser Leu Glu Asp Leu Gln Leu 130 135 140 Thr His Asn Lys Ile Thr Lys Leu Gly Ser Phe Glu Gly Leu Val Asn 145 150 155 160 Leu Thr Phe Ile His Leu Gln His Asn Arg Leu Lys Glu Asp Ala Val 165 170 175 Ser Ala Ala Phe Lys Gly Leu Lys Ser Leu Glu Tyr Leu Asp Leu Ser 180 
185 190 Phe Asn Gln Ile Ala Arg Leu Pro Ser Gly Leu Pro Val Ser Leu Leu 195 200 205 Thr Leu Tyr Leu Asp Asn Asn Lys Ile Ser Asn Ile Pro Asp Glu Tyr 210 215 220 Phe Lys Arg Phe Asn Ala Leu Gln Tyr Leu Arg Leu Ser His Asn Glu 225 230 235 240 Leu Ala Asp Ser Gly Ile Pro Gly Asn Ser Phe Asn Val Ser Ser Leu 245 250 255 Val Glu Leu Asp Leu Ser Tyr Asn Lys Leu Lys Asn Ile Pro Thr Val 260 265 270 Asn Glu Asn Leu Glu Asn Tyr Tyr Leu Glu Val Asn Gln Leu Glu Lys 275 280 285 Phe Asp Ile Lys Ser Phe Cys Lys Ile Leu Gly Pro Leu Ser Tyr Ser 290 295 300 Lys Ile Lys His Leu Arg Leu Asp Gly Asn Arg Ile Ser Glu Thr Ser 305 310 315 320 Leu Pro Pro Asp Met Tyr Glu Cys Leu Arg Val Ala Asn Glu Val Thr 325 330 335 Leu Asn 231156PRTHomo sapiens 23Met Ser Lys Leu Arg Met Val Leu Leu Glu Asp Ser Gly Ser Ala Asp 1 5 10 15 Phe Arg Arg His Phe Val Asn Leu Ser Pro Phe Thr Ile Thr Val Val 20 25 30 Leu Leu Leu Ser Ala Cys Phe Val Thr Ser Ser Leu Gly Gly Thr Asp 35 40 45 Lys Glu Leu Arg Leu Val Asp Gly Glu Asn Lys Cys Ser Gly Arg Val 50 55 60 Glu Val Lys Val Gln Glu Glu Trp Gly Thr Val Cys Asn Asn Gly Trp 65 70 75 80 Ser Met Glu Ala Val Ser Val Ile Cys Asn Gln Leu Gly Cys Pro Thr 85 90 95 Ala Ile Lys Ala Pro Gly Trp Ala Asn Ser Ser Ala Gly Ser Gly Arg 100 105 110 Ile Trp Met Asp His Val Ser Cys Arg Gly Asn Glu Ser Ala Leu Trp 115 120 125 Asp Cys Lys His Asp Gly Trp Gly Lys His Ser Asn Cys Thr His Gln 130 135 140 Gln Asp Ala Gly Val Thr Cys Ser Asp Gly Ser Asn Leu Glu Met Arg 145 150 155 160 Leu Thr Arg Gly Gly Asn Met Cys Ser Gly Arg Ile Glu Ile Lys Phe 165 170 175 Gln Gly Arg Trp Gly Thr Val Cys Asp Asp Asn Phe Asn Ile Asp His 180 185 190 Ala Ser Val Ile Cys Arg Gln Leu Glu Cys Gly Ser Ala Val Ser Phe 195 200 205 Ser Gly Ser Ser Asn Phe Gly Glu Gly Ser Gly Pro Ile Trp Phe Asp 210 215 220 Asp Leu Ile Cys Asn Gly Asn Glu Ser Ala Leu Trp Asn Cys Lys His 225 230 235 240 Gln Gly Trp Gly Lys His Asn Cys Asp His Ala Glu Asp Ala Gly Val 245 250 255 Ile Cys Ser Lys Gly Ala Asp Leu Ser Leu Arg Leu Val Asp 
Gly Val 260 265 270 Thr Glu Cys Ser Gly Arg Leu Glu Val Arg Phe Gln Gly Glu Trp Gly 275 280 285 Thr Ile Cys Asp Asp Gly Trp Asp Ser Tyr Asp Ala Ala Val Ala Cys 290 295 300 Lys Gln Leu Gly Cys Pro Thr Ala Val Thr Ala Ile Gly Arg Val Asn 305 310 315 320 Ala Ser Lys Gly Phe Gly His Ile Trp Leu Asp Ser Val Ser Cys Gln 325 330 335 Gly His Glu Pro Ala Ile Trp Gln Cys Lys His His Glu Trp Gly Lys 340 345 350 His Tyr Cys Asn His Asn Glu Asp Ala Gly Val Thr Cys Ser Asp Gly 355 360 365 Ser Asp Leu Glu Leu Arg Leu Arg Gly Gly Gly Ser Arg Cys Ala Gly 370 375 380 Thr Val Glu Val Glu Ile Gln Arg Leu Leu Gly Lys Val Cys Asp Arg 385 390 395 400 Gly Trp Gly Leu Lys Glu Ala Asp Val Val Cys Arg Gln Leu Gly Cys 405 410 415 Gly Ser Ala Leu Lys Thr Ser Tyr Gln Val Tyr Ser Lys Ile Gln Ala 420 425 430 Thr Asn Thr Trp Leu Phe Leu Ser Ser Cys Asn Gly Asn Glu Thr Ser 435 440 445 Leu Trp Asp Cys Lys Asn Trp Gln Trp Gly Gly Leu Thr Cys Asp His 450 455 460 Tyr Glu Glu Ala Lys Ile Thr Cys Ser Ala His Arg Glu Pro Arg Leu 465 470 475 480 Val Gly Gly Asp Ile Pro Cys Ser Gly Arg Val Glu Val Lys His Gly 485 490 495 Asp Thr Trp Gly Ser Ile Cys Asp Ser Asp Phe Ser Leu Glu Ala Ala 500 505 510 Ser Val Leu Cys Arg Glu Leu Gln Cys Gly Thr Val Val Ser Ile Leu 515 520 525 Gly Gly Ala His Phe Gly Glu Gly Asn Gly Gln Ile Trp Ala Glu Glu 530 535 540 Phe Gln Cys Glu Gly His Glu Ser His Leu Ser Leu Cys Pro Val Ala 545 550 555 560 Pro Arg Pro Glu Gly Thr Cys Ser His Ser Arg Asp Val Gly Val Val 565 570 575 Cys Ser Arg Tyr Thr Glu Ile Arg Leu Val Asn Gly Lys Thr Pro Cys 580 585 590 Glu Gly Arg Val Glu Leu Lys Thr Leu Gly Ala Trp Gly Ser Leu Cys 595 600 605 Asn Ser His Trp Asp Ile Glu Asp Ala His Val Leu Cys Gln Gln Leu 610 615 620 Lys Cys Gly Val Ala Leu Ser Thr Pro Gly Gly Ala Arg Phe Gly Lys 625 630 635 640 Gly Asn Gly Gln Ile Trp Arg His Met Phe His Cys Thr Gly Thr Glu 645 650 655 Gln His Met Gly Asp Cys Pro Val Thr Ala Leu Gly Ala Ser Leu Cys 660 665 670 Pro Ser Glu Gln Val Ala Ser Val Ile Cys Ser Gly Asn Gln Ser 
Gln 675 680 685 Thr Leu Ser Ser Cys Asn Ser Ser Ser Leu Gly Pro Thr Arg Pro Thr 690 695 700 Ile Pro Glu Glu Ser Ala Val Ala Cys Ile Glu Ser Gly Gln Leu Arg 705 710 715 720 Leu Val Asn Gly Gly Gly Arg Cys Ala Gly Arg Val Glu Ile Tyr His 725 730 735 Glu Gly Ser Trp Gly Thr Ile Cys Asp Asp Ser Trp Asp Leu Ser Asp 740 745 750 Ala His Val Val Cys Arg Gln Leu Gly Cys Gly Glu Ala Ile Asn Ala 755 760 765 Thr Gly Ser Ala His Phe Gly Glu Gly Thr Gly Pro Ile Trp Leu Asp 770 775 780 Glu Met Lys Cys Asn Gly Lys Glu Ser Arg Ile Trp Gln Cys His Ser 785 790 795 800 His Gly Trp Gly Gln Gln Asn Cys Arg His Lys Glu Asp Ala Gly Val 805 810 815 Ile Cys Ser Glu Phe Met Ser Leu Arg Leu Thr Ser Glu Ala Ser Arg 820 825 830 Glu Ala Cys Ala Gly Arg Leu Glu Val Phe Tyr Asn Gly Ala Trp Gly 835 840 845 Thr Val Gly Lys Ser Ser Met Ser Glu Thr Thr Val Gly Val Val Cys 850 855 860 Arg Gln Leu Gly Cys Ala Asp Lys Gly Lys Ile Asn Pro Ala Ser Leu 865 870 875 880 Asp Lys Ala Met Ser Ile Pro Met Trp Val Asp Asn Val Gln Cys Pro 885 890 895 Lys Gly Pro Asp Thr Leu Trp Gln Cys Pro Ser Ser Pro Trp Glu Lys 900 905 910 Arg Leu Ala Ser Pro Ser Glu Glu Thr Trp Ile Thr Cys Asp Asn Lys 915 920 925 Ile Arg Leu Gln Glu Gly Pro Thr Ser Cys Ser Gly Arg Val Glu Ile 930 935 940 Trp His Gly Gly Ser Trp Gly Thr Val Cys Asp Asp Ser Trp Asp Leu 945 950 955 960 Asp Asp Ala Gln Val Val Cys Gln Gln Leu Gly Cys Gly Pro Ala Leu 965 970 975 Lys Ala Phe Lys Glu Ala Glu Phe Gly Gln Gly Thr Gly Pro Ile Trp 980 985 990 Leu Asn Glu Val Lys Cys Lys Gly Asn Glu Ser Ser Leu Trp Asp Cys 995 1000 1005 Pro Ala Arg Arg Trp Gly His Ser Glu Cys Gly His Lys Glu Asp 1010 1015 1020 Ala Ala Val Asn Cys Thr Asp Ile Ser Val Gln Lys Thr Pro Gln 1025 1030 1035 Lys Ala Thr Thr Gly Arg Ser Ser Arg Gln Ser Ser Phe Ile Ala 1040 1045 1050 Val Gly Ile Leu Gly Val Val Leu Leu Ala Ile Phe Val Ala Leu 1055 1060 1065 Phe Phe Leu Thr Lys Lys Arg Arg Gln Arg Gln Arg Leu Ala Val 1070 1075 1080 Ser Ser Arg Gly Glu Asn Leu Val His Gln Ile Gln Tyr Arg Glu 1085 1090 
1095 Met Asn Ser Cys Leu Asn Ala Asp Asp Leu Asp Leu Met Asn Ser 1100 1105 1110 Ser Glu Asn Ser His Glu Ser Ala Asp Phe Ser Ala Ala Glu Leu 1115 1120 1125 Ile Ser Val Ser Lys Phe Leu Pro Ile Ser Gly Met Glu Lys Glu 1130 1135 1140 Ala Ile Leu Ser His Thr Glu Lys Glu Asn Gly Asn Leu 1145 1150 1155 241337PRTHomo sapiens 24Met Lys Pro Ala Ala Arg Glu Ala Arg Leu Pro Pro Arg Ser Pro Gly 1 5 10 15 Leu Arg Trp Ala Leu Pro Leu Leu Leu Leu Leu Leu Arg Leu Gly Gln 20 25 30 Ile Leu Cys Ala Gly Gly Thr Pro Ser Pro Ile Pro Asp Pro Ser Val 35 40 45 Ala Thr Val Ala Thr Gly Glu Asn Gly Ile Thr Gln Ile Ser Ser Thr 50 55 60 Ala Glu Ser Phe His Lys Gln Asn Gly Thr Gly Thr Pro Gln Val Glu 65 70 75 80 Thr Asn Thr Ser Glu Asp Gly Glu Ser Ser Gly Ala Asn Asp Ser Leu 85 90 95 Arg Thr Pro Glu Gln Gly Ser Asn Gly Thr Asp Gly Ala Ser Gln Lys 100 105 110 Thr Pro Ser Ser Thr Gly Pro Ser Pro Val Phe Asp Ile Lys Ala Val 115 120 125 Ser Ile Ser Pro Thr Asn Val Ile Leu Thr Trp Lys Ser Asn Asp Thr 130 135 140 Ala Ala Ser Glu Tyr Lys Tyr Val Val Lys His Lys Met Glu Asn Glu 145 150
155 160 Lys Thr Ile Thr Val Val His Gln Pro Trp Cys Asn Ile Thr Gly Leu 165 170 175 Arg Pro Ala Thr Ser Tyr Val Phe Ser Ile Thr Pro Gly Ile Gly Asn 180 185 190 Glu Thr Trp Gly Asp Pro Arg Val Ile Lys Val Ile Thr Glu Pro Ile 195 200 205 Pro Val Ser Asp Leu Arg Val Ala Leu Thr Gly Val Arg Lys Ala Ala 210 215 220 Leu Ser Trp Ser Asn Gly Asn Gly Thr Ala Ser Cys Arg Val Leu Leu 225 230 235 240 Glu Ser Ile Gly Ser His Glu Glu Leu Thr Gln Asp Ser Arg Leu Gln 245 250 255 Val Asn Ile Ser Gly Leu Lys Pro Gly Val Gln Tyr Asn Ile Asn Pro 260 265 270 Tyr Leu Leu Gln Ser Asn Lys Thr Lys Gly Asp Pro Leu Gly Thr Glu 275 280 285 Gly Gly Leu Asp Ala Ser Asn Thr Glu Arg Ser Arg Ala Gly Ser Pro 290 295 300 Thr Ala Pro Val His Asp Glu Ser Leu Val Gly Pro Val Asp Pro Ser 305 310 315 320 Ser Gly Gln Gln Ser Arg Asp Thr Glu Val Leu Leu Val Gly Leu Glu 325 330 335 Pro Gly Thr Arg Tyr Asn Ala Thr Val Tyr Ser Gln Ala Ala Asn Gly 340 345 350 Thr Glu Gly Gln Pro Gln Ala Ile Glu Phe Arg Thr Asn Ala Ile Gln 355 360 365 Val Phe Asp Val Thr Ala Val Asn Ile Ser Ala Thr Ser Leu Thr Leu 370 375 380 Ile Trp Lys Val Ser Asp Asn Glu Ser Ser Ser Asn Tyr Thr Tyr Lys 385 390 395 400 Ile His Val Ala Gly Glu Thr Asp Ser Ser Asn Leu Asn Val Ser Glu 405 410 415 Pro Arg Ala Val Ile Pro Gly Leu Arg Ser Ser Thr Phe Tyr Asn Ile 420 425 430 Thr Val Cys Pro Val Leu Gly Asp Ile Glu Gly Thr Pro Gly Phe Leu 435 440 445 Gln Val His Thr Pro Pro Val Pro Val Ser Asp Phe Arg Val Thr Val 450 455 460 Val Ser Thr Thr Glu Ile Gly Leu Ala Trp Ser Ser His Asp Ala Glu 465 470 475 480 Ser Phe Gln Met His Ile Thr Gln Glu Gly Ala Gly Asn Ser Arg Val 485 490 495 Glu Ile Thr Thr Asn Gln Ser Ile Ile Ile Gly Gly Leu Phe Pro Gly 500 505 510 Thr Lys Tyr Cys Phe Glu Ile Val Pro Lys Gly Pro Asn Gly Thr Glu 515 520 525 Gly Ala Ser Arg Thr Val Cys Asn Arg Thr Val Pro Ser Ala Val Phe 530 535 540 Asp Ile His Val Val Tyr Val Thr Thr Thr Glu Met Trp Leu Asp Trp 545 550 555 560 Lys Ser Pro Asp Gly Ala Ser Glu Tyr Val Tyr His Leu Val Ile Glu 565 570 
575 Ser Lys His Gly Ser Asn His Thr Ser Thr Tyr Asp Lys Ala Ile Thr 580 585 590 Leu Gln Gly Leu Ile Pro Gly Thr Leu Tyr Asn Ile Thr Ile Ser Pro 595 600 605 Glu Val Asp His Val Trp Gly Asp Pro Asn Ser Thr Ala Gln Tyr Thr 610 615 620 Arg Pro Ser Asn Val Ser Asn Ile Asp Val Ser Thr Asn Thr Thr Ala 625 630 635 640 Ala Thr Leu Ser Trp Gln Asn Phe Asp Asp Ala Ser Pro Thr Tyr Ser 645 650 655 Tyr Cys Leu Leu Ile Glu Lys Ala Gly Asn Ser Ser Asn Ala Thr Gln 660 665 670 Val Val Thr Asp Ile Gly Ile Thr Asp Ala Thr Val Thr Glu Leu Ile 675 680 685 Pro Gly Ser Ser Tyr Thr Val Glu Ile Phe Ala Gln Val Gly Asp Gly 690 695 700 Ile Lys Ser Leu Glu Pro Gly Arg Lys Ser Phe Cys Thr Asp Pro Ala 705 710 715 720 Ser Met Ala Ser Phe Asp Cys Glu Val Val Pro Lys Glu Pro Ala Leu 725 730 735 Val Leu Lys Trp Thr Cys Pro Pro Gly Ala Asn Ala Gly Phe Glu Leu 740 745 750 Glu Val Ser Ser Gly Ala Trp Asn Asn Ala Thr His Leu Glu Ser Cys 755 760 765 Ser Ser Glu Asn Gly Thr Glu Tyr Arg Thr Glu Val Thr Tyr Leu Asn 770 775 780 Phe Ser Thr Ser Tyr Asn Ile Ser Ile Thr Thr Val Ser Cys Gly Lys 785 790 795 800 Met Ala Ala Pro Thr Arg Asn Thr Cys Thr Thr Gly Ile Thr Asp Pro 805 810 815 Pro Pro Pro Asp Gly Ser Pro Asn Ile Thr Ser Val Ser His Asn Ser 820 825 830 Val Lys Val Lys Phe Ser Gly Phe Glu Ala Ser His Gly Pro Ile Lys 835 840 845 Ala Tyr Ala Val Ile Leu Thr Thr Gly Glu Ala Gly His Pro Ser Ala 850 855 860 Asp Val Leu Lys Tyr Thr Tyr Glu Asp Phe Lys Lys Gly Ala Ser Asp 865 870 875 880 Thr Tyr Val Thr Tyr Leu Ile Arg Thr Glu Glu Lys Gly Arg Ser Gln 885 890 895 Ser Leu Ser Glu Val Leu Lys Tyr Glu Ile Asp Val Gly Asn Glu Ser 900 905 910 Thr Thr Leu Gly Tyr Tyr Asn Gly Lys Leu Glu Pro Leu Gly Ser Tyr 915 920 925 Arg Ala Cys Val Ala Gly Phe Thr Asn Ile Thr Phe His Pro Gln Asn 930 935 940 Lys Gly Leu Ile Asp Gly Ala Glu Ser Tyr Val Ser Phe Ser Arg Tyr 945 950 955 960 Ser Asp Ala Val Ser Leu Pro Gln Asp Pro Gly Val Ile Cys Gly Ala 965 970 975 Val Phe Gly Cys Ile Phe Gly Ala Leu Val Ile Val Thr Val Gly Gly 980 985 990 
Phe Ile Phe Trp Arg Lys Lys Arg Lys Asp Ala Lys Asn Asn Glu Val 995 1000 1005 Ser Phe Ser Gln Ile Lys Pro Lys Lys Ser Lys Leu Ile Arg Val 1010 1015 1020 Glu Asn Phe Glu Ala Tyr Phe Lys Lys Gln Gln Ala Asp Ser Asn 1025 1030 1035 Cys Gly Phe Ala Glu Glu Tyr Glu Asp Leu Lys Leu Val Gly Ile 1040 1045 1050 Ser Gln Pro Lys Tyr Ala Ala Glu Leu Ala Glu Asn Arg Gly Lys 1055 1060 1065 Asn Arg Tyr Asn Asn Val Leu Pro Tyr Asp Ile Ser Arg Val Lys 1070 1075 1080 Leu Ser Val Gln Thr His Ser Thr Asp Asp Tyr Ile Asn Ala Asn 1085 1090 1095 Tyr Met Pro Gly Tyr His Ser Lys Lys Asp Phe Ile Ala Thr Gln 1100 1105 1110 Gly Pro Leu Pro Asn Thr Leu Lys Asp Phe Trp Arg Met Val Trp 1115 1120 1125 Glu Lys Asn Val Tyr Ala Ile Ile Met Leu Thr Lys Cys Val Glu 1130 1135 1140 Gln Gly Arg Thr Lys Cys Glu Glu Tyr Trp Pro Ser Lys Gln Ala 1145 1150 1155 Gln Asp Tyr Gly Asp Ile Thr Val Ala Met Thr Ser Glu Ile Val 1160 1165 1170 Leu Pro Glu Trp Thr Ile Arg Asp Phe Thr Val Lys Asn Ile Gln 1175 1180 1185 Thr Ser Glu Ser His Pro Leu Arg Gln Phe His Phe Thr Ser Trp 1190 1195 1200 Pro Asp His Gly Val Pro Asp Thr Thr Asp Leu Leu Ile Asn Phe 1205 1210 1215 Arg Tyr Leu Val Arg Asp Tyr Met Lys Gln Ser Pro Pro Glu Ser 1220 1225 1230 Pro Ile Leu Val His Cys Ser Ala Gly Val Gly Arg Thr Gly Thr 1235 1240 1245 Phe Ile Ala Ile Asp Arg Leu Ile Tyr Gln Ile Glu Asn Glu Asn 1250 1255 1260 Thr Val Asp Val Tyr Gly Ile Val Tyr Asp Leu Arg Met His Arg 1265 1270 1275 Pro Leu Met Val Gln Thr Glu Asp Gln Tyr Val Phe Leu Asn Gln 1280 1285 1290 Cys Val Leu Asp Ile Val Arg Ser Gln Lys Asp Ser Lys Val Asp 1295 1300 1305 Leu Ile Tyr Gln Asn Thr Thr Ala Met Thr Ile Tyr Glu Asn Leu 1310 1315 1320 Ala Pro Val Thr Thr Phe Gly Lys Thr Asn Gly Tyr Ile Ala 1325 1330 1335 258PRTHomo sapiens 25Ala Leu Gln Ala Ser Ala Leu Lys 1 5 269PRTHomo sapiens 26Ala Val Gly Leu Ala Gly Thr Phe Arg 1 5 279PRTHomo sapiens 27Gly Phe Leu Leu Leu Ala Ser Leu Arg 1 5 2815PRTHomo sapiens 28Leu Gly Gly Pro Glu Ala Gly Leu Gly Glu Tyr Leu Phe Glu Arg 1 5 10 15 
29 6 PRT Homo sapiens 29 Val Glu Ile Phe Tyr Arg 1 5
