Identification Of Markers In Esophageal Cancer, Colon Cancer, Head And Neck Cancer, And Melanoma

Godfrey; Tony E. ; et al.

Patent Application Summary

U.S. patent application number 12/637862 was filed with the patent office on 2009-12-15 and published on 2010-09-23 for identification of markers in esophageal cancer, colon cancer, head and neck cancer, and melanoma. Invention is credited to Tony E. Godfrey, William E. Gooding, Steven J. Hughes, Siva Raja, Liqiang Xi.

Publication Number	20100240037
Application Number	12/637862
Family ID	35839738
Filed Date	2009-12-15

United States Patent Application	20100240037
Kind Code	A1
Godfrey; Tony E. ; et al.	September 23, 2010

IDENTIFICATION OF MARKERS IN ESOPHAGEAL CANCER, COLON CANCER, HEAD AND NECK CANCER, AND MELANOMA

Abstract

Methods for identifying expression of markers indicative of the presence of esophageal cancer, a squamous cell cancer, a squamous cell cancer of the head and neck, colon cancer and melanoma are provided. Also provided are articles of manufacture useful in such methods and compositions containing primers and probes useful in such methods.

Inventors:	Godfrey; Tony E.; (Bronxville, NY) ; Xi; Liqiang; (Plainsboro, NJ) ; Raja; Siva; (Jamaica Plain, MA) ; Hughes; Steven J.; (Blawnox, PA) ; Gooding; William E.; (Pittsburgh, PA)
Correspondence Address:	Hirshman Law, LLC Gatehouse Building, 101 W. Station Square Dr., Suite 500 PITTSBURGH PA 15219 US
Family ID:	35839738
Appl. No.:	12/637862
Filed:	December 15, 2009

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
11178134	Jul 8, 2005	7662561
12637862
60587019	Jul 9, 2004
60586599	Jul 9, 2004

Current U.S. Class:	435/6.13
Current CPC Class:	C12Q 1/6886 20130101; C12Q 2600/16 20130101; C12Q 2600/112 20130101; C12Q 2600/158 20130101
Class at Publication:	435/6
International Class:	C12Q 1/68 20060101 C12Q001/68

Claims

1. A method of identifying expression of markers indicative of the presence of esophageal cancer cells in a lymph node of a patient, comprising determining if a first mRNA species specific to one of CEA, CK19, CK20, TACSTD1, VIL1, PVA and CK7 is overabundant in an RNA sample prepared from the lymph node, provided when the first mRNA species is CEA, the method further comprises determining if a second mRNA species specific to CK19 is overabundant in an RNA sample prepared from the lymph node, the overabundance of the mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

2. The method of claim 1, further comprising determining if one or more additional mRNA species, different from the first mRNA species, specific to one or more of CEA, CK19, CK20, TACSTD1, VIL1, PVA and CK7 is overabundant in the RNA sample, the overabundance of the first mRNA species and the one or more additional mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

3. The method of claim 1, wherein the first mRNA species is specific to CK19 and a second mRNA species is specific to CEA.

4. The method of claim 1, wherein the first mRNA species is specific to CK20.

5. The method of claim 4, further comprising determining if a second mRNA species specific to CK19 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

6. The method of claim 1, wherein the first mRNA species is specific to TACSTD1.

7. The method of claim 6, further comprising determining if a second mRNA species specific to CEA is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

8. The method of claim 6, further comprising determining if a second mRNA species specific to CK7 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

9. The method of claim 6, further comprising determining if a second mRNA species specific to CK19 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

10. The method of claim 6, further comprising determining if a second mRNA species specific to CK20 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

11. The method of claim 6, further comprising determining if a second mRNA species specific to VIL1 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

12. The method of claim 1, wherein the first mRNA species is specific to VIL1.

13. The method of claim 12, further comprising determining if a second mRNA species specific to CK19 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced esophageal cells in the lymph node.

14. The method of claim 1, wherein the first mRNA species is specific to CK7.

15. The method of claim 1, wherein the first mRNA species is specific to PVA.

16-24. (canceled)

25. A method of identifying expression of markers indicative of the presence of cells of a squamous cell carcinoma of the head & neck in a lymph node of a patient, comprising determining if a first mRNA species specific to one of CEA, CK19, PTHrP, TACSTD1 and SCCA1.2 is overabundant in an RNA sample prepared from the lymph node, the overabundance of the mRNA species being indicative of the presence of displaced cells of a squamous cell carcinoma of the head & neck in the lymph node.

26. The method of claim 25, wherein the first mRNA species is specific to CEA.

27. The method of claim 25, wherein the first mRNA species is specific to PTHrP.

28. The method of claim 27, further comprising determining if a second mRNA species specific to SCCA1.2 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced cells of a squamous cell carcinoma of the head & neck in the lymph node.

29. The method of claim 27, further comprising determining if a second mRNA species specific to PVA is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced cells of a squamous cell carcinoma of the head & neck in the lymph node.

30. (canceled)

31. The method of claim 25, wherein the first mRNA species is specific to PVA and further comprising determining if a second mRNA species specific to SCCA1.2 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced cells of a squamous cell carcinoma of the head & neck in the lymph node.

32. The method of claim 25, wherein the first mRNA species is specific to CK19.

33. The method of claim 25, wherein the first mRNA species is specific to TACSTD1.

34. The method of claim 33, further comprising determining if a second mRNA species specific to SCCA1.2 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced cells of a squamous cell carcinoma of the head & neck in the lymph node.

35. The method of claim 33, further comprising determining if a second mRNA species specific to PVA is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced cells of a squamous cell carcinoma of the head & neck in the lymph node.

36. The method of claim 33, further comprising determining if a second mRNA species specific to PTHrP is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of displaced cells of a squamous cell carcinoma of the head & neck in the lymph node.

37. The method of claim 25, wherein the first mRNA species is specific to SCCA1.2.

38. The method of claim 25, further comprising determining if one or more additional mRNA species, different from the first mRNA species, specific to one or more of CEA, CK19, PTHrP, PVA, TACSTD1 and SCCA1.2 is overabundant in the RNA sample, the overabundance of the first mRNA species and the one or more additional mRNA species being indicative of the presence of cells of a squamous cell carcinoma of the head & neck in the lymph node.

39. The method of claim 25, comprising quantifying levels of the mRNA species in the RNA sample and determining if one or more of the mRNA species are overabundant in the RNA sample.

40-47. (canceled)

48. A method of identifying expression of markers indicative of the presence of cells of a squamous cell carcinoma in a lymph node of a patient, comprising determining if a first mRNA species specific to PVA is overabundant in an RNA sample prepared from the lymph node, the overabundance of the mRNA being indicative of the presence of displaced cells of a squamous cell carcinoma in the lymph node.

49. A method of identifying expression of markers indicative of the presence of colon cancer cells in a lymph node of a patient, comprising determining if a first mRNA species specific to one of CDX1, TACSTD1 and VIL1 is overabundant in an RNA sample prepared from the lymph node, the overabundance of the first mRNA species being indicative of the presence of displaced colon cells in the lymph node.

50. The method of claim 49, wherein the first mRNA species is specific to CDX1.

51. The method of claim 49, wherein the first mRNA species is specific to TACSTD1.

52. The method of claim 49, wherein the first mRNA species is specific to VIL1.

53. The method of claim 49, comprising quantifying levels of the mRNA species in the RNA sample and determining if one or more of the mRNA species are overabundant in the RNA sample.

54-61. (canceled)

62. The method of claim 49, further comprising determining if one or more additional mRNA species, different from the first mRNA species, specific to one or more of CDX1, CEA, CK19, CK20, TACSTD1, and VIL1 are overabundant in the RNA sample, the overabundance of the first mRNA species and the one or more additional mRNA species being indicative of the presence of displaced colon cells in the lymph node.

63. A method of identifying expression of markers indicative of the presence of melanoma cells in a lymph node of a patient, comprising determining if a first mRNA species specific to a MAGEA136-plex is overabundant in an RNA sample prepared from the lymph node, the overabundance of the first mRNA species being indicative of the presence of melanoma cells in the lymph node.

64. The method of claim 63, further comprising determining if a second mRNA species specific to MART1 is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of melanoma cells in the lymph node.

65. The method of claim 63, further comprising determining if a second mRNA species specific to TYR is overabundant in the RNA sample, the overabundance of the mRNA species being indicative of the presence of melanoma cells in the lymph node.

66-87. (canceled)

Description

CROSS REFERENCE TO RELATED APPLICATIONS

[0001] This application is a Divisional of U.S. patent application Ser. No. 11/178,134, filed Jul. 8, 2005, corresponding to United States Patent Publication No. 20060019290, published Jan. 26, 2006, and claims the benefit under 35 U.S.C. §119(e) to priority U.S. Provisional Patent Application Nos. 60/586,599 and 60/587,019, both filed on Jul. 9, 2004, each of which is incorporated herein by reference in its entirety.

BACKGROUND

[0002] 1. Field of the Invention

[0003] Provided are improved cancer diagnostic methods, along with compositions and apparatus useful in conducting those methods.

[0004] 2. Description of the Related Art

[0005] Early detection of cancer typically leads to increased survival rates. Metastatic lesions commonly are detected by histological techniques, including immunohistochemical techniques. Metastasized cells typically infiltrate the lymph nodes, and thus, in most instances, certain sentinel lymph nodes (the lymph nodes that metastasized cells typically infiltrate first) are recognized for each cancer type and are analyzed for the presence of lesions, including micrometastases. Trained histologists often can detect metastatic lesions visually after tissue from a sentinel lymph node is sectioned and stained. Highly trained histologists often can visualize micrometastases, but the ability to visualize such lesions varies from histologist to histologist.

[0006] In many surgical procedures to remove tumors, biopsies of sentinel lymph nodes are taken. The surgical procedure is then halted and the excised lymphatic tissue is analyzed. Once it is determined that the tumor has metastasized, a second, more radical surgical procedure is performed, removing regional lymphatics. A rapid method for identifying tumors is therefore warranted, not only because more assays can be performed in a given time period, thereby increasing laboratory turnaround, but also because it permits accurate, intraoperative decisions to be made, rather than requiring a second surgical procedure. It is therefore desirable to identify useful diagnostics for malignancies, especially diagnostics that permit rapid and/or intraoperative detection of lymphatic micrometastases.

SUMMARY

[0007] The present invention relates to a diagnostic method for detecting the presence of cancer cells in a patient by identifying the expression of certain markers indicative of the presence of cancer cells.

[0008] In one embodiment, the present invention relates to a method of identifying the expression of markers indicative of the presence of esophageal cancer cells in a lymph node of a patient. The method comprises determining if an mRNA species specific to one or more of CEA, CK7, CK19, CK20, VIL1, TACSTD1, and PVA is overabundant in an RNA sample prepared from the lymph node. The overabundance of the mRNA species is indicative of the presence of displaced cells of the esophagus in the lymph node.

[0009] In another embodiment, the present invention relates to a method of identifying the expression of markers indicative of the presence of cells of squamous cell carcinoma of the head and neck in a lymph node of a patient. The method comprises determining if an mRNA species specific to one or more of CEA, CK19, PTHrP, PVA, TACSTD1 and SCCA1.2 (SCCA1+SCCA2) is overabundant in an RNA sample prepared from the lymph node. The overabundance of the mRNA species is indicative of the presence of displaced cells of a squamous cell carcinoma of the head and neck in the lymph node.

[0010] In still another embodiment, the present invention relates to a method for identifying the expression of markers indicative of the presence of cells of a squamous cell carcinoma in a lymph node of a patient. The method comprises determining if an mRNA species specific to PVA is overabundant in an RNA sample prepared from the lymph node. The overabundance of the mRNA species is indicative of the presence of displaced cells of a squamous cell carcinoma in the lymph node.

[0011] In yet another embodiment, the present invention relates to a method for identifying the expression of markers indicative of the presence of colon cancer cells in a lymph node of a patient. The method comprises determining if an mRNA species specific to one or more of CDX1, TACSTD1 and VIL1 is overabundant in an RNA sample prepared from the lymph node. The overabundance of the mRNA species is indicative of the presence of displaced colon cells in the lymph node.

[0012] In still another embodiment, the present invention relates to a method for identifying the expression of markers indicative of the presence of melanoma cells in a lymph node of a patient. The method comprises determining if an mRNA species specific to one or more of MAGEA136-plex, MART1, and TYR is overabundant in an RNA sample prepared from the lymph node. The overabundance of the mRNA species is indicative of the presence of melanoma cells in the lymph node.

[0013] In yet a further embodiment, the present invention relates to an article of manufacture comprising packaging material and one or more nucleic acids specific to one or more of CEA, CK7, CK19, CK20, VIL1, TACSTD1, and PVA. The packaging material comprises an indicia, for example and without limitation, a writing, illustration, label, tag, book, booklet and/or package insert, indicating that the one or more nucleic acids can be used in a method of identifying expression of markers indicative of the presence of esophageal cancer cells in a lymph node of a patient.

[0014] In a still further embodiment, the present invention relates to an article of manufacture comprising packaging material and one or more nucleic acids specific to one or more of CEA, CK19, PTHrP, PVA, TACSTD1 and SCCA1.2. The packaging material comprises an indicia indicating that the one or more nucleic acids can be used in a method of identifying expression of markers indicative of the presence of cells of a squamous cell carcinoma of the head and neck in a lymph node of a patient.

[0015] In another embodiment, the present invention relates to an article of manufacture comprising packaging material and one or more nucleic acids specific to one or more of CDX1, TACSTD1 and VIL1. The packaging material comprises an indicia indicating that the one or more nucleic acids can be used in a method of identifying expression of markers indicative of the presence of colon cancer cells in a lymph node of a patient.

[0016] In still another embodiment, the present invention relates to an article of manufacture comprising packaging material and one or more nucleic acids specific to one or more of MAGEA136-plex, MART1 and TYR. The packaging material comprises an indicia indicating that the one or more nucleic acids can be used in a method of identifying expression of markers indicative of the presence of melanoma cells in a lymph node of a patient.

[0017] In still another embodiment, the present invention relates to an article of manufacture comprising packaging material and one or more nucleic acids specific to PVA. The packaging material comprises an indicia indicating that the one or more nucleic acids can be used in a method of identifying expression of markers indicative of the presence of cells of a squamous cell carcinoma in a lymph node of a patient.

[0018] In yet another embodiment, the present invention relates to a composition comprising one or more primers or probes specific to one or more of CEA, CK7, CK19, CK20, VIL1, TACSTD1, and PVA and RNA extracted from the lymph node of a patient diagnosed with or suspected of having esophageal cancer, or a nucleic acid, or analog thereof, derived from the RNA.

[0019] In a further embodiment, the present invention relates to a composition comprising one or more primers or probes specific to one or more of CEA, CK19, PTHrP, PVA, TACSTD1 and SCCA1.2 and RNA extracted from the lymph node of a patient diagnosed with or suspected of having squamous cell carcinoma of the head and neck, or a nucleic acid, or analog thereof, derived from the RNA.

[0020] In still a further embodiment, the present invention relates to a composition comprising one or more primers or probes specific to one or more of CDX1, TACSTD1 and VIL1 and RNA extracted from the lymph node of a patient diagnosed with or suspected of having colon cancer, or a nucleic acid, or an analog thereof, derived from the RNA.

[0021] In yet a further embodiment, the present invention relates to a composition comprising one or more primers or probes specific to one or more of MAGEA136-plex, MART1 and TYR and RNA extracted from a lymph node of a patient diagnosed with or suspected of having melanoma, or a nucleic acid, or analog thereof, derived from the RNA.

[0022] In another embodiment, the present invention relates to a composition comprising one or more primers or probes specific to PVA and RNA extracted from a sentinel lymph node of a patient diagnosed with or suspected of having a squamous cell carcinoma, or a nucleic acid, or analog thereof, derived from the RNA.

BRIEF DESCRIPTION OF THE DRAWINGS

[0023] FIG. 1 is a listing of a cDNA sequence of the caudal-type homeo box transcription factor 1 (CDX1) marker (SEQ ID NO: 1).

[0024] FIG. 2 is a listing of a cDNA sequence for the carcinoembryonic antigen-related cell adhesion molecule 5 (CEA) marker (SEQ ID NO: 2).

[0025] FIG. 3 is a listing of a cDNA sequence for the cytokeratin 7 (CK7) marker (SEQ ID NO: 3).

[0026] FIG. 4 is a listing of a cDNA sequence for the cytokeratin 19 (CK19) marker (SEQ ID NO: 4).

[0027] FIG. 5 is a listing of a cDNA sequence for the cytokeratin 20 (CK20) marker (SEQ ID NO: 5).

[0028] FIG. 6 is a listing of a cDNA sequence for the melanoma antigen gene family A1 (MAGEA1) marker (SEQ ID NO: 6).

[0029] FIG. 7 is a listing of a cDNA sequence for the melanoma antigen gene family A3 (MAGEA3) marker (SEQ ID NO: 7).

[0030] FIG. 8 is a listing of a cDNA sequence for the melanoma antigen gene family A6 (MAGEA6) marker (SEQ ID NO: 8).

[0031] FIG. 9 is a listing of a cDNA sequence for the melanoma antigen recognized by T cells 1 (MART1) marker (SEQ ID NO: 9).

[0032] FIG. 10 is a listing of a cDNA sequence for the parathyroid hormone-related protein (PTHrP) marker (SEQ ID NO: 10).

[0033] FIG. 11 is a listing of a cDNA sequence for the pemphigus vulgaris antigen (PVA) marker (SEQ ID NO: 11).

[0034] FIG. 12 is a listing of a cDNA sequence for the squamous cell carcinoma antigen 1 (SCCA1) marker (SEQ ID NO: 12).

[0035] FIG. 13 is a listing of a cDNA sequence for the squamous cell carcinoma antigen 2 (SCCA2) marker (SEQ ID NO: 13).

[0036] FIG. 14 is a listing of a cDNA sequence for the tumor-associated calcium signal transducer 1 (TACSTD1) marker (SEQ ID NO: 14).

[0037] FIG. 15 is a listing of a cDNA sequence for the tyrosinase (TYR) marker (SEQ ID NO: 15).

[0038] FIG. 16 is a listing of a cDNA sequence for the villin 1 (VIL1) marker (SEQ ID NO: 16).

[0039] FIG. 17 is a scatter plot showing the expression levels of CEA, CK7, SCCA1.2, CK20, TACSTD1, VIL1 and CK19 in primary tumor, tumor-positive lymph nodes and benign lymph nodes of an esophageal cancer patient.

[0040] FIGS. 18A-O provide scatter plots illustrating the ability of two-marker systems to distinguish between benign and malignant cells in a lymph node of an esophageal cancer patient (negative--gray circle; positive--black circle).

[0041] FIG. 19 is a scatter plot showing the expression levels of CEA, CK19, PTHrP, PVA, SCCA1.2 and TACSTD1 in primary tumor, tumor-positive lymph nodes and benign lymph nodes of a head & neck cancer patient.

[0042] FIGS. 20A-F provide scatter plots illustrating the ability of two-marker systems to distinguish between benign and malignant cells in a lymph node of a head & neck cancer patient (negative--circle; positive--"+").

[0043] FIG. 21 is a scatter plot showing the expression levels of MART1, TYR and MAGEA136-plex in primary tumor, tumor-positive lymph nodes and benign lymph nodes of a melanoma patient.

[0044] FIGS. 22A and 22B provide scatter plots illustrating the ability of two-marker systems to distinguish between benign and malignant cells in a lymph node of a melanoma patient (negative--circle; positive--"+").

[0045] FIG. 23 is a scatter plot showing the expression levels of CDX1, CEA, CK19, CK20, TACSTD1 and VIL1 in primary tumor, tumor-positive lymph nodes and benign lymph nodes of a colon cancer patient.

DETAILED DESCRIPTION

[0046] Provided are methods and compositions useful in identifying esophageal cancer, colon cancer, head and neck cancer and melanoma cells, including micrometastases, in lymph nodes. Early detection of metastases typically is related to patient survival. Very small metastases often go undetected in histological study of lymph node biopsies, producing false negative results that decrease the chances of patient survival. The nucleic acid detection assays described herein are much more discriminating than are histological studies in most instances (a few excellent histologists are capable of identifying micrometastases in lymph node sections), and are robust and repeatable in the hands of any minimally trained technician. Although the methods and compositions described herein are necessarily presented as comprising expression of specific mRNA markers, it should be understood that this shall not be deemed to exclude methods and compositions comprising combinations of the specific markers and other markers known in the art.

[0047] To this end, a number of molecular markers are identified that are expressed in certain cancer types, including esophageal cancer, colon cancer, head and neck cancer and melanoma. These markers are specific to the tissue from which the particular cancer type arises and typically are not expressed, at least to the same levels, in lymphoid tissue. The presence and/or elevated expression of one or more of these markers in sentinel lymph node tissue is indicative of displaced cells in the lymphoid tissue, which correlates strongly with a cancer diagnosis. As used herein, a "squamous cell carcinoma" is a cancer arising, at least in part, from a squamous cell population and/or containing, at least in part, a squamous cell population including, without limitation, cancers of the cervix; penis; head and neck, including, without limitation, cancers of the oral cavity, salivary glands, paranasal sinuses and nasal cavity, pharynx and larynx; lung; esophagus; skin other than melanoma; vulva and bladder.

[0048] As used herein, the terms "expression" and "expressed" mean production of a gene-specific mRNA by a cell. In the context of the present disclosure, a "marker" is a gene that is expressed abnormally in a lymphatic biopsy. In one embodiment, the markers described herein are mRNA species that are expressed in cells of a specific tumor source at a significantly higher level as compared to expression in lymphoid cells.

[0049] Expression levels of mRNA can be quantified by a number of methods. Traditional methods include Northern blot analysis. More recently, nucleic acid detection methods have been devised that facilitate quantification of transcripts. Examples of PCR methods are described in U.S. patent application Ser. No. 10/090,326 (U.S. Ser. No. 10/090,326), incorporated herein by reference in its entirety. Other methods for determining expression levels of a given mRNA include isothermic amplification or detection assays and array technologies, as are known in the art, such as, without limitation, those described below.

[0050] The improved PCR methods described herein as well as in U.S. Ser. No. 10/090,326, and other nucleic acid detection and amplification methods described herein and as are known in the art permit rapid detection of cancer cells in lymph node tissue. These rapid methods can be used intraoperatively, and also are useful in detecting rare nucleic acid species, even in multiplexed PCR reactions that concurrently detect a more prevalent control nucleic acid.

[0051] A typical PCR reaction includes multiple amplification steps, or cycles, that selectively amplify a target nucleic acid species. Because detection of transcripts is necessary, the PCR reaction is coupled with a reverse transcription step (reverse transcription PCR, or RT-PCR). A typical PCR cycle includes three steps: a denaturing step in which a target nucleic acid is denatured; an annealing step in which a set of PCR primers (forward and reverse primers) anneal to complementary DNA strands; and an elongation step in which a thermostable DNA polymerase elongates the primers. By repeating these steps multiple times, a DNA fragment is amplified to produce an amplicon corresponding to the target DNA sequence. Typical PCR reactions include 30 or more cycles of denaturation, annealing and elongation. In many cases, the annealing and elongation steps can be performed concurrently, that is, at the same temperature, in which case the cycle contains only two steps.
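As a rough illustration of why repeated cycling amplifies the target, the following minimal sketch computes an idealized copy number after a given number of cycles. The function name and the per-cycle `efficiency` parameter are assumptions introduced here for illustration and do not appear in the application; real reactions plateau and rarely double perfectly in every cycle.

```python
def amplicon_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after `cycles` PCR cycles.

    efficiency = 1.0 models perfect doubling in every cycle (an idealization);
    efficiency = 0.0 models no amplification at all.
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    # Each cycle multiplies the template count by (1 + efficiency).
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # One starting template, 30 cycles of perfect doubling.
    print(amplicon_copies(1, 30))
```

Under the perfect-doubling assumption, a single template yields 2^30 copies after 30 cycles, which is why 30 or more cycles suffice to make a rare target detectable.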

[0052] The lengths of the denaturation, annealing and elongation stages may be any desirable length of time. However, in attempting to shorten the PCR amplification reaction to a time suitable for intraoperative diagnosis, the lengths of these steps can be in the seconds range, rather than the minutes range. The denaturation step may be conducted for times of one second or less. The annealing and elongation steps optimally are less than 10 seconds each, and when conducted at the same temperature, the combination annealing/elongation step may be less than 10 seconds. Use of recently developed amplification techniques, such as conducting the PCR reaction in a Rayleigh-Benard convection cell, also can dramatically shorten the PCR reaction time beyond these time limits (see Krishnan, M. et al., "PCR in a Rayleigh-Benard Convection Cell," Science 298:793 (2002), and Braun, D. et al., "Exponential DNA Replication by Laminar Convection," Physical Review Letters 91:158103 (2003)).
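The effect of second-scale steps on total run time can be sketched with simple arithmetic. In the minimal sketch below, the function name and the optional `ramp_s` parameter (time spent changing temperature between steps) are assumptions added for illustration; the step durations in the usage lines are the upper bounds mentioned above, not measured values.

```python
from typing import Optional


def cycling_time_seconds(
    cycles: int,
    denature_s: float,
    anneal_s: float,
    extend_s: Optional[float] = None,
    ramp_s: float = 0.0,
) -> float:
    """Total thermal cycling time in seconds.

    Pass extend_s=None for a two-step cycle in which annealing and
    elongation are combined at one temperature (anneal_s then covers both).
    """
    per_cycle = denature_s + anneal_s + (extend_s or 0.0) + ramp_s
    return cycles * per_cycle


if __name__ == "__main__":
    # Two-step cycle: 1 s denaturation, 10 s combined annealing/elongation.
    print(cycling_time_seconds(40, 1, 10) / 60, "minutes")
    # Three-step cycle: 1 s denaturation, 10 s annealing, 10 s elongation.
    print(cycling_time_seconds(40, 1, 10, 10) / 60, "minutes")
```

Ignoring ramp time, 40 two-step cycles at these durations take 440 seconds, and 40 three-step cycles take 840 seconds. In practice the instrument's heating and cooling rates add to these figures.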

[0053] As described in U.S. Ser. No. 10/090,326, each cycle may be shortened considerably without substantial deterioration of production of amplicons. Use of high concentrations of primers is helpful in shortening the PCR cycle time. High concentrations typically are greater than about 400 nM, and often greater than about 800 nM, though the optimal concentration of primers will vary somewhat from assay-to-assay. Sensitivity of RT-PCR assays may be enhanced by the use of a sensitive reverse transcriptase enzyme (described below) and/or high concentrations of reverse transcriptase primer to produce the initial target PCR template.
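To make the primer concentrations concrete, the following minimal sketch applies the standard dilution relation C1·V1 = C2·V2 to find how much primer stock gives a chosen final concentration. The 10 µM stock and 25 µL reaction volume in the usage lines are hypothetical values chosen for illustration; neither is stated in the application.

```python
def primer_stock_volume_ul(final_nM: float, reaction_ul: float, stock_uM: float) -> float:
    """Volume of primer stock (in microliters) needed to reach `final_nM`
    in a reaction of `reaction_ul` microliters, using C1*V1 = C2*V2."""
    stock_nM = stock_uM * 1000.0  # convert micromolar to nanomolar
    if final_nM > stock_nM:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_nM * reaction_ul / stock_nM


if __name__ == "__main__":
    # Hypothetical 10 uM stock and 25 uL reaction.
    for target_nM in (400, 800):
        vol = primer_stock_volume_ul(target_nM, 25, 10)
        print(f"{target_nM} nM final -> {vol} uL of stock")
```

With these assumed values, 400 nM requires 1 µL of stock and 800 nM requires 2 µL per primer.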

[0054] The specificity of any given PCR reaction relies heavily, but not exclusively, on the identity of the primer sets. The primer sets are pairs of forward and reverse oligonucleotide primers that anneal to a target DNA sequence to permit amplification of the target sequence, thereby producing a target sequence-specific amplicon. PCR primer sets can include two primers internal to the target sequence, or one primer internal to the target sequence and one specific to a target sequence that is ligated to the DNA or cDNA target, using a technique known as "ligation-anchored PCR" (Troutt, A. B., et al. (1992), "Ligation-anchored PCR: A Simple Amplification Technique with Single-sided Specificity," Proc. Natl. Acad. Sci. USA, 89:9823-9825).
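How a primer set defines its amplicon can be sketched in a few lines. The example below handles only the case of two primers internal to the target, requires exact matches, and uses a made-up 30-base template; the sequences and function names are hypothetical and are not taken from the sequence listings of this application.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predict_amplicon(template: str, forward: str, reverse: str):
    """Return the sense-strand region bounded by the primer pair, or None.

    The forward primer must match the sense strand exactly; the reverse
    primer must match the antisense strand exactly, so its reverse
    complement is searched for downstream of the forward primer site.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(reverse_site)]


if __name__ == "__main__":
    toy_template = "TTGACCATGCGTACGTTAGCAAGTCCGGAT"  # hypothetical sequence
    print(predict_amplicon(toy_template, "ATGCGT", "GGACTT"))
```

Real primer design also has to account for mismatches, melting temperature and off-target binding sites, none of which this exact-match sketch models.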

[0055] As used herein, a "derivative" of a specified oligonucleotide is an oligonucleotide that binds to the same target sequence as the specified oligonucleotide and amplifies the same target sequence to produce essentially the same amplicon as the specified oligonucleotide but for differences between the specified oligonucleotide and its derivative. The derivative may differ from the specified oligonucleotide by insertion, deletion and/or substitution of any residue of the specified sequence so long as the derivative substantially retains the characteristics of the specified sequence in its use for the same purpose as the specified sequence.

[0056] As used herein, "reagents" for any assay or reaction, such as a reverse transcription and PCR, are any compounds or compositions that are added to the reaction mixture including, without limitation, enzyme(s), nucleotides or analogs thereof, primers and primer sets, probes, antibodies or other binding reagents, detectable labels or tags, buffers, salts and co-factors. As used herein, unless expressed otherwise, a "reaction mixture" for a given assay or reaction includes all compounds and/or compositions necessary to perform that assay or reaction, even if those compounds or compositions are not expressly indicated. Reagents for many common assays or reactions, such as enzymatic reactions, are known in the art and typically are provided and/or suggested when the assay or reaction kit is sold.

[0057] As also described in U.S. Ser. No. 10/090,326, multiplexed PCR assays may be optimized, or balanced, by time-shifting the production of amplicons, rather than by manipulating primer concentrations. This may be achieved by using two primer sets, each primer set having a different Tm so that a two-stage PCR assay can be performed, with different annealing and/or elongation temperatures for each stage to favor the production of one amplicon over another. This time and temperature shifting method permits optimal balancing of the multiplex reaction without the difficulties faced when manipulation of primer concentrations is used to balance the reaction. This technique is especially useful in a multiplex reaction where it is desirable to amplify a rare cDNA along with a control cDNA.

[0058] A quantitative reverse transcriptase polymerase chain reaction (QRT-PCR) for rapidly and accurately detecting low abundance RNA species in a population of RNA molecules (for example, and without limitation, total RNA or mRNA), includes the steps of: a) incubating an RNA sample with a reverse transcriptase and a high concentration of a target sequence-specific reverse transcriptase primer under conditions suitable to generate cDNA; b) subsequently adding suitable polymerase chain reaction (PCR) reagents to the reverse transcriptase reaction, including a high concentration of a PCR primer set specific to the cDNA and a thermostable DNA polymerase; and c) cycling the PCR reaction for a desired number of cycles and under suitable conditions to generate PCR product ("amplicons") specific to the cDNA. By temporally separating the reverse transcriptase and the PCR reactions, and by using reverse transcriptase-optimized and PCR-optimized primers, excellent specificity is obtained. The reaction may be conducted in a single tube (all tubes, containers, vials, cells and the like in which a reaction is performed may be referred to herein, from time to time, generically, as a "reaction vessel"), removing a source of contamination typically found in two-tube reactions. These reaction conditions permit very rapid QRT-PCR reactions, typically on the order of 20 minutes from the beginning of the reverse transcriptase reaction to the end of a 40 cycle PCR reaction.

[0059] The reaction c) may be performed in the same tube as the reverse transcriptase reaction by adding sufficient reagents to the reverse transcriptase (RT) reaction to create good, or even optimal conditions for the PCR reaction to proceed. A single tube may be loaded, prior to the running of the reverse transcriptase reaction, with: 1) the reverse transcriptase reaction mixture, and 2) the PCR reaction mixture to be mixed with the cDNA mixture after the reverse transcriptase reaction is completed. The reverse transcriptase reaction mixture and the PCR reaction mixture may be physically separated by a solid, or semi-solid (including amorphous, glassy substances and waxy) barrier of a composition that melts at a temperature greater than the incubation temperature of the reverse transcriptase reaction, but below the denaturing temperature of the PCR reaction. The barrier composition may be hydrophobic in nature and forms a second phase with the RT and PCR reaction mixtures when in liquid form. One example of such a barrier composition is wax beads, commonly used in PCR reactions, such as the AMPLIWAX PCR GEM products commercially available from Applied Biosystems of Foster City, Calif.

[0060] Alternatively, the separation of the reverse transcriptase and the PCR reactions may be achieved by adding the PCR reagents, including the PCR primer set and thermostable DNA polymerase, after the reverse transcriptase reaction is completed. Preferably, the PCR reagents are added mechanically by robotic or fluidic means to make sample contamination less likely and to remove human error.

[0061] The products of the QRT-PCR process may be compared after a fixed number of PCR cycles to determine the relative quantity of the RNA species as compared to a given reporter gene. One method of comparing the relative quantities of the products of the QRT-PCR process is by gel electrophoresis, for instance, by running the samples on a gel and detecting those samples by one of a number of known methods including, without limitation, Southern blotting and subsequent detection with a labeled probe, staining with ethidium bromide and incorporating fluorescent or radioactive tags in the amplicons.

[0062] However, the progress of the quantitative PCR reactions typically is monitored by determining the relative rates of amplicon production for each PCR primer set. Monitoring amplicon production may be achieved by a number of processes, including without limitation, fluorescent primers, fluorogenic probes and fluorescent dyes that bind double-stranded DNA. A common method is the fluorescent 5' nuclease assay. This method exploits the 5' nuclease activity of certain thermostable DNA polymerases (such as Taq or Tfl DNA polymerases) to cleave an oligomeric probe during the PCR process. The oligomer is selected to anneal to the amplified target sequence under elongation conditions. The probe typically has a fluorescent reporter on its 5' end and a fluorescent quencher of the reporter at the 3' end. So long as the oligomer is intact, the fluorescent signal from the reporter is quenched. However, when the oligomer is digested during the elongation process, the fluorescent reporter no longer is in proximity to the quencher. The relative accumulation of free fluorescent reporter for a given amplicon may be compared to the accumulation of the same amplicons for a control sample and/or to that of a control gene, such as β-actin or 18S rRNA, to determine the relative abundance of a given cDNA product of a given RNA in an RNA population. Products and reagents for the fluorescent 5' nuclease assay are readily available commercially, for instance from Applied Biosystems.
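The comparison of a target amplicon to a control gene described above is often expressed through threshold cycle (Ct) values. The following is a minimal sketch and not a calculation stated in this application: it assumes the commonly used comparative-Ct approximation, in which each cycle multiplies the amplicon by a fixed efficiency factor (2.0 for perfect doubling) and the target and control amplify with the same efficiency. The function name and parameters are hypothetical.

```python
def relative_abundance(ct_target: float, ct_control: float,
                       efficiency: float = 2.0) -> float:
    """Estimate target abundance relative to a control gene from Ct values.

    Assumption: both amplicons are multiplied by `efficiency` each cycle,
    so a target that crosses threshold N cycles later than the control is
    taken to be efficiency**N times less abundant.
    """
    if efficiency <= 1.0:
        raise ValueError("efficiency must be greater than 1.0")
    return efficiency ** (ct_control - ct_target)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 5 cycles after the control.
    print(relative_abundance(ct_target=25.0, ct_control=20.0))
```

Under these assumptions a target that reaches threshold five cycles after the control is estimated at 1/32 of the control's abundance. Real assays may need an empirically measured efficiency for each primer set.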

[0063] Equipment and software also are readily available for monitoring amplicon accumulation in PCR and QRT-PCR according to the fluorescent 5' nuclease assay and other QPCR/QRT-PCR procedures, including the Smart Cycler, commercially available from Cepheid of Sunnyvale, Calif., and the ABI Prism 7700 Sequence Detection System (TaqMan), commercially available from Applied Biosystems. A cartridge-based sample preparation system (GeneXpert) combines a thermal cycler and fluorescent detection device having the capabilities of the Smart Cycler product with fluid circuits and processing elements capable of automatically extracting specific nucleic acids from a tissue sample and performing QPCR or QRT-PCR on the nucleic acid. The system uses disposable cartridges that can be configured and pre-loaded with a broad variety of reagents. Such a system can be configured to disrupt tissue and extract total RNA or mRNA from the sample. The reverse transcriptase reaction components can be added automatically to the RNA and the QPCR reaction components can be added automatically upon completion of the reverse transcriptase reaction.

[0064] Further, the PCR reaction may be monitored for production (or loss) of a particular fluorochrome from the reaction. When the fluorochrome levels reach (or fall to) a desired level, the automated system will automatically alter the PCR conditions. In one example, this is particularly useful in the multiplexed embodiment described above, where a more-abundant (control) target species is amplified by the first, lower Tm, primer set at a lower temperature than the less abundant species amplified by the second, higher Tm, primer set. In the first stage of the PCR amplification, the annealing temperature is lower than the effective Tm of the first primer set. The annealing temperature then is automatically raised above the effective Tm of the first primer set when production of the first amplicon by the first primer set is detected. In a system that automatically dispenses multiple reagents from a cartridge, such as the GeneXpert system, a first PCR reaction may be conducted at the first Tm and, when the first PCR reaction proceeds past a threshold level, a second primer with a different Tm is added, resulting in a sequential multiplexed reaction.

[0065] In the above-described reactions, the amounts of certain reverse transcriptase and PCR reaction components are atypical in order to take advantage of the faster ramp times of some thermal cyclers. Specifically, the primer concentrations are very high. Typical gene-specific primer concentrations for reverse transcriptase reactions are less than about 20 nM. To achieve a rapid reverse transcriptase reaction on the order of one to two minutes, the reverse transcriptase primer concentration was raised to greater than 20 nM, preferably at least about 50 nM, and typically about 100 nM. Standard PCR primer concentrations range from 100 nM to 300 nM. Higher concentrations may be used in standard PCR reactions to compensate for Tm variations. However, the referenced primer concentrations are for circumstances where no Tm compensation is needed. Proportionately higher concentrations of primers may be empirically determined and used if Tm compensation is necessary or desired. To achieve rapid PCR reactions, the PCR primer concentrations typically are greater than 200 nM, preferably greater than about 500 nM and typically about 800 nM. Typically, the ratio of reverse transcriptase primer to PCR primer is about 1 to 8 or more. The increase in primer concentrations permitted PCR experiments of 40 cycles to be conducted in less than 20 minutes.
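The concentration ranges stated above can be collected into a simple check. The following is a minimal sketch for illustration only: the class and function names are hypothetical, the thresholds are the minimum values recited in the preceding paragraph (greater than 20 nM for the reverse transcriptase primer, greater than 200 nM for the PCR primers, and a PCR-to-RT primer ratio of about 8 or more), and treating "about" as a hard cutoff is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PrimerPlan:
    rt_primer_nM: float   # reverse transcriptase primer concentration
    pcr_primer_nM: float  # PCR primer concentration


def check_rapid_plan(plan: PrimerPlan) -> List[str]:
    """Return a list of reasons the plan falls outside the rapid-assay ranges."""
    issues = []
    if plan.rt_primer_nM <= 20:
        issues.append("RT primer should exceed 20 nM (typically about 100 nM)")
    if plan.pcr_primer_nM <= 200:
        issues.append("PCR primer should exceed 200 nM (typically about 800 nM)")
    # Assumption: "about 1 to 8 or more" is applied here as a strict minimum of 8.
    if plan.rt_primer_nM > 0 and plan.pcr_primer_nM / plan.rt_primer_nM < 8:
        issues.append("PCR primer should be at least about 8x the RT primer")
    return issues


if __name__ == "__main__":
    print(check_rapid_plan(PrimerPlan(rt_primer_nM=100, pcr_primer_nM=800)))
    print(check_rapid_plan(PrimerPlan(rt_primer_nM=20, pcr_primer_nM=300)))
```

The typical values given in the text (100 nM and 800 nM) pass all three checks, while a conventional plan of 20 nM and 300 nM is flagged only for the reverse transcriptase primer concentration.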

[0066] A sensitive reverse transcriptase may be preferred in certain circumstances where either low amounts of RNA are present or a target RNA is a low abundance RNA. By the term "sensitive reverse transcriptase," it is meant a reverse transcriptase capable of producing suitable PCR templates from low copy number transcripts. The sensitivity of the sensitive reverse transcriptase may derive from the physical nature of the enzyme, or from specific reaction conditions of the reverse transcriptase reaction mixture that produce the enhanced sensitivity. One example of a sensitive reverse transcriptase is SensiScript RT reverse transcriptase, commercially available from Qiagen, Inc. of Valencia, Calif. This reverse transcriptase is optimized for the production of cDNA from RNA samples of <50 ng, but also has the ability to produce PCR templates from low copy number transcripts. In practice, in the assays described herein, adequate results were obtained for samples of up to, and even in excess of, about 400 ng RNA. Other sensitive reverse transcriptases having substantially similar ability to reverse transcribe low copy number transcripts would be equivalent sensitive reverse transcriptases for the purposes described herein. Notwithstanding the above, the ability of the sensitive reverse transcriptase to produce cDNA from low quantities of RNA is secondary to the ability of the enzyme, or enzyme reaction system, to produce PCR templates from low copy number sequences.

[0067] As discussed above, the procedures described herein also may be used in multiplex QRT-PCR processes. In its broadest sense, a multiplex PCR process involves production of two or more amplicons in the same reaction vessel. Multiplex amplicons may be analyzed by gel electrophoresis and detection of the amplicons by one of a variety of methods, such as, without limitation, ethidium bromide staining, Southern blotting and hybridization to probes, or by incorporating fluorescent or radioactive moieties into the amplicons and subsequently viewing the product on a gel. However, real-time monitoring of the production of two or more amplicons is preferred. The fluorescent 5' nuclease assay is the most common monitoring method. Equipment is now available (for example, the above-described Smart Cycler and TaqMan products) that permits the real-time monitoring of accumulation of two or more fluorescent reporters in the same tube. For multiplex monitoring of the fluorescent 5' nuclease assay, oligomers are provided corresponding to each amplicon species to be detected. The oligomer probe for each amplicon species has a fluorescent reporter with a different peak emission wavelength than the oligomer probe(s) for each other amplicon species. The accumulation of each unquenched fluorescent reporter can be monitored to determine the relative amounts of the target sequence corresponding to each amplicon.

[0068] In traditional multiplex QPCR and QRT-PCR procedures, the selection of PCR primer sets having similar annealing and elongation kinetics and similar sized amplicons is desirable. The design and selection of appropriate PCR primer sets is a process that is well known to a person skilled in the art. The process for identifying optimal PCR primer sets, and respective ratios thereof to achieve a balanced multiplex reaction, also is known. By "balanced," it is meant that certain amplicon(s) do not out-compete the other amplicon(s) for resources, such as dNTPs or enzyme. For instance, limiting the abundance of the PCR primers for the more abundant RNA species in an RT-PCR experiment will allow the detection of less abundant species. Equalization of the Tm (melting temperature) for all PCR primer sets also is encouraged. See, for instance, ABI PRISM 7700 Sequence Detection System User Bulletin #5, "Multiplex PCR with TaqMan VIC Probes", Applied Biosystems (1998/2001).

[0069] Despite the above, for very low copy number transcripts, it is difficult to design accurate multiplex PCR experiments, even by limiting the PCR primer sets for the more abundant control species. One solution to this problem is to run the PCR reaction for the low abundance RNA in a separate tube from the PCR reaction for the more abundant species. However, that strategy does not take advantage of the benefits of running a multiplex PCR experiment. A two-tube process has several drawbacks, including cost, the addition of more room for experimental error and the increased chance of sample contamination, which is critical in PCR assays.

[0070] A method has been described in WO 02/070751 for performing a multiplex PCR process, including QRT-PCR and QPCR, capable of detecting low copy number nucleic acid species along with one or more higher copy number species. The difference between low copy number and high copy number nucleic acid species is relative, but is referred to herein as a difference in the prevalence of a low (lower) copy number species and a high (higher) copy number species of at least about 30-fold, but more typically at least about 100-fold. For purposes herein, the relative prevalence of two nucleic acid species to be amplified is more salient than the relative prevalence of the two nucleic acid species in relation to other nucleic acid species in a given nucleic acid sample because other nucleic acid species in the nucleic acid sample do not directly compete with the species to be amplified for PCR resources.

[0071] As used herein, the prevalence of any given nucleic acid species in a given nucleic acid sample, prior to testing, is unknown. Thus, the "expected" number of copies of a given nucleic acid species in a nucleic acid sample often is used herein and is based on historical data on the prevalence of that species in nucleic acid samples. For any given pair of nucleic acid species, one would expect, based on previous determinations of the relative prevalence of the two species in a sample, the prevalence of each species to fall within a range. By determining these ranges one would determine the difference in the expected number of target sequences for each species. An mRNA species is identified as "overabundant" if it is present in statistically significant amounts over normal prevalence of the mRNA species in a sample from a normal patient or lymph node. As is abundantly illustrated in the examples and plots provided herein, a person of skill in the art would be able to ascertain statistically significant ranges or cutoffs for determining the precise definition of "overabundance" for any one or more mRNA species.
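One way a cutoff for "overabundance" might be derived from historical data on normal samples is sketched below. This is an illustration only and not the statistical procedure of this application, which leaves the choice of ranges or cutoffs to the person of skill in the art. The sketch assumes a cutoff at the mean of the normal samples plus k sample standard deviations, with k = 3 chosen arbitrarily. The function names and the expression values are hypothetical.

```python
import statistics
from typing import Sequence


def overabundance_cutoff(normal_levels: Sequence[float], k: float = 3.0) -> float:
    """Cutoff = mean + k * sample standard deviation of normal-sample levels.

    Assumption: relative expression levels in normal samples are summarized
    adequately by their mean and standard deviation.
    """
    if len(normal_levels) < 2:
        raise ValueError("at least two normal samples are needed")
    return statistics.mean(normal_levels) + k * statistics.stdev(normal_levels)


def is_overabundant(level: float, normal_levels: Sequence[float], k: float = 3.0) -> bool:
    """True if the measured level exceeds the cutoff derived from normal samples."""
    return level > overabundance_cutoff(normal_levels, k)


if __name__ == "__main__":
    normals = [1.0, 2.0, 3.0]  # hypothetical relative expression in normal nodes
    print(overabundance_cutoff(normals))
    print(is_overabundant(6.0, normals), is_overabundant(4.0, normals))
```

Other cutoff rules, such as percentile-based thresholds, could be substituted without changing the structure of the check.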

[0072] The multiplex method involves performing a two- (or more) stage PCR amplification, permitting modulation of the relative rate of production of a first amplicon by a first primer set and a second amplicon by a second primer set during the respective amplification stages. By this method, PCR amplifications to produce amplicons directed to a lower abundance nucleic acid species are effectively "balanced" with PCR amplifications to produce amplicons directed to a higher abundance nucleic acid species. Separating the reaction into two or more temporal stages may be achieved by omitting the PCR primer set for any amplicons that are not to be produced in the first amplification stage. This is best achieved through use of automated processes, such as the GeneXpert prototype system described above. Two or more separate amplification stages may be used to tailor and balance multiplex assays, along with, or to the exclusion of, tailoring the concentration of the respective primer sets.

[0073] A second method for temporally separating the PCR amplification process into two or more stages is to select PCR primer sets with variation in their respective Tm. In one example, primers for a lower copy number nucleic acid species would have a higher Tm (Tm₁) than primers for a higher abundance species (Tm₂). In this process, the first stage of PCR amplification is conducted for a predetermined number of cycles at a temperature sufficiently higher than Tm₂ so that there is substantially no amplification of the higher abundance species. After the first stage of amplification, the annealing and elongation steps of the PCR reaction are conducted at a lower temperature, typically about Tm₂, so that both the lower abundance and the higher abundance amplimers are amplified. It should be noted that Tm, as used herein and unless otherwise noted, refers to "effective Tm," which is the Tm for any given primer in a given reaction mix, which depends on factors, including, without limitation, the nucleic acid sequence of the primer and the primer concentration in the reaction mixture.

[0074] It should be noted that PCR amplification is a dynamic process. When using temperature to modulate the respective PCR reactions in a multiplex PCR reaction, the higher temperature annealing stage may be carried out at any temperature typically ranging from just above the lower Tm to just below the higher Tm, so long as the reaction favors production of the amplicon by the higher Tm primer set. Similarly, the annealing for the lower temperature reaction typically is at any temperature below the Tm of the low temperature primer set.
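The two-stage, temperature-shifted amplification described above can be laid out as a per-cycle annealing schedule. The following is a minimal sketch under stated assumptions: the first stage anneals midway between the two effective Tm values (one of many temperatures between the lower and the higher Tm that would fit the description), the second stage anneals at the lower effective Tm, and the function name, parameters and temperatures are hypothetical.

```python
from typing import List, Tuple


def two_stage_profile(tm_high: float, tm_low: float,
                      stage1_cycles: int, total_cycles: int) -> List[Tuple[int, float]]:
    """Return (cycle number, annealing temperature in deg C) for each cycle."""
    if tm_high <= tm_low:
        raise ValueError("tm_high must exceed tm_low")
    if not 0 <= stage1_cycles <= total_cycles:
        raise ValueError("stage1_cycles must lie between 0 and total_cycles")
    # Assumption: stage 1 anneals midway between the two effective Tm values,
    # favoring the higher-Tm (lower abundance) primer set.
    stage1_anneal = (tm_high + tm_low) / 2.0
    # Assumption: stage 2 anneals at the lower effective Tm so both sets amplify.
    stage2_anneal = tm_low
    profile = []
    for cycle in range(1, total_cycles + 1):
        temp = stage1_anneal if cycle <= stage1_cycles else stage2_anneal
        profile.append((cycle, temp))
    return profile


if __name__ == "__main__":
    # Hypothetical effective Tm values of 68 and 58 deg C, 10 first-stage cycles of 40.
    for cycle, temp in two_stage_profile(68.0, 58.0, 10, 40)[8:12]:
        print(cycle, temp)
```

Because PCR amplification is a dynamic process, the actual stage temperatures and cycle counts would be determined empirically for each pair of primer sets.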

[0075] In the example provided above, in the higher temperature stage the amplicon for the low abundance RNA is amplified at a rate faster than that of the amplicon for the higher abundance RNA (and preferably to the substantial exclusion of production of the second amplicon), so that, prior to the second amplification stage, where it is desirable that amplification of all amplicons proceeds in a substantially balanced manner, the amplicon for the lower abundance RNA is of sufficient abundance that the amplification of the higher abundance RNA does not interfere with the amplification of the amplicon for the lower abundance RNA. In the first stage of amplification, when the amplicon for the low abundance nucleic acid is preferentially amplified, the annealing and elongation steps may be performed above Tm₁ to gain specificity over efficiency (during the second stage of the amplification, since there is a relatively large number of low abundance nucleic acid amplicons, selectivity no longer is a significant issue, and efficiency of amplicon production is preferred). It, therefore, should be noted that although favorable in many instances, the temperature variations may not necessarily result in the complete shutdown of one amplification reaction over another.

[0076] In another variation of the above-described amplification reaction, a first primer set with a first Tm may target a more-abundant template sequence (for instance, the control template sequence) and a second primer set with a higher Tm may target a less-abundant template sequence. In this case, the more-abundant template and the less-abundant template may both be amplified in a first stage at a temperature below the (lower) Tm of the first primer set. When a threshold amount of amplicon corresponding to the more abundant template is reached, the annealing and/or elongation temperature of the reaction is raised above the Tm of the first primer set, but below the higher Tm of the second primer set to effectively shut down amplification of the more abundant template.
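The threshold-triggered variation described above can be sketched as a simple control loop. This is an illustration only: the function name, the fluorescence readings and the temperatures are hypothetical, and the sketch assumes that the control amplicon's fluorescence is read once per cycle and that the temperature change takes effect from the following cycle.

```python
from typing import List, Sequence


def adaptive_annealing(control_signal: Sequence[float], threshold: float,
                       low_anneal: float, high_anneal: float) -> List[float]:
    """Return the annealing temperature used in each cycle.

    control_signal[i] is the fluorescence of the more-abundant (control)
    amplicon measured at the end of cycle i + 1. Once it reaches the
    threshold, all later cycles anneal at the higher temperature, which
    is assumed to lie above the lower Tm and below the higher Tm.
    """
    temps = []
    shifted = False
    for reading in control_signal:
        temps.append(high_anneal if shifted else low_anneal)
        if reading >= threshold:
            shifted = True  # applies from the next cycle onward
    return temps


if __name__ == "__main__":
    readings = [0.1, 0.2, 0.6, 0.9, 1.0]  # hypothetical normalized fluorescence
    print(adaptive_annealing(readings, threshold=0.5, low_anneal=55.0, high_anneal=64.0))
```

In this sketch the shift is permanent once triggered. An instrument could equally apply other rules, such as requiring the threshold to be met in consecutive cycles.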

[0077] Selection of three or more PCR primer sets having three or more different Tms (for instance, Tm₁>Tm₂>Tm₃) can be used to amplify sequences of varying abundance in a stepwise manner, so long as the differences in the Tms are sufficiently large to permit preferential amplification of desired sequences to the substantial exclusion of undesired sequences for a desired number of cycles. In that process, the lowest abundance sequences are amplified in a first stage for a predetermined number of cycles. Next, the lowest abundance and the lesser abundance sequences are amplified in a second stage for a predetermined number of cycles. Lastly, all sequences are amplified in a third stage. As with the two-stage reaction described above, the minimum temperature for each stage may vary, depending on the relative efficiencies of each single amplification reaction of the multiplex reaction. It should be recognized that two or more amplimers may have substantially the same Tm, to permit amplification of more than one species of similar abundance at any stage of the amplification process. As with the two-stage reaction, the three-stage reaction may also proceed stepwise from amplification of the most abundant nucleic acid species at the lowest annealing temperature to amplification of the least abundant species at the highest annealing temperature.
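The stepwise scheme with three or more Tms can be illustrated by listing which primer sets are expected to amplify at each stage. The following is a minimal sketch with hypothetical names and temperatures. It treats the effective Tm as a sharp cutoff, which is a simplifying assumption, since amplification is dynamic and a primer set may still amplify at reduced efficiency above its Tm.

```python
from typing import Dict, List


def staged_targets(primer_tms: Dict[str, float],
                   anneal_temps: List[float]) -> List[List[str]]:
    """For each stage's annealing temperature, list the primer sets whose
    effective Tm is at or above that temperature (assumed to amplify)."""
    return [sorted(name for name, tm in primer_tms.items() if tm >= temp)
            for temp in anneal_temps]


if __name__ == "__main__":
    # Hypothetical effective Tm values: highest Tm for the rarest target.
    tms = {"rare": 70.0, "intermediate": 64.0, "control": 58.0}
    for stage, names in enumerate(staged_targets(tms, [68.0, 62.0, 56.0]), start=1):
        print(f"stage {stage}: {names}")
```

With these values the first stage amplifies only the rarest target, the second adds the intermediate target, and the third amplifies all three.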

[0078] By this sequential amplification method, an additional tool is provided for the "balancing" of multiplex PCR reactions besides the matching of Tms and using limiting amounts of one or more PCR primer sets. The exploitation of PCR primer sets with different Tms as a method for sequentially amplifying different amplicons may be preferred in certain circumstances to the sequential addition of additional primer sets. However, the use of temperature-dependent sequencing of multiplex PCR reactions may be coupled with the sequential physical addition of primer sets to a single reaction mixture.

[0079] An internal positive control that confirms the operation of a particular amplification reaction for a negative result also may be used. The internal positive controls (IPCs) are DNA oligonucleotides that have the same primer sequences as the target gene (CEA or tyrosinase) but have a different internal probe sequence. Selected sites in the IPCs optionally may be synthesized with uracil instead of thymine so that contamination with the highly concentrated mimic could be controlled using uracil DNA glycosylase, if required. The IPCs may be added to any PCR reaction mastermix in amounts that are determined empirically to give Ct values typically greater than the Ct values of the endogenous target of the primer set. The PCR assays are then performed according to standard protocols, and even when there is no endogenous target for the primer set, the IPC would be amplified, thereby verifying that the failure to amplify the target endogenous DNA is not a failure of the PCR reagents in the mastermix. In this embodiment, the IPC probe fluoresces differently than the probe for the endogenous sequences. A variation of this for use in RT-PCR reactions is where the IPC is an RNA and the RNA includes an RT primer sequence. In this embodiment, the IPC verifies function of both the RT and PCR reactions. Both RNA and DNA IPCs (with different corresponding probes) may also be employed to differentiate difficulties in the RT and PCR reactions.
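The role of the IPC in qualifying a negative result can be expressed as a small decision rule. The following is a minimal sketch with a hypothetical function name and result labels. It assumes that detection of each probe has already been reduced to a yes/no call, and that a detected target is reported as positive regardless of the IPC signal, which is an assumption of this sketch rather than a statement from the text.

```python
def interpret_reaction(target_detected: bool, ipc_detected: bool) -> str:
    """Classify a single reaction from its target and IPC probe signals."""
    if target_detected:
        # Assumption: a detected target is positive whether or not the IPC amplified.
        return "positive"
    if ipc_detected:
        # The IPC amplified, so the reagents worked and the negative is credible.
        return "negative (verified by IPC)"
    # Neither signal appeared, so a reagent or reaction failure cannot be excluded.
    return "invalid (no amplification)"


if __name__ == "__main__":
    for target, ipc in [(True, True), (True, False), (False, True), (False, False)]:
        print(target, ipc, "->", interpret_reaction(target, ipc))
```

Where both RNA and DNA IPCs are used, the same rule could be applied to each control separately to indicate whether a failure arose in the RT step or the PCR step.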

[0080] The rapid QRT-PCR protocols described herein may be run in about 20 minutes. This short time period permits the assay to be run intraoperatively so that a surgeon can decide on a surgical course during a single operation (typically the patient will remain anesthetized and/or otherwise sedated in a single "operation", though there may be a waiting period between when the sample to be tested is obtained and the time the intraoperative assay is complete), rather than requiring a second operation, or requiring the surgeon to perform unneeded or overly broad prophylactic procedures. For instance, in the surgical evaluation of certain cancers, including breast cancer, melanoma, lung cancer, esophageal cancer and colon cancer, tumors and sentinel lymph nodes are removed in a first operation. The sentinel nodes are later evaluated for micrometastases, and, when micrometastases are detected in a patient's sentinel lymph node, the patient will need a second operation, thereby increasing the patient's surgical risks and patient discomfort associated with multiple operations. With the ability to determine the expression levels of certain tumor-specific markers described herein in less than 30 minutes with increased accuracy, a physician can make an immediate decision on how to proceed without requiring the patient to leave the operating room or associated facilities. The rapid test also is applicable to needle biopsies taken in a physician's office. A patient need not wait for days to get the results of a biopsy (such as a needle biopsy of a tumor or lymph node), but can now get more accurate results in a very short time.

[0081] As used herein, in the context of gene expression analysis, a probe is "specific to" a gene or transcript if under reaction conditions it can hybridize specifically to transcripts of that gene within a sample, or sequences complementary thereto, and not to other transcripts. Thus, in a diagnostic assay, a probe is specific to a gene if it can bind to a specific transcript or desired family of transcripts in mRNA extracted from a specimen, to the practical exclusion (does not interfere substantially with the detection assay) of other transcripts. In a PCR assay, primers are specific to a gene if they specifically amplify a sequence of that gene, to the practical exclusion of other sequences in a sample.

[0082] Table B provides primer and probe sequences for the mRNA quantification assays described and depicted in the Examples and Figures. FIGS. 1-16 provide non-limiting examples of cDNA sequences of the various mRNA species detected in the Examples. Although the sequences provided in Table B were found effective in the assays described in the examples, other primers and probes would likely be equally suited for use in the QRT-PCR and other mRNA detection and quantification assays, either described herein or as are known in the art. Design of alternate primer and probe sets for PCR assays, as well as for other mRNA detection assays, is well within the abilities of one of average skill in the art. For example and without limitation, a number of computer software programs will generate primers and primer sets for PCR assays from cDNA sequences according to specified parameters. Non-limiting examples of such software include NetPrimer and Primer Premier 5, commercially available from PREMIER Biosoft International of Palo Alto, Calif., which also provides primer and probe design software for molecular beacon and array assays. Primers and/or probes for two or more different mRNAs can be identified, for example and without limitation, by aligning the two or more target sequences according to standard methods, determining common sequences between the two or more mRNAs and entering the common sequences into a suitable primer design computer program.
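The step of determining common sequences between two target mRNAs can be illustrated in a very simplified form. The following sketch is not an alignment method and does not reproduce the behavior of any software named above: it merely lists exact subsequences of a chosen length that occur in both sequences, as a stand-in for the shared regions a standard alignment would reveal. The function name and the toy sequences are hypothetical and are not taken from the Figures.

```python
from typing import List


def common_kmers(seq_a: str, seq_b: str, k: int) -> List[str]:
    """Return the sorted exact subsequences of length k present in both sequences."""
    if k <= 0:
        raise ValueError("k must be positive")
    kmers_a = {seq_a[i:i + k] for i in range(len(seq_a) - k + 1)}
    kmers_b = {seq_b[i:i + k] for i in range(len(seq_b) - k + 1)}
    return sorted(kmers_a & kmers_b)


if __name__ == "__main__":
    # Toy sequences for illustration only.
    print(common_kmers("ACGTACGGA", "TTACGGAT", 5))
```

Shared stretches found this way would still need to be checked for primer suitability, for example melting temperature and secondary structure, in a primer design program.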

[0083] As used herein, a "primer or probe" for detecting a specific mRNA species is any primer, primer set and/or probe that can be utilized to detect and/or quantify the specific mRNA species. An "mRNA species" can be a single mRNA species, corresponding to a single mRNA expression product of a single gene, or can be multiple mRNAs that are detected by a single common primer and/or probe combination, such as the SCCA1.2 and MAGEA136-plex species described below.

[0084] In the commercialization of the methods described herein, certain kits for detection of specific nucleic acids will be particularly useful. A kit typically comprises one or more reagents, such as, without limitation, nucleic acid primers or probes, packaged in a container, such as, without limitation, a vial, tube or bottle, in a package suitable for commercial distribution, such as, without limitation, a box, a sealed pouch, a blister pack and a carton. The package typically contains indicia, for example and without limitation, a writing, illustration, label, book, booklet, tag and/or packaging insert, indicating that the packaged reagents can be used in a method for identifying expression of markers indicative of the presence of cancer cells in a lymph node of a patient. As used herein, "packaging materials" includes any article used in the packaging for distribution of reagents in a kit, including, without limitation, containers, vials, tubes, bottles, pouches, blister packaging, labels, tags, instruction sheets, and package inserts.

[0085] One example of such a kit would include reagents necessary for the one-tube QRT-PCR process described above. In one example, the kit would include the above-described reagents, including reverse transcriptase, a reverse transcriptase primer, a corresponding PCR primer set, a thermostable DNA polymerase, such as Taq polymerase, and a suitable fluorescent reporter, such as, without limitation, a probe for a fluorescent 5' nuclease assay, a molecular beacon probe, a single dye primer or a fluorescent dye specific to double-stranded DNA, such as ethidium bromide. The primers may be present in quantities that would yield the high concentrations described above. Thermostable DNA polymerases are commonly and commercially available from a variety of manufacturers. Additional materials in the kit may include: suitable reaction tubes or vials, a barrier composition, typically a wax bead, optionally including magnesium; reaction mixtures (typically 10X) for the reverse transcriptase and the PCR stages, including necessary buffers and reagents such as dNTPs; nuclease- or RNase-free water; RNase inhibitor; control nucleic acid(s) and/or any additional buffers, compounds, co-factors, ionic constituents, proteins and enzymes, polymers, and the like that may be used in reverse transcriptase and/or PCR stages of QRT-PCR reactions.

[0086] Components of a kit are packaged in any manner that is commercially practicable. For example, PCR primers and reverse transcriptase may be packaged individually to facilitate flexibility in configuring the assay, or together to increase ease of use and to reduce contamination. Similarly, buffers, salts and co-factors can be packaged separately or together.

[0087] The kits also may include reagents and mechanical components suitable for the manual or automated extraction of nucleic acid from a tissue sample. These reagents are known to those skilled in the art and typically are a matter of design choice. For instance, in one embodiment of an automated process, tissue is disrupted ultrasonically in a suitable lysis solution provided in the kit. The resultant lysate solution is then filtered and RNA is bound to RNA-binding magnetic beads also provided in the kit or cartridge. The bead-bound RNA is washed, and the RNA is eluted from the beads and placed into a suitable reverse transcriptase reaction mixture prior to the reverse transcriptase reaction. In automated processes, the choice of reagents and their mode of packaging (for instance in disposable single-use cartridges) typically are dictated by the physical configuration of the robotics and fluidics of the specific RNA extraction system, for example and without limitation, the GeneXpert system. International Patent Publication Nos. WO 04/48931, WO 03/77055, WO 03/72253, WO 03/55973, WO 02/52030, WO 02/18902, WO 01/84463, WO 01/57253, WO 01/45845, WO 00/73413, WO 00/73412 and WO 00/72970 provide non-limiting examples of cartridge-based systems and related technology useful in the methods described herein.

[0088] The constituents of the kits may be packaged together or separately, and each constituent may be presented in one or more tubes or vials, or in cartridge form, as is appropriate. The constituents, independently or together, may be packaged in any useful state, including without limitation, in a dehydrated, a lyophilized, a glassified or an aqueous state. The kits may take the physical form of a cartridge for use in automated processes, having two or more compartments including the above-described reagents. Suitable cartridges are disclosed for example in U.S. Pat. Nos. 6,440,725, 6,431,476, 6,403,037 and 6,374,684.

[0089] Array technologies also can facilitate determining the expression level of two or more genes by facilitating performance of the desired reactions and their analysis by running multiple parallel reactions at the same time. One example of an array is the GeneChip.RTM. gene expression array, commercially available from Affymetrix, Inc. of Santa Clara, Calif. Patents illustrating array technology and uses therefor include, without limitation, U.S. Pat. Nos. 6,040,138, 6,245,517, 6,251,601, 6,261,776, 6,306,643, 6,309,823, 6,346,413, 6,406,844 and 6,416,952. A plethora of other "array" patents exist, illustrating the multitude of physical forms a useful array can take. An "array", such as a "microarray" can be a substrate containing one or more binding reagents, typically in discrete physical locations, permitting high throughput analysis of the binding of a sample to the array. In the context of the methods described herein, an array contains probes specific to transcripts of one or more of the genes described herein affixed to a substrate. The probes can be nucleic acids or analogs thereof, as are known in the art. An array also can refer to a plurality of discrete reaction chambers, permitting multiple parallel reactions and detection events on a miniaturized scale.

[0090] As mentioned above, PCR-based technologies may be used to quantify mRNA levels in a given tissue sample. Other sequence-specific nucleic acid quantification methods may be more or less suited. In one embodiment, the nucleic acid quantification method is a rolling circle amplification method. Non-limiting examples of rolling circle amplification methods are described in U.S. Pat. Nos. 5,854,003; 6,183,960; 6,344,329; and 6,210,884, each of which is incorporated herein by reference to the extent they teach methods for detecting and quantifying RNA species. In one embodiment, a padlock probe is employed to facilitate the rolling circle amplification process. (See Nilsson, M. et al. (2002), "Making Ends Meet in Genetic Analysis Using Padlock Probes," Human Mutation 19:410-415 and Schweitzer, B. et al. (2001), "Combining Nucleic Acid Amplification and Detection," Current Opinion in Biotechnology, 12:21-27). A padlock probe is a linear oligonucleotide or polynucleotide designed to include one target-complementary sequence at each end, and which is designed such that the two ends are brought immediately next to each other upon hybridization to the target sequence. The probe also includes a spacer between the target-complementary sequences that includes a polymerase primer site and a site for binding to a probe, such as a molecular beacon probe, for detecting the padlock probe spacer sequence. If properly hybridized to an RNA template, the probe ends can then be joined by enzymatic DNA ligation to form a circular template that can be amplified by polymerase extension of a complementary primer. Thousands of concatemerized copies of the template can be generated by each primer, permitting detection and quantification of the original RNA template. Quantification can be automated by use, for example and without limitation, of a molecular beacon probe or other probe capable of detecting accumulation of a target sequence. By using padlock probes with different spacers to bind different molecular beacons that fluoresce a different color on binding to the amplified spacer, this automated reaction can be multiplexed. Padlock probe sequences target unique portions of the target RNA in order to ensure specific binding with limited or no cross-reactivity. RCA is an isothermic method in that the amplification is performed at one temperature.
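By way of illustration only, the geometry of a padlock probe described above may be sketched in Python as follows. The target sequence, spacer sequence, arm length and function names are hypothetical and are not taken from this disclosure; the sketch shows only that the 5' arm is complementary to the target region upstream of the ligation junction and the 3' arm is complementary to the region downstream, so that the two probe ends abut upon hybridization.

```python
# Illustrative sketch only: all sequences and names here are hypothetical.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "U": "A"}


def reverse_complement(seq: str) -> str:
    """Return the DNA reverse complement of an RNA or DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_padlock(target: str, junction: int, arm_len: int, spacer: str) -> str:
    """Build a linear padlock probe (5'->3') whose two ends meet at `junction`.

    `junction` is the 0-based index in `target` (written 5'->3') at which
    the probe's 5' and 3' ends are brought next to each other.
    """
    if junction - arm_len < 0 or junction + arm_len > len(target):
        raise ValueError("probe arms would extend past the target")
    upstream = target[junction - arm_len:junction]
    downstream = target[junction:junction + arm_len]
    # The probe is antiparallel to the target, so the 5' arm pairs with the
    # upstream target region and the 3' arm pairs with the downstream region.
    five_prime_arm = reverse_complement(upstream)
    three_prime_arm = reverse_complement(downstream)
    return five_prime_arm + spacer + three_prime_arm


def ligation_footprint(probe: str, arm_len: int) -> str:
    """Sequence read across the ligation site once the probe is circularized."""
    return probe[-arm_len:] + probe[:arm_len]


target_rna = "AUGGCUAGCUUGACCGAUCGUACG"   # hypothetical target
spacer_seq = "TTTTGGGGCCCCTTTT"          # hypothetical primer/beacon site
probe = design_padlock(target_rna, junction=12, arm_len=8, spacer=spacer_seq)
```

In this sketch the circularized probe reads, across the ligation site, as the exact reverse complement of the sixteen target bases that flank the junction, which is the condition under which ligation can occur.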

[0091] Another isothermic method, for example and without limitation, is nucleic acid sequence-based amplification (NASBA). A typical NASBA reaction is initiated by the annealing of a first oligonucleotide primer to an RNA target in an RNA sample. The 3' end of the first primer is complementary to the target analyte; the 5' end encodes the T7 RNA polymerase promoter. After annealing, the primer is extended by reverse transcription (AMV-RT, for example) to produce a cDNA. The RNA is digested with RNase H, permitting a second primer (sense) to anneal to the cDNA strand, permitting the DNA polymerase activity of the reverse transcriptase to be engaged, producing a double-stranded cDNA copy of the original RNA template, with a functional T7 RNA polymerase promoter at one end. T7 polymerase is then used to produce an additional RNA template, which is further amplified, though in reverse order, according to the same procedure. A variety of other nucleic acid detection and/or amplification methods are known to those of skill in the art, including variations on the isothermic strand displacement, PCR and RCA methods described herein.

Example 1

General Materials and Methods

[0092] Identification of Potential Markers. An extensive literature and public database survey was conducted to identify any potential markers. Resources for this survey included PubMed, OMIM, UniGene (http://www.ncbi.nlm.nih.gov/), GeneCards (http://bioinfo.weizmann.ac.il/cards), and CGAP (http://cgap.nci.nih.gov). Survey criteria were somewhat flexible but the goal was to identify genes with moderate to high expression in tumors and low expression in normal lymph nodes. In addition, genes reported to be upregulated in tumors and genes with restricted tissue distribution were considered potentially useful. Finally, genes reported to be cancer-specific, such as the cancer testis antigens and hTERT, were evaluated.

[0093] Tissues and Pathological Evaluation. Tissue specimens were obtained from tissue banks at the University of Pittsburgh Medical Center through IRB approved protocols. All specimens were snap frozen in liquid nitrogen and later embedded in OCT for frozen sectioning. Twenty 5-micron sections were cut from each tissue for RNA isolation. In addition, sections were cut and placed on slides for H&E and IHC analysis at the beginning, middle (between the tenth and eleventh sections for RNA), and end of the sections for RNA isolation. All three H&E slides from each specimen underwent pathological review to confirm presence of tumor, percentage of tumor, and to identify the presence of any contaminating tissues. All of the unstained slides were stored at -20.degree. C. Immunohistochemistry evaluation was performed using the AE1/AE3 antibody cocktail (DAKO, Carpinteria, Calif.), and Vector Elite ABC kit and Vector AEC Chromogen (Vector Laboratories, Burlingame, Calif.). IHC was used as needed to confirm the H&E histology.

[0094] Screening Approach. The screening was conducted in two phases. All potential markers entered the primary screening phase and expression was analyzed in 6 primary tumors and 10 benign lymph nodes obtained from patients without cancer (5 RNA pools with 2 lymph node RNAs per pool). Markers that showed good characteristics for lymph node metastasis detection passed into the secondary screening phase. The secondary screen consisted of expression analysis on 20-25 primary tumors, 20-25 histologically positive lymph nodes and 21 benign lymph nodes without cancer.

[0095] RNA Isolation and cDNA Synthesis. RNA was isolated using the RNeasy minikit (Qiagen, Valencia, Calif.) essentially as described by the manufacturer. The only modification was that we doubled the volume of lysis reagent and loaded the column in two steps. This was found to provide better RNA yield and purity, probably as a result of diluting out the OCT in the tissue sections. Reverse transcription was performed in 100-.mu.l reaction volumes either with random hexamer priming or sequence-specific priming using a primer indicated in Table C and Superscript II (Invitrogen, Carlsbad, Calif.) reverse transcriptase. For the primary screen, three reverse transcription reactions were performed, each with 500 ng of RNA. The cDNAs were combined and QPCR was performed using the equivalent of 20 ng RNA per reaction. For the secondary screen, the RNA input for primary tumors and positive nodes was also 500 ng. For benign nodes, however, the RNA input was 2000 ng, resulting in the equivalent of 80 ng RNA per QPCR reaction.
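The RNA-equivalent figures given above can be checked with the following short Python sketch. It assumes, as an inference not stated in the text, that the same volume of cDNA was carried into each QPCR reaction regardless of RNA input; the function names are illustrative only.

```python
def cdna_volume_for(target_ng: float, rna_input_ng: float, rt_volume_ul: float) -> float:
    """Volume of RT product (in microliters) carrying `target_ng` of input RNA."""
    return target_ng * rt_volume_ul / rna_input_ng


def rna_equivalent_per_qpcr(rna_input_ng: float, rt_volume_ul: float,
                            cdna_volume_ul: float) -> float:
    """RNA equivalent (ng) delivered by `cdna_volume_ul` of RT product."""
    return rna_input_ng / rt_volume_ul * cdna_volume_ul


# 500 ng RNA in a 100 microliter RT reaction; 20 ng equivalent per QPCR.
# Pooling identical RT reactions does not change the concentration.
volume_ul = cdna_volume_for(20, 500, 100)

# Assumption: the same cDNA volume is used for benign nodes (2000 ng input).
benign_equivalent_ng = rna_equivalent_per_qpcr(2000, 100, volume_ul)
```

Under that assumption, 4 .mu.l of cDNA corresponds to 20 ng of RNA at a 500 ng input and to 80 ng at a 2000 ng input, consistent with the values stated above.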

[0096] Quantitative PCR. All quantitative PCR was performed on the ABI Prism 7700 Sequence Detection Instrument (Applied Biosystems, Foster City, Calif.). Relative expression of the marker genes was calculated using the delta-C.sub.T methods previously described and with beta-glucuronidase as the endogenous control gene. All assays were designed for use with 5' nuclease hybridization probes, although the primary screening was performed using SYBR Green quantification in order to save cost. Assays were designed using the ABI Primer Express Version 2.0 software and, where possible, amplicons spanned exon junctions in order to provide cDNA specificity. All primer pairs were tested for amplification specificity (generation of a single band on gels) at 60, 62 and 64.degree. C. annealing temperature. In addition, PCR efficiency was estimated using SYBR Green quantification prior to use in the primary screen. Further optimization and more precise estimates of efficiency were performed with 5' nuclease probes for all assays used in the secondary screen.
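For illustration, a commonly used form of the delta-C.sub.T calculation and of a dilution-series efficiency estimate is sketched below in Python. The text refers only to "delta-C.sub.T methods previously described", so the exact formulas used in the studies are not reproduced here; the sketch assumes that the marker and the endogenous control amplify with the same efficiency, and all names and values are hypothetical.

```python
def relative_expression(ct_marker: float, ct_control: float,
                        efficiency: float = 1.0) -> float:
    """Marker abundance relative to the endogenous control.

    Uses (1 + E) ** -(Ct_marker - Ct_control); E = 1.0 means the product
    doubles every cycle. Equal efficiency for both assays is an assumption.
    """
    return (1.0 + efficiency) ** -(ct_marker - ct_control)


def efficiency_from_dilutions(log10_inputs, cts) -> float:
    """Estimate PCR efficiency from a dilution series.

    Fits Ct against log10(template input) by least squares and converts the
    slope to efficiency with E = 10 ** (-1 / slope) - 1.
    """
    n = len(cts)
    mean_x = sum(log10_inputs) / n
    mean_y = sum(cts) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_inputs, cts))
    sxx = sum((x - mean_x) ** 2 for x in log10_inputs)
    slope = sxy / sxx
    return 10 ** (-1.0 / slope) - 1.0
```

For example, a marker detected three cycles later than the control gene would, at perfect efficiency, be scored at one eighth of the control's abundance.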

[0097] A mixture of the Universal Human Reference RNA (Stratagene, La Jolla, Calif.) and RNAs from human placenta, thyroid, heart, colon, PCI13 cell line and SKBR3 cell line served as a universal positive expression control for all the genes in the marker screening process.

[0098] Quantification with SYBR Green (Primary Screen). For SYBR Green I-based QPCR, each 50 .mu.l reaction contained 1.times. TaqMan buffer A (Applied Biosystems), 300 nM each dNTP, 3.5 mM MgCl.sub.2, 0.06 units/.mu.l Amplitaq Gold (Applied Biosystems), 0.25.times. SYBR Green I (Molecular Probes, Eugene, Oreg.) and 200 nM each primer. The amplification program comprised two stages: an initial 95.degree. C. Taq activation stage for 12 min, followed by 40 cycles of 95.degree. C. denaturation for 15 s, 60, 62 or 64.degree. C. anneal/extend for 60 s, and a 10 second data collection step at a temperature 2-4.degree. C. below the T.sub.m of the specific PCR product being amplified (Tom B. Morrison, et al, 1998). After amplification, a melting curve analysis was performed by collecting fluorescence data while increasing the temperature from 60.degree. C. to 95.degree. C. over 20 minutes.
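The amplification program described above may be represented, purely for illustration, by the following Python sketch. The class and function names are hypothetical, the product T.sub.m passed in is an example value, and the computed duration counts programmed hold times only and ignores temperature ramping.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: float
    seconds: int


def sybr_program(anneal_c: float, product_tm_c: float, offset_c: float = 3.0):
    """Return (activation step, per-cycle steps, cycle count) for the program.

    `offset_c` is how far below the product Tm fluorescence is collected;
    the text gives a range of 2-4 degrees C.
    """
    if anneal_c not in (60.0, 62.0, 64.0):
        raise ValueError("anneal/extend temperature must be 60, 62 or 64 C")
    if not 2.0 <= offset_c <= 4.0:
        raise ValueError("data collection offset must be 2-4 C below Tm")
    activation = Step("Taq activation", 95.0, 12 * 60)
    cycle = [
        Step("denaturation", 95.0, 15),
        Step("anneal/extend", anneal_c, 60),
        Step("data collection", product_tm_c - offset_c, 10),
    ]
    return activation, cycle, 40


def hold_time_seconds(activation: Step, cycle, n_cycles: int) -> int:
    """Total programmed hold time, excluding ramp time between steps."""
    return activation.seconds + n_cycles * sum(step.seconds for step in cycle)


# Hypothetical product Tm of 84 C, collected 3 C below it.
activation, cycle, n_cycles = sybr_program(anneal_c=60.0, product_tm_c=84.0)
total_seconds = hold_time_seconds(activation, cycle, n_cycles)
```

The programmed holds amount to 4120 seconds, or a little under 69 minutes, before the melting curve analysis.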

[0099] Quantification with 5' Nuclease Probes (Secondary Screen). Probe-based QPCR was performed as described previously (Godfrey et al., Clinical Cancer Res. 2001 December, 7(12):4041-8). Briefly, reactions were performed with a probe concentration of 200 nM and a 60 second anneal/extend phase at 60.degree. C., or 62.degree. C., or 64.degree. C. The sequences of primers and probes (purchased from IDT, Coralville, Iowa) for genes evaluated in the secondary screen are listed in Table B, below.

[0100] Data Analysis. In the primary screen, data from the melt curve was analyzed using the ABI Prism 7700 Dissociation Curve Analysis 1.0 software (Applied Biosystems). The first derivative of the melting curve was used to determine the product T.sub.m as well as to establish the presence of the specific product in each sample. In general, samples were analyzed in duplicate PCR reactions and the average C.sub.t value was used in the expression analysis. However, in the secondary screen triplicate reactions were performed for each individual benign node and the lowest C.sub.t value was used in the calculation of relative expression in order to obtain the highest value of background expression for the sample.
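The two analysis rules described above, namely locating the product T.sub.m from the first derivative of the melting curve and selecting the C.sub.t value to carry forward, can be sketched in Python as follows. This is not the Dissociation Curve Analysis software; the function names are hypothetical and the melting curve is synthetic data generated solely to exercise the sketch.

```python
import math


def melt_peak_tm(temps, fluorescence) -> float:
    """Temperature at which -dF/dT is largest (midpoint of the steepest interval)."""
    best_tm, best_slope = None, float("-inf")
    for i in range(len(temps) - 1):
        slope = -(fluorescence[i + 1] - fluorescence[i]) / (temps[i + 1] - temps[i])
        if slope > best_slope:
            best_slope = slope
            best_tm = (temps[i] + temps[i + 1]) / 2.0
    return best_tm


def summarize_ct(replicate_cts, benign_node: bool = False) -> float:
    """Average replicate Ct values, or take the lowest Ct for benign nodes.

    The lowest Ct corresponds to the highest measured background expression.
    """
    if benign_node:
        return min(replicate_cts)
    return sum(replicate_cts) / len(replicate_cts)


def synthetic_melt_curve(tm_c: float, width_c: float = 0.8):
    """Synthetic sigmoidal melt curve from 60 C to 95 C in 0.5 C steps."""
    temps = [60.0 + 0.5 * i for i in range(71)]
    signal = [1.0 / (1.0 + math.exp((t - tm_c) / width_c)) for t in temps]
    return temps, signal


temps, signal = synthetic_melt_curve(tm_c=85.25)
estimated_tm = melt_peak_tm(temps, signal)
```

Taking the minimum C.sub.t for benign nodes deliberately biases the background estimate upward, which makes any cutoff separating benign from positive nodes more conservative.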

[0101] Cancer tissue-specific studies have been conducted, as described in the Examples below, in which a variety of molecular markers were identified as correlating with pathological states in cancers including esophageal cancer, colon cancer, head and neck cancer and melanoma. Table A identifies genes used in the following studies. Table B provides PCR primer and TAQMAN probe sequences used in the quantitative PCR and RT-PCR amplifications described herein. Table C provides RT primer sequences as used instead of random hexamer primers. All PCR and RT-PCR reactions were conducted using standard methods. For all figures, T=primary tumor; PN=tumor-positive lymph nodes (by histological screening, that is, by review of H&E stained tissue and, when needed, by IHC, as described above); and BN=benign lymph nodes (by histological screening).

TABLE-US-00001 TABLE A
Marker | Accession No./OMIM No.* | Official Gene Symbol | Official Gene Name | Alternative Gene Symbol | Alias
CDX1 | NM_001804/600746 | CDX1 | caudal type homeo box transcription factor 1 | NA | NA
CEA | NM_004363/114890 | CEACAM5 | carcinoembryonic antigen-related cell adhesion molecule 5 | CEA, CD66e | NA
CK19 | NM_002276/148020 | KRT19 | keratin 19 | K19, CK19, K1CS, MGC15366 | cytokeratin 19; keratin, type I, 40-kd; keratin, type I cytoskeletal 19; 40-kDa keratin intermediate filament precursor gene
CK20 | NM_019010/608218 | KRT20 | keratin 20 | K20, CK20, MGC35423 | cytokeratin 20; keratin, type I; cytoskeletal 20
TACSTD1 | NM_002354/185535 | TACSTD1 | tumor-associated calcium signal transducer 1 | EGP, KSA, M4S1, MK-1, KS1/4, EGP40, MIC18, TROP1, Ep-CAM, CO17-1A, GA733-2 | MK-1 antigen; antigen identified by monoclonal antibody AUA1; membrane component, chromosome 4, surface marker (35 kD glycoprotein)
VIL1 | NM_007127/193040 | VIL1 | villin 1 | VIL, D2S1471 | villin-1
CK7 | NM_005556/148059 | KRT7 | keratin 7 | K7, CK7, SCL, K2C7, MGC3625 | Sarcolectin; cytokeratin 7; type II mesothelial keratin K7; keratin, type II cytoskeletal 7; keratin, 55K type II cytoskeletal; keratin, simple epithelial type I, K7
SCCA1 | NM_006919/600517 | SERPINB3 | serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 3 carcinoma antigen 1&2 | SCC, T4-A, SCCA1, SCCA-PD | squamous cell carcinoma antigen 1
SCCA2 | NM_002974/600518 | SERPINB4 | serine (or cysteine) proteinase inhibitor, clade B (ovalbumin), member 4 | PI11, SCCA2, LEUPIN | leupin; squamous cell carcinoma antigen 2; protease inhibitor (leucine-serpin)
PTHrP | NM_002820/168470 | PTHLH | parathyroid hormone-like hormone | PTHRP, PTHR, HHM | parathyroid hormone-related protein; pth-related protein; formerly humoral hypercalcemia of malignancy, included
PVA | NM_001944/169615 | DSG3 | desmoglein 3 (pemphigus vulgaris antigen) | PVA, CDHF6 | pemphigus vulgaris antigen; 130-kD pemphigus vulgaris antigen
MAGEA1 | NM_004988/300016 | MAGEA1 | melanoma antigen, family A, 1 (directs expression of antigen MZ2-E) | MAGE1, MGC9326 | melanoma antigen MAGE-1; melanoma-associated antigen 1; melanoma-associated antigen MZ2-E
MAGEA3 | NM_005362/300174 | MAGEA3 | melanoma antigen, family A, 3 | HIP8, HYPD, MAGE3, MGC14613 | antigen MZ2-D; MAGE-3 antigen; melanoma-associated antigen 3
MAGEA6 | NM_005363/300176 | MAGEA6 | melanoma antigen, family A, 6 | MAGE6, MAGE3B, MAGE-3b, MGC52297 | MAGE-6 antigen; melanoma-associated antigen 6
MART1 | NM_005511/605513 | MLANA | melan-A | MART1, MART-1 | melanoma antigen recognized by t cells 1
TYR | NM_000372/606933 | TYR | tyrosinase (oculocutaneous albinism IA) | OCA1A, OCAIA | Tyrosinase
*Online Mendelian Inheritance in Man (www.ncbi.nlm.nih.gov).

TABLE-US-00002 TABLE B
Gene | Oligonucleotide | Sequence (5'.fwdarw.3') | Sequence Listing Reference
CDX1 | Forward primer | CGGTGGCAGCGGTAAGAC | SEQ ID NO: 1, bases 516 to 533
CDX1 | Reverse primer | GATTGTGATGTAACGGCTGTAATG | SEQ ID NO: 17
CDX1 | Probe | ACCAAGGACAAGTACCGCGTGGTCTACA | SEQ ID NO: 1, bases 538 to 565
CEA | Forward primer | AGACAATCACAGTCTCTGCGGA | SEQ ID NO: 2, bases 1589 to 1610
CEA | Reverse primer | ATCCTTGTCCTCCACGGGTT | SEQ ID NO: 18
CEA | Probe | CAAGCCCTCCATCTCCAGCAACAACT | SEQ ID NO: 2, bases 1617 to 1642
CK19 | Forward primer | AGATCGACAACGCCCGT | SEQ ID NO: 19
CK19 | Reverse primer | AGAGCCTGTTCCGTCTCAAA | SEQ ID NO: 20
CK19 | Probe | TGGCTGCAGATGACTTCCGAACCA | SEQ ID NO: 4, bases 614 to 637
CK20 | Forward primer | CACCTCCCAGAGCCTTGAGAT | SEQ ID NO: 5, bases 915 to 935
CK20 | Reverse primer | GGGCCTTGGTCTCCTCTAGAG | SEQ ID NO: 21
CK20 | Probe | CCATCTCAGCATGAAAGAGTCTTTGGAGCA | SEQ ID NO: 5, bases 948 to 977
CK7 | Forward primer | CCCTCAATGAGACGGAGTTGA | SEQ ID NO: 3, bases 807 to 827
CK7 | Reverse primer | CCAGGGAGCGACTGTTGTC | SEQ ID NO: 22
CK7 | Probe | AGCTGCAGTCCCAGATCTCCGACACATC | SEQ ID NO: 3, bases 831 to 858
MAGEA136_plex.sup.A | Forward primer | GTGAGGAGGCAAGGTTYTSAG | SEQ ID NO: 23
MAGEA136_plex.sup.A | Reverse primer | AGACCCACWGGCAGATCTTCTC | SEQ ID NO: 24
MAGEA136_plex.sup.A | Probe1 | AGGATTCCCTGGAGGCCACAGAGG | SEQ ID NO: 6, bases 80 to 103
MAGEA136_plex.sup.A | Probe2 | ACAGGCTGACCTGGAGGACCAGAGG | SEQ ID NO: 7, bases 90 to 104
MART1 | Forward primer | GATGCTCACTTCATCTATGGTTACC | SEQ ID NO: 9, bases 66 to 90
MART1 | Reverse primer | ACTGTCAGGATGCCGATCC | SEQ ID NO: 25
MART1 | Probe | AGCGGCCTCTTCAGCCGTGGTGT | SEQ ID NO: 26
PTHrP | Forward primer | GCGGTGTTCCTGCTGAGCTA | SEQ ID NO: 10, bases 356 to 375
PTHrP | Reverse primer | TCATGGAGGAGCTGATGTTCAGA | SEQ ID NO: 27
PTHrP | Probe | TCTCAGCCGCCGCCTCAAAAGA | SEQ ID NO: 10, bases 409 to 430
PVA | Forward primer | AAAGAAACCCAATTGCCAAGATTAC | SEQ ID NO: 11, bases 280 to 304
PVA | Reverse primer | CAAAAGGCGGCTGATCGAT | SEQ ID NO: 28
PVA | Probe | CCAAGCAACCCAGAAAATCACCTACCG | SEQ ID NO: 11, bases 314 to 340
SCCA1.2.sup.B | Forward primer | AAGCTGCAACATATCATGTTGATAGG | SEQ ID NO: 12, bases 267 to 292
SCCA1.2.sup.B | Reverse primer | GGCGATCTTCAGCTCATATGC | SEQ ID NO: 29
SCCA1.2.sup.B | Probe | TGTTCATCACCAGTTTCAAAAGCTTCTGACT | SEQ ID NO: 12, bases 301 to 331
TACSTD1 | Forward primer | TCATTTGCTCAAAGCTGGCTG | SEQ ID NO: 14, bases 348 to 368
TACSTD1 | Reverse primer | GGTTTTGCTCTTCTCCCAAGTTT | SEQ ID NO: 30
TACSTD1 | Probe | AAATGTTTGGTGATGAAGGCAGAAATGAATGG | SEQ ID NO: 14, bases 371 to 402
TYR | Forward primer | ACTTACTCAGCCCAGCATCATTC | SEQ ID NO: 15, bases 1284 to 1306
TYR | Reverse primer | ACTGATGGCTGTTGTACTCCTCC | SEQ ID NO: 31
TYR | Probe | TCTCCTCTTGGCAGATTGTCTGTAGCCGA | SEQ ID NO: 15, bases 1308 to 1336
Villin1 | Forward primer | TGGTTCCTGGCTTGGGATC | SEQ ID NO: 16, bases 2152 to 2170
Villin1 | Reverse primer | TTGCCAGACTCCGCCTTC | SEQ ID NO: 32
Villin1 | Probe | TCAAGTGGAGTAACACCAAATCCTATGAGGACC | SEQ ID NO: 16, bases 2174 to 2206
.sup.A A universal primer set designed to recognize transcripts of MAGEA1, MAGEA3 and MAGEA6.
.sup.B A universal primer set designed to recognize transcripts of both SCCA1 and SCCA2.

TABLE-US-00003 TABLE C
Gene Marker | RT Specific Primer Sequence (5'.fwdarw.3') | Sequence Listing Reference
CEA | GTGAAGGCCACAGCAT | SEQ ID NO: 33
CK20 | AACTGGCTGCTGTAACG | SEQ ID NO: 34
MART1 | GCCGATGAGCAGTAAGACT | SEQ ID NO: 35
PVA | TGTCAACAACAAAGATTCCA | SEQ ID NO: 36
SCCA1.2 | TCTCCGAAGAGCTTGTTG | SEQ ID NO: 37
TACSTD1 | AGCCCATCATTGTTCTG | SEQ ID NO: 38
TYR | CGTTCCATTGCATAAAG | SEQ ID NO: 40
VIL1 | GCTCCAGTCCCTAAGG | SEQ ID NO: 41
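As an illustrative cross-check of the sequence listing references in Table B, the following Python sketch locates the CDX1 primers and probe within bases 481 to 660 of SEQ ID NO: 1, transcribed from the sequence listing, and expands the degenerate bases in the MAGEA136_plex primers using the standard IUPAC codes (Y = C or T, S = C or G, W = A or T). The function and variable names are hypothetical and form no part of the disclosed methods.

```python
from itertools import product

# Bases 481-660 of SEQ ID NO: 1 (CDX1), copied from the sequence listing.
SEQ1_OFFSET = 481
SEQ1_FRAGMENT = "".join("""
gagtggatgc ggcgcagcgt ggcggccgga ggcggcggtg gcagcggtaa gactcggacc
aaggacaagt accgcgtggt ctacaccgac caccaacgcc tggagctgga gaaggagttt
cattacagcc gttacatcac aatccggcgg aaatcagagc tggctgccaa tctggggctc
""".split()).upper()

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "S": "CG", "W": "AT"}


def bases(first: int, last: int) -> str:
    """Return bases first..last (1-based, inclusive) of SEQ ID NO: 1."""
    return SEQ1_FRAGMENT[first - SEQ1_OFFSET:last - SEQ1_OFFSET + 1]


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def expand_degenerate(primer: str):
    """List every concrete sequence encoded by a degenerate primer."""
    return ["".join(combo) for combo in product(*(IUPAC[b] for b in primer))]


CDX1_FORWARD = "CGGTGGCAGCGGTAAGAC"
CDX1_REVERSE = "GATTGTGATGTAACGGCTGTAATG"
CDX1_PROBE = "ACCAAGGACAAGTACCGCGTGGTCTACA"

# The reverse primer anneals to the sense strand as its reverse complement.
reverse_site = reverse_complement(CDX1_REVERSE)
reverse_start = SEQ1_FRAGMENT.find(reverse_site) + SEQ1_OFFSET
reverse_end = reverse_start + len(CDX1_REVERSE) - 1
amplicon_length = reverse_end - 516 + 1

magea_forward_variants = expand_degenerate("GTGAGGAGGCAAGGTTYTSAG")
magea_reverse_variants = expand_degenerate("AGACCCACWGGCAGATCTTCTC")
```

On this check the CDX1 forward primer and probe fall at the positions stated in Table B, the reverse primer anneals at bases 601 to 624, and the resulting amplicon spans 109 bases.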

Example 2

Esophageal Cancer

[0102] Expression levels of CEA, CK7, CK19, CK20, TACSTD1 and VIL1 were determined by the methods described in Example 1. FIG. 17 is a scatter plot showing the expression levels of CEA, CK7, CK19, CK20, TACSTD1 and VIL1 in primary tumor, tumor-positive lymph nodes and benign lymph nodes. FIGS. 18A-O provide scatter plots illustrating the ability of two-marker systems to distinguish between benign and malignant cells in a lymph node. Tables D and E provide the raw data from which the graphs of FIGS. 17 and 18A-O were generated. This data illustrates the strong correlation of expression of CEA, CK7, CK19, CK20, TACSTD1 and VIL1 markers, alone or in combination, in sentinel lymph nodes with the presence of malignant cells arising from an esophageal cancer in the sentinel lymph nodes.

TABLE-US-00004 TABLE D Single Marker Prediction Characteristics for Esophageal Cancer
Marker | Observed Sensitivity | Observed Specificity | Observed AUC | Observed Classification Accuracy | Bootstrap* Sensitivity | Bootstrap* Specificity | Bootstrap* Classification Accuracy
CEA | .95 | .95 | .98 | .95 | .93 | .93 | .93
CK7 | .95 | .86 | .94 | .90 | .82 | .89 | .85
CK19 | 1.0 | 1.0 | 1.0 | 1.0 | .99 | .94 | .97
CK20 | 1.0 | .95 | .995 | .98 | .98 | .92 | .95
TACSTD1 | 1.0 | 1.0 | 1.0 | 1.0 | .96 | .99 | .98
Villin1 | .95 | .95 | .98 | .95 | .92 | .93 | .92
optimism = .02-.05
*Parametric Bootstrap Estimates: 1000 parametric bootstrap samples of lymph node expression levels were generated and a new decision rule based on the most accurate cutoff was formulated each time (total of 1000 decision rules). The bootstrap estimates are the average prediction properties from classifying the original 41 lymph nodes 1000 times.
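The observed-data columns of Table D can be related to node counts with the following Python sketch. It assumes 20 tumor-positive and 21 benign nodes, which is inferred from the 41 lymph nodes and 21 benign nodes mentioned in this example and in Example 1; the specific counts of correct and incorrect calls are merely one set consistent with the rounded table values, and the expression values passed to the cutoff search in practice would be measured data rather than anything shown here.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, classification accuracy)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


def best_cutoff(positive_values, benign_values):
    """Most accurate single cutoff; a node is called positive when its
    relative expression is greater than or equal to the cutoff."""
    candidates = sorted(set(positive_values) | set(benign_values))
    candidates.append(float("inf"))  # allows "call every node benign"
    total = len(positive_values) + len(benign_values)
    best = None
    for cutoff in candidates:
        tp = sum(v >= cutoff for v in positive_values)
        tn = sum(v < cutoff for v in benign_values)
        accuracy = (tp + tn) / total
        if best is None or accuracy > best[1]:
            best = (cutoff, accuracy)
    return best


# Assumed counts: 19 of 20 positive nodes and 20 of 21 benign nodes correct.
cea_like = tuple(round(x, 2) for x in classification_metrics(19, 1, 20, 1))
# Assumed counts: 19 of 20 positive nodes and 18 of 21 benign nodes correct.
ck7_like = tuple(round(x, 2) for x in classification_metrics(19, 1, 18, 3))
```

With these assumed counts the sketch returns .95/.95/.95 and .95/.86/.90, matching the observed-data entries for CEA and CK7 respectively.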

TABLE-US-00005 TABLE E Two Marker Prediction Characteristics for Esophageal Cancer
Marker Pair | Observed Sensitivity | Observed Specificity | Observed Classification Accuracy | Bootstrap* Sensitivity | Bootstrap* Specificity | Bootstrap* Classification Accuracy
CEA + CK7 | .95 | 1.0 | .98 | .93 | .99 | .96
CEA + CK19 | .95 | 1.0 | .98 | .97 | .99 | .98
CEA + CK20 | .95 | 1.0 | .98 | .97 | .99 | .97
CEA + TACSTD1 | 1.0 | 1.0 | 1.0 | .99 | 1.0 | .99
CEA + Villin1 | .95 | 1.0 | .98 | .95 | 1.0 | .98
CK7 + CK19 | 1.0 | 1.0 | 1.0 | .99 | .99 | .99
CK7 + CK20 | .95 | 1.0 | .98 | .93 | .99 | .97
CK7 + TACSTD1 | 1.0 | 1.0 | 1.0 | .99 | 1.0 | .99
CK7 + Villin1 | .95 | 1.0 | .98 | .95 | .99 | .98
CK19 + CK20 | .95 | 1.0 | .98 | .97 | .99 | .98
CK19 + TACSTD1 | 1.0 | 1.0 | 1.0 | .99 | 1.0 | .99
CK19 + Villin1 | 1.0 | 1.0 | 1.0 | .99 | .99 | .99
CK20 + TACSTD1 | 1.0 | 1.0 | 1.0 | .99 | 1.0 | .99
CK20 + Villin1 | .95 | 1.0 | .98 | .94 | 1.0 | .97
TACSTD1 + Villin1 | 1.0 | 1.0 | 1.0 | .99 | 1.0 | .99
*Parametric Bootstrap Estimates: 1000 parametric bootstrap samples of 41 lymph node marker pair expression levels were generated. For each new sample a new decision rule was devised to split the region into 2 zones of equal prediction probability (see methods) (total of 1000 decision rules). The bootstrap estimates are the average prediction properties from classifying the original 41 lymph nodes 1000 times.

Example 3

Head and Neck Cancer

[0103] FIG. 19 is a scatter plot showing the expression levels of CEA, CK19, PTHrP, PVA, SCCA1.2 and TACSTD1 in primary tumor, tumor-positive lymph nodes and benign lymph nodes. FIGS. 20A-F provide scatter plots illustrating the ability of two-marker systems to distinguish between benign and malignant cells in a lymph node. Tables F and G provide the raw data from which the graphs of FIGS. 19 and 20A-F were generated. This data illustrates the strong correlation between expression of CEA, CK19, PTHrP, PVA, SCCA1.2 and TACSTD1 markers, alone or in combination, in sentinel lymph nodes and the presence of malignant cells arising from a squamous cell carcinoma of the head and neck in the sentinel lymph nodes.

TABLE-US-00006 TABLE F Single Marker Prediction Characteristics - Head and Neck Cancer
Marker | Observed Sensitivity | Observed Specificity | Observed AUC | Observed Classification Accuracy | Bootstrap* Sensitivity | Bootstrap* Specificity | Bootstrap* Classification Accuracy | bias**
CEA | 1.0 | .905 | .990 | .950 | .974 | .880 | .872 | .078
CK19 | .895 | .905 | .917 | .900 | .867 | .880 | .872 | .028
EGFR | .895 | 1.0 | .945 | .947 | .873 | .979 | .925 | .022
PTHrP | .947 | 1.0 | .990 | .975 | .938 | .988 | .963 | .012
PVA | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | .000
SCCA1.2 | 1.0 | 1.0 | 1.0 | 1.0 | .998 | .985 | .991 | .009
TACSTD1 | 1.0 | .952 | .997 | .975 | .983 | .944 | .962 | .013
*Non Parametric Bootstrap Estimates: 500 bootstrap samples of lymph node expression levels were generated and a new decision rule based on the most accurate cutoff was formulated each time (total of 500 decision rules). The optimism for each bootstrap sample is calculated as the difference between the classification statistic applied to the original data and applied to the bootstrap data. The average over all bootstrap samples is computed and reported as the bias in the values derived from the observed data (Efron's enhanced bootstrap prediction error estimate, see Efron and Tibshirani, An Introduction to the Bootstrap, Chapman and Hall/CRC Press, Boca Raton, 1993).
**bias = enhanced bootstrap estimate of optimism, or the amount that classification accuracy is overestimated when tested on the original data.

TABLE-US-00007 TABLE G Two Marker Prediction Characteristics for Head & Neck Cancer
Marker Pair | Observed Sensitivity | Observed Specificity | Observed Classification Accuracy | Bootstrap Sensitivity | Bootstrap Specificity | Bootstrap Classification Accuracy | Bias**
PVA + TACSTD1 | 1.0 | 1.0 | 1.0 | .993 | 1.0 | .997 | .003
PVA + PTHrP | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | .000
PVA + SCCA1.2 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | .000
TACSTD1 + PTHrP | .947 | 1.0 | .975 | .944 | 1.0 | .974 | .001
TACSTD1 + SCCA1.2 | 1.0 | 1.0 | 1.0 | .984 | 1.0 | .992 | .008
PTHrP + SCCA1.2 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | .000
Non Parametric Bootstrap Estimates: 500 bootstrap samples of lymph node expression levels were generated and a new decision rule based on the most accurate cutoff was formulated each time (total of 500 decision rules). The optimism for each bootstrap sample is calculated as the difference between the classification statistic applied to the original data and applied to the bootstrap data. The average over all bootstrap samples is computed and reported as the bias in the values derived from the observed data (Efron's enhanced bootstrap prediction error estimate, see Efron and Tibshirani, An Introduction to the Bootstrap, Chapman and Hall/CRC Press, Boca Raton, 1993).
**bias = enhanced bootstrap estimate of optimism, or the amount that classification accuracy is overestimated when tested on the original data.
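The bootstrap bias estimate described in the footnotes to Tables F and G can be sketched, for a single marker, in Python as follows. This is a minimal sketch of a standard optimism bootstrap under stated assumptions, not the code used to produce the tables: it assumes a one-sided cutoff rule (a node is called positive when expression is at or above the cutoff), resamples nodes with replacement, and uses synthetic expression values. All names are hypothetical.

```python
import random


def accuracy(cutoff, positives, benigns) -> float:
    """Fraction of nodes classified correctly by the rule value >= cutoff."""
    correct = sum(v >= cutoff for v in positives) + sum(v < cutoff for v in benigns)
    return correct / (len(positives) + len(benigns))


def most_accurate_cutoff(positives, benigns) -> float:
    """Cutoff with the highest accuracy on the supplied nodes."""
    candidates = sorted(set(positives) | set(benigns)) + [float("inf")]
    return max(candidates, key=lambda c: accuracy(c, positives, benigns))


def optimism_corrected_accuracy(positives, benigns, n_boot=500, seed=0):
    """Return (apparent accuracy, bias, bias-corrected accuracy).

    For each bootstrap sample a new cutoff is derived; its accuracy on the
    bootstrap sample minus its accuracy on the original nodes is that
    sample's optimism. The bias is the average optimism.
    """
    rng = random.Random(seed)
    labeled = [(v, True) for v in positives] + [(v, False) for v in benigns]
    apparent = accuracy(most_accurate_cutoff(positives, benigns), positives, benigns)
    total_optimism = 0.0
    for _ in range(n_boot):
        sample = [rng.choice(labeled) for _ in labeled]
        boot_pos = [v for v, is_pos in sample if is_pos]
        boot_ben = [v for v, is_pos in sample if not is_pos]
        cutoff = most_accurate_cutoff(boot_pos, boot_ben)
        total_optimism += (accuracy(cutoff, boot_pos, boot_ben)
                           - accuracy(cutoff, positives, benigns))
    bias = total_optimism / n_boot
    return apparent, bias, apparent - bias


# Synthetic relative expression values, for illustration only.
positive_nodes = [8.1, 7.4, 9.0, 6.8, 7.9]
benign_nodes = [2.0, 1.5, 3.1, 2.7, 0.9]
apparent, bias, corrected = optimism_corrected_accuracy(positive_nodes, benign_nodes)
```

Subtracting the bias from the accuracy observed on the original data gives an estimate of how the cutoff rule would be expected to perform on nodes that were not used to choose it.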

Example 4

Melanoma

[0104] FIG. 21 is a scatter plot showing the expression levels of MART1, TYR and MAGEA136-plex in primary tumor, tumor-positive lymph nodes and benign lymph nodes. FIGS. 22A and 22B provide scatter plots illustrating the ability of two-marker systems to distinguish between benign and malignant cells in a lymph node. This data illustrates the strong correlation between expression of MART1, TYR and MAGEA136-plex markers, alone or in combination, in sentinel lymph nodes and the presence of malignant cells arising from melanoma in the sentinel lymph nodes.

Example 5

Colon Cancer

[0105] FIG. 23 is a scatter plot showing the expression levels of CDX1, CEA, CK19, CK20, TACSTD1 and VIL1 in primary tumor, tumor-positive lymph nodes and benign lymph nodes. This data illustrates the strong correlation between expression of CDX1, CEA, CK19, CK20, TACSTD1 and VIL1 markers in sentinel lymph nodes and the presence of malignant cells arising from colon cancer in the sentinel lymph nodes.

Sequence CWU 1

1

4011699DNAHomo sapiens 1aggtgagcgg ttgctcgtcg tcggggcggc cggcagcggc ggctccaggg cccagcatgc 60gcgggggacc ccgcggccac catgtatgtg ggctatgtgc tggacaagga ttcgcccgtg 120taccccggcc cagccaggcc agccagcctc ggcctgggcc cggcaaacta cggccccccg 180gccccgcccc cggcgccccc gcagtacccc gacttctcca gctactctca cgtggagccg 240gcccccgcgc ccccgacggc ctggggggcg cccttccctg cgcccaagga cgactgggcc 300gccgcctacg gcccgggccc cgcggcccct gccgccagcc cagcttcgct ggcattcggg 360ccccctccag actttagccc ggtgccggcg ccccctgggc ccggcccggg cctcctggcg 420cagcccctcg ggggcccggg cacaccgtcc tcgcccggag cgcagaggcc gacgccctac 480gagtggatgc ggcgcagcgt ggcggccgga ggcggcggtg gcagcggtaa gactcggacc 540aaggacaagt accgcgtggt ctacaccgac caccaacgcc tggagctgga gaaggagttt 600cattacagcc gttacatcac aatccggcgg aaatcagagc tggctgccaa tctggggctc 660actgaacggc aggtgaagat ctggttccaa aaccggcggg caaaggagcg caaagtgaac 720aagaagaaac agcagcagca acagccccca cagccgccga tggcccacga catcacggcc 780accccagccg ggccatccct ggggggcctg tgtcccagca acaccagcct cctggccacc 840tcctctccaa tgcctgtgaa agaggagttt ctgccatagc cccatgccca gcctgtgcgc 900cgggggacct ggggactcgg gtgctgggag tgtggctcct gtgggcccag gaggtctggt 960ccgagtctca gccctgacct tctgggacat ggtggacagt cacctatcca ccctctgcat 1020ccccttggcc cattgtgtgc agtaagcctg ttggataaag accttccagc tcctgtgttc 1080tagacctctg ggggataagg gagtccaggg tggatgatct caatctcccg tgggcatctc 1140aagccccaaa tggttggggg aggggcctag acaaggctcc aggccccacc tcctcctcca 1200tacgttcaga ggtgcagctg gaggcctgtg tggggaccac actgatcctg gagaaaaggg 1260atggagctga aaaagatgga atgcttgcag agcatgacct gaggagggag gaacgtggtc 1320aactcacacc tgcctcttct gcagcctcac ctctacctgc ccccatcata agggcactga 1380gcccttccca ggctggatac taagcacaaa gcccatagca ctgggctctg atggctgctc 1440cactgggtta cagaatcaca gccctcatga tcattctcag tgagggctct ggattgagag 1500ggaggccctg ggaggagaga agggggcaga gtcttcccta ccaggtttct acacccccgc 1560caggctgccc atcagggccc agggagcccc cagaggactt tattcggacc aagcagagct 1620cacagctgga caggtgttgt atatagagtg gaatctcttg gatgcagctt caagaataaa 1680tttttcttct cttttcaaa 
169922974DNAHomo sapiens 2ctcagggcag agggaggaag gacagcagac cagacagtca cagcagcctt gacaaaacgt 60tcctggaact caagctcttc tccacagagg aggacagagc agacagcaga gaccatggag 120tctccctcgg cccctcccca cagatggtgc atcccctggc agaggctcct gctcacagcc 180tcacttctaa ccttctggaa cccgcccacc actgccaagc tcactattga atccacgccg 240ttcaatgtcg cagaggggaa ggaggtgctt ctacttgtcc acaatctgcc ccagcatctt 300tttggctaca gctggtacaa aggtgaaaga gtggatggca accgtcaaat tataggatat 360gtaataggaa ctcaacaagc taccccaggg cccgcataca gtggtcgaga gataatatac 420cccaatgcat ccctgctgat ccagaacatc atccagaatg acacaggatt ctacacccta 480cacgtcataa agtcagatct tgtgaatgaa gaagcaactg gccagttccg ggtatacccg 540gagctgccca agccctccat ctccagcaac aactccaaac ccgtggagga caaggatgct 600gtggccttca cctgtgaacc tgagactcag gacgcaacct acctgtggtg ggtaaacaat 660cagagcctcc cggtcagtcc caggctgcag ctgtccaatg gcaacaggac cctcactcta 720ttcaatgtca caagaaatga cacagcaagc tacaaatgtg aaacccagaa cccagtgagt 780gccaggcgca gtgattcagt catcctgaat gtcctctatg gcccggatgc ccccaccatt 840tcccctctaa acacatctta cagatcaggg gaaaatctga acctctcctg ccacgcagcc 900tctaacccac ctgcacagta ctcttggttt gtcaatggga ctttccagca atccacccaa 960gagctcttta tccccaacat cactgtgaat aatagtggat cctatacgtg ccaagcccat 1020aactcagaca ctggcctcaa taggaccaca gtcacgacga tcacagtcta tgcagagcca 1080cccaaaccct tcatcaccag caacaactcc aaccccgtgg aggatgagga tgctgtagcc 1140ttaacctgtg aacctgagat tcagaacaca acctacctgt ggtgggtaaa taatcagagc 1200ctcccggtca gtcccaggct gcagctgtcc aatgacaaca ggaccctcac tctactcagt 1260gtcacaagga atgatgtagg accctatgag tgtggaatcc agaacgaatt aagtgttgac 1320cacagcgacc cagtcatcct gaatgtcctc tatggcccag acgaccccac catttccccc 1380tcatacacct attaccgtcc aggggtgaac ctcagcctct cctgccatgc agcctctaac 1440ccacctgcac agtattcttg gctgattgat gggaacatcc agcaacacac acaagagctc 1500tttatctcca acatcactga gaagaacagc ggactctata cctgccaggc caataactca 1560gccagtggcc acagcaggac tacagtcaag acaatcacag tctctgcgga gctgcccaag 1620ccctccatct ccagcaacaa ctccaaaccc gtggaggaca aggatgctgt ggccttcacc 1680tgtgaacctg aggctcagaa 
cacaacctac ctgtggtggg taaatggtca gagcctccca 1740gtcagtccca ggctgcagct gtccaatggc aacaggaccc tcactctatt caatgtcaca 1800agaaatgacg caagagccta tgtatgtgga atccagaact cagtgagtgc aaaccgcagt 1860gacccagtca ccctggatgt cctctatggg ccggacaccc ccatcatttc ccccccagac 1920tcgtcttacc tttcgggagc gaacctcaac ctctcctgcc actcggcctc taacccatcc 1980ccgcagtatt cttggcgtat caatgggata ccgcagcaac acacacaagt tctctttatc 2040gccaaaatca cgccaaataa taacgggacc tatgcctgtt ttgtctctaa cttggctact 2100ggccgcaata attccatagt caagagcatc acagtctctg catctggaac ttctcctggt 2160ctctcagctg gggccactgt cggcatcatg attggagtgc tggttggggt tgctctgata 2220tagcagccct ggtgtagttt cttcatttca ggaagactga cagttgtttt gcttcttcct 2280taaagcattt gcaacagcta cagtctaaaa ttgcttcttt accaaggata tttacagaaa 2340agactctgac cagagatcga gaccatccta gccaacatcg tgaaacccca tctctactaa 2400aaatacaaaa atgagctggg cttggtggcg cgcacctgta gtcccagtta ctcgggaggc 2460tgaggcagga gaatcgcttg aacccgggag gtggagattg cagtgagccc agatcgcacc 2520actgcactcc agtctggcaa cagagcaaga ctccatctca aaaagaaaag aaaagaagac 2580tctgacctgt actcttgaat acaagtttct gataccactg cactgtctga gaatttccaa 2640aactttaatg aactaactga cagcttcatg aaactgtcca ccaagatcaa gcagagaaaa 2700taattaattt catgggacta aatgaactaa tgaggattgc tgattcttta aatgtcttgt 2760ttcccagatt tcaggaaact ttttttcttt taagctatcc actcttacag caatttgata 2820aaatatactt ttgtgaacaa aaattgagac atttacattt tctccctatg tggtcgctcc 2880agacttggga aactattcat gaatatttat attgtatggt aatatagtta ttgcacaagt 2940tcaataaaaa tctgctcttt gtataacaga aaaa 297431753DNAHomo sapiens 3cagccccgcc cctacctgtg gaagcccagc cgcccgctcc cgcggataaa aggtgcggag 60tgtccccgag gtcagcgagt gcgcgctcct cctcgcccgc cgctaggtcc atcccggccc 120agccaccatg tccatccact tcagctcccc ggtattcacc tcgcgctcag ccgccttctc 180gggccgcggc gcccaggtgc gcctgagctc cgctcgcccc ggcggccttg gcagcagcag 240cctctacggc ctcggcgcct cgcggccgcg cgtggccgtg cgctctgcct atgggggccc 300ggtgggcgcc ggcatccgcg aggtcaccat taaccagagc ctgctggccc cgctgcggct 360ggacgccgac ccctccctcc agcgggtgcg ccaggaggag agcgagcaga tcaagaccct 
420caacaacaag tttgcctcct tcatcgacaa ggtgcggttt ctggagcagc agaacaagct 480gctggagacc aagtggacgc tgctgcagga gcagaagtcg gccaagagca gccgcctccc 540agacatcttt gaggcccaga ttgctggcct tcggggtcag cttgaggcac tgcaggtgga 600tgggggccgc ctggaggcgg agctgcggag catgcaggat gtggtggagg acttcaagaa 660taagtacgaa gatgaaatta accgccgcac agctgctgag aatgagtttg tggtgctgaa 720gaaggatgtg gatgctgcct acatgagcaa ggtggagctg gaggccaagg tggatgccct 780gaatgatgag atcaacttcc tcaggaccct caatgagacg gagttgacag agctgcagtc 840ccagatctcc gacacatctg tggtgctgtc catggacaac agtcgctccc tggacctgga 900cggcatcatc gctgaggtca aggcacagta tgaggagatg gccaaatgca gccgggctga 960ggctgaagcc tggtaccaga ccaagtttga gaccctccag gcccaggctg ggaagcatgg 1020ggacgacctc cggaataccc ggaatgagat ttcagagatg aaccgggcca tccagaggct 1080gcaggctgag atcgacaaca tcaagaacca gcgtgccaag ttggaggccg ccattgccga 1140ggctgaggag cgtggggagc tggcgctcaa ggatgctcgt gccaagcagg aggagctgga 1200agccgccctg cagcgggcca agcaggatat ggcacggcag ctgcgtgagt accaggaact 1260catgagcgtg aagctggccc tggacatcga gatcgccacc taccgcaagc tgctggaggg 1320cgaggagagc cggttggctg gagatggagt gggagccgtg aatatctctg tgatgaattc 1380cactggtggc agtagcagtg gcggtggcat tgggctgacc ctcgggggaa ccatgggcag 1440caatgccctg agcttctcca gcagtgcggg tcctgggctc ctgaaggctt attccatccg 1500gaccgcatcc gccagtcgca ggagtgcccg cgactgagcc gcctcccacc actccactcc 1560tccagccacc acccacaatc acaagaagat tcccacccct gcctcccatg cctggtccca 1620agacagtgag acagtctgga aagtgatgtc agaatagctt ccaataaagc agcctcattc 1680tgaggcctga gtgatccacg tgaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1740aaaaaaaaaa aaa 175341440DNAHomo sapiens 4cgcccctgac accattcctc ccttcccccc tccaccggcc gcgggcataa aaggcgccag 60gtgagggcct cgccgctcct cccgcgaatc gcagcttctg agaccagggt tgctccgtcc 120gtgctccgcc tcgccatgac ttcctacagc tatcgccagt cgtcggccac gtcgtccttc 180ggaggcctgg gcggcggctc cgtgcgtttt gggccggggg tcgcctttcg cgcgcccagc 240attcacgggg gctccggcgg ccgcggcgta tccgtgtcct ccgcccgctt tgtgtcctcg 300tcctcctcgg gggcctacgg cggcggctac ggcggcgtcc tgaccgcgtc cgacgggctg 360ctggcgggca 
acgagaagct aaccatgcag aacctcaacg accgcctggc ctcctacctg 420gacaaggtgc gcgccctgga ggcggccaac ggcgagctag aggtgaagat ccgcgactgg 480taccagaagc aggggcctgg gccctcccgc gactacagcc actactacac gaccatccag 540gacctgcggg acaagattct tggtgccacc attgagaact ccaggattgt cctgcagatc 600gacaatgccc gtctggctgc agatgacttc cgaaccaagt ttgagacgga acaggctctg 660cgcatgagcg tggaggccga catcaacggc ctgcgcaggg tgctggatga gctgaccctg 720gccaggaccg acctggagat gcagatcgaa ggcctgaagg aagagctggc ctacctgaag 780aagaaccatg aggaggaaat cagtacgctg aggggccaag tgggaggcca ggtcagtgtg 840gaggtggatt ccgctccggg caccgatctc gccaagatcc tgagtgacat gcgaagccaa 900tatgaggtca tggccgagca gaaccggaag gatgctgaag cctggttcac cagccggact 960gaagaattga accgggaggt cgctggccac acggagcagc tccagatgag caggtccgag 1020gttactgacc tgcggcgcac ccttcagggt cttgagattg agctgcagtc acagctgagc 1080atgaaagctg ccttggaaga cacactggca gaaacggagg cgcgctttgg agcccagctg 1140gcgcatatcc aggcgctgat cagcggtatt gaagcccagc tgggcgatgt gcgagctgat 1200agtgagcggc agaatcagga gtaccagcgg ctcatggaca tcaagtcgcg gctggagcag 1260gagattgcca cctaccgcag cctgctcgag ggacaggaag atcactacaa caatttgtct 1320gcctccaagg tcctctgagg cagcaggctc tggggcttct gctgtccttt ggagggtgtc 1380ttctgggtag agggatggga aggaagggac ccttaccccc ggctcttctc ctgacctgcc 144051817DNAHomo sapiens 5caaccatcct gaagctacag gtgctccctc ctggaatctc caatggattt cagtcgcaga 60agcttccaca gaagcctgag ctcctccttg caggcccctg tagtcagtac agtgggcatg 120cagcgcctcg ggacgacacc cagcgtttat gggggtgctg gaggccgggg catccgcatc 180tccaactcca gacacacggt gaactatggg agcgatctca caggcggcgg ggacctgttt 240gttggcaatg agaaaatggc catgcagaac ctaaatgacc gtctagcgag ctacctagaa 300aaggtgcgga ccctggagca gtccaactcc aaacttgaag tgcaaatcaa gcagtggtac 360gaaaccaacg ccccgagggc tggtcgcgac tacagtgcat attacagaca aattgaagag 420ctgcgaagtc agattaagga tgctcaactg caaaatgctc ggtgtgtcct gcaaattgat 480aatgctaaac tggctgctga ggacttcaga ctgaagtatg agactgagag aggaatacgt 540ctaacagtgg aagctgatct ccaaggcctg aataaggtct ttgatgacct aaccctacat 600aaaacagatt tggagattca aattgaagaa ctgaataaag 
acctagctct cctcaaaaag 660gagcatcagg aggaagtcga tggcctacac aagcatctgg gcaacactgt caatgtggag 720gttgatgctg ctccaggcct gaaccttggc gtcatcatga atgaaatgag gcagaagtat 780gaagtcatgg cccagaagaa ccttcaagag gccaaagaac agtttgagag acagactgca 840gttctgcagc aacaggtcac agtgaatact gaagaattaa aaggaactga ggttcaacta 900acggagctga gacgcacctc ccagagcctt gagatagaac tccagtccca tctcagcatg 960aaagagtctt tggagcacac tctagaggag accaaggccc gttacagcag ccagttagcc 1020aacctccagt cgctgttgag ctctctggag gcccaactga tgcagattcg gagtaacatg 1080gaacgccaga acaacgaata ccatatcctt cttgacataa agactcgact tgaacaggaa 1140attgctactt accgccgcct tctggaagga gaagacgtaa aaactacaga atatcagtta 1200agcaccctgg aagagagaga tataaagaaa accaggaaga ttaagacagt cgtgcaagaa 1260gtagtggatg gcaaggtcgt gtcatctgaa gtcaaagagg tggaagaaaa tatctaaata 1320gctaccagaa ggagatgctg ctgaggtttt gaaagaaatt tggctataat cttatctttg 1380ctccctgcaa gaaatcagcc ataagaaagc actattaata ctctgcagtg attagaaggg 1440gtggggtggc gggaatccta tttatcagac tctgtaattg aatataaatg ttttactcag 1500aggagctgca aattgcctgc aaaaatgaaa tccagtgagc actagaatat ttaaaacatc 1560attactgcca tctttatcat gaagcacatc aattacaagc tgtagaccac ctaatatcaa 1620tttgtaggta atgttcctga aaattgcaat acatttcaat tatactaaac ctcacaaagt 1680agaggaatcc atgtaaattg caaataaacc actttctaat tttttcctgt ttctgaaaaa 1740aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1800aaaaaaaaaa aaaaaaa 181761722DNAHomo sapiens 6cgtagagttc ggccgaagga acctgaccca ggctctgtga ggaggcaagg ttttcagggg 60acaggccaac ccagaggaca ggattccctg gaggccacag aggagcacca aggagaagat 120ctgcctgtgg gtcttcattg cccagctcct gcccacactc ctgcctgctg ccctgacgag 180agtcatcatg tctcttgagc agaggagtct gcactgcaag cctgaggaag cccttgaggc 240ccaacaagag gccctgggcc tggtgtgtgt gcaggctgcc gcctcctcct cctctcctct 300ggtcctgggc accctggagg aggtgcccac tgctgggtca acagatcctc cccagagtcc 360tcagggagcc tccgcctttc ccactaccat caacttcact cgacagaggc aacccagtga 420gggttccagc agccgtgaag aggaggggcc aagcacctct tgtatcctgg agtccttgtt 480ccgagcagta atcactaaga aggtggctga tttggttggt 
tttctgctcc tcaaatatcg 540agccagggag ccagtcacaa aggcagaaat gctggagagt gtcatcaaaa attacaagca 600ctgttttcct gagatcttcg gcaaagcctc tgagtccttg cagctggtct ttggcattga 660cgtgaaggaa gcagacccca ccggccactc ctatgtcctt gtcacctgcc taggtctctc 720ctatgatggc ctgctgggtg ataatcagat catgcccaag acaggcttcc tgataattgt 780cctggtcatg attgcaatgg agggcggcca tgctcctgag gaggaaatct gggaggagct 840gagtgtgatg gaggtgtatg atgggaggga gcacagtgcc tatggggagc ccaggaagct 900gctcacccaa gatttggtgc aggaaaagta cctggagtac cggcaggtgc cggacagtga 960tcccgcacgc tatgagttcc tgtggggtcc aagggccctt gctgaaacca gctatgtgaa 1020agtccttgag tatgtgatca aggtcagtgc aagagttcgc tttttcttcc catccctgcg 1080tgaagcagct ttgagagagg aggaagaggg agtctgagca tgagttgcag ccagggccag 1140tgggaggggg actgggccag tgcaccttcc agggccgcgt ccagcagctt cccctgcctc 1200gtgtgacatg aggcccattc ttcactctga agagagcggt cagtgttctc agtagtaggt 1260ttctgttcta ttgggtgact tggagattta tctttgttct cttttggaat tgttcaaatg 1320ttttttttta agggatggtt gaatgaactt cagcatccaa gtttatgaat gacagcagtc 1380acacagttct gtgtatatag tttaagggta agagtcttgt gttttattca gattgggaaa 1440tccattctat tttgtgaatt gggataataa cagcagtgga ataagtactt agaaatgtga 1500aaaatgagca gtaaaataga tgagataaag aactaaagaa attaagagat agtcaattct 1560tgctttatac ctcagtctat tctgtaaaat ttttaaagat atatgcatac ctggatttcc 1620ttggcttctt tgagaatgta agagaaatta aatctgaata aagaattctt cctgttaaaa 1680aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 172271753DNAHomo sapiens 7gagattctcg ccctgagcaa cgagcgacgg cctgacgtcg gcggagggaa gccggcccag 60gctcggtgag gaggcaaggt tctgagggga caggctgacc tggaggacca gaggcccccg 120gaggagcact gaaggagaag atctgccagt gggtctccat tgcccagctc ctgcccacac 180tcccgcctgt tgccctgacc agagtcatca tgcctcttga gcagaggagt cagcactgca 240agcctgaaga aggccttgag gcccgaggag aggccctggg cctggtgggt gcgcaggctc 300ctgctactga ggagcaggag gctgcctcct cctcttctac tctagttgaa gtcaccctgg 360gggaggtgcc tgctgccgag tcaccagatc ctccccagag tcctcaggga gcctccagcc 420tccccactac catgaactac cctctctgga gccaatccta tgaggactcc agcaaccaag 480aagaggaggg gccaagcacc 
ttccctgacc tggagtccga gttccaagca gcactcagta 540ggaaggtggc cgagttggtt cattttctgc tcctcaagta tcgagccagg gagccggtca 600caaaggcaga aatgctgggg agtgtcgtcg gaaattggca gtatttcttt cctgtgatct 660tcagcaaagc ttccagttcc ttgcagctgg tctttggcat cgagctgatg gaagtggacc 720ccatcggcca cttgtacatc tttgccacct gcctgggcct ctcctacgat ggcctgctgg 780gtgacaatca gatcatgccc aaggcaggcc tcctgataat cgtcctggcc ataatcgcaa 840gagagggcga ctgtgcccct gaggagaaaa tctgggagga gctgagtgtg ttagaggtgt 900ttgaggggag ggaagacagt atcttggggg atcccaagaa gctgctcacc caacatttcg 960tgcaggaaaa ctacctggag taccggcagg tccccggcag tgatcctgca tgttatgaat 1020tcctgtgggg tccaagggcc ctcgttgaaa ccagctatgt gaaagtcctg caccatatgg 1080taaagatcag tggaggacct cacatttcct acccacccct gcatgagtgg gttttgagag 1140agggggaaga gtgagtctga gcacgagttg cagccagggc cagtgggagg gggtctgggc 1200cagtgcacct tccggggccg catcccttag tttccactgc ctcctgtgac gtgaggccca 1260ttcttcactc tttgaagcga gcagtcagca ttcttagtag tgggtttctg ttctgttgga 1320tgactttgag attattcttt gtttcctgtt ggagttgttc aaatgttcct tttaacggat 1380ggttgaatga gcgtcagcat ccaggtttat gaatgacagt agtcacacat agtgctgttt 1440atatagttta ggagtaagag tcttgttttt tactcaaatt gggaaatcca ttccattttg 1500tgaattgtga cataataata gcagtggtaa aagtatttgc ttaaaattgt gagcgaatta 1560gcaataacat acatgagata actcaagaaa tcaaaagata gttgattctt gccttgtacc 1620tcaatctatt ctgtaaaatt aaacaaatat gcaaaccagg atttccttga cttctttgag 1680aatgcaagcg aaattaaatc tgaataaata attcttcctc ttcaaaaaaa aaaaaaaaaa 1740aaaaaaaaaa aaa 175381723DNAHomo sapiens 8agcaacgagc gacggcctga cgtcggcgga gggaagccgg cccaggctcg gtgaggaggc 60aaggttctga ggggacaggc tgacctggag gaccagaggc ccccggagga gcactgaagg 120agaagatctg ccagtgggtc tccattgccc agctcctgcc cacactcccg cctgttgccc 180tgaccagagt catcatgcct cttgagcaga ggagtcagca ctgcaagcct gaagaaggcc 240ttgaggcccg aggagaggcc ctgggcctgg tgggtgcgca ggctcctgct actgaggagc 300aggaggctgc ctcctcctct tctactctag ttgaagtcac cctgggggag gtgcctgctg 360ccgagtcacc agatcctccc cagagtcctc agggagcctc cagcctcccc actaccatga 420actaccctct ctggagccaa tcctatgagg 
actccagcaa ccaagaagag gaggggccaa 480gcaccttccc tgacctggag tctgagttcc aagcagcact cagtaggaag gtggccaagt 540tggttcattt tctgctcctc aagtatcgag ccagggagcc ggtcacaaag gcagaaatgc 600tggggagtgt cgtcggaaat tggcagtact tctttcctgt gatcttcagc aaagcttccg 660attccttgca gctggtcttt ggcatcgagc tgatggaagt ggaccccatc ggccacgtgt 720acatctttgc cacctgcctg ggcctctcct acgatggcct gctgggtgac aatcagatca 780tgcccaagac aggcttcctg ataatcatcc tggccataat cgcaaaagag ggcgactgtg 840cccctgagga gaaaatctgg gaggagctga gtgtgttaga ggtgtttgag gggagggaag 900acagtatctt cggggatccc aagaagctgc tcacccaata tttcgtgcag gaaaactacc 960tggagtaccg gcaggtcccc ggcagtgatc ctgcatgcta tgagttcctg tggggtccaa 1020gggccctcat tgaaaccagc tatgtgaaag tcctgcacca tatggtaaag atcagtggag 1080gacctcgcat ttcctaccca ctcctgcatg agtgggcttt gagagagggg gaagagtgag 1140tctgagcacg agttgcagcc agggccagtg ggagggggtt tgggccagtg caccttccgg 1200ggccccatcc cttagtttcc actgcctcct gtgacgtgag gcccattctt cactctttga 1260agcgagcagt cagcattctt agtagtgggt ttctgttctg ttggatgact ttgagattat 1320tctttgtttc ctgttggagt tgttcaaatg ttccttttaa cggatggttg aatgagcgtc 1380agcatccagg tttatgaatg acagtagtca cacatagtgc tgtttatata gtttaggagt 1440aagagtcttg ttttttattc agattgggaa atccattcca ttttgtgaat tgtgacataa 1500taatagcagt ggtaaaagta

tttgcttaaa attgtgagcg aattagcaat aacatacatg 1560agataactca agaaatcaaa agatagttga ttcttgcctt gtacctcaat ctattctgta 1620aaattaaaca aatatgcaaa ccaggatttc cttgacttct ttgagaatgc aagcgaaatt 1680aaatctgaat aaataattaa aaaaaaaaaa aaaaaaaaaa aaa 172391524DNAHomo sapiens 9agcagacaga ggactctcat taaggaaggt gtcctgtgcc ctgaccctac aagatgccaa 60gagaagatgc tcacttcatc tatggttacc ccaagaaggg gcacggccac tcttacacca 120cggctgaaga ggccgctggg atcggcatcc tgacagtgat cctgggagtc ttactgctca 180tcggctgttg gtattgtaga agacgaaatg gatacagagc cttgatggat aaaagtcttc 240atgttggcac tcaatgtgcc ttaacaagaa gatgcccaca agaagggttt gatcatcggg 300acagcaaagt gtctcttcaa gagaaaaact gtgaacctgt ggttcccaat gctccacctg 360cttatgagaa actctctgca gaacagtcac caccacctta ttcaccttaa gagccagcga 420gacacctgag acatgctgaa attatttctc tcacactttt gcttgaattt aatacagaca 480tctaatgttc tcctttggaa tggtgtagga aaaatgcaag ccatctctaa taataagtca 540gtgttaaaat tttagtaggt ccgctagcag tactaatcat gtgaggaaat gatgagaaat 600attaaattgg gaaaactcca tcaataaatg ttgcaatgca tgatactatc tgtgccagag 660gtaatgttag taaatccatg gtgttatttt ctgagagaca gaattcaagt gggtattctg 720gggccatcca atttctcttt acttgaaatt tggctaataa caaactagtc aggttttcga 780accttgaccg acatgaactg tacacagaat tgttccagta ctatggagtg ctcacaaagg 840atacttttac aggttaagac aaagggttga ctggcctatt tatctgatca agaacatgtc 900agcaatgtct ctttgtgctc taaaattcta ttatactaca ataatatatt gtaaagatcc 960tatagctctt tttttttgag atggagtttc gcttttgttg cccaggctgg agtgcaatgg 1020cgcgatcttg gctcaccata acctccgcct cccaggttca agcaattctc ctgccttagc 1080ctcctgagta gctgggatta caggcgtgcg ccactatgcc tgactaattt tgtagtttta 1140gtagagacgg ggtttctcca tgttggtcag gctggtctca aactcctgac ctcaggtgat 1200ctgcccgcct cagcctccca aagtgctgga attacaggcg tgagccacca cgcctggctg 1260gatcctatat cttaggtaag acatataacg cagtctaatt acatttcact tcaaggctca 1320atgctattct aactaatgac aagtattttc tactaaacca gaaattggta gaaggattta 1380aataagtaaa agctactatg tactgcctta gtgctgatgc ctgtgtactg ccttaaatgt 1440acctatggca atttagctct cttgggttcc caaatccctc tcacaagaat gtgcagaaga 
1500aatcataaag gatcagagat tctg 1524101881DNAHomo sapiens 10cctgcatctt tttggaagga ttctttttat aaatcagaaa gtgttcgagg ttcaaaggtt 60tgcctcggag cgtgtgaaca ttcctccgct cggttttcaa ctcgcctcca acctgcgccg 120cccggccagc atgtctcccc gcccgtgaag cggggctgcc gcctccctgc cgctccggct 180gccactaacg acccgccctc gccgccacct ggccctcctg atcgacgaca cacgcacttg 240aaacttgttc tcagggtgtg tggaatcaac tttccggaag caaccagccc accagaggag 300gtcccgagcg cgagcggaga cgatgcagcg gagactggtt cagcagtgga gcgtcgcggt 360gttcctgctg agctacgcgg tgccctcctg cgggcgctcg gtggagggtc tcagccgccg 420cctcaaaaga gctgtgtctg aacatcagct cctccatgac aaggggaagt ccatccaaga 480tttacggcga cgattcttcc ttcaccatct gatcgcagaa atccacacag ctgaaatcag 540agctacctcg gaggtgtccc ctaactccaa gccctctccc aacacaaaga accaccccgt 600ccgatttggg tctgatgatg agggcagata cctaactcag gaaactaaca aggtggagac 660gtacaaagag cagccgctca agacacctgg gaagaaaaag aaaggcaagc ccgggaaacg 720caaggagcag gaaaagaaaa aacggcgaac tcgctctgcc tggttagact ctggagtgac 780tgggagtggg ctagaagggg accacctgtc tgacacctcc acaacgtcgc tggagctcga 840ttcacggtaa caggcttctc tggcccgtag cctcagcggg gtgctctcag ctgggttttg 900gagcctccct tctgccttgg cttggacaaa cctagaattt tctcccttta tgtatctcta 960tcgattgtgt agcaattgac agagaataac tcagaatatt gtctgcctta aagcagtacc 1020cccctaccac acacacccct gtcctccagc accatagaga ggcgctagag cccattcctc 1080tttctccacc gtcacccaac atcaatcctt taccactcta ccaaataatt tcatattcaa 1140gcttcagaag ctagtgacca tcttcataat ttgctggaga agtgtgtttc ttccccttac 1200tctcacacct gggcaaactt tcttcagtgt ttttcatttc ttacgttctt tcacttcaag 1260ggagaatata gaagcatttg atattatcta caaacactgc agaacagcat catgtcataa 1320acgattctga gccattcaca ctttttattt aattaaatgt atttaattaa atctcaaatt 1380tattttaatg taaagaactt aaattatgtt ttaaacacat gccttaaatt tgtttaatta 1440aatttaactc tggtttctac cagctcatac aaaataaatg gtttctgaaa atgtttaagt 1500attaacttac aaggatatag gtttttctca tgtatctttt tgttcattgg caagatgaaa 1560taatttttct agggtaatgc cgtaggaaaa ataaaacttc acatttatgt ggcttgttta 1620tccttagctc acagattgag gtaataatga cactcctaga ctttgggatc aaataactta 
1680gggccaagtc ttgggtctga atttatttaa gttcacaacc tagggcaagt tactctgcct 1740ttctaagact cacttacatc ttctgtgaaa tataattgta ccaacctcat agagtttggt 1800gtcaactaaa tgagattata tgtggactaa atatctgtca tatagtaaac actcaataaa 1860ttgcaacata ttaaaaaaaa a 1881113336DNAHomo sapiens 11ttttcttaga cattaactgc agacggctgg caggatagaa gcagcggctc acttggactt 60tttcaccagg gaaatcagag acaatgatgg ggctcttccc cagaactaca ggggctctgg 120ccatcttcgt ggtggtcata ttggttcatg gagaattgcg aatagagact aaaggtcaat 180atgatgaaga agagatgact atgcaacaag ctaaaagaag gcaaaaacgt gaatgggtga 240aatttgccaa accctgcaga gaaggagaag ataactcaaa aagaaaccca attgccaaga 300ttacttcaga ttaccaagca acccagaaaa tcacctaccg aatctctgga gtgggaatcg 360atcagccgcc ttttggaatc tttgttgttg acaaaaacac tggagatatt aacataacag 420ctatagtcga ccgggaggaa actccaagct tcctgatcac atgtcgggct ctaaatgccc 480aaggactaga tgtagagaaa ccacttatac taacggttaa aattttggat attaatgata 540atcctccagt attttcacaa caaattttca tgggtgaaat tgaagaaaat agtgcctcaa 600actcactggt gatgatacta aatgccacag atgcagatga accaaaccac ttgaattcta 660aaattgcctt caaaattgtc tctcaggaac cagcaggcac acccatgttc ctcctaagca 720gaaacactgg ggaagtccgt actttgacca attctcttga ccgagagcaa gctagcagct 780atcgtctggt tgtgagtggt gcagacaaag atggagaagg actatcaact caatgtgaat 840gtaatattaa agtgaaagat gtcaacgata acttcccaat gtttagagac tctcagtatt 900cagcacgtat tgaagaaaat attttaagtt ctgaattact tcgatttcaa gtaacagatt 960tggatgaaga gtacacagat aattggcttg cagtatattt ctttacctct gggaatgaag 1020gaaattggtt tgaaatacaa actgatccta gaactaatga aggcatcctg aaagtggtga 1080aggctctaga ttatgaacaa ctacaaagcg tgaaacttag tattgctgtc aaaaacaaag 1140ctgaatttca ccaatcagtt atctctcgat accgagttca gtcaacccca gtcacaattc 1200aggtaataaa tgtaagagaa ggaattgcat tccgtcctgc ttccaagaca tttactgtgc 1260aaaaaggcat aagtagcaaa aaattggtgg attatatcct gggaacatat caagccatcg 1320atgaggacac taacaaagct gcctcaaatg tcaaatatgt catgggacgt aacgatggtg 1380gatacctaat gattgattca aaaactgctg aaatcaaatt tgtcaaaaat atgaaccgag 1440attctacttt catagttaac aaaacaatca cagctgaggt tctggccata gatgaataca 
1500cgggtaaaac ttctacaggc acggtatatg ttagagtacc cgatttcaat gacaattgtc 1560caacagctgt cctcgaaaaa gatgcagttt gcagttcttc accttccgtg gttgtctccg 1620ctagaacact gaataataga tacactggcc cctatacatt tgcactggaa gatcaacctg 1680taaagttgcc tgccgtatgg agtatcacaa ccctcaatgc tacctcggcc ctcctcagag 1740cccaggaaca gatacctcct ggagtatacc acatctccct ggtacttaca gacagtcaga 1800acaatcggtg tgagatgcca cgcagcttga cactggaagt ctgtcagtgt gacaacaggg 1860gcatctgtgg aacttcttac ccaaccacaa gccctgggac caggtatggc aggccgcact 1920cagggaggct ggggcctgcc gccatcggcc tgctgctcct tggtctcctg ctgctgctgt 1980tggcccccct tctgctgttg acctgtgact gtggggcagg ttctactggg ggagtgacag 2040gtggttttat cccagttcct gatggctcag aaggaacaat tcatcagtgg ggaattgaag 2100gagcccatcc tgaagacaag gaaatcacaa atatttgtgt gcctcctgta acagccaatg 2160gagccgattt catggaaagt tctgaagttt gtacaaatac gtatgccaga ggcacagcgg 2220tggaaggcac ttcaggaatg gaaatgacca ctaagcttgg agcagccact gaatctggag 2280gtgctgcagg ctttgcaaca gggacagtgt caggagctgc ttcaggattc ggagcagcca 2340ctggagttgg catctgttcc tcagggcagt ctggaaccat gagaacaagg cattccactg 2400gaggaaccaa taaggactac gctgatgggg cgataagcat gaattttctg gactcctact 2460tttctcagaa agcatttgcc tgtgcggagg aagacgatgg ccaggaagca aatgactgct 2520tgttgatcta tgataatgaa ggcgcagatg ccactggttc tcctgtgggc tccgtgggtt 2580gttgcagttt tattgctgat gacctggatg acagcttctt ggactcactt ggacccaaat 2640ttaaaaaact tgcagagata agccttggtg ttgatggtga aggcaaagaa gttcagccac 2700cctctaaaga cagcggttat gggattgaat cctgtggcca tcccatagaa gtccagcaga 2760caggatttgt taagtgccag actttgtcag gaagtcaagg agcttctgct ttgtccgcct 2820ctgggtctgt ccagccagct gtttccatcc ctgaccctct gcagcatggt aactatttag 2880taacggagac ttactcggct tctggttccc tcgtgcaacc ttccactgca ggctttgatc 2940cacttctcac acaaaatgtg atagtgacag aaagggtgat ctgtcccatt tccagtgttc 3000ctggcaacct agctggccca acgcagctac gagggtcaca tactatgctc tgtacagagg 3060atccttgctc ccgtctaata tgaccagaat gagctggaat accacactga ccaaatctgg 3120atctttggac taaagtattc aaaatagcat agcaaagctc actgtattgg gctaataatt 3180tggcacttat tagcttctct cataaactga 
tcacgattat aaattaaatg tttgggttca 3240taccccaaaa gcaatatgtt gtcactccta attctcaagt actattcaaa ttgtagtaaa 3300tcttaaagtt tttcaaaacc ctaaaatcat attcgc 3336121694DNAHomo sapiens 12ctctctgccc acctctgctt cctctaggaa cacaggagtt ccagatcaca tcgagttcac 60catgaattca ctcagtgaag ccaacaccaa gttcatgttc gacctgttcc aacagttcag 120aaaatcaaaa gagaacaaca tcttctattc ccctatcagc atcacatcag cattagggat 180ggtcctctta ggagccaaag acaacactgc acaacagatt aagaaggttc ttcactttga 240tcaagtcaca gagaacacca caggaaaagc tgcaacatat catgttgata ggtcaggaaa 300tgttcatcac cagtttcaaa agcttctgac tgaattcaac aaatccactg atgcatatga 360gctgaagatc gccaacaagc tcttcggaga aaaaacgtat ctatttttac aggaatattt 420agatgccatc aagaaatttt accagaccag tgtggaatct gttgattttg caaatgctcc 480agaagaaagt cgaaagaaga ttaactcctg ggtggaaagt caaacgaatg aaaaaattaa 540aaacctaatt cctgaaggta atattggcag caataccaca ttggttcttg tgaacgcaat 600ctatttcaaa gggcagtggg agaagaaatt taataaagaa gatactaaag aggaaaaatt 660ttggccaaac aagaatacat acaagtccat acagatgatg aggcaataca catcttttca 720ttttgcctcg ctggaggatg tacaggccaa ggtcctggaa ataccataca aaggcaaaga 780tctaagcatg attgtgttgc tgccaaatga aatcgatggt ctccagaagc ttgaagagaa 840actcactgct gagaaattga tggaatggac aagtttgcag aatatgagag agacacgtgt 900cgatttacac ttacctcggt tcaaagtgga agagagctat gacctcaagg acacgttgag 960aaccatggga atggtggata tcttcaatgg ggatgcagac ctctcaggca tgaccgggag 1020ccgcggtctc gtgctatctg gagtcctaca caaggccttt gtggaggtta cagaggaggg 1080agcagaagct gcagctgcca ccgctgtagt aggattcgga tcatcaccta cttcaactaa 1140tgaagagttc cattgtaatc accctttcct attcttcata aggcaaaata agaccaacag 1200catcctcttc tatggcagat tctcatcccc gtagatgcaa ttagtctgtc actccatttg 1260gaaaatgttc acctgcagat gttctggtaa actgattgct ggcaacaaca gattctcttg 1320gctcatattt cttttctttc tcatcttgat gatgatcgtc atcatcaaga atttaatgat 1380taaaatagca tgcctttctc tctttctctt aataagccca catataaatg tactttttct 1440tccagaaaaa ttctccttga ggaaaaatgt ccaaaataag atgaatcact taataccgta 1500tcttctaaat ttgaaatata attctgtttg tgacctgttt taaatgaacc aaaccaaatc 1560atactttttc tttgaattta 
gcaacctaga aacacacatt tctttgaatt taggtgatac 1620ctaaatcctt cttatgtttc taaattttgt gattctataa aacacatcat caataaaata 1680gtgacataaa atca 1694131782DNAHomo sapiens 13gggagacaca cacagcctct ctgcccacct ctgcttcctc taggaacaca ggagttccag 60atcacatcga gttcaccatg aattcactca gtgaagccaa caccaagttc atgttcgatc 120tgttccaaca gttcagaaaa tcaaaagaga acaacatctt ctattcccct atcagcatca 180catcagcatt agggatggtc ctcttaggag ccaaagacaa cactgcacaa caaattagca 240aggttcttca ctttgatcaa gtcacagaga acaccacaga aaaagctgca acatatcatg 300ttgataggtc aggaaatgtt catcaccagt ttcaaaagct tctgactgaa ttcaacaaat 360ccactgatgc atatgagctg aagatcgcca acaagctctt cggagaaaag acgtatcaat 420ttttacagga atatttagat gccatcaaga aattttacca gaccagtgtg gaatctactg 480attttgcaaa tgctccagaa gaaagtcgaa agaagattaa ctcctgggtg gaaagtcaaa 540cgaatgaaaa aattaaaaac ctatttcctg atgggactat tggcaatgat acgacactgg 600ttcttgtgaa cgcaatctat ttcaaagggc agtgggagaa taaatttaaa aaagaaaaca 660ctaaagagga aaaattttgg ccaaacaaga atacatacaa atctgtacag atgatgaggc 720aatacaattc ctttaatttt gccttgctgg aggatgtaca ggccaaggtc ctggaaatac 780catacaaagg caaagatcta agcatgattg tgctgctgcc aaatgaaatc gatggtctgc 840agaagcttga agagaaactc actgctgaga aattgatgga atggacaagt ttgcagaata 900tgagagagac atgtgtcgat ttacacttac ctcggttcaa aatggaagag agctatgacc 960tcaaggacac gttgagaacc atgggaatgg tgaatatctt caatggggat gcagacctct 1020caggcatgac ctggagccac ggtctctcag tatctaaagt cctacacaag gcctttgtgg 1080aggtcactga ggagggagtg gaagctgcag ctgccaccgc tgtagtagta gtcgaattat 1140catctccttc aactaatgaa gagttctgtt gtaatcaccc tttcctattc ttcataaggc 1200aaaataagac caacagcatc ctcttctatg gcagattctc atccccatag atgcaattag 1260tctgtcactc catttagaaa atgttcacct agaggtgttc tggtaaactg attgctggca 1320acaacagatt ctcttggctc atatttcttt tctatctcat cttgatgatg atagtcatca 1380tcaagaattt aatgattaaa atagcatgcc tttctctctt tctcttaata agcccacata 1440taaatgtact tttccttcca gaaaaatttc ccttgaggaa aaatgtccaa gataagatga 1500atcatttaat accgtgtctt ctaaatttga aatataattc tgtttctgac ctgttttaaa 1560tgaaccaaac caaatcatac tttctcttca 
aatttagcaa cctagaaaca cacatttctt 1620tgaatttagg tgatacctaa atccttctta tgtttctaaa ttttgtgatt ctataaaaca 1680catcatcaat aaaataatga ctaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1740aaaaaaaaaa aacccaaaaa aaaaaaaaaa aaaaaaaaaa aa 1782141528DNAHomo sapiens 14cggcgagcga gcaccttcga cgcggtccgg ggaccccctc gtcgctgtcc tcccgacgcg 60gacccgcgtg ccccaggcct cgcgctgccc ggccggctcc tcgtgtccca ctcccggcgc 120acgccctccc gcgagtcccg ggcccctccc gcgcccctct tctcggcgcg cgcgcagcat 180ggcgcccccg caggtcctcg cgttcgggct tctgcttgcc gcggcgacgg cgacttttgc 240cgcagctcag gaagaatgtg tctgtgaaaa ctacaagctg gccgtaaact gctttgtgaa 300taataatcgt caatgccagt gtacttcagt tggtgcacaa aatactgtca tttgctcaaa 360gctggctgcc aaatgtttgg tgatgaaggc agaaatgaat ggctcaaaac ttgggagaag 420agcaaaacct gaaggggccc tccagaacaa tgatgggctt tatgatcctg actgcgatga 480gagcgggctc tttaaggcca agcagtgcaa cggcacctcc acgtgctggt gtgtgaacac 540tgctggggtc agaagaacag acaaggacac tgaaataacc tgctctgagc gagtgagaac 600ctactggatc atcattgaac taaaacacaa agcaagagaa aaaccttatg atagtaaaag 660tttgcggact gcacttcaga aggagatcac aacgcgttat caactggatc caaaatttat 720cacgagtatt ttgtatgaga ataatgttat cactattgat ctggttcaaa attcttctca 780aaaaactcag aatgatgtgg acatagctga tgtggcttat tattttgaaa aagatgttaa 840aggtgaatcc ttgtttcatt ctaagaaaat ggacctgaca gtaaatgggg aacaactgga 900tctggatcct ggtcaaactt taatttatta tgttgatgaa aaagcacctg aattctcaat 960gcagggtcta aaagctggtg ttattgctgt tattgtggtt gtggtgatag cagttgttgc 1020tggaattgtt gtgctggtta tttccagaaa gaagagaatg gcaaagtatg agaaggctga 1080gataaaggag atgggtgaga tgcataggga actcaatgca taactatata atttgaagat 1140tatagaagaa gggaaatagc aaatggacac aaattacaaa tgtgtgtgcg tgggacgaag 1200acatctttga aggtcatgag tttgttagtt taacatcata tatttgtaat agtgaaacct 1260gtactcaaaa tataagcagc ttgaaactgg ctttaccaat cttgaaattt gaccacaagt 1320gtcttatata tgcagatcta atgtaaaatc cagaacttgg actccatcgt taaaattatt 1380tatgtgtaac attcaaatgt gtgcattaaa tatgcttcca cagtaaaatc tgaaaaactg 1440atttgtgatt gaaagctgcc tttctattta cttgagtctt gtacatacat acttttttat 1500gagctatgaa 
ataaaacatt ttaaactg 1528152384DNAHomo sapiens 15tattgagttc ttcaaacatt gtagcctctt tatggtctct gagaaataac taccttaaac 60ccataatctt taatacttcc taaactttct taataagaga agctctattc ctgacactac 120ctctcatttg caaggtcaaa tcatcattag ttttgtagtc tattaactgg gtttgcttag 180gtcaggcatt attattacta accttattgt taatattcta accataagaa ttaaactatt 240aatggtgaat agagtttttc actttaacat aggcctatcc cactggtggg atacgagcca 300attcgaaaga aaagtcagtc atgtgctttt cagaggatga aagcttaaga taaagactaa 360aagtgtttga tgctggaggt gggagtggta ttatataggt ctcagccaag acatgtgata 420atcactgtag tagtagctgg aaagagaaat ctgtgactcc aattagccag ttcctgcaga 480ccttgtgagg actagaggaa gaatgctcct ggctgttttg tactgcctgc tgtggagttt 540ccagacctcc gctggccatt tccctagagc ctgtgtctcc tctaagaacc tgatggagaa 600ggaatgctgt ccaccgtgga gcggggacag gagtccctgt ggccagcttt caggcagagg 660ttcctgtcag aatatccttc tgtccaatgc accacttggg cctcaatttc ccttcacagg 720ggtggatgac cgggagtcgt ggccttccgt cttttataat aggacctgcc agtgctctgg 780caacttcatg ggattcaact gtggaaactg caagtttggc ttttggggac caaactgcac 840agagagacga ctcttggtga gaagaaacat cttcgatttg agtgccccag agaaggacaa 900attttttgcc tacctcactt tagcaaagca taccatcagc tcagactatg tcatccccat 960agggacctat ggccaaatga aaaatggatc aacacccatg tttaacgaca tcaatattta 1020tgacctcttt gtctggatgc attattatgt gtcaatggat gcactgcttg ggggatctga 1080aatctggaga gacattgatt ttgcccatga agcaccagct tttctgcctt ggcatagact 1140cttcttgttg cggtgggaac aagaaatcca gaagctgaca ggagatgaaa acttcactat 1200tccatattgg gactggcggg atgcagaaaa gtgtgacatt tgcacagatg agtacatggg 1260aggtcagcac cccacaaatc ctaacttact cagcccagca tcattcttct cctcttggca 1320gattgtctgt agccgattgg aggagtacaa cagccatcag tctttatgca atggaacgcc 1380cgagggacct ttacggcgta atcctggaaa ccatgacaaa tccagaaccc caaggctccc 1440ctcttcagct gatgtagaat tttgcctgag tttgacccaa tatgaatctg gttccatgga 1500taaagctgcc aatttcagct ttagaaatac actggaagga tttgctagtc cacttactgg 1560gatagcggat gcctctcaaa gcagcatgca caatgccttg cacatctata tgaatggaac 1620aatgtcccag gtacagggat ctgccaacga tcctatcttc cttcttcacc atgcatttgt 
1680tgacagtatt tttgagcagt ggctccgaag gcaccgtcct cttcaagaag tttatccaga 1740agccaatgca cccattggac ataaccggga atcctacatg gttcctttta taccactgta 1800cagaaatggt gatttcttta tttcatccaa agatctgggc tatgactata gctatctaca 1860agattcagac ccagactctt ttcaagacta cattaagtcc tatttggaac aagcgagtcg 1920gatctggtca tggctccttg gggcggcgat ggtaggggcc gtcctcactg ccctgctggc 1980agggcttgtg agcttgctgt gtcgtcacaa gagaaagcag cttcctgaag aaaagcagcc 2040actcctcatg gagaaagagg attaccacag cttgtatcag agccatttat aaaaggctta 2100ggcaatagag tagggccaaa aagcctgacc tcactctaac tcaaagtaat gtccaggttc 2160ccagagaata tctgctggta tttttctgta aagaccattt gcaaaattgt aacctaatac 2220aaagtgtagc cttcttccaa ctcaggtaga acacacctgt ctttgtcttg ctgttttcac 2280tcagcccttt taacattttc ccctaagccc atatgtctaa ggaaaggatg ctatttggta 2340atgaggaact gttatttgta tgtgaattaa agtgctctta tttt 2384162702DNAHomo sapiens 16cattctcccc caggctcact caccatgacc aagctgagcg cccaagtcaa aggctctctc 60aacatcacca ccccggggct gcagatatgg aggatcgagg ccatgcagat ggtgcctgtt 120ccttccagca cctttggaag cttcttcgat ggtgactgct acatcatcct ggctatccac 180aagacagcca gcagcctgtc ctatgacatc cactactgga ttggccagga ctcatccctg 240gatgagcagg gggcagctgc catctacacc acacagatgg atgacttcct gaagggccgg 300gctgtgcagc accgcgaggt ccagggcaac gagagcgagg

ccttccgagg ctacttcaag 360caaggccttg tgatccggaa agggggcgtg gcttctggca tgaagcacgt ggagaccaac 420tcctatgacg tccagaggct gctgcatgtc aagggcaaga ggaacgtggt agctggagag 480gtagagatgt cctggaagag tttcaaccga ggggatgttt tcctcctgga ccttgggaag 540cttatcatcc agtggaatgg accggaaagc acccgtatgg agagactcag gggcatgact 600ctggccaagg agatccgaga ccaggagcgg ggagggcgca cctatgtagg cgtggtggac 660ggagagaatg aattggcatc cccgaagctg atggaggtga tgaaccacgt gctgggcaag 720cgcagggagc tgaaggcggc cgtgcccgac acggtggtgg agccggcact caaggctgca 780ctcaaactgt accatgtgtc tgactccgag gggaatctgg tggtgaggga agtcgccaca 840cggccactga cacaggacct gctcagtcac gaggactgtt acatcctgga ccaggggggc 900ctgaagatct acgtgtggaa agggaagaaa gccaatgagc aggagaagaa gggagccatg 960agccatgcgc tgaacttcat caaagccaag cagtacccac caagcacaca ggtggaggtg 1020cagaatgatg gggctgagtc ggccgtcttt cagcagctct tccagaagtg gacagcgtcc 1080aaccggacct caggcctagg caaaacccac actgtgggct ccgtggccaa agtggaacag 1140gtgaagttcg atgccacatc catgcatgtc aagcctcagg tggctgccca gcagaagatg 1200gtagatgatg ggagtgggga agtgcaggtg tggcgcattg agaacctaga gctggtacct 1260gtggattcca agtggctagg ccacttctat gggggcgact gctacctgct gctctacacc 1320tacctcatcg gcgagaagca gcattacctg ctctacgttt ggcagggcag ccaggccagc 1380caagatgaaa ttacagcatc agcttatcaa gccgtcatcc tggaccagaa gtacaatggt 1440gaaccagtcc agatccgggt cccaatgggc aaggagccac ctcatcttat gtccatcttc 1500aagggacgca tggtggtcta ccagggaggc acctcccgaa ctaacaactt ggagaccggg 1560ccctccacac ggctgttcca ggtccaggga actggcgcca acaacaccaa ggcctttgag 1620gtcccagcgc gggccaattt cctcaattcc aatgatgtct ttgtcctcaa gacccagtct 1680tgctgctatc tatggtgtgg gaagggttgt agcggggacg agcgggagat ggccaagatg 1740gttgctgaca ccatctcccg gacggagaag caagtggtgg tggaagggca ggagccagcc 1800aacttctgga tggccctggg tgggaaggcc ccctatgcca acaccaagag actacaggaa 1860gaaaacctgg tcatcacccc ccggctcttt gagtgttcca acaagactgg gcgcttcctg 1920gccacagaga tccctgactt caatcaggat gacttggaag aggatgatgt gttcctacta 1980gatgtctggg accaggtctt cttctggatt gggaaacatg ccaacgagga ggagaagaag 2040gccgcagcaa ccactgcaca 
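ggaatacctc aagacccatc ccagcgggcg tgaccctgag 2100acccccatca ttgtggtgaa gcagggacac gagcccccca ccttcacagg ctggttcctg 2160gcttgggatc ccttcaagtg gagtaacacc aaatcctatg aggacctgaa ggcggagtct 2220ggcaacctta gggactggag ccagatcact gctgaggtca caagccccaa agtggacgtg 2280ttcaatgcta acagcaacct cagttctggg cctctgccca tcttccccct ggagcagcta 2340gtgaacaagc ctgtagagga gctccccgag ggtgtggacc ccagcaggaa ggaggaacac 2400ctgtccattg aagatttcac tcaggccttt gggatgactc cagctgcctt ctctgctctg 2460cctcgatgga agcaacaaaa cctcaagaaa gaaaaaggac tattttgaga agagtagctg 2520tggttgtaaa gcagtaccct accctgattg tagggtctca ttttctcacc gatattagtc 2580ctacaccaat tgaagtgaaa ttttgcagat gtgcctatga gcacaaactt ctgtggcaaa 2640tgccagtttt gtttaataat gtacctattc cttcagaaag atgatacccc aaaaaaaaaa 2700aa 2702

SEQ ID NO: 17; length: 24; type: DNA; organism: Homo sapiens; sequence: gattgtgatg taacggctgt aatg
SEQ ID NO: 18; length: 20; type: DNA; organism: Homo sapiens; sequence: atccttgtcc tccacgggtt
SEQ ID NO: 19; length: 17; type: DNA; organism: Homo sapiens; sequence: agatcgacaa cgcccgt
SEQ ID NO: 20; length: 20; type: DNA; organism: Homo sapiens; sequence: agagcctgtt ccgtctcaaa
SEQ ID NO: 21; length: 21; type: DNA; organism: Homo sapiens; sequence: gggccttggt ctcctctaga g
SEQ ID NO: 22; length: 19; type: DNA; organism: Homo sapiens; sequence: ccagggagcg actgttgtc
SEQ ID NO: 23; length: 21; type: DNA; organism: Homo sapiens; sequence: gtgaggaggc aaggttytsa g
SEQ ID NO: 24; length: 22; type: DNA; organism: Homo sapiens; sequence: agacccacwg gcagatcttc tc
SEQ ID NO: 25; length: 19; type: DNA; organism: Homo sapiens; sequence: actgtcagga tgccgatcc
SEQ ID NO: 26; length: 23; type: DNA; organism: Homo sapiens; sequence: agcggcctct tcagccgtgg tgt
SEQ ID NO: 27; length: 23; type: DNA; organism: Homo sapiens; sequence: tcatggagga gctgatgttc aga
SEQ ID NO: 28; length: 19; type: DNA; organism: Homo sapiens; sequence: caaaaggcgg ctgatcgat
SEQ ID NO: 29; length: 21; type: DNA; organism: Homo sapiens; sequence: ggcgatcttc agctcatatg c
SEQ ID NO: 30; length: 23; type: DNA; organism: Homo sapiens; sequence: ggttttgctc ttctcccaag ttt
SEQ ID NO: 31; length: 23; type: DNA; organism: Homo sapiens; sequence: actgatggct gttgtactcc tcc
SEQ ID NO: 32; length: 18; type: DNA; organism: Homo sapiens; sequence: ttgccagact ccgccttc
SEQ ID NO: 33; length: 16; type: DNA; organism: Homo sapiens; sequence: gtgaaggcca cagcat
SEQ ID NO: 34; length: 17; type: DNA; organism: Homo sapiens; sequence: aactggctgc tgtaacg
SEQ ID NO: 35; length: 19; type: DNA; organism: Homo sapiens; sequence: gccgatgagc agtaagact
SEQ ID NO: 36; length: 20; type: DNA; organism: Homo sapiens; sequence: tgtcaacaac aaagattcca
SEQ ID NO: 37; length: 18; type: DNA; organism: Homo sapiens; sequence: tctccgaaga gcttgttg
SEQ ID NO: 38; length: 17; type: DNA; organism: Homo sapiens; sequence: agcccatcat tgttctg
SEQ ID NO: 39; length: 17; type: DNA; organism: Homo sapiens; sequence: cgttccattg cataaag
SEQ ID NO: 40; length: 16; type: DNA; organism: Homo sapiens; sequence: gctccagtcc ctaagg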
