U.S. patent application number 14/746487 was filed with the patent office on 2015-06-22 and published on 2015-12-24 for methods and materials for classification of tissue of origin of tumor samples.
The applicant listed for this patent is Rosetta Genomics Ltd.. Invention is credited to Ranit Aharonov, Nir Dromi, Nitzan Rosenfeld, Shai Rosenwald.
Application Number | 20150368724 14/746487 |
Document ID | / |
Family ID | 49235334 |
Filed Date | 2015-06-22 |
United States Patent Application | 20150368724 |
Kind Code | A1 |
Aharonov; Ranit; et al. | December 24, 2015 |
METHODS AND MATERIALS FOR CLASSIFICATION OF TISSUE OF ORIGIN OF
TUMOR SAMPLES
Abstract
The present invention provides a process for classification of
cancers and tissues of origin through the analysis of the
expression patterns of specific microRNAs and nucleic acid
molecules relating thereto. Classification according to a microRNA
tree-based expression framework allows optimization of treatment,
and determination of specific therapy.
Inventors: | Aharonov; Ranit; (Tel Aviv, IL); Rosenfeld; Nitzan; (Rehovot, IL); Rosenwald; Shai; (Nes Ziona, IL); Dromi; Nir; (Rehovot, IL) |
Applicant: |
Name | City | State | Country | Type |
Rosetta Genomics Ltd. | Rehovot | | IL | |
Family ID: | 49235334 |
Appl. No.: | 14/746487 |
Filed: | June 22, 2015 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
13856190 | Apr 3, 2013 | 9096906 |
14746487 | | |
13167489 | Jun 23, 2011 | 8802599 |
13856190 | | |
12532940 | Sep 24, 2009 | |
PCT/IL2008/000396 | Mar 20, 2008 | |
13167489 | | |
PCT/IL2009/001212 | Dec 23, 2009 | |
13167489 | | |
PCT/IL2011/000849 | Nov 1, 2011 | |
13856190 | | |
60907266 | Mar 27, 2007 | |
60929244 | Jun 19, 2007 | |
61024565 | Jan 30, 2008 | |
61140642 | Dec 24, 2008 | |
61415875 | Nov 22, 2010 | |
Current U.S. Class: | 506/9; 435/6.11; 435/6.12; 506/16 |
Current CPC Class: | C12Q 1/6886 20130101; C12Q 2600/178 20130101; C12Q 2600/158 20130101 |
International Class: | C12Q 1/68 20060101 C12Q001/68 |
Claims
1.-95. (canceled)
96. A method of identifying a tissue of origin of a cancer sample,
said method comprising: (a) obtaining a biological sample from a
subject in need thereof, wherein the sample is of a cancer selected
from the group consisting of cancer of unknown primary (CUP),
primary cancer, and metastatic cancer; (b) measuring the level of
nucleic acids comprising SEQ ID NOS: 1, 2 or 156, 3-7, 9-12, 14-21,
23-27, 29-40, 42, 43, 44 or 191, 45-51, 53-56, 57 or 202, 58, 59,
60 or 208, 61, 62 or 211, 64-69, 146-148, and optionally at least
one control nucleic acid in the biological sample and applying a
classifier algorithm to said level of nucleic acids measured; and
(c) identifying the tissue of origin of the sample based on the
classification provided by the classifier algorithm.
97. The method of claim 96, wherein the classifier algorithm is
selected from the group consisting of: decision tree classifier,
K-nearest neighbor classifier (KNN), logistic regression
classifier, nearest neighbor classifier, neural network classifier,
Gaussian mixture model (GMM), Support Vector Machine (SVM)
classifier, nearest centroid classifier, linear regression
classifier and random forest classifier.
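As an illustration only, one of the classifier types listed in claim 97, a nearest centroid classifier, might be sketched as follows. The tissue labels, the three-value expression vectors and the function names are hypothetical and are not taken from the application; the numbers are invented for the example.

```python
import math

def centroids(training):
    """Average the expression vectors of each tissue class."""
    result = {}
    for label, vectors in training.items():
        n = len(vectors)
        # zip(*vectors) iterates over one marker (column) at a time.
        result[label] = [sum(column) / n for column in zip(*vectors)]
    return result

def classify(sample, class_centroids):
    """Return the label whose centroid is closest (Euclidean) to the sample."""
    return min(
        class_centroids,
        key=lambda label: math.dist(sample, class_centroids[label]),
    )

# Hypothetical log2 expression levels for three microRNAs per sample.
training = {
    "tissue_A": [[10.0, 2.0, 5.0], [11.0, 3.0, 5.0]],
    "tissue_B": [[3.0, 9.0, 6.0], [2.0, 10.0, 6.0]],
}
model = centroids(training)
predicted = classify([9.5, 2.5, 5.5], model)
```

The sketch uses plain Euclidean distance on unweighted markers; a real assay would need a validated distance measure and training set.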
98. The method of claim 96, wherein the cancer is selected from the
group consisting of adrenocortical carcinoma; anus or skin squamous
cell carcinoma; biliary tract adenocarcinoma; Ewing sarcoma;
gastrointestinal stromal tumor (GIST); gastrointestinal tract
carcinoid; renal cell carcinoma: chromophobe, clear cell and
papillary; pancreatic islet cell tumor; pheochromocytoma;
urothelial cell carcinoma (TCC); lung, head & neck, or
esophagus squamous cell carcinoma (SCC); brain: astrocytic tumor,
oligodendroglioma; breast adenocarcinoma; uterine cervix squamous
cell carcinoma; chondrosarcoma; germ cell cancer; sarcoma;
colorectal adenocarcinoma; liposarcoma; hepatocellular carcinoma
(HCC); lung large cell or adenocarcinoma; lung carcinoid; pleural
mesothelioma; lung small cell carcinoma; B-cell lymphoma; T-cell
lymphoma; melanoma; malignant fibrous histiocytoma (MFH) or
fibrosarcoma; osteosarcoma; ovarian primitive germ cell tumor;
ovarian carcinoma; pancreatic adenocarcinoma; prostate
adenocarcinoma; rhabdomyosarcoma; gastric or esophageal
adenocarcinoma; synovial sarcoma; non-seminomatous testicular germ
cell tumor; seminomatous testicular germ cell tumor; thymoma;
thymic carcinoma; follicular thyroid carcinoma; medullary thyroid
carcinoma; and papillary thyroid carcinoma.
99. The method of claim 98, wherein a level of SEQ ID NOS: 55 above
the reference threshold indicates a cancer of germ cell origin
selected from the group consisting of an ovarian primitive cell and
a testis cell, and further wherein a level of SEQ ID NOS: 29 and 62
above the reference threshold indicates a testis cell cancer origin
selected from the group consisting of seminomatous testicular germ
cell and non-seminomatous testicular germ cell.
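The two threshold comparisons recited in claim 99 can be pictured as successive decision nodes. The sketch below is a minimal illustration under stated assumptions: the threshold values, the sample levels and the function names are hypothetical, and requiring every listed marker to exceed its threshold is one possible reading of the claim, not a rule stated in the application.

```python
def above(levels, seq_ids, thresholds):
    """True when every listed marker exceeds its reference threshold (assumed rule)."""
    return all(levels[s] > thresholds[s] for s in seq_ids)

def germ_cell_branch(levels, thresholds):
    """Walk the two nodes described in claim 99, keyed by SEQ ID NO."""
    if not above(levels, [55], thresholds):
        return "not classified as germ cell by this node"
    if above(levels, [29, 62], thresholds):
        return "testis germ cell (seminomatous or non-seminomatous)"
    return "germ cell (ovarian primitive or testis)"

# Hypothetical normalized levels and reference thresholds.
thresholds = {55: 8.0, 29: 6.0, 62: 6.0}
sample = {55: 9.1, 29: 7.2, 62: 6.4}
result = germ_cell_branch(sample, thresholds)
```

The function reports only the groupings the claim names; it does not attempt to separate seminomatous from non-seminomatous tumors, because the claim does not state which marker direction corresponds to which subtype.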
100. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6, 9
and 29 above the reference threshold indicates a cancer origin
selected from the group consisting of biliary tract adenocarcinoma
and hepatocellular carcinoma.
101. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 16, 156, 66 and 68 above the reference threshold indicates
a cancer of brain origin, and further wherein a level of SEQ ID
NOS: 40 and 60 above the reference threshold indicates a brain
cancer origin selected from the group consisting of
oligodendroglioma and astrocytoma.
102. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 56, 65, 25, 27, 35, 14 and 21 above the reference threshold
indicates a cancer of prostate adenocarcinoma origin; wherein a
level of a nucleic acid sequence selected from the group consisting
of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 27, 35, 14, 21, 32, 51, 7,
25, 50, 11, 148, 4, 49 and 67 above the reference threshold
indicates a cancer of breast adenocarcinoma origin; wherein a level
of a nucleic acid sequence selected from the group consisting of
SEQ ID NOS: 55, 6, 30, 46, 56, 65, 27, 35, 14, 21, 32, 51, 7, 25,
4, 39, 50, 11, 148, 49, 67, 57 and 34 above the reference threshold
indicates a cancer of an ovarian carcinoma origin; wherein a level
of a nucleic acid sequence selected from the group consisting of
SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25, 27, 35, 14, 21, 32, 51, 7,
11, 148, 4, 49, 67, 57 and 34 above the reference threshold
indicates a cancer of lung large cell or lung adenocarcinoma
origin; and wherein a level of a nucleic acid sequence selected
from the group consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25,
20 and 45 above the reference threshold indicates a cancer of lung
small cell carcinoma origin.
103. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 56, 65, 25, 27, 35, 14, 21, 32, 51, 7, 11, 148 and 4 above
the reference threshold indicates a cancer of thyroid carcinoma
origin, and further wherein a level of SEQ ID NOS: 17 and 34 above
the threshold indicates that the thyroid carcinoma origin is
follicular or papillary.
104. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 56, 65, 25, 27, 35, 14, 21, 32, 51, 7, 50, 4, 39, 3 and 34
above the reference threshold indicates a cancer of a thymic
carcinoma origin; or wherein a level of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
56, 65, 25, 27, 35, 14, 21, 32, 51, 7, 50, 4, 39, 3, 34, 69, 24 and
44 above the reference threshold indicates a cancer of urothelial
cell carcinoma or squamous cell carcinoma origin, and further
wherein a level of SEQ ID NOS: 1, 5 and 54 above the reference
threshold indicates that the squamous-cell-carcinoma origin is
uterine cervix squamous-cell-carcinoma or non-uterine cervix
squamous cell carcinoma; or further wherein a level of SEQ ID NOS:
11 and 23 above the reference threshold indicates that the
non-uterine cervix squamous cell carcinoma origin is selected from
the group consisting of: a) anus or skin squamous cell carcinoma,
and b) lung, head & neck, and esophagus squamous cell
carcinoma.
105. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 16, 2, 47 and 50 above the reference threshold indicates a
cancer of melanoma or lymphoma origin, and further wherein a level
of SEQ ID NOS: 35 and 48 above the reference threshold indicates
that the lymphoma cancer origin is selected from the group
consisting of B-cell lymphoma and T-cell lymphoma.
106. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 56, 65, 25, 20, 45, 40, 67 and 68 above the reference
threshold indicates a cancer of medullary thyroid carcinoma origin;
wherein a level of a nucleic acid sequence selected from the group
consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25, 20, 45, 40,
67, 68, 64, 53 and 37 above the reference threshold indicates a
cancer of lung carcinoid origin; and wherein a level of a nucleic
acid sequence selected from the group consisting of SEQ ID NOS: 55,
6, 30, 46, 56, 65, 25, 20, 45, 40, 67, 68, 64, 53, 37, 34 and 18
above the reference threshold indicates a cancer of
gastrointestinal tract carcinoid or pancreatic islet cell tumor
origin.
107. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 56, 65, 25, 27, 35, 42, 36 and 146 above the reference
threshold indicates a cancer of gastric or esophageal
adenocarcinoma origin; wherein a level of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
56, 65, 25, 27, 35, 42, 36, 146, 20 and 43 above the reference
threshold indicates a cancer of colorectal adenocarcinoma origin;
wherein a level of a nucleic acid sequence selected from the group
consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25, 27, 35, 42,
36, 146, 20, 43, 51, 49 and 16 above the reference threshold
indicates a cancer of pancreatic adenocarcinoma or biliary tract
adenocarcinoma origin; wherein a level of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
16, 2, 66, 68, 19 and 29 above the reference threshold indicates a
cancer of renal cell carcinoma origin, and further wherein a level
of SEQ ID NOS: 36 and 147 above the reference threshold indicates a
chromophobe renal cell carcinoma origin, or further wherein a level
of SEQ ID NOS: 49 and 9 above the reference threshold indicates
that the renal cell carcinoma origin is clear cell or
papillary.
108. The method of claim 98, wherein a level of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 16, 2, 66, 68, 19, 29, 65 and 56 above the reference
threshold indicates a cancer of pheochromocytoma origin; wherein a
level of a nucleic acid sequence selected from the group consisting
of SEQ ID NOS: 55, 6, 30, 46, 16, 2, 66, 68, 19, 29, 65, 56, 31, 38
and 61 above the reference threshold indicates a cancer of
adrenocortical origin; wherein a level of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
16, 2, 66, 68, 19, 29, 65, 56, 31, 38, 61, 14 and 45 above the
reference threshold indicates a cancer of gastrointestinal stromal
tumor origin; wherein a level of a nucleic acid sequence selected
from the group consisting of SEQ ID NOS: 55, 6, 30, 46, 16, 2, 66,
68, 19, 29, 65, 56, 31, 38, 61, 14, 45, 35, 10 and 5 above the
reference threshold indicates a cancer of pleural mesothelioma or
sarcoma origin, and further wherein a level of SEQ ID NOS: 3, 40
and 15 above the reference threshold indicates that the sarcoma is
synovial sarcoma, or further wherein a level of SEQ ID NOS: 3, 40,
15, 12 and 58 above the reference threshold indicates that the
sarcoma is chondrosarcoma, or further wherein a level of SEQ ID
NOS: 3, 40, 15, 12, 58, 36 and 26 above the reference threshold
indicates that the sarcoma is liposarcoma, or further wherein a
level of SEQ ID NOS: 3, 40, 15, 12, 58, 36, 26, 21, 25 and 49 above
the reference threshold indicates that the sarcoma is Ewing sarcoma
or osteosarcoma; or further wherein a level of SEQ ID NOS: 3, 40,
15, 12, 58, 36, 26, 21, 59, 39 and 33 above the reference threshold
indicates that the sarcoma is selected from the group consisting
of: a) rhabdomyosarcoma, and b) malignant fibrous histiocytoma and
fibrosarcoma.
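The sarcoma marker lists in claim 108 grow step by step, each list repeating the previous one and appending further SEQ ID NOS. The short sketch below copies the first four lists from the claim and prints which markers are new at each step. The function name and the interpretation of each step as a successive node are assumptions made for illustration.

```python
# Marker lists (SEQ ID NOS) copied from the sarcoma portion of claim 108.
sarcoma_sets = [
    ("synovial sarcoma", [3, 40, 15]),
    ("chondrosarcoma", [3, 40, 15, 12, 58]),
    ("liposarcoma", [3, 40, 15, 12, 58, 36, 26]),
    ("Ewing sarcoma or osteosarcoma", [3, 40, 15, 12, 58, 36, 26, 21, 25, 49]),
]

def added_markers(sets):
    """For each list, return the markers that were not in the preceding list."""
    previous = []
    steps = []
    for name, ids in sets:
        new = [i for i in ids if i not in previous]
        steps.append((name, new))
        previous = ids
    return steps

steps = added_markers(sarcoma_sets)
for name, new in steps:
    print(name, "adds SEQ ID NOS:", new)
```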
109. The method of claim 96, wherein the biological sample is
selected from the group consisting of a bodily fluid, a cell line,
a tissue sample, a biopsy sample, a needle biopsy sample, a fine
needle biopsy (FNA) sample, a surgically removed sample, and a
sample obtained by tissue-sampling procedures such as endoscopy,
bronchoscopy, or laparoscopic methods.
110. The method of claim 109, wherein the tissue is a fresh,
frozen, fixed, wax-embedded or formalin-fixed paraffin-embedded
(FFPE) tissue.
111. The method of claim 96, wherein the level of the nucleic acid
sequence is determined by a method selected from the group
consisting of nucleic acid hybridization and nucleic acid
amplification.
112. The method of claim 113, wherein nucleic acid hybridization is
performed using a solid-phase nucleic acid biochip array or in situ
hybridization and wherein nucleic acid amplification is real-time
PCR comprising forward and reverse primers and a probe comprising a
sequence selected from the group consisting of a sequence that is
complementary to a sequence selected from SEQ ID NOS: 1, 2 or 156,
3-7, 9-12, 14-21, 23-27, 29-40, 42, 43, 44 or 191, 45-51, 53-56, 57
or 202, 58, 59, 60 or 208, 61, 62 or 211, 64-69, 146-148, and
optionally at least one control nucleic acid and a fragment
thereof.
113. A kit for performing the method of claim 96 comprising probes,
wherein the probes comprise (i) DNA equivalents of nucleic acids
comprising SEQ ID NOS: 1-7, 9-12, 14-21, 23-27, 29-40, 42-51,
53-57, 59-62, 64-69, 146-148, and 156, (ii) the complements
thereof, or (iii) sequences at least 90% identical to (i) or (ii).
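The three probe categories in claim 113 (DNA equivalents, complements, and sequences at least 90% identical) correspond to simple sequence operations. The following sketch is illustrative only: the 10-nucleotide sequence is invented and is not one of the SEQ ID NOS, the function names are hypothetical, the complement is taken as the reverse complement, and identity is computed position by position on equal-length, ungapped sequences, which is a simplification of alignment-based identity.

```python
def dna_equivalent(rna):
    """Replace uracil with thymine to get the DNA equivalent of an RNA sequence."""
    return rna.upper().replace("U", "T")

def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(dna.upper()))

def percent_identity(a, b):
    """Percent of matching positions between two equal-length, ungapped sequences."""
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length sequences")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Invented toy sequence, used only to exercise the functions.
rna = "ACGUACGUAC"
dna = dna_equivalent(rna)
probe = reverse_complement(dna)
identity = percent_identity(dna, "ACGTACGTAA")
meets_90 = identity >= 90.0
```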
Description
FIELD OF THE INVENTION
[0001] The present invention relates to methods and materials for
classification of cancers and the identification of their tissue of
origin. Specifically, the invention relates to microRNA molecules
associated with specific cancers, as well as various nucleic acid
molecules relating thereto or derived therefrom.
BACKGROUND OF THE INVENTION
[0002] microRNAs (miRs, miRNAs) are a novel class of non-coding,
regulatory RNA genes.sup.1-3 which are involved in
oncogenesis.sup.4 and show remarkable tissue-specificity.sup.5-7.
They have emerged as highly tissue-specific biomarkers.sup.2,5,6
postulated to play important roles in encoding developmental
decisions of differentiation. Various studies have tied microRNAs
to the development of specific malignancies.sup.4. MicroRNAs are
also stable in tissue, stored frozen or as formalin-fixed,
paraffin-embedded (FFPE) samples, and in serum.
[0003] Hundreds of thousands of patients in the U.S. are diagnosed
each year with a cancer that has already metastasized, without a
clearly identified primary site. Oncologists and pathologists are
constantly faced with a diagnostic dilemma when trying to identify
the primary origin of a patient's metastasis. As metastases need to
be treated according to their primary origin, accurate
identification of the metastases' primary origin can be critical
for determining appropriate treatment.
[0004] Once a metastatic tumor is found, the patient may undergo a
wide range of costly, time-consuming, and at times inefficient
tests to identify the primary origin of the metastasis, including
physical examination of the patient, histopathology analysis of the
biopsy, and imaging methods such as chest X-ray, CT and PET scans.
[0005] Metastatic cancer of unknown primary (CUP) accounts for 3-5%
of all new cancer cases, and as a group is usually a very
aggressive disease with a poor prognosis.sup.10. The concept of CUP
comes from the limitation of present methods to identify cancer
origin, despite an often complicated and costly process which can
significantly delay proper treatment of such patients. Recent
studies revealed a high degree of variation in clinical management,
in the absence of evidence based treatment for CUP.sup.11. Many
protocols were evaluated.sup.12 but have shown relatively small
benefit.sup.13. Determining tumor tissue of origin is thus an
important clinical application of molecular diagnostics.sup.9.
[0006] Molecular classification studies for tumor tissue
origin.sup.14-17 have generally used classification algorithms that
did not utilize domain-specific knowledge: tissues were treated as
a-priori equivalents, ignoring underlying similarities between
tissue types with a common developmental origin in embryogenesis.
An exception of note is the study by Shedden and co-workers.sup.18,
which was based on a pathology classification tree. These studies
used machine-learning methods that average effects of biological
features (e.g., mRNA expression levels), an approach which is more
amenable to automated processing but does not use or generate
mechanistic insights.
[0007] Various markers have been proposed to indicate specific
types of cancers and tumor tissue of origin. However, the
diagnostic accuracy of tumor markers has not yet been defined.
There is thus a need for a more efficient and effective method for
diagnosing and classifying specific types of cancers.
SUMMARY OF THE INVENTION
[0008] The present invention provides specific nucleic acid
sequences for use in the identification, classification and
diagnosis of specific cancers and tumor tissue of origin. The
nucleic acid sequences can also be used as prognostic markers for
prognostic evaluation and determination of appropriate treatment of
a subject based on the abundance of the nucleic acid sequences in a
biological sample. The present invention provides a method for
accurate identification of tumor tissue origin.
[0009] The invention is based in part on the development of a
microRNA-based classifier for tumor classification. microRNA
expression levels were measured in 1300 primary and metastatic
tumor paraffin-embedded samples. microRNAs were profiled using a
custom array platform. Using the custom array platform, a set of
over 300 microRNAs was identified for the normalization of the
array data and 65 microRNAs were used for the accurate
classification of over 40 different tumor types. The accuracy of
the assay exceeds 85%.
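Paragraph [0009] mentions that a set of microRNAs was used to normalize the array data but, in this excerpt, does not describe the procedure. The sketch below shows one common approach, subtracting the median log2 signal of the normalizer probes from every probe in a sample. This is an assumption chosen for illustration and is not presented as the method used in the application; the probe names and values are hypothetical.

```python
from statistics import median

def normalize(sample, normalizer_ids):
    """Subtract the median log2 signal of the normalizer probes from every probe."""
    offset = median(sample[p] for p in normalizer_ids)
    return {probe: value - offset for probe, value in sample.items()}

# Hypothetical log2 signals for three normalizer probes and one marker probe.
sample = {"norm_1": 8.0, "norm_2": 10.0, "norm_3": 9.0, "marker_x": 12.0}
normalized = normalize(sample, ["norm_1", "norm_2", "norm_3"])
```

After this step the marker value is expressed relative to the sample's own normalizer median, which makes samples with different overall signal intensity more comparable.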
[0010] The findings demonstrate the utility of microRNAs as novel
biomarkers for the tissue of origin of a metastatic tumor. The
classifier has wide biological as well as diagnostic
applications.
[0011] According to a first aspect, the present invention provides
a method of identifying a tissue of origin of a cancer, the method
comprising obtaining a biological sample from a subject, measuring
the relative abundance in said sample of nucleic acid sequences
selected from the group consisting of SEQ ID NOS: 1-390, any
combinations thereof, or a sequence having at least about 80%
identity thereto; and comparing the measurement to a reference
abundance of the nucleic acid by using a classifier algorithm,
wherein the relative abundance of said nucleic acid sequences
allows for the identification of the tissue of origin of said
sample.
[0012] According to one aspect, the classifier algorithm is
selected from the group consisting of decision tree classifier,
K-nearest neighbor classifier (KNN), logistic regression
classifier, nearest neighbor classifier, neural network classifier,
Gaussian mixture model (GMM), Support Vector Machine (SVM)
classifier, nearest centroid classifier, linear regression
classifier and random forest classifier. According to one aspect,
the sample is obtained from a subject with cancer of unknown
primary (CUP), with a primary cancer or with a metastatic
cancer.
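For illustration, the K-nearest neighbor (KNN) classifier named in paragraph [0012] might look like the following minimal sketch. The two-marker expression vectors, the tissue labels and the choice of k = 3 are hypothetical and are not drawn from the application.

```python
import math
from collections import Counter

def knn_predict(sample, training, k=3):
    """Majority vote among the k training samples closest to `sample`."""
    ranked = sorted(training, key=lambda item: math.dist(sample, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (expression vector, tissue label) training pairs.
training = [
    ([10.0, 2.0], "tissue_A"),
    ([11.0, 3.0], "tissue_A"),
    ([9.0, 2.5], "tissue_A"),
    ([3.0, 9.0], "tissue_B"),
    ([2.0, 10.0], "tissue_B"),
]
label = knn_predict([9.5, 3.0], training)
```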
[0013] According to certain embodiments, the cancer is selected
from the group consisting of adrenocortical carcinoma; anus or skin
squamous cell carcinoma; biliary tract adenocarcinoma; Ewing
sarcoma; gastrointestinal stromal tumor (GIST); gastrointestinal
tract carcinoid; renal cell carcinoma: chromophobe, clear cell and
papillary; pancreatic islet cell tumor; pheochromocytoma;
urothelial cell carcinoma (TCC); lung, head & neck, or
esophagus squamous cell carcinoma (SCC); brain: astrocytic tumor,
oligodendroglioma; breast adenocarcinoma; uterine cervix squamous
cell carcinoma; chondrosarcoma; germ cell cancer; sarcoma;
colorectal adenocarcinoma; liposarcoma; hepatocellular carcinoma
(HCC); lung large cell or adenocarcinoma; lung carcinoid; pleural
mesothelioma; lung small cell carcinoma; B-cell lymphoma; T-cell
lymphoma; melanoma; malignant fibrous histiocytoma (MFH) or
fibrosarcoma; osteosarcoma; ovarian primitive germ cell tumor;
ovarian carcinoma; pancreatic adenocarcinoma; prostate
adenocarcinoma; rhabdomyosarcoma; gastric or esophageal
adenocarcinoma; synovial sarcoma; non-seminomatous testicular germ
cell tumor; seminomatous testicular germ cell tumor; thymoma/thymic
carcinoma; follicular thyroid carcinoma; medullary thyroid
carcinoma; and papillary thyroid carcinoma.
[0014] The invention further provides a method for identifying a
cancer of germ cell origin, comprising measuring the relative
abundance of SEQ ID NO: 55 or a sequence having at least about 80%
identity thereto in said sample; wherein the abundance of said
nucleic acid sequence is indicative of a cancer of germ cell
origin. According to some embodiments the germ cell is selected
from the group consisting of an ovarian primitive cell and a testis
cell. According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 29, 62 or a sequence having at
least about 80% identity thereto, and the abundance of said nucleic
acid sequence is indicative of a testis cell cancer origin selected
from the group consisting of seminomatous testicular germ cell and
non-seminomatous testicular germ cell.
[0015] The invention further provides a method for identifying a
cancer origin selected from the group consisting of biliary tract
adenocarcinoma and hepatocellular carcinoma, comprising measuring
the relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 9, 29 or a sequence having
at least about 80% identity thereto in said sample; wherein the
abundance of said nucleic acid sequence is indicative of a cancer
origin selected from the group consisting of biliary tract
adenocarcinoma and hepatocellular carcinoma.
[0016] The invention further provides a method for identifying a
cancer of brain origin, the method comprising measuring the
relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 16, 156, 66, 68 or a
sequence having at least about 80% identity thereto in said sample;
wherein the abundance of said nucleic acid sequence is indicative
of a cancer of brain origin.
[0017] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 40, 60 or a sequence having at
least about 80% identity thereto, and wherein the abundance of said
nucleic acid sequence is indicative of a brain cancer origin
selected from the group consisting of oligodendroglioma and
astrocytoma.
[0018] The invention further provides a method for identifying a
cancer of prostate adenocarcinoma origin, the method comprising
measuring the relative abundance of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
56, 65, 25, 27, 35, 14, 21 or a sequence having at least about 80%
identity thereto in said sample; wherein the abundance of said
nucleic acid sequence is indicative of a cancer of prostate
adenocarcinoma origin.
[0019] The invention further provides a method for identifying a
cancer of breast adenocarcinoma origin, the method comprising
measuring the relative abundance of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
56, 65, 27, 35, 14, 21, 32, 51, 7, 25, 50, 11, 148, 4, 49, 67 or a
sequence having at least about 80% identity thereto in said sample;
wherein the abundance of said nucleic acid sequence is indicative
of a cancer of breast adenocarcinoma origin.
[0020] The invention further provides a method for identifying a
cancer of ovarian carcinoma origin, the method comprising measuring
the relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 27, 35, 14,
21, 32, 51, 7, 25, 4, 39, 50, 11, 148, 49, 67, 57, 34 or a sequence
having at least about 80% identity thereto in said sample; wherein
the abundance of said nucleic acid sequence is indicative of a
cancer of an ovarian carcinoma origin.
[0021] The invention further provides a method for identifying a
cancer of thyroid carcinoma origin, the method comprising measuring
the relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25, 27, 35,
14, 21, 32, 51, 7, 11, 148, 4 or a sequence having at least about
80% identity thereto in said sample; wherein the abundance of said
nucleic acid sequence is indicative of a cancer of thyroid
carcinoma origin.
[0022] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 17, 34 or a sequence having at
least about 80% identity thereto, and wherein said thyroid
carcinoma origin is selected from the group consisting of
follicular and papillary.
[0023] The invention further provides a method for identifying a
cancer origin selected from the group consisting of lung large cell
and lung adenocarcinoma, the method comprising measuring the
relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25, 27, 35,
14, 21, 32, 51, 7, 11, 148, 4, 49, 67, 57, 34 or a sequence having
at least about 80% identity thereto in said sample; wherein the
abundance of said nucleic acid sequence is indicative of a cancer
origin selected from the group consisting of lung large cell and
lung adenocarcinoma.
[0025] The invention further provides a method for identifying a
cancer of thymic carcinoma origin, the method comprising measuring
the relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25, 27, 35,
14, 21, 32, 51, 7, 50, 4, 39, 3, 34 or a sequence having at least
about 80% identity thereto in said sample; wherein the abundance of
said nucleic acid sequence is indicative of a cancer of a thymic
carcinoma origin.
[0026] The invention further provides a method for identifying a
cancer origin selected from the group consisting of a urothelial
cell carcinoma and squamous cell carcinoma, the method comprising
measuring the relative abundance of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
56, 65, 25, 27, 35, 14, 21, 32, 51, 7, 50, 4, 39, 3, 34, 69, 24, 44
or a sequence having at least about 80% identity thereto in said
sample; wherein the abundance of said nucleic acid sequence is
indicative of a cancer origin selected from the
group consisting of urothelial cell carcinoma and squamous cell
carcinoma.
[0027] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 1, 5, 54 or a sequence having at
least about 80% identity thereto, and wherein the abundance of said
nucleic acid sequence is indicative of squamous-cell-carcinoma
origin selected from the group consisting of uterine cervix
squamous-cell-carcinoma and non-uterine cervix squamous cell
carcinoma.
[0028] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 11, 23 or a sequence having at
least about 80% identity thereto in said sample, and wherein the
abundance of said nucleic acid sequence is indicative of a
non-uterine cervix squamous cell carcinoma origin selected from the
group consisting of anus or skin squamous cell carcinoma; and lung,
head & neck, and esophagus squamous cell carcinoma.
[0029] The invention further provides a method for identifying a
cancer origin selected from melanoma and lymphoma, the method
comprising measuring the relative abundance of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 16, 2, 47, 50 or a sequence having at least about 80%
identity thereto in said sample; wherein the abundance of said
nucleic acid sequence is indicative of a cancer origin selected
from the group consisting of melanoma and lymphoma.
[0030] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 35, 48 or a sequence having at
least about 80% identity thereto, and wherein the abundance of said
nucleic acid sequence is indicative of a lymphoma cancer origin
selected from the group consisting of B-cell lymphoma and T-cell
lymphoma.
[0031] The invention further provides a method for identifying a
cancer of lung small cell carcinoma origin, the method comprising
measuring the relative abundance of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
56, 65, 25, 20, 45 or a sequence having at least about 80% identity
thereto in said sample; wherein the abundance of said nucleic acid
sequence is indicative of a cancer of lung small cell carcinoma
origin.
[0032] The invention further provides a method for identifying a
cancer of medullary thyroid carcinoma origin, the method comprising
measuring the relative abundance of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
56, 65, 25, 20, 45, 40, 67, 68 or a sequence having at least about
80% identity thereto in said sample; wherein the abundance of said
nucleic acid sequence is indicative of a cancer of medullary
thyroid carcinoma origin.
[0033] The invention further provides a method for identifying a
cancer of lung carcinoid origin, the method comprising measuring
the relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25, 20, 45,
40, 67, 68, 64, 53, 37 or a sequence having at least about 80%
identity thereto in said sample; wherein the abundance of said
nucleic acid sequence is indicative of a cancer of lung carcinoid
origin.
[0034] The invention further provides a method for identifying a
cancer origin selected from the group consisting of
gastrointestinal tract carcinoid and pancreatic islet cell tumor,
the method comprising measuring the relative abundance of a nucleic
acid sequence selected from the group consisting of SEQ ID NOS: 55,
6, 30, 46, 56, 65, 25, 20, 45, 40, 67, 68, 64, 53, 37, 34, 18 or a
sequence having at least about 80% identity thereto in said sample;
wherein the abundance of said nucleic acid sequence is indicative
of a cancer origin selected from the group consisting of
gastrointestinal tract carcinoid and pancreatic islet cell
tumor.
[0035] The invention further provides a method for identifying a
cancer origin selected from the group consisting of gastric and
esophageal adenocarcinoma, the method comprising measuring the
relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 56, 65, 25, 27, 35,
42, 36, 146 or a sequence having at least about 80% identity
thereto in said sample; wherein the abundance of said nucleic acid
sequence is indicative of a cancer origin selected from the group
consisting of gastric and esophageal adenocarcinoma.
[0036] The invention further provides a method for identifying a
cancer of colorectal adenocarcinoma origin, the method comprising
measuring the relative abundance of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
56, 65, 25, 27, 35, 42, 36, 146, 20, 43 or a sequence having at
least about 80% identity thereto in said sample; wherein the
abundance of said nucleic acid sequence is indicative of a cancer
of colorectal adenocarcinoma origin.
[0037] The invention further provides a method for identifying a
cancer origin selected from the group consisting of pancreatic
adenocarcinoma and biliary tract adenocarcinoma, the method
comprising measuring the relative abundance of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 56, 65, 25, 27, 35, 42, 36, 146, 20, 43, 51, 49, 16 or a
sequence having at least about 80% identity thereto, and wherein
the abundance of said nucleic acid sequence is indicative of a
cancer origin selected from the group consisting of pancreatic
adenocarcinoma and biliary tract adenocarcinoma.
[0038] The invention further provides a method for identifying a
cancer of renal cell carcinoma origin, the method comprising
measuring the relative abundance of a nucleic acid sequence
selected from the group consisting of SEQ ID NOS: 55, 6, 30, 46,
16, 2, 66, 68, 19, 29 or a sequence having at least about 80%
identity thereto in said sample; wherein the abundance of said
nucleic acid sequence is indicative of a cancer of renal cell
carcinoma origin.
[0039] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 36, 147 or a sequence having at
least about 80% identity thereto, and wherein the abundance of said
nucleic acid sequence is indicative of a chromophobe renal cell
carcinoma origin.
[0040] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 49, 9 or a sequence having at least
about 80% identity thereto, and wherein the abundance of said
nucleic acid sequence is indicative of a renal cell carcinoma
origin selected from the group consisting of clear cell and
papillary.
[0041] The invention further provides a method for identifying a
cancer of pheochromocytoma origin, the method comprising measuring
the relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 16, 2, 66, 68, 19,
29, 65, 56 or a sequence having at least about 80% identity thereto
in said sample; wherein the abundance of said nucleic acid sequence
is indicative of a cancer of pheochromocytoma origin.
[0042] The invention further provides a method for identifying a
cancer of adrenocortical origin, the method comprising measuring
the relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 16, 2, 66, 68, 19,
29, 65, 56, 31, 38, 61 or a sequence having at least about 80%
identity thereto in said sample; wherein the abundance of said
nucleic acid sequence is indicative of a cancer of adrenocortical
origin.
[0043] The invention further provides a method for identifying a
cancer of gastrointestinal stromal tumor origin, the method
comprising measuring the relative abundance of a nucleic acid
sequence selected from the group consisting of SEQ ID NOS: 55, 6,
30, 46, 16, 2, 66, 68, 19, 29, 65, 56, 31, 38, 61, 14, 45 or a
sequence having at least about 80% identity thereto in said sample;
wherein the abundance of said nucleic acid sequence is indicative
of a cancer of gastrointestinal stromal tumor origin.
[0044] The invention further provides a method for identifying a
cancer origin selected from the group consisting of pleural
mesothelioma and sarcoma, the method comprising measuring the
relative abundance of a nucleic acid sequence selected from the
group consisting of SEQ ID NOS: 55, 6, 30, 46, 16, 2, 66, 68, 19,
29, 65, 56, 31, 38, 61, 14, 45, 35, 10, 5 or a sequence having at
least about 80% identity thereto in said sample; wherein the
abundance of said nucleic acid sequence is indicative of a cancer
origin selected from the group consisting of pleural mesothelioma
and sarcoma.
[0045] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 3, 40, 15 or a sequence having at
least about 80% identity thereto, and wherein said sarcoma is
synovial sarcoma.
[0046] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 3, 40, 15, 12, 58 or a sequence
having at least about 80% identity thereto, and wherein said
sarcoma is chondrosarcoma.
[0047] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 3, 40, 15, 12, 58, 36, 26 or a
sequence having at least about 80% identity thereto, and wherein
said sarcoma is liposarcoma.
[0048] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 3, 40, 15, 12, 58, 36, 26, 21, 25,
49 or a sequence having at least about 80% identity thereto and
wherein said sarcoma is selected from the group consisting of Ewing
sarcoma and osteosarcoma.
[0049] According to some embodiments the group of nucleic acid
further consists of SEQ ID NOS: 3, 40, 15, 12, 58, 36, 26, 21, 59,
39, 33 or a sequence having at least about 80% identity thereto and
wherein said sarcoma is selected from the group consisting of
rhabdomyosarcoma, malignant fibrous histiocytoma and
fibrosarcoma.
[0050] According to another aspect, the present invention provides
a method of distinguishing between cancers of different origins,
said method comprising:
[0051] (a) obtaining a biological sample from a subject;
[0052] (b) measuring the relative abundance in said sample of
nucleic acid sequences selected from the group consisting of SEQ ID
NOS: 1-390 or a sequence having at least about 80% identity
thereto; and
[0053] (c) comparing said measurement to a reference abundance of
said nucleic acid by using a classifier algorithm;
[0054] wherein the relative abundance of said nucleic acid sequence
in said sample allows for distinguishing between cancers of
different origins.
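By way of non-limiting illustration only, step (c) may be sketched as follows. The sketch assumes a nearest-reference rule with Euclidean distance, which is chosen here for brevity and is not stated to be the classifier algorithm of the invention; the function name nearest_reference, the labels "origin A" and "origin B", the names "miR-x" and "miR-y" and all abundance values are hypothetical.

```python
import math


def nearest_reference(sample, references):
    """Return the label of the reference profile closest to `sample`.

    `sample` maps a microRNA name to a normalized abundance and
    `references` maps an origin label to such a profile. Euclidean
    distance is an assumption made for this sketch.
    """
    def distance(profile):
        return math.sqrt(sum((sample[k] - profile[k]) ** 2 for k in sample))

    return min(references, key=lambda label: distance(references[label]))


# Hypothetical reference abundances (arbitrary units), not measured data.
references = {
    "origin A": {"miR-x": 10.0, "miR-y": 2.0},
    "origin B": {"miR-x": 3.0, "miR-y": 9.0},
}
sample = {"miR-x": 9.0, "miR-y": 3.0}
predicted = nearest_reference(sample, references)
```

In this toy case the sample lies closer to the "origin A" reference profile, so that label is returned.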
[0055] According to some embodiments the measurement of the
relative abundance of SEQ ID NOS: 372, 233, 55, 200, 201 or a
sequence having at least about 80% identity thereto in said sample
allows for distinguishing between a cancer originating from a
germ-cell tumor and a cancer originating from the group consisting
of non-germ-cell tumors.
[0056] According to some embodiments the measurement of the
relative abundance of SEQ ID NOS: 6, 30, 13 or a sequence having at
least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from hepatobiliary
tumors and a cancer originating from the group consisting of
non-germ-cell non-hepatobiliary tumors.
[0057] According to some embodiments the measurement of the
relative abundance of SEQ ID NOS: 28, 29, 231, 9 or a sequence
having at least about 80% identity thereto in said sample allows
for distinguishing between a cancer originating from liver tumors
and a cancer originating from biliary-tract carcinomas.
[0058] According to some embodiments the measurement of the
relative abundance of SEQ ID NOS: 46, 5, 12, 30, 29, 28, 32, 13,
152, 49 or a sequence having at least about 80% identity thereto in
said sample allows for distinguishing between a cancer originating
from the group consisting of tumors from an epithelial origin and a
cancer originating from the group consisting of tumors from a
non-epithelial origin.
[0059] According to some embodiments the measurement of the
relative abundance of SEQ ID NOS: 164, 168, 170, 16, 198, 50, 176,
186, 11, 158, 20, 155, 231, 4, 8, 46, 3, 2, 7 or a sequence having
at least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from the group
consisting of melanoma and lymphoma and a cancer originating from
the group consisting of all other non-epithelial tumors.
[0060] According to some embodiments the measurement of the
relative abundance of SEQ ID NOS: 159, 66, 225, 187, 162, 161, 68,
232, 173, 11, 8, 174, 155, 231, 4, 182, 181, 37 or a sequence
having at least about 80% identity thereto in said sample allows
for distinguishing between a cancer originating from brain tumors
and a cancer originating from the group consisting of all
non-brain, non-epithelial tumors.
[0061] According to some embodiments the measurement of the
relative abundance of SEQ ID NOS: 40, 208, 60, 153, 230, 228, 147,
34, 206, 35, 52, 25, 229, 161, 187, 179 or a sequence having at
least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from astrocytoma and a
cancer originating from oligodendroglioma.
[0062] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 56, 65, 25, 175, 152, 155, 32, 49, 35,
181, or a sequence having at least about 80% identity thereto in
said sample allows for distinguishing between a cancer originating
from the group consisting of neuroendocrine tumors and a cancer
originating from the group consisting of all non-neuroendocrine,
epithelial tumors.
[0063] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 27, 177, 4, 32, 35 or a sequence having at
least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from the group
consisting of gastrointestinal epithelial tumors and a cancer
originating from the group consisting of non-gastrointestinal
epithelial tumors.
[0064] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 56, 199, 14, 15, 165, 231, 36, 154, 21, 49
or a sequence having at least about 80% identity thereto in said
sample allows for distinguishing between a cancer originating from
prostate tumors and a cancer originating from the group consisting
of all other non-gastrointestinal epithelial tumors.
[0065] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 222, 62, 29, 28, 211, 214, 227, 215, 218,
152, 216, 212, 224, 13, 194, 192, 221, 217, 205, 219, 32, 193, 223,
220, 210, 209, 213, 163, 30 or a sequence having at least about 80%
identity thereto in said sample allows for distinguishing between a
cancer originating from seminoma and a cancer originating from the
group consisting of non-seminoma testis-tumors.
[0066] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 42, 32, 36, 178, 243, 242, 49, 240, 57,
11, 46, 17, 47, 51, 7, 8, 154, 190, 157, 196, 197, or a sequence
having at least about 80% identity thereto in said sample allows
for distinguishing between a cancer originating from the group
consisting of squamous cell carcinoma, transitional cell carcinoma
and thymoma, and a cancer originating from the group consisting of
non gastrointestinal adenocarcinoma tumors.
[0067] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 56, 46, 25, 152, 50, 45, 191, 181, 179,
49, 32, 42, 184, 40, 147, 236, 57, 203, 36, or a sequence having at
least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from breast
adenocarcinoma, and a cancer originating from the group consisting
of squamous cell carcinoma, transitional cell carcinoma, thymomas
and ovarian carcinoma.
[0068] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 253, 32, 4, 39, 10, 46, 5, 226, 2, 195,
32, 185, 11, 168, 184, 16, 242, 12, 237, 243, 250, 49, 246, 167 or
a sequence having at least about 80% identity thereto in said
sample allows for distinguishing between a cancer originating from
ovarian carcinoma, and a cancer originating from the group
consisting of squamous cell carcinoma, transitional cell carcinoma
and thymomas.
[0069] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 11, 147, 17, 157, 40, 8, 49, 9, 191, 205,
207, 195, 51, 46, 45, 52, 234, 231, 21, 169, 43, 3, 196, 154, 390,
171, 255, 197, 190, 189, 39, 7, 48, 47, 32, 36, 4, 178, 37, 181,
25, 183, 182, 35, 240, 57, 242, 204, 236, 176, 158, 148, 206, 50,
20, 34, 186, 239, 251, 244, 24, 188, 172, 238 or a sequence having
at least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from thyroid carcinoma,
and a cancer originating from the group consisting of breast
adenocarcinoma, lung large cell carcinoma, lung adenocarcinoma and
ovarian carcinoma.
[0070] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 249, 180, 65, 235, 241, 248, 254, 247,
160, 243, 245, 252, 17, 49, 166, 225, 168, 34 or a sequence having
at least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from follicular thyroid
carcinoma and a cancer originating from papillary thyroid
carcinoma.
[0071] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 32, 56, 50, 45, 25, 253, 152, 9, 46, 191,
178, 49, 40, 10, 147, 4, 36, 228, 236, 230, 189, 240, 67, 202, 17
or a sequence having at least about 80% identity thereto in said
sample allows for distinguishing between a cancer originating from
breast adenocarcinoma and a cancer originating from the group
consisting of lung adenocarcinoma and ovarian carcinoma.
[0072] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 56, 11, 168, 16, 237, 21, 52, 12, 154,
279, 9, 39, 47, 23, 50, 167, 383, 34, 35, 388, 5, 359, 245, 254,
10, 240, 236, 202, 4, 25, 203, 231, 20, 158, 186, 258, 244, 172, 2,
235, 256, 28, 277, 296, 374, 153, 181 or a sequence having at least
about 80% identity thereto in said sample allows for distinguishing
between a cancer originating from lung adenocarcinoma and a cancer
originating from ovarian carcinoma.
[0073] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 161, 164, 22, 53, 285, 3, 152, 191, 154,
21, 206, 174, 19, 45, 171, 179, 8, 296, 284, 18, 51, 258, 49, 184,
35, 34, 37, 42, 228, 15, 14, 242, 230, 253, 36, 182, 293, 292, 4,
294, 297, 354, 377, 189, 30, 386, 249, 5, 274 or a sequence having
at least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from thymic carcinoma
and a cancer originating from the group consisting of transitional
cell carcinoma and squamous cell carcinoma.
[0074] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 69, 28, 280, 13, 191, 152, 29, 175, 30,
204, 4, 24, 5, 329, 273, 170, 184, 26, 231, 368, 37, 16, 169, 155,
35, 40, 17 or a sequence having at least about 80% identity thereto
in said sample allows for distinguishing between a cancer
originating from transitional cell carcinoma and a cancer
originating from the group consisting of squamous cell
carcinoma.
[0075] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 164, 5, 231, 54, 1, 242, 372, 249, 167,
254, 354, 381, 380, 245, 358, 364, 240, 11, 378 or a sequence
having at least about 80% identity thereto in said sample allows
for distinguishing between squamous cell carcinoma cancers
originating from the uterine cervix, and squamous cell carcinoma
cancers originating from the group consisting of anus and skin,
lung, head & neck and esophagus.
[0076] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 305, 184, 41, 183, 49, 382, 235, 291, 181,
5, 296, 289, 206, 338, 334, 25, 11, 19, 198, 23 or a sequence
having at least about 80% identity thereto in said sample allows
for distinguishing between squamous cell carcinoma cancers
originating from the group consisting of anus and skin, and
squamous cell carcinoma cancers originating from the group
consisting of lung, head & neck and esophagus.
[0077] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 4, 11, 46, 8, 274, 169, 36, 47, 363, 231,
303, 349, 10, 7, 3, 16, 164, 170, 168, 198, 50, 245, 365, 45, 382,
259, 296, 364, 314, 12 or a sequence having at least about 80%
identity thereto in said sample allows for distinguishing between a
cancer originating from melanoma and a cancer originating from
lymphoma.
[0078] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 11, 191, 48, 35, 228 or a sequence having
at least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from B-cell lymphoma
and a cancer originating from T-cell lymphoma.
[0079] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 158, 20, 176, 186, 148, 36, 51, 172, 260,
265, 67, 188, 277, 284, 302, 68, 168, 242, 204, 162, 177, 27, 65,
263, 155, 191, 190, 45, 59, 43, 56, 266, 14, 15, 8, 7, 39, 189,
249, 231, 293, 2 or a sequence having at least about 80% identity
thereto in said sample allows for distinguishing between a cancer
originating from lung small cell carcinoma and a cancer originating
from the group consisting of lung carcinoid, medullary thyroid
carcinoma, gastrointestinal tract carcinoid and pancreatic islet
cell tumor.
[0080] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 159, 40, 147, 11, 311, 4, 8, 231, 301,
297, 68, 67, 265, 36 or a sequence having at least about 80%
identity thereto in said sample allows for distinguishing between a
cancer originating from medullary thyroid carcinoma and a cancer
originating from other neuroendocrine tumors selected from the
group consisting of lung carcinoid, gastrointestinal tract
carcinoid and pancreatic islet cell tumor.
[0081] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 331, 162, 59, 326, 306, 350, 317, 155,
325, 318, 339, 264, 332, 262, 336, 324, 322, 330, 321, 263, 309,
53, 320, 275, 352, 312, 355, 367, 269, 64, 308, 175, 190, 54, 302,
152, 301, 266, 47, 313, 359, 65, 307, 191, 242, 4, 147, 40, 372,
168, 16, 182, 167, 356, 148, 382, 37, 364, 35 or a sequence having
at least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from lung carcinoid
tumors, and a cancer originating from gastrointestinal
neuroendocrine tumors selected from the group consisting of
gastrointestinal tract carcinoid and pancreatic islet cell
tumor.
[0082] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 263, 288, 18, 286, 162, 225, 287, 206,
205, 296, 258, 313, 377, 373, 256, 153, 259, 265, 303, 268, 267,
165, 15, 272, 14, 202, 236, 203, 4, 168, 310, 298, 27, 29, 34, 228,
3, 349, 35, 26 or a sequence having at least about 80% identity
thereto in said sample allows for distinguishing between a cancer
originating from pancreatic islet cell tumors and a
gastrointestinal neuroendocrine carcinoid cancer originating from
the group consisting of small intestine and duodenum; appendix,
stomach and pancreas.
[0083] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 36, 267, 268, 165, 15, 14, 356, 167, 372,
272, 370, 42, 41, 146 or a sequence having at least about 80%
identity thereto in said sample allows for distinguishing between
adenocarcinoma tumors of the gastrointestinal system originating
from:
[0084] the group consisting of gastric and esophageal
adenocarcinoma, and
[0085] the group consisting of cholangiocarcinoma or adenocarcinoma
of the extrahepatic biliary tract, pancreatic adenocarcinoma and
colorectal adenocarcinoma.
[0086] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 42, 184, 67, 158, 20, 186, 284, 389, 203,
240, 236, 146, 204, 43, 176, 202, 49, 46, 38, 363 or a sequence
having at least about 80% identity thereto in said sample allows
for distinguishing between a cancer originating from colorectal
adenocarcinoma and a cancer originating from the group consisting
of adenocarcinoma of biliary tract or pancreas.
[0087] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 49, 11, 13, 373, 154, 5, 30, 45, 178, 147,
274, 16, 40, 21, 43, 253, 245, 256, 12, 374, 379, 180, 153, 51, 52,
1, 295, 257, 385, 293, 294 or a sequence having at least about 80%
identity thereto in said sample allows for distinguishing between a
cancer originating from pancreatic adenocarcinoma, and a cancer
originating from the group consisting of cholangiocarcinoma or
adenocarcinoma of the extrahepatic biliary tract.
[0088] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 29, 28, 30, 46, 49, 195, 152, 175, 47, 4,
387, 196, 177, 375, 27, 304, 40, 191, 147, 35, 16, 34, 5, 155, 181,
312, 183, 182, 320, 59, 38, 324, 323, 37, 322, 325, 19, 42, 334,
265, 22 or a sequence having at least about 80% identity thereto in
said sample allows for distinguishing between a cancer originating
from:
[0089] renal cell tumors selected from the group consisting of
chromophobe renal cell carcinoma, clear cell renal cell carcinoma
and papillary renal cell carcinoma, and
[0090] the group consisting of sarcomas, adrenal tumors and pleural
mesothelioma.
[0091] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 65, 56, 11, 162, 59, 331, 350, 155, 335,
159, 336, 332, 263, 306, 339, 337, 275, 301, 276, 330, 317, 309,
45, 318, 324, 352, 191, 262, 269, 313, 19, 367, 326, 325, 322, 327,
190, 261, 321, 360, 353, 312, 371, 5, 328, 205, 183, 38, 181, 37,
40, 182, 147, 17, 42, 382, 34, 18, 3 or a sequence having at least
about 80% identity thereto in said sample allows for distinguishing
between a cancer originating from pheochromocytoma, and a cancer
originating from the group consisting of all sarcoma, adrenal
carcinoma and mesothelioma tumors.
[0092] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 61, 333, 31, 347, 346, 344, 345, 387, 334,
351, 324, 326, 269, 155, 320, 322, 59, 318, 325, 245, 254, 331,
275, 180, 355, 370, 323, 312, 178, 249, 183, 181, 38, 182, 37, 3,
25 or a sequence having at least about 80% identity thereto in said
sample allows for distinguishing between a cancer originating from
adrenal carcinoma and a cancer originating from the group
consisting of mesothelioma and sarcoma tumors.
[0093] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 165, 14, 15, 333, 272, 270, 45, 301, 191,
46, 195, 266, 190, 19, 334, 155, 25, 147, 40, 34 or a sequence
having at least about 80% identity thereto in said sample allows
for distinguishing between a cancer originating from a
gastrointestinal stromal tumor and a cancer originating from the
group consisting of mesothelioma and sarcoma tumors.
[0094] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 13, 30, 361, 280, 362, 147, 40, 291, 387,
290, 299, 152, 178, 303, 242, 49, 11, 35, 34, 36, 206, 16, 170,
177, 17 or a sequence having at least about 80% identity thereto in
said sample allows for distinguishing between a cancer originating
from a chromophobe renal cell carcinoma tumor and a cancer
originating from the group consisting of clear cell and papillary
renal cell carcinoma tumors.
[0095] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 344, 382, 9, 338, 29, 49, 28, 195, 46, 4,
11, 254 or a sequence having at least about 80% identity thereto in
said sample allows for distinguishing between a renal carcinoma
cancer originating from a clear cell tumor and a cancer originating
from a papillary tumor.
[0096] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 49, 35, 17, 34, 25, 36, 168, 170, 26, 4,
190, 46, 10, 240, 43, 39, 385, 63, 202, 181, 37, 5, 183, 182, 38,
206, 296, 1 or a sequence having at least about 80% identity
thereto in said sample allows for distinguishing between a cancer
originating from pleural mesothelioma and a cancer originating from
the group consisting of sarcoma tumors.
[0097] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 152, 29, 159, 28, 339, 275, 352, 19, 320,
155, 262, 38, 37, 182, 331, 317, 323, 355, 3, 282, 312, 181, 269,
318, 59, 266, 322, 8, 324, 10, 40, 147, 169, 205, 34, 168, 14, 15,
12, 46, 255, 39, 23, 190, 236, 386, 379, 202 or a sequence having
at least about 80% identity thereto in said sample allows for
distinguishing between a cancer originating from a synovial sarcoma
and a cancer originating from the group consisting of other sarcoma
tumors.
[0098] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 12, 271, 206, 333, 11, 58, 36, 18, 178,
293, 189, 382, 381, 240, 249, 5, 377, 235, 17, 20, 385, 384, 46,
283 or a sequence having at least about 80% identity thereto in
said sample allows for distinguishing between a cancer originating
from chondrosarcoma and a cancer originating from the group
consisting of other non-synovial sarcoma tumors.
[0099] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 295, 205, 25, 26, 231, 183, 42, 254, 168,
64, 14, 178, 15, 39, 36, 154, 265, 174, 384, 67 or a sequence
having at least about 80% identity thereto in said sample allows
for distinguishing between a cancer originating from liposarcoma
and a cancer originating from the group consisting of other non
chondrosarcoma and non synovial sarcoma tumors.
[0100] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 22, 154, 21, 174, 205, 158, 186, 148, 20,
59, 8, 183, 231 or a sequence having at least about 80% identity
thereto in said sample allows for distinguishing between a cancer
originating from:
[0101] the group consisting of Ewing sarcoma and osteosarcoma,
and
[0102] the group consisting of rhabdomyosarcoma, malignant fibrous
histiocytoma and fibrosarcoma.
[0103] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 155, 179, 43, 208, 278, 17, 385, 174, 5,
52, 257, 366, 48, 49, 12, 25, 169, 34, 35, 23, 384, 189, 377, 265,
294, 293, 292 or a sequence having at least about 80% identity
thereto in said sample allows for distinguishing between a cancer
originating from Ewing sarcoma and a cancer originating from
osteosarcoma.
[0104] According to some embodiments, measurement of the relative
abundance of SEQ ID NOS: 33, 268, 267, 333, 276, 319, 306, 320,
334, 323, 300, 281, 59, 339, 316, 176, 348, 352, 349, 67, 357, 315,
343, 342, 355, 340, 344, 10, 341, 331, 20, 277, 318, 158, 265, 284,
36, 183, 40, 63, 147, 43, 289, 52, 190, 4, 5, 39, 169, 208 or a
sequence having at least about 80% identity thereto in said sample
allows for distinguishing between a cancer originating from
rhabdomyosarcoma and a cancer originating from the group consisting
of malignant fibrous histiocytoma and fibrosarcoma.
[0105] According to some aspects of the invention the biological
sample is selected from the group consisting of bodily fluid, a
cell line, a tissue sample, a biopsy sample, a needle biopsy
sample, a fine needle aspiration (FNA) sample, a surgically removed
sample, and a sample obtained by tissue-sampling procedures such as
endoscopy, bronchoscopy, or laparoscopic methods. According to some
embodiments, the tissue is a fresh, frozen, fixed, wax-embedded or
formalin-fixed paraffin-embedded (FFPE) tissue.
[0106] According to additional aspects of the invention the nucleic
acid sequence relative abundance is determined by a method selected
from the group consisting of nucleic acid hybridization and nucleic
acid amplification. According to some embodiments, the nucleic acid
hybridization is performed using a solid-phase nucleic acid biochip
array or in situ hybridization. According to some embodiments, the
nucleic acid amplification method is real-time PCR. According to
some embodiments, the real-time PCR comprises forward and reverse
primers. According to additional embodiments, the real-time PCR
method further comprises a probe. According to additional
embodiments, the probe comprises a sequence selected from the group
consisting of a sequence that is complementary to a sequence
selected from SEQ ID NOS: 1-390; a fragment thereof and a sequence
having at least about 80% identity thereto.
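By way of non-limiting illustration only, the "at least about 80% identity" criterion may be sketched as follows. The sketch assumes an ungapped, position-by-position comparison of two sequences of equal length, which is a simplification adopted here and not the definition of identity used in this specification; the function names and the 22-nucleotide sequences are hypothetical.

```python
import math


def percent_identity(a, b):
    """Percent of matching positions between two equal-length sequences.

    Ungapped comparison only; an assumption made for this sketch.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def min_matches(length, threshold=80.0):
    """Smallest number of matching positions that reaches `threshold` percent."""
    return math.ceil(length * threshold / 100.0)


# Hypothetical 22-nucleotide sequences differing at the first four positions.
seq_a = "ACGU" * 5 + "AC"
seq_b = "UGCA" + seq_a[4:]
identity = percent_identity(seq_a, seq_b)   # 18 of 22 positions match
needed = min_matches(len(seq_a))            # matches required for 80%
```

Under these assumptions a 22-nucleotide sequence needs at least 18 matching positions to reach 80% identity.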
[0107] According to another aspect, the present invention provides
a kit for cancer origin identification, the kit comprising a probe
comprising a sequence selected from the group consisting of a
sequence that is complementary to a sequence selected from SEQ ID
NOS: 1-390; a fragment thereof and a sequence having at least about
80% identity thereto.
[0108] These and other embodiments of the present invention will
become apparent in conjunction with the figures, description and
claims that follow.
BRIEF DESCRIPTION OF THE DRAWINGS
[0109] FIGS. 1A-1F demonstrate the structure of the binary
decision-tree classifier, with 45 nodes and 46 leaves. Each node is
a binary decision between two sets of samples, those to the left
and right of the node. A series of binary decisions, starting at
node #1 and moving downwards, leads to one of the possible tumor
types, which are the "leaves" of the tree. A sample which is
classified to the right branch at node #1 continues to node #2,
otherwise it continues to node #11. A sample which is classified to
the right branch at node #2 continues to node #4, otherwise it
continues to node #3. A sample that reaches node #3 is classified
either to the left branch at node #3, and assigned to the
"hepatocellular carcinoma" class, or to the right branch at
node #3, and assigned to the "biliary tract adenocarcinoma"
class.
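By way of non-limiting illustration only, the routing described above for nodes #1 to #3 may be sketched as follows. Only the branches spelled out in the preceding paragraph are encoded; nodes #4 and #11 are left unresolved. The callback go_right and the toy sample are hypothetical and stand in for the per-node decision rules, which are not reproduced here.

```python
# Each entry maps a node number to (left outcome, right outcome).
# An int is another node; a str is a leaf (tumor class).
TREE = {
    1: (11, 2),
    2: (3, 4),
    3: ("hepatocellular carcinoma", "biliary tract adenocarcinoma"),
}


def classify(sample, go_right, tree=TREE, start=1):
    """Follow binary decisions from `start` until a leaf or an unlisted node.

    `go_right(node, sample)` is a hypothetical callback returning True for
    the right branch. Returns (outcome, path), where outcome is a class
    name or the number of a node this partial table does not cover.
    """
    node, path = start, []
    while isinstance(node, int) and node in tree:
        path.append(node)
        left, right = tree[node]
        node = right if go_right(node, sample) else left
    return node, path


# Toy sample carrying a precomputed branch choice for each node.
toy = {1: True, 2: False, 3: False}
outcome, path = classify(toy, lambda node, s: s[node])
```

Here the toy sample goes right at node #1, left at node #2 and left at node #3, and is assigned to the "hepatocellular carcinoma" class. In a tree in which every node has exactly two outcomes, the number of leaves is one more than the number of nodes, which is consistent with 45 nodes and 46 leaves.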
[0110] Decisions are made at consecutive nodes using microRNA
expression levels, until an end-point ("leaf" of the tree) is
reached, indicating the predicted class for this sample. In
specifying the tree structure, clinico-pathological considerations
were combined with properties observed in the training set
data.
[0111] FIGS. 2A-2D demonstrate binary decisions at node #1 of the
decision-tree. When training a decision algorithm for a given node,
only samples from classes which are possible outcomes of this node
are used for training. The "non germ cell" classes (right branch at
node #1) are easily distinguished from tumors of the "germ cell"
classes (left branch at node #1) using the expression levels of
hsa-miR-373 (SEQ ID NO: 233, 2A), hsa-miR-372 (SEQ ID NO: 55, 2B),
hsa-miR-371-3p (SEQ ID NO: 200, 2C), and hsa-miR-371-5p (SEQ ID NO:
201, 2D). The boxplots compare the distribution of the
expression of the statistically significant miRs in tumor samples
from the "germ cell" classes (left box) and "non germ cell" classes
(right box). The line in the box indicates the median value. The
box contains 50% of the data and the horizontal lines and crosses
(outliers) show the full range of signals in this group.
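By way of non-limiting illustration only, restricting the training data for a given node to the classes that are possible outcomes of that node may be sketched as follows. The three-leaf tree, the class names and the sample values are hypothetical and do not correspond to the tree of FIGS. 1A-1F.

```python
def reachable_classes(tree, node):
    """Return the set of leaf classes that can be reached from `node`."""
    if isinstance(node, str):      # a leaf is stored as its class name
        return {node}
    left, right = tree[node]
    return reachable_classes(tree, left) | reachable_classes(tree, right)


def training_subset(samples, tree, node):
    """Keep the (expression, label) pairs whose label is a possible outcome of `node`."""
    allowed = reachable_classes(tree, node)
    return [(x, label) for x, label in samples if label in allowed]


# Hypothetical three-leaf tree and toy samples, for illustration only.
toy_tree = {1: ("class A", 2), 2: ("class B", "class C")}
toy_samples = [(0.1, "class A"), (0.7, "class B"), (0.9, "class C")]

node1_training = training_subset(toy_samples, toy_tree, 1)  # all three classes
node2_training = training_subset(toy_samples, toy_tree, 2)  # classes B and C only
```

In this toy case the decision at node 2 is trained without the "class A" samples, since those samples cannot reach node 2.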
[0112] FIG. 3 demonstrates binary decisions at node #3 of the
decision-tree. Tumors of hepatocellular carcinoma (HCC) origin
(left branch at node #3, marked by squares) are easily
distinguished from tumors of biliary tract adenocarcinoma origin
(right branch at node #3, marked by diamonds) using the expression
levels of hsa-miR-200b (SEQ ID NO: 29, y-axis) and hsa-miR-126 (SEQ
ID NO: 9, x-axis).
[0113] FIG. 4 demonstrates binary decisions at node #4 of the
decision-tree. Tumors of epithelial origin (diamonds) are
easily distinguished from tumors of non-epithelial origin (squares)
using the expression levels of hsa-miR-30a (SEQ ID NO: 46, y-axis)
and hsa-miR-200c (SEQ ID NO: 30, x-axis).
[0114] FIG. 5 demonstrates binary decisions at node #5 of the
decision-tree. Tumors of lymphoma or melanoma origin
(diamonds) are easily distinguished from tumors of non epithelial,
non lymphoma/melanoma origin (squares) using the expression levels
of hsa-miR-146a (SEQ ID NO: 16, y-axis), hsa-miR-30a (SEQ ID NO:
46, x-axis) and hsa-let-7e (SEQ ID NO: 2, z-axis).
[0115] FIG. 6 demonstrates binary decisions at node #6 of the
decision-tree. Tumors originating in the brain (left branch at node
#6, marked by diamonds) are easily distinguished from tumors of non
epithelial, non brain (right branch at node #6, marked by squares)
using the expression levels of hsa-miR-9* (SEQ ID NO: 66, y-axis)
and hsa-miR-92b (SEQ ID NO: 68, x-axis).
[0116] FIG. 7 demonstrates binary decisions at node #7 of the
decision-tree. Tumors originating in astrocytoma (right branch at
node #7, marked by diamonds) are easily distinguished from tumors
of oligodendroglioma origins (left branch at node #7, marked by
squares) using the expression levels of hsa-miR-497 (SEQ ID NO: 60,
y-axis) and hsa-miR-222 (SEQ ID NO: 40, x-axis).
[0117] FIG. 8 demonstrates binary decisions at node #8 of the
decision-tree. Tumors of neuroendocrine origin (diamonds)
are easily distinguished from tumors of epithelial origin
(squares) using the expression levels of hsa-miR-193a-3p (SEQ ID
NO: 181, y-axis), hsa-miR-7 (SEQ ID NO: 65, x-axis) and hsa-miR-375
(SEQ ID NO: 56, z-axis).
[0118] FIG. 9 demonstrates binary decisions at node #9 of the
decision-tree. Tumors originating in gastro-intestinal (GI) (left
branch at node #9, marked by diamonds) are easily distinguished
from tumors of non GI origins (right branch at node #9, marked by
squares) using the expression levels of hsa-miR-21* (SEQ ID NO: 35,
y-axis) and hsa-miR-194 (SEQ ID NO: 27, x-axis).
[0119] FIG. 10 demonstrates binary decisions at node #10 of the
decision-tree. Tumors originating in prostate adenocarcinoma (left
branch at node #10, marked by diamonds) are easily distinguished
from tumors of non prostate origins (right branch at node #10,
marked by squares) using the expression levels of hsa-miR-181a (SEQ
ID NO: 21, y-axis) and hsa-miR-143 (SEQ ID NO: 14, x-axis).
[0120] FIG. 11 demonstrates binary decisions at node #12 of the
decision-tree. Tumors originating in seminomatous testicular germ
cell (left branch at node #12, marked by diamonds) are easily
distinguished from tumors of non seminomatous origins (right branch
at node #12, marked by squares) using the expression levels of
hsa-miR-516a-5p (SEQ ID NO: 62, y-axis) and hsa-miR-200b (SEQ ID
NO: 29, x-axis).
[0121] FIG. 12 demonstrates binary decisions at node #16 of the
decision-tree. Tumors originating in thyroid carcinoma (diamonds)
are easily distinguished from tumors of adenocarcinoma of the lung,
breast and ovarian origin (squares) using the expression levels of
hsa-miR-93 (SEQ ID NO: 148, y-axis), hsa-miR-138 (SEQ ID NO: 11,
x-axis) and hsa-miR-10a (SEQ ID NO: 4, z-axis).
[0122] FIG. 13 demonstrates binary decisions at node #17 of the
decision-tree. Tumors originating in follicular thyroid carcinoma
(left branch at node #17, marked by diamonds) are easily
distinguished from tumors of papillary thyroid carcinoma origins
(right branch at node #17, marked by squares) using the expression
levels of hsa-miR-21 (SEQ ID NO: 34, y-axis) and hsa-miR-146b-5p
(SEQ ID NO: 17, x-axis).
[0123] FIG. 14 demonstrates binary decisions at node #18 of the
decision-tree. Tumors originating in breast (diamonds) are easily
distinguished from tumors of lung and ovarian origin (squares)
using the expression levels of hsa-miR-92a (SEQ ID NO: 67, y-axis),
hsa-miR-193a-3p (SEQ ID NO: 25, x-axis) and hsa-miR-31 (SEQ ID NO:
49, z-axis).
[0124] FIG. 15 demonstrates binary decisions at node #19 of the
decision-tree. Tumors originating in lung adenocarcinoma (diamonds)
are easily distinguished from tumors of ovarian carcinoma origin
(squares) using the expression levels of hsa-miR-21 (SEQ ID NO: 34,
y-axis), hsa-miR-378 (SEQ ID NO: 57, x-axis) and hsa-miR-138 (SEQ
ID NO: 11, z-axis).
[0125] FIG. 16 demonstrates binary decisions at node #20 of the
decision-tree. Tumors originating in thymic carcinoma (left branch
at node #20, marked by diamonds) are easily distinguished from
tumors of urothelial carcinoma (transitional cell carcinoma, TCC)
and squamous cell carcinoma (SCC) origins (right branch
at node #20, marked by squares) using the expression levels of
hsa-miR-21 (SEQ ID NO: 34, y-axis) and hsa-miR-100 (SEQ ID NO: 3,
x-axis).
[0126] FIG. 17 demonstrates binary decisions at node #22 of the
decision-tree. Tumors originating in SCC of the uterine cervix
(diamonds) are easily distinguished from tumors of other SCC origin
(squares) using the expression levels of hsa-miR-361-5p (SEQ ID NO:
54, y-axis), hsa-let-7c (SEQ ID NO: 1, x-axis) and hsa-miR-10b (SEQ
ID NO: 5, z-axis).
[0127] FIG. 18 demonstrates binary decisions at node #24 of the
decision-tree. Tumors originating in melanoma (diamonds) are easily
distinguished from tumors of lymphoma origin (squares) using the
expression levels of hsa-miR-342-3p (SEQ ID NO: 50, y-axis) and
hsa-miR-30d (SEQ ID NO: 47, x-axis).
[0128] FIG. 19 demonstrates binary decisions at node #27 of the
decision-tree. Tumors originating in thyroid carcinoma, medullary
(diamonds) are easily distinguished from tumors of other
neuroendocrine origin (squares) using the expression levels of
hsa-miR-92b (SEQ ID NO: 68, y-axis), hsa-miR-222 (SEQ ID NO: 40,
x-axis) and hsa-miR-92a (SEQ ID NO: 67, z-axis).
[0129] FIG. 20 demonstrates binary decisions at node #30 of the
decision-tree. Tumors originating in gastric or esophageal
adenocarcinoma (diamonds) are easily distinguished from tumors of
other GI adenocarcinoma origin (squares) using the expression
levels of hsa-miR-1201 (SEQ ID NO: 146, y-axis), hsa-miR-224 (SEQ
ID NO: 42, x-axis) and hsa-miR-210 (SEQ ID NO: 36, z-axis).
[0130] FIG. 21 demonstrates binary decisions at node #31 of the
decision-tree. Tumors originating in colorectal adenocarcinoma
(diamonds) are easily distinguished from tumors of adenocarcinoma
of biliary tract or pancreas origin (squares) using the expression
levels of hsa-miR-30a (SEQ ID NO: 46, y-axis), hsa-miR-17 (SEQ ID
NO: 20, x-axis) and hsa-miR-29a (SEQ ID NO: 43, z-axis).
[0131] FIG. 22 demonstrates binary decisions at node #33 of the
decision-tree. Tumors originating in kidney (diamonds) are easily
distinguished from tumors of adrenal, mesothelioma and sarcoma
origin (squares) using the expression levels of hsa-miR-200b (SEQ
ID NO: 29, y-axis), hsa-miR-30a (SEQ ID NO: 46, x-axis) and
hsa-miR-149 (SEQ ID NO: 19, z-axis).
[0132] FIG. 23 demonstrates binary decisions at node #34 of the
decision-tree. Tumors originating in pheochromocytoma (diamonds)
are easily distinguished from tumors of adrenal, mesothelioma and
sarcoma origin (squares) using the expression levels of hsa-miR-375
(SEQ ID NO: 56, y-axis) and hsa-miR-7 (SEQ ID NO: 65, x-axis).
[0133] FIG. 24 demonstrates binary decisions at node #44 of the
decision-tree. Tumors originating in Ewing sarcoma (diamonds) are
easily distinguished from tumors of osteosarcoma origin (squares)
using the expression levels of hsa-miR-31 (SEQ ID NO: 49, y-axis)
and hsa-miR-193a-3p (SEQ ID NO: 25, x-axis).
[0134] FIG. 25 demonstrates binary decisions at node #45 of the
decision-tree. Tumors originating in Rhabdomyosarcoma (diamonds)
are easily distinguished from tumors of malignant fibrous
histiocytoma (MFH) or fibrosarcoma origin (squares) using the
expression levels of hsa-miR-206 (SEQ ID NO: 33, y-axis),
hsa-miR-22 (SEQ ID NO: 39, x-axis) and hsa-miR-487b (SEQ ID NO: 59,
z-axis).
DETAILED DESCRIPTION OF THE INVENTION
[0135] Identification of the tissue-of-origin of a tumor is vital
to its management. The present invention is based in part on the
discovery that specific nucleic acid sequences can be used for the
identification of the tissue-of-origin of a tumor. The present
invention provides a sensitive, specific and accurate method which
can be used to distinguish between different tumor origins. A new
microRNA-based classifier was developed for determining tissue
origin of tumors based on 65 microRNA markers. The classifier uses
a specific algorithm and allows a clear interpretation of the
specific biomarkers.
[0136] According to the present invention each node in the
classification tree may be used as an independent differential
diagnosis tool, for example in the identification of different
types of lung cancers. The possibility to distinguish between
different tumor origins facilitates providing the patient with the
best and most suitable treatment.
[0137] The present invention provides diagnostic assays and
methods, both quantitative and qualitative, for detecting,
diagnosing, monitoring, staging and prognosticating cancers by
comparing the levels of the specific microRNA molecules of the
invention. Such levels are preferably measured in at least one of
biopsies, tumor samples, fine-needle aspiration (FNA), cells,
tissues and/or bodily fluids. The methods provided in the present
invention are particularly useful for discriminating between
different cancers.
[0138] All the methods of the present invention may optionally
further include measuring levels of additional cancer markers. The
cancer markers measured in addition to said microRNA molecules
depend on the cancer being tested and are known to those of skill
in the art.
[0139] Assay techniques can be used to determine levels of gene
expression, such as genes encoding the nucleic acids of the present
invention in a sample derived from a patient. Such assay methods,
which are well known to those of skill in the art, include, but are
not limited to, nucleic acid microarrays and biochip analysis,
reverse transcriptase PCR (RT-PCR) assays, immunohistochemistry
assays, in situ hybridization assays, competitive-binding assays,
northern blot analyses and ELISA assays.
[0140] According to one embodiment, the assay is based on
expression level of 65 microRNAs in RNA extracted from FFPE
metastatic tumor tissue.
[0141] The expression levels are used to infer the sample origin
using analysis techniques such as, but not limited to,
decision-tree classifier, K nearest neighbors classifier, logistic
regression classifier, linear regression classifier, nearest
neighbor classifier, neural network classifier and nearest centroid
classifier.
[0142] In use of the decision tree classifier the expression levels
are used to make binary decisions (at each relevant node) following
the pre-defined structure of the binary decision-tree (defined
using a training set).
[0143] At each node, the expressions of one or several microRNAs
are combined together using a function of the form P=exp
(.beta.0+.beta.1*miR1+.beta.2*miR2+.beta.3*miR3 . . . )/(1+exp
(.beta.0+.beta.1*miR1+.beta.2*miR2+.beta.3*miR3 . . . )), where the
values of .beta.0, .beta.1, .beta.2 . . . and the identities of the
microRNAs have been pre-determined (using a training set). The
resulting P is compared to a probability threshold level (P.sub.TH,
which was also determined using the training set), and the
classification continues to the left or right branch according to
whether P is larger or smaller than the P.sub.TH for that node.
This continues until an end-point ("leaf") of the tree is reached.
According to some embodiments, P.sub.TH=0.5 for all nodes, and the
value of .beta.0 is adjusted accordingly. According to further
embodiments, .beta.0, .beta.1, .beta.2, . . . are adjusted so that
the slope of the log of the odds ratio function is limited.
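The node-by-node procedure described above can be sketched in Python as follows. This is a minimal illustration, not the classifier of the invention. It assumes the standard logistic form exp(x)/(1+exp(x)) for P, and it assumes that a sample proceeds to the left branch when P exceeds P.sub.TH, since the text does not fix the direction. The dictionary layout, the function names and all coefficient values are hypothetical.

```python
import math


def node_probability(betas, expr):
    """Logistic combination of microRNA expression levels at one node.

    betas: [beta0, beta1, beta2, ...]
    expr:  expression levels [miR1, miR2, ...] in the same order as beta1...
    """
    z = betas[0] + sum(b * x for b, x in zip(betas[1:], expr))
    # Numerically stable evaluation of exp(z) / (1 + exp(z)).
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def classify(tree, sample, node_id):
    """Follow binary decisions from node_id until a leaf is reached.

    A branch is either another node id (int) or a leaf (class-name string).
    """
    node = tree[node_id]
    while True:
        expr = [sample[m] for m in node["mirs"]]
        p = node_probability(node["betas"], expr)
        # Assumption: P above the threshold goes left, otherwise right.
        branch = node["left"] if p > node["p_th"] else node["right"]
        if isinstance(branch, str):
            return branch
        node = tree[branch]


# Hypothetical single-node tree modeled on node #3; the betas are invented.
toy_tree = {
    3: {
        "mirs": ["hsa-miR-200b", "hsa-miR-126"],
        "betas": [0.0, -1.0, 1.0],
        "p_th": 0.5,
        "left": "hepatocellular carcinoma",
        "right": "biliary tract adenocarcinoma",
    }
}

toy_sample = {"hsa-miR-200b": 5.0, "hsa-miR-126": 9.0}
print(classify(toy_tree, toy_sample, 3))
```

Fixing P.sub.TH at 0.5 and absorbing any shift into .beta.0, as in the embodiment mentioned above, leaves the decisions unchanged, because a threshold on P corresponds to a threshold on the linear term.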
[0144] Training the tree algorithm means determining the tree
structure--which nodes there are and what is on each side, and, for
each node: which miRs are used, the values of .beta.0, .beta.1,
.beta.2 . . . and the P.sub.TH. These are determined by a
combination of machine learning, optimization algorithm, and trial
and error by experts in machine learning and diagnostic
algorithms.
[0145] An arbitrary threshold of the expression level of one or
more nucleic acid sequences can be set for assigning a sample to
one of two groups. Alternatively, in a preferred embodiment,
expression levels of one or more nucleic acid sequences of the
invention are combined by a method such as logistic regression to
define a metric which is then compared to previously measured
samples or to a threshold. The threshold is treated as a parameter
that can be used to quantify the confidence with which samples are
assigned to each class. The threshold can be scaled to favor
sensitivity or specificity, depending on the clinical scenario. The
correlation value to the reference data generates a continuous
score that can be scaled and provides diagnostic information on the
likelihood that a sample belongs to a certain class of cancer
origin or type. In multivariate analysis the microRNA signature
provides a high level of prognostic information.
[0146] In another preferred embodiment, expression level of nucleic
acids is used to classify a test sample by comparison to a training
set of samples. In this embodiment, the test sample is compared in
turn to each one of the training set samples. The comparison is
performed by comparing the expression levels of one or multiple
nucleic acids between the test sample and the specific training
sample. Each such pairwise comparison generates a combined metric
for the multiple nucleic acids, which can be calculated by various
numeric methods such as correlation, cosine, Euclidean distance,
mean square distance, or other methods known to those skilled in
the art. The training samples are then ranked according to this
metric, and the samples with the highest values of the metric (or
lowest values, according to the type of metric) are identified,
indicating those samples that are most similar to the test sample.
By choosing a parameter K, this generates a list that includes the
K training samples that are most similar to the test sample.
Various methods can then be applied to identify from this list the
predicted class of the test sample. In a favored embodiment, the
test sample is predicted to belong to the class that has the
highest number of representatives in the list of K most-similar
training samples (this method is known as the K Nearest Neighbors
method). Other embodiments may provide a list of predictions
including all or part of the classes represented in the list, those
classes that are represented more than a given minimum number of
times, or other voting schemes whereby classes are grouped
together.
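The K Nearest Neighbors procedure described above can be sketched as follows. This is an illustrative sketch only. It assumes Euclidean distance as the pairwise metric (one of the options listed above), a simple majority vote, and invented expression vectors and class labels; the function names are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two expression vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(test, training, k):
    """Predict the class of a test sample from its k most similar training samples.

    training: list of (expression_vector, class_label) pairs.
    """
    # Rank training samples by distance; the lowest values are the most similar.
    ranked = sorted(training, key=lambda item: euclidean(test, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    # With tied counts, the class encountered first (the nearest) is returned.
    return votes.most_common(1)[0][0]


# Invented two-microRNA training set with two classes.
training_set = [
    ([1.0, 1.0], "class A"),
    ([1.2, 0.9], "class A"),
    ([0.9, 1.1], "class A"),
    ([5.0, 5.0], "class B"),
    ([5.1, 4.8], "class B"),
]

print(knn_predict([1.1, 1.0], training_set, k=3))
```

For a similarity metric such as correlation, where higher values mean greater similarity, the ranking would be reversed.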
1. DEFINITIONS
[0147] It is to be understood that the terminology used herein is
for the purpose of describing particular embodiments only and is
not intended to be limiting. It must be noted that, as used in the
specification and the appended claims, the singular forms "a," "an"
and "the" include plural referents unless the context clearly
dictates otherwise.
[0148] For the recitation of numeric ranges herein, each
intervening number there between with the same degree of precision
is explicitly contemplated. For example, for the range of 6-9, the
numbers 7 and 8 are contemplated in addition to 6 and 9, and for
the range 6.0-7.0, the numbers 6.0, 6.1, 6.2, 6.3, 6.4, 6.5, 6.6,
6.7, 6.8, 6.9 and 7.0 are explicitly contemplated.
[0149] About
[0150] As used herein, the term "about" refers to +/-10%.
[0151] Attached
[0152] "Attached" or "immobilized", as used herein to refer to a
probe and a solid support, means that the binding between the probe
and the solid support is sufficient to be stable under conditions
of binding, washing, analysis, and removal. The binding may be
covalent or non-covalent. Covalent bonds may be formed directly
between the probe and the solid support or may be formed by a cross
linker or by inclusion of a specific reactive group on either the
solid support or the probe or both molecules. Non-covalent binding
may be one or more of electrostatic, hydrophilic, and hydrophobic
interactions. Included in non-covalent binding is the covalent
attachment of a molecule, such as streptavidin, to the support and
the non-covalent binding of a biotinylated probe to the
streptavidin. Immobilization may also involve a combination of
covalent and non-covalent interactions.
[0153] Baseline
[0154] "Baseline", as used herein, means the initial cycles of PCR,
in which there is little change in fluorescence signal.
[0155] Biological Sample
[0156] "Biological sample", as used herein, means a sample of
biological tissue or fluid that comprises nucleic acids. Such
samples include, but are not limited to, tissue or fluid isolated
from subjects. Biological samples may also include sections of
tissues such as biopsy and autopsy samples, FFPE samples, frozen
sections taken for histological purposes, blood, blood fraction,
plasma, serum, sputum, stool, tears, mucus, hair, skin, urine,
effusions, ascitic fluid, amniotic fluid, saliva, cerebrospinal
fluid, cervical secretions, vaginal secretions, endometrial
secretions, gastrointestinal secretions, bronchial secretions, cell
line, tissue sample, or secretions from the breast. A biological
sample may be provided by fine-needle aspiration (FNA), pleural
effusion or bronchial brushing. A biological sample may be provided
by removing a sample of cells from a subject but can also be
accomplished by using previously isolated cells (e.g., isolated by
another person, at another time, and/or for another purpose), or by
performing the methods described herein in vivo. Archival tissues,
such as those having treatment or outcome history, may also be
used. Biological samples also include explants and primary and/or
transformed cell cultures derived from animal or human tissues.
[0157] Cancer
[0158] The term "cancer" is meant to include all types of cancerous
growths or oncogenic processes, metastatic tissues or malignantly
transformed cells, tissues, or organs, irrespective of
histopathologic type or stage of invasiveness. Examples of cancers
include, but are not limited to, solid tumors and leukemias,
including: apudoma, choristoma, branchioma, malignant carcinoid
syndrome, carcinoid heart disease, carcinoma (e.g., Walker, basal
cell, basosquamous, Brown-Pearce, ductal, Ehrlich tumor, non-small
cell lung (e.g., lung squamous cell carcinoma, lung adenocarcinoma
and lung undifferentiated large cell carcinoma), oat cell,
papillary, bronchiolar, bronchogenic, squamous cell, and
transitional cell), histiocytic disorders, leukemia (e.g., B cell,
mixed cell, null cell, T cell, T-cell chronic, HTLV-II-associated,
lymphocytic acute, lymphocytic chronic, mast cell, and myeloid),
histiocytosis malignant, Hodgkin disease, immunoproliferative
small, non-Hodgkin lymphoma, plasmacytoma, reticuloendotheliosis,
melanoma, chondroblastoma, chondroma, chondrosarcoma, fibroma,
fibrosarcoma, giant cell tumors, histiocytoma, lipoma, liposarcoma,
mesothelioma, myxoma, myxosarcoma, osteoma, osteosarcoma, Ewing
sarcoma, synovioma, adenofibroma, adenolymphoma, carcinosarcoma,
chordoma, craniopharyngioma, dysgerminoma, hamartoma, mesenchymoma,
mesonephroma, myosarcoma, ameloblastoma, cementoma, odontoma,
teratoma, thymoma, trophoblastic tumor, adeno-carcinoma, adenoma,
cholangioma, cholesteatoma, cylindroma, cystadenocarcinoma,
cystadenoma, granulosa cell tumor, gynandroblastoma, hepatoma,
hidradenoma, islet cell tumor, Leydig cell tumor, papilloma,
Sertoli cell tumor, theca cell tumor, leiomyoma, leiomyosarcoma,
myoblastoma, myosarcoma, rhabdomyoma, rhabdomyosarcoma, ependymoma,
ganglioneuroma, glioma, medulloblastoma, meningioma, neurilemmoma,
neuroblastoma, neuroepithelioma, neurofibroma, neuroma,
paraganglioma, paraganglioma nonchromaffin, angiokeratoma,
angiolymphoid hyperplasia with eosinophilia, angioma sclerosing,
angiomatosis, glomangioma, hemangioendothelioma, hemangioma,
hemangiopericytoma, hemangiosarcoma, lymphangioma, lymphangiomyoma,
lymphangiosarcoma, pinealoma, carcinosarcoma, chondrosarcoma,
cystosarcoma phyllodes, fibrosarcoma, hemangiosarcoma,
leiomyosarcoma, leukosarcoma, liposarcoma, lymphangiosarcoma,
myosarcoma, myxosarcoma, ovarian carcinoma, rhabdomyosarcoma,
sarcoma (e.g., Ewing, experimental, Kaposi, and mast cell),
neurofibromatosis, and cervical dysplasia, and other conditions in
which cells have become immortalized or transformed.
[0159] Classification
[0160] The term classification refers to a procedure and/or
algorithm in which individual items are placed into groups or
classes based on quantitative information on one or more
characteristics inherent in the items (referred to as traits,
variables, characters, features, etc.) and based on a statistical
model and/or a training set of previously labeled items. A
"classification tree" places categorical variables into
classes.
[0161] Complement
[0162] "Complement" or "complementary", as used herein to refer to a
nucleic acid, may mean Watson-Crick (e.g., A-T/U and C-G) or
Hoogsteen base pairing between nucleotides or nucleotide analogs of
nucleic acid molecules. A full complement or fully complementary
means 100% complementary base pairing between nucleotides or
nucleotide analogs of nucleic acid molecules. In some embodiments,
the complementary sequence has a reverse orientation (5'-3').
[0163] Ct
[0164] Ct signals represent the first cycle of PCR where
amplification crosses a threshold (cycle threshold) of
fluorescence. Accordingly, low values of Ct represent high
abundance or expression levels of the microRNA.
[0165] In some embodiments the PCR Ct signal is normalized such
that the normalized Ct remains inversely related to the expression level.
In other embodiments the PCR Ct signal may be normalized and then
inverted such that low normalized-inverted Ct represents low
abundance or low expression levels of the microRNA.
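The two conventions described above can be illustrated with a short sketch. The text does not specify how normalization or inversion is carried out, so the choices here are assumptions: normalization subtracts the mean Ct of a set of reference microRNAs, and inversion subtracts the normalized Ct from a constant offset. The function names and Ct values are hypothetical.

```python
def normalize_ct(ct_values, reference_mirs):
    """Subtract the mean Ct of the reference microRNAs from every Ct value.

    Assumed normalization scheme; low normalized Ct still means high expression.
    """
    ref_mean = sum(ct_values[m] for m in reference_mirs) / len(reference_mirs)
    return {mir: ct - ref_mean for mir, ct in ct_values.items()}


def invert_ct(normalized, offset=0.0):
    """Invert normalized Ct so that low values mean low expression.

    Assumed inversion scheme: offset minus normalized Ct.
    """
    return {mir: offset - value for mir, value in normalized.items()}


# Invented Ct values: miR-A crosses the threshold earlier, so it is more abundant.
raw_ct = {"miR-A": 22.0, "miR-B": 30.0, "miR-ref": 26.0}

normalized = normalize_ct(raw_ct, ["miR-ref"])
inverted = invert_ct(normalized)

print(normalized)  # miR-A has the lower normalized Ct
print(inverted)    # miR-A has the higher normalized-inverted value
```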
[0166] Data Processing Routine
[0167] As used herein, a "data processing routine" refers to a
process that can be embodied in software that determines the
biological significance of acquired data (i.e., the ultimate
results of an assay or analysis). For example, the data processing
routine can determine a tissue of origin based upon the data
collected. In the systems and methods herein, the data processing
routine can also control the data collection routine based upon the
results determined. The data processing routine and the data
collection routines can be integrated and provide feedback to
operate the data acquisition, and hence provide assay-based judging
methods.
[0168] Data Set
[0169] As used herein, the term "data set" refers to numerical
values obtained from the analysis. These numerical values
associated with analysis may be values such as peak height and area
under the curve.
[0170] Data Structure
[0171] As used herein, the term "data structure" refers to a
combination of two or more data sets, an application of one or more
mathematical manipulations to one or more data sets to obtain one or
more new data sets, or a manipulation of two or more data sets into
a form that provides a visual illustration of the data in a new
way. An example of a data structure prepared from manipulation of
two or more data sets would be a hierarchical cluster.
[0172] Detection
[0173] "Detection" means detecting the presence of a component in a
sample. Detection also means detecting the absence of a component.
Detection also means determining the level of a component, either
quantitatively or qualitatively.
[0174] Differential Expression
[0175] "Differential expression" means qualitative or quantitative
differences in the temporal and/or spatial gene expression patterns
within and among cells and tissue. Thus, a differentially expressed
gene may qualitatively have its expression altered, including an
activation or inactivation, in, e.g., normal versus diseased
tissue. Genes may be turned on or turned off in a particular state,
relative to another state, thus permitting comparison of two or
more states. A qualitatively regulated gene may exhibit an
expression pattern within a state or cell type which may be
detectable by standard techniques. Some genes may be expressed in
one state or cell type, but not in both. Alternatively, the
difference in expression may be quantitative, e.g., in that
expression is modulated: up-regulated, resulting in an increased
amount of transcript, or down-regulated, resulting in a decreased
amount of transcript. The degree to which expression differs needs
only to be large enough to quantify via standard characterization
techniques such as expression arrays, quantitative reverse
transcriptase PCR, northern blot analysis, real-time PCR, in situ
hybridization and RNase protection.
[0176] Epithelial Tumors
[0177] "Epithelial tumors" is meant to include all types of tumors
from epithelial origin. Examples of epithelial tumors include, but
are not limited to, cholangiocarcinoma or adenocarcinoma of the extrahepatic biliary
tract, urothelial carcinoma, adenocarcinoma of the breast, lung
large cell or adenocarcinoma, lung small cell carcinoma, carcinoid,
lung, ovarian carcinoma, pancreatic adenocarcinoma, prostatic
adenocarcinoma, gastric or esophageal adenocarcinoma,
thymoma/thymic carcinoma, follicular thyroid carcinoma, papillary
thyroid carcinoma, medullary thyroid carcinoma, anus or skin
squamous cell carcinoma, lung, head & neck, or esophagus squamous
cell carcinoma, uterine cervix squamous cell carcinoma,
gastrointestinal tract carcinoid, pancreatic islet cell tumor and
colorectal adenocarcinoma.
[0178] Non Epithelial Tumors
[0179] "Non epithelial tumors" is meant to include all types of
tumors from non epithelial origin. Examples of non epithelial
tumors include, but are not limited to, adrenocortical carcinoma,
chromophobe renal cell carcinoma, clear cell renal cell carcinoma,
papillary renal cell carcinoma, pleural mesothelioma, astrocytic
tumor, oligodendroglioma, pheochromocytoma, B-cell lymphoma, T-cell
lymphoma, melanoma, gastrointestinal stromal tumor (GIST), Ewing
Sarcoma, chondrosarcoma, malignant fibrous histiocytoma (MFH) or
fibrosarcoma, osteosarcoma, rhabdomyosarcoma, synovial sarcoma and
liposarcoma.
[0180] Expression Profile
[0181] The term "expression profile" is used broadly to include a
genomic expression profile, e.g., an expression profile of
microRNAs. Profiles may be generated by any convenient means for
determining a level of a nucleic acid sequence, e.g., quantitative
hybridization of microRNA, labeled microRNA, amplified microRNA,
cDNA, etc., quantitative PCR, ELISA for quantitation, and the like,
and allow the analysis of differential gene expression between two
samples. A subject or patient tumor sample, e.g., cells or
collections thereof, e.g., tissues, is assayed. Samples are
collected by any convenient method, as known in the art. Nucleic
acid sequences of interest are nucleic acid sequences that are
found to be predictive, including the nucleic acid sequences
provided above, where the expression profile may include expression
data for 5, 10, 20, 25, 50, 100 or more of the nucleic acid
sequences, including all of the listed nucleic acid sequences.
According to some embodiments, the term "expression profile" means
measuring the relative abundance of the nucleic acid sequences in
the measured samples.
[0182] Expression Ratio
[0183] "Expression ratio", as used herein, refers to relative
expression levels of two or more nucleic acids as determined by
detecting the relative expression levels of the corresponding
nucleic acids in a biological sample.
[0184] FDR (False Discovery Rate)
[0185] When performing multiple statistical tests, for example in
comparing between the signal of two groups in multiple data
features, there is an increasingly high probability of obtaining
false positive results, by random differences between the groups
that can reach levels that would otherwise be considered
statistically significant. In order to limit the proportion of such
false discoveries, statistical significance is defined only for
data features in which the differences reached a p-value (by
two-sided t-test) below a threshold, which is dependent on the
number of tests performed and the distribution of p-values obtained
in these tests.
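A p-value threshold that depends on both the number of tests and the distribution of the p-values can be computed, for example, with the Benjamini-Hochberg step-up procedure. The text does not name the procedure used, so the sketch below is an assumption offered for illustration only; the function name, the FDR level and the p-values are hypothetical.

```python
def bh_threshold(p_values, fdr=0.1):
    """Benjamini-Hochberg step-up rule (assumed procedure, not named in the text).

    Returns the largest p-value p(i) satisfying p(i) <= fdr * i / m, where i is
    the rank of the p-value and m the number of tests, or None if none qualifies.
    Features with p-values at or below the returned threshold are called significant.
    """
    m = len(p_values)
    threshold = None
    for rank, p in enumerate(sorted(p_values), start=1):
        if p <= fdr * rank / m:
            threshold = p
    return threshold


# Invented p-values from ten two-sided t-tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

print(bh_threshold(example_p, fdr=0.05))
```

In this invented example only the two smallest p-values pass, although five of the ten are below the conventional single-test level of 0.05.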
[0186] Fragment
[0187] "Fragment" is used herein to indicate a non-full-length part
of a nucleic acid. Thus, a fragment is itself also a nucleic
acid.
[0188] Gastrointestinal Tumors
[0189] "Gastrointestinal tumors" is meant to include all types of
tumors from gastrointestinal origin. Examples of gastrointestinal
tumors include, but are not limited to, cholangiocarcinoma or
adenocarcinoma of the extrahepatic biliary tract, pancreatic
adenocarcinoma, gastric or
esophageal adenocarcinoma, and colorectal adenocarcinoma.
[0190] Gene
[0191] "Gene", as used herein, may be a natural (e.g., genomic) or
synthetic gene comprising transcriptional and/or translational
regulatory sequences and/or a coding region and/or non-translated
sequences (e.g., introns, 5'- and 3'-untranslated sequences). The
coding region of a gene may be a nucleotide sequence coding for an
amino acid sequence or a functional RNA, such as tRNA, rRNA,
catalytic RNA, siRNA, miRNA or antisense RNA. A gene may also be an
mRNA or cDNA corresponding to the coding regions (e.g., exons and
miRNA) optionally comprising 5'- or 3'-untranslated sequences
linked thereto. A gene may also be an amplified nucleic acid
molecule produced in vitro, comprising all or a part of the coding
region and/or 5'- or 3'-untranslated sequences linked thereto.
[0192] Germ Cell Tumors
[0193] "Germ cell tumors", as used herein, include, but are not
limited to, non-seminomatous testicular germ cell tumors,
seminomatous testicular germ cell tumors and ovarian primitive germ
cell tumors.
[0194] Groove Binder/Minor Groove Binder (MGB)
[0195] "Groove binder" and/or "minor groove binder" may be used
interchangeably and refer to small molecules that fit into the
minor groove of double-stranded DNA, typically in a
sequence-specific manner. Minor groove binders may be long, flat
molecules that can adopt a crescent-like shape and thus fit snugly
into the minor groove of a double helix, often displacing water.
Minor groove binding molecules may typically comprise several
aromatic rings connected by bonds with torsional freedom such as
furan, benzene, or pyrrole rings. Minor groove binders may be
antibiotics such as netropsin, distamycin, berenil, pentamidine and
other aromatic diamidines, Hoechst 33258, SN 6999, aureolic
anti-tumor drugs such as chromomycin and mithramycin, CC-1065,
dihydrocyclopyrroloindole tripeptide (DPI.sub.3),
1,2-dihydro-(3H)-pyrrolo[3,2-e]indole-7-carboxylate (CDPI.sub.3),
and related compounds and analogues, including those described in
Nucleic Acids in Chemistry and Biology, 2nd ed., Blackburn and
Gait, eds., Oxford University Press, 1996, and PCT Published
Application No. WO 03/078450, the contents of which are
incorporated herein by reference. A minor groove binder may be a
component of a primer, a probe, a hybridization tag complement, or
combinations thereof. Minor groove binders may increase the T.sub.m
of the primer or a probe to which they are attached, allowing such
primers or probes to effectively hybridize at higher
temperatures.
[0196] High Expression miR-205 Tumors
[0197] "High expression miR-205 tumors" as used herein include, but
are not limited to, urothelial carcinoma (TCC), thymoma/thymic
carcinoma, anus or skin squamous cell carcinoma, lung,
head & neck, or esophagus squamous cell carcinoma and uterine
cervix squamous cell carcinoma.
[0198] Low Expression miR-205 Tumors
[0199] "Low expression miR-205 tumors" as used herein include, but
are not limited to, lung large cell or adenocarcinoma, follicular
thyroid carcinoma and papillary thyroid carcinoma.
[0200] Host Cell
[0201] "Host cell", as used herein, may be a naturally occurring
cell or a transformed cell that may contain a vector and may
support replication of the vector. Host cells may be cultured
cells, explants, cells in vivo, and the like. Host cells may be
prokaryotic cells, such as E. coli, or eukaryotic cells, such as
yeast, insect, amphibian, or mammalian cells, such as CHO and HeLa
cells.
[0202] Identity
[0203] "Identical" or "identity", as used herein, in the context of
two or more nucleic acids or polypeptide sequences means that the
sequences have a specified percentage of residues that are the same
over a specified region. The percentage may be calculated by
optimally aligning the two sequences, comparing the two sequences
over the specified region, determining the number of positions at
which the identical residue occurs in both sequences to yield the
number of matched positions, dividing the number of matched
positions by the total number of positions in the specified region,
and multiplying the result by 100 to yield the percentage of
sequence identity. In cases where the two sequences are of
different lengths or the alignment produces one or more staggered
ends and the specified region of comparison includes only a single
sequence, the residues of the single sequence are included in the
denominator but not the numerator of the calculation. When
comparing DNA and RNA sequences, thymine (T) and uracil (U) may be
considered equivalent. Identity may be determined manually or by
using a computer sequence algorithm such as BLAST or BLAST 2.0.
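As an illustration only, the percentage calculation described above may be sketched in Python as follows. The sketch assumes the two sequences have already been aligned and padded to equal length, with a "-" character marking positions where only one sequence is present; the function name percent_identity and the gap convention are hypothetical, and no optimal alignment step (such as BLAST) is performed.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a pre-aligned region (illustrative sketch).

    Positions where only one sequence is present ('-' in the other) are
    counted in the denominator but not in the numerator. T and U are
    treated as equivalent so DNA and RNA can be compared.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("region must contain at least one position")

    def norm(base: str) -> str:
        base = base.upper()
        return "T" if base == "U" else base

    # Count positions where both sequences carry the same residue.
    matches = sum(
        1
        for a, b in zip(aligned_a, aligned_b)
        if a != "-" and b != "-" and norm(a) == norm(b)
    )
    # Divide by the total number of positions in the specified region.
    return 100.0 * matches / len(aligned_a)
```

For example, comparing "ACGT-" with "ACGTA" gives four matched positions out of five, since the unpaired final residue enters only the denominator.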
[0204] In Situ Detection
[0205] "In situ detection", as used herein, means the detection of
expression or expression levels in the original site, here
meaning in a tissue sample such as a biopsy.
[0206] K-Nearest Neighbor
[0207] The phrase "K-nearest neighbor" refers to a classification
method that classifies a point by calculating the distances between
it and points in the training data set. It then assigns the point
to the class that is most common among its K-nearest neighbors
(where K is an integer).
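By way of a non-limiting sketch, a K-nearest neighbor classification of the kind defined above may be written in Python as shown below. Euclidean distance, the function name knn_classify, and the tie-breaking behavior are assumptions made for illustration; the definition above does not prescribe a distance measure.

```python
import math
from collections import Counter

def knn_classify(point, training, k=3):
    """Assign `point` to the most common class among its k nearest neighbors.

    `training` is a list of (vector, label) pairs. Euclidean distance is an
    assumption of this sketch. When votes are tied, the class of the nearer
    neighbor wins, because Counter keeps first-seen order among equal counts.
    """
    if k < 1 or k > len(training):
        raise ValueError("k must be between 1 and the training set size")
    # Sort the training points by distance to the query point.
    by_distance = sorted(training, key=lambda item: math.dist(point, item[0]))
    # Tally the class labels of the k closest points.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```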
[0208] Leaf
[0209] A leaf, as used herein, is the terminal group in a
classification or decision tree.
[0210] Label
[0211] "Label", as used herein, means a composition detectable by
spectroscopic, photochemical, biochemical, immunochemical,
chemical, or other physical means. For example, useful labels
include .sup.32P, fluorescent dyes, electron-dense reagents,
enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin,
or haptens and other entities which can be made detectable. A label
may be incorporated into nucleic acids and proteins at any
position.
[0212] Logistic Regression
[0213] Logistic regression is part of a category of statistical
models called generalized linear models. Logistic regression can
allow one to predict a discrete outcome, such as group membership,
from a set of variables that may be continuous, discrete,
dichotomous, or a mix of any of these. The dependent or response
variable can be dichotomous, for example, one of two possible types
of cancer. Logistic regression models the natural log of the odds,
i.e., the ratio of the probability of belonging to the first
group (P) over the probability of belonging to the second group
(1-P), as a linear combination of the different expression levels
(in log-space). The logistic regression output can be used as a
classifier by prescribing that a case or sample will be classified
into the first type if P is greater than 0.5 or 50%. Alternatively,
the calculated probability P can be used as a variable in other
contexts, such as a 1D or 2D threshold classifier.
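The relationship described above, ln(P/(1-P)) equal to a linear combination of log-space expression levels, may be sketched as follows. The coefficient and intercept values would come from a fitted model; here they are hypothetical inputs, as are the function names.

```python
import math

def logistic_probability(log_expression, coefficients, intercept):
    """Return P, the probability of belonging to the first group.

    The natural log of the odds P/(1-P) is modeled as
    intercept + sum(coefficient_i * log_expression_i).
    """
    log_odds = intercept + sum(
        c * x for c, x in zip(coefficients, log_expression)
    )
    # Invert the log-odds to recover the probability P.
    return 1.0 / (1.0 + math.exp(-log_odds))

def classify_by_probability(p, cutoff=0.5):
    """Classify into the first type if P is greater than the cutoff."""
    return "first" if p > cutoff else "second"
```

The returned probability may be used directly with a 0.5 cutoff, or passed on as a variable to another classifier.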
[0214] Metastasis
[0215] "Metastasis" means the process by which cancer spreads from
the place at which it first arose as a primary tumor to other
locations in the body. The metastatic progression of a primary
tumor reflects multiple stages, including dissociation from
neighboring primary tumor cells, survival in the circulation, and
growth in a secondary location.
[0216] Neuroendocrine Tumors
[0217] "Neuroendocrine tumors" is meant to include all types of
tumors of neuroendocrine origin. Examples of neuroendocrine
tumors include, but are not limited to, lung small cell carcinoma,
lung carcinoid, gastrointestinal tract carcinoid, pancreatic islet
cell tumor and medullary thyroid carcinoma.
[0218] Node
[0219] A "node" is a decision point in a classification (i.e.,
decision) tree. A node may also be a point in a neural net that
combines input from other nodes and produces an output through
application of an activation function.
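The terms "node" and "leaf" as used for a classification tree may be illustrated by the following minimal Python sketch. The marker names, threshold values, and class labels are hypothetical placeholders and do not correspond to any node of the classifier described herein.

```python
from dataclasses import dataclass
from typing import Callable, Mapping, Union

@dataclass
class Leaf:
    """Terminal group of the tree: carries the final class label."""
    label: str

@dataclass
class Node:
    """Decision point: routes a sample to one of two subtrees."""
    decide: Callable[[Mapping[str, float]], bool]
    if_true: "Tree"
    if_false: "Tree"

Tree = Union[Node, Leaf]

def classify_with_tree(tree: Tree, sample: Mapping[str, float]) -> str:
    # Walk from the root through decision nodes until a leaf is reached.
    while isinstance(tree, Node):
        tree = tree.if_true if tree.decide(sample) else tree.if_false
    return tree.label

# Hypothetical two-node tree with three leaves.
example_tree = Node(
    decide=lambda s: s["marker_1"] > 5.0,
    if_true=Leaf("class 1"),
    if_false=Node(
        decide=lambda s: s["marker_2"] > 3.0,
        if_true=Leaf("class 2"),
        if_false=Leaf("class 3"),
    ),
)
```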
[0220] Nucleic Acid
[0221] "Nucleic acid" or "oligonucleotide" or "polynucleotide", as
used herein, mean at least two nucleotides covalently linked
together. The depiction of a single strand also defines the
sequence of the complementary strand. Thus, a nucleic acid also
encompasses the complementary strand of a depicted single strand.
Many variants of a nucleic acid may be used for the same purpose as
a given nucleic acid. Thus, a nucleic acid also encompasses
substantially identical nucleic acids and complements thereof. A
single strand provides a probe that may hybridize to a target
sequence under stringent hybridization conditions. Thus, a nucleic
acid also encompasses a probe that hybridizes under stringent
hybridization conditions.
[0222] Nucleic acids may be single-stranded or double-stranded, or
may contain portions of both double-stranded and single-stranded
sequences. The nucleic acid may be DNA, both genomic and cDNA, RNA,
or a hybrid, where the nucleic acid may contain combinations of
deoxyribo- and ribo-nucleotides, and combinations of bases
including uracil, adenine, thymine, cytosine, guanine, inosine,
xanthine, hypoxanthine, isocytosine and isoguanine. Nucleic acids
may be obtained by chemical synthesis methods or by recombinant
methods.
[0223] A nucleic acid will generally contain phosphodiester bonds,
although nucleic acid analogs may be included that may have at
least one different linkage, e.g., phosphoramidate,
phosphorothioate, phosphorodithioate, or O-methylphosphoroamidite
linkages and peptide nucleic acid backbones and linkages. Other
analog nucleic acids include those with positive backbones,
non-ionic backbones and non-ribose backbones, including those
described in U.S. Pat. Nos. 5,235,033 and 5,034,506, which are
incorporated herein by reference. Nucleic acids containing one or
more non-naturally occurring or modified nucleotides are also
included within one definition of nucleic acids. The modified
nucleotide analog may be located for example at the 5'-end and/or
the 3'-end of the nucleic acid molecule. Representative examples of
nucleotide analogs may be selected from sugar- or backbone-modified
ribonucleotides. It should be noted, however, that
nucleobase-modified ribonucleotides, i.e., ribonucleotides
containing a non-naturally occurring nucleobase instead of a
naturally occurring nucleobase, are also suitable, such as uridine
or cytidine modified at the 5-position, e.g., 5-(2-amino) propyl
uridine, 5-bromo uridine; adenosine and guanosine modified at the
8-position, e.g., 8-bromo guanosine; deaza nucleotides, e.g.,
7-deaza-adenosine; and O- and N-alkylated nucleotides, e.g.,
N6-methyl adenosine. The 2'-OH-group may be replaced by a group selected from
H, OR, R, halo, SH, SR, NH.sub.2, NHR, NR.sub.2 or CN, wherein R is
C1-C6 alkyl, alkenyl or alkynyl and halo is F, Cl, Br or I.
Modified nucleotides also include nucleotides conjugated with
cholesterol through, e.g., a hydroxyprolinol linkage as described
in Krutzfeldt et al., Nature 2005; 438:685-689, Soutschek et al.,
Nature 2004; 432:173-178, and U.S. Patent Publication No.
20050107325, which are incorporated herein by reference. Additional
modified nucleotides and nucleic acids are described in U.S. Patent
Publication No. 20050182005, which is incorporated herein by
reference. Modifications of the ribose-phosphate backbone may be
done for a variety of reasons, e.g., to increase the stability and
half-life of such molecules in physiological environments, to
enhance diffusion across cell membranes, or as probes on a biochip.
The backbone modification may also enhance resistance to
degradation, such as in the harsh endocytic environment of cells.
The backbone modification may also reduce nucleic acid clearance by
hepatocytes, such as in the liver and kidney. Mixtures of naturally
occurring nucleic acids and analogs may be made; alternatively,
mixtures of different nucleic acid analogs may be made.
[0224] Probe
[0225] "Probe", as used herein, means an oligonucleotide capable of
binding to a target nucleic acid of complementary sequence through
one or more types of chemical bonds, usually through complementary
base pairing, usually through hydrogen bond formation. Probes may
bind target sequences lacking complete complementarity with the
probe sequence depending upon the stringency of the hybridization
conditions. There may be any number of base pair mismatches which
will interfere with hybridization between the target sequence and
the single-stranded nucleic acids described herein. However, if the
number of mutations is so great that no hybridization can occur
under even the least stringent of hybridization conditions, the
sequence is not a complementary target sequence. A probe may be
single-stranded or partially single- and partially double-stranded.
The strandedness of the probe is dictated by the structure,
composition, and properties of the target sequence. Probes may be
directly labeled or indirectly labeled such as with biotin to which
a streptavidin complex may later bind.
[0226] Reference Value
[0227] As used herein, the term "reference value" or "reference
expression profile" refers to a criterion expression value to which
measured values are compared in order to identify a specific
cancer. The reference value may be based on the abundance of the
nucleic acids, or may be based on a combined metric score
thereof.
[0228] In preferred embodiments the reference value is determined
from statistical analysis of studies that compare microRNA
expression with known clinical outcomes.
[0229] Sarcoma
[0230] Sarcoma is meant to include all types of tumors of sarcoma
origin. Examples of sarcoma tumors include, but are not limited to,
gastrointestinal stromal tumor (GIST), Ewing sarcoma,
chondrosarcoma, malignant fibrous histiocytoma (MFH) or
fibrosarcoma, osteosarcoma, rhabdomyosarcoma, synovial sarcoma and
liposarcoma.
[0231] Sensitivity
[0232] "Sensitivity", as used herein, may mean a statistical
measure of how well a binary classification test correctly
identifies a condition, for example, how frequently it correctly
classifies a cancer into the correct class out of two possible
classes. The sensitivity for class A is the proportion of cases
that are determined to belong to class "A" by the test out of the
cases that are in class "A", as determined by some absolute or gold
standard.
[0233] Specificity
[0234] "Specificity", as used herein, may mean a statistical
measure of how well a binary classification test correctly
identifies a condition, for example, how frequently it correctly
classifies a cancer into the correct class out of two possible
classes. The specificity for class A is the proportion of cases
that are determined to belong to class "not A" by the test out of
the cases that are in class "not A", as determined by some absolute
or gold standard.
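The two measures defined above may be computed as in the following sketch, in which the gold-standard labels and the test results are supplied as parallel lists. The function name and the class labels "A" and "B" are illustrative assumptions.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="A"):
    """Return (sensitivity, specificity) for the class `positive`.

    Sensitivity: cases called `positive` out of all cases truly `positive`.
    Specificity: cases called not-`positive` out of all cases truly
    not-`positive`. True labels are taken from the gold standard.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the gold standard")
    return tp / (tp + fn), tn / (tn + fp)
```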
[0235] Stringent Hybridization Conditions
[0236] "Stringent hybridization conditions", as used herein, mean
conditions under which a first nucleic acid sequence (e.g., probe)
will hybridize to a second nucleic acid sequence (e.g., target),
such as in a complex mixture of nucleic acids. Stringent conditions
are sequence-dependent and will be different in different
circumstances. Stringent conditions may be selected to be about
5-10.degree. C. lower than the thermal melting point (T.sub.m) for
the specific sequence at a defined ionic strength and pH. The T.sub.m
may be the temperature (under defined ionic strength, pH, and
nucleic acid concentration) at which 50% of the probes complementary to
the target hybridize to the target sequence at equilibrium (as the
target sequences are present in excess, at T.sub.m, 50% of the
probes are occupied at equilibrium). Stringent conditions may be
those in which the salt concentration is less than about 1.0 M
sodium ion, such as about 0.01-1.0 M sodium ion concentration (or
other salts) at pH 7.0 to 8.3 and the temperature is at least about
30.degree. C. for short probes (e.g., about 10-50 nucleotides) and
at least about 60.degree. C. for long probes (e.g., greater than
about 50 nucleotides). Stringent conditions may also be achieved
with the addition of destabilizing agents such as formamide. For
selective or specific hybridization, a positive signal may be at
least 2 to 10 times background hybridization. Exemplary stringent
hybridization conditions include the following: 50% formamide,
5.times.SSC, and 1% SDS, incubating at 42.degree. C., or,
5.times.SSC, 1% SDS, incubating at 65.degree. C., with wash in
0.2.times.SSC, and 0.1% SDS at 65.degree. C.
[0237] Substantially Complementary
[0238] "Substantially complementary", as used herein, means that a
first sequence is at least 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%,
97%, 98% or 99% identical to the complement of a second sequence
over a region of 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85,
90, 95, 100 or more nucleotides, or that the two sequences
hybridize under stringent hybridization conditions.
[0239] Substantially Identical
[0240] "Substantially identical", as used herein, means that a
first and a second sequence are at least 60%, 65%, 70%, 75%, 80%,
85%, 90%, 95%, 97%, 98% or 99% identical over a region of 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 35,
40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100 or more
nucleotides or amino acids, or with respect to nucleic acids, if
the first sequence is substantially complementary to the complement
of the second sequence.
[0241] Subject
[0242] As used herein, the term "subject" refers to a mammal,
including both human and other mammals. The methods of the present
invention are preferably applied to human subjects.
[0243] Target Nucleic Acid
[0244] "Target nucleic acid", as used herein, means a nucleic acid
or variant thereof that may be bound by another nucleic acid. A
target nucleic acid may be a DNA sequence. The target nucleic acid
may be RNA. The target nucleic acid may comprise a mRNA, tRNA,
shRNA, siRNA or Piwi-interacting RNA, or a pri-miRNA, pre-miRNA,
miRNA, or anti-miRNA.
[0245] The target nucleic acid may comprise a target miRNA binding
site or a variant thereof. One or more probes may bind the target
nucleic acid. The target binding site may comprise 5-100 or 10-60
nucleotides. The target binding site may comprise a total of 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30-40, 40-50, 50-60, 61, 62 or 63 nucleotides.
The target site sequence may comprise at least 5 nucleotides of the
sequence of a target miRNA binding site disclosed in U.S. patent
application Ser. Nos. 11/384,049, 11/418,870 or 11/429,720, the
contents of which are incorporated herein.
[0246] 1D/2D Threshold Classifier
[0247] "1D/2D threshold classifier", as used herein, may mean an
algorithm for classifying a case or sample such as a cancer sample
into one of two possible types such as two types of cancer. For a
1D threshold classifier, the decision is based on one variable and
one predetermined threshold value; the sample is assigned to one
class if the variable exceeds the threshold and to the other class
if the variable is less than the threshold. A 2D threshold
classifier is an algorithm for classifying into one of two types
based on the values of two variables. A threshold may be calculated
as a function (usually a continuous or even a monotonic function)
of the first variable; the decision is then reached by comparing
the second variable to the calculated threshold, similar to the 1D
threshold classifier.
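A minimal sketch of the 1D and 2D threshold classifiers described above is given below. The function names, the class labels, and the example threshold function are hypothetical; assigning a value exactly equal to the threshold to the second class is an assumption of this sketch, since the definition above does not address equality.

```python
from typing import Callable

def classify_1d(value: float, threshold: float,
                above: str = "type 1", below: str = "type 2") -> str:
    """One variable compared against one predetermined threshold."""
    return above if value > threshold else below

def classify_2d(x: float, y: float,
                threshold_fn: Callable[[float], float],
                above: str = "type 1", below: str = "type 2") -> str:
    """Two variables: the threshold for y is calculated as a function of x,
    and the decision is then reached as in the 1D case."""
    return classify_1d(y, threshold_fn(x), above, below)
```

For instance, with the hypothetical threshold function 0.5*x + 2.0 and x = 4.0, the calculated threshold is 4.0, so a sample with y = 5.0 is assigned to the first type and a sample with y = 3.0 to the second.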
[0248] Tissue Sample
[0249] As used herein, a tissue sample is tissue obtained from a
tissue biopsy using methods well known to those of ordinary skill
in the related medical arts. The phrase "suspected of being
cancerous", as used herein, means a cancer tissue sample believed
by one of ordinary skill in the medical arts to contain cancerous
cells. Methods for obtaining the sample from the biopsy include
gross apportioning of a mass, microdissection, laser-based
microdissection, or other art-known cell-separation methods.
[0250] Tumor
[0251] "Tumor", as used herein, refers to all neoplastic cell
growth and proliferation, whether malignant or benign, and all
pre-cancerous and cancerous cells and tissues.
[0252] Variant
[0253] "Variant", as used herein, referring to a nucleic acid means
(i) a portion of a referenced nucleotide sequence; (ii) the
complement of a referenced nucleotide sequence or portion thereof;
(iii) a nucleic acid that is substantially identical to a
referenced nucleic acid or the complement thereof; or (iv) a
nucleic acid that hybridizes under stringent conditions to the
referenced nucleic acid, complement thereof, or a sequence
substantially identical thereto.
[0254] Wild Type
[0255] As used herein, the term "wild-type" sequence refers to a
coding, a non-coding or an interface sequence which is an allelic
form of sequence that performs the natural or normal function for
that sequence. Wild-type sequences include multiple allelic forms
of a cognate sequence, for example, multiple alleles of a wild type
sequence may encode silent or conservative changes to the protein
sequence that a coding sequence encodes.
[0256] The present invention employs miRNAs for the identification,
classification and diagnosis of specific cancers and the
identification of their tissues of origin.
[0257] 1. microRNA Processing
[0258] A gene coding for microRNA (miRNA) may be transcribed
leading to production of a miRNA primary transcript known as the
pri-miRNA. The pri-miRNA may comprise a hairpin with a stem and
loop structure. The stem of the hairpin may comprise mismatched
bases. The pri-miRNA may comprise several hairpins in a
polycistronic structure.
[0259] The hairpin structure of the pri-miRNA may be recognized by
Drosha, which is an RNase III endonuclease. Drosha may recognize
terminal loops in the pri-miRNA and cleave approximately two
helical turns into the stem to produce a 60-70 nt precursor known
as the pre-miRNA. Drosha may cleave the pri-miRNA with a staggered
cut typical of RNase III endonucleases yielding a pre-miRNA stem
loop with a 5' phosphate and .about.2 nucleotide 3' overhang.
Approximately one helical turn of stem (.about.10 nucleotides)
extending beyond the Drosha cleavage site may be essential for
efficient processing. The pre-miRNA may then be actively
transported from the nucleus to the cytoplasm by Ran-GTP and the
export receptor Exportin-5.
[0260] The pre-miRNA may be recognized by Dicer, which is also an
RNase III endonuclease. Dicer may recognize the double-stranded
stem of the pre-miRNA. Dicer may also cut off the terminal loop two
helical turns away from the base of the stem loop, leaving an
additional 5' phosphate and a .about.2 nucleotide 3' overhang. The
resulting siRNA-like duplex, which may comprise mismatches,
comprises the mature miRNA and a similar-sized fragment known as
the miRNA*. The miRNA and miRNA* may be derived from opposing arms
of the pri-miRNA and pre-miRNA. MiRNA* sequences may be found in
libraries of cloned miRNAs, but typically at lower frequency than
the miRNAs.
[0261] Although initially present as a double-stranded species with
miRNA*, the miRNA may eventually become incorporated as a
single-stranded RNA into a ribonucleoprotein complex known as the
RNA-induced silencing complex (RISC). Various proteins can form the
RISC, which can lead to variability in specificity for miRNA/miRNA*
duplexes, binding site of the target gene, activity of miRNA
(repress or activate), and which strand of the miRNA/miRNA* duplex
is loaded into the RISC.
[0262] When the miRNA strand of the miRNA:miRNA* duplex is loaded
into the RISC, the miRNA* may be removed and degraded. The strand
of the miRNA:miRNA* duplex that is loaded into the RISC may be the
strand whose 5' end is less tightly paired. In cases where both
ends of the miRNA:miRNA* have roughly equivalent 5' pairing, both
miRNA and miRNA* may have gene silencing activity.
[0263] The RISC may identify target nucleic acids based on high
levels of complementarity between the miRNA and the mRNA,
especially by nucleotides 2-7 of the miRNA. Only one case has been
reported in animals where the interaction between the miRNA and its
target was along the entire length of the miRNA. This was shown for
miR-196 and Hox B8 and it was further shown that miR-196 mediates
the cleavage of the Hox B8 mRNA (Yekta et al. Science 2004;
304:594-596). Otherwise, such interactions are known only in plants
(Bartel & Bartel 2003; 132:709-717).
[0264] A number of studies have looked at the base-pairing
requirement between miRNA and its mRNA target for achieving
efficient inhibition of translation (reviewed by Bartel 2004;
116:281-297). In mammalian cells, the first 8 nucleotides of the
miRNA may be important (Doench & Sharp Genes Dev 2004;
18:504-511). However, other parts of the microRNA may also
participate in mRNA binding. Moreover, sufficient base pairing at
the 3' can compensate for insufficient pairing at the 5' (Brennecke
et al., PLoS Biol 2005; 3:e85). Computational studies analyzing
miRNA binding on whole genomes have suggested a specific role for
bases 2-7 at the 5' of the miRNA in target binding, but the role of
the first nucleotide, usually found to be "A", was also recognized
(Lewis et al. Cell 2005; 120:15-20). Similarly, nucleotides 1-7 or
2-8 were used to identify and validate targets by Krek et al. (Nat
Genet 2005; 37:495-500).
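The role of nucleotides 2-7 (or 1-7, or 2-8) of the miRNA in target recognition may be illustrated by a simple search for perfect matches to the corresponding region in an mRNA. The following sketch is not the method of any of the cited studies; the function name is hypothetical, only the bases A, C, G and U (or T) are handled, and only exact Watson-Crick complementarity is considered.

```python
from typing import List

def seed_sites(mirna: str, mrna: str, start: int = 2, end: int = 7) -> List[int]:
    """Return 0-based mRNA positions that perfectly pair with the miRNA seed.

    `start` and `end` are 1-based, inclusive miRNA positions (2-7 by
    default). The site searched for is the reverse complement of the seed,
    because the miRNA and mRNA pair in antiparallel orientation.
    """
    complement = {"A": "U", "U": "A", "G": "C", "C": "G"}
    seed = mirna.upper().replace("T", "U")[start - 1:end]
    site = "".join(complement[base] for base in reversed(seed))
    mrna = mrna.upper().replace("T", "U")
    return [
        i
        for i in range(len(mrna) - len(site) + 1)
        if mrna[i:i + len(site)] == site
    ]
```

Passing start=1, end=7 or start=2, end=8 reproduces the alternative windows mentioned above.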
[0265] The target sites in the mRNA may be in the 5' UTR, the 3'
UTR or in the coding region. Interestingly, multiple miRNAs may
regulate the same mRNA target by recognizing the same or multiple
sites. The presence of multiple miRNA binding sites in most
genetically identified targets may indicate that the cooperative
action of multiple RISCs provides the most efficient translational
inhibition.
[0266] miRNAs may direct the RISC to down-regulate gene expression
by either of two mechanisms: mRNA cleavage or translational
repression. The miRNA may specify cleavage of the mRNA if the mRNA
has a certain degree of complementarity to the miRNA. When a miRNA
guides cleavage, the cut may be between the nucleotides pairing to
residues 10 and 11 of the miRNA. Alternatively, the miRNA may
repress translation if the mRNA does not have the requisite degree
of complementarity to the miRNA. Translational repression may be
more prevalent in animals since animals may have a lower degree of
complementarity between the miRNA and binding site.
[0267] It should be noted that there may be variability in the 5'
and 3' ends of any pair of miRNA and miRNA*. This variability may
be due to variability in the enzymatic processing of Drosha and
Dicer with respect to the site of cleavage. Variability at the 5'
and 3' ends of miRNA and miRNA* may also be due to mismatches in
the stem structures of the pri-miRNA and pre-miRNA. The mismatches
of the stem strands may lead to a population of different hairpin
structures. Variability in the stem structures may also lead to
variability in the products of cleavage by Drosha and Dicer.
[0268] 2. Nucleic Acids
[0269] Nucleic acids are provided herein. The nucleic acids
comprise the sequences of SEQ ID NOS: 1-390 or variants thereof.
The variant may be a complement of the referenced nucleotide
sequence. The variant may also be a nucleotide sequence that is
substantially identical to the referenced nucleotide sequence or
the complement thereof. The variant may also be a nucleotide
sequence which hybridizes under stringent conditions to the
referenced nucleotide sequence, complements thereof, or nucleotide
sequences substantially identical thereto.
[0270] The nucleic acid may have a length of from about 10 to about
250 nucleotides. The nucleic acid may have a length of at least 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27,
28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 125, 150, 175, 200
or 250 nucleotides. The nucleic acid may be synthesized or
expressed in a cell (in vitro or in vivo) using a synthetic gene
described herein. The nucleic acid may be synthesized as a
single-strand molecule and hybridized to a substantially
complementary nucleic acid to form a duplex. The nucleic acid may
be introduced to a cell, tissue or organ in a single- or
double-stranded form or capable of being expressed by a synthetic
gene using methods well known to those skilled in the art,
including as described in U.S. Pat. No. 6,506,559, which is
incorporated herein by reference.
[0271] SEQ ID NOs 1-34 are in accordance with Sanger database
version 10; SEQ ID NOs 35-390 are in accordance with Sanger
database version 11.
TABLE-US-00001 TABLE 1 SEQ ID NOS of sequences used in the invention
miR name | miR SEQ ID NO | hairpin SEQ ID NO
hsa-let-7c | 1 | 70
hsa-let-7e | 2, 156 | 71
hsa-miR-100 | 3 | 72
hsa-miR-10a | 4 | 73
hsa-miR-10b | 5 | 74
hsa-miR-122 | 6 | 75
hsa-miR-125a-5p | 7 | 76
hsa-miR-125b | 8 | 77, 78
hsa-miR-126 | 9 | 79
hsa-miR-130a | 10 | 80
hsa-miR-138 | 11 | 81, 82
hsa-miR-140-3p | 12 | 83
hsa-miR-141 | 13 | 84
hsa-miR-143 | 14 | 85
hsa-miR-145 | 15 | 86
hsa-miR-146a | 16 | 87
hsa-miR-146b-5p | 17 | 88
hsa-miR-148a | 18 | 89
hsa-miR-149 | 19 | 90
hsa-miR-17 | 20 | 91
hsa-miR-181a | 21 | 92, 93
hsa-miR-181a* | 22 | 92, 93
hsa-miR-185 | 23 | 94
hsa-miR-191 | 24 | 95
hsa-miR-193a-3p | 25 | 96
hsa-miR-193a-5p | 26 | 96
hsa-miR-194 | 27 | 97, 98
hsa-miR-200a | 28 | 99
hsa-miR-200b | 29 | 100
hsa-miR-200c | 30 | 101
hsa-miR-202 | 31 | 102
hsa-miR-205 | 32 | 103
hsa-miR-206 | 33 | 104
hsa-miR-21 | 34 | 105
hsa-miR-21* | 35 | 105
hsa-miR-210 | 36 | 106
hsa-miR-214 | 37 | 107
hsa-miR-214* | 38 | 107
hsa-miR-22 | 39 | 108
hsa-miR-222 | 40 | 109
hsa-miR-223 | 41 | 110
hsa-miR-224 | 42 | 111
hsa-miR-29a | 43 | 112
hsa-miR-29c | 44, 191 | 113
hsa-miR-29c* | 45 | 113
hsa-miR-30a | 46 | 114
hsa-miR-30d | 47 | 115
hsa-miR-30e | 48 | 116
hsa-miR-31 | 49 | 117
hsa-miR-342-3p | 50 | 118
hsa-miR-345 | 51 | 119
hsa-miR-34a | 52 | 120
hsa-miR-34c-5p | 53 | 121
hsa-miR-361-5p | 54 | 122
hsa-miR-372 | 55 | 123
hsa-miR-375 | 56 | 124
hsa-miR-378 | 57, 202 | 125
hsa-miR-455-5p | 58 | 126
hsa-miR-487b | 59 | 127
hsa-miR-497 | 60, 208 | 128
hsa-miR-509-3p | 61 | 129, 130, 131
hsa-miR-516a-5p | 62, 211 | 132, 133
hsa-miR-574-5p | 63 | 134
hsa-miR-652 | 64 | 135
hsa-miR-7 | 65 | 136, 137, 138
hsa-miR-9* | 66 | 139, 140, 141
hsa-miR-92a | 67 | 142, 143
hsa-miR-92b | 68 | 144
hsa-miR-934 | 69 | 145
hsa-miR-1201 | 146 | 149
hsa-miR-221 | 147 | 150
hsa-miR-93 | 148 | 151
hsa-miR-182 | 152 |
hsa-let-7d | 153 |
hsa-miR-181b | 154 |
hsa-miR-127-3p | 155 |
hsa-let-7i | 157 |
hsa-miR-106a | 158 |
hsa-miR-124 | 159 |
hsa-miR-1248 | 160 |
hsa-miR-128 | 161 |
hsa-miR-129-3p | 162 |
hsa-miR-1323 | 163 |
hsa-miR-142-5p | 164 |
hsa-miR-143* | 165 |
hsa-miR-146b-3p | 166 |
hsa-miR-149* | 167 |
hsa-miR-150 | 168 |
hsa-miR-152 | 169 |
hsa-miR-155 | 170 |
hsa-miR-15a | 171 |
hsa-miR-15b | 172 |
hsa-miR-181c | 173 |
hsa-miR-181d | 174 |
hsa-miR-183 | 175 |
hsa-miR-18a | 176 |
hsa-miR-192 | 177 |
hsa-miR-193b | 178 |
hsa-miR-195 | 179 |
hsa-miR-1973 | 180 |
hsa-miR-199a-3p | 181 |
hsa-miR-199a-5p | 182 |
hsa-miR-199b-5p | 183 |
hsa-miR-203 | 184 |
hsa-miR-205* | 185 |
hsa-miR-20a | 186 |
hsa-miR-219-2-3p | 187 |
hsa-miR-25 | 188 |
hsa-miR-27b | 189 |
hsa-miR-29b | 190 |
hsa-miR-302a | 192 |
hsa-miR-302a* | 193 |
hsa-miR-302d | 194 |
hsa-miR-30a* | 195 |
hsa-miR-30c | 196 |
hsa-miR-331-3p | 197 |
hsa-miR-342-5p | 198 |
hsa-miR-363 | 199 |
hsa-miR-371-3p | 200 |
hsa-miR-371-5p | 201 |
hsa-miR-422a | 203 |
hsa-miR-425 | 204 |
hsa-miR-451 | 205 |
hsa-miR-455-3p | 206 |
hsa-miR-486-5p | 207 |
hsa-miR-498 | 209 |
hsa-miR-512-5p | 210 |
hsa-miR-516b | 212 |
hsa-miR-517a | 213 |
hsa-miR-517c | 214 |
hsa-miR-518a-3p | 215 |
hsa-miR-518e | 216 |
hsa-miR-518f* | 217 |
hsa-miR-519a | 218 |
hsa-miR-519d | 219 |
hsa-miR-520a-5p | 220 |
hsa-miR-520c-3p | 221 |
hsa-miR-520d-5p | 222 |
hsa-miR-524-5p | 223 |
hsa-miR-527 | 224 |
hsa-miR-551b | 225 |
hsa-miR-625 | 226 |
hsa-miR-767-5p | 227 |
hsa-miR-886-3p | 228 |
hsa-miR-9 | 229 |
hsa-miR-886-5p | 230 |
hsa-miR-99a | 231 |
hsa-miR-99a* | 232 |
hsa-miR-373 | 233 |
hsa-miR-1977 | 234 |
hsa-miR-1978 | 235 |
MID-00689 | 236 |
MID-15684 | 237, 369 |
MID-15867 | 238 |
MID-15907 | 239 |
MID-15965 | 240 |
MID-16318 | 241 |
MID-16489 | 242 |
MID-16869 | 243 |
MID-17144 | 244 |
MID-18336 | 245 |
MID-18422 | 246 |
MID-19340 | 247 |
MID-19533 | 248 |
MID-20524 | 249 |
MID-20703 | 250 |
MID-21271 | 251 |
MID-22664 | 252 |
MID-23256 | 253 |
MID-23291 | 254 |
MID-23794 | 255 |
MID-00405 | 390 |
hsa-let-7a | 256 |
hsa-let-7b | 257 |
hsa-let-7f | 258 |
hsa-let-7g | 259 |
hsa-miR-106b | 260 |
hsa-miR-1180 | 261 |
hsa-miR-127-5p | 262 |
hsa-miR-129* | 263 |
hsa-miR-129-5p | 264 |
hsa-miR-130b | 265 |
hsa-miR-132 | 266 |
hsa-miR-133a | 267 |
hsa-miR-133b | 268 |
hsa-miR-134 | 269 |
hsa-miR-139-5p | 270 |
hsa-miR-140-5p | 271 |
hsa-miR-145* | 272 |
hsa-miR-148b | 273 |
hsa-miR-151-3p | 274 |
hsa-miR-154 | 275 |
hsa-miR-154* | 276 |
hsa-miR-17* | 277 |
hsa-miR-181a-2* | 278 |
hsa-miR-1826 | 279 |
hsa-miR-187 | 280 |
hsa-miR-188-5p | 281 |
hsa-miR-196a | 282 |
hsa-miR-1979 | 283 |
hsa-miR-19b | 284 |
hsa-miR-20b | 285 |
hsa-miR-216a | 286 |
hsa-miR-216b | 287 |
hsa-miR-217 | 288 |
hsa-miR-22* | 289 |
hsa-miR-221* | 290 |
hsa-miR-222* | 291 |
hsa-miR-23a | 292 |
hsa-miR-23b | 293 |
hsa-miR-24 | 294 |
hsa-miR-26a | 295 |
hsa-miR-26b | 296 |
hsa-miR-27a | 297 |
hsa-miR-28-3p | 298 |
hsa-miR-296-5p | 299 |
hsa-miR-299-3p | 300 |
hsa-miR-29b-2* | 301 |
hsa-miR-301a | 302 |
hsa-miR-30b | 303 |
hsa-miR-30e* | 304 |
hsa-miR-31* | 305 |
hsa-miR-323-3p | 306 |
hsa-miR-324-5p | 307 |
hsa-miR-328 | 308 |
hsa-miR-329 | 309 |
hsa-miR-330-3p | 310 |
hsa-miR-335 | 311 |
hsa-miR-337-5p | 312 |
hsa-miR-338-3p | 313 |
hsa-miR-361-3p | 314 |
hsa-miR-362-3p | 315 |
hsa-miR-362-5p | 316 |
hsa-miR-369-5p | 317 |
hsa-miR-370 | 318 |
hsa-miR-376a | 319 |
hsa-miR-376c | 320 |
hsa-miR-377* | 321 |
hsa-miR-379 | 322 |
hsa-miR-381 | 323 |
hsa-miR-382 | 324 |
hsa-miR-409-3p | 325 |
hsa-miR-409-5p | 326 |
hsa-miR-410 | 327 |
hsa-miR-411 | 328 |
hsa-miR-425* | 329 |
hsa-miR-431* | 330 |
hsa-miR-432 | 331 |
hsa-miR-433 | 332 |
hsa-miR-483-3p | 333 |
hsa-miR-483-5p | 334 |
hsa-miR-485-3p | 335 |
hsa-miR-485-5p | 336 |
hsa-miR-487a | 337 |
hsa-miR-494 | 338 |
hsa-miR-495 | 339 |
hsa-miR-500 | 340 |
hsa-miR-500* | 341 |
hsa-miR-501-3p | 342 |
hsa-miR-502-3p | 343 |
hsa-miR-503 | 344 |
hsa-miR-506 | 345 |
hsa-miR-509-3-5p | 346 |
hsa-miR-513a-5p | 347 |
hsa-miR-532-3p | 348 |
hsa-miR-532-5p | 349 |
hsa-miR-539 | 350 |
hsa-miR-542-5p | 351 |
hsa-miR-543 | 352 |
hsa-miR-598 | 353 |
hsa-miR-612 | 354 |
hsa-miR-654-3p | 355 |
hsa-miR-658 | 356 |
hsa-miR-660 | 357 |
hsa-miR-665 | 358 |
hsa-miR-708 | 359 |
hsa-miR-873 | 360 |
hsa-miR-874 | 361 |
hsa-miR-891a | 362 |
hsa-miR-99b | 363 |
MID-00064 | 364 |
MID-00078 | 365 |
MID-00144 | 366 |
MID-00465 | 367 |
MID-00672 | 368 |
MID-15986 | 370 |
MID-16270 | 371 |
MID-16469 | 372 |
MID-16582 | 373 |
MID-16748 | 374 |
MID-17356 (3651) | 389 |
MID-17375 | 375 |
MID-17576 | 376 |
MID-17866 | 377 |
MID-18307 | 378 |
MID-18395 | 379 |
MID-19898 | 380 |
MID-19962 | 381 |
MID-22331 | 382 |
MID-22912 | 383 |
MID-23017 | 384 |
MID-23168 | 385 |
MID-23178 | 386 |
MID-23751 | 387 |
hsa-miR-423-5p | 388 |
[0272] 3. Nucleic Acid Complexes
[0273] The nucleic acid may further comprise one or more of the
following: a peptide, a protein, an RNA-DNA hybrid, an antibody, an
antibody fragment, a Fab fragment, and an aptamer.
[0274] 4. Pri-miRNA
[0275] The nucleic acid may comprise a sequence of a pri-miRNA or a
variant thereof. The pri-miRNA sequence may comprise from
45-30,000, 50-25,000, 100-20,000, 1,000-1,500 or 80-100
nucleotides. The sequence of the pri-miRNA may comprise a
pre-miRNA, miRNA and miRNA*, as set forth herein, and variants
thereof. The sequence of the pri-miRNA may comprise any of the
sequences of SEQ ID NOS: 1-390 or variants thereof.
[0276] The pri-miRNA may comprise a hairpin structure. The hairpin
may comprise a first and a second nucleic acid sequence that are
substantially complementary. The first and second nucleic acid
sequence may be from 37-50 nucleotides. The first and second
nucleic acid sequence may be separated by a third sequence of from
8-12 nucleotides. The hairpin structure may have a free energy of
less than -25 kcal/mole, as calculated by the Vienna algorithm with
default parameters, as described in Hofacker et al. (Monatshefte f.
Chemie 1994; 125:167-188), the contents of which are incorporated
herein by reference. The hairpin may comprise a terminal loop of
4-20, 8-12 or 10 nucleotides. The pri-miRNA may comprise at least
19% adenosine nucleotides, at least 16% cytosine nucleotides, at
least 23% thymine nucleotides and at least 19% guanine
nucleotides.
[0277] 5. Pre-miRNA
[0278] The nucleic acid may also comprise a sequence of a pre-miRNA
or a variant thereof. The pre-miRNA sequence may comprise from
45-90, 60-80 or 60-70 nucleotides. The sequence of the pre-miRNA
may comprise a miRNA and a miRNA* as set forth herein. The sequence
of the pre-miRNA may also be that of a pri-miRNA excluding from
0-160 nucleotides from the 5' and 3' ends of the pri-miRNA. The
sequence of the pre-miRNA may comprise the sequence of SEQ ID NOS:
1-390 or variants thereof.
[0279] 6. miRNA
[0280] The nucleic acid may also comprise a sequence of a miRNA
(including miRNA*) or a variant thereof. The miRNA sequence may
comprise from 13-33, 18-24 or 21-23 nucleotides. The miRNA may also
comprise a total of at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36, 37, 38, 39 or 40 nucleotides. The sequence of the
miRNA may be the first 13-33 nucleotides of the pre-miRNA. The
sequence of the miRNA may also be the last 13-33 nucleotides of the
pre-miRNA. The sequence of the miRNA may comprise the sequence of
SEQ ID NOS: 1-69, 146-148, 152-390 or variants thereof.
[0281] 7. Probes
[0282] A probe comprising a nucleic acid described herein is also
provided. Probes may be used for screening and diagnostic methods,
as outlined below. The probe may be attached or immobilized to a
solid substrate, such as a biochip.
[0283] The probe may have a length of from 8 to 500, 10 to 100 or
20 to 60 nucleotides. The probe may also have a length of at least
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24,
25, 26, 27, 28, 29, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 120,
140, 160, 180, 200, 220, 240, 260, 280 or 300 nucleotides. The
probe may further comprise a linker sequence of from 10-60
nucleotides. The probe may comprise a nucleic acid that is
complementary to a sequence selected from the group consisting of
SEQ ID NOS: 1-390 or variants thereof.
[0284] 8. Biochip
[0285] A biochip is also provided. The biochip may comprise a solid
substrate comprising an attached probe or plurality of probes
described herein. The probes may be capable of hybridizing to a
target sequence under stringent hybridization conditions. The
probes may be attached at spatially defined addresses on the
substrate. More than one probe per target sequence may be used,
with either overlapping probes or probes to different sections of a
particular target sequence. The probes may be capable of
hybridizing to target sequences associated with a single disorder
appreciated by those in the art. The probes may either be
synthesized first, with subsequent attachment to the biochip, or
may be directly synthesized on the biochip.
[0286] The solid substrate may be a material that may be modified
to contain discrete individual sites appropriate for the attachment
or association of the probes and is amenable to at least one
detection method. Representative examples of substrates include
glass and modified or functionalized glass, plastics (including
acrylics, polystyrene and copolymers of styrene and other
materials, polypropylene, polyethylene, polybutylene,
polyurethanes, Teflon, etc.), polysaccharides, nylon or
nitrocellulose, resins, silica or silica-based materials including
silicon and modified silicon, carbon, metals, inorganic glasses and
plastics. The substrates may allow optical detection without
appreciably fluorescing.
[0287] The substrate may be planar, although other configurations
of substrates may be used as well. For example, probes may be
placed on the inside surface of a tube, for flow-through sample
analysis to minimize sample volume. Similarly, the substrate may be
flexible, such as flexible foam, including closed cell foams made
of particular plastics.
[0288] The biochip and the probe may be derivatized with chemical
functional groups for subsequent attachment of the two. For
example, the biochip may be derivatized with a chemical functional
group including, but not limited to, amino groups, carboxyl groups,
oxo groups or thiol groups. Using these functional groups, the
probes may be attached using functional groups on the probes either
directly or indirectly using a linker. The probes may be attached
to the solid support by either the 5' terminus, 3' terminus, or via
an internal nucleotide.
[0289] The probe may also be attached to the solid support
non-covalently. For example, biotinylated oligonucleotides can be
made, which may bind to surfaces covalently coated with
streptavidin, resulting in attachment. Alternatively, probes may be
synthesized on the surface using techniques such as
photopolymerization and photolithography.
[0290] 9. Diagnostics
[0291] As used herein, the term "diagnosing" refers to classifying
pathology, or a symptom, determining a severity of the pathology
(e.g., grade or stage), monitoring pathology progression,
forecasting an outcome of pathology and/or prospects of
recovery.
[0292] As used herein, the phrase "subject in need thereof" refers
to an animal or human subject who is known to have cancer, at risk
of having cancer (e.g., a genetically predisposed subject, a
subject with medical and/or family history of cancer, a subject who
has been exposed to carcinogens, occupational hazard, environmental
hazard) and/or a subject who exhibits suspicious clinical signs of
cancer (e.g., blood in the stool or melena, unexplained pain,
sweating, unexplained fever, unexplained loss of weight up to
anorexia, changes in bowel habits (constipation and/or diarrhea),
tenesmus (sense of incomplete defecation, for rectal cancer
specifically), anemia and/or general weakness). Additionally or
alternatively, the subject in need thereof can be a healthy human
subject undergoing a routine well-being check up.
[0293] Analyzing presence of malignant or pre-malignant cells can
be effected in vivo or ex vivo, whereby a biological sample (e.g.,
biopsy, blood) is retrieved. Such biopsy samples comprise cells and
may be an incisional or excisional biopsy. Alternatively, the cells
may be retrieved from a complete resection.
[0294] While employing the present teachings, additional
information may be gleaned pertaining to the determination of
treatment regimen, treatment course and/or to the measurement of
the severity of the disease.
[0295] As used herein the phrase "treatment regimen" refers to a
treatment plan that specifies the type of treatment, dosage,
follow-up plans, schedule and/or duration of a treatment provided
to a subject in need thereof (e.g., a subject diagnosed with a
pathology). The selected treatment regimen can be an aggressive one
which is expected to result in the best clinical outcome (e.g.,
complete cure of the pathology) or a more moderate one which may
relieve symptoms of the pathology yet results in incomplete cure of
the pathology. It will be appreciated that in certain cases the
treatment regimen may be associated with some discomfort to the
subject or adverse side effects (e.g., damage to healthy cells or
tissue). The type of treatment can include a surgical intervention
(e.g., removal of lesion, diseased cells, tissue, or organ), a cell
replacement therapy, an administration of a therapeutic drug (e.g.,
receptor agonists, antagonists, hormones, chemotherapy agents) in a
local or a systemic mode, an exposure to radiation therapy using an
external source (e.g., external beam) and/or an internal source
(e.g., brachytherapy) and/or any combination thereof. The dosage,
schedule and duration of treatment can vary, depending on the
severity of pathology and the selected type of treatment, and those
of skill in the art are capable of adjusting the type of treatment
with the dosage, schedule and duration of treatment.
[0296] A method of diagnosis is also provided. The method comprises
detecting an expression level of a specific cancer-associated
nucleic acid in a biological sample. The sample may be derived from
a patient. Diagnosis of a specific cancer state in a patient may
allow for prognosis and selection of therapeutic strategy. Further,
the developmental stage of cells may be classified by determining
temporally expressed specific cancer-associated nucleic acids.
[0297] In situ hybridization of labeled probes to tissue arrays may
be performed. When comparing the fingerprints between individual
samples the skilled artisan can make a diagnosis, a prognosis, or a
prediction based on the findings. It is further understood that the
nucleic acid sequences which indicate the diagnosis may differ from
those which indicate the prognosis and molecular profiling of the
condition of the cells or exosomes may lead to distinctions between
responsive or refractory conditions or may be predictive of
outcomes.
[0298] 10. Kits
[0299] A kit is also provided and may comprise a nucleic acid
described herein together with any or all of the following: assay
reagents, buffers, probes and/or primers, and sterile saline or
another pharmaceutically acceptable emulsion and suspension base.
In addition, the kits may include instructional materials
containing directions (e.g., protocols) for the practice of the
methods described herein. The kit may further comprise a software
package for data analysis of expression profiles.
[0300] For example, the kit may be a kit for the amplification,
detection, identification or quantification of a target nucleic
acid sequence. The kit may comprise a poly (T) primer, a forward
primer, a reverse primer, and a probe.
[0301] Any of the compositions described herein may be comprised in
a kit. In a non-limiting example, reagents for isolating miRNA,
labeling miRNA, and/or evaluating a miRNA population using an array
are included in a kit. The kit may further include reagents for
creating or synthesizing miRNA probes. The kits will thus comprise,
in suitable container means, an enzyme for labeling the miRNA by
incorporating labeled nucleotide or unlabeled nucleotides that are
subsequently labeled. It may also include one or more buffers, such
as reaction buffer, labeling buffer, washing buffer, or a
hybridization buffer, compounds for preparing the miRNA probes,
components for in situ hybridization and components for isolating
miRNA. Other kits of the invention may include components for
making a nucleic acid array comprising miRNA, and thus may include,
for example, a solid support.
[0302] The following examples are presented in order to more fully
illustrate some embodiments of the invention. They should, in no
way be construed, however, as limiting the broad scope of the
invention.
EXAMPLES
Methods
1. Tumor Samples
[0303] 1300 primary and metastatic tumor FFPE samples were used in the
study. Tumor samples were obtained from several sources.
Institutional review approvals were obtained for all samples in
accordance with each institute's institutional review board or IRB
equivalent guidelines. Samples included primary tumors and
metastases of defined origins, according to clinical records. Tumor
content was at least 50% for >95% of samples, as determined by a
pathologist based on hematoxylin-eosin (H&E) stained
slides.
2. RNA Extraction
[0304] For FFPE samples, total RNA was isolated from seven to ten
10-.mu.m-thick tissue sections using the miR extraction protocol
developed at Rosetta Genomics. Briefly, the sample was incubated a
few times in xylene at 57.degree. C. to remove paraffin excess,
followed by ethanol washes. Proteins were degraded by proteinase K
solution at 45.degree. C. for a few hours. The RNA was extracted
with acid phenol:chloroform followed by ethanol precipitation and
DNAse digestion. Total RNA quantity and quality were checked by
spectrophotometer (Nanodrop ND-1000).
3. miR Array Platform
[0305] Custom microarrays (Agilent Technologies, Santa Clara,
Calif.) were produced by printing DNA oligonucleotide probes to:
982 miR sequences, 17 negative controls, 23 spikes, and 10
positive controls (total of 1032 probes). Each probe, printed in
triplicate, carried a linker of up to 28 nucleotides (nt) at the 3' end
of the microRNA's complement sequence. The 17 negative control probes
were designed using sequences which do not match the genome. Two
groups of positive control probes were designed to hybridize to the miR
array: (i) synthetic small RNAs were spiked into the RNA before
labeling to verify the labeling efficiency; and (ii) probes for
abundant small RNAs (e.g., small nuclear RNAs (U43, U24, Z30, U6,
U48, U44), 5.8s and 5s ribosomal RNA) were spotted on the array to
verify RNA quality.
4. Cy-Dye Labeling of miRNA for miR Array
[0306] One .mu.g of total RNA was labeled by ligation (Thomson et
al. Nature Methods 2004; 1:47-53) of an RNA-linker, p-rCrU-Cy/dye
(Eurogentec or equivalent), to the 3' end with Cy3 or Cy5. The
labeling reaction contained total RNA, spikes (0.1-100 fmoles), 400
ng RNA-linker-dye, 15% DMSO, 1.times. ligase buffer and 20 units of
T4 RNA ligase (NEB), and proceeded at 4.degree. C. for 1 h,
followed by 1 h at 37.degree. C., followed by 4.degree. C. up to 40
min.
[0307] The labeled RNA was mixed with 30 .mu.l hybridization
mixture (mixture of 45 .mu.L of the 10.times.GE Agilent Blocking
Agent and 246 .mu.L of 2.times. Hi-RPM Hybridization). The labeling
mixture was incubated at 100.degree. C. for 5 minutes followed by
ice incubation in a water bath for 5 minutes. Slides were hybridized
at 54.degree. C. for 16-20 hours, followed by two washes. The first
wash was conducted at room temperature with Agilent GE Wash Buffer
1 for 5 min, followed by a second wash with Agilent GE Wash Buffer 2
at 37.degree. C. for 5 min.
[0308] Arrays were scanned using an Agilent Microarray Scanner
Bundle G2565BA (resolution of 5 .mu.m at XDR Hi 100%, XDR Lo 5%).
Array images were analyzed using Feature Extraction 10.7 software
(Agilent).
5. Array Signal Calculation and Normalization
[0309] Triplicate spots were combined to produce one signal for
each probe by taking the logarithmic mean of reliable spots. All
data were log 2-transformed and the analysis was performed in log
2-space. A reference data vector for normalization R was calculated
by taking the median expression level for each probe across all
samples. For each sample data vector S, a 2nd degree polynomial F
was found so as to provide the best fit between the sample data and
the reference data, such that R.apprxeq.F(S). Remote data points
("outliers") were not used for fitting the polynomial F. For each
probe in the sample (element Si in the vector S), the normalized
value (in log-space) Mi was calculated from the initial value Si by
transforming it with the polynomial function F, so that
Mi=F(Si).
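By way of non-limiting illustration, the normalization step may be sketched as follows. This is a minimal sketch assuming NumPy; the function name `normalize_sample`, the rule used to identify remote data points (residuals more than a multiple of the median absolute deviation away from a first-pass fit) and its cutoff are assumptions, since the rule for excluding outliers is not specified above.

```python
import numpy as np

def normalize_sample(S, R, outlier_mads=3.0):
    """Fit a 2nd-degree polynomial F with R ~ F(S) and return M = F(S).

    S: log2 signals of one sample; R: reference vector (per-probe median
    across all samples). The outlier rule below is a hypothetical choice.
    """
    S = np.asarray(S, dtype=float)
    R = np.asarray(R, dtype=float)
    # First-pass fit on all probes.
    coeffs = np.polyfit(S, R, deg=2)
    resid = R - np.polyval(coeffs, S)
    # Exclude remote points (assumed rule: > outlier_mads * MAD from the median).
    centered = np.abs(resid - np.median(resid))
    mad = np.median(centered)
    keep = centered <= outlier_mads * max(mad, 1e-12)
    # Refit without the outliers, then transform every probe.
    coeffs = np.polyfit(S[keep], R[keep], deg=2)
    return np.polyval(coeffs, S)

# Synthetic data: five well-behaved samples and one with a systematic bias.
rng = np.random.default_rng(0)
true = rng.uniform(4, 14, size=200)                      # log2 expression
samples = np.stack([true + rng.normal(0, 0.1, 200) for _ in range(5)])
distorted = 0.9 * true + 1.5 + rng.normal(0, 0.1, 200)
data = np.vstack([samples, distorted])
R = np.median(data, axis=0)                              # reference vector
M = normalize_sample(distorted, R)                       # normalized values
```

In this sketch the normalized vector M lies closer to the reference R than the raw sample does, which is the purpose of the fit.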
6. Logistic Regression
[0310] The aim of a logistic regression model is to use several
features, such as expression levels of several microRNAs, to assign
a probability of belonging to one of two possible groups, such as
two branches of a node in a binary decision-tree. Logistic
regression models the natural log of the odds ratio, i.e., the
ratio of the probability of belonging to the first group, for
example, the left branch in a node of a binary decision-tree (P)
over the probability of belonging to the second group, for example,
the right branch in such a node (1-P), as a linear combination of
the different expression levels (in log-space). The logistic
regression assumes that:
\[ \ln\left(\frac{P}{1-P}\right) = \beta_0 + \sum_{i=1}^{N} \beta_i M_i = \beta_0 + \beta_1 M_1 + \beta_2 M_2 + \cdots , \]
[0311] where .beta..sub.0 is the bias, M.sub.i is the expression
level (normalized, in log 2-space) of the i-th microRNA used in the
decision node, and .beta..sub.i is its corresponding coefficient.
.beta.i>0 indicates that the probability to take the left branch
(P) increases when the expression level of this microRNA (Mi)
increases, and the opposite for .beta.i<0. If a node uses only a
single microRNA (M), then solving for P results in:
\[ P = \frac{e^{\beta_0 + \beta_1 M}}{1 + e^{\beta_0 + \beta_1 M}} . \]
[0312] The regression error on each sample is the difference
between the assigned probability P and the true "probability" of
this sample, i.e., 1 if this sample is in the left branch group and
0 otherwise. The training and optimization of the logistic
regression model calculates the parameters .beta. and the p-values
[for each microRNA by the Wald statistic and for the overall model
by the .chi.2 (chi-square) difference], maximizing the likelihood
of the data given the model and minimizing the total regression
error
\[ \sum_{j \in \text{first group}} (1 - P_j) + \sum_{j \in \text{second group}} P_j . \]
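By way of non-limiting illustration, the node probability and the total regression error may be computed as sketched below. The function names and the coefficient values are hypothetical and were chosen only to make the example concrete.

```python
import math

def node_probability(beta0, betas, M):
    """P (left branch) from ln(P / (1 - P)) = beta0 + sum_i beta_i * M_i."""
    z = beta0 + sum(b * m for b, m in zip(betas, M))
    return 1.0 / (1.0 + math.exp(-z))   # algebraically equal to e^z / (1 + e^z)

def total_regression_error(probs, in_first_group):
    """Sum of (1 - P_j) over first-group samples plus P_j over second-group samples."""
    return sum((1.0 - p) if first else p
               for p, first in zip(probs, in_first_group))

# Hypothetical single-microRNA node: beta0 = -8.0, beta1 = 1.0.
levels = [6.0, 7.5, 9.0, 10.5]          # normalized log2 expression levels
labels = [False, False, True, True]     # True = left branch ("first group")
probs = [node_probability(-8.0, [1.0], [m]) for m in levels]
err = total_regression_error(probs, labels)
```

With a positive coefficient, samples with higher expression receive a higher probability of taking the left branch, consistent with the sign convention stated above.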
[0313] The probability output of the logistic model is here
converted to a binary decision by comparing P to a threshold,
denoted by P.sub.TH, i.e., if P.gtoreq.P.sub.TH then the sample
belongs to the left branch ("first group") and vice versa. Choosing
at each node the branch which has a probability >0.5, i.e.,
using a probability threshold of 0.5, leads to a minimization of
the sum of the regression errors. However, as the goal was the
minimization of the overall number of misclassifications (and not
of their probability), a modification which adjusts the probability
threshold (P.sub.TH) was used in order to minimize the overall
number of mistakes at each node (Table 2). For each node, the
probability threshold P.sub.TH was optimized
such that the number of classification errors was minimized. This
change of probability threshold is equivalent (in terms of
classifications) to a modification of the bias .beta..sub.0, which
may reflect a change in the prior frequencies of the classes. Once
the threshold was chosen .beta..sub.0 was modified such that the
threshold will be shifted back to 0.5. In addition, .beta.0,
.beta.1, .beta.2, . . . were adjusted so that the slope of the log
of the odds ratio function is limited.
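By way of non-limiting illustration, the threshold adjustment and the equivalent change of bias may be sketched as follows. The function names are hypothetical, the search over observed probabilities is one simple way to find a threshold and is an assumption, and the limitation of the slope mentioned above is not shown.

```python
import math

def best_threshold(probs, in_first_group):
    """Return the P_TH minimizing misclassifications (P >= P_TH -> first group).

    Candidate thresholds are the observed probabilities (an assumed search grid).
    """
    def errors(th):
        return sum((p >= th) != first for p, first in zip(probs, in_first_group))
    return min(sorted(set(probs)), key=errors)

def shifted_bias(beta0, p_th):
    """Bias that moves the decision boundary from p_th back to 0.5.

    P >= p_th is equivalent to z >= ln(p_th / (1 - p_th)), so subtracting
    that log-odds from beta0 restores a 0.5 threshold. Requires 0 < p_th < 1.
    """
    return beta0 - math.log(p_th / (1.0 - p_th))

# Hypothetical node outputs and true branch membership.
probs = [0.30, 0.45, 0.55, 0.62, 0.70, 0.90]
labels = [False, True, True, True, True, True]
p_th = best_threshold(probs, labels)        # threshold with the fewest errors
new_beta0 = shifted_bias(0.0, p_th)         # bias after shifting back to 0.5
```

In this toy data a threshold of 0.5 misclassifies one sample, whereas the optimized threshold misclassifies none; the classifications are unchanged when the bias is shifted and 0.5 is used instead.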
7. Stepwise Logistic Regression and Feature Selection
[0314] The original data contain the expression levels of multiple
microRNAs for each sample, i.e., multiple data features. In
training the classifier for each node, only a small subset of these
features was selected and used for optimizing a logistic regression
model. In the initial training this was done using a forward
stepwise scheme. The features were sorted in order of decreasing
log-likelihoods, and the logistic model was started off and
optimized with the first feature. The second feature was then
added, and the model re-optimized. The regression error of the two
models was compared: if the addition of the feature did not provide
a significant advantage (a .chi.2 difference less than 7.88,
p-value of 0.005), the new feature was discarded. Otherwise, the
added feature was kept. Adding a new feature may make a previous
feature redundant (e.g., if they are very highly correlated). To
check for this, the process iteratively checks if the feature with
lowest likelihood can be discarded (without losing .chi.2
difference as above). After ensuring that the current set of
features is compact in this sense, the process continues to test
the next feature in the sorted list, until features are exhausted.
No limitation on the number of features was inserted into the
algorithm.
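By way of non-limiting illustration, the acceptance criterion for an added feature may be sketched as follows. The sketch assumes that the .chi.2 difference is the usual likelihood-ratio statistic, i.e., twice the difference in log-likelihood between the two models, with one degree of freedom per added feature; the function names are hypothetical.

```python
import math

CHI2_CUTOFF = 7.88   # cutoff stated in the text

def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def keep_feature(loglik_without, loglik_with, cutoff=CHI2_CUTOFF):
    """Keep the added feature only if the chi-square difference reaches the cutoff."""
    chi2_diff = 2.0 * (loglik_with - loglik_without)
    return chi2_diff >= cutoff

# The cutoff corresponds to a p-value of about 0.005 for one added feature.
p_value = chi2_sf_1df(CHI2_CUTOFF)
```

The same test can be applied in reverse when checking whether a previously selected feature has become redundant.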
[0315] The stepwise logistic regression method was used on subsets
of the training set samples by re-sampling the training set with
repetition ("bootstrap"), so that each of the 20 runs contained
a somewhat different training set. All the features that took part in
one of the 20 models were collected. A robust set of 1-3 features
per each node was selected by comparing features that were
repeatedly chosen in the bootstrap sets to previous evidence, and
considering their signal strengths and reliability. When using
these selected features to construct the classifier, the stepwise
process was not used and the training optimized the logistic
regression model parameters only.
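By way of non-limiting illustration, the bootstrap step may be sketched as follows. The feature-selection routine is passed in as a function; `toy_selector`, the feature names and the data are hypothetical stand-ins for the stepwise logistic regression and the microRNA expression data, and the final choice of 1-3 features per node also relied on previous evidence and signal strength, which is not modeled here.

```python
import random
from collections import Counter

def bootstrap_feature_counts(samples, select_features, n_runs=20, seed=0):
    """Count how often each feature is selected across bootstrap resamples."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_runs):
        # Resample the training set with repetition, keeping its size.
        resample = [rng.choice(samples) for _ in samples]
        counts.update(set(select_features(resample)))
    return counts

def toy_selector(resample):
    """Hypothetical selector: keep features whose group means differ by > 1."""
    chosen = []
    for name in ("miR-A", "miR-B"):
        left = [s[name] for s in resample if s["left"]]
        right = [s[name] for s in resample if not s["left"]]
        if left and right:
            diff = abs(sum(left) / len(left) - sum(right) / len(right))
            if diff > 1.0:
                chosen.append(name)
    return chosen

training = (
    [{"miR-A": 10.0 + i % 2, "miR-B": 7.0 + (i % 3) * 0.1, "left": True}
     for i in range(10)]
    + [{"miR-A": 6.0 + i % 2, "miR-B": 7.1 + (i % 3) * 0.1, "left": False}
       for i in range(10)]
)
counts = bootstrap_feature_counts(training, toy_selector)
```

Features that are selected in most of the runs are candidates for the robust set; features that appear only occasionally are more likely to reflect the particular resample.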
8. K-Nearest-Neighbors (KNN) Classification Algorithm
[0316] The KNN algorithm (see e.g., Ma et al., Arch Pathol Lab Med
2006; 130:465-73) calculates the distance (Pearson correlation) of
any sample to all samples in the training set, and classifies the
sample by the majority vote of the k samples which are most similar
(k being a parameter of the classifier). The correlation is
calculated on the pre-defined set of microRNAs (the microRNAs that
were used by the decision-tree). KNN algorithms with k=1 to 10 were
compared, and the optimal performer was selected, using k=5. The
KNN was based on comparing the expression of all 65 microRNAs in
each sample to all other samples in the training database.
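By way of non-limiting illustration, a KNN classification using Pearson correlation as the similarity measure may be sketched as follows. This is a minimal sketch assuming NumPy; the function name, the synthetic profiles and the handling of tied votes (the label encountered first among the neighbors wins) are assumptions.

```python
import numpy as np
from collections import Counter

def knn_classify(sample, train_X, train_labels, k=5):
    """Classify by majority vote of the k most-correlated training samples.

    Returns (label, V), where V is the number of the k neighbors that
    agree with the returned label.
    """
    train_X = np.asarray(train_X, dtype=float)
    sample = np.asarray(sample, dtype=float)
    # Pearson correlation between the sample and every training profile.
    xc = sample - sample.mean()
    tc = train_X - train_X.mean(axis=1, keepdims=True)
    corr = (tc @ xc) / (np.linalg.norm(tc, axis=1) * np.linalg.norm(xc))
    nearest = np.argsort(-corr)[:k]          # highest correlation first
    votes = Counter(train_labels[i] for i in nearest)
    label, V = votes.most_common(1)[0]
    return label, V

# Synthetic training set: two classes with distinct expression profiles.
rng = np.random.default_rng(1)
base_a = np.array([12.0, 6.0, 9.0, 5.0, 11.0])
base_b = np.array([6.0, 12.0, 5.0, 10.0, 7.0])
train_X = np.vstack([base_a + rng.normal(0, 0.3, 5) for _ in range(6)]
                    + [base_b + rng.normal(0, 0.3, 5) for _ in range(6)])
train_labels = ["A"] * 6 + ["B"] * 6
label, V = knn_classify(base_a + 0.2, train_X, train_labels, k=5)
```

Because the correlation is computed on mean-centered profiles, a uniform shift in signal between samples does not change which neighbors are selected.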
9. Reporting a Final Answer (Prediction):
[0317] The decision-tree and KNN each return a predicted tissue of
origin and histological type where applicable. The tissue of origin
and histological type may be one of the exact origins and types in
the training or a variant thereof. For example, whereas the
training includes brain oligodendroglioma and brain astrocytoma,
the answer may simply be brain carcinoma. In addition to the tissue
of origin and histological type, the KNN and decision-tree each
return a confidence measure. The KNN returns the number of samples
within the K nearest neighbors that agreed with the answer reported
by the KNN (denoted by V), and the decision-tree returns the
probability of the result (P), which is the multiplication of the
probabilities at each branch point made on the way to that answer.
The classifier returns either the two different predictions or a
single prediction. A single prediction is returned if the predictions
concur, if they can be unified into a single answer (for example,
into the prediction brain if the KNN returned brain
oligodendroglioma and the decision-tree brain astrocytoma), or if,
based on V and P, one answer is chosen to override the other.
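By way of non-limiting illustration, the reporting logic may be sketched as follows. The function names and the `unify` mapping are hypothetical; the rule by which one answer overrides the other on the basis of V and P is not specified above and is therefore not implemented in this sketch.

```python
import math

def tree_confidence(branch_probs):
    """P of the decision-tree result: product of the probabilities at each branch point."""
    return math.prod(branch_probs)

def final_answer(tree_pred, knn_pred, unify=None):
    """Return one prediction when the two classifiers can be reconciled, else both.

    unify is an optional, hypothetical mapping from a specific class to a
    broader one (e.g., a histological type to its tissue of origin).
    """
    if tree_pred == knn_pred:
        return [tree_pred]
    unify = unify or {}
    broad_tree = unify.get(tree_pred, tree_pred)
    broad_knn = unify.get(knn_pred, knn_pred)
    if broad_tree == broad_knn:
        return [broad_tree]
    # An override based on V and P would be applied here; without a stated
    # rule, both predictions are reported.
    return [tree_pred, knn_pred]

unify = {"brain oligodendroglioma": "brain", "brain astrocytoma": "brain"}
answer = final_answer("brain astrocytoma", "brain oligodendroglioma", unify)
P = tree_confidence([0.98, 0.95, 0.90])
```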
Example 1
Decision-Tree Classification Algorithm
[0318] A tumor classifier was built using the microRNA expression
levels by applying a binary tree classification scheme (FIGS.
1A-F). This framework is set up to utilize the specificity of
microRNAs in tissue differentiation and embryogenesis: different
microRNAs are involved in various stages of tissue specification,
and are used by the algorithm at different decision points or
"nodes". The tree breaks up the complex multi-tissue classification
problem into a set of simpler binary decisions. At each node,
classes which branch out earlier in the tree are not considered,
reducing interference from irrelevant samples and further
simplifying the decision. The decision at each node can then be
accomplished using only a small number of microRNA biomarkers,
which have well-defined roles in the classification (Table 2). The
structure of the binary tree was based on a hierarchy of tissue
development and morphological similarity.sup.18, which was modified
by prominent features of the microRNA expression patterns. For
example, the expression patterns of microRNAs indicated a
significant difference between germ cell tumors and tumors of
non-germ cell origin, and these are therefore distinguished at node
#1 (FIG. 2) into separate branches (FIG. 1A).
[0319] For each of the individual nodes logistic regression models
were used, a robust family of classifiers which are frequently used
in epidemiological and clinical studies to combine continuous data
features into a binary decision (FIGS. 2-25 and Methods). Since
gene expression classifiers have an inherent redundancy in
selecting the gene features, bootstrapping was used on the training
sample set as a method to select a stable microRNA set for each
node (Methods). This resulted in a small number (usually 2-3) of
microRNA features per node, totaling 65 microRNAs for the full
classifier (Table 2). This approach provides a systematic process
for identifying new biomarkers for differential expression.
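By way of non-limiting illustration, traversal of a binary tree whose nodes are logistic regression models may be sketched as follows. The class `Node`, the two-node tree, the coefficients, the branch orientation and the leaf labels are all hypothetical; only the microRNA names are taken from the text, and the values do not represent the trained classifier.

```python
import math

class Node:
    """One decision node: a logistic model over a few microRNAs."""
    def __init__(self, beta0, betas, left, right):
        self.beta0 = beta0      # bias
        self.betas = betas      # dict: microRNA name -> coefficient
        self.left = left        # Node or class label (str)
        self.right = right      # Node or class label (str)

def classify(node, expr):
    """Walk the tree; return (class label, product of branch probabilities)."""
    prob = 1.0
    while isinstance(node, Node):
        z = node.beta0 + sum(b * expr[name] for name, b in node.betas.items())
        p_left = 1.0 / (1.0 + math.exp(-z))
        if p_left >= 0.5:
            prob *= p_left
            node = node.left
        else:
            prob *= 1.0 - p_left
            node = node.right
    return node, prob

# Hypothetical two-node tree with invented coefficients.
tree = Node(beta0=-9.0, betas={"hsa-miR-372": 1.0},
            left="germ cell",
            right=Node(beta0=-10.0, betas={"hsa-miR-122": 1.0},
                       left="liver or biliary", right="other"))
label, P = classify(tree, {"hsa-miR-372": 5.0, "hsa-miR-122": 13.0})
```

Each sample passes only through the nodes on its own path, so a node's model never has to separate classes that branched off earlier in the tree.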
TABLE-US-00002 TABLE 2 microRNAs used per class in the tree
classifier miR List: Class hsa-miR-372 (SEQ ID NO: 55) Germ cell
cancer hsa-miR-372, hsa-miR-122 (SEQ ID NO: 6), hsa-miR-126 (SEQ ID
Biliary tract NO: 9), hsa-miR-200b (SEQ ID NO: 29) adenocarcinoma
hsa-miR-372, hsa-miR-122, hsa-miR-126, hsa-miR-200b Hepatocellular
carcinoma (HCC) hsa-miR-372, hsa-miR-122, hsa-miR-200c (SEQ ID NO:
30), hsa-miR- Brain tumor 30a (SEQ ID NO: 46), hsa-miR-146a (SEQ ID
NO: 16), hsa-let-7e (SEQ ID NO: 156), hsa-miR-9* (SEQ ID NO: 66),
hsa-miR-92b (SEQ ID NO: 68) hsa-miR-372, hsa-miR-122, hsa-miR-200c,
hsa-miR-30a, hsa-miR- Brain - 146a, hsa-let-7e, hsa-miR-9*,
hsa-miR-92b, hsa-miR-222 (SEQ ID oligodendroglioma NO: 40),
hsa-miR-497 (SEQ ID NO: 60) hsa-miR-372, hsa-miR-122, hsa-miR-200c,
hsa-miR-30a, hsa-miR- Brain - astrocytoma 146a, hsa-let-7e,
hsa-miR-9*, hsa-miR-92b, hsa-miR-222, hsa-miR-497 hsa-miR-372,
hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375 Prostate (SEQ
ID NO: 56), hsa-miR-7 (SEQ ID NO: 65), hsa-miR-193a-3p (SEQ
Adenocarcinoma ID NO: 25), hsa-miR-194 (SEQ ID NO: 27), hsa-miR-21*
(SEQ ID NO: 35), hsa-miR-143 (SEQ ID NO: 14), hsa-miR-181a (SEQ ID
NO: 21) hsa-miR-372 Ovarian primitive germ cell tumor hsa-miR-372
Testis hsa-miR-372, hsa-miR-200b, hsa-miR-516a-5p (SEQ ID NO: 62)
Seminomatous testicular germ cell tumor hsa-miR-372, hsa-miR-200b,
hsa-miR-516a-5p Non seminomatous testicular germ cell tumor
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-7,
Breast hsa-miR-194, hsa-miR-21*, hsa-miR-143, hsa-miR-181a,
hsa-miR-205 adenocarcinoma (SEQ ID NO: 32), hsa-miR-345 (SEQ ID NO:
51), hsa-miR-125a-5p (SEQ ID NO: 7), hsa-miR-193a-3p (SEQ ID NO:
25), hsa-miR-375, hsa- miR-342-3p (SEQ ID NO: 50) hsa-miR-372,
hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-7, Ovarian
carcinoma hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR-143,
hsa-miR- 181a, hsa-miR-345, hsa-miR-125a-5p, hsa-miR-193a-3p,
hsa-miR-375, hsa-miR-342-3p, hsa-miR-205 (SEQ ID NO: 32),
hsa-miR-10a (SEQ ID NO: 4), hsa-miR-22 (SEQ ID NO: 39) hsa-miR-372,
hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, Thyroid
carcinoma hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*,
hsa-miR- 143, hsa-miR-181a, hsa-miR-205, hsa-miR-345,
hsa-miR-125a-5p, hsa- miR-138 (SEQ ID NO: 11), hsa-miR-93 (SEQ ID
NO: 148), hsa-miR- 10a (SEQ ID NO: 4) hsa-miR-372, hsa-miR-122,
hsa-miR-200c, hsa-miR-30a, hsa-miR-375, Thyroid carcinoma
hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR-
follicular 143, hsa-miR-181a, hsa-miR-205, hsa-miR-345,
hsa-miR-125a-5p, hsa- miR-138, hsa-miR-93, hsa-miR-10a,
hsa-miR-146b-5p (SEQ ID NO: 17), hsa-miR-21 (SEQ ID NO: 34)
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375,
Thyroid carcinoma hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194,
hsa-miR-21*, hsa-miR- papillary 143, hsa-miR-181a, hsa-miR-205,
hsa-miR-345, hsa-miR-125a-5p, hsa- miR-138, hsa-miR-93,
hsa-miR-10a, hsa-miR-146b-5p, hsa-miR-21 hsa-miR-372, hsa-miR-122,
hsa-miR-200c, hsa-miR-30a, hsa-miR-375, Breast hsa-miR-7,
hsa-miR-194, hsa-miR-21*, hsa-miR-143, hsa-miR-181a, adenocarcinoma
hsa-miR-205, hsa-miR-345, hsa-miR-125a-5p, hsa-miR-138, hsa-miR-
93, hsa-miR-10a, hsa-miR-193a-3p (SEQ ID NO: 25), hsa-miR-31 (SEQ
ID NO: 49), hsa-miR-92a (SEQ ID NO: 67) hsa-miR-372, hsa-miR-122,
hsa-miR-200c, hsa-miR-30a, hsa-miR-375, Lung large cell or
hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR-
adenocarcinoma 143, hsa-miR-181a, hsa-miR-205, hsa-miR-345,
hsa-miR-125a-5p, hsa- miR-93, hsa-miR-10a, hsa-miR-193a-3p,
hsa-miR-31, hsa-miR-92a, hsa-miR-138 (SEQ ID NO: 11), hsa-miR-378
(SEQ ID NO: 57), hsa- miR-21 (SEQ ID NO: 34) hsa-miR-372,
hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, Ovarian
carcinoma hsa-miR-7, hsa-miR-194, hsa-miR-21*, hsa-miR-143,
hsa-miR-181a, hsa-miR-205, hsa-miR-345, hsa-miR-125a-5p,
hsa-miR-93, hsa-miR- 10a, hsa-miR-193a-3p, hsa-miR-31, hsa-miR-92a,
hsa-miR-138, hsa- miR-378, hsa-miR-21 hsa-miR-372, hsa-miR-122,
hsa-miR-200c, hsa-miR-30a, hsa-miR-375, Thymoma hsa-miR-7,
hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR- 143,
hsa-miR-181a, hsa-miR-205, hsa-miR-345, hsa-miR-125a-5p, hsa-
miR-342-3p, hsa-miR-10a, hsa-miR-22, hsa-miR-100, hsa-miR-21
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375,
Urothelial carcinoma hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194,
hsa-miR-21*, hsa-miR- (TCC) 143, hsa-miR-181a, hsa-miR-205,
hsa-miR-345, hsa-miR-125a-5p, hsa- miR-342-3p, hsa-miR-205,
hsa-miR-10a, hsa-miR-22, hsa-miR-100, hsa-miR-21, hsa-miR-934 (SEQ
ID NO: 69), hsa-miR-191 (SEQ ID NO: 24), hsa-miR-29c (SEQ ID NO:
44) hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a,
hsa-miR-375, Squamous cell hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194,
hsa-miR-21*, hsa-miR- carcinoma (SCC) 143, hsa-miR-181a,
hsa-miR-205, hsa-miR-345, hsa-miR-125a-5p, hsa- miR-342-3p,
hsa-miR-10a, hsa-miR-22, hsa-miR-100, hsa-miR-21, hsa- miR-934,
hsa-miR-191, hsa-miR-29c hsa-miR-372, hsa-miR-122, hsa-miR-200c,
hsa-miR-30a, hsa-miR-375, Uterine cervix SCC hsa-miR-7,
hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR- 143,
hsa-miR-181a, hsa-miR-205, hsa-miR-345, hsa-miR-125a-5p, hsa-
miR-342-3p, hsa-miR-10a, hsa-miR-22, hsa-miR-100, hsa-miR-21, hsa-
miR-934, hsa-miR-191, hsa-miR-29c, hsa-miR-10b (SEQ ID NO: 5),
hsa-let-7c (SEQ ID NO: 1), hsa-miR-361-5p (SEQ ID NO: 54)
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375,
Anus or Skin SCC hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194,
hsa-miR-21*, hsa-miR- 143, hsa-miR-181a, hsa-miR-205, hsa-miR-345,
hsa-miR-125a-5p, hsa- miR-193a-3p, hsa-miR-375, hsa-miR-342-3p,
hsa-miR-205, hsa-miR- 10a, hsa-miR-22, hsa-miR-100, hsa-miR-21,
hsa-miR-934, hsa-miR- 191, hsa-miR-29c, hsa-miR-10b, hsa-let-7c,
hsa-miR-361-5p, hsa-miR- 138, hsa-miR-185 (SEQ ID NO: 23)
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375,
Lung, Head& Neck or hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194,
hsa-miR-21*, hsa-miR- Esophagus SCC 143, hsa-miR-181a, hsa-miR-205,
hsa-miR-345, hsa-miR-125a-5p, hsa- miR-342-3p, hsa-miR-10a,
hsa-miR-22, hsa-miR-100, hsa-miR-21, hsa- miR-934, hsa-miR-191,
hsa-miR-29c, hsa-let-7c, hsa-miR-361-5p, hsa- miR-10b, hsa-miR-138,
hsa-miR-185
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a (SEQ ID NO: 16), hsa-let-7e (SEQ ID NO: 2), hsa-miR-30d (SEQ ID NO: 47), hsa-miR-342-3p | Melanoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-30d, hsa-miR-342-3p | Lymphoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-30d, hsa-miR-342-3p, hsa-miR-21*, hsa-miR-30e (SEQ ID NO: 48) | B cell lymphoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-30a, hsa-miR-30d, hsa-miR-342-3p, hsa-miR-21*, hsa-miR-30e | T cell lymphoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-17 (SEQ ID NO: 20), hsa-miR-29c* (SEQ ID NO: 45) | Lung small cell carcinoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-17, hsa-miR-29c*, hsa-miR-222 (SEQ ID NO: 40), hsa-miR-92a (SEQ ID NO: 67), hsa-miR-92b (SEQ ID NO: 68) | Medullary thyroid carcinoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-17, hsa-miR-29c*, hsa-miR-222, hsa-miR-92a, hsa-miR-92b, hsa-miR-652 (SEQ ID NO: 64), hsa-miR-34c-5p (SEQ ID NO: 53), hsa-miR-214 (SEQ ID NO: 37) | Lung carcinoid
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-17, hsa-miR-29c*, hsa-miR-222, hsa-miR-92a, hsa-miR-92b, hsa-miR-652, hsa-miR-34c-5p, hsa-miR-214, hsa-miR-21 (SEQ ID NO: 34), hsa-miR-148a (SEQ ID NO: 18) | Gastrointestinal (GI) tract carcinoid
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-17, hsa-miR-29c*, hsa-miR-222, hsa-miR-92a, hsa-miR-92b, hsa-miR-652, hsa-miR-34c-5p, hsa-miR-214, hsa-miR-21, hsa-miR-148a | Pancreas islet cell tumor
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194 (SEQ ID NO: 27), hsa-miR-21* (SEQ ID NO: 35), hsa-miR-224 (SEQ ID NO: 42), hsa-miR-210 (SEQ ID NO: 36), hsa-miR-1201 (SEQ ID NO: 146) | Gastric or Esophageal Adenocarcinoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR-224, hsa-miR-210, hsa-miR-1201, hsa-miR-17 (SEQ ID NO: 20), hsa-miR-29a (SEQ ID NO: 43) | Colorectal Adenocarcinoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR-224, hsa-miR-210, hsa-miR-1201, hsa-miR-17, hsa-miR-29a | Pancreas or bile
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR-224, hsa-miR-210, hsa-miR-1201, hsa-miR-17, hsa-miR-29a, hsa-miR-345 (SEQ ID NO: 51), hsa-miR-31 (SEQ ID NO: 49), hsa-miR-146a (SEQ ID NO: 16) | Pancreatic adenocarcinoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-375, hsa-miR-7, hsa-miR-193a-3p, hsa-miR-194, hsa-miR-21*, hsa-miR-224, hsa-miR-210, hsa-miR-1201, hsa-miR-17, hsa-miR-29a, hsa-miR-345, hsa-miR-31, hsa-miR-146a | Biliary tract adenocarcinoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a (SEQ ID NO: 16), hsa-let-7e, hsa-miR-9* (SEQ ID NO: 66), hsa-miR-92b (SEQ ID NO: 68), hsa-miR-149 (SEQ ID NO: 19), hsa-miR-200b (SEQ ID NO: 29) | Renal cell carcinoma chromophobe
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-30a, hsa-miR-149, hsa-miR-200b, hsa-miR-7 (SEQ ID NO: 65), hsa-miR-375 | Pheochromocytoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202 (SEQ ID NO: 31), hsa-miR-214* (SEQ ID NO: 38), hsa-miR-509-3p (SEQ ID NO: 61) | Adrenocortical
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143 (SEQ ID NO: 14), hsa-miR-29c* | Gastrointestinal stromal tumor (GIST)
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-210 (SEQ ID NO: 36), hsa-miR-221 (SEQ ID NO: 147) | Renal cell carcinoma chromophobe
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-210, hsa-miR-221, hsa-miR-31 (SEQ ID NO: 49), hsa-miR-126 (SEQ ID NO: 9) | Renal cell carcinoma clear cell
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-210, hsa-miR-221, hsa-miR-31, hsa-miR-126 | Renal cell carcinoma papillary
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7 (SEQ ID NO: 65), hsa-miR-375, hsa-miR-202 (SEQ ID NO: 31), hsa-miR-214* (SEQ ID NO: 38), hsa-miR-509-3p (SEQ ID NO: 61), hsa-miR-143 (SEQ ID NO: 14), hsa-miR-29c*, hsa-miR-21* (SEQ ID NO: 35), hsa-miR-130a (SEQ ID NO: 10), hsa-miR-10b (SEQ ID NO: 5) | Pleural mesothelioma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143, hsa-miR-29c*, hsa-miR-21*, hsa-miR-130a, hsa-miR-10b | Sarcoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143, hsa-miR-29c*, hsa-miR-21*, hsa-miR-130a, hsa-miR-10b, hsa-miR-100 (SEQ ID NO: 3), hsa-miR-222 (SEQ ID NO: 40), hsa-miR-145 (SEQ ID NO: 15) | Synovial sarcoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143, hsa-miR-29c*, hsa-miR-21*, hsa-miR-130a, hsa-miR-10b, hsa-miR-100, hsa-miR-222, hsa-miR-145, hsa-miR-140-3p (SEQ ID NO: 12), hsa-miR-455-5p (SEQ ID NO: 58) | Chondrosarcoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143, hsa-miR-29c*, hsa-miR-21*, hsa-miR-130a, hsa-miR-10b, hsa-miR-100, hsa-miR-222, hsa-miR-145, hsa-miR-140-3p, hsa-miR-455-5p, hsa-miR-210 (SEQ ID NO: 36), hsa-miR-193a-5p (SEQ ID NO: 26) | Liposarcoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143, hsa-miR-29c*, hsa-miR-21*, hsa-miR-130a, hsa-miR-10b, hsa-miR-100, hsa-miR-222, hsa-miR-145, hsa-miR-140-3p, hsa-miR-455-5p, hsa-miR-210, hsa-miR-193a-5p, hsa-miR-181a, hsa-miR-193a-3p (SEQ ID NO: 25), hsa-miR-31 (SEQ ID NO: 49) | Ewing sarcoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-9*, hsa-miR-92b, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143, hsa-miR-29c*, hsa-miR-21*, hsa-miR-130a, hsa-miR-10b, hsa-miR-100, hsa-miR-222, hsa-miR-145, hsa-miR-140-3p, hsa-miR-455-5p, hsa-miR-210, hsa-miR-193a-5p, hsa-miR-181a, hsa-miR-193a-3p, hsa-miR-31 | Osteosarcoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-30a, hsa-miR-9*, hsa-miR-92b, hsa-miR-30a, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143, hsa-miR-29c*, hsa-miR-21*, hsa-miR-130a, hsa-miR-10b, hsa-miR-100, hsa-miR-222, hsa-miR-145, hsa-miR-140-3p, hsa-miR-455-5p, hsa-miR-210, hsa-miR-193a-5p, hsa-miR-181a, hsa-miR-487b (SEQ ID NO: 59), hsa-miR-22 (SEQ ID NO: 39), hsa-miR-206 (SEQ ID NO: 33) | Rhabdomyosarcoma
hsa-miR-372, hsa-miR-122, hsa-miR-200c, hsa-miR-30a, hsa-miR-146a, hsa-let-7e, hsa-miR-30a, hsa-miR-9*, hsa-miR-92b, hsa-miR-30a, hsa-miR-149, hsa-miR-200b, hsa-miR-7, hsa-miR-375, hsa-miR-202, hsa-miR-214*, hsa-miR-509-3p, hsa-miR-143, hsa-miR-29c*, hsa-miR-21*, hsa-miR-130a, hsa-miR-10b, hsa-miR-100, hsa-miR-222, hsa-miR-145, hsa-miR-140-3p, hsa-miR-455-5p, hsa-miR-210, hsa-miR-193a-5p, hsa-miR-181a, hsa-miR-487b, hsa-miR-22, hsa-miR-206 | Malignant fibrous histiocytoma (MFH) or fibrosarcoma
Example 2
Expression of miRs Provides for Distinguishing Between Tumors
TABLE-US-00003 [0320] TABLE 3 miR expression (in fluorescence
units) distinguishing between the group consisting of germ-cell
tumors and the group consisting of all other tumors
median values | fold-change | p-value | SEQ ID NO. | miR name
2.7e+004-5.0e+001 | 545.73 (+) | <e-240 | 233 | hsa-miR-373
1.8e+004-5.0e+001 | 365.93 (+) | <e-240 | 55 | hsa-miR-372
8.6e+003-5.0e+001 | 171.72 (+) | <e-240 | 200 | hsa-miR-371-3p
5.9e+003-5.1e+001 | 115.94 (+) | 7.3e-249 | 201 | hsa-miR-371-5p
(+) for all the listed miRs, the higher expression is in tumors
from a germ-cell origin.
[0321] hsa-miR-372 (SEQ ID NO: 55) is used at node 1 of the
binary-tree-classifier detailed in the invention to distinguish
between germ-cell tumors and all other tumors.
[0322] FIGS. 2A-D are boxplot presentations comparing distribution
of the expression of the statistically significant miRs in tumor
samples from the "germ cell" class (left box) and "non germ cell"
class (right box).
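The relationship between the "median values" and "fold-change" columns of the tables in this example can be illustrated with a short calculation. The following is a minimal sketch, assuming that each "median values" cell holds two group medians separated by a hyphen and that the fold-change is the ratio of the larger median to the smaller one. The function names are hypothetical. Because the printed medians are rounded, the recomputed ratio only approximates the listed fold-change, and the meaning of the (+)/(-) sign must still be read from each table's footnote.

```python
import re
from typing import Tuple

# Matches a printed median pair such as "2.7e+004-5.0e+001".
# The hyphen that separates the two medians follows the first exponent.
_PAIR = re.compile(r"^(\d*\.?\d+e[+-]\d+)-(\d*\.?\d+e[+-]\d+)$")


def parse_medians(cell: str) -> Tuple[float, float]:
    """Split a 'median values' cell into its two medians."""
    match = _PAIR.match(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized median pair: {cell!r}")
    return float(match.group(1)), float(match.group(2))


def fold_change(cell: str) -> Tuple[float, str]:
    """Return (larger median / smaller median, which median is larger)."""
    first, second = parse_medians(cell)
    if first >= second:
        return first / second, "first"
    return second / first, "second"


# hsa-miR-373 row of Table 3; the table lists a fold-change of 545.73.
ratio, larger = fold_change("2.7e+004-5.0e+001")
print(round(ratio, 2), larger)
```

The sketch deliberately reports only which median is larger, since the sign convention differs between tables.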
TABLE-US-00004 TABLE 4 miR expression (in fluorescence units)
distinguishing between the group consisting of hepatobiliary tumors
and the group consisting of non germ-cell non-hepatobiliary tumors
median values | fold-change | p-value | SEQ ID NO. | miR name
1.0e+005-5.0e+001 | 2024.31 (+) | 1.1e-123 | 6 | hsa-miR-122
7.4e+001-8.1e+003 | 109.63 (-) | 3.6e-010 | 30 | hsa-miR-200c
5.0e+001-1.4e+003 | 27.92 (-) | 4.8e-010 | 13 | hsa-miR-141
(+) the higher expression of this miR is in tumors from a
hepatobiliary origin
(-) the higher expression of this miR is in tumors from a non
germ-cell, non-hepatobiliary origin
[0323] hsa-miR-122 (SEQ ID NO: 6) is used at node 2 of the
binary-tree-classifier detailed in the invention to distinguish
between hepatobiliary tumors and non germ-cell non-hepatobiliary
tumors.
TABLE-US-00005 TABLE 5 miR expression (in fluorescence units)
distinguishing between the group consisting of liver tumors and the
group consisting of biliary-tract carcinomas (cholangiocarcinoma
or gallbladder adenocarcinoma)
median values | fold-change | p-value | SEQ ID NO. | miR name
6.1e+003-4.1e+002 | 14.74 (+) | 5.5e-005 | 28 | hsa-miR-200a
9.7e+003-9.0e+002 | 10.74 (+) | 2.4e-004 | 29 | hsa-miR-200b
1.9e+003-7.0e+003 | 3.67 (-) | 8.5e-004 | 231 | hsa-miR-99a
3.3e+003-7.5e+003 | 2.28 (-) | 6.2e-004 | 9 | hsa-miR-126
(+) the higher expression of this miR is in biliary tract carcinomas
(-) the higher expression of this miR is in liver tumors
[0324] hsa-miR-126 (SEQ ID NO: 9) and hsa-miR-200b (SEQ ID NO: 29)
are used at node 3 of the binary-tree-classifier detailed in the
invention to distinguish between liver tumors and biliary-tract
carcinoma.
[0325] FIG. 3 demonstrates that tumors of hepatocellular carcinoma
(HCC) origin (marked by squares) are easily distinguished from
tumors of biliary tract adenocarcinoma origin (marked by diamonds)
using the expression levels of hsa-miR-200b (SEQ ID NO: 29, y-axis)
and hsa-miR-126 (SEQ ID NO: 9, x-axis).
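A two-miR decision of the kind described for node 3 can be sketched as follows. This is only an illustration under stated assumptions: the log-ratio score and the cutoff of 0.0 are hypothetical choices made for this example, not the decision rule or parameters of the classifier detailed in the invention. The example inputs are the group medians listed in Table 5 (hsa-miR-200b is higher in biliary tract carcinomas, hsa-miR-126 is higher in liver tumors).

```python
import math


def node3_score(mir_200b: float, mir_126: float) -> float:
    """Hypothetical score: log2 ratio of hsa-miR-200b to hsa-miR-126."""
    return math.log2(mir_200b) - math.log2(mir_126)


def classify_node3(mir_200b: float, mir_126: float, cutoff: float = 0.0) -> str:
    """Assign a sample to one side of node 3.

    cutoff=0.0 is an illustrative assumption, not a value from the source.
    """
    if node3_score(mir_200b, mir_126) > cutoff:
        return "biliary-tract carcinoma"
    return "liver tumor"


# Group medians from Table 5, in fluorescence units.
print(classify_node3(mir_200b=9.7e3, mir_126=3.3e3))
print(classify_node3(mir_200b=9.0e2, mir_126=7.5e3))
```

Working on a log scale keeps the score symmetric with respect to up- and down-regulation, which is one common reason to prefer a log-ratio over a raw ratio.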
TABLE-US-00006 TABLE 6 miR expression (in fluorescence units)
distinguishing between the group consisting of tumors from an
epithelial origin and the group consisting of tumors from a
non-epithelial origin
median values | fold-change | p-value | SEQ ID NO. | miR name
1.5e+004-7.7e+001 | 196.43 (+) | 1.5e-300 | 30 | hsa-miR-200c
9.0e+003-5.0e+001 | 180.07 (+) | 1.3e-208 | 29 | hsa-miR-200b
3.9e+003-5.0e+001 | 78.09 (+) | 2.2e-187 | 28 | hsa-miR-200a
2.7e+003-5.0e+001 | 54.64 (+) | 7.0e-078 | 32 | hsa-miR-205
2.6e+003-5.0e+001 | 51.98 (+) | 1.2e-265 | 13 | hsa-miR-141
5.4e+002-9.2e+001 | 5.90 (+) | 6.3e-048 | 152 | hsa-miR-182
1.1e+003-2.5e+002 | 4.35 (+) | 4.8e-022 | 49 | hsa-miR-31
(+) for all the listed miRs, the higher expression is in tumors
from epithelial origins
[0326] A combination of the expression level of any of the miRs
detailed in table 6 with the expression level of any of hsa-miR-30a
(SEQ ID NO: 46), hsa-miR-10b (SEQ ID NO: 5) and hsa-miR-140-3p (SEQ
ID NO: 12) also provides for distinguishing between tumors from
epithelial origins and tumors from non-epithelial origins. This is
demonstrated at node 4 of the binary-tree-classifier detailed in
the invention with hsa-miR-200c (SEQ ID NO: 30) and hsa-miR-30a
(SEQ ID NO: 46) (FIG. 4). Tumors of epithelial origin
(diamonds) are easily distinguished from tumors of non-epithelial
origin (squares) using the expression levels of hsa-miR-30a (SEQ ID
NO: 46, y-axis) and hsa-miR-200c (SEQ ID NO: 30, x-axis).
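The way nodes 1 to 4 hand a sample from one binary decision to the next can be sketched in a few lines. This is a simplified illustration, not the classifier of the invention: the ordering of the nodes is inferred from the group definitions in Tables 3 to 6, all cutoffs are hypothetical placeholders chosen to lie between the reported group medians, node 3 uses an assumed log-ratio rule, and node 4 is reduced to hsa-miR-200c alone although the text combines it with hsa-miR-30a.

```python
import math
from typing import Dict

# Hypothetical cutoffs (fluorescence units), placed between the group
# medians reported in Tables 3, 4 and 6. They are not values from the source.
CUTOFF_MIR_372 = 1.0e3   # node 1
CUTOFF_MIR_122 = 2.0e3   # node 2
CUTOFF_MIR_200C = 1.0e3  # node 4, simplified to a single miR


def classify(expr: Dict[str, float]) -> str:
    """Walk nodes 1-4 for one sample given its expression level per miR."""
    # Node 1: germ-cell tumors versus all other tumors.
    if expr["hsa-miR-372"] > CUTOFF_MIR_372:
        return "germ-cell"
    # Node 2: hepatobiliary versus non germ-cell, non-hepatobiliary tumors.
    if expr["hsa-miR-122"] > CUTOFF_MIR_122:
        # Node 3: liver versus biliary tract, via an assumed log-ratio rule.
        ratio = math.log2(expr["hsa-miR-200b"]) - math.log2(expr["hsa-miR-126"])
        return "biliary-tract" if ratio > 0 else "liver"
    # Node 4: epithelial versus non-epithelial origin.
    if expr["hsa-miR-200c"] > CUTOFF_MIR_200C:
        return "epithelial"
    return "non-epithelial"


# Example sample built from the liver-side medians of Tables 4 and 5.
sample = {"hsa-miR-372": 50.0, "hsa-miR-122": 1.0e5,
          "hsa-miR-200b": 9.0e2, "hsa-miR-126": 7.5e3}
print(classify(sample))
```

Each branch returns as soon as a leaf is reached, so a sample only needs measurements for the miRs on its own path through the tree.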
TABLE-US-00007 TABLE 7 miR expression (in fluorescence units)
distinguishing between the group consisting of melanoma and
lymphoma (B-cell, T-cell), and the group consisting of all other
non-epithelial tumors
median values | fold-change | p-value | SEQ ID NO. | miR name
2.0e+003-7.0e+001 | 28.25 (-) | 1.9e-074 | 164 | hsa-miR-142-5p
1.2e+004-6.3e+002 | 18.86 (-) | 6.0e-061 | 168 | hsa-miR-150
5.4e+003-3.1e+002 | 17.29 (-) | 5.6e-060 | 170 | hsa-miR-155
4.2e+003-3.5e+002 | 12.03 (-) | 8.4e-068 | 16 | hsa-miR-146a
5.9e+002-1.4e+002 | 4.25 (-) | 8.2e-048 | 198 | hsa-miR-342-5p
7.5e+003-1.9e+003 | 4.02 (-) | 4.8e-056 | 50 | hsa-miR-342-3p
8.9e+002-2.5e+002 | 3.53 (-) | 6.0e-035 | 176 | hsa-miR-18a
4.4e+003-1.4e+003 | 3.28 (-) | 8.0e-038 | 186 | hsa-miR-20a
7.9e+002-2.6e+002 | 3.03 (-) | 7.3e-005 | 11 | hsa-miR-138
6.6e+003-2.3e+003 | 2.82 (-) | 4.0e-039 | 158 | hsa-miR-106a
4.1e+003-1.4e+003 | 2.82 (-) | 2.4e-037 | 20 | hsa-miR-17
6.2e+001-5.9e+002 | 9.53 (+) | 3.7e-027 | 155 | hsa-miR-127-3p
1.2e+003-7.0e+003 | 5.71 (+) | 1.5e-047 | 231 | hsa-miR-99a
3.9e+002-1.7e+003 | 4.25 (+) | 6.6e-022 | 4 | hsa-miR-10a
1.0e+004-4.1e+004 | 3.91 (+) | 3.2e-037 | 8 | hsa-miR-125b
6.5e+002-2.2e+003 | 3.37 (+) | 2.4e-023 | 46 | hsa-miR-30a
1.9e+003-5.6e+003 | 2.98 (+) | 1.0e-025 | 3 | hsa-miR-100
2.5e+003-7.1e+003 | 2.89 (+) | 1.8e-051 | 2 | hsa-let-7e
2.9e+003-8.4e+003 | 2.86 (+) | 8.1e-047 | 7 | hsa-miR-125a-5p
(+) the higher expression of this miR is in the group of
non-epithelial tumors excluding melanoma and lymphoma
(-) the higher expression of this miR is in the group consisting of
melanoma and lymphoma
[0327] hsa-miR-146a (SEQ ID NO: 16), hsa-let-7e (SEQ ID NO: 2) and
hsa-miR-30a (SEQ ID NO: 46) are used at node 5 of the
binary-tree-classifier detailed in the invention to distinguish
between the group consisting of melanoma and lymphoma, and the
group consisting of all other non-epithelial tumors. FIG. 5
demonstrates that tumors of lymphoma or melanoma origin
(diamonds) are easily distinguished from tumors of non-epithelial,
non-lymphoma/melanoma origin (squares) using the expression levels
of hsa-miR-146a (SEQ ID NO: 16, y-axis), hsa-miR-30a (SEQ ID NO:
46, x-axis) and hsa-let-7e (SEQ ID NO: 2, z-axis).
TABLE-US-00008 TABLE 8 miR expression (in fluorescence units)
distinguishing between the group consisting of brain tumors
(astrocytic tumor and oligodendroglioma) and the group consisting
of all non-brain, non-epithelial tumors
median values | fold-change | p-value | SEQ ID NO. | miR name
9.1e+003-5.0e+001 | 182.94 (+) | 3.8e-059 | 159 | hsa-miR-124
4.4e+003-5.0e+001 | 88.33 (+) | 1.1e-125 | 66 | hsa-miR-9*
2.1e+003-6.0e+001 | 34.97 (+) | 6.0e-035 | 225 | hsa-miR-551b
9.9e+002-5.0e+001 | 19.73 (+) | 3.0e-116 | 187 | hsa-miR-219-2-3p
6.5e+002-5.0e+001 | 12.95 (+) | 1.8e-021 | 162 | hsa-miR-129-3p
1.1e+003-1.0e+002 | 10.52 (+) | 2.0e-034 | 161 | hsa-miR-128
2.3e+003-2.5e+002 | 9.45 (+) | 2.2e-052 | 68 | hsa-miR-92b
5.2e+002-6.8e+001 | 7.61 (+) | 6.7e-019 | 232 | hsa-miR-99a*
6.9e+002-9.2e+001 | 7.45 (+) | 5.5e-023 | 173 | hsa-miR-181c
2.2e+003-3.5e+002 | 6.34 (+) | 7.4e-007 | 11 | hsa-miR-138
1.2e+005-2.4e+004 | 4.78 (+) | 1.7e-014 | 8 | hsa-miR-125b
1.8e+003-3.9e+002 | 4.70 (+) | 7.2e-014 | 174 | hsa-miR-181d
8.5e+002-1.8e+002 | 4.64 (+) | 2.2e-002 | 155 | hsa-miR-127-3p
1.6e+004-3.5e+003 | 4.60 (+) | 2.4e-010 | 231 | hsa-miR-99a
8.5e+001-1.1e+003 | 13.55 (-) | 2.4e-014 | 4 | hsa-miR-10a
7.7e+002-6.6e+003 | 8.58 (-) | 8.4e-017 | 182 | hsa-miR-199a-5p
5.7e+002-4.7e+003 | 8.12 (-) | 1.8e-013 | 181 | hsa-miR-199a-3p
2.8e+002-1.9e+003 | 6.81 (-) | 1.4e-012 | 37 | hsa-miR-214
(+) the higher expression of this miR is in the group consisting of
brain tumors
(-) the higher expression of this miR is in the group consisting of
all non-brain, non-epithelial tumors
[0328] hsa-miR-9* (SEQ ID NO: 66) and hsa-miR-92b (SEQ ID NO: 68)
are used at node 6 of the binary-tree-classifier detailed in the
invention to distinguish between brain tumors and the group
consisting of all non-brain, non-epithelial tumors. FIG. 6
demonstrates that tumors originating in the brain (marked by
diamonds) are easily distinguished from tumors of non epithelial,
non brain origin (marked by squares) using the expression levels of
hsa-miR-9* (SEQ ID NO: 66, y-axis) and hsa-miR-92b (SEQ ID NO: 68,
x-axis).
TABLE-US-00009 TABLE 9 miR expression (in fluorescence units)
distinguishing between astrocytic tumors and oligodendrogliomas
median values | fold-change | p-value | SEQ ID NO. | miR name
2.5e+003-2.3e+002 | 11.10 (+) | 5.1e-011 | 230 | hsa-miR-886-5p
4.4e+003-4.9e+002 | 9.06 (+) | 1.1e-009 | 228 | hsa-miR-886-3p
1.0e+004-1.7e+003 | 5.99 (+) | 7.7e-008 | 147 | hsa-miR-221
1.3e+004-2.6e+003 | 5.03 (+) | 2.6e-006 | 40 | hsa-miR-222
3.3e+004-7.3e+003 | 4.54 (+) | 3.9e-004 | 34 | hsa-miR-21
8.4e+002-2.2e+002 | 3.78 (+) | 3.7e-006 | 206 | hsa-miR-455-3p
6.0e+002-1.8e+002 | 3.30 (+) | 1.3e-002 | 35 | hsa-miR-21*
5.8e+003-1.8e+003 | 3.15 (+) | 2.4e-005 | 52 | hsa-miR-34a
1.1e+003-3.5e+002 | 3.04 (+) | 1.0e-003 | 25 | hsa-miR-193a-3p
1.6e+002-8.2e+002 | 5.17 (-) | 1.2e-004 | 229 | hsa-miR-9
4.6e+002-2.3e+003 | 5.09 (-) | 7.1e-003 | 161 | hsa-miR-128
4.1e+002-1.8e+003 | 4.43 (-) | 1.3e-002 | 187 | hsa-miR-219-2-3p
3.8e+003-1.3e+004 | 3.31 (-) | 1.9e-002 | 179 | hsa-miR-195
(+) the higher expression of this miR is in astrocytic tumors
(-) the higher expression of this miR is in oligodendrogliomas
[0329] A combination of the expression level of any of the miRs
detailed in table 9 with the expression level of hsa-miR-497 (SEQ
ID NO: 208) or hsa-let-7d (SEQ ID NO: 153) also provides for
classification of brain tumors as astrocytic tumors or
oligodendrogliomas. This is demonstrated at node 7 of the
binary-tree-classifier detailed in the invention with hsa-miR-222
(SEQ ID NO: 40) and hsa-miR-497 (SEQ ID NO: 208). In another
embodiment of the invention, the expression levels of hsa-miR-222
(SEQ ID NO: 40) and hsa-let-7d (SEQ ID NO: 153) are combined to
distinguish between astrocytic tumors and oligodendrogliomas.
[0330] FIG. 7 demonstrates that astrocytic tumors
(marked by diamonds) are easily distinguished from
oligodendrogliomas (marked by squares) using the expression
levels of hsa-miR-497 (SEQ ID NO: 208, y-axis) and hsa-miR-222 (SEQ
ID NO: 40, x-axis).
TABLE-US-00010 TABLE 10 miR expression (in fluorescence units)
distinguishing between the group consisting of neuroendocrine
tumors and the group consisting of all non-neuroendocrine,
epithelial tumors
median values | fold-change | p-value | SEQ ID NO. | miR name
3.8e+004-1.5e+002 | 259.47 (+) | 5.3e-086 | 56 | hsa-miR-375
3.6e+003-5.2e+001 | 70.47 (+) | 4.4e-145 | 65 | hsa-miR-7
1.3e+003-1.8e+002 | 6.89 (+) | 4.7e-044 | 175 | hsa-miR-183
1.9e+003-4.4e+002 | 4.42 (+) | 3.5e-025 | 152 | hsa-miR-182
1.2e+003-3.0e+002 | 4.16 (+) | 5.5e-028 | 155 | hsa-miR-127-3p
5.6e+001-7.0e+003 | 124.66 (-) | 1.4e-023 | 32 | hsa-miR-205
1.5e+002-1.4e+003 | 9.25 (-) | 1.8e-019 | 49 | hsa-miR-31
3.4e+002-1.4e+003 | 4.12 (-) | 9.5e-032 | 35 | hsa-miR-21*
(+) the higher expression of this miR is in the group consisting of
neuroendocrine tumors
(-) the higher expression of this miR is in the group consisting of
all non-neuroendocrine, epithelial tumors
[0331] hsa-miR-375 (SEQ ID NO: 56), hsa-miR-7 (SEQ ID NO: 65) and
hsa-miR-193a-3p (SEQ ID NO: 25) are used at node 8 of the
binary-tree-classifier detailed in the invention to distinguish
between the group consisting of neuroendocrine tumors and the group
consisting of all non-neuroendocrine, epithelial tumors. FIG. 8
demonstrates that tumors of neuroendocrine origin
(diamonds) are easily distinguished from tumors of epithelial
origin (squares) using the expression levels of hsa-miR-193a-3p
(SEQ ID NO: 25, y-axis), hsa-miR-7 (SEQ ID NO: 65, x-axis) and
hsa-miR-375 (SEQ ID NO: 56, z-axis).
TABLE-US-00011 TABLE 11 miR expression (in fluorescence units)
distinguishing between the group consisting of gastrointestinal
(GI) epithelial tumors and the group consisting of non-GI
epithelial tumors
median values | fold-change | p-value | SEQ ID NO. | miR name
2.6e+003-7.1e+001 | 36.09 (+) | 2.5e-127 | 27 | hsa-miR-194
3.9e+003-1.2e+002 | 33.26 (+) | 1.6e-117 | 177 | hsa-miR-192
2.6e+003-6.7e+002 | 3.88 (+) | 3.3e-021 | 4 | hsa-miR-10a
5.0e+001-2.1e+004 | 411.76 (-) | 6.5e-045 | 32 | hsa-miR-205
(+) the higher expression of this miR is in the group consisting of
GI epithelial tumors
(-) the higher expression of this miR is in the group consisting of
non-GI epithelial tumors
[0332] hsa-miR-194 (SEQ ID NO: 27) and hsa-miR-21* (SEQ ID NO: 35)
are used at node 9 of the binary-tree-classifier detailed in the
invention to distinguish between GI epithelial tumors and non-GI
epithelial tumors.
[0333] FIG. 9 demonstrates that tumors of gastrointestinal (GI)
origin (marked by diamonds) are easily distinguished from tumors
of non-GI origin (marked by squares) using the expression levels
of hsa-miR-21* (SEQ ID NO: 35, y-axis) and hsa-miR-194 (SEQ ID NO:
27, x-axis).
TABLE-US-00012 TABLE 12 miR expression (in fluorescence units)
distinguishing between prostate tumors and all other non-GI
epithelial tumors
median values | fold-change | p-value | SEQ ID NO. | miR name
5.1e+003-5.2e+001 | 96.76 (+) | 3.7e-016 | 56 | hsa-miR-375
1.0e+003-5.5e+001 | 18.27 (+) | 4.0e-025 | 199 | hsa-miR-363
6.8e+004-7.2e+003 | 9.41 (+) | 1.0e-025 | 14 | hsa-miR-143
1.2e+005-1.4e+004 | 8.14 (+) | 7.8e-022 | 15 | hsa-miR-145
2.8e+003-3.5e+002 | 7.89 (+) | 1.5e-012 | 165 | hsa-miR-143*
2.1e+004-4.4e+003 | 4.76 (+) | 2.2e-011 | 231 | hsa-miR-99a
4.6e+002-2.1e+003 | 4.58 (-) | 8.0e-007 | 36 | hsa-miR-210
2.7e+002-1.1e+003 | 3.84 (-) | 7.8e-017 | 154 | hsa-miR-181b
1.2e+003-4.3e+003 | 3.76 (-) | 1.2e-014 | 21 | hsa-miR-181a
5.5e+002-2.0e+003 | 3.63 (-) | 2.3e-002 | 49 | hsa-miR-31
(+) the higher expression of this miR is in prostate tumors
(-) the higher expression of this miR is in the group consisting of
all other non-GI epithelial tumors
[0334] hsa-miR-143 (SEQ ID NO: 14) and hsa-miR-181a (SEQ ID NO: 21)
are used at node 10 of the binary-tree-classifier detailed in the
invention to distinguish between prostate tumors and all other
non-GI epithelial tumors.
[0335] FIG. 10 demonstrates that tumors originating in prostate
adenocarcinoma (marked by diamonds) are easily distinguished from
tumors of non prostate origins (marked by squares) using the
expression levels of hsa-miR-181a (SEQ ID NO: 21, y-axis) and
hsa-miR-143 (SEQ ID NO: 14, x-axis).
TABLE-US-00013 TABLE 13 miR expression (in fluorescence units)
distinguishing between seminomatous and non-seminomatous
testicular tumors
median values | fold-change | p-value | SEQ ID NO. | miR name
4.3e+003-7.6e+002 | 5.63 (+) | 6.6e-004 | 152 | hsa-miR-182
1.0e+002-2.1e+003 | 20.46 (-) | 6.2e-005 | 216 | hsa-miR-518e
7.8e+001-1.2e+003 | 15.29 (-) | 4.5e-005 | 212 | hsa-miR-516b
6.8e+001-8.2e+002 | 11.94 (-) | 2.2e-005 | 224 | hsa-miR-527
2.1e+002-2.2e+003 | 10.40 (-) | 1.9e-006 | 13 | hsa-miR-141
5.3e+002-5.0e+003 | 9.48 (-) | 5.0e-004 | 194 | hsa-miR-302d
1.4e+002-1.3e+003 | 8.97 (-) | 4.1e-006 | 192 | hsa-miR-302a
2.7e+002-2.3e+003 | 8.78 (-) | 2.9e-003 | 221 | hsa-miR-520c-3p
1.3e+002-1.2e+003 | 8.65 (-) | 8.3e-004 | 217 | hsa-miR-518f*
3.4e+003-2.9e+004 | 5.98 (-) | 2.6e-007 | 205 | hsa-miR-451
2.8e+002-1.7e+003 | 5.98 (-) | 1.1e-002 | 219 | hsa-miR-519d
2.0e+002-1.2e+003 | 5.90 (-) | 6.8e-005 | 32 | hsa-miR-205
2.0e+002-1.1e+003 | 5.59 (-) | 5.8e-006 | 193 | hsa-miR-302a*
1.9e+002-1.0e+003 | 5.27 (-) | 6.7e-003 | 223 | hsa-miR-524-5p
1.5e+002-8.0e+002 | 5.22 (-) | 5.4e-003 | 220 | hsa-miR-520a-5p
2.2e+002-1.1e+003 | 5.21 (-) | 4.1e-003 | 210 | hsa-miR-512-5p
3.2e+002-1.4e+003 | 4.57 (-) | 9.2e-003 | 209 | hsa-miR-498
7.2e+002-3.2e+003 | 4.51 (-) | 3.1e-002 | 213 | hsa-miR-517a
6.4e+002-2.9e+003 | 4.47 (-) | 2.9e-002 | 163 | hsa-miR-1323
9.5e+002-4.1e+003 | 4.29 (-) | 1.3e-004 | 30 | hsa-miR-200c
(+) the higher expression of this miR is in seminoma tumors
(-) the higher expression of this miR is in non-seminoma tumors
[0336] A combination of the expression level of any of the miRs
detailed in table 13 with the expression level of hsa-miR-200b (SEQ
ID NO: 29), hsa-miR-200a (SEQ ID NO: 28), hsa-miR-516a-5p (SEQ ID
NO: 211), hsa-miR-767-5p (SEQ ID NO: 227), hsa-miR-518a-3p (SEQ ID
NO: 215), hsa-miR-520d-5p (SEQ ID NO: 222), hsa-miR-519a (SEQ ID
NO: 218) and hsa-miR-517c (SEQ ID NO: 214) also provides for
classification of seminoma and non-seminoma testis-tumors.
[0337] hsa-miR-516a-5p (SEQ ID NO: 211) and hsa-miR-200b (SEQ ID
NO: 29) are used at node 12 of the binary-tree-classifier detailed
in the invention to distinguish between seminoma and non-seminoma
testis-tumors.
[0338] FIG. 11 demonstrates that tumors originating in seminomatous
testicular germ cell (marked by diamonds) are easily distinguished
from tumors of non seminomatous origins (marked by squares) using
the expression levels of hsa-miR-516a-5p (SEQ ID NO: 211, y-axis)
and hsa-miR-200b (SEQ ID NO: 29, x-axis).
TABLE-US-00014 TABLE 14 miR expression (in fluorescence units)
distinguishing between the group consisting of squamous cell
carcinoma (SCC), transitional cell carcinoma (TCC), thymoma and the
group consisting of non gastrointestinal (GI) adenocarcinoma tumors
miR name | SEQ ID NO. | p-value | fold-change | median values
hsa-miR-205 | 32 | 1.6e-059 | 321.76 (+) | 4.6e+004-1.4e+002
hsa-miR-210 | 36 | 8.6e-015 | 5.96 (+) | 2.9e+003-4.9e+002
hsa-miR-193b | 178 | 2.6e-016 | 3.82 (+) | 2.5e+003-6.6e+002
MID-16869 | 243 | 1.8e-008 | 3.67 (+) | 2.5e+003-6.8e+002
MID-16489 | 242 | 2.2e-011 | 3.53 (+) | 3.4e+003-9.7e+002
hsa-miR-31 | 49 | 8.2e-004 | 2.82 (+) | 3.7e+003-1.3e+003
MID-15965 | 240 | 1.7e-010 | 2.78 (+) | 6.4e+003-2.3e+003
hsa-miR-378 | 57 | 2.7e-017 | 2.71 (+) | 1.4e+003-5.2e+002
hsa-miR-138 | 11 | 2.9e-023 | 8.05 (-) | 2.8e+002-2.2e+003
hsa-miR-30a | 46 | 1.5e-018 | 3.70 (-) | 7.3e+002-2.7e+003
hsa-miR-146b-5p | 17 | 4.0e-013 | 2.60 (-) | 8.6e+002-2.2e+003
hsa-miR-30d | 47 | 1.5e-021 | 2.44 (-) | 1.8e+003-4.3e+003
hsa-miR-345 | 51 | 2.6e-019 | 2.38 (-) | 4.5e+002-1.1e+003
hsa-miR-125a-5p | 7 | 3.8e-014 | 2.30 (-) | 4.2e+003-9.6e+003
hsa-miR-125b | 8 | 3.0e-009 | 2.26 (-) | 1.9e+004-4.3e+004
hsa-miR-181b | 154 | 3.2e-008 | 2.24 (-) | 9.4e+002-2.1e+003
hsa-miR-29b | 190 | 3.3e-010 | 2.13 (-) | 7.4e+002-1.6e+003
hsa-let-7i | 157 | 4.6e-014 | 2.04 (-) | 6.6e+003-1.3e+004
hsa-miR-30c | 196 | 7.9e-010 | 2.04 (-) | 2.4e+003-4.8e+003
(+) the higher expression of this miR is in SCC, TCC and thymoma
(-) the higher expression of this miR is in non GI adenocarcinoma
[0339] Node 13 of the binary-tree-classifier separates tissues with
high expression of miR-205 (SCC marker) such as SCC, TCC and
thymomas from adenocarcinomas.
[0340] Breast adenocarcinoma and ovarian carcinoma are excluded
from this separation due to a wide range of expression of
miR-205.
[0341] A combination of the expression level of any of the miRs
detailed in table 14 with the expression level of hsa-miR-331-3p
(SEQ ID NO: 197) also provides for this classification.
[0342] hsa-miR-205 (SEQ ID NO: 32), hsa-miR-345 (SEQ ID NO: 51) and
hsa-miR-125a-5p (SEQ ID NO: 7) are used at node 13 of the
binary-tree-classifier detailed in the invention.
TABLE-US-00015 TABLE 15 miR expression (in fluorescence units)
distinguishing between the group consisting of breast
adenocarcinoma and the group consisting of SCC, TCC, thymomas and
ovarian carcinoma
miR name | SEQ ID NO. | p-value | fold-change | median values
hsa-miR-375 | 56 | 4.1e-029 | 25.95 (+) | 1.3e+003-5.0e+001
hsa-miR-30a | 46 | 1.6e-014 | 3.25 (+) | 2.7e+003-8.3e+002
hsa-miR-193a-3p | 25 | 9.5e-022 | 3.09 (+) | 4.4e+003-1.4e+003
hsa-miR-182 | 152 | 2.1e-009 | 2.94 (+) | 1.2e+003-4.1e+002
hsa-miR-342-3p | 50 | 7.3e-014 | 2.48 (+) | 5.5e+003-2.2e+003
hsa-miR-29c* | 45 | 6.3e-008 | 2.48 (+) | 6.6e+002-2.7e+002
hsa-miR-29c | 191 | 5.8e-007 | 2.26 (+) | 5.0e+002-2.2e+002
hsa-miR-199a-3p | 181 | 1.3e-004 | 2.19 (+) | 7.3e+003-3.3e+003
hsa-miR-195 | 179 | 2.0e-006 | 2.05 (+) | 2.2e+003-1.1e+003
hsa-miR-31 | 49 | 9.6e-014 | 13.81 (-) | 2.2e+002-3.1e+003
hsa-miR-205 | 32 | 9.3e-008 | 6.32 (-) | 6.3e+003-4.0e+004
hsa-miR-224 | 42 | 5.3e-010 | 5.72 (-) | 8.9e+001-5.1e+002
hsa-miR-203 | 184 | 6.7e-007 | 4.05 (-) | 1.5e+002-6.3e+002
hsa-miR-222 | 40 | 5.8e-018 | 2.64 (-) | 5.7e+003-1.5e+004
hsa-miR-221 | 147 | 1.0e-018 | 2.41 (-) | 3.8e+003-9.2e+003
MID-00689 | 236 | 4.9e-010 | 2.39 (-) | 4.8e+002-1.1e+003
hsa-miR-378 | 57 | 1.2e-010 | 2.37 (-) | 6.0e+002-1.4e+003
hsa-miR-422a | 203 | 2.2e-008 | 2.24 (-) | 2.3e+002-5.2e+002
hsa-miR-210 | 36 | 1.5e-007 | 2.22 (-) | 1.2e+003-2.6e+003
(+) the higher expression of this miR is in breast adenocarcinoma
(-) the higher expression of this miR is in SCC, TCC, thymomas and
ovarian carcinoma
[0343] hsa-miR-193a-3p (SEQ ID NO: 25), hsa-miR-375 (SEQ ID NO: 56)
and hsa-miR-342-3p (SEQ ID NO: 50) are used at node 14 of the
binary-tree-classifier detailed in the invention. According to
another embodiment, hsa-miR-193a-3p (SEQ ID NO: 25), hsa-miR-375
(SEQ ID NO: 56) and hsa-miR-224 (SEQ ID NO: 42) may be used at node
14 of the binary-tree-classifier detailed in the invention.
TABLE-US-00016 TABLE 16 miR expression (in fluorescence units)
distinguishing between the group consisting of ovarian carcinoma
and the group consisting of SCC, TCC and thymomas
miR name | SEQ ID NO. | p-value | fold-change | median values
hsa-miR-10a | 4 | 1.2e-012 | 5.57 (+) | 3.1e+003-5.5e+002
hsa-miR-130a | 10 | 1.7e-014 | 3.41 (+) | 5.1e+003-1.5e+003
hsa-miR-30a* | 195 | 1.3e-014 | 3.39 (+) | 2.5e+002-7.5e+001
hsa-miR-10b | 5 | 5.5e-009 | 2.68 (+) | 2.4e+003-8.8e+002
hsa-miR-625 | 226 | 2.5e-012 | 2.48 (+) | 2.9e+002-1.2e+002
hsa-let-7e | 2 | 7.7e-012 | 2.28 (+) | 8.8e+003-3.9e+003
hsa-miR-30a | 46 | 3.6e-007 | 2.20 (+) | 1.6e+003-7.3e+002
hsa-miR-205 | 32 | 1.0e-033 | 37.52 (-) | 1.2e+003-4.6e+004
hsa-miR-205* | 185 | 4.2e-018 | 5.42 (-) | 5.0e+001-2.7e+002
hsa-miR-138 | 11 | 1.1e-009 | 4.63 (-) | 6.0e+001-2.8e+002
hsa-miR-150 | 168 | 5.2e-010 | 4.18 (-) | 5.7e+002-2.4e+003
hsa-miR-203 | 184 | 2.5e-003 | 2.74 (-) | 2.9e+002-8.0e+002
hsa-miR-146a | 16 | 2.1e-006 | 2.62 (-) | 2.9e+002-7.6e+002
MID-16489 | 242 | 1.9e-007 | 2.49 (-) | 1.4e+003-3.4e+003
hsa-miR-140-3p | 12 | 2.3e-015 | 2.42 (-) | 9.5e+002-2.3e+003
MID-15684 | 237 | 5.2e-006 | 2.37 (-) | 7.6e+002-1.8e+003
MID-16869 | 243 | 9.8e-005 | 2.23 (-) | 1.1e+003-2.5e+003
MID-20703 | 250 | 5.9e-004 | 2.18 (-) | 2.1e+003-4.6e+003
hsa-miR-22 | 39 | 6.5e-012 | 2.13 (-) | 2.7e+003-5.8e+003
MID-23256 | 253 | 1.0e-003 | 2.13 (-) | 4.3e+002-9.2e+002
hsa-miR-31 | 49 | 1.4e-002 | 2.12 (-) | 1.7e+003-3.7e+003
MID-18422 | 246 | 1.3e-003 | 2.06 (-) | 1.7e+003-3.6e+003
hsa-miR-149* | 167 | 1.4e-007 | 2.04 (-) | 2.1e+003-4.4e+003
(+) the higher expression of this miR is in ovarian carcinoma
(-) the higher expression of this miR is in SCC, TCC and thymomas
[0344] hsa-miR-205 (SEQ ID NO: 32), hsa-miR-10a (SEQ ID NO: 4) and
hsa-miR-22 (SEQ ID NO: 39) are used at node 15 of the
binary-tree-classifier detailed in the invention.
TABLE-US-00017 TABLE 17 miR expression (in fluorescence units)
distinguishing between the group consisting of thyroid carcinoma
(follicular and papillary) and the group consisting of breast
adenocarcinoma, lung large cell carcinoma, lung adenocarcinoma and
ovarian carcinoma
miR name | SEQ ID NO. | p-value | fold-change | median values
hsa-miR-138 | 11 | 7.4e-033 | 33.86 (+) | 4.1e+003-1.2e+002
hsa-miR-221 | 147 | 1.4e-009 | 5.03 (+) | 3.4e+004-6.7e+003
hsa-miR-146b-5p | 17 | 1.0e-006 | 4.74 (+) | 4.9e+003-1.0e+003
hsa-let-7i | 157 | 2.8e-027 | 3.71 (+) | 2.2e+004-5.8e+003
hsa-miR-222 | 40 | 7.9e-009 | 3.63 (+) | 3.9e+004-1.1e+004
hsa-miR-125b | 8 | 1.5e-014 | 2.78 (+) | 5.4e+004-1.9e+004
hsa-miR-31 | 49 | 5.0e-003 | 2.78 (+) | 1.3e+003-4.9e+002
hsa-miR-126 | 9 | 1.3e-008 | 2.48 (+) | 6.9e+003-2.8e+003
hsa-miR-29c | 191 | 4.8e-007 | 2.36 (+) | 8.1e+002-3.4e+002
hsa-miR-451 | 205 | 3.8e-003 | 2.33 (+) | 1.3e+004-5.7e+003
hsa-miR-486-5p | 207 | 1.3e-003 | 2.16 (+) | 4.8e+002-2.2e+002
hsa-miR-30a* | 195 | 8.0e-006 | 2.12 (+) | 4.9e+002-2.3e+002
hsa-miR-345 | 51 | 2.7e-011 | 2.11 (+) | 1.4e+003-6.6e+002
hsa-miR-30a | 46 | 1.5e-006 | 2.10 (+) | .5e+003-1.7e+003
hsa-miR-29c* | 45 | 1.5e-006 | 1.97 (+) | 6.4e+002-3.2e+002
hsa-miR-34a | 52 | 3.4e-007 | 1.88 (+) | 7.4e+003-4.0e+003
hsa-miR-1977 | 234 | 5.9e-008 | 1.88 (+) | 6.1e+003-3.3e+003
hsa-miR-99a | 231 | 1.5e-005 | 1.85 (+) | 7.5e+003-4.1e+003
hsa-miR-181a | 21 | 2.7e-004 | 1.85 (+) | 7.6e+003-4.1e+003
hsa-miR-152 | 169 | 5.1e-008 | 1.82 (+) | 1.0e+003-5.5e+002
hsa-miR-29a | 43 | 3.7e-008 | 1.79 (+) | 1.1e+004-6.0e+003
hsa-miR-100 | 3 | 2.9e-005 | 1.77 (+) | 5.4e+003-3.0e+003
hsa-miR-30c | 196 | 7.3e-007 | 1.73 (+) | 5.9e+003-3.4e+003
hsa-miR-181b | 154 | 4.6e-004 | 1.69 (+) | 2.1e+003-1.2e+003
MID-00405 | 390 | 6.4e-003 | 1.66 (+) | 4.4e+002-2.6e+002
hsa-miR-15a | 171 | 5.3e-006 | 1.65 (+) | 5.5e+002-3.3e+002
MID-23794 | 255 | 6.0e-003 | 1.60 (+) | 1.3e+003-8.1e+002
hsa-miR-331-3p | 197 | 8.3e-007 | 1.57 (+) | 2.3e+003-1.4e+003
hsa-miR-29b | 190 | 1.4e-005 | 1.57 (+) | 1.8e+003-1.1e+003
hsa-miR-27b | 189 | 1.4e-003 | 1.56 (+) | 3.6e+003-2.3e+003
hsa-miR-22 | 39 | 3.6e-005 | 1.52 (+) | 6.6e+003-4.3e+003
hsa-miR-125a-5p | 7 | 2.2e-006 | 1.52 (+) | 9.9e+003-6.5e+003
hsa-miR-30e | 48 | 6.2e-006 | 1.51 (+) | 7.8e+002-5.2e+002
hsa-miR-30d | 47 | 1.2e-002 | 1.51 (+) | 4.3e+003-2.8e+003
hsa-miR-205 | 32 | 2.2e-005 | 22.05 (-) | 1.0e+002-2.2e+003
hsa-miR-210 | 36 | 4.8e-020 | 8.83 (-) | 2.1e+002-1.8e+003
hsa-miR-10a | 4 | 8.7e-011 | 4.34 (-) | 3.2e+002-1.4e+003
hsa-miR-193b | 178 | 4.6e-014 | 3.59 (-) | 4.9e+002-1.8e+003
hsa-miR-214 | 37 | 8.3e-006 | 2.74 (-) | 1.0e+003-2.8e+003
hsa-miR-199a-3p | 181 | 4.4e-005 | 2.67 (-) | 2.0e+003-5.5e+003
hsa-miR-193a-3p | 25 | 3.1e-011 | 2.65 (-) | 1.1e+003-2.8e+003
hsa-miR-199b-5p | 183 | 1.3e-004 | 2.63 (-) | 2.7e+002-7.2e+002
hsa-miR-199a-5p | 182 | 1.6e-005 | 2.57 (-) | 3.1e+003-8.1e+003
hsa-miR-21* | 35 | 4.7e-006 | 2.41 (-) | 5.1e+002-1.2e+003
MID-15965 | 240 | 9.3e-003 | 2.39 (-) | 2.0e+003-4.8e+003
hsa-miR-378 | 57 | 3.0e-006 | 2.35 (-) | 3.4e+002-8.0e+002
MID-16489 | 242 | 1.2e-003 | 2.25 (-) | 8.2e+002-1.9e+003
hsa-miR-425 | 204 | 1.4e-011 | 2.18 (-) | 6.0e+002-1.3e+003
MID-00689 | 236 | 5.3e-006 | 2.06 (-) | 3.0e+002-6.1e+002
hsa-miR-18a | 176 | 2.0e-006 | 1.96 (-) | 2.3e+002-4.5e+002
hsa-miR-106a | 158 | 3.5e-006 | 1.85 (-) | 2.3e+003-4.3e+003
hsa-miR-93 | 148 | 1.9e-010 | 1.79 (-) | 2.4e+003-4.4e+003
hsa-miR-455-3p | 206 | 1.9e-007 | 1.79 (-) | 2.9e+002-5.2e+002
hsa-miR-342-3p | 50 | 7.8e-005 | 1.78 (-) | 1.3e+003-2.3e+003
hsa-miR-17 | 20 | 6.0e-006 | 1.75 (-) | 1.4e+003-2.5e+003
hsa-miR-21 | 34 | 5.1e-006 | 1.75 (-) | 2.8e+004-4.8e+004
hsa-miR-20a | 186 | 1.1e-004 | 1.74 (-) | 1.4e+003-2.4e+003
MID-15907 | 239 | 8.4e-004 | 1.72 (-) | 2.4e+002-4.1e+002
MID-21271 | 251 | 9.9e-003 | 1.62 (-) | 3.0e+002-4.8e+002
MID-17144 | 244 | 4.3e-002 | 1.60 (-) | 2.2e+003-3.6e+003
hsa-miR-191 | 24 | 4.7e-006 | 1.59 (-) | 3.8e+003-6.1e+003
hsa-miR-25 | 188 | 2.0e-005 | 1.59 (-) | 1.0e+003-1.6e+003
hsa-miR-15b | 172 | 1.9e-002 | 1.57 (-) | 2.1e+003-3.2e+003
MID-15867 | 238 | 9.5e-003 | 1.56 (-) | 2.6e+003-4.0e+003
(+) the higher expression of this miR is in thyroid carcinoma
(-) the higher expression of this miR is in breast adenocarcinoma,
lung large cell carcinoma, lung adenocarcinoma and ovarian carcinoma
[0345] FIG. 12 demonstrates binary decisions at node #16 of the
decision-tree. Tumors originating in thyroid carcinoma (diamonds)
are easily distinguished from tumors of adenocarcinoma of the lung,
breast and ovarian origin (squares) using the expression levels of
hsa-miR-93 (SEQ ID NO: 148, y-axis), hsa-miR-138 (SEQ ID NO: 11,
x-axis) and hsa-miR-10a (SEQ ID NO: 4, z-axis).
TABLE-US-00018 TABLE 18 miR expression (in fluorescence units)
distinguishing between follicular thyroid carcinoma and papillary
thyroid carcinoma
miR name SEQ ID NO. p-value fold-change median values
MID-20524 249 4.5e-011 9.34 (+) 6.6e+003-7.1e+002
hsa-miR-1973 180 1.9e-008 7.80 (+) 1.7e+003-2.2e+002
hsa-miR-7 65 8.3e-005 7.58 (+) 4.5e+002-5.9e+001
hsa-miR-1978 235 4.8e-007 6.52 (+) 2.5e+003-3.8e+002
MID-16318 241 1.5e-008 6.14 (+) 2.2e+003-3.6e+002
MID-19533 248 3.0e-004 6.00 (+) 4.2e+002-7.1e+001
MID-23291 254 1.6e-008 5.76 (+) 9.6e+002-1.7e+002
MID-19340 247 3.2e-005 5.33 (+) 9.9e+002-1.9e+002
hsa-miR-1248 160 6.8e-009 5.17 (+) 6.4e+002-1.2e+002
MID-16869 243 1.1e-006 4.97 (+) 1.5e+003-3.0e+002
MID-18336 245 1.4e-010 4.48 (+) 2.7e+003-6.1e+002
MID-22664 252 7.0e-004 4.00 (+) 5.0e+002-1.2e+002
hsa-miR-146b-5p 17 6.7e-011 62.88 (-) 4.0e+002-2.5e+004
hsa-miR-31 49 2.5e-008 18.72 (-) 4.4e+002-8.2e+003
hsa-miR-146b-3p 166 5.0e-012 18.69 (-) 5.0e+001-9.3e+002
hsa-miR-551b 225 4.8e-006 10.86 (-) 7.6e+001-8.3e+002
hsa-miR-150 168 3.2e-007 10.71 (-) 3.1e+002-3.3e+003
hsa-miR-21 34 3.4e-007 4.40 (-) 1.1e+004-4.7e+004
(+) the higher expression of this miR is in follicular thyroid carcinoma
(-) the higher expression of this miR is in papillary thyroid carcinoma
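The relation between the fold-change and median values columns of Table 18 can be checked with a short Python sketch. It assumes, which the table itself does not state, that the fold-change is the ratio of the larger group median to the smaller one and that the (+) or (-) mark records which group is higher. The function names are illustrative. Because the medians are printed to two significant digits, the recomputed ratio only approximates the tabulated fold-change.

```python
def parse_medians(cell: str) -> tuple[float, float]:
    """Split a 'median values' cell such as '4.0e+002-2.5e+004' into two floats."""
    # The separator is the first '-' that is not part of an exponent ('e-').
    for i in range(1, len(cell)):
        if cell[i] == "-" and cell[i - 1] not in "eE":
            return float(cell[:i]), float(cell[i + 1:])
    raise ValueError(f"no separator found in {cell!r}")


def fold_change(cell: str) -> tuple[float, str]:
    """Return (ratio of larger to smaller median, '+' or '-').

    Assumption: '+' means the first-listed group has the higher median.
    """
    first, second = parse_medians(cell)
    ratio = max(first, second) / min(first, second)
    sign = "+" if first > second else "-"
    return ratio, sign


if __name__ == "__main__":
    # hsa-miR-146b-5p row of Table 18; the table lists 62.88 (-).
    print(fold_change("4.0e+002-2.5e+004"))
```

For the hsa-miR-146b-5p row the sketch gives a ratio of 62.5 with a (-) sign, close to the tabulated 62.88 (-); the residual difference is consistent with rounding of the printed medians.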
[0346] FIG. 13 demonstrates binary decisions at node #17 of the
decision-tree. Tumors originating in follicular thyroid carcinoma
(marked by diamonds) are easily distinguished from tumors of
papillary thyroid carcinoma origins (marked by squares) using the
expression levels of hsa-miR-21 (SEQ ID NO: 34, y-axis) and
hsa-miR-146b-5p (SEQ ID NO: 17, x-axis).
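As an illustration only, a two-miR binary decision of this kind can be sketched in Python as a nearest-median rule in log2 space. This rule is a stand-in chosen for simplicity, not the classifier used at node #17. The group medians are copied from the hsa-miR-21 and hsa-miR-146b-5p rows of Table 18; the function name and the sample values in the usage lines are hypothetical.

```python
import math

# Group medians in fluorescence units, copied from Table 18:
# (follicular thyroid carcinoma, papillary thyroid carcinoma).
MEDIANS = {
    "hsa-miR-21": (1.1e4, 4.7e4),
    "hsa-miR-146b-5p": (4.0e2, 2.5e4),
}


def classify_node17(sample: dict[str, float]) -> str:
    """Assign the class whose medians are closer to the sample in log2 space.

    Illustrative nearest-median rule; not the patent's node classifier.
    """
    dist_follicular = 0.0
    dist_papillary = 0.0
    for mir, (follicular, papillary) in MEDIANS.items():
        x = math.log2(sample[mir])
        dist_follicular += (x - math.log2(follicular)) ** 2
        dist_papillary += (x - math.log2(papillary)) ** 2
    return "follicular" if dist_follicular < dist_papillary else "papillary"


if __name__ == "__main__":
    # Hypothetical samples, not measurements from the patent.
    print(classify_node17({"hsa-miR-21": 1.0e4, "hsa-miR-146b-5p": 5.0e2}))
    print(classify_node17({"hsa-miR-21": 5.0e4, "hsa-miR-146b-5p": 2.0e4}))
```

Working in log2 space keeps the strongly expressed hsa-miR-21 from dominating the distance purely because of its larger absolute signal.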
TABLE-US-00019 TABLE 19 miR expression (in fluorescence units)
distinguishing between the group consisting of breast
adenocarcinoma and the group consisting of lung adenocarcinoma and
ovarian carcinoma
miR name SEQ ID NO. p-value fold-change median values
hsa-miR-205 32 8.8e-005 10.55 (+) 6.3e+003-6.0e+002
hsa-miR-375 56 7.9e-006 8.43 (+) 1.3e+003-1.5e+002
hsa-miR-342-3p 50 7.7e-012 3.17 (+) 5.5e+003-1.7e+003
hsa-miR-29c* 45 2.2e-008 2.52 (+) 6.6e+002-2.6e+002
hsa-miR-193a-3p 25 2.2e-012 2.23 (+) 4.4e+003-2.0e+003
MID-23256 253 7.9e-005 2.20 (+) 9.5e+002-4.3e+002
hsa-miR-182 152 4.9e-005 2.15 (+) 1.2e+003-5.6e+002
hsa-miR-126 9 5.3e-005 1.94 (+) 3.7e+003-1.9e+003
hsa-miR-30a 46 2.9e-004 1.90 (+) 2.7e+003-1.4e+003
hsa-miR-29c 191 1.1e-004 1.78 (+) 5.0e+002-2.8e+002
hsa-miR-193b 178 1.5e-006 1.72 (+) 2.4e+003-1.4e+003
hsa-miR-31 49 7.7e-006 5.79 (-) 2.2e+002-1.3e+003
hsa-miR-222 40 3.0e-010 2.69 (-) 5.7e+003-1.5e+004
hsa-miR-130a 10 1.6e-006 2.50 (-) 1.4e+003-3.4e+003
hsa-miR-221 147 1.2e-009 2.41 (-) 3.8e+003-9.3e+003
hsa-miR-10a 4 2.2e-002 2.33 (-) 9.3e+002-2.2e+003
hsa-miR-210 36 1.5e-005 2.09 (-) 1.2e+003-2.5e+003
hsa-miR-886-3p 228 9.9e-005 2.01 (-) 1.3e+003-2.6e+003
MID-00689 236 4.6e-004 1.95 (-) 4.8e+002-9.4e+002
hsa-miR-886-5p 230 6.3e-004 1.92 (-) 5.3e+002-1.0e+003
hsa-miR-27b 189 7.1e-005 1.86 (-) 1.8e+003-3.3e+003
MID-15965 240 6.1e-003 1.79 (-) 3.6e+003-6.4e+003
hsa-miR-92a 67 3.5e-005 1.75 (-) 2.7e+003-4.7e+003
hsa-miR-378 202 3.0e-004 1.73 (-) 6.0e+002-1.0e+003
hsa-miR-146b-5p 17 2.9e-004 1.71 (-) 8.0e+002-1.4e+003
(+) the higher expression of this miR is in breast adenocarcinoma
(-) the higher expression of this miR is in lung adenocarcinoma and
ovarian carcinoma
[0347] FIG. 14 demonstrates binary decisions at node #18 of the
decision-tree. Tumors originating in breast (diamonds) are easily
distinguished from tumors of lung and ovarian origin (squares)
using the expression levels of hsa-miR-92a (SEQ ID NO: 67, y-axis),
hsa-miR-193a-3p (SEQ ID NO: 25, x-axis) and hsa-miR-31 (SEQ ID NO:
49, z-axis).
TABLE-US-00020 TABLE 20 miR expression (in fluorescence units)
distinguishing between lung adenocarcinoma and ovarian carcinoma
median values fold-change p-value SEQ ID NO. miR name
1.4e+003-5.2e+001 27.96 (+) 3.5e-008 56 hsa-miR-375
5.5e+002-6.0e+001 9.19 (+) 8.1e-009 11 hsa-miR-138
3.2e+003-5.7e+002 5.65 (+) 5.6e-004 168 hsa-miR-150
9.7e+002-2.9e+002 3.35 (+) 2.2e-004 16 hsa-miR-146a
2.3e+003-7.6e+002 2.96 (+) 3.2e-003 237 MID-15684
8.4e+003-2.9e+003 2.88 (+) 1.7e-010 21 hsa-miR-181a
7.0e+003-2.6e+003 2.69 (+) 9.2e-008 52 hsa-miR-34a
2.4e+003-9.5e+002 2.58 (+) 3.2e-007 12 hsa-miR-140-3p
2.1e+003-8.8e+002 2.39 (+) 2.1e-007 154 hsa-miR-181b
1.4e+005-6.3e+004 2.28 (+) 1.9e-003 279 hsa-miR-1826
3.2e+003-1.4e+003 2.25 (+) 6.1e-003 9 hsa-miR-126
6.1e+003-2.7e+003 2.24 (+) 2.3e-006 39 hsa-miR-22
4.2e+003-2.2e+003 1.93 (+) 8.9e-005 47 hsa-miR-30d
1.9e+003-1.0e+003 1.90 (+) 3.5e-006 23 hsa-miR-185
2.8e+003-1.5e+003 1.88 (+) 1.4e-003 50 hsa-miR-342-3p
3.6e+003-2.1e+003 1.69 (+) 1.8e-002 167 hsa-miR-149*
9.7e+002-5.9e+002 1.66 (+) 8.7e-005 383 MID-22912
6.8e+004-4.2e+004 1.64 (+) 5.4e-004 34 hsa-miR-21
1.8e+003-1.2e+003 1.55 (+) 5.1e-004 35 hsa-miR-21*
1.4e+003-8.9e+002 1.54 (+) 1.7e-004 388 hsa-miR-423-5p
4.4e+002-2.4e+003 5.38 (-) 1.0e-007 5 hsa-miR-10b
1.9e+002-7.2e+002 3.77 (-) 3.3e-006 359 hsa-miR-708
6.4e+002-2.1e+003 3.27 (-) 5.8e-003 245 MID-18336
1.8e+002-5.4e+002 2.96 (-) 6.6e-003 254 MID-23291
1.7e+003-5.1e+003 2.95 (-) 3.4e-005 10 hsa-miR-130a
2.8e+003-8.1e+003 2.86 (-) 3.7e-003 240 MID-15965
5.1e+002-1.3e+003 2.65 (-) 3.7e-004 236 MID-00689
6.1e+002-1.6e+003 2.63 (-) 1.8e-004 202 hsa-miR-378
1.3e+003-3.1e+003 2.39 (-) 1.2e-002 4 hsa-miR-10a
1.0e+003-2.3e+003 2.30 (-) 1.8e-006 25 hsa-miR-193a-3p
2.6e+002-5.6e+002 2.15 (-) 4.1e-004 203 hsa-miR-422a
3.0e+003-6.1e+003 2.04 (-) 1.8e-002 231 hsa-miR-99a
2.0e+003-3.9e+003 2.01 (-) 3.5e-005 20 hsa-miR-17
3.3e+003-6.4e+003 1.96 (-) 3.5e-005 158 hsa-miR-106a
1.8e+003-3.5e+003 1.88 (-) 3.1e-005 186 hsa-miR-20a
3.2e+003-5.9e+003 1.85 (-) 1.2e-004 258 hsa-let-7f
2.4e+003-4.4e+003 1.85 (-) 2.0e-003 244 MID-17144
2.0e+003-3.5e+003 1.72 (-) 8.7e-004 172 hsa-miR-15b
5.2e+003-8.8e+003 1.69 (-) 5.3e-004 2 hsa-let-7e
4.1e+002-6.8e+002 1.66 (-) 1.2e-002 235 hsa-miR-1978
3.3e+004-5.4e+004 1.66 (-) 3.3e-005 256 hsa-let-7a
2.8e+003-4.7e+003 1.65 (-) 3.5e-003 28 hsa-miR-200a
5.2e+002-8.5e+002 1.62 (-) 1.6e-004 277 hsa-miR-17*
3.7e+002-5.8e+002 1.57 (-) 1.9e-003 296 hsa-miR-26b
1.0e+005-1.6e+005 1.56 (-) 3.7e-002 374 MID-16748
6.0e+003-9.1e+003 1.53 (-) 2.1e-005 153 hsa-let-7d
3.8e+003-5.7e+003 1.50 (-) 4.1e-003 181 hsa-miR-199a-3p
(+) the higher expression of this miR is in lung adenocarcinoma
(-) the higher expression of this miR is in ovarian carcinoma
[0348] FIG. 15 demonstrates binary decisions at node #19 of the
decision-tree. Tumors originating in lung adenocarcinoma (diamonds)
are easily distinguished from tumors of ovarian carcinoma origin
(squares) using the expression levels of hsa-miR-21 (SEQ ID NO: 34,
y-axis), hsa-miR-378 (SEQ ID NO: 202, x-axis) and hsa-miR-138 (SEQ
ID NO: 11, z-axis).
TABLE-US-00021 TABLE 21 miR expression (in fluorescence units)
distinguishing between the group consisting of thymic carcinoma and
the group consisting of TCC and SCC
median values fold-change p-value SEQ ID NO. miR name
5.3e+002-5.9e+001 9.00 (+) 5.7e-026 161 hsa-miR-128
7.4e+002-9.2e+001 8.04 (+) 2.2e-007 164 hsa-miR-142-5p
6.8e+002-8.8e+001 7.82 (+) 2.6e-021 22 hsa-miR-181a*
7.1e+002-1.2e+002 6.09 (+) 1.2e-006 53 hsa-miR-34c-5p
9.1e+002-1.8e+002 5.06 (+) 2.2e-008 285 hsa-miR-20b
1.3e+004-2.8e+003 4.59 (+) 7.5e-014 3 hsa-miR-100
1.6e+003-3.6e+002 4.39 (+) 7.1e-007 152 hsa-miR-182
8.7e+002-2.0e+002 4.37 (+) 6.4e-010 191 hsa-miR-29c
3.7e+003-9.1e+002 4.09 (+) 2.6e-014 154 hsa-miR-181b
1.5e+004-3.8e+003 3.82 (+) 4.8e-009 21 hsa-miR-181a
2.1e+003-6.4e+002 3.25 (+) 6.4e-006 206 hsa-miR-455-3p
8.7e+002-2.7e+002 3.23 (+) 1.5e-010 174 hsa-miR-181d
9.4e+002-2.9e+002 3.23 (+) 1.9e-004 19 hsa-miR-149
7.3e+002-2.6e+002 2.80 (+) 2.5e-008 45 hsa-miR-29c*
8.7e+002-3.2e+002 2.69 (+) 2.2e-007 171 hsa-miR-15a
2.7e+003-1.0e+003 2.66 (+) 5.1e-005 179 hsa-miR-195
4.4e+004-1.8e+004 2.46 (+) 8.2e-008 8 hsa-miR-125b
7.3e+002-3.2e+002 2.26 (+) 2.8e-004 296 hsa-miR-26b
2.7e+003-1.2e+003 2.22 (+) 2.4e-003 284 hsa-miR-19b
5.7e+002-2.6e+002 2.17 (+) 7.8e-005 18 hsa-miR-148a
9.1e+002-4.4e+002 2.06 (+) 1.4e-006 51 hsa-miR-345
7.5e+003-3.8e+003 2.00 (+) 1.6e-002 258 hsa-let-7f
1.8e+002-4.4e+003 24.66 (-) 3.9e-008 49 hsa-miR-31
5.8e+001-1.0e+003 17.65 (-) 1.0e-007 184 hsa-miR-203
2.2e+002-1.6e+003 7.43 (-) 4.5e-018 35 hsa-miR-21*
1.1e+004-5.5e+004 4.97 (-) 1.5e-032 34 hsa-miR-21
6.2e+002-2.5e+003 4.06 (-) 2.0e-008 37 hsa-miR-214
1.6e+002-5.8e+002 3.69 (-) 6.3e-005 42 hsa-miR-224
6.9e+002-2.5e+003 3.58 (-) 2.3e-009 228 hsa-miR-886-3p
4.8e+003-1.7e+004 3.47 (-) 3.9e-009 15 hsa-miR-145
2.7e+003-8.2e+003 3.08 (-) 2.7e-008 14 hsa-miR-143
1.3e+003-3.7e+003 2.93 (-) 6.7e-005 242 MID-16489
2.5e+002-7.4e+002 2.91 (-) 4.3e-006 230 hsa-miR-886-5p
3.5e+002-1.0e+003 2.90 (-) 5.6e-004 253 MID-23256
1.1e+003-3.0e+003 2.82 (-) 7.9e-006 36 hsa-miR-210
2.2e+003-5.8e+003 2.63 (-) 1.2e-007 182 hsa-miR-199a-5p
5.0e+003-1.2e+004 2.48 (-) 3.8e-006 293 hsa-miR-23b
7.5e+003-1.8e+004 2.44 (-) 1.1e-008 292 hsa-miR-23a
2.7e+002-6.6e+002 2.43 (-) 5.4e-003 4 hsa-miR-10a
9.0e+003-2.2e+004 2.43 (-) 5.5e-015 294 hsa-miR-24
3.8e+003-8.8e+003 2.35 (-) 3.0e-004 297 hsa-miR-27a
2.3e+002-5.2e+002 2.28 (-) 1.6e-002 354 hsa-miR-612
1.8e+003-3.9e+003 2.22 (-) 1.1e-006 377 MID-17866
1.6e+003-3.3e+003 2.11 (-) 2.4e-004 189 hsa-miR-27b
6.9e+003-1.4e+004 2.08 (-) 1.5e-004 30 hsa-miR-200c
3.8e+004-7.9e+004 2.05 (-) 1.8e-008 386 MID-23178
1.5e+003-3.1e+003 2.04 (-) 2.6e-002 249 MID-20524
4.6e+002-9.3e+002 2.03 (-) 5.6e-002 5 hsa-miR-10b
2.5e+002-5.0e+002 2.03 (-) 3.2e-005 274 hsa-miR-151-3p
(+) the higher expression of this miR is in thymic carcinoma
(-) the higher expression of this miR is in TCC and SCC
[0349] FIG. 16 demonstrates binary decisions at node #20 of the
decision-tree. Tumors originating in thymic carcinoma (marked by
diamonds) are easily distinguished from tumors of urothelial
carcinoma (transitional cell carcinoma, TCC) and squamous
cell carcinoma (SCC) origins (marked by squares) using the
expression levels of hsa-miR-21 (SEQ ID NO: 34, y-axis) and
hsa-miR-100 (SEQ ID NO: 3, x-axis).
TABLE-US-00022 TABLE 22 miR expression (in fluorescence units)
distinguishing between TCC and SCC (of anus, skin, lung,
head & neck, esophagus or uterine cervix)
median values fold-change p-value SEQ ID NO. miR name
2.5e+002-5.0e+001 5.05 (+) 7.4e-036 69 hsa-miR-934
9.1e+003-2.2e+003 4.14 (+) 1.2e-012 28 hsa-miR-200a
2.1e+002-5.3e+001 3.87 (+) 8.4e-007 280 hsa-miR-187
6.0e+003-1.9e+003 3.19 (+) 9.8e-008 13 hsa-miR-141
5.6e+002-1.8e+002 3.15 (+) 4.7e-013 191 hsa-miR-29c
9.4e+002-3.0e+002 3.13 (+) 8.1e-008 152 hsa-miR-182
1.8e+004-6.2e+003 2.99 (+) 3.9e-010 29 hsa-miR-200b
3.2e+002-1.1e+002 2.81 (+) 8.8e-009 175 hsa-miR-183
3.1e+004-1.2e+004 2.65 (+) 8.1e-005 30 hsa-miR-200c
2.1e+003-8.1e+002 2.63 (+) 6.2e-020 204 hsa-miR-425
1.2e+003-4.8e+002 2.41 (+) 6.3e-007 4 hsa-miR-10a
8.5e+003-3.7e+003 2.30 (+) 2.4e-024 24 hsa-miR-191
1.8e+003-8.4e+002 2.14 (+) 2.7e-006 5 hsa-miR-10b
3.5e+002-1.7e+002 2.09 (+) 2.0e-006 329 hsa-miR-425*
3.4e+002-1.6e+002 2.08 (+) 8.0e-005 273 hsa-miR-148b
3.1e+002-8.1e+002 2.60 (-) 1.6e-005 170 hsa-miR-155
5.0e+002-1.2e+003 2.49 (-) 1.0e-002 184 hsa-miR-203
1.5e+002-3.5e+002 2.39 (-) 3.0e-011 26 hsa-miR-193a-5p
1.9e+003-4.5e+003 2.35 (-) 1.3e-008 231 hsa-miR-99a
1.2e+002-2.7e+002 2.28 (-) 1.6e-003 368 MID-00672
1.4e+003-3.2e+003 2.25 (-) 1.3e-005 37 hsa-miR-214
3.9e+002-8.7e+002 2.23 (-) 1.5e-004 16 hsa-miR-146a
3.5e+002-7.6e+002 2.15 (-) 4.2e-005 169 hsa-miR-152
1.5e+002-3.3e+002 2.15 (-) 4.4e-004 155 hsa-miR-127-3p
8.4e+002-1.7e+003 2.08 (-) 1.6e-005 35 hsa-miR-21*
7.7e+003-1.6e+004 2.07 (-) 5.3e-012 40 hsa-miR-222
4.4e+002-9.0e+002 2.06 (-) 1.5e-005 17 hsa-miR-146b-5p
(+) the higher expression of this miR is in TCC
(-) the higher expression of this miR is in SCC
[0350] hsa-miR-934 (SEQ ID NO: 69), hsa-miR-191 (SEQ ID NO: 24) and
hsa-miR-29c (SEQ ID NO: 191) are used at node #21 of the
binary-tree-classifier detailed in the invention to distinguish
between TCC and SCC.
TABLE-US-00023 TABLE 23 miR expression (in fluorescence units)
distinguishing between SCC of the uterine cervix and other SCC
tumors (anus, skin, lung, head & neck or esophagus)
median values auROC fold-change p-value SEQ ID NO. miR name
2.4e+002-9.2e+001 0.65 2.57 (+) 2.0e-002 164 hsa-miR-142-5p
1.6e+003-7.6e+002 0.85 2.13 (+) 1.7e-005 5 hsa-miR-10b
8.9e+003-4.4e+003 0.74 2.01 (+) 2.1e-004 231 hsa-miR-99a
1.2e+003-9.8e+002 0.71 1.24 (+) 1.2e-002 54 hsa-miR-361-5p
3.4e+004-2.7e+004 0.71 1.24 (+) 3.9e-004 1 hsa-let-7c
1.3e+003-4.3e+003 0.81 3.39 (-) 9.9e-006 242 MID-16489
3.9e+002-1.2e+003 0.74 3.10 (-) 2.1e-003 372 MID-16469
1.1e+003-3.3e+003 0.84 3.09 (-) 1.3e-008 249 MID-20524
1.7e+003-5.2e+003 0.78 3.01 (-) 2.4e-005 167 hsa-miR-149*
2.7e+002-8.0e+002 0.79 2.97 (-) 1.4e-004 254 MID-23291
2.2e+002-6.2e+002 0.76 2.77 (-) 1.7e-004 354 hsa-miR-612
5.7e+002-1.5e+003 0.76 2.65 (-) 7.8e-006 381 MID-19962
2.3e+002-6.0e+002 0.79 2.63 (-) 2.1e-005 380 MID-19898
9.8e+002-2.4e+003 0.78 2.44 (-) 2.8e-005 245 MID-18336
1.2e+002-2.8e+002 0.73 2.34 (-) 5.3e-003 358 hsa-miR-665
6.1e+002-1.4e+003 0.70 2.31 (-) 6.1e-003 364 MID-00064
2.9e+003-6.7e+003 0.81 2.30 (-) 8.8e-008 240 MID-15965
1.2e+002-2.8e+002 0.66 2.26 (-) 1.5e-002 11 hsa-miR-138
1.0e+002-2.3e+002 0.77 2.24 (-) 8.9e-005 378 MID-18307
(+) the higher expression of this miR is in SCC of the uterine cervix
(-) the higher expression of this miR is in other SCC tumors
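Table 23 includes an auROC column (area under the receiver operating characteristic curve) for each miR. The following Python sketch shows one common way such a value can be computed for a single miR, using the pairwise formulation: the fraction of cross-group sample pairs in which the first group has the higher value, with ties counted as one half. The function names and the sample values are illustrative and are not data from this document. The tabulated auROC values are all above 0.5 for both (+) and (-) rows, which suggests they are reported irrespective of direction; the second helper mirrors that reading as an assumption.

```python
def auroc(group_a: list[float], group_b: list[float]) -> float:
    """Fraction of (a, b) pairs with a > b; ties count as one half."""
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5
    return wins / (len(group_a) * len(group_b))


def direction_free_auroc(group_a: list[float], group_b: list[float]) -> float:
    """auROC reported regardless of which group is higher (assumed convention)."""
    value = auroc(group_a, group_b)
    return max(value, 1.0 - value)


if __name__ == "__main__":
    # Made-up expression values for two tumor groups.
    higher = [3.0, 4.0, 5.0]
    lower = [1.0, 2.0, 3.5]
    print(auroc(higher, lower))                 # 8 of 9 pairs favor the first group
    print(direction_free_auroc(lower, higher))  # same value with the groups swapped
```

An auROC of 1.0 means the two groups are perfectly separated by that miR alone, while 0.5 means the miR carries no ranking information.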
[0351] FIG. 17 demonstrates binary decisions at node #22 of the
decision-tree. Tumors originating in SCC of the uterine cervix
(diamonds) are easily distinguished from tumors of other SCC origin
(squares) using the expression levels of hsa-miR-361-5p (SEQ ID NO:
54, y-axis), hsa-let-7c (SEQ ID NO: 1, x-axis) and hsa-miR-10b (SEQ
ID NO: 5, z-axis).
TABLE-US-00024 TABLE 24 miR expression (in fluorescence units)
distinguishing between anus or skin SCC and upper SCC tumors (lung,
head & neck or esophagus)
median values auROC fold-change p-value SEQ ID NO. miR name
3.2e+002-5.0e+001 0.78 6.38 (+) 3.0e-006 305 hsa-miR-31*
4.3e+003-8.0e+002 0.80 5.39 (+) 1.8e-006 184 hsa-miR-203
8.6e+002-2.5e+002 0.78 3.49 (+) 1.8e-006 41 hsa-miR-223
1.7e+003-5.4e+002 0.80 3.12 (+) 3.5e-006 183 hsa-miR-199b-5p
9.4e+003-3.5e+003 0.70 2.73 (+) 2.4e-003 49 hsa-miR-31
8.7e+003-3.2e+003 0.86 2.71 (+) 3.6e-007 382 MID-22331
1.9e+003-7.1e+002 0.87 2.68 (+) 1.7e-008 235 hsa-miR-1978
2.4e+002-9.2e+001 0.83 2.55 (+) 9.6e-009 291 hsa-miR-222*
6.8e+003-2.9e+003 0.74 2.31 (+) 7.4e-004 181 hsa-miR-199a-3p
1.5e+003-6.7e+002 0.88 2.28 (+) 7.1e-007 5 hsa-miR-10b
5.3e+002-2.4e+002 0.75 2.21 (+) 1.4e-004 296 hsa-miR-26b
3.4e+002-1.6e+002 0.74 2.19 (+) 7.7e-005 289 hsa-miR-22*
1.3e+003-6.0e+002 0.71 2.13 (+) 1.2e-003 206 hsa-miR-455-3p
7.9e+003-3.8e+003 0.84 2.11 (+) 4.2e-006 338 hsa-miR-494
2.9e+002-1.4e+002 0.73 2.08 (+) 1.1e-004 334 hsa-miR-483-5p
2.8e+003-1.3e+003 0.82 2.07 (+) 4.5e-006 25 hsa-miR-193a-3p
1.1e+002-3.3e+002 0.77 3.03 (-) 2.3e-005 11 hsa-miR-138
1.3e+002-3.1e+002 0.65 2.29 (-) 1.5e-002 19 hsa-miR-149
9.7e+001-2.1e+002 0.75 2.16 (-) 4.0e-005 198 hsa-miR-342-5p
1.1e+003-1.8e+003 0.83 1.63 (-) 1.1e-006 23 hsa-miR-185
(+) the higher expression of this miR is in anus or skin SCC
(-) the higher expression of this miR is in upper SCC tumors
[0352] hsa-miR-10b (SEQ ID NO: 5), hsa-miR-138 (SEQ ID NO: 11) and
hsa-miR-185 (SEQ ID NO: 23) are used at node #23 of the
binary-tree-classifier detailed in the invention to distinguish
between anus or skin SCC and upper SCC tumors.
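The way nodes #20 to #23 chain together can be sketched as a small data structure in Python. The miR lists per node are taken from the text above (FIG. 16 and FIG. 17 and the statements for nodes #21 and #23). The linkage between nodes is inferred from the class groupings named in Tables 21 to 24 and is an assumption, as are the `Node` type, the function names and the caller-supplied decision rules; the actual per-node classifier is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, Union

Sample = dict[str, float]  # miR name -> expression level


@dataclass
class Node:
    number: int
    mirs: tuple[str, ...]
    decide: Callable[[Sample], bool]  # True selects if_true, False selects if_false
    if_true: Union["Node", str]
    if_false: Union["Node", str]


def classify(node: Union[Node, str], sample: Sample) -> str:
    """Walk the tree from the given node until a leaf label is reached."""
    while isinstance(node, Node):
        node = node.if_true if node.decide(sample) else node.if_false
    return node


def build_scc_subtree(decide: dict[int, Callable[[Sample], bool]]) -> Node:
    """Assemble nodes #20 to #23; `decide` maps node number to a hypothetical rule."""
    node23 = Node(23, ("hsa-miR-10b", "hsa-miR-138", "hsa-miR-185"), decide[23],
                  "anus or skin SCC", "upper SCC (lung, head & neck or esophagus)")
    node22 = Node(22, ("hsa-miR-361-5p", "hsa-let-7c", "hsa-miR-10b"), decide[22],
                  "SCC of the uterine cervix", node23)
    node21 = Node(21, ("hsa-miR-934", "hsa-miR-191", "hsa-miR-29c"), decide[21],
                  "TCC", node22)
    return Node(20, ("hsa-miR-21", "hsa-miR-100"), decide[20],
                "thymic carcinoma", node21)


if __name__ == "__main__":
    # Placeholder rules that ignore the sample; real rules would use node.mirs.
    rules = {20: lambda s: False, 21: lambda s: False,
             22: lambda s: False, 23: lambda s: True}
    print(classify(build_scc_subtree(rules), {}))
```

Keeping the decision rule as a separate callable per node reflects that each node uses its own small set of miRs, so a rule can be replaced without touching the tree layout.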
TABLE-US-00025 TABLE 25 miR expression (in fluorescence units)
distinguishing between melanoma and lymphoma (B-cell or T-cell)
tumors
median values auROC fold-change p-value SEQ ID NO. miR name
1.7e+003-3.0e+002 0.89 5.81 (+) 2.8e-010 4 hsa-miR-10a
1.9e+003-6.0e+002 0.80 3.13 (+) 7.9e-005 11 hsa-miR-138
1.7e+003-5.7e+002 0.94 2.98 (+) 2.3e-011 46 hsa-miR-30a
2.5e+004-8.8e+003 0.87 2.83 (+) 1.1e-009 8 hsa-miR-125b
6.2e+002-2.3e+002 0.94 2.74 (+) 9.2e-011 274 hsa-miR-151-3p
9.2e+002-3.4e+002 0.87 2.70 (+) 1.9e-007 169 hsa-miR-152
1.6e+003-6.0e+002 0.77 2.60 (+) 2.0e-004 36 hsa-miR-210
4.8e+003-1.9e+003 0.90 2.56 (+) 2.1e-011 47 hsa-miR-30d
1.2e+003-5.5e+002 0.88 2.26 (+) 2.5e-008 363 hsa-miR-99b
2.4e+003-1.1e+003 0.85 2.24 (+) 1.4e-006 231 hsa-miR-99a
6.5e+003-3.0e+003 0.80 2.17 (+) 2.2e-005 303 hsa-miR-30b
6.4e+002-3.0e+002 0.86 2.14 (+) 2.9e-008 349 hsa-miR-532-5p
2.1e+003-1.0e+003 0.86 2.08 (+) 1.6e-006 10 hsa-miR-130a
5.4e+003-2.6e+003 0.81 2.06 (+) 8.3e-006 7 hsa-miR-125a-5p
3.6e+003-1.8e+003 0.82 2.05 (+) 2.5e-006 3 hsa-miR-100
7.9e+003-3.9e+003 0.69 2.04 (+) 1.5e-002 16 hsa-miR-146a
1.6e+002-2.2e+003 0.93 13.84 (-) 1.1e-014 164 hsa-miR-142-5p
7.2e+002-7.5e+003 0.93 10.40 (-) 1.1e-013 170 hsa-miR-155
2.0e+003-1.4e+004 0.90 7.18 (-) 2.2e-010 168 hsa-miR-150
1.7e+002-7.0e+002 0.91 4.14 (-) 5.2e-011 198 hsa-miR-342-5p
2.2e+003-8.3e+003 0.97 3.83 (-) 6.2e-019 50 hsa-miR-342-3p
9.1e+002-2.6e+003 0.86 2.87 (-) 1.3e-008 245 MID-18336
2.3e+002-6.4e+002 0.77 2.74 (-) 2.4e-004 365 MID-00078
1.9e+002-5.2e+002 0.78 2.68 (-) 4.7e-004 45 hsa-miR-29c*
2.8e+003-6.6e+003 0.74 2.34 (-) 1.6e-003 382 MID-22331
3.4e+003-7.9e+003 0.85 2.30 (-) 1.2e-004 259 hsa-let-7g
3.8e+002-8.5e+002 0.75 2.25 (-) 4.3e-003 296 hsa-miR-26b
7.5e+002-1.6e+003 0.78 2.16 (-) 2.3e-004 364 MID-00064
6.3e+002-1.3e+003 0.79 2.08 (-) 7.9e-005 314 hsa-miR-361-3p
2.7e+003-5.4e+003 0.80 2.05 (-) 7.5e-007 12 hsa-miR-140-3p
(+) the higher expression of this miR is in melanoma
(-) the higher expression of this miR is in lymphoma
[0353] FIG. 18 demonstrates binary decisions at node #24 of the
decision-tree. Tumors originating in melanoma (diamonds) are easily
distinguished from tumors of lymphoma origin (squares) using the
expression levels of hsa-miR-342-3p (SEQ ID NO: 50, y-axis) and
hsa-miR-30d (SEQ ID NO: 47, x-axis).
TABLE-US-00026 TABLE 26 miR expression (in fluorescence units)
distinguishing between B-cell lymphoma and T-cell lymphoma
median values auROC fold-change p-value SEQ ID NO. miR name
8.3e+002-2.8e+002 0.74 2.96 (+) 3.7e-005 11 hsa-miR-138
6.7e+002-2.8e+002 0.72 2.37 (+) 2.2e-003 191 hsa-miR-29c
1.2e+003-5.9e+002 0.76 2.02 (+) 1.4e-003 48 hsa-miR-30e
6.7e+002-1.8e+003 0.79 2.77 (-) 1.1e-006 35 hsa-miR-21*
1.5e+003-3.9e+003 0.68 2.68 (-) 2.6e-003 228 hsa-miR-886-3p
(+) the higher expression of this miR is in B-cell lymphoma
(-) the higher expression of this miR is in T-cell lymphoma
[0354] hsa-miR-30e (SEQ ID NO: 48) and hsa-miR-21* (SEQ ID NO: 35)
are used at node #25 of the binary-tree-classifier detailed in the
invention to distinguish between B-cell lymphoma and T-cell
lymphoma.
TABLE-US-00027 TABLE 27 miR expression (in fluorescence units)
distinguishing between lung small cell carcinoma and other
neuroendocrine tumors selected from the group consisting of lung
carcinoid, medullary thyroid carcinoma, gastrointestinal tract
carcinoid and pancreatic islet cell tumor
median values auROC fold-change p-value SEQ ID NO. miR name
1.2e+004-1.2e+003 0.99 9.68 (+) 3.3e-021 158 hsa-miR-106a
7.3e+003-7.9e+002 1.00 9.17 (+) 3.4e-022 20 hsa-miR-17
1.4e+003-1.6e+002 0.99 8.53 (+) 8.2e-022 176 hsa-miR-18a
5.8e+003-7.0e+002 1.00 8.38 (+) 7.4e-021 186 hsa-miR-20a
1.1e+004-1.5e+003 0.98 7.71 (+) 1.7e-022 148 hsa-miR-93
4.7e+003-6.7e+002 0.89 6.99 (+) 1.0e-008 36 hsa-miR-210
2.2e+003-3.7e+002 0.95 5.87 (+) 2.8e-016 51 hsa-miR-345
8.9e+003-1.8e+003 0.95 4.96 (+) 1.6e-010 172 hsa-miR-15b
8.2e+003-1.8e+003 0.98 4.68 (+) 6.3e-020 260 hsa-miR-106b
1.1e+003-2.4e+002 0.91 4.62 (+) 7.7e-010 265 hsa-miR-130b
8.0e+003-1.8e+003 0.94 4.33 (+) 2.7e-013 67 hsa-miR-92a
4.1e+003-9.8e+002 0.98 4.15 (+) 2.6e-019 188 hsa-miR-25
1.1e+003-3.4e+002 0.98 3.40 (+) 7.9e-016 277 hsa-miR-17*
2.5e+003-8.3e+002 0.99 2.96 (+) 1.8e-011 284 hsa-miR-19b
5.1e+002-1.8e+002 0.74 2.84 (+) 6.1e-004 302 hsa-miR-301a
7.9e+002-2.9e+002 0.91 2.78 (+) 8.7e-010 68 hsa-miR-92b
9.9e+002-4.3e+002 0.69 2.28 (+) 5.1e-002 168 hsa-miR-150
2.5e+003-1.1e+003 0.70 2.24 (+) 4.5e-003 242 MID-16489
1.4e+003-6.6e+002 0.91 2.12 (+) 1.1e-009 204 hsa-miR-425
5.0e+001-1.6e+003 0.91 31.23 (-) 4.5e-009 162 hsa-miR-129-3p
1.1e+002-1.6e+003 0.91 14.13 (-) 8.6e-009 177 hsa-miR-192
7.6e+001-7.9e+002 0.91 10.42 (-) 1.5e-008 27 hsa-miR-194
5.5e+002-5.0e+003 0.92 9.14 (-) 1.7e-009 65 hsa-miR-7
7.1e+001-5.7e+002 0.78 8.02 (-) 5.8e-005 263 hsa-miR-129*
2.5e+002-1.6e+003 0.80 6.30 (-) 3.5e-005 155 hsa-miR-127-3p
1.5e+002-9.1e+002 0.96 6.05 (-) 3.5e-015 191 hsa-miR-29c
3.3e+002-2.0e+003 0.93 5.99 (-) 3.3e-013 190 hsa-miR-29b
1.7e+002-9.9e+002 0.99 5.76 (-) 1.3e-020 45 hsa-miR-29c*
1.2e+002-6.6e+002 0.75 5.60 (-) 8.0e-004 59 hsa-miR-487b
1.8e+003-8.0e+003 0.90 4.44 (-) 1.6e-012 43 hsa-miR-29a
1.3e+004-4.9e+004 0.88 3.87 (-) 3.3e-006 56 hsa-miR-375
1.6e+002-5.5e+002 0.95 3.37 (-) 9.6e-011 266 hsa-miR-132
4.0e+003-1.2e+004 0.82 2.98 (-) 9.4e-006 14 hsa-miR-143
7.8e+003-2.3e+004 0.85 2.89 (-) 6.1e-006 15 hsa-miR-145
1.2e+004-3.4e+004 0.79 2.83 (-) 4.3e-005 8 hsa-miR-125b
4.5e+003-1.2e+004 0.97 2.70 (-) 1.7e-014 7 hsa-miR-125a-5p
1.9e+003-5.0e+003 0.89 2.67 (-) 3.6e-010 39 hsa-miR-22
2.5e+003-5.7e+003 0.79 2.25 (-) 8.7e-004 189 hsa-miR-27b
1.1e+003-2.4e+003 0.64 2.18 (-) 4.1e-002 249 MID-20524
2.2e+003-4.8e+003 0.72 2.14 (-) 8.5e-003 231 hsa-miR-99a
9.6e+003-2.0e+004 0.82 2.12 (-) 1.3e-003 293 hsa-miR-23b
5.1e+003-1.0e+004 0.80 2.01 (-) 6.3e-005 2 hsa-let-7e
(+) the higher expression of this miR is in lung small cell carcinoma
(-) the higher expression of this miR is in other neuroendocrine
tumors
[0355] hsa-miR-17 (SEQ ID NO: 20) and hsa-miR-29c* (SEQ ID NO: 45)
are used at node #26 of the binary-tree-classifier detailed in the
invention to distinguish between lung small cell carcinoma and
other neuroendocrine tumors.
TABLE-US-00028 TABLE 28 miR expression (in fluorescence units)
distinguishing between medullary thyroid carcinoma and other
neuroendocrine tumors selected from the group consisting of lung
carcinoid, gastrointestinal tract carcinoid and pancreatic islet
cell tumor
median values auROC fold-change p-value SEQ ID NO. miR name
4.4e+003-5.5e+001 0.84 79.70 (+) 1.5e-007 159 hsa-miR-124
4.0e+004-4.9e+003 0.98 8.07 (+) 1.6e-015 40 hsa-miR-222
1.9e+004-2.8e+003 0.98 6.85 (+) 4.8e-016 147 hsa-miR-221
1.1e+003-2.0e+002 0.70 5.55 (+) 1.1e-003 11 hsa-miR-138
3.2e+002-7.8e+001 0.83 4.12 (+) 7.6e-007 311 hsa-miR-335
5.8e+003-1.5e+003 0.86 3.91 (+) 1.3e-006 4 hsa-miR-10a
6.3e+004-1.7e+004 0.83 3.61 (+) 3.9e-006 8 hsa-miR-125b
1.1e+004-3.2e+003 0.79 3.43 (+) 5.5e-005 231 hsa-miR-99a
4.3e+002-2.0e+002 0.78 2.10 (+) 2.8e-004 301 hsa-miR-29b-2*
7.9e+003-3.8e+003 0.82 2.06 (+) 4.4e-005 297 hsa-miR-27a
1.4e+002-4.0e+002 0.95 2.95 (-) 7.5e-011 68 hsa-miR-92b
1.1e+003-2.8e+003 0.87 2.50 (-) 3.2e-006 67 hsa-miR-92a
1.8e+002-3.7e+002 0.76 2.07 (-) 2.0e-003 265 hsa-miR-130b
4.4e+002-9.0e+002 0.75 2.04 (-) 2.1e-003 36 hsa-miR-210
(+) the higher expression of this miR is in medullary thyroid carcinoma
(-) the higher expression of this miR is in other neuroendocrine
tumors
[0356] FIG. 19 demonstrates binary decisions at node #27 of the
decision-tree. Tumors originating in medullary thyroid carcinoma
(diamonds) are easily distinguished from tumors of other
neuroendocrine origin (squares) using the expression levels of
hsa-miR-92b (SEQ ID NO: 68, y-axis), hsa-miR-222 (SEQ ID NO: 40,
x-axis) and hsa-miR-92a (SEQ ID NO: 67, z-axis).
TABLE-US-00029 TABLE 29 miR expression (in fluorescence units)
distinguishing between lung carcinoid tumors and GI neuroendocrine
tumors selected from the group consisting of gastrointestinal tract
carcinoid and pancreatic islet cell tumor
median values auROC fold-change p-value SEQ ID NO. miR name
4.0e+003-9.9e+001 0.90 40.08 (+) 1.9e-010 331 hsa-miR-432
6.0e+003-1.5e+002 0.86 39.24 (+) 4.6e-008 162 hsa-miR-129-3p
6.3e+003-1.9e+002 0.87 34.16 (+) 7.8e-009 59 hsa-miR-487b
1.3e+003-5.5e+001 0.88 23.36 (+) 2.9e-010 326 hsa-miR-409-5p
1.1e+003-5.0e+001 0.88 21.14 (+) 5.2e-010 306 hsa-miR-323-3p
1.0e+003-5.5e+001 0.87 18.59 (+) 1.5e-009 350 hsa-miR-539
7.9e+002-5.6e+001 0.84 14.25 (+) 1.4e-008 317 hsa-miR-369-5p
1.0e+004-7.2e+002 0.86 13.95 (+) 3.2e-007 155 hsa-miR-127-3p
1.7e+003-1.2e+002 0.86 13.60 (+) 2.1e-008 325 hsa-miR-409-3p
1.6e+003-1.2e+002 0.88 13.10 (+) 4.2e-009 318 hsa-miR-370
9.5e+002-7.3e+001 0.81 13.03 (+) 3.1e-006 339 hsa-miR-495
9.5e+002-7.4e+001 0.84 12.92 (+) 5.7e-007 264 hsa-miR-129-5p
6.4e+002-5.0e+001 0.91 12.84 (+) 1.6e-013 332 hsa-miR-433
6.5e+002-5.7e+001 0.88 11.52 (+) 5.1e-011 262 hsa-miR-127-5p
5.6e+002-5.2e+001 0.90 10.76 (+) 2.7e-012 336 hsa-miR-485-5p
2.0e+003-1.9e+002 0.86 10.44 (+) 4.2e-008 324 hsa-miR-382
7.9e+002-7.8e+001 0.83 10.20 (+) 1.3e-007 322 hsa-miR-379
6.0e+002-5.9e+001 0.89 10.15 (+) 9.6e-012 330 hsa-miR-431*
4.7e+002-5.0e+001 0.90 9.41 (+) 6.1e-012 321 hsa-miR-377*
1.3e+003-1.4e+002 0.80 9.40 (+) 1.5e-005 263 hsa-miR-129*
4.7e+002-5.0e+001 0.86 9.35 (+) 1.8e-008 309 hsa-miR-329
4.9e+002-5.3e+001 0.79 9.24 (+) 3.1e-005 53 hsa-miR-34c-5p
1.1e+003-1.2e+002 0.83 9.05 (+) 6.4e-007 320 hsa-miR-376c
1.1e+003-1.2e+002 0.86 8.81 (+) 2.3e-008 275 hsa-miR-154
6.5e+002-8.4e+001 0.83 7.73 (+) 8.1e-007 352 hsa-miR-543
9.9e+002-1.3e+002 0.82 7.49 (+) 3.2e-007 312 hsa-miR-337-5p
6.2e+002-8.8e+001 0.86 7.10 (+) 3.0e-008 355 hsa-miR-654-3p
3.5e+002-5.0e+001 0.91 7.05 (+) 3.2e-013 367 MID-00465
6.0e+002-1.0e+002 0.84 5.76 (+) 5.9e-007 269 hsa-miR-134
3.2e+003-8.5e+002 0.91 3.84 (+) 1.2e-011 64 hsa-miR-652
3.2e+002-1.1e+002 0.83 2.84 (+) 2.3e-005 308 hsa-miR-328
2.6e+003-9.4e+002 0.74 2.78 (+) 1.1e-003 175 hsa-miR-183
2.8e+003-1.0e+003 0.87 2.73 (+) 3.0e-006 190 hsa-miR-29b
3.9e+003-1.6e+003 0.88 2.49 (+) 6.9e-010 54 hsa-miR-361-5p
4.1e+002-1.7e+002 0.67 2.44 (+) 2.1e-002 302 hsa-miR-301a
4.0e+003-1.7e+003 0.79 2.41 (+) 5.9e-004 152 hsa-miR-182
4.0e+002-1.7e+002 0.88 2.39 (+) 4.7e-007 301 hsa-miR-29b-2*
8.7e+002-3.7e+002 0.77 2.36 (+) 6.8e-005 266 hsa-miR-132
7.7e+003-3.3e+003 0.82 2.34 (+) 5.4e-006 47 hsa-miR-30d
3.7e+002-1.6e+002 0.70 2.32 (+) 5.8e-003 313 hsa-miR-338-3p
3.3e+002-1.5e+002 0.66 2.16 (+) 1.3e-002 359 hsa-miR-708
5.5e+003-2.5e+003 0.68 2.16 (+) 4.2e-002 65 hsa-miR-7
2.1e+003-9.9e+002 0.78 2.13 (+) 7.0e-005 307 hsa-miR-324-5p
1.2e+003-5.9e+002 0.81 2.02 (+) 1.6e-004 191 hsa-miR-29c
3.5e+002-1.9e+003 0.88 5.36 (-) 1.0e-007 242 MID-16489
6.5e+002-1.9e+003 0.76 2.96 (-) 4.9e-004 4 hsa-miR-10a
1.3e+003-3.6e+003 0.84 2.79 (-) 1.9e-006 147 hsa-miR-221
2.2e+003-5.9e+003 0.81 2.75 (-) 8.5e-006 40 hsa-miR-222
2.6e+002-6.8e+002 0.76 2.56 (-) 7.4e-004 372 MID-16469
3.5e+002-8.9e+002 0.71 2.56 (-) 4.7e-003 168 hsa-miR-150
1.5e+002-3.7e+002 0.83 2.55 (-) 4.8e-005 16 hsa-miR-146a
1.9e+003-4.7e+003 0.82 2.40 (-) 1.7e-005 182 hsa-miR-199a-5p
1.3e+003-3.0e+003 0.79 2.35 (-) 1.2e-004 167 hsa-miR-149*
2.1e+002-4.8e+002 0.84 2.26 (-) 3.1e-005 356 hsa-miR-658
1.2e+003-2.8e+003 0.74 2.25 (-) 1.4e-003 148 hsa-miR-93
1.4e+003-3.1e+003 0.70 2.23 (-) 1.9e-002 382 MID-22331
8.0e+002-1.8e+003 0.83 2.21 (-) 2.9e-005 37 hsa-miR-214
4.4e+002-8.9e+002 0.79 2.01 (-) 2.1e-004 364 MID-00064
2.1e+002-4.2e+002 0.78 2.01 (-) 1.1e-003 35 hsa-miR-21*
(+) the higher expression of this miR is in lung carcinoid tumors
(-) the higher expression of this miR is in GI neuroendocrine tumors
[0357] hsa-miR-652 (SEQ ID NO: 64), hsa-miR-34c-5p (SEQ ID NO: 53)
and hsa-miR-214 (SEQ ID NO: 37) are used at node #28 of the
binary-tree-classifier detailed in the invention to distinguish
between lung carcinoid tumors and GI neuroendocrine tumors.
TABLE-US-00030 TABLE 30 miR expression (in fluorescence units)
distinguishing between pancreatic islet cell tumors and GI
neuroendocrine carcinoid tumors selected from the group consisting
of small intestine and duodenum; appendix, stomach and pancreas
miR name SEQ ID NO. p-value fold-change auROC median values
hsa-miR-129* 263 2.8e-004 20.91 (+) 0.80 2.3e+003 1.1e+002
hsa-miR-217 288 6.6e-003 9.61 (+) 0.72 4.8e+002 5.0e+001
hsa-miR-148a 18 6.8e-006 8.54 (+) 0.90 1.6e+003 1.9e+002
hsa-miR-216a 286 2.7e-002 8.34 (+) 0.68 4.3e+002 5.2e+001
hsa-miR-129-3p 162 4.4e-003 7.22 (+) 0.74 1.8e+003 2.5e+002
hsa-miR-551b 225 2.3e-003 6.65 (+) 0.74 6.6e+002 9.9e+001
hsa-miR-216b 287 5.4e-003 6.04 (+) 0.75 3.0e+002 5.0e+001
hsa-miR-455-3p 206 7.3e-007 3.75 (+) 0.92 7.1e+002 1.9e+002
hsa-miR-451 205 2.5e-003 3.65 (+) 0.79 1.3e+004 3.4e+003
hsa-miR-26b 296 2.8e-004 3.43 (+) 0.83 8.9e+002 2.6e+002
hsa-let-7f 258 3.6e-004 3.29 (+) 0.91 8.7e+003 2.6e+003
hsa-miR-338-3p 313 3.2e-003 3.25 (+) 0.78 5.2e+002 1.6e+002
MID-17866 377 5.0e-005 2.71 (+) 0.85 6.6e+003 2.4e+003
MID-16582 373 1.2e-005 2.45 (+) 0.88 1.9e+004 7.6e+003
hsa-let-7a 256 1.0e-003 2.42 (+) 0.80 6.4e+004 2.7e+004
hsa-let-7d 153 1.8e-004 2.36 (+) 0.89 1.1e+004 4.7e+003
hsa-let-7g 259 2.2e-003 2.28 (+) 0.86 6.4e+003 2.8e+003
hsa-miR-130b 265 1.0e-002 2.11 (+) 0.69 4.8e+002 2.3e+002
hsa-miR-30b 303 7.5e-003 2.09 (+) 0.75 6.4e+003 3.1e+003
hsa-miR-133b 268 5.4e-004 9.40 (-) 0.81 1.0e+002 9.7e+002
hsa-miR-133a 267 5.4e-004 9.22 (-) 0.80 1.1e+002 1.0e+003
hsa-miR-143* 165 2.1e-006 8.37 (-) 0.93 2.3e+002 1.9e+003
hsa-miR-145 15 3.7e-008 8.18 (-) 0.94 1.1e+004 9.2e+004
hsa-miR-145* 272 7.0e-006 8.05 (-) 0.91 6.6e+001 5.3e+002
hsa-miR-143 14 3.7e-009 7.30 (-) 0.96 5.2e+003 3.8e+004
hsa-miR-378 202 6.2e-006 6.35 (-) 0.88 3.1e+002 2.0e+003
MID-00689 236 8.1e-006 4.99 (-) 0.88 2.9e+002 1.4e+003
hsa-miR-422a 203 7.9e-006 4.74 (-) 0.88 1.4e+002 6.4e+002
hsa-miR-10a 4 2.4e-004 3.91 (-) 0.82 9.2e+002 3.6e+003
hsa-miR-150 168 4.4e-003 3.78 (-) 0.78 3.0e+002 1.1e+003
hsa-miR-330-3p 310 3.4e-004 3.23 (-) 0.81 1.1e+002 3.6e+002
hsa-miR-28-3p 298 4.6e-007 3.16 (-) 0.95 2.1e+002 6.7e+002
hsa-miR-194 27 1.4e-002 3.09 (-) 0.74 7.2e+002 2.2e+003
hsa-miR-200b 29 7.6e-005 2.72 (-) 0.91 7.5e+003 2.1e+004
hsa-miR-21 34 2.8e-006 2.57 (-) 0.87 8.5e+003 2.2e+004
hsa-miR-886-3p 228 8.0e-003 2.56 (-) 0.74 6.7e+002 1.7e+003
hsa-miR-100 3 3.6e-003 2.50 (-) 0.77 1.5e+003 3.8e+003
hsa-miR-532-5p 349 4.3e-007 2.14 (-) 0.94 2.5e+002 5.3e+002
hsa-miR-21* 35 8.5e-004 2.06 (-) 0.82 2.4e+002 5.0e+002
hsa-miR-193a-5p 26 5.7e-003 2.01 (-) 0.77 1.6e+002 3.3e+002
(+) the higher expression of this miR is in pancreatic islet cell tumors
(-) the higher expression of this miR is in GI neuroendocrine carcinoid
tumors
[0358] hsa-miR-21 (SEQ ID NO: 34) and hsa-miR-148a (SEQ ID NO: 18)
are used at node #29 of the binary-tree-classifier detailed in the
invention to distinguish between pancreatic islet cell tumors and
GI neuroendocrine carcinoid tumors.
TABLE-US-00031 TABLE 31 miR expression (in fluorescence units)
distinguishing between gastric or esophageal adenocarcinoma and
other adenocarcinoma tumors of the gastrointestinal system selected
from the group consisting of cholangiocarcinoma or adenocarcinoma
of extrahepatic biliary tract, pancreatic adenocarcinoma and
colorectal adenocarcinoma
miR name SEQ ID NO. p-value fold-change auROC median values
hsa-miR-133a 267 4.6e-008 9.14 (+) 0.74 6.2e+002 6.7e+001
hsa-miR-133b 268 3.9e-008 8.73 (+) 0.74 5.5e+002 6.3e+001
hsa-miR-143* 165 3.9e-007 4.26 (+) 0.75 2.5e+003 5.9e+002
hsa-miR-145 15 4.5e-004 2.82 (+) 0.71 7.9e+004 2.8e+004
hsa-miR-143 14 1.3e-003 2.55 (+) 0.68 3.2e+004 1.3e+004
hsa-miR-658 356 8.2e-004 2.53 (+) 0.71 1.3e+003 5.1e+002
hsa-miR-149* 167 2.2e-004 2.33 (+) 0.72 7.2e+003 3.1e+003
MID-17576 376 7.2e-004 2.22 (+) 0.69 3.1e+003 1.4e+003
MID-16469 372 3.0e-004 2.20 (+) 0.71 1.4e+003 6.5e+002
hsa-miR-145* 272 3.0e-004 2.14 (+) 0.69 3.2e+002 1.5e+002
MID-15986 370 3.8e-004 2.11 (+) 0.74 2.9e+003 1.4e+003
hsa-miR-224 42 5.4e-008 6.57 (-) 0.83 5.5e+001 3.6e+002
hsa-miR-223 41 1.1e-004 2.61 (-) 0.73 1.5e+002 4.0e+002
hsa-miR-1201 146 1.2e-002 1.28 (-) 0.67 9.0e+002 1.2e+003
(+) the higher expression of this miR is in gastric or esophageal
adenocarcinoma
(-) the higher expression of this miR is in other adenocarcinoma tumors
of the gastrointestinal system
[0359] FIG. 20 demonstrates binary decisions at node #30 of the
decision-tree. Tumors originating in gastric or esophageal
adenocarcinoma (diamonds) are easily distinguished from tumors of
other GI adenocarcinoma origin (squares) using the expression
levels of hsa-miR-1201 (SEQ ID NO: 146, y-axis), hsa-miR-224 (SEQ
ID NO: 42, x-axis) and hsa-miR-210 (SEQ ID NO: 36, z-axis).
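The fold-change and sign columns of Table 31 can be read against the two median values in each row. The following is a minimal sketch, assuming that the fold-change is the ratio of the larger group median to the smaller one and that the first median column belongs to the (+) class; neither assumption is stated in the source, and the function name `fold_change` is hypothetical. Because the tabulated medians are rounded to two significant figures, the recomputed ratio only approximates the reported value (for example, 9.14 for hsa-miR-133a).

```python
def fold_change(median_first, median_second):
    """Return (ratio, sign) for two group medians in fluorescence units.

    Assumption (not stated in the source): the ratio is larger/smaller,
    and '+' means the first group has the higher median.
    """
    if median_first <= 0 or median_second <= 0:
        raise ValueError("medians must be positive")
    if median_first >= median_second:
        return median_first / median_second, "+"
    return median_second / median_first, "-"


# Rounded medians copied from Table 31.
ratio, sign = fold_change(6.2e2, 6.7e1)   # hsa-miR-133a
print(f"hsa-miR-133a: {ratio:.2f} ({sign})")
ratio, sign = fold_change(5.5e1, 3.6e2)   # hsa-miR-224
print(f"hsa-miR-224: {ratio:.2f} ({sign})")
```

The small gap between the recomputed and tabulated ratios is expected under this reading, since only rounded medians are available in the table.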
TABLE-US-00032 TABLE 32 miR expression (in fluorescence units)
distinguishing between colorectal adenocarcinoma and
cholangiocarcinoma or adenocarcinoma of biliary tract or pancreas
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-224  42  4.0e-003  2.55 (+)  0.69  5.4e+002  2.1e+002
hsa-miR-203  184  1.2e-003  2.28 (+)  0.70  4.2e+002  1.8e+002
hsa-miR-92a  67  5.1e-007  1.91 (+)  0.77  6.2e+003  3.2e+003
hsa-miR-106a  158  4.6e-007  1.81 (+)  0.81  5.6e+003  3.1e+003
hsa-miR-17  20  1.3e-007  1.81 (+)  0.81  3.2e+003  1.8e+003
hsa-miR-20a  186  7.9e-005  1.80 (+)  0.76  3.2e+003  1.8e+003
hsa-miR-19b  284  1.4e-005  1.75 (+)  0.76  1.9e+003  1.1e+003
MID-17356  389  3.0e-003  1.67 (+)  0.70  2.6e+003  1.6e+003
hsa-miR-422a  203  2.1e-005  1.63 (+)  0.75  5.1e+002  3.1e+002
MID-15965  240  5.6e-003  1.60 (+)  0.67  7.2e+003  4.5e+003
MID-00689  236  1.7e-005  1.59 (+)  0.76  1.1e+003  6.9e+002
hsa-miR-1201  146  2.5e-003  1.53 (+)  0.68  1.6e+003  1.1e+003
hsa-miR-425  204  5.2e-004  1.49 (+)  0.69  1.4e+003  9.1e+002
hsa-miR-29a  43  1.2e-005  1.44 (+)  0.77  9.3e+003  6.5e+003
hsa-miR-18a  176  7.3e-006  1.44 (+)  0.75  6.4e+002  4.5e+002
hsa-miR-378  202  1.4e-004  1.41 (+)  0.72  1.3e+003  9.1e+002
hsa-miR-31  49  2.0e-003  3.39 (-)  0.69  5.3e+002  1.8e+003
hsa-miR-30a  46  2.2e-008  2.39 (-)  0.82  8.2e+002  2.0e+003
hsa-miR-214*  38  1.3e-002  1.47 (-)  0.66  2.5e+002  3.7e+002
hsa-miR-99b  363  2.2e-003  1.41 (-)  0.73  9.0e+002  1.3e+003
(+) the higher expression of this miR is in colorectal adenocarcinoma
(-) the higher expression of this miR is in other cholangiocarcinoma or adenocarcinoma tumors of biliary tract or pancreas
[0360] FIG. 21 demonstrates binary decisions at node #31 of the
decision-tree. Tumors originating in colorectal adenocarcinoma
(diamonds) are easily distinguished from tumors of
cholangiocarcinoma or adenocarcinoma of biliary tract or pancreas
origin (squares) using the expression levels of hsa-miR-30a (SEQ ID
NO: 46, y-axis), hsa-miR-17 (SEQ ID NO: 20, x-axis) and hsa-miR-29a
(SEQ ID NO: 43, z-axis).
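The auROC column in Tables 31 and 32 summarizes how well a single miR separates the two groups. The source does not say how it was computed; the sketch below assumes the usual rank-based definition (the probability that a randomly chosen sample from one group has a higher value than one from the other, with ties counted as one half). The function names and the toy expression values are hypothetical. Because the tables report values above 0.5 for both (+) and (-) rows, a direction-independent variant is included as well.

```python
def auroc(group_a, group_b):
    """Probability that a value from group_a exceeds one from group_b."""
    if not group_a or not group_b:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5   # ties count as one half
    return wins / (len(group_a) * len(group_b))


def separation_auroc(group_a, group_b):
    """Direction-independent variant: never below 0.5."""
    value = auroc(group_a, group_b)
    return max(value, 1.0 - value)


# Hypothetical fluorescence values for one miR in two tumor groups.
colorectal = [3400.0, 2900.0, 3100.0, 1700.0]
biliary_or_pancreas = [1800.0, 1500.0, 2000.0, 3000.0]
print(round(auroc(colorectal, biliary_or_pancreas), 3))
```

The pairwise loop is quadratic in the number of samples, which is acceptable for group sizes typical of such tables; a rank-sum formulation would give the same value.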
TABLE-US-00033 TABLE 33 miR expression (in fluorescence units)
distinguishing between cholangiocarcinoma or adenocarcinoma of
extrahepatic biliary tract and pancreatic adenocarcinoma
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-31  49  1.5e-003  3.06 (+)  0.81  3.4e+003  1.1e+003
hsa-miR-138  11  1.1e-002  2.36 (+)  0.71  3.3e+002  1.4e+002
hsa-miR-141  13  1.7e-002  1.77 (+)  0.70  3.0e+003  1.7e+003
MID-16582  373  1.5e-002  1.65 (+)  0.70  1.8e+004  1.1e+004
hsa-miR-181b  154  9.6e-002  1.63 (+)  0.69  1.4e+003  8.4e+002
hsa-miR-10b  5  5.1e-001  1.62 (+)  0.69  7.0e+002  4.3e+002
hsa-miR-200c  30  7.4e-002  1.61 (+)  0.68  1.5e+004  9.3e+003
hsa-miR-29c*  45  1.3e-002  1.58 (+)  0.72  4.2e+002  2.7e+002
hsa-miR-193b  178  1.1e-001  1.47 (+)  0.66  1.5e+003  1.0e+003
hsa-miR-221  147  1.2e-002  1.36 (+)  0.75  9.0e+003  6.6e+003
hsa-miR-151-3p  274  4.0e-002  1.36 (+)  0.70  6.4e+002  4.7e+002
hsa-miR-146a  16  2.4e-002  1.34 (+)  0.66  7.3e+002  5.4e+002
hsa-miR-222  40  3.7e-002  1.32 (+)  0.71  1.5e+004  1.1e+004
hsa-miR-181a  21  8.4e-002  1.30 (+)  0.71  4.9e+003  3.8e+003
hsa-miR-29a  43  6.3e-002  1.14 (+)  0.66  6.8e+003  6.0e+003
MID-23256  253  2.1e-002  1.81 (-)  0.74  3.3e+002  5.9e+002
MID-18336  245  8.0e-002  1.70 (-)  0.66  1.1e+003  1.9e+003
hsa-let-7a  256  7.4e-003  1.68 (-)  0.73  2.7e+004  4.5e+004
hsa-miR-140-3p  12  9.2e-002  1.51 (-)  0.65  1.8e+003  2.7e+003
MID-16748  374  5.4e-003  1.47 (-)  0.75  9.3e+004  1.4e+005
MID-18395  379  2.9e-002  1.45 (-)  0.66  6.1e+004  8.9e+004
hsa-miR-1973  180  6.6e-002  1.41 (-)  0.69  3.3e+002  4.7e+002
hsa-let-7d  153  2.6e-002  1.40 (-)  0.68  4.3e+003  6.0e+003
hsa-miR-345  51  7.1e-002  1.39 (-)  0.75  3.2e+002  4.4e+002
hsa-miR-34a  52  3.9e-002  1.38 (-)  0.70  4.4e+003  6.1e+003
hsa-let-7c  1  1.4e-002  1.37 (-)  0.73  2.4e+004  3.3e+004
hsa-miR-26a  295  2.8e-002  1.36 (-)  0.68  1.5e+004  2.0e+004
hsa-let-7b  257  3.3e-003  1.35 (-)  0.77  2.9e+004  3.9e+004
MID-23168  385  6.5e-002  1.26 (-)  0.66  4.8e+003  6.1e+003
hsa-miR-23b  293  9.0e-002  1.21 (-)  0.67  1.0e+004  1.3e+004
hsa-miR-24  294  2.6e-002  1.18 (-)  0.68  2.1e+004  2.4e+004
(+) the higher expression of this miR is in pancreatic adenocarcinoma
(-) the higher expression of this miR is in cholangiocarcinoma or adenocarcinoma of extrahepatic biliary tract
[0361] hsa-miR-345 (SEQ ID NO: 51), hsa-miR-31 (SEQ ID NO: 49) and
hsa-miR-146a (SEQ ID NO: 16) are used at node #32 of the
binary-tree-classifier detailed in the invention to distinguish
between cholangiocarcinoma or adenocarcinoma of extrahepatic biliary
tract and pancreatic adenocarcinoma.
TABLE-US-00034 TABLE 34 miR expression (in fluorescence units)
distinguishing between kidney tumors selected from the group
consisting of chromophobe renal cell carcinoma, clear cell renal
cell carcinoma and papillary renal cell carcinoma and other tumors
selected from the group consisting of sarcoma, adrenal
(pheochromocytoma, adrenocortical carcinoma) and mesothelioma
(pleural mesothelioma)
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-200b  29  7.6e-042  96.12 (+)  0.94  4.8e+003  5.0e+001
hsa-miR-200a  28  3.3e-044  45.03 (+)  0.94  2.3e+003  5.0e+001
hsa-miR-200c  30  1.1e-015  15.36 (+)  0.82  7.7e+002  5.0e+001
hsa-miR-30a  46  8.6e-041  9.73 (+)  0.96  1.1e+004  1.2e+003
hsa-miR-31  49  1.1e-008  9.21 (+)  0.74  1.1e+003  1.2e+002
hsa-miR-30a*  195  8.6e-039  8.87 (+)  0.94  1.7e+003  1.9e+002
hsa-miR-182  152  1.1e-009  6.58 (+)  0.74  5.0e+002  7.5e+001
hsa-miR-183  175  9.8e-011  5.07 (+)  0.76  2.5e+002  5.0e+001
hsa-miR-30d  47  5.0e-033  3.81 (+)  0.92  8.3e+003  2.2e+003
hsa-miR-10a  4  2.5e-016  3.52 (+)  0.83  5.1e+003  1.5e+003
MID-23751  387  6.4e-011  3.15 (+)  0.75  2.3e+002  7.3e+001
hsa-miR-30c  196  1.1e-025  2.95 (+)  0.89  9.5e+003  3.2e+003
hsa-miR-192  177  2.1e-012  2.80 (+)  0.76  4.0e+002  1.4e+002
MID-17375  375  1.7e-015  2.52 (+)  0.79  2.5e+002  9.8e+001
hsa-miR-194  27  2.1e-012  2.43 (+)  0.75  2.2e+002  9.0e+001
hsa-miR-30e*  304  6.5e-013  2.40 (+)  0.76  2.8e+002  1.2e+002
hsa-miR-222  40  1.6e-012  2.37 (+)  0.75  1.6e+004  6.7e+003
hsa-miR-29c  191  9.5e-006  2.33 (+)  0.69  5.9e+002  2.5e+002
hsa-miR-221  147  1.4e-011  2.20 (+)  0.74  9.8e+003  4.5e+003
hsa-miR-21*  35  7.6e-003  2.19 (+)  0.61  9.8e+002  4.5e+002
hsa-miR-146a  16  3.7e-007  2.15 (+)  0.71  6.1e+002  2.8e+002
hsa-miR-21  34  1.2e-003  2.07 (+)  0.64  4.9e+004  2.4e+004
hsa-miR-10b  5  5.4e-004  2.06 (+)  0.66  4.7e+003  2.3e+003
hsa-miR-127-3p  155  2.1e-018  9.53 (-)  0.85  1.2e+002  1.2e+003
hsa-miR-199a-3p  181  4.9e-023  7.57 (-)  0.89  1.6e+003  1.2e+004
hsa-miR-337-5p  312  1.2e-019  7.45 (-)  0.86  5.0e+001  3.7e+002
hsa-miR-199b-5p  183  1.7e-015  7.21 (-)  0.82  1.4e+002  1.0e+003
hsa-miR-199a-5p  182  1.4e-017  6.48 (-)  0.86  2.6e+003  1.7e+004
hsa-miR-376c  320  3.5e-019  5.73 (-)  0.86  5.0e+001  2.9e+002
hsa-miR-487b  59  2.6e-016  5.23 (-)  0.86  6.5e+001  3.4e+002
hsa-miR-214*  38  9.7e-016  5.18 (-)  0.82  9.4e+001  4.9e+002
hsa-miR-382  324  2.2e-017  4.83 (-)  0.86  5.0e+001  2.4e+002
hsa-miR-381  323  6.6e-017  4.27 (-)  0.83  5.0e+001  2.1e+002
hsa-miR-214  37  2.7e-013  4.22 (-)  0.81  1.2e+003  5.0e+003
hsa-miR-379  322  8.5e-018  4.21 (-)  0.86  5.0e+001  2.1e+002
hsa-miR-409-3p  325  1.2e-015  4.14 (-)  0.83  5.0e+001  2.1e+002
hsa-miR-149  19  2.8e-016  3.76 (-)  0.86  6.7e+001  2.5e+002
hsa-miR-224  42  2.8e-007  3.51 (-)  0.71  7.5e+001  2.6e+002
hsa-miR-483-5p  334  1.3e-011  3.25 (-)  0.79  9.9e+001  3.2e+002
hsa-miR-130b  265  9.5e-012  2.08 (-)  0.79  1.5e+002  3.0e+002
hsa-miR-181a*  22  4.8e-009  2.00 (-)  0.76  1.0e+002  2.0e+002
(+) the higher expression of this miR is in kidney tumors
(-) the higher expression of this miR is in sarcoma, adrenal and mesothelioma tumors
[0362] FIG. 22 demonstrates binary decisions at node #33 of the
decision-tree. Tumors originating in kidney (diamonds) are easily
distinguished from tumors of adrenal, mesothelioma and sarcoma
origin (squares) using the expression levels of hsa-miR-200b (SEQ
ID NO: 29, y-axis), hsa-miR-30a (SEQ ID NO: 46, x-axis) and
hsa-miR-149 (SEQ ID NO: 19, z-axis).
TABLE-US-00035 TABLE 35 miR expression (in fluorescence units)
distinguishing between pheochromocytoma (neuroendocrine tumor of
the adrenal) and all sarcoma, adrenal carcinoma and mesothelioma
tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-7  65  6.7e-067  295.36 (+)  0.96  1.5e+004  5.0e+001
hsa-miR-375  56  5.0e-036  196.58 (+)  0.91  9.8e+003  5.0e+001
hsa-miR-138  11  3.2e-009  29.73 (+)  0.85  4.0e+003  1.3e+002
hsa-miR-129-3p  162  1.5e-021  20.53 (+)  0.94  1.0e+003  5.0e+001
hsa-miR-487b  59  3.0e-008  15.11 (+)  0.84  4.0e+003  2.7e+002
hsa-miR-432  331  7.4e-008  14.54 (+)  0.81  2.2e+003  1.5e+002
hsa-miR-539  350  9.4e-011  12.45 (+)  0.84  8.7e+002  7.0e+001
hsa-miR-127-3p  155  8.3e-005  12.36 (+)  0.80  1.2e+004  9.6e+002
hsa-miR-485-3p  335  1.2e-008  11.61 (+)  0.80  6.8e+002  5.8e+001
hsa-miR-124  159  2.7e-008  11.48 (+)  0.87  5.7e+002  5.0e+001
hsa-miR-485-5p  336  2.3e-014  10.67 (+)  0.86  5.6e+002  5.3e+001
hsa-miR-433  332  1.2e-012  10.38 (+)  0.83  5.2e+002  5.0e+001
hsa-miR-129*  263  1.1e-029  10.28 (+)  0.94  5.1e+002  5.0e+001
hsa-miR-323-3p  306  9.6e-008  9.55 (+)  0.82  4.8e+002  5.0e+001
hsa-miR-495  339  1.0e-005  9.22 (+)  0.79  1.2e+003  1.2e+002
hsa-miR-487a  337  4.4e-006  9.01 (+)  0.80  5.2e+002  5.8e+001
hsa-miR-154  275  4.8e-006  8.75 (+)  0.80  1.4e+003  1.6e+002
hsa-miR-29b-2*  301  6.0e-013  8.56 (+)  0.90  6.1e+002  7.1e+001
hsa-miR-154*  276  3.7e-005  8.54 (+)  0.78  4.5e+002  5.3e+001
hsa-miR-431*  330  4.7e-009  7.77 (+)  0.83  4.1e+002  5.3e+001
hsa-miR-369-5p  317  1.3e-006  7.56 (+)  0.81  7.4e+002  9.8e+001
hsa-miR-329  309  1.4e-007  7.56 (+)  0.80  4.8e+002  6.4e+001
hsa-miR-29c*  45  1.4e-009  7.28 (+)  0.90  1.8e+003  2.4e+002
hsa-miR-370  318  1.5e-005  7.24 (+)  0.79  1.1e+003  1.5e+002
hsa-miR-382  324  1.5e-005  6.74 (+)  0.80  1.5e+003  2.2e+002
hsa-miR-543  352  3.1e-004  6.53 (+)  0.76  8.6e+002  1.3e+002
hsa-miR-29c  191  2.6e-008  6.44 (+)  0.89  1.5e+003  2.3e+002
hsa-miR-127-5p  262  1.1e-005  6.40 (+)  0.79  6.2e+002  9.6e+001
hsa-miR-134  269  2.3e-004  6.39 (+)  0.77  9.6e+002  1.5e+002
hsa-miR-338-3p  313  2.1e-012  6.03 (+)  0.90  3.3e+002  5.4e+001
hsa-miR-149  19  4.8e-008  5.80 (+)  0.84  1.3e+003  2.2e+002
MID-00465  367  7.5e-012  5.32 (+)  0.82  2.7e+002  5.0e+001
hsa-miR-409-5p  326  5.3e-005  5.27 (+)  0.78  5.6e+002  1.1e+002
hsa-miR-409-3p  325  3.1e-004  5.26 (+)  0.76  9.6e+002  1.8e+002
hsa-miR-379  322  8.2e-004  5.25 (+)  0.76  9.2e+002  1.8e+002
hsa-miR-410  327  4.6e-008  5.05 (+)  0.79  2.5e+002  5.0e+001
hsa-miR-29b  190  3.4e-011  4.95 (+)  0.97  4.0e+003  8.1e+002
hsa-miR-1180  261  7.6e-019  4.85 (+)  0.93  3.7e+002  7.6e+001
hsa-miR-377*  321  1.0e-005  4.43 (+)  0.79  2.8e+002  6.4e+001
hsa-miR-873  360  2.9e-009  4.09 (+)  0.81  2.0e+002  5.0e+001
hsa-miR-598  353  1.2e-012  4.08 (+)  0.88  2.1e+002  5.3e+001
hsa-miR-337-5p  312  3.0e-003  4.01 (+)  0.73  1.3e+003  3.3e+002
MID-16270  371  8.8e-005  3.96 (+)  0.77  2.6e+002  6.6e+001
hsa-miR-10b  5  6.2e-005  3.51 (+)  0.85  7.2e+003  2.1e+003
hsa-miR-411  328  2.6e-002  3.40 (+)  0.68  2.3e+002  6.7e+001
hsa-miR-451  205  4.3e-004  3.17 (+)  0.77  2.2e+004  6.9e+003
hsa-miR-199b-5p  183  2.9e-008  12.33 (-)  0.90  1.1e+002  1.3e+003
hsa-miR-214*  38  2.4e-006  4.75 (-)  0.86  1.1e+002  5.5e+002
hsa-miR-199a-3p  181  2.9e-005  4.48 (-)  0.84  3.2e+003  1.5e+004
hsa-miR-214  37  6.8e-005  4.47 (-)  0.85  1.3e+003  5.9e+003
hsa-miR-222  40  5.9e-004  4.23 (-)  0.80  1.8e+003  7.8e+003
hsa-miR-199a-5p  182  9.0e-006  4.13 (-)  0.87  4.5e+003  1.8e+004
hsa-miR-221  147  6.4e-004  4.11 (-)  0.80  1.3e+003  5.2e+003
hsa-miR-146b-5p  17  4.7e-007  3.72 (-)  0.85  1.9e+002  7.0e+002
hsa-miR-224  42  2.3e-003  3.48 (-)  0.72  8.9e+001  3.1e+002
MID-22331  382  3.8e-005  3.42 (-)  0.81  8.0e+002  2.7e+003
hsa-miR-21  34  1.1e-004  3.35 (-)  0.79  7.9e+003  2.6e+004
hsa-miR-148a  18  1.3e-003  3.08 (-)  0.74  1.4e+002  4.2e+002
hsa-miR-100  3  2.6e-005  3.03 (-)  0.83  2.1e+003  6.3e+003
(+) the higher expression of this miR is in pheochromocytoma
(-) the higher expression of this miR is in sarcoma, adrenal carcinoma and mesothelioma tumors
[0363] FIG. 23 demonstrates binary decisions at node #34 of the
decision-tree. Tumors originating in pheochromocytoma (diamonds)
are easily distinguished from tumors of adrenal, mesothelioma and
sarcoma origin (squares) using the expression levels of hsa-miR-375
(SEQ ID NO: 56, y-axis) and hsa-miR-7 (SEQ ID NO: 65, x-axis).
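A decision at a single node, such as node #34 with hsa-miR-375 and hsa-miR-7, can be illustrated in code. The sketch below assumes a logistic score on log2-transformed expression; this choice, the coefficients, the intercept and the 0.5 cutoff are all hypothetical and are not taken from the source. Only the two miR names and the group median values (from Table 35) come from the document.

```python
import math

# Hypothetical parameters, chosen for illustration only.
COEFFICIENTS = {"hsa-miR-375": 1.0, "hsa-miR-7": 1.0}
INTERCEPT = -20.0


def node_probability(expression, coefficients=COEFFICIENTS, intercept=INTERCEPT):
    """Logistic score computed from log2 expression (fluorescence units)."""
    z = intercept
    for name, weight in coefficients.items():
        z += weight * math.log2(expression[name])
    return 1.0 / (1.0 + math.exp(-z))


def decide_node_34(expression, cutoff=0.5):
    """Return the branch label chosen at the node."""
    if node_probability(expression) >= cutoff:
        return "pheochromocytoma"
    return "sarcoma, adrenal carcinoma or mesothelioma"


# Median values of the two groups, copied from Table 35.
pheo_like = {"hsa-miR-375": 9.8e3, "hsa-miR-7": 1.5e4}
other_like = {"hsa-miR-375": 5.0e1, "hsa-miR-7": 5.0e1}
print(decide_node_34(pheo_like))
print(decide_node_34(other_like))
```

Working on a log scale keeps the two miRs on comparable footing even though their raw fluorescence values span more than two orders of magnitude.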
TABLE-US-00036 TABLE 36 miR expression (in fluorescence units)
distinguishing between adrenal carcinoma and mesothelioma or
sarcoma tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-509-3p  61  1.3e-040  51.10 (+)  0.98  2.6e+003  5.0e+001
hsa-miR-483-3p  333  4.9e-007  24.55 (+)  0.76  1.3e+003  5.4e+001
hsa-miR-202  31  8.1e-066  24.01 (+)  0.99  1.2e+003  5.0e+001
hsa-miR-513a-5p  347  2.6e-024  21.83 (+)  0.95  1.4e+003  6.4e+001
hsa-miR-509-3-5p  346  9.3e-030  12.08 (+)  0.96  6.0e+002  5.0e+001
hsa-miR-503  344  2.2e-016  11.82 (+)  0.92  2.2e+003  1.9e+002
hsa-miR-506  345  3.8e-033  10.25 (+)  0.98  5.1e+002  5.0e+001
MID-23751  387  1.2e-026  9.70 (+)  0.96  5.9e+002  6.1e+001
hsa-miR-483-5p  334  6.0e-005  8.66 (+)  0.71  2.7e+003  3.1e+002
hsa-miR-542-5p  351  1.1e-015  7.79 (+)  0.91  1.1e+003  1.4e+002
hsa-miR-382  324  8.5e-005  5.77 (+)  0.72  1.2e+003  2.0e+002
hsa-miR-409-5p  326  3.1e-007  5.44 (+)  0.75  5.5e+002  1.0e+002
hsa-miR-134  269  2.7e-004  5.31 (+)  0.73  7.2e+002  1.4e+002
hsa-miR-127-3p  155  8.9e-003  4.98 (+)  0.69  3.9e+003  7.9e+002
hsa-miR-376c  320  6.4e-003  4.93 (+)  0.68  1.0e+003  2.1e+002
hsa-miR-379  322  2.6e-003  4.84 (+)  0.69  7.8e+002  1.6e+002
hsa-miR-487b  59  4.9e-005  4.53 (+)  0.72  1.0e+003  2.2e+002
hsa-miR-370  318  1.3e-003  4.49 (+)  0.69  6.6e+002  1.5e+002
hsa-miR-409-3p  325  2.9e-004  4.45 (+)  0.71  7.8e+002  1.7e+002
MID-18336  245  3.9e-011  4.19 (+)  0.92  4.7e+003  1.1e+003
MID-23291  254  1.8e-007  3.79 (+)  0.84  1.1e+003  2.9e+002
hsa-miR-432  331  1.5e-004  3.71 (+)  0.70  5.0e+002  1.4e+002
hsa-miR-154  275  1.6e-003  3.60 (+)  0.69  5.3e+002  1.5e+002
hsa-miR-1973  180  7.2e-011  3.48 (+)  0.90  1.1e+003  3.1e+002
hsa-miR-654-3p  355  6.9e-002  3.22 (+)  0.63  5.4e+002  1.7e+002
MID-15986  370  8.6e-009  3.14 (+)  0.86  3.5e+003  1.1e+003
hsa-miR-381  323  4.5e-002  3.07 (+)  0.64  5.4e+002  1.8e+002
hsa-miR-337-5p  312  5.8e-002  3.07 (+)  0.63  8.9e+002  2.9e+002
hsa-miR-193b  178  1.0e-008  3.03 (+)  0.88  5.6e+003  1.8e+003
MID-20524  249  1.6e-006  3.02 (+)  0.81  4.2e+003  1.4e+003
hsa-miR-199b-5p  183  1.3e-015  18.32 (-)  0.96  9.5e+001  1.7e+003
hsa-miR-199a-3p  181  9.7e-014  10.80 (-)  0.95  1.7e+003  1.9e+004
hsa-miR-214*  38  2.6e-016  10.75 (-)  0.97  6.1e+001  6.5e+002
hsa-miR-199a-5p  182  1.9e-015  9.43 (-)  0.97  2.5e+003  2.4e+004
hsa-miR-214  37  1.2e-011  7.89 (-)  0.96  9.0e+002  7.1e+003
hsa-miR-100  3  1.8e-012  4.87 (-)  0.90  1.5e+003  7.5e+003
hsa-miR-193a-3p  25  3.6e-006  3.37 (-)  0.83  7.6e+002  2.5e+003
hsa-miR-152  169  2.5e-006  3.05 (-)  0.80  4.3e+002  1.3e+003
(+) the higher expression of this miR is in adrenal carcinoma
(-) the higher expression of this miR is in sarcoma and mesothelioma tumors
[0364] hsa-miR-202 (SEQ ID NO: 31), hsa-miR-509-3p (SEQ ID NO: 61)
and hsa-miR-214* (SEQ ID NO: 38) are used at node 35 of the
binary-tree-classifier detailed in the invention to distinguish
between adrenal carcinoma and sarcoma or mesothelioma tumors.
TABLE-US-00037 TABLE 37 miR expression (in fluorescence units)
distinguishing between GIST and mesothelioma or sarcoma tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-143*  165  4.2e-033  23.39 (+)  0.97  5.6e+003  2.4e+002
hsa-miR-143  14  1.7e-025  21.41 (+)  0.97  1.0e+005  4.9e+003
hsa-miR-145  15  4.2e-026  18.42 (+)  0.99  1.5e+005  8.1e+003
hsa-miR-483-3p  333  1.5e-010  15.77 (+)  0.87  7.9e+002  5.0e+001
hsa-miR-145*  272  1.5e-037  13.54 (+)  0.98  8.4e+002  6.2e+001
hsa-miR-139-5p  270  2.7e-024  9.58 (+)  0.99  1.6e+003  1.6e+002
hsa-miR-29c*  45  2.7e-019  9.49 (+)  0.96  1.8e+003  1.8e+002
hsa-miR-29b-2*  301  7.9e-028  9.48 (+)  0.95  5.8e+002  6.1e+001
hsa-miR-29c  191  1.9e-015  7.89 (+)  0.94  1.5e+003  1.9e+002
hsa-miR-30a  46  8.8e-014  5.12 (+)  0.96  6.3e+003  1.2e+003
hsa-miR-30a*  195  6.6e-013  3.84 (+)  0.93  7.3e+002  1.9e+002
hsa-miR-132  266  1.8e-008  3.66 (+)  0.92  8.7e+002  2.4e+002
hsa-miR-29b  190  4.2e-008  3.52 (+)  0.91  2.2e+003  6.1e+002
hsa-miR-149  19  1.5e-006  3.50 (+)  0.82  6.5e+002  1.9e+002
hsa-miR-483-5p  334  4.8e-005  3.47 (+)  0.82  9.0e+002  2.6e+002
hsa-miR-127-3p  155  2.1e-003  5.00 (-)  0.70  1.9e+002  9.3e+002
hsa-miR-193a-3p  25  2.3e-007  4.74 (-)  0.88  7.1e+002  3.4e+003
hsa-miR-221  147  5.5e-004  3.64 (-)  0.79  1.9e+003  7.0e+003
hsa-miR-222  40  1.1e-003  3.54 (-)  0.78  2.8e+003  9.8e+003
hsa-miR-21  34  1.1e-003  3.26 (-)  0.75  9.8e+003  3.2e+004
(+) the higher expression of this miR is in GIST
(-) the higher expression of this miR is in sarcoma and mesothelioma tumors
[0365] hsa-miR-29c* (SEQ ID NO: 45) and hsa-miR-143 (SEQ ID NO: 14)
are used at node 36 of the binary-tree-classifier detailed in the
invention to distinguish between GIST and sarcoma or mesothelioma
tumors.
TABLE-US-00038 TABLE 38 miR expression (in fluorescence units)
distinguishing between chromophobe renal cell carcinoma tumors and
clear cell or papillary renal cell carcinoma tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-141  13  4.7e-017  23.68 (+)  0.99  2.1e+003  8.8e+001
hsa-miR-200c  30  8.4e-012  18.81 (+)  0.99  5.7e+003  3.0e+002
hsa-miR-874  361  7.5e-019  14.85 (+)  0.99  9.8e+002  6.6e+001
hsa-miR-187  280  1.0e-014  14.80 (+)  0.97  7.4e+002  5.0e+001
hsa-miR-891a  362  4.7e-018  14.47 (+)  0.98  7.2e+002  5.0e+001
hsa-miR-221  147  5.3e-017  13.97 (+)  0.98  7.4e+004  5.3e+003
hsa-miR-222  40  1.4e-015  11.89 (+)  0.97  9.0e+004  7.6e+003
hsa-miR-222*  291  1.2e-017  9.66 (+)  0.98  5.1e+002  5.3e+001
MID-23751  387  2.7e-010  8.01 (+)  0.94  1.1e+003  1.4e+002
hsa-miR-221*  290  4.4e-015  7.32 (+)  0.97  5.4e+002  7.4e+001
hsa-miR-296-5p  299  3.2e-010  4.97 (+)  0.93  5.6e+002  1.1e+002
hsa-miR-182  152  3.1e-007  4.90 (+)  0.90  1.5e+003  3.2e+002
hsa-miR-193b  178  6.3e-003  3.89 (+)  0.73  3.3e+003  8.4e+002
hsa-miR-30b  303  2.2e-007  3.26 (+)  0.92  1.7e+004  5.4e+003
MID-16489  242  8.1e-003  3.20 (+)  0.74  3.5e+003  1.1e+003
hsa-miR-31  49  4.3e-006  18.53 (-)  0.85  3.3e+002  6.2e+003
hsa-miR-138  11  2.1e-007  12.13 (-)  0.90  5.0e+001  6.1e+002
hsa-miR-21*  35  6.7e-014  8.38 (-)  0.98  2.2e+002  1.8e+003
hsa-miR-21  34  3.1e-013  7.54 (-)  0.95  1.0e+004  7.9e+004
hsa-miR-210  36  3.1e-009  6.79 (-)  0.92  5.4e+002  3.7e+003
hsa-miR-455-3p  206  3.1e-013  5.71 (-)  0.97  1.7e+002  9.8e+002
hsa-miR-146a  16  4.7e-008  4.07 (-)  0.91  2.5e+002  1.0e+003
hsa-miR-155  170  7.1e-007  3.64 (-)  0.89  1.7e+002  6.0e+002
hsa-miR-192  177  1.6e-003  3.48 (-)  0.78  2.1e+002  7.5e+002
hsa-miR-146b-5p  17  6.6e-006  3.39 (-)  0.86  2.5e+002  8.6e+002
(+) the higher expression of this miR is in chromophobe renal cell carcinoma tumors
(-) the higher expression of this miR is in clear cell or papillary renal cell carcinoma tumors
[0366] hsa-miR-210 (SEQ ID NO: 36) and hsa-miR-221 (SEQ ID NO: 147)
are used at node #37 of the binary-tree-classifier detailed in the
invention to distinguish between chromophobe renal cell carcinoma
tumors and clear cell or papillary renal cell carcinoma tumors.
TABLE-US-00039 TABLE 39 miR expression (in fluorescence units)
distinguishing between clear cell and papillary renal cell
carcinoma tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-503  344  2.3e-005  4.81 (+)  0.89  5.7e+002  1.2e+002
MID-22331  382  5.8e-003  3.65 (+)  0.81  5.9e+003  1.6e+003
hsa-miR-126  9  1.1e-005  3.54 (+)  0.94  6.4e+003  1.8e+003
hsa-miR-494  338  3.0e-003  3.45 (+)  0.82  5.7e+003  1.7e+003
hsa-miR-200b  29  3.1e-004  8.35 (-)  0.87  1.3e+003  1.1e+004
hsa-miR-31  49  3.0e-002  6.61 (-)  0.81  1.3e+003  8.7e+003
hsa-miR-200a  28  5.0e-005  5.30 (-)  0.92  9.5e+002  5.1e+003
hsa-miR-30a*  195  1.1e-009  4.10 (-)  1.00  5.1e+002  2.1e+003
hsa-miR-30a  46  4.5e-010  3.70 (-)  1.00  5.0e+003  1.9e+004
hsa-miR-10a  4  6.9e-004  3.39 (-)  0.86  1.6e+003  5.3e+003
hsa-miR-138  11  2.0e-002  3.23 (-)  0.76  2.3e+002  7.6e+002
MID-23291  254  7.4e-003  3.17 (-)  0.79  2.0e+002  6.4e+002
(+) the higher expression of this miR is in renal clear cell carcinoma tumors
(-) the higher expression of this miR is in papillary renal cell carcinoma tumors
[0367] hsa-miR-31 (SEQ ID NO: 49) and hsa-miR-126 (SEQ ID NO: 9)
are used at node 38 of the binary-tree-classifier detailed in the
invention to distinguish between clear cell and papillary renal
cell carcinoma tumors.
TABLE-US-00040 TABLE 40 miR expression (in fluorescence units)
distinguishing between pleural mesothelioma and sarcoma tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-31  49  1.7e-006  13.97 (+)  0.78  1.7e+003  1.2e+002
hsa-miR-21*  35  2.0e-011  5.01 (+)  0.89  2.1e+003  4.3e+002
hsa-miR-146b-5p  17  2.1e-008  2.75 (+)  0.84  1.6e+003  5.9e+002
hsa-miR-21  34  4.8e-010  2.71 (+)  0.89  6.9e+004  2.5e+004
hsa-miR-193a-3p  25  2.3e-005  2.57 (+)  0.77  6.1e+003  2.4e+003
hsa-miR-210  36  5.9e-005  2.49 (+)  0.75  3.1e+003  1.2e+003
hsa-miR-150  168  9.7e-004  2.33 (+)  0.70  1.1e+003  4.6e+002
hsa-miR-155  170  1.4e-004  2.33 (+)  0.75  6.8e+002  2.9e+002
hsa-miR-193a-5p  26  1.4e-005  2.25 (+)  0.76  7.3e+002  3.2e+002
hsa-miR-10a  4  1.7e-004  2.13 (+)  0.76  2.4e+003  1.1e+003
hsa-miR-29b  190  1.1e-003  2.03 (+)  0.70  1.0e+003  5.1e+002
hsa-miR-30a  46  1.5e-005  1.99 (+)  0.77  1.8e+003  9.0e+002
hsa-miR-130a  10  8.9e-003  1.90 (+)  0.71  4.9e+003  2.6e+003
MID-15965  240  1.6e-003  1.88 (+)  0.69  5.2e+003  2.8e+003
hsa-miR-29a  43  1.5e-002  1.71 (+)  0.67  7.3e+003  4.3e+003
hsa-miR-22  39  1.1e-004  1.65 (+)  0.72  8.2e+003  5.0e+003
MID-23168  385  2.9e-002  1.57 (+)  0.64  5.9e+003  3.8e+003
hsa-miR-574-5p  63  1.4e-002  1.55 (+)  0.66  1.5e+003  9.9e+002
hsa-miR-378  202  2.8e-002  1.53 (+)  0.66  1.2e+003  7.5e+002
hsa-miR-199a-3p  181  3.3e-007  3.62 (-)  0.84  7.4e+003  2.7e+004
hsa-miR-214  37  3.2e-006  3.45 (-)  0.81  3.1e+003  1.1e+004
hsa-miR-10b  5  6.7e-008  3.36 (-)  0.85  8.3e+002  2.8e+003
hsa-miR-199b-5p  183  2.0e-004  3.19 (-)  0.74  9.1e+002  2.9e+003
hsa-miR-199a-5p  182  6.0e-005  2.84 (-)  0.80  1.1e+004  3.0e+004
hsa-miR-214*  38  5.7e-005  2.22 (-)  0.78  3.8e+002  8.4e+002
hsa-miR-455-3p  206  2.8e-004  1.69 (-)  0.75  7.0e+002  1.2e+003
hsa-miR-26b  296  2.9e-003  1.61 (-)  0.75  4.4e+002  7.0e+002
hsa-let-7c  1  4.3e-003  1.58 (-)  0.71  3.2e+004  5.0e+004
(+) the higher expression of this miR is in pleural mesothelioma tumors
(-) the higher expression of this miR is in sarcoma tumors
[0368] hsa-miR-21* (SEQ ID NO: 35), hsa-miR-130a (SEQ ID NO: 10) and
hsa-miR-10b (SEQ ID NO: 5) are used at node 39 of the
binary-tree-classifier detailed in the invention to distinguish
between pleural mesothelioma tumors and sarcoma tumors.
TABLE-US-00041 TABLE 41 miR expression (in fluorescence units)
distinguishing between synovial sarcoma and other sarcoma tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-182  152  2.9e-009  25.03 (+)  0.89  1.3e+003  5.1e+001
hsa-miR-200b  29  9.2e-009  21.59 (+)  0.92  1.1e+003  5.0e+001
hsa-miR-124  159  5.9e-007  19.35 (+)  0.88  9.7e+002  5.0e+001
hsa-miR-200a  28  1.5e-008  12.81 (+)  0.92  6.4e+002  5.0e+001
hsa-miR-495  339  8.2e-005  7.00 (+)  0.88  8.9e+002  1.3e+002
hsa-miR-154  275  1.6e-005  6.94 (+)  0.89  1.1e+003  1.6e+002
hsa-miR-543  352  2.8e-004  6.53 (+)  0.87  7.3e+002  1.1e+002
hsa-miR-149  19  6.1e-006  6.51 (+)  0.91  8.8e+002  1.4e+002
hsa-miR-376c  320  9.6e-005  6.22 (+)  0.86  2.0e+003  3.3e+002
hsa-miR-127-3p  155  1.5e-003  6.05 (+)  0.84  6.2e+003  1.0e+003
hsa-miR-127-5p  262  1.7e-004  5.40 (+)  0.84  5.2e+002  9.7e+001
hsa-miR-214*  38  2.5e-004  5.33 (+)  0.86  4.2e+003  7.8e+002
hsa-miR-214  37  7.3e-003  5.29 (+)  0.84  5.0e+004  9.5e+003
hsa-miR-199a-5p  182  4.7e-003  4.90 (+)  0.87  1.4e+005  2.9e+004
hsa-miR-432  331  1.9e-003  4.56 (+)  0.81  6.4e+002  1.4e+002
hsa-miR-369-5p  317  2.7e-004  4.43 (+)  0.84  5.0e+002  1.1e+002
hsa-miR-381  323  9.6e-003  4.19 (+)  0.78  8.9e+002  2.1e+002
hsa-miR-654-3p  355  2.7e-003  3.96 (+)  0.81  7.7e+002  1.9e+002
hsa-miR-100  3  9.6e-006  3.79 (+)  0.91  2.1e+004  5.6e+003
hsa-miR-196a  282  2.8e-004  3.76 (+)  0.86  6.5e+002  1.7e+002
hsa-miR-337-5p  312  6.0e-003  3.55 (+)  0.80  1.5e+003  4.1e+002
hsa-miR-199a-3p  181  7.4e-003  3.52 (+)  0.86  9.1e+004  2.6e+004
hsa-miR-134  269  6.5e-003  3.41 (+)  0.80  5.5e+002  1.6e+002
hsa-miR-370  318  8.3e-003  3.32 (+)  0.79  6.4e+002  1.9e+002
hsa-miR-487b  59  7.6e-003  3.08 (+)  0.78  1.0e+003  3.2e+002
hsa-miR-132  266  7.1e-007  3.02 (+)  0.87  6.4e+002  2.1e+002
hsa-miR-379  322  1.1e-002  2.92 (+)  0.78  6.3e+002  2.2e+002
hsa-miR-125b  8  1.6e-004  2.83 (+)  0.92  1.2e+005  4.4e+004
hsa-miR-382  324  8.4e-003  2.58 (+)  0.78  6.1e+002  2.4e+002
hsa-miR-130a  10  1.8e-003  2.45 (+)  0.80  6.0e+003  2.4e+003
hsa-miR-222  40  2.1e-010  10.95 (-)  0.96  8.4e+002  9.2e+003
hsa-miR-221  147  7.9e-010  10.19 (-)  0.96  6.7e+002  6.8e+003
hsa-miR-152  169  2.1e-005  4.92 (-)  0.88  3.6e+002  1.8e+003
hsa-miR-451  205  2.8e-002  4.72 (-)  0.72  1.7e+003  7.9e+003
hsa-miR-21  34  4.4e-005  4.59 (-)  0.84  5.9e+003  2.7e+004
hsa-miR-150  168  9.2e-004  4.32 (-)  0.83  1.4e+002  5.9e+002
hsa-miR-143  14  2.2e-007  4.15 (-)  0.92  1.3e+003  5.5e+003
hsa-miR-145  15  2.7e-007  3.30 (-)  0.93  2.9e+003  9.5e+003
hsa-miR-140-3p  12  1.6e-003  3.30 (-)  0.91  1.0e+003  3.5e+003
hsa-miR-30a  46  6.1e-003  2.92 (-)  0.78  3.5e+002  1.0e+003
MID-23794  255  4.1e-004  2.86 (-)  0.82  3.6e+002  1.0e+003
hsa-miR-22  39  1.9e-005  2.82 (-)  0.88  2.0e+003  5.8e+003
hsa-miR-185  23  3.9e-003  2.74 (-)  0.77  4.0e+002  1.1e+003
hsa-miR-29b  190  6.8e-003  2.45 (-)  0.80  2.4e+002  5.8e+002
MID-00689  236  4.9e-003  2.27 (-)  0.79  3.1e+002  7.1e+002
MID-23178  386  2.2e-004  2.10 (-)  0.85  3.2e+004  6.7e+004
MID-18395  379  2.9e-003  2.08 (-)  0.80  3.7e+004  7.7e+004
hsa-miR-378  202  7.4e-003  2.07 (-)  0.78  3.8e+002  7.8e+002
(+) the higher expression of this miR is in synovial sarcoma tumors
(-) the higher expression of this miR is in other sarcoma tumors
[0369] hsa-miR-100 (SEQ ID NO: 3), hsa-miR-145 (SEQ ID NO: 15) and
hsa-miR-222 (SEQ ID NO: 40) are used at node 40 of the
binary-tree-classifier detailed in the invention to distinguish
between synovial sarcoma tumors and other sarcoma tumors.
TABLE-US-00042 TABLE 42 miR expression (in fluorescence units)
distinguishing between chondrosarcoma and other non synovial
sarcoma tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-140-3p  12  2.1e-022  75.69 (+)  1.00  2.2e+005  2.9e+003
hsa-miR-140-5p  271  8.5e-015  35.23 (+)  0.91  5.1e+003  1.5e+002
hsa-miR-455-3p  206  6.1e-015  14.49 (+)  0.98  1.6e+004  1.1e+003
hsa-miR-483-3p  333  3.1e-003  11.03 (+)  0.71  5.5e+002  5.0e+001
hsa-miR-138  11  1.2e-006  11.01 (+)  0.88  1.1e+003  9.5e+001
hsa-miR-455-5p  58  6.3e-012  8.87 (+)  0.87  8.2e+002  9.2e+001
hsa-miR-210  36  1.5e-006  4.37 (+)  0.91  4.7e+003  1.1e+003
hsa-miR-148a  18  3.1e-004  3.98 (+)  0.83  1.4e+003  3.6e+002
hsa-miR-193b  178  2.3e-002  2.36 (+)  0.72  3.6e+003  1.5e+003
hsa-miR-23b  293  1.5e-004  2.13 (+)  0.84  2.8e+004  1.3e+004
hsa-miR-27b  189  5.8e-004  2.05 (+)  0.80  5.5e+003  2.7e+003
MID-22331  382  1.1e-004  5.01 (-)  0.70  6.7e+002  3.4e+003
MID-19962  381  1.2e-004  3.91 (-)  0.81  1.7e+002  6.6e+002
MID-15965  240  1.9e-004  3.76 (-)  0.83  8.5e+002  3.2e+003
MID-20524  249  8.0e-004  3.47 (-)  0.79  4.2e+002  1.5e+003
hsa-miR-10b  5  1.3e-005  3.27 (-)  0.85  9.0e+002  2.9e+003
MID-17866  377  6.9e-005  2.92 (-)  0.78  1.0e+003  2.9e+003
hsa-miR-1978  235  1.3e-003  2.62 (-)  0.75  2.7e+002  7.1e+002
hsa-miR-146b-5p  17  4.4e-005  2.48 (-)  0.81  2.8e+002  7.0e+002
hsa-miR-17  20  2.7e-002  2.36 (-)  0.71  5.7e+002  1.3e+003
MID-23168  385  8.2e-003  2.36 (-)  0.73  1.9e+003  4.5e+003
MID-23017  384  4.8e-003  2.16 (-)  0.74  5.0e+003  1.1e+004
hsa-miR-30a  46  3.2e-004  2.04 (-)  0.79  5.4e+002  1.1e+003
hsa-miR-1979  283  3.0e-004  2.02 (-)  0.83  8.1e+003  1.6e+004
(+) the higher expression of this miR is in chondrosarcoma tumors
(-) the higher expression of this miR is in other non-synovial sarcoma tumors
[0370] hsa-miR-140-3p (SEQ ID NO: 12) and hsa-miR-455-5p (SEQ ID
NO: 58) are used at node 41 of the binary-tree-classifier detailed
in the invention to distinguish between chondrosarcoma tumors and
other non-synovial sarcoma tumors.
TABLE-US-00043 TABLE 43 miR expression (in fluorescence units)
distinguishing between liposarcoma and other non chondrosarcoma and
non synovial sarcoma tumors
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-26a  295  1.6e-011  6.18 (+)  0.93  1.2e+005  1.9e+004
hsa-miR-451  205  8.1e-003  4.20 (+)  0.73  1.8e+004  4.2e+003
hsa-miR-193a-3p  25  6.5e-006  3.94 (+)  0.84  5.9e+003  1.5e+003
hsa-miR-193a-5p  26  7.5e-007  3.70 (+)  0.88  8.8e+002  2.4e+002
hsa-miR-99a  231  2.2e-005  3.24 (+)  0.88  2.0e+004  6.1e+003
hsa-miR-199b-5p  183  1.9e-003  2.60 (+)  0.75  5.9e+003  2.3e+003
hsa-miR-224  42  1.7e-004  2.54 (+)  0.79  7.9e+002  3.1e+002
MID-23291  254  9.9e-003  2.54 (+)  0.71  7.4e+002  2.9e+002
hsa-miR-150  168  1.5e-002  2.38 (+)  0.71  1.0e+003  4.2e+002
hsa-miR-652  64  1.1e-004  2.36 (+)  0.77  7.7e+002  3.2e+002
hsa-miR-143  14  5.4e-006  2.27 (+)  0.84  1.1e+004  4.8e+003
hsa-miR-193b  178  2.7e-004  2.20 (+)  0.76  3.0e+003  1.4e+003
hsa-miR-145  15  1.1e-004  2.13 (+)  0.78  1.7e+004  7.9e+003
hsa-miR-22  39  9.8e-004  2.12 (+)  0.79  9.7e+003  4.6e+003
hsa-miR-210  36  1.8e-004  4.49 (-)  0.79  3.1e+002  1.4e+003
hsa-miR-181b  154  1.2e-002  2.60 (-)  0.71  9.0e+002  2.4e+003
hsa-miR-130b  265  4.0e-003  2.29 (-)  0.75  2.6e+002  5.9e+002
hsa-miR-181d  174  3.2e-003  2.16 (-)  0.75  3.0e+002  6.5e+002
MID-23017  384  2.0e-004  2.14 (-)  0.79  5.6e+003  1.2e+004
hsa-miR-92a  67  6.6e-004  2.04 (-)  0.80  1.6e+003  3.3e+003
(+) the higher expression of this miR is in liposarcoma tumors
(-) the higher expression of this miR is in other non-chondrosarcoma and non-synovial sarcoma tumors
[0371] hsa-miR-210 (SEQ ID NO: 36) and hsa-miR-193a-5p (SEQ ID NO:
26) are used at node 42 of the binary-tree-classifier detailed in
the invention to distinguish between liposarcoma tumors and other
non-chondrosarcoma and non-synovial sarcoma tumors.
TABLE-US-00044 TABLE 44 miR expression (in fluorescence units)
distinguishing between Ewing sarcoma or osteosarcoma; and
rhabdomyosarcoma, malignant fibrous histiocytoma (MFH) or
fibrosarcoma
miR name  SEQ ID NO.  p-value  fold-change  auROC  median values
hsa-miR-181a*  22  1.1e-006  6.62 (+)  0.87  1.2e+003  1.9e+002
hsa-miR-181b  154  8.7e-009  5.68 (+)  0.91  6.4e+003  1.1e+003
hsa-miR-181a  21  2.9e-010  5.67 (+)  0.93  2.1e+004  3.7e+003
hsa-miR-181d  174  3.5e-006  4.19 (+)  0.85  1.8e+003  4.2e+002
hsa-miR-451  205  1.2e-002  3.27 (+)  0.72  9.4e+003  2.9e+003
hsa-miR-106a  158  2.9e-003  2.63 (+)  0.78  4.7e+003  1.8e+003
hsa-miR-20a  186  2.9e-003  2.52 (+)  0.78  2.8e+003  1.1e+003
hsa-miR-93  148  9.2e-005  2.45 (+)  0.81  4.9e+003  2.0e+003
hsa-miR-17  20  5.1e-003  2.32 (+)  0.77  2.6e+003  1.1e+003
hsa-miR-487b  59  1.1e-002  4.54 (-)  0.71  1.3e+002  6.0e+002
hsa-miR-125b  8  2.9e-005  2.86 (-)  0.84  1.7e+004  4.9e+004
hsa-miR-199b-5p  183  9.4e-003  2.70 (-)  0.72  1.3e+003  3.4e+003
hsa-miR-99a  231  1.1e-003  2.34 (-)  0.76  3.5e+003  8.1e+003
(+) the higher expression of this miR is in Ewing sarcoma or osteosarcoma tumors
(-) the higher expression of this miR is in rhabdomyosarcoma, malignant fibrous histiocytoma (MFH) or fibrosarcoma tumors
[0372] hsa-miR-181a (SEQ ID NO: 21) is used at node 43 of the
binary-tree-classifier detailed in the invention to distinguish
between Ewing sarcoma or osteosarcoma tumors and rhabdomyosarcoma,
malignant fibrous histiocytoma (MFH) or fibrosarcoma tumors.
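Node 43 relies on a single miR, so its decision reduces to comparing one expression level with a cutoff. The source does not give the cutoff; the sketch below assumes, purely for illustration, the geometric mean of the two group medians reported for hsa-miR-181a in Table 44. The function names are hypothetical.

```python
import math


def geometric_mean_cutoff(median_first, median_second):
    """Midpoint of two medians on a log scale (illustrative choice)."""
    return math.sqrt(median_first * median_second)


def decide_node_43(mir_181a_level, cutoff):
    """Higher hsa-miR-181a points to the (+) class of Table 44."""
    if mir_181a_level > cutoff:
        return "Ewing sarcoma or osteosarcoma"
    return "rhabdomyosarcoma, MFH or fibrosarcoma"


# Group medians for hsa-miR-181a from Table 44 (fluorescence units).
cutoff = geometric_mean_cutoff(2.1e4, 3.7e3)
print(f"cutoff: {cutoff:.0f}")
print(decide_node_43(2.1e4, cutoff))
print(decide_node_43(3.7e3, cutoff))
```

A geometric rather than arithmetic mean is used here because the two medians differ by more than fivefold, so a log-scale midpoint sits equally far from both groups in fold-change terms.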
TABLE 45: miR expression (in fluorescence units) distinguishing between Ewing sarcoma and osteosarcoma
miR name | SEQ ID NO. | p-value | fold-change | auROC | median values
hsa-miR-127-3p | 155 | 3.7e-006 | 6.60 (+) | 1.00 | 1.1e+003 1.6e+002
hsa-miR-195 | 179 | 8.9e-004 | 5.85 (+) | 0.97 | 8.5e+003 1.4e+003
hsa-miR-29a | 43 | 1.4e-002 | 4.90 (+) | 0.86 | 1.4e+004 2.8e+003
hsa-miR-497 | 208 | 1.1e-004 | 4.58 (+) | 1.00 | 6.5e+003 1.4e+003
hsa-miR-181a-2* | 278 | 1.0e-003 | 4.42 (+) | 0.88 | 7.6e+002 1.7e+002
hsa-miR-146b-5p | 17 | 6.0e-003 | 4.05 (+) | 0.86 | 1.6e+003 4.0e+002
MID-23168 | 385 | 1.4e-002 | 2.64 (+) | 0.81 | 8.9e+003 3.4e+003
hsa-miR-181d | 174 | 1.5e-002 | 2.60 (+) | 0.77 | 2.1e+003 8.0e+002
hsa-miR-10b | 5 | 1.3e-002 | 2.55 (+) | 0.82 | 4.1e+003 1.6e+003
hsa-miR-34a | 52 | 7.1e-003 | 2.19 (+) | 0.84 | 4.9e+003 2.2e+003
hsa-let-7b | 257 | 2.7e-004 | 2.16 (+) | 0.97 | 5.4e+004 2.5e+004
MID-00144 | 366 | 2.1e-003 | 2.12 (+) | 0.88 | 5.2e+002 2.5e+002
hsa-miR-30e | 48 | 6.2e-003 | 2.06 (+) | 0.84 | 9.4e+002 4.5e+002
hsa-miR-31 | 49 | 7.9e-005 | 25.44 (-) | 0.96 | 5.0e+001 1.3e+003
hsa-miR-140-3p | 12 | 1.4e-003 | 5.72 (-) | 0.89 | 2.0e+003 1.2e+004
hsa-miR-193a-3p | 25 | 5.2e-005 | 4.92 (-) | 0.94 | 7.6e+002 3.8e+003
hsa-miR-152 | 169 | 3.3e-003 | 4.09 (-) | 0.89 | 4.4e+002 1.8e+003
hsa-miR-21 | 34 | 3.2e-003 | 3.00 (-) | 0.89 | 1.2e+004 3.7e+004
hsa-miR-21* | 35 | 1.7e-003 | 2.96 (-) | 0.83 | 2.7e+002 8.1e+002
hsa-miR-185 | 23 | 4.2e-003 | 2.55 (-) | 0.88 | 6.7e+002 1.7e+003
MID-23017 | 384 | 1.7e-002 | 2.53 (-) | 0.82 | 8.2e+003 2.1e+004
hsa-miR-27b | 189 | 3.8e-003 | 2.52 (-) | 0.84 | 1.7e+003 4.3e+003
MID-17866 | 377 | 3.0e-002 | 2.18 (-) | 0.80 | 2.3e+003 5.1e+003
hsa-miR-130b | 265 | 3.0e-002 | 2.17 (-) | 0.78 | 4.4e+002 9.6e+002
hsa-miR-24 | 294 | 3.3e-003 | 2.07 (-) | 0.82 | 1.8e+004 3.7e+004
hsa-miR-23b | 293 | 9.0e-003 | 2.03 (-) | 0.86 | 8.8e+003 1.8e+004
hsa-miR-23a | 292 | 1.6e-002 | 2.02 (-) | 0.80 | 1.5e+004 3.0e+004
(+) the higher expression of this miR is in Ewing sarcoma tumors
(-) the higher expression of this miR is in osteosarcoma tumors
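The auROC column can be read as the probability that a randomly chosen sample from one group shows higher expression than a randomly chosen sample from the other group. A minimal sketch of that rank-based calculation follows. The function name auroc and the sample values are hypothetical and are not data from this application; reporting the larger of the value and its complement is an assumption made so that results are at least 0.5, as they are in the table.

```python
def auroc(group_a, group_b):
    """Rank-based area under the ROC curve for two groups of expression values."""
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5  # ties count as half a win
    auc = wins / (len(group_a) * len(group_b))
    # Assumption: direction-independent reporting, so the result is >= 0.5
    return max(auc, 1.0 - auc)


# Hypothetical expression values, for illustration only
separated = auroc([1100.0, 900.0, 1500.0], [160.0, 200.0, 120.0])
overlapping = auroc([5.0, 3.0, 8.0], [4.0, 2.0, 6.0])
print(separated, overlapping)
```

A value of 1.00, as reported for hsa-miR-127-3p and hsa-miR-497, corresponds to the fully separated case.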
[0373] FIG. 24 demonstrates binary decisions at node #44 of the
decision-tree. Tumors originating in Ewing sarcoma (diamonds) are
easily distinguished from tumors of osteosarcoma origin (squares)
using the expression levels of hsa-miR-31 (SEQ ID NO: 49, y-axis)
and hsa-miR-193a-3p (SEQ ID NO: 25, x-axis).
TABLE 46: miR expression (in fluorescence units) distinguishing between rhabdomyosarcoma and malignant fibrous histiocytoma (MFH) or fibrosarcoma
median values | auROC | fold-change | p-value | SEQ ID NO. | miR name
5.0e+001 4.1e+003 | 0.96 | 81.34 (+) | 1.9e-007 | 33 | hsa-miR-206
5.7e+001 4.3e+003 | 0.89 | 74.89 (+) | 1.8e-004 | 268 | hsa-miR-133b
5.9e+001 3.9e+003 | 0.88 | 66.65 (+) | 3.2e-004 | 267 | hsa-miR-133a
5.0e+001 1.3e+003 | 0.89 | 25.89 (+) | 3.9e-006 | 333 | hsa-miR-483-3p
5.3e+001 5.2e+002 | 0.85 | 9.90 (+) | 1.3e-004 | 276 | hsa-miR-154*
5.8e+001 5.6e+002 | 0.85 | 9.63 (+) | 1.2e-004 | 319 | hsa-miR-376a
5.7e+001 5.1e+002 | 0.86 | 9.00 (+) | 4.8e-005 | 306 | hsa-miR-323-3p
2.5e+002 1.8e+003 | 0.84 | 7.01 (+) | 2.8e-003 | 320 | hsa-miR-376c
2.6e+002 1.7e+003 | 0.82 | 6.52 (+) | 3.9e-003 | 334 | hsa-miR-483-5p
3.1e+002 1.9e+003 | 0.87 | 6.22 (+) | 5.1e-004 | 323 | hsa-miR-381
1.0e+002 6.3e+002 | 0.85 | 6.19 (+) | 5.4e-004 | 300 | hsa-miR-299-3p
1.3e+002 7.9e+002 | 0.82 | 6.18 (+) | 1.4e-003 | 281 | hsa-miR-188-5p
4.1e+002 2.3e+003 | 0.86 | 5.73 (+) | 1.4e-003 | 59 | hsa-miR-487b
1.5e+002 8.4e+002 | 0.85 | 5.68 (+) | 8.1e-004 | 339 | hsa-miR-495
3.7e+002 1.7e+003 | 0.79 | 4.57 (+) | 3.1e-002 | 316 | hsa-miR-362-5p
2.0e+002 9.2e+002 | 0.80 | 4.49 (+) | 2.4e-003 | 176 | hsa-miR-18a
2.9e+002 1.3e+003 | 0.82 | 4.39 (+) | 1.4e-003 | 348 | hsa-miR-532-3p
1.8e+002 7.8e+002 | 0.85 | 4.27 (+) | 4.0e-004 | 352 | hsa-miR-543
4.0e+002 1.7e+003 | 0.81 | 4.18 (+) | 2.3e-002 | 349 | hsa-miR-532-5p
1.9e+003 7.8e+003 | 0.87 | 4.14 (+) | 4.9e-004 | 67 | hsa-miR-92a
5.7e+002 2.4e+003 | 0.86 | 4.13 (+) | 9.2e-004 | 357 | hsa-miR-660
1.3e+002 5.6e+002 | 0.78 | 4.13 (+) | 4.2e-003 | 315 | hsa-miR-362-3p
2.3e+002 8.6e+002 | 0.81 | 3.73 (+) | 2.8e-003 | 343 | hsa-miR-502-3p
2.0e+002 7.2e+002 | 0.84 | 3.64 (+) | 1.5e-003 | 342 | hsa-miR-501-3p
2.3e+002 8.5e+002 | 0.82 | 3.62 (+) | 6.7e-003 | 355 | hsa-miR-654-3p
1.9e+002 6.7e+002 | 0.79 | 3.56 (+) | 1.3e-002 | 340 | hsa-miR-500
2.4e+002 8.4e+002 | 0.80 | 3.56 (+) | 7.9e-003 | 344 | hsa-miR-503
2.2e+003 7.6e+003 | 0.78 | 3.53 (+) | 7.2e-003 | 10 | hsa-miR-130a
2.6e+002 8.8e+002 | 0.80 | 3.35 (+) | 3.7e-003 | 341 | hsa-miR-500*
2.6e+002 7.9e+002 | 0.79 | 3.06 (+) | 7.3e-003 | 331 | hsa-miR-432
9.3e+002 2.7e+003 | 0.77 | 2.90 (+) | 1.4e-002 | 20 | hsa-miR-17
4.3e+002 1.2e+003 | 0.86 | 2.90 (+) | 1.0e-003 | 277 | hsa-miR-17*
2.4e+002 6.7e+002 | 0.83 | 2.77 (+) | 7.0e-003 | 318 | hsa-miR-370
1.6e+003 4.5e+003 | 0.78 | 2.75 (+) | 1.4e-002 | 158 | hsa-miR-106a
4.3e+002 1.1e+003 | 0.83 | 2.67 (+) | 3.0e-003 | 265 | hsa-miR-130b
1.0e+003 2.7e+003 | 0.86 | 2.63 (+) | 7.1e-004 | 284 | hsa-miR-19b
8.6e+002 2.1e+003 | 0.82 | 2.43 (+) | 8.6e-003 | 36 | hsa-miR-210
6.1e+003 6.8e+002 | 0.90 | 8.92 (-) | 1.8e-004 | 183 | hsa-miR-199b-5p
1.9e+004 4.5e+003 | 0.83 | 4.15 (-) | 8.0e-004 | 40 | hsa-miR-222
1.1e+003 3.1e+002 | 0.90 | 3.55 (-) | 5.6e-005 | 63 | hsa-miR-574-5p
1.1e+004 3.2e+003 | 0.82 | 3.52 (-) | 2.2e-003 | 147 | hsa-miR-221
5.9e+003 1.8e+003 | 0.80 | 3.25 (-) | 2.0e-003 | 43 | hsa-miR-29a
5.2e+002 1.6e+002 | 0.82 | 3.19 (-) | 5.4e-003 | 289 | hsa-miR-22*
5.1e+003 1.7e+003 | 0.82 | 3.04 (-) | 7.0e-003 | 52 | hsa-miR-34a
8.1e+002 2.9e+002 | 0.76 | 2.81 (-) | 1.4e-002 | 190 | hsa-miR-29b
1.2e+003 4.5e+002 | 0.86 | 2.67 (-) | 4.7e-003 | 4 | hsa-miR-10a
3.7e+003 1.5e+003 | 0.86 | 2.43 (-) | 1.3e-003 | 5 | hsa-miR-10b
7.0e+003 2.9e+003 | 0.85 | 2.39 (-) | 1.5e-003 | 39 | hsa-miR-22
1.6e+003 6.9e+002 | 0.78 | 2.25 (-) | 1.5e-002 | 169 | hsa-miR-152
2.9e+003 1.3e+003 | 0.76 | 2.19 (-) | 2.8e-002 | 208 | hsa-miR-497
(+) the higher expression of this miR is in rhabdomyosarcoma tumors
(-) the higher expression of this miR is in MFH or fibrosarcoma tumors
[0374] FIG. 25 demonstrates binary decisions at node #45 of the
decision-tree. Tumors originating in rhabdomyosarcoma (diamonds)
are easily distinguished from tumors of malignant fibrous
histiocytoma (MFH) or fibrosarcoma origin (squares) using the
expression levels of hsa-miR-206 (SEQ ID NO: 33, y-axis),
hsa-miR-22 (SEQ ID NO: 39, x-axis) and hsa-miR-487b (SEQ ID NO: 59,
z-axis).
TABLE 47: β values of the decision tree classifier
The classification at node 11 is based on the gender of the subject rather than on β values; accordingly, no data are provided for this node. P_TH = 0.5 for all nodes.
Node | β0 (intercept) | miR 1 (hsa-) | SEQ ID NO | β1 | miR 2 (hsa-) | SEQ ID NO | β2 | miR 3 (hsa-) | SEQ ID NO | β3
1 | -23.3111 | miR-372 | 55 | 2.3127 | | | | | |
2 | -26.9408 | miR-122 | 6 | 2.3127 | | | | | |
3 | -3.8519 | miR-200b | 29 | 1.8567 | miR-126 | 9 | -1.379 | | |
4 | -8.2646 | miR-200c | 30 | 1.9582 | miR-30a | 46 | -1.2306 | | |
5 | 17.4706 | miR-146a | 16 | 1.1979 | let-7e | 2 | -1.7697 | miR-30a | 46 | -0.88435
6 | -32.5621 | miR-9* | 66 | 1.5475 | miR-92b | 68 | 1.7188 | | |
7 | -9.5521 | miR-222 | 40 | -1.1606 | miR-497 | 208 | 2.0005 | | |
8 | -23.053 | miR-193a-3p | 25 | -1.0267 | miR-7 | 65 | 1.2404 | miR-375 | 56 | 1.6602
9 | -29.3207 | miR-194 | 27 | 2.0115 | miR-21* | 35 | 1.1414 | | |
10 | 1.244 | miR-181a | 21 | -1.5458 | miR-143 | 14 | 0.9879 | | |
12 | 21.3416 | miR-200b | 29 | -1.942 | miR-516a-5p | 211 | -1.256 | | |
13 | 10.3775 | miR-125a-5p | 7 | -1.1455 | miR-205 | 32 | 1.1064 | miR-345 | 51 | -1.0128
14 | -40.666 | miR-193a-3p | 25 | 1.9505 | miR-342-3p | 50 | 0.93196 | miR-375 | 56 | 0.82076
15 | 26.2937 | miR-22 | 39 | -1.8153 | miR-10a | 4 | 0.61098 | miR-205 | 32 | -0.91632
16 | 9.4008 | miR-93 | 148 | -1.3023 | miR-138 | 11 | 1.5494 | miR-10a | 4 | -1.119
17 | 42.5529 | miR-21 | 34 | -1.801 | miR-146b-5p | 17 | -1.4509 | | |
18 | 0.52521 | miR-193a-3p | 25 | 1.7974 | miR-31 | 49 | -0.63021 | miR-92a | 67 | -1.3119
19 | -20.7179 | miR-138 | 11 | 0.9662 | miR-378 | 202 | -1.3077 | miR-21 | 34 | 1.6447
20 | 15.0039 | miR-100 | 3 | 1.0814 | miR-21 | 34 | -2.0444 | | |
21 | -31.6015 | miR-191 | 24 | 1.5137 | miR-29c | 191 | 0.22547 | miR-934 | 69 | 1.734
22 | -44.3141 | miR-10b | 5 | 1.41 | let-7c | 1 | 0.86212 | miR-361-5p | 54 | 1.6178
23 | 7.6168 | miR-138 | 11 | -0.32773 | miR-10b | 5 | 1.3275 | miR-185 | 23 | -1.8652
24 | 2.4904 | miR-342-3p | 50 | -1.7146 | miR-30d | 47 | 1.5521 | | |
26 | -10.0563 | miR-17 | 20 | 1.9063 | miR-29c* | 45 | -1.3096 | | |
27 | -2.3904 | miR-222 | 40 | 1.5531 | miR-92b | 68 | -1.5907 | miR-92a | 67 | -0.63749
28 | -22.027 | miR-652 | 64 | 1.9688 | miR-214 | 37 | -0.65807 | miR-34c-5p | 53 | 1.0197
29 | -11.4697 | miR-21 | 34 | 1.8457 | miR-148a | 18 | -1.3936 | | |
30 | 21.7628 | miR-224 | 42 | -1.3059 | miR-210 | 36 | -0.79749 | 1201 | 146 | -0.50909
31 | -17.747 | miR-17 | 20 | 0.95763 | miR-29a | 43 | 1.6268 | miR-30a | 46 | -1.3361
32 | -2.3716 | miR-31 | 49 | 1.0661 | miR-146a | 16 | 0.62041 | miR-345 | 51 | -1.8214
33 | -4.226 | miR-200b | 29 | 0.48415 | miR-149 | 19 | -2.0172 | miR-30a | 46 | 1.0224
34 | -29.6828 | miR-7 | 65 | 2.1394 | miR-375 | 56 | 0.87847 | | |
35 | -23.6445 | miR-202 | 31 | 2.1832 | miR-509-3p | 61 | 0.76095 | miR-214* | 38 | -0.057027
36 | -41.4047 | miR-29c* | 45 | 1.2571 | miR-143 | 14 | 1.9413 | | |
37 | -25.1227 | miR-221 | 147 | 2.2247 | miR-210 | 36 | -0.63202 | | |
38 | -24.5409 | miR-31 | 49 | -0.19797 | miR-126 | 9 | 2.3043 | | |
39 | -20.7495 | miR-130a | 10 | 1.014 | miR-10b | 5 | -1.0484 | miR-21* | 35 | 1.7948
40 | -6.0971 | miR-100 | 3 | 1.9198 | miR-222 | 40 | -1.0289 | miR-145 | 15 | -0.77759
41 | -38.5059 | miR-140-3p | 12 | 1.6462 | miR-455-5p | 58 | 1.6244 | | |
42 | -10.7873 | miR-210 | 36 | -0.84091 | miR-193a-5p | 26 | 1.9298 | | |
43 | -30.4778 | miR-181a | 21 | 2.3127 | | | | | |
44 | 31.0975 | miR-193a-3p | 25 | -2.0358 | miR-31 | 49 | -1.0974 | | |
45 | -17.5516 | miR-22 | 39 | -0.91078 | miR-487b | 59 | 1.0201 | miR-206 | 33 | 1.8651
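A hedged sketch of how the β values of Table 47 might be applied at a single node is given below. It assumes, without confirmation from this section, that each node is a logistic-regression model evaluated on log2-transformed fluorescence values, and that a probability above P_TH selects the first-named class of the node. The names NODES and node_probability are hypothetical. The inputs are the group medians of hsa-miR-193a-3p and hsa-miR-31 from Table 45, used here only as stand-ins for individual samples.

```python
import math

# Coefficients copied from Table 47 (miR names without the "hsa-" prefix)
NODES = {
    43: {"intercept": -30.4778, "betas": {"miR-181a": 2.3127}},
    44: {"intercept": 31.0975, "betas": {"miR-193a-3p": -2.0358, "miR-31": -1.0974}},
}
P_TH = 0.5  # probability threshold stated for all nodes


def node_probability(node, expression):
    """Logistic probability at one node.

    Assumption: expression values (fluorescence units) are log2-transformed
    before being combined with the beta coefficients.
    """
    z = node["intercept"]
    for mir, beta in node["betas"].items():
        z += beta * math.log2(expression[mir])
    return 1.0 / (1.0 + math.exp(-z))


# Group medians from Table 45, used as illustrative inputs only
ewing_like = {"miR-193a-3p": 7.6e2, "miR-31": 5.0e1}
osteo_like = {"miR-193a-3p": 3.8e3, "miR-31": 1.3e3}

p_ewing = node_probability(NODES[44], ewing_like)
p_osteo = node_probability(NODES[44], osteo_like)
print(p_ewing > P_TH, p_osteo > P_TH)
```

Under these assumptions, the Ewing sarcoma medians give a probability above 0.5 at node 44 and the osteosarcoma medians give a probability below 0.5, which matches the direction of separation described for FIG. 24.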
TABLE 48: Using fine-needle aspiration (FNA), pleural effusion or bronchial brushing for the identification of cancer tissue of origin
Class identified | Biopsy Site | Histological Type | Sampling Method
lung-small | Lymph Node | Neuroendocrine; Small | percutaneous FNA
UpperSCC | Lung | Non-small; squamous | percutaneous FNA
UpperSCC | Lung | Non-small; adenocarcinoma | percutaneous FNA
lung-small | Lung | Neuroendocrine; Small | percutaneous FNA
lung-adeno | Lung | Non-small; adenocarcinoma | percutaneous FNA
UpperSCC | Lung | Non-small; squamous | percutaneous FNA
lung-small | Lymph Node | Neuroendocrine; Small | transbronchial FNA
lung-small | Lung | Neuroendocrine; Small | transbronchial FNA
lung-adeno | Lung pleura | Non-small; adenocarcinoma | Pleural effusion
lung-adeno | Lung pleura | Non-small; adenocarcinoma | Pleural effusion
Lung, small | Lung | Neuroendocrine; Small | bronchial brushing
Lung, small | Lung | Neuroendocrine; Small | bronchial brushing
Lung, small | Lung | Neuroendocrine; Small | bronchial brushing
Lung, small | Lung | Neuroendocrine; Small | bronchial brushing
Lung, small | Lung | Neuroendocrine; Small | bronchial brushing
[0375] The foregoing description of the specific embodiments so
fully reveals the general nature of the invention that others can,
by applying current knowledge, readily modify and/or adapt for
various applications such specific embodiments without undue
experimentation and without departing from the generic concept,
and, therefore, such adaptations and modifications should and are
intended to be comprehended within the meaning and range of
equivalents of the disclosed embodiments. Although the invention
has been described in conjunction with specific embodiments
thereof, it is evident that many alternatives, modifications and
variations will be apparent to those skilled in the art.
Accordingly, it is intended to embrace all such alternatives,
modifications and variations that fall within the spirit and broad
scope of the appended claims.
[0376] It should be understood that the detailed description and
specific examples, while indicating preferred embodiments of the
invention, are given by way of illustration only, since various
changes and modifications within the spirit and scope of the
invention will become apparent to those skilled in the art from
this detailed description.
SEQUENCE LISTING
390122RNAHomo sapiens 1ugagguagua gguuguaugg uu 22222RNAHomo
sapiens 2ugagguagga gguuguauag uu 22322RNAHomo sapiens 3aacccguaga
uccgaacuug ug 22423RNAHomo sapiens 4uacccuguag auccgaauuu gug
23523RNAHomo sapiens 5uacccuguag aaccgaauuu gug 23622RNAHomo
sapiens 6uggaguguga caaugguguu ug 22724RNAHomo sapiens 7ucccugagac
ccuuuaaccu guga 24822RNAHomo sapiens 8ucccugagac ccuaacuugu ga
22922RNAHomo sapiens 9ucguaccgug aguaauaaug cg 221022RNAHomo
sapiens 10cagugcaaug uuaaaagggc au 221123RNAHomo sapiens
11agcugguguu gugaaucagg ccg 231221RNAHomo sapiens 12uaccacaggg
uagaaccacg g 211322RNAHomo sapiens 13uaacacuguc ugguaaagau gg
221421RNAHomo sapiens 14ugagaugaag cacuguagcu c 211523RNAHomo
sapiens 15guccaguuuu cccaggaauc ccu 231622RNAHomo sapiens
16ugagaacuga auuccauggg uu 221722RNAHomo sapiens 17ugagaacuga
auuccauagg cu 221822RNAHomo sapiens 18ucagugcacu acagaacuuu gu
221923RNAHomo sapiens 19ucuggcuccg ugucuucacu ccc 232023RNAHomo
sapiens 20caaagugcuu acagugcagg uag 232123RNAHomo sapiens
21aacauucaac gcugucggug agu 232222RNAHomo sapiens 22accaucgacc
guugauugua cc 222322RNAHomo sapiens 23uggagagaaa ggcaguuccu ga
222423RNAHomo sapiens 24caacggaauc ccaaaagcag cug 232522RNAHomo
sapiens 25aacuggccua caaaguccca gu 222622RNAHomo sapiens
26ugggucuuug cgggcgagau ga 222722RNAHomo sapiens 27uguaacagca
acuccaugug ga 222822RNAHomo sapiens 28uaacacuguc ugguaacgau gu
222922RNAHomo sapiens 29uaauacugcc ugguaaugau ga 223023RNAHomo
sapiens 30uaauacugcc ggguaaugau gga 233120RNAHomo sapiens
31agagguauag ggcaugggaa 203222RNAHomo sapiens 32uccuucauuc
caccggaguc ug 223322RNAHomo sapiens 33uggaauguaa ggaagugugu gg
223422RNAHomo sapiens 34uagcuuauca gacugauguu ga 223521RNAHomo
sapiens 35caacaccagu cgaugggcug u 213622RNAHomo sapiens
36cugugcgugu gacagcggcu ga 223722RNAHomo sapiens 37acagcaggca
cagacaggca gu 223822RNAHomo sapiens 38ugccugucua cacuugcugu gc
223922RNAHomo sapiens 39aagcugccag uugaagaacu gu 224021RNAHomo
sapiens 40agcuacaucu ggcuacuggg u 214122RNAHomo sapiens
41ugucaguuug ucaaauaccc ca 224221RNAHomo sapiens 42caagucacua
gugguuccgu u 214322RNAHomo sapiens 43uagcaccauc ugaaaucggu ua
224422RNAHomo sapiens 44uagcaccauu ugaaaucggu ua 224522RNAHomo
sapiens 45ugaccgauuu cuccuggugu uc 224622RNAHomo sapiens
46uguaaacauc cucgacugga ag 224722RNAHomo sapiens 47uguaaacauc
cccgacugga ag 224822RNAHomo sapiens 48uguaaacauc cuugacugga ag
224921RNAHomo sapiens 49aggcaagaug cuggcauagc u 215023RNAHomo
sapiens 50ucucacacag aaaucgcacc cgu 235122RNAHomo sapiens
51gcugacuccu aguccagggc uc 225222RNAHomo sapiens 52uggcaguguc
uuagcugguu gu 225323RNAHomo sapiens 53aggcagugua guuagcugau ugc
235422RNAHomo sapiens 54uuaucagaau cuccaggggu ac 225523RNAHomo
sapiens 55aaagugcugc gacauuugag cgu 235622RNAHomo sapiens
56uuuguucguu cggcucgcgu ga 225721RNAHomo sapiens 57acuggacuug
gagucagaag g 215822RNAHomo sapiens 58uaugugccuu uggacuacau cg
225922RNAHomo sapiens 59aaucguacag ggucauccac uu 226021RNAHomo
sapiens 60cagcagcaca cugugguuug u 216122RNAHomo sapiens
61ugauugguac gucugugggu ag 226223RNAHomo sapiens 62uucucgagga
aagaagcacu uuc 236323RNAHomo sapiens 63ugagugugug ugugugagug ugu
236421RNAHomo sapiens 64aauggcgcca cuaggguugu g 216523RNAHomo
sapiens 65uggaagacua gugauuuugu ugu 236622RNAHomo sapiens
66auaaagcuag auaaccgaaa gu 226722RNAHomo sapiens 67uauugcacuu
gucccggccu gu 226822RNAHomo sapiens 68uauugcacuc gucccggccu cc
226922RNAHomo sapiens 69ugucuacuac uggagacacu gg 227084RNAHomo
sapiens 70gcauccgggu ugagguagua gguuguaugg uuuagaguua cacccuggga
guuaacugua 60caaccuucua gcuuuccuug gagc 847179RNAHomo sapiens
71cccgggcuga gguaggaggu uguauaguug aggaggacac ccaaggagau cacuauacgg
60ccuccuagcu uuccccagg 797280RNAHomo sapiens 72ccuguugcca
caaacccgua gauccgaacu ugugguauua guccgcacaa gcuuguaucu 60auagguaugu
gucuguuagg 8073110RNAHomo sapiens 73gaucugucug ucuucuguau
auacccugua gauccgaauu uguguaagga auuuuguggu 60cacaaauucg uaucuagggg
aauauguagu ugacauaaac acuccgcucu 11074110RNAHomo sapiens
74ccagagguug uaacguuguc uauauauacc cuguagaacc gaauuugugu gguauccgua
60uagucacaga uucgauucua ggggaauaua uggucgaugc aaaaacuuca
1107585RNAHomo sapiens 75ccuuagcaga gcuguggagu gugacaaugg
uguuuguguc uaaacuauca aacgccauua 60ucacacuaaa uagcuacugc uaggc
857686RNAHomo sapiens 76ugccagucuc uaggucccug agacccuuua accugugagg
acauccaggg ucacagguga 60gguucuuggg agccuggcgu cuggcc 867788RNAHomo
sapiens 77ugcgcuccuc ucagucccug agacccuaac uugugauguu uaccguuuaa
auccacgggu 60uaggcucuug ggagcugcga gucgugcu 887889RNAHomo sapiens
78accagacuuu uccuaguccc ugagacccua acuugugagg uauuuuagua acaucacaag
60ucaggcucuu gggaccuagg cggagggga 897985RNAHomo sapiens
79cgcuggcgac gggacauuau uacuuuuggu acgcgcugug acacuucaaa cucguaccgu
60gaguaauaau gcgccgucca cggca 858089RNAHomo sapiens 80ugcugcuggc
cagagcucuu uucacauugu gcuacugucu gcaccuguca cuagcagugc 60aauguuaaaa
gggcauuggc cguguagug 898199RNAHomo sapiens 81cccuggcaug gugugguggg
gcagcuggug uugugaauca ggccguugcc aaucagagaa 60cggcuacuuc acaacaccag
ggccacacca cacuacagg 998284RNAHomo sapiens 82cguugcugca gcugguguug
ugaaucaggc cgacgagcag cgcauccucu uacccggcua 60uuucacgaca ccaggguugc
auca 8483100RNAHomo sapiens 83ugugucucuc ucuguguccu gccagugguu
uuacccuaug guagguuacg ucaugcuguu 60cuaccacagg guagaaccac ggacaggaua
ccggggcacc 1008495RNAHomo sapiens 84cggccggccc uggguccauc
uuccaguaca guguuggaug gucuaauugu gaagcuccua 60acacugucug guaaagaugg
cucccgggug gguuc 9585106RNAHomo sapiens 85gcgcagcgcc cugucuccca
gccugaggug cagugcugca ucucugguca guugggaguc 60ugagaugaag cacuguagcu
caggaagaga gaaguuguuc ugcagc 1068688RNAHomo sapiens 86caccuugucc
ucacggucca guuuucccag gaaucccuua gaugcuaaga uggggauucc 60uggaaauacu
guucuugagg ucaugguu 888799RNAHomo sapiens 87ccgaugugua uccucagcuu
ugagaacuga auuccauggg uugugucagu gucagaccuc 60ugaaauucag uucuucagcu
gggauaucuc ugucaucgu 998873RNAHomo sapiens 88ccuggcacug agaacugaau
uccauaggcu gugagcucua gcaaugcccu guggacucag 60uucuggugcc cgg
738968RNAHomo sapiens 89gaggcaaagu ucugagacac uccgacucug aguaugauag
aagucagugc acuacagaac 60uuugucuc 689089RNAHomo sapiens 90gccggcgccc
gagcucuggc uccgugucuu cacucccgug cuuguccgag gagggaggga 60gggacggggg
cugugcuggg gcagcugga 899184RNAHomo sapiens 91gucagaauaa ugucaaagug
cuuacagugc agguagugau augugcaucu acugcaguga 60aggcacuugu agcauuaugg
ugac 8492110RNAHomo sapiens 92ugaguuuuga gguugcuuca gugaacauuc
aacgcugucg gugaguuugg aauuaaaauc 60aaaaccaucg accguugauu guacccuaug
gcuaaccauc aucuacucca 11093110RNAHomo sapiens 93agaagggcua
ucaggccagc cuucagagga cuccaaggaa cauucaacgc ugucggugag 60uuugggauuu
gaaaaaacca cugaccguug acuguaccuu gggguccuua 1109482RNAHomo sapiens
94agggggcgag ggauuggaga gaaaggcagu uccugauggu ccccucccca ggggcuggcu
60uuccucuggu ccuucccucc ca 829592RNAHomo sapiens 95cggcuggaca
gcgggcaacg gaaucccaaa agcagcuguu gucuccagag cauuccagcu 60gcgcuuggau
uucguccccu gcucuccugc cu 929688RNAHomo sapiens 96cgaggauggg
agcugagggc ugggucuuug cgggcgagau gagggugucg gaucaacugg 60ccuacaaagu
cccaguucuc ggcccccg 889785RNAHomo sapiens 97augguguuau caaguguaac
agcaacucca uguggacugu guaccaauuu ccaguggaga 60ugcuguuacu uuugaugguu
accaa 859885RNAHomo sapiens 98ugguucccgc ccccuguaac agcaacucca
uguggaagug cccacugguu ccaguggggc 60ugcuguuauc uggggcgagg gccag
859990RNAHomo sapiens 99ccgggccccu gugagcaucu uaccggacag ugcuggauuu
cccagcuuga cucuaacacu 60gucugguaac gauguucaaa ggugacccgc
9010095RNAHomo sapiens 100ccagcucggg cagccguggc caucuuacug
ggcagcauug gauggaguca ggucucuaau 60acugccuggu aaugaugacg gcggagcccu
gcacg 9510168RNAHomo sapiens 101cccucgucuu acccagcagu guuugggugc
gguugggagu cucuaauacu gccggguaau 60gauggagg 68102110RNAHomo sapiens
102cgccucagag ccgcccgccg uuccuuuuuc cuaugcauau acuucuuuga
ggaucuggcc 60uaaagaggua uagggcaugg gaaaacgggg cggucggguc cuccccagcg
110103110RNAHomo sapiens 103aaagauccuc agacaaucca ugugcuucuc
uuguccuuca uuccaccgga gucugucuca 60uacccaacca gauuucagug gagugaaguu
caggaggcau ggagcugaca 11010486RNAHomo sapiens 104ugcuucccga
ggccacaugc uucuuuauau ccccauaugg auuacuuugc uauggaaugu 60aaggaagugu
gugguuucgg caagug 8610572RNAHomo sapiens 105ugucggguag cuuaucagac
ugauguugac uguugaaucu cauggcaaca ccagucgaug 60ggcugucuga ca
72106110RNAHomo sapiens 106acccggcagu gccuccaggc gcagggcagc
cccugcccac cgcacacugc gcugccccag 60acccacugug cgugugacag cggcugaucu
gugccugggc agcgcgaccc 110107110RNAHomo sapiens 107ggccuggcug
gacagaguug ucaugugucu gccugucuac acuugcugug cagaacaucc 60gcucaccugu
acagcaggca cagacaggca gucacaugac aacccagccu 11010885RNAHomo sapiens
108ggcugagccg caguaguucu ucaguggcaa gcuuuauguc cugacccagc
uaaagcugcc 60aguugaagaa cuguugcccu cugcc 85109110RNAHomo sapiens
109gcugcuggaa gguguaggua cccucaaugg cucaguagcc aguguagauc
cugucuuucg 60uaaucagcag cuacaucugg cuacuggguc ucugauggca ucuucuagcu
110110110RNAHomo sapiens 110ccuggccucc ugcagugcca cgcuccgugu
auuugacaag cugaguugga cacuccaugu 60gguagagugu caguuuguca aauaccccaa
gugcggcaca ugcuuaccag 11011181RNAHomo sapiens 111gggcuuucaa
gucacuagug guuccguuua guagaugauu gugcauuguu ucaaaauggu 60gcccuaguga
cuacaaagcc c 8111264RNAHomo sapiens 112augacugauu ucuuuuggug
uucagaguca auauaauuuu cuagcaccau cugaaaucgg 60uuau 6411388RNAHomo
sapiens 113aucucuuaca caggcugacc gauuucuccu gguguucaga gucuguuuuu
gucuagcacc 60auuugaaauc gguuaugaug uaggggga 8811471RNAHomo sapiens
114gcgacuguaa acauccucga cuggaagcug ugaagccaca gaugggcuuu
cagucggaug 60uuugcagcug c 7111570RNAHomo sapiens 115guuguuguaa
acauccccga cuggaagcug uaagacacag cuaagcuuuc agucagaugu 60uugcugcuac
7011692RNAHomo sapiens 116gggcagucuu ugcuacugua aacauccuug
acuggaagcu guaagguguu cagaggagcu 60uucagucgga uguuuacagc ggcaggcugc
ca 9211771RNAHomo sapiens 117ggagaggagg caagaugcug gcauagcugu
ugaacuggga accugcuaug ccaacauauu 60gccaucuuuc c 7111899RNAHomo
sapiens 118gaaacugggc ucaaggugag gggugcuauc ugugauugag ggacaugguu
aauggaauug 60ucucacacag aaaucgcacc cgucaccuug gccuacuua
9911998RNAHomo sapiens 119acccaaaccc uaggucugcu gacuccuagu
ccagggcucg ugauggcugg ugggcccuga 60acgagggguc uggaggccug gguuugaaua
ucgacagc 98120110RNAHomo sapiens 120ggccagcugu gaguguuucu
uuggcagugu cuuagcuggu uguugugagc aauaguaagg 60aagcaaucag caaguauacu
gcccuagaag ugcugcacgu uguggggccc 11012177RNAHomo sapiens
121agucuaguua cuaggcagug uaguuagcug auugcuaaua guaccaauca
cuaaccacac 60ggccagguaa aaagauu 7712272RNAHomo sapiens
122ggagcuuauc agaaucucca gggguacuuu auaauuucaa aaaguccccc
aggugugauu 60cugauuugcu uc 7212367RNAHomo sapiens 123gugggccuca
aauguggagc acuauucuga uguccaagug gaaagugcug cgacauuuga 60gcgucac
6712464RNAHomo sapiens 124ccccgcgacg agccccucgc acaaaccgga
ccugagcguu uuguucguuc ggcucgcgug 60aggc 6412566RNAHomo sapiens
125agggcuccug acuccagguc cuguguguua ccuagaaaua gcacuggacu
uggagucaga 60aggccu 6612696RNAHomo sapiens 126ucccuggcgu gaggguaugu
gccuuuggac uacaucgugg aagccagcac caugcagucc 60augggcauau acacuugccu
caaggccuau gucauc 9612784RNAHomo sapiens 127uugguacuug gagagugguu
aucccugucc uguucguuuu gcucaugucg aaucguacag 60ggucauccac uuuuucagua
ucaa 84128112RNAHomo sapiens 128ccaccccggu ccugcucccg ccccagcagc
acacuguggu uuguacggca cuguggccac 60guccaaacca cacuguggug uuagagcgag
ggugggggag gcaccgccga gg 11212994RNAHomo sapiens 129caugcugugu
gugguacccu acugcagaca guggcaauca uguauaauua aaaaugauug 60guacgucugu
ggguagagua cugcaugaca caug 9413091RNAHomo sapiens 130caugcugugu
gugguacccu acugcagaca guggcaauca uguauaauua aaaaugauug 60guacgucugu
ggguagagua cugcaugaca c 9113175RNAHomo sapiens 131gugguacccu
acugcagacg uggcaaucau guauaauuaa aaaugauugg uacgucugug 60gguagaguac
ugcau 7513290RNAHomo sapiens 132ucucaggcug ugaccuucuc gaggaaagaa
gcacuuucug uugucugaaa gaaaagaaag 60ugcuuccuuu cagaggguua cgguuugaga
9013390RNAHomo sapiens 133ucucagguug ugaccuucuc gaggaaagaa
gcacuuucug uugucugaaa gaaaagaaag 60ugcuuccuuu cagaggguua cgguuugaga
9013496RNAHomo sapiens 134gggaccugcg ugggugcggg cgugugagug
ugugugugug aguguguguc gcuccggguc 60cacgcucaug cacacaccca cacgcccaca
cucagg 9613598RNAHomo sapiens 135acgaauggcu augcacugca caacccuagg
agagggugcc auucacauag acuauaauug 60aauggcgcca cuaggguugu gcagugcaca
accuacac 98136110RNAHomo sapiens 136uuggauguug gccuaguucu
guguggaaga cuagugauuu uguuguuuuu agauaacuaa 60aucgacaaca aaucacaguc
ugccauaugg cacaggccau gccucuacag 110137110RNAHomo sapiens
137cuggauacag aguggaccgg cuggccccau cuggaagacu agugauuuug
uuguugucuu 60acugcgcuca acaacaaauc ccagucuacc uaauggugcc agccaucgca
110138110RNAHomo sapiens 138agauuagagu
ggcugugguc uagugcugug uggaagacua gugauuuugu uguucugaug 60uacuacgaca
acaagucaca gccggccuca uagcgcagac ucccuucgac 11013989RNAHomo sapiens
139cgggguuggu uguuaucuuu gguuaucuag cuguaugagu gguguggagu
cuucauaaag 60cuagauaacc gaaaguaaaa auaacccca 8914087RNAHomo sapiens
140ggaagcgagu uguuaucuuu gguuaucuag cuguaugagu guauuggucu
ucauaaagcu 60agauaaccga aaguaaaaac uccuuca 8714190RNAHomo sapiens
141ggaggcccgu uucucucuuu gguuaucuag cuguaugagu gccacagagc
cgucauaaag 60cuagauaacc gaaaguagaa augauucuca 9014278RNAHomo
sapiens 142cuuucuacac agguugggau cgguugcaau gcuguguuuc uguaugguau
ugcacuuguc 60ccggccuguu gaguuugg 7814375RNAHomo sapiens
143ucaucccugg guggggauuu guugcauuac uuguguucua uauaaaguau
ugcacuuguc 60ccggccugug gaaga 7514496RNAHomo sapiens 144cgggccccgg
gcgggcggga gggacgggac gcggugcagu guuguuuuuu cccccgccaa 60uauugcacuc
gucccggccu ccggcccccc cggccc 9614583RNAHomo sapiens 145agaaauaagg
cuucugucua cuacuggaga cacugguagu auaaaaccca gagucuccag 60uaauggacgg
gagccuuauu ucu 8314624RNAHomo sapiens 146agccugauua aacacaugcu cuga
2414723RNAHomo sapiens 147agcuacauug ucugcugggu uuc 2314823RNAHomo
sapiens 148caaagugcug uucgugcagg uag 2314985RNAHomo sapiens
149uuuacaguuu gccaugauga aaugcauguu aaguccgugu uucagcugau
cagccugauu 60aaacacaugc ucugagcaga cuaaa 85150110RNAHomo sapiens
150ugaacaucca ggucuggggc augaaccugg cauacaaugu agauuucugu
guucguuagg 60caacagcuac auugucugcu ggguuucagg cuaccuggaa acauguucuc
11015180RNAHomo sapiens 151cugggggcuc caaagugcug uucgugcagg
uagugugauu acccaaccua cugcugagcu 60agcacuuccc gagcccccgg
8015224RNAHomo sapiens 152uuuggcaaug guagaacuca cacu 2415322RNAHomo
sapiens 153agagguagua gguugcauag uu 2215423RNAHomo sapiens
154aacauucauu gcugucggug ggu 2315522RNAHomo sapiens 155ucggauccgu
cugagcuugg cu 2215622RNAHomo sapiens 156ugagguagga gguuguauag uu
2215722RNAHomo sapiens 157ugagguagua guuugugcug uu 2215823RNAHomo
sapiens 158aaaagugcuu acagugcagg uag 2315920RNAHomo sapiens
159uaaggcacgc ggugaaugcc 2016027RNAHomo sapiens 160accuucuugu
auaagcacug ugcuaaa 2716121RNAHomo sapiens 161ucacagugaa ccggucucuu
u 2116222RNAHomo sapiens 162aagcccuuac cccaaaaagc au 2216322RNAHomo
sapiens 163ucaaaacuga ggggcauuuu cu 2216421RNAHomo sapiens
164cauaaaguag aaagcacuac u 2116522RNAHomo sapiens 165ggugcagugc
ugcaucucug gu 2216622RNAHomo sapiens 166ugcccugugg acucaguucu gg
2216721RNAHomo sapiens 167agggagggac gggggcugug c 2116822RNAHomo
sapiens 168ucucccaacc cuuguaccag ug 2216921RNAHomo sapiens
169ucagugcaug acagaacuug g 2117023RNAHomo sapiens 170uuaaugcuaa
ucgugauagg ggu 2317122RNAHomo sapiens 171uagcagcaca uaaugguuug ug
2217222RNAHomo sapiens 172uagcagcaca ucaugguuua ca 2217322RNAHomo
sapiens 173aacauucaac cugucgguga gu 2217423RNAHomo sapiens
174aacauucauu guugucggug ggu 2317522RNAHomo sapiens 175uauggcacug
guagaauuca cu 2217623RNAHomo sapiens 176uaaggugcau cuagugcaga uag
2317721RNAHomo sapiens 177cugaccuaug aauugacagc c 2117822RNAHomo
sapiens 178aacuggcccu caaagucccg cu 2217921RNAHomo sapiens
179uagcagcaca gaaauauugg c 2118019RNAHomo sapiens 180accgugcaaa
gguagcaua 1918122RNAHomo sapiens 181acaguagucu gcacauuggu ua
2218223RNAHomo sapiens 182cccaguguuc agacuaccug uuc 2318323RNAHomo
sapiens 183cccaguguuu agacuaucug uuc 2318422RNAHomo sapiens
184gugaaauguu uaggaccacu ag 2218521RNAHomo sapiens 185gauuucagug
gagugaaguu c 2118623RNAHomo sapiens 186uaaagugcuu auagugcagg uag
2318722RNAHomo sapiens 187agaauugugg cuggacaucu gu 2218822RNAHomo
sapiens 188cauugcacuu gucucggucu ga 2218921RNAHomo sapiens
189uucacagugg cuaaguucug c 2119023RNAHomo sapiens 190uagcaccauu
ugaaaucagu guu 2319122RNAHomo sapiens 191uagcaccauu ugaaaucggu ua
2219223RNAHomo sapiens 192uaagugcuuc cauguuuugg uga 2319323RNAHomo
sapiens 193acuuaaacgu ggauguacuu gcu 2319423RNAHomo sapiens
194uaagugcuuc cauguuugag ugu 2319522RNAHomo sapiens 195cuuucagucg
gauguuugca gc 2219623RNAHomo sapiens 196uguaaacauc cuacacucuc agc
2319721RNAHomo sapiens 197gccccugggc cuauccuaga a 2119821RNAHomo
sapiens 198aggggugcua ucugugauug a 2119922RNAHomo sapiens
199aauugcacgg uauccaucug ua 2220023RNAHomo sapiens 200aagugccgcc
aucuuuugag ugu 2320120RNAHomo sapiens 201acucaaacug ugggggcacu
2020221RNAHomo sapiens 202acuggacuug gagucagaag g 2120322RNAHomo
sapiens 203acuggacuua gggucagaag gc 2220423RNAHomo sapiens
204aaugacacga ucacucccgu uga 2320522RNAHomo sapiens 205aaaccguuac
cauuacugag uu 2220621RNAHomo sapiens 206gcaguccaug ggcauauaca c
2120722RNAHomo sapiens 207uccuguacug agcugccccg ag 2220821RNAHomo
sapiens 208cagcagcaca cugugguuug u 2120923RNAHomo sapiens
209uuucaagcca gggggcguuu uuc 2321023RNAHomo sapiens 210cacucagccu
ugagggcacu uuc 2321123RNAHomo sapiens 211uucucgagga aagaagcacu uuc
2321222RNAHomo sapiens 212aucuggaggu aagaagcacu uu 2221322RNAHomo
sapiens 213aucgugcauc ccuuuagagu gu 2221422RNAHomo sapiens
214aucgugcauc cuuuuagagu gu 2221522RNAHomo sapiens 215gaaagcgcuu
cccuuugcug ga 2221621RNAHomo sapiens 216aaagcgcuuc ccuucagagu g
2121722RNAHomo sapiens 217cucuagaggg aagcacuuuc uc 2221822RNAHomo
sapiens 218aaagugcauc cuuuuagagu gu 2221922RNAHomo sapiens
219caaagugccu cccuuuagag ug 2222021RNAHomo sapiens 220cuccagaggg
aaguacuuuc u 2122122RNAHomo sapiens 221aaagugcuuc cuuuuagagg gu
2222220RNAHomo sapiens 222cuacaaaggg aagcccuuuc 2022322RNAHomo
sapiens 223cuacaaaggg aagcacuuuc uc 2222420RNAHomo sapiens
224cugcaaaggg aagcccuuuc 2022521RNAHomo sapiens 225gcgacccaua
cuugguuuca g 2122621RNAHomo sapiens 226agggggaaag uucuauaguc c
2122723RNAHomo sapiens 227ugcaccaugg uugucugagc aug 2322821RNAHomo
sapiens 228cgcgggugcu uacugacccu u 2122923RNAHomo sapiens
229ucuuugguua ucuagcugua uga 2323023RNAHomo sapiens 230cgggucggag
uuagcucaag cgg 2323122RNAHomo sapiens 231aacccguaga uccgaucuug ug
2223222RNAHomo sapiens 232caagcucgcu ucuauggguc ug 2223323RNAHomo
sapiens 233gaagugcuuc gauuuugggg ugu 2323422RNAHomo sapiens
234gauuagggug cuuagcuguu aa 2223521RNAHomo sapiens 235gguuuggucc
uagccuuucu a 2123622RNAHomo sapiens 236uggacuugga gucaggaggc cu
2223722RNAHomo sapiens 237aaucugcagg gggagccugg gu 2223821RNAHomo
sapiens 238acaugaaaag gggagagggc a 2123922RNAHomo sapiens
239acccccccca gccauacaua ga 2224025RNAHomo sapiens 240acuaccccag
gaugccagca uaguu 2524122RNAHomo sapiens 241agcugguuug auggggagcc au
2224222RNAHomo sapiens 242agggugacag ggaacaguag au 2224322RNAHomo
sapiens 243augugggugg uggucaccgu uu 2224422RNAHomo sapiens
244cacugauuau cgaggcgauu cu 2224520RNAHomo sapiens 245gaacccuacu
ccugguacca 2024622RNAHomo sapiens 246gaauuuccug aggggagggg gc
2224722RNAHomo sapiens 247ggcaggacgg cguaggucuu ga 2224822RNAHomo
sapiens 248gggcugggca gguuucagga au 2224921RNAHomo sapiens
249uaggucaagg uguagcccau a 2125022RNAHomo sapiens 250uauguacaag
guggaggggg cg 2225122RNAHomo sapiens 251ucccccaccc uuagcuuaga ua
2225222RNAHomo sapiens 252uggagcaggc uggggcuuug ag 2225323RNAHomo
sapiens 253ugugcuccgg aguuaccucg uuu 2325418RNAHomo sapiens
254uguggguucg aguuccau 1825518RNAHomo sapiens 255uucccggcca
augcauua 1825622RNAHomo sapiens 256ugagguagua gguuguauag uu
2225722RNAHomo sapiens 257ugagguagua gguugugugg uu 2225822RNAHomo
sapiens 258ugagguagua gauuguauag uu 2225922RNAHomo sapiens
259ugagguagua guuuguacag uu 2226021RNAHomo sapiens 260uaaagugcug
acagugcaga u 2126122RNAHomo sapiens 261uuuccggcuc gcgugggugu gu
2226222RNAHomo sapiens 262cugaagcuca gagggcucug au 2226322RNAHomo
sapiens 263aagcccuuac cccaaaaagu au 2226421RNAHomo sapiens
264cuuuuugcgg ucugggcuug c 2126522RNAHomo sapiens 265cagugcaaug
augaaagggc au 2226622RNAHomo sapiens 266uaacagucua cagccauggu cg
2226722RNAHomo sapiens 267uuuggucccc uucaaccagc ug 2226822RNAHomo
sapiens 268uuuggucccc uucaaccagc ua 2226922RNAHomo sapiens
269ugugacuggu ugaccagagg gg 2227022RNAHomo sapiens 270ucuacagugc
acgugucucc ag 2227122RNAHomo sapiens 271cagugguuuu acccuauggu ag
2227222RNAHomo sapiens 272ggauuccugg aaauacuguu cu 2227322RNAHomo
sapiens 273ucagugcauc acagaacuuu gu 2227421RNAHomo sapiens
274cuagacugaa gcuccuugag g 2127522RNAHomo sapiens 275uagguuaucc
guguugccuu cg 2227622RNAHomo sapiens 276aaucauacac gguugaccua uu
2227722RNAHomo sapiens 277acugcaguga aggcacuugu ag 2227822RNAHomo
sapiens 278accacugacc guugacugua cc 2227927RNAHomo sapiens
279auugaucauc gacacuucga acgcaau 2728022RNAHomo sapiens
280ucgugucuug uguugcagcc gg 2228121RNAHomo sapiens 281caucccuugc
augguggagg g 2128222RNAHomo sapiens 282uagguaguuu cauguuguug gg
2228322RNAHomo sapiens 283cucccacugc uucacuugac ua 2228423RNAHomo
sapiens 284ugugcaaauc caugcaaaac uga 2328523RNAHomo sapiens
285caaagugcuc auagugcagg uag 2328622RNAHomo sapiens 286uaaucucagc
uggcaacugu ga 2228722RNAHomo sapiens 287aaaucucugc aggcaaaugu ga
2228823RNAHomo sapiens 288uacugcauca ggaacugauu gga 2328922RNAHomo
sapiens 289aguucuucag uggcaagcuu ua 2229022RNAHomo sapiens
290accuggcaua caauguagau uu 2229122RNAHomo sapiens 291cucaguagcc
aguguagauc cu 2229221RNAHomo sapiens 292aucacauugc cagggauuuc c
2129321RNAHomo sapiens 293aucacauugc cagggauuac c 2129422RNAHomo
sapiens 294uggcucaguu cagcaggaac ag 2229522RNAHomo sapiens
295uucaaguaau ccaggauagg cu 2229621RNAHomo sapiens 296uucaaguaau
ucaggauagg u 2129721RNAHomo sapiens 297uucacagugg cuaaguuccg c
2129822RNAHomo sapiens 298cacuagauug ugagcuccug ga 2229921RNAHomo
sapiens 299agggcccccc cucaauccug u 2130022RNAHomo sapiens
300uaugugggau gguaaaccgc uu 2230122RNAHomo sapiens 301cugguuucac
augguggcuu ag 2230223RNAHomo sapiens 302cagugcaaua guauugucaa agc
2330322RNAHomo sapiens 303uguaaacauc cuacacucag cu 2230422RNAHomo
sapiens 304cuuucagucg gauguuuaca gc 2230522RNAHomo sapiens
305ugcuaugcca acauauugcc au 2230621RNAHomo sapiens 306cacauuacac
ggucgaccuc u 2130723RNAHomo sapiens 307cgcauccccu agggcauugg ugu
2330822RNAHomo sapiens 308cuggcccucu cugcccuucc gu 2230922RNAHomo
sapiens 309aacacaccug guuaaccucu uu 2231023RNAHomo sapiens
310gcaaagcaca cggccugcag aga 2331123RNAHomo sapiens 311ucaagagcaa
uaacgaaaaa ugu 2331221RNAHomo sapiens 312gaacggcuuc auacaggagu u
2131322RNAHomo sapiens 313uccagcauca gugauuuugu ug 2231423RNAHomo
sapiens 314ucccccaggu gugauucuga uuu 2331522RNAHomo sapiens
315aacacaccua uucaaggauu ca 2231624RNAHomo sapiens 316aauccuugga
accuaggugu gagu 2431722RNAHomo sapiens 317agaucgaccg uguuauauuc gc
2231822RNAHomo sapiens 318gccugcuggg guggaaccug gu 2231921RNAHomo
sapiens 319aucauagagg aaaauccacg u
2132021RNAHomo sapiens 320aacauagagg aaauuccacg u 2132122RNAHomo
sapiens 321agagguugcc cuuggugaau uc 2232221RNAHomo sapiens
322ugguagacua uggaacguag g 2132322RNAHomo sapiens 323uauacaaggg
caagcucucu gu 2232422RNAHomo sapiens 324gaaguuguuc gugguggauu cg
2232522RNAHomo sapiens 325gaauguugcu cggugaaccc cu 2232623RNAHomo
sapiens 326agguuacccg agcaacuuug cau 2332721RNAHomo sapiens
327aauauaacac agauggccug u 2132821RNAHomo sapiens 328uaguagaccg
uauagcguac g 2132922RNAHomo sapiens 329aucgggaaug ucguguccgc cc
2233022RNAHomo sapiens 330caggucgucu ugcagggcuu cu 2233123RNAHomo
sapiens 331ucuuggagua ggucauuggg ugg 2333222RNAHomo sapiens
332aucaugaugg gcuccucggu gu 2233321RNAHomo sapiens 333ucacuccucu
ccucccgucu u 2133422RNAHomo sapiens 334aagacgggag gaaagaaggg ag
2233522RNAHomo sapiens 335gucauacacg gcucuccucu cu 2233622RNAHomo
sapiens 336agaggcuggc cgugaugaau uc 2233722RNAHomo sapiens
337aaucauacag ggacauccag uu 2233822RNAHomo sapiens 338ugaaacauac
acgggaaacc uc 2233922RNAHomo sapiens 339aaacaaacau ggugcacuuc uu
2234023RNAHomo sapiens 340uaauccuugc uaccugggug aga 2334122RNAHomo
sapiens 341augcaccugg gcaaggauuc ug 2234222RNAHomo sapiens
342aaugcacccg ggcaaggauu cu 2234322RNAHomo sapiens 343aaugcaccug
ggcaaggauu ca 2234423RNAHomo sapiens 344uagcagcggg aacaguucug cag
2334521RNAHomo sapiens 345uaaggcaccc uucugaguag a 2134622RNAHomo
sapiens 346uacugcagac guggcaauca ug 2234718RNAHomo sapiens
347uucacaggga ggugucau 1834822RNAHomo sapiens 348ccucccacac
ccaaggcuug ca 2234922RNAHomo sapiens 349caugccuuga guguaggacc gu
2235022RNAHomo sapiens 350ggagaaauua uccuuggugu gu 2235123RNAHomo
sapiens 351ucggggauca ucaugucacg aga 2335222RNAHomo sapiens
352aaacauucgc ggugcacuuc uu 2235322RNAHomo sapiens 353uacgucaucg
uugucaucgu ca 2235425RNAHomo sapiens 354gcugggcagg gcuucugagc uccuu
2535522RNAHomo sapiens 355uaugucugcu gaccaucacc uu 2235625RNAHomo
sapiens 356ggcggaggga aguagguccg uuggu 2535722RNAHomo sapiens
357uacccauugc auaucggagu ug 2235820RNAHomo sapiens 358accaggaggc
ugaggccccu 2035923RNAHomo sapiens 359aaggagcuua caaucuagcu ggg
2336021RNAHomo sapiens 360gcaggaacuu gugagucucc u 2136122RNAHomo
sapiens 361cugcccuggc ccgagggacc ga 2236222RNAHomo sapiens
362ugcaacgaac cugagccacu ga 2236322RNAHomo sapiens 363cacccguaga
accgaccuug cg 2236422RNAHomo sapiens 364aacuggggcg ggaaggggga ag
2236522RNAHomo sapiens 365aagugauugg aggugggugg gg 2236622RNAHomo
sapiens 366agaagcugaa gggagagaga ca 2236722RNAHomo sapiens
367gugguuaucc cugcuguguu cg 2236822RNAHomo sapiens 368ugcagcuggu
ggagucuggg gg 2236922RNAHomo sapiens 369aaucugcagg gggagccugg gu
2237024RNAHomo sapiens 370acucccaugu cccuugggaa gguc 2437123RNAHomo
sapiens 371agcgagguug cccuuuguau auu 2337219RNAHomo sapiens
372agggcugggg acagagaug 1937319RNAHomo sapiens 373agugaagcau
uggacugua 1937418RNAHomo sapiens 374aucccacucc ugacacca
1837521RNAHomo sapiens 375cauccuagcc cuaagucugg c 2137628RNAHomo
sapiens 376cccaggcugg aguguagugg cgugaucu 2837722RNAHomo sapiens
377cgccugugaa uagucacugc ac 2237820RNAHomo sapiens 378gaaagcugag
cgugaacgug 2037920RNAHomo sapiens 379gaaucccacu ucugacacca
2038023RNAHomo sapiens 380guuccuguug gccgagugga gac 2338119RNAHomo
sapiens 381uaaaaggaac ucggcaaau 1938217RNAHomo sapiens
382ugcagaucuu gguggua 1738322RNAHomo sapiens 383uggggccucc
cacagcuguu uc 2238417RNAHomo sapiens 384ugguggucua gugguua
1738525RNAHomo sapiens 385uguccaaagu aaacgcccug acgca
2538619RNAHomo sapiens 386ugucccuucg uggucgcca 1938720RNAHomo
sapiens 387uucaugggga agcagauuug 2038823RNAHomo sapiens
388ugaggggcag agagcgagac uuu 2338924RNAHomo sapiens 389cauagcccgg
ucgcugguac auga 2439023RNAHomo sapiens 390gccgagacua gagucacauc cug
23
* * * * *