U.S. patent application number 12/935181 was published by the patent office on 2011-07-14 for aberrant mitochondrial DNA, associated fusion transcripts and hybridization probes therefor.
This patent application is currently assigned to MITOMICS INC. Invention is credited to Jennifer Creed, Gabriel Dakubo, Ryan Parr, Brian Reguly, Kerry Robinson.
Application Number | 20110172113 12/935181 |
Family ID | 41112880 |
Publication Date | 2011-07-14 |
United States Patent Application | 20110172113 |
Kind Code | A1 |
Parr; Ryan ; et al. | July 14, 2011 |
ABERRANT MITOCHONDRIAL DNA, ASSOCIATED FUSION TRANSCRIPTS AND
HYBRIDIZATION PROBES THEREFOR
Abstract
The present invention provides novel mitochondrial fusion
transcripts and the parent mutated mtDNA molecules that are useful
for predicting, diagnosing and/or monitoring cancer. Hybridization
probes complementary thereto for use in the methods of the
invention are also provided.
Inventors: |
Parr; Ryan; (Thunder Bay,
CA) ; Reguly; Brian; (Vancouver, CA) ; Dakubo;
Gabriel; (Thunder Bay, CA) ; Creed; Jennifer;
(Thunder Bay, CA) ; Robinson; Kerry; (Thunder Bay,
CA) |
Assignee: |
MITOMICS INC.
Thunder Bay
ON
|
Family ID: |
41112880 |
Appl. No.: |
12/935181 |
Filed: |
March 27, 2009 |
PCT Filed: |
March 27, 2009 |
PCT NO: |
PCT/CA2009/000351 |
371 Date: |
January 17, 2011 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
61040616 | Mar 28, 2008 | |
Current U.S.
Class: |
506/9 ; 435/6.11;
506/16; 506/18; 530/402; 536/23.4; 536/24.3 |
Current CPC
Class: |
A61P 35/00 20180101;
C07K 14/4748 20130101; C12Q 1/6886 20130101; C12Q 2600/158
20130101 |
Class at
Publication: |
506/9 ; 530/402;
536/23.4; 536/24.3; 435/6.11; 506/18; 506/16 |
International
Class: |
C12Q 1/68 20060101
C12Q001/68; C07K 14/00 20060101 C07K014/00; C12N 15/00 20060101
C12N015/00; C07H 21/00 20060101 C07H021/00; C40B 30/04 20060101
C40B030/04; C40B 40/10 20060101 C40B040/10; C40B 40/06 20060101
C40B040/06 |
Claims
1. An isolated mitochondrial fusion transcript associated with
cancer.
2. The mitochondrial fusion transcript of claim 1, wherein the
transcript comprises an insertion, translocation, deletion,
duplication, recombination, rearrangement or combination
thereof.
3. The mitochondrial fusion transcript of claim 2, wherein the
transcript comprises a deletion.
4. The mitochondrial fusion transcript of claim 3, wherein the
transcript comprises a sequence as set forth in any one of SEQ ID
NOs:18 to 33 or 50.
5. The mitochondrial fusion transcript of claim 3, wherein the
transcript comprises a sequence as set forth in any one of SEQ ID
NOs: 18-21, 23, 25-33 or 50.
6. The mitochondrial fusion transcript of claim 3, wherein the
transcript comprises the expressed RNA transcript of a deletion
sequence set out in Table 1.
7. A mitochondrial fusion protein corresponding to the fusion
transcript of claim 4 and having a sequence as set forth in any one
of SEQ ID NOs: 34 to 49 and 52.
8. An isolated mitochondrial DNA (mtDNA) encoding the fusion
transcript of claim 1.
9. The isolated mtDNA of claim 8 having a sequence as set forth in
any one of SEQ ID NOs: 2-17 or 51.
10. A hybridization probe having a nucleic acid sequence
complementary to at least a portion of the mitochondrial fusion
transcript according to claim 4.
11. A method of detecting a cancer in a mammal, the method
comprising assaying a tissue sample from the mammal for the
presence of at least one mitochondrial fusion transcript associated
with cancer by hybridizing the sample with at least one
hybridization probe having a nucleic acid sequence complementary to
at least a portion of the mitochondrial fusion transcript according
to claim 4.
12. A method of detecting a cancer in a mammal, the method
comprising assaying a tissue sample from the mammal for the
presence of at least one aberrant mtDNA associated with cancer by
hybridizing the sample with at least one hybridization probe having
a nucleic acid sequence complementary to at least a portion of the
mtDNA according to claim 7.
13. The method of claim 11, wherein the cancer is prostate cancer,
testicular cancer, ovarian cancer, breast cancer, colorectal
cancer, lung cancer, melanoma skin cancer or combinations
thereof.
14. The method of claim 13, wherein the assay comprises: a)
conducting a hybridization reaction using at least one of said
probes to allow said at least one probe to hybridize to a
complementary mitochondrial fusion transcript or mtDNA; b)
quantifying the amount of the at least one mitochondrial fusion
transcript or mtDNA in said sample by quantifying the amount of
said transcript or mtDNA hybridized to said at least one probe;
and, c) comparing the amount of the mitochondrial fusion transcript
or mtDNA in the sample to at least one known reference value.
15. The method of claim 14, wherein the assay is carried out using
diagnostic imaging technology.
16. The method of claim 15, wherein the diagnostic imaging
technology comprises high throughput microarray analysis.
17. The method of claim 14, wherein the assay is carried out using
branched DNA technology.
18. The method of claim 14, wherein the assay is carried out using
PCR.
19. A kit for conducting an assay for detecting the presence of a
cancer in a mammal, said kit comprising at least one hybridization
probe complementary to at least a portion of the fusion transcript
of claim 4.
20. A screening tool comprised of a microarray having 10's, 100's,
or 1000's of mitochondrial fusion transcripts according to claim 4
for identification of those associated with cancer.
21. A screening tool comprised of a microarray having 10's, 100's,
or 1000's of mitochondrial DNAs according to claim 9 for
identification of those associated with cancer.
22. A screening tool comprised of a multiplexed branched DNA assay
having 10's, 100's, or 1000's of mitochondrial fusion transcripts
according to claim 4 for identification of those associated with
cancer.
23. A screening tool comprised of a multiplexed branched DNA assay
having 10's, 100's, or 1000's of mitochondrial DNAs according to
claim 9 for identification of those associated with cancer.
24. A hybridization probe having a nucleic acid sequence
complementary to at least a portion of the mitochondrial fusion
transcript according to claim 5.
25. A hybridization probe having a nucleic acid sequence
complementary to at least a portion of a mitochondrial fusion
transcript according to claim 3, wherein the transcript comprises
the expressed RNA transcript of a deletion sequence set out in
Table 1.
26. A hybridization probe having a nucleic acid sequence
complementary to at least a portion of the mtDNA of claim 9.
27. A method of detecting a cancer in a mammal, the method
comprising assaying a tissue sample from the mammal for the
presence of at least one mitochondrial fusion transcript associated
with cancer by hybridizing the sample with at least one
hybridization probe having a nucleic acid sequence complementary to
at least a portion of the mitochondrial fusion transcript according
to claim 5.
28. A method of detecting a cancer in a mammal, the method
comprising assaying a tissue sample from the mammal for the
presence of at least one mitochondrial fusion transcript associated
with cancer by hybridizing the sample with at least one
hybridization probe having a nucleic acid sequence complementary to
at least a portion of the mitochondrial fusion transcript according
to claim 3, wherein the transcript comprises the expressed RNA transcript
of a deletion sequence set out in Table 1.
29. A method of detecting a cancer in a mammal, the method
comprising assaying a tissue sample from the mammal for the
presence of at least one aberrant mtDNA associated with cancer by
hybridizing the sample with at least one hybridization probe having
a nucleic acid sequence complementary to at least a portion of the
mtDNA according to claim 9.
30. The method of claim 27, wherein the cancer is prostate cancer,
testicular cancer, ovarian cancer, breast cancer, colorectal
cancer, lung cancer, melanoma skin cancer or combinations
thereof.
31. The method of claim 28, wherein the cancer is prostate cancer,
testicular cancer, ovarian cancer, breast cancer, colorectal
cancer, lung cancer, melanoma skin cancer or combinations
thereof.
32. The method of claim 12, wherein the cancer is prostate cancer,
testicular cancer, ovarian cancer, breast cancer, colorectal
cancer, lung cancer, melanoma skin cancer or combinations
thereof.
33. The method of claim 29, wherein the cancer is prostate cancer,
testicular cancer, ovarian cancer, breast cancer, colorectal
cancer, lung cancer, melanoma skin cancer or combinations
thereof.
34. A kit for conducting an assay for detecting the presence of a
cancer in a mammal, said kit comprising at least one hybridization
probe complementary to at least a portion of the fusion transcript
of claim 5.
35. A kit for conducting an assay for detecting the presence of a
cancer in a mammal, said kit comprising at least one hybridization
probe complementary to at least a portion of the fusion transcript
of claim 3, wherein the transcript comprises the expressed RNA
transcript of a deletion sequence set out in Table 1.
36. A kit for conducting an assay for detecting the presence of a
cancer in a mammal, said kit comprising at least one hybridization
probe complementary to at least a portion of the mtDNA of claim
9.
37. A screening tool comprised of a microarray having 10's, 100's,
or 1000's of mitochondrial fusion transcripts according to claim 5
for identification of those associated with cancer.
38. A screening tool comprised of a microarray having 10's, 100's,
or 1000's of mitochondrial fusion transcripts according to claim 3
for identification of those associated with cancer, wherein the
transcript comprises the expressed RNA transcript of a deletion
sequence set out in Table 1.
39. A screening tool comprised of a multiplexed branched DNA assay
having 10's, 100's, or 1000's of mitochondrial fusion transcripts
according to claim 5 for identification of those associated with
cancer.
40. A screening tool comprised of a multiplexed branched DNA assay
having 10's, 100's, or 1000's of mitochondrial fusion transcripts
according to claim 3 for identification of those associated with
cancer, wherein the transcript comprises the expressed RNA
transcript of a deletion sequence set out in Table 1.
Description
FIELD OF THE INVENTION
[0001] The present invention relates to the field of mitochondrial
genomics. In one aspect, the invention relates to the
identification and use of mitochondrial genome fusion transcripts
and probes that hybridize thereto.
BACKGROUND OF THE INVENTION
[0002] Mitochondrial Genome
[0003] The mitochondrial genome is a compact yet critical sequence
of nucleic acids. Mitochondrial DNA, or "mtDNA", comprises a small
genome of 16,569 nucleic acid base pairs (bp) (Anderson et al.,
1981; Andrews et al., 1999) in contrast to the immense nuclear
genome of 3.3 billion bp (haploid). Its genetic complement is
substantially smaller than that of its nuclear cell mate (0.0005%).
However, individual cells carry anywhere from 10.sup.3 to 10.sup.4
mitochondria depending on specific cellular functions (Singh and
Modica-Napolitano 2002). Communication or chemical signalling
routinely occurs between the nuclear and mitochondrial genomes
(Sherratt et al., 1997). Moreover, specific nuclear components are
responsible for the maintenance and integrity of mitochondrial
sequences (Croteau et al., 1999). All mtDNA genomes in a given
individual are identical due to the clonal expansion of
mitochondria within the ovum, once fertilization has occurred.
However, mutagenic events can induce sequence diversity reflected as
somatic mutations. These mutations may accumulate in different
tissues throughout the body in a condition known as
heteroplasmy.
[0004] Mitochondrial Proteome
[0005] About 3,000 genes are required to construct, operate
and maintain mitochondria, with only thirty-seven of these coded by
the mitochondrial genome, indicating heavy mitochondrial dependence
on nuclear loci. The mitochondrial genome codes for 24 RNA genes
(2 rRNAs and 22 tRNAs) that ensure correct
translation of the remaining 13 genes, which are vital to electron
transport (see FIG. 1). The mitochondrial genome is dependent on
seventy nuclear encoded proteins to accomplish the oxidation and
reduction reactions necessary for this vital function, in addition
to the thirteen polypeptides supplied by the mitochondrial genome.
Both nuclear and mitochondrial proteins form complexes spanning the
inner mitochondrial membrane and collectively generate 80-90% of
the chemical fuel adenosine triphosphate, or ATP, required for
cellular metabolism. In addition to energy production, mitochondria
play a central role in other metabolic pathways as well. A critical
function of the mitochondria is mediation of cell death, or
apoptosis (see Green and Kroemer, 2005). Essentially, there are
signal pathways which permeabilize the outer mitochondrial
membrane, or in addition, the inner mitochondrial membrane as well.
When particular mitochondrial proteins are released into the
cytosol, non-reversible cell death is set in motion. This process
highlights the multi-functional role that some mitochondrial
proteins have. These multi-tasking proteins suggest that there are
other mitochondrial proteins as well which may have alternate
functions.
[0006] Mitochondrial Fusion Transcriptome
[0007] The mitochondrial genome is unusual in that it is a
circular, intron-less DNA molecule. The genome is interspersed with
repeat motifs which flank specific lengths of sequences. Sequences
between these repeats are prone to deletion under circumstances
which are not well understood. Given the number of repeats in the
mitochondrial genome, there are many possible deletions. The best
known example is the 4977 bp "common deletion." This deletion has been
associated with several purported conditions and diseases and is
thought to increase in frequency with aging (Dai et al., 2004; Ro
et al., 2003; Barron et al., 2001; Lewis et al., 2000;
Muller-Hocker, 1998; Porteous et al., 1998) (FIG. 4). The current
thinking in the field of mitochondrial genomics is that
mitochondrial deletions are merely deleterious by-products of
damage to the mitochondrial genome by such agents as reactive
oxygen species and UVR (Krishnan et al. 2008, Nature Genetics).
Further, though it is recognized that high levels of mtDNA
deletions can have severe consequences on the cell's ability to
produce energy in the form of ATP as a result of missing gene
sequences necessary for cellular respiration, it is not anticipated
that these deleted mitochondrial molecules may be a component of
downstream pathways, have an intended functional role, and possibly
may be more aptly viewed as alternate natural forms of the
recognized genes of the mitochondria as has been anticipated by the
Applicant.
[0008] The sequence dynamics of mtDNA are important diagnostic
tools. Mutations in mtDNA are often preliminary indicators of
developing disease. For example, it has been demonstrated that
point mutations in the mitochondrial genome are characteristic of
tumour foci in the prostate. This trend also extends to normal
appearing tissue both adjacent to and distant from tumour tissue
(Parr et al. 2006). This suggests that mitochondrial mutations
occur early in the malignant transformation pathway.
[0009] For example, the frequency of a 3.4 kb mitochondrial
deletion has excellent utility in discriminating between benign and
malignant prostate tissues (Maki et al. 2008).
[0010] Mitochondrial fusion transcripts have been reported
previously in the literature, first in soybeans (Morgens et al.
1984) and then later in two patients with Kearns-Sayre Syndrome, a
rare neuromuscular disorder (Nakase et al 1990). Importantly, these
transcripts were not found to have (or investigated regarding)
association with any human cancers.
SUMMARY OF THE INVENTION
[0011] An object of the present invention is to provide aberrant
mitochondrial DNA, associated fusion transcripts and hybridization
probes therefor.
[0012] In accordance with an aspect of the invention, there is
provided an isolated mitochondrial fusion transcript associated
with cancer.
[0013] In accordance with an aspect of the invention, there is
provided a mitochondrial fusion protein corresponding to the above
fusion transcript, having a sequence as set forth in any one of SEQ
ID NOs: 34 to 49 and 52.
[0014] In accordance with another aspect of the invention, there is
provided an isolated mtDNA encoding a fusion transcript of the
invention.
[0015] In accordance with another aspect of the invention, there is
provided a hybridization probe having a nucleic acid sequence
complementary to at least a portion of a mitochondrial fusion
transcript or an mtDNA of the invention.
[0016] In accordance with another aspect of the invention, there is
provided a method of detecting a cancer in a mammal, the method
comprising assaying a tissue sample from the mammal for the
presence of at least one mitochondrial fusion transcript associated
with cancer by hybridizing the sample with at least one
hybridization probe having a nucleic acid sequence complementary to
at least a portion of a mitochondrial fusion transcript according
to the invention.
[0017] In accordance with another aspect of the invention, there is
provided a method of detecting a cancer in a mammal, the method
comprising assaying a tissue sample from the mammal for the
presence of at least one aberrant mtDNA associated with cancer by
hybridizing the sample with at least one hybridization probe having
a nucleic acid sequence complementary to at least a portion of an
mtDNA according to the invention.
[0018] In accordance with another aspect of the invention, there is
provided a kit for conducting an assay for detecting the presence
of a cancer in a mammal, said kit comprising at least one
hybridization probe complementary to at least a portion of a fusion
transcript or an mtDNA of the invention.
[0019] In accordance with another aspect of the invention, there is
provided a screening tool comprised of a microarray having 10's,
100's, or 1000's of mitochondrial fusion transcripts for
identification of those associated with cancer.
[0020] In accordance with another aspect of the invention, there is
provided a screening tool comprised of a microarray having 10's,
100's, or 1000's of mitochondrial DNAs corresponding to
mitochondrial fusion transcripts for identification of those
associated with cancer.
[0021] In accordance with another aspect of the invention, there is
provided a screening tool comprised of a multiplexed branched DNA
assay having 10's, 100's, or 1000's of mitochondrial fusion
transcripts for identification of those associated with cancer.
[0022] In accordance with another aspect of the invention, there is
provided a screening tool comprised of a multiplexed branched DNA
assay having 10's, 100's, or 1000's of mitochondrial DNAs
corresponding to mitochondrial fusion transcripts for
identification of those associated with cancer.
BRIEF DESCRIPTION OF THE DRAWINGS
[0023] The embodiments of the invention will now be described by
way of example only with reference to the appended drawings
wherein:
[0024] FIG. 1 is an illustration showing mitochondrial coding
genes.
[0025] FIG. 2 shows polyadenylated fusion transcripts in prostate
samples invoked by the loss of the 3.4 kb deletion.
[0026] FIG. 3 shows polyadenylated fusion transcripts in prostate
samples invoked by the loss of the 4977 bp common deletion.
[0027] FIG. 4 shows polyadenylated fusion transcripts in breast
samples invoked by the loss of the 3.4 kb segment from the
mtgenome.
[0028] FIGS. 5a and 5b show an example of a mitochondrial DNA
region before and after splicing of genes.
[0029] FIGS. 6a to 6g illustrate the results for transcripts 2, 3,
8, 9, 10, 11 and 12 of the invention in the identification of
colorectal cancer tumours.
[0030] FIGS. 7a to 7d illustrate the results for transcripts 6, 8,
10 and 20 of the invention in the identification of lung cancer
tumours.
[0031] FIGS. 8a to 8g illustrate the results for transcripts 6, 10,
11, 14, 15, 16 and 20 of the invention in the identification of
melanomas.
[0032] FIGS. 9a to 9h illustrate the results for transcripts 1, 2,
3, 6, 11, 12, 15 and 20 of the invention in the identification of
ovarian cancer.
[0033] FIGS. 10 to 18 illustrate the results for transcripts 2, 3,
4, 11, 12, 13, 15, 16 and 20 of the invention in the identification
of testicular cancer.
DETAILED DESCRIPTION OF THE INVENTION
[0034] The present invention provides novel mitochondrial fusion
transcripts and the parent mutated mtDNA molecules that are useful
for predicting, diagnosing and/or monitoring cancer. The invention
further provides hybridization probes for the detection of fusion
transcripts and associated mtDNA molecules and the use of such
probes.
[0035] Definitions
[0036] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as commonly understood by one of
ordinary skill in the art to which this invention belongs.
[0037] As used herein, "aberration" or "mutation" encompasses any
modification in the wild type mitochondrial DNA sequence that
results in a fusion transcript and includes, without limitation,
insertions, translocations, deletions, duplications,
recombinations, rearrangements or combinations thereof.
[0038] As defined herein, "biological sample" refers to a tissue or
bodily fluid containing cells from which a molecule of interest can
be obtained. For example, the biological sample can be derived from
tissue such as prostate, breast, colorectal, lung and skin, or from
blood, saliva, cerebral spinal fluid, sputa, urine, mucous,
synovial fluid, peritoneal fluid, amniotic fluid and the like. The
biological sample may be a surgical specimen or a biopsy specimen.
The biological sample can be used either directly as obtained from
the source or following a pre-treatment to modify the character of
the sample. Thus, the biological sample can be pre-treated prior to
use by, for example, preparing plasma or serum from blood,
disrupting cells, preparing liquids from solid materials, diluting
viscous fluids, filtering liquids, distilling liquids,
concentrating liquids, inactivating interfering components, adding
reagents, and the like.
[0039] A "continuous" transcript is a fusion transcript that keeps
the reading frame from the beginning to the end of both spliced
genes. An "end" transcript is a fusion transcript that results in a
premature termination codon before the original termination codon
of a second spliced gene.
[0040] As used herein, "mitochondrial DNA" or "mtDNA" is DNA
present in mitochondria.
[0041] As used herein, the expression "mitochondrial fusion
transcript" or "fusion transcript" refers to an RNA transcription
product produced as a result of the transcription of a mutated
mitochondrial DNA sequence wherein such mutations may comprise
mitochondrial deletions and other large-scale mitochondrial DNA
rearrangements.
[0042] Computer Analysis and Sequence Targeting
[0043] As discussed above, mitochondrial fusion transcripts have
been reported in soybeans (Morgens et al. 1984) and in humans
suffering from a rare neuromuscular disorder (Nakase et al 1990).
Fusion transcripts associated with human cancer have not, however,
been described.
[0044] Using the knowledge gained from mapping the large-scale
deletions of the human mitochondrial genome associated with cancer,
the observation of high frequencies of these deletions, and the
evidence in another organism and another disease type of
transcriptionally active mutated mtDNA molecules, Applicant
hypothesized that such deletions may have importance beyond the DNA
molecule and the damage and repair processes as it relates to
cancer. To test this hypothesis computer analysis of the
mitochondrial genome was conducted, specific for repeat elements,
which suggested many potential deletion sites. Following this
initial step identifying unique repeats in the mitochondrial
sequence having non-adjacent or non-tandem locations, a filter was
then applied to identify those repeats that upon initiating a
deletion event in the DNA molecule would then likely reclose or
religate to produce a fused DNA sequence having an open reading
frame (ORF). A subset of 18 molecules was then selected for
targeting to investigate whether: 1) they existed in the natural
biological state of humans and 2) they had relevance to malignancy.
Results from these investigations are described hereinafter.
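The two-step screen described above (locate non-adjacent repeats, then keep only those deletions whose re-joined sequence still reads as an open frame) can be sketched as follows. This is a minimal illustration on a short linear toy string, not Applicant's actual software. The function names, the fixed repeat length `k`, the choice to keep one repeat copy after the deletion, and the use of the vertebrate mitochondrial stop codons (TAA, TAG, AGA, AGG) are all assumptions made for the example.

```python
from collections import defaultdict

# Assumption: stop codons of the vertebrate mitochondrial genetic code.
STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def find_direct_repeats(seq, k):
    """Return sorted (i, j) start pairs (0-based, i < j) of identical k-mers
    that neither overlap nor directly abut each other."""
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    pairs = []
    for starts in positions.values():
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                if starts[b] - starts[a] > k:  # non-adjacent, non-tandem
                    pairs.append((starts[a], starts[b]))
    return sorted(pairs)


def delete_between(seq, i, j):
    """Remove seq[i:j]: the first repeat copy and everything up to the second
    copy, so that one copy of the repeat remains at the junction."""
    return seq[:i] + seq[j:]


def codons_before_stop(seq, start=0):
    """Count sense codons read from `start` before the first in-frame stop
    codon (or before the sequence runs out)."""
    count = 0
    for p in range(start, len(seq) - 2, 3):
        if seq[p:p + 3] in STOP_CODONS:
            break
        count += 1
    return count


def screen(seq, k, min_codons):
    """Keep repeat pairs whose simulated deletion leaves at least
    `min_codons` sense codons readable from position 0."""
    hits = []
    for i, j in find_direct_repeats(seq, k):
        fused = delete_between(seq, i, j)
        if codons_before_stop(fused) >= min_codons:
            hits.append((i, j, fused))
    return hits


# Hypothetical toy sequence with one 6-nt repeat (CCCGGG) at indices 3 and 15.
TOY = "ATGCCCGGGTTTTAACCCGGGAAATAG"
print(screen(TOY, k=6, min_codons=4))
```

A screen of the real genome would also need to treat the molecule as circular and to consider indirect repeats, both of which this sketch leaves out.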
[0045] Genomic Mutations
[0046] Mitochondrial DNA (mtDNA) dynamics are an important
diagnostic tool. Mutations in mtDNA are often preliminary
indicators of developing disease and behave as biomarkers
indicative of risk factors associated with disease onset. According
to the present invention, large-scale rearrangement mutations in
the mitochondrial genome result in the generation of fusion
transcripts associated with cancer. Thus, the use of mtDNA encoding
such transcripts and probes directed thereto for the detection,
diagnosis and monitoring of cancer is provided.
[0047] One of skill in the art will appreciate that the mtDNA
molecules for use in the methods of the present invention may be
derived through the isolation of naturally-occurring mutants or may
be based on the complementary sequence of any of the fusion
transcripts described herein. Exemplary mtDNA sequences and fusion
transcripts are disclosed in Applicant's U.S. priority application
No. 61/040,616, herein incorporated in its entirety by
reference.
[0048] Detection of Mutant Genomic Sequences
[0049] Mutant mtDNA sequences according to the present invention
may comprise any modification that results in the generation of a
fusion transcript. Non-limiting examples of such modifications
include insertions, translocations, deletions, duplications,
recombinations, rearrangements or combinations thereof. While the
modification or change can vary greatly in size from only a few
bases to several kilobases, preferably the modification results in
a substantive deletion or other large-scale genomic aberration.
[0050] Extraction of DNA to detect the presence of such mutations
may take place using art-recognized methods, followed by
amplification of all or a region of the mitochondrial genome, and
may include sequencing of the mitochondrial genome, as described in
Current Protocols in Molecular Biology. Alternatively, crude tissue
homogenates may be used as well as techniques not requiring
amplification of specific fragments of interest.
[0051] The step of detecting the mutations can be selected from any
technique as is known to those skilled in the art. For example,
analyzing mtDNA can comprise selection of targets by branching DNA,
sequencing the mtDNA, amplifying mtDNA by PCR, Southern, Northern,
Western, South-Western blot hybridizations, denaturing HPLC,
hybridization to microarrays, biochips or gene chips, molecular
marker analysis, biosensors, melting temperature profiling or a
combination of any of the above.
[0052] Any suitable means to sequence mitochondrial DNA may be
used. Preferably, mtDNA is amplified by PCR prior to sequencing.
The method of PCR is well known in the art and may be performed as
described in Mullis and Faloona, 1987, Methods Enzymol., 155: 335.
PCR products can be sequenced directly or cloned into a vector
which is then placed into a bacterial host. Examples of DNA
sequencing methods are found in Brumley, R. L. Jr. and Smith, L.
M., 1991, Rapid DNA sequencing by horizontal ultrathin gel
electrophoresis, Nucleic Acids Res. 19:4121-4126 and Luckey, J. A.,
et al, 1993, High speed DNA sequencing by capillary gel
electrophoresis, Methods Enzymol. 218: 154-172. The combined use of
PCR and sequencing of mtDNA is described in Hopgood, R., et al,
1992, Strategies for automated sequencing of human mtDNA directly
from PCR products, Biotechniques 13:82-92 and Tanaka, M. et al,
1996, Automated sequencing of mtDNA, Methods Enzymol.
264:407-421.
[0053] Methods of selecting appropriate sequences for preparing
various primers are also known in the art. For example, the primer
can be prepared using conventional solid-phase synthesis using
commercially available equipment, such as that available from
Applied Biosystems USA Inc. (Foster City, Calif.), DuPont,
(Wilmington, Del.), or Milligen (Bedford, Mass.).
[0054] According to an aspect of the invention, to determine
candidate genomic sequences, a junction point of a sequence
deletion is first identified. Sequence deletions are primarily
identified by direct and indirect repetitive elements which flank
the sequence to be deleted at the 5' and 3' end. The removal of a
section of the nucleotides from the genome followed by the ligation
of the genome results in the creation of a novel junction
point.
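As a rough illustration of how removing a section and re-ligating the remainder creates a novel junction, the sketch below deletes the nucleotides strictly between two 1-based positions of a toy string and reports the sequence around the new join. The helper names (`fuse`, `junction_window`) and the toy sequence are hypothetical. The 1-based, inclusive coordinate convention is an assumption chosen for the example.

```python
def fuse(genome, last_kept, first_kept):
    """Delete the nucleotides strictly between two 1-based positions and
    join the retained ends (positions 1..last_kept and first_kept..end)."""
    if not 1 <= last_kept < first_kept <= len(genome):
        raise ValueError("expected 1 <= last_kept < first_kept <= len(genome)")
    return genome[:last_kept] + genome[first_kept - 1:]


def junction_window(fused, last_kept, flank):
    """Return up to `flank` nucleotides on each side of the new junction."""
    return fused[max(0, last_kept - flank):last_kept + flank]


toy = "AAAACCCCGGGGTTTT"                 # hypothetical 16-nt "genome"
fused = fuse(toy, last_kept=4, first_kept=13)
print(fused)                             # retained ends joined together
print(junction_window(fused, 4, 2))      # 2 nt either side of the junction
```

A probe or primer that spans the window returned by `junction_window` matches a contiguous stretch only in the rearranged molecule, because in the intact sequence the two halves lie far apart.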
[0055] Upon identification of the junction point, the nucleotides
of the genes flanking the junction point are determined in order to
identify a spliced gene. Typically the spliced gene comprises the
initiation codon from the first gene and the termination codon of
the second gene, and may be expressed as a continuous transcript,
i.e. one that keeps the reading frame from the beginning to the end
of both spliced genes. It is also possible that alternate
initiation or termination codons contained within the gene
sequences may be used as is evidenced by SEQ ID No:2 and SEQ ID No:
17 disclosed herein. Some known mitochondrial deletions discovered
to have an open reading frame (ORF) when the rearranged sequences
are rejoined at the splice site are provided in Table 1.
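The distinction between a transcript that keeps the reading frame through to the second gene's original termination codon and one that terminates early can be checked mechanically. The following is a minimal sketch under stated assumptions: the fused coding sequence is supplied as a DNA string, the position of the second gene's original stop codon within that string is already known, and stop codons follow the vertebrate mitochondrial code. The function names and the "unclassified" label are hypothetical.

```python
# Assumption: stop codons of the vertebrate mitochondrial genetic code.
STOP_CODONS = {"TAA", "TAG", "AGA", "AGG"}


def first_stop(seq, start=0):
    """Return the 0-based index of the first in-frame stop codon at or after
    `start`, or None if no in-frame stop is found."""
    for p in range(start, len(seq) - 2, 3):
        if seq[p:p + 3] in STOP_CODONS:
            return p
    return None


def classify(fused, orf_start, original_stop):
    """Classify a fused sequence.

    `original_stop` is the 0-based index, within `fused`, of the first base
    of the second gene's original termination codon.
    """
    stop = first_stop(fused, orf_start)
    if stop is None or stop > original_stop:
        return "unclassified"      # frame never reaches a stop by that point
    if stop == original_stop:
        return "continuous"        # frame kept through both genes
    return "end"                   # premature termination codon


print(classify("ATGCCCGGGAAATAG", 0, 12))    # in-frame toy fusion
print(classify("ATGCCGTAAGAAATAG", 0, 13))   # frame-shifted toy fusion
```

Both toy strings are invented for illustration and do not correspond to any sequence in the listing.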
[0056] Exemplary mtDNA molecules for use in the methods of the
present invention, which have been verified to exist in the lab,
are provided below. These mtDNAs are based on modifications of the
known mitochondrial genome (SEQ ID NO: 1) and have been assigned a
fusion or "FUS" designation, wherein A:B represents the junction
point between the last mitochondrial nucleotide of the first
spliced gene and the first mitochondrial nucleotide of the second
spliced gene. The identification of the spliced genes is provided
in parentheses followed by the corresponding sequence identifier.
Where provided below, (AltMet) and (OrigMet) refer to alternate and
original translation start sites, respectively.
[0057] FUS 8469:13447 (AltMet) (ATP synthase F0 subunit 8 to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 2)
[0058] FUS 10744:14124 (NADH dehydrogenase subunit 4L (ND4L) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 3)
[0059] FUS 7974:15496 (Cytochrome c oxidase subunit II (COII) to Cytochrome b (Cytb)) (SEQ ID NO: 4)
[0060] FUS 7992:15730 (Cytochrome c oxidase subunit II (COII) to Cytochrome b (Cytb)) (SEQ ID NO: 5)
[0061] FUS 8210:15339 (Cytochrome c oxidase subunit II (COII) to Cytochrome b (Cytb)) (SEQ ID NO: 6)
[0062] FUS 8828:14896 (ATP synthase F0 subunit 6 (ATPase6) to Cytochrome b (Cytb)) (SEQ ID NO: 7)
[0063] FUS 10665:14856 (NADH dehydrogenase subunit 4L (ND4L) to Cytochrome b (Cytb)) (SEQ ID NO: 8)
[0064] FUS 6075:13799 (Cytochrome c oxidase subunit I (COI) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 9)
[0065] FUS 6325:13989 (Cytochrome c oxidase subunit I (COI) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 10)
[0066] FUS 7438:13476 (Cytochrome c oxidase subunit I (COI) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 11)
[0067] FUS 7775:13532 (Cytochrome c oxidase subunit II (COII) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 12)
[0068] FUS 8213:13991 (Cytochrome c oxidase subunit II (COII) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 13)
[0069] FUS 9191:12909 (ATP synthase F0 subunit 6 (ATPase6) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 14)
[0070] FUS 9574:12972 (Cytochrome c oxidase subunit III (COIII) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 15)
[0071] FUS 10367:12829 (NADH dehydrogenase subunit 3 (ND3) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 16)
[0072] FUS 8469:13447 (OrigMet) (ATP synthase F0 subunit 8 to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 17)
[0073] FUS 9144:13816 (ATP synthase F0 subunit 6 (ATPase6) to NADH dehydrogenase subunit 5 (ND5)) (SEQ ID NO: 51)
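By way of illustration only, the A:B notation can be read programmatically. The following Python sketch assumes that positions are 1-based coordinates in the reference genome, that A is the last nucleotide retained from the first gene and that B is the first nucleotide retained from the second gene, so that positions A+1 through B-1 are absent. The function names are hypothetical and form no part of the disclosure.

```python
import re
from typing import Tuple


def parse_fus(designation: str) -> Tuple[int, int]:
    """Split a designation such as 'FUS 8469:13447' into (A, B)."""
    match = re.fullmatch(r"FUS\s+(\d+):(\d+)", designation.strip())
    if match is None:
        raise ValueError(f"not a FUS designation: {designation!r}")
    a, b = int(match.group(1)), int(match.group(2))
    if a >= b:
        raise ValueError("expected A < B")
    return a, b


def deleted_length(a: int, b: int) -> int:
    """Number of nucleotides lost between A and B (assumption: A and B are kept)."""
    return b - a - 1


def fuse(reference: str, a: int, b: int) -> str:
    """Join reference[1..A] to reference[B..end], using 1-based positions."""
    return reference[:a] + reference[b - 1:]


a, b = parse_fus("FUS 8469:13447")
print(deleted_length(a, b))        # 4977
print(deleted_length(10744, 14124))  # 3379
```

Under these assumptions the designation FUS 8469:13447 corresponds to a loss of 4977 nucleotides and FUS 10744:14124 to a loss of 3379 nucleotides (approximately 3.4 kb).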
[0074] The present invention also provides the use of variants or
fragments of these sequences for predicting, diagnosing and/or
monitoring cancer.
[0075] "Variant", as used herein, refers to a nucleic acid
differing from a mtDNA sequence of the present invention, but
retaining essential properties thereof. Generally, variants are
overall closely similar, and, in many regions, identical to a
select mtDNA sequence. Specifically, the variants of the present
invention comprise at least one of the nucleotides of the junction
point of the spliced genes, and may further comprise one or more
nucleotides adjacent thereto. In one embodiment of the invention,
the variant sequence is at least 80%, 85%, 90%, 95%, 96%, 97%, 98%
or 99% identical to any one of the mtDNA sequences of the
invention, or the complementary strand thereto.
[0076] In the present invention, "fragment" refers to a short
nucleic acid sequence which is a portion of that contained in the
disclosed genomic sequences, or the complementary strand thereto.
This portion includes at least one of the nucleotides comprising
the junction point of the spliced genes, and may further comprise
one or more nucleotides adjacent thereto. The fragments of the
invention are preferably at least about 15 nt, and more preferably
at least about 20 nt, still more preferably at least about 30 nt,
and even more preferably, at least about 40 nt, at least about 50
nt, at least about 75 nt, or at least about 150 nt in length. A
fragment "at least 20 nt in length," for example, is intended to
include 20 or more contiguous bases of any one of the mtDNA
sequences listed above. In this context "about" includes the
particularly recited value, a value larger or smaller by several
(5, 4, 3, 2, or 1) nucleotides, at either terminus or at both
termini. These fragments have uses that include, but are not
limited to, as diagnostic probes and primers as discussed herein.
Of course, larger fragments (e.g., 50, 150, 500, 600, 2000
nucleotides) are also contemplated.
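As a non-limiting sketch, the two conditions placed on a fragment above (a minimum length, and inclusion of at least one nucleotide of the junction point) can be checked as follows. The sketch assumes 0-based, half-open coordinates within a fused sequence, with the junction lying between index junction-1 and index junction. The function name and the default of 15 nt are illustrative choices only.

```python
def is_qualifying_fragment(start: int, end: int, junction: int,
                           min_len: int = 15) -> bool:
    """Return True if fused[start:end] is long enough and touches the junction.

    junction is the 0-based index of the first nucleotide contributed by the
    second gene; index junction-1 is the last nucleotide of the first gene.
    """
    length = end - start
    junction_bases = (junction - 1, junction)
    touches_junction = any(start <= i < end for i in junction_bases)
    return length >= min_len and touches_junction


# A 20 nt window centred on a junction at index 100 qualifies.
print(is_qualifying_fragment(90, 110, 100))   # True
# A window lying entirely downstream of the junction does not.
print(is_qualifying_fragment(101, 130, 100))  # False
```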
[0077] Thus, in specific embodiments of the invention, the mtDNA
sequences are selected from the group consisting of:
[0078] SEQ ID NO: 2 (FUS 8469:13447; AltMet)
[0079] SEQ ID NO: 3 (FUS 10744:14124)
[0080] SEQ ID NO: 4 (FUS 7974:15496)
[0081] SEQ ID NO: 5 (FUS 7992:15730)
[0082] SEQ ID NO: 6 (FUS 8210:15339)
[0083] SEQ ID NO: 7 (FUS 8828:14896)
[0084] SEQ ID NO: 8 (FUS 10665:14856)
[0085] SEQ ID NO: 9 (FUS 6075:13799)
[0086] SEQ ID NO: 10 (FUS 6325:13989)
[0087] SEQ ID NO: 11 (FUS 7438:13476)
[0088] SEQ ID NO: 12 (FUS 7775:13532)
[0089] SEQ ID NO: 13 (FUS 8213:13991)
[0090] SEQ ID NO: 14 (FUS 9191:12909)
[0091] SEQ ID NO: 15 (FUS 9574:12972)
[0092] SEQ ID NO: 16 (FUS 10367:12829)
[0093] SEQ ID NO: 17 (FUS 8469:13447; OrigMet)
[0094] SEQ ID NO: 51 (FUS 9144:13816), and
fragments or variants thereof.
[0095] Probes
[0096] Another aspect of the invention is to provide a
hybridization probe capable of recognizing an aberrant mtDNA
sequence of the invention. As used herein, the term "probe" refers
to an oligonucleotide which forms a duplex structure with a
sequence in the target nucleic acid, due to complementarity of at
least one sequence in the probe with a sequence in the target
region. The probe may be labeled, according to methods known in the
art.
[0097] Once aberrant mtDNA associated with a particular disease is
identified, hybridization of mtDNA to, for example, an array of
oligonucleotides can be used to identify particular mutations;
however, any known method of hybridization may be used.
[0098] As with the primers of the present invention, probes may be
generated directly against exemplary mtDNA fusion molecules of the
invention, or to a fragment or variant thereof. For instance, the
sequences set forth in SEQ ID NOs: 2-17 and 51 and those disclosed
in Table 1 can be used to design primers or probes that will detect
a nucleic acid sequence comprising a fusion sequence of interest.
As would be understood by those of skill in the art, primers or
probes which hybridize to these nucleic acid molecules may do so
under highly stringent hybridization conditions or lower stringency
conditions, such conditions known to those skilled in the art and
found, for example, in Current Protocols in Molecular Biology (John
Wiley & Sons, New York (1989)), 6.3.1-6.3.6.
[0099] In specific embodiments of the invention, the probes of the
invention contain a sequence complementary to at least a portion of
the aberrant mtDNA comprising the junction point of the spliced
genes. This portion includes at least one of the nucleotides
involved in the junction point A:B, and may further comprise one or
more nucleotides adjacent thereto. In this regard, the present
invention encompasses any suitable targeting mechanism that will
select an mtDNA molecule using the nucleotides involved and/or
adjacent to the junction point A:B.
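One simple way to picture a probe complementary to the junction point is to take a window of the fused sequence spanning the junction and form its reverse complement. The Python sketch below does this under stated assumptions: the fused sequence is a plain DNA string, the junction is given as the 0-based index of the first nucleotide of the second gene, and the flank size is an arbitrary illustrative parameter. It is not the probe design procedure used in the examples.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def junction_probe(fused: str, junction: int, flank: int = 10) -> str:
    """Reverse complement of up to `flank` nt on either side of the junction."""
    if flank < 1:
        raise ValueError("flank must be at least 1")
    lo = max(0, junction - flank)
    hi = min(len(fused), junction + flank)
    return reverse_complement(fused[lo:hi])


# Toy sequence: five A's from the "first gene", five C's from the "second".
print(junction_probe("AAAAACCCCC", junction=5, flank=2))  # GGTT
```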
[0100] Various types of probes known in the art are contemplated by
the present invention. For example, the probe may be a
hybridization probe, the binding of which to a target nucleotide
sequence can be detected using a general DNA binding dye such as
ethidium bromide, SYBR.RTM. Green, SYBR.RTM. Gold and the like.
Alternatively, the probe can incorporate one or more detectable
labels. Detectable labels are molecules or moieties a property or
characteristic of which can be detected directly or indirectly and
are chosen such that the ability of the probe to hybridize with its
target sequence is not affected. Methods of labelling nucleic acid
sequences are well-known in the art (see, for example, Ausubel et
al., (1997 & updates) Current Protocols in Molecular Biology,
Wiley & Sons, New York).
[0101] Labels suitable for use with the probes of the present
invention include those that can be directly detected, such as
radioisotopes, fluorophores, chemiluminophores, enzymes, colloidal
particles, fluorescent microparticles, and the like. One skilled in
the art will understand that directly detectable labels may require
additional components, such as substrates, triggering reagents,
light, and the like to enable detection of the label. The present
invention also contemplates the use of labels that are detected
indirectly.
[0102] The probes of the invention are preferably at least about 15
nt, and more preferably at least about 20 nt, still more preferably
at least about 30 nt, and even more preferably, at least about 40
nt, at least about 50 nt, at least about 75 nt, or at least about
150 nt in length. A probe of "at least 20 nt in length," for
example, is intended to include 20 or more contiguous bases that
are complementary to an mtDNA sequence of the invention. Of course,
larger probes (e.g., 50, 150, 500, 600, 2000 nucleotides) may be
preferable.
[0103] The probes of the invention will also hybridize to nucleic
acid molecules in biological samples, thereby enabling the methods
of the invention. Accordingly, in one aspect of the invention,
there is provided a hybridization probe for use in the detection of
cancer, wherein the probe is complementary to at least a portion of
an aberrant mtDNA molecule. In another aspect the present invention
provides probes and a use of (or a method of using) such probes for
the detection of colorectal cancer, lung cancer, breast cancer,
ovarian cancer, testicular cancer, prostate cancer and/or melanoma
skin cancer.
[0104] Assays
[0105] Measuring the level of aberrant mtDNA in a biological sample
can determine the presence of one or more cancers in a subject. The
present invention, therefore, encompasses methods for predicting,
diagnosing or monitoring cancer, comprising obtaining one or more
biological samples, extracting mtDNA from the samples, and assaying
the samples for aberrant mtDNA by: quantifying the amount of one or
more aberrant mtDNA sequences in the sample and comparing the
quantity detected with a reference value. As would be understood by
those of skill in the art, the reference value is based on whether
the method seeks to predict, diagnose or monitor cancer.
Accordingly, the reference value may relate to mtDNA data collected
from one or more known non-cancerous biological samples, from one
or more known cancerous biological samples, and/or from one or more
biological samples taken over time.
[0106] In one aspect, the invention provides a method of detecting
cancer in a mammal, the method comprising assaying a tissue sample
from the mammal for the presence of an aberrant mitochondrial DNA
described above. The present invention also provides for methods
comprising assaying a tissue sample from the mammal by hybridizing
the sample with at least one hybridization probe. The probe may be
generated against a mutant mitochondrial DNA sequence of the
invention as described herein.
[0107] In another aspect, the invention provides a method as above,
wherein the assay comprises:
[0108] a) conducting a hybridization reaction using at least one of
the probes to allow the at least one probe to hybridize to a
complementary aberrant mitochondrial DNA sequence;
[0109] b) quantifying the amount of the at least one aberrant
mitochondrial DNA sequence in the sample by quantifying the amount
of the mitochondrial DNA hybridized to the at least one probe;
and,
[0110] c) comparing the amount of the mitochondrial DNA in the
sample to at least one known reference value.
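Steps b) and c) amount to a comparison of a measured quantity against a reference value. A minimal sketch follows, assuming that both quantities are expressed in the same arbitrary signal units and that a fold-change cutoff is used. The cutoff of 2.0 is a hypothetical placeholder and is not a value taught herein.

```python
from typing import Tuple


def compare_to_reference(sample_signal: float, reference_signal: float,
                         fold_threshold: float = 2.0) -> Tuple[float, bool]:
    """Return (fold change, whether the fold change meets the cutoff)."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    fold = sample_signal / reference_signal
    return fold, fold >= fold_threshold


fold, elevated = compare_to_reference(300.0, 100.0)
print(fold, elevated)  # 3.0 True
```

In practice the reference value and any cutoff would be chosen according to whether the method seeks to predict, diagnose or monitor cancer, as noted above.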
[0111] Also included in the present invention are methods for
predicting, diagnosing or monitoring cancer comprising diagnostic
imaging assays as described below. The diagnostic assays of the
invention can be readily adapted for high-throughput.
High-throughput assays provide the advantage of processing many
samples simultaneously and significantly decrease the time required
to screen a large number of samples. The present invention,
therefore, contemplates the use of the nucleotides of the present
invention in high-throughput screening or assays to detect and/or
quantitate target nucleotide sequences in a plurality of test
samples.
[0112] Fusion Transcripts
[0113] The present invention further provides the identification of
fusion transcripts and associated hybridization probes useful in
methods for predicting, diagnosing and/or monitoring cancer. One of
skill in the art will appreciate that such molecules may be derived
through the isolation of naturally-occurring transcripts or,
alternatively, by the recombinant expression of mtDNAs isolated
according to the methods of the invention. As discussed, such
mtDNAs typically comprise a spliced gene having the initiation
codon from the first gene and the termination codon of the second
gene. Accordingly, fusion transcripts derived therefrom comprise a
junction point associated with the spliced genes.
[0114] Detection of Fusion Transcripts
[0115] Naturally occurring fusion transcripts can be extracted from
a biological sample and identified according to any suitable method
known in the art, or according to the methods described in the
examples. In one embodiment of the invention,
stable polyadenylated fusion transcripts are identified using
Oligo(dT) primers that target transcripts with poly-A tails,
followed by RT-PCR using primer pairs designed against the target
transcript.
[0116] The following exemplary fusion transcripts were detected
using such methods and found useful in predicting, diagnosing
and/or monitoring cancer as indicated in the examples. Likewise,
fusion transcripts derived from the ORF sequences identified in
Table 1 may be useful in predicting, diagnosing and/or monitoring
cancer according to the assays and methods of the present
invention.
[0117] SEQ ID NO: 18 (Transcript 1; 8469:13447; AltMet)
[0118] SEQ ID NO: 19 (Transcript 2; 10744:14124)
[0119] SEQ ID NO: 20 (Transcript 3; 7974:15496)
[0120] SEQ ID NO: 21 (Transcript 4; 7992:15730)
[0121] SEQ ID NO: 22 (Transcript 5; 8210:15339)
[0122] SEQ ID NO: 23 (Transcript 6; 8828:14896)
[0123] SEQ ID NO: 24 (Transcript 7; 10665:14856)
[0124] SEQ ID NO: 25 (Transcript 8; 6075:13799)
[0125] SEQ ID NO: 26 (Transcript 9; 6325:13989)
[0126] SEQ ID NO: 27 (Transcript 10; 7438:13476)
[0127] SEQ ID NO: 28 (Transcript 11; 7775:13532)
[0128] SEQ ID NO: 29 (Transcript 12; 8213:13991)
[0129] SEQ ID NO: 30 (Transcript 14; 9191:12909)
[0130] SEQ ID NO: 31 (Transcript 15; 9574:12972)
[0131] SEQ ID NO: 32 (Transcript 16; 10367:12829)
[0132] SEQ ID NO: 33 (Transcript 20; 8469:13447; OrigMet)
[0133] SEQ ID NO: 50 (Transcript 13; 9144:13816)
[0134] Further, fusion transcripts of like character to those
described herein are contemplated for use in the field of clinical
oncology.
[0135] Fusion transcripts can also be produced by recombinant
techniques known in the art. Typically this involves transformation
(including transfection, transduction, or infection) of a suitable
host cell with an expression vector comprising an mtDNA sequence of
interest.
[0136] Variants or fragments of the fusion transcripts identified
herein are also provided. Such sequences may adhere to the size
limitations and percent identities described above with respect to
genomic variants and fragments, or as determined suitable by a
skilled technician.
[0137] In addition, putative protein sequences corresponding to
transcripts 1-16 and 20 are listed below. These sequences, which
encode hypothetical fusion proteins, are provided as a further
embodiment of the present invention.
[0138] SEQ ID NO: 34 (Transcript 1)
[0139] SEQ ID NO: 35 (Transcript 2)
[0140] SEQ ID NO: 36 (Transcript 3)
[0141] SEQ ID NO: 37 (Transcript 4)
[0142] SEQ ID NO: 38 (Transcript 5)
[0143] SEQ ID NO: 39 (Transcript 6)
[0144] SEQ ID NO: 40 (Transcript 7)
[0145] SEQ ID NO: 41 (Transcript 8)
[0146] SEQ ID NO: 42 (Transcript 9)
[0147] SEQ ID NO: 43 (Transcript 10)
[0148] SEQ ID NO: 44 (Transcript 11)
[0149] SEQ ID NO: 45 (Transcript 12)
[0150] SEQ ID NO: 46 (Transcript 14)
[0151] SEQ ID NO: 47 (Transcript 15)
[0152] SEQ ID NO: 48 (Transcript 16)
[0153] SEQ ID NO: 49 (Transcript 20)
[0154] SEQ ID NO: 52 (Transcript 13)
[0155] Probes
[0156] Once a fusion transcript has been characterized, primers or
probes can be developed to target the transcript in a biological
sample. Such primers and probes may be prepared using any known
method (as described above) or as set out in the examples provided
below. A probe may, for example, be generated for the fusion
transcript, and detection technologies, such as QuantiGene 2.0.TM.
by Panomics.TM., used to detect the presence of the transcript in a
sample. Primers and probes may be generated directly against
exemplary fusion transcripts of the invention, or to a fragment or
variant thereof. For instance, the sequences set forth in SEQ ID
NOs: 18-33 and 50 as well as those disclosed in Table 1 can be used
to design probes that will detect a nucleic acid sequence
comprising a fusion sequence of interest.
[0157] As would be understood by those skilled in the art, probes
designed to hybridize to the fusion transcripts of the invention
contain a sequence complementary to at least a portion of the
transcript expressing the junction point of the spliced genes. This
portion includes at least one of the nucleotides complementary to
the expressed junction point, and may further comprise one or more
complementary nucleotides adjacent thereto. In this regard, the
present invention encompasses any suitable targeting mechanism that
will select a fusion transcript that uses the nucleotides involved
and adjacent to the junction point of the spliced genes.
[0158] Various types of probes and methods of labelling known in
the art are contemplated for the preparation of transcript probes.
Such types and methods have been described above with respect to
the detection of genomic sequences. The transcript probes of the
invention are preferably at least about 15 nt, and more preferably
at least about 20 nt, still more preferably at least about 30 nt,
and even more preferably, at least about 40 nt, at least about 50
nt, at least about 75 nt, or at least about 150 nt in length. A
probe of "at least 20 nt in length," for example, is intended to
include 20 or more contiguous bases that are complementary to an
mtDNA sequence of the invention. Of course, larger probes (e.g.,
50, 150, 500, 600, 2000 nucleotides) may be preferable.
[0159] In one aspect, the invention provides a hybridization probe
for use in the detection of cancer, wherein the probe is
complementary to at least a portion of a mitochondrial fusion
transcript provided above.
[0160] In another aspect, the present invention provides probes and
a use of (or a method of using) such probes for the detection of
colorectal cancer, lung cancer, breast cancer, ovarian cancer,
testicular cancer, prostate cancer or melanoma skin cancer.
[0161] Assays
[0162] Measuring the level of mitochondrial fusion transcripts in a
biological sample can determine the presence of one or more cancers
in a subject. The present invention, therefore, provides methods
for predicting, diagnosing or monitoring cancer, comprising
obtaining one or more biological samples, extracting mitochondrial
RNA from the samples, and assaying the samples for fusion
transcripts by: quantifying the amount of one or more fusion
transcripts in the sample and comparing the quantity detected with
a reference value. As would be understood by those of skill in the
art, the reference value is based on whether the method seeks to
predict, diagnose or monitor cancer. Accordingly, the reference
value may relate to transcript data collected from one or more
known non-cancerous biological samples, from one or more known
cancerous biological samples, and/or from one or more biological
samples taken over time.
[0163] In one aspect, the invention provides a method of detecting
a cancer in a mammal, the method comprising assaying a tissue
sample from said mammal for the presence of at least one fusion
transcript of the invention by hybridizing said sample with at
least one hybridization probe having a nucleic acid sequence
complementary to at least a portion of the mitochondrial fusion
transcript.
[0164] In another aspect, the invention provides a method as above,
wherein the assay comprises:
[0165] a) conducting a hybridization reaction using at least one of
the above-noted probes to allow the at least one probe to hybridize
to a complementary mitochondrial fusion transcript;
[0166] b) quantifying the amount of the at least one mitochondrial
fusion transcript in the sample by quantifying the amount of the
transcript hybridized to the at least one probe; and,
[0167] c) comparing the amount of the mitochondrial fusion
transcript in the sample to at least one known reference value.
[0168] As discussed above, the diagnostic assays of the invention
may also comprise diagnostic methods and screening tools as
described herein and can be readily adapted for high-throughput.
The present invention, therefore, contemplates the use of the
fusion transcripts and associated probes of the present invention
in high-throughput screening or assays to detect and/or quantitate
target nucleotide sequences in a plurality of test samples.
[0169] Diagnostic Methods and Screening Tools
[0170] Methods and screening tools for diagnosing specific diseases
or identifying specific mitochondrial mutations are also herein
contemplated. Any known method of hybridization may be used to
carry out such methods including, without limitation, probe/primer
based technologies such as branched DNA and qPCR, both single-plex
and multi-plex. Array technology, which has oligonucleotide probes
matching the wild type or mutated region, and a control probe, may
also be used. Commercially available arrays such as microarrays or
gene chips are suitable. These arrays contain thousands of matched
and control pairs of probes on a slide or microchip, and are
capable of sequencing the entire genome very quickly. Review
articles describing the use of microarrays in genome and DNA
sequence analysis are available on-line.
[0171] Screening tools designed to identify targets which are
relevant to a given biological condition may include specific
arrangements of nucleic acids associated with a particular disease
or disorder. Thus, in accordance with one embodiment of the
invention, there is provided a screening tool comprised of a
microarray having 10's, 100's, or 1000's of mitochondrial fusion
transcripts for identification of those associated with one or more
cancers. In accordance with another embodiment, there is provided a
screening tool comprised of a microarray having 10's, 100's, or
1000's of mitochondrial DNAs corresponding to mitochondrial fusion
transcripts for identification of those associated with one or more
cancers. In a further embodiment, there is provided a screening
tool comprised of a multiplexed branched DNA assay having 10's,
100's, or 1000's of mitochondrial fusion transcripts for
identification of those associated with one or more cancers. In yet
another embodiment of the invention, there is provided a screening
tool comprised of a multiplexed branched DNA assay having 10's,
100's, or 1000's of mitochondrial DNAs corresponding to
mitochondrial fusion transcripts for identification of those
associated with one or more cancers.
[0172] Approaches useful in the field of clinical oncology are also
herein contemplated and may include such diagnostic imaging
techniques as Positron Emission Tomography (PET), contrast Magnetic
Resonance Imaging (MRI) or the like. These diagnostic methods are
well known to those of skill in the art and are useful in the
diagnosis and prognosis of cancer.
[0173] Diagnostic Monitoring
[0174] The methods of the present invention may further comprise
the step of recommending a monitoring regime or course of therapy
based on the outcome of one or more assays. This allows clinicians
to practice personalized medicine, e.g. cancer therapy, by
monitoring the progression of the patient's cancer (such as by
recognizing when an initial or subsequent mutation occurs) or
treatment (such as by recognizing when a mutation is
stabilized).
[0175] With knowledge of the boundaries of the sequence variation
in hand, the information can be used to diagnose a pre-cancerous
condition or existing cancer condition. Further, by quantitating
the amount of aberrant mtDNA in successive samples over time, the
progression of a cancer condition can be monitored. For example,
data provided by assaying the patient's tissues at one point in
time to detect a first set of mutations from wild-type could be
compared against data provided from a subsequent assay, to
determine if changes in the aberration have occurred.
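The comparison of successive assays described above can be sketched as a simple time-series calculation. The following Python example assumes that each assay yields a single numeric level of aberrant mtDNA in consistent units and that samples are listed in chronological order. The labels "rising", "falling" and "stable" and the tolerance parameter are illustrative only.

```python
from typing import List, Sequence


def successive_changes(levels: Sequence[float]) -> List[float]:
    """Difference between each assay result and the one before it."""
    return [later - earlier for earlier, later in zip(levels, levels[1:])]


def overall_trend(levels: Sequence[float], tolerance: float = 0.0) -> str:
    """Classify the change from the first to the last assay."""
    if len(levels) < 2:
        raise ValueError("at least two time points are required")
    delta = levels[-1] - levels[0]
    if delta > tolerance:
        return "rising"
    if delta < -tolerance:
        return "falling"
    return "stable"


print(successive_changes([10, 12, 15]))  # [2, 3]
print(overall_trend([10, 12, 15]))       # rising
```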
[0176] Where a mutation is found in an individual who has not yet
developed symptoms of cancer, the mutation may be indicative of a
genetic susceptibility to develop a cancer condition. A
determination of susceptibility to disease or diagnosis of its
presence can further be evaluated on a qualitative basis based on
information concerning the prevalence, if any, of the cancer
condition in the patient's family history and the presence of other
risk factors, such as exposure to environmental factors and whether
the patient's cells also carry a mutation of another sort.
[0177] Biological Sample
[0178] The present invention provides for diagnostic tests which
involve obtaining or collecting one or more biological samples. In
the context of the present invention, "biological sample" refers to
a tissue or bodily fluid containing cells from which mtDNA and
mtRNA can be obtained. For example, the biological sample can be
derived from tissue including, but not limited to, skin, lung,
breast, prostate, nervous, muscle, heart, stomach, colon, rectal
tissue and the like; or from blood, saliva, cerebral spinal fluid,
sputa, urine, mucous, synovial fluid, peritoneal fluid, amniotic
fluid and the like. The biological sample may be obtained from a
cancerous or non-cancerous tissue and may be, but is not limited
to, a surgical specimen or a biopsy specimen.
[0179] The biological sample can be used either directly as
obtained from the source or following a pre-treatment to modify the
character of the sample. Thus, the biological sample can be
pre-treated prior to use by, for example, preparing plasma or serum
from blood, disrupting cells, preparing liquids from solid
materials, diluting viscous fluids, filtering liquids, distilling
liquids, concentrating liquids, inactivating interfering
components, adding reagents, and the like.
[0180] One skilled in the art will understand that more than one
sample type may be assayed at a single time (i.e. for the detection
of more than one cancer). Furthermore, where a course of
collections is required, for example, for the monitoring of cancer
over time, a given sample may be diagnosed alone or together with
other samples taken throughout a test period. In this regard,
biological samples may be taken once only, or at regular intervals
such as biweekly, monthly, semi-annually or annually.
[0181] Kits
[0182] The present invention provides diagnostic/screening kits for
detecting cancer in a clinical environment. Such kits may include
one or more sampling means, in combination with one or more probes
according to the present invention.
[0183] The kits can optionally include reagents required to conduct
a diagnostic assay, such as buffers, salts, detection reagents, and
the like. Other components, such as buffers and solutions for the
isolation and/or treatment of a biological sample, may also be
included in the kit. One or more of the components of the kit may
be lyophilised and the kit may further comprise reagents suitable
for the reconstitution of the lyophilised components.
[0184] Where appropriate, the kit may also contain reaction
vessels, mixing vessels and other components that facilitate the
preparation of the test sample. The kit may also optionally include
instructions for use, which may be provided in paper form or in
computer-readable form, such as a disc, CD, DVD or the like.
[0185] In one embodiment of the invention there is provided a kit
for diagnosing cancer comprising sampling means and a hybridization
probe of the invention.
[0186] Various aspects of the invention will be described by
illustration using the following examples. The examples provided
herein serve only to illustrate certain specific embodiments of the
invention and are not intended to limit the scope of the invention
in any way.
EXAMPLES
Example 1
Detection of Mitochondrial Fusion Transcripts
[0187] The mitochondrial 4977 "common deletion" and a 3.4 kb
deletion previously identified by the present Applicant in PCT
application no. PCT/CA2007/001711 (the entire contents of which are
incorporated by reference) result in unique open reading frames
having active transcripts as identified by oligo-dT selection in
prostate tissue (FIGS. 2 and 3). Examination of breast tissue
samples also reveals the presence of a stable polyadenylated fusion
transcript resulting from the 3.4 kb deletion (FIG. 4).
[0188] Reverse Transcriptase-PCR Protocol for Deletion Transcript
Detection
[0189] RNA Isolation and cDNA Synthesis
[0190] Total RNA was isolated from snap frozen prostate and breast
tissue samples (both malignant and normal samples adjacent to
tumours) using the Aurum.TM. Total RNA Fatty and Fibrous Tissue kit
(Bio-Rad, Hercules, Calif.) following the manufacturer's
instructions. Since in this experiment, genomic DNA contamination
was to be avoided, a DNase I treatment step was included, using
methods as commonly known in the art. RNA quantity and quality were
determined with an ND-1000 spectrophotometer (NanoDrop.RTM.
technologies). From a starting material of about 100 g, total RNA
concentrations varied from 100-1000 ng/ul with a 260/280 ratio
between 1.89-2.10. RNA concentrations were adjusted to 100 ng/ul
and 2 ul of each template were used for first strand DNA synthesis
with SuperScript.TM. First-Strand Synthesis System for RT-PCR
(Invitrogen) following the manufacturer's instructions. In order to
identify stable polyadenylated fusion transcripts, Oligo(dT)
primers that target transcripts with poly-A tails were used.
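The adjustment of each RNA preparation to 100 ng/ul, and the resulting input of 200 ng per 2 ul of template, follow from the usual dilution relationship C1V1 = C2V2. A brief sketch is given below. The stock concentration of 500 ng/ul and the final volume of 20 ul are hypothetical values chosen for the arithmetic and are not reported in this example.

```python
from typing import Tuple


def dilution_volumes(stock_conc: float, target_conc: float,
                     final_volume: float) -> Tuple[float, float]:
    """Return (volume of stock, volume of diluent) from C1*V1 = C2*V2."""
    if stock_conc <= 0 or target_conc > stock_conc:
        raise ValueError("target must not exceed a positive stock concentration")
    stock_volume = target_conc * final_volume / stock_conc
    return stock_volume, final_volume - stock_volume


def rna_input_ng(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Mass of RNA delivered by a given volume of template."""
    return conc_ng_per_ul * volume_ul


print(dilution_volumes(500.0, 100.0, 20.0))  # (4.0, 16.0)
print(rna_input_ng(100.0, 2.0))              # 200.0
```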
[0191] PCR
[0192] Real time PCR was performed using 5 ul of each cDNA template
with the iQ.TM. SYBR.RTM. Green Supermix (Bio-Rad, Hercules,
Calif.) on DNA Engine Opticon.RTM. 2 Continuous Fluorescence
Detection System (Bio-Rad, Hercules, Calif.). The primer pairs
targeting the 4977 bp deletion are: 8416F
5'-CCTTACACTATTCCTCATCAC-3', 13637R 5'-TGACCTGTTAGGGTGAGAAG-3', and
those for the 3.4 kb deletion are: ND4LF
5'-TCGCTCACACCTCATATCCTC-3', ND5R 5'-TGTGATTAGGAGTAGGGTTAGG-3'. The
reaction cocktail included: 2.times. SYBR.RTM. Green Supermix (100
mM KCl, 40 mM Tris-HCl, pH 8.4, 0.4 mM of each dNTP [dATP, dCTP,
dGTP, and dTTP], iTaq.TM. DNA polymerase, 50 units/ml, 6 mM
MgCl.sub.2, SYBR.RTM. Green I, 20 nM fluorescein, and stabilizers),
250 nM each of primers, and ddH.sub.2O. PCR cycling parameters were
as follows: (1) 95.degree. C. for 2 min, (2) 95.degree. C. for 30
sec, (3) 55.degree. C. (for the 4977 bp deletion) and 63.degree. C.
(for the 3.4 kb deletion) for 30 sec, (4) 72.degree. C. for 45 sec,
(5) plate read, followed by 39 cycles of steps 3 to 5, and final
incubation at 4.degree. C. Apart from cycling threshold and melting
curve analysis, samples were run on agarose gels for specific
visualization of amplification products (see FIGS. 2 to 4).
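For orientation, basic properties of the four primers recited above can be tabulated with a few lines of Python. The sketch below computes primer length, GC fraction and a rule-of-thumb melting temperature of 2 degrees C. per A or T plus 4 degrees C. per G or C. That rule is a rough textbook approximation introduced here for illustration. It is not the basis on which the annealing temperatures of this example were selected.

```python
PRIMERS = {
    "8416F": "CCTTACACTATTCCTCATCAC",
    "13637R": "TGACCTGTTAGGGTGAGAAG",
    "ND4LF": "TCGCTCACACCTCATATCCTC",
    "ND5R": "TGTGATTAGGAGTAGGGTTAGG",
}


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def rule_of_thumb_tm(seq: str) -> int:
    """Approximate Tm: 2 per A/T plus 4 per G/C (rough estimate only)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), round(gc_fraction(seq), 2), rule_of_thumb_tm(seq))
```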
[0193] FIG. 2 is an agarose gel showing polyadenylated fusion
transcripts in prostate samples invoked by the loss of 3.4 kb from
the mitochondrial genome. Legend for FIG. 2: B-blank, Lanes 1-6
transcripts detected in cDNA; lanes 7-12 no reverse transcriptase
(RT) controls for samples in lanes 1-6.
[0194] FIG. 3 shows polyadenylated fusion transcripts in prostate
samples invoked by the loss of the 4977 bp common deletion. Legend
for FIG. 3: B-blank, Lanes 1-6 transcripts detected in cDNA; lanes
7-12 no RT controls for samples in lanes 1-6.
[0195] FIG. 4 shows polyadenylated fusion transcripts in breast
samples invoked by the loss of 3.4 kb from the mitochondrial genome. Legend for
FIG. 4: Lanes 2-8 transcripts from breast cDNAs; lane 9 negative
(water) control; lanes 10 and 11, negative, no RT, controls for
samples in lanes 2 and 3.
[0196] These results demonstrate the existence of stable
mitochondrial fusion transcripts.
Example 2
Identification and Targeting of Fusion Products
[0197] Various hybridization probes were designed to detect, and
further demonstrate the presence of novel transcripts resulting
from mutated mitochondrial genomes, such as the 3.4 kb deletion.
For this purpose, a single-plex branched DNA platform for
quantitative gene expression analysis (QuantiGene 2.0.TM.,
Panomics.TM.) was utilized. The specific deletions and sequences
listed in this example are based on their relative positions within
the entire mtDNA genome, which is recited in SEQ ID NO: 1. The
nucleic acid sequences of the four transcripts to which the probes
were designed in this example are identified herein as follows:
Transcript 1 (SEQ ID NO: 18), Transcript 2 (SEQ ID NO: 19),
Transcript 3 (SEQ ID NO: 20) and Transcript 4 (SEQ ID NO: 21).
[0198] An example of a continuous transcript from the 3.4 kb
mitochondrial genome deletion occurs with the genes ND4L (NADH
dehydrogenase subunit 4L) and ND5 (NADH dehydrogenase subunit 5). A
probe having a complementary sequence to SEQ ID NO: 19 was used to
detect transcript 2. The repetitive elements occur at positions
10745-10754 in ND4L and 14124-14133 in ND5.
[0199] The 3.4 kb deletion results in the removal of the 3' end of
ND4L, the full ND4 gene, tRNA histidine, tRNA serine2, tRNA
leucine2, and the majority of the 5' end of ND5 (see FIG. 5a),
resulting in a gene splice of ND4L and ND5 with a junction point of
10744(ND4L):14124(ND5) (FIG. 5b). SEQ ID NO: 3 is the complementary
DNA sequence to the RNA transcript (SEQ ID NO: 19) detected in the
manner described above.
[0200] Similarly, transcript 1 is a fusion transcript between
ATPase 8 and ND5 associated with positions 8469:13447 (SEQ ID NO:
18). Transcripts 3 and 4 (SEQ ID NO: 20 and SEQ ID NO: 21,
respectively) are fusion transcripts between COII and Cytb
associated with nucleotide positions 7974:15496 and 7992:15730
respectively. Table 3 provides a summary of the relationships
between the various sequences used in this example. Table 3
includes the detected fusion transcript and the DNA sequence
complementary to the fusion transcript detected.
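The junction sites listed above can be related to the size of each underlying deletion by simple arithmetic. The following is a minimal sketch only, assuming that the first number of a junction site is the last retained nucleotide upstream of the deletion and the second number is the first retained nucleotide downstream; the function names and the short test string are hypothetical and are not part of the described method.

```python
# Junction sites as listed in this example, read here as
# (last retained position, first retained position). This reading is an assumption.
JUNCTIONS = {
    "Transcript 1": (8469, 13447),
    "Transcript 2": (10744, 14124),
    "Transcript 3": (7974, 15496),
    "Transcript 4": (7992, 15730),
}


def deleted_length(left: int, right: int) -> int:
    """Number of bases removed between two 1-based breakpoints."""
    if right <= left:
        raise ValueError("right breakpoint must lie downstream of left breakpoint")
    return right - left - 1


def fusion_sequence(reference: str, left: int, right: int) -> str:
    """Join the retained flanks of a reference sequence (1-based positions)."""
    # reference[:left] keeps positions 1..left; reference[right - 1:] keeps right..end.
    return reference[:left] + reference[right - 1:]


deletion_sizes = {name: deleted_length(*site) for name, site in JUNCTIONS.items()}
# Under this reading, Transcript 1 corresponds to 4977 deleted bases and
# Transcript 2 to 3379 deleted bases (about 3.4 kb).
```

Under the stated assumption, the junction 10744:14124 is consistent with the 3.4 kb deletion discussed above.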
Example 3
Application to Prostate Cancer
[0201] Using the four fusion transcripts, i.e. transcripts 1 to 4,
discussed above, two prostate tissue samples from one patient were
analyzed to assess the quantitative difference of the novel
predicted fusion transcripts. The results of the experiment are
provided in Table 2 below, wherein "Homog 1" refers to the
homogenate of frozen prostate tumour tissue from a patient and
"Homog 2" refers to the homogenate of frozen normal prostate tissue
adjacent to the tumour of the patient. These samples were processed
according to the manufacturer's protocol (QuantiGene.RTM. Sample
Processing Kit for Fresh or Frozen Animal Tissues; and
QuantiGene.RTM. 2.0 Reagent System User Manual) starting with 25.8
mg of Homog 1 and 28.9 mg of Homog 2 (the assay setup is shown in
Tables 5a and 5b).
[0202] Clearly demonstrated is an increased presence of
mitochondrial fusion transcripts in prostate cancer tissue compared
to normal adjacent prostate tissue. The fusion transcript is
present in the normal tissue, although at much lower levels. The
relative luminescence units (RLU) generated by hybridization of a
probe to a target transcript are directly proportional to the
abundance of each transcript. Table 2 also indicates the
coefficients of variation, CV, expressed as a percentage, of the
readings taken for the samples. The CV is the standard
deviation divided by the average of the values. The significance of
such stably transcribed mitochondrial gene products in cancer
tissue has implications in disease evolution and progression.
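By way of illustration only, the coefficient of variation described above may be computed as follows. This sketch assumes the sample standard deviation is used (the text does not specify sample versus population), and the replicate readings shown are hypothetical values, not data from Table 2.

```python
from statistics import mean, stdev


def coefficient_of_variation(readings):
    """Return the CV of replicate readings, expressed as a percentage."""
    avg = mean(readings)
    if avg == 0:
        raise ValueError("CV is undefined when the average is zero")
    # Sample standard deviation (n - 1 denominator) is an assumption.
    return 100.0 * stdev(readings) / avg


# Hypothetical triplicate RLU readings for one sample.
example_rlu = [980.0, 1000.0, 1020.0]
cv_percent = coefficient_of_variation(example_rlu)
```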
Example 4
Application to Breast Cancer
[0203] Using the same protocol from Example 3 but focusing only on
Transcript 2, the novel fusion transcript associated with the 3.4
kb mtgenome deletion, analyses were conducted on two samples of
breast tumour tissue and two samples of tumour-free tissues
adjacent to those tumours, as well as three samples of prostate
tumour tissue, one sample comprising adjacent tumour-free tissue.
Results for this example are provided in Table 4. The prostate
tumour tissue sample having a corresponding normal tissue section
demonstrated a similar pattern to the prostate sample analyzed in
Example 3 in that the tumour tissue had approximately twice the
amount of the fusion transcript as did the normal adjacent
tissue. The breast tumour samples demonstrated a marked increase in
the fusion transcript levels when compared to the adjacent
non-tumour tissues. A 1:100 dilution of the homogenate was used for
this analysis as it performed most reproducibly in the experiment
cited in Example 3.
[0204] Thus, the above discussed results illustrate the application
of the transcripts of the invention in the detection of tumours of
both prostate and breast tissue.
Example 5
Application to Colorectal Cancer
[0205] This study sought to determine the effectiveness of several
transcripts of the invention in detecting colorectal cancer. A
total of 19 samples were prepared comprising nine control (benign)
tissue samples (samples 1 to 9) and ten tumour (malignant) tissue
samples (samples 10 to 19). The samples were homogenized according
to the manufacturer's recommendations (Quantigene.RTM. Sample
Processing Kit for Fresh or Frozen Animal Tissues; and Quantigene
2.0 Reagent System User Manual). Seven target transcripts and one
housekeeper transcript were prepared in the manner as outlined
above in previous examples. The characteristics of the transcripts
are summarized as follows:
TABLE-US-00001 TABLE 7 Characteristics of Breast Cancer Transcripts
Transcript ID | Junction Site | Gene Junction
2 | 10744:14124 | ND4L:ND5
3 | 7974:15496 | COII:Cytb
10 | 7438:13476 | COI:ND5
11 | 7775:13532 | COII:ND5
12 | 8213:13991 | COII:ND5
Peptidylprolyl isomerase B (PPIB) ("housekeeper") | N/A | N/A
[0206] It is noted that transcripts 2 and 3 are the same as those
discussed above with respect to Examples 3 and 4.
[0207] Homogenates were prepared using approximately 25 mg of
tissue from OCT blocks and diluted 1:1 for transcripts 2 and 4, and
1:8 for transcripts 10 and 11. The quantity of the transcripts was
measured in Relative Luminescence Units (RLU) on a Glomax.TM. Multi
Detection System (Promega). All samples were assayed in triplicate
for each transcript. Background measurements (no template) were
done in triplicate as well. The analysis accounted for background
by subtracting the lower limit from the RLU values for the samples.
Input RNA was accounted for by using the formula log.sub.2 a
RLU-log.sub.2 h RLU where a is the target fusion transcript and h
is the housekeeper transcript.
[0208] The analysis of the data comprised the following steps:
[0209] a) Establish CV's (coefficients of variation) for triplicate
assays; acceptable if .ltoreq.15%.
[0210] b) Establish average RLU value for triplicate assays of
target fusion transcript(a) and housekeeper transcript (h).
[0211] c) Establish lower limit from triplicate value of background
RLU (l).
[0212] d) Subtract lower limit (l) from (a).
[0213] e) Calculate log.sub.2 a RLU-log.sub.2 h RLU.
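The analysis steps a) to e) above may be sketched as follows. This is an illustrative outline only: it assumes that the lower limit (l) is the average of the background triplicate, that the sample standard deviation is used for the CV, and that the CV check is applied to both the target and the housekeeper replicates; the function names and RLU values are hypothetical.

```python
import math
from statistics import mean, stdev

CV_LIMIT_PERCENT = 15.0  # acceptance limit from step a)


def cv_percent(values):
    """Coefficient of variation of replicate values, as a percentage."""
    return 100.0 * stdev(values) / mean(values)


def normalized_signal(target_rlu, housekeeper_rlu, background_rlu):
    """Return log2(a) - log2(h) after background subtraction of the target."""
    # Step a) reject replicate sets whose CV exceeds the limit.
    for label, reps in (("target", target_rlu), ("housekeeper", housekeeper_rlu)):
        if cv_percent(reps) > CV_LIMIT_PERCENT:
            raise ValueError(f"{label} replicates exceed the CV limit")
    # Step b) average the triplicate readings.
    a = mean(target_rlu)
    h = mean(housekeeper_rlu)
    # Step c) lower limit from background; using the mean is an assumption.
    lower_limit = mean(background_rlu)
    # Step d) subtract the lower limit from the target signal.
    a_corrected = a - lower_limit
    if a_corrected <= 0 or h <= 0:
        raise ValueError("signal at or below background; log2 is undefined")
    # Step e) normalize to the housekeeper on a log2 scale.
    return math.log2(a_corrected) - math.log2(h)


# Hypothetical triplicate readings.
score = normalized_signal(
    target_rlu=[410.0, 400.0, 390.0],
    housekeeper_rlu=[5120.0, 5000.0, 5240.0],
    background_rlu=[80.0, 80.0, 80.0],
)
```

With these hypothetical readings the corrected target signal is 320 RLU against a housekeeper average of 5120 RLU, giving a normalized value of -4.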
[0214] Summary of Results:
[0215] The results of the above analysis are illustrated in FIGS.
6a to 6g, which comprise plots of the log.sub.2 a RLU-log.sub.2 h
RLU against sample number. Also illustrated are the respective ROC
(Receiver Operating Characteristic) curves determined from the
results for each transcript.
[0216] Transcript 2: There exists a statistically significant
difference between the means (p<0.10) of the normal and
malignant groups (p>0.09). Using a cutoff value of 3.6129 as
demonstrated by the ROC curve results in a sensitivity of 60% and
specificity of 89% and the area under the curve is 0.73 indicating
fair test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
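The relationship between a chosen cutoff value and the resulting sensitivity and specificity may be illustrated with the following sketch. It assumes that values at or above the cutoff are called malignant; the direction of the comparison, the function name, and the scores shown are hypothetical and are not taken from the results of this example.

```python
def sensitivity_specificity(benign, malignant, cutoff, malignant_is_higher=True):
    """Return (sensitivity, specificity) for a single cutoff value."""
    if malignant_is_higher:
        true_pos = sum(v >= cutoff for v in malignant)
        true_neg = sum(v < cutoff for v in benign)
    else:
        true_pos = sum(v <= cutoff for v in malignant)
        true_neg = sum(v > cutoff for v in benign)
    return true_pos / len(malignant), true_neg / len(benign)


# Hypothetical normalized scores (log2 a RLU - log2 h RLU).
benign_scores = [1.0, 1.5, 2.0, 3.5]
malignant_scores = [2.5, 3.0, 4.0, 4.5, 5.0]

lower_cutoff = sensitivity_specificity(benign_scores, malignant_scores, 2.75)
higher_cutoff = sensitivity_specificity(benign_scores, malignant_scores, 3.75)
```

In this hypothetical data set, raising the cutoff from 2.75 to 3.75 increases specificity from 75% to 100% while reducing sensitivity from 80% to 60%, which is the trade-off referred to above.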
[0217] Transcript 3: There exists a statistically significant
difference between the means (p<0.05) of the normal and
malignant groups (p=0.03). Using a cutoff value of 4.0813 as
demonstrated by the ROC curve results in a sensitivity of 60% and
specificity of 78% and the area under the curve is 0.79 indicating
fair to good test accuracy. The threshold value chosen may be
adjusted to increase either the specificity or sensitivity of the
test for a particular application.
[0218] Transcript 8: There exists a statistically significant
difference between the means (p<0.1) of the normal and malignant
groups (p=0.06). Using a cutoff value of -6.0975 as demonstrated by
the ROC curve results in a sensitivity of 60% and specificity of
89% and the area under the curve is 0.76 indicating fair test
accuracy. The threshold value chosen may be adjusted to increase
either the specificity or sensitivity of the test for a particular
application.
[0219] Transcript 9: There exists a statistically significant
difference between the means (p<0.1) of the normal and malignant
groups (p=0.06). Using a cutoff value of -7.5555 as demonstrated by
the ROC curve results in a sensitivity of 60% and specificity of
89% and the area under the curve is 0.76 indicating fair to good
test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0220] Transcript 10: There is a statistically significant
difference between the means (p.ltoreq.0.01) of the normal and
malignant groups (p=0.01). Using a cutoff value of -3.8272 as
demonstrated by the ROC curve results in a sensitivity of 90% and
specificity of 67% and the area under the curve is 0.84, indicating
good test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0221] Transcript 11: There exists a statistically significant
difference between the means (p<0.1) of the normal and malignant
groups (p=0.06). Using a cutoff value of 3.1753 as demonstrated by
the ROC curve results in a sensitivity of 70% and specificity of
78% and the area under the curve is 0.76 indicating fair to good
test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0222] Transcript 12: There exists a statistically significant
difference between the means (p<0.1) of the normal and malignant
groups (p=0.06). Using a cut-off value of 3.2626 as demonstrated by
the ROC curve results in a sensitivity of 70% and specificity of
78% and the area under the curve is 0.76 indicating fair to good
test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0223] Conclusions:
[0224] The above results illustrate the utility of transcripts 2,
3, 8, 9, 10, 11, and 12 in the detection of colorectal cancer and
in distinguishing malignant from normal colorectal tissue. As
indicated above, transcripts 2 and 3 were also found to have
utility in the detection of prostate cancer. Transcript 2 was also
found to have utility in the detection of breast cancer. Transcript
11 was also found to have utility in the detection of melanoma skin
cancer. Transcript 10 was also found to have utility in the
detection of lung cancer and melanoma. Transcript 8 was also found
to have utility in the detection of lung cancer. Any of the 7
transcripts listed may be used individually or in combination as a
tool for the detection or characterization of colorectal cancer in
a clinical setting.
Example 6
Application to Lung Cancer
[0225] This study sought to determine the effectiveness of several
transcripts of the invention in the detection of lung cancer. As in
Example 5, nine control (benign) tissue samples (samples 1 to 9)
and ten tumour (malignant) tissue samples (samples 10 to 19) were
homogenized according to the manufacturer's recommendations
(Quantigene.RTM. Sample Processing Kit for Fresh or Frozen Animal
Tissues; and Quantigene 2.0 Reagent System User Manual).
Homogenates were diluted 1:8 and the quantity of 4 target
transcripts and 1 housekeeper transcript was measured in Relative
Luminescence Units (RLU) on a Glomax.TM. Multi Detection System
(Promega). All samples were assayed in triplicate for each
transcript. Background measurements (no template) were done in
triplicate as well.
[0226] The following transcripts were prepared for this
example:
TABLE-US-00002 TABLE 8 Characteristics of Lung Cancer Transcripts
Transcript ID | Junction Site | Gene Junction
6 | 8828:14896 | ATPase6:Cytb
8 | 6075:13799 | COI:ND5
10 | 7438:13476 | COI:ND5
20 | 8469:13447 | ATPase8:ND5
Peptidylprolyl isomerase B (PPIB) ("housekeeper") | N/A | N/A
[0227] The tissue samples used in this example had the following
characteristics:
TABLE-US-00003 TABLE 9 Characteristics of Lung Cancer Samples
Sample | Malignant | Comments (source of tissue)
1 | NO | interstitial lung disease
2 | NO | emphysema
3 | NO | aneurysm
4 | NO | bronchopneumonia, COPD
5 | NO | malignant neoplasm in liver, origin unknown, calcified granulomas in lung
6 | NO | 12 hours post mortem, mild emphysema
7 | NO | 12 hours post mortem, large B cell lymphoma, pulmonary edema, pneumonia
8 | NO | pneumonia, edema, alveolar damage
9 | NO | congestion and edema
10 | YES | adenocarcinoma, non-small cell
11 | YES | small cell
12 | YES | squamous cell carcinoma, NSC, emphysema
13 | YES | adenocarcinoma, lung cancer, nsc, metastatic
14 | YES | squamous cell carcinoma, non-small cell
15 | YES | mixed squamous and adenocarcinoma
16 | YES | non-small cell carcinoma, squamous
17 | YES | small cell carcinoma
18 | YES | adenocarcinoma, lung cancer, nsc
19 | YES | adenocarcinoma, lung cancer, nsc, metastatic
[0228] The analysis of data was performed according to the method
described in Example 5. The results are illustrated in FIGS. 7a,
7b, 7c and 7d.
[0229] Summary of Results:
[0230] Transcript 6: There exists a statistically significant
difference between the means (p<0.1) of the normal (benign) and
malignant groups (p=0.06). Using a cutoff value of -6.5691 as
demonstrated by the ROC curve results in a sensitivity of 80% and
specificity of 71% and the area under the curve is 0.77, indicating
fair test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0231] Transcript 8: The difference between the means of the normal
and malignant groups is statistically significant, p<0.05
(p=0.02). Using a cutoff value of -9.6166 as demonstrated by the
ROC curve results in a sensitivity of 90% and specificity of 86%
and the area under the curve is 0.86 indicating good test accuracy.
The threshold value chosen may be adjusted to increase either the
specificity or sensitivity of the test for a particular
application.
[0232] Transcript 10: The difference between the means of the
normal and malignant groups is statistically significant,
p.ltoreq.0.01 (p=0.01). Using a cutoff value of -10.6717 as
demonstrated by the ROC curve results in a sensitivity of 90% and
specificity of 86% and the area under the curve is 0.89 indicating
good test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0233] Transcript 20: The difference between the means of the
normal and malignant groups is statistically significant,
p.ltoreq.0.1 (p=0.1). Using a cutoff value of 2.5071 as
demonstrated by the ROC curve results in a sensitivity of 70% and
specificity of 71% and the area under the curve is 0.74 indicating
fair test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0234] Conclusions:
[0235] The results from example 6 illustrate the utility of
transcripts 6, 8, 10, and 20 of the invention in the detection of
lung cancer tumours and the distinction between malignant and
normal lung tissues. Any of these four transcripts may be used for
the detection or characterization of lung cancer in a clinical
setting.
Example 7
Application to Melanoma
[0236] This study sought to determine the effectiveness of several
transcripts of the invention in the detection of melanomas. In this
study a total of 14 samples were used, comprising five control
(benign) tissue samples and nine malignant tissue samples. All
samples were formalin fixed, paraffin embedded (FFPE). The FFPE
tissue samples were sectioned into tubes and homogenized according
to the manufacturer's recommendations (Quantigene.RTM. 2.0 Sample
Processing Kit for FFPE Samples; and Quantigene 2.0 Reagent System
User Manual) such that each sample approximated 20 microns prior to
homogenization. Homogenates were diluted 1:4 and the quantity of 7
target transcripts and 1 housekeeper transcript was measured in
Relative Luminescence Units (RLU) on a Glomax.TM. Multi Detection
System (Promega). All samples were assayed in triplicate for each
transcript. Background measurements (no template) were done in
triplicate as well.
[0237] The 14 tissue samples used in this example had the following
characteristics:
TABLE-US-00004 TABLE 10 Characteristics of Melanoma Cancer Samples
Sample | Malignant | Comments (source of tissue)
1 | NO | breast reduction tissue (skin)
2 | NO | breast reduction tissue (skin)
3 | NO | breast reduction tissue (skin)
4 | NO | breast reduction tissue (skin)
5 | NO | breast reduction tissue (skin)
6 | YES | lentigo maligna, (melanoma in situ) invasive melanoma not present
7 | YES | invasive malignant melanoma
8 | YES | nodular melanoma, pT3b, associated features of lentigo maligna
9 | YES | residual superficial spreading invasive malignant melanoma, Clark's level II
10 | YES | superficial spreading malignant melanoma, Clark's Level II
11 | YES | nodular malignant melanoma, Clark's level IV
12 | YES | superficial spreading malignant melanoma in situ, no evidence of invasion
13 | YES | superficial spreading malignant melanoma, Clark's level II, focally present vertical phase
14 | YES | superficial spreading malignant melanoma in situ, Clark's level I
[0238] The following transcripts were prepared for this
example:
TABLE-US-00005 TABLE 11 Characteristics of Melanoma Cancer Transcripts
Transcript ID | Junction Site | Gene Junction
6 | 8828:14896 | ATPase6:Cytb
10 | 7438:13476 | COI:ND5
11 | 7775:13532 | COII:ND5
14 | 9191:12909 | ATPase6:ND5
15 | 9574:12972 | COIII:ND5
16 | 10367:12829 | ND3:ND5
20 | 8469:13447 | ATPase8:ND5
Peptidylprolyl isomerase B (PPIB) ("housekeeper") | N/A | N/A
[0239] As indicated, transcripts 10 and 11 were also used in
Example 5. The analysis of data was performed according to the
method described in Example 5. The results are illustrated in FIGS.
8a-8g.
[0240] Summary of Results:
[0241] Transcript 6: There exists a statistically significant
difference between the means (p.ltoreq.0.01) of the normal and
malignant groups (p=0.01). Further, using a cutoff value of -5.9531
as demonstrated by the ROC curve results in a sensitivity of 89%
and specificity of 80% and the area under the curve is 0.96,
indicating very good test accuracy. The threshold value chosen may
be adjusted to increase either the specificity or sensitivity of
the test for a particular application.
[0242] Transcript 10: There exists a statistically significant
difference between the means (p.ltoreq.0.05) of the normal and
malignant groups (p=0.05). Using a cutoff value of -4.7572 as
demonstrated by the ROC curve results in a sensitivity of 89% and
specificity of 40% and the area under the curve is 0.82, indicating
good test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0243] Transcript 11: There exists a statistically significant
difference between the means (p<0.05) of the normal and
malignant groups (p=0.02). Further, using a cutoff value of 1.6762
as demonstrated by the ROC curve results in a sensitivity of 78%
and specificity of 100% and the area under the curve is 0.89,
indicating good test accuracy. The threshold value chosen may be
adjusted to increase either the specificity or sensitivity of the
test for a particular application.
[0244] Transcript 14: There exists a statistically significant
difference between the means (p.ltoreq.0.05) of the normal and
malignant groups (p=0.05). Further, using a cutoff value of -4.9118
as demonstrated by the ROC curve results in a sensitivity of 89%
and specificity of 60% and the area under the curve is 0.82,
indicating good test accuracy. The threshold value chosen may be
adjusted to increase either the specificity or sensitivity of the
test for a particular application.
[0245] Transcript 15: There exists a statistically significant
difference between the means (p<0.1) of the normal and malignant
groups (p=0.07). Using a cutoff value of -7.3107 as demonstrated by
the ROC curve results in a sensitivity of 100% and specificity of
67% and the area under the curve is 0.80, indicating good test
accuracy. The threshold value chosen may be adjusted to increase
either the specificity or sensitivity of the test for a particular
application.
[0246] Transcript 16: There exists a statistically significant
difference between the means (p<0.05) of the normal and
malignant groups (p=0.03). Further, using a cutoff value of
-10.5963 as demonstrated by the ROC curve results in a sensitivity
of 89% and specificity of 80% and the area under the curve is
0.878, indicating good test accuracy. The threshold value chosen
may be adjusted to increase either the specificity or sensitivity
of the test for a particular application.
[0247] Transcript 20: There exists a statistically significant
difference between the means (p<0.05) of the normal and
malignant groups (p=0.04). Further, using a cutoff value of -8.3543
as demonstrated by the ROC curve results in a sensitivity of 100%
and specificity of 80% and the area under the curve is 0.89,
indicating good test accuracy. The threshold value chosen may be
adjusted to increase either the specificity or sensitivity of the
test for a particular application.
[0248] Conclusions:
[0249] The results from example 7 illustrate the utility of
transcripts 6, 10, 11, 14, 15, 16 and 20 of the invention in the
detection of malignant melanomas. As indicated above, transcripts
10 and 11 were also found to have utility in detecting colorectal
cancer while transcript 6 has utility in the detection of lung
cancer. A transcript summary by disease is provided at Table 6.
Example 8
Application to Ovarian Cancer
[0250] This study sought to determine the effectiveness of several
transcripts of the invention in detecting ovarian cancer. A total
of 20 samples were prepared comprising ten control (benign) tissue
samples (samples 1 to 10) and ten tumour (malignant) tissue samples
(samples 11 to 20). The samples were homogenized according to the
manufacturer's recommendations (Quantigene.RTM. Sample Processing
Kit for Fresh or Frozen Animal Tissues; and Quantigene 2.0 Reagent
System User Manual). Eight target transcripts and one housekeeper
transcript were prepared in the manner as outlined above in
previous examples.
[0251] The 20 tissue samples used in this example had the following
characteristics:
TABLE-US-00006 TABLE 12 Characteristics of Ovarian Cancer Samples
Sample | Diagnosis | Comments
1 | Normal | follicular cyst
2 | Normal | fibroma
3 | Normal | No pathological change in ovaries
4 | Normal | follicular cysts
5 | Normal | cellular fibroma
6 | Normal | benign follicular and simple cysts
7 | Normal | leiomyomata, corpora albicantia
8 | Normal | corpora albicantia and epithelial inclusion cysts
9 | Normal | corpora albicantia
10 | Normal | corpora albicantia, surface inclusion cysts, follicular cysts
11 | Malignant | high grade poorly differentiated papillary serous carcinoma involving omentum
12 | Malignant | endometrioid adenocarcinoma, well to moderately differentiated with focal serous differentiation
13 | Malignant | papillary serous carcinoma
14 | Malignant | mixed epithelial carcinoma predominantly papillary serous carcinoma
15 | Malignant | High grade: serous carcinoma, papillary and solid growth patterns
16 | Malignant | High Grade (3/3) Papillary serous carcinoma
17 | Malignant | papillary serous carcinoma, high nuclear grade
18 | Malignant | Papillary serous cystadenocarcinomas Grade: III
19 | Malignant | poorly differentiated papillary serous carcinoma
20 | Malignant | Well-differentiated adenocarcinoma, Endometrioid type, Grade 1
[0252] The characteristics of the transcripts are summarized as
follows:
TABLE-US-00007 TABLE 13 Characteristics of Ovarian Cancer Transcripts
Transcript ID | Junction Site | Gene Junction
1 | 8469:13447 | ATPase8:ND5
2 | 10744:14124 | ND4L:ND5
3 | 7974:15496 | COII:Cytb
6 | 8828:14896 | ATPase6:Cytb
11 | 7775:13532 | COII:ND5
12 | 8213:13991 | COII:ND5
15 | 9574:12972 | COIII:ND5
20 | 8469:13447 | ATPase8:ND5
Ribosomal Protein Large PO (LRP), Housekeeper | N/A | N/A
[0253] It is noted that transcripts 1, 2, 3, 6, 11, 12, 15 and 20
are the same as those discussed above with respect to Examples
3-7.
[0254] Homogenates were prepared using approximately 25 mg of
frozen tissue and diluted 1:4. The quantity of the transcripts was
measured in Relative Luminescence Units (RLU) on a Glomax.TM. Multi
Detection System (Promega). All samples were assayed in triplicate
for each transcript. Background measurements (no template) were
done in triplicate as well. The analysis accounted for background
by subtracting the lower limit from the RLU values for the samples.
Input RNA was accounted for by using the formula log.sub.2 a
RLU-log.sub.2 h RLU where a is the target fusion transcript and h
is the housekeeper transcript.
[0255] The analysis of the data comprised the following steps:
[0256] a) Establish CV's (coefficients of variation) for triplicate
assays; acceptable if .ltoreq.15%.
[0257] b) Establish average RLU value for triplicate assays of
target fusion transcript(a) and housekeeper transcript (h).
[0258] c) Establish lower limit from triplicate value of background
RLU (l).
[0259] d) Subtract lower limit (l) from (a).
[0260] e) Calculate log.sub.2 a RLU-log.sub.2 h RLU.
[0261] Summary of Results:
[0262] The results of the above analysis are illustrated in FIGS.
9a to 9h, which comprise plots of the log.sub.2 a RLU-log.sub.2 h
RLU against sample number. Also illustrated are the respective ROC
(Receiver Operating Characteristic) curves determined from the
results for each transcript.
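The area under a ROC curve of the kind referred to above may be estimated directly from the two groups of normalized values. The following is a minimal sketch, assuming that higher values indicate malignancy and using the rank-based (Mann-Whitney) estimate of the area; the text does not state how the reported areas were calculated, and the function name and scores are hypothetical.

```python
def roc_auc(benign, malignant):
    """Estimate the ROC area as the fraction of (malignant, benign) pairs
    in which the malignant value is higher, counting ties as one half."""
    if not benign or not malignant:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant) * len(benign))


# Hypothetical normalized scores (log2 a RLU - log2 h RLU).
benign_scores = [1.0, 1.5, 2.0, 3.5]
malignant_scores = [2.5, 3.0, 4.0, 4.5, 5.0]
auc = roc_auc(benign_scores, malignant_scores)
```

An area of 1.00 under this estimate corresponds to complete separation of the two groups, while 0.5 corresponds to no discrimination.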
[0263] Transcript 1: There exists a statistically significant
difference between the means (p<0.05) of the normal and
malignant groups (p=0.002). Using a cutoff value of -11.1503 as
demonstrated by the ROC curve results in a sensitivity of 90% and
specificity of 80% and the area under the curve is 0.91 indicating
very good test accuracy. The threshold value chosen may be adjusted
to increase either the specificity or sensitivity of the test for a
particular application.
[0264] Transcript 2: There exists a statistically significant
difference between the means (p<0.01) of the normal and
malignant groups (p=0.001). Using a cutoff value of 0.6962 as
demonstrated by the ROC curve results in a sensitivity of 90% and
specificity of 100% and the area under the curve is 0.96 indicating
very good test accuracy. The threshold value chosen may be adjusted
to increase either the specificity or sensitivity of the test for a
particular application.
[0265] Transcript 3: There exists a statistically significant
difference between the means (p<0.01) of the normal and
malignant groups (p=0.000). Using a cutoff value of 0.6754 as
demonstrated by the ROC curve results in a sensitivity of 100% and
specificity of 100% and the area under the curve is 1.00 indicating
excellent test accuracy. The threshold value chosen may be adjusted
to increase either the specificity or sensitivity of the test for a
particular application.
[0266] Transcript 6: There exists a statistically significant
difference between the means (p<0.01) of the normal and
malignant groups (p=0.007). Using a cutoff value of -9.6479 as
demonstrated by the ROC curve results in a sensitivity of 90% and
specificity of 70% and the area under the curve is 0.86 indicating
good test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0267] Transcript 11: There is a statistically significant
difference between the means (p<0.01) of the normal and
malignant groups (p=0.000). Using a cutoff value of -1.3794 as
demonstrated by the ROC curve results in a sensitivity of 100% and
specificity of 90% and the area under the curve is 0.99, indicating
excellent test accuracy. The threshold value chosen may be adjusted
to increase either the specificity or sensitivity of the test for a
particular application.
[0268] Transcript 12: There exists a statistically significant
difference between the means (p<0.01) of the normal and
malignant groups (p=0.001). Using a cutoff value of -1.2379 as
demonstrated by the ROC curve results in a sensitivity of 90% and
specificity of 100% and the area under the curve is 0.96 indicating
excellent test accuracy. The threshold value chosen may be adjusted
to increase either the specificity or sensitivity of the test for a
particular application.
[0269] Transcript 15: There exists a statistically significant
difference between the means (p<0.05) of the normal and
malignant groups (p=0.023). Using a cut-off value of -8.6926 as
demonstrated by the ROC curve results in a sensitivity of 70% and
specificity of 80% and the area under the curve is 0.80 indicating
good test accuracy. The threshold value chosen may be adjusted to
increase either the specificity or sensitivity of the test for a
particular application.
[0270] Transcript 20: There exists a statistically significant
difference between the means (p<0.01) of the normal and
malignant groups (p=0.000). Using a cut-off value of 0.6521 as
demonstrated by the ROC curve results in a sensitivity of 100% and
specificity of 100% and the area under the curve is 0.76 indicating
fair to good test accuracy. The threshold value chosen may be
adjusted to increase either the specificity or sensitivity of the
test for a particular application.
[0271] Conclusions:
[0272] The above results illustrate the utility of transcripts 1,
2, 3, 6, 11, 12, 15, and 20 in the detection of ovarian cancer and
in distinguishing malignant from normal ovarian tissue. Transcripts
1, 2 and 3 were also found to have utility in the detection of
prostate cancer. Transcript 6 was also found to have utility in the
detection of melanoma and lung cancer. Transcript 11 was also found
to have utility in the detection of melanoma skin cancer,
colorectal cancer and testicular cancer. Transcript 12 was also
found to have utility in the detection of colorectal cancer and
testicular cancer. Transcript 15 was also found to have utility in
the detection of melanoma and testicular cancer. Transcript 20 was
also found to have utility in the detection of colorectal cancer,
melanoma, and testicular cancer. Any of the 8 transcripts listed
may be used individually or in combination as a tool for the
detection or characterization of ovarian cancer in a clinical
setting.
Example 9
Application to Testicular Cancer
[0273] This study sought to determine the effectiveness of several
transcripts of the invention in detecting testicular cancer. A
total of 17 samples were prepared comprising eight control (benign)
tissue samples (samples 1 to 8) and 9 tumour (malignant) tissue
samples (samples 9 to 17); 5 of the malignant samples were
non-seminomas (samples 9-13) and 4 were seminomas (samples 14-17).
The samples were homogenized according to the manufacturer's
recommendations (Quantigene.RTM. Sample Processing Kit for Fresh or
Frozen Animal Tissues; and Quantigene 2.0 Reagent System User
Manual). 10 target transcripts and one housekeeper transcript were
prepared in the manner as outlined above in previous examples.
[0274] The 17 tissue samples used in this example had the following
characteristics:
TABLE-US-00008 TABLE 14 Characteristics of Testicular Cancer Samples
Sample | General Diagnosis | Stratified Malignant Diagnosis
1 | Benign | Benign
2 | Benign | Benign
3 | Benign | Benign
4 | Benign | Benign
5 | Benign | Benign
6 | Benign | Benign
7 | Benign | Benign
8 | Benign | Benign
9 | Malignant | Non-Seminoma
10 | Malignant | Non-Seminoma
11 | Malignant | Non-Seminoma
12 | Malignant | Non-Seminoma
13 | Malignant | Non-Seminoma
14 | Malignant | Seminoma
15 | Malignant | Seminoma
16 | Malignant | Seminoma
17 | Malignant | Seminoma
[0275] The characteristics of the transcripts are summarized as
follows:
TABLE-US-00009 TABLE 15 Characteristics of Testicular Cancer Transcripts
Transcript ID | Junction Site | Gene Junction
2 | 10744:14124 | ND4L:ND5
3 | 7974:15496 | COII:Cytb
4 | 7992:15730 | COII:Cytb
11 | 7775:13532 | COII:ND5
12 | 8213:13991 | COII:ND5
13 | 9144:13816 | ATPase6:ND5
15 | 9574:12972 | COIII:ND5
16 | 10367:12829 | ND3:ND5
20 | 8469:13447 | ATPase8:ND5
Peptidylprolyl isomerase B (PPIB) | N/A | N/A
[0276] It is noted that transcripts 2, 3, 4, 7, 11, 12, 15, 16 and
20 are the same as those discussed above with respect to Examples
3-8.
[0277] Homogenates were prepared using approximately 25 mg of
frozen tissue and diluted 1:4. The quantity of the transcripts was
measured in Relative Luminescence Units (RLU) on a Glomax.TM. Multi
Detection System (Promega). All samples were assayed in triplicate
for each transcript. Background measurements (no template) were
done in triplicate as well. The analysis accounted for background
by subtracting the lower limit from the RLU values for the samples.
Input RNA was accounted for by using the formula log.sub.2 a
RLU-log.sub.2 h RLU where a is the target fusion transcript and h
is the housekeeper transcript.
[0278] The analysis of the data comprised the following steps:
[0279] a) Establish CVs (coefficients of variation) for triplicate
assays; acceptable if ≤15%.
[0280] b) Establish average RLU value for triplicate assays of
target fusion transcript (a) and housekeeper transcript (h).
[0281] c) Establish lower limit from triplicate value of background
RLU (l).
[0282] d) Subtract lower limit (l) from (a).
[0283] e) Calculate log₂ a RLU - log₂ h RLU.
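By way of illustration only, steps a) to e) may be sketched in Python as follows. The function names and the RLU values are hypothetical and are not taken from the application. The sketch also assumes that the lower limit (l) is the mean of the background triplicate and that the CV uses the sample standard deviation; the text does not specify either point.

```python
import math
import statistics


def percent_cv(replicates):
    """Coefficient of variation (%), using the sample standard deviation (assumption)."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


def normalized_signal(target_rlu, housekeeper_rlu, background_rlu, max_cv=15.0):
    """Follow steps a) to e) and return log2(a - l) - log2(h)."""
    # a) accept the triplicates only if their CV is at most 15%
    for name, reps in (("target", target_rlu), ("housekeeper", housekeeper_rlu)):
        cv = percent_cv(reps)
        if cv > max_cv:
            raise ValueError(f"{name} CV of {cv:.1f}% exceeds {max_cv}%")
    # b) average RLU of the target (a) and the housekeeper (h)
    a = statistics.mean(target_rlu)
    h = statistics.mean(housekeeper_rlu)
    # c) lower limit (l); taken here as the mean background RLU (assumption)
    lower = statistics.mean(background_rlu)
    # d) subtract the lower limit from a
    a_corrected = a - lower
    if a_corrected <= 0:
        raise ValueError("target signal does not exceed background")
    # e) difference of base-2 logarithms
    return math.log2(a_corrected) - math.log2(h)


# Hypothetical triplicates chosen so the arithmetic is easy to follow:
# a = 1074, l = 50, so a - l = 1024 = 2**10; h = 4096 = 2**12.
score = normalized_signal([1070, 1074, 1078], [4000, 4096, 4192], [48, 50, 52])
print(score)  # -2.0
```

A negative value simply indicates that the background-corrected target signal is lower than the housekeeper signal.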
[0284] Summary of Results:
[0285] The results of the above analysis are illustrated in FIGS.
10 to 18, which comprise plots of the log₂ a RLU - log₂ h RLU
against sample number. Also illustrated are the respective ROC
(Receiver Operating Characteristic) curves determined from the
results for each transcript.
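As a non-limiting sketch, the sensitivity, specificity and area under the ROC curve reported for each transcript below could be computed along the following lines. The scores are illustrative only and are not data from the application. The sketch assumes that a higher score indicates malignancy; the direction for any given transcript is not stated in this section.

```python
def sensitivity_specificity(benign, malignant, cutoff):
    """Call a sample malignant when its score is at or above the cutoff."""
    true_pos = sum(score >= cutoff for score in malignant)
    true_neg = sum(score < cutoff for score in benign)
    return true_pos / len(malignant), true_neg / len(benign)


def roc_auc(benign, malignant):
    """Area under the ROC curve, computed as the fraction of
    (malignant, benign) pairs in which the malignant score is higher;
    tied pairs count one half."""
    wins = 0.0
    for m in malignant:
        for b in benign:
            if m > b:
                wins += 1.0
            elif m == b:
                wins += 0.5
    return wins / (len(malignant) * len(benign))


# Illustrative scores only (hypothetical, not from the application)
benign_scores = [-1.0, -0.5, 0.2, 0.4, 0.9, 1.1, 1.3, 1.8]  # 8 benign samples
seminoma_scores = [1.7, 2.0, 2.4, 3.1]                       # 4 seminoma samples

sens, spec = sensitivity_specificity(benign_scores, seminoma_scores, cutoff=1.5)
auc = roc_auc(benign_scores, seminoma_scores)
print(sens, spec, round(auc, 3))  # 1.0 0.875 0.969
```

With eight benign and four seminoma samples, specificity can only change in steps of 12.5% and the area under the curve in steps of 1/32, which is consistent with values such as 87.5% and 0.969 appearing in the results.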
[0286] While some transcripts distinguish between benign and
malignant testicular tissue, others distinguish between the tumour
subtypes of seminoma and non-seminoma and/or benign testicular
tissue. It is therefore anticipated that combining transcripts from
each class will facilitate not only detection of testicular cancer
but also classification into the seminoma or non-seminoma subtype.
[0287] Transcript 2: There exists a statistically significant
difference between the means (p<0.05) of the normal group and
the malignant seminomas (p=0.02). Using a cutoff value of 1.5621 as
demonstrated by the ROC curve results in a sensitivity of 100% and
specificity of 100% and the area under the curve is 1.00 indicating
excellent test accuracy. There also exists a statistically
significant difference between the means (p<0.05) of the
malignant seminomas and the malignant non-seminomas (p=0.024).
Using a cutoff value of 2.1006 as demonstrated by the ROC curve
results in a sensitivity of 100% and specificity of 80% and the
area under the curve is 0.90 indicating excellent test accuracy.
The threshold value chosen may be adjusted to increase either the
specificity or sensitivity of the test for a particular
application.
[0288] Transcript 3: There exists a statistically significant
difference between the means (p<0.05) of the normal group and
the malignant seminomas (p=0.018). Using a cutoff value of 0.969 as
demonstrated by the ROC curve results in a sensitivity of 100% and
specificity of 87.5% and the area under the curve is 0.969
indicating excellent accuracy. There also exists a statistically
significant difference between the means (p<0.05) of the
malignant seminomas and the malignant non-seminomas (p=0.017).
Using a cutoff value of 1.8181 as demonstrated by the ROC curve
results in a sensitivity of 100% and specificity of 80% and the
area under the curve is 0.9 indicating excellent test accuracy. The
threshold value chosen may be adjusted to increase either the
specificity or sensitivity of the test for a particular
application.
[0289] Transcript 4: There exists a statistically significant
difference between the means (p<0.05) of the normal and
malignant groups (p=0.034). Using a cutoff value of -9.7628 as
demonstrated by the ROC curve results in a sensitivity of 67% and
specificity of 100% and the area under the curve is 0.833
indicating good test accuracy. The threshold value chosen may be
adjusted to increase either the specificity or sensitivity of the
test for a particular application.
[0290] Transcript 11: There exists a statistically significant
difference between the means (p<0.05) of the normal group and
the malignant seminomas (p=0.016). Using a cutoff value of 0.732 as
demonstrated by the ROC curve results in a sensitivity of 100% and
specificity of 100% and the area under the curve is 1.00 indicating
excellent test accuracy. There also exists a statistically
significant difference between the means (p<0.05) of the
malignant seminomas and the malignant non-seminomas (p=0.016).
Using a cutoff value of 0.9884 as demonstrated by the ROC curve
results in a sensitivity of 100% and specificity of 80% and the
area under the curve is 0.90 indicating excellent test accuracy.
The threshold value chosen may be adjusted to increase either the
specificity or sensitivity of the test for a particular
application.
[0291] Transcript 12: There exists a statistically significant
difference between the means (p<0.1) of the normal group and the
malignant seminomas (p=0.056). Using a cutoff value of 1.5361 as
demonstrated by the ROC curve results in a sensitivity of 100% and
specificity of 87.5% and the area under the curve is 0.969
indicating excellent test accuracy. There also exists a
statistically significant difference between the means (p<0.05)
of the malignant seminomas and the malignant non-seminomas
(p=0.044). Using a cutoff value of 1.6039 as demonstrated by the
ROC curve results in a sensitivity of 100% and specificity of 80%
and the area under the curve is 0.9 indicating excellent test
accuracy. The threshold value chosen may be adjusted to increase
either the specificity or sensitivity of the test for a particular
application.
[0292] Transcript 13: There exists a statistically significant
difference between the means (p<0.05) of the normal group and
the malignant group (p=0.019). Using a cutoff value of -9.8751 as
demonstrated by the ROC curve results in a sensitivity of 87.5% and
specificity of 78% and the area under the curve is 0.875 indicating
very good test accuracy. There also exists a statistically
significant difference between the means (p<0.01) of the
malignant non-seminomas and the benign group (p=0.000). Using a
cutoff value of -13.9519 as demonstrated by the ROC curve results
in a sensitivity of 100% and specificity of 87.5% and the area
under the curve is 0.975 indicating excellent test accuracy. There
also exists a statistically significant difference between the
means (p<0.01) of the malignant seminomas and the malignant
non-seminomas (p=0.001). Using a cutoff value of -15.8501 as
demonstrated by the ROC curve results in a sensitivity of 100% and
specificity of 100% and the area under the curve is 1.00 indicating
excellent test accuracy. The threshold value chosen may be adjusted
to increase either the specificity or sensitivity of the test for a
particular application.
[0293] Transcript 15: There exists a statistically significant
difference between the means (p<0.1) of the normal and malignant
groups (p=0.065). Using a cut-off value of -5.4916 as demonstrated
by the ROC curve results in a sensitivity of 75% and specificity of
89% and the area under the curve is 0.833 indicating good test
accuracy. The threshold value chosen may be adjusted to increase
either the specificity or sensitivity of the test for a particular
application.
[0294] Transcript 16: There exists a statistically significant
difference between the means (p<0.05) of the normal and
malignant groups including both seminomas and non-seminomas
(p=0.037). Using a cut-off value of -6.448 as demonstrated by the
ROC curve results in a sensitivity of 89% and specificity of 75%
and the area under the curve is 0.806 indicating good test
accuracy. There also exists a statistically significant difference
between the means (p<0.05) of the normal and malignant seminomas
(p=0.037). Using a cut-off value of -7.4575 as demonstrated by the
ROC curve results in a sensitivity of 100% and specificity of 87.5%
and the area under the curve is 0.938 indicating excellent test
accuracy. The threshold value chosen may be adjusted to increase
either the specificity or sensitivity of the test for a particular
application.
[0295] Transcript 20: There exists a statistically significant
difference between the means (p<0.01) of the normal group and
the malignant seminomas (p=0.006). Using a cutoff value of 1.8364
as demonstrated by the ROC curve results in a sensitivity of 100%
and specificity of 100% and the area under the curve is 1.00
indicating excellent test accuracy. There also exists a
statistically significant difference between the means (p<0.01)
of the malignant seminomas and the malignant non-seminomas
(p=0.004). Using a cutoff value of 1.6065 as demonstrated by the
ROC curve results in a sensitivity of 100% and specificity of 100%
and the area under the curve is 1.00 indicating excellent test
accuracy. The threshold value chosen may be adjusted to increase
either the specificity or sensitivity of the test for a particular
application.
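The statement that the threshold may be adjusted to favour either specificity or sensitivity can be illustrated with the following sketch. It is an assumption-laden example, not the method used to select the cutoffs reported above: the scores are hypothetical, candidate cutoffs are taken at midpoints between adjacent scores, and a higher score is assumed to indicate malignancy.

```python
def sweep_cutoffs(benign, malignant):
    """Return (cutoff, sensitivity, specificity) for each midpoint
    between adjacent distinct scores."""
    scores = sorted(set(benign) | set(malignant))
    rows = []
    for low, high in zip(scores, scores[1:]):
        cutoff = (low + high) / 2
        sens = sum(s >= cutoff for s in malignant) / len(malignant)
        spec = sum(s < cutoff for s in benign) / len(benign)
        rows.append((cutoff, sens, spec))
    return rows


def most_specific_cutoff(rows, min_sensitivity):
    """Among cutoffs meeting the required sensitivity, pick the most specific.
    Returns None when no cutoff qualifies."""
    eligible = [row for row in rows if row[1] >= min_sensitivity]
    return max(eligible, key=lambda row: row[2], default=None)


# Hypothetical scores, not data from the application
benign = [0.25, 0.5, 0.75, 1.0, 1.5]
malignant = [1.25, 1.75, 2.0, 2.5]

rows = sweep_cutoffs(benign, malignant)
print(most_specific_cutoff(rows, min_sensitivity=1.0))   # (1.125, 1.0, 0.8)
print(most_specific_cutoff(rows, min_sensitivity=0.75))  # (1.625, 0.75, 1.0)
```

In this toy data set, raising the cutoff from 1.125 to 1.625 increases specificity from 80% to 100% at the cost of lowering sensitivity from 100% to 75%.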
[0296] Conclusions:
[0297] The above results illustrate the utility of transcripts 2,
3, 4, 11, 12, 13, 15, 16, and 20 in the detection of testicular
cancer, and testicular cancer subtypes, and in distinguishing
malignant from normal testicular tissue. Transcript 2 was also
found to have utility in the detection of prostate, breast,
colorectal and ovarian cancer. Transcript 3 was also found to have
utility in the detection of prostate, breast, melanoma, colorectal,
and ovarian cancers. Transcript 4 was also found to have utility in
the detection of prostate and colorectal cancers. Transcript 11 was
also found to have utility in the detection of colorectal,
melanoma, and ovarian cancers. Transcript 12 was also found to have
utility in the detection of colorectal and ovarian cancers.
Transcript 15 was also found to have utility in the detection of
melanoma and ovarian cancers. Transcript 16 was also found to have
utility in the detection of melanoma skin cancer. Transcript 20 was
also found to have utility in the detection of colorectal cancer,
melanoma, and ovarian cancer. Any of the 9 transcripts listed may
be used individually or in combination as a tool for the detection
or characterization of testicular cancer in a clinical setting.
[0298] In one aspect, the invention provides a kit for conducting
an assay for determining the presence of cancer in a tissue sample.
The kit includes the required reagents for conducting the assay as
described above. In particular, the kit includes one or more
containers containing one or more hybridization probes
corresponding to transcripts 1 to 17, and 20 described above. As
will be understood, the reagents for conducting the assay may
include any necessary buffers, salts, detection reagents etc.
Further, the kit may include any necessary sample collection
devices, containers etc. for obtaining the needed tissue samples,
reagents or materials to prepare the tissue samples for example by
homogenization or nucleic acid extraction, and for conducting the
subject assay or assays. The kit may also include control tissues
or samples to establish or validate acceptable values for diseased
or non-diseased tissues.
[0299] Although the invention has been described with reference to
certain specific embodiments, various modifications thereof will be
apparent to those skilled in the art without departing from the
spirit and scope of the invention as outlined in the claims
appended hereto. All documents (articles, manuals, patent
applications etc) referred to in the present application are
incorporated herein in their entirety by reference.
[0300] Bibliography
[0301] The following references, amongst others, were cited in the
foregoing description. The entire contents of these references are
incorporated herein by way of reference thereto.
TABLE-US-00010
Author | Journal | Title | Volume | Date
Anderson et al | Nature | Sequence and Organization of the Human Mitochondrial Genome | 290(5806): 457-65 | 1981
Andrews et al | Nat Genet | Reanalysis and revision of the Cambridge reference sequence for human mitochondrial DNA. | 23(2): 147 | 1999
Modica-Napolitano et al | Expert Rev Mol Med | Mitochondria as targets for detection and treatment of cancer | 4: 1-19 | 2002
Sherratt et al | Clin Sci (Lond) | Mitochondrial DNA defects: a widening clinical spectrum of disorders. | 92(3): 225-35 | 1997
Croteau et al | Mutat Res | Mitochondrial DNA repair pathways. | 434(3): 137-48 | 1999
Green and Kroemer | J Clin Invest | Pharmacological manipulation of cell death: clinical applications in sight? | 115(10): 2610-2617 | 2005
Dai et al | Acta Otolaryngol | Correlation of cochlear blood supply with mitochondrial DNA common deletion in presbyacusis. | 24(2): 130-6 | 2004
Ro et al | Muscle Nerve | Deleted 4977-bp mitochondrial DNA mutation is associated with sporadic amyotrophic lateral sclerosis: a hospital-based case-control study. | 28(6): 737-43 | 2003
Barron et al | Invest Ophthalmol Vis Sci | Mitochondrial abnormalities in ageing macular photoreceptors. | 42(12): 3016-22 | 2001
Lewis et al | J Pathol | Detection of damage to the mitochondrial genome in the oncocytic cells of Warthin's tumour. | 191(3): 274-81 | 2000
Muller-Hocker et al | Mod Pathol | The common 4977 base pair deletion of mitochondrial DNA preferentially accumulates in the cardiac conduction system of patients with Kearns-Sayre syndrome. | 11(3): 295-301 | 1998
Porteous et al | Eur J Biochem | Bioenergetic consequences of accumulating the common 4977-bp mitochondrial DNA deletion. | 257(1): 192-201 | 1998
Parr et al | J Mol Diagn | Somatic mitochondrial DNA mutations in prostate cancer and normal appearing adjacent glands in comparison to age-matched prostate samples without malignant histology. | 8(3): 312-9 | 2006
Maki et al | Am J Clin Pathol | Mitochondrial genome deletion aids in the identification of false- and true-negative prostate needle core biopsy specimens. | 129(1): 57-66 | 2008
Nakase et al | Am J Hum Genet | Transcription and translation of deleted mitochondrial genomes in Kearns-Sayre syndrome: implications for pathogenesis. | 46(3): 418-27 | 1990
Libura et al | Blood | Therapy-related acute myeloid leukemia-like MLL rearrangements are induced by etoposide in primary human CD34+ cells and remain stable after clonal expansion. | 105(5): 2124-31 | 2005
Meyer et al | Proc Natl Acad Sci USA | Diagnostic tool for the identification of MLL rearrangements including unknown partner genes. | 102(2): 449-54 | 2005
Eguchi et al | Genes Chromosomes Cancer | MLL chimeric protein activation renders cells vulnerable to chromosomal damage: an explanation for the very short latency of infant leukemia. | 45(8): 754-60 | 2006
Hayashi et al | Proc Natl Acad Sci USA | Introduction of disease-related mitochondrial DNA deletions into HeLa cells lacking mitochondrial DNA results in mitochondrial dysfunction | 88: 10614-10618 | 1991
TABLE-US-00011 TABLE 1 Known mitochondrial deletions having an ORF
Deletion Junction (nt:nt) | Deletion Size (bp) | Repeat Location (nt/nt) | Number of Repeats | References
COX I - ND5
6075:13799 | -7723 | 6076-6084/13799-13807 | D, 9/9 | Mita, S., Rizzuto, R., Moraes, C. T., Shanske, S., Arnaudo, E., Fabrizi, G. M., Koga, Y., DiMauro, S., Schon, E. A. (1990) "Recombination via flanking direct repeats is a major cause of large-scale deletions of human mitochondrial DNA" Nucleic Acids Research 18(3): 561-567
6238:14103 | -7864 | 6235-6238/14099-14102 | D, 4/4 | Blok, R. B., Thorburn, D. R., Thompson, G. N., Dahl, H. H. (1995) "A topoisomerase II cleavage site is associated with a novel mitochondrial DNA deletion" Human Genetics 95 (1): 75-81
6325:13989 | -7663 | 6326-6341/13889-14004 | D, 16/17 | Larsson, N. G., Holme, E., Kristiansson, B., Oldfors, A., Tulinius, M. (1990) "Progressive increase of the mutated mitochondrial DNA fraction in Kearns-Sayre syndrome" Pediatric Research 28 (2): 131-136; Larsson, N. G., Holme, E. (1992) "Multiple short direct repeats associated with single mtDNA deletions" Biochimica et Biophysica Acta 1139(4): 311-314
6330:13994 | -7663 | 6331-6341/13994-14004 | D, 11/11 | Mita, S., Rizzuto, R., Moraes, C. T., Shanske, S., Arnaudo, E., Fabrizi, G. M., Koga, Y., DiMauro, S., Schon, E. A. (1990) "Recombination via flanking direct repeats is a major cause of large-scale deletions of human mitochondrial DNA" Nucleic Acids Research 18(3): 561-567
COX II - ND5
7829:14135 | -6305 | 7824-7829/14129-14134 | D, 6/6 | Bet, L, Moggio, M., Comi, G. P., Mariani, C., Prelle, A., Checcarelli, N., Bordoni, A., Bresolin, N., Scarpini, E., Scarlato, G. (1994) "Multiple sclerosis and mitochondrial myopathy: an unusual combination of diseases" Journal of Neurology 241 (8): 511-516
8213:13991 | -5777 | 8214-8220/13991-13997 | D, 7/7 | Hinokio, Y., Suzuki, S., Komatu, K., Ohtomo, M., Onoda, M., Matsumoto, M., Hirai, S., Sato, Y., Akai, H., Abe, K., Toyota, T. (1995) "A new mitochondrial DNA deletion associated with diabetic amyotrophy, diabetic myoatrophy and diabetic fatty liver" Muscle and Nerve 3 (9): S142-149
ATPase - ND5
8631:13513 | -4881 | 8625-8631/13506-13512 | D, 7/7 | Zhang, C., Baumer, A., Mackay, I. R., Linnane, A. W., Nagley, P. (1995) "Unusual pattern of mitochondrial DNA deletions in skeletal muscle of an adult human with chronic fatigue syndrome" Human Molecular Genetics 4 (4): 751-754
9144:13816 | -4671 | 9137-9144/13808-13815 | D, 8/8 | Ota, Y., Tanaka, M., Sato, W., Ohno, K., Yamamoto, T., Maehara, M., Negoro, T., Watanabe, K., Awaya, S., Ozawa, T. (1991) "Detection of platelet mitochondrial DNA deletions in Kearns-Sayre syndrome" Investigative Ophthalmology and Visual Science 32 (10): 2667-2675
9191:12909 | -3717 | 9189-9191/12906-12908 | D, 3/3 | Tanaka, M., Sato, W., Ohno, K., Yamamoto, T., Ozawa, T. (1989) "Direct sequencing of mitochondrial DNA in myopathic patients" Biochemical and Biophysical Research Communications 164 ( ): 156-163
COX III - ND5
10190:13753 | -3562 | 10191-10190/13753-13760 | D, 8/8 | Rotig, A., Bourgeron, T., Chretien. D., Rustin, P., Munnich, A. (1995) "Spectrum of mitochondrial DNA rearrangements in the Pearson marrow-pancreas syndrome" Human Molecular Genetics 4 (8): 1327-1330; Rotig, A., Cormier, Y., Kol, F., Mize, C. E., Souslubray, J. M., Veerman, A., Pearson, H. A., Munnich, A. (1991) "Site-specific deletions of the mitochondrial genome in Pearson marrow-pancreas syndrome" Genomics 10 (2): 502-504
10067:12029 | -2461 | 10365-10367/12825-12828 | D, 3/3 | Kapsa, R., Thompson, G. N., Thorburn, D. R., Dahl, H. H., Marzuki, G., Byrna, E., Blair, R. B. (1984) "A novel mtDNA deletion in an infant with Pearson syndrome" Journal of Inherited Metabolic Disease 17 (5): 521-526
ND4L - ND5
10744:14124 | -3378 | 10745-10754/14124-14133 | D, 9/10 | Cormier-Daire, V., Bonnefont, J. P., Rustin, P., Maurage, C., Ogler, H., Schmitz, J., Ricour, C., Saudubray, J. M., Munnich, A., Rotig, A. (1984) "Mitochondrial DNA, rearrangements with onset as chronic diarrhea with villous atrophy" Journal of Pediatrics 124 (1): 53-70
ND4 - ND5
11232:13980 | -2747 | 1324-11242/13981-13989 | D, 9/9 | Rotig, A., Cormier, V., Roll, F., Mize, C. E., Saudubray, J. M., Veerman, A., Pearson, H. A., Munnich, A. (1991) "Site-specific deletions of the mitochondrial genome in Pearson marrow-pancreas syndrome" Genomics 10 (2): 502-504; Rotig, A., Cormier, Y., Blanche, S., Bonnefont, J. P., Ledeist, F., Romero, N., Schmitz, J., Rustin, P., Fischer, A., Saudubray, J. M. (1990) "Pearson's marrow-pancreas syndrome. A multi-system mitochondrial disorder in infancy" Journal of Clinical Investigation 86 ( ): 1601-1608; Cormier, V., Rotig, A., Quartino, A. R., Forni, G. L., Cerane, R., Maier, M., Saudubray, J. M., Munnich, A. (1990) "Widespread multitissue deletions of the mitochondrial genome in Pearson marrow-pancreas syndrome" Journal of Pediatrics 117 (4): 599-602; Awata, T., Matsumata, T., Iwamoto, Y., Matsuda, A., Kuzuya, T., Saito, T. (1993) "Japanese case of diabetes mellitus and deafness with mutations in mitochondrial tRNALeu(UUR) gene [letter]" Lancet 341 (8855): 1281-1282
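For most entries in Table 1, the magnitude of the listed deletion size appears to equal the number of bases lying strictly between the two junction positions. The following sketch illustrates that arithmetic. The function name is hypothetical, and the relationship is inferred from the table values rather than stated in the text; not every row follows it exactly as printed.

```python
def bases_between(junction):
    """Count the bases strictly between the two positions of a
    junction written as 'start:end' (for example '6075:13799')."""
    start, end = (int(part) for part in junction.split(":"))
    return end - start - 1


# Junctions and sizes as listed in Table 1
for junction, listed_size in [("6075:13799", 7723),
                              ("8213:13991", 5777),
                              ("9144:13816", 4671)]:
    print(junction, bases_between(junction), listed_size)
# 6075:13799 7723 7723
# 8213:13991 5777 5777
# 9144:13816 4671 4671
```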
TABLE-US-00012 TABLE 2 Prostate Cancer Detection with Novel
Mitochondrial Fusion Transcripts
##STR00001##
* Unit results in table are RLU (relative luminescence units); data
read on Glorunner™. % CV = coefficient of variation (as %).
Legend: Homog = homogenate. Homog 1: Prostate tumour tissue sample
from patient; Homog 2: Histologically normal tissue adjacent to
tumour from patient. RNA: Control: Total RNA from prostate tissue
(Ambion p/n 7988). Shading: Background measurement.
TABLE-US-00013 TABLE 3 Deletion/Transcript/DNA Complement
Deletion | RNA transcript | DNA sequence with deletion complementary to RNA transcript | Transcript No.
ATP synthase F0 subunit 8 to NADH dehydrogenase subunit; mitochondrial positions 8366-14148 (with reference to SEQ ID NO: 1) | SEQ ID NO: 18 | SEQ ID NO: 2 | 1
NADH dehydrogenase subunit 4L (ND4L) to NADH dehydrogenase subunit 5 (ND5); mitochondrial positions 10470-14148 (with reference to SEQ ID NO: 1) | SEQ ID NO: 19 | SEQ ID NO: 3 | 2
Cytochrome c oxidase subunit II (COII) to Cytochrome b (Cytb); mitochondrial positions 7586-15887 (with reference to SEQ ID NO: 1) | SEQ ID NO: 20 | SEQ ID NO: 4 | 3
Cytochrome c oxidase subunit II (COII) to Cytochrome b (Cytb); mitochondrial positions 7586-15887 (with reference to SEQ ID NO: 1) | SEQ ID NO: 21 | SEQ ID NO: 5 | 4
TABLE-US-00014 TABLE 4 Breast and Prostate Cancer Detection
Row | 1: Breast Tumour 1 | 2: Normal adjacent Breast Tumour 1 | 3: Breast Tumour 2 | 4: Normal Adjacent to Breast Tumour 2 | 5: Prostate Tumour 3 | 6: Prostate Tumour 4 | 7: Prostate Tumour 5 | 8: Normal Adjacent to Prostate Tumour 5
E (1:100 dilution) | 68920 | 2971 | 49108 | 1245 | 46723 | 56679 | 99836 | 35504
F (1:100 dilution replicate) | 92409 | 3017 | 60637 | 1512 | 53940 | 56155 | 100582 | 44221
G | 420 | 3 | 31 | 6 | 26 | 25 | 44 | 23
H | 518 | 3 | 4 | 5 | 5 | 3 | 4 | 2
% CV | 20.6 | 1.1 | 14.9 | 13.7 | 10.1 | 0.7 | 0.5 | 15.5
Unit results in table are RLU (relative luminescence units).
Background: G1, H1. Empty wells: G2-G8, H2-H8.
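The % CV values in Table 4 appear to be reproducible from the duplicate readings in rows E and F. The sketch below assumes the CV is the sample standard deviation of the two replicates divided by their mean; that assumption is not stated in the text, but it matches the tabulated values to one decimal place.

```python
import statistics

# Duplicate RLU readings from Table 4 (rows E and F, columns 1 to 8)
row_e = [68920, 2971, 49108, 1245, 46723, 56679, 99836, 35504]
row_f = [92409, 3017, 60637, 1512, 53940, 56155, 100582, 44221]


def percent_cv(values):
    """Coefficient of variation (%), using the sample standard deviation."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


cvs = [round(percent_cv(pair), 1) for pair in zip(row_e, row_f)]
print(cvs)  # [20.6, 1.1, 14.9, 13.7, 10.1, 0.7, 0.5, 15.5]
```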
TABLE-US-00015 TABLE 5a Assay Conditions
Template for the assay
Column | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12
Template | RNA | Homogen 1 | Homogen 2 | RNA | Homogen 1 | Homogen 2 | RNA | Homogen 1 | Homogen 2 | RNA | Homogen 1 | Homogen 2
Transcript | Transcript 1 | Transcript 1 | Transcript 1 | Transcript 2 | Transcript 2 | Transcript 2 | Transcript 3 | Transcript 3 | Transcript 3 | Transcript 4 | Transcript 4 | Transcript 4
A | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2
B | Dil 1 | Dil 1 | Dil 1 | Dil 1 | Dil 1 | Dil 1 | Dil 1 | Dil 1 | Dil 1 | Dil 1 | Dil 1 | Dil 1
C | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2
D | Dil 2 | Dil 2 | Dil 2 | Dil 2 | Dil 2 | Dil 2 | Dil 2 | Dil 2 | Dil 2 | Dil 2 | Dil 2 | Dil 2
E | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2 | RNA | Homog 1 | Homog 2
F | Dil 3 | Dil 3 | Dil 3 | Dil 3 | Dil 3 | Dil 3 | Dil 3 | Dil 3 | Dil 3 | Dil 3 | Dil 3 | Dil 3
G | RNA | Homog 1 | Transcript 1 | RNA | Homog 1 | Transcript 1 | RNA | Homog 1 | Transcript 1 | RNA | Homog 1 | Transcript 1
H | Dil 4 | Dil 4 | Background | Dil 4 | Dil 4 | Background | Dil 4 | Dil 4 | Background | Dil 4 | Dil 4 | Background
Homogenate 1: Used 26 mg of tissue to homogenize in 700 ul H soln
with Proteinase K (PK). Used Qiagen TissueRuptor. Used 40 ul
homogenate supernatant, 20, 10 and 5 ul for dilution. Homogenate 1 =
Tumour tissue from the tumorous prostate.
Homogenate 2: Used 29 mg of tissue to homogenize in 700 ul H soln
with PK. Used Qiagen TissueRuptor. Used 40 ul homogenate
supernatant, 20, 10 and 5 ul for dilution. Homogenate 2 = Normal
tissue from the tumorous prostate.
RNA dilution was made as below. RNA was from Prostate Normal from
Ambion. Assay was done in duplicates.
TABLE-US-00016 TABLE 5b RNA dilution
RNA Dilution | ng/ul | Note
Dil 1 | 3000 | 1:3 dil
Dil 2 | 1000 | Serial dil
Dil 3 | 333 |
Dil 4 | 111 |
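The concentrations in Table 5b follow from a 1:3 serial dilution starting at 3000 ng/ul. A minimal sketch of that arithmetic is given below; the function name is hypothetical.

```python
def serial_dilution(start_ng_per_ul, fold, steps):
    """Concentration at each step of a serial dilution, where each
    step divides the previous concentration by `fold`."""
    return [start_ng_per_ul / fold ** step for step in range(steps)]


concentrations = [round(c) for c in serial_dilution(3000, 3, 4)]
print(concentrations)  # [3000, 1000, 333, 111]
```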
TABLE-US-00017 TABLE 6 Transcript Summary by Disease Prostate
Breast Colorectal Melanoma Lung Ovarian Testicular Probe Cancer
Cancer Cancer Skin Cancer Cancer Cancer Cancer 1 • • 2
• • • • • 3 • • •
• 4 • • 5 6 • • • 7 8 •
• 9 • 10 • • • 11 • •
• • 12 • • • 13 • 14 • 15
• • • 16 • • 17 20 • •
• •
Sequence CWU 1
1
52116568DNAHuman 1gatcacaggt ctatcaccct attaaccact cacgggagct
ctccatgcat ttggtatttt 60cgtctggggg gtatgcacgc gatagcattg cgagacgctg
gagccggagc accctatgtc 120gcagtatctg tctttgattc ctgcctcatc
ctattattta tcgcacctac gttcaatatt 180acaggcgaac atacttacta
aagtgtgtta attaattaat gcttgtagga cataataata 240acaattgaat
gtctgcacag ccactttcca cacagacatc ataacaaaaa atttccacca
300aaccccccct cccccgcttc tggccacagc acttaaacac atctctgcca
aaccccaaaa 360acaaagaacc ctaacaccag cctaaccaga tttcaaattt
tatcttttgg cggtatgcac 420ttttaacagt caccccccaa ctaacacatt
attttcccct cccactccca tactactaat 480ctcatcaata caacccccgc
ccatcctacc cagcacacac acaccgctgc taaccccata 540ccccgaacca
accaaacccc aaagacaccc cccacagttt atgtagctta cctcctcaaa
600gcaatacact gaaaatgttt agacgggctc acatcacccc ataaacaaat
aggtttggtc 660ctagcctttc tattagctct tagtaagatt acacatgcaa
gcatccccgt tccagtgagt 720tcaccctcta aatcaccacg atcaaaagga
acaagcatca agcacgcagc aatgcagctc 780aaaacgctta gcctagccac
acccccacgg gaaacagcag tgattaacct ttagcaataa 840acgaaagttt
aactaagcta tactaacccc agggttggtc aatttcgtgc cagccaccgc
900ggtcacacga ttaacccaag tcaatagaag ccggcgtaaa gagtgtttta
gatcaccccc 960tccccaataa agctaaaact cacctgagtt gtaaaaaact
ccagttgaca caaaatagac 1020tacgaaagtg gctttaacat atctgaacac
acaatagcta agacccaaac tgggattaga 1080taccccacta tgcttagccc
taaacctcaa cagttaaatc aacaaaactg ctcgccagaa 1140cactacgagc
cacagcttaa aactcaaagg acctggcggt gcttcatatc cctctagagg
1200agcctgttct gtaatcgata aaccccgatc aacctcacca cctcttgctc
agcctatata 1260ccgccatctt cagcaaaccc tgatgaaggc tacaaagtaa
gcgcaagtac ccacgtaaag 1320acgttaggtc aaggtgtagc ccatgaggtg
gcaagaaatg ggctacattt tctaccccag 1380aaaactacga tagcccttat
gaaacttaag ggtcgaaggt ggatttagca gtaaactaag 1440agtagagtgc
ttagttgaac agggccctga agcgcgtaca caccgcccgt caccctcctc
1500aagtatactt caaaggacat ttaactaaaa cccctacgca tttatataga
ggagacaagt 1560cgtaacatgg taagtgtact ggaaagtgca cttggacgaa
ccagagtgta gcttaacaca 1620aagcacccaa cttacactta ggagatttca
acttaacttg accgctctga gctaaaccta 1680gccccaaacc cactccacct
tactaccaga caaccttagc caaaccattt acccaaataa 1740agtataggcg
atagaaattg aaacctggcg caatagatat agtaccgcaa gggaaagatg
1800aaaaattata accaagcata atatagcaag gactaacccc tataccttct
gcataatgaa 1860ttaactagaa ataactttgc aaggagagcc aaagctaaga
cccccgaaac cagacgagct 1920acctaagaac agctaaaaga gcacacccgt
ctatgtagca aaatagtggg aagatttata 1980ggtagaggcg acaaacctac
cgagcctggt gatagctggt tgtccaagat agaatcttag 2040ttcaacttta
aatttgccca cagaaccctc taaatcccct tgtaaattta actgttagtc
2100caaagaggaa cagctctttg gacactagga aaaaaccttg tagagagagt
aaaaaattta 2160acacccatag taggcctaaa agcagccacc aattaagaaa
gcgttcaagc tcaacaccca 2220ctacctaaaa aatcccaaac atataactga
actcctcaca cccaattgga ccaatctatc 2280accctataga agaactaatg
ttagtataag taacatgaaa acattctcct ccgcataagc 2340ctgcgtcaga
ttaaaacact gaactgacaa ttaacagccc aatatctaca atcaaccaac
2400aagtcattat taccctcact gtcaacccaa cacaggcatg ctcataagga
aaggttaaaa 2460aaagtaaaag gaactcggca aatcttaccc cgcctgttta
ccaaaaacat cacctctagc 2520atcaccagta ttagaggcac cgcctgccca
gtgacacatg tttaacggcc gcggtaccct 2580aaccgtgcaa aggtagcata
atcacttgtt ccttaaatag ggacctgtat gaatggctcc 2640acgagggttc
agctgtctct tacttttaac cagtgaaatt gacctgcccg tgaagaggcg
2700ggcataacac agcaagacga gaagacccta tggagcttta atttattaat
gcaaacagta 2760cctaacaaac ccacaggtcc taaactacca aacctgcatt
aaaaatttcg gttggggcga 2820cctcggagca gaacccaacc tccgagcagt
acatgctaag acttcaccag tcaaagcgaa 2880ctactatact caattgatcc
aataacttga ccaacggaac aagttaccct agggataaca 2940gcgcaatcct
attctagagt ccatatcaac aatagggttt acgacctcga tgttggatca
3000ggacatcccg atggtgcagc cgctattaaa ggttcgtttg ttcaacgatt
aaagtcctac 3060gtgatctgag ttcagaccgg agtaatccag gtcggtttct
atctacttca aattcctccc 3120tgtacgaaag gacaagagaa ataaggccta
cttcacaaag cgccttcccc cgtaaatgat 3180atcatctcaa cttagtatta
tacccacacc cacccaagaa cagggtttgt taagatggca 3240gagcccggta
atcgcataaa acttaaaact ttacagtcag aggttcaatt cctcttctta
3300acaacatacc catggccaac ctcctactcc tcattgtacc cattctaatc
gcaatggcat 3360tcctaatgct taccgaacga aaaattctag gctatataca
actacgcaaa ggccccaacg 3420ttgtaggccc ctacgggcta ctacaaccct
tcgctgacgc cataaaactc ttcaccaaag 3480agcccctaaa acccgccaca
tctaccatca ccctctacat caccgccccg accttagctc 3540tcaccatcgc
tcttctacta tgaacccccc tccccatacc caaccccctg gtcaacctca
3600acctaggcct cctatttatt ctagccacct ctagcctagc cgtttactca
atcctctgat 3660cagggtgagc atcaaactca aactacgccc tgatcggcgc
actgcgagca gtagcccaaa 3720caatctcata tgaagtcacc ctagccatca
ttctactatc aacattacta ataagtggct 3780cctttaacct ctccaccctt
atcacaacac aagaacacct ctgattactc ctgccatcat 3840gacccttggc
cataatatga tttatctcca cactagcaga gaccaaccga acccccttcg
3900accttgccga aggggagtcc gaactagtct caggcttcaa catcgaatac
gccgcaggcc 3960ccttcgccct attcttcata gccgaataca caaacattat
tataataaac accctcacca 4020ctacaatctt cctaggaaca acatatgacg
cactctcccc tgaactctac acaacatatt 4080ttgtcaccaa gaccctactt
ctaacctccc tgttcttatg aattcgaaca gcataccccc 4140gattccgcta
cgaccaactc atacacctcc tatgaaaaaa cttcctacca ctcaccctag
4200cattacttat atgatatgtc tccataccca ttacaatctc cagcattccc
cctcaaacct 4260aagaaatatg tctgataaaa gagttacttt gatagagtaa
ataataggag cttaaacccc 4320cttatttcta ggactatgag aatcgaaccc
atccctgaga atccaaaatt ctccgtgcca 4380cctatcacac cccatcctaa
agtaaggtca gctaaataag ctatcgggcc cataccccga 4440aaatgttggt
tatacccttc ccgtactaat taatcccctg gcccaacccg tcatctactc
4500taccatcttt gcaggcacac tcatcacagc gctaagctcg cactgatttt
ttacctgagt 4560aggcctagaa ataaacatgc tagcttttat tccagttcta
accaaaaaaa taaaccctcg 4620ttccacagaa gctgccatca agtatttcct
cacgcaagca accgcatcca taatccttct 4680aatagctatc ctcttcaaca
atatactctc cggacaatga accataacca atactaccaa 4740tcaatactca
tcattaataa tcataatagc tatagcaata aaactaggaa tagccccctt
4800tcacttctga gtcccagagg ttacccaagg cacccctctg acatccggcc
tgcttcttct 4860cacatgacaa aaactagccc ccatctcaat catataccaa
atctctccct cactaaacgt 4920aagccttctc ctcactctct caatcttatc
catcatagca ggcagttgag gtggattaaa 4980ccaaacccag ctacgcaaaa
tcttagcata ctcctcaatt acccacatag gatgaataat 5040agcagttcta
ccgtacaacc ctaacataac cattcttaat ttaactattt atattatcct
5100aactactacc gcattcctac tactcaactt aaactccagc accacgaccc
tactactatc 5160tcgcacctga aacaagctaa catgactaac acccttaatt
ccatccaccc tcctctccct 5220aggaggcctg cccccgctaa ccggcttttt
gcccaaatgg gccattatcg aagaattcac 5280aaaaaacaat agcctcatca
tccccaccat catagccacc atcaccctcc ttaacctcta 5340cttctaccta
cgcctaatct actccacctc aatcacacta ctccccatat ctaacaacgt
5400aaaaataaaa tgacagtttg aacatacaaa acccacccca ttcctcccca
cactcatcgc 5460ccttaccacg ctactcctac ctatctcccc ttttatacta
ataatcttat agaaatttag 5520gttaaataca gaccaagagc cttcaaagcc
ctcagtaagt tgcaatactt aatttctgta 5580acagctaagg actgcaaaac
cccactctgc atcaactgaa cgcaaatcag ccactttaat 5640taagctaagc
ccttactaga ccaatgggac ttaaacccac aaacacttag ttaacagcta
5700agcaccctaa tcaactggct tcaatctact tctcccgccg ccgggaaaaa
aggcgggaga 5760agccccggca ggtttgaagc tgcttcttcg aatttgcaat
tcaatatgaa aatcacctcg 5820gagctggtaa aaagaggcct aacccctgtc
tttagattta cagtccaatg cttcactcag 5880ccattttacc tcacccccac
tgatgttcgc cgaccgttga ctattctcta caaaccacaa 5940agacattgga
acactatacc tattattcgg cgcatgagct ggagtcctag gcacagctct
6000aagcctcctt attcgagccg agctgggcca gccaggcaac cttctaggta
acgaccacat 6060ctacaacgtt atcgtcacag cccatgcatt tgtaataatc
ttcttcatag taatacccat 6120cataatcgga ggctttggca actgactagt
tcccctaata atcggtgccc ccgatatggc 6180gtttccccgc ataaacaaca
taagcttctg actcttacct ccctctctcc tactcctgct 6240cgcatctgct
atagtggagg ccggagcagg aacaggttga acagtctacc ctcccttagc
6300agggaactac tcccaccctg gagcctccgt agacctaacc atcttctcct
tacacctagc 6360aggtgtctcc tctatcttag gggccatcaa tttcatcaca
acaattatca atataaaacc 6420ccctgccata acccaatacc aaacgcccct
cttcgtctga tccgtcctaa tcacagcagt 6480cctacttctc ctatctctcc
cagtcctagc tgctggcatc actatactac taacagaccg 6540caacctcaac
accaccttct tcgaccccgc cggaggagga gaccccattc tataccaaca
6600cctattctga tttttcggtc accctgaagt ttatattctt atcctaccag
gcttcggaat 6660aatctcccat attgtaactt actactccgg aaaaaaagaa
ccatttggat acataggtat 6720ggtctgagct atgatatcaa ttggcttcct
agggtttatc gtgtgagcac accatatatt 6780tacagtagga atagacgtag
acacacgagc atatttcacc tccgctacca taatcatcgc 6840tatccccacc
ggcgtcaaag tatttagctg actcgccaca ctccacggaa gcaatatgaa
6900atgatctgct gcagtgctct gagccctagg attcatcttt cttttcaccg
taggtggcct 6960gactggcatt gtattagcaa actcatcact agacatcgta
ctacacgaca cgtactacgt 7020tgtagcccac ttccactatg tcctatcaat
aggagctgta tttgccatca taggaggctt 7080cattcactga tttcccctat
tctcaggcta caccctagac caaacctacg ccaaaatcca 7140tttcactatc
atattcatcg gcgtaaatct aactttcttc ccacaacact ttctcggcct
7200atccggaatg ccccgacgtt actcggacta ccccgatgca tacaccacat
gaaacatcct 7260atcatctgta ggctcattca tttctctaac agcagtaata
ttaataattt tcatgatttg 7320agaagccttc gcttcgaagc gaaaagtcct
aatagtagaa gaaccctcca taaacctgga 7380gtgactatat ggatgccccc
caccctacca cacattcgaa gaacccgtat acataaaatc 7440tagacaaaaa
aggaaggaat cgaacccccc aaagctggtt tcaagccaac cccatggcct
7500ccatgacttt ttcaaaaagg tattagaaaa accatttcat aactttgtca
aagttaaatt 7560ataggctaaa tcctatatat cttaatggca catgcagcgc
aagtaggtct acaagacgct 7620acttccccta tcatagaaga gcttatcacc
tttcatgatc acgccctcat aatcattttc 7680cttatctgct tcctagtcct
gtatgccctt ttcctaacac tcacaacaaa actaactaat 7740actaacatct
cagacgctca ggaaatagaa accgtctgaa ctatcctgcc cgccatcatc
7800ctagtcctca tcgccctccc atccctacgc atcctttaca taacagacga
ggtcaacgat 7860ccctccctta ccatcaaatc aattggccac caatggtact
gaacctacga gtacaccgac 7920tacggcggac taatcttcaa ctcctacata
cttcccccat tattcctaga accaggcgac 7980ctgcgactcc ttgacgttga
caatcgagta gtactcccga ttgaagcccc cattcgtata 8040ataattacat
cacaagacgt cttgcactca tgagctgtcc ccacattagg cttaaaaaca
8100gatgcaattc ccggacgtct aaaccaaacc actttcaccg ctacacgacc
gggggtatac 8160tacggtcaat gctctgaaat ctgtggagca aaccacagtt
tcatgcccat cgtcctagaa 8220ttaattcccc taaaaatctt tgaaataggg
cccgtattta ccctatagca ccccctctac 8280cccctctaga gcccactgta
aagctaactt agcattaacc ttttaagtta aagattaaga 8340gaaccaacac
ctctttacag tgaaatgccc caactaaata ctaccgtatg gcccaccata
8400attaccccca tactccttac actattcctc atcacccaac taaaaatatt
aaacacaaac 8460taccacctac ctccctcacc aaagcccata aaaataaaaa
attataacaa accctgagaa 8520ccaaaatgaa cgaaaatctg ttcgcttcat
tcattgcccc cacaatccta ggcctacccg 8580ccgcagtact gatcattcta
tttccccctc tattgatccc cacctccaaa tatctcatca 8640acaaccgact
aatcaccacc caacaatgac taatcaaact aacctcaaaa caaatgataa
8700ccatacacaa cactaaagga cgaacctgat ctcttatact agtatcctta
atcattttta 8760ttgccacaac taacctcctc ggactcctgc ctcactcatt
tacaccaacc acccaactat 8820ctataaacct agccatggcc atccccttat
gagcgggcac agtgattata ggctttcgct 8880ctaagattaa aaatgcccta
gcccacttct taccacaagg cacacctaca ccccttatcc 8940ccatactagt
tattatcgaa accatcagcc tactcattca accaatagcc ctggccgtac
9000gcctaaccgc taacattact gcaggccacc tactcatgca cctaattgga
agcgccaccc 9060tagcaatatc aaccattaac cttccctcta cacttatcat
cttcacaatt ctaattctac 9120tgactatcct agaaatcgct gtcgccttaa
tccaagccta cgttttcaca cttctagtaa 9180gcctctacct gcacgacaac
acataatgac ccaccaatca catgcctatc atatagtaaa 9240acccagccca
tgacccctaa caggggccct ctcagccctc ctaatgacct ccggcctagc
9300catgtgattt cacttccact ccataacgct cctcatacta ggcctactaa
ccaacacact 9360aaccatatac caatgatggc gcgatgtaac acgagaaagc
acataccaag gccaccacac 9420accacctgtc caaaaaggcc ttcgatacgg
gataatccta tttattacct cagaagtttt 9480tttcttcgca ggatttttct
gagcctttta ccactccagc ctagccccta ccccccaatt 9540aggagggcac
tggcccccaa caggcatcac cccgctaaat cccctagaag tcccactcct
9600aaacacatcc gtattactcg catcaggagt atcaatcacc tgagctcacc
atagtctaat 9660agaaaacaac cgaaaccaaa taattcaagc actgcttatt
acaattttac tgggtctcta 9720ttttaccctc ctacaagcct cagagtactt
cgagtctccc ttcaccattt ccgacggcat 9780ctacggctca acattttttg
tagccacagg cttccacgga cttcacgtca ttattggctc 9840aactttcctc
actatctgct tcatccgcca actaatattt cactttacat ccaaacatca
9900ctttggcttc gaagccgccg cctgatactg gcattttgta gatgtggttt
gactatttct 9960gtatgtctcc atctattgat gagggtctta ctcttttagt
ataaatagta ccgttaactt 10020ccaattaact agttttgaca acattcaaaa
aagagtaata aacttcgcct taattttaat 10080aatcaacacc ctcctagcct
tactactaat aattattaca ttttgactac cacaactcaa 10140cggctacata
gaaaaatcca ccccttacga gtgcggcttc gaccctatat cccccgcccg
10200cgtccctttc tccataaaat tcttcttagt agctattacc ttcttattat
ttgatctaga 10260aattgccctc cttttacccc taccatgagc cctacaaaca
actaacctgc cactaatagt 10320tatgtcatcc ctcttattaa tcatcatcct
agccctaagt ctggcctatg agtgactaca 10380aaaaggatta gactgaaccg
aattggtata tagtttaaac aaaacgaatg atttcgactc 10440attaaattat
gataatcata tttaccaaat gcccctcatt tacataaata ttatactagc
10500atttaccatc tcacttctag gaatactagt atatcgctca cacctcatat
cctccctact 10560atgcctagaa ggaataatac tatcgctgtt cattatagct
actctcataa ccctcaacac 10620ccactccctc ttagccaata ttgtgcctat
tgccatacta gtctttgccg cctgcgaagc 10680agcggtgggc ctagccctac
tagtctcaat ctccaacaca tatggcctag actacgtaca 10740taacctaaac
ctactccaat gctaaaacta atcgtcccaa caattatatt actaccactg
10800acatgacttt ccaaaaaaca cataatttga atcaacacaa ccacccacag
cctaattatt 10860agcatcatcc ctctactatt ttttaaccaa atcaacaaca
acctatttag ctgttcccca 10920accttttcct ccgaccccct aacaaccccc
ctcctaatac taactacctg actcctaccc 10980ctcacaatca tggcaagcca
acgccactta tccagtgaac cactatcacg aaaaaaactc 11040tacctctcta
tactaatctc cctacaaatc tccttaatta taacattcac agccacagaa
11100ctaatcatat tttatatctt cttcgaaacc acacttatcc ccaccttggc
tatcatcacc 11160cgatgaggca accagccaga acgcctgaac gcaggcacat
acttcctatt ctacacccta 11220gtaggctccc ttcccctact catcgcacta
atttacactc acaacaccct aggctcacta 11280aacattctac tactcactct
cactgcccaa gaactatcaa actcctgagc caacaactta 11340atatgactag
cttacacaat agcttttata gtaaagatac ctctttacgg actccactta
11400tgactcccta aagcccatgt cgaagccccc atcgctgggt caatagtact
tgccgcagta 11460ctcttaaaac taggcggcta tggtataata cgcctcacac
tcattctcaa ccccctgaca 11520aaacacatag cctacccctt ccttgtacta
tccctatgag gcataattat aacaagctcc 11580atctgcctac gacaaacaga
cctaaaatcg ctcattgcat actcttcaat cagccacata 11640gccctcgtag
taacagccat tctcatccaa accccctgaa gcttcaccgg cgcagtcatt
11700ctcataatcg cccacgggct tacatcctca ttactattct gcctagcaaa
ctcaaactac 11760gaacgcactc acagtcgcat cataatcctc tctcaaggac
ttcaaactct actcccacta 11820atagcttttt gatgacttct agcaagcctc
gctaacctcg ccttaccccc cactattaac 11880ctactgggag aactctctgt
gctagtaacc acgttctcct gatcaaatat cactctccta 11940cttacaggac
tcaacatact agtcacagcc ctatactccc tctacatatt taccacaaca
12000caatggggct cactcaccca ccacattaac aacataaaac cctcattcac
acgagaaaac 12060accctcatgt tcatacacct atcccccatt ctcctcctat
ccctcaaccc cgacatcatt 12120accgggtttt cctcttgtaa atatagttta
accaaaacat cagattgtga atctgacaac 12180agaggcttac gaccccttat
ttaccgagaa agctcacaag aactgctaac tcatgccccc 12240atgtctaaca
acatggcttt ctcaactttt aaaggataac agctatccat tggtcttagg
12300ccccaaaaat tttggtgcaa ctccaaataa aagtaataac catgcacact
actataacca 12360ccctaaccct gacttcccta attcccccca tccttaccac
cctcgttaac cctaacaaaa 12420aaaactcata cccccattat gtaaaatcca
ttgtcgcatc cacctttatt atcagtctct 12480tccccacaac aatattcatg
tgcctagacc aagaagttat tatctcgaac tgacactgag 12540ccacaaccca
aacaacccag ctctccctaa gcttcaaact agactacttc tccataatat
12600tcatccctgt agcattgttc gttacatggt ccatcataga attctcactg
tgatatataa 12660actcagaccc aaacattaat cagttcttca aatatctact
catcttccta attaccatac 12720taatcttagt taccgctaac aacctattcc
aactgttcat cggctgagag ggcgtaggaa 12780ttatatcctt cttgctcatc
agttgatgat acgcccgagc agatgccaac acagcagcca 12840ttcaagcaat
cctatacaac cgtatcggcg atatcggttt catcctcgcc ttagcatgat
12900ttatcctaca ctccaactca tgagacccac aacaaatagc ccttctaaac
gctaatccaa 12960gcctcacccc actactaggc ctcctcctag cagcagcagg
caaatcagcc caattaggtc 13020tccacccctg actcccctca gccatagaag
gccccacccc agtctcagcc ctactccact 13080caagcactat agttgtagca
ggaatcttct tactcatccg cttccacccc ctagcagaaa 13140atagcccact
aatccaaact ctaacactat gcttaggcgc tatcaccact ctgttcgcag
13200cagtctgcgc ccttacacaa aatgacatca aaaaaatcgt agccttctcc
acttcaagtc 13260aactaggact cataatagtt acaatcggca tcaaccaacc
acacctagca ttcctgcaca 13320tctgtaccca cgccttcttc aaagccatac
tatttatgtg ctccgggtcc atcatccaca 13380accttaacaa tgaacaagat
attcgaaaaa taggaggact actcaaaacc atacctctca 13440cttcaacctc
cctcaccatt ggcagcctag cattagcagg aatacctttc ctcacaggtt
13500tctactccaa agaccacatc atcgaaaccg caaacatatc atacacaaac
gcctgagccc 13560tatctattac tctcatcgct acctccctga caagcgccta
tagcactcga ataattcttc 13620tcaccctaac aggtcaacct cgcttcccca
cccttactaa cattaacgaa aataacccca 13680ccctactaaa ccccattaaa
cgcctggcag ccggaagcct attcgcagga tttctcatta 13740ctaacaacat
ttcccccgca tcccccttcc aaacaacaat ccccctctac ctaaaactca
13800cagccctcgc tgtcactttc ctaggacttc taacagccct agacctcaac
tacctaacca 13860acaaacttaa aataaaatcc ccactatgca cattttattt
ctccaacata ctcggattct 13920accctagcat cacacaccgc acaatcccct
atctaggcct tcttacgagc caaaacctgc 13980ccctactcct cctagaccta
acctgactag aaaagctatt acctaaaaca atttcacagc 14040accaaatctc
cacctccatc atcacctcaa cccaaaaagg cataattaaa ctttacttcc
14100tctctttctt cttcccactc atcctaaccc tactcctaat cacataacct
attcccccga 14160gcaatctcaa ttacaatata tacaccaaca aacaatgttc
aaccagtaac tactactaat 14220caacgcccat aatcatacaa agcccccgca
ccaataggat cctcccgaat caaccctgac 14280ccctctcctt cataaattat
tcagcttcct acactattaa agtttaccac aaccaccacc 14340ccatcatact
ctttcaccca cagcaccaat cctacctcca tcgctaaccc cactaaaaca
14400ctcaccaaga cctcaacccc tgacccccat gcctcaggat actcctcaat
agccatcgct 14460gtagtatatc caaagacaac catcattccc cctaaataaa
ttaaaaaaac tattaaaccc 14520atataacctc ccccaaaatt cagaataata
acacacccga ccacaccgct aacaatcaat 14580actaaacccc cataaatagg
agaaggctta gaagaaaacc ccacaaaccc cattactaaa 14640cccacactca
acagaaacaa agcatacatc attattctcg cacggactac aaccacgacc
14700aatgatatga aaaaccatcg ttgtatttca actacaagaa caccaatgac
cccaatacgc 14760aaaactaacc ccctaataaa attaattaac cactcattca
tcgacctccc caccccatcc 14820aacatctccg catgatgaaa cttcggctca
ctccttggcg cctgcctgat cctccaaatc 14880accacaggac tattcctagc
catgcactac tcaccagacg cctcaaccgc cttttcatca 14940atcgcccaca
tcactcgaga cgtaaattat ggctgaatca tccgctacct tcacgccaat
15000ggcgcctcaa tattctttat ctgcctcttc ctacacatcg ggcgaggcct
atattacgga 15060tcatttctct actcagaaac ctgaaacatc ggcattatcc
tcctgcttgc aactatagca 15120acagccttca taggctatgt cctcccgtga
ggccaaatat cattctgagg ggccacagta 15180attacaaact tactatccgc
catcccatac attgggacag acctagttca atgaatctga 15240ggaggctact
cagtagacag tcccaccctc acacgattct ttacctttca cttcatcttg
15300cccttcatta ttgcagccct agcaacactc cacctcctat tcttgcacga
aacgggatca 15360aacaaccccc taggaatcac ctcccattcc gataaaatca
ccttccaccc ttactacaca 15420atcaaagacg ccctcggctt acttctcttc
cttctctcct taatgacatt aacactattc 15480tcaccagacc tcctaggcga
cccagacaat tataccctag ccaacccctt aaacacccct 15540ccccacatca
agcccgaatg atatttccta ttcgcctaca caattctccg atccgtccct
15600aacaaactag gaggcgtcct tgccctatta ctatccatcc tcatcctagc
aataatcccc 15660atcctccata tatccaaaca acaaagcata atatttcgcc
cactaagcca atcactttat 15720tgactcctag ccgcagacct cctcattcta
acctgaatcg gaggacaacc agtaagctac 15780ccttttacca tcattggaca
agtagcatcc gtactatact tcacaacaat cctaatccta 15840ataccaacta
tctccctaat tgaaaacaaa atactcaaat gggcctgtcc ttgtagtata
15900aactaataca ccagtcttgt aaaccggaga tgaaaacctt tttccaagga
caaatcagag 15960aaaaagtctt taactccacc attagcaccc aaagctaaga
ttctaattta aactattctc 16020tgttctttca tggggaagca gatttgggta
ccacccaagt attgactcac ccatcaacaa 16080ccgctatgta tttcgtacat
tactgccagc caccatgaat attgtacggt accataaata 16140cttgaccacc
tgtagtacat aaaaacccaa tccacatcaa aaccccctcc ccatgcttac
16200aagcaagtac agcaatcaac cctcaactat cacacatcaa ctgcaactcc
aaagccaccc 16260ctcacccact aggataccaa caaacctacc cacccttaac
agtacatagt acataaagcc 16320atttaccgta catagcacat tacagtcaaa
tcccttctcg tccccatgga tgacccccct 16380cagatagggg tcccttgacc
accatcctcc gtgaaatcaa tatcccgcac aagagtgcta 16440ctctcctcgc
tccgggccca taacacttgg gggtagctaa agtgaactgt atccgacatc
16500tggttcctac ttcagggtca taaagcctaa atagcccaca cgttcccctt
aaataagaca 16560tcacgatg 16568
2 783 DNA Artificial cDNA 2
atggcccacc
ataattaccc ccatactcct tacactattc ctcatcaccc aactaaaaat 60attaaacaca
aactaccacc tacctccctc accattggca gcctagcatt agcaggaata
120cctttcctca caggtttcta ctccaaagac cacatcatcg aaaccgcaaa
catatcatac 180acaaacgcct gagccctatc tattactctc atcgctacct
ccctgacaag cgcctatagc 240actcgaataa ttcttctcac cctaacaggt
caacctcgct tccccaccct tactaacatt 300aacgaaaata accccaccct
actaaacccc attaaacgcc tggcagccgg aagcctattc 360gcaggatttc
tcattactaa caacatttcc cccgcatccc ccttccaaac aacaatcccc
420ctctacctaa aactcacagc cctcgctgtc actttcctag gacttctaac
agccctagac 480ctcaactacc taaccaacaa acttaaaata aaatccccac
tatgcacatt ttatttctcc 540aacatactcg gattctaccc tagcatcaca
caccgcacaa tcccctatct aggccttctt 600acgagccaaa acctgcccct
actcctccta gacctaacct gactagaaaa gctattacct 660aaaacaattt
cacagcacca aatctccacc tccatcatca cctcaaccca aaaaggcata
720attaaacttt acttcctctc tttcttcttc ccactcatcc taaccctact
cctaatcaca 780taa 7833300DNAArtificialcDNA 3atgcccctca tttacataaa
tattatacta gcatttacca tctcacttct aggaatacta 60gtatatcgct cacacctcat
atcctcccta ctatgcctag aaggaataat actatcgctg 120ttcattatag
ctactctcat aaccctcaac acccactccc tcttagccaa tattgtgcct
180attgccatac tagtctttgc cgcctgcgaa gcagcggtgg gcctagccct
actagtctca 240atctccaaca catatggcct agactacgta cataacctaa
ccctactcct aatcacataa 300
4 781 DNA Artificial cDNA 4
atggcacatg
cagcgcaagt aggtctacaa gacgctactt cccctatcat agaagagctt 60atcacctttc
atgatcacgc cctcataatc attttcctta tctgcttcct agtcctgtat
120gcccttttcc taacactcac aacaaaacta actaatacta acatctcaga
cgctcaggaa 180atagaaaccg tctgaactat cctgcccgcc atcatcctag
tcctcatcgc cctcccatcc 240ctacgcatcc tttacataac agacgaggtc
aacgatccct cccttaccat caaatcaatt 300ggccaccaat ggtactgaac
ctacgagtac accgactacg gcggactaat cttcaactcc 360tacatacttc
ccccattatt cctagaacca ggcgacccag acaattatac cctagccaac
420cccttaaaca cccctcccca catcaagccc gaatgatatt tcctattcgc
ctacacaatt 480ctccgatccg tccctaacaa actaggaggc gtccttgccc
tattactatc catcctcatc 540ctagcaataa tccccatcct ccatatatcc
aaacaacaaa gcataatatt tcgcccacta 600agccaatcac tttattgact
cctagccgca gacctcctca ttctaacctg aatcggagga 660caaccagtaa
gctacccttt taccatcatt ggacaagtag catccgtact atacttcaca
720acaatcctaa tcctaatacc aactatctcc ctaattgaaa acaaaatact
caaatgggcc 780t 7815565DNAArtificialcDNA 5atggcacatg cagcgcaagt
aggtctacaa gacgctactt cccctatcat agaagagctt 60atcacctttc atgatcacgc
cctcataatc attttcctta tctgcttcct agtcctgtat 120gcccttttcc
taacactcac aacaaaacta actaatacta acatctcaga cgctcaggaa
180atagaaaccg tctgaactat cctgcccgcc atcatcctag tcctcatcgc
cctcccatcc 240ctacgcatcc tttacataac agacgaggtc aacgatccct
cccttaccat caaatcaatt 300ggccaccaat ggtactgaac ctacgagtac
accgactacg gcggactaat cttcaactcc 360tacatacttc ccccattatt
cctagaacca ggcgacctgc gactcctagc cgcagacctc 420ctcattctaa
cctgaatcgg aggacaacca gtaagctacc cttttaccat cattggacaa
480gtagcatccg tactatactt cacaacaatc ctaatcctaa taccaactat
ctccctaatt 540gaaaacaaaa tactcaaatg ggcct 56561174DNAArtificialcDNA
6atggcacatg cagcgcaagt aggtctacaa gacgctactt cccctatcat agaagagctt
60atcacctttc atgatcacgc cctcataatc attttcctta tctgcttcct agtcctgtat
120gcccttttcc taacactcac aacaaaacta actaatacta acatctcaga
cgctcaggaa 180atagaaaccg tctgaactat cctgcccgcc atcatcctag
tcctcatcgc cctcccatcc 240ctacgcatcc tttacataac agacgaggtc
aacgatccct cccttaccat caaatcaatt 300ggccaccaat ggtactgaac
ctacgagtac accgactacg gcggactaat cttcaactcc 360tacatacttc
ccccattatt cctagaacca ggcgacctgc gactccttga cgttgacaat
420cgagtagtac tcccgattga agcccccatt cgtataataa ttacatcaca
agacgtcttg 480cactcatgag ctgtccccac attaggctta aaaacagatg
caattcccgg acgtctaaac 540caaaccactt tcaccgctac acgaccgggg
gtatactacg gtcaatgctc tgaaatctgt 600ggagcaaacc acagtttcat
gcccatattc ttgcacgaaa cgggatcaaa caacccccta 660ggaatcacct
cccattccga taaaatcacc ttccaccctt actacacaat caaagacgcc
720ctcggcttac ttctcttcct tctctcctta atgacattaa cactattctc
accagacctc 780ctaggcgacc cagacaatta taccctagcc aaccccttaa
acacccctcc ccacatcaag 840cccgaatgat atttcctatt cgcctacaca
attctccgat ccgtccctaa caaactagga 900ggcgtccttg ccctattact
atccatcctc atcctagcaa taatccccat cctccatata 960tccaaacaac
aaagcataat atttcgccca ctaagccaat cactttattg actcctagcc
1020gcagacctcc tcattctaac ctgaatcgga ggacaaccag taagctaccc
ttttaccatc 1080attggacaag tagcatccgt actatacttc acaacaatcc
taatcctaat accaactatc 1140tccctaattg aaaacaaaat actcaaatgg gcct
1174
7 1294 DNA Artificial cDNA 7
atgaacgaaa atctgttcgc ttcattcatt
gcccccacaa tcctaggcct acccgccgca 60gtactgatca ttctatttcc ccctctattg
atccccacct ccaaatatct catcaacaac 120cgactaatca ccacccaaca
atgactaatc aaactaacct caaaacaaat gataaccata 180cacaacacta
aaggacgaac ctgatctctt atactagtat ccttaatcat ttttattgcc
240acaactaacc tcctcggact cctgcctcac tcatttacac caaccaccca
actatctata 300aacctagcca tgcactactc accagacgcc tcaaccgcct
tttcatcaat cgcccacatc 360actcgagacg taaattatgg ctgaatcatc
cgctaccttc acgccaatgg cgcctcaata 420ttctttatct gcctcttcct
acacatcggg cgaggcctat attacggatc atttctctac 480tcagaaacct
gaaacatcgg cattatcctc ctgcttgcaa ctatagcaac agccttcata
540ggctatgtcc tcccgtgagg ccaaatatca ttctgagggg ccacagtaat
tacaaactta 600ctatccgcca tcccatacat tgggacagac ctagttcaat
gaatctgagg aggctactca 660gtagacagtc ccaccctcac acgattcttt
acctttcact tcatcttgcc cttcattatt 720gcagccctag caacactcca
cctcctattc ttgcacgaaa cgggatcaaa caacccccta 780ggaatcacct
cccattccga taaaatcacc ttccaccctt actacacaat caaagacgcc
840ctcggcttac ttctcttcct tctctcctta atgacattaa cactattctc
accagacctc 900ctaggcgacc cagacaatta taccctagcc aaccccttaa
acacccctcc ccacatcaag 960cccgaatgat atttcctatt cgcctacaca
attctccgat ccgtccctaa caaactagga 1020ggcgtccttg ccctattact
atccatcctc atcctagcaa taatccccat cctccatata 1080tccaaacaac
aaagcataat atttcgccca ctaagccaat cactttattg actcctagcc
1140gcagacctcc tcattctaac ctgaatcgga ggacaaccag taagctaccc
ttttaccatc 1200attggacaag tagcatccgt actatacttc acaacaatcc
taatcctaat accaactatc 1260tccctaattg aaaacaaaat actcaaatgg gcct
1294
8 1228 DNA Artificial cDNA 8
atgcccctca tttacataaa tattatacta
gcatttacca tctcacttct aggaatacta 60gtatatcgct cacacctcat atcctcccta
ctatgcctag aaggaataat actatcgctg 120ttcattatag ctactctcat
aaccctcaac acccactccc tcttagccaa tattgtgcct 180attgccatac
tagtctttgg cgcctgcctg atcctccaaa tcaccacagg actattccta
240gccatgcact actcaccaga cgcctcaacc gccttttcat caatcgccca
catcactcga 300gacgtaaatt atggctgaat catccgctac cttcacgcca
atggcgcctc aatattcttt 360atctgcctct tcctacacat cgggcgaggc
ctatattacg gatcatttct ctactcagaa 420acctgaaaca tcggcattat
cctcctgctt gcaactatag caacagcctt cataggctat 480gtcctcccgt
gaggccaaat atcattctga ggggccacag taattacaaa cttactatcc
540gccatcccat acattgggac agacctagtt caatgaatct gaggaggcta
ctcagtagac 600agtcccaccc tcacacgatt ctttaccttt cacttcatct
tgcccttcat tattgcagcc 660ctagcaacac tccacctcct attcttgcac
gaaacgggat caaacaaccc cctaggaatc 720acctcccatt ccgataaaat
caccttccac ccttactaca caatcaaaga cgccctcggc 780ttacttctct
tccttctctc cttaatgaca ttaacactat tctcaccaga cctcctaggc
840gacccagaca attataccct agccaacccc ttaaacaccc ctccccacat
caagcccgaa 900tgatatttcc tattcgccta cacaattctc cgatccgtcc
ctaacaaact aggaggcgtc 960cttgccctat tactatccat cctcatccta
gcaataatcc ccatcctcca tatatccaaa 1020caacaaagca taatatttcg
cccactaagc caatcacttt attgactcct agccgcagac 1080ctcctcattc
taacctgaat cggaggacaa ccagtaagct acccttttac catcattgga
1140caagtagcat ccgtactata cttcacaaca atcctaatcc taataccaac
tatctcccta 1200attgaaaaca aaatactcaa atgggcct
1228
9 522 DNA Artificial cDNA 9
atgttcgccg accgttgact attctctaca
aaccacaaag acattggaac actataccta 60ttattcggcg catgagctgg agtcctaggc
acagctctaa gcctccttat tcgagccgag 120ctgggccagc caggcaacct
tctaggtaac gaccacatct acaacgttat cgtcacagcc 180ctcgctgtca
ctttcctagg acttctaaca gccctagacc tcaactacct aaccaacaaa
240cttaaaataa aatccccact atgcacattt tatttctcca acatactcgg
attctaccct 300agcatcacac accgcacaat cccctatcta ggccttctta
cgagccaaaa cctgccccta 360ctcctcctag acctaacctg actagaaaag
ctattaccta aaacaatttc acagcaccaa 420atctccacct ccatcatcac
ctcaacccaa aaaggcataa ttaaacttta cttcctctct 480ttcttcttcc
cactcatcct aaccctactc ctaatcacat aa 522
10 582 DNA Artificial cDNA 10
atgttcgccg accgttgact attctctaca aaccacaaag acattggaac actataccta
60ttattcggcg catgagctgg agtcctaggc acagctctaa gcctccttat tcgagccgag
120ctgggccagc caggcaacct tctaggtaac gaccacatct acaacgttat
cgtcacagcc 180catgcatttg taataatctt cttcatagta atacccatca
taatcggagg ctttggcaac 240tgactagttc ccctaataat cggtgccccc
gatatggcgt ttccccgcat aaacaacata 300agcttctgac tcttacctcc
ctctctccta ctcctgctcg catctgctat agtggaggcc 360ggagcaggaa
caggttgaac agtctaccct cccttagcag ggaactactc ccaccctgga
420gccctcctag acctaacctg actagaaaag ctattaccta aaacaatttc
acagcaccaa 480atctccacct ccatcatcac ctcaacccaa aaaggcataa
ttaaacttta cttcctctct 540ttcttcttcc cactcatcct aaccctactc
ctaatcacat aa 582
11 2208 DNA Artificial cDNA 11
atgttcgccg accgttgact
attctctaca aaccacaaag acattggaac actataccta 60ttattcggcg catgagctgg
agtcctaggc acagctctaa gcctccttat tcgagccgag 120ctgggccagc
caggcaacct tctaggtaac gaccacatct acaacgttat cgtcacagcc
180catgcatttg taataatctt cttcatagta atacccatca taatcggagg
ctttggcaac 240tgactagttc ccctaataat cggtgccccc gatatggcgt
ttccccgcat aaacaacata 300agcttctgac tcttacctcc ctctctccta
ctcctgctcg catctgctat agtggaggcc 360ggagcaggaa caggttgaac
agtctaccct cccttagcag ggaactactc ccaccctgga 420gcctccgtag
acctaaccat cttctcctta cacctagcag gtgtctcctc tatcttaggg
480gccatcaatt tcatcacaac aattatcaat ataaaacccc ctgccataac
ccaataccaa 540acgcccctct tcgtctgatc cgtcctaatc acagcagtcc
tacttctcct atctctccca 600gtcctagctg ctggcatcac tatactacta
acagaccgca acctcaacac caccttcttc 660gaccccgccg gaggaggaga
ccccattcta taccaacacc tattctgatt tttcggtcac 720cctgaagttt
atattcttat cctaccaggc ttcggaataa tctcccatat tgtaacttac
780tactccggaa aaaaagaacc atttggatac ataggtatgg tctgagctat
gatatcaatt 840ggcttcctag ggtttatcgt gtgagcacac catatattta
cagtaggaat agacgtagac 900acacgagcat atttcacctc cgctaccata
atcatcgcta tccccaccgg cgtcaaagta 960tttagctgac tcgccacact
ccacggaagc aatatgaaat gatctgctgc agtgctctga 1020gccctaggat
tcatctttct tttcaccgta ggtggcctga ctggcattgt attagcaaac
1080tcatcactag acatcgtact acacgacacg tactacgttg tagcccactt
ccactatgtc 1140ctatcaatag gagctgtatt tgccatcata ggaggcttca
ttcactgatt tcccctattc 1200tcaggctaca ccctagacca aacctacgcc
aaaatccatt tcactatcat attcatcggc 1260gtaaatctaa ctttcttccc
acaacacttt ctcggcctat ccggaatgcc ccgacgttac 1320tcggactacc
ccgatgcata caccacatga aacatcctat catctgtagg ctcattcatt
1380tctctaacag cagtaatatt aataattttc atgatttgag aagccttcgc
ttcgaagcga 1440aaagtcctaa tagtagaaga accctccata aacctggagt
gactatatgg atgcccccca 1500ccctaccaca cattcgaaga acccgtatac
ataaaagcag gaataccttt cctcacaggt 1560ttctactcca aagaccacat
catcgaaacc gcaaacatat catacacaaa cgcctgagcc 1620ctatctatta
ctctcatcgc tacctccctg acaagcgcct atagcactcg aataattctt
1680ctcaccctaa caggtcaacc tcgcttcccc acccttacta acattaacga
aaataacccc 1740accctactaa accccattaa acgcctggca gccggaagcc
tattcgcagg atttctcatt 1800actaacaaca tttcccccgc atcccccttc
caaacaacaa tccccctcta cctaaaactc 1860acagccctcg ctgtcacttt
cctaggactt ctaacagccc tagacctcaa ctacctaacc 1920aacaaactta
aaataaaatc cccactatgc acattttatt tctccaacat actcggattc
1980taccctagca tcacacaccg cacaatcccc tatctaggcc ttcttacgag
ccaaaacctg 2040cccctactcc tcctagacct aacctgacta gaaaagctat
tacctaaaac aatttcacag 2100caccaaatct ccacctccat catcacctca
acccaaaaag gcataattaa actttacttc 2160ctctctttct tcttcccact
catcctaacc ctactcctaa tcacataa 2208
12 807 DNA Artificial cDNA 12
atggcacatg cagcgcaagt aggtctacaa gacgctactt cccctatcat agaagagctt
60atcacctttc atgatcacgc cctcataatc attttcctta tctgcttcct agtcctgtat
120gcccttttcc taacactcac aacaaaacta actaatacta acatctcaga
cgctcaggaa 180atagaaaccg caaacatatc atacacaaac gcctgagccc
tatctattac tctcatcgct 240acctccctga caagcgccta tagcactcga
ataattcttc tcaccctaac aggtcaacct 300cgcttcccca cccttactaa
cattaacgaa aataacccca ccctactaaa ccccattaaa 360cgcctggcag
ccggaagcct attcgcagga tttctcatta ctaacaacat ttcccccgca
420tcccccttcc aaacaacaat ccccctctac ctaaaactca cagccctcgc
tgtcactttc 480ctaggacttc taacagccct agacctcaac tacctaacca
acaaacttaa aataaaatcc 540ccactatgca cattttattt ctccaacata
ctcggattct accctagcat cacacaccgc 600acaatcccct atctaggcct
tcttacgagc caaaacctgc ccctactcct cctagaccta 660acctgactag
aaaagctatt acctaaaaca atttcacagc accaaatctc cacctccatc
720atcacctcaa cccaaaaagg cataattaaa ctttacttcc tctctttctt
cttcccactc 780atcctaaccc tactcctaat cacataa
807
13 786 DNA Artificial cDNA 13
atggcacatg cagcgcaagt aggtctacaa
gacgctactt cccctatcat agaagagctt 60atcacctttc atgatcacgc cctcataatc
attttcctta tctgcttcct agtcctgtat 120gcccttttcc taacactcac
aacaaaacta actaatacta acatctcaga cgctcaggaa 180atagaaaccg
tctgaactat cctgcccgcc atcatcctag tcctcatcgc cctcccatcc
240ctacgcatcc tttacataac agacgaggtc aacgatccct cccttaccat
caaatcaatt 300ggccaccaat ggtactgaac ctacgagtac accgactacg
gcggactaat cttcaactcc 360tacatacttc ccccattatt cctagaacca
ggcgacctgc gactccttga cgttgacaat 420cgagtagtac tcccgattga
agcccccatt cgtataataa ttacatcaca agacgtcttg 480cactcatgag
ctgtccccac attaggctta aaaacagatg caattcccgg acgtctaaac
540caaaccactt tcaccgctac acgaccgggg gtatactacg gtcaatgctc
tgaaatctgt 600ggagcaaacc acagtttcat gcccatcgtc ctagacctaa
cctgactaga aaagctatta 660cctaaaacaa tttcacagca ccaaatctcc
acctccatca tcacctcaac ccaaaaaggc 720ataattaaac tttacttcct
ctctttcttc ttcccactca tcctaaccct actcctaatc 780acataa
786
14 1905 DNA Artificial cDNA 14
atgaacgaaa atctgttcgc ttcattcatt
gcccccacaa tcctaggcct acccgccgca 60gtactgatca ttctatttcc ccctctattg
atccccacct ccaaatatct catcaacaac 120cgactaatca ccacccaaca
atgactaatc aaactaacct caaaacaaat gataaccata 180cacaacacta
aaggacgaac ctgatctctt atactagtat ccttaatcat ttttattgcc
240acaactaacc tcctcggact cctgcctcac tcatttacac caaccaccca
actatctata 300aacctagcca tggccatccc cttatgagcg ggcacagtga
ttataggctt tcgctctaag 360attaaaaatg ccctagccca cttcttacca
caaggcacac ctacacccct tatccccata 420ctagttatta tcgaaaccat
cagcctactc attcaaccaa tagccctggc cgtacgccta 480accgctaaca
ttactgcagg ccacctactc atgcacctaa ttggaagcgc caccctagca
540atatcaacca ttaaccttcc ctctacactt atcatcttca caattctaat
tctactgact 600atcctagaaa tcgctgtcgc cttaatccaa gcctacgttt
tcacacttct agtaagcctc 660tacctacact ccaactcatg agacccacaa
caaatagccc ttctaaacgc taatccaagc 720ctcaccccac tactaggcct
cctcctagca gcagcaggca aatcagccca attaggtctc 780cacccctgac
tcccctcagc catagaaggc cccaccccag tctcagccct actccactca
840agcactatag ttgtagcagg aatcttctta ctcatccgct tccaccccct
agcagaaaat 900agcccactaa tccaaactct aacactatgc ttaggcgcta
tcaccactct gttcgcagca 960gtctgcgccc ttacacaaaa tgacatcaaa
aaaatcgtag ccttctccac ttcaagtcaa 1020ctaggactca taatagttac
aatcggcatc aaccaaccac acctagcatt cctgcacatc 1080tgtacccacg
ccttcttcaa agccatacta tttatgtgct ccgggtccat catccacaac
1140cttaacaatg aacaagatat tcgaaaaata ggaggactac tcaaaaccat
acctctcact 1200tcaacctccc tcaccattgg cagcctagca ttagcaggaa
tacctttcct cacaggtttc 1260tactccaaag accacatcat cgaaaccgca
aacatatcat acacaaacgc ctgagcccta 1320tctattactc tcatcgctac
ctccctgaca agcgcctata gcactcgaat aattcttctc 1380accctaacag
gtcaacctcg cttccccacc cttactaaca ttaacgaaaa taaccccacc
1440ctactaaacc ccattaaacg cctggcagcc ggaagcctat tcgcaggatt
tctcattact 1500aacaacattt cccccgcatc ccccttccaa acaacaatcc
ccctctacct aaaactcaca 1560gccctcgctg tcactttcct aggacttcta
acagccctag acctcaacta cctaaccaac 1620aaacttaaaa taaaatcccc
actatgcaca ttttatttct ccaacatact cggattctac 1680cctagcatca
cacaccgcac aatcccctat ctaggccttc ttacgagcca aaacctgccc
1740ctactcctcc tagacctaac ctgactagaa aagctattac ctaaaacaat ttcacagcac
1800caaatctcca cctccatcat cacctcaacc caaaaaggca taattaaact
ttacttcctc 1860tctttcttct tcccactcat cctaacccta ctcctaatca cataa
1905
15 1545 DNA Artificial cDNA 15
atgacccacc aatcacatgc ctatcatata
gtaaaaccca gcccatgacc cctaacaggg 60gccctctcag ccctcctaat gacctccggc
ctagccatgt gatttcactt ccactccata 120acgctcctca tactaggcct
actaaccaac acactaacca tataccaatg atggcgcgat 180gtaacacgag
aaagcacata ccaaggccac cacacaccac ctgtccaaaa aggccttcga
240tacgggataa tcctatttat tacctcagaa gtttttttct tcgcaggatt
tttctgagcc 300ttttaccact ccagcctagc ccctaccccc caattaggag
ggcactggcc cccaacaggc 360atcaccccac tactaggcct cctcctagca
gcagcaggca aatcagccca attaggtctc 420cacccctgac tcccctcagc
catagaaggc cccaccccag tctcagccct actccactca 480agcactatag
ttgtagcagg aatcttctta ctcatccgct tccaccccct agcagaaaat
540agcccactaa tccaaactct aacactatgc ttaggcgcta tcaccactct
gttcgcagca 600gtctgcgccc ttacacaaaa tgacatcaaa aaaatcgtag
ccttctccac ttcaagtcaa 660ctaggactca taatagttac aatcggcatc
aaccaaccac acctagcatt cctgcacatc 720tgtacccacg ccttcttcaa
agccatacta tttatgtgct ccgggtccat catccacaac 780cttaacaatg
aacaagatat tcgaaaaata ggaggactac tcaaaaccat acctctcact
840tcaacctccc tcaccattgg cagcctagca ttagcaggaa tacctttcct
cacaggtttc 900tactccaaag accacatcat cgaaaccgca aacatatcat
acacaaacgc ctgagcccta 960tctattactc tcatcgctac ctccctgaca
agcgcctata gcactcgaat aattcttctc 1020accctaacag gtcaacctcg
cttccccacc cttactaaca ttaacgaaaa taaccccacc 1080ctactaaacc
ccattaaacg cctggcagcc ggaagcctat tcgcaggatt tctcattact
1140aacaacattt cccccgcatc ccccttccaa acaacaatcc ccctctacct
aaaactcaca 1200gccctcgctg tcactttcct aggacttcta acagccctag
acctcaacta cctaaccaac 1260aaacttaaaa taaaatcccc actatgcaca
ttttatttct ccaacatact cggattctac 1320cctagcatca cacaccgcac
aatcccctat ctaggccttc ttacgagcca aaacctgccc 1380ctactcctcc
tagacctaac ctgactagaa aagctattac ctaaaacaat ttcacagcac
1440caaatctcca cctccatcat cacctcaacc caaaaaggca taattaaact
ttacttcctc 1500tctttcttct tcccactcat cctaacccta ctcctaatca cataa
1545
16 1629 DNA Artificial cDNA 16
ataaacttcg ccttaatttt aataatcaac
accctcctag ccttactact aataattatt 60acattttgac taccacaact caacggctac
atagaaaaat ccacccctta cgagtgcggc 120ttcgacccta tatcccccgc
ccgcgtccct ttctccataa aattcttctt agtagctatt 180accttcttat
tatttgatct agaaattgcc ctccttttac ccctaccatg agccctacaa
240acaactaacc tgccactaat agttatgtca tccctcttat taatcatcat
cctagcccta 300agtctggcca acacagcagc cattcaagca atcctataca
accgtatcgg cgatatcggt 360ttcatcctcg ccttagcatg atttatccta
cactccaact catgagaccc acaacaaata 420gcccttctaa acgctaatcc
aagcctcacc ccactactag gcctcctcct agcagcagca 480ggcaaatcag
cccaattagg tctccacccc tgactcccct cagccataga aggccccacc
540ccagtctcag ccctactcca ctcaagcact atagttgtag caggaatctt
cttactcatc 600cgcttccacc ccctagcaga aaatagccca ctaatccaaa
ctctaacact atgcttaggc 660gctatcacca ctctgttcgc agcagtctgc
gcccttacac aaaatgacat caaaaaaatc 720gtagccttct ccacttcaag
tcaactagga ctcataatag ttacaatcgg catcaaccaa 780ccacacctag
cattcctgca catctgtacc cacgccttct tcaaagccat actatttatg
840tgctccgggt ccatcatcca caaccttaac aatgaacaag atattcgaaa
aataggagga 900ctactcaaaa ccatacctct cacttcaacc tccctcacca
ttggcagcct agcattagca 960ggaatacctt tcctcacagg tttctactcc
aaagaccaca tcatcgaaac cgcaaacata 1020tcatacacaa acgcctgagc
cctatctatt actctcatcg ctacctccct gacaagcgcc 1080tatagcactc
gaataattct tctcacccta acaggtcaac ctcgcttccc cacccttact
1140aacattaacg aaaataaccc caccctacta aaccccatta aacgcctggc
agccggaagc 1200ctattcgcag gatttctcat tactaacaac atttcccccg
catccccctt ccaaacaaca 1260atccccctct acctaaaact cacagccctc
gctgtcactt tcctaggact tctaacagcc 1320ctagacctca actacctaac
caacaaactt aaaataaaat ccccactatg cacattttat 1380ttctccaaca
tactcggatt ctaccctagc atcacacacc gcacaatccc ctatctaggc
1440cttcttacga gccaaaacct gcccctactc ctcctagacc taacctgact
agaaaagcta 1500ttacctaaaa caatttcaca gcaccaaatc tccacctcca
tcatcacctc aacccaaaaa 1560ggcataatta aactttactt cctctctttc
ttcttcccac tcatcctaac cctactccta 1620atcacataa
1629
17 129 DNA Artificial cDNA 17
atgccccaac taaatactac cgtatggccc
accataatta cccccatact ccttacacta 60ttcctcatca cccaactaaa aatattaaac
acaaactacc acctacctcc ctcaccattg 120gcagcctag 129
18 783 RNA Human 18
auggcccacc auaauuaccc ccauacuccu uacacuauuc cucaucaccc aacuaaaaau
60auuaaacaca aacuaccacc uaccucccuc accauuggca gccuagcauu agcaggaaua
120ccuuuccuca cagguuucua cuccaaagac cacaucaucg aaaccgcaaa
cauaucauac 180acaaacgccu gagcccuauc uauuacucuc aucgcuaccu
cccugacaag cgccuauagc 240acucgaauaa uucuucucac ccuaacaggu
caaccucgcu uccccacccu uacuaacauu 300aacgaaaaua accccacccu
acuaaacccc auuaaacgcc uggcagccgg aagccuauuc 360gcaggauuuc
ucauuacuaa caacauuucc cccgcauccc ccuuccaaac aacaaucccc
420cucuaccuaa aacucacagc ccucgcuguc acuuuccuag gacuucuaac
agcccuagac 480cucaacuacc uaaccaacaa acuuaaaaua aaauccccac
uaugcacauu uuauuucucc 540aacauacucg gauucuaccc uagcaucaca
caccgcacaa uccccuaucu aggccuucuu 600acgagccaaa accugccccu
acuccuccua gaccuaaccu gacuagaaaa gcuauuaccu 660aaaacaauuu
cacagcacca aaucuccacc uccaucauca ccucaaccca aaaaggcaua
720auuaaacuuu acuuccucuc uuucuucuuc ccacucaucc uaacccuacu
ccuaaucaca 780uaa 783
19 300 RNA Human 19
augccccuca uuuacauaaa
uauuauacua gcauuuacca ucucacuucu aggaauacua 60guauaucgcu cacaccucau
auccucccua cuaugccuag aaggaauaau acuaucgcug 120uucauuauag
cuacucucau aacccucaac acccacuccc ucuuagccaa uauugugccu
180auugccauac uagucuuugc cgccugcgaa gcagcggugg gccuagcccu
acuagucuca 240aucuccaaca cauauggccu agacuacgua cauaaccuaa
cccuacuccu aaucacauaa 300
20 781 RNA Human 20
auggcacaug cagcgcaagu
aggucuacaa gacgcuacuu ccccuaucau agaagagcuu 60aucaccuuuc augaucacgc
ccucauaauc auuuuccuua ucugcuuccu aguccuguau 120gcccuuuucc
uaacacucac aacaaaacua acuaauacua acaucucaga cgcucaggaa
180auagaaaccg ucugaacuau ccugcccgcc aucauccuag uccucaucgc
ccucccaucc 240cuacgcaucc uuuacauaac agacgagguc aacgaucccu
cccuuaccau caaaucaauu 300ggccaccaau gguacugaac cuacgaguac
accgacuacg gcggacuaau cuucaacucc 360uacauacuuc ccccauuauu
ccuagaacca ggcgacccag acaauuauac ccuagccaac 420cccuuaaaca
ccccucccca caucaagccc gaaugauauu uccuauucgc cuacacaauu
480cuccgauccg ucccuaacaa acuaggaggc guccuugccc uauuacuauc
cauccucauc 540cuagcaauaa uccccauccu ccauauaucc aaacaacaaa
gcauaauauu ucgcccacua 600agccaaucac uuuauugacu ccuagccgca
gaccuccuca uucuaaccug aaucggagga 660caaccaguaa gcuacccuuu
uaccaucauu ggacaaguag cauccguacu auacuucaca 720acaauccuaa
uccuaauacc aacuaucucc cuaauugaaa acaaaauacu caaaugggcc 780u
781
SEQ ID NO 21; LENGTH: 565; TYPE: RNA; ORGANISM: Human
SEQUENCE: 21
auggcacaug cagcgcaagu aggucuacaa gacgcuacuu
ccccuaucau agaagagcuu 60aucaccuuuc augaucacgc ccucauaauc auuuuccuua
ucugcuuccu aguccuguau 120gcccuuuucc uaacacucac aacaaaacua
acuaauacua acaucucaga cgcucaggaa 180auagaaaccg ucugaacuau
ccugcccgcc aucauccuag uccucaucgc ccucccaucc 240cuacgcaucc
uuuacauaac agacgagguc aacgaucccu cccuuaccau caaaucaauu
300ggccaccaau gguacugaac cuacgaguac accgacuacg gcggacuaau
cuucaacucc 360uacauacuuc ccccauuauu ccuagaacca ggcgaccugc
gacuccuagc cgcagaccuc 420cucauucuaa ccugaaucgg aggacaacca
guaagcuacc cuuuuaccau cauuggacaa 480guagcauccg uacuauacuu
cacaacaauc cuaauccuaa uaccaacuau cucccuaauu 540gaaaacaaaa
uacucaaaug ggccu 565
SEQ ID NO 22; LENGTH: 1174; TYPE: RNA; ORGANISM: Human
SEQUENCE: 22
auggcacaug cagcgcaagu
aggucuacaa gacgcuacuu ccccuaucau agaagagcuu 60aucaccuuuc augaucacgc
ccucauaauc auuuuccuua ucugcuuccu aguccuguau 120gcccuuuucc
uaacacucac aacaaaacua acuaauacua acaucucaga cgcucaggaa
180auagaaaccg ucugaacuau ccugcccgcc aucauccuag uccucaucgc
ccucccaucc 240cuacgcaucc uuuacauaac agacgagguc aacgaucccu
cccuuaccau caaaucaauu 300ggccaccaau gguacugaac cuacgaguac
accgacuacg gcggacuaau cuucaacucc 360uacauacuuc ccccauuauu
ccuagaacca ggcgaccugc gacuccuuga cguugacaau 420cgaguaguac
ucccgauuga agcccccauu cguauaauaa uuacaucaca agacgucuug
480cacucaugag cuguccccac auuaggcuua aaaacagaug caauucccgg
acgucuaaac 540caaaccacuu ucaccgcuac acgaccgggg guauacuacg
gucaaugcuc ugaaaucugu 600ggagcaaacc acaguuucau gcccauauuc
uugcacgaaa cgggaucaaa caacccccua 660ggaaucaccu cccauuccga
uaaaaucacc uuccacccuu acuacacaau caaagacgcc 720cucggcuuac
uucucuuccu ucucuccuua augacauuaa cacuauucuc accagaccuc
780cuaggcgacc cagacaauua uacccuagcc aaccccuuaa acaccccucc
ccacaucaag 840cccgaaugau auuuccuauu cgccuacaca auucuccgau
ccgucccuaa caaacuagga 900ggcguccuug cccuauuacu auccauccuc
auccuagcaa uaauccccau ccuccauaua 960uccaaacaac aaagcauaau
auuucgccca cuaagccaau cacuuuauug acuccuagcc 1020gcagaccucc
ucauucuaac cugaaucgga ggacaaccag uaagcuaccc uuuuaccauc
1080auuggacaag uagcauccgu acuauacuuc acaacaaucc uaauccuaau
accaacuauc 1140ucccuaauug aaaacaaaau acucaaaugg gccu
1174
SEQ ID NO 23; LENGTH: 1294; TYPE: RNA; ORGANISM: Human
SEQUENCE: 23
augaacgaaa aucuguucgc uucauucauu gcccccacaa
uccuaggccu acccgccgca 60guacugauca uucuauuucc cccucuauug auccccaccu
ccaaauaucu caucaacaac 120cgacuaauca ccacccaaca augacuaauc
aaacuaaccu caaaacaaau gauaaccaua 180cacaacacua aaggacgaac
cugaucucuu auacuaguau ccuuaaucau uuuuauugcc 240acaacuaacc
uccucggacu ccugccucac ucauuuacac caaccaccca acuaucuaua
300aaccuagcca ugcacuacuc accagacgcc ucaaccgccu uuucaucaau
cgcccacauc 360acucgagacg uaaauuaugg cugaaucauc cgcuaccuuc
acgccaaugg cgccucaaua 420uucuuuaucu gccucuuccu acacaucggg
cgaggccuau auuacggauc auuucucuac 480ucagaaaccu gaaacaucgg
cauuauccuc cugcuugcaa cuauagcaac agccuucaua 540ggcuaugucc
ucccgugagg ccaaauauca uucugagggg ccacaguaau uacaaacuua
600cuauccgcca ucccauacau ugggacagac cuaguucaau gaaucugagg
aggcuacuca 660guagacaguc ccacccucac acgauucuuu accuuucacu
ucaucuugcc cuucauuauu 720gcagcccuag caacacucca ccuccuauuc
uugcacgaaa cgggaucaaa caacccccua 780ggaaucaccu cccauuccga
uaaaaucacc uuccacccuu acuacacaau caaagacgcc 840cucggcuuac
uucucuuccu ucucuccuua augacauuaa cacuauucuc accagaccuc
900cuaggcgacc cagacaauua uacccuagcc aaccccuuaa acaccccucc
ccacaucaag 960cccgaaugau auuuccuauu cgccuacaca auucuccgau
ccgucccuaa caaacuagga 1020ggcguccuug cccuauuacu auccauccuc
auccuagcaa uaauccccau ccuccauaua 1080uccaaacaac aaagcauaau
auuucgccca cuaagccaau cacuuuauug acuccuagcc 1140gcagaccucc
ucauucuaac cugaaucgga ggacaaccag uaagcuaccc uuuuaccauc
1200auuggacaag uagcauccgu acuauacuuc acaacaaucc uaauccuaau
accaacuauc 1260ucccuaauug aaaacaaaau acucaaaugg gccu
1294
SEQ ID NO 24; LENGTH: 1228; TYPE: RNA; ORGANISM: Human
SEQUENCE: 24
augccccuca uuuacauaaa uauuauacua gcauuuacca
ucucacuucu aggaauacua 60guauaucgcu cacaccucau auccucccua cuaugccuag
aaggaauaau acuaucgcug 120uucauuauag cuacucucau aacccucaac
acccacuccc ucuuagccaa uauugugccu 180auugccauac uagucuuugg
cgccugccug auccuccaaa ucaccacagg acuauuccua 240gccaugcacu
acucaccaga cgccucaacc gccuuuucau caaucgccca caucacucga
300gacguaaauu auggcugaau cauccgcuac cuucacgcca auggcgccuc
aauauucuuu 360aucugccucu uccuacacau cgggcgaggc cuauauuacg
gaucauuucu cuacucagaa 420accugaaaca ucggcauuau ccuccugcuu
gcaacuauag caacagccuu cauaggcuau 480guccucccgu gaggccaaau
aucauucuga ggggccacag uaauuacaaa cuuacuaucc 540gccaucccau
acauugggac agaccuaguu caaugaaucu gaggaggcua cucaguagac
600agucccaccc ucacacgauu cuuuaccuuu cacuucaucu ugcccuucau
uauugcagcc 660cuagcaacac uccaccuccu auucuugcac gaaacgggau
caaacaaccc ccuaggaauc 720accucccauu ccgauaaaau caccuuccac
ccuuacuaca caaucaaaga cgcccucggc 780uuacuucucu uccuucucuc
cuuaaugaca uuaacacuau ucucaccaga ccuccuaggc 840gacccagaca
auuauacccu agccaacccc uuaaacaccc cuccccacau caagcccgaa
900ugauauuucc uauucgccua cacaauucuc cgauccgucc cuaacaaacu
aggaggcguc 960cuugcccuau uacuauccau ccucauccua gcaauaaucc
ccauccucca uauauccaaa 1020caacaaagca uaauauuucg cccacuaagc
caaucacuuu auugacuccu agccgcagac 1080cuccucauuc uaaccugaau
cggaggacaa ccaguaagcu acccuuuuac caucauugga 1140caaguagcau
ccguacuaua cuucacaaca auccuaaucc uaauaccaac uaucucccua
1200auugaaaaca aaauacucaa augggccu 1228
SEQ ID NO 25; LENGTH: 522; TYPE: RNA; ORGANISM: Human
SEQUENCE: 25
auguucgccg
accguugacu auucucuaca aaccacaaag acauuggaac acuauaccua 60uuauucggcg
caugagcugg aguccuaggc acagcucuaa gccuccuuau ucgagccgag
120cugggccagc caggcaaccu ucuagguaac gaccacaucu acaacguuau
cgucacagcc 180cucgcuguca cuuuccuagg acuucuaaca gcccuagacc
ucaacuaccu aaccaacaaa 240cuuaaaauaa aauccccacu augcacauuu
uauuucucca acauacucgg auucuacccu 300agcaucacac accgcacaau
ccccuaucua ggccuucuua cgagccaaaa ccugccccua 360cuccuccuag
accuaaccug acuagaaaag cuauuaccua aaacaauuuc acagcaccaa
420aucuccaccu ccaucaucac cucaacccaa aaaggcauaa uuaaacuuua
cuuccucucu 480uucuucuucc cacucauccu aacccuacuc cuaaucacau aa
522
SEQ ID NO 26; LENGTH: 582; TYPE: RNA; ORGANISM: Human
SEQUENCE: 26
auguucgccg accguugacu auucucuaca aaccacaaag
acauuggaac acuauaccua 60uuauucggcg caugagcugg aguccuaggc acagcucuaa
gccuccuuau ucgagccgag 120cugggccagc caggcaaccu ucuagguaac
gaccacaucu acaacguuau cgucacagcc 180caugcauuug uaauaaucuu
cuucauagua auacccauca uaaucggagg cuuuggcaac 240ugacuaguuc
cccuaauaau cggugccccc gauauggcgu uuccccgcau aaacaacaua
300agcuucugac ucuuaccucc cucucuccua cuccugcucg caucugcuau
aguggaggcc 360ggagcaggaa cagguugaac agucuacccu cccuuagcag
ggaacuacuc ccacccugga 420gcccuccuag accuaaccug acuagaaaag
cuauuaccua aaacaauuuc acagcaccaa 480aucuccaccu ccaucaucac
cucaacccaa aaaggcauaa uuaaacuuua cuuccucucu 540uucuucuucc
cacucauccu aacccuacuc cuaaucacau aa 582
SEQ ID NO 27; LENGTH: 2208; TYPE: RNA; ORGANISM: Human
SEQUENCE: 27
auguucgccg
accguugacu auucucuaca aaccacaaag acauuggaac acuauaccua 60uuauucggcg
caugagcugg aguccuaggc acagcucuaa gccuccuuau ucgagccgag
120cugggccagc caggcaaccu ucuagguaac gaccacaucu acaacguuau
cgucacagcc 180caugcauuug uaauaaucuu cuucauagua auacccauca
uaaucggagg cuuuggcaac 240ugacuaguuc cccuaauaau cggugccccc
gauauggcgu uuccccgcau aaacaacaua 300agcuucugac ucuuaccucc
cucucuccua cuccugcucg caucugcuau aguggaggcc 360ggagcaggaa
cagguugaac agucuacccu cccuuagcag ggaacuacuc ccacccugga
420gccuccguag accuaaccau cuucuccuua caccuagcag gugucuccuc
uaucuuaggg 480gccaucaauu ucaucacaac aauuaucaau auaaaacccc
cugccauaac ccaauaccaa 540acgccccucu ucgucugauc cguccuaauc
acagcagucc uacuucuccu aucucuccca 600guccuagcug cuggcaucac
uauacuacua acagaccgca accucaacac caccuucuuc 660gaccccgccg
gaggaggaga ccccauucua uaccaacacc uauucugauu uuucggucac
720ccugaaguuu auauucuuau ccuaccaggc uucggaauaa ucucccauau
uguaacuuac 780uacuccggaa aaaaagaacc auuuggauac auagguaugg
ucugagcuau gauaucaauu 840ggcuuccuag gguuuaucgu gugagcacac
cauauauuua caguaggaau agacguagac 900acacgagcau auuucaccuc
cgcuaccaua aucaucgcua uccccaccgg cgucaaagua 960uuuagcugac
ucgccacacu ccacggaagc aauaugaaau gaucugcugc agugcucuga
1020gcccuaggau ucaucuuucu uuucaccgua gguggccuga cuggcauugu
auuagcaaac 1080ucaucacuag acaucguacu acacgacacg uacuacguug
uagcccacuu ccacuauguc 1140cuaucaauag gagcuguauu ugccaucaua
ggaggcuuca uucacugauu uccccuauuc 1200ucaggcuaca cccuagacca
aaccuacgcc aaaauccauu ucacuaucau auucaucggc 1260guaaaucuaa
cuuucuuccc acaacacuuu cucggccuau ccggaaugcc ccgacguuac
1320ucggacuacc ccgaugcaua caccacauga aacauccuau caucuguagg
cucauucauu 1380ucucuaacag caguaauauu aauaauuuuc augauuugag
aagccuucgc uucgaagcga 1440aaaguccuaa uaguagaaga acccuccaua
aaccuggagu gacuauaugg augcccccca 1500cccuaccaca cauucgaaga
acccguauac auaaaagcag gaauaccuuu ccucacaggu 1560uucuacucca
aagaccacau caucgaaacc gcaaacauau cauacacaaa cgccugagcc
1620cuaucuauua cucucaucgc uaccucccug acaagcgccu auagcacucg
aauaauucuu 1680cucacccuaa caggucaacc ucgcuucccc acccuuacua
acauuaacga aaauaacccc 1740acccuacuaa accccauuaa acgccuggca
gccggaagcc uauucgcagg auuucucauu 1800acuaacaaca uuucccccgc
aucccccuuc caaacaacaa ucccccucua ccuaaaacuc 1860acagcccucg
cugucacuuu ccuaggacuu cuaacagccc uagaccucaa cuaccuaacc
1920aacaaacuua aaauaaaauc cccacuaugc acauuuuauu ucuccaacau
acucggauuc 1980uacccuagca ucacacaccg cacaaucccc uaucuaggcc
uucuuacgag ccaaaaccug 2040ccccuacucc uccuagaccu aaccugacua
gaaaagcuau uaccuaaaac aauuucacag 2100caccaaaucu ccaccuccau
caucaccuca acccaaaaag gcauaauuaa acuuuacuuc 2160cucucuuucu
ucuucccacu cauccuaacc cuacuccuaa ucacauaa 2208
SEQ ID NO 28; LENGTH: 807; TYPE: RNA; ORGANISM: Human
SEQUENCE: 28
auggcacaug cagcgcaagu aggucuacaa gacgcuacuu ccccuaucau agaagagcuu
60aucaccuuuc augaucacgc ccucauaauc auuuuccuua ucugcuuccu aguccuguau
120gcccuuuucc uaacacucac aacaaaacua acuaauacua acaucucaga
cgcucaggaa 180auagaaaccg caaacauauc auacacaaac gccugagccc
uaucuauuac ucucaucgcu 240accucccuga caagcgccua uagcacucga
auaauucuuc ucacccuaac aggucaaccu 300cgcuucccca cccuuacuaa
cauuaacgaa aauaacccca cccuacuaaa ccccauuaaa 360cgccuggcag
ccggaagccu auucgcagga uuucucauua cuaacaacau uucccccgca
420ucccccuucc aaacaacaau cccccucuac cuaaaacuca cagcccucgc
ugucacuuuc 480cuaggacuuc uaacagcccu agaccucaac uaccuaacca
acaaacuuaa aauaaaaucc 540ccacuaugca cauuuuauuu cuccaacaua
cucggauucu acccuagcau cacacaccgc 600acaauccccu aucuaggccu
ucuuacgagc caaaaccugc cccuacuccu ccuagaccua 660accugacuag
aaaagcuauu accuaaaaca auuucacagc accaaaucuc caccuccauc
720aucaccucaa cccaaaaagg cauaauuaaa cuuuacuucc ucucuuucuu
cuucccacuc 780auccuaaccc uacuccuaau cacauaa 807
SEQ ID NO 29; LENGTH: 786; TYPE: RNA; ORGANISM: Human
SEQUENCE: 29
auggcacaug cagcgcaagu aggucuacaa gacgcuacuu ccccuaucau agaagagcuu
60aucaccuuuc augaucacgc ccucauaauc auuuuccuua ucugcuuccu aguccuguau
120gcccuuuucc uaacacucac aacaaaacua acuaauacua acaucucaga
cgcucaggaa 180auagaaaccg ucugaacuau ccugcccgcc aucauccuag
uccucaucgc ccucccaucc 240cuacgcaucc uuuacauaac agacgagguc
aacgaucccu cccuuaccau caaaucaauu 300ggccaccaau gguacugaac
cuacgaguac accgacuacg gcggacuaau cuucaacucc 360uacauacuuc
ccccauuauu ccuagaacca ggcgaccugc gacuccuuga cguugacaau
420cgaguaguac ucccgauuga agcccccauu cguauaauaa uuacaucaca
agacgucuug 480cacucaugag cuguccccac auuaggcuua aaaacagaug
caauucccgg acgucuaaac 540caaaccacuu ucaccgcuac acgaccgggg
guauacuacg gucaaugcuc ugaaaucugu 600ggagcaaacc acaguuucau
gcccaucguc cuagaccuaa ccugacuaga
aaagcuauua 660ccuaaaacaa uuucacagca ccaaaucucc accuccauca
ucaccucaac ccaaaaaggc 720auaauuaaac uuuacuuccu cucuuucuuc
uucccacuca uccuaacccu acuccuaauc 780acauaa 786
SEQ ID NO 30; LENGTH: 1905; TYPE: RNA; ORGANISM: Human
SEQUENCE: 30
augaacgaaa aucuguucgc uucauucauu gcccccacaa uccuaggccu acccgccgca
60guacugauca uucuauuucc cccucuauug auccccaccu ccaaauaucu caucaacaac
120cgacuaauca ccacccaaca augacuaauc aaacuaaccu caaaacaaau
gauaaccaua 180cacaacacua aaggacgaac cugaucucuu auacuaguau
ccuuaaucau uuuuauugcc 240acaacuaacc uccucggacu ccugccucac
ucauuuacac caaccaccca acuaucuaua 300aaccuagcca uggccauccc
cuuaugagcg ggcacaguga uuauaggcuu ucgcucuaag 360auuaaaaaug
cccuagccca cuucuuacca caaggcacac cuacaccccu uauccccaua
420cuaguuauua ucgaaaccau cagccuacuc auucaaccaa uagcccuggc
cguacgccua 480accgcuaaca uuacugcagg ccaccuacuc augcaccuaa
uuggaagcgc cacccuagca 540auaucaacca uuaaccuucc cucuacacuu
aucaucuuca caauucuaau ucuacugacu 600auccuagaaa ucgcugucgc
cuuaauccaa gccuacguuu ucacacuucu aguaagccuc 660uaccuacacu
ccaacucaug agacccacaa caaauagccc uucuaaacgc uaauccaagc
720cucaccccac uacuaggccu ccuccuagca gcagcaggca aaucagccca
auuaggucuc 780caccccugac uccccucagc cauagaaggc cccaccccag
ucucagcccu acuccacuca 840agcacuauag uuguagcagg aaucuucuua
cucauccgcu uccacccccu agcagaaaau 900agcccacuaa uccaaacucu
aacacuaugc uuaggcgcua ucaccacucu guucgcagca 960gucugcgccc
uuacacaaaa ugacaucaaa aaaaucguag ccuucuccac uucaagucaa
1020cuaggacuca uaauaguuac aaucggcauc aaccaaccac accuagcauu
ccugcacauc 1080uguacccacg ccuucuucaa agccauacua uuuaugugcu
ccggguccau cauccacaac 1140cuuaacaaug aacaagauau ucgaaaaaua
ggaggacuac ucaaaaccau accucucacu 1200ucaaccuccc ucaccauugg
cagccuagca uuagcaggaa uaccuuuccu cacagguuuc 1260uacuccaaag
accacaucau cgaaaccgca aacauaucau acacaaacgc cugagcccua
1320ucuauuacuc ucaucgcuac cucccugaca agcgccuaua gcacucgaau
aauucuucuc 1380acccuaacag gucaaccucg cuuccccacc cuuacuaaca
uuaacgaaaa uaaccccacc 1440cuacuaaacc ccauuaaacg ccuggcagcc
ggaagccuau ucgcaggauu ucucauuacu 1500aacaacauuu cccccgcauc
ccccuuccaa acaacaaucc cccucuaccu aaaacucaca 1560gcccucgcug
ucacuuuccu aggacuucua acagcccuag accucaacua ccuaaccaac
1620aaacuuaaaa uaaaaucccc acuaugcaca uuuuauuucu ccaacauacu
cggauucuac 1680ccuagcauca cacaccgcac aauccccuau cuaggccuuc
uuacgagcca aaaccugccc 1740cuacuccucc uagaccuaac cugacuagaa
aagcuauuac cuaaaacaau uucacagcac 1800caaaucucca ccuccaucau
caccucaacc caaaaaggca uaauuaaacu uuacuuccuc 1860ucuuucuucu
ucccacucau ccuaacccua cuccuaauca cauaa 1905
SEQ ID NO 31; LENGTH: 1545; TYPE: RNA; ORGANISM: Human
SEQUENCE: 31
augacccacc aaucacaugc cuaucauaua guaaaaccca gcccaugacc ccuaacaggg
60gcccucucag cccuccuaau gaccuccggc cuagccaugu gauuucacuu ccacuccaua
120acgcuccuca uacuaggccu acuaaccaac acacuaacca uauaccaaug
auggcgcgau 180guaacacgag aaagcacaua ccaaggccac cacacaccac
cuguccaaaa aggccuucga 240uacgggauaa uccuauuuau uaccucagaa
guuuuuuucu ucgcaggauu uuucugagcc 300uuuuaccacu ccagccuagc
cccuaccccc caauuaggag ggcacuggcc cccaacaggc 360aucaccccac
uacuaggccu ccuccuagca gcagcaggca aaucagccca auuaggucuc
420caccccugac uccccucagc cauagaaggc cccaccccag ucucagcccu
acuccacuca 480agcacuauag uuguagcagg aaucuucuua cucauccgcu
uccacccccu agcagaaaau 540agcccacuaa uccaaacucu aacacuaugc
uuaggcgcua ucaccacucu guucgcagca 600gucugcgccc uuacacaaaa
ugacaucaaa aaaaucguag ccuucuccac uucaagucaa 660cuaggacuca
uaauaguuac aaucggcauc aaccaaccac accuagcauu ccugcacauc
720uguacccacg ccuucuucaa agccauacua uuuaugugcu ccggguccau
cauccacaac 780cuuaacaaug aacaagauau ucgaaaaaua ggaggacuac
ucaaaaccau accucucacu 840ucaaccuccc ucaccauugg cagccuagca
uuagcaggaa uaccuuuccu cacagguuuc 900uacuccaaag accacaucau
cgaaaccgca aacauaucau acacaaacgc cugagcccua 960ucuauuacuc
ucaucgcuac cucccugaca agcgccuaua gcacucgaau aauucuucuc
1020acccuaacag gucaaccucg cuuccccacc cuuacuaaca uuaacgaaaa
uaaccccacc 1080cuacuaaacc ccauuaaacg ccuggcagcc ggaagccuau
ucgcaggauu ucucauuacu 1140aacaacauuu cccccgcauc ccccuuccaa
acaacaaucc cccucuaccu aaaacucaca 1200gcccucgcug ucacuuuccu
aggacuucua acagcccuag accucaacua ccuaaccaac 1260aaacuuaaaa
uaaaaucccc acuaugcaca uuuuauuucu ccaacauacu cggauucuac
1320ccuagcauca cacaccgcac aauccccuau cuaggccuuc uuacgagcca
aaaccugccc 1380cuacuccucc uagaccuaac cugacuagaa aagcuauuac
cuaaaacaau uucacagcac 1440caaaucucca ccuccaucau caccucaacc
caaaaaggca uaauuaaacu uuacuuccuc 1500ucuuucuucu ucccacucau
ccuaacccua cuccuaauca cauaa 1545
SEQ ID NO 32; LENGTH: 1629; TYPE: RNA; ORGANISM: Human
SEQUENCE: 32
auaaacuucg
ccuuaauuuu aauaaucaac acccuccuag ccuuacuacu aauaauuauu 60acauuuugac
uaccacaacu caacggcuac auagaaaaau ccaccccuua cgagugcggc
120uucgacccua uaucccccgc ccgcgucccu uucuccauaa aauucuucuu
aguagcuauu 180accuucuuau uauuugaucu agaaauugcc cuccuuuuac
cccuaccaug agcccuacaa 240acaacuaacc ugccacuaau aguuauguca
ucccucuuau uaaucaucau ccuagcccua 300agucuggcca acacagcagc
cauucaagca auccuauaca accguaucgg cgauaucggu 360uucauccucg
ccuuagcaug auuuauccua cacuccaacu caugagaccc acaacaaaua
420gcccuucuaa acgcuaaucc aagccucacc ccacuacuag gccuccuccu
agcagcagca 480ggcaaaucag cccaauuagg ucuccacccc ugacuccccu
cagccauaga aggccccacc 540ccagucucag cccuacucca cucaagcacu
auaguuguag caggaaucuu cuuacucauc 600cgcuuccacc cccuagcaga
aaauagccca cuaauccaaa cucuaacacu augcuuaggc 660gcuaucacca
cucuguucgc agcagucugc gcccuuacac aaaaugacau caaaaaaauc
720guagccuucu ccacuucaag ucaacuagga cucauaauag uuacaaucgg
caucaaccaa 780ccacaccuag cauuccugca caucuguacc cacgccuucu
ucaaagccau acuauuuaug 840ugcuccgggu ccaucaucca caaccuuaac
aaugaacaag auauucgaaa aauaggagga 900cuacucaaaa ccauaccucu
cacuucaacc ucccucacca uuggcagccu agcauuagca 960ggaauaccuu
uccucacagg uuucuacucc aaagaccaca ucaucgaaac cgcaaacaua
1020ucauacacaa acgccugagc ccuaucuauu acucucaucg cuaccucccu
gacaagcgcc 1080uauagcacuc gaauaauucu ucucacccua acaggucaac
cucgcuuccc cacccuuacu 1140aacauuaacg aaaauaaccc cacccuacua
aaccccauua aacgccuggc agccggaagc 1200cuauucgcag gauuucucau
uacuaacaac auuucccccg caucccccuu ccaaacaaca 1260aucccccucu
accuaaaacu cacagcccuc gcugucacuu uccuaggacu ucuaacagcc
1320cuagaccuca acuaccuaac caacaaacuu aaaauaaaau ccccacuaug
cacauuuuau 1380uucuccaaca uacucggauu cuacccuagc aucacacacc
gcacaauccc cuaucuaggc 1440cuucuuacga gccaaaaccu gccccuacuc
cuccuagacc uaaccugacu agaaaagcua 1500uuaccuaaaa caauuucaca
gcaccaaauc uccaccucca ucaucaccuc aacccaaaaa 1560ggcauaauua
aacuuuacuu ccucucuuuc uucuucccac ucauccuaac ccuacuccua
1620aucacauaa 1629
SEQ ID NO 33; LENGTH: 129; TYPE: RNA; ORGANISM: Human
SEQUENCE: 33
augccccaac uaaauacuac cguauggccc
accauaauua cccccauacu ccuuacacua 60uuccucauca cccaacuaaa aauauuaaac
acaaacuacc accuaccucc cucaccauug 120gcagccuag
129
SEQ ID NO 34; LENGTH: 261; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 34
Met Ala His His
Asn Tyr Pro His Thr Pro Tyr Thr Ile Pro His His1 5 10 15Pro Thr Lys
Asn Ile Lys His Lys Leu Pro Pro Thr Ser Leu Thr Ile 20 25 30Gly Ser
Leu Ala Leu Ala Gly Met Pro Phe Leu Thr Gly Phe Tyr Ser 35 40 45Lys
Asp His Ile Ile Glu Thr Ala Asn Met Ser Tyr Thr Asn Ala Trp 50 55
60Ala Leu Ser Ile Thr Leu Ile Ala Thr Ser Leu Thr Ser Ala Tyr Ser65
70 75 80Thr Arg Met Ile Leu Leu Thr Leu Thr Gly Gln Pro Arg Phe Pro
Thr 85 90 95Leu Thr Asn Ile Asn Glu Asn Asn Pro Thr Leu Leu Asn Pro
Ile Lys 100 105 110Arg Leu Ala Ala Gly Ser Leu Phe Ala Gly Phe Leu
Ile Thr Asn Asn 115 120 125Ile Ser Pro Ala Ser Pro Phe Gln Thr Thr
Ile Pro Leu Tyr Leu Lys 130 135 140Leu Thr Ala Leu Ala Val Thr Phe
Leu Gly Leu Leu Thr Ala Leu Asp145 150 155 160Leu Asn Tyr Leu Thr
Asn Lys Leu Lys Met Lys Ser Pro Leu Cys Thr 165 170 175Phe Tyr Phe
Ser Asn Met Leu Gly Phe Tyr Pro Ser Ile Thr His Arg 180 185 190Thr
Ile Pro Tyr Leu Gly Leu Leu Thr Ser Gln Asn Leu Pro Leu Leu 195 200
205Leu Leu Asp Leu Thr Trp Leu Glu Lys Leu Leu Pro Lys Thr Ile Ser
210 215 220Gln His Gln Ile Ser Thr Ser Ile Ile Thr Ser Thr Gln Lys
Gly Met225 230 235 240Ile Lys Leu Tyr Phe Leu Ser Phe Phe Phe Pro
Leu Ile Leu Thr Leu 245 250 255Leu Leu Ile Thr Xaa
260
SEQ ID NO 35; LENGTH: 100; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 35
Met Pro Leu Ile
Tyr Met Asn Ile Met Leu Ala Phe Thr Ile Ser Leu1 5 10 15Leu Gly Met
Leu Val Tyr Arg Ser His Leu Met Ser Ser Leu Leu Cys 20 25 30Leu Glu
Gly Met Met Leu Ser Leu Phe Ile Met Ala Thr Leu Met Thr 35 40 45Leu
Asn Thr His Ser Leu Leu Ala Asn Ile Val Pro Ile Ala Met Leu 50 55
60Val Phe Ala Ala Cys Glu Ala Ala Val Gly Leu Ala Leu Leu Val Ser65
70 75 80Ile Ser Asn Thr Tyr Gly Leu Asp Tyr Val His Asn Leu Thr Leu
Leu 85 90 95Leu Ile Thr Xaa 100
SEQ ID NO 36; LENGTH: 261; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 36
Met Ala His Ala Ala Gln Val Gly Leu Gln Asp Ala Thr Ser
Pro Ile1 5 10 15Met Glu Glu Leu Ile Thr Phe His Asp His Ala Leu Met
Ile Ile Phe 20 25 30Leu Ile Cys Phe Leu Val Leu Tyr Ala Leu Phe Leu
Thr Leu Thr Thr 35 40 45Lys Leu Thr Asn Thr Asn Ile Ser Asp Ala Gln
Glu Met Glu Thr Val 50 55 60Trp Thr Ile Leu Pro Ala Ile Ile Leu Val
Leu Ile Ala Leu Pro Ser65 70 75 80Leu Arg Ile Leu Tyr Met Thr Asp
Glu Val Asn Asp Pro Ser Leu Thr 85 90 95Ile Lys Ser Ile Gly His Gln
Trp Tyr Trp Thr Tyr Glu Tyr Thr Asp 100 105 110Tyr Gly Gly Leu Ile
Phe Asn Ser Tyr Met Leu Pro Pro Leu Phe Leu 115 120 125Glu Pro Gly
Asp Pro Asp Asn Tyr Thr Leu Ala Asn Pro Leu Asn Thr 130 135 140Pro
Pro His Ile Lys Pro Glu Trp Tyr Phe Leu Phe Ala Tyr Thr Ile145 150
155 160Leu Arg Ser Val Pro Asn Lys Leu Gly Gly Val Leu Ala Leu Leu
Leu 165 170 175Ser Ile Leu Ile Leu Ala Met Ile Pro Ile Leu His Met
Ser Lys Gln 180 185 190Gln Ser Met Met Phe Arg Pro Leu Ser Gln Ser
Leu Tyr Trp Leu Leu 195 200 205Ala Ala Asp Leu Leu Ile Leu Thr Trp
Ile Gly Gly Gln Pro Val Ser 210 215 220Tyr Pro Phe Thr Ile Ile Gly
Gln Val Ala Ser Val Leu Tyr Phe Thr225 230 235 240Thr Ile Leu Ile
Leu Met Pro Thr Ile Ser Leu Ile Glu Asn Lys Met 245 250 255Leu Lys
Trp Ala Xaa 260
SEQ ID NO 37; LENGTH: 189; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 37
Met
Ala His Ala Ala Gln Val Gly Leu Gln Asp Ala Thr Ser Pro Ile1 5 10
15Met Glu Glu Leu Ile Thr Phe His Asp His Ala Leu Met Ile Ile Phe
20 25 30Leu Ile Cys Phe Leu Val Leu Tyr Ala Leu Phe Leu Thr Leu Thr
Thr 35 40 45Lys Leu Thr Asn Thr Asn Ile Ser Asp Ala Gln Glu Met Glu
Thr Val 50 55 60Trp Thr Ile Leu Pro Ala Ile Ile Leu Val Leu Ile Ala
Leu Pro Ser65 70 75 80Leu Arg Ile Leu Tyr Met Thr Asp Glu Val Asn
Asp Pro Ser Leu Thr 85 90 95Ile Lys Ser Ile Gly His Gln Trp Tyr Trp
Thr Tyr Glu Tyr Thr Asp 100 105 110Tyr Gly Gly Leu Ile Phe Asn Ser
Tyr Met Leu Pro Pro Leu Phe Leu 115 120 125Glu Pro Gly Asp Leu Arg
Leu Leu Ala Ala Asp Leu Leu Ile Leu Thr 130 135 140Trp Ile Gly Gly
Gln Pro Val Ser Tyr Pro Phe Thr Ile Ile Gly Gln145 150 155 160Val
Ala Ser Val Leu Tyr Phe Thr Thr Ile Leu Ile Leu Met Pro Thr 165 170
175Ile Ser Leu Ile Glu Asn Lys Met Leu Lys Trp Ala Xaa 180
185
SEQ ID NO 38; LENGTH: 392; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 38
Met Ala His Ala
Ala Gln Val Gly Leu Gln Asp Ala Thr Ser Pro Ile1 5 10 15Met Glu Glu
Leu Ile Thr Phe His Asp His Ala Leu Met Ile Ile Phe 20 25 30Leu Ile
Cys Phe Leu Val Leu Tyr Ala Leu Phe Leu Thr Leu Thr Thr 35 40 45Lys
Leu Thr Asn Thr Asn Ile Ser Asp Ala Gln Glu Met Glu Thr Val 50 55
60Trp Thr Ile Leu Pro Ala Ile Ile Leu Val Leu Ile Ala Leu Pro Ser65
70 75 80Leu Arg Ile Leu Tyr Met Thr Asp Glu Val Asn Asp Pro Ser Leu
Thr 85 90 95Ile Lys Ser Ile Gly His Gln Trp Tyr Trp Thr Tyr Glu Tyr
Thr Asp 100 105 110Tyr Gly Gly Leu Ile Phe Asn Ser Tyr Met Leu Pro
Pro Leu Phe Leu 115 120 125Glu Pro Gly Asp Leu Arg Leu Leu Asp Val
Asp Asn Arg Val Val Leu 130 135 140Pro Ile Glu Ala Pro Ile Arg Met
Met Ile Thr Ser Gln Asp Val Leu145 150 155 160His Ser Trp Ala Val
Pro Thr Leu Gly Leu Lys Thr Asp Ala Ile Pro 165 170 175Gly Arg Leu
Asn Gln Thr Thr Phe Thr Ala Thr Arg Pro Gly Val Tyr 180 185 190Tyr
Gly Gln Cys Ser Glu Ile Cys Gly Ala Asn His Ser Phe Met Pro 195 200
205Met Phe Leu His Glu Thr Gly Ser Asn Asn Pro Leu Gly Ile Thr Ser
210 215 220His Ser Asp Lys Ile Thr Phe His Pro Tyr Tyr Thr Ile Lys
Asp Ala225 230 235 240Leu Gly Leu Leu Leu Phe Leu Leu Ser Leu Met
Thr Leu Thr Leu Phe 245 250 255Ser Pro Asp Leu Leu Gly Asp Pro Asp
Asn Tyr Thr Leu Ala Asn Pro 260 265 270Leu Asn Thr Pro Pro His Ile
Lys Pro Glu Trp Tyr Phe Leu Phe Ala 275 280 285Tyr Thr Ile Leu Arg
Ser Val Pro Asn Lys Leu Gly Gly Val Leu Ala 290 295 300Leu Leu Leu
Ser Ile Leu Ile Leu Ala Met Ile Pro Ile Leu His Met305 310 315
320Ser Lys Gln Gln Ser Met Met Phe Arg Pro Leu Ser Gln Ser Leu Tyr
325 330 335Trp Leu Leu Ala Ala Asp Leu Leu Ile Leu Thr Trp Ile Gly
Gly Gln 340 345 350Pro Val Ser Tyr Pro Phe Thr Ile Ile Gly Gln Val
Ala Ser Val Leu 355 360 365Tyr Phe Thr Thr Ile Leu Ile Leu Met Pro
Thr Ile Ser Leu Ile Glu 370 375 380Asn Lys Met Leu Lys Trp Ala
Xaa385 390
SEQ ID NO 39; LENGTH: 432; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 39
Met Asn Glu
Asn Leu Phe Ala Ser Phe Ile Ala Pro Thr Ile Leu Gly1 5 10 15Leu Pro
Ala Ala Val Leu Ile Ile Leu Phe Pro Pro Leu Leu Ile Pro 20 25 30Thr
Ser Lys Tyr Leu Ile Asn Asn Arg Leu Ile Thr Thr Gln Gln Trp 35 40
45Leu Ile Lys Leu Thr Ser Lys Gln Met Met Thr Met His Asn Thr Lys
50 55 60Gly Arg Thr Trp Ser Leu Met Leu Val Ser Leu Ile Ile Phe Ile
Ala65 70 75 80Thr Thr Asn Leu Leu Gly Leu Leu Pro His Ser Phe Thr
Pro Thr Thr 85 90 95Gln Leu Ser Met Asn Leu Ala Met His Tyr Ser Pro
Asp Ala Ser Thr 100 105 110Ala Phe Ser Ser Ile Ala His Ile Thr Arg
Asp Val Asn Tyr Gly Trp 115 120 125Ile Ile Arg Tyr Leu His Ala Asn
Gly Ala Ser Met Phe Phe Ile Cys 130 135 140Leu Phe Leu His Ile Gly
Arg Gly Leu Tyr Tyr Gly Ser Phe Leu Tyr145 150 155 160Ser Glu Thr
Trp Asn Ile Gly Ile Ile Leu Leu Leu Ala Thr Met Ala 165 170 175Thr
Ala Phe Met Gly Tyr Val Leu Pro Trp Gly Gln Met Ser Phe Trp 180 185
190Gly Ala Thr Val Ile Thr Asn Leu Leu Ser Ala Ile Pro Tyr Ile Gly
195 200 205Thr Asp Leu Val Gln Trp Ile Trp Gly Gly Tyr Ser Val Asp
Ser Pro 210 215 220Thr Leu Thr Arg Phe Phe Thr Phe His Phe Ile Leu
Pro Phe Ile Ile225 230 235 240Ala Ala Leu Ala Thr Leu His Leu Leu
Phe Leu His Glu Thr Gly Ser 245 250 255Asn Asn Pro Leu Gly Ile Thr
Ser His Ser Asp Lys Ile Thr Phe His 260 265 270Pro Tyr Tyr Thr Ile
Lys Asp Ala Leu Gly Leu Leu Leu Phe Leu Leu 275 280 285Ser Leu Met
Thr Leu Thr Leu Phe Ser Pro Asp Leu Leu Gly Asp Pro 290 295 300Asp
Asn Tyr Thr Leu Ala Asn Pro Leu Asn Thr
Pro Pro His Ile Lys305 310 315 320Pro Glu Trp Tyr Phe Leu Phe Ala
Tyr Thr Ile Leu Arg Ser Val Pro 325 330 335Asn Lys Leu Gly Gly Val
Leu Ala Leu Leu Leu Ser Ile Leu Ile Leu 340 345 350Ala Met Ile Pro
Ile Leu His Met Ser Lys Gln Gln Ser Met Met Phe 355 360 365Arg Pro
Leu Ser Gln Ser Leu Tyr Trp Leu Leu Ala Ala Asp Leu Leu 370 375
380Ile Leu Thr Trp Ile Gly Gly Gln Pro Val Ser Tyr Pro Phe Thr
Ile385 390 395 400Ile Gly Gln Val Ala Ser Val Leu Tyr Phe Thr Thr
Ile Leu Ile Leu 405 410 415Met Pro Thr Ile Ser Leu Ile Glu Asn Lys
Met Leu Lys Trp Ala Xaa 420 425 430
SEQ ID NO 40; LENGTH: 410; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 40
Met Pro Leu Ile Tyr Met Asn Ile Met Leu Ala Phe
Thr Ile Ser Leu1 5 10 15Leu Gly Met Leu Val Tyr Arg Ser His Leu Met
Ser Ser Leu Leu Cys 20 25 30Leu Glu Gly Met Met Leu Ser Leu Phe Ile
Met Ala Thr Leu Met Thr 35 40 45Leu Asn Thr His Ser Leu Leu Ala Asn
Ile Val Pro Ile Ala Met Leu 50 55 60Val Phe Gly Ala Cys Leu Ile Leu
Gln Ile Thr Thr Gly Leu Phe Leu65 70 75 80Ala Met His Tyr Ser Pro
Asp Ala Ser Thr Ala Phe Ser Ser Ile Ala 85 90 95His Ile Thr Arg Asp
Val Asn Tyr Gly Trp Ile Ile Arg Tyr Leu His 100 105 110Ala Asn Gly
Ala Ser Met Phe Phe Ile Cys Leu Phe Leu His Ile Gly 115 120 125Arg
Gly Leu Tyr Tyr Gly Ser Phe Leu Tyr Ser Glu Thr Trp Asn Ile 130 135
140Gly Ile Ile Leu Leu Leu Ala Thr Met Ala Thr Ala Phe Met Gly
Tyr145 150 155 160Val Leu Pro Trp Gly Gln Met Ser Phe Trp Gly Ala
Thr Val Ile Thr 165 170 175Asn Leu Leu Ser Ala Ile Pro Tyr Ile Gly
Thr Asp Leu Val Gln Trp 180 185 190Ile Trp Gly Gly Tyr Ser Val Asp
Ser Pro Thr Leu Thr Arg Phe Phe 195 200 205Thr Phe His Phe Ile Leu
Pro Phe Ile Ile Ala Ala Leu Ala Thr Leu 210 215 220His Leu Leu Phe
Leu His Glu Thr Gly Ser Asn Asn Pro Leu Gly Ile225 230 235 240Thr
Ser His Ser Asp Lys Ile Thr Phe His Pro Tyr Tyr Thr Ile Lys 245 250
255Asp Ala Leu Gly Leu Leu Leu Phe Leu Leu Ser Leu Met Thr Leu Thr
260 265 270Leu Phe Ser Pro Asp Leu Leu Gly Asp Pro Asp Asn Tyr Thr
Leu Ala 275 280 285Asn Pro Leu Asn Thr Pro Pro His Ile Lys Pro Glu
Trp Tyr Phe Leu 290 295 300Phe Ala Tyr Thr Ile Leu Arg Ser Val Pro
Asn Lys Leu Gly Gly Val305 310 315 320Leu Ala Leu Leu Leu Ser Ile
Leu Ile Leu Ala Met Ile Pro Ile Leu 325 330 335His Met Ser Lys Gln
Gln Ser Met Met Phe Arg Pro Leu Ser Gln Ser 340 345 350Leu Tyr Trp
Leu Leu Ala Ala Asp Leu Leu Ile Leu Thr Trp Ile Gly 355 360 365Gly
Gln Pro Val Ser Tyr Pro Phe Thr Ile Ile Gly Gln Val Ala Ser 370 375
380Val Leu Tyr Phe Thr Thr Ile Leu Ile Leu Met Pro Thr Ile Ser
Leu385 390 395 400Ile Glu Asn Lys Met Leu Lys Trp Ala Xaa 405
410
SEQ ID NO 41; LENGTH: 174; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 41
Met Phe Ala Asp
Arg Trp Leu Phe Ser Thr Asn His Lys Asp Ile Gly1 5 10 15Thr Leu Tyr
Leu Leu Phe Gly Ala Trp Ala Gly Val Leu Gly Thr Ala 20 25 30Leu Ser
Leu Leu Ile Arg Ala Glu Leu Gly Gln Pro Gly Asn Leu Leu 35 40 45Gly
Asn Asp His Ile Tyr Asn Val Ile Val Thr Ala Leu Ala Val Thr 50 55
60Phe Leu Gly Leu Leu Thr Ala Leu Asp Leu Asn Tyr Leu Thr Asn Lys65
70 75 80Leu Lys Met Lys Ser Pro Leu Cys Thr Phe Tyr Phe Ser Asn Met
Leu 85 90 95Gly Phe Tyr Pro Ser Ile Thr His Arg Thr Ile Pro Tyr Leu
Gly Leu 100 105 110Leu Thr Ser Gln Asn Leu Pro Leu Leu Leu Leu Asp
Leu Thr Trp Leu 115 120 125Glu Lys Leu Leu Pro Lys Thr Ile Ser Gln
His Gln Ile Ser Thr Ser 130 135 140Ile Ile Thr Ser Thr Gln Lys Gly
Met Ile Lys Leu Tyr Phe Leu Ser145 150 155 160Phe Phe Phe Pro Leu
Ile Leu Thr Leu Leu Leu Ile Thr Xaa 165
170
SEQ ID NO 42; LENGTH: 194; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 42
Met Phe Ala Asp
Arg Trp Leu Phe Ser Thr Asn His Lys Asp Ile Gly1 5 10 15Thr Leu Tyr
Leu Leu Phe Gly Ala Trp Ala Gly Val Leu Gly Thr Ala 20 25 30Leu Ser
Leu Leu Ile Arg Ala Glu Leu Gly Gln Pro Gly Asn Leu Leu 35 40 45Gly
Asn Asp His Ile Tyr Asn Val Ile Val Thr Ala His Ala Phe Val 50 55
60Met Ile Phe Phe Met Val Met Pro Ile Met Ile Gly Gly Phe Gly Asn65
70 75 80Trp Leu Val Pro Leu Met Ile Gly Ala Pro Asp Met Ala Phe Pro
Arg 85 90 95Met Asn Asn Met Ser Phe Trp Leu Leu Pro Pro Ser Leu Leu
Leu Leu 100 105 110Leu Ala Ser Ala Met Val Glu Ala Gly Ala Gly Thr
Gly Trp Thr Val 115 120 125Tyr Pro Pro Leu Ala Gly Asn Tyr Ser His
Pro Gly Ala Leu Leu Asp 130 135 140Leu Thr Trp Leu Glu Lys Leu Leu
Pro Lys Thr Ile Ser Gln His Gln145 150 155 160Ile Ser Thr Ser Ile
Ile Thr Ser Thr Gln Lys Gly Met Ile Lys Leu 165 170 175Tyr Phe Leu
Ser Phe Phe Phe Pro Leu Ile Leu Thr Leu Leu Leu Ile 180 185 190Thr
Xaa
SEQ ID NO 43; LENGTH: 736; TYPE: PRT; ORGANISM: Artificial; OTHER INFORMATION: putative protein sequence
SEQUENCE: 43
Met Phe Ala Asp
Arg Trp Leu Phe Ser Thr Asn His Lys Asp Ile Gly1 5 10 15Thr Leu Tyr
Leu Leu Phe Gly Ala Trp Ala Gly Val Leu Gly Thr Ala 20 25 30Leu Ser
Leu Leu Ile Arg Ala Glu Leu Gly Gln Pro Gly Asn Leu Leu 35 40 45Gly
Asn Asp His Ile Tyr Asn Val Ile Val Thr Ala His Ala Phe Val 50 55
60Met Ile Phe Phe Met Val Met Pro Ile Met Ile Gly Gly Phe Gly Asn65
70 75 80Trp Leu Val Pro Leu Met Ile Gly Ala Pro Asp Met Ala Phe Pro
Arg 85 90 95Met Asn Asn Met Ser Phe Trp Leu Leu Pro Pro Ser Leu Leu
Leu Leu 100 105 110Leu Ala Ser Ala Met Val Glu Ala Gly Ala Gly Thr
Gly Trp Thr Val 115 120 125Tyr Pro Pro Leu Ala Gly Asn Tyr Ser His
Pro Gly Ala Ser Val Asp 130 135 140Leu Thr Ile Phe Ser Leu His Leu
Ala Gly Val Ser Ser Ile Leu Gly145 150 155 160Ala Ile Asn Phe Ile
Thr Thr Ile Ile Asn Met Lys Pro Pro Ala Met 165 170 175Thr Gln Tyr
Gln Thr Pro Leu Phe Val Trp Ser Val Leu Ile Thr Ala 180 185 190Val
Leu Leu Leu Leu Ser Leu Pro Val Leu Ala Ala Gly Ile Thr Met 195 200
205Leu Leu Thr Asp Arg Asn Leu Asn Thr Thr Phe Phe Asp Pro Ala Gly
210 215 220Gly Gly Asp Pro Ile Leu Tyr Gln His Leu Phe Trp Phe Phe
Gly His225 230 235 240Pro Glu Val Tyr Ile Leu Ile Leu Pro Gly Phe
Gly Met Ile Ser His 245 250 255Ile Val Thr Tyr Tyr Ser Gly Lys Lys
Glu Pro Phe Gly Tyr Met Gly 260 265 270Met Val Trp Ala Met Met Ser
Ile Gly Phe Leu Gly Phe Ile Val Trp 275 280 285Ala His His Met Phe
Thr Val Gly Met Asp Val Asp Thr Arg Ala Tyr 290 295 300Phe Thr Ser
Ala Thr Met Ile Ile Ala Ile Pro Thr Gly Val Lys Val305 310 315
320Phe Ser Trp Leu Ala Thr Leu His Gly Ser Asn Met Lys Trp Ser Ala
325 330 335Ala Val Leu Trp Ala Leu Gly Phe Ile Phe Leu Phe Thr Val
Gly Gly 340 345 350Leu Thr Gly Ile Val Leu Ala Asn Ser Ser Leu Asp
Ile Val Leu His 355 360 365Asp Thr Tyr Tyr Val Val Ala His Phe His
Tyr Val Leu Ser Met Gly 370 375 380Ala Val Phe Ala Ile Met Gly Gly
Phe Ile His Trp Phe Pro Leu Phe385 390 395 400Ser Gly Tyr Thr Leu
Asp Gln Thr Tyr Ala Lys Ile His Phe Thr Ile 405 410 415Met Phe Ile
Gly Val Asn Leu Thr Phe Phe Pro Gln His Phe Leu Gly 420 425 430Leu
Ser Gly Met Pro Arg Arg Tyr Ser Asp Tyr Pro Asp Ala Tyr Thr 435 440
445Thr Trp Asn Ile Leu Ser Ser Val Gly Ser Phe Ile Ser Leu Thr Ala
450 455 460Val Met Leu Met Ile Phe Met Ile Trp Glu Ala Phe Ala Ser
Lys Arg465 470 475 480Lys Val Leu Met Val Glu Glu Pro Ser Met Asn
Leu Glu Trp Leu Tyr 485 490 495Gly Cys Pro Pro Pro Tyr His Thr Phe
Glu Glu Pro Val Tyr Met Lys 500 505 510Ala Gly Met Pro Phe Leu Thr
Gly Phe Tyr Ser Lys Asp His Ile Ile 515 520 525Glu Thr Ala Asn Met
Ser Tyr Thr Asn Ala Trp Ala Leu Ser Ile Thr 530 535 540Leu Ile Ala
Thr Ser Leu Thr Ser Ala Tyr Ser Thr Arg Met Ile Leu545 550 555
560Leu Thr Leu Thr Gly Gln Pro Arg Phe Pro Thr Leu Thr Asn Ile Asn
565 570 575Glu Asn Asn Pro Thr Leu Leu Asn Pro Ile Lys Arg Leu Ala
Ala Gly 580 585 590Ser Leu Phe Ala Gly Phe Leu Ile Thr Asn Asn Ile
Ser Pro Ala Ser 595 600 605Pro Phe Gln Thr Thr Ile Pro Leu Tyr Leu
Lys Leu Thr Ala Leu Ala 610 615 620Val Thr Phe Leu Gly Leu Leu Thr
Ala Leu Asp Leu Asn Tyr Leu Thr625 630 635 640Asn Lys Leu Lys Met
Lys Ser Pro Leu Cys Thr Phe Tyr Phe Ser Asn 645 650 655Met Leu Gly
Phe Tyr Pro Ser Ile Thr His Arg Thr Ile Pro Tyr Leu 660 665 670Gly
Leu Leu Thr Ser Gln Asn Leu Pro Leu Leu Leu Leu Asp Leu Thr 675 680
685Trp Leu Glu Lys Leu Leu Pro Lys Thr Ile Ser Gln His Gln Ile Ser
690 695 700Thr Ser Ile Ile Thr Ser Thr Gln Lys Gly Met Ile Lys Leu
Tyr Phe705 710 715 720Leu Ser Phe Phe Phe Pro Leu Ile Leu Thr Leu
Leu Leu Ile Thr Xaa 725 730 735
44 269 PRT Artificial putative protein sequence 44
Met Ala His Ala Ala Gln Val Gly Leu Gln Asp Ala Thr Ser
Pro Ile1 5 10 15Met Glu Glu Leu Ile Thr Phe His Asp His Ala Leu Met
Ile Ile Phe 20 25 30Leu Ile Cys Phe Leu Val Leu Tyr Ala Leu Phe Leu
Thr Leu Thr Thr 35 40 45Lys Leu Thr Asn Thr Asn Ile Ser Asp Ala Gln
Glu Met Glu Thr Ala 50 55 60Asn Met Ser Tyr Thr Asn Ala Trp Ala Leu
Ser Ile Thr Leu Ile Ala65 70 75 80Thr Ser Leu Thr Ser Ala Tyr Ser
Thr Arg Met Ile Leu Leu Thr Leu 85 90 95Thr Gly Gln Pro Arg Phe Pro
Thr Leu Thr Asn Ile Asn Glu Asn Asn 100 105 110Pro Thr Leu Leu Asn
Pro Ile Lys Arg Leu Ala Ala Gly Ser Leu Phe 115 120 125Ala Gly Phe
Leu Ile Thr Asn Asn Ile Ser Pro Ala Ser Pro Phe Gln 130 135 140Thr
Thr Ile Pro Leu Tyr Leu Lys Leu Thr Ala Leu Ala Val Thr Phe145 150
155 160Leu Gly Leu Leu Thr Ala Leu Asp Leu Asn Tyr Leu Thr Asn Lys
Leu 165 170 175Lys Met Lys Ser Pro Leu Cys Thr Phe Tyr Phe Ser Asn
Met Leu Gly 180 185 190Phe Tyr Pro Ser Ile Thr His Arg Thr Ile Pro
Tyr Leu Gly Leu Leu 195 200 205Thr Ser Gln Asn Leu Pro Leu Leu Leu
Leu Asp Leu Thr Trp Leu Glu 210 215 220Lys Leu Leu Pro Lys Thr Ile
Ser Gln His Gln Ile Ser Thr Ser Ile225 230 235 240Ile Thr Ser Thr
Gln Lys Gly Met Ile Lys Leu Tyr Phe Leu Ser Phe 245 250 255Phe Phe
Pro Leu Ile Leu Thr Leu Leu Leu Ile Thr Xaa 260 265
45 262 PRT Artificial putative protein sequence 45
Met Ala His Ala
Ala Gln Val Gly Leu Gln Asp Ala Thr Ser Pro Ile1 5 10 15Met Glu Glu
Leu Ile Thr Phe His Asp His Ala Leu Met Ile Ile Phe 20 25 30Leu Ile
Cys Phe Leu Val Leu Tyr Ala Leu Phe Leu Thr Leu Thr Thr 35 40 45Lys
Leu Thr Asn Thr Asn Ile Ser Asp Ala Gln Glu Met Glu Thr Val 50 55
60Trp Thr Ile Leu Pro Ala Ile Ile Leu Val Leu Ile Ala Leu Pro Ser65
70 75 80Leu Arg Ile Leu Tyr Met Thr Asp Glu Val Asn Asp Pro Ser Leu
Thr 85 90 95Ile Lys Ser Ile Gly His Gln Trp Tyr Trp Thr Tyr Glu Tyr
Thr Asp 100 105 110Tyr Gly Gly Leu Ile Phe Asn Ser Tyr Met Leu Pro
Pro Leu Phe Leu 115 120 125Glu Pro Gly Asp Leu Arg Leu Leu Asp Val
Asp Asn Arg Val Val Leu 130 135 140Pro Ile Glu Ala Pro Ile Arg Met
Met Ile Thr Ser Gln Asp Val Leu145 150 155 160His Ser Trp Ala Val
Pro Thr Leu Gly Leu Lys Thr Asp Ala Ile Pro 165 170 175Gly Arg Leu
Asn Gln Thr Thr Phe Thr Ala Thr Arg Pro Gly Val Tyr 180 185 190Tyr
Gly Gln Cys Ser Glu Ile Cys Gly Ala Asn His Ser Phe Met Pro 195 200
205Ile Val Leu Asp Leu Thr Trp Leu Glu Lys Leu Leu Pro Lys Thr Ile
210 215 220Ser Gln His Gln Ile Ser Thr Ser Ile Ile Thr Ser Thr Gln
Lys Gly225 230 235 240Met Ile Lys Leu Tyr Phe Leu Ser Phe Phe Phe
Pro Leu Ile Leu Thr 245 250 255Leu Leu Leu Ile Thr Xaa 260
46 635 PRT Artificial putative protein sequence 46
Met Asn Glu Asn
Leu Phe Ala Ser Phe Ile Ala Pro Thr Ile Leu Gly1 5 10 15Leu Pro Ala
Ala Val Leu Ile Ile Leu Phe Pro Pro Leu Leu Ile Pro 20 25 30Thr Ser
Lys Tyr Leu Ile Asn Asn Arg Leu Ile Thr Thr Gln Gln Trp 35 40 45Leu
Ile Lys Leu Thr Ser Lys Gln Met Met Thr Met His Asn Thr Lys 50 55
60Gly Arg Thr Trp Ser Leu Met Leu Val Ser Leu Ile Ile Phe Ile Ala65
70 75 80Thr Thr Asn Leu Leu Gly Leu Leu Pro His Ser Phe Thr Pro Thr
Thr 85 90 95Gln Leu Ser Met Asn Leu Ala Met Ala Ile Pro Leu Trp Ala
Gly Thr 100 105 110Val Ile Met Gly Phe Arg Ser Lys Ile Lys Asn Ala
Leu Ala His Phe 115 120 125Leu Pro Gln Gly Thr Pro Thr Pro Leu Ile
Pro Met Leu Val Ile Ile 130 135 140Glu Thr Ile Ser Leu Leu Ile Gln
Pro Met Ala Leu Ala Val Arg Leu145 150 155 160Thr Ala Asn Ile Thr
Ala Gly His Leu Leu Met His Leu Ile Gly Ser 165 170 175Ala Thr Leu
Ala Met Ser Thr Ile Asn Leu Pro Ser Thr Leu Ile Ile 180 185 190Phe
Thr Ile Leu Ile Leu Leu Thr Ile Leu Glu Ile Ala Val Ala Leu 195 200
205Ile Gln Ala Tyr Val Phe Thr Leu Leu Val Ser Leu Tyr Leu His Ser
210 215 220Asn Ser Trp Asp Pro Gln Gln Met Ala Leu Leu Asn Ala Asn
Pro Ser225 230 235 240Leu Thr Pro Leu Leu Gly Leu Leu Leu Ala Ala
Ala Gly Lys Ser Ala 245 250 255Gln Leu Gly Leu
His Pro Trp Leu Pro Ser Ala Met Glu Gly Pro Thr 260 265 270Pro Val
Ser Ala Leu Leu His Ser Ser Thr Met Val Val Ala Gly Ile 275 280
285Phe Leu Leu Ile Arg Phe His Pro Leu Ala Glu Asn Ser Pro Leu Ile
290 295 300Gln Thr Leu Thr Leu Cys Leu Gly Ala Ile Thr Thr Leu Phe
Ala Ala305 310 315 320Val Cys Ala Leu Thr Gln Asn Asp Ile Lys Lys
Ile Val Ala Phe Ser 325 330 335Thr Ser Ser Gln Leu Gly Leu Met Met
Val Thr Ile Gly Ile Asn Gln 340 345 350Pro His Leu Ala Phe Leu His
Ile Cys Thr His Ala Phe Phe Lys Ala 355 360 365Met Leu Phe Met Cys
Ser Gly Ser Ile Ile His Asn Leu Asn Asn Glu 370 375 380Gln Asp Ile
Arg Lys Met Gly Gly Leu Leu Lys Thr Met Pro Leu Thr385 390 395
400Ser Thr Ser Leu Thr Ile Gly Ser Leu Ala Leu Ala Gly Met Pro Phe
405 410 415Leu Thr Gly Phe Tyr Ser Lys Asp His Ile Ile Glu Thr Ala
Asn Met 420 425 430Ser Tyr Thr Asn Ala Trp Ala Leu Ser Ile Thr Leu
Ile Ala Thr Ser 435 440 445Leu Thr Ser Ala Tyr Ser Thr Arg Met Ile
Leu Leu Thr Leu Thr Gly 450 455 460Gln Pro Arg Phe Pro Thr Leu Thr
Asn Ile Asn Glu Asn Asn Pro Thr465 470 475 480Leu Leu Asn Pro Ile
Lys Arg Leu Ala Ala Gly Ser Leu Phe Ala Gly 485 490 495Phe Leu Ile
Thr Asn Asn Ile Ser Pro Ala Ser Pro Phe Gln Thr Thr 500 505 510Ile
Pro Leu Tyr Leu Lys Leu Thr Ala Leu Ala Val Thr Phe Leu Gly 515 520
525Leu Leu Thr Ala Leu Asp Leu Asn Tyr Leu Thr Asn Lys Leu Lys Met
530 535 540Lys Ser Pro Leu Cys Thr Phe Tyr Phe Ser Asn Met Leu Gly
Phe Tyr545 550 555 560Pro Ser Ile Thr His Arg Thr Ile Pro Tyr Leu
Gly Leu Leu Thr Ser 565 570 575Gln Asn Leu Pro Leu Leu Leu Leu Asp
Leu Thr Trp Leu Glu Lys Leu 580 585 590Leu Pro Lys Thr Ile Ser Gln
His Gln Ile Ser Thr Ser Ile Ile Thr 595 600 605Ser Thr Gln Lys Gly
Met Ile Lys Leu Tyr Phe Leu Ser Phe Phe Phe 610 615 620Pro Leu Ile
Leu Thr Leu Leu Leu Ile Thr Xaa625 630 635
47 515 PRT Artificial putative protein sequence 47
Met Thr His Gln
Ser His Ala Tyr His Met Val Lys Pro Ser Pro Trp1 5 10 15Pro Leu Thr
Gly Ala Leu Ser Ala Leu Leu Met Thr Ser Gly Leu Ala 20 25 30Met Trp
Phe His Phe His Ser Met Thr Leu Leu Met Leu Gly Leu Leu 35 40 45Thr
Asn Thr Leu Thr Met Tyr Gln Trp Trp Arg Asp Val Thr Arg Glu 50 55
60Ser Thr Tyr Gln Gly His His Thr Pro Pro Val Gln Lys Gly Leu Arg65
70 75 80Tyr Gly Met Ile Leu Phe Ile Thr Ser Glu Val Phe Phe Phe Ala
Gly 85 90 95Phe Phe Trp Ala Phe Tyr His Ser Ser Leu Ala Pro Thr Pro
Gln Leu 100 105 110Gly Gly His Trp Pro Pro Thr Gly Ile Thr Pro Leu
Leu Gly Leu Leu 115 120 125Leu Ala Ala Ala Gly Lys Ser Ala Gln Leu
Gly Leu His Pro Trp Leu 130 135 140Pro Ser Ala Met Glu Gly Pro Thr
Pro Val Ser Ala Leu Leu His Ser145 150 155 160Ser Thr Met Val Val
Ala Gly Ile Phe Leu Leu Ile Arg Phe His Pro 165 170 175Leu Ala Glu
Asn Ser Pro Leu Ile Gln Thr Leu Thr Leu Cys Leu Gly 180 185 190Ala
Ile Thr Thr Leu Phe Ala Ala Val Cys Ala Leu Thr Gln Asn Asp 195 200
205Ile Lys Lys Ile Val Ala Phe Ser Thr Ser Ser Gln Leu Gly Leu Met
210 215 220Met Val Thr Ile Gly Ile Asn Gln Pro His Leu Ala Phe Leu
His Ile225 230 235 240Cys Thr His Ala Phe Phe Lys Ala Met Leu Phe
Met Cys Ser Gly Ser 245 250 255Ile Ile His Asn Leu Asn Asn Glu Gln
Asp Ile Arg Lys Met Gly Gly 260 265 270Leu Leu Lys Thr Met Pro Leu
Thr Ser Thr Ser Leu Thr Ile Gly Ser 275 280 285Leu Ala Leu Ala Gly
Met Pro Phe Leu Thr Gly Phe Tyr Ser Lys Asp 290 295 300His Ile Ile
Glu Thr Ala Asn Met Ser Tyr Thr Asn Ala Trp Ala Leu305 310 315
320Ser Ile Thr Leu Ile Ala Thr Ser Leu Thr Ser Ala Tyr Ser Thr Arg
325 330 335Met Ile Leu Leu Thr Leu Thr Gly Gln Pro Arg Phe Pro Thr
Leu Thr 340 345 350Asn Ile Asn Glu Asn Asn Pro Thr Leu Leu Asn Pro
Ile Lys Arg Leu 355 360 365Ala Ala Gly Ser Leu Phe Ala Gly Phe Leu
Ile Thr Asn Asn Ile Ser 370 375 380Pro Ala Ser Pro Phe Gln Thr Thr
Ile Pro Leu Tyr Leu Lys Leu Thr385 390 395 400Ala Leu Ala Val Thr
Phe Leu Gly Leu Leu Thr Ala Leu Asp Leu Asn 405 410 415Tyr Leu Thr
Asn Lys Leu Lys Met Lys Ser Pro Leu Cys Thr Phe Tyr 420 425 430Phe
Ser Asn Met Leu Gly Phe Tyr Pro Ser Ile Thr His Arg Thr Ile 435 440
445Pro Tyr Leu Gly Leu Leu Thr Ser Gln Asn Leu Pro Leu Leu Leu Leu
450 455 460Asp Leu Thr Trp Leu Glu Lys Leu Leu Pro Lys Thr Ile Ser
Gln His465 470 475 480Gln Ile Ser Thr Ser Ile Ile Thr Ser Thr Gln
Lys Gly Met Ile Lys 485 490 495Leu Tyr Phe Leu Ser Phe Phe Phe Pro
Leu Ile Leu Thr Leu Leu Leu 500 505 510Ile Thr Xaa 515
48 543 PRT Artificial putative protein sequence 48
Met Asn Phe Ala
Leu Ile Leu Met Ile Asn Thr Leu Leu Ala Leu Leu1 5 10 15Leu Met Ile
Ile Thr Phe Trp Leu Pro Gln Leu Asn Gly Tyr Met Glu 20 25 30Lys Ser
Thr Pro Tyr Glu Cys Gly Phe Asp Pro Met Ser Pro Ala Arg 35 40 45Val
Pro Phe Ser Met Lys Phe Phe Leu Val Ala Ile Thr Phe Leu Leu 50 55
60Phe Asp Leu Glu Ile Ala Leu Leu Leu Pro Leu Pro Trp Ala Leu Gln65
70 75 80Thr Thr Asn Leu Pro Leu Met Val Met Ser Ser Leu Leu Leu Ile
Ile 85 90 95Ile Leu Ala Leu Ser Leu Ala Asn Thr Ala Ala Ile Gln Ala
Ile Leu 100 105 110Tyr Asn Arg Ile Gly Asp Ile Gly Phe Ile Leu Ala
Leu Ala Trp Phe 115 120 125Ile Leu His Ser Asn Ser Trp Asp Pro Gln
Gln Met Ala Leu Leu Asn 130 135 140Ala Asn Pro Ser Leu Thr Pro Leu
Leu Gly Leu Leu Leu Ala Ala Ala145 150 155 160Gly Lys Ser Ala Gln
Leu Gly Leu His Pro Trp Leu Pro Ser Ala Met 165 170 175Glu Gly Pro
Thr Pro Val Ser Ala Leu Leu His Ser Ser Thr Met Val 180 185 190Val
Ala Gly Ile Phe Leu Leu Ile Arg Phe His Pro Leu Ala Glu Asn 195 200
205Ser Pro Leu Ile Gln Thr Leu Thr Leu Cys Leu Gly Ala Ile Thr Thr
210 215 220Leu Phe Ala Ala Val Cys Ala Leu Thr Gln Asn Asp Ile Lys
Lys Ile225 230 235 240Val Ala Phe Ser Thr Ser Ser Gln Leu Gly Leu
Met Met Val Thr Ile 245 250 255Gly Ile Asn Gln Pro His Leu Ala Phe
Leu His Ile Cys Thr His Ala 260 265 270Phe Phe Lys Ala Met Leu Phe
Met Cys Ser Gly Ser Ile Ile His Asn 275 280 285Leu Asn Asn Glu Gln
Asp Ile Arg Lys Met Gly Gly Leu Leu Lys Thr 290 295 300Met Pro Leu
Thr Ser Thr Ser Leu Thr Ile Gly Ser Leu Ala Leu Ala305 310 315
320Gly Met Pro Phe Leu Thr Gly Phe Tyr Ser Lys Asp His Ile Ile Glu
325 330 335Thr Ala Asn Met Ser Tyr Thr Asn Ala Trp Ala Leu Ser Ile
Thr Leu 340 345 350Ile Ala Thr Ser Leu Thr Ser Ala Tyr Ser Thr Arg
Met Ile Leu Leu 355 360 365Thr Leu Thr Gly Gln Pro Arg Phe Pro Thr
Leu Thr Asn Ile Asn Glu 370 375 380Asn Asn Pro Thr Leu Leu Asn Pro
Ile Lys Arg Leu Ala Ala Gly Ser385 390 395 400Leu Phe Ala Gly Phe
Leu Ile Thr Asn Asn Ile Ser Pro Ala Ser Pro 405 410 415Phe Gln Thr
Thr Ile Pro Leu Tyr Leu Lys Leu Thr Ala Leu Ala Val 420 425 430Thr
Phe Leu Gly Leu Leu Thr Ala Leu Asp Leu Asn Tyr Leu Thr Asn 435 440
445Lys Leu Lys Met Lys Ser Pro Leu Cys Thr Phe Tyr Phe Ser Asn Met
450 455 460Leu Gly Phe Tyr Pro Ser Ile Thr His Arg Thr Ile Pro Tyr
Leu Gly465 470 475 480Leu Leu Thr Ser Gln Asn Leu Pro Leu Leu Leu
Leu Asp Leu Thr Trp 485 490 495Leu Glu Lys Leu Leu Pro Lys Thr Ile
Ser Gln His Gln Ile Ser Thr 500 505 510Ser Ile Ile Thr Ser Thr Gln
Lys Gly Met Ile Lys Leu Tyr Phe Leu 515 520 525Ser Phe Phe Phe Pro
Leu Ile Leu Thr Leu Leu Leu Ile Thr Xaa 530 535 540
49 43 PRT Artificial putative protein sequence 49
Met Pro Gln Leu Asn
Thr Thr Val Trp Pro Thr Met Ile Thr Pro Met1 5 10 15Leu Leu Thr Leu
Phe Leu Ile Thr Gln Leu Lys Met Leu Asn Thr Asn 20 25 30Tyr His Leu
Pro Pro Ser Pro Leu Ala Ala Xaa 35 40
50 951 RNA Human 50
augaacgaaa
aucuguucgc uucauucauu gcccccacaa uccuaggccu acccgccgca 60guacugauca
uucuauuucc cccucuauug auccccaccu ccaaauaucu caucaacaac
120cgacuaauca ccacccaaca augacuaauc aaacuaaccu caaaacaaau
gauaaccaua 180cacaacacua aaggacgaac cugaucucuu auacuaguau
ccuuaaucau uuuuauugcc 240acaacuaacc uccucggacu ccugccucac
ucauuuacac caaccaccca acuaucuaua 300aaccuagcca uggccauccc
cuuaugagcg ggcacaguga uuauaggcuu ucgcucuaag 360auuaaaaaug
cccuagccca cuucuuacca caaggcacac cuacaccccu uauccccaua
420cuaguuauua ucgaaaccau cagccuacuc auucaaccaa uagcccuggc
cguacgccua 480accgcuaaca uuacugcagg ccaccuacuc augcaccuaa
uuggaagcgc cacccuagca 540auaucaacca uuaaccuucc cucuacacuu
aucaucuuca caauucuaau ucuacugacu 600auccuagaaa ucgcugucac
uuuccuagga cuucuaacag cccuagaccu caacuaccua 660accaacaaac
uuaaaauaaa auccccacua ugcacauuuu auuucuccaa cauacucgga
720uucuacccua gcaucacaca ccgcacaauc cccuaucuag gccuucuuac
gagccaaaac 780cugccccuac uccuccuaga ccuaaccuga cuagaaaagc
uauuaccuaa aacaauuuca 840cagcaccaaa ucuccaccuc caucaucacc
ucaacccaaa aaggcauaau uaaacuuuac 900uuccucucuu ucuucuuccc
acucauccua acccuacucc uaaucacaua a 951
51 951 DNA Artificial cDNA 51
atgaacgaaa atctgttcgc ttcattcatt gcccccacaa tcctaggcct acccgccgca
60gtactgatca ttctatttcc ccctctattg atccccacct ccaaatatct catcaacaac
120cgactaatca ccacccaaca atgactaatc aaactaacct caaaacaaat
gataaccata 180cacaacacta aaggacgaac ctgatctctt atactagtat
ccttaatcat ttttattgcc 240acaactaacc tcctcggact cctgcctcac
tcatttacac caaccaccca actatctata 300aacctagcca tggccatccc
cttatgagcg ggcacagtga ttataggctt tcgctctaag 360attaaaaatg
ccctagccca cttcttacca caaggcacac ctacacccct tatccccata
420ctagttatta tcgaaaccat cagcctactc attcaaccaa tagccctggc
cgtacgccta 480accgctaaca ttactgcagg ccacctactc atgcacctaa
ttggaagcgc caccctagca 540atatcaacca ttaaccttcc ctctacactt
atcatcttca caattctaat tctactgact 600atcctagaaa tcgctgtcac
tttcctagga cttctaacag ccctagacct caactaccta 660accaacaaac
ttaaaataaa atccccacta tgcacatttt atttctccaa catactcgga
720ttctacccta gcatcacaca ccgcacaatc ccctatctag gccttcttac
gagccaaaac 780ctgcccctac tcctcctaga cctaacctga ctagaaaagc
tattacctaa aacaatttca 840cagcaccaaa tctccacctc catcatcacc
tcaacccaaa aaggcataat taaactttac 900ttcctctctt tcttcttccc
actcatccta accctactcc taatcacata a 951
52 317 PRT Artificial putative protein sequence 52
Met Asn Glu Asn Leu Phe Ala Ser Phe Ile Ala Pro
Thr Ile Leu Gly1 5 10 15Leu Pro Ala Ala Val Leu Ile Ile Leu Phe Pro
Pro Leu Leu Ile Pro 20 25 30Thr Ser Lys Tyr Leu Ile Asn Asn Arg Leu
Ile Thr Thr Gln Gln Trp 35 40 45Leu Ile Lys Leu Thr Ser Lys Gln Met
Met Thr Met His Asn Thr Lys 50 55 60Gly Arg Thr Trp Ser Leu Met Leu
Val Ser Leu Ile Ile Phe Ile Ala65 70 75 80Thr Thr Asn Leu Leu Gly
Leu Leu Pro His Ser Phe Thr Pro Thr Thr 85 90 95Gln Leu Ser Met Asn
Leu Ala Met Ala Ile Pro Leu Trp Ala Gly Thr 100 105 110Val Ile Met
Gly Phe Arg Ser Lys Ile Lys Asn Ala Leu Ala His Phe 115 120 125Leu
Pro Gln Gly Thr Pro Thr Pro Leu Ile Pro Met Leu Val Ile Ile 130 135
140Glu Thr Ile Ser Leu Leu Ile Gln Pro Met Ala Leu Ala Val Arg
Leu145 150 155 160Thr Ala Asn Ile Thr Ala Gly His Leu Leu Met His
Leu Ile Gly Ser 165 170 175Ala Thr Leu Ala Met Ser Thr Ile Asn Leu
Pro Ser Thr Leu Ile Ile 180 185 190Phe Thr Ile Leu Ile Leu Leu Thr
Ile Leu Glu Ile Ala Val Thr Phe 195 200 205Leu Gly Leu Leu Thr Ala
Leu Asp Leu Asn Tyr Leu Thr Asn Lys Leu 210 215 220Lys Met Lys Ser
Pro Leu Cys Thr Phe Tyr Phe Ser Asn Met Leu Gly225 230 235 240Phe
Tyr Pro Ser Ile Thr His Arg Thr Ile Pro Tyr Leu Gly Leu Leu 245 250
255Thr Ser Gln Asn Leu Pro Leu Leu Leu Leu Asp Leu Thr Trp Leu Glu
260 265 270Lys Leu Leu Pro Lys Thr Ile Ser Gln His Gln Ile Ser Thr
Ser Ile 275 280 285Ile Thr Ser Thr Gln Lys Gly Met Ile Lys Leu Tyr
Phe Leu Ser Phe 290 295 300Phe Phe Pro Leu Ile Leu Thr Leu Leu Leu
Ile Thr Xaa305 310 315
* * * * *