U.S. patent application number 12/239581 for DLC-1 gene deleted in cancers was filed with the patent office on 2008-09-26 and published on 2009-09-24.
This patent application is currently assigned to The Government of the United States of America as Represented by the Secretary of the Department of Health and Human Services. Invention is credited to NICOLAS POPESCU, SNORRI S. THORGEIRSSON, BAO-ZHU YUAN.
Publication Number | 20090239220
Application Number | 12/239581
Family ID | 34576144
Filed Date | 2008-09-26
United States Patent Application | 20090239220
Kind Code | A1
YUAN; BAO-ZHU; et al. | September 24, 2009
DLC-1 GENE DELETED IN CANCERS
Abstract
A cDNA molecule corresponding to a newly discovered human gene
is disclosed. The new gene, which is frequently deleted in liver
cancer cells and cell lines, is called the DLC-1 gene. Because the
gene is frequently deleted in liver cancer cells, but present in
normal cells, it is thought to act as a tumor suppressor. This gene
is also frequently deleted in breast and colon cancers, and its
expression is decreased or undetectable in many prostate and colon
cancers. Also disclosed is the amino acid sequence of the protein
encoded by the DLC-1 gene. Methods of using these biological
materials in the diagnosis and treatment of hepatocellular cancer,
breast cancer, colon cancer, prostate cancer, and adenocarcinomas
are presented.
Inventors: | YUAN; BAO-ZHU (COLUMBIA, MD); THORGEIRSSON; SNORRI S. (BETHESDA, MD); POPESCU; NICOLAS (BETHESDA, MD)
Correspondence Address: | KLARQUIST SPARKMAN, LLP, 121 S.W. SALMON STREET, SUITE #1600, PORTLAND, OR 97204-2988, US
Assignee: | The Government of the United States of America as Represented by the Secretary of the Department of Health and Human Services
Family ID: | 34576144
Appl. No.: | 12/239581
Filed: | September 26, 2008
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
10995914 | Nov 22, 2004 | 7534565
12239581 | |
09644947 | Aug 23, 2000 | 6897018
10995914 | |
60075952 | Feb 25, 1998 |
Current U.S. Class: | 435/6.14; 435/320.1; 435/7.1; 530/350; 530/387.9; 536/22.1; 536/23.5
Current CPC Class: | A61K 48/00 20130101; C07K 14/4703 20130101
Class at Publication: | 435/6; 536/22.1; 435/320.1; 530/350; 530/387.9; 536/23.5; 435/7.1
International Class: | C12Q 1/68 20060101 C12Q001/68; C07H 21/00 20060101 C07H021/00; C12N 15/63 20060101 C12N015/63; C07K 14/00 20060101 C07K014/00; C07K 16/00 20060101 C07K016/00; C07H 21/04 20060101 C07H021/04; G01N 33/53 20060101 G01N033/53
Foreign Application Data
Date | Code | Application Number
Feb 25, 1999 | US | PCT/US99/04164
Claims
1. An isolated nucleic acid molecule selected from the group
consisting of: (a) nucleic acid molecules encoding an amino acid
sequence as shown in Seq. I.D. No. 2; (b) nucleic acid molecules
which hybridize under stringent conditions to a nucleic acid
molecule according to (a), and which encode a protein having an
activity of the DLC-1 protein; (c) nucleic acid molecules
comprising a nucleotide sequence as shown in Seq. I.D. No. 1.
2. An isolated nucleic acid molecule according to claim 1(b)
wherein the DNA molecule hybridizes to a DNA molecule according to
(a) under conditions wherein DNA molecules with more than 25%
mismatch will not hybridize to each other.
3. A recombinant vector including a nucleic acid molecule according
to claim 1.
4. An isolated nucleic acid molecule comprising at least 15
consecutive nucleotides of a DNA sequence shown in Seq. I.D. No.
1.
5. An isolated nucleic acid molecule according to claim 4 wherein
the molecule comprises at least 25 consecutive nucleotides of the
DNA sequence shown in Seq. I.D. No. 1.
6. A recombinant DNA vector including a DNA molecule according to
claim 4.
7. The purified protein of claim 15, wherein the amino acid
sequence comprises the sequence specified in 15(a).
8. An antibody capable of specifically binding to a protein
according to claim 7.
9. The antibody of claim 8 wherein the antibody is a monoclonal
antibody.
10. An isolated cDNA molecule encoding a human DLC-1 protein.
11. A cDNA according to claim 10 wherein the cDNA comprises the
nucleic acid sequence depicted as bases 325-3657 of Seq. I.D. No.
1.
12. A cDNA according to claim 10 wherein the human DLC-1 protein
comprises an amino acid sequence as depicted in Seq. I.D. No.
2.
13. A method of detecting a deletion of a DLC-1 gene in a cell, the
method comprising (a) incubating an oligonucleotide according to
claim 15 with nucleic acid of the cell under conditions such that
the oligonucleotide will specifically hybridize to a DLC-1 gene
present in the nucleic acid to form an oligonucleotide:DLC-1 gene
complex; (b) detecting the presence or absence of
oligonucleotide:DLC-1 complexes, wherein the absence of said
complexes indicates deletion of the DLC-1 gene.
14. A method of detecting the presence of DLC-1 protein in a cell,
the method comprising (a) incubating an antibody according to claim
10 with proteins of the cell under conditions such that the
antibody will specifically bind to a DLC-1 protein present in the
cell to form an antibody:DLC-1 protein complex; and (b) detecting
the presence of antibody:DLC-1 protein complexes.
15. A purified protein having DLC-1 protein biological activity,
and comprising an amino acid sequence selected from the group of:
(a) the amino acid sequence shown in Seq. I.D. No.2; (b) amino acid
sequences that differ from those specified in (a) by one or more
conservative amino acid substitutions; and (c) amino acid sequences
having at least 90% sequence identity to the sequence specified in
(a).
16. The purified protein of claim 15(c), wherein the amino acid
sequences have at least 95% sequence identity to the sequence
specified in 15(a).
17. The purified protein of claim 15(c), wherein the amino acid
sequences have at least 98% sequence identity to the sequence
specified in 15(a).
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. application Ser.
No. 10/995,914, filed Nov. 22, 2004, which is a continuation of
application Ser. No. 09/644,947, filed Aug. 23, 2000, which claims
priority under 35 U.S.C. § 120 from International Application
No. PCT/US99/04164, filed Feb. 25, 1999, and under 35 U.S.C.
§ 119 from U.S. Provisional Application No. 60/075,952, filed
Feb. 25, 1998. The prior applications are incorporated herein by
reference in their entirety.
FIELD OF THE INVENTION
[0002] The present invention relates to the cloning and sequencing
of the human cDNA molecule corresponding to a newly discovered
gene, called DLC-1, which is frequently deleted in liver, breast
and colon cancer cells. In addition, lower DLC-1 expression is
frequently observed in liver, colon, and prostate cancer cells,
compared to normal tissue. The present invention also relates to
methods for screening and diagnosis of a genetic predisposition to
liver cancer and other cancer types, and methods of gene therapy
utilizing recombinant DNA technologies.
BACKGROUND OF THE INVENTION
[0003] The isolation of genes involved in human cancer development
is critical for uncovering the molecular basis of cancer. One
theory of cancer development holds that there are tumor suppressor
genes in all normal cells which, when they become non-functional
due to mutations, cause neoplastic development (Knudsen et al.,
Cancer Res. 45:1482, 1985). Evidence to support this theory has
been found in the cases of human retinoblastoma and colorectal
tumors (see U.S. Pat. No. 5,330,892 and references cited therein),
as well as in connection with breast and ovarian cancers (see U.S.
Pat. No. 5,693,473 and references cited therein).
[0004] More particularly, recurrent deletions on the short arm of
human chromosome 8 in cases of liver, breast, lung and prostate
cancers have raised the possibility of the presence of tumor
suppressor genes in that location. For example, loss on the short
arm of chromosome 8 in prostate cancer (PC) cells was described in
Brothman (Cancer Genet. Cytogenet. 95:116-21, 1997). Similar
deletions on the short arm of chromosome 8 also have been detected
in primary hepatocellular cancer (HCC), non-small cell lung
carcinoma (NSCLC) and node-negative breast carcinomas (Isola, Am.
J. Pathol. 147:905-11, 1995; and Marchio, et al., Genes Chromo.
Canc. 18:59-65, 1997).
[0005] While recurrent chromosome 8 deletions in malignant tumors
support the relevance of this lesion in carcinogenesis, scientists
previously have been unable to identify the tumor suppressor genes
involved in such deletions. This lack of knowledge concerning the
molecular genetic basis of HCC, and other cancers associated with
chromosome 8 deletions, has hampered efforts to diagnose the
predisposition to such diseases and to develop more effective
treatments aimed at curing genetic deficiencies.
[0006] Therefore, it is an object of the present invention to
provide a human cDNA molecule corresponding to a previously unknown
gene located on the short arm of chromosome 8, the deletion of
which appears to be closely associated with the development of HCC
and other cancers. The cloning and sequencing of such a cDNA
molecule enables new and improved methods of diagnosis and
treatment of such diseases.
SUMMARY OF THE INVENTION
[0007] The present invention discloses the discovery of a new human
gene involved in the pathogenesis of hepatocellular cancer (HCC),
the most common primary liver cancer, and one of the most common
cancers in the world, with 251,000 new cases reported each year
(Simonetti et al., Dig. Dis. Sci. 36:962-72, 1991; Harris et al.,
Cancer Cells 2:146-8, 1990; Marchio, et al., Genes Chromo. Cancer
18:59-65, 1997). More specifically, the present invention discloses
the isolation of the full length cDNA and the chromosomal
localization of a new gene which is frequently deleted in liver
cancer, and hence is named the DLC-1 gene.
[0008] The full-length cDNA for DLC-1 is 3850 bp long (Seq. I.D.
No. 1), encodes a protein of 1091 amino acids (Seq. I.D. No. 2),
and was localized by fluorescence in situ hybridization to
chromosome 8 at bands p21.3-22. Because the DLC-1 gene is deleted
from a significant percentage of primary HCC tumor cells and cell
lines, primary breast cancers (BC), and colorectal cancer (CRC)
cell lines, and its expression is decreased or not observed in a
significant percentage of HCC cell lines, CRC cell lines and
prostate cancer (PC) cell lines, the DLC-1 gene appears to operate
as a tumor suppressor in liver cancer and other cancers including
PC, CRC and BC.
[0009] The object of identifying the hitherto unknown DLC-1 gene
has been achieved by providing an isolated human cDNA molecule
which is able specifically to correct the cellular defects
characteristic of cells from patients with a deleted or mutated
DLC-1 gene. Specifically, the invention provides, for the first
time, an isolated cDNA molecule which, when transfected into cells
derived from a patient with a deleted or mutated DLC-1 gene, can
produce the DLC-1 protein believed to be active in suppressing HCC
pathogenesis and other cancers, such as breast, colorectal, and
prostate cancers. The invention encompasses the DLC-1 cDNA molecule
(derived from normal human liver cells), the nucleotide sequence of
this cDNA, and the putative amino acid sequence of the DLC-1
protein encoded by this cDNA.
[0010] Having herein provided the nucleotide sequence of the DLC-1
cDNA, correspondingly provided are the complementary DNA strands of
the cDNA molecule and DNA molecules which hybridize under stringent
conditions to the DLC-1 cDNA molecule or its complementary strand.
Such hybridizing molecules include DNA molecules differing only by
minor sequence changes, including nucleotide substitutions,
deletions and additions. Also comprehended by this invention are
isolated oligonucleotides comprising at least a segment of the cDNA
molecule or its complementary strand, such as oligonucleotides
which may be employed as effective DNA hybridization probes or as
primers useful in the polymerase chain reaction. Such probes and
primers are particularly useful in the
screening and diagnosis of persons genetically predisposed to HCC,
and other cancers, as the result of DLC-1 gene deletions.
[0011] Hybridizing DNA molecules and variants on the DLC-1 cDNA may
readily be created by standard molecular biology techniques.
Through the manipulation of the nucleotide sequence of the human
cDNA provided by this invention by standard molecular biology
techniques, variants of the DLC-1 protein may be made which differ
in precise amino acid sequence from the disclosed protein yet which
maintain the essential characteristics of the DLC-1 protein or
which are selected to differ in one or more characteristics from
this protein. Such variants are another aspect of the present
invention.
[0012] Also provided by the present invention are recombinant DNA
vectors comprising the disclosed DNA molecules, and transgenic host
cells containing such recombinant vectors.
[0013] Having isolated the human DLC-1 cDNA sequence, the genomic
sequence for the gene was determined according to the following
method: A human genomic library constructed using the P1 vector,
pAD10SacBII, was transferred from its original E. coli host into a
second E. coli host, strain N3516, following procedures well-known
in the art. A positive P1 clone containing the DLC-1 gene was then
obtained by performing a protocol of PCR-based P1 library screening
(Sheperd, Proc. Natl. Acad. Sci. USA 91:2629-33, 1994; Neuhausen,
Hum. Mol. Genet. 3:1919-26, 1994). The PCR primers used in this
screening, designed from a genomic fragment isolated through
Representational Difference Analysis (described more fully below),
are listed below:
PL7-3F 5' GACACCACCATCTCTGTGCTC 3' (Seq. I.D. No. 7)
PL7-3R 5' GCAGACTGTCCTTCGTAGTTG 3' (Seq. I.D. No. 8)
An isolated and purified biological sample of this genomic DLC-1
gene was deposited with the American Type Culture Collection (ATCC)
in Manassas, Va., on Feb. 25, 1998, under accession number 98676.
The present invention also provides for the use of the DLC-1 cDNA,
the corresponding genomic gene and of the DLC-1 protein, and
derivatives thereof, in aspects of diagnosis and treatment of HCC,
and other cancers including, but not limited to PC, BC and CRC,
resulting from DLC-1 deletion or mutation.
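As a rough sanity check on the screening primers listed above (Seq. I.D. Nos. 7 and 8), the following Python sketch computes each primer's length, GC content, and an approximate melting temperature. The Wallace rule used here (2 degrees C per A/T, 4 degrees C per G/C) is an assumption introduced for illustration; it is a coarse estimate and is not a method described in this application. The names `gc_count`, `wallace_tm`, and `describe` are hypothetical.

```python
# Primer sequences exactly as listed in paragraph [0013].
PRIMERS = {
    "PL7-3F": "GACACCACCATCTCTGTGCTC",  # Seq. I.D. No. 7
    "PL7-3R": "GCAGACTGTCCTTCGTAGTTG",  # Seq. I.D. No. 8
}


def gc_count(seq: str) -> int:
    """Return the number of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature in degrees C.

    Uses the Wallace rule (an assumption, not from the source):
    Tm = 2 * (A + T) + 4 * (G + C). Assumes the sequence holds only A, C, G, T.
    """
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


def describe(seq: str) -> dict:
    """Summarize length, GC percentage, and estimated Tm for one primer."""
    return {
        "length": len(seq),
        "gc_percent": round(100 * gc_count(seq) / len(seq), 1),
        "tm": wallace_tm(seq),
    }


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, describe(seq))
```

Both primers are 21 nucleotides long, and under this crude estimate their melting temperatures differ by only a few degrees, which is the kind of match one generally looks for in a PCR primer pair.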
[0014] An embodiment of the present invention is a method for
screening a subject to determine if the subject carries a mutant
DLC-1 gene, or if the gene has been partially or completely
deleted, as is thought to occur in many HCC cases. The method
comprises the steps of: providing a biological sample obtained from
the subject, which sample includes DNA or RNA, and providing an
assay for detecting in the biological sample the presence of a
mutant DLC-1 gene, a mutant DLC-1 RNA, or the absence, through
deletion, of the DLC-1 gene and corresponding RNA.
[0015] The foregoing assay may be assembled in the form of a
diagnostic kit and preferably comprises either: hybridization with
oligonucleotides; PCR amplification of the DLC-1 gene or a part
thereof using oligonucleotide primers; RT-PCR amplification of the
DLC-1 RNA or a part thereof using oligonucleotide primers; or
direct sequencing of the DLC-1 gene of the subject's genome using
oligonucleotide primers. The efficiency of these molecular genetic
methods should permit a rapid classification of patients affected
by deletions or mutations of the DLC-1 gene.
[0016] A further aspect of the present invention is a method for
screening a subject to assay for the presence of a mutant or
deleted DLC-1 gene, comprising the steps of: providing a biological
sample of the subject which sample contains cellular proteins, and
providing an immunoassay for quantitating the level of DLC-1
protein in the biological sample. Diagnostic methods for the
detection of mutant or deleted DLC-1 genes made possible by this
invention will provide an enhanced ability to diagnose
susceptibility to HCC and other cancers such as PC, BC and CRC.
[0017] Another aspect of the present invention is an antibody
preparation comprising antibodies that specifically detect the
DLC-1 protein, wherein the antibodies are selected from the group
consisting of monoclonal antibodies and polyclonal antibodies.
[0018] Those skilled in the art will appreciate the utility of this
invention is not limited to the specific experimental modes and
materials described herein.
[0019] The foregoing and other features and advantages of the
invention will become more apparent from the following detailed
description and accompanying drawings.
BRIEF DESCRIPTION OF DRAWINGS
[0020] FIG. 1 is a digital image of a Southern blot which compares
primary HCC tumor cells (T) with healthy normal liver cells (N),
and demonstrates a genomic deletion of the L7-3 clone in the HCC
cells. Primary tumors 94-25T, 95-03T and 95-06T showed a 50% decrease
of DNA intensity as compared with normal liver tissues.
[0021] FIG. 2 is a digital image of a Southern blot which compares
representative HCC cell lines with healthy liver cells (NL-DNA),
and demonstrates a genomic deletion of the L7-3 clone in 9 of 11
HCC cell lines. Cell lines Sk-Hep-1, PLC/PRF/5, WRL, Focus, HLF,
Hep3B, Huh-7, Huh-6, and Chang showed reduction of DNA intensity
compared with human normal liver genomic DNA.
[0022] FIG. 3 is a digital image of a Southern blot which compares
representative primary human breast cancers (T) with healthy normal
blood cells (N) from the same patient, and demonstrates a genomic
deletion of the DLC-1 gene in 7 of 15 primary breast cancers. A
representative 10 of the 15 primary tumors are shown. DNA was
digested with either (a) BglII or (b) BamHI. Tumors IC11T,
IC12T, IC13T, IC2T, IC6T, and IC7T showed reduction of DNA
intensity compared with normal DNA.
[0023] FIG. 4 is a digital image of a Southern blot which compares
representative human colon cancer cell lines with normal DNA (lane
1), and demonstrates a genomic deletion of the DLC-1 gene in two
out of five colon cancer cell lines. Cell lines SW1116 and SW403
(lanes 5 and 6) showed reduction of DNA intensity compared with
normal DNA (lane 1).
[0024] FIG. 5 is a digital image of a Northern blot showing the
mRNA expression of the DLC-1 gene in normal human tissues. The
DLC-1 gene is expressed in all normal tissues tested as a 7.5 kb
major transcript and a 4.5 kb minor transcript.
[0025] FIG. 6 is a digital image of a Northern blot comparing the
mRNA expression of DLC-1 gene in normal human tissues (NL-RNA) and
HCC cell lines. DLC-1 mRNA expression was decreased or not detected
in the WRL, 7703, Chang and Focus HCC cell lines.
[0026] FIG. 7 is a digital image of a Northern blot comparing the
mRNA expression of DLC-1 gene in normal human tissues (CDD33C0) and
human colon cancer cell lines. DLC-1 mRNA expression was
decreased or not detected in HCT-15, LS147T, DLD-1, HD29, SW1116,
T84, SW1417, SW403, SW948, LS180, and SW48 cell lines.
[0027] FIG. 8 is a digital image of a Northern blot showing the
mRNA expression of DLC-1 gene in three human prostate cancer cell
lines. DLC-1 mRNA was not detected in the LN-Cap and SP3504 cell
lines.
[0028] FIG. 9 is a schematic drawing of the human DLC-1 gene. Exons
1-14 are represented by boxes, with introns represented by the
lines connecting the boxes.
[0029] FIG. 10 is a schematic drawing of how the mouse DLC-1 gene
was targeted using homologous recombination. The resulting
construct can be used to generate DLC-1 homozygous knock-out
mice.
SEQUENCE LISTING
[0030] The nucleic and amino acid sequences listed in the
accompanying sequence listing are shown using standard letter
abbreviations for nucleotide bases, and three letter code for amino
acids. Only one strand of each nucleic acid sequence is shown, but
the complementary strand is understood as included by any reference
to the displayed strand.
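Because only one strand of each sequence is displayed, the complementary strand has to be derived from it. A minimal Python sketch of that derivation follows; the function name `reverse_complement` is hypothetical, and the sketch assumes an unambiguous A/C/G/T alphabet.

```python
# Base-pairing rules: A pairs with T, C pairs with G (both cases handled).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, written 5' to 3'.

    The complement of each base is taken and the result is reversed,
    because the two strands of a DNA duplex run antiparallel.
    """
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    displayed = "GACACCACCATCTCTGTGCTC"  # primer PL7-3F, Seq. I.D. No. 7
    print(reverse_complement(displayed))
```

Applying the function twice returns the original strand, which is a convenient check that the pairing table is complete.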
[0031] Seq. I.D. No. 1 is the nucleotide sequence of the human
DLC-1 cDNA.
[0032] Seq. I.D. No. 2 is the amino acid sequence of the human
DLC-1 protein.
[0033] Seq. I.D. Nos. 3-4 are oligonucleotide sequences of PCR
primers which can be used to amplify the entire DLC-1 cDNA
molecule.
[0034] Seq. I.D. Nos. 5-6 are oligonucleotide sequences of PCR
primers which can be used to amplify the open reading frame of the
DLC-1 cDNA molecule.
[0035] Seq. I.D. Nos. 7-8 are the oligonucleotide sequences of PCR
primers used to screen a human genomic library.
[0036] Seq. I.D. Nos. 9-11 are the oligonucleotide sequences of the
primers used for 5' and 3' RACE.
[0037] Seq. I.D. No. 10 is the nucleotide sequence for the DLC-1
cDNA probe.
[0038] Seq. I.D. No. 11 is the nucleotide sequence for the L7-3
probe.
[0039] Seq. I.D. No. 12 is the nucleotide sequence for the P-35
probe.
[0040] Seq. I.D. No. 13 is the nucleotide sequence for ______ of
part of the human genomic DLC-1 sequence.
[0041] Seq. I.D. No. 14 is the nucleotide sequence for ______ of
the human genomic DLC-1 sequence.
[0042] Seq. I.D. No. 15 is the nucleotide sequence for ______ of
part of the human genomic DLC-1 sequence.
[0043] Seq. I.D. No. 16 is the nucleotide sequence for ______ of
part of the human genomic DLC-1 sequence.
[0044] Seq. I.D. No. 17 is the nucleotide sequence for ______ of
part of the human genomic DLC-1 sequence.
[0045] Seq. I.D. No. 18 is the nucleotide sequence for ______ of
part of the human genomic DLC-1 sequence.
[0046] Seq. I.D. No. 19 is the nucleotide sequence for ______ of
part of the human genomic DLC-1 sequence.
[0047] Seq. I.D. No. 20 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0048] Seq. I.D. No. 21 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0049] Seq. I.D. No. 22 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0050] Seq. I.D. No. 23 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0051] Seq. I.D. No. 24 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0052] Seq. I.D. No. 25 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0053] Seq. I.D. No. 26 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0054] Seq. I.D. No. 27 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0055] Seq. I.D. No. 28 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0056] Seq. I.D. No. 29 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0057] Seq. I.D. No. 30 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
[0058] Seq. I.D. No. 31 is the nucleotide sequence for ______ of
part of the mouse genomic DLC-1 sequence.
DETAILED DESCRIPTION OF THE INVENTION
[0059] The present invention discloses the isolation of the full
length cDNA and the chromosomal localization of a new gene, called
the DLC-1 gene. As discussed in Examples 1-3 below, deletion of the
DLC-1 gene has been detected in about half of the primary HCC tumor
cells and in a majority of the HCC cell lines which were studied.
In addition, studies of other cancers revealed that DLC-1 is also
deleted in 7 of 15 primary breast cancers and in 2 of 5 CRC cell
lines. Moreover, the DLC-1 gene was not expressed in 29% of HCC
cell lines, 64% of CRC cell lines and 67% of PC cell lines. These
frequent deletions suggest that the DLC-1 gene is a tumor
suppressor gene for HCC as well as PC, BC and CRC.
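The counts given in the figure descriptions can be converted to percentages for comparison with the figures quoted in this paragraph. The sketch below is illustrative only: the dictionary labels are hypothetical, and only counts stated explicitly in the figure descriptions (FIGS. 2, 3, 4 and 8) are used.

```python
# (affected, total) pairs taken from the figure descriptions in this document.
OBSERVATIONS = {
    "HCC cell lines with L7-3 deletion (FIG. 2)": (9, 11),
    "primary breast cancers with DLC-1 deletion (FIG. 3)": (7, 15),
    "CRC cell lines with DLC-1 deletion (FIG. 4)": (2, 5),
    "PC cell lines without detectable DLC-1 mRNA (FIG. 8)": (2, 3),
}


def percent(affected: int, total: int) -> int:
    """Return the affected fraction as a whole-number percentage."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * affected / total)


if __name__ == "__main__":
    for label, (affected, total) in OBSERVATIONS.items():
        print(f"{label}: {affected}/{total} = {percent(affected, total)}%")
```

The 2 of 3 prostate cancer cell lines correspond to the 67% quoted above. The sample sizes are small, so these percentages should be read as indicative rather than as population estimates.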
[0060] The full-length cDNA for DLC-1 is 3850 bp long (Seq. I.D.
No. 1) and encodes a protein of 1091 amino acids (Seq. I.D. No. 2).
Fluorescent in situ hybridization has generally localized the gene
on the short arm of chromosome 8 at bands p21.3-22.
[0061] Further evidence that the DLC-1 gene acts as a tumor
suppressor is found in its 86% homology with the rat p122 RhoGAP
gene (Homma and Emori, EMBO. J. 14:286-91, 1995). The rat p122
RhoGAP gene encodes a GTPase activating protein that catalyzes the
conversion of the active GTP-bound Rho complex to an inactive
GDP-bound one. The Rho family proteins, a subfamily of the Ras
small GTP binding superfamily, function as important regulators in
the organization of actin cytoskeleton (Nobes, et al., Cell
81:53-62, 1995). Rho proteins are also involved in Ras-mediated
oncogenic transformation (Khosravi-Far, et al., Adv. Cancer Res.
69:59-105, 1997). GAP genes may function as tumor suppressors by
down-regulating oncogenic Rho proteins (Quilliam, et al. Bioessays
17:395-404, 1995; Wang, et al., Cancer Res. 57:2478-84, 1997).
Based on its substantial homology with the rat p122 RhoGAP gene, it
appears likely the DLC-1 gene is a human RhoGAP gene involved in
the suppression of HCC tumors.
DEFINITIONS
[0062] In order to facilitate review of the various embodiments of
the invention, the following definition of terms is provided:
[0063] Breast Carcinoma (BC): breast cancer thought to result, in
some instances, from the deletion or mutation of the DLC-1 tumor
suppressor gene.
[0064] cDNA (complementary DNA): a piece of DNA lacking internal,
non-coding segments (introns) and regulatory sequences which
determine transcription. cDNA is synthesized in the laboratory by
reverse transcription from messenger RNA extracted from cells.
[0065] Colorectal Carcinoma (CRC): colorectal cancer (such as
adenocarcinoma) thought to result, in some instances, from the
deletion or mutation of the DLC-1 tumor suppressor gene.
[0066] Deletion: the removal of a sequence of DNA, the regions on
either side being joined together.
[0067] DLC-1 gene: a gene, the mutation of which is associated with
hepatocellular, breast, colon and prostate carcinomas, and
particularly adenocarcinomas of those organs. A mutation of the
DLC-1 gene may include nucleotide sequence changes, additions or
deletions, including deletion of large portions or all of the DLC-1
gene. The term "DLC-1 gene" is understood to include the various
sequence polymorphisms and allelic variations that exist within the
population. This term relates primarily to an isolated coding
sequence, but can also include some or all of the flanking
regulatory elements and/or intron sequences.
[0068] DLC-1 cDNA: a mammalian cDNA molecule which, when
transfected into DLC-1 cells, expresses the DLC-1 protein. The
DLC-1 cDNA can be derived by reverse transcription from the mRNA
encoded by the DLC-1 gene and lacks internal non-coding segments
and transcription regulatory sequences present in the DLC-1
gene.
[0069] DLC-1 protein: the protein encoded by the DLC-1 cDNA, the
altered expression or mutation of which can predispose to the
development of certain cancers, such as hepatocellular carcinoma.
This definition is understood to include the various sequence
polymorphisms that exist, wherein amino acid substitutions in the
protein sequence do not affect the essential functions of the
protein.
[0070] DNA: deoxyribonucleic acid. DNA is a long chain polymer
which comprises the genetic material of most living organisms (some
viruses have genes comprising ribonucleic acid (RNA)). The
repeating units in DNA polymers are four different nucleotides,
each of which comprises one of the four bases, adenine, guanine,
cytosine and thymine bound to a deoxyribose sugar to which a
phosphate group is attached. Triplets of nucleotides, referred to
as codons, in DNA molecules code for amino acids in a polypeptide.
The term codon is also used for the corresponding (and
complementary) sequences of three nucleotides in the mRNA into
which the DNA sequence is transcribed.
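The relationship between nucleotide triplets and amino acids can be illustrated with a short Python sketch that splits a coding-strand DNA sequence into codons and translates them with the standard genetic code. The example sequence is hypothetical and is not taken from Seq. I.D. No. 1; one-letter amino acid codes are used here for brevity, whereas the sequence listing uses three-letter codes.

```python
from itertools import product

# Standard genetic code, laid out in the conventional T, C, A, G order.
# "*" marks a termination (stop) codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(dna: str) -> list:
    """Split a DNA sequence into consecutive triplets, dropping any remainder."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return [dna[i:i + 3] for i in range(0, usable, 3)]


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence, one letter per codon."""
    return "".join(CODON_TABLE[codon] for codon in codons(dna))


if __name__ == "__main__":
    example = "ATGGCCTGA"  # hypothetical: Met, Ala, stop
    print(codons(example), translate(example))
```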
[0071] Hepatocellular carcinoma (HCC): liver cancer thought to
result, in some instances, from the deletion or mutation of the
DLC-1 tumor suppressor gene.
[0072] Isolated: requires that the material be removed from its
original environment. For example, a naturally occurring DNA
molecule present in a living animal is not isolated, but the same
DNA molecule, separated from some or all of the coexisting
materials in the natural system, is isolated.
[0073] Mutant DLC-1 gene: a mutant form of the DLC-1 gene which in
some embodiments is associated with hepatocellular, breast, colon
and/or prostate carcinoma.
[0074] Mutant DLC-1 RNA: the RNA transcribed from a mutant DLC-1
gene.
[0075] Mutant DLC-1 protein: the protein encoded by a mutant DLC-1
gene.
[0076] Oligonucleotide: A linear polynucleotide sequence of up to
about 200 nucleotide bases in length, for example a polynucleotide
(such as DNA or RNA) which is at least 6 nucleotides, for example
at least 15, 50, 100 or even 200 nucleotides long.
[0077] ORF: open reading frame. Contains a series of nucleotide
triplets (codons) coding for amino acids without any termination
codons. These sequences are usually translatable into protein.
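A simple way to locate candidate open reading frames is to scan each forward reading frame for a start codon followed, in frame, by a stop codon. The Python sketch below does this under stated assumptions: it treats ATG as the only start codon, scans the three forward frames only, and reports coordinates that include the terminating stop codon. The function name `find_orfs` and the test sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 1) -> list:
    """Return (start, end) pairs for ATG...stop stretches in forward frames.

    Coordinates are 0-based and end-exclusive; the stop codon is included.
    min_codons is the minimum number of codons before the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next ORF in this frame
    return sorted(orfs)


if __name__ == "__main__":
    seq = "CCATGAAATTTTAGGG"  # hypothetical sequence
    for start, end in find_orfs(seq):
        print(start, end, seq[start:end])
```

Real gene-finding also considers the reverse strand and alternative start codons; this sketch is deliberately limited to the definition given above.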
[0078] PCR: polymerase chain reaction. Describes a technique in
which cycles of denaturation, annealing with primer, and then
extension with DNA polymerase are used to amplify the number of
copies of a target DNA sequence.
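The amplification achieved by repeated cycling can be estimated with simple arithmetic: if each cycle copies a fraction of the templates, the copy number grows geometrically. The Python sketch below uses the idealized model copies = initial x (1 + efficiency)^cycles, which is an assumption introduced here for illustration; real reactions plateau as reagents are consumed. The function name `pcr_copies` is hypothetical.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Estimate target copies after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 means perfect doubling, 0.0 means no amplification).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, pcr_copies(1, n), pcr_copies(1, n, efficiency=0.9))
```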
[0079] Pharmaceutically acceptable carriers: The pharmaceutically
acceptable carriers useful in this invention are conventional.
Remington's Pharmaceutical Sciences, by E. W. Martin, Mack
Publishing Co., Easton, Pa., 15th Edition (1975), describes
compositions and formulations suitable for pharmaceutical delivery
of the fusion proteins herein disclosed.
[0080] In general, the nature of the carrier will depend on the
particular mode of administration being employed. For instance,
parenteral formulations usually comprise injectable fluids that
include pharmaceutically and physiologically acceptable fluids such
as water, physiological saline, balanced salt solutions, aqueous
dextrose, glycerol or the like as a vehicle. For solid compositions
(e.g., powder, pill, tablet, or capsule forms), conventional
non-toxic solid carriers can include, for example, pharmaceutical
grades of mannitol, lactose, starch, or magnesium stearate. In
addition to biologically-neutral carriers, pharmaceutical
compositions to be administered can contain minor amounts of
non-toxic auxiliary substances, such as wetting or emulsifying
agents, preservatives, and pH buffering agents and the like, for
example sodium acetate or sorbitan monolaurate.
[0081] Probes and primers: Nucleic acid probes and primers may
readily be prepared based on the nucleic acids provided by this
invention. A probe comprises an isolated nucleic acid attached to a
detectable label or reporter molecule. Typical labels include
radioactive isotopes, ligands, chemiluminescent agents, and
enzymes. Methods for labeling and guidance in the choice of labels
appropriate for various purposes are discussed, e.g., in Sambrook
et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor,
N.Y., 1989) and Ausubel et al. (Current Protocols in Molecular
Biology, Greene Publishing Associates and Wiley-Intersciences,
1987).
[0082] Primers are short nucleic acids, for example DNA
oligonucleotides 15 nucleotides or more in length. Primers may be
annealed to a complementary target DNA strand by nucleic acid
hybridization to form a hybrid between the primer and the target
DNA strand, and then extended along the target DNA strand by a DNA
polymerase enzyme. Primer pairs can be used for amplification of a
nucleic acid sequence, e.g., by the polymerase chain reaction (PCR)
or other nucleic-acid amplification methods known in the art.
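How a primer pair delimits the amplified region can be sketched in Python as an exact-match search: the forward primer is located on the displayed strand, and the reverse primer is located through its reverse complement downstream of it. This is a simplification, since real annealing tolerates some mismatch and depends on reaction conditions. The function names and the template are hypothetical; the template is built artificially around the PL7-3F and PL7-3R primer sequences for illustration only.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of an A/C/G/T sequence, 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predict_amplicon(template: str, forward: str, reverse: str):
    """Return the region bounded by the two primers, or None if either is absent.

    Only exact matches on the displayed strand are considered.
    """
    template = template.upper()
    forward = forward.upper()
    start = template.find(forward)
    if start == -1:
        return None
    # The reverse primer anneals to the displayed strand's complement,
    # so its reverse complement appears on the displayed strand.
    site = reverse_complement(reverse)
    hit = template.find(site, start + len(forward))
    if hit == -1:
        return None
    return template[start:hit + len(site)]


if __name__ == "__main__":
    fwd = "GACACCACCATCTCTGTGCTC"  # PL7-3F, Seq. I.D. No. 7
    rev = "GCAGACTGTCCTTCGTAGTTG"  # PL7-3R, Seq. I.D. No. 8
    # Hypothetical template: flanks and spacer are invented for this example.
    template = "TTTT" + fwd + "ACGTACGTAC" + reverse_complement(rev) + "GGGG"
    print(predict_amplicon(template, fwd, rev))
```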
[0083] Methods for preparing and using probes and primers are
described, for example, in Sambrook et al. (Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, N.Y., 1989), Ausubel et al.
(Current Protocols in Molecular Biology, Greene Publishing
Associates and Wiley-Intersciences, 1987), and Innis et al., (PCR
Protocols, A Guide to Methods and Applications, Innis et al.
(eds.), Academic Press, Inc., San Diego, Calif., 1990). PCR primer
pairs can be derived from a known sequence, for example, by using
computer programs intended for that purpose such as Primer (Version
0.5, © 1991, Whitehead Institute for Biomedical Research,
Cambridge, Mass.).
[0084] Prostate Carcinoma (PC): prostate cancer (such as prostatic
adenocarcinoma) thought to result, in some instances, from the
deletion or mutation of the DLC-1 tumor suppressor gene.
[0085] Protein: a biological molecule expressed by a gene and
comprised of amino acids.
[0086] Purified: the term "purified" does not require absolute
purity; rather, it is intended as a relative term. Thus, for
example, a purified protein preparation is one in which the protein
referred to is more pure than the protein in its natural
environment within a cell.
[0087] Recombinant: A recombinant nucleic acid is one that has a
sequence that is not naturally occurring or has a sequence that is
made by an artificial combination of two otherwise separated
segments of sequence. This artificial combination is often
accomplished by chemical synthesis or, more commonly, by the
artificial manipulation of isolated segments of nucleic acids,
e.g., by genetic engineering techniques.
[0088] Representational Difference Analysis (RDA): a PCR-based
subtractive hybridization technique used to identify differences in
the mRNA transcripts present in closely related cell lines.
[0089] Sequence identity: the similarity between two nucleic acid
sequences, or two amino acid sequences, is expressed in terms of
the similarity between the sequences, otherwise referred to as
sequence identity. Sequence identity is frequently measured in
terms of percentage identity (or similarity or homology); the
higher the percentage, the more similar are the two sequences.
[0090] Methods of alignment of sequences for comparison are
well-known in the art. Various programs and alignment algorithms
are described in: Smith and Waterman, Adv. Appl. Math. 2:482, 1981;
Needleman and Wunsch, J. Mol. Biol. 48:443, 1970; Pearson and
Lipman, Methods in Mol. Biol. 24: 307-31, 1988; Higgins and Sharp,
Gene 73:237-44, 1988; Higgins and Sharp, CABIOS 5:151-3, 1989;
Corpet et al., Nuc. Acids Res. 16:10881-90, 1988; Huang et al.,
Comp. Appl. BioSci. 8:155-65, 1992; and Pearson et al., Meth. Mol.
Biol. 24:307-31, 1994.
[0091] The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul
et al., J. Mol. Biol. 215:403-10, 1990) is available from several
sources, including the National Center for Biotechnology Information
(NCBI, Bethesda, Md.) and on the Internet, for use in connection
with the sequence analysis programs blastp, blastn, blastx, tblastn
and tblastx. It can be accessed at
http://www.ncbi.nlm.nih.gov/BLAST/. A description of how to
determine sequence identity using this program is available at
http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html.
[0092] Homologs of the DLC-1 protein are typically characterized by
possession of at least 70% sequence identity counted over the full
length alignment with the disclosed amino acid sequence using the
NCBI Blast 2.0, gapped blastp set to default parameters. Such
homologous peptides will more preferably possess at least 75%, more
preferably at least 80% and still more preferably at least 90% or
95% sequence identity determined by this method. When less than the
entire sequence is being compared for sequence identity, homologs
will possess at least 75% and more preferably at least 85% and more
preferably still at least 90% or 95% sequence identity over short
windows of 10-20 amino acids. Methods for determining sequence
identity over such short windows are described at
http://www.ncbi.nlm.nih.gov/BLAST/blast_FAQs.html. One of skill in
the art will appreciate that these sequence identity ranges are
provided for guidance only; it is entirely possible that strongly
significant homologs or other variants could be obtained that fall
outside of the ranges provided.
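The percentage-identity thresholds discussed above can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned (for example by blastp) and are supplied as equal-length strings with "-" marking gaps. The function names and the toy sequences are hypothetical and are not part of the disclosure, and the sketch does not reproduce the scoring performed by BLAST itself.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes the inputs are already aligned (equal length, '-' for gaps).
    Columns that contain a gap are counted as mismatches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def window_identities(aligned_a: str, aligned_b: str, window: int = 10):
    """Percent identity for every sliding window of `window` alignment columns."""
    if window < 1 or window > len(aligned_a):
        raise ValueError("window must be between 1 and the alignment length")
    return [
        percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
        for i in range(len(aligned_a) - window + 1)
    ]


# Hypothetical toy alignment (20 columns, not DLC-1 sequence).
seq_a = "MKTAYIAKQRQISFVKSHFS"
seq_b = "MKTAYLAKQR-ISFVRSHFS"

full_length = percent_identity(seq_a, seq_b)
short_windows = window_identities(seq_a, seq_b, window=10)
print(f"full-length identity: {full_length:.1f}%")
print(f"lowest 10-residue window: {min(short_windows):.1f}%")
```

In this toy case the full-length value and the lowest short-window value differ, which is why the text gives separate thresholds for full-length comparisons and for windows of 10-20 amino acids.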
[0093] The present invention provides not only the peptide homologs
that are described above, but also nucleic acid molecules that
encode such homologs.
[0094] Transformed: A transformed cell is a cell into which has
been introduced a nucleic acid molecule by molecular biology
techniques. As used herein, the term transformation encompasses all
techniques by which a nucleic acid molecule might be introduced
into such a cell, including transfection with viral vectors,
transformation with plasmid vectors, and introduction of naked DNA
by electroporation, lipofection, and particle gun acceleration.
[0095] Vector: A nucleic acid molecule as introduced into a host
cell, thereby producing a transformed host cell. A vector may
include nucleic acid sequences that permit it to replicate in a
host cell, such as an origin of replication. A vector may also
include one or more selectable marker genes and other genetic
elements known in the art.
[0096] VNTR probes: Variable Number of Tandem Repeat probes. These
are highly polymorphic DNA markers for human chromosomes. The
polymorphism is due to variation in the number of tandem repeats of
a short DNA sequence. Use of these probes enables the DNA of an
individual to be distinguished from that derived from another
individual.
[0097] Tumor: a neoplasm.
[0098] Neoplasm: abnormal growth of cells.
[0099] Cancer: malignant neoplasm that has undergone characteristic
anaplasia with loss of differentiation, increased rate of growth,
invasion of surrounding tissue, and is capable of metastasis.
[0100] Malignant: cells which have the properties of anaplasia,
invasion, and metastasis.
[0101] Normal cells: Non-tumor, non-malignant cells.
[0102] Mammal: This term includes both human and non-human mammals.
Similarly, the term "patient" includes both human and veterinary
subjects.
[0103] Animal: Living multicellular vertebrate organisms, a
category which includes, for example, mammals and birds.
[0104] Transgenic Cell: transformed cells which contain foreign,
non-native DNA.
[0105] Additional definitions of common terms in molecular biology
may be found in Lewin, B. "Genes V" published by Oxford University
Press.
Materials and Methods
Primary HCC Samples and HCC Cell Lines
[0106] All of the primary liver tumor DNAs were obtained from
surgical resection of HCC tissues from patients in Qidong, China.
Each tumor sample was matched with its surrounding non-cancerous
liver tissue. DNAs were extracted after diagnosis of HCC with or
without cirrhosis. The tumors were hepatitis B virus (HBV) positive
by HBVsAg and/or PCR detection of the HBVx gene. HCC cell lines were
obtained from ATCC (Manassas, Va.), Qidong Liver Cancer Institute,
China, and Dr. Curtis C. Harris (Laboratory of Human
Carcinogenesis, Division of Basic Sciences, National Cancer
Institute) (Wang, et al., Chin. J. Oncol. 3:241-4, 1981).
Breast, Prostate and Colorectal Carcinomas
[0107] All normal and CRC (adenocarcinoma) cell lines were
purchased from ATCC (Manassas, Va.). The PC cell lines (also
adenocarcinomas) were obtained from The University of Texas M.D.
Anderson Cancer Center (Houston, Tex.). DNA from primary breast
carcinomas and blood cells was obtained from patients in
Iceland.
Manipulation of Genetic Material
[0108] Unless otherwise specified, manipulation of genetic material
was performed according to standard laboratory procedures, such as
those described in Sambrook et al. (Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor, N.Y., 1989) and Ausubel et al. (Current
Protocols in Molecular Biology, Greene Publishing Associates and
Wiley-Interscience, 1987).
Representational Difference Analysis (RDA)
[0109] One primary HCC having a homozygous point mutation of the
p53 gene that was absent from its surrounding, non-cancerous liver
tissue was selected for analysis. RDA was performed as originally
described in Lisitsyn et al. (Proc. Natl. Acad. Sci. USA 92:151-5,
1995), with tumor DNA as tester and normal liver DNA as driver.
BglII (Promega, Madison, Wis.) was chosen as the restriction enzyme
and its adaptors were used for direct preparation of amplicons and
PCR-based subtractive hybridization. The final difference products
showing distinct bands in agarose gel were recovered after BglII
digestion and ligated into the BglII site of dephosphorylated pSP72
vector (Promega). The recombinant difference products were then
transfected into E. coli DH10B.
Characterization of RDA Probes
[0110] Plasmids with distinct DNA inserts were selected for further
analysis. DNA sequencing was performed using the Dye Terminator
Cycle DNA Sequencing kit (Perkin Elmer, Rockville, Md.). Sequencing
reaction products were purified by spin columns (Princeton
Separations, Adelphia, N.J.), and run on a 377 DNA Sequencer
(Perkin Elmer/Applied Biosystems, Foster City, Calif.). The
homology analysis was carried out by BLAST search of the GenBank
DNA databases (Altschul, et al., J. Mol. Biol. 215:403-10, 1990).
The RDA products that showed significant homology or appeared in
multiple clones were selected for further Southern blot and/or
Northern blot analysis.
Conditions for Southern Analysis
[0111] Genomic DNA was isolated from tumor and non-tumor cell
lysates and digested with restriction enzymes. The digested DNA was
separated by electrophoresis in a 1% agarose gel and transferred to
nylon membrane for hybridization. 50 ng of DNA probe was
radio-labeled (Prime-It RmT, Stratagene) as per the manufacturer's
instructions and used for hybridization. A probe for beta-actin was
used as a standard to control for the amount of DNA loaded.
Hybridization was performed at 68° C. for 2-4 hours using
Quickhybrid solution (Stratagene). Following hybridization, the
membranes were washed three times at 37° C. for 10 min in
1×SSC solution containing 0.1×SDS. This was followed by
a single wash at 62° C. for 30 min in 0.1×SSC solution
containing 0.1×SDS. Blots were exposed to a PhosphoImager,
and analyzed using Software ImageQuant Version 3.3 (Molecular
Dynamics, Sunnyvale, Calif.) for quantitative analysis.
Conditions for Northern Analysis
[0112] Total RNA was extracted from cell lysates using TRIzol
solution (Gibco-BRL), separated in a 1% agarose gel,
and transferred to nylon membrane for hybridization. 50 ng of DNA
probe was radio-labeled (Prime-It RmT, Stratagene) as per the
manufacturer's instructions and used for hybridization. A probe for
GAPDH or beta-actin was used as a control for the amount of RNA
loaded. Hybridization, washing, and analysis were performed as
described above for Southern Hybridization.
5' and 3' RACE and cDNA Library Screening for cDNA Cloning
[0113] 5' and 3' RACE (Rapid Amplification of cDNA Ends) were
started from a deleted fragment detected with RDA, and performed
using human placenta Marathon™ cDNA as template (Clontech, Inc.,
Palo Alto, Calif.). The primers used for RACE, generated from the
L7-3 sequence (Seq. I.D. No. ______), are as follows:
TABLE-US-00002
(Seq. I.D. No. 9) PrRACE5: 5' CACTCCGGTCCTTGTAGTCTGGAACC 3'
was used for the first round of PCR for 5' RACE.
(Seq. I.D. No. 10) PrRACE5N: 5' ATCCTCTTCATGAACTCGGGCACGG 3'
was used as the nested primer in the second round of 5' RACE.
(Seq. I.D. No. 11) PrRACE3: 5' GATCAAGGTTCTAGACTACAAGGACCG 3'
was used for 3' RACE.
[0114] The final 5' RACE product, exhibiting the same band pattern
as the deleted fragment in Northern blot hybridization, was labeled
with α-[³²P]-dCTP to screen a 5' Stretch cDNA library
constructed from human lung tissue (Clontech, Inc.). The lambda DNA
of positive clones was converted into plasmid DNA by transfecting
lambda DNA into AM1 bacterial cells. The full-length cDNA
sequencing of positive clones was completed by primer walking and
assembled with the Sequencher™ 3.1 program.
Fluorescence In Situ Hybridization (FISH) Gene Mapping and
Comparative Genomic Hybridization (CGH)
[0115] A genomic probe isolated from human P1 library was labeled
with biotin and used for FISH chromosomal localization and CGH
analysis. For both analyses, chromosomes prepared from
methotrexate-synchronized normal peripheral lymphocyte cultures
were used. The original CGH protocol, described in Kallioniemi et
al. (Science 258:818-21, 1992), was employed with minor
modifications. The conditions of hybridization, the detection of
hybridization signals, digital-image acquisition, processing and
analysis, and direct fluorescent signal localization on banded
chromosomes were performed as previously described in Zimonjic et
al. (Cancer Genet. Cytogenet. 80:100-2, 1995).
[0116] The following examples are illustrative of the scope of the
present invention.
Example 1
Detection of DLC-1 Deletion in Liver Cancer Cells by RDA
[0117] Primary HCC tumor samples, matched with surrounding
non-cancerous liver tissue, were obtained as described above and
analyzed by RDA. Several RDA difference products were observed
after the third round of hybridization/selection as distinct bands
in agarose gel. Twenty individual fragments were isolated and
analyzed by Southern blot hybridization for deletions. One clone,
L7-3, of 600 bp (Seq. I.D. No. 11), showed loss of heterozygosity
(LOH) in the primary tumor (FIG. 1). BLAST search revealed that the
L7-3 clone had homology to rat p122 RhoGAP cDNA (Homma and Emori,
EMBO. J. 14:286-91, 1995).
Example 2
Southern Analysis
HCC Cell Lines
[0118] To determine if the L7-3 clone is represented in a region
recurrently deleted in HCC, 15 primary HCC tumors and 11
HCC-derived cell lines were examined using Southern analysis as
described above. The DNA was digested with BglII, and probed with
L7-3 (Seq. I.D. No. 11). Seven of the fifteen primary HCC tumors
(representatives are shown in FIG. 1) and 9 of the 11 HCC cell
lines (FIG. 2) had a genomic deletion of the L7-3 clone compared to
no deletions in the normal liver cells.
Primary Breast Carcinomas
[0119] Using Southern analysis as described above, primary human
breast cancer and corresponding patient blood cell DNA was digested
with BglII (FIG. 3a) or BamHI (FIG. 3b) and probed with full-length
DLC-1 cDNA (Seq. I.D. No. 10). Genomic deletions of DLC-1 gene were
detected in 7 of 15 human primary breast cancers (representatives
are shown in FIG. 3). Deletions were noted if the band intensity of
the tumor DNA was reduced to half or less of the intensity of the
corresponding normal tissue DNA. Samples IC11T, IC12T, IC13T,
IC2T, IC6T, IC7T are representative of the genomic deletions in
these experiments.
[0120] Southern analysis of these cells resulted in several bands.
As a control for DNA loading, the bands that remained unchanged in
the tumor cells were used.
Colon Carcinoma Cell Lines
[0121] Using Southern analysis as described above, normal genomic
DNA (Promega) and the DNA from five CRC cell lines were digested
with EcoRI, and probed with a mixture of L7-3 and P-35 (Seq. I.D.
Nos. 11 and 12) which correspond to exon 2 and exon 7 of the human
DLC-1 gene (see FIG. 9), respectively. Genomic deletions of DLC-1
gene were detected in two of five human CRC cell lines (FIG. 4).
Cell lines SW403 and SW1116 showed half of the DNA intensity for
probe P-35 when compared with normal genomic DNA (compare lanes 5
and 6 with lane 1). Interestingly, the signal was unaltered when
the L7-3 probe was used, indicating that this region (exon 2) is
not responsible for the development of CRC in these cell lines.
Therefore, the signal from L7-3 can be used as an internal control
for the amount of DNA loaded.
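The use of an internal control band to correct for loading can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the intensity values are invented for the example and are not measurements from FIG. 4, and the 0.5 cut-off is one reading of "half of the DNA intensity" as the criterion for a deletion.

```python
def normalized_ratio(tumor_signal: float, tumor_control: float,
                     normal_signal: float, normal_control: float) -> float:
    """Tumor/normal band-intensity ratio after correcting each lane for loading.

    Each probe signal is first divided by the internal control band measured
    in the same lane (for example a band that is unchanged in the tumor).
    """
    if min(tumor_control, normal_signal, normal_control) <= 0:
        raise ValueError("control and normal intensities must be positive")
    return (tumor_signal / tumor_control) / (normal_signal / normal_control)


def is_deleted(ratio: float, threshold: float = 0.5) -> bool:
    """Score a deletion when the normalized signal is at or below the threshold.

    The default of 0.5 is an assumption made for this sketch.
    """
    return ratio <= threshold


# Invented densitometry values (arbitrary units) for illustration only.
reduced = normalized_ratio(tumor_signal=400, tumor_control=1000,
                           normal_signal=1000, normal_control=1000)
unchanged = normalized_ratio(tumor_signal=950, tumor_control=1000,
                             normal_signal=1000, normal_control=1000)
print(is_deleted(reduced), is_deleted(unchanged))
```

Dividing by the control band before comparing lanes keeps an under-loaded tumor lane from being mistaken for a deletion.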
Example 3
Northern Analysis
HCC Cell Lines
[0122] Considering the significant DNA sequence homology of the
L7-3 clone with rat RhoGAP cDNA, its mRNA expression was examined
in both normal human tissues and HCC-derived cell lines by Northern
analysis as described above. Analysis of mRNA isolated from several
normal human tissues, including liver, demonstrated that the L7-3
clone hybridized to a 7.5 kb (major) transcript and a 4.5 kb
(minor) transcript (FIG. 5) that were detected in all normal
tissues but not in 4 (WRL, 7703, Chang and Focus) out of 14 human
HCC-derived cell lines (FIG. 6).
Colorectal Carcinomas
[0123] Using Northern analysis as described above, RNA from normal
and CRC cell lines was prepared and probed with the full-length
DLC-1 cDNA (Seq. I.D. No. 10). In human CRC cell lines, 11 out of
17 (HCT-15, LS147T, DLD-1, HD29, SW1116, T84, SW1417, SW403, SW948,
LS180, SW48) showed noticeably decreased or no expression of DLC-1
mRNA (FIG. 7). In this experiment, the normal human colon
fibroblast cell line CDD33C0 was used as a normal control.
Prostate Carcinomas
[0124] Using Northern analysis as described above, RNA from PC cell
lines was prepared and probed with the full-length DLC-1 cDNA (Seq.
I.D. No. 10). Low levels or no DLC-1 gene expression was
demonstrated in two (LN-Cap and SP3504) out of three human PC
cell lines (FIG. 8).
Example 4
Obtaining the DLC-1 cDNA
[0125] The cDNA for the clone L7-3 was obtained by 5' RACE and 3'
RACE coupled with cDNA library screening as described above. The
full-length cDNA of DLC-1 gene is 3850 bp long (Seq. I.D. No. 1)
and encodes a protein of 1091 amino acids (Seq. I.D. No. 2). The
estimated molecular weight of DLC-1 protein is 125 kD. The
untranslated regions of 5' end and 3' end of DLC-1 gene are 324 bp
and 250 bp, respectively (Seq. I.D. No. 1).
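The lengths stated above can be cross-checked with simple arithmetic: subtracting the two untranslated regions from the cDNA length should leave an open reading frame whose codon count equals the protein length plus one stop codon. The sketch below performs that check. The average residue mass of 110 Da is a common rule-of-thumb value assumed here for illustration. It is not taken from the disclosure, so the resulting mass is only a rough figure for comparison with the stated estimate of 125 kD.

```python
# Values stated in the text.
CDNA_LENGTH_BP = 3850
UTR5_BP = 324
UTR3_BP = 250
PROTEIN_LENGTH_AA = 1091

# Assumption for this sketch: rule-of-thumb average mass of one residue.
AVG_RESIDUE_MASS_DA = 110

# Open reading frame = cDNA minus both untranslated regions.
orf_bp = CDNA_LENGTH_BP - UTR5_BP - UTR3_BP

# Each codon is three nucleotides; the last codon is the stop codon.
codons = orf_bp // 3
coding_codons = codons - 1

approx_mass_kda = PROTEIN_LENGTH_AA * AVG_RESIDUE_MASS_DA / 1000

print(f"ORF length: {orf_bp} bp ({codons} codons including stop)")
print(f"amino acids encoded: {coding_codons}")
print(f"approximate mass: {approx_mass_kda:.0f} kDa")
```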
Example 5
Chromosomal Localization of Human DLC-1
[0126] The DLC-1 gene was chromosomally localized using the
materials and methods described above. The majority of metaphases
hybridized with biotin or digoxigenin-labeled genomic probe had
fluorescent signal at identical sites on both chromatids of the
short arm of chromosome 8. The signal was analyzed in 100
metaphases with both homologs labeled. Fifty metaphases were
examined by imaging of DAPI-generated and enhanced G-like banding.
The fluorescent signals were distributed within region 8p21-22.
However, over 50% of doublets were at bands 8p21.3-22, the most
likely location of the DLC-1 gene.
[0127] To further characterize the region harboring the DLC-1 gene,
the primary tumor DNA used as tester in RDA (94-25T) was analyzed
by CGH. The fluorescence profile for chromosome 8 demonstrated DNA
loss in region 8p23-q11.2 and gain in region
8q21.1-q24.3.
Example 6
Cloning and Characterization of Human DLC-1
[0128] The DLC-1 cDNA sequence (Seq. I.D. No. 1) described above
does not contain the introns, upstream transcriptional promoter or
regulatory regions or downstream transcriptional regulatory regions
of the DLC-1 gene. It is possible that some mutations in the DLC-1
gene that may lead to HCC are not included in the cDNA but rather
are located in other regions of the DLC-1 gene. Mutations located
outside of the open reading frame that encodes the DLC-1 protein
are not likely to affect the functional activity of the protein but
rather are likely to result in altered levels of the protein in the
cell. For example, mutations in the promoter region of the DLC-1
gene may prevent transcription of the gene and therefore lead to
the complete absence of the DLC-1 protein in the cell.
[0129] Additionally, mutations within intron sequences in the
genomic gene may also prevent expression of the DLC-1 protein.
Following transcription of a gene containing introns, the intron
sequences are removed from the RNA molecule in a process termed
splicing prior to translation of the RNA molecule which results in
production of the encoded protein. When the RNA molecule is spliced
to remove the introns, the cellular enzymes that perform the
splicing function recognize sequences around the intron/exon border
and in this manner recognize the appropriate splice sites. If there
is a mutation within the sequence of the intron close to the
junction of the intron with an exon, the enzymes may not recognize
the junction and may fail to remove the intron. If this occurs, the
encoded protein will likely be defective. Thus, mutations inside
the intron sequences within the DLC-1 gene (termed "splice site
mutations") may also lead to the development of HCC. However,
knowledge of the exon structure and intronic splice site sequences
of the DLC-1 gene is required to define the molecular basis of
these abnormalities. The provision herein of the DLC-1 cDNA
sequence (Seq. I.D. No. 1) enables the cloning of the entire DLC-1
gene (including the promoter and other regulatory regions and the
intron sequences) and the determination of its nucleotide sequence.
With this information in hand, diagnosis of a genetic
predisposition to HCC and other cancers based on DNA analysis will
comprehend all possible mutagenic events at the DLC-1 locus.
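The effect of a mutation at an intron/exon junction can be illustrated with a minimal sketch. It relies on the widely observed rule that most introns begin with the dinucleotide GT and end with AG. That rule, the function name, and the short example sequences are assumptions introduced here and are not DLC-1 sequence. Real splice-site recognition depends on longer consensus sequences, so the check is deliberately simplistic.

```python
def has_canonical_splice_sites(intron: str) -> bool:
    """Return True if the intron starts with GT (donor) and ends with AG (acceptor).

    Only the two terminal dinucleotides are examined; this is a simplification.
    """
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical intron sequences for illustration only.
wild_type_intron = "GTAAGTCTTTCTCCCTTTCAG"
donor_mutant = "ATAAGTCTTTCTCCCTTTCAG"      # G>A at the first intron position
acceptor_mutant = "GTAAGTCTTTCTCCCTTTCAC"   # G>C at the last intron position

for name, seq in [("wild type", wild_type_intron),
                  ("donor mutant", donor_mutant),
                  ("acceptor mutant", acceptor_mutant)]:
    print(name, has_canonical_splice_sites(seq))
```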
[0130] The ATCC deposit (98676) of the genomic DLC-1 gene may be
utilized in aspects of the present invention. Alternatively, the
DLC-1 gene may be isolated by one or more routine procedures,
including PCR-based screening of a human genomic P1 library as
described above. Alternatively, the method described in WO 93/22435
can be utilized. For example, a YAC library of human genomic
sequences (Monaco and Lehrach, Proc. Natl. Acad. Sci. U.S.A.
88:4123-7, 1991) is screened for the DLC-1 gene by the polymerase
chain reaction (PCR). The library is arranged in a number (e.g.,
39) of primary DNA pools, prepared from high-density grids each
containing around 300-400 YAC clones. Primary pools are screened by
PCR to identify a pool which contains a positive clone. A secondary
PCR screen is then performed on the appropriate set of eight row
and 12 column pools, as described by Bentley et al. (Genomics
12:534-41, 1992). PCR primers based on the DLC-1 cDNA sequence are
used as a sequence tagged site (STS) for the 3' region of the gene.
The yeast DNA is then amplified with these primers by PCR for 30
cycles of 94° C. for 1 minute, 60° C. for 1 minute
and 72° C. for 1 minute, with a final 5 minute extension at
72° C. Confirmation that positive YAC clones contain the
majority of the coding sequence of the DLC-1 genomic gene is
obtained by amplification of an STS from the 5' end of the cDNA.
Exon boundaries of the DLC-1 gene are then characterized, e.g., by
the vectorette PCR method. This strategy has been described in
detail previously (Roberts et al., Genomics 13:942-50, 1992).
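The logic of the secondary row-and-column screen can be sketched as follows, assuming the clones of a positive primary pool are laid out on an 8 by 12 grid and each row and each column is pooled and tested by PCR. The row letters, column numbers, and function names are hypothetical conventions chosen for this illustration and do not come from the cited protocol.

```python
from itertools import product

ROWS = "ABCDEFGH"            # eight row pools
COLUMNS = range(1, 13)       # twelve column pools


def pool_results(positive_wells):
    """Simulate the PCR screen: a pool is positive if it contains a positive well."""
    for row, col in positive_wells:
        if row not in ROWS or col not in COLUMNS:
            raise ValueError(f"well {row}{col} is not on the grid")
    positive_rows = {row for row, _ in positive_wells}
    positive_cols = {col for _, col in positive_wells}
    return positive_rows, positive_cols


def candidate_wells(positive_rows, positive_cols):
    """Wells lying at the intersection of a positive row and a positive column."""
    return sorted(product(positive_rows, positive_cols))


# One positive clone: 20 pooled reactions (8 + 12) locate it among 96 wells.
rows, cols = pool_results({("C", 7)})
single = candidate_wells(rows, cols)

# Two positive clones give four intersections, so each candidate is retested.
rows2, cols2 = pool_results({("C", 7), ("F", 2)})
double = candidate_wells(rows2, cols2)
print(single, double)
```

With a single positive clone the intersection is unambiguous; with more than one, the candidates at the intersections would need individual confirmation.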
[0131] With the sequences of the DLC-1 cDNA and DLC-1 gene in hand,
primers derived from these sequences may be used in diagnostic
tests (described below) to determine the presence of mutations in
any part of the genomic DLC-1 gene of a patient. Such primers will
be oligonucleotides comprising a fragment of sequence from the
DLC-1 gene (either intron sequence, exon sequence or a sequence
spanning an intron-exon boundary) and will comprise at least 15
consecutive nucleotides of the DLC-1 cDNA or gene. It will be
appreciated that greater specificity may be achieved by using
primers of greater lengths. Thus, in order to obtain enhanced
specificity, the primers used may comprise 20, 25, 30 or even 50
consecutive nucleotides of the DLC-1 cDNA or gene. Furthermore,
with the provision of the DLC-1 intron sequence information the
analysis of a large and as yet untapped source of patient material
for mutations will now be possible using methods such as chemical
cleavage of mismatches (Cotton et al., Proc Natl Acad Sci USA.
85:4397-401, 1988; Montandon et al., Nucleic Acids Res. 9:3347-58,
1989) and single-strand conformational polymorphism analysis (Orita
et al., Genomics 5:874-879, 1989).
[0132] Additional experiments may now be performed to identify and
characterize regulatory elements flanking the DLC-1 gene. These
regulatory elements may be characterized by standard techniques
including deletion analyses wherein successive nucleotides of a
putative regulatory region are removed and the effect of the
deletions is studied by either transient or long-term expression
analysis experiments. The identification and characterization of
regulatory elements flanking the genomic DLC-1 gene may be made by
functional experimentation (deletion analyses, etc.) in mammalian
cells by either transient or long-term expression analyses.
[0133] Having provided a genomic clone for the DLC-1 gene (Seq.
I.D. No.______), it will be apparent to one skilled in the art that
either the genomic clone or the cDNA or sequences derived from
these clones may be utilized in applications of this invention,
including but not limited to, studies of the expression of the
DLC-1 gene, studies of the function of the DLC-1 protein, the
generation of antibodies to the DLC-1 protein, and diagnosis and
therapy of DLC-1 deleted or mutated patients to prevent or treat the onset
of HCC. Descriptions of applications describing the use of DLC-1
cDNA are therefore intended to comprehend the use of the genomic
DLC-1 gene. It will also be apparent to one skilled in the art that
homologs of this gene may now be cloned from other species, such as
the rat or the mouse, by standard cloning methods. Such homologs
will be useful in the production of animal models of HCC.
[0134] To facilitate the detection of point mutations in liver and
other cancers that exhibit alteration at region 8p12-22, the human
DLC-1 gene was cloned and the intron/exon sequences characterized
(Seq. I.D. No. ______ and FIG. 9).
[0135] Human DLC-1 is approximately 25 kb, and contains 14 exons.
The largest exon is exon 2, at 1.5 kb, while the remaining exons
are less than 300 bp on average (FIG. 9).
Example 7
Cloning Mouse DLC-1
[0136] A full understanding of the function of DLC-1 and its role
in cancer development is essential. This understanding can be
facilitated by the generation of knock-out mice, which contain a
non-functional DLC-1 gene. Prior to generating knock-out mice, the
mouse DLC-1 gene was cloned.
[0137] Mouse DLC-1 genomic DNA was cloned and localized to
chromosome 8 by FISH (see above for methods) using a mouse DLC-1
genomic DNA clone as the probe. Mouse DLC-1 lies in a region syntenic
with that of the human DLC-1 gene. The localization of the DLC-1 gene in mice may
permit studies with in vivo models for carcinogenesis.
Example 8
Generating Transgenic Mice
[0138] Methods for generating transgenic mice are described in Gene
Targeting, A. L. Joyner ed., Oxford University Press, 1995 and
Watson, J. D. et al., Recombinant DNA 2nd Ed., W.H. Freeman
and Co., New York, 1992, Chapter 14. To specifically generate
transgenic mice containing a functional deletion of the DLC-1 gene,
a 1.5 kb fragment in the front of exon 2 and another 5.5 kb
fragment spanning from intron 2 to intron 5 were used as short arm
and long arm, respectively. Between long arm and short arm, the neo
gene was introduced, generating the vector shown in FIG. 10,
referred to as the knock-out vector herein.
[0139] Using standard transgenic mouse technology, the vector shown
in FIG. 10 can be used to generate DLC-1 knock-out mice by
homologous recombination. The knock-out vector is introduced into
embryonic stem cells (ES cells) by standard methods which may
include transfection, retroviral infection or electroporation (also
see Example 11). The transfected ES cells expressing the knock-out
vector will grow in medium containing the antibiotic G418. The
neomycin resistant ES cells will be microinjected into mouse
embryos (blastocysts), which are implanted into the uterus of
pseudopregnant mice. The litter will be screened for chimeric mice
by observing their coat color. Chimeric mice are ones in which the
injected ES cells developed into the germ line, thereby allowing
transmission of the gene to their offspring. The resulting
heterozygotic mice will be mated to generate a homozygous line of
transgenic mice functionally deleted for DLC-1. These homozygous
mice will then be screened phenotypically, for example, for their
predisposition to developing cancer.
Example 9
Preferred Method of Making the DLC-1 cDNA
[0140] The foregoing discussion describes the original means by
which the DLC-1 cDNA was obtained and also provides the nucleotide
sequence of this clone. With the provision of this sequence
information, the polymerase chain reaction (PCR) may now be
utilized in a more direct and simple method for producing the DLC-1
cDNA.
[0141] Essentially, total RNA is extracted from human cells by any
one of a variety of methods routinely used; Sambrook et al.
(Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y.,
1989) and Ausubel et al. (In Current Protocols in Molecular
Biology, Greene Publishing Associates and Wiley-Interscience,
1987) provide descriptions of methods for RNA isolation. Any human
cell line derived from a non-DLC-1 deleted individual would be
suitable, such as the widely used HeLa cell line, or the WI-38
human lung fibroblast cell line available from the American Type
Culture Collection, Rockville, Md. The extracted RNA is then used
as a template for performing the reverse transcription-polymerase
chain reaction (RT-PCR) amplification of cDNA. Methods and
conditions for RT-PCR are described in Kawasaki et al. (In PCR
Protocols, A Guide to Methods and Applications, Innis et al.
(eds.), pp. 21-27, Academic Press, Inc., San Diego, Calif., 1990).
The selection of PCR primers will be made according to the portions
of the cDNA which are to be amplified. Primers may be chosen to
amplify small segments of a cDNA or the entire cDNA molecule.
Variations in amplification conditions may be required to
accommodate primers of differing lengths; such considerations are
well known in the art and are discussed in Innis et al. (PCR
Protocols, A Guide to Methods and Applications, Innis et al.
(eds.), Academic Press, Inc., San Diego, Calif., 1990). The entire
DLC-1 cDNA molecule may be amplified using the following
combination of primers:
TABLE-US-00003
(Seq. I.D. No. 3) 5' TAT GGG CTC GAG CGG CCG CCC 3'
(Seq. I.D. No. 4) 5' CGC ACA GTC TTA CAT ATT CCA 3'
The open reading frame of the cDNA molecule may be amplified using
the following combination of primers:
TABLE-US-00004
(Seq. I.D. No. 5) 5' ATG TGC AGA AAG AAG CCG GAC ACC 3'
(Seq. I.D. No. 6) 5' CCT AGA TTT GGT GTC TTT GGT TTC 3'
These primers are illustrative only; it will be appreciated by one
skilled in the art that many different primers may be derived from
the provided cDNA sequence in order to amplify particular regions
of these cDNAs.
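When deriving or checking primers such as those listed above, a few basic properties are commonly tabulated. The sketch below reports the length, GC fraction, reverse complement, and a rough melting temperature for each listed primer. The melting temperature uses the simple Wallace rule, Tm = 2(A+T) + 4(G+C), which is a rule of thumb introduced here as an assumption and is not a method given in this disclosure. The function names are likewise hypothetical.

```python
# Primer sequences as listed in the text, written without spaces.
PRIMERS = {
    "Seq. I.D. No. 3": "TATGGGCTCGAGCGGCCGCCC",
    "Seq. I.D. No. 4": "CGCACAGTCTTACATATTCCA",
    "Seq. I.D. No. 5": "ATGTGCAGAAAGAAGCCGGACACC",
    "Seq. I.D. No. 6": "CCTAGATTTGGTGTCTTTGGTTTC",
}

MIN_PRIMER_LENGTH = 15  # minimum primer length stated in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule (assumption)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, seq in PRIMERS.items():
    print(name, len(seq), f"{gc_fraction(seq):.2f}", wallace_tm(seq),
          reverse_complement(seq), len(seq) >= MIN_PRIMER_LENGTH)
```

The Wallace rule is intended for short oligonucleotides, so for primers of 20 nucleotides or more its values serve only as a coarse comparison between the two members of a pair.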
Example 10
Sequence Variants of DLC-1
[0142] The nucleotide sequence of the DLC-1 cDNA (Seq. I.D. No. 1)
and the amino acid sequence of the DLC-1 protein (Seq. I.D. No. 2)
which is encoded by that cDNA, respectively are shown in FIG. 5.
Having presented the nucleotide sequence of the DLC-1 cDNA and the
amino acid sequence of the protein, this invention now also
facilitates the creation of DNA molecules, and thereby proteins,
which are derived from those disclosed but which vary in their
precise nucleotide or amino acid sequence from those disclosed.
Such variants may be obtained through a combination of standard
molecular biology laboratory techniques and the nucleotide sequence
information disclosed by this invention.
[0143] Variant DNA molecules include those created by standard DNA
mutagenesis techniques, for example, M13 primer mutagenesis.
Details of these techniques are provided in Sambrook et al. (In
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y.,
1989, Ch. 15). By the use of such techniques, variants may be
created which differ in minor ways from those disclosed. DNA
molecules and nucleotide sequences which are derivatives of those
specifically disclosed herein and which differ from those disclosed
by the deletion, addition or substitution of nucleotides while
still encoding a protein which possesses the functional
characteristic of the DLC-1 protein are comprehended by this
invention. Also within the scope of this invention are small DNA
molecules which are derived from the disclosed DNA molecules. Such
small DNA molecules include oligonucleotides suitable for use as
hybridization probes or polymerase chain reaction (PCR) primers. As
such, these small DNA molecules will comprise at least a segment of
the DLC-1 cDNA molecule or the DLC-1 gene and, for the purposes of
PCR, will comprise at least a 15 nucleotide sequence and, more
preferably, a 20-50 nucleotide sequence of the DLC-1 cDNA (Seq.
I.D. No. 1) or the DLC-1 gene (Seq. I.D. No.______) (i.e., at least
20-50 consecutive nucleotides of the DLC-1 cDNA or gene sequences).
DNA molecules and nucleotide sequences which are derived from the
disclosed DNA molecules as described above may also be defined as
DNA sequences which hybridize under stringent conditions to the DNA
sequences disclosed, or fragments thereof.
[0144] Hybridization conditions resulting in particular degrees of
stringency will vary depending upon the nature of the hybridization
method of choice and the composition and length of the hybridizing
DNA used. Generally, the temperature of hybridization and the ionic
strength (especially the Na.sup.+ concentration) of the
hybridization buffer will determine the stringency of
hybridization. Calculations regarding hybridization conditions
required for attaining particular degrees of stringency are
discussed by Sambrook et al. (In Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor, N.Y., 1989 ch. 9 and 11), herein
incorporated by reference. By way of illustration only, a
hybridization experiment may be performed by hybridization of a DNA
molecule (for example, a derivative of the DLC-1 cDNA) to a target
DNA molecule (for example, the DLC-1 cDNA) which has been
electrophoresed in an agarose gel and transferred to a
nitrocellulose membrane by Southern blotting (Southern, J. Mol.
Biol. 98:503, 1975), a technique well known in the art and
described in Sambrook et al. (Molecular Cloning: A Laboratory
Manual, Cold Spring Harbor, N.Y., 1989). Hybridization with a
target probe labeled with [.sup.32P]-dCTP is generally carried out
in a solution of high ionic strength such as 6.times.SSC at a
temperature that is 20-25.degree. C. below the melting temperature,
T.sub.m, described below. For such Southern hybridization
experiments where the target DNA molecule on the Southern blot
contains 10 ng of DNA or more, hybridization is typically carried
out for 6-8 hours using 1-2 ng/ml radiolabeled probe (of specific
activity equal to 10.sup.9 CPM/.mu.g or greater). Following
hybridization, the nitrocellulose filter is washed to remove
background hybridization. The washing conditions should be as
stringent as possible to remove background hybridization but to
retain a specific hybridization signal. The term T.sub.m represents
the temperature above which, under the prevailing ionic conditions,
the radiolabeled probe molecule will not hybridize to its target
DNA molecule. The T.sub.m of such a hybrid molecule may be
estimated from the following equation (Bolton and McCarthy, Proc.
Natl. Acad. Sci. USA 48:1390, 1962):
T.sub.m=81.5.degree. C.+16.6(log.sub.10[Na.sup.+])+0.41(%
G+C)-0.63(% formamide)-(600/l)
where l=the length of the hybrid in base pairs. This equation is
valid for concentrations of Na.sup.+ in the range of 0.01 M to 0.4
M, and it is less accurate for calculations of T.sub.m in solutions
of higher [Na.sup.+]. The equation is also primarily valid for DNAs
whose G+C content is in the range of 30% to 75%, and it applies to
hybrids greater than 100 nucleotides in length (the behavior of
oligonucleotide probes is described in detail in Ch. 11 of Sambrook
et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor,
N.Y., 1989).
[0145] Thus, by way of example, for a 150 base pair DNA probe
derived from the open reading frame of the DLC-1 cDNA (with a
hypothetical % GC=45%), a calculation of hybridization conditions
required to give particular stringencies may be made as
follows:
[0146] For this example, it is assumed that the filter will be
washed in 0.3.times.SSC solution following hybridization, so that:
[0147] [Na.sup.+]=0.045 M
[0148] % GC=45%
[0149] Formamide concentration=0
[0150] l=150 base pairs
T.sub.m=81.5+16(log.sub.10[Na.sup.+])+(0.41.times.45)-(600/150)
[0151] and so T.sub.m=74.4.degree. C.
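The calculation above can be reproduced with a short script. The
following is a minimal sketch in Python and is not part of the
original disclosure; the function name melting_temperature and its
parameter names are illustrative assumptions. The salt term is
entered as the coefficient multiplied by log.sub.10[Na.sup.+],
which is negative for sodium concentrations below 1 M, so that
lower salt lowers the estimate. With the coefficient 16.6 of the
general equation, the example conditions give approximately
73.6.degree. C.; a coefficient rounded to 16 reproduces the
74.4.degree. C. figure used in this example.

```python
import math


def melting_temperature(na_molar, gc_percent, length_bp,
                        formamide_percent=0.0, salt_coeff=16.6):
    """Estimate T_m in degrees C for a DNA hybrid longer than ~100 bp.

    na_molar          -- Na+ concentration in mol/L (about 0.01 to 0.4)
    gc_percent        -- G+C content of the hybrid, in percent
    length_bp         -- length of the hybrid in base pairs
    formamide_percent -- formamide concentration, in percent
    salt_coeff        -- multiplier of log10[Na+] (illustrative parameter)
    """
    return (81.5
            + salt_coeff * math.log10(na_molar)   # negative below 1 M
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)


# Example conditions: 0.3x SSC wash, 45% GC, 150 bp probe, no formamide.
tm_general = melting_temperature(0.045, 45, 150)                   # coefficient 16.6
tm_rounded = melting_temperature(0.045, 45, 150, salt_coeff=16.0)  # coefficient 16

print(f"T_m with 16.6: {tm_general:.1f} C")
print(f"T_m with 16:   {tm_rounded:.1f} C")
```

The difference between the two estimates is under 1.degree. C.,
which is small relative to the 1-1.5.degree. C. per percent
mismatch used when choosing a wash temperature.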
[0152] The T.sub.m of double-stranded DNA decreases by
1-1.5.degree. C. with every 1% decrease in homology (Bonner et al.,
J. Mol. Biol. 81:123, 1973). Therefore, for this given example,
washing the filter in 0.3.times.SSC at 59.4-64.4.degree. C. will
produce a stringency of hybridization equivalent to 90%; that is,
DNA molecules with more than 10% sequence variation relative to the
target DLC-1 cDNA will not hybridize. Alternatively, washing the
hybridized filter in 0.3.times.SSC at a temperature of
65.4-68.4.degree. C. will yield a hybridization stringency of 94%;
that is, DNA molecules with more than 6% sequence variation
relative to the target DLC-1 cDNA molecule will not hybridize. The
above example is given entirely by way of theoretical illustration.
One skilled in the art will appreciate that other hybridization
techniques may be utilized and that variations in experimental
conditions will necessitate alternative calculations for
stringency.
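The wash temperatures quoted above follow from subtracting
1-1.5.degree. C. per percent of tolerated mismatch from T.sub.m. A
minimal sketch of that arithmetic is given here, assuming the
hypothetical helper name wash_temperature_range and taking
T.sub.m=74.4.degree. C. from the example:

```python
def wash_temperature_range(tm_celsius, max_mismatch_percent,
                           drop_low=1.0, drop_high=1.5):
    """Return (lowest, highest) wash temperature in degrees C.

    Each 1% of mismatch is taken to lower T_m by between drop_low and
    drop_high degrees C, so hybrids with up to max_mismatch_percent
    mismatch are expected to melt somewhere within the returned range.
    """
    lowest = tm_celsius - drop_high * max_mismatch_percent
    highest = tm_celsius - drop_low * max_mismatch_percent
    return lowest, highest


tm_example = 74.4  # degrees C, from the worked example

low_10, high_10 = wash_temperature_range(tm_example, 10)  # 90% stringency
low_6, high_6 = wash_temperature_range(tm_example, 6)     # 94% stringency

print(f"10% mismatch: {low_10:.1f}-{high_10:.1f} C")
print(f" 6% mismatch: {low_6:.1f}-{high_6:.1f} C")
```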
[0153] In particular embodiments of the present invention,
stringent conditions may be defined as those under which DNA
molecules with more than 25% sequence variation (also termed
"mismatch") will not hybridize. In a more particular embodiment,
stringent conditions are those under which DNA molecules with more
than 15% mismatch will not hybridize, and more preferably still,
stringent conditions are those under which DNA sequences with more
than 10% mismatch will not hybridize. In another embodiment,
stringent conditions are those under which DNA sequences with more
than 6% mismatch will not hybridize.
[0154] The degeneracy of the genetic code further widens the scope
of the present invention as it enables major variations in the
nucleotide sequence of a DNA molecule while maintaining the amino
acid sequence of the encoded protein. For example, the sixteenth
amino acid residue of the DLC-1 protein is alanine. This is encoded
in the DLC-1 cDNA by the nucleotide codon triplet GCC. Because of
the degeneracy of the genetic code, three other nucleotide codon
triplets, GCT, GCG and GCA, also code for alanine. Thus, the
nucleotide sequence of the DLC-1 cDNA could be changed at this
position to any of these three codons without affecting the amino
acid composition of the encoded protein or the characteristics of
the protein. The genetic code and variations in nucleotide codons
for particular amino acids are presented in Tables 1 and 2. Based
upon the degeneracy of the genetic code, variant DNA molecules may
be derived from the cDNA molecules disclosed herein using standard
DNA mutagenesis techniques as described above, or by synthesis of
DNA sequences. DNA sequences which do not hybridize under stringent
conditions to the cDNA sequences disclosed by virtue of sequence
variation based on the degeneracy of the genetic code are herein
also comprehended by this invention.
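The alanine example above can be checked against the standard
genetic code. The following is a minimal sketch, not part of the
original disclosure: the codon table is built from the standard
code written in T, C, A, G order, and the helper name
synonymous_codons is an illustrative assumption.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon, ordered by first, second and
# third position in T, C, A, G order; '*' marks a termination codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid as codon."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items()
                  if aa == amino_acid and c != codon)


print(CODON_TABLE["GCC"])        # one-letter code for the encoded residue
print(synonymous_codons("GCC"))  # alternative codons for the same residue
```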
[0155] The invention also includes DNA sequences that are
substantially identical to any of the DNA sequences disclosed
herein, where substantially identical means a sequence that has
identical nucleotides in at least 75% of the aligned nucleotides,
for example 80%, 85%, 90%, 95% or 98% identity of the aligned
sequences.
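Percent identity over aligned nucleotides can be computed directly
from a pairwise alignment. The sketch below is illustrative only:
the function names are hypothetical, the two short sequences are
invented for the example and are not taken from the DLC-1 cDNA,
and it assumes that only alignment columns in which both sequences
carry a nucleotide (no gap) are counted.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned (non-gap) columns with identical nucleotides."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b)
               for a, b in zip(aligned_a.upper(), aligned_b.upper())
               if a != gap and b != gap]
    if not columns:
        return 0.0
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)


def is_substantially_identical(aligned_a, aligned_b, threshold=75.0):
    """Apply the 'at least 75% of the aligned nucleotides' criterion."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Invented 10-nucleotide example with a single mismatch.
identity = percent_identity("ATGGCCTTAG", "ATGGCTTTAG")
print(identity, is_substantially_identical("ATGGCCTTAG", "ATGGCTTTAG"))
```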
TABLE 1
The Genetic Code
First                                        Third
Position  Second Position                    Position
(5' end)  T          C     A           G     (3' end)
T         Phe        Ser   Tyr         Cys   T
          Phe        Ser   Tyr         Cys   C
          Leu        Ser   Stop (och)  Stop  A
          Leu        Ser   Stop (amb)  Trp   G
C         Leu        Pro   His         Arg   T
          Leu        Pro   His         Arg   C
          Leu        Pro   Gln         Arg   A
          Leu        Pro   Gln         Arg   G
A         Ile        Thr   Asn         Ser   T
          Ile        Thr   Asn         Ser   C
          Ile        Thr   Lys         Arg   A
          Met        Thr   Lys         Arg   G
G         Val        Ala   Asp         Gly   T
          Val        Ala   Asp         Gly   C
          Val        Ala   Glu         Gly   A
          Val (Met)  Ala   Glu         Gly   G
"Stop (och)" stands for the ochre termination triplet, and "Stop
(amb)" for the amber. ATG is the most common initiator codon; GTG
usually codes for valine, but it can also code for methionine to
initiate an mRNA chain.
TABLE 2
The Degeneracy of the Genetic Code
Number of                                                       Total Number
Synonymous Codons  Amino Acid                                   of Codons
6                  Leu, Ser, Arg                                18
4                  Gly, Pro, Ala, Val, Thr                      20
3                  Ile                                          3
2                  Phe, Tyr, Cys, His, Gln, Glu, Asn, Asp, Lys  18
1                  Met, Trp                                     2
Total number of codons for amino acids                          61
Number of codons for termination                                3
Total number of codons in genetic code                          64
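The totals in Table 2 can be recomputed by tallying the standard
genetic code. This is a minimal sketch, assuming the standard code
written in T, C, A, G order with one-letter amino acid symbols;
the variable names are illustrative.

```python
from collections import Counter, defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code in T, C, A, G order; '*' marks a termination codon.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W"
               "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR"
               "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# Number of codons assigned to each amino acid (stops excluded).
codons_per_amino_acid = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# Group amino acids by how many synonymous codons they have.
amino_acids_by_degeneracy = defaultdict(list)
for amino_acid, count in codons_per_amino_acid.items():
    amino_acids_by_degeneracy[count].append(amino_acid)

sense_codons = sum(codons_per_amino_acid.values())
stop_codons = sum(1 for aa in CODON_TABLE.values() if aa == "*")

for degeneracy in sorted(amino_acids_by_degeneracy, reverse=True):
    members = sorted(amino_acids_by_degeneracy[degeneracy])
    print(degeneracy, " ".join(members), degeneracy * len(members))
print(sense_codons, stop_codons, len(CODON_TABLE))
```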
[0156] One skilled in the art will recognize that the DNA
mutagenesis techniques described above may be used not only to
produce variant DNA molecules, but will also facilitate the
production of proteins which differ in certain structural aspects
from the DLC-1 protein, yet which proteins are clearly derivative
of this protein and which maintain the essential characteristics of
the DLC-1 protein. Newly derived proteins may also be selected in
order to obtain variations on the characteristic of the DLC-1
protein, as will be more fully described below. Such derivatives
include those with variations in amino acid sequence including
minor deletions, additions and substitutions.
[0157] While the site for introducing an amino acid sequence
variation is predetermined, the mutation per se need not be
predetermined. For example, in order to optimize the performance of
a mutation at a given site, random mutagenesis may be conducted at
the target codon or region and the expressed protein variants
screened for the optimal combination of desired activity.
Techniques for making substitution mutations at predetermined sites
in DNA having a known sequence as described above are well
known.
[0158] Amino acid substitutions are typically of single residues;
insertions usually will be on the order of from about 1 to 10 amino
acid residues; and deletions will range from about 1 to 30
residues. Deletions or insertions preferably are made in adjacent
pairs, i.e., a deletion of 2 residues or insertion of 2 residues.
Substitutions, deletions, insertions or any combination thereof may
be combined to arrive at a final construct. Obviously, the
mutations that are made in the DNA encoding the protein must not
place the sequence out of reading frame and preferably will not
create complementary regions that could produce secondary mRNA
structure.
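The reading-frame requirement reduces to simple arithmetic: because
each amino acid residue corresponds to three nucleotides, the net
number of nucleotides inserted or deleted in the coding sequence
should be a multiple of three. A minimal sketch, with a
hypothetical helper name:

```python
def preserves_reading_frame(inserted_nt=0, deleted_nt=0):
    """True if the net change in coding-sequence length is a multiple of 3."""
    return (inserted_nt - deleted_nt) % 3 == 0


# Deleting 2 residues removes 6 nucleotides and keeps the frame.
print(preserves_reading_frame(deleted_nt=2 * 3))
# Inserting 4 nucleotides shifts the frame.
print(preserves_reading_frame(inserted_nt=4))
```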
[0159] Substitutional variants are those in which at least one
residue in the amino acid sequence has been removed and a different
residue inserted in its place. Such substitutions generally are
made in accordance with the following Table 3 when it is desired to
finely modulate the characteristics of the protein. Table 3 shows
amino acids which may be substituted for an original amino acid in
a protein and which are regarded as conservative substitutions.
TABLE 3
Original Residue   Conservative Substitutions
Ala                ser
Arg                lys
Asn                gln; his
Asp                glu
Cys                ser
Gln                asn
Glu                asp
Gly                pro
His                asn; gln
Ile                leu; val
Leu                ile; val
Lys                arg; gln; glu
Met                leu; ile
Phe                met; leu; tyr
Ser                thr
Thr                ser
Trp                tyr
Tyr                trp; phe
Val                ile; leu
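Table 3 lends itself to a simple lookup when screening candidate
substitutions. The following sketch transcribes the table into a
Python dictionary; the names CONSERVATIVE_SUBSTITUTIONS and
is_conservative are illustrative assumptions. The table is
directional as written (for example, Ala may be replaced by Ser,
but Ser is listed only with Thr), and the lookup preserves that.

```python
# Transcription of Table 3: original residue -> conservative substitutions.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Asp": {"Glu"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if Table 3 lists replacement as conservative for original."""
    allowed = CONSERVATIVE_SUBSTITUTIONS.get(original.capitalize(), set())
    return replacement.capitalize() in allowed


print(is_conservative("Lys", "Arg"))  # listed in Table 3
print(is_conservative("Ala", "Thr"))  # not listed in Table 3
```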
[0160] Substantial changes in function or immunological identity
are made by selecting substitutions that are less conservative than
those in Table 3, i.e., selecting residues that differ more
significantly in their effect on maintaining (a) the structure of
the polypeptide backbone in the area of the substitution, for
example, as a sheet or helical conformation, (b) the charge or
hydrophobicity of the molecule at the target site, or (c) the bulk
of the side chain. The substitutions which in general are expected
to produce the greatest changes in protein properties will be those
in which (a) a hydrophilic residue, e.g., seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g., leucyl,
isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline
is substituted for (or by) any other residue; (c) a residue having
an electropositive side chain, e.g., lysyl, arginyl, or histidyl,
is substituted for (or by) an electronegative residue, e.g.,
glutamyl or aspartyl; or (d) a residue having a bulky side chain,
e.g., phenylalanine, is substituted for (or by) one not having a
side chain, e.g., glycine.
[0161] The effects of these amino acid substitutions or deletions
or additions may be assessed for derivatives of the DLC-1 protein
by assays in which DNA molecules encoding the derivative proteins
are transfected into DLC-1 cells using routine procedures.
[0162] The DLC-1 gene, DLC-1 cDNA, DNA molecules derived therefrom
and the protein encoded by the cDNA and derivatives thereof may be
utilized in aspects of both the study of HCC and for diagnostic and
therapeutic applications related to HCC. Utilities of the present
invention include, but are not limited to, those utilities
described in the examples presented herein. Those skilled in the
art will recognize that the utilities herein described are not
limited to the specific experimental modes and materials presented
and will appreciate the wider potential utility of this
invention.
Example 11
Expression of DLC-1 cDNA Sequences
[0163] With the provision of the DLC-1 cDNA (Seq. I.D. No. 1), the
expression and purification of the DLC-1 protein by standard
laboratory techniques is now enabled. The purified protein may be
used for functional analyses, antibody production, diagnostics and
patient therapy. Furthermore, the DNA sequence of the DLC-1 cDNA
can be manipulated in studies to understand the expression of the
gene and the function of its product. Mutant forms of the DLC-1 gene
may be isolated based upon information contained herein, and may be
studied in order to detect alteration in expression patterns in
terms of relative quantities, tissue specificity and functional
properties of the encoded mutant DLC-1 protein. Partial or
full-length cDNA sequences, which encode the subject protein,
may be ligated into bacterial expression vectors. Methods for
expressing large amounts of protein from a cloned gene introduced
into Escherichia coli (E. coli) may be utilized for the
purification, localization and functional analysis of proteins. For
example, fusion proteins consisting of amino terminal peptides
encoded by a portion of the E. coli lacZ or trpE gene linked to
DLC-1 proteins may be used to prepare polyclonal and monoclonal
antibodies against these proteins. Thereafter, these antibodies may
be used to purify proteins by immunoaffinity chromatography, in
diagnostic assays to quantitate the levels of protein and to
localize proteins in tissues and individual cells by
immunofluorescence.
[0164] Intact native protein may also be produced in E. coli in
large amounts for functional studies. Methods and plasmid vectors
for producing fusion proteins and intact native proteins in
bacteria are described in Sambrook et al. (In Molecular Cloning: A
Laboratory Manual, Cold Spring Harbor, N.Y., 1989, ch. 17) herein
incorporated by reference. Such fusion proteins may be made in
large amounts, are easy to purify, and can be used to elicit
antibody response. Native proteins can be produced in bacteria by
placing a strong, regulated promoter and an efficient ribosome
binding site upstream of the cloned gene. If low levels of protein
are produced, additional steps may be taken to increase protein
production; if high levels of protein are produced, purification is
relatively easy. Suitable methods are presented in Sambrook et al.
(Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y.,
1989) and are well known in the art. Often, proteins expressed at
high levels are found in insoluble inclusion bodies. Methods for
extracting proteins from these aggregates are described by Sambrook
et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor, N.Y., 1989, ch. 17). Vector systems suitable for the
expression of lacZ fusion genes include the pUR series of vectors
(Ruther and Muller-Hill, EMBO J. 2:1791, 1983), pEX1-3 (Stanley and
Luzio, EMBO J. 3:1429, 1984) and pMR100 (Gray et al., Proc. Natl.
Acad. Sci. USA 79:6598, 1982). Vectors suitable for the production
of intact native proteins include pKC30 (Shimatake and Rosenberg,
Nature 292:128, 1981), pKK177-3 (Amann and Brosius, Gene 40:183,
1985) and pET-3 (Studier and Moffatt, J. Mol. Biol. 189:113, 1986).
DLC-1 fusion proteins may be isolated from protein gels,
lyophilized, ground into a powder and used as an antigen. The DNA
sequence can also be transferred from its existing context in pREP4
to other cloning vehicles, such as other plasmids, bacteriophages,
cosmids, animal viruses and yeast artificial chromosomes (YACs)
(Burke et al., Science 236:806-12, 1987). These vectors may then be
introduced into a variety of hosts including somatic cells, and
simple or complex organisms, such as bacteria, fungi (Timberlake
and Marshall, Science 244:1313-7, 1989), invertebrates, plants
(Gasser and Fraley, Science 244:1293, 1989), and pigs (Pursel et
al., Science 244:1281-8, 1989), which cells or organisms are
rendered transgenic by the introduction of the heterologous DLC-1
cDNA.
[0165] For expression in mammalian cells, the cDNA sequence may be
ligated to heterologous promoters, such as the simian virus 40
(SV40) promoter in the pSV2 vector (Mulligan and Berg, Proc. Natl.
Acad. Sci. USA 78:2072-6, 1981), and introduced into cells, such as
monkey COS-1 cells (Gluzman, Cell 23:175-182, 1981), to achieve
transient or long-term expression. The stable integration of the
chimeric gene construct may be maintained in mammalian cells by
biochemical selection, such as neomycin (Southern and Berg, J. Mol.
Appl. Genet. 1:327-41, 1982) and mycophenolic acid (Mulligan and
Berg, Proc. Natl. Acad. Sci. USA 78:2072-6, 1981).
[0166] DNA sequences can be manipulated with standard procedures
such as restriction enzyme digestion, fill-in with DNA polymerase,
deletion by exonuclease, extension by terminal deoxynucleotide
transferase, ligation of synthetic or cloned DNA sequences,
site-directed sequence-alteration via single-stranded bacteriophage
intermediate or with the use of specific oligonucleotides in
combination with PCR.
[0167] The cDNA sequence (or portions derived from it) or a mini
gene (a cDNA with an intron and its own promoter) may be introduced
into eukaryotic expression vectors by conventional techniques.
These vectors are designed to permit the transcription of the cDNA
in eukaryotic cells by providing regulatory sequences that initiate
and enhance the transcription of the cDNA and ensure its proper
splicing and polyadenylation. Vectors containing the promoter and
enhancer regions of the SV40 or long terminal repeat (LTR) of the
Rous Sarcoma virus and polyadenylation and splicing signal from
SV40 are readily available (Mulligan and Berg, Proc. Natl. Acad.
Sci. USA 78:2072-6, 1981; Gorman et al., Proc. Natl. Acad. Sci. USA
79:6777-6781, 1982). The level of expression of the cDNA can be
manipulated with this type of vector, either by using promoters
that have different activities (for example, the baculovirus pAC373
can express cDNAs at high levels in S. frugiperda cells (Summers
and Smith, In: Genetically Altered Viruses and the Environment,
Fields et al. (Eds.) 22:319-328, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, N.Y., 1985)) or by using vectors that
contain promoters amenable to modulation, for example, the
glucocorticoid-responsive promoter from the mouse mammary tumor
virus (Lee et al., Nature 294:228, 1982). The expression of the
cDNA can be monitored in the recipient cells 24 to 72 hours after
introduction (transient expression).
[0168] In addition, some vectors contain selectable markers such as
the gpt (Mulligan and Berg, Proc. Natl. Acad. Sci. USA 78:2072-6,
1981) or neo (Southern and Berg, J. Mol. Appl. Genet. 1:327-41,
1982) bacterial genes. These selectable markers permit selection of
transfected cells that exhibit stable, long-term expression of the
vectors (and therefore the cDNA). The vectors can be maintained in
the cells as episomal, freely replicating entities by using
regulatory elements of viruses such as papilloma (Sarver et al.,
Mol. Cell Biol. 1:486, 1981) or Epstein-Barr (Sugden et al., Mol.
Cell Biol. 5:410, 1985). Alternatively, one can also produce cell
lines that have integrated the vector into genomic DNA. Both of
these types of cell lines produce the gene product on a continuous
basis. One can also produce cell lines that have amplified the
number of copies of the vector (and therefore of the cDNA as well)
to create cell lines that can produce high levels of the gene
product (Alt et al., J. Biol. Chem. 253:1357, 1978).
[0169] The transfer of DNA into eukaryotic, in particular human or
other mammalian cells, is now a conventional technique. The vectors
are introduced into the recipient cells as pure DNA (transfection)
by, for example, precipitation with calcium phosphate (Graham and
van der Eb, Virology 52:466, 1973) or strontium phosphate (Brash et
al., Mol. Cell Biol. 7:2013, 1987), electroporation (Neumann et
al., EMBO J. 1:841, 1982), lipofection (Felgner et al., Proc. Natl.
Acad. Sci. USA 84:7413, 1987), DEAE dextran (McCutchan et al., J.
Natl. Cancer Inst. 41:351, 1968), microinjection (Mueller et al.,
Cell 15:579, 1978), protoplast fusion (Schaffner, Proc. Natl. Acad.
Sci. USA 77:2163-7, 1980), or pellet guns (Klein et al., Nature
327:70, 1987). Alternatively, the cDNA can be introduced by
infection with virus vectors. Systems have been developed that use,
for example, retroviruses (Bernstein et al., Gen. Engrg. 7:235,
1985), adenoviruses (Ahmad et al., J. Virol. 57:267, 1986), or
Herpes virus (Spaete et al., Cell 30:295, 1982).
[0170] These eukaryotic expression systems can be used for studies
of the DLC-1 gene and mutant forms of this gene, the DLC-1 protein
and mutant forms of this protein. Such uses include, for example,
the identification of regulatory elements located in the 5' region
of the DLC-1 gene on genomic clones that can be isolated from human
genomic DNA libraries using the information contained in the
present invention. The eukaryotic expression systems may also be
used to study the function of the normal complete protein, specific
portions of the protein, or of naturally occurring or artificially
produced mutant proteins.
[0171] Using the above techniques, the expression vectors
containing the DLC-1 gene sequence or fragments or variants or
mutants thereof can be introduced into human cells, mammalian cells
from other species or non-mammalian cells as desired. The choice of
cell is determined by the purpose of the treatment. For example,
monkey COS cells (Gluzman, Cell 23:175-182, 1981) that produce high
levels of the SV40 T antigen and permit the replication of vectors
containing the SV40 origin of replication may be used. Similarly,
Chinese hamster ovary (CHO) cells, mouse NIH 3T3 fibroblasts or human
fibroblasts or lymphoblasts (as described herein) may be used.
[0172] The following is provided as one exemplary method to express
DLC-1 polypeptide from the cloned DLC-1 cDNA sequences in mammalian
cells. Cloning vector pXT1, commercially available from Stratagene,
contains the Long Terminal Repeats (LTRs) and a portion of the GAG
gene from Moloney Murine Leukemia Virus. The position of the viral
LTRs allows highly efficient, stable transfection of the region
within the LTRs. The vector also contains the Herpes Simplex
Thymidine Kinase promoter (TK), active in embryonal cells and in a
wide variety of tissues in mice, and a selectable neomycin gene
conferring G418 resistance. Two unique restriction sites BglII and
XhoI are directly downstream from the TK promoter. DLC-1 cDNA,
including the entire open reading frame for the DLC-1 protein and
the 3' untranslated region of the cDNA, is cloned into one of the
two unique restriction sites downstream from the promoter.
[0173] The ligated product is transfected into mouse NIH 3T3 cells
using Lipofectin (Life Technologies, Inc.) under conditions
outlined in the product specification. Positive transfectants are
selected after growing the transfected cells in 600 .mu.g/ml G418
(Sigma, St. Louis, Mo.). The protein is released into the
supernatant and may be purified by standard immunoaffinity
chromatography techniques using antibodies raised against the DLC-1
protein, as described below.
[0174] Expression of the DLC-1 protein in eukaryotic cells may also
be used as a source of proteins to raise antibodies. The DLC-1
protein may be extracted following release of the protein into the
supernatant as described above, or, the cDNA sequence may be
incorporated into a eukaryotic expression vector and expressed as a
chimeric protein with, for example, .beta.-globin. Antibody to
.beta.-globin is thereafter used to purify the chimeric protein.
Corresponding protease cleavage sites engineered between the
.beta.-globin gene and the cDNA are then used to separate the two
polypeptide fragments from one another after translation. One
useful expression vector for generating .beta.-globin chimeric
proteins is pSG5 (Stratagene). This vector encodes rabbit
.beta.-globin.
[0175] The present invention thus encompasses recombinant vectors
which comprise all or part of the DLC-1 gene or cDNA sequences, for
expression in a suitable host. The DLC-1 DNA is operatively linked
in the vector to an expression control sequence in the recombinant
DNA molecule so that the DLC-1 polypeptide can be expressed. The
expression control sequence may be selected from the group
consisting of sequences that control the expression of genes of
prokaryotic or eukaryotic cells and their viruses and combinations
thereof. The expression control sequence may be specifically
selected from the group consisting of the lac system, the trp
system, the tac system, the trc system, major operator and promoter
regions of phage lambda, the control region of fd coat protein, the
early and late promoters of SV40, promoters derived from polyoma,
adenovirus, retrovirus, baculovirus and simian virus, the promoter
for 3-phosphoglycerate kinase, the promoters of yeast acid
phosphatase, the promoter of the yeast alpha-mating factors and
combinations thereof.
[0176] The host cell, which may be transfected with the vector of
this invention, may be selected from the group consisting of E.
coli, Pseudomonas, Bacillus subtilis, Bacillus stearothermophilus
or other bacilli; other bacteria; yeast; fungi; insect; mouse or
other animal; or plant hosts; or human tissue cells.
[0177] It is appreciated that for mutant or variant DLC-1 DNA
sequences, similar systems are employed to express and produce the
mutant product.
Example 12
Production of an Antibody to DLC-1 Protein
[0178] Monoclonal or polyclonal antibodies may be produced to
either the normal DLC-1 protein or mutant forms of this protein.
Optimally, antibodies raised against the DLC-1 protein would
specifically detect the DLC-1 protein. That is, such antibodies
would recognize and bind the DLC-1 protein and would not
substantially recognize or bind to other proteins found in human
cells. The determination that an antibody specifically detects the
DLC-1 protein is made by any one of a number of standard
immunoassay methods; for instance, the Western blotting technique
(Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor, N.Y., 1989). To determine that a given antibody
preparation (such as one produced in a mouse) specifically detects
the DLC-1 protein by Western blotting, total cellular protein is
extracted from human cells (for example, lymphocytes) and
electrophoresed on a sodium dodecyl sulfate-polyacrylamide gel. The
proteins are then transferred to a membrane (for example,
nitrocellulose) by Western blotting, and the antibody preparation
is incubated with the membrane. After washing the membrane to
remove non-specifically bound antibodies, the presence of
specifically bound antibodies is detected by the use of an
anti-mouse antibody conjugated to an enzyme such as alkaline
phosphatase; application of the substrate
5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results
in the production of a dense blue compound by immuno-localized
alkaline phosphatase. Antibodies which specifically detect the
DLC-1 protein will, by this technique, be shown to bind to the
DLC-1 protein band (which will be localized at a given position on
the gel determined by its molecular weight). Non-specific binding
of the antibody to other proteins may occur and may be detectable
as a weak signal on the Western blot. The non-specific nature of
this binding will be recognized by one skilled in the art by the
weak signal obtained on the Western blot relative to the strong
primary signal arising from the specific antibody-DLC-1 protein
binding.
[0179] Substantially pure DLC-1 protein suitable for use as an
immunogen is isolated from transfected or transformed cells.
Concentration of protein in the final preparation is adjusted, for
example, by concentration on an Amicon filter device, to the level
of a few micrograms per milliliter. Monoclonal or polyclonal
antibody to the protein can then be prepared as follows:
Monoclonal Antibody Production by Hybridoma Fusion
[0180] Monoclonal antibody to epitopes of the DLC-1 protein
identified and isolated as described can be prepared from murine
hybridomas according to the classical method of Kohler and Milstein
(Nature 256:495, 1975) or derivative methods thereof. Briefly, a
mouse is repetitively inoculated with a few micrograms of the
selected protein over a period of a few weeks. The mouse is then
sacrificed, and the antibody-producing cells of the spleen
isolated. The spleen cells are fused by means of polyethylene
glycol with mouse myeloma cells, and the excess unfused cells
destroyed by growth of the system on selective media comprising
aminopterin (HAT media). The successfully fused cells are diluted
and aliquots of the dilution placed in wells of a microtiter plate
where growth of the culture is continued. Antibody-producing clones
are identified by detection of antibody in the supernatant fluid of
the wells by immunoassay procedures, such as ELISA, as originally
described by Engvall (Meth. Enzymol. 70:419, 1980), and derivative
methods thereof. Selected positive clones can be expanded and their
monoclonal antibody product harvested for use. Detailed procedures
for monoclonal antibody production are described in Harlow and Lane
(Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory,
New York, 1988).
Polyclonal Antibody Production by Immunization
[0181] Polyclonal antiserum containing antibodies to heterogeneous
epitopes of a single protein can be prepared by immunizing suitable
animals with the expressed protein, which can be unmodified or
modified to enhance immunogenicity. Effective polyclonal antibody
production is affected by many factors related both to the antigen
and the host species. For example, small molecules tend to be less
immunogenic than others and may require the use of carriers and
adjuvant. Also, host animals vary in response to site of
inoculations and dose, with both inadequate and excessive doses of
antigen resulting in low titer antisera. Small doses (ng level) of
antigen administered at multiple intradermal sites appear to be
most reliable. An effective immunization protocol for rabbits can
be found in Vaitukaitis et al. (J. Clin. Endocrinol. Metab.
33:988-91, 1971).
[0182] Booster injections can be given at regular intervals, and
antiserum harvested when antibody titer thereof, as determined
semi-quantitatively, for example, by double immunodiffusion in agar
against known concentrations of the antigen, begins to fall. See,
for example, Ouchterlony et al. (In Handbook of Experimental
Immunology, Weir, D. (ed.) chapter 19, Blackwell, 1973). Plateau
concentration of antibody is usually in the range of 0.1 to 0.2
mg/ml of serum (about 12 .mu.M). Affinity of the antisera for the
antigen is determined by preparing competitive binding curves, as
described, for example, by Fisher (Manual of Clinical Immunology,
Ch. 42, 1980).
Antibodies Raised against Synthetic Peptides
[0183] A third approach to raising antibodies against the DLC-1
protein is to use synthetic peptides synthesized on a commercially
available peptide synthesizer based upon the predicted amino acid
sequence of the DLC-1 protein.
Antibodies Raised by Injection of DLC-1 Gene
[0184] Antibodies may be raised against the DLC-1 protein by
subcutaneous injection of a DNA vector which expresses the DLC-1
protein into laboratory animals, such as mice. Delivery of the
recombinant vector into the animals may be achieved using a
hand-held form of the Biolistic system (Sanford et al., Particulate
Sci. Technol. 5:27-37, 1987) as described by Tang et al. (Nature
356:152-4, 1992). Expression vectors suitable for this purpose may
include those which express the DLC-1 gene under the
transcriptional control of either the human β-actin promoter
or the cytomegalovirus (CMV) promoter.
[0185] Antibody preparations prepared according to these protocols
are useful in quantitative immunoassays which determine
concentrations of antigen-bearing substances in biological samples;
they are also used semi-quantitatively or qualitatively to identify
the presence of antigen in a biological sample.
Example 13
DNA-Based Diagnosis
[0186] One major application of the DLC-1 sequence information
presented herein is in the area of genetic testing for
predisposition to HCC, BC, PC and/or CRC owing to DLC-1 deletion or
mutation. The sequence of the DLC-1 gene, including
intron-exon boundaries, is also useful in such diagnostic methods.
Individuals carrying mutations in the DLC-1 gene, or having
heterozygous or homozygous deletions of the DLC-1 gene, may be
detected at the DNA level with the use of a variety of techniques.
For such a diagnostic procedure, a biological sample of the
subject, which biological sample contains either DNA or RNA derived
from the subject, is assayed for a mutated or deleted DLC-1 gene.
Suitable biological samples include samples containing genomic DNA
or RNA obtained from body cells, such as those present in
peripheral blood, urine, saliva, tissue biopsy, surgical specimen,
amniocentesis samples and autopsy material. The detection in the
biological sample of either a mutant DLC-1 gene, a mutant DLC-1
RNA, or a homozygously or heterozygously deleted DLC-1 gene, may be
performed by a number of methodologies, as outlined below.
[0187] A preferred embodiment of such detection techniques is
reverse transcription of RNA isolated from lymphocytes followed by
polymerase chain reaction amplification (RT-PCR) and direct DNA
sequence determination of the products. The presence of one or more
nucleotide differences between the obtained sequence and the cDNA
sequences, and especially differences in the ORF portion of the
nucleotide sequence, is taken as indicative of a potential DLC-1
gene mutation.
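The comparison of an obtained sequence against the reference cDNA can be sketched in a few lines of Python. This is only an illustration, not the procedure of the disclosure: it assumes the patient-derived sequence has already been aligned to the reference without gaps (equal lengths), and the function name `find_differences` and the toy sequences are hypothetical. The default ORF coordinates, 325 to 3600, are the CDS range annotated for the cDNA in the sequence listing.

```python
def find_differences(reference, sample, orf_start=325, orf_end=3600):
    """Return (position, ref_base, sample_base, in_orf) for each mismatch.

    Positions are 1-based. Assumes the two sequences are already aligned
    and of equal length (no insertions or deletions are handled).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be pre-aligned to equal length")
    differences = []
    pairs = zip(reference.lower(), sample.lower())
    for index, (ref_base, sample_base) in enumerate(pairs):
        if ref_base != sample_base:
            position = index + 1
            # Differences inside the ORF are of particular interest.
            in_orf = orf_start <= position <= orf_end
            differences.append((position, ref_base, sample_base, in_orf))
    return differences


# Toy example: hypothetical 12-base reference with an ORF at positions 4..9.
toy_reference = "tatgggctcgag"
toy_sample = "tatgggatcgaa"
toy_result = find_differences(toy_reference, toy_sample, orf_start=4, orf_end=9)
```

In this toy case two substitutions are reported, one inside the assumed ORF and one outside it. Real data would first require an alignment step that tolerates insertions and deletions.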
[0188] Alternatively, DNA extracted from lymphocytes or other cells
may be used directly for amplification. The direct amplification
from genomic DNA would be appropriate for analysis of the entire
DLC-1 gene including regulatory sequences located upstream and
downstream from the open reading frame. Recent reviews of direct
DNA diagnosis have been presented by Caskey (Science 236:1223-8,
1989) and by Landegren et al. (Science 242:229-37, 1989).
[0189] Further studies of DLC-1 genes isolated from DLC-1 patients
may reveal particular mutations, or deletions, which occur at a
high frequency within this population of individuals. In this case,
rather than sequencing the entire DLC-1 gene, it may be possible to
design DNA diagnostic methods to specifically detect the most
common DLC-1 mutations or deletions.
[0190] The detection of specific DNA mutations may be achieved by
methods such as hybridization using specific oligonucleotides
(Wallace et al., Cold Spring Harbor Symp. Quant. Biol. 51:257-61,
1986), direct DNA sequencing (Church and Gilbert, Proc. Natl. Acad.
Sci. USA 81:1991-5, 1988), the use of restriction enzymes (Flavell
et al., Cell 15:25, 1978; Geever et al., Proc. Natl. Acad. Sci USA
78:5081, 1981), discrimination on the basis of electrophoretic
mobility in gels with denaturing reagent (Myers and Maniatis, Cold
Spring Harbor Symp. Quant. Biol. 51:275-84, 1986), RNase protection
(Myers et al., Science 230:1242, 1985), chemical cleavage (Cotton
et al., Proc. Natl. Acad. Sci. USA 85:4397-401, 1988), and the
ligase-mediated detection procedure (Landegren et al., Science
241:1077, 1988).
[0191] Oligonucleotides specific to normal or mutant sequences are
chemically synthesized using commercially available machines,
labeled radioactively with isotopes (such as ³²P) or
non-radioactively, with tags such as biotin (Ward and Langer et
al., Proc. Natl. Acad. Sci. USA 78:6633-57, 1981), and hybridized
to individual DNA samples immobilized on membranes or other solid
supports by dot-blot or transfer from gels after electrophoresis.
The presence of these specific sequences is visualized by methods
such as autoradiography or fluorometric (Landegren, et al., Science
242:229-37, 1989) or colorimetric reactions (Gebeyehu et al.,
Nucleic Acids Res. 15:4513-34, 1987). The absence of hybridization
would indicate a mutation in the particular region of the gene, or a
deleted DLC-1 gene.
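The logic of oligonucleotide hybridization can be illustrated with a deliberately simplified, exact-match model in Python. Real hybridization depends on temperature, salt concentration and mismatch tolerance, none of which is modeled here; the helper names `reverse_complement` and `probe_hybridizes` are hypothetical. The probe used is the 24-base sequence listed as SEQ ID NO: 5 (a PCR primer in the sequence listing), borrowed purely as a stand-in oligonucleotide, and the "normal" target is a short stretch of the cDNA around the start codon. The "mutant" target carries an invented single-base change.

```python
def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    table = str.maketrans("acgt", "tgca")
    return sequence.lower().translate(table)[::-1]


def probe_hybridizes(probe, target):
    """Exact-match model of hybridization to denatured double-stranded DNA.

    Returns True if the probe pairs perfectly with either strand of the
    target, where `target` holds one strand written 5' to 3'.
    """
    probe = probe.lower()
    target = target.lower()
    return probe in target or reverse_complement(probe) in target


probe = "atgtgcagaaagaagccggacacc"                    # SEQ ID NO: 5
normal_target = "cttgatgtgcagaaagaagccggacaccatgatc"  # cDNA around the ATG
mutant_target = "cttgatgtgcagaaagaagccagacaccatgatc"  # invented g->a change

normal_signal = probe_hybridizes(probe, normal_target)
mutant_signal = probe_hybridizes(probe, mutant_target)
```

Under this model the probe gives a signal with the normal target and none with the altered one, mirroring the statement that absence of hybridization points to a mutation or deletion in the probed region.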
[0192] Sequence differences between normal and mutant forms of the
DLC-1 gene may also be revealed by the direct DNA sequencing method
of Church and Gilbert (Proc. Natl. Acad. Sci. USA 81:1991-5, 1988).
Cloned DNA segments may be used as probes to detect specific DNA
segments. The sensitivity of this method is greatly enhanced when
combined with PCR (Wrischnik et al., Nucleic Acids Res. 15:529-42,
1987; Wong et al., Nature 330:384-386, 1987; Stoflet et al.,
Science 239:491-4, 1988). In this approach, a sequencing primer
which lies within the amplified sequence is used with
double-stranded PCR product or single-stranded template generated
by a modified PCR. The sequence determination is performed by
conventional procedures with radiolabeled nucleotides or by
automatic sequencing procedures with fluorescent tags.
[0193] Sequence alterations may occasionally generate fortuitous
restriction enzyme recognition sites or may eliminate existing
restriction sites. Changes in restriction sites are revealed by the
use of appropriate enzyme digestion followed by conventional
gel-blot hybridization (Southern, J. Mol. Biol. 98:503, 1975). DNA
fragments carrying the site (either normal or mutant) are detected
by their reduction in size or by an increase in the number of
corresponding restriction fragments. Genomic DNA samples may also be amplified by PCR
prior to treatment with the appropriate restriction enzyme;
fragments of different sizes are then visualized under UV light in
the presence of ethidium bromide after gel electrophoresis.
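The effect of a gained or lost restriction site on fragment sizes can be sketched as follows. This is a minimal illustration under stated assumptions: the DNA is treated as a linear molecule, the cut is placed at the start of each recognition site (real enzymes cut at an enzyme-specific offset), and the function name `digest` and the 20-base toy sequences are hypothetical. GAATTC is the well-known EcoRI recognition sequence, chosen only as an example.

```python
def digest(sequence, site):
    """Return fragment lengths after cutting a linear sequence at each site.

    Simplification: the cut is placed at the first base of every
    occurrence of the recognition site.
    """
    sequence = sequence.upper()
    site = site.upper()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    # Skip zero-length fragments (a site at the very start of the sequence).
    return [end - begin for begin, end in zip(boundaries, boundaries[1:]) if end > begin]


normal_amplicon = "AAAAGAATTCAAAAAAAAAA"  # contains one GAATTC site
mutant_amplicon = "AAAAGAGTTCAAAAAAAAAA"  # single-base change destroys the site

normal_fragments = digest(normal_amplicon, "GAATTC")
mutant_fragments = digest(mutant_amplicon, "GAATTC")
```

The normal amplicon yields two fragments while the altered one stays intact, which is the size difference a gel would reveal after digestion.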
[0194] Genetic testing based on DNA sequence differences may be
achieved by detection of alteration in electrophoretic mobility of
DNA fragments in gels with or without denaturing reagent. Small
sequence deletions and insertions can be visualized by
high-resolution gel electrophoresis. For example, a PCR product
with small deletions is clearly distinguishable from a normal
sequence on an 8% non-denaturing polyacrylamide gel (WO 91/10734;
Nagamine et al., Am. J. Hum. Genet. 45:337-9, 1989). DNA fragments
of different sequence compositions may be distinguished on
denaturing formamide gradient gels in which the mobilities of
different DNA fragments are retarded in the gel at different
positions according to their specific "partial-melting"
temperatures (Myers et al., Science 230:1242, 1985). Alternatively,
a method of detecting a mutation comprising a single base
substitution or other small change could be based on differential
primer length in a PCR. For example, an invariant primer could be
used in addition to a primer specific for a mutation. The PCR
products of the normal and mutant genes can then be differentially
detected in acrylamide gels.
[0195] In addition to conventional gel-electrophoresis and
blot-hybridization methods, DNA fragments may also be visualized by
methods where the individual DNA samples are not immobilized on
membranes. The probe and target sequences may be both in solution,
or the probe sequence may be immobilized (Saiki et al., Proc. Nat.
Acad. Sci. USA 86:6230-4, 1989). A variety of detection methods,
such as autoradiography involving radioisotopes, direct detection
of radioactive decay (in the presence or absence of scintillant),
spectrophotometry involving colorigenic reactions and fluorometry
involving fluorogenic reactions, may be used to identify specific
individual genotypes.
[0196] If more than one mutation is frequently encountered in the
DLC-1 gene, a system capable of detecting such multiple mutations
would be desirable. For example, a PCR with multiple, specific
oligonucleotide primers and hybridization probes may be used to
identify all possible mutations at the same time (Chamberlain et
al., Nucl. Acids Res. 16:1141-55, 1988). The procedure may involve
immobilized sequence-specific oligonucleotide probes (Saiki et
al., Proc. Nat. Acad. Sci. USA 86:6230-4, 1989).
[0197] The following Example describes one method by which
deletions of the DLC-1 gene may be detected.
Example 14
Two Step Assay to Detect the Presence of DLC-1 Gene in a Sample
[0198] A patient liver, breast, prostate and/or colorectal tissue
sample is processed according to the method disclosed by
Antonarakis, et al. (New Eng. J. Med. 313:842-848, 1985), separated
through a 1% agarose gel and transferred to a nylon membrane for
Southern blot analysis. Membranes are UV cross-linked at 150 mJ
using a GS Gene Linker (Bio-Rad). A DLC-1 probe is subcloned into
pTZ18U. The phagemids are transformed into E. coli MV 1190 infected
with M13KO7 helper phage (Bio-Rad, Richmond, Calif.). Single
stranded DNA is isolated according to standard procedures (see
Sambrook, et al. Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor, N.Y., 1989).
[0199] Blots are prehybridized for 15-30 min at 65°C in
7% sodium dodecyl sulfate (SDS) in 0.5 M NaPO₄. The methods
follow those described by Nguyen, et al. (BioTechniques 13:116-123,
1992). The blots are hybridized overnight at 65°C in 7%
SDS, 0.5 M NaPO₄ with 25-50 ng/ml single-stranded probe DNA.
Post-hybridization washes consist of two 30-min washes in 5% SDS,
40 mM NaPO₄ at 65°C, followed by two 30-min washes in
1% SDS, 40 mM NaPO₄ at 65°C.
[0200] Next the blots are rinsed with phosphate buffered saline (PBS, pH
6.8) for 5 min at room temperature and incubated with 0.2% casein
in PBS for 5 min. The blots are then preincubated for 5-10 min
in a shaking water bath at 45°C with hybridization buffer
consisting of 6 M urea, 0.3 M NaCl, and 5× Denhardt's solution
(see Sambrook, et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor, N.Y., 1989). The buffer is removed and replaced with
50-75 μl/cm² fresh hybridization buffer plus 2.5 nM of the
covalently cross-linked oligonucleotide sequence complementary to
the universal primer site (UP-AP, Bio-Rad). The blots are
hybridized for 20-30 min at 45°C, and post-hybridization
washes are performed at 45°C as two 10-min washes in 6 M
urea, 1× standard saline citrate (SSC), 0.1% SDS and one 10-min
wash in 1× SSC, 0.1% Triton® X-100. The blots are then
rinsed for 10 min at room temperature with 1× SSC.
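The buffer volume and probe amount implied by the figures above (50-75 μl of buffer per cm² of membrane, probe at 2.5 nM) can be worked out with a small helper. This is a convenience sketch only; the function name `hybridization_mix` and the 10 cm x 15 cm membrane are hypothetical and are not taken from the protocol.

```python
def hybridization_mix(width_cm, height_cm, ul_per_cm2=(50.0, 75.0), probe_nM=2.5):
    """Return membrane area, buffer volume range (µl) and probe range (pmol)."""
    area_cm2 = width_cm * height_cm
    volumes_ul = [area_cm2 * rate for rate in ul_per_cm2]
    # nmol/L x µl = 1e-9 mol/L x 1e-6 L = 1e-15 mol, i.e. femtomoles;
    # dividing by 1000 converts femtomoles to picomoles.
    probe_pmol = [probe_nM * volume / 1000.0 for volume in volumes_ul]
    return {
        "area_cm2": area_cm2,
        "buffer_ul": tuple(volumes_ul),
        "probe_pmol": tuple(probe_pmol),
    }


# Hypothetical 10 cm x 15 cm membrane.
mix = hybridization_mix(10, 15)
```

For that membrane the helper gives 7.5 to 11.25 ml of buffer and roughly 19 to 28 pmol of probe, which may be useful when scaling the step to blots of other sizes.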
[0201] Blots are incubated for 10 min at room temperature with
shaking in the substrate buffer consisting of 0.1 M diethanolamine,
1 mM MgCl₂, 0.02% sodium azide, pH 10.0. Individual blots are
placed in heat-sealable bags with substrate buffer and 0.2 mM AMPPD
(3-(2'-spiroadamantane)-4-methoxy-4-(3'-phosphoryloxy)phenyl-1,2-dioxetane,
disodium salt, Bio-Rad). After a 20 min incubation at room
temperature with shaking, the excess AMPPD solution is removed. The
blot is exposed to X-ray film overnight. Positive bands indicate
the presence of the DLC-1 gene. Patient samples which show no
hybridizing bands lack the DLC-1 gene, indicating the possibility
of ongoing cancer, or an enhanced susceptibility to developing
cancer in the future.
Example 15
Quantitation of DLC-1 Protein
[0202] An alternative method of diagnosing DLC-1 gene deletion or
mutation is to quantitate the level of DLC-1 protein in the cells
of an individual. This diagnostic tool would be useful for
detecting reduced levels of the DLC-1 protein which result from,
for example, mutations in the promoter regions of the DLC-1 gene or
mutations within the coding region of the gene which produce
truncated, non-functional polypeptides, as well as from deletions
of the entire DLC-1 gene. The determination of reduced DLC-1
protein levels would be an alternative or supplemental approach to
the direct determination of DLC-1 gene deletion or mutation status
by the methods outlined above. The availability of antibodies
specific to the DLC-1 protein will facilitate the quantitation of
cellular DLC-1 protein by one of a number of immunoassay methods
which are well known in the art and are presented in Harlow and
Lane (Antibodies, A Laboratory Manual, Cold Spring Harbor
Laboratory, New York, 1988).
[0203] For the purposes of quantitating the DLC-1 protein, a
biological sample of the subject, which sample includes cellular
proteins, is required. Such a biological sample may be obtained
from body cells, such as those present in peripheral blood, urine,
saliva, tissue biopsy, amniocentesis samples, surgical specimens
and autopsy material, particularly liver cells. Quantitation of
DLC-1 protein is achieved by immunoassay and compared to levels of
the protein found in healthy cells. A significant (e.g., 50% or
greater) reduction in the amount of DLC-1 protein in the cells of a
subject compared to the amount of DLC-1 protein found in normal
human cells would be taken as an indication that the subject may
have deletions or mutations in the DLC-1 gene locus.
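The comparison against normal cells can be expressed as a simple threshold test. The sketch below assumes that immunoassay readings for the subject and for normal cells are already available in the same arbitrary units; the function name `flag_reduced_dlc1` and the example readings are hypothetical, and the 50% default comes from the example threshold given above. A flag indicates only that further testing may be warranted.

```python
def flag_reduced_dlc1(sample_level, normal_level, threshold=0.5):
    """Return (flagged, fractional_reduction) for a DLC-1 protein reading.

    `sample_level` and `normal_level` must be in the same units.
    A reduction equal to or greater than `threshold` is flagged.
    """
    if normal_level <= 0:
        raise ValueError("normal_level must be positive")
    reduction = 1.0 - sample_level / normal_level
    return reduction >= threshold, reduction


# Hypothetical readings in arbitrary immunoassay units.
flagged_low, reduction_low = flag_reduced_dlc1(40.0, 100.0)    # 60% reduction
flagged_mild, reduction_mild = flag_reduced_dlc1(80.0, 100.0)  # 20% reduction
```

In practice the reference level and threshold would need to be established for the specific assay and tissue being tested.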
Example 16
Gene Therapy
[0204] A new gene therapy approach for DLC-1 patients is now made
possible by the present invention. Essentially, liver cells may be
removed from a patient having deletions or mutations of the DLC-1
gene, and then transfected with an expression vector containing the
DLC-1 cDNA. These transfected liver cells will thereby produce
functional DLC-1 protein and can be reintroduced into the patient.
In addition to liver cells, breast, colorectal, prostate, or other
cells may be used, depending on the cancer of interest.
[0205] The scientific and medical procedures required for human
cell transfection are now routine procedures. The provision herein
of DLC-1 cDNAs now allows the development of human gene therapy
based upon these procedures. Immunotherapy of melanoma patients
using genetically engineered tumor-infiltrating lymphocytes (TILs)
has been reported by Rosenberg et al. (N. Engl. J. Med. 323:570-8,
1990). In that study, a retrovirus vector was used to introduce a
gene for neomycin resistance into TILs. A similar approach may be
used to introduce the DLC-1 cDNA into patients affected by DLC-1
deletions or mutations.
[0206] Retroviruses have been considered the preferred vector for
experiments in gene therapy, with a high efficiency of infection
and stable integration and expression (Orkin et al., Prog. Med.
Genet. 7:130, 1988). The full length DLC-1 gene or cDNA can be
cloned into a retroviral vector and driven from either its
endogenous promoter or from the retroviral LTR (long terminal
repeat). Other viral transfection systems may also be utilized for
this type of approach, including Adeno-Associated virus (AAV)
(McLaughlin et al., J. Virol. 62:1963, 1988), Vaccinia virus (Moss
et al., Annu. Rev. Immunol. 5:305, 1987), Bovine Papilloma virus
(Rasmussen et al., Methods Enzymol. 139:642, 1987) or members of
the herpesvirus group such as Epstein-Barr virus (Margolskee et
al., Mol. Cell. Biol. 8:2837-47, 1988). Recent developments in gene
therapy techniques include the use of RNA-DNA hybrid
oligonucleotides, as described by Cole-Strauss, et al. (Science
273:1386-9, 1996). This technique may allow for site-specific
integration of cloned sequences, permitting accurately targeted
gene replacement.
[0207] Having illustrated and described the principles of isolating
the human DLC-1 cDNA and its corresponding genomic genes, the
protein and modes of use of these biological molecules, it should
be apparent to one skilled in the art that the invention can be
modified in arrangement and detail without departing from such
principles. We claim all modifications coming within the spirit and
scope of the claims presented herein.
Sequence CWU 1
1
3113850DNAHomo sapiensCDS(325)..(3600) 1tatgggctcg agcggccgcc
cgggcaggtg cccgagcgag ggcgcttcgc tcccagccag 60gacatggccg cacctctccg
catcaggagc gccggctcac ggacttctcg cccaactccc 120tgagcgctcc
ctcgtttcga tctttagaaa accccgcttt ctttctgggg ccgtgacgag
180gggcagggag cggcgagcaa ggatgcgttg aggaccgcga gggcgcgcgt
ctcgggtgcc 240gccgtgggtc ccgacgcgga agccgagccg cctccgcctg
cctcgacttc cccacagcgc 300ttccgccgcc gcctgccgtg cttg atg tgc aga aag
aag ccg gac acc atg 351Met Cys Arg Lys Lys Pro Asp Thr Met1 5atc
cta aca caa att gaa gcc aag gaa gct tgt gat tgg cta cgg gca 399Ile
Leu Thr Gln Ile Glu Ala Lys Glu Ala Cys Asp Trp Leu Arg Ala10 15 20
25act ggt ttc ccc cag tat gca cag ctt tat gaa gat ttc ctg ttc ccc
447Thr Gly Phe Pro Gln Tyr Ala Gln Leu Tyr Glu Asp Phe Leu Phe
Pro30 35 40atc gat att tcc ttg gtc aag aga gag cat gat ttt ttg gac
aga gat 495Ile Asp Ile Ser Leu Val Lys Arg Glu His Asp Phe Leu Asp
Arg Asp45 50 55gcc att gag gct cta tgc agg cgt cta aat act tta aac
aaa tgt gcg 543Ala Ile Glu Ala Leu Cys Arg Arg Leu Asn Thr Leu Asn
Lys Cys Ala60 65 70gtg atg aag cta gaa att agt cct cat cgg aaa cga
agt gac gat tca 591Val Met Lys Leu Glu Ile Ser Pro His Arg Lys Arg
Ser Asp Asp Ser75 80 85gac gag gat gag cct tgt gcc atc agt ggc aaa
tgg act ttc caa agg 639Asp Glu Asp Glu Pro Cys Ala Ile Ser Gly Lys
Trp Thr Phe Gln Arg90 95 100 105gac agc aag agg tgg tcc cgg ctt gaa
gag ttt gat gtc ttt tct cca 687Asp Ser Lys Arg Trp Ser Arg Leu Glu
Glu Phe Asp Val Phe Ser Pro110 115 120aaa caa gac ctg gtc cct ggg
tcc cca gac gac tcc cac ccg aag gac 735Lys Gln Asp Leu Val Pro Gly
Ser Pro Asp Asp Ser His Pro Lys Asp125 130 135ggc ccc agc ccc gga
ggc acg ctg atg gac ctc agc gag cgc cag gag 783Gly Pro Ser Pro Gly
Gly Thr Leu Met Asp Leu Ser Glu Arg Gln Glu140 145 150gtg tct tcc
gtc cgc agc ctc agc agc act ggc agc ctc ccc agc cac 831Val Ser Ser
Val Arg Ser Leu Ser Ser Thr Gly Ser Leu Pro Ser His155 160 165gcg
ccc ccc agc gag gat gct gcc acc ccc cgg act aac tcc gtc atc 879Ala
Pro Pro Ser Glu Asp Ala Ala Thr Pro Arg Thr Asn Ser Val Ile170 175
180 185agc gtt tgc tcc tcc agc aac ttg gca ggc aat gac gac tct ttc
ggc 927Ser Val Cys Ser Ser Ser Asn Leu Ala Gly Asn Asp Asp Ser Phe
Gly190 195 200agc ctg ccc tct ccc aag gaa ctg tcc agc ttc agc ttc
agc atg aaa 975Ser Leu Pro Ser Pro Lys Glu Leu Ser Ser Phe Ser Phe
Ser Met Lys205 210 215ggc cac gaa aaa act gcc aag tcc aag acg cgc
agt ctg ctg aaa cgg 1023Gly His Glu Lys Thr Ala Lys Ser Lys Thr Arg
Ser Leu Leu Lys Arg220 225 230atg gag agc ctg aag ctc aag agc tcc
cat cac agc aag cac aaa gcg 1071Met Glu Ser Leu Lys Leu Lys Ser Ser
His His Ser Lys His Lys Ala235 240 245ccc tca aag ctg ggg ttg atc
atc agc ggg ccc atc ttg caa gag ggg 1119Pro Ser Lys Leu Gly Leu Ile
Ile Ser Gly Pro Ile Leu Gln Glu Gly250 255 260 265atg gat gag gag
aag ctg aag cag ctc agc tgc gtg gag atc tcc gcc 1167Met Asp Glu Glu
Lys Leu Lys Gln Leu Ser Cys Val Glu Ile Ser Ala270 275 280ctc aat
ggc aac cgc atc aac gtc ccc atg gta cga aag agg agc gtt 1215Leu Asn
Gly Asn Arg Ile Asn Val Pro Met Val Arg Lys Arg Ser Val285 290
295tcc aac tcc acg cag acc agc agc agc agc agc cag tcg gag acc agc
1263Ser Asn Ser Thr Gln Thr Ser Ser Ser Ser Ser Gln Ser Glu Thr
Ser300 305 310agc gcg gtc agc acg ccc agc cct gtt acg agg acc cgg
agc ctc agt 1311Ser Ala Val Ser Thr Pro Ser Pro Val Thr Arg Thr Arg
Ser Leu Ser315 320 325gcg tgc aac aag cgg gtg ggc atg tac tta gag
ggc ttc gat cct ttc 1359Ala Cys Asn Lys Arg Val Gly Met Tyr Leu Glu
Gly Phe Asp Pro Phe330 335 340 345aat cag tca aca ttt aac aac gtg
gtg gag cag aac ttt aag aac cgc 1407Asn Gln Ser Thr Phe Asn Asn Val
Val Glu Gln Asn Phe Lys Asn Arg350 355 360gag agc tac cca gag gac
acg gtg ttc tac atc cct gaa gat cac aag 1455Glu Ser Tyr Pro Glu Asp
Thr Val Phe Tyr Ile Pro Glu Asp His Lys365 370 375cct ggc act ttc
ccc aaa gct ctc acc aat ggc agt ttc tcc ccc tcg 1503Pro Gly Thr Phe
Pro Lys Ala Leu Thr Asn Gly Ser Phe Ser Pro Ser380 385 390ggg aat
aac ggc tct gtg aac tgg agg acg gga agc ttc cac ggc cct 1551Gly Asn
Asn Gly Ser Val Asn Trp Arg Thr Gly Ser Phe His Gly Pro395 400
405ggc cac atc agc ctc agg agg gaa aac agt agc gac agc ccc aag gaa
1599Gly His Ile Ser Leu Arg Arg Glu Asn Ser Ser Asp Ser Pro Lys
Glu410 415 420 425ctg aag aga cgc aat tct tcc agc tcc atg agc agc
cgc ctg agc atc 1647Leu Lys Arg Arg Asn Ser Ser Ser Ser Met Ser Ser
Arg Leu Ser Ile430 435 440tac gac aac gtg ccg ggc tcc atc ctc tac
tcc agt tca ggg gac ctg 1695Tyr Asp Asn Val Pro Gly Ser Ile Leu Tyr
Ser Ser Ser Gly Asp Leu445 450 455gcg gat ctg gag aac gag gac atc
ttc ccc gag ctg gac gac atc ctc 1743Ala Asp Leu Glu Asn Glu Asp Ile
Phe Pro Glu Leu Asp Asp Ile Leu460 465 470tac cac gtg aag ggg atg
cag cgg ata gtc aat cag tgg tcg gag aag 1791Tyr His Val Lys Gly Met
Gln Arg Ile Val Asn Gln Trp Ser Glu Lys475 480 485ttt tct gat gag
gga gat tcg gac tca gcc ctg gac tcg gtc tct ccc 1839Phe Ser Asp Glu
Gly Asp Ser Asp Ser Ala Leu Asp Ser Val Ser Pro490 495 500 505tgc
ccg tcc tct cca aaa cag ata cac ctg gat gtg gac aac gac cga 1887Cys
Pro Ser Ser Pro Lys Gln Ile His Leu Asp Val Asp Asn Asp Arg510 515
520acc aca ccc agc gac ctg gac agc aca ggc aac tcc ctg aat gaa ccg
1935Thr Thr Pro Ser Asp Leu Asp Ser Thr Gly Asn Ser Leu Asn Glu
Pro525 530 535gaa gag ccc tcc gag atc ccg gaa aga agg gat tct ggg
gtt ggg gct 1983Glu Glu Pro Ser Glu Ile Pro Glu Arg Arg Asp Ser Gly
Val Gly Ala540 545 550tcc cta acc agg tcc aac agg cac cga ctg aga
tgg cac agt ttc cag 2031Ser Leu Thr Arg Ser Asn Arg His Arg Leu Arg
Trp His Ser Phe Gln555 560 565agc tca cat cgg cca agc ctc aac tct
gta tca cta cag att aac tgc 2079Ser Ser His Arg Pro Ser Leu Asn Ser
Val Ser Leu Gln Ile Asn Cys570 575 580 585cag tct gtg gcc cag atg
aac ctg ctg cag aaa tac tca ctc cta aag 2127Gln Ser Val Ala Gln Met
Asn Leu Leu Gln Lys Tyr Ser Leu Leu Lys590 595 600cta acg gcc ctg
ctg gag aaa tac aca cct tct aac aag cat ggt ttt 2175Leu Thr Ala Leu
Leu Glu Lys Tyr Thr Pro Ser Asn Lys His Gly Phe605 610 615agc tgg
gcc gtg ccc aag ttc atg aag agg atc aag gtt cca gac tac 2223Ser Trp
Ala Val Pro Lys Phe Met Lys Arg Ile Lys Val Pro Asp Tyr620 625
630aag gac cgg agt gtg ttt ggg gtc cca ctg acg gtc aac gtg cag cgc
2271Lys Asp Arg Ser Val Phe Gly Val Pro Leu Thr Val Asn Val Gln
Arg635 640 645aca gga caa ccg ttg cct cag agc atc cag cag gcc atg
cga tac ctc 2319Thr Gly Gln Pro Leu Pro Gln Ser Ile Gln Gln Ala Met
Arg Tyr Leu650 655 660 665cgg aac cat tgt ttg gat cag gtt ggg ctc
ttc aaa aaa tcg ggg gtc 2367Arg Asn His Cys Leu Asp Gln Val Gly Leu
Phe Lys Lys Ser Gly Val670 675 680aag tcc cgg att cag gct ctg cgc
cag atg aat gaa ggt gcc ata gac 2415Lys Ser Arg Ile Gln Ala Leu Arg
Gln Met Asn Glu Gly Ala Ile Asp685 690 695tgt gtc aac tac gaa gga
cag tct gct tat gac gtg gca gac atg ctg 2463Cys Val Asn Tyr Glu Gly
Gln Ser Ala Tyr Asp Val Ala Asp Met Leu700 705 710aag cag tat ttt
cga gat ctt cct gag cca cta atg acg aac aaa ctc 2511Lys Gln Tyr Phe
Arg Asp Leu Pro Glu Pro Leu Met Thr Asn Lys Leu715 720 725tcg gaa
acc ttt cta cag atc tac caa tat gtg ccc aag gac cag cgc 2559Ser Glu
Thr Phe Leu Gln Ile Tyr Gln Tyr Val Pro Lys Asp Gln Arg730 735 740
745ctg cag gcc atc aag gct gcc atc atg ctg ctg cct gac gag aac cgg
2607Leu Gln Ala Ile Lys Ala Ala Ile Met Leu Leu Pro Asp Glu Asn
Arg750 755 760gtg gtt ctg cag acc ctg ctt tat ttc ctg tgc gat gtc
aca gca gcc 2655Val Val Leu Gln Thr Leu Leu Tyr Phe Leu Cys Asp Val
Thr Ala Ala765 770 775gta aaa gaa aac cag atg acc cca acc aac ctg
gcc gtg tgc tta gcg 2703Val Lys Glu Asn Gln Met Thr Pro Thr Asn Leu
Ala Val Cys Leu Ala780 785 790cct tcc ctc ttc cat ctc aac acc ctg
aag aga gag aat tcc tct ccc 2751Pro Ser Leu Phe His Leu Asn Thr Leu
Lys Arg Glu Asn Ser Ser Pro795 800 805agg gta atg caa aga aaa caa
agt ttg ggc aaa cca gat cag aaa gat 2799Arg Val Met Gln Arg Lys Gln
Ser Leu Gly Lys Pro Asp Gln Lys Asp810 815 820 825ttg aat gaa aac
cta gct gcc act caa ggg ctg gcc cat atg atc gcc 2847Leu Asn Glu Asn
Leu Ala Ala Thr Gln Gly Leu Ala His Met Ile Ala830 835 840gag tgc
aag aag ctt ttc cag gtt ccc gag gaa atg agc cga tgt cgt 2895Glu Cys
Lys Lys Leu Phe Gln Val Pro Glu Glu Met Ser Arg Cys Arg845 850
855aat tcc tat acc gaa caa gag ctg aag ccc ctc act ctg gaa gca ctc
2943Asn Ser Tyr Thr Glu Gln Glu Leu Lys Pro Leu Thr Leu Glu Ala
Leu860 865 870ggg cac ctg ggt aat gat gac tca gct gac tac caa cac
ttc ctc cag 2991Gly His Leu Gly Asn Asp Asp Ser Ala Asp Tyr Gln His
Phe Leu Gln875 880 885gac tgt gtg gat ggc ctg ttt aaa gaa gtc aaa
gag aag ttt aaa ggc 3039Asp Cys Val Asp Gly Leu Phe Lys Glu Val Lys
Glu Lys Phe Lys Gly890 895 900 905tgg gtc agc tac tcc act tcg gag
cag gct gag ctg tcc tat aag aag 3087Trp Val Ser Tyr Ser Thr Ser Glu
Gln Ala Glu Leu Ser Tyr Lys Lys910 915 920gtg agc gaa gga ccc cgt
ctg agg ctt tgg agg tca gtc att gaa gtc 3135Val Ser Glu Gly Pro Arg
Leu Arg Leu Trp Arg Ser Val Ile Glu Val925 930 935cct gct gtg cca
gag gaa atc tta aag cgc cta ctt aaa gaa cag cac 3183Pro Ala Val Pro
Glu Glu Ile Leu Lys Arg Leu Leu Lys Glu Gln His940 945 950ctc tgg
gat gta gac ctg ttg gat tca aaa gtg atc gaa att ctg gac 3231Leu Trp
Asp Val Asp Leu Leu Asp Ser Lys Val Ile Glu Ile Leu Asp955 960
965agc caa act gaa att tac cag tat gtc caa aac agt atg gca cct cat
3279Ser Gln Thr Glu Ile Tyr Gln Tyr Val Gln Asn Ser Met Ala Pro
His970 975 980 985cct gct cga gac tac gtt gtt tta aga acc tgg agg
act aat tta ccc 3327Pro Ala Arg Asp Tyr Val Val Leu Arg Thr Trp Arg
Thr Asn Leu Pro990 995 1000aaa gga gcc tgt gcc ctt tta cta acc tct
gtg gat cac gat cgc gca 3375Lys Gly Ala Cys Ala Leu Leu Leu Thr Ser
Val Asp His Asp Arg Ala1005 1010 1015cct gtg gtg ggt gtg agg gtt
aat gtg ctc ttg tcc agg tat ttg att 3423Pro Val Val Gly Val Arg Val
Asn Val Leu Leu Ser Arg Tyr Leu Ile1020 1025 1030gaa ccc tgt ggg
cca gga aaa tcc aaa ctc acc tac atg tgc aga gtt 3471Glu Pro Cys Gly
Pro Gly Lys Ser Lys Leu Thr Tyr Met Cys Arg Val1035 1040 1045gac
tta agg ggc cac atg cca gaa tgg tac aca aaa tct ttt gga cat 3519Asp
Leu Arg Gly His Met Pro Glu Trp Tyr Thr Lys Ser Phe Gly His1050
1055 1060 1065ttg tgt gca gct gaa gtt gta aag atc cgg gat tcc ttc
agt aac cag 3567Leu Cys Ala Ala Glu Val Val Lys Ile Arg Asp Ser Phe
Ser Asn Gln1070 1075 1080aac act gaa acc aaa gac acc aaa tct agg
tga tcactgaagc aacgcaaccg 3620Asn Thr Glu Thr Lys Asp Thr Lys Ser
Arg1085 1090cttccaccac catggtgttt gtttttagaa gttttgccag tccttgaaga
atgggttctg 3680tgtgtaatcc tgaaacaaag aaaactacaa gctggagtgt
aggaattgac tatagcaatt 3740tgatacattt ttaaagctgc ttcctgtttg
ttgagggtct gtattcatag accttgactg 3800gaatatgtaa gactgtgcga
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 385021091PRTHomo sapiens 2Met Cys
Arg Lys Lys Pro Asp Thr Met Ile Leu Thr Gln Ile Glu Ala1 5 10 15Lys
Glu Ala Cys Asp Trp Leu Arg Ala Thr Gly Phe Pro Gln Tyr Ala20 25
30Gln Leu Tyr Glu Asp Phe Leu Phe Pro Ile Asp Ile Ser Leu Val Lys35
40 45Arg Glu His Asp Phe Leu Asp Arg Asp Ala Ile Glu Ala Leu Cys
Arg50 55 60Arg Leu Asn Thr Leu Asn Lys Cys Ala Val Met Lys Leu Glu
Ile Ser65 70 75 80Pro His Arg Lys Arg Ser Asp Asp Ser Asp Glu Asp
Glu Pro Cys Ala85 90 95Ile Ser Gly Lys Trp Thr Phe Gln Arg Asp Ser
Lys Arg Trp Ser Arg100 105 110Leu Glu Glu Phe Asp Val Phe Ser Pro
Lys Gln Asp Leu Val Pro Gly115 120 125Ser Pro Asp Asp Ser His Pro
Lys Asp Gly Pro Ser Pro Gly Gly Thr130 135 140Leu Met Asp Leu Ser
Glu Arg Gln Glu Val Ser Ser Val Arg Ser Leu145 150 155 160Ser Ser
Thr Gly Ser Leu Pro Ser His Ala Pro Pro Ser Glu Asp Ala165 170
175Ala Thr Pro Arg Thr Asn Ser Val Ile Ser Val Cys Ser Ser Ser
Asn180 185 190Leu Ala Gly Asn Asp Asp Ser Phe Gly Ser Leu Pro Ser
Pro Lys Glu195 200 205Leu Ser Ser Phe Ser Phe Ser Met Lys Gly His
Glu Lys Thr Ala Lys210 215 220Ser Lys Thr Arg Ser Leu Leu Lys Arg
Met Glu Ser Leu Lys Leu Lys225 230 235 240Ser Ser His His Ser Lys
His Lys Ala Pro Ser Lys Leu Gly Leu Ile245 250 255Ile Ser Gly Pro
Ile Leu Gln Glu Gly Met Asp Glu Glu Lys Leu Lys260 265 270Gln Leu
Ser Cys Val Glu Ile Ser Ala Leu Asn Gly Asn Arg Ile Asn275 280
285Val Pro Met Val Arg Lys Arg Ser Val Ser Asn Ser Thr Gln Thr
Ser290 295 300Ser Ser Ser Ser Gln Ser Glu Thr Ser Ser Ala Val Ser
Thr Pro Ser305 310 315 320Pro Val Thr Arg Thr Arg Ser Leu Ser Ala
Cys Asn Lys Arg Val Gly325 330 335Met Tyr Leu Glu Gly Phe Asp Pro
Phe Asn Gln Ser Thr Phe Asn Asn340 345 350Val Val Glu Gln Asn Phe
Lys Asn Arg Glu Ser Tyr Pro Glu Asp Thr355 360 365Val Phe Tyr Ile
Pro Glu Asp His Lys Pro Gly Thr Phe Pro Lys Ala370 375 380Leu Thr
Asn Gly Ser Phe Ser Pro Ser Gly Asn Asn Gly Ser Val Asn385 390 395
400Trp Arg Thr Gly Ser Phe His Gly Pro Gly His Ile Ser Leu Arg
Arg405 410 415Glu Asn Ser Ser Asp Ser Pro Lys Glu Leu Lys Arg Arg
Asn Ser Ser420 425 430Ser Ser Met Ser Ser Arg Leu Ser Ile Tyr Asp
Asn Val Pro Gly Ser435 440 445Ile Leu Tyr Ser Ser Ser Gly Asp Leu
Ala Asp Leu Glu Asn Glu Asp450 455 460Ile Phe Pro Glu Leu Asp Asp
Ile Leu Tyr His Val Lys Gly Met Gln465 470 475 480Arg Ile Val Asn
Gln Trp Ser Glu Lys Phe Ser Asp Glu Gly Asp Ser485 490 495Asp Ser
Ala Leu Asp Ser Val Ser Pro Cys Pro Ser Ser Pro Lys Gln500 505
510Ile His Leu Asp Val Asp Asn Asp Arg Thr Thr Pro Ser Asp Leu
Asp515 520 525Ser Thr Gly Asn Ser Leu Asn Glu Pro Glu Glu Pro Ser
Glu Ile Pro530 535 540Glu Arg Arg Asp Ser Gly Val Gly Ala Ser Leu
Thr Arg Ser Asn Arg545 550 555 560His Arg Leu Arg Trp His Ser Phe
Gln Ser Ser His Arg Pro Ser Leu565 570 575Asn Ser Val Ser Leu Gln
Ile Asn Cys Gln Ser Val Ala Gln Met Asn580 585 590Leu Leu Gln Lys
Tyr Ser Leu Leu Lys Leu Thr Ala Leu Leu Glu Lys595 600 605Tyr Thr
Pro Ser Asn Lys His Gly Phe Ser Trp Ala Val Pro Lys Phe610 615
620Met Lys Arg Ile Lys Val Pro Asp Tyr Lys Asp Arg Ser Val Phe
Gly625 630 635 640Val Pro Leu Thr Val Asn Val Gln Arg Thr Gly Gln
Pro Leu Pro Gln645 650 655Ser Ile Gln Gln Ala Met Arg Tyr Leu Arg
Asn His Cys Leu Asp Gln660 665 670Val Gly Leu Phe Lys Lys Ser Gly
Val Lys Ser Arg Ile Gln Ala Leu675 680 685Arg Gln Met Asn Glu Gly
Ala Ile Asp Cys Val Asn Tyr Glu Gly Gln690 695 700Ser Ala Tyr Asp
Val Ala Asp Met Leu Lys Gln Tyr Phe Arg Asp Leu705 710 715 720Pro
Glu Pro Leu Met Thr Asn Lys Leu Ser Glu Thr Phe Leu Gln Ile725 730
735Tyr Gln Tyr Val
Pro Lys Asp Gln Arg Leu Gln Ala Ile Lys Ala Ala740 745 750Ile Met
Leu Leu Pro Asp Glu Asn Arg Val Val Leu Gln Thr Leu Leu755 760
765Tyr Phe Leu Cys Asp Val Thr Ala Ala Val Lys Glu Asn Gln Met
Thr770 775 780Pro Thr Asn Leu Ala Val Cys Leu Ala Pro Ser Leu Phe
His Leu Asn785 790 795 800Thr Leu Lys Arg Glu Asn Ser Ser Pro Arg
Val Met Gln Arg Lys Gln805 810 815Ser Leu Gly Lys Pro Asp Gln Lys
Asp Leu Asn Glu Asn Leu Ala Ala820 825 830Thr Gln Gly Leu Ala His
Met Ile Ala Glu Cys Lys Lys Leu Phe Gln835 840 845Val Pro Glu Glu
Met Ser Arg Cys Arg Asn Ser Tyr Thr Glu Gln Glu850 855 860Leu Lys
Pro Leu Thr Leu Glu Ala Leu Gly His Leu Gly Asn Asp Asp865 870 875
880Ser Ala Asp Tyr Gln His Phe Leu Gln Asp Cys Val Asp Gly Leu
Phe885 890 895Lys Glu Val Lys Glu Lys Phe Lys Gly Trp Val Ser Tyr
Ser Thr Ser900 905 910Glu Gln Ala Glu Leu Ser Tyr Lys Lys Val Ser
Glu Gly Pro Arg Leu915 920 925Arg Leu Trp Arg Ser Val Ile Glu Val
Pro Ala Val Pro Glu Glu Ile930 935 940Leu Lys Arg Leu Leu Lys Glu
Gln His Leu Trp Asp Val Asp Leu Leu945 950 955 960Asp Ser Lys Val
Ile Glu Ile Leu Asp Ser Gln Thr Glu Ile Tyr Gln965 970 975Tyr Val
Gln Asn Ser Met Ala Pro His Pro Ala Arg Asp Tyr Val Val980 985
990Leu Arg Thr Trp Arg Thr Asn Leu Pro Lys Gly Ala Cys Ala Leu
Leu995 1000 1005Leu Thr Ser Val Asp His Asp Arg Ala Pro Val Val Gly
Val Arg Val1010 1015 1020Asn Val Leu Leu Ser Arg Tyr Leu Ile Glu
Pro Cys Gly Pro Gly Lys1025 1030 1035 1040Ser Lys Leu Thr Tyr Met
Cys Arg Val Asp Leu Arg Gly His Met Pro1045 1050 1055Glu Trp Tyr
Thr Lys Ser Phe Gly His Leu Cys Ala Ala Glu Val Val1060 1065
1070Lys Ile Arg Asp Ser Phe Ser Asn Gln Asn Thr Glu Thr Lys Asp
Thr1075 1080 1085Lys Ser Arg1090
3 21 DNA Artificial Sequence
Description of Artificial Sequence PCR primer
3 tatgggctcg agcggccgcc c 21
4 21 DNA Artificial Sequence
Description of Artificial Sequence PCR primer
4 cgcacagtct tacatattcc a 21
5 24 DNA Artificial Sequence
Description of Artificial Sequence PCR primer
5 atgtgcagaa agaagccgga cacc 24
6 24 DNA Artificial Sequence
Description of Artificial Sequence PCR primer
6 cctagatttg gtgtctttgg tttc 24
7 21 DNA Artificial Sequence
Description of Artificial Sequence PCR primer
7 gacaccacca tctctgtgct c 21
8 21 DNA Artificial Sequence
Description of Artificial Sequence PCR primer
8 gcagactgtc cttcgtagtt g 21
9 26 DNA Artificial Sequence
Description of Artificial Sequence primer
9 cactccggtc cttgtagtct ggaacc 26
10 25 DNA Artificial Sequence
Description of Artificial Sequence primer
10 atcctcttca tgaactcggg cacgg 25
11 27 DNA Artificial Sequence
Description of Artificial Sequence primer
11 gatcaaggtt ctagactaca aggaccg 27
12 691 DNA Artificial Sequence
Description of Artificial Sequence probe
12 ccngcaganc tcgaaaatat gcttcggcat gtctgccacg tcataagcag
actgtccttc 60gtagttgaca cagtctatgg caccctcatt catctggcgc agagcctgaa
tccgggactt 120gacccccgat tttctgaaga gcccaacctg tcggaagagc
aacactaagt gtggggtaca 180ttcacgtgga cgcagtgttt acaccacaca
actagaagaa gctgcatgta atccgagctc 240ccctgagtac gtgaacccgc
aggcagcgct ctcacctgat ccaaacaatg gttccggggg 300tatcgcacgg
cctgctggat gctctcaggc aacggttgtc ctgtgcgctg cacgttgacc
360gtcaggggac cccaaacaca ctccggtcct tgtagtctgg aaccttgatc
ctcttcatga 420actcgggcac ggccctgtta aagagcacag agatggtggt
gtcggcggan acatgctcac 480ttgtctgtct acacttgtcc aattctgcag
gcaaaccctg tgggctccag attctgatat 540catccaatga attcgagctc
gtaccgggga tcctctaaaa tccaacttgc aggcattcca 600gcttcagctg
ctccaatttc tatatgttcc cctaaatcgt atttttttga aacataaggt
660tattttttta attgtaccnc gttcctaacn a 691
13 301 DNA Artificial Sequence
Description of Artificial Sequence probe
13 gaggctctat
gcaggcgtct aaatacttta aacaaatgtg cggtgatgaa gctagaaatt 60agtcctcatc
ggaaacgaag tgacgattca gacgaggatg agccttgtgc catcagtggc
120aaatggactt tccaaaggga cagcaagagg tggtcccggc ttgaagagtt
tgatgtcttt 180tctccaaaac aagacctggt ccctgggtcc ccagacgact
cccacccgaa ggacggcccc 240agccccggag gcacgctgat ggacctcagc
gagcgccagg aggtgtcttc cgtccgcagc 300c 301
14 3006 DNA Homo sapiens
misc_feature (1)..(3006) n represents a or g or c or t/u
14 cnggcagatc tcgaanatac tgcttcggca tgtctgccac gtcataagca gactgtcctt
60cgtagttgac acagtctatg gcaccytcat tcatctggcg cagagcctga atccgggact
120tgacccccga ttttctgaag agcccaacct gtcggaagag caacactaag
tgtggggtac 180attcacgtgg acgcagtgtt tacaccacac aactagaaga
agctgcatgt aatccgagct 240cccctgagta cgtggacccg caggcagcgc
tctcacctga tccaaacaat ggttccggrg 300gtatcgcayg gcctgctgga
tgctctgagg caacggttgt cctgtgcgct gcacgttgac 360cgtcagtggg
accccaaaca cactccggtc cttgtagtct ggaaccttga tcctcttcat
420gaacttgggc acggcctgtt aaagaacaca gagatggtgg tgttggcgga
gacatgctca 480cttgtctgtc tacacttgtc caattctgca ggcaaaccct
gtgggctcca gatctgtgct 540aatacggtgg ctacttaaat ttaaattaaa
caaaatgaca aattcagttc cccagtggta 600ctggccacac ttcaggtgct
ccttcatctt ttgtgctcag tacctactgt attggcctgt 660gcagataaag
aacattccta tcatccagac agttctcctg gacagtgctg ttctagatct
720tctaagagtg ggggttgaca ggtccgtttc ctcagttagg agcgtccttc
caccttgaac 780ctggagaatt ggggtctaca gtcttaagga agctgatgga
tttccttaca gaatggcggt 840ataggatgga acaagcagaa aacaacatgt
aataccctaa ttaggtgcat ctgatagagt 900gtgaaaaaca aggtcccttt
tgtcttgaaa aaagggtaag aatcacttct gagttcttga 960tgagatcgaa
agcatttagg gtcaaaaggc gcagataaca catgatggga aaacagcaat
1020gagagcctaa cacaatggga gccaactcca gagctcaaca gtgaatgacc
tgaagtcaaa 1080ataaaatctg ctgctgatga cccggagaac attacatctt
taggtttcta aaggaagatg 1140gaaaaggaac aatgggggtt ttgtgagccg
accccaggct ccctggtgtc ctgaaaccag 1200gtccacccca gcactatatg
caacagcagg aaacccatgt catgcatttc aggctgtcaa 1260gcagaaattc
cagctctcca aatgacctct ctgaacagga cccgaaaggg caaggccaaa
1320caggaaaaga accttgtgta ggattcctcc ctgctccaca gatcccacca
tgtgaggctt 1380ttacagttgg ttttgagtca ctggaaacac tgaccagaac
acaagaagta ttatggactt 1440tcagattctt gagggtttgg tggggatggg
ggtgggccac tccgaaatga gaatctaaaa 1500tatgcagttt taaatagcca
gcagggaaaa cattactcta agcacagagg aactccagag 1560aagacagact
gctttgcctt ttgaatgctc accagcagcc atggcatgtt actgtttata
1620gctccaggaa aggtaaaacg aaagagcaaa gttaagtttg tatttccata
cagttaagtg 1680tgtggtatca tggctataag tgtgcataat actcgctttg
tcgggggaga aaagcccgac 1740ggcggaatgt gaaaagaaca cattacgatc
cccaccgaga atctgaagca tgtgaggata 1800aaccggtcaa tacttatttc
tgtcattcag aacaaacaac ttctgtattt agcaaggctc 1860acataataac
agcctttgaa cgggagtgct ttgatgctga agttaaatct gctatgatcc
1920taaggagagg aggagctgga gacaaaaaga acagtttcct tgctttgccg
actttctcaa 1980gcaacttggg tttgctacag agtgctacta atgaaatggg
cggcttctcc atttttatca 2040aatatggtag tgtgcgactg gataataaac
actcagatta ctgaaaagac ttaaggattc 2100ccagatgaca ctgaaaaatg
cactgagatg tcaatctaga aacatttctc tgcttggcac 2160tgatagcaga
aaaattaaga tgtacccaga ttaggtgata tccatgaccc atctagcctt
2220acagcctacc cctcacattc tatatactaa ggagctatat ttttcaaagt
aattatgaac 2280aatttgtaca atgcatttca tctctacatt tgagtctata
atatgttaga gtagtgaatt 2340ccttaaaata attattcact gttagacagt
ctttgctaga aaaaaagtaa cctgaattct 2400ttagcacagg tggatgctac
aaatayctgc mcrkscrrmy kywykakymy tattattatt 2460attattattt
tttgagatag agtcttactc tgtcacccag gctggagtgc agtagcctta
2520tcttggggct cactgcaacc tccatcttct gggctcaagg gattctcatg
cctcagcttc 2580ctgagtagat gggattacag gtgcatgcca ccacactcag
ctaatttttg tatttttagt 2640agagaatggg gttcgccaat gttggccagc
tggtctcaaa ctcctggcgt catgtgatcc 2700acctatgtca gattcccaag
atgctgggat acaggcatga gccacacacc cgccccaaga 2760tgatttctaa
aaacaggcat gaatacggta taagaacagg twctgtaant caagnaattc
2820caaganggtc tcaywawatc twatkgttgt ccttctcctc cayccagaaa
tacratctgm 2880tactgtgcat acattwactg awagtggawk atyctawtat
tattgggaan gancccctat 2940caccacntga ccctaagagt attgnatttt
caccccntca tctggcgata tgacntgccc 3000gngggg 3006
15 305 DNA Homo sapiens
15 tcaaaggcat gggaaatgat agattttatg catttgaact agcaaacaga
tgtttctcat 60tttatttcca tgctttctaa cttaaataat tcatcagctt ttctttcttt
tctctgatag 120gggccacatg ccagaatggt acacaaaatc ttttggacat
ttgtgtgcag ctgaagttgt 180aaagatccgg gattccttca gtaaccagaa
cactgaaacc aaagacacca aatctaggtg 240atcactgaag caacgcaacc
gcttccacca ccatggtgtt tgtttctaga acttttgcca 300gtcct
305
16 466 DNA Homo sapiens
misc_feature (1)..(466) n represents a or g or c or t/u
16 tggattnccn tgncactgaa aaatacatcc tctttccagg tgagcgaagg
accccctctg 60aggctttgga ggtcagtcat tgaagtccct gctgtgccag aggaaatctt
aaagcgccta 120cttaaagaac agcacctctg ggatgtagac ctgttggatt
caaaagtgat cgaaattctg 180gacagccaaa ctgaaattta ccagtatgtc
caaaacagta tggcacctca tcctgctcga 240gactacgttg ttttaaggtg
agcgcttccc agttgttttt ttgtgacaag gatgactcca 300tatatgaacc
aagcctatat gtcactgatc ttacaagatg gtataattat ttaaagtaga
360ggccgggcat atggtggctc acacctgtaa tcccagcact ctgggaggcc
aaggtgggag 420gatcacttga ggccagcagt tcaagaccag cctggntaat atagca
466
17 692 DNA Homo sapiens
misc_feature (1)..(692) n represents a or g or c or t/u
17 ccngcaganc tcgaaaatat gcttcggcat gtctgccacg tcataagcag
actgtccttc 60gtagttgaca cagtctatgg caccctcatt catctggcgc agancctgaa
tccgggactt 120gacccccgat tttctgaaga gcccaacctg tcggaagagc
aacactaagt gtggggtaca 180ttcacgtgga cgcagtgttt acaccacaca
actagaagaa gctgcatgta atccgagctc 240ccctgagtac gtgaacccgc
aggcagcgct ctcacctgat ccaaacaatg gttccggggg 300tatcgcacgg
cctgctggat gctctgaggc aacggttgtc ctgtgcgctg cacgttgacc
360gtcaggggac cccaaacaca ctccggtcct tgtagtctgg aaccttgatc
ctcttcatga 420actcgggcac ggccctgtta aagagcacag agatggtggt
gtcggcggan acatgctcac 480ttgtctgtct acacttgtcc aattctgcag
gcaaaccctg tgggctccag attctgatat 540catccaatga attcnanctc
ngtaccgggg atcctctaaa atccaacttg caggcattcc 600agcttcagct
gctccaattt ctatatgttc ccctaaatcn tatttttttg aaacataagg
660ttattttttt aattgtaccn cgttcctaac na 692
18 315 DNA Homo sapiens
misc_feature (314) n represents a or g or c or t/u
18 tttcgtgtga ggggcttagc tcttgttcgg tataggaatt acgacatcgg ctcatttcct
60cgggaacctg tgcggaacat gacagacaga aaggaggtga gtccacctgt actcaatctc
120aatgcccatc agtggaaaag actgggtagg aacaatggcc tggtccttaa
agcagtgcag 180gcatcttccc gccggaggtg ggctatcatg ctgaccgcac
gtgttatcac gaggatatga 240acagatcacc tccataaatg tatctgaaat
cttatttcca tgtaaggtct ttggaaagtt 300agagtagggg gagnc
315
19 281 DNA Homo sapiens
misc_feature (1)..(281) n represents a or g or c or t/u
19 ctcnngactg tgtggatggc ctgtttaaag aagtcaaaga gaagtttaaa
ggctgggtca 60ngctactcca cttcggagca ggctgagctg tcctataaga aggtaaggct
tcaccctgtt 120gtcggctagt tgagtccagg agtcgaagct tgggtccatc
agagataaca cgcttttgcc 180aactaatctg tctggggatc tgtagcccac
aacctccctt gtagagctgg gcaccggggt 240gagtaagatc cccgtggtga
gagtggaaac cgnncaaagc a 281
20 1713 DNA Mus musculus
misc_feature (1)..(1713) n represents a or g or c or t/u
20 ttgaacgctt gggtaccggg ccccccctcg aggtcgacgg tatcgataag cttgatatcg
60aattcctgca gcccggggga tccaatccct gggtccccag acaactctcg tttgcaaagc
120gccacaagcc acgaaagcat gctgacagac ctcagcgagc accaggaggt
ggcctctgtc 180cgaagcctca gcagcaccag cagcagcgtc cccacccacg
cagcccacag tggagatgcc 240actacgcccc gaaccaattc cgtcatcagc
gtctgctcct ccggacactt tgtaggcaac 300gatgactctt tttccagcct
gccgtctccc aaggaactgt ccagcttcag ttttagcatg 360aaaggccacc
acgagaagaa caccaagtcg aagacgcgga gcctgctcaa acgcatggag
420agcctgaagc tcaagggctc ccaccacagc aagcacaagg cgccttccaa
gctggggttg 480atcatcagtg ctcccattct gcaggagggt atggatgagg
cgaagctgaa gcagctgaac 540tgtgtggaga tctcagccct caatggcaac
cacatcaacg tgcccatggt accggaaaag 600gagccgtgtc taacttcacc
cagaccagca gcaagcagca gccaatcaga gaccagcagc 660gcggtcagca
cacccagccc ggtcaccagg acccggagcc tcagcacctg taacaagcgg
720gtgggcatgt atctagaggg cttcgaccca ttcagtcagt ccaccttgaa
caacgtgacg 780gagcagaact ataaaaaccg tgagagctac ccagaggaca
cggtgttcta cattcccgaa 840gatcacaagc ccggcacctt ccctaaggcc
ctctcccatg gcagtttctg tccctcggga 900aacagttctg tgaactggag
gaccggaagc ttccatggcc ccggccatct cagcctacgg 960agagaaaaca
gccatgacag tcctaaggag ctgaagagac gcaattcttc cagctctctg
1020agcagccgcc tgagcatcta tgataacgta ccgggttcta tcctgtactc
cagctcggga 1080gaactggccg acctggagaa tgaggacatc ttccctgagc
tggatgacat tctctaccac 1140gtgaagggga tgcagcggat agtcaaccag
tggtccgaga agttttccga cgagggagac 1200tcggactcag ccctggactc
tgtctctcct tgcccgtcat cttcaaaaca gattcacctg 1260gatgtggacc
atgaccgaag gacacccagt gacctggaca gcacaggcaa ctccttcaat
1320gagcccgaag agcccactga tatccggaaa gaagagactt ccggggtggg
ggctttcctt 1380gaccagtgca ataggtaagg gaaaggcgtt gctttctcgg
atgcattcca aaaggtgggg 1440gaaattcaaa gaaaggggtc ttgctttggg
tggggattgg agttctngat anttttgcca 1500agttccttgg aaaattcctt
aggggaattg gatncccaac cngggaagaa cccccaaaca 1560aatccccnaa
cngggaaaaa ggnggttttt attnaaaacc tggggtnntt gaaacccttt
1620gggccattca aangggattn ccntacccag gtggggancc cttggaaana
aangggtggg 1680tggttttgga aacnaatttt tagtcccngg gcc
1713
21 4767 DNA Mus musculus
misc_feature (1)..(4767) n represents a or g or c or t/u
21 cataccaagt gaggtgtaat tgtttaaacc aaaaagtttg
aaggatatgg caaaagccag 60acttaaattt ccatttttcc tttttttttt tttttttaag
ggaaattctt attcaatgtg 120taagtgctca ctatcatctc tggggaggca
gagggagaaa aaaaatacct ggtaattcaa 180agccagtctg ggctacacag
caagatcgtc cctcaaaaaa gtacttttta attaaaagag 240agaaattatt
ccgaatccat agaaatagtc gttggagtat tgggaggtgg gaagcccaag
300gcccttgtcc atgtagtcac acataatggc agtggcttgg gctttcatag
aagggcacac 360gtggggacct tcccttgtgg gctttctgac tcttcactta
ctgcatatgc ctactgcaga 420gatttcctct ggactggagc actgggactt
tctttctaaa aatataaagt tcagtaatga 480ccaacaatta tgattaggct
agtaggcttt tgttcatttt taaaaattgt atgtgtgtga 540gtattttacc
tcacgcatag tgtatgtacc gtgcctctga agacggaaga aaacattagc
600ttcccctgga actggagtta cagatggttg taagccacca tgtaagtgct
gggatttgaa 660ctcaggtcca tctggaagaa cagccagtgc ggtacccact
gagccatctc tccagccccc 720tgcccagtgt tcttaaagtg ttagtctacg
gtagcagatg atttggtccc ttgaagaaat 780tctttcccct caatcttgct
agcttgactg ataacctaaa cccattgagg aagctctgat 840cacgagcaag
ctctactccg gactggaaga gtgttcagtg tgtctcaaag cacgtacttg
900tggtgttgta aaccgtgagc catgctgaga cgcctcttgt gaaatgtctt
cccgtggctt 960caggaacatt tcagaccgct gttttccttt ggagttaaaa
ctgactcctt ctaccaacac 1020gtggaaagaa ttgtgaacat cagctggtag
ttgtcatatg aaaaaacaaa acaaaacaaa 1080acaaaaaact atgttgtctg
tcactgtcat cttcagtatg tactttgtcc ccaaatcacc 1140atgacatgcc
aaagccgtgt caagcattgc agagacattc taaccttgtt gctcttacta
1200ttcagtttaa aaagaagcaa gtaattgtgg gaaggtaggg gatgcttgga
agaggacttt 1260gctatgtaga ccaaactggg ctagaactca acaatcctcc
tgcctcagcc tcccatgtgc 1320tacatgcaac aaacaaggag cttaaacatt
tttttttttt atgaatgcca ggaaaaccta 1380caggaatttg aagaattttt
gtgggagcct ctgttttctt atttcttctt ctgtcttatt 1440ttaaatgcaa
gaaggggcag acctccacct gctctccttt tatctgtgcg cctccagccc
1500tagccccaac cttgtgctgc aaagctcttg aagcttcgac attgcacctt
tggctccatc 1560tgtcttgaaa aacggaccca aggcacccaa gagataagac
ctgcacattc ctgctgggcc 1620cttgccttgg gtggcggcgg ggtcagaatg
cccaaggcca cagatggtta ctgatagcgc 1680tatctcggcc acctacttga
acgatcctac ttcaggtcct cttggctggc ttttctatat 1740tttcttttct
tttctgccat tgttaatact tgtttcacaa ccaactgtag aggttgctgt
1800ctttgggcac cagagccact gtgctttaat cctgggttct taggcaagat
tcctaagctc 1860tctaagcccc gcccccatcc cctttcgtcc cttataaaat
aaagataaat catagtatct 1920gtggcagaag gttgtcagag gactgaagac
gagccagtgc agtgtcaccc aaagacagtg 1980gcagttcacc tagttagaac
catattttaa ttcttggttg acagagcacg actgtatgta 2040tctatgtggt
agcaagtgat gtttcaatgt ttgtgtgtaa ggtgaatgag tgaattatgg
2100gggttaacat atccgatagc ttataggttt atcatcttgt ctggagtata
tgcaaattgg 2160ctattttaaa atataaaatt aatattaact atagtcaccc
tggtatgcca agtcgccctg 2220cacttgctgc ctgcctcttt gcgactccct
gtcccttccc aacttctggt gaccatcctt 2280ctgttccctt ctatgaaatg
agtttcttct ctggtcagaa ctactatctt atgtccctag 2340tacccctccg
gaaatctgag ggtcctgctc tttggagatc ctagagcatg cggatgggtg
2400aggggaaatc attgaaaaac cacagaaacc cagagaggaa gcggcacgcc
cctagtctgg 2460tgccaccagc ataaaaagtt aaagttgact tttctcaaac
caacctcctg ggtcttttgt 2520tgtttgactt aaactggcgt gtgtgaagtt
actccacctc cccaagcccc ataggcctcc 2580atgcctagta aatttggtta
taaacaccac tcagccatta aagccccaat gcagtccagt 2640ggagatttga
ttacgggttc gattaatgaa tcccagacct aagactaact taaccattgc
2700tcactcttaa agccttgaaa aaaactgggg gagtgaaaca ttacatttgg
ttgtgtcctt 2760taactgagac ccctcagcaa gggaccctac acccttctga
gcctccagtg tctctcaact 2820gttcctcctg ccctycccca ctcctccagt
gtctctcaac tgttcctcct gccctccccc 2880actcctttcc catgcaagga
gaggtttttc tgaaagagtt ggtgttctgt tttatctcag 2940tttattattc
tataaacagg cttccacata atctatagaa tcaaaggcag gcttctcagg
3000ctgcagagat actacctatc ctggtgcatc caagttgtca gagcaggacc
cgggagataa 3060agcccagcag ggtacaagat cagttccaag tggagggaat
taagcggctc ttattccatg 3120gaaaaaaaaa agcaaggttg caataattcg
ggaaagaaat aaaagactga tgggtgtgtg 3180tgtgtgtgtg tgtgtgtgtg
tgtgtgtgtg tgtgtgtgta agcttatgag gcaacaagca 3240gacgcattta
aaaaggaaga ctttggtgat gatcatctgg aagattctag
aaagaactga 3300ggcccaggga cctgtcactc acactttgca tactaggtag
cgagtagata acgggtgcta 3360ctgttgtttt ttgttttttt tttttctcct
atgactttta atgaagctga ttgattgatg 3420actgattgat cgattgattg
attgatggtt gattgatcga ctgattgatt tccattgtgc 3480taaggattga
acttgaagcc ttgtgtgtcc ttcgcaagta cctgatcact gaactactct
3540gccgtccccc tttctctaat gtggctaaac cgatatcatt ggcgatgggg
gcaactcgtt 3600caaagctgca gtttgactcc catctcagcg gggactgtgt
tctaagggcc tgtttgtgct 3660cagtgagatt tttaaaataa tcatttgtgc
agttgctgtc gatactgaaa acagtctctc 3720ctgataggac tgagtaataa
agaggcctgg aacttcgcct ctgtataata aattcaagca 3780ataaaagtca
ccttctgaca tggacatttc tgaggcccat tgtccttctt aattattact
3840tgagtgagaa gggtgcactg agcactttgc ctgcaacctt ccccagttcc
tactgctggc 3900ctgttgccct tgaagtgggc ctgccattga tgctgtagca
tgccgtctaa caagaaatag 3960aatggcactt ttgtgttaga caagcttttt
tttttttttt tgagaataga actcactagc 4020tagaccaggc tggcctccaa
ctcacagaga cctacttgcc tctgccttct gggtattaag 4080attaaagacg
tgcactacca tcctgggact ccattacccg ctatgtaatt gaagtgtagc
4140atacctgccg aaactagaaa tgagttccga gaagctcata ttgtatgggt
cagttgttca 4200gtttgattgc ccattcgtgg ttcctttctc tgctcacggc
ctttctctgc tctgcaggcg 4260cttaaatact ctaaacaagt gtgcagtcat
gaagctggag attagtcctc accggaagcg 4320agtgagtacc aaaattacat
gggggggggg ggcagggaca gcaggcacac taaccaagac 4380aggacttgta
tctacactct gtaaaaggcc ctgtttgtcc attcctcaac atgttaaaac
4440ccctatttgg agacagtagt ggatggtggc atctactgct ctggacttga
agaaatctgt 4500tacttttccc agtgaactcc atggctacca tgtgattcaa
agcatgaagc ctattgaatc 4560tccagaggaa tttcacattg ctccctagag
gaaataaagc taacattctg taggacctct 4620tcctgtttcc tggatggaac
agtagctcca tctcgaagct gtcaagatga aaggggaagg 4680ctggcttggg
ggatactgta ggagatgtgg atcgtggggg gtggggagga agacgccgga
4740gcaggaaatc ccatacactc tgtggna 4767
22 1072 DNA Mus musculus
misc_feature (6) n represents a or g or c or t/u
22 ttgaanccca
agctggagct ccccgcggtg gcggccgctc tagaactagt ggatccagat 60acagttcttg
tctttaaact ctgactatgg acaggaatta tatcctgccc acgacccatc
120cagcctgact gtccacatct tacactctac actcaaggct gaggattcta
gattatgaag 180agttagacat ctaatacatt tctattttaa aaatatagtt
gctctgtggt ggggcatggt 240ggcacatggc tttaatccta gcacccagag
gaggtagagg caggtgaatc tctgagttca 300aggccagcct ggtatatata
gcactgactg ctctcccaga ggtcctgagt tcaatttcca 360gcaaccatat
ggtggctcac aaccatctgg aatggaatcc gatgccctct tctggtgtgt
420ctgaagacag ctacagtgta ctcatacata caataaataa ttcttaaaaa
aaaaaacaaa 480aacaaaaaca aaaactcaaa cacaaacaaa cagtatatat
gtaagatatt atagctaacc 540acttaagttt attattctct gagcattttt
gccagaaagg tctgcttcta aataaacaac 600aaagcaaaaa caccccaaag
tccaaacaaa aaccccaaac tttttagcac aggtagattt 660ctcaggttat
gctcaaaacc ttcattcaaa actgaccgac agcgtgatgg agtgtgggct
720cagcatgaac aagggcctga acgcatctca ggcaaccacg tgatggctga
aaacccaacc 780aaccagtcct gcagttaact ccctgaggct ccaggagttt
gagcagcatg gagaacatag 840cctggaggat gtggagacca cctgcttaaa
ggttgatgga ctggtgacat tgacagagga 900cagaacggtc ctaagctgag
tgctggggac aacctcaggg agcatgatgg catcccccca 960gggccattgc
tcactgctca ctacgagctg gctctcttac cagctgaagc cgtgcttgtt
1020ggaggcgtgt cttttccagc agggccgtca atttcaggag ccagtttttc tt
1072
23 1104 DNA Mus musculus
misc_feature (1)..(1104) n represents a or g or c or t/u
23 ttggamracy sggtaccggg ccccccctcg cggtcgacgg
tatcgataag cttgatatcg 60aattcctgca gcccggggga tcctgctttg ggaaaaagac
gtcaaactct tcaaggcggg 120accaccgctt gctgtccctc tggaaagtcc
acttgccgct tatggcgcaa ggctcatctt 180catccgaatc ctcactctga
aaacacagaa tgaagccatt tatgtactgg gccaagcagg 240gggcagaagg
cagaacacag gttaagggcc aggccacagc ccaaaggata ttcccagtgt
300ccattgctca gttctcttat gtaacaaaga tggatttaaa gacattatta
ttgggctgga 360gggatggctc agccgttaag aacactgacc gcttttccag
aggtcctgag ttcaaatccc 420agcaaccaca aggtggctca acaaccatct
gtaatgagat ctgatgccct cttccggtgt 480gtctgaagac agccatagtg
tacttatata taatataaat aaatccttaa aaaaagagac 540attattatta
ctttatttta tttagagaat gtatttgcat gtatgtgtat atgtatggat
600gtatatgaat gttcacaccg tgttcagacg actaccagtc agtgtgagtt
ttctccttca 660gtcatataga actgggtcgt caggcttggc aacaggccga
ctgtcattta accagcccag 720atgtaaagac tttaacagaa gtctgaccaa
gtgttgccag ctaaacaagt cattttattg 780aaaccctggc tcgttgggcc
attcactaat cgctcacaaa ggggacctct gagatgggcc 840gaaaattcaa
gcatgcaaaa tattctgaac tggaatcaga gtcaacagtc gtgggactcc
900ctctggattg cctccagttt aactgcgtgt tgacagagtg tgtttatata
ctcgtgtgca 960attaaaaaaa aaaaaaagct attttcaaac agcagaatgg
cagctgagga ctctaggtcc 1020aaagagaaaa gacanggnat ttcttttaaa
agaactgaag accatttaan cgagccatct 1080gtggcagaaa aggnaaaata gant
1104
24 725 DNA Mus musculus
misc_feature (1)..(725) n represents a or g or c or t/u
24 aannccctga tatcccggaa agaagagact ccggggtggg
ggcttccctg accaggtgca 60ataggtaagg aagggcgttg cttctcgatg catccagagg
tggggaatca agaagggtct 120gcttggtggg attgagtctg atatttgcag
tcctgcaaat tcctagggac tgcatccaac 180caggagaccc caacaatccc
aacgggaaag gagtattata aactgggtat gaacctttgg 240tcatcaagga
tgcagacagt ggaccctgga agatggtggt gtttgaacaa tatagtcagg
300ccttatccac cgtggggtgt acttagacgt gcttaaagtg cttgcatctt
gattctcctg 360cagttccaaa tcttcggttt cagccaggca cagatgagaa
ctactcaggg gagaaactgt 420cttctccgtc attataccct gggtaataga
gtgtgaccgt gaactactag caggttgtta 480tagcaatctg gcttataaac
ttacattaaa tggggagggt gctcccgatg tgcgtagaca 540ctatccatct
tctataagag gcctgagtgt actgagtcca catatctgct atgtctggaa
600ccaaccttca ggggttacaa agacagtggg ggtggggggg aggcagggaa
aggaagatcg 660atgctcttgg ttcctgatga tcagaagatt ggtcccagct
tactcctttc cgcctgttct 720ttttg 72525528DNAMus
musculusmisc_feature(1)..(528)n represents a or g or c or t/u
25agacgnggtc ccactgactg tgaacgtgca gcgctcagga cagcccctgc cccagagcat
60ccagcaggcc atgcgctacc tccgtaacca ctgtctggac caggtgagta cagctgcctg
120tggatcccac tcgtgggagc ggagctttgg gctgcatgtt tttttttcta
gtttcgtggg 180gaagggtcct gcttccacac ccatccctgc tgttctcctt
ccaaaaggtc gggctcttca 240ggaagtcagg tgtcaaatcc cggatccagg
ctctacgcca gatgaatgaa agcgctgaag 300ataatgtcaa ctatgaaggc
cagtctgctt atgatgtggc agacatgtta aagcaatatt 360ttcgagatct
tcctgagccc ctcatgacga acaaactntn cgaaaccttn ctgcagatct
420accagtgtaa gcgttctttg gtcttcttaa gnaactgatg tcgggttcat
gggaccaact 480gagcacacaa gcctttttna tgccatcctt ttgaaanaaa aacttnat
528
26 393 DNA Mus musculus
misc_feature (1)..(393) n represents a or g or c or t/u
26 aacanaanat tccggatttc ctcagggacc tggaaanaat tcttgcattc
agcaatcatg 60tgggccagcc cttggantcg ccgctaggtt ttcattcagg tctttctggt
ctggtttgcc 120caaactctgt tttctttgca ttacccttgg anaaaaattc
tctcncttca gggtgttgag 180gtggaanang gacggagcta ggcacacagc
caggttggtg ggagtcatct ggttttcttt 240cacagccgct gtgacatcgc
tcaggaaata aaaaantgtc tgcanaacct cccggttctc 300ntcgggcagg
ancataatgg ccgccttgat ggcttggang cgctggtcct tgggcacata
360ctggtatatc tgcanggaan gttcggaaaa ttt 393
27 601 DNA Mus musculus
misc_feature (1)..(601) n represents a or g or c or t/u
27 ccaagctgaa ttccgggcgc cttgtgcttg ctgtggtggg agcccttgaa gcttcaggct
60ctccatgcgt ttgagcaggc tccgcgtctt cgacttggtg ttcttctcgt ggtggccttt
120catgctaaaa ctgaagctgg acagttcctt gggagacggc aggctggaaa
aagagtcatc 180gttgcctaca aagtgtccgg aggagcagac gctgatgacg
gaattggttc ggggcgtant 240ggcatctcca ctgtgggctg cgtgggtggg
gacgctgctg ctggtgctgc tgangcttcg 300gacagangcc acctcctggt
gctcgctgag gtctgtcagc atgctttcgt ggcttgtggc 360gctttgcaaa
cgaaanttgt ctggggaccc agggattgga tcctgctttg ggaaaaagac
420tcaaactctt caaggcggga ccaccgcttg ctgtccctct ggaaagtcca
cttgccgctt 480atggcgcaag gctcatcttc atccgaatcc tcacttcgct
tccggtgaag actaatctcc 540acttcntgaa tgcacacttg tttanaatat
ttaacncctg canaaaacct ccatggcgtc 600t 60128260DNAMus
musculusmisc_feature(1)..(260)n represents a or g or c or t/u
28ggcttangga agtgccgggc ttgtgatctt cgggaatgta gaacaccgtg tcctctgggt
60agctctcacg gtttttatag ttctgctccg tcncnttgtt caaggtggac tgactgaatg
120ggtccaancc ctctaaatac atgcccaccc gcttgttaca ggtgctgagg
ctccgggtcc 180tggtgaccgg gctgggtgtg ctgaccgcnc tgctggtctc
tgattggctg ctgctgctgc 240tggtctgggt ggaattagac 26029358DNAMus
musculusmisc_feature(1)..(358)n represents a or g or c or t/u
29ctgattccgg gttgacatta tcttcagcgc tttcattcat ctggcgtaga gcctggatcc
60gggatttgac acctgacttc ctgaagancc cgacctggtc cagacagtgg ttacggaggt
120agcgcatggc ctgctggatg ctctggggca ggggctgtcc tgagcgctgc
acgttcacag 180tcagtgggac cccaaacaca ctccggtcct tgtagtctgg
aaccttgatc cttttcatga 240acttgggcac agcccagctg aanccgtgct
tgttggangg cgtgtncttt tccagcaggg 300ccgtcaattt caggagcgag
tntttctgca gcaggttcat ctgggccaca gactggca 35830154DNAMus
musculusmisc_feature(1)..(154)n represents a or g or c or t/u
30aattccgggc gatgtcacag cggctgtgaa agaaaaccag atgactccca ccaacctggc
60tgtgtgccta gctccgtccc tcttccacct caacaccctg aancnataga attcttctcc
aagggtaatg canatgaaaa cagagtttgg gcaa 154
31 294 DNA Mus musculus
misc_feature (1)..(294) n represents a or g or c or t/u
31 aagctggaat ccggtgcgct ccagccttga gccatggctg tgcgtcctcg ctgttggagc
60cacggctccc cagctccgtg ccccgctccc tgagagtgct cccttcgcgg tggcaatcta
120aaacccacga ttttgcccga gctggggcga agcgtaagga agctgcgaac
cangatgtgc 180tgacgaccgc gaggggctcg cgtcccggct gccaccgtgg
gtcccgacgt gggatcccga 240tnacttctgg cngcctcgac tttcccagtg
cgctcccgtc gncctgcgcc gacc 294
* * * * *
References