Dlc-1 Gene Deleted In Cancers YUAN; BAO-ZHU ; et al. [The Government of the United States of America as Represented by the Secretary of the]

Patent Application Summary

U.S. patent application number 12/239581 was filed with the patent office on 2008-09-26 and published on 2009-09-24 for DLC-1 Gene Deleted in Cancers. This patent application is currently assigned to The Government of the United States of America as Represented by the Secretary of the Department of Health and Human Services. Invention is credited to NICOLAS POPESCU, SNORRI S. THORGEIRSSON, BAO-ZHU YUAN.

Application Number	20090239220 12/239581
Family ID	34576144
Filed Date	2008-09-26

United States Patent Application	20090239220
Kind Code	A1
YUAN; BAO-ZHU ; et al.	September 24, 2009

DLC-1 GENE DELETED IN CANCERS

Abstract

A cDNA molecule corresponding to a newly discovered human gene is disclosed. The new gene, which is frequently deleted in liver cancer cells and cell lines, is called the DLC-1 gene. Because the gene is frequently deleted in liver cancer cells, but present in normal cells, it is thought to act as a tumor suppressor. This gene is also frequently deleted in breast and colon cancers, and its expression is decreased or undetectable in many prostate and colon cancers. Also disclosed is the amino acid sequence of the protein encoded by the DLC-1 gene. Methods of using these biological materials in the diagnosis and treatment of hepatocellular cancer, breast cancer, colon cancer, prostate cancer, and adenocarcinomas are presented.

Inventors:	YUAN; BAO-ZHU; (COLUMBIA, MD) ; THORGEIRSSON; SNORRI S.; (BETHESDA, MD) ; POPESCU; NICOLAS; (BETHESDA, MD)
Correspondence Address:	KLARQUIST SPARKMAN, LLP 121 S.W. SALMON STREET, SUITE #1600 PORTLAND OR 97204-2988 US
Assignee:	The Government of the United States of America as Represented by the Secretary of the Department of Health and Human Services
Family ID:	34576144
Appl. No.:	12/239581
Filed:	September 26, 2008

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
10995914	Nov 22, 2004	7534565
12239581
09644947	Aug 23, 2000	6897018
10995914
60075952	Feb 25, 1998

Current U.S. Class:	435/6.14 ; 435/320.1; 435/7.1; 530/350; 530/387.9; 536/22.1; 536/23.5
Current CPC Class:	A61K 48/00 20130101; C07K 14/4703 20130101
Class at Publication:	435/6 ; 536/22.1; 435/320.1; 530/350; 530/387.9; 536/23.5; 435/7.1
International Class:	C12Q 1/68 20060101 C12Q001/68; C07H 21/00 20060101 C07H021/00; C12N 15/63 20060101 C12N015/63; C07K 14/00 20060101 C07K014/00; C07K 16/00 20060101 C07K016/00; C07H 21/04 20060101 C07H021/04; G01N 33/53 20060101 G01N033/53

Foreign Application Data

Date	Code	Application Number
Feb 25, 1999	US	PCT/US99/04164

Claims

1. An isolated nucleic acid molecule selected from the group consisting of: (a) nucleic acid molecules encoding an amino acid sequence as shown in Seq. I.D. No. 2; (b) nucleic acid molecules which hybridize under stringent conditions to a nucleic acid molecule according to (a), and which encode a protein having an activity of the DLC-1 protein; (c) nucleic acid molecules comprising a nucleotide sequence as shown in Seq. I.D. No. 1.

2. An isolated nucleic acid molecule according to claim 1(b) wherein the DNA molecule hybridizes to a DNA molecule according to (a) under conditions wherein DNA molecules with more than 25% mismatch will not hybridize to each other.

3. A recombinant vector including a nucleic acid molecule according to claim 1.

4. An isolated nucleic acid molecule comprising at least 15 consecutive nucleotides of a DNA sequence shown in Seq. I.D. No. 1.

5. An isolated nucleic acid molecule according to claim 4 wherein the molecule comprises at least 25 consecutive nucleotides of the DNA sequence shown in Seq. I.D. No. 1.

6. A recombinant DNA vector including a DNA molecule according to claim 4.

7. The purified protein of claim 15, wherein the amino acid sequence comprises the sequence specified in 15(a).

8. An antibody capable of specifically binding to a protein according to claim 7.

9. The antibody of claim 8 wherein the antibody is a monoclonal antibody.

10. An isolated cDNA molecule encoding a human DLC-1 protein.

11. A cDNA according to claim 10 wherein the cDNA comprises the nucleic acid sequence depicted as bases 325-3657 of Seq. I.D. No. 1.

12. A cDNA according to claim 10 wherein the human DLC-1 protein comprises an amino acid sequence as depicted in Seq. I.D. No. 2.

13. A method of detecting a deletion of a DLC-1 gene in a cell, the method comprising (a) incubating an oligonucleotide according to claim 15 with nucleic acid of the cell under conditions such that the oligonucleotide will specifically hybridize to a DLC-1 gene present in the nucleic acid to form an oligonucleotide:DLC-1 gene complex; (b) detecting the presence or absence of oligonucleotide:DLC-1 complexes, wherein the absence of said complexes indicates deletion of the DLC-1 gene.

14. A method of detecting the presence of DLC-1 protein in a cell, the method comprising (a) incubating an antibody according to claim 10 with proteins of the cell under conditions such that the antibody will specifically bind to a DLC-1 protein present in the cell to form an antibody:DLC-1 protein complex; and (b) detecting the presence of antibody:DLC-1 protein complexes.

15. A purified protein having DLC-1 protein biological activity, and comprising an amino acid sequence selected from the group of: (a) the amino acid sequence shown in Seq. I.D. No.2; (b) amino acid sequences that differ from those specified in (a) by one or more conservative amino acid substitutions; and (c) amino acid sequences having at least 90% sequence identity to the sequence specified in (a).

16. The purified protein of claim 15(c), wherein the amino acid sequences have at least 95% sequence identity to the sequence specified in 15(a).

17. The purified protein of claim 15(c), wherein the amino acid sequences have at least 98% sequence identity to the sequence specified in 15(a).

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] This application is a continuation of U.S. application Ser. No. 10/995,914, filed Nov. 22, 2004, which is a continuation of application Ser. No. 09/644,947, filed Aug. 23, 2000, which claims priority under 35 U.S.C. § 120 from International Application No. PCT/US99/04164, filed Feb. 25, 1999, and under 35 U.S.C. § 119 from U.S. Provisional Application No. 60/075,952, filed Feb. 25, 1998. The prior applications are incorporated herein by reference in their entirety.

FIELD OF THE INVENTION

[0002] The present invention relates to the cloning and sequencing of the human cDNA molecule corresponding to a newly discovered gene, called DLC-1, which is frequently deleted in liver, breast and colon cancer cells. In addition, lower DLC-1 expression is frequently observed in liver, colon, and prostate cancer cells, compared to normal tissue. The present invention also relates to methods for screening and diagnosis of a genetic predisposition to liver cancer and other cancer types, and methods of gene therapy utilizing recombinant DNA technologies.

BACKGROUND OF THE INVENTION

[0003] The isolation of genes involved in human cancer development is critical for uncovering the molecular basis of cancer. One theory of cancer development holds that there are tumor suppressor genes in all normal cells which, when they become non-functional due to mutations, cause neoplastic development (Knudsen et al., Cancer Res. 45:1482, 1985). Evidence to support this theory has been found in the cases of human retinoblastoma and colorectal tumors (see U.S. Pat. No. 5,330,892 and references cited therein), as well as in connection with breast and ovarian cancers (see U.S. Pat. No. 5,693,473 and references cited therein).

[0004] More particularly, recurrent deletions on the short arm of human chromosome 8 in cases of liver, breast, lung and prostate cancers have raised the possibility of the presence of tumor suppressor genes in that location. For example, loss on the short arm of chromosome 8 in prostate cancer (PC) cells was described in Brothman (Cancer Genet. Cytogenet. 95:116-21, 1997). Similar deletions on the short arm of chromosome 8 also have been detected in primary hepatocellular cancer (HCC), non-small cell lung carcinoma (NSCLC) and node-negative breast carcinomas (Isola, Am. J. Pathol. 147:905-11, 1995; and Marchio, et al., Genes Chromo. Canc. 18:59-65, 1997).

[0005] While recurrent chromosome 8 deletions in malignant tumors support the relevance of this lesion in carcinogenesis, scientists previously have been unable to identify the tumor suppressor genes involved in such deletions. This lack of knowledge concerning the molecular genetic basis of HCC, and other cancers associated with chromosome 8 deletions, has hampered efforts to diagnose the predisposition to such diseases and to develop more effective treatments aimed at curing genetic deficiencies.

[0006] Therefore, it is an object of the present invention to provide a human cDNA molecule corresponding to a previously unknown gene located on the short arm of chromosome 8, the deletion of which appears to be closely associated with the development of HCC and other cancers. The cloning and sequencing of such a cDNA molecule enables new and improved methods of diagnosis and treatment of such diseases.

SUMMARY OF THE INVENTION

[0007] The present invention discloses the discovery of a new human gene involved in the pathogenesis of hepatocellular cancer (HCC), the most common primary liver cancer, and one of the most common cancers in the world, with 251,000 new cases reported each year (Simonetti et al., Dig. Dis. Sci. 36:962-72, 1991; Harris et al., Cancer Cells 2:146-8, 1990; Marchio, et al., Genes Chromo. Cancer 18:59-65, 1997). More specifically, the present invention discloses the isolation of the full length cDNA and the chromosomal localization of a new gene which is frequently deleted in liver cancer, and hence is named the DLC-1 gene.

[0008] The full-length cDNA for DLC-1 is 3850 bp long (Seq. I.D. No. 1), encodes a protein of 1091 amino acids (Seq. I.D. No. 2), and was localized by fluorescence in situ hybridization to chromosome 8 at bands p21.3-22. Because the DLC-1 gene is deleted from a significant percentage of primary HCC tumor cells and cell lines, primary breast cancers (BC), and colorectal cancer (CRC) cell lines, and its expression is decreased or not observed in a significant percentage of HCC cell lines, CRC cell lines and prostate cancer (PC) cell lines, the DLC-1 gene appears to operate as a tumor suppressor in liver cancer and other cancers including PC, CRC and BC.

[0009] The object of identifying the hitherto unknown DLC-1 gene has been achieved by providing an isolated human cDNA molecule which is able specifically to correct the cellular defects characteristic of cells from patients with a deleted or mutated DLC-1 gene. Specifically, the invention provides, for the first time, an isolated cDNA molecule which, when transfected into cells derived from a patient with a deleted or mutated DLC-1 gene, can produce the DLC-1 protein believed to be active in suppressing HCC pathogenesis and other cancers, such as breast, colorectal, and prostate cancers. The invention encompasses the DLC-1 cDNA molecule (derived from normal human liver cells), the nucleotide sequence of this cDNA, and the putative amino acid sequence of the DLC-1 protein encoded by this cDNA.

[0010] Having herein provided the nucleotide sequence of the DLC-1 cDNA, correspondingly provided are the complementary DNA strands of the cDNA molecule and DNA molecules which hybridize under stringent conditions to the DLC-1 cDNA molecule or its complementary strand. Such hybridizing molecules include DNA molecules differing only by minor sequence changes, including nucleotide substitutions, deletions and additions. Also comprehended by this invention are isolated oligonucleotides comprising at least a segment of the cDNA molecule or its complementary strand, such as oligonucleotides which may be employed as DNA hybridization probes or as primers useful in the polymerase chain reaction. Such probes and primers are particularly useful in the screening and diagnosis of persons genetically predisposed to HCC, and other cancers, as the result of DLC-1 gene deletions.

[0011] Hybridizing DNA molecules and variants on the DLC-1 cDNA may readily be created by standard molecular biology techniques. Through the manipulation of the nucleotide sequence of the human cDNA provided by this invention by standard molecular biology techniques, variants of the DLC-1 protein may be made which differ in precise amino acid sequence from the disclosed protein yet which maintain the essential characteristics of the DLC-1 protein or which are selected to differ in one or more characteristics from this protein. Such variants are another aspect of the present invention.

[0012] Also provided by the present invention are recombinant DNA vectors comprising the disclosed DNA molecules, and transgenic host cells containing such recombinant vectors.

[0013] Having isolated the human DLC-1 cDNA sequence, the genomic sequence for the gene was determined according to the following method: A human genomic library constructed using the P1 vector, pAD10SacBII, was transferred from its original E. coli host into a second E. coli host, strain N3516, following procedures well-known in the art. A positive P1 clone containing the DLC-1 gene was then obtained by performing a protocol of PCR-based P1 library screening (Sheperd, Proc. Natl. Acad. Sci. USA 91:2629-33, 1994; Neuhausen, Hum. Mol. Genet. 3:1919-26, 1994). The PCR primers used in this screening, designed from a genomic fragment isolated through Representational Difference Analysis (described more fully below), are listed below:

Primer	Sequence	Sequence Identifier
PL7-3F	5' GACACCACCATCTCTGTGCTC 3'	(Seq. I.D. No. 7)
PL7-3R	5' GCAGACTGTCCTTCGTAGTTG 3'	(Seq. I.D. No. 8)
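As a rough illustration of how the two screening primers listed above might be checked before use, the following Python sketch reports the length, GC content, an approximate melting temperature, and the reverse complement of each primer. The primer sequences are taken from the text; the helper names and the choice of the Wallace rule (2 degrees C per A/T, 4 degrees C per G/C), which is only a coarse approximation for short oligonucleotides, are assumptions made for illustration and are not part of the disclosed screening protocol.

```python
# Illustrative primer checks. Helper names are hypothetical; only the two
# primer sequences come from the text (Seq. I.D. Nos. 7 and 8).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

PRIMERS = {
    "PL7-3F": "GACACCACCATCTCTGTGCTC",  # Seq. I.D. No. 7
    "PL7-3R": "GCAGACTGTCCTTCGTAGTTG",  # Seq. I.D. No. 8
}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_count(seq: str) -> int:
    """Count G and C bases."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


def wallace_tm(seq: str) -> int:
    """Approximate melting temperature: 2 per A/T plus 4 per G/C (Wallace rule)."""
    gc = gc_count(seq)
    at = len(seq) - gc
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        gc_percent = 100.0 * gc_count(seq) / len(seq)
        print(name, len(seq), f"{gc_percent:.1f}% GC",
              f"Tm~{wallace_tm(seq)}", reverse_complement(seq))
```

More accurate melting temperatures would normally come from nearest-neighbor thermodynamic models, which this sketch does not attempt.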

An isolated and purified biological sample of this genomic DLC-1 gene was deposited with the American Type Culture Collection (ATCC) in Manassas, Va., on Feb. 25, 1998, under accession number 98676. The present invention also provides for the use of the DLC-1 cDNA, the corresponding genomic gene and of the DLC-1 protein, and derivatives thereof, in aspects of diagnosis and treatment of HCC, and other cancers including, but not limited to PC, BC and CRC, resulting from DLC-1 deletion or mutation.

[0014] An embodiment of the present invention is a method for screening a subject to determine if the subject carries a mutant DLC-1 gene, or if the gene has been partially or completely deleted, as is thought to occur in many HCC cases. The method comprises the steps of: providing a biological sample obtained from the subject, which sample includes DNA or RNA, and providing an assay for detecting in the biological sample the presence of a mutant DLC-1 gene, a mutant DLC-1 RNA, or the absence, through deletion, of the DLC-1 gene and corresponding RNA.

[0015] The foregoing assay may be assembled in the form of a diagnostic kit and preferably comprises either: hybridization with oligonucleotides; PCR amplification of the DLC-1 gene or a part thereof using oligonucleotide primers; RT-PCR amplification of the DLC-1 RNA or a part thereof using oligonucleotide primers; or direct sequencing of the DLC-1 gene of the subject's genome using oligonucleotide primers. The efficiency of these molecular genetic methods should permit a rapid classification of patients affected by deletions or mutations of the DLC-1 gene.

[0016] A further aspect of the present invention is a method for screening a subject to assay for the presence of a mutant or deleted DLC-1 gene, comprising the steps of: providing a biological sample of the subject which sample contains cellular proteins, and providing an immunoassay for quantitating the level of DLC-1 protein in the biological sample. Diagnostic methods for the detection of mutant or deleted DLC-1 genes made possible by this invention will provide an enhanced ability to diagnose susceptibility to HCC and other cancers such as PC, BC and CRC.

[0017] Another aspect of the present invention is an antibody preparation comprising antibodies that specifically detect the DLC-1 protein, wherein the antibodies are selected from the group consisting of monoclonal antibodies and polyclonal antibodies.

[0018] Those skilled in the art will appreciate the utility of this invention is not limited to the specific experimental modes and materials described herein.

[0019] The foregoing and other features and advantages of the invention will become more apparent from the following detailed description and accompanying drawings.

BRIEF DESCRIPTION OF DRAWINGS

[0020] FIG. 1 is a digital image of a Southern blot which compares primary HCC tumor cells (T) with healthy normal liver cells (N), and demonstrates a genomic deletion of the L7-3 clone in the HCC cells. Primary tumors 94-25T, 95-03T and 95-06T showed a 50% decrease of DNA intensity as compared with normal liver tissues.

[0021] FIG. 2 is a digital image of a Southern blot which compares representative HCC cell lines with healthy liver cells (NL-DNA), and demonstrates a genomic deletion of the L7-3 clone in 9 of 11 HCC cell lines. Cell lines Sk-Hep-1, PLC/PRF/5, WRL, Focus, HLF, Hep3B, Huh-7, Huh-6, Chang showed reduction of DNA intensity compared with human normal liver genomic DNA.

[0022] FIG. 3 is a digital image of a Southern blot which compares representative primary human breast cancers (T) with healthy normal blood cells (N) from the same patient, and demonstrates a genomic deletion of the DLC-1 gene in 7 of 15 primary breast cancers. A representative 10 of the 15 primary tumors are shown. DNA was digested with either (a) BglII or (b) BamHI. Cell lines IC11T, IC12T, IC13T, IC2T, IC6T, and IC7T showed reduction of DNA intensity compared with normal DNA.

[0023] FIG. 4 is a digital image of a Southern blot which compares representative human colon cancer cell lines with normal DNA (lane 1), and demonstrates a genomic deletion of the DLC-1 gene in two out of five colon cancer cell lines. Cell lines SW1116 and SW403 (lanes 5 and 6) showed reduction of DNA intensity compared with normal DNA (lane 1).

[0024] FIG. 5 is a digital image of a Northern blot showing the mRNA expression of the DLC-1 gene in normal human tissues. The DLC-1 gene is expressed in all normal tissues tested as a 7.5 kb major transcript and a 4.5 kb minor transcript.

[0025] FIG. 6 is a digital image of a Northern blot comparing the mRNA expression of DLC-1 gene in normal human tissues (NL-RNA) and HCC cell lines. DLC-1 mRNA expression was decreased or not detected in the WRL, 7703, Chang and Focus HCC cell lines.

[0026] FIG. 7 is a digital image of a Northern blot comparing the mRNA expression of DLC-1 gene in normal human tissues (CDD33C0) and human colon cancer cell lines. DLC-1 mRNA expression was decreased or not detected in HCT-15, LS147T, DLD-1, HD29, SW1116, T84, SW1417, SW403, SW948, LS180, and SW48 cell lines.

[0027] FIG. 8 is a digital image of a Northern blot showing the mRNA expression of DLC-1 gene in three human prostate cancer cell lines. DLC-1 mRNA was not detected in the LN-Cap and SP3504 cell lines.

[0028] FIG. 9 is a schematic drawing of the human DLC-1 gene. Exons 1-14 are represented by boxes, with introns represented by the lines connecting the boxes.

[0029] FIG. 10 is a schematic drawing of how the mouse DLC-1 gene was targeted using homologous recombination. The resulting construct can be used to generate DLC-1 homozygous knock-out mice.

SEQUENCE LISTING

[0030] The nucleic and amino acid sequences listed in the accompanying sequence listing are shown using standard letter abbreviations for nucleotide bases, and three letter code for amino acids. Only one strand of each nucleic acid sequence is shown, but the complementary strand is understood as included by any reference to the displayed strand.

[0031] Seq. I.D. No. 1 is the nucleotide sequence of the human DLC-1 cDNA.

[0032] Seq. I.D. No. 2 is the amino acid sequence of the human DLC-1 protein.

[0033] Seq. I.D. Nos. 3-4 are oligonucleotide sequences of PCR primers which can be used to amplify the entire DLC-1 cDNA molecule.

[0034] Seq. I.D. Nos. 5-6 are oligonucleotide sequences of PCR primers which can be used to amplify the open reading frame of the DLC-1 cDNA molecule.

[0035] Seq. I.D. Nos. 7-8 are the oligonucleotide sequences of PCR primers used to screen a human genomic library.

[0036] Seq. I.D. Nos. 9-11 are the oligonucleotide sequences of the primers used for 5' and 3' RACE.

[0037] Seq. I.D. No. 10 is the nucleotide sequence for the DLC-1 cDNA probe.

[0038] Seq. I.D. No. 11 is the nucleotide sequence for the L7-3 probe.

[0039] Seq. I.D. No. 12 is the nucleotide sequence for the P-35 probe.

[0040] Seq. I.D. No. 13 is the nucleotide sequence for ______ of part of the human genomic DLC-1 sequence.

[0041] Seq. I.D. No. 14 is the nucleotide sequence for ______ of the human genomic DLC-1 sequence.

[0042] Seq. I.D. No. 15 is the nucleotide sequence for ______ of part of the human genomic DLC-1 sequence.

[0043] Seq. I.D. No. 16 is the nucleotide sequence for ______ of part of the human genomic DLC-1 sequence.

[0044] Seq. I.D. No. 17 is the nucleotide sequence for ______ of part of the human genomic DLC-1 sequence.

[0045] Seq. I.D. No. 18 is the nucleotide sequence for ______ of part of the human genomic DLC-1 sequence.

[0046] Seq. I.D. No. 19 is the nucleotide sequence for ______ of part of the human genomic DLC-1 sequence.

[0047] Seq. I.D. No. 20 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0048] Seq. I.D. No. 21 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0049] Seq. I.D. No. 22 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0050] Seq. I.D. No. 23 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0051] Seq. I.D. No. 24 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0052] Seq. I.D. No. 25 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0053] Seq. I.D. No. 26 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0054] Seq. I.D. No. 27 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0055] Seq. I.D. No. 28 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0056] Seq. I.D. No. 29 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0057] Seq. I.D. No. 30 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

[0058] Seq. I.D. No. 31 is the nucleotide sequence for ______ of part of the mouse genomic DLC-1 sequence.

DETAILED DESCRIPTION OF THE INVENTION

[0059] The present invention discloses the isolation of the full length cDNA and the chromosomal localization of a new gene, called the DLC-1 gene. As discussed in Examples 1-3 below, deletion of the DLC-1 gene has been detected in about half of the primary HCC tumor cells and in a majority of the HCC cell lines which were studied. In addition, studies of other cancers revealed that DLC-1 is also deleted in 7 of 15 primary breast cancers and in 2 of 5 CRC cell lines. Moreover, the DLC-1 gene was not expressed in 29% of HCC cell lines, 64% of CRC cell lines and 67% of PC cell lines. These frequent deletions suggest that the DLC-1 gene is a tumor suppressor gene for HCC as well as PC, BC and CRC.
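The frequencies quoted in this paragraph can be related to the raw counts given in the text and figure descriptions with simple arithmetic. The sketch below, in Python, converts the stated counts (9 of 11 HCC cell lines, 7 of 15 primary breast cancers, 2 of 5 CRC cell lines, and 2 of 3 PC cell lines lacking detectable DLC-1 mRNA) into percentages. The dictionary labels and the function name are illustrative assumptions. Denominators for the 29% and 64% expression figures are not stated in this section and are therefore not reconstructed here.

```python
# Counts as stated in the text and figure descriptions; labels are illustrative.
OBSERVATIONS = {
    "HCC cell lines, L7-3 deletion (FIG. 2)": (9, 11),
    "primary breast cancers, DLC-1 deletion (FIG. 3)": (7, 15),
    "CRC cell lines, DLC-1 deletion (FIG. 4)": (2, 5),
    "PC cell lines, DLC-1 mRNA not detected (FIG. 8)": (2, 3),
}


def percent(affected: int, total: int) -> float:
    """Return affected/total as a percentage rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100 * affected / total, 1)


if __name__ == "__main__":
    for label, (affected, total) in OBSERVATIONS.items():
        print(f"{label}: {affected}/{total} = {percent(affected, total)}%")
```

The 2 of 3 prostate cancer cell lines correspond to the 67% figure quoted in the paragraph.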

[0060] The full-length cDNA for DLC-1 is 3850 bp long (Seq. I.D. No. 1) and encodes a protein of 1091 amino acids (Seq. I.D. No. 2). Fluorescent in situ hybridization has generally localized the gene on the short arm of chromosome 8 at bands p21.3-22.

[0061] Further evidence that the DLC-1 gene acts as a tumor suppressor is found in its 86% homology with the rat p122 RhoGAP gene (Homma and Emori, EMBO. J. 14:286-91, 1995). The rat p122 RhoGAP gene encodes a GTPase activating protein that catalyzes the conversion of the active GTP-bound Rho complex to an inactive GDP-bound one. The Rho family proteins, a subfamily of the Ras small GTP binding superfamily, function as important regulators in the organization of actin cytoskeleton (Nobes, et al., Cell 81:53-62, 1995). Rho proteins are also involved in Ras-mediated oncogenic transformation (Khosravi-Far, et al., Adv. Cancer Res. 69:59-105, 1997). GAP genes may function as tumor suppressors by down-regulating oncogenic Rho proteins (Quilliam, et al. Bioessays 17:395-404, 1995; Wang, et al., Cancer Res. 57:2478-84, 1997). Based on its substantial homology with the rat p122 RhoGAP gene, it appears likely the DLC-1 gene is a human RhoGAP gene involved in the suppression of HCC tumors.

DEFINITIONS

[0062] In order to facilitate review of the various embodiments of the invention, the following definition of terms is provided:

[0063] Breast Carcinoma (BC): breast cancer thought to result, in some instances, from the deletion or mutation of the DLC-1 tumor suppressor gene.

[0064] cDNA (complementary DNA): a piece of DNA lacking internal, non-coding segments (introns) and regulatory sequences which determine transcription. cDNA is synthesized in the laboratory by reverse transcription from messenger RNA extracted from cells.

[0065] Colorectal Carcinoma (CRC): colorectal cancer (such as adenocarcinoma) thought to result, in some instances, from the deletion or mutation of the DLC-1 tumor suppressor gene.

[0066] Deletion: the removal of a sequence of DNA, the regions on either side being joined together.

[0067] DLC-1 gene: a gene, the mutation of which is associated with hepatocellular, breast, colon and prostate carcinomas, and particularly adenocarcinomas of those organs. A mutation of the DLC-1 gene may include nucleotide sequence changes, additions or deletions, including deletion of large portions or all of the DLC-1 gene. The term "DLC-1 gene" is understood to include the various sequence polymorphisms and allelic variations that exist within the population. This term relates primarily to an isolated coding sequence, but can also include some or all of the flanking regulatory elements and/or intron sequences.

[0068] DLC-1 cDNA: a mammalian cDNA molecule which, when transfected into DLC-1 cells, expresses the DLC-1 protein. The DLC-1 cDNA can be derived by reverse transcription from the mRNA encoded by the DLC-1 gene and lacks internal non-coding segments and transcription regulatory sequences present in the DLC-1 gene.

[0069] DLC-1 protein: the protein encoded by the DLC-1 cDNA, the altered expression or mutation of which can predispose to the development of certain cancers, such as hepatocellular carcinoma. This definition is understood to include the various sequence polymorphisms that exist, wherein amino acid substitutions in the protein sequence do not affect the essential functions of the protein.

[0070] DNA: deoxyribonucleic acid. DNA is a long chain polymer which comprises the genetic material of most living organisms (some viruses have genes comprising ribonucleic acid (RNA)). The repeating units in DNA polymers are four different nucleotides, each of which comprises one of the four bases, adenine, guanine, cytosine and thymine, bound to a deoxyribose sugar to which a phosphate group is attached. Triplets of nucleotides, referred to as codons, in DNA molecules code for amino acids in a polypeptide. The term codon is also used for the corresponding (and complementary) sequences of three nucleotides in the mRNA into which the DNA sequence is transcribed.
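To make the notion of codons concrete, the following minimal Python sketch splits a DNA sequence into triplets in a chosen reading frame and writes the corresponding mRNA codons. The function names and the toy sequence are assumptions for illustration. The sketch treats its input as the coding (sense) strand, so the mRNA is obtained by replacing T with U; the mRNA is complementary to the opposite, template strand.

```python
def codons(dna: str, frame: int = 0) -> list[str]:
    """Split a DNA sequence into complete triplets starting at the given frame (0, 1 or 2)."""
    dna = dna.upper()
    # Stop at len - 2 so that a trailing partial triplet is dropped.
    return [dna[i:i + 3] for i in range(frame, len(dna) - 2, 3)]


def transcribe(coding_strand: str) -> str:
    """Return the mRNA sequence corresponding to a coding-strand DNA sequence."""
    return coding_strand.upper().replace("T", "U")


if __name__ == "__main__":
    toy = "ATGGCCTAAG"  # toy sequence, not taken from Seq. I.D. No. 1
    print(codons(toy))               # triplets in frame 0
    print(codons(transcribe(toy)))   # the same triplets written as mRNA codons
```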

[0071] Hepatocellular carcinoma (HCC): liver cancer thought to result, in some instances, from the deletion or mutation of the DLC-1 tumor suppressor gene.

[0072] Isolated: requires that the material be removed from its original environment. For example, a naturally occurring DNA molecule present in a living animal is not isolated, but the same DNA molecule, separated from some or all of the coexisting materials in the natural system, is isolated.

[0073] Mutant DLC-1 gene: a mutant form of the DLC-1 gene which in some embodiments is associated with hepatocellular, breast, colon and/or prostate carcinoma.

[0074] Mutant DLC-1 RNA: the RNA transcribed from a mutant DLC-1 gene.

[0075] Mutant DLC-1 protein: the protein encoded by a mutant DLC-1 gene.

[0076] Oligonucleotide: A linear polynucleotide sequence of up to about 200 nucleotide bases in length, for example a polynucleotide (such as DNA or RNA) which is at least 6 nucleotides, for example at least 15, 50, 100 or even 200 nucleotides long.

[0077] ORF: open reading frame. Contains a series of nucleotide triplets (codons) coding for amino acids without any termination codons. These sequences are usually translatable into protein.
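A minimal sketch of how open reading frames might be located computationally is given below in Python. It scans the three forward reading frames for runs of codons that begin with ATG and continue to the first in-frame termination codon. Requiring an ATG start, ignoring the reverse strand, and the function name itself are simplifying assumptions for illustration; they go beyond the definition above and do not describe how the DLC-1 open reading frame was identified.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna: str, min_codons: int = 1) -> list[tuple[int, int]]:
    """Return sorted (start, end) pairs, 0-based with end exclusive, for forward-strand
    ORFs that start at ATG and end at the first in-frame stop codon (stop included).
    min_codons counts the coding codons before the stop, including the ATG."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # close it and keep scanning this frame
    return sorted(orfs)


if __name__ == "__main__":
    toy = "CCATGAAATTTTAGGG"  # toy sequence, not taken from Seq. I.D. No. 1
    for start, end in find_orfs(toy):
        print(start, end, toy[start:end])
```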

[0078] PCR: polymerase chain reaction. Describes a technique in which cycles of denaturation, annealing with primer, and then extension with DNA polymerase are used to amplify the number of copies of a target DNA sequence.
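The amplifying effect of repeated cycles can be sketched with a simple idealized model, shown below in Python: if every template is copied in each cycle, the number of copies doubles per cycle. The function name and the per-cycle efficiency parameter are assumptions for illustration; real reactions plateau and rarely reach an efficiency of 1.0.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    efficiency is the fraction of templates duplicated per cycle (1.0 = perfect doubling).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        print(n, pcr_copies(1, n))
```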

[0079] Pharmaceutically acceptable carriers: The pharmaceutically acceptable carriers useful in this invention are conventional. Remington's Pharmaceutical Sciences, by E. W. Martin, Mack Publishing Co., Easton, Pa., 15th Edition (1975), describes compositions and formulations suitable for pharmaceutical delivery of the fusion proteins herein disclosed.

[0080] In general, the nature of the carrier will depend on the particular mode of administration being employed. For instance, parenteral formulations usually comprise injectable fluids that include pharmaceutically and physiologically acceptable fluids such as water, physiological saline, balanced salt solutions, aqueous dextrose, glycerol or the like as a vehicle. For solid compositions (e.g., powder, pill, tablet, or capsule forms), conventional non-toxic solid carriers can include, for example, pharmaceutical grades of mannitol, lactose, starch, or magnesium stearate. In addition to biologically-neutral carriers, pharmaceutical compositions to be administered can contain minor amounts of non-toxic auxiliary substances, such as wetting or emulsifying agents, preservatives, and pH buffering agents and the like, for example sodium acetate or sorbitan monolaurate.

[0081] Probes and primers: Nucleic acid probes and primers may readily be prepared based on the nucleic acids provided by this invention. A probe comprises an isolated nucleic acid attached to a detectable label or reporter molecule. Typical labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. Methods for labeling and guidance in the choice of labels appropriate for various purposes are discussed, e.g., in Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989) and Ausubel et al. (Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences, 1987).

[0082] Primers are short nucleic acids, for example DNA oligonucleotides 15 nucleotides or more in length. Primers may be annealed to a complementary target DNA strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain reaction (PCR) or other nucleic-acid amplification methods known in the art.

[0083] Methods for preparing and using probes and primers are described, for example, in Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989), Ausubel et al. (Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences, 1987), and Innis et al., (PCR Protocols, A Guide to Methods and Applications, Innis et al. (eds.), Academic Press, Inc., San Diego, Calif., 1990). PCR primer pairs can be derived from a known sequence, for example, by using computer programs intended for that purpose such as Primer (Version 0.5, © 1991, Whitehead Institute for Biomedical Research, Cambridge, Mass.).

[0084] Prostate Carcinoma (PC): prostate cancer (such as prostatic adenocarcinoma) thought to result, in some instances, from the deletion or mutation of the DLC-1 tumor suppressor gene.

[0085] Protein: a biological molecule expressed by a gene and comprised of amino acids.

[0086] Purified: the term "purified" does not require absolute purity; rather, it is intended as a relative term. Thus, for example, a purified protein preparation is one in which the protein referred to is more pure than the protein in its natural environment within a cell.

[0087] Recombinant: A recombinant nucleic acid is one that has a sequence that is not naturally occurring or has a sequence that is made by an artificial combination of two otherwise separated segments of sequence. This artificial combination is often accomplished by chemical synthesis or, more commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic engineering techniques.

[0088] Representational Difference Analysis (RDA): a PCR-based subtractive hybridization technique used to identify differences in the mRNA transcripts present in closely related cell lines.

[0089] Sequence identity: the similarity between two nucleic acid sequences, or two amino acid sequences, is expressed in terms of the similarity between the sequences, otherwise referred to as sequence identity. Sequence identity is frequently measured in terms of percentage identity (or similarity or homology); the higher the percentage, the more similar are the two sequences.

[0090] Methods of alignment of sequences for comparison are well-known in the art. Various programs and alignment algorithms are described in: Smith and Waterman, Adv. Appl. Math. 2:482, 1981; Needleman and Wunsch, J. Mol. Biol. 48:443, 1970; Pearson and Lipman, Methods in Mol. Biol. 24:307-31, 1988; Higgins and Sharp, Gene 73:237-44, 1988; Higgins and Sharp, CABIOS 5:151-3, 1989; Corpet et al., Nuc. Acids Res. 16:10881-90, 1988; Huang et al., Comp. Appl. BioSci. 8:155-65, 1992; and Pearson et al., Meth. Mol. Biol. 24:307-31, 1994.

[0091] The NCBI Basic Local Alignment Search Tool (BLAST) (Altschul et al., J. Mol. Biol. 215:403-10, 1990) is available from several sources, including the National Center for Biotechnology Information (NCBI, Bethesda, Md.) and on the Internet, for use in connection with the sequence analysis programs blastp, blastn, blastx, tblastn and tblastx. It can be accessed at http://www.ncbi.nlm.nih.gov/BLAST/. A description of how to determine sequence identity using this program is available at http://www.ncbi.nlm.nih.gov/BLAST/blast_help.html.

[0092] Homologs of the DLC-1 protein are typically characterized by possession of at least 70% sequence identity counted over the full length alignment with the disclosed amino acid sequence using the NCBI Blast 2.0, gapped blastp set to default parameters. Such homologous peptides will more preferably possess at least 75%, more preferably at least 80% and still more preferably at least 90% or 95% sequence identity determined by this method. When less than the entire sequence is being compared for sequence identity, homologs will possess at least 75% and more preferably at least 85% and more preferably still at least 90% or 95% sequence identity over short windows of 10-20 amino acids. Methods for determining sequence identity over such short windows are described at http://www.ncbi.nlm.nih.gov/BLAST/blast_FAQs.html. One of skill in the art will appreciate that these sequence identity ranges are provided for guidance only; it is entirely possible that strongly significant homologs or other variants could be obtained that fall outside of the ranges provided.
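
By way of illustration only, the percentage identity referred to above may be computed from a pair of sequences that have already been aligned by another program. The following Python sketch is not the BLAST algorithm; it simply counts identical columns in a pre-computed alignment, both over the full alignment and over short windows. The function names and the toy peptide strings are hypothetical and are not taken from the DLC-1 sequence.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def window_identities(aligned_a: str, aligned_b: str, window: int = 10) -> list:
    """Percent identity for every run of `window` consecutive alignment columns."""
    if window < 1 or window > len(aligned_a):
        raise ValueError("window must be between 1 and the alignment length")
    return [
        percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
        for i in range(len(aligned_a) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical 16-residue peptides differing at one position.
    a = "MKTAYIAKQRQISFVK"
    b = "MKTAYLAKQRQISFVK"
    print(percent_identity(a, b))       # identity over the full alignment
    print(window_identities(a, b, 10))  # identity over 10-residue windows
```

A result from `percent_identity` could then be compared against the guidance thresholds given above (for example, 70% over the full length), bearing in mind that those ranges are stated to be for guidance only.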

[0093] The present invention provides not only the peptide homologs that are described above, but also nucleic acid molecules that encode such homologs.

[0094] Transformed: A transformed cell is a cell into which has been introduced a nucleic acid molecule by molecular biology techniques. As used herein, the term transformation encompasses all techniques by which a nucleic acid molecule might be introduced into such a cell, including transfection with viral vectors, transformation with plasmid vectors, and introduction of naked DNA by electroporation, lipofection, and particle gun acceleration.

[0095] Vector: A nucleic acid molecule as introduced into a host cell, thereby producing a transformed host cell. A vector may include nucleic acid sequences that permit it to replicate in a host cell, such as an origin of replication. A vector may also include one or more selectable marker genes and other genetic elements known in the art.

[0096] VNTR probes: Variable Number of Tandem Repeat probes. These are highly polymorphic DNA markers for human chromosomes. The polymorphism is due to variation in the number of tandem repeats of a short DNA sequence. Use of these probes enables the DNA of an individual to be distinguished from that derived from another individual.

[0097] Tumor: a neoplasm.

[0098] Neoplasm: abnormal growth of cells.

[0099] Cancer: malignant neoplasm that has undergone characteristic anaplasia with loss of differentiation, increased rate of growth, invasion of surrounding tissue, and is capable of metastasis.

[0100] Malignant: cells which have the properties of anaplasia, invasion, and metastasis.

[0101] Normal cells: Non-tumor, non-malignant cells.

[0102] Mammal: This term includes both human and non-human mammals. Similarly, the term "patient" includes both human and veterinary subjects.

[0103] Animal: Living multicellular vertebrate organisms, a category which includes, for example, mammals and birds.

[0104] Transgenic Cell: transformed cells which contain foreign, non-native DNA.

[0105] Additional definitions of common terms in molecular biology may be found in Lewin, B. "Genes V" published by Oxford University Press.

Materials and Methods

Primary HCC Samples and HCC Cell Lines

[0106] All of the primary liver tumor DNAs were obtained from surgical resection of HCC tissues from patients in Qidong, China. Each tumor sample was matched with its surrounding non-cancerous liver tissue. DNAs were extracted after diagnosis of HCC with or without cirrhosis. The tumors were Hepatitis B virus (HBV) positive, as determined by HBVsAg and/or PCR detection of the HBVx gene. HCC cell lines were obtained from ATCC (Manassas, Va.), Qidong Liver Cancer Institute, China, and Dr. Curtis C. Harris (Laboratory of Human Carcinogenesis, Division of Basic Sciences, National Cancer Institute) (Wang, et al., Chin. J. Oncol. 3:241-4, 1981).

Breast, Prostate and Colorectal Carcinomas

[0107] All normal and CRC (adenocarcinomas) cell lines were purchased from ATCC (Manassas, Va.). The PC cell lines (also adenocarcinomas) were obtained from The University of Texas M.D. Anderson Cancer Center (Houston, Tex.). The DNA from primary breast carcinomas and blood cells was obtained from patients in Iceland.

Manipulation of Genetic Material

[0108] Unless otherwise specified, manipulation of genetic material was performed according to standard laboratory procedures, such as those described in Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989) and Ausubel et al. (Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences, 1987).

Representational Difference Analysis (RDA)

[0109] One primary HCC that carried a homozygous point mutation of the p53 gene, a mutation absent from its surrounding non-cancerous liver tissue, was selected for analysis. RDA was performed as originally described in Lisitsyn et al. (Proc. Natl. Acad. Sci. USA 92:151-5, 1995), with tumor DNA as tester and normal liver DNA as driver. BglII (Promega, Madison, Wis.) was chosen as the restriction enzyme and its adaptors were used for direct preparation of amplicons and PCR-based subtractive hybridization. The final difference products showing distinct bands in agarose gel were recovered after BglII digestion and ligated into the BglII site of dephosphorylated pSP72 vector (Promega). The recombinant difference products were then transformed into E. coli DH10B.

Characterization of RDA Probes

[0110] Plasmids with distinct DNA inserts were selected for further analysis. DNA sequencing was performed using the Dye Terminator Cycle DNA Sequencing kit (Perkin Elmer, Rockville, Md.). Sequencing reaction products were purified by spin columns (Princeton Separations, Adelphia, N.J.), and run on a 377 DNA Sequencer (Perkin Elmer/Applied Biosystems, Foster City, Calif.). The homology analysis was carried out by BLAST search of the GenBank DNA databases (Altschul, et al., J. Mol. Biol. 215:403-10, 1990). The RDA products that elicited significant homology or appeared in multiple clones were selected for further Southern blot and/or Northern blot analysis.

Conditions for Southern Analysis

[0111] Genomic DNA was isolated from tumor and non-tumor cell lysates and digested with restriction enzymes. The digested DNA was separated by electrophoresis in a 1% agarose gel and transferred to nylon membrane for hybridization. 50 ng of DNA probe was radio-labeled (Prime-It RmT, Stratagene) as per the manufacturer's instructions and used for hybridization. A probe for beta-actin was used as a standard to control for the amount of DNA loaded. Hybridization was performed at 68°C for 2-4 hours using Quickhybrid solution (Stratagene). Following hybridization, the membranes were washed three times at 37°C for 10 min in 1× SSC solution containing 0.1× SDS. This was followed by a single wash at 62°C for 30 min in 0.1× SSC solution containing 0.1× SDS. Blots were exposed to a PhosphorImager and analyzed quantitatively using ImageQuant software, Version 3.3 (Molecular Dynamics, Sunnyvale, Calif.).

Conditions for Northern Analysis

[0112] Total RNA was extracted from cell lysates using TRIzol solution (Gibco-BRL), separated in a 1% agarose gel, and transferred to nylon membrane for hybridization. 50 ng of DNA probe was radio-labeled (Prime-It RmT, Stratagene) as per the manufacturer's instructions and used for hybridization. A probe for GAPDH or beta-actin was used as a control for the amount of RNA loaded. Hybridization, washing, and analysis were performed as described above for Southern hybridization.

5' and 3' RACE and cDNA Library Screening for cDNA Cloning

[0113] 5' and 3' RACE (Rapid Amplification of cDNA Ends) were started from a deleted fragment detected with RDA, and performed using human placenta Marathon™ cDNA as template (Clontech, Inc., Palo Alto, Calif.). The primers used for RACE, generated from the L7-3 sequence (Seq. I.D. No. ______), are as follows:

TABLE-US-00002
(Seq. I.D. No. 9) PrRACE5: 5' CACTCCGGTCCTTGTAGTCTGGAACC 3' was used for the first round of PCR for 5' RACE.
(Seq. I.D. No. 10) PrRACE5N: 5' ATCCTCTTCATGAACTCGGGCACGG 3' was used as the nested primer in the second round of 5' RACE.
(Seq. I.D. No. 11) PrRACE3: 5' GATCAAGGTTCTAGACTACAAGGACCG 3' was used for 3' RACE.

[0114] The final 5' RACE product, exhibiting the same band pattern as the deleted fragment in Northern blot hybridization, was labeled with α-[³²P]-dCTP to screen a 5'-Stretch cDNA library constructed from human lung tissue (Clontech, Inc.). The lambda DNA of positive clones was converted into plasmid DNA by transfecting lambda DNA into AM1 bacterial cells. Full-length sequencing of the cDNA from positive clones was completed by primer walking, and the sequence was assembled with the Sequencher™ 3.1 program.

Fluorescence In Situ Hybridization (FISH) Gene Mapping and Comparative Genomic Hybridization (CGH)

[0115] A genomic probe isolated from human P1 library was labeled with biotin and used for FISH chromosomal localization and CGH analysis. For both analyses, chromosomes prepared from methotrexate-synchronized normal peripheral lymphocyte cultures were used. The original CGH protocol, described in Kallioniemi et al. (Science 258:818-21, 1992), was employed with minor modifications. The conditions of hybridization, the detection of hybridization signals, digital-image acquisition, processing and analysis, and direct fluorescent signal localization on banded chromosomes were performed as previously described in Zimonjic et al. (Cancer Genet. Cytogenet. 80:100-2, 1995).

[0116] The following examples are illustrative of the scope of the present invention.

Example 1

Detection of DLC-1 Deletion in Liver Cancer Cells by RDA

[0117] Primary HCC tumor samples, matched with surrounding non-cancerous liver tissue, were obtained as described above and analyzed by RDA. Several RDA difference products were observed after the third round of hybridization/selection as distinct bands in agarose gel. Twenty individual fragments were isolated and analyzed by Southern blot hybridization for deletions. One clone, L7-3, of 600 bp (Seq. I.D. No. 11), showed loss of heterozygosity (LOH) in the primary tumor (FIG. 1). BLAST search revealed that the L7-3 clone had homology to rat p122 RhoGAP cDNA (Homma and Emori, EMBO. J. 14:286-91, 1995).

Example 2

Southern Analysis

HCC Cell Lines

[0118] To determine if the L7-3 clone is represented in a region recurrently deleted in HCC, 15 primary HCC tumors and 11 HCC-derived cell lines were examined using Southern analysis as described above. The DNA was digested with BglII, and probed with L7-3 (Seq. I.D. No. 11). Seven of the fifteen primary HCC tumors (representatives are shown in FIG. 1) and 9 of the 11 HCC cell lines (FIG. 2) had a genomic deletion of the L7-3 clone compared to no deletions in the normal liver cells.

Primary Breast Carcinomas

[0119] Using Southern analysis as described above, primary human breast cancer DNA and corresponding patient blood cell DNA were digested with BglII (FIG. 3a) or BamHI (FIG. 3b) and probed with full-length DLC-1 cDNA (Seq. I.D. No. 10). Genomic deletions of the DLC-1 gene were detected in 7 of 15 human primary breast cancers (representatives are shown in FIG. 3). A deletion was scored when the band intensity of the tumor DNA was reduced by at least half compared with the matched normal tissue DNA. Samples IC11T, IC12T, IC13T, IC2T, IC6T, and IC7T are representative of the genomic deletions in these experiments.

[0120] Southern analysis of these cells resulted in several bands. As a control for DNA loading, the bands that remained unchanged in the tumor cells were used.
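
The scoring of deletions from band intensities, using an unchanged band as a loading control, may be sketched as follows. This Python example is illustrative only. It assumes that a deletion is scored when the tumor signal, after normalization to the control band, falls to half or less of the matched normal signal; the 0.5 cutoff, the function names, and the intensity values are assumptions introduced here and are not data from the experiments described above.

```python
def normalized_ratio(tumor_signal: float, tumor_control: float,
                     normal_signal: float, normal_control: float) -> float:
    """Tumor/normal intensity ratio after dividing each by its loading-control band."""
    if min(tumor_control, normal_signal, normal_control) <= 0:
        raise ValueError("control and normal intensities must be positive")
    return (tumor_signal / tumor_control) / (normal_signal / normal_control)


def is_deleted(ratio: float, cutoff: float = 0.5) -> bool:
    """Assumed scoring rule: signal reduced to `cutoff` of normal or below."""
    return ratio <= cutoff


if __name__ == "__main__":
    # Hypothetical densitometry readings (arbitrary units).
    ratio = normalized_ratio(tumor_signal=450, tumor_control=1000,
                             normal_signal=1000, normal_control=1000)
    print(ratio, is_deleted(ratio))
```

Dividing by the control band first means that a lane loaded with less DNA is not mistaken for a deletion.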

Colon Carcinoma Cell Lines

[0121] Using Southern analysis as described above, normal genomic DNA (Promega) and the DNA from five CRC cell lines were digested with EcoRI, and probed with a mixture of L7-3 and P-35 (Seq. I.D. Nos. 11 and 12) which correspond to exon 2 and exon 7 of the human DLC-1 gene (see FIG. 9), respectively. Genomic deletions of DLC-1 gene were detected in two of five human CRC cell lines (FIG. 4). Cell lines SW403 and SW1116 showed half of the DNA intensity for probe P-35 when compared with normal genomic DNA (compare lanes 5 and 6 with lane 1). Interestingly, the signal was unaltered when the L7-3 probe was used, indicating that this region (exon 2) is not responsible for the development of CRC in these cell lines. Therefore, the signal from L7-3 can be used as an internal control for the amount of DNA loaded.

Example 3

Northern Analysis

HCC Cell Lines

[0122] Considering the significant DNA sequence homology of the L7-3 clone with rat RhoGAP cDNA, its mRNA expression was examined in both normal human tissues and HCC-derived cell lines by Northern analysis as described above. Analysis of mRNA isolated from several normal human tissues, including liver, demonstrated that the L7-3 clone hybridized to a 7.5 kb (major) transcript and a 4.5 kb (minor) transcript (FIG. 5) that were detected in all normal tissues but not in 4 (WRL, 7703, Chang and Focus) out of 14 human HCC-derived cell lines (FIG. 6).

Colorectal Carcinomas

[0123] Using Northern analysis as described above, RNA from normal and CRC cell lines was prepared and probed with the full-length DLC-1 cDNA (Seq. I.D. No. 10). In human CRC cell lines, 11 out of 17 (HCT-15, LS147T, DLD-1, HD29, SW1116, T84, SW1417, SW403, SW948, LS180, SW48) showed noticeably decreased or no expression of DLC-1 mRNA (FIG. 7). In this experiment, the normal human colon fibroblast cell line CDD33C0 was used as a normal control.

Prostate Carcinomas

[0124] Using Northern analysis as described above, RNA from PC cell lines was prepared and probed with the full-length DLC-1 cDNA (Seq. I.D. No. 10). Low levels or no DLC-1 gene expression was demonstrated in two (LN-Cap and SP3504) out of three human PC cell lines (FIG. 8).

Example 4

Obtaining the DLC-1 cDNA

[0125] The cDNA for the clone L7-3 was obtained by 5' RACE and 3' RACE coupled with cDNA library screening as described above. The full-length cDNA of DLC-1 gene is 3850 bp long (Seq. I.D. No. 1) and encodes a protein of 1091 amino acids (Seq. I.D. No. 2). The estimated molecular weight of DLC-1 protein is 125 kD. The untranslated regions of 5' end and 3' end of DLC-1 gene are 324 bp and 250 bp, respectively (Seq. I.D. No. 1).
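
The lengths given above can be cross-checked with simple arithmetic. The following Python sketch assumes that the stop codon is counted within the coding region (that is, the 3' untranslated region begins after the stop codon) and uses a rule-of-thumb average residue mass of 110 Da. Both are assumptions made here for illustration, and the resulting mass is only a rough approximation that ignores amino acid composition.

```python
CDNA_LENGTH_BP = 3850   # full-length cDNA, from the text
UTR5_BP = 324           # 5' untranslated region, from the text
UTR3_BP = 250           # 3' untranslated region, from the text
AVG_RESIDUE_DA = 110    # assumed average mass per amino acid residue


def coding_length_bp(cdna_bp: int, utr5_bp: int, utr3_bp: int) -> int:
    """Length of the region between the two untranslated regions."""
    return cdna_bp - utr5_bp - utr3_bp


def residues_encoded(coding_bp: int) -> int:
    """Number of amino acids, assuming the last codon is a stop codon."""
    if coding_bp % 3 != 0:
        raise ValueError("coding length is not a whole number of codons")
    return coding_bp // 3 - 1


def approx_mass_kda(n_residues: int, avg_da: float = AVG_RESIDUE_DA) -> float:
    """Rough molecular mass estimate in kilodaltons."""
    return n_residues * avg_da / 1000.0


if __name__ == "__main__":
    coding = coding_length_bp(CDNA_LENGTH_BP, UTR5_BP, UTR3_BP)
    n = residues_encoded(coding)
    print(coding, n, approx_mass_kda(n))
```

Under these assumptions the coding region is 3276 bp, or 1092 codons, which corresponds to 1091 amino acids plus a stop codon and agrees with the stated protein length. The rule-of-thumb mass is of the same order as the 125 kD estimate given above.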

Example 5

Chromosomal Localization of Human DLC-1

[0126] The DLC-1 gene was chromosomally localized using the materials and methods described above. The majority of metaphases hybridized with biotin- or digoxigenin-labeled genomic probe had fluorescent signal at identical sites on both chromatids of the short arm of chromosome 8. The signal was analyzed in 100 metaphases with both homologs labeled. Fifty metaphases were examined by imaging of DAPI-generated and enhanced G-like banding. The fluorescent signals were distributed within region 8p21-22. However, over 50% of doublets were at bands 8p21.3-22, the most likely location of the DLC-1 gene.

[0127] To further characterize the region harboring the DLC-1 gene, the primary tumor DNA used as tester in RDA (94-25T) was analyzed by CGH. The fluorescence profile for chromosome 8 demonstrated DNA loss in region 8p23-q11.2 and gain in region 8q21.1-q24.3.

Example 6

Cloning and Characterization of Human DLC-1

[0128] The DLC-1 cDNA sequence (Seq. I.D. No. 1) described above does not contain the introns, upstream transcriptional promoter or regulatory regions or downstream transcriptional regulatory regions of the DLC-1 gene. It is possible that some mutations in the DLC-1 gene that may lead to HCC are not included in the cDNA but rather are located in other regions of the DLC-1 gene. Mutations located outside of the open reading frame that encodes the DLC-1 protein are not likely to affect the functional activity of the protein but rather are likely to result in altered levels of the protein in the cell. For example, mutations in the promoter region of the DLC-1 gene may prevent transcription of the gene and therefore lead to the complete absence of the DLC-1 protein in the cell.

[0129] Additionally, mutations within intron sequences in the genomic gene may also prevent expression of the DLC-1 protein. Following transcription of a gene containing introns, the intron sequences are removed from the RNA molecule in a process termed splicing prior to translation of the RNA molecule which results in production of the encoded protein. When the RNA molecule is spliced to remove the introns, the cellular enzymes that perform the splicing function recognize sequences around the intron/exon border and in this manner recognize the appropriate splice sites. If there is a mutation within the sequence of the intron close to the junction of the intron with an exon, the enzymes may not recognize the junction and may fail to remove the intron. If this occurs, the encoded protein will likely be defective. Thus, mutations inside the intron sequences within the DLC-1 gene (termed "splice site mutations") may also lead to the development of HCC. However, knowledge of the exon structure and intronic splice site sequences of the DLC-1 gene is required to define the molecular basis of these abnormalities. The provision herein of the DLC-1 cDNA sequence (Seq. I.D. No. 1) enables the cloning of the entire DLC-1 gene (including the promoter and other regulatory regions and the intron sequences) and the determination of its nucleotide sequence. With this information in hand, diagnosis of a genetic predisposition to HCC and other cancers based on DNA analysis will comprehend all possible mutagenic events at the DLC-1 locus.

[0130] The ATCC deposit (98676) of the genomic DLC-1 gene may be utilized in aspects of the present invention. Alternatively, the DLC-1 gene may be isolated by one or more routine procedures, including PCR-based screening of a human genomic P1 library as described above. Alternatively, the method described in WO 93/22435 can be utilized. For example, a YAC library of human genomic sequences (Monaco and Lehrach, Proc. Natl. Acad. Sci. U.S.A. 88:4123-7, 1991) is screened for the DLC-1 gene by the polymerase chain reaction (PCR). The library is arranged in a number (e.g., 39) of primary DNA pools, prepared from high-density grids each containing around 300-400 YAC clones. Primary pools are screened by PCR to identify a pool which contains a positive clone. A secondary PCR screen is then performed on the appropriate set of eight row and 12 column pools, as described by Bentley et al. (Genomics 12:534-41, 1992). PCR primers based on the DLC-1 cDNA sequence are used as a sequence tagged site (STS) for the 3' region of the gene. The yeast DNA is then amplified with these primers by PCR for 30 cycles of 94°C for 1 minute, 60°C for 1 minute and 72°C for 1 minute, with a final 5 minute extension at 72°C. Confirmation that positive YAC clones contain the majority of the coding sequence of the DLC-1 genomic gene is obtained by amplification of an STS from the 5' end of the cDNA. Exon boundaries of the DLC-1 gene are then characterized, e.g., by the vectorette PCR method. This strategy has been described in detail previously (Roberts et al., Genomics 13:942-50, 1992).
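
The logic of the secondary row and column screen may be sketched as follows. This Python example assumes, for illustration only, that the eight row pools and 12 column pools correspond to the rows A-H and columns 1-12 of a microtiter-style layout, so that a clone lies at the intersection of a positive row pool and a positive column pool. The actual pooling layout is that of Bentley et al. and is not reproduced here; the function names and well labels are hypothetical.

```python
ROWS = "ABCDEFGH"          # assumed labels for the eight row pools
COLUMNS = range(1, 13)     # assumed labels for the 12 column pools


def candidate_wells(positive_rows, positive_columns):
    """Grid positions consistent with the positive row and column pools."""
    for r in positive_rows:
        if r not in ROWS:
            raise ValueError(f"unknown row pool: {r}")
    for c in positive_columns:
        if c not in COLUMNS:
            raise ValueError(f"unknown column pool: {c}")
    return [f"{r}{c}" for r in positive_rows for c in positive_columns]


def reactions_needed(n_primary_pools: int = 39, n_rows: int = 8,
                     n_cols: int = 12) -> int:
    """PCR reactions for one primary screen plus one secondary screen."""
    return n_primary_pools + n_rows + n_cols


if __name__ == "__main__":
    print(candidate_wells(["C"], [7]))       # a single unambiguous position
    print(candidate_wells(["C", "F"], [7]))  # two candidates to be re-tested
    print(reactions_needed())
```

With one positive primary pool, the two-stage scheme in this sketch requires 39 + 8 + 12 = 59 reactions rather than one reaction per clone. When more than one row or column pool is positive, each candidate position would need to be confirmed individually.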

[0131] With the sequences of the DLC-1 cDNA and DLC-1 gene in hand, primers derived from these sequences may be used in diagnostic tests (described below) to determine the presence of mutations in any part of the genomic DLC-1 gene of a patient. Such primers will be oligonucleotides comprising a fragment of sequence from the DLC-1 gene (either intron sequence, exon sequence or a sequence spanning an intron-exon boundary) and will comprise at least 15 consecutive nucleotides of the DLC-1 cDNA or gene. It will be appreciated that greater specificity may be achieved by using primers of greater lengths. Thus, in order to obtain enhanced specificity, the primers used may comprise 20, 25, 30 or even 50 consecutive nucleotides of the DLC-1 cDNA or gene. Furthermore, with the provision of the DLC-1 intron sequence information the analysis of a large and as yet untapped source of patient material for mutations will now be possible using methods such as chemical cleavage of mismatches (Cotton et al., Proc Natl Acad Sci USA. 85:4397-401, 1988; Montandon et al., Nucleic Acids Res. 9:3347-58, 1989) and single-strand conformational polymorphism analysis (Orita et al., Genomics 5:874-879, 1989).

[0132] Additional experiments may now be performed to identify and characterize regulatory elements flanking the DLC-1 gene. These regulatory elements may be characterized by standard techniques, including deletion analyses in which successive nucleotides of a putative regulatory region are removed and the effect of each deletion is studied in mammalian cells by either transient or long-term expression analyses.

[0133] Having provided a genomic clone for the DLC-1 gene (Seq. I.D. No.______), it will be apparent to one skilled in the art that either the genomic clone or the cDNA or sequences derived from these clones may be utilized in applications of this invention, including, but not limited to, studies of the expression of the DLC-1 gene, studies of the function of the DLC-1 protein, the generation of antibodies to the DLC-1 protein, and the diagnosis and therapy of patients with a deleted or mutated DLC-1 gene to prevent or treat the onset of HCC. Descriptions of applications describing the use of DLC-1 cDNA are therefore intended to comprehend the use of the genomic DLC-1 gene. It will also be apparent to one skilled in the art that homologs of this gene may now be cloned from other species, such as the rat or the mouse, by standard cloning methods. Such homologs will be useful in the production of animal models of HCC.

[0134] To facilitate the detection of point mutations in liver and other cancers that exhibit alteration at region 8p12-22, the human DLC-1 gene was cloned and the intron/exon sequences characterized (Seq. I.D. No. ______ and FIG. 9).

[0135] Human DLC-1 is approximately 25 kb, and contains 14 exons. The largest exon is exon 2, at 1.5 kb, while the remaining exons are less than 300 bp on average (FIG. 9).

Example 7

Cloning Mouse DLC-1

[0136] A full understanding of the function of DLC-1 and its role in cancer development is essential. This understanding can be facilitated by the generation of knock-out mice, which contain a non-functional DLC-1 gene. Prior to generating knock-out mice, the mouse DLC-1 gene was cloned.

[0137] Mouse DLC-1 genomic DNA was cloned and localized to chromosome 8 by FISH (see above for methods) using a mouse DLC-1 genomic DNA clone as the probe. Mouse DLC-1 lies in a region syntenic with that of the human DLC-1 gene. The localization of the DLC-1 gene in mice may permit studies with in vivo models for carcinogenesis.

Example 8

Generating Transgenic Mice

[0138] Methods for generating transgenic mice are described in Gene Targeting, A. L. Joyner ed., Oxford University Press, 1995 and Watson, J. D. et al., Recombinant DNA 2nd Ed., W.H. Freeman and Co., New York, 1992, Chapter 14. To specifically generate transgenic mice containing a functional deletion of the DLC-1 gene, a 1.5 kb fragment in front of exon 2 and a 5.5 kb fragment spanning from intron 2 to intron 5 were used as the short arm and the long arm, respectively. The neo gene was introduced between the long arm and the short arm, generating the vector shown in FIG. 10, referred to herein as the knock-out vector.

[0139] Using standard transgenic mouse technology, the vector shown in FIG. 10 can be used to generate DLC-1 knock-out mice by homologous recombination. The knock-out vector is introduced into embryonic stem cells (ES cells) by standard methods which may include transfection, retroviral infection or electroporation (also see Example 11). The transfected ES cells expressing the knock-out vector will grow in medium containing the antibiotic G418. The neomycin resistant ES cells will be microinjected into mouse embryos (blastocysts), which are implanted into the uterus of pseudopregnant mice. The litter will be screened for chimeric mice by observing their coat color. Chimeric mice are ones in which the injected ES cells developed into the germ line, thereby allowing transmission of the gene to their offspring. The resulting heterozygous mice will be mated to generate a homozygous line of transgenic mice functionally deleted for DLC-1. These homozygous mice will then be screened phenotypically, for example, for their predisposition to developing cancer.
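
The proportion of homozygous offspring expected from mating heterozygous mice may be illustrated by enumerating the allele combinations. This Python sketch assumes simple Mendelian inheritance of a single locus and equal viability of all genotypes, neither of which is established in the text; the allele symbols "+" (wild type) and "-" (targeted allele) and the function name are introduced here for illustration.

```python
from collections import Counter
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict:
    """Expected genotype fractions among offspring of two diploid parents.

    Each parent is given as a two-character string of alleles, e.g. "+-".
    """
    if len(parent_a) != 2 or len(parent_b) != 2:
        raise ValueError("each parent must carry exactly two alleles")
    # One allele is drawn from each parent; sort so "-+" and "+-" are pooled.
    offspring = Counter(
        "".join(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(offspring.values())
    return {genotype: n / total for genotype, n in offspring.items()}


if __name__ == "__main__":
    # Heterozygote x heterozygote mating.
    print(cross("+-", "+-"))
```

Under these assumptions, one quarter of the offspring of a heterozygous cross would be homozygous for the targeted allele, one half heterozygous, and one quarter wild type.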

Example 9

Preferred Method of Making the DLC-1 cDNA

[0140] The foregoing discussion describes the original means by which the DLC-1 cDNA was obtained and also provides the nucleotide sequence of this clone. With the provision of this sequence information, the polymerase chain reaction (PCR) may now be utilized in a more direct and simple method for producing the DLC-1 cDNA.

[0141] Essentially, total RNA is extracted from human cells by any one of a variety of methods routinely used; Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989) and Ausubel et al. (In Current Protocols in Molecular Biology, Greene Publishing Associates and Wiley-Intersciences, 1987) provide descriptions of methods for RNA isolation. Any human cell line derived from a non-DLC-1 deleted individual would be suitable, such as the widely used HeLa cell line, or the WI-38 human lung fibroblast cell line available from the American Type Culture Collection, Rockville, Md. The extracted RNA is then used as a template for performing the reverse transcription-polymerase chain reaction (RT-PCR) amplification of cDNA. Methods and conditions for RT-PCR are described in Kawasaki et al. (In PCR Protocols, A Guide to Methods and Applications, Innis et al. (eds.), pp. 21-27, Academic Press, Inc., San Diego, Calif., 1990). The selection of PCR primers will be made according to the portions of the cDNA which are to be amplified. Primers may be chosen to amplify small segments of a cDNA or the entire cDNA molecule. Variations in amplification conditions may be required to accommodate primers of differing lengths; such considerations are well known in the art and are discussed in Innis et al. (PCR Protocols, A Guide to Methods and Applications, Innis et al. (eds.), Academic Press, Inc., San Diego, Calif., 1990). The entire DLC-1 cDNA molecule may be amplified using the following combination of primers:

TABLE-US-00003
(Seq. I.D. No. 3) 5' TAT GGG CTC GAG CGG CCG CCC 3'
(Seq. I.D. No. 4) 5' CGC ACA GTC TTA CAT ATT CCA 3'

The open reading frame of the cDNA molecule may be amplified using the following combination of primers:

TABLE-US-00004
(Seq. I.D. No. 5) 5' ATG TGC AGA AAG AAG CCG GAC ACC 3'
(Seq. I.D. No. 6) 5' CCT AGA TTT GGT GTC TTT GGT TTC 3'

These primers are illustrative only; it will be appreciated by one skilled in the art that many different primers may be derived from the provided cDNA sequence in order to amplify particular regions of these cDNAs.
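
Basic properties of the illustrative primers, which bear on the choice of amplification conditions, may be tabulated as in the following Python sketch. The melting temperature is estimated with the simple rule Tm = 2(A+T) + 4(G+C), a rough approximation usually applied to short oligonucleotides; it is used here as an assumption for illustration and is not the method by which the primers were designed. The function names are hypothetical.

```python
def clean(primer: str) -> str:
    """Remove spaces from a triplet-grouped primer and upper-case it."""
    seq = primer.replace(" ", "").upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must contain only A, C, G and T")
    return seq


def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = clean(primer)
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(primer: str) -> int:
    """Rough melting temperature in degrees C: 2 per A/T plus 4 per G/C."""
    seq = clean(primer)
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    primers = {
        "Seq. I.D. No. 3": "TAT GGG CTC GAG CGG CCG CCC",
        "Seq. I.D. No. 4": "CGC ACA GTC TTA CAT ATT CCA",
        "Seq. I.D. No. 5": "ATG TGC AGA AAG AAG CCG GAC ACC",
        "Seq. I.D. No. 6": "CCT AGA TTT GGT GTC TTT GGT TTC",
    }
    for name, p in primers.items():
        print(name, len(clean(p)), round(gc_percent(p), 1), wallace_tm(p))
```

A large difference between the estimates for the two primers of a pair is one indication that the annealing temperature may need to be adjusted empirically.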

Example 10

Sequence Variants of DLC-1

[0142] The nucleotide sequence of the DLC-1 cDNA (Seq. I.D. No. 1) and the amino acid sequence of the DLC-1 protein (Seq. I.D. No. 2) which is encoded by that cDNA, respectively are shown in FIG. 5. Having presented the nucleotide sequence of the DLC-1 cDNA and the amino acid sequence of the protein, this invention now also facilitates the creation of DNA molecules, and thereby proteins, which are derived from those disclosed but which vary in their precise nucleotide or amino acid sequence from those disclosed. Such variants may be obtained through a combination of standard molecular biology laboratory techniques and the nucleotide sequence information disclosed by this invention.

[0143] Variant DNA molecules include those created by standard DNA mutagenesis techniques, for example, M13 primer mutagenesis. Details of these techniques are provided in Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989, Ch. 15). By the use of such techniques, variants may be created which differ in minor ways from those disclosed. DNA molecules and nucleotide sequences which are derivatives of those specifically disclosed herein and which differ from those disclosed by the deletion, addition or substitution of nucleotides while still encoding a protein which possesses the functional characteristic of the DLC-1 protein are comprehended by this invention. Also within the scope of this invention are small DNA molecules which are derived from the disclosed DNA molecules. Such small DNA molecules include oligonucleotides suitable for use as hybridization probes or polymerase chain reaction (PCR) primers. As such, these small DNA molecules will comprise at least a segment of the DLC-1 cDNA molecule or the DLC-1 gene and, for the purposes of PCR, will comprise at least a 15 nucleotide sequence and, more preferably, a 20-50 nucleotide sequence of the DLC-1 cDNA (Seq. I.D. No. 1) or the DLC-1 gene (Seq. I.D. No.______) (i.e., at least 20-50 consecutive nucleotides of the DLC-1 cDNA or gene sequences). DNA molecules and nucleotide sequences which are derived from the disclosed DNA molecules as described above may also be defined as DNA sequences which hybridize under stringent conditions to the DNA sequences disclosed, or fragments thereof.

[0144] Hybridization conditions resulting in particular degrees of stringency will vary depending upon the nature of the hybridization method of choice and the composition and length of the hybridizing DNA used. Generally, the temperature of hybridization and the ionic strength (especially the Na.sup.+ concentration) of the hybridization buffer will determine the stringency of hybridization. Calculations regarding hybridization conditions required for attaining particular degrees of stringency are discussed by Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989, Ch. 9 and 11), herein incorporated by reference. By way of illustration only, a hybridization experiment may be performed by hybridization of a DNA molecule (for example, a derivative of the DLC-1 cDNA) to a target DNA molecule (for example, the DLC-1 cDNA) which has been electrophoresed in an agarose gel and transferred to a nitrocellulose membrane by Southern blotting (Southern, J. Mol. Biol. 98:503, 1975), a technique well known in the art and described in Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989). Hybridization with a target probe labeled with [.sup.32P]-dCTP is generally carried out in a solution of high ionic strength such as 6.times.SSC at a temperature that is 20-25.degree. C. below the melting temperature, T.sub.m, described below. For such Southern hybridization experiments where the target DNA molecule on the Southern blot contains 10 ng of DNA or more, hybridization is typically carried out for 6-8 hours using 1-2 ng/ml radiolabeled probe (of specific activity equal to 10.sup.9 CPM/.mu.g or greater). Following hybridization, the nitrocellulose filter is washed to remove background hybridization. The washing conditions should be as stringent as possible, so as to remove background hybridization while retaining a specific hybridization signal.
The term T.sub.m represents the temperature above which, under the prevailing ionic conditions, the radiolabeled probe molecule will not hybridize to its target DNA molecule. The T.sub.m of such a hybrid molecule may be estimated from the following equation (Bolton and McCarthy, Proc. Natl. Acad. Sci. USA 48:1390, 1962):

T.sub.m=81.5.degree. C.-16.6(log.sub.10[Na.sup.+])+0.41(% G+C)-0.63(% formamide)-(600/l)

where l is the length of the hybrid in base pairs. This equation is valid for concentrations of Na.sup.+ in the range of 0.01 M to 0.4 M, and it is less accurate for calculations of T.sub.m in solutions of higher [Na.sup.+]. The equation is also primarily valid for DNAs whose G+C content is in the range of 30% to 75%, and it applies to hybrids greater than 100 nucleotides in length (the behavior of oligonucleotide probes is described in detail in Ch. 11 of Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989)).

[0145] Thus, by way of example, for a 150 base pair DNA probe derived from the open reading frame of the DLC-1 cDNA (with a hypothetical % GC=45%), a calculation of hybridization conditions required to give particular stringencies may be made as follows:

[0146] For this example, it is assumed that the filter will be washed in 0.3.times.SSC solution following hybridization, thereby:

[0147] [Na.sup.+]=0.045 M

[0148] % GC=45%

[0149] Formamide concentration=0

[0150] l=150 base pairs

T.sub.m=81.5-16(log.sub.10[Na.sup.+])+(0.41.times.45)-(600/150)

[0151] and so T.sub.m=74.4.degree. C.
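By way of illustration only, the arithmetic of this example may be sketched in Python. The function name `melting_temperature` and its parameter names are hypothetical. The sketch assumes that the salt term lowers T.sub.m when [Na.sup.+] is below 1 M (the logarithm is negative there), which is the reading that reproduces the 74.4.degree. C. figure. The salt coefficient is left as a parameter because the general equation gives 16.6 whereas the worked substitution uses 16.

```python
import math


def melting_temperature(na_molar, gc_percent, length_bp,
                        formamide_percent=0.0, salt_coeff=16.6):
    """Estimate T_m (deg C) of a DNA hybrid longer than about 100 bp.

    Assumption: the salt term is applied so that T_m falls as [Na+] drops
    below 1 M, i.e. salt_coeff * log10([Na+]) is added (the log is negative).
    """
    return (81.5
            + salt_coeff * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.63 * formamide_percent
            - 600.0 / length_bp)


# Worked example: 150 bp probe, 45% GC, 0.3x SSC wash ([Na+] = 0.045 M).
tm_example = melting_temperature(0.045, 45, 150, salt_coeff=16)  # as substituted
tm_general = melting_temperature(0.045, 45, 150)                 # coefficient 16.6
print(f"{tm_example:.1f} {tm_general:.1f}")  # prints "74.4 73.6"
```

With the coefficient of 16.6 the estimate is about 0.8.degree. C. lower, a difference that is small relative to the stated accuracy limits of the equation.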

[0152] The T.sub.m of double-stranded DNA decreases by 1-1.5.degree. C. with every 1% decrease in homology (Bonner et al., J. Mol. Biol. 81:123, 1973). Therefore, for this given example, washing the filter in 0.3.times.SSC at 59.4-64.4.degree. C. will produce a stringency of hybridization equivalent to 90%; that is, DNA molecules with more than 10% sequence variation relative to the target DLC-1 cDNA will not hybridize. Alternatively, washing the hybridized filter in 0.3.times.SSC at a temperature of 65.4-68.4.degree. C. will yield a hybridization stringency of 94%; that is, DNA molecules with more than 6% sequence variation relative to the target DLC-1 cDNA molecule will not hybridize. The above example is given entirely by way of theoretical illustration. One skilled in the art will appreciate that other hybridization techniques may be utilized and that variations in experimental conditions will necessitate alternative calculations for stringency.
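The wash-temperature ranges quoted above follow directly from the 1-1.5.degree. C. drop in T.sub.m per 1% mismatch, and may be sketched in Python as follows. The function name `wash_temperature_range` is hypothetical, and the T.sub.m of 74.4.degree. C. is taken from the example as given.

```python
def wash_temperature_range(tm_celsius, max_mismatch_percent,
                           drop_per_percent=(1.0, 1.5)):
    """Return (low, high) wash temperatures in deg C for a mismatch limit.

    The T_m of the hybrid is assumed to fall by between drop_per_percent[0]
    and drop_per_percent[1] deg C for every 1% of sequence mismatch.
    """
    low_drop, high_drop = drop_per_percent
    return (tm_celsius - high_drop * max_mismatch_percent,
            tm_celsius - low_drop * max_mismatch_percent)


for mismatch in (10, 6):
    low, high = wash_temperature_range(74.4, mismatch)
    print(f"{mismatch}% mismatch: {low:.1f}-{high:.1f} deg C")
```

For 10% and 6% mismatch this reproduces the 59.4-64.4.degree. C. and 65.4-68.4.degree. C. ranges of the example.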

[0153] In particular embodiments of the present invention, stringent conditions may be defined as those under which DNA molecules with more than 25% sequence variation (also termed "mismatch") will not hybridize. In a more particular embodiment, stringent conditions are those under which DNA molecules with more than 15% mismatch will not hybridize, and more preferably still, stringent conditions are those under which DNA sequences with more than 10% mismatch will not hybridize. In another embodiment, stringent conditions are those under which DNA sequences with more than 6% mismatch will not hybridize.

[0154] The degeneracy of the genetic code further widens the scope of the present invention as it enables major variations in the nucleotide sequence of a DNA molecule while maintaining the amino acid sequence of the encoded protein. For example, the sixteenth amino acid residue of the DLC-1 protein is alanine. This is encoded in the DLC-1 cDNA by the nucleotide codon triplet GCC. Because of the degeneracy of the genetic code, three other nucleotide codon triplets, GCT, GCG and GCA, also code for alanine. Thus, the nucleotide sequence of the DLC-1 cDNA could be changed at this position to any of these three codons without affecting the amino acid composition of the encoded protein or the characteristics of the protein. The genetic code and variations in nucleotide codons for particular amino acids are presented in Tables 1 and 2. Based upon the degeneracy of the genetic code, variant DNA molecules may be derived from the cDNA molecules disclosed herein using standard DNA mutagenesis techniques as described above, or by synthesis of DNA sequences. DNA sequences which do not hybridize under stringent conditions to the cDNA sequences disclosed by virtue of sequence variation based on the degeneracy of the genetic code are herein also comprehended by this invention.
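The alanine example, and the codon totals of Table 2, may be checked with a short Python sketch of the standard genetic code. The names `CODON_TABLE` and `synonymous_codons` are hypothetical; the table is built in T, C, A, G order for each codon position, matching the layout of Table 1.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code as one-letter amino acids, ordered TCAG x TCAG x TCAG
# ("*" marks a termination codon).
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid as `codon`."""
    codon = codon.upper()
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items()
                  if aa == amino_acid and c != codon)


print(synonymous_codons("GCC"))  # alanine: ['GCA', 'GCG', 'GCT']

# Tally of codons per amino acid, for comparison with Table 2.
codons_per_aa = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
print(sum(codons_per_aa.values()), len(CODON_TABLE))  # 61 sense codons of 64
```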

[0155] The invention also includes DNA sequences that are substantially identical to any of the DNA sequences disclosed herein, where substantially identical means a sequence that has identical nucleotides in at least 75% of the aligned nucleotides, for example 80%, 85%, 90%, 95% or 98% identity of the aligned sequences.
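By way of illustration only, the 75% identity criterion may be sketched in Python. The function names are hypothetical, the two aligned fragments are invented placeholders, and the sketch assumes that the sequences have already been aligned and that columns containing a gap are not counted as aligned nucleotides.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of aligned nucleotide pairs that are identical.

    Both inputs must already be aligned to equal length. Assumption: columns
    in which either sequence has a gap are not counted as aligned nucleotides.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(aligned_a.upper(), aligned_b.upper())
             if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


def substantially_identical(aligned_a, aligned_b, threshold=75.0):
    """True if identity over the aligned nucleotides meets the threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Hypothetical aligned fragments (not taken from the DLC-1 sequences).
ref = "ATGGCCTTAGCAGGCTCAAT"
var = "ATGGCATTAGCTGGCTC-AT"
print(round(percent_identity(ref, var), 1), substantially_identical(ref, var))
```

Different alignment programs treat gaps differently, so the gap convention should be fixed before identity figures are compared.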

TABLE 1
The Genetic Code

First Position                Second Position                Third Position
(5' end)        T            C      A            G          (3' end)
T               Phe          Ser    Tyr          Cys        T
                Phe          Ser    Tyr          Cys        C
                Leu          Ser    Stop (och)   Stop       A
                Leu          Ser    Stop (amb)   Trp        G
C               Leu          Pro    His          Arg        T
                Leu          Pro    His          Arg        C
                Leu          Pro    Gln          Arg        A
                Leu          Pro    Gln          Arg        G
A               Ile          Thr    Asn          Ser        T
                Ile          Thr    Asn          Ser        C
                Ile          Thr    Lys          Arg        A
                Met          Thr    Lys          Arg        G
G               Val          Ala    Asp          Gly        T
                Val          Ala    Asp          Gly        C
                Val          Ala    Glu          Gly        A
                Val (Met)    Ala    Glu          Gly        G

"Stop (och)" stands for the ochre termination triplet, and "Stop (amb)" for the amber. ATG is the most common initiator codon; GTG usually codes for valine, but it can also code for methionine to initiate an mRNA chain.

TABLE 2
The Degeneracy of the Genetic Code

Number of Synonymous Codons    Amino Acid                                     Total Number of Codons
6                              Leu, Ser, Arg                                  18
4                              Gly, Pro, Ala, Val, Thr                        20
3                              Ile                                            3
2                              Phe, Tyr, Cys, His, Gln, Glu, Asn, Asp, Lys    18
1                              Met, Trp                                       2

Total number of codons for amino acids     61
Number of codons for termination           3
Total number of codons in genetic code     64

[0156] One skilled in the art will recognize that the DNA mutagenesis techniques described above may be used not only to produce variant DNA molecules, but will also facilitate the production of proteins which differ in certain structural aspects from the DLC-1 protein, yet which proteins are clearly derivative of this protein and which maintain the essential characteristics of the DLC-1 protein. Newly derived proteins may also be selected in order to obtain variations on the characteristic of the DLC-1 protein, as will be more fully described below. Such derivatives include those with variations in amino acid sequence including minor deletions, additions and substitutions.

[0157] While the site for introducing an amino acid sequence variation is predetermined, the mutation per se need not be predetermined. For example, in order to optimize the performance of a mutation at a given site, random mutagenesis may be conducted at the target codon or region and the expressed protein variants screened for the optimal combination of desired activity. Techniques for making substitution mutations at predetermined sites in DNA having a known sequence as described above are well known.

[0158] Amino acid substitutions are typically of single residues; insertions usually will be on the order of about from 1 to 10 amino acid residues; and deletions will range about from 1 to 30 residues. Deletions or insertions preferably are made in adjacent pairs, i.e., a deletion of 2 residues or insertion of 2 residues. Substitutions, deletions, insertions or any combination thereof may be combined to arrive at a final construct. Obviously, the mutations that are made in the DNA encoding the protein must not place the sequence out of reading frame and preferably will not create complementary regions that could produce secondary mRNA structure.

[0159] Substitutional variants are those in which at least one residue in the amino acid sequence has been removed and a different residue inserted in its place. Such substitutions generally are made in accordance with the following Table 3 when it is desired to finely modulate the characteristics of the protein. Table 3 shows amino acids which may be substituted for an original amino acid in a protein and which are regarded as conservative substitutions.

TABLE 3

Original Residue    Conservative Substitutions
Ala                 ser
Arg                 lys
Asn                 gln; his
Asp                 glu
Cys                 ser
Gln                 asn
Glu                 asp
Gly                 pro
His                 asn; gln
Ile                 leu; val
Leu                 ile; val
Lys                 arg; gln; glu
Met                 leu; ile
Phe                 met; leu; tyr
Ser                 thr
Thr                 ser
Trp                 tyr
Tyr                 trp; phe
Val                 ile; leu
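By way of illustration only, Table 3 may be encoded as a lookup so that a proposed substitution can be checked against it. The names `CONSERVATIVE` and `is_conservative` are hypothetical. The lookup is directional, as in the table: it lists Ser as a substitute for Ala but not Ala as a substitute for Ser.

```python
# Table 3 as a mapping: original residue -> residues regarded as conservative.
CONSERVATIVE = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Cys": {"Ser"}, "Gln": {"Asn"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"}, "Met": {"Leu", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"}, "Ser": {"Thr"}, "Thr": {"Ser"},
    "Trp": {"Tyr"}, "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def is_conservative(original, replacement):
    """True if Table 3 lists `replacement` as conservative for `original`.

    Residues are three-letter codes; letter case is ignored.
    """
    original, replacement = original.capitalize(), replacement.capitalize()
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ala", "ser"))  # True: listed in Table 3
print(is_conservative("Ala", "Gly"))  # False: not listed
print(is_conservative("Pro", "Ala"))  # False: Pro has no entry in Table 3
```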

[0160] Substantial changes in function or immunological identity are made by selecting substitutions that are less conservative than those in Table 3, i.e., selecting residues that differ more significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. The substitutions which in general are expected to produce the greatest changes in protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.

[0161] The effects of these amino acid substitutions or deletions or additions may be assessed for derivatives of the DLC-1 protein by assays in which DNA molecules encoding the derivative proteins are transfected into DLC-1 cells using routine procedures.

[0162] The DLC-1 gene, DLC-1 cDNA, DNA molecules derived therefrom and the protein encoded by the cDNA and derivatives thereof may be utilized in aspects of both the study of HCC and for diagnostic and therapeutic applications related to HCC. Utilities of the present invention include, but are not limited to, those utilities described in the examples presented herein. Those skilled in the art will recognize that the utilities herein described are not limited to the specific experimental modes and materials presented and will appreciate the wider potential utility of this invention.

Example 11

Expression of DLC-1 cDNA Sequences

[0163] With the provision of the DLC-1 cDNA (Seq. I.D. No. 1), the expression and purification of the DLC-1 protein by standard laboratory techniques is now enabled. The purified protein may be used for functional analyses, antibody production, diagnostics and patient therapy. Furthermore, the DNA sequence of the DLC-1 cDNA can be manipulated in studies to understand the expression of the gene and the function of its product. Mutant forms of the DLC-1 gene may be isolated based upon information contained herein, and may be studied in order to detect alteration in expression patterns in terms of relative quantities, tissue specificity and functional properties of the encoded mutant DLC-1 protein. Partial or full-length cDNA sequences, which encode the subject protein, may be ligated into bacterial expression vectors. Methods for expressing large amounts of protein from a cloned gene introduced into Escherichia coli (E. coli) may be utilized for the purification, localization and functional analysis of proteins. For example, fusion proteins consisting of amino terminal peptides encoded by a portion of the E. coli lacZ or trpE gene linked to DLC-1 proteins may be used to prepare polyclonal and monoclonal antibodies against these proteins. Thereafter, these antibodies may be used to purify proteins by immunoaffinity chromatography, in diagnostic assays to quantitate the levels of protein and to localize proteins in tissues and individual cells by immunofluorescence.

[0164] Intact native protein may also be produced in E. coli in large amounts for functional studies. Methods and plasmid vectors for producing fusion proteins and intact native proteins in bacteria are described in Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989, ch. 17) herein incorporated by reference. Such fusion proteins may be made in large amounts, are easy to purify, and can be used to elicit antibody response. Native proteins can be produced in bacteria by placing a strong, regulated promoter and an efficient ribosome binding site upstream of the cloned gene. If low levels of protein are produced, additional steps may be taken to increase protein production; if high levels of protein are produced, purification is relatively easy. Suitable methods are presented in Sambrook et al. (Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989) and are well known in the art. Often, proteins expressed at high levels are found in insoluble inclusion bodies. Methods for extracting proteins from these aggregates are described by Sambrook et al. (In Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989, ch. 17). Vector systems suitable for the expression of lacZ fusion genes include the pUR series of vectors (Ruther and Muller-Hill, EMBO J. 2:1791, 1983), pEX1-3 (Stanley and Luzio, EMBO J. 3:1429, 1984) and pMR100 (Gray et al., Proc. Natl. Acad. Sci. USA 79:6598, 1982). Vectors suitable for the production of intact native proteins include pKC30 (Shimatake and Rosenberg, Nature 292:128, 1981), pKK177-3 (Amann and Brosius, Gene 40:183, 1985) and pET-3 (Studier and Moffatt, J. Mol. Biol. 189:113, 1986). DLC-1 fusion proteins may be isolated from protein gels, lyophilized, ground into a powder and used as an antigen.
The DNA sequence can also be transferred from its existing context in pREP4 to other cloning vehicles, such as other plasmids, bacteriophages, cosmids, animal viruses and yeast artificial chromosomes (YACs) (Burke et al., Science 236:806-12, 1987). These vectors may then be introduced into a variety of hosts including somatic cells, and simple or complex organisms, such as bacteria, fungi (Timberlake and Marshall, Science 244:1313-7, 1989), invertebrates, plants (Gasser and Fraley, Science 244:1293, 1989), and pigs (Pursel et al., Science 244:1281-8, 1989), which cell or organisms are rendered transgenic by the introduction of the heterologous DLC-1 cDNA.

[0165] For expression in mammalian cells, the cDNA sequence may be ligated to heterologous promoters, such as the simian virus 40 (SV40) promoter in the pSV2 vector (Mulligan and Berg, Proc. Natl. Acad. Sci. USA 78:2072-6, 1981), and introduced into cells, such as monkey COS-1 cells (Gluzman, Cell 23:175-182, 1981), to achieve transient or long-term expression. The stable integration of the chimeric gene construct may be maintained in mammalian cells by biochemical selection, such as neomycin (Southern and Berg, J. Mol. Appl. Genet. 1:327-41, 1982) and mycophenolic acid (Mulligan and Berg, Proc. Natl. Acad. Sci. USA 78:2072-6, 1981).

[0166] DNA sequences can be manipulated with standard procedures such as restriction enzyme digestion, fill-in with DNA polymerase, deletion by exonuclease, extension by terminal deoxynucleotide transferase, ligation of synthetic or cloned DNA sequences, site-directed sequence-alteration via single-stranded bacteriophage intermediate or with the use of specific oligonucleotides in combination with PCR.

[0167] The cDNA sequence (or portions derived from it) or a mini gene (a cDNA with an intron and its own promoter) may be introduced into eukaryotic expression vectors by conventional techniques. These vectors are designed to permit the transcription of the cDNA in eukaryotic cells by providing regulatory sequences that initiate and enhance the transcription of the cDNA and ensure its proper splicing and polyadenylation. Vectors containing the promoter and enhancer regions of the SV40 or long terminal repeat (LTR) of the Rous Sarcoma virus and polyadenylation and splicing signal from SV40 are readily available (Mulligan and Berg, Proc. Natl. Acad. Sci. USA 78:2072-6, 1981; Gorman et al., Proc. Natl. Acad. Sci USA 78:6777-6781, 1982). The level of expression of the cDNA can be manipulated with this type of vector, either by using promoters that have different activities (for example, the baculovirus pAC373 can express cDNAs at high levels in S. frugiperda cells (Summers and Smith, In: Genetically Altered Viruses and the Environment, Fields et al. (Eds.) 22:319-328, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1985)) or by using vectors that contain promoters amenable to modulation, for example, the glucocorticoid-responsive promoter from the mouse mammary tumor virus (Lee et al., Nature 294:228, 1982). The expression of the cDNA can be monitored in the recipient cells 24 to 72 hours after introduction (transient expression).

[0168] In addition, some vectors contain selectable markers such as the gpt (Mulligan and Berg, Proc. Natl. Acad. Sci. USA 78:2072-6, 1981) or neo (Southern and Berg, J. Mol. Appl. Genet. 1:327-41, 1982) bacterial genes. These selectable markers permit selection of transfected cells that exhibit stable, long-term expression of the vectors (and therefore the cDNA). The vectors can be maintained in the cells as episomal, freely replicating entities by using regulatory elements of viruses such as papilloma (Sarver et al., Mol. Cell Biol. 1:486, 1981) or Epstein-Barr (Sugden et al., Mol. Cell Biol. 5:410, 1985). Alternatively, one can also produce cell lines that have integrated the vector into genomic DNA. Both of these types of cell lines produce the gene product on a continuous basis. One can also produce cell lines that have amplified the number of copies of the vector (and therefore of the cDNA as well) to create cell lines that can produce high levels of the gene product (Alt et al., J. Biol. Chem. 253:1357, 1978).

[0169] The transfer of DNA into eukaryotic cells, in particular human or other mammalian cells, is now a conventional technique. The vectors are introduced into the recipient cells as pure DNA (transfection) by, for example, precipitation with calcium phosphate (Graham and van der Eb, Virology 52:466, 1973) or strontium phosphate (Brash et al., Mol. Cell Biol. 7:2013, 1987), electroporation (Neumann et al., EMBO J 1:841, 1982), lipofection (Felgner et al., Proc. Natl. Acad. Sci USA 84:7413, 1987), DEAE dextran (McCutchan et al., J. Natl Cancer Inst. 41:351, 1968), microinjection (Mueller et al., Cell 15:579, 1978), protoplast fusion (Schaffner, Proc. Natl. Acad. Sci. USA 77:2163-7, 1980), or pellet guns (Klein et al., Nature 327:70, 1987). Alternatively, the cDNA can be introduced by infection with virus vectors. Systems have been developed that use, for example, retroviruses (Bernstein et al., Gen. Engrg. 7:235, 1985), adenoviruses (Ahmad et al., J. Virol. 57:267, 1986), or Herpes virus (Spaete et al., Cell 30:295, 1982).

[0170] These eukaryotic expression systems can be used for studies of the DLC-1 gene and mutant forms of this gene, the DLC-1 protein and mutant forms of this protein. Such uses include, for example, the identification of regulatory elements located in the 5' region of the DLC-1 gene on genomic clones that can be isolated from human genomic DNA libraries using the information contained in the present invention. The eukaryotic expression systems may also be used to study the function of the normal complete protein, specific portions of the protein, or of naturally occurring or artificially produced mutant proteins.

[0171] Using the above techniques, the expression vectors containing the DLC-1 gene sequence or fragments or variants or mutants thereof can be introduced into human cells, mammalian cells from other species or non-mammalian cells as desired. The choice of cell is determined by the purpose of the treatment. For example, monkey COS cells (Gluzman, Cell 23:175-182, 1981) that produce high levels of the SV40 T antigen and permit the replication of vectors containing the SV40 origin of replication may be used. Similarly, Chinese hamster ovary (CHO), mouse NIH 3T3 fibroblasts or human fibroblasts or lymphoblasts (as described herein) may be used.

[0172] The following is provided as one exemplary method to express DLC-1 polypeptide from the cloned DLC-1 cDNA sequences in mammalian cells. Cloning vector pXT1, commercially available from Stratagene, contains the Long Terminal Repeats (LTRs) and a portion of the GAG gene from Moloney Murine Leukemia Virus. The position of the viral LTRs allows highly efficient, stable transfection of the region within the LTRs. The vector also contains the Herpes Simplex Thymidine Kinase promoter (TK), active in embryonal cells and in a wide variety of tissues in mice, and a selectable neomycin gene conferring G418 resistance. Two unique restriction sites BglII and XhoI are directly downstream from the TK promoter. DLC-1 cDNA, including the entire open reading frame for the DLC-1 protein and the 3' untranslated region of the cDNA is cloned into one of the two unique restriction sites downstream from the promoter.

[0173] The ligated product is transfected into mouse NIH 3T3 cells using Lipofectin (Life Technologies, Inc.) under conditions outlined in the product specification. Positive transfectants are selected after growing the transfected cells in 600 .mu.g/ml G418 (Sigma, St. Louis, Mo.). The protein is released into the supernatant and may be purified by standard immunoaffinity chromatography techniques using antibodies raised against the DLC-1 protein, as described below.

[0174] Expression of the DLC-1 protein in eukaryotic cells may also be used as a source of proteins to raise antibodies. The DLC-1 protein may be extracted following release of the protein into the supernatant as described above, or, the cDNA sequence may be incorporated into a eukaryotic expression vector and expressed as a chimeric protein with, for example, .beta.-globin. Antibody to .beta.-globin is thereafter used to purify the chimeric protein. Corresponding protease cleavage sites engineered between the .beta.-globin gene and the cDNA are then used to separate the two polypeptide fragments from one another after translation. One useful expression vector for generating .beta.-globin chimeric proteins is pSG5 (Stratagene). This vector encodes rabbit .beta.-globin.

[0175] The present invention thus encompasses recombinant vectors which comprise all or part of the DLC-1 gene or cDNA sequences, for expression in a suitable host. The DLC-1 DNA is operatively linked in the vector to an expression control sequence in the recombinant DNA molecule so that the DLC-1 polypeptide can be expressed. The expression control sequence may be selected from the group consisting of sequences that control the expression of genes of prokaryotic or eukaryotic cells and their viruses and combinations thereof. The expression control sequence may be specifically selected from the group consisting of the lac system, the trp system, the tac system, the trc system, major operator and promoter regions of phage lambda, the control region of fd coat protein, the early and late promoters of SV40, promoters derived from polyoma, adenovirus, retrovirus, baculovirus and simian virus, the promoter for 3-phosphoglycerate kinase, the promoters of yeast acid phosphatase, the promoter of the yeast alpha-mating factors and combinations thereof.

[0176] The host cell, which may be transfected with the vector of this invention, may be selected from the group consisting of E. coli, Pseudomonas, Bacillus subtilis, Bacillus stearothermophilus or other bacilli; other bacteria; yeast; fungi; insect; mouse or other animal; or plant hosts; or human tissue cells.

[0177] It is appreciated that for mutant or variant DLC-1 DNA sequences, similar systems are employed to express and produce the mutant product.

Example 12

Production of an Antibody to DLC-1 Protein

[0178] Monoclonal or polyclonal antibodies may be produced to either the normal DLC-1 protein or mutant forms of this protein. Optimally, antibodies raised against the DLC-1 protein would specifically detect the DLC-1 protein. That is, such antibodies would recognize and bind the DLC-1 protein and would not substantially recognize or bind to other proteins found in human cells. The determination that an antibody specifically detects the DLC-1 protein is made by any one of a number of standard immunoassay methods; for instance, the Western blotting technique (Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989). To determine that a given antibody preparation (such as one produced in a mouse) specifically detects the DLC-1 protein by Western blotting, total cellular protein is extracted from human cells (for example, lymphocytes) and electrophoresed on a sodium dodecyl sulfate-polyacrylamide gel. The proteins are then transferred to a membrane (for example, nitrocellulose) by Western blotting, and the antibody preparation is incubated with the membrane. After washing the membrane to remove non-specifically bound antibodies, the presence of specifically bound antibodies is detected by the use of an anti-mouse antibody conjugated to an enzyme such as alkaline phosphatase; application of the substrate 5-bromo-4-chloro-3-indolyl phosphate/nitro blue tetrazolium results in the production of a dense blue compound by immuno-localized alkaline phosphatase. Antibodies which specifically detect the DLC-1 protein will, by this technique, be shown to bind to the DLC-1 protein band (which will be localized at a given position on the gel determined by its molecular weight). Non-specific binding of the antibody to other proteins may occur and may be detectable as a weak signal on the Western blot. 
The non-specific nature of this binding will be recognized by one skilled in the art by the weak signal obtained on the Western blot relative to the strong primary signal arising from the specific antibody-DLC-1 protein binding.

[0179] Substantially pure DLC-1 protein suitable for use as an immunogen is isolated from transfected or transformed cells. Concentration of protein in the final preparation is adjusted, for example, by concentration on an Amicon filter device, to the level of a few micrograms per milliliter. Monoclonal or polyclonal antibody to the protein can then be prepared as follows:

Monoclonal Antibody Production by Hybridoma Fusion

[0180] Monoclonal antibody to epitopes of the DLC-1 protein identified and isolated as described can be prepared from murine hybridomas according to the classical method of Kohler and Milstein (Nature 256:495, 1975) or derivative methods thereof. Briefly, a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody-producing cells of the spleen isolated. The spleen cells are fused by means of polyethylene glycol with mouse myeloma cells, and the excess unfused cells destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted and aliquots of the dilution placed in wells of a microtiter plate where growth of the culture is continued. Antibody-producing clones are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures, such as ELISA, as originally described by Engvall (Meth. Enzymol. 70:419, 1980), and derivative methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Harlow and Lane (Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, New York, 1988).

Polyclonal Antibody Production by Immunization

[0181] Polyclonal antiserum containing antibodies to heterogeneous epitopes of a single protein can be prepared by immunizing suitable animals with the expressed protein, which can be unmodified or modified to enhance immunogenicity. Effective polyclonal antibody production is affected by many factors related both to the antigen and the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and adjuvant. Also, host animals vary in response to site of inoculations and dose, with both inadequate and excessive doses of antigen resulting in low titer antisera. Small doses (ng level) of antigen administered at multiple intradermal sites appear to be most reliable. An effective immunization protocol for rabbits can be found in Vaitukaitis et al. (J. Clin. Endocrinol. Metab. 33:988-91, 1971).

[0182] Booster injections can be given at regular intervals, and antiserum harvested when the antibody titer thereof, as determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen, begins to fall. See, for example, Ouchterlony et al. (In Handbook of Experimental Immunology, Weir, D. (ed.) chapter 19, Blackwell, 1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 .mu.M). Affinity of the antisera for the antigen is determined by preparing competitive binding curves, as described, for example, by Fisher (Manual of Clinical Immunology, Ch. 42, 1980).

Antibodies Raised against Synthetic Peptides

[0183] A third approach to raising antibodies against the DLC-1 protein is to use synthetic peptides synthesized on a commercially available peptide synthesizer based upon the predicted amino acid sequence of the DLC-1 protein.

Antibodies Raised by Injection of DLC-1 Gene

[0184] Antibodies may be raised against the DLC-1 protein by subcutaneous injection of a DNA vector which expresses the DLC-1 protein into laboratory animals, such as mice. Delivery of the recombinant vector into the animals may be achieved using a hand-held form of the Biolistic system (Sanford et al., Particulate Sci. Technol. 5:27-37, 1987) as described by Tang et al. (Nature 356:152-4, 1992). Expression vectors suitable for this purpose may include those which express the DLC-1 gene under the transcriptional control of either the human .beta.-actin promoter or the cytomegalovirus (CMV) promoter.

[0185] Antibody preparations prepared according to these protocols are useful in quantitative immunoassays which determine concentrations of antigen-bearing substances in biological samples; they are also used semi-quantitatively or qualitatively to identify the presence of antigen in a biological sample.

Example 13

DNA-Based Diagnosis

[0186] One major application of the DLC-1 sequence information presented herein is in the area of genetic testing for predisposition to HCC, BC, PC and/or CRC owing to DLC-1 deletion or mutation. The sequence of the DLC-1 gene, including intron-exon boundaries, is also useful in such diagnostic methods. Individuals carrying mutations in the DLC-1 gene, or having heterozygous or homozygous deletions of the DLC-1 gene, may be detected at the DNA level with the use of a variety of techniques. For such a diagnostic procedure, a biological sample from the subject, containing either DNA or RNA derived from the subject, is assayed for a mutated or deleted DLC-1 gene. Suitable biological samples include samples containing genomic DNA or RNA obtained from body cells, such as those present in peripheral blood, urine, saliva, tissue biopsy, surgical specimen, amniocentesis samples and autopsy material. The detection in the biological sample of either a mutant DLC-1 gene, a mutant DLC-1 RNA, or a homozygously or heterozygously deleted DLC-1 gene, may be performed by a number of methodologies, as outlined below.

[0187] A preferred embodiment of such detection techniques is reverse transcription of RNA isolated from lymphocytes, followed by polymerase chain reaction amplification (RT-PCR) and direct DNA sequence determination of the products. The presence of one or more nucleotide differences between the obtained sequence and the cDNA sequences, especially differences in the ORF portion of the nucleotide sequence, is taken as indicative of a potential DLC-1 gene mutation.
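By way of illustration only, the comparison described above can be sketched in Python. The sketch assumes the sequenced product has already been aligned to the reference cDNA so that both strings have the same length; the function name `find_differences` and the toy sequences are hypothetical and are not DLC-1 data. The ORF coordinates are passed in as parameters; for SEQ ID NO: 1 the sequence listing annotates the coding sequence at positions 325 to 3600.

```python
def find_differences(reference, sample, orf_start, orf_end):
    """Compare two pre-aligned, equal-length sequences position by position.

    orf_start and orf_end are 1-based, inclusive ORF coordinates in the
    reference. Returns one (position, ref_base, sample_base, in_orf)
    tuple per mismatch.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    differences = []
    pairs = zip(reference.lower(), sample.lower())
    for index, (ref_base, sample_base) in enumerate(pairs):
        if ref_base != sample_base:
            position = index + 1  # report 1-based positions
            in_orf = orf_start <= position <= orf_end
            differences.append((position, ref_base, sample_base, in_orf))
    return differences


# Toy data (assumption): a 20-nt reference with a hypothetical ORF at 5-16.
reference = "ccgtatgtgcagaaagtgat"
sample = "ccgtatgtgcggaaagtgat"

for position, ref_base, sample_base, in_orf in find_differences(reference, sample, 5, 16):
    region = "ORF" if in_orf else "outside ORF"
    print(f"position {position}: {ref_base}>{sample_base} ({region})")
```

A difference flagged inside the ORF would still need confirmation, for example by sequencing the opposite strand, before being treated as a potential mutation.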

[0188] Alternatively, DNA extracted from lymphocytes or other cells may be used directly for amplification. The direct amplification from genomic DNA would be appropriate for analysis of the entire DLC-1 gene including regulatory sequences located upstream and downstream from the open reading frame. Reviews of direct DNA diagnosis have been presented by Caskey (Science 236:1223-8, 1987) and by Landegren et al. (Science 242:229-37, 1988).

[0189] Further studies of DLC-1 genes isolated from DLC-1 patients may reveal particular mutations, or deletions, which occur at a high frequency within this population of individuals. In this case, rather than sequencing the entire DLC-1 gene, it may be possible to design DNA diagnostic methods to specifically detect the most common DLC-1 mutations or deletions.

[0190] The detection of specific DNA mutations may be achieved by methods such as hybridization using specific oligonucleotides (Wallace et al., Cold Spring Harbor Symp. Quant. Biol. 51:257-61, 1986), direct DNA sequencing (Church and Gilbert, Proc. Natl. Acad. Sci. USA 81:1991-5, 1984), the use of restriction enzymes (Flavell et al., Cell 15:25, 1978; Geever et al., Proc. Natl. Acad. Sci USA 78:5081, 1981), discrimination on the basis of electrophoretic mobility in gels with denaturing reagent (Myers and Maniatis, Cold Spring Harbor Symp. Quant. Biol. 51:275-84, 1986), RNase protection (Myers et al., Science 230:1242, 1985), chemical cleavage (Cotton et al., Proc. Natl. Acad. Sci. USA 85:4397-401, 1988), and the ligase-mediated detection procedure (Landegren et al., Science 241:1077, 1988).

[0191] Oligonucleotides specific to normal or mutant sequences are chemically synthesized using commercially available machines, labeled radioactively with isotopes (such as ³²P) or non-radioactively, with tags such as biotin (Ward and Langer et al., Proc. Natl. Acad. Sci. USA 78:6633-57, 1981), and hybridized to individual DNA samples immobilized on membranes or other solid supports by dot-blot or transfer from gels after electrophoresis. The presence of these specific sequences is visualized by methods such as autoradiography or fluorometric (Landegren, et al., Science 242:229-37, 1988) or colorimetric reactions (Gebeyehu et al., Nucleic Acids Res. 15:4513-34, 1987). The absence of hybridization would indicate a mutation in the particular region of the gene, or a deleted DLC-1 gene.
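The logic of reading out a pair of normal-specific and mutant-specific oligonucleotides can be sketched as follows. This is a deliberately simplified model that treats hybridization as an exact string match to either strand of the sample; real hybridization depends on temperature, salt and probe length, none of which is modeled. The probe and sample sequences and the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    return sequence.lower().translate(COMPLEMENT)[::-1]


def probe_hybridizes(probe, sample):
    """Exact-match model: the probe pairs perfectly with one strand of the sample."""
    probe = probe.lower()
    sample = sample.lower()
    return probe in sample or reverse_complement(probe) in sample


def hybridization_signals(sample, normal_probe, mutant_probe):
    """Report which of the two allele-specific probes gives a signal."""
    return {
        "normal": probe_hybridizes(normal_probe, sample),
        "mutant": probe_hybridizes(mutant_probe, sample),
    }


# Hypothetical 17-mer probes differing at a single base.
normal_probe = "gaagccaaggaagcttg"
mutant_probe = "gaagccaacgaagcttg"

# Hypothetical samples: normal, point mutant, and one lacking the region.
normal_sample = "caaattgaagccaaggaagcttgtgattgg"
mutant_sample = "caaattgaagccaacgaagcttgtgattgg"
deleted_sample = "ggctctgtgaactggaggacgggaagc"

for name, sample in [("normal", normal_sample),
                     ("mutant", mutant_sample),
                     ("deleted", deleted_sample)]:
    print(name, hybridization_signals(sample, normal_probe, mutant_probe))
```

In this model a sample that gives no signal with either probe corresponds to the case in which the probed region is mutated elsewhere or the gene is deleted.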

[0192] Sequence differences between normal and mutant forms of the DLC-1 gene may also be revealed by the direct DNA sequencing method of Church and Gilbert (Proc. Natl. Acad. Sci. USA 81:1991-5, 1984). Cloned DNA segments may be used as probes to detect specific DNA segments. The sensitivity of this method is greatly enhanced when combined with PCR (Wrischnik et al., Nucleic Acids Res. 15:529-42, 1987; Wong et al., Nature 330:384-386, 1987; Stoflet et al., Science 239:491-4, 1988). In this approach, a sequencing primer which lies within the amplified sequence is used with double-stranded PCR product or single-stranded template generated by a modified PCR. The sequence determination is performed by conventional procedures with radiolabeled nucleotides or by automatic sequencing procedures with fluorescent tags.

[0193] Sequence alterations may occasionally generate fortuitous restriction enzyme recognition sites or may eliminate existing restriction sites. Changes in restriction sites are revealed by the use of appropriate enzyme digestion followed by conventional gel-blot hybridization (Southern, J. Mol. Biol. 98:503, 1975). DNA fragments carrying the site (either normal or mutant) are detected by their reduction in size or by an increase in the number of corresponding restriction fragments. Genomic DNA samples may also be amplified by PCR prior to treatment with the appropriate restriction enzyme; fragments of different sizes are then visualized under UV light in the presence of ethidium bromide after gel electrophoresis.
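The effect of a gained or lost restriction site on fragment sizes can be illustrated with a short in-silico digest. The sketch below is an assumption-laden example: it uses EcoRI (recognition site GAATTC, cut after the first G) purely as a familiar enzyme, the amplicon sequences are invented, and the function name `digest` is hypothetical. Nothing here implies that an EcoRI site is diagnostic for any DLC-1 mutation.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths after cutting a linear sequence at every site.

    cut_offset is the number of bases of the recognition site that stay
    on the upstream fragment (1 for EcoRI, which cuts G^AATTC).
    """
    sequence = sequence.lower()
    site = site.lower()
    cut_positions = []
    start = sequence.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = sequence.find(site, start + 1)
    boundaries = [0] + cut_positions + [len(sequence)]
    return [end - begin for begin, end in zip(boundaries, boundaries[1:])]


# Invented 26-bp amplicons: the "mutant" carries a single-base change
# (gaattc -> gaactc) that eliminates the site.
normal_amplicon = "acgtacgtgaattcacgtacgtacgt"
mutant_amplicon = "acgtacgtgaactcacgtacgtacgt"

print("normal fragments:", digest(normal_amplicon, "gaattc", 1))
print("mutant fragments:", digest(mutant_amplicon, "gaattc", 1))
```

The normal amplicon is cut into two fragments while the altered amplicon stays intact, which is the size difference that would be read off the gel.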

[0194] Genetic testing based on DNA sequence differences may be achieved by detection of alteration in electrophoretic mobility of DNA fragments in gels with or without denaturing reagent. Small sequence deletions and insertions can be visualized by high-resolution gel electrophoresis. For example, a PCR product with small deletions is clearly distinguishable from a normal sequence on an 8% non-denaturing polyacrylamide gel (WO 91/10734; Nagamine et al., Am. J. Hum. Genet. 45:337-9, 1989). DNA fragments of different sequence compositions may be distinguished on denaturing formamide gradient gels in which the mobilities of different DNA fragments are retarded in the gel at different positions according to their specific "partial-melting" temperatures (Myers et al., Science 230:1242, 1985). Alternatively, a method of detecting a mutation comprising a single base substitution or other small change could be based on differential primer length in a PCR. For example, an invariant primer could be used in addition to a primer specific for a mutation. The PCR products of the normal and mutant genes can then be differentially detected in acrylamide gels.
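The size difference that makes a small deletion visible on a high-resolution gel can be estimated by computing the expected PCR product length for each template. The following is a minimal sketch that assumes exact primer matches on a linear template; the template, the primers and the function name `amplicon_length` are hypothetical, and mismatch tolerance and annealing conditions are not modeled.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    return sequence.lower().translate(COMPLEMENT)[::-1]


def amplicon_length(template, forward_primer, reverse_primer):
    """Expected product length, or None if either primer has no exact match.

    The forward primer is searched on the given strand; the reverse primer
    anneals to that strand, so its reverse complement is searched for
    downstream of the forward primer site.
    """
    template = template.lower()
    forward_start = template.find(forward_primer.lower())
    if forward_start == -1:
        return None
    reverse_site = reverse_complement(reverse_primer)
    reverse_start = template.find(reverse_site, forward_start)
    if reverse_start == -1:
        return None
    return reverse_start + len(reverse_site) - forward_start


# Invented 40-nt template and a copy with a 3-bp deletion between the primers.
normal_template = "ggatgtgcagaaagaagccggacaccatgatcctaacacc"
deleted_template = normal_template[:14] + normal_template[17:]

forward_primer = "atgtgcag"
reverse_primer = "gtgttagg"

print("normal product:", amplicon_length(normal_template, forward_primer, reverse_primer))
print("deleted product:", amplicon_length(deleted_template, forward_primer, reverse_primer))
```

A returned value of None corresponds to the case in which no product is expected, for example when a primer binding site itself is deleted.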

[0195] In addition to conventional gel-electrophoresis and blot-hybridization methods, DNA fragments may also be visualized by methods where the individual DNA samples are not immobilized on membranes. The probe and target sequences may be both in solution, or the probe sequence may be immobilized (Saiki et al., Proc. Nat. Acad. Sci. USA 86:6230-4, 1989). A variety of detection methods, such as autoradiography involving radioisotopes, direct detection of radioactive decay (in the presence or absence of scintillant), spectrophotometry involving colorigenic reactions and fluorometry involving fluorogenic reactions, may be used to identify specific individual genotypes.

[0196] If more than one mutation is frequently encountered in the DLC-1 gene, a system capable of detecting such multiple mutations would be desirable. For example, a PCR with multiple, specific oligonucleotide primers and hybridization probes may be used to identify all possible mutations at the same time (Chamberlain et al., Nucl. Acids Res. 16:1141-55, 1988). The procedure may involve immobilized sequence-specific oligonucleotide probes (Saiki et al., Proc. Nat. Acad. Sci. USA 86:6230-4, 1989).

[0197] The following Example describes one method by which deletions of the DLC-1 gene may be detected.

Example 14

Two Step Assay to Detect the Presence of DLC-1 Gene in a Sample

[0198] A patient liver, breast, prostate and/or colorectal tissue sample is processed according to the method disclosed by Antonarakis, et al. (New Eng. J. Med. 313:842-848, 1985), separated through a 1% agarose gel and transferred to a nylon membrane for Southern blot analysis. Membranes are UV cross-linked at 150 mJ using a GS Gene Linker (Bio-Rad). A DLC-1 probe is subcloned into pTZ18U. The phagemids are transformed into E. coli MV 1190 infected with M13KO7 helper phage (Bio-Rad, Richmond, Calif.). Single-stranded DNA is isolated according to standard procedures (see Sambrook, et al. Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989).

[0199] Blots are prehybridized for 15-30 min at 65°C in 7% sodium dodecyl sulfate (SDS) in 0.5M NaPO₄. The methods follow those described by Nguyen, et al. (BioTechniques 13:116-123, 1992). The blots are hybridized overnight at 65°C in 7% SDS, 0.5M NaPO₄ with 25-50 ng/ml single-stranded probe DNA. Post-hybridization washes consist of two 30 min washes in 5% SDS, 40 mM NaPO₄ at 65°C, followed by two 30 min washes in 1% SDS, 40 mM NaPO₄ at 65°C.

[0200] Next the blots are rinsed with phosphate buffered saline (pH 6.8) for 5 min at room temperature and incubated with 0.2% casein in PBS for 5 min. The blots are then preincubated for 5-10 min in a shaking water bath at 45°C with hybridization buffer consisting of 6M urea, 0.3M NaCl, and 5× Denhardt's solution (see Sambrook, et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor, N.Y., 1989). The buffer is removed and replaced with 50-75 µl/cm² fresh hybridization buffer plus 2.5 nM of the covalently cross-linked oligonucleotide sequence complementary to the universal primer site (UP-AP, Bio-Rad). The blots are hybridized for 20-30 min at 45°C, followed by post-hybridization washes at 45°C: two 10 min washes in 6 M urea, 1× standard saline citrate (SSC), 0.1% SDS and one 10 min wash in 1× SSC, 0.1% Triton® X-100. The blots are rinsed for 10 min at room temperature with 1× SSC.
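As a worked example of the quantities in this step, the buffer volume and the amount of labeled oligonucleotide can be computed from the blot area. The sketch below takes the 50-75 µl/cm² range and the 2.5 nM probe concentration from the protocol above; the 10 cm × 10 cm blot and the function names are assumptions made only for illustration.

```python
def buffer_volume_ml(blot_area_cm2, microliters_per_cm2):
    """Hybridization buffer volume in ml for a blot of the given area."""
    return blot_area_cm2 * microliters_per_cm2 / 1000.0


def probe_amount_pmol(concentration_nm, volume_ml):
    """Amount of probe in pmol: (nmol/L) x (mL) = pmol."""
    return concentration_nm * volume_ml


# Hypothetical blot dimensions (assumption): 10 cm x 10 cm.
blot_area_cm2 = 10.0 * 10.0

low_volume = buffer_volume_ml(blot_area_cm2, 50.0)
high_volume = buffer_volume_ml(blot_area_cm2, 75.0)

print(f"buffer volume: {low_volume:.1f}-{high_volume:.1f} ml")
print(f"probe at 2.5 nM: {probe_amount_pmol(2.5, low_volume):.2f}-"
      f"{probe_amount_pmol(2.5, high_volume):.2f} pmol")
```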

[0201] Blots are incubated for 10 min at room temperature with shaking in the substrate buffer consisting of 0.1M diethanolamine, 1 mM MgCl₂, 0.02% sodium azide, pH 10.0. Individual blots are placed in heat-sealable bags with substrate buffer and 0.2 mM AMPPD (3-(2'-spiroadamantane)-4-methoxy-4-(3'-phosphoryloxy)phenyl-1,2-dioxetane, disodium salt, Bio-Rad). After a 20 min incubation at room temperature with shaking, the excess AMPPD solution is removed. The blot is exposed to X-ray film overnight. Positive bands indicate the presence of the DLC-1 gene. Patient samples which show no hybridizing bands lack the DLC-1 gene, indicating the possibility of ongoing cancer, or an enhanced susceptibility to developing cancer in the future.

Example 15

Quantitation of DLC-1 Protein

[0202] An alternative method of diagnosing DLC-1 gene deletion or mutation is to quantitate the level of DLC-1 protein in the cells of an individual. This diagnostic tool would be useful for detecting reduced levels of the DLC-1 protein which result from, for example, mutations in the promoter regions of the DLC-1 gene or mutations within the coding region of the gene which produce truncated, non-functional polypeptides, as well as from deletions of the entire DLC-1 gene. The determination of reduced DLC-1 protein levels would be an alternative or supplemental approach to the direct determination of DLC-1 gene deletion or mutation status by the methods outlined above. The availability of antibodies specific to the DLC-1 protein will facilitate the quantitation of cellular DLC-1 protein by one of a number of immunoassay methods which are well known in the art and are presented in Harlow and Lane (Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, New York, 1988).

[0203] For the purposes of quantitating the DLC-1 protein, a biological sample of the subject, which sample includes cellular proteins, is required. Such a biological sample may be obtained from body cells, such as those present in peripheral blood, urine, saliva, tissue biopsy, amniocentesis samples, surgical specimens and autopsy material, particularly liver cells. Quantitation of DLC-1 protein is achieved by immunoassay and compared to levels of the protein found in healthy cells. A significant (e.g., 50% or greater) reduction in the amount of DLC-1 protein in the cells of a subject compared to the amount of DLC-1 protein found in normal human cells would be taken as an indication that the subject may have deletions or mutations in the DLC-1 gene locus.
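The comparison against normal cells can be expressed as a simple threshold test. The sketch below is illustrative only: it uses the 50% figure given above as the default cutoff, while the function names and the example readings (in arbitrary immunoassay units) are hypothetical. A positive result in this model indicates only that the subject may carry a deletion or mutation, as stated above.

```python
def percent_reduction(sample_level, normal_level):
    """Reduction of the sample relative to the normal reference, in percent."""
    if normal_level <= 0:
        raise ValueError("normal_level must be positive")
    return 100.0 * (normal_level - sample_level) / normal_level


def suggests_dlc1_alteration(sample_level, normal_level, threshold_percent=50.0):
    """True when the reduction meets or exceeds the chosen threshold."""
    return percent_reduction(sample_level, normal_level) >= threshold_percent


# Hypothetical immunoassay readings in arbitrary units.
normal_level = 100.0
for sample_level in (80.0, 50.0, 10.0):
    reduction = percent_reduction(sample_level, normal_level)
    flagged = suggests_dlc1_alteration(sample_level, normal_level)
    print(f"sample {sample_level:.0f}: {reduction:.0f}% reduction, flagged={flagged}")
```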

Example 16

Gene Therapy

[0204] A new gene therapy approach for patients affected by DLC-1 deletions or mutations is now made possible by the present invention. Essentially, liver cells may be removed from a patient having deletions or mutations of the DLC-1 gene, and then transfected with an expression vector containing the DLC-1 cDNA. These transfected liver cells will thereby produce functional DLC-1 protein and can be reintroduced into the patient. In addition to liver cells, breast, colorectal, prostate, or other cells may be used, depending on the cancer of interest.

[0205] The scientific and medical procedures required for human cell transfection are now routine procedures. The provision herein of DLC-1 cDNAs now allows the development of human gene therapy based upon these procedures. Immunotherapy of melanoma patients using genetically engineered tumor-infiltrating lymphocytes (TILs) has been reported by Rosenberg et al. (N. Engl. J. Med. 323:570-8, 1990). In that study, a retrovirus vector was used to introduce a gene for neomycin resistance into TILs. A similar approach may be used to introduce the DLC-1 cDNA into patients affected by DLC-1 deletions or mutations.

[0206] Retroviruses have been considered the preferred vector for experiments in gene therapy, with a high efficiency of infection and stable integration and expression (Orkin et al., Prog. Med. Genet. 7:130, 1988). The full length DLC-1 gene or cDNA can be cloned into a retroviral vector and driven from either its endogenous promoter or from the retroviral LTR (long terminal repeat). Other viral transfection systems may also be utilized for this type of approach, including Adeno-Associated virus (AAV) (McLaughlin et al., J. Virol. 62:1963, 1988), Vaccinia virus (Moss et al., Annu. Rev. Immunol. 5:305, 1987), Bovine Papilloma virus (Rasmussen et al., Methods Enzymol. 139:642, 1987) or members of the herpesvirus group such as Epstein-Barr virus (Margolskee et al., Mol. Cell. Biol. 8:2837-47, 1988). Recent developments in gene therapy techniques include the use of RNA-DNA hybrid oligonucleotides, as described by Cole-Strauss, et al. (Science 273:1386-9, 1996). This technique may allow for site-specific integration of cloned sequences, permitting accurately targeted gene replacement.

[0207] Having illustrated and described the principles of isolating the human DLC-1 cDNA and its corresponding genomic genes, the protein and modes of use of these biological molecules, it should be apparent to one skilled in the art that the invention can be modified in arrangement and detail without departing from such principles. We claim all modifications coming within the spirit and scope of the claims presented herein.

Sequence CWU 1

1

3113850DNAHomo sapiensCDS(325)..(3600) 1tatgggctcg agcggccgcc cgggcaggtg cccgagcgag ggcgcttcgc tcccagccag 60gacatggccg cacctctccg catcaggagc gccggctcac ggacttctcg cccaactccc 120tgagcgctcc ctcgtttcga tctttagaaa accccgcttt ctttctgggg ccgtgacgag 180gggcagggag cggcgagcaa ggatgcgttg aggaccgcga gggcgcgcgt ctcgggtgcc 240gccgtgggtc ccgacgcgga agccgagccg cctccgcctg cctcgacttc cccacagcgc 300ttccgccgcc gcctgccgtg cttg atg tgc aga aag aag ccg gac acc atg 351Met Cys Arg Lys Lys Pro Asp Thr Met1 5atc cta aca caa att gaa gcc aag gaa gct tgt gat tgg cta cgg gca 399Ile Leu Thr Gln Ile Glu Ala Lys Glu Ala Cys Asp Trp Leu Arg Ala10 15 20 25act ggt ttc ccc cag tat gca cag ctt tat gaa gat ttc ctg ttc ccc 447Thr Gly Phe Pro Gln Tyr Ala Gln Leu Tyr Glu Asp Phe Leu Phe Pro30 35 40atc gat att tcc ttg gtc aag aga gag cat gat ttt ttg gac aga gat 495Ile Asp Ile Ser Leu Val Lys Arg Glu His Asp Phe Leu Asp Arg Asp45 50 55gcc att gag gct cta tgc agg cgt cta aat act tta aac aaa tgt gcg 543Ala Ile Glu Ala Leu Cys Arg Arg Leu Asn Thr Leu Asn Lys Cys Ala60 65 70gtg atg aag cta gaa att agt cct cat cgg aaa cga agt gac gat tca 591Val Met Lys Leu Glu Ile Ser Pro His Arg Lys Arg Ser Asp Asp Ser75 80 85gac gag gat gag cct tgt gcc atc agt ggc aaa tgg act ttc caa agg 639Asp Glu Asp Glu Pro Cys Ala Ile Ser Gly Lys Trp Thr Phe Gln Arg90 95 100 105gac agc aag agg tgg tcc cgg ctt gaa gag ttt gat gtc ttt tct cca 687Asp Ser Lys Arg Trp Ser Arg Leu Glu Glu Phe Asp Val Phe Ser Pro110 115 120aaa caa gac ctg gtc cct ggg tcc cca gac gac tcc cac ccg aag gac 735Lys Gln Asp Leu Val Pro Gly Ser Pro Asp Asp Ser His Pro Lys Asp125 130 135ggc ccc agc ccc gga ggc acg ctg atg gac ctc agc gag cgc cag gag 783Gly Pro Ser Pro Gly Gly Thr Leu Met Asp Leu Ser Glu Arg Gln Glu140 145 150gtg tct tcc gtc cgc agc ctc agc agc act ggc agc ctc ccc agc cac 831Val Ser Ser Val Arg Ser Leu Ser Ser Thr Gly Ser Leu Pro Ser His155 160 165gcg ccc ccc agc gag gat gct gcc acc ccc cgg act aac tcc gtc atc 879Ala Pro Pro Ser Glu Asp Ala Ala Thr Pro Arg 
Thr Asn Ser Val Ile170 175 180 185agc gtt tgc tcc tcc agc aac ttg gca ggc aat gac gac tct ttc ggc 927Ser Val Cys Ser Ser Ser Asn Leu Ala Gly Asn Asp Asp Ser Phe Gly190 195 200agc ctg ccc tct ccc aag gaa ctg tcc agc ttc agc ttc agc atg aaa 975Ser Leu Pro Ser Pro Lys Glu Leu Ser Ser Phe Ser Phe Ser Met Lys205 210 215ggc cac gaa aaa act gcc aag tcc aag acg cgc agt ctg ctg aaa cgg 1023Gly His Glu Lys Thr Ala Lys Ser Lys Thr Arg Ser Leu Leu Lys Arg220 225 230atg gag agc ctg aag ctc aag agc tcc cat cac agc aag cac aaa gcg 1071Met Glu Ser Leu Lys Leu Lys Ser Ser His His Ser Lys His Lys Ala235 240 245ccc tca aag ctg ggg ttg atc atc agc ggg ccc atc ttg caa gag ggg 1119Pro Ser Lys Leu Gly Leu Ile Ile Ser Gly Pro Ile Leu Gln Glu Gly250 255 260 265atg gat gag gag aag ctg aag cag ctc agc tgc gtg gag atc tcc gcc 1167Met Asp Glu Glu Lys Leu Lys Gln Leu Ser Cys Val Glu Ile Ser Ala270 275 280ctc aat ggc aac cgc atc aac gtc ccc atg gta cga aag agg agc gtt 1215Leu Asn Gly Asn Arg Ile Asn Val Pro Met Val Arg Lys Arg Ser Val285 290 295tcc aac tcc acg cag acc agc agc agc agc agc cag tcg gag acc agc 1263Ser Asn Ser Thr Gln Thr Ser Ser Ser Ser Ser Gln Ser Glu Thr Ser300 305 310agc gcg gtc agc acg ccc agc cct gtt acg agg acc cgg agc ctc agt 1311Ser Ala Val Ser Thr Pro Ser Pro Val Thr Arg Thr Arg Ser Leu Ser315 320 325gcg tgc aac aag cgg gtg ggc atg tac tta gag ggc ttc gat cct ttc 1359Ala Cys Asn Lys Arg Val Gly Met Tyr Leu Glu Gly Phe Asp Pro Phe330 335 340 345aat cag tca aca ttt aac aac gtg gtg gag cag aac ttt aag aac cgc 1407Asn Gln Ser Thr Phe Asn Asn Val Val Glu Gln Asn Phe Lys Asn Arg350 355 360gag agc tac cca gag gac acg gtg ttc tac atc cct gaa gat cac aag 1455Glu Ser Tyr Pro Glu Asp Thr Val Phe Tyr Ile Pro Glu Asp His Lys365 370 375cct ggc act ttc ccc aaa gct ctc acc aat ggc agt ttc tcc ccc tcg 1503Pro Gly Thr Phe Pro Lys Ala Leu Thr Asn Gly Ser Phe Ser Pro Ser380 385 390ggg aat aac ggc tct gtg aac tgg agg acg gga agc ttc cac ggc cct 1551Gly Asn Asn Gly Ser Val Asn Trp Arg Thr Gly 
Ser Phe His Gly Pro395 400 405ggc cac atc agc ctc agg agg gaa aac agt agc gac agc ccc aag gaa 1599Gly His Ile Ser Leu Arg Arg Glu Asn Ser Ser Asp Ser Pro Lys Glu410 415 420 425ctg aag aga cgc aat tct tcc agc tcc atg agc agc cgc ctg agc atc 1647Leu Lys Arg Arg Asn Ser Ser Ser Ser Met Ser Ser Arg Leu Ser Ile430 435 440tac gac aac gtg ccg ggc tcc atc ctc tac tcc agt tca ggg gac ctg 1695Tyr Asp Asn Val Pro Gly Ser Ile Leu Tyr Ser Ser Ser Gly Asp Leu445 450 455gcg gat ctg gag aac gag gac atc ttc ccc gag ctg gac gac atc ctc 1743Ala Asp Leu Glu Asn Glu Asp Ile Phe Pro Glu Leu Asp Asp Ile Leu460 465 470tac cac gtg aag ggg atg cag cgg ata gtc aat cag tgg tcg gag aag 1791Tyr His Val Lys Gly Met Gln Arg Ile Val Asn Gln Trp Ser Glu Lys475 480 485ttt tct gat gag gga gat tcg gac tca gcc ctg gac tcg gtc tct ccc 1839Phe Ser Asp Glu Gly Asp Ser Asp Ser Ala Leu Asp Ser Val Ser Pro490 495 500 505tgc ccg tcc tct cca aaa cag ata cac ctg gat gtg gac aac gac cga 1887Cys Pro Ser Ser Pro Lys Gln Ile His Leu Asp Val Asp Asn Asp Arg510 515 520acc aca ccc agc gac ctg gac agc aca ggc aac tcc ctg aat gaa ccg 1935Thr Thr Pro Ser Asp Leu Asp Ser Thr Gly Asn Ser Leu Asn Glu Pro525 530 535gaa gag ccc tcc gag atc ccg gaa aga agg gat tct ggg gtt ggg gct 1983Glu Glu Pro Ser Glu Ile Pro Glu Arg Arg Asp Ser Gly Val Gly Ala540 545 550tcc cta acc agg tcc aac agg cac cga ctg aga tgg cac agt ttc cag 2031Ser Leu Thr Arg Ser Asn Arg His Arg Leu Arg Trp His Ser Phe Gln555 560 565agc tca cat cgg cca agc ctc aac tct gta tca cta cag att aac tgc 2079Ser Ser His Arg Pro Ser Leu Asn Ser Val Ser Leu Gln Ile Asn Cys570 575 580 585cag tct gtg gcc cag atg aac ctg ctg cag aaa tac tca ctc cta aag 2127Gln Ser Val Ala Gln Met Asn Leu Leu Gln Lys Tyr Ser Leu Leu Lys590 595 600cta acg gcc ctg ctg gag aaa tac aca cct tct aac aag cat ggt ttt 2175Leu Thr Ala Leu Leu Glu Lys Tyr Thr Pro Ser Asn Lys His Gly Phe605 610 615agc tgg gcc gtg ccc aag ttc atg aag agg atc aag gtt cca gac tac 2223Ser Trp Ala Val Pro Lys Phe Met Lys Arg Ile 
Lys Val Pro Asp Tyr620 625 630aag gac cgg agt gtg ttt ggg gtc cca ctg acg gtc aac gtg cag cgc 2271Lys Asp Arg Ser Val Phe Gly Val Pro Leu Thr Val Asn Val Gln Arg635 640 645aca gga caa ccg ttg cct cag agc atc cag cag gcc atg cga tac ctc 2319Thr Gly Gln Pro Leu Pro Gln Ser Ile Gln Gln Ala Met Arg Tyr Leu650 655 660 665cgg aac cat tgt ttg gat cag gtt ggg ctc ttc aaa aaa tcg ggg gtc 2367Arg Asn His Cys Leu Asp Gln Val Gly Leu Phe Lys Lys Ser Gly Val670 675 680aag tcc cgg att cag gct ctg cgc cag atg aat gaa ggt gcc ata gac 2415Lys Ser Arg Ile Gln Ala Leu Arg Gln Met Asn Glu Gly Ala Ile Asp685 690 695tgt gtc aac tac gaa gga cag tct gct tat gac gtg gca gac atg ctg 2463Cys Val Asn Tyr Glu Gly Gln Ser Ala Tyr Asp Val Ala Asp Met Leu700 705 710aag cag tat ttt cga gat ctt cct gag cca cta atg acg aac aaa ctc 2511Lys Gln Tyr Phe Arg Asp Leu Pro Glu Pro Leu Met Thr Asn Lys Leu715 720 725tcg gaa acc ttt cta cag atc tac caa tat gtg ccc aag gac cag cgc 2559Ser Glu Thr Phe Leu Gln Ile Tyr Gln Tyr Val Pro Lys Asp Gln Arg730 735 740 745ctg cag gcc atc aag gct gcc atc atg ctg ctg cct gac gag aac cgg 2607Leu Gln Ala Ile Lys Ala Ala Ile Met Leu Leu Pro Asp Glu Asn Arg750 755 760gtg gtt ctg cag acc ctg ctt tat ttc ctg tgc gat gtc aca gca gcc 2655Val Val Leu Gln Thr Leu Leu Tyr Phe Leu Cys Asp Val Thr Ala Ala765 770 775gta aaa gaa aac cag atg acc cca acc aac ctg gcc gtg tgc tta gcg 2703Val Lys Glu Asn Gln Met Thr Pro Thr Asn Leu Ala Val Cys Leu Ala780 785 790cct tcc ctc ttc cat ctc aac acc ctg aag aga gag aat tcc tct ccc 2751Pro Ser Leu Phe His Leu Asn Thr Leu Lys Arg Glu Asn Ser Ser Pro795 800 805agg gta atg caa aga aaa caa agt ttg ggc aaa cca gat cag aaa gat 2799Arg Val Met Gln Arg Lys Gln Ser Leu Gly Lys Pro Asp Gln Lys Asp810 815 820 825ttg aat gaa aac cta gct gcc act caa ggg ctg gcc cat atg atc gcc 2847Leu Asn Glu Asn Leu Ala Ala Thr Gln Gly Leu Ala His Met Ile Ala830 835 840gag tgc aag aag ctt ttc cag gtt ccc gag gaa atg agc cga tgt cgt 2895Glu Cys Lys Lys Leu Phe Gln Val Pro Glu Glu 
Met Ser Arg Cys Arg845 850 855aat tcc tat acc gaa caa gag ctg aag ccc ctc act ctg gaa gca ctc 2943Asn Ser Tyr Thr Glu Gln Glu Leu Lys Pro Leu Thr Leu Glu Ala Leu860 865 870ggg cac ctg ggt aat gat gac tca gct gac tac caa cac ttc ctc cag 2991Gly His Leu Gly Asn Asp Asp Ser Ala Asp Tyr Gln His Phe Leu Gln875 880 885gac tgt gtg gat ggc ctg ttt aaa gaa gtc aaa gag aag ttt aaa ggc 3039Asp Cys Val Asp Gly Leu Phe Lys Glu Val Lys Glu Lys Phe Lys Gly890 895 900 905tgg gtc agc tac tcc act tcg gag cag gct gag ctg tcc tat aag aag 3087Trp Val Ser Tyr Ser Thr Ser Glu Gln Ala Glu Leu Ser Tyr Lys Lys910 915 920gtg agc gaa gga ccc cgt ctg agg ctt tgg agg tca gtc att gaa gtc 3135Val Ser Glu Gly Pro Arg Leu Arg Leu Trp Arg Ser Val Ile Glu Val925 930 935cct gct gtg cca gag gaa atc tta aag cgc cta ctt aaa gaa cag cac 3183Pro Ala Val Pro Glu Glu Ile Leu Lys Arg Leu Leu Lys Glu Gln His940 945 950ctc tgg gat gta gac ctg ttg gat tca aaa gtg atc gaa att ctg gac 3231Leu Trp Asp Val Asp Leu Leu Asp Ser Lys Val Ile Glu Ile Leu Asp955 960 965agc caa act gaa att tac cag tat gtc caa aac agt atg gca cct cat 3279Ser Gln Thr Glu Ile Tyr Gln Tyr Val Gln Asn Ser Met Ala Pro His970 975 980 985cct gct cga gac tac gtt gtt tta aga acc tgg agg act aat tta ccc 3327Pro Ala Arg Asp Tyr Val Val Leu Arg Thr Trp Arg Thr Asn Leu Pro990 995 1000aaa gga gcc tgt gcc ctt tta cta acc tct gtg gat cac gat cgc gca 3375Lys Gly Ala Cys Ala Leu Leu Leu Thr Ser Val Asp His Asp Arg Ala1005 1010 1015cct gtg gtg ggt gtg agg gtt aat gtg ctc ttg tcc agg tat ttg att 3423Pro Val Val Gly Val Arg Val Asn Val Leu Leu Ser Arg Tyr Leu Ile1020 1025 1030gaa ccc tgt ggg cca gga aaa tcc aaa ctc acc tac atg tgc aga gtt 3471Glu Pro Cys Gly Pro Gly Lys Ser Lys Leu Thr Tyr Met Cys Arg Val1035 1040 1045gac tta agg ggc cac atg cca gaa tgg tac aca aaa tct ttt gga cat 3519Asp Leu Arg Gly His Met Pro Glu Trp Tyr Thr Lys Ser Phe Gly His1050 1055 1060 1065ttg tgt gca gct gaa gtt gta aag atc cgg gat tcc ttc agt aac cag 3567Leu Cys Ala Ala Glu Val Val 
Lys Ile Arg Asp Ser Phe Ser Asn Gln1070 1075 1080aac act gaa acc aaa gac acc aaa tct agg tga tcactgaagc aacgcaaccg 3620Asn Thr Glu Thr Lys Asp Thr Lys Ser Arg1085 1090cttccaccac catggtgttt gtttttagaa gttttgccag tccttgaaga atgggttctg 3680tgtgtaatcc tgaaacaaag aaaactacaa gctggagtgt aggaattgac tatagcaatt 3740tgatacattt ttaaagctgc ttcctgtttg ttgagggtct gtattcatag accttgactg 3800gaatatgtaa gactgtgcga aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 385021091PRTHomo sapiens 2Met Cys Arg Lys Lys Pro Asp Thr Met Ile Leu Thr Gln Ile Glu Ala1 5 10 15Lys Glu Ala Cys Asp Trp Leu Arg Ala Thr Gly Phe Pro Gln Tyr Ala20 25 30Gln Leu Tyr Glu Asp Phe Leu Phe Pro Ile Asp Ile Ser Leu Val Lys35 40 45Arg Glu His Asp Phe Leu Asp Arg Asp Ala Ile Glu Ala Leu Cys Arg50 55 60Arg Leu Asn Thr Leu Asn Lys Cys Ala Val Met Lys Leu Glu Ile Ser65 70 75 80Pro His Arg Lys Arg Ser Asp Asp Ser Asp Glu Asp Glu Pro Cys Ala85 90 95Ile Ser Gly Lys Trp Thr Phe Gln Arg Asp Ser Lys Arg Trp Ser Arg100 105 110Leu Glu Glu Phe Asp Val Phe Ser Pro Lys Gln Asp Leu Val Pro Gly115 120 125Ser Pro Asp Asp Ser His Pro Lys Asp Gly Pro Ser Pro Gly Gly Thr130 135 140Leu Met Asp Leu Ser Glu Arg Gln Glu Val Ser Ser Val Arg Ser Leu145 150 155 160Ser Ser Thr Gly Ser Leu Pro Ser His Ala Pro Pro Ser Glu Asp Ala165 170 175Ala Thr Pro Arg Thr Asn Ser Val Ile Ser Val Cys Ser Ser Ser Asn180 185 190Leu Ala Gly Asn Asp Asp Ser Phe Gly Ser Leu Pro Ser Pro Lys Glu195 200 205Leu Ser Ser Phe Ser Phe Ser Met Lys Gly His Glu Lys Thr Ala Lys210 215 220Ser Lys Thr Arg Ser Leu Leu Lys Arg Met Glu Ser Leu Lys Leu Lys225 230 235 240Ser Ser His His Ser Lys His Lys Ala Pro Ser Lys Leu Gly Leu Ile245 250 255Ile Ser Gly Pro Ile Leu Gln Glu Gly Met Asp Glu Glu Lys Leu Lys260 265 270Gln Leu Ser Cys Val Glu Ile Ser Ala Leu Asn Gly Asn Arg Ile Asn275 280 285Val Pro Met Val Arg Lys Arg Ser Val Ser Asn Ser Thr Gln Thr Ser290 295 300Ser Ser Ser Ser Gln Ser Glu Thr Ser Ser Ala Val Ser Thr Pro Ser305 310 315 320Pro Val Thr Arg Thr Arg Ser Leu Ser Ala Cys Asn Lys Arg Val 
Gly325 330 335Met Tyr Leu Glu Gly Phe Asp Pro Phe Asn Gln Ser Thr Phe Asn Asn340 345 350Val Val Glu Gln Asn Phe Lys Asn Arg Glu Ser Tyr Pro Glu Asp Thr355 360 365Val Phe Tyr Ile Pro Glu Asp His Lys Pro Gly Thr Phe Pro Lys Ala370 375 380Leu Thr Asn Gly Ser Phe Ser Pro Ser Gly Asn Asn Gly Ser Val Asn385 390 395 400Trp Arg Thr Gly Ser Phe His Gly Pro Gly His Ile Ser Leu Arg Arg405 410 415Glu Asn Ser Ser Asp Ser Pro Lys Glu Leu Lys Arg Arg Asn Ser Ser420 425 430Ser Ser Met Ser Ser Arg Leu Ser Ile Tyr Asp Asn Val Pro Gly Ser435 440 445Ile Leu Tyr Ser Ser Ser Gly Asp Leu Ala Asp Leu Glu Asn Glu Asp450 455 460Ile Phe Pro Glu Leu Asp Asp Ile Leu Tyr His Val Lys Gly Met Gln465 470 475 480Arg Ile Val Asn Gln Trp Ser Glu Lys Phe Ser Asp Glu Gly Asp Ser485 490 495Asp Ser Ala Leu Asp Ser Val Ser Pro Cys Pro Ser Ser Pro Lys Gln500 505 510Ile His Leu Asp Val Asp Asn Asp Arg Thr Thr Pro Ser Asp Leu Asp515 520 525Ser Thr Gly Asn Ser Leu Asn Glu Pro Glu Glu Pro Ser Glu Ile Pro530 535 540Glu Arg Arg Asp Ser Gly Val Gly Ala Ser Leu Thr Arg Ser Asn Arg545 550 555 560His Arg Leu Arg Trp His Ser Phe Gln Ser Ser His Arg Pro Ser Leu565 570 575Asn Ser Val Ser Leu Gln Ile Asn Cys Gln Ser Val Ala Gln Met Asn580 585 590Leu Leu Gln Lys Tyr Ser Leu Leu Lys Leu Thr Ala Leu Leu Glu Lys595 600 605Tyr Thr Pro Ser Asn Lys His Gly Phe Ser Trp Ala Val Pro Lys Phe610 615 620Met Lys Arg Ile Lys Val Pro Asp Tyr Lys Asp Arg Ser Val Phe Gly625 630 635 640Val Pro Leu Thr Val Asn Val Gln Arg Thr Gly Gln Pro Leu Pro Gln645 650 655Ser Ile Gln Gln Ala Met Arg Tyr Leu Arg Asn His Cys Leu Asp Gln660 665 670Val Gly Leu Phe Lys Lys Ser Gly Val Lys Ser Arg Ile Gln Ala Leu675 680 685Arg Gln Met Asn Glu Gly Ala Ile Asp Cys Val Asn Tyr Glu Gly Gln690 695 700Ser Ala Tyr Asp Val Ala Asp Met Leu Lys Gln Tyr Phe Arg Asp Leu705 710 715 720Pro Glu Pro Leu Met Thr Asn Lys Leu Ser Glu Thr Phe Leu Gln Ile725 730 735Tyr Gln Tyr Val

Pro Lys Asp Gln Arg Leu Gln Ala Ile Lys Ala Ala740 745 750Ile Met Leu Leu Pro Asp Glu Asn Arg Val Val Leu Gln Thr Leu Leu755 760 765Tyr Phe Leu Cys Asp Val Thr Ala Ala Val Lys Glu Asn Gln Met Thr770 775 780Pro Thr Asn Leu Ala Val Cys Leu Ala Pro Ser Leu Phe His Leu Asn785 790 795 800Thr Leu Lys Arg Glu Asn Ser Ser Pro Arg Val Met Gln Arg Lys Gln805 810 815Ser Leu Gly Lys Pro Asp Gln Lys Asp Leu Asn Glu Asn Leu Ala Ala820 825 830Thr Gln Gly Leu Ala His Met Ile Ala Glu Cys Lys Lys Leu Phe Gln835 840 845Val Pro Glu Glu Met Ser Arg Cys Arg Asn Ser Tyr Thr Glu Gln Glu850 855 860Leu Lys Pro Leu Thr Leu Glu Ala Leu Gly His Leu Gly Asn Asp Asp865 870 875 880Ser Ala Asp Tyr Gln His Phe Leu Gln Asp Cys Val Asp Gly Leu Phe885 890 895Lys Glu Val Lys Glu Lys Phe Lys Gly Trp Val Ser Tyr Ser Thr Ser900 905 910Glu Gln Ala Glu Leu Ser Tyr Lys Lys Val Ser Glu Gly Pro Arg Leu915 920 925Arg Leu Trp Arg Ser Val Ile Glu Val Pro Ala Val Pro Glu Glu Ile930 935 940Leu Lys Arg Leu Leu Lys Glu Gln His Leu Trp Asp Val Asp Leu Leu945 950 955 960Asp Ser Lys Val Ile Glu Ile Leu Asp Ser Gln Thr Glu Ile Tyr Gln965 970 975Tyr Val Gln Asn Ser Met Ala Pro His Pro Ala Arg Asp Tyr Val Val980 985 990Leu Arg Thr Trp Arg Thr Asn Leu Pro Lys Gly Ala Cys Ala Leu Leu995 1000 1005Leu Thr Ser Val Asp His Asp Arg Ala Pro Val Val Gly Val Arg Val1010 1015 1020Asn Val Leu Leu Ser Arg Tyr Leu Ile Glu Pro Cys Gly Pro Gly Lys1025 1030 1035 1040Ser Lys Leu Thr Tyr Met Cys Arg Val Asp Leu Arg Gly His Met Pro1045 1050 1055Glu Trp Tyr Thr Lys Ser Phe Gly His Leu Cys Ala Ala Glu Val Val1060 1065 1070Lys Ile Arg Asp Ser Phe Ser Asn Gln Asn Thr Glu Thr Lys Asp Thr1075 1080 1085Lys Ser Arg1090321DNAArtificial SequenceDescription of Artificial Sequence PCR primer 3tatgggctcg agcggccgcc c 21421DNAArtificial SequenceDescription of Artificial Sequence PCR primer 4cgcacagtct tacatattcc a 21524DNAArtificial SequenceDescription of Artificial Sequence PCR primer 5atgtgcagaa agaagccgga cacc 24624DNAArtificial SequenceDescription of 
Artificial Sequence PCR primer 6cctagatttg gtgtctttgg tttc 24721DNAArtificial SequenceDescription of Artificial Sequence PCR primer 7gacaccacca tctctgtgct c 21821DNAArtificial SequenceDescription of Artificial Sequence PCR primer 8gcagactgtc cttcgtagtt g 21926DNAArtificial SequenceDescription of Artificial Sequence primer 9cactccggtc cttgtagtct ggaacc 261025DNAArtificial SequenceDescription of Artificial Sequence primer 10atcctcttca tgaactcggg cacgg 251127DNAArtificial SequenceDescription of Artificial Sequence primer 11gatcaaggtt ctagactaca aggaccg 2712691DNAArtificial SequenceDescription of Artificial Sequence probe 12ccngcaganc tcgaaaatat gcttcggcat gtctgccacg tcataagcag actgtccttc 60gtagttgaca cagtctatgg caccctcatt catctggcgc agagcctgaa tccgggactt 120gacccccgat tttctgaaga gcccaacctg tcggaagagc aacactaagt gtggggtaca 180ttcacgtgga cgcagtgttt acaccacaca actagaagaa gctgcatgta atccgagctc 240ccctgagtac gtgaacccgc aggcagcgct ctcacctgat ccaaacaatg gttccggggg 300tatcgcacgg cctgctggat gctctcaggc aacggttgtc ctgtgcgctg cacgttgacc 360gtcaggggac cccaaacaca ctccggtcct tgtagtctgg aaccttgatc ctcttcatga 420actcgggcac ggccctgtta aagagcacag agatggtggt gtcggcggan acatgctcac 480ttgtctgtct acacttgtcc aattctgcag gcaaaccctg tgggctccag attctgatat 540catccaatga attcgagctc gtaccgggga tcctctaaaa tccaacttgc aggcattcca 600gcttcagctg ctccaatttc tatatgttcc cctaaatcgt atttttttga aacataaggt 660tattttttta attgtaccnc gttcctaacn a 69113301DNAArtificial SequenceDescription of Artificial Sequence probe 13gaggctctat gcaggcgtct aaatacttta aacaaatgtg cggtgatgaa gctagaaatt 60agtcctcatc ggaaacgaag tgacgattca gacgaggatg agccttgtgc catcagtggc 120aaatggactt tccaaaggga cagcaagagg tggtcccggc ttgaagagtt tgatgtcttt 180tctccaaaac aagacctggt ccctgggtcc ccagacgact cccacccgaa ggacggcccc 240agccccggag gcacgctgat ggacctcagc gagcgccagg aggtgtcttc cgtccgcagc 300c 301143006DNAHomo sapiensmisc_feature(1)..(3006)n represents a or g or c or t/u 14cnggcagatc tcgaanatac tgcttcggca tgtctgccac gtcataagca gactgtcctt 60cgtagttgac 
acagtctatg gcaccytcat tcatctggcg cagagcctga atccgggact 120tgacccccga ttttctgaag agcccaacct gtcggaagag caacactaag tgtggggtac 180attcacgtgg acgcagtgtt tacaccacac aactagaaga agctgcatgt aatccgagct 240cccctgagta cgtggacccg caggcagcgc tctcacctga tccaaacaat ggttccggrg 300gtatcgcayg gcctgctgga tgctctgagg caacggttgt cctgtgcgct gcacgttgac 360cgtcagtggg accccaaaca cactccggtc cttgtagtct ggaaccttga tcctcttcat 420gaacttgggc acggcctgtt aaagaacaca gagatggtgg tgttggcgga gacatgctca 480cttgtctgtc tacacttgtc caattctgca ggcaaaccct gtgggctcca gatctgtgct 540aatacggtgg ctacttaaat ttaaattaaa caaaatgaca aattcagttc cccagtggta 600ctggccacac ttcaggtgct ccttcatctt ttgtgctcag tacctactgt attggcctgt 660gcagataaag aacattccta tcatccagac agttctcctg gacagtgctg ttctagatct 720tctaagagtg ggggttgaca ggtccgtttc ctcagttagg agcgtccttc caccttgaac 780ctggagaatt ggggtctaca gtcttaagga agctgatgga tttccttaca gaatggcggt 840ataggatgga acaagcagaa aacaacatgt aataccctaa ttaggtgcat ctgatagagt 900gtgaaaaaca aggtcccttt tgtcttgaaa aaagggtaag aatcacttct gagttcttga 960tgagatcgaa agcatttagg gtcaaaaggc gcagataaca catgatggga aaacagcaat 1020gagagcctaa cacaatggga gccaactcca gagctcaaca gtgaatgacc tgaagtcaaa 1080ataaaatctg ctgctgatga cccggagaac attacatctt taggtttcta aaggaagatg 1140gaaaaggaac aatgggggtt ttgtgagccg accccaggct ccctggtgtc ctgaaaccag 1200gtccacccca gcactatatg caacagcagg aaacccatgt catgcatttc aggctgtcaa 1260gcagaaattc cagctctcca aatgacctct ctgaacagga cccgaaaggg caaggccaaa 1320caggaaaaga accttgtgta ggattcctcc ctgctccaca gatcccacca tgtgaggctt 1380ttacagttgg ttttgagtca ctggaaacac tgaccagaac acaagaagta ttatggactt 1440tcagattctt gagggtttgg tggggatggg ggtgggccac tccgaaatga gaatctaaaa 1500tatgcagttt taaatagcca gcagggaaaa cattactcta agcacagagg aactccagag 1560aagacagact gctttgcctt ttgaatgctc accagcagcc atggcatgtt actgtttata 1620gctccaggaa aggtaaaacg aaagagcaaa gttaagtttg tatttccata cagttaagtg 1680tgtggtatca tggctataag tgtgcataat actcgctttg tcgggggaga aaagcccgac 1740ggcggaatgt gaaaagaaca cattacgatc cccaccgaga atctgaagca tgtgaggata 
1800aaccggtcaa tacttatttc tgtcattcag aacaaacaac ttctgtattt agcaaggctc 1860acataataac agcctttgaa cgggagtgct ttgatgctga agttaaatct gctatgatcc 1920taaggagagg aggagctgga gacaaaaaga acagtttcct tgctttgccg actttctcaa 1980gcaacttggg tttgctacag agtgctacta atgaaatggg cggcttctcc atttttatca 2040aatatggtag tgtgcgactg gataataaac actcagatta ctgaaaagac ttaaggattc 2100ccagatgaca ctgaaaaatg cactgagatg tcaatctaga aacatttctc tgcttggcac 2160tgatagcaga aaaattaaga tgtacccaga ttaggtgata tccatgaccc atctagcctt 2220acagcctacc cctcacattc tatatactaa ggagctatat ttttcaaagt aattatgaac 2280aatttgtaca atgcatttca tctctacatt tgagtctata atatgttaga gtagtgaatt 2340ccttaaaata attattcact gttagacagt ctttgctaga aaaaaagtaa cctgaattct 2400ttagcacagg tggatgctac aaatayctgc mcrkscrrmy kywykakymy tattattatt 2460attattattt tttgagatag agtcttactc tgtcacccag gctggagtgc agtagcctta 2520tcttggggct cactgcaacc tccatcttct gggctcaagg gattctcatg cctcagcttc 2580ctgagtagat gggattacag gtgcatgcca ccacactcag ctaatttttg tatttttagt 2640agagaatggg gttcgccaat gttggccagc tggtctcaaa ctcctggcgt catgtgatcc 2700acctatgtca gattcccaag atgctgggat acaggcatga gccacacacc cgccccaaga 2760tgatttctaa aaacaggcat gaatacggta taagaacagg twctgtaant caagnaattc 2820caaganggtc tcaywawatc twatkgttgt ccttctcctc cayccagaaa tacratctgm 2880tactgtgcat acattwactg awagtggawk atyctawtat tattgggaan gancccctat 2940caccacntga ccctaagagt attgnatttt caccccntca tctggcgata tgacntgccc 3000gngggg 300615305DNAHomo sapiens 15tcaaaggcat gggaaatgat agattttatg catttgaact agcaaacaga tgtttctcat 60tttatttcca tgctttctaa cttaaataat tcatcagctt ttctttcttt tctctgatag 120gggccacatg ccagaatggt acacaaaatc ttttggacat ttgtgtgcag ctgaagttgt 180aaagatccgg gattccttca gtaaccagaa cactgaaacc aaagacacca aatctaggtg 240atcactgaag caacgcaacc gcttccacca ccatggtgtt tgtttctaga acttttgcca 300gtcct 30516466DNAHomo sapiensmisc_feature(1)..(466)n represents a or g or c or t/u 16tggattnccn tgncactgaa aaatacatcc tctttccagg tgagcgaagg accccctctg 60aggctttgga ggtcagtcat tgaagtccct gctgtgccag aggaaatctt 
aaagcgccta 120cttaaagaac agcacctctg ggatgtagac ctgttggatt caaaagtgat cgaaattctg 180gacagccaaa ctgaaattta ccagtatgtc caaaacagta tggcacctca tcctgctcga 240gactacgttg ttttaaggtg agcgcttccc agttgttttt ttgtgacaag gatgactcca 300tatatgaacc aagcctatat gtcactgatc ttacaagatg gtataattat ttaaagtaga 360ggccgggcat atggtggctc acacctgtaa tcccagcact ctgggaggcc aaggtgggag 420gatcacttga ggccagcagt tcaagaccag cctggntaat atagca 46617692DNAHomo sapiensmisc_feature(1)..(692)n represents a or g or c or t/u 17ccngcaganc tcgaaaatat gcttcggcat gtctgccacg tcataagcag actgtccttc 60gtagttgaca cagtctatgg caccctcatt catctggcgc agancctgaa tccgggactt 120gacccccgat tttctgaaga gcccaacctg tcggaagagc aacactaagt gtggggtaca 180ttcacgtgga cgcagtgttt acaccacaca actagaagaa gctgcatgta atccgagctc 240ccctgagtac gtgaacccgc aggcagcgct ctcacctgat ccaaacaatg gttccggggg 300tatcgcacgg cctgctggat gctctgaggc aacggttgtc ctgtgcgctg cacgttgacc 360gtcaggggac cccaaacaca ctccggtcct tgtagtctgg aaccttgatc ctcttcatga 420actcgggcac ggccctgtta aagagcacag agatggtggt gtcggcggan acatgctcac 480ttgtctgtct acacttgtcc aattctgcag gcaaaccctg tgggctccag attctgatat 540catccaatga attcnanctc ngtaccgggg atcctctaaa atccaacttg caggcattcc 600agcttcagct gctccaattt ctatatgttc ccctaaatcn tatttttttg aaacataagg 660ttattttttt aattgtaccn cgttcctaac na 69218315DNAHomo sapiensmisc_feature(314)n represents a or g or c or t/u 18tttcgtgtga ggggcttagc tcttgttcgg tataggaatt acgacatcgg ctcatttcct 60cgggaacctg tgcggaacat gacagacaga aaggaggtga gtccacctgt actcaatctc 120aatgcccatc agtggaaaag actgggtagg aacaatggcc tggtccttaa agcagtgcag 180gcatcttccc gccggaggtg ggctatcatg ctgaccgcac gtgttatcac gaggatatga 240acagatcacc tccataaatg tatctgaaat cttatttcca tgtaaggtct ttggaaagtt 300agagtagggg gagnc 31519281DNAHomo sapiensmisc_feature(1)..(281)n represents a or g or c or t/u 19ctcnngactg tgtggatggc ctgtttaaag aagtcaaaga gaagtttaaa ggctgggtca 60ngctactcca cttcggagca ggctgagctg tcctataaga aggtaaggct tcaccctgtt 120gtcggctagt tgagtccagg agtcgaagct tgggtccatc agagataaca cgcttttgcc 
180aactaatctg tctggggatc tgtagcccac aacctccctt gtagagctgg gcaccggggt 240gagtaagatc cccgtggtga gagtggaaac cgnncaaagc a 281201713DNAMus musculusmisc_feature(1)..(1713)n represents a or g or c or t/u 20ttgaacgctt gggtaccggg ccccccctcg aggtcgacgg tatcgataag cttgatatcg 60aattcctgca gcccggggga tccaatccct gggtccccag acaactctcg tttgcaaagc 120gccacaagcc acgaaagcat gctgacagac ctcagcgagc accaggaggt ggcctctgtc 180cgaagcctca gcagcaccag cagcagcgtc cccacccacg cagcccacag tggagatgcc 240actacgcccc gaaccaattc cgtcatcagc gtctgctcct ccggacactt tgtaggcaac 300gatgactctt tttccagcct gccgtctccc aaggaactgt ccagcttcag ttttagcatg 360aaaggccacc acgagaagaa caccaagtcg aagacgcgga gcctgctcaa acgcatggag 420agcctgaagc tcaagggctc ccaccacagc aagcacaagg cgccttccaa gctggggttg 480atcatcagtg ctcccattct gcaggagggt atggatgagg cgaagctgaa gcagctgaac 540tgtgtggaga tctcagccct caatggcaac cacatcaacg tgcccatggt accggaaaag 600gagccgtgtc taacttcacc cagaccagca gcaagcagca gccaatcaga gaccagcagc 660gcggtcagca cacccagccc ggtcaccagg acccggagcc tcagcacctg taacaagcgg 720gtgggcatgt atctagaggg cttcgaccca ttcagtcagt ccaccttgaa caacgtgacg 780gagcagaact ataaaaaccg tgagagctac ccagaggaca cggtgttcta cattcccgaa 840gatcacaagc ccggcacctt ccctaaggcc ctctcccatg gcagtttctg tccctcggga 900aacagttctg tgaactggag gaccggaagc ttccatggcc ccggccatct cagcctacgg 960agagaaaaca gccatgacag tcctaaggag ctgaagagac gcaattcttc cagctctctg 1020agcagccgcc tgagcatcta tgataacgta ccgggttcta tcctgtactc cagctcggga 1080gaactggccg acctggagaa tgaggacatc ttccctgagc tggatgacat tctctaccac 1140gtgaagggga tgcagcggat agtcaaccag tggtccgaga agttttccga cgagggagac 1200tcggactcag ccctggactc tgtctctcct tgcccgtcat cttcaaaaca gattcacctg 1260gatgtggacc atgaccgaag gacacccagt gacctggaca gcacaggcaa ctccttcaat 1320gagcccgaag agcccactga tatccggaaa gaagagactt ccggggtggg ggctttcctt 1380gaccagtgca ataggtaagg gaaaggcgtt gctttctcgg atgcattcca aaaggtgggg 1440gaaattcaaa gaaaggggtc ttgctttggg tggggattgg agttctngat anttttgcca 1500agttccttgg aaaattcctt aggggaattg gatncccaac cngggaagaa cccccaaaca 
1560aatccccnaa cngggaaaaa ggnggttttt attnaaaacc tggggtnntt gaaacccttt 1620gggccattca aangggattn ccntacccag gtggggancc cttggaaana aangggtggg 1680tggttttgga aacnaatttt tagtcccngg gcc 1713214767DNAMus musculusmisc_feature(1)..(4767)n represents a or g or c or t/u 21cataccaagt gaggtgtaat tgtttaaacc aaaaagtttg aaggatatgg caaaagccag 60acttaaattt ccatttttcc tttttttttt tttttttaag ggaaattctt attcaatgtg 120taagtgctca ctatcatctc tggggaggca gagggagaaa aaaaatacct ggtaattcaa 180agccagtctg ggctacacag caagatcgtc cctcaaaaaa gtacttttta attaaaagag 240agaaattatt ccgaatccat agaaatagtc gttggagtat tgggaggtgg gaagcccaag 300gcccttgtcc atgtagtcac acataatggc agtggcttgg gctttcatag aagggcacac 360gtggggacct tcccttgtgg gctttctgac tcttcactta ctgcatatgc ctactgcaga 420gatttcctct ggactggagc actgggactt tctttctaaa aatataaagt tcagtaatga 480ccaacaatta tgattaggct agtaggcttt tgttcatttt taaaaattgt atgtgtgtga 540gtattttacc tcacgcatag tgtatgtacc gtgcctctga agacggaaga aaacattagc 600ttcccctgga actggagtta cagatggttg taagccacca tgtaagtgct gggatttgaa 660ctcaggtcca tctggaagaa cagccagtgc ggtacccact gagccatctc tccagccccc 720tgcccagtgt tcttaaagtg ttagtctacg gtagcagatg atttggtccc ttgaagaaat 780tctttcccct caatcttgct agcttgactg ataacctaaa cccattgagg aagctctgat 840cacgagcaag ctctactccg gactggaaga gtgttcagtg tgtctcaaag cacgtacttg 900tggtgttgta aaccgtgagc catgctgaga cgcctcttgt gaaatgtctt cccgtggctt 960caggaacatt tcagaccgct gttttccttt ggagttaaaa ctgactcctt ctaccaacac 1020gtggaaagaa ttgtgaacat cagctggtag ttgtcatatg aaaaaacaaa acaaaacaaa 1080acaaaaaact atgttgtctg tcactgtcat cttcagtatg tactttgtcc ccaaatcacc 1140atgacatgcc aaagccgtgt caagcattgc agagacattc taaccttgtt gctcttacta 1200ttcagtttaa aaagaagcaa gtaattgtgg gaaggtaggg gatgcttgga agaggacttt 1260gctatgtaga ccaaactggg ctagaactca acaatcctcc tgcctcagcc tcccatgtgc 1320tacatgcaac aaacaaggag cttaaacatt tttttttttt atgaatgcca ggaaaaccta 1380caggaatttg aagaattttt gtgggagcct ctgttttctt atttcttctt ctgtcttatt 1440ttaaatgcaa gaaggggcag acctccacct gctctccttt tatctgtgcg cctccagccc 
1500tagccccaac cttgtgctgc aaagctcttg aagcttcgac attgcacctt tggctccatc 1560tgtcttgaaa aacggaccca aggcacccaa gagataagac ctgcacattc ctgctgggcc 1620cttgccttgg gtggcggcgg ggtcagaatg cccaaggcca cagatggtta ctgatagcgc 1680tatctcggcc acctacttga acgatcctac ttcaggtcct cttggctggc ttttctatat 1740tttcttttct tttctgccat tgttaatact tgtttcacaa ccaactgtag aggttgctgt 1800ctttgggcac cagagccact gtgctttaat cctgggttct taggcaagat tcctaagctc 1860tctaagcccc gcccccatcc cctttcgtcc cttataaaat aaagataaat catagtatct 1920gtggcagaag gttgtcagag gactgaagac gagccagtgc agtgtcaccc aaagacagtg 1980gcagttcacc tagttagaac catattttaa ttcttggttg acagagcacg actgtatgta 2040tctatgtggt agcaagtgat gtttcaatgt ttgtgtgtaa ggtgaatgag tgaattatgg 2100gggttaacat atccgatagc ttataggttt atcatcttgt ctggagtata tgcaaattgg 2160ctattttaaa atataaaatt aatattaact atagtcaccc tggtatgcca agtcgccctg 2220cacttgctgc ctgcctcttt gcgactccct gtcccttccc aacttctggt gaccatcctt 2280ctgttccctt ctatgaaatg agtttcttct ctggtcagaa ctactatctt atgtccctag 2340tacccctccg gaaatctgag ggtcctgctc tttggagatc ctagagcatg cggatgggtg 2400aggggaaatc attgaaaaac cacagaaacc cagagaggaa gcggcacgcc cctagtctgg 2460tgccaccagc ataaaaagtt aaagttgact tttctcaaac caacctcctg ggtcttttgt 2520tgtttgactt aaactggcgt gtgtgaagtt actccacctc cccaagcccc ataggcctcc 2580atgcctagta aatttggtta taaacaccac tcagccatta aagccccaat gcagtccagt 2640ggagatttga ttacgggttc gattaatgaa tcccagacct aagactaact taaccattgc 2700tcactcttaa agccttgaaa aaaactgggg gagtgaaaca ttacatttgg ttgtgtcctt 2760taactgagac ccctcagcaa gggaccctac acccttctga gcctccagtg tctctcaact 2820gttcctcctg ccctycccca ctcctccagt gtctctcaac tgttcctcct gccctccccc 2880actcctttcc catgcaagga gaggtttttc tgaaagagtt ggtgttctgt tttatctcag 2940tttattattc tataaacagg cttccacata atctatagaa tcaaaggcag gcttctcagg 3000ctgcagagat actacctatc ctggtgcatc caagttgtca gagcaggacc cgggagataa 3060agcccagcag ggtacaagat cagttccaag tggagggaat taagcggctc ttattccatg 3120gaaaaaaaaa agcaaggttg caataattcg ggaaagaaat aaaagactga tgggtgtgtg 3180tgtgtgtgtg tgtgtgtgtg tgtgtgtgtg 
tgtgtgtgta agcttatgag gcaacaagca 3240gacgcattta aaaaggaaga ctttggtgat gatcatctgg aagattctag
aaagaactga 3300ggcccaggga cctgtcactc acactttgca tactaggtag cgagtagata acgggtgcta 3360ctgttgtttt ttgttttttt tttttctcct atgactttta atgaagctga ttgattgatg 3420actgattgat cgattgattg attgatggtt gattgatcga ctgattgatt tccattgtgc 3480taaggattga acttgaagcc ttgtgtgtcc ttcgcaagta cctgatcact gaactactct 3540gccgtccccc tttctctaat gtggctaaac cgatatcatt ggcgatgggg gcaactcgtt 3600caaagctgca gtttgactcc catctcagcg gggactgtgt tctaagggcc tgtttgtgct 3660cagtgagatt tttaaaataa tcatttgtgc agttgctgtc gatactgaaa acagtctctc 3720ctgataggac tgagtaataa agaggcctgg aacttcgcct ctgtataata aattcaagca 3780ataaaagtca ccttctgaca tggacatttc tgaggcccat tgtccttctt aattattact 3840tgagtgagaa gggtgcactg agcactttgc ctgcaacctt ccccagttcc tactgctggc 3900ctgttgccct tgaagtgggc ctgccattga tgctgtagca tgccgtctaa caagaaatag 3960aatggcactt ttgtgttaga caagcttttt tttttttttt tgagaataga actcactagc 4020tagaccaggc tggcctccaa ctcacagaga cctacttgcc tctgccttct gggtattaag 4080attaaagacg tgcactacca tcctgggact ccattacccg ctatgtaatt gaagtgtagc 4140atacctgccg aaactagaaa tgagttccga gaagctcata ttgtatgggt cagttgttca 4200gtttgattgc ccattcgtgg ttcctttctc tgctcacggc ctttctctgc tctgcaggcg 4260cttaaatact ctaaacaagt gtgcagtcat gaagctggag attagtcctc accggaagcg 4320agtgagtacc aaaattacat gggggggggg ggcagggaca gcaggcacac taaccaagac 4380aggacttgta tctacactct gtaaaaggcc ctgtttgtcc attcctcaac atgttaaaac 4440ccctatttgg agacagtagt ggatggtggc atctactgct ctggacttga agaaatctgt 4500tacttttccc agtgaactcc atggctacca tgtgattcaa agcatgaagc ctattgaatc 4560tccagaggaa tttcacattg ctccctagag gaaataaagc taacattctg taggacctct 4620tcctgtttcc tggatggaac agtagctcca tctcgaagct gtcaagatga aaggggaagg 4680ctggcttggg ggatactgta ggagatgtgg atcgtggggg gtggggagga agacgccgga 4740gcaggaaatc ccatacactc tgtggna 4767221072DNAMus musculusmisc_feature(6)n represents a or g or c or t/u 22ttgaanccca agctggagct ccccgcggtg gcggccgctc tagaactagt ggatccagat 60acagttcttg tctttaaact ctgactatgg acaggaatta tatcctgccc acgacccatc 120cagcctgact gtccacatct tacactctac actcaaggct gaggattcta 
gattatgaag 180agttagacat ctaatacatt tctattttaa aaatatagtt gctctgtggt ggggcatggt 240ggcacatggc tttaatccta gcacccagag gaggtagagg caggtgaatc tctgagttca 300aggccagcct ggtatatata gcactgactg ctctcccaga ggtcctgagt tcaatttcca 360gcaaccatat ggtggctcac aaccatctgg aatggaatcc gatgccctct tctggtgtgt 420ctgaagacag ctacagtgta ctcatacata caataaataa ttcttaaaaa aaaaaacaaa 480aacaaaaaca aaaactcaaa cacaaacaaa cagtatatat gtaagatatt atagctaacc 540acttaagttt attattctct gagcattttt gccagaaagg tctgcttcta aataaacaac 600aaagcaaaaa caccccaaag tccaaacaaa aaccccaaac tttttagcac aggtagattt 660ctcaggttat gctcaaaacc ttcattcaaa actgaccgac agcgtgatgg agtgtgggct 720cagcatgaac aagggcctga acgcatctca ggcaaccacg tgatggctga aaacccaacc 780aaccagtcct gcagttaact ccctgaggct ccaggagttt gagcagcatg gagaacatag 840cctggaggat gtggagacca cctgcttaaa ggttgatgga ctggtgacat tgacagagga 900cagaacggtc ctaagctgag tgctggggac aacctcaggg agcatgatgg catcccccca 960gggccattgc tcactgctca ctacgagctg gctctcttac cagctgaagc cgtgcttgtt 1020ggaggcgtgt cttttccagc agggccgtca atttcaggag ccagtttttc tt 1072231104DNAMus musculusmisc_feature(1)..(1104)n represents a or g or c or t/u 23ttggamracy sggtaccggg ccccccctcg cggtcgacgg tatcgataag cttgatatcg 60aattcctgca gcccggggga tcctgctttg ggaaaaagac gtcaaactct tcaaggcggg 120accaccgctt gctgtccctc tggaaagtcc acttgccgct tatggcgcaa ggctcatctt 180catccgaatc ctcactctga aaacacagaa tgaagccatt tatgtactgg gccaagcagg 240gggcagaagg cagaacacag gttaagggcc aggccacagc ccaaaggata ttcccagtgt 300ccattgctca gttctcttat gtaacaaaga tggatttaaa gacattatta ttgggctgga 360gggatggctc agccgttaag aacactgacc gcttttccag aggtcctgag ttcaaatccc 420agcaaccaca aggtggctca acaaccatct gtaatgagat ctgatgccct cttccggtgt 480gtctgaagac agccatagtg tacttatata taatataaat aaatccttaa aaaaagagac 540attattatta ctttatttta tttagagaat gtatttgcat gtatgtgtat atgtatggat 600gtatatgaat gttcacaccg tgttcagacg actaccagtc agtgtgagtt ttctccttca 660gtcatataga actgggtcgt caggcttggc aacaggccga ctgtcattta accagcccag 720atgtaaagac tttaacagaa gtctgaccaa gtgttgccag 
ctaaacaagt cattttattg 780aaaccctggc tcgttgggcc attcactaat cgctcacaaa ggggacctct gagatgggcc 840gaaaattcaa gcatgcaaaa tattctgaac tggaatcaga gtcaacagtc gtgggactcc 900ctctggattg cctccagttt aactgcgtgt tgacagagtg tgtttatata ctcgtgtgca 960attaaaaaaa aaaaaaagct attttcaaac agcagaatgg cagctgagga ctctaggtcc 1020aaagagaaaa gacanggnat ttcttttaaa agaactgaag accatttaan cgagccatct 1080gtggcagaaa aggnaaaata gant 110424725DNAMus musculusmisc_feature(1)..(725)n represents a or g or c or t/u 24aannccctga tatcccggaa agaagagact ccggggtggg ggcttccctg accaggtgca 60ataggtaagg aagggcgttg cttctcgatg catccagagg tggggaatca agaagggtct 120gcttggtggg attgagtctg atatttgcag tcctgcaaat tcctagggac tgcatccaac 180caggagaccc caacaatccc aacgggaaag gagtattata aactgggtat gaacctttgg 240tcatcaagga tgcagacagt ggaccctgga agatggtggt gtttgaacaa tatagtcagg 300ccttatccac cgtggggtgt acttagacgt gcttaaagtg cttgcatctt gattctcctg 360cagttccaaa tcttcggttt cagccaggca cagatgagaa ctactcaggg gagaaactgt 420cttctccgtc attataccct gggtaataga gtgtgaccgt gaactactag caggttgtta 480tagcaatctg gcttataaac ttacattaaa tggggagggt gctcccgatg tgcgtagaca 540ctatccatct tctataagag gcctgagtgt actgagtcca catatctgct atgtctggaa 600ccaaccttca ggggttacaa agacagtggg ggtggggggg aggcagggaa aggaagatcg 660atgctcttgg ttcctgatga tcagaagatt ggtcccagct tactcctttc cgcctgttct 720ttttg 72525528DNAMus musculusmisc_feature(1)..(528)n represents a or g or c or t/u 25agacgnggtc ccactgactg tgaacgtgca gcgctcagga cagcccctgc cccagagcat 60ccagcaggcc atgcgctacc tccgtaacca ctgtctggac caggtgagta cagctgcctg 120tggatcccac tcgtgggagc ggagctttgg gctgcatgtt tttttttcta gtttcgtggg 180gaagggtcct gcttccacac ccatccctgc tgttctcctt ccaaaaggtc gggctcttca 240ggaagtcagg tgtcaaatcc cggatccagg ctctacgcca gatgaatgaa agcgctgaag 300ataatgtcaa ctatgaaggc cagtctgctt atgatgtggc agacatgtta aagcaatatt 360ttcgagatct tcctgagccc ctcatgacga acaaactntn cgaaaccttn ctgcagatct 420accagtgtaa gcgttctttg gtcttcttaa gnaactgatg tcgggttcat gggaccaact 480gagcacacaa gcctttttna tgccatcctt ttgaaanaaa aacttnat 
52826393DNAMus musculusmisc_feature(1)..(393)n represents a or g or c or t/u 26aacanaanat tccggatttc ctcagggacc tggaaanaat tcttgcattc agcaatcatg 60tgggccagcc cttggantcg ccgctaggtt ttcattcagg tctttctggt ctggtttgcc 120caaactctgt tttctttgca ttacccttgg anaaaaattc tctcncttca gggtgttgag 180gtggaanang gacggagcta ggcacacagc caggttggtg ggagtcatct ggttttcttt 240cacagccgct gtgacatcgc tcaggaaata aaaaantgtc tgcanaacct cccggttctc 300ntcgggcagg ancataatgg ccgccttgat ggcttggang cgctggtcct tgggcacata 360ctggtatatc tgcanggaan gttcggaaaa ttt 39327601DNAMus musculusmisc_feature(1)..(601)n represents a or g or c or t/u 27ccaagctgaa ttccgggcgc cttgtgcttg ctgtggtggg agcccttgaa gcttcaggct 60ctccatgcgt ttgagcaggc tccgcgtctt cgacttggtg ttcttctcgt ggtggccttt 120catgctaaaa ctgaagctgg acagttcctt gggagacggc aggctggaaa aagagtcatc 180gttgcctaca aagtgtccgg aggagcagac gctgatgacg gaattggttc ggggcgtant 240ggcatctcca ctgtgggctg cgtgggtggg gacgctgctg ctggtgctgc tgangcttcg 300gacagangcc acctcctggt gctcgctgag gtctgtcagc atgctttcgt ggcttgtggc 360gctttgcaaa cgaaanttgt ctggggaccc agggattgga tcctgctttg ggaaaaagac 420tcaaactctt caaggcggga ccaccgcttg ctgtccctct ggaaagtcca cttgccgctt 480atggcgcaag gctcatcttc atccgaatcc tcacttcgct tccggtgaag actaatctcc 540acttcntgaa tgcacacttg tttanaatat ttaacncctg canaaaacct ccatggcgtc 600t 60128260DNAMus musculusmisc_feature(1)..(260)n represents a or g or c or t/u 28ggcttangga agtgccgggc ttgtgatctt cgggaatgta gaacaccgtg tcctctgggt 60agctctcacg gtttttatag ttctgctccg tcncnttgtt caaggtggac tgactgaatg 120ggtccaancc ctctaaatac atgcccaccc gcttgttaca ggtgctgagg ctccgggtcc 180tggtgaccgg gctgggtgtg ctgaccgcnc tgctggtctc tgattggctg ctgctgctgc 240tggtctgggt ggaattagac 26029358DNAMus musculusmisc_feature(1)..(358)n represents a or g or c or t/u 29ctgattccgg gttgacatta tcttcagcgc tttcattcat ctggcgtaga gcctggatcc 60gggatttgac acctgacttc ctgaagancc cgacctggtc cagacagtgg ttacggaggt 120agcgcatggc ctgctggatg ctctggggca ggggctgtcc tgagcgctgc acgttcacag 180tcagtgggac cccaaacaca ctccggtcct 
tgtagtctgg aaccttgatc cttttcatga 240acttgggcac agcccagctg aanccgtgct tgttggangg cgtgtncttt tccagcaggg 300ccgtcaattt caggagcgag tntttctgca gcaggttcat ctgggccaca gactggca 35830154DNAMus musculusmisc_feature(1)..(154)n represents a or g or c or t/u 30aattccgggc gatgtcacag cggctgtgaa agaaaaccag atgactccca ccaacctggc 60tgtgtgccta gctccgtccc tcttccacct caacaccctg aancnataga attcttctcc 120aagggtaatg canatgaaaa cagagtttgg gcaa 15431294DNAMus musculusmisc_feature(1)..(294)n represents a or g or c or t/u 31aagctggaat ccggtgcgct ccagccttga gccatggctg tgcgtcctcg ctgttggagc 60cacggctccc cagctccgtg ccccgctccc tgagagtgct cccttcgcgg tggcaatcta 120aaacccacga ttttgcccga gctggggcga agcgtaagga agctgcgaac cangatgtgc 180tgacgaccgc gaggggctcg cgtcccggct gccaccgtgg gtcccgacgt gggatcccga 240tnacttctgg cngcctcgac tttcccagtg cgctcccgtc gncctgcgcc gacc 294

* * * * *


References