Products and methods for gaucher disease therpy Mahuran, Don J. ; et al. [Callahan, John W.]

Patent Application Summary

U.S. patent application number 10/706466 was filed with the patent office on 2004-04-29 for products and methods for gaucher disease therpy. Invention is credited to Callahan, John W., Clarke, Joe T.R., Mahuran, Don J..

Application Number	20040082535 10/706466
Document ID	/
Family ID	31498866
Filed Date	2004-04-29

United States Patent Application	20040082535
Kind Code	A1
Mahuran, Don J. ; et al.	April 29, 2004

Products and methods for gaucher disease therpy

Abstract

The invention relates to products and methods for medical treatment of Gaucher disease and, in particular, an improved Gcc DNA for insertion into any applicable expression vector for gene therapy treatment. The invention includes an isolated Gcc DNA molecule, wherein nucleic acid molecules have been modified at cryptic splice sites to prevent or decrease splicing of mRNA produced from the DNA molecule, while preserving the ability of the DNA to express functional Gcc polypeptides.

Inventors:	Mahuran, Don J.; (Toronto, CA) ; Clarke, Joe T.R.; (Toronto, CA) ; Callahan, John W.; (Mississauga, CA)
Correspondence Address:	SYNNESTVEDT & LECHNER, LLP 2600 ARAMARK TOWER 1101 MARKET STREET PHILADELPHIA PA 191072950
Family ID:	31498866
Appl. No.:	10/706466
Filed:	November 12, 2003

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
10706466	Nov 12, 2003
09586216	Jun 2, 2000	6696272
60137598	Jun 3, 1999

Current U.S. Class:	514/44R ; 435/366; 536/23.2
Current CPC Class:	C12N 9/2402 20130101; A01K 2217/05 20130101; C12Y 302/01045 20130101; A61K 48/005 20130101
Class at Publication:	514/044 ; 435/366; 536/023.2
International Class:	A61K 048/00; C07H 021/04; C12N 005/08

Foreign Application Data

Date	Code	Application Number
Jun 2, 1999	CA	2,272,055

Claims

We claim:

1. An isolated Gcc DNA molecule, wherein the DNA molecule has a modification in at least one nucleotide that disrupts a splicing consensus sequence and prevents splicing of mRNA produced from the DNA molecule, while preserving the ability of the DNA to express active Gcc.

2. The DNA molecule of claim 1, wherein the modification impairs a consensus nucleotide sequence needed to induce splicing.

3. The DNA molecule of claim 2, wherein the DNA molecule is modified at two cryptic splice sites.

4. The DNA molecule of claim 1 or 3, comprising a mutation in the 3' junction site.

5. The DNA molecule of claim 4, wherein the mutation is as shown in the 3' junction site in Table 1, or a functionally equivalent mutation.

6. The DNA molecule of claim 1 or 3, comprising a mutation in the 5' splice junction site.

7. The DNA molecule of claim 6, wherein the mutation is as shown in the 5' junction site in Table 1, or a functionally equivalent mutation.

8. The DNA molecule of claim 1, comprising all or part of the nucleotide sequence shown in FIG. 4(b).

9. A vector comprising the DNA molecule of any of claims 1 to 8.

10. The vector of claim 9, comprising a promoter that is functional in a mammalian cell.

11. mRNA produced from the DNA molecule of any of claims 1 to 8 or the vector of claim 9 or claim 10.

12. A method of medical treatment of Gaucher disease in a mammal, comprising administering to the mammal an effective amount of the nucleic acid molecule of any of claims 1 to 8 or the vector of claim 9 or claim 10 and expressing an effective amount of the polypeptide encoded by the nucleic acid molecule for alleviating clinical symptoms of Gaucher disease.

13. A host cell, or progeny thereof, comprising the nucleic acid molecule of any of claims 1 to 8 or the vector of claim 9 or claim 10.

14. The host cell of claim 13, selected from the group consisting of a mammalian cell, a human cell and a Chinese Hamster Ovary cell.

15. A method for producing a recombinant host cell capable of expressing a Gcc nucleic acid molecule, the method comprising introducing into the host cell the vector of claim 9 or 10.

16. A method for expressing a Gcc polypeptide in the host cell of claim 13 or 14 comprising culturing the host cell under conditions suitable for DNA molecule expression.

17. A method for producing a transgenic cell that expresses elevated levels of Gcc polypeptide relative to a non-transgenic cell, comprising transforming a cell with the vector of claim 9 or 10.

18. An isolated polypeptide encoded by and/or produced from the nucleic acid molecule of any of claims 1 to 8, or the vector of claim 9 or 10.

19. A method of producing a genetically transformed cell which expresses or overexpresses a Gcc polypeptide, comprising: (a) preparing a Gcc nucleic acid molecule according to any of claims 1-18; (b) inserting the nucleic acid molecule in a vector so that the nucleic acid molecule is operably linked to a promoter; (c) inserting the vector into a cell.

20. A transgenic cell produced according to the method of claim 19.

21. A pharmaceutical composition, comprising a carrier and (i) the nucleic acid molecule of any of claims 1 to 8 (ii) the vector of claims 9 or 10 or (iii) Gcc polypeptide produced from (i) or (ii), in an effective amount for reducing clinical symptoms of Gaucher disease.

22. The composition of claim 21, wherein the carrier comprises a liposome.

Description

FIELD OF THE INVENTION

[0001] The invention relates to products and methods for medical treatment of Gaucher disease and, in particular, nucleic acid molecules, polypeptides and vectors for polypeptide or gene therapy treatment.

BACKGROUND OF THE INVENTION

[0002] Gaucher disease is a lysosomal storage disease caused by the deficiency of functional glucocerebrosidase (Gcc) enzyme. Gcc is present in all cell types. The defective enzyme cannot break down a fatty substance, glucocerebroside, which is an important component of cell membranes. The fat accumulates in macrophages (which are known as the "Gaucher cells"). The fat-laden macrophages are found typically in the liver, spleen, bone marrow and lungs. The amount of the enzyme deficiency varies from person to person as do the symptoms. Some patients may show no clinical symptoms, while others may die from the disease. The symptoms of the disease and mutant forms of Gcc that cause Gaucher disease are described, for example, in U.S. Pat. No. 5,266,459 (Beutler) and U.S. Pat. No. 5,234,811 (Beutler and Sorge).

[0003] There are therapies for Gaucher disease. Ceredase is a form of the Gcc enzyme from placenta that is able to metabolize the fat in Gaucher cells. The enzyme restores normal function to a Gaucher cell. The amount of enzyme used in treatment varies. As much as 30-60 units per kilogram of bodyweight (U/kg/bw) may be given every other week. Positive results have been reported with 2.3 U/kg/bw given three times a week. Lower doses, such as 1-5 U/kg/bw twice weekly, have also been used with success, but this is less frequent. The intracellular half-life of the enzyme is up to 60 hours. A large number of placentas are needed to make sufficient Ceredase, so this form of therapy is very expensive. It has been almost completely replaced by treatment with a recombinant form of the enzyme, Cerezyme, but this therapy is also expensive. Cerezyme is dispensed as a powder whereas Ceredase comes as a liquid. Sterile water must be added to the Cerezyme bottle to dissolve the powder. The shelf life of the drugs is short (<3 months), and splitting doses is cumbersome and wasteful. Allergic reactions to Ceredase are common, but rarely life-threatening. Adverse reactions to Cerezyme appear to be less common, but experience with the drug is still very limited.

[0004] Gcc has been structurally modified in order to obtain improved pharmacokinetics over naturally occurring Gcc (which is derived from placenta). These modifications include amino acid modifications as well as carbohydrate changes. For example, U.S. Pat. No. 5,549,892 discloses a recombinant polypeptide that differs from naturally occurring Gcc by the presence of histidine in place of arginine at position 495. In another embodiment, the carbohydrate remodeled recombinant Gcc has increased fucose and N-acetyl glucosamine residues compared to remodeled naturally occurring Gcc. The increased pharmacokinetics of these compounds provides a therapeutic effect at doses that are lower than those required using remodeled, naturally occurring Gcc. However, this Gcc remains expensive to provide. Furthermore, improved pharmacokinetics does not necessarily compensate for inadequate bioavailability of Gcc.

[0005] Gene therapy has been administered to Gaucher patients. All experiments carried out to date have been undertaken using ex vivo, retrovirus-mediated transfection, which requires sophisticated laboratory facilities and is very expensive. Although transgene expression could be demonstrated in mice undergoing this procedure, experiments in humans have been disappointing. No clinically significant Gcc gene expression has been reported in humans undergoing retrovirus-mediated transfection with existing Gcc gene preparations. One problem of gene therapy is in reproducibly obtaining high-level, tissue-specific and enduring expression from genes transferred into cells. Currently, there is no suitable gene therapy vector that expresses at a high level for Gaucher disease gene therapy.

SUMMARY OF THE INVENTION

[0006] The invention includes a modified Gcc cDNA insert that can be inserted into any mammalian expression vector for use in the medical treatment of Gaucher disease. In a preferred embodiment, the modified cDNA was inserted into a vector named pINEX2.0 which was then used to transfect mammalian cells. When pINEX2.0 containing the unmodified Gcc cDNA coding sequence, pINEX5'GCC3', was transfected into cells, and their RNA was purified from cell lysates and subjected to reverse transcription followed by the polymerase chain reaction (RT-PCR), two distinct major bands were observed after agarose gel electrophoresis (FIG. 1). Isolation, purification and sequencing of the RT-PCR products identified a major aberrantly spliced mRNA species which encodes only a 19 amino acid peptide before encountering a STOP codon. Surprisingly, this aberrant splicing event occurred completely within the Gcc cDNA coding sequence (FIG. 2), i.e. no vector sequences were involved. Site directed mutagenesis was performed to modify the nucleotide sequence in the region of aberrant mRNA splicing without affecting polypeptide coding (FIG. 3). Modifications were aimed at disrupting the known consensus sequences for RNA-splicing (Krawczak et al. 1992). The effectiveness of these modifications was tested by transient transfection into CHO cells, followed by our human-specific immunoprecipitation assay for Gcc. Data (n=18) indicate that a 5 ± 1 (Std. Error)-fold increase in Gcc activity was achieved when the modified insert replaced the unmodified insert in the pINEX2.0 expression vector.

[0007] The invention relates to an isolated Gcc DNA molecule, wherein the DNA molecule has a modification in at least one nucleotide that disrupts a splicing consensus sequence and prevents splicing of mRNA produced from the DNA molecule, while preserving the ability of the DNA to express active Gcc. The modification impairs a consensus nucleotide sequence needed to induce splicing. The DNA molecule is preferably modified at two cryptic splice sites. The DNA preferably includes a mutation in the 3' junction site. In one embodiment, the mutation is as shown in the 3' junction site in Table 1, or a functionally equivalent mutation. In another embodiment, the DNA molecule includes a mutation in the 5' splice junction site. The mutation is preferably as shown in the 5' junction site in Table 1, or a functionally equivalent mutation.

[0008] The DNA molecule preferably includes all or part of the nucleotide sequence shown in FIG. 4(b).

[0009] Another aspect of the invention relates to a vector including a DNA molecule of the invention. The vector preferably includes a promoter that is functional in a mammalian cell.

[0010] The invention also includes mRNA produced from the DNA molecule or vector of the invention.

[0011] Another aspect of the invention relates to a method of medical treatment of Gaucher disease in a mammal, including administering to the mammal an effective amount of a nucleic acid molecule of the invention or a vector of the invention and expressing an effective amount of the polypeptide encoded by the nucleic acid molecule for alleviating clinical symptoms of Gaucher disease.

[0012] The invention includes a host cell, or progeny thereof, including a nucleic acid molecule of the invention. The host cell is preferably selected from the group consisting of a mammalian cell, a human cell and a Chinese Hamster Ovary cell. The invention also includes a method for producing a recombinant host cell capable of expressing a Gcc nucleic acid molecule, the method including introducing into the host cell a vector of the invention. The invention also includes a method for expressing a Gcc polypeptide in a host cell including culturing the host cell under conditions suitable for DNA molecule expression. Another aspect of the invention relates to a method for producing a transgenic cell that expresses elevated levels of Gcc polypeptide relative to a non-transgenic cell, including transforming a cell with a vector of the invention.

[0013] The invention includes an isolated polypeptide encoded by and/or produced from a nucleic acid molecule of the invention, or a vector of the invention.

[0014] The invention includes a method of producing a genetically transformed cell which expresses or overexpresses a Gcc polypeptide, including: a) preparing a Gcc nucleic acid molecule according to any of claims 1-18; b) inserting the nucleic acid molecule in a vector so that the nucleic acid molecule is operably linked to a promoter; c) inserting the vector into a cell. The invention includes a transgenic cell produced according to the method of the invention.

[0015] The invention also includes a pharmaceutical composition, including a carrier and (i) a nucleic acid molecule of the invention (ii) a vector of the invention or (iii) Gcc polypeptide produced from (i) or (ii), in an effective amount for reducing clinical symptoms of Gaucher disease. The carrier preferably includes a liposome.

BRIEF DESCRIPTION OF THE DRAWINGS

[0016] Preferred embodiments of the invention will be described in relation to the drawings in which:

[0017] FIG. 1. Separation of RT-PCR products from CHO cells permanently transfected with pINEX5'-GCC-3'.

[0018] FIG. 2. Diagrammatic Representation of possible RT-PCR products representing mRNA splice variants from pINEX-5'-GCC-3'.

[0019] FIG. 3. Comparison of consensus splice site donor/acceptor site and "Cryptic" splice sites in Gcc cDNA. Sequences of (a) unmodified Gcc cDNA contained in pINEX5'Gcc3' (b) In a preferred embodiment, this sequence represents modified Gcc cDNA contained in pINEX-WEIRD. The translated amino acid sequence for either the modified or unmodified Gcc cDNAs is also given; note that the modified nucleotides had no effect on the amino acid sequence.

[0020] FIG. 4. (a) The sequences of the aberrantly processed transcript from the unmodified Gcc cDNA insert in pINEX5'Gcc3' and its translated polypeptide (b) Modified DNA and its translated polypeptide. In a preferred embodiment, this sequence represents modified Gcc cDNA.

DETAILED DESCRIPTION OF THE INVENTION

[0021] The invention satisfies the need for a DNA (preferably a cDNA) that, when inserted into any mammalian expression vector, transcribes RNA that is resistant to aberrant processing in the transfected or transduced target cells and thus is much more likely to translate the functional full length Gcc protein. Therefore, such a modified insert would improve the levels of Gcc expression when used in any vectors designed for in vivo or ex vivo gene therapy treatments of Gaucher disease. As well, when inserted into any efficient mammalian expression vector, such as pINEX2.0, the modified Gcc cDNA as compared to the unmodified cDNA will increase the production levels of recombinant Gcc polypeptide for use in enzyme replacement therapy for Gaucher disease. Thus, the modified insert directs a higher level of Gcc expression through a mechanism that is independent of the mammalian expression vector used whether in vivo or in vitro. The modified insert is safe as preferably no change in the amino acid sequence of Gcc is encoded by the nucleotide changes, and should confer a sustained and appropriate level of cell-specific expression for gene therapy when coupled with the appropriate vector and transfection or transduction methodologies. The expressed DNA insert is preferably a modified Gcc cDNA or a modified fragment of a Gcc cDNA that expresses a polypeptide having Gcc activity which is effective for treatment of Gaucher disease. The DNA is modified to prevent aberrant cellular splicing of its mRNA produced when expressed in mammalian cells. The modified DNA insert may be used with any expression vector to transfect or transduce any mammalian cell type, such as CHO cells for the expression of human Gcc for enzyme replacement. These would also include human stem cells for ex vivo gene therapy or macrophages for in vivo gene therapy.

[0022] The invention also includes the methods of making the modified DNA. The methods may be applied to a Gcc DNA from any source that requires modification to avoid undesirable splicing including humans, other mammals or synthetic DNA.

[0023] During the search to improve the efficiency of human Gcc expression it was determined that a major amount of the RNA transcribed from any vector was aberrantly spliced due to cryptic 5' and 3' splice sites contained in the human Gcc cDNA (FIGS. 1 & 2). Since this RNA species encodes only a 19 amino acid peptide, it is far less stable than the properly spliced product encoding the complete 536 residues of Gcc (Maquat 1996), and therefore transcribed at a much higher level than is indicated from our steady-state RT-PCR data (FIG. 1). We modified the two cryptic sites in a manner that conserved the wild type amino acid sequence while destroying the consensus nucleotide sequences needed to induce splicing (FIG. 3). Transient expression of this modified insert indicated a 5-fold increase in Gcc expression. Such an increase in expression efficiency is not only valuable for any gene therapy approach, but also useful in decreasing the cost of enzyme replacement since the enzyme source is now Gcc-transfected mammalian cells.
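
For illustration only, the following minimal Python sketch shows one way a coding sequence could be screened for windows that resemble a 5' splice donor site. It is not the procedure used by the inventors. It assumes the commonly cited simplified donor consensus MAG|GTRAGT (written with IUPAC codes), treats the GT dinucleotide as mandatory, and uses an arbitrary match threshold. The names `donor_like_sites`, `DONOR_CONSENSUS` and `min_matches`, and the toy sequence, are hypothetical and are not taken from the Gcc cDNA.

```python
# Simplified IUPAC lookup for the codes used in the assumed donor consensus.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC", "R": "AG"}

# Assumed simplified 5' donor consensus: exon MAG | intron GTRAGT.
# The invariant GT sits at offsets 3-4 of each 9-nucleotide window.
DONOR_CONSENSUS = "MAGGTRAGT"


def donor_like_sites(seq, min_matches=7):
    """Return (start, window, matches) for windows resembling a splice donor."""
    seq = seq.upper()
    width = len(DONOR_CONSENSUS)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if window[3:5] != "GT":
            # Without the GT dinucleotide the window is not considered at all.
            continue
        matches = sum(
            base in IUPAC[code] for base, code in zip(window, DONOR_CONSENSUS)
        )
        if matches >= min_matches:
            hits.append((start, window, matches))
    return hits


# Hypothetical toy sequence (NOT Gcc) containing one donor-like window.
toy = "ATGCCAGGTAAGTCTGGACTTTCGA"
for start, window, matches in donor_like_sites(toy):
    print(start, window, matches)
```

A screen of this kind only flags candidates. As described above, the cryptic sites in the Gcc cDNA were identified experimentally by sequencing RT-PCR products, and any computational hit would need the same confirmation.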

[0024] Treatment using any vector containing a modified insert to prevent aberrant transcript processing (by gene therapy or by administration of polypeptide produced from a vector) will lower the cost of the present enzyme replacement therapy (currently as much as about US$100,000 per yr. for a patient) by increasing the yield of functional Gcc protein.

[0025] The modified insert when used with any appropriate expression vector is also used to direct the expression of Gcc for use in research and characterization of the enzyme's function.

[0026] Other useful DNA inserts include a nucleic acid molecule having at least about: 50%, 60%, 70%, 80%, 90%, 95%, 99% or 99.5% sequence identity to the modified Gcc nucleic acid molecule (the Gcc sequence in FIG. 4(b)) wherein the molecule having sequence identity has a modification in at least one nucleotide (preferably two nucleotides) that disrupts a splicing consensus sequence and prevents splicing of mRNA while it encodes a polypeptide having Gcc activity. Changes in the Gcc nucleotide sequence which result in production of a chemically equivalent (for example, as a result of redundancy of the genetic code) or chemically similar amino acid (for example where sequence similarity is present), may also be made to produce high levels of unspliced transcript from the Gcc cDNA for therapeutic use. The DNA molecule or DNA molecule fragment may be isolated from a native source (in sense or antisense orientations) and modified or synthesized (with or without subsequent modification). It may be a mutated native or synthetic sequence or a combination of these in order to prevent or decrease aberrantly spliced transcripts.

[0027] Selection of Vector

[0028] Separating the Gcc activity derived from transfected human cDNA from the endogenous Gcc activity of the host cells, e.g. CHO, was done to determine the efficiency of expression vectors. A high level of expression is needed not only for any in vivo or ex vivo gene therapy approach, but also for the efficiency of producing Gcc for enzyme replacement therapy now being done in transfected cells (Grabowski et al. 1995). We have developed an immunoprecipitation assay that is specific for the human enzyme and have used it to evaluate several expression vectors. The vector producing the highest level of Gcc in transiently transfected CHO cells was pINEX2.0 from INEX Pharmaceuticals. The vector contains a CMV-based promoter and a potential intron prior to the initiating ATG of the Gcc cDNA. In general our results indicated that a CMV-based promoter gave the highest level of expression and that the placement of the vector's intron at the 5' end of the insert was superior to placing it at the 3' end. Other suitable vectors will be apparent to a skilled person.

[0029] After some initial modifications to the 5' untranslated end of the cDNA construct to ensure a match with the consensus sequences for protein initiation (Kozak 1987) and the 3' end to eliminate most of the untranslated nucleotides prior to the vector's polyadenylation signal, a lysate from transiently transfected CHO cells still produced low levels of Gcc specific activity, requiring our immunoprecipitation assay to detect the increase in human activity in the cells' total Gcc pool. A line of permanently-transfected CHO cells was prepared in order to analyze the sequence(s) of the Gcc mRNA(s) being transcribed from the expression vector.

[0030] Identification and Characterization of Differentially Spliced RNA

[0031] Cells Expressing Differentially Spliced RNA

[0032] A CHO cell line was permanently co-transfected with the pINEX-5'-GCC-3' construct and a construct containing a selectable marker, pREP10. After selection and isolation of individual clones, the clones were assayed to determine specific activity of Gcc (in nmole/hr/mg total lysate protein). One clone, termed A7, was grown in larger scale and RNA was isolated from it. A reverse transcription reaction followed by PCR (RT-PCR) was performed on total cellular RNA from CHO control cells and A7 clone cells. Following agarose gel electrophoresis, two major bands were observed (see FIG. 1).

[0033] Restriction Digest Analysis

[0034] The major bands of the RT-PCR reaction were electrophoretically separated on a larger scale and the band(s) excised. The purified cDNA was ligated into the pCR2.1 cloning vector. Restriction digest analysis of the clones obtained using Stu I and Eco RI, revealed a series of different patterns, consistent with possible aberrant splicing (data not shown).

[0035] Sequencing Analysis

[0036] Sequencing of representative clones of each pattern, obtained from the high- and/or low-molecular weight bands, revealed a number of differentially spliced products. In the high-molecular weight band, 5 out of 10 clones contained product that was spliced at the upstream site in the vector only, producing wild-type message. One out of 10 contained a completely unspliced product, and another 1 out of 10 contained a product with a restriction map consistent with both upstream (vector) and downstream (insert) splice events taking place (FIG. 2). Remaining clones contained unidentifiable restriction maps and were therefore not sequenced. Sequencing and restriction digest analysis of the low-molecular weight band revealed that in 7 of 10 clones both splices had taken place, and in 2 out of 10 only the upstream splice event (wild-type message) had occurred.

[0037] Sequencing results confirmed that the major alternatively spliced species resulted from the removal of sequences within the Gcc cDNA itself, through the recognition of cryptic 5' and 3' splice sites roughly corresponding to the known consensus sequences that induce RNA splicing in mammalian cells (Krawczak et al. 1992). The deduced amino acid sequence from this RNA species predicts a reading frame shift after Arg17 and an early stop two codons later (FIG. 3). This would encode only a 19 amino acid peptide lacking even a complete signal sequence, necessary for targeting the protein to the cell's endoplasmic reticulum. In order to eliminate aberrant splicing, the Gcc cDNA was modified by site-directed mutagenesis to alter some of the critical nucleotides making up the consensus sequences (Krawczak et al. 1992) to ensure that the cryptic splice sites are no longer recognized by the RNA processing mechanism. Care was taken to preserve the amino acid coding sequence. FIG. 3 shows the consensus sequence for the 5' or 3' splice junctions (Krawczak et al. 1992), the original nucleotide sequence of the Gcc cDNA, the deduced amino acid sequence, and the modifications undertaken to destroy the consensus splicing sequences. Other modifications to destroy the consensus splicing sequences will be apparent.
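
The requirement that the nucleotide changes preserve the amino acid coding sequence can be checked mechanically. The following Python sketch is offered as an illustration under stated assumptions, not as the inventors' procedure: it translates two equal-length coding sequences with the standard genetic code and reports whether every change is silent. The function names (`translate`, `is_silent`, `changed_positions`) and the toy sequences are hypothetical; the sequences are not the Gcc sequences of FIG. 3.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def changed_positions(original, modified):
    """Return the 0-based positions at which the two sequences differ."""
    return [i for i, (a, b) in enumerate(zip(original, modified)) if a != b]


def is_silent(original, modified):
    """True when both sequences have equal length and encode the same peptide."""
    return len(original) == len(modified) and translate(original) == translate(modified)


# Hypothetical toy sequences (NOT Gcc): three silent third-position changes.
original = "ATGCCAGGTAAGTCT"
modified = "ATGCCCGGAAAATCT"
print(translate(original), translate(modified))
print(changed_positions(original, modified), is_silent(original, modified))
```

A check of this kind confirms only that the encoded peptide is unchanged. Whether a given change actually abolishes recognition of a cryptic splice site has to be established experimentally, as in the transfection experiments described below.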

[0038] Transfection Experiments

[0039] Eighteen independent transient transfection experiments were performed to compare Gcc expression levels between pINEX-5'-GCC-3' and pINEX-WEIRD. After correcting for transfection efficiency with β-galactosidase (see Methods), pINEX-WEIRD produced 5 ± 1 (standard error) fold higher levels of Gcc activity (see example in Table 2).
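
As a sketch of the arithmetic behind a figure of this kind, the following Python fragment computes a fold increase per experiment and its mean and standard error. The normalization shown (Gcc activity divided by β-galactosidase activity from the same transfection) is an assumption made for illustration, since the exact correction is given in the Methods rather than here. All numbers are invented placeholders, not data from the experiments, and the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_activity(gcc_activity, beta_gal_activity):
    """Gcc activity corrected for transfection efficiency (assumed: simple ratio)."""
    return gcc_activity / beta_gal_activity


def fold_increase(modified, unmodified):
    """Fold increase of the modified over the unmodified insert for one experiment.

    Each argument is a (gcc_activity, beta_gal_activity) pair.
    """
    return normalized_activity(*modified) / normalized_activity(*unmodified)


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) using the sample standard deviation."""
    return mean(values), stdev(values) / sqrt(len(values))


# Invented placeholder measurements: ((Gcc, beta-gal) modified, (Gcc, beta-gal) unmodified).
experiments = [
    ((50.0, 10.0), (10.0, 10.0)),
    ((90.0, 15.0), (12.0, 12.0)),
    ((40.0, 10.0), (8.0, 8.0)),
]
folds = [fold_increase(mod, unmod) for mod, unmod in experiments]
average, standard_error = mean_and_standard_error(folds)
print(f"{average:.2f} +/- {standard_error:.2f} fold (n={len(folds)})")
```

Computing the ratio within each experiment before averaging keeps day-to-day variation in transfection efficiency from inflating the reported error.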

[0040] Future work will confirm that all aberrant processing of the Gcc transcript from the modified Gcc cDNA is eliminated. If not, other modifications based on the known consensus splice-site sequences will be undertaken.

[0041] Modified DNA/DNA Having Sequence Identity

[0042] Many modifications may be made to the vector and Gcc DNA sequences and these will be apparent to one skilled in the art. The invention includes nucleotide modifications of the sequences disclosed in this application (or fragments thereof) that are capable of expressing Gcc in in vivo or in vitro cells. For example, the regulatory sequences may be modified or a nucleic acid sequence to be expressed may be modified using techniques known in the art. Modifications include substitution, insertion or deletion of nucleotides or altering the relative positions or order of nucleotides. The invention includes DNA which has a sequence with sufficient identity to a nucleotide sequence described in this application to hybridize under moderate to high stringency hybridization conditions. Hybridization techniques are well known in the art (see Sambrook et al. Molecular Cloning: A Laboratory Manual, Most Recent Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.). High stringency washes have low salt (preferably about 0.2% SSC), and low stringency washes have high salt (preferably about 2% SSC). A temperature of about 37 °C or about 42 °C is considered low stringency, and a temperature of about 50-65 °C is high stringency. The modified inserts encoding Gcc of the invention also include DNA molecules (or a fragment thereof) having at least 50% identity, at least 70% identity, at least 80% identity, at least 90% identity, at least 95% identity, at least 96% identity, at least 97% identity, at least 98% identity or, most preferred, at least 99%, 99.5% or 99.8% identity to a modified Gcc nucleic acid molecule as shown in FIG. 4(b), which have a modified consensus sequence to prevent splicing and which are capable of expressing DNA molecules in vivo or in vitro. Identity refers to the similarity of two nucleotide sequences that are aligned so that the highest order match is obtained. Identity is calculated according to methods known in the art. For example, if a nucleotide sequence (called "Sequence A") has 90% identity to the Gcc sequence in FIG. 4(b), then Sequence A will be identical to the referenced portion of FIG. 4(b) except that Sequence A may include up to 10 point mutations (such as deletions or substitutions with other nucleotides) per each 100 nucleotides of the referenced portion of FIG. 4(b). The invention also includes DNA sequences which are complementary to the aforementioned sequences. "Sequence identity" may be determined, for example, by the Gap program. The algorithm of Needleman and Wunsch (1970 J Mol. Biol. 48:443-453) is used in the Gap program.
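
The percent identity figure described above can be illustrated with a short Python sketch. It assumes the two sequences have already been globally aligned (for example by a Needleman-Wunsch implementation) and are supplied as equal-length strings with "-" marking gaps; it does not perform the alignment itself and is not the Gap program. Counting every aligned column, including gapped ones, in the denominator is one convention among several, and the function name `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences of equal length.

    Gaps are written as '-'. Columns that are gaps in both sequences are
    ignored; a gap opposite a nucleotide counts as a difference.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [
        (a, b)
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("no aligned positions to compare")
    matches = sum(a == b for a, b in columns if a != "-")
    return 100.0 * matches / len(columns)


# Hypothetical toy alignment (NOT Gcc): one substitution in ten positions.
reference = "ACGTACGTAC"
sequence_a = "ACGTTCGTAC"
print(percent_identity(reference, sequence_a))
```

Under this convention a deletion and a substitution are penalized equally, which matches the "point mutations (such as deletions or substitutions)" wording above; alignment programs may report slightly different values depending on how they treat gaps.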

[0043] The DNA has a modification in at least one nucleotide that disrupts a splicing consensus sequence and prevents splicing of mRNA while it encodes a polypeptide having Gcc activity. This means an enzyme that can both convert the natural substrate, glucocerebroside (D-glucosylceramide), to ceramide and glucose under the appropriate conditions, and also hydrolyze an artificial substrate, 4-methylumbelliferyl-β-D-glucopyranoside, at a rate of greater than 10 µmoles/hr/mg of purified Gcc polypeptide.

[0044] Functionally Equivalent Nucleic Acid Molecules Identified by Hybridization

[0045] Other functionally equivalent forms of the modified Gcc DNA of the invention can be identified using conventional DNA-DNA or DNA-RNA hybridization techniques. Thus, the present invention also includes nucleotide sequences that hybridize to the sequence in FIG. 4(b) or its complementary sequence, wherein the molecule that hybridizes to the Gcc portion in FIG. 4(b) has a modification in at least one nucleotide (more preferably at least two nucleotides) that disrupts a splicing consensus sequence and prevents aberrant splicing of mRNA while it encodes a polypeptide having Gcc activity. Such nucleic acid molecules preferably hybridize to the Gcc sequence in FIG. 4(b) under moderate to high stringency conditions. For example, high stringency washes have low salt (preferably about 0.2% SSC), and low stringency washes have high salt (preferably about 2% SSC). A temperature of about 37 °C or about 42 °C is considered low stringency, and a temperature of about 50-65 °C is high stringency (see Sambrook et al. Molecular Cloning: A Laboratory Manual, Most Recent Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.).

[0046] A nucleic acid molecule is considered to be functionally equivalent to the modified Gcc nucleic acid molecules of the present invention if the nucleic acid molecule has a modification in at least one nucleotide that disrupts a splicing consensus sequence and prevents splicing of mRNA while it encodes a polypeptide having Gcc activity (Gcc activity means an enzyme that can both convert the natural substrate, glucocerebroside (D-glucosylceramide), to ceramide and glucose under the appropriate conditions, and also hydrolyze an artificial substrate, 4-methylumbelliferyl-β-D-glucopyranoside, at a rate of greater than 10 µmoles/hr/mg of purified Gcc polypeptide).

[0047] Cells Containing a Vector of the Invention

[0048] The invention relates to a host cell (isolated cell in vitro or a cell in vivo, or a cell treated ex vivo and returned to an in vivo site) containing a vector and modified Gcc sequence of the invention. The preparation of transformed cells is done according to known techniques (see Materials and Methods for example of CHO cells containing a vector). The invention includes methods of expressing Gcc in the cell.

[0049] Pharmaceutical Compositions

[0050] The pharmaceutical compositions of this invention used to treat patients having Gaucher Disease could include an acceptable carrier, auxiliary or excipient. Polypeptides may be administered in pharmaceutical compositions in enzyme replacement therapy or in gene therapy.

[0051] The pharmaceutical compositions can be administered by ex vivo and in vivo methods such as electroporation, DNA microinjection, liposome DNA delivery, and virus vectors that have RNA or DNA genomes, including retrovirus vectors, lentivirus vectors, adenovirus vectors and adeno-associated virus (AAV) vectors. Dosages to be administered depend on patient needs, on the desired effect and on the chosen route of administration. The vectors may be introduced into the cells or their precursors using in vivo delivery vehicles such as liposomes or DNA or RNA virus vectors. They may also be introduced into these cells using physical techniques such as microinjection or chemical methods such as coprecipitation. The vector may be introduced into any mammalian cell type, such as CHO cells or human cells.

[0052] The pharmaceutical compositions can be prepared by known methods for the preparation of pharmaceutically acceptable compositions which can be administered to patients, and such that an effective quantity of the vector or polypeptide is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are described, for example in Remington's Pharmaceutical Sciences (Remington's Pharmaceutical Sciences, Mack Publishing Company, Easton, Pa., USA).

[0053] On this basis, the pharmaceutical compositions could include an active compound or substance, such as a polypeptide, in association with one or more pharmaceutically acceptable vehicles or diluents, and contained in buffered solutions with a suitable pH and isoosmotic with the physiological fluids. The methods of combining the vectors with the vehicles or combining them with diluents are well known to those skilled in the art. The composition could include a targeting agent for the transport of the active compound to specified sites within the mammalian cells.

[0054] Method of Medical Treatment of Gaucher Disease

[0055] Any vectors containing the DNA molecules of the invention may be administered to mammals, preferably humans, in gene therapy using techniques described below. The polypeptide produced from the modified inserts may also be administered to mammals, preferably humans, in enzyme replacement therapy.

[0056] Gene Therapy

[0057] Gene therapy to replace Gcc expression (Nolta et al. 1992; Tsai et al. 1992; Sidransky et al. 1993; Schuening et al. 1997; Dunbar et al. 1998) may be useful to modify the development or progression of Gaucher disease. The invention includes methods for providing gene therapy for treatment of diseases, disorders or abnormal physical states characterized by insufficient Gcc expression or inadequate levels or activity of Gcc polypeptide.

[0058] The invention includes methods and compositions for providing a nucleotide sequence encoding Gcc or a biologically functional equivalent nucleotide sequence to the cells of an individual such that expression of Gcc in the cells provides the biological activity or phenotype of Gcc polypeptide to those cells. Sufficient amounts of the nucleotide sequence are administered and expressed at sufficient levels to provide the biological activity or phenotype of Gcc polypeptide to the cells. For example, the method can involve delivering a gene encoding Gcc to the cells of an individual having a disease, disorder or abnormal physical state, comprising administering to the individual a vector comprising DNA encoding Gcc wherein the DNA has modified sites to prevent undesirable splicing. The method may also relate to a method for providing an individual having a disease, disorder or abnormal physical state with biologically active Gcc polypeptide by administering DNA encoding Gcc. The method may be performed ex vivo or in vivo. Gene therapy methods and compositions are demonstrated, for example, in U.S. Pat. Nos. 5,672,344, 5,645,829, 5,741,486, 5,656,465, 5,547,932, 5,529,774, 5,436,146, 5,399,346, 5,670,488 and 5,240,846.

[0059] The method also relates to a method for producing a stock of recombinant virus by producing virus suitable for gene therapy comprising modified DNA encoding Gcc. This method preferably involves transfecting cells permissive for virus replication (the virus containing modified Gcc) and collecting the virus produced.

[0060] Typically, a male or female patient is treated with the vector containing the invention (subject age will typically range from 1 to 60 years of age). At the time of treatment, the patient typically will have bone involvement, bone thinning and bone pain and will have an enlarged spleen and liver. The vector containing the invention is administered intravenously in order to achieve a desired level of enzyme in the patient. Treatments are repeated as deemed appropriate by a physician to ameliorate the clinical symptoms of Gaucher disease. Such treatments may be lifelong.

[0061] Patients report significant improvement in bone involvement, pain and thinning, with reduction in frequency and/or intensity of pain episodes, or complete disappearance of skeletal pain, often within the first six months of treatment. Patients also show improvement in cortical bone thickness. Enlargement of the spleen and liver is reduced. One of the disease markers, the enzyme chitotriosidase, shows a dramatic reduction during the course of a year.

[0062] Administration of Gcc Polypeptide

[0063] The Gcc polypeptide is administered in pharmaceutical compositions in enzyme replacement therapy, examples of which are described above (Beutler et al. 1991; Barton et al. 1992; Fallet et al. 1992; Brady et al. 1994; Grabowski et al. 1995; Rosenthal et al. 1995).

[0064] Typically, a male or female patient is treated with the polypeptide of the invention (subject age will typically range from 1 to 60 years of age). At the time of treatment, the patient typically will have bone involvement, bone thinning and bone pain and may have an enlarged spleen and liver. The polypeptide of the invention is administered intravenously at about 30 U/kg every 2 weeks in order to achieve a desired level of enzyme in the patient.

[0065] Patients report significant improvement in bone involvement, pain and thinning, with reduction in frequency and/or intensity of pain episodes, or complete disappearance of skeletal pain, often within the first six months of treatment. Patients also show improvement in cortical bone thickness. Enlargement of the spleen and liver is reduced. One of the disease markers, the enzyme chitotriosidase, shows a dramatic reduction during the course of a year.

[0066] Research Tool

[0067] Mammals and cell cultures transfected or transduced with vectors containing the invention are useful as research tools. Mammals and cell cultures are used in research according to numerous techniques known in the art. For example, one may obtain cells or mice (Tybulewicz et al. 1992) that express low levels of the normal or mutant Gcc polypeptide and use them in experiments to assess expression of a recombinant Gcc nucleotide sequence. In an example of such a procedure, experimental groups of mice are transformed with vectors containing recombinant Gcc genes to assess the levels of polypeptide produced, its functionality and the phenotype of the cells or mice (for example, physical characteristics of the cell structure). Some of the changes described above to optimize expression may be omitted if a lower level of expression is desired. It would be obvious to one skilled in the art that changes could be made to alter the levels of polypeptide expression.

[0068] In another example, a cell line (either an immortalized cell culture or a stem cell culture) is transformed with a DNA molecule of the invention (or variants) to measure levels of expression of the DNA molecule and the activity of the DNA molecule. For example, one may obtain mouse or human cell lines or cultures bearing the vector of the invention and obtain expression after the transfer of the cells into immunocompromised mice.

[0069] Using Exogenous Agents In Combination With the Hybrid Gene

[0070] Cells transfected or transduced with a DNA molecule or polypeptide according to the invention may, in appropriate circumstances, be treated with conventional medical treatment of Gaucher disease, such as enzyme replacement therapy. The appropriate combination of treatments would be apparent to a skilled physician.

[0071] Materials and Methods

[0072] Reagents:

[0073] All reagents used during the course of these experiments were of research grade or molecular biology grade, as appropriate. The substrate for the acid β-glucosidase (Gcc) activity assay, 4-methylumbelliferyl-β-D-glucopyranoside (MUGc), was purchased from Sigma and purified additionally as described below. Oligonucleotide primers were obtained, as a lyophilized powder, from the Hospital for Sick Children Biotechnology Service Centre's DNA Synthesis Service. Tissue culture medium (alpha-MEM) was obtained from the University of Toronto Media Preparation Service. Fetal bovine serum was obtained from CanSera through the Hospital for Sick Children Tissue Culture Service.

[0074] Lac Z Neutral β-Galactosidase Assay

[0075] Samples of cell lysates were diluted into water to a final volume of 60 µL. Substrate solution (190 µL), prepared by dissolving 19 mg of 4-MU-β-gal (4-methylumbelliferyl-β-galactoside) in 100 ml of 0.1 M citrate buffer, pH 7.0, was added. The mixture was incubated at 37°C for 30 minutes, and the reaction was then stopped by the addition of 2.0 ml of 0.1 M MAP (2-methyl-2-amino-1-propanol). The fluorescence of standard quantities of free 4-MU in 0.1 M MAP and of the assay mixtures was determined on a fluorescence spectrophotometer using 365 nm excitation and 450 nm emission wavelengths. The polypeptide concentration of the cell lysates was determined by the Bio-Rad method. The specific activity of the lysates was expressed as nmole MU/mg polypeptide.
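The specific-activity calculation described above can be sketched as follows. This is an illustrative sketch only: the assumption of a linear 4-MU standard curve passing through zero, the function names and the example readings are all hypothetical and are not taken from the experiments reported here.

```python
def mu_released_nmol(sample_fluorescence, blank_fluorescence, standard_slope):
    """Convert a fluorescence reading into nmol of free 4-MU.

    standard_slope is fluorescence units per nmol of 4-MU, read from a
    standard curve of free 4-MU in MAP (assumed linear through zero).
    """
    if standard_slope <= 0:
        raise ValueError("standard_slope must be positive")
    return (sample_fluorescence - blank_fluorescence) / standard_slope


def specific_activity(nmol_mu, protein_mg):
    """Specific activity in nmol 4-MU released per mg polypeptide."""
    if protein_mg <= 0:
        raise ValueError("protein_mg must be positive")
    return nmol_mu / protein_mg


# Hypothetical readings, chosen only to illustrate the arithmetic.
nmol = mu_released_nmol(sample_fluorescence=520.0,
                        blank_fluorescence=20.0,
                        standard_slope=100.0)
activity = specific_activity(nmol, protein_mg=0.01)
print(f"{nmol:.1f} nmol 4-MU, {activity:.0f} nmol MU/mg polypeptide")
```

Dividing by the incubation time in hours would give a rate, as used for the Gcc activity threshold stated earlier in the description.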

[0076] Acid β-Glucosidase (Gcc) Activity (Specific or Total):

[0077] Samples were prepared by freeze-thaw lysis (5×) in PBS containing 0.1% sodium taurocholate (NaTC), usually 100 µL for a P100 dish of confluent CHO cells. A sample of the lysate (5-20 µL) was diluted with 0.25% BSA to a total volume of 100 µL. Reagents were added in the following order: citrate/phosphate buffer (1 M/2 M, pH 4.5), 25 µL; 2% NaTC in ddH₂O, 25 µL; and 20 mM MUGc substrate solution, 100 µL. The reaction was typically allowed to proceed for 1 hour at 37°C and then stopped by the addition of 3.0 ml of 0.1 M MAP, pH 10.5. Fluorescence of the released 4-MU was measured on a Perkin Elmer LS 30 Luminescence Spectrometer with sipper attachment. Polypeptide content was determined using the BioRad Protein Assay reagent.

[0078] Substrate solution (20 mM) was prepared by dissolving MUGc in ddH₂O and heating to 40-50°C for 15-20 minutes with occasional agitation. The solution was cooled and then extracted 3× with an equal volume of ethyl acetate. The final aqueous solution was bubbled with N₂ gas (to remove residual ethyl acetate) and aliquoted into tubes, which were then frozen and stored at -20°C until needed. The substrate solution was thawed for use in a beaker of warm water, then vortexed vigorously to ensure complete dissolution of any solid material.
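As a worked example of the quantities involved, the sketch below estimates how much MUGc is needed for a given volume of 20 mM substrate solution and the substrate concentration in the final Gcc assay mixture (100 µL sample, 25 µL buffer, 25 µL NaTC and 100 µL substrate). The molecular formula C16H18O8 (anhydrous MUGc) and the 10 mL batch volume are assumptions made for illustration; neither is stated in this description.

```python
# Standard atomic masses in g/mol.
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "O": 15.999}

# Assumption: anhydrous 4-methylumbelliferyl-beta-D-glucopyranoside.
MUGC_FORMULA = {"C": 16, "H": 18, "O": 8}


def molar_mass(formula):
    """Molar mass (g/mol) from an element -> count mapping."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


def mass_mg_for_solution(conc_mM, volume_mL, molar_mass_g_per_mol):
    """Mass of solute in mg: mM x mL gives micromoles, x g/mol gives micrograms."""
    return conc_mM * volume_mL * molar_mass_g_per_mol / 1000.0


def final_conc_mM(stock_mM, stock_uL, total_uL):
    """Concentration after diluting stock_uL of stock into total_uL."""
    return stock_mM * stock_uL / total_uL


mw = molar_mass(MUGC_FORMULA)
mass_mg = mass_mg_for_solution(conc_mM=20, volume_mL=10, molar_mass_g_per_mol=mw)
assay_total_uL = 100 + 25 + 25 + 100  # sample + buffer + NaTC + substrate
assay_mM = final_conc_mM(stock_mM=20, stock_uL=100, total_uL=assay_total_uL)
print(f"MUGc molar mass {mw:.2f} g/mol, {mass_mg:.1f} mg per 10 mL, "
      f"{assay_mM:.1f} mM in the assay")
```

If a hydrated form of the substrate is used, the molar mass and therefore the mass to weigh out would be higher.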

[0079] Immunoprecipitation Assay:

[0080] For each immunoprecipitation assay, 125 µL of goat anti-rabbit IgG coated magnetic beads (hereafter called "beads") were isolated from suspension using a permanent magnet stand (Advanced Magnetics). Beads were washed 3 times with phosphate-buffered saline containing 0.05% bovine serum albumin (PBS/BSA) by resuspension, and removal from suspension using a magnetic stand, followed by removal of the supernatant. After the final wash the beads were resuspended in 100 µL of PBS/BSA and an appropriate amount of the rabbit anti-Gcc IgG (#5470), usually 4 µg per assay, was added. The mixture was placed in an appropriately sized tube, depending on volume, and allowed to incubate with rotation for 4 hours at 4°C. The bead-antibody complex was precipitated with the permanent magnet stand and washed 3 times with PBS/BSA to remove any remaining free antibody and finally resuspended in 100 µL of PBS/BSA. Cell lysates were prepared as above and diluted into a minimum of 400 µL (to allow for adequate mixing). The washed antibody-bead complex (100 µL) was added to the diluted sample and allowed to incubate overnight at 4°C with rotation. The samples were placed on ice in the permanent magnet stand and allowed to precipitate for ~30 minutes. The beads in each sample were washed (750 µL) with PBS/BSA containing 0.1% Triton X-100, then twice with PBS/BSA containing 0.2% NaTC. After the final wash, the beads were resuspended in PBS/BSA/Triton and assayed for Gcc activity (as described above).

[0081] Expression of Gcc in Transiently Transfected CHO Cells

[0082] CHO cells were co-transfected with 8 µg of either pINEX-5'-GCC-3' or pINEX-WEIRD and 2 µg of pCMV-Lac Z (encoding E. coli β-galactosidase as a control for transfection efficiency) using Superfect Reagent (QIAGEN GmbH, Germany), according to the manufacturer's protocol. Cells were harvested after 2 days and the lysates analyzed for Gcc activity (using the immunoprecipitation assay) and β-galactosidase activity. Final Gcc levels were adjusted based on the relative levels of β-galactosidase activity in each lysate sample.
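The adjustment for transfection efficiency can be illustrated with the values reported in Table 2. The description states only that Gcc levels were adjusted according to relative β-galactosidase activity; the exact formula below, including subtraction of the mock-transfected ("None") background from both the Lac-Z and the human Gcc readings, is an inference made for this sketch. It reproduces the correction factor (10.1) and the corrected value (52.5) reported for the modified construct, but it should be read as an assumption rather than as the documented procedure.

```python
def correction_factor(lacz_reference, lacz_sample, lacz_background):
    """Ratio of background-subtracted Lac-Z activities (reference / sample)."""
    return (lacz_reference - lacz_background) / (lacz_sample - lacz_background)


def corrected_gcc(human_gcc, human_gcc_background, factor):
    """Background-subtracted human Gcc activity scaled by the correction factor."""
    return (human_gcc - human_gcc_background) * factor


# Values as listed in Table 2.
LACZ_NONE, LACZ_WILD_TYPE, LACZ_MODIFIED = 35, 1550, 185
GCC_NONE, GCC_MODIFIED = 1.6, 6.8

cf_modified = correction_factor(LACZ_WILD_TYPE, LACZ_MODIFIED, LACZ_NONE)
gcc_modified = corrected_gcc(GCC_MODIFIED, GCC_NONE, cf_modified)
print(f"C.F. = {cf_modified:.1f}, corrected human Gcc = {gcc_modified:.1f}")
```

Normalizing to a co-transfected reporter in this way separates differences in expression per transfected cell from differences in how many cells took up DNA.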

[0083] Cloned CHO Cells Permanently-Transfected with pINEX-5'-GCC-3':

[0084] CHO cells were co-transfected with 8 µg of pINEX-5'-GCC-3' and 2 µg of pREP10 (containing a hygromycin resistance gene). Selective medium containing 200 µg/mL hygromycin was added, and the cells were allowed to grow for approximately two weeks, splitting as necessary. After two weeks the cells were harvested by trypsinization, counted using a hemocytometer and diluted as necessary to isolate single cells using 96-well dishes. After 10 days, clones that were growing well were transferred into 100 mm dishes and allowed to grow for a further 10 days, splitting as necessary. Cells from each clone were harvested and assayed for Gcc activity. The final clone selected, termed A7, had the highest Gcc activity of all the clones examined.

[0085] RNA Isolation and Reverse Transcription and PCR (RT-PCR):

[0086] Cells were grown in large dishes (P150), and RNA was isolated from control CHO cells and the A7 clone according to the one-step guanidinium isothiocyanate procedure (Chomczynski and Sacchi 1987). RNA (1 µg), primer (SPR2 (see Table 1), 200 pmol), RNase inhibitor, and ddH₂O (to 12.5 µL total) were mixed and incubated at 65°C for 20 minutes. After cooling on ice for 5 minutes, the remaining components of the RT reaction cocktail were added (RT buffer, DTT, dNTPs, RNase inhibitor, and reverse transcriptase). The reaction cocktail (total 25 µL) was incubated at 37°C for 90 min.

[0087] PCR was performed using the RT reaction products (1 µL) as template. After addition of ddH₂O and primers (SPF and 53GCC2000R (see Table 1), 20 pmol each), the reaction was incubated at 95°C for 5 minutes to inactivate the reverse transcriptase. The remaining reaction components (dNTPs, MgCl₂, and Taq polymerase (Gibco BRL)) were used at the manufacturer's suggested levels. Thermocycling was performed under the following conditions: 94°C/3 min; 30 cycles of 94°C/1.5 min, 55°C/1 min, 72°C/1.5 min; 72°C/10 min. Samples of the PCR reaction (10 µL) were loaded onto a 1.5% agarose gel using Tris-Acetate-EDTA buffer (TAE, 40 mM Tris-acetate/2 mM EDTA), electrophoresed and visualized using ethidium bromide.
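The thermocycling program above can be written down as data and its total programmed hold time computed, as in the sketch below. The Step class and the function name are hypothetical, and the result excludes temperature ramping between steps, which depends on the instrument.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    minutes: float  # hold time in minutes


# Program as stated in the text.
INITIAL_DENATURATION = Step(94, 3)
CYCLE = (Step(94, 1.5), Step(55, 1), Step(72, 1.5))  # denature, anneal, extend
N_CYCLES = 30
FINAL_EXTENSION = Step(72, 10)


def total_hold_minutes(initial, cycle, n_cycles, final):
    """Sum of programmed hold times, ignoring ramp times between steps."""
    per_cycle = sum(step.minutes for step in cycle)
    return initial.minutes + n_cycles * per_cycle + final.minutes


total = total_hold_minutes(INITIAL_DENATURATION, CYCLE, N_CYCLES, FINAL_EXTENSION)
print(f"Programmed hold time: {total:.0f} min")
```

The 5 minute incubation at 95°C used to inactivate the reverse transcriptase precedes this program and is not included in the total.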

[0088] Cloning of RT-PCR Products:

[0089] Electrophoresis of the RT-PCR products from the A7 clone cell line showed two major bands. One full RT-PCR reaction mixture (100 µL) was separated electrophoretically on an agarose gel, and the two major product bands were excised and purified using the Qiaex II Gel Extraction Kit (QIAGEN GmbH, Germany). The fragments were cloned into the TA cloning vector (pCR2.1) according to the manufacturer's directions (Invitrogen, Carlsbad, Calif.). The inserts were sequenced using either the ³³S-T7 Sequencing Kit or the ³³P-cycle Sequencing Kit (Amersham Pharmacia Biotech, Sweden) from either the M13 forward or M13 reverse primer location on the vector. Sequencing gels were exposed to BioMaxMR film (Kodak) overnight and subsequently read.

[0090] Site-Directed Mutagenesis (Internal "Weird" Splice Fix):

[0091] The cryptic splice site located within the Gcc cDNA was modified by site-directed mutagenesis in order to remove potential consensus splice junction sites from the Gcc cDNA. A PCR product was obtained using one oligonucleotide primer which mutagenized a number of bases in the putative 3' junction site (3'-junction (see Table 1)) and another for the putative 5' splice junction site (5'-junction (see Table 1)). The PCR reaction contained: 1× Pfu reaction buffer (10× stock provided by manufacturer), 0.4 mM dNTPs, 10 ng template DNA (pINEX-5'-GCC-3'), 500 ng of each oligo, and 2.5 U Pfu DNA Polymerase in a final volume of 50 µL in the appropriate buffer.
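To illustrate what removing a consensus splice junction means at the sequence level, the sketch below counts how many positions of a candidate site agree with the 5' splice donor consensus given in the sequence listing (SEQ ID NO: 1, mmagguaagu, where m is c or a), and compares the wild-type fragment (SEQ ID NO: 2) with the modified fragment (SEQ ID NO: 4). The position-counting score, the requirement for a GT dinucleotide at the start of the putative intron, and the function names are assumptions chosen for illustration; this is a crude heuristic and not the method used to identify the cryptic site.

```python
# IUPAC codes needed for the donor consensus (M = A or C).
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "M": "AC"}

DONOR_CONSENSUS = "MMAGGTAAGT"  # SEQ ID NO: 1 written as DNA
GT_OFFSET = 4                   # index of the GT that begins the intron

WILD_TYPE = "aagccgttgagtagggtaagcatcatggctggcagcctcac"  # SEQ ID NO: 2
MODIFIED = "aagccgttgagtagagtctccatcatggctggcagcctcac"   # SEQ ID NO: 4


def donor_score(window):
    """Number of positions in a 10-nt window that agree with the consensus."""
    return sum(base in IUPAC[code] for base, code in zip(window, DONOR_CONSENSUS))


def best_donor_score(seq):
    """Highest consensus score among windows that carry GT at the intron start."""
    seq = seq.upper().replace("U", "T")
    width = len(DONOR_CONSENSUS)
    best = 0
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if window[GT_OFFSET:GT_OFFSET + 2] != "GT":
            continue  # skip windows lacking the GT dinucleotide
        best = max(best, donor_score(window))
    return best


wild_score = best_donor_score(WILD_TYPE)
modified_score = best_donor_score(MODIFIED)
print(f"wild-type: {wild_score}/10, modified: {modified_score}/10")
```

Under these assumptions the wild-type fragment contains a window matching 7 of 10 consensus positions, whereas the best remaining window in the modified fragment matches only 3.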

[0092] Amplification was performed using a Robocycler 40 Temperature Cycler (Stratagene) for 30 cycles, with temperatures and times as follows: 94°C/45 sec, 59°C/1 min and 72°C/1 min 20 sec. The PCR product was used as a mega-primer in the second round of PCR. The second PCR reaction consisted of: 5 µL of the above PCR reaction mixture, 1× Pfu reaction buffer (10× stock provided by manufacturer), 0.4 mM dNTPs, 50 ng template (pINEX-5'-GCC-3'), 500 ng upstream oligo (SPF) and 5 U Pfu DNA polymerase in a final reaction volume of 100 µL. Reaction temperature conditions used were the same as for the initial PCR above. The PCR products were digested with 10 U of Dra III and Xho I for 3 hr at 37°C. The plasmid pINEX-5'-GCC-3' was digested in parallel using the same method. Digested products were electrophoretically separated on an agarose gel, and the appropriate pieces were excised and purified as described above. Ligation was performed in a 20 µL final volume using 5 U of T4 DNA Ligase (MBI Fermentas, Lithuania), incubating overnight at 16°C to produce pINEX-WEIRD. DNA was transformed into DH5 E. coli cells (Gibco BRL) and plated onto appropriate LB agar plates containing antibiotics. Plasmid DNA was isolated and screened by restriction digest and sequencing to confirm that it contained the appropriate insert.
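Because the modification is meant to disrupt splice signals while preserving the encoded polypeptide, a simple check is to translate the wild-type and modified fragments from the sequence listing and confirm that they encode the same amino acids. The sketch below does this for the 5' junction (SEQ ID NO: 2 versus 4) and the 3' junction (SEQ ID NO: 6 versus 8) using the standard genetic code. The helper names are hypothetical, and trailing bases that do not complete a codon are ignored.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons in frame 1; leftover bases are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def count_differences(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


FRAGMENTS = {
    "5' junction (SEQ ID NO: 2 vs 4)": (
        "aagccgttgagtagggtaagcatcatggctggcagcctcac",
        "aagccgttgagtagagtctccatcatggctggcagcctcac",
    ),
    "3' junction (SEQ ID NO: 6 vs 8)": (
        "tttcctgcccttggtaccttcagccgctatgagagtacac",
        "tttcctgccctgggaacattttcccgctatgagagtacac",
    ),
}

for name, (wild_type, modified) in FRAGMENTS.items():
    same = translate(wild_type) == translate(modified)
    changed = count_differences(wild_type, modified)
    print(f"{name}: {changed} bases changed, peptide unchanged: {same}")
```

In both fragments every changed base falls in a codon that still encodes the original amino acid, which is consistent with the stated aim of preserving Gcc activity.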

[0093] The present invention has been described in detail and with particular reference to the preferred embodiments; however, it will be understood by one having ordinary skill in the art that changes can be made thereto without departing from the spirit and scope of the invention.

[0094] All publications, patents and patent applications are incorporated by reference in their entirety to the same extent as if each individual publication, patent or patent application was specifically and individually indicated to be incorporated by reference in its entirety.

REFERENCES

[0095] Barton N W, Brady R O, Dambrosia J M, Doppelt S H, Hill S C, Holder C A, Mankin H J, et al (1992) Dose-dependent responses to macrophage-targeted glucocerebrosidase in a child with Gaucher disease. J Pediatr 1:277-280

[0096] Beutler E, Kay A C, Saven A, Garver P, Thurston D W, Rosenbloom B E (1991) Enzyme-replacement therapy for Gaucher's disease. N Engl J Med 325:1809-1810

[0097] Brady R O, Murray G J, Barton N W (1994) Modifying exogenous glucocerebrosidase for effective replacement therapy in Gaucher disease. J Inherited Metab Dis 17:510-519

[0098] Chomczynski P, Sacchi N (1987) Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162:156-159

[0099] Dunbar C E, Kohn D B, Schiffmann R, Barton N W, Nolta J A, Esplin J A, Pensiero M, et al (1998) Retroviral transfer of the glucocerebrosidase gene into CD34+ cells from patients with Gaucher disease: In vivo detection of transduced cells without myeloablation. Hum Gene Ther 9:2629-2640

[0100] Fallet S, Grace M E, Sibille A, Mendelson D S, Shapiro R S, Hermann G, Grabowski G A (1992) Enzyme augmentation in moderate to life-threatening Gaucher disease. Pediatr Res 31:496-502

[0101] Grabowski G A, Barton N W, Pastores G, Dambrosia J M, Banerjee T K, McKee M A, Parker C, et al (1995) Enzyme therapy in type 1 Gaucher disease: Comparative efficacy of mannose-terminated glucocerebrosidase from natural and recombinant sources. Ann. Intern. Med. 122:33-39

[0102] Kozak M (1987) At least six nucleotides preceding the AUG initiator codon enhance translation in mammalian cells. J. Mol. Biol. 196:947-950

[0103] Krawczak M, Reiss J, Cooper D N (1992) The mutational spectrum of single base-pair substitutions in mRNA splice junctions of human genes: causes and consequences. Hum. Genet. 90:41-54

[0104] Maquat L E (1996) Defects in RNA splicing and the consequence of shortened translational reading frames. Am. J. Hum. Genet. 59:279-286

[0105] Nolta J A, Yu X J, Bahner I, Kohn D B (1992) Retroviral-mediated transfer of the human glucocerebrosidase gene into cultured Gaucher bone marrow. J Clin Invest 90:342-348

[0106] Rosenthal D I, Doppelt S H, Mankin H J, Dambrosia J M, Xavier R J, McKusick K A, Rosen B R, et al (1995) Enzyme replacement therapy for Gaucher disease: Skeletal responses to macrophage-targeted glucocerebrosidase. Pediatrics 96:629-637

[0107] Schuening F, Longo W L, Atkinson M E, Zaboikin M (1997) Retrovirus-mediated transfer of the cDNA for human glucocerebrosidase into peripheral blood repopulating cells of patients with Gaucher's disease. Hum Gene Ther 8:2143-2160

[0108] Sidransky E, Martin B, Ginns E I (1993) Treatment of Gaucher's disease. N Engl J Med 328:1566-1566

[0109] Tsai P, Lipton J M, Sahdev I, Najfeld V, Rankin L R, Slyper A H, Ludman M, et al (1992) Allogenic bone marrow transplantation in severe Gaucher disease. Pediatr Res 31:503-507

[0110] Tybulewicz V, Tremblay M L, LaMarca M E, Willemsen R, Stubblefield B K, Winfield S, Zablocka B, et al (1992) Animal model of Gaucher's disease from targeted disruption of the mouse glucocerebrosidase gene. Nature 357:407-410

TABLE 1 Sequence of Oligos Used in this Study

Oligo Name	Oligo Sequence* (5' to 3')
SPR2	GCCAGTGTGATGGATATCTGC
SPF	GACCGATCCAGCCTCCGGACTCT
53GCC2000R	GCCGCACACTCTGCTCCCAGAA
3'-junction	CATCCGTCGCCCACTGCGTGTACTCTCATAGCGGGAAAA TGTCAGGGCAGG
5'-junction	CCTTTGAGTAGAGTCTCCATCATGGCTGGC

*Bases underlined indicate bases changed in site-directed mutagenesis PCR procedures.

[0111]

TABLE 2 One of 18 Transient Expression Experiments Comparing the Wild-Type Gcc cDNA (pInex5'3'Gcc) with the Gcc cDNA Modified to Remove the Cryptic Splice Sites (pInexWEIRD)

Vector	Lac-Z (pmoles/	Correction Factor (C.F.)	Total Gcc (pmoles/hr/µg)	Human Gcc (pmoles/hr/µg)	C.F. × Human	% pInex5'3'Gcc
None	35	N/A	67	1.6	N/A	15
pInex5'3'Gcc	1550	1	101	10.5	8.8	100
pInexWEIRD	185	10.1	63	6.8	52.5	597

[0112]

Sequence CWU 1

1

19 1 10 RNA Homo sapiens m=1-2 m=c or a 1 mmagguaagu 10 2 41 DNA Homo sapiens 2 aagccgttga gtagggtaag catcatggct ggcagcctca c 41 3 14 PRT Homo sapiens 3 Lys Pro Leu Ser Arg Val Ser Ile Met Ala Gly Ser Leu Thr 1 5 10 4 41 DNA Homo sapiens 4 aagccgttga gtagagtctc catcatggct ggcagcctca c 41 5 15 RNA Homo sapiens misc_difference y=1-10; n=11 y=c or u; n=any nucleotide 5 yyyyyyyyyy ncagg 15 6 40 DNA Homo sapiens 6 tttcctgccc ttggtacctt cagccgctat gagagtacac 40 7 14 PRT Homo sapiens 7 Phe Pro Ala Leu Gly Thr Phe Ser Arg Tyr Glu Ser Thr Arg 1 5 10 8 40 DNA Homo sapiens 8 tttcctgccc tgggaacatt ttcccgctat gagagtacac 40 9 32 DNA Homo sapiens 9 aagccgttga gtaggccgct atgagagtac ac 32 10 7 PRT Homo sapiens 10 Lys Pro Leu Ser Arg Pro Leu 1 5 11 1741 DNA Homo sapiens CDS (37)..(1647) misc_difference n=1 n=any nucleotide 11 ngcggccgct tagcttgact taagaaggcc gacgcc atg gag ttt tca agt cct 54 Met Glu Phe Ser Ser Pro 1 5 tcc aga gag gaa tgt ccc aag cct ttg agt agg gta agc atc atg gct 102 Ser Arg Glu Glu Cys Pro Lys Pro Leu Ser Arg Val Ser Ile Met Ala 10 15 20 ggc agc ctc aca ggt ttg ctt cta ctt cag gca gtg tcg tgg gca tca 150 Gly Ser Leu Thr Gly Leu Leu Leu Leu Gln Ala Val Ser Trp Ala Ser 25 30 35 ggt gcc cgc ccc tgc atc cct aaa agc ttc ggc tac agc tcg gtg gtg 198 Gly Ala Arg Pro Cys Ile Pro Lys Ser Phe Gly Tyr Ser Ser Val Val 40 45 50 tgt gtc tgc aat gcc aca tac tgt gac tcc ttt gac ccc ccg acc ttt 246 Cys Val Cys Asn Ala Thr Tyr Cys Asp Ser Phe Asp Pro Pro Thr Phe 55 60 65 70 cct gcc ctt ggt acc ttc agc cgc tat gag agt aca cgc agt ggg cga 294 Pro Ala Leu Gly Thr Phe Ser Arg Tyr Glu Ser Thr Arg Ser Gly Arg 75 80 85 cgg atg gag ctg agt atg ggg ccc atc cag gct aat cac acg ggc aca 342 Arg Met Glu Leu Ser Met Gly Pro Ile Gln Ala Asn His Thr Gly Thr 90 95 100 ggc ctg cta ctg acc ctg cag cca gaa cag aag ttc cag aaa gtg aag 390 Gly Leu Leu Leu Thr Leu Gln Pro Glu Gln Lys Phe Gln Lys Val Lys 105 110 115 gga ttt gga ggg gcc atg aca gat gct gct gct ctc aac atc ctt gcc 438 Gly Phe Gly Gly Ala 
Met Thr Asp Ala Ala Ala Leu Asn Ile Leu Ala 120 125 130 ctg tca ccc cct gcc caa aat ttg cta ctt aaa tcg tac ttc tct gaa 486 Leu Ser Pro Pro Ala Gln Asn Leu Leu Leu Lys Ser Tyr Phe Ser Glu 135 140 145 150 gaa gga atc gga tat aac atc atc cgg gta ccc atg gcc agc tgt gac 534 Glu Gly Ile Gly Tyr Asn Ile Ile Arg Val Pro Met Ala Ser Cys Asp 155 160 165 ttc tcc atc cgc acc tac acc tat gca gac acc cct gat gat ttc cag 582 Phe Ser Ile Arg Thr Tyr Thr Tyr Ala Asp Thr Pro Asp Asp Phe Gln 170 175 180 ttg cac aac ttc agc ctc cca gag gaa gat acc aag ctc aag ata ccc 630 Leu His Asn Phe Ser Leu Pro Glu Glu Asp Thr Lys Leu Lys Ile Pro 185 190 195 ctg att cac cga gcc ctg cag ttg gcc cag cgt ccc gtt tca ctc ctt 678 Leu Ile His Arg Ala Leu Gln Leu Ala Gln Arg Pro Val Ser Leu Leu 200 205 210 gcc agc ccc tgg aca tca ccc act tgg ctc aag acc aat gga gcg gtg 726 Ala Ser Pro Trp Thr Ser Pro Thr Trp Leu Lys Thr Asn Gly Ala Val 215 220 225 230 aat ggg aag ggg tca ctc aag gga cag ccc gga gac atc tac cac cag 774 Asn Gly Lys Gly Ser Leu Lys Gly Gln Pro Gly Asp Ile Tyr His Gln 235 240 245 acc tgg gcc aga tac ttt gtg aag ttc ctg gat gcc tat gct gag cac 822 Thr Trp Ala Arg Tyr Phe Val Lys Phe Leu Asp Ala Tyr Ala Glu His 250 255 260 aag tta cag ttc tgg gca gtg aca gct gaa aat gag cct tct gct ggg 870 Lys Leu Gln Phe Trp Ala Val Thr Ala Glu Asn Glu Pro Ser Ala Gly 265 270 275 ctg ttg agt gga tac ccc ttc cag tgc ctg ggc ttc acc cct gaa cat 918 Leu Leu Ser Gly Tyr Pro Phe Gln Cys Leu Gly Phe Thr Pro Glu His 280 285 290 cag cga gac tta att gcc cgt gac cta ggt cct acc ctc gcc aac agt 966 Gln Arg Asp Leu Ile Ala Arg Asp Leu Gly Pro Thr Leu Ala Asn Ser 295 300 305 310 act cac cac aat gtc cgc cta ctc atg ctg gat gac caa cgc ttg ctg 1014 Thr His His Asn Val Arg Leu Leu Met Leu Asp Asp Gln Arg Leu Leu 315 320 325 ctg ccc cac tgg gca aag gtg gta ctg aca gac cca gaa gca gct aaa 1062 Leu Pro His Trp Ala Lys Val Val Leu Thr Asp Pro Glu Ala Ala Lys 330 335 340 tat gtt cat ggc att gct gta cat tgg tac ctg gac ttt ctg 
gct cca 1110 Tyr Val His Gly Ile Ala Val His Trp Tyr Leu Asp Phe Leu Ala Pro 345 350 355 gcc aaa gcc acc cta ggg gag aca cac cgc ctg ttc ccc aac acc atg 1158 Ala Lys Ala Thr Leu Gly Glu Thr His Arg Leu Phe Pro Asn Thr Met 360 365 370 ctc ttt gcc tca gag gcc tgt gtg ggc tcc aag ttc tgg gag cag agt 1206 Leu Phe Ala Ser Glu Ala Cys Val Gly Ser Lys Phe Trp Glu Gln Ser 375 380 385 390 gtg cgg cta ggc tcc tgg gat cga ggg atg cag tac agc cac agc atc 1254 Val Arg Leu Gly Ser Trp Asp Arg Gly Met Gln Tyr Ser His Ser Ile 395 400 405 atc acg aac ctc ctg tac cat gtg gtc ggc tgg acc gac tgg aac ctt 1302 Ile Thr Asn Leu Leu Tyr His Val Val Gly Trp Thr Asp Trp Asn Leu 410 415 420 gcc ctg aac ccc gaa gga gga ccc aat tgg gtg cgt aac ttt gtc gac 1350 Ala Leu Asn Pro Glu Gly Gly Pro Asn Trp Val Arg Asn Phe Val Asp 425 430 435 agt ccc atc att gta gac atc acc aag gac acg ttt tac aaa cag ccc 1398 Ser Pro Ile Ile Val Asp Ile Thr Lys Asp Thr Phe Tyr Lys Gln Pro 440 445 450 atg ttc tac cac ctt ggc cat ttc agc aag ttc att cct gag ggc tcc 1446 Met Phe Tyr His Leu Gly His Phe Ser Lys Phe Ile Pro Glu Gly Ser 455 460 465 470 cag aga gtg ggg ctg gtt gcc agt cag aag aac gac ctg gac gca gtg 1494 Gln Arg Val Gly Leu Val Ala Ser Gln Lys Asn Asp Leu Asp Ala Val 475 480 485 gca ttg atg cat ccc gat ggc tct gct gtt gtg gtc gtg cta aac cgc 1542 Ala Leu Met His Pro Asp Gly Ser Ala Val Val Val Val Leu Asn Arg 490 495 500 tcc tct aag gat gtg cct ctt acc atc aag gat cct gct gtg ggc ttc 1590 Ser Ser Lys Asp Val Pro Leu Thr Ile Lys Asp Pro Ala Val Gly Phe 505 510 515 ctg gag aca atc tca cct ggc tac tcc att cac acc tac ctg tgg cat 1638 Leu Glu Thr Ile Ser Pro Gly Tyr Ser Ile His Thr Tyr Leu Trp His 520 525 530 cgc cag tga tggagcagat actcaaggag gcactgggct cagcctgggc 1687 Arg Gln 535 attaaaggga cagagtcagc gaattctgca gatatccatc acactggcgg ccgc 1741 12 536 PRT Homo sapiens 12 Met Glu Phe Ser Ser Pro Ser Arg Glu Glu Cys Pro Lys Pro Leu Ser 1 5 10 15 Arg Val Ser Ile Met Ala Gly Ser Leu Thr Gly Leu Leu Leu Leu Gln 20 
25 30 Ala Val Ser Trp Ala Ser Gly Ala Arg Pro Cys Ile Pro Lys Ser Phe 35 40 45 Gly Tyr Ser Ser Val Val Cys Val Cys Asn Ala Thr Tyr Cys Asp Ser 50 55 60 Phe Asp Pro Pro Thr Phe Pro Ala Leu Gly Thr Phe Ser Arg Tyr Glu 65 70 75 80 Ser Thr Arg Ser Gly Arg Arg Met Glu Leu Ser Met Gly Pro Ile Gln 85 90 95 Ala Asn His Thr Gly Thr Gly Leu Leu Leu Thr Leu Gln Pro Glu Gln 100 105 110 Lys Phe Gln Lys Val Lys Gly Phe Gly Gly Ala Met Thr Asp Ala Ala 115 120 125 Ala Leu Asn Ile Leu Ala Leu Ser Pro Pro Ala Gln Asn Leu Leu Leu 130 135 140 Lys Ser Tyr Phe Ser Glu Glu Gly Ile Gly Tyr Asn Ile Ile Arg Val 145 150 155 160 Pro Met Ala Ser Cys Asp Phe Ser Ile Arg Thr Tyr Thr Tyr Ala Asp 165 170 175 Thr Pro Asp Asp Phe Gln Leu His Asn Phe Ser Leu Pro Glu Glu Asp 180 185 190 Thr Lys Leu Lys Ile Pro Leu Ile His Arg Ala Leu Gln Leu Ala Gln 195 200 205 Arg Pro Val Ser Leu Leu Ala Ser Pro Trp Thr Ser Pro Thr Trp Leu 210 215 220 Lys Thr Asn Gly Ala Val Asn Gly Lys Gly Ser Leu Lys Gly Gln Pro 225 230 235 240 Gly Asp Ile Tyr His Gln Thr Trp Ala Arg Tyr Phe Val Lys Phe Leu 245 250 255 Asp Ala Tyr Ala Glu His Lys Leu Gln Phe Trp Ala Val Thr Ala Glu 260 265 270 Asn Glu Pro Ser Ala Gly Leu Leu Ser Gly Tyr Pro Phe Gln Cys Leu 275 280 285 Gly Phe Thr Pro Glu His Gln Arg Asp Leu Ile Ala Arg Asp Leu Gly 290 295 300 Pro Thr Leu Ala Asn Ser Thr His His Asn Val Arg Leu Leu Met Leu 305 310 315 320 Asp Asp Gln Arg Leu Leu Leu Pro His Trp Ala Lys Val Val Leu Thr 325 330 335 Asp Pro Glu Ala Ala Lys Tyr Val His Gly Ile Ala Val His Trp Tyr 340 345 350 Leu Asp Phe Leu Ala Pro Ala Lys Ala Thr Leu Gly Glu Thr His Arg 355 360 365 Leu Phe Pro Asn Thr Met Leu Phe Ala Ser Glu Ala Cys Val Gly Ser 370 375 380 Lys Phe Trp Glu Gln Ser Val Arg Leu Gly Ser Trp Asp Arg Gly Met 385 390 395 400 Gln Tyr Ser His Ser Ile Ile Thr Asn Leu Leu Tyr His Val Val Gly 405 410 415 Trp Thr Asp Trp Asn Leu Ala Leu Asn Pro Glu Gly Gly Pro Asn Trp 420 425 430 Val Arg Asn Phe Val Asp Ser Pro Ile Ile Val Asp Ile Thr Lys Asp 435 440 445 Thr Phe 
Tyr Lys Gln Pro Met Phe Tyr His Leu Gly His Phe Ser Lys 450 455 460 Phe Ile Pro Glu Gly Ser Gln Arg Val Gly Leu Val Ala Ser Gln Lys 465 470 475 480 Asn Asp Leu Asp Ala Val Ala Leu Met His Pro Asp Gly Ser Ala Val 485 490 495 Val Val Val Leu Asn Arg Ser Ser Lys Asp Val Pro Leu Thr Ile Lys 500 505 510 Asp Pro Ala Val Gly Phe Leu Glu Thr Ile Ser Pro Gly Tyr Ser Ile 515 520 525 His Thr Tyr Leu Trp His Arg Gln 530 535 13 1741 DNA Homo sapiens CDS (37)..(1647) misc_difference n=1 n=any nucleotide 13 ngcggccgct tagcttgact taagaaggcc gacgcc atg gag ttt tca agt cct 54 Met Glu Phe Ser Ser Pro 1 5 tcc aga gag gaa tgt ccc aag cct ttg agt aga gtc tcc atc atg gct 102 Ser Arg Glu Glu Cys Pro Lys Pro Leu Ser Arg Val Ser Ile Met Ala 10 15 20 ggc agc ctc aca ggt ttg ctt cta ctt cag gca gtg tcg tgg gca tca 150 Gly Ser Leu Thr Gly Leu Leu Leu Leu Gln Ala Val Ser Trp Ala Ser 25 30 35 ggt gcc cgc ccc tgc atc cct aaa agc ttc ggc tac agc tcg gtg gtg 198 Gly Ala Arg Pro Cys Ile Pro Lys Ser Phe Gly Tyr Ser Ser Val Val 40 45 50 tgt gtc tgc aat gcc aca tac tgt gac tcc ttt gac ccc ccg acc ttt 246 Cys Val Cys Asn Ala Thr Tyr Cys Asp Ser Phe Asp Pro Pro Thr Phe 55 60 65 70 cct gcc ctg gga aca ttt tcc cgc tat gag agt aca cgc agt ggg cga 294 Pro Ala Leu Gly Thr Phe Ser Arg Tyr Glu Ser Thr Arg Ser Gly Arg 75 80 85 cgg atg gag ctg agt atg ggg ccc atc cag gct aat cac acg ggc aca 342 Arg Met Glu Leu Ser Met Gly Pro Ile Gln Ala Asn His Thr Gly Thr 90 95 100 ggc ctg cta ctg acc ctg cag cca gaa cag aag ttc cag aaa gtg aag 390 Gly Leu Leu Leu Thr Leu Gln Pro Glu Gln Lys Phe Gln Lys Val Lys 105 110 115 gga ttt gga ggg gcc atg aca gat gct gct gct ctc aac atc ctt gcc 438 Gly Phe Gly Gly Ala Met Thr Asp Ala Ala Ala Leu Asn Ile Leu Ala 120 125 130 ctg tca ccc cct gcc caa aat ttg cta ctt aaa tcg tac ttc tct gaa 486 Leu Ser Pro Pro Ala Gln Asn Leu Leu Leu Lys Ser Tyr Phe Ser Glu 135 140 145 150 gaa gga atc gga tat aac atc atc cgg gta ccc atg gcc agc tgt gac 534 Glu Gly Ile Gly Tyr Asn Ile Ile Arg Val Pro Met Ala 
Ser Cys Asp 155 160 165 ttc tcc atc cgc acc tac acc tat gca gac acc cct gat gat ttc cag 582 Phe Ser Ile Arg Thr Tyr Thr Tyr Ala Asp Thr Pro Asp Asp Phe Gln 170 175 180 ttg cac aac ttc agc ctc cca gag gaa gat acc aag ctc aag ata ccc 630 Leu His Asn Phe Ser Leu Pro Glu Glu Asp Thr Lys Leu Lys Ile Pro 185 190 195 ctg att cac cga gcc ctg cag ttg gcc cag cgt ccc gtt tca ctc ctt 678 Leu Ile His Arg Ala Leu Gln Leu Ala Gln Arg Pro Val Ser Leu Leu 200 205 210 gcc agc ccc tgg aca tca ccc act tgg ctc aag acc aat gga gcg gtg 726 Ala Ser Pro Trp Thr Ser Pro Thr Trp Leu Lys Thr Asn Gly Ala Val 215 220 225 230 aat ggg aag ggg tca ctc aag gga cag ccc gga gac atc tac cac cag 774 Asn Gly Lys Gly Ser Leu Lys Gly Gln Pro Gly Asp Ile Tyr His Gln 235 240 245 acc tgg gcc aga tac ttt gtg aag ttc ctg gat gcc tat gct gag cac 822 Thr Trp Ala Arg Tyr Phe Val Lys Phe Leu Asp Ala Tyr Ala Glu His 250 255 260 aag tta cag ttc tgg gca gtg aca gct gaa aat gag cct tct gct ggg 870 Lys Leu Gln Phe Trp Ala Val Thr Ala Glu Asn Glu Pro Ser Ala Gly 265 270 275 ctg ttg agt gga tac ccc ttc cag tgc ctg ggc ttc acc cct gaa cat 918 Leu Leu Ser Gly Tyr Pro Phe Gln Cys Leu Gly Phe Thr Pro Glu His 280 285 290 cag cga gac tta att gcc cgt gac cta ggt cct acc ctc gcc aac agt 966 Gln Arg Asp Leu Ile Ala Arg Asp Leu Gly Pro Thr Leu Ala Asn Ser 295 300 305 310 act cac cac aat gtc cgc cta ctc atg ctg gat gac caa cgc ttg ctg 1014 Thr His His Asn Val Arg Leu Leu Met Leu Asp Asp Gln Arg Leu Leu 315 320 325 ctg ccc cac tgg gca aag gtg gta ctg aca gac cca gaa gca gct aaa 1062 Leu Pro His Trp Ala Lys Val Val Leu Thr Asp Pro Glu Ala Ala Lys 330 335 340 tat gtt cat ggc att gct gta cat tgg tac ctg gac ttt ctg gct cca 1110 Tyr Val His Gly Ile Ala Val His Trp Tyr Leu Asp Phe Leu Ala Pro 345 350 355 gcc aaa gcc acc cta ggg gag aca cac cgc ctg ttc ccc aac acc atg 1158 Ala Lys Ala Thr Leu Gly Glu Thr His Arg Leu Phe Pro Asn Thr Met 360 365 370 ctc ttt gcc tca gag gcc tgt gtg ggc tcc aag ttc tgg gag cag agt 1206 Leu Phe Ala Ser Glu 
Ala Cys Val Gly Ser Lys Phe Trp Glu Gln Ser 375 380 385 390 gtg cgg cta ggc tcc tgg gat cga ggg atg cag tac agc cac agc atc 1254 Val Arg Leu Gly Ser Trp Asp Arg Gly Met Gln Tyr Ser His Ser Ile 395 400 405 atc acg aac ctc ctg tac cat gtg gtc ggc tgg acc gac tgg aac ctt 1302 Ile Thr Asn Leu Leu Tyr His Val Val Gly Trp Thr Asp Trp Asn Leu 410 415 420 gcc ctg aac ccc gaa gga gga ccc aat tgg gtg cgt aac ttt gtc gac 1350 Ala Leu Asn Pro Glu Gly Gly Pro Asn Trp Val Arg Asn Phe Val Asp 425 430 435 agt ccc atc att gta gac atc acc aag gac acg ttt tac aaa cag ccc 1398 Ser Pro Ile Ile Val Asp Ile Thr Lys Asp Thr Phe Tyr Lys Gln Pro 440 445 450 atg ttc tac cac ctt ggc cat ttc agc aag ttc att cct gag ggc tcc 1446 Met Phe Tyr His Leu Gly His Phe Ser Lys Phe Ile Pro Glu Gly Ser 455 460 465 470 cag aga gtg ggg ctg gtt gcc agt cag aag aac gac ctg gac gca gtg 1494 Gln Arg Val Gly Leu Val Ala Ser Gln Lys Asn Asp Leu Asp Ala Val 475 480 485 gca ttg atg cat ccc gat ggc tct gct gtt gtg gtc gtg cta aac cgc 1542 Ala Leu Met His Pro Asp Gly Ser Ala Val Val Val Val Leu Asn Arg 490 495 500 tcc tct aag gat gtg cct ctt acc atc aag gat cct gct gtg ggc ttc 1590 Ser Ser Lys Asp Val Pro Leu Thr Ile Lys Asp Pro Ala Val Gly Phe 505

510 515 ctg gag aca atc tca cct ggc tac tcc att cac acc tac ctg tgg cat 1638 Leu Glu Thr Ile Ser Pro Gly Tyr Ser Ile His Thr Tyr Leu Trp His 520 525 530 cgc cag tga tggagcagat actcaaggag gcactgggct cagcctgggc 1687 Arg Gln 535 attaaaggga cagagtcagc gaattctgca gatatccatc acactggcgg ccgc 1741 14 536 PRT Homo sapiens 14 Met Glu Phe Ser Ser Pro Ser Arg Glu Glu Cys Pro Lys Pro Leu Ser 1 5 10 15 Arg Val Ser Ile Met Ala Gly Ser Leu Thr Gly Leu Leu Leu Leu Gln 20 25 30 Ala Val Ser Trp Ala Ser Gly Ala Arg Pro Cys Ile Pro Lys Ser Phe 35 40 45 Gly Tyr Ser Ser Val Val Cys Val Cys Asn Ala Thr Tyr Cys Asp Ser 50 55 60 Phe Asp Pro Pro Thr Phe Pro Ala Leu Gly Thr Phe Ser Arg Tyr Glu 65 70 75 80 Ser Thr Arg Ser Gly Arg Arg Met Glu Leu Ser Met Gly Pro Ile Gln 85 90 95 Ala Asn His Thr Gly Thr Gly Leu Leu Leu Thr Leu Gln Pro Glu Gln 100 105 110 Lys Phe Gln Lys Val Lys Gly Phe Gly Gly Ala Met Thr Asp Ala Ala 115 120 125 Ala Leu Asn Ile Leu Ala Leu Ser Pro Pro Ala Gln Asn Leu Leu Leu 130 135 140 Lys Ser Tyr Phe Ser Glu Glu Gly Ile Gly Tyr Asn Ile Ile Arg Val 145 150 155 160 Pro Met Ala Ser Cys Asp Phe Ser Ile Arg Thr Tyr Thr Tyr Ala Asp 165 170 175 Thr Pro Asp Asp Phe Gln Leu His Asn Phe Ser Leu Pro Glu Glu Asp 180 185 190 Thr Lys Leu Lys Ile Pro Leu Ile His Arg Ala Leu Gln Leu Ala Gln 195 200 205 Arg Pro Val Ser Leu Leu Ala Ser Pro Trp Thr Ser Pro Thr Trp Leu 210 215 220 Lys Thr Asn Gly Ala Val Asn Gly Lys Gly Ser Leu Lys Gly Gln Pro 225 230 235 240 Gly Asp Ile Tyr His Gln Thr Trp Ala Arg Tyr Phe Val Lys Phe Leu 245 250 255 Asp Ala Tyr Ala Glu His Lys Leu Gln Phe Trp Ala Val Thr Ala Glu 260 265 270 Asn Glu Pro Ser Ala Gly Leu Leu Ser Gly Tyr Pro Phe Gln Cys Leu 275 280 285 Gly Phe Thr Pro Glu His Gln Arg Asp Leu Ile Ala Arg Asp Leu Gly 290 295 300 Pro Thr Leu Ala Asn Ser Thr His His Asn Val Arg Leu Leu Met Leu 305 310 315 320 Asp Asp Gln Arg Leu Leu Leu Pro His Trp Ala Lys Val Val Leu Thr 325 330 335 Asp Pro Glu Ala Ala Lys Tyr Val His Gly Ile Ala Val His Trp Tyr 340 345 350 Leu Asp Phe 
Leu Ala Pro Ala Lys Ala Thr Leu Gly Glu Thr His Arg 355 360 365 Leu Phe Pro Asn Thr Met Leu Phe Ala Ser Glu Ala Cys Val Gly Ser 370 375 380 Lys Phe Trp Glu Gln Ser Val Arg Leu Gly Ser Trp Asp Arg Gly Met 385 390 395 400 Gln Tyr Ser His Ser Ile Ile Thr Asn Leu Leu Tyr His Val Val Gly 405 410 415 Trp Thr Asp Trp Asn Leu Ala Leu Asn Pro Glu Gly Gly Pro Asn Trp 420 425 430 Val Arg Asn Phe Val Asp Ser Pro Ile Ile Val Asp Ile Thr Lys Asp 435 440 445 Thr Phe Tyr Lys Gln Pro Met Phe Tyr His Leu Gly His Phe Ser Lys 450 455 460 Phe Ile Pro Glu Gly Ser Gln Arg Val Gly Leu Val Ala Ser Gln Lys 465 470 475 480 Asn Asp Leu Asp Ala Val Ala Leu Met His Pro Asp Gly Ser Ala Val 485 490 495 Val Val Val Leu Asn Arg Ser Ser Lys Asp Val Pro Leu Thr Ile Lys 500 505 510 Asp Pro Ala Val Gly Phe Leu Glu Thr Ile Ser Pro Gly Tyr Ser Ile 515 520 525 His Thr Tyr Leu Trp His Arg Gln 530 535
15 21 DNA Artificial Sequence oligonucleotide 15 gccagtgtga tggatatctg c 21
16 23 DNA Artificial Sequence oligonucleotide 16 gaccgatcca gcctccggac tct 23
17 22 DNA Artificial Sequence oligonucleotide 17 gccgcacact ctgctcccag aa 22
18 51 DNA Artificial Sequence oligonucleotide 18 catccgtcgc ccactgcgtg tactctcata gcgggaaaat gtcagggcag g 51
19 30 DNA Artificial Sequence oligonucleotide 19 cctttgagta gagtctccat catggctggc 30
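The figures in the listing above can be cross-checked mechanically. The following is a minimal Python sketch, not part of the application as filed: it checks that the CDS feature (37)..(1647) of sequence 13 is consistent with the 536-residue protein of sequence 14, that each oligonucleotide (sequences 15 to 19) has its declared length, and that the reverse complement of oligonucleotide 15 occurs in the 3' untranslated tail of sequence 13. The helper names (clean, reverse_complement, cds_codons) are illustrative assumptions; the sequence data are copied from the listing.

```python
# Illustrative consistency checks on the sequence listing.
# All helper names are hypothetical; the data are copied from the listing.

CDS_START, CDS_END = 37, 1647   # 1-based, inclusive CDS feature of sequence 13
PROTEIN_LENGTH = 536            # declared length of sequence 14

# Oligonucleotides 15-19 as printed, each with its declared length.
OLIGOS = {
    15: ("gccagtgtga tggatatctg c", 21),
    16: ("gaccgatcca gcctccggac tct", 23),
    17: ("gccgcacact ctgctcccag aa", 22),
    18: ("catccgtcgc ccactgcgtg tactctcata gcgggaaaat gtcagggcag g", 51),
    19: ("cctttgagta gagtctccat catggctggc", 30),
}

# Last 34 nucleotides of sequence 13 (positions 1708-1741) as printed.
SEQ13_3PRIME_TAIL = "gaattctgca gatatccatc acactggcgg ccgc"


def clean(seq: str) -> str:
    """Remove the spacing used in the printed listing and lower-case."""
    return "".join(seq.split()).lower()


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (a, c, g, t only)."""
    return clean(seq).translate(str.maketrans("acgt", "tgca"))[::-1]


def cds_codons(start: int, end: int) -> int:
    """Number of codons in a 1-based inclusive CDS range, stop codon included."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("CDS length is not a multiple of three")
    return length // 3


codons = cds_codons(CDS_START, CDS_END)
print("codons including stop:", codons)
print("residues excluding stop:", codons - 1)

for seq_id, (text, declared) in OLIGOS.items():
    print(seq_id, len(clean(text)), "declared", declared)

print(reverse_complement(OLIGOS[15][0]) in clean(SEQ13_3PRIME_TAIL))
```

A CDS of 1611 nucleotides gives 537 codons, that is, 536 amino acid residues plus the tga stop codon, which agrees with the declared length of sequence 14.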

* * * * *