U.S. patent application number 10/137216 was filed with the patent office on 2002-05-01 and published on 2003-04-10 for method for the identification of synthetic cell- or tissue-specific transcriptional regulatory regions.
This patent application is currently assigned to Baylor College of Medicine. Invention is credited to Eastman, Eric M., Li, Xuyang, Nordstrom, Jeff, Schwartz, Robert J..
Application Number | 20030068631 10/137216 |
Family ID | 21977393 |
Publication Date | 2003-04-10 |
United States Patent Application | 20030068631 |
Kind Code | A1 |
Schwartz, Robert J.; et al. | April 10, 2003 |
Method for the identification of synthetic cell- or tissue-specific
transcriptional regulatory regions
Abstract
The invention concerns making and evaluating synthetic
regulatory regions for controlling gene expression. The invention
features a method for identifying transcription factor binding
sites and a method for evaluating the regulatory functions of
synthetic regulatory regions.
Inventors: |
Schwartz, Robert J. (Houston, TX); Eastman, Eric M. (Highland, MD);
Li, Xuyang (Houston, TX); Nordstrom, Jeff (College Station, TX) |
Correspondence
Address: |
LYON & LYON LLP/ VALENTIS INC.
633 WEST FIFTH STREET, SUITE 4700
LOS ANGELES
CA
90071-2066
US
|
Assignee: |
Baylor College of Medicine
|
Family ID: |
21977393 |
Appl. No.: |
10/137216 |
Filed: |
May 1, 2002 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10137216 | May 1, 2002 | |
09115407 | Jul 14, 1998 | 6410228 |
60052403 | Jul 14, 1997 | |
Current U.S.
Class: |
435/6.12 ;
435/6.13; 435/91.2 |
Current CPC
Class: |
C12Q 1/6897 20130101;
C12Q 1/6811 20130101; C12Q 1/6876 20130101; C12N 15/1034 20130101;
C12N 15/1051 20130101; C12Q 2600/158 20130101 |
Class at
Publication: |
435/6 ;
435/91.2 |
International
Class: |
C12Q 001/68; C12P
019/34 |
Claims
We claim:
1. A method of identifying binding sites for transcription factors,
comprising the step of: identifying the oligonucleotides in
oligonucleotide-protein complexes formed between one or more
proteins of a cellular or nuclear extract and any of a plurality of
double-stranded oligonucleotide fragments in a mixture of said
fragments and said extract wherein said complexes are separated
from free oligonucleotides in said mixture using size exclusion
chromatography; and wherein the presence of a said oligonucleotide
in a said complex is indicative that said oligonucleotide comprises
a said binding site.
2. The method of claim 1, wherein a said double-stranded
oligonucleotide fragment is made by synthesizing a single-stranded
oligonucleotide and converting said single-stranded oligonucleotide
to a double-stranded oligonucleotide.
3. The method of claim 1, wherein said oligonucleotide fragment
comprises a central random sequence and both restriction sites and
primer sequences in the 5' and 3' ends.
4. The method of claim 1, wherein said identifying comprises
amplifying and sequencing said oligonucleotides from said
protein-oligonucleotide complexes.
5. The method of claim 4, wherein said amplifying is performed by
polymerase chain reaction.
6. A method for evaluating a putative cell- or tissue-specific
transcriptional regulatory region, comprising the step of:
determining whether a cell comprising said putative transcriptional
regulatory region in a transcriptional regulatory position to a
selective gene is selected under selective conditions, wherein said
selection of said cell will only occur if said selective gene is
expressed at a sufficiently high level in said cell and wherein
said selective gene will only be expressed at said sufficiently
high level if said putative transcriptional regulatory region is
active in said cell.
7. The method of claim 6, wherein said selection condition is a
positive selection condition, and wherein said method comprises the
steps of: culturing one or more cells under said positive selection
condition, wherein said at least one said cell each contain a
different nucleic acid test sequence inserted in a transcriptional
regulatory position to a selective gene; and wherein a cell of said
one or more cells cannot be selected under said positive selection
condition in the absence of high level expression of said selective
gene; and determining whether any of said cells can be selected,
wherein selection of a cell is indicative that the said nucleic
acid test sequence in said cell contains a transcriptional region
active in said cell.
8. The method of claim 7, wherein said selective gene is a gene
encoding an antigen.
9. The method of claim 7, wherein said positive selection condition
comprises the steps of: contacting said cells with
fluorescence-labeled antibody specific for cells expressing said
selective gene; and selecting cells expressing said selective gene
with fluorescence activated cell sorting.
10. The method of claim 7, wherein said positive selection
condition comprises the steps of: contacting said cells with
iron-labeled antibody specific for cells expressing said selective
gene; and selecting cells expressing said selective gene with
magnetic bead sorting.
11. The method of claim 6, wherein said selection condition is a
stress condition, said method comprising the steps of: culturing
one or more cells under said stress condition, wherein each of said
at least one cell contains a different nucleic acid test sequence
inserted in a transcriptional regulatory position to a selective
gene, wherein said selective gene is a protective gene and wherein
the growth of said one or more cells is inhibited under said stress
condition in the absence of high level expression of said
protective gene; and wherein growth of a cell of said at least one
cell in the presence of said stress condition is indicative that
said nucleic acid test sequence comprises a transcriptional
regulatory region active in said cell.
12. The method of claim 11, wherein said stress condition is the
presence of at least one biochemical agent.
13. The method of claim 11, wherein said protective gene is an
adenosine deaminase gene.
14. The method of claim 12, wherein said at least one biochemical
agent is xylofuranosyl-adenine.
15. The method of claim 11, wherein said protective gene is a
dihydrofolate reductase gene.
16. The method of claim 12, wherein said at least one biochemical
agent is methotrexate.
17. The method of claim 12, wherein said at least one biochemical
agent consists of xylofuranosyl-adenine and deoxycoformycin.
18. The method of claim 12, wherein said at least one biochemical
agent consists of alanosine, adenosine, and uridine.
19. The method of claims 8 or 11, wherein said test sequence
comprises a combination or modification of known transcription
factor response elements.
20. The method of claims 8 or 11, wherein said test sequence
comprises one or more binding sites of unknown function.
21. The method of claims 8 or 11, wherein said test sequence
comprises a combination of at least one known transcription factor
response element and at least one binding site of unknown
function.
22. The method of claims 8 or 11, wherein said one or more cells are
muscle cells.
23. A method of identifying cell- or tissue-specific synthetic
regulatory regions, comprising the steps of: identifying the
oligonucleotides in oligonucleotide-protein complexes formed
between one or more proteins of a cellular or nuclear extract and
any of a plurality of double-stranded oligonucleotide fragments in
a mixture of said fragments and said extract wherein said complexes
are separated from free oligonucleotides in said mixture using size
exclusion chromatography; and wherein the presence of a said
oligonucleotide in a said complex is indicative that said
oligonucleotide comprises a said binding site; and evaluating a
putative cell- or tissue-specific transcriptional regulatory region
comprising said binding site under a selection condition by
determining whether a cell comprising said putative transcriptional
regulatory region in a transcriptional regulatory position to a
selective gene is selected under selective conditions, wherein said
selection of said cell will only occur if said selective gene is
expressed at a sufficiently high level in said cell and wherein
said selective gene will only be expressed at said sufficiently
high level if said putative transcriptional regulatory region is
active in said cell.
24. The method of claim 23, wherein a said double-stranded
oligonucleotide fragment is made by synthesizing a single-stranded
oligonucleotide and converting said single-stranded oligonucleotide
to a double-stranded oligonucleotide.
25. The method of claim 23, wherein said oligonucleotide fragment
comprises a central random sequence and both restriction sites and
primer sequences in the 5' and 3' ends.
26. The method of claim 23, wherein said complexes are separated
from free oligonucleotides using size exclusion chromatography.
27. The method of claim 23, wherein said identifying comprises
amplifying and sequencing said oligonucleotides from said
protein-oligonucleotide complexes.
28. The method of claim 23, wherein said amplifying is performed by
polymerase chain reaction.
29. A method of claim 23, wherein said selection condition is a
positive selection condition, and wherein said evaluating comprises
the steps of: culturing one or more cells under said positive
selection condition, wherein at least one said cell contains a
nucleic acid test sequence inserted in a transcriptional regulatory
position to a selective gene; and wherein said one or more cells
cannot be selected under said positive selection condition in the
absence of high level expression of said selective gene; and
wherein at least one cell capable to be selected in the presence of
the positive selection condition is indicative that the nucleic
acid test sequence contains a transcriptional region active in said
cell.
30. A method of claim 29, wherein said selective gene is a gene
encoding an antigen.
31. A method of claim 29, wherein said positive selection condition
comprises the steps of: contacting said cells with
fluorescence-labeled antibody specific for cells expressing said
selective gene; and selecting out cells expressing said selective
gene with fluorescence activated cell sorting.
32. A method of claim 29, wherein said positive selection condition
comprises the steps of: contacting said cells with magnetic
bead-labeled antibody specific for cells expressing said selective
gene; and selecting out cells expressing said selective gene with
magnetic sorting.
33. A method of claim 23, wherein said selection condition is a
stress condition, and wherein said evaluating comprises the steps
of: culturing one or more cells under said stress condition,
wherein at least one said cell contains a nucleic acid test
sequence inserted in a transcriptional regulatory position to a
protective gene; the growth of said one or more cells being
inhibited under said stress condition in the absence of high level
expression of said protective gene; and wherein growth of a cell of
said at least one cell in the presence of said stress condition is
indicative that said nucleic acid test sequence comprises a
transcriptional regulatory region active in said cell.
34. The method of claim 33, wherein said stress condition is the
presence of at least one biochemical agent.
35. The method of claim 33, wherein said protective gene is an
adenosine deaminase gene.
36. The method of claim 34, wherein said at least one biochemical
agent is xylofuranosyl-adenine.
37. The method of claim 33, wherein said protective gene is a
dihydrofolate reductase gene.
38. The method of claim 34, wherein said at least one biochemical
agent is methotrexate.
39. The method of claim 34, wherein said at least one biochemical
agent consists of xylofuranosyl-adenine and deoxycoformycin.
40. The method of claim 34, wherein said at least one biochemical
agent consists of alanosine, adenosine, and uridine.
41. The method of claim 29 or 34, wherein said test sequence
comprises a combination or modification of at least one known
transcription factor response element and at least one said binding
site.
42. The method of claims 29 or 34, wherein said one or more cells
are muscle cells.
Description
RELATED APPLICATION
[0001] The present application claims priority to U.S. patent
application Ser. No. 09/115,407, filed Jul. 14, 1998 (Lyon &
Lyon Docket No. 235/238), which is based on U.S. Provisional Patent
Application No. 60/052,403, filed Jul. 14, 1997 (Lyon & Lyon
Docket No. 224/269), both entitled METHOD FOR THE IDENTIFICATION OF
SYNTHETIC CELL- OR TISSUE-SPECIFIC TRANSCRIPTIONAL REGULATORY
REGIONS, by Schwartz et al., and which are both incorporated herein
by reference in their entirety, including any drawings.
BACKGROUND OF THE INVENTION
[0002] This invention relates to natural and synthetic cell- or
tissue-specific transcriptional regulatory regions that regulate
gene transcription in particular cells or tissues. In addition,
this invention also relates to the methods for the selection,
identification and evaluation of the synthetic cell- or
tissue-specific transcriptional regulatory regions. None of the
information described herein is admitted to be prior art to the
present invention, but is provided solely to assist the
understanding of the reader.
[0003] Cell- or tissue-specific gene expression plays a central
role in the proliferation and differentiation of cells. As the
first step of gene expression, transcription is an important step
for regulation. The study of transcriptional regulatory regions is
one of the major fields in modern biology. The transcriptional
regulatory regions are also very important for applications in
biotechnology, such as in gene therapy and the production of
recombinant proteins.
[0004] Transcriptional regulatory regions generally have two
portions: transcription initiation sites and enhancers which are
capable of regulating the transcription level at a distance from
the initiation sites. The binding of transcription factors to the
regulatory regions is necessary for the regulatory regions to
regulate transcription. The regulatory regions fall into several
categories: general regulatory regions which regulate transcription
in all cells of an organism, inducible regulatory regions which
only regulate transcription in response to certain signals, and
cell- or tissue-specific regulatory regions which only regulate
transcription in certain cells.
[0005] Several methods have been used to identify the regulatory
regions. One of these methods is the analysis of regions that are
important for the proper expression of cloned genes. The first step
is usually to identify rough boundaries of the regulatory regions
using deletion and mutation analysis of the cloned genes. These
regions include the 5' upstream regions, 3' downstream regions, and
sometimes introns or coding sequences within the gene itself. Most
studies are performed using chimeric constructs containing a
reporter gene such as β-galactosidase (β-gal),
chloramphenicol acetyltransferase (CAT), luciferase or growth
hormone (GH). The regions that actually bind protein factors can be
more accurately defined using DNA footprinting techniques followed
by mutation analysis. The sequences that bind protein transcription
factors are often referred to as transcription factor binding
sites.
[0006] Consensus sequences for a number of common binding sites
have been determined. One example is the binding site recognized by
the family of basic-helix-loop-helix (bHLH) transcription factors.
The consensus sequence of binding sites for bHLH proteins is
5'-CANNTG-3', where "N" can be any nucleotide. This binding site is
called the "E box" and is found in the regulatory regions of a
number of genes that are expressed in diverse cell types, including
lymphocytes, muscle cells and fibroblasts. Some bHLH proteins are
common to most or all cells while others are cell-specific. In
addition, bHLH proteins form heterodimers and the interaction of
some of these dimers with DNA is cell-specific. The binding of
different bHLH proteins to specific regulatory regions appears to
be affected by the variable dinucleotide sequence within the core
consensus sequence and the sequence adjacent to the core sequence
(Sun, et al., Cell 64:459-470 (1991)).
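The consensus 5'-CANNTG-3' can be expressed as a simple pattern search. The following is a minimal sketch, not part of the application as filed; the function name find_e_boxes and the test sequence are hypothetical and chosen only for illustration. Because the reverse complement of CANNTG is again CANNTG, scanning one strand is sufficient to locate every E box.

```python
import re

# Consensus quoted in the text: CANNTG, where N is any nucleotide.
# The lookahead lets overlapping sites be reported as well.
E_BOX = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_e_boxes(sequence):
    """Return (0-based position, matched site) for every E box in the sequence."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in E_BOX.finditer(seq)]

if __name__ == "__main__":
    # Hypothetical test sequence, not taken from the application.
    for position, site in find_e_boxes("ttgCAGCTGaaCACGTGcc"):
        print(position, site)
```

The variable dinucleotide (the two N positions) is returned with each hit, since the text notes that it influences which bHLH proteins bind.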
[0007] Binding sites associated with newly cloned and sequenced
genes can also be identified by searching the sequence for homology
with the sequences of known binding sites that have been
characterized from other, sometimes related, genes.
[0008] In addition, several methods were developed to identify the
binding sites of transcription factors without cloning of the
target genes. The selected and amplified binding site (SAAB) method was
used to identify the binding sites for known transcription factors
(Blackwell, et al., Science 250:1104-1110 (1990)). In this
method, synthesized templates with random sequences are incubated
with purified transcription factors. Those bound to transcription
factors are isolated by electrophoretic mobility shift assay
(EMSA). The templates are then amplified by the polymerase chain
reaction (PCR). After repeated rounds of binding and amplification,
the selected templates are sequenced to identify the binding site of
the transcription factor. The binding site of transcription factor myc was
identified with this method (Blackwell, et al., Science
250:1149-1151 (1990)).
[0009] It is often difficult, however, to identify and purify
transcription factors for use in such assays. Indeed, the binding
sites are often identified first and then are used to facilitate
the identification and purification of transcription factors
binding to the sites. Moreover, in many studies, it is crucial to
understand the characteristics of certain regulatory regions,
whereas it is not necessary to know the transcription factors
binding to the regulatory regions. A method similar to SAAB, the
multiplex selection technique (MuST), was therefore developed
(Nallur, et al., PNAS 93:1184-1189 (1996)). In the multiplex
selection technique, purified transcription factors are replaced
with crude nuclear extract, so that binding sites can be identified
without the identification of transcription factors. The identified
binding sites can then be used to identify the corresponding
transcription factors.
[0010] The regulatory regions often consist of multiple different
binding sites for transcription factors. The characteristics of a
regulatory region are determined by the composition and arrangement
of the binding sites. In addition to naturally-occurring regulatory
regions, synthetic regulatory regions can be constructed through
the combination and modification of binding sites.
[0011] Available naturally-occurring regulatory regions are not
always capable of regulating transcription in a desired manner. In
these cases, as well as others, synthetic regulatory regions may be
utilized to provide the desired functional characteristics. As an
example, synthetic herpes simplex virus (HSV) regulatory regions
were constructed by linking the 5' nontranscribed domain of an HSV
α gene to a fragment containing the transcription initiation
site and the 5' transcribed noncoding region from an HSV γ
gene (Roizman, PCT 94/14971). The resulting synthetic regulatory
regions direct constitutive transcription of the heterologous gene
throughout the reproductive cycle of the virus at a high cumulative
level. Synthetic regulatory regions were also constructed to
achieve high inducible transcription levels and low basal
transcription levels (Filmus, et al., PCT 93/20218).
[0012] In both of the above cases, the binding sites are
well-understood transcription factor response elements. Many
binding sites, however, are not well-understood, especially those
identified without the cloning of the corresponding transcription
factors. These binding sites are therefore only potential
transcription factor response elements until they are confirmed to
be functional for transcription regulation using functional assays.
These assays are usually laborious and costly, and the task is even
more complicated for synthetic regulatory regions produced by the
combination, modification and rearrangement of various binding
sites.
SUMMARY OF THE INVENTION
[0013] Applicant has designed useful methods to create, identify
and evaluate cell- or tissue-specific synthetic regulatory regions.
Specifically, the methods include the selection of transcription
factor binding sites, the creation of synthetic regulatory regions
using the binding sites and/or portions of known regulatory
regions, and the evaluation of the synthetic regulatory regions.
The synthetic regulatory regions acquired with this method can be
used in gene delivery or gene therapy to achieve desired gene
expression in targeted cells. The acquired synthetic regulatory
regions can also be used to achieve the production of recombinant
proteins at high levels.
[0014] The present invention utilizes the recognition that the
cells themselves contain all the information required to identify
the binding sites that are most important or are recognized by the
key transcription factors in the cells. The methods described for
the selection of binding sites do not require any previous
knowledge of the genes that are expressed or the transcription
factors that are present in the cells. Thus, these methods bypass
the extensive work needed for the purification, identification, and
analysis of transcription factors. In addition, these methods
eliminate the need to know the tissue-specific transcription factor
binding sites. Furthermore, many more potential binding sites can
be identified using these methods than using the methods with
purified transcription factors. Similarly, the methods for the
creation and evaluation of synthetic regulatory regions do not
require complete understanding of the binding sites. The binding
sites can be linked together in various combinations and with
various arrangements, and can then be evaluated to select
particular synthetic regulatory regions which are functional in a
certain cell line. Therefore, these methods make it possible to
create and identify useful synthetic regulatory regions on a
large scale.
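The idea of linking binding sites in various combinations and arrangements can be illustrated by enumerating candidate sequences. This is only a sketch under stated assumptions: the function name candidate_regions, the optional spacer, and the example sites are hypothetical, and the application does not prescribe this procedure.

```python
from itertools import permutations

def candidate_regions(binding_sites, n_sites, spacer=""):
    """Yield every ordered arrangement of n_sites distinct binding sites.

    binding_sites: list of site sequences (strings).
    spacer: optional linker placed between adjacent sites (an assumption).
    """
    for arrangement in permutations(binding_sites, n_sites):
        yield spacer.join(arrangement)

if __name__ == "__main__":
    # Hypothetical binding sites, for illustration only.
    sites = ["CAGCTG", "CACGTG", "CCATATATGG"]
    for region in candidate_regions(sites, 2, spacer="AT"):
        print(region)
```

The number of candidates grows quickly with the number of sites, which is why a selection-based evaluation, rather than testing each construct individually, is attractive.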
[0015] As indicated above, the methods discussed herein are useful
for identifying regulatory region sequences for gene delivery or
gene therapy. One of the major obstacles for gene delivery or gene
therapy is the difficulty in expressing genes at preferred levels
in certain cells or tissues. The difficulties are partly due to the
lack of proper regulatory regions to direct the desired gene
transcription. The functional synthetic regulatory regions
identified from these methods will provide many candidates for the
regulatory regions needed in gene delivery or gene therapy.
Moreover, these synthetic regulatory regions will also be
candidates for the regulatory regions needed in large-scale
production of recombinant proteins, which also requires gene
transcription at a high level in certain cell lines.
[0016] A first aspect of the present invention features a method of
identifying binding sites for transcription factors. The method
involves identifying the oligonucleotides in
protein-oligonucleotide complexes formed between a cellular or
nuclear extract from a group of cells and any of a plurality of
double-stranded oligonucleotide fragments. Preferably the complexes
are separated from free oligonucleotides using size exclusion
chromatography. The presence of an oligonucleotide in a complex is
indicative that the oligonucleotide includes a binding site.
[0017] In preferred embodiments, the double-stranded
oligonucleotides are made through the synthesis of single-stranded
oligonucleotide and conversion of the single-stranded
oligonucleotide to double-stranded oligonucleotide. Also in
preferred embodiments, the oligonucleotide fragment has a central
random sequence and both restriction sites and primer sequences on
both ends. In preferred embodiments, the identifying step includes
amplifying, cloning and sequencing the oligonucleotide fragments
from the protein-oligonucleotide complexes to identify the binding
sites. The amplifying step is preferably performed by polymerase
chain reaction.
[0018] The oligonucleotide fragments can be of various sizes, but
preferably include test sequences between about 5 and 500 bp in
length, more preferably between about 5 and 100 bp, still more
preferably between 20 and 50 bp.
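The fragment layout described above (a central random sequence flanked on each end by a restriction site and a primer sequence) can be sketched as follows. The primer sequences and the choice of the EcoRI (GAATTC) and BamHI (GGATCC) recognition sequences are assumptions made for illustration; the text here does not specify them. The names make_library_oligo, PRIMER_5, SITE_5, SITE_3 and PRIMER_3 are likewise hypothetical.

```python
import random

# Assumed flanking sequences, for illustration only.
PRIMER_5 = "GCTGCAGTTGCACTG"   # hypothetical 5' primer sequence
SITE_5 = "GAATTC"              # EcoRI recognition sequence (assumed choice)
SITE_3 = "GGATCC"              # BamHI recognition sequence (assumed choice)
PRIMER_3 = "CAGTGCAACTGCAGC"   # hypothetical 3' primer sequence

def make_library_oligo(rng, random_len=20):
    """Build one single-stranded library member with a central random core."""
    if not 5 <= random_len <= 500:
        raise ValueError("random_len is outside the 5-500 range discussed in the text")
    core = "".join(rng.choice("ACGT") for _ in range(random_len))
    return PRIMER_5 + SITE_5 + core + SITE_3 + PRIMER_3

if __name__ == "__main__":
    rng = random.Random(0)  # seeded only so the sketch is repeatable
    for _ in range(3):
        print(make_library_oligo(rng))
```

A random core can by chance contain one of the flanking restriction sites; a practical design would need to account for such members, which this sketch does not attempt.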
[0019] The term "transcribe" or "transcription" as used herein
refers to the synthesis of RNA by RNA polymerase, following a DNA
template. Transcription is the first step of gene expression and
the most important step for the regulation of gene expression. That
is, the regulation of gene expression is achieved mainly through
the regulation of transcription.
[0020] The term "gene expression" refers to the process in which
genetic information flows from DNA to functional molecules, such as
proteins or RNA molecules. The regulation of transcription, as a
part of gene expression, is achieved through the interaction between
the regulatory region of a gene and various transcription
factors.
[0021] As used herein, the term "transcriptional regulatory
regions" or "regulatory regions" refers to the regions of a gene
controlling the transcription of the gene. A regulatory region
often includes several portions. Some of these portions are in the
initiation site for transcription, whereas others are located at a
distance from the initiation site. The term thus includes regions
commonly referred to as enhancers.
[0022] The term "synthetic regulatory regions" as used herein
refers to regulatory regions which are artificially made (i.e.,
made by humans using molecular biology techniques) such as by the
creation with one or more modifications, combinations, or
rearrangements of various transcription factor binding sites.
[0023] The term "transcription factors" as used herein refers to
proteins which bind to the elements of regulatory regions and
regulate the transcription of the corresponding genes. According to
their functions, transcription factors fall into several
categories. These include general transcription factors which are
needed by most genes in most cells, cell- or tissue-specific
transcription factors which only regulate gene transcription in
certain cells, and inducible transcription factors which regulate
gene transcription in response to certain signals.
[0024] The term "transcription factor binding site" or "binding
site" refers to any nucleic acid sequence which can bind
transcription factors under transcription conditions or conditions
approximating intracellular physical conditions.
[0025] As used herein, the term "transcription factor response
elements" or "response elements" refers to the functional
regulatory region components which can bind transcription factors
and thereby regulate transcription of the corresponding genes.
Thus, binding sites are potential response elements whose
regulatory function can readily be tested and characterized.
[0026] As used herein, the term "restriction sites" refers to
deoxyribonucleic acid sequences at which specific restriction
endonucleases can cleave in a sequence-specific manner.
[0027] The term "cells" or "cell" as used herein refers to a
membrane-enveloped protoplasmic body capable of independent
reproduction. Cells can be maintained, or propagated, in vivo, in
vitro or in tissue culture and are capable of being transformed by
plasmids as discussed herein.
[0028] As used herein "tissue" refers to a population consisting of
cells of the same kind performing the same function.
[0029] The term "nuclear or cellular extract" refers to a
preparation containing all or some of the cellular contents from
inside the nuclear membrane or the plasma membrane of cells
respectively, particularly including protein components. Such an
extract is distinguished from a purified transcription factor.
[0030] As used in this context, the term "mixing" refers to putting
together oligonucleotides and nuclear or cellular extract, such
that the oligonucleotides and components of the extract can contact
each other. Preferably a nuclear extract is used.
[0031] The term "oligonucleotide" as used herein refers to a
nucleic acid molecule consisting of the same or different individual
nucleotides which are covalently linked together. Oligonucleotides
can be single-stranded or double-stranded, consisting of two
anti-parallel single-stranded oligonucleotides with complementary
sequences. For use in the identification of binding sites, each
oligonucleotide strand is preferably between about 5 and 500
nucleotides in length, more preferably between 5 and 100, still
more preferably between about 7 and 50, and most preferably between
about 20 and 50 nucleotides in length. The term "free
oligonucleotide" refers to the oligonucleotides which are not bound
to proteins or any other compounds. The term
"protein-oligonucleotide complexes" as used herein refers to the
complexes comprising oligonucleotides and the proteins bound with
the oligonucleotides.
[0032] As used in the context of the oligonucleotide fragments, the
term "conversion" is used to refer to the synthesis of a
single-stranded DNA molecule complementary to another DNA molecule
to form a double-stranded DNA molecule.
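In sequence terms, the strand produced by such a conversion is the reverse complement of the template. A minimal sketch, assuming an unambiguous A/C/G/T template; the function name complementary_strand is hypothetical.

```python
# Watson-Crick pairing for DNA: A-T and C-G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def complementary_strand(strand):
    """Return the complementary strand, written 5' to 3'."""
    return strand.upper().translate(COMPLEMENT)[::-1]

if __name__ == "__main__":
    template = "GAATTCA"  # hypothetical template, written 5' to 3'
    print(template, complementary_strand(template))
```

Applying the function twice returns the original sequence, which is a convenient sanity check.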
[0033] The term "primer" as used herein refers to a single-stranded
oligonucleotide, the 3' end of which can be used as the initiation
site for the DNA synthesis with a DNA polymerase. As used herein,
the term "primer sequence" refers to the sequence of the primer or
the complementary sequence.
[0034] As used herein, the terms "5'" and "3'" refer to the two
different ends of a single-stranded DNA molecule respectively in
accord with common usage. When used in relation to a coding
sequence, the terms refer to being in the 5' direction from the
coding sequence or in the 3' direction from the coding sequence.
For a sequence on a circular nucleic acid molecule, e.g., on a
circular plasmid, the terms refer to the direction from a reference
sequence but not fully around the chain, and preferably includes a
functional relationship. Thus, for example, a regulatory region is
5' to a coding sequence if it is in a position in which it would be
expected to functionally affect transcription if in a 5' position
on a linear molecule. Usually, a 5' position is closer to the 5'
end of a coding sequence than to the 3' end.
[0035] As used herein, the term "size exclusion chromatography"
refers to a technique for the separation of biomolecules. This
approach separates molecules into two groups, one which is smaller
than the exclusion size of the chromatographic media and another
which is larger than the exclusion size. The
protein-oligonucleotide complexes are much larger than free
oligonucleotides, so they can be readily separated, utilizing an
exclusion size greater than the size of the free oligonucleotides
and smaller than the size of protein-oligonucleotide complex. In
this context, size refers to the effective radius of the molecule
or complex. As indicated above, nuclear or cellular extract, which
includes many different transcription factors, is used instead of
purified transcription factors in the present invention. The
protein-oligonucleotide complexes resulting from the mixing of
oligonucleotide fragments and nuclear or cellular extract therefore
have many different sizes. As a result, size exclusion
chromatography provides a more useful separation than
electrophoretic mobility shift assay (EMSA) because size exclusion
chromatography produces a simple separation of bound and unbound
oligonucleotides while EMSA produces a series of bands distributed
over a gel. Due to the nature of the gels typically utilized, EMSA
generally also requires an extraction step to recover the bound
oligonucleotide from the gel for further manipulation.
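The two-group separation described above amounts to comparing each species' effective radius with the exclusion size of the medium. The sketch below is only an illustration of that comparison; the function name partition_by_exclusion_size and all radii are hypothetical values, not measurements from the application.

```python
def partition_by_exclusion_size(species, exclusion_radius_nm):
    """Split (name, effective_radius_nm) pairs into two groups.

    Returns (excluded, included): species larger than the exclusion size,
    and species at or below it.
    """
    excluded = [name for name, radius in species if radius > exclusion_radius_nm]
    included = [name for name, radius in species if radius <= exclusion_radius_nm]
    return excluded, included

if __name__ == "__main__":
    # Hypothetical effective radii, for illustration only.
    mixture = [
        ("free oligonucleotide", 1.5),
        ("complex with factor A", 4.0),
        ("complex with factors A and B", 6.5),
    ]
    bound, free = partition_by_exclusion_size(mixture, exclusion_radius_nm=3.0)
    print("bound fraction:", bound)
    print("free fraction:", free)
```

Complexes of different sizes all fall into the same bound fraction, which mirrors the contrast the text draws with the multiple bands produced by EMSA.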
[0036] The term "amplifying" as used herein refers to increasing
the numbers of DNA molecules. The approaches for amplifying
include, but are not limited to, polymerase chain reaction.
[0037] As used herein, the term "sequencing" refers to the process
of identifying the nucleotide sequence of DNA molecules. The term
"nucleotide sequence" refers to the linear order of nucleotides in
a DNA molecule or other nucleic acid molecules. Methods for
sequencing of nucleic acid molecules are well-known to those
skilled in the art.
[0038] A second aspect of the present invention features a method
for evaluating a cell- or tissue-specific synthetic regulatory
region or regions. This method involves determining whether a cell
is selected under selective conditions. The method uses cells which
contain different putative transcriptional regulatory regions
located in transcriptional regulatory positions to a selective
gene. A cell can only be selected if the selective gene is
expressed at sufficiently high levels, and the selective gene will
be expressed at the sufficiently high level if the putative
transcriptional regulatory region is active in the particular cell.
The capability of a cell to be selected in response to the
selection condition indicates that the nucleic acid test sequence
contains a transcriptional regulatory region active in the cell.
The selection condition can be adjusted so that only strong
regulatory regions will be effective to be selected in the
selection condition. In general, the method involves culturing the
cell or cells having the putative transcriptional regulatory
sequence
[0039] The term "sufficiently high level" refers to a functional
level of expression which depends on the type of selection used and
the stringency applied to the selection. Thus, for positive
selection, the level is sufficient to allow discrimination of a
cell expressing the selective gene at a "sufficiently high level"
from an otherwise isogenic cell not expressing the gene at a
sufficiently high level. For negative selection, a "sufficiently
high level" is a level which allows the cell to grow in the
presence of the selection condition.
[0040] In a preferred embodiment, the selection condition is a
positive selection condition. The capability of at least one cell
to be selected in the presence of the selective condition is
indicative that the nucleic acid test sequence contains a
transcriptional region active in the cell. The selection condition
can be adjusted so that only strong regulatory regions will be
effective to be selected in the selection condition.
[0041] In another preferred embodiment, the selection condition is
a negative selection condition, i.e., stress condition; and the
selective gene is a protective gene. The growth of the cells is
inhibited under the stress condition in the absence of high level
expression of the protective gene. Growth of at least one cell in
the presence of the stress condition is indicative that the nucleic
acid test sequence contains a transcriptional region active in the
cell. The stress condition can be adjusted so that only strong
regulatory regions will be effective to overcome the stress
condition.
[0042] The term "regulates" or "regulation" as used herein refers
to the effect of nucleic acid sequences or other molecules involved
in control of a response or action. In particular, this includes
the effects of sequences involved in regulating, controlling or
affecting the expression level or rate of structural genes.
Generally this includes the binding of transcription factors to
sequences, affecting transcription rates or other steps in gene
expression.
[0043] As used in this context, the term "transcriptional
regulatory position" refers to the position where functional
regulatory regions can influence the transcription of the selective
gene. Transcriptional regulatory positions include, but are not
limited to, 5' to the coding sequence of the selective gene, 3' to
the coding sequence of the selective gene, and within the intron or
signal sequence of the selective gene. For identification and/or
evaluation of synthetic regulatory regions, the region 5' to the
coding sequence of the selective gene is of particular interest,
however, other positions are also of interest and can be utilized
in this invention.
[0044] The term "cell- or tissue-specific transcriptional
regulatory region" as used herein refers to a nucleic acid sequence
which is involved in controlling transcription through one or more
coding sequences in a cell- or tissue-specific manner. As used
herein, the term "cell- or tissue-specific transcription" refers to
the gene transcription which occurs at a higher level in cells of a
group or in certain tissue as compared to other cells or tissue of
the corresponding organism generally.
[0045] As used herein the term "transfected" or "transfection"
refers to the incorporation of foreign DNA into cultured cells by
exposing them to such DNA. This would include the introduction of
DNA by various delivery methods, e.g., via vectors or plasmids
using naked DNA, DNA-cationic lipid complexes, DNA in liposomes.
The methods may include techniques to enhance penetration of the
cellular membrane, such as electroporation or use of lytic
peptides.
[0046] The term "cells of a group" as used herein refers to cells
which are differentiated into the same or similar stage, and
thereby have the same or similar characteristics, e.g., the same or
similar characteristics with respect to control of
transcription.
[0047] As used herein, the term "vector" refers to a DNA construct
which can be transfected into cells. Vectors can be of a variety of
different types, including plasmids, viral vectors, and others.
Various genes can be inserted into a vector so that the gene can be
delivered into cells. The term "insert" as used in this context
refers to incorporating a nucleic acid sequence into the vector
nucleic acid sequence. Vectors can include both linear and circular
DNA constructs.
[0048] The term "selection condition" refers to conditions under
which cells expressing a selective gene show distinguishing
features, and thereby can be easily separated from cells not
expressing a selective gene. A selection condition can be a positive
selection condition or a negative selection condition, i.e., a
stress condition.
[0049] The term "positive selection conditions" refers to
conditions which distinguish cells expressing the selective gene so
that these cells can be easily isolated. The positive selection can
be, but is not limited to, Fluorescence Activated Cell Sorting
(FACS) and magnetic bead sorting.
[0050] The term "selective gene" refers to a gene whose expression
confers on its host cells a special feature which allows the host
cell to be distinguished from other cells with which the host cell
is associated. The selective gene can be, but is not limited to, a
gene coding a particular antigen or antibody, or a protective
gene.
[0051] The term "stress conditions" refers to conditions which
either kill the cells or inhibit the division and proliferation of
the cells. Such stress conditions include, but are not limited to,
1) elevated temperatures; 2) radiation; and 3) contact with
particular biochemical agents.
[0052] The term "protective gene" means a gene encoding a protein
which is capable of protecting cells from a stress condition. Such
protective genes include, but are not limited to, genes for 1)
adenosine deaminase; 2) dihydrofolate reductase; and 3) heat shock
proteins.
[0053] The term "biochemical agents" as used herein refers to
compounds which kill certain cells or inhibit the division and
proliferation of certain cells. These biochemical agents include,
but are not limited to, 1) xylofuranosyl-adenine; 2) methotrexate;
3) xylofuranosyl-adenine and deoxycoformycin; 4) alanosine,
adenosine, and uridine.
[0054] As used in connection with binding sites and regulatory
regions, the term "combination" refers to linking together two or
more of the same or different kinds of oligonucleotides. The term
"modification" refers to a change in the sequence of a DNA
molecule, which includes, but is not limited to, the substitution
of one or a few nucleotides, or the addition or deletion of one or
a few nucleotides as compared to a reference sequence. The term
"rearrangement" refers to one or more changes in the order of
subsequences of a regulatory region, and can include the insertion
of a new subsequence or replacement of a subsequence with a new
subsequence. This includes combinations of re-ordering,
substitution, and insertion of subsequences.
[0055] A third aspect of the present invention features a method,
which combines both of the above aspects, for evaluating a cell- or
tissue-specific transcriptional regulatory region. The method
involves identifying the oligonucleotides in
protein-oligonucleotide complexes formed between a cellular or
nuclear extract from a group of cells and any of a plurality of
double-stranded oligonucleotide fragments. The presence of an
oligonucleotide in a complex is indicative that the oligonucleotide
includes a binding site. One or more cells are then cultured under
a selection condition. Among the cells, at least one cell, and
preferably a plurality of cells, contains a nucleic acid test
sequence inserted in a transcriptional regulatory position to a
selective gene. The test sequence consists of at least one of the
binding sites identified using the cellular or nuclear extract. The
capability of at least one cell to be selected in the presence of
the selection condition is indicative that the nucleic acid test
sequence contains a transcriptional region active in the cell. The
selection condition can be adjusted so that only strong regulatory
regions will be effective to be selected in the selection
condition.
[0056] In addition, in another aspect, the invention provides
synthetic regulatory regions which include all or portions of the
synthetic regulatory regions described in Example 5 and in the
Drawings. Preferably the synthetic regulatory region is in a
transcriptional regulatory position with respect to a coding
sequence of interest. A portion of one of the described regions
preferably includes at least 20 contiguous nucleotides, more
preferably at least 40 contiguous nucleotides, and still more
preferably at least 80 contiguous nucleotides of one of the
described synthetic regulatory regions. Preferably the portion is
placed at about the same position relative to a coding sequence as
it occupied in the plasmids used for analysis as described herein.
Thus, the portion is preferably within 100 nucleotides, more
preferably within 60 nucleotides, and still more preferably within
30 nucleotides of the position it occupied in a corresponding
described synthetic regulatory region.
[0057] Other features and advantages of the invention will be
apparent from the following detailed description of the invention
in conjunction with the accompanying drawings and from the
claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0058] FIG. 1 shows five important features for the synthetic
single-stranded oligonucleotides (oligos) used in the described
selection method.
[0059] FIG. 2 outlines the overall scheme for an embodiment of
transcription factor selection of regulatory regions.
[0060] FIG. 3 is a comparison of relative regulatory region
activity of a number of different regulatory regions during
differentiation in primary myoblast cells.
[0061] FIG. 4 shows the differential SRF activity on c-Fos SRE vs
muscle SRE.
[0062] FIG. 5 shows the arrangement of sub-elements of some
exemplary synthetic regulatory regions.
[0063] FIG. 6 is a bar graph showing the expression levels in
myotubes of the luciferase reporter gene driven by various
synthetic regulatory regions in comparison to the expression driven
by the skeletal α-actin promoter, and the expression level of
each of the synthetic regulatory regions in the presence of KCl
depolarization.
[0064] FIG. 7 shows the activities of exemplary regulatory regions
under the nerve-injury induced down-regulation of skeletal actin.
Tibialis muscles of ICR mice were injected with 100 μg of clone
skeletal α-actin promoter 448 (control), synthetic regulatory
region C1-28, and C5-12 luciferase vectors. Two weeks post sciatic
nerve crush, the muscle was harvested and assayed for luciferase
reporter gene activity.
[0065] FIG. 8 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C1-28,
including the sequence of the synthetic regulatory region
insert.
[0066] FIGS. 9A & B show two independently determined sequences
of portions of the plasmid containing the synthetic regulatory
region of clone C2-27, including the sequence of the synthetic
regulatory region insert.
[0067] FIGS. 10A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C5-12, including the sequence of the
synthetic regulatory region insert.
[0068] FIGS. 11A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C6-16, including the sequence of the
synthetic regulatory region insert.
[0069] FIG. 12 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C6'-7,
including the sequence of the synthetic regulatory region
insert.
[0070] FIGS. 13A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C5-1, including the sequence of the
synthetic regulatory region insert.
[0071] FIGS. 14A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C5-5, including the sequence of the
synthetic regulatory region insert.
[0072] FIGS. 15A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C6-5, including the sequence of the
synthetic regulatory region insert.
[0073] FIGS. 16A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C1-1, including the sequence of the
synthetic regulatory region insert.
[0074] FIGS. 17A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C1-14, including the sequence of the
synthetic regulatory region insert.
[0075] FIG. 18 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C1-20,
including the sequence of the synthetic regulatory region
insert.
[0076] FIG. 19 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C1-21,
including the sequence of the synthetic regulatory region
insert.
[0077] FIGS. 20A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C1-26, including the sequence of the
synthetic regulatory region insert.
[0078] FIGS. 21A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C2-26, including the sequence of the
synthetic regulatory region insert.
[0079] FIGS. 22A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C5-13, including the sequence of the
synthetic regulatory region insert.
[0080] FIG. 23 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C5'-3,
including the sequence of the synthetic regulatory region
insert.
[0081] FIG. 24 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C5'-5,
including the sequence of the synthetic regulatory region
insert.
[0082] FIG. 25 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C5'-9,
including the sequence of the synthetic regulatory region
insert.
[0083] FIG. 26 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C5'-12,
including the sequence of the synthetic regulatory region
insert.
[0084] FIGS. 27A & B show two independently determined
sequences of portions of the plasmid containing the synthetic
regulatory region of clone C6-12, including the sequence of the
synthetic regulatory region insert.
[0085] FIG. 28 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C6'-8,
including the sequence of the synthetic regulatory region
insert.
[0086] FIG. 29 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C6'-10,
including the sequence of the synthetic regulatory region
insert.
[0087] FIG. 30 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C6'-11,
including the sequence of the synthetic regulatory region
insert.
[0088] FIG. 31 shows the sequence of a portion of the plasmid
containing the synthetic regulatory region of clone C6'-22,
including the sequence of the synthetic regulatory region
insert.
DETAILED DESCRIPTION OF THE INVENTION
[0089] The present invention provides methods for identifying and
selecting transcription factor binding sites and methods for
creating and evaluating synthetic regulatory regions or identified
transcriptional regulatory regions. The following description is
offered by way of illustration and is not intended to limit the
invention in any manner.
[0090] The description includes specific examples of preferred
embodiments of the present invention. These examples demonstrate
how oligonucleotide fragments and nuclear or cellular extracts are
used to identify transcription factor binding sites. These examples
also demonstrate how synthetic regulatory regions can be created
through the modification, combination, and rearrangement of these
binding sites or portions thereof and/or of known regulatory
regions or binding sites. Furthermore, these examples demonstrate
how the synthetic regulatory regions can be evaluated. Such
evaluation can identify functional synthetic regulatory regions
which direct transcription of a gene at a high level in a
particular cell line. These examples include in vivo and in vitro
techniques.
[0091] Identification of Transcription Factor Binding Sites
[0092] The present invention provides a method for identifying
nucleic acid sequences which bind cellular proteins, and which are
therefore putative transcriptional regulatory sequences. The method
can use any of a variety of mixtures of DNA binding proteins, in
particular including crude transcription factor preparations from
nuclear extracts or whole cell extracts of specific cells or
tissues. Certain proteins in such mixtures or extracts will bind to
and select specific oligonucleotide sequences from a mixture of
oligonucleotide sequences. The oligonucleotide sequences can be
random sequences, or fragments of DNA from a genomic or cDNA
source, or portions, modifications or rearrangements of known
binding sites or other selections of nucleic acid sequences.
[0093] The protein-bound or selected oligonucleotides are then
identified, such as by amplification, cloning and sequencing. The
sequences of selected oligonucleotides will reveal consensus
sequences which are recognized by the more abundant transcription
factors in these cells. Some of the selected sequences will be
recognized by common, non-cell-specific transcription factors but a
number of selected sequences will be recognized by cell-specific
transcription factors.
[0094] As a first step of an exemplary selection method for the
identification of synthetic regulatory regions, synthetic
single-stranded oligonucleotides (oligos) are constructed or
obtained which preferably have five important features. These
oligos preferably contain the following:
[0095] 1. a specific sequence of 10-30 nucleotides at the 5' end to
act as a primer annealing site for DNA amplification after the
selection process has been performed. This sequence will be
identical in all oligos and is labeled "P1" in FIG. 1.
[0096] 2. a specific restriction enzyme cleavage site located
immediately 3' to or within the 3' end of the 5' primer sequence.
This site will be used for the cloning of the selected oligos. This
site will be identical in all oligos and is labeled "R1" in FIG.
1.
[0097] 3. a region within the central part of the oligo that
contains a number of random nucleotides (preferably ≥10
nucleotides). The sequence in this region will be responsible for
the selection of oligos during the selection process.
[0098] 4. a specific restriction enzyme cleavage site located
immediately 3' to the region of random nucleotides. This site will
be used with the other restriction site for the cloning of the
selected oligos. This site will be identical in all oligos and may
be different from the restriction enzyme cleavage site (R1) at the
5' side of the region of random nucleotides and is labeled "R2" in
FIG. 1.
[0099] 5. a specific sequence of 10-30 nucleotides at the 3' end of
the oligos to act as a primer annealing site for both the synthesis
of a second strand complementary to the original oligos prior to
selection and DNA amplification after the selection process has
been performed. This sequence will be identical in all oligos but
different from the sequence at the 5' end of the oligos (P1) and is
labeled "P2" in FIG. 1.
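By way of illustration only, the five-part layout listed above can be sketched in Python. The primer sequences, the choice of EcoRI and BamHI recognition sites for R1 and R2, the 12-nucleotide random region, and the names `make_oligo` and `reverse_complement` are hypothetical placeholders assumed for this sketch; none of them is taken from the disclosure.

```python
import random

# Hypothetical placeholder sequences, assumed for illustration only.
P1 = "GTCAGCTAGCATCGTA"   # 5' primer annealing site (16 nt)
R1 = "GAATTC"             # EcoRI recognition site, assumed choice for R1
R2 = "GGATCC"             # BamHI recognition site, assumed choice for R2
P2 = "TCGATGCCAGTCAGCA"   # 3' primer annealing site (16 nt), differs from P1

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def make_oligo(n_random=12, rng=random):
    """Return one single-stranded oligo laid out 5'-P1-R1-random-R2-P2-3'."""
    core = "".join(rng.choice("ACGT") for _ in range(n_random))
    return P1 + R1 + core + R2 + P2


# A small example library; a real library would be chemically synthesized.
library = [make_oligo(rng=random.Random(seed)) for seed in range(5)]

# The primer that anneals to the P2 site for second-strand synthesis is
# the reverse complement of the P2 sequence written on the oligo.
second_strand_primer = reverse_complement(P2)
```

In this sketch the random region may by chance contain an R1 or R2 site; such oligos would be cut internally during cloning, which is one reason a practical design might screen or restrict the random region.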
[0100] As outlined in FIG. 2, the overall scheme for the selection
of binding sites in this embodiment is as follows.
[0101] The single-stranded oligo is first converted to a
double-stranded oligo by extending primer P2 using a DNA polymerase
such as the Klenow fragment of E. coli DNA polymerase I, T4 DNA
polymerase, or T7 DNA polymerase. The double-stranded oligos are
gel-purified and incubated with the crude transcription factor
preparation, which preferably would be prepared from isolated
nuclei but could also be prepared from whole cell extracts (Dent,
et al., In Transcription Factors: A Practical Approach, D. S.
Latchman (ed.) IRL Press, Oxford, 1-26, (1993)). Transcription
factors in the protein extracts will bind to oligos which contain
the appropriate recognition sequence or binding site. In preferred
embodiments, protein-DNA complexes are separated from unbound
oligos by size exclusion chromatography (SEC). SEC is preferable
for this step because the protein-DNA complexes will be
heterogeneous in size due to differences in the molecular weights
of the bound transcription factors and the possibility that
multimeric protein complexes may bind to some binding sites. Thus,
electrophoresis would result in a distribution of bands across the
gel which would require separate extraction. In contrast, SEC media
and conditions can be selected to provide a sharp separation of
free and protein-bound oligos.
[0102] The selected oligos are then purified, amplified using
primers P1 and P2, and digested with restriction enzymes R1 and R2
to excise the central protein-binding regions from the flanking
primer sequences. Those skilled in the art can readily determine
appropriate primers and restriction enzymes. The digested oligos
are then ligated to form concatamers, and fragments in the 200-400
bp range are purified and cloned into an appropriate
cloning/sequencing vector. Cloning 200-400 bp concatamers, which
contain 20 or more different selected sequences, allows the
acquisition of much more sequence information per sequencing
reaction than would be obtained if single selected oligos were
cloned and sequenced. The method, however, can also utilize single
oligos or other size concatamers.
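As a hedged illustration of how a sequenced concatamer might be resolved back into its individual selected sequences, the following Python sketch splits a read at the junction sites. It assumes that each junction in the concatamer reconstitutes an intact R1 or R2 recognition site and that those sites are GAATTC and GGATCC; the site choices, the example read, and the function name `split_concatamer` are hypothetical.

```python
import re


def split_concatamer(read, sites=("GAATTC", "GGATCC")):
    """Split a sequenced concatamer into inserts at the junction sites.

    Assumes each ligation junction regenerates one of the given
    recognition sites (hypothetical R1/R2 choices).
    """
    pattern = "|".join(re.escape(site) for site in sites)
    return [fragment for fragment in re.split(pattern, read) if fragment]


# Hypothetical read containing three selected inserts and two junctions.
read = "ACGTTGCAAGCT" + "GGATCC" + "TTGACCATGCAA" + "GAATTC" + "CCATATTTGGAC"
inserts = split_concatamer(read)
```

Because like cohesive ends ligate to each other, inserts in a real concatamer can appear in either orientation, so each recovered fragment would ordinarily be compared in both its forward and reverse-complement forms.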
[0103] The sequences of individual selected oligos are aligned to
identify consensus sequences for the most abundant transcription
factors. These sequences are tested for cell specificity, either
individually or in combination, by cloning them upstream of a basal
heterologous regulatory region driving a reporter gene. The
selected oligos can also be used in combination with known
transcription factor response elements to make synthetic regulatory
regions.
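The alignment step can be illustrated with a minimal Python sketch that derives a majority-rule consensus from sequences that are already aligned and of equal length. The example sequences, the 50% majority threshold, and the function name `consensus` are assumptions made for the sketch; a practical analysis would typically use a dedicated alignment or motif-discovery tool.

```python
from collections import Counter


def consensus(aligned, min_fraction=0.5):
    """Return a majority-rule consensus; ambiguous positions become 'N'."""
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")
    result = []
    for column in zip(*aligned):
        base, count = Counter(column).most_common(1)[0]
        # Keep the base only if it occurs in more than min_fraction of reads.
        result.append(base if count / len(aligned) > min_fraction else "N")
    return "".join(result)


# Hypothetical selected sequences, used only to exercise the function.
selected = ["CCATATTTGG", "CCAAATATGG", "CCTTATTTGG", "CCATATAAGG"]
site = consensus(selected)
```

Positions reported as "N" are those where no single base predominates among the selected oligos, which may indicate positions that contribute little to binding.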
[0104] Since this method does not require knowledge of the genes
that are expressed or the transcription factors that are present in
the cells of interest, this method can be used to identify
transcriptional regulatory sequences which are utilized in cell
types or under conditions in which gene regulation is poorly
understood. The process can be used to identify and characterize
regulatory regions that are highly active in a specific cell type
or tissue, as well as cell-specific regulatory regions. This can be
extended to include different developmental stages, induction
states, or transformation states of cells.
[0105] Evaluation Method for Synthetic Regulatory Regions
[0106] Because of the limitations in previous methods, as discussed
above, new methods are needed to evaluate the functions of
synthetic regulatory regions. This invention provides an approach
utilizing the expression of proteins capable of protecting cells
from stress conditions, such as drugs, to select functional
synthetic regulatory regions. In addition to evaluating synthetic
regulatory regions, this method can be used to evaluate any of a
variety of other transcriptional regulatory sequences.
[0107] A number of different proteins are capable of protecting
eukaryotic cells from the toxic effects of specific biochemical
agents (drugs). The genes coding for some of these proteins
(protective genes) have been used to select for the amplification
of other non-selectable genes that are linked to the protective
gene. This amplification occurs after integration of the two linked
genes into the same site of the genome of transfected cells. These
selection systems have been used to amplify exogenous genes to
increase the production of recombinant proteins (Kaufman, Meth.
Enzymol. 185:537-566 (1990); Kellems, Current Opinion in
Biotechnology 2:723-729 (1991); Kellems, Methods in Molecular
Genetics 5:143-155 (1994)).
[0108] The gene most frequently used in gene amplification schemes
is the gene coding for dihydrofolate reductase (DHFR), which
provides protection against the toxic effects of the drug
methotrexate. After transfection of methotrexate sensitive cells
with an expression plasmid containing both the DHFR gene and the
gene of interest, these genes can be induced to coamplify by
treating the cells with increasing concentrations of methotrexate
(Kaufman, Meth. Enzymol. 185:537-566 (1990)).
[0109] The gene for adenosine deaminase (ADA) can also be used to
select for the amplification of linked genes (Kellems et al., in
Genetics and Molecular Biology of Industrial Microorganisms,
Hershberger et al., (ed.) American Society for Microbiology,
Washington, 215-225 (1989); Kellems, Current Opinion in
Biotechnology 2:723-729 (1991); Kellems, in Gene Amplification in
Mammalian Cells, Marcel Dekker, Inc., New York, 207-221 (1992);
Kellems, Methods in Molecular Genetics 5:143-155 (1994)). ADA is an
enzyme involved in purine metabolism in mammalian cells and can
provide protection against the toxic effects of drugs such as
xylofuranosyl-adenine (xyl-A).
[0110] Applicant has found that the ADA gene can be used in a
method for evaluating the transcriptional activity of
transcriptional regulatory regions. In this method, a high level of
ADA gene expression is required to allow growth of a cell. Such
high level expression will only be provided if a test sequence
inserted in a transcriptional regulatory position, e.g., upstream
to the ADA gene, is effective in allowing sufficient transcription
of the ADA gene.
[0111] In this system, synthetic regulatory regions/enhancers will
be assembled from mixtures of synthetic oligonucleotides, fragments
of cloned natural regulatory regions, and/or protein binding sites
using a random combinatorial approach. The synthetic regulatory
regions will be inserted upstream of a basal TATA box and
functional ADA minigene (cDNA) contained in a plasmid. This will
produce libraries of synthetic or recombined regulatory regions
which can contain millions of different combinations. These plasmid
libraries will then be transfected into cells of different origins
and the transfected cells will be selected for increased ADA
activity in transient assays. Cells that express no or low levels
of ADA will be killed and lost from the culture due to insufficient
ADA activity. Cells that express high levels of ADA, due to the
strength of the synthetic regulatory region, will survive. This
procedure thus selects for synthetic regulatory regions that drive
the expression of ADA in that specific cell type. This approach can
be used to develop strong regulatory regions that will function in
cells or tissues for which patterns of gene expression are poorly
understood or for which the regulatory regions of specific genes
have not been characterized.
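To indicate how a random combinatorial approach can reach millions of different combinations, the following Python sketch counts the possible ordered arrangements under simplifying assumptions: each position is filled independently from a pool of distinct elements, repeats are allowed, each element can be inserted in either orientation, and spacing between elements is ignored. The function name `library_size` and the example numbers are hypothetical.

```python
def library_size(n_elements, max_sites, orientations=2):
    """Count ordered arrangements of 1..max_sites elements.

    Assumes repeats are allowed and each element may be inserted in any
    of the given number of orientations (illustrative model only).
    """
    choices_per_position = n_elements * orientations
    return sum(choices_per_position ** k for k in range(1, max_sites + 1))


# Example: 10 distinct binding sites, up to 5 sites per regulatory region.
combinations = library_size(10, 5)
```

Under these assumptions, a pool of only 10 elements assembled into regions of up to 5 sites already yields more than three million distinct arrangements.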
[0112] This approach is not limited to the use of ADA-based
selection protocols but can also utilize selection strategies
developed based on the expression of other genes, including but not
limited to dihydrofolate reductase (DHFR), metallothionein, CAD,
thymidylate synthetase, ornithine decarboxylase, etc. (see Kellems,
Current Opinion in Biotechnology 2:723-729 (1991) for a more
extensive list).
[0113] Examples of how this type of selection system could be used
are outlined below:
[0114] Creation of Synthetic Regulatory Regions from Transcription
Factor Binding Sites
[0115] As discussed previously, the synthetic regulatory regions
are created to have altered composition, order, and/or spacing of
individual binding sites for transcription factors. Creation of the
synthetic regulatory regions usually uses a combination of specific
restriction sites. If convenient sites are not available,
alternatives can be used, such as chemical resynthesis or
engineering of different restriction sites onto the ends of the
binding sites. A variety of methods can be used to assemble the
different components, such as the method of nucleic acid ordered
assembly with directionality (NOMAD) (Rebatchouk, et al., PNAS
93:10891-10896 (1996)).
[0116] NOMAD is a general cloning strategy (WWW resource locator
http://Lmb1.bios.uic.edu/NOMAD/NOMAD.html). NOMAD can manipulate
the binding sites in the form of "modules" having a standardized
cohesive structure. Specially designed "assembly vectors" allow for
sequential and directional insertion of any number of binding sites
in an arbitrary predetermined order, using the ability of type IIS
restriction enzymes to cut DNA outside of their recognition
sequences (Rebatchouk, et al., PNAS 93:10891-10896 (1996)). NOMAD
ensures the convenient construction of the synthetic regulatory
regions with altered composition, order, or spacing of individual
binding sites for transcription factors. The acquired synthetic
regulatory regions can then be evaluated, such as with the ADA
selection method.
[0117] Biochemical Agents Used in ADA Selection
[0118] A number of protocols have been developed that use ADA
selection to amplify genes (Kellems et al., in Genetics and
Molecular Biology of Industrial Microorganisms, Hershberger et al.,
(ed.) American Society for Microbiology, Washington, 215-225
(1989); Kellems, Current Opinion in Biotechnology 2:723-729 (1991);
Kellems, in Gene Amplification in Mammalian Cells, Marcel Dekker,
Inc., New York, 207-221 (1992); Kellems, Methods in Molecular
Genetics 5:143-155 (1994)).
[0119] In this invention, a method has been developed which uses
ADA to identify and evaluate regulatory regions, such as synthetic
regulatory regions, or other regulatory sequences. This method can
be performed in a number of different ways, including the
following.
[0120] The simplest method uses increasing concentrations of
xylofuranosyl-adenine (xyl-A) alone. In cells expressing low levels
of ADA, xyl-A is converted to xyl-AMP by adenosine kinase. Xyl-AMP
is subsequently converted to xyl-ATP which can then be incorporated
into RNA by RNA polymerase where it acts to block further extension
of the RNA chain. This chain termination is due to the fact that,
unlike the normal sugar contained in ribonucleosides, xylose lacks
a 3' hydroxyl group which is required for RNA chain extension. ADA
is capable of detoxifying xyl-A by converting it to hypoxanthine
and xylose-P.sub.i, both of which are non-toxic. Since the chain
terminating effect of xyl-A is independent of DNA synthesis, xyl-A
will readily kill non-dividing as well as dividing cells (Kellems
et al., in Genetics and Molecular Biology of Industrial
Microorganisms, Hershberger et al., (ed.) American Society for
Microbiology, Washington, 215-225 (1989)). The concentration of
xyl-A required to kill a specific type of cell depends on the level
of endogenous ADA expressed by those cells. Most cells normally
produce relatively low levels of ADA and are, therefore, killed
quickly by low (micromolar) concentrations of xyl-A. Endogenous ADA
can be selectively inhibited by incubation with deoxycoformycin.
This protocol has the limitation that ADA expression increases with
increasing concentrations of xyl-A up to only about 10 .mu.M. Cells
can be selected that are resistant to higher concentrations of
xyl-A but they do not express higher levels of ADA. It was found
that cells selected for resistance to more than about 10 .mu.M
xyl-A were deficient in the activity of adenosine kinase, which is
responsible for converting xyl-A to xyl-AMP, the first step in
producing xyl-ATP which is a substrate for RNA polymerase.
[0121] An alternative method of ADA selection, termed 11AAU
selection (Yeung et al., J. Biol. Chem. 258:8338-8345 (1983); Yeung
et al., J. Biol. Chem. 258:8330-8337 (1983)), was subsequently
developed that used a combination of 1) alanosine, which inhibits
the de novo synthesis of AMP; 2) adenosine, which then becomes a
required substrate for adenosine kinase via the salvage
biosynthetic pathway; and 3) uridine, which overcomes the
inhibitory effect of high concentrations of adenosine on UMP
synthesis. This selection protocol requires adenosine kinase to
produce AMP and thus greatly reduces the chance that this enzyme
will be affected during the selection process. In this protocol
adenosine is used at a concentration that is cytotoxic to normal
cells. Thus, this protocol selects for increased expression of ADA
which is required to detoxify the excess adenosine. ADA activity
can be further increased by exposing cells to both 11AAU selection
and increasing concentrations of deoxycoformycin (Yeung et al., J.
Biol. Chem. 258:8330-8337 (1983)). However, some cells do not
tolerate the 11AAU/deoxycoformycin selection system well.
[0122] Yet another selection system uses xyl-A as the cytotoxic
agent in combination with deoxycoformycin to inhibit endogenous
ADA activity (Kaufman et al., PNAS 83:3136-3140 (1986); Kellems et
al., in Genetics and Molecular Biology of Industrial
Microorganisms, Hershberger et al., (ed.) American Society for
Microbiology, Washington, 215-225 (1989)). This is a very effective
method to select for increased ADA levels but does not provide any
selection for the maintenance of adenosine kinase activity.
Therefore, this method should not be used for long periods of time
as this increases the probability that adenosine kinase mutants
will arise.
[0123] These selection methods can be used in the selection and
evaluation of synthetic regulatory regions, as discussed
previously. An exogenous ADA gene under the control of one of the
synthetic regulatory regions to be evaluated is transfected into
cells that are then placed under selective pressure. The surviving
cells should carry the functional synthetic regulatory regions
which direct the strong transcription of the ADA gene, protecting the
cells from the toxic effect of the biochemical agents.
[0124] As indicated, a variety of different selection methods can
be used to identify effective synthetic regulatory regions.
Generally a selection method based on expression of a protective
gene can be used, where the selection method is able to distinguish
between low or moderate expression levels and high expression
levels. This allows a semi-quantitative comparison of the relative
effects of different synthetic and natural promoters or other
regulatory regions.
[0125] Positive selection systems can also be used, such as
magnetic sorting and FACS. An example of such systems is the MAC
Selecting System (Miltenyi Biotec, Auburn, Calif.). In this system,
a gene encoding CD4 antigen is the selective gene and CD4 antibody
complexed to magnetic beads is used to separate cells expressing
CD4 antigen from non-expressing cells. Alternatively, fluorescence
labeled CD4 antibody can be used to detect CD4 expressing cells,
and expressing cells can then be separated by FACS.
[0126] Synthetic Regulatory Regions for Muscle Cells
[0127] The development of synthetic regulatory regions with high
level activity in a particular cell type or state can be
illustrated by the identification of regions producing high level
expression in muscle cells. Individual synthetic oligonucleotides
can be synthesized containing known consensus sequences capable of
binding cell-specific transcription factors (transcription factor
binding sites), ligated together in random combinations and cloned
upstream of the ADA gene as described above. For example, consensus
sequences for muscle-specific binding sites, including serum
response elements (SREs), MEF-1 sites, MEF-2 sites, and/or TEF-1 sites,
can be used. This library of synthetic regulatory regions can then
be transfected into muscle cells (e.g., C.sub.2C.sub.12, SOL8, or
primary myoblast cells). The ADA selection system allows the
selection against clones containing weak muscle regulatory regions
and for clones containing strong muscle regulatory regions.
[0128] Also, cloned or PCR-amplified cell-specific regulatory
elements can be digested with one or more frequent cutting
restriction enzymes to produce mixtures of small DNA fragments
containing sequences capable of binding cell-specific transcription
factors. These fragments would be ligated together in random
combinations and cloned upstream of the ADA gene as described
above. For example, regulatory regions for the skeletal
.alpha.-actin, cardiac .alpha.-actin, myosin heavy chain, and
myosin light chain genes, which contain the muscle-specific binding
sites, can be used.
[0129] This library of synthetic regulatory regions would then be
transfected into muscle cells (e.g., C.sub.2C.sub.12, SOL8, or
primary myoblast cells). The ADA selection system would allow the
selection against clones containing weak muscle regulatory regions
and for clones containing strong muscle regulatory regions.
[0130] Identification of 3', 5', and Intron Regions that Enhance
Gene Expression
[0131] Alone, or in combination with the promoter selection
methodology described herein, one may use the combinatorial
approach combined with a selection methodology to identify gene
control regions, including novel regions, such as 3' untranslated
regions (3'UTR), 5' untranslated regions (5'UTR), and intron
elements that have the effect of enhancing gene expression when
inserted into a plasmid construct in the proper orientation to the
gene. One skilled in the art will immediately recognize the proper
position of the element to be inserted from the terms 3'UTR, 5'UTR,
and intron. 3'UTR, 5'UTR, or intron regions from known genes are
randomly combined, for example, by the method described herein in
connection with promoter/enhancer sequences, and inserted into the
appropriate position relative to the coding sequence of the gene of
interest. As indicated above, other sequences can also be used,
including but not limited to random sequences and combinatorial
rearrangements of known sequences. A selection procedure, such as
that described above, is then employed to identify control regions
which have the effect of enhancing the expression of the gene with
which they are associated.
[0132] Selection of Transcriptional Regulatory Regions from Various
Tissue Types
[0133] While the methods described herein are exemplified by
selection of muscle-specific promoter sequences, the use of these
methods is by no means restricted to muscle cells. For example,
cells of lung, kidney, brain, heart, eye, inner ear, epithelial,
endothelial, mesothelial, smooth muscle, neuronal, lymphocyte,
macrophage, glial, microglial, intestinal, colon, bone,
hematopoietic, skin, liver, cancerous, precancerous, metastatic,
fetal, or vascular origin may be used to identify expression
enhancing regulatory regions. In addition, regulatory elements
derived from one cell type may be selected for in a different cell
type for expression enhancing capacity. Such a procedure would also
fall within the scope of this invention.
[0134] Identification of Reduced-Size Active Portion of Synthetic
Regulatory Region
[0135] Using methods described above, one can identify synthetic
regulatory regions which provide appropriate expression levels in a
selected type or group of cells. Depending on the oligonucleotide
length utilized in the identification, it can be useful to reduce
the size of the synthetic regulatory region by identifying and
utilizing a portion or portions of the larger region which provide
the enhanced transcriptional regulatory effects. Such
identification can be performed by routine methods, such as by
replacement of portions of an effective regulatory region with
equal length inactive sequences and determining the activity of the
resulting modified region. If the expression enhancing activity is
significantly reduced, this indicates that the modified region
includes at least part of a sequence which provides the expression
enhancing activity. On the other hand, if the modification does not
significantly affect the resulting expression, this indicates that
the modified portion does not contribute to the activity of the
synthetic regulatory region. Thus, the portion or portions which
significantly contribute to the transcriptional regulatory activity
can be used as new smaller synthetic regulatory regions separately
from other parts of the original synthetic regulatory region.
Generally the position of the active portion or portions with
respect to the coding sequence should be maintained at
approximately the position it occupied in the original synthetic
region. However, it will not usually be necessary to maintain
exactly the same position, but will preferably be within 100, 60,
30, or fewer bases of the original position.
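The replacement analysis described in paragraph [0135] can be sketched in code. The following Python is a minimal illustration only, not a method taken from this disclosure: the function names, the 20-base window, the "N" filler standing in for an inactive sequence, and the 50% activity threshold are all assumptions chosen for the example. Activity values must come from measurement; nothing here predicts them.

```python
def replacement_scan(region, window=20, filler="N"):
    """Return a list of (start, end, variant) tuples in which
    region[start:end] is replaced by an equal-length run of the filler,
    so every variant keeps the length of the original region."""
    variants = []
    for start in range(0, len(region), window):
        end = min(start + window, len(region))
        variant = region[:start] + filler * (end - start) + region[end:]
        variants.append((start, end, variant))
    return variants


def active_portions(variants, activity, full_activity, threshold=0.5):
    """Return the (start, end) windows whose replacement reduced activity
    below threshold * full_activity. `activity` maps each variant
    sequence to its measured expression value (hypothetical input)."""
    return [(start, end) for start, end, variant in variants
            if activity[variant] < threshold * full_activity]
```

Windows returned by `active_portions` would correspond to portions that contribute to activity; windows not returned correspond to portions whose replacement did not significantly affect expression.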
[0136] While the active portions can be of various sizes,
preferably the portion providing a small synthetic transcriptional
regulatory region includes at least 20 contiguous nucleotides, and
more preferably includes at least 40, 60, 80, or 100 contiguous
nucleotides of the original synthetic region.
[0137] The present invention is further illustrated by the
following examples, which are not intended to limit the present
invention in any way.
EXAMPLE 1
Generating the Libraries of Synthetic Muscle Specific Regulatory
Regions by Random Combination of Regulatory Elements
[0138] Available naturally-occurring muscle specific regulatory
regions cannot regulate transcription in all desired manners in
muscle cells. Synthetic muscle specific regulatory regions are
therefore needed to provide new candidates for controlling the
transcription. The synthetic muscle specific regulatory regions can
be constructed by random combination of transcription factor
binding sites which are known to be important in the regulation of
general transcription or muscle cell-specific transcription. This
example illustrates how synthetic muscle-specific regulatory
regions can be constructed using a selection of known binding
sites.
[0139] The sequences which are shown in the following include MRE
(muscle response element), E-box which is the binding site
recognized by the family of basic-helix-loop-helix (bHLH)
transcription factors, and the binding sites for transcription
factors MEF-2, TEF-1 and Sp1.
MEF-2   CTCTAAAAATAACCCT
MRE     GCCCAACACCCAAATATGGCTT
E-box   CTCACCTGCTG
TEF-1   GCCGCATTCCTGGG
Sp1     CCCCGCCC
[0140] The first step in constructing the synthetic regulatory
regions is to synthesize double-stranded oligonucleotides
containing one of the above binding sites. This synthesis is
performed for each of the binding sites to be included. The
oligonucleotides should be sticky ended, i.e., have ends which are
single-stranded with sequences complementary to each other. The
oligonucleotides preferably fit in one or two helical turns so that
elements reside on the same face after being linked together. This
can be achieved by constructing a sequence so that the contact
points contained in the elements are approximately 10 base pairs
apart from each other (or approximately 20 base pairs apart). Those
skilled in the art will know appropriate techniques to provide
appropriate spacing and sticky ends.
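The spacing rule in paragraph [0140] can be illustrated with a short calculation. This is a sketch under stated assumptions: it uses the approximately 10 base pairs per helical turn given in the text, and the function names and the 40 degree tolerance are hypothetical choices for illustration, not values from this disclosure.

```python
def helical_offset(pos_a, pos_b, bp_per_turn=10):
    """Angular offset in degrees (0 to 180) between two contact points,
    given their base-pair positions along the helix."""
    delta = abs(pos_b - pos_a) % bp_per_turn
    angle = 360.0 * delta / bp_per_turn
    return min(angle, 360.0 - angle)


def same_face(pos_a, pos_b, bp_per_turn=10, tolerance_deg=40.0):
    """True if the two contact points lie on roughly the same helical
    face. The tolerance is an illustrative assumption."""
    return helical_offset(pos_a, pos_b, bp_per_turn) <= tolerance_deg
```

With these assumptions, contact points 10 or 20 base pairs apart have a zero offset, while points 5 or 15 base pairs apart lie on opposite faces of the helix.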
[0141] These oligonucleotides are mixed together using a particular
ratio of different oligonucleotides. This ratio can be varied to
favor the presence of a particular element. For example, MEF-2,
E-box, MRE, TEF-1, and Sp1 can be mixed at a ratio of 4:2:2:2:1, in
order to increase the probability of MEF-2 presence in the
synthetic regulatory regions. Similarly, the ratios can be biased
in favor of other binding sites. The mixed oligonucleotides can
automatically be linked together non-covalently through annealing
of the sticky ends. The oligonucleotides are then ligated using a
DNA ligase. The oligonucleotides are therefore covalently linked
together to form new and longer oligonucleotides.
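The effect of the mixing ratio in paragraph [0141] can be sketched with a simple simulation. The Python below is illustrative only and rests on simplifying assumptions that are not part of this disclosure: each position in a ligated product is drawn independently with probability proportional to the mixing ratio, and orientation, ligation efficiency, and product length distribution are ignored. The function names, the seed, and the library size are hypothetical.

```python
import random

ELEMENTS = ["MEF-2", "E-box", "MRE", "TEF-1", "Sp1"]
RATIO = [4, 2, 2, 2, 1]  # mixing ratio given in the text


def ligate_random(n_elements, rng):
    """Draw n_elements binding sites, each chosen with probability
    proportional to its share of the mixture."""
    return rng.choices(ELEMENTS, weights=RATIO, k=n_elements)


def expected_fraction(name):
    """Expected share of positions occupied by the named element."""
    return RATIO[ELEMENTS.index(name)] / sum(RATIO)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
library = [ligate_random(10, rng) for _ in range(1000)]
```

Under this model MEF-2 is expected to occupy 4/11 of the positions, twice the share of E-box, MRE, or TEF-1, which is the bias the ratio is intended to produce.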
[0142] The ligated oligonucleotides are cut through partial
digestion with a nuclease. The digested oligonucleotides are
separated by gel electrophoresis and the oligonucleotides with a
particular size, e.g., 200 bp, are recovered from the gel. The
recovered oligonucleotides are then capped with a sticky ended
adaptor using a DNA ligase.
[0143] The capped oligonucleotides are then cloned into appropriate
vectors for expression analysis. For example, for identification of
effective myogenic promoter/enhancer sequences, the capped
oligonucleotides can be inserted at a site adjacent to the Sk-actin
TATA-box in a myogenic vector system (MVS) .beta.-gal construct or
at -200 in MVS .beta.-gal construct.
EXAMPLE 2
Comparison of Relative Regulatory Region Activity During
Differentiation of Primary Myoblast Cells
[0144] The synthetic regulatory regions should be evaluated to
confirm they are functional in the regulation of transcription.
Large-scale evaluation can be done with the stress condition
selection (e.g., ADA), as discussed above; medium-scale evaluation
can be done either with the stress condition selection, or with the
following approach or with other analyses of expression level. This
example also illustrates the selection of synthetic regulatory
regions which regulate transcription rates in particular cells, in
this example, muscle cells.
[0145] In this approach, the synthetic regulatory regions are
inserted into a vector to regulate the transcription of a reporter
gene, instead of a selective gene. The reporter genes include, but
are not limited to, the genes encoding .beta.-gal and luciferase.
Minilysate prepared DNA, such as the constructs of example 1, is
transferred into myogenic cultures in 96 well microtiter dishes.
.beta.-gal activity is assayed by routine methods, e.g., mini ONPG
assay, and compared to .beta.-gal expression driven by the
cytomegalovirus immediate early promoter (CMV-.beta.-gal). High
.beta.-gal activities represent the strong synthetic regulatory
regions. Of course, other non-cell-specific regulatory regions
could also be used for a reference expression level.
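The comparison to a reference promoter in paragraph [0145] amounts to a ratio of reporter readings. The following Python is a minimal sketch, assuming replicate readings are averaged before the ratio is taken; the function names and the dictionary layout are hypothetical, and the readings themselves must be supplied by the experimenter.

```python
def relative_activity(test_readings, reference_readings):
    """Mean reporter reading of a test construct divided by the mean
    reading of the reference construct (for example, CMV-beta-gal)."""
    if not test_readings or not reference_readings:
        raise ValueError("need at least one reading per construct")
    ref_mean = sum(reference_readings) / len(reference_readings)
    if ref_mean == 0:
        raise ValueError("reference activity is zero")
    return (sum(test_readings) / len(test_readings)) / ref_mean


def rank_constructs(readings, reference_name):
    """Rank constructs by activity relative to the named reference.
    `readings` maps construct name -> list of replicate readings."""
    reference = readings[reference_name]
    scores = {name: relative_activity(values, reference)
              for name, values in readings.items()
              if name != reference_name}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

Constructs at the top of the ranking would be the candidates for strong synthetic regulatory regions in the cell type assayed.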
[0146] The above approach can also be used for the further
evaluation of synthetic regulatory regions acquired using the
stress condition approach, as the .beta.-gal activity assay can
provide quantitative information about the regulatory regions being
evaluated.
[0147] FIG. 3 shows the comparison of relative regulatory region
activity during differentiation of primary myoblast cells. This
experiment was done using a reporter gene product assay. The
regulatory region containing 2.times.MEF-2 has about a five-fold
higher activity than other regulatory regions tested. This result
indicates that the regulatory region containing 2.times.MEF-2 is
capable of stimulating gene transcription at a high level in
myoblast cells.
EXAMPLE 3
Differential SRF Activity on c-Fos SRE vs Muscle SRE
[0148] The above approach (Example 2) using a reporter gene product
assay was used to determine the differential SRF activity on c-Fos
SRE and muscle SRE, the sequences of which are shown in the
following. These sequences have sequence similarity in the SRF
binding sites, which are underlined.
C-FOS SRE:    ACAGGATGTCCATATTAGGACATCTGCG
MUSCLE SRE:   GCCCGACACCCAAATATGGCGACGGCCG
[0149] The c-Fos SRE and muscle SRE were inserted into a vector to
regulate a reporter gene encoding a luciferase. The vector
constructs were transferred into C.sub.2C.sub.12 myoblasts. The
luciferase gene is transcribed in the presence of various SRFs.
The luciferase activity was then assayed. All the transcription
factors tested except GCN1 showed similar activities on c-Fos SRE
and muscle SRE. On c-Fos SRE, GCN1 has about 3-fold higher activity
than SRFwt does. On muscle SRE, in contrast, GCN1 has about 2-fold
lower activity than SRFwt does (FIG. 4). These results indicate
that minor variations in transcription factor binding sites can result in
a major difference in regulatory region activity in the presence of
a particular transcription factor.
EXAMPLE 4
Selection of Tissue- or Cell-Specific Elements in Vivo
[0150] In addition to in vitro selection approaches, synthetic
tissue- or cell-specific transcriptional regulatory regions can be
selected and evaluated in vivo. One of the most important uses of
the synthetic elements is to regulate tissue- or cell-specific gene
expression in an organism. The synthetic elements identified in
vitro may be further studied in vivo to better evaluate or
understand their functions. Useful in vivo approaches include, but
are not limited to, transgenic animals and muscle injection.
[0151] A. Insertion of Vectors into Transgenic Mice
[0152] Vectors are constructed containing a reporter gene, e.g.
.beta.-gal, under the control of the synthetic elements identified
as having in vitro activity in a particular type or types of cells,
e.g., in muscle cells. Transgenic mice carrying the vectors can be
generated by standard oocyte injection (Brinster, et al, Proc.
Natl. Acad. Sci. USA 82:4438-4442 (1985)) and bred to demonstrate
stable transmission of transgenes to subsequent generations.
Transgenics can be identified by polymerase chain reaction or
Southern genomic DNA blotting analysis, such as from tail cut
DNA.
[0153] Transgenics can be tested for tissue specific expression,
e.g., muscle specific expression, of the transferred vector by RNA
blotting of total RNA isolated from several tissues, or by
.beta.-gal assay. For example, samples can be taken and analyzed
from skeletal muscle, gonad, lymph nodes, liver, spleen, kidney,
lungs, heart, brain, bone marrow, blood, and other tissues. The
analysis and comparison of expression levels, such as by the
determination of .beta.-gal activity in the different tissues, will
reveal the regulatory pattern of the synthetic regulatory regions
in the organism. Expression in one tissue at a significantly higher
level than in other tissues indicates that the regulatory regions
on the plasmid are specific for that tissue.
[0154] Such in vivo analysis of tissue specific expression is
applicable to the evaluation of regulatory regions in any position
with respect to the coding sequences, such as in the 5' UTR, the 3'
UTR, and in introns.
[0155] B. Somatic Gene Transfer to Skeletal Muscle in Vivo
[0156] To demonstrate the effects of the synthetic elements as used
in in vivo gene therapy and/or to identify elements having muscle
specific activity, vectors can be injected into adult muscle (e.g.,
avian or mammalian) for the expression of a reporter gene such as
the gene encoding .beta.-gal or luciferase.
[0157] Vectors carrying .beta.-gal under the control of the
synthetic elements, or under the control of known regulatory
regions (used as controls), are pelleted by centrifugation, dried
under vacuum, resuspended in an appropriate formulation, and
injected into the quadriceps muscle (20 .mu.g/pellet, 3
pellets/muscle) of 2 sets of 6 mice (injection into other muscles
can also be used). The animals are sacrificed 48 hours following
introduction of the DNA and the entire muscle (the muscle injected)
from each animal that received an inoculation is removed and
assayed for .beta.-gal activity in the tissue. If sufficient
experimental animals are available, it is preferable to assay for
expression at a number of different time points, such as 24 hrs, 48
hrs, 7 days, 14 days, and 28 days following DNA introduction. In
this way additional information is provided on the time course of
expression of the reporter gene.
[0158] As described above, expression of the reporter gene is
determined by assay for activity of the product of that gene, e.g.,
.beta.-gal activity; however, other methods can also be used,
including reverse transcriptase PCR analysis.
[0159] Muscle specific expression is demonstrated by showing that
expression occurs only or at a significantly higher level in muscle
than in other tissues. Therefore, the evaluation preferably also
includes assaying for expression of the reporter gene in tissues
other than skeletal muscle. It is expected that some amount of the
injected vector will migrate to other tissues. Thus, at each of the
time points for which muscle samples are taken, samples can also be
taken from a set of other tissues, such as gonad, lymph nodes,
liver, spleen, kidney, lungs, heart, brain, bone marrow, and blood.
Each of the samples is assayed for reporter gene expression.
[0160] The pattern of reporter gene expression can also be
correlated with the presence of the vector. The presence of the
vector in a tissue can be determined by amplification and
hybridization of a vector-specific sequence.
EXAMPLE 5
The Development of Synthetic Regulatory Regions
[0161] The above examples describe approaches to constructing,
screening, and evaluating synthetic regulatory regions. The
combination of these approaches can identify regulatory regions
with advantageous properties for particular applications. The
following example demonstrates that synthetic regulatory regions
constructed using binding sequences in a combinatorial approach can
be identified which provide advantageous expression characteristics
in a particular tissue and state of that tissue.
[0162] To aid in understanding the results of this example, a short
background discussion may be of assistance. IGF-1 plays a role as a
neurotrophic agent in repairing crushed motor neurons. Localized
expression of IGF-1 hastens the repair of crushed motor neurons.
Although it is one of the strongest muscle specific promoters, the
skeletal .alpha.-actin promoter is not an ideal regulatory region
for this expression, as intact innervation of muscle is required to
maintain skeletal .alpha.-actin promoter activity at a high level.
In transgenic mice having an .alpha.-actin/hIGF-1 transgene and
showing high level expression of hIGF-1, following sciatic nerve
crush the expression level of hIGF-1 was down regulated. hIGF-1
expression was at a minimum about 2 weeks post crush (matching the
time of greatest muscle atrophy), and only began to return to
normal levels at about 3 weeks post crush.
[0163] Thus, nerve crush effectively represses skeletal
.alpha.-actin promoter, which only recovers with reinnervation.
This is in accord with observations that injected
.alpha.-actin/IGF-1 plasmids take at least three weeks to show
effectiveness. Earlier expression of IGF-1 would therefore be
desirable in order to maintain high level expression of
neurotrophic genes during the early stages of nerve and muscle
regeneration.
[0164] It is, therefore, beneficial to develop synthetic myogenic
regulatory regions to drive IGF-1 expression which are insensitive
to the innervation state of muscle. Thus, having a myogenic
regulatory region that is turned on all the time in muscle should
even further speed the nerve repair process. In order to develop
such a regulatory region, we took the following steps.
[0165] A. Construction of Libraries of Synthetic Regulatory
Regions
[0166] We first constructed a series of synthetic regulatory
regions based on the sequences of transcriptional control elements
involved in the activation and regulation of genes in mammalian
cells.
[0167] The portion of the skeletal .alpha.-actin promoter upstream
of the ATAAAA box was removed from plasmid p612aACATMLC (which
contains a pBluescript polylinker upstream of a skeletal
.alpha.-actin promoter) by digestion with EagI, which cuts in the
pBluescript polylinker upstream of the promoter and 47 bp upstream
of the ATAAAA box. The luciferase gene was linked downstream of the
resulting minimal .alpha.-actin promoter. The synthetic regulatory
regions were randomly cloned into this minimal
.alpha.-actin/luciferase test plasmid.
[0168] The control elements that were tested include:
SRE     5'-GACACCCAAATATGGCGACCG-3'
        3'-CTGTGGGTTTATACCGCTGGC-5'
MEF-1   5'-CCAACACCTGCTGCCTGCC-3'
        3'-GGTTGTGGACGACGGACGG-5'
MEF-2   5'-CGCTCTAAAAATAACTCCC-3'
        3'-GCGAGATTTTTATTGAGGG-5'
TEF-1   5'-CACCATTCCTCAC-3'
        3'-GTGGTAAGGAGTG-5'
SP1     5'-CCGTCCGCCCTCGG-3'
        3'-GGCAGGCGGGAGCC-5'
[0169] The SRE sequence corresponds to the proximal skeletal
.alpha.-actin SRE sequence. The SRE core sequence is overlined. The
MEF-1 sequence (complemented and overlined) and the adjacent GCTGC
motif (asterisked) are conserved in the muscle creatine kinase gene
and rat myosin light chain gene (Lassar et al., 1989). The SP1
sequence (overlined) has an EagI half restriction site at each end.
Sp1 sites were included as spacers between the other control
elements.
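The strand pairs listed in paragraph [0168] can be checked for complementarity with a few lines of code. The Python below is a sketch for verification purposes; the sequences are copied from the text, while the dictionary and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Each pair is (top strand 5'->3', bottom strand 3'->5') as printed.
OLIGOS = {
    "SRE":   ("GACACCCAAATATGGCGACCG", "CTGTGGGTTTATACCGCTGGC"),
    "MEF-1": ("CCAACACCTGCTGCCTGCC", "GGTTGTGGACGACGGACGG"),
    "MEF-2": ("CGCTCTAAAAATAACTCCC", "GCGAGATTTTTATTGAGGG"),
    "TEF-1": ("CACCATTCCTCAC", "GTGGTAAGGAGTG"),
    "SP1":   ("CCGTCCGCCCTCGG", "GGCAGGCGGGAGCC"),
}


def is_complementary(top, bottom_3to5):
    """True if the bottom strand, read 3'->5', pairs base for base with
    the top strand read 5'->3'."""
    return top.translate(COMPLEMENT) == bottom_3to5


# Names of any pairs that fail the check.
mismatched = [name for name, (top, bottom) in OLIGOS.items()
              if not is_complementary(top, bottom)]
```

A non-empty `mismatched` list would flag a transcription error in one of the printed strands.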
[0170] Oligonucleotide pairs (dsDNA) were annealed and then ligated
together in various combinations to form larger fragments of
randomly oriented control elements. Since each of the Sp1 elements
contains EagI half-sites at each end, an intact EagI restriction
site will be generated wherever two Sp1 elements are ligated
together. DNA fragments contain from 8 to 14 control elements in
random combinations with EagI cohesive ends, and thus represent
synthetic regulatory regions. Fragments formed from each of the
combinations of elements resulted in a separate pool of fragments.
Each of the combinations contains a heterogenous set of fragments
resulting from the particular starting combination of
oligonucleotides, as the oligonucleotides can anneal together in
various orders and numbers.
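The statement in paragraph [0170] that an intact EagI site arises wherever two Sp1 elements are joined can be checked directly from the sequences. The following Python is a minimal sketch: it uses the EagI recognition sequence CGGCCG and the Sp1 and SRE top strands given above, and it models ligation simply as concatenation of top-strand sequences. The function names are hypothetical.

```python
EAGI_SITE = "CGGCCG"              # EagI recognition sequence
SP1 = "CCGTCCGCCCTCGG"            # Sp1 oligonucleotide, top strand
SRE = "GACACCCAAATATGGCGACCG"     # SRE oligonucleotide, top strand

_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement, representing an element in the opposite
    orientation."""
    return seq.translate(_COMP)[::-1]


def junction_has_eagi(left, right):
    """True if joining left + right creates an EagI site that spans
    the junction between the two elements."""
    joined = left + right
    j = len(left)
    n = len(EAGI_SITE)
    # Only windows containing bases from both elements are examined.
    for start in range(max(0, j - n + 1), j):
        if joined[start:start + n] == EAGI_SITE:
            return True
    return False
```

Because the Sp1 element begins with CCG and ends with CGG in either orientation, `junction_has_eagi` returns True for any Sp1/Sp1 junction and False for the Sp1/SRE junctions in this example.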
[0171] DNA fragments from each pool of synthetic regulatory regions
were ligated into the EagI site of the minimal
.alpha.-actin/luciferase plasmid. Approximately twenty clones were
picked for each combination, which were then grown, purified with
Qiagen kits and used to transfect primary myoblasts.
[0172] The clones were named Cm-n, where m is the number of a
particular combination and n is the number of a particular clone
picked from that combination. For example, C5-1 represents clone
number 1 of combination number 5. FIG. 5 shows the arrangement of
sub-elements of some exemplary synthetic regulatory regions. The
sequences of portions of the plasmids containing exemplary
synthetic regulatory regions, including the sequence of the
synthetic regulatory region, are shown in FIGS. 8-31. The sequences
are believed to be correct; however, a small percentage of sequence
errors may be present. One skilled in the art could readily obtain
the correct synthetic regulatory region by identifying the
particular elements and their positions in the region from the
sequence provided, and constructing the synthetic regulatory
regions from those elements in the same positions and
orientations.
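Identifying the elements and their positions and orientations in a sequenced region, as described in paragraph [0172], can be sketched as a search for each control element on both strands. The Python below is illustrative only: it uses the top-strand element sequences listed above, it finds exact matches only (so a region containing sequence errors would need approximate matching, which is not shown), and the function names and the example region are hypothetical.

```python
ELEMENTS = {
    "SRE": "GACACCCAAATATGGCGACCG",
    "MEF-1": "CCAACACCTGCTGCCTGCC",
    "MEF-2": "CGCTCTAAAAATAACTCCC",
    "TEF-1": "CACCATTCCTCAC",
    "SP1": "CCGTCCGCCCTCGG",
}

_COMP = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMP)[::-1]


def map_elements(region):
    """Return a sorted list of (position, name, strand) for every exact
    occurrence of a control element in either orientation."""
    hits = []
    for name, motif in ELEMENTS.items():
        patterns = {"+": motif}
        reverse = revcomp(motif)
        if reverse != motif:  # avoid counting a palindrome twice
            patterns["-"] = reverse
        for strand, pattern in patterns.items():
            start = region.find(pattern)
            while start != -1:
                hits.append((start, name, strand))
                start = region.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical region assembled for illustration: Sp1, MEF-2,
# an SRE in reverse orientation, and a second Sp1.
example = (ELEMENTS["SP1"] + ELEMENTS["MEF-2"]
           + revcomp(ELEMENTS["SRE"]) + ELEMENTS["SP1"])
hits = map_elements(example)
```

The resulting list gives the order, position, and orientation of the elements, which is the information needed to reconstruct a region from its parts.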
[0173] A p448 Sk .alpha.-actin promoter/luciferase vector was used
as a control. This promoter is a standard representative of strong
muscle specific promoters, being one of the strongest such
promoters currently available. Expression from this vector was used
as a standard for comparison of the expression levels regulated by
the test synthetic regulatory regions.
[0174] B. Screening of Library of Synthetic Regulatory Regions in
Vitro
[0175] Plasmids of the synthetic regulatory region library
described in A. were transfected into muscle cells with
lipofectamine transfections in two series. The transfected cells
from these transfection series were grown and collected for
luciferase activity assay.
[0176] We observed from the first series of lipofectamine
transfections done in duplicate in primary myoblast cultures, that
none of the eight constructions grown for each of the multimerized
SREs, E-boxes, MEF-2, and TEF-1 regulatory regions (32 separate
plasmids) had activity greater than or equal to the activity of the
skeletal .alpha.-actin promoter/enhancer driven luciferase plasmid
(p448).
[0177] In the second series, six different combinations of
synthetic regulatory regions were then tested in mature myotubes.
Luciferase activities up to 5-fold greater than that driven by the
skeletal .alpha.-actin promoter/enhancer were detected by
transfections in a subset of clones, namely C1-28 (FIG. 8), C2-27
(FIG. 9), C5-12 (FIG. 10), C6-16 (FIG. 11) and C6'-7 (FIG. 12). In
muscle cells, therefore, these synthetic regulatory regions
stimulate higher transcription levels than skeletal .alpha.-actin
promoter.
[0178] Moreover, we used a simple assay to check the effect of
myoblast depolarization as a way to evaluate the potential for
innervation effects on muscle gene expression. We found that the
skeletal .alpha.-actin promoter is up-regulated 3-4 fold by
applying KCl for 20 minutes to the media of myotube cultures.
Clones C1-28 (FIG. 8), C5-1 (FIG. 13), C5-5 (FIG. 14), C6-5 (FIG.
15), C5-12 (FIG. 10), C6-16 (FIG. 11), and C6'-7 (FIG. 12) provided
high levels of rather stable expression in depolarized myotubes.
Thus, these synthetic regulatory regions may be much less affected
by innervation effects than skeletal .alpha.-actin promoter and are
ready for further evaluation. Results of the reporter expression
levels and of the expression levels in the KCl depolarization test
are shown in FIG. 6.
[0179] Method
[0180] A. First Transfection
[0181] 1 .mu.g synthetic regulatory region/luciferase plasmid was
transfected into 24 hr primary chick myoblasts in 60 mm plates
(500,000 cells/plate). 200 ng CMV .beta.-gal plasmid was
cotransfected in each transfection.
[0182] 40 hours after transfection, KCl was added directly to the
medium to a concentration of 50 .mu.M and cells were treated at
37.degree. C. for 2 hours. The medium containing KCl was aspirated,
the cells rinsed once with HBSS, and fresh medium was added. The
control plates without KCl treatment were left untouched in the
original medium.
[0183] 20 hours after KCl treatment, cells were collected and
luciferase activity was assayed.
[0184] B. Second Transfection
[0185] 100 ng synthetic regulatory region-luciferase plasmid, along
with 200 ng CMV .beta.-gal plasmid, was transfected into 24-hour
primary chick myoblasts in 60 mm plates (500,000 cells/plate). 700
ng YEAST MARKER carrier DNA was added to each transfection to make
the total amount of DNA transfected 1 .mu.g.
[0186] 36 hours after transfection, cells were rinsed once with
HBSS, MEM (no serum) containing 50 .mu.M KCl (for control) was
added, and the cells were incubated at 37.degree. C. for 40
minutes. Then the above medium was aspirated, the cells rinsed once
with HBSS, and full medium was added.
[0187] 24 hours after KCl treatment, cells were collected, and
luciferase activity was assayed.
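The text does not spell out how the luciferase readings were processed. One common approach, assumed here, is to normalize each luciferase reading to the signal from the co-transfected CMV β-gal plasmid and then express the KCl-treated value as a ratio to the untreated control. The sketch below illustrates that calculation in Python. The function names are hypothetical and every reading is an invented placeholder, not data from this application.

```python
# Assumed analysis, not stated in the text: luciferase is normalized to the
# co-transfected beta-gal signal, then treated plates are compared to controls.
# All numbers below are invented placeholders in arbitrary units.

def normalized_activity(luciferase, beta_gal):
    """Luciferase signal per unit of beta-gal signal for one plate."""
    if beta_gal <= 0:
        raise ValueError("beta-gal signal must be positive")
    return luciferase / beta_gal


def depolarization_ratio(treated, control):
    """Ratio of normalized activity, KCl-treated over untreated control.

    Each argument is a (luciferase, beta_gal) pair for one plate.
    """
    return normalized_activity(*treated) / normalized_activity(*control)


ratio = depolarization_ratio(treated=(9000.0, 3.0), control=(6000.0, 4.0))
print(ratio)  # 2.0
```

Under this assumed analysis, a ratio near 1 would indicate that a regulatory region is largely unaffected by depolarization, and a ratio well above 1 would indicate up-regulation.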
[0188] C. Evaluation of Synthetic Regulatory Regions in Nerve Crush
Model
[0189] To demonstrate the evaluation and identification of
synthetic regulatory regions effective in a specific in vivo
environment, we tested some of the constructs from above which were
shown to provide high level myogenic expression and for which the
in vitro test suggested less sensitivity to innervation effects
than the Sk α-actin promoter/enhancer. Results for two of the
constructs in a nerve crush model are described. The experiments
were designed to identify synthetic regulatory regions that resist
the nerve-injury-induced down-regulation seen with expression
driven by the skeletal actin promoter.
[0190] Tibiales muscles of ICR mice were injected with 100 µg of
clone skeletal α-actin promoter 448 (control), synthetic
regulatory region luciferase vectors C1-28 (FIG. 6), and C5-12
(FIG. 8), which had been shown to be less affected by the myoblast
depolarization effect than the control (see Section B). Two weeks
after sciatic nerve crush, the injected muscle was harvested and
assayed for luciferase activity. The expression levels from C1-28
and C5-12 were approximately 7-fold and 15-fold greater,
respectively, than that from the skeletal α-actin promoter (FIG.
7).
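A fold difference of this kind is the mean reading for a construct divided by the mean reading for the control promoter. The Python sketch below shows that calculation under stated assumptions. The function name, the construct labels, and the readings are hypothetical. The placeholder values were chosen only so that the arithmetic lands on the approximate 7-fold and 15-fold figures quoted above, and they are not measurements from this application.

```python
from statistics import mean

# Invented placeholder readings in arbitrary units, two hypothetical animals
# per group. These are NOT measurements from the application.
readings = {
    "control": [1.0, 3.0],
    "construct_a": [10.0, 18.0],
    "construct_b": [25.0, 35.0],
}


def fold_over_control(readings, control_name):
    """Return each construct's mean reading divided by the control mean."""
    control_mean = mean(readings[control_name])
    if control_mean <= 0:
        raise ValueError("control mean must be positive")
    return {
        name: mean(values) / control_mean
        for name, values in readings.items()
        if name != control_name
    }


print(fold_over_control(readings, "control"))
# {'construct_a': 7.0, 'construct_b': 15.0}
```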
[0191] These results demonstrate that the two new regulatory
regions were more resistant to injury-induced regulation. A benefit
of these regulatory regions will be to sustain high expression
levels of neurotrophic genes during the initial stages of nerve and
muscle regeneration, when the skeletal α-actin promoter is
down-regulated. The higher expression levels provided by synthetic
regulatory regions such as these may allow the use of significantly
lower amounts of DNA, e.g., 1/10 the amount of DNA, to
achieve the same biological effects as those provided by expression
driven by promoters such as the skeletal α-actin
promoter.
[0192] One skilled in the art would readily appreciate that the
present invention is well adapted to carry out the objects and
obtain the ends and advantages mentioned, as well as those inherent
therein. The molecular complexes and the methods, procedures,
treatments, molecules, and specific compounds described herein are
presently representative of preferred embodiments, are exemplary,
and are not intended as limitations on the scope of the invention.
It will be readily apparent to one skilled in the art that varying
substitutions and modifications may be made to the invention
disclosed herein without departing from the scope and spirit of the
invention.
[0193] All patents and publications mentioned in the specification
are indicative of the levels of those skilled in the art to which
the invention pertains. All patents and publications are herein
incorporated by reference to the same extent as if each individual
publication was specifically and individually indicated to be
incorporated by reference.
[0194] Other embodiments are within the following claims.
* * * * *
References