U.S. patent application number 10/362703 was filed with the patent office on 2006-12-14 for method of performing reverse transcription reaction using reverse transcriptase encoded by non-LTR retrotransposable element.
Invention is credited to Arkadiusz Bibillo, Thomas H. Eickbush.
Application Number | 20060281079 10/362703 |
Document ID | / |
Family ID | 22859749 |
Filed Date | 2006-12-14 |
United States Patent Application | 20060281079 |
Kind Code | A1 |
Eickbush; Thomas H. ; et al. | December 14, 2006 |
METHOD OF PERFORMING REVERSE TRANSCRIPTION REACTION USING REVERSE
TRANSCRIPTASE ENCODED BY NON-LTR RETROTRANSPOSABLE ELEMENT
Abstract
The present invention relates to a method of preparing a cDNA
molecule which includes: contacting an RNA molecule, in the
presence of dNTPs, with a non-LTR retrotransposon protein or
polypeptide having reverse transcriptase activity under conditions
effective for production of a cDNA molecule complementary to the
RNA molecule, said contacting being carried out in the absence of a
target DNA molecule of the non-LTR retrotransposon protein or
polypeptide; and isolating the cDNA molecule.
Inventors: |
Eickbush; Thomas H. (Penfield, NY); Bibillo; Arkadiusz (Rochester, NY) |
Correspondence
Address: |
Edwin V Merkel;Nixon Peabody
Clinton Square
PO Box 31051
Rochester
NY
14603
US
|
Family ID: |
22859749 |
Appl. No.: |
10/362703 |
Filed: |
August 30, 2001 |
PCT Filed: |
August 30, 2001 |
PCT NO: |
PCT/US01/27125 |
371 Date: |
February 25, 2003 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60229075 | Aug 30, 2000 | |
Current U.S.
Class: |
435/6.13 ;
435/91.2 |
Current CPC
Class: |
C12Q 1/6844 20130101;
C12Q 2521/107 20130101; C12N 9/1276 20130101; C12N 15/1096
20130101; C12Q 2521/107 20130101; C12Q 1/6844 20130101 |
Class at
Publication: |
435/006 ;
435/091.2 |
International
Class: |
C12Q 1/68 20060101
C12Q001/68; C12P 19/34 20060101 C12P019/34 |
Government Interests
[0002] This invention was made, at least in part, utilizing funding
received from the National Institutes of Health grant GM42790. The
U.S. government may have certain rights in this invention.
Claims
1. A method of preparing a cDNA molecule comprising: contacting an
RNA molecule, in the presence of dNTPs, with a non-LTR
retrotransposon protein or polypeptide having reverse transcriptase
activity, wherein the non-LTR retrotransposon protein or
polypeptide is an R2 protein or polypeptide, under conditions
effective for production of a cDNA molecule complementary to the
RNA molecule, said contacting being carried out in the absence of a
target DNA molecule of the non-LTR retrotransposon protein or
polypeptide; and isolating the cDNA molecule.
2. (canceled)
3. The method according to claim 1, wherein the R2 protein or
polypeptide is derived from an arthropod.
4. The method according to claim 3, wherein the arthropod is Bombyx
mori.
5. The method according to claim 1, wherein the RNA molecule lacks
a primer site to initiate reverse transcription.
6. The method according to claim 1, wherein the RNA molecule lacks
a polyadenylation region.
7. The method according to claim 1, wherein said contacting is
carried out in the presence of both a donor RNA molecule having a
known sequence and an acceptor RNA molecule having a known
sequence.
8. The method according to claim 1, wherein said contacting is
carried out in the presence of a donor RNA molecule having a known
sequence.
9. The method according to claim 1, wherein said contacting is
carried out in the presence of an acceptor RNA molecule having a
known sequence.
10. The method according to claim 1, wherein said contacting is
carried out under isothermic conditions.
11. The method according to claim 1, wherein said contacting is
carried out at a temperature of between about 20° C. and
about 40° C.
12. The method according to claim 11, wherein said contacting is
carried out at a temperature of between about 21° C. and
about 35° C.
13. The method according to claim 1, wherein the RNA molecule
includes a structure or an annealed duplex region that would
interfere with retroviral reverse transcriptase function.
14. The method according to claim 1, wherein said contacting is
carried out under conditions whereby a significant portion of the
isolated cDNA molecules are substantially full length reverse
transcripts of the RNA molecule.
15. The method according to claim 1, wherein the RNA molecule
includes a polyadenylated region, the method further comprising:
annealing a primer to the polyadenylated region of the RNA molecule
prior to said contacting.
16. The method according to claim 15, wherein said contacting is
carried out in the presence of an acceptor RNA molecule having a
known nucleotide sequence.
17-26. (canceled)
27. A method of amplifying a cDNA molecule comprising: performing
the method of claim 1 to obtain a single-stranded cDNA molecule
that includes a region of interest; annealing a first primer to the
single-stranded cDNA molecule at a position 3' of the region of
interest; and extending the first primer to form a complementary
DNA strand including a complement of the region of interest.
28. The method according to claim 27 further comprising:
dissociating the complementary DNA strand from the single-stranded
cDNA molecule; annealing a second primer to the complementary DNA
strand molecule at a position 3' of the complement of the region of
interest; and extending the second primer to form a second
complementary DNA strand which is substantially the same as the
single-stranded cDNA molecule at the region of interest.
29. The method according to claim 28 further comprising:
dissociating the second complementary DNA strand from the
complementary DNA strand; and repeating said annealing and
extending of the first and second primers, using the second
complementary DNA strand, to form third and fourth complementary
DNA strands, the third complementary DNA strand being substantially
the same as the first complementary strand and the fourth
complementary DNA strand being substantially the same as the second
complementary strand.
30. The method according to claim 27, wherein said performing is
carried out under conditions effective for the non-LTR
retrotransposon protein or polypeptide to jump from the RNA
molecule to an acceptor RNA molecule having a known sequence, the
single-stranded cDNA molecule comprising a first portion
complementary to the RNA molecule and a second portion
complementary to the acceptor RNA molecule, the second portion
being located 3' of the first portion.
31. The method according to claim 30, wherein the primer anneals to
the second portion of the single-stranded cDNA molecule.
32. The method according to claim 27, wherein said performing is
carried out under conditions effective for the non-LTR
retrotransposon protein or polypeptide to jump from a donor RNA
molecule having a known sequence to the RNA molecule, the
single-stranded cDNA molecule comprising a first portion
complementary to the donor RNA molecule and a second portion
complementary to the RNA molecule, the second portion being located
3' of the first portion.
33. The method according to claim 27 further comprising: exposing
the single-stranded cDNA molecule to a terminal transferase in the
presence of dCTPs to form an oligoC tail at the 3' end of the
single-stranded cDNA molecule.
34. The method according to claim 33, wherein said exposing is
carried out prior to said annealing the first primer and the first
primer anneals to the oligoC tail.
35. The method according to claim 27, wherein said performing is
carried out under substantially isothermic conditions.
36. The method according to claim 27, wherein said performing is
carried out at a temperature of between about 20° C. and
about 40° C.
37. The method according to claim 27, wherein said performing is
carried out at a temperature of between about 21° C. and
about 35° C.
38. A method of amplifying a cDNA molecule comprising: performing
the method according to claim 7 to obtain a single-stranded cDNA
molecule that includes a region of interest, a region complementary
of the donor RNA 5' of the region of interest, and a region
complementary of the acceptor RNA 3' of the region of interest;
annealing a first primer to the single-stranded cDNA molecule at a
position 3' of the region of interest; and extending the first
primer to form a complementary DNA strand including a complement of
the region of interest.
39. The method according to claim 38 further comprising:
dissociating the complementary DNA strand from the single-stranded
cDNA molecule; annealing a second primer to the complementary DNA
strand molecule at a position 3' of the complement of the region of
interest; and extending the second primer to form a second
complementary DNA strand which is substantially the same as the
single-stranded cDNA molecule at the region of interest.
40. The method according to claim 39 further comprising:
dissociating the second complementary DNA strand from the
complementary DNA strand; and repeating said annealing and
extending of the first and second primers, using the second
complementary DNA strand, to form third and fourth complementary
DNA strands, the third complementary DNA strand being substantially
the same as the first complementary strand and the fourth
complementary DNA strand being substantially the same as the second
complementary strand.
41. A method of amplifying a cDNA molecule comprising: performing
the method according to claim 16 to obtain a single-stranded cDNA
molecule that includes a region of interest, an oligoT region 5' of
the region of interest, and a region complementary of the acceptor
RNA 3' of the region of interest; annealing a first primer to the
single-stranded cDNA molecule at a position 3' of the region of
interest; and extending the first primer to form a complementary
DNA strand including a complement of the region of interest.
42. The method according to claim 41 further comprising:
dissociating the complementary DNA strand from the single-stranded
cDNA molecule; annealing a second primer to the complementary DNA
strand molecule at a position 3' of the complement of the region of
interest; and extending the second primer to form a second
complementary DNA strand which is substantially the same as the
single-stranded cDNA molecule at the region of interest.
43. The method according to claim 42 further comprising:
dissociating the second complementary DNA strand from the
complementary DNA strand; and repeating said annealing and
extending of the first and second primers, using the second
complementary DNA strand, to form third and fourth complementary
DNA strands, the third complementary DNA strand being substantially
the same as the first complementary strand and the fourth
complementary DNA strand being substantially the same as the second
complementary strand.
44-49. (canceled)
Description
[0001] This application claims the benefit of U.S. Provisional
Patent Application Ser. No. 60/229,075 to Eickbush et al., filed
Aug. 30, 2000, which is hereby incorporated by reference in its
entirety.
FIELD OF THE INVENTION
[0003] The present invention relates generally to the preparation
of nucleic acid molecules using a protein or polypeptide having
reverse transcriptase activity, particularly proteins or
polypeptides which have reverse transcriptase activity and are
encoded by a class of non-long terminal repeat ("non-LTR")
retrotransposable elements.
BACKGROUND OF THE INVENTION
[0004] Reverse transcriptases, enzymes that catalyze RNA-dependent
DNA synthesis, have been used as a component of transcription-based
amplification systems. These systems amplify RNA and DNA target
sequences up to 1 trillion fold. Exemplary systems are disclosed in
PCT Patent Application WO 89/01050 to Burg et al.; PCT Patent
Application WO 88/10315 to Gingeras et al.; European Patent
Application 0 329 822 to Davey and Malek, European Patent
Application 0 373 960 to Gingeras et al.; PCT Patent Application WO
91/02814 to Malek and Davey; and European Patent Application 0 408
295 A2 to Kacian and Fultz. Others have also been described or are
otherwise commercially available.
[0005] Some of the transcription-based amplification methods are
exceptionally convenient since the amplification reaction according
to these methods is isothermal. Thus, these systems are
particularly suited for routine clinical laboratory use in
diagnostic tests (e.g., pathogen detection and cancer
detection). Reverse transcriptases are also employed as an initial step
in some protocols when the polymerase chain reaction (PCR) is used
to amplify an RNA target. See U.S. Pat. No. 5,130,238 to Malek et
al.; and Mocharla et al., Gene 99:271-275 (1990). In such "RT-PCR"
procedures, the reverse transcriptase is used to make an initial
complementary DNA ("cDNA") copy of the RNA target, which is then
amplified by successive rounds of DNA replication.
[0006] Reverse transcriptases were once believed to be enzymes
unique to the replication of retroviruses (Baltimore,
"RNA-dependent DNA Polymerase in Virions of RNA Tumor Viruses,"
Nature 226:1209-1211 (1970); Temin and Mizutani, "RNA-Directed DNA
Polymerase in Virions of Rous Sarcoma Viruses," Nature
226:1211-1213 (1970)). Reverse transcriptases are now known to be encoded
by a wide range of genetic elements in both eukaryotes and
prokaryotes (Varmus, "Reverse Transcription," Sci. Amer. 257:56-66
(1987); Temin, "Retrons in Bacteria," Nature 339:254-255
(1989)).
[0007] Most commercially available reverse transcriptases, however,
are retroviral in origin. The retroviral reverse transcriptases
have three enzymatic activities: an RNA-directed DNA polymerase
activity, a DNA-directed DNA polymerase activity, and an RNase H
activity (Verma, "The Reverse Transcriptase," Biochim. Biophys.
Acta 473:1-38 (1977)). The latter activity specifically degrades
RNA contained in an RNA:DNA duplex. Degradation of the RNA strand
of RNA:DNA intermediates by RNase H is an important component of
some transcription-based amplification systems and is to be
distinguished from unwanted degradation due to contaminating
nucleases, which interferes with amplification. While
retroviral-derived reverse transcriptases lacking RNase H activity
have been developed (U.S. Pat. No. 6,063,608 to Kotewicz et al.),
it should be noted that retroviral reverse transcriptases typically
share several characteristics which limit their
usefulness. These include: the need for a primer that will
anneal to the RNA template, the low processivity of the enzymes
(i.e., the tendency to dissociate from the RNA before reaching the
end), and the inability of the enzymes to transcribe through regions
of RNA secondary structure.
[0008] Eukaryotic genomes in particular are filled with mobile
elements, retrotransposons, that use reverse transcriptase for
replication. The reverse transcriptases encoded by non-LTR
retrotransposons are highly divergent in sequence from the
retroviral enzymes and utilize entirely different mechanisms to
prime cDNA synthesis.
[0009] One of the most abundant classes of reverse
transcriptase-encoding elements is the non-LTR retrotransposons
(also called LINEs, retroposons and polyA-retrotransposons).
Studies of the purified reverse transcriptase from the R2 element
of the silkmoth, Bombyx mori, have provided insights into the
mechanism of non-LTR retrotransposition (Luan et al., "Reverse
Transcription of R2Bm RNA is Primed by a Nick at the Chromosomal
Target Site: A Mechanism for non-LTR Retrotransposition," Cell
72:595-605 (1993)). R2 elements are specialized for insertion into
the 28S ribosomal RNA (rRNA) genes found in the nucleoli of
eukaryotic cells. The 120 kilodalton protein encoded by R2 has both
reverse transcriptase and endonuclease activity. Based on in vitro
studies of these two activities, R2 retrotransposition is a coupled
DNA cleavage/reverse transcription reaction (Luan and Eickbush,
"RNA Template Requirements for Target DNA-Primed Reverse
Transcription by the R2 Retrotransposable Element," Mol. Cell.
Biol. 15:3882-3891 (1995); Luan and Eickbush, "Downstream 28S Gene
Sequences on the RNA Template Affect the Choice of Primer and the
Accuracy of Initiation by the R2 Reverse Transcriptase," Mol. Cell.
Biol. 16:4726-4734 (1996); Mathews et al., "Secondary Structure
Model of the RNA Recognized by the Reverse Transcriptase from the
R2 Retrotransposable Element," RNA 3:1-16 (1997); Yang and
Eickbush, "RNA-induced Changes in the Activity of the Endonuclease
Encoded by the R2 Retrotransposable Element," Mol. Cell. Biol.
18:3455-3465 (1998); and Yang et al., "Identification of the
Endonuclease Domain Encoded by R2 and Other Site-specific, non-Long
Terminal Repeat Retrotransposable Elements," Proc. Natl. Acad. Sci.
USA 96:7847-7852 (1999)). The 3' end generated by a first-strand
cleavage (nick) of the DNA target site is used as primer for
reverse transcription of the RNA template. This utilization of the
DNA target to prime cDNA synthesis has been called target-primed
reverse transcription ("TPRT"). Removal of the RNA template and
synthesis of the second DNA strand do not occur in vitro and are
likely to involve the cellular DNA repair and replication
machinery. While much has been learned about the TPRT reaction, the
activity of any non-LTR element reverse transcriptase has not been
characterized in the absence of its DNA target site.
[0010] The present invention is directed to overcoming the
above-identified limitations of RT reactions performed using
previously identified retroviral reverse transcriptases as well as
other deficiencies in the art.
SUMMARY OF THE INVENTION
[0011] One aspect of the present invention relates to a method of
preparing a cDNA molecule which includes: contacting an RNA
molecule, in the presence of dNTPs, with a non-LTR retrotransposon
protein or polypeptide having reverse transcriptase activity under
conditions effective for production of a cDNA molecule
complementary to the RNA molecule, said contacting being carried
out in the absence of a target DNA molecule of the non-LTR
retrotransposon protein or polypeptide; and isolating the cDNA
molecule.
[0012] A second aspect of the present invention relates to a method
of amplifying a cDNA molecule which includes: preparing a
single-stranded cDNA molecule according to the present invention,
wherein the single-stranded cDNA molecule includes a region of
interest; annealing a first primer to the single-stranded cDNA
molecule at a position 3' of the region of interest; and extending
the first primer to form a complementary DNA strand including a
complement of the region of interest.
[0013] A third aspect of the present invention relates to a kit
which can be used to prepare cDNA from RNA. The kit includes: a
carrier device including one or more compartments adapted to
receive one or more containers; and a first container which
includes a non-LTR retrotransposon protein or polypeptide having
reverse transcriptase activity. The kit may further include: one or
more additional containers selected from the group consisting of
(i) a second container which includes a buffer, (ii) a third
container which includes dNTPs, (iii) a fourth container which
includes donor RNA having a known sequence, and (iv) a fifth
container which includes acceptor RNA having a known sequence.
[0014] A fourth aspect of the present invention relates to a pool
of cDNA molecules prepared according to the method of preparing a
cDNA molecule according to the present invention.
[0015] As used herein, "non-LTR retrotransposon protein or
polypeptide" refers to naturally occurring proteins encoded by
non-LTR retrotransposons and polypeptide fragments thereof which
possess reverse transcriptase activity, as well as proteins or
polypeptides derived therefrom which contain one or more amino acid
substitutions that either enhance the reverse transcriptase
activity thereof or have no deleterious effect thereon. A preferred
class of non-LTR retrotransposon proteins or polypeptides are R2
proteins or polypeptides. Thus, as used herein, "R2 protein or
polypeptide" refers to naturally occurring proteins encoded by R2
elements and polypeptide fragments thereof which possess reverse
transcriptase activity, as well as proteins or polypeptides derived
therefrom which contain one or more amino acid substitutions that
either enhance the reverse transcriptase activity thereof or have
no deleterious effect thereon.
[0016] Applicants have surprisingly discovered that the
protein encoded by the R2 element of Bombyx mori, which has
reverse transcriptase activity, has several unusual properties in
the absence of its DNA target. It was previously shown that the
protein encoded by the R2 element of Bombyx mori required its DNA
target to carry out TPRT of its own RNA (Luan et al., "Reverse
Transcription of R2Bm RNA is Primed by a Nick at the Chromosomal
Target Site: A Mechanism for Non-LTR Retrotransposition," Cell
72:595-605 (1993); Luan et al., "RNA Template Requirements for
Target DNA-Primed Reverse Transcription by the R2 Retrotransposable
Element," Mol. Cell Biol. 15(7):3882-3891 (1995); Luan et al.,
"Downstream 28S Gene Sequences on the RNA Template Affect the
Choice of Primer and the Accuracy of Initiation by the R2 Reverse
Transcriptase," Mol. Cell Biol. 16(9):4726-4734 (1996), each of
which is hereby incorporated by reference in its entirety). Because
the R2 element protein can function as a reverse transcriptase in
the absence of its target DNA, this protein (as well as polypeptide
fragments thereof) can be used to prepare cDNA in a reverse
transcription procedure of the present invention, which can then be
followed by conventional amplification procedures to expand the
copy number of the transcribed cDNAs. The present invention
provides a number of benefits previously unrealized with reverse
transcription procedures performed, for example, using retroviral
or retroviral-derived proteins having reverse transcriptase
activity. These include: (i) elimination of the need for
sequence-specific primers, since R2 proteins or polypeptides have
an ability to use the 3' end of any RNA to prime cDNA synthesis;
(ii) ability to combine cDNA copies from multiple RNA templates
into a single cDNA strand, which is the result of the R2 protein or
polypeptide propensity to jump between RNA templates in the absence
of any sequence identity; and (iii) a propensity to completely or
nearly completely copy the RNA template to form a population of
cDNA molecules having a greater concentration of substantially
full-length cDNAs (as compared to the population provided by
retroviral or retroviral-derived proteins having reverse
transcriptase activity).
BRIEF DESCRIPTION OF THE DRAWINGS
[0017] FIGS. 1A-B illustrate a mechanism of DNA cleavage during a
TPRT assay for the R2 protein. FIG. 1A shows a diagram of the
nucleic acid templates and products of the reactions conducted in
FIG. 1B. In FIG. 1A, gray lines represent RNA templates; black
lines represent DNA target; and dotted lines represent cDNA
product. The DNA substrate is a uniformly ³²P-labeled 164 bp
fragment. The two RNAs used as templates are either 254 nt in
length, corresponding to the 3' untranslated region of the R2
element from B. mori, or 274 nt in length, containing an extra 20 nt
of the 28S gene sequence downstream of the R2 insertion site. The
TPRT reaction is initiated by a cleavage of the lower (noncoding)
strand of the target DNA. The free DNA 3' end released by this nick
is used to prime reverse transcription starting at the 3' end of
the 254 nt template, or 20 nt from the 3' end of the 274 nt
template, at the 3' end of the R2 sequence (see also diagram in
FIG. 2). Thus, the TPRT product with both RNAs is ~364 nt in
length, including a 110 nt fragment of the lower target DNA strand
and a ~254 nt reverse transcript of the RNA template. FIG. 1B is an
image of an autoradiograph of the reaction products separated on a
33 cm 6% denaturing polyacrylamide gel. The reactions contain 5 ng
of the R2 protein, 20 ng of target DNA and 150 ng of RNA. Lane 1,
254 nt R2 RNA; lane 2, 274 nt R2 RNA. In addition to the ~364 nt
TPRT product, both lanes contain larger cDNA products (~618 and
~638 nt) that represent jumps to the end of a second RNA
template.
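The band sizes quoted for FIG. 1B follow from simple length arithmetic, which can be sketched as below. This is an illustrative calculation only: the function and constant names are hypothetical, the lengths are the nominal values given in the figure description, and it assumes that a template jump copies the entire acceptor RNA from its 3' end.

```python
# Nominal lengths taken from the FIG. 1 description (all in nucleotides).
TARGET_DNA_PRIMER_NT = 110   # lower-strand target DNA fragment that primes cDNA synthesis
R2_RNA_NT = 254              # R2 3' UTR template
R2_RNA_PLUS_28S_NT = 274     # same template plus 20 nt of downstream 28S sequence
DOWNSTREAM_28S_NT = 20       # initiation offset on the 274 nt template


def tprt_product_nt(template_nt, start_offset_nt=0):
    """Length of the TPRT product: DNA primer plus the copied part of the template."""
    return TARGET_DNA_PRIMER_NT + (template_nt - start_offset_nt)


def jump_product_nt(first_product_nt, acceptor_nt):
    """Length after one template jump, assuming the whole acceptor RNA is copied."""
    return first_product_nt + acceptor_nt


for name, template, offset in [
    ("254 nt R2 RNA", R2_RNA_NT, 0),
    ("274 nt R2 RNA", R2_RNA_PLUS_28S_NT, DOWNSTREAM_28S_NT),
]:
    first = tprt_product_nt(template, offset)
    jumped = jump_product_nt(first, template)
    print(f"{name}: TPRT product {first} nt, after one jump {jumped} nt")
```

Under these assumptions both templates give a 364 nt primary product, and one jump to a second copy of the same template gives 618 nt and 638 nt, respectively, matching the approximate sizes stated for the two lanes.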
[0018] FIG. 2 illustrates the various junction sequences resulting
from the template jumps in FIG. 1B. The presumed mechanism for the
generation of the ~638 nt TPRT product in FIG. 1B is
diagrammed at the top of this figure. Reverse transcription is
initiated 20 nt from the 3' end of the RNA, at the beginning of the R2
3' UTR. When the RT reaches the 5' end of the first template
(donor), it jumps to the free 3' end of a second RNA template. To
obtain the cDNA sequence corresponding to these jumps, the
~638 nt TPRT product from FIG. 1B, lane 2, was purified from
the gel and the junction region amplified by PCR using primers
AB.18 (SEQ ID No: 31, Table I) and AB.19 (SEQ ID No: 32, Table I).
At the bottom of the figure are the junctions derived from six
cloned PCR products. The top sequence represents the 5' and 3' ends
of the 274 nt R2 RNA. Four of the junctions contain an extra nucleotide
(nucleotides between the inner dotted vertical lines) while one
junction contains a 6 nt internal deletion of the acceptor RNA. SEQ
ID Nos: 1 and 2 are, respectively, the 3'-terminal and 5'-terminal
sequences of the 274 base R2 RNA. SEQ ID Nos: 3-6 are the
nucleotide sequences of the junction region of cDNA RT
products.
[0019] FIG. 3 is a graph illustrating the effects of RNA
concentration on the efficiency of the template jumping reaction.
TPRT reactions were performed similarly to those in FIG. 1, with the
254 nt R2 RNA template concentration varied from 0.4 to 40 nM.
Products of TPRT reactions were separated on a 6% denaturing
polyacrylamide gel and the intensity of the ~618 nt fragment
was determined relative to the total level of TPRT (~364 and
~618 nt bands) using a PhosphorImager and ImageQuant.
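The relative measure described for FIG. 3 (intensity of the jump band divided by the total TPRT signal) can be sketched as follows. This is a minimal illustration, not the applicants' analysis procedure: the function name is hypothetical, the example intensities are invented placeholders, and it assumes that each product carries the same amount of label, so that band intensity scales with the number of molecules rather than with product length.

```python
def jump_fraction(intensity_jump_band, intensity_no_jump_band):
    """Fraction of total TPRT signal found in the template-jump band.

    intensity_jump_band: background-corrected signal of the ~618 nt band
    intensity_no_jump_band: background-corrected signal of the ~364 nt band
    """
    total = intensity_jump_band + intensity_no_jump_band
    if total <= 0:
        raise ValueError("total TPRT signal must be positive")
    return intensity_jump_band / total


# Hypothetical, made-up band intensities in arbitrary units (not data from the source).
example = jump_fraction(intensity_jump_band=25.0, intensity_no_jump_band=75.0)
print(f"fraction of TPRT products that jumped: {example:.2f}")
```

Plotting this fraction against RNA template concentration would give a curve of the kind the figure description refers to.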
[0020] FIG. 4 is an image of an autoradiograph which illustrates
that template jumps during the TPRT reaction are specific to R2 RNA
templates. The reactions were conducted as in FIG. 1 except that
the RNA templates added to the reactions were: lane 1, 50 ng of the
274 nt R2 RNA (20 nM); lane 2, 20 nM 274 nt R2 RNA and 80 nM 334 nt
vector RNA; lane 3, 20 nM 274 nt R2 RNA and 160 nM 334 nt vector
RNA. The competing vector RNA would give rise to TPRT products of
about 445 nt. Template jumps of the major TPRT product to vector
RNA would give rise to a ~700 nt band.
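The figure descriptions give RNA amounts by mass in some places (e.g., 50 ng of the 274 nt RNA) and in molar terms in others (e.g., 2.4 pmoles of the 254 nt RNA). A rough conversion between the two can be sketched as below. The molar-mass formula is a commonly used rule of thumb for single-stranded RNA and is an assumption that does not come from this application; the constants and function names are illustrative only.

```python
# Rule-of-thumb molar mass for single-stranded RNA (assumption, not from the source):
# about 320.5 g/mol per nucleotide plus an end-group correction of about 159 g/mol.
AVG_NT_MASS_G_PER_MOL = 320.5
END_CORRECTION_G_PER_MOL = 159.0


def ssrna_molar_mass(length_nt):
    """Approximate molar mass (g/mol) of a single-stranded RNA of the given length."""
    return length_nt * AVG_NT_MASS_G_PER_MOL + END_CORRECTION_G_PER_MOL


def ng_to_pmol(mass_ng, length_nt):
    """Convert a mass in ng to pmol: ng / (g/mol) gives nmol, times 1000 gives pmol."""
    return mass_ng / ssrna_molar_mass(length_nt) * 1000.0


def pmol_to_ng(amount_pmol, length_nt):
    """Convert an amount in pmol to a mass in ng (inverse of ng_to_pmol)."""
    return amount_pmol * ssrna_molar_mass(length_nt) / 1000.0


print(f"50 ng of 274 nt RNA is roughly {ng_to_pmol(50, 274):.2f} pmol")
print(f"2.4 pmol of 254 nt RNA is roughly {pmol_to_ng(2.4, 254):.0f} ng")
```

Because the per-nucleotide mass depends on base composition, results from this approximation should be read as estimates only.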
[0021] FIGS. 5A-B illustrate how reverse transcription can be
primed by RNA itself. FIG. 5A is an autoradiograph illustrating the
reverse transcription of the 334 nt vector RNA in the absence of
the DNA target site. All reactions were conducted with 300 ng of
the 334 nt vector RNA in the presence of [α-³²P] dCTP.
Lane 1, primer extension reaction in which 50 ng primer AB.23 (SEQ
ID No: 33, Table I) was pre-annealed to the RNA template. The
reverse transcription products were treated with 2 µg RNase A
for 10 min at 37° C. before electrophoresis. Lane 2,
reaction identical to that in lane 1, but no primer was annealed to
the RNA template. Lane 3, reaction conditions identical to those in
lane 2 (no DNA primer), but the reaction products were not treated
with RNase A before electrophoresis. FIG. 5B is a schematic diagram
of the reverse transcription reactions shown in FIG. 5A. The top
diagram shows a simple primer extension assay (lane 1) giving rise
to a 334 nt cDNA primary product followed by a template jump to
generate a ~668 nt product. The bottom diagram shows a
reverse transcription reaction that is primed by another RNA
molecule. The products of this RNA-primed reaction will be
~334 nt and ~668 nt if the products are treated with
RNase A before electrophoresis (lane 2), and ~668 nt and
~1000 nt if not treated with RNase A (lane 3). The
self-complementary covalent RNA/cDNA hybrid migrates as a diffuse band
at ~600 nt in lane 3, because secondary structures form that
affect migration. RNA migrates with a different mobility than
single-stranded DNA on these gels. The cDNA products in lane 2 are
about 10 nt shorter than those in lane 1, suggesting that RNA-primed
cDNA synthesis initiates about 10 nt from the 3' end of the
template.
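The nominal product sizes listed for FIG. 5B can be reproduced with a short calculation, sketched below. The function name and parameters are hypothetical; the sketch assumes that each template jump adds one full copy of the 334 nt template and that an RNA primer left attached (no RNase A treatment) adds its full 334 nt length. It ignores the roughly 10 nt initiation offset and any extra nucleotides at junctions.

```python
def nominal_product_sizes(template_nt, max_jumps, attached_rna_primer_nt=0):
    """Nominal lengths (nt) of products after 0..max_jumps template jumps.

    attached_rna_primer_nt: length of an RNA primer still covalently attached
    to the cDNA (0 if a DNA primer was used or the RNA was removed by RNase A).
    """
    return [
        attached_rna_primer_nt + template_nt * (jumps + 1)
        for jumps in range(max_jumps + 1)
    ]


VECTOR_RNA_NT = 334  # vector RNA template length from the figure description

# RNA removed by RNase A (lanes 1 and 2): cDNA only.
print("RNase A treated:", nominal_product_sizes(VECTOR_RNA_NT, max_jumps=1))

# RNA primer still attached (lane 3): one 334 nt RNA plus the cDNA.
print(
    "untreated:",
    nominal_product_sizes(VECTOR_RNA_NT, max_jumps=1, attached_rna_primer_nt=VECTOR_RNA_NT),
)
```

With these assumptions the treated products come out at 334 and 668 nt and the untreated products at 668 and 1002 nt, in line with the approximate sizes given in the description.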
[0022] FIGS. 6A-B illustrate the specificity of the RNA-primed
reverse transcription and template jumping reactions. FIG. 6A is an
autoradiograph of the reaction products obtained for the RNA-primed
reverse transcription and template jumping reactions. All reactions
were conducted in the presence of [α-³²P] dCTP and in the
absence of DNA primers, and all products were treated with RNase A
before electrophoresis. Lane 1, 2.4 pmoles 183 nt vector RNA; lane
2, 2.4 pmoles 254 nt R2 RNA; lane 3, 2.4 pmoles 334 nt vector RNA;
lane 4, 2.4 pmoles 183 nt vector RNA + 2.4 pmoles 254 nt R2 RNA; and
lane 5, 2.4 pmoles 254 nt R2 RNA + 2.4 pmoles 334 nt vector RNA. The
efficiencies of the RNA priming and of the template jumps were
quantified and are presented in Table 2 (see Example 5). FIG. 6B
illustrates the junction sequences of template jumps from the 334
nt vector RNA to the 254 nt R2 RNA. An aliquot of the total
reaction products shown in lane 5 above was PCR amplified using
primers AB.23 (SEQ ID No: 33, Table I) and AB.2b (SEQ ID No: 26,
Table I). The PCR products were cloned and random clones sequenced.
Shown at the top are the 3' end of the 254 nt RNA template (SEQ ID
No: 7) and the 5' end of the 334 nt RNA template (SEQ ID No: 8)
used in the reverse transcription reaction. Below these sequences
are seven clones derived from the cDNA (SEQ ID Nos: 9-15). Most of
the junctions contain additional nucleotides not derived from
either of the RNA templates (nucleotides between the dotted
vertical lines). Not all cDNA products extended to the end of the
334 nt template (the number of bases deleted is given), but these
jumps did not involve short segments of sequence identity with the
acceptor RNA.
[0023] FIGS. 7A-B illustrate the effects of directed template jumps
between `donor` and `acceptor` RNA templates. In FIG. 7B, an image
of an autoradiograph illustrates directed template jumps to short
`acceptor` RNA templates. Reverse transcription in each reaction
was initiated from a ³²P-end-labeled primer AB.9 (SEQ ID No:
28, Table I) (30 ng) annealed to 30 ng 177 nt `donor` RNA template.
Lane 1 contained 400 ng 334 nt vector RNA, lane 2 contained 400 ng
183 nt vector RNA, and lane 3 contained no acceptor RNA. In FIG.
7A, an image of an autoradiograph illustrates directed template
jumps to longer `acceptor` RNA templates. Reverse transcription was
again initiated from the ³²P-end-labeled primer AB.9 (SEQ ID
No: 28, Table I) (50 ng) annealed to 50 ng 177 nt `donor` RNA
template. Lane 1, 300 ng 334 nt vector RNA; lane 2, 300 ng 600 nt
RNA; and lane 3, 300 ng 1090 nt RNA.
[0024] FIGS. 8A-B illustrate the effect of the DNA target in
stabilizing interactions between the R2 protein and its RNA template.
All components were pre-incubated for 15 minutes at 37° C.
and separated on 5% native polyacrylamide gels at 4° C. In
FIG. 8A, all lanes contain 10 ng of ³²P-labeled 254 nt R2 RNA
and 10 ng of R2 protein. Lane 1, no other additions; lane 2, 100 ng
of the 164 bp target DNA; lane 3, 100 ng of a 50 nt DNA
oligonucleotide AB.17 (SEQ ID No: 30, Table I); lane 4, 100 ng of
DdeI digested pBSII(SK-) DNA. In FIG. 8B, lane 1, 20 ng
³²P-labeled target DNA, 15 ng R2 protein, and 100 ng 254 nt R2
RNA; lane 2, 20 ng ³²P-labeled target DNA and 15 ng R2
protein; lane 3, 10 ng of ³²P-labeled 254 nt R2 RNA, 10 ng of
R2 protein, and 100 ng of the 164 bp target DNA.
[0025] FIG. 9A is an autoradiograph which illustrates that template
jumps can occur onto single-stranded DNA. Each reverse
transcription reaction contained 150 ng 254 nt R2 RNA pre-annealed
with various amounts of the DNA primer AB.2b (19 nt) (SEQ ID No:
26, Table I). Lane 1, 500 ng primer AB.2b (SEQ ID No: 26, Table I);
lane 2, 50 ng primer AB.2b (SEQ ID No: 26, Table I); lane 3, 5 ng
primer AB.2b (SEQ ID No: 26, Table I); lane 4, 50 ng primer AB.2b
(SEQ ID No: 26, Table I)+250 ng ssDNA AB.17 (SEQ ID No: 30, Table
I) (50 nt); lane 5, 50 ng primer AB.2b (SEQ ID No: 26, Table I)+250
ng ssDNA AB.26 (SEQ ID No: 35, Table I) (54 nt). FIG. 9B illustrates
the junction sequences of template jumps from the 254 nt R2 RNA to
the 54 nt ssDNA. The .about.310 nt band in lane 5 (panel A) was
excised from the gel, the DNA eluted and PCR amplified using
primers AB.23 (SEQ ID No: 33, Table I) and AB.2b (SEQ ID No: 26,
Table I). The PCR products were cloned and random clones sequenced.
Shown at the top is the 3' end of the 54 nt ssDNA acceptor (SEQ ID
No: 16) and the 5' end of the 254 nt R2 RNA donor (SEQ ID No: 17).
Below these sequences are the five sequenced junctions, two of
which were identical (SEQ ID No: 18). Three of the sequences (SEQ
ID Nos: 19-21) possessed junctions that contain additional
nucleotides not derived from either the donor RNA or the acceptor
ssDNA (nucleotides between the dotted vertical lines).
[0026] FIG. 10 is an image of an autoradiograph which illustrates a
comparison of the template jumping activity of the R2 and AMV RTs.
The reactions contained 150 ng of a 283 nt R2 RNA pre-annealed with
250 ng DNA primer AB.2b (SEQ ID No: 26, Table I). Reactions were
conducted under identical conditions except that lane 1 contained
5 U of AMV RT (Promega) and lane 2 contained .about.10 ng of R2
protein.
[0027] FIGS. 11A-C are schematic models which correlate the unusual
abilities of the R2 RT with its structural differences from
retroviral RT. Gray lines represent RNA; dotted lines represent
cDNA; black lines represent DNA; and rounded rectangles represent
protein. The active site of the RT is indicated by a diffuse shaded
region. FIG. 11A is a comparison of the structure of the R2 and HIV
RTs. The HIV structure is a simplified depiction of the detailed
crystallographic studies (Kohlstaedt et al., "Crystal Structure at
3.5 Angstrom Resolution of HIV-1 Reverse Transcriptase Complexed
with an Inhibitor," Science 256:1783-1790 (1992); Sarafianos et
al., "Crystal Structure of HIV-1 Reverse Transcriptase in Complex
with a Polypurine Tract RNA:DNA," EMBO J. 20:1449-1461 (2001), each
of which is hereby incorporated by reference in its entirety). The
R2 protein lacks an RNase H domain and has additional segments in
the `fingers and palm` regions of the RT domain. Therefore, unlike
the retroviral protein, the R2 protein is depicted as containing
most of its affinity for the RNA template upstream of the active
site (shaded region). FIG. 11B is a summary of the unusual properties
of the R2 protein. Because the 3' end of free RNA can bind
downstream of the active site, it can be used to prime reverse
transcription. Template jumping is possible because the template
binding site upstream of the active site can bind a second RNA
before the protein dissociates after reverse transcribing the first
RNA template. FIG. 11C illustrates the similarity of RNA-priming
and template jumping to models of the integration reaction of R2.
RNA-priming can be viewed as similar to the signature step of the
TPRT reaction. When the R2 protein is bound to the DNA target site,
the 3' end of the cleaved DNA strand can bind downstream of the RT
active site. Meanwhile, template jumps may be similar to one
mechanism proposed for the attachment of the R2 sequence to the
upstream target site after second strand cleavage (Burke et al.,
"The Domain Structure and Retrotransposition Mechanism of R2
Elements are Conserved Throughout Arthropods," Mol. Biol. Evol.
16:502-511 (1999), which is hereby incorporated by reference in its
entirety). While there is no direct biochemical evidence for this
step, it represents the model that best explains the sequence
variation found at the 5' end of endogenous R2 elements (Burke et
al., "The Domain Structure and Retrotransposition Mechanism of R2
Elements are Conserved Throughout Arthropods," Mol. Biol. Evol.
16:502-511 (1999); and George et al., "Analysis of the 5' Junctions
of R2 Insertions with the 28S gene: Implications for non-LTR
Retrotransposition," Genetics 142:853-863 (1996), each of which is
hereby incorporated by reference in its entirety). For both TPRT and
5' attachment, the DNA strands are drawn partially denatured by the
R2 protein, as it would seem most similar to what has been shown to
occur with RNA.
[0028] FIG. 12 is a diagram which illustrates a method of preparing
cDNA according to the present invention, which may include
RNA-priming and template jumping steps. Gray lines represent RNA
templates; and dotted lines represent cDNA product. The initial
components of the reaction are RNA (either template RNA, donor RNA,
or acceptor RNA), the non-LTR retrotransposon RT, and dNTPs. The R2
RT can use the 3' end of one RNA molecule to prime reverse
transcription of a second RNA molecule (RNA priming). The two RNA
molecules can be the same or different. After reverse transcription
to the end of the second RNA, R2 RT can jump to a third RNA
molecule (again, the same or different) and continue reverse
transcription (template jumping). Both RNA-priming and template
jumping do not require sequence identity between the RNAs
involved.
[0029] FIG. 13 is an autoradiograph which illustrates a
processivity assay comparing R2 and AMV RTs on a 600 nt RNA
template. 5' end-labeled AB.23 primer (SEQ ID No: 33, Table I) was
annealed to the 600 nt vector RNA template as described above. Each
lane contained 50 fmole of annealed RNA/DNA-primer. Reactions were
started with the addition of 5 .mu.l of 1.25 mM dNTP and then
stopped after 5 min incubation at 37.degree. C. In those reactions
with R2 RT, 2 ng (20 fmole) of R2 RT was preincubated with the
RNA/primer for 5 min at 37.degree. C. in 50 mM Tris-HCl (pH 7.5),
0.2 M NaCl, 10 mM MgCl.sub.2, 2.5 mM DTT, 0.01% Triton X-100 in a
final volume of 25 .mu.l. Lane 1, no other additions; lane 2, the
preincubation mixture also contained 2.5 .mu.l of "trap" (20 .mu.g
of heparin, .about.1 .mu.g of poly(rA)/poly(dT)13-18); lane 3,
after preincubation the "trap" was added at the start of the
reaction (addition of dNTP). In those reactions with AMV RT, 2.5 U
of AMV RT (Promega) was preincubated with the RNA/primer for 5 min
at 37.degree. C. in 50 mM Tris-HCl (pH 8.3), 50 mM KCl, 5 mM
MgCl.sub.2, 5 mM DTT, 0.5 mM spermidine, in a final volume of 25
.mu.l. Lane 1, no other additions; lane 2, the preincubation
mixture also contained 2.5 .mu.l of "trap" (20 .mu.g of heparin,
.about.1 .mu.g of poly(rA)/poly(dT)13-18); lane 3, after
preincubation the "trap" was added at the start of the reaction
(addition of dNTP).
[0030] FIGS. 14A-B are scans of the RNA processivity assay using
600 nt (14A) and 1094 nt (14B) RNA templates. FIG. 14A is the
PhosphorImager scan of lane 3 from FIG. 13. The approximate size of
the cDNA products can be calculated relative to 100 nt size
standards. FIG. 14B is a processivity assay similar to that
conducted in FIG. 13 lane 3 (`trap` reactions), except that the RNA
is a 1094 nt vector RNA and the reverse transcription reaction was
primed with the end-labeled AB.34 (SEQ ID No: 38, Table I). For
each primer extension reaction, 50 fmole of annealed RNA/DNA-primer
was used under the same conditions as described for lane 3 in FIG.
13. The products of reverse transcription were separated on 6%
denaturing PAGE, scanned using the PhosphorImaging function and analyzed
using Image Quant.
[0031] FIGS. 15A-B illustrate the dissociation rates of R2 RT and
AMV RT from an RNA template. FIG. 15A is an autoradiograph of
different primer extension reactions, which were allowed to proceed
for varying lengths of time. End-labeled AB.8 (SEQ ID No: 27, Table
I) DNA primer was annealed to a 183 nt RNA template. For each
primer extension reaction, about 50 fmole of annealed
RNA/DNA-primer was used. The complex dissociation, as a function of
time, was assayed by yield of primer extension. For the R2 RT reaction,
50 fmole of the template/primer was preincubated with 2 ng (20
fmole) of R2 RT for 15 min at 37.degree. C. under the same
conditions as in FIG. 13. After preincubation, 2.5 .mu.l of the
"trap" (20 .mu.g of heparin, .about.1 .mu.g of
poly(rA)/poly(dT)13-18) was added to the preincubation mixture. The
addition of the "trap" is considered time 0. The mixture was
incubated at 37.degree. C. for the lengths of time indicated before
the addition of 2.5 .mu.l of 2.5 mM dNTP to start the reaction. All
polymerization reactions were conducted for 4 min at 37.degree. C.
The products of the reactions were separated on a 7% denaturing
polyacrylamide gel and analyzed as described above using
PhosphorImager and Image Quant. The fraction of enzyme which
remained bound to the template as a function of time was determined
based on the assumption that the cDNA yield is proportional to the
fraction of enzyme that is bound to the template at the moment of
addition of dNTP. Reactions with the AMV
RT were conducted like that with the R2 RT, except that each
reaction contained 2.5 U of AMV RT (Promega) and the preincubation
mixture was that recommended for AMV (see FIG. 13). FIG. 15B is a
graph comparing the R2 and AMV RT dissociation rates from an RNA
template. Open squares, data for R2 RT (average of three
experiments); solid circles, data for AMV RT.
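
The dissociation analysis described for FIG. 15B can be sketched in a few lines of Python. This is only an illustrative sketch: it assumes first-order (single-exponential) dissociation, which the text does not state, and the function name, time points, and yield values are hypothetical placeholders, not measurements taken from FIG. 15.

```python
import math

def fit_dissociation(times_min, cdna_yield):
    """Estimate a first-order dissociation rate constant from trap-chase yields.

    Follows the stated assumption that cDNA yield is proportional to the
    fraction of enzyme still bound when dNTPs are added. The single-exponential
    model f(t) = exp(-k_off * t) is an added assumption of this sketch.
    The first time point must be time 0 (addition of the trap).
    """
    y0 = cdna_yield[0]                        # yield at time 0 defines 100% bound
    fractions = [y / y0 for y in cdna_yield]
    # Least-squares slope of ln(fraction) versus time, forced through the origin
    num = sum(t * math.log(f) for t, f in zip(times_min, fractions))
    den = sum(t * t for t in times_min)
    k_off = -num / den                        # per minute
    half_life = math.log(2) / k_off           # minutes
    return fractions, k_off, half_life

# Placeholder values for illustration only (not data from FIG. 15)
times = [0, 5, 10, 20, 40]
yields = [1000.0, 900.0, 810.0, 656.1, 430.5]
fractions, k_off, half_life = fit_dissociation(times, yields)
print(f"k_off = {k_off:.4f} per min, half-life = {half_life:.1f} min")
```

Comparing the fitted half-lives for two enzymes would give the same kind of comparison that the graph of FIG. 15B presents visually.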
[0032] FIGS. 16A-B illustrate the elongation rate of R2 RT. FIG.
16A is an autoradiograph illustrating the reaction products of a
primed RT reaction. The template in the reaction of FIG. 16A is the
1094 nt RNA, with synthesis primed by end-labeled AB.34 (SEQ ID No:
38, Table I). For each lane, 125 fmole of template/primer was
preincubated with 10 ng (100 fmole) of R2 RT for 5 min at
37.degree. C. After preincubation, the reverse transcription was
started by addition of 5 .mu.l of 1.25 mM dNTP and stopped according
to the time in seconds indicated at the top of the Figure by quick
mixing with 3 volumes of ethanol containing 0.3 M sodium acetate (pH
5.2) and 1% SDS. After precipitation, the products were separated
on 6% denaturing PAGE and analyzed using PhosphorImager and Image
Quant. FIG. 16B illustrates a similar elongation assay using the
600 nt RNA primed with AB.23 (SEQ ID No: 33, Table I) and either
the R2 or AMV RT. Plotted in this Figure are the longest
polymerization products detected on denaturing polyacrylamide gels
like that in FIG. 16A. The maximal rate of elongation was
determined by a fitting of the data points with a linear function.
Circles, the R2 data points; triangles, the AMV data points.
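
The linear fit used to estimate the maximal elongation rate can be sketched as an ordinary least-squares regression of the longest product length against reaction time. The function name and the numbers below are hypothetical placeholders chosen for illustration; they are not values read from FIG. 16.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Placeholder numbers for illustration only (not data from FIG. 16)
time_s = [10, 20, 30, 40]
longest_product_nt = [100, 200, 300, 400]
rate_nt_per_s, offset_nt = linear_fit(time_s, longest_product_nt)
print(f"elongation rate ~ {rate_nt_per_s:.1f} nt/s")
```

The slope is the elongation rate in nucleotides per second; the intercept absorbs the primer length and any lag before synthesis begins.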
[0033] FIG. 17A is an autoradiograph which illustrates how R2 RT is
unaffected by RNA secondary structure. The RNA template for this
reaction has an annealed primer near the middle of the RNA and a
longer RNA block annealed to the 5' end of the RNA (FIG. 17B). This
template was formed by the annealing of 300 ng (3 pmol) of 334 nt
RNA, 25 ng (4 pmole) of end-labeled DNA primer AB.8 (SEQ ID No: 27,
Table I), and 300 ng (8 pmol) of 117 nt RNA. The DNA template for
the production of the 117 nt RNA was pBSK(SK-) digested with KpnI.
RNA was synthesized with T3 RNA polymerase. The procedure of
annealing was similar to that of all other annealing reactions
described above. The 334 nt RNA template without the RNA block was
prepared in a similar manner. For each primer extension reaction,
about 300 fmole of annealed template was preincubated with either 2 ng
(20 fmole) of R2 or 1.5 U of AMV RT (Promega). Preincubations with
enzymes were for 5 min at 37.degree. C. After preincubation,
reverse transcription was started by addition of 5 .mu.l of 1.25 mM
dNTP, and stopped after 5 min at 37.degree. C. In the processive
runs (left panel) 2.5 .mu.l of "trap" (20 .mu.g of heparin,
.about.1 .mu.g of poly(rA)/poly(dT)13-18) was added along with 2.5
.mu.l of 2.5 mM dNTP to start the reaction.
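
The mass and molar amounts quoted for the RNA components (for example, 300 ng of the 334 nt RNA as about 3 pmol, and 300 ng of the 117 nt RNA as about 8 pmol) can be checked with a simple conversion. The sketch below assumes rough average masses of 340 g/mol per RNA nucleotide and 330 g/mol per DNA nucleotide; these averages and the function name are assumptions of this example, not values given in the text.

```python
# Rough average mass per nucleotide in g/mol (assumed values for this sketch)
AVG_NT_MASS = {"RNA": 340.0, "DNA": 330.0}

def ng_to_pmol(mass_ng, length_nt, kind="RNA"):
    """Convert a mass of single-stranded nucleic acid (ng) to picomoles."""
    molar_mass = length_nt * AVG_NT_MASS[kind]    # g/mol
    return mass_ng * 1e-9 / molar_mass * 1e12     # ng -> g -> mol -> pmol

print(round(ng_to_pmol(300, 334), 1))  # 334 nt template RNA
print(round(ng_to_pmol(300, 117), 1))  # 117 nt RNA block
```

Under these assumed averages the two RNAs come out near 2.6 pmol and 7.5 pmol, consistent with the rounded figures of 3 pmol and 8 pmol stated above.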
[0034] FIG. 18 is an autoradiograph which illustrates the effects
of temperature on RT processivity. The template and reaction
conditions are exactly like those described with respect to FIG. 13
for a processive run (i.e., with the trap), except that
preincubation was for 20 min at 25.degree. C. followed by a short
(2 min) equilibration at the new temperature (25.degree.
C.-55.degree. C. in 5.degree. C. increments). The elongation
reactions were started by the addition of dNTPs and the trap and were
conducted for 5 minutes.
[0035] FIG. 19 is a graph which illustrates the percentage of the
cDNA products that are full-length (600 nt) as a function of
temperature. The graph is an analysis of the data in FIG. 18.
Plotted is the fraction of the total cDNA (all cDNA between 100 and
600 nt in FIG. 18) that corresponds to full length (600 nt). The
left panel is for R2 RT and the right panel is for AMV RT.
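
The quantity plotted in FIG. 19 (full-length cDNA as a fraction of all cDNA between 100 and 600 nt) can be sketched as follows. The lane profile format, the function name, the size tolerance used to define the full-length band, and the example numbers are all hypothetical assumptions for illustration; they are not taken from the scans of FIG. 18.

```python
def fraction_full_length(lane_profile, full_length_nt=600, min_nt=100, tolerance_nt=10):
    """Fraction of the cDNA signal in a lane that is full length.

    lane_profile: list of (size_nt, signal) pairs from a gel scan.
    Signal below min_nt (e.g. unextended primer) is ignored. The tolerance
    that defines the full-length band is an assumption of this sketch.
    """
    total = sum(signal for size, signal in lane_profile
                if min_nt <= size <= full_length_nt + tolerance_nt)
    full = sum(signal for size, signal in lane_profile
               if abs(size - full_length_nt) <= tolerance_nt)
    return full / total if total else 0.0

# Placeholder profile for illustration only (not data from FIG. 18)
lane = [(50, 500.0), (150, 20.0), (300, 30.0), (450, 50.0), (600, 100.0)]
print(fraction_full_length(lane))
```

Repeating the calculation for each temperature lane would yield one point per temperature, which is the form of the graph described for each enzyme.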
DETAILED DESCRIPTION OF THE INVENTION
[0036] Preferred non-LTR retrotransposon proteins or polypeptides
are the proteins or polypeptides of R2 elements.
[0037] One preferred protein possessing reverse transcriptase
activity and encoded by a non-LTR retrotransposable R2 element is
the protein encoded by the R2 element of Bombyx mori. This protein
has an amino acid sequence corresponding to SEQ ID No: 22 as
follows: TABLE-US-00001 Met Met Ala Ser Thr Ala Leu Ser Leu Met Gly
Arg 5 10 Cys Asn Pro Asp 15 Gly Cys Thr Arg Gly Lys His Val Thr Ala
Ala Pro 20 25 Met Asp Gly Pro 30 Arg Gly Pro Ser Ser Leu Ala Gly
Thr Phe Gly Trp 35 40 Gly Leu Ala Ile 45 Pro Ala Gly Glu Pro Cys
Gly Arg Val Cys Ser Pro 50 55 60 Ala Thr Val Gly Phe Phe Pro Val
Ala Lys Lys Ser Asn Lys Glu Asn 65 70 75 Arg Pro Glu Ala 80 Ser Gly
Leu Pro Leu Glu Ser Glu Arg Thr Gly Asp 85 90 Asn Pro Thr Val 95
Arg Gly Ser Ala Gly Ala Asp Pro Val Gly Gln Asp 100 105 Ala Pro Gly
Trp 110 Thr Cys Gln Phe Cys Glu Arg Thr Phe Ser Thr Asn 115 120 Arg
Gly Leu Gly 125 Val His Lys Arg Arg Ala His Pro Val Glu Thr Asn 130
135 140 Thr Asp Ala Ala Pro Met Met Val Lys Arg Arg Trp His Gly Glu
Glu 145 150 155 Ile Asp Leu Leu 160 Ala Arg Thr Glu Ala Arg Leu Leu
Ala Glu Arg Gly 165 170 Gln Cys Ser Gly 175 Gly Asp Leu Phe Gly Ala
Leu Pro Gly Phe Gly Arg 180 185 Thr Leu Glu Ala 190 Ile Lys Gly Gln
Arg Arg Arg Glu Pro Tyr Arg Ala 195 200 Leu Val Gln Ala 205 His Leu
Ala Arg Phe Gly Ser Gln Pro Gly Pro Ser 210 215 220 Ser Gly Gly Cys
Ser Ala Glu Pro Asp Phe Arg Arg Ala Ser Gly Ala 225 230 235 Glu Glu
Ala Gly 240 Glu Glu Arg Cys Ala Glu Asp Ala Ala Ala Tyr Asp 245 250
Pro Ser Ala Val 255 Gly Gln Met Ser Pro Asp Ala Ala Arg Val Leu Ser
260 265 Glu Leu Leu Glu 270 Gly Ala Gly Arg Arg Arg Ala Cys Arg Ala
Met Arg 275 280 Pro Lys Thr Ala 285 Gly Arg Arg Asn Asp Leu His Asp
Asp Arg Thr Ala 290 295 300 Ser Ala His Lys Thr Ser Arg Gln Lys Arg
Arg Ala Glu Tyr Ala Arg 305 310 315 Val Gln Glu Leu 320 Tyr Lys Lys
Cys Arg Ser Arg Ala Ala Ala Gln Val 325 330 Ile Asp Gly Ala 335 Cys
Gly Gly Val Gly His Ser Leu Glu Glu Met Glu 340 345 Thr Tyr Trp Arg
350 Pro Ile Leu Glu Arg Val Ser Asp Ala Pro Gly Pro 355 360 Thr Pro
Glu Ala 365 Leu His Ala Leu Gly Arg Ala Glu Trp His Gly Gly 370 375
380 Asn Arg Asp Tyr Thr Gln Leu Trp Lys Pro Ile Ser Val Glu Glu Ile
385 390 395 Lys Ala Ser Arg 400 Phe Asp Trp Arg Thr Ser Pro Gly Pro
Asp Gly Ile 405 410 Arg Ser Gly Gln 415 Trp Arg Ala Val Pro Val His
Leu Lys Ala Glu Met 420 425 Phe Asn Ala Trp 430 Met Ala Arg Gly Glu
Ile Pro Glu Ile Leu Arg Gln 435 440 Cys Arg Thr Val 445 Phe Val Pro
Lys Val Glu Arg Pro Gly Gly Pro Gly 450 455 460 Glu Tyr Arg Pro Ile
Ser Ile Ala Ser Ile Pro Leu Arg His Phe His 465 470 475 Ser Ile Leu
Ala 480 Arg Arg Leu Leu Ala Cys Cys Pro Pro Asp Ala Arg 485 490 Gln
Arg Gly Phe 495 Ile Cys Ala Asp Gly Thr Leu Glu Asn Ser Ala Val 500
505 Leu Asp Ala Val 510 Leu Gly Asp Ser Arg Lys Lys Leu Arg Glu Cys
His 515 520 Val Ala Val Leu 525 Asp Phe Ala Lys Ala Phe Asp Thr Val
Ser His Glu 530 535 540 Ala Leu Val Glu Leu Leu Arg Leu Arg Gly Met
Pro Glu Gln Phe Cys 545 550 555 Gly Tyr Ile Ala 560 His Leu Tyr Asp
Thr Ala Ser Thr Thr Leu Ala Val 565 570 Asn Asn Glu Met 575 Ser Ser
Pro Val Lys Val Gly Arg Gly Val Arg Gln 580 585 Gly Asp Pro Leu 590
Ser Pro Ile Leu Phe Asn Val Val Met Asp Leu Ile 595 600 Leu Ala Ser
Leu 605 Pro Glu Arg Val Gly Tyr Arg Leu Glu Met Glu Leu 610 615 620
Val Ser Ala Leu Ala Tyr Ala Asp Asp Leu Val Leu Leu Ala Gly Ser 625
630 635 Lys Val Gly Met 640 Gln Glu Ser Ile Ser Ala Val Asp Cys Val
Gly Arg 645 650 Gln Met Gly Leu 655 Arg Leu Asn Cys Arg Lys Ser Ala
Val Leu Ser Met 660 665 Ile Pro Asp Gly 670 His Arg Lys Lys His His
Tyr Leu Thr Glu Arg Thr 675 680 Phe Asn Ile Gly 685 Gly Lys Pro Leu
Arg Gln Val Ser Cys Val Glu Arg 690 695 700 Trp Arg Tyr Leu Gly Val
Asp Phe Glu Ala Ser Gly Cys Val Thr Leu 705 710 715 Glu His Ser Ile
720 Ser Ser Ala Leu Asn Asn Ile Ser Arg Ala Pro Leu 725 730 Lys Pro
Gln Gln 735 Arg Leu Glu Ile Leu Arg Ala His Leu Ile Pro Arg 740 745
Phe Gln His Gly 750 Phe Val Leu Gly Asn Ile Ser Asp Asp Arg Leu Arg
755 760 Met Leu Asp Val 765 Gln Ile Arg Lys Ala Val Gly Gln Trp Leu
Arg Leu 770 775 780 Pro Ala Asp Val Pro Lys Ala Tyr Tyr His Ala Ala
Val Gln Asp Gly 785 790 795 Gly Leu Ala Ile 800 Pro Ser Val Arg Ala
Thr Ile Pro Asp Leu Ile Val 805 810 Arg Arg Phe Gly 815 Gly Leu Asp
Ser Ser Pro Trp Ser Val Ala Arg Ala
820 825 Ala Ala Lys Ser 830 Asp Lys Ile Arg Lys Lys Leu Arg Trp Ala
Trp Lys 835 840 Gln Leu Arg Arg 845 Phe Ser Arg Val Asp Ser Thr Thr
Gln Arg Pro Ser 850 855 860 Val Arg Leu Phe Trp Arg Glu His Leu His
Ala Ser Val Asp Gly Arg 865 870 875 Glu Leu Arg Glu 880 Ser Thr Arg
Thr Pro Thr Ser Thr Lys Trp Ile Arg 885 890 Glu Arg Cys Ala 895 Gln
Ile Thr Gly Arg Asp Phe Val Gln Phe Val His 900 905 Thr His Ile Asn
910 Ala Leu Pro Ser Arg Ile Arg Gly Ser Arg Gly Arg 915 920 Arg Gly
Gly Gly 925 Glu Ser Ser Leu Thr Cys Arg Ala Gly Cys Lys Val 930 935
940 Arg Glu Thr Thr Ala His Ile Leu Gln Gln Cys His Arg Thr His Gly
945 950 955 Gly Arg Ile Leu 960 Arg His Asn Lys Ile Val Ser Phe Val
Ala Lys Ala 965 970 Met Glu Glu Asn 975 Lys Trp Thr Val Glu Leu Glu
Pro Arg Leu Arg Thr 980 985 Ser Val Gly Leu 990 Arg Lys Pro Asp Ile
Ile Ala Ser Arg Asp Gly Val 995 1000 Gly Val Ile Val 1005 Asp Val
Gln Val Val Ser Gly Gln Arg Ser Leu Asp 1010 1015 1020 Glu Leu His
Arg Glu Lys Arg Asn Lys Tyr Gly Asn His Gly Glu Leu 1025 1030 1035
Val Glu Leu Val 1040 Ala Gly Arg Leu Gly Leu Pro Lys Ala Glu Cys
Val 1045 1050 Arg Ala Thr Ser 1055 Cys Thr Ile Ser Trp Arg Gly Val
Trp Ser Leu Thr 1060 1065 Ser Tyr Lys Glu 1070 Leu Arg Ser Ile Ile
Gly Leu Arg Glu Pro Thr Leu 1075 1080 Gln Ile Val Pro 1085 Ile Leu
Ala Leu Arg Gly Ser His Met Asn Trp Thr 1090 1095 1100 Arg Phe Asn
Gln Met Thr Ser Val Met Gly Gly Gly Val Gly 1105 1110
[0038] This protein is further characterized as also possessing
endonuclease activity. It is encoded by a DNA molecule having a
nucleotide sequence corresponding to SEQ ID No: 23 as follows:
TABLE-US-00002 atgatggcga gcaccgcact gtcccttatg ggacggtgta 60
acccggatgg ctgtacacgt ggtaaacacg tgacagcagc cccgatggac ggaccgcgag
120 gaccgtcaag cctagcaggt accttcgggt ggggccttgc gatacctgcg
ggcgaaccct 180 gtggtcgggt ttgcagcccg gccacagtgg gtttttttcc
tgttgcaaaa aagtcaaata 240 aagaaaatag acctgaagcc tctggcctcc
cgctggagtc agagaggaca ggcgataacc 300 cgactgtgcg gggttccgcc
ggcgcagatc ctgtgggtca ggatgcgcct ggttggacct 360 gccagttctg
cgaacgaacc ttttcgacca acaggggttt gggtgtccac aagcgtagag 420
cccaccctgt tgagaccaat acggatgccg ctccgatgat ggtgaagcgg cggtggcatg
480 gcgaggaaat cgacctcctc gctcgcaccg aggccaggtt gctcgctgag
cggggtcagt 540 gctcgggtgg agacctcttt ggcgcgcttc cagggtttgg
aagaactctg gaagcgatta 600 agggacaacg gcggagggag ccttatcggg
cattggtgca agcgcacctt gcccgatttg 660 gttcccagcc gggtccctcg
tcgggggggt gctcggccga gcctgacttc cggcgggctt 720 ctggagctga
ggaagcgggc gaggaacgat gcgccgaaga cgccgctgcc tatgatccat 780
ccgcagtcgg tcagatgtcg cccgatgccg ctcgggttct ctccgaactc cttgagggtg
840 cggggagaag acgagcgtgc agggctatga gacccaagac tgcagggcgg
cgaaacgatt 900 tgcacgatga tcggacagct agtgcccaca aaaccagtag
acaaaagcgc agggcagagt 960 acgcgcgtgt gcaggaactg tacaagaagt
gtcgcagcag agcagcagct gaggtgatcg 1020 atggcgcgtg tgggggtgtc
ggacactcgc tcgaggagat ggagacctat tggcgaccta 1080 tcctcgagag
agtgtccgat gcacctgggc ctacaccgga agctcttcac gccctagggc 1140
gtgcggagtg gcacgggggc aatcgcgact acacccagct gtggaagccg atctcggtgg
1200 aagagatcaa ggcctcccgc tttgactggc gaacttcgcc gggcccggac
ggtatacgtt 1260 cgggtcagtg gcgtgcggtt cctgtgcact tgaaggcgga
aatgttcaat gcatggatgg 1320 cacgaggcga aatacccgaa attctacggc
agtgccgaac cgtctttgta cctaaggtgg 1380 agagaccagg tggaccgggg
gaatatcgac cgatctcgat cgcgtcgatt cccctgagac 1440 actttcactc
catcttggcc cggaggctgt tggcttgctg cccccctgat gcacgacagc 1500
gcggatttat ctgcgccgac ggtacgctgg agaattccgc agtactggac gcggtgcttg
1560 gggatagcag gaagaagctg cgggaatgtc acgtggcggt gctagacttc
gccaaggcat 1620 ttgacacagt gtctcacgag gcacttgtcg aattgctgag
gttgaggggc atgcccgaac 1680 agttctgcgg ctacattgct cacctatacg
atacggcgtc caccacctta gccgtgaaca 1740 atgaaatgag cagccctgta
aaagtgggac gaggggttcg tcaaggggac cctctgtcgc 1800 cgatactctt
caacgtggtg atggacctca tcctggcttc cctgccggag agggtcgggt 1860
ataggttgga gatggaactc gtgtccgctc tggcctatgc tgacgaccta gtcctgcttg
1920 cggggtcgaa ggtagggatg caggagtcca tctctgctgt ggactgtgtc
ggtaggcaga 1980 tgggcctacg cctgaattgc aggaaaagcg cggttctgtc
tatgataccg gatggccacc 2040 gcaagaagca tcactacctg actgagcgaa
ccttcaatat tggaggtaag ccgctcaggc 2100 aggtgagttg tgttgagcgg
tggcgatatc ttggtgtcga ttttgaggcc tctggatgcg 2160 tgacattaga
gcatagtatc agtagtgctc tgaataacat ctcaagggca cctctcaaac 2220
cccaacagag gttggagatt ttgagagctc atctgattcc gagattccag cacggttttg
2280 tgcttggaaa catctcggat gaccgattga gaatgctcga tgtccaaatc
cggaaagcag 2340 tcggacagtg gctaaggcta ccggcggatg tgcccaaggc
atattatcac gccgcagttc 2400 aggacggcgg cttagcgatc ccatcggtgc
gagcgaccat cccggacctc attgtgaggc 2460 gtttcggggg gctcgactcg
tcaccatggt cagtggcaag agccgccgcc aaatctgata 2520 agattcgtaa
gaaactgcgg tgggcctgga aacagctccg caggttcagc cgtgttgact 2580
ccacaacgca acgaccatct gtgcgcttgt tttggcgaga acatctgcat gcatctgttg
2640 atggacgcga acttcgcgaa tccacacgca ccccgacatc cacaaagtgg
attagggagc 2700 gatgcgcgca gataaccgga cgggacttcg tgcagttcgt
gcacactcat atcaacgccc 2760 tcccatcccg cattcgcgga tcgagagggc
gtagaggtgg gggtgagtct tcgttgacct 2820 gccgtgctgg ttgcaaggtt
agggagacga cggctcacat cctacaacag tgtcacagaa 2880 cacacggcgg
ccggattcta cgacacaaca agattgtatc tttcgtggcg aaagccatgg 2940
aagagaacaa gtggacggtt gagctggagc cgaggctacg aacatcggtt ggtctccgta
3000 agccggatat tatcgcctcc agggatggtg tcggagtgat cgtggacgtg
caggtggtct 3060 cgggccagcg atcgcttgac gagctccacc gtgagaaacg
taataaatac gggaatcacg 3120 gggagctggt tgagttggtc gcaggtagac
taggacttcc gaaagctgag tgcgtgcgag 3180 ccacttcgtg cacgatatct
tggaggggag tatggagcct gacttcttat aaggagttaa 3240 ggtccataat
cgggcttcgg gaaccgacac tacaaatcgt tccgatactg gcgttgagag 3300
gttcacacat gaactggacc aggttcaatc agatgacgtc cgtcatgggg ggcggcgttg
3345 gttga
The complete amino acid and cDNA nucleotide sequences are also
reported, respectively, at Genbank Accession Nos. AAB59214 and
M16558, each of which is hereby incorporated by reference in its
entirety.
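
The correspondence between the coding sequence (SEQ ID No: 23) and the protein sequence (SEQ ID No: 22) can be checked by translating the DNA with the standard genetic code. The sketch below is illustrative only: the function and variable names are hypothetical, the standard code is assumed to apply, and only the first 39 nucleotides and the last 15 nucleotides of the listing above are used.

```python
BASES = "tcag"
# Standard genetic code, codons ordered t, c, a, g at each position
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna):
    """Translate a DNA coding sequence into one-letter amino acid codes.

    Whitespace is ignored; translation stops at the first stop codon.
    """
    dna = "".join(dna.lower().split())
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

first_39_nt = "atgatggcga gcaccgcact gtcccttatg ggacggtgt"
print(translate(first_39_nt))
```

The printed residues correspond to the start of SEQ ID No: 22 (Met Met Ala Ser Thr Ala Leu Ser Leu Met Gly Arg Cys). The final 15 nucleotides of the listing (ggcggcgttggttga) translate to Gly Gly Val Gly followed by a stop codon.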
[0039] In addition to the protein encoded by the R2 element of
Bombyx mori, other proteins possessing reverse transcriptase
activity which are encoded by different non-LTR retrotransposable
R2 elements can also be employed in the methods of the present
invention. A number of other arthropods are known to harbor R2
elements which exhibit a similar structure to the R2 element of
Bombyx mori (Burke et al., "The Domain Structure and
Retrotransposition Mechanism of R2 Elements Are Conserved
Throughout Arthropods," Mol. Biol. Evol. 16(4):502-511 (1999); Yang
et al., "Identification of the Endonuclease Domain Encoded by R2
and Other Site-Specific, Non-Long Terminal Repeat Retrotransposable
Elements," Proc. Natl. Acad. Sci. USA 96:7847-7852 (1999), each of
which is hereby incorporated by reference in its entirety). The R2
elements of other arthropods include, without limitation, R2
elements from Drosophila spp. (fruit fly), Forficula auricularia
(earwig), Popillia japonica (Japanese beetle), Nasonia vitripennis
(jewel wasp), Tenebrio molitor (mealworm), Collembola spp.
(springtails), Isopoda spp. (pillbugs), and Limulus polyphemus
(horseshoe crab).
[0040] The protein and encoding DNA sequences for the R2 element of
D. melanogaster are reported, respectively, at Genbank Accession
Nos. P16423 and X51967, each of which is hereby incorporated by
reference in its entirety. The protein and encoding DNA sequences
for the R2 element of D. mercatorum are reported, respectively, at
Genbank Accession Nos. AAB94032 and AF015685, each of which is
hereby incorporated by reference in its entirety. The protein and
encoding DNA sequences for the R2 element of P. japonica are
reported, respectively, at Genbank Accession Nos. AAB66358 and
L00949, each of which is hereby incorporated by reference in its
entirety. The protein and encoding DNA sequences for the R2 element
of N. vitripennis are reported, respectively, at Genbank Accession
Nos. AAC34927 and L00950, each of which is hereby incorporated by
reference in its entirety.
[0041] Other non-LTR retrotransposon elements and their proteins
can be readily identified by isolating putative non-LTR
retrotransposon element proteins and testing them for homology with
the above-listed R2 proteins as well as testing them for
endonuclease and target-primed reverse transcriptase activity as
described, for example, in Luan et al., "Reverse Transcription of
R2Bm RNA is Primed by a Nick at the Chromosomal Target Site: A
Mechanism for Non-LTR Retrotransposition," Cell 72:595-605 (1993);
Luan et al., "RNA Template Requirements for Target DNA-Primed
Reverse Transcription by the R2 Retrotransposable Element," Mol.
Cell Biol. 15(7):3882-3891 (1995); Luan et al., "Downstream 28S
Gene Sequences on the RNA Template Affect the Choice of Primer and
the Accuracy of Initiation by the R2 Reverse Transcriptase," Mol.
Cell Biol. 16(9):4726-4734 (1996), each of which is hereby
incorporated by reference in its entirety. Once identified, DNA
molecules encoding the non-LTR retrotransposon protein can be
isolated using standard techniques known to those skilled in the
art.
[0042] Fragments of the above-identified non-LTR retrotransposon
proteins can also be utilized in accordance with the present
invention. It has previously been demonstrated that the protein
encoded by the R2 element of a number of arthropods possesses
multiple functional domains, including an N-terminal DNA binding
domain, a central reverse transcriptase domain, and a C-terminal
endonuclease domain (Burke et al., "The Domain Structure and
Retrotransposition Mechanism of R2 Elements Are Conserved
Throughout Arthropods," Mol. Biol. Evol. 16(4):502-511 (1999); Yang
et al., "Identification of the Endonuclease Domain Encoded by R2
and Other Site-Specific, Non-Long Terminal Repeat Retrotransposable
Elements," Proc. Natl. Acad. Sci. USA 96:7847-7852 (1999), each of
which is hereby incorporated by reference in its entirety).
[0043] Suitable fragments can be produced by several means.
Subclones of the gene encoding a known non-LTR retrotransposon
protein can be produced using conventional molecular genetic
manipulation for subcloning gene fragments, such as described by
Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold
Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1989), and Ausubel
et al. (ed.), Current Protocols in Molecular Biology, John Wiley
& Sons (New York, N.Y.) (1999 and preceding editions), each of
which is hereby incorporated by reference in its entirety. The
subclones then are expressed in vitro or in vivo in bacterial cells
to yield a smaller protein or polypeptide that can be tested for
reverse transcriptase activity, e.g., using known procedures or
procedures set forth in U.S. Pat. No. 6,100,039 to Burke et al. and
U.S. Pat. No. 6,132,995 to Gronowitz et al., each of which is
hereby incorporated by reference in its entirety.
[0044] In another approach, based on knowledge of the primary
structure of the non-LTR retrotransposon protein, fragments of the
gene may be synthesized using the PCR technique together with
specific sets of primers chosen to represent particular portions of
the protein, i.e., encoding a fragment having reverse transcriptase
activity (see Erlich et al., "Recent Advances in the Polymerase
Chain Reaction," Science 252:1643-51 (1991), which is hereby
incorporated by reference in its entirety). These can then be
cloned into an appropriate vector for expression of a truncated
protein or polypeptide from bacterial cells.
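
As a hedged illustration of how primers might be chosen to represent a particular portion of the protein, the sketch below maps a residue range to nucleotide coordinates in the coding sequence and extracts a forward primer and a reverse-complement primer. The function names, the residue range, and the primer length are hypothetical choices for this example; no domain boundaries are implied, and a practical primer would normally also carry restriction sites and start or stop codons as needed.

```python
def fragment_coordinates(first_residue, last_residue):
    """1-based, inclusive nucleotide coordinates encoding a residue range."""
    start_nt = 3 * (first_residue - 1) + 1
    end_nt = 3 * last_residue
    return start_nt, end_nt

def reverse_complement(seq):
    """Reverse complement of a DNA sequence (lowercase a, c, g, t only)."""
    pairs = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(pairs[base] for base in reversed(seq.lower()))

def primer_pair(orf, first_residue, last_residue, primer_len=20):
    """Forward and reverse primers flanking the region for a residue range."""
    start_nt, end_nt = fragment_coordinates(first_residue, last_residue)
    region = orf[start_nt - 1:end_nt]
    forward = region[:primer_len]
    reverse = reverse_complement(region[-primer_len:])
    return forward, reverse

# First 39 nt of SEQ ID No: 23; residues 2-12 and a 9 nt primer length are
# arbitrary values used only to keep the example short.
orf_start = "atgatggcgagcaccgcactgtcccttatgggacggtgt"
print(fragment_coordinates(2, 12))
print(primer_pair(orf_start, 2, 12, primer_len=9))
```

The same coordinate arithmetic applies to any residue range of the full 3345 nt coding sequence once the desired fragment boundaries have been chosen.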
[0045] Fusion proteins which include the reverse transcriptase can
also be used in accordance with the invention. Such fusion proteins
may comprise, for example, a carrier protein which has a leader
sequence of hydrophobic amino acids at the amino terminus of the
reverse transcriptase domain. This carrier protein is normally
excreted through the membrane of the cell within which it is made.
By cleavage of the hydrophobic leader sequence during excretion, a
means is provided for producing a polypeptide having reverse
transcriptase activity, which can be recovered either from the
periplasmic space or the medium in which the bacterium is grown.
The use of such a carrier protein allows isolation of polypeptides
having reverse transcriptase activity without contamination by
other proteins within the bacterium, and may achieve production of
a form of reverse transcriptase having greater stability by
avoiding the enzymes within the bacterial cell which degrade
foreign proteins. The DNA and amino acid sequences for such
hydrophobic leader sequences, as well as methods of preparing such
fusion proteins are taught, e.g., in U.S. Pat. No. 4,411,994 to
Gilbert et al., which is hereby incorporated by reference in its
entirety.
[0046] It is also possible to prepare fusion proteins comprising a
polypeptide having reverse transcriptase activity that is linked
via peptide bond at the amino or carboxy termini with polypeptides
which stabilize or change the solubility of the polypeptide having
reverse transcriptase activity. An amino-terminal gene fusion which
encodes reverse transcriptase, having both DNA polymerase and RNase
activity, and trpE is taught, e.g., by Tanese et al., Proc. Natl.
Acad. Sci. USA 82:4944-4948 (1985), which is hereby incorporated by
reference in its entirety. A carboxy-terminal gene fusion which
encodes reverse transcriptase and part of the plasmid pBR322 tet
gene is taught, e.g., by Kotewicz et al., Gene 35:249-258 (1985);
and Gerard, DNA 5:271-279 (1986), each of which is hereby
incorporated by reference in its entirety.
[0047] A DNA molecule encoding the non-LTR retrotransposon protein
or polypeptide having reverse transcriptase activity can be
incorporated in cells using conventional recombinant DNA
technology. Generally, this involves inserting the DNA molecule
into an expression system to which the DNA molecule is heterologous
(i.e., not normally present). The heterologous DNA molecule is
inserted into the expression system or vector in sense orientation
and correct reading frame. Depending on the vector, the DNA
molecule can be ligated to appropriate regulatory sequences either
prior to its insertion into the vector (i.e., as a chimeric gene)
or at the time of its insertion (i.e., thereby forming the chimeric
gene). The DNA molecule can be cloned into the vector using
standard cloning procedures in the art, as described by Maniatis et
al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, N.Y. (1982), which is hereby
incorporated by reference in its entirety.
[0048] U.S. Pat. No. 4,237,224 to Cohen and Boyer, which is hereby
incorporated by reference in its entirety, describes the production
of expression systems in the form of recombinant plasmids using
restriction enzyme cleavage and ligation with DNA ligase. These
recombinant plasmids are then introduced by means of transformation
and replicated in unicellular cultures including prokaryotic
organisms and eukaryotic cells grown in tissue culture.
[0049] Recombinant genes may also be introduced into viruses, such
as vaccinia virus. Recombinant viruses can be generated by
transfection of plasmids into cells infected with virus.
[0050] Suitable vectors include, but are not limited to, the
following viral vectors such as lambda vector system gt11, gt
WES.tB, Charon 4, and plasmid vectors such as pBR322, pBR325,
pACYC177, pACYC184, pUC8, pUC9, pUC18, pUC19, pLG339, pR290, pKC37,
pKC101, SV 40, pBluescript II SK +/- or KS +/- (see "Stratagene
Cloning Systems" Catalog (1993) from Stratagene, La Jolla, Calif.,
which is hereby incorporated by reference in its entirety), pQE,
pIHS21, pGEX, pET series (see Studier et al., "Use of T7 RNA
Polymerase to Direct Expression of Cloned Genes," Gene Expression
Technology, vol. 185 (1990), which is hereby incorporated by
reference in its entirety), and any derivatives thereof. Suitable
vectors are continually being developed and identified.
[0051] Recombinant molecules can be introduced into host cells via
transformation, transduction, conjugation, mobilization, or
electroporation.
[0052] A variety of host-vector systems may be utilized to express
the protein-encoding sequence(s). Primarily, the vector system must
be compatible with the host cell used. Host-vector systems include
but are not limited to the following: bacteria transformed with
bacteriophage DNA, plasmid DNA, or cosmid DNA; microorganisms such
as yeast containing yeast vectors; mammalian cell systems infected
with virus (e.g., vaccinia virus, adenovirus, etc.); insect cell
systems infected with virus (e.g., baculovirus); and plant cells
infected by bacteria or transformed via particle bombardment (i.e.,
biolistics). The expression elements of these vectors vary in their
strength and specificities. Depending upon the host-vector system
utilized, any one of a number of suitable transcription and
translation elements can be used.
[0053] Different genetic signals and processing events control many
levels of gene expression (e.g., DNA transcription and messenger
RNA ("mRNA") translation).
[0054] Transcription of DNA is dependent upon the presence of a
promoter which is a DNA sequence that directs the binding of RNA
polymerase and thereby promotes mRNA synthesis. The DNA sequences
of eukaryotic promoters differ from those of prokaryotic promoters.
Furthermore, eukaryotic promoters and accompanying genetic signals
may not be recognized in or may not function in a prokaryotic
system, and, further, prokaryotic promoters typically are not
recognized and do not function in eukaryotic cells.
[0055] Similarly, translation of mRNA in prokaryotes depends upon
the presence of the proper prokaryotic signals which differ from
those of eukaryotes. Efficient translation of mRNA in prokaryotes
requires a ribosome binding site called the Shine-Dalgarno ("SD")
sequence on the mRNA. This sequence is a short nucleotide sequence
of mRNA that is located before the start codon, usually AUG, which
encodes the amino-terminal methionine of the protein. The SD
sequences are complementary to the 3'-end of the 16S rRNA
(ribosomal RNA) and probably promote binding of mRNA to ribosomes
by duplexing with the rRNA to allow correct positioning of the
ribosome. For a review on maximizing gene expression, see Roberts
and Lauer, Methods in Enzymology, 68:473 (1979), which is hereby
incorporated by reference in its entirety.
[0056] Promoters vary in their "strength" (i.e., their ability to
promote transcription). For the purposes of expressing a cloned
gene, it is desirable to use strong promoters in order to obtain a
high level of transcription and, hence, expression of the gene.
Depending upon the host cell system utilized, any one of a number
of suitable promoters may be used. For instance, when cloning in E.
coli, its bacteriophages, or plasmids, promoters such as the T7
phage promoter, lac promoter, trp promoter, recA promoter,
ribosomal RNA promoter, the P.sub.R and P.sub.L promoters of
coliphage lambda and others, including but not limited to, lacUV5,
ompF, bla, lpp, and the like, may be used to direct high levels of
transcription of adjacent DNA segments. Additionally, a hybrid
trp-lacUV5 (tac) promoter or other E. coli promoters produced by
recombinant DNA or other synthetic DNA techniques may be used to
provide for transcription of the inserted gene.
[0057] Bacterial host cell strains and expression vectors may be
chosen which inhibit the action of the promoter unless specifically
induced. In certain operons, the addition of specific inducers is
necessary for efficient transcription of the inserted DNA. For
example, the lac operon is induced by the addition of lactose or
IPTG (isopropylthio-beta-D-galactoside). A variety of other
operons, such as trp, pro, etc., are under different controls.
[0058] Specific initiation signals are also required for efficient
gene transcription and translation in prokaryotic cells. These
transcription and translation initiation signals may vary in
"strength" as measured by the quantity of gene specific messenger
RNA and protein synthesized, respectively. The DNA expression
vector, which contains a promoter, may also contain any combination
of various "strong" transcription and/or translation initiation
signals. For instance, efficient translation in E. coli requires a
Shine-Dalgarno ("SD") sequence about 7-9 bases 5' to the initiation
codon ("ATG") to provide a ribosome binding site. Thus, any SD-ATG
combination that can be utilized by host cell ribosomes may be
employed. Such combinations include, but are not limited to, the
SD-ATG combination from the cro gene or the N gene of coliphage
lambda, or from the E. coli tryptophan E, D, C, B or A genes.
Additionally, any SD-ATG combination produced by recombinant DNA or
other techniques involving incorporation of synthetic nucleotides
may be used.
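The SD-ATG spacing requirement described above can be sketched as a simple sequence scan. This is only an illustrative sketch: the motif AGGAGG is the commonly cited Shine-Dalgarno consensus and is an assumption here (the text does not give a motif), the spacing is assumed to be counted as the bases lying between the end of the motif and the ATG, and the example sequence is hypothetical.

```python
SD_MOTIF = "AGGAGG"   # assumed consensus motif; not stated in the text
MIN_SPACER = 7        # "about 7-9 bases 5' to the initiation codon"
MAX_SPACER = 9


def find_sd_atg(dna):
    """Return (sd_start, atg_start) index pairs with an allowed SD-to-ATG spacer."""
    hits = []
    sd_start = dna.find(SD_MOTIF)
    while sd_start != -1:
        sd_end = sd_start + len(SD_MOTIF)
        for spacer in range(MIN_SPACER, MAX_SPACER + 1):
            atg_start = sd_end + spacer
            if dna[atg_start:atg_start + 3] == "ATG":
                hits.append((sd_start, atg_start))
        sd_start = dna.find(SD_MOTIF, sd_start + 1)
    return hits


# Hypothetical construct: 2 nt leader, SD motif, 8 nt spacer, coding start.
construct = "TT" + SD_MOTIF + "ACAGCTAA" + "ATGAAAGCT"
sd_atg_hits = find_sd_atg(construct)
print(sd_atg_hits)
```

A scan of this kind only flags candidate SD-ATG combinations; whether a given combination is actually utilized by host cell ribosomes would still need to be determined experimentally.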
[0059] Once the DNA molecules encoding the non-LTR retrotransposon
protein or polypeptide having reverse transcriptase activity, as
described above, have been cloned into an expression system, they
are ready to be incorporated into a host cell. Such incorporation
can be carried out by the various forms of transformation noted
above, depending upon the vector/host cell system. Suitable host
cells include, but are not limited to, bacteria, virus, yeast,
mammalian cells, insect, plant, and the like.
[0060] The transformed hosts of the invention may be cultured
under protein producing conditions according to any of the methods
which are known to those skilled in the art.
[0061] The non-LTR retrotransposon protein or polypeptide having
reverse transcriptase activity may be isolated according to
conventional methods known to those skilled in the art. For
example, the cells may be collected by centrifugation, washed with
suitable buffers, lysed and sonicated, and the reverse
transcriptase isolated by column chromatography, for example, on
DEAE-cellulose, phosphocellulose (see Kotewicz et al., Gene
35:249-258 (1985), which is hereby incorporated by reference in its
entirety) or other standard isolation and identification techniques
using, for example, polyribocytidylic acid-agarose, or
hydroxylapatite or by electrophoresis or immunoprecipitation. The
non-LTR retrotransposon protein or polypeptide is preferably
produced in purified form (preferably, at least about 80%, more
preferably at least about 90%, pure).
[0062] Having expressed and isolated the non-LTR retrotransposon
protein or polypeptide, it can subsequently be used in accordance
with the present invention.
[0063] According to one aspect of the present invention the non-LTR
retrotransposon protein or polypeptide is used to prepare cDNA from
RNA. This can be achieved by contacting an RNA molecule, in the
presence of dNTPs, with a non-LTR retrotransposon protein or
polypeptide having reverse transcriptase activity (as described
above) under conditions effective for production of a cDNA molecule
complementary to the RNA molecule, where the contacting is carried
out in the absence of a target DNA molecule of the non-LTR
retrotransposon protein or polypeptide. Thereafter, the resulting
cDNA can be isolated.
[0064] Basically, in the presence of RNA (i.e., a plurality of RNA
molecules) and dNTPs, the non-LTR retrotransposon protein or
polypeptide will use the 3' end of one RNA molecule to prime
reverse transcription of another RNA, which can be the same or
different from the RNA acting as primer. This is illustrated in
step (1) of FIG. 12. The protein or polypeptide, characterized by a
high degree of processivity, will likely continue to the end of the
RNA template as shown in step (2), at which point it may, but need
not, jump to a second RNA template as shown in step (3). Reverse
transcription is again likely to continue to the end of the second
RNA template as shown in step (4). Another template jump may or may
not occur. In most instances, one or more of the RNA molecules
which are reverse transcribed will include a region of interest
(i.e., for which one or more cDNA copies are desired). It may also
be desired to specifically include in the reaction mixture acceptor
and/or donor RNA molecules having known sequences. Their known
sequences can be used to anneal primers for subsequent
amplification procedures (infra).
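The relationship between the RNA templates and the resulting cDNA can be illustrated with a toy model. The sketch below assumes that each template is copied completely from its 3' end to its 5' end, that no extra nucleotides are added at a jump, and that the priming RNA itself is ignored; the sequences and function names are hypothetical and serve only as an illustration.

```python
# Toy model of cDNA synthesis with template jumps (all sequences hypothetical).
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "U": "A"}


def reverse_transcribe(rna):
    """Return the cDNA (5'->3') copied from one RNA template (5'->3').

    The template is read from its 3' end toward its 5' end, so the cDNA
    is the DNA reverse complement of the RNA.
    """
    return "".join(COMPLEMENT[base] for base in reversed(rna.upper()))


def cdna_with_jumps(templates):
    """Join the copies of successive templates in the order they are used.

    templates[0] is copied first; each later entry is reached by one jump.
    """
    return "".join(reverse_transcribe(t) for t in templates)


first_template = "GGAUCCAUGC"   # hypothetical RNA, 5'->3'
second_template = "AAGCUUCG"    # hypothetical RNA reached by one jump

product = cdna_with_jumps([first_template, second_template])
print(product)
```

In this model the copy of the first template lies at the 5' side of the product and the copy of the second template at the 3' side, which is why an acceptor RNA of known sequence supplies a known primer annealing site at the 3' end of the cDNA.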
[0065] The target DNA sequences for a number of different non-LTR
retrotransposons, in particular R2 elements, have been identified
previously (Burke et al., "The Domain Structure and
Retrotransposition Mechanism of R2 Elements Are Conserved
Throughout Arthropods," Mol. Biol. Evol. 16(4):502-511 (1999),
which is hereby incorporated by reference in its entirety). For the
R2 element of Bombyx mori, the target DNA molecule has a nucleotide
sequence according to SEQ ID No: 24 as follows:
taaacggcgg gagtaactat gactctctta aggtagccaa atgcctcgtc 50
The cleavage site is between positions 31 and 32 of SEQ ID No: 24.
The nick site on the opposite strand is two bases downstream from
the cleavage site (Luan et al., "Reverse Transcription of R2Bm RNA
Is Primed by a Nick at the Chromosomal Target Site: A Mechanism for
non-LTR Retrotransposition," Cell 72:595-605 (1993), which is
hereby incorporated by reference in its entirety).
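The position of the cleavage site within SEQ ID No: 24 can be made explicit by splitting the sequence between positions 31 and 32. The following is a minimal sketch; the variable names are illustrative, and only the top strand as written in the text is considered.

```python
# SEQ ID No: 24 as given in the text (50 nt, written 5'->3').
TARGET = (
    "taaacggcgg" "gagtaactat" "gactctctta" "aggtagccaa" "atgcctcgtc"
)

CLEAVAGE_AFTER = 31  # cleavage falls between positions 31 and 32 (1-based)

upstream = TARGET[:CLEAVAGE_AFTER]     # positions 1-31
downstream = TARGET[CLEAVAGE_AFTER:]   # positions 32-50

print(len(TARGET))
print(upstream)
print(downstream)
```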
[0066] There is preferably a sufficient time delay between the
steps of contacting and isolating, as described above. Suitable
time delays include, without limitation, preferably at least about
30 seconds, more preferably between about 1 minute and 2 hours,
even more preferably between about 10 minutes and 2 hours.
The synthesis of a complete cDNA may be accomplished by adding the
R2 protein or polypeptide and all four dNTPs with the RNA template.
The reverse transcription can be carried out under substantially
isothermic conditions or under variable temperature conditions.
Suitable temperatures range from about 20.degree. C. to about
40.degree. C., preferably about 21.degree. C. to about 35.degree.
C., and most preferably at about 22.degree. C. to about 32.degree.
C. The particular temperature employed will depend, at least in
part, on the desired cDNA product one wishes to obtain, as a greater
percentage of full length cDNA products can be obtained using
temperatures at about 25.degree. C. (i.e., about 22.degree. C. to
about 28.degree. C.), while an increase in the total yield of cDNA
product can be achieved at higher temperatures.
[0067] Use of the non-LTR retrotransposon protein or polypeptide,
in particular the R2 protein or polypeptide, offers a number of
distinct advantages over retroviral reverse transcriptases. With
respect to the RNA molecule, the RNA does not need a particular
primer site; hence, it does not require a polyadenylation region as
needed by retroviral RT. However, when polyadenylated RNA molecules
are reverse transcribed, the polyadenylated region affords a primer
binding site that can be used for primer-directed cDNA extension,
resulting in a known polyT region at the 5' end of a cDNA molecule.
In addition, non-LTR retrotransposon proteins or polypeptides like
the R2 protein or polypeptide are capable of carrying out reverse
transcription irrespective of the RNA structure. Retroviral RTs
frequently stop at certain sequences or in regions which contain a
secondary structure, such as stem or stem/loop formations or duplex
formations upstream of the template extension, whereas the R2
protein or polypeptide does not.
[0068] With respect to the reverse transcription process, the R2
protein or polypeptide is characterized by a significantly greater
processivity than retroviral reverse transcriptases. The R2 protein
or polypeptide is characterized by a speed of about 880 nt per
minute, which is comparable to retroviral reverse transcriptases.
More important, though, is the stability of the R2 protein or
polypeptide once it has started reverse transcription. Because of
its stability, the R2 protein or polypeptide is capable of
preparing a population of cDNAs where a significant portion of the
cDNA molecules are substantially full length reverse transcripts of
the RNA template. By substantially full length, it is intended that
the cDNAs are at least about 85 percent of the RNA template length,
more preferably at least about 90 percent of the RNA template
length, even more preferably about 95 percent of the RNA template
length. By significant portion, it is intended to denote at least
twice as much as can be prepared using the AMV reverse
transcriptase. For example, using a 600 nt template at about
25.degree. C., the R2 protein or polypeptide can prepare a
population of cDNA molecules where about 22% are full length while
the AMV RT, using the same template at about 37.degree. C. can only
prepare a population of cDNA molecules where about 1.2% are full
length.
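The figures quoted above can be combined in a short calculation. The sketch below uses only the approximate values stated in the text (about 880 nt per minute, a 600 nt template, about 22% versus about 1.2% full length products, and the 85 percent lower bound for "substantially full length"); it assumes a constant elongation speed, which is a simplification.

```python
TEMPLATE_NT = 600            # template length used in the example
SPEED_NT_PER_MIN = 880       # approximate R2 elongation speed from the text
FULL_LENGTH_FRACTION = 0.85  # lower bound for "substantially full length"

R2_FULL_LENGTH_PCT = 22.0    # about 22% full length with R2 at about 25 deg C
AMV_FULL_LENGTH_PCT = 1.2    # about 1.2% full length with AMV RT at about 37 deg C

# Time to traverse the whole template at constant speed.
seconds_per_template = TEMPLATE_NT / SPEED_NT_PER_MIN * 60
# Shortest cDNA that still counts as substantially full length.
min_full_length_nt = FULL_LENGTH_FRACTION * TEMPLATE_NT
# Ratio of full length yields, to compare with the "at least twice" criterion.
fold_improvement = R2_FULL_LENGTH_PCT / AMV_FULL_LENGTH_PCT

print(round(seconds_per_template, 1), min_full_length_nt, round(fold_improvement, 1))
```

Under these assumptions a single pass over the 600 nt template takes well under one minute, and the quoted yields differ by considerably more than the two-fold threshold used to define a "significant portion".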
[0069] According to another aspect of the present invention, the
initial reverse transcription process is followed by an amplification
procedure, whereby the isolated cDNA is amplified using any one of
a number of suitable amplification procedures.
[0070] By way of example, the PCR amplification can be performed
following isolation of the cDNA. Because the PCR utilizes primers
to initiate second strand synthesis, the cDNA molecules prepared
during the reverse transcription process should be labeled at their
5' ends with a sequence which will hybridize with suitable PCR
primers. Two approaches can be utilized to label the cDNAs.
[0071] According to one approach, an oligoC tail can be added at
the 3' end of the cDNA transcripts by incubating them with terminal
transferase and dCTPs (Chang et al., Nature 275:617-624 (1978);
Maniatis et al., Molecular Cloning, Cold Spring Harbor Laboratory
(1982), each of which is hereby incorporated by reference in its
entirety). A primer which anneals to the oligoC tail can be used in
subsequent PCR amplification.
[0072] According to a second approach, where the sequence of only a
portion of an RNA molecule is known, directed template jumping can
be employed to prepare cDNAs starting within the known sequence of
the RNA, extending through the unknown sequence (i.e., region of
interest) at its 5' end and having a known sequence located at the
5' end. This cDNA is immediately available for PCR amplification,
because the unknown sequence (i.e., region of interest) is flanked
by known sequences which can be annealed by PCR primers.
Alternatively, the 3' end of a partially known RNA molecule can be
obtained again using the template jumping ability of R2. Reverse
transcription is primed from a known donor RNA sequence, the
reverse transcriptase jumps to the 3' end of the partially known
RNA and continues synthesis past the region of known sequence. PCR
amplification is again possible, because the cDNA product contains
its unknown sequences (i.e., region of interest) flanked by known
sequences which can be annealed by primers.
[0073] Basically, the PCR process is carried out in step-wise
fashion using alternating steps of annealing primers, extending
primers to achieve complementary strand synthesis, followed by
strand dissociation. Beginning with an isolated ss cDNA (or pool of
ss cDNAs) containing a region of interest, a first primer is
annealed to the ss cDNA molecule at a position 3' of the region of
interest and then the primer is extended to form a complementary
DNA strand including a complement of the region of interest. This
complementary DNA strand can then be dissociated from the ss cDNA,
at which time it is available for annealing by a second primer at a
position 3' of the complement of the region of interest. Primer
extension is carried out to form a second complementary DNA strand
which is substantially the same as the single-stranded cDNA
molecule at the region of interest. Upon dissociating the second
complementary cDNA molecule from the complementary DNA strand, the
entire process can be repeated indefinitely to amplify the quantity
of cDNAs which contain the region of interest or a complement
thereof.
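The two primer extension steps described above can be sketched with a toy model in which the region of interest is flanked by known sequences. All sequences and helper names below are hypothetical, annealing is modeled as an exact match, and extension is assumed to run to the 5' end of the template strand.

```python
PAIRS = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(dna):
    """Return the reverse complement of a DNA strand written 5'->3'."""
    return "".join(PAIRS[base] for base in reversed(dna))


def extend(primer, template):
    """Anneal primer to template and extend to the template's 5' end.

    Returns the newly made strand (5'->3'), which begins with the primer.
    """
    site = revcomp(primer)          # sequence on the template bound by the primer
    start = template.find(site)
    if start < 0:
        raise ValueError("primer does not anneal to template")
    return revcomp(template[: start + len(site)])


# Hypothetical ss cDNA: known flank, region of interest, known flank.
flank_5 = "GATTACAGG"
region = "CCATGCATTT"
flank_3 = "CTTGACGTA"
ss_cdna = flank_5 + region + flank_3

first_primer = revcomp(flank_3)    # anneals 3' of the region of interest
complementary_strand = extend(first_primer, ss_cdna)

second_primer = flank_5            # anneals 3' of the complement of the region
second_strand = extend(second_primer, complementary_strand)

print(complementary_strand)
print(second_strand)
```

In this model the second strand reproduces the original ss cDNA across the region of interest, so each round of the two steps yields additional copies of both strands.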
[0074] The non-LTR retrotransposon protein or polypeptide is
ideally suited for incorporation into a kit which is useful for the
preparation of cDNA from RNA. Such a kit may include a carrier
device compartmentalized to receive one or more containers, such as
vials, tubes, and the like, each of which includes one of the
separate elements used to prepare cDNA from RNA. For example, there
may be provided a first container, the contents of which include
the non-LTR protein or polypeptide in solution. Further, any number
of additional containers can be provided, the contents of which
independently include suitable buffers, substrates for DNA
synthesis such as the deoxynucleotide triphosphates (e.g., dATP,
dCTP, dGTP, and dTTP) either individually or collectively in a
suitable solution, a terminal transferase in solution, donor RNA
having a known nucleotide sequence for use as an RT primer to
obtain a 3' end of RNA, and acceptor RNA having a known nucleotide
sequence to obtain a 5' end of RNA. Any combinations of the above
components can be provided.
[0075] The R2 protein or polypeptide may be present at about 0.1
.mu.g/ml to about 1 .mu.g/ml, preferably about 200 ng/ml to about
500 ng/ml. The buffer conditions for the reverse transcription can
range from about 50 to about 200 mM NaCl, about 1-10 mM MgCl.sub.2,
about 0.0 to 0.2% Triton X-100, about 10 to about 250 .mu.M
deoxynucleotide triphosphates, at a pH from about 7 to about 8.5.
The donor and acceptor RNAs for template jumps can be at a
concentration from about 0.5 to about 20 .mu.g/ml. The terminal
transferase, if employed, may be present at a concentration of
about 0.1 .mu.g/ml to about 100 .mu.g/ml, preferably about 5
.mu.g/ml to about 50 .mu.g/ml.
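The buffer ranges recited above can be captured in a small lookup and used to check a proposed recipe. This is only a sketch: the ranges are stated as "about" values in the text and are treated here as inclusive limits, the text does not say whether the deoxynucleotide triphosphate range is per nucleotide or total, and the key names and function name are illustrative.

```python
# Approximate ranges quoted in the text, stored as inclusive (low, high) pairs.
RT_BUFFER_RANGES = {
    "NaCl_mM": (50.0, 200.0),
    "MgCl2_mM": (1.0, 10.0),
    "TritonX100_percent": (0.0, 0.2),
    "dNTP_uM": (10.0, 250.0),   # per-nucleotide vs. total is not specified
    "pH": (7.0, 8.5),
}


def out_of_range(recipe):
    """Return the sorted names of components that are missing or out of range."""
    problems = []
    for name, (low, high) in RT_BUFFER_RANGES.items():
        value = recipe.get(name)
        if value is None or not (low <= value <= high):
            problems.append(name)
    return sorted(problems)


# Conditions used for the reverse transcription assays in the Examples.
assay_recipe = {
    "NaCl_mM": 200.0,
    "MgCl2_mM": 10.0,
    "TritonX100_percent": 0.01,
    "dNTP_uM": 25.0,
    "pH": 7.5,
}

print(out_of_range(assay_recipe))
print(out_of_range({**assay_recipe, "NaCl_mM": 400.0}))
```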
EXAMPLES
[0076] The following examples are provided to illustrate
embodiments of the present invention, but they are by no means
intended to limit its scope. The materials and methods described
below were utilized in the following examples.
Preparations of Target DNA
[0077] The DNA substrate for the TPRT reaction was a 164 nt segment
of the 28S rRNA gene generated by PCR amplification from clone
pB109 using primer AB.j44 (SEQ ID No: 37, Table I) complementary to
the 28S sequence 54 bp upstream of the R2 insertion site and primer
AB.25 complementary to the region 110 bp downstream of the R2 site.
The PCR was carried out in 50 .mu.l reactions containing 10 ng of
pB109, 200 ng each primer, 50 .mu.Ci of [.alpha.-.sup.32P] dCTP
(3,000 Ci/mmol, New England Nuclear), 200 .mu.M each dATP, dGTP and
dTTP, and 100 .mu.M dCTP, 2.5-5 U Taq DNA polymerase (Life
Technologies). The length of the DNA strand designed to be used as
primer was twice as long as that used in previous assays (Yang and
Eickbush, "RNA-induced Changes in the Activity of the Endonuclease
Encoded by the R2 Retrotransposable Element," Mol. Cell. Biol.
18:3455-3465 (1998); Yang et al., "Identification of the
Endonuclease Domain Encoded by R2 and Other Site-specific, non-Long
Terminal Repeat Retrotransposable Elements," Proc. Natl. Acad. Sci.
USA 96:7847-7852 (1999), each of which is hereby incorporated by
reference in its entirety) in order to increase the number of
.alpha.-.sup.32P dCTPs that could be incorporated by PCR and, thus,
increase the sensitivity of the assay. The PCR amplification
products were separated on 8% native polyacrylamide gels, the 164
bp band was cut from the gel, and eluted at room temperature in
0.3M sodium acetate pH 5.2, 0.03% SDS. The elution buffer was
extracted with phenol/chloroform and the DNA recovered by ethanol
precipitation.
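The amounts of labeled and unlabeled dCTP in the PCR described above can be compared with a short calculation. The sketch assumes that the quoted specific activity (3,000 Ci/mmol) applies to all of the labeled nucleotide added and ignores radioactive decay; the variable names are illustrative.

```python
REACTION_VOLUME_UL = 50.0
HOT_DCTP_UCI = 50.0                     # labeled dCTP added, in microcuries
SPECIFIC_ACTIVITY_CI_PER_MMOL = 3000.0
COLD_DCTP_UM = 100.0                    # unlabeled dCTP concentration

# 1 Ci/mmol equals 1 uCi/nmol, so uCi divided by Ci/mmol gives nmol.
hot_dctp_pmol = HOT_DCTP_UCI / SPECIFIC_ACTIVITY_CI_PER_MMOL * 1000.0
# 1 uM equals 1 pmol/ul.
cold_dctp_pmol = COLD_DCTP_UM * REACTION_VOLUME_UL
labeled_fraction = hot_dctp_pmol / (hot_dctp_pmol + cold_dctp_pmol)

# Product length implied by the primer positions relative to the R2 site.
amplicon_bp = 54 + 110

print(round(hot_dctp_pmol, 1), cold_dctp_pmol, round(labeled_fraction, 4), amplicon_bp)
```

Under these assumptions only a small fraction of the dCTP pool is labeled, which is consistent with the stated rationale that a longer DNA strand increases the number of labeled residues incorporated per molecule.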
Preparations of RNA Templates
[0078] All RNA templates were generated by in vitro run-off
transcription using either T7 or T3 RNA polymerase (Fermentas Inc.,
Life Technologies). Templates were either restriction digested
pBSII(SK-) plasmids or PCR amplified products containing the T7
promoter. The 254 nt R2 RNA was transcribed from a template
generated by PCR amplification of pBmR2-249A4 (Luan et al.,
"Reverse Transcription of R2Bm RNA is Primed by a Nick at the
Chromosomal Target Site: A Mechanism for non-LTR
Retrotransposition," Cell 72:595-605 (1993), which is hereby
incorporated by reference in its entirety) using primers AB.13 (SEQ
ID No: 29, Table I) and AB.2b (SEQ ID No: 26, Table I). The 274 nt
R2 RNA was transcribed from a template generated by PCR
amplification of R2Bm249V5'3' (Luan and Eickbush, "Downstream 28S
Gene Sequences on the RNA Template Affect the Choice of Primer and
the Accuracy of Initiation by the R2 Reverse Transcriptase," Mol.
Cell. Biol. 16:4726-4734 (1996), which is hereby incorporated by
reference in its entirety) using primers AB.13 (SEQ ID No: 29,
Table I) and AB.9 (SEQ ID No: 28, Table I). These R2 RNA templates
differed from those used previously (Yang and Eickbush,
"RNA-induced Changes in the Activity of the Endonuclease Encoded by
the R2 Retrotransposable Element," Mol. Cell. Biol. 18:3455-3465
(1998); Yang et al., "Identification of the Endonuclease Domain
Encoded by R2 and Other Site-specific, non-Long Terminal Repeat
Retrotransposable Elements," Proc. Natl. Acad. Sci. USA
96:7847-7852 (1999), each of which is hereby incorporated by
reference in its entirety), in that the RNA did not include 30 nt
of pBSII(SK-) sequence at the 5' end of the RNA. The presence of
these C-rich plasmid sequences reduces the efficiency of the TPRT
reaction.
[0079] The 334 nt vector RNA was transcribed from the pBSII(SK-)
plasmid (Stratagene) predigested with PvuII. The 183 nt vector RNA
was transcribed from a template generated from amplification of
pBSII(SK-) using primers AB.8 (SEQ ID No: 27, Table I) and AB.T7
(SEQ ID No: 36, Table I). The 600 nt RNA was transcribed like the
334 nt RNA except that a KpnI and BamHI fragment of the R1Dm
element (position 5020-5340) was cloned into the polylinker region
of pBSII(SK-) (provided by D. Eickbush). The 1090 nt vector RNA was
transcribed from pBSII(SK-) predigested with XmnI using T3 RNA
polymerase. Finally, the 177 nt donor RNA in FIG. 7 was transcribed
from a PCR template using primers AB.1 (SEQ ID No: 25, Table I) and
AB.9 (SEQ ID No: 28, Table I) and the plasmid pB108 (Xiong and
Eickbush, "Functional Expression of a Sequence-specific
Endonuclease Encoded by the Retrotransposon R2Bm," Cell 55:235-246
(1988), which is hereby incorporated by reference in its
entirety).
[0080] The in vitro transcription was performed in 80 .mu.l volumes
containing 2-5 .mu.g of pre-digested plasmid DNA or gel purified
PCR fragments, 16 .mu.l 5.times. transcription buffer, 1 mM each
NTP and 150 U of T7 or T3 RNA polymerase (Fermentas Inc., Life
Technologies). Reactions were incubated at 37.degree. C. until a
pyrophosphate precipitation formed (approximately 1.5 hr). After
synthesis the samples were diluted 2-fold, mixed with DNase I
buffer and the DNA templates removed with 10 U of DNase I (Ambion
Inc.) for 25 min at 37.degree. C. The products of transcription
were ethanol precipitated and separated on 5% Urea-PAGE.
Full-length RNA templates were excised from the gel, eluted at room
temperature in 0.3M sodium acetate pH 5.2, 0.03% SDS for 1.5 hr,
extracted with phenol/chloroform and ethanol precipitated. The
transcripts were dissolved in 50 mM NaCl to a final RNA
concentration of 0.1 .mu.g/.mu.l.
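Two quantities implied by the transcription protocol above can be worked out directly: the final strength of the transcription buffer and the approximate molar concentration of the purified RNA stock. The molar mass formula used below (320.5 g/mol per nucleotide plus 159 g/mol) is a commonly used approximation for single-stranded RNA and is an assumption, not a value from the text; the 254 nt R2 RNA is taken as the example.

```python
REACTION_VOLUME_UL = 80.0
BUFFER_STOCK_X = 5.0
BUFFER_VOLUME_UL = 16.0

# Final buffer strength after dilution into the reaction volume.
buffer_final_x = BUFFER_STOCK_X * BUFFER_VOLUME_UL / REACTION_VOLUME_UL

RNA_LENGTH_NT = 254
RNA_STOCK_UG_PER_UL = 0.1   # numerically equal to grams per liter

# Approximate molar mass of a single-stranded RNA (assumed formula).
rna_molar_mass = RNA_LENGTH_NT * 320.5 + 159.0
rna_stock_um = RNA_STOCK_UG_PER_UL / rna_molar_mass * 1e6

print(buffer_final_x, rna_molar_mass, round(rna_stock_um, 2))
```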
[0081] Synthesis of P.sup.32-labeled 254 nt R2 RNA for the gel
shift experiments was performed according to the Fermentas Inc.
protocol for the synthesis of high specific activity radiolabeled
RNA using T7 RNA polymerase. RNA was transcribed from 1 .mu.g of
the 254 nt R2 PCR product and DNA template removed by incubation
with 2 U of DNase I for 15 min at 37.degree. C. RNA yields after
purification from 5% Urea-PAGE were determined by scintillation
counting and the known specific activity of labeled nucleotide in
the reaction.
TABLE I
Definition of Primer Sequences
Primer  Nucleotide Sequence                                     SEQ ID No:
AB.1    CTGCAGTAATACGACTCACTATAGGACTTGGGGAATCCGACT              25
AB.2b   TTTTCATCGCCGGATCATC                                     26
AB.8    GGAAACAGCTATGACCATG                                     27
AB.9    GATGACGAGGCATTTGGCTA                                    28
AB.13   CTGCAGTAATACGACTCACTATAGGTTGAGCCTTGCACAGTAG             29
AB.17   CGACGGCCAGTGCCAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCC      30
AB.18   CGGGATCCGAAGCCAAGGGAGCGAG                               31
AB.19   GCTCTAGAGCGTACGGCCACGATC                                32
AB.23   GGGGTACCGACAGGTTTCCCGACTG                               33
AB.25   GCTCTAGAGTTCCCTTGGCTGTGGT                               34
AB.26   GCTCTAGAGCAAGCAAGCGCGGGTAAACGGCGGGAGTAACTATGACTCTCTTAA  35
AB.T7   TAATACGACTCACTATAG                                      36
AB.j44  AATTCAAGCAAGCGCGG                                       37
AB.34   CGTTCTTCGGGGCGAAAACTC                                   38
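Because several RNA templates were transcribed from PCR products containing the T7 promoter, it can be useful to see which primers of Table I carry that promoter. The sketch below treats the AB.T7 sequence from Table I as the promoter motif and looks for it as an exact substring in the other primers; this exact-match criterion is an assumption made for illustration.

```python
# Primer sequences as listed in Table I (written 5'->3').
PRIMERS = {
    "AB.1": "CTGCAGTAATACGACTCACTATAGGACTTGGGGAATCCGACT",
    "AB.2b": "TTTTCATCGCCGGATCATC",
    "AB.8": "GGAAACAGCTATGACCATG",
    "AB.9": "GATGACGAGGCATTTGGCTA",
    "AB.13": "CTGCAGTAATACGACTCACTATAGGTTGAGCCTTGCACAGTAG",
    "AB.17": "CGACGGCCAGTGCCAAGCTTGCATGCCTGCAGGTCGACTCTAGAGGATCC",
    "AB.18": "CGGGATCCGAAGCCAAGGGAGCGAG",
    "AB.19": "GCTCTAGAGCGTACGGCCACGATC",
    "AB.23": "GGGGTACCGACAGGTTTCCCGACTG",
    "AB.25": "GCTCTAGAGTTCCCTTGGCTGTGGT",
    "AB.26": "GCTCTAGAGCAAGCAAGCGCGGGTAAACGGCGGGAGTAACTATGACTCTCTTAA",
    "AB.T7": "TAATACGACTCACTATAG",
    "AB.j44": "AATTCAAGCAAGCGCGG",
    "AB.34": "CGTTCTTCGGGGCGAAAACTC",
}

# Use the AB.T7 primer itself as the promoter motif (exact match assumed).
T7_PROMOTER = PRIMERS["AB.T7"]

primers_with_t7 = sorted(
    name for name, sequence in PRIMERS.items() if T7_PROMOTER in sequence
)
print(primers_with_t7)
```

By this criterion the promoter is found in AB.1 and AB.13 in addition to AB.T7 itself, in keeping with their use to generate templates for T7 transcription.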
Reverse Transcription Assays
[0082] Unless otherwise specified, all RT reactions were performed
in 30 .mu.l volumes containing 50 mM Tris-HCl (pH 7.5), 200 mM
NaCl, 10 mM MgCl.sub.2, 2.5 mM DTT, 0.01% Triton X-100 and 25 .mu.M
each dNTP. The concentration of the R2 protein was 0.8-4 nM (3-15
ng). In all TPRT reactions, labeled target DNA (see previous
section) was present at a concentration of 6-12 nM (20-40 ng). In
most other reactions, 15 .mu.Ci of [.alpha.-.sup.32P]dCTP
(3,000 Ci/mmol, New England Nuclear) was added. In the primer
extension assays, the DNA oligonucleotides (concentrations as
specified in the Figures) were annealed to the RNA by heating to
96.degree. C. and slow cooling (3.5.degree. C./min) to 25.degree.
C. For the reactions in FIG. 7, the DNA primer was end-labeled with
polynucleotide kinase. End-labeling reactions were performed
according to the Fermentas Inc. protocol in a 20 .mu.l volume
containing 200 ng primer and 25 .mu.Ci of [.gamma.-.sup.32P]ATP
(3,000 Ci/mmol, New England Nuclear). The polynucleotide kinase was
inactivated by heating to 96.degree. C. for 10 min. All reverse
transcription reactions were incubated at 37.degree. C. for 30-50
min, and stopped by heating at 96.degree. C. for 5 min. Unless
otherwise indicated, the excess RNA was removed by digestion with
1-2 .mu.g of RNase A for 10 min at 37.degree. C. and ethanol
precipitated before electrophoresis.
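The paired molar and mass amounts given above for the R2 protein and the target DNA can be checked against each other. The sketch below assumes that the mass amounts refer to the full 30 .mu.l reaction volume and back-calculates the molar mass each pair implies; the function name is illustrative.

```python
REACTION_VOLUME_UL = 30.0


def implied_molar_mass(mass_ng, conc_nm, volume_ul=REACTION_VOLUME_UL):
    """Return the molar mass (g/mol) implied by a mass and a molar concentration."""
    moles = conc_nm * 1e-9 * volume_ul * 1e-6
    return mass_ng * 1e-9 / moles


# R2 protein: 0.8-4 nM corresponds to 3-15 ng.
r2_low = implied_molar_mass(3.0, 0.8)
r2_high = implied_molar_mass(15.0, 4.0)

# Target DNA: 6-12 nM corresponds to 20-40 ng of the 164 bp fragment.
dna_low = implied_molar_mass(20.0, 6.0)
dna_high = implied_molar_mass(40.0, 12.0)
dna_per_bp = dna_low / 164

print(round(r2_low), round(r2_high), round(dna_low), round(dna_per_bp))
```

Both ends of each range imply the same molar mass, about 125,000 g/mol for the protein and about 111,000 g/mol for the 164 bp fragment, so the stated concentrations and masses are internally consistent under the stated volume assumption.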
Mobility Shift Analysis
[0083] RNA gel mobility shift reactions were performed with 10 ng
of R2 protein and 10 ng of [.sup.32P]-labeled 254 nt R2 RNA in a 20
.mu.l reaction mixture containing 50 mM Tris-HCl (pH 7.5), 200 mM
NaCl, 10 mM MgCl.sub.2, 2.5 mM DTT, 0.01% Triton X-100. The RNA and
protein were preincubated at 37.degree. C. for 15 min, placed on
ice for 5 min, mixed with 2 .mu.l of loading buffer (0.2%
bromophenol blue, 0.02% xylene cyanol FF and 60% glycerol), and
analyzed on 5% native polyacrylamide gels containing 5% glycerol
(1/55 acrylamide/bisacrylamide). The electrophoresis was performed
at 4.degree. C. The identical procedure was applied for the DNA
mobility shift assay except that 10 ng R2 protein was preincubated
with 20 ng of labeled target DNA.
Analysis of the Junction Sequences Derived from Template Jumps
[0084] Unless otherwise indicated the band corresponding to the
template jump product was excised from a polyacrylamide gel, eluted
with 0.3 M sodium acetate pH 5.2, 0.03% SDS for several hours at
room temperature, phenol/chloroform extracted and ethanol
precipitated. The isolated cDNA was then used as a template for PCR
amplification using the primer indicated in the figure legends. The
PCR products were directly cloned into mp18T2 (Burke et al., "R4, a
non-LTR Retrotransposon Specific to the Large Subunit rRNA Gene of
Nematodes," Nucleic Acids Res. 23:4628-4634 (1995), which is hereby
incorporated by reference in its entirety) and individual clones
sequenced.
Example 1
Recombinant Expression of the Bombyx mori R2 Protein
[0085] The expression construct, pR260, was derived from construct
pR250 (Xiong & Eickbush, Cell 55:235-246 (1988), which is
hereby incorporated by reference in its entirety). A 3.5 kb SmaI
fragment of pR250 from 18 bp upstream of the first methionine codon
to the 3' end untranslated region was subcloned into pUC18 in-frame
with the lacZ gene. E. coli strain JM109/pR260 was grown at
37.degree. C. in LB broth until an OD.sub.595=0.5-0.6.
Isopropylthio-.beta.-galactoside (IPTG) was then added to a final
concentration of 0.2 mM and the cultures were further incubated for
1 hour. Cells are harvested by centrifugation, washed in cold 50 mM
Tris-HCl, pH 8.0, and collected by centrifugation. The following
procedure is described for 1.5 liters of cells but can be scaled
to larger or smaller culture volumes. All procedures are conducted
at 0-4.degree. C. The cell pellets are resuspended in 6.8 ml buffer
A (0.1 mM Tris-HCl pH7.5, 5 mM EDTA, 50% glycerol) and incubated
for 30 minutes in 5 mM dithiothreitol (DTT), 2 mM benzamidine-HCl,
and 2 mg/ml lysozyme. 32 ml of buffer B (0.1 M Tris-HCl pH7.5, 1 M
NaCl, 5 mM DTT, 0.2% triton X-100, 10 mM MgCl.sub.2, 2 mM
benzamidine) is then added, followed by an additional 30 minutes
incubation. The lysate is centrifuged in a SW50.1 rotor at 33,000
rpm for 20 hours. The upper 1 ml of the supernate from each tube
contains little R2 protein and is discarded. The remaining 4 ml of
supernate from each tube is decanted and diluted with H.sub.2O to
lower the NaCl concentration to 0.4 M. The diluted crude extract is
loaded onto a 15 ml Q Sepharose-fast-flow column (Pharmacia)
equilibrated in 0.4 M NaCl/buffer C (25 mM Tris-HCl pH 7.5, 2 mM
DTT). The column is washed with 50 ml of the 0.4 M NaCl/buffer C,
and the R2 protein eluted with 0.6 M NaCl/buffer C. Fractions
containing the R2 endonuclease activity are pooled, dialyzed
against 0.2 M NaCl/buffer D (25 mM Tris-HCl pH7.5, 2 mM DTT, 10%
Glycerol), and applied to a 1.5 ml DNA-cellulose column (Pharmacia)
equilibrated with 0.4 M NaCl/buffer D. The column is washed with 9
ml 0.4 M NaCl/buffer D and eluted with 0.8 M NaCl/buffer D. R2
protein eluted from the DNA cellulose column is concentrated
approximately 5-fold on a Centricon-50 column (Amicon) and dialyzed
against 50% glycerol, 0.4 M NaCl, 25 mM Tris-HCl (pH 7.5) and 2 mM
DTT at 4.degree. C. A typical final volume is 100-200 .mu.l of
concentrated solution containing from 5-15 ng/.mu.l R2 protein. The
protein can be stored after dialysis at -20.degree. C. for several
months with only minor decreases in activity.
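The dilution step in the procedure above, which lowers the NaCl concentration to 0.4 M, can be estimated with a simple mixing calculation. The sketch assumes that buffer B is the only source of NaCl and ignores the volume of the cell pellet and of the added DTT, benzamidine, and lysozyme, so the result is only an approximation.

```python
BUFFER_A_ML = 6.8        # resuspension buffer, no NaCl
BUFFER_B_ML = 32.0       # lysis buffer
BUFFER_B_NACL_M = 1.0
TARGET_NACL_M = 0.4
SUPERNATE_ML = 4.0       # supernate retained from each tube

# NaCl concentration after mixing buffers A and B.
lysate_nacl_m = BUFFER_B_NACL_M * BUFFER_B_ML / (BUFFER_A_ML + BUFFER_B_ML)
# Fold dilution needed to reach the target concentration.
dilution_factor = lysate_nacl_m / TARGET_NACL_M
# Water to add to each 4 ml portion of supernate.
water_ml = SUPERNATE_ML * (dilution_factor - 1.0)

print(round(lysate_nacl_m, 3), round(dilution_factor, 2), round(water_ml, 2))
```

Under these assumptions the lysate contains roughly 0.8 M NaCl, so each portion of supernate would be diluted about two-fold with water before loading.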
[0086] Protein concentrations were determined on SDS-polyacrylamide
gels using the fluorescent stain SYPRO Orange (Bio-Rad
Laboratories). The intensity of the R2 band was compared with known
concentrations of bovine serum albumin using the fluoroimaging
function of a Storm 860 PhosphorImager and Image Quant.
Example 2
RNA Template Jumping After Target-Primed Reverse Transcription
[0087] The signature step of the TPRT reaction is the use of the 3'
hydroxyl group released by first-strand cleavage of the DNA target
site as primer for cDNA synthesis. This cleavage/reverse
transcription reaction can be studied in vitro using purified
components as shown in FIG. 1. The substrate in the assay is a
uniformly P.sup.32 labeled DNA fragment containing the 28S rRNA
gene insertion site. The R2 cleavage site on this substrate is
positioned such that a 110 nt fragment is used as the primer. The
RNA templates added to the reaction contain the minimum sequences
needed to initiate the TPRT reaction: the 3' untranslated region of
the R2 element. The R2 RNA templates are either 254 nt in length,
if the RNA ends at the precise 3' junction of the R2 element, or
274 nt in length, if the RNA extends 20 nt into the downstream 28S
gene sequences.
[0088] Denaturing polyacrylamide gel electrophoresis of typical
TPRT reactions is shown in FIG. 1B. To allow maximum separation of
the TPRT products, the small, previously described (Luan et al.,
"Reverse Transcription of R2Bm RNA is Primed by a Nick at the
Chromosomal Target Site: A Mechanism for non-LTR
Retrotransposition," Cell 72:595-605 (1993), which is hereby
incorporated by reference in its entirety) DNA cleavage products
have been run off the bottom of the gel. The major TPRT products
generated from the 254 nt R2 RNA (lane 1) and the 274 nt R2 RNA
(lane 2) are both approximately 365 nt in length because reverse
transcription starts at the 3' end of the R2 sequences on the RNA
template, irrespective of whether this sequence is located at the
3' end or an internal position of the template (110 nt DNA
primer+254 nt RNA template=364 nt) (Luan and Eickbush, "Downstream
28S Gene Sequences on the RNA Template Affect the Choice of Primer
and the Accuracy of Initiation by the R2 Reverse Transcriptase,"
Mol. Cell. Biol. 16:4726-4734 (1996), which is hereby incorporated
by reference in its entirety). Also visible in both lanes are
longer reaction products. A distinct band at 620 nt and a faint
band at 870 nt are present in lane 1. The 620 nt product could be
formed if the R2 reverse transcriptase, after completing synthesis
of the first RNA template, was able to jump to the 3' end of a
second RNA template and continue synthesis (110 nt+2.times.254
nt=618 nt). The faint 870 nt band could be explained by two
consecutive jumps (110 nt+3.times.254 nt=872 nt). In the case of
the 274 nt R2 template (lane 2), the longer cDNA products are about
640 and 910 nt. Because the TPRT products generated with the 274 nt
R2 RNA are 20 and 40 nt longer than those formed with the 254 nt
RNA, the putative jumps between templates would appear to involve
the 3' end of the second template, not the internal site used to
initiate the TPRT reaction.
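The length arithmetic in the preceding paragraph can be checked with a short calculation. This is a minimal sketch under the model stated in the text: a 110 nt DNA primer, 254 nt of R2 sequence copied from the first template regardless of its 3' extension, and each jump adding the full length of the acceptor RNA starting at its 3' end. The function and constant names are illustrative, and non-templated nucleotides are ignored.

```python
PRIMER_NT = 110    # DNA primer released by R2 cleavage of the target
R2_UNIT_NT = 254   # R2 sequence copied from the first RNA template


def tprt_product_lengths(acceptor_nt, max_jumps=2):
    """Expected cDNA lengths after 0..max_jumps template jumps.

    Each jump is assumed to copy the whole acceptor RNA from its 3' end.
    """
    return [PRIMER_NT + R2_UNIT_NT + jumps * acceptor_nt
            for jumps in range(max_jumps + 1)]


lengths_254 = tprt_product_lengths(254)  # [364, 618, 872]
lengths_274 = tprt_product_lengths(274)  # [364, 638, 912]

# Extra length of the 274 nt products relative to the 254 nt products.
differences = [b - a for a, b in zip(lengths_254, lengths_274)]  # [0, 20, 40]
```

The 20 nt and 40 nt differences for one and two jumps match the observation that jumps land on the 3' end of the acceptor rather than on the internal initiation site.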
[0089] To obtain direct support for jumps between RNA templates,
the 638 nt cDNA fragment was isolated from lane 2 and the putative
jump region of the cDNA was PCR amplified. The sequences of
individual cloned products are shown in FIG. 2. All six sequenced
junctions revealed that the R2 enzyme had extended to the terminal
5' nucleotide of the first RNA template and continued
polymerization at the first 3' nucleotide of the second RNA
template. One clone contained a six nucleotide deletion near, but
not at, the 3' end of the second RNA template. Four of the six
junctions had an additional nucleotide between the two RNA
sequences. These extra nucleotides could have been added by the R2
reverse transcriptase during the jump between templates. A similar
addition of non-templated nucleotides has previously been observed
when the R2 reverse transcriptase initiates the TPRT reaction (Luan
et al., "Reverse Transcription of R2Bm RNA is Primed by a Nick at
the Chromosomal Target Site: A Mechanism for non-LTR
Retrotransposition," Cell 72:595-605 (1993); and Luan and Eickbush,
"RNA Template Requirements for Target DNA-Primed Reverse
Transcription by the R2 Retrotransposable Element," Mol. Cell.
Biol. 15:3882-3891 (1995), each of which is hereby incorporated by
reference in its entirety). The extra nucleotides could also have
been generated during the in vitro synthesis of the RNA template.
T7 RNA polymerase has been shown to add an additional residue
(usually A) in DNA run-off reactions similar to those employed to
generate RNA templates (see Milligan and Uhlenbeck, "Synthesis of
Small RNAs using T7 RNA Polymerase," Meth. Enzymol. 180:51-62
(1989), which is hereby incorporated by reference in its
entirety).
[0090] Reverse transcription of the terminal nucleotides of the
donor and acceptor RNAs eliminates the possibility that the jumps
between RNA templates are promoted by annealing of the 5' end of
the newly synthesized cDNA to the acceptor RNA template.
Furthermore, initiation of reverse transcription at the terminal 3'
nucleotide of the 274 nt acceptor RNA template, rather than 20
nucleotides internally, suggests an important role of free 3' ends
rather than the RNA secondary structure (Mathews et al., "Secondary
Structure Model of the RNA Recognized by the Reverse Transcriptase
from the R2 Retrotransposable Element," RNA 3:1-16 (1997), which is
hereby incorporated by reference in its entirety) in the template
jumping reaction. These properties of the R2 reverse transcriptase
are in sharp contrast to the strand transfers which occur in the
reverse transcription cycles of retroviruses and
LTR-retrotransposons.
[0091] The R2 reverse transcriptase, at least under the in vitro
conditions described here, cannot efficiently use the 3' end of an
RNA:DNA hybrid to initiate reverse transcription. Therefore, the
template jumps in FIG. 1B appear to involve the ability of the
actively elongating R2 reverse transcriptase to associate with the
3' end of a second RNA template before it dissociates from the
first RNA template. This reaction can thus be viewed as continuous
cDNA synthesis on non-continuous RNA templates.
[0092] If it is assumed that the R2 enzyme, upon reaching the 5'
end of the first RNA template, has only limited time to bind
another RNA template before it dissociates, then one would predict
that the frequency of the template jumps would be dependent upon
the concentration of free RNA ends in the reaction. Therefore, a
series of reactions were conducted where the concentration of the
254 nt RNA template was incrementally increased. The frequency of
the template jumps (618 nt fragment) relative to the total TPRT
products over a 100-fold range in RNA concentration is plotted in
FIG. 3. As predicted, the frequency of template jumping increased
as the concentration of RNA increased. At the highest concentration
of RNA tested, 40 nM, approximately 13% of the TPRT reactions
underwent a template jump.
Example 3
RNA Priming of the Reverse Transcription Reaction
[0093] To determine if there is specificity for the RNA used as
acceptor in the template jump, the TPRT assays were also conducted
in the presence of an excess of non-R2 competitor RNA. The 334 nt
competitor RNA was a transcript of the pBSII(SK-) plasmid. It has
previously been shown that only those RNAs that contain the 3'
untranslated region of the R2 element can be used as templates in
the TPRT reaction (Luan and Eickbush, "RNA Template Requirements
for Target DNA-primed Reverse Transcription by the R2
Retrotransposable Element," Mol. Cell. Biol. 15:3882-3891 (1995),
which is hereby incorporated by reference in its entirety). This
specificity was confirmed in FIG. 4, as the only initial TPRT
product observed was the approximately 365 nt fragment generated
from the R2 RNA template. TPRT products generated from the longer
vector RNA would be approximately 445 nt in length. In a similar
manner, the only products resulting from template jumps between
RNAs were approximately 640 nt and 910 nt in length (compare lanes
1 and 3), indicating that only the R2 RNA was being used as
acceptor of the jump even in the presence of an 8-fold molar excess
of the competitor RNA.
Example 4
Reverse Transcription in the Absence of DNA Target Site
[0094] It was determined whether the R2 enzyme could undergo
template jumps during reverse transcription reactions that are
primed by a non-specific method of annealing a short DNA
oligonucleotide to an RNA template (FIG. 5B). In such primer
extension assays, both R2 and non-R2 RNA can be used by the R2
reverse transcriptase as templates (Luan et al., "Reverse
Transcription of R2Bm RNA is Primed by a Nick at the Chromosomal
Target Site: A Mechanism for non-LTR Retrotransposition," Cell
72:595-605 (1993), which is hereby incorporated by reference in its
entirety). An extension assay using a DNA oligonucleotide annealed
to the 20 nt at the 3' end of the 334 nt vector RNA is shown in
FIG. 5A, lane 1. As expected, the major cDNA product was 334 nt in
length corresponding to simple extension by the reverse
transcriptase to the end of the vector RNA. Also produced was a
cDNA fragment approximately 670 nt, the length expected for reverse
transcription of two consecutive RNA molecules. Thus, in the
absence of the DNA target and R2 RNA, the R2 reverse transcriptase
can also undergo template jumps between vector RNA sequences.
[0095] During analysis of these primer extension assays, it was
noticed that the synthesis of cDNA was not completely dependent
upon the presence of a DNA primer annealed to the RNA template.
Approximately 20% of the cDNA synthesis could be generated without
a primer, suggesting an alternative means of priming reverse
transcription. It has previously been shown that under conditions
of a TPRT reaction, the R2 protein can use the 3' end of a second
RNA molecule to prime reverse transcription (Luan and Eickbush,
"Downstream 28S Gene Sequences on the RNA Template Affect the
Choice of Primer and the Accuracy of Initiation by the R2 Reverse
Transcriptase," Mol. Cell. Biol. 16:4726-4734 (1996), which is
hereby incorporated by reference in its entirety). Sequence
analysis of the products of these reactions indicated that the
`primer` RNA had not annealed to the R2 `template` RNA (Luan and
Eickbush, "Downstream 28S Gene Sequences on the RNA Template Affect
the Choice of Primer and the Accuracy of Initiation by the R2
Reverse Transcriptase," Mol. Cell. Biol. 16:4726-4734 (1996), which
is hereby incorporated by reference in its entirety). Here, it is
shown that elimination of both the DNA target and R2 RNA from the
reaction enables the protein to conduct this RNA-priming with
non-specific RNA templates (diagrammed in FIG. 5B).
[0096] Results of an RNA-primed reverse transcription reaction
using the 334 nt vector RNA template are shown in FIG. 5A, lanes 2
and 3. In lane 2, the reaction products have been treated with
RNase A before separation on the polyacrylamide gel, while the
reaction products in lane 3 have not been treated with RNase A. In
lane 3, the major product was a diffuse band slightly larger than
600 nt. After treatment with RNase A the products were reduced to a
major band at approximately 324 nt with faint bands extending up to
334 nt. This reduced length of the cDNA products compared to lane 1
indicated that the preferred site of initiation of reverse
transcription in the RNA-primed reaction was about 10 nt from the
3' end of the RNA template. The presence of a 660 nt cDNA product
in lane 2 indicated that the R2 enzyme can also undergo template
jumps in these RNA-primed reactions.
Example 5
RNA Preferences in the Absence of the DNA Target Site
[0097] In FIG. 6, the efficiency of RNA-primed reverse
transcription and template jumping are compared between three RNAs:
a 183 nt non-R2 RNA derived from pBSII(SK-), the 254 nt R2 RNA, and
the 334 nt vector RNA. Reverse transcription was primed by the RNAs
themselves (no DNA primers), and all products were digested with
RNase A to remove these RNA primers before electrophoresis. In the
case of the short vector RNA (lane 1), the initial cDNA products
were approximately 180 nt in length, which is consistent with
reverse transcription starting near the 3' end of the RNA template.
Template jumping was highly efficient with this RNA as cDNA
fragments of 360, 540, 720 and 900 nt were generated representing
one, two, three, and four consecutive template jumps. With the R2
RNA template (lane 2), the major cDNA fragments were about 250 nt,
consistent with reverse transcription starting near the 3' end of
the RNA template, while a faint band approximately 15 nt shorter
than the major band suggested cDNA synthesis also initiated at a
more internal site. One, two, and three template jumps were seen
with the R2 RNA template. In the case of the 334 nt vector RNA,
priming of cDNA synthesis occurred at several sites near the 3' end
of the RNA, and both single and double template jumps were detected
(660 and 1000 nt fragments).
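The band sizes quoted above follow from stacking unit-length copies of each RNA. The sketch below is illustrative only: the function name is hypothetical, and it assumes initiation exactly at the 3' end of every template and no non-templated nucleotides, so the observed bands (which start near, not at, the 3' end) run slightly shorter than these nominal values.

```python
def unit_length_products(template_nt, max_jumps):
    """Nominal cDNA lengths after 0..max_jumps jumps between identical RNAs."""
    return [template_nt * (jumps + 1) for jumps in range(max_jumps + 1)]


short_vector = unit_length_products(183, 4)  # [183, 366, 549, 732, 915]
r2_rna = unit_length_products(254, 3)        # [254, 508, 762, 1016]
long_vector = unit_length_products(334, 2)   # [334, 668, 1002]
```

These nominal values sit close to the approximate sizes reported for lanes 1-3 (for example about 180, 360, 540, 720 and 900 nt for the short vector RNA).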
[0098] The relative efficiencies of the initial RNA-primed reverse
transcription reaction and of the template jumps for each of the
three RNAs are compared in Table II (below). The efficiency of the
initial RNA-primed reverse transcription step was highest with the
R2 RNA template and lowest for the 334 nt RNA. The frequency of
template jumps once reverse transcription initiated was 13-15% with
the 254 and 183 nt RNAs, but only 4% with the 334 nt RNA.
TABLE-US-00005 TABLE II RNA Specificity of the RNA-primed Reverse
Transcription Assay

RNA template | RNA-primed RT | Frequency of Template Jumps
254 nt (R2 RNA) | 1.00 | 12.9%
183 nt (vector RNA) | 0.70 | 15.1%
334 nt (vector RNA) | 0.44 | 3.9%
254 nt + 183 nt . . . 254 nt | 1.59 | 10.9%
254 nt + 183 nt . . . 183 nt | 0.17 | 2.1%
254 nt + 334 nt . . . 254 nt | 0.86 | 10.1%
254 nt + 334 nt . . . 334 nt | 0.14 | 0.7%

All values are derived from the data in FIG. 6. Values for the
RNA-primed reverse transcription represent the combination of all
RNA-unit-length bands visible in the lane and are given as a
fraction relative to that supported by R2 RNA alone. The frequencies
of template jumps are given as percentages of the total reverse
transcripts that have undergone a template jump and are the combined
values for all jumps (single, double, etc., corrected for the length
of the cDNA fragment). Values for the template jumps in the
competition reactions represent only those between similar
templates (i.e., 254 nt to 254 nt jumps), and do not include the
hybrid bands (i.e., 430 nt in lane 4 and 590 nt in lane 5) as it is
uncertain which RNAs were the donors and acceptors in these
jumps.
[0099] Lanes 4 and 5 of FIG. 6 are the cDNA products of competition
experiments between equal molar ratios of the R2 RNA and the
individual vector RNAs. In the case of the 183 nt and 254 nt
competition, the significant reduction in intensity of the 180 nt
band and the increased intensity of the 250 nt band indicated that
the R2 RNA was the preferred template in the RNA-priming reaction.
In the case of the 254 nt and 334 nt competition, reverse
transcription of R2 RNA was again preferred over the longer vector
RNA (lane 5). Thus, RNA-primed cDNA synthesis occurred more readily
with the R2 RNA template than with either the shorter or longer
vector RNAs. However, stimulation of cDNA synthesis from the R2 RNA
by the addition of the 183 nt RNA (Table II) suggested the short
vector RNA functioned more efficiently than the R2 RNA in priming
reverse transcription of the R2 template.
[0100] The competition experiments in FIG. 6 also suggested that
template jumping occurred preferentially to the R2 template. In
both lanes 4 and 5, the template jumps between R2 RNAs (fragments
at 500 and 750 nt) were only slightly reduced compared to those in
lane 3. Meanwhile, the levels of vector RNA to vector RNA jumps in
lane 4 (360 nt) and lane 5 (664 nt) were reduced 5- to 7-fold
(see Table II). Most jumps from the vector RNAs appear to have gone
to the R2 RNA as hybrid products (430 nt in lane 4, and 590 nt in
lane 5) were readily apparent in both lanes. These results indicate
that even in the absence of the R2 RNA template, the R2 protein can
undergo RNA-primed reverse transcription and template jump with
non-R2 RNA templates. However, the R2 protein prefers to initiate
reverse transcription on the R2 RNA as well as use R2 RNA as the
acceptor of template jumps.
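The fold reductions and hybrid band sizes discussed in this paragraph can be reproduced from the Table II values and the RNA lengths. This is a hedged sketch: the variable names are illustrative, the jump frequencies are the percentages listed in Table II for vector-to-vector jumps with and without competing R2 RNA, and the hybrid lengths assume one full-length copy of each RNA with no non-templated nucleotides.

```python
R2_NT = 254

# Vector-to-vector jump frequencies (%) from Table II, keyed by vector RNA length.
jump_freq_alone = {183: 15.1, 334: 3.9}
jump_freq_with_r2 = {183: 2.1, 334: 0.7}

# Fold reduction in vector-to-vector jumps when R2 RNA competes.
fold_reduction = {
    nt: round(jump_freq_alone[nt] / jump_freq_with_r2[nt], 1)
    for nt in jump_freq_alone
}  # {183: 7.2, 334: 5.6}

# Nominal length of a hybrid cDNA spanning one vector RNA and one R2 RNA.
hybrid_nt = {nt: nt + R2_NT for nt in jump_freq_alone}  # {183: 437, 334: 588}
```

The ratios of roughly 7 and 5.6 correspond to the reduction described in the text, and the nominal hybrid lengths of 437 and 588 nt are close to the approximately 430 and 590 nt bands seen in lanes 4 and 5.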
[0101] To affirm that these template jumps in the RNA-primed
reactions occurred without annealing of the cDNA to the acceptor
RNA template, the cDNA region corresponding to a 334 nt donor to
254 nt acceptor template jump was PCR amplified from the total
reaction products of lane 5. The total reaction products were used
in the amplification rather than the purified 590 nt hybrid in
order to sample the many faint products visible in FIG. 6A that are
not of unit RNA length (i.e., cannot be attributed to any
combination of 254 and 334 nt RNAs). As shown in FIG. 6B, in five
of the seven sequenced junctions, the template jump occurred to the
terminal 3' nucleotide of the acceptor R2 RNA. In the two remaining
cases, the jumps were to sites 5 and 7 nucleotides from the 3' end,
but in neither case did the jump involve sequences that would
enable the cDNA made from the 334 nt RNA to anneal to the R2 RNA.
Several of the junctions also represented reverse transcription
reactions that did not proceed to the 5' end of the 334 nt vector
RNA. These premature jumps can explain many of the faint product
bands seen on gels; however, it is not clear what fraction of these
products were a result of RNA degradation, and what fraction
represented template jumps before the reverse transcriptase reached
the end of the first template.
Example 6
Directing Template Jumps Between RNAs
[0102] Increasing the ratio of DNA oligonucleotides used to anneal
to the 3' end of an RNA template will block this RNA from being an
acceptor of a template jump. As shown in FIG. 7A (lane 3), using a
6-fold molar excess of primer to RNA template (three times higher
than used in FIG. 5, lane 1) resulted in the synthesis of
full-length cDNA products (.about.177 nt), but no template jumps.
Template jumps were readily observed if a second RNA template that
did not anneal to the DNA primer was added to the reaction. In
lane 2, the addition of the 183 nt vector RNA resulted in nearly 40%
of the cDNA undergoing a template jump (many underwent multiple
consecutive jumps). In lane 1, the addition of the 334 nt vector
RNA resulted in about 6% of the cDNA undergoing a template
jump.
[0103] FIG. 7B shows the relative efficiency of directed template
jumps between the donor RNA and even longer RNAs (334 nt, 600 nt
and 1090 nt). The nature of these three RNAs is described above.
While template jumps onto each of these RNAs were observed, the
relative efficiencies of these jumps varied as did the efficiency
of the initial primer extension reaction itself. The reduction in
efficiency of the primer extension reaction with different RNAs is
possibly a result of the preference of the R2 protein to bind
certain RNAs and, thus, to reduce its ability to bind the donor
RNA. Meanwhile, the general decrease in the efficiency of the
template jumps with longer RNAs is likely to be a mass effect in
which the R2 protein is more likely to encounter the 3' end of a
shorter RNA compared to the 3' end of a longer RNA.
Example 7
The DNA Target Site Stabilizes Interactions Between the R2 Protein
and Its Template
[0104] While R2 RNA templates are preferred, vector RNAs (or any
non-R2 RNAs) can compete as templates in RNA-primed reverse
transcription reactions as well as acceptors in template jumps
between RNAs. This contrasts with the activity of the R2 protein in
the presence of the target DNA, in which only R2 RNAs can be used
as templates in the target DNA-primed or RNA-primed reverse
transcription reactions (Luan and Eickbush, "Downstream 28S Gene
Sequences on the RNA Template Affect the Choice of Primer and the
Accuracy of Initiation by the R2 Reverse Transcriptase," Mol. Cell.
Biol. 16:4726-4734 (1996); and Mathews et al., "Secondary Structure
Model of the RNA Recognized by the Reverse Transcriptase from the
R2 Retrotransposable Element," RNA 3:1-16 (1997), each of which is
hereby incorporated by reference in its entirety), as well as
acceptors of template jumps (FIG. 3). These results suggest that
when the R2 protein is bound to DNA it has more specific structural
requirements for the RNA used as template, which in turn might mean
a higher affinity of the protein for the R2 RNA. Direct evidence
for an increased affinity of the R2 protein for binding the R2 RNA
in gel mobility shift assays is demonstrated below. As shown in
FIG. 8A, R2 protein and labeled R2 RNA incubated in the absence of
DNA do not generate a gel shift under the conditions of these
reactions (lane 1); however, a shifted complex is readily observed
if the DNA target is added to the incubation (lane 2). Similar
complexes are not observed in the presence of single-stranded DNA
(lane 3) or in the presence of double-stranded DNAs not
corresponding to the target site (lane 4).
[0105] To confirm that the shifted complex in FIG. 8A is indeed a
complex of RNA, protein, and DNA, the mobility shift assays were
also conducted in the presence of labeled DNA target (FIG. 8B). In
the presence of the R2 protein and the DNA target, a shifted
complex is observed (lane 2). If R2 RNA is added to the protein and
target DNA, then a further reduction in the mobility of the complex
(a supershift) is observed (lane 1). The mobility of this
RNA:protein:DNA complex is the same whether the DNA is labeled
(FIG. 8B, lane 1) or the RNA is labeled (FIG. 8A, lane 2 and FIG.
8B lane 3). These results demonstrate that the DNA target site
increases the specific interactions between the R2 protein and R2
RNA. In the absence of DNA, only less stable interactions are
possible between the R2 protein and its RNA template, which can
explain why vector RNA can substitute for R2 RNA in the reverse
transcription reactions conducted in the absence of target DNA.
Example 8
Template Jumps Onto Single-Stranded DNA
[0106] The ability of the R2 protein to template jump onto another
RNA would suggest that the protein may also be able to template
jump onto single-stranded DNA. Surprisingly, such jumps have only
been observed when R2 RNA is being used as the initial template. As
shown in FIG. 9, increasing concentrations of a 19 nt DNA primer
complementary to the 3' end of the 254 nt R2 RNA template (lanes
1-3) inhibited template jumping to the R2 RNA itself, similar to
that shown above for vector RNA templates (FIG. 7). However, with
the R2 template a new series of cDNA products were generated
approximately 20, 40 and 60 nt longer than the R2 RNA template. As
shown below, these new products were template jumps onto the excess
DNA oligonucleotide primers in the reaction. Such template jumps to
DNA oligonucleotides were not observed with the donor RNA template
in FIG. 7.
[0107] To determine if the R2 protein has sequence or length
preference for these template jumps to ssDNA, two longer ssDNAs
were tested. One ssDNA corresponded to a 50 nt segment from the
pBSII(SK-) plasmid (FIG. 9A, lane 4), while a second 54 nt ssDNA
corresponded to the sequence of the 28S gene immediately upstream
of the R2 insertion site (lane 5). This latter ssDNA was tested
because it has previously been postulated that the R2 element may
complete the R2 integration reaction by jumping onto these upstream
28S gene sequences and continuing synthesis (Burke et al., "The
Domain Structure and Retrotransposition Mechanism of R2 Elements
are Conserved Throughout Arthropods," Mol. Biol. Evol. 16:502-511
(1999), which is hereby incorporated by reference in its entirety).
Template jumps to both of these ssDNAs were readily observed. Based
on the lengths of the reaction products, the jumps occurred to
locations near the 3' end of these oligonucleotides.
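The product lengths expected from jumps onto ssDNA can be estimated in the same way as for RNA acceptors. The following is a minimal sketch with hypothetical names, assuming a full-length 254 nt cDNA from the R2 RNA template, jumps to the exact 3' end of each ssDNA, complete copying of the ssDNA, and no non-templated nucleotides.

```python
R2_TEMPLATE_NT = 254


def ssdna_jump_lengths(ssdna_nt, jumps):
    """Nominal cDNA lengths after 1..jumps consecutive jumps onto one ssDNA."""
    return [R2_TEMPLATE_NT + k * ssdna_nt for k in range(1, jumps + 1)]


primer_jumps = ssdna_jump_lengths(19, 3)   # [273, 292, 311]
vector_ssdna = ssdna_jump_lengths(50, 1)   # [304]
gene_ssdna = ssdna_jump_lengths(54, 1)     # [308]
```

Under these assumptions, jumps onto the 19 nt primer add 19, 38 and 57 nt, consistent with products approximately 20, 40 and 60 nt longer than the R2 RNA template, while single jumps onto the 50 nt and 54 nt ssDNAs give products slightly over 300 nt.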
[0108] To obtain direct evidence for the use of ssDNA as an
acceptor of template jumps, the 300-310 nt reverse transcription
products from the reaction in lane 5 were excised and PCR amplified
using one primer within the 54 bp extension and a second primer
within the R2 sequence. The sequences of individual clones are shown
in FIG. 9B. In most cases, the RT extended to the 5' end of the R2
RNA template and then jumped to the terminal 3' nucleotide of the
ssDNA. In two cases, the jump occurred to positions 2 and 9 nt from
the 3' end of the primer. While these could represent jumps to
internal positions of the ssDNA, it has been found that the R2
endonuclease has single-stranded 3' exonuclease activity. Thus,
many of these jumps to apparent internal locations near the end of
the oligonucleotide may be jumps to the 3' end of a partially
degraded ssDNA template. As was seen with the template jumps from
RNA to RNA, these RNA to ssDNA template jumps also contained
additional, non-templated nucleotides.
Example 9
Retroviral Reverse Transcriptases Are Unable To Jump Templates
[0109] All of the reactions that have been conducted with the R2
protein herein have also been conducted using commercially
available retroviral RTs. Consistent with many previous studies
(see Coffin et al., Retroviruses, Cold Spring Harbor Press, Cold
Spring Harbor, N.Y. (1997), which is hereby incorporated by
reference in its entirety), these retroviral enzymes were unable to
conduct RNA-primed reverse transcription and template jumps to
either RNA or ssDNA templates. FIG. 10 is an example of such a
comparison in which an R2 RNA template has been primed with a short
oligonucleotide. Full-length cDNA products were seen with the R2
reverse transcriptase, as well as jumps to the DNA oligonucleotide
and RNA template (lane 2). In contrast, under identical conditions
the longest products generated by the AMV RT were only full-length
reverse transcripts of the RNA template (lane 1). The greater
abundance of shorter cDNA products seen with the retroviral enzyme
was a reflection of the reduced ability of the AMV RT to extend to
the 5' end of an RNA template compared to the R2 enzyme.
[0110] Based on the evidence provided in Examples 2-9, it is
evident that the R2 protein or polypeptide possesses two unusual
properties.
[0111] First, the R2 RT can jump between RNA templates. The cDNA
strands from these jumps frequently contain the terminal nucleotides
of the donor and acceptor RNA molecules, indicating that these jumps
do not involve annealing of the newly formed cDNA strands to the
acceptor RNA template. Thus, the R2 protein can conduct continuous
cDNA synthesis on non-continuous RNA templates. In contrast, strand
transfer by retroviral RTs requires sequence identity between the
donor and acceptor RNA templates (Peliska and Benkovic, "Mechanism
of DNA Strand Transfer Reactions Catalyzed by HIV-1 Reverse
Transcriptase," Science 258:1112-1118 (1992); and DeStefano et al.,
"Requirements for Strand Transfer Between Internal Regions of
Heteropolymer Templates by Human Immunodeficiency Virus Reverse
Transcriptase," J. Virol. 66:6370-6378 (1992), each of which is
hereby incorporated by reference in its entirety). Retroviral
transfers are accomplished through catalytic removal of the donor
RNA from the cDNA strand by an associated RNase H domain which
allows the cDNA to anneal to the acceptor RNA molecule. The R2 RT
has no RNase H domain (Malik et al., "The Age and Evolution of
non-LTR Retrotransposable Elements," Mol. Biol. Evol. 16:793-805
(1999), which is hereby incorporated by reference in its entirety),
and no such activity has been detected in vitro (Luan et al.,
"Reverse Transcription of R2Bm RNA is Primed by a Nick at the
Chromosomal Target Site: A Mechanism for non-LTR
Retrotransposition," Cell 72:595-605 (1993), which is hereby
incorporated by reference in its entirety). The only similarity
between the jumps by the R2 and retroviral RTs is that both enzymes
can add non-templated nucleotides to the cDNA at the end of the
donor RNA template (Peliska and Benkovic, "Mechanism of DNA Strand
Transfer Reactions Catalyzed by HIV-1 Reverse Transcriptase,"
Science 258:1112-1118 (1992), which is hereby incorporated by
reference in its entirety). In the case of the retroviral
mechanism, these extra nucleotides lead to hypermutability in the
template switch region of the genome. As will be described below,
template jumping by the R2 protein may explain the high sequence
variation at the 5' junction of R2 elements.
[0112] A second unusual property of the R2 RT is that it can use
the 3' end of a second RNA molecule to initiate reverse
transcription. Such RNA-primed reactions have been previously
characterized as an alternative to the TPRT reaction when the R2
protein was bound to the DNA target site (Luan and Eickbush,
"Downstream 28S Gene Sequences on the RNA Template Affect the
Choice of Primer and the Accuracy of Initiation by the R2 Reverse
Transcriptase," Mol. Cell. Biol. 16:4726-4734 (1996), which is
hereby incorporated by reference in its entirety). Similar to the
TPRT reaction itself, RNA-priming did not require the annealing of
the primer RNA to the template RNA (Luan and Eickbush, "Downstream
28S Gene Sequences on the RNA Template Affect the Choice of Primer
and the Accuracy of Initiation by the R2 Reverse Transcriptase,"
Mol. Cell. Biol. 16:4726-4734 (1996), which is hereby incorporated
by reference in its entirety). It was shown above that when the R2
protein is not bound to the DNA target, any RNA can be reverse
transcribed from its 3' end by this RNA-priming reaction.
RNA-priming without the annealing of the primer RNA to the template
RNA has not been observed for retroviral RT. However, the RT
encoded by the Mauriceville mitochondrial retroplasmid of
Neurospora crassa has been shown capable of using the 3' ends of
single-stranded DNA to prime reverse transcription in the absence
of significant sequence identity (Wang et al., "The Mauriceville
Plasmid of Neurospora crassa: Characterization of a Novel Reverse
Transcriptase that Begins cDNA Synthesis at the 3' End of Template
RNA," Mol. Cell. Biol. 12:5131-5144 (1992); and Kennell et al.,
"The Mauriceville Plasmid of Neurospora spp. Uses Novel Mechanisms
for Initiating Reverse Transcription in vitro," Mol. Cell. Biol.
14:3094-3107 (1994), each of which is hereby incorporated by
reference in its entirety).
[0113] The different properties of the R2 and retroviral RTs are
perhaps not unexpected because these enzymes differ substantially
in size and are highly divergent in sequence (Xiong and Eickbush,
"Origin and Evolution of Retroelements Based on their Reverse
Transcriptase Sequences," EMBO J. 9:3351-3362 (1990), which is
hereby incorporated by reference in its entirety). Indeed, it is
easier to align the amino acid sequence of the RT domains of R2 and
other non-LTR retrotransposons with the comparable domains of
mitochondrial group II introns and retroplasmids, bacterial msDNA
and even telomerase than it is to the retroviral and LTR
retrotransposon domains (Eickbush, "Origin and Evolutionary
Relationships of Retroelements," In The Evolutionary Biology of
Viruses (Morse, S. S. ed.), pp. 121-157, Raven Press, New York
(1994); and Nakamura et al., "Telomerase Catalytic Subunit Homologs
from Fission Yeast and Human," Science 277:955-959 (1997), each of
which is hereby incorporated by reference in its entirety). As a
result the phylogenetic relationship of these various retroelements
can be established with some confidence, while the relationship of
these elements to the retroviruses and LTR retrotransposons remains
controversial (Nakamura and Cech, "Reversing Time: Origin of
Telomerase," Cell 92:587-600 (1998); and Malik and Eickbush,
"Phylogenetic Analysis of Ribonuclease H Domains Suggests a Late,
Chimeric Origin of LTR Retrotransposable Elements and
Retroviruses," Genome Res. 11:1187-1197 (2001), each of which is
hereby incorporated by reference in its entirety).
[0114] The R2 RT also shares with the other retroelement RTs the
ability to specifically bind the RNA that will be used as template
for cDNA synthesis. Priming of reverse transcription by these
different enzymes does not require extensive annealing of the
template to an oligonucleotide primer. The primer is the 3' end of
a cleaved chromosomal site in the cases of the non-LTR
retrotransposons and group II introns (Luan et al., "Reverse
Transcription of R2Bm RNA is Primed by a Nick at the Chromosomal
Target Site: A Mechanism for non-LTR Retrotransposition," Cell
72:595-605 (1993); and Cousineau et al., "Retrohoming of a
Bacterial Group II Intron: Mobility Via Complete Reverse Splicing,
Independent of Homologous DNA Recombination," Cell 94:451-462
(1998), each of which is hereby incorporated by reference in its
entirety), the 3' end of the chromosome itself in the case of
telomerase (Nakamura et al., "Telomerase Catalytic Subunit Homologs
from Fission Yeast and Human," Science 277:955-959 (1997), which is
hereby incorporated by reference in its entirety), the 2' hydroxyl
of an internal G residue of the RNA template in the case of msDNA
(Inouye and Inouye, "Bacterial Reverse Transcriptase," In Reverse
Transcriptase (Goff, S. & Skalka, A., eds.), pp. 391-410, Cold
Spring Harbor Press, Cold Spring Harbor, N.Y. (1993), which is
hereby incorporated by reference in its entirety), and either de
novo or the 3' end of another cDNA in the cases of the Mauriceville
plasmid (Kennell et al., "The Mauriceville Plasmid of Neurospora
spp. Uses Novel Mechanisms for Initiating Reverse Transcription in
vitro," Mol. Cell. Biol. 14:3094-3107 (1994); and Wang and
Lambowitz, "The Mauriceville Plasmid Reverse Transcriptase Can
Initiate cDNA Synthesis de novo and May Be Related to Reverse
Transcriptase and DNA Polymerase Progenitor," Cell 75:1071-1081
(1993), each of which is hereby incorporated by reference in its
entirety). Only retroviruses and LTR retrotransposons use an
annealed RNA to prime reverse transcription of their RNA template
(reviewed in Levin, "It's Prime Time for Reverse Transcriptase,"
Cell 88:5-8 (1997), which is hereby incorporated by reference in
its entirety).
[0115] Another common feature of these various retroelement RTs is
that the region of the protein homologous to the `fingers` and
`palm` domains is considerably larger than that of the
retroviruses (Xiong and Eickbush, "Origin and Evolution of
Retroelements Based on their Reverse Transcriptase Sequences," EMBO
J. 9:3351-3362 (1990), which is hereby incorporated by reference
in its entirety). All non-viral reverse transcriptases contain an
extra segment that is not found in retroviral RTs (Eickbush,
"Origin and Evolutionary Relationships of Retroelements," In The
Evolutionary Biology of Viruses (Morse, S. S. ed.), pp. 121-157,
Raven Press, New York (1994), which is hereby incorporated by
reference in its entirety), as well as additional segments in some
groups. For example, non-LTR retrotransposons contain an additional
segment between segment A and B, while group II intron RTs contain
an extra region between segments B and C (Eickbush, "Origin and
Evolutionary Relationships of Retroelements," In The Evolutionary
Biology of Viruses (Morse, S. S. ed.), pp. 121-157, Raven Press,
New York (1994); and Nakamura et al., "Telomerase Catalytic Subunit
Homologs from Fission Yeast and Human," Science 277:955-959 (1997),
each of which is hereby incorporated by reference in its entirety).
Because the `fingers` domain of retroviral RTs associates with the
RNA template upstream of the active site (Kohlstaedt et al.,
"Crystal Structure at 3.5 Angstrom Resolution of HIV-1 Reverse
Transcriptase Complexed with an Inhibitor," Science 256:1783-1790
(1992); and Sarafianos et al., "Crystal Structure of HIV-1 Reverse
Transcriptase in Complex with a Polypurine Tract RNA:DNA," EMBO J.
20:1449-1461 (2001), each of which is hereby incorporated by
reference in its entirety), these extra segments within the
non-retroviral RTs are presumably involved in specific RNA-template
interactions. Indeed, Chen and Lambowitz ("De novo and DNA
Primer-mediated Initiation of cDNA Synthesis by the Mauriceville
Retroplasmid Reverse Transcriptase Involve Recognition of a 3' CCA
Sequence," J. Mol. Biol. 271:311-332 (1997), which is hereby
incorporated by reference in its entirety) have suggested that this
`finger` domain is involved in specific recognition of the CCA
sequence involved in de novo initiation of reverse transcription by
the Mauriceville RT.
[0116] In FIG. 11 above, a simple model for the R2 RT is provided
which can help explain its different properties compared to
retroviral enzymes. This model has many similarities to that
proposed for the Mauriceville enzyme (Chen and Lambowitz, "De novo
and DNA Primer-mediated Initiation of cDNA Synthesis by the
Mauriceville Retroplasmid Reverse Transcriptase Involve Recognition
of a 3' CCA Sequence," J. Mol. Biol. 271:311-332 (1997),
which is hereby incorporated by reference in its entirety). Based
on the additional amino acid motifs found in the palm and fingers
regions of the R2 RT, and the demonstrated ability of the R2 enzyme
to specifically bind its own RNA template, the R2 enzyme is shown
to have significant binding potential to the RNA template upstream
of the active site (FIG. 11A). There are two components to this
template binding: specific affinity of the protein for the RNA
structure assumed by the 3' UTR sequences of the R2 element, and
the ability of the protein to bind the free 3' end of any RNA
molecule. The ability to bind RNA 3' ends could explain how the R2
protein can template jump onto a second RNA template when it
completes transcription of the first RNA template (panel B). R2 RNA
templates are preferred in these jumps because these RNAs have
higher affinity for the RT. These properties of the R2 protein
contrast with HIV RT and its associated RNase H domain in which the
major interactions of the protein are with the RNA template
downstream of the active site (Kohlstaedt et al., "Crystal
Structure at 3.5 Angstrom Resolution of HIV-1 Reverse Transcriptase
Complexed with an Inhibitor," Science 256:1783-1790 (1992); and
Sarafianos et al., "Crystal Structure of HIV-1 Reverse
Transcriptase in Complex with a Polypurine Tract RNA:DNA," EMBO J.
20:1449-1461 (2001), each of which is hereby incorporated by
reference in its entirety). Template switching by retroviral
enzymes involves annealing of the acceptor RNA to the cDNA
downstream of the active site (Peliska and Benkovic, "Mechanism of
DNA Strand Transfer Reactions Catalyzed by HIV-1 Reverse
Transcriptase," Science 258:1112-1118 (1992); and DeStefano et al.,
"Requirements for Strand Transfer Between Internal Regions of
Heteropolymer Templates by Human Immunodeficiency Virus Reverse
Transcriptase," J. Virol. 66:6370-6378 (1992), each of which is
hereby incorporated by reference in its entirety).
[0117] The absence of the RNase H domain in the R2 RT means that
RNA 3' ends may also be able to bind the opposite (downstream) side
of the RT active site (FIG. 11B). Thus, the ability of the R2
enzyme to use RNA to prime reverse transcription in the absence of
sequence identity can be explained by the 3' ends of two RNA
molecules simultaneously binding either end of the presumed major
groove that contains the active site of the enzyme. The R2 RT has
significant preference to use the R2 RNA as template (upstream
binding), but little specificity for the RNA that primes the
reaction (downstream binding).
[0118] Priming of reverse transcription by the DNA cleavage site
can be viewed as similar to that of RNA-priming (panel C). When the
R2 protein is bound to the nicked DNA target site, the 3' end of a
cleaved DNA strand is positioned adjacent to the RT active site.
When the R2 protein is free in solution, the 3' end of RNA can be
bound to this site. In FIG. 11, the DNA end has been drawn unpaired
(to be used as primer), to emphasize its potential similarity to
the RNA-priming reaction; however, there is no direct evidence for
this suggestion.
[0119] It seems unlikely that template jumping between RNA
templates plays a role in R2 retrotransposition. However, the
ability of the enzyme to conduct such jumps can be viewed as
support for one possible model of how the 5' end of the reverse
transcribed product is attached to the upstream DNA target site.
Analysis of the sequence variation that exists at the 5' end of R2
elements from a number of arthropod species has led us to suggest a
model in which the RT jumps from the R2 RNA template onto the
upstream DNA target (FIG. 11C) (Burke et al., "The Domain Structure
and Retrotransposition Mechanism of R2 Elements are Conserved
Throughout Arthropods," Mol. Biol. Evol. 16:502-511 (1999); and
George et al., "Analysis of the 5' Junctions of R2 Insertions with
the 28S gene: Implications for non-LTR Retrotransposition,"
Genetics 142:853-863 (1996), each of which is hereby incorporated
by reference in its entirety). R2 5' junction variation includes
apparent non-templated nucleotides similar to those resulting from
in vitro template jumps (FIGS. 2, 6 and 9). R2 5' junctions
sometimes contain short deletions of the DNA upstream of the
cleavage site. These deletions could be explained by the jumps
occurring to internal nucleotides near the free 3' end, again as
seen above during in vitro jumps (FIGS. 2, 6 and 9). Finally, many
R2 5' junctions contain large deletions of the R2 element
indicating that no sequences at the 5' end of the element are
required for 5' attachment. These junctions are readily explained
by template jumps occurring prematurely or from RNA templates that
are not full-length.
Example 10
R2 RT is More Processive Than Retroviral AMV RT
[0120] The length of nucleic acid that a polymerase can synthesize
before dissociating from its template is usually referred to
as its processivity. High processivity is desirable in any RT used to
make full-length cDNA copies of RNA. The retroviral AMV polymerase is
one of the best characterized RTs and is known to be one of the
most processive of the retroviral enzymes. FIG. 13 compares the
processivity of the R2 RT with that of AMV RT in a simple primer
extension assay with the 600 nt vector RNA as template. The
reaction with AMV RT generates a larger percentage of cDNA products
that are less than full length (AMV-RT, lane 1) compared to the R2
protein (R2-RT, lane 1). To confirm that the reaction products seen
in FIG. 13 reflect the processivity of each enzyme (i.e., the
length of cDNA synthesized before the RT dissociates from the RNA
template) and not multiple rounds of elongation, the reactions were
also conducted under conditions that do not allow the reinitiation
of RT after its dissociation. These single round reactions, also
called RT trap assays, involve the addition of heparin and an
excess of poly(A)/oligo dT. Heparin inhibits reinitiation, while any
reinitiation that might occur will be predominantly onto the more
abundant poly(A)/oligo dT templates. To demonstrate the efficiency
of the `trap`, lane 2 for each enzyme represents the addition of
heparin and the poly(A)/oligo dT at the same time as the addition
of the RT, thus preventing any synthesis of cDNA primed by the
end-labeled DNA primer annealed to the 600 nt RNA.
[0121] In lane 3 of each panel, the RTs are first bound to the DNA
primer/RNA template complex and then heparin and poly(A)/oligo dT
are added along with dNTPs to start reverse transcription. The
length distribution of the cDNA synthesized by AMV RT in the
presence of the trap was again significantly shorter than cDNA
synthesized by the R2-RT. The difference in the accumulation of
cDNA transcripts by the R2 and AMV RTs is also illustrated by the
graph in FIG. 14A. The tracings in this figure represent the RT
products found in lane 3 for each enzyme in FIG. 13. The yield of
full-length cDNA product (600 nt) relative to total cDNA (all bands
between 100 and 600 nt) was determined. The level of full-length
product with R2 was ~16% of the total synthesis, or nearly
four times higher than with AMV-RT (4.1%).
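The full-length yield comparison above is simple arithmetic on the quantified lane signal. The following Python sketch is illustrative only: the function name is hypothetical, and the only values taken from the text are the two reported percentages (about 16% for R2 RT and 4.1% for AMV RT).

```python
def full_length_fraction(full_length_signal: float, total_signal: float) -> float:
    """Fraction of the total cDNA signal (all bands, 100-600 nt) that lies
    in the full-length band. Hypothetical helper, not part of the source."""
    if total_signal <= 0:
        raise ValueError("total_signal must be positive")
    return full_length_signal / total_signal


# Fractions reported in the text for lane 3 of FIG. 13.
r2_fraction = 0.16    # about 16% full-length product with R2 RT
amv_fraction = 0.041  # 4.1% full-length product with AMV RT

# Ratio of full-length yields, the "nearly four times higher" in the text.
fold_difference = r2_fraction / amv_fraction
print(f"R2 / AMV full-length yield: {fold_difference:.1f}-fold")
```

The ratio comes out just under four, which matches the wording "nearly four times higher".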
[0122] Similar processivity comparisons between the R2 and AMV RTs
were also conducted with the 1094 nt RNA derivative of the
pBSKII(SK-) plasmid (FIG. 14B). Reverse transcription of the 1094
nt template by AMV-RT yields essentially no cDNA products longer
than 450 nt. In contrast, the R2 RT yields considerable cDNA
products over 500 nt in length, with a distinct full-length band at
1094 nt. These observations clearly suggest that under the
conditions of these single round elongation reactions, R2 RT is
more processive than the AMV-RT. It should be mentioned that the
cDNA distributions in the non-trap and trap reactions in FIG. 13
(lanes 1 and 3, respectively) are similar for both enzymes. This
result can be explained by the very short reaction times and low
polymerase concentrations used in these assays. Both conditions
significantly decrease the probability of reinitiation of the RT
after it has dissociated.
[0123] Finally, it should be noted that the cDNAs that are less than
full-length differed dramatically for the R2 and AMV RTs. In the
case of the 600 nt template, the RT products generated by AMV
reveal several distinct bands that are distributed along the RNA,
while the distribution of RT products generated by the R2-RT is
more diffuse, with only one distinct band of ~95 nt. In the
case of the 1094 nt RNA, AMV produced distinct bands of lengths
130, 145, 160 and 300 nt, while the R2-RT produced weaker, but
still distinct bands of lengths 100, 180, 200, 220 and 240 nt.
These differences in the length of the truncated cDNA reflect the
different abilities of the RTs to transcribe regions of RNA with
different primary and secondary structure. They also indicate that
the shorter cDNA products produced by the RT are a result of enzyme
dissociation and not the result of a degraded RNA template.
Example 11
The Higher Processivity of R2 is a Result of its Reduced Rate of
Dissociation from the RNA Template
[0124] The higher processivity of the R2 RT compared to retroviral
RTs could be a result of two properties. First, it might dissociate
from the RNA template at a slower rate than retroviral enzymes.
Second, it might elongate cDNA at a faster rate than the retroviral
enzymes. To compare the dissociation rates of R2 and AMV RT, the
183 nt vector RNA with annealed end-labeled DNA primer was first
preincubated with the appropriate RT to allow protein binding. To
this complex were added heparin and excess poly(A)/oligo dT (the
trap). After various periods of time, dNTPs were added to initiate
reverse transcription, and the products were separated on
polyacrylamide gels (FIG. 15A). The total amount of cDNA synthesis
at each time point was determined on a phosphoimager and plotted in
FIG. 15B. This experiment demonstrated a dramatic difference in the
dissociation rate of the two RTs. The level of cDNA products
generated by the R2-RT decreased only 2-fold even after a 45 minute
incubation. Meanwhile, the level of cDNA products generated by the
AMV-RT decreased 10-fold after only a 2 minute pre-incubation with
the trap.
[0125] The data were fit using a decreasing single-exponential
function, $\exp(-k_{\mathrm{off}} \cdot x)$, and the dissociation rate was
determined ($k_{\mathrm{off}} = (0.33 \pm 0.04) \times 10^{-3}\ \mathrm{sec}^{-1}$).
Because it can be argued that the absence of substrate dNTP may
significantly change the properties of the protein/template complex
and, thus, may affect its stability, another stability experiment was
conducted similar to that in FIG. 15A, but containing 5 μM dATP in the
preincubation mixture. This nucleotide is the first to be
incorporated from this RNA template and, thus, should represent
complexes that have initiated reverse transcription. The
$k_{\mathrm{off}}$ determined under these conditions was not
significantly different
($k_{\mathrm{off}} = 0.27 \times 10^{-3}\ \mathrm{sec}^{-1}$). The
approximate $k_{\mathrm{off}}$ determined for the AMV protein in the
experiment in FIG. 15
($k_{\mathrm{off}} = 0.019 \pm 0.0021\ \mathrm{sec}^{-1}$) is in good
agreement with published values for the half time of AMV RT binding
(~30 s), e.g., by DeStefano et al., J. Biol. Chem. 266:7423-7431 (1991),
which is hereby incorporated by reference in its entirety.
[0126] Thus, the R2-RT dissociates from an RNA template nearly
60-fold slower than AMV-RT.
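The comparison of dissociation rates can be checked numerically. The Python sketch below is an illustration under stated assumptions: it uses the single-exponential decay model named in the text, takes the two reported dissociation rates as constants, and introduces hypothetical helper names (`half_life_seconds`, `fraction_bound`) that do not appear in the source.

```python
import math

# Dissociation rates reported in the text, in sec^-1.
K_OFF_R2 = 0.33e-3   # R2 RT
K_OFF_AMV = 0.019    # AMV RT


def half_life_seconds(k_off: float) -> float:
    """Half time of the RT/template complex for single-exponential decay."""
    return math.log(2) / k_off


def fraction_bound(k_off: float, seconds: float) -> float:
    """Fraction of complexes still intact after a given time, exp(-k_off * t)."""
    return math.exp(-k_off * seconds)


fold_slower = K_OFF_AMV / K_OFF_R2
print(f"R2 dissociates {fold_slower:.0f}-fold slower than AMV")
print(f"R2 half time:  {half_life_seconds(K_OFF_R2) / 60:.0f} min")
print(f"AMV half time: {half_life_seconds(K_OFF_AMV):.0f} s")

# Expected remaining signal at the incubation times discussed in Example 11.
print(f"R2 after 45 min: {fraction_bound(K_OFF_R2, 45 * 60):.2f}")
print(f"AMV after 2 min: {fraction_bound(K_OFF_AMV, 2 * 60):.2f}")
```

Under this model the ratio of the two rates is a little under 60, the AMV half time is on the order of the published ~30 s value, and the predicted losses at 45 minutes (R2) and 2 minutes (AMV) are roughly 2-fold and 10-fold, in line with the decreases described for FIG. 15.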
[0127] To determine the elongation rates of the R2 RT, the protein
was preincubated with the 1090 nt vector RNA/end-labeled DNA primer
complex to allow association. The four dNTPs were then added for
short periods of time and the reaction abruptly stopped by the
addition of SDS and ethanol. The cDNA products from the R2 RT are
shown in FIG. 16A. The maximum length of cDNA synthesized at each
time point can be used to determine an elongation rate of 11.0
nt/sec. This reaction has been conducted with a number of different
templates. Plotted in FIG. 16B is a similar experiment conducted
with a 600 nt RNA template using both R2 and AMV RT. In this
experiment, the R2 RT elongation rate was calculated to be 14.7
nt/sec, similar to the rate calculated for the AMV-RT (12.9
nt/sec).
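One way to obtain an elongation rate from the maximum cDNA length at each time point is a least-squares slope of length against time. The Python sketch below assumes that approach; the source does not state how the rate was calculated. The function name and the time course are hypothetical and are not data from FIG. 16. Only the two rates compared at the end (14.7 and 12.9 nt/sec) come from the text.

```python
def elongation_rate(times_s, max_lengths_nt):
    """Least-squares slope (nt/sec) of maximum cDNA length versus time."""
    n = len(times_s)
    if n < 2 or n != len(max_lengths_nt):
        raise ValueError("need two or more paired time/length points")
    mean_t = sum(times_s) / n
    mean_len = sum(max_lengths_nt) / n
    sxy = sum((t - mean_t) * (length - mean_len)
              for t, length in zip(times_s, max_lengths_nt))
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("time points must not all be identical")
    return sxy / sxx


# Hypothetical time course for illustration only (NOT data from the figures).
times_s = [10, 20, 30, 40]
max_lengths_nt = [130, 240, 350, 460]
rate = elongation_rate(times_s, max_lengths_nt)
print(f"Illustrative elongation rate: {rate:.1f} nt/sec")

# Rates reported in the text for the 600 nt template.
R2_RATE = 14.7   # nt/sec
AMV_RATE = 12.9  # nt/sec
rate_ratio = R2_RATE / AMV_RATE
print(f"R2 / AMV elongation rate: {rate_ratio:.2f}")
```

The ratio of the reported rates is close to 1, so the two enzymes elongate at similar speeds, whereas their dissociation rates differ by well over an order of magnitude.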
[0128] The combined results of FIGS. 15 and 16 demonstrate that the
increased processivity of the R2 protein is a result of its higher
stability on (i.e., reduced rate of dissociation from) the RNA
template.
Example 12
The R2 RT is Not Blocked by the Secondary Structure of the RNA
Template
[0129] The presence of truncated cDNA bands of specific
lengths (see FIGS. 13-15), rather than the continuous range
of cDNA lengths expected from a gradual dissociation of the RT from
the RNA template, is believed to be a result of the RT pausing at
structural features of the RNA template. As was suggested in the
discussion of the data in FIGS. 13 and 14, it appears that the R2 RT
responds differently and not as severely to these structural
features, because the yield and sizes of these truncated cDNAs
differed for the two enzymes. To directly determine the effects of
RNA structure on the ability of the R2 protein to reverse
transcribe RNA templates, RNA templates were engineered to contain
precise hairpin loops. It was determined that the R2 protein was
readily able to transcribe through these loops. Unfortunately, DNA
sequences that encode extremely long loops of this kind and can be
transcribed by T7 RNA polymerase are difficult to clone in bacterial
plasmids. Therefore, a somewhat different approach was utilized to
generate such stable hairpins by simply annealing two complementary
RNA molecules. The experimental approach (diagrammed in FIG. 17B)
involved the 334 nt RNA template, to which were annealed both a 19 nt
oligodeoxynucleotide near the middle of the RNA (the primer) and a
117 nt RNA that is a perfect complement to the 5' end of the RNA
(the block). cDNA synthesis is monitored on gels by means of the
end-labeled DNA primer. Both the R2 and AMV enzymes were tested
with and without the RNA block (lanes 1 and 2, respectively) under
two conditions: the presence of the poly(A)/oligo dT trap (left
panel) and the absence of a trap to allow multiple re-initiations
(right panel).
[0130] With the R2 RT, only full-length cDNA products are generated,
irrespective of whether the RNA template carries the RNA block
(lane 1) or lacks it (lane 2). In the case
of the AMV-RT, the presence of the RNA block completely prevents any
cDNA synthesis more than a few nucleotides past the beginning of
the block. In the absence of the RNA block, full-length products
are obtained with the AMV-RT, but even here much of the cDNA stops
near the middle of the RNA, presumably as a result of RNA
secondary structure. These results dramatically reveal that the R2
RT is not significantly blocked by duplex regions of RNA templates.
In another set of experiments, the R2 protein was determined to be
capable of readily reverse transcribing poly(A) templates that are
saturated by oligo(dT). Under these same conditions, AMV-RT is
severely inhibited.
[0131] At this point it is not clear whether the ability of the R2
RT to reverse transcribe through the annealed RNA is a result of
the protein's ability to actively displace the annealed RNA strand
from the template, or whether the remarkable stability of the R2
protein on its RNA template allows the enzyme to passively move
through the duplex region during the random opening and closing
(sometimes called breathing) associated with the ends of duplex
nucleic acids.
Example 13
Effects of Temperature on the Reverse Transcription Reactions
[0132] Because weak RNA secondary structures are known to be highly
dependent upon the temperature of the solution, increasing the
temperature of the reaction is sometimes used as a means to
minimize the effects of RNA structure on an RT reaction. In FIG. 18,
the ability of the R2 and AMV RTs to reverse transcribe the 600 nt
RNA template at temperatures ranging from 25° C. to
55° C. is compared. The reactions were again conducted for
short periods with a low concentration of RT to promote the formation
of products derived from only a single round of enzyme association.
The total cDNA produced at each temperature was determined on a
phosphoimager, and the fraction of the cDNA corresponding to
full-length cDNA (~600 nt) at each temperature is plotted in
FIG. 19.
[0133] In the case of the AMV-RT, high levels of cDNA synthesis were
obtained at all temperatures, but the percentage of the cDNA
product corresponding to full-length transcripts never exceeded
1.2%. The nature of the truncated products did not substantially
differ at the different temperatures, suggesting that the changes in
the secondary structure of the RNA associated with these different
temperatures had minimal effect on enzyme dissociation.
[0134] In the case of the R2 RT, the total amount of cDNA synthesis
increased from 25° C. to 40° C., at which point
further temperature increases caused a dramatic loss of activity.
This loss of activity is presumably a result of the denaturation
of the protein. Surprisingly, even though the R2 protein is less
active at lower temperatures, the fraction of the total cDNA that
was full-length increased, such that at 25° C. over 20% of
the cDNA is full-length. These results could be explained if
rising temperatures increase the dissociation rate of the R2
protein from the RNA template to a greater extent than they
increase the elongation rate.
[0135] Although the invention has been described in detail for the
purpose of illustration, it is understood that such detail is
solely for that purpose, and variations can be made therein by
those skilled in the art without departing from the spirit and
scope of the invention which is defined by the following claims.
Sequence CWU 1
1
38 1 24 RNA Artificial Sequence Description of Artificial Sequence
3' terminal sequence of 274 base R2 RNA 1 uuuuaucggu uuacggagca
guag 24 2 10 RNA Artificial Sequence Description of Artificial
Sequence 5' terminal sequence of 274 base R2 RNA 2 ccaacucgga 10 3
33 DNA Artificial Sequence Description of Artificial Sequence
reverse transcription product 3 tttatcggtt tacggagcag tagccaactc
gga 33 4 34 DNA Artificial Sequence Description of Artificial
Sequence reverse transcription product 4 tttatcggtt tacggagcag
tagaccaact cgga 34 5 34 DNA Artificial Sequence Description of
Artificial Sequence reverse transcription product 5 tttatcggtt
tacggagcag tagcccaact cgga 34 6 28 DNA Artificial Sequence
Description of Artificial Sequence reverse transcription product 6
tttatcggtt tacgtagccc aactcgga 28 7 16 RNA Artificial Sequence
Description of Artificial Sequence 3' end of 254 base R2 RNA 7
uccggcgcga ugaaaa 16 8 17 RNA Artificial Sequence Description of
Artificial Sequence 5' end of 334 base R2 RNA 8 cccgcuuaac ccauggc
17 9 34 DNA Artificial Sequence Description of Artificial Sequence
reverse transcription product 9 tccggcgcga tgaaaaaccc gcttaaccca
tggc 34 10 33 DNA Artificial Sequence Description of Artificial
Sequence reverse transcription product 10 tccggcgcga tgaaaacccg
cttaacccat ggc 33 11 33 DNA Artificial Sequence Description of
Artificial Sequence reverse transcription product 11 tccggcgcga
tgaaaagccg cttaacccat ggc 33 12 17 DNA Artificial Sequence
Description of Artificial Sequence reverse transcription product 12
tccggcgcgg gcatggc 17 13 29 DNA Artificial Sequence Description of
Artificial Sequence reverse transcription product 13 tccggcgcga
tgaaaacgcc ccccctcga 29 14 28 DNA Artificial Sequence Description
of Artificial Sequence reverse transcription product 14 tccggcgcga
tgaaaaattc cctttagt 28 15 26 DNA Artificial Sequence Description of
Artificial Sequence reverse transcription product 15 tccggcgcga
tttttgcttt tcttcc 26 16 17 DNA Artificial Sequence Description of
Artificial Sequence 3' end of 54 base ssDNA template 16 aactatgact
ctcttaa 17 17 15 RNA Artificial Sequence Description of Artificial
Sequence 5' end of 254 base R2 RNA 17 gguugagccu ugcac 15 18 32 DNA
Artificial Sequence Description of Artificial Sequence reverse
transcription product 18 aactatgact ctcttaaggt tgagccttgc ac 32 19
33 DNA Artificial Sequence Description of Artificial Sequence
reverse transcription product 19 aactatgact ctcttaaacg ttgagccttg
cac 33 20 30 DNA Artificial Sequence Description of Artificial
Sequence reverse transcription product 20 aactatgact ctcttttttg
agccttgcac 30 21 24 DNA Artificial Sequence Description of
Artificial Sequence reverse transcription product 21 aactatgatg
gttgagcctt gcac 24 22 1114 PRT Bombyx mori R2 element 22 Met Met
Ala Ser Thr Ala Leu Ser Leu Met Gly Arg Cys Asn Pro Asp 1 5 10 15
Gly Cys Thr Arg Gly Lys His Val Thr Ala Ala Pro Met Asp Gly Pro 20
25 30 Arg Gly Pro Ser Ser Leu Ala Gly Thr Phe Gly Trp Gly Leu Ala
Ile 35 40 45 Pro Ala Gly Glu Pro Cys Gly Arg Val Cys Ser Pro Ala
Thr Val Gly 50 55 60 Phe Phe Pro Val Ala Lys Lys Ser Asn Lys Glu
Asn Arg Pro Glu Ala 65 70 75 80 Ser Gly Leu Pro Leu Glu Ser Glu Arg
Thr Gly Asp Asn Pro Thr Val 85 90 95 Arg Gly Ser Ala Gly Ala Asp
Pro Val Gly Gln Asp Ala Pro Gly Trp 100 105 110 Thr Cys Gln Phe Cys
Glu Arg Thr Phe Ser Thr Asn Arg Gly Leu Gly 115 120 125 Val His Lys
Arg Arg Ala His Pro Val Glu Thr Asn Thr Asp Ala Ala 130 135 140 Pro
Met Met Val Lys Arg Arg Trp His Gly Glu Glu Ile Asp Leu Leu 145 150
155 160 Ala Arg Thr Glu Ala Arg Leu Leu Ala Glu Arg Gly Gln Cys Ser
Gly 165 170 175 Gly Asp Leu Phe Gly Ala Leu Pro Gly Phe Gly Arg Thr
Leu Glu Ala 180 185 190 Ile Lys Gly Gln Arg Arg Arg Glu Pro Tyr Arg
Ala Leu Val Gln Ala 195 200 205 His Leu Ala Arg Phe Gly Ser Gln Pro
Gly Pro Ser Ser Gly Gly Cys 210 215 220 Ser Ala Glu Pro Asp Phe Arg
Arg Ala Ser Gly Ala Glu Glu Ala Gly 225 230 235 240 Glu Glu Arg Cys
Ala Glu Asp Ala Ala Ala Tyr Asp Pro Ser Ala Val 245 250 255 Gly Gln
Met Ser Pro Asp Ala Ala Arg Val Leu Ser Glu Leu Leu Glu 260 265 270
Gly Ala Gly Arg Arg Arg Ala Cys Arg Ala Met Arg Pro Lys Thr Ala 275
280 285 Gly Arg Arg Asn Asp Leu His Asp Asp Arg Thr Ala Ser Ala His
Lys 290 295 300 Thr Ser Arg Gln Lys Arg Arg Ala Glu Tyr Ala Arg Val
Gln Glu Leu 305 310 315 320 Tyr Lys Lys Cys Arg Ser Arg Ala Ala Ala
Glu Val Ile Asp Gly Ala 325 330 335 Cys Gly Gly Val Gly His Ser Leu
Glu Glu Met Glu Thr Tyr Trp Arg 340 345 350 Pro Ile Leu Glu Arg Val
Ser Asp Ala Pro Gly Pro Thr Pro Glu Ala 355 360 365 Leu His Ala Leu
Gly Arg Ala Glu Trp His Gly Gly Asn Arg Asp Tyr 370 375 380 Thr Gln
Leu Trp Lys Pro Ile Ser Val Glu Glu Ile Lys Ala Ser Arg 385 390 395
400 Phe Asp Trp Arg Thr Ser Pro Gly Pro Asp Gly Ile Arg Ser Gly Gln
405 410 415 Trp Arg Ala Val Pro Val His Leu Lys Ala Glu Met Phe Asn
Ala Trp 420 425 430 Met Ala Arg Gly Glu Ile Pro Glu Ile Leu Arg Gln
Cys Arg Thr Val 435 440 445 Phe Val Pro Lys Val Glu Arg Pro Gly Gly
Pro Gly Glu Tyr Arg Pro 450 455 460 Ile Ser Ile Ala Ser Ile Pro Leu
Arg His Phe His Ser Ile Leu Ala 465 470 475 480 Arg Arg Leu Leu Ala
Cys Cys Pro Pro Asp Ala Arg Gln Arg Gly Phe 485 490 495 Ile Cys Ala
Asp Gly Thr Leu Glu Asn Ser Ala Val Leu Asp Ala Val 500 505 510 Leu
Gly Asp Ser Arg Lys Lys Leu Arg Glu Cys His Val Ala Val Leu 515 520
525 Asp Phe Ala Lys Ala Phe Asp Thr Val Ser His Glu Ala Leu Val Glu
530 535 540 Leu Leu Arg Leu Arg Gly Met Pro Glu Gln Phe Cys Gly Tyr
Ile Ala 545 550 555 560 His Leu Tyr Asp Thr Ala Ser Thr Thr Leu Ala
Val Asn Asn Glu Met 565 570 575 Ser Ser Pro Val Lys Val Gly Arg Gly
Val Arg Gln Gly Asp Pro Leu 580 585 590 Ser Pro Ile Leu Phe Asn Val
Val Met Asp Leu Ile Leu Ala Ser Leu 595 600 605 Pro Glu Arg Val Gly
Tyr Arg Leu Glu Met Glu Leu Val Ser Ala Leu 610 615 620 Ala Tyr Ala
Asp Asp Leu Val Leu Leu Ala Gly Ser Lys Val Gly Met 625 630 635 640
Gln Glu Ser Ile Ser Ala Val Asp Cys Val Gly Arg Gln Met Gly Leu 645
650 655 Arg Leu Asn Cys Arg Lys Ser Ala Val Leu Ser Met Ile Pro Asp
Gly 660 665 670 His Arg Lys Lys His His Tyr Leu Thr Glu Arg Thr Phe
Asn Ile Gly 675 680 685 Gly Lys Pro Leu Arg Gln Val Ser Cys Val Glu
Arg Trp Arg Tyr Leu 690 695 700 Gly Val Asp Phe Glu Ala Ser Gly Cys
Val Thr Leu Glu His Ser Ile 705 710 715 720 Ser Ser Ala Leu Asn Asn
Ile Ser Arg Ala Pro Leu Lys Pro Gln Gln 725 730 735 Arg Leu Glu Ile
Leu Arg Ala His Leu Ile Pro Arg Phe Gln His Gly 740 745 750 Phe Val
Leu Gly Asn Ile Ser Asp Asp Arg Leu Arg Met Leu Asp Val 755 760 765
Gln Ile Arg Lys Ala Val Gly Gln Trp Leu Arg Leu Pro Ala Asp Val 770
775 780 Pro Lys Ala Tyr Tyr His Ala Ala Val Gln Asp Gly Gly Leu Ala
Ile 785 790 795 800 Pro Ser Val Arg Ala Thr Ile Pro Asp Leu Ile Val
Arg Arg Phe Gly 805 810 815 Gly Leu Asp Ser Ser Pro Trp Ser Val Ala
Arg Ala Ala Ala Lys Ser 820 825 830 Asp Lys Ile Arg Lys Lys Leu Arg
Trp Ala Trp Lys Gln Leu Arg Arg 835 840 845 Phe Ser Arg Val Asp Ser
Thr Thr Gln Arg Pro Ser Val Arg Leu Phe 850 855 860 Trp Arg Glu His
Leu His Ala Ser Val Asp Gly Arg Glu Leu Arg Glu 865 870 875 880 Ser
Thr Arg Thr Pro Thr Ser Thr Lys Trp Ile Arg Glu Arg Cys Ala 885 890
895 Gln Ile Thr Gly Arg Asp Phe Val Gln Phe Val His Thr His Ile Asn
900 905 910 Ala Leu Pro Ser Arg Ile Arg Gly Ser Arg Gly Arg Arg Gly
Gly Gly 915 920 925 Glu Ser Ser Leu Thr Cys Arg Ala Gly Cys Lys Val
Arg Glu Thr Thr 930 935 940 Ala His Ile Leu Gln Gln Cys His Arg Thr
His Gly Gly Arg Ile Leu 945 950 955 960 Arg His Asn Lys Ile Val Ser
Phe Val Ala Lys Ala Met Glu Glu Asn 965 970 975 Lys Trp Thr Val Glu
Leu Glu Pro Arg Leu Arg Thr Ser Val Gly Leu 980 985 990 Arg Lys Pro
Asp Ile Ile Ala Ser Arg Asp Gly Val Gly Val Ile Val 995 1000 1005
Asp Val Gln Val Val Ser Gly Gln Arg Ser Leu Asp Glu Leu His Arg
1010 1015 1020 Glu Lys Arg Asn Lys Tyr Gly Asn His Gly Glu Leu Val
Glu Leu Val 1025 1030 1035 1040 Ala Gly Arg Leu Gly Leu Pro Lys Ala
Glu Cys Val Arg Ala Thr Ser 1045 1050 1055 Cys Thr Ile Ser Trp Arg
Gly Val Trp Ser Leu Thr Ser Tyr Lys Glu 1060 1065 1070 Leu Arg Ser
Ile Ile Gly Leu Arg Glu Pro Thr Leu Gln Ile Val Pro 1075 1080 1085
Ile Leu Ala Leu Arg Gly Ser His Met Asn Trp Thr Arg Phe Asn Gln
1090 1095 1100 Met Thr Ser Val Met Gly Gly Gly Val Gly 1105 1110 23
3345 DNA Bombyx mori R2 element 23 atgatggcga gcaccgcact gtcccttatg
ggacggtgta acccggatgg ctgtacacgt 60 ggtaaacacg tgacagcagc
cccgatggac ggaccgcgag gaccgtcaag cctagcaggt 120 accttcgggt
ggggccttgc gatacctgcg ggcgaaccct gtggtcgggt ttgcagcccg 180
gccacagtgg gtttttttcc tgttgcaaaa aagtcaaata aagaaaatag acctgaagcc
240 tctggcctcc cgctggagtc agagaggaca ggcgataacc cgactgtgcg
gggttccgcc 300 ggcgcagatc ctgtgggtca ggatgcgcct ggttggacct
gccagttctg cgaacgaacc 360 ttttcgacca acaggggttt gggtgtccac
aagcgtagag cccaccctgt tgagaccaat 420 acggatgccg ctccgatgat
ggtgaagcgg cggtggcatg gcgaggaaat cgacctcctc 480 gctcgcaccg
aggccaggtt gctcgctgag cggggtcagt gctcgggtgg agacctcttt 540
ggcgcgcttc cagggtttgg aagaactctg gaagcgatta agggacaacg gcggagggag
600 ccttatcggg cattggtgca agcgcacctt gcccgatttg gttcccagcc
gggtccctcg 660 tcgggggggt gctcggccga gcctgacttc cggcgggctt
ctggagctga ggaagcgggc 720 gaggaacgat gcgccgaaga cgccgctgcc
tatgatccat ccgcagtcgg tcagatgtcg 780 cccgatgccg ctcgggttct
ctccgaactc cttgagggtg cggggagaag acgagcgtgc 840 agggctatga
gacccaagac tgcagggcgg cgaaacgatt tgcacgatga tcggacagct 900
agtgcccaca aaaccagtag acaaaagcgc agggcagagt acgcgcgtgt gcaggaactg
960 tacaagaagt gtcgcagcag agcagcagct gaggtgatcg atggcgcgtg
tgggggtgtc 1020 ggacactcgc tcgaggagat ggagacctat tggcgaccta
tcctcgagag agtgtccgat 1080 gcacctgggc ctacaccgga agctcttcac
gccctagggc gtgcggagtg gcacgggggc 1140 aatcgcgact acacccagct
gtggaagccg atctcggtgg aagagatcaa ggcctcccgc 1200 tttgactggc
gaacttcgcc gggcccggac ggtatacgtt cgggtcagtg gcgtgcggtt 1260
cctgtgcact tgaaggcgga aatgttcaat gcatggatgg cacgaggcga aatacccgaa
1320 attctacggc agtgccgaac cgtctttgta cctaaggtgg agagaccagg
tggaccgggg 1380 gaatatcgac cgatctcgat cgcgtcgatt cccctgagac
actttcactc catcttggcc 1440 cggaggctgt tggcttgctg cccccctgat
gcacgacagc gcggatttat ctgcgccgac 1500 ggtacgctgg agaattccgc
agtactggac gcggtgcttg gggatagcag gaagaagctg 1560 cgggaatgtc
acgtggcggt gctagacttc gccaaggcat ttgacacagt gtctcacgag 1620
gcacttgtcg aattgctgag gttgaggggc atgcccgaac agttctgcgg ctacattgct
1680 cacctatacg atacggcgtc caccacctta gccgtgaaca atgaaatgag
cagccctgta 1740 aaagtgggac gaggggttcg tcaaggggac cctctgtcgc
cgatactctt caacgtggtg 1800 atggacctca tcctggcttc cctgccggag
agggtcgggt ataggttgga gatggaactc 1860 gtgtccgctc tggcctatgc
tgacgaccta gtcctgcttg cggggtcgaa ggtagggatg 1920 caggagtcca
tctctgctgt ggactgtgtc ggtaggcaga tgggcctacg cctgaattgc 1980
aggaaaagcg cggttctgtc tatgataccg gatggccacc gcaagaagca tcactacctg
2040 actgagcgaa ccttcaatat tggaggtaag ccgctcaggc aggtgagttg
tgttgagcgg 2100 tggcgatatc ttggtgtcga ttttgaggcc tctggatgcg
tgacattaga gcatagtatc 2160 agtagtgctc tgaataacat ctcaagggca
cctctcaaac cccaacagag gttggagatt 2220 ttgagagctc atctgattcc
gagattccag cacggttttg tgcttggaaa catctcggat 2280 gaccgattga
gaatgctcga tgtccaaatc cggaaagcag tcggacagtg gctaaggcta 2340
ccggcggatg tgcccaaggc atattatcac gccgcagttc aggacggcgg cttagcgatc
2400 ccatcggtgc gagcgaccat cccggacctc attgtgaggc gtttcggggg
gctcgactcg 2460 tcaccatggt cagtggcaag agccgccgcc aaatctgata
agattcgtaa gaaactgcgg 2520 tgggcctgga aacagctccg caggttcagc
cgtgttgact ccacaacgca acgaccatct 2580 gtgcgcttgt tttggcgaga
acatctgcat gcatctgttg atggacgcga acttcgcgaa 2640 tccacacgca
ccccgacatc cacaaagtgg attagggagc gatgcgcgca gataaccgga 2700
cgggacttcg tgcagttcgt gcacactcat atcaacgccc tcccatcccg cattcgcgga
2760 tcgagagggc gtagaggtgg gggtgagtct tcgttgacct gccgtgctgg
ttgcaaggtt 2820 agggagacga cggctcacat cctacaacag tgtcacagaa
cacacggcgg ccggattcta 2880 cgacacaaca agattgtatc tttcgtggcg
aaagccatgg aagagaacaa gtggacggtt 2940 gagctggagc cgaggctacg
aacatcggtt ggtctccgta agccggatat tatcgcctcc 3000 agggatggtg
tcggagtgat cgtggacgtg caggtggtct cgggccagcg atcgcttgac 3060
gagctccacc gtgagaaacg taataaatac gggaatcacg gggagctggt tgagttggtc
3120 gcaggtagac taggacttcc gaaagctgag tgcgtgcgag ccacttcgtg
cacgatatct 3180 tggaggggag tatggagcct gacttcttat aaggagttaa
ggtccataat cgggcttcgg 3240 gaaccgacac tacaaatcgt tccgatactg
gcgttgagag gttcacacat gaactggacc 3300 aggttcaatc agatgacgtc
cgtcatgggg ggcggcgttg gttga 3345
24 50 DNA Bombyx mori 24 taaacggcgg gagtaactat gactctctta aggtagccaa atgcctcgtc 50
25 42 DNA Artificial Sequence Description of Artificial Sequence primer 25 ctgcagtaat acgactcact ataggacttg gggaatccga ct 42
26 19 DNA Artificial Sequence Description of Artificial Sequence primer 26 ttttcatcgc cggatcatc 19
27 19 DNA Artificial Sequence Description of Artificial Sequence primer 27 ggaaacagct atgaccatg 19
28 20 DNA Artificial Sequence Description of Artificial Sequence primer 28 gatgacgagg catttggcta 20
29 43 DNA Artificial Sequence Description of Artificial Sequence primer 29 ctgcagtaat acgactcact ataggttgag ccttgcacag tag 43
30 50 DNA Artificial Sequence Description of Artificial Sequence primer 30 cgacggccag tgccaagctt gcatgcctgc aggtcgactc tagaggatcc 50
31 25 DNA Artificial Sequence Description of Artificial Sequence primer 31 cgggatccga agccaaggga gcgag 25
32 24 DNA Artificial Sequence Description of Artificial Sequence primer 32 gctctagagc gtacggccac gatc 24
33 25 DNA Artificial Sequence Description of Artificial Sequence primer 33 ggggtaccga caggtttccc gactg 25
34 25 DNA Artificial Sequence Description of Artificial Sequence primer 34 gctctagagt tcccttggct gtggt 25
35 54 DNA Artificial Sequence Description of Artificial Sequence primer 35 gctctagagc aagcaagcgc gggtaaacgg cgggagtaac tatgactctc ttaa 54
36 18 DNA Artificial Sequence Description of Artificial Sequence primer 36 taatacgact cactatag 18
37 17 DNA Artificial Sequence Description of Artificial Sequence primer 37 aattcaagca agcgcgg 17
38 21 DNA Artificial Sequence Description of Artificial Sequence primer 38 cgttcttcgg ggcgaaaact c 21
* * * * *
* * * * *