U.S. patent application number 10/706229 was filed with the patent office on 2003-11-12 and published on 2004-07-08 for thermophilic DNA polymerases from Thermotoga neapolitana.
Invention is credited to Bolchakova, Elena; Hartnett, James R.; Huang, Fen; Novikov, Alexander; Slater, Michael R.; Velikodvorskaya, Galina A.
Application Number | 20040132985 10/706229 |
Family ID | 23925074 |
Filed Date | 2003-11-12 |
United States Patent
Application |
20040132985 |
Kind Code |
A1 |
Bolchakova, Elena ; et
al. |
July 8, 2004 |
Thermophilic DNA polymerases from Thermotoga neapolitana
Abstract
The present invention relates to thermostable DNA polymerases
derived from hyperthermophilic eubacteria, and from Thermotoga
neapolitana in particular. The present invention provides means for
isolating and producing these thermostable DNA
polymerases, which are useful in many recombinant DNA techniques,
especially such techniques as thermal cycle sequencing and nucleic
acid amplification.
Inventors: |
Bolchakova, Elena; (Foster
City, CA) ; Novikov, Alexander; (Foster City, CA)
; Slater, Michael R.; (Madison, WI) ; Huang,
Fen; (Madison, WI) ; Hartnett, James R.;
(Fitchburg, WI) ; Velikodvorskaya, Galina A.;
(Moscow, RU) |
Correspondence
Address: |
MEDLEN & CARROLL, LLP
101 Howard Street, Suite 350
San Francisco
CA
94105
US
|
Family ID: |
23925074 |
Appl. No.: |
10/706229 |
Filed: |
November 12, 2003 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
10706229 | Nov 12, 2003 | |
09641319 | Aug 18, 2000 | |
09385986 | Aug 30, 1999 | |
08484661 | Jun 7, 1995 | 6001645 |
Current U.S.
Class: |
536/23.2 ;
435/199; 435/252.3; 435/320.1; 435/6.12; 435/69.1 |
Current CPC
Class: |
C12Q 1/6869 20130101;
C12Q 2521/101 20130101; C12Q 2521/101 20130101; C12Q 1/686
20130101; C12N 9/1252 20130101; C12Q 1/686 20130101; C12Q 1/6869
20130101 |
Class at
Publication: |
536/023.2 ;
435/006; 435/069.1; 435/199; 435/252.3; 435/320.1 |
International
Class: |
C12Q 001/68; C07H
021/04; C12N 009/22; C12N 001/21; C12N 015/74 |
Claims
1. A purified thermostable DNA polymerase capable of DNA synthetic
activity, said enzyme derived from the eubacterium Thermotoga
neapolitana.
2. The polymerase of claim 1, wherein said polymerase is further
capable of 3' exonuclease activity.
3. The polymerase of claim 1, wherein said polymerase is further
capable of 5' exonuclease activity.
4. The polymerase of claim 1, wherein said polymerase comprises the
amino acid sequence of SEQ ID NO:2.
5. The polymerase of claim 1, wherein the specific activity of said
synthetic activity is approximately 100,000 units/mg.
6. A purified thermostable non-naturally occurring DNA polymerase
derived from the eubacterium Thermotoga neapolitana.
7. The polymerase of claim 6, wherein said polymerase is capable of
3' exonuclease activity.
8. The polymerase of claim 6, wherein said polymerase is capable of
5' exonuclease activity.
9. The polymerase of claim 6, wherein said polymerase has reduced
levels of 3' exonuclease activity.
10. The polymerase of claim 6, wherein said polymerase lacks
significant 5' exonuclease activity.
11. The polymerase of claim 6, wherein said polymerase comprises an
amino acid sequence selected from the group consisting of SEQ ID
NOS:8, 16, 19, 23, 26, 29, 33 and 35.
12. An oligonucleotide comprising the nucleic acid sequence of SEQ
ID NO: 1, said nucleotide sequence encoding a thermostable DNA
polymerase.
13. The oligonucleotide of claim 12, wherein said sequence is
modified such that the encoded polymerase lacks significant 5'
exonuclease activity.
14. The oligonucleotide of claim 13, wherein said modified sequence
comprises the nucleotide sequence of SEQ ID NO:7.
15. The oligonucleotide of claim 12, wherein said sequence is
modified such that the encoded polymerase displays reduced levels
of 3' exonuclease activity.
16. The oligonucleotide of claim 15, wherein said modified sequence
is selected from the group consisting of SEQ ID NOS:7, 15, 18, 22,
25, 28, 32 and 34.
17. A recombinant DNA vector comprising an oligonucleotide having
the nucleic acid sequence set forth in SEQ ID NO:1, said nucleotide
sequence encoding a thermostable DNA polymerase having DNA
synthetic activity.
18. The recombinant DNA vector of claim 17, wherein said nucleic
acid sequence is modified such that the encoded thermostable DNA
polymerase lacks significant 5' exonuclease activity.
19. The recombinant vector of claim 18, wherein said modified
sequence comprises SEQ ID NO:7.
20. The recombinant DNA vector of claim 17, wherein said nucleic
acid sequence is modified such that the encoded thermostable DNA
polymerase exhibits reduced levels of 3' exonuclease activity.
21. The recombinant vector of claim 20, wherein said modified
sequence is selected from the group consisting of SEQ ID NOS:7, 15,
18, 22, 25, 28, 32 and 34.
Description
FIELD OF THE INVENTION
[0001] The present invention relates to thermostable DNA
polymerases derived from the hyperthermophilic eubacterium
Thermotoga neapolitana and means for isolating and producing the
enzymes. Thermostable DNA polymerases are useful in many
recombinant DNA techniques, especially thermal cycle sequencing and
nucleic acid amplification.
BACKGROUND
[0002] Thermophilic bacteria are organisms which are capable of
growth at elevated temperatures. Unlike the mesophiles, which grow
best at temperatures in the range of 25-40 °C, or
psychrophiles, which grow best at temperatures in the range of
15-20 °C, thermophiles grow best at temperatures greater
than 50 °C. Indeed, some thermophiles grow best at
65-75 °C, and some of the hyperthermophiles grow at
temperatures up to 130 °C (See e.g., J. G. Black,
Microbiology Principles and Applications, 2d edition, Prentice
Hall, New Jersey, [1993] p. 145-146).
[0003] The thermophilic bacteria encompass a wide variety of genera
and species. There are thermophilic representatives included within
the phototrophic bacteria (e.g., the purple bacteria, green
bacteria, and cyanobacteria), eubacteria (e.g., Bacillus,
Clostridium, Thiobacillus, Desulfotomaculum, Thermus, Thermotoga,
lactic acid bacteria, actinomycetes, spirochetes, and numerous other
genera), and the archaebacteria (e.g., Pyrococcus, Thermococcus,
Thermoplasma, Sulfolobus, and the methanogens). There
are aerobic as well as anaerobic thermophilic organisms. Thus, the
environments from which thermophiles may be isolated vary greatly,
although all of these organisms are isolated from areas
associated with high temperatures. Natural geothermal habitats have
a worldwide distribution and are primarily associated with
tectonically active zones where major movements of the earth's
crust occur. Thermophilic bacteria have been isolated from all of
the various geothermal habitats, including boiling springs with
neutral pH ranges, sulfur-rich acidic springs, and deep-sea vents.
It appears that the organisms present
in these geothermal habitats are optimally adapted to the
temperatures at which they are living (T. D. Brock, "Introduction:
An overview of the thermophiles," in T. D. Brock (ed.),
Thermophiles: General, Molecular and Applied Microbiology, John
Wiley & Sons, New York [1986], pp. 1-16). Basic as well as
applied research on thermophiles has provided some insight into the
physiology of these organisms, as well as promise for use of these
organisms in industry and biotechnology.
[0004] 1. The Genus Thermotoga
[0005] Thermotoga is a recently described genus with three
recognized species, and it includes the most extremely thermophilic
eubacteria known. The genus was first described in 1986 by Huber
et al. (R. Huber et al., Arch. Microbiol., 144:324 [1986]; and Int.
J. Syst. Bacteriol., 36:575 [1986]). At that time, only one
species, T. maritima, had been described. T. neapolitana was first
described by Jannasch et al. in 1988 (Jannasch et al., Arch.
Microbiol., 150:103-104 [1988]; and Int. J. Syst. Bacteriol., 39:93
[1989]). T. thermarum was described by Windberger et al. in 1989
(Windberger et al., Arch. Microbiol., 151:506-512; and Int. J.
Syst. Bacteriol., 42:327 [1992]).
[0006] These organisms were originally isolated from geothermally
heated marine sediments and hot springs. For example, T. maritima
has been isolated from geothermally heated sea floors in Italy, the
Azores, Indonesia, and Iceland as well as from continental,
solfataric springs in Africa. T. neapolitana has been isolated from
a submarine thermal vent near Naples and from continental,
solfataric springs in Africa.
[0007] Members of the genus Thermotoga are considered to be
hyperthermophilic: they grow at temperatures between 55 °C and
90 °C, with an optimum growth temperature of 70-80 °C. The
Thermotoga are strictly
anaerobic, non-sporing, Gram-negative rods, that ferment
carbohydrates, and may be motile by polar, lateral or peritrichous
flagella, although some strains are non-motile. The cells are
surrounded by a sheath-like outer structure which usually balloons
over the ends. In all species of Thermotoga, 1-4 cells may be
enclosed within one sheath.
[0008] T. maritima, T. neapolitana, and T. thermarum are distinct
species, as judged by numerous differences. For example, T.
maritima possesses a single subpolar flagellum and is motile, while
motile strains of T. neapolitana possess peritrichous flagella
(some strains are non-motile), and T. thermarum possesses lateral
flagella. In addition, T. neapolitana will grow in NaCl
concentrations ranging from 0.25 to 6.0%, while T. maritima will
grow in NaCl concentrations ranging from 0.25 to 3.75%, and T.
thermarum will grow in NaCl concentrations ranging from 0.2 to 0.55%
(J. G. Holt et al. (eds.), Bergey's Manual® of Determinative
Bacteriology 9th ed., Williams & Wilkins, Baltimore, [1994], p.
333). Also, there are differences in the susceptibility of these
species to rifampicin, and differences in the inhibitory effects of
hydrogen and sulfur on these species. Furthermore, at the
optimum growth temperature (80 °C), the doubling
time is about 45 min for T. neapolitana and about 75 min for T.
maritima. The G+C content of the DNA of T. maritima and T.
neapolitana is 46% and 41%, respectively. The DNA from T. maritima
and T. neapolitana shows only about 25-30% homology by DNA-DNA
hybridization studies.
[0009] A few of the enzymes of the Thermotoga and other
thermophilic genera have been studied in varying degrees of detail.
As discussed below, the use of thermophilic enzymes in industry has
been viewed as providing advantages over the use of mesophilic
enzymes.
[0010] II. Uses for Thermophilic Enzymes
[0011] Advances in molecular biology and industrial processes have
led to an increased interest in thermophilic organisms such as
Thermotoga. Of particular interest has been the development of
thermophilic enzymes for use in industrial processes such as the
detergent, flavor-enhancing, and starch industries. Indeed, the
cost savings associated with the longer storage stability and
higher activity at higher temperatures of thermophilic enzymes, as
compared to mesophilic enzymes, provide good reason to select and
develop thermophilic enzymes for industrial and biotechnology
applications. Thus, there has been much research conducted to
characterize enzymes from thermophilic organisms. However, some
thermophilic enzymes have less activity than their mesophilic
counterparts under similar conditions at the elevated temperatures
used in industry (typically temperatures in the range of
50-100 °C) (T. K. Ng and William R. Kenealy, "Industrial
Applications of Thermostable Enzymes," in T. D. Brock (ed.),
Thermophiles: General, Molecular, and Applied Microbiology, [1986],
John Wiley & Sons, New York, pp. 197-215). Thus, the choice of
a thermostable enzyme over a mesophilic one may not be as
beneficial as originally assumed. Nonetheless, of the $400 million
worth of enzymes sold worldwide in 1984, 90% were thermostable
enzymes used by the detergent and starch industries (Ng and
Kenealy, at p. 206). However, much research remains to be done to
characterize and compare thermophilic enzymes of importance in
areas such as molecular biology (e.g., polymerases, ligases,
topoisomerases, restriction endonucleases, etc.).
[0012] III. Thermophilic DNA Polymerases
[0013] Extensive research has been conducted on the isolation of
DNA polymerases from mesophilic organisms such as E. coli (See
e.g., Bessman et al., J. Biol. Chem. 223:171 [1957]; Buttin and
Kornberg, J. Biol. Chem. 241:5419 [1966]; and Joyce and Steitz,
Trends Biochem. Sci., 12:288-292 [1987]). Other mesophilic
polymerases have also been studied, such as those of Bacillus
licheniformis (Stenesh and McGowan, Biochim. Biophys. Acta 475:32-44
[1977]; Stenesh and Roe, Biochim. Biophys. Acta 272:156-166
[1972]); Bacillus subtilis (Low et al., J. Biol. Chem., 251:1311
[1976]; and Ott et al., J. Bacteriol., 165:951 [1986]); Salmonella
typhimurium (Harwood et al., J. Biol. Chem., 245:5614 [1970];
Hamilton and Grossman, Biochem., 13:1885 [1974]); Streptococcus
pneumoniae (Lopez et al., J. Biol. Chem., 264:4255 [1989]), and
Micrococcus luteus (Engler and Bessman, Cold Spring Harbor Symp.,
43:929 [1979]), to name but a few.
[0014] Somewhat less investigation has been made on the isolation
and purification of DNA polymerases from thermophilic organisms.
However, native (i.e., non-recombinant) and/or recombinant
thermostable DNA polymerases have been purified from various
organisms, as shown in Table 1 below.
TABLE 1. Polymerase Isolation From Thermophilic Organisms
Organism | Citation
Thermus aquaticus | Kaledin et al., Biochem., 45:494-501 (1980); Biokhimiya 45:644-651 (1980). Chien et al., J. Bacteriol., 127:1550 (1976). University of Cincinnati Master's thesis by A. Chien, "Purification and Characterization of DNA Polymerase from Thermus aquaticus," (1976). University of Cincinnati Master's thesis by D. B. Edgar, "DNA Polymerase From an Extreme Thermophile: Thermus aquaticus," (1974). U.S. Pat. No. 4,889,818*. U.S. Pat. No. 5,352,600*. U.S. Pat. No. 5,079,352*. European Patent Pub. No. 258,017*. PCT Pub. No. WO 94/26766*. PCT Pub. No. WO 92/06188*. PCT Pub. No. WO 89/06691*.
Thermotoga maritima | PCT Pub. No. WO 92/03556*.
Thermotoga strain FjSS3-B.1 | Simpson et al., Biochem. Cell Biol., 68:1292-1296 (1990).
Thermosipho africanus | PCT Pub. No. 92/06200*.
Thermus thermophilus | Myers and Gelfand, Biochem., 30:7661 (1991). PCT Pub. No. WO 91/09950*. PCT Pub. No. WO 91/09944*. Bechtereva et al., Nucleic Acids Res., 17:10507 (1989). Glukhov et al., Mol. Cell. Probes 4:435-443 (1990).
Thermus thermophilus | Carballeira et al., BioTech., 9:276-281 (1990). Ruttiman et al., Eur. J. Biochem., 149:41-46 (1985). Oshima et al., J. Biochem., 75:179-183 (1974). Sakaguchi and Yajima, Fed. Proc., 33:1492 (1974) (abstract).
Thermus flavus | Kaledin et al., Biochem., 46:1247-1254 (1981); Biokhimiya 46:1576-1584 (1981). PCT Pub. No. WO 94/26766*.
Thermus ruber | Kaledin et al., Biochem., 47:1515-1521 (1982); Biokhimiya 47:1785-1791 (1982).
Thermoplasma acidophilum | Hamal et al., Eur. J. Biochem., 190:517-521 (1990). Forterre et al., Can. J. Microbiol., 35:228-233 (1989).
Sulfolobus acidocaldarius | Salhi et al., J. Mol. Biol., 209:635-641 (1989). Salhi et al., Biochem. Biophys. Res. Comm., 167:1341-1347 (1990). Rella et al., Ital. J. Biochem., 39:83-99 (1990). Forterre et al., Can. J. Microbiol., 35:228-233 (1989). Rossi et al., System. Appl. Microbiol., 7:337-341 (1986). Klimczak et al., Nucleic Acids Res., 13:5269-5282 (1985). Elie et al., Biochim. Biophys. Acta 951:261-267 (1988).
Bacillus caldotenax | J. Biochem., 113:401-410 (1993).
Bacillus stearothermophilus | Sellmann et al., J. Bacteriol., 174:4350-4355 (1992). Stenesh and McGowan, Biochim. Biophys. Acta 475:32-44 (1977). Stenesh and Roe, Biochim. Biophys. Acta 272:156-166 (1972). Kaboev et al., J. Bacteriol., 145:21-26 (1981).
Methanobacterium thermoautotrophicum | Klimczak et al., Biochem., 25:4850-4855 (1986).
Thermococcus litoralis | Kong et al., J. Biol. Chem. 268:1965 (1993). U.S. Pat. No. 5,210,036*. U.S. Pat. No. 5,322,785*.
Pyrococcus furiosus | Lundberg et al., Gene 108:1 (1991). PCT Pub. WO 92/09689.
*Herein incorporated by reference.
[0015] Although the organisms listed in Table 1 are considered
thermophiles, many are in the archaebacteria, a group that is
evolutionarily distinct from the eubacterial genus Thermotoga.
[0016] In addition to native forms, modified forms of thermostable
DNA polymerases having reduced or absent 5' to 3' exonuclease
activity have been expressed and purified from T. aquaticus, T.
maritima, Thermus species sps17, Thermus species Z05, T.
thermophilus and T. africanus [PCT Publication No. 92/06200].
[0017] IV. Uses for Thermophilic DNA Polymerases
[0018] One application for thermostable DNA polymerases is the
polymerase chain reaction (PCR). The PCR process is described in
U.S. Pat. Nos. 4,683,195 and 4,683,202, the disclosures of which
are incorporated herein by reference. Primers, template, nucleoside
triphosphates, the appropriate buffer and reaction conditions, and
polymerase are used in the PCR process, which involves denaturation
of target DNA, hybridization of primers and synthesis of
complementary strands. The extension product of each primer becomes
a template for the production of the desired nucleic acid sequence.
If the polymerase employed in the PCR is a thermostable enzyme,
then polymerase need not be added after each denaturation step
because heat will not destroy the polymerase activity. Use of such
enzymes as Taq DNA polymerase allows repetitive heating/cooling
cycles without the requirement of fresh enzyme at each cooling
step. This represents a major advantage over the use of mesophilic
enzymes such as Klenow, as fresh enzyme must be added to each
individual reaction tube at every cooling step. The use of Taq in
PCR is disclosed in U.S. Pat. No. 4,965,188, EP Publ. No. 258,017,
and PCT Publ. No. 89/06691, herein incorporated by reference.
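The cycling described above can be illustrated with a minimal sketch, assuming an idealized reaction in which each cycle of denaturation, primer hybridization and extension copies a fixed fraction of the templates present. The function name `pcr_copies`, the `efficiency` parameter and the example numbers are hypothetical and are not taken from the cited patents.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of PCR cycles.

    efficiency is the fraction of templates copied in each cycle
    (1.0 means perfect doubling). It is an illustrative assumption,
    not a measured property of any particular polymerase.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie between 0 and 1")
    # Each cycle multiplies the copy number by (1 + efficiency)
    return initial_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    # Hypothetical run: 10 template molecules, 30 thermal cycles
    print(pcr_copies(10, 30))        # perfect doubling in every cycle
    print(pcr_copies(10, 30, 0.8))   # 80% of templates copied per cycle
```

In this sketch a 30-cycle run would require 30 separate enzyme additions per tube if a heat-labile enzyme such as Klenow were used, whereas a thermostable polymerase is added once.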
[0019] In addition to PCR, Taq DNA polymerase is widely used in
other molecular biology techniques including recombinant DNA
methods. For example, various forms of Taq have been used in a
combination method which utilizes PCR and reverse transcription
(see e.g., U.S. Pat. No. 5,322,770, herein incorporated by
reference). DNA sequencing methods have also been described which
utilize Taq (see e.g., U.S. Pat. No. 5,075,216, herein incorporated
by reference).
[0020] However, Taq DNA polymerase has certain characteristics
which are undesirable in PCR and other applications including the
presence of 5' to 3' exonuclease activity. When thermostable DNA
polymerases which have 5' to 3' exonuclease activity (Taq, Tma,
Tsps17, TZ05, Tth and Taf) are used in the PCR process and other
methods, a variety of undesirable results have been observed,
including a limitation of the amount of PCR product produced, an
impaired ability to generate long PCR products or to amplify
regions containing significant secondary structure, the production
of shadow bands or the attenuation in signal strength of desired
termination bands during DNA sequencing, the degradation of the 5'
end of oligonucleotide primers in the context of double-stranded
primer-template complex, nick-translation synthesis during
oligonucleotide-directed mutagenesis and the degradation of the RNA
component of RNA:DNA hybrids. When utilized in a PCR process with
double-stranded primer-template complex, the 5' to 3' exonuclease
activity of a DNA polymerase may result in the degradation of the
5' end of the oligonucleotide primers. This activity is not only
undesirable in PCR, but also in second-strand cDNA synthesis and
sequencing processes.
[0021] In the choice of enzyme for sequencing, various factors must
be considered. For example, large quantities of the enzyme should
be easy to prepare; the enzyme must be stable upon storage for
considerable time periods; the enzyme should accept all deoxy and
dideoxy nucleotides and analogues as substrates with equal
affinities and high fidelity; the polymerase action should be
highly processive over nucleotide extensions to 1 kb and beyond,
even through regions of secondary structure within the template;
the activity should remain high, even in suboptimal conditions; and
it should be inexpensive (A. T. Bankier, "Dideoxy sequencing
reactions using Klenow fragment DNA polymerase I," in H. and A.
Griffin (eds.), Methods in Molecular Biology: DNA Sequencing
Protocols, Humana Press, Totowa, N.J., [1993], pp. 83-90).
Furthermore, the enzyme should be able to function at elevated
temperatures (i.e., greater than about 70 °C) so that
non-specific priming reactions are minimized. However, there are no
commercially available enzymes which fully meet all of these
criteria. Thus, mutant forms of enzymes have been produced in order
to address some of these needs.
[0022] For example, mutant forms of thermostable DNA polymerases
which exhibit reduced or absent 5' to 3' exonuclease activity have
been generated. The Stoffel fragment of Taq DNA polymerase lacks 5'
to 3' exonuclease activity due to genetic manipulations which
result in the production of a truncated protein lacking the
N-terminal 289 amino acids (See e.g., Lawyer et al., J. Biol.
Chem., 264:6427-6437 [1989]; and Lawyer et al., PCR Meth. Appl.,
2:275-287 [1993]). Analogous mutant polymerases have been generated
for polymerases derived from T. maritima, Tsps17, TZ05, Tth and
Taf. While the generation of thermostable polymerases lacking 5' to
3' exonuclease activity provides improved enzymes for certain
applications, some of these mutant polymerases still have
undesirable characteristics including the presence of 3' to 5'
exonuclease activity.
[0023] The 3' to 5' exonuclease activity is commonly referred to as
a proof-reading activity. The 3' to 5' exonuclease removes bases
which are mismatched at the 3' end of a primer-template duplex.
While the presence of 3' to 5' exonuclease activity may be
advantageous, as it leads to an increase in the fidelity of
replication of nucleic acid strands, it also has undesirable
characteristics. The 3' to 5' exonuclease activity found in
thermostable DNA polymerases such as Tma (including mutant forms of
Tma that lack 5' to 3' exonuclease activity) also degrades
single-stranded DNA such as the primers used in the PCR,
single-stranded templates and single-stranded PCR products. The
integrity of the 3' end of an oligonucleotide primer used in a
primer extension process (e.g., PCR, Sanger sequencing methods,
etc.) is critical as it is from this terminus that extension of the
nascent strand begins. Degradation of the 3' end leads to a
shortened oligonucleotide which in turn results in a loss of
specificity in the priming reaction (i.e., the shorter the primer
the more likely it becomes that spurious or non-specific priming
will occur).
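The link between primer length and priming specificity can be made concrete with a rough estimate. The sketch below assumes a random template with equal frequencies of the four bases, independent positions, and priming on either strand; under those assumptions the expected number of perfect-match sites for a primer of length n on a template of length L is about 2(L - n + 1)/4^n. The function name and the template length are hypothetical and serve only to illustrate the trend.

```python
def expected_random_sites(primer_length: int, template_length: int) -> float:
    """Expected number of perfect-match sites for a primer on a random template.

    Assumes (for illustration only) equal frequencies of A, C, G and T,
    independent positions, and that both strands can be primed.
    """
    if primer_length < 1:
        raise ValueError("primer_length must be at least 1")
    # Number of start positions where the primer fits on one strand
    positions = max(template_length - primer_length + 1, 0)
    # A perfect match at one position has probability (1/4) ** primer_length
    return 2 * positions / 4 ** primer_length


if __name__ == "__main__":
    template = 2_000_000  # hypothetical template length in bases
    # Successively shorter primers, as would result from 3' degradation
    for n in (20, 17, 14, 11):
        print(n, expected_random_sites(n, template))
```

Each base removed from the primer multiplies the expected number of chance matches by roughly four, which is why degradation at the 3' end erodes specificity so quickly.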
[0024] The degradation of an oligonucleotide primer by a 3'
exonuclease can be prevented by the use of modified nucleotides at
the 3' terminus. For example, the use of dideoxynucleotides or
deoxynucleotides having a phosphorothioate linkage at the 3'
terminus of an oligonucleotide would prevent degradation by 3'
exonucleases. However, the need to use modified nucleotides to
prevent degradation of oligonucleotides by 3' exonuclease increases
the time and cost required to prepare oligonucleotide primers.
[0025] A few examples of thermostable polymerases which lack both
5' to 3' exonuclease and 3' to 5' exonuclease activities are known. As
discussed above, the Stoffel fragment of Taq DNA polymerase lacks
the 5' to 3' exonuclease activity due to genetic manipulation and
no 3' to 5' activity is present as Taq polymerase is naturally
lacking in 3' to 5' exonuclease activity. Likewise the Tth
polymerase naturally lacks 3' to 5' exonuclease activity and
genetic deletion of N-terminal amino acids removes the 5' to 3'
exonuclease activity.
[0026] Despite the development of recombinant enzymes such as
Stoffel fragment, there remains a need for other thermostable
polymerases having improved characteristics. For example,
thermostable polymerases are used in Sanger dideoxynucleotide
sequencing protocols. The most commonly used enzyme is Taq
polymerase or a modified form of Taq polymerase. High
concentrations of the expensive dideoxynucleotides must be used in
the sequencing reaction when these enzymes are employed as they
have a fairly low affinity for dideoxynucleotides. The art needs a
thermostable polymerase which displays a higher affinity for
dideoxynucleotides as this would result in considerable cost
savings. In addition, the art needs additional thermostable
polymerases having novel properties to improve the results obtained
when using techniques such as DNA amplification, sequencing and
nick-translation.
SUMMARY OF THE INVENTION
[0027] The present invention relates to purified thermostable DNA
polymerases derived from the eubacterium Thermotoga neapolitana
(Tne). Nucleic acid sequences encoding the full-length Tne DNA
polymerase are provided. In addition, nucleic acid sequences
encoding several modified forms of the Tne DNA polymerase are
provided herein. The present invention provides methods for the
isolation of purified preparations of Tne DNA polymerases. The Tne
DNA polymerases may be isolated from Thermotoga neapolitana cells
or from host cells containing nucleic acid sequences encoding a Tne DNA
polymerase.
[0028] In one embodiment, the present invention contemplates a
purified thermostable DNA polymerase derived from the eubacterium
Thermotoga neapolitana which is capable of DNA synthetic activity.
In another embodiment, the purified Tne DNA polymerase has 3'
exonuclease activity. In yet another embodiment, the purified Tne
DNA polymerase has 5' exonuclease activity. In one preferred
embodiment, the purified Tne DNA polymerase comprises the amino
acid sequence of SEQ ID NO:2. In a particularly preferred
embodiment, the specific activity of the synthetic activity of the
purified Tne DNA polymerase is approximately 100,000 units/mg.
[0029] In another embodiment, the purified thermostable Tne DNA
polymerase is a non-naturally occurring or recombinant DNA
polymerase. The recombinant Tne DNA polymerase may further contain
3' exonuclease activity and/or 5' exonuclease activity.
[0030] In a preferred embodiment, the non-naturally occurring Tne
DNA polymerase has reduced levels of 3' exonuclease activity. In
another embodiment, the non-naturally occurring Tne DNA polymerase
lacks significant 5' exonuclease activity. In a particularly
preferred embodiment, the non-naturally occurring Tne DNA
polymerase comprises an amino acid sequence selected from the group
consisting of SEQ ID NOS: 8, 16, 19, 23, 26, 29, 33 and 35.
[0031] The present invention provides nucleic acid sequences
encoding thermostable DNA polymerases. In a preferred embodiment,
an oligonucleotide comprising the nucleic acid sequence of SEQ ID
NO:1 encodes the thermostable DNA polymerase. These nucleic acid
sequences encoding thermostable DNA polymerases may be modified to
encode a polymerase which lacks significant 5' exonuclease
activity. In a preferred embodiment, the modified nucleic acid
sequences encoding a thermostable DNA polymerase comprise the
nucleotide sequence of SEQ ID NO:7. In one embodiment, the
polymerase encoded by the modified nucleic acid sequences displays
reduced levels of 3' exonuclease activity. In a particularly
preferred embodiment, the modified nucleic acid sequences encoding
a polymerase having reduced levels of 3' exonuclease activity are
selected from the group consisting of SEQ ID NOS:7, 15, 18, 22, 25,
28, 32 and 34.
[0032] The present invention provides recombinant DNA vectors
containing nucleic acid sequences which encode a thermostable DNA
polymerase having DNA synthetic activity. In a preferred embodiment
the polymerase-encoding nucleic acid sequences are set forth in SEQ
ID NO:1. The recombinant DNA vector may contain a modified nucleic
acid sequence encoding a thermostable DNA polymerase which lacks
significant 5' exonuclease activity. In a preferred embodiment, the
recombinant DNA vector contains a modified nucleic acid sequence
which comprises SEQ ID NO:7.
[0033] In a preferred embodiment, the recombinant DNA vector
contains modified nucleic acid sequences encoding a thermostable
DNA polymerase which exhibits reduced levels of 3' exonuclease
activity. In a particularly preferred embodiment, the modified
sequences encoding a thermostable DNA polymerase which exhibits
reduced levels of 3' exonuclease activity are selected from the group
consisting of SEQ ID NOS:7, 15, 18, 22, 25, 28, 32 and 34.
[0034] The present invention further contemplates the
transformation of host cells with the recombinant DNA vectors
containing nucleic acid sequences encoding Tne DNA polymerases. The
invention is not limited by the choice of host cell; host cells may
comprise procaryotic or eucaryotic cells. In a preferred
embodiment, the host cell is an E. coli host cell.
[0035] The invention further provides methods for determining the
DNA sequence of a segment or portion of a DNA molecule using the
Tne DNA polymerases of the invention. Dideoxynucleotide (ddNTP)
chain termination sequencing protocols are used in conjunction with
the polymerases of the invention. Traditional (i.e., Sanger)
protocols as well as other methods, including but not limited to
chain termination sequencing and thermal cycle sequencing protocols,
benefit from the use of the Tne DNA polymerases of the invention.
The claimed Tne DNA polymerases have a high affinity for
dideoxynucleotides; accordingly the following ratios of dNTPs and
ddNTPs are contemplated for use in either thermal cycling or Sanger
chain termination protocols when Tne DNA polymerases are employed:
dATP:ddATP is 1:1.67 ± 50%; dCTP:ddCTP is 1:0.83 ± 50%;
dGTP:ddGTP is 1:0.67 ± 50%; and TTP:ddTTP is 1:2.5 ± 50%, where
each dNTP is present at a final concentration of about 1 µM to
120 µM.
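As a worked illustration of the ratios above, the following sketch computes the ddNTP concentration window that corresponds to a chosen dNTP concentration. It assumes that the stated 50% tolerance applies to the ddNTP part of each ratio and treats the concentration range of 1 to 120 micromolar as a hard limit, although the text says "about". The names `DDNTP_PER_DNTP`, `TOLERANCE` and `ddntp_window` are hypothetical.

```python
# ddNTP parts per one part dNTP, taken from the ratios stated in the text
DDNTP_PER_DNTP = {"A": 1.67, "C": 0.83, "G": 0.67, "T": 2.5}

# The stated +/-50%, applied here to the ddNTP part of the ratio (assumption)
TOLERANCE = 0.5


def ddntp_window(base: str, dntp_um: float) -> tuple[float, float, float]:
    """Return (low, nominal, high) ddNTP concentration in micromolar."""
    if base not in DDNTP_PER_DNTP:
        raise ValueError("base must be one of A, C, G or T")
    # The text gives "about 1 to 120" micromolar; a strict check is a simplification
    if not 1.0 <= dntp_um <= 120.0:
        raise ValueError("dNTP concentration outside the stated 1-120 uM range")
    nominal = DDNTP_PER_DNTP[base] * dntp_um
    return nominal * (1 - TOLERANCE), nominal, nominal * (1 + TOLERANCE)


if __name__ == "__main__":
    # Hypothetical reaction with every dNTP at 40 uM
    for base in "ACGT":
        print(base, ddntp_window(base, 40.0))
```

For example, with TTP at 40 micromolar the nominal ddTTP concentration under this reading is 100 micromolar, with an allowed window of 50 to 150 micromolar.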
DESCRIPTION OF THE DRAWINGS
[0036] FIG. 1 provides a schematic representation of the 5'
exonuclease, 3' exonuclease and polymerase domains in several DNA
polymerases.
[0037] FIG. 2 shows an alignment of amino acid residues from three
regions within the 3' exonuclease domain of selected DNA
polymerases.
[0038] FIG. 3 shows the alignment of the amino acid residues (using
the one letter code for the amino acids) from E. coli DNA
polymerase I, Tma DNA polymerase and Tne DNA polymerase.
[0039] FIG. 4 provides a schematic depicting the full length and
mutant Tne DNA polymerases of the present invention.
[0040] FIG. 5A shows an autoradiograph of a sequencing gel.
[0041] FIG. 5B shows an autoradiograph of a sequencing gel.
[0042] FIG. 6A shows an autoradiograph of a sequencing gel.
[0043] FIG. 6B shows an autoradiograph of a sequencing gel.
[0044] FIG. 6C shows an autoradiograph of a sequencing gel.
DEFINITIONS
[0045] To facilitate understanding of the invention, a number of
terms are defined below.
[0046] The term "gene" refers to a DNA sequence that comprises
control and coding sequences necessary for the production of a
polypeptide or precursor. The polypeptide can be encoded by a full
length coding sequence or by any portion of the coding sequence so
long as the desired enzymatic activity is retained.
[0047] The term "wild-type" refers to a gene or gene product which
has the characteristics of that gene or gene product when isolated
from a naturally occurring source. A wild-type gene is that which
is most frequently observed in a population and is thus arbitrarily
designated the "normal" or "wild-type" form of the gene. In
contrast, the term "modified" or "mutant" refers to a gene or gene
product which displays modifications in sequence and/or functional
properties (i.e., altered characteristics) when compared to the
wild-type gene or gene product. It is noted that
naturally-occurring mutants can be isolated; these are identified
by the fact that they have altered characteristics when compared to
the wild-type gene or gene product. The wild-type form of the
coding region for the Tne DNA polymerase is listed in SEQ ID NO:1;
the wild-type form of the Tne DNA polymerase protein is listed in
SEQ ID NO:2. The Tne DNA polymerase proteins encoded by "modified"
or "mutant" genes are referred to as non-naturally occurring Tne
DNA polymerases.
[0048] The term "recombinant DNA vector" as used herein refers to
DNA sequences containing a desired coding sequence and appropriate
DNA sequences necessary for the expression of the operably linked
coding sequence in a particular host organism. DNA sequences
necessary for expression in procaryotes include a promoter,
optionally an operator sequence, a ribosome binding site and
possibly other sequences. Eucaryotic cells are known to utilize
promoters, polyadenylation signals and enhancers.
[0049] As used herein, the terms "cell," "cell line," and "cell
culture" are used interchangeably and all such designations include
progeny. The words "transformants" or "transformed cells" include
the primary transformed cell and cultures derived from that cell
without regard to the number of transfers. All progeny may not be
precisely identical in DNA content, due to deliberate or
inadvertent mutations. Mutant progeny that have the same
functionality as screened for in the originally transformed cell
are included in the definition of transformants.
[0050] As used herein, the term "vector" is used in reference to
nucleic acid molecules that transfer DNA segment(s) from one cell
to another. The term "vehicle" is sometimes used interchangeably
with "vector."
[0051] The term "expression vector" as used herein refers to a
recombinant DNA molecule containing a desired coding sequence and
appropriate nucleic acid sequences necessary for the expression of
the operably linked coding sequence in a particular host organism.
Nucleic acid sequences necessary for expression in procaryotes
usually include a promoter, an operator (optional), and a ribosome
binding site, often along with other sequences. Eucaryotic cells
are known to utilize promoters, enhancers, and termination and
polyadenylation signals.
[0052] The terms "in operable combination", "in operable order" and
"operably linked" as used herein refer to the linkage of nucleic
acid sequences in such a manner that a nucleic acid molecule
capable of directing the transcription of a given gene and/or the
synthesis of a desired protein molecule is produced. The term also
refers to the linkage of amino acid sequences in such a manner so
that a functional protein is produced.
[0053] The term "transfection" as used herein refers to the
introduction of foreign DNA into eucaryotic cells. Transfection may
be accomplished by a variety of means known to the art including
calcium phosphate-DNA co-precipitation, DEAE-dextran-mediated
transfection, polybrene-mediated transfection, electroporation,
microinjection, liposome fusion, lipofection, protoplast fusion,
retroviral infection, and biolistics.
[0054] As used herein, the terms "complementary" or
"complementarity" are used in reference to polynucleotides (i.e., a
sequence of nucleotides) related by the base-pairing rules. For
example, the sequence "A-G-T" is complementary to the sequence
"T-C-A." Complementarity may be "partial," in which only some of the
nucleic acids' bases are matched according to the base pairing
rules. Or, there may be "complete" or "total" complementarity between
the nucleic acids. The degree of complementarity between nucleic acid
strands has significant effects on the efficiency and strength of
hybridization between nucleic acid strands. This is of particular
importance in amplification reactions, as well as detection methods
which depend upon binding between nucleic acids.
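The base-pairing rules and the distinction between partial and complete complementarity can be sketched in a few lines of Python. The sketch compares two strands position by position, exactly as the "A-G-T" and "T-C-A" example is written, and ignores strand orientation. The function names are hypothetical, and the fraction it reports is only an illustration, not a measure defined in this specification.

```python
# Watson-Crick pairing rules for DNA.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq):
    """Return the base-by-base complement of a DNA sequence."""
    return "".join(PAIRS[base] for base in seq.upper())


def fraction_complementary(strand_a, strand_b):
    """Fraction of aligned positions at which the two strands can pair."""
    if not strand_a or len(strand_a) != len(strand_b):
        raise ValueError("strands must be non-empty and of equal length")
    paired = sum(
        PAIRS[a] == b for a, b in zip(strand_a.upper(), strand_b.upper())
    )
    return paired / len(strand_a)


if __name__ == "__main__":
    print(complement("AGT"))                      # complement of A-G-T
    print(fraction_complementary("AGT", "TCA"))   # complete complementarity
    print(fraction_complementary("AGT", "TCG"))   # partial complementarity
```

A returned value of 1.0 corresponds to "complete" complementarity in the sense used above, and any value between 0 and 1 corresponds to "partial" complementarity.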
[0055] The term "homology" refers to a degree of complementarity.
There may be partial homology or complete homology (i.e.,
identity). A partially complementary sequence that at least
partially inhibits a completely complementary sequence from
hybridizing to a target nucleic acid is referred to using the
functional term "substantially homologous." The inhibition of
hybridization of the completely complementary sequence to the
target sequence may be examined using a hybridization assay
(Southern or Northern blot, solution hybridization and the like)
under conditions of low stringency. A substantially homologous
sequence or probe will compete for and inhibit the binding (i.e.,
the hybridization) of a completely homologous sequence to a target under
conditions of low stringency. This is not to say that conditions of
low stringency are such that non-specific binding is permitted; low
stringency conditions require that the binding of two sequences to
one another be a specific (i.e., selective) interaction. The
absence of non-specific binding may be tested by the use of a
second target which lacks even a partial degree of complementarity
(e.g., less than about 30% identity); in the absence of
non-specific binding the probe will not hybridize to the second
non-complementary target.
[0056] Low stringency conditions comprise conditions equivalent to
binding or hybridization at 42°C in a solution consisting
of 5× SSPE (43.8 g/l NaCl, 6.9 g/l NaH2PO4·H2O
and 1.85 g/l EDTA, pH adjusted to 7.4 with NaOH), 0.1% SDS,
5× Denhardt's reagent [50× Denhardt's contains per 500
ml: 5 g Ficoll (Type 400, Pharmacia), 5 g BSA (Fraction V; Sigma)]
and 100 µg/ml denatured salmon sperm DNA followed by washing in
a solution comprising 5× SSPE, 0.1% SDS at 42°C when
a probe of about 500 nucleotides in length is employed.
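The recipe above scales linearly with volume, which the following Python sketch illustrates. It computes the masses needed for an arbitrary volume of the 5-fold SSPE solution from the listed grams per liter, and the volume of a concentrated stock needed to reach a lower final strength by the usual C1 x V1 = C2 x V2 dilution rule. The function names and the example volumes are assumptions made for illustration only.

```python
# Grams per liter of each component in the 5-fold SSPE solution, as listed.
SSPE_5X_G_PER_L = {"NaCl": 43.8, "NaH2PO4.H2O": 6.9, "EDTA": 1.85}


def sspe_5x_masses(volume_ml):
    """Grams of each component needed for volume_ml of 5-fold SSPE."""
    return {name: g_per_l * volume_ml / 1000.0
            for name, g_per_l in SSPE_5X_G_PER_L.items()}


def stock_volume_ml(stock_fold, final_fold, final_volume_ml):
    """Volume of stock to dilute, from C1 * V1 = C2 * V2."""
    if stock_fold <= 0 or final_fold > stock_fold:
        raise ValueError("stock must be more concentrated than the final mix")
    return final_volume_ml * final_fold / stock_fold


if __name__ == "__main__":
    for name, grams in sspe_5x_masses(500).items():
        print(f"{name}: {grams:.2f} g per 500 ml")
    # 50-fold Denhardt's stock diluted to 5-fold in a 100 ml hybridization mix
    print(stock_volume_ml(50, 5, 100), "ml of 50-fold Denhardt's")
```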
[0057] The art knows well that numerous equivalent conditions may
be employed to comprise low stringency conditions; factors such as
the length and nature (DNA, RNA, base composition) of the probe and
nature of the target (DNA, RNA, base composition, present in
solution or immobilized, etc.) and the concentration of the salts
and other components (e.g., the presence or absence of formamide,
dextran sulfate, polyethylene glycol) are considered and the
hybridization solution may be varied to generate conditions of low
stringency hybridization different from, but equivalent to, the
above listed conditions. In addition, the art knows conditions
which promote hybridization under conditions of high stringency
(e.g., increasing the temperature of the hybridization and/or wash
steps, the use of formamide in the hybridization solution,
etc.).
[0058] When used in reference to a double-stranded nucleic acid
sequence such as a cDNA or genomic clone, the term "substantially
homologous" refers to any probe which can hybridize to either or
both strands of the double-stranded nucleic acid sequence under
conditions of low stringency as described above.
[0059] When used in reference to a single-stranded nucleic acid
sequence, the term "substantially homologous" refers to any probe
which can hybridize (i.e., it is the complement of) the
single-stranded nucleic acid sequence under conditions of low
stringency as described above.
[0060] As used herein, the term "hybridization" is used in
reference to the pairing of complementary nucleic acids.
Hybridization and the strength of hybridization (i.e., the strength
of the association between the nucleic acids) is impacted by such
factors as the degree of complementarity between the nucleic acids,
stringency of the conditions involved, the Tm of the formed
hybrid, and the G:C ratio within the nucleic acids.
[0061] As used herein, the term "Tm" is used in reference to
the "melting temperature." The melting temperature is the
temperature at which a population of double-stranded nucleic acid
molecules becomes half dissociated into single strands. The
equation for calculating the Tm of nucleic acids is well known
in the art. As indicated by standard references, a simple estimate
of the Tm value may be calculated by the equation:
Tm = 81.5 + 0.41(% G+C), when a nucleic acid is in aqueous
solution at 1 M NaCl [see e.g., Anderson and Young, Quantitative
Filter Hybridization, in Nucleic Acid Hybridization (1985)]. Other
references include more sophisticated computations which take
structural as well as sequence characteristics into account for the
calculation of Tm.
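The simple estimate given above can be evaluated directly from a sequence. The Python sketch below computes the percent G+C of a DNA string and applies the stated equation, with the result read in degrees Celsius. It applies only to the stated condition of 1 M NaCl in aqueous solution and makes no correction for length, salt or formamide. The function names are illustrative.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def simple_tm(seq):
    """Simple melting temperature estimate: 81.5 + 0.41 * (% G+C)."""
    return 81.5 + 0.41 * gc_percent(seq)


if __name__ == "__main__":
    probe = "ATGCGCTTAAGGCC"  # arbitrary example sequence
    print(f"%GC = {gc_percent(probe):.1f}, estimated Tm = {simple_tm(probe):.1f}")
```

Because the estimate depends only on base composition, two sequences with the same percent G+C receive the same value regardless of their length or order of bases.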
[0062] As used herein the term "stringency" is used in reference to
the conditions of temperature, ionic strength, and the presence of
other compounds such as organic solvents, under which nucleic acid
hybridizations are conducted. With "high stringency" conditions,
nucleic acid base pairing will occur only between nucleic acid
fragments that have a high frequency of complementary base
sequences. Thus, conditions of "weak" or "low" stringency are often
required with nucleic acids that are derived from organisms that
are genetically diverse, as the frequency of complementary
sequences is usually less.
[0063] As used herein, the term "amplifiable nucleic acid" is used
in reference to nucleic acids which may be amplified by any
amplification method. It is contemplated that "amplifiable nucleic
acid" will usually comprise "sample template."
[0064] As used herein, the term "sample template" refers to nucleic
acid originating from a sample which is analyzed for the presence
of "target" (defined below). In contrast, "background template" is
used in reference to nucleic acid other than sample template which
may or may not be present in a sample. Background template is most
often inadvertent. It may be the result of carryover, or it may be
due to the presence of nucleic acid contaminants sought to be
purified away from the sample. For example, nucleic acids from
organisms other than those to be detected may be present as
background in a test sample.
[0065] As used herein, the term "primer" refers to an
oligonucleotide, whether occurring naturally as in a purified
restriction digest or produced synthetically, which is capable of
acting as a point of initiation of synthesis when placed under
conditions in which synthesis of a primer extension product which
is complementary to a nucleic acid strand is induced (i.e., in the
presence of nucleotides and an inducing agent such as DNA
polymerase and at a suitable temperature and pH). The primer is
preferably single stranded for maximum efficiency in amplification,
but may alternatively be double stranded. If double stranded, the
primer is first treated to separate its strands before being used
to prepare extension products. Preferably, the primer is an
oligodeoxyribonucleotide. The primer must be sufficiently long to
prime the synthesis of extension products in the presence of the
inducing agent. The exact lengths of the primers will depend on
many factors, including temperature, source of primer and the use
of the method.
[0066] As used herein, the term "probe" refers to an
oligonucleotide (i.e., a sequence of nucleotides), whether
occurring naturally as in a purified restriction digest or produced
synthetically, recombinantly or by PCR amplification, which is
capable of hybridizing to another oligonucleotide of interest. A
probe may be single-stranded or double-stranded. Probes are useful
in the detection, identification and isolation of particular gene
sequences. It is contemplated that any probe used in the present
invention will be labeled with any "reporter molecule," so that it is
detectable in any detection system, including, but not limited to
enzyme (e.g., ELISA, as well as enzyme-based histochemical assays),
fluorescent, radioactive, and luminescent systems. It is not
intended that the present invention be limited to any particular
detection system or label.
[0067] As used herein, the term "target," when used in reference to
the polymerase chain reaction, refers to the region of nucleic acid
bounded by the primers used for polymerase chain reaction. Thus,
the "target" is sought to be sorted out from other nucleic acid
sequences. A "segment" is defined as a region of nucleic acid
within the target sequence.
[0068] As used herein, the term "polymerase chain reaction" ("PCR")
refers to the method of K. B. Mullis, U.S. Pat. Nos. 4,683,195 and
4,683,202, hereby incorporated by reference, which describe a
method for increasing the concentration of a segment of a target
sequence in a mixture of genomic DNA without cloning or
purification. This process for amplifying the target sequence
consists of introducing a large excess of two oligonucleotide
primers to the DNA mixture containing the desired target sequence,
followed by a precise sequence of thermal cycling in the presence
of a DNA polymerase. The two primers are complementary to their
respective strands of the double stranded target sequence. To
effect amplification, the mixture is denatured and the primers then
annealed to their complementary sequences within the target
molecule. Following annealing, the primers are extended with a
polymerase so as to form a new pair of complementary strands. The
steps of denaturation, primer annealing and polymerase extension
can be repeated many times (i.e., denaturation, annealing and
extension constitute one "cycle"; there can be numerous "cycles")
to obtain a high concentration of an amplified segment of the
desired target sequence. The length of the amplified segment of the
desired target sequence is determined by the relative positions of
the primers with respect to each other, and therefore, this length
is a controllable parameter. By virtue of the repeating aspect of
the process, the method is referred to as the "polymerase chain
reaction" (hereinafter "PCR"). Because the desired amplified
segments of the target sequence become the predominant sequences
(in terms of concentration) in the mixture, they are said to be
"PCR amplified".
[0069] With PCR, it is possible to amplify a single copy of a
specific target sequence in genomic DNA to a level detectable by
several different methodologies (e.g., hybridization with a labeled
probe; incorporation of biotinylated primers followed by
avidin-enzyme conjugate detection; incorporation of
32P-labeled deoxynucleotide triphosphates, such as dCTP or
dATP, into the amplified segment). In addition to genomic DNA, any
oligonucleotide sequence can be amplified with the appropriate set
of primer molecules. In particular, the amplified segments created
by the PCR process itself are, themselves, efficient templates for
subsequent PCR amplifications.
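The reach of repeated cycling, from a single copy to a detectable level, can be illustrated numerically. The Python sketch below rests on a common idealization that the text does not itself state: each cycle multiplies the amount of target by (1 + efficiency), where an efficiency of 1.0 means perfect doubling. The function name, parameters and example values are illustrative.

```python
def copies_after(cycles, start_copies=1, efficiency=1.0):
    """Idealized target copy number after a number of PCR cycles.

    efficiency is the fraction of templates copied per cycle (0.0 to 1.0);
    1.0 corresponds to perfect doubling in every cycle.
    """
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    return start_copies * (1.0 + efficiency) ** cycles


if __name__ == "__main__":
    for n in (10, 20, 30):
        ideal = copies_after(n)
        realistic = copies_after(n, efficiency=0.8)
        print(f"{n} cycles: {ideal:.3g} copies ideal, {realistic:.3g} at 80%")
```

Real reactions plateau as reagents are consumed, so the model is only meaningful for the early, exponential cycles.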
[0070] "Amplification" is a special case of nucleic acid
replication involving template specificity. It is to be contrasted
with non-specific template replication (i.e., replication that is
template-dependent but not dependent on a specific template).
Template specificity is here distinguished from fidelity of
replication (i.e., synthesis of the proper polynucleotide sequence)
and nucleotide (ribo- or deoxyribo-) specificity. Template
specificity is frequently described in terms of "target"
specificity. Target sequences are "targets" in the sense that they
are sought to be sorted out from other nucleic acid. Amplification
techniques have been designed primarily for this sorting out.
[0071] Template specificity is achieved in most amplification
techniques by the choice of enzyme. Amplification enzymes are
enzymes that, under the conditions in which they are used, will process
only specific sequences of nucleic acid in a heterogeneous mixture of
nucleic acid. For example, in the case of Qβ replicase, MDV-1
RNA is the specific template for the replicase [D. L. Kacian et
al., Proc. Natl. Acad. Sci. USA 69:3038 (1972)]. Other nucleic acid
will not be replicated by this amplification enzyme. Similarly, in
the case of T7 RNA polymerase, this amplification enzyme has a
stringent specificity for its own promoters [M. Chamberlin et al.,
Nature 228:227 (1970)]. In the case of T4 DNA ligase, the enzyme
will not ligate the two oligonucleotides where there is a mismatch
between the oligonucleotide substrate and the template at the
ligation junction [D. Y. Wu and R. B. Wallace, Genomics 4:560
(1989)]. Finally, Taq and Pfu polymerases, by virtue of their
ability to function at high temperature, are found to display high
specificity for the sequences bounded and thus defined by the
primers; the high temperature results in thermodynamic conditions
that favor primer hybridization with the target sequences and not
hybridization with non-target sequences [PCR Technology, H. A.
Erlich (ed.) (Stockton Press 1989)].
[0072] As used herein, the terms "PCR product", "PCR fragment" and
"amplification product" refer to the resultant mixture of compounds
after two or more cycles of the PCR steps of denaturation,
annealing and extension are complete. These terms encompass the
case where there has been amplification of one or more segments of
one or more target sequences.
[0073] As used herein, the term "amplification reagents" refers to
those reagents (deoxyribonucleotide triphosphates, buffer, etc.),
needed for amplification except for primers, nucleic acid template
and the amplification enzyme. Typically, amplification reagents
along with other reaction components are placed and contained in a
reaction vessel (test tube, microwell, etc.).
[0074] As used herein, the terms "restriction endonucleases" and
"restriction enzymes" refer to bacterial enzymes, each of which cuts
double-stranded DNA at or near a specific nucleotide sequence.
[0075] As used herein, the term "recombinant DNA molecule" refers
to a DNA molecule which is comprised of segments of
DNA joined together by means of molecular biological
techniques.
[0076] DNA molecules are said to have "5' ends" and "3' ends"
because mononucleotides are reacted to make oligonucleotides in a
manner such that the 5' phosphate of one mononucleotide pentose
ring is attached to the 3' oxygen of its neighbor in one direction
via a phosphodiester linkage. Therefore, an end of an
oligonucleotide is referred to as the "5' end" if its 5' phosphate is
not linked to the 3' oxygen of a mononucleotide pentose ring and as
the "3' end" if its 3' oxygen is not linked to a 5' phosphate of a
subsequent mononucleotide pentose ring. As used herein, a nucleic
acid sequence, even if internal to a larger oligonucleotide, also
may be said to have 5' and 3' ends. In either a linear or circular
DNA molecule, discrete elements are referred to as being "upstream"
or 5' of the "downstream" or 3' elements. This terminology reflects
the fact that transcription proceeds in a 5' to 3' fashion along
the DNA strand. The promoter and enhancer elements which direct
transcription of a linked gene are generally located 5' or upstream
of the coding region. However, enhancer elements can exert their
effect even when located 3' of the promoter element and the coding
region. Transcription termination and polyadenylation signals are
located 3' or downstream of the coding region.
[0077] As used herein, the term "an oligonucleotide having a
nucleotide sequence encoding a gene" means a nucleic acid sequence
comprising the coding region of a gene or in other words the
nucleic acid sequence which encodes a gene product. The coding
region may be present in either a cDNA, genomic DNA or RNA form.
When present in a DNA form, the oligonucleotide may be
single-stranded (i.e., the sense strand) or double-stranded.
Suitable control elements such as enhancers/promoters, splice
junctions, polyadenylation signals, etc. may be placed in close
proximity to the coding region of the gene if needed to permit
proper initiation of transcription and/or correct processing of the
primary RNA transcript. Alternatively, the coding region utilized
in the expression vectors of the present invention may contain
endogenous enhancers/promoters, splice junctions, intervening
sequences, polyadenylation signals, etc. or a combination of both
endogenous and exogenous control elements.
[0078] As used herein, the term "regulatory element" refers to a
genetic element which controls some aspect of the expression of
nucleic acid sequences. For example, a promoter is a regulatory
element which facilitates the initiation of transcription of an
operably linked coding region. Other regulatory elements are
splicing signals, polyadenylation signals, termination signals,
etc. (defined infra).
[0079] Transcriptional control signals in eucaryotes comprise
"promoter" and "enhancer" elements. Promoters and enhancers consist
of short arrays of DNA sequences that interact specifically with
cellular proteins involved in transcription [Maniatis, T. et al.,
Science 236:1237 (1987)]. Promoter and enhancer elements have been
isolated from a variety of eucaryotic sources including genes in
yeast, insect and mammalian cells and viruses (analogous control
elements, i.e., promoters, are also found in procaryotes). The
selection of a particular promoter and enhancer depends on what
cell type is to be used to express the protein of interest. Some
eucaryotic promoters and enhancers have a broad host range while
others are functional in a limited subset of cell types [for review
see Voss, S. D. et al., Trends Biochem. Sci., 11:287 (1986) and
Maniatis, T. et al., supra (1987)]. For example, the SV40 early
gene enhancer is very active in a wide variety of cell types from
many mammalian species and has been widely used for the expression
of proteins in mammalian cells [Dijkema, R. et al., EMBO J. 4:761
(1985)]. Two other examples of promoter/enhancer elements active in
a broad range of mammalian cell types are those from the human
elongation factor 1α gene [Uetsuki, T. et al., J. Biol. Chem.,
264:5791 (1989), Kim, D. W. et al., Gene 91:217 (1990) and
Mizushima, S. and Nagata, S., Nuc. Acids. Res., 18:5322 (1990)] and
the long terminal repeats of the Rous sarcoma virus [Gorman, C. M.
et al., Proc. Natl. Acad. Sci. USA 79:6777 (1982)] and the human
cytomegalovirus [Boshart, M. et al., Cell 41:521 (1985)].
[0080] As used herein, the term "promoter/enhancer" denotes a
segment of DNA which contains sequences capable of providing both
promoter and enhancer functions (i.e., the functions provided by a
promoter element and an enhancer element, see above for a
discussion of these functions). For example, the long terminal
repeats of retroviruses contain both promoter and enhancer
functions. The enhancer/promoter may be "endogenous" or "exogenous"
or "heterologous." An "endogenous" enhancer/promoter is one which
is naturally linked with a given gene in the genome. An "exogenous"
or "heterologous" enhancer/promoter is one which is placed in
juxtaposition to a gene by means of genetic manipulation (i.e.,
molecular biological techniques) such that transcription of that
gene is directed by the linked enhancer/promoter.
[0081] The presence of "splicing signals" on an expression vector
often results in higher levels of expression of the recombinant
transcript in eucaryotic host cells. Splicing signals mediate the
removal of introns from the primary RNA transcript and consist of a
splice donor and acceptor site [Sambrook, J. et al., Molecular
Cloning: A Laboratory Manual, 2nd ed., Cold Spring Harbor
Laboratory Press, New York (1989) pp. 16.7-16.8]. A commonly used
splice donor and acceptor site is the splice junction from the 16S
RNA of SV40.
[0082] Efficient expression of recombinant DNA sequences in
eucaryotic cells requires signals directing the
efficient termination and polyadenylation of the resulting
transcript. Transcription termination signals are generally found
downstream of the polyadenylation signal and are a few hundred
nucleotides in length. The term "poly A site" or "poly A sequence"
as used herein denotes a DNA sequence which directs both the
termination and polyadenylation of the nascent RNA transcript.
Efficient polyadenylation of the recombinant transcript is
desirable as transcripts lacking a poly A tail are unstable and are
rapidly degraded. The poly A signal utilized in an expression
vector may be "heterologous" or "endogenous." An endogenous poly A
signal is one that is found naturally at the 3' end of the coding
region of a given gene in the genome. A heterologous poly A signal
is one which is isolated from one gene and placed 3'
of another gene. A commonly used heterologous poly A signal is the
SV40 poly A signal. The SV40 poly A signal is contained on a 237 bp
BamHI/BclI restriction fragment and directs both termination and
polyadenylation [J. Sambrook, supra, at 16.6-16.7].
[0083] Eucaryotic expression vectors may also contain "viral
replicons" or "viral origins of replication." Viral replicons are
viral DNA sequences which allow for the extrachromosomal
replication of a vector in a host cell expressing the appropriate
replication factors. Vectors which contain either the SV40 or
polyoma virus origin of replication replicate to high copy number
(up to 10^4 copies/cell) in cells that express the appropriate
viral T antigen. Vectors which contain the replicons from bovine
papillomavirus or Epstein-Barr virus replicate extrachromosomally
at low copy number (~100 copies/cell).
[0084] The Tne polymerases may be expressed in either procaryotic
or eucaryotic host cells. Nucleic acid encoding the Tne polymerase
may be introduced into bacterial host cells by a number of means
including transformation of bacterial cells made competent for
transformation by treatment with calcium chloride or by
electroporation. If the Tne polymerases are to be expressed in
eucaryotic host cells, nucleic acid encoding the Tne polymerase may
be introduced into eucaryotic host cells by a number of means
including calcium phosphate co-precipitation, spheroplast fusion,
electroporation and the like. When the eucaryotic host cell is a
yeast cell, transformation may be effected by treatment of the host
cells with lithium acetate or by electroporation.
[0085] As used herein, the terms "nucleic acid molecule encoding,"
"DNA sequence encoding," and "DNA encoding" refer to the order or
sequence of deoxyribonucleotides along a strand of deoxyribonucleic
acid. The order of these deoxyribonucleotides determines the order
of amino acids along the polypeptide (protein) chain. The DNA
sequence thus codes for the amino acid sequence.
[0086] The term "isolated" when used in relation to a nucleic acid,
as in "an isolated oligonucleotide" refers to a nucleic acid
sequence that is identified and separated from at least one
contaminant nucleic acid with which it is ordinarily associated in
its natural source. Isolated nucleic acid is thus present in a form
or setting that is different from that in which it is found in
nature. In contrast, non-isolated nucleic acids are nucleic acids
such as DNA and RNA found in the state they exist in nature. For
example, a given DNA sequence (e.g., a gene) is found on the host
cell chromosome in proximity to neighboring genes; RNA sequences,
such as a specific mRNA sequence encoding a specific protein, are
found in the cell as a mixture with numerous other mRNAs which
encode a multitude of proteins. However, isolated nucleic acid
encoding a Tne polymerase includes, by way of example, such nucleic
acid in cells ordinarily expressing a Tne polymerase where the
nucleic acid is in a chromosomal location different from that of
natural cells, or is otherwise flanked by a different nucleic acid
sequence than that found in nature. The isolated nucleic acid or
oligonucleotide may be present in single-stranded or
double-stranded form. When an isolated nucleic acid or
oligonucleotide is to be utilized to express a protein, the
oligonucleotide will contain at a minimum the sense or coding
strand (i.e., the oligonucleotide may be single-stranded), but may
contain both the sense and anti-sense strands (i.e., the
oligonucleotide may be double-stranded).
[0087] As used herein the term "coding region" when used in
reference to a structural gene refers to the nucleotide sequences
which encode the amino acids found in the nascent polypeptide as a
result of translation of a mRNA molecule. The coding region is
bounded on the 5' side by the nucleotide triplet "ATG" which
encodes the initiator methionine and on the 3' side by one of the
three triplets which specify stop codons (i.e., TAA, TAG, TGA).
[0088] As used herein, the term "purified" or "to purify" refers to
the removal of contaminants from a sample. For example, recombinant
Tne DNA polymerases are expressed in bacterial host cells and the
polymerases are purified by the removal of host cell proteins; the
percent of recombinant Tne DNA polymerase is thereby increased in
the sample.
[0089] The term "recombinant DNA molecule" as used herein refers to
a DNA molecule which is comprised of segments of DNA joined
together by means of molecular biological techniques.
[0090] The term "recombinant protein" or "recombinant polypeptide"
as used herein refers to a protein molecule which is expressed from
a recombinant DNA molecule.
[0091] The term "native protein" is used herein to indicate a
protein isolated from a naturally occurring (i.e., a
nonrecombinant) source. Molecular biological techniques may be used
to produce a recombinant form of a protein which has identical
properties when compared to the native form of the protein. The
term "rTne" is used to designate a recombinant form of Tne
polymerase. The terms "nTne" and "nTaq" are used to designate the
native forms of Tne polymerase and Taq polymerase,
respectively.
[0092] As used herein the term "portion" when in reference to a
protein (as in "a portion of a given protein") refers to fragments
of that protein. The fragments may range in size from four amino
acid residues to the entire amino acid sequence minus one amino
acid.
[0093] As used herein, the term "fusion protein" refers to a
chimeric protein containing the protein of interest (i.e., Tne DNA
polymerases and fragments thereof) joined to an exogenous protein
fragment (the fusion partner which consists of a non-Tne polymerase
protein). The fusion partner may enhance solubility of the Tne
polymerase protein as expressed in a host cell, may provide an
affinity tag to allow purification of the recombinant fusion
protein from the host cell or culture supernatant, or both. If
desired, the fusion partner may be removed from the protein of
interest (i.e., Tne DNA polymerase or fragments thereof) by a
variety of enzymatic or chemical means known to the art.
[0094] The term "5' exonuclease activity" refers to the presence of
an activity in a protein which is capable of removing nucleotides
from the 5' end of an oligonucleotide. 5' exonuclease activity may
be measured using any of the assays provided herein.
[0095] The term "3' exonuclease activity" refers to the presence of
an activity in a protein which is capable of removing nucleotides
from the 3' end of an oligonucleotide. 3' exonuclease activity may
be measured using any of the assays provided herein.
[0096] The terms "DNA polymerase activity," "synthetic activity"
and "polymerase activity" are used interchangeably and refer to the
ability of an enzyme to synthesize new DNA strands by the
incorporation of deoxynucleoside triphosphates. The examples below
provide assays for the measurement of DNA polymerase activity.
[0097] The term "reduced levels of 3' exonuclease" is used in
reference to the level of 3' exonuclease activity displayed by the
wild-type Tne DNA polymerase (i.e., the polymerase of SEQ ID NO:2)
and indicates that the modified or "non-naturally occurring"
polymerase exhibits lower levels of 3' exonuclease than does the
full-length or unmodified enzyme.
[0098] The phrase "lacks significant 5' exonuclease activity" is
used relative to the level of 5' exonuclease activity displayed by
the wild-type Tne DNA polymerase (i.e., the polymerase of SEQ ID
NO:2) and indicates that the modified or "non-naturally occurring"
polymerase exhibits such low levels of 5' exonuclease that the
measurement is at background levels in the assay.
DESCRIPTION OF THE INVENTION
[0099] The present invention provides purified thermostable DNA
polymerase I enzymes derived from Thermotoga neapolitana (Tne).
These thermostable enzymes comprise the wild-type form of the
enzyme as well as mutant forms which possess altered characteristics
relative to the wild-type enzyme. In particular, the present
invention provides deletion mutants which lack 5' exonuclease
activity. Further, the present invention provides modified forms of
Tne DNA polymerases which lack 5' exonuclease activity and have
reduced or absent 3' exonuclease activity.
[0100] The present invention also relates to an improved method of
determining the nucleotide sequence of a DNA molecule using chain
terminating dideoxynucleotides in conjunction with the modified Tne
DNA polymerases. The novel properties of the polymerases of the
invention provide improved enzymes for a variety of applications
which utilize thermostable DNA polymerases.
[0101] The description of the invention is divided into: I. General
Structural Features of Type A DNA Polymerases, II. Generation of
Tne DNA Polymerases, III. Use of Tne DNA Polymerases in the PCR and
IV. Use of Tne DNA Polymerases in DNA Sequencing Methods.
[0102] I. General Structural Features of DNA Polymerases
[0103] DNA polymerases (DNAPs), such as those isolated from E. coli
or from thermophilic bacteria of the genera Thermus or Thermotoga,
are enzymes that synthesize new DNA strands. Several of the known
DNAPs contain associated nuclease activities in addition to the
synthetic or polymerization activity of the enzyme.
[0104] Some DNAPs are known to remove nucleotides from the 5' and
3' ends of DNA chains [Kornberg, DNA Replication, W. H. Freeman and
Co., San Francisco, pp. 127-139 (1980)]. These nuclease activities
are usually referred to as 5' exonuclease and 3' exonuclease
activities, respectively. For example, the 5' exonuclease activity
located in the N-terminal domain of several DNAPs participates in
the removal of RNA primers during lagging strand synthesis during
DNA replication and the removal of damaged nucleotides during
repair. Some DNAPs, such as the E. coli DNA polymerase, also have a
3' exonuclease activity responsible for proof-reading during DNA
synthesis (Kornberg, supra).
[0105] DNAPs isolated from Thermus aquaticus (Taq), Thermus flavus
(Tfl) and Thermus thermophilus (Tth) have a 5' exonuclease
activity, but lack a functional 3' exonucleolytic domain [Tindall
and Kunkel, Biochem. 27:6008 (1988)]. However, the lack of a 3'
exonuclease domain is not a general feature of DNAPs derived from
thermophilic bacteria as DNA polymerases from the thermophiles
Thermotoga maritima (Tma), Bacillus caldotenax, Thermococcus
litoralis (Tli) and Pyrococcus furiosus (Pfu) do contain 3'
exonuclease activity.
[0106] The 5' nuclease activity associated with a number of
eubacterial Type A DNA polymerases has been found to reside in the
one-third N-terminal region of the protein as an independent
functional domain. In these polymerase molecules, the C-terminal
two-thirds of the molecule constitute the polymerization domain
which is responsible for the synthesis of DNA. Some Type A DNA
polymerases also have a 3' exonuclease activity associated with the
C-terminal two-thirds of the molecule. FIG. 1 provides a
schematic showing the location of the 5' exonuclease, 3'
exonuclease and polymerase domains of a number of eubacterial
DNAPs. As noted above, not all DNAPs contain both 5' and 3'
exonuclease domains.
[0107] FIG. 1 provides a schematic depicting the arrangement of the
5' exonuclease ("5' EXO"), 3' exonuclease ("3' EXO") and polymerase
("POL") domains in the DNA polymerases from phage T4 ("φT4"),
phage T7 ("φT7"), E. coli (DNA polymerase I; "Eco Pol I"), T.
aquaticus ("Taq"), T. maritima ("Tma") and T. neapolitana ("Tne").
The absence of a 3' exonuclease domain in Taq DNA polymerase is
indicated by the use of the line between the boxed 5' exonuclease
and polymerase domains; the absence of a 5' nuclease domain in
phage T4 polymerase is indicated by the absence of the term "5'
EXO" in the first boxed region of the molecule.
[0108] The 5' exonuclease activity and the polymerization activity
of DNAPs have been separated by proteolytic cleavage or genetic
manipulation of the polymerase molecule. The Klenow or large
proteolytic cleavage fragment of E. coli DNA polymerase I contains
the polymerase and 3' exonuclease activity but lacks the 5'
nuclease activity [Brutlag et al., Biochem. Biophys. Res. Commun.
37:982 (1969)]. The Stoffel fragment of DNAP Taq lacks the 5'
nuclease activity due to a genetic manipulation which deleted the
N-terminal 289 amino acids of the polymerase molecule [Erlich et
al., Science 252:1643 (1991)].
[0109] The removal of the 5' exonuclease domain from a DNAP may
affect the activity of the remaining domains. For example, removal
of the 5' exonuclease domain from the E. coli polymerase I protein
to generate the Klenow fragment affects the fidelity of the
remaining large polymerase domain. The fidelity of a DNA polymerase
involves several functions including the ability to discriminate
against errors when nucleotides are initially inserted,
discriminate against extension from misaligned or mispaired primer
termini and exonucleolytic removal of errors.
[0110] In comparison to the full-length enzyme, the Klenow fragment
exhibits altered base substitution error specificity and is less
accurate for minus one base frameshift errors at reiterated
template nucleotides [Bebenek et al., J. Biol. Chem. 265:13878
(1990)]. Thus, the removal of the 5' exonuclease domain of E. coli
DNA polymerase I adversely affects the fidelity of the remaining 3'
exonuclease and synthetic domains.
[0111] Removal of a 5' exonuclease domain does not always adversely
affect the fidelity of the resultant polymerase fragment. KlenTaq, a
truncated version of Taq DNA polymerase which lacks the first 235
N-terminal amino acids (which include the 5' exonuclease domain),
has been reported to exhibit a two-fold improvement in fidelity
[Barnes, Gene 112:29 (1992)].
[0112] Comparison of amino acid sequence in the 3' exonuclease
domain of a number of DNAPs has identified three domains, termed Exo
I-III, which are highly conserved between a variety of mesophilic
and thermophilic organisms [Bernad et al. Cell 59:219 (1989)]. FIG.
2 provides a schematic drawing which aligns the amino acid residues
from a number of DNAPs over the 3' exonuclease domain. In FIG. 2,
the one letter code is used for the amino acids; the numbers
represent the amino acid residue in a given polymerase. In FIG. 2,
residues which are highly conserved are indicated by the use of
white letters within a black box. Portions of the 3' exonuclease
domain of the following polymerases are shown: Bacillus subtilis (Bsu)
polymerase III; E. coli (Eco) polymerase III ε; phage T4,
phage T7, E. coli polymerase I, T. maritima (Tma) polymerase and T.
neapolitana (Tne) polymerase. The "∇" indicates amino acid
residues involved in single strand DNA binding; the "Δ"
indicates amino acid residues involved in metal binding and
catalysis.
[0113] Site-directed mutagenesis experiments have identified a
subset of these conserved residues as being critical for 3'
exonuclease activity in E. coli polymerase I. The critical residues
include D355, D424 and D501, which are known to bind divalent metal
ions and are essential for 3' exonuclease activity; mutation of
these residues reduces 3' exonuclease activity several thousand
fold. L361, F473 and Y497 are also important for 3' exonuclease
activity and are believed to ensure correct positioning of the
substrate in the active site. Mutation of L361 and Y497 reduces 3'
exonuclease activity 12.5 to 25-fold; mutation of F473 reduces 3'
exonuclease activity about 3000-fold.
[0114] PCT Publ. No. WO 92/03556 states that three characteristic
domains are critical for 3' exonuclease activity in thermostable
DNA polymerases; however, no site-directed mutagenesis is shown for
any of the "critical" residues and no 3' exonuclease activity is
reported for any of the mutant forms of Tma DNA polymerase
(primarily deletion mutants) shown. The three domains identified in
PCT Publ. No. WO 92/03556 are Domain A, which comprises
D-X-E-X³-L; Domain B, which comprises N-X³-D-X³-L
and Domain C, which comprises Y-X³-D, where Xⁿ represents
the number (n) of non-critical amino acids between the specified
amino acids. As shown in FIG. 2, the location, sequence and spacing
of these three domains found in polymerases derived from
thermophilic organisms is consistent with the three domains
identified in polymerases derived from mesophilic organisms.
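The three domain descriptions above can be read as simple sequence patterns. The following is a minimal sketch in Python, assuming that X stands for any single amino acid and that the superscript gives the number of consecutive X positions. The function name find_exo_motifs and the test peptide are hypothetical and are not taken from any sequence listing.

```python
import re

# Patterns for Domains A, B and C as described in the text.
# Assumption: "X" is any one amino acid; a superscript 3 means three of them.
EXO_MOTIFS = {
    "Domain A": r"D.E.{3}L",      # D-X-E-X3-L
    "Domain B": r"N.{3}D.{3}L",   # N-X3-D-X3-L
    "Domain C": r"Y.{3}D",        # Y-X3-D
}


def find_exo_motifs(protein):
    """Return {domain: [(1-based start, matched text), ...]} for a one-letter sequence."""
    hits = {}
    for name, pattern in EXO_MOTIFS.items():
        # A lookahead wrapper lets overlapping candidate matches be reported too.
        scanner = re.compile("(?=(" + pattern + "))")
        hits[name] = [(m.start() + 1, m.group(1)) for m in scanner.finditer(protein)]
    return hits


# Hypothetical peptide built only to exercise the three patterns.
toy_peptide = "MKDAEGGGLAAANQQQDRRRLAAAYVVVDK"
result = find_exo_motifs(toy_peptide)
```

Patterns this short will also match by chance in unrelated proteins, so a hit is only a candidate; the location and spacing relative to an alignment such as that of FIG. 2 would still need to be checked.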
[0115] While identification of residues which are highly conserved
between a number of species provides a starting point for the
design of site-directed mutagenesis experiments, it does not
provide an absolute prediction of the effect of a given mutation in
a particular protein. For example, the present invention shows that
substitution of the aspartate at position 468 of the Tne DNA
polymerase with an asparagine virtually eliminates the 3'
exonuclease activity [Tne M284(D468N)]. The analogous mutation in
the Klenow fragment of DNA polymerase I (D501N) reduces 3'
exonuclease activity only by 2-fold [Derbyshire et al., EMBO J.
10:17 (1991)]. These results underscore the fact that much remains
to be learned about structure-function relationships and that one
cannot predict, with certainty, the effect of a given mutation
based on analogy to other proteins.
[0116] II. Generation of Tne DNA Polymerases
[0117] The present invention provides wild-type and modified forms
of Tne DNA polymerases. The modified forms lack 5' exonuclease
activity and some modified forms also display reduced or absent 3'
exonuclease activity.
[0118] By the term "reduced or absent 3' exonuclease activity" it
is meant that the modified enzyme has less than the level of 3'
exonuclease activity found in the wild-type or unmodified enzyme
whose protein sequence is listed in SEQ ID NO:2. The modified Tne
polymerases of the present invention are advantageous in situations
where the polymerization (i.e., synthetic) activity of the enzyme
is desired but the presence of 5' exonuclease and/or 3' exonuclease
activity is not.
[0119] The present invention is not intended to be limited by the
nature of the alteration (e.g., deletion, insertion, substitution)
necessary to render the Tne polymerase deficient in 5' exonuclease
or 3' exonuclease activity. The present invention contemplates a
variety of methods, including but not limited to proteolysis and
genetic manipulation.
[0120] 1. Reduction of Exonuclease Activity by Proteolysis
[0121] Tne DNA polymerases having a reduced level of either or both
5' exonuclease and 3' exonuclease activity are produced according
to the present invention by physically cleaving the unmodified
enzyme with proteolytic enzymes to produce fragments of the enzyme
that are deficient in 5' and/or 3' exonuclease activity but retain
synthetic activity. The proteolysis can remove the N-terminal one
third of the protein (about residues 1 to 297 in SEQ ID NO:2) to
remove 5' exonuclease activity. Proteolytic cleavage which removes
all or a portion of the 3' exonuclease domain (about residues 298
to 482 in SEQ ID NO:2) will render the resulting enzyme deficient
in 3' exonuclease activity. Following proteolytic digestion, the
resulting fragments are separated by standard chromatographic
techniques and assayed for the ability to synthesize DNA and to act
as a 5' or 3' exonuclease. The assays to determine synthetic
activity and 5' and 3' exonuclease activity are described in the
experimental sections below.
[0122] 2. Reduction of Exonuclease Activity by Genetic
Manipulation
[0123] The examples below describe preferred methods for creating a
construct (i.e., a vector) encoding a polymerase derived from Tne
DNA polymerase I. The wild-type Tne polymerase is cloned by
isolating genomic DNA using molecular biological methods from T.
neapolitana cells. The genomic DNA is cleaved into fragments about
3 kb or larger using restriction enzymes and the fragments are
inserted into a suitable cloning vector such as a plasmid or
bacteriophage vector; the vectors containing fragments of T.
neapolitana genomic DNA are then transformed into a suitable E.
coli host. Clones containing DNA encoding the Tne polymerase may be
isolated using functional assays (i.e., presence of thermostable
polymerase in lysates of transformed cells) or by hybridization
using a probe derived from a region of conservation among DNA
polymerases derived from thermostable organisms. Alternatively, the
T. neapolitana genomic DNA may be used as the target in a
polymerase chain reaction (PCR) where the primers are selected from
regions of high sequence conservation among the genes encoding
thermostable DNA polymerases. Such a PCR may not amplify the entire
coding region of the Tne polymerase I gene; in such a case, the
full-length Tne gene could be isolated by using the amplified
fragment as a probe to screen a genomic library containing T.
neapolitana DNA.
[0124] Once the full-length Tne polymerase gene is obtained,
regions encoding the 5' exonuclease and/or 3' exonuclease may be
altered by a variety of means to reduce or eliminate these
activities. Suitable deletion and site-directed mutagenesis
procedures are described below in the examples.
[0125] Deletion of amino acids from the protein can be done either
by deletion of the encoding genetic material, or by introduction of
a translational stop codon by mutation or frame shift. In addition,
proteolytic treatment of the protein molecule can be performed to
remove segments of the protein.
[0126] In the examples below, specific alterations of the Tne
polymerase gene were: a deletion between residues 1-849, a deletion
between residues 1-945, a deletion between residues 1-966, deletion
between residues 1-966, a deletion between residues 1-849 and
residues 925-1272 and substitutions at residues 946, 947, 967, 968,
969, 975, 1166, 1167, 1391, 1402, 1407 and 1410. These modified
sequences are described below in the examples and at SEQ ID NOS:7,
10, 15, 18, 22, 25, 28, 32, 34, 36 and 38.
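The positions listed above can be related to amino acid positions by simple arithmetic. The sketch below assumes that these positions are nucleotide coordinates counted from the first base of the coding region; this reading is an assumption, although it is consistent with the D468N substitution and the name Tne M284 mentioned earlier. The helper names are hypothetical.

```python
def codon_of_nucleotide(nt_position):
    """Return the 1-based codon number that contains a 1-based nucleotide position."""
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_position - 1) // 3 + 1


def first_codon_after_deletion(last_deleted_nt):
    """Return the first intact codon when nucleotides 1..last_deleted_nt are removed."""
    if last_deleted_nt % 3 != 0:
        raise ValueError("deletion does not end on a codon boundary")
    return last_deleted_nt // 3 + 1


# Deletion end points and substituted positions as listed in the text
# (assumed here to be nucleotide coordinates).
deletion_ends = [849, 945, 966]
first_codons = {end: first_codon_after_deletion(end) for end in deletion_ends}
# first_codons == {849: 284, 945: 316, 966: 323}

substituted = [946, 947, 967, 968, 969, 975, 1166, 1167, 1391, 1402, 1407, 1410]
codons_hit = sorted({codon_of_nucleotide(nt) for nt in substituted})
# codons_hit == [316, 323, 325, 389, 464, 468, 469, 470]
```

Under this assumption, a deletion of positions 1-849 leaves a protein beginning at codon 284, and position 1402 is the first base of codon 468.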
[0127] Those skilled in the art know that single base changes can
be innocuous in terms of enzyme structure and function. Similarly
small additions and deletions can be present without substantially
changing the exonuclease or polymerase function of the wild-type or
modified Tne DNA polymerases. To test whether a particular change
is innocuous in terms of the effect upon enzymatic activity, the
polymerase encoded by a given DNA sequence is tested for the
presence of synthetic activity, 5' exonuclease activity and 3'
exonuclease activity as in the assays described in the examples
below. DNA sequences which contain alterations other than those
listed in SEQ ID NOS:7, 10, 15, 18, 22, 25, 28, 32, 34, 36 and 38
but which encode a polymerase molecule having the properties
associated with the polymerases encoded by the above SEQ ID NOS are
contained within the present invention.
[0128] Other deletions and substitutions are also suitable to
create modified Tne DNA polymerases lacking 5' and/or 3'
exonuclease activity. For example, given the degeneracy of the
genetic code, several DNA sequences may be used to introduce
substitutions which result in the expression of the same amino
acid. It is preferable that the alteration decrease the 5' and/or
3' exonuclease activity to a level which is low enough to provide
an improved enzyme for a variety of applications such as PCR and
chain termination sequencing (including thermal cycle sequencing)
as discussed below in the examples. These modifications will
preferably not reduce the synthetic activity of the modified
enzyme. Modified polymerases are tested for the presence of
synthetic activity and 5' and 3' exonuclease activity as in assays
described below. Thoughtful consideration of these assays allows
for the screening of candidate enzymes whose structure is
as yet unknown. In other words, construct "X" can be
evaluated according to the protocol described below to determine
whether it is a member of the genus of modified Tne polymerases of
the present invention as defined functionally, rather than
structurally.
[0129] The present invention contemplates that the nucleic acid
construct of the present invention be capable of expression in a
suitable host. In particular it is preferable that the expression
system chosen utilize a tightly controlled promoter such that
expression of the Tne polymerase is prevented until expression is
induced. In this manner, potential problems of toxicity of the
expressed polymerases to the host cells (and particularly to
bacterial host cells) are avoided. Those in the art know methods for
attaching various promoters and 3' sequences to a gene structure to
achieve efficient and tightly controlled expression. The examples
below disclose a number of suitable vectors and vector constructs.
Of course, there are other promoter/vector combinations that would
be suitable. The choice of a particular vector is also a function
of the type of host cell to be employed (i.e., procaryotic or
eucaryotic).
[0130] It is not necessary that a host organism be used for the
expression of the nucleic acid constructs of the invention. For
example, expression of the protein encoded by a nucleic acid
construct may be achieved through the use of a cell-free in vitro
transcription/translation system. An example of such a cell-free
system is the commercially available TnT™ Coupled Reticulocyte
Lysate System (Promega; this cell-free system is described in U.S.
Pat. No. 5,324,637, the disclosure of which is herein incorporated
by reference).
[0131] The nucleic acid construct containing DNA encoding the
wild-type or a modified Tne polymerase may provide for the addition
of exogenous sequences (i.e., sequences not encoded by the Tne
polymerase coding region) to either the 5' or 3' end of the Tne
polymerase coding region to allow for ease in purification of the
resulting polymerase protein (the resulting protein containing such
an affinity tag is termed a fusion protein). Several commercially
available expression vectors are available which provide for the
addition of affinity tags (an example of an exogenous sequence) to
either the amino or carboxy-termini of a coding region; in general
these affinity tags are short stretches of amino acids which do not
alter the characteristics of the protein to be expressed (i.e., no
change to enzymatic activities).
[0132] For example, the pET expression system (Novagen) utilizes a
vector which contains the T7 promoter and encodes the fusion protein
containing a short stretch of histidine residues at either end of
the protein, and a host cell which can be induced to express the T7
RNA polymerase (i.e., a DE3 host strain). The production of fusion
proteins containing a histidine tract is not limited to the use of
a particular expression vector and host strain. Several
commercially available expression vectors and host strains can be
used to express protein sequences as a fusion protein containing a
histidine tract (for example, the pQE series (pQE-8, 12, 16, 17,
18, 30, 31, 32, 40, 41, 42, 50, 51, 52, 60 and 70) of expression
vectors (Qiagen) which are used with the host strains M15[pREP4]
(Qiagen) and SG13009[pREP4] (Qiagen) can be used to express fusion
proteins containing six histidine residues at the amino-terminus of
the fusion protein). Additional expression systems which utilize
other affinity tags are known to the art.
[0133] Once a suitable nucleic acid construct has been made, the
Tne polymerase may be produced from the construct. The examples
below and standard molecular biological teachings enable one to
manipulate the construct by different suitable methods.
[0134] Once the desired Tne polymerase has been expressed, the
polymerase is tested for both synthetic and exonuclease activity as
described below.
[0135] III. Use of Tne DNA Polymerases in the PCR
[0136] The wild-type and modified Tne polymerases of the present
invention provide suitable and in some cases superior enzymes for
use in the PCR. As shown in the examples below, the wild-type and
modified forms of Tne polymerase were found to require the use of
fewer units of polymerase activity to produce a given amount of
product DNA in PCRs as compared to wild-type Taq DNA polymerase
(i.e., nTaq) or a modified form of Tma DNA polymerase (i.e.,
UlTma™). In addition, modified forms of Tne polymerase were
found to tolerate a broader range of dNTP concentrations and a
broader range of magnesium ion concentrations in the PCR. The
ability to tolerate a broad range of dNTP concentrations is important as it allows
flexibility in the dNTP concentrations to be used in a reaction;
additionally, the ability to tolerate a wide range of dNTP
concentrations demonstrates that the enzymes of the invention
provide for a robust PCR (i.e., the enzyme is not sensitive to
small variations in dNTP concentration). The ability to produce
only specific amplification products over a wide range of magnesium
ion concentration is advantageous for use in multiplexing PCR
reactions.
[0137] Several of the modified Tne polymerases provide enzymes
having greater resistance to thermal inactivation as compared to
nTaq or UlTma™ DNA polymerases. Greater thermal stability is
important for PCR applications as the greater the thermal stability
of the enzyme, the fewer units of enzyme must be used in the
PCR.
[0138] IV. Use of Tne DNA Polymerases in DNA Sequencing Methods
[0139] The sequence of a deoxyribonucleic acid molecule can be
elucidated using chemical [Maxam and Gilbert, Proc. Natl. Acad. Sci.
USA 74:560 (1977)] or enzymatic [Sanger et al., Proc. Natl. Acad.
Sci. USA 74:5463 (1977)] methods. The enzymatic method of
sequencing is based on the ability of a DNA polymerase to extend a
primer, hybridized to the template that is to be sequenced, until a
chain-terminating nucleotide is incorporated (referred to as chain
terminating sequencing). Each sequence determination is carried out
as a set of four separate reactions, each of which contains all
four deoxyribonucleoside triphosphates (dNTP) supplemented with a
limiting amount of a different dideoxyribonucleoside triphosphate
(ddNTP). Because ddNTPs lack the 3'-OH group necessary for chain
elongation, the growing oligonucleotide is terminated selectively
at G, A, T, or C, depending on the respective dideoxy analog in the
reaction.
[0140] The relative concentrations of each of the dNTPs and ddNTPs
can be adjusted to give a nested set of terminated chains over
several hundred to a few thousand bases in length. The resulting
fragments, each with a common origin but ending in a different
nucleotide, are separated according to size by high-resolution
denaturing gel electrophoresis.
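The logic of the four termination reactions and of reading the resulting ladder can be sketched as follows. This is an idealized illustration in Python, assuming that every possible termination product is present and that fragment lengths are counted from the 3' end of the primer. The template and the function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extension_product(template_3to5):
    """Return the newly synthesized strand (5'->3') for a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)


def terminated_lengths(product, dd_base):
    """Lengths of all chains that end where the given dideoxy analog was incorporated."""
    return [i + 1 for i, base in enumerate(product) if base == dd_base]


def read_ladder(lanes):
    """Rebuild the sequence by reading fragments from shortest to longest."""
    by_length = {length: base for base, lengths in lanes.items() for length in lengths}
    return "".join(by_length[length] for length in sorted(by_length))


# Hypothetical template region downstream of the primer, written 3'->5'.
template = "TACGGTCA"
product = extension_product(template)

# One lane per dideoxy analog, as in the four separate reactions.
lanes = {dd: terminated_lengths(product, dd) for dd in "GATC"}
sequence_read = read_ladder(lanes)
```

Each lane holds only the fragment lengths ending in its own base, and sorting all lengths together recovers the order of bases in the extended strand.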
[0141] Incorporation of a radiolabel into the oligonucleotide chain
permits the visualization of the sequencing products by
autoradiography. The end-labeled primer protocol, a modification of
that described by Heiner et al. [(1988) Applied Biosystems, Inc. DNA
Sequencer Model 370 User Bulletin-Taq Polymerase: Increased Enzyme
Versatility in DNA Sequencing], uses [γ-³²P]ATP,
[γ-³³P]ATP or [γ-³⁵S]ATP to label the
sequencing primer. The DNA template and labeled primer are
repeatedly annealed and enzymatically extended/terminated in
thermal cycled sequencing. The end-labeled primer protocol is the
most versatile sequencing method and is useful when working with
lambda DNA [Kaledin et al., Biokhimiya 45:494 (1980)], PCR
templates, and any template where false priming may be a problem.
This protocol generates sequence data very close to the primer and
is recommended when this is needed. The reaction also contains
deaza nucleotide mixes that substitute 7-deaza dGTP for dGTP. The
deaza mixes resolve band compressions associated with GC-rich
regions [Mizusawa et al., Nucl. Acids Res. 14:1319 (1986) and Barr
et al., Biotechniques 4:428 (1986)].
[0142] Thermal cycled sequencing is an alternative method for
enzymatic sequence analysis which takes advantage of the intrinsic
properties of thermophilic DNA polymerases, such as the one
isolated from Thermus aquaticus (Taq DNA polymerase). Because the
protocol utilizes a thermocycling apparatus, several advantages are
realized over conventional sequencing strategies. First, the
protocol yields a linear amplification of the template DNA,
reducing the amount of template required to achieve a detectable
sequence ladder. Using a ³²P end-labeled primer, greater than
500 bases of sequence can be obtained from as little as 4 fmol
(4 × 10⁻¹⁵ moles) of template after an overnight exposure.
Secondly, the high temperatures employed during each denaturation
cycle eliminate the requirement for alkaline denaturation and
ethanol precipitation of double-stranded DNA (dsDNA) templates. The
denaturation cycles also help to circumvent the problems associated
with rapid reannealing of linear dsDNA templates such as PCR
reaction products. Third, high annealing temperatures increase the
stringency of primer hybridization. Fourth, the high polymerization
temperature decreases the secondary structure of DNA templates and
thus permits polymerization through highly structured regions
[Innis et al., Proc. Natl. Acad. Sci. USA 85:9436 (1988)]. This
system is useful for sequencing a wide variety of templates such as
amplified DNA, large double-stranded DNA templates such as lambda,
GC-rich templates and palindrome-rich templates.
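The difference between the linear amplification obtained in thermal cycled sequencing and the exponential amplification of a two-primer PCR can be illustrated numerically. The sketch below assumes ideal per-cycle efficiency and an arbitrary cycle number of 30; neither value is taken from the text, and the function names are hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def template_molecules(femtomoles):
    """Convert femtomoles of template into a number of molecules."""
    return femtomoles * 1e-15 * AVOGADRO


def linear_yield(template, cycles, efficiency=1.0):
    """Single-primer cycling: each cycle adds at most one product per template."""
    return template * cycles * efficiency


def exponential_yield(template, cycles, efficiency=1.0):
    """Two-primer PCR, for contrast: products serve as templates in later cycles."""
    return template * (1.0 + efficiency) ** cycles


start = template_molecules(4)               # 4 fmol of template, as in the text
after_30_linear = linear_yield(start, 30)   # 30 cycles is an assumed value
after_30_pcr = exponential_yield(start, 30)
```

With a single primer the product grows in proportion to the cycle number, which is why the gain is modest compared with PCR yet still reduces the amount of template needed.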
[0143] Existing thermostable polymerases which are used in chain
termination methods of sequencing (both traditional and thermal
cycling protocols) require fairly high concentrations of ddNTPs as
the affinity of these polymerases for ddNTPs is somewhat low. For
example, when Taq DNA polymerase is employed for chain termination
sequencing protocols, the optimal concentrations of ddNTPs in the
ddNTP termination mixtures (3× mixtures) are 180 μM ddGTP, 1
mM ddATP, 1.5 mM ddTTP and 500 μM ddCTP [as described in U.S.
Pat. No. 5,075,216, the disclosure of which is herein incorporated
by reference]. When the polymerase employed is a modified form of
Taq DNA polymerase, sTaq (sequencing grade Taq), the optimal
concentrations of ddNTPs in the ddNTP termination mixtures are 30
μM ddGTP, 350 μM ddATP, 600 μM ddTTP and 200 μM ddCTP.
In contrast, a modified form of Tne DNA polymerase provided herein
utilizes the following concentrations of ddNTPs in the termination
mixtures (3× mixtures): 20 μM ddGTP, 50 μM ddATP, 75
μM ddTTP and 25 μM ddCTP. Because ddNTPs are expensive, the
use of a thermostable polymerase having a higher affinity for
ddNTPs (i.e., the modified Tne polymerase of the invention) will
result in considerable cost savings in DNA sequencing
applications.
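The size of the saving can be worked out from the concentrations given above. The following sketch in Python tabulates the three termination mixtures and computes the fold reduction in each ddNTP for the modified Tne polymerase. It assumes that the three sets of concentrations are directly comparable; the dictionary and function names are hypothetical.

```python
# Concentrations in micromolar, transcribed from the paragraph above.
TERMINATION_MIXES_UM = {
    "Taq":  {"ddGTP": 180, "ddATP": 1000, "ddTTP": 1500, "ddCTP": 500},
    "sTaq": {"ddGTP": 30,  "ddATP": 350,  "ddTTP": 600,  "ddCTP": 200},
    "Tne":  {"ddGTP": 20,  "ddATP": 50,   "ddTTP": 75,   "ddCTP": 25},
}


def fold_reduction(reference, enzyme="Tne"):
    """Ratio of the reference concentration to the enzyme's, per ddNTP."""
    ref = TERMINATION_MIXES_UM[reference]
    new = TERMINATION_MIXES_UM[enzyme]
    return {dd: ref[dd] / new[dd] for dd in ref}


vs_taq = fold_reduction("Taq")
# vs_taq == {'ddGTP': 9.0, 'ddATP': 20.0, 'ddTTP': 20.0, 'ddCTP': 20.0}
vs_staq = fold_reduction("sTaq")
# vs_staq == {'ddGTP': 1.5, 'ddATP': 7.0, 'ddTTP': 8.0, 'ddCTP': 8.0}
```

On these figures the modified Tne polymerase uses 9- to 20-fold less of each ddNTP than Taq DNA polymerase and 1.5- to 8-fold less than sTaq.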
[0144] Experimental
[0145] The following examples serve to illustrate certain preferred
embodiments and aspects of the present invention and are not to be
construed as limiting the scope thereof.
[0146] In the disclosure which follows, the following abbreviations
apply: °C (degrees Centigrade); g (gravitational field);
vol (volume); w/v (weight to volume); v/v (volume to volume); BSA
(bovine serum albumin); CTAB (cetyltrimethylammonium bromide); fmol
(femtomole); HPLC (high pressure liquid chromatography); DTT
(dithiothreitol); DMF (N,N dimethyl formamide); DNA
(deoxyribonucleic acid); i.d. (internal diameter); p (plasmid); μl
(microliters); ml (milliliters); μg (micrograms); pmoles
(picomoles); mg (milligrams); MOPS (3-[N-Morpholino]propanesulfonic
acid); M (molar); mM (milliMolar); μM (microMolar); nm
(nanometers); kdal (kilodaltons); OD (optical density); EDTA
(ethylene diamine tetra-acetic acid); FITC (fluorescein
isothiocyanate); SDS (sodium dodecyl sulfate); NaPO₄ (sodium
phosphate); Tris (tris(hydroxymethyl)-aminomethane); PMSF
(phenylmethylsulfonylfluoride); TBE (Tris-Borate-EDTA, i.e., Tris
buffer titrated with boric acid rather than HCl and containing
EDTA); PBS (phosphate buffered saline); PPBS (phosphate buffered
saline containing 1 mM PMSF); PAGE (polyacrylamide gel
electrophoresis); Tween (polyoxyethylene-sorbitan); Boehringer
Mannheim (Boehringer Mannheim, Indianapolis, Ind.); Epicentre
(Epicentre Technologies, Madison, Wis.); New England Biolabs (New
England Biolabs, Beverly, Mass.); Novagen (Novagen, Inc., Madison,
Wis.); Pharmacia (Pharmacia Biotech Inc., Piscataway, N.J.); Perkin
Elmer (Perkin Elmer, Norwalk, Conn.); Promega (Promega Corp.,
Madison, Wis.); Qiagen (Qiagen Inc., Chatsworth, Calif.); Spectra
(Spectra, Houston, Tex.); Stratagene (Stratagene Cloning Systems,
La Jolla, Calif.); USB (U.S. Biochemical, Cleveland, Ohio).
EXAMPLE 1
Isolation of the Tne DNA Polymerase Gene
[0147] a) Growth of T. neapolitana Cells
[0148] T. neapolitana cells (obtained from V. A. Svetlichny, The
Institute of Microbiology, Russian Academy of Sciences, Moscow)
were grown in a medium containing (per 100 ml): 0.1 ml K-phosphate
solution [300 g/l K₂HPO₄ and 200 g/l KH₂PO₄]; 1
ml of Solution 1 [27 g/l NH₄Cl, 27 g/l CaCl₂ and 31 g/l
MgCl₂·6H₂O]; 1 ml of a 10% solution of yeast extract
(Difco); 2.5% natural sea salt; 0.1 ml of a 2% solution of
resazurin; 1 ml of 5% Na₂SO₃; 150 mg NaHCO₃ and 0.5%
glucose.
[0149] b) Isolation of Genomic DNA
[0150] Large scale cultures (10 liters) of T. neapolitana cells
were grown in the above medium in a 10 l fermentation vessel under
nitrogen (i.e., anaerobic conditions) at 75°C for 28 hours
(early stationary phase). The cells were then collected by
centrifugation at 10,000 × g and the cell pellet was washed
once with a solution comprising 0.9% NaCl. The washed cell pellet
was frozen at -70°C. DNA was isolated from the frozen cells
as follows. The frozen cells (3 g) were thawed in 30 ml of a
solution containing 100 mM Tris-HCl (pH 9.0), 50 mM EDTA and 2
mg/ml lysozyme. The mixture was incubated for 30 min at 0°C
and then SDS and proteinase K were added to final concentrations
of 1% and 100 μg/ml, respectively. The mixture was incubated for
1.5 hours at 45°C with light shaking. Following the
incubation, the mixture was cooled to room temperature (about
25°C) and NaCl was added to a concentration of 0.5 M. An
equal volume of phenol/chloroform was added, the mixture was
extracted and the aqueous and organic phases were separated by
centrifugation for 10 min at 6,000 × g at room temperature. The
supernatant was transferred to a fresh tube using a wide-bore
pipet. A total of six phenol/chloroform extractions were performed
(until the interphase disappeared). The DNA was precipitated by the
addition of ethanol and gentle mixing of the solution. The
precipitated DNA was washed with 70% ethanol. The DNA was then
centrifuged for 5 min at 10,000 × g and the supernatant was
discarded. The pellet was resuspended in a buffer containing 10 mM
Tris-HCl (pH 7.4), 0.1 mM EDTA and stored at -20°C until
used.
[0151] c) Construction of a T. neapolitana Genomic DNA Library
[0152] The T. neapolitana DNA was then digested with Sau3A under
conditions which promoted the generation of fragments 3-8 kb in
length. Briefly, 10 μg of genomic DNA was digested with 1.6
units of Sau3A in a volume of 15 μl for 1 hour at 37°C.
The reaction was stopped by the addition of 5 μl of sample
buffer [70% glycerol, 50 mM EDTA] and the digested DNA was run on a
low melting temperature agarose gel (BioRad). Fragments 3-8 kb in
length were isolated from the gel using standard procedures
[Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd ed.,
Cold Spring Harbor Laboratory Press, NY (1989) pp. 6.30-6.31]. The
DNA recovered from the gel was precipitated with ethanol, dried and
resuspended in 100 μl of 10 mM Tris-HCl (pH 7.5), 0.1 mM EDTA.
Five microliters of the DNA mixture (about 0.1 μg) was ligated
to 0.25 μg of the pTZ19R vector (Pharmacia) which had been
digested with BamHI and treated with bacterial alkaline
phosphatase. The ligation products were used to transform competent
TG1 cells [TG1 cells are an EcoK.sup.- derivative of JM101 cells;
commercially available equivalents include NM522 cells (Pharmacia)
and XL1-Blue cells (Stratagene)] and the cells were plated onto
MacConkey agar plates (Difco).
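The ligation of Sau3A-digested genomic DNA into a BamHI-digested
vector relies on the two enzymes leaving matching cohesive ends. The
following Python sketch illustrates that check. The recognition sites
and cut offsets are the standard published values for these enzymes
and are not taken from this document; the helper names are
hypothetical, and the rule used applies only to palindromic sites that
leave 5' overhangs.

```python
# Recognition site and top-strand cut offset for each enzyme.
# Standard published values, not taken from the patent text.
# BspHI, NcoI and XbaI appear later in the document (Examples 2 and 3).
ENZYMES = {
    "Sau3A": ("GATC", 0),    # ^GATC
    "BamHI": ("GGATCC", 1),  # G^GATCC
    "NcoI": ("CCATGG", 1),   # C^CATGG
    "BspHI": ("TCATGA", 1),  # T^CATGA
    "XbaI": ("TCTAGA", 1),   # T^CTAGA
}


def five_prime_overhang(enzyme):
    """Return the single-stranded 5' overhang left by a palindromic cutter.

    For a palindromic site cut symmetrically on both strands, the
    overhang is the part of the site between the two cut positions.
    """
    site, cut = ENZYMES[enzyme]
    return site[cut:len(site) - cut]


def compatible(enzyme_a, enzyme_b):
    """True when the two enzymes leave identical 5' overhangs.

    Identical overhangs are sufficient here because each overhang is
    its own reverse complement.
    """
    return five_prime_overhang(enzyme_a) == five_prime_overhang(enzyme_b)


if __name__ == "__main__":
    print("Sau3A/BamHI compatible:", compatible("Sau3A", "BamHI"))
    print("BspHI/NcoI compatible:", compatible("BspHI", "NcoI"))
```

Both pairs share a four-base overhang (GATC and CATG, respectively),
which is why the fragments can be joined without further end
modification.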
[0153] White colonies (i.e., those containing plasmids having an
insert) were picked onto 30 master plates (96 colonies/plate; a
total of about 3,000 colonies were screened). Replica plates were
generated to provide cells for DNA polymerase analysis. The library
was screened by functional assay; the cells from each replica plate
were removed and pooled by rinsing the plate with 2 ml of 0.9%
NaCl. The cells were then collected by centrifugation (12,000 rpm
for 3 min) in a microcentrifuge (Eppendorf). The pellets were then
washed with 1.5 ml of 0.9% NaCl. The washed cell pellets were then
lysed by sonication in 0.5 ml of a solution containing 50 mM
Tris-HCl (pH 7.4), 1 mM EDTA and 5 mM PMSF. The lysates were then
heated to 72.degree. C.-75.degree. C. in a water bath for 20 min.
Following the incubation, the lysates were clarified by
centrifugation in a microfuge at 12,000 rpm for 10 min. The
supernatant was removed to a fresh tube. DNA polymerase activity
was assayed using the supernatant as follows.
[0154] Denatured bovine thymus DNA was prepared as follows. A
solution comprising 6 mM bovine thymus DNA (BioLAR, Olaine, Latvia;
equivalent preparations of calf thymus DNA are available from
Sigma, St. Louis, Mo.) in 1 mM NaOH was incubated for 15 min at
20.degree. C. The solution was then neutralized by the addition of
HCl to a final concentration of 100 mM and Tris-HCl, pH 8.0 to a
final concentration of 50 mM.
[0155] The following components were mixed: 2.5 .mu.l 0.5 M
Tris-HCl (pH 7.6), 5 .mu.l 100 mM MgCl.sub.2, 2 .mu.l denatured
bovine thymus DNA (2 mg/ml), 0.2 .mu.l 12.5 mM of each of the dNTPs
and 1.5 .mu.Ci of .alpha.-.sup.32P-dTTP and H.sub.2O to a volume
of 25 .mu.l. Twenty-five microliters of supernatant from each of
the pools of lysed cells were mixed with 25 .mu.l of the above
assay mixture in the well of a 96 well microtiter plate. The
mixture was incubated for 1 hour at 75.degree. C. The reaction was
stopped by the addition of 5 .mu.l of 200 mM EDTA (pH 8.0). Five
microliters of the reaction mixture was then loaded onto a
1.times.1 inch square of DEAE paper (Whatman). The samples were
dried at 80.degree. C. and then washed with 0.5 M sodium phosphate
(pH 7.2) (wash solution) using about 5 ml of wash solution per
sample for 10 min with light shaking. Three washes were performed.
The samples were then rinsed with water (10 ml/sample) for 1 min
followed by an ethanol rinse. The ethanol rinsed samples were then
dried at 80.degree. C. and DEAE-absorbent radioactivity was counted
using a liquid scintillation counter. The results of the initial
DNA polymerase assays revealed that a single pool produced DNA
polymerase activity.
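The component concentrations in the final 50 .mu.l reaction (25 .mu.l
assay mixture plus 25 .mu.l lysate supernatant) follow from the stock
concentrations and volumes listed above. The sketch below works
through that dilution arithmetic in Python. It assumes the dNTP entry
means 0.2 .mu.l of a stock containing 12.5 mM of each dNTP, and it
ignores any contribution from the lysate buffer; the names are
illustrative only.

```python
# Stock concentration and volume added (ul) for each component of the
# 25 ul assay mixture, as listed in the text.
STOCKS = {
    "Tris-HCl (mM)": (500.0, 2.5),
    "MgCl2 (mM)": (100.0, 5.0),
    "denatured DNA (mg/ml)": (2.0, 2.0),
    # Assumption: 0.2 ul of a stock that is 12.5 mM in each dNTP.
    "each dNTP (mM)": (12.5, 0.2),
}

ASSAY_MIX_VOLUME_UL = 25.0
REACTION_VOLUME_UL = 50.0  # assay mixture + equal volume of supernatant


def diluted(stock_conc, stock_volume_ul, final_volume_ul):
    """C1 * V1 = C2 * V2, solved for C2."""
    return stock_conc * stock_volume_ul / final_volume_ul


def concentrations(final_volume_ul):
    """Concentration of every component at the given total volume."""
    return {
        name: diluted(conc, vol, final_volume_ul)
        for name, (conc, vol) in STOCKS.items()
    }


if __name__ == "__main__":
    for name, value in concentrations(REACTION_VOLUME_UL).items():
        print(f"{name}: {value:g}")
```

Under these assumptions the reaction contains roughly 25 mM Tris-HCl,
10 mM MgCl.sub.2, 0.08 mg/ml DNA and 0.05 mM of each dNTP.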
[0156] To isolate clones containing Tne genomic DNA encoding the
DNA polymerase activity, the colonies on the positive master plate
were grown as smaller pools comprising either a single row or a
single column of colonies. The smaller pools of colonies were
grown, lysates were prepared and DNA polymerase activity was
determined as described above. A single row and a single column
contained DNA polymerase activity; the intersection of this row and
column identified the single individual clone containing DNA
encoding DNA polymerase activity. This single colony was grown and
assayed for DNA polymerase activity to confirm the presence of
thermostable DNA polymerase activity. This clone was called
pTen.
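The row and column pooling described above can be expressed as a short
Python sketch. It assumes an 8-row by 12-column layout for the 96
colonies on a master plate (the text does not state the layout) and
uses a hypothetical fold-over-background threshold to call a pool
positive; the counts in the usage example are invented for
illustration and are not data from this document.

```python
def locate_positive_clone(row_counts, column_counts, background,
                          fold_threshold=2.0):
    """Return the (row, column) of the single positive clone.

    row_counts and column_counts map a pool label to its measured
    radioactivity. A pool is called positive when its count is at least
    fold_threshold times background (the threshold is an assumption).
    """
    cutoff = fold_threshold * background
    rows = [label for label, cpm in row_counts.items() if cpm >= cutoff]
    cols = [label for label, cpm in column_counts.items() if cpm >= cutoff]
    if len(rows) != 1 or len(cols) != 1:
        raise ValueError(
            f"expected one positive row and column, got {rows} and {cols}"
        )
    return rows[0], cols[0]


if __name__ == "__main__":
    # Invented counts: every pool at background except row C, column 7.
    rows = {label: 100 for label in "ABCDEFGH"}
    cols = {number: 100 for number in range(1, 13)}
    rows["C"] = 5000
    cols[7] = 4000
    print(locate_positive_clone(rows, cols, background=100))
```

With this layout, 20 pooled assays (8 rows plus 12 columns) replace 96
individual assays, provided the plate holds exactly one positive
clone.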
[0157] DNA was prepared from the pTen clone using standard
techniques of molecular biology; this clone was found to contain an
insert of approximately 3.5 kb. Restriction enzyme digests were
performed with a battery of enzymes to create a restriction map of
the Tne genomic DNA insert. Subclones were generated from the
positive clone and a series of nested deletions were generated
using Exonuclease III and standard molecular biology techniques to
facilitate DNA sequencing [Short Protocols in Molecular Biology,
2nd ed. (1992) Ausubel et al. Eds, John Wiley & Sons, New York,
pp. 7-8 to 7-16 and 7-29 to 7-37]. The DNA sequence of the insert
was determined using the Sanger dideoxy sequencing method and
Sequenase.RTM. (USB). The DNA sequence of the coding region for the
full-length Tne DNA polymerase gene is listed in SEQ ID NO:1. The
deduced amino acid sequence of the Tne DNA polymerase is listed in
SEQ ID NO:2.
[0158] Analysis of the deduced amino acid sequence was performed
using protein analysis software (DNAStar, Inc., Madison, Wis.). The
open reading frame encodes a protein of 893 amino acids; the
predicted molecular weight of the protein is 102,054 (however, as
shown in Example below, the full-length protein migrates with an
apparent molecular weight of 97,000 on SDS-PAGE gels). The
predicted isoelectric point is 6.19 and the charge at pH 7.0 is
-7.56.
[0159] The nucleotide and amino acid sequences of the Tne DNA
polymerase were compared with the reported sequences for E. coli
DNA polymerase I and the thermostable DNA polymerase from T.
maritima. FIG. 3 provides an alignment of the amino acid residues
of these three polymerases. In FIG. 3 the following abbreviations
are used: Eco (E. coli DNA polymerase I); Tma (Tma DNA polymerase)
and Tne (Tne DNA polymerase). Shading is used to indicate residues
which differ from the amino acid sequence of Tne DNA
polymerase.
[0160] In E. coli DNA polymerase I, the 5' exonuclease domain
comprises approximately residues 1-323; the 3' exonuclease domain
comprises approximately residues 324-517 and the synthetic or
polymerization domain comprises approximately residues 521-928.
Alignment of the amino acid sequences of E. coli DNA polymerase I
with the sequence of the Tne DNA polymerase molecule of the present
invention reveals that the 5' exonuclease domain of Tne DNA
polymerase comprises approximately residues 1-297; the 3'
exonuclease domain comprises approximately residues 298-482 and the
polymerization domain comprises approximately residues 486-893.
[0161] Alignment of amino acid residues present in E. coli DNA
polymerase I and Tne DNA polymerase shows that the two enzymes are
44% identical overall and 51% identical over the polymerase domain
(residues 521-928 in E. coli and residues 486-893 in Tne). The
alignment was performed using the Lipman-Pearson algorithm as
provided by DNASTAR, Inc. (Madison, Wis.); gaps were introduced
into the two sequences to provide for maximum alignment.
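Percent identity figures of this kind are computed from a gapped
pairwise alignment. The following Python sketch shows one common way
to do so, counting identical residues over the aligned (non-gap)
columns. It is not the Lipman-Pearson implementation used here, whose
exact denominator convention is not stated in the text, and the
sequences in the usage example are toy strings rather than polymerase
sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over columns where neither sequence has a gap.

    Both inputs must already be aligned, i.e. of equal length with gap
    characters inserted. Other tools may divide by a different length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 4 identities over 5 comparable columns.
    print(percent_identity("MKV-LT", "MRVALT"))
```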
[0162] Comparison of the nucleotide and amino acid sequences of the
Tne and Tma polymerases revealed that these two polymerases share
78% identity at the nucleotide level and 88% identity at the amino
acid level.
EXAMPLE 2
Efficient Expression of Tne DNA Polymerase in E. coli
[0163] In order to express the Tne DNA polymerase in large amounts
in host cells, the DNA sequences encoding the polymerase (i.e., the
coding region) were removed from pTen (described in Example 1) and
inserted into an expression vector.
[0164] Restriction enzyme analysis revealed that the Tne polymerase
gene was present in the opposite transcriptional orientation
relative to the T7 promoter present on the pTZ19R vector used to
generate pTen. In order to produce Tne polymerase, sequences
encoding the Tne polymerase gene were removed from pTen by
digestion with SmaI and XbaI and an approximately 3.5 kb SmaI/XbaI
fragment was isolated by electrophoresis of the digestion products
on an agarose gel followed by excision of the desired band. DNA was
recovered from the agarose block using the Wizard.TM. PCR Preps DNA
Purification System (Promega). Briefly, 0.5 to 1.0 ml of Wizard.TM.
PCR Preps DNA Purification Resin was added to the agarose block and
the mixture was incubated at 42.degree. C. for 5 minutes to melt
the agarose. DNA was extracted using the protocol provided in the
kit.
[0165] The 3.5 kb SmaI/XbaI fragment was ligated into the
pGEM.RTM.-3Zf(+) vector (Promega) which had been digested with SmaI
and XbaI to generate pGTne. This construction placed the 5' end
(i.e., the end encoding the N-terminal portion) of the Tne polymerase gene
downstream of the T7 promoter in the same transcriptional
orientation. The ligation mixture was used to transform competent
JM109(DE3) cells (Promega). Recombinant clones were isolated and
confirmed by restriction digestion using standard recombinant
molecular biology techniques [Sambrook et al., Molecular Cloning, A
Laboratory Manual (1989)].
[0166] Expression of the Tne DNA polymerase from the pGTne
construct was next examined. JM109(DE3) cells containing pGTne were
grown at 37.degree. C. and induced with 1 mM IPTG. Parallel
cultures of TG1 cells containing pTen were grown and induced. After
a few hours (i.e., 1-3) of growth in the presence of IPTG, the cells
were collected by centrifugation and crude lysates were prepared as
follows. A 1 ml aliquot of each of the cultures containing the Tne
constructs was centrifuged in a microcentrifuge at 14,000.times.g
for 3 min at room temperature to pellet the cells. The cells were
then resuspended in 200 .mu.l of a solution comprising 50 mM
Tris-HCl (pH 8.0), 50 mM glucose, 1 mM EDTA and the cells were
pelleted again. The cells were next resuspended in 50 .mu.l of the
previous buffer containing 4 mg/ml lysozyme and the mixture was
incubated at room temperature for 10 minutes. Following the
incubation, 50 .mu.l of a solution comprising 10 mM Tris-HCl (pH
8.0), 50 mM KCl, 1 mM EDTA, 1 mM PMSF, 0.5% Tween-20, 0.5% Nonidet
P40 was added and the mixture was incubated at 75.degree. C. for 10
minutes. The lysate was then clarified by centrifugation in a
microcentrifuge at 14,000.times.g for 5 minutes. Eighty microliters
of the supernatant was removed and stored in a separate tube at
4.degree. C. The crude lysates were analyzed for polymerase
activity at 74.degree. C. as described in Example 5(b), below.
[0167] The following results were obtained. The pTen construct gave
polymerase activity levels at or about 2 fold higher than the
background level for the assay. The pGTne construct gave activity
levels of about 50 times background levels. While the expression of
Tne polymerase seen using pGTne was much improved relative to the
level seen using pTen, this expression level was not sufficient to
produce large amounts of the enzyme.
[0168] The above results suggested that the Tne polymerase
promoter was non-functional in E. coli (very low level of activity
present when pTen is used). Furthermore, the presence of the Tne
polymerase promoter appeared to be detrimental to expression when
transcription was initiated from the T7 promoter in the pGTne
construct (perhaps due to transcriptional read-through
interference). In order to remove the Tne polymerase gene promoter
from the Tne polymerase coding region, the following experiments
were conducted.
[0169] The 5' end of the Tne polymerase gene
was sequenced using the M13 forward primer in conjunction with the
fmol.RTM. DNA Sequencing System (Promega); sequencing was conducted
according to the manufacturer's instructions. The sequence analysis
revealed that a unique BglI site was found 43 bp into the coding
region (i.e., 43 bp following the A of the initiator ATG codon). To
remove the coding region of the Tne polymerase gene from pGTne,
pGTne was digested with BglI and XbaI and the approximately 3.0 kb
BglI/XbaI fragment was isolated (as described above). The 3.0 kb
BglI/XbaI fragment was ligated directly downstream of either the T7
promoter or the tac promoter present in pALTER.RTM.-Ex1 (Promega);
pALTER.RTM.-Ex1 contains both the T7 and the tac promoters
positioned in opposite transcriptional orientations relative to one
another. These two ligations were performed as follows.
[0170] To insert the 3.0 kb BglI/XbaI fragment downstream of the T7
promoter, a 43 bp synthetic linker having an NcoI overhanging end at
one end and a BglI overhanging end at the other end was ligated to
the Tne polymerase coding region. This linker was formed by
annealing of the following two oligonucleotides: JH64 which
comprises 5'-CATGGCGAGACTATTTCTCTTTGATGGCACAGCCCTGGCCTACA-3' (SEQ
ID NO:3) and JH65 which comprises
5'-AGGCCAGGGCTGTGCCATCAAAGAGAAATAGTCTCGC-3' (SEQ ID NO:4). This
synthetic linker regenerates the
native sequence of the Tne polymerase gene located upstream of the
BglI site and allows insertion of the coding region into
pALTER.RTM.-Ex1. pALTER.RTM.-Ex1 was digested with NcoI and XbaI
and the coding region containing the synthetic linker was ligated
to the digested vector to generate pATne2.
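The structure of the annealed JH64/JH65 linker can be checked
directly from the two oligonucleotide sequences given above. The
Python sketch below finds where the reverse complement of JH65 pairs
with JH64 and reports the unpaired ends of JH64. The sequences are
copied from the text with line-wrap whitespace removed; the function
names are illustrative.

```python
# Oligonucleotide sequences as given in the text (5' to 3').
JH64 = "CATGGCGAGACTATTTCTCTTTGATGGCACAGCCCTGGCCTACA"
JH65 = "AGGCCAGGGCTGTGCCATCAAAGAGAAATAGTCTCGC"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def unpaired_ends(top, bottom):
    """Anneal bottom to top and return top's (5' tail, 3' tail).

    Assumes the bottom strand pairs with one contiguous stretch of the
    top strand; raises ValueError when it does not.
    """
    paired = reverse_complement(bottom)
    start = top.find(paired)
    if start < 0:
        raise ValueError("strands are not fully complementary")
    return top[:start], top[start + len(paired):]


if __name__ == "__main__":
    five_prime_tail, three_prime_tail = unpaired_ends(JH64, JH65)
    print("5' single-stranded end:", five_prime_tail)
    print("3' single-stranded end:", three_prime_tail)
```

The 5' tail of JH64 is CATG, which matches the overhang left by NcoI,
and the 3' tail is a three-base extension (ACA), as would be expected
for an end that pairs with a BglI-cut fragment. Counting from the A of
the ATG in JH64 to the end of the oligonucleotide gives 43 bases, in
agreement with the stated position of the BglI site.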
[0171] To insert the 3.0 kb BglI/XbaI fragment downstream of the
tac promoter, a 43 bp synthetic linker having an NdeI overhanging
end at one end and a BglI overhanging end at the other end was
ligated to the Tne polymerase coding region. This linker was formed
by annealing of the following two oligonucleotides: JH62 which
comprises 5'-AGGCCAGGGCTGTGCCATCAAAGAGAAATAGTCTCGCCA-3' (SEQ ID NO:5)
and JH63 which comprises
5'-TATGGCGAGACTATTTCTCTTTGTGGCACAGCCCTGGCCTACA-3' (SEQ ID NO:6). This
synthetic linker regenerates the
native sequence of the Tne polymerase gene located upstream of the
BglI site and allows insertion of the coding region into
pALTER.RTM.-Ex1. pALTER.RTM.-Ex1 was digested with NdeI and XbaI
and the coding region containing the synthetic linker was ligated
to the digested vector to generate pATne1.
[0172] Competent E. coli cells were transformed with the above
ligation mixtures corresponding to either pATne1 (JM109 cells;
Promega) or pATne2 [JM109(DE3) cells; Promega]. Recombinant clones
were isolated and confirmed by restriction digestion using standard
recombinant molecular biology techniques. Cells harboring either pATne1 or
pATne2 were grown and induced as described above. Crude lysates
were prepared and DNA polymerase assays were performed (as
described above). The results of these polymerase assays showed
that both pATne1 and pATne2 gave significantly better yields of Tne
polymerase than pGTne (at least 2-3 fold higher).
EXAMPLE 3
Construction of Tne Deletion Mutants Lacking 5' to 3' Exonuclease
Activity
[0173] As noted above, the presence of 5' to 3' exonuclease
activity in a thermostable DNA polymerase is undesirable for
certain applications. To construct mutant Tne polymerases lacking
5' to 3' exonuclease activity, two deletion mutants of the Tne
polymerase gene were generated. Both mutants contain deletions
which remove sequences encoding a large portion of the 5' to 3'
exonuclease domain located at the N terminus of the Tne polymerase
molecule.
[0174] a) Construction of a Vector Encoding Deletion Mutant Tne
M284
[0175] The deletion mutant Tne M284 is a truncated form of the Tne
polymerase which uses the naturally occurring methionine at amino
acid position 284 in the full-length protein (SEQ ID NO:2) as the
initiating methionine for translation initiation. FIG. 4 provides a
schematic representation of several modified Tne polymerases
(constructed as described in Examples 3 and 4) along the map of the
full-length Tne polymerase protein. The scale represents length in
increments of 100 amino acid residues. The full length Tne
polymerase (SEQ ID NO:2) contains 893 amino acids. The thick open
boxes represent the presence of amino acid residues; thin lines
between two regions of thick boxes indicate that amino acids were
deleted between the two open boxes. Circles containing a single
letter indicate the location of a mutated amino acid residue (the
single letter code is used for the amino acid residues
indicated).
[0176] To generate a construct containing the Tne M284 mutant,
pGTne was digested with BspHI (generates ends compatible with NcoI
ends) and KpnI and a 1.05 kb BspHI/KpnI fragment (containing the 5'
portion of the coding region) was isolated as described in Example
2. A second aliquot of pGTne was digested with KpnI and XbaI and a
1.3 kb fragment containing the 3' portion of the Tne polymerase
coding region was isolated. pALTER-Ex1 was digested with NcoI and
XbaI. A three-way ligation was performed using the digested
pALTER-Ex1 vector, the 1.05 kb BspHI/KpnI fragment and the 1.3 kb
KpnI/XbaI fragment. Competent JM109(DE3) cells were transformed
with the ligation mixture and recombinant clones were isolated and
confirmed by restriction digestion using standard recombinant
molecular biology techniques. The resulting plasmid was called pM284. The
nucleotide sequence of the region encoding the Tne M284 gene is
listed in SEQ ID NO:7. The amino acid sequence of Tne M284 is
listed in SEQ ID NO:8.
[0177] b) Construction of a Vector Encoding Deletion Mutant Tne
M316
[0178] The deletion mutant Tne M316 is a truncated form of the Tne
polymerase protein which uses an artificially created methionine at
amino acid position 316 as the initiator methionine. Tne M316 was
created by introducing a methionine residue at position 316 (and a
corresponding NcoI site) via site-directed mutagenesis using the
Altered Sites.RTM. II in vitro Mutagenesis System (Promega) in
conjunction with mutagenesis oligonucleotide JH68
[5'-ATCGAAAAGCTGACCATGGTTCCATCTTTTG-3' (SEQ ID NO:9)] and pATne2.
The manufacturer's protocol was followed exactly. Briefly, pATne2
was denatured using alkali and the JH68 mutagenic oligonucleotide
was annealed to the denatured plasmid along with the ampicillin
repair oligonucleotide (provided in the kit). The mutant strand was
synthesized using T4 DNA polymerase and T4 DNA ligase. ES1301mutS
cells (provided in the kit) were then cotransformed with the
mutagenized pATne2 and R408 DNA (provided in the kit). Small scale
DNA preparations were prepared from the transformed ES1301mutS
cells and the DNA was used to transform JM109 cells. Mutants were
selected by growth on ampicillin plates and the desired recombinants
were confirmed by restriction enzyme analysis (i.e., presence of an
additional NcoI site). The resulting plasmid containing the
engineered NcoI site was then digested with NcoI which deleted the
5' to 3' exonuclease domain as a 948 bp fragment and the large
fragment was isolated (as described above) and religated to itself
to create the pM316 construct. The nucleotide sequence of the
region encoding the Tne M316 gene is listed in SEQ ID NO:10. The
amino acid sequence of Tne M316 is listed in SEQ ID NO:11.
[0179] c) Expression of Tne M284 and Tne M316 in E. coli
[0180] The pM284 and pM316 constructs (in JM109 cells) were grown,
induced and crude lysates were prepared as described in Example 2.
DNA polymerase activity was measured in crude lysates as described
in Example 2.
[0181] The results of the polymerase assays showed that the Tne
M284 mutant (pM284) contained 17 units per .mu.l of polymerase
activity from the crude lysate and the Tne M316 mutant (pM316)
produced no detectable polymerase activity. No detectable
polymerase activity was found when the Tne M316 polymerase was
expressed from the trc promoter either (to express the Tne M316
coding region from the trc promoter, a 2.5 kb NcoI/PstI fragment
was isolated from pM316 and ligated to pTrc 99 A (Pharmacia)
digested with NcoI and PstI).
[0182] Aliquots (5 .mu.l) of each crude lysate were electrophoresed
on a pre-cast 4-20% denaturing gradient polyacrylamide gel (Novex,
San Diego, Calif.); following electrophoresis, the gel was stained
with Coomassie blue to visualize the separated proteins. A single,
sharp protein band corresponding to the expected size was visible
in lysates produced from cells containing the pATne1 (full-length
Tne polymerase) and pM284 (Tne M284 deletion mutant) constructs. No
protein band was observed for deletion mutant Tne M316 when
expressed from either the T7 or trc promoters.
[0183] In order to increase the level of expression of Tne M284
protein in E. coli, the Tne M284 coding region was placed
downstream of the strong tac promoter present in the JHEX3 vector
to create pJM284. JHEX3 was created as follows. pALTER-1 (Promega)
was digested with ClaI and StyI and the ends were made blunt by
incubation with the Klenow fragment. The 1.345 kb ClaI/StyI
fragment was isolated and ligated into pTrc 99 A (Pharmacia) which
had been digested with BsaAI. This ligation inserted the
tetracycline-resistance gene into the pTrc 99 A vector; the
resulting vector was called JHEXa. The ampicillin-resistance gene
was then removed from the JHEXa by digestion with SspI, DraI and
PvuI; this digestion cut the ampicillin gene into four small
fragments (483 bp, 227 bp, 209 bp and 19 bp). The large fragments
(3.93 kb and 652 bp) were isolated and ligated together to create
JHEXb. The Trc promoter was removed from JHEXb as an 89 bp
SspI/NcoI fragment and replaced with the tac promoter. The tac
promoter was inserted into the SspI/NcoI-digested JHEXb vector as a
141 bp BsrBI fragment from pALTER-Ex1 (Promega) together with a 30
bp linker formed by the oligonucleotide pair listed in SEQ ID
NOS:45 and 46.
[0184] The pJM284 construct was made as follows. pGTne was
digested with BspHI (generates ends compatible with NcoI ends) and
KpnI and a 1.05 kb BspHI/KpnI fragment (containing the 5' portion
of the coding region) was isolated as described in Example 2. A
second aliquot of pGTne was digested with KpnI and XbaI and a 1.3
kb fragment containing the 3' portion of the Tne polymerase coding
region was isolated. JHEX3 was digested with NcoI and XbaI. A
three-way ligation was performed using the digested JHEX3 vector,
1.05 kb BspHI/KpnI fragment and the 1.3 kb KpnI/XbaI fragment.
Competent JM109 cells were transformed with the ligation mixture
and recombinant clones were isolated and confirmed by restriction
digestion using standard recombinant molecular biology techniques. The
resulting plasmid was called pJM284.
[0185] Crude lysates were prepared from a small scale culture of
JM109 cells containing the pJM284 construct or the pM284 construct.
DNA polymerase assays were performed as described in Example 2. The
level of Tne M284 polymerase produced by pJM284 was found to be
about 50% greater than the level produced by expression from the
pM284 construct.
EXAMPLE 4
Construction of Tne Polymerase Mutants Having Altered 3' to 5'
Exonuclease Activity
[0186] In order to produce modified forms of Tne polymerase which
possess varying amounts of 3' to 5' exonuclease activity, seven
different point mutants and two deletion mutants were created using
the pM284 construct as the starting material. FIG. 4 provides a
schematic drawing of these mutant Tne polymerases.
[0187] All nine mutagenic changes also involved a change in the
restriction digest pattern of the starting pM284 plasmid to allow
for easy selection of the mutants. In all cases, a small portion of
the mutagenized region was exchanged into a Tne M284 gene that did
not undergo mutagenesis and the exchanged region was sequenced not
only to confirm the mutation, but also to show that there were no
second site mutations. DNA sequencing was performed using the
fmol.RTM. DNA Sequencing System (Promega) in conjunction with
primers JH61 [5'-TGCCGTACACCTCCGAGAGC-3' (SEQ ID NO:12)] or JH66
[5'-CTCGTTTGGCTCCAGCAAATATGC-3' (SEQ ID NO:13)]. The mutants were
constructed as follows.
[0188] a) Construction of pD323E
[0189] pD323E produces a modified form of the Tne polymerase which
lacks the first 283 amino acids from the N-terminus of the
full-length protein and contains an amino acid substitution at
residue 323 (number indicates position of the residue in the full
length protein). At amino acid residue 323 the wild-type aspartic
acid is replaced with glutamic acid. pM284 was used in conjunction
with the mutagenic oligonucleotide JH74
[5'-TTTGCCCTGGAaCTTGAAACG-3' (SEQ ID NO:14); the mutagenic
residue is indicated by the lower case letter] and the Altered
Sites.RTM. II in vitro Mutagenesis System (Promega) to generate
pD323E as described in Example 3. The desired mutants were
confirmed by restriction analysis (absence of one of the SinI
restriction sites present in pM284). The DNA sequence of pD323E was
obtained as described above using the JH66 (SEQ ID NO: 13) primer.
The DNA sequence of the polymerase coding region present in pD323E
is listed in SEQ ID NO: 15. The corresponding amino acid sequence
of the Tne M284(D323E) protein is listed in SEQ ID NO:16.
[0190] b) Construction of pE325D
[0191] pE325D produces a modified form of the Tne polymerase which
lacks the first 283 amino acids from the N-terminus of the
full-length protein and contains an amino acid substitution at
residue 325. At amino acid residue 325, the wild-type glutamic acid
residue is replaced with aspartic acid. pM284 was used in
conjunction with the mutagenesis oligonucleotide JH75
[5'-GACCTTGAcACGTCCTC-3' (SEQ ID NO:17); the mutagenic residue is
indicated by the lower-case letter] and the Altered Sites.RTM. II
in vitro Mutagenesis System (Promega) to generate pE325D as
described in Example 3. The desired mutant was confirmed by
restriction analysis (the presence of an additional AflIII restriction
site). The DNA sequence of pE325D was obtained as described above
using the JH66 (SEQ ID NO:13) primer. The DNA sequence of the
polymerase coding region present in pE325D is listed in SEQ ID NO:
18. The corresponding amino acid sequence of the Tne M284(E325D)
protein is listed in SEQ ID NO:19.
[0192] c) Construction of pY464F
[0193] pY464F produces a modified form of the Tne polymerase which
lacks the first 283 amino acids from the N-terminus of the
full-length protein and contains an amino acid substitution at
residue 464. At amino acid residue 464, the wild-type tyrosine
residue is replaced with phenylalanine. pY464F was constructed by
replacing a 39 bp FokI fragment present in pM284 with a 39 bp
synthetic region. The 39 bp synthetic region was formed by the
following two oligonucleotides: JH81
[5'-TAAGTGATATCTGCATCCTCGCAGGAGAAGTTCGCAGCC-3' (SEQ ID NO:20)] and JH82
[5'-ACAAGGCTGCGAACTTCTCCTGCGAGGATGCAGATATCA-3' (SEQ ID NO:21)].
This synthetic 39 bp oligonucleotide contains the mutation. The
desired mutant was confirmed by restriction analysis (the presence
of an additional EcoRV restriction site). The DNA sequence of pY464F
was obtained as described above using the JH61 (SEQ ID NO:12)
primer. The DNA sequence of the polymerase coding region present in
pY464F is listed in SEQ ID NO:22. The corresponding amino acid
sequence of the Tne M284(Y464F) protein is listed in SEQ ID
NO:23.
[0194] d) Construction of pD468N
[0195] pD468N produces a modified form of the Tne polymerase which
lacks the first 283 amino acids from the N-terminus of the
full-length protein and contains an amino acid substitution at
residue 468. At amino acid residue 468, the wild-type aspartic acid
residue is replaced with asparagine. pM284 was used in conjunction
with the mutagenesis oligonucleotide JH79
[5'-ACTCCTGCGAGaATGCtGACATCACTTATAGG-3' (SEQ ID NO:24); the
mutagenic residues are indicated by the use of lower case letters]
and the Altered Sites.RTM. II in vitro Mutagenesis System (Promega)
to generate pD468N as described in Example 3. The desired mutant
was confirmed by restriction analysis (the presence of an
additional BsmI restriction site). The DNA sequence of pD468N was
obtained as described above using the JH61 (SEQ ID NO:12) primer.
The DNA sequence of the polymerase coding region present in pD468N
is listed in SEQ ID NO:25. The corresponding amino acid sequence of
the Tne M284(D468N) protein is listed in SEQ ID NO:26.
[0196] e) Construction of pD323A
[0197] pD323A produces a modified form of the Tne polymerase which
lacks the first 283 amino acids from the N-terminus of the
full-length protein and contains an amino acid substitution at
residue 323. At amino acid residue 323, the wild-type aspartic acid
residue is replaced with alanine. pM284 was used in conjunction
with the mutagenesis oligonucleotide JH70
[5'-TTTGCCCTGGcCCTTGAAACG-3' (SEQ ID NO:27); the mutagenic residue
is indicated by the use of the lower case letter] and the Altered
Sites.RTM. II in vitro Mutagenesis System (Promega) to generate
pD323A as described in Example 3. The desired mutant was confirmed
by restriction analysis (the absence of a SinI restriction site).
The DNA sequence of pD323A was obtained as described above using
the JH66 (SEQ ID NO:13) primer. The DNA sequence of the polymerase
coding region present in pD323A is listed in SEQ ID NO:28. The
corresponding amino acid sequence of the Tne M284(D323A) protein is
listed in SEQ ID NO:29.
[0198] f) Construction of pD389A
[0199] pD389A produces a modified form of the Tne polymerase which
lacks the first 283 amino acids from the N-terminus of the
full-length protein and contains an amino acid substitution at
residue 389. At amino acid residue 389, the wild-type aspartic acid
residue is replaced with alanine.
[0200] To construct pD389A, the PCR was used to amplify two
overlapping fragments independently; the PCR products were combined
and the resulting large fragment was reamplified. Mutagenesis
oligonucleotide JH80 [5'-CCTGAAGTACGcgTACAAGGTTCTTATGG-3' (SEQ ID
NO:30); the mutagenic residues are indicated by the use of lower
case letters] and sequencing primer JH61 (SEQ ID NO:12) were used
to prime a first PCR to create a 425 bp fragment which incorporates
the desired mutation. The sequencing primers JH66 (SEQ ID NO:13)
and M13 reverse (Promega Q5401; SEQ ID NO:31) were used to amplify
a 564 bp fragment using pM284 as the template in a second PCR. When
these two PCR products were combined using the M13 reverse and JH61
primers, an 889 bp fragment was made. A 348 bp BglII fragment was
then removed from the 889 bp product and was exchanged with the
analogous, but non-mutagenized, BglII fragment of pM284. The desired
mutant was confirmed by the presence of an extra MluI restriction
site. All PCR reactions were performed using a Perkin-Elmer 480
thermal cycler.
[0201] For generation of the 564 bp product, the PCR was conducted
by performing 15 cycles comprising a denaturation step (95.degree.
C. for 15 sec) and an annealing/extension step (70.degree. C. for 1
min). JM284 was used as the template in a reaction containing 1
.mu.M of each of the primers (JH66 and M13 reverse), 1.5 mM
MgCl.sub.2 and 3 units Tli DNA polymerase (Promega).
[0202] For generation of the 425 bp product, the PCR was conducted
by performing 20 cycles comprising a denaturation step (95.degree.
C. for 15 sec), an annealing step (55.degree. C. for 30 sec; cycles
1-5) or an annealing step (70.degree. C. for 15 sec; cycles 6-20)
and an extension step (70.degree. C. for 45 sec). JM284 was used as
the template in a reaction containing 1 .mu.M of each of the
primers (JH80 and JH61), 1.5 mM MgCl.sub.2 and 3 units Tli DNA
polymerase (Promega).
[0203] For the generation of the 889 bp product, the 564 bp product
and the 425 bp product were used as the template in a reaction
containing 1.5 mM MgCl.sub.2 and 3 units Tli DNA polymerase
(Promega). The cycling conditions were: denaturation (95.degree. C.
for 15 sec) and annealing/extension at 70.degree. C. for 1 min; no
primers were present for cycles 1-5. One .mu.M of the M13 reverse
primer was present for cycles 5-10 and 1 .mu.M of the M13 reverse
and JH61 primers were present in cycles 11-25.
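The sizes reported for the three PCR products imply the length of the
region shared by the two initial fragments. The Python sketch below
works through that arithmetic. It assumes the 425 bp and 564 bp
products share a single contiguous overlap and that the 889 bp product
spans both of them end to end; the function name is illustrative.

```python
def overlap_length(fragment_a_bp, fragment_b_bp, joined_bp):
    """Length of the region shared by two fragments joined through it.

    joined = a + b - overlap, so overlap = a + b - joined.
    """
    overlap = fragment_a_bp + fragment_b_bp - joined_bp
    if overlap <= 0:
        raise ValueError("fragments do not overlap at these sizes")
    return overlap


if __name__ == "__main__":
    # Fragment sizes from the text: 425 bp, 564 bp, joined 889 bp.
    print(overlap_length(425, 564, 889), "bp of shared sequence")
```

Under these assumptions the two products share about 100 bp, which is
the stretch over which they can prime one another during the
primer-free cycles.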
[0204] The DNA sequence of pD389A was obtained as described above
using the JH61 and JH66 (SEQ ID NOS:12 and 13) primers. The DNA
sequence of the polymerase coding region present in pD389A is
listed in SEQ ID NO:32. The corresponding amino acid sequence of
the Tne M284(D389A) protein is listed in SEQ ID NO:33.
[0205] g) Construction of pD323,389A
[0206] pD323,389A produces a modified form of the Tne polymerase
which lacks the first 283 amino acids from the N-terminus of the
full-length protein and contains two amino acid substitutions at
residues 323 and 389. At amino acid residue 323, the wild-type
aspartic acid residue is replaced with alanine and at amino acid
residue 389, the wild-type aspartic acid residue is replaced with
alanine.
[0207] To create pD323,389A, the 215 bp Csp45I fragment of pD323A
was exchanged with the analogous fragment in pD389A bringing the
two single mutations into the same construct. The desired mutants
were selected as having the two restriction site changes of the
individual mutants (described above). The DNA sequence of
pD323,389A was obtained as described above using the JH66 (SEQ ID
NO:13) primer. The DNA sequence of the polymerase coding region
present in pD323,389A is listed in SEQ ID NO:34. The corresponding
amino acid sequence of the Tne M284(D323A,D389A) protein is listed
in SEQ ID NO:35; this enzyme is referred to as the triple mutant
Tne polymerase.
[0208] When all of the above point mutant constructs (sections a-g)
were induced to express the modified Tne polymerases in an E. coli
host, the crude extracts showed polymerase activity comparable with
the activity observed using the original pM284 construct (cultures
were grown, induced, lysates prepared and assayed for DNA
polymerase activity as described in Example 2).
[0209] h) Construction of Deletion Mutants pM323 and
pJM284.DELTA.B
[0210] Two mutants were created which contained deletions into the
putative 3' to 5' exonuclease domain of the Tne polymerase gene. It
was predicted that these two deletion mutations would remove all 3'
to 5' exonuclease activity. Surprisingly, when these two mutants
were expressed in E. coli no polymerase activity was detected.
[0211] i) Construction of pM323
[0212] pM323 was constructed by digestion of pJM284 with SinI and
EcoRV followed by removal of the 3' overhanging (i.e., sticky) ends
with Mung Bean nuclease. The 1.9 kb SinI/EcoRV(polished ends)
fragment was isolated as described in Example 2. JHEX3 (Example 3)
was digested with NcoI and SmaI and the NcoI overhanging ends were
made blunt by incubation with the Klenow fragment. The 1.9 kb
SinI/EcoRV(polished ends) fragment was then blunt end ligated into
the prepared JHEX3 vector. When the blunted SinI end ligates to the
blunted NcoI end 39 amino acids are removed from the N terminus of
the protein encoded by the pM284 construct; this junction also
creates an initiating methionine at amino acid position 323 that is
in-frame with the rest of the coding region. The polymerase coding
region present in the pM323 construct was sequenced to confirm that
no undesirable mutations were introduced (i.e., insertion of stop
codons, frame-shift mutations). The DNA sequence of the polymerase
coding region present in pM323 is listed in SEQ ID NO:36. The
corresponding amino acid sequence of the Tne M323 protein is listed
in SEQ ID NO:37.
[0213] ii) Construction of pJM284.DELTA.B
[0214] pJM284.DELTA.B was constructed by digestion of the JM284
construct with BglII followed by religation of the large fragment
(6.6 kb) back on itself. Digestion of JM284 with BglII created two
fragments, the smaller being 348 bp which contains DNA sequences
which encode a portion of the 3' to 5' exonuclease domain. The
pJM284.DELTA.B construct removed 116 amino acids of the 3' to 5'
exonuclease domain which correspond to residues 309 through 424 in
SEQ ID NO:2. pJM284.DELTA.B contains the same 25 amino acids which
form the N terminus of the protein encoded by pM284. This
deletion does not change the reading frame for the polymerase
domain. Restriction digest analysis was conducted on the
pJM284.DELTA.B construct to confirm the proper construction was
made. The DNA sequence of the polymerase coding region present in
pJM284.DELTA.B is listed in SEQ ID NO:38. The corresponding amino
acid sequence of the Tne M284.DELTA.B protein is listed in SEQ ID
NO:39.
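The residue counts given for the two deletion constructs can be checked against one another. The Python sketch below uses only numbers stated in the text; the helper names are hypothetical.

```python
CODON_BP = 3  # nucleotides per codon


def residues_in_fragment(fragment_bp: int) -> int:
    """Number of codons in a fragment; raises if removing it would shift the frame."""
    if fragment_bp % CODON_BP:
        raise ValueError("fragment length is not a whole number of codons")
    return fragment_bp // CODON_BP


def residues_in_range(first: int, last: int) -> int:
    """Count of residues from first through last, inclusive."""
    return last - first + 1


# pJM284.DELTA.B: the 348 bp BglII fragment encodes residues 309 through 424.
deleted = residues_in_fragment(348)
assert deleted == residues_in_range(309, 424)

# Residues 284 through 308 remain ahead of the deletion.
retained_n_terminal = residues_in_range(284, 308)

# pM323: translation starts at position 323 rather than 284.
removed_in_pm323 = 323 - 284

print(deleted, retained_n_terminal, removed_in_pm323)
```

Because 348 is divisible by three, excision of the BglII fragment leaves the downstream reading frame unchanged, consistent with the statement above.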
[0215] iii) Expression of pM323 and pJM284.DELTA.B in E. coli
[0216] When both of these deletion mutant constructs were grown
and induced to express the Tne polymerase in an E. coli host
(JM109), no polymerase activity was detected. As these
two constructs were analyzed by restriction digestion or DNA
sequencing to ensure that no undesirable mutations were introduced,
it appears, surprisingly, that deletion into the putative 3'
exonuclease domain (approximately aa residues 291-484) is
deleterious either for polymerase activity or alternatively for
protein stability. These results, in conjunction with those
obtained using the pM316 construct, show that deletions beyond
about position 849 of SEQ ID NO:1 produce proteins which either are
unstable (perhaps due to improper folding) or lack polymerase
activity.
EXAMPLE 5
Purification of Tne DNA Polymerases
[0217] In order to produce purified preparations of the wild-type
and modified Tne polymerases, cells harboring the Tne expression
vectors described above were grown, induced and the Tne polymerases
were isolated.
[0218] a) Growth of E. coli Cells Harboring Recombinant Tne (rTne)
Constructs
[0219] E. coli strains containing a desired construct were streaked
onto LB plates [10 g/l tryptone, 5 g/l yeast extract, 5 g/l NaCl, 1
ml/l 1N NaOH and 15 g/l agar] containing 10 .mu.g/ml tetracycline
to isolate single colonies and the plates were grown overnight at
37.degree. C. A single colony was inoculated into 150 ml LB broth
[10 g/l tryptone, 5 g/l yeast extract, 5 g/l NaCl and 1 ml/l 1N
NaOH] containing 10 .mu.g/ml tetracycline (divided into three
flasks of 50 ml each); the three flasks were grown with shaking
overnight at 37.degree. C. The next day, 120 ml of the overnight
culture was used to seed the fermentation of 6 liters of LB
containing 10 .mu.g/ml tetracycline prewarmed to 37.degree. C.
(divided into six flasks of 1 liter each). The large scale culture
was grown for 5 hours at 37.degree. C. and then IPTG was added to a
final concentration of 1 mM and growth was continued for an
additional 2 hours at 37.degree. C. The induced cells were
harvested by centrifugation at 9,000 rpm for 5 minutes in a
Beckman JA10 rotor. Yields were typically 2 g cell paste per liter
of fermented culture.
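The scale-up described above can be summarized numerically. The Python sketch below is illustrative only; the function names are not part of the described method, and the yield calculation simply applies the typical figure of 2 g per liter.

```python
def inoculum_fraction(seed_ml: float, culture_ml: float) -> float:
    """Volume fraction of overnight culture used to seed the large-scale culture."""
    return seed_ml / culture_ml


def expected_paste_g(culture_l: float, yield_g_per_l: float = 2.0) -> float:
    """Expected wet cell paste, using the typical yield quoted in the text."""
    return culture_l * yield_g_per_l


seed_per_flask_ml = 120 / 6  # 120 ml of overnight culture over six 1 liter flasks

print(inoculum_fraction(120, 6000), seed_per_flask_ml, expected_paste_g(6))
```

On these figures each 1 liter flask receives 20 ml of overnight culture (a 1:50 inoculum), and a 6 liter preparation would be expected to give about 12 g of cell paste.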
[0220] b) Purification of rTne DNA Polymerases
[0221] Ten grams of cell paste (prepared as described above) were
resuspended in 100 ml of an ice-cold solution containing 0.25 M
NaCl in TEDGT buffer [50 mM Tris-HCl (pH 7.3), 1 mM EDTA, 1 mM DTT,
10% glycerol and 0.1% Tween 20] containing 2.5 mM PMSF. The
resuspended cells were lysed by sonication using a Vibracell
sonicator (Model VCX600; Sonics and Materials, Inc., Danbury,
Conn.). The solution was kept ice-cold during sonication by
placement of the beaker containing the cell suspension in a salted
ice bath. Sonication was repeated ten times at 40% output for 1
minute with a 2 minute rest between the 1 minute sonication bursts.
The cell lysate was heat treated to denature the bulk of E. coli
proteins by incubation of the lysate at 68.degree. C. to 70.degree.
C. for 5 to 10 minutes; following heat treatment the lysate was
placed on ice.
[0222] The following purification steps were performed at 4.degree.
C. The chilled lysate was centrifuged at 15,000 rpm for 15 minutes
in a Beckman JA18 rotor to remove the heat-denatured proteins. The
cleared lysate supernatant was removed and 4 ml of 5%
polyethylenimine (PEI) was added to the supernatant to precipitate
any DNA present in the lysate. The lysate was centrifuged at 15,000
rpm for 15 minutes in a Beckman JA18 rotor to remove the
precipitated DNA. The supernatant was retrieved and solid ammonium
sulfate was added to 60% saturation to precipitate the DNA
polymerase. After dissolution of the ammonium sulfate, the sample
was centrifuged at 15,000 rpm for 1 hour in a Beckman JA18 rotor.
The supernatant was discarded and the precipitated proteins were
gathered and dissolved in 10 ml TEDGT buffer. The resolubilized
protein was then placed in a dialysis membrane tubing having a
12,000 to 14,000 mw cutoff (Spectra, Houston, Tex.) and then
dialyzed against TEDGT buffer to remove the ammonium sulfate.
[0223] The dialyzed protein solution was then loaded onto a 15 ml
Heparin Sepharose (Scientific Protein Laboratory, Waunakee, Wis.)
column (1.7 cm i.d..times.6.5 cm height). The column was washed
with 150 ml 0.05 M NaCl in TEDGT buffer. A 100 ml salt (NaCl)
gradient was run over the column to elute the DNA polymerase; the
gradient started at 0.05 M NaCl and ended at 1 M NaCl (all in TEDGT
buffer). Fractions (1.5 ml) were collected and assayed for DNA
polymerase activity at 74.degree. C. as described below.
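The relationship between gradient volume, fraction number and NaCl concentration can be sketched as follows. This Python example assumes an ideal linear gradient with no column dead volume or mixing delay, and counts fractions from the start of the gradient; both functions are hypothetical helpers rather than part of the described procedure.

```python
def nacl_at_volume(v_ml: float, start_m: float = 0.05, end_m: float = 1.0,
                   gradient_ml: float = 100.0) -> float:
    """NaCl molarity after v_ml of an ideal linear gradient has been delivered."""
    if not 0.0 <= v_ml <= gradient_ml:
        raise ValueError("volume lies outside the gradient")
    return start_m + (end_m - start_m) * v_ml / gradient_ml


def fraction_for_nacl(target_m: float, fraction_ml: float = 1.5,
                      start_m: float = 0.05, end_m: float = 1.0,
                      gradient_ml: float = 100.0) -> int:
    """1-based index of the fraction in which the gradient reaches target_m."""
    v_ml = (target_m - start_m) / (end_m - start_m) * gradient_ml
    return int(v_ml // fraction_ml) + 1


# Fractions expected to span an elution window of 0.14 M to 0.29 M NaCl.
print(fraction_for_nacl(0.14), fraction_for_nacl(0.29))
```

In practice the observed fraction numbers will be shifted by the bed and tubing volumes, so the sketch is best used to decide which range of fractions to assay first.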
[0224] Thermostable DNA polymerase activity was assayed by
incorporation of radiolabeled dTTP into nicked and gapped (i.e.,
activated) calf thymus DNA (prepared as described below). One unit
of thermostable DNA polymerase is defined as the amount of enzyme
required to catalyze the incorporation of 10 nmol of dNTP into an
acid-insoluble form in 30 minutes at 74.degree. C. The reaction
conditions comprised: 50 mM Tris-HCl (pH 9.0 at 25.degree. C.), 50
mM NaCl, 10 mM MgCl.sub.2, 200 .mu.M dATP, 200 .mu.M dCTP, 200
.mu.M dGTP, 200 .mu.M dTTP and 5 .mu.Ci .sup.3H-dTTP (Amersham) and
60 .mu.g activated calf thymus DNA in a 250 .mu.l final volume.
[0225] The reaction components were assembled at room temperature.
Samples suspected of containing polymerase activity were added (5
.mu.l containing 0.05 to 0.5 units) and the tube was incubated at
74.degree. C.; aliquots (50 .mu.l) were withdrawn at 6, 9, 12
and 15 minutes and placed immediately into 1.5 ml microcentrifuge
tubes containing 0.5 ml of ice-cold 10% TCA on ice. After 10-30
minutes on ice, the entire TCA precipitation was filtered through a
GF/A filter (Whatman). The reaction tubes were rinsed with 3
volumes of cold 5% TCA and the filters were washed twice with 10 ml
of ice-cold 5% TCA followed by a rinse with 1 ml of acetone. The
filters were dried and the radioactivity bound to filters was
counted in a scintillation counter.
[0226] Total and background counts were determined as follows. For
total cpm, 10 .mu.l of the reaction mix (without any polymerase
added) was spotted onto duplicate GF/A filters and counted. For
background counts, 50 .mu.l of the reaction mix (without any
polymerase added) was added to duplicate tubes containing 0.5 ml
cold 10% TCA and the samples were filtered through GF/A filters and
washed as described above.
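The unit definition and the total and background counts described above can be combined into a units calculation. The Python sketch below is one possible way to do so and is not taken from the source. It assumes that the four dNTPs are incorporated in equal amounts (so total dNTP is four times the dTTP incorporated), that counting efficiency is the same for spotted and filtered samples, and that incorporation is linear with time. The cpm values in the example are invented for illustration.

```python
DTTP_UM = 200.0       # micromolar dTTP in the reaction mix (from the text)
REACTION_UL = 250.0   # final reaction volume in microliters (from the text)
UNIT_NMOL = 10.0      # nmol dNTP incorporated per unit (from the text)
UNIT_MINUTES = 30.0   # time base of the unit definition (from the text)


def units_in_reaction(sample_cpm: float, background_cpm: float,
                      total_cpm_10ul: float, minutes: float,
                      aliquot_ul: float = 50.0,
                      dntp_per_dttp: float = 4.0) -> float:
    """Polymerase units present in the whole reaction.

    total_cpm_10ul is the count from 10 microliters of reaction mix spotted
    directly on a filter; dntp_per_dttp = 4 is an assumption, not a source value.
    """
    # micromolar x microliters = pmol; divide by 1000 for nmol
    nmol_dttp_spotted = DTTP_UM * 10.0 / 1000.0
    cpm_per_nmol_dttp = total_cpm_10ul / nmol_dttp_spotted
    nmol_dttp = (sample_cpm - background_cpm) / cpm_per_nmol_dttp
    nmol_dntp_reaction = nmol_dttp * dntp_per_dttp * REACTION_UL / aliquot_ul
    return nmol_dntp_reaction * (UNIT_MINUTES / minutes) / UNIT_NMOL


# Hypothetical counts for a 15 minute aliquot.
print(units_in_reaction(sample_cpm=10100, background_cpm=100,
                        total_cpm_10ul=200000, minutes=15))
```

With these invented counts the reaction would contain about 0.4 units, which lies within the 0.05 to 0.5 unit range recommended for the sample added.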
[0227] Activated calf thymus DNA was prepared by dissolving 1 g
calf thymus DNA (#D-151, Sigma, St. Louis, Mo.) in 400 ml TM buffer
[10 mM Tris-HCl (pH 7.3), 5 mM MgCl.sub.2]. Four hundred
microliters of a solution containing 40 units of RQ1-DNAse
(Promega) in TM buffer was added to the DNA solution and incubated
at 37.degree. C. for 10 minutes. The DNAse digestion was stopped by
heating the DNA solution at 68.degree. C. for 30 minutes. The
activated calf thymus DNA was stored at -20.degree. C. until used.
The activated calf thymus DNA was heated to 74.degree. C. for 10
minutes and then cooled to room temperature before use.
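The volume of activated calf thymus DNA needed per assay follows from the stock concentration. The Python sketch below shows the calculation; the function names are illustrative, and the small dilution caused by the 400 microliters of DNAse solution is ignored unless passed explicitly.

```python
def stock_ug_per_ul(dna_g: float, buffer_ml: float, added_ul: float = 0.0) -> float:
    """Concentration of the DNA stock in micrograms per microliter."""
    total_ul = buffer_ml * 1000.0 + added_ul
    return dna_g * 1e6 / total_ul


def volume_for_mass_ul(mass_ug: float, conc_ug_per_ul: float) -> float:
    """Volume of stock that contains the requested mass of DNA."""
    return mass_ug / conc_ug_per_ul


conc = stock_ug_per_ul(1.0, 400.0)          # 1 g DNA dissolved in 400 ml TM buffer
print(conc, volume_for_mass_ul(60.0, conc))  # 60 micrograms per 250 microliter assay
```

On these figures the stock is 2.5 micrograms per microliter, so each 250 microliter polymerase assay would take about 24 microliters of activated DNA.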
[0228] As shown in Table 2 below, the Tne DNA polymerases generally
eluted from the Heparin Sepharose column at a salt concentration of
between 0.14 M and 0.29 M NaCl. Fractions containing the polymerase
activity were pooled, placed into dialysis membrane tubing (as
described above) and dialyzed against TEDGT buffer until the salt
concentration was less than or equal to 0.05 M NaCl as measured by
conductivity using a conductance meter (Yellow Spring Instrument
Co., Yellow Springs, Ohio).
[0229] The dialyzed polymerase fraction was then loaded onto a 7 ml
Cibacron Blue 3GA (Sigma, St. Louis, Mo.) column (1.25 cm
i.d..times.6 cm height). The Cibacron Blue 3GA column was washed
with 70 ml of 0.05 M NaCl in TEDGT buffer. A 100 ml salt gradient
was run over the column to elute the DNA polymerase; the gradient
started at 0.05 M NaCl and ended at 1 M NaCl (all in TEDGT).
Fractions (1.5 ml) were collected and assayed for DNA polymerase
activity at 74.degree. C. (as described above).
[0230] As shown in Table 2, the Tne DNA polymerases generally
eluted from the Cibacron Blue 3GA column at a salt concentration
of between 0.20 M and 0.46 M NaCl. Fractions containing the
polymerase activity were pooled, placed in dialysis membrane (as
described above) and dialyzed against storage buffer [20 mM
Tris-HCl (pH 8.0), 100 mM KCl, 0.1 mM EDTA, 1 mM DTT, 50% glycerol
and 0.5% Tween 20]. Aliquots of preparations of purified Tne
polymerases were electrophoresed on SDS-PAGE gels and stained with
Coomassie blue. The purified DNA polymerases produced by all of the
recombinant Tne polymerase constructs were judged to be at least
95% pure based on visual inspection of the Coomassie-stained SDS
PAGE gels.
TABLE 2
Salt Elution Characteristics For rTne DNA Polymerases

Polymerase                         SEQ ID NO   Heparin Sepharose      Cibacron Blue
rTne DNA Polymerase full-length    2           0.26 to 0.37 M NaCl    0.5 to 0.8 M NaCl
Tne M284                           8           0.13 to 0.3 M NaCl     0.05 to 0.3 M NaCl
Tne M284 (E325D)                   19          0.15 to 0.28 M NaCl    0.3 to 0.48 M NaCl
Tne M284 (D468N)                   26          0.11 to 0.23 M NaCl    0.16 to 0.35 M NaCl
Tne M284 (D323E)                   16          0.05 to 0.32 M NaCl    0.11 to 0.4 M NaCl
Tne M284 (Y464F)                   23          0.11 to 0.28 M NaCl    0.24 to 0.4 M NaCl
Tne M284 (D323A, D389A)            35          0.14 to 0.27 M NaCl    0.06 to 0.46 M NaCl
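The general elution windows quoted in the text (0.14 M to 0.29 M NaCl for Heparin Sepharose and 0.20 M to 0.46 M NaCl for the blue dye column) appear to coincide with the column means of Table 2. This is an observation drawn from the tabulated numbers, not a statement made in the source. The Python sketch below reproduces the comparison; the dictionary layout and function name are illustrative.

```python
from statistics import mean

# (low, high) elution limits in M NaCl, transcribed from Table 2:
# first pair is Heparin Sepharose, second pair is the blue dye column.
TABLE_2 = {
    "full-length rTne":        ((0.26, 0.37), (0.50, 0.80)),
    "Tne M284":                ((0.13, 0.30), (0.05, 0.30)),
    "Tne M284 (E325D)":        ((0.15, 0.28), (0.30, 0.48)),
    "Tne M284 (D468N)":        ((0.11, 0.23), (0.16, 0.35)),
    "Tne M284 (D323E)":        ((0.05, 0.32), (0.11, 0.40)),
    "Tne M284 (Y464F)":        ((0.11, 0.28), (0.24, 0.40)),
    "Tne M284 (D323A, D389A)": ((0.14, 0.27), (0.06, 0.46)),
}


def mean_window(column: int) -> tuple:
    """Mean lower and upper elution limits for one column (0 or 1), to 2 places."""
    lows = [row[column][0] for row in TABLE_2.values()]
    highs = [row[column][1] for row in TABLE_2.values()]
    return round(mean(lows), 2), round(mean(highs), 2)


print(mean_window(0))  # Heparin Sepharose
print(mean_window(1))  # blue dye column
```

Individual constructs depart noticeably from these means (the full-length enzyme elutes later from both resins), so the per-construct rows remain the better guide when pooling fractions.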
[0231] The above results provide methods for the isolation of the
full-length and modified forms of Tne DNA polymerase in a highly
pure form at high yields.
EXAMPLE 6
Purification of Tne DNA Polymerase from T. neapolitana Cells
[0232] The preceding example described the isolation of recombinant
Tne DNA polymerases from E. coli harboring plasmids which
overexpress these enzymes. The full-length Tne DNA polymerase may
also be isolated from T. neapolitana cells.
[0233] T. neapolitana cells are obtained from the ATCC (ATCC 49049)
and grown in anaerobic culture in MMS medium containing (per
liter): 6.93 g NaCl; 1.75 g MgSO.sub.4.7H.sub.2O; 1.38 g
MgCl.sub.2.6H.sub.2O; 0.16 g KCl; 25 mg NaBr; 7.5 mg
H.sub.3BO.sub.3; 3.8 mg SrCl.sub.2.6H.sub.2O; 0.025 mg KI; 0.38 g
CaCl.sub.2; 0.5 g KH.sub.2PO.sub.4; 0.5 g Na.sub.2S.9H.sub.2O; 2 mg
(NH.sub.4).sub.2Ni(SO.sub.4).sub.2; 15 ml trace minerals [per
liter: 3.0 g MgSO.sub.4.7H.sub.2O; 1.5 g nitrilotriacetic acid; 1.0 g
NaCl; 0.5 g MnSO.sub.4.H.sub.2O; 0.1 g FeSO.sub.4.7H.sub.2O; 0.1 g
CoCl.sub.2.6H.sub.2O; 0.1 g CaCl.sub.2; 0.1 g ZnSO.sub.4.7H.sub.2O;
0.01 g CuSO.sub.4.5H.sub.2O; 0.01 g AlK(SO.sub.4).sub.2.12H.sub.2O;
0.01 g H.sub.3BO.sub.3 and 0.01 g Na.sub.2MoO.sub.4.2H.sub.2O]; 1
mg resazurin and 5 g starch at a pH of 6.5 (adjusted with
H.sub.2SO.sub.4). For growth on solid medium, 0.8% agar (Difco) was
added to the above medium.
[0234] The cells are grown in a fermentation vessel maintained at
75-80.degree. C. under anaerobic conditions (i.e., under nitrogen).
The cells are grown for approximately 28 hours (early stationary
phase). The cells are collected by centrifugation at
10,000.times.g. The cell pellet may be frozen at -70.degree. C.
until used. All of the subsequent operations should be carried out
at 0 to 4.degree. C. unless otherwise stated.
[0235] Resuspend about 50 g of frozen Thermotoga neapolitana cells
in 100 ml of TEDGT [50 mM Tris-HCl (pH 7.3 at 25.degree. C.), 1 mM
EDTA, 1 mM DTT, 10% glycerol, 0.1% Tween 20] containing 2.5 mM PMSF
(from 144 mM stock in DMF). The thawed and resuspended cells can be
lysed in an Aminco French Pressure Cell (American Instrument Co.,
Silver Spring, Md., cat. no. FA-073) at 16,000 to 24,000 psi. This
operation should be done twice to ensure adequate lysis. The lysate
should be diluted by adding another 100 ml TEDGT containing 2.5 mM
PMSF and stirring gently.
[0236] PEI (polyethyleneimine) is added to the lysate to
precipitate the DNA. The exact amount of PEI is determined
empirically, but in general 0.2% PEI should be adequate to
precipitate most of the DNA (greater than 90%). Approximately 10 ml
of 5% PEI is added to precipitate most of the DNA in the lysate.
The lysate is centrifuged at 15,000 rpm for 15 minutes in a Beckman
JA18 rotor to remove the precipitated DNA. The supernatant is
retrieved and solid ammonium sulfate is added to 60% saturation to
precipitate the DNA polymerase and other proteins. After the salt
is dissolved, the sample is centrifuged at 15,000 rpm for 1 hour in
a Beckman JA18 rotor. The supernatant is discarded and the
precipitated protein is gathered and dissolved in TEDGT buffer. The
resolubilized protein is then placed in a dialysis membrane having
a 12,000 to 14,000 mw cutoff (Spectra) and then dialyzed against
TEDGT buffer to remove the ammonium sulfate.
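The amount of 5% PEI stock corresponding to a target final concentration can be estimated as follows. In this Python sketch the lysate volume of 200 ml is an assumption (100 ml of resuspension buffer plus 100 ml of diluent, ignoring the volume contributed by the cells), and the function names are illustrative.

```python
def final_percent(stock_percent: float, stock_ml: float, lysate_ml: float) -> float:
    """Final PEI concentration (%) after adding stock_ml of stock to the lysate."""
    return stock_percent * stock_ml / (lysate_ml + stock_ml)


def stock_ml_for_target(stock_percent: float, target_percent: float,
                        lysate_ml: float) -> float:
    """Volume of stock needed to bring the lysate to target_percent PEI."""
    if target_percent >= stock_percent:
        raise ValueError("target concentration must be below the stock concentration")
    return target_percent * lysate_ml / (stock_percent - target_percent)


ASSUMED_LYSATE_ML = 200.0  # assumption, see lead-in

print(final_percent(5.0, 10.0, ASSUMED_LYSATE_ML))       # about 0.24 %
print(stock_ml_for_target(5.0, 0.2, ASSUMED_LYSATE_ML))  # about 8.3 ml
```

Under that volume assumption, 10 ml of 5% PEI gives slightly more than the 0.2% guideline, which is consistent with the statement that the exact amount is determined empirically.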
[0237] The dialyzed protein solution is then loaded onto a 60 ml
DEAE Sepharose (Sigma, St. Louis, Mo.) column (2.5 cm i.d..times.13
cm height). The column is washed with 300 ml 0 M NaCl in TEDGT
buffer. A 300 ml salt gradient is run over the column to elute the
DNA polymerase starting at 0 M NaCl and ending at 0.5 M NaCl (all
in TEDGT buffer). Fractions (5.0 ml) are collected and assayed for
DNA polymerase activity at 74.degree. C. using the protocol
described in Example 5. Fractions containing the polymerase
activity are pooled, placed in dialysis membrane (12,000 to 14,000
mw cutoff; Spectra) and dialyzed against TEDGT buffer until the
salt concentration is less than or equal to 0.05 M NaCl as measured
by conductivity.
[0238] The pooled polymerase fraction is then loaded onto a 15 ml
Heparin Sepharose (Scientific Protein Laboratory, Waunakee, Wis.)
column (1.7 cm i.d..times.6.5 cm height). The column is washed with
150 ml 0.05 M NaCl in TEDGT buffer. A 100 ml salt gradient is run
over the column to elute the DNA polymerase starting at 0.05 M NaCl
and ending at 1 M NaCl (all in TEDGT buffer). Fractions (1.5 ml)
are collected and assayed for DNA polymerase activity at 74.degree.
C. using the assay described in Example 5. The Tne DNA polymerase
should elute between 0.14 M and 0.29 M NaCl (in TEDGT buffer).
Fractions containing the polymerase activity are pooled, placed in
dialysis membrane (12,000 to 14,000 mw cutoff; Spectra) and
dialyzed against TEDGT buffer until the salt concentration is less
than or equal to 0.05 M NaCl as measured by conductivity.
[0239] The pooled polymerase fraction is then loaded onto a 15 ml
DNA Agarose (Pharmacia) column (1.7 cm i.d..times.6.5 cm height).
The column is washed with 75 ml 0 M NaCl in TEDGT buffer. A 100 ml
salt gradient is run over the column to elute the DNA polymerase
starting at 0 M NaCl and ending at 0.5 M NaCl (all in TEDGT
buffer). Fractions (1.5 ml) are collected and assayed for DNA
polymerase activity at 74.degree. C. using the assay described in
Example 5. Fractions containing the polymerase activity are pooled,
placed in dialysis membrane (12,000 to 14,000 mw cutoff; Spectra)
and dialyzed against TEDGT buffer until the salt concentration is
less than or equal to 0.05 M NaCl as measured by conductivity.
[0240] The pooled polymerase fraction is then loaded onto a 7 ml
Cibacron Blue 3GA (Sigma, St. Louis, Mo.) column (1.25 cm
i.d..times.6 cm height). The column is washed with 70 ml of 0.05 M
NaCl in TEDGT buffer. A 100 ml salt gradient is run over the column
to elute the DNA polymerase starting at 0.05 M NaCl and ending at 1
M NaCl (all in TEDGT buffer). Fractions (1.5 ml) are collected and
assayed for DNA polymerase activity at 74.degree. C. using the
assay described in Example 5. The Tne DNA polymerase should elute
between 0.20 M and 0.46 M NaCl (in TEDGT buffer). Fractions
containing the polymerase activity are pooled, placed in dialysis
membrane (12,000 to 14,000 mw cutoff; Spectra) and dialyzed against
storage buffer [20 mM Tris-HCl (pH 8.0), 100 mM KCl, 0.1 mM EDTA, 1
mM DTT, 50% glycerol, 0.5% Tween 20].
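The nominal bed volumes of the columns used in this example are consistent with their stated dimensions when each bed is treated as a simple cylinder. The Python sketch below shows the check and expresses the wash volumes in column volumes; the function names are illustrative.

```python
import math


def bed_volume_ml(inner_diameter_cm: float, height_cm: float) -> float:
    """Volume of a cylindrical column bed (1 cubic centimeter = 1 ml)."""
    return math.pi * (inner_diameter_cm / 2.0) ** 2 * height_cm


def wash_in_column_volumes(wash_ml: float, nominal_bed_ml: float) -> float:
    """Wash volume expressed as a multiple of the nominal bed volume."""
    return wash_ml / nominal_bed_ml


# (inner diameter cm, height cm, nominal bed ml, wash ml) as given in the text
COLUMNS = {
    "DEAE Sepharose":    (2.5, 13.0, 60.0, 300.0),
    "Heparin Sepharose": (1.7, 6.5, 15.0, 150.0),
    "DNA Agarose":       (1.7, 6.5, 15.0, 75.0),
    "blue dye column":   (1.25, 6.0, 7.0, 70.0),
}

for name, (d, h, nominal, wash) in COLUMNS.items():
    print(name, round(bed_volume_ml(d, h), 1), wash_in_column_volumes(wash, nominal))
```

The geometric volumes come out close to the nominal 60 ml, 15 ml and 7 ml beds, and the washes correspond to 5 or 10 column volumes.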
[0241] The purity and approximate molecular weight of the DNA
polymerase can be assessed by SDS-PAGE gel electrophoresis using a
4-20% gradient Tris-Glycine SDS gel (Novex, San Diego, Calif.). An
aliquot of the purified material is mixed with sample buffer [63 mM
Tris-HCl pH 6.8, 10% glycerol, 2% SDS, 0.0025% Bromphenol Blue] and
the gel is run using the following running buffer [25 mM Tris-Base,
192 mM glycine, 0.1% SDS, pH 8.3]. The gel can be run for 90
minutes at 125 V D.C. until the bromphenol blue band is just to the
bottom of the gel. The apparent molecular weight for Tne DNA
polymerase should be about 97,000. Using the activity assay
described in Example 5, the number of units of DNA polymerase per
microliter can be established. By visually assessing the quantity
of DNA polymerase on the Coomassie stained SDS-PAGE gel compared to
the protein standards run in the molecular weight marker lane, the
specific activity of the DNA polymerase preparation may be
estimated. The specific activity of the purified Tne DNA polymerase
should be approximately 100,000 units/mg.
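Specific activity is the ratio of the measured activity to the estimated protein concentration. The Python sketch below shows the unit conversion; the function name is illustrative and the example values are hypothetical, chosen only to show a preparation at the expected 100,000 units/mg.

```python
def specific_activity_units_per_mg(units_per_ul: float, protein_mg_per_ml: float) -> float:
    """Specific activity in units per mg of protein.

    units_per_ul comes from the polymerase assay; protein_mg_per_ml is the
    concentration estimated from the stained gel against protein standards.
    """
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    protein_mg_per_ul = protein_mg_per_ml / 1000.0
    return units_per_ul / protein_mg_per_ul


# Hypothetical preparation: 5 units per microliter at 0.05 mg protein per ml.
print(specific_activity_units_per_mg(5.0, 0.05))
```

Because the protein estimate is made visually from a stained gel, the resulting specific activity should be regarded as approximate.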
EXAMPLE 7
Characterization of the Full-Length and Modified Tne
Polymerases
[0242] The full-length and modified Tne polymerases were assayed
for 5' to 3' exonuclease activity and 3' to 5' exonuclease
activity.
[0243] a) 5' to 3' Exonuclease Assay
[0244] A 5' to 3' exonuclease assay was performed on the Tne M284
DNA polymerase (SEQ ID NO:8) present in crude lysates of cells
containing the pJM284 construct (prepared as described in Example
3c) to determine whether any residual 5' to 3' exonuclease activity
remained in this molecule. A comparison was made of the amount of
5' to 3' exonuclease activity present in the following thermostable
DNA polymerases: Tne M284 (SEQ ID NO:8), the full-length rTne
polymerase (SEQ ID NO:2), nTaq (Promega) and UlTma (Perkin Elmer).
The 5' to 3' exonuclease assay was performed as follows.
[0245] End-labeled substrate DNAs were prepared by digestion of
pBR322 DNA with either EcoRI or EcoRV followed by incubation with
.gamma.-.sup.32P-ATP and T4 polynucleotide kinase. The 5' to 3'
exonuclease assay was performed in a final reaction volume of 25
.mu.l and contained 1.times. Taq buffer (10 mM Tris-HCl (pH 9.0 at
25.degree. C.), 50 mM KCl, 0.1% Triton X-100), 1.5 mM MgCl.sub.2,
200 .mu.M of each of the four dNTPs, 50 ng of labeled substrate DNA
and 5 units of the DNA polymerase to be tested (a no enzyme control
was also conducted; water was used in place of the enzyme). The
reaction was incubated for 1 hour at 74.degree. C. The reaction was
terminated by the addition of 5 .mu.l of 0.5 M EDTA. Ten
microliters of this mixture were spotted onto 2.3 cm circular DE81
filters (Whatman). The filters were dried briefly under a heat
lamp. The filters were washed in 50 ml of 0.5 M sodium phosphate
(pH 6.8) twice for 5 minutes/wash to remove unincorporated counts.
The no enzyme control was used to permit determination of the total
cpm in the sample. The washed filters were dried under a heat lamp
and then the incorporated cpm and total cpm (no enzyme control
filter) were determined by liquid scintillation counting.
[0246] The results of this assay showed that nTaq and the
full-length Tne (SEQ ID NO:2) polymerases contained considerable 5'
to 3' exonuclease activity, while the UlTma and Tne M284 (SEQ ID
NO:8) polymerases did not contain detectable levels of 5' to 3'
exonuclease activity.
[0247] b) 3' to 5' Exonuclease Assay
[0248] 3' to 5' exonuclease assays were performed in order to
establish how mutations in the putative exonuclease domain of the
Tne polymerase gene affected this activity. The assay was initially
performed on crude lysates from E. coli cells containing a plasmid
which produces the Tne M284 polymerase (SEQ ID NO:8). The assays
were subsequently performed on purified preparations of Tne M284
polymerase (SEQ ID NO:8).
[0249] The assay was performed using either a single stranded or
double stranded DNA substrate which contained a 3' end label. The
substrate DNAs were prepared as follows.
[0250] To create the double stranded substrate, Lambda DNA/EcoRI
Markers (Promega G1721) were 3' end labeled with
[.alpha.-.sup.32P]dATP in a final reaction volume of 50 .mu.l
containing 10 .mu.g of lambda DNA, 1.times. Buffer A [6 mM Tris-HCl
(pH 7.5 at 37.degree. C.), 6 mM MgCl.sub.2, 6 mM NaCl and 1 mM
DTT], 5 .mu.l [.alpha.-.sup.32P]dATP (3000 Ci/mmol; Amersham), 5 units
Klenow fragment (Promega) and 10 mM of each of the four dNTPs. The
reaction was incubated for 20 minutes at 37.degree. C. The Klenow
enzyme was inactivated by heating the mixture at 65.degree. C. for
15 min. Unincorporated counts were removed by chromatography of the
reaction mixture on a Nick.TM. Column (Pharmacia) according to the
manufacturer's instructions. The labeled DNA was eluted in a volume
of 400 .mu.l. A 10 .mu.l aliquot of the eluted DNA was counted by
liquid scintillation counting and the aliquot contained
approximately 2.times.10.sup.5 cpm.
[0251] To create the single stranded substrate, a synthetic 74
nucleotide oligonucleotide, PM3074 (SEQ ID NO:41), was 3' end labeled
with [.alpha.-.sup.32P]dATP in a final reaction volume of 10 .mu.l
containing 10 pmoles of the PM3074 oligonucleotide, 1.times. TdT
Buffer [50 mM Tris-HCl (pH 7.5), 10 mM MgCl.sub.2, 5 mM DTT and 0.1
mM spermidine], 3 .mu.l [.alpha.-.sup.32P]dATP (3000 Ci/mmol;
Amersham) and 15 units terminal deoxynucleotidyl transferase (TdT)
(Promega). The reaction was incubated for 60 minutes at 37.degree.
C. The TdT enzyme was inactivated by heating the mixture at
65.degree. C. for 15 min. Unincorporated counts were removed by
chromatography of the reaction mixture on a Nick.TM. Column
(Pharmacia) according to the manufacturer's instructions. The
labeled DNA was eluted in a volume of 400 .mu.l. A 10 .mu.l aliquot
of the eluted DNA was counted by liquid scintillation counting and
the aliquot contained approximately 1.times.10.sup.5 cpm.
[0252] The 3' to 5' exonuclease assay was performed as follows. In
a final volume of 50 .mu.l, the following components were
assembled, 1 or 2 units of the DNA polymerase to be assayed, 5
.mu.l of 10.times. Toga buffer [10 mM Tris-HCl (pH 8.8 at
25.degree. C.), 10 mM KCl, 0.002% Tween 20] and nuclease-free water
(Promega). One tube was also set up which did not contain any
polymerase (i.e., a no enzyme control). The reaction mixtures were
prewarmed to the reaction temperature (reactions were performed at
either 25.degree. C. or 74.degree. C.) and 10 .mu.l of either the
labeled single stranded or double stranded substrates were added to
start the assay. Ten microliter fractions were removed at the
following time points: 2, 4 or 6 minutes. The aliquots were spotted
onto 2.3 cm circular DE81 filters and processed as described in
section a) above.
[0253] One unit of 3' to 5' exonuclease activity
is defined as the amount of enzyme required to remove 1 pmol of
labeled 3' end from the substrate in 30 minutes.
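The unit definition above can be applied to filter-binding data as sketched below in Python. The sketch assumes that the counts retained on the washed DE81 filter represent intact labeled substrate, that the no enzyme control represents the total label, that release is linear with time, and that the quantity of labeled 3' ends in the reaction is known. None of these assumptions, nor the example counts, are stated in the source.

```python
def exonuclease_units(retained_cpm: float, control_cpm: float,
                      pmol_labeled_ends: float, minutes: float) -> float:
    """3' to 5' exonuclease units, extrapolated linearly to 30 minutes.

    One unit removes 1 pmol of labeled 3' end from the substrate in 30 minutes.
    """
    if control_cpm <= 0 or minutes <= 0:
        raise ValueError("control counts and time must be positive")
    fraction_released = 1.0 - retained_cpm / control_cpm
    pmol_released = fraction_released * pmol_labeled_ends
    return pmol_released * 30.0 / minutes


# Hypothetical example: 20% of the label lost from 1 pmol of ends in 6 minutes.
print(exonuclease_units(retained_cpm=8000, control_cpm=10000,
                        pmol_labeled_ends=1.0, minutes=6))
```

Dividing the result by the polymerase units assayed in the same sample gives the exonuclease:polymerase ratio used to compare enzymes.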
[0254] In addition to assaying the polymerases for 3' exonuclease
activity, all samples were also analyzed for DNA polymerase
activity using the assay described in Example 5(b). In this manner,
the ratio of 3' exonuclease activity to polymerase activity could
be determined for each polymerase.
[0255] When the 3' to 5' exonuclease assay was performed on
crude lysates containing the Tne M284 enzyme, the assay was run
using only the single stranded substrate at 25.degree. C. Under
these conditions, a comparison was made between purified UlTma (a
modified form of Tma polymerase which lacks 5' to 3' exonuclease
activity) and UlTma spiked into a lysate derived from E. coli cells
shown to lack 3' to 5' exonuclease activity; the spiked sample
showed a 22% lower 3' exonuclease:polymerase activity ratio
(polymerase activity was measured using the assay described in
Example ). As E. coli lysate alone shows no 3' exonuclease
activity, this result indicated that results obtained with crude
lysates containing the Tne M284 protein (SEQ ID NO:8) should
underestimate the 3' to 5' exonuclease levels that would be seen
using purified Tne polymerase preparations.
[0256] When purified UlTma.TM. was compared with crude lysates
containing the full-length (SEQ ID NO:2) or Tne M284 (SEQ ID NO:8)
polymerases, the highest exonuclease:polymerase activity ratio was
seen for the full-length Tne enzyme (1.37), then M284 Tne (0.83),
then UlTma.TM. (0.45). These results were unexpected as it was not
predictable that a deletion in the 5' to 3' exonuclease domain (Tne
M284 polymerase) would result in a 39% decrease in 3' to 5'
exonuclease activity.
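The 39% figure follows directly from the two activity ratios reported above. A short Python check (the function name is illustrative):

```python
def percent_decrease(reference: float, value: float) -> float:
    """Percentage by which value is lower than reference."""
    return (reference - value) / reference * 100.0


full_length_ratio = 1.37  # exonuclease:polymerase ratio, full-length Tne
m284_ratio = 0.83         # exonuclease:polymerase ratio, Tne M284

print(round(percent_decrease(full_length_ratio, m284_ratio)))
```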
[0257] Purified Tne M284 polymerase was used in 3' exonuclease
assays in comparison to purified UlTma.TM. using both single
stranded and double stranded substrates at 25.degree. C. and
74.degree. C. Surprisingly, both enzymes displayed the same level
of 3' exonuclease activity on single stranded substrate at
25.degree. C. Results obtained using crude lysates containing Tne
M284 polymerase suggested that Tne M284 would have a higher
activity. Both enzymes (purified Tne M284 and UlTma.TM.) gave
nearly identical results when the 3' exonuclease assay was
performed using the double stranded substrate at 25.degree. C. (and
both showed very low activity under these conditions).
[0258] When the 3' exonuclease assay was performed using a single
stranded substrate at 74.degree. C., UlTma.TM. had no activity,
whereas purified Tne M284 showed an exonuclease:polymerase ratio
of 0.32.
[0259] These results demonstrate that purified Tne M284 polymerase
and UlTma.TM. have significantly different 3' to 5' exonuclease
activities.
EXAMPLE 8
Characterization of the Purified Tne Polymerases
[0260] In order to ascertain the characteristics of the full length
and modified forms of the rTne polymerases, a number of
determinations were made as described below.
[0261] a) Molecular Weight on SDS-PAGE Gels
[0262] The apparent molecular weights of the full length and
modified rTne polymerases were determined by SDS-PAGE. An aliquot
(10 units) of each purified polymerase was mixed with sample buffer
[63 mM Tris-HCl pH 6.8, 10% glycerol, 2% SDS, 0.0025% bromophenol
blue] and applied to a precast 4-20% gradient Tris-Glycine SDS gel
(Novex, San Diego, Calif.). Molecular weight markers (Promega) were
run on the same gel to allow calculation of the molecular weight of
the Tne polymerases. The gel was run using the following running
buffer [25 mM Tris-Base, 192 mM glycine, 0.1% SDS, pH 8.3] for 90
minutes at 125 V D.C. until the bromphenol blue band was just to
the bottom of the gel.
[0263] Following electrophoresis, the gel was stained with
Coomassie blue to visualize the proteins. The full-length rTne
polymerase migrated with an apparent molecular weight of 97,000
daltons. Tne M284, Tne M284(D323E), Tne M284(E325D), Tne
M284(Y464F), Tne M284(D468N), and Tne M284(D323A, D389A) all
migrated with an apparent molecular weight of 70,000 daltons.
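Apparent molecular weights are commonly calculated from marker proteins by fitting the logarithm of molecular weight against relative mobility. The source does not state which method was used, so the Python sketch below is one conventional approach offered under that assumption. The marker mobilities and weights are hypothetical and are not the Promega marker values.

```python
import math


def fit_log_mw(markers):
    """Least-squares line log10(MW) = a + b * Rf from (Rf, MW) marker pairs."""
    xs = [rf for rf, _ in markers]
    ys = [math.log10(mw) for _, mw in markers]
    n = len(markers)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def estimate_mw(rf: float, a: float, b: float) -> float:
    """Apparent molecular weight (daltons) of a band with relative mobility rf."""
    return 10 ** (a + b * rf)


# Hypothetical markers: (relative mobility, molecular weight in daltons)
hypothetical_markers = [(0.15, 150000), (0.30, 100000), (0.45, 70000),
                        (0.60, 50000), (0.80, 30000)]
a, b = fit_log_mw(hypothetical_markers)
print(round(estimate_mw(0.32, a, b)))
```

On a gradient gel the relationship is only approximately linear, so estimates are most reliable for bands that migrate between two markers.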
[0264] b) Specific Activity of Purified Tne Polymerases
[0265] Preparations of purified full-length rTne and Tne M284, Tne
M284(D323E), Tne M284(E325D), Tne M284(Y464F), Tne M284(D468N), and
Tne M284(D323A, D389A) were all found to have a specific activity
of 100,000 units/mg when the enzymes were purified and DNA
polymerase assays were conducted as described in Example 5.
[0266] c) 3' to 5' Exonuclease Activity
[0267] Purified preparations of full-length rTne and Tne M284, Tne
M284(D323E), Tne M284(E325D), Tne M284(Y464F), Tne M284(D468N) and
Tne M284(D323A, D389A) were assayed for 3' to 5' exonuclease
activity. In the same experiment, the 3' to 5' exonuclease activities
of the UlTma.TM. (Perkin Elmer) and nTaq (Promega) polymerases were
also measured for comparison to the Tne polymerases.
[0268] The 3' exonuclease assay involved the incubation of the
above enzymes with a 3' end-labeled DNA substrate; the loss of
radioactivity from the labeled substrate was measured after a 10
minute incubation period.
[0269] i) Preparation of the 3' End-Labeled Substrate
[0270] Ten micrograms of lambda phage DNA was digested with 50 u of
MluI (Promega) in a 50 .mu.l reaction volume in 1.times. Buffer D
[6 mM Tris-HCl (pH 7.9 at 37.degree. C.), 6 mM MgCl.sub.2, 150 mM
NaCl and 1 mM DTT]. The reaction was incubated for 3 hours at
37.degree. C. The 3' ends were then filled in using
[.alpha.-.sup.32P]-dCTP and unlabelled dGTP in a reaction
containing 5 u of Klenow Exo- (USB) and 1.times. Buffer D in a
volume of 100 .mu.l; the reaction mixture was incubated for 20 min
at 25.degree. C. The reaction was terminated by heating the sample
to 74.degree. C. for 15 min. The bulk of the unincorporated
[.alpha.-.sup.32P]-dCTP was removed by passage of the sample over a
Nick.TM. column (Pharmacia) and the labeled DNA was eluted in 400
.mu.l TE [10 mM Tris-HCl (pH 7.4), 1 mM EDTA]. An aliquot (10
.mu.l) of the eluted DNA was counted in a scintillation counter and
the aliquot contained approximately 2.times.10.sup.5 cpm.
[0271] ii) 3' to 5' Exonuclease Assay
[0272] The 3' to 5' exonuclease assay was performed as follows. In
a final volume of 50 .mu.l, the following components were
assembled: 1 or 2 units of the DNA polymerase to be assayed, 5
.mu.l of 10.times. Toga buffer [10 mM Tris-HCl (pH 8.8 at
25.degree. C.), 10 mM KCl, 0.002% Tween 20] and nuclease-free water
(Promega). One tube was also set up which did not contain any
polymerase (i.e., a no enzyme control). The reaction mixtures were
prewarmed to the reaction temperature (reactions were performed at
either 25.degree. C. or 74.degree. C.) and 10 .mu.l of the labeled
double stranded substrates were added to start the assay. Ten
microliter fractions were removed at the following time points: 2,
4, 6 and 10 minutes (In a subsequent experiment, aliquots were
withdrawn at 30 and 60 minutes to permit the detection of activity
for enzymes showing very low levels of activity). The aliquots were
spotted onto 2.3 cm circular DE81 filters and processed as
described in Example 7, section a) above.
[0273] One unit of 3' to 5' exonuclease activity is defined as the
amount of enzyme required to remove 1 pmol of labeled 3' end from
the substrate in 30 minutes.
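The unit definition above can be applied to a time point other than 30 minutes with a short Python sketch. The function name is hypothetical, and the scaling assumes that release of label is linear with time over the interval measured, which the text does not state.

```python
def exonuclease_units(pmol_removed, minutes):
    """Convert pmol of labeled 3' end removed in `minutes` into units.

    One unit removes 1 pmol of labeled 3' end in 30 minutes.
    Assumption: release is linear with time, so the measured amount
    can be scaled to a 30-minute period.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return pmol_removed * 30.0 / minutes


# Illustrative inputs only, not measured values.
units_at_30_min = exonuclease_units(1.0, 30)
units_at_10_min = exonuclease_units(0.5, 10)
```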
[0274] In addition to assaying the polymerases for 3' exonuclease
activity, all samples were also analyzed for DNA polymerase
activity using the assay described in Example 5(b). In this manner,
the ratio of 3' exonuclease activity to polymerase activity could
be determined for each polymerase. The results are reported as the
ratio of 3' exonuclease activity to polymerase activity. The value
obtained for the full-length Tne polymerase was assigned a value of
100% and all other values are expressed relative to this value.
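The normalization described in this paragraph can be sketched in Python. The function name and the raw activity numbers are hypothetical placeholders, not measured values; only the arithmetic (exonuclease units divided by polymerase units, scaled so that full-length rTne equals 100%) follows the text.

```python
def relative_exo_pol_ratios(activities, reference="rTne"):
    """Return each enzyme's exo:pol ratio as a percentage of the reference.

    `activities` maps enzyme name -> (exonuclease units, polymerase units).
    """
    ratios = {name: exo / pol for name, (exo, pol) in activities.items()}
    ref = ratios[reference]
    return {name: 100.0 * ratio / ref for name, ratio in ratios.items()}


# Hypothetical raw values, for illustration only.
example = {
    "rTne": (4.0, 2.0),
    "mutant X": (0.5, 2.0),
}
relative = relative_exo_pol_ratios(example)
```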
TABLE 3
3' Exonuclease Activity

DNA Polymerase              Exo:Pol Ratio
rTne                        100 .+-. 9.0
UlTma.TM.                   23.3 .+-. 0.8
Tne M284                    28.0 .+-. 0.08
Tne M284(Y464F)             1.9 .+-. 0.2
Tne M284(D468N)             0.47 .+-. 0.02
Tne M284(D323E)             0.0
Tne M284(E325D)             0.0
Tne M284(D323A, D389A)      0.0
nTaq                        0.0
[0275] The results shown in Table 3 demonstrate that deletions
which remove 5' exonuclease activity (e.g., Tne M284) also affect
3' exonuclease activity in the Tne polymerases (compare rTne with
Tne M284; activity of Tne M284 is roughly one third that seen in
rTne). Mutation of amino acid residues suspected of being critical
for 3' exonuclease activity (by analogy to similar residues in E.
coli polymerase I) further reduces or eliminates 3' exonuclease activity.
The virtual elimination of 3' exonuclease activity seen with the
Tne M284(D468N) polymerase was unexpected. The analogous mutation
in the Klenow fragment of DNA polymerase I (D501N) reduces 3'
exonuclease activity only by 2-fold [Derbyshire et al., EMBO J.
10:17 (1991)]. These results underscore the fact that much remains
to be learned about structure-function relationships and that one
cannot predict, with certainty, the effect of a given mutation
based on analogy to other proteins.
[0276] d) Thermostability of the Purified Tne Polymerases at
97.5.degree. C.
[0277] Thermostability was measured by incubating a DNA polymerase
at 97.5.degree. C. for various amounts of time and measuring the
remaining DNA polymerase activity at 74.degree. C. Samples were
withdrawn after 0, 5, 10, 30, 60, 90, and 120 minutes of
exposure to 97.5.degree. C. and DNA polymerase assays were
conducted as described in Example 5b. The time necessary to reduce
the activity by one-half the initial value was determined by
plotting the remaining activity versus the time of incubation at
97.5.degree. C.
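One way to read a half-life off such a plot is sketched below in Python. The text says only that remaining activity was plotted against time; the linear interpolation between the two sampled points that bracket 50% is an assumption made here, and the activity readings are hypothetical.

```python
def half_life_by_interpolation(times, activities):
    """Estimate the time at which activity falls to half its initial value.

    Interpolates linearly between the two consecutive samples that
    bracket 50% of the time-zero activity. Returns None if activity
    never falls to half within the sampled times.
    """
    half = activities[0] / 2.0
    points = list(zip(times, activities))
    for (t0, a0), (t1, a1) in zip(points, points[1:]):
        if a0 >= half >= a1 and a0 != a1:
            return t0 + (a0 - half) * (t1 - t0) / (a0 - a1)
    return None


# Sampling times from the text; activity values are hypothetical.
times_min = [0, 5, 10, 30, 60, 90, 120]
activity_pct = [100, 80, 60, 20, 5, 1, 0]
t_half = half_life_by_interpolation(times_min, activity_pct)
```

With sparse sampling the estimate is only as fine as the spacing of the bracketing time points.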
[0278] The incubation buffer contained 10 mM Tris-HCl (pH 8.8 at
25.degree. C.), 10 mM KCl, 0.002% Tween 20 and 1.5 mM MgCl.sub.2
and 2-3 units of the enzyme to be tested. All enzymes, including
commercially available enzymes (e.g., nTaq, UlTma.TM.), were
assayed using the DNA polymerase assay described in Example 5. The
results of the thermostability assays are summarized in Table 4
below.
TABLE 4
Thermostability of Tne Polymerases

DNA polymerase              SEQ ID NO:    Half-life at 97.5.degree. C. (min)
full-length rTne            2             5
Tne M284(E325D)             19            5
native Taq                  --            8
UlTma.TM.                   --            12
Tne M284(D323E)             16            12.5
Tne M284(Y464F)             23            16
Tne M284                    8             18
Tne M284(D323A, D389A)      35            22
Tne M284(D468N)             26            66
[0279] The results shown in Table 4 demonstrate that deletion of
the amino terminal 5' to 3' exonuclease domain of Tne polymerase
increases resistance to thermal degradation (2.5 to 13.2 fold).
While amino-terminal deletions have been shown to increase
thermostability (i.e., thermal tolerance) of certain modified forms
of Taq DNA polymerase (e.g., the Stoffel fragment) and Tma DNA
polymerase, the increase seen is generally about 2-3 fold. The
increase in thermostability seen by the introduction of single or
double point mutations into the Tne M284 deletion mutant was
unexpected, especially the dramatic increase in thermostability
caused by the single point mutation present in Tne M284(D468N) (SEQ
ID NO:26).
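The fold increases quoted above can be recomputed from the half-lives in Table 4 with a few lines of Python. The dictionary is a transcription of the table's Tne entries; the variable names are illustrative.

```python
# Half-life at 97.5 deg C in minutes, transcribed from Table 4 (Tne enzymes only).
half_life_min = {
    "full-length rTne": 5,
    "Tne M284(E325D)": 5,
    "Tne M284(D323E)": 12.5,
    "Tne M284(Y464F)": 16,
    "Tne M284": 18,
    "Tne M284(D323A, D389A)": 22,
    "Tne M284(D468N)": 66,
}

# Fold change in half-life relative to the full-length enzyme.
reference = half_life_min["full-length rTne"]
fold = {name: t / reference for name, t in half_life_min.items()}
```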
[0280] The Tne M284(D468N) protein is identical to the Tne M284
protein with the exception that an asparagine is substituted for an
aspartate at position 468; the carbon backbones of these two
molecules should be identical, yet there is a greater than 5-fold
resistance to thermal degradation seen when asparagine is present
at position 468. The only difference between these two molecules
should be a negatively charged carboxylate group (aspartate) and a
neutral amide group (asparagine); both of these groups can
participate in hydrogen-bonding. Substitutions of solvent-exposed
amino acids (e.g., aspartate or asparagine) have been shown to have
little effect on protein stability or structure [Matthews, Ann.
Rev. Biochem. 62:139 (1993)], leading to the view that the rigid
parts of proteins are critical for folding and stability. Based on
analogy to the E. coli DNA polymerase I molecule at position 501,
it is expected that position 468 in the Tne DNA polymerase is
accessible to solvent. Furthermore, since Tne M284(D468N) lacks the
carboxylate group to coordinate a metal ion, it was predicted that a
slight destabilization (in response to heat) would be seen rather
than a dramatic stabilization. These data show that a subtle change
in the 3' exonuclease active site can dramatically alter the
thermostability (i.e., thermotolerance) of the Tne DNA
polymerase.
[0281] The above results demonstrate that several of the modified
Tne polymerases have superior thermostability; accordingly, when
these enzymes are used in PCR and other reactions run at elevated
temperature, less polymerase activity needs to be used as less
enzyme is inactivated by exposure to elevated temperature.
[0282] g) Optimal Temperature for DNA Polymerase Activity
[0283] The DNA polymerase activity of several Tne polymerases and
nTaq DNA polymerase was measured at various temperatures using the
assay described in Example 5b (with the exception that the
temperature of incubation was varied). The results are summarized
in Table 5. The temperature which gave the highest activity for a
given enzyme was assigned a value of 100% and all other values
given are expressed relative to the 100% value.
[0284] The results shown in Table 5 demonstrate that for nTaq DNA
polymerase, the maximal DNA polymerase activity was present when
the reaction was run at 78.degree. C. Optimal temperature for DNA
polymerase activity for the full-length rTne DNA polymerase was
74.degree. C. Optimal temperature for DNA polymerase activity for
the Tne M284 and Tne M284(D323A, D389A) DNA polymerases was
75.degree. C.
TABLE 5
Optimal Temperature For Tne Polymerase Activity

Temp.                       Tne M284
(.degree. C.)   Tne M284    (D323A, D389A)    nTaq    rTne
40              12          15                n.d.    n.d.
45              15          17                n.d.    n.d.
50              18          20                n.d.    n.d.
55              24          26                n.d.    n.d.
60              33          36                n.d.    n.d.
65              53          52                n.d.    n.d.
70              71          80                n.d.    77
72              n.d.        n.d.              n.d.    86
74              n.d.        n.d.              n.d.    100
75              100         100               n.d.    n.d.
[0285]
TABLE 5 (continued)
Optimal Temperature For Tne Polymerase Activity

Temp.                       Tne M284
(.degree. C.)   Tne M284    (D323A, D389A)    nTaq    rTne
76              n.d.        n.d.              93      94
77              n.d.        n.d.              97      n.d.
78              n.d.        n.d.              100     96
79              n.d.        n.d.              97      n.d.
80              53          56                85      85
81              n.d.        n.d.              82      n.d.
82              n.d.        n.d.              68      68
85              16          16                n.d.    n.d.
90              8           8                 n.d.    n.d.
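The temperature optima stated in the text can be checked against Table 5 with a short Python sketch. The dictionary below is a transcription of the table with n.d. entries omitted; the variable names are illustrative.

```python
# Relative activity (% of maximum) keyed by temperature in deg C,
# transcribed from Table 5; temperatures marked n.d. are omitted.
table5 = {
    "Tne M284": {40: 12, 45: 15, 50: 18, 55: 24, 60: 33, 65: 53,
                 70: 71, 75: 100, 80: 53, 85: 16, 90: 8},
    "Tne M284(D323A, D389A)": {40: 15, 45: 17, 50: 20, 55: 26, 60: 36,
                               65: 52, 70: 80, 75: 100, 80: 56, 85: 16, 90: 8},
    "nTaq": {76: 93, 77: 97, 78: 100, 79: 97, 80: 85, 81: 82, 82: 68},
    "rTne": {70: 77, 72: 86, 74: 100, 76: 94, 78: 96, 80: 85, 82: 68},
}

# For each enzyme, pick the tested temperature with the highest activity.
optimum = {enzyme: max(curve, key=curve.get) for enzyme, curve in table5.items()}
```

Because each enzyme was tested at a different set of temperatures, the optimum found this way is limited to the temperatures actually sampled.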
EXAMPLE 9
The Tne DNA Polymerases Provide Improved Enzymes for use in the
Polymerase Chain Reaction
[0286] The use of the full-length and modified forms of Tne
polymerase in the PCR was examined. The results of the following
experiments show that the Tne polymerases provide improved enzymes
for a variety of PCR applications.
[0287] a) The Modified Tne Polymerases Utilize a Broader Range of
Optimal dNTP Concentrations in the PCR
[0288] PCR reactions were conducted using a range of dNTP
concentrations to determine the optimal range of dNTP
concentrations utilized by either the UlTma.TM. DNA polymerase
(Perkin Elmer) or the Tne M284 and Tne M284(D323A, D389A)
polymerases. The modified Tne polymerases were found to produce
high yields of amplified product over a much broader range of
nucleotide concentrations. The PCRs were conducted as follows.
[0289] The three enzyme preparations were assayed for DNA
polymerase activity using the assay in Example 5b so that the same
amount of enzyme was used in the PCRs. PCRs were performed in a
buffer optimized for each type of enzyme; the UlTma.TM. and Tne
enzymes were both found to give optimal results in the following
buffer [10 mM Tris-HCl (pH 8.8 at 25.degree. C.), 10 mM KCl, 0.002%
Tween 20]. All reactions contained 1 ng of pGEM-luc (Promega) as
the template, 20 pmol of each primer [the primers used were
LME41(SEQ ID NO:42) and LME43(SEQ ID NO:43)], 1.5 mM MgCl.sub.2
(this concentration was chosen as it was optimal for both the
UlTma.TM. and the modified Tne enzymes as shown below in section b)
and 2.5 units of each enzyme. The final reaction volume was 50
.mu.l.
[0290] A dilution series was created for the mixture of all four
dNTPs ranging from 20 to 200 .mu.M (20, 40, 60, 80, 100, 120, 140,
160 and 200 .mu.M). The PCRs were thermal cycled using the
following conditions: an initial denaturation at 96.degree. C. for
2 min, followed by 30 cycles comprising denaturation at 94.degree.
C. for 30 sec, annealing/extension at 65.degree. C. for 2 min;
following the last cycle the tubes were incubated at 65.degree. C.
for 10 min and then the tubes were incubated at 4.degree. C.
Thermal cycling was conducted on a Perkin Elmer Thermocycler Model
480. The PCR products were resolved on 1% agarose gels, stained
with EtBr and quantitated by visual inspection of the stained gels.
The experiment was repeated a second time and similar results were
obtained.
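Setting up such a dilution series is a C1V1 = C2V2 calculation, sketched below in Python. The stock concentration (1 mM of each dNTP) and the function name are assumptions made for illustration; the text gives only the final concentrations and the 50 .mu.l reaction volume.

```python
def stock_volume_ul(final_um, reaction_ul=50.0, stock_um=1000.0):
    """Volume of dNTP stock needed for a given final concentration.

    Uses C1 * V1 = C2 * V2. The default stock of 1000 uM (1 mM each
    dNTP) is a hypothetical value, not taken from the text.
    """
    return final_um * reaction_ul / stock_um


# Final dNTP concentrations (uM) tested in the series described above.
series_um = [20, 40, 60, 80, 100, 120, 140, 160, 200]
volumes_ul = {conc: stock_volume_ul(conc) for conc in series_um}
```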
[0291] The optimal dNTP concentration for UlTma.TM. was found to be
40 .mu.M (the same value as reported to be optimal by Perkin Elmer
in the UlTma.TM. product insert); slight levels of product could be
detected using the UlTma.TM. enzyme at a concentration of 20 .mu.M
dNTPs but not at all at other concentrations tested. The optimal
range of concentrations of dNTPs for Tne M284(D323A, D389A) was
found to be from 60 .mu.M to 200 .mu.M with slight amounts of
product being detected at both 20 and 40 .mu.M dNTP. The optimal
range of concentrations of dNTPs for Tne M284 was from 40 .mu.M to
200 .mu.M with slight amounts of product being detected at 20 .mu.M
dNTP.
[0292] The ability to use a wide range of dNTP concentrations is
advantageous. The modified Tne polymerases allow PCR users wide
flexibility in the design of the reaction conditions. Because the
modified Tne polymerases are not sensitive to small variations in
dNTP concentration, PCRs using these enzymes are more robust.
Additionally the ability to use a higher concentration of dNTPs may
also allow a higher yield of product to be generated in a PCR using
the modified Tne polymerases as compared to the use of the
UlTma.TM. polymerase.
[0293] b) The Tne Polymerases Tolerate a Broader Range of Mg++
Concentrations in PCR
[0294] In side-by-side tests with UlTma.TM. DNA polymerase and Tne
M284 or Tne M284(D323A, D389A), the Tne variants were found to
produce high yields of amplified product over a broader range of
magnesium ion concentrations. A dilution series was created for
magnesium (MgCl.sub.2) ranging from 0.5 to 5 mM (0.5, 1.0, 1.5,
2.0, 2.5, 3.0, 4.0, and 5.0 mM). The other components of the PCR
were as described above in section a) with the exception that a
single dNTP concentration was used (40 .mu.M dNTP was used for the
UlTma.TM. enzyme and the modified Tne enzymes as this value was
within the operable range for all three enzymes; it is noted that
40 .mu.M dNTP is not optimal for the Tne M284(D323A, D389A) enzyme
and that even higher product yields would be expected if 60 .mu.M
dNTP were used for this enzyme) and the concentration of MgCl.sub.2
was varied. Thermal cycling conditions were as described above in
section a). The reaction products were treated as described above
in section a).
[0295] The results of these assays showed that the UlTma.TM. enzyme
(Perkin Elmer) had an optimal concentration of 1.5 mM MgCl.sub.2
and a great deal of non-specific background products were seen on
the gels (as a smear). The optimal concentration for Tne M284
enzyme was found to be from 1 to 3 mM MgCl.sub.2 and some
nonspecific background products were observed. The optimal
concentration for M284(D323A, D389A) was from 1 to 3 mM MgCl.sub.2
and very little or no detectable non-specific background products
were observed.
[0296] The ability of a thermostable enzyme to produce only
specific PCR products over a wide range of Mg.sup.++ concentrations
is important for certain PCR applications. For example,
multiplexing PCRs utilize several pairs of primers to amplify
several different targets in the sample. As each primer pair and
target combination will have an optimum Mg.sup.++ concentration and
this value may vary considerably from primer set to primer set, the
availability of thermostable polymerases having a wide optimum for
Mg.sup.++ is advantageous. Therefore, the Tne M284 and Tne
M284(D323A, D389A) polymerases provide improved thermostable DNA
polymerases.
[0297] c) Modified Tne Polymerases Produce High PCR Product
Yields
[0298] A comparison was made between the yield of PCR product
obtained using a consistent amount of polymerase activity when the
following enzymes were used in the PCR: Tne M284, Tne M284(D323E),
Tne M284(E325D), Tne M284(Y464F), Tne M284(D468N), Tne M284(D323A,
D389A), AmpliTaq (Perkin Elmer), nTaq (Promega) and UlTma.TM.
(Perkin Elmer). All enzymes were assayed using the DNA polymerase
assay described in Example 5b so that the same number of units of
enzyme were added to the PCRs.
[0299] PCRs which used UlTma.TM. polymerase contained 10 mM
Tris-HCl (pH 8.8 at 25.degree. C.), 10 mM KCl, 0.002% Tween 20 and
1.5 mM MgCl.sub.2. This buffer gives optimal PCR results for the
UlTma.TM. enzyme (as reported by Perkin Elmer). For the results
shown in Table 6, PCRs which used the Tne DNA polymerases contained
10 mM Tris-HCl (pH 8.8 at 25.degree. C.), 10 mM KCl, 0.002% Tween
20 and 1.5 mM MgCl.sub.2. This buffer gives acceptable results for
the Tne polymerases.
[0300] For the results shown in Table 7, PCRs which used the Tne
DNA polymerases contained 10 mM Tris-HCl (pH 9.0 at 25.degree. C.),
10 mM KCl, 0.01% Tween 20 and 1.5 mM MgCl.sub.2 as it was found
that the Tne DNA polymerases perform better when the pH of the
buffer is raised to 9.0 and the amount of detergent (i.e., Tween
20) is increased to 0.01%.
[0301] PCRs which used the UlTma.TM. enzyme contained 40 .mu.M
dNTPs as this value was determined to be optimal for this enzyme
(see section a, above). PCRs which used the modified Tne enzymes
contained 140 .mu.M dNTPs as this value was determined to be
optimal for these enzymes (see section a, above).
[0302] PCRs which used nTaq or AmpliTaq polymerase contained 10 mM
Tris-HCl (pH 9.0 at 25.degree. C.), 50 mM KCl, 1.5 mM MgCl.sub.2,
0.1% Triton X-100; this buffer gives optimal PCR results for the
nTaq and AmpliTaq polymerases. PCRs which used the nTaq and
AmpliTaq enzymes contained 200 .mu.M dNTPs as this value is
reported to be optimal for these enzymes.
[0303] A single template, pGEM-luc, was tested in conjunction with
two different primer pairs. The primer pair comprising LME41 (SEQ
ID NO:42) and LME45 (SEQ ID NO:44) amplifies a 500 bp target on the
template. The primer pair comprising LME41 (SEQ ID NO:42) and LME43
(SEQ ID NO:43) amplifies a 1.5 kb target on the template.
[0304] All PCR reactions contained 3.0 units of the enzyme to be
tested, 2 ng of pGEM-luc (Promega) as the template, 50 pmol of each
primer pair, 1.5 mM MgCl.sub.2 (this concentration was chosen as it
was determined to be optimal for all three types of enzyme tested
using this template and these primer pairs). The final reaction
volume was 100 .mu.l.
[0305] The PCRs which amplified the 500 bp target were thermal
cycled using the following conditions: an initial denaturation at
96.degree. C. for 2 min, followed by 25 cycles comprising
denaturation at 95.degree. C. for 30 sec, annealing at 65.degree.
C. for 30 sec and extension at 72.degree. C. for 1 min; following
the last cycle the tubes were incubated at 4.degree. C.
[0306] The PCRs which amplified the 1500 bp target were thermal
cycled using the following conditions: an initial denaturation at
94.degree. C. for 2 min, followed by 30 cycles comprising
denaturation at 94.degree. C. for 1 min and annealing/extension at
65.degree. C. for 2 min; following the last cycle, the tubes were
incubated at 68.degree. C. for 10 min and then the tubes were
brought to 4.degree. C.
[0307] Thermal cycling was conducted on a Perkin Elmer Thermocycler
Model 480. Each reaction was run in duplicate. The PCR products
were resolved by electrophoresis through 1% agarose gels, followed
by staining with EtBr and quantitated by scanning the gels with a
fluoroimager (Molecular Dynamics); product yields were quantitated
using ImageQuant software (Molecular Dynamics). The results
(average of the duplicates) are shown below in Tables 6 and 7. The
enzyme which gave the highest yield was assigned a value of 100%
and all other values are expressed as a percentage of the maximal
yield.
[0308] The results shown in Tables 6 and 7 demonstrate the
following. Regardless of the target size, the modified Tne enzymes
produced higher yields as compared to the UlTma.TM. enzyme (2.5 to
6.7 fold more product). The modified Tne enzymes also performed
better than nTaq when the 500 bp target was used (about 2 fold more
product) and better than the recombinant Taq polymerase (AmpliTaq)
when the 1.5 kb target was used (about 30% more product).
[0309] Thus, to produce the same amount of PCR product, fewer units
of modified Tne polymerases are required compared to the use of the
UlTma.TM., nTaq and AmpliTaq enzymes. These modified Tne
polymerases allow the user to use smaller amounts of enzyme in the
PCR and therefore will provide significant cost savings in PCR
applications.
TABLE 6
Product Yields For Amplification Of A 500 bp Target

Enzyme                      % Yield
nTaq                        59
UlTma.TM.                   16
Tne M284                    100
Tne M284(D323E)             89
Tne M284(E325D)             87
Tne M284(Y464F)             91
Tne M284(D468N)             89
Tne M284(D323A, D389A)      91
[0310]
TABLE 7
Product Yields For Amplification Of A 1.5 kb Target

Enzyme                      % Yield
AmpliTaq                    67
UlTma.TM.                   40
Tne M284                    81
Tne M284(D323E)             70
Tne M284(E325D)             86
Tne M284(Y464F)             93
Tne M284(D468N)             100
Tne M284(D323A, D389A)      87
EXAMPLE 10
Thermal Cycle Sequencing using Modified Tne Polymerases
[0311] The modified Tne polymerases were further characterized by
examining their ability to be used in thermal cycle sequencing
reactions. Initial sequencing reactions were performed with
the Tne M284(D323E) and Tne M284(E325D) mutants (SEQ ID NOS:16 and
19, respectively) using the buffer and dideoxy mixes from the
fmol.RTM. DNA Sequencing System kit (Promega) according to the
manufacturer's instructions. The buffer and concentrations of
dideoxy and deoxyribonucleotides in this kit were optimized for use
with sTaq DNA polymerase (Promega; sTaq is a modified form of Taq
DNA polymerase which has very low 5' to 3' exonuclease
activity).
[0312] When the sequencing reactions were performed using purified
preparations of either Tne M284(D323E) (SEQ ID NO:16) or Tne
M284(E325D) (SEQ ID NO:19), the resulting reaction products were
very short and uneven and created light bands on the sequencing gel
in every position and in every lane. The light bands in every
position were consistent with the presence of residual 3' to 5'
exonuclease activity in the modified Tne polymerases.
[0313] Thermal cycle sequencing reactions were then performed using
the triple mutant Tne M284(D323A, D389A) (SEQ ID NO:35). All
thermal cycle sequencing was carried out using a Perkin-Elmer 9600
thermal cycler in conjunction with the fmol.RTM. DNA Sequencing
System reagents (Promega). The template used in the sequencing
reactions was the Tne clone pE325D (Example 4b) and the 5' labeled
primer was the JH66 primer (SEQ ID NO:12). The reactions were run
at 95.degree. C. for 15 seconds and 70.degree. C. for 60 seconds
for 30 cycles.
[0314] The results of this experiment showed that the use of the
triple mutant enzyme (SEQ ID NO:35) eliminated the light bands in
every position which were seen with Tne M284(D323E) and Tne
M284(E325D); however the sequence ladders were still shifted toward
short extension products. These short intense reads were indicative
of a DNA polymerase having a higher affinity for the
dideoxynucleotides than Taq DNA polymerase.
[0315] To examine if the triple mutant Tne enzyme (SEQ ID NO:35)
had a higher affinity for dideoxynucleotides, the sequencing
reactions were repeated using a lower ratio of ddNTPs to dNTPs in
the sequencing reaction mixtures (all other conditions remained the
same). Table 8 lists the 3.times. mixes used for sTaq and Tne
M284(D323A, D389A) polymerases in this experiment. The use of lower
concentrations of dideoxynucleotides in the sequencing reaction run
using the triple mutant Tne enzyme (SEQ ID NO:35) produced reads as
long as those obtained using sTaq.
TABLE 8

           sTaq                       Tne M284(D323A, D389A)
Mix        dNTPs        ddNTPs        dNTPs        ddNTPs
G mix      20 .mu.M     30 .mu.M      20 .mu.M     30 .mu.M
A mix      20 .mu.M     350 .mu.M     20 .mu.M     75 .mu.M
T mix      20 .mu.M     600 .mu.M     40 .mu.M     40 .mu.M
C mix      20 .mu.M     200 .mu.M     40 .mu.M     20 .mu.M
EXAMPLE 11
Optimization of Nucleotide Mixtures for Thermal Cycle Sequencing
using Tne M284(D323A, D389A) Polymerase
[0316] The results discussed in Example 10 demonstrated that the
Tne M284(D323A, D389A) polymerase (SEQ ID NO:35) has a higher
affinity for dideoxynucleotides than does sTaq polymerase. Further
testing with various concentrations of dideoxy- and
deoxynucleotides, including modified dNTPs, in sequencing reactions
was performed to ascertain the optimal concentration for these
reagents. The optimized Tne dNTP/ddNTP mixes are detailed in Table
9 below. The values reported in Table 9 represent 3.times.
mixtures; these mixtures are diluted 3-fold in the final reaction
mixture as described below. To obtain the final concentration of
dNTPs and ddNTPs in the reactions, the values in Table 9 are
divided by 3.
TABLE 9
Optimized Nucleotide Mix Formulation for Tne Polymerase

                  G Nucleotide   A Nucleotide   T Nucleotide   C Nucleotide
Component         Mix            Mix            Mix            Mix
ddGTP             20 .mu.M       --             --             --
ddATP             --             50 .mu.M       --             --
ddTTP             --             --             75 .mu.M       --
ddCTP             --             --             --             25 .mu.M
7-deaza dGTP      30 .mu.M       30 .mu.M       30 .mu.M       30 .mu.M
dATP              30 .mu.M       30 .mu.M       30 .mu.M       30 .mu.M
dTTP              30 .mu.M       30 .mu.M       30 .mu.M       30 .mu.M
dCTP              30 .mu.M       30 .mu.M       30 .mu.M       30 .mu.M
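The 3-fold dilution of the Table 9 mixes into the final reaction can be worked through in Python as a minimal sketch. The concentrations are transcribed from Table 9; the variable names are illustrative.

```python
DILUTION = 3  # the 3x mixes are diluted 3-fold in the final reaction

# ddNTP present in each 3x mix and its concentration in uM (Table 9).
ddntp_3x_um = {
    "G": ("ddGTP", 20),
    "A": ("ddATP", 50),
    "T": ("ddTTP", 75),
    "C": ("ddCTP", 25),
}
# Each of 7-deaza dGTP, dATP, dTTP and dCTP is at 30 uM in every 3x mix.
dntp_3x_um = 30

# Final concentrations in the reaction for each of the four mixes.
final_um = {
    mix: {
        "ddNTP": name,
        "ddNTP_uM": conc / DILUTION,
        "each_dNTP_uM": dntp_3x_um / DILUTION,
    }
    for mix, (name, conc) in ddntp_3x_um.items()
}
```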
[0317] To illustrate the fact that the Tne M284(D323A,D389A) enzyme
has a higher affinity for ddNTPs as compared to sTaq, the following
thermal cycle sequencing reactions were performed. The template
used was pGEM-3Zf(+) (Promega) and the primer was the
.gamma.-.sup.32P end labeled pUC/M13 Forward Primer (Promega; SEQ
ID NO:40). Reactions using sTaq polymerase or the Tne M284(D323A,
D389A) polymerase were performed using dNTP/ddNTP mixes optimized
for sTaq as provided in the fmol.RTM. DNA Sequencing System.
[0318] The extension products of thermal cycle sequencing reactions
performed using either sTaq or the Tne M284(D323A, D389A)
polymerase were resolved by electrophoresis on a 6% denaturing
polyacrylamide gel; following electrophoresis, the gel was exposed
to X-ray film. The resulting autoradiograph is shown in FIG. 5,
Panel A.
[0319] In FIG. 5, Panel A, lanes 1-4 contain reaction products
generated using sTaq and lanes 5-8 contain reaction products
generated using the Tne M284(D323A, D389A) polymerase. In each set
of four lanes, reactions run in the presence of ddGTP, ddATP, ddTTP
and ddCTP were loaded left to right.
[0320] The results shown in FIG. 5, Panel A show that the sequence
ladders generated using the modified Tne polymerase were shifted
toward short extension products. These short intense reads were
indicative of a DNA polymerase having a higher affinity for the
dideoxynucleotides than Taq DNA polymerase.
[0321] Sequencing reactions were then performed using sTaq in
conjunction with dNTP/ddNTP mixes optimized for sTaq as provided in
the fmol.RTM. DNA Sequencing System or the Tne M284(D323A, D389A)
polymerase and the dNTP/ddNTP mixes shown in Table 9. The template
used was pGEM-3Zf(+) (Promega) and the primer was the
.gamma.-.sup.32P end labeled pUC/M13 Forward Primer (Promega; SEQ
ID NO:40). An autoradiograph of reaction products is shown in FIG.
5, Panel B.
[0322] In FIG. 5, Panel B, lanes 1-4 contain reaction products
generated using sTaq and sequencing mixes optimized for sTaq and
lanes 5-8 contain reaction products generated using the Tne
M284(D323A, D389A) polymerase and sequencing mixes shown in Table 9
for the Tne M284(D323A, D389A) polymerase. In each set of four
lanes, reactions run in the presence of ddGTP, ddATP, ddTTP and
ddCTP were loaded left to right.
[0323] The results shown in FIG. 5, Panel B demonstrate that the
Tne M284(D323A, D389A) polymerase has a higher affinity for ddNTPs
than does sTaq polymerase and therefore lower concentrations of
ddNTPs must be used in the sequencing reactions.
[0324] Using the optimized mixes shown in Table 9, the Tne
M284(D323A,D389A) enzyme was compared with Sequencing Grade Taq
(sTaq; Promega) for its ability to sequence three different
templates. sTaq (Promega) was used in conjunction with the fmol.RTM. DNA
Sequencing System (Promega) and all protocols were followed as per
the instructions. The Tne M284(D323A,D389A) polymerase was used in
conjunction with the fmol.RTM. DNA Sequencing System kit (Promega),
except that the dNTP/ddNTP mixes used were the Tne optimized
mixes shown in Table 9.
[0325] The following three DNA templates were used in the indicated
amounts in the thermal cycle sequencing reactions. Forty femtomoles
of pGEM-3Zf(+) (Promega) was sequenced using the .gamma.-.sup.32P
end labeled pUC/M13 Forward Primer (Promega; SEQ ID NO:40). Four
femtomoles of a 500 bp PCR fragment was sequenced using the
.gamma.-.sup.32P end labeled LME-28 primer [5'-TAATACGACTCACTATAGGGCGAAT-3'
(SEQ ID NO:47)]. Four femtomoles of .lambda.gt11 phage DNA (Promega)
was sequenced using a .gamma.-.sup.32P end labeled .lambda.gt11
Forward Primer (Promega).
[0326] The 500 bp PCR product used as template was generated by
amplification of the template pGEM-luc with primers LME41 (SEQ ID
NO:42) and LME45 (SEQ ID NO:44) as described in Example 9.
[0327] The thermal profile of all sequencing reactions was
95.degree. C. for 2 min, then 30 cycles of 95.degree. C. for 30
sec, 70.degree. C. for 60 sec; following the last cycle, the
reactions were brought to 4.degree. C. The thermal cycling was
performed using a Perkin-Elmer 480 Thermal Cycler. The extension
products were separated on a 6% denaturing polyacrylamide gel;
following electrophoresis, the gel was exposed to X-ray film. The
resulting autoradiographs are shown in FIG. 6.
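The programmed hold time of this thermal profile can be tallied with a small Python sketch. The function name is hypothetical, and the total excludes ramp times between temperatures and the final 4.degree. C. hold, neither of which is specified in the text.

```python
def programmed_seconds(initial_s, cycles, steps_s, final_s=0):
    """Sum of programmed hold times, ignoring ramping between steps."""
    return initial_s + cycles * sum(steps_s) + final_s


# 95 deg C for 2 min, then 30 cycles of 95 deg C for 30 s and 70 deg C for 60 s.
sequencing_s = programmed_seconds(initial_s=120, cycles=30, steps_s=[30, 60])
sequencing_min = sequencing_s / 60
```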
[0328] In FIG. 6, three panels are shown. Panel A contains the
extension products generated using pGEM-3Zf(+) as the template;
Panel B contains the extension products generated using the 500 bp
PCR product as the template; Panel C contains the extension
products generated using .lambda.gt11 phage DNA as the template. In
all three panels, eight lanes are shown. The first four lanes
correspond to extension products generated using sTaq polymerase;
the last four lanes correspond to extension products generated
using the Tne M284(D323A, D389A) polymerase. In each set of four
lanes, the termination mixes were used in the following order (left
to right): G, A, T and C.
[0329] The results shown in FIG. 6 demonstrated that the Tne
M284(D323A, D389A) polymerase produced sequencing ladders which
were virtually identical in terms of quality and quantity to those
produced using sTaq (Promega); both enzymes provided the correct
DNA sequence for each template. While both enzymes produced similar
results in thermal cycle sequencing reactions, the Tne M284(D323A,
D389A) polymerase required lower concentrations of
dideoxynucleotides. Thus, the use of the Tne M284(D323A,
D389A) polymerase will result in considerable cost savings for
thermal cycle sequencing applications.
[0330] The optimal ddNTP concentrations shown in Table 9 for the
Tne M284(D323A, D389A) polymerase (SEQ ID NO:35) as compared to the
optimal concentrations for sTaq (Table 8) demonstrate that the
modified Tne polymerase has a greater affinity for all four
ddNTPs. In particular this modified Tne polymerase requires 8-fold
less ddTTP or ddCTP, 7-fold less ddATP and 1.5-fold less ddGTP than
does sTaq in thermal cycle sequencing reactions. As
dideoxynucleotides are an expensive component of the sequencing
reaction mixtures, the use of the Tne M284(D323A, D389A) polymerase
(SEQ ID NO:35) in place of enzymes such as Taq polymerase which
have lower affinities for ddNTPs will result in considerable cost
savings. These results also demonstrate that the Tne M284(D323A,
D389A) enzyme can utilize the nucleotide analog 7-deaza dGTP which
is used to resolve band compression artifacts generated when
sequencing G+C-rich regions of DNA.
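The fold reductions in ddNTP concentration quoted above follow directly from the sTaq column of Table 8 and the ddNTP rows of Table 9, as this short Python sketch shows. Both tables list 3.times. mixes, so the ratios are unaffected by the dilution; the variable names are illustrative.

```python
# ddNTP concentrations in the 3x mixes, in uM.
staq_ddntp_um = {"ddGTP": 30, "ddATP": 350, "ddTTP": 600, "ddCTP": 200}  # Table 8, sTaq
tne_ddntp_um = {"ddGTP": 20, "ddATP": 50, "ddTTP": 75, "ddCTP": 25}      # Table 9

# How many fold less of each ddNTP the modified Tne polymerase requires.
fold_less = {name: staq_ddntp_um[name] / tne_ddntp_um[name] for name in staq_ddntp_um}
```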
EXAMPLE 12
Preferred Sequencing Protocol using Tne M284(D323A, D389A)
Polymerase
[0331] The preferred sequencing protocol uses a thermal cycling
format. A detectable signal may be generated using either an end
radiolabeled primer or a radiolabeled dNTP that is incorporated
into the extension products.
[0332] a) Sequencing Protocol using an End-Labeled Primer
[0333] i) Primer Radiolabeling Reaction
[0334] To generate a radiolabeled primer for use in the sequencing
reaction, the following reaction components are assembled in a 0.5
ml microcentrifuge tube: 10 pmol of the desired sequencing primer;
10 pmol of .gamma.-labeled ATP (see Table 10 for amount); 1 .mu.l
of 10.times. T4 polynucleotide kinase Buffer [500 mM Tris-HCl (pH
7.5); 100 mM MgCl.sub.2; 50 mM DTT; 1.0 mM spermidine]; 5 units T4
polynucleotide kinase and sterile H.sub.2O to a final volume of 10
.mu.l. The reaction mixture is incubated at 37.degree. C. for 10-30
min (if end-labeling is to be performed using
[.gamma.-.sup.35S]ATP, 20 units of polynucleotide kinase are used
and the reaction is incubated for 4 hours at 37.degree. C.). The
reaction is then terminated by inactivation of the kinase by
incubation at 90.degree. C. for 2 minutes. The tube is then briefly
spun in a microcentrifuge to collect any condensation. The labeled
primer may be used directly in the sequencing reaction without
further purification.
TABLE 10
Amount of Radiolabeled Nucleotide Needed To Equal 10 pmol

[.gamma.-.sup.32P]ATP:
  3.0 .mu.l of 3,000 Ci/mmol, 10 .mu.Ci/.mu.l
  5.0 .mu.l of 5,000 Ci/mmol, 10 .mu.Ci/.mu.l
  0.5 .mu.l of 6,000 Ci/mmol, 135 .mu.Ci/.mu.l
[.gamma.-.sup.35S]ATP:
  1.4 .mu.l of 1,326 Ci/mmol, 10 .mu.Ci/.mu.l
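The volumes listed in Table 10 can be checked by dividing the radioactivity carried by 10 pmol of nucleotide (pmol multiplied by specific activity) by the stock concentration. The following Python sketch illustrates that arithmetic only; the function name `volume_for_pmol` is hypothetical and is not part of the protocol. The table values appear to be rounded to convenient pipetting volumes, so the computed figures may differ slightly.

```python
def volume_for_pmol(pmol, specific_activity_ci_per_mmol, conc_uci_per_ul):
    """Return the stock volume (microliters) that contains `pmol` of nucleotide."""
    # 1 Ci/mmol is the same as 0.001 uCi/pmol.
    uci_needed = pmol * specific_activity_ci_per_mmol * 1e-3
    return uci_needed / conc_uci_per_ul


# Stocks from Table 10: (specific activity in Ci/mmol, concentration in uCi/ul)
stocks = {
    "[gamma-32P]ATP, 3,000 Ci/mmol": (3000, 10),
    "[gamma-32P]ATP, 5,000 Ci/mmol": (5000, 10),
    "[gamma-32P]ATP, 6,000 Ci/mmol": (6000, 135),
    "[gamma-35S]ATP, 1,326 Ci/mmol": (1326, 10),
}

for name, (activity, conc) in stocks.items():
    print(f"{name}: {volume_for_pmol(10, activity, conc):.2f} ul for 10 pmol")
```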
[0335] ii) Extension/Termination Reactions
[0336] For each set of sequencing reactions, label four 0.5 ml
microcentrifuge tubes (G, A, T, C). Add 2 .mu.l of the appropriate
3.times. dNTP/ddNTP Mix to each tube (see Table 9 for components of
the 3.times. mixes). Cap the tubes and store on ice or at 4.degree.
C. until needed. For each set of four sequencing reactions, mix the
following reagents in a microcentrifuge tube: 4-40 fmol of template
DNA (see Table 11 below for recommended amounts); 5 .mu.l fmol.RTM.
Sequencing 5.times. Buffer [250 mM Tris-HCl (pH 9.0 at
25.degree. C.), 10 mM MgCl.sub.2]; 1.5 .mu.l labeled primer (1.5
pmol); sterile H.sub.2O to a final volume of 16 .mu.l (this
comprises the primer/template mix).
[0337] Add 1.0 .mu.l of Tne M284 (D323A,D389A) DNA Polymerase (5
u/.mu.l) to the primer/template mix. Mix briefly by pipetting up
and down (this comprises the enzyme/primer/template mix). Add 4
.mu.l of the enzyme/primer/template mix to the inside wall of each
tube containing d/ddNTP mix. Add one drop (approximately 20 .mu.l)
of mineral oil to each tube and briefly spin in a microcentrifuge.
Place the reaction tubes in a thermal cycler that has been
preheated to 95.degree. C. and start the cycling program.
TABLE 11
Recommended Amounts Of Template DNA (ng) for End-Labeled Primer Protocol

Template Length                             ng of Template
200 bp (PCR product)                        0.5 ng (4 fmol)
3,000-5,000 bp (supercoiled plasmid DNA)    100 ng (40 fmol)
48,000 bp (lambda DNA)                      130 ng (4 fmol)
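The mass-to-mole pairs in Table 11 are consistent with the usual approximation of about 660 g/mol per base pair of double-stranded DNA. That figure is an assumption introduced here for illustration; the text does not state which conversion factor was used. A minimal Python sketch of the conversion, with the hypothetical helper name `dsdna_ng_to_fmol`:

```python
AVG_BP_MASS = 660.0  # g/mol per base pair; assumed average, not given in the text


def dsdna_ng_to_fmol(ng, length_bp, bp_mass=AVG_BP_MASS):
    """Convert a mass of double-stranded DNA (ng) to femtomoles of molecules."""
    # ng -> g is a factor of 1e-9 and mol -> fmol is 1e15, so the net factor is 1e6.
    return ng * 1e6 / (length_bp * bp_mass)


# Rows of Table 11; 4,000 bp is used as a midpoint of the 3,000-5,000 bp range.
for label, ng, length in [
    ("200 bp PCR product", 0.5, 200),
    ("supercoiled plasmid (4,000 bp)", 100, 4000),
    ("lambda DNA", 130, 48000),
]:
    print(f"{label}: {ng} ng is about {dsdna_ng_to_fmol(ng, length):.1f} fmol")
```

For a template of another length, the same function can be inverted to find the mass that falls within the 4 to 40 fmol range shown in the table.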
[0338] When the primer used is less than 24 nucleotides in length
or has a G+C-content less than 50%, the following cycling profile
is used: 95.degree. C. for 2 minutes followed by 30 cycles of
95.degree. C. for 30 seconds (denaturation); 42.degree. C. for 30
seconds (annealing); 70.degree. C. for 1 minute (extension); the
tubes are then brought to 4.degree. C.
[0339] When the primer used is greater than or equal to 24
nucleotides in length, or when a shorter primer has a G+C-content
greater than or equal to 50%, the following cycling profile is
used: 95.degree. C. for 2 minutes followed by 30 cycles of
95.degree. C. for 30 seconds (denaturation); 70.degree. C. for 30
seconds (annealing/extension); the tubes are then brought to
4.degree. C.
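The choice between the two cycling profiles can be sketched as a small decision function. The two conditions as written overlap (for example, a short primer with a high G+C content satisfies both), so this sketch assumes one reading: the two-step profile is used when the primer is at least 24 nucleotides long or has a G+C content of at least 50%, and the three-step profile is used otherwise. The function names and the returned dictionary layout are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer sequence."""
    seq = primer.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def choose_cycling_profile(primer):
    """Pick a cycling profile from primer length and G+C content.

    Assumed reading of the protocol: two-step cycling when length >= 24 nt
    or G+C >= 50%; three-step cycling otherwise.
    """
    if len(primer) >= 24 or gc_fraction(primer) >= 0.5:
        steps = [("95 C", "30 s"), ("70 C", "30 s")]  # annealing/extension combined
    else:
        steps = [("95 C", "30 s"), ("42 C", "30 s"), ("70 C", "1 min")]
    return {
        "initial_denaturation": ("95 C", "2 min"),
        "cycles": 30,
        "steps": steps,
        "hold": "4 C",
    }


# Hypothetical 18-mer with no G or C: falls into the three-step profile.
print(choose_cycling_profile("ATATATATATATATATAT")["steps"])
```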
[0340] After the thermocycling program has been completed, add 3
.mu.l of fmol.RTM. Sequencing Stop Solution [10 mM NaOH; 95% formamide;
0.05% bromophenol blue; 0.05% xylene cyanol] to the inside wall of
each tube. Briefly spin the tubes in a microcentrifuge to terminate
the reactions. Heat the reactions at 70.degree. C. for 2 minutes
immediately before loading onto a sequencing gel. Load 2.5-3.0
.mu.l of each reaction on the gel.
[0341] b) Sequencing Protocol using Direct Incorporation
[0342] i) Extension/Termination Reactions
[0343] The following procedure requires about 500 fmol of plasmid
templates and about 40 fmol of PCR product. The end-labeled primer
protocol (section a) is recommended for PCR templates. This
procedure is not recommended for the sequencing of lambda
templates.
[0344] For each set of sequencing reactions, label four 0.5 ml
microcentrifuge tubes (G, A, T, C). Add 2 .mu.l of the appropriate
d/ddNTP Mix to each tube. Cap the tubes and store on ice or at
4.degree. C. until needed. For each set of four sequencing
reactions, mix the following reagents in a microcentrifuge tube:
500 fmol template DNA (approx. 1 .mu.g of a 3-5 kb template); 3.0
pmol primer (approx. 25 ng of a 24 mer); 0.5 .mu.l
[.alpha.-.sup.35S]dATP (>1,000 Ci/mmol, 10 .mu.Ci/.mu.l) or
[.alpha.-.sup.32P]dATP (800 Ci/mmol, 10 .mu.Ci/.mu.l); 5 .mu.l
fmol.RTM. Sequencing 5.times. Buffer and sterile H.sub.2O to a
final volume of 16 .mu.l.
[0345] Add 1.0 .mu.l of Tne M284 (D323A,D389A) DNA Polymerase (5
u/.mu.l) to the primer/template mix. Mix briefly by pipetting up
and down. Add 4 .mu.l of the enzyme/primer/template mix to the
inside wall of each tube containing d/ddNTP Mix. Add one drop
(approximately 20 .mu.l) of mineral oil to each tube and briefly
spin in a microcentrifuge. Place the reaction tubes in a thermal
cycler that has been preheated to 95.degree. C. and start the
cycling program. The cycling profile chosen depends upon the
characteristics of the primer used; see section a)(ii) above.
[0346] After the thermocycling program has been completed, add 3
.mu.l of fmol.RTM. Sequencing Stop Solution to the inside wall of
each tube. Briefly spin in a microcentrifuge to terminate the
reactions. Heat the reactions at 70.degree. C. for 2 minutes
immediately before loading on a sequencing gel. Load 2.5-3.0 .mu.l
of each reaction on the gel.
EXAMPLE 12
Use of Tne M284(D323A, D389A) Polymerase in Sanger Sequencing
Protocols
[0347] The above examples described the use of the Tne M284(D323A,
D389A) polymerase in thermal cycle sequencing protocols. The Tne
M284(D323A, D389A) Polymerase may also be used in traditional
Sanger sequencing protocols.
[0348] If a double stranded DNA template is used, the template is
first denatured using alkali as follows. Four micrograms
(approximately 2 pmol) of supercoiled plasmid DNA is added to a
microcentrifuge tube and deionized H.sub.2O is added to a final
volume of 18 .mu.l. Two microliters of 2 M NaOH, 2 mM EDTA is added
and the mixture is incubated for 5 minutes at room temperature. To
neutralize the reaction, add 8 .mu.l of 5M ammonium acetate, pH
7.5, and vortex. Add 112 .mu.l of 100% ethanol and vortex.
Centrifuge the tube for 10 minutes at top speed in a
microcentrifuge. Decant the supernatant. Wash the pellet with 1 ml
of 70% ethanol and centrifuge for 1 minute. Remove the supernatant
and dry the pellet. Resuspend the dried pellet in 18 .mu.l of
distilled water for sequencing. Proceed to either section a) or c)
depending on whether an end-labeled primer is employed or whether
radiolabeled nucleotides are employed in the sequencing
reaction.
[0349] a) Sequencing Protocol using an End-Labeled Primer
[0350] The primer is end labeled using the protocol described in
Example 11 (a)(i). The template and primer are annealed as follows.
When a single-stranded DNA template is employed the following
reaction is used. For each set of four sequencing reactions, mix
the following reagents in a microcentrifuge tube: 0.8 pmol ssDNA
(approx. 2 .mu.g of an M13 template); 5.0 .mu.l of Taq DNA
Polymerase 5.times. Buffer [250 mM Tris-HCl (pH 9.0 at 25.degree.
C.), 50 mM MgCl.sub.2]; 1.0 .mu.l labeled primer (1 pmol); sterile
dH.sub.2O to a final volume of 25 .mu.l. Incubate at 37.degree. C.
for 10 minutes. During the incubation, prepare the
extension/termination reaction tubes as described in section b) below.
[0351] When a double-stranded plasmid is used as the template, the
following reaction is used. For each set of four sequencing
reactions, mix the following reagents in a microcentrifuge tube:
1.6 pmol denatured plasmid dsDNA (approx. 4 .mu.g of a 3-5 kb
template); 5.0 .mu.l Taq DNA Polymerase 5.times. Buffer; 2.0 .mu.l
labeled primer (2 pmol) and sterile dH.sub.2O to a final volume of
25 .mu.l. Incubate at 37.degree. C. for 10 minutes. During the
incubation, prepare the extension/termination reaction tubes as
described in section b) below.
[0352] b) Extension/Termination Reactions
[0353] For each set of sequencing reactions, label four
microcentrifuge tubes (G, A, T and C) and add 1 .mu.l of the
8.times. Tne optimized d/ddNTP Mix (see Table 12 for components of
the 8.times. mixes) to each tube. Cap the tubes and store on ice or
at 4.degree. C. until needed. Add 1 .mu.l of Tne M284
(D323A,D389A) DNA Polymerase (5 u/.mu.l) to the annealed
primer/template mix (prepared as described above) and mix briefly
by pipetting up and down.
[0354] Add 6 .mu.l of the enzyme/primer/template mix to each of the
four tubes containing the d/ddNTP Mixes. Mix briefly by pipetting
up and down. A brief spin may be needed to ensure that no liquid is
left on the tube walls. Incubate at 70.degree. C. for 15 minutes.
Add 4 .mu.l of Stop Solution to each tube and set at room
temperature. Heat the reactions to .gtoreq.70.degree. C. for 2-5
minutes before loading the sequencing gel. Load 2.5-3.0 .mu.l of
each reaction on the gel (6% denaturing polyacrylamide). Following
electrophoresis of the sequencing gel, the gel is exposed to X-ray
film to generate an autoradiograph.
[0355] If the extension products seen on the autoradiograph are too
short, the ddNTP concentrations should be lowered and conversely if
the extension products are all skewed to high molecular weight
products, the ddNTP concentrations should be raised.
TABLE 12
8X Nucleotide Mix Formulation for Sanger Sequencing Using Tne
M284(D323A, D389A) Polymerase

Nucleotide      G Nucleotide   A Nucleotide   T Nucleotide   C Nucleotide
Component       Mix            Mix            Mix            Mix
ddGTP           20 .mu.M       --             --             --
ddATP           --             50 .mu.M       --             --
ddTTP           --             --             75 .mu.M       --
ddCTP           --             --             --             25 .mu.M
7-deaza dGTP    30 .mu.M       250 .mu.M      250 .mu.M      250 .mu.M
dATP            250 .mu.M      30 .mu.M       250 .mu.M      250 .mu.M
dTTP            250 .mu.M      250 .mu.M      30 .mu.M       250 .mu.M
dCTP            250 .mu.M      250 .mu.M      250 .mu.M      30 .mu.M
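In chain-termination sequencing the length distribution of the extension products depends on the ratio of each ddNTP to its corresponding dNTP, which is why the ddNTP concentrations are adjusted when products are too short or too long. The Python sketch below transcribes the Table 12 values and computes that ratio for each mix; the dictionary layout and the helper names `dd_to_d_ratio` and `adjusted_ddntp` are hypothetical and serve only to illustrate the adjustment.

```python
# Concentrations in micromolar, transcribed from Table 12 (8X mixes).
# "limiting_dNTP" is the dNTP matching the ddNTP in that mix
# (7-deaza dGTP in the G mix).
TABLE_12 = {
    "G": {"ddNTP": 20.0, "limiting_dNTP": 30.0},
    "A": {"ddNTP": 50.0, "limiting_dNTP": 30.0},
    "T": {"ddNTP": 75.0, "limiting_dNTP": 30.0},
    "C": {"ddNTP": 25.0, "limiting_dNTP": 30.0},
}


def dd_to_d_ratio(mix):
    """Ratio of terminator to the corresponding dNTP in the given mix."""
    entry = TABLE_12[mix]
    return entry["ddNTP"] / entry["limiting_dNTP"]


def adjusted_ddntp(mix, factor):
    """ddNTP concentration (uM) after scaling by `factor`.

    A factor below 1 lowers the ddNTP concentration (longer products);
    a factor above 1 raises it (shorter products).
    """
    return TABLE_12[mix]["ddNTP"] * factor


for mix in "GATC":
    print(f"{mix} mix: ddNTP/dNTP ratio = {dd_to_d_ratio(mix):.2f}")
```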
[0356] c) Sequencing Protocol using Direct Incorporation
[0357] i) Annealing the Template and Primer
[0358] When a single-stranded DNA template is employed the
following reaction is used to anneal the template and primer. For
each set of four sequencing reactions, mix the following reagents
in a microcentrifuge tube: 0.8 pmol ssDNA (approx. 2 .mu.g of an
M13 template); 1.0 pmol primer (approx. 8 ng of a 24 mer); 5.0
.mu.l Taq DNA Polymerase 5.times. Buffer; 2.0 .mu.l
Extension/Labeling Mix [7.5 .mu.M each of dGTP, dTTP and
dCTP]; sterile dH.sub.2O to a final volume of 25 .mu.l. Incubate at
37.degree. C. for 10 minutes. During the incubation, prepare the
nucleotide tubes for the termination reaction as described in
section iii, below.
[0359] When a double-stranded DNA template is employed the
following reaction is used to anneal the template and primer. For
each set of four sequencing reactions, mix the following reagents
in a microcentrifuge tube: 1.6 pmol denatured plasmid dsDNA
(approx. 4 .mu.g of a 3-5 kb template); 2 pmol primer (approx. 16
ng of a 24 mer); 5.0 .mu.l Taq DNA Polymerase 5.times. Buffer; 2.0
.mu.l Extension/Labeling Mix; sterile dH.sub.2O to a final volume
of 25 .mu.l. Incubate at 37.degree. C. for 10 minutes. During the
incubation, prepare the nucleotide tubes for the termination
reaction as described in section iii, below.
[0360] ii) Extension/Labeling Reaction
[0361] Add 0.5 .mu.l of [.alpha.-.sup.35S]dATP (1,000 Ci/mmol,
approximately 10 .mu.Ci/.mu.l) or 0.5 .mu.l of
[.alpha.-.sup.32P]dATP (800 Ci/mmol, approximately 10 .mu.Ci/.mu.l)
to the annealed primer/template mixture. Add 1 .mu.l of Tne M284
(D323A,D389A) DNA Polymerase (5 u/.mu.l) and mix briefly by
pipetting up and down. Incubate at 37.degree. C. for 5 minutes.
[0362] iii) Termination Reaction
[0363] For each set of sequencing reactions, label four
microcentrifuge tubes (G, A, T, C) and add 1 .mu.l of the 8.times. Tne
optimized d/ddNTP Mix to each tube (see Table 12 above). Store on
ice or at 4.degree. C. until just before completion of the
extension/labeling reaction. When the extension/labeling reaction
is complete, aliquot 6 .mu.l to each tube (G,A,T,C) containing
d/ddNTP Mix. Mix briefly by pipetting up and down. A brief spin may
be needed to ensure that no liquid is left on the tube walls.
Incubate at 70.degree. C. for 15 minutes. Add 4 .mu.l of Stop
Solution to each tube and store at -20.degree. C. Heat the
reactions to >70.degree. C. for 2-5 minutes immediately before
loading on a sequencing gel. Load 2.5-3.0 .mu.l of each reaction on
the gel. Following electrophoresis of the sequencing gel, the gel
is exposed to X-ray film to generate an autoradiograph.
[0364] If the extension products seen on the autoradiograph are too
short, the ddNTP concentrations should be lowered and conversely if
the extension products are all skewed to high molecular weight
products, the ddNTP concentrations should be raised.
EXAMPLE 13
[0365] Fidelity of Tne DNA Polymerases
[0366] The fidelity of the Tne polymerases (full-length and
modified forms) is measured using a PCR fidelity assay. This assay
is based on the amplification, circularization, and transformation
of the pUC19 derivative pLACIQ, which contains a functional
lacI.sup.q allele [Frey and Suppmann, Biochemica 2:8 (1995)].
PCR-derived mutations in lacI result in a de-repression of the
expression of lacZ.alpha. and subsequent formation of a functional
.beta.-galactosidase enzyme, which can be easily detected on X-Gal
indicator plates.
[0367] a) Construction of pLACIQ
[0368] The truncated lacI gene present in pUC19 is replaced by a
functional copy of lacI.sup.q. pUC19 (GibcoBRL) is digested with
PvuII and AflIII and the 180 bp PvuII-AflIII fragment of pUC19 is
replaced by a 1189 bp DNA fragment encoding lacI.sup.q. The 1189 bp
lacI.sup.q fragment is created by PCR amplification of residues
2972-4142 of pTrc 99 A (Pharmacia). The following primer pair is
used in the PCR: 5'-CATGCCATGCATGCATTTACGTTGACACCA-3' (SEQ ID
NO:48) and 5'-TCCCCCGGGTTGCGCTCACTGCCCGCTTTCCAGT-3' (SEQ ID NO:49).
The oligonucleotide of SEQ ID NO:48 contains a NcoI recognition
site and the oligonucleotide of SEQ ID NO:49 contains a SmaI
recognition site. The PCR is performed using 18 cycles of
denaturation at 94.degree. C. for 30 sec; annealing at 57.degree.
C. for 30 sec and extension at 72.degree. C. for 4 min. The PCR is
performed using Pfu DNA polymerase (Stratagene) in the buffer
recommended by the manufacturer. The PCR products are digested with
NcoI and SmaI to generate a 1189 bp fragment having a blunt end
(compatible with PvuII ends) and a 5' overhanging end compatible
with the AflIII digested end of pUC19. The 1189 bp fragment is
ligated into the digested pUC19 using standard techniques.
[0369] The ligation products are used to transform the
.alpha.-complementing E. coli strain DH5.alpha. (GibcoBRL) and the
desired plasmid, termed pLACIQ (3695 bp), is isolated using
standard techniques. DH5.alpha. cells containing pLACIQ produce white
(LACI.sup.+) colonies on LB plates containing ampicillin (100
.mu.g/ml) and X-Gal (0.004% w/v).
[0370] b) The PCR Fidelity Assay
[0371] The template used in the PCR fidelity assay is prepared as
follows. pLACIQ is linearized by digestion with Dra II. A typical
PCR reaction contains 5 or 10 ng of linearized, gel-purified
plasmid DNA.
[0372] The following primers are used in the PCR fidelity assay to
amplify the lacI.sup.q sequences of pLACIQ; both PCR primers used
have Cla I cleavage sites at their 5' ends:
5'-AGCTTATCGATGGCACTTTTCGGGGAAATGTGCG-3' (SEQ ID NO:50) and
5'-AGCTTATCGATAAGCGATGCCGGGAGCAGACAAGC-3' (SEQ ID NO:51). The
length of the resulting PCR product is 3,561 bp.
[0373] The PCRs which employ the Tne DNA polymerases or UlTma DNA
polymerase are performed using the following buffer [10 mM Tris-HCl
(pH 8.8 at 25.degree. C.), 10 mM KCl, 0.002% Tween 20, 1.5 mM
MgCl.sub.2 and 40 .mu.M all four dNTPs]. PCRs which employ nTaq DNA
polymerase are performed using the following buffer [10 mM Tris-HCl
(pH 9.0 at 25.degree. C.), 50 mM KCl, 1.5 mM MgCl.sub.2, 0.1%
Triton X-100 and 140 .mu.M all four dNTPs]. All reactions contain
5-10 ng of template, 20 pmol of each primer and 2.5 units of
polymerase (all polymerases are assayed using the assay described
in Example 5 to ensure the same amount of activity is used for each
enzyme tested). The following cycling conditions are used:
denaturation for 10 sec at 94.degree. C.; annealing for 30 sec at
57.degree. C. and extension for 4 min at 72.degree. C. for 18
cycles.
[0374] Following the PCR, the amplification products are
PEG-precipitated as follows. The PCR amplification mixtures are
frozen at -20.degree. C. and the mineral oil is removed by rinsing
twice with cold chloroform. The mixtures are then thawed and 10
.mu.g of glycogen and 1/2 volume of 30% PEG 3350/1.5 M NaCl is
added. The mixture is allowed to stand for a minimum of 1 hr at
0-4.degree. C. The mixture is then centrifuged in a microcentrifuge
for 8 min and the supernatant is removed. The pellet is then rinsed
with 75% ethanol and dried. The DNA is then digested with ClaI and
the digested DNA is purified by gel electrophoresis. The purified
DNA is then ligated to recircularize the plasmid in a reaction
containing less than or equal to 30 ng DNA.
[0375] The resulting PCR-derived plasmids are transformed into
competent E. coli DH5.alpha. and plated on LB Amp100 X-Gal plates
[LB plates containing 100 .mu.g/ml ampicillin and 0.004% X-Gal
(w/v)]. After incubation overnight at 37.degree. C., blue and white
colonies are counted. The error rate (f) per bp is calculated using
the rearranged equation published by Keohavong and Thilly [Proc.
Natl. Acad. Sci. USA 86:9253 (1989)]: f=-lnF/(d.times.b), where F is
the fraction of white colonies: F=white (LACI.sup.+)/total colony
number; d is the number of DNA duplications: 2.sup.d=output
DNA/input DNA; and b is the effective target size of the (1080 bp)
lacI gene, which is 349 bp according to Provost et al. [Mut.
Research 288:133 (1993)]; there are 349 phenotypically identified
(by color screening) single-base substitutions (nonsense and
mis-sense) at 179 codons (approximately 50% of the coding region)
within the lacI gene [Provost et al., supra]. Frameshift errors,
which may occur at every position in the 1080 bp open reading frame
of lacI, are not taken into account.
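The error-rate calculation can be illustrated with a short Python sketch. It takes the error rate as the negative natural logarithm of the white-colony fraction F divided by the product of the number of duplications d and the effective target size b, with the sign chosen so that the rate is positive when F is less than 1. The function name `pcr_error_rate` and the colony counts and DNA amounts in the usage example are hypothetical and are not data from this application.

```python
import math


def pcr_error_rate(white_colonies, total_colonies, output_dna, input_dna,
                   target_size_bp=349):
    """Error rate per bp per duplication: f = -ln(F) / (d * b).

    F: fraction of white (LacI+) colonies.
    d: number of duplications, from 2**d = output DNA / input DNA.
    b: effective target size in bp (349 bp for lacI per the text).
    """
    fraction_white = white_colonies / total_colonies
    duplications = math.log2(output_dna / input_dna)
    return -math.log(fraction_white) / (duplications * target_size_bp)


# Hypothetical numbers: 9,500 white colonies out of 10,000, and a
# 1,024-fold amplification (10 ng input, 10,240 ng output, so d = 10).
rate = pcr_error_rate(9500, 10000, output_dna=10240, input_dna=10)
print(f"error rate: {rate:.2e} per bp per duplication")
```

Input and output DNA only need to be expressed in the same unit, since only their ratio enters the calculation.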
[0376] A religation control is prepared as follows. Fifty nanograms
of Dra II-linearized, gel-purified pLACIQ DNA is religated, and an
aliquot of the ligation reaction is transformed into DH5.alpha..
After incubation overnight, the percentage of growing colonies (0.027%)
showing a blue (LACI.sup.-) phenotype on LB Amp X-Gal plates is
measured to assess the rate of the formation of concatameric
ligation products (with subsequent intramolecular recombination in
E. coli that eliminates an additional origin of replication), which
seems to be a very rare event. Restriction analysis of PCR-derived
plasmids isolated from blue colonies is also performed to confirm
that the LACI.sup.- phenotype originates in PCR-derived mutations
of lacI, and not in deleterious recombination events after
transformation of the ligated DNA in DH5.alpha..
[0377] From the above, it is clear that the enzymes of the present
invention provide thermostable DNA polymerases having novel
features. In particular, these enzymes provide superior polymerases
for use in PCR applications. In addition, these polymerases have
higher affinities for ddNTPs which results in significant cost
savings for users of chain termination sequencing protocols.
[0378] Other modifications of the embodiments of the invention
described above that are obvious to those of ordinary skill in the
areas of molecular biology, biochemistry and related disciplines
are intended to be within the scope of the accompanying claims.
Sequence CWU 1
1
75 1 2682 DNA Thermotoga neapolitana CDS (1)..(2679) 1 atg gcg aga
cta ttt ctc ttt gat ggc aca gcc ctg gcc tac agg gca 48 Met Ala Arg
Leu Phe Leu Phe Asp Gly Thr Ala Leu Ala Tyr Arg Ala 1 5 10 15 tat
tac gcc ctc gac aga tcc ctt tcc aca tcc aca gga att cca acg 96 Tyr
Tyr Ala Leu Asp Arg Ser Leu Ser Thr Ser Thr Gly Ile Pro Thr 20 25
30 aac gcc gtc tat ggc gtt gcc agg atg ctc gtt aaa ttc ata aag gaa
144 Asn Ala Val Tyr Gly Val Ala Arg Met Leu Val Lys Phe Ile Lys Glu
35 40 45 cac att ata ccc gaa aag gac tac gcg gct gtg gcc ttc gac
aag aag 192 His Ile Ile Pro Glu Lys Asp Tyr Ala Ala Val Ala Phe Asp
Lys Lys 50 55 60 gca gcg acg ttc aga cac aaa ctg ctc gaa gcg tac
aag gcg caa agg 240 Ala Ala Thr Phe Arg His Lys Leu Leu Glu Ala Tyr
Lys Ala Gln Arg 65 70 75 80 cca aag acg ccg gat ctt cta gtt cag cag
cta cct tac atc aag cgg 288 Pro Lys Thr Pro Asp Leu Leu Val Gln Gln
Leu Pro Tyr Ile Lys Arg 85 90 95 ctg ata gaa gct ctt ggt ttc aaa
gtg ctg gag ctg gaa gga tac gaa 336 Leu Ile Glu Ala Leu Gly Phe Lys
Val Leu Glu Leu Glu Gly Tyr Glu 100 105 110 gca gac gat atc atc gcc
acg ctt gca gtc aag ggc tgc acg ttt ttt 384 Ala Asp Asp Ile Ile Ala
Thr Leu Ala Val Lys Gly Cys Thr Phe Phe 115 120 125 gat gag att ttc
ata ata acc ggt gac aag gat atg ctt caa ctt gta 432 Asp Glu Ile Phe
Ile Ile Thr Gly Asp Lys Asp Met Leu Gln Leu Val 130 135 140 aac gag
aag ata aag gtc tgg aga atc gtc aag ggg ata tcg gat ctt 480 Asn Glu
Lys Ile Lys Val Trp Arg Ile Val Lys Gly Ile Ser Asp Leu 145 150 155
160 gag ctt tac gat tcg aaa aag gtg aaa gaa aga tac ggt gtg gaa cca
528 Glu Leu Tyr Asp Ser Lys Lys Val Lys Glu Arg Tyr Gly Val Glu Pro
165 170 175 cat cag ata ccg gat ctt cta gca ctg acg gga gac gag ata
gac aac 576 His Gln Ile Pro Asp Leu Leu Ala Leu Thr Gly Asp Glu Ile
Asp Asn 180 185 190 att ccc ggt gta acg gga ata ggt gaa aag acc gct
gta cag ctt ctc 624 Ile Pro Gly Val Thr Gly Ile Gly Glu Lys Thr Ala
Val Gln Leu Leu 195 200 205 ggc aag tac aga aat ctt gaa gac att ctg
gag cat gcc cgt gaa ctc 672 Gly Lys Tyr Arg Asn Leu Glu Asp Ile Leu
Glu His Ala Arg Glu Leu 210 215 220 ccc cag aga gtg aga aag gct ctc
ttg aga gac agg gaa gtt gcc atc 720 Pro Gln Arg Val Arg Lys Ala Leu
Leu Arg Asp Arg Glu Val Ala Ile 225 230 235 240 ctc agt aaa aaa ctt
gca act ctg gtg acg aac gca cct gtt gaa gtg 768 Leu Ser Lys Lys Leu
Ala Thr Leu Val Thr Asn Ala Pro Val Glu Val 245 250 255 gac tgg gaa
gag atg aaa tac aga gga tac gac aag aga aaa cta ctt 816 Asp Trp Glu
Glu Met Lys Tyr Arg Gly Tyr Asp Lys Arg Lys Leu Leu 260 265 270 ccg
ata ttg aaa gaa ctg gag ttt gct tcc atc atg aag gaa ctt caa 864 Pro
Ile Leu Lys Glu Leu Glu Phe Ala Ser Ile Met Lys Glu Leu Gln 275 280
285 ctg tac gaa gaa gca gaa ccc acc gga tac gaa atc gtg aag gat cat
912 Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu Ile Val Lys Asp His
290 295 300 aag acc ttc gaa gat ctc atc gaa aag ctg aag gag gtt cca
tct ttt 960 Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys Glu Val Pro
Ser Phe 305 310 315 320 gcc ctg gac ctt gaa acg tcc tcc ctt gac ccg
ttc aac tgt gag ata 1008 Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp
Pro Phe Asn Cys Glu Ile 325 330 335 gtc ggc atc tcc gtg tcg ttc aaa
ccg aaa aca gct tat tac att cca 1056 Val Gly Ile Ser Val Ser Phe
Lys Pro Lys Thr Ala Tyr Tyr Ile Pro 340 345 350 ctt cat cac aga aac
gcc cag aat ctt gat gaa aca ctg gtg ctg tcg 1104 Leu His His Arg
Asn Ala Gln Asn Leu Asp Glu Thr Leu Val Leu Ser 355 360 365 aag ttg
aaa gag atc ctc gaa gac ccg tct tcg aag att gtg ggt cag 1152 Lys
Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser Lys Ile Val Gly Gln 370 375
380 aac ctg aag tac gac tac aag gtt ctt atg gta aag ggt ata tcg cca
1200 Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val Lys Gly Ile Ser
Pro 385 390 395 400 gtt tat ccg cat ttt gac acg atg ata gct gca tat
ttg ctg gag cca 1248 Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala
Tyr Leu Leu Glu Pro 405 410 415 aac gag aaa aaa ttc aat ctc gaa gat
ctg tct ttg aaa ttt ctc gga 1296 Asn Glu Lys Lys Phe Asn Leu Glu
Asp Leu Ser Leu Lys Phe Leu Gly 420 425 430 tac aaa atg acg tct tat
cag gaa ctg atg tcg ttt tcc tca cca ctt 1344 Tyr Lys Met Thr Ser
Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro Leu 435 440 445 ttt ggt ttc
agc ttt gcg gat gtt ccg gta gac aag gct gcg aac tac 1392 Phe Gly
Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn Tyr 450 455 460
tcc tgc gag gat gca gac atc act tat agg ctc tac aag ata ctc agc
1440 Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Ile Leu
Ser 465 470 475 480 atg aag ctc cat gaa gcg gaa ctt gag aac gtc ttc
tac agg ata gag 1488 Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
Phe Tyr Arg Ile Glu 485 490 495 atg ccg ctt gtg aac gtt ctt gca cgc
atg gaa ttg aac ggg gtg tat 1536 Met Pro Leu Val Asn Val Leu Ala
Arg Met Glu Leu Asn Gly Val Tyr 500 505 510 gtg gac aca gaa ttc ctg
aaa aag ctc tcg gag gag tac ggc aaa aag 1584 Val Asp Thr Glu Phe
Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys 515 520 525 ctc gag gaa
ctg gcc gaa aaa atc tac cag ata gca gga gag ccc ttc 1632 Leu Glu
Glu Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro Phe 530 535 540
aac atc aat tct cca aaa cag gtt tca aag atc ctt ttt gag aag ctg
1680 Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile Leu Phe Glu Lys
Leu 545 550 555 560 gga ata aaa ccc cgt gga aaa acg aca aaa aca gga
gcg tac tct acc 1728 Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
Gly Ala Tyr Ser Thr 565 570 575 agg ata gag gtg ttg gaa gag ata gcg
aat gag cac gag ata gta ccc 1776 Arg Ile Glu Val Leu Glu Glu Ile
Ala Asn Glu His Glu Ile Val Pro 580 585 590 ctc att ctc gag tac aga
aag atc cag aaa ctg aaa tcg acc tac ata 1824 Leu Ile Leu Glu Tyr
Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr Ile 595 600 605 gac acc ctt
ccg aaa ctt gtg aac ccg aaa acc gga aga att cat gca 1872 Asp Thr
Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg Ile His Ala 610 615 620
tct ttc cac cag acg ggt acc gcc act ggc agg ttg agt agc agt gat
1920 Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser
Asp 625 630 635 640 cca aat ctt cag aat ctt ccg aca aag agc gaa gag
gga aaa gaa att 1968 Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu
Glu Gly Lys Glu Ile 645 650 655 aga aaa gcg att gtg ccc cag gat cca
gac tgg tgg atc gtc agt gcg 2016 Arg Lys Ala Ile Val Pro Gln Asp
Pro Asp Trp Trp Ile Val Ser Ala 660 665 670 gat tat tcc caa ata gaa
ctc aga atc ctc gct cat ctc agt ggt gat 2064 Asp Tyr Ser Gln Ile
Glu Leu Arg Ile Leu Ala His Leu Ser Gly Asp 675 680 685 gag aac ctt
gtg aag gcc ttc gag gag ggc atc gat gtg cac acc ttg 2112 Glu Asn
Leu Val Lys Ala Phe Glu Glu Gly Ile Asp Val His Thr Leu 690 695 700
act gcc tcc agg atc tac aac gta aag cca gaa gaa gtg aac gaa gaa
2160 Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu Glu Val Asn Glu
Glu 705 710 715 720 atg cga cgg gtt gga aag atg gtg aac ttc tct ata
ata tac ggt gtc 2208 Met Arg Arg Val Gly Lys Met Val Asn Phe Ser
Ile Ile Tyr Gly Val 725 730 735 aca ccg tac ggt ctt tct gtg aga ctt
gga ata ccg gtt aaa gaa gca 2256 Thr Pro Tyr Gly Leu Ser Val Arg
Leu Gly Ile Pro Val Lys Glu Ala 740 745 750 gaa aag atg att atc agc
tat ttc aca ctg tat cca aag gtg cga agc 2304 Glu Lys Met Ile Ile
Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg Ser 755 760 765 tac atc cag
cag gtt gtt gca gag gca aaa gag aag ggc tac gtc agg 2352 Tyr Ile
Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg 770 775 780
act ctc ttt gga aga aaa aga gat att ccc cag ctc atg gca agg gac
2400 Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg
Asp 785 790 795 800 aag aac acc cag tcc gaa ggc gaa aga atc gca ata
aac acc ccc att 2448 Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala
Ile Asn Thr Pro Ile 805 810 815 cag gga acg gcg gca gat ata ata aaa
ttg gct atg ata gat ata gac 2496 Gln Gly Thr Ala Ala Asp Ile Ile
Lys Leu Ala Met Ile Asp Ile Asp 820 825 830 gag gag ctg aga aaa aga
aac atg aaa tcc aga atg atc att cag gtt 2544 Glu Glu Leu Arg Lys
Arg Asn Met Lys Ser Arg Met Ile Ile Gln Val 835 840 845 cat gac gaa
ctg gtc ttc gag gtt ccc gat gag gaa aaa gaa gaa cta 2592 His Asp
Glu Leu Val Phe Glu Val Pro Asp Glu Glu Lys Glu Glu Leu 850 855 860
gtt gat ctg gtg aag aac aaa atg aca aat gtg gtg aaa ctc tct gtg
2640 Val Asp Leu Val Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser
Val 865 870 875 880 cct ctt gag gtt gac ata agc atc gga aaa agc tgg
tct tga 2682 Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser
885 890 2 893 PRT Thermotoga neapolitana 2 Met Ala Arg Leu Phe Leu
Phe Asp Gly Thr Ala Leu Ala Tyr Arg Ala 1 5 10 15 Tyr Tyr Ala Leu
Asp Arg Ser Leu Ser Thr Ser Thr Gly Ile Pro Thr 20 25 30 Asn Ala
Val Tyr Gly Val Ala Arg Met Leu Val Lys Phe Ile Lys Glu 35 40 45
His Ile Ile Pro Glu Lys Asp Tyr Ala Ala Val Ala Phe Asp Lys Lys 50
55 60 Ala Ala Thr Phe Arg His Lys Leu Leu Glu Ala Tyr Lys Ala Gln
Arg 65 70 75 80 Pro Lys Thr Pro Asp Leu Leu Val Gln Gln Leu Pro Tyr
Ile Lys Arg 85 90 95 Leu Ile Glu Ala Leu Gly Phe Lys Val Leu Glu
Leu Glu Gly Tyr Glu 100 105 110 Ala Asp Asp Ile Ile Ala Thr Leu Ala
Val Lys Gly Cys Thr Phe Phe 115 120 125 Asp Glu Ile Phe Ile Ile Thr
Gly Asp Lys Asp Met Leu Gln Leu Val 130 135 140 Asn Glu Lys Ile Lys
Val Trp Arg Ile Val Lys Gly Ile Ser Asp Leu 145 150 155 160 Glu Leu
Tyr Asp Ser Lys Lys Val Lys Glu Arg Tyr Gly Val Glu Pro 165 170 175
His Gln Ile Pro Asp Leu Leu Ala Leu Thr Gly Asp Glu Ile Asp Asn 180
185 190 Ile Pro Gly Val Thr Gly Ile Gly Glu Lys Thr Ala Val Gln Leu
Leu 195 200 205 Gly Lys Tyr Arg Asn Leu Glu Asp Ile Leu Glu His Ala
Arg Glu Leu 210 215 220 Pro Gln Arg Val Arg Lys Ala Leu Leu Arg Asp
Arg Glu Val Ala Ile 225 230 235 240 Leu Ser Lys Lys Leu Ala Thr Leu
Val Thr Asn Ala Pro Val Glu Val 245 250 255 Asp Trp Glu Glu Met Lys
Tyr Arg Gly Tyr Asp Lys Arg Lys Leu Leu 260 265 270 Pro Ile Leu Lys
Glu Leu Glu Phe Ala Ser Ile Met Lys Glu Leu Gln 275 280 285 Leu Tyr
Glu Glu Ala Glu Pro Thr Gly Tyr Glu Ile Val Lys Asp His 290 295 300
Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys Glu Val Pro Ser Phe 305
310 315 320 Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro Phe Asn Cys
Glu Ile 325 330 335 Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr Ala
Tyr Tyr Ile Pro 340 345 350 Leu His His Arg Asn Ala Gln Asn Leu Asp
Glu Thr Leu Val Leu Ser 355 360 365 Lys Leu Lys Glu Ile Leu Glu Asp
Pro Ser Ser Lys Ile Val Gly Gln 370 375 380 Asn Leu Lys Tyr Asp Tyr
Lys Val Leu Met Val Lys Gly Ile Ser Pro 385 390 395 400 Val Tyr Pro
His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro 405 410 415 Asn
Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser Leu Lys Phe Leu Gly 420 425
430 Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro Leu
435 440 445 Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala
Asn Tyr 450 455 460 Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr
Lys Ile Leu Ser 465 470 475 480 Met Lys Leu His Glu Ala Glu Leu Glu
Asn Val Phe Tyr Arg Ile Glu 485 490 495 Met Pro Leu Val Asn Val Leu
Ala Arg Met Glu Leu Asn Gly Val Tyr 500 505 510 Val Asp Thr Glu Phe
Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys 515 520 525 Leu Glu Glu
Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro Phe 530 535 540 Asn
Ile Asn Ser Pro Lys Gln Val Ser Lys Ile Leu Phe Glu Lys Leu 545 550
555 560 Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser
Thr 565 570 575 Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu His Glu
Ile Val Pro 580 585 590 Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
Lys Ser Thr Tyr Ile 595 600 605 Asp Thr Leu Pro Lys Leu Val Asn Pro
Lys Thr Gly Arg Ile His Ala 610 615 620 Ser Phe His Gln Thr Gly Thr
Ala Thr Gly Arg Leu Ser Ser Ser Asp 625 630 635 640 Pro Asn Leu Gln
Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu Ile 645 650 655 Arg Lys
Ala Ile Val Pro Gln Asp Pro Asp Trp Trp Ile Val Ser Ala 660 665 670
Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly Asp 675
680 685 Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile Asp Val His Thr
Leu 690 695 700 Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu Glu Val
Asn Glu Glu 705 710 715 720 Met Arg Arg Val Gly Lys Met Val Asn Phe
Ser Ile Ile Tyr Gly Val 725 730 735 Thr Pro Tyr Gly Leu Ser Val Arg
Leu Gly Ile Pro Val Lys Glu Ala 740 745 750 Glu Lys Met Ile Ile Ser
Tyr Phe Thr Leu Tyr Pro Lys Val Arg Ser 755 760 765 Tyr Ile Gln Gln
Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg 770 775 780 Thr Leu
Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp 785 790 795
800 Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile
805 810 815 Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp
Ile Asp 820 825 830 Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg Met
Ile Ile Gln Val 835 840 845 His Asp Glu Leu Val Phe Glu Val Pro Asp
Glu Glu Lys Glu Glu Leu 850 855 860 Val Asp Leu Val Lys Asn Lys Met
Thr Asn Val Val Lys Leu Ser Val 865 870 875 880 Pro Leu Glu Val Asp
Ile Ser Ile Gly Lys Ser Trp Ser 885 890 3 44 DNA Artificial
Sequence Synthetic 3 catggcgaga ctatttctct ttgatggcac agccctggcc
taca 44 4 37 DNA Artificial Sequence Synthetic 4 aggccagggc
tgtgccatca aagagaaata gtctcgc 37 5 39 DNA Artificial Sequence
Synthetic 5 aggccagggc tgtgccatca aagagaaata gtctcgcca 39 6 44 DNA
Artificial Sequence Synthetic 6 tatggcgaga ctatttctct ttgatggcac
agccctggcc taca 44 7 1833 DNA Artificial Sequence Synthetic 7 atg
aag gaa ctt caa ctg tac gaa gaa gca gaa ccc acc gga tac gaa 48 Met
Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10
15 atc gtg aag gat cat aag acc ttc gaa gat ctc atc gaa aag ctg aag
96 Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30
gag gtt cca tct ttt gcc ctg gac ctt gaa acg tcc tcc ctt gac ccg 144
Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro 35
40 45 ttc aac tgt gag ata gtc ggc atc tcc gtg tcg ttc aaa ccg aaa
aca 192 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys
Thr 50 55 60 gct tat tac att cca ctt cat cac aga aac gcc cag aat
ctt gat gaa 240 Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn
Leu Asp Glu 65 70 75 80 aca ctg gtg ctg tcg aag ttg aaa gag atc ctc
gaa gac ccg tct tcg 288 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu
Glu Asp Pro Ser Ser 85 90 95 aag att gtg ggt cag aac ctg aag tac
gac tac aag gtt ctt atg gta 336 Lys Ile Val Gly Gln Asn Leu Lys Tyr
Asp Tyr Lys Val Leu Met Val 100 105 110 aag ggt ata tcg cca gtt tat
ccg cat ttt gac acg atg ata gct gca 384 Lys Gly Ile Ser Pro Val Tyr
Pro His Phe Asp Thr Met Ile Ala Ala 115 120 125 tat ttg ctg gag cca
aac gag aaa aaa ttc aat ctc gaa gat ctg tct 432 Tyr Leu Leu Glu Pro
Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 ttg aaa ttt
ctc gga tac aaa atg acg tct tat cag gaa ctg atg tcg 480 Leu Lys Phe
Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150 155 160
ttt tcc tca cca ctt ttt ggt ttc agc ttt gcg gat gtt ccg gta gac 528
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165
170 175 aag gct gcg aac tac tcc tgc gag gat gca gac atc act tat agg
ctc 576 Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg
Leu 180 185 190 tac aag ata ctc agc atg aag ctc cat gaa gcg gaa ctt
gag aac gtc 624 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu
Glu Asn Val 195 200 205 ttc tac agg ata gag atg ccg ctt gtg aac gtt
ctt gca cgc atg gaa 672 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val
Leu Ala Arg Met Glu 210 215 220 ttg aac ggg gtg tat gtg gac aca gaa
ttc ctg aaa aag ctc tcg gag 720 Leu Asn Gly Val Tyr Val Asp Thr Glu
Phe Leu Lys Lys Leu Ser Glu 225 230 235 240 gag tac ggc aaa aag ctc
gag gaa ctg gcc gaa aaa atc tac cag ata 768 Glu Tyr Gly Lys Lys Leu
Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250 255 gca gga gag ccc
ttc aac atc aat tct cca aaa cag gtt tca aag atc 816 Ala Gly Glu Pro
Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270 ctt ttt
gag aag ctg gga ata aaa ccc cgt gga aaa acg aca aaa aca 864 Leu Phe
Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285
gga gcg tac tct acc agg ata gag gtg ttg gaa gag ata gcg aat gag 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290
295 300 cac gag ata gta ccc ctc att ctc gag tac aga aag atc cag aaa
ctg 960 His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys
Leu 305 310 315 320 aaa tcg acc tac ata gac acc ctt ccg aaa ctt gtg
aac ccg aaa acc 1008 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu
Val Asn Pro Lys Thr 325 330 335 gga aga att cat gca tct ttc cac cag
acg ggt acc gcc act ggc agg 1056 Gly Arg Ile His Ala Ser Phe His
Gln Thr Gly Thr Ala Thr Gly Arg 340 345 350 ttg agt agc agt gat cca
aat ctt cag aat ctt ccg aca aag agc gaa 1104 Leu Ser Ser Ser Asp
Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu 355 360 365 gag gga aaa
gaa att aga aaa gcg att gtg ccc cag gat cca gac tgg 1152 Glu Gly
Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375 380
tgg atc gtc agt gcg gat tat tcc caa ata gaa ctc aga atc ctc gct
1200 Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu
Ala 385 390 395 400 cat ctc agt ggt gat gag aac ctt gtg aag gcc ttc
gag gag ggc atc 1248 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala
Phe Glu Glu Gly Ile 405 410 415 gat gtg cac acc ttg act gcc tcc agg
atc tac aac gta aag cca gaa 1296 Asp Val His Thr Leu Thr Ala Ser
Arg Ile Tyr Asn Val Lys Pro Glu 420 425 430 gaa gtg aac gaa gaa atg
cga cgg gtt gga aag atg gtg aac ttc tct 1344 Glu Val Asn Glu Glu
Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 435 440 445 ata ata tac
ggt gtc aca ccg tac ggt ctt tct gtg aga ctt gga ata 1392 Ile Ile
Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile 450 455 460
ccg gtt aaa gaa gca gaa aag atg att atc agc tat ttc aca ctg tat
1440 Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu
Tyr 465 470 475 480 cca aag gtg cga agc tac atc cag cag gtt gtt gca
gag gca aaa gag 1488 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val
Ala Glu Ala Lys Glu 485 490 495 aag ggc tac gtc agg act ctc ttt gga
aga aaa aga gat att ccc cag 1536 Lys Gly Tyr Val Arg Thr Leu Phe
Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 ctc atg gca agg gac aag
aac acc cag tcc gaa ggc gaa aga atc gca 1584 Leu Met Ala Arg Asp
Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520 525 ata aac acc
ccc att cag gga acg gcg gca gat ata ata aaa ttg gct 1632 Ile Asn
Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 530 535 540
atg ata gat ata gac gag gag ctg aga aaa aga aac atg aaa tcc aga
1680 Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser
Arg 545 550 555 560 atg atc att cag gtt cat gac gaa ctg gtc ttc gag
gtt ccc gat gag 1728 Met Ile Ile Gln Val His Asp Glu Leu Val Phe
Glu Val Pro Asp Glu 565 570 575 gaa aaa gaa gaa cta gtt gat ctg gtg
aag aac aaa atg aca aat gtg 1776 Glu Lys Glu Glu Leu Val Asp Leu
Val Lys Asn Lys Met Thr Asn Val 580 585 590 gtg aaa ctc tct gtg cct
ctt gag gtt gac ata agc atc gga aaa agc 1824 Val Lys Leu Ser Val
Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser 595 600 605 tgg tct tga
1833 Trp Ser 610 8 610 PRT Artificial Sequence Synthetic 8 Met Lys
Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15
Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20
25 30 Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp
Pro 35 40 45 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys
Pro Lys Thr 50 55 60 Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala
Gln Asn Leu Asp Glu 65 70 75 80 Thr Leu Val Leu Ser Lys Leu Lys Glu
Ile Leu Glu Asp Pro Ser Ser 85 90 95 Lys Ile Val Gly Gln Asn Leu
Lys Tyr Asp Tyr Lys Val Leu Met Val 100 105 110 Lys Gly Ile Ser Pro
Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala 115 120 125 Tyr Leu Leu
Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 Leu
Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150
155 160 Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val
Asp 165 170 175 Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr
Tyr Arg Leu 180 185 190 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala
Glu Leu Glu Asn Val 195 200 205 Phe Tyr Arg Ile Glu Met Pro Leu Val
Asn Val Leu Ala Arg Met Glu 210 215 220 Leu Asn Gly Val Tyr Val Asp
Thr Glu Phe Leu Lys Lys Leu Ser Glu 225 230 235 240 Glu Tyr Gly Lys
Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250 255 Ala Gly
Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270
Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275
280 285 Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn
Glu 290 295 300 His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile
Gln Lys Leu 305 310 315 320 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys
Leu Val Asn Pro Lys Thr 325 330 335 Gly Arg Ile His Ala Ser Phe His
Gln Thr Gly Thr Ala Thr Gly Arg 340 345 350 Leu Ser Ser Ser Asp Pro
Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu 355 360 365 Glu Gly Lys Glu
Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375 380 Trp Ile
Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390 395
400 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile
405 410 415 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys
Pro Glu 420 425 430 Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met
Val Asn Phe Ser 435 440 445 Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu
Ser Val Arg Leu Gly Ile 450 455 460 Pro Val Lys Glu Ala Glu Lys Met
Ile Ile Ser Tyr Phe Thr Leu Tyr 465 470 475 480 Pro Lys Val Arg Ser
Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu 485 490 495 Lys Gly Tyr
Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 Leu
Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520
525 Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala
530 535 540 Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys
Ser Arg 545 550 555 560 Met Ile Ile Gln Val His Asp Glu Leu Val Phe
Glu Val Pro Asp Glu 565 570 575 Glu Lys Glu Glu Leu Val Asp Leu Val
Lys Asn Lys Met Thr Asn Val 580 585 590 Val Lys Leu Ser Val Pro Leu
Glu Val Asp Ile Ser Ile Gly Lys Ser 595 600 605 Trp Ser 610 9 31
DNA Artificial Sequence Synthetic 9 atcgaaaagc tgaccatggt
tccatctttt g 31 10 1737 DNA Artificial Sequence Synthetic 10 atg
gtt cca tct ttt gcc ctg gac ctt gaa acg tcc tcc ctt gac ccg 48 Met
Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro 1 5 10
15 ttc aac tgt gag ata gtc ggc atc tcc gtg tcg ttc aaa ccg aaa aca
96 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
20 25 30 gct tat tac att cca ctt cat cac aga aac gcc cag aat ctt
gat gaa 144 Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu
Asp Glu 35 40 45 aca ctg gtg ctg tcg aag ttg aaa gag atc ctc gaa
gac ccg tct tcg 192 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu
Asp Pro Ser Ser 50 55 60 aag att gtg ggt cag aac ctg aag tac gac
tac aag gtt ctt atg gta 240 Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp
Tyr Lys Val Leu Met Val 65 70 75 80 aag ggt ata tcg cca gtt tat ccg
cat ttt gac acg atg ata gct gca 288 Lys Gly Ile Ser Pro Val Tyr Pro
His Phe Asp Thr Met Ile Ala Ala 85 90 95 tat ttg ctg gag cca aac
gag aaa aaa ttc aat ctc gaa gat ctg tct 336 Tyr Leu Leu Glu Pro Asn
Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 100 105 110 ttg aaa ttt ctc
gga tac aaa atg acg tct tat cag gaa ctg atg tcg 384 Leu Lys Phe Leu
Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 115 120 125 ttt tcc
tca cca ctt ttt ggt ttc agc ttt gcg gat gtt ccg gta gac 432 Phe Ser
Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 130 135 140
aag gct gcg aac tac tcc tgc gag gat gca gac atc act tat agg ctc 480
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 145
150 155 160 tac aag ata ctc agc atg aag ctc cat gaa gcg gaa ctt gag
aac gtc 528 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu
Asn Val 165 170 175 ttc tac agg ata gag atg ccg ctt gtg aac gtt ctt
gca cgc atg gaa 576 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu
Ala Arg Met Glu 180 185 190 ttg aac ggg gtg tat gtg gac aca gaa ttc
ctg aaa aag ctc tcg gag 624 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe
Leu Lys Lys Leu Ser Glu 195 200 205 gag tac ggc aaa aag ctc gag gaa
ctg gcc gaa aaa atc tac cag ata 672 Glu Tyr Gly Lys Lys Leu Glu Glu
Leu Ala Glu Lys Ile Tyr Gln Ile 210 215 220 gca gga gag ccc ttc aac
atc aat tct cca aaa cag gtt tca aag atc 720 Ala Gly Glu Pro Phe Asn
Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 225 230 235 240 ctt ttt gag
aag ctg gga ata aaa ccc cgt gga aaa acg aca aaa aca 768 Leu Phe Glu
Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 245 250 255 gga
gcg tac tct acc agg ata gag gtg ttg gaa gag ata gcg aat gag 816 Gly
Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 260 265
270 cac gag ata gta ccc ctc att ctc gag tac aga aag atc cag aaa ctg
864 His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
275 280 285 aaa tcg acc tac ata gac acc ctt ccg aaa ctt gtg aac ccg
aaa acc 912 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro
Lys Thr 290 295 300 gga aga att cat gca tct ttc cac cag acg ggt acc
gcc act ggc agg 960 Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr
Ala Thr Gly Arg 305 310 315 320 ttg agt agc agt gat cca aat ctt cag
aat ctt ccg aca aag agc gaa 1008 Leu Ser Ser Ser Asp Pro Asn Leu
Gln Asn Leu Pro Thr Lys Ser Glu 325 330 335 gag gga aaa gaa att aga
aaa gcg att gtg ccc cag gat cca gac tgg 1056 Glu Gly Lys Glu Ile
Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 340 345 350 tgg atc gtc
agt gcg gat tat tcc caa ata gaa ctc aga atc ctc gct 1104 Trp Ile
Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 355 360 365
cat ctc agt ggt gat gag aac ctt gtg aag gcc ttc gag gag ggc atc
1152 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly
Ile 370 375 380 gat gtg cac acc ttg act gcc tcc agg atc tac aac gta
aag cca gaa 1200 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn
Val Lys Pro Glu 385 390 395 400 gaa gtg aac gaa gaa atg cga cgg gtt
gga aag atg gtg aac ttc tct 1248 Glu Val Asn Glu Glu Met Arg Arg
Val Gly Lys Met Val Asn Phe Ser 405 410 415 ata ata tac ggt gtc aca
ccg tac ggt ctt tct gtg aga ctt gga ata 1296 Ile Ile Tyr Gly Val
Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile 420 425 430 ccg gtt aaa
gaa gca gaa aag atg att atc agc tat ttc aca ctg tat 1344 Pro Val
Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr 435 440 445
cca aag gtg cga agc tac atc cag cag gtt gtt gca gag gca aaa gag
1392 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys
Glu 450 455 460 aag ggc tac gtc agg act ctc ttt gga aga aaa aga gat
att ccc cag 1440 Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg
Asp Ile Pro Gln 465 470 475 480 ctc atg gca agg gac aag aac acc cag
tcc gaa ggc gaa aga atc gca 1488 Leu Met Ala Arg Asp Lys Asn Thr
Gln Ser Glu Gly Glu Arg Ile Ala 485 490 495 ata aac acc ccc att cag
gga acg gcg gca gat ata ata aaa ttg gct 1536 Ile Asn Thr Pro Ile
Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 500 505 510 atg ata gat
ata gac gag gag ctg aga aaa aga aac atg aaa tcc aga 1584 Met Ile
Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 515 520 525
atg atc att cag gtt cat gac gaa ctg gtc ttc gag gtt ccc gat gag
1632 Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
530 535 540 gaa
aaa gaa gaa cta gtt gat ctg gtg aag aac aaa atg aca aat gtg 1680
Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val 545
550 555 560 gtg aaa ctc tct gtg cct ctt gag gtt gac ata agc atc gga
aaa agc 1728 Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile
Gly Lys Ser 565 570 575 tgg tct tga 1737 Trp Ser 11 578 PRT
Artificial Sequence Synthetic 11 Met Val Pro Ser Phe Ala Leu Asp
Leu Glu Thr Ser Ser Leu Asp Pro 1 5 10 15 Phe Asn Cys Glu Ile Val
Gly Ile Ser Val Ser Phe Lys Pro Lys Thr 20 25 30 Ala Tyr Tyr Ile
Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu 35 40 45 Thr Leu
Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser 50 55 60
Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val 65
70 75 80 Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile
Ala Ala 85 90 95 Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu
Glu Asp Leu Ser 100 105 110 Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser
Tyr Gln Glu Leu Met Ser 115 120 125 Phe Ser Ser Pro Leu Phe Gly Phe
Ser Phe Ala Asp Val Pro Val Asp 130 135 140 Lys Ala Ala Asn Tyr Ser
Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 145 150 155 160 Tyr Lys Ile
Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val 165 170 175 Phe
Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 180 185
190 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
195 200 205 Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr
Gln Ile 210 215 220 Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln
Val Ser Lys Ile 225 230 235 240 Leu Phe Glu Lys Leu Gly Ile Lys Pro
Arg Gly Lys Thr Thr Lys Thr 245 250 255 Gly Ala Tyr Ser Thr Arg Ile
Glu Val Leu Glu Glu Ile Ala Asn Glu 260 265 270 His Glu Ile Val Pro
Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu 275 280 285 Lys Ser Thr
Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr 290 295 300 Gly
Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg 305 310
315 320 Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser
Glu 325 330 335 Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp
Pro Asp Trp 340 345 350 Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu
Leu Arg Ile Leu Ala 355 360 365 His Leu Ser Gly Asp Glu Asn Leu Val
Lys Ala Phe Glu Glu Gly Ile 370 375 380 Asp Val His Thr Leu Thr Ala
Ser Arg Ile Tyr Asn Val Lys Pro Glu 385 390 395 400 Glu Val Asn Glu
Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 405 410 415 Ile Ile
Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile 420 425 430
Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr 435
440 445 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys
Glu 450 455 460 Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp
Ile Pro Gln 465 470 475 480 Leu Met Ala Arg Asp Lys Asn Thr Gln Ser
Glu Gly Glu Arg Ile Ala 485 490 495 Ile Asn Thr Pro Ile Gln Gly Thr
Ala Ala Asp Ile Ile Lys Leu Ala 500 505 510 Met Ile Asp Ile Asp Glu
Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 515 520 525 Met Ile Ile Gln
Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu 530 535 540 Glu Lys
Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn Val 545 550 555
560 Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser
565 570 575 Trp Ser 12 20 DNA Artificial Sequence Synthetic 12
tgccgtacac ctccgagagc 20 13 24 DNA Artificial Sequence Synthetic 13
ctcgtttggc tccagcaaat atgc 24 14 21 DNA Artificial Sequence
Synthetic 14 tttgccctgg aacttgaaac g 21 15 1833 DNA Artificial
Sequence Synthetic 15 atg aag gaa ctt caa ctg tac gaa gaa gca gaa
ccc acc gga tac gaa 48 Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu
Pro Thr Gly Tyr Glu 1 5 10 15 atc gtg aag gat cat aag acc ttc gaa
gat ctc atc gaa aag ctg aag 96 Ile Val Lys Asp His Lys Thr Phe Glu
Asp Leu Ile Glu Lys Leu Lys 20 25 30 gag gtt cca tct ttt gcc ctg
gaa ctt gaa acg tcc tcc ctt gac ccg 144 Glu Val Pro Ser Phe Ala Leu
Glu Leu Glu Thr Ser Ser Leu Asp Pro 35 40 45 ttc aac tgt gag ata
gtc ggc atc tcc gtg tcg ttc aaa ccg aaa aca 192 Phe Asn Cys Glu Ile
Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr 50 55 60 gct tat tac
att cca ctt cat cac aga aac gcc cag aat ctt gat gaa 240 Ala Tyr Tyr
Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu 65 70 75 80 aca
ctg gtg ctg tcg aag ttg aaa gag atc ctc gaa gac ccg tct tcg 288 Thr
Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser 85 90
95 aag att gtg ggt cag aac ctg aag tac gac tac aag gtt ctt atg gta
336 Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val
100 105 110 aag ggt ata tcg cca gtt tat ccg cat ttt gac acg atg ata
gct gca 384 Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile
Ala Ala 115 120 125 tat ttg ctg gag cca aac gag aaa aaa ttc aat ctc
gaa gat ctg tct 432 Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu
Glu Asp Leu Ser 130 135 140 ttg aaa ttt ctc gga tac aaa atg acg tct
tat cag gaa ctg atg tcg 480 Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser
Tyr Gln Glu Leu Met Ser 145 150 155 160 ttt tcc tca cca ctt ttt ggt
ttc agc ttt gcg gat gtt ccg gta gac 528 Phe Ser Ser Pro Leu Phe Gly
Phe Ser Phe Ala Asp Val Pro Val Asp 165 170 175 aag gct gcg aac tac
tcc tgc gag gat gca gac atc act tat agg ctc 576 Lys Ala Ala Asn Tyr
Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 180 185 190 tac aag ata
ctc agc atg aag ctc cat gaa gcg gaa ctt gag aac gtc 624 Tyr Lys Ile
Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val 195 200 205 ttc
tac agg ata gag atg ccg ctt gtg aac gtt ctt gca cgc atg gaa 672 Phe
Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 210 215
220 ttg aac ggg gtg tat gtg gac aca gaa ttc ctg aaa aag ctc tcg gag
720 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu
225 230 235 240 gag tac ggc aaa aag ctc gag gaa ctg gcc gaa aaa atc
tac cag ata 768 Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile
Tyr Gln Ile 245 250 255 gca gga gag ccc ttc aac atc aat tct cca aaa
cag gtt tca aag atc 816 Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys
Gln Val Ser Lys Ile 260 265 270 ctt ttt gag aag ctg gga ata aaa ccc
cgt gga aaa acg aca aaa aca 864 Leu Phe Glu Lys Leu Gly Ile Lys Pro
Arg Gly Lys Thr Thr Lys Thr 275 280 285 gga gcg tac tct acc agg ata
gag gtg ttg gaa gag ata gcg aat gag 912 Gly Ala Tyr Ser Thr Arg Ile
Glu Val Leu Glu Glu Ile Ala Asn Glu 290 295 300 cac gag ata gta ccc
ctc att ctc gag tac aga aag atc cag aaa ctg 960 His Glu Ile Val Pro
Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu 305 310 315 320 aaa tcg
acc tac ata gac acc ctt ccg aaa ctt gtg aac ccg aaa acc 1008 Lys
Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr 325 330
335 gga aga att cat gca tct ttc cac cag acg ggt acc gcc act ggc agg
1056 Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly
Arg 340 345 350 ttg agt agc agt gat cca aat ctt cag aat ctt ccg aca
aag agc gaa 1104 Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro
Thr Lys Ser Glu 355 360 365 gag gga aaa gaa att aga aaa gcg att gtg
ccc cag gat cca gac tgg 1152 Glu Gly Lys Glu Ile Arg Lys Ala Ile
Val Pro Gln Asp Pro Asp Trp 370 375 380 tgg atc gtc agt gcg gat tat
tcc caa ata gaa ctc aga atc ctc gct 1200 Trp Ile Val Ser Ala Asp
Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390 395 400 cat ctc agt
ggt gat gag aac ctt gtg aag gcc ttc gag gag ggc atc 1248 His Leu
Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile 405 410 415
gat gtg cac acc ttg act gcc tcc agg atc tac aac gta aag cca gaa
1296 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro
Glu 420 425 430 gaa gtg aac gaa gaa atg cga cgg gtt gga aag atg gtg
aac ttc tct 1344 Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met
Val Asn Phe Ser 435 440 445 ata ata tac ggt gtc aca ccg tac ggt ctt
tct gtg aga ctt gga ata 1392 Ile Ile Tyr Gly Val Thr Pro Tyr Gly
Leu Ser Val Arg Leu Gly Ile 450 455 460 ccg gtt aaa gaa gca gaa aag
atg att atc agc tat ttc aca ctg tat 1440 Pro Val Lys Glu Ala Glu
Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr 465 470 475 480 cca aag gtg
cga agc tac atc cag cag gtt gtt gca gag gca aaa gag 1488 Pro Lys
Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu 485 490 495
aag ggc tac gtc agg act ctc ttt gga aga aaa aga gat att ccc cag
1536 Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro
Gln 500 505 510 ctc atg gca agg gac aag aac acc cag tcc gaa ggc gaa
aga atc gca 1584 Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly
Glu Arg Ile Ala 515 520 525 ata aac acc ccc att cag gga acg gcg gca
gat ata ata aaa ttg gct 1632 Ile Asn Thr Pro Ile Gln Gly Thr Ala
Ala Asp Ile Ile Lys Leu Ala 530 535 540 atg ata gat ata gac gag gag
ctg aga aaa aga aac atg aaa tcc aga 1680 Met Ile Asp Ile Asp Glu
Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 545 550 555 560 atg atc att
cag gtt cat gac gaa ctg gtc ttc gag gtt ccc gat gag 1728 Met Ile
Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu 565 570 575
gaa aaa gaa gaa cta gtt gat ctg gtg aag aac aaa atg aca aat gtg
1776 Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn
Val 580 585 590 gtg aaa ctc tct gtg cct ctt gag gtt gac ata agc atc
gga aaa agc 1824 Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser
Ile Gly Lys Ser 595 600 605 tgg tct tga 1833 Trp Ser 610 16 610 PRT
Artificial Sequence Synthetic 16 Met Lys Glu Leu Gln Leu Tyr Glu
Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 Ile Val Lys Asp His Lys
Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30 Glu Val Pro Ser
Phe Ala Leu Glu Leu Glu Thr Ser Ser Leu Asp Pro 35 40 45 Phe Asn
Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr 50 55 60
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu 65
70 75 80 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro
Ser Ser 85 90 95 Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys
Val Leu Met Val 100 105 110 Lys Gly Ile Ser Pro Val Tyr Pro His Phe
Asp Thr Met Ile Ala Ala 115 120 125 Tyr Leu Leu Glu Pro Asn Glu Lys
Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 Leu Lys Phe Leu Gly Tyr
Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150 155 160 Phe Ser Ser
Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165 170 175 Lys
Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 180 185
190 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg
Met Glu 210 215 220 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys
Lys Leu Ser Glu 225 230 235 240 Glu Tyr Gly Lys Lys Leu Glu Glu Leu
Ala Glu Lys Ile Tyr Gln Ile 245 250 255 Ala Gly Glu Pro Phe Asn Ile
Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270 Leu Phe Glu Lys Leu
Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285 Gly Ala Tyr
Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290 295 300 His
Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu 305 310
315 320 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys
Thr 325 330 335 Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala
Thr Gly Arg 340 345 350 Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu
Pro Thr Lys Ser Glu 355 360 365 Glu Gly Lys Glu Ile Arg Lys Ala Ile
Val Pro Gln Asp Pro Asp Trp 370 375 380 Trp Ile Val Ser Ala Asp Tyr
Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390 395 400 His Leu Ser Gly
Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile 405 410 415 Asp Val
His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu 420 425 430
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 435
440 445 Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly
Ile 450 455 460 Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe
Thr Leu Tyr 465 470 475 480 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val
Val Ala Glu Ala Lys Glu 485 490 495 Lys Gly Tyr Val Arg Thr Leu Phe
Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 Leu Met Ala Arg Asp Lys
Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520 525 Ile Asn Thr Pro
Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 530 535 540 Met Ile
Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 545 550 555
560 Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575 Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr
Asn Val 580 585 590 Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser
Ile Gly Lys Ser 595 600 605 Trp Ser 610 17 17 DNA Artificial
Sequence Synthetic 17 gaccttgaca cgtcctc 17 18 1833 DNA Artificial
Sequence Synthetic 18 atg aag gaa ctt caa ctg tac gaa gaa gca gaa
ccc acc gga tac gaa 48 Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu
Pro Thr Gly Tyr Glu 1 5 10 15 atc gtg aag gat cat aag acc ttc gaa
gat ctc atc gaa aag ctg aag 96 Ile Val Lys Asp His Lys Thr Phe Glu
Asp Leu Ile Glu Lys Leu Lys 20 25 30 gag gtt cca tct ttt gcc ctg
gac ctt gac acg tcc tcc ctt gac ccg 144 Glu Val Pro Ser Phe Ala Leu
Asp Leu Asp Thr Ser Ser Leu Asp Pro 35 40 45 ttc aac tgt gag ata
gtc ggc atc tcc gtg tcg ttc aaa ccg aaa aca 192 Phe Asn Cys Glu Ile
Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr 50 55 60 gct tat tac
att cca ctt cat cac aga aac gcc cag aat ctt gat gaa 240 Ala Tyr Tyr Ile
Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu 65 70 75 80 aca ctg
gtg ctg tcg aag ttg aaa gag atc ctc gaa gac ccg tct tcg 288 Thr Leu
Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser 85 90 95
aag att gtg ggt cag aac ctg aag tac gac tac aag gtt ctt atg gta 336
Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val 100
105 110 aag ggt ata tcg cca gtt tat ccg cat ttt gac acg atg ata gct
gca 384 Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala
Ala 115 120 125 tat ttg ctg gag cca aac gag aaa aaa ttc aat ctc gaa
gat ctg tct 432 Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu
Asp Leu Ser 130 135 140 ttg aaa ttt ctc gga tac aaa atg acg tct tat
cag gaa ctg atg tcg 480 Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr
Gln Glu Leu Met Ser 145 150 155 160 ttt tcc tca cca ctt ttt ggt ttc
agc ttt gcg gat gtt ccg gta gac 528 Phe Ser Ser Pro Leu Phe Gly Phe
Ser Phe Ala Asp Val Pro Val Asp 165 170 175 aag gct gcg aac tac tcc
tgc gag gat gca gac atc act tat agg ctc 576 Lys Ala Ala Asn Tyr Ser
Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 180 185 190 tac aag ata ctc
agc atg aag ctc cat gaa gcg gaa ctt gag aac gtc 624 Tyr Lys Ile Leu
Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val 195 200 205 ttc tac
agg ata gag atg ccg ctt gtg aac gtt ctt gca cgc atg gaa 672 Phe Tyr
Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 210 215 220
ttg aac ggg gtg tat gtg gac aca gaa ttc ctg aaa aag ctc tcg gag 720
Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 225
230 235 240 gag tac ggc aaa aag ctc gag gaa ctg gcc gaa aaa atc tac
cag ata 768 Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr
Gln Ile 245 250 255 gca gga gag ccc ttc aac atc aat tct cca aaa cag
gtt tca aag atc 816 Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln
Val Ser Lys Ile 260 265 270 ctt ttt gag aag ctg gga ata aaa ccc cgt
gga aaa acg aca aaa aca 864 Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg
Gly Lys Thr Thr Lys Thr 275 280 285 gga gcg tac tct acc agg ata gag
gtg ttg gaa gag ata gcg aat gag 912 Gly Ala Tyr Ser Thr Arg Ile Glu
Val Leu Glu Glu Ile Ala Asn Glu 290 295 300 cac gag ata gta ccc ctc
att ctc gag tac aga aag atc cag aaa ctg 960 His Glu Ile Val Pro Leu
Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu 305 310 315 320 aaa tcg acc
tac ata gac acc ctt ccg aaa ctt gtg aac ccg aaa acc 1008 Lys Ser
Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr 325 330 335
gga aga att cat gca tct ttc cac cag acg ggt acc gcc act ggc agg
1056 Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly
Arg 340 345 350 ttg agt agc agt gat cca aat ctt cag aat ctt ccg aca
aag agc gaa 1104 Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro
Thr Lys Ser Glu 355 360 365 gag gga aaa gaa att aga aaa gcg att gtg
ccc cag gat cca gac tgg 1152 Glu Gly Lys Glu Ile Arg Lys Ala Ile
Val Pro Gln Asp Pro Asp Trp 370 375 380 tgg atc gtc agt gcg gat tat
tcc caa ata gaa ctc aga atc ctc gct 1200 Trp Ile Val Ser Ala Asp
Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390 395 400 cat ctc agt
ggt gat gag aac ctt gtg aag gcc ttc gag gag ggc atc 1248 His Leu
Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile 405 410 415
gat gtg cac acc ttg act gcc tcc agg atc tac aac gta aag cca gaa
1296 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro
Glu 420 425 430 gaa gtg aac gaa gaa atg cga cgg gtt gga aag atg gtg
aac ttc tct 1344 Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met
Val Asn Phe Ser 435 440 445 ata ata tac ggt gtc aca ccg tac ggt ctt
tct gtg aga ctt gga ata 1392 Ile Ile Tyr Gly Val Thr Pro Tyr Gly
Leu Ser Val Arg Leu Gly Ile 450 455 460 ccg gtt aaa gaa gca gaa aag
atg att atc agc tat ttc aca ctg tat 1440 Pro Val Lys Glu Ala Glu
Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr 465 470 475 480 cca aag gtg
cga agc tac atc cag cag gtt gtt gca gag gca aaa gag 1488 Pro Lys
Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu 485 490 495
aag ggc tac gtc agg act ctc ttt gga aga aaa aga gat att ccc cag
1536 Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro
Gln 500 505 510 ctc atg gca agg gac aag aac acc cag tcc gaa ggc gaa
aga atc gca 1584 Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly
Glu Arg Ile Ala 515 520 525 ata aac acc ccc att cag gga acg gcg gca
gat ata ata aaa ttg gct 1632 Ile Asn Thr Pro Ile Gln Gly Thr Ala
Ala Asp Ile Ile Lys Leu Ala 530 535 540 atg ata gat ata gac gag gag
ctg aga aaa aga aac atg aaa tcc aga 1680 Met Ile Asp Ile Asp Glu
Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 545 550 555 560 atg atc att
cag gtt cat gac gaa ctg gtc ttc gag gtt ccc gat gag 1728 Met Ile
Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu 565 570 575
gaa aaa gaa gaa cta gtt gat ctg gtg aag aac aaa atg aca aat gtg
1776 Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn
Val 580 585 590 gtg aaa ctc tct gtg cct ctt gag gtt gac ata agc atc
gga aaa agc 1824 Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser
Ile Gly Lys Ser 595 600 605 tgg tct tga 1833 Trp Ser 610 19 610 PRT
Artificial Sequence Synthetic 19 Met Lys Glu Leu Gln Leu Tyr Glu
Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 Ile Val Lys Asp His Lys
Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30 Glu Val Pro Ser
Phe Ala Leu Asp Leu Asp Thr Ser Ser Leu Asp Pro 35 40 45 Phe Asn
Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr 50 55 60
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu 65
70 75 80 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro
Ser Ser 85 90 95 Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys
Val Leu Met Val 100 105 110 Lys Gly Ile Ser Pro Val Tyr Pro His Phe
Asp Thr Met Ile Ala Ala 115 120 125 Tyr Leu Leu Glu Pro Asn Glu Lys
Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 Leu Lys Phe Leu Gly Tyr
Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150 155 160 Phe Ser Ser
Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165 170 175 Lys
Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 180 185
190 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg
Met Glu 210 215 220 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys
Lys Leu Ser Glu 225 230 235 240 Glu Tyr Gly Lys Lys Leu Glu Glu Leu
Ala Glu Lys Ile Tyr Gln Ile 245 250 255 Ala Gly Glu Pro Phe Asn Ile
Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270 Leu Phe Glu Lys Leu
Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285 Gly Ala Tyr
Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290 295 300 His
Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu 305 310
315 320 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys
Thr 325 330 335 Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala
Thr Gly Arg 340 345 350 Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu
Pro Thr Lys Ser Glu 355 360 365 Glu Gly Lys Glu Ile Arg Lys Ala Ile
Val Pro Gln Asp Pro Asp Trp 370 375 380 Trp Ile Val Ser Ala Asp Tyr
Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390 395 400 His Leu Ser Gly
Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile 405 410 415 Asp Val
His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu 420 425 430
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 435
440 445 Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly
Ile 450 455 460 Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe
Thr Leu Tyr 465 470 475 480 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val
Val Ala Glu Ala Lys Glu 485 490 495 Lys Gly Tyr Val Arg Thr Leu Phe
Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 Leu Met Ala Arg Asp Lys
Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520 525 Ile Asn Thr Pro
Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 530 535 540 Met Ile
Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 545 550 555
560 Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575 Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr
Asn Val 580 585 590 Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser
Ile Gly Lys Ser 595 600 605 Trp Ser 610 20 39 DNA Artificial
Sequence Synthetic 20 taagtgatat ctgcatcctc gcaggagaag ttcgcagcc 39
21 39 DNA Artificial Sequence Synthetic 21 acaaggctgc gaacttctcc
tgcgaggatg cagatatca 39 22 1833 DNA Artificial Sequence Synthetic
22 atg aag gaa ctt caa ctg tac gaa gaa gca gaa ccc acc gga tac gaa
48 Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15 atc gtg aag gat cat aag acc ttc gaa gat ctc atc gaa aag
ctg aag 96 Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys
Leu Lys 20 25 30 gag gtt cca tct ttt gcc ctg gac ctt gaa acg tcc
tcc ctt gac ccg 144 Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser
Ser Leu Asp Pro 35 40 45 ttc aac tgt gag ata gtc ggc atc tcc gtg
tcg ttc aaa ccg aaa aca 192 Phe Asn Cys Glu Ile Val Gly Ile Ser Val
Ser Phe Lys Pro Lys Thr 50 55 60 gct tat tac att cca ctt cat cac
aga aac gcc cag aat ctt gat gaa 240 Ala Tyr Tyr Ile Pro Leu His His
Arg Asn Ala Gln Asn Leu Asp Glu 65 70 75 80 aca ctg gtg ctg tcg aag
ttg aaa gag atc ctc gaa gac ccg tct tcg 288 Thr Leu Val Leu Ser Lys
Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser 85 90 95 aag att gtg ggt
cag aac ctg aag tac gac tac aag gtt ctt atg gta 336 Lys Ile Val Gly
Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met Val 100 105 110 aag ggt
ata tcg cca gtt tat ccg cat ttt gac acg atg ata gct gca 384 Lys Gly
Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala 115 120 125
tat ttg ctg gag cca aac gag aaa aaa ttc aat ctc gaa gat ctg tct 432
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130
135 140 ttg aaa ttt ctc gga tac aaa atg acg tct tat cag gaa ctg atg
tcg 480 Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met
Ser 145 150 155 160 ttt tcc tca cca ctt ttt ggt ttc agc ttt gcg gat
gtt ccg gta gac 528 Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp
Val Pro Val Asp 165 170 175 aag gct gcg aac ttc tcc tgc gag gat gca
gat atc act tat agg ctc 576 Lys Ala Ala Asn Phe Ser Cys Glu Asp Ala
Asp Ile Thr Tyr Arg Leu 180 185 190 tac aag ata ctc agc atg aag ctc
cat gaa gcg gaa ctt gag aac gtc 624 Tyr Lys Ile Leu Ser Met Lys Leu
His Glu Ala Glu Leu Glu Asn Val 195 200 205 ttc tac agg ata gag atg
ccg ctt gtg aac gtt ctt gca cgc atg gaa 672 Phe Tyr Arg Ile Glu Met
Pro Leu Val Asn Val Leu Ala Arg Met Glu 210 215 220 ttg aac ggg gtg
tat gtg gac aca gaa ttc ctg aaa aag ctc tcg gag 720 Leu Asn Gly Val
Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 225 230 235 240 gag
tac ggc aaa aag ctc gag gaa ctg gcc gaa aaa atc tac cag ata 768 Glu
Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250
255 gca gga gag ccc ttc aac atc aat tct cca aaa cag gtt tca aag atc
816 Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270 ctt ttt gag aag ctg gga ata aaa ccc cgt gga aaa acg aca
aaa aca 864 Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr
Lys Thr 275 280 285 gga gcg tac tct acc agg ata gag gtg ttg gaa gag
ata gcg aat gag 912 Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu
Ile Ala Asn Glu 290 295 300 cac gag ata gta ccc ctc att ctc gag tac
aga aag atc cag aaa ctg 960 His Glu Ile Val Pro Leu Ile Leu Glu Tyr
Arg Lys Ile Gln Lys Leu 305 310 315 320 aaa tcg acc tac ata gac acc
ctt ccg aaa ctt gtg aac ccg aaa acc 1008 Lys Ser Thr Tyr Ile Asp
Thr Leu Pro Lys Leu Val Asn Pro Lys Thr 325 330 335 gga aga att cat
gca tct ttc cac cag acg ggt acc gcc act ggc agg 1056 Gly Arg Ile
His Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg 340 345 350 ttg
agt agc agt gat cca aat ctt cag aat ctt ccg aca aag agc gaa 1104
Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu 355
360 365 gag gga aaa gaa att aga aaa gcg att gtg ccc cag gat cca gac
tgg 1152 Glu Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro
Asp Trp 370 375 380 tgg atc gtc agt gcg gat tat tcc caa ata gaa ctc
aga atc ctc gct 1200 Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu
Leu Arg Ile Leu Ala 385 390 395 400 cat ctc agt ggt gat gag aac ctt
gtg aag gcc ttc gag gag ggc atc 1248 His Leu Ser Gly Asp Glu Asn
Leu Val Lys Ala Phe Glu Glu Gly Ile 405 410 415 gat gtg cac acc ttg
act gcc tcc agg atc tac aac gta aag cca gaa 1296 Asp Val His Thr
Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu 420 425 430 gaa gtg
aac gaa gaa atg cga cgg gtt gga aag atg gtg aac ttc tct 1344 Glu
Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 435 440
445 ata ata tac ggt gtc aca ccg tac ggt ctt tct gtg aga ctt gga ata
1392 Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly
Ile 450 455 460 ccg gtt aaa gaa gca gaa aag atg att atc agc tat ttc
aca ctg tat 1440 Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr
Phe Thr Leu Tyr 465 470 475 480 cca aag gtg cga agc tac atc cag cag
gtt gtt gca gag gca aaa gag 1488 Pro Lys Val Arg Ser Tyr Ile Gln
Gln Val Val Ala Glu Ala Lys Glu 485 490 495 aag ggc tac gtc agg act
ctc ttt gga aga aaa aga gat att ccc cag 1536 Lys Gly Tyr Val Arg
Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 ctc atg gca
agg gac aag aac acc cag tcc gaa ggc gaa aga atc gca 1584 Leu Met
Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520 525
ata aac acc ccc att cag gga acg gcg gca gat ata ata aaa ttg gct
1632 Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu
Ala 530 535 540 atg ata gat ata gac gag gag ctg aga aaa aga aac atg
aaa tcc aga 1680 Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn
Met Lys Ser Arg 545 550 555 560 atg atc att cag gtt cat gac gaa ctg
gtc ttc gag gtt ccc gat gag 1728 Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp
Glu 565 570 575 gaa aaa gaa gaa cta gtt gat ctg gtg aag aac aaa atg
aca aat gtg 1776 Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys
Met Thr Asn Val 580 585 590 gtg aaa ctc tct gtg cct ctt gag gtt gac
ata agc atc gga aaa agc 1824 Val Lys Leu Ser Val Pro Leu Glu Val
Asp Ile Ser Ile Gly Lys Ser 595 600 605 tgg tct tga 1833 Trp Ser
610 23 610 PRT Artificial Sequence Synthetic 23 Met Lys Glu Leu Gln
Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 Ile Val Lys
Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30 Glu
Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro 35 40
45 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60 Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu
Asp Glu 65 70 75 80 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu
Asp Pro Ser Ser 85 90 95 Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp
Tyr Lys Val Leu Met Val 100 105 110 Lys Gly Ile Ser Pro Val Tyr Pro
His Phe Asp Thr Met Ile Ala Ala 115 120 125 Tyr Leu Leu Glu Pro Asn
Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 Leu Lys Phe Leu
Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150 155 160 Phe
Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165 170
175 Lys Ala Ala Asn Phe Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu
180 185 190 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu
Asn Val 195 200 205 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu
Ala Arg Met Glu 210 215 220 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe
Leu Lys Lys Leu Ser Glu 225 230 235 240 Glu Tyr Gly Lys Lys Leu Glu
Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250 255 Ala Gly Glu Pro Phe
Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270 Leu Phe Glu
Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285 Gly
Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290 295
300 His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu
305 310 315 320 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn
Pro Lys Thr 325 330 335 Gly Arg Ile His Ala Ser Phe His Gln Thr Gly
Thr Ala Thr Gly Arg 340 345 350 Leu Ser Ser Ser Asp Pro Asn Leu Gln
Asn Leu Pro Thr Lys Ser Glu 355 360 365 Glu Gly Lys Glu Ile Arg Lys
Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375 380 Trp Ile Val Ser Ala
Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390 395 400 His Leu
Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile 405 410 415
Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu 420
425 430 Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe
Ser 435 440 445 Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg
Leu Gly Ile 450 455 460 Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser
Tyr Phe Thr Leu Tyr 465 470 475 480 Pro Lys Val Arg Ser Tyr Ile Gln
Gln Val Val Ala Glu Ala Lys Glu 485 490 495 Lys Gly Tyr Val Arg Thr
Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 Leu Met Ala Arg
Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520 525 Ile Asn
Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 530 535 540
Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 545
550 555 560 Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro
Asp Glu 565 570 575 Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys
Met Thr Asn Val 580 585 590 Val Lys Leu Ser Val Pro Leu Glu Val Asp
Ile Ser Ile Gly Lys Ser 595 600 605 Trp Ser 610 24 32 DNA
Artificial Sequence Synthetic 24 actcctgcga gaatgctgac atcacttata
gg 32 25 1833 DNA Artificial Sequence Synthetic 25 atg aag gaa ctt
caa ctg tac gaa gaa gca gaa ccc acc gga tac gaa 48 Met Lys Glu Leu
Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 atc gtg
aag gat cat aag acc ttc gaa gat ctc atc gaa aag ctg aag 96 Ile Val
Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30
gag gtt cca tct ttt gcc ctg gac ctt gaa acg tcc tcc ctt gac ccg 144
Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro 35
40 45 ttc aac tgt gag ata gtc ggc atc tcc gtg tcg ttc aaa ccg aaa
aca 192 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys
Thr 50 55 60 gct tat tac att cca ctt cat cac aga aac gcc cag aat
ctt gat gaa 240 Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn
Leu Asp Glu 65 70 75 80 aca ctg gtg ctg tcg aag ttg aaa gag atc ctc
gaa gac ccg tct tcg 288 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu
Glu Asp Pro Ser Ser 85 90 95 aag att gtg ggt cag aac ctg aag tac
gac tac aag gtt ctt atg gta 336 Lys Ile Val Gly Gln Asn Leu Lys Tyr
Asp Tyr Lys Val Leu Met Val 100 105 110 aag ggt ata tcg cca gtt tat
ccg cat ttt gac acg atg ata gct gca 384 Lys Gly Ile Ser Pro Val Tyr
Pro His Phe Asp Thr Met Ile Ala Ala 115 120 125 tat ttg ctg gag cca
aac gag aaa aaa ttc aat ctc gaa gat ctg tct 432 Tyr Leu Leu Glu Pro
Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 ttg aaa ttt
ctc gga tac aaa atg acg tct tat cag gaa ctg atg tcg 480 Leu Lys Phe
Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150 155 160
ttt tcc tca cca ctt ttt ggt ttc agc ttt gcg gat gtt ccg gta gac 528
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165
170 175 aag gct gcg aac tac tcc tgc gag aat gct gac atc act tat agg
ctc 576 Lys Ala Ala Asn Tyr Ser Cys Glu Asn Ala Asp Ile Thr Tyr Arg
Leu 180 185 190 tac aag ata ctc agc atg aag ctc cat gaa gcg gaa ctt
gag aac gtc 624 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu
Glu Asn Val 195 200 205 ttc tac agg ata gag atg ccg ctt gtg aac gtt
ctt gca cgc atg gaa 672 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val
Leu Ala Arg Met Glu 210 215 220 ttg aac ggg gtg tat gtg gac aca gaa
ttc ctg aaa aag ctc tcg gag 720 Leu Asn Gly Val Tyr Val Asp Thr Glu
Phe Leu Lys Lys Leu Ser Glu 225 230 235 240 gag tac ggc aaa aag ctc
gag gaa ctg gcc gaa aaa atc tac cag ata 768 Glu Tyr Gly Lys Lys Leu
Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250 255 gca gga gag ccc
ttc aac atc aat tct cca aaa cag gtt tca aag atc 816 Ala Gly Glu Pro
Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270 ctt ttt
gag aag ctg gga ata aaa ccc cgt gga aaa acg aca aaa aca 864 Leu Phe
Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285
gga gcg tac tct acc agg ata gag gtg ttg gaa gag ata gcg aat gag 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290
295 300 cac gag ata gta ccc ctc att ctc gag tac aga aag atc cag aaa
ctg 960 His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys
Leu 305 310 315 320 aaa tcg acc tac ata gac acc ctt ccg aaa ctt gtg
aac ccg aaa acc 1008 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu
Val Asn Pro Lys Thr 325 330 335 gga aga att cat gca tct ttc cac cag
acg ggt acc gcc act ggc agg 1056 Gly Arg Ile His Ala Ser Phe His
Gln Thr Gly Thr Ala Thr Gly Arg 340 345 350 ttg agt agc agt gat cca
aat ctt cag aat ctt ccg aca aag agc gaa 1104 Leu Ser Ser Ser Asp
Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu 355 360 365 gag gga aaa
gaa att aga aaa gcg att gtg ccc cag gat cca gac tgg 1152 Glu Gly
Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375 380
tgg atc gtc agt gcg gat tat tcc caa ata gaa ctc aga atc ctc gct
1200 Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu
Ala 385 390 395 400 cat ctc agt ggt gat gag aac ctt gtg aag gcc ttc
gag gag ggc atc 1248 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala
Phe Glu Glu Gly Ile 405 410 415 gat gtg cac acc ttg act gcc tcc agg
atc tac aac gta aag cca gaa 1296 Asp Val His Thr Leu Thr Ala Ser
Arg Ile Tyr Asn Val Lys Pro Glu 420 425 430 gaa gtg aac gaa gaa atg
cga cgg gtt gga aag atg gtg aac ttc tct 1344 Glu Val Asn Glu Glu
Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 435 440 445 ata ata tac
ggt gtc aca ccg tac ggt ctt tct gtg aga ctt gga ata 1392 Ile Ile
Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile 450 455 460
ccg gtt aaa gaa gca gaa aag atg att atc agc tat ttc aca ctg tat
1440 Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu
Tyr 465 470 475 480 cca aag gtg cga agc tac atc cag cag gtt gtt gca
gag gca aaa gag 1488 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val
Ala Glu Ala Lys Glu 485 490 495 aag ggc tac gtc agg act ctc ttt gga
aga aaa aga gat att ccc cag 1536 Lys Gly Tyr Val Arg Thr Leu Phe
Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 ctc atg gca agg gac aag
aac acc cag tcc gaa ggc gaa aga atc gca 1584 Leu Met Ala Arg Asp
Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520 525 ata aac acc
ccc att cag gga acg gcg gca gat ata ata aaa ttg gct 1632 Ile Asn
Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 530 535 540
atg ata gat ata gac gag gag ctg aga aaa aga aac atg aaa tcc aga
1680 Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser
Arg 545 550 555 560 atg atc att cag gtt cat gac gaa ctg gtc ttc gag
gtt ccc gat gag 1728 Met Ile Ile Gln Val His Asp Glu Leu Val Phe
Glu Val Pro Asp Glu 565 570 575 gaa aaa gaa gaa cta gtt gat ctg gtg
aag aac aaa atg aca aat gtg 1776 Glu Lys Glu Glu Leu Val Asp Leu
Val Lys Asn Lys Met Thr Asn Val 580 585 590 gtg aaa ctc tct gtg cct
ctt gag gtt gac ata agc atc gga aaa agc 1824 Val Lys Leu Ser Val
Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser 595 600 605 tgg tct tga
1833 Trp Ser 610 26 610 PRT Artificial Sequence Synthetic 26 Met
Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10
15 Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30 Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu
Asp Pro 35 40 45 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe
Lys Pro Lys Thr 50 55 60 Ala Tyr Tyr Ile Pro Leu His His Arg Asn
Ala Gln Asn Leu Asp Glu 65 70 75 80 Thr Leu Val Leu Ser Lys Leu Lys
Glu Ile Leu Glu Asp Pro Ser Ser 85 90 95 Lys Ile Val Gly Gln Asn
Leu Lys Tyr Asp Tyr Lys Val Leu Met Val 100 105 110 Lys Gly Ile Ser
Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala 115 120 125 Tyr Leu
Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145
150 155 160 Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro
Val Asp 165 170 175 Lys Ala Ala Asn Tyr Ser Cys Glu Asn Ala Asp Ile
Thr Tyr Arg Leu 180 185 190 Tyr Lys Ile Leu Ser Met Lys Leu His Glu
Ala Glu Leu Glu Asn Val 195 200 205 Phe Tyr Arg Ile Glu Met Pro Leu
Val Asn Val Leu Ala Arg Met Glu 210 215 220 Leu Asn Gly Val Tyr Val
Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 225 230 235 240 Glu Tyr Gly
Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250 255 Ala
Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265
270 Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285 Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala
Asn Glu 290 295 300 His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys
Ile Gln Lys Leu 305 310 315 320 Lys Ser Thr Tyr Ile Asp Thr Leu Pro
Lys Leu Val Asn Pro Lys Thr 325 330 335 Gly Arg Ile His Ala Ser Phe
His Gln Thr Gly Thr Ala Thr Gly Arg 340 345 350 Leu Ser Ser Ser Asp
Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu 355 360 365 Glu Gly Lys
Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375 380 Trp
Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390
395 400 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly
Ile 405 410 415 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val
Lys Pro Glu 420 425 430 Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys
Met Val Asn Phe Ser 435 440 445 Ile Ile Tyr Gly Val Thr Pro Tyr Gly
Leu Ser Val Arg Leu Gly Ile 450 455 460 Pro Val Lys Glu Ala Glu Lys
Met Ile Ile Ser Tyr Phe Thr Leu Tyr 465 470 475 480 Pro Lys Val Arg
Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu 485 490 495 Lys Gly
Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515
520 525 Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu
Ala 530 535 540 Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met
Lys Ser Arg 545 550 555 560 Met Ile Ile Gln Val His Asp Glu Leu Val
Phe Glu Val Pro Asp Glu 565 570 575 Glu Lys Glu Glu Leu Val Asp Leu
Val Lys Asn Lys Met Thr Asn Val 580 585 590 Val Lys Leu Ser Val Pro
Leu Glu Val Asp Ile Ser Ile Gly Lys Ser 595 600 605 Trp Ser 610 27
21 DNA Artificial Sequence Synthetic 27 tttgccctgg cccttgaaac g 21
28 1833 DNA Artificial Sequence Synthetic 28 atg aag gaa ctt caa
ctg tac gaa gaa gca gaa ccc acc gga tac gaa 48 Met Lys Glu Leu Gln
Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 atc gtg aag
gat cat aag acc ttc gaa gat ctc atc gaa aag ctg aag 96 Ile Val Lys
Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30 gag
gtt cca tct ttt gcc ctg gcc ctt gaa acg tcc tcc ctt gac ccg 144 Glu
Val Pro Ser Phe Ala Leu Ala Leu Glu Thr Ser Ser Leu Asp Pro 35 40
45 ttc aac tgt gag ata gtc ggc atc tcc gtg tcg ttc aaa ccg aaa aca
192 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr
50 55 60 gct tat tac att cca ctt cat cac aga aac gcc cag aat ctt gat gaa 240 Ala Tyr
Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu 65 70 75 80
aca ctg gtg ctg tcg aag ttg aaa gag atc ctc gaa gac ccg tct tcg 288
Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser 85
90 95 aag att gtg ggt cag aac ctg aag tac gac tac aag gtt ctt atg
gta 336 Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val Leu Met
Val 100 105 110 aag ggt ata tcg cca gtt tat ccg cat ttt gac acg atg
ata gct gca 384 Lys Gly Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met
Ile Ala Ala 115 120 125 tat ttg ctg gag cca aac gag aaa aaa ttc aat
ctc gaa gat ctg tct 432 Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn
Leu Glu Asp Leu Ser 130 135 140 ttg aaa ttt ctc gga tac aaa atg acg
tct tat cag gaa ctg atg tcg 480 Leu Lys Phe Leu Gly Tyr Lys Met Thr
Ser Tyr Gln Glu Leu Met Ser 145 150 155 160 ttt tcc tca cca ctt ttt
ggt ttc agc ttt gcg gat gtt ccg gta gac 528 Phe Ser Ser Pro Leu Phe
Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165 170 175 aag gct gcg aac
tac tcc tgc gag gat gca gac atc act tat agg ctc 576 Lys Ala Ala Asn
Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 180 185 190 tac aag
ata ctc agc atg aag ctc cat gaa gcg gaa ctt gag aac gtc 624 Tyr Lys
Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val 195 200 205
ttc tac agg ata gag atg ccg ctt gtg aac gtt ctt gca cgc atg gaa 672
Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu 210
215 220 ttg aac ggg gtg tat gtg gac aca gaa ttc ctg aaa aag ctc tcg
gag 720 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser
Glu 225 230 235 240 gag tac ggc aaa aag ctc gag gaa ctg gcc gaa aaa
atc tac cag ata 768 Glu Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys
Ile Tyr Gln Ile 245 250 255 gca gga gag ccc ttc aac atc aat tct cca
aaa cag gtt tca aag atc 816 Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro
Lys Gln Val Ser Lys Ile 260 265 270 ctt ttt gag aag ctg gga ata aaa
ccc cgt gga aaa acg aca aaa aca 864 Leu Phe Glu Lys Leu Gly Ile Lys
Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285 gga gcg tac tct acc agg
ata gag gtg ttg gaa gag ata gcg aat gag 912 Gly Ala Tyr Ser Thr Arg
Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290 295 300 cac gag ata gta
ccc ctc att ctc gag tac aga aag atc cag aaa ctg 960 His Glu Ile Val
Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu 305 310 315 320 aaa
tcg acc tac ata gac acc ctt ccg aaa ctt gtg aac ccg aaa acc 1008
Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr 325
330 335 gga aga att cat gca tct ttc cac cag acg ggt acc gcc act ggc
agg 1056 Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala Thr
Gly Arg 340 345 350 ttg agt agc agt gat cca aat ctt cag aat ctt ccg
aca aag agc gaa 1104 Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu
Pro Thr Lys Ser Glu 355 360 365 gag gga aaa gaa att aga aaa gcg att
gtg ccc cag gat cca gac tgg 1152 Glu Gly Lys Glu Ile Arg Lys Ala
Ile Val Pro Gln Asp Pro Asp Trp 370 375 380 tgg atc gtc agt gcg gat
tat tcc caa ata gaa ctc aga atc ctc gct 1200 Trp Ile Val Ser Ala
Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390 395 400 cat ctc
agt ggt gat gag aac ctt gtg aag gcc ttc gag gag ggc atc 1248 His
Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile 405 410
415 gat gtg cac acc ttg act gcc tcc agg atc tac aac gta aag cca gaa
1296 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro
Glu 420 425 430 gaa gtg aac gaa gaa atg cga cgg gtt gga aag atg gtg
aac ttc tct 1344 Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met
Val Asn Phe Ser 435 440 445 ata ata tac ggt gtc aca ccg tac ggt ctt
tct gtg aga ctt gga ata 1392 Ile Ile Tyr Gly Val Thr Pro Tyr Gly
Leu Ser Val Arg Leu Gly Ile 450 455 460 ccg gtt aaa gaa gca gaa aag
atg att atc agc tat ttc aca ctg tat 1440 Pro Val Lys Glu Ala Glu
Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr 465 470 475 480 cca aag gtg
cga agc tac atc cag cag gtt gtt gca gag gca aaa gag 1488 Pro Lys
Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu 485 490 495
aag ggc tac gtc agg act ctc ttt gga aga aaa aga gat att ccc cag
1536 Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro
Gln 500 505 510 ctc atg gca agg gac aag aac acc cag tcc gaa ggc gaa
aga atc gca 1584 Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly
Glu Arg Ile Ala 515 520 525 ata aac acc ccc att cag gga acg gcg gca
gat ata ata aaa ttg gct 1632 Ile Asn Thr Pro Ile Gln Gly Thr Ala
Ala Asp Ile Ile Lys Leu Ala 530 535 540 atg ata gat ata gac gag gag
ctg aga aaa aga aac atg aaa tcc aga 1680 Met Ile Asp Ile Asp Glu
Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 545 550 555 560 atg atc att
cag gtt cat gac gaa ctg gtc ttc gag gtt ccc gat gag 1728 Met Ile
Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu 565 570 575
gaa aaa gaa gaa cta gtt gat ctg gtg aag aac aaa atg aca aat gtg
1776 Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr Asn
Val 580 585 590 gtg aaa ctc tct gtg cct ctt gag gtt gac ata agc atc
gga aaa agc 1824 Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser
Ile Gly Lys Ser 595 600 605 tgg tct tga 1833 Trp Ser 610 29 610 PRT
Artificial Sequence Synthetic 29 Met Lys Glu Leu Gln Leu Tyr Glu
Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 Ile Val Lys Asp His Lys
Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30 Glu Val Pro Ser
Phe Ala Leu Ala Leu Glu Thr Ser Ser Leu Asp Pro 35 40 45 Phe Asn
Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr 50 55 60
Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp Glu 65
70 75 80 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp Pro
Ser Ser 85 90 95 Lys Ile Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys
Val Leu Met Val 100 105 110 Lys Gly Ile Ser Pro Val Tyr Pro His Phe
Asp Thr Met Ile Ala Ala 115 120 125 Tyr Leu Leu Glu Pro Asn Glu Lys
Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 Leu Lys Phe Leu Gly Tyr
Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150 155 160 Phe Ser Ser
Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165 170 175 Lys
Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 180 185
190 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val
195 200 205 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala Arg
Met Glu 210 215 220 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu Lys
Lys Leu Ser Glu 225 230 235 240 Glu Tyr Gly Lys Lys Leu Glu Glu Leu
Ala Glu Lys Ile Tyr Gln Ile 245 250 255 Ala Gly Glu Pro Phe Asn Ile
Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270 Leu Phe Glu Lys Leu
Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285 Gly Ala Tyr
Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290 295 300 His
Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu 305 310
315 320 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys
Thr 325 330 335 Gly Arg Ile His Ala Ser Phe His Gln Thr Gly Thr Ala
Thr Gly Arg 340 345 350 Leu Ser Ser Ser Asp Pro Asn Leu Gln Asn Leu
Pro Thr Lys Ser Glu 355 360 365 Glu Gly Lys Glu Ile Arg Lys Ala Ile
Val Pro Gln Asp Pro Asp Trp 370 375 380 Trp Ile Val Ser Ala Asp Tyr
Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390 395 400 His Leu Ser Gly
Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile 405 410 415 Asp Val
His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu 420 425 430
Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 435
440 445 Ile Ile Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly
Ile 450 455 460 Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe
Thr Leu Tyr 465 470 475 480 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val
Val Ala Glu Ala Lys Glu 485 490 495 Lys Gly Tyr Val Arg Thr Leu Phe
Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 Leu Met Ala Arg Asp Lys
Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520 525 Ile Asn Thr Pro
Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 530 535 540 Met Ile
Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 545 550 555
560 Met Ile Ile Gln Val His Asp Glu Leu Val Phe Glu Val Pro Asp Glu
565 570 575 Glu Lys Glu Glu Leu Val Asp Leu Val Lys Asn Lys Met Thr
Asn Val 580 585 590 Val Lys Leu Ser Val Pro Leu Glu Val Asp Ile Ser
Ile Gly Lys Ser 595 600 605 Trp Ser 610 30 29 DNA Artificial
Sequence Synthetic 30 cctgaagtac gcgtacaagg ttcttatgg 29 31 22 DNA
Artificial Sequence Synthetic 31 tcacacagga aacagctatg ac 22 32
1833 DNA Artificial Sequence Synthetic 32 atg aag gaa ctt caa ctg
tac gaa gaa gca gaa ccc acc gga tac gaa 48 Met Lys Glu Leu Gln Leu
Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 atc gtg aag gat
cat aag acc ttc gaa gat ctc atc gaa aag ctg aag 96 Ile Val Lys Asp
His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30 gag gtt
cca tct ttt gcc ctg gac ctt gaa acg tcc tcc ctt gac ccg 144 Glu Val
Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro 35 40 45
ttc aac tgt gag ata gtc ggc atc tcc gtg tcg ttc aaa ccg aaa aca 192
Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys Thr 50
55 60 gct tat tac att cca ctt cat cac aga aac gcc cag aat ctt gat
gaa 240 Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn Leu Asp
Glu 65 70 75 80 aca ctg gtg ctg tcg aag ttg aaa gag atc ctc gaa gac
ccg tct tcg 288 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu Glu Asp
Pro Ser Ser 85 90 95 aag att gtg ggt cag aac ctg aag tac gcg tac
aag gtt ctt atg gta 336 Lys Ile Val Gly Gln Asn Leu Lys Tyr Ala Tyr
Lys Val Leu Met Val 100 105 110 aag ggt ata tcg cca gtt tat ccg cat
ttt gac acg atg ata gct gca 384 Lys Gly Ile Ser Pro Val Tyr Pro His
Phe Asp Thr Met Ile Ala Ala 115 120 125 tat ttg ctg gag cca aac gag
aaa aaa ttc aat ctc gaa gat ctg tct 432 Tyr Leu Leu Glu Pro Asn Glu
Lys Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 ttg aaa ttt ctc gga
tac aaa atg acg tct tat cag gaa ctg atg tcg 480 Leu Lys Phe Leu Gly
Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150 155 160 ttt tcc
tca cca ctt ttt ggt ttc agc ttt gcg gat gtt ccg gta gac 528 Phe Ser
Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165 170 175
aag gct gcg aac tac tcc tgc gag gat gca gac atc act tat agg ctc 576
Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu 180
185 190 tac aag ata ctc agc atg aag ctc cat gaa gcg gaa ctt gag aac
gtc 624 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn
Val 195 200 205 ttc tac agg ata gag atg ccg ctt gtg aac gtt ctt gca
cgc atg gaa 672 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val Leu Ala
Arg Met Glu 210 215 220 ttg aac ggg gtg tat gtg gac aca gaa ttc ctg
aaa aag ctc tcg gag 720 Leu Asn Gly Val Tyr Val Asp Thr Glu Phe Leu
Lys Lys Leu Ser Glu 225 230 235 240 gag tac ggc aaa aag ctc gag gaa
ctg gcc gaa aaa atc tac cag ata 768 Glu Tyr Gly Lys Lys Leu Glu Glu
Leu Ala Glu Lys Ile Tyr Gln Ile 245 250 255 gca gga gag ccc ttc aac
atc aat tct cca aaa cag gtt tca aag atc 816 Ala Gly Glu Pro Phe Asn
Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270 ctt ttt gag aag
ctg gga ata aaa ccc cgt gga aaa acg aca aaa aca 864 Leu Phe Glu Lys
Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285 gga gcg
tac tct acc agg ata gag gtg ttg gaa gag ata gcg aat gag 912 Gly Ala
Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290 295 300
cac gag ata gta ccc ctc att ctc gag tac aga aag atc cag aaa ctg 960
His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu 305
310 315 320 aaa tcg acc tac ata gac acc ctt ccg aaa ctt gtg aac ccg
aaa acc 1008 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu Val Asn
Pro Lys Thr 325 330 335 gga aga att cat gca tct ttc cac cag acg ggt
acc gcc act ggc agg 1056 Gly Arg Ile His Ala Ser Phe His Gln Thr
Gly Thr Ala Thr Gly Arg 340 345 350 ttg agt agc agt gat cca aat ctt
cag aat ctt ccg aca aag agc gaa 1104 Leu Ser Ser Ser Asp Pro Asn
Leu Gln Asn Leu Pro Thr Lys Ser Glu 355 360 365 gag gga aaa gaa att
aga aaa gcg att gtg ccc cag gat cca gac tgg 1152 Glu Gly Lys Glu
Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375 380 tgg atc
gtc agt gcg gat tat tcc caa ata gaa ctc aga atc ctc gct 1200 Trp
Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390
395 400 cat ctc agt ggt gat gag aac ctt gtg aag gcc ttc gag gag ggc
atc 1248 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu
Gly Ile 405 410 415 gat gtg cac acc ttg act gcc tcc agg atc tac aac
gta aag cca gaa 1296 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr
Asn Val Lys Pro Glu 420 425 430 gaa gtg aac gaa gaa atg cga cgg gtt
gga aag atg gtg aac ttc tct 1344 Glu Val Asn Glu Glu Met Arg Arg
Val Gly Lys Met Val Asn Phe Ser 435 440 445 ata ata tac ggt gtc aca
ccg tac ggt ctt tct gtg aga ctt gga ata 1392 Ile Ile Tyr Gly Val
Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile 450 455 460 ccg gtt aaa
gaa gca gaa aag atg att atc agc tat ttc aca ctg tat 1440 Pro Val
Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr 465 470 475
480 cca aag gtg cga agc tac atc cag cag gtt gtt gca gag gca aaa gag
1488 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys
Glu 485 490 495 aag ggc tac gtc agg act ctc ttt gga aga aaa aga gat
att ccc cag 1536 Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg
Asp Ile Pro Gln 500 505 510 ctc atg gca agg gac aag aac acc cag tcc
gaa ggc gaa aga atc gca 1584 Leu Met Ala Arg Asp Lys Asn Thr Gln
Ser Glu Gly Glu Arg Ile Ala 515 520 525 ata aac acc ccc att cag gga
acg gcg gca gat ata ata aaa ttg gct 1632 Ile Asn Thr Pro Ile Gln
Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 530 535 540 atg ata gat ata
gac gag gag ctg aga aaa aga aac atg aaa tcc aga 1680 Met Ile Asp
Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg 545 550 555 560
atg atc att cag gtt cat gac gaa ctg gtc ttc gag gtt ccc gat gag 1728
Met Ile Ile Gln Val His Asp Glu Leu
Val Phe Glu Val Pro Asp Glu 565 570 575 gaa aaa gaa gaa cta gtt gat
ctg gtg aag aac aaa atg aca aat gtg 1776 Glu Lys Glu Glu Leu Val
Asp Leu Val Lys Asn Lys Met Thr Asn Val 580 585 590 gtg aaa ctc tct
gtg cct ctt gag gtt gac ata agc atc gga aaa agc 1824 Val Lys Leu
Ser Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser 595 600 605 tgg
tct tga 1833 Trp Ser 610 33 610 PRT Artificial Sequence Synthetic
33 Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu
1 5 10 15 Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys
Leu Lys 20 25 30 Glu Val Pro Ser Phe Ala Leu Asp Leu Glu Thr Ser
Ser Leu Asp Pro 35 40 45 Phe Asn Cys Glu Ile Val Gly Ile Ser Val
Ser Phe Lys Pro Lys Thr 50 55 60 Ala Tyr Tyr Ile Pro Leu His His
Arg Asn Ala Gln Asn Leu Asp Glu 65 70 75 80 Thr Leu Val Leu Ser Lys
Leu Lys Glu Ile Leu Glu Asp Pro Ser Ser 85 90 95 Lys Ile Val Gly
Gln Asn Leu Lys Tyr Ala Tyr Lys Val Leu Met Val 100 105 110 Lys Gly
Ile Ser Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala 115 120 125
Tyr Leu Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130
135 140 Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met
Ser 145 150 155 160 Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp
Val Pro Val Asp 165 170 175 Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala
Asp Ile Thr Tyr Arg Leu 180 185 190 Tyr Lys Ile Leu Ser Met Lys Leu
His Glu Ala Glu Leu Glu Asn Val 195 200 205 Phe Tyr Arg Ile Glu Met
Pro Leu Val Asn Val Leu Ala Arg Met Glu 210 215 220 Leu Asn Gly Val
Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 225 230 235 240 Glu
Tyr Gly Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250
255 Ala Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile
260 265 270 Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr
Lys Thr 275 280 285 Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu
Ile Ala Asn Glu 290 295 300 His Glu Ile Val Pro Leu Ile Leu Glu Tyr
Arg Lys Ile Gln Lys Leu 305 310 315 320 Lys Ser Thr Tyr Ile Asp Thr
Leu Pro Lys Leu Val Asn Pro Lys Thr 325 330 335 Gly Arg Ile His Ala
Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg 340 345 350 Leu Ser Ser
Ser Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu 355 360 365 Glu
Gly Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375
380 Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala
385 390 395 400 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu
Glu Gly Ile 405 410 415 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr
Asn Val Lys Pro Glu 420 425 430 Glu Val Asn Glu Glu Met Arg Arg Val
Gly Lys Met Val Asn Phe Ser 435 440 445 Ile Ile Tyr Gly Val Thr Pro
Tyr Gly Leu Ser Val Arg Leu Gly Ile 450 455 460 Pro Val Lys Glu Ala
Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr 465 470 475 480 Pro Lys
Val Arg Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu 485 490 495
Lys Gly Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln 500
505 510 Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile
Ala 515 520 525 Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile
Lys Leu Ala 530 535 540 Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg
Asn Met Lys Ser Arg 545 550 555 560 Met Ile Ile Gln Val His Asp Glu
Leu Val Phe Glu Val Pro Asp Glu 565 570 575 Glu Lys Glu Glu Leu Val
Asp Leu Val Lys Asn Lys Met Thr Asn Val 580 585 590 Val Lys Leu Ser
Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser 595 600 605 Trp Ser
610 34 1833 DNA Artificial Sequence Synthetic 34 atg aag gaa ctt
caa ctg tac gaa gaa gca gaa ccc acc gga tac gaa 48 Met Lys Glu Leu
Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 atc gtg
aag gat cat aag acc ttc gaa gat ctc atc gaa aag ctg aag 96 Ile Val
Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys 20 25 30
gag gtt cca tct ttt gcc ctg gcc ctt gaa acg tcc tcc ctt gac ccg 144
Glu Val Pro Ser Phe Ala Leu Ala Leu Glu Thr Ser Ser Leu Asp Pro 35
40 45 ttc aac tgt gag ata gtc ggc atc tcc gtg tcg ttc aaa ccg aaa
aca 192 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe Lys Pro Lys
Thr 50 55 60 gct tat tac att cca ctt cat cac aga aac gcc cag aat
ctt gat gaa 240 Ala Tyr Tyr Ile Pro Leu His His Arg Asn Ala Gln Asn
Leu Asp Glu 65 70 75 80 aca ctg gtg ctg tcg aag ttg aaa gag atc ctc
gaa gac ccg tct tcg 288 Thr Leu Val Leu Ser Lys Leu Lys Glu Ile Leu
Glu Asp Pro Ser Ser 85 90 95 aag att gtg ggt cag aac ctg aag tac
gcg tac aag gtt ctt atg gta 336 Lys Ile Val Gly Gln Asn Leu Lys Tyr
Ala Tyr Lys Val Leu Met Val 100 105 110 aag ggt ata tcg cca gtt tat
ccg cat ttt gac acg atg ata gct gca 384 Lys Gly Ile Ser Pro Val Tyr
Pro His Phe Asp Thr Met Ile Ala Ala 115 120 125 tat ttg ctg gag cca
aac gag aaa aaa ttc aat ctc gaa gat ctg tct 432 Tyr Leu Leu Glu Pro
Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140 ttg aaa ttt
ctc gga tac aaa atg acg tct tat cag gaa ctg atg tcg 480 Leu Lys Phe
Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145 150 155 160
ttt tcc tca cca ctt ttt ggt ttc agc ttt gcg gat gtt ccg gta gac 528
Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp 165
170 175 aag gct gcg aac tac tcc tgc gag gat gca gac atc act tat agg
ctc 576 Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg
Leu 180 185 190 tac aag ata ctc agc atg aag ctc cat gaa gcg gaa ctt
gag aac gtc 624 Tyr Lys Ile Leu Ser Met Lys Leu His Glu Ala Glu Leu
Glu Asn Val 195 200 205 ttc tac agg ata gag atg ccg ctt gtg aac gtt
ctt gca cgc atg gaa 672 Phe Tyr Arg Ile Glu Met Pro Leu Val Asn Val
Leu Ala Arg Met Glu 210 215 220 ttg aac ggg gtg tat gtg gac aca gaa
ttc ctg aaa aag ctc tcg gag 720 Leu Asn Gly Val Tyr Val Asp Thr Glu
Phe Leu Lys Lys Leu Ser Glu 225 230 235 240 gag tac ggc aaa aag ctc
gag gaa ctg gcc gaa aaa atc tac cag ata 768 Glu Tyr Gly Lys Lys Leu
Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250 255 gca gga gag ccc
ttc aac atc aat tct cca aaa cag gtt tca aag atc 816 Ala Gly Glu Pro
Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265 270 ctt ttt
gag aag ctg gga ata aaa ccc cgt gga aaa acg aca aaa aca 864 Leu Phe
Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr 275 280 285
gga gcg tac tct acc agg ata gag gtg ttg gaa gag ata gcg aat gag 912
Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu 290
295 300 cac gag ata gta ccc ctc att ctc gag tac aga aag atc cag aaa
ctg 960 His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys
Leu 305 310 315 320 aaa tcg acc tac ata gac acc ctt ccg aaa ctt gtg
aac ccg aaa acc 1008 Lys Ser Thr Tyr Ile Asp Thr Leu Pro Lys Leu
Val Asn Pro Lys Thr 325 330 335 gga aga att cat gca tct ttc cac cag
acg ggt acc gcc act ggc agg 1056 Gly Arg Ile His Ala Ser Phe His
Gln Thr Gly Thr Ala Thr Gly Arg 340 345 350 ttg agt agc agt gat cca
aat ctt cag aat ctt ccg aca aag agc gaa 1104 Leu Ser Ser Ser Asp
Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu 355 360 365 gag gga aaa
gaa att aga aaa gcg att gtg ccc cag gat cca gac tgg 1152 Glu Gly
Lys Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375 380
tgg atc gtc agt gcg gat tat tcc caa ata gaa ctc aga atc ctc gct
1200 Trp Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu
Ala 385 390 395 400 cat ctc agt ggt gat gag aac ctt gtg aag gcc ttc
gag gag ggc atc 1248 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala
Phe Glu Glu Gly Ile 405 410 415 gat gtg cac acc ttg act gcc tcc agg
atc tac aac gta aag cca gaa 1296 Asp Val His Thr Leu Thr Ala Ser
Arg Ile Tyr Asn Val Lys Pro Glu 420 425 430 gaa gtg aac gaa gaa atg
cga cgg gtt gga aag atg gtg aac ttc tct 1344 Glu Val Asn Glu Glu
Met Arg Arg Val Gly Lys Met Val Asn Phe Ser 435 440 445 ata ata tac
ggt gtc aca ccg tac ggt ctt tct gtg aga ctt gga ata 1392 Ile Ile
Tyr Gly Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile 450 455 460
ccg gtt aaa gaa gca gaa aag atg att atc agc tat ttc aca ctg tat
1440 Pro Val Lys Glu Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu
Tyr 465 470 475 480 cca aag gtg cga agc tac atc cag cag gtt gtt gca
gag gca aaa gag 1488 Pro Lys Val Arg Ser Tyr Ile Gln Gln Val Val
Ala Glu Ala Lys Glu 485 490 495 aag ggc tac gtc agg act ctc ttt gga
aga aaa aga gat att ccc cag 1536 Lys Gly Tyr Val Arg Thr Leu Phe
Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510 ctc atg gca agg gac aag
aac acc cag tcc gaa ggc gaa aga atc gca 1584 Leu Met Ala Arg Asp
Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515 520 525 ata aac acc
ccc att cag gga acg gcg gca gat ata ata aaa ttg gct 1632 Ile Asn
Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala 530 535 540
atg ata gat ata gac gag gag ctg aga aaa aga aac atg aaa tcc aga
1680 Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser
Arg 545 550 555 560 atg atc att cag gtt cat gac gaa ctg gtc ttc gag
gtt ccc gat gag 1728 Met Ile Ile Gln Val His Asp Glu Leu Val Phe
Glu Val Pro Asp Glu 565 570 575 gaa aaa gaa gaa cta gtt gat ctg gtg
aag aac aaa atg aca aat gtg 1776 Glu Lys Glu Glu Leu Val Asp Leu
Val Lys Asn Lys Met Thr Asn Val 580 585 590 gtg aaa ctc tct gtg cct
ctt gag gtt gac ata agc atc gga aaa agc 1824 Val Lys Leu Ser Val
Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser 595 600 605 tgg tct tga
1833 Trp Ser 610 35 610 PRT Artificial Sequence Synthetic 35 Met
Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10
15 Ile Val Lys Asp His Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys
20 25 30 Glu Val Pro Ser Phe Ala Leu Ala Leu Glu Thr Ser Ser Leu
Asp Pro 35 40 45 Phe Asn Cys Glu Ile Val Gly Ile Ser Val Ser Phe
Lys Pro Lys Thr 50 55 60 Ala Tyr Tyr Ile Pro Leu His His Arg Asn
Ala Gln Asn Leu Asp Glu 65 70 75 80 Thr Leu Val Leu Ser Lys Leu Lys
Glu Ile Leu Glu Asp Pro Ser Ser 85 90 95 Lys Ile Val Gly Gln Asn
Leu Lys Tyr Ala Tyr Lys Val Leu Met Val 100 105 110 Lys Gly Ile Ser
Pro Val Tyr Pro His Phe Asp Thr Met Ile Ala Ala 115 120 125 Tyr Leu
Leu Glu Pro Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser 130 135 140
Leu Lys Phe Leu Gly Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser 145
150 155 160 Phe Ser Ser Pro Leu Phe Gly Phe Ser Phe Ala Asp Val Pro
Val Asp 165 170 175 Lys Ala Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile
Thr Tyr Arg Leu 180 185 190 Tyr Lys Ile Leu Ser Met Lys Leu His Glu
Ala Glu Leu Glu Asn Val 195 200 205 Phe Tyr Arg Ile Glu Met Pro Leu
Val Asn Val Leu Ala Arg Met Glu 210 215 220 Leu Asn Gly Val Tyr Val
Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu 225 230 235 240 Glu Tyr Gly
Lys Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile 245 250 255 Ala
Gly Glu Pro Phe Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile 260 265
270 Leu Phe Glu Lys Leu Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
275 280 285 Gly Ala Tyr Ser Thr Arg Ile Glu Val Leu Glu Glu Ile Ala
Asn Glu 290 295 300 His Glu Ile Val Pro Leu Ile Leu Glu Tyr Arg Lys
Ile Gln Lys Leu 305 310 315 320 Lys Ser Thr Tyr Ile Asp Thr Leu Pro
Lys Leu Val Asn Pro Lys Thr 325 330 335 Gly Arg Ile His Ala Ser Phe
His Gln Thr Gly Thr Ala Thr Gly Arg 340 345 350 Leu Ser Ser Ser Asp
Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu 355 360 365 Glu Gly Lys
Glu Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp 370 375 380 Trp
Ile Val Ser Ala Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala 385 390
395 400 His Leu Ser Gly Asp Glu Asn Leu Val Lys Ala Phe Glu Glu Gly
Ile 405 410 415 Asp Val His Thr Leu Thr Ala Ser Arg Ile Tyr Asn Val
Lys Pro Glu 420 425 430 Glu Val Asn Glu Glu Met Arg Arg Val Gly Lys
Met Val Asn Phe Ser 435 440 445 Ile Ile Tyr Gly Val Thr Pro Tyr Gly
Leu Ser Val Arg Leu Gly Ile 450 455 460 Pro Val Lys Glu Ala Glu Lys
Met Ile Ile Ser Tyr Phe Thr Leu Tyr 465 470 475 480 Pro Lys Val Arg
Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu 485 490 495 Lys Gly
Tyr Val Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln 500 505 510
Leu Met Ala Arg Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala 515
520 525 Ile Asn Thr Pro Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu
Ala 530 535 540 Met Ile Asp Ile Asp Glu Glu Leu Arg Lys Arg Asn Met
Lys Ser Arg 545 550 555 560 Met Ile Ile Gln Val His Asp Glu Leu Val
Phe Glu Val Pro Asp Glu 565 570 575 Glu Lys Glu Glu Leu Val Asp Leu
Val Lys Asn Lys Met Thr Asn Val 580 585 590 Val Lys Leu Ser Val Pro
Leu Glu Val Asp Ile Ser Ile Gly Lys Ser 595 600 605 Trp Ser 610 36
1716 DNA Artificial Sequence Synthetic 36 atg ctt gaa acg tcc tcc
ctt gac ccg ttc aac tgt gag ata gtc ggc 48 Met Leu Glu Thr Ser Ser
Leu Asp Pro Phe Asn Cys Glu Ile Val Gly 1 5 10 15 atc tcc gtg tcg
ttc aaa ccg aaa aca gct tat tac att cca ctt cat 96 Ile Ser Val Ser
Phe Lys Pro Lys Thr Ala Tyr Tyr Ile Pro Leu His 20 25 30 cac aga
aac gcc cag aat ctt gat gaa aca ctg gtg ctg tcg aag ttg 144 His Arg
Asn Ala Gln Asn Leu Asp Glu Thr Leu Val Leu Ser Lys Leu 35 40 45
aaa gag atc ctc gaa gac ccg tct tcg aag att gtg ggt cag aac ctg 192
Lys Glu Ile Leu Glu Asp Pro Ser Ser Lys Ile Val Gly Gln Asn Leu 50
55 60 aag tac gac tac aag gtt ctt atg gta aag ggt ata tcg cca gtt
tat 240 Lys Tyr Asp Tyr Lys Val Leu Met Val Lys Gly Ile Ser Pro Val
Tyr 65 70 75 80 ccg cat ttt gac acg atg ata gct gca tat ttg ctg
gag cca aac gag 288 Pro His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu
Glu Pro Asn Glu 85 90 95 aaa aaa ttc aat ctc gaa gat ctg tct ttg
aaa ttt ctc gga tac aaa 336 Lys Lys Phe Asn Leu Glu Asp Leu Ser Leu
Lys Phe Leu Gly Tyr Lys 100 105 110 atg acg tct tat cag gaa ctg atg
tcg ttt tcc tca cca ctt ttt ggt 384 Met Thr Ser Tyr Gln Glu Leu Met
Ser Phe Ser Ser Pro Leu Phe Gly 115 120 125 ttc agc ttt gcg gat gtt
ccg gta gac aag gct gcg aac tac tcc tgc 432 Phe Ser Phe Ala Asp Val
Pro Val Asp Lys Ala Ala Asn Tyr Ser Cys 130 135 140 gag gat gca gac
atc act tat agg ctc tac aag ata ctc agc atg aag 480 Glu Asp Ala Asp
Ile Thr Tyr Arg Leu Tyr Lys Ile Leu Ser Met Lys 145 150 155 160 ctc
cat gaa gcg gaa ctt gag aac gtc ttc tac agg ata gag atg ccg 528 Leu
His Glu Ala Glu Leu Glu Asn Val Phe Tyr Arg Ile Glu Met Pro 165 170
175 ctt gtg aac gtt ctt gca cgc atg gaa ttg aac ggg gtg tat gtg gac
576 Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr Val Asp
180 185 190 aca gaa ttc ctg aaa aag ctc tcg gag gag tac ggc aaa aag
ctc gag 624 Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys
Leu Glu 195 200 205 gaa ctg gcc gaa aaa atc tac cag ata gca gga gag
ccc ttc aac atc 672 Glu Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu
Pro Phe Asn Ile 210 215 220 aat tct cca aaa cag gtt tca aag atc ctt
ttt gag aag ctg gga ata 720 Asn Ser Pro Lys Gln Val Ser Lys Ile Leu
Phe Glu Lys Leu Gly Ile 225 230 235 240 aaa ccc cgt gga aaa acg aca
aaa aca gga gcg tac tct acc agg ata 768 Lys Pro Arg Gly Lys Thr Thr
Lys Thr Gly Ala Tyr Ser Thr Arg Ile 245 250 255 gag gtg ttg gaa gag
ata gcg aat gag cac gag ata gta ccc ctc att 816 Glu Val Leu Glu Glu
Ile Ala Asn Glu His Glu Ile Val Pro Leu Ile 260 265 270 ctc gag tac
aga aag atc cag aaa ctg aaa tcg acc tac ata gac acc 864 Leu Glu Tyr
Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr Ile Asp Thr 275 280 285 ctt
ccg aaa ctt gtg aac ccg aaa acc gga aga att cat gca tct ttc 912 Leu
Pro Lys Leu Val Asn Pro Lys Thr Gly Arg Ile His Ala Ser Phe 290 295
300 cac cag acg ggt acc gcc act ggc agg ttg agt agc agt gat cca aat
960 His Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp Pro Asn
305 310 315 320 ctt cag aat ctt ccg aca aag agc gaa gag gga aaa gaa
att aga aaa 1008 Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys
Glu Ile Arg Lys 325 330 335 gcg att gtg ccc cag gat cca gac tgg tgg
atc gtc agt gcg gat tat 1056 Ala Ile Val Pro Gln Asp Pro Asp Trp
Trp Ile Val Ser Ala Asp Tyr 340 345 350 tcc caa ata gaa ctc aga atc
ctc gct cat ctc agt ggt gat gag aac 1104 Ser Gln Ile Glu Leu Arg
Ile Leu Ala His Leu Ser Gly Asp Glu Asn 355 360 365 ctt gtg aag gcc
ttc gag gag ggc atc gat gtg cac acc ttg act gcc 1152 Leu Val Lys
Ala Phe Glu Glu Gly Ile Asp Val His Thr Leu Thr Ala 370 375 380 tcc
agg atc tac aac gta aag cca gaa gaa gtg aac gaa gaa atg cga 1200
Ser Arg Ile Tyr Asn Val Lys Pro Glu Glu Val Asn Glu Glu Met Arg 385
390 395 400 cgg gtt gga aag atg gtg aac ttc tct ata ata tac ggt gtc
aca ccg 1248 Arg Val Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly
Val Thr Pro 405 410 415 tac ggt ctt tct gtg aga ctt gga ata ccg gtt
aaa gaa gca gaa aag 1296 Tyr Gly Leu Ser Val Arg Leu Gly Ile Pro
Val Lys Glu Ala Glu Lys 420 425 430 atg att atc agc tat ttc aca ctg
tat cca aag gtg cga agc tac atc 1344 Met Ile Ile Ser Tyr Phe Thr
Leu Tyr Pro Lys Val Arg Ser Tyr Ile 435 440 445 cag cag gtt gtt gca
gag gca aaa gag aag ggc tac gtc agg act ctc 1392 Gln Gln Val Val
Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg Thr Leu 450 455 460 ttt gga
aga aaa aga gat att ccc cag ctc atg gca agg gac aag aac 1440 Phe
Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp Lys Asn 465 470
475 480 acc cag tcc gaa ggc gaa aga atc gca ata aac acc ccc att cag
gga 1488 Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile
Gln Gly 485 490 495 acg gcg gca gat ata ata aaa ttg gct atg ata gat
ata gac gag gag 1536 Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile
Asp Ile Asp Glu Glu 500 505 510 ctg aga aaa aga aac atg aaa tcc aga
atg atc att cag gtt cat gac 1584 Leu Arg Lys Arg Asn Met Lys Ser
Arg Met Ile Ile Gln Val His Asp 515 520 525 gaa ctg gtc ttc gag gtt
ccc gat gag gaa aaa gaa gaa cta gtt gat 1632 Glu Leu Val Phe Glu
Val Pro Asp Glu Glu Lys Glu Glu Leu Val Asp 530 535 540 ctg gtg aag
aac aaa atg aca aat gtg gtg aaa ctc tct gtg cct ctt 1680 Leu Val
Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser Val Pro Leu 545 550 555
560 gag gtt gac ata agc atc gga aaa agc tgg tct tga 1716 Glu Val
Asp Ile Ser Ile Gly Lys Ser Trp Ser 565 570 37 571 PRT Artificial
Sequence Synthetic 37 Met Leu Glu Thr Ser Ser Leu Asp Pro Phe Asn
Cys Glu Ile Val Gly 1 5 10 15 Ile Ser Val Ser Phe Lys Pro Lys Thr
Ala Tyr Tyr Ile Pro Leu His 20 25 30 His Arg Asn Ala Gln Asn Leu
Asp Glu Thr Leu Val Leu Ser Lys Leu 35 40 45 Lys Glu Ile Leu Glu
Asp Pro Ser Ser Lys Ile Val Gly Gln Asn Leu 50 55 60 Lys Tyr Asp
Tyr Lys Val Leu Met Val Lys Gly Ile Ser Pro Val Tyr 65 70 75 80 Pro
His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro Asn Glu 85 90
95 Lys Lys Phe Asn Leu Glu Asp Leu Ser Leu Lys Phe Leu Gly Tyr Lys
100 105 110 Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro Leu
Phe Gly 115 120 125 Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala
Asn Tyr Ser Cys 130 135 140 Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr
Lys Ile Leu Ser Met Lys 145 150 155 160 Leu His Glu Ala Glu Leu Glu
Asn Val Phe Tyr Arg Ile Glu Met Pro 165 170 175 Leu Val Asn Val Leu
Ala Arg Met Glu Leu Asn Gly Val Tyr Val Asp 180 185 190 Thr Glu Phe
Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys Leu Glu 195 200 205 Glu
Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro Phe Asn Ile 210 215
220 Asn Ser Pro Lys Gln Val Ser Lys Ile Leu Phe Glu Lys Leu Gly Ile
225 230 235 240 Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser
Thr Arg Ile 245 250 255 Glu Val Leu Glu Glu Ile Ala Asn Glu His Glu
Ile Val Pro Leu Ile 260 265 270 Leu Glu Tyr Arg Lys Ile Gln Lys Leu
Lys Ser Thr Tyr Ile Asp Thr 275 280 285 Leu Pro Lys Leu Val Asn Pro
Lys Thr Gly Arg Ile His Ala Ser Phe 290 295 300 His Gln Thr Gly Thr
Ala Thr Gly Arg Leu Ser Ser Ser Asp Pro Asn 305 310 315 320 Leu Gln
Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu Ile Arg Lys 325 330 335
Ala Ile Val Pro Gln Asp Pro Asp Trp Trp Ile Val Ser Ala Asp Tyr 340
345 350 Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly Asp Glu
Asn 355 360 365 Leu Val Lys Ala Phe Glu Glu Gly Ile Asp Val His Thr
Leu Thr Ala 370 375 380 Ser Arg Ile Tyr Asn Val Lys Pro Glu Glu Val
Asn Glu Glu Met Arg 385 390 395 400 Arg Val Gly Lys Met Val Asn Phe
Ser Ile Ile Tyr Gly Val Thr Pro 405 410 415 Tyr Gly Leu Ser Val Arg
Leu Gly Ile Pro Val Lys Glu Ala Glu Lys 420 425 430 Met Ile Ile Ser
Tyr Phe Thr Leu Tyr Pro Lys Val Arg Ser Tyr Ile 435 440 445 Gln Gln
Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg Thr Leu 450 455 460
Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp Lys Asn 465
470 475 480 Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile
Gln Gly 485 490 495 Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp
Ile Asp Glu Glu 500 505 510 Leu Arg Lys Arg Asn Met Lys Ser Arg Met
Ile Ile Gln Val His Asp 515 520 525 Glu Leu Val Phe Glu Val Pro Asp
Glu Glu Lys Glu Glu Leu Val Asp 530 535 540 Leu Val Lys Asn Lys Met
Thr Asn Val Val Lys Leu Ser Val Pro Leu 545 550 555 560 Glu Val Asp
Ile Ser Ile Gly Lys Ser Trp Ser 565 570 38 1485 DNA Artificial
Sequence Synthetic 38 atg aag gaa ctt caa ctg tac gaa gaa gca gaa
ccc acc gga tac gaa 48 Met Lys Glu Leu Gln Leu Tyr Glu Glu Ala Glu
Pro Thr Gly Tyr Glu 1 5 10 15 atc gtg aag gat cat aag acc ttc gaa
gat ctg tct ttg aaa ttt ctc 96 Ile Val Lys Asp His Lys Thr Phe Glu
Asp Leu Ser Leu Lys Phe Leu 20 25 30 gga tac aaa atg acg tct tat
cag gaa ctg atg tcg ttt tcc tca cca 144 Gly Tyr Lys Met Thr Ser Tyr
Gln Glu Leu Met Ser Phe Ser Ser Pro 35 40 45 ctt ttt ggt ttc agc
ttt gcg gat gtt ccg gta gac aag gct gcg aac 192 Leu Phe Gly Phe Ser
Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn 50 55 60 tac tcc tgc
gag gat gca gac atc act tat agg ctc tac aag ata ctc 240 Tyr Ser Cys
Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Ile Leu 65 70 75 80 agc
atg aag ctc cat gaa gcg gaa ctt gag aac gtc ttc tac agg ata 288 Ser
Met Lys Leu His Glu Ala Glu Leu Glu Asn Val Phe Tyr Arg Ile 85 90
95 gag atg ccg ctt gtg aac gtt ctt gca cgc atg gaa ttg aac ggg gtg
336 Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val
100 105 110 tat gtg gac aca gaa ttc ctg aaa aag ctc tcg gag gag tac
ggc aaa 384 Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr
Gly Lys 115 120 125 aag ctc gag gaa ctg gcc gaa aaa atc tac cag ata
gca gga gag ccc 432 Lys Leu Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile
Ala Gly Glu Pro 130 135 140 ttc aac atc aat tct cca aaa cag gtt tca
aag atc ctt ttt gag aag 480 Phe Asn Ile Asn Ser Pro Lys Gln Val Ser
Lys Ile Leu Phe Glu Lys 145 150 155 160 ctg gga ata aaa ccc cgt gga
aaa acg aca aaa aca gga gcg tac tct 528 Leu Gly Ile Lys Pro Arg Gly
Lys Thr Thr Lys Thr Gly Ala Tyr Ser 165 170 175 acc agg ata gag gtg
ttg gaa gag ata gcg aat gag cac gag ata gta 576 Thr Arg Ile Glu Val
Leu Glu Glu Ile Ala Asn Glu His Glu Ile Val 180 185 190 ccc ctc att
ctc gag tac aga aag atc cag aaa ctg aaa tcg acc tac 624 Pro Leu Ile
Leu Glu Tyr Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr 195 200 205 ata
gac acc ctt ccg aaa ctt gtg aac ccg aaa acc gga aga att cat 672 Ile
Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg Ile His 210 215
220 gca tct ttc cac cag acg ggt acc gcc act ggc agg ttg agt agc agt
720 Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser
225 230 235 240 gat cca aat ctt cag aat ctt ccg aca aag agc gaa gag
gga aaa gaa 768 Asp Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu
Gly Lys Glu 245 250 255 att aga aaa gcg att gtg ccc cag gat cca gac
tgg tgg atc gtc agt 816 Ile Arg Lys Ala Ile Val Pro Gln Asp Pro Asp
Trp Trp Ile Val Ser 260 265 270 gcg gat tat tcc caa ata gaa ctc aga
atc ctc gct cat ctc agt ggt 864 Ala Asp Tyr Ser Gln Ile Glu Leu Arg
Ile Leu Ala His Leu Ser Gly 275 280 285 gat gag aac ctt gtg aag gcc
ttc gag gag ggc atc gat gtg cac acc 912 Asp Glu Asn Leu Val Lys Ala
Phe Glu Glu Gly Ile Asp Val His Thr 290 295 300 ttg act gcc tcc agg
atc tac aac gta aag cca gaa gaa gtg aac gaa 960 Leu Thr Ala Ser Arg
Ile Tyr Asn Val Lys Pro Glu Glu Val Asn Glu 305 310 315 320 gaa atg
cga cgg gtt gga aag atg gtg aac ttc tct ata ata tac ggt 1008 Glu
Met Arg Arg Val Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly 325 330
335 gtc aca ccg tac ggt ctt tct gtg aga ctt gga ata ccg gtt aaa gaa
1056 Val Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile Pro Val Lys
Glu 340 345 350 gca gaa aag atg att atc agc tat ttc aca ctg tat cca
aag gtg cga 1104 Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr
Pro Lys Val Arg 355 360 365 agc tac atc cag cag gtt gtt gca gag gca
aaa gag aag ggc tac gtc 1152 Ser Tyr Ile Gln Gln Val Val Ala Glu
Ala Lys Glu Lys Gly Tyr Val 370 375 380 agg act ctc ttt gga aga aaa
aga gat att ccc cag ctc atg gca agg 1200 Arg Thr Leu Phe Gly Arg
Lys Arg Asp Ile Pro Gln Leu Met Ala Arg 385 390 395 400 gac aag aac
acc cag tcc gaa ggc gaa aga atc gca ata aac acc ccc 1248 Asp Lys
Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr Pro 405 410 415
att cag gga acg gcg gca gat ata ata aaa ttg gct atg ata gat ata
1296 Ile Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met Ile Asp
Ile 420 425 430 gac gag gag ctg aga aaa aga aac atg aaa tcc aga atg
atc att cag 1344 Asp Glu Glu Leu Arg Lys Arg Asn Met Lys Ser Arg
Met Ile Ile Gln 435 440 445 gtt cat gac gaa ctg gtc ttc gag gtt ccc
gat gag gaa aaa gaa gaa 1392 Val His Asp Glu Leu Val Phe Glu Val
Pro Asp Glu Glu Lys Glu Glu 450 455 460 cta gtt gat ctg gtg aag aac
aaa atg aca aat gtg gtg aaa ctc tct 1440 Leu Val Asp Leu Val Lys
Asn Lys Met Thr Asn Val Val Lys Leu Ser 465 470 475 480 gtg cct ctt
gag gtt gac ata agc atc gga aaa agc tgg tct tga 1485 Val Pro Leu
Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser 485 490 39 494 PRT
Artificial Sequence Synthetic 39 Met Lys Glu Leu Gln Leu Tyr Glu
Glu Ala Glu Pro Thr Gly Tyr Glu 1 5 10 15 Ile Val Lys Asp His Lys
Thr Phe Glu Asp Leu Ser Leu Lys Phe Leu 20 25 30 Gly Tyr Lys Met
Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser Pro 35 40 45 Leu Phe
Gly Phe Ser Phe Ala Asp Val Pro Val Asp Lys Ala Ala Asn 50 55 60
Tyr Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg Leu Tyr Lys Ile Leu 65
70 75 80 Ser Met Lys Leu His Glu Ala Glu Leu Glu Asn Val Phe Tyr
Arg Ile 85 90 95 Glu Met Pro Leu Val Asn Val Leu Ala Arg Met Glu
Leu Asn Gly Val 100 105 110 Tyr Val Asp Thr Glu Phe Leu Lys Lys Leu
Ser Glu Glu Tyr Gly Lys 115 120 125 Lys Leu Glu Glu Leu Ala Glu Lys
Ile Tyr Gln Ile Ala Gly Glu Pro 130 135 140 Phe Asn Ile Asn Ser Pro
Lys Gln Val Ser Lys Ile Leu Phe Glu Lys 145 150 155 160 Leu Gly Ile
Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala Tyr Ser 165 170 175 Thr
Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu His Glu Ile Val 180 185
190 Pro Leu Ile Leu Glu Tyr Arg Lys Ile Gln Lys Leu Lys Ser Thr Tyr
195 200 205 Ile Asp Thr Leu Pro Lys Leu Val Asn Pro Lys Thr Gly Arg
Ile His 210 215 220 Ala Ser Phe His Gln Thr Gly Thr Ala Thr Gly Arg
Leu Ser Ser Ser 225 230 235 240 Asp Pro Asn Leu Gln Asn Leu Pro Thr
Lys Ser Glu Glu Gly Lys Glu 245 250 255 Ile Arg Lys Ala Ile Val Pro
Gln Asp Pro Asp Trp Trp Ile Val Ser 260 265 270 Ala Asp Tyr Ser Gln
Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly 275 280 285 Asp Glu Asn
Leu Val Lys Ala Phe Glu
Glu Gly Ile Asp Val His Thr 290 295 300 Leu Thr Ala Ser Arg Ile Tyr
Asn Val Lys Pro Glu Glu Val Asn Glu 305 310 315 320 Glu Met Arg Arg
Val Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly 325 330 335 Val Thr
Pro Tyr Gly Leu Ser Val Arg Leu Gly Ile Pro Val Lys Glu 340 345 350
Ala Glu Lys Met Ile Ile Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg 355
360 365 Ser Tyr Ile Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr
Val 370 375 380 Arg Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu
Met Ala Arg 385 390 395 400 Asp Lys Asn Thr Gln Ser Glu Gly Glu Arg
Ile Ala Ile Asn Thr Pro 405 410 415 Ile Gln Gly Thr Ala Ala Asp Ile
Ile Lys Leu Ala Met Ile Asp Ile 420 425 430 Asp Glu Glu Leu Arg Lys
Arg Asn Met Lys Ser Arg Met Ile Ile Gln 435 440 445 Val His Asp Glu
Leu Val Phe Glu Val Pro Asp Glu Glu Lys Glu Glu 450 455 460 Leu Val
Asp Leu Val Lys Asn Lys Met Thr Asn Val Val Lys Leu Ser 465 470 475
480 Val Pro Leu Glu Val Asp Ile Ser Ile Gly Lys Ser Trp Ser 485 490
40 24 DNA Artificial Sequence Synthetic 40 cgccagggtt ttcccagtca cgac 24
41 74 DNA Artificial Sequence Synthetic 41 ataagcgcca ttgatgttcc tctctactcg aaagttagag aggacacacc cgatccctat 60 agtgagtcgt atta 74
42 25 DNA Artificial Sequence Synthetic 42 taatacgact cactataggg cgaat 25
43 25 DNA Artificial Sequence Synthetic 43 gaatcgtcgt atgcagtgaa aactc 25
44 25 DNA Artificial Sequence Synthetic 44 cttgattgac aaggatggat ggcta 25
45 33 DNA Artificial Sequence Synthetic 45 catggtttaa atcctgtgtg aaattgttat ccg 33
46 29 DNA Artificial Sequence Synthetic 46 cggataacaa tttcacacag gatttaaac 29
47 25 DNA Artificial Sequence Synthetic 47 taatacgact cactataggg cgaat 25
48 31 DNA Artificial Sequence Synthetic 48 catgccatgg catgcattta cgttgacacc a 31
49 34 DNA Artificial Sequence Synthetic 49 tcccccgggt tgcgctcact gcccgctttc cagt 34
50 34 DNA Artificial Sequence Synthetic 50 agcttatcga tggcactttt cggggaaatg tgcg 34
51 35 DNA Artificial Sequence Synthetic 51 agcttatcga taagcgatgc cgggagcaga caagc 35
52 17 PRT
Artificial Sequence Synthetic 52 Glu Thr Tyr Val Val Phe Asp Val
Glu Thr Thr Gly Leu Ser Ala Val 1 5 10 15 Tyr 53 17 PRT Artificial
Sequence Synthetic 53 Thr Arg Gln Ile Val Leu Asp Thr Glu Thr Thr
Gly Met Asn Gln Ile 1 5 10 15 Gly 54 18 PRT Artificial Sequence
Synthetic 54 Arg Val Ile Tyr Met Pro Phe Asp Asn Glu Arg Asp Met
Leu Met Glu 1 5 10 15 Tyr Ile 55 15 PRT Artificial Sequence
Synthetic 55 Met Ile Val Ser Asp Ile Glu Ala Asn Ala Leu Leu Glu
Ser Val 1 5 10 15 56 18 PRT Artificial Sequence Synthetic 56 Lys
Ala Pro Val Phe Ala Phe Asp Thr Glu Thr Asp Ser Leu Asp Asn 1 5 10
15 Ile Ser 57 18 PRT Artificial Sequence Synthetic 57 Glu Ser Pro
Ser Phe Ala Ile Asp Leu Glu Thr Ser Ser Leu Asp Pro 1 5 10 15 Phe
Asp 58 18 PRT Artificial Sequence Synthetic 58 Glu Val Pro Ser Phe
Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro 1 5 10 15 Phe Asn 59 14
PRT Artificial Sequence Synthetic 59 Leu Val Ala His Asn Ala Ser
Phe Asp Met Gly Phe Leu Asn 1 5 10 60 14 PRT Artificial Sequence
Synthetic 60 Leu Val Ile His Asn Ala Ala Phe Asp Ile Gly Phe Met
Asp 1 5 10 61 15 PRT Artificial Sequence Synthetic 61 Phe Thr Gly
Trp Asn Ile Glu Gly Phe Asp Val Pro Tyr Ile Met 1 5 10 15 62 14 PRT
Artificial Sequence Synthetic 62 Val Phe His Asn Gly His Lys Tyr
Asp Val Pro Ala Leu Thr 1 5 10 63 13 PRT Artificial Sequence
Synthetic 63 Val Gly Gln Asn Leu Lys Tyr Asp Arg Gly Ile Leu Ala 1
5 10 64 13 PRT Artificial Sequence Synthetic 64 Val Gly Gln Asn Leu
Lys Phe Asp Tyr Lys Val Leu Met 1 5 10 65 13 PRT Artificial
Sequence Synthetic 65 Val Gly Gln Asn Leu Lys Tyr Asp Tyr Lys Val
Leu Met 1 5 10 66 13 PRT Artificial Sequence Synthetic 66 Thr Leu
Cys Lys Lys Phe Asp Ile Glu Leu Thr Gln His 1 5 10 67 13 PRT
Artificial Sequence Synthetic 67 Ala Leu Cys Ala Arg Tyr Glu Ile
Asp Asn Ser Lys Arg 1 5 10 68 12 PRT Artificial Sequence Synthetic
68 Asp Lys Ile Arg Gly Phe Ile Asp Leu Val Leu Ser 1 5 10 69 14 PRT
Artificial Sequence Synthetic 69 Glu Glu Met Met Asp Tyr Asn Val
Gln Asp Val Val Val Thr 1 5 10 70 14 PRT Artificial Sequence
Synthetic 70 Glu Glu Ala Gly Arg Tyr Ala Ala Glu Asp Ala Asp Val
Thr 1 5 10 71 14 PRT Artificial Sequence Synthetic 71 Glu Lys Ala
Ala Asn Tyr Ser Cys Glu Asp Ala Asp Ile Thr 1 5 10 72 14 PRT
Artificial Sequence Synthetic 72 Asp Lys Ala Ala Asn Tyr Ser Cys
Glu Asp Ala Asp Ile Thr 1 5 10 73 928 PRT Escherichia coli 73 Met
Val Gln Ile Pro Gln Asn Pro Leu Ile Leu Val Asp Gly Ser Ser 1 5 10
15 Tyr Leu Tyr Arg Ala Tyr His Ala Phe Pro Pro Leu Thr Asn Ser Ala
20 25 30 Gly Glu Pro Thr Gly Ala Met Tyr Gly Val Leu Asn Met Leu
Arg Ser 35 40 45 Leu Ile Met Gln Tyr Lys Pro Thr His Ala Ala Val
Val Phe Asp Ala 50 55 60 Lys Gly Lys Thr Phe Arg Asp Glu Leu Phe
Glu His Tyr Lys Ser His 65 70 75 80 Arg Pro Pro Met Pro Asp Asp Leu
Arg Ala Gln Ile Glu Pro Leu His 85 90 95 Ala Met Val Lys Ala Met
Gly Leu Pro Leu Leu Ala Val Ser Gly Val 100 105 110 Glu Ala Asp Asp
Val Ile Gly Thr Leu Ala Arg Glu Ala Glu Lys Ala 115 120 125 Gly Arg
Pro Val Leu Ile Ser Thr Gly Asp Lys Asp Met Ala Gln Leu 130 135 140
Val Thr Pro Asn Ile Thr Leu Ile Asn Thr Met Thr Asn Thr Ile Leu 145
150 155 160 Gly Pro Glu Glu Val Val Asn Lys Tyr Gly Val Pro Pro Glu
Leu Ile 165 170 175 Ile Asp Phe Leu Ala Leu Met Gly Asp Ser Ser Asp
Asn Ile Pro Gly 180 185 190 Val Pro Gly Val Gly Glu Lys Thr Ala Gln
Ala Leu Leu Gln Gly Leu 195 200 205 Gly Gly Leu Asp Thr Leu Tyr Ala
Glu Pro Glu Lys Ile Ala Gly Leu 210 215 220 Ser Phe Arg Gly Ala Lys
Thr Met Ala Ala Lys Leu Glu Gln Asn Lys 225 230 235 240 Glu Val Ala
Tyr Leu Ser Tyr Gln Leu Ala Thr Ile Lys Thr Asp Val 245 250 255 Glu
Leu Glu Leu Thr Cys Glu Gln Leu Glu Val Gln Gln Pro Ala Ala 260 265
270 Glu Glu Leu Leu Gly Leu Phe Lys Lys Tyr Glu Phe Lys Arg Trp Thr
275 280 285 Ala Asp Val Glu Ala Gly Lys Trp Leu Gln Ala Lys Gly Ala
Lys Pro 290 295 300 Ala Ala Lys Pro Gln Glu Thr Ser Val Ala Asp Glu
Ala Pro Glu Val 305 310 315 320 Thr Ala Thr Val Ile Ser Tyr Asp Asn
Tyr Val Thr Ile Leu Asp Glu 325 330 335 Glu Thr Leu Lys Ala Trp Ile
Ala Lys Leu Glu Lys Ala Pro Val Phe 340 345 350 Ala Phe Asp Thr Glu
Thr Asp Ser Leu Asp Asn Ile Ser Ala Asn Leu 355 360 365 Val Gly Leu
Ser Phe Ala Ile Glu Pro Gly Val Ala Ala Tyr Ile Pro 370 375 380 Val
Ala His Asp Tyr Leu Asp Ala Pro Asp Gln Ile Ser Arg Glu Arg 385 390
395 400 Ala Leu Glu Leu Leu Lys Pro Leu Leu Glu Asp Glu Lys Ala Leu
Lys 405 410 415 Val Gly Gln Asn Leu Lys Tyr Asp Arg Gly Ile Leu Ala
Asn Tyr Gly 420 425 430 Ile Glu Leu Arg Gly Ile Ala Phe Asp Thr Met
Leu Glu Ser Tyr Ile 435 440 445 Leu Asn Ser Val Ala Gly Arg His Asp
Met Asp Ser Leu Ala Glu Arg 450 455 460 Trp Leu Lys His Lys Thr Ile
Thr Phe Glu Glu Ile Ala Gly Lys Gly 465 470 475 480 Lys Asn Gln Leu
Thr Phe Asn Gln Ile Ala Leu Glu Glu Ala Gly Arg 485 490 495 Tyr Ala
Ala Glu Asp Ala Asp Val Thr Leu Gln Leu His Leu Lys Met 500 505 510
Trp Pro Asp Leu Gln Lys His Lys Gly Pro Leu Asn Val Phe Glu Asn 515
520 525 Ile Glu Met Pro Leu Val Pro Val Leu Ser Arg Ile Glu Arg Asn
Gly 530 535 540 Val Lys Ile Asp Pro Lys Val Leu His Asn His Ser Glu
Glu Leu Thr 545 550 555 560 Leu Arg Leu Ala Glu Leu Glu Lys Lys Ala
His Glu Ile Ala Gly Glu 565 570 575 Glu Phe Asn Leu Ser Ser Thr Lys
Gln Leu Gln Thr Ile Leu Phe Glu 580 585 590 Lys Gln Gly Ile Lys Pro
Leu Lys Lys Thr Pro Gly Gly Ala Pro Ser 595 600 605 Thr Ser Glu Glu
Val Leu Glu Glu Leu Ala Leu Asp Tyr Pro Leu Pro 610 615 620 Lys Val
Ile Leu Glu Tyr Arg Gly Leu Ala Lys Leu Lys Ser Thr Tyr 625 630 635
640 Thr Asp Lys Leu Pro Leu Met Ile Asn Pro Lys Thr Gly Arg Val His
645 650 655 Thr Ser Tyr His Gln Ala Val Thr Ala Thr Gly Arg Leu Ser
Ser Thr 660 665 670 Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Asn Glu
Glu Gly Arg Arg 675 680 685 Ile Arg Gln Ala Phe Ile Ala Pro Glu Asp
Tyr Val Ile Val Ser Ala 690 695 700 Asp Tyr Ser Gln Ile Glu Leu Arg
Ile Met Ala His Leu Ser Arg Asp 705 710 715 720 Lys Gly Leu Leu Thr
Ala Phe Ala Glu Gly Lys Asp Ile His Arg Ala 725 730 735 Thr Ala Ala
Glu Val Phe Gly Leu Pro Leu Glu Thr Val Thr Ser Glu 740 745 750 Gln
Arg Arg Ser Ala Lys Ala Ile Asn Phe Gly Leu Ile Tyr Gly Met 755 760
765 Ser Ala Phe Gly Leu Ala Arg Gln Leu Asn Ile Pro Arg Lys Glu Ala
770 775 780 Gln Lys Tyr Met Asp Leu Tyr Phe Glu Arg Tyr Pro Gly Val
Leu Glu 785 790 795 800 Tyr Met Glu Arg Thr Arg Ala Gln Ala Lys Glu
Gln Gly Tyr Val Glu 805 810 815 Thr Leu Asp Gly Arg Arg Leu Tyr Leu
Pro Asp Ile Lys Ser Ser Asn 820 825 830 Gly Ala Arg Arg Ala Ala Ala
Glu Arg Ala Ala Ile Asn Ala Pro Met 835 840 845 Gln Gly Thr Ala Ala
Asp Ile Ile Lys Arg Ala Met Ile Ala Val Asp 850 855 860 Ala Trp Leu
Gln Ala Glu Gln Pro Arg Val Arg Met Ile Met Gln Val 865 870 875 880
His Asp Glu Leu Val Phe Glu Val His Lys Asp Asp Val Asp Ala Val 885
890 895 Ala Lys Gln Ile His Gln Leu Met Glu Asn Cys Thr Arg Leu Asp
Val 900 905 910 Pro Leu Leu Val Glu Val Gly Ser Gly Glu Asn Trp Asp
Gln Ala His 915 920 925 74 893 PRT Thermotoga maritima 74 Met Ala
Arg Leu Phe Leu Phe Asp Gly Thr Ala Leu Ala Tyr Arg Ala 1 5 10 15
Tyr Tyr Ala Leu Asp Arg Ser Leu Ser Thr Ser Thr Gly Ile Pro Thr 20
25 30 Asn Ala Thr Tyr Gly Val Ala Arg Met Leu Val Arg Phe Ile Lys
Asp 35 40 45 His Ile Ile Val Gly Lys Asp Tyr Val Ala Val Ala Phe
Asp Lys Lys 50 55 60 Ala Ala Thr Phe Arg His Lys Leu Leu Glu Thr
Tyr Lys Ala Gln Arg 65 70 75 80 Pro Lys Thr Pro Asp Leu Leu Ile Gln
Gln Leu Pro Tyr Ile Lys Lys 85 90 95 Leu Val Glu Ala Leu Gly Met
Lys Val Leu Glu Val Glu Gly Tyr Glu 100 105 110 Ala Asp Asp Ile Ile
Ala Thr Leu Ala Val Lys Gly Leu Pro Leu Phe 115 120 125 Asp Glu Ile
Phe Ile Val Thr Gly Asp Lys Asp Met Leu Gln Leu Val 130 135 140 Asn
Glu Lys Ile Lys Val Trp Arg Ile Val Lys Gly Ile Ser Asp Leu 145 150
155 160 Glu Leu Tyr Asp Ala Gln Lys Val Lys Glu Lys Tyr Gly Val Glu
Pro 165 170 175 Gln Gln Ile Pro Asp Leu Leu Ala Leu Thr Gly Asp Glu
Ile Asp Asn 180 185 190 Ile Pro Gly Val Thr Gly Ile Gly Glu Lys Thr
Ala Val Gln Leu Leu 195 200 205 Glu Lys Tyr Lys Asp Leu Glu Asp Ile
Leu Asn His Val Arg Glu Leu 210 215 220 Pro Gln Lys Val Arg Lys Ala
Leu Leu Arg Asp Arg Glu Asn Ala Ile 225 230 235 240 Leu Ser Lys Lys
Leu Ala Ile Leu Glu Thr Asn Val Pro Ile Glu Ile 245 250 255 Asn Trp
Glu Glu Leu Arg Tyr Gln Gly Tyr Asp Arg Glu Lys Leu Leu 260 265 270
Pro Leu Leu Lys Glu Leu Glu Phe Ala Ser Ile Met Lys Glu Leu Gln 275
280 285 Leu Tyr Glu Glu Ser Glu Pro Val Gly Tyr Arg Ile Val Lys Asp
Leu 290 295 300 Val Glu Phe Glu Lys Leu Ile Glu Lys Leu Arg Glu Ser
Pro Ser Phe 305 310 315 320 Ala Ile Asp Leu Glu Thr Ser Ser Leu Asp
Pro Phe Asp Cys Asp Ile 325 330 335 Val Gly Ile Ser Val Ser Phe Lys
Pro Lys Glu Ala Tyr Tyr Ile Pro 340 345 350 Leu His His Arg Asn Ala
Gln Asn Leu Asp Glu Lys Glu Val Leu Lys 355 360 365 Lys Leu Lys Glu
Ile Leu Glu Asp Pro Gly Ala Lys Ile Val Gly Gln 370 375 380 Asn Leu
Lys Phe Asp Tyr Lys Val Leu Met Val Lys Gly Val Glu Pro 385 390 395
400 Val Pro Pro Tyr Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro
405 410 415 Asn Glu Lys Lys Phe Asn Leu Asp Asp Leu Ala Leu Lys Phe
Leu Gly 420 425 430 Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser Phe
Ser Phe Pro Leu 435 440 445 Phe Gly Phe Ser Phe Ala Asp Val Pro Val
Glu Lys Ala Ala Asn Tyr 450 455 460 Ser Cys Glu Asp Ala Asp Ile Thr
Tyr Arg Leu Tyr Lys Thr Leu Ser 465 470 475 480 Leu Lys Leu His Glu
Ala Asp Leu Glu Asn Val Phe Tyr Lys Ile Glu 485 490 495 Met Pro Leu
Val Asn Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr 500 505 510 Val
Asp Thr Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys 515 520
525 Leu Glu Glu Leu Ala Glu Glu Ile Tyr Arg Ile Ala Gly Glu Pro Phe
530 535 540 Asn Ile Asn Ser Pro Lys Gln Val Ser Arg Ile Leu Phe Glu
Lys Leu 545 550 555 560 Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr
Gly Asp Tyr Ser Thr 565 570 575 Arg Ile Glu Val Leu Glu Glu Leu Ala
Gly Glu His Glu Ile Ile Pro 580 585 590 Leu Ile Leu Glu Tyr Arg Lys
Ile Gln Lys Leu Lys Ser Thr Tyr Ile 595 600 605 Asp Ala Leu Pro Lys
Met Val Asn Pro Lys Thr Gly Arg Ile His Ala 610 615 620 Ser Phe Asn
Gln Thr Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 625 630 635 640
Pro Asn Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu Ile 645
650 655 Arg Lys Ala Ile Val Pro Gln Asp Pro Asn Trp Trp Ile Val Ser
Ala 660 665 670 Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu
Ser Gly Asp 675 680 685 Glu Asn Leu Leu Arg
Ala Phe Glu Glu Gly Ile Asp Val His Thr Leu 690 695 700 Thr Ala Ser
Arg Ile Phe Asn Val Lys Pro Glu Glu Val Thr Glu Glu 705 710 715 720
Met Arg Arg Ala Gly Lys Met Val Asn Phe Ser Ile Ile Tyr Gly Val 725
730 735 Thr Pro Tyr Gly Leu Ser Val Arg Leu Gly Val Pro Val Lys Glu
Ala 740 745 750 Glu Lys Met Ile Val Asn Tyr Phe Val Leu Tyr Pro Lys
Val Arg Asp 755 760 765 Tyr Ile Gln Arg Val Val Ser Glu Ala Lys Glu
Lys Gly Tyr Val Arg 770 775 780 Thr Leu Phe Gly Arg Lys Arg Asp Ile
Pro Gln Leu Met Ala Arg Asp 785 790 795 800 Arg Asn Thr Gln Ala Glu
Gly Glu Arg Ile Ala Ile Asn Thr Pro Ile 805 810 815 Gln Gly Thr Ala
Ala Asp Ile Ile Lys Leu Ala Met Ile Glu Ile Asp 820 825 830 Arg Glu
Leu Lys Glu Arg Lys Met Arg Ser Lys Met Ile Ile Gln Val 835 840 845
His Asp Glu Leu Val Phe Glu Val Pro Asn Glu Glu Lys Asp Ala Leu 850
855 860 Val Glu Leu Val Lys Asp Arg Met Thr Asn Val Val Lys Leu Ser
Val 865 870 875 880 Pro Leu Glu Val Asp Val Thr Ile Gly Lys Thr Trp
Ser 885 890 75 893 PRT Thermotoga neapolitana 75 Met Ala Arg Leu
Phe Leu Phe Asp Gly Thr Ala Leu Ala Tyr Arg Ala 1 5 10 15 Tyr Tyr
Ala Leu Asp Arg Ser Leu Ser Thr Ser Thr Gly Ile Pro Thr 20 25 30
Asn Ala Val Tyr Gly Val Ala Arg Met Leu Val Lys Phe Ile Lys Glu 35
40 45 His Ile Ile Pro Glu Lys Asp Tyr Ala Ala Val Ala Phe Asp Lys
Lys 50 55 60 Ala Ala Thr Phe Arg His Lys Leu Leu Glu Ala Tyr Lys
Ala Gln Arg 65 70 75 80 Pro Lys Thr Pro Asp Leu Leu Val Gln Gln Leu
Pro Tyr Ile Lys Arg 85 90 95 Leu Ile Glu Ala Leu Gly Phe Lys Val
Leu Glu Leu Glu Gly Tyr Glu 100 105 110 Ala Asp Asp Ile Ile Ala Thr
Leu Ala Val Lys Gly Cys Thr Phe Phe 115 120 125 Asp Glu Ile Phe Ile
Ile Thr Gly Asp Lys Asp Met Leu Gln Leu Val 130 135 140 Asn Glu Lys
Ile Lys Val Trp Arg Ile Val Lys Gly Ile Ser Asp Leu 145 150 155 160
Glu Leu Tyr Asp Ser Lys Lys Val Lys Glu Arg Tyr Gly Val Glu Pro 165
170 175 His Gln Ile Pro Asp Leu Leu Ala Leu Thr Gly Asp Glu Ile Asp
Asn 180 185 190 Ile Pro Gly Val Thr Gly Ile Gly Glu Lys Thr Ala Val
Gln Leu Leu 195 200 205 Gly Lys Tyr Arg Asn Leu Glu Asp Ile Leu Glu
His Ala Arg Glu Leu 210 215 220 Pro Gln Arg Val Arg Lys Ala Leu Leu
Arg Asp Arg Glu Val Ala Ile 225 230 235 240 Leu Ser Lys Lys Leu Ala
Thr Leu Val Thr Asn Ala Pro Val Glu Val 245 250 255 Asp Trp Glu Glu
Met Lys Tyr Arg Gly Tyr Asp Lys Arg Lys Leu Leu 260 265 270 Pro Ile
Leu Lys Glu Leu Glu Phe Ala Ser Ile Met Lys Glu Leu Gln 275 280 285
Leu Tyr Glu Glu Ala Glu Pro Thr Gly Tyr Glu Ile Val Lys Asp His 290
295 300 Lys Thr Phe Glu Asp Leu Ile Glu Lys Leu Lys Glu Val Pro Ser
Phe 305 310 315 320 Ala Leu Asp Leu Glu Thr Ser Ser Leu Asp Pro Phe
Asn Cys Glu Ile 325 330 335 Val Gly Ile Ser Val Ser Phe Lys Pro Lys
Thr Ala Tyr Tyr Ile Pro 340 345 350 Leu His His Arg Asn Ala Gln Asn
Leu Asp Glu Thr Leu Val Leu Ser 355 360 365 Lys Leu Lys Glu Ile Leu
Glu Asp Pro Ser Ser Lys Ile Val Gly Gln 370 375 380 Asn Leu Lys Tyr
Asp Tyr Lys Val Leu Met Val Lys Gly Ile Ser Pro 385 390 395 400 Val
Tyr Pro His Phe Asp Thr Met Ile Ala Ala Tyr Leu Leu Glu Pro 405 410
415 Asn Glu Lys Lys Phe Asn Leu Glu Asp Leu Ser Leu Lys Phe Leu Gly
420 425 430 Tyr Lys Met Thr Ser Tyr Gln Glu Leu Met Ser Phe Ser Ser
Pro Leu 435 440 445 Phe Gly Phe Ser Phe Ala Asp Val Pro Val Asp Lys
Ala Ala Asn Tyr 450 455 460 Ser Cys Glu Asp Ala Asp Ile Thr Tyr Arg
Leu Tyr Lys Ile Leu Ser 465 470 475 480 Met Lys Leu His Glu Ala Glu
Leu Glu Asn Val Phe Tyr Arg Ile Glu 485 490 495 Met Pro Leu Val Asn
Val Leu Ala Arg Met Glu Leu Asn Gly Val Tyr 500 505 510 Val Asp Thr
Glu Phe Leu Lys Lys Leu Ser Glu Glu Tyr Gly Lys Lys 515 520 525 Leu
Glu Glu Leu Ala Glu Lys Ile Tyr Gln Ile Ala Gly Glu Pro Phe 530 535
540 Asn Ile Asn Ser Pro Lys Gln Val Ser Lys Ile Leu Phe Glu Lys Leu
545 550 555 560 Gly Ile Lys Pro Arg Gly Lys Thr Thr Lys Thr Gly Ala
Tyr Ser Thr 565 570 575 Arg Ile Glu Val Leu Glu Glu Ile Ala Asn Glu
His Glu Ile Val Pro 580 585 590 Leu Ile Leu Glu Tyr Arg Lys Ile Gln
Lys Leu Lys Ser Thr Tyr Ile 595 600 605 Asp Thr Leu Pro Lys Leu Val
Asn Pro Lys Thr Gly Arg Ile His Ala 610 615 620 Ser Phe His Gln Thr
Gly Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 625 630 635 640 Pro Asn
Leu Gln Asn Leu Pro Thr Lys Ser Glu Glu Gly Lys Glu Ile 645 650 655
Arg Lys Ala Ile Val Pro Gln Asp Pro Asp Trp Trp Ile Val Ser Ala 660
665 670 Asp Tyr Ser Gln Ile Glu Leu Arg Ile Leu Ala His Leu Ser Gly
Asp 675 680 685 Glu Asn Leu Val Lys Ala Phe Glu Glu Gly Ile Asp Val
His Thr Leu 690 695 700 Thr Ala Ser Arg Ile Tyr Asn Val Lys Pro Glu
Glu Val Asn Glu Glu 705 710 715 720 Met Arg Arg Val Gly Lys Met Val
Asn Phe Ser Ile Ile Tyr Gly Val 725 730 735 Thr Pro Tyr Gly Leu Ser
Val Arg Leu Gly Ile Pro Val Lys Glu Ala 740 745 750 Glu Lys Met Ile
Ile Ser Tyr Phe Thr Leu Tyr Pro Lys Val Arg Ser 755 760 765 Tyr Ile
Gln Gln Val Val Ala Glu Ala Lys Glu Lys Gly Tyr Val Arg 770 775 780
Thr Leu Phe Gly Arg Lys Arg Asp Ile Pro Gln Leu Met Ala Arg Asp 785
790 795 800 Lys Asn Thr Gln Ser Glu Gly Glu Arg Ile Ala Ile Asn Thr
Pro Ile 805 810 815 Gln Gly Thr Ala Ala Asp Ile Ile Lys Leu Ala Met
Ile Asp Ile Asp 820 825 830 Glu Glu Leu Arg Lys Arg Asn Met Lys Ser
Arg Met Ile Ile Gln Val 835 840 845 His Asp Glu Leu Val Phe Glu Val
Pro Asp Glu Glu Lys Glu Glu Leu 850 855 860 Val Asp Leu Val Lys Asn
Lys Met Thr Asn Val Val Lys Leu Ser Val 865 870 875 880 Pro Leu Glu
Val Asp Ile Ser Ile Gly Lys Ser Trp Ser 885 890
* * * * *