U.S. patent application number 14/106183 was filed with the patent office on 2014-07-10 for thermus brockianus nucleic acid polymerases.
This patent application is currently assigned to APPLIED BIOSYSTEMS, LLC. The applicant listed for this patent is APPLIED BIOSYSTEMS, LLC. Invention is credited to Elena BOLCHAKOVA, James ROZZELLE.
Application Number | 20140193877 14/106183 |
Document ID | / |
Family ID | 23307202 |
Filed Date | 2014-07-10 |
United States Patent
Application |
20140193877 |
Kind Code |
A1 |
BOLCHAKOVA; Elena ; et
al. |
July 10, 2014 |
THERMUS BROCKIANUS NUCLEIC ACID POLYMERASES
Abstract
The invention provides nucleic acids and polypeptides for
nucleic acid polymerases from a thermophilic organism, Thermus
brockianus. The invention also provides methods for using these
nucleic acids and polypeptides.
Inventors: |
BOLCHAKOVA; Elena; (Union
City, CA) ; ROZZELLE; James; (San Francisco,
CA) |
|
Applicant: |
Name |
City |
State |
Country |
Type |
APPLIED BIOSYSTEMS, LLC |
Carlsbad |
CA |
US |
|
|
Assignee: |
APPLIED BIOSYSTEMS, LLC
Carlsbad
CA
|
Family ID: |
23307202 |
Appl. No.: |
14/106183 |
Filed: |
December 13, 2013 |
Related U.S. Patent Documents
|
|
|
|
|
|
Application
Number |
Filing Date |
Patent Number |
|
|
13615041 |
Sep 13, 2012 |
|
|
|
14106183 |
|
|
|
|
11300194 |
Dec 13, 2005 |
8318469 |
|
|
13615041 |
|
|
|
|
10302817 |
Nov 22, 2002 |
7052877 |
|
|
11300194 |
|
|
|
|
60334434 |
Nov 30, 2001 |
|
|
|
Current U.S.
Class: |
435/194 ;
435/252.33; 435/254.2; 435/320.1; 435/325; 435/348 |
Current CPC
Class: |
C12Y 207/07007 20130101;
Y02P 20/52 20151101; C12Q 1/6869 20130101; C12N 9/1252 20130101;
C12N 9/1241 20130101; C12P 19/34 20130101 |
Class at
Publication: |
435/194 ;
435/320.1; 435/252.33; 435/254.2; 435/348; 435/325 |
International
Class: |
C12N 9/12 20060101
C12N009/12 |
Claims
1. An isolated nucleic acid encoding a nucleic acid polymerase
comprising any one of amino acid sequences SEQ ID NO:9-16.
2. The isolated nucleic acid of claim 1, wherein said nucleic acid
polymerase comprises a mutation that decreases 5'-3' exonuclease
activity.
3. The isolated nucleic acid of claim 2, wherein said decreased
5'-3' exonuclease activity is relative to a nucleic acid polymerase
that does not comprise said mutation.
4. The isolated nucleic acid of claim 1, wherein said nucleic acid
polymerase comprises a mutation that reduces discrimination against
dideoxynucleotide triphosphates.
5. The isolated nucleic acid of claim 4, wherein said reduced
discrimination against dideoxynucleotide triphosphates is relative
to a nucleic acid polymerase that does not comprise said
mutation.
6. An isolated nucleic acid comprising the nucleotide sequence of
any one of SEQ ID NO:1-8 or a nucleotide sequence complementary to
any one of SEQ ID NO:1-8.
7. (canceled)
8. An isolated nucleic acid encoding a nucleic acid polymerase from
Thermus brockianus comprising amino acid sequence with at least 96%
identity to any one of SEQ ID NO:9-16.
9. A vector comprising the isolated nucleic acid of claim 8.
10.-16. (canceled)
17. An expression vector comprising a promoter operably linked to
the isolated nucleic acid of claim 8.
18.-24. (canceled)
25. A host cell comprising the isolated nucleic acid of claim
8.
26.-54. (canceled)
Description
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a Divisional application of U.S.
Non-Provisional application Ser. No. 10/302,817, filed Nov. 22,
2002, which claims a priority benefit under 35 U.S.C. § 119(e)
from U.S. Patent Application No. 60/334,434, filed Nov. 30, 2001,
and which is incorporated herein by reference.
FIELD OF THE INVENTION
[0002] The invention relates to nucleic acids and polypeptides for
a nucleic acid polymerase isolated from a thermophilic organism,
Thermus brockianus.
BACKGROUND OF THE INVENTION
[0003] DNA polymerases are naturally-occurring intracellular
enzymes used by a cell for replicating DNA by reading one nucleic
acid strand and manufacturing its complement. Enzymes having DNA
polymerase activity catalyze the formation of a bond between the 3'
hydroxyl group at the growing end of a nucleic acid primer and the
5' phosphate group of a newly added nucleotide triphosphate.
Nucleotide triphosphates used for DNA synthesis are usually
deoxyadenosine triphosphate (A), deoxythymidine triphosphate (T),
deoxycytidine triphosphate (C) and deoxyguanosine triphosphate (G),
but modified or altered versions of these nucleotides can also be
used. The order in which the nucleotides are added is dictated by
hydrogen-bond formation between A and T nucleotide bases and
between G and C nucleotide bases.
[0004] Bacterial cells contain three types of DNA polymerases,
termed polymerase I, II and III. DNA polymerase I is the most
abundant polymerase and is generally responsible for certain types
of DNA repair, including a repair-like reaction that permits the
joining of Okazaki fragments during DNA replication. Pol I is
essential for the repair of DNA damage induced by UV irradiation
and radiomimetic drugs. Pol II is thought to play a role in
repairing DNA damage that induces the SOS response. In mutants that
lack both pol I and III, pol II repairs UV-induced lesions. Pol I
and II are monomeric polymerases while pol III is a multisubunit
complex.
[0005] Enzymes having DNA polymerase activity are often used in
vitro for a variety of biochemical applications including cDNA
synthesis and DNA sequencing reactions. See Sambrook et al.,
Molecular Cloning: A Laboratory Manual (3rd ed., Cold Spring Harbor
Laboratory Press, 2001), hereby incorporated by reference. DNA
polymerases are also used for amplification of nucleic acids by
methods such as the polymerase chain reaction (PCR) (Mullis et al.,
U.S. Pat. Nos. 4,683,195, 4,683,202, and 4,800,159, incorporated by
reference) and RNA transcription-mediated amplification methods
(e.g., Kacian et al., PCT Publication No. WO91/01384, incorporated
by reference).
[0006] DNA amplification utilizes cycles of primer extension
through the use of a DNA polymerase activity, followed by thermal
denaturation of the resulting double-stranded nucleic acid in order
to provide a new template for another round of primer annealing and
extension. Because the high temperatures necessary for strand
denaturation result in the irreversible inactivation of many DNA
polymerases, the discovery and use of DNA polymerases able to
remain active at temperatures above about 37° C provides
an advantage in cost and labor efficiency.
[0007] Thermostable DNA polymerases have been discovered in a
number of thermophilic organisms including Thermus aquaticus,
Thermus thermophilus, and species within the genera Bacillus,
Thermococcus, Sulfolobus, and Pyrococcus. A full length thermostable
DNA polymerase derived from Thermus aquaticus (Taq) has been
described by Lawyer, et al., J. Biol. Chem. 264:6427-6437 (1989)
and Gelfand et al., U.S. Pat. No. 5,466,591. The cloning and
expression of truncated versions of that DNA polymerase are further
described in Lawyer et al., in PCR Methods and Applications,
2:275-287 (1993), and Barnes, PCT Publication No. WO92/06188
(1992). Sullivan reports the cloning of a mutated version of the
Taq DNA polymerase in EPO Publication No. 0482714A1 (1992). A DNA
polymerase from Thermus thermophilus has also been cloned and
expressed. Asakura et al., J. Ferment. Bioeng. (Japan), 74:265-269
(1993). However, the properties of the various DNA polymerases
vary. Accordingly, new DNA polymerases are needed that have
improved sequence discrimination, better salt tolerance, varying
degrees of thermostability, improved tolerance for labeled or
dideoxy nucleotides and other valuable properties.
SUMMARY OF THE INVENTION
[0008] The invention provides nucleic acid polymerases isolated
from a thermophilic organism, Thermus brockianus, for example,
from strains YS38 and 2AZN.
[0009] In one embodiment, the invention provides an isolated
nucleic acid comprising SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3,
SEQ ID NO:4, SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7 or SEQ ID NO:8,
and complementary nucleic acids. In another embodiment, the
invention provides an isolated nucleic acid encoding a polypeptide
comprising an amino acid sequence that has at least 96% identity to
SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID
NO:13, SEQ ID NO:14, SEQ ID NO:15 or SEQ ID NO:16. The invention
also provides vectors comprising these isolated nucleic acids,
including expression vectors comprising a promoter operably linked
to these isolated nucleic acids. Host cells comprising such
isolated nucleic acids and vectors are also provided by the
invention, particularly host cells capable of expressing a
thermostable polypeptide encoded by the nucleic acid, where the
polypeptide has DNA polymerase activity.
[0010] The invention also provides isolated polypeptides that can
include an amino acid sequence comprising any one of SEQ ID NO:9-49.
The isolated polypeptides provided by the invention preferably are
thermostable and have a DNA polymerase activity between 50,000 U/mg
protein and 500,000 U/mg protein.
[0011] The invention further provides a method of synthesizing DNA
that includes contacting a polypeptide comprising SEQ ID NO:9, SEQ
ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14,
SEQ ID NO:15 or SEQ ID NO:16 with a DNA under conditions sufficient
to permit polymerization of DNA.
[0012] The invention further provides a method for thermocyclic
amplification of nucleic acid that comprises contacting a nucleic
acid with a thermostable polypeptide having SEQ ID NO:9, SEQ ID
NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ
ID NO:15 or SEQ ID NO:16 under conditions suitable for
amplification of the nucleic acid, and amplifying the nucleic acid.
In general, one or more primers are included in the amplification
mixture, where each primer can hybridize to a separate segment of
the nucleic acid. Such amplification can include cycling the
temperature to permit denaturation of nucleic acids, annealing of a
primer to a template nucleic acid and polymerization of a nucleic
acid complementary to the template nucleic acid. Amplification can
be, for example, by Strand Displacement Amplification or Polymerase
Chain Reaction.
[0013] The invention also provides a method of primer extending DNA
comprising contacting a polypeptide comprising SEQ ID NO:9, SEQ ID
NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ
ID NO:15 or SEQ ID NO:16 with a DNA and a primer capable of
hybridizing to a segment of the DNA under conditions sufficient to
permit polymerization of DNA. Such primer extension can be
performed, for example, to sequence DNA or to amplify DNA.
[0014] The invention further provides a method of making a DNA
polymerase comprising SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ
ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15 or SEQ ID NO:16.
The method comprises incubating a host cell under conditions
sufficient for RNA transcription and translation, wherein the host
cell comprises a nucleic acid that encodes a polypeptide comprising
SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID
NO:13, SEQ ID NO:14, SEQ ID NO:15 or SEQ ID NO:16 operably linked
to a promoter. In one embodiment, the method uses a nucleic acid
that comprises SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, or SEQ ID NO:8. The
invention is also directed to a DNA polymerase made by this
method.
[0015] The invention also provides a kit that includes a container
containing a DNA polymerase that has an amino acid sequence
comprising SEQ ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12,
SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15 or SEQ ID NO:16. The kit
can also contain an unlabeled nucleotide, a labeled nucleotide, a
balanced mixture of nucleotides, a chain terminating nucleotide, a
nucleotide analog, a buffer solution, a solution containing
magnesium, a cloning vector, a restriction endonuclease, a
sequencing primer, a solution containing reverse transcriptase, or
a DNA or RNA amplification primer. Such kits can, for example, be
adapted for performing DNA sequencing, DNA amplification, RNA
amplification or primer extension reactions.
DESCRIPTION OF THE FIGURES
[0016] FIG. 1 provides an alignment of nucleic acid polymerase
nucleic acids from Thermus brockianus strains 2AZN and YS38. Two
codon differences exist between these strains. One is silent and
the other is a difference of C vs T at position 1637 (indicated in
boldface), encoding a difference of leucine vs proline at amino
acid position 546.
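The correspondence between nucleotide position 1637 and amino acid position 546 can be checked with simple arithmetic. The following is a minimal sketch only, not part of the original disclosure. It assumes that positions are numbered from 1, starting at the first base of the initiation codon, and the function name codon_for_nucleotide is hypothetical.

```python
from typing import Tuple


def codon_for_nucleotide(position: int) -> Tuple[int, int]:
    """Map a 1-based nucleotide position in a coding sequence to
    (1-based codon number, 1-based position within that codon).

    Assumption: position 1 is the first base of the initiation codon.
    """
    if position < 1:
        raise ValueError("position must be 1 or greater")
    codon_number = (position - 1) // 3 + 1
    offset_in_codon = (position - 1) % 3 + 1
    return codon_number, offset_in_codon


codon_number, offset = codon_for_nucleotide(1637)
print(codon_number, offset)  # prints: 546 2
```

Under that numbering assumption, position 1637 is the second base of codon 546. In the standard genetic code, CTN codons specify leucine and CCN codons specify proline, so a C versus T difference at the second base of a codon is consistent with the leucine versus proline difference described for FIG. 1.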
[0017] FIG. 2 provides a comparison of amino acid sequences for
polymerases from Thermus aquaticus (Taq), Thermus thermophilus
(Tth), Thermus filiformis (Tfi), Thermus flavus (Tfl), Thermus
brockianus strain YS38 (Tbr YS38) and Thermus brockianus strain
2AZN (Tbr 2AZN).
DETAILED DESCRIPTION OF THE INVENTION
[0018] The present invention relates to nucleic acid and amino acid
sequences encoding nucleic acid polymerases from thermophilic
organisms. In particular, the present invention provides nucleic
acid polymerases from Thermus brockianus. The nucleic acid
polymerases of the invention can be used in a variety of
procedures, including DNA primer extension, DNA sequencing, reverse
transcription and DNA amplification procedures.
Definitions
[0019] The term "amino acid sequence" refers to the positional
arrangement and identity of amino acids in a peptide, polypeptide
or protein molecule. Use of the term "amino acid sequence" is not
meant to limit the amino acid sequence to the complete, native amino
acid sequence of a peptide, polypeptide or protein.
[0020] "Chimeric" is used to indicate that a DNA sequence, such as
a vector or a gene, is comprised of more than one DNA sequence of
distinct origin that is fused by recombinant DNA techniques to
another DNA sequence, resulting in a longer DNA sequence that does
not occur naturally.
[0021] The term "coding region" refers to the nucleic acid segment
that codes for a protein of interest. The coding region of a
protein is bounded on the 5' side by the nucleotide triplet "ATG"
that encodes the initiator methionine and on the 3' side by one of
the three triplets that specify stop codons (i.e., TAA, TAG,
TGA).
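The definition above can be illustrated with a short sketch that scans a DNA string for the first "ATG" followed, in the same reading frame, by one of the three stop triplets. This is an illustrative example only. The function name find_coding_region and the example sequence are hypothetical, and the sketch assumes a single-stranded sequence written 5' to 3' containing only the letters A, C, G and T.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_coding_region(dna: str) -> Optional[str]:
    """Return the first segment that starts with ATG and ends with an
    in-frame stop codon (TAA, TAG or TGA), or None if there is none."""
    seq = dna.upper()
    start = seq.find("ATG")
    while start != -1:
        # Walk codon by codon in the frame set by this ATG.
        for i in range(start + 3, len(seq) - 2, 3):
            if seq[i:i + 3] in STOP_CODONS:
                return seq[start:i + 3]
        # No in-frame stop for this ATG; try the next ATG, if any.
        start = seq.find("ATG", start + 1)
    return None


print(find_coding_region("CCATGGCTTGAAA"))  # prints: ATGGCTTGA
```

The returned segment includes both the initiator "ATG" and the terminating stop triplet, so it corresponds to the region bounded on the 5' and 3' sides as described in the definition.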
[0022] "Constitutive expression" refers to expression using a
constitutive promoter.
[0023] "Constitutive promoter" refers to a promoter that is able to
express the gene that it controls in all, or nearly all, phases of
the life cycle of the cell.
[0024] "Complementary" or "complementarity" are used to define the
degree of base-pairing or hybridization between nucleic acids. For
example, as is known to one of skill in the art, adenine (A) can
form hydrogen bonds or base pair with thymine (T) and guanine (G)
can form hydrogen bonds or base pair with cytosine (C). Hence, A is
complementary to T while G is complementary to C. Complementarity
may be complete when all bases in a double-stranded nucleic acid
are base paired. Alternatively, complementarity may be "partial,"
when only some of the bases in a nucleic acid are matched according
to the base pairing rules. The degree of complementarity between
nucleic acid strands has an effect on the efficiency and strength
of hybridization between nucleic acid strands.
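As a minimal sketch of the distinction between complete and partial complementarity, the following example counts the fraction of positions that follow the A:T and G:C pairing rules. It is illustrative only. The function names are hypothetical, and the sketch assumes two DNA strands of equal length, each written 5' to 3', paired antiparallel with no gaps or bulges.

```python
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand: str) -> str:
    """Return the fully complementary strand, written 5' to 3'."""
    return "".join(PAIRS[base] for base in reversed(strand.upper()))


def fraction_paired(strand_a: str, strand_b: str) -> float:
    """Fraction of positions that base pair when strand_b is aligned
    antiparallel to strand_a (1.0 means complete complementarity)."""
    if len(strand_a) != len(strand_b) or not strand_a:
        raise ValueError("strands must be non-empty and of equal length")
    b_antiparallel = strand_b.upper()[::-1]
    paired = sum(
        1 for x, y in zip(strand_a.upper(), b_antiparallel)
        if PAIRS.get(x) == y
    )
    return paired / len(strand_a)


print(fraction_paired("ATGC", reverse_complement("ATGC")))  # prints: 1.0
print(fraction_paired("ATGC", "GCAA"))                      # prints: 0.75
```

A value of 1.0 corresponds to complete complementarity, and any lower value corresponds to partial complementarity in the sense used above.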
[0025] The "derivative" of a reference nucleic acid, protein,
polypeptide or peptide, is a nucleic acid, protein, polypeptide or
peptide, respectively, with a related but different sequence or
chemical structure than the respective reference nucleic acid,
protein, polypeptide or peptide. A derivative nucleic acid,
protein, polypeptide or peptide is generally made purposefully to
enhance or incorporate some chemical, physical or functional
property that is absent or only weakly present in the reference
nucleic acid, protein, polypeptide or peptide. A derivative nucleic
acid generally can differ in nucleotide sequence from a reference
nucleic acid whereas a derivative protein, polypeptide or peptide
can differ in amino acid sequence from the reference protein,
polypeptide or peptide, respectively. Such sequence differences can
be one or more substitutions, insertions, additions, deletions,
fusions and truncations, which can be present in any combination.
Differences can be minor (e.g., a difference of one nucleotide or
amino acid) or more substantial. However, the sequence of the
derivative is not so different from the reference that one of skill
in the art would not recognize that the derivative and reference
are related in structure and/or function. Generally, differences
are limited so that the reference and the derivative are closely
similar overall and, in many regions, identical. A "variant"
differs from a "derivative" nucleic acid, protein, polypeptide or
peptide in that the variant can have silent structural differences
that do not significantly change the chemical, physical or
functional properties of the reference nucleic acid, protein,
polypeptide or peptide. In contrast, the differences between the
reference and derivative nucleic acid, protein, polypeptide or
peptide are intentional changes made to improve one or more
chemical, physical or functional properties of the reference
nucleic acid, protein, polypeptide or peptide.
[0026] The terms "DNA polymerase activity," "synthetic activity"
and "polymerase activity" are used interchangeably and refer to the
ability of an enzyme to synthesize new DNA strands by the
incorporation of deoxynucleoside triphosphates. A protein that can
direct the synthesis of new DNA strands by the incorporation of
deoxynucleoside triphosphates in a template-dependent manner is
said to be "capable of DNA synthetic activity."
[0027] The term "5' exonuclease activity" refers to the presence of
an activity in a protein that is capable of removing nucleotides
from the 5' end of a nucleic acid.
[0028] The term "3' exonuclease activity" refers to the presence of
an activity in a protein that is capable of removing nucleotides
from the 3' end of a nucleic acid.
[0029] "Expression" refers to the transcription and/or translation
of an endogenous or exogenous gene in an organism. Expression
generally refers to the transcription and stable accumulation of
mRNA. Expression may also refer to the production of protein.
[0030] "Expression cassette" means a nucleic acid sequence capable
of directing expression of a particular nucleotide sequence.
Expression cassettes generally comprise a promoter operably linked
to the nucleotide sequence to be expressed (e.g., a coding region)
that is operably linked to termination signals. Expression
cassettes also typically comprise sequences required for proper
translation of the nucleotide sequence. The expression cassette
comprising the nucleotide sequence of interest may be chimeric,
meaning that at least one of its components is heterologous with
respect to at least one of its other components. The expression of
the nucleotide sequence in the expression cassette may be under the
control of a constitutive promoter or under control of an inducible
promoter that initiates transcription only when the host cell is
exposed to some particular external stimulus. In the case of a
multicellular organism, the promoter can also be specific to a
particular tissue or organ or stage of development.
[0031] The term "gene" is used broadly to refer to any segment of
nucleic acid associated with a biological function. The term "gene"
encompasses the coding region of a protein, polypeptide, peptide or
structural RNA. The term "gene" also includes sequences up to a
distance of about 2 kb on either end of a coding region. These
sequences are referred to as "flanking" sequences or regions (these
flanking sequences are located 5' or 3' to the non-translated
sequences present on the mRNA transcript). The 5' flanking region
may contain regulatory sequences such as promoters and enhancers or
other recognition or binding sequences for proteins that control or
influence the transcription of the gene. The 3' flanking region may
contain sequences that direct the termination of transcription,
post-transcriptional cleavage and polyadenylation as well as
recognition sequences for other proteins. A protein or polypeptide
encoded in a gene can be full length or any portion thereof, so
that all activities or functional properties are retained, or so
that only selected activities (e.g., enzymatic activity, ligand
binding, or signal transduction) of the full-length protein or
polypeptide are retained. The protein or polypeptide can include
any sequences necessary for the production of a proprotein or
precursor polypeptide. The term "native gene" refers to a gene that
is naturally present in the genome of an untransformed cell.
[0032] "Genome" refers to the complete genetic material that is
naturally present in an organism and is transmitted from one
generation to the next.
[0033] The terms "heterologous nucleic acid," or "exogenous nucleic
acid" refer to a nucleic acid that originates from a source foreign
to the particular host cell or, if from the same source, is
modified from its original form. Thus, a heterologous gene in a
host cell includes a gene that is endogenous to the particular host
cell but has been modified through, for example, the use of DNA
shuffling. The terms also include non-naturally occurring multiple
copies of a naturally occurring nucleic acid. Thus, the terms refer
to a nucleic acid segment that is foreign or heterologous to the
cell, or normally found within the cell but in a position within
the cell or genome where it is not ordinarily found.
[0034] The term "homology" refers to a degree of similarity between
a nucleic acid and a reference nucleic acid or between a
polypeptide and a reference polypeptide. Homology may be partial or
complete. Complete homology indicates that the nucleic acid or
amino acid sequences are identical. A partially homologous nucleic
acid or amino acid sequence is one that is not identical to the
reference nucleic acid or amino acid sequence. Hence, a partially
homologous nucleic acid has one or more nucleotide differences in
its sequence relative to the nucleic acid to which it is being
compared. The degree of homology can be determined by sequence
comparison. Alternatively, as is well understood by those skilled
in the art, DNA-DNA or DNA-RNA hybridization, under various
hybridization conditions, can provide an estimate of the degree of
homology between nucleic acids (see, e.g., Hames and Higgins
(eds.), Nucleic Acid Hybridization, IRL Press, Oxford, U.K.).
[0035] "Hybridization" refers to the process of annealing
complementary nucleic acid strands by forming hydrogen bonds
between nucleotide bases on the complementary nucleic acid strands.
Hybridization, and the strength of the association between the
nucleic acids, is impacted by such factors as the degree of
complementarity between the hybridizing nucleic acids, the stringency
of the conditions involved, the Tm of the formed hybrid, and the
G:C ratio within the nucleic acids.
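The definition above names the Tm and the G:C ratio as factors but gives no formula for either. As a rough illustration only, the following sketch computes the G:C fraction of an oligonucleotide and estimates its Tm with the Wallace rule of thumb (2 degrees C for each A or T and 4 degrees C for each G or C). The Wallace rule is an assumption of this example, not a method stated in this disclosure, and it is generally regarded as an approximation for short oligonucleotides only. The function names are hypothetical.

```python
def gc_fraction(oligo: str) -> float:
    """Fraction of bases in the oligonucleotide that are G or C."""
    seq = oligo.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm_celsius(oligo: str) -> int:
    """Rough Tm estimate in degrees C using the Wallace rule:
    2 * (A + T) + 4 * (G + C). An approximation for short oligos."""
    seq = oligo.upper()
    at_count = seq.count("A") + seq.count("T")
    gc_count = seq.count("G") + seq.count("C")
    return 2 * at_count + 4 * gc_count


primer = "ATGCATGCATGCATGC"  # hypothetical 16-nucleotide example
print(gc_fraction(primer))         # prints: 0.5
print(wallace_tm_celsius(primer))  # prints: 48
```

Under this approximation, an oligonucleotide richer in G and C has a higher estimated Tm than an A and T rich oligonucleotide of the same length, which is consistent with the G:C ratio being listed as a factor affecting the strength of hybridization.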
[0036] "Inducible promoter" refers to a regulated promoter that can
be turned on in one or more cell types by an external stimulus,
such as a chemical, light, hormone, stress, temperature or a
pathogen.
[0037] An "initiation site" is a region surrounding the position of
the first nucleotide that is part of the transcribed sequence,
which is defined as position +1. All nucleotide positions of the
gene are numbered by reference to the first nucleotide of the
transcribed sequence, which resides within the initiation site.
Downstream sequences (i.e., sequences in the 3' direction) are
denominated positive, while upstream sequences (i.e., sequences in
the 5' direction) are denominated negative.
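The numbering scheme described above can be sketched as a small helper that converts an index within a sequence into a position relative to the first transcribed nucleotide. This is an illustrative sketch only. It assumes the common convention that there is no position 0, so the nucleotide immediately upstream of +1 is -1; that convention is not stated in this definition. The function name and the example indices are hypothetical.

```python
def relative_position(index: int, first_transcribed_index: int) -> int:
    """Convert a 0-based index in a sequence into a position numbered
    relative to the first transcribed nucleotide (+1).

    Assumption: there is no position 0, so the base immediately
    upstream of +1 is numbered -1.
    """
    offset = index - first_transcribed_index
    # Downstream (3') positions are positive, starting at +1.
    # Upstream (5') positions are negative, starting at -1.
    return offset + 1 if offset >= 0 else offset


# Hypothetical example: the first transcribed nucleotide is at index 10.
print(relative_position(10, 10))  # prints: 1
print(relative_position(12, 10))  # prints: 3
print(relative_position(9, 10))   # prints: -1
print(relative_position(0, 10))   # prints: -10
```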
[0038] An "isolated" or "purified" nucleic acid or an "isolated" or
"purified" polypeptide is a nucleic acid or polypeptide that, by
the hand of man, exists apart from its native environment and is
therefore not a product of nature. An isolated nucleic acid or
polypeptide may exist in a purified form or may exist in a
non-native environment such as, for example, within a transgenic
host cell.
[0039] The term "invader oligonucleotide" refers to an
oligonucleotide that contains sequences at its 3' end that are
substantially the same as sequences located at the 5' end of a
probe oligonucleotide. These regions will compete for hybridization
to the same segment along a complementary target nucleic acid.
[0040] The term "label" refers to any atom or molecule that can be
used to provide a detectable (preferably quantifiable) signal, and
that can be attached to a nucleic acid or protein. Labels may
provide signals detectable by fluorescence, radioactivity,
colorimetry, gravimetry, X-ray diffraction or absorption,
magnetism, enzymatic activity, and the like.
[0041] The term "nucleic acid" refers to deoxyribonucleotides or
ribonucleotides and polymers thereof in either single- or
double-stranded form, composed of monomers (nucleotides) containing
a sugar, phosphate and a base that is either a purine or
pyrimidine. Unless specifically limited, the term encompasses
nucleic acids containing known analogs of natural nucleotides that
have similar binding properties as the reference nucleic acid and
are metabolized in a manner similar to naturally occurring
nucleotides. Unless otherwise indicated, a particular nucleic acid
sequence also implicitly encompasses conservatively modified
variants thereof (e.g., degenerate codon substitutions) and
complementary sequences as well as the reference sequence
explicitly indicated.
[0042] The term "oligonucleotide" as used herein is defined as a
molecule comprised of two or more deoxyribonucleotides or
ribonucleotides, preferably more than three, and usually more than
ten or fifteen. There is no precise upper limit on the size of an
oligonucleotide. However, in general, an oligonucleotide is shorter
than about 250 nucleotides, preferably shorter than about 200
nucleotides and more preferably shorter than about 100 nucleotides.
The exact size will depend on many factors, which in turn depend
on the ultimate function or use of the oligonucleotide. The
oligonucleotide may be generated in any manner, including chemical
synthesis, DNA replication, reverse transcription, or a combination
thereof.
[0043] The terms "open reading frame" and "ORF" refer to the amino
acid sequence encoded between translation initiation and
termination codons of a coding sequence. The terms "initiation
codon" and "termination codon" refer to a unit of three adjacent
nucleotides (`codon`) in a coding sequence that specifies
initiation and chain termination, respectively, of protein
synthesis (mRNA translation).
[0044] "Operably linked" means joined as part of the same nucleic
acid molecule, so that the function of one is affected by the
other. In general, "operably linked" also means that two or more
nucleic acids are suitably positioned and oriented so that they can
function together. Nucleic acids are often operably linked to
permit transcription of a coding region to be initiated from the
promoter. For example, a regulatory sequence is said to be
"operably linked to" or "associated with" a DNA sequence that codes
for an RNA or a polypeptide if the two sequences are situated such
that the regulatory sequence affects expression of the coding
region (i.e., that the coding sequence or functional RNA is under
the transcriptional control of the promoter). Coding regions can be
operably-linked to regulatory sequences in sense or antisense
orientation.
[0045] The term "probe oligonucleotide" refers to an
oligonucleotide that interacts with a target nucleic acid to form a
cleavage structure in the presence or absence of an invader
oligonucleotide. When annealed to the target nucleic acid, the
probe oligonucleotide and target form a cleavage structure and
cleavage occurs within the probe oligonucleotide. The presence of
an invader oligonucleotide upstream of the probe oligonucleotide
can shift the site of cleavage within the probe oligonucleotide
(relative to the site of cleavage in the absence of the
invader).
[0046] "Promoter" refers to a nucleotide sequence, usually upstream
(5') to a coding region, which controls the expression of the
coding region by providing the recognition site for RNA polymerase
and other factors required for proper transcription. "Promoter"
includes but is not limited to a minimal promoter that is a short DNA
sequence comprised of a TATA-box. Hence, a promoter includes other
sequences that serve to specify the site of transcription
initiation and control or regulate expression, for example,
enhancers. Accordingly, an "enhancer" is a DNA sequence that can
stimulate promoter activity and may be an innate element of the
promoter or a heterologous element inserted to enhance the level or
tissue specificity of a promoter. It is capable of operating in
both orientations (normal or flipped), and is capable of
functioning even when moved either upstream or downstream from the
promoter. Promoters may be derived in their entirety from a native
gene, or be composed of different elements derived from different
promoters found in nature, or even be comprised of synthetic DNA
segments. A promoter may also contain DNA sequences that are
involved in the binding of protein factors that control the
effectiveness of transcription initiation in response to
physiological or developmental conditions.
[0047] The terms "protein," "peptide" and "polypeptide" are used
interchangeably herein.
[0048] "Regulatory sequences" and "regulatory elements" refer to
nucleotide sequences that control some aspect of the expression of
nucleic acid sequences. Such sequences or elements can be located
upstream (5' non-coding sequences), within, or downstream (3'
non-coding sequences) of a coding sequence. "Regulatory sequences"
and "regulatory elements" influence the transcription, RNA
processing or stability, or translation of the associated coding
sequence. Regulatory sequences include enhancers, introns,
promoters, polyadenylation signal sequences, splicing signals,
termination signals, and translation leader sequences. Regulatory
sequences also include natural and synthetic sequences.
[0049] As used herein, the term "selectable marker" refers to a
gene that encodes an observable or selectable trait that is
expressed and can be detected in an organism having that gene.
Selectable markers are often linked to a nucleic acid of interest
that may not encode an observable trait in order to trace or select
for the presence of the nucleic acid of interest. Any selectable
marker known to one of skill in the art can be used with the
nucleic acids of the invention. Some selectable markers allow the
host to survive under circumstances where, without the marker, the
host would otherwise die. Examples of selectable markers include
antibiotic resistance, for example, tetracycline or ampicillin
resistance.
[0050] As used herein the term "stringency" is used to define the
conditions of temperature, ionic strength, and the presence of
other compounds such as organic solvents, under which nucleic acid
hybridizations are conducted. With "high stringency" conditions,
nucleic acid base pairing will occur only between nucleic acids
that have a high frequency of complementary base sequences. With
"weak" or "low" stringency conditions, the frequency
of complementary sequences is usually less, so that nucleic acids
with differing sequences can be detected and/or isolated.
[0051] The terms "substantially similar" and "substantially
homologous" refer to nucleotide and amino acid sequences that
represent functional equivalents of the instant inventive
sequences. For example, altered nucleotide sequences that simply
reflect the degeneracy of the genetic code but nonetheless encode
amino acid sequences that are identical to the inventive amino acid
sequences are substantially similar to the inventive sequences. In
addition, amino acid sequences that are substantially similar to
the instant sequences are those wherein overall amino acid identity
is sufficient to provide an active, thermally stable DNA polymerase
I. For example, amino acid sequences that are substantially similar
to the sequences of the invention are those wherein the overall
amino acid identity is 80% or greater, preferably 90% or greater,
such as 91%, 92%, 93%, or 94%, and more preferably 95% identity or
greater, such as 96%, 97%, 98%, or 99% identity, relative to the
amino acid sequences of the invention.
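The identity thresholds above can be illustrated with a short Python sketch. It assumes the two sequences have already been aligned to equal length, with "-" marking a gap, and it divides the number of identical positions by the alignment length. The function names, the gap convention and the choice of denominator are assumptions made for this example; the specification does not prescribe a particular alignment or scoring method.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two pre-aligned sequences match.

    Assumption: both inputs are already aligned to the same length and
    use "-" for gaps. A gap never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b and a != "-")
    return 100.0 * matches / len(seq_a)


def identity_tier(identity: float) -> str:
    """Map a percent identity onto the tiers named in the text."""
    if identity >= 95.0:
        return "more preferred (95% or greater)"
    if identity >= 90.0:
        return "preferred (90% or greater)"
    if identity >= 80.0:
        return "substantially similar (80% or greater)"
    return "below 80%"


if __name__ == "__main__":
    # First ten residues encoded by the opening 30 nucleotides of SEQ ID NO:1,
    # compared with a hypothetical variant carrying one substitution.
    reference = "MLPLFEPKGR"
    variant = "MLPLFEPKGK"
    score = percent_identity(reference, variant)
    print(score, identity_tier(score))
```

In practice the comparison would be made over the full-length polymerase sequences after alignment with a tool of the practitioner's choice; the ten-residue strings here only keep the example short.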
[0052] A "terminating agent", "terminating nucleotide" or
"terminator" in relation to DNA synthesis or sequencing refers to
compounds capable of specifically terminating a DNA sequencing
reaction at a specific base. Such compounds include, but are not
limited to, dideoxynucleosides having a 2',3' dideoxy structure
(e.g., ddATP, ddCTP, ddGTP and ddTTP).
[0053] "Thermostable" means that a nucleic acid polymerase remains
active at a temperature greater than about 37° C.
Preferably, the nucleic acid polymerases of the invention remain
active at a temperature greater than about 42° C. More
preferably, the nucleic acid polymerases of the invention remain
active at a temperature greater than about 50° C. Even
more preferably, the nucleic acid polymerases of the invention
remain active after exposure to a temperature greater than about
60° C. Most preferably, the nucleic acid polymerases of
the invention remain active despite exposure to a temperature
greater than about 70° C.
[0054] A "transgene" refers to a gene that has been introduced into
the genome by transformation and is stably maintained. Transgenes
may include, for example, genes that are either heterologous or
homologous to the genes of a particular organism to be transformed.
Additionally, transgenes may comprise native genes inserted into a
non-native organism, or chimeric genes. The term "endogenous gene"
refers to a native gene in its natural location in the genome of an
organism. A "foreign" or "exogenous" gene refers to a gene not
normally found in the host organism but that is introduced by gene
transfer.
[0055] The term "transformation" refers to the transfer of a
nucleic acid fragment into the genome of a host cell, resulting in
genetically stable inheritance. Host cells containing the
transformed nucleic acid fragments are referred to as "transgenic"
cells, and organisms comprising transgenic cells are referred to as
"transgenic organisms." Transformation may be accomplished by a
variety of means known in the art including calcium DNA
co-precipitation, electroporation, viral infection, and the
like.
[0056] The "variant" of a reference nucleic acid, protein,
polypeptide or peptide, is a nucleic acid, protein, polypeptide or
peptide, respectively, with a related but different sequence than
the respective reference nucleic acid, protein, polypeptide or
peptide. The differences between variant and reference nucleic
acids, proteins, polypeptides or peptides are silent or
conservative differences. A variant nucleic acid differs from a
reference nucleic acid in nucleotide sequence, whereas a variant
protein, polypeptide or peptide differs in amino acid
sequence from the reference protein, polypeptide or peptide,
respectively. A variant and reference nucleic acid, protein,
polypeptide or peptide may differ in sequence by one or more
substitutions, insertions, additions, deletions, fusions
and truncations, which may be present in any combination.
Differences can be minor (e.g., a difference of one nucleotide or
amino acid) or more substantial. However, the structure and
function of the variant is not so different from the reference that
one of skill in the art would not recognize that the variant and
reference are related in structure and/or function. Generally,
differences are limited so that the reference and the variant are
closely similar overall and, in many regions, identical.
[0057] The term "vector" is used to refer to a nucleic acid that
can transfer another nucleic acid segment(s) into a cell. A
"vector" includes, inter alia, any plasmid, cosmid, phage or
nucleic acid in double- or single-stranded, linear or circular form
that may or may not be self-transmissible or mobilizable. It can
transform prokaryotic or eukaryotic host cells either by
integration into the cellular genome or by existing
extrachromosomally (e.g., autonomous replicating plasmid with an
origin of replication). Vectors used in bacterial systems often
contain an origin of replication that allows the vector to
replicate independently of the bacterial chromosome. The term
"expression vector" refers to a vector containing an expression
cassette.
[0058] The term "wild-type" refers to a gene or gene product that
has the characteristics of that gene or gene product when isolated
from a naturally occurring source. A wild-type gene is the gene
form most frequently observed in a population and thus arbitrarily
is designated the "normal" or "wild-type" form of the gene. In
contrast, the term "variant" or "derivative" refers to a gene or
gene product that displays modifications in sequence and/or
functional properties (i.e., altered characteristics) when compared
to the wild-type gene or gene product. Naturally occurring
derivatives can be isolated. They are identified by the fact that
they have altered characteristics when compared to the wild-type
gene or gene product.
Polymerase Nucleic Acids
[0059] The invention provides isolated nucleic acids encoding
Thermus brockianus nucleic acid polymerases, as well as derivative,
fragment and variant nucleic acids thereof that encode active,
thermally stable nucleic acid polymerases. Thus, one aspect of the
invention includes the nucleic acid polymerases encoded by the
polynucleotide sequences contained in Thermus brockianus strains
YS38 and 2AZN. Any nucleic acid encoding amino acid sequence SEQ ID
NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ
ID NO:14, SEQ ID NO:15, or SEQ ID NO:16, which are amino acid
sequences for wild type and several derivative Thermus brockianus
polymerases, is also contemplated by the present invention.
[0060] In one embodiment, the invention provides a nucleic acid of
SEQ ID NO:1, a wild type Thermus brockianus nucleic acid encoding
a nucleic acid polymerase from strain YS38:
TABLE-US-00001 1 ATGCTTCCCC TCTTTGAGCC CAAGGGCCGG GTGCTCCTGG
TGGACGGCCA 51 CCACCTGGCC TACCGTAACT TCTTCGCCCT CAAGGGGCTC
ACCACGAGCC 101 GGGGCGAGCC CGTGCAAGGG GTCTACGGCT TCGGCAAAAG
CCTCCTCAAG 151 GCCCTGAAGG AGGACGGGGA CGTGGTCATC GTGGTCTTTG
ACGCCAAGGC 201 CCCCTCTTTT CGCCACGAGG CCTACGGGGC CTACAAGGCG
GGCCGGGCCC 251 CTACCCCGGA GGACTTTCCG AGGCAGCTTG CCCTCATGAA
GGAGCTTGTG 301 GACCTTTTGG GGCTGGAGCG CCTCGAGGTC CCGGGCTTTG
AGGCGGACGA 351 TGTCCTCGCC GCCCTGGCCA AGAAGGCGGA GCGGGAAGGG
TACGAGGTGC 401 GCATCCTCAC CGCCGACCGG GACCTCTTCC AGCTTCTTTC
GGACCGCATC 451 GCCGTCCTGC ACCCGGAAGG CCACCTCATC ACCCCGGGGT
GGCTTTGGGA 501 GAGGTACGGC CTGAGACCGG AGCAGTGGGT GGAGTTCCGC
GCCCTGGCCG 551 GCGACCCTTC CGACAACATC CCCGGGGTGA AGGGGATCGG
CGAGAAGACG 601 GCCCTGAAGC TCCTAAAGGA GTGGGGTAGT CTGGAAAATA
TCCAAAAAAA 651 CCTGGACCAG GTCAGTCCCC CTTCCGTGCG CGAGAAGATC
CAGGCCCACC 701 TGGACGACCT CAGGCTCTCC CAGGAGCTTT CCCGGGTGCG
CACGGACCTT 751 CCCTTGGAGG TGGACTTTAG AAGGCGGCGG GAGCCCGATA
GGGAAGGCCT 801 TAGGGCCTTC TTAGAGCGGC TTGAGTTCGG GAGCCTCCTC
CACGAGTTCG 851 GCCTCCTGGA AAGCCCCCAG GCGGCGGAGG AGGCCCCTTG
GCCGCCGCCG 901 GAAGGGGCCT TCTTGGGCTT CCGCCTCTCC CGGCCCGAGC
CCATGTGGGC 951 GGAACTCCTT TCCTTGGCGG CAAGCGCCAA GGGCCGGGTC
TACCGGGCGG 1001 AGGCGCCCCA TAAGGCCCTT TCGGACCTGA AGGAGATCCG
GGGGCTTCTC 1051 GCCAAGGACC TCGCCGTCTT GGCCCTGAGG GAGGGGCTCG
GCCTTCCCCC 1101 CACGGACGAT CCCATGCTCC TCGCCTACCT CCTGGACCCC
TCCAACACCA 1151 CCCCCGAGGG CGTGGCCCGG CGCTACGGGG GGGAGTGGAC
GGAGGAGGCG 1201 GGGGAGAGGG CCTTGCTTGC CGAAAGGCTT TACGAGAACC
TCCTAAGCCG 1251 CCTGAAAGGG GAAGAAAAGC TCCTTTGGCT CTACGAGGAG
GTGGAAAAGC 1301 CCCTTTCCCG GGTCCTCGCC CACATGGAGG CCACGGGGGT
GAGGCTGGAC 1351 GTACCCTACC TAAGGGCCCT TTCCCTGGAG GTGGCGGCGG
AGATGGGCCG 1401 CCTGGAGGAG GAGGTTTTCC GCCTGGCGGG CCACCCCTTC
AACCTGAACT 1451 CCCGCGACCA GCTGGAAAGG GTGCTCTTTG ACGAGCTCGG
GCTTCCCCCC 1501 ATCGGCAAGA CGGAAAAAAC CGGGAAGCGC TCCACCAGCG
CCGCCGTCCT 1551 CGAGGCCCTG CGGGAGGCCC ACCCCATCGT GGAGAAGATC
CTCCAGTACC 1601 GGGAGCTCGC CAAGCTCAAG GGCACCTACA TTGACCTCCT
TCCCGCCCTG 1651 GTCCACCCCA GGACGGGCAG GCTCCACACC CGCTTCAACC
AGACGGCCAC 1701 GGCCACGGGC CGCCTTTCCA GCTCCGACCC CAACCTGCAG
AACATTCCCG 1751 TGCGCACCCC CTTGGGCCAA AGGATCCGCC GGGCCTTCGT
GGCCGAGGAG 1801 GGGTACCTTC TCGTGGCCCT GGACTATAGC CAGATTGAGC
TGAGGGTCCT 1851 GGCCCACCTC TCGGGGGACG AAAACCTCAT CCGGGTCTTC
CAGGAGGGCC 1901 GGGACATCCA CACCCAGACG GCGAGCTGGA TGTTCGGCCT
GCCGGCGGAG 1951 GCCATAGACC CCCTCAGGCG CCGGGCGGCC AAGACCATCA
ACTTCGGCGT 2001 CCTCTACGGC ATGTCCGCCC ACCGGCTTTC CCAGGAGCTG
GGCATCCCCT 2051 ACGAGGAGGC GGTGGCCTTC ATTGACCGCT ATTTCCAGAG
CTACCCCAAG 2101 GTGAAGGCCT GGATTGAAAG GACCCTGGAG GAGGGGCGGC
AAAGGGGGTA 2151 CGTGGAGACC CTCTTCGGCC GCAGGCGCTA CGTGCCCGAC
CTCAACGCCC 2201 GGGTAAAGAG CGTGCGGGAG GCGGCGGAGC GCATGGCCTT
TAACATGCCC 2251 GTGCAGGGCA CCGCCGCTGA CCTGATGAAG CTCGCCATGG
TGAGGCTCTT 2301 CCCTAGGCTT CCCGAGGTGG GGGCGAGGAT GCTCCTCCAG
GTCCACGACG 2351 AGCTCCTCCT GGAGGCGCCC AAGGAGCGGG CGGAGGAGGC
GGCGGCCCTG 2401 GCCAAGGAGG TCATGGAGGG GGTCTGGCCC CTGGCCGTGC
CCCTGGAGGT 2451 GGAGGTGGGC ATCGGGGAGG ACTGGCTTTC CGCCAAGGGC TAG
[0061] In another embodiment, the invention provides a nucleic acid
of SEQ ID NO:2, another wild type Thermus brockianus nucleic acid
encoding a nucleic acid polymerase, but from strain 2AZN.
TABLE-US-00002 1 ATGCTTCCCC TCTTTGAGCC CAAGGGCCGG GTGCTCCTGG
TGGACGGCCA 51 CCACCTGGCC TACCGTAACT TCTTCGCCCT CAAGGGGCTC
ACCACGAGCC 101 GGGGCGAGCC CGTGCAAGGG GTCTACGGCT TCGCCAAAAG
CCTCCTCAAG 151 GCCCTGAAGG AGGACGGGGA CGTGGTCATC GTGGTCTTTG
ACGCCAAGGC 201 CCCCTCTTTT CGCCACGAGG CCTACGGGGC CTACAAGGCG
GGCCGGGCCC 251 CTACCCCGGA GGACTTTCCG AGGCAGCTTG CCCTCATGAA
GGAGCTTGTG 301 GACCTTTTGG GGCTGGAGCG CCTCGAGGTC CCGGGCTTTG
AGGCGGACGA 351 TGTCCTCGCC GCCCTGGCCA AGAAGGCGGA GCGGGAAGGG
TACGAGGTGC 401 GCATCCTCAC CGCCGACCGG GACCTCTTCC AGCTTCTTTC
GGACCGCATC 451 GCCGTCCTGC ACCCGGAAGG CCACCTCATC ACCCCGGGGT
GGCTTTGGGA 501 GAGGTACGGC CTGAGACCGG AGCAGTGGGT GGACTTCCGC
GCCCTGGCCG 551 GCGACCCTTC CGACAACATC CCCGGGGTGA AGGGGATCGG
CGAGAAGACG 601 GCCCTGAAGC TCCTAAAGGA GTGGGGTAGT CTGGAAAATA
TCCAAAAAAA 651 CCTGGACCAG GTCAGTCCCC CTTCCGTGCG CGAGAAGATC
CAGGCCCACC 701 TGGACGACCT CAGGCTCTCC CAGGAGCTTT CCCGGGTGCG
CACGGACCTT 751 CCCTTGGAGG TGGACTTTAG AAGGCGGCGG GAGCCCGATA
GGGAAGGCCT 801 TAGGGCCTTC TTAGAGCGGC TTGAGTTCGG GAGCCTCCTC
CACGAGTTCG 851 GCCTCCTGGA AAGCCCCCAG GCGGCGGAGG AGGCCCCTTG
GCCGCCGCCG 901 GAAGGGGCCT TCTTGGGCTT CCGCCTCTCC CGGCCCGAGC
CCATGTGGGC 951 GGAACTCCTT TCCTTGGCGG CAAGCGCCAA GGGCCGGGTC
TACCGGGCGG 1001 AGGCGCCCCA TAAGGCCCTT TCGGACCTGA AGGAGATCCG
GGGGCTTCTC 1051 GCCAAGGACC TCGCCGTCTT GGCCCTGAGG GAGGGGCTCG
GCCTTCCCCC 1101 CACGGACGAT CCCATGCTCC TCGCCTACCT CCTGGACCCC
TCCAACACCA 1151 CCCCCGAGGG CGTGGCCCGG CGCTACGGGG GGGAGTGGAC
GGAGGAGGCG 1201 GGGGAGAGGG CCTTGCTTGC CGAAAGGCTT TACGAGAACC
TCCTAAGCCG 1251 CCTGAAAGGG GAAGAAAAGC TCCTTTGGCT CTACGAGGAG
GTGGAAAAGC 1301 CCCTTTCCCG GGTCCTCGCC CACATGGAGG CCACGGGGGT
GAGGCTGGAC 1351 GTACCCTACC TAAGGGCCCT TTCCCTGGAG GTGGCGGCGG
AGATGGGCCG 1401 CCTGGAGGAG GAGGTTTTCC GCCTGGCGGG CCACCCCTTC
AACCTGAACT 1451 CCCGCGACCA GCTGGAAAGG GTGCTCTTTG ACGAGCTCGG
GCTTCCCCCC 1501 ATCGGCAAGA CGGAAAAAAC CGGGAAGCGC TCCACCAGCG
CCGCCGTCCT 1551 CGAGGCCCTG CGGGAGGCCC ACCCCATCGT GGAGAAGATC
CTCCAGTACC 1601 GGGAGCTCGC CAAGCTCAAG GGCACCTACA TTGACCCCCT
TCCCGCCGTG 1651 GTCCACCCCA GGACGGGCAG GCTCCACACC CGCTTCAACC
AGACGGCCAC 1701 GGCCACGGGC CGCCTTTCCA GCTCCGACCC CAACCTGCAG
AACATTCCCG 1751 TGCGCACCCC CTTGGGCCAA AGGATCCGCC GGGCCTTCGT
GGCCGAGGAG 1801 GGGTACCTTC TCGTGGCCCT GGACTATAGC CAGATTGAGC
TGAGGGTCCT 1851 GGCCCACCTC TCGGGGGACG AAAACCTCAT CCGGGTCTTC
CAGGAGGGCC 1901 GGGACATCCA CACCCAGACG GCGAGCTGGA TGTTCGGCCT
GCCGGCGGAG 1951 GCCATAGACC CCCTCAGGCG CCGGGCGGCC AAGACCATCA
ACTTCGGCGT 2001 CCTCTACGGC ATGTCCGCCC ACCGGCTTTC CCAGGAGCTG
GGCATCCCCT 2051 ACGAGGAGGC GGTGGCCTTC ATTGACCGCT ATTTCCAGAG
CTACCCCAAG 2101 GTGAAGGCCT GGATTGAAAG GACCCTGGAG GAGGGGCGGC
AAAGGGGGTA 2151 CGTGGAGACC CTCTTCGGCC GCAGGCGCTA CGTGCCCGAC
CTCAACGCCC 2201 GGGTAAAGAG CGTGCGGGAG GCGGCGGAGC GCATGGCCTT
TAACATGCCC 2251 GTGCAGGGCA CCGCCGCTGA CCTGATGAAG CTCGCCATGG
TGAGGCTCTT 2301 CCCTAGGCTT CCCGAGGTGG GGGCGAGGAT GCTCCTCCAG
GTCCACGACG 2351 AGCTCCTCCT GGAGGCGCCC AAGGAGCGGG CGGAGGAGGC
GGCGGCCCTG 2401 GCCAAGCAGG TCATGGAGGG AGTCTGGCCC CTGGCCGTGC
CCCTGGAGGT 2451 GGAGGTGGGC ATCGGGGAGG ACTGGCTTTC CGCCAAGGGC
TAGTCGAC
[0062] In another embodiment, the invention provides a nucleic acid
from Thermus brockianus strain YS38 having SEQ ID NO:3, a
derivative nucleic acid having GAC (encoding Asp) in place of GGC
(encoding Gly) at positions 127-129. SEQ ID NO:3 is provided
below:
TABLE-US-00003 1 ATGCTTCCCC TCTTTGAGCC CAAGGGCCGG GTGCTCCTGG
TGGACGGCCA 51 CCACCTGGCC TACCGTAACT TCTTCGCCCT CAAGGGGCTC
ACCACGAGCC 101 GGGGCGAGCC CGTGCAAGGG GTCTACGACT TCGCCAAAAG
CCTCCTCAAG 151 GCCCTGAAGG AGGACGGGGA CGTGGTCATC GTGGTCTTTG
ACGCCAAGGC 201 CCCCTCTTTT CGCCACGAGG CCTACGCGGC CTACAAGGCG
GGCCGGGCCC 251 CTACCCCGGA GGACTTTCCG AGGCAGCTTG CCCTCATGAA
GGAGCTTGTG 301 GACCTTTTGG GGCTGGAGCG CCTCGAGGTC CCGGGCTTTG
AGGCGGACGA 351 TGTCCTCGCC GCCCTGGCCA AGAAGGCGGA GCGGGAAGGG
TACGAGGTGC 401 GCATCCTCAC CGCCGACCGG GACCTCTTCC AGCTTCTTTC
GGACCGCATC 451 GCCGTCCTGC ACCCGGAAGG CCACCTCATC ACCCCGGGGT
GGCTTTGGGA 501 GAGGTACGGC CTGAGACCGG AGCAGTGGGT GGACTTCCGC
GCCCTGGCCG 551 GCGACCCTTC CGACAACATC CCCGGGGTGA AGGGGATCGG
CGAGAAGACG 601 GCCCTGAAGC TCCTAAAGGA GTGGGGTAGT CTGGAAAATA
TCCAAAAAAA 651 CCTGGACCAG GTCAGTCCCC CTTCCGTGCG CGAGAAGATC
CAGGCCCACC 701 TGGACGACCT CAGGCTCTCC CAGGAGCTTT CCCGGGTGCG
CACGGACCTT 751 CCCTTGGAGG TGGACTTTAG AAGGCGGCGG GAGCCCGATA
GGGAAGGCCT 801 TAGGGCCTTC TTAGAGCGGC TTGAGTTCGG GAGCCTCCTC
CACGAGTTCG 851 GCCTCCTGGA AAGCCCCCAG GCGGCGGAGG AGGCCCCTTG
GCCGCCGCCG 901 GAAGGGGCCT TCTTGGGCTT CCGCCTCTCC CGGCCCGAGC
CCATGTGGGC 951 GGAACTCCTT TCCTTGGCGG CAAGCGCCAA GGGCCGGGTC
TACCGGGCCG 1001 AGGCGCCCCA TAAGGCCCTT TCGGACCTGA AGGAGATCCG
GGGGCTTCTC 1051 GCCAAGGACC TCGCCGTCTT GGCCCTGAGG GAGGGGCTCG
GCCTTCCCCC 1101 CACGGACGAT CCCATGCTCC TCGCCTACCT CCTGGACCCC
TCCAACACCA 1151 CCCCCGAGGG CGTGGCCCGG CGCTACGGGG GGGAGTGGAC
GGAGGAGGCG 1201 GGGGAGAGGG CCTTGCTTGC CGAAAGGCTT TACGAGAACC
TCCTAAGCCG 1251 CCTGAAAGGG GAAGAAAAGC TCCTTTGGCT CTACGAGGAG
GTGGAAAAGC 1301 CCCTTTCCCG GGTCCTCGCC CACATGGAGG CCACGGGGGT
GAGGCTGGAC 1351 GTACCCTACC TAAGGGCCCT TTCCCTGGAG GTGGCGGCGG
AGATGGGCCG 1401 CCTGGAGGAG GAGGTTTTCC GCCTGGCGGG CCACCCCTTC
AACCTGAACT 1451 CCCGCGACCA GCTGGAAAGG GTGCTCTTTG ACGAGCTCGG
GCTTCCCCCC 1501 ATCGGCAAGA CGGAAAAAAC CGGGAAGCGC TCCACCAGCG
CCGCCGTCCT 1551 CGAGGCCCTG CGGGAGGCCC ACCCCATCGT GGAGAAGATC
CTCCAGTACC 1601 GGGAGCTCGC CAAGCTCAAG GGCACCTACA TTGACCTCCT
TCCCGCCCTG 1651 GTCCACCCCA GGACGGGCAG GCTCCACACC CGCTTCAACC
AGACGGCCAC 1701 GGCCACGGGC CGCCTTTCCA GCTCCGACCC CAACCTGCAG
AACATTCCCG 1751 TGCGCACCCC CTTGGGCCAA AGGATCCGCC GGGCCTTCGT
GGCCGAGGAG 1801 GGGTACCTTC TCGTGGCCCT GGACTATAGC CAGATTGAGC
TGAGGGTCCT 1851 GGCCCACCTC TCGGGGGACG AAAACCTCAT CCGGGTCTTC
CAGGAGGGCC 1901 GGGACATCCA CACCCAGACG GCGAGCTGGA TGTTCGGCCT
GCCGGCGGAG 1951 GCCATAGACC CCCTCAGGCG CCGGGCGGCC AAGACCATCA
ACTTCGGCGT 2001 CCTCTACGGC ATGTCCGCCC ACCGGCTTTC CCAGGAGCTG
GGCATCCCCT 2051 ACGAGGAGGC GGTGGCCTTC ATTGACCGCT ATTTCCAGAG
CTACCCCAAG 2101 GTGAAGGCCT GGATTGAAAG GACCCTGGAG GAGGGGCGGC
AAAGGGGGTA 2151 CGTGGAGACC CTCTTCGGCC GCAGGCGCTA CGTGCCCGAC
CTCAACGCCC 2201 GGGTAAAGAG CGTGCGGGAG GCGGCGGAGC GCATGGCCTT
TAACATGCCC 2251 GTGCAGGGCA CCGCCGCTGA CCTGATGAAG CTCGCCATGG
TGAGGCTCTT 2301 CCCTAGGCTT CCCGAGGTGG GGGCGAGGAT GCTCCTCCAG
GTCCACGACG 2351 AGCTCCTCCT GGAGGCGCCC AAGGAGCGGG CGGAGGAGGC
GGCGGCCCTG 2401 GCCAAGGAGG TCATGGAGGG GGTCTGGCCC CTGGCCGTGC
CCCTGGAGGT 2451 GGAGGTGGGC ATCGGGGAGG ACTGGCTTTC CGCCAAGGGC TAG
[0063] In another embodiment, the invention provides a derivative
nucleic acid related to Thermus brockianus strain 2AZN having SEQ
ID NO:4. SEQ ID NO:4 is a derivative nucleic acid having GAC
(encoding Asp) in place of GGC (encoding Gly) at positions 127-129
and is provided below:
TABLE-US-00004 1 ATGCTTCCCC TCTTTGAGCC CAAGGGCCGG GTGCTCCTGG
TGGACGGCCA 51 CCACCTGGCC TACCGTAACT TCTTCGCCCT CAAGGGGCTC
ACCACGAGCC 101 GGGGCGAGCC CGTGCAAGGG GTCTACGACT TCGCCAAAAG
CCTCCTCAAG 151 GCCCTGAAGG AGGACGGGGA CGTGGTCATC GTGGTCTTTG
ACGCCAAGGC 201 CCCCTCTTTT CGCCACGAGG CCTACGGGGC CTACAAGGCG
GGCCGGGCCC 251 CTACCCCGGA GGACTTTCCG AGGCAGCTTG CCCTCATGAA
GGAGCTTGTG 301 GACCTTTTGG GGCTGGAGCG CCTCGAGGTC CCGGGCTTTG
AGGCGGACGA 351 TGTCCTCGCC GCCCTGGCCA AGAAGGCGGA GCGGGAAGGG
TACGAGGTGC 401 GCATCCTCAC CGCCGACCGG GACCTCTTCC AGCTTCTTTC
GGACCGCATC 451 GCCGTCCTGC ACCCGGAAGG CCACCTCATC ACCCCGGGGT
GGCTTTGGGA 501 GAGGTACGGC CTGAGACCGG AGCAGTGGGT GGACTTCCGC
GCCCTGGCCG 551 GCGACCCTTC CGACAACATC CCCGGGGTGA AGGGGATCGG
CGAGAAGACG 601 GCCCTGAAGC TCCTAAAGGA GTGGGGTAGT CTGGAAAATA
TCCAAAAAAA 651 CCTGGACCAG GTCAGTCCCC CTTCCGTGCG CGAGAAGATC
CAGGCCCACC 701 TGGACGACCT CAGGCTCTCC CAGGAGCTTT CCCGGGTGCG
CACGGACCTT 751 CCCTTGGAGG TGGACTTTAG AAGGCGGCGG GAGCCCGATA
GGGAAGGCCT 801 TAGGGCCTTC TTAGAGCGGC TTGAGTTCGG GAGCCTCCTC
CACGAGTTCG 851 GCCTCCTGGA AAGCCCCCAG GCGGCGGAGG AGGCCCCTTG
GCCGCCGCCG 901 GAAGGGGCCT TCTTGGGCTT CCGCCTCTCC CGGCCCGAGC
CCATGTGGGC 951 GGAACTCCTT TCCTTGGCGG CAAGCGCCAA GGGCCGGGTC
TACCGGGCGG 1001 AGGCGCCCCA TAAGGCCCTT TCGGACCTGA AGGAGATCCG
GGGGCTTCTC 1051 GCCAAGGACC TCGCCGTCTT GGCCCTGAGG GAGGGGCTCG
GCCTTCCCCC 1101 CACGGACGAT CCCATGCTCC TCGCCTACCT CCTGGACCCC
TCCAACACCA 1151 CCCCCGAGGG CGTGGCCCGG CGCTACGGGG GGGAGTGGAC
GGAGGAGGCG 1201 GGGGAGAGGG CCTTGCTTGC CGAAAGGCTT TACGAGAACC
TCCTAAGCCG 1251 CCTGAAAGGG GAAGAAAAGC TCCTTTGGCT CTACGAGGAG
GTCGAAAAGC 1301 CCCTTTCCCG GGTCCTCGCC CACATGGAGG CCACGGGGGT
GAGGCTGGAC 1351 GTACCCTACC TAAGGGCCCT TTCCCTGGAG GTGGCGGCGG
AGATGGGCCG 1401 CCTGGAGGAG GAGGTTTTCC GCCTGGCGGG CCACCCCTTC
AACCTGAACT 1451 CCCGCGACCA GCTGGAAAGG GTGCTCTTTG ACGAGCTCGG
GCTTCCCCCC 1501 ATCGGCAAGA CGGAAAAAAC CGGGAAGCGC TCCACCAGCG
CCGCCGTCCT 1551 CGAGGCCCTG CGGGAGGCCC ACCCCATCGT GGAGAAGATC
CTCCAGTACC 1601 GGGAGCTCGC CAAGCTCAAG GGCACCTACA TTGACCCCCT
TCCCGCCCTG 1651 GTCCACCCCA GGACGGGCAG GCTCCACACC CGCTTCAACC
AGACGGCCAC 1701 GGCCACGGGC CGCCTTTCCA GCTCCGACCC CAACCTGCAG
AACATTCCCG 1751 TGCGCACCCC CTTGGGCCAA AGGATCCGCC GGGCCTTCGT
GGCCGAGGAG 1801 GGGTACCTTC TCGTGGCCCT GGACTATAGC CAGATTGAGC
TGAGGGTCCT 1851 GGCCCACCTC TCGGGGGACG AAAACCTCAT CCGGGTCTTC
CAGGAGGGCC 1901 GGGACATCCA CACCCAGACG GCGAGCTGGA TGTTCGGCCT
GCCGGCGGAG 1951 GCCATAGACC CCCTCAGGCG CCGGGCGGCC AAGACCATCA
ACTTCGGCGT 2001 CCTCTACGGC ATGTCCGCCC ACCGGCTTTC CCAGGAGCTG
GGCATCCCCT 2051 ACGAGGAGGC GGTGGCCTTC ATTGACCGCT ATTTCCAGAG
CTACCCCAAG 2101 GTGAAGGCCT GGATTGAAAG GACCCTGGAG GAGGGGCGGC
AAAGGGGGTA 2151 CGTGGAGACC CTCTTCGGCC GCAGGCGCTA CGTGCCCGAC
CTCAACGCCC 2201 CGGTAAAGAG CGTGCGGGAG GCGGCGGAGC GCATGGCCTT
TAACATGCCC 2251 GTGCAGGGCA CCGCCGCTGA CCTGATGAAG CTCGCCATGG
TGAGGCTCTT 2301 CCCTAGGCTT CCCGAGGTGG GGGCGAGGAT GCTCCTCCAG
GTCCACGACG 2351 AGCTCCTCCT GGAGGCGCCC AAGGAGCGGG CGGAGGAGGC
GGCGGCCCTG 2401 GCCAAGGAGG TCATGGAGGG AGTCTGGCCC CTGGCCGTGC
CCCTGGAGGT 2451 GGAGGTGGGC ATCGGGGAGG ACTGGCTTTC CGCCAAGGGC
TAGTCGAC
[0064] In another embodiment, the invention provides a derivative
nucleic acid related to Thermus brockianus strain YS38, having SEQ
ID NO:5. SEQ ID NO:5 is a derivative nucleic acid having TAC
(encoding Tyr) in place of TTC (encoding Phe) at positions 1993-95
and is provided below:
TABLE-US-00005 1 ATGCTTCCCC TCTTTGAGCC CAAGGGCCGG GTGCTCCTGG
TGGACGGCCA 51 CCACCTGGCC TACCGTAACT TCTTCGCCCT CAAGGGGCTC
ACCACGAGCC 101 GGGGCGAGCC CGTGCAAGGG GTCTACGGCT TCGCCAAAAG
CCTCCTCAAG 151 GCCCTGAAGG AGGACGGGGA CGTGGTCATC GTGGTCTTTG
ACGCCAAGGC 201 CCCCTCTTTT CGCCACGAGG CCTACGGGGC CTACAAGGCG
GGCCGGGCCC 251 CTACCCCGGA GGACTTTCCG AGGCAGCTTG CCCTCATGAA
GGAGCTTGTG 301 GACCTTTTGG GGCTGGAGCG CCTCGAGGTC CCGGGCTTTG
AGGCGGACGA 351 TGTCCTCGCC GCCCTGGCCA AGAAGGCGGA GCGGGAAGGG
TACGAGGTGC 401 GCATCCTCAC CGCCGACCGG GACCTCTTCC AGCTTCTTTC
GGACCGCATC 451 GCCGTCCTGC ACCCGGAAGG CCACCTCATC ACCCCGGGGT
GGCTTTGGGA 501 GAGGTACGGC CTGAGACCGG AGCAGTGGGT GGACTTCCGC
GCCCTGGCCG 551 GCGACCCTTC CGACAACATC CCCGGGGTGA AGGGGATCGG
CGAGAAGACG 601 GCCCTGAAGC TCCTAAAGGA GTGGGGTAGT CTGGAAAATA
TCCAAAAAAA 651 CCTGGACCAG GTCAGTCCCC CTTCCGTGCG CGAGAAGATC
CAGGCCCACC 701 TGGACGACCT CAGGCTCTCC CAGGAGCTTT CCCGGGTGCG
CACGGACCTT 751 CCCTTGGAGG TGGACTTTAG AAGGCGGCGG GAGCCCGATA
GGGAAGGCCT 801 TAGGGCCTTC TTAGAGCGGC TTGAGTTCGG GAGCCTCCTC
CACGAGTTCG 851 GCCTCCTGGA AAGCCCCCAG GCGGCGGAGG AGGCCCCTTG
GCCGCCGCCG 901 GAAGGGGCCT TCTTGGGCTT CCGCCTCTCC CGGCCCGAGC
CCATGTGGGC 951 GGAACTCCTT TCCTTGGCGG CAAGCGCCAA GGGCCGGGTC
TACCGGGCGG 1001 AGGCGCCCCA TAAGGCCCTT TCGGACCTGA AGGAGATCCG
GGGGCTTCTC 1051 GCCAAGGACC TCGCCGTCTT GGCCCTGAGG GAGGGGCTCG
GCCTTCCCCC 1101 CACGGACGAT CCCATGCTCC TCGCCTACCT CCTGGACCCC
TCCAACACCA 1151 CCCCCGAGGG CGTGGCCCGG CGCTACGGGG GGGAGTGGAC
GGAGGAGGCG 1201 GGGGAGAGGG CCTTGCTTGC CGAAAGGCTT TACGAGAACC
TCCTAAGCCG 1251 CCTGAAAGGG GAAGAAAAGC TCCTTTGGCT CTACGAGGAG
GTGGAAAAGC 1301 CCCTTTCCCG GGTCCTCGCC CACATGGAGG CCACGGGGGT
GAGGCTGGAC 1351 GTACCCTACC TAAGGGCCCT TTCCCTGGAG GTGGCGGCGG
AGATGGGCCG 1401 CCTGGAGGAG GAGGTTTTCC GCCTGGCGGG CCACCCCTTC
AACCTGAACT 1451 CCCGCGACCA GCTGGAAAGG GTGCTCTTTG ACGAGCTCGG
GCTTCCCCCC 1501 ATCGGCAAGA CGGAAAAAAC CGGGAAGCGC TCCACCAGCG
CCGCCGTCCT 1551 CGAGGCCCTG CGGGAGGCCC ACCCCATCGT GGAGAAGATC
CTCCAGTACC 1601 GGGAGCTCGC CAAGCTCAAG GGCACCTACA TTGACCTCCT
TCCCGCCCTG 1651 GTCCACCCCA GGACGGGCAG GCTCCACACC CGCTTCAACC
AGACGGCCAC 1701 GGCCACGGGC CGCCTTTCCA GCTCCGACCC CAACCTGCAG
AACATTCCCG 1751 TGCGCACCCC CTTGGGCCAA AGGATCCGCC GGGCCTTCGT
GGCCGAGGAG 1801 GGGTACCTTC TCGTGGCCCT GGACTATAGC CAGATTGAGC
TGAGGGTCCT 1851 GGCCCACCTC TCGGGGGACG AAAACCTCAT CCGGGTCTTC
CAGGAGGGCC 1901 GGGACATCCA CACCCAGACG GCGAGCTGGA TGTTCGGCCT
GCCGGCGGAG 1951 GCCATAGACC CCCTCAGGCG CCGGGCGGCC AAGACCATCA
ACTACGGCGT 2001 CCTCTACGGC ATGTCCGCCC ACCGGCTTTC CCAGGAGCTG
GGCATCCCCT 2051 ACGAGGAGGC GGTGGCCTTC ATTGACCGCT ATTTCCAGAG
CTACCCCAAG 2101 GTGAAGGCCT GGATTGAAAG GACCCTGGAG GAGGGGCGGC
AAAGGGGGTA 2151 CGTGGAGACC CTCTTCGGCC GCAGGCGCTA CGTGCCCGAC
CTCAACGCCC 2201 GGGTAAAGAG CGTGCGGGAG GCGGCGGAGC GCATGGCCTT
TAACATGCCC 2251 GTGCAGGGCA CCGCCGCTGA CCTGATGAAG CTCGCCATGG
TGAGGCTCTT 2301 CCCTAGGCTT CCCGAGGTGG GGGCGAGGAT GCTCCTCCAG
GTCCACGACG 2351 AGCTCCTCCT GGAGGCGCCC AAGGAGCGGG CGGAGGAGGC
GGCGGCCCTG 2401 GCCAAGGAGG TCATGGAGGG GGTCTGGCCC CTGGCCGTGC
CCCTGGAGGT 2451 GGAGGTGGGC ATCGGGGAGG ACTGGCTTTC CGCGAAGGGC TAG
[0065] In another embodiment, the invention provides a derivative
nucleic acid related to Thermus brockianus strain 2AZN, having SEQ
ID NO:6. SEQ ID NO:6 is a derivative nucleic acid having TAC
(encoding Tyr) in place of TTC (encoding Phe) at positions 1993-95
and is provided below:
TABLE-US-00006 1 ATGCTTCCCC TCTTTGAGCC CAAGGGCCGG GTGCTCCTGG
TGGACGGCCA 51 CCACCTGGCC TACCGTAACT TCTTCGCCCT CAAGGGGCTC
ACCACGAGCC 101 GGGGCGAGCC CGTGCAAGGG GTCTACGGCT TCGCCAAAAG
CCTCCTCAAG 151 GCCCTGAAGG AGGACGGGGA CGTGGTCATC GTGGTCTTTG
ACGCCAAGGC 201 CCCCTCTTTT CGCCACGAGG CCTACGGGGC CTACAAGGCG
GGCCGGGCCC 251 CTACCCCGGA GGACTTTCCG AGGCAGCTTG CCCTCATGAA
GGAGCTTGTG 301 GACCTTTTGG GGCTGGAGCG CCTCGAGGTC CCGGGCTTTG
AGGCGGACGA 351 TGTCCTCGCC GCCCTGGCCA AGAAGGCGGA GCGGGAAGGG
TACGAGGTGC 401 GCATCCTCAC CGCCGACCGG GACCTCTTCC AGCTTCTTTC
GGACCGCATC 451 GCCGTCCTGC ACCCGGAAGG CCACCTCATC ACCCCGGGGT
GGCTTTGGGA 501 GAGGTACGGC CTGAGACCGG AGCAGTGGGT GGACTTCCGC
GCCCTGGCCG 551 GCGACCCTTC CGACAACATC CCCGGGGTGA AGGGGATCGG
CGAGAAGACG 601 GCCCTGAAGC TCCTAAAGGA GTGGGGTAGT CTGGAAAATA
TCCAAAAAAA 651 CCTGGACCAG GTCAGTCCCC CTTCCGTGCG CGAGAAGATC
CAGGCCCACC 701 TGGACGACCT CAGGCTCTCC CAGGAGCTTT CCCGGGTGCG
CACGGACCTT 751 CCCTTGGAGG TGGACTTTAG AAGGCGGCGG GAGCCCGATA
GGGAAGGCCT 801 TAGGGCCTTC TTAGAGCGGC TTGAGTTCGG GAGCCTCCTC
CACGAGTTCG 851 GCCTCCTGGA AAGCCCCCAG GCGGCGGAGG AGGCCCCTTG
GCCGCCGCCG 901 GAAGGGGCCT TCTTGGGCTT CCGCCTCTCC CGGCCCGAGC
CCATGTGGGC 951 GGAACTCCTT TCCTTGGCGG CAAGCGCCAA GGGCCGGGTC
TACCGGGCGG 1001 AGGCGCCCCA TAAGGCCCTT TCGGACCTGA AGGAGATCCG
GGGGCTTCTC 1051 GCCAAGGACC TCGCCGTCTT GGCCCTGAGG GAGGGGCTCG
GCCTTCCCCC 1101 CACGGACGAT CCCATGCTCC TCGCCTACCT CCTGGACCCC
TCCAACACCA 1151 CCCCCGAGGG CGTGGCCCGG CGCTACGGGG GGGAGTGGAC
GGAGGAGGCG 1201 GGGGAGAGGG CCTTGCTTGC CGAAAGGCTT TACGAGAACC
TCCTAAGCCG 1251 CCTGAAAGGG GAAGAAAAGC TCCTTTGGCT CTACGAGGAG
GTGGAAAAGC 1301 CCCTTTCCCG GGTCCTCGCC CACATGGAGG CCACGGGGGT
GAGGCTGGAC 1351 GTACCCTACC TAAGGGCCCT TTCCCTGGAG GTGGCGGCGG
AGATGGGCCG 1401 CCTGGAGGAG GAGGTTTTCC GCCTGGCGGG CCACCCCTTC
AACCTGAACT 1451 CCCGCGACCA GCTGGAAAGG GTGCTCTTTG ACGAGCTCGG
GCTTCCCCCC 1501 ATCGGCAAGA CGGAAAAAAC CGGGAAGCGC TCCACCAGCG
CCGCCGTCCT 1551 CGAGGCCCTG CGGGAGGCCC ACCCCATCGT GGAGAAGATC
CTCCAGTACC 1601 GGGAGCTCGC CAAGCTCAAG GGCACCTACA TTGACCCCCT
TCCCGCCCTG 1651 GTCCACCCCA GGACGGGCAG GCTCCACACC CGCTTCAACC
AGACGGCCAC 1701 GGCCACGGGC CGCCTTTCCA GCTCCGACCC CAACCTGCAG
AACATTCCCG 1751 TCCGCACCCC CTTGGGCCAA AGGATCCGCC GGGCCTTCGT
GGCCGAGGAG 1801 GGGTACCTTC TCGTGGCCCT GGACTATAGC CAGATTGAGC
TGAGGGTCCT 1851 GGCCCACCTC TCGGGGGACG AAAACCTCAT CCGGGTCTTC
CAGGAGGGCC 1901 GGGACATCCA CACCCAGACG GCGAGCTGGA TGTTCGGCCT
GCCGGCGGAG 1951 GCCATAGACC CCCTCAGGCG CCGGGCGGCC AAGACCATCA
ACTACGGCGT 2001 CCTCTACGGC ATGTCCGCCC ACCGGCTTTC CCAGGAGCTG
GGCATCCCCT 2051 ACGAGGAGGC GGTGGCCTTC ATTGACCGCT ATTTCCAGAG
CTACCCCAAG 2101 GTGAAGGCCT GGATTGAAAG GACCCTGGAG GAGGGGCGGC
AAAGGGGGTA 2151 CGTGGAGACC CTCTTCGGCC GCAGGCGCTA CGTGCCCGAC
CTCAACGCCC 2201 GGGTAAAGAG CGTGCGGGAG GCGGCGGAGC GCATGGCCTT
TAACATGCCC 2251 GTGCAGGGCA CCGCCGCTGA CCTGATGAAG CTCGCCATGG
TGAGGCTCTT 2301 CCCTAGGCTT CCCGAGGTGG GGGCGAGGAT GCTCCTCCAG
GTCCACGACG 2351 AGCTCCTCCT GGAGGCGCCC AAGGAGCGGG CGGAGGAGGC
GGCGGCCCTG 2401 GCCAAGGAGG TCATGGAGGG AGTCTGGCCC CTGGCCGTGC
CCCTGGAGGT 2451 GGAGGTGGGC ATCGGGGAGG ACTGGCTTTC CGCCAAGGGC
TAGTCGAC
[0066] In another embodiment, the invention provides a derivative
nucleic acid related to Thermus brockianus strain YS38 having SEQ
ID NO:7. A nucleic acid having SEQ ID NO:7 has GAC (encoding Asp)
in place of GGC (encoding Gly) at positions 127-129 and TAC
(encoding Tyr) in place of TTC (encoding Phe) at positions
1993-95.
TABLE-US-00007 1 ATGCTTCCCC TCTTTGAGCC CAAGGGCCGG GTGCTCCTGG
TGGACGGCCA 51 CCACCTGGCC TACCGTAACT TCTTCGCCCT CAAGGGGCTC
ACCACGAGCC 101 GGGGCGAGCC CGTGCAAGGG GTCTACGACT TCGCCAAAAG
CCTCCTCAAG 151 GCCCTGAAGG AGGACGGGGA CGTGGTCATC GTGGTCTTTG
ACGCCAAGGC 201 CCCCTCTTTT CGCCACGAGG CCTACGGGGC CTACAAGGCG
GGCCGGGCCC 251 CTACCCCGGA GGACTTTCCG AGGCAGCTTG CCCTCATGAA
GGAGCTTGTG 301 GACCTTTTGG GGCTGGAGCG CCTCGAGGTC CCGGGCTTTG
AGGCGGACGA 351 TGTCCTCGCC GCCCTGGCCA AGAAGGCGGA GCGGGAAGGG
TACGAGGTGC 401 GCATCCTCAC CGCCGACCGG GACCTCTTCC AGCTTCTTTC
GGACCGCATC 451 GCCGTCCTGC ACCCGGAAGG CCACCTCATC ACCCCGGGGT
GGCTTTGGGA 501 GAGGTACGGC CTGAGACCGG AGCAGTGGGT GGACTTCCGC
GCCCTGGCCG 551 GCGACCCTTC CGACAACATC CCCGGGGTGA AGGGGATCGG
CGAGAAGACG 601 GCCCTGAAGC TCCTAAAGGA GTGGGGTAGT CTGGAAAATA
TCCAAAAAAA 651 CCTGGACCAG GTCAGTCCCC CTTCCGTGCG CGAGAAGATC
CAGGCCCACC 701 TGGACGACCT CAGGCTCTCC CAGGAGCTTT CCCGGGTGCG
CACGGACCTT 751 CCCTTGGAGG TGGACTTTAG AAGGCGGCGG GAGCCCGATA
GGGAAGGCCT 801 TAGGGCCTTC TTAGAGCGGC TTGAGTTCGG GAGCCTCCTC
CACGAGTTCG 851 GCCTCCTGGA AAGCCCCCAG GCGGCGGAGG AGGCCCCTTG
GCCGCCGCCG 901 GAAGGGGCCT TCTTGGGCTT CCGCCTCTCC CGGCCCGAGC
CCATGTGGGC 951 GGAACTCCTT TCCTTGGCGG CAAGCGCCAA GGGCCGGGTC
TACCGGGCGG 1001 AGGCGCCCCA TAAGGCCCTT TCGGACCTGA AGGAGATCCG
GGGGCTTCTC 1051 GCCAAGGACC TCGCCGTCTT GGCCCTGAGG GAGGGGCTCG
GCCTTCCCCC 1101 CACGGACGAT CCCATGCTCC TCGCCTACCT CCTGGACCCC
TCCAACACCA 1151 CCCCCGAGGG CGTGGCCCGG CGCTACGGGG GGGAGTGGAC
GGAGGAGGCG 1201 GGGGAGAGGG CCTTGCTTGC CGAAAGGCTT TACGAGAACC
TCCTAAGCCG 1251 CCTGAAAGGG GAAGAAAAGC TCCTTTGGCT CTACGAGGAG
GTGGAAAAGC 1301 CCCTTTCCCG GGTCCTCGCC CACATGGAGG CCACGGGGGT
GAGGCTGGAC 1351 GTACCCTACC TAAGGGCCCT TTCCCTGGAG GTGGCGGCGG
AGATGGGCCG 1401 CCTGGAGGAG GAGGTTTTCC GCCTGGCGGG CCACCCCTTC
AACCTGAACT 1451 CCCGCGACCA GCTGGAAAGG GTGCTCTTTG ACGAGCTCGG
GCTTCCCCCC 1501 ATCGGCAAGA CGGAAAAAAC CGGGAAGCGC TCCACCAGCG
CCGCCGTCCT 1551 CGAGGCCCTG CGGGAGGCCC ACCCCATCGT GGAGAAGATC
CTCCAGTACC 1601 GGGAGCTCGC CAAGCTGAAG GGCACCTACA TTGACCTCCT
TCCCGCCCTG 1651 GTCCACCCCA GGACGGGCAG GCTCCACACC CGCTTCAACC
AGACGGCCAC 1701 GGCCACGGGC CGCCTTTCCA GCTCCGACCC CAACCTGCAG
AACATTCCCG 1751 TGCGCACCCC CTTGGGCCAA AGGATCCGCC GGGCCTTCGT
GGCCGAGGAG 1801 GGGTACCTTC TCGTGGCCCT GGACTATAGC CAGATTGAGC
TGAGGGTCCT 1851 GGCCCACCTC TCGGGGGACG AAAACCTCAT CCGGGTCTTC
CAGGAGGGCC 1901 GGGACATCCA CACCCAGACG GCGAGCTGGA TGTTCGGCCT
GCCGGCGGAG 1951 GCCATAGACC CCCTCAGGCG CCGGGCGGCC AAGACCATCA
ACTACGGCGT 2001 CCTCTACGGC ATGTCCGCCC ACCGGCTTTC CCAGGAGCTG
GGCATCCCCT 2051 ACGAGGAGGC GGTGGCCTTC ATTGACCGCT ATTTCCAGAG
CTACCCCAAG 2101 GTGAAGGCCT GGATTGAAAG GACCCTGGAG GAGGGGCGGC
AAAGGGGGTA 2151 CGTGGAGACC CTCTTCGGCC GCAGGCGCTA CGTGCCCGAC
CTCAACGCCC 2201 GGGTAAAGAG CGTGCGGGAG GCGGCGGAGC GCATGGCCTT
TAACATGCCC 2251 GTGCAGGGCA CCGCCGCTGA CCTGATGAAG CTCGCCATGG
TGAGGCTCTT 2301 CCCTAGGCTT CCCGAGGTGG GGGCGAGGAT GCTCCTCCAG
GTCCACGACG 2351 AGCTCCTCCT GGAGGCGCCC AAGGAGCGGG CGGAGGAGGC
GGCGGCCCTG 2401 GCCAAGGAGG TCATGGAGGG GGTGTGGCCC CTGGCCGTGC
CCCTGGAGGT 2451 GGAGGTGGGC ATCGGGGAGG ACTGGCTTTC CGCCAAGGGC TAG
[0067] In another embodiment, the invention provides a derivative
nucleic acid related to Thermus brockianus strain 2AZN having SEQ
ID NO:8. A nucleic acid having SEQ ID NO:8 has GAC (encoding Asp)
in place of GGC (encoding Gly) at positions 127-129 and TAC
(encoding Tyr) in place of TTC (encoding Phe) at positions
1993-95.
TABLE-US-00008 1 ATGCTTCCCC TCTTTGAGCC CAAGGGCCGG GTGCTCCTGG
TGGACGGCCA 51 CCACCTGGCC TACCGTAACT TCTTCGCCCT CAAGGGGCTC
ACCACGAGCC 101 GGGGCGAGCC CGTGCAAGGG GTCTACGACT TCGCCAAAAG
CCTCCTCAAG 151 GCCCTGAAGG AGGACGGGGA CGTGGTCATC GTGGTCTTTG
ACGCCAAGGC 201 CCCCTCTTTT CGCCACGAGG CCTACGGGGC CTACAAGGCG
GGCCGGGCCC 251 CTACCCCGGA GGACTTTCCG AGGCAGCTTG CCCTCATGAA
GGAGCTTGTG 301 GACCTTTTGG GGCTGGAGCG CCTCGAGGTC CCGGGCTTTG
AGGCGGACGA 351 TGTCCTCGCC GCCCTGGCCA AGAAGGCGGA GCGGGAAGGG
TACGAGGTGC 401 GCATCCTCAC CGCCGACCGG GACCTCTTCC AGCTTCTTTC
GGACCGCATC 451 GCCGTCCTGC ACCCGGAAGG CCACCTCATC ACCCCGGGGT
GGCTTTGGGA 501 GAGGTACGGC CTGAGACCGG AGCAGTGGGT GGACTTCCGC
GCCCTGGCCG 551 GCGACCCTTC CGACAACATC CCCGGGGTGA AGGGGATCGG
CGAGAAGACG 601 GCCCTGAAGC TCCTAAAGGA GTGGGGTAGT CTGGAAAATA
TCCAAAAAAA 651 CCTGGACCAG GTCAGTCCCC CTTCCGTGCG CGAGAAGATC
CAGGCCCACC 701 TGGACGACCT CAGGCTCTCC CAGGAGCTTT CCCGGGTGCG
CACGGACCTT 751 CCCTTGGAGG TGGACTTTAG AAGGCGGCGG GAGCCCGATA
GGGAAGGCCT 801 TAGGGCCTTC TTAGAGCGGC TTGAGTTCGG GAGCCTCCTC
CACGAGTTCG 851 GCCTCCTGGA AAGCCCCCAG GCGGCGGAGG AGGCCCCTTG
GCCGCCGCCG 901 GAAGGGGCCT TCTTGGGCTT CCGCCTCTCC CGGCCCGAGC
CCATGTGGGC 951 GGAACTCCTT TCCTTGGCGG CAAGCGCCAA GGGCCGGGTC
TACCGGGCGG 1001 AGGCGCCCCA TAAGGCCCTT TCGGACCTGA AGGAGATCCG
GGGGCTTCTC 1051 GCCAAGGACC TCGCCGTCTT GGCCCTGAGG GAGGGGCTCG
GCCTTCCCCC 1101 CACGGACGAT CCCATGCTCC TCGCCTACCT CCTGGACCCC
TCCAACACCA 1151 CCCCCGAGGG CGTGGCCCGG CGCTACGGGG GGGAGTGGAC
GGAGGAGGCG 1201 GGGGAGAGGG CCTTGCTTGC CGAAAGGCTT TACGAGAACC
TCCTAAGCCG 1251 CCTGAAAGGG GAAGAAAAGC TCCTTTGGCT CTACGAGGAG
GTGGAAAAGC 1301 CCCTTTCCCG GGTCCTCGCC CACATGGAGG CCACGGGGGT
GAGGCTGGAC 1351 GTACCCTACC TAAGGGCCCT TTCCCTGGAG GTGGCGGCGG
AGATGGGCCG 1401 CCTGGAGGAG GAGGTTTTCC GCCTGGCGGG CCACCCCTTC
AACCTGAACT 1451 CCCGCGACCA GCTGGAAAGG GTGCTCTTTG ACGAGCTCGG
GCTTCCCCCC 1501 ATCGGCAAGA CGGAAAAAAC CGGGAAGCGC TCCACCAGCG
CCGCCGTCCT 1551 CGAGGCCCTG CGGGAGGCCC ACCCCATCGT GGAGAAGATC
CTCCAGTACC 1601 GGGAGCTCGC CAAGCTCAAG GGCACCTACA TTGACCCCCT
TCCCGCCCTG 1651 GTCCACCCCA GGACGGGCAG GCTCCACACC CGCTTCAACC
AGACGGCCAC 1701 GGCCACGGGC CGCCTTTCCA GCTCCGACCC CAACCTGCAG
AACATTCCCG 1751 TGCGCACCCC CTTGGGCCAA AGGATCCGCC GGGCCTTCGT
GGCCGAGGAG 1801 GGGTACCTTC TCGTGGCCCT GGACTATAGC CAGATTGAGC
TGAGGGTCCT 1851 GGCCCACCTC TCGGGGGACG AAAACCTCAT CCGGGTCTTC
CAGGAGGGCC 1901 GGGACATCCA CACCCAGACG GCGAGCTGGA TGTTCGGCCT
GCCGGCGGAG 1951 GCCATAGACC CCCTCAGGCG CCGGGCGGCC AAGACCATCA
ACTACGGCGT 2001 CCTCTACGGC ATGTCCGCCC ACCGGCTTTC CCAGGAGCTG
GGCATCCCCT 2051 ACGAGGAGGC GGTGGCCTTC ATTGACCGCT ATTTCCAGAG
CTACCCCAAG 2101 GTGAAGGCCT GGATTGAAAG GACCCTGGAG GAGGGGCGGC
AAAGGGGGTA 2151 CGTGGAGACC CTCTTCGGCC GCAGGCGCTA CGTGCCCGAC
CTCAACGCCC 2201 GGGTAAAGAG CGTGCGGGAG GCGGCGGAGC GCATGGCCTT
TAACATGCCC 2251 GTGCAGGGCA CCGCCGCTGA CCTGATGAAG CTCGCCATGG
TGAGGCTCTT 2301 CCCTAGGCTT CCCCAGGTGG GGGCGAGGAT GCTCCTCCAG
GTCCACGACG 2351 AGCTCCTCCT GGAGGCGCCC AAGGAGCGGG CGGAGGAGGC
GGCGGCCCTG 2401 GCCAAGGAGG TCATGGAGGG AGTCTGGCCC CTGGCCGTGC
CCCTGGAGGT 2451 GGAGGTGGGC ATCGGGGAGG ACTGGCTTTC CGCCAAGGGC
TAGTCGAC
[0068] The substitution of TAC (encoding Tyr) for TTC (encoding
Phe) at the indicated positions can reduce discrimination against
ddNTP incorporation by DNA polymerase I. See, e.g., U.S. Pat. No.
5,614,365, which is incorporated herein by reference. The
substitution of GAC (encoding Asp) for GGG (encoding Gly) at the
indicated positions removes the 5'-3' exonuclease activity.
[0069] The nucleic acids of the invention have homology to portions
of the DNA sequences encoding the thermostable DNA polymerases of
Thermus aquaticus and Thermus thermophilus (see FIG. 1). However,
significant portions of the nucleic acid sequences of the present
invention are distinct.
[0070] The invention also encompasses fragment and variant nucleic
acids of SEQ ID NO:1-8. Nucleic acid "fragments" encompassed by the
invention are of two general types. First, fragment nucleic acids
that do not encode a full length DNA polymerase but do encode a
thermally stable polypeptide with DNA polymerase activity are
encompassed within the invention. Second, fragment nucleic acids
useful as hybridization probes but that generally do not encode
polymerases retaining biological activity are also encompassed
within the invention. Thus, fragments of nucleotide sequences such
as SEQ ID NO:1-8 may be as small as about 9 nucleotides, about 12
nucleotides, about 15 nucleotides, about 17 nucleotides, about 18
nucleotides, about 20 nucleotides, about 50 nucleotides, about 100
nucleotides or more. In general, a fragment nucleic acid of the
invention can have any upper size limit so long as it is related in
sequence to the nucleic acids of the invention but is not full
length.
[0071] As indicated above, "variants" are substantially similar or
substantially homologous sequences. For nucleotide sequences,
variants include those sequences that, because of the degeneracy of
the genetic code, encode the identical amino acid sequence of the
native DNA polymerase I protein. Variant nucleic acids also include
those that encode polypeptides that do not have amino acid
sequences identical to that of a native DNA polymerase I protein,
but that encode an active, thermally stable DNA polymerase I with
conservative changes in the amino acid sequence.
[0072] As is known by one of skill in the art, the genetic code is
"degenerate," meaning that several trinucleotide codons can encode
the same amino acid. This degeneracy is apparent from Table 1.
TABLE-US-00009 TABLE 1
1.sup.st                    Second Position                     3.sup.rd
Position   T            C            A             G            Position
T          TTT = Phe    TCT = Ser    TAT = Tyr     TGT = Cys    T
T          TTC = Phe    TCC = Ser    TAC = Tyr     TGC = Cys    C
T          TTA = Leu    TCA = Ser    TAA = Stop    TGA = Stop   A
T          TTG = Leu    TCG = Ser    TAG = Stop    TGG = Trp    G
C          CTT = Leu    CCT = Pro    CAT = His     CGT = Arg    T
C          CTC = Leu    CCC = Pro    CAC = His     CGC = Arg    C
C          CTA = Leu    CCA = Pro    CAA = Gln     CGA = Arg    A
C          CTG = Leu    CCG = Pro    CAG = Gln     CGG = Arg    G
A          ATT = Ile    ACT = Thr    AAT = Asn     AGT = Ser    T
A          ATC = Ile    ACC = Thr    AAC = Asn     AGC = Ser    C
A          ATA = Ile    ACA = Thr    AAA = Lys     AGA = Arg    A
A          ATG = Met    ACG = Thr    AAG = Lys     AGG = Arg    G
G          GTT = Val    GCT = Ala    GAT = Asp     GGT = Gly    T
G          GTC = Val    GCC = Ala    GAC = Asp     GGC = Gly    C
G          GTA = Val    GCA = Ala    GAA = Glu     GGA = Gly    A
G          GTG = Val    GCG = Ala    GAG = Glu     GGG = Gly    G
Hence, many changes in the nucleotide sequence of the variant may
be silent and may not alter the amino acid sequence encoded by the
nucleic acid. Where nucleic acid sequence alterations are silent, a
variant nucleic acid will encode a polypeptide with the same amino
acid sequence as the reference nucleic acid. Therefore, a
particular nucleic acid sequence of the invention also encompasses
variants with degenerate codon substitutions, and complementary
sequences thereof, as well as the sequence explicitly specified by
a SEQ ID NO. Specifically, degenerate codon substitutions may be
achieved by generating sequences in which the reference codon is
replaced by any of the codons for the amino acid specified by the
reference codon. In general, the third position of one or more
selected codons can be substituted with mixed-base and/or
deoxyinosine residues as disclosed by Batzer et al., Nucleic Acids
Res., 19, 5081 (1991) and/or Ohtsuka et al., J. Biol. Chem.,
260, 2605 (1985); Rossolini et al., Mol. Cell. Probes, 8, 91
(1994).
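As an illustrative sketch only, and not part of the disclosed methods, the degeneracy summarized in Table 1 can be expressed in a few lines of Python. The names `CODON_TABLE`, `translate` and `synonymous_codons` are assumptions introduced for this example, and the short DNA strings are hypothetical test inputs rather than sequences of the invention.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code written as one letter per codon ("*" = stop), with the
# first, second and third codon positions each cycling through T, C, A, G.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a coding-strand DNA string codon by codon (stops become '*')."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def synonymous_codons(codon):
    """Return, in sorted order, every codon encoding the same amino acid."""
    amino_acid = CODON_TABLE[codon]
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Silent change: both strings encode Ala-Leu-Lys.
print(translate("GCCCTGAAG"), translate("GCACTCAAA"))
# Non-silent changes of the kind discussed in the text: TTC -> TAC, GGG -> GAC.
print(translate("TTC"), translate("TAC"), translate("GGG"), translate("GAC"))
# All codons that could replace GGG without altering the encoded residue.
print(synonymous_codons("GGG"))
```

A degenerate variant of a reference codon is then any member of `synonymous_codons(reference_codon)`.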
[0073] However, the invention is not limited to silent changes in
the present nucleotide sequences but also includes variant nucleic
acid sequences that conservatively alter the amino acid sequence of
a polypeptide of the invention. According to the present invention,
variant and reference nucleic acids of the invention may differ in
the encoded amino acid sequence by one or more substitutions,
additions, insertions, deletions, fusions and truncations, which
may be present in any combination, so long as an active, thermally
stable DNA polymerase is encoded by the variant nucleic acid. Such
variant nucleic acids will not encode exactly the same amino acid
sequence as the reference nucleic acid, but have conservative
sequence changes.
[0074] Variant nucleic acids with silent and conservative changes
can be defined and characterized by the degree of homology to the
reference nucleic acid. Preferred variant nucleic acids are
"substantially homologous" to the reference nucleic acids of the
invention. As recognized by one of skill in the art, such
substantially similar nucleic acids can hybridize under stringent
conditions with the reference nucleic acids identified by SEQ ID
NOs herein. These types of substantially homologous nucleic acids
are encompassed by this invention.
[0075] Generally, nucleic acid derivatives and variants of the
invention will have at least 90%, 91%, 92%, 93% or 94% sequence
identity to the reference nucleotide sequence defined herein.
Preferably, nucleic acids of the invention will have at least
95%, 96%, 97%, 98%, or 99% sequence identity to the reference
nucleotide sequence defined herein.
[0076] Variant nucleic acids can be detected and isolated by
standard hybridization procedures.
[0077] Hybridization to detect or isolate such sequences is
generally carried out under stringent conditions. "Stringent
hybridization conditions" and "stringent hybridization wash
conditions" in the context of nucleic acid hybridization
experiments such as Southern and Northern hybridization are
sequence dependent, and are different under different environmental
parameters. Longer sequences hybridize specifically at higher
temperatures. An extensive guide to the hybridization of nucleic
acids is found in Tijssen, Laboratory Techniques in Biochemistry
and Molecular Biology-Hybridization with Nucleic Acid Probes, Part
1, Chapter 2 "Overview of principles of hybridization and the
strategy of nucleic acid probe assays" Elsevier, N.Y. (1993). See
also, J. Sambrook et al. Molecular Cloning: A Laboratory Manual,
Cold Spring Harbor Press, N.Y., pp 9.31-9.58 (1989); J. Sambrook et
al., Molecular Cloning: A Laboratory Manual Cold Spring Harbor
Press, N.Y. (3rd ed. 2001).
[0078] The invention also provides methods for detection and
isolation of derivative or variant nucleic acids encoding DNA
polymerase I activity. The methods involve hybridizing at least a
portion of a nucleic acid comprising SEQ ID NO:1, 2, 3, 4, 5, 6, 7
or 8 to a sample nucleic acid, thereby forming a hybridization
complex; and detecting the hybridization complex. The presence of
the complex correlates with the presence of a derivative or variant
nucleic acid encoding at least a segment of DNA polymerase I. In
general, the portion of a nucleic acid comprising SEQ ID NO: 1, 2,
3, 4, 5, 6, 7 or 8 used for hybridization is at least fifteen
nucleotides, and hybridization is under hybridization conditions
that are sufficiently stringent to permit detection and isolation
of substantially homologous nucleic acids. In an alternative
embodiment, a nucleic acid sample is amplified by the polymerase
chain reaction using primer oligonucleotides selected from SEQ ID
NO: 1, 2, 3, 4, 5, 6, 7 or 8.
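By way of a hedged illustration, candidate probe or primer segments of at least fifteen nucleotides, and their complementary strands, could be enumerated as sketched below. The function names are hypothetical, and the 30-nucleotide input is copied from positions 601-630 of the nucleotide listing that precedes paragraph [0068]; nothing here reflects a probe actually selected by the inventors.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the complementary strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def candidate_probes(dna, length=15):
    """Return every contiguous window of `length` nucleotides in `dna`."""
    return [dna[i:i + length] for i in range(len(dna) - length + 1)]


fragment = "GCCCTGAAGCTCCTAAAGGAGTGGGGTAGT"  # 30 nt taken from the listing
probes = candidate_probes(fragment)
print(len(probes), probes[0], reverse_complement(probes[0]))
```

A sequence shorter than the requested length simply yields an empty list, which mirrors the minimum-length condition stated above.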
[0079] Generally, highly stringent hybridization and wash
conditions are selected to be about 5.degree. C. lower than the
thermal melting point for the specific double-stranded sequence at
a defined ionic strength and pH. For example, under "highly
stringent conditions" or "highly stringent hybridization
conditions" a nucleic acid will hybridize to its complement to a
detectably greater degree than to other sequences (e.g., at least
2-fold over background). By controlling the stringency of the
hybridization and/or washing conditions, nucleic acids that are
100% complementary can be identified.
[0080] Alternatively, stringency conditions can be adjusted to
allow some mismatching in sequences so that lower degrees of
similarity are detected (heterologous probing). Typically,
stringent conditions will be those in which the salt concentration
is less than about 1.5 M Na ion, typically about 0.01 to 1.0 M Na
ion concentration (or other salts) at pH 7.0 to 8.3 and the
temperature is at least about 30.degree. C. for short probes (e.g.,
10 to 50 nucleotides) and at least about 60.degree. C. for long
probes (e.g., greater than 50 nucleotides). Stringent conditions
may also be achieved with the addition of destabilizing agents such
as formamide.
[0081] Exemplary low stringency conditions include hybridization
with a buffer solution of 30 to 35% formamide, 1 M NaCl, 1% SDS
(sodium dodecyl sulphate) at 37.degree. C. and a wash in 1.times.
to 2.times.SSC (20.times.SSC=3.0 M NaCl and 0.3 M trisodium
citrate) at 50 to 55.degree. C. Exemplary moderate stringency
conditions include hybridization in 40 to 45% formamide, 1.0 M
NaCl, 1% SDS at 37.degree. C., and a wash in 0.5.times. to
1.times.SSC at 55 to 60.degree. C. Exemplary high stringency
conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS
at 37.degree. C., and a wash in 0.1.times.SSC at 60 to 65.degree.
C.
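The SSC strengths quoted above can be converted to approximate sodium ion concentrations with a short calculation. This is a sketch under one stated assumption that is not taken from the text: each trisodium citrate contributes three sodium ions. The function name `ssc_sodium_molarity` is hypothetical.

```python
# 20x SSC as defined in the text: 3.0 M NaCl and 0.3 M trisodium citrate.
NACL_MOLAR_20X = 3.0
CITRATE_MOLAR_20X = 0.3
SODIUM_PER_CITRATE = 3  # assumption: trisodium citrate fully dissociates


def ssc_sodium_molarity(fold):
    """Approximate molar Na+ concentration of SSC at the given fold strength."""
    sodium_20x = NACL_MOLAR_20X + SODIUM_PER_CITRATE * CITRATE_MOLAR_20X
    return sodium_20x * fold / 20.0


for fold in (2.0, 1.0, 0.5, 0.1):
    print(f"{fold}x SSC: about {ssc_sodium_molarity(fold):.4f} M Na+")
```

Under that assumption the washes described above span roughly 0.02 M to 0.4 M sodium ion, which makes the link between lower SSC strength and higher stringency explicit.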
[0082] The degree of complementarity or homology of hybrids
obtained during hybridization is typically a function of
post-hybridization washes, the critical factors being the ionic
strength and temperature of the final wash solution. The type and
length of hybridizing nucleic acids also affects whether
hybridization will occur and whether any hybrids formed will be
stable under a given set of hybridization and wash conditions. For
DNA-DNA hybrids, the T.sub.m can be approximated from the equation
of Meinkoth and Wahl, Anal. Biochem. 138:267-284 (1984):
T.sub.m=81.5.degree. C.+16.6 (log M)+0.41 (% GC)-0.61 (% form)-500/L, where
M is the molarity of monovalent cations, % GC is the percentage of
guanosine and cytosine nucleotides in the DNA, % form is the
percentage of formamide in the hybridization solution, and L is the
length of the hybrid in base pairs. The T.sub.m is the temperature
(under defined ionic strength and pH) at which 50% of a
complementary target sequence hybridizes to a perfectly matched
probe.
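A minimal sketch of the T.sub.m approximation given above follows. It assumes that "log" denotes the base-10 logarithm; the function name, parameter names and example inputs (1.0 M monovalent cation, 50% GC, 50% formamide, 500 base pairs) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def melting_temperature(cation_molar, gc_percent, formamide_percent, length_bp):
    """Approximate T_m in degrees C for a DNA-DNA hybrid.

    Implements T_m = 81.5 + 16.6 log10(M) + 0.41 (%GC) - 0.61 (%form) - 500/L.
    """
    if cation_molar <= 0 or length_bp <= 0:
        raise ValueError("cation molarity and hybrid length must be positive")
    return (
        81.5
        + 16.6 * math.log10(cation_molar)
        + 0.41 * gc_percent
        - 0.61 * formamide_percent
        - 500.0 / length_bp
    )


# 81.5 + 0 + 20.5 - 30.5 - 1.0 for the hypothetical inputs below.
print(round(melting_temperature(1.0, 50.0, 50.0, 500), 1))
```

The formula shows why formamide permits lower hybridization temperatures: each percent of formamide lowers the estimated T.sub.m by about 0.61.degree. C.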
[0083] Very stringent conditions are selected to be equal to the T.sub.m
for a particular probe.
[0084] An example of stringent hybridization conditions for
hybridization of complementary nucleic acids that have more than
100 complementary residues on a filter in a Southern or Northern
blot is 50% formamide with 1 mg of heparin at 42.degree. C., with
the hybridization being carried out overnight. An example of highly
stringent conditions is 0.15 M NaCl at 72.degree. C. for about 1.5
minutes. An example of stringent wash conditions is a 0.2.times.SSC
wash at 65.degree. C. for 15 minutes (see also, Sambrook, infra).
Often, a high stringency wash is preceded by a low stringency wash
to remove background probe signal. An example of a medium stringency
wash for a duplex of, e.g., more than 100 nucleotides, is 1.times.SSC at
45.degree. C. for 15 minutes. An example of a low stringency wash for a
duplex of, e.g., more than 100 nucleotides, is 4-6.times.SSC at
40.degree. C. for 15 minutes. For short probes (e.g., about 10 to
50 nucleotides), stringent conditions typically involve salt
concentrations of less than about 1.0M Na ion, typically about 0.01
to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3,
and the temperature is typically at least about 30.degree. C.
[0085] Stringent conditions can also be achieved with the addition
of destabilizing agents such as formamide. In general, a signal to
noise ratio of 2.times. (or higher) than that observed for an
unrelated probe in the particular hybridization assay indicates
detection of a specific hybridization. Nucleic acids that do not
hybridize to each other under stringent conditions are still
substantially identical if the proteins that they encode are
substantially identical. This occurs, e.g., when a copy of a
nucleic acid is created using the maximum codon degeneracy
permitted by the genetic code.
[0086] The following are examples of sets of hybridization/wash
conditions that may be used to detect and isolate homologous
nucleic acids that are substantially identical to reference nucleic
acids of the present invention: a reference nucleotide sequence
preferably hybridizes to the reference nucleotide sequence in 7%
sodium dodecyl sulfate (SDS), 0.5 M NaPO.sub.4, 1 mM EDTA at
50.degree. C. with washing in 2.times.SSC, 0.1% SDS at 50.degree.
C., more desirably in 7% sodium dodecyl sulfate (SDS), 0.5 M
NaPO.sub.4, 1 mM EDTA at 50.degree. C. with washing in 1.times.SSC,
0.1% SDS at 50.degree. C., more desirably still in 7% sodium
dodecyl sulfate (SDS), 0.5 M NaPO.sub.4, 1 mM EDTA at 50.degree. C.
with washing in 0.5.times.SSC, 0.1% SDS at 50.degree. C.,
preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO.sub.4, 1
mM EDTA at 50.degree. C. with washing in 0.1.times.SSC, 0.1% SDS at
50.degree. C., more preferably in 7% sodium dodecyl sulfate (SDS),
0.5 M NaPO.sub.4, 1 mM EDTA at 50.degree. C. with washing in
0.1.times.SSC, 0.1% SDS at 65.degree. C.
[0087] In general, T.sub.m is reduced by about 1.degree. C. for
each 1% of mismatching. Thus, T.sub.m, hybridization, and/or wash
conditions can be adjusted to hybridize to sequences of the desired
sequence identity. For example, if sequences with >90% identity
are sought, the T.sub.m can be decreased 10.degree. C. Generally,
stringent conditions are selected to be about 5.degree. C. lower
than the thermal melting point (T.sub.m) for the specific sequence
and its complement at a defined ionic strength and pH. However,
severely stringent conditions can utilize a hybridization and/or
wash at 1, 2, 3, or 4.degree. C. lower than the thermal melting
point (T.sub.m); moderately stringent conditions can utilize a
hybridization and/or wash at 6, 7, 8, 9, or 10.degree. C. lower
than the thermal melting point (T.sub.m); low stringency conditions
can utilize a hybridization and/or wash at 11, 12, 13, 14, 15, or
20.degree. C. lower than the thermal melting point (T.sub.m).
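The rule of thumb and the stringency bands described above can be sketched as follows. The function names are hypothetical. Treating offsets of 16 to 19.degree. C. as low stringency, and returning "unclassified" for any offset the text does not list, are assumptions made for this example.

```python
def mismatch_adjusted_tm(tm_celsius, percent_identity):
    """Lower a perfectly matched T_m by about 1 degree C per 1% mismatch."""
    if not 0 <= percent_identity <= 100:
        raise ValueError("percent identity must lie between 0 and 100")
    return tm_celsius - (100.0 - percent_identity)


def stringency_class(degrees_below_tm):
    """Name the stringency band for a hybridization or wash offset below T_m."""
    d = degrees_below_tm
    if d == 0:
        return "very stringent"
    if 1 <= d <= 4:
        return "severely stringent"
    if d == 5:
        return "stringent"
    if 6 <= d <= 10:
        return "moderately stringent"
    if 11 <= d <= 20:  # assumption: 16 to 19 grouped with the listed values
        return "low stringency"
    return "unclassified"


# A hypothetical 70.5 degree C hybrid sought at 90% identity or better.
print(mismatch_adjusted_tm(70.5, 90))
print(stringency_class(5), "|", stringency_class(8))
```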
[0088] If the desired degree of mismatching results in a T.sub.m of
less than 45.degree. C. (aqueous solution) or 32.degree. C. (formamide
solution), it is preferred to increase the SSC concentration
so that a higher temperature can be used. An extensive guide to the
hybridization of nucleic acids is found in Tijssen (1993)
Laboratory Techniques in Biochemistry and Molecular
Biology-Hybridization with Nucleic Acid Probes, Part 1, Chapter 2
(Elsevier, New York), and Ausubel et al., eds. (1995) Current
Protocols in Molecular Biology, Chapter 2 (Greene Publishing and
Wiley-Interscience, New York). See Sambrook et al. (1989) Molecular
Cloning: A Laboratory Manual (2d ed., Cold Spring Harbor Laboratory
Press, Plainview, N.Y.). Using these references and the teachings
herein on the relationship between T.sub.m, mismatch, and
hybridization and wash conditions, those of ordinary skill can
generate variants of the present DNA polymerase I nucleic
acids.
[0089] Computer analyses can also be utilized for comparison of
sequences to determine sequence identity. Such analyses include,
but are not limited to: CLUSTAL in the PC/Gene program (available
from Intelligenetics, Mountain View, Calif.); the ALIGN program
(Version 2.0) and GAP, BESTFIT, BLAST, FASTA, and TFASTA in the
Wisconsin Genetics Software Package, Version 8 (available from
Genetics Computer Group (GCG), 575 Science Drive, Madison, Wis.,
USA). Alignments using these programs can be performed using the
default parameters. The CLUSTAL program is well described by
Higgins et al. Gene 73:237-244 (1988); Higgins et al. CABIOS
5:151-153 (1989); Corpet et al. Nucleic Acids Res. 16:10881-90
(1988); Huang et al. CABIOS 8:155-65 (1992); and Pearson et al.
Meth. Mol. Biol. 24:307-331 (1994). The ALIGN program is based on
the algorithm of Myers and Miller, supra. The BLAST programs of
Altschul et al., J. Mol. Biol. 215:403 (1990), are based on the
algorithm of Karlin and Altschul supra. To obtain gapped alignments
for comparison purposes, Gapped BLAST (in BLAST 2.0) can be
utilized as described in Altschul et al. Nucleic Acids Res. 25:3389
(1997). Alternatively, PSI-BLAST (in BLAST 2.0) can be used to
perform an iterated search that detects distant relationships
between molecules. See Altschul et al., supra. When utilizing
BLAST, Gapped BLAST, PSI-BLAST, the default parameters of the
respective programs (e.g. BLASTN for nucleotide sequences, BLASTX
for proteins) can be used. The BLASTN program (for nucleotide
sequences) uses as defaults a wordlength (W) of 11, an expectation
(E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both
strands. For amino acid sequences, the BLASTP program uses as
defaults a wordlength (W) of 3, an expectation (E) of 10, and the
BLOSUM62 scoring matrix (see Henikoff & Henikoff, Proc. Natl.
Acad. Sci. USA, 89, 10915 (1989)). See http://www.ncbi.nlm.nih.gov.
Alignment may also be performed manually by inspection.
[0090] For purposes of the present invention, comparison of
nucleotide sequences for determination of percent sequence identity
to the DNA polymerase sequences disclosed herein is preferably made
using the BlastN program (version 1.4.7 or later) with its default
parameters or any equivalent program. By "equivalent program" is
intended any sequence comparison program that, for any two
sequences in question, generates an alignment having identical
nucleotide or amino acid residue matches and an identical percent
sequence identity when compared to the corresponding alignment
generated by the preferred program.
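For orientation only, percent sequence identity over an alignment that has already been produced can be computed as sketched below. This is not a substitute for BlastN or an "equivalent program" as defined above. The convention of dividing identical columns by the total number of alignment columns is an assumption, since programs differ on how gaps are counted, and the variant strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same non-gap residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


reference = "GCCCTGAAGC"       # ten nucleotides from the listing above
variant = "GCCCTCAAGC"         # hypothetical variant with one mismatch
gapped_variant = "GCC-TGAAGC"  # hypothetical variant with one deletion
print(percent_identity(reference, variant))
print(percent_identity(reference, gapped_variant))
```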
Expression of Polymerase Nucleic Acids
[0091] Nucleic acids of the invention may be used for the
recombinant expression of the polymerase polypeptides of the
invention. Generally, recombinant expression of a polymerase
polypeptide of the invention is effected by introducing a nucleic
acid encoding that polypeptide into an expression vector adapted
for use in a particular type of host cell. The nucleic acids of the
invention can be introduced and expressed in any host organism, for
example, in either prokaryotic or eukaryotic host cells. Examples of
host cells include bacterial cells, yeast cells, cultured insect
cell lines, and cultured mammalian cell lines. Preferably, the
recombinant host cell system is selected that processes and
post-translationally modifies nascent polypeptides in a manner
similar to that of the organism from which the polymerase was
derived. For purposes of expressing and isolating polymerase
polypeptides of the invention, prokaryotic organisms are preferred,
for example, Escherichia coli. Accordingly, the invention provides
host cells comprising the expression vectors of the invention.
[0092] The nucleic acids to be introduced can be conveniently
placed in expression cassettes for expression in an organism of
interest. Such expression cassettes will comprise a transcriptional
initiation region linked to a nucleic acid of the invention.
Expression cassettes preferably also have a plurality of
restriction sites for insertion of the nucleic acid to be under the
transcriptional regulation of various control elements. The
expression cassette additionally may contain selectable marker
genes. Suitable control elements such as enhancers/promoters,
splice junctions, polyadenylation signals, etc. may be placed in
close proximity to the coding region of the gene if needed to
permit proper initiation of transcription and/or correct processing
of the primary RNA transcript. Alternatively, the coding region
utilized in the expression vectors of the present invention may
contain endogenous enhancers/promoters, splice junctions,
intervening sequences, polyadenylation signals, etc., or a
combination of both endogenous and exogenous control elements.
[0093] Preferably the nucleic acid in the vector is under the
control of, and operably linked to, an appropriate promoter or
other regulatory elements for transcription in a host cell. The
vector may be a bi-functional expression vector that functions in
multiple hosts. The transcriptional cassette generally includes in
the 5'-3' direction of transcription, a promoter, a transcriptional
and translational initiation region, a DNA sequence of interest,
and a transcriptional and translational termination region
functional in the organism. The termination region may be native
with the transcriptional initiation region, may be native with the
DNA sequence of interest, or may be derived from another
source.
[0094] Efficient expression of recombinant DNA sequences in
prokaryotic and eukaryotic cells generally requires regulatory
control elements directing the efficient termination and
polyadenylation of the resulting transcript. Transcription
termination signals are generally found downstream of the
polyadenylation signal and are a few hundred nucleotides in length.
The term "poly A site" or "poly A sequence" as used herein denotes
a DNA sequence that directs both the termination and
polyadenylation of the nascent RNA transcript. Efficient
polyadenylation of the recombinant transcript is desirable as
transcripts lacking a poly A tail are unstable and are rapidly
degraded.
[0095] Nucleic acids encoding DNA polymerase I may be introduced
into bacterial host cells by a method known to one of skill in the
art. For example, nucleic acids encoding a thermophilic DNA
polymerase I can be introduced into bacterial cells by commonly
used transformation procedures such as by treatment with calcium
chloride or by electroporation. If the thermophilic DNA polymerase
I is to be expressed in eukaryotic host cells, nucleic acids
encoding the thermophilic DNA polymerase I may be introduced into
eukaryotic host cells by a number of means including calcium
phosphate co-precipitation, spheroplast fusion, electroporation and
the like. When the eukaryotic host cell is a yeast cell,
transformation may be effected by treatment of the host cells with
lithium acetate or by electroporation.
[0096] Thus, one aspect of the invention is to provide expression
vectors and host cells comprising a nucleic acid encoding a DNA
polymerase polypeptide of the invention. A range of expression
vectors are available in the art. Description of various expression
vectors and how to use them can be found among other places in U.S.
Pat. Nos. 5,604,118; 5,583,023; 5,432,082; 5,266,490; 5,063,158;
4,966,841; 4,806,472; 4,801,537; and Goedel et al., Gene Expression
Technology, Methods of Enzymology, Vol. 185, Academic Press, San
Diego (1989). The expression of polymerases in recombinant cell
systems is an established technique. Examples of the recombinant
expression of DNA polymerase can be found in U.S. Pat. Nos.
5,602,756; 5,545,552; 5,541,311; 5,500,363; 5,489,523; 5,455,170;
5,352,778; 5,322,785; and 4,935,361.
[0097] Recombinant DNA and molecular cloning techniques that can be
used to help make and use aspects of the invention are described by
Sambrook et al., Molecular Cloning: A Laboratory Manual Vol. 1-3,
Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (2001);
Ausubel (ed.), Current Protocols in Molecular Biology, John Wiley
and Sons, Inc. (1994); T. Maniatis, E. F. Fritsch and J. Sambrook,
Molecular Cloning: A Laboratory Manual, Cold Spring Harbor
Laboratory, Cold Spring Harbor, N.Y. (1989); and by T. J. Silhavy,
M. L. Berman, and L. W. Enquist, Experiments with Gene Fusions,
Cold Spring Harbor Laboratory, Cold Spring Harbor, N.Y. (1984).
Nucleic Acid Polymerases
[0098] The invention provides Thermus brockianus polymerase
polypeptides, as well as fragments thereof and variant polymerase
polypeptides that are active and thermally stable. Any polypeptide
containing amino acid sequence SEQ ID NO:9, SEQ ID NO:10, SEQ ID
NO:11, SEQ ID NO:12, SEQ ID NO:13, SEQ ID NO:14, SEQ ID NO:15 or
SEQ ID NO:16, which are the amino acid sequences for wild type and
derivative Thermus brockianus polymerases, is contemplated by the
present invention. The polypeptides of the invention are isolated
or substantially purified polypeptides. In particular, the isolated
polypeptides of the invention are substantially free of proteins
normally present in Thermus brockianus bacteria.
[0099] In one embodiment, the invention provides a wild type
Thermus brockianus polymerase from strain YS38 having SEQ ID
NO:9.
TABLE-US-00010 1 MLPLFEPKGR VLLVDGHHLA YRNFFALKGL TTSRGEPVQG
VYGFAKSLLK 50 51 ALKEDGDVVI VVFDAKAPSF RHEAYGAYKA GRAPTPEDFP
RQLALMKELV 100 101 DLLGLERLEV PGFEADDVLA ALAKKAEREG YEVRILTADR
DLFQLLSDRI 150 151 AVLHPEGHLI TPGWLWERYG LRPEQWVDFR ALAGDPSDNI
PGVKGIGEKT 200 201 ALKLLKEWGS LENIQKNLDQ VSPPSVREKI QAHLDDLRLS
QELSRVRTDL 250 251 PLEVDFRRRR EPDREGLRAF LERLEFGSLL HEFGLLESPQ
AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL SLAASAKGRV YRAEAPHKAL
SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD PMLLAYLLDP SNTTPEGVAR
RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG EEKLLWLYEE VEKPLSRVLA
HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE EVFRLAGHPF NLNSRDQLER
VLFDELGLPP 500 501 IGKTEKTGKR STSAAVLEAL REAHPIVEKI LQYRELAKLK
GTYIDLLPAL 550 551 VHPRTGRLHT RFNQTATATG RLSSSDPNLQ NIPVRTPLGQ
RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL SGDENLIRVF QEGRDIHTQT
ASWMFGLPAE 650 651 AIDPLRRRAA KTINFGVLYG MSAHRLSQEL GIPYEEAVAF
IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET LFGRRRYVPD LNARVKSVRE
AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL PEVGARMLLQ VHDELLLEAP
KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG IGEDWLSAKG 829
[0100] In another embodiment, the invention provides a wild type
Thermus brockianus polymerase from strain 2AZN having SEQ ID NO:10.
The 2AZN amino acid sequence differs from the YS38 sequence by one
amino acid--the 2AZN strain has proline instead of leucine at
position 546.
TABLE-US-00011 1 MLPLFEPKGR VLLVDGHHLA YRNFFALKGL TTSRGEPVQG
VYGFAKSLLK 50 51 ALKEDGDVVI VVFDAKAPSF RHEAYGAYKA GRAPTPEDFP
RQLALMKELV 100 101 DLLGLERLEV PGFEADDVLA ALAKKAEREG YEVRILTADR
DLFQLLSDRI 150 151 AVLHPEGHLI TPGWLWERYG LRPEQWVDFR ALAGDPSDNI
PGVKGIGEKT 200 201 ALKLLKEWGS LENIQKNLDQ VSPPSVREKI QAHLDDLRLS
QELSRVRTDL 250 251 PLEVDFRRRR EPDREGLRAF LERLEFGSLL HEFGLLESPQ
AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL SLAASAKGRV YRAEAPHKAL
SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD PMLLAYLLDP SNTTPEGVAR
RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG EEKLLWLYEE VEKPLSRVLA
HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE EVFRLAGHPF NLNSRDQLER
VLFDELGLPP 500 501 IGKTEKTGKR STSAAVLEAL REAHPIVEKI LQYRELAKLK
GTYIDPLPAL 550 551 VHPRTGRLHT RFNQTATATG RLSSSDPNLQ NIPVRTPLGQ
RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL SGDENLIRVF QEGRDIHTQT
ASWMFGLPAE 650 651 AIDPLRRRAA KTINFGVLYG MSAHRLSQEL GIPYEEAVAF
IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET LFGRRRYVPD LNARVKSVRE
AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL PEVGARMLLQ VHDELLLEAP
KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG IGEDWLSAKG 829
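The single-residue difference between the YS38 and 2AZN sequences can be checked mechanically. The sketch below compares only residues 541-550 as printed in the two listings above; the function name and the `first_position` parameter are hypothetical conveniences for reporting 1-based positions.

```python
def sequence_differences(reference, variant, first_position=1):
    """List (position, reference residue, variant residue) for each mismatch."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length")
    return [
        (first_position + offset, ref, var)
        for offset, (ref, var) in enumerate(zip(reference, variant))
        if ref != var
    ]


ys38_541_550 = "GTYIDLLPAL"         # SEQ ID NO:9, residues 541-550
strain_2azn_541_550 = "GTYIDPLPAL"  # SEQ ID NO:10, residues 541-550
print(sequence_differences(ys38_541_550, strain_2azn_541_550, first_position=541))
```

Run over the full-length listings, the same comparison is a quick way to catch transcription errors, since the text states that the two wild type sequences differ at position 546 only.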
[0101] Significant portions of the Thermus brockianus polymerase
sequences are distinct from other polymerases, including, for
example, a peptide at positions 22-25 (RNFF, SEQ ID NO:9), a peptide
at positions 39-42 (QGVY, SEQ ID NO:17), a peptide at positions
76-79 (GAYK, SEQ ID NO:18), a peptide at positions 95-98 (LMKE, SEQ
ID NO:19), a peptide at positions 111-114 (PGFE, SEQ ID NO:20), a
peptide at positions 106-121 (ERLEVPGFEADDVLAA, SEQ ID NO:21), a
peptide at positions 161-164 (TPGW, SEQ ID NO:22), a peptide at
positions 182-186 (LAGDP, SEQ ID NO:23), a peptide at positions
213-216 (NIQK, SEQ ID NO:24), a peptide at positions 220-224
(QVSPP, SEQ ID NO:25), a peptide at positions 228-231 (EKIQ, SEQ ID
NO:26), a peptide at positions 238-242 (RLSQE, SEQ ID NO:27), a
peptide at positions 256-261 (FRRRRE, SEQ ID NO:28), a peptide at
positions 288-292 (SPQAA, SEQ ID NO:29), a peptide at positions
305-308 (LGFR, SEQ ID NO:30), a peptide at positions 318-321 (ELLS,
SEQ ID NO:31), a peptide at positions 325-331 (SAKGRVY, SEQ ID
NO:32), a peptide at positions 334-337 (EAPH, SEQ ID NO:33), a
peptide at positions 334-341 (EAPHKALS, SEQ ID NO:34), a peptide at
positions 407-412 (AERLYE, SEQ ID NO:35), a peptide at positions
415-419 (LSRLK, SEQ ID NO:36), a peptide at positions 428-431
(YEEV, SEQ ID NO:37), a peptide at positions 465-468 (MGRL, SEQ ID
NO:38), a peptide at positions 537-541 (AKLKG, SEQ ID NO:39), a
peptide at positions 545-549 (LPAL, SEQ ID NO:40), a peptide at
positions 600-603 (EGYL, SEQ ID NO:41), a peptide at positions
647-650 (LPAE, SEQ ID NO:42), a peptide at positions 648-652
(PAEAI, SEQ ID NO:43), a peptide at positions 655-658 (LRRR, SEQ ID
NO:44), a peptide at positions 690-693 (IDRY, SEQ ID NO:45), a
peptide at positions 698-702 (YPKVK, SEQ ID NO:46), a peptide at
positions 712-715 (GRQR, SEQ ID NO:47), a peptide at positions
765-773 (RLFPRLPEV, SEQ ID NO:48) and a peptide at positions
807-810 (GVWP, SEQ ID NO:49).
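The position ranges listed above use 1-based, inclusive numbering. A small sketch of how such peptides could be extracted or located is shown below. It uses only the first 50 residues as printed for SEQ ID NO:10, and the helper names `peptide_at` and `find_peptide` are hypothetical.

```python
# First 50 residues as printed in the SEQ ID NO:10 listing.
N_TERMINAL_50 = (
    "MLPLFEPKGR" "VLLVDGHHLA" "YRNFFALKGL" "TTSRGEPVQG" "VYGFAKSLLK"
)


def peptide_at(sequence, first, last):
    """Return residues first..last, counting from 1 and including both ends."""
    if not 1 <= first <= last <= len(sequence):
        raise ValueError("positions fall outside the sequence")
    return sequence[first - 1:last]


def find_peptide(sequence, peptide):
    """Return the 1-based (first, last) of the first occurrence, or None."""
    index = sequence.find(peptide)
    if index < 0:
        return None
    return (index + 1, index + len(peptide))


print(peptide_at(N_TERMINAL_50, 22, 25), peptide_at(N_TERMINAL_50, 39, 42))
print(find_peptide(N_TERMINAL_50, "QGVY"))
```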
[0102] Many DNA polymerases possess activities in addition to a DNA
polymerase activity. Such activities include, for example, a 5'-3'
exonuclease activity and/or a 3'-5' exonuclease activity. The 3'-5'
exonuclease activity improves the accuracy of the newly-synthesized
strand by removing incorrect bases that may have been incorporated.
DNA polymerases in which such activity is low or absent are prone
to errors in the incorporation of nucleotide residues into the
primer extension strand. Taq DNA polymerase has been reported to
have low 3'-5' exonuclease activity. See Lawyer et al., J. Biol.
Chem. 264:6427-6437. In applications such as nucleic acid
amplification procedures in which the replication of DNA is often
geometric in relation to the number of primer extension cycles,
such errors can lead to serious artifactual problems such as
sequence heterogeneity of the nucleic acid amplification product
(amplicon). Thus, a 3'-5' exonuclease activity is a desired
characteristic of a thermostable DNA polymerase used for such
purposes.
[0103] By contrast, the 5'-3' exonuclease activity of DNA
polymerase enzymes is often undesirable because this activity may
digest nucleic acids, including primers, which have an unprotected
5' end. Thus, a thermostable polymerase with an attenuated 5'-3'
exonuclease activity, or in which such activity is absent, is a
desired characteristic of an enzyme for biochemical applications.
Various DNA polymerase enzymes have been described where a
modification has been introduced in a DNA polymerase that
accomplishes this object. For example, the Klenow fragment of E.
coli DNA polymerase I can be produced as a proteolytic fragment of
the holoenzyme in which the domain of the protein controlling the
5'-3' exonuclease activity has been removed. The Klenow fragment
still retains the polymerase activity and the 3'-5' exonuclease
activity. Barnes, PCT Publication No. WO92/06188 (1992) and Gelfand
et al., U.S. Pat. No. 5,466,591, have produced 5'-3'
exonuclease-deficient recombinant Thermus aquaticus DNA
polymerases. Ishino et al., EPO Publication No. 0517418A2, have
produced a 5'-3' exonuclease-deficient DNA polymerase derived from
Bacillus caldotenax.
[0104] In another embodiment, the invention provides a polypeptide
that is a derivative Thermus brockianus polypeptide with reduced or
eliminated 5'-3' exonuclease activity. Several methods exist for
reducing this activity, and the invention contemplates any
polypeptide derived from the Thermus brockianus polypeptides of the
invention that has reduced or eliminated such 5'-3' exonuclease
activity. See, e.g., Xu et al., Biochemical and mutational studies of
the 5'-3' exonuclease of DNA polymerase I of Escherichia coli, J. Mol.
Biol. 1997 May 2; 268(2):284-302. In one embodiment, Asp is used in place
of Gly at position 43 to produce a polypeptide with reduced 5'-3'
exonuclease activity.
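By way of illustration only, a single-residue substitution of this kind can be sketched in Python as follows. The function name `substitute` and the constant `N_TERMINUS` are hypothetical and are not part of the disclosure. The 50-residue fragment is residues 1-50 as printed, with Gly at position 43, in the sequence listings of this section.

```python
def substitute(seq, position, expected, replacement):
    """Return seq with the residue at 1-based `position` replaced.

    Raises ValueError if the residue found at that position is not
    `expected`, which guards against off-by-one numbering mistakes.
    """
    index = position - 1  # the listings use 1-based numbering
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} is outside the sequence")
    found = seq[index]
    if found != expected:
        raise ValueError(f"expected {expected} at {position}, found {found}")
    return seq[:index] + replacement + seq[index + 1:]


# Residues 1-50 as printed in this section (Gly at position 43).
N_TERMINUS = "MLPLFEPKGRVLLVDGHHLAYRNFFALKGLTTSRGEPVQGVYGFAKSLLK"

mutant = substitute(N_TERMINUS, 43, "G", "D")
print(mutant[40:50])  # residues 41-50 -> VYDFAKSLLK
```

Checking the expected residue before replacing it is a design choice of this sketch; it makes a numbering error fail loudly instead of silently producing a different variant.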
[0105] Hence, the invention provides a derivative polypeptide
having SEQ ID NO:11 that is related to a Thermus brockianus
polymerase polypeptide from strain YS38, wherein Asp is used in
place of Gly at position 43.
TABLE-US-00012 1 MLPLPEPKGR VLLVDGHHLA YRNFFALKGL TTSRGEPVQG
VYDPAKSLLK 50 51 ALKEDGDVVI VVFDAKAPSF RHEAYGAYKA GRAPTPEDFP
RQLALMKELV 100 101 DLLGLERLEV PGFEADDVLA ALAKKAEREG YEVRILTADR
DLFQLLSDRI 150 151 AVLHPEGHLI TPGWLWERYG LRPEQWVDFR ALAGDPSDNI
PGVKGIGEKT 200 201 ALKLLKEWGS LENIQKNLDQ VSPPSVREKI QAHLDDLRLS
QELSRVRTDL 250 251 PLEVDFRRRR EPDREGLRAF LERLEFGSLL HEFGLLESPQ
AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL SLAASAKGRV YRAEAPHKAL
SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD PMLLAYLLDP SNTTPEGVAR
RVGGEWTEEA 400 401 GERALLAERL YENLLSRLKG EEKLLWLYEE VEKPLSRVLA
HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE EVFRLAGHPF NLNSRDQLER
VLFDELGLPP 500 501 TGKTEKTGKR STSAAVLEAL REAHPIVEKI LQYRELAKLK
GTYIDLLPAL 550 551 VHPRTGRLHT RFNQTATATG RLSSSDPNLQ NIPVRTPLGQ
RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL SGDENLIRVF QEGRDIHTQT
ASWMFGLPAE 650 651 AIDPLPRRAA KTINFGVLYG MSAHRLSQEL GIPYEEAVAF
IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET LFGRRRYVPD LNARVKSVRE
AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL PEVGARMLLQ VHDELLLEAP
KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG IGEDWLSAKG 829
[0106] The invention also provides a derivative polypeptide having
SEQ ID NO:12 that is related to a Thermus brockianus polymerase
polypeptide from strain 2AZN, wherein Asp is used in place of Gly
at position 43.
TABLE-US-00013 1 MLPLFEPKGR VLLVDGHHLA YRNFFALKGL TTSRGEPVQG
VYDFAKSLLK 50 51 ALKEDGDVVI VVPDAKAPSF RHEAYGAYKA GRAPTPEDFP
RQLALMKELV 100 101 DLLGLERLEV PGFEADDVLA ALAKKAEREG YEVRILTADR
DLFQLLSDRI 150 151 AVLHPEGHLI TPGWLWERYG LRPEQWVDER ALAGDPSDNI
PGVKGIGEKT 200 201 ALKLLKEWGS LENIQKNLDQ VSPPSVREKI QAHLDDLRLS
QELSRVRTDL 250 251 PLEVDFRRRR EPDREGLRAF LERLEFGSLL HEFGLLESPQ
AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL SLAASAKGRV YRAEAPHKAL
SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD PMLLAYLLDP SNTTPEGVAR
RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG EEKLLWLYEE VEKPLSRVLA
HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE EVFRLAGHPF NLNSRDQLER
VLFDELGLPP 500 501 IGKTEKTGKR STSAAVLEAL REAHPIVEKI LQYRELAKLK
GTYIDPLPAL 550 551 VHPRTGRLHT RFNQTATATG RLSSSDPNLQ NIPVRTPLGQ
RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL SGDENLIRVF QEGRDIHTQT
ASWMFGLPAE 650 651 AIDPLRRRAA KTINFGVLYG MSAHRLSQEL GIPYEEAVAF
IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET LFGRRRYVPD LNARVKSVRE
AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL PEVGARMLLQ VHDELELEAP
KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG TGEDWLSAKG 829
[0107] In another embodiment, the invention provides a derivative
polypeptide having SEQ ID NO:13 that is related to the Thermus
brockianus polymerase polypeptide from strain YS38, and that has
Tyr in place of Phe at position 665. This derivative polypeptide
has reduced bias against ddNTP incorporation. The sequence of SEQ
ID NO:13 is below.
TABLE-US-00014 1 MLPLFEPKGR VLLVDGHHLA YRNFFALKGL TTSRGEPVQG
VYGFAKSLLK 50 51 ALKEDGDVVI VVFDAKAPSF RHEAYGAYKA GRAPTPEDFP
RQLALMKELV 100 101 DLLGLERLEV PGFEADDVLA ALAKKAEREG YEVRILTADR
DLFQLLSDRI 150 151 AVLHPEGHLI TPGWLWERYG LRPEQWVDFR ALAGDPSDNI
PGVKGIGEKT 200 201 ALKLLKEWGS LENIQKNLDQ VSPPSVRSKI QAHLDDLRLS
QELSRVRTDL 250 251 PLEVDFRRRR EPDREGLRAF LERLEFGSLL HEFGLLESPQ
AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL SLAASAKGRV YRAEAPHKAL
SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD PMLLAYLLDP SNTTPEGVAR
RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG EEKLLWLYEE VEKPLSRVLA
HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE EVFRLAGHPF NLNSRDQLER
VLFDELGLPP 500 501 IGKTEKTGKR STSAAVLEAL REAHPIVEKI LQYRELAKLK
GTYIDLLPAL 550 551 VHPRTGRLHT RFNQTATATG RLSSSDPNLQ NIPVRTPLGQ
RIRRAFVAEE 600 601 GYLLVALDYS QIELRNLAHL SGDENLIRVF QEGRDIHTQT
ASWMFGLPAE 650 651 AIDPLRRRAA KTINYGVLYG MSAHRLSQEL GIPYEEAVAF
IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET LFGRRRYVPD LNARVKSVRE
AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL PEVGARMLLQ VHDELLLEAP
KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG IGEDWLSAKG 829
[0108] In another embodiment, the invention provides a derivative
polypeptide having SEQ ID NO:14 that is related to the Thermus
brockianus polymerase polypeptide from strain 2AZN and that has Tyr
in place of Phe at position 665. This derivative polypeptide has
reduced bias against ddNTP incorporation. The sequence of SEQ ID
NO:14 is below.
TABLE-US-00015 1 MLPLFEPKGR VLLVDGHHLA YRNFFALKGL TTSRGEPVQG
VYGFAKSLLK 50 51 ALKEDGDVVI VVFDAKAPSF RHEAYGAYKA GRAPTPEDFP
RQLALMKELV 100 101 DLLGLERLEV PGFEADDVLA ALAKKAEREG YEVRILTADR
DLFQLLSDRI 150 151 AVLHPEGHLI TPGWLWERYG LRPEQWVDFR ALAGDPSDNI
PGVKGIGEKT 200 201 ALKLLKEWGS LENIQKNLDQ VSPPSVREKI QAHLDDLRLS
QELSRVRTDL 250 251 PLEVDFRRRR EPDREGLRAF LERLEFGSLL HEFGLLESPQ
AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL SLAASAKGRV YRAEAPHKAL
SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD PMLLAYLLDP SNTTPEGVAR
RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG EEKLLWLYEE VEKPLSRVLA
HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE EVFRLAGHPF NLNSRDQLER
VLFDELGLPP 500 501 IGKTEKTGKR STSAAVLEAL REAHPIVEKI LQYRELAKLK
GTYIDPLPAL 550 551 VHPRTGRLHT RFNQTATATG RLSSSDPNLQ NIPVRTPLGQ
RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL SGDENLIRVF QEGRDIHTQT
ASWMFGLPAE 650 651 AIDPLRRRAA KTINYGVLYG MSAHRLSQEL GIPYEEAVAF
IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET LFGRRRYVPD LNARVKSVRE
AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL PEVGARMLLQ VHDELLLEAP
KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG IGEDWLSAKG 829
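The numbered listings in this section can be read programmatically. The following Python sketch is offered only as an illustration under stated assumptions: position numbers are assumed to be the all-digit tokens, residue blocks are assumed to be ten residues long, and the helper names `residue_blocks`, `parse_listing` and `suspicious_blocks` are hypothetical. The excerpt is residues 651-700 of the SEQ ID NO:14 listing above. The damaged block in the last call is a made-up example of a transcription error.

```python
STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def residue_blocks(text):
    """Split a numbered listing into residue blocks, dropping position numbers."""
    return [token for token in text.split() if not token.isdigit()]


def parse_listing(text):
    """Return the residues of a numbered listing as one string."""
    return "".join(residue_blocks(text))


def suspicious_blocks(text, block_size=10):
    """Return blocks with non-standard characters or an unexpected length.

    The final block is allowed to be shorter than block_size.
    """
    blocks = residue_blocks(text)
    flagged = []
    for i, block in enumerate(blocks):
        is_last = i == len(blocks) - 1
        bad_chars = not set(block) <= STANDARD_RESIDUES
        too_long = len(block) > block_size
        too_short = len(block) < block_size and not is_last
        if bad_chars or too_long or too_short:
            flagged.append(block)
    return flagged


EXCERPT = "651 AIDPLRRRAA KTINYGVLYG MSAHRLSQEL GIPYEEAVAF IDRYFQSYPK 700"
segment = parse_listing(EXCERPT)
print(len(segment))        # 50
print(segment[665 - 651])  # residue 665 -> Y
print(suspicious_blocks("401 GERALLAERL YENLL5RLKG 420"))  # ['YENLL5RLKG']
```

A check of this kind only flags candidate transcription problems; it cannot say what the intended residue was.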
[0109] In another embodiment, the invention provides a derivative
polypeptide having SEQ ID NO:15, related to a Thermus brockianus
polypeptide from strain YS38 with reduced 5'-3' exonuclease
activity and reduced bias against ddNTP incorporation. SEQ ID NO:
15 has Asp in place of Gly at position 43 and Tyr in place of Phe
at position 665. The sequence of SEQ ID NO:15 is below.
TABLE-US-00016 1 MLPLFEPKGR VLLVDGHHLA YRNFFALKGL TTSRGEPVQG
VYDFAKSLLK 50 51 ALKEDGDVVI VVFDAKAPSF RHEAYGAYKA GRAPTPEDFP
RQLALMKELV 100 101 DLLGLERLEV PGFEADDVLA ALAKKAEREG YEVRILTADR
DLFQLLSDRI 150 151 AVLHPEGHLI TPGWLWERYG LRPEQWVDFR ALAGDPSDNI
PGVKGIGEKT 200 201 ALKLLKEWGS LENIQKNLDQ VSPPSVREKI QAHLDDLRLS
QELSRVRTDL 250 251 PLEVDFRRRR EPDREGLRAF LERLEFGSLL HEFGLLESPQ
AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL SLAASAKGRV YRAEAPHKAL
SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD PMLLAYLLDP SNTTPEGVAR
RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG EEKLLWLYEE VEKPLSRVLA
HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE EVPRLAGHPF NLNSRDQLER
VLFDSLGLPP 500 501 IGKTEKTGKR STSAAVLEAL REAHPIVEKI LQYRELAKLK
GTYIDLLPAL 550 551 VHPRTGRLHT RFNQTATATG RLSSSDPNLQ NIPVRTPLGQ
RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL SGDENLIRVF QEGRDIHTQT
ASWMFGLPAE 650 651 AIDPLRRRAA KTINYGVLYG MSAHRLSQEL GIPYEEAVAF
IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET LFGRRRYVPD LNARVKSVRE
AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL PEVGARMLLQ VHDELLLEAP
KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG IGEDWLSAKG 829
[0110] In another embodiment, the invention provides a derivative
polypeptide having SEQ ID NO:16, related to a Thermus brockianus
polypeptide from strain 2AZN with reduced 5'-3' exonuclease
activity and reduced bias against ddNTP incorporation. SEQ ID NO:16
has Asp in place of Gly at position 43 and Tyr in place of Phe
at position 665. The sequence of SEQ ID NO:16 is below.
TABLE-US-00017 1 MLPLFEPKGR VLLVDGHHLA YRNFFALKGL TTSRGEPVQG
VYDFAKSLLK 50 51 ALKEDGDVVI VVFDAKAPSF RHEAYGAYKA GRAPTPEDFP
RQLALMKELV 100 101 DLLGLERLEV PGFEADDVLA ALAKKAEREG YEVRILTADR
DLFQLLSDRI 150 151 AVLHPEGHLI TPGWLWERYG LRPEQWVDFR ALAGDPSDNI
PGVKGIGEKT 200 201 ALKLLKEWGS LENIQKNLDQ VSPPSVREKI QAHLDDLRLS
QELSRVRTDL 250 251 PLEVDFRRRR EPDREGLRAF LERLEFGSLL HEFGLLESPQ
AAETAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL SLAASAKGRV YRAEAPHKAL
SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD PMLLAYLLDP SNTTPEGVAR
RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG EEKLLWLYEE VEKPLSRVLA
HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE EVFRLAGHPF NLNSRDQLER
VLFDELGLPP 500 501 IGKTEKTGKR STSAAVLEAL REAHPIVEKI LQYRELAKLK
GTYIDPLPAL 550 551 VHPRTGRLHT RFNQTATATG RLSSSDPNLQ NIPVRTPLGQ
RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL SGDENLIRVF QEGRDIHTQT
ASWMFGLPAE 650 651 AIDPLRRRAA KTINYGVLYG MSAHRLSQEL GIPYEEAVAF
IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET LFGRRRYVPD LNARVKSVRE
AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL PEVGARMLLQ VHDELLLEAP
KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG IGEDWLSAKG 829
[0111] The DNA polymerase polypeptides of the invention have
homology to portions of the amino acid sequences of the
thermostable DNA polymerases of Thermus aquaticus and Thermus
thermophilus (see FIG. 1). However, significant portions of the
amino acid sequences of the present invention are distinct,
including SEQ ID NO:17-49.
[0112] As indicated above, derivative and variant polypeptides of
the invention are derived from the wild type Thermus brockianus
polymerases by deletion or addition of one or more amino acids to
the N-terminal and/or C-terminal end of the wild type polypeptide;
deletion or addition of one or more amino acids at one or more
sites within the wild type polypeptide; or substitution of one or
more amino acids at one or more sites within the wild type
polypeptide. Thus, the polypeptides of the invention may be altered
in various ways including amino acid substitutions, deletions,
truncations, and insertions.
[0113] Such variant and derivative polypeptides may result, for
example, from genetic polymorphism or from human manipulation.
Methods for such manipulations are generally known in the art. For
example, amino acid sequence variants of the polypeptides can be
prepared by mutations in the DNA. Methods for mutagenesis and
nucleotide sequence alterations are well known in the art. See, for
example, Kunkel, Proc. Natl. Acad. Sci. USA, 82, 488 (1985); Kunkel
et al., Methods in Enzymol., 154, 367 (1987); U.S. Pat. No.
4,873,192; Walker and Gaastra, eds., Techniques in Molecular
Biology, MacMillan Publishing Company, New York (1983) and the
references cited therein. Guidance as to appropriate amino acid
substitutions that do not affect biological activity of the protein
of interest may be found in the model of Dayhoff et al., Atlas of
Protein Sequence and Structure, Natl. Biomed. Res. Found.,
Washington, D. C. (1978), herein incorporated by reference.
[0114] The derivatives and variants of the isolated polypeptides of
the invention have identity with at least about 92% of the amino
acid positions of SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, or SEQ ID
NO:8 and have DNA polymerase I activity and/or are thermally
stable. In a preferred embodiment, polypeptide derivatives and
variants have identity with at least about 95% of the amino acid
positions of SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, or SEQ ID NO:8
and have DNA polymerase I activity and/or are thermally stable.
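As an illustration only, the identity thresholds above can be evaluated with a short Python sketch. It assumes the two sequences have already been aligned to the same length, with `-` marking gaps, and it divides by the aligned length; other conventions for the denominator exist. The names `percent_identity` and `meets_threshold` are hypothetical, and the 50-residue `REFERENCE` fragment is residues 1-50 as printed in the listings of this section, used here only as a stand-in for a full-length reference sequence.

```python
def percent_identity(a, b):
    """Percent of aligned positions carrying the same residue.

    Both inputs must be pre-aligned to equal length. A position where
    either sequence has a gap ('-') is counted as a mismatch.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)


def meets_threshold(a, b, threshold=92.0):
    """True if the two aligned sequences share at least `threshold` percent identity."""
    return percent_identity(a, b) >= threshold


# Stand-in reference: residues 1-50 as printed in this section.
REFERENCE = "MLPLFEPKGRVLLVDGHHLAYRNFFALKGLTTSRGEPVQGVYGFAKSLLK"
# One substitution (Gly to Asp at position 43) in 50 positions.
variant = REFERENCE[:42] + "D" + REFERENCE[43:]

print(percent_identity(REFERENCE, variant))
print(meets_threshold(REFERENCE, variant, 95.0))
```

For full-length polypeptides of differing length, an alignment step would be needed before a function like this could be applied.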
[0115] Amino acid residues of the isolated polypeptides and
polypeptide derivatives and variants can be genetically encoded
L-amino acids, naturally occurring non-genetically encoded L-amino
acids, synthetic L-amino acids or D-enantiomers of any of the
above. The amino acid notations used herein for the twenty
genetically encoded L-amino acids and common non-encoded amino
acids are conventional and are as shown in Table 2.
TABLE-US-00018 TABLE 2
Common Amino Acid                                   One-Letter Symbol   Abbreviation
Alanine                                             A                   Ala
Arginine                                            R                   Arg
Asparagine                                          N                   Asn
Aspartic acid                                       D                   Asp
Cysteine                                            C                   Cys
Glutamine                                           Q                   Gln
Glutamic acid                                       E                   Glu
Glycine                                             G                   Gly
Histidine                                           H                   His
Isoleucine                                          I                   Ile
Leucine                                             L                   Leu
Lysine                                              K                   Lys
Methionine                                          M                   Met
Phenylalanine                                       F                   Phe
Proline                                             P                   Pro
Serine                                              S                   Ser
Threonine                                           T                   Thr
Tryptophan                                          W                   Trp
Tyrosine                                            Y                   Tyr
Valine                                              V                   Val
.beta.-Alanine                                                          bAla
2,3-Diaminopropionic acid                                               Dpr
.alpha.-Aminoisobutyric acid                                            Aib
N-Methylglycine (sarcosine)                                             MeGly
Ornithine                                                               Orn
Citrulline                                                              Cit
t-Butylalanine                                                          t-BuA
t-Butylglycine                                                          t-BuG
N-methylisoleucine                                                      MeIle
Phenylglycine                                                           Phg
Cyclohexylalanine                                                       Cha
Norleucine                                                              Nle
Naphthylalanine                                                         Nal
Pyridylalanine
3-Benzothienyl alanine
4-Chlorophenylalanine                                                   Phe(4-Cl)
2-Fluorophenylalanine                                                   Phe(2-F)
3-Fluorophenylalanine                                                   Phe(3-F)
4-Fluorophenylalanine                                                   Phe(4-F)
Penicillamine                                                           Pen
1,2,3,4-Tetrahydroisoquinoline-3-carboxylic acid                        Tic
.beta.-2-thienylalanine                                                 Thi
Methionine sulfoxide                                                    MSO
Homoarginine                                                            hArg
N-acetyl lysine                                                         AcLys
2,4-Diamino butyric acid                                                Dbu
p-Aminophenylalanine                                                    Phe(pNH.sub.2)
N-methylvaline                                                          MeVal
Homocysteine                                                            hCys
Homoserine                                                              hSer
.epsilon.-Amino hexanoic acid                                           Aha
.delta.-Amino valeric acid                                              Ava
2,3-Diaminobutyric acid                                                 Dab
[0116] Polypeptide variants that are encompassed within the scope
of the invention can have one or more amino acids substituted with
an amino acid of similar chemical and/or physical properties, so
long as these variant polypeptides retain nucleic acid polymerase
or DNA polymerase activity and/or remain thermally stable.
Derivative polypeptides can have one or more amino acids
substituted with an amino acid having different chemical and/or
physical properties, so long as these variant polypeptides retain
nucleic acid polymerase or DNA polymerase activity and/or remain
thermally stable.
[0117] Amino acids that are substitutable for each other in the
present variant polypeptides generally reside within similar
classes or subclasses. As known to one of skill in the art, amino
acids can be placed into three main classes: hydrophilic amino
acids, hydrophobic amino acids and cysteine-like amino acids,
depending primarily on the characteristics of the amino acid side
chain. These main classes may be further divided into subclasses.
Hydrophilic amino acids include amino acids having acidic, basic or
polar side chains and hydrophobic amino acids include amino acids
having aromatic or apolar side chains. Apolar amino acids may be
further subdivided to include, among others, aliphatic amino acids.
The definitions of the classes of amino acids as used herein are as
follows:
[0118] "Hydrophobic Amino Acid" refers to an amino acid having a
side chain that is uncharged at physiological pH and that is
repelled by aqueous solution. Examples of genetically encoded
hydrophobic amino acids include Ile, Leu and Val. Examples of
non-genetically encoded hydrophobic amino acids include t-BuA.
[0119] "Aromatic Amino Acid" refers to a hydrophobic amino acid
having a side chain containing at least one ring having a
conjugated .pi.-electron system (aromatic group). The aromatic
group may be further substituted with substituent groups such as
alkyl, alkenyl, alkynyl, hydroxyl, sulfonyl, nitro and amino
groups, as well as others. Examples of genetically encoded aromatic
amino acids include phenylalanine, tyrosine and tryptophan.
Commonly encountered non-genetically encoded aromatic amino acids
include phenyl glycine, 2-naphthylalanine, .beta.-2-thienylalanine,
1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid,
4-chlorophenylalanine, 2-fluorophenylalanine, 3-fluorophenylalanine
and 4-fluorophenylalanine.
[0120] "Apolar Amino Acid" refers to a hydrophobic amino acid
having a side chain that is generally uncharged at physiological pH
and that is not polar. Examples of genetically encoded apolar amino
acids include glycine, proline and methionine. Examples of
non-encoded apolar amino acids include Cha.
[0121] "Aliphatic Amino Acid" refers to an apolar amino acid having
a saturated or unsaturated straight chain, branched or cyclic
hydrocarbon side chain. Examples of genetically encoded aliphatic
amino acids include Ala, Leu, Val and Ile. Examples of non-encoded
aliphatic amino acids include Nle.
[0122] "Hydrophilic Amino Acid" refers to an amino acid having a
side chain that is attracted by aqueous solution. Examples of
genetically encoded hydrophilic amino acids include Ser and Lys.
Examples of non-encoded hydrophilic amino acids include Cit and
hCys.
[0123] "Acidic Amino Acid" refers to a hydrophilic amino acid
having a side chain pK value of less than 7. Acidic amino acids
typically have negatively charged side chains at physiological pH
due to loss of a hydrogen ion. Examples of genetically encoded
acidic amino acids include aspartic acid (aspartate) and glutamic
acid (glutamate).
[0124] "Basic Amino Acid" refers to a hydrophilic amino acid having
a side chain pK value of greater than 7. Basic amino acids
typically have positively charged side chains at physiological pH
due to association with hydronium ion. Examples of genetically
encoded basic amino acids include arginine, lysine and histidine.
Examples of non-genetically encoded basic amino acids include the
non-cyclic amino acids ornithine, 2,3-diaminopropionic acid,
2,4-diaminobutyric acid and homoarginine.
[0125] "Polar Amino Acid" refers to a hydrophilic amino acid having
a side chain that is uncharged at physiological pH, but which has a
bond in which the pair of electrons shared in common by two atoms
is held more closely by one of the atoms. Examples of genetically
encoded polar amino acids include asparagine and glutamine.
Examples of non-genetically encoded polar amino acids include
citrulline, N-acetyl lysine and methionine sulfoxide.
[0126] "Cysteine-Like Amino Acid" refers to an amino acid having a
side chain capable of forming a covalent linkage with a side chain
of another amino acid residue, such as a disulfide linkage.
Typically, cysteine-like amino acids have a side chain
containing at least one thiol (SH) group. Examples of genetically
encoded cysteine-like amino acids include cysteine. Examples of
non-genetically encoded cysteine-like amino acids include
homocysteine and penicillamine.
[0127] As will be appreciated by those having skill in the art, the
above classifications are not absolute. Several amino acids exhibit
more than one characteristic property, and can therefore be
included in more than one category. For example, tyrosine has both
an aromatic ring and a polar hydroxyl group. Thus, tyrosine has
dual properties and can be included in both the aromatic and polar
categories. Similarly, in addition to being able to form disulfide
linkages, cysteine also has apolar character. Thus, while not
strictly classified as a hydrophobic or apolar amino acid, in many
instances cysteine can be used to confer hydrophobicity to a
polypeptide.
[0128] Certain commonly encountered amino acids that are not
genetically encoded and that can be present, or substituted for an
amino acid, in the variant polypeptides of the invention include,
but are not limited to, .beta.-alanine (b-Ala) and other
omega-amino acids such as 3-aminopropionic acid (Dap),
2,3-diaminopropionic acid (Dpr), 4-aminobutyric acid and so forth;
.alpha.-aminoisobutyric acid (Aib); .epsilon.-aminohexanoic acid
(Aha); .delta.-aminovaleric acid (Ava); N-methylglycine (MeGly);
ornithine (Orn); citrulline (Cit); t-butylalanine (t-BuA);
t-butylglycine (t-BuG); N-methylisoleucine (MeIle); phenylglycine
(Phg); cyclohexylalanine (Cha); norleucine (Nle); 2-naphthylalanine
(2-Nal); 4-chlorophenylalanine (Phe(4-Cl)); 2-fluorophenylalanine
(Phe(2-F)); 3-fluorophenylalanine (Phe(3-F)); 4-fluorophenylalanine
(Phe(4-F)); penicillamine (Pen);
1,2,3,4-tetrahydroisoquinoline-3-carboxylic acid (Tic);
.beta.-2-thienylalanine (Thi); methionine sulfoxide (MSO);
homoarginine (hArg); N-acetyl lysine (AcLys); 2,3-diaminobutyric
acid (Dab); 2,4-diaminobutyric acid (Dbu); p-aminophenylalanine
(Phe(pNH.sub.2)); N-methyl valine (MeVal); homocysteine (hCys) and
homoserine (hSer). These amino acids also fall into the categories
defined above.
[0129] The classifications of the above-described genetically
encoded and non-encoded amino acids are summarized in Table 3,
below. It is to be understood that Table 3 is for illustrative
purposes only and does not purport to be an exhaustive list of
amino acid residues that may comprise the variant and derivative
polypeptides described herein. Other amino acid residues that are
useful for making the variant and derivative polypeptides described
herein can be found, e.g., in Fasman, 1989, CRC Practical Handbook
of Biochemistry and Molecular Biology, CRC Press, Inc., and the
references cited therein. Amino acids not specifically mentioned
herein can be conveniently classified into the above-described
categories on the basis of known behavior and/or their
characteristic chemical and/or physical properties as compared with
amino acids specifically identified.
TABLE-US-00019 TABLE 3
Classification   Genetically Encoded   Genetically Non-Encoded
Hydrophobic      F, L, I, V
Aromatic         F, Y, W               Phg, Nal, Thi, Tic, Phe(4-Cl), Phe(2-F),
                                       Phe(3-F), Phe(4-F), Pyridyl Ala,
                                       Benzothienyl Ala
Apolar           M, G, P
Aliphatic        A, V, L, I            t-BuA, t-BuG, MeIle, Nle, MeVal, Cha,
                                       bAla, MeGly, Aib
Hydrophilic      S, K                  Cit, hCys
Acidic           D, E
Basic            H, K, R               Dpr, Orn, hArg, Phe(p-NH.sub.2), DBU,
                                       A.sub.2 BU
Polar            Q, N, S, T, Y         Cit, AcLys, MSO, hSer
Cysteine-Like    C                     Pen, hCys, .beta.-methyl Cys
[0130] Polypeptides of the invention can have any amino acid
substituted by any similarly classified amino acid to create a
variant peptide, so long as the peptide variant is thermally stable
and/or retains nucleic acid polymerase or DNA polymerase
activity.
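The notion of a "similarly classified" amino acid can be illustrated with a minimal Python sketch. It transcribes only the genetically encoded residues of Table 3 and treats two residues as similarly classified when they share at least one class, which is one possible reading and an assumption of this sketch. The names `CLASSES`, `classes_of` and `similarly_classified` are hypothetical.

```python
# Genetically encoded residues per class, transcribed from Table 3.
CLASSES = {
    "hydrophobic": set("FLIV"),
    "aromatic": set("FYW"),
    "apolar": set("MGP"),
    "aliphatic": set("AVLI"),
    "hydrophilic": set("SK"),
    "acidic": set("DE"),
    "basic": set("HKR"),
    "polar": set("QNSTY"),
    "cysteine-like": set("C"),
}


def classes_of(residue):
    """Return the set of Table 3 class names that list this residue."""
    return {name for name, members in CLASSES.items() if residue in members}


def similarly_classified(a, b):
    """True if the two residues share at least one Table 3 class."""
    return bool(classes_of(a) & classes_of(b))


print(sorted(classes_of("Y")))         # ['aromatic', 'polar']
print(similarly_classified("F", "Y"))  # True: both are aromatic
print(similarly_classified("G", "D"))  # False: apolar versus acidic
```

Under this reading, the Phe to Tyr change at position 665 stays within the aromatic class, whereas the Gly to Asp change at position 43 crosses classes, consistent with its description as a derivative with altered activity.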
[0131] "Domain shuffling" or construction of "thermostable chimeric
DNA polymerases" may be used to provide thermostable DNA
polymerases containing novel properties. For example, placement of
codons 289-422 from the Thermus brockianus polymerase coding
sequence after codons 1-288 of the Thermus aquaticus DNA polymerase
would yield a novel thermostable polymerase containing the 5'-3'
exonuclease domain of Thermus aquaticus DNA polymerase (1-288), the
3'-5' exonuclease domain of Thermus brockianus polymerase
(289-422), and the DNA polymerase domain of Thermus aquaticus DNA
polymerase (423-832). Alternatively, the 5'-3' exonuclease domain
and the 3'-5' exonuclease domain of Thermus brockianus polymerase
may be fused to the DNA polymerase (dNTP binding and
primer/template binding domains) portions of Thermus aquaticus DNA
polymerase (about codons 423-832). The donors and recipients need
not be limited to Thermus aquaticus and Thermus brockianus
polymerases. Thermus thermophilus DNA polymerase 3'-5' exonuclease,
5'-3' exonuclease and DNA polymerase domains can similarly be
exchanged for those in the Thermus brockianus polymerases of the
invention.
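The first chimera described above can be sketched at the residue level in Python. This is a simplified illustration: it assumes one residue per codon, uses placeholder strings in place of real sequences, and assumes the donor sequences are already numbered so that the quoted boundaries line up. The names `segment` and `build_chimera` are hypothetical.

```python
def segment(seq, first, last):
    """Return residues first..last of seq (1-based, inclusive)."""
    if not 1 <= first <= last <= len(seq):
        raise ValueError(f"range {first}-{last} is outside the sequence")
    return seq[first - 1:last]


def build_chimera(parts):
    """Concatenate (sequence, first, last) segments in the order given."""
    return "".join(segment(seq, first, last) for seq, first, last in parts)


# Placeholders only: 'a' marks Thermus aquaticus residues and 'b' marks
# Thermus brockianus residues. Real sequences would be used in practice.
taq = "a" * 832
tbr = "b" * 829

chimera = build_chimera([
    (taq, 1, 288),    # 5'-3' exonuclease domain of Thermus aquaticus
    (tbr, 289, 422),  # 3'-5' exonuclease domain of Thermus brockianus
    (taq, 423, 832),  # DNA polymerase domain of Thermus aquaticus
])
print(len(chimera))  # 832
```

In a laboratory setting the same joins would be made at the nucleic acid level; the sketch only shows how the boundary coordinates combine.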
[0132] For example, it has been demonstrated that the exonuclease
domain of Thermus aquaticus Polymerase I can be removed from the
amino terminus of the protein without a significant loss of
thermostability or polymerase activity (Erlich et al., (1991)
Science 252: 1643-1651; Barnes, W. M., (1992) Gene 112:29-35;
Lawyer et al., (1989) JBC 264:6427-6437). Other N-terminal
deletions similarly have been shown to maintain thermostability and
activity (Vainshtein et al., (1996) Protein Science 5:1785-1792 and
references therein).
[0133] Therefore, this invention also includes similarly truncated
forms of any of the wild type or variant polymerases provided
herein. For example, the invention is also directed to an active
truncated variant of any of the polymerases provided by the
invention in which the first 330 amino acids are removed.
[0134] Moreover, the invention provides SEQ ID NO:56, a truncated
form of a polymerase in which the N-terminal 289 amino acids have
been removed from the wild type Thermus brockianus polymerase from
strain YS38.
TABLE-US-00020 290 Q AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL
SLAASAKGRV YRAEAPHKAL SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD
PMLLAYLLDP SNTTPEGVAR RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG
EEKLLWLYEE VEKPLSRVLA HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE
EVFRLAGHPF NLNSRDQLER VLFDELGLPP 500 501 IGKTEKTGKR STSAAVLEAL
REAHPIVEKI LQYRELAKLK GTYIDLLPAL 550 551 VHPRTGRLHT RPNQTATATG
RLSSSDPNLQ NIPVRTPLGQ RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL
SGDENLIRVF QEGRDIHTQT ASWMFGLPAE 650 651 AIDPLRRRAA KTINFGVLYG
MSAHRLSQEL GIPYEEAVAF IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET
LFGRRRYVPD LNARVKSVRE AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL
PEVGARMLLQ VHDELLLEAP KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG
IGEDWLSAKG 829
[0135] In another embodiment, the invention provides SEQ ID NO:57,
a truncated form of a polymerase in which the N-terminal 289 amino
acids have been removed from the Thermus brockianus polymerase from
strain 2AZN.
TABLE-US-00021 290 Q AAEEAPWPPP 300 301 EGAFLGFRLS RPEPMWAELL
SLAASAKGRV YRAEAPHKAL SDLKEIRGLL 350 351 AKDLAVLALR EGLGLPPTDD
PMLLAYLLDP SNTTPEGVAR RYGGEWTEEA 400 401 GERALLAERL YENLLSRLKG
EEKLLWLYEE VEKPLSRVLA HMEATGVRLD 450 451 VPYLRALSLE VAAEMGRLEE
EVFRLAGHPF NLNSRDQLER VLFDELGLPP 500 501 IGKTEKTGKR STSAAVLEAL
REAHPIVEKI LQYRELAKLK GTYIDPLPAL 550 551 VHPRTGRLHT RFNQTATATG
RLSSSDPNLQ NIPVRTPLGQ RIRRAFVAEE 600 601 GYLLVALDYS QIELRVLAHL
SGDENLIRVF QEGRDIHTQT ASWMFGLPAE 650 651 AIDPLRRRAA KTINFGVLYG
MSAHRLSQEL GIPYEEAVAF IDRYFQSYPK 700 701 VKAWIERTLE EGRQRGYVET
LFGRRRYVPD LNARVKSVRE AAERMAFNMP 750 751 VQGTAADLMK LAMVRLFPRL
PEVGARMLLQ VHDELLLEAP KERAEEAAAL 800 801 AKEVMEGVWP LAVPLEVEVG
IGEDWLSAKG 829
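The relationship between a truncated form and the numbering of the full-length polypeptide can be illustrated with the following Python sketch. The helper names are hypothetical. The `full` string is a placeholder in which only residues 281-300 are taken from the listings in this section and every other position is filled with the character `x`.

```python
def truncate_n_terminus(seq, n_removed):
    """Return seq with its first n_removed residues deleted."""
    if not 0 <= n_removed < len(seq):
        raise ValueError("n_removed must leave at least one residue")
    return seq[n_removed:]


def original_position(truncated_index, n_removed):
    """Map a 0-based index in the truncated form to 1-based full-length numbering."""
    return truncated_index + n_removed + 1


# Placeholder full-length sequence of 829 residues. Only residues 281-300
# are real (as printed in the listings); 'x' is filler.
full = "x" * 280 + "HEFGLLESPQAAEEAPWPPP" + "x" * 529

truncated = truncate_n_terminus(full, 289)
print(len(truncated))           # 540
print(truncated[:11])           # QAAEEAPWPPP
print(original_position(0, 289))  # 290
```

The first residue of the truncated form is therefore residue 290 of the full-length numbering, which is why the truncated listings begin at 290.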
[0136] Thus, the polypeptides of the invention encompass both
naturally occurring proteins as well as variations, truncations and
modified forms thereof. Such variants will continue to possess the
desired activity. The deletions, insertions, and substitutions of
the polypeptide sequence encompassed herein are not expected to
produce radical changes in the characteristics of the polypeptide.
One skilled in the art can readily evaluate the thermal stability,
nucleic acid polymerase or DNA polymerase activity of the
polypeptides and variant polypeptides of the invention by routine
screening assays.
[0137] Kits and compositions containing the present polypeptides
are substantially free of cellular material. Such preparations and
compositions have less than about 30%, 20%, 10%, or 5% (by dry
weight) of contaminating bacterial cellular protein.
[0138] The activity of polymerase polypeptides and variant
polypeptides can be assessed by any procedure known to one of skill
in the art. For example, the DNA synthetic activity of the variant
and non-variant polymerase polypeptides of the invention can be
tested in a standard DNA sequencing or DNA primer extension reaction.
One such assay can be performed in a 100 .mu.l (final volume)
reaction mixture, containing, for example, 0.1 mM dCTP, dTTP, dGTP,
.alpha.-.sup.32P-dATP, 0.3 mg/ml activated calf thymus DNA and 0.5
mg/ml BSA in a buffer containing: 50 mM KCl, 1 mM DTT, 10 mM
MgCl.sub.2 and 50 mM of a buffering compound such as PIPES, Tris or
Triethylamine. A dilution to 0.1 units/.mu.l of each polymerase
enzyme is prepared, and 5 .mu.l of such a dilution is added to the
reaction mixture, followed by incubation at 60.degree. C. for 10
minutes. Reaction products can be detected by determining the
amount of .sup.32P incorporated into DNA or by observing the
products after separation on a polyacrylamide gel.
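The quantities in the example assay can be worked through with a short Python sketch. It assumes that each of the four nucleotides is present at 0.1 mM and that the stated 100 .mu.l final volume already includes the 5 .mu.l of enzyme dilution; both are readings of the text, not statements from it. The function and constant names are hypothetical.

```python
REACTION_VOLUME_UL = 100.0            # final reaction volume
ENZYME_DILUTION_UNITS_PER_UL = 0.1    # enzyme dilution
ENZYME_VOLUME_UL = 5.0                # volume of dilution added
DNTP_MM = 0.1                         # assumed concentration of each dNTP
TEMPLATE_MG_PER_ML = 0.3              # activated calf thymus DNA
BSA_MG_PER_ML = 0.5


def units_added(units_per_ul, volume_ul):
    """Enzyme units delivered by a given volume of a dilution."""
    return units_per_ul * volume_ul


def nmol_in_volume(conc_mm, volume_ul):
    """Amount in nmol; 1 mM equals 1 nmol per microliter."""
    return conc_mm * volume_ul


def micrograms_in_volume(conc_mg_per_ml, volume_ul):
    """Mass in micrograms; 1 mg/ml equals 1 microgram per microliter."""
    return conc_mg_per_ml * volume_ul


enzyme_units = units_added(ENZYME_DILUTION_UNITS_PER_UL, ENZYME_VOLUME_UL)
final_units_per_ul = enzyme_units / REACTION_VOLUME_UL
dntp_nmol_each = nmol_in_volume(DNTP_MM, REACTION_VOLUME_UL)
template_ug = micrograms_in_volume(TEMPLATE_MG_PER_ML, REACTION_VOLUME_UL)
bsa_ug = micrograms_in_volume(BSA_MG_PER_ML, REACTION_VOLUME_UL)

print(f"enzyme: {enzyme_units:.2f} units ({final_units_per_ul:.3f} units/ul)")
print(f"each dNTP: {dntp_nmol_each:.1f} nmol")
print(f"template DNA: {template_ug:.1f} ug, BSA: {bsa_ug:.1f} ug")
```

Under these assumptions each reaction receives 0.5 units of enzyme, 10 nmol of each nucleotide, 30 .mu.g of template DNA and 50 .mu.g of BSA.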
Uses for Nucleic Acid Polymerases
[0139] The thermostable enzyme of this invention may be used for
any purpose in which DNA or RNA polymerase enzyme activity is
necessary or desired. For example, the present polymerases can be
used in one or more of the following procedures: DNA sequencing,
DNA amplification, RNA amplification, reverse transcription, DNA
synthesis and/or primer extension. The polymerases of the invention
can be used to amplify DNA by polymerase chain reaction (PCR). The
polymerases of the invention can be used to sequence DNA by Sanger
sequencing procedures. The polymerases of the invention can also be
used in primer extension reactions. The polymerases of the
invention can be used to test for single nucleotide polymorphisms
(SNPs) by single nucleotide primer extension using terminator
nucleotides. Any such procedures and related procedures, for
example, polynucleotide or primer labeling, minisequencing and the
like are contemplated for use with the present polymerases.
[0140] Methods of the invention comprise the step of extending a
primed polynucleotide template with at least one labeled
nucleotide, wherein the extension is catalyzed by a polymerase of
the invention. DNA polymerases used for Sanger sequencing can
produce fluorescently labeled products that are analyzed on an
automated fluorescence-based sequencing apparatus such as an
Applied Biosystems 310 or 377 (Applied Biosystems, Foster City,
Calif.). Detailed protocols for Sanger sequencing are known to
those skilled in the art and may be found, for example in Sambrook
et al, Molecular Cloning, A Laboratory Manual, Second Edition, Cold
Spring Harbor Press, Cold Spring Harbor, N.Y. (1989).
[0141] In one embodiment, the polymerases of the invention are used
for DNA amplification. Any DNA amplification procedure that employs
a DNA polymerase can be used, for example, polymerase chain reaction
(PCR) assays, strand displacement amplification and other
amplification procedures. Strand displacement amplification can be
used as described in Walker et al (1992) Nucl. Acids Res. 20,
1691-1696. The term "polymerase chain reaction" ("PCR") refers to
the method of K. B. Mullis U.S. Pat. Nos. 4,683,195; 4,683,202; and
4,965,188, hereby incorporated by reference, which describe a
method for increasing the concentration of a segment of a target
sequence in a mixture of genomic or other DNA without cloning or
purification.
[0142] The PCR process for amplifying a target sequence consists of
introducing a large excess of two oligonucleotide primers to the
DNA mixture containing the desired target sequence, followed by a
precise sequence of thermal cycling in the presence of a DNA
polymerase. The two primers are complementary to their respective
strands of the double stranded target sequence. To effect
amplification, the mixture is denatured and the primers are
annealed to complementary sequences within the target molecule.
Following annealing, the primers are extended with a polymerase so
as to form a new pair of complementary strands. The steps of
denaturation, primer annealing and polymerase extension are termed
a "cycle." There can be numerous cycles, and the amount of
amplified DNA produced increases with each cycle. Hence, to obtain
a high concentration of an amplified target nucleic acid, many
cycles are performed.
[0143] The steps involved in the PCR nucleic acid amplification method
are described in more detail below. For ease of discussion, the
nucleic acid to be amplified is described as being double-stranded.
However, the process is equally useful for amplifying a
single-stranded nucleic acid, such as an mRNA, although the
ultimate product is generally double-stranded DNA. In the
amplification of a single-stranded nucleic acid, the first step
involves the synthesis of a complementary strand using, for
example, one of the two amplification primers. The succeeding steps
generally proceed as follows:
[0144] (a) Each nucleic acid strand is contacted with four
different nucleoside triphosphates and one oligonucleotide primer
for each nucleic acid strand to be amplified, wherein each primer
is selected to be substantially complementary to a portion of the
nucleic acid strand to be amplified, such that the extension
product synthesized from one primer, when it is separated from its
complement, can serve as a template for synthesis of the extension
product of the other primer. To promote the proper annealing of
primer(s) and the nucleic acid strands to be amplified, a
temperature that allows hybridization of each primer to a
complementary nucleic acid strand is used.
[0145] (b) After primer annealing, a polymerase is used for primer
extension that incorporates the nucleoside triphosphates into a
growing nucleic acid strand that is complementary to the strand
hybridized by the primer. In general, this primer extension
reaction is performed at a temperature and for a time effective to
promote the activity of the enzyme and to synthesize a "full
length" complementary nucleic acid strand that extends into and
through a complete second primer binding site. However, the temperature
is not so high as to separate each extension product from its
nucleic acid template strand.
[0146] (c) The mixture from step (b) is then heated for a time and
at a temperature sufficient to separate the primer extension
products from their complementary templates. The temperature chosen
is not so high as to irreversibly denature the polymerase present
in the mixture.
[0147] (d) The mixture from step (c) is cooled for a time and at a
temperature effective to promote hybridization of a primer to each
of the single-stranded molecules produced in step (c).
[0148] (e) The mixture from step (d) is maintained at a temperature
and for a time sufficient to promote primer extension by polymerase
to produce a "full length" extension product. The temperature used
is not so high as to separate each extension product from the
complementary strand template. Steps (c)-(e) are repeated until the
desired level of amplification is obtained.
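The order of steps (a) through (e), including the repetition of steps (c)-(e), can be sketched as a simple schedule generator. This is only an illustration of the sequence of operations; the function name and the short action labels are assumptions made for this example.

```python
from typing import Iterator, Tuple


def pcr_steps(repeats: int) -> Iterator[Tuple[int, str, str]]:
    """Yield (round, step label, action) in the order of steps (a)-(e)."""
    # Round 0: initial annealing and extension on the starting strands.
    yield (0, "a", "anneal primers")
    yield (0, "b", "extend primers")
    # Each later round repeats denaturation, annealing and extension.
    for rnd in range(1, repeats + 1):
        yield (rnd, "c", "denature")
        yield (rnd, "d", "anneal primers")
        yield (rnd, "e", "extend primers")


for rnd, label, action in pcr_steps(2):
    print(rnd, label, action)
```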
[0149] The amplification method is useful not only for producing
large amounts of a specific nucleic acid sequence of known sequence
but also for producing nucleic acid sequences that are known to
exist but are not completely specified. One need know only the
identity of a sufficient number of bases at both ends of the
sequence in sufficient detail so that two oligonucleotide primers
can be prepared that will hybridize to different strands of the
desired sequence at those positions. An extension product is
synthesized from one primer. When that extension product is
separated from the template the extension product can serve as a
template for extension of the other primer. The greater the
knowledge about the bases at both ends of the sequence, the greater
can be the specificity of the primers for the target nucleic acid
sequence.
[0150] Thermally stable DNA polymerases are therefore generally
used for PCR because they can function at the high temperatures
used for melting double stranded target DNA and annealing the
primers during each cycle of the PCR reaction. High temperature
results in thermodynamic conditions that favor primer hybridization
with the target sequences and not hybridization with non-target
sequences (H.A. Erlich (ed), PCR Technology, Stockton Press
[1989]).
[0151] The thermostable polymerases of the present invention
satisfy the requirements for effective use in amplification
reactions such as PCR. The present polymerases do not become
irreversibly denatured (inactivated) when subjected to the required
elevated temperatures for the time necessary to melt
double-stranded nucleic acids during the amplification process.
Irreversible denaturation for purposes herein refers to permanent
and complete loss of enzymatic activity. The heating conditions
necessary for nucleic acid denaturation will depend, e.g., on the
buffer salt concentration and the composition and length of the
nucleic acids being denatured, but typically range from about
90° C. to about 105° C. for a time depending mainly
on the temperature and the nucleic acid length, typically from a
few seconds up to four minutes. Higher temperatures may be required
as the buffer salt concentration and/or GC composition of the
nucleic acid is increased. The polymerases of the invention do not
become irreversibly denatured for relatively short exposures to
temperatures of about 90° C. to 100° C.
[0152] The thermostable polymerases of the invention have an
optimum temperature at which they function that is higher than
about 45° C. Temperatures below 45° C. facilitate
hybridization of primer to template, but depending on salt
composition and concentration and primer composition and length,
hybridization of primer to template can occur at higher
temperatures (e.g., 45° C. to 70° C.), which may
promote specificity of the primer hybridization reaction. The DNA
polymerase polypeptides of the invention exhibit activity over a
broad temperature range from about 37° C. to about
90° C.
[0153] The present polymerases have particular utility for PCR not
only because of their thermal stability but also because of their
fidelity in replicating the target nucleic acid. With PCR, it is
possible to amplify a single copy of a specific target nucleic acid
to a level detectable by several different methodologies. However,
if the sequence of the target nucleic acid is not replicated with
fidelity, then the amplified product can comprise a pool of nucleic
acids with diverse sequences. Hence, a polymerase that can
accurately replicate the sequence of the target is highly
desirable.
[0154] Any nucleic acid can act as a "target nucleic acid" for the
PCR methods of the invention. The term "target," when used in
reference to the polymerase chain reaction, refers to the region of
nucleic acid bounded by the primers used for polymerase chain
reaction. In addition to genomic DNA, any cDNA, oligonucleotide or
polynucleotide can be amplified with the appropriate set of primer
molecules. In particular, the amplified segments created by the PCR
process itself are, themselves, efficient templates for subsequent
PCR amplifications. The length of the amplified segment of the
desired target sequence is determined by the relative positions of
the primers with respect to each other, and therefore, this length
is a controllable parameter.
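The dependence of amplicon length on primer position can be sketched as follows. The template used here is a short made-up sequence assembled only for illustration, and the function names are assumptions; the sketch also assumes exact primer matches and a single binding site for each primer.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    comp = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(comp[b] for b in reversed(seq.lower()))


def amplicon_length(template: str, forward: str, reverse: str) -> int:
    """Length of the segment bounded by, and including, both primers.

    template is the top strand written 5'->3'. forward matches the top
    strand; reverse anneals to the top strand, so its reverse
    complement is what appears in the top-strand text.
    """
    template = template.lower()
    start = template.find(forward.lower())
    if start == -1:
        raise ValueError("forward primer site not found")
    rc = reverse_complement(reverse)
    site = template.find(rc, start)
    if site == -1:
        raise ValueError("reverse primer site not found")
    return site + len(rc) - start


# Hypothetical template: flank + forward site + insert + reverse site + flank.
fwd_site, insert, rev_site = "atgcttcccc", "ggggaaaacc", "gccaagggctag"
template = "ttaa" + fwd_site + insert + rev_site + "ccgg"
print(amplicon_length(template, fwd_site, reverse_complement(rev_site)))
```

Moving either primer site inward or outward on the template changes the returned length accordingly.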
[0155] The amplified target nucleic acid can be detected by any
method known to one of skill in the art. For example, target
nucleic acids are often amplified to such an extent that they form
a band visible on a size separation gel. Target nucleic acids can
also be detected by hybridization with a labeled probe; by
incorporation of biotinylated primers during PCR, followed by
avidin-enzyme conjugate detection; by incorporation of
³²P-labeled deoxynucleotide triphosphates during PCR, and the
like.
[0156] The amount of amplification can also be monitored, for
example, by use of a reporter-quencher oligonucleotide as described
in U.S. Pat. No. 5,723,591, and a polymerase of the invention that
has 5'-3' nuclease activity. The reporter-quencher oligonucleotide
has an attached reporter molecule and an attached quencher molecule
that is capable of quenching the fluorescence of the reporter
molecule when the two are in proximity. Quenching occurs when the
reporter-quencher oligonucleotide is not hybridized to a
complementary nucleic acid because the reporter molecule and the
quencher molecule tend to be in proximity or at an optimal distance
for quenching. When hybridized, the reporter-quencher
oligonucleotide emits more fluorescence than when unhybridized
because the reporter molecule and the quencher molecule tend to be
further apart. To monitor amplification, the reporter-quencher
oligonucleotide is designed to hybridize 3' to an amplification
primer. During amplification, the 5'-3' nuclease activity of the
polymerase digests the reporter oligonucleotide probe, thereby
separating the reporter molecule from the quencher molecule. As the
amplification is conducted, the fluorescence of the reporter
molecule increases. Accordingly, the amount of amplification
performed can be quantified based on the increase of fluorescence
observed.
[0157] Oligonucleotides used for PCR primers are usually about 9 to
about 75 nucleotides, preferably about 17 to about 50 nucleotides
in length. Preferably, an oligonucleotide for use in PCR reactions
is about 40 or fewer nucleotides in length (e.g., 9, 12, 15, 18,
20, 21, 24, 27, 30, 35, 40, or any number between 9 and 40).
Generally, specific primers are at least about 14 nucleotides in
length. For optimum specificity and cost effectiveness, primers of
16-24 nucleotides in length are generally preferred.
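Primer length and base composition can be checked with a few lines of code. The sketch below uses the primer identified in this specification as SEQ ID NO:50 and estimates a melting temperature with the Wallace rule of thumb (2 degrees C per A or T, 4 degrees C per G or C). That rule is a rough approximation for short oligonucleotides and is an assumption of this example; it is not described in this specification, and the function names are illustrative.

```python
def clean(primer: str) -> str:
    """Lower-case a primer and drop the spaces used for codon grouping."""
    return primer.lower().replace(" ", "")


def gc_fraction(primer: str) -> float:
    """Fraction of bases that are G or C."""
    p = clean(primer)
    return (p.count("g") + p.count("c")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature in deg C (Wallace rule)."""
    p = clean(primer)
    at = p.count("a") + p.count("t")
    gc = p.count("g") + p.count("c")
    return 2 * at + 4 * gc


def in_usual_length_range(primer: str, low: int = 9, high: int = 75) -> bool:
    """True if the primer length falls within the stated 9-75 nt range."""
    return low <= len(clean(primer)) <= high


seq_id_50 = "ggc cac cac ctg gcc tac"  # primer sequence given as SEQ ID NO:50
print(len(clean(seq_id_50)), round(gc_fraction(seq_id_50), 2),
      wallace_tm(seq_id_50), in_usual_length_range(seq_id_50))
```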
[0158] Those skilled in the art can readily design primers for use
in processes such as PCR. For example, potential primers for nucleic
acid amplification can be used as probes to determine whether the
primer is selective for a single target and what conditions permit
hybridization of a primer to a target within a sample or complex
mixture of nucleic acids.
[0159] The present invention also contemplates use of the present
polymerase polypeptides in combination with other procedures or
enzymes. For example, the polymerase polypeptides have reverse
transcription activity and can be used for reverse transcription of
an RNA. In this method, the RNA is converted to cDNA due to the
reverse transcriptase activity of the polymerase, and then
amplified using a polymerizing activity of one of the thermostable
polymerases of the invention. Additional reverse transcriptase
enzyme may be added as needed. Such procedures are provided in U.S.
Pat. No. 5,322,770, incorporated by reference herein.
[0160] In another embodiment, polymerases of the invention with 5'-3'
exonuclease activity are used to detect target nucleic acids in
an invader-directed cleavage assay. This type of assay is
described, for example, in U.S. Pat. No. 5,994,069. It is important
to note that the 5'-3' exonuclease of polymerases is not really an
exonuclease that progressively cleaves nucleotides from the 5' end
of a nucleic acid, but rather a nuclease that can cleave certain
types of nucleic acid structures to produce oligonucleotide
cleavage products. Such cleavage is sometimes called
structure-specific cleavage.
[0161] In general, the invader-directed cleavage assay employs at
least one pair of oligonucleotides that interact with a target
nucleic acid to form a cleavage structure for the 5'-3' nuclease
activity of the polymerase. Distinctive cleavage products are
released when the cleavage structure is cleaved by the 5'-3'
nuclease activity of the DNA polymerase. Formation of such a
target-dependent cleavage structure and the resulting cleavage
products is indicative of the presence of specific target nucleic
acid sequences in the test sample.
[0162] Therefore, in the invader-directed cleavage procedure, the
5'-3' nuclease activity of the present polymerases is needed as
well as at least one pair of oligonucleotides that interact with a
target nucleic acid to form a cleavage structure for the 5'-3'
nuclease. The first oligonucleotide, sometimes termed the "probe,"
can hybridize within the target site but downstream of a second
oligonucleotide, sometimes termed an "invader" oligonucleotide. The
invader oligonucleotide can hybridize adjacent and upstream of the
probe oligonucleotide. However, the target sites to which the probe
and invader oligonucleotides hybridize overlap such that the 3'
segment of the invader oligonucleotide overlaps with the 5' segment
of the probe oligonucleotide. The 5'-3' nuclease of the present
polymerases can cleave the probe oligonucleotide at an internal
site to produce distinctive fragments that are diagnostic of the
presence of the target nucleic acid in a sample. Further details
and methods for adapting the invader-directed cleavage assay to
particular situations can be found in U.S. Pat. No. 5,994,069.
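The geometric relationship between the invader and probe oligonucleotides can be sketched as an interval check. In this illustration, positions are counted along the direction in which both oligonucleotides read 5' to 3', so the invader (upstream) starts at a lower position than the probe (downstream). The `Site` type, the function name, and the coordinates are all hypothetical and serve only to illustrate the overlap requirement.

```python
from typing import NamedTuple


class Site(NamedTuple):
    start: int  # first position covered (inclusive)
    end: int    # one past the last position covered (exclusive)


def invader_overlap(invader: Site, probe: Site) -> int:
    """Number of positions shared by the invader 3' end and probe 5' end.

    Returns 0 when the invader is not upstream of the probe or when the
    two hybridization sites do not overlap.
    """
    invader_is_upstream = (invader.start < probe.start
                           and invader.end < probe.end)
    if not invader_is_upstream:
        return 0
    return max(0, invader.end - probe.start)


# Hypothetical sites: the invader 3' end overlaps the probe 5' end by one base.
print(invader_overlap(Site(0, 21), Site(20, 40)))
```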
[0163] One or more nucleotide analogs can also be used with the
present methods, kits and with the polymerases. Such nucleotide
analogs can be modified or non-naturally occurring nucleotides such
as 7-deaza purines (i.e., 7-deaza-dATP and 7-deaza-dGTP).
Nucleotide analogs include base analogs and comprise modified forms
of deoxyribonucleotides as well as ribonucleotides. As used herein
the term "nucleotide analog" when used in reference to targets
present in a PCR mixture refers to the use of nucleotides other
than dATP, dGTP, dCTP and dTTP; thus, the use of dUTP (a naturally
occurring dNTP) in a PCR would comprise the use of a nucleotide
analog in the PCR. A PCR product generated using dUTP,
7-deaza-dATP, 7-deaza-dGTP or any other nucleotide analog in the
reaction mixture is said to contain nucleotide analogs.
[0164] The invention also provides kits that contain at least one
of the polymerases of the invention. Individual kits may be adapted
for performing one or more of the following procedures: DNA
sequencing, DNA amplification, RNA amplification and/or primer
extension. Kits of the invention comprise a DNA polymerase
polypeptide of the invention and at least one nucleotide. A
nucleotide provided in the kits of the invention can be labeled or
unlabeled. Kits preferably can also contain instructions on how to
perform the procedures for which the kits are adapted.
[0165] Optionally, the subject kit may further comprise at least
one other reagent required for performing the method the kit is
adapted to perform. Examples of such additional reagents include:
another unlabeled nucleotide, another labeled nucleotide, a balanced
mixture of nucleotides, one or more chain terminating nucleotides,
one or more nucleotide analogs, buffer solution(s), magnesium
solution(s), cloning vectors, restriction endonucleases, sequencing
primers, reverse transcriptase, and DNA or RNA amplification
primers. The reagents included in the kits of the invention may be
supplied in premeasured units so as to provide for greater
precision and accuracy. Typically, kit reagents and other
components are placed and contained in separate vessels. A reaction
vessel, test tube, microwell tray, microtiter dish or other
container can also be included in the kit. Different labels can be
used on different reagents so that each reagent can be
distinguished from another.
[0166] The following Examples further illustrate the invention and
are not intended to limit the scope of the invention.
EXAMPLE 1
Cloning of Thermus brockianus Nucleic Acid Polymerases
Bacteria Growth and Genomic DNA Isolation
[0167] The 2AZN strain of Thermus brockianus used in this invention
was obtained from Dr. R.A.D. Williams, Queen Mary and Westfield
College, London, England. Strain YS38 was obtained from the NCIMB
collection. Both of these bacterial samples were obtained as
lyophilized bacteria and were revived in 4 ml of ATCC Thermus
bacteria growth media 461 (Castenholz TYE medium). The 4 ml
overnight cultures were grown at 65° C. in a water bath
orbital shaker. The 4-ml cultures were transferred to 200 ml of TYE
and grown overnight at 65° C. in a water bath orbital shaker
to stationary phase. Thermus brockianus genomic DNAs were prepared
using a Qiagen genomic DNA preparation kit (Qiagen, Valencia,
Calif.).
Cloning of the Thermus brockianus Polymerase Genes
[0168] The forward and reverse primers were designed by analysis of
5' and 3' terminal homologous conserved regions of the GenBank DNA
sequences of the DNA Pol I genes from Thermus aquaticus (Taq),
Thermus thermophilus (Tth), Thermus filiformis (Tfi), Thermus
caldophilus, and Thermus flavus. The Thermus brockianus polymerase
gene from strain YS38 was first cloned as a partial fragment which
was amplified using N-terminal primer 5'-ggc cac cac ctg gcc tac-3'
(SEQ ID NO:50) and C-terminal primer 5'-ccc acc tcc acc tcc ag-3'
(SEQ ID NO:51). The PCR reaction mixture contained 2.5 µl of
10× cPfu Turbo reaction buffer (Stratagene), 50 ng genomic DNA
template, 0.2 mM (each) dNTPs, 20 pmol of each primer, and 10 units
of Pfu Turbo DNA polymerase (Stratagene) in a 25 µl total
reaction volume. The reaction was started by adding a premix
containing enzyme, MgCl₂, dNTPs, buffer and water to another
premix containing primer and template preheated at 80° C.
The entire reaction mixture was then denatured (30 sec, 96° C.)
followed by 30 PCR cycles (97° C. for 3 sec, 56° C. for 30 sec,
72° C. for 2 min 30 sec) with a finishing
step (72° C. for 6 min). This produced an approximately 2.3 kb
amplified DNA fragment. This amplified DNA fragment was purified
from the PCR reaction mix using a Qiagen PCR cleanup kit
(Qiagen). The fragment was then ligated into the inducible
expression vector pCR®T7 CT-TOPO® (Invitrogen, Carlsbad,
Calif.).
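As a rough check on run time, the programmed hold times of the cycling protocol above can be added up. The sketch below ignores the time the thermal cycler needs to ramp between temperatures, so the real run takes longer; the function name is an illustrative assumption.

```python
from typing import Sequence


def program_seconds(initial_s: float, cycles: int,
                    cycle_steps_s: Sequence[float], final_s: float) -> float:
    """Total programmed hold time in seconds, ignoring ramp times."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# Hold times taken from the partial-fragment amplification described above.
total = program_seconds(
    initial_s=30,                 # 96 deg C for 30 sec
    cycles=30,
    cycle_steps_s=(3, 30, 150),   # 97 deg C 3 sec, 56 deg C 30 sec, 72 deg C 2 min 30 sec
    final_s=360,                  # 72 deg C for 6 min
)
print(total / 60)  # minutes of programmed hold time
```

With these values the programmed hold time is 5880 seconds, or 98 minutes.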
[0169] The sequence of the full-length Thermus brockianus strain
YS38 open reading frame and flanking regions was obtained by
genomic DNA sequencing using primers designed to hybridize to
portions of the Thermus brockianus strain YS38 Polymerase I gene.
The C terminal end was sequenced using the forward primer 5'-cga
cct caa cgc ccg ggt aaa ga-3' (SEQ ID NO:52). The N terminal end
was sequenced using the reverse primer 5'-gct ttt ggc gaa gcc gta
gac ccc t-3' (SEQ ID NO:53). The sequencing reactions were
performed using a pre-denaturation step (95° C., 5 min)
followed by 60 cycles (97° C. for 5 sec, 60° C. for
4 min). The reaction mixture consisted of 16 µl Big Dye V1 Ready
Reaction mix, 2.4 µg DNA, and 15 pmol primer in a 40 µl reaction
volume. The sequences of the 5' (start) and 3' (end) ends of the Thermus
brockianus YS38 gene were thus obtained.
[0170] Using the sequence information obtained in the genomic DNA
sequencing reactions above, two primers were designed to amplify
the full-length Thermus brockianus YS38 polymerase gene: N-terminal
primer 5'-cat atg ctt ccc ctc ttt gag ccc a-3' (SEQ ID NO:54) and
C-terminal primer 5'-gtc gac tag ccc ttg gcg gaa agc-3' (SEQ ID
NO:55). These primers introduced Nde I and Sal I restriction sites
that facilitated subcloning. The PCR reaction mixture used to
amplify Thermus brockianus strain YS38 contained 2.5 µl of
10× Amplitaq reaction buffer (Applied Biosystems), 2 mM
MgCl₂, 120 ng genomic DNA template, 0.2 mM (each) dNTPs, 20
pmol of each primer, and 1.25 units of Amplitaq in a 25 µl total
reaction volume. The reaction was started by adding a premix
containing enzyme, MgCl₂, dNTPs, buffer and water to another
premix containing primer and template preheated at 80° C.
The entire reaction mixture was then denatured (30 sec at
96° C.) followed by 30 PCR cycles (97° C. for 3 sec,
62° C. for 30 sec, 72° C. for 3 min) with a finishing
step (72° C. for 7 min).
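The restriction sites carried by the two full-length primers can be confirmed with a short check. The sketch below uses the primer sequences given for SEQ ID NO:54 and SEQ ID NO:55 and the commonly cited recognition sequences for Nde I (CATATG) and Sal I (GTCGAC). The helper and variable names are illustrative assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (a, c, g, t only)."""
    comp = {"a": "t", "t": "a", "g": "c", "c": "g"}
    return "".join(comp[b] for b in reversed(seq.lower()))


NDE_I = "catatg"  # Nde I recognition sequence
SAL_I = "gtcgac"  # Sal I recognition sequence

# Primer sequences as given for SEQ ID NO:54 and SEQ ID NO:55.
n_terminal = "cat atg ctt ccc ctc ttt gag ccc a".replace(" ", "")
c_terminal = "gtc gac tag ccc ttg gcg gaa agc".replace(" ", "")

# The Nde I site sits at the 5' end of the N-terminal primer and
# contains the atg start codon of the open reading frame.
print(n_terminal.startswith(NDE_I), n_terminal[3:6])

# The Sal I site sits at the 5' end of the C-terminal primer; on the
# coding strand it therefore follows the end of the gene.
print(c_terminal.startswith(SAL_I), reverse_complement(c_terminal))
```

The reverse complement of the C-terminal primer reads gctttccgccaagggctagtcgac, which matches the 3' end of the sequence listed as SEQ ID NO:2.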
[0171] The same primers used to amplify the polymerase gene from
Thermus brockianus strain YS38 were used to amplify the polymerase
gene from Thermus brockianus strain 2AZN. The 2AZN PCR reaction
contained 5 µl of 10× cPfu Turbo reaction buffer
(Stratagene), 200 ng genomic DNA template, 0.2 mM (each) dNTPs, 20
pmol of each primer, and 2.5 units of Pfu Turbo DNA polymerase
(Stratagene) in a 50 µl total reaction volume. The reaction was
started by adding a premix containing enzyme, dNTPs, buffer and
water to another premix containing primer and template preheated at
80° C. The entire reaction mixture was then denatured (2
min, 96° C.) followed by PCR cycling for 25 cycles
(96° C. for 5 sec, 64° C. for 30 sec, 72° C.
for 3 min) with a finishing step (72° C. for 5 min).
[0172] Both PCR reactions produced approximately 2.5 kb amplified DNA
fragments. The amplified DNA fragments were purified from the PCR
reaction mix using a Qiagen PCR cleanup kit (Qiagen Inc., Valencia,
Calif.). The Thermus brockianus strain YS38 fragment was ligated
into the inducible expression vector pCR®T7 CT-TOPO®
(Invitrogen, Carlsbad, Calif.). The Thermus brockianus strain 2AZN
fragment was then ligated into the vector pCR4®TOPO®TA
(Invitrogen, Carlsbad, Calif.). Three different clones were
sequenced in order to rule out PCR errors. The sequence of the
Thermus brockianus 2AZN polymerase gene is provided as SEQ ID NO:2.
The consensus sequence of Thermus brockianus strain YS38 is
provided as SEQ ID NO:1. The two sequences are compared in an
alignment provided in FIG. 1. The sequences of both polymerase
genes were reconfirmed by sequencing PCR fragments produced by
reamplifying both full-length genes from their respective
genomic DNAs.
[0173] The deduced amino acid sequences of Thermus brockianus YS38
and Thermus brockianus 2AZN were aligned with the polymerase
enzymes from Thermus aquaticus (Taq), Thermus thermophilus (Tth),
Thermus filiformis (Tfi) and Thermus flavus using the program
Vector NTI (Informax, Inc.). The alignment is shown in FIG. 2.
There are 44 amino acid positions where Thermus brockianus YS38
and/or Thermus brockianus 2AZN are different from the published
sequences of other known Thermus polymerases. For example, the
Thermus brockianus polymerases have a different start site from the
others, which accounts for the different amino acid numbering.
Modification of Thermus brockianus Polymerase Coding Regions
[0174] To produce Thermus brockianus polymerase from strains YS38
and 2AZN in a form better suited for dye-terminator DNA sequencing,
two amino acid substitutions were separately made in the nucleic
acid coding these polymerases. These are the FS (Tabor and
Richardson, 1995 PNAS 92: 6339-6343; U.S. Pat. No. 5,614,365) and
exo-minus mutations (see U.S. Pat. No. 5,466,591; Xu Y., Derbyshire
V., Ng K., Sun X-C., Grindley N. D., Joyce C. M. (1997) J. Mol.
Biol. 268, 284-302). To reduce the exonuclease activity to very low
levels, the mutation G43D was introduced. To reduce the
discrimination between ddNTP's and dNTP's, the mutation F665Y was
introduced.
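The codon positions behind these residue numbers can be checked against the sequence listing. The sketch below takes the ten-nucleotide groups at positions 121-130 of SEQ ID NO:1 and SEQ ID NO:3 and at positions 1991-2000 of SEQ ID NO:1 as they appear in the listing, and numbers residues from the initiating methionine as residue 1. The helper names are illustrative assumptions, and the codon table is the standard genetic code.

```python
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built in t, c, a, g order for each codon position.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def codon_at(segment: str, segment_start: int, residue: int) -> str:
    """Return the codon for a residue from a numbered sequence segment.

    segment_start is the 1-based nucleotide position of segment[0].
    """
    first_nt = 3 * (residue - 1) + 1
    offset = first_nt - segment_start
    if offset < 0 or offset + 3 > len(segment):
        raise ValueError("residue lies outside the supplied segment")
    return segment[offset:offset + 3]


wild_type_43 = codon_at("gtctacggct", 121, 43)   # SEQ ID NO:1, nt 121-130
mutant_43 = codon_at("gtctacgact", 121, 43)      # SEQ ID NO:3, nt 121-130
wild_type_665 = codon_at("acttcggcgt", 1991, 665)  # SEQ ID NO:1, nt 1991-2000

for codon in (wild_type_43, mutant_43, wild_type_665):
    print(codon, CODON_TABLE[codon])
```

On this reading, codon 43 is ggc (glycine) in SEQ ID NO:1 and gac (aspartate) in SEQ ID NO:3, consistent with G43D, and codon 665 of SEQ ID NO:1 is ttc (phenylalanine), the residue changed by F665Y.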
[0175] Mutagenesis of the Thermus brockianus polymerase genes was
carried out using the modified QuikChange™ (Stratagene) PCR
mutagenesis protocol described in Sawano & Miyawaki (2000). The
mutagenized gene was resequenced completely to confirm the
introduction of the mutations and to ensure that no PCR errors were
introduced.
EXAMPLE 2
Protein Expression and Purification
[0176] Nucleic acids encoding both Thermus brockianus open reading
frames were separately subcloned into the expression vector pET24a
(Novagen, Madison, Wis.) using the Nde I and Sal I restriction
sites. These plasmids were then used to transform BL21 E. coli
cells. The cells were grown in one liter of Terrific Broth
(Maniatis) to an optical density of 1.2 OD and the protein was
overproduced by four-hour induction with 1.0 mM IPTG. The cells
were harvested by centrifugation, washed in 50 mM Tris (pH 7.5), 5
mM EDTA, 5% glycerol, 10 mM EDTA to remove growth media, and the
cell pellet was frozen at -80° C.
[0177] To isolate Thermus brockianus polymerase enzymes, the cells
were thawed and resuspended in 2.5 volumes (wet weight) of 50 mM
Tris (pH 7.2), 400 mM NaCl, 1 mM EDTA. The cell walls were
disrupted by sonication. The resulting E. coli cell debris was
removed by centrifugation. The cleared lysate was pasteurized in a
water bath (75° C., 45 min), denaturing and precipitating
the majority of the non-thermostable E. coli proteins and leaving
the thermostable Thermus brockianus polymerase in solution. E. coli
genomic DNA was removed by coprecipitation with 0.3%
polyethyleneimine (PEI). The cleared lysate was then applied to two
columns in series: (1) a Biorex 70 cation exchange resin (BioRad,
Hercules, Calif.) which chelates excess PEI and (2) a
heparin-agarose resin (Sigma, St. Louis, Mo.) which retains the
polymerase. The heparin-agarose column was washed with 5 column
volumes of 20 mM Tris (pH 8.5), 5% glycerol, 100 mM NaCl, 0.1 mM
EDTA, 0.05% Triton X-100 and 0.05% Tween-20 (KTA buffer). The
protein was then eluted with a 0.1 to 1.0 M NaCl linear gradient.
The polymerase eluted at 0.8 M NaCl. The eluted Thermus brockianus
polymerase enzymes were concentrated and the buffer exchanged using
a Millipore concentration filter (30 kD M. wt. cutoff). The
concentrated protein was stored in KTA buffer (no salt) plus 50%
glycerol at -20° C. The activity of the polymerases was
measured using a nicked salmon sperm DNA radiometric activity
assay.
[0178] All publications and patents mentioned in the above
specification are herein incorporated by reference. Various
modifications and variations of the described method and system of
the invention will be apparent to those skilled in the art without
departing from the scope and spirit of the invention. Although the
invention has been described in connection with specific preferred
embodiments, it should be understood that the invention as claimed
should not be unduly limited to such specific embodiments. Indeed,
various modifications of the described modes for carrying out the
invention that are obvious to those skilled in the relevant arts
are intended to be within the scope of the following claims.
Sequence Listing
Number of SEQ ID NOs: 62
SEQ ID NO:1 (2493 nucleotides, DNA, Thermus brockianus)
atgcttcccc tctttgagcc caagggccgg
gtgctcctgg tggacggcca ccacctggcc 60taccgtaact tcttcgccct caaggggctc
accacgagcc ggggcgagcc cgtgcaaggg 120gtctacggct tcgccaaaag
cctcctcaag gccctgaagg aggacgggga cgtggtcatc 180gtggtctttg
acgccaaggc cccctctttt cgccacgagg cctacggggc ctacaaggcg
240ggccgggccc ctaccccgga ggactttccg aggcagcttg ccctcatgaa
ggagcttgtg 300gaccttttgg ggctggagcg cctcgaggtc ccgggctttg
aggcggacga tgtcctcgcc 360gccctggcca agaaggcgga gcgggaaggg
tacgaggtgc gcatcctcac cgccgaccgg 420gacctcttcc agcttctttc
ggaccgcatc gccgtcctgc acccggaagg ccacctcatc 480accccggggt
ggctttggga gaggtacggc ctgagaccgg agcagtgggt ggacttccgc
540gccctggccg gcgacccttc cgacaacatc cccggggtga aggggatcgg
cgagaagacg 600gccctgaagc tcctaaagga gtggggtagt ctggaaaata
tccaaaaaaa cctggaccag 660gtcagtcccc cttccgtgcg cgagaagatc
caggcccacc tggacgacct caggctctcc 720caggagcttt cccgggtgcg
cacggacctt cccttggagg tggactttag aaggcggcgg 780gagcccgata
gggaaggcct tagggccttc ttagagcggc ttgagttcgg gagcctcctc
840cacgagttcg gcctcctgga aagcccccag gcggcggagg aggccccttg
gccgccgccg 900gaaggggcct tcttgggctt ccgcctctcc cggcccgagc
ccatgtgggc ggaactcctt 960tccttggcgg caagcgccaa gggccgggtc
taccgggcgg aggcgcccca taaggccctt 1020tcggacctga aggagatccg
ggggcttctc gccaaggacc tcgccgtctt ggccctgagg 1080gaggggctcg
gccttccccc cacggacgat cccatgctcc tcgcctacct cctggacccc
1140tccaacacca cccccgaggg cgtggcccgg cgctacgggg gggagtggac
ggaggaggcg 1200ggggagaggg ccttgcttgc cgaaaggctt tacgagaacc
tcctaagccg cctgaaaggg 1260gaagaaaagc tcctttggct ctacgaggag
gtggaaaagc ccctttcccg ggtcctcgcc 1320cacatggagg ccacgggggt
gaggctggac gtaccctacc taagggccct ttccctggag 1380gtggcggcgg
agatgggccg cctggaggag gaggttttcc gcctggcggg ccaccccttc
1440aacctgaact cccgcgacca gctggaaagg gtgctctttg acgagctcgg
gcttcccccc 1500atcggcaaga cggaaaaaac cgggaagcgc tccaccagcg
ccgccgtcct cgaggccctg 1560cgggaggccc accccatcgt ggagaagatc
ctccagtacc gggagctcgc caagctcaag 1620ggcacctaca ttgacctcct
tcccgccctg gtccacccca ggacgggcag gctccacacc 1680cgcttcaacc
agacggccac ggccacgggc cgcctttcca gctccgaccc caacctgcag
1740aacattcccg tgcgcacccc cttgggccaa aggatccgcc gggccttcgt
ggccgaggag 1800gggtaccttc tcgtggccct ggactatagc cagattgagc
tgagggtcct ggcccacctc 1860tcgggggacg aaaacctcat ccgggtcttc
caggagggcc gggacatcca cacccagacg 1920gcgagctgga tgttcggcct
gccggcggag gccatagacc ccctcaggcg ccgggcggcc 1980aagaccatca
acttcggcgt cctctacggc atgtccgccc accggctttc ccaggagctg
2040ggcatcccct acgaggaggc ggtggccttc attgaccgct atttccagag
ctaccccaag 2100gtgaaggcct ggattgaaag gaccctggag gaggggcggc
aaagggggta cgtggagacc 2160ctcttcggcc gcaggcgcta cgtgcccgac
ctcaacgccc gggtaaagag cgtgcgggag 2220gcggcggagc gcatggcctt
taacatgccc gtgcagggca ccgccgctga cctgatgaag 2280ctcgccatgg
tgaggctctt ccctaggctt cccgaggtgg gggcgaggat gctcctccag
2340gtccacgacg agctcctcct ggaggcgccc aaggagcggg cggaggaggc
ggcggccctg 2400gccaaggagg tcatggaggg ggtctggccc ctggccgtgc
ccctggaggt ggaggtgggc 2460atcggggagg actggctttc cgccaagggc tag
2493
SEQ ID NO:2 (2498 nucleotides, DNA, Thermus brockianus)
atgcttcccc tctttgagcc caagggccgg
gtgctcctgg tggacggcca ccacctggcc 60taccgtaact tcttcgccct caaggggctc
accacgagcc ggggcgagcc cgtgcaaggg 120gtctacggct tcgccaaaag
cctcctcaag gccctgaagg aggacgggga cgtggtcatc 180gtggtctttg
acgccaaggc cccctctttt cgccacgagg cctacggggc ctacaaggcg
240ggccgggccc ctaccccgga ggactttccg aggcagcttg ccctcatgaa
ggagcttgtg 300gaccttttgg ggctggagcg cctcgaggtc ccgggctttg
aggcggacga tgtcctcgcc 360gccctggcca agaaggcgga gcgggaaggg
tacgaggtgc gcatcctcac cgccgaccgg 420gacctcttcc agcttctttc
ggaccgcatc gccgtcctgc acccggaagg ccacctcatc 480accccggggt
ggctttggga gaggtacggc ctgagaccgg agcagtgggt ggacttccgc
540gccctggccg gcgacccttc cgacaacatc cccggggtga aggggatcgg
cgagaagacg 600gccctgaagc tcctaaagga gtggggtagt ctggaaaata
tccaaaaaaa cctggaccag 660gtcagtcccc cttccgtgcg cgagaagatc
caggcccacc tggacgacct caggctctcc 720caggagcttt cccgggtgcg
cacggacctt cccttggagg tggactttag aaggcggcgg 780gagcccgata
gggaaggcct tagggccttc ttagagcggc ttgagttcgg gagcctcctc
840cacgagttcg gcctcctgga aagcccccag gcggcggagg aggccccttg
gccgccgccg 900gaaggggcct tcttgggctt ccgcctctcc cggcccgagc
ccatgtgggc ggaactcctt 960tccttggcgg caagcgccaa gggccgggtc
taccgggcgg aggcgcccca taaggccctt 1020tcggacctga aggagatccg
ggggcttctc gccaaggacc tcgccgtctt ggccctgagg 1080gaggggctcg
gccttccccc cacggacgat cccatgctcc tcgcctacct cctggacccc
1140tccaacacca cccccgaggg cgtggcccgg cgctacgggg gggagtggac
ggaggaggcg 1200ggggagaggg ccttgcttgc cgaaaggctt tacgagaacc
tcctaagccg cctgaaaggg 1260gaagaaaagc tcctttggct ctacgaggag
gtggaaaagc ccctttcccg ggtcctcgcc 1320cacatggagg ccacgggggt
gaggctggac gtaccctacc taagggccct ttccctggag 1380gtggcggcgg
agatgggccg cctggaggag gaggttttcc gcctggcggg ccaccccttc
1440aacctgaact cccgcgacca gctggaaagg gtgctctttg acgagctcgg
gcttcccccc 1500atcggcaaga cggaaaaaac cgggaagcgc tccaccagcg
ccgccgtcct cgaggccctg 1560cgggaggccc accccatcgt ggagaagatc
ctccagtacc gggagctcgc caagctcaag 1620ggcacctaca ttgaccccct
tcccgccctg gtccacccca ggacgggcag gctccacacc 1680cgcttcaacc
agacggccac ggccacgggc cgcctttcca gctccgaccc caacctgcag
1740aacattcccg tgcgcacccc cttgggccaa aggatccgcc gggccttcgt
ggccgaggag 1800gggtaccttc tcgtggccct ggactatagc cagattgagc
tgagggtcct ggcccacctc 1860tcgggggacg aaaacctcat ccgggtcttc
caggagggcc gggacatcca cacccagacg 1920gcgagctgga tgttcggcct
gccggcggag gccatagacc ccctcaggcg ccgggcggcc 1980aagaccatca
acttcggcgt cctctacggc atgtccgccc accggctttc ccaggagctg
2040ggcatcccct acgaggaggc ggtggccttc attgaccgct atttccagag
ctaccccaag 2100gtgaaggcct ggattgaaag gaccctggag gaggggcggc
aaagggggta cgtggagacc 2160ctcttcggcc gcaggcgcta cgtgcccgac
ctcaacgccc gggtaaagag cgtgcgggag 2220gcggcggagc gcatggcctt
taacatgccc gtgcagggca ccgccgctga cctgatgaag 2280ctcgccatgg
tgaggctctt ccctaggctt cccgaggtgg gggcgaggat gctcctccag
2340gtccacgacg agctcctcct ggaggcgccc aaggagcggg cggaggaggc
ggcggccctg 2400gccaaggagg tcatggaggg agtctggccc ctggccgtgc
ccctggaggt ggaggtgggc 2460atcggggagg actggctttc cgccaagggc tagtcgac
2498
SEQ ID NO:3 (2493 nucleotides, DNA, Thermus brockianus)
atgcttcccc tctttgagcc caagggccgg
gtgctcctgg tggacggcca ccacctggcc 60taccgtaact tcttcgccct caaggggctc
accacgagcc ggggcgagcc cgtgcaaggg 120gtctacgact tcgccaaaag
cctcctcaag gccctgaagg aggacgggga cgtggtcatc 180gtggtctttg
acgccaaggc cccctctttt cgccacgagg cctacggggc ctacaaggcg
240ggccgggccc ctaccccgga ggactttccg aggcagcttg ccctcatgaa
ggagcttgtg 300gaccttttgg ggctggagcg cctcgaggtc ccgggctttg
aggcggacga tgtcctcgcc 360gccctggcca agaaggcgga gcgggaaggg
tacgaggtgc gcatcctcac cgccgaccgg 420gacctcttcc agcttctttc
ggaccgcatc gccgtcctgc acccggaagg ccacctcatc 480accccggggt
ggctttggga gaggtacggc ctgagaccgg agcagtgggt ggacttccgc
540gccctggccg gcgacccttc cgacaacatc cccggggtga aggggatcgg
cgagaagacg 600gccctgaagc tcctaaagga gtggggtagt ctggaaaata
tccaaaaaaa cctggaccag 660gtcagtcccc cttccgtgcg cgagaagatc
caggcccacc tggacgacct caggctctcc 720caggagcttt cccgggtgcg
cacggacctt cccttggagg tggactttag aaggcggcgg 780gagcccgata
gggaaggcct tagggccttc ttagagcggc ttgagttcgg gagcctcctc
840cacgagttcg gcctcctgga aagcccccag gcggcggagg aggccccttg
gccgccgccg 900gaaggggcct tcttgggctt ccgcctctcc cggcccgagc
ccatgtgggc ggaactcctt 960tccttggcgg caagcgccaa gggccgggtc
taccgggcgg aggcgcccca taaggccctt 1020tcggacctga aggagatccg
ggggcttctc gccaaggacc tcgccgtctt ggccctgagg 1080gaggggctcg
gccttccccc cacggacgat cccatgctcc tcgcctacct cctggacccc
1140tccaacacca cccccgaggg cgtggcccgg cgctacgggg gggagtggac
ggaggaggcg 1200ggggagaggg ccttgcttgc cgaaaggctt tacgagaacc
tcctaagccg cctgaaaggg 1260gaagaaaagc tcctttggct ctacgaggag
gtggaaaagc ccctttcccg ggtcctcgcc 1320cacatggagg ccacgggggt
gaggctggac gtaccctacc taagggccct ttccctggag 1380gtggcggcgg
agatgggccg cctggaggag gaggttttcc gcctggcggg ccaccccttc
1440aacctgaact cccgcgacca gctggaaagg gtgctctttg acgagctcgg
gcttcccccc 1500atcggcaaga cggaaaaaac cgggaagcgc tccaccagcg
ccgccgtcct cgaggccctg 1560cgggaggccc accccatcgt ggagaagatc
ctccagtacc gggagctcgc caagctcaag 1620ggcacctaca ttgacctcct
tcccgccctg gtccacccca ggacgggcag gctccacacc 1680cgcttcaacc
agacggccac ggccacgggc cgcctttcca gctccgaccc caacctgcag
1740aacattcccg tgcgcacccc cttgggccaa aggatccgcc gggccttcgt
ggccgaggag 1800gggtaccttc tcgtggccct ggactatagc cagattgagc
tgagggtcct ggcccacctc 1860tcgggggacg aaaacctcat ccgggtcttc
caggagggcc gggacatcca cacccagacg 1920gcgagctgga tgttcggcct
gccggcggag gccatagacc ccctcaggcg ccgggcggcc 1980aagaccatca
acttcggcgt cctctacggc atgtccgccc accggctttc ccaggagctg
2040ggcatcccct acgaggaggc ggtggccttc attgaccgct atttccagag
ctaccccaag 2100gtgaaggcct ggattgaaag gaccctggag gaggggcggc
aaagggggta cgtggagacc 2160ctcttcggcc gcaggcgcta cgtgcccgac
ctcaacgccc gggtaaagag cgtgcgggag 2220gcggcggagc gcatggcctt
taacatgccc gtgcagggca ccgccgctga cctgatgaag 2280ctcgccatgg
tgaggctctt ccctaggctt cccgaggtgg gggcgaggat gctcctccag
2340gtccacgacg agctcctcct ggaggcgccc aaggagcggg cggaggaggc
ggcggccctg 2400gccaaggagg tcatggaggg ggtctggccc ctggccgtgc
ccctggaggt ggaggtgggc 2460atcggggagg actggctttc cgccaagggc tag
2493
SEQ ID NO: 4 (2498 bases, DNA, Thermus brockianus)
atgcttcccc tctttgagcc caagggccgg
gtgctcctgg tggacggcca ccacctggcc 60taccgtaact tcttcgccct caaggggctc
accacgagcc ggggcgagcc cgtgcaaggg 120gtctacgact tcgccaaaag
cctcctcaag gccctgaagg aggacgggga cgtggtcatc 180gtggtctttg
acgccaaggc cccctctttt cgccacgagg cctacggggc ctacaaggcg
240ggccgggccc ctaccccgga ggactttccg aggcagcttg ccctcatgaa
ggagcttgtg 300gaccttttgg ggctggagcg cctcgaggtc ccgggctttg
aggcggacga tgtcctcgcc 360gccctggcca agaaggcgga gcgggaaggg
tacgaggtgc gcatcctcac cgccgaccgg 420gacctcttcc agcttctttc
ggaccgcatc gccgtcctgc acccggaagg ccacctcatc 480accccggggt
ggctttggga gaggtacggc ctgagaccgg agcagtgggt ggacttccgc
540gccctggccg gcgacccttc cgacaacatc cccggggtga aggggatcgg
cgagaagacg 600gccctgaagc tcctaaagga gtggggtagt ctggaaaata
tccaaaaaaa cctggaccag 660gtcagtcccc cttccgtgcg cgagaagatc
caggcccacc tggacgacct caggctctcc 720caggagcttt cccgggtgcg
cacggacctt cccttggagg tggactttag aaggcggcgg 780gagcccgata
gggaaggcct tagggccttc ttagagcggc ttgagttcgg gagcctcctc
840cacgagttcg gcctcctgga aagcccccag gcggcggagg aggccccttg
gccgccgccg 900gaaggggcct tcttgggctt ccgcctctcc cggcccgagc
ccatgtgggc ggaactcctt 960tccttggcgg caagcgccaa gggccgggtc
taccgggcgg aggcgcccca taaggccctt 1020tcggacctga aggagatccg
ggggcttctc gccaaggacc tcgccgtctt ggccctgagg 1080gaggggctcg
gccttccccc cacggacgat cccatgctcc tcgcctacct cctggacccc
1140tccaacacca cccccgaggg cgtggcccgg cgctacgggg gggagtggac
ggaggaggcg 1200ggggagaggg ccttgcttgc cgaaaggctt tacgagaacc
tcctaagccg cctgaaaggg 1260gaagaaaagc tcctttggct ctacgaggag
gtggaaaagc ccctttcccg ggtcctcgcc 1320cacatggagg ccacgggggt
gaggctggac gtaccctacc taagggccct ttccctggag 1380gtggcggcgg
agatgggccg cctggaggag gaggttttcc gcctggcggg ccaccccttc
1440aacctgaact cccgcgacca gctggaaagg gtgctctttg acgagctcgg
gcttcccccc 1500atcggcaaga cggaaaaaac cgggaagcgc tccaccagcg
ccgccgtcct cgaggccctg 1560cgggaggccc accccatcgt ggagaagatc
ctccagtacc gggagctcgc caagctcaag 1620ggcacctaca ttgaccccct
tcccgccctg gtccacccca ggacgggcag gctccacacc 1680cgcttcaacc
agacggccac ggccacgggc cgcctttcca gctccgaccc caacctgcag
1740aacattcccg tgcgcacccc cttgggccaa aggatccgcc gggccttcgt
ggccgaggag 1800gggtaccttc tcgtggccct ggactatagc cagattgagc
tgagggtcct ggcccacctc 1860tcgggggacg aaaacctcat ccgggtcttc
caggagggcc gggacatcca cacccagacg 1920gcgagctgga tgttcggcct
gccggcggag gccatagacc ccctcaggcg ccgggcggcc 1980aagaccatca
acttcggcgt cctctacggc atgtccgccc accggctttc ccaggagctg
2040ggcatcccct acgaggaggc ggtggccttc attgaccgct atttccagag
ctaccccaag 2100gtgaaggcct ggattgaaag gaccctggag gaggggcggc
aaagggggta cgtggagacc 2160ctcttcggcc gcaggcgcta cgtgcccgac
ctcaacgccc gggtaaagag cgtgcgggag 2220gcggcggagc gcatggcctt
taacatgccc gtgcagggca ccgccgctga cctgatgaag 2280ctcgccatgg
tgaggctctt ccctaggctt cccgaggtgg gggcgaggat gctcctccag
2340gtccacgacg agctcctcct ggaggcgccc aaggagcggg cggaggaggc
ggcggccctg 2400gccaaggagg tcatggaggg agtctggccc ctggccgtgc
ccctggaggt ggaggtgggc 2460atcggggagg actggctttc cgccaagggc tagtcgac
2498
SEQ ID NO: 5 (2493 bases, DNA, Thermus brockianus)
atgcttcccc tctttgagcc caagggccgg
gtgctcctgg tggacggcca ccacctggcc 60taccgtaact tcttcgccct caaggggctc
accacgagcc ggggcgagcc cgtgcaaggg 120gtctacggct tcgccaaaag
cctcctcaag gccctgaagg aggacgggga cgtggtcatc 180gtggtctttg
acgccaaggc cccctctttt cgccacgagg cctacggggc ctacaaggcg
240ggccgggccc ctaccccgga ggactttccg aggcagcttg ccctcatgaa
ggagcttgtg 300gaccttttgg ggctggagcg cctcgaggtc ccgggctttg
aggcggacga tgtcctcgcc 360gccctggcca agaaggcgga gcgggaaggg
tacgaggtgc gcatcctcac cgccgaccgg 420gacctcttcc agcttctttc
ggaccgcatc gccgtcctgc acccggaagg ccacctcatc 480accccggggt
ggctttggga gaggtacggc ctgagaccgg agcagtgggt ggacttccgc
540gccctggccg gcgacccttc cgacaacatc cccggggtga aggggatcgg
cgagaagacg 600gccctgaagc tcctaaagga gtggggtagt ctggaaaata
tccaaaaaaa cctggaccag 660gtcagtcccc cttccgtgcg cgagaagatc
caggcccacc tggacgacct caggctctcc 720caggagcttt cccgggtgcg
cacggacctt cccttggagg tggactttag aaggcggcgg 780gagcccgata
gggaaggcct tagggccttc ttagagcggc ttgagttcgg gagcctcctc
840cacgagttcg gcctcctgga aagcccccag gcggcggagg aggccccttg
gccgccgccg 900gaaggggcct tcttgggctt ccgcctctcc cggcccgagc
ccatgtgggc ggaactcctt 960tccttggcgg caagcgccaa gggccgggtc
taccgggcgg aggcgcccca taaggccctt 1020tcggacctga aggagatccg
ggggcttctc gccaaggacc tcgccgtctt ggccctgagg 1080gaggggctcg
gccttccccc cacggacgat cccatgctcc tcgcctacct cctggacccc
1140tccaacacca cccccgaggg cgtggcccgg cgctacgggg gggagtggac
ggaggaggcg 1200ggggagaggg ccttgcttgc cgaaaggctt tacgagaacc
tcctaagccg cctgaaaggg 1260gaagaaaagc tcctttggct ctacgaggag
gtggaaaagc ccctttcccg ggtcctcgcc 1320cacatggagg ccacgggggt
gaggctggac gtaccctacc taagggccct ttccctggag 1380gtggcggcgg
agatgggccg cctggaggag gaggttttcc gcctggcggg ccaccccttc
1440aacctgaact cccgcgacca gctggaaagg gtgctctttg acgagctcgg
gcttcccccc 1500atcggcaaga cggaaaaaac cgggaagcgc tccaccagcg
ccgccgtcct cgaggccctg 1560cgggaggccc accccatcgt ggagaagatc
ctccagtacc gggagctcgc caagctcaag 1620ggcacctaca ttgacctcct
tcccgccctg gtccacccca ggacgggcag gctccacacc 1680cgcttcaacc
agacggccac ggccacgggc cgcctttcca gctccgaccc caacctgcag
1740aacattcccg tgcgcacccc cttgggccaa aggatccgcc gggccttcgt
ggccgaggag 1800gggtaccttc tcgtggccct ggactatagc cagattgagc
tgagggtcct ggcccacctc 1860tcgggggacg aaaacctcat ccgggtcttc
caggagggcc gggacatcca cacccagacg 1920gcgagctgga tgttcggcct
gccggcggag gccatagacc ccctcaggcg ccgggcggcc 1980aagaccatca
actacggcgt cctctacggc atgtccgccc accggctttc ccaggagctg
2040ggcatcccct acgaggaggc ggtggccttc attgaccgct atttccagag
ctaccccaag 2100gtgaaggcct ggattgaaag gaccctggag gaggggcggc
aaagggggta cgtggagacc 2160ctcttcggcc gcaggcgcta cgtgcccgac
ctcaacgccc gggtaaagag cgtgcgggag 2220gcggcggagc gcatggcctt
taacatgccc gtgcagggca ccgccgctga cctgatgaag 2280ctcgccatgg
tgaggctctt ccctaggctt cccgaggtgg gggcgaggat gctcctccag
2340gtccacgacg agctcctcct ggaggcgccc aaggagcggg cggaggaggc
ggcggccctg 2400gccaaggagg tcatggaggg ggtctggccc ctggccgtgc
ccctggaggt ggaggtgggc 2460atcggggagg actggctttc cgccaagggc tag
2493
SEQ ID NO: 6 (2498 bases, DNA, Thermus brockianus)
atgcttcccc tctttgagcc caagggccgg
gtgctcctgg tggacggcca ccacctggcc 60taccgtaact tcttcgccct caaggggctc
accacgagcc ggggcgagcc cgtgcaaggg 120gtctacggct tcgccaaaag
cctcctcaag gccctgaagg aggacgggga cgtggtcatc 180gtggtctttg
acgccaaggc cccctctttt cgccacgagg cctacggggc ctacaaggcg
240ggccgggccc ctaccccgga ggactttccg aggcagcttg ccctcatgaa
ggagcttgtg 300gaccttttgg ggctggagcg cctcgaggtc ccgggctttg
aggcggacga tgtcctcgcc 360gccctggcca agaaggcgga gcgggaaggg
tacgaggtgc gcatcctcac cgccgaccgg 420gacctcttcc agcttctttc
ggaccgcatc gccgtcctgc acccggaagg ccacctcatc 480accccggggt
ggctttggga gaggtacggc ctgagaccgg agcagtgggt ggacttccgc
540gccctggccg gcgacccttc cgacaacatc cccggggtga aggggatcgg
cgagaagacg 600gccctgaagc tcctaaagga gtggggtagt ctggaaaata
tccaaaaaaa cctggaccag 660gtcagtcccc cttccgtgcg cgagaagatc
caggcccacc tggacgacct caggctctcc 720caggagcttt cccgggtgcg
cacggacctt cccttggagg tggactttag aaggcggcgg 780gagcccgata
gggaaggcct tagggccttc ttagagcggc ttgagttcgg gagcctcctc
840cacgagttcg gcctcctgga aagcccccag gcggcggagg aggccccttg
gccgccgccg 900gaaggggcct tcttgggctt ccgcctctcc cggcccgagc
ccatgtgggc ggaactcctt 960tccttggcgg caagcgccaa gggccgggtc
taccgggcgg aggcgcccca taaggccctt 1020tcggacctga aggagatccg
ggggcttctc gccaaggacc tcgccgtctt ggccctgagg 1080gaggggctcg
gccttccccc cacggacgat cccatgctcc tcgcctacct cctggacccc
1140tccaacacca cccccgaggg cgtggcccgg cgctacgggg gggagtggac
ggaggaggcg 1200ggggagaggg ccttgcttgc cgaaaggctt tacgagaacc
tcctaagccg cctgaaaggg 1260gaagaaaagc tcctttggct ctacgaggag
gtggaaaagc ccctttcccg ggtcctcgcc 1320cacatggagg ccacgggggt
gaggctggac gtaccctacc taagggccct ttccctggag 1380gtggcggcgg
agatgggccg cctggaggag gaggttttcc gcctggcggg ccaccccttc
1440aacctgaact cccgcgacca gctggaaagg gtgctctttg acgagctcgg
gcttcccccc 1500atcggcaaga cggaaaaaac cgggaagcgc tccaccagcg
ccgccgtcct cgaggccctg 1560cgggaggccc accccatcgt ggagaagatc
ctccagtacc gggagctcgc caagctcaag 1620ggcacctaca ttgaccccct
tcccgccctg gtccacccca ggacgggcag gctccacacc 1680cgcttcaacc
agacggccac ggccacgggc cgcctttcca gctccgaccc caacctgcag
1740aacattcccg tgcgcacccc cttgggccaa aggatccgcc gggccttcgt
ggccgaggag 1800gggtaccttc tcgtggccct ggactatagc cagattgagc
tgagggtcct ggcccacctc 1860tcgggggacg aaaacctcat ccgggtcttc
caggagggcc gggacatcca cacccagacg 1920gcgagctgga tgttcggcct
gccggcggag gccatagacc ccctcaggcg ccgggcggcc 1980aagaccatca
actacggcgt cctctacggc atgtccgccc accggctttc ccaggagctg
2040ggcatcccct acgaggaggc ggtggccttc attgaccgct atttccagag
ctaccccaag 2100gtgaaggcct ggattgaaag gaccctggag gaggggcggc
aaagggggta cgtggagacc 2160ctcttcggcc gcaggcgcta cgtgcccgac
ctcaacgccc gggtaaagag cgtgcgggag 2220gcggcggagc gcatggcctt
taacatgccc gtgcagggca ccgccgctga cctgatgaag
2280ctcgccatgg tgaggctctt ccctaggctt cccgaggtgg gggcgaggat
gctcctccag 2340gtccacgacg agctcctcct ggaggcgccc aaggagcggg
cggaggaggc ggcggccctg 2400gccaaggagg tcatggaggg agtctggccc
ctggccgtgc ccctggaggt ggaggtgggc 2460atcggggagg actggctttc
cgccaagggc tagtcgac 2498
SEQ ID NO: 7 (2493 bases, DNA, Thermus brockianus)
atgcttcccc
tctttgagcc caagggccgg gtgctcctgg tggacggcca ccacctggcc 60taccgtaact
tcttcgccct caaggggctc accacgagcc ggggcgagcc cgtgcaaggg
120gtctacgact tcgccaaaag cctcctcaag gccctgaagg aggacgggga
cgtggtcatc 180gtggtctttg acgccaaggc cccctctttt cgccacgagg
cctacggggc ctacaaggcg 240ggccgggccc ctaccccgga ggactttccg
aggcagcttg ccctcatgaa ggagcttgtg 300gaccttttgg ggctggagcg
cctcgaggtc ccgggctttg aggcggacga tgtcctcgcc 360gccctggcca
agaaggcgga gcgggaaggg tacgaggtgc gcatcctcac cgccgaccgg
420gacctcttcc agcttctttc ggaccgcatc gccgtcctgc acccggaagg
ccacctcatc 480accccggggt ggctttggga gaggtacggc ctgagaccgg
agcagtgggt ggacttccgc 540gccctggccg gcgacccttc cgacaacatc
cccggggtga aggggatcgg cgagaagacg 600gccctgaagc tcctaaagga
gtggggtagt ctggaaaata tccaaaaaaa cctggaccag 660gtcagtcccc
cttccgtgcg cgagaagatc caggcccacc tggacgacct caggctctcc
720caggagcttt cccgggtgcg cacggacctt cccttggagg tggactttag
aaggcggcgg 780gagcccgata gggaaggcct tagggccttc ttagagcggc
ttgagttcgg gagcctcctc 840cacgagttcg gcctcctgga aagcccccag
gcggcggagg aggccccttg gccgccgccg 900gaaggggcct tcttgggctt
ccgcctctcc cggcccgagc ccatgtgggc ggaactcctt 960tccttggcgg
caagcgccaa gggccgggtc taccgggcgg aggcgcccca taaggccctt
1020tcggacctga aggagatccg ggggcttctc gccaaggacc tcgccgtctt
ggccctgagg 1080gaggggctcg gccttccccc cacggacgat cccatgctcc
tcgcctacct cctggacccc 1140tccaacacca cccccgaggg cgtggcccgg
cgctacgggg gggagtggac ggaggaggcg 1200ggggagaggg ccttgcttgc
cgaaaggctt tacgagaacc tcctaagccg cctgaaaggg 1260gaagaaaagc
tcctttggct ctacgaggag gtggaaaagc ccctttcccg ggtcctcgcc
1320cacatggagg ccacgggggt gaggctggac gtaccctacc taagggccct
ttccctggag 1380gtggcggcgg agatgggccg cctggaggag gaggttttcc
gcctggcggg ccaccccttc 1440aacctgaact cccgcgacca gctggaaagg
gtgctctttg acgagctcgg gcttcccccc 1500atcggcaaga cggaaaaaac
cgggaagcgc tccaccagcg ccgccgtcct cgaggccctg 1560cgggaggccc
accccatcgt ggagaagatc ctccagtacc gggagctcgc caagctcaag
1620ggcacctaca ttgacctcct tcccgccctg gtccacccca ggacgggcag
gctccacacc 1680cgcttcaacc agacggccac ggccacgggc cgcctttcca
gctccgaccc caacctgcag 1740aacattcccg tgcgcacccc cttgggccaa
aggatccgcc gggccttcgt ggccgaggag 1800gggtaccttc tcgtggccct
ggactatagc cagattgagc tgagggtcct ggcccacctc 1860tcgggggacg
aaaacctcat ccgggtcttc caggagggcc gggacatcca cacccagacg
1920gcgagctgga tgttcggcct gccggcggag gccatagacc ccctcaggcg
ccgggcggcc 1980aagaccatca actacggcgt cctctacggc atgtccgccc
accggctttc ccaggagctg 2040ggcatcccct acgaggaggc ggtggccttc
attgaccgct atttccagag ctaccccaag 2100gtgaaggcct ggattgaaag
gaccctggag gaggggcggc aaagggggta cgtggagacc 2160ctcttcggcc
gcaggcgcta cgtgcccgac ctcaacgccc gggtaaagag cgtgcgggag
2220gcggcggagc gcatggcctt taacatgccc gtgcagggca ccgccgctga
cctgatgaag 2280ctcgccatgg tgaggctctt ccctaggctt cccgaggtgg
gggcgaggat gctcctccag 2340gtccacgacg agctcctcct ggaggcgccc
aaggagcggg cggaggaggc ggcggccctg 2400gccaaggagg tcatggaggg
ggtctggccc ctggccgtgc ccctggaggt ggaggtgggc 2460atcggggagg
actggctttc cgccaagggc tag 2493
SEQ ID NO: 8 (2498 bases, DNA, Thermus brockianus)
atgcttcccc tctttgagcc caagggccgg gtgctcctgg tggacggcca ccacctggcc
60taccgtaact tcttcgccct caaggggctc accacgagcc ggggcgagcc cgtgcaaggg
120gtctacgact tcgccaaaag cctcctcaag gccctgaagg aggacgggga
cgtggtcatc 180gtggtctttg acgccaaggc cccctctttt cgccacgagg
cctacggggc ctacaaggcg 240ggccgggccc ctaccccgga ggactttccg
aggcagcttg ccctcatgaa ggagcttgtg 300gaccttttgg ggctggagcg
cctcgaggtc ccgggctttg aggcggacga tgtcctcgcc 360gccctggcca
agaaggcgga gcgggaaggg tacgaggtgc gcatcctcac cgccgaccgg
420gacctcttcc agcttctttc ggaccgcatc gccgtcctgc acccggaagg
ccacctcatc 480accccggggt ggctttggga gaggtacggc ctgagaccgg
agcagtgggt ggacttccgc 540gccctggccg gcgacccttc cgacaacatc
cccggggtga aggggatcgg cgagaagacg 600gccctgaagc tcctaaagga
gtggggtagt ctggaaaata tccaaaaaaa cctggaccag 660gtcagtcccc
cttccgtgcg cgagaagatc caggcccacc tggacgacct caggctctcc
720caggagcttt cccgggtgcg cacggacctt cccttggagg tggactttag
aaggcggcgg 780gagcccgata gggaaggcct tagggccttc ttagagcggc
ttgagttcgg gagcctcctc 840cacgagttcg gcctcctgga aagcccccag
gcggcggagg aggccccttg gccgccgccg 900gaaggggcct tcttgggctt
ccgcctctcc cggcccgagc ccatgtgggc ggaactcctt 960tccttggcgg
caagcgccaa gggccgggtc taccgggcgg aggcgcccca taaggccctt
1020tcggacctga aggagatccg ggggcttctc gccaaggacc tcgccgtctt
ggccctgagg 1080gaggggctcg gccttccccc cacggacgat cccatgctcc
tcgcctacct cctggacccc 1140tccaacacca cccccgaggg cgtggcccgg
cgctacgggg gggagtggac ggaggaggcg 1200ggggagaggg ccttgcttgc
cgaaaggctt tacgagaacc tcctaagccg cctgaaaggg 1260gaagaaaagc
tcctttggct ctacgaggag gtggaaaagc ccctttcccg ggtcctcgcc
1320cacatggagg ccacgggggt gaggctggac gtaccctacc taagggccct
ttccctggag 1380gtggcggcgg agatgggccg cctggaggag gaggttttcc
gcctggcggg ccaccccttc 1440aacctgaact cccgcgacca gctggaaagg
gtgctctttg acgagctcgg gcttcccccc 1500atcggcaaga cggaaaaaac
cgggaagcgc tccaccagcg ccgccgtcct cgaggccctg 1560cgggaggccc
accccatcgt ggagaagatc ctccagtacc gggagctcgc caagctcaag
1620ggcacctaca ttgaccccct tcccgccctg gtccacccca ggacgggcag
gctccacacc 1680cgcttcaacc agacggccac ggccacgggc cgcctttcca
gctccgaccc caacctgcag 1740aacattcccg tgcgcacccc cttgggccaa
aggatccgcc gggccttcgt ggccgaggag 1800gggtaccttc tcgtggccct
ggactatagc cagattgagc tgagggtcct ggcccacctc 1860tcgggggacg
aaaacctcat ccgggtcttc caggagggcc gggacatcca cacccagacg
1920gcgagctgga tgttcggcct gccggcggag gccatagacc ccctcaggcg
ccgggcggcc 1980aagaccatca actacggcgt cctctacggc atgtccgccc
accggctttc ccaggagctg 2040ggcatcccct acgaggaggc ggtggccttc
attgaccgct atttccagag ctaccccaag 2100gtgaaggcct ggattgaaag
gaccctggag gaggggcggc aaagggggta cgtggagacc 2160ctcttcggcc
gcaggcgcta cgtgcccgac ctcaacgccc gggtaaagag cgtgcgggag
2220gcggcggagc gcatggcctt taacatgccc gtgcagggca ccgccgctga
cctgatgaag 2280ctcgccatgg tgaggctctt ccctaggctt cccgaggtgg
gggcgaggat gctcctccag 2340gtccacgacg agctcctcct ggaggcgccc
aaggagcggg cggaggaggc ggcggccctg 2400gccaaggagg tcatggaggg
agtctggccc ctggccgtgc ccctggaggt ggaggtgggc 2460atcggggagg
actggctttc cgccaagggc tagtcgac 2498
SEQ ID NO: 9 (830 amino acids, PRT, Thermus brockianus)
Met
Leu Pro Leu Phe Glu Pro Lys Gly Arg Val Leu Leu Val Asp Gly1 5 10
15His His Leu Ala Tyr Arg Asn Phe Phe Ala Leu Lys Gly Leu Thr Thr
20 25 30Ser Arg Gly Glu Pro Val Gln Gly Val Tyr Gly Phe Ala Lys Ser
Leu 35 40 45Leu Lys Ala Leu Lys Glu Asp Gly Asp Val Val Ile Val Val
Phe Asp 50 55 60Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Ala
Tyr Lys Ala65 70 75 80Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg
Gln Leu Ala Leu Met 85 90 95Lys Glu Leu Val Asp Leu Leu Gly Leu Glu
Arg Leu Glu Val Pro Gly 100 105 110Phe Glu Ala Asp Asp Val Leu Ala
Ala Leu Ala Lys Lys Ala Glu Arg 115 120 125Glu Gly Tyr Glu Val Arg
Ile Leu Thr Ala Asp Arg Asp Leu Phe Gln 130 135 140Leu Leu Ser Asp
Arg Ile Ala Val Leu His Pro Glu Gly His Leu Ile145 150 155 160Thr
Pro Gly Trp Leu Trp Glu Arg Tyr Gly Leu Arg Pro Glu Gln Trp 165 170
175Val Asp Phe Arg Ala Leu Ala Gly Asp Pro Ser Asp Asn Ile Pro Gly
180 185 190Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu Leu Lys
Glu Trp 195 200 205Gly Ser Leu Glu Asn Ile Gln Lys Asn Leu Asp Gln
Val Ser Pro Pro 210 215 220Ser Val Arg Glu Lys Ile Gln Ala His Leu
Asp Asp Leu Arg Leu Ser225 230 235 240Gln Glu Leu Ser Arg Val Arg
Thr Asp Leu Pro Leu Glu Val Asp Phe 245 250 255Arg Arg Arg Arg Glu
Pro Asp Arg Glu Gly Leu Arg Ala Phe Leu Glu 260 265 270Arg Leu Glu
Phe Gly Ser Leu Leu His Glu Phe Gly Leu Leu Glu Ser 275 280 285Pro
Gln Ala Ala Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly Ala Phe 290 295
300Leu Gly Phe Arg Leu Ser Arg Pro Glu Pro Met Trp Ala Glu Leu
Leu305 310 315 320Ser Leu Ala Ala Ser Ala Lys Gly Arg Val Tyr Arg
Ala Glu Ala Pro 325 330 335His Lys Ala Leu Ser Asp Leu Lys Glu Ile
Arg Gly Leu Leu Ala Lys 340 345 350Asp Leu Ala Val Leu Ala Leu Arg
Glu Gly Leu Gly Leu Pro Pro Thr 355 360 365Asp Asp Pro Met Leu Leu
Ala Tyr Leu Leu Asp Pro Ser Asn Thr Thr 370 375 380Pro Glu Gly Val
Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu Glu Ala385 390 395 400Gly
Glu Arg Ala Leu Leu Ala Glu Arg Leu Tyr Glu Asn Leu Leu Ser 405 410
415Arg Leu Lys Gly Glu Glu Lys Leu Leu Trp Leu Tyr Glu Glu Val Glu
420 425 430Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala Thr Gly
Val Arg 435 440 445Leu Asp Val Pro Tyr Leu Arg Ala Leu Ser Leu Glu
Val Ala Ala Glu 450 455 460Met Gly Arg Leu Glu Glu Glu Val Phe Arg
Leu Ala Gly His Pro Phe465 470 475 480Asn Leu Asn Ser Arg Asp Gln
Leu Glu Arg Val Leu Phe Asp Glu Leu 485 490 495Gly Leu Pro Pro Ile
Gly Lys Thr Glu Lys Thr Gly Lys Arg Ser Thr 500 505 510Ser Ala Ala
Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile Val Glu 515 520 525Lys
Ile Leu Gln Tyr Arg Glu Leu Ala Lys Leu Lys Gly Thr Tyr Ile 530 535
540Asp Leu Leu Pro Ala Leu Val His Pro Arg Thr Gly Arg Leu His
Thr545 550 555 560Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu
Ser Ser Ser Asp 565 570 575Pro Asn Leu Gln Asn Ile Pro Val Arg Thr
Pro Leu Gly Gln Arg Ile 580 585 590Arg Arg Ala Phe Val Ala Glu Glu
Gly Tyr Leu Leu Val Ala Leu Asp 595 600 605Tyr Ser Gln Ile Glu Leu
Arg Val Leu Ala His Leu Ser Gly Asp Glu 610 615 620Asn Leu Ile Arg
Val Phe Gln Glu Gly Arg Asp Ile His Thr Gln Thr625 630 635 640Ala
Ser Trp Met Phe Gly Leu Pro Ala Glu Ala Ile Asp Pro Leu Arg 645 650
655Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly Met Ser
660 665 670Ala His Arg Leu Ser Gln Glu Leu Gly Ile Pro Tyr Glu Glu
Ala Val 675 680 685Ala Phe Ile Asp Arg Tyr Phe Gln Ser Tyr Pro Lys
Val Lys Ala Trp 690 695 700Ile Glu Arg Thr Leu Glu Glu Gly Arg Gln
Arg Gly Tyr Val Glu Thr705 710 715 720Leu Phe Gly Arg Arg Arg Tyr
Val Pro Asp Leu Asn Ala Arg Val Lys 725 730 735Ser Val Arg Glu Ala
Ala Glu Arg Met Ala Phe Asn Met Pro Val Gln 740 745 750Gly Thr Ala
Ala Asp Leu Met Lys Leu Ala Met Val Arg Leu Phe Pro 755 760 765Arg
Leu Pro Glu Val Gly Ala Arg Met Leu Leu Gln Val His Asp Glu 770 775
780Leu Leu Leu Glu Ala Pro Lys Glu Arg Ala Glu Glu Ala Ala Ala
Leu785 790 795 800Ala Lys Glu Val Met Glu Gly Val Trp Pro Leu Ala
Val Pro Leu Glu 805 810 815Val Glu Val Gly Ile Gly Glu Asp Trp Leu
Ser Ala Lys Gly 820 825 830
SEQ ID NO: 10 (830 amino acids, PRT, Thermus brockianus)
Met Leu Pro
Leu Phe Glu Pro Lys Gly Arg Val Leu Leu Val Asp Gly1 5 10 15His His
Leu Ala Tyr Arg Asn Phe Phe Ala Leu Lys Gly Leu Thr Thr 20 25 30Ser
Arg Gly Glu Pro Val Gln Gly Val Tyr Gly Phe Ala Lys Ser Leu 35 40
45Leu Lys Ala Leu Lys Glu Asp Gly Asp Val Val Ile Val Val Phe Asp
50 55 60Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Ala Tyr Lys
Ala65 70 75 80Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu
Ala Leu Met 85 90 95Lys Glu Leu Val Asp Leu Leu Gly Leu Glu Arg Leu
Glu Val Pro Gly 100 105 110Phe Glu Ala Asp Asp Val Leu Ala Ala Leu
Ala Lys Lys Ala Glu Arg 115 120 125Glu Gly Tyr Glu Val Arg Ile Leu
Thr Ala Asp Arg Asp Leu Phe Gln 130 135 140Leu Leu Ser Asp Arg Ile
Ala Val Leu His Pro Glu Gly His Leu Ile145 150 155 160Thr Pro Gly
Trp Leu Trp Glu Arg Tyr Gly Leu Arg Pro Glu Gln Trp 165 170 175Val
Asp Phe Arg Ala Leu Ala Gly Asp Pro Ser Asp Asn Ile Pro Gly 180 185
190Val Lys Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp
195 200 205Gly Ser Leu Glu Asn Ile Gln Lys Asn Leu Asp Gln Val Ser
Pro Pro 210 215 220Ser Val Arg Glu Lys Ile Gln Ala His Leu Asp Asp
Leu Arg Leu Ser225 230 235 240Gln Glu Leu Ser Arg Val Arg Thr Asp
Leu Pro Leu Glu Val Asp Phe 245 250 255Arg Arg Arg Arg Glu Pro Asp
Arg Glu Gly Leu Arg Ala Phe Leu Glu 260 265 270Arg Leu Glu Phe Gly
Ser Leu Leu His Glu Phe Gly Leu Leu Glu Ser 275 280 285Pro Gln Ala
Ala Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly Ala Phe 290 295 300Leu
Gly Phe Arg Leu Ser Arg Pro Glu Pro Met Trp Ala Glu Leu Leu305 310
315 320Ser Leu Ala Ala Ser Ala Lys Gly Arg Val Tyr Arg Ala Glu Ala
Pro 325 330 335His Lys Ala Leu Ser Asp Leu Lys Glu Ile Arg Gly Leu
Leu Ala Lys 340 345 350Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu
Gly Leu Pro Pro Thr 355 360 365Asp Asp Pro Met Leu Leu Ala Tyr Leu
Leu Asp Pro Ser Asn Thr Thr 370 375 380Pro Glu Gly Val Ala Arg Arg
Tyr Gly Gly Glu Trp Thr Glu Glu Ala385 390 395 400Gly Glu Arg Ala
Leu Leu Ala Glu Arg Leu Tyr Glu Asn Leu Leu Ser 405 410 415Arg Leu
Lys Gly Glu Glu Lys Leu Leu Trp Leu Tyr Glu Glu Val Glu 420 425
430Lys Pro Leu Ser Arg Val Leu Ala His Met Glu Ala Thr Gly Val Arg
435 440 445Leu Asp Val Pro Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala
Ala Glu 450 455 460Met Gly Arg Leu Glu Glu Glu Val Phe Arg Leu Ala
Gly His Pro Phe465 470 475 480Asn Leu Asn Ser Arg Asp Gln Leu Glu
Arg Val Leu Phe Asp Glu Leu 485 490 495Gly Leu Pro Pro Ile Gly Lys
Thr Glu Lys Thr Gly Lys Arg Ser Thr 500 505 510Ser Ala Ala Val Leu
Glu Ala Leu Arg Glu Ala His Pro Ile Val Glu 515 520 525Lys Ile Leu
Gln Tyr Arg Glu Leu Ala Lys Leu Lys Gly Thr Tyr Ile 530 535 540Asp
Pro Leu Pro Ala Leu Val His Pro Arg Thr Gly Arg Leu His Thr545 550
555 560Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser
Asp 565 570 575Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly
Gln Arg Ile 580 585 590Arg Arg Ala Phe Val Ala Glu Glu Gly Tyr Leu
Leu Val Ala Leu Asp 595 600 605Tyr Ser Gln Ile Glu Leu Arg Val Leu
Ala His Leu Ser Gly Asp Glu 610 615 620Asn Leu Ile Arg Val Phe Gln
Glu Gly Arg Asp Ile His Thr Gln Thr625 630 635 640Ala Ser Trp Met
Phe Gly Leu Pro Ala Glu Ala Ile Asp Pro Leu Arg 645 650 655Arg Arg
Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly Met Ser 660 665
670Ala His Arg Leu Ser Gln Glu Leu Gly Ile Pro Tyr Glu Glu Ala Val
675 680 685Ala Phe Ile Asp Arg Tyr Phe Gln Ser Tyr Pro Lys Val Lys
Ala Trp 690 695 700Ile Glu Arg Thr Leu Glu Glu Gly Arg Gln Arg Gly
Tyr Val Glu Thr705 710 715 720Leu Phe Gly Arg Arg Arg Tyr Val Pro
Asp Leu Asn Ala Arg Val Lys 725 730 735Ser Val Arg Glu Ala Ala Glu
Arg Met Ala Phe Asn Met Pro Val Gln 740 745 750Gly Thr Ala Ala Asp
Leu Met Lys Leu Ala Met Val Arg Leu Phe Pro 755 760 765Arg Leu Pro
Glu Val Gly Ala Arg Met Leu Leu Gln Val His Asp Glu 770 775 780Leu
Leu Leu Glu Ala Pro Lys Glu Arg Ala Glu Glu Ala Ala Ala Leu785 790
795 800Ala Lys Glu Val Met Glu Gly Val Trp Pro Leu Ala Val Pro Leu
Glu 805 810 815Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys
Gly 820 825 830
SEQ ID NO: 11 (830 amino acids, PRT, Thermus brockianus)
Met Leu Pro Leu Phe Glu
Pro Lys Gly Arg Val Leu Leu Val Asp Gly1 5 10 15His His Leu Ala Tyr
Arg Asn Phe Phe Ala Leu Lys Gly Leu Thr Thr 20 25 30Ser Arg Gly Glu
Pro Val Gln Gly Val Tyr Asp Phe Ala Lys Ser Leu 35 40 45Leu Lys Ala
Leu Lys Glu Asp Gly Asp Val Val Ile Val Val Phe Asp 50 55 60Ala Lys
Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Ala Tyr Lys Ala65 70 75
80Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala Leu Met
85 90 95Lys Glu Leu Val Asp Leu Leu Gly Leu Glu Arg Leu Glu Val Pro
Gly 100 105 110Phe Glu Ala Asp Asp Val Leu Ala Ala Leu Ala Lys Lys
Ala Glu Arg 115 120 125Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp
Arg Asp Leu Phe Gln 130 135 140Leu Leu Ser Asp Arg Ile Ala Val Leu
His Pro Glu Gly His Leu Ile145 150 155 160Thr Pro Gly Trp Leu Trp
Glu Arg Tyr Gly Leu Arg Pro Glu Gln Trp 165 170 175Val Asp Phe Arg
Ala Leu Ala Gly Asp Pro Ser Asp Asn Ile Pro Gly 180 185 190Val Lys
Gly Ile Gly Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp 195 200
205Gly Ser Leu Glu Asn Ile Gln Lys Asn Leu Asp Gln Val Ser Pro Pro
210 215 220Ser Val Arg Glu Lys Ile Gln Ala His Leu Asp Asp Leu Arg
Leu Ser225 230 235 240Gln Glu Leu Ser Arg Val Arg Thr Asp Leu Pro
Leu Glu Val Asp Phe 245 250 255Arg Arg Arg Arg Glu Pro Asp Arg Glu
Gly Leu Arg Ala Phe Leu Glu 260 265 270Arg Leu Glu Phe Gly Ser Leu
Leu His Glu Phe Gly Leu Leu Glu Ser 275 280 285Pro Gln Ala Ala Glu
Glu Ala Pro Trp Pro Pro Pro Glu Gly Ala Phe 290 295 300Leu Gly Phe
Arg Leu Ser Arg Pro Glu Pro Met Trp Ala Glu Leu Leu305 310 315
320Ser Leu Ala Ala Ser Ala Lys Gly Arg Val Tyr Arg Ala Glu Ala Pro
325 330 335His Lys Ala Leu Ser Asp Leu Lys Glu Ile Arg Gly Leu Leu
Ala Lys 340 345 350Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Gly
Leu Pro Pro Thr 355 360 365Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu
Asp Pro Ser Asn Thr Thr 370 375 380Pro Glu Gly Val Ala Arg Arg Tyr
Gly Gly Glu Trp Thr Glu Glu Ala385 390 395 400Gly Glu Arg Ala Leu
Leu Ala Glu Arg Leu Tyr Glu Asn Leu Leu Ser 405 410 415Arg Leu Lys
Gly Glu Glu Lys Leu Leu Trp Leu Tyr Glu Glu Val Glu 420 425 430Lys
Pro Leu Ser Arg Val Leu Ala His Met Glu Ala Thr Gly Val Arg 435 440
445Leu Asp Val Pro Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala Ala Glu
450 455 460Met Gly Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His
Pro Phe465 470 475 480Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val
Leu Phe Asp Glu Leu 485 490 495Gly Leu Pro Pro Ile Gly Lys Thr Glu
Lys Thr Gly Lys Arg Ser Thr 500 505 510Ser Ala Ala Val Leu Glu Ala
Leu Arg Glu Ala His Pro Ile Val Glu 515 520 525Lys Ile Leu Gln Tyr
Arg Glu Leu Ala Lys Leu Lys Gly Thr Tyr Ile 530 535 540Asp Leu Leu
Pro Ala Leu Val His Pro Arg Thr Gly Arg Leu His Thr545 550 555
560Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp
565 570 575Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
Arg Ile 580 585 590Arg Arg Ala Phe Val Ala Glu Glu Gly Tyr Leu Leu
Val Ala Leu Asp 595 600 605Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala
His Leu Ser Gly Asp Glu 610 615 620Asn Leu Ile Arg Val Phe Gln Glu
Gly Arg Asp Ile His Thr Gln Thr625 630 635 640Ala Ser Trp Met Phe
Gly Leu Pro Ala Glu Ala Ile Asp Pro Leu Arg 645 650 655Arg Arg Ala
Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly Met Ser 660 665 670Ala
His Arg Leu Ser Gln Glu Leu Gly Ile Pro Tyr Glu Glu Ala Val 675 680
685Ala Phe Ile Asp Arg Tyr Phe Gln Ser Tyr Pro Lys Val Lys Ala Trp
690 695 700Ile Glu Arg Thr Leu Glu Glu Gly Arg Gln Arg Gly Tyr Val
Glu Thr705 710 715 720Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu
Asn Ala Arg Val Lys 725 730 735Ser Val Arg Glu Ala Ala Glu Arg Met
Ala Phe Asn Met Pro Val Gln 740 745 750Gly Thr Ala Ala Asp Leu Met
Lys Leu Ala Met Val Arg Leu Phe Pro 755 760 765Arg Leu Pro Glu Val
Gly Ala Arg Met Leu Leu Gln Val His Asp Glu 770 775 780Leu Leu Leu
Glu Ala Pro Lys Glu Arg Ala Glu Glu Ala Ala Ala Leu785 790 795
800Ala Lys Glu Val Met Glu Gly Val Trp Pro Leu Ala Val Pro Leu Glu
805 810 815Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Gly
820 825 830
SEQ ID NO: 12 (830 amino acids, PRT, Thermus brockianus)
Met Leu Pro Leu Phe Glu Pro
Lys Gly Arg Val Leu Leu Val Asp Gly1 5 10 15His His Leu Ala Tyr Arg
Asn Phe Phe Ala Leu Lys Gly Leu Thr Thr 20 25 30Ser Arg Gly Glu Pro
Val Gln Gly Val Tyr Asp Phe Ala Lys Ser Leu 35 40 45Leu Lys Ala Leu
Lys Glu Asp Gly Asp Val Val Ile Val Val Phe Asp 50 55 60Ala Lys Ala
Pro Ser Phe Arg His Glu Ala Tyr Gly Ala Tyr Lys Ala65 70 75 80Gly
Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala Leu Met 85 90
95Lys Glu Leu Val Asp Leu Leu Gly Leu Glu Arg Leu Glu Val Pro Gly
100 105 110Phe Glu Ala Asp Asp Val Leu Ala Ala Leu Ala Lys Lys Ala
Glu Arg 115 120 125Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg
Asp Leu Phe Gln 130 135 140Leu Leu Ser Asp Arg Ile Ala Val Leu His
Pro Glu Gly His Leu Ile145 150 155 160Thr Pro Gly Trp Leu Trp Glu
Arg Tyr Gly Leu Arg Pro Glu Gln Trp 165 170 175Val Asp Phe Arg Ala
Leu Ala Gly Asp Pro Ser Asp Asn Ile Pro Gly 180 185 190Val Lys Gly
Ile Gly Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp 195 200 205Gly
Ser Leu Glu Asn Ile Gln Lys Asn Leu Asp Gln Val Ser Pro Pro 210 215
220Ser Val Arg Glu Lys Ile Gln Ala His Leu Asp Asp Leu Arg Leu
Ser225 230 235 240Gln Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu
Glu Val Asp Phe 245 250 255Arg Arg Arg Arg Glu Pro Asp Arg Glu Gly
Leu Arg Ala Phe Leu Glu 260 265 270Arg Leu Glu Phe Gly Ser Leu Leu
His Glu Phe Gly Leu Leu Glu Ser 275 280 285Pro Gln Ala Ala Glu Glu
Ala Pro Trp Pro Pro Pro Glu Gly Ala Phe 290 295 300Leu Gly Phe Arg
Leu Ser Arg Pro Glu Pro Met Trp Ala Glu Leu Leu305 310 315 320Ser
Leu Ala Ala Ser Ala Lys Gly Arg Val Tyr Arg Ala Glu Ala Pro 325 330
335His Lys Ala Leu Ser Asp Leu Lys Glu Ile Arg Gly Leu Leu Ala Lys
340 345 350Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro
Pro Thr 355 360 365Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro
Ser Asn Thr Thr 370 375 380Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly
Glu Trp Thr Glu Glu Ala385 390 395 400Gly Glu Arg Ala Leu Leu Ala
Glu Arg Leu Tyr Glu Asn Leu Leu Ser 405 410 415Arg Leu Lys Gly Glu
Glu Lys Leu Leu Trp Leu Tyr Glu Glu Val Glu 420 425 430Lys Pro Leu
Ser Arg Val Leu Ala His Met Glu Ala Thr Gly Val Arg 435 440 445Leu
Asp Val Pro Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala Ala Glu 450 455
460Met Gly Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His Pro
Phe465 470 475 480Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu
Phe Asp Glu Leu 485 490 495Gly Leu Pro Pro Ile Gly Lys Thr Glu Lys
Thr Gly Lys Arg Ser Thr 500 505 510Ser Ala Ala Val Leu Glu Ala Leu
Arg Glu Ala His Pro Ile Val Glu 515 520 525Lys Ile Leu Gln Tyr Arg
Glu Leu Ala Lys Leu Lys Gly Thr Tyr Ile 530 535 540Asp Pro Leu Pro
Ala Leu Val His Pro Arg Thr Gly Arg Leu His Thr545 550 555 560Arg
Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 565 570
575Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg Ile
580 585 590Arg Arg Ala Phe Val Ala Glu Glu Gly Tyr Leu Leu Val Ala
Leu Asp 595 600 605Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu
Ser Gly Asp Glu 610 615 620Asn Leu Ile Arg Val Phe Gln Glu Gly Arg
Asp Ile His Thr Gln Thr625 630 635 640Ala Ser Trp Met Phe Gly Leu
Pro Ala Glu Ala Ile Asp Pro Leu Arg 645 650 655Arg Arg Ala Ala Lys
Thr Ile Asn Phe Gly Val Leu Tyr Gly Met Ser 660 665 670Ala His Arg
Leu Ser Gln Glu Leu Gly Ile Pro Tyr Glu Glu Ala Val 675 680 685Ala
Phe Ile Asp Arg Tyr Phe Gln Ser Tyr Pro Lys Val Lys Ala Trp 690 695
700Ile Glu Arg Thr Leu Glu Glu Gly Arg Gln Arg Gly Tyr Val Glu
Thr705 710 715 720Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn
Ala Arg Val Lys 725 730 735Ser Val Arg Glu Ala Ala Glu Arg Met Ala
Phe Asn Met Pro Val Gln 740 745 750Gly Thr Ala Ala Asp Leu Met Lys
Leu Ala Met Val Arg Leu Phe Pro 755 760 765Arg Leu Pro Glu Val Gly
Ala Arg Met Leu Leu Gln Val His Asp Glu 770 775 780Leu Leu Leu Glu
Ala Pro Lys Glu Arg Ala Glu Glu Ala Ala Ala Leu785 790 795 800Ala
Lys Glu Val Met Glu Gly Val Trp Pro Leu Ala Val Pro Leu Glu 805 810
815Val Glu Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Gly 820 825
830
SEQ ID NO: 13 (830 amino acids, PRT, Thermus brockianus)
Met Leu Pro Leu Phe Glu Pro Lys Gly
Arg Val Leu Leu Val Asp Gly1 5 10 15His His Leu Ala Tyr Arg Asn Phe
Phe Ala Leu Lys Gly Leu Thr Thr 20 25 30Ser Arg Gly Glu Pro Val Gln
Gly Val Tyr Gly Phe Ala Lys Ser Leu 35 40 45Leu Lys Ala Leu Lys Glu
Asp Gly Asp Val Val Ile Val Val Phe Asp 50 55 60Ala Lys Ala Pro Ser
Phe Arg His Glu Ala Tyr Gly Ala Tyr Lys Ala65 70 75 80Gly Arg Ala
Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala Leu Met 85 90 95Lys Glu
Leu Val Asp Leu Leu Gly Leu Glu Arg Leu Glu Val Pro Gly 100 105
110Phe Glu Ala Asp Asp Val Leu Ala Ala Leu Ala Lys Lys Ala Glu Arg
115 120 125Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg Asp Leu
Phe Gln 130 135 140Leu Leu Ser Asp Arg Ile Ala Val Leu His Pro Glu
Gly His Leu Ile145 150 155 160Thr Pro Gly Trp Leu Trp Glu Arg Tyr
Gly Leu Arg Pro Glu Gln Trp 165 170 175Val Asp Phe Arg Ala Leu Ala
Gly Asp Pro Ser Asp Asn Ile Pro Gly 180 185 190Val Lys Gly Ile Gly
Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp 195 200 205Gly Ser Leu
Glu Asn Ile Gln Lys Asn Leu Asp Gln Val Ser Pro Pro 210 215 220Ser
Val Arg Glu Lys Ile Gln Ala His Leu Asp Asp Leu Arg Leu Ser225 230
235 240Gln Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu Glu Val Asp
Phe 245 250 255Arg Arg Arg Arg Glu Pro Asp Arg Glu Gly Leu Arg Ala
Phe Leu Glu 260 265 270Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe
Gly Leu Leu Glu Ser 275 280 285Pro Gln Ala Ala Glu Glu Ala Pro Trp
Pro Pro Pro Glu Gly Ala Phe 290 295 300Leu Gly Phe Arg Leu Ser Arg
Pro Glu Pro Met Trp Ala Glu Leu Leu305 310 315 320Ser Leu Ala Ala
Ser Ala Lys Gly Arg Val Tyr Arg Ala Glu Ala Pro 325 330 335His Lys
Ala Leu Ser Asp Leu Lys Glu Ile Arg Gly Leu Leu Ala Lys 340 345
350Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro Pro Thr
355 360 365Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
Thr Thr 370 375 380Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp
Thr Glu Glu Ala385 390 395 400Gly Glu Arg Ala Leu Leu Ala Glu Arg
Leu Tyr Glu Asn Leu Leu Ser 405 410 415Arg Leu Lys Gly Glu Glu Lys
Leu Leu Trp Leu Tyr Glu Glu Val Glu 420 425 430Lys Pro Leu Ser Arg
Val Leu Ala His Met Glu Ala Thr Gly Val Arg 435 440 445Leu Asp Val
Pro Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala Ala Glu 450 455 460Met
Gly Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His Pro Phe465 470
475 480Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp Glu
Leu 485 490 495Gly Leu Pro Pro Ile Gly Lys Thr Glu Lys Thr Gly Lys
Arg Ser Thr 500 505 510Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala
His Pro Ile Val Glu 515 520 525Lys Ile Leu Gln Tyr Arg Glu Leu Ala
Lys Leu Lys Gly Thr Tyr Ile 530 535 540Asp Leu Leu Pro Ala Leu Val
His Pro Arg Thr Gly Arg Leu His Thr545 550 555 560Arg Phe Asn Gln
Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 565 570 575Pro Asn
Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg Ile 580 585
590Arg Arg Ala Phe Val Ala Glu Glu Gly Tyr Leu Leu Val Ala Leu Asp
595 600 605Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
Asp Glu 610 615 620Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile
His Thr Gln Thr625 630 635 640Ala Ser Trp Met Phe Gly Leu Pro Ala
Glu Ala Ile Asp Pro Leu Arg 645 650 655Arg Arg Ala Ala Lys Thr Ile
Asn Tyr Gly Val Leu Tyr Gly Met Ser 660 665 670Ala His Arg Leu Ser
Gln Glu Leu Gly Ile Pro Tyr Glu Glu Ala Val 675 680 685Ala Phe Ile
Asp Arg Tyr Phe Gln Ser Tyr Pro Lys Val Lys Ala Trp 690 695 700Ile
Glu Arg Thr Leu Glu Glu Gly Arg Gln Arg Gly Tyr Val Glu Thr705 710
715 720Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg Val
Lys 725 730 735Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met
Pro Val Gln 740 745 750Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met
Val Arg Leu Phe Pro 755 760 765Arg Leu Pro Glu Val Gly Ala Arg Met
Leu Leu Gln Val His Asp Glu 770 775 780Leu Leu Leu Glu Ala Pro Lys
Glu Arg Ala Glu Glu Ala Ala Ala Leu785 790 795 800Ala Lys Glu Val
Met Glu Gly Val Trp Pro Leu Ala Val Pro Leu Glu 805 810 815Val Glu
Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Gly 820 825
83014830PRTThermus brockianus 14Met Leu Pro Leu Phe Glu Pro Lys Gly
Arg Val Leu Leu Val Asp Gly1 5 10 15His His Leu Ala Tyr Arg Asn Phe
Phe Ala Leu Lys Gly Leu Thr Thr 20 25 30Ser Arg Gly Glu Pro Val Gln
Gly Val Tyr Gly Phe Ala Lys Ser Leu 35 40 45Leu Lys Ala Leu Lys Glu
Asp Gly Asp Val Val Ile Val Val Phe Asp 50 55 60Ala Lys Ala Pro Ser
Phe Arg His Glu Ala Tyr Gly Ala Tyr Lys Ala65 70 75 80Gly Arg Ala
Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala Leu Met 85 90 95Lys Glu
Leu Val Asp Leu Leu Gly Leu Glu Arg Leu Glu Val Pro Gly 100 105
110Phe Glu Ala Asp Asp Val Leu Ala Ala Leu Ala Lys Lys Ala Glu Arg
115 120 125Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg Asp Leu
Phe Gln 130 135 140Leu Leu Ser Asp Arg Ile Ala Val Leu His Pro Glu
Gly His Leu Ile145 150 155 160Thr Pro Gly Trp Leu Trp Glu Arg Tyr
Gly Leu Arg Pro Glu Gln Trp 165 170 175Val Asp Phe Arg Ala Leu Ala
Gly Asp Pro Ser Asp Asn Ile Pro Gly 180 185 190Val Lys Gly Ile Gly
Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp 195 200 205Gly Ser Leu
Glu Asn Ile Gln Lys Asn Leu Asp Gln Val Ser Pro Pro 210 215 220Ser
Val Arg Glu Lys Ile Gln Ala His Leu Asp Asp Leu Arg Leu Ser225 230
235 240Gln Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu Glu Val Asp
Phe 245 250 255Arg Arg Arg Arg Glu Pro Asp Arg Glu Gly Leu Arg Ala
Phe Leu Glu 260 265 270Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe
Gly Leu Leu Glu Ser 275 280 285Pro Gln Ala Ala Glu Glu Ala Pro Trp
Pro Pro Pro Glu Gly Ala Phe 290 295 300Leu Gly Phe Arg Leu Ser Arg
Pro Glu Pro Met Trp Ala Glu Leu Leu305 310 315 320Ser Leu Ala Ala
Ser Ala Lys Gly Arg Val Tyr Arg Ala Glu Ala Pro 325 330 335His Lys
Ala Leu Ser Asp Leu Lys Glu Ile Arg Gly Leu Leu Ala Lys 340 345
350Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro Pro Thr
355 360 365Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
Thr Thr 370 375 380Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp
Thr Glu Glu Ala385 390 395 400Gly Glu Arg Ala Leu Leu Ala Glu Arg
Leu Tyr Glu Asn Leu Leu Ser 405 410 415Arg Leu Lys Gly Glu Glu Lys
Leu Leu Trp Leu Tyr Glu Glu Val Glu 420 425 430Lys Pro Leu Ser Arg
Val Leu Ala His Met Glu Ala Thr Gly Val Arg 435 440 445Leu Asp Val
Pro Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala Ala Glu 450 455 460Met
Gly Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His Pro Phe465 470
475 480Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp Glu
Leu 485 490 495Gly Leu Pro Pro Ile Gly Lys Thr Glu Lys Thr Gly Lys
Arg Ser Thr 500 505 510Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala
His Pro Ile Val Glu 515 520 525Lys Ile Leu Gln Tyr Arg Glu Leu Ala
Lys Leu Lys Gly Thr Tyr Ile 530 535 540Asp Pro Leu Pro Ala Leu Val
His Pro Arg Thr Gly Arg Leu His Thr545 550 555 560Arg Phe Asn Gln
Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 565 570 575Pro Asn
Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg Ile 580 585
590Arg Arg Ala Phe Val Ala Glu Glu Gly Tyr Leu Leu Val Ala Leu Asp
595 600 605Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
Asp Glu 610 615 620Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile
His Thr Gln Thr625 630 635 640Ala Ser Trp Met Phe Gly Leu Pro Ala
Glu Ala Ile Asp Pro Leu Arg 645 650 655Arg Arg Ala Ala Lys Thr Ile
Asn Tyr Gly Val Leu Tyr Gly Met Ser 660 665 670Ala His Arg Leu Ser
Gln Glu Leu Gly Ile Pro Tyr Glu Glu Ala Val 675 680 685Ala Phe Ile
Asp Arg Tyr Phe Gln Ser Tyr Pro Lys Val Lys Ala Trp 690 695 700Ile
Glu Arg Thr Leu Glu Glu Gly Arg Gln Arg Gly Tyr Val Glu Thr705 710
715 720Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg Val
Lys 725 730 735Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met
Pro Val Gln 740 745 750Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met
Val Arg Leu Phe Pro 755 760 765Arg Leu Pro Glu Val Gly Ala Arg Met
Leu Leu Gln Val His Asp Glu 770 775 780Leu Leu Leu Glu Ala Pro Lys
Glu Arg Ala Glu Glu Ala Ala Ala Leu785 790 795 800Ala Lys Glu Val
Met Glu Gly Val Trp Pro Leu Ala Val Pro Leu Glu 805 810 815Val Glu
Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Gly 820 825
83015830PRTThermus brockianus 15Met Leu Pro Leu Phe Glu Pro Lys Gly
Arg Val Leu Leu Val Asp Gly1 5 10 15His His Leu Ala Tyr Arg Asn Phe
Phe Ala Leu Lys Gly Leu Thr Thr 20 25 30Ser Arg Gly Glu Pro Val Gln
Gly Val Tyr Asp Phe Ala Lys Ser Leu 35 40 45Leu Lys Ala Leu Lys Glu
Asp Gly Asp Val Val Ile Val Val Phe Asp 50 55 60Ala Lys Ala Pro Ser
Phe Arg His Glu Ala Tyr Gly Ala Tyr Lys Ala65 70 75 80Gly Arg Ala
Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala Leu Met 85 90 95Lys Glu
Leu Val Asp Leu Leu Gly Leu Glu Arg Leu Glu Val Pro Gly 100 105
110Phe Glu Ala Asp Asp Val Leu Ala Ala Leu Ala Lys Lys Ala Glu Arg
115 120 125Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg Asp Leu
Phe Gln 130 135 140Leu Leu Ser Asp Arg Ile Ala Val Leu His Pro Glu
Gly His Leu Ile145 150 155 160Thr Pro Gly Trp Leu Trp Glu Arg Tyr
Gly Leu Arg Pro Glu Gln Trp 165 170 175Val Asp Phe Arg Ala Leu Ala
Gly Asp Pro Ser Asp Asn Ile Pro Gly 180 185 190Val Lys Gly Ile Gly
Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp 195 200 205Gly Ser Leu
Glu Asn Ile Gln Lys Asn Leu Asp Gln Val Ser Pro Pro 210 215 220Ser
Val Arg Glu Lys Ile Gln Ala His Leu Asp Asp Leu Arg Leu Ser225 230
235 240Gln Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu Glu Val Asp
Phe 245 250 255Arg Arg Arg Arg Glu Pro Asp Arg Glu Gly Leu Arg Ala
Phe Leu Glu 260 265 270Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe
Gly Leu Leu Glu Ser 275 280 285Pro Gln Ala Ala Glu Glu Ala Pro Trp
Pro Pro Pro Glu Gly Ala Phe 290 295 300Leu Gly Phe Arg Leu Ser Arg
Pro Glu Pro Met Trp Ala Glu Leu Leu305 310 315 320Ser Leu Ala Ala
Ser Ala Lys Gly Arg Val Tyr Arg Ala Glu Ala Pro 325 330 335His Lys
Ala Leu Ser Asp Leu Lys Glu Ile Arg Gly Leu Leu Ala Lys 340 345
350Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro Pro Thr
355 360 365Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
Thr Thr 370 375 380Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp
Thr Glu Glu Ala385 390 395 400Gly Glu Arg Ala Leu Leu Ala Glu Arg
Leu Tyr Glu Asn Leu Leu Ser 405 410 415Arg Leu Lys Gly Glu Glu Lys
Leu Leu Trp Leu Tyr Glu Glu Val Glu 420 425 430Lys Pro Leu Ser Arg
Val Leu Ala His Met Glu Ala Thr Gly Val Arg 435 440 445Leu Asp Val
Pro Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala Ala Glu 450 455 460Met
Gly Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His Pro Phe465 470
475 480Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp Glu
Leu 485 490 495Gly Leu Pro Pro Ile Gly Lys Thr Glu Lys Thr Gly Lys
Arg Ser Thr 500 505 510Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala
His Pro Ile Val Glu 515 520 525Lys Ile Leu Gln Tyr Arg Glu Leu Ala
Lys Leu Lys Gly Thr Tyr Ile 530 535 540Asp Leu Leu Pro Ala Leu Val
His Pro Arg Thr Gly Arg Leu His Thr545 550 555 560Arg Phe Asn Gln
Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 565 570 575Pro Asn
Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg Ile 580 585
590Arg Arg Ala Phe Val Ala Glu Glu Gly Tyr Leu Leu Val Ala Leu Asp
595 600 605Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
Asp Glu 610 615 620Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile
His Thr Gln Thr625 630 635 640Ala Ser Trp Met Phe Gly Leu Pro Ala
Glu Ala Ile Asp Pro Leu Arg 645 650 655Arg Arg Ala Ala Lys Thr Ile
Asn Tyr Gly Val Leu Tyr Gly Met Ser 660 665 670Ala His Arg Leu Ser
Gln Glu Leu Gly Ile Pro Tyr Glu Glu Ala Val 675 680 685Ala Phe Ile
Asp Arg Tyr Phe Gln Ser Tyr Pro Lys Val Lys Ala Trp 690 695 700Ile
Glu Arg Thr Leu Glu Glu Gly Arg Gln Arg Gly Tyr Val Glu Thr705 710
715 720Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg Val
Lys 725 730 735Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met
Pro Val Gln 740 745 750Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met
Val Arg Leu Phe Pro 755 760 765Arg Leu Pro Glu Val Gly Ala Arg Met
Leu Leu Gln Val His Asp Glu 770 775 780Leu Leu Leu Glu Ala Pro Lys
Glu Arg Ala Glu Glu Ala Ala Ala Leu785 790 795 800Ala Lys Glu Val
Met Glu Gly Val Trp Pro Leu Ala Val Pro Leu Glu 805 810 815Val Glu
Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Gly 820 825
83016830PRTThermus brockianus 16Met Leu Pro Leu Phe Glu Pro Lys Gly
Arg Val Leu Leu Val Asp Gly1 5 10 15His His Leu Ala Tyr Arg Asn Phe
Phe Ala Leu Lys Gly Leu Thr Thr 20 25 30Ser Arg Gly Glu Pro Val Gln
Gly Val Tyr Asp Phe Ala Lys Ser Leu 35 40 45Leu Lys Ala Leu Lys Glu
Asp Gly Asp Val Val Ile Val Val Phe Asp 50 55 60Ala Lys Ala Pro Ser
Phe Arg His Glu Ala Tyr Gly Ala Tyr Lys Ala65 70 75 80Gly Arg Ala
Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala Leu Met 85 90 95Lys Glu
Leu Val Asp Leu Leu Gly Leu Glu Arg Leu Glu Val Pro Gly 100 105
110Phe Glu Ala Asp Asp Val Leu Ala Ala Leu Ala Lys Lys Ala Glu Arg
115 120 125Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg Asp Leu
Phe Gln 130 135 140Leu Leu Ser Asp Arg Ile Ala Val Leu His Pro Glu
Gly His Leu Ile145 150 155 160Thr Pro Gly Trp Leu Trp Glu Arg Tyr
Gly Leu Arg Pro Glu Gln Trp 165 170 175Val Asp Phe Arg Ala Leu Ala
Gly Asp Pro Ser Asp Asn Ile Pro Gly 180 185 190Val Lys Gly Ile Gly
Glu Lys Thr Ala Leu Lys Leu Leu Lys Glu Trp 195 200 205Gly Ser Leu
Glu Asn Ile Gln Lys Asn Leu Asp Gln Val Ser Pro Pro 210 215 220Ser
Val Arg Glu Lys Ile Gln Ala His Leu Asp Asp Leu Arg Leu Ser225 230
235 240Gln Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu Glu Val Asp
Phe 245 250 255Arg Arg Arg Arg Glu Pro Asp Arg Glu Gly Leu Arg Ala
Phe Leu Glu 260 265 270Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe
Gly Leu Leu Glu Ser 275 280 285Pro Gln Ala Ala Glu Glu Ala Pro Trp
Pro Pro Pro Glu Gly Ala Phe 290 295 300Leu Gly Phe Arg Leu Ser Arg
Pro Glu Pro Met Trp Ala Glu Leu Leu305 310 315 320Ser Leu Ala Ala
Ser Ala Lys Gly Arg Val Tyr Arg Ala Glu Ala Pro 325 330 335His Lys
Ala Leu Ser Asp Leu Lys Glu Ile Arg Gly Leu Leu Ala Lys 340 345
350Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro Pro Thr
355 360 365Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn
Thr Thr 370 375 380Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp
Thr Glu Glu Ala385 390 395 400Gly Glu Arg Ala Leu Leu Ala Glu Arg
Leu Tyr Glu Asn Leu Leu Ser 405 410 415Arg Leu Lys Gly Glu Glu Lys
Leu Leu Trp Leu Tyr Glu Glu Val Glu 420 425 430Lys Pro Leu Ser Arg
Val Leu Ala His Met Glu Ala Thr Gly Val Arg 435 440 445Leu Asp Val
Pro Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala Ala Glu 450 455 460Met
Gly Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His Pro Phe465 470
475 480Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp Glu
Leu 485 490 495Gly Leu Pro Pro Ile Gly Lys Thr Glu Lys Thr Gly Lys
Arg Ser Thr 500 505 510Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala
His Pro Ile Val Glu 515 520 525Lys Ile Leu Gln Tyr Arg Glu Leu Ala
Lys Leu Lys Gly Thr Tyr Ile 530 535 540Asp Pro Leu Pro Ala Leu Val
His Pro Arg Thr Gly Arg Leu His Thr545 550 555 560Arg Phe Asn Gln
Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp 565 570 575Pro Asn
Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg Ile 580 585
590Arg Arg Ala Phe Val Ala Glu Glu Gly Tyr Leu Leu Val Ala Leu Asp
595 600 605Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly
Asp Glu 610 615 620Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp Ile
His Thr Gln Thr625 630 635 640Ala Ser Trp Met Phe Gly Leu Pro Ala
Glu Ala Ile Asp Pro Leu Arg 645 650 655Arg Arg Ala Ala Lys Thr Ile
Asn Tyr Gly Val Leu Tyr Gly Met Ser 660 665 670Ala His Arg Leu Ser
Gln Glu Leu Gly Ile Pro Tyr Glu Glu Ala Val 675 680 685Ala Phe Ile Asp
Arg Tyr Phe Gln Ser Tyr Pro Lys Val Lys Ala Trp 690 695 700Ile Glu
Arg Thr Leu Glu Glu Gly Arg Gln Arg Gly Tyr Val Glu Thr705 710 715
720Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg Val Lys
725 730 735Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
Val Gln 740 745 750Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val
Arg Leu Phe Pro 755 760 765Arg Leu Pro Glu Val Gly Ala Arg Met Leu
Leu Gln Val His Asp Glu 770 775 780Leu Leu Leu Glu Ala Pro Lys Glu
Arg Ala Glu Glu Ala Ala Ala Leu785 790 795 800Ala Lys Glu Val Met
Glu Gly Val Trp Pro Leu Ala Val Pro Leu Glu 805 810 815Val Glu Val
Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Gly 820 825
830174PRTThermus brockianus 17Gln Gly Val Tyr1184PRTThermus
brockianus 18Gly Ala Tyr Lys1194PRTThermus brockianus 19Leu Met Lys
Glu1204PRTThermus brockianus 20Pro Gly Phe Glu12116PRTThermus
brockianus 21Glu Arg Leu Glu Val Pro Gly Phe Glu Ala Asp Asp Val
Leu Ala Ala1 5 10 15224PRTThermus brockianus 22Thr Pro Gly
Trp1235PRTThermus brockianus 23Leu Ala Gly Asp Pro1 5244PRTThermus
brockianus 24Asn Ile Gln Lys1255PRTThermus brockianus 25Gln Val Ser
Pro Pro1 5264PRTThermus brockianus 26Glu Lys Ile Gln1275PRTThermus
brockianus 27Arg Leu Ser Gln Glu1 5286PRTThermus brockianus 28Phe
Arg Arg Arg Arg Glu1 5295PRTThermus brockianus 29Ser Pro Gln Ala
Ala1 5304PRTThermus brockianus 30Leu Gly Phe Arg1314PRTThermus
brockianus 31Glu Leu Leu Ser1327PRTThermus brockianus 32Ser Ala Lys
Gly Arg Val Tyr1 5334PRTThermus brockianus 33Glu Ala Pro
His1348PRTThermus brockianus 34Glu Ala Pro His Lys Ala Leu Ser1
5356PRTThermus brockianus 35Ala Glu Arg Leu Tyr Glu1 5365PRTThermus
brockianus 36Leu Ser Arg Leu Lys1 5374PRTThermus brockianus 37Tyr
Glu Glu Val1384PRTThermus brockianus 38Met Gly Arg
Leu1395PRTThermus brockianus 39Ala Lys Leu Lys Gly1 5404PRTThermus
brockianus 40Leu Pro Ala Leu1414PRTThermus brockianus 41Glu Gly Tyr
Leu1424PRTThermus brockianus 42Leu Pro Ala Glu1435PRTThermus
brockianus 43Pro Ala Glu Ala Ile1 5444PRTThermus brockianus 44Leu
Arg Arg Arg1454PRTThermus brockianus 45Ile Asp Arg
Tyr1465PRTThermus brockianus 46Tyr Pro Lys Val Lys1 5474PRTThermus
brockianus 47Gly Arg Gln Arg1489PRTThermus brockianus 48Arg Leu Phe
Pro Arg Leu Pro Glu Val1 5494PRTThermus brockianus 49Gly Val Trp
Pro1504PRTThermus brockianus 50Arg Asn Phe Phe15117DNAThermus
brockianus 51cccacctcca cctccag 175223DNAThermus brockianus
52cgacctcaac gcccgggtaa aga 235325DNAThermus brockianus
53gcttttggcg aagccgtaga cccct 255425DNAThermus brockianus
54catatgcttc ccctctttga gccca 255524DNAThermus brockianus
55gtcgactagc ccttggcgga aagc 2456541PRTThermus brockianus 56Gln Ala
Ala Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly Ala Phe Leu1 5 10 15Gly
Phe Arg Leu Ser Arg Pro Glu Pro Met Trp Ala Glu Leu Leu Ser 20 25
30Leu Ala Ala Ser Ala Lys Gly Arg Val Tyr Arg Ala Glu Ala Pro His
35 40 45Lys Ala Leu Ser Asp Leu Lys Glu Ile Arg Gly Leu Leu Ala Lys
Asp 50 55 60Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Gly Leu Pro Pro
Thr Asp65 70 75 80Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser
Asn Thr Thr Pro 85 90 95Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp
Thr Glu Glu Ala Gly 100 105 110Glu Arg Ala Leu Leu Ala Glu Arg Leu
Tyr Glu Asn Leu Leu Ser Arg 115 120 125Leu Lys Gly Glu Glu Lys Leu
Leu Trp Leu Tyr Glu Glu Val Glu Lys 130 135 140Pro Leu Ser Arg Val
Leu Ala His Met Glu Ala Thr Gly Val Arg Leu145 150 155 160Asp Val
Pro Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala Ala Glu Met 165 170
175Gly Arg Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His Pro Phe Asn
180 185 190Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp Glu
Leu Gly 195 200 205Leu Pro Pro Ile Gly Lys Thr Glu Lys Thr Gly Lys
Arg Ser Thr Ser 210 215 220Ala Ala Val Leu Glu Ala Leu Arg Glu Ala
His Pro Ile Val Glu Lys225 230 235 240Ile Leu Gln Tyr Arg Glu Leu
Ala Lys Leu Lys Gly Thr Tyr Ile Asp 245 250 255Leu Leu Pro Ala Leu
Val His Pro Arg Thr Gly Arg Leu His Thr Arg 260 265 270Phe Asn Gln
Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser Asp Pro 275 280 285Asn
Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg Ile Arg 290 295
300Arg Ala Phe Val Ala Glu Glu Gly Tyr Leu Leu Val Ala Leu Asp
Tyr305 310 315 320Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser
Gly Asp Glu Asn 325 330 335Leu Ile Arg Val Phe Gln Glu Gly Arg Asp
Ile His Thr Gln Thr Ala 340 345 350Ser Trp Met Phe Gly Leu Pro Ala
Glu Ala Ile Asp Pro Leu Arg Arg 355 360 365Arg Ala Ala Lys Thr Ile
Asn Phe Gly Val Leu Tyr Gly Met Ser Ala 370 375 380His Arg Leu Ser
Gln Glu Leu Gly Ile Pro Tyr Glu Glu Ala Val Ala385 390 395 400Phe
Ile Asp Arg Tyr Phe Gln Ser Tyr Pro Lys Val Lys Ala Trp Ile 405 410
415Glu Arg Thr Leu Glu Glu Gly Arg Gln Arg Gly Tyr Val Glu Thr Leu
420 425 430Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg Val
Lys Ser 435 440 445Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met
Pro Val Gln Gly 450 455 460Thr Ala Ala Asp Leu Met Lys Leu Ala Met
Val Arg Leu Phe Pro Arg465 470 475 480Leu Pro Glu Val Gly Ala Arg
Met Leu Leu Gln Val His Asp Glu Leu 485 490 495Leu Leu Glu Ala Pro
Lys Glu Arg Ala Glu Glu Ala Ala Ala Leu Ala 500 505 510Lys Glu Val
Met Glu Gly Val Trp Pro Leu Ala Val Pro Leu Glu Val 515 520 525Glu
Val Gly Ile Gly Glu Asp Trp Leu Ser Ala Lys Gly 530 535
54057541PRTThermus brockianus 57Gln Ala Ala Glu Glu Ala Pro Trp Pro
Pro Pro Glu Gly Ala Phe Leu1 5 10 15Gly Phe Arg Leu Ser Arg Pro Glu
Pro Met Trp Ala Glu Leu Leu Ser 20 25 30Leu Ala Ala Ser Ala Lys Gly
Arg Val Tyr Arg Ala Glu Ala Pro His 35 40 45Lys Ala Leu Ser Asp Leu
Lys Glu Ile Arg Gly Leu Leu Ala Lys Asp 50 55 60Leu Ala Val Leu Ala
Leu Arg Glu Gly Leu Gly Leu Pro Pro Thr Asp65 70 75 80Asp Pro Met
Leu Leu Ala Tyr Leu Leu Asp Pro Ser Asn Thr Thr Pro 85 90 95Glu Gly
Val Ala Arg Arg Tyr Gly Gly Glu Trp Thr Glu Glu Ala Gly 100 105
110Glu Arg Ala Leu Leu Ala Glu Arg Leu Tyr Glu Asn Leu Leu Ser Arg
115 120 125Leu Lys Gly Glu Glu Lys Leu Leu Trp Leu Tyr Glu Glu Val
Glu Lys 130 135 140Pro Leu Ser Arg Val Leu Ala His Met Glu Ala Thr
Gly Val Arg Leu145 150 155 160Asp Val Pro Tyr Leu Arg Ala Leu Ser
Leu Glu Val Ala Ala Glu Met 165 170 175Gly Arg Leu Glu Glu Glu Val
Phe Arg Leu Ala Gly His Pro Phe Asn 180 185 190Leu Asn Ser Arg Asp
Gln Leu Glu Arg Val Leu Phe Asp Glu Leu Gly 195 200 205Leu Pro Pro
Ile Gly Lys Thr Glu Lys Thr Gly Lys Arg Ser Thr Ser 210 215 220Ala
Ala Val Leu Glu Ala Leu Arg Glu Ala His Pro Ile Val Glu Lys225 230
235 240Ile Leu Gln Tyr Arg Glu Leu Ala Lys Leu Lys Gly Thr Tyr Ile
Asp 245 250 255Pro Leu Pro Ala Leu Val His Pro Arg Thr Gly Arg Leu
His Thr Arg 260 265 270Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu
Ser Ser Ser Asp Pro 275 280 285Asn Leu Gln Asn Ile Pro Val Arg Thr
Pro Leu Gly Gln Arg Ile Arg 290 295 300Arg Ala Phe Val Ala Glu Glu
Gly Tyr Leu Leu Val Ala Leu Asp Tyr305 310 315 320Ser Gln Ile Glu
Leu Arg Val Leu Ala His Leu Ser Gly Asp Glu Asn 325 330 335Leu Ile
Arg Val Phe Gln Glu Gly Arg Asp Ile His Thr Gln Thr Ala 340 345
350Ser Trp Met Phe Gly Leu Pro Ala Glu Ala Ile Asp Pro Leu Arg Arg
355 360 365Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu Tyr Gly Met
Ser Ala 370 375 380His Arg Leu Ser Gln Glu Leu Gly Ile Pro Tyr Glu
Glu Ala Val Ala385 390 395 400Phe Ile Asp Arg Tyr Phe Gln Ser Tyr
Pro Lys Val Lys Ala Trp Ile 405 410 415Glu Arg Thr Leu Glu Glu Gly
Arg Gln Arg Gly Tyr Val Glu Thr Leu 420 425 430Phe Gly Arg Arg Arg
Tyr Val Pro Asp Leu Asn Ala Arg Val Lys Ser 435 440 445Val Arg Glu
Ala Ala Glu Arg Met Ala Phe Asn Met Pro Val Gln Gly 450 455 460Thr
Ala Ala Asp Leu Met Lys Leu Ala Met Val Arg Leu Phe Pro Arg465 470
475 480Leu Pro Glu Val Gly Ala Arg Met Leu Leu Gln Val His Asp Glu
Leu 485 490 495Leu Leu Glu Ala Pro Lys Glu Arg Ala Glu Glu Ala Ala
Ala Leu Ala 500 505 510Lys Glu Val Met Glu Gly Val Trp Pro Leu Ala
Val Pro Leu Glu Val 515 520 525Glu Val Gly Ile Gly Glu Asp Trp Leu
Ser Ala Lys Gly 530 535 5405818DNAThermus brockianus 58ggccaccacc
tggcctac 1859782PRTThermus aquaticus 59Met Arg Gly Met Leu Pro Leu
Phe Glu Pro Lys Gly Arg Val Leu Leu1 5 10 15Val Asp Gly His His Leu
Ala Tyr Arg Thr Phe His Ala Leu Lys Gly 20 25 30Leu Thr Thr Ser Arg
Gly Glu Pro Val Gln Ala Val Tyr Gly Phe Ala 35 40 45Lys Ser Leu Leu
Lys Ala Leu Lys Glu Asp Gly Asp Ala Val Ile Val 50 55 60Val Phe Asp
Ala Lys Ala Pro Ser Phe Arg His Glu Ala Tyr Gly Gly65 70 75 80Tyr
Lys Ala Gly Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu 85 90
95Ala Leu Ile Lys Glu Leu Val Asp Leu Leu Gly Leu Ala Arg Leu Glu
100 105 110Val Pro Gly Tyr Glu Ala Asp Asp Val Leu Ala Ser Leu Ala
Lys Lys 115 120 125Ala Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr
Ala Asp Lys Asp 130 135 140Leu Tyr Gln Leu Leu Ser Asp Arg Ile His
Val Leu His Pro Glu Gly145 150 155 160Tyr Leu Ile Thr Pro Ala Trp
Leu Trp Glu Lys Tyr Gly Leu Arg Pro 165 170 175Asp Gln Trp Ala Asp
Tyr Arg Ala Leu Thr Gly Asp Glu Ser Asp Asn 180 185 190Leu Pro Gly
Val Lys Gly Ile Gly Glu Lys Thr Ala Arg Lys Leu Leu 195 200 205Glu
Glu Trp Gly Ser Leu Glu Ala Leu Leu Lys Asn Leu Asp Arg Leu 210 215
220Lys Pro Ala Ile Arg Glu Lys Ile Leu Ala His Met Asp Asp Leu
Lys225 230 235 240Leu Ser Trp Asp Leu Ala Lys Val Arg Thr Asp Leu
Pro Leu Glu Val 245 250 255Asp Phe Ala Lys Arg Arg Glu Pro Asp Arg
Glu Arg Leu Arg Ala Phe 260 265 270Leu Glu Arg Leu Glu Phe Gly Ser
Leu Leu His Glu Phe Gly Leu Leu 275 280 285Glu Ser Pro Lys Ala Leu
Glu Glu Ala Pro Trp Pro Pro Pro Glu Gly 290 295 300Ala Phe Val Gly
Phe Val Leu Ser Arg Lys Glu Pro Met Trp Ala Asp305 310 315 320Leu
Leu Ala Leu Ala Ala Ala Arg Gly Gly Arg Val His Arg Ala Pro 325 330
335Glu Pro Tyr Lys Ala Leu Arg Asp Leu Lys Glu Ala Arg Gly Leu Leu
340 345 350Ala Lys Asp Leu Ser Val Leu Ala Leu Arg Glu Gly Leu Gly
Leu Pro 355 360 365Pro Gly Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu
Asp Pro Ser Asn 370 375 380Thr Thr Pro Glu Gly Val Ala Arg Arg Tyr
Gly Gly Glu Trp Thr Glu385 390 395 400Glu Ala Gly Glu Arg Ala Ala
Leu Ser Glu Arg Leu Phe Ala Asn Leu 405 410 415Trp Gly Arg Leu Glu
Gly Glu Glu Arg Leu Leu Trp Leu Tyr Arg Glu 420 425 430Val Glu Arg
Pro Leu Ser Ala Val Leu Ala His Met Glu Ala Thr Gly 435 440 445Val
Arg Leu Asp Val Ala Tyr Leu Arg Ala Leu Ser Leu Glu Val Ala 450 455
460Glu Glu Ile Ala Arg Leu Glu Ala Glu Val Phe Arg Leu Ala Gly
His465 470 475 480Pro Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg
Val Leu Phe Asp 485 490 495Glu Leu Gly Leu Pro Ala Ile Gly Lys Thr
Glu Lys Thr Gly Lys Arg 500 505 510Ser Thr Ser Ala Ala Val Leu Glu
Ala Leu Arg Glu Ala His Pro Ile 515 520 525Val Glu Lys Ile Leu Gln
Tyr Arg Glu Leu Thr Lys Leu Lys Ser Thr 530 535 540Tyr Ile Asp Pro
Leu Pro Asp Leu Ile His Pro Arg Thr Gly Arg Leu545 550 555 560His
Thr Arg Phe Asn Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser 565 570
575Ser Asp Pro Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln
580 585 590Arg Ile Arg Arg Phe Gly Val Pro Arg Glu Ala Val Asp Pro
Leu Met 595 600 605Arg Arg Ala Ala Lys Thr Ile Asn Phe Gly Val Leu
Tyr Gly Met Ser 610 615 620Ala His Arg Leu Ser Gln Glu Leu Ala Ile
Pro Tyr Glu Glu Ala Gln625 630 635 640Ala Phe Ile Glu Arg Tyr Phe
Gln Ser Phe Pro Lys Val Arg Ala Trp 645 650 655Ile Glu Lys Thr Leu
Glu Glu Gly Arg Arg Arg Gly Tyr Val Glu Thr 660 665 670Leu Phe Gly
Arg Arg Arg Tyr Val Pro Asp Leu Glu Ala Arg Val Lys 675 680 685Ser
Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro Val Gln 690 695
700Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val Lys Leu Phe
Pro705 710 715 720Arg Leu Glu Glu Met Gly Ala Arg Met Leu Leu Gln
Val His Asp Glu 725 730 735Leu Val Leu Glu Ala Pro Lys Glu Arg Ala
Glu Ala Val Ala Arg Leu 740 745 750Ala Lys Glu Val Met Glu Gly Val
Tyr Pro Leu Ala Val Pro Leu Glu 755 760 765Val Glu Val Gly Ile Gly
Glu Asp Trp Leu Ser Ala Lys Glu 770 775 78060834PRTThermus
thermophilus 60Met Glu Ala Met Leu Pro Leu Phe Glu Pro Lys Gly Arg
Val Leu Leu1 5 10 15Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Phe
Ala Leu Lys Gly 20 25 30Leu Thr Thr Ser Arg Gly Glu Pro Val Gln Ala
Val Tyr Gly Phe Ala 35 40 45Lys Ser Leu Leu Lys Ala Leu Lys Glu Asp
Gly Tyr Lys Ala Val Phe 50 55 60Val Val Phe Asp Ala Lys Ala Pro Ser
Phe Arg His Glu Ala Tyr Glu65 70 75 80Ala Tyr Lys Ala Gly Arg Ala
Pro Thr Pro Glu Asp Phe Pro Arg Gln 85 90 95Leu Ala Leu Ile Lys Glu Leu Val Asp
Leu Leu Gly Phe Thr Arg Leu 100 105 110Glu Val Pro Gly Tyr Glu Ala
Asp Asp Val Leu Ala Thr Leu Ala Lys 115 120 125Lys Ala Glu Lys Glu
Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg 130 135 140Asp Leu Tyr
Gln Leu Val Ser Asp Arg Val Ala Val Leu His Pro Glu145 150 155
160Gly His Leu Ile Thr Pro Glu Trp Leu Trp Glu Lys Tyr Gly Leu Arg
165 170 175Pro Glu Gln Trp Val Asp Phe Arg Ala Leu Val Gly Asp Pro
Ser Asp 180 185 190Asn Leu Pro Gly Val Lys Gly Ile Gly Glu Lys Thr
Ala Leu Lys Leu 195 200 205Leu Lys Glu Trp Gly Ser Leu Glu Asn Leu
Leu Lys Asn Leu Asp Arg 210 215 220Val Lys Pro Glu Asn Val Arg Glu
Lys Ile Lys Ala His Leu Glu Asp225 230 235 240Leu Arg Leu Ser Leu
Glu Leu Ser Arg Val Arg Thr Asp Leu Pro Leu 245 250 255Glu Val Asp
Leu Ala Gln Gly Arg Glu Pro Asp Arg Glu Gly Leu Arg 260 265 270Ala
Phe Leu Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu Phe Gly 275 280
285Leu Leu Glu Ala Pro Ala Pro Leu Glu Glu Ala Pro Trp Pro Pro Pro
290 295 300Glu Gly Ala Phe Val Gly Phe Val Leu Ser Arg Pro Glu Pro
Met Trp305 310 315 320Ala Glu Leu Lys Ala Leu Ala Ala Cys Arg Asp
Gly Arg Val His Arg 325 330 335Ala Ala Asp Pro Leu Ala Gly Leu Lys
Asp Leu Lys Glu Val Arg Gly 340 345 350Leu Leu Ala Lys Asp Leu Ala
Val Leu Ala Ser Arg Glu Gly Leu Asp 355 360 365Leu Val Pro Gly Asp
Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro 370 375 380Ser Asn Thr
Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu Trp385 390 395
400Thr Glu Asp Ala Ala His Arg Ala Leu Leu Ser Glu Arg Leu His Arg
405 410 415Asn Leu Leu Lys Arg Leu Glu Gly Glu Glu Lys Leu Leu Trp
Leu Tyr 420 425 430His Glu Val Glu Lys Pro Leu Ser Arg Val Leu Ala
His Met Glu Ala 435 440 445Thr Gly Val Arg Leu Asp Val Ala Tyr Leu
Gln Ala Leu Ser Leu Glu 450 455 460Leu Ala Glu Glu Ile Arg Arg Leu
Glu Glu Glu Val Phe Arg Leu Ala465 470 475 480Gly His Pro Phe Asn
Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu 485 490 495Phe Asp Glu
Leu Arg Leu Pro Ala Leu Gly Lys Thr Gln Lys Thr Gly 500 505 510Lys
Arg Ser Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu Ala His 515 520
525Pro Ile Val Glu Lys Ile Leu Gln His Arg Glu Leu Thr Lys Leu Lys
530 535 540Asn Thr Tyr Val Asp Pro Leu Pro Ser Leu Val His Pro Arg
Thr Gly545 550 555 560Arg Leu His Thr Arg Phe Asn Gln Thr Ala Thr
Ala Thr Gly Arg Leu 565 570 575Ser Ser Ser Asp Pro Asn Leu Gln Asn
Ile Pro Val Arg Thr Pro Leu 580 585 590Gly Gln Arg Ile Arg Arg Ala
Phe Val Ala Glu Ala Gly Trp Ala Leu 595 600 605Val Ala Leu Asp Tyr
Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu 610 615 620Ser Gly Asp
Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Lys Asp Ile625 630 635
640His Thr Gln Thr Ala Ser Trp Met Phe Gly Val Pro Pro Glu Ala Val
645 650 655Asp Pro Leu Met Arg Arg Ala Ala Lys Thr Val Asn Phe Gly
Val Leu 660 665 670Tyr Gly Met Ser Ala His Arg Leu Ser Gln Glu Leu
Ala Ile Pro Tyr 675 680 685Glu Glu Ala Val Ala Phe Ile Glu Arg Tyr
Phe Gln Ser Phe Pro Lys 690 695 700Val Arg Ala Trp Ile Glu Lys Thr
Leu Glu Glu Gly Arg Lys Arg Gly705 710 715 720Tyr Val Glu Thr Leu
Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn 725 730 735Ala Arg Val
Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn 740 745 750Met
Pro Val Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala Met Val 755 760
765Lys Leu Phe Pro Arg Leu Arg Glu Met Gly Ala Arg Met Leu Leu Gln
770 775 780Val His Asp Glu Leu Leu Leu Glu Ala Pro Gln Ala Arg Ala
Glu Glu785 790 795 800Val Ala Ala Leu Ala Lys Glu Ala Met Glu Lys
Ala Tyr Pro Leu Ala 805 810 815Val Pro Leu Glu Val Glu Val Gly Met
Gly Glu Asp Trp Leu Ser Ala 820 825 830Lys Gly61833PRTThermus
filiformis 61Met Thr Pro Leu Phe Asp Leu Glu Glu Pro Pro Lys Arg
Val Leu Leu1 5 10 15Val Asp Gly His His Leu Ala Tyr Arg Thr Phe Tyr
Ala Leu Ser Leu 20 25 30Thr Thr Ser Arg Gly Glu Pro Val Gln Met Val
Tyr Gly Phe Ala Arg 35 40 45Ser Leu Leu Lys Ala Leu Lys Glu Asp Gly
Gln Ala Val Val Val Val 50 55 60Phe Asp Ala Lys Ala Pro Ser Phe Arg
His Glu Ala Tyr Glu Ala Tyr65 70 75 80Lys Ala Gly Arg Ala Pro Thr
Pro Glu Asp Phe Pro Arg Gln Leu Ala 85 90 95Leu Val Lys Arg Leu Val
Asp Leu Leu Gly Leu Val Arg Leu Glu Ala 100 105 110Pro Gly Tyr Glu
Ala Asp Asp Val Leu Gly Thr Leu Ala Lys Lys Ala 115 120 125Glu Arg
Glu Gly Met Glu Val Arg Ile Leu Thr Gly Asp Arg Asp Phe 130 135
140Phe Gln Leu Leu Ser Glu Lys Val Ser Val Leu Leu Pro Asp Gly
Thr145 150 155 160Leu Val Thr Pro Lys Asp Val Gln Glu Lys Tyr Gly
Val Pro Pro Glu 165 170 175Arg Trp Val Asp Phe Arg Ala Leu Thr Gly
Asp Arg Ser Asp Asn Ile 180 185 190Pro Gly Val Ala Gly Ile Gly Glu
Lys Thr Ala Leu Arg Leu Leu Ala 195 200 205Glu Trp Gly Ser Val Glu
Asn Leu Leu Lys Asn Leu Asp Arg Val Lys 210 215 220Pro Asp Ser Leu
Arg Arg Lys Ile Glu Ala His Leu Glu Asp Leu His225 230 235 240Leu
Ser Leu Asp Leu Ala Arg Ile Arg Thr Asp Leu Pro Leu Glu Val 245 250
255Asp Phe Lys Ala Leu Arg Arg Arg Thr Pro Asp Leu Glu Gly Leu Arg
260 265 270Ala Phe Leu Glu Glu Leu Glu Phe Gly Ser Leu Leu His Glu
Phe Gly 275 280 285Leu Leu Gly Gly Glu Lys Pro Arg Glu Glu Ala Pro
Trp Pro Pro Pro 290 295 300Glu Gly Ala Phe Val Gly Phe Leu Leu Ser
Arg Lys Glu Pro Met Trp305 310 315 320Ala Glu Leu Leu Ala Leu Ala
Ala Ala Ser Glu Gly Arg Val His Arg 325 330 335Ala Thr Ser Pro Val
Glu Ala Leu Ala Asp Leu Lys Glu Ala Arg Gly 340 345 350Phe Leu Ala
Lys Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Val Ala 355 360 365Leu
Asp Pro Thr Asp Asp Pro Leu Leu Val Ala Tyr Leu Leu Asp Pro 370 375
380Ala Asn Thr His Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu
Phe385 390 395 400Thr Glu Asp Ala Ala Glu Arg Ala Leu Leu Ser Glu
Arg Leu Phe Gln 405 410 415Asn Leu Phe Pro Arg Leu Ser Glu Lys Leu
Leu Trp Leu Tyr Gln Glu 420 425 430Val Glu Arg Pro Leu Ser Arg Val
Leu Ala His Met Glu Ala Arg Gly 435 440 445Val Arg Leu Asp Val Pro
Leu Leu Glu Ala Leu Ser Phe Glu Leu Glu 450 455 460Lys Glu Met Glu
Arg Leu Glu Gly Glu Val Phe Arg Leu Ala Gly His465 470 475 480Pro
Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp 485 490
495Glu Leu Gly Leu Thr Pro Val Gly Arg Thr Glu Lys Thr Gly Lys Arg
500 505 510Ser Thr Ala Gln Gly Ala Leu Glu Ala Leu Arg Gly Ala His
Pro Ile 515 520 525Val Glu Leu Ile Leu Gln Tyr Arg Glu Leu Ser Lys
Leu Lys Ser Thr 530 535 540Tyr Leu Asp Pro Leu Pro Arg Leu Val His
Pro Arg Thr Gly Arg Leu545 550 555 560His Thr Arg Phe Asn Gln Thr
Ala Thr Ala Thr Gly Arg Leu Ser Ser 565 570 575Ser Asp Pro Asn Leu
Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln 580 585 590Arg Ile Arg
Lys Ala Phe Val Ala Glu Glu Gly Trp Leu Leu Leu Ala 595 600 605Ala
Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser Gly 610 615
620Asp Glu Asn Leu Lys Arg Val Phe Arg Glu Gly Lys Asp Ile His
Thr625 630 635 640Glu Thr Ala Ala Trp Met Phe Gly Leu Asp Pro Ala
Leu Val Asp Pro 645 650 655Lys Met Arg Arg Ala Ala Lys Thr Val Asn
Phe Gly Val Leu Tyr Gly 660 665 670Met Ser Ala His Arg Leu Ser Gln
Glu Leu Gly Ile Asp Tyr Lys Glu 675 680 685Ala Glu Ala Phe Ile Glu
Arg Tyr Phe Gln Ser Phe Pro Lys Val Arg 690 695 700Ala Trp Ile Glu
Arg Thr Leu Glu Glu Gly Arg Thr Arg Gly Tyr Val705 710 715 720Glu
Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Ala Ser Arg 725 730
735Val Arg Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn Met Pro
740 745 750Val Gln Gly Thr Ala Ala Asp Leu Met Lys Ile Ala Met Val
Lys Leu 755 760 765Phe Pro Arg Leu Lys Pro Leu Gly Ala His Leu Leu
Leu Gln Val His 770 775 780Asp Glu Leu Val Leu Glu Val Pro Glu Asp
Arg Ala Glu Glu Ala Lys785 790 795 800Ala Leu Val Lys Glu Val Met
Glu Asn Ala Tyr Pro Leu Asp Val Pro 805 810 815Leu Glu Val Glu Val
Gly Val Gly Arg Asp Trp Leu Glu Ala Lys Gln 820 825
830Asp
62 831 PRT Thermus flavus
62
Met Ala Met Leu Pro Leu Phe Glu Pro
Lys Gly Arg Val Leu Leu Val1 5 10 15Asp Gly His His Leu Ala Tyr Arg
Thr Phe Phe Ala Leu Lys Gly Leu 20 25 30Thr Thr Ser Arg Gly Glu Pro
Val Gln Ala Val Tyr Gly Phe Ala Lys 35 40 45Ser Leu Leu Lys Ala Leu
Lys Glu Asp Gly Asp Val Val Val Val Val 50 55 60Phe Asp Ala Lys Ala
Pro Ser Phe Arg His Glu Ala Tyr Glu Ala Tyr65 70 75 80Lys Ala Gly
Arg Ala Pro Thr Pro Glu Asp Phe Pro Arg Gln Leu Ala 85 90 95Leu Ile
Lys Glu Leu Val Asp Leu Leu Gly Leu Val Arg Leu Glu Val 100 105
110Pro Gly Phe Glu Ala Asp Asp Val Leu Ala Thr Leu Ala Lys Arg Ala
115 120 125Glu Lys Glu Gly Tyr Glu Val Arg Ile Leu Thr Ala Asp Arg
Asp Leu 130 135 140Tyr Gln Leu Leu Ser Glu Arg Ile Ala Ile Leu His
Pro Glu Gly Tyr145 150 155 160Leu Ile Thr Pro Ala Trp Leu Tyr Glu
Lys Tyr Gly Leu Arg Pro Glu 165 170 175Gln Trp Val Asp Tyr Arg Ala
Leu Ala Gly Asp Pro Ser Asp Asn Ile 180 185 190Pro Gly Val Lys Gly
Ile Gly Glu Lys Thr Ala Gln Arg Leu Ile Arg 195 200 205Glu Trp Gly
Ser Leu Glu Asn Leu Phe Gln His Leu Asp Gln Val Lys 210 215 220Pro
Ser Leu Arg Glu Lys Leu Gln Ala Gly Met Glu Ala Leu Ala Leu225 230
235 240Ser Arg Lys Leu Ser Gln Val His Thr Asp Leu Pro Leu Glu Val
Asp 245 250 255Phe Gly Arg Arg Arg Thr Pro Asn Leu Glu Gly Leu Arg
Ala Phe Leu 260 265 270Glu Arg Leu Glu Phe Gly Ser Leu Leu His Glu
Phe Gly Leu Leu Glu 275 280 285Gly Pro Lys Ala Ala Glu Glu Ala Pro
Trp Pro Pro Pro Glu Gly Ala 290 295 300Phe Leu Gly Phe Ser Phe Ser
Arg Pro Glu Pro Met Trp Ala Glu Leu305 310 315 320Leu Ala Leu Ala
Gly Ala Trp Glu Gly Arg Leu His Arg Ala Gln Asp 325 330 335Pro Leu
Arg Gly Leu Arg Asp Leu Lys Gly Val Arg Gly Ile Leu Ala 340 345
350Lys Asp Leu Ala Val Leu Ala Leu Arg Glu Gly Leu Asp Leu Phe Pro
355 360 365Glu Asp Asp Pro Met Leu Leu Ala Tyr Leu Leu Asp Pro Ser
Asn Thr 370 375 380Thr Pro Glu Gly Val Ala Arg Arg Tyr Gly Gly Glu
Trp Thr Glu Asp385 390 395 400Ala Gly Glu Arg Ala Leu Leu Ala Glu
Arg Leu Phe Gln Thr Leu Lys 405 410 415Glu Arg Leu Lys Gly Glu Glu
Arg Leu Leu Trp Leu Tyr Glu Glu Val 420 425 430Glu Lys Pro Leu Ser
Arg Val Leu Ala Arg Met Glu Ala Thr Gly Val 435 440 445Arg Leu Asp
Val Ala Tyr Leu Gln Ala Leu Ser Leu Glu Val Glu Ala 450 455 460Glu
Val Arg Gln Leu Glu Glu Glu Val Phe Arg Leu Ala Gly His Pro465 470
475 480Phe Asn Leu Asn Ser Arg Asp Gln Leu Glu Arg Val Leu Phe Asp
Glu 485 490 495Leu Gly Leu Pro Ala Ile Gly Lys Thr Glu Lys Thr Gly
Lys Arg Ser 500 505 510Thr Ser Ala Ala Val Leu Glu Ala Leu Arg Glu
Ala His Pro Ile Val 515 520 525Asp Arg Ile Leu Gln Tyr Arg Glu Leu
Thr Lys Leu Lys Asn Thr Tyr 530 535 540Ile Asp Pro Leu Pro Ala Leu
Val His Pro Lys Thr Gly Arg Leu His545 550 555 560Thr Arg Phe Asn
Gln Thr Ala Thr Ala Thr Gly Arg Leu Ser Ser Ser 565 570 575Asp Pro
Asn Leu Gln Asn Ile Pro Val Arg Thr Pro Leu Gly Gln Arg 580 585
590Ile Arg Arg Ala Phe Val Ala Glu Glu Gly Trp Val Leu Val Val Leu
595 600 605Asp Tyr Ser Gln Ile Glu Leu Arg Val Leu Ala His Leu Ser
Gly Asp 610 615 620Glu Asn Leu Ile Arg Val Phe Gln Glu Gly Arg Asp
Ile His Thr Gln625 630 635 640Thr Ala Ser Trp Met Phe Gly Val Ser
Pro Glu Gly Val Asp Pro Leu 645 650 655Met Arg Arg Ala Ala Lys Thr
Ile Asn Phe Gly Val Leu Tyr Gly Met 660 665 670Ser Ala His Arg Leu
Ser Gly Glu Leu Ser Ile Pro Tyr Glu Glu Ala 675 680 685Val Ala Phe
Ile Glu Arg Tyr Phe Gln Ser Tyr Pro Lys Val Arg Ala 690 695 700Trp
Ile Glu Gly Thr Leu Glu Glu Gly Arg Arg Arg Gly Tyr Val Glu705 710
715 720Thr Leu Phe Gly Arg Arg Arg Tyr Val Pro Asp Leu Asn Ala Arg
Val 725 730 735Lys Ser Val Arg Glu Ala Ala Glu Arg Met Ala Phe Asn
Met Pro Val 740 745 750Gln Gly Thr Ala Ala Asp Leu Met Lys Leu Ala
Met Val Arg Leu Phe 755 760 765Pro Arg Leu Gln Glu Leu Gly Ala Arg
Met Leu Leu Gln Val His Asp 770 775 780Glu Leu Val Leu Glu Ala Pro
Lys Asp Arg Ala Glu Arg Val Ala Ala785 790 795 800Leu Ala Lys Glu
Val Met Glu Gly Val Trp Pro Leu Gln Val Pro Leu 805 810 815Glu Val
Glu Val Gly Leu Gly Glu Asp Trp Leu Ser Ala Lys Glu 820 825 830
* * * * *