Polymerases Smith; Geoffrey Paul ; et al. [Illumina Cambridge Limited]

Patent Application Summary

U.S. patent application number 14/956231 for polymerases was filed with the patent office on 2015-12-01 and published as 20160115461 on 2016-04-28. This patent application is currently assigned to Illumina Cambridge Limited. The applicant listed for this patent is Illumina Cambridge Limited. Invention is credited to Shankar Balasubramanian, Tobias William Barr Ost, Roberto Rigatti, Raquel Maria Sanches-Kuiper, Geoffrey Paul Smith.

Application Number	20160115461 14/956231
Family ID	36645775
Filed Date	2015-12-01

United States Patent Application	20160115461
Kind Code	A1
Smith; Geoffrey Paul ; et al.	April 28, 2016

Polymerases

Abstract

Modified DNA polymerases have an affinity for DNA such that the polymerase has an ability to incorporate one or more nucleotides into a plurality of separate DNA templates in each reaction cycle. The polymerases are capable of forming an increased number of productive polymerase-DNA complexes in each reaction cycle. The modified polymerases may be used in a number of DNA sequencing applications, especially in the context of clustered arrays.

Inventors:

Smith; Geoffrey Paul; (Nr Saffron Walden, GB) ; Rigatti; Roberto; (Nr Saffron Walden, GB) ; Ost; Tobias William Barr; (Nr Saffron Walden, GB) ; Balasubramanian; Shankar; (Nr Saffron Walden, GB) ; Sanches-Kuiper; Raquel Maria; (Nr Saffron Walden, GB)

Applicant:

Name	City	State	Country	Type
Illumina Cambridge Limited	Nr Saffron Walden		GB

Assignee:

Illumina Cambridge Limited
Nr Saffron Walden
GB

Family ID:

36645775

Appl. No.:

14/956231

Filed:

December 1, 2015

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
14137434	Dec 20, 2013	9273352
14956231
11431939	May 10, 2006	8623628
14137434
60757997	Jan 11, 2006

Current U.S. Class:	435/6.11 ; 435/194; 435/6.1; 435/6.12
Current CPC Class:	C12Q 1/686 20130101; C12N 9/1252 20130101; C12Y 207/07007 20130101
International Class:	C12N 9/12 20060101 C12N009/12

Foreign Application Data

Date	Code	Application Number
May 10, 2005	GB	0509508.8

Claims

1-58. (canceled)

59. A polymerase comprising the amino acid sequence Ile-Gly-Asp-Arg-Ala-Ile-Pro at the positions functionally equivalent to residues 710-716 of SEQ ID NO:22, having a substitution mutation at the position functionally equivalent to Arg713 to a nonpolar amino acid, whereby the polymerase has a reduced affinity for DNA.

60. The polymerase of claim 59, wherein the polymerase is capable of incorporating a nucleotide or nucleotides into a plurality of separate DNA templates in each reaction cycle as compared to a control polymerase, wherein the control polymerase is the unaltered polymerase and is capable of incorporating a nucleotide or nucleotides into a single DNA template in each reaction cycle.

61. The polymerase of claim 59, wherein the polymerase is capable of forming an increased number of productive polymerase-DNA complexes in each reaction cycle as compared to a control polymerase, wherein the control polymerase is the unaltered polymerase.

62. The polymerase of claim 59, wherein the affinity of the polymerase for nucleotides and the fidelity of the polymerase is substantially unaffected by the substitution mutation.

63. The polymerase according to claim 59, wherein the substitution mutation converts the position functionally equivalent to Arg713 to glycine (G) or methionine (M).

64. The polymerase according to claim 59, wherein the substitution mutation converts the position functionally equivalent to Arg713 to alanine (A).

65. A kit for performing a nucleotide incorporation reaction comprising: a polymerase as defined in claim 59, 63, or 64, and a nucleotide solution.

66. The kit of claim 65, wherein the nucleotide solution comprises labelled nucleotides.

67. The kit of claim 65, wherein the nucleotides comprise synthetic nucleotides.

68. The kit of claim 65, wherein the nucleotides comprise modified nucleotides.

69. The kit of claim 68, wherein the modified nucleotides have been modified at the 3' sugar hydroxyl such that the substituent is larger in size than the naturally occurring 3' hydroxyl group.

70. The kit according to claim 69, wherein the modified nucleotides are a modified nucleotide or nucleoside molecule comprising a purine or pyrimidine base and a ribose or deoxyribose sugar moiety having a removable 3'-OH blocking group covalently attached thereto, such that the 3' carbon atom has attached a group of the structure --O--Z wherein Z is any of --C(R')₂--O--R'', --C(R')₂--N(R'')₂, --C(R')₂--N(H)R'', --C(R')₂--S--R'' and --C(R')₂--F, wherein each R'' is or is part of a removable protecting group; each R' is independently a hydrogen atom, an alkyl, substituted alkyl, arylalkyl, alkenyl, alkynyl, aryl, heteroaryl, heterocyclic, acyl, cyano, alkoxy, aryloxy, heteroaryloxy or amido group, or a detectable label attached through a linking group; or (R')₂ represents an alkylidene group of formula =C(R''')₂ wherein each R''' may be the same or different and is selected from the group comprising hydrogen and halogen atoms and alkyl groups; and wherein said molecule may be reacted to yield an intermediate in which each R'' is exchanged for H or, where Z is --C(R')₂--F, the F is exchanged for OH, SH or NH₂, preferably OH, which intermediate dissociates under aqueous conditions to afford a molecule with a free 3'OH; with the proviso that where Z is --C(R')₂--S--R'', both R' groups are not H.

71. The kit according to claim 70, wherein R' of the modified nucleotide or nucleoside is an alkyl or substituted alkyl.

72. The kit according to claim 71, wherein --Z of the modified nucleotide or nucleoside is of formula --C(R')₂--N₃.

73. The kit according to claim 72, wherein Z is an azidomethyl group.

74. The kit according to claim 69, wherein the modified nucleotides are fluorescently labelled to allow their detection.

75. The kit according to claim 69, wherein the modified nucleotides comprise a nucleotide or nucleoside having a base attached to a detectable label via a cleavable linker, wherein the cleavable linker contains a moiety selected from the group consisting of: ##STR00002## wherein X is selected from the group comprising O, S, NH and NQ wherein Q is a C₁₋₁₀ substituted or unsubstituted alkyl group, Y is selected from the group comprising O, S, NH and N(allyl), T is hydrogen or a C₁₋₁₀ substituted or unsubstituted alkyl group and * indicates where the moiety is connected to the remainder of the nucleotide or nucleoside.

76. The kit according to claim 75, wherein the detectable label comprises a fluorescent label.

77. The kit of claim 65 further comprising one or more DNA template molecules and/or primers.

Description

RELATED APPLICATIONS

[0001] The present application claims priority from U.S. Provisional Application 60/757,997, filed Jan. 11, 2006, and Great Britain Provisional Application No. 0509508.8, filed May 10, 2005. Applicants claim the benefits of priority under 35 U.S.C. § 119 as to the United States and Great Britain applications, and the entire disclosures of each of these applications are incorporated herein by reference in their entireties.

FIELD OF THE INVENTION

[0002] The present invention relates to polymerase enzymes and more particularly to modified DNA polymerases having an affinity for DNA such that the polymerase has an ability to incorporate a nucleotide or nucleotides into a plurality of separate DNA templates in each reaction cycle and is capable of forming an increased number of productive polymerase-DNA complexes in each reaction cycle. Also included in the scope of the present invention are methods of using the modified polymerases for DNA sequencing, especially in the context of clustered arrays.

BACKGROUND

[0003] Several publications and patent documents are referenced in this application in order to more fully describe the state of the art to which this invention pertains. The disclosure of each of these publications and documents is incorporated by reference herein.

[0004] The three-dimensional crystal structure of certain DNA polymerases has revealed three separate subdomains, named palm, fingers and thumb (Joyce, C. M. and Steitz, T. A. (1994) Function and structure relationships in DNA polymerases, Annu. Rev. Biochem., 63, 777-822), each having key roles during DNA polymerisation.

[0005] The C terminal thumb subdomain of DNA polymerases has been implicated in DNA binding and processivity (Doublie et al. 1998. Nature 391, 251; Truniger et al. 2004. Nucleic Acids Research 32, 371). Residues in this region of DNA polymerases interact with the primer:template duplex.

[0006] Disruption of the structure of this region, either by the introduction of site-directed mutations or by truncation through deletion of a small number of amino acids, has provided evidence for variants with reduced DNA affinity and processivity without gross changes in other physical properties such as dNTP affinity and nucleotide insertion fidelity (Truniger et al. 2004. Nucleic Acids Research 32, 371; Minnick et al. 1996. J. Biol. Chem., 271, 24954; Polesky et al. 1990. J. Biol. Chem., 265, 14579).

[0007] Polymerases may be separated into two structurally distinct families called family A and family B.

[0008] The C-terminal subdomain of family B polymerases has been poorly studied, but is believed to be involved in DNA binding based primarily on the inspection of the x-ray crystal structure of the closed form (DNA-bound) of polymerase RB69. Mutagenesis studies have been conducted within this thumb domain for two examples of the family B class, namely Phi29 and T4. However, these studies were limited to amino acid deletions of large portions of the domain. The same type of deletion has been carried out for Klenow (a family A polymerase). The performance of the variants in these studies was evaluated in terms of their ability to bind and incorporate dNTPs, the effect the deletion had on fidelity, their affinity for DNA and also their interaction with accessory proteins.

[0009] No studies of the thumb domain of the polymerase from a thermophilic archaeon have previously been carried out.

[0010] The subject matter of the present invention was presented in prior filed U.K. Provisional Application No. 0509508.8 filed May 10, 2005, priority of which is believed to be available under 35 U.S.C. § 119, and the disclosure of which is incorporated herein in its entirety. In said application, the structural aspects of the polymerases and the related materials of the present invention were disclosed as they are herein, and were accompanied by information providing further background and characterization of function, which was also in accordance with the understanding of the inventors at the time. Since filing said application, further study of the enzymes in question has taken place, and additional data illustrating and advancing the understanding of their function and application have resulted, which are now presented herein.

[0011] Accordingly, it is toward the advancement of the understanding and application of the present invention that the present application is directed.

SUMMARY OF THE INVENTION

[0012] The present invention is based upon the realisation that the tight binding of a polymerase to the DNA template is not always an advantageous property. This is particularly the case in the context of sequencing reactions in which only a single nucleotide incorporation event is required in each reaction cycle. Thus, for a polymerase that binds tightly to DNA, the ability of the polymerase to take part in incorporation of nucleotides on multiple DNA strands is restricted compared to a variant polymerase that has a lower affinity for DNA.

[0013] The present inventors have devised a method for sequencing DNA that uses nucleotide analogues bearing modifications at the 3' sugar hydroxyl group which block incorporation of further nucleotides (see WO03/048387, for example, and the citations described therein). The use of nucleotides bearing a 3' block allows successive nucleotides to be incorporated into a polynucleotide chain in a controlled manner. After each nucleotide addition the presence of the 3' block prevents incorporation of a further nucleotide into the chain. Once the nature of the incorporated nucleotide has been determined, the block may be removed, leaving a free 3' hydroxyl group for addition of the next nucleotide.

[0014] In addition, in the context of reactions such as sequencing reactions involving modified nucleotides (as discussed above and in more detail herein below), tight binding of a polymerase may in fact present certain disadvantages in terms of reaction completion. For example, if an inactive polymerase molecule having a tight DNA binding affinity forms a stable complex with a template DNA molecule, no extension is possible from that particular template DNA molecule.

[0015] With this realisation, the present invention provides altered polymerases which have a weaker interaction with template DNA. Thus, the polymerase of the invention has an improved ability to move from one template DNA molecule to another during a reaction cycle. This ability to form an increased number of productive polymerase-DNA complexes has the benefit that levels of reaction completion in reactions involving addition of a single nucleotide in each reaction cycle are much improved.

[0016] Unmodified polymerases tend to bind DNA with high affinity such that the equation:

$$\mathrm{Pol} + \mathrm{DNA} \;\underset{k_d}{\overset{k_a}{\rightleftharpoons}}\; [\mathrm{Pol{:}DNA}]$$

is heavily shifted to favour the [Pol:DNA] complex.

[0017] In contrast, in the present invention, the altered polymerases bind to DNA less well, meaning that the equilibrium position is shifted towards the left hand side.
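
The equilibrium described in paragraphs [0016] and [0017] can be summarised with a dissociation constant. The relation below is the standard textbook definition, not a formula taken from the application. It assumes that k_a and k_d are the association and dissociation rate constants of the scheme above; the symbol K_D is introduced here for illustration.

```latex
K_D \;=\; \frac{k_d}{k_a} \;=\; \frac{[\mathrm{Pol}]\,[\mathrm{DNA}]}{[\mathrm{Pol{:}DNA}]}
```

On this reading, a larger K_D corresponds to weaker binding, that is, an equilibrium position lying further toward free Pol and DNA on the left hand side.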

[0018] Therefore, the invention provides an altered polymerase having reduced affinity for DNA such that the polymerase has an ability to incorporate a nucleotide or nucleotides into a plurality of separate DNA templates in each reaction cycle.

[0019] By "DNA template" is meant any DNA molecule which may be bound by the polymerase and utilised as a template for nucleic acid synthesis.

[0020] "Nucleotide" is defined herein to include both nucleotides and nucleosides. Nucleosides, as for nucleotides, comprise a purine or pyrimidine base linked glycosidically to ribose or deoxyribose, but they lack the phosphate residues which would make them a nucleotide. Synthetic and naturally occurring nucleotides are included within the definition. Labelled nucleotides are included within the definition. The advantageous properties of the polymerases are due to their reduced affinity for the DNA template in combination with a retained affinity and fidelity for the nucleotides which they incorporate.

[0021] In one preferred aspect, an altered polymerase is provided having a reduced affinity for DNA such that the polymerase has an ability to incorporate at least one synthetic nucleotide into a plurality of DNA templates in each reaction cycle. Prior to the present invention, the problem of modifying a polymerase adapted to incorporate non-natural nucleotides, to reduce its DNA affinity whilst retaining its advantageous properties has neither been realised nor addressed.

[0022] In one embodiment, nucleotides comprise dideoxy nucleotide triphosphates (ddNTPs) such as those used in Sanger sequencing reactions. These nucleotides may be labelled, e.g., with any of a mass label, radiolabel or a fluorescent label.

[0023] In a further embodiment, the nucleotides comprise nucleotides which have been modified at the 3' sugar hydroxyl such that the substituent is larger in size than the naturally occurring 3' hydroxyl group, compared to a control polymerase.

[0024] In a preferred embodiment, the nucleotides comprise those having a purine or pyrimidine base and a ribose or deoxyribose sugar moiety having a removable 3'-OH blocking group covalently attached thereto, such that the 3' carbon atom has attached a group of the structure

--O--Z

[0025] wherein Z is any of --C(R')₂--O--R'', --C(R')₂--N(R'')₂, --C(R')₂--N(H)R'', --C(R')₂--S--R'' and --C(R')₂--F,

[0026] wherein each R'' is or is part of a removable protecting group;

[0027] each R' is independently a hydrogen atom, an alkyl, substituted alkyl, arylalkyl, alkenyl, alkynyl, aryl, heteroaryl, heterocyclic, acyl, cyano, alkoxy, aryloxy, heteroaryloxy or amido group, or a detectable label attached through a linking group; or (R')₂ represents an alkylidene group of formula =C(R''')₂ wherein each R''' may be the same or different and is selected from the group comprising hydrogen and halogen atoms and alkyl groups; and

[0028] wherein said molecule may be reacted to yield an intermediate in which each R'' is exchanged for H or, where Z is --C(R')₂--F, the F is exchanged for OH, SH or NH₂, preferably OH, which intermediate dissociates under aqueous conditions to afford a molecule with a free 3'OH;

[0029] with the proviso that where Z is --C(R')₂--S--R'', both R' groups are not H.

[0030] The nucleosides or nucleotides which are incorporated by the polymerases of the present invention according to one embodiment, comprise a purine or pyrimidine base and a ribose or deoxyribose sugar moiety which has a blocking group covalently attached thereto, preferably at the 3'O position, which renders the molecules useful in techniques requiring blocking of the 3'-OH group to prevent incorporation of additional nucleotides, such as for example in sequencing reactions, polynucleotide synthesis, nucleic acid amplification, nucleic acid hybridisation assays, single nucleotide polymorphism studies, and other such techniques.

[0031] Once the blocking group has been removed, it is possible to incorporate another nucleotide to the free 3'-OH group.

[0032] Preferred modified nucleotides are exemplified in International Patent Application publication number WO 2004/018497 in the name of Solexa Limited, which reference is incorporated herein in its entirety.

[0033] In a preferred embodiment the R' group of the modified nucleotide or nucleoside is an alkyl or substituted alkyl. In a further embodiment the --Z group of the modified nucleotide or nucleoside is of formula --C(R')₂--N₃. In a most preferred embodiment the modified nucleotide or nucleoside includes a Z group which is an azido methyl group.

[0034] The preferred polymerases of the invention, as discussed in detail below, are particularly preferred for incorporation of nucleotide analogues wherein Z is an azido methyl group.

[0035] The modified nucleotide can be linked via the base to a detectable label by a desirable linker, which label may be a fluorophore, for example. The detectable label may instead, if desirable, be incorporated into the blocking groups of formula "Z". The linker can be acid labile, photolabile or contain a disulfide linkage. Other linkages, in particular phosphine-cleavable azide-containing linkers, may be employed in the invention as described in greater detail in WO 2004/018497, the contents of which are incorporated herein in their entirety.

[0036] Preferred labels and linkages include those disclosed in WO 03/048387, which is incorporated herein in its entirety.

[0037] In one embodiment the modified nucleotide or nucleoside has a base attached to a detectable label via a cleavable linker, characterised in that the cleavable linker contains a moiety selected from the group consisting of:

##STR00001##

(wherein X is selected from the group comprising O, S, NH and NQ wherein Q is a C₁₋₁₀ substituted or unsubstituted alkyl group, Y is selected from the group comprising O, S, NH and N(allyl), T is hydrogen or a C₁₋₁₀ substituted or unsubstituted alkyl group and * indicates where the moiety is connected to the remainder of the nucleotide or nucleoside).

[0038] In one embodiment the detectable label comprises a fluorescent label. Suitable fluorophores are well known in the art. In a preferred embodiment each different nucleotide type will carry a different fluorescent label. This facilitates the identification and incorporation of a particular nucleotide. Thus, for example modified Adenine, Guanine, Cytosine and Thymine would all have attached a separate fluorophore to allow them to be discriminated from one another readily. Surprisingly, it has been found that the altered polymerases are capable of incorporating modified nucleotide analogues carrying a number of different fluorescent labels. Moreover, the polymerases are capable of incorporating all four bases. These properties provide substantial advantages with regard to the use of the polymerases of the present invention in nucleic acid sequencing protocols.

[0039] As aforesaid, preferred nucleotide analogues include those containing O-azido methyl functionality at the 3' position. It will be appreciated that for other nucleotide analogues the preferred amino acid sequence of the polymerase in the C terminal thumb subdomain region, which contributes significantly to DNA binding, for optimum incorporation may vary. For any given nucleotide analogue, optimum sequence preferences in the C terminal thumb subdomain region (such as at residues Lys 790, 800, 844, 874, 878 and Arg 806 in RB69 and at residues Arg 743, Arg 713 and Lys 705 in 9°N polymerase, as discussed in greater detail below) may be determined by experiment, for example by construction of a library or discrete number of mutants followed by testing of individual variants in an incorporation assay system.

[0040] As aforementioned, the altered polymerases of the invention are capable of improved incorporation of all nucleotides, including a wide range of modified nucleotides having large 3' substituent groups of differing sizes and of varied chemical nature. The advantageous properties of the polymerases are due to their reduced affinity for the DNA template leading to increased dissociation of the polymerase from the DNA without adverse effects on affinity and fidelity for the nucleotides which they incorporate.

[0041] By virtue of the decreased DNA binding affinity of the polymerase of the invention, it is able to incorporate one or more nucleotides into several different DNA molecules in a single reaction cycle. Thus, the overall efficiency of reaction is improved, leading to greater levels of completion.

[0042] By "a reaction cycle" is meant a suitable reaction period to allow the incorporation of nucleotides into the template. Exemplary conditions for a single reaction cycle are one 30 minute, 45.degree. C. incubation period.

[0043] Many polymerisation reactions occur in the presence of an excess of DNA compared to polymerase. The polymerase of the present invention allows such a polymerisation reaction to proceed more effectively since the polymerase can catalyse numerous rounds of incorporation of a nucleotide or nucleotides on separate template DNA molecules. An unaltered polymerase on the other hand, particularly one which binds DNA much more tightly, will not have this ability since it is more likely to only participate in nucleotide incorporation on a single template in each reaction cycle. The polymerase according to the present invention allows high levels of reaction completion under conditions where the concentration of polymerase is limiting with respect to the concentration of DNA. In particular, the polymerase presents improved ability to incorporate one or more nucleotides into separate DNA molecules under conditions wherein the DNA:polymerase ratio is at least about 2:1, 3:1 or 5:1. However, at high concentrations of polymerase, the improvement may be masked.
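
The arithmetic behind paragraph [0043] can be sketched as follows. This is an idealised upper bound under stated assumptions, not a model from the application: it assumes every polymerase molecule is active and that each one extends a fixed number of distinct templates per reaction cycle. The function name and the parameter `templates_per_polymerase` are hypothetical.

```python
def max_fraction_extended(dna_to_pol_ratio: float,
                          templates_per_polymerase: float) -> float:
    """Idealised upper bound on the fraction of templates extended in one cycle.

    dna_to_pol_ratio: number of DNA templates per polymerase molecule
        (for example 2, 3 or 5 for the 2:1, 3:1 and 5:1 ratios in the text).
    templates_per_polymerase: hypothetical number of distinct templates one
        polymerase molecule manages to extend within a single reaction cycle.
    """
    if dna_to_pol_ratio <= 0:
        raise ValueError("dna_to_pol_ratio must be positive")
    if templates_per_polymerase < 0:
        raise ValueError("templates_per_polymerase must not be negative")
    # A polymerase cannot extend more templates than exist, so cap at 1.0.
    return min(1.0, templates_per_polymerase / dna_to_pol_ratio)


if __name__ == "__main__":
    for ratio in (2, 3, 5):
        single = max_fraction_extended(ratio, 1)      # stays on one template
        several = max_fraction_extended(ratio, ratio)  # visits `ratio` templates
        print(f"DNA:polymerase {ratio}:1 -> {single:.2f} vs {several:.2f}")
```

Under these assumptions, a polymerase that acts on a single template per cycle can complete at most one fifth of the templates at a 5:1 ratio, whereas full completion requires each polymerase to act on at least five templates.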

[0044] Thus, an altered polymerase is provided having an affinity for DNA such that the polymerase is capable of forming an increased number of productive polymerase-DNA complexes in each reaction cycle.

[0045] The improved properties of the polymerases of the invention may be compared to a suitable control. "Control polymerase" is defined herein as the polymerase against which the activity of the altered polymerase is compared. The control polymerase is of the same type as the altered polymerase but does not carry the alteration(s) which reduce affinity of the polymerase for DNA. Thus, in a particular embodiment, the control polymerase is a 9°N polymerase and the modified polymerase is the same 9°N polymerase except for the presence of one or more modifications which reduce the affinity of the 9°N polymerase for DNA.

[0046] In one embodiment, the control polymerase is a wild type polymerase which is altered to provide an altered polymerase which can be directly compared with the unaltered polymerase.

[0047] In one embodiment, the control polymerase comprises substitution mutations at positions which are functionally equivalent to Leu408 and Tyr409 and Pro410 in the 9°N DNA polymerase amino acid sequence. Thus, in this embodiment the control polymerase has a substitution mutation at position 408 from leucine to a different amino acid, at position 409 from tyrosine to a different amino acid and at position 410 from proline to a different amino acid or at positions which are functionally equivalent if the polymerase is not a 9°N DNA polymerase. In a preferred embodiment, the control polymerase is a 9°N DNA polymerase comprising the said substitution mutations.

[0048] In another embodiment, the control polymerase comprises substitution mutations which are functionally equivalent to Leu408Tyr and Tyr409Ala and Pro410Val in the 9°N DNA polymerase amino acid sequence. Thus, in this embodiment the control polymerase has a substitution mutation at position 408 from leucine to tyrosine, at position 409 from tyrosine to alanine and at position 410 from proline to valine or at positions which are functionally equivalent if the polymerase is not a 9°N DNA polymerase. In a preferred embodiment, the control polymerase is a 9°N DNA polymerase comprising the above substitution mutations.

[0049] The control polymerase may further comprise a substitution mutation at the position functionally equivalent to Cys223 in the 9°N DNA polymerase amino acid sequence. Thus, in this embodiment the control polymerase has a substitution mutation at position 223 from cysteine to a different amino acid, or at a position which is functionally equivalent if the polymerase is not a 9°N DNA polymerase. In a preferred embodiment, the control polymerase is a 9°N DNA polymerase comprising the said substitution mutation. In another embodiment, the control polymerase comprises the substitution mutation functionally equivalent to Cys223Ser in the 9°N DNA polymerase amino acid sequence. Thus, in this embodiment the control polymerase has a substitution mutation at position 223 from cysteine to serine, or at a position which is functionally equivalent if the polymerase is not a 9°N DNA polymerase. In a preferred embodiment, the control polymerase is a 9°N DNA polymerase comprising the said substitution mutation.

[0050] Preferably, the control polymerase is a 9°N DNA polymerase comprising a combination of the above mentioned mutations.
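
The substitution notation used in the preceding paragraphs (for example Leu408Tyr or Cys223Ser) can be read mechanically. The following is a minimal sketch, not part of the application: the function names, the `offset` parameter and the seven-residue fragment are hypothetical, and the fragment is a made-up toy in which only the residues at positions 408 to 410 (L, Y, P) follow the text.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

MUTATION_PATTERN = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})$")


def parse_mutation(text):
    """Split e.g. 'Leu408Tyr' into ('L', 408, 'Y')."""
    match = MUTATION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a substitution of the form Xxx123Yyy: {text!r}")
    original, position, replacement = match.groups()
    try:
        return THREE_TO_ONE[original], int(position), THREE_TO_ONE[replacement]
    except KeyError as exc:
        raise ValueError(f"unknown amino acid code in {text!r}") from exc


def apply_mutations(sequence, mutations, offset=0):
    """Apply substitutions to a one-letter sequence.

    Positions are 1-based. `offset` is the number of residues that precede
    the supplied fragment in the full-length numbering (0 for a full sequence).
    """
    residues = list(sequence)
    for text in mutations:
        original, position, replacement = parse_mutation(text)
        index = position - 1 - offset
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} lies outside the sequence")
        if residues[index] != original:
            raise ValueError(
                f"expected {original} at position {position}, "
                f"found {residues[index]}"
            )
        residues[index] = replacement
    return "".join(residues)


if __name__ == "__main__":
    # Hypothetical toy fragment numbered 406-412; not the real 9°N sequence.
    toy_fragment = "GSLYPSI"
    print(apply_mutations(toy_fragment,
                          ["Leu408Tyr", "Tyr409Ala", "Pro410Val"],
                          offset=405))
```

The check against the expected original residue guards against applying a mutation under the wrong numbering, which matters when positions are carried over from a different polymerase.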

[0051] The altered polymerase will generally have a reduced affinity for DNA. This may be defined in terms of dissociation constant. Thus, wild type polymerases tend to have dissociation constants in the nanomolar to picomolar range. For the purposes of the present invention, an altered polymerase having an affinity for DNA which is reduced compared to the control unaltered polymerase is suitable. Preferably, due to the alteration(s), the polymerase has at least, or approximately, a 2-fold, 3-fold, 4-fold or 5-fold, etc., increase in its dissociation constant when compared to the control unaltered polymerase.
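
To illustrate what a fold increase in dissociation constant means for complex formation, the sketch below solves the standard 1:1 binding equilibrium for the amount of polymerase-DNA complex. It is a textbook calculation under stated assumptions, not data or a method from the application: the concentrations and the control dissociation constant are hypothetical values chosen only for illustration, and the function name is introduced here.

```python
import math


def bound_complex(pol_total, dna_total, kd):
    """Equilibrium concentration of the Pol:DNA complex for 1:1 binding.

    All three arguments must share the same concentration unit. The result
    is the physically meaningful root of
    kd = (pol_total - c) * (dna_total - c) / c.
    """
    if pol_total < 0 or dna_total < 0 or kd <= 0:
        raise ValueError("concentrations must be >= 0 and kd must be > 0")
    total = pol_total + dna_total + kd
    return (total - math.sqrt(total * total - 4.0 * pol_total * dna_total)) / 2.0


if __name__ == "__main__":
    pol_total = 1.0   # hypothetical, arbitrary concentration units
    dna_total = 5.0   # hypothetical 5:1 DNA:polymerase ratio
    control_kd = 1.0  # hypothetical control dissociation constant
    for fold in (1, 2, 3, 4, 5):
        complex_conc = bound_complex(pol_total, dna_total, fold * control_kd)
        print(f"{fold}-fold Kd: fraction of polymerase bound = "
              f"{complex_conc / pol_total:.2f}")
```

Equilibrium occupancy is only part of the picture: the benefit described in the text comes from polymerase dissociating and rebinding other templates within a cycle, which this static calculation does not capture.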

[0052] By "functionally equivalent" is meant the amino acid substitution that is considered to occur at the amino acid position in another polymerase that has the same functional role in the enzyme. As an example, the mutation at position 412 from Tyrosine to Valine (Y412V) in the Vent DNA polymerase would be functionally equivalent to a substitution at position 409 from Tyrosine to Valine (Y409V) in the 9°N polymerase. The bulk of this amino acid residue is thought to act as a "steric gate" to block access of the 2'-hydroxyl of the nucleotide sugar to the binding site. Also, residue 488 in Vent polymerase is deemed equivalent to amino acid 485 in 9°N polymerase, such that the Alanine to Leucine mutation at 488 in Vent (A488L) is deemed equivalent to the A485L mutation in 9°N polymerase.

[0053] Generally, functionally equivalent substitution mutations in two or more different polymerases occur at homologous amino acid positions in the amino acid sequences of the polymerases. Hence, use herein of the term "functionally equivalent" also encompasses mutations that are "positionally equivalent" or "homologous" to a given mutation, regardless of whether or not the particular function of the mutated amino acid is known. It is possible to identify positionally equivalent or homologous amino acid residues in the amino acid sequences of two or more different polymerases on the basis of sequence alignment and/or molecular modelling.
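
Identifying positionally equivalent residues from a sequence alignment can be sketched as follows. This is an illustrative helper only: the function name is hypothetical, the two aligned strings are made-up toy sequences rather than real polymerase sequences, and the alignment itself is assumed to have been produced beforehand by an alignment tool of the reader's choice.

```python
def map_position(aligned_query, aligned_target, query_position):
    """Map a 1-based residue position in one sequence to the other.

    Both arguments are rows of the same pairwise alignment, with '-' marking
    gaps. Returns the 1-based position in the target sequence that shares an
    alignment column with `query_position`, or None if that column is a gap
    in the target.
    """
    if len(aligned_query) != len(aligned_target):
        raise ValueError("aligned sequences must have equal length")
    query_count = 0
    target_count = 0
    for query_char, target_char in zip(aligned_query, aligned_target):
        if query_char != "-":
            query_count += 1
        if target_char != "-":
            target_count += 1
        if query_char != "-" and query_count == query_position:
            return target_count if target_char != "-" else None
    raise ValueError("query_position lies beyond the end of the sequence")


if __name__ == "__main__":
    # Toy alignment: the first sequence carries three extra residues, so its
    # numbering runs three ahead of the second sequence after the gap.
    first = "MSGEKYAL"
    second = "M---KYAL"
    print(map_position(first, second, 6))  # tyrosine at position 6 in `first`
    print(map_position(first, second, 3))  # residue with no counterpart
```

The three-residue shift in the toy alignment mirrors the kind of numbering offset described above for Vent position 412 and 9°N position 409, although the real correspondence depends on the actual alignment of the two enzymes.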

[0054] The altered polymerase will generally be an "isolated" or "purified" polypeptide. By "isolated polypeptide" is meant a polypeptide that is essentially free from contaminating cellular components, such as carbohydrates, lipids, nucleic acids or other proteinaceous impurities which may be associated with the polypeptide in nature. Typically, a preparation of the isolated polymerase contains the polymerase in a highly purified form, i.e. at least about 80% pure, preferably at least about 90% pure, more preferably at least about 95% pure, more preferably at least about 98% pure and most preferably at least about 99% pure. Purity of a preparation of the enzyme may be assessed, for example, by appearance of a single band on a standard SDS-polyacrylamide electrophoresis gel.

[0055] The altered polymerase may be a "recombinant" polypeptide.

[0056] The altered polymerase according to the invention may be any DNA polymerase. More particularly, the altered polymerase may be a family B type DNA polymerase, or a mutant or variant thereof. Family B DNA polymerases include numerous archaeal DNA polymerases, human DNA polymerase α and T4, RB69 and φ29 phage DNA polymerases. These polymerases are less well studied than the family A polymerases, which include polymerases such as Taq and T7 DNA polymerase. In one embodiment the polymerase is selected from any family B archaeal DNA polymerase, human DNA polymerase α or T4, RB69 and φ29 phage DNA polymerases.

[0057] The archaeal DNA polymerases are in many cases from hyperthermophilic archaea, which means that the polymerases are often thermostable. Accordingly, in a further preferred embodiment the polymerase is a thermophilic archaeon polymerase, including, e.g., Vent, Deep Vent, 9°N and Pfu polymerase. Vent and Deep Vent are commercial names used for family B DNA polymerases isolated from the hyperthermophilic archaea Thermococcus litoralis and Pyrococcus furiosus respectively. 9°N polymerase was also identified from Thermococcus sp. Pfu polymerase was isolated from Pyrococcus furiosus. As mentioned above, prior to the present invention the thumb domain from a thermophilic polymerase had not been studied. A preferred polymerase of the present invention is 9°N polymerase, including mutants and variants thereof. 9°N polymerase has no requirement for accessory proteins. This can be contrasted with previously studied polymerases in which deletions in the thumb domain were shown to adversely affect the interaction with accessory proteins whilst not altering other properties of the polymerase. In contrast, as is shown in the Experimental Section below, a deletion of a large number of residues of 9°N has a significant adverse effect on the important properties of 9°N such that catalytic activity is severely compromised.

[0058] It is to be understood that the invention is not intended to be limited to mutants or variants of the family B polymerases. The altered polymerase may also be a family A polymerase, or a mutant or variant thereof, for example a mutant or variant Taq or T7 DNA polymerase enzyme, or a polymerase not belonging to either family A or family B, such as for example reverse transcriptases.

[0059] A number of different types of alterations are contemplated by the invention, wherein such alterations produce a polymerase displaying the desired properties as a result of reduced affinity for DNA. Particularly preferred are substitution mutations in the primary amino acid sequence of the polymerase, although addition and deletion mutations may also produce useful polymerases. Suitable alteration techniques, such as site-directed mutagenesis, are well known in the art.

[0060] Thus, by "altered polymerase" it is meant that the polymerase has at least one amino acid change compared to the control polymerase enzyme. In general this change will comprise the substitution of at least one amino acid for another. In preferred embodiments, these changes are non-conservative changes, although conservative changes to maintain the overall charge distribution of the protein are also envisaged in the present invention. Moreover, it is within the contemplation of the present invention that the modification in the polymerase sequence may be a deletion or addition of one or more amino acids from or to the protein, provided that the resultant polymerase has reduced DNA affinity and an ability to incorporate a nucleotide or nucleotides into a plurality of separate DNA templates in each reaction cycle compared to a control polymerase.

[0061] In one embodiment, the alteration to form the polymerase of the invention comprises at least one mutation, and preferably at least one substitution mutation, at a residue in the polymerase which destabilises the interaction of the polymerase with DNA. Thus, the resultant polymerase interacts in a less stable manner with DNA. As aforementioned, a decrease in affinity of the polymerase for DNA allows it to incorporate one or more nucleotides into several different DNA molecules in a single reaction cycle. Thus, the overall efficiency of a reaction is improved, leading to greater levels of reaction completion.

[0062] In a further embodiment, the alteration comprises at least one mutation, and preferably at least one substitution mutation, at a residue in the polymerase which binds to DNA. Suitable target residues for mutation can be selected according to available crystal structures for suitable polymerases, particularly when crystallised in the closed state (bound to DNA). By reducing the number of binding contacts with the DNA, an overall reduction in DNA binding affinity may be achieved. Thus, the resultant polymerase displays improved characteristics in the context of nucleotide incorporation reactions in which tight binding to DNA is disadvantageous.

[0063] In similar fashion, the polymerase may also carry an alteration which comprises at least one mutation, and preferably at least one substitution mutation, at a residue in the DNA binding domain of the polymerase. Such a mutation is predicted to decrease the DNA binding affinity of the altered polymerase such that it is able to more readily bind to and dissociate from separate template DNA molecules during a reaction.

[0064] In one embodiment, the polymerase includes an alteration which comprises at least one mutation, and preferably at least one substitution mutation, at a basic amino acid residue in the polymerase. Indeed, many positively charged amino acid residues in polymerases are known to interact with the overall negatively charged DNA double helix, in particular with specific phosphate groups of nucleotides in the DNA.

[0065] As aforementioned, a particular type of alteration resulting in a polymerase according to the invention comprises at least one substitution mutation. As is shown in the experimental section below, deletion of residues from the polymerase amino acid sequence generates a polymerase which, whilst having a reduced affinity for DNA, does not have overall advantageous properties since catalytic ability is impaired. In one particularly preferred embodiment, the polymerase comprises two substitution mutations, but may contain four, five, six or seven, etc. mutations provided that the resultant polymerase has the desired properties.

[0066] Preferably, the affinity of the polymerase for nucleotides is substantially unaffected by the alteration. As is shown in the experimental section (in particular example 6), it is possible to mutate a polymerase such that its affinity for DNA is reduced, whilst the affinity of the polymerase for a nucleotide, which may be, for example, a dNTP or ddNTP or a modified version thereof (see the definition of nucleotide supra), is not adversely affected. By "substantially unaffected" in this context is meant that the affinity for the nucleotide remains of the same order as for the unaltered polymerase. Preferably, the affinity for nucleotides is unaffected by the alteration.

[0067] Preferably, the fidelity of the polymerase is substantially unaffected by the alteration. As is shown in the experimental section (in particular example 6), it is possible to mutate a polymerase such that its affinity for DNA is reduced, whilst the fidelity of the polymerase is substantially unaffected by the alteration. By "substantially unaffected" in this context is meant that the misincorporation frequency for each nucleotide remains of the same order as for the unaltered polymerase. Preferably, the fidelity of the polymerase is unaffected by the alteration.
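The "same order" test used above for nucleotide affinity and for misincorporation frequency can be sketched numerically. The following is a minimal illustration only, assuming that "of the same order" is read as "within one order of magnitude" (a factor of ten); that reading, the function name and the example values are assumptions and are not taken from the examples.

```python
import math


def same_order_of_magnitude(altered: float, control: float) -> bool:
    """Return True if two positive values differ by less than a factor of ten.

    Assumption: "of the same order" is interpreted as |log10(ratio)| < 1.
    """
    if altered <= 0 or control <= 0:
        raise ValueError("values must be positive")
    return abs(math.log10(altered / control)) < 1.0


# Hypothetical values, for illustration only (not data from the examples).
print(same_order_of_magnitude(3.0, 1.5))   # values differ by a factor of 2
print(same_order_of_magnitude(30.0, 1.5))  # values differ by a factor of 20
```

The same check could be applied per nucleotide, for example to a nucleotide affinity constant or a misincorporation frequency measured for the altered and the unaltered polymerase.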

[0068] In terms of specific and preferred structural mutants, these may be based upon the most preferred polymerase, namely 9.degree. N DNA polymerase. As discussed in example 1 below, an energy minimised overlaid alignment (contracted by Cresset) of the crystal structures of the open form of 9.degree. N-7 DNA polymerase (PDB=1qht), the open structure of a closely related DNA polymerase RB69 (PDB=1ih7) and the closed form of RB69 (PDB=1ig9) was used as a structural model for the identification of key residues involved in DNA binding. Accordingly, an altered polymerase is provided which comprises or incorporates one, two or three amino acid substitution mutations to a different amino acid at the position or positions functionally equivalent to Lys705, Arg713 and/or Arg743 in the 9.degree. N DNA polymerase amino acid sequence. Preferably, the polymerase is a 9.degree. N DNA polymerase comprising these mutations. All combinations and permutations of one, two or three mutations are contemplated within the scope of the invention.
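By way of illustration, the single, double and triple combinations of the three named positions can be enumerated as follows. This is a sketch only; the function name is hypothetical and the code simply lists subsets of the positions named above.

```python
from itertools import combinations

# Positions named in the text for the 9 degrees N DNA polymerase sequence.
SITES = ["Lys705", "Arg713", "Arg743"]


def mutation_site_combinations(sites):
    """Return every non-empty subset of the sites, smallest subsets first."""
    return [
        combo
        for size in range(1, len(sites) + 1)
        for combo in combinations(sites, size)
    ]


for combo in mutation_site_combinations(SITES):
    print(" + ".join(combo))
```

For three positions this yields three single, three double and one triple combination, seven in total.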

[0069] Mutations may also be made at other specific residues based upon alignment of the "open" 9.degree. N DNA polymerase structure (i.e. not bound to DNA) with the known crystal structure of the RB69 polymerase complexed with DNA. Thus, an altered polymerase is provided which comprises or incorporates one or two amino acid substitution mutations to a different amino acid at the position or positions functionally equivalent to Arg606 and/or His679 in the 9.degree. N DNA polymerase amino acid sequence. Preferably, the polymerase is a 9.degree. N DNA polymerase comprising these mutations. All combinations and permutations of different mutations are contemplated within the scope of the invention. Thus, these mutations may be made in combination with the other mutations discussed supra.

[0070] In one preferred embodiment, the polymerase comprises at least a substitution mutation to a different amino acid at the position functionally equivalent to either Arg713 or Arg743 in the 9.degree. N DNA polymerase amino acid sequence. These two positions represent particularly preferred sites for mutation, as discussed in more detail in the experimental section below. Both residues may be mutated in the same polymerase to a different amino acid.

[0071] In terms of the nature of the different amino acid, the substitution mutation or mutations preferably convert the substituted amino acid to a non-basic amino acid (i.e. not lysine or arginine). Any non-basic amino acid may be chosen. Preferred substitution mutation or mutations convert the substituted amino acid to an amino acid selected from:

[0072] (i) acidic amino acids,

[0073] (ii) aromatic amino acids, particularly tyrosine (Y) or phenylalanine (F); and

[0074] (iii) non-polar amino acids, particularly alanine (A), glycine (G) or methionine (M).

[0075] In one embodiment, the substitution mutation or mutations convert the substituted amino acid to alanine. In a more specific embodiment, an altered polymerase is provided comprising the substitution mutation or mutations which are functionally equivalent to Lys705Ala and/or Arg713Ala and/or Arg743Ala in the 9.degree. N DNA polymerase amino acid sequence. Thus, in this embodiment the polymerase has a substitution mutation at position 705 from lysine to alanine and/or at position 713 from arginine to alanine and/or at position 743 from arginine to alanine or at positions which are functionally equivalent if the polymerase is not a 9.degree. N DNA polymerase. In a preferred embodiment, the polymerase is a 9.degree. N DNA polymerase comprising the said substitution mutations.
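The substitution notation used above (for example Arg713Ala: wild type residue, position, replacement residue) can be handled programmatically. The following is a minimal sketch, assuming 1-based residue numbering; the function names and the five-residue toy sequence are hypothetical and do not represent any polymerase sequence.

```python
import re

THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def parse_mutation(label):
    """Split a label such as 'Arg713Ala' into (wild type, position, replacement)."""
    match = re.fullmatch(r"([A-Z][a-z]{2})(\d+)([A-Z][a-z]{2})", label)
    if match is None:
        raise ValueError(f"unrecognised mutation label: {label}")
    wild, position, new = match.groups()
    return THREE_TO_ONE[wild], int(position), THREE_TO_ONE[new]


def apply_mutation(protein, label):
    """Return the protein with one substitution applied (positions are 1-based)."""
    wild, position, new = parse_mutation(label)
    found = protein[position - 1]
    if found != wild:
        raise ValueError(f"expected {wild} at position {position}, found {found}")
    return protein[:position - 1] + new + protein[position:]


toy = "MKRLH"  # hypothetical toy sequence, not a real polymerase sequence
print(parse_mutation("Arg713Ala"))
print(apply_mutation(toy, "Lys2Ala"))
```

Checking the wild type residue before substituting guards against numbering errors, which matters when a mutation is transferred to a functionally equivalent position in a different polymerase.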

[0076] In one embodiment, the altered polymerase comprises the amino acid substitution functionally equivalent to Arg713Ala; in a further embodiment, the altered polymerase comprises the amino acid substitution functionally equivalent to Arg743Ala. Preferably, the altered polymerase is a 9.degree. N DNA polymerase.

[0077] Specific structural mutants may also be based upon other types of polymerase, such as the RB69 polymerase for which the "open" and "closed" structures are known. Accordingly, an altered polymerase is provided which comprises or incorporates one, two, three, four, five or six amino acid substitution mutations to a different amino acid at the position or positions functionally equivalent to Lys790, Lys800, Arg806, Lys844, Lys874 and/or Lys878 in the RB69 DNA polymerase amino acid sequence. Preferably, the polymerase is a 9.degree. N DNA polymerase comprising these analogous or functionally equivalent mutations. All combinations and permutations of one, two, three, four, five or six mutations are contemplated within the scope of the invention.

[0078] In terms of the nature of the different amino acid, the substitution mutation or mutations preferably convert the substituted amino acid to a non-basic amino acid (i.e. not lysine or arginine). Any non-basic amino acid may be chosen.

[0079] Preferred substitution mutation or mutations convert the substituted amino acid to an amino acid selected from:

[0080] (i) acidic amino acids,

[0081] (ii) aromatic amino acids, particularly tyrosine (Y) or phenylalanine (F); and

[0082] (iii) non-polar amino acids, particularly alanine (A), glycine (G) or methionine (M).

[0083] In one embodiment, the substitution mutation or mutations convert the substituted amino acid to alanine.

[0084] It should be noted that the present invention is not limited to polymerases which have only been altered in the above mentioned manner. Polymerases of the invention may include a number of additional mutations, such as for example the preferred mutant polymerases disclosed in detail in WO 2005/024010. In particular, a polymerase comprising substitution mutations at positions which are functionally equivalent to Leu408 and Tyr409 and Pro410 in the 9.degree. N DNA polymerase amino acid sequence is contemplated. In a preferred embodiment, the polymerase is a 9.degree. N DNA polymerase comprising the said substitution mutations.

[0085] In a specific embodiment, the polymerase comprises the substitution mutations which are functionally equivalent to at least one or two but preferably all of Leu408Tyr and Tyr409Ala and Pro410Val in the 9.degree. N DNA polymerase amino acid sequence. In a preferred embodiment, the polymerase is a 9.degree. N DNA polymerase comprising all the said substitution mutations.

[0086] The polymerase may further comprise a substitution mutation at the position functionally equivalent to Cys223 in the 9.degree. N DNA polymerase amino acid sequence. In a preferred embodiment, the polymerase is a 9.degree. N DNA polymerase comprising the said substitution mutation. In one embodiment, the polymerase comprises the substitution mutation functionally equivalent to Cys223Ser in the 9.degree. N DNA polymerase amino acid sequence. In a preferred embodiment, the polymerase is a 9.degree. N DNA polymerase comprising the said substitution mutation.

[0087] Preferably, the polymerase is a 9.degree. N DNA polymerase comprising a combination of the above mentioned mutations.

[0088] The invention also relates to a 9.degree. N polymerase molecule comprising, consisting essentially of or consisting of the amino acid sequence shown as any one of SEQ ID NO: 1, 3, 5 or 21. The invention also encompasses polymerases having amino acid sequences which differ from those shown as SEQ ID NOs: 1, 3, 5 and 21 only in amino acid changes which do not affect the function of the polymerase to a material extent. In this case the relevant function of the polymerase is defined as a reduced affinity for DNA such that the polymerase has an ability to incorporate a nucleotide or nucleotides into a plurality of separate DNA templates in each reaction cycle (compared to a control polymerase) and/or that the polymerase is capable of forming an increased number of productive polymerase-DNA complexes in each reaction cycle (compared to a control polymerase).

[0089] Thus, conservative substitutions at residues which are not important for this activity of the polymerase variants having reduced DNA affinity are included within the scope of the invention. The effect of further mutations on the function of the enzyme may be readily tested, for example using well known nucleotide incorporation assays (such as those described in the examples of WO 2005/024010 and in examples 3 and 4 below).

[0090] The altered polymerase of the invention may also be defined directly with reference to its reduced affinity for DNA, which together with a substantially unaltered fidelity and affinity for nucleotides produce the advantages associated with the polymerases of the invention. Thus, an altered polymerase is provided which has a dissociation constant (K.sub.D) for DNA of at least, or in the region of between, approximately 2-fold greater, 3-fold greater, 4-fold greater or 5-fold greater than the unaltered control polymerase.

[0091] In one embodiment, an altered polymerase is provided which will dissociate from DNA in the presence of a salt solution, preferably a NaCl solution, having a concentration of less than or equal to about 500 mM, preferably less than 500 mM. The salt solution may be of a suitable concentration such that the reduced affinity polymerase of the invention can be distinguished from an unaltered polymerase which binds DNA more tightly. Suitable salt solution concentrations (preferably NaCl) are in the region of approximately 150 mM, 200 mM, 250 mM, 300 mM or 350 mM, preferably 200 mM. Any suitable double stranded DNA molecule may be utilised to determine whether the alteration has the desired effect in terms of reducing DNA affinity. Preferably, the DNA molecule from which the polymerase dissociates comprises the sequence set forth as SEQ ID NO: 18. Preferably, at least approximately 40%, 50%, 60%, 70%, etc., of the polymerase will dissociate from the DNA at the relevant NaCl concentration in the wash solution.

[0092] Dissociation experiments may be carried out by any known means, such as, for example, by utilising the washing assay detailed in example 5 of the Experimental Section below (see also FIGS. 6 and 7).

[0093] As aforementioned, the reduction in DNA affinity is (preferably) achieved without a notable or significant decrease in the affinity of the polymerase for nucleotides. Surprisingly, the altered polymerase of the invention may also display comparable activity, for example, in terms of Vmax, to the unmodified polymerase even though the DNA binding affinity has been decreased. This surprising property displayed by the polymerases of the present invention is shown in the kinetic analysis of certain enzymes of the invention in particular in example 6 of the Experimental Section below and with reference to FIG. 8.

[0094] The altered polymerase of the invention may also be defined directly with reference to its improved ability to be purified from host cells in which the polymerase is expressed. Thus, thanks to the reduced affinity of the altered polymerase for DNA (which together with a substantially unaltered affinity for nucleotides and fidelity produce the advantages associated with the polymerases of the invention) the polymerase can more readily be purified. Less endogenous DNA from the host cell is co-purified during purification of the enzyme. Thus, a more pure product results since less endogenous DNA remains bound to the polymerase following the purification process. An additional advantage of the reduced affinity for DNA of the altered polymerases is that less severe purification procedures need to be utilised in order to provide a substantially pure polymerase preparation. Accordingly, less polymerase will be adversely affected by the purification process itself leading to a polymerase preparation with higher levels of overall activity. In addition, more uniform purification should be possible leading to less variability between batches of polymerase. Representative data regarding the improvement in carry over of endogenous DNA during the purification procedure is provided in Example 7 of the experimental section below.

[0095] Preferably, less than about 60 ng/ml, 50 ng/ml, 40 ng/ml, 30 ng/ml, 20 ng/ml, 10 ng/ml and more preferably less than about 5 ng/ml of host DNA is carried over following purification of the polymerase. Standard purification protocols may be utilised; see, for example, Colley et al., J. Biol. Chem. 264:17619-17622 (1989); Guide to Protein Purification, in Methods in Enzymology, vol. 182 (Deutscher, ed., 1990).

[0096] Thus, the invention provides an altered polymerase having an affinity for DNA such that:

[0097] (i) the polymerase has a dissociation constant for DNA of at least about, or approximately, 2-fold, 3-fold, 4-fold or 5-fold greater than the unaltered/control polymerase and/or

[0098] (ii) at least 50%, 60%, 70% or 80% of the polymerase dissociates from DNA to which the polymerase is bound when a sodium chloride solution having a concentration of between about 200 mM and 500 mM, preferably between about 200 mM and 300 mM, is applied thereto, and/or

[0099] (iii) less than about 60, 50, 45, 40, 35, 30, 25, 20, 15, 10, 5, 3, 1 or 0.5 ng/ml of endogenous DNA remains bound to the polymerase following a purification process from the cell in which the polymerase is expressed; the alteration not significantly adversely affecting nucleotide binding ability or fidelity such that the polymerase is capable of:

[0100] (a) forming an increased number of productive polymerase-DNA complexes over a reaction cycle (giving improved levels of reaction completion), and/or

[0101] (b) catalysing an improved (increased/elevated) overall level of nucleotide incorporation;

especially under conditions where the concentration of polymerase is limiting with respect to the concentration of DNA.
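The three affinity criteria (i) to (iii) above can be expressed as a simple check. The following is an illustrative sketch only: it uses the least stringent threshold listed for each criterion (2-fold, 50% and 60 ng/ml), treats the criteria as alternatives because they are joined by "and/or", and does not model the further requirement that nucleotide binding and fidelity are not significantly affected. The class, field and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Measurements:
    kd_fold_vs_control: float    # K_D(altered) / K_D(control) for DNA
    fraction_dissociated: float  # 0..1, fraction released by the NaCl wash
    carryover_ng_per_ml: float   # host DNA remaining after purification


def affinity_criteria_met(m: Measurements) -> dict:
    """Evaluate criteria (i)-(iii) at their least stringent listed thresholds."""
    return {
        "i_kd": m.kd_fold_vs_control >= 2.0,
        "ii_salt": m.fraction_dissociated >= 0.50,
        "iii_carryover": m.carryover_ng_per_ml < 60.0,
    }


def has_reduced_affinity(m: Measurements) -> bool:
    """The criteria are joined by 'and/or', so any one of them suffices."""
    return any(affinity_criteria_met(m).values())


# Hypothetical measurements, for illustration only.
candidate = Measurements(kd_fold_vs_control=3.0,
                         fraction_dissociated=0.2,
                         carryover_ng_per_ml=100.0)
print(affinity_criteria_met(candidate))
print(has_reduced_affinity(candidate))
```

A stricter embodiment could be checked by raising the thresholds, for example to 5-fold for (i) or 5 ng/ml for (iii).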

[0102] The invention further relates to nucleic acid molecules encoding the altered polymerase enzymes of the invention.

[0103] For any given altered polymerase which is a mutant version of a polymerase for which the amino acid sequence and preferably also the wild type nucleotide sequence encoding the polymerase is known, it is possible to obtain a nucleotide sequence encoding the mutant according to the basic principles of molecular biology. For example, given that the wild type nucleotide sequence encoding 9.degree. N polymerase is known, it is possible to deduce a nucleotide sequence encoding any given mutant version of 9.degree. N having one or more amino acid substitutions using the standard genetic code. Similarly, nucleotide sequences can readily be derived for mutant versions of other polymerases from both family A and family B polymerases such as, for example, Vent.TM., Pfu, Tsp JDF-3, Taq, etc. Nucleic acid molecules having the required nucleotide sequence may then be constructed using standard molecular biology techniques known in the art.
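The deduction described above can be sketched as a codon replacement in a coding sequence. The following is a minimal illustration under stated assumptions: the coding sequence is a hypothetical four-codon toy, only a partial extract of the standard genetic code is included, and the choice of one codon per replacement residue is arbitrary (any synonymous codon would serve). The function names are hypothetical.

```python
# Partial extract of the standard genetic code, sufficient for the toy example.
STANDARD_CODE = {
    "ATG": "M", "AAA": "K", "AAG": "K", "CGT": "R",
    "AGA": "R", "GCG": "A", "CTG": "L",
}

# One codon per replacement residue; an arbitrary choice for illustration.
CHOSEN_CODON = {"A": "GCG", "K": "AAA", "R": "CGT", "M": "ATG", "L": "CTG"}


def translate(cds):
    """Translate a coding sequence whose codons are all in STANDARD_CODE."""
    return "".join(STANDARD_CODE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def mutate_codon(cds, position, new_residue):
    """Replace the codon at the 1-based amino acid position with a codon
    encoding new_residue."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    start = 3 * (position - 1)
    if position < 1 or start + 3 > len(cds):
        raise ValueError("position outside the coding sequence")
    return cds[:start] + CHOSEN_CODON[new_residue] + cds[start + 3:]


toy_cds = "ATGAAACGTCTG"                # hypothetical; encodes M-K-R-L
mutant_cds = mutate_codon(toy_cds, 3, "A")  # an Arg-to-Ala substitution at position 3
print(translate(toy_cds), "->", translate(mutant_cds))
```

In practice the replacement codon might be chosen to suit the codon usage of the intended expression host, which is a design choice outside the scope of this sketch.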

[0104] In one particular embodiment the invention relates to nucleic acid molecules encoding mutant versions of the 9.degree. N polymerase.

[0105] Therefore, the invention provides a nucleic acid molecule which encodes an altered 9.degree. N polymerase, the nucleic acid molecule comprising, consisting essentially of or consisting of the nucleotide sequence of any of SEQ ID NO: 2, 4, 6, 19 or 20.

[0106] In accordance with the present invention, a defined nucleic acid includes not only the identical nucleic acid but also any minor base variations including, in particular, substitutions in cases which result in a synonymous codon (a different codon specifying the same amino acid residue) due to the degenerate code in conservative amino acid substitutions. The term "nucleic acid sequence" also includes the complementary sequence to any single stranded sequence given regarding base variations.
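The notions of synonymous codons and of a complementary sequence can be illustrated with the standard genetic code. The following sketch builds the full standard codon table, tests whether two coding sequences specify the same protein, and derives a complementary (reverse complement) strand. The function names and the example sequences are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons of a DNA coding sequence, ignoring any
    trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def encode_same_protein(first, second):
    """True if the two sequences differ only by synonymous codons (or not at all)."""
    return translate(first) == translate(second)


def reverse_complement(dna):
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Hypothetical sequences: both encode Met-Lys-Arg using different codons.
print(encode_same_protein("ATGAAACGT", "ATGAAGAGA"))
print(reverse_complement("ATGAAACGT"))
```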

[0107] The nucleic acid molecules described herein may also, advantageously, be included in a suitable expression vector to express the polymerase proteins encoded therefrom in a suitable host. Thus, there is provided an expression vector comprising, consisting essentially of or consisting of the nucleotide sequence of any of SEQ ID NO: 2, 4, 6, 19 or 20. Incorporation of cloned DNA into a suitable expression vector for subsequent transformation of said cell and subsequent selection of the transformed cells is well known to those skilled in the art as provided in Sambrook et al. (1989), Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory.

[0108] Such an expression vector includes a vector having a nucleic acid according to the invention operably linked to regulatory sequences, such as promoter regions, that are capable of effecting expression of said DNA fragments. The term "operably linked" refers to a juxtaposition wherein the components described are in a relationship permitting them to function in their intended manner. Such vectors may be transformed into a suitable host cell to provide for the expression of a protein according to the invention.

[0109] The nucleic acid molecule may encode a mature protein or a protein having a prosequence, including that encoding a leader sequence on the preprotein which is then cleaved by the host cell to form a mature protein.

[0110] The vectors may be, for example, plasmid, virus or phage vectors provided with an origin of replication, and optionally a promoter for the expression of said nucleotide and optionally a regulator of the promoter. The vectors may contain one or more selectable markers, such as, for example, an antibiotic resistance gene.

[0111] Regulatory elements required for expression include promoter sequences to bind RNA polymerase and to direct an appropriate level of transcription initiation and also translation initiation sequences for ribosome binding. For example, a bacterial expression vector may include a promoter such as the lac promoter and for translation initiation the Shine-Dalgarno sequence and the start codon AUG. Similarly, a eukaryotic expression vector may include a heterologous or homologous promoter for RNA polymerase II, a downstream polyadenylation signal, the start codon AUG, and a termination codon for detachment of the ribosome. Such vectors may be obtained commercially or be assembled from the sequences described by methods well known in the art.

[0112] Transcription of DNA encoding the polymerase of the invention by higher eukaryotes may be optimised by including an enhancer sequence in the vector. Enhancers are cis-acting elements of DNA that act on a promoter to increase the level of transcription. Vectors will also generally include origins of replication in addition to the selectable markers.

Preferred Uses of the Altered Polymerases

[0113] In a further aspect the invention relates to use of an altered polymerase having reduced affinity for DNA according to the invention for the incorporation of a nucleotide into a polynucleotide. As mentioned above, the nature of the nucleotide is not limiting since the altered polymerases of the invention retain affinity for the relevant nucleotides.

[0114] As aforementioned, the invention is based upon the realization that the tight binding of a polymerase to the DNA template is not always an advantageous property. This is particularly the case in the context of sequencing reactions in which only a single nucleotide incorporation event is required in each reaction cycle for each template DNA molecule. In many of these sequencing reactions a labelled nucleotide is utilised.

[0115] Thus, the invention provides for use of a polymerase which has been altered such that it displays a reduced affinity for DNA and an ability to incorporate a labelled nucleotide into a plurality of separate DNA templates in each reaction cycle for incorporation of a labelled nucleotide into a polynucleotide, the label being utilised to determine the nature of the nucleotide added.

[0116] In one embodiment, the nucleotide comprises a ddNTP. Thus, the polymerase of the invention may be utilised in a conventional Sanger sequencing reaction, the details of which are well known in the art.

[0117] In a preferred embodiment, the nucleotide is a modified nucleotide which has been modified at the 3' sugar hydroxyl such that the substituent is larger in size than the naturally occurring 3' hydroxyl group.

[0118] The polymerases of the invention may be used in any area of technology where it is required/desirable to be able to incorporate nucleotides, for example modified nucleotides having a substituent at the 3' sugar hydroxyl position which is larger in size than the naturally occurring hydroxyl group, into a polynucleotide chain. They may be used in any area of technology where any of the desirable properties of the enzyme, for example improved rate of incorporation of nucleotides even under conditions where the DNA is present in excess and increased levels of reaction completion under these conditions, are required. This may be a practical, technical or economic advantage.

[0119] Although the altered polymerases exhibit desirable properties in relation to incorporation of modified nucleotides having a large 3' substituent due to their decreased affinity for DNA, the utility of the enzymes is not confined to incorporation of such nucleotide analogues. The desirable properties of the altered polymerase due to its reduced affinity for DNA may provide advantages in relation to incorporation of any other nucleotide, including unmodified nucleotides, relative to enzymes known in the art. In essence, the altered polymerases of the invention may be used to incorporate any type of nucleotide that they have the ability to incorporate.

[0120] The polymerases of the present invention are useful in a variety of techniques requiring incorporation of a nucleotide into a polynucleotide, which include sequencing reactions, polynucleotide synthesis, nucleic acid amplification, nucleic acid hybridisation assays, single nucleotide polymorphism studies, and other such techniques. Use in sequencing reactions represents a most preferred embodiment. All such uses and methods utilizing the modified polymerases of the invention are included within the scope of the present invention.

[0121] The invention also relates to a method for incorporating nucleotides into DNA comprising allowing the following components to interact:

[0122] (i) A polymerase according to the invention;

[0123] (ii) a DNA template; and

[0124] (iii) a nucleotide solution.

[0125] As discussed above, the polymerase of the invention has particular applicability in reactions where incorporation of only a single or relatively few nucleotides is required in each reaction cycle. Often in these reactions one or more of the nucleotides will be labelled. Accordingly, the invention provides a method for incorporating labelled nucleotides into DNA comprising allowing the following components to interact:

[0126] (i) a polymerase which has been altered such that it displays a reduced affinity for DNA and an ability to incorporate a labelled nucleotide into a plurality of separate DNA templates in each reaction cycle;

[0127] (ii) a DNA template; and

[0128] (iii) a nucleotide solution.

[0129] In one specific embodiment, the invention provides a method for incorporating nucleotides which have been modified at the 3' sugar hydroxyl such that the substituent is larger in size than the naturally occurring 3' hydroxyl group into DNA comprising allowing the following components to interact:

[0130] a polymerase according to the present invention (as described above);

[0131] a DNA template; and

[0132] a nucleotide solution containing the nucleotides which have been modified at the 3' sugar hydroxyl such that the substituent is larger in size than the naturally occurring 3' hydroxyl group.

[0133] Particularly preferred are uses and methods carried out on a clustered array. Clustered arrays of nucleic acid molecules may be produced using techniques generally known in the art. By way of example, WO 98/44151 and WO 00/18957 (both of which are incorporated by reference herein) both describe methods of nucleic acid amplification which allow amplification products to be immobilised on a solid support in order to form arrays comprised of clusters or "colonies" of immobilised nucleic acid molecules. Reference is also made to WO 2005/078130 including the citations referred to therein, the contents of all of which are hereby incorporated by reference. Incorporation on clusters, in particular sequencing on clustered arrays, provides specific advantages because the polymerase is able to incorporate nucleotides into multiple DNA templates located in close proximity, thus providing a highly efficient reaction.

[0134] The above components are allowed to interact under conditions which permit the formation of a phosphodiester linkage between the 5' phosphate group of a nucleotide and a free 3' hydroxyl group on the DNA template, whereby the nucleotide is incorporated into a polynucleotide. Preferred nucleotides, including modified nucleotides, are described in detail above.

[0135] The incorporation reactions may occur in free solution or the DNA templates may be fixed to a solid support.

[0136] The rate of incorporation of the nucleotide exhibited by a mutant enzyme may be similar to the rate of incorporation of nucleotides exhibited by the unaltered enzyme. Due to the improved activity of the modified enzyme, thanks to its reduced affinity for DNA, the same rate of incorporation combined with the ability to incorporate nucleotides into a plurality of templates in a single reaction cycle improves the overall rates of completion. However, it is not necessary for the rate of incorporation of nucleotides to be precisely the same as that of the unaltered enzyme for a mutant enzyme to be of practical use. The rate of incorporation may be less than, equal to or greater than the rate of incorporation of nucleotides by the unaltered enzyme, provided the overall reaction efficiency in terms of reaction completion is improved.

[0137] In one particular embodiment of the invention, the altered polymerases of the invention may be used to incorporate modified nucleotides into a polynucleotide chain in the context of a sequencing-by-synthesis protocol. In this particular aspect of the method the nucleotides may have been modified at the 3' sugar hydroxyl such that the substituent is larger in size than the naturally occurring 3' hydroxyl group. These nucleotides are detected in order to determine the sequence of a DNA template.

[0138] Thus, in a still further aspect, the invention provides a method of sequencing DNA comprising allowing the following components to interact:

[0139] a polymerase according to the present invention (as described above);

[0140] a DNA template; and

[0141] a nucleotide solution containing the nucleotides which have been modified at the 3' sugar hydroxyl such that the substituent is larger in size than the naturally occurring 3' hydroxyl group;

[0142] followed by detection of the incorporated modified nucleotides, thus allowing sequencing of the DNA template.

[0143] The DNA template for a sequencing reaction will typically comprise a double-stranded region having a free 3' hydroxyl group which serves as a primer or initiation point for the addition of further nucleotides in the sequencing reaction. The region of the DNA template to be sequenced will overhang this free 3' hydroxyl group on the complementary strand. The primer bearing the free 3' hydroxyl group may be added as a separate component (e.g. a short oligonucleotide) which hybridises to a region of the template to be sequenced. Alternatively, the primer and the template strand to be sequenced may each form part of a partially self-complementary nucleic acid strand capable of forming an intramolecular duplex, such as for example a hairpin loop structure. Nucleotides are added successively to the free 3' hydroxyl group, resulting in synthesis of a polynucleotide chain in the 5' to 3' direction. After each nucleotide addition the nature of the base which has been added will be determined, thus providing sequence information for the DNA template.

[0144] Such DNA sequencing may be possible if the modified nucleotides can act as chain terminators. Once the modified nucleotide has been incorporated into the growing polynucleotide chain complementary to the region of the template being sequenced there is no free 3'-OH group available to direct further sequence extension and therefore the polymerase cannot add further nucleotides. Once the nature of the base incorporated into the growing chain has been determined, the 3' block may be removed to allow addition of the next successive nucleotide. By ordering the products derived using these modified nucleotides it is possible to deduce the DNA sequence of the DNA template. Such reactions can be done in a single experiment if each of the modified nucleotides has attached a different label, known to correspond to the particular base, to facilitate discrimination between the bases added at each incorporation step. Alternatively, a separate reaction may be carried out containing each of the modified nucleotides separately.

[0145] In a preferred embodiment the modified nucleotides carry a label to facilitate their detection. Preferably this is a fluorescent label. Each nucleotide type may carry a different fluorescent label. However the detectable label need not be a fluorescent label. Any label can be used which allows the detection of the incorporation of the nucleotide into the DNA sequence.

[0146] One method for detecting the fluorescently labelled nucleotides, suitable for use in the methods of the invention, comprises using laser light of a wavelength specific for the labelled nucleotides, or the use of other suitable sources of illumination.

[0147] In one embodiment, the fluorescence from the label on the nucleotide may be detected by a CCD camera.

[0148] If the DNA templates are immobilised on a surface they may preferably be immobilised on a surface to form a high density array, which is preferably a clustered or "colonial" array as discussed supra. In one embodiment, and in accordance with the technology developed by the applicants for the present invention, the high density array comprises a single molecule array, wherein there is a single DNA molecule at each discrete site that is detectable on the array. Single-molecule arrays comprised of nucleic acid molecules that are individually resolvable by optical means and the use of such arrays in sequencing are described, for example, in WO 00/06770, the contents of which are incorporated herein by reference. Single molecule arrays comprised of individually resolvable nucleic acid molecules including a hairpin loop structure are described in WO 01/57248, the contents of which are also incorporated herein by reference. The polymerases of the invention are suitable for use in conjunction with single molecule arrays prepared according to the disclosures of WO 00/06770 or WO 01/57248. However, it is to be understood that the scope of the invention is not intended to be limited to the use of the polymerases in connection with single molecule arrays.

[0149] Single molecule array-based sequencing methods may work by adding fluorescently labelled modified nucleotides and an altered polymerase to the single molecule array. Complementary nucleotides base-pair to the first base of each nucleotide fragment and are then added to the primer in a reaction catalysed by the improved polymerase enzyme. Remaining free nucleotides are removed.

[0150] Then, laser light of a specific wavelength for each modified nucleotide excites the appropriate label on the incorporated modified nucleotides, leading to the fluorescence of the label. This fluorescence may be detected by a suitable CCD camera that can scan the entire array to identify the incorporated modified nucleotides on each fragment. Thus millions of sites may potentially be detected in parallel. Fluorescence may then be removed.

[0151] The identity of the incorporated modified nucleotide reveals the identity of the base in the sample sequence to which it is paired. The cycle of incorporation, detection and identification may then be repeated approximately 25 times to determine the first 25 bases in each oligonucleotide fragment attached to the array, which is detectable.
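The cycle of incorporation, detection and identification described above can be illustrated with a minimal Python sketch. It assumes an idealised reaction in which exactly one 3'-blocked, labelled nucleotide is incorporated per cycle and every base is called correctly. The function name, the example template and the cycle count used in the demonstration are hypothetical and are not taken from the specification.

```python
# Idealised sequencing-by-synthesis: one blocked, labelled nucleotide per cycle.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def sequence_by_synthesis(template_overhang, cycles=25):
    """Return the bases called over at most `cycles` rounds of incorporation.

    `template_overhang` is the single-stranded template region, written in
    the order in which the polymerase reads it (first template base first).
    """
    calls = []
    for template_base in template_overhang[:cycles]:
        incorporated = COMPLEMENT[template_base]  # incorporation step
        calls.append(incorporated)                # detection of the label
        # Removal of the label and the 3' block would happen here,
        # regenerating a free 3'-OH for the next cycle.
    return "".join(calls)


# Hypothetical template region, for illustration only.
read = sequence_by_synthesis("TTGACCGTA", cycles=5)
# Each called base identifies the template base to which it is paired.
template_bases = "".join(COMPLEMENT[base] for base in read)
print(read, template_bases)
```

Each called base is the complement of the template base it pairs with, so the template sequence follows directly from the read.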

[0152] Thus, by simultaneously sequencing all molecules on the array, which are detectable, the first 25 bases for the hundreds of millions of oligonucleotide fragments attached in single copy to the array may be determined. Obviously the invention is not limited to sequencing 25 bases. Many more or fewer bases may be sequenced depending on the level of detail of sequence information required and the complexity of the array.

[0153] Using a suitable bioinformatics program the generated sequences may be aligned and compared to specific reference sequences. This allows determination of any number of known and unknown genetic variations such as single nucleotide polymorphisms (SNPs) for example.
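As a simplified illustration of comparing a generated sequence with a reference, the following Python sketch reports mismatches between an ungapped read and a reference at a known offset. Real alignment programs also search for the offset and handle gaps and base qualities; the function name and both sequences here are hypothetical.

```python
def find_mismatches(read, reference, offset=0):
    """List (reference_position, reference_base, read_base) for each mismatch
    between an ungapped read and the reference starting at `offset`."""
    window = reference[offset:offset + len(read)]
    if len(window) < len(read):
        raise ValueError("read extends past the end of the reference")
    return [
        (offset + i, ref_base, read_base)
        for i, (ref_base, read_base) in enumerate(zip(window, read))
        if ref_base != read_base
    ]


reference = "ACGTTGCAAGGCTTACGATCGATCG"  # hypothetical reference sequence
read = "GCAAGGCTAACG"                     # hypothetical 12-base read
print(find_mismatches(read, reference, offset=5))
```

A single reported mismatch of this kind is a candidate single nucleotide polymorphism, subject to confirmation by independent reads.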

[0154] The utility of the altered polymerases of the invention is not limited to sequencing applications using single-molecule arrays. The polymerases may be used in conjunction with any type of array-based (and particularly any high density array-based) sequencing technology requiring the use of a polymerase to incorporate nucleotides into a polynucleotide chain, and in particular any array-based sequencing technology which relies on the incorporation of modified nucleotides having large 3' substituents (larger than natural hydroxyl group), such as 3' blocking groups.

[0155] The polymerases of the invention may be used for nucleic acid sequencing on essentially any type of array formed by immobilisation of nucleic acid molecules on a solid support. In addition to single molecule arrays suitable arrays may include, for example, multi-polynucleotide or clustered arrays in which distinct regions on the array comprise multiple copies of one individual polynucleotide molecule or even multiple copies of a small number of different polynucleotide molecules (e.g. multiple copies of two complementary nucleic acid strands).

[0156] In particular, the polymerases of the invention may be utilised in the nucleic acid sequencing method described in WO 98/44152, the contents of which are incorporated herein by reference. This International application describes a method of parallel sequencing of multiple templates located at distinct locations on a solid support. The method relies on incorporation of labelled nucleotides into a polynucleotide chain.

[0157] The polymerases of the invention may be used in the method described in International Application WO 00/18957, the contents of which are incorporated herein by reference. This application describes a method of solid-phase nucleic acid amplification and sequencing in which a large number of distinct nucleic acid molecules are arrayed and amplified simultaneously at high density via formation of nucleic acid colonies and the nucleic acid colonies are subsequently sequenced. The altered polymerases of the invention may be utilised in the sequencing step of this method.

[0158] Multi-polynucleotide or clustered arrays of nucleic acid molecules may be produced using techniques generally known in the art. By way of example, WO 98/44151 and WO 00/18957 both describe methods of nucleic acid amplification which allow amplification products to be immobilised on a solid support in order to form arrays comprised of clusters or "colonies" of immobilised nucleic acid molecules. The contents of WO 98/44151 and WO 00/18957 relating to the preparation of clustered arrays and use of such arrays as templates for nucleic acid sequencing are incorporated herein by reference. The nucleic acid molecules present on the clustered arrays prepared according to these methods are suitable templates for sequencing using the polymerases of the invention. However, the invention is not intended to be limited to use of the polymerases in sequencing reactions carried out on clustered arrays prepared according to these specific methods.

[0159] The polymerases of the invention may further be used in methods of fluorescent in situ sequencing, such as that described by Mitra et al. Analytical Biochemistry 320, 55-65, 2003.

[0160] The present invention also contemplates kits which include the polymerase of the invention, possibly packaged together with suitable instructions for use. The polymerase will be provided in a form suitable for use, for example provided in a suitable buffer or may be in a form which can be reconstituted for use (e.g. in a lyophilized form).

[0161] Thus, a kit is provided for use in a nucleotide incorporation reaction or assay comprising a polymerase of the invention as described herein and a solution of nucleotides, the nucleotides being such that the polymerase can incorporate them into a growing DNA strand. Preferred nucleotides include suitably labelled nucleotides which can thus be used in sequencing reactions for example. Labels may include fluorescent labels, radiolabels and/or mass labels as are known in the art.

[0162] In one preferred embodiment, the nucleotide solution comprises, consists essentially of or consists of synthetic (i.e. non-natural) nucleotides such as ddNTPs for example. The kit may thus be utilised in a Sanger sequencing reaction for example.

[0163] In a further embodiment, the nucleotide solution comprises, consists essentially of or consists of modified nucleotides. Preferred modified nucleotides are defined above with respect to the polymerases of the invention and this description applies mutatis mutandis here.

[0164] The kit may, in a further embodiment, also incorporate suitable primer and/or DNA template molecules which allow a nucleotide incorporation reaction to be carried out.

[0165] In a still further aspect, the invention provides a method for producing a polymerase according to the invention comprising:

[0166] (i) selecting residues for mutagenesis in the polymerase;

[0167] (ii) producing a mutant polymerase in accordance with the selection made in (i);

[0168] (iii) determining the affinity of the mutant polymerase for DNA; and

[0169] (iv) if the affinity for DNA is reduced, testing the polymerase for an ability to form an increased number of productive polymerase-DNA complexes in each reaction cycle.

[0170] Preferably affinity for nucleotides is unaffected, but may be considered satisfactory if it remains of the same order as for the unmodified polymerase.

[0171] In one embodiment, the method further comprises ensuring that the fidelity of the polymerase remains of the same order following mutagenesis.

[0172] Preferably fidelity is unaffected, but may be considered acceptable if it remains of the same order as for the unmodified polymerase.

[0173] Reaction cycle is as defined above.

[0174] In a preferred embodiment, the test of the polymerase includes the use of synthetic nucleotides to determine whether an increased number of productive polymerase-DNA complexes are being formed. Suitable nucleotide incorporation assays in which the polymerase may be tested are known in the art (e.g. see WO2005/024010) and are described in more detail in the experimental section below.

[0175] In one embodiment, residues are selected on the basis of the 9°N primary amino acid sequence. In one embodiment, the selection is made by predicting which amino acids will contact the DNA. Alternatively, residues may be selected which are predicted to stabilise the interaction of the polymerase with DNA and/or which are found in the DNA binding domain of the polymerase and/or which are basic. Predictions may be based on crystal structures of a suitable polymerase, as discussed supra and in the experimental section (example 1).

[0176] Methods of mutagenesis, in particular site-directed mutagenesis, are well characterised in the art and kits are commercially available. Accordingly, these techniques are not discussed in detail. Any suitable technique may be utilized in the method of the invention.

[0177] The reduction in affinity for DNA may be measured by any suitable method. Preferably, the affinity is reduced at least, or approximately, 1.5-fold, 2-fold, 3-fold, 4-fold or 5-fold etc. compared to the original unaltered polymerase. This affinity may be measured with reference to the dissociation constant for example.
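The fold reduction in affinity can be expressed as a ratio of dissociation constants, since a higher Kd corresponds to weaker binding. The Python sketch below is one way to express that comparison; the function names and the Kd values in the example are hypothetical and serve only to show the arithmetic.

```python
def fold_reduction_in_affinity(kd_mutant, kd_unaltered):
    """Fold reduction in DNA affinity, taken as Kd(mutant) / Kd(unaltered).

    Both dissociation constants must be expressed in the same units.
    """
    if kd_mutant <= 0 or kd_unaltered <= 0:
        raise ValueError("dissociation constants must be positive")
    return kd_mutant / kd_unaltered


def meets_threshold(kd_mutant, kd_unaltered, min_fold=1.5):
    """True if affinity is reduced by at least `min_fold`."""
    return fold_reduction_in_affinity(kd_mutant, kd_unaltered) >= min_fold


# Hypothetical Kd values in pM, for illustration only.
print(fold_reduction_in_affinity(150.0, 50.0))
print(meets_threshold(60.0, 50.0))
```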

[0178] In a preferred embodiment, the polymerase is a family B polymerase, preferably derived from a thermophilic archaeon and most preferably is 9°N polymerase.

BRIEF DESCRIPTION OF THE DRAWINGS

[0179] The invention will be further understood with reference to the following experimental section and figures in which:

[0180] FIGS. 1A and 1B show overexpression of mutant enzymes of the invention.

[0181] FIG. 2 shows results of a NUNC tube assay using crude preparations of the mutant enzymes.

[0182] FIGS. 3A and 3B show results of a single base incorporation assay utilising the mutant enzymes.

[0183] FIG. 4 is a further presentation of the activity of the mutant enzymes.

[0184] FIG. 5 presents results of timecourses for a single base incorporation assay at [DNA]>[pol] (ratio 5:1) for the control polymerase (YAV) and for each of the three mutant enzymes (K705A, R713A and R743A).

[0185] FIG. 6 shows results of the washing assay, with the fluorescence image of the NUNC wells shown.

[0186] FIG. 7 also presents results of the washing assay, showing the affinity of the respective polymerases for a DNA template (the data for K705A has been omitted for clarity).

[0187] FIG. 8 represents Michaelis plots showing the kinetic characterization of the polymerase enzymes, which are shown overlaid.

[0188] FIG. 9 presents the nucleotide and amino acid sequences encoded by the codon-modified gene of clone 9. FIG. 9 discloses SEQ ID NOS 23 and 22, in order of appearance.

[0189] FIG. 10 presents the results of SDS-PAGE experiments comparing the expression of 3 clones (1, 2 and 4) of Pol 52 (codon-modified gene of clone 9 when expressed in pET11-a expression vector) with Pol 19 (clone 9 gene expressed from the pNEB917 expression vector) and Pol 43 (clone 9 gene expressed from the pET11-a expression vector) in the crude lysates of uninduced (gel I) and induced (Gel I,II) cultures.

[0190] Abbreviations: MW-Molecular Weight; PM-protein marker; cl 9-clone 9.

DETAILED DESCRIPTION OF THE INVENTION

Experimental Section

Example 1

Preparation of Altered Polymerases

Rationale

[0191] Site-directed mutations were introduced in the C-terminal region of 9°N-7 YAV C223S polymerase in an attempt to reduce the affinity of the enzyme for DNA (wild-type 9°N-7 polymerase has a very high affinity for DNA, Kd=50 pM; Southworth et al. 1996. PNAS. 93, 5281).

[0192] An energy minimised overlaid alignment (contracted by Cresset) of the crystal structures of the open form of 9°N-7 DNA polymerase (PDB=1qht), the open structure of a closely related DNA polymerase RB69 (PDB=1ih7) and the closed form of RB69 (PDB=1ig9) was used as a structural model for the identification of key residues involved in DNA binding. The crystal structure of the closed form of RB69 polymerase (Franklin et al. 2001. Cell 105, 657) identified a number of residues that formed H-bond or electrostatic interactions with the complexed DNA, either directly to the nucleotide bases or the phosphate backbone. A high proportion of these residues were basic (Lys790, 800, 844, 874, 878 and Arg806), consistent with their likely interaction with acidic phosphate groups. Inspection of the closed RB69 structure showed that the majority of these residues adopted orientations toward the bound duplex. No analogous structure for the closed form of 9°N-7 pol exists and so we used our structural alignment to identify basic residues in the open form of 9°N-7 pol which adopted analogous conformations to the basic residues (of those above) from the RB69 open structure. Of the 6 basic residues from RB69, 3 were found to have a corresponding basic residue in 9°N-7; these were: Arg743 (RB69 Lys878), Arg713 (Lys800) and Lys705 (Lys844). It was decided to engineer 4 mutant enzymes, the alanine variants of the residues shown (R743A, R713A and K705A) and a 71 amino acid deletion (71), which removed an α-helix from the thumb subdomain (residues disordered in the 9°N-7 pol structure) within which the three residues above were located.

Mutagenesis and Cloning

[0193] Mutations were introduced into pSV19 (plasmid encoding 9°N-7 YAV C223S exo- polymerase) via a PCR method using the Stratagene Quikchange XL kit and the protocol thereof (also see WO 2005/024010).

[0194] Mutagenic primers used:

TABLE-US-00001
Mutant	Primer	Sequence	SEQ ID NO
R743A	fwd	5'-CCCGGCGGTGGAGGCGATTCTAAAAGCC-3'	9
R743A	rev	3'-GGGCCGCCACCTCCGCTAAGATTTTCGG-5'	10
R713A	fwd	5'-GAAGGATAGGCGACGCGGCGATTCCAGCTG-3'	11
R713A	rev	3'-CTTCCTATCCGCTGCGCCGCTAAGGTCGAC-5'	12
K705A	fwd	5'-GCTACATCGTCCTAGCGGGCTCTGGAAGG-3'	13
K705A	rev	3'-CGATGTAGCAGGATCGCCCGAGACCTTCC-5'	14
71 (C-terminus 704)	fwd	5'-GCTACATCGTCCTATGAGGCTCTGGAAGG-3'	15
71 (C-terminus 704)	rev	3'-CGATGTAGCAGGATACTCCGAGACCTTCC-5'	16
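Because each reverse primer is written 3' to 5', it should read as the base-by-base complement of its forward partner. The short Python check below illustrates this using the sequences listed for SEQ ID NOS 9 to 16; the helper name and dictionary layout are illustrative choices and not part of the original protocol.

```python
# Base-by-base complement table for DNA.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# (forward primer written 5'->3', reverse primer written 3'->5'),
# copied from the list of mutagenic primers.
PRIMERS = {
    "R743A": ("CCCGGCGGTGGAGGCGATTCTAAAAGCC",
              "GGGCCGCCACCTCCGCTAAGATTTTCGG"),
    "R713A": ("GAAGGATAGGCGACGCGGCGATTCCAGCTG",
              "CTTCCTATCCGCTGCGCCGCTAAGGTCGAC"),
    "K705A": ("GCTACATCGTCCTAGCGGGCTCTGGAAGG",
              "CGATGTAGCAGGATCGCCCGAGACCTTCC"),
    "71": ("GCTACATCGTCCTATGAGGCTCTGGAAGG",
           "CGATGTAGCAGGATACTCCGAGACCTTCC"),
}


def is_complementary_pair(fwd_5to3, rev_3to5):
    """True if the reverse primer (3'->5') is the complement of the forward."""
    return fwd_5to3.translate(COMPLEMENT) == rev_3to5


for name, (fwd, rev) in PRIMERS.items():
    print(name, len(fwd), is_complementary_pair(fwd, rev))
```

A check of this kind is a quick way to catch transcription errors in a primer list before ordering oligonucleotides.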

[0195] Potential clones were selected and PCR fragments of the gene sequenced to confirm the presence of the mutation. Positive clones were produced for all mutants.

Overexpression and Growth:

[0196] The constructs were transformed into the expression strain Novagen RosettaBlue DE3 pLysS.

[0197] Growth and induction carried out as described in Experimental section of WO 2005/024010.

[0198] Harvest and lysis carried out as described in Experimental section of WO 2005/024010.

[0199] Purification carried out as described in Experimental section of WO 2005/024010.

Results:

[0200] All mutant enzymes were successfully overexpressed. SDS-PAGE gels were run to check overexpression of the constructs (-=uninduced; +=IPTG induced). The resulting gels are shown in FIGS. 1A and 1B.

Example 2

NUNC Tube Assay Using Crude Protein Preparation

[0201] Small 5 ml cultures of the mutant enzymes (along with a culture of YAV C223S exo- for direct comparison) were taken through a quick purification as outlined in WO 2005/024010 up until the heat treatment step. At this point, the samples were considered to be sufficiently pure to test their activity.

[0202] The buffers for each of the crude preparations were exchanged into enzymology buffer (50 mM Tris pH 8.0, 6 mM MgSO4, 1 mM EDTA, 0.05% Tween20) using an S300 gel filtration spin-column. The samples were not normalised for concentration. The test employed was a simple incorporation of ffTTP into surface-coupled A-template hairpin. 2 pmoles of 5'-amino oligo 815 (5'-CGATCACGATCACGATCACGATCACGATCACGATCACGCTGATGTGCATGCTGTTGTTTTTTTACAACAGCATGCACATCAGCG-3') (SEQ ID NO: 17) was coupled to a NUNC-nucleolink strip according to the manufacturer's protocol.

[0203] Once washed, each well was incubated with a 20 µl aliquot of a crude enzyme preparation (identity of enzyme listed below) and 5 µM ffT-N3-647. The strip was then incubated at 45°C for 30 minutes. The experiment was performed in duplicate. Upon completion of the 30 minute incubation, wells were washed with 3×100 µl of high salt wash buffer (10 mM Tris pH 8.0, 1M NaCl, 10 mM EDTA) and then 3×100 µl of MilliQ water. Strips were scanned on a Typhoon fluorescence imager (Cy5 filter, PMT=450 V).

[0204] The results are presented in FIG. 2, in which the wells are as follows:

1 = 20 µl enzymology buffer only + 1 µl 100 µM ffT-N3-647
2 = 20 µl crude YAV C223S exo- + 1 µl 100 µM ffT-N3-647
3 = 20 µl crude YAV C223S R743A exo- (clone 12) + 1 µl 100 µM ffT-N3-647
4 = 20 µl crude YAV C223S K705A exo- (clone 15) + 1 µl 100 µM ffT-N3-647
5 = 20 µl crude YAV C223S R743A exo- (clone 16) + 1 µl 100 µM ffT-N3-647
6 = 20 µl crude YAV C223S R713A exo- (clone 24) + 1 µl 100 µM ffT-N3-647
7 = 20 µl crude YAV C223S 71 exo- (clone 38) + 1 µl 100 µM ffT-N3-647
8 = 20 µl crude YAV C223S R713A exo- (clone 39) + 1 µl 100 µM ffT-N3-647
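The well compositions above imply a final nucleotide concentration of a little under 5 µM, which can be confirmed with a short calculation. The Python sketch below assumes that volumes are simply additive; the function name is an illustrative choice.

```python
def final_concentration(stock_conc, stock_volume, diluent_volume):
    """Concentration after mixing, assuming the two volumes are additive.

    The result has the units of `stock_conc`; the two volumes must share
    the same units.
    """
    return stock_conc * stock_volume / (stock_volume + diluent_volume)


# 1 µl of 100 µM ffT-N3-647 added to 20 µl of buffer or enzyme preparation.
conc_uM = final_concentration(100.0, 1.0, 20.0)
print(f"{conc_uM:.2f} µM")
```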

Results

[0205] Enzymology was observed in all wells except the background wells (MilliQ only) and well 1 (no enzyme control). The fluorescence density is proportional to the amount of ffTTP incorporation--the darker the well, the greater the level of incorporation. Performance of the mutant enzymes will be discussed relative to YAV (clone 9)(YAV C223S exo-). Deletion of the tip of the thumb subdomain (71 mutant) results in an enzyme that is severely catalytically compromised, and only incorporates to 35% of the level seen for clone 9. Mutant K705A was equivalent to clone 9. The two arginine mutants R743A and R713A showed elevated levels of incorporation, showing ~45% improvements over clone 9.

Conclusion

[0206] Mutant enzymes K705A, R713A and R743A display improved levels of incorporation and decreased affinity of the enzyme for DNA. Removal of all three of these basic residues, in combination with deletion of additional residues, abolishes activity (71 mutant). It may be that substitution of all three residues would not lead to a decrease in activity, in the absence of further mutations/deletions.

Example 3

Single Base Incorporation Assay

[0207] The activity of the crude enzyme preparations (normalised concentrations) was measured using the single base incorporation assay as described in WO 2005/024010. 10 minute incubations were run with either 30 or 3 µg/ml crude enzyme preparation in the presence of 2 µM ffT-N3-cy3 and 20 nM 10 A hairpin DNA (³²P-labelled); aliquots of the reaction mixture were withdrawn at 0, 30, 60, 180 and 600 s and run on a 12% acrylamide gel.

Results

[0208] The gel images are shown in FIG. 3.

[0209] The band intensities were quantified using Imagequant and the fluorescence intensity plotted versus incubation time to generate the time-courses shown in FIG. 4.

[0210] These data give an estimate of the performance of the mutant enzymes for the first base incorporation of ffTTP relative to YAV. Due to the concentration normalisation, the activities are directly comparable. The 71 mutant is essentially inactive (kobs is 21% of that observed for YAV); R743A and K705A have comparable activities to YAV, but R713A shows a significant enhancement in both kobs (2× that observed for YAV) and the level of cycle completion.

Example 4

Single Base Incorporation Assay for Purified Polymerases Under Conditions where [DNA] is Greater than [Pol]

[0211] The activity of the purified enzyme preparations of Clone 9 polymerase (YAV C223S exo-) and the thumb sub-domain mutants K705A, R713A and R743A was measured using the single base incorporation assay as described in WO 2005/024010. The experiment was carried out such that the respective concentrations of DNA and polymerase were at a ratio of approximately 5:1. Thus, the ability of the enzyme to incorporate nucleotides into multiple DNA template molecules in a single reaction cycle was investigated. 30 minute incubations were run with 4 nM purified enzyme in the presence of 20 nM 10 A hairpin DNA (³²P-labelled) and 2 µM ffT-N3-cy3; aliquots of the reaction mixture were withdrawn at 0, 15, 30, 60, 180, 480, 900 and 1800 s intervals and run on a 12% acrylamide gel.

Results

[0212] The band intensities were quantified using Imagequant and the fluorescence intensity, converted into percentage completion (based on the relative intensities of the starting material and final product bands on the gel) plotted versus incubation time to generate the timecourses shown in FIG. 5.

[0213] Timecourse plots for clone 9 and K705A are biphasic in nature, displaying an initial exponential "burst" phase (black line) followed by a linear dependence of product conversion with time (grey line). The amplitude of the burst phase is greater for K705A than for clone 9 (~28% and 19% respectively) and the gradient of the linear phase is steeper (hence faster) for K705A than clone 9. The significance of this observation is discussed below.

[0214] In contrast to this, the R713A and R743A mutant enzymes do not show this biphasic nature; instead, only the fast exponential phase is observed. In both cases, the amplitude of the exponential phase is ~90%, indicating a higher degree of product conversion within this exponential phase than either clone 9 or K705A. The burst phase equates to the rate of incorporation of ffTTP of the population of DNA molecules associated with a polymerase prior to reaction initiation, i.e. the maximum rate at which the ternary pol:DNA:ffTTP complex can turn over. Any subsequent phase is attributed to a slower dissociation/re-association process required for the polymerase to sequester new substrate molecules (DNA and ffTTP). The biphasic nature observed for clone 9 and K705A suggests that the slow post-burst phase is caused by the difficulty of the enzyme to dissociate and re-associate with DNA, most likely due to their low Kd (DNA).

[0215] The mutation of basic residues that may contact duplex DNA when bound by the polymerase (namely R713 and R743) to remove this functionality results in mutant enzymes which only display burst kinetics (R713A and R743A). We interpret this in one of two ways: i) the mutations have improved the enzyme's ability to dissociate and re-associate with DNA by decreasing the affinity for DNA (increased Kd(DNA)) and/or ii) the decrease in affinity for DNA in these mutants results in a larger "active enzyme" fraction in the polymerase preparation. It has been shown that impure DNA polymerase (contaminated with E. coli genomic DNA carried through from lysis) inhibits the enzyme by reducing the active enzyme fraction of the preparation.

[0216] The crude fitting of the timecourses suggests that the observed rate constants for the burst phase seen for clone 9 and K705A are comparable (kobs ~0.06 s⁻¹) whereas this rate constant is smaller for R713A (kobs ~0.01 s⁻¹) and R743A (kobs ~0.004 s⁻¹). Under these experimental conditions, the burst is faster for clone 9 and K705A than for R713A or R743A, but the latter two enzymes reach completion in a shorter period of time due to the absence of the slow, linear dissociation/re-association phase inherent to clone 9 and K705A.
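The contrast between biphasic and burst-only timecourses can be sketched numerically. The Python example below models percentage completion as an exponential burst plus an optional linear phase; this functional form is an assumption made for illustration and is not stated to be the fitting function used for FIG. 5. The burst amplitudes and rate constants are the approximate values quoted in the text, whereas the linear rate for clone 9 is a hypothetical placeholder.

```python
import math


def percent_completion(t, burst_amplitude, k_burst, linear_rate=0.0):
    """Percentage completion at time t (s): exponential burst + linear phase.

    burst_amplitude is in %, k_burst in s^-1 and linear_rate in % per second.
    The result is capped at 100%.
    """
    burst = burst_amplitude * (1.0 - math.exp(-k_burst * t))
    return min(100.0, burst + linear_rate * t)


# Amplitudes and burst rate constants are the approximate values quoted in
# the text; linear_rate for clone 9 is hypothetical.
clone9 = dict(burst_amplitude=19.0, k_burst=0.06, linear_rate=0.02)
r713a = dict(burst_amplitude=90.0, k_burst=0.01)

for t in (0, 5, 60, 600, 1800):
    print(t,
          round(percent_completion(t, **clone9), 1),
          round(percent_completion(t, **r713a), 1))
```

Under these assumptions the enzyme with the faster burst leads only at the earliest time points, while the enzyme with the larger burst amplitude reaches a higher overall completion.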

Example 5

Washing Assay

[0217] Employing a washing assay qualitatively assesses the affinity of purified enzyme preparations for DNA. Four (1×8) NUNC nucleolink strips were functionalized with 2 pmoles of 5'-amino A-template hairpin, oligo 815 (5' H2N-CGATCACGATCACGATCACGATCACGATCACGATCACGCTGATGTGCATGCTGTTGTTTTTTTACAACAGCATGCACATCAGCG-3') (SEQ ID NO: 18) according to the manufacturer's protocol.

[0218] Once washed, each well was incubated with a 20 µl aliquot of 500 nM enzyme (clone 9, K705A, R713A or R743A mutants) at 45°C for 30 minutes. Post incubation, each well was washed with 3×100 µl of 10 mM Tris pH 8.0, 10 mM EDTA including varying concentrations of NaCl (0, 0.05, 0.1, 0.3, 0.4, 0.75, 1.0, 2.0 M) and then 3×100 µl MilliQ water. Wells were subsequently pre-equilibrated with enzymology buffer prior to a further incubation of 20 µl of 2 µM ffT-N3-647 at 45°C for 30 minutes. Wells were washed with 3×100 µl high salt wash buffer (10 mM Tris pH 8.0, 1M NaCl, 10 mM EDTA) and then 3×100 µl MilliQ water. Strips were scanned on a Typhoon fluorescence imager (Cy5 filter, PMT=500 V).

Results

[0219] The fluorescence image of the NUNC wells is shown in FIG. 6.

[0220] Any fluorescence in the wells is due to residual enzyme bound to the surface-coupled DNA post-wash. Increasing the ionic strength of the wash buffer between incubation should destabilise the interaction between the polymerase and the DNA by masking electrostatic interactions. Enzyme should be more effectively washed off the DNA at higher ionic strength.

[0221] When a low ionic strength wash was employed between incubations, all enzymes tested displayed a high level of incorporation, indicating that such a wash is ineffective at dissociating enzyme from DNA. As the concentration of NaCl in the wash buffer increased, the behaviour of the enzymes relative to each other changed. Mutant enzymes R713A and R743A were more effectively removed from the DNA at [NaCl]<200 mM, whereas K705A and clone 9 showed a similar response to each other, but required higher [NaCl] to remove them from the DNA. Even after a wash with 2 M NaCl, a significant (ca. 75%) level of incorporation relative to a 0 M NaCl wash was observed for clone 9. This is clearly illustrated in the plot shown in FIG. 7 (the data for K705A has been omitted for clarity). Interestingly, none of the enzymes tested appeared to be completely removed from the DNA after experiencing a 2 M NaCl wash.

[0222] From this experiment, it is clear that mutating residues R713 and R743 results in enzymes that display lower affinity for DNA than clone 9, as evidenced by their ability to be washed from DNA by lower ionic strength washes.

Example 6

Incorporation Kinetics of ffT-N3-Cy3 by Clone 9, R713A and R743A

[0223] The kinetic characterization of the enzymes was conducted using the NUNC tube assay and involved the measurement of rate constants for the first order incorporation of ffT-N3-cy3 where [DNA]<<[pol] or [ffNTP], at a variety of [ffTTP]. The methodology used for each of the three polymerases tested is described below.

[0224] Six (1×8) NUNC nucleolink strips were functionalized with 2 pmoles of 5'-amino A template hairpin oligo 815 (5' H2N-CGATCACGATCACGATCACGATCACGATCACGATCACGCTGATGTGCATGCTGTTGTTTTTTTACAACAGCATGCACATCAGCG-3') (SEQ ID NO: 18), according to the manufacturer's protocol.

[0225] Each strip was employed for a time-course experiment at a particular [ffT-N3-cy3]. 20 µl of enzymology buffer (50 mM Tris pH 8.0, 6 mM MgSO4, 1 mM EDTA, 0.05% Tween20) was incubated in each NUNC well at 45°C for 2 minutes.

[0226] Time-courses were initiated by addition of a 20 µl aliquot of 2× enzymology mix (X µM ffT-N3-cy3, 1.1 µM polymerase in enzymology buffer) pre-equilibrated at 45°C for 2 minutes using an 8-channel multipipette in order to start reactions in individual wells at identical time-points. The action of adding the 2× enzymology mix to the buffer in the well is sufficient to allow adequate mixing. The reactions were stopped at desired time-points by the addition of 125 µl of 250 mM EDTA. After reactions in all 8 wells had been stopped, strips were washed with 3×100 µl high salt wash (10 mM Tris pH 8.0, 1 M NaCl, 10 mM EDTA) and then 3×100 µl MilliQ water and then scanned on a Typhoon fluorescence imager (Cy3 filter, PMT=500 V). Fluorescence intensities in each well were quantified using Imagequant. Plotting the variation in Cy3 fluorescence intensity vs. time generates time-course graphs. Under our experimental conditions, these time-course plots fit well to a single exponential decay process (fitted to equation: y=y0+A exp(-x/t)) from which the reaction half life, t, is determined, the inverse of which is termed the observed rate constant kobs (kobs=1/t).
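The extraction of kobs from a time-course can be illustrated with a short Python sketch. It assumes a single exponential approach to a plateau with a decaying exponential term, and it further assumes that the plateau value is already known, so that a simple linear regression on the logarithm of the remaining signal suffices. This is a simplification: a full fit would normally estimate the plateau, amplitude and time constant together by non-linear regression. The data are synthetic and all names and values are hypothetical.

```python
import math


def estimate_kobs(times, signal, plateau):
    """Estimate kobs for y = plateau + A*exp(-kobs*t).

    Uses least-squares regression of ln|plateau - y| against t; the slope
    of that line is -kobs. Points lying on the plateau are skipped.
    """
    points = [
        (t, math.log(abs(plateau - y)))
        for t, y in zip(times, signal)
        if abs(plateau - y) > 1e-9
    ]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_log = sum(value for _, value in points) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in points)
    sxy = sum((t - mean_t) * (value - mean_log) for t, value in points)
    return -sxy / sxx


# Synthetic, noise-free time-course with hypothetical parameters.
true_k, y0, amplitude = 0.05, 1000.0, -1000.0
times = [0, 5, 10, 20, 40, 60, 100, 160]  # eight wells per strip
signal = [y0 + amplitude * math.exp(-true_k * t) for t in times]

kobs = estimate_kobs(times, signal, plateau=y0)
time_constant = 1.0 / kobs
print(round(kobs, 4), round(time_constant, 1))
```

Note that the quantity 1/kobs recovered here is the time constant of the exponential, which is the parameter t in the fitted equation.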

[0227] The magnitude of the observed rate constant is dependent on the concentration of ffT-N3-cy3, so by repeating this experiment at different ffT-N3-cy3 concentrations a range of kobs values can be determined for a particular enzyme. The variation of kobs with ffT-N3-cy3 concentration is hyperbolic and fits well to the Michaelis-Menten equation: kobs = (kpol × [S])/(Kd + [S]), where S = ffT-N3-cy3, according to standard enzymological analysis. From the Michaelis plot, key values characteristic of a particular enzyme catalyzing a particular reaction can be obtained, namely kpol (defined as the rate constant for the process at infinite substrate concentration) and Kd (defined as the dissociation constant, the concentration of substrate at which kobs = kpol/2). This process was repeated for clone 9 and for the R713A and R743A mutants.
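The hyperbolic dependence described above may be illustrated with the following minimal Python sketch. It uses the kpol and Kd values reported for clone 9 in the results table; the substrate concentrations are hypothetical, the kobs values are synthetic (generated from the equation itself), and the Hanes-Woolf linearisation is used here only as a simple stand-in, since the fitting procedure actually used is not stated.

```python
def k_obs(s, k_pol, k_d):
    """Hyperbolic rate law: kobs = kpol*[S] / (Kd + [S])."""
    return k_pol * s / (k_d + s)

def hanes_woolf_fit(s_values, k_values):
    """Estimate (kpol, Kd) from the linear form [S]/kobs = [S]/kpol + Kd/kpol."""
    xs = list(s_values)
    ys = [s / k for s, k in zip(s_values, k_values)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                       # equals 1/kpol
    intercept = mean_y - slope * mean_x     # equals Kd/kpol
    return 1.0 / slope, intercept / slope

# Values reported for clone 9: kpol = 0.061 s^-1, Kd = 1.72 uM.
K_POL, K_D = 0.061, 1.72

at_kd = k_obs(K_D, K_POL, K_D)             # half-maximal rate at [S] = Kd
at_high = k_obs(1000 * K_D, K_POL, K_D)    # approaches kpol at saturating [S]

# Hypothetical concentrations (uM) and noise-free synthetic rates.
concentrations = [0.5, 1.0, 2.0, 4.0, 8.0]
synthetic = [k_obs(s, K_POL, K_D) for s in concentrations]
k_pol_fit, k_d_fit = hanes_woolf_fit(concentrations, synthetic)
```

Because the synthetic data contain no noise, the linear fit returns the input constants; with real measurements a direct non-linear fit would generally be preferred.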

[0228] Michaelis plots for all of the enzymes are shown overlaid in FIG. 8.

Results

[0229] The kinetic characteristics of ffT-N3-cy3 incorporation for the enzymes tested are summarized below.

TABLE-US-00002
	Clone 9	R713A	R743A
kpol (s^-1)	0.061	0.10	0.068
Kd (µM)	1.72	3.32	1.92

[0230] From this, it appears as though the mutations to the DNA-binding region of the polymerases have not adversely affected either the activity of the enzymes (at high substrate concentrations, kpol approximates to Vmax) or the affinity the enzymes have for fully functional nucleotide (in this case ffT-N3-cy3, but the trend is considered to be applicable to all bases). This is an ideal situation, as the mutations have had the desired effect of modifying the DNA-binding affinity of the enzymes without affecting other key catalytic properties.

Example 7

Purification of the Polymerases and Measurement of Levels of Carry Over DNA

DNA Contamination

[0231] Pico green assay (Molecular Probes kit, cat # P11496).

Solutions Required

[0232] TE buffer: 10 mM Tris.HCl pH 7.5, 1 mM EDTA

40 mL required: 2 mL of 20× TE buffer added to 38 mL H2O.

λ DNA

[0233] Solution 1 (2 µg/mL λ DNA): dilute 15 µL of λ DNA with 735 µL of 1× TE buffer.

[0234] Solution 2 (50 ng/mL λ DNA): dilute 25 µL of λ DNA with 975 µL of 1× TE buffer.
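The dilutions above can be checked with a short calculation. The following Python sketch is illustrative only. The stock λ DNA concentration of 100 µg/mL is an assumption (it is not stated here, but it reproduces the 2 µg/mL of Solution 1), as is the reading that the 25 µL used for Solution 2 is drawn from Solution 1 (which reproduces the stated 50 ng/mL).

```python
def diluted_concentration(stock_conc, stock_volume, diluent_volume):
    """Concentration after mixing stock with diluent (C1*V1 = C2*V2)."""
    return stock_conc * stock_volume / (stock_volume + diluent_volume)

# 1x TE: 2 mL of 20x TE added to 38 mL water.
te_strength = diluted_concentration(20.0, 2.0, 38.0)

# Solution 1: 15 uL of stock into 735 uL TE.
# ASSUMPTION: stock lambda DNA is 100 ug/mL.
solution_1_ug_per_ml = diluted_concentration(100.0, 15.0, 735.0)

# Solution 2: 25 uL into 975 uL TE.
# ASSUMPTION: the 25 uL is taken from Solution 1 (converted to ng/mL).
solution_2_ng_per_ml = diluted_concentration(
    solution_1_ug_per_ml * 1000.0, 25.0, 975.0
)
```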

Standard Curve

[0235] In 2 mL eppendorfs the following samples were made:

TABLE-US-00003
Sample λ DNA (ng)	λ DNA @ 2 µg/mL (µL)	λ DNA @ 50 ng/mL (µL)	glycerol storage buffer (µL)	TE (µL)
100	160	-	400	1040
25	40	-	400	1160
10	16	-	400	1184
2.5	-	160	400	1040
1	-	64	400	1136
0.25	-	16	400	1184
0.025	-	1.6	400	1198.4
0	-	-	400	1200

3×500 µL from each sample was put into 3 eppendorfs.
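As a consistency check on the standard curve, the amount of λ DNA in each 500 µL aliquot can be recomputed from the volumes listed. The Python sketch below is illustrative only; it assumes that the first λ DNA column refers to the 2 µg/mL solution (Solution 1) and that the "Sample λ DNA (ng)" value is the amount per 500 µL aliquot, which is the reading under which the numbers agree.

```python
ALIQUOT_UL = 500.0

# One tuple per standard: (volume of lambda DNA solution in uL,
# its concentration in ng/mL, glycerol storage buffer in uL, TE in uL).
standards = [
    (160.0, 2000.0, 400.0, 1040.0),
    (40.0, 2000.0, 400.0, 1160.0),
    (16.0, 2000.0, 400.0, 1184.0),
    (160.0, 50.0, 400.0, 1040.0),
    (64.0, 50.0, 400.0, 1136.0),
    (16.0, 50.0, 400.0, 1184.0),
    (1.6, 50.0, 400.0, 1198.4),
    (0.0, 0.0, 400.0, 1200.0),
]

def ng_per_aliquot(dna_ul, conc_ng_per_ml, buffer_ul, te_ul):
    """Mass of DNA (ng) contained in one 500 uL aliquot of the mixture."""
    total_ul = dna_ul + buffer_ul + te_ul
    total_ng = dna_ul / 1000.0 * conc_ng_per_ml
    return total_ng * ALIQUOT_UL / total_ul

amounts = [round(ng_per_aliquot(*row), 3) for row in standards]
```

Each mixture totals 1600 µL, so three 500 µL aliquots can be drawn from it with a small excess.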

Enzyme Samples

[0236] In 5 mL bijou bottles the following samples were made:

TABLE-US-00004
Sample	Source	Amount (µL)	glycerol storage buffer (µL)	TE (µL)
1	enzyme stock	400	-	1800
2	sample 1	1100	200	900
3	sample 2	1100	200	900
4	sample 3	1100	200	900

2×500 µL from each sample was put into 2 eppendorfs.

[0237] A picogreen solution was prepared: 85 µL of picogreen stock was added to 17 mL of 1× TE buffer.

[0238] 500 µL of this solution was added to each of the standard curve and enzyme samples and mixed well by pipetting, and then all samples were transferred to 1.5 mL fluorimeter cuvettes.
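The volume of picogreen solution prepared can be compared with the volume consumed. The following Python sketch is illustrative only and simply counts the aliquots described above (eight standards in triplicate, four enzyme samples in duplicate); the variable names are chosen for the example.

```python
standards, replicates_per_standard = 8, 3      # standard-curve samples x aliquots
enzyme_samples, replicates_per_enzyme = 4, 2   # enzyme samples x aliquots
reagent_per_cuvette_ul = 500

cuvettes = (standards * replicates_per_standard
            + enzyme_samples * replicates_per_enzyme)
reagent_needed_ml = cuvettes * reagent_per_cuvette_ul / 1000

# 85 uL of picogreen stock added to 17 mL of 1x TE.
reagent_prepared_ml = (85 + 17000) / 1000
dilution_factor = (85 + 17000) / 85
```

On this count 32 cuvettes require 16 mL, so the 17 mL prepared leaves roughly 1 mL of excess.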

Using the Fluorimeter

[0239] The advanced reads program of the Cary Eclipse file was utilised. The λ excitation was set to 480 nm and the λ emission was set to 520 nm, and 1000 volts were used.

Analysis

[0240] Data for the standard curve were entered into GraphPad Prism and a standard curve of the formula y=ax+c was fitted. The concentration values, x, were then determined.
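The analysis step, fitting y=ax+c to the standards and then solving for x, may be sketched as follows. This Python example is illustrative only: the fluorescence readings are invented (constructed to be exactly linear) and a plain least-squares fit stands in for the fitting performed in the graphing software.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a*x + c; returns (a, c)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = sxy / sxx
    c = mean_y - a * mean_x
    return a, c

def dna_from_fluorescence(y, a, c):
    """Invert the standard curve: x = (y - c) / a."""
    return (y - c) / a

# Hypothetical standards (ng) and readings (arbitrary units), NOT measured data.
std_ng = [0.0, 0.25, 1.0, 2.5, 10.0, 25.0]
std_signal = [5.0 + 8.0 * x for x in std_ng]

a, c = fit_line(std_ng, std_signal)
unknown_ng = dna_from_fluorescence(85.0, a, c)
```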

Results

TABLE-US-00005 [0241]
Polymerase sample	Concentration of DNA associated with purified polymerase
Clone 9 batch 5	62.9 ng ± 1.9 ng
Clone 9 batch 6	63.7 ng ± 2.1 ng
Clone 9 R743A	0.04 ng ± 6.4 ng
Clone 9 R713A	8.2 ng ± 4.2 ng

[0242] From this experiment, it is clear that the alterations in the polymerases enhance purification of the enzyme since less endogenous DNA is carried over during purification. As mentioned above, carry over of endogenous DNA can adversely influence activity of the enzyme and so the mutations are clearly advantageous.

Example 8

Preparation of a Modified Optimised Codon Usage Nucleic Acid Sequence which Encodes the Clone 9 Polymerase

[0243] The amino acid sequence shown in SEQ ID NO 1 was translated into a nucleic acid sequence using the optimal nucleic acid sequence at each codon to encode for the required/desired amino-acid.
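The back-translation step may be illustrated with the following Python sketch. The preferred-codon table is an assumption made for the example (each entry is a valid codon of the standard genetic code, but it is not the table used to derive SEQ ID NO: 19), and the stop codon is likewise chosen arbitrarily.

```python
# ILLUSTRATIVE table: one "preferred" codon per amino acid (one-letter code).
# The choice of codons is an assumption, not the table used for SEQ ID NO: 19.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

def back_translate(protein, table=PREFERRED_CODON, stop="TAA"):
    """Replace each residue with its single preferred codon and append a stop."""
    return "".join(table[residue] for residue in protein) + stop

# First eight residues of SEQ ID NO: 1 (Met Ile Leu Asp Thr Asp Tyr Ile).
n_terminus = "MILDTDYI"
dna = back_translate(n_terminus)
```

A one-codon-per-residue table is the simplest form of codon optimisation; practical gene design commonly also considers factors such as restriction sites and repeats, which this sketch ignores.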

[0244] The deduced nucleic acid sequence is shown in SEQ ID NO. 19.

[0245] In a similar scenario, the nucleic acid sequence presented as SEQ ID NO:20 was deduced based upon the amino-acid sequence of the polymerase presented as SEQ ID NO: 21. The polymerase having the amino acid sequence presented as SEQ ID NO: 21 comprises the R743A mutation and also carries a substitution mutation to Serine at both residues 141 and 143. Nucleic acid molecules and proteins comprising the respective nucleotide and amino acid sequences form a part of the invention.
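Purely as an illustration of how numbered substitutions of this kind can be applied to a sequence, the Python sketch below mutates a short fragment. The fragment is residues 129 to 144 as read from the listing of SEQ ID NO: 1 (where positions 141 and 143 are alanine); the helper function and its argument names are hypothetical, and the result is not asserted to reproduce SEQ ID NO: 21.

```python
def substitute(sequence, first_residue_number, position, new_residue, expected=None):
    """Return a copy of sequence with one residue replaced.

    position uses the numbering of the full-length protein;
    first_residue_number is the number of the fragment's first residue.
    """
    index = position - first_residue_number
    if not 0 <= index < len(sequence):
        raise ValueError("position lies outside the fragment")
    if expected is not None and sequence[index] != expected:
        raise ValueError("unexpected residue at position %d" % position)
    return sequence[:index] + new_residue + sequence[index + 1:]

# Residues 129-144 of SEQ ID NO: 1 in one-letter code.
fragment = "MEGDEELTMLAFAIAT"
mutant = substitute(fragment, 129, 141, "S", expected="A")
mutant = substitute(mutant, 129, 143, "S", expected="A")
```

Checking the expected residue before substituting guards against off-by-one numbering errors.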

Cloning of a codon-modified gene of clone 9 into the expression vector pET11-a using NdeI-Nhe I sites (to preserve the internal Bam H I site).

Synthesis of a Codon-Optimised Gene of Clone 9

[0246] The nucleic acid sequence of SEQ ID NO 19 was synthesized and supplied in pPCR-Script by GENEART.

The DNA and protein sequences were confirmed (results not shown).

Cloning of pSV57 (Codon-Modified Gene of Clone 9 in the pPCRScript Vector) into pET11-a (Hereinafter Named pSV 52)

Preparation of the pET11-a Vector

[0247] The pET11-a vector (Novagen catalog No. 69436-3) was digested with Nde I and Nhe I, dephosphorylated, and any undigested vector ligated using standard techniques.

[0248] The digested vector was purified on a 0.8% agarose gel using the MinElute® Gel Extraction Kit protocol from Qiagen®.

[0249] The purified digested pET11-a vector was quantified using a polyacrylamide TB 4-20% gel.

Preparation of the Insert (Codon-Modified Gene of Clone 9)

[0250] The codon-modified gene of clone 9 synthesized by GENEART in the pPCRScript vector (hereinafter pSV 57) was digested with Nde I and Nhe I.

[0251] The digested insert was purified on a 0.8% agarose gel using the MinElute® Gel Extraction Kit protocol from Qiagen®.

[0252] The purified digested insert was quantified using a polyacrylamide TB 4-20% gel.

Ligation

[0253] The pET11-a vector and the insert were ligated (ratio 1:3) at the Nde I and Nhe I restriction sites using the Quick ligation kit (NEB, M2200S).
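For a given mass of vector, the mass of insert corresponding to a chosen ratio can be estimated from the fragment lengths. The Python sketch below is illustrative only and rests on several assumptions: that the 1:3 ratio is a molar vector:insert ratio, that pET11-a is approximately 5677 bp, that the insert is approximately 2328 bp (775 codons plus a stop codon), and that 50 ng of vector is used. None of these values is stated in the protocol above.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, insert_to_vector=3.0):
    """Mass of insert giving the requested insert:vector molar ratio."""
    return vector_ng * (insert_bp / vector_bp) * insert_to_vector

VECTOR_BP = 5677   # ASSUMPTION: approximate size of pET11-a
INSERT_BP = 2328   # ASSUMPTION: 775 codons plus one stop codon
VECTOR_NG = 50.0   # ASSUMPTION: hypothetical vector mass per ligation

required_insert_ng = insert_mass_ng(VECTOR_NG, VECTOR_BP, INSERT_BP)
```

Under these assumptions roughly 60 ng of insert would accompany 50 ng of vector.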

Transformation

[0254] 2 µl of the ligation mixture was used to transform XL10-gold ultracompetent cells (Stratagene catalog No 200315).

PCR Screening of the Colonies Containing the Insert

[0255] Transformants were picked and DNA minipreps of 3 positive clones of XL10-gold transformed with the ligation product were prepared. The three purified plasmids (hereinafter pSV52, clones 1, 2 and 4) were sequenced at the cloning sites and all three clones were found to have the correct sequence at the cloning sites.

[0256] The minipreps were also used to transform the expression E. coli host BL21-CodonPlus (DE3)-RIL (Stratagene catalog No. 230245) as described below.

Southern Blotting

[0257] pVent (pNEB917 derived vector), pSV43 (clone 9 in pET11-a), pSV54 (codon-optimised clone in pET11-a) and pSV57 (codon-modified gene in pPCR-Script supplied by GENEART) were restricted and Southern blotted to check for cross hybridisation between the genes (results not shown).

Expression Studies of Pol 52

[0258] Transformation of pSV52 (clones 1, 2 and 4) into the expression host E. coli BL21-CodonPlus (DE3) RIL (Stratagene catalog No 230245).

[0259] 21-25 ng of purified pSV52 plasmid DNA (clones 1, 2 and 4) was used to transform competent cells of the expression host E. coli BL21-CodonPlus (DE3) RIL (hereinafter RIL) using the manufacturer's instructions.

[0260] 50 µl of each transformation was plated onto fresh Luria-Bertani (LB) agar medium containing 100 µg/ml of carbenicillin and 34 µg/ml of chloramphenicol (LBCC agar medium) and incubated overnight at 37° C.

[0261] The following glycerol stocks were also plated onto LBCC agar plates to be used as controls for the expression studies and incubated overnight at 37° C.

SOL10204: RIL-pSV19 (clone 9 in pNEB 917 vector)
SOL10354: RIL-pSV43 (clone 9 in pET11-a vector)

Production of Cell Pellets Expressing Pol 52 and the Positive Controls of Clone 9

[0262] Single transformed E. coli colonies were used to inoculate starter cultures of 3 ml LBCC media in culture tubes and incubated overnight at 37° C. with shaking (225 rpm).

[0263] The starter cultures were diluted 1/100 into 50 ml LBCC media in sterile vented Erlenmeyer flasks and incubated at 37° C. with vigorous shaking (300 rpm) for approximately 4 hours until OD600nm was approximately 1.0.

[0264] 10 ml of the uninduced cultures was removed and the cells harvested (as described below).

[0265] IPTG was added to a final concentration of 1 mM and the cultures induced for 2 hours at 37° C. with vigorous shaking (300 rpm).

[0266] 10 ml of the induced cultures was removed and the cells harvested as follows:

[0267] Induced and uninduced cells were harvested by centrifugation at 5000×g for 30 min at 4° C.

[0268] The cell pellets were washed and resuspended in 1/10th of the culture volume of 1× Phosphate Buffered Saline (PBS) and centrifuged as above.

[0269] The supernatants were decanted and the pellets stored at -20° C. until required for the cell lysis and purification steps.

Cell Lysis and Crude Purification of Pol 52 and Clone 9

[0270] The cell pellets were thawed and resuspended in 1/50th of culture volume of 1× Wash buffer (50 mM Tris-HCl pH 7.9, 50 mM glucose, 1 mM EDTA) containing 4 mg/ml lysozyme freshly added to the 1× buffer and incubated at room temperature for 15 min.

[0271] An equal volume of 1× Lysis buffer (10 mM Tris-HCl pH 7.9, 50 mM KCl, 1 mM EDTA, 0.5% (w/v) Tween 20) containing 0.5% (w/v) Tergitol NP-40 and 1× "complete EDTA-free" proteinase inhibitor cocktail (both added freshly to the 1× Lysis buffer) was added to the cells, which were gently mixed and incubated at room temperature for 30 min.

[0272] The cells were heated at 80° C. for 1 hr in a water bath then centrifuged at 38,800×g for 30 min at 4° C. to remove cell debris and denatured protein.
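Centrifugation conditions are given above as relative centrifugal force (×g). Where a rotor speed is needed, it can be estimated from the standard relationship RCF = 1.118×10^-5 × r × rpm^2, with r in cm. The Python sketch below is illustrative only; the rotor radius of 10 cm is a hypothetical value, and the actual rotor used is not stated.

```python
import math

def rpm_for_rcf(rcf, radius_cm):
    """Rotor speed (rpm) giving the requested RCF at the stated radius.

    Uses RCF = 1.118e-5 * radius_cm * rpm**2.
    """
    return math.sqrt(rcf / (1.118e-5 * radius_cm))

def rcf_for_rpm(rpm, radius_cm):
    """Relative centrifugal force produced at a given speed and radius."""
    return 1.118e-5 * radius_cm * rpm ** 2

ROTOR_RADIUS_CM = 10.0   # ASSUMPTION: hypothetical rotor radius

harvest_rpm = rpm_for_rcf(5000, ROTOR_RADIUS_CM)       # cell harvest step
clearing_rpm = rpm_for_rcf(38800, ROTOR_RADIUS_CM)     # lysate clearing step
```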

Preparation of Samples Normalised for Volume and SDS-PAGE Analysis

[0273] The expression of Pol 52 and clone 9 DNA polymerases was assessed by analysis of the crude lysates of the uninduced and induced control samples on SDS-PAGE followed by Coomassie blue staining.

[0274] Supernatants were carefully removed and the samples normalised to volume by the addition of 50:50 (v/v) 1× Wash buffer and 1× Lysis buffer to a final volume of 370 µl.

Preparation of Samples for Gel I

[0275] 10 µl of the normalised crude lysates (from uninduced and induced samples) were mixed with 10 µl of loading buffer containing 143 mM DTT.

Preparation of Samples for Gel II

[0276] Normalised crude lysates from the induced samples only were diluted 1/10 in distilled water to a final volume of 10 µl and mixed with 10 µl of loading buffer containing 143 mM DTT.

[0277] All samples were heated at 70° C. for 10 minutes.
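The sample preparation arithmetic for the two gels may be sketched as follows. This Python example is illustrative only; the function names are chosen for the example, and the split of the 1/10 dilution into 1 µl of lysate and 9 µl of water follows directly from the stated final volume of 10 µl.

```python
def final_concentration(stock_conc, stock_volume, total_volume):
    """Concentration of a component after mixing into total_volume."""
    return stock_conc * stock_volume / total_volume

def dilution_volumes(final_volume_ul, fold=10):
    """Volumes of sample and diluent for a 1/fold dilution."""
    sample_ul = final_volume_ul / fold
    return sample_ul, final_volume_ul - sample_ul

# 10 ul of sample plus 10 ul of loading buffer containing 143 mM DTT.
dtt_final_mM = final_concentration(143.0, 10.0, 20.0)

# Gel II: lysate diluted 1/10 in water to a final volume of 10 ul.
lysate_ul, water_ul = dilution_volumes(10.0)

# Fraction of undiluted lysate in the loaded mixture for each gel.
gel_1_fraction = 10.0 / 20.0
gel_2_fraction = lysate_ul / 20.0
```

On this arithmetic the Gel II samples carry one tenth as much lysate per loaded volume as the Gel I samples.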

SDS-PAGE

[0278] A NuPage® 4-12% Bis-Tris gel (Invitrogen catalog No NP0321BOX) was prepared according to the manufacturer's instructions.

[0279] 10 µl of SeeBlue® Plus2 pre-stained proteins standard (Invitrogen catalog No LC5925) and µl of each sample were loaded and the gels run at a constant 200V for 50 minutes.

[0280] The gels were stained with Coomassie blue (SimplyBlue® Safe stain, Invitrogen, catalog No. LC 6060).

Results

[0281] The results of the SDS-PAGE are shown in FIG. 10.

The estimated expression level in this experiment is 20 mg/L of culture.

[0282] Similar levels of expression of the codon-modified gene of clone 9 in E. coli host BL21-CodonPlus (DE3)-RIL (Pol 52) were obtained using the expression vector pET11-a when compared to the un-modified gene of clone 9 in the same cells using either the expression vector pNEB917 (Pol 19) or pET11 (Pol 43).

[0283] No significant differences were observed in the levels of expression of the 3 different clones of Pol 52.

[0297] While certain of the preferred embodiments of the present invention have been described and specifically exemplified above, it is not intended that the invention be limited to such embodiments. Various modifications may be made thereto without departing from the scope and spirit of the present invention, as set forth in the following claims.

Sequence CWU 1

1

241775PRTArtificial sequenceMutant polymerase 1Met Ile Leu Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile 1 5 10 15 Arg Val Phe Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg 20 25 30 Thr Phe Glu Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile 35 40 45 Glu Asp Val Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys 50 55 60 Val Lys Arg Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile 65 70 75 80 Glu Val Trp Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile 85 90 95 Arg Asp Arg Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr 100 105 110 Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro 115 120 125 Met Glu Gly Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr 130 135 140 Leu Tyr His Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile 145 150 155 160 Ser Tyr Ala Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile 165 170 175 Asp Leu Pro Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys 180 185 190 Arg Phe Leu Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr 195 200 205 Tyr Asn Gly Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Ser Glu 210 215 220 Glu Leu Gly Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys 225 230 235 240 Ile Gln Arg Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile 245 250 255 His Phe Asp Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr 260 265 270 Tyr Thr Leu Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu 275 280 285 Lys Val Tyr Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly 290 295 300 Leu Glu Arg Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr 305 310 315 320 Glu Leu Gly Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu 325 330 335 Ile Gly Gln Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu 340 345 350 Val Glu Trp Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala 355 360 365 Pro Asn Lys Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr 370 375 380 Ala Gly Gly Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile 385 390 395 400 Val Tyr Leu Asp Phe Arg Ser Tyr Ala Val Ser Ile 
Ile Ile Thr His 405 410 415 Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp 420 425 430 Val Ala Pro Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe 435 440 445 Ile Pro Ser Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys 450 455 460 Arg Lys Met Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp 465 470 475 480 Tyr Arg Gln Arg Leu Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr 485 490 495 Tyr Gly Tyr Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser 500 505 510 Val Thr Ala Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu 515 520 525 Glu Glu Lys Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu 530 535 540 His Ala Thr Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala 545 550 555 560 Lys Glu Phe Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu 565 570 575 Leu Glu Tyr Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys 580 585 590 Lys Tyr Ala Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu 595 600 605 Glu Ile Val Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala 610 615 620 Arg Val Leu Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val 625 630 635 640 Arg Ile Val Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro 645 650 655 Pro Glu Lys Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp 660 665 670 Tyr Lys Ala Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala 675 680 685 Arg Gly Val Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu 690 695 700 Ala Gly Ser Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe 705 710 715 720 Asp Pro Thr Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln 725 730 735 Val Leu Pro Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys 740 745 750 Glu Asp Leu Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp 755 760 765 Leu Lys Val Lys Gly Lys Lys 770 775 22328DNAArtificial sequenceMutant polymerase 2atgattctcg ataccgacta catcaccgag aacgggaagc ccgtgataag ggtcttcaag 60aaggagaacg gcgagtttaa aatcgagtac gacagaacct tcgagcccta cttctacgcc 120cttctgaagg acgattctgc gatagaggac gtcaagaagg taaccgcaaa 
gaggcacgga 180acggttgtca aggtgaagcg cgccgagaag gtgcagaaga agttcctcgg caggccgata 240gaggtctgga agctctactt caaccatcct caggacgtcc cggcgattcg agacaggata 300cgtgcccacc ccgctgtcgt tgacatctac gagtacgaca tacccttcgc caagcgctac 360ctcatcgaca agggcctgat tccgatggag ggcgacgagg agcttacgat gctcgccttc 420gcgatcgcaa ccctctatca cgagggcgag gagttcggaa ccgggccgat tctcatgata 480agctacgccg acgggagcga ggcgagggtg ataacctgga agaagattga ccttccgtac 540gttgacgtcg tctcgaccga gaaggagatg attaagcgct tcctccgcgt cgtcagggag 600aaggaccccg acgtgctcat cacctacaac ggcgacaact tcgacttcgc ctacctgaag 660aagcgctctg aggaactcgg aataaagttc acactcggca gggacgggag cgagccgaag 720atacagcgaa tgggcgaccg ctttgccgtt gaggtgaagg gcaggattca cttcgacctc 780taccccgtca taaggcgcac gataaacctc ccgacctaca cccttgaggc cgtttacgag 840gccgtctttg gaaagcccaa ggagaaggtt tacgcagagg agatagcgca ggcctgggag 900agcggggagg gccttgaaag ggttgcaaga tactcgatgg aggacgctaa ggtgacctac 960gagctgggaa gggagttctt cccgatggag gcccagcttt cgaggcttat aggccagagc 1020ctctgggacg tctcgcgctc gagcaccgga aatttggtgg agtggttcct cctgcggaag 1080gcctacaaga ggaacgagct cgccccaaac aagcccgacg agagggagct cgcgagacgg 1140cgcgggggct acgctggcgg gtacgttaag gaaccagagc ggggattgtg ggacaacatt 1200gtgtatctag acttccgctc gtatgcggtt tcaatcatca taacccacaa cgtctcgccg 1260gataccctca accgcgaggg ctgtaaagag tacgacgtcg cccctgaggt tggacacaag 1320ttctgcaagg acttccccgg cttcatacca agcctcctgg gagatttgct cgaggagagg 1380cagaagataa agcggaagat gaaggcaacg gttgacccgc tggagaagaa actcctcgat 1440tacaggcaga ggctgatcaa aatcctcgcc aacagcttct acggctacta cggctacgcc 1500aaggcccggt ggtactgcaa ggagtgcgcc gagagcgtta cggcctgggg aagggagtat 1560atagaaatgg ttatccggga actcgaagaa aaattcggtt ttaaagttct ctatgccgat 1620acagacggtc tccatgctac cattcccgga gcagacgctg aaacagtcaa gaaaaaagca 1680aaggagttct taaaatacat taatccaaaa ctgcccggcc tgctcgaact tgagtacgag 1740ggcttctacg tgaggggctt cttcgtcacg aagaagaagt acgctgtgat agacgaggag 1800ggcaagataa ccacgagggg tcttgagatt gtgaggcgcg actggagcga gatagcgaag 1860gagacccagg ccagggtctt agaggcgata 
ctcaagcacg gtgacgtcga ggaggccgtt 1920aggatagtca aggaagtgac ggaaaagctg agcaagtatg aggtcccgcc cgagaagctg 1980gtaatccacg agcagataac gcgcgatttg agggattaca aagccaccgg cccgcacgtt 2040gccgttgcga agaggctcgc ggcgcgtgga gtgaaaatcc ggcccggcac ggtgataagc 2100tacatcgtcc tagcgggctc tggaaggata ggcgacaggg cgattccagc tgatgagttc 2160gacccgacga agcaccgcta cgatgcggaa tactacatcg agaaccaggt tctcccggcg 2220gtggagagga ttctaaaagc cttcggctat cggaaggagg atttgcgcta ccagaagacg 2280aagcaggtcg gcttgggcgc gtggctgaag gtgaagggga agaagtga 23283775PRTArtificial sequenceMutant polymerase 3Met Ile Leu Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile 1 5 10 15 Arg Val Phe Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg 20 25 30 Thr Phe Glu Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile 35 40 45 Glu Asp Val Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys 50 55 60 Val Lys Arg Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile 65 70 75 80 Glu Val Trp Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile 85 90 95 Arg Asp Arg Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr 100 105 110 Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro 115 120 125 Met Glu Gly Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr 130 135 140 Leu Tyr His Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile 145 150 155 160 Ser Tyr Ala Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile 165 170 175 Asp Leu Pro Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys 180 185 190 Arg Phe Leu Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr 195 200 205 Tyr Asn Gly Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Ser Glu 210 215 220 Glu Leu Gly Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys 225 230 235 240 Ile Gln Arg Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile 245 250 255 His Phe Asp Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr 260 265 270 Tyr Thr Leu Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu 275 280 285 Lys Val Tyr Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly 290 295 300 
Leu Glu Arg Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr 305 310 315 320 Glu Leu Gly Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu 325 330 335 Ile Gly Gln Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu 340 345 350 Val Glu Trp Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala 355 360 365 Pro Asn Lys Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr 370 375 380 Ala Gly Gly Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile 385 390 395 400 Val Tyr Leu Asp Phe Arg Ser Tyr Ala Val Ser Ile Ile Ile Thr His 405 410 415 Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp 420 425 430 Val Ala Pro Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe 435 440 445 Ile Pro Ser Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys 450 455 460 Arg Lys Met Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp 465 470 475 480 Tyr Arg Gln Arg Leu Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr 485 490 495 Tyr Gly Tyr Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser 500 505 510 Val Thr Ala Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu 515 520 525 Glu Glu Lys Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu 530 535 540 His Ala Thr Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala 545 550 555 560 Lys Glu Phe Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu 565 570 575 Leu Glu Tyr Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys 580 585 590 Lys Tyr Ala Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu 595 600 605 Glu Ile Val Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala 610 615 620 Arg Val Leu Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val 625 630 635 640 Arg Ile Val Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro 645 650 655 Pro Glu Lys Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp 660 665 670 Tyr Lys Ala Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala 675 680 685 Arg Gly Val Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu 690 695 700 Lys Gly Ser Gly Arg Ile Gly Asp Ala Ala Ile Pro Ala Asp Glu Phe 705 710 715 720 
Asp Pro Thr Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln 725 730 735 Val Leu Pro Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys 740 745 750 Glu Asp Leu Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp 755 760 765 Leu Lys Val Lys Gly Lys Lys 770 775 42328DNAArtificial sequenceMutant polymerase 4atgattctcg ataccgacta catcaccgag aacgggaagc ccgtgataag ggtcttcaag 60aaggagaacg gcgagtttaa aatcgagtac gacagaacct tcgagcccta cttctacgcc 120cttctgaagg acgattctgc gatagaggac gtcaagaagg taaccgcaaa gaggcacgga 180acggttgtca aggtgaagcg cgccgagaag gtgcagaaga agttcctcgg caggccgata 240gaggtctgga agctctactt caaccatcct caggacgtcc cggcgattcg agacaggata 300cgtgcccacc ccgctgtcgt tgacatctac gagtacgaca tacccttcgc caagcgctac 360ctcatcgaca agggcctgat tccgatggag ggcgacgagg agcttacgat gctcgccttc 420gcgatcgcaa ccctctatca cgagggcgag gagttcggaa ccgggccgat tctcatgata 480agctacgccg acgggagcga ggcgagggtg ataacctgga agaagattga ccttccgtac 540gttgacgtcg tctcgaccga gaaggagatg attaagcgct tcctccgcgt cgtcagggag 600aaggaccccg acgtgctcat cacctacaac ggcgacaact tcgacttcgc ctacctgaag 660aagcgctctg aggaactcgg aataaagttc acactcggca gggacgggag cgagccgaag 720atacagcgaa tgggcgaccg ctttgccgtt gaggtgaagg gcaggattca cttcgacctc 780taccccgtca taaggcgcac gataaacctc ccgacctaca cccttgaggc cgtttacgag 840gccgtctttg gaaagcccaa ggagaaggtt tacgcagagg agatagcgca ggcctgggag 900agcggggagg gccttgaaag ggttgcaaga tactcgatgg aggacgctaa ggtgacctac 960gagctgggaa gggagttctt cccgatggag gcccagcttt cgaggcttat aggccagagc 1020ctctgggacg tctcgcgctc gagcaccgga aatttggtgg agtggttcct cctgcggaag 1080gcctacaaga ggaacgagct cgccccaaac aagcccgacg agagggagct cgcgagacgg 1140cgcgggggct acgctggcgg gtacgttaag gaaccagagc ggggattgtg ggacaacatt 1200gtgtatctag acttccgctc gtatgcggtt tcaatcatca taacccacaa cgtctcgccg 1260gataccctca accgcgaggg ctgtaaagag tacgacgtcg cccctgaggt tggacacaag 1320ttctgcaagg acttccccgg cttcatacca agcctcctgg gagatttgct cgaggagagg 1380cagaagataa agcggaagat gaaggcaacg gttgacccgc tggagaagaa actcctcgat 1440tacaggcaga ggctgatcaa 
aatcctcgcc aacagcttct acggctacta cggctacgcc 1500aaggcccggt ggtactgcaa ggagtgcgcc gagagcgtta cggcctgggg aagggagtat 1560atagaaatgg ttatccggga actcgaagaa aaattcggtt ttaaagttct ctatgccgat 1620acagacggtc tccatgctac cattcccgga gcagacgctg aaacagtcaa gaaaaaagca 1680aaggagttct taaaatacat taatccaaaa ctgcccggcc tgctcgaact tgagtacgag 1740ggcttctacg tgaggggctt cttcgtcacg aagaagaagt acgctgtgat agacgaggag 1800ggcaagataa ccacgagggg tcttgagatt gtgaggcgcg actggagcga gatagcgaag 1860gagacccagg ccagggtctt agaggcgata ctcaagcacg gtgacgtcga ggaggccgtt 1920aggatagtca aggaagtgac ggaaaagctg agcaagtatg aggtcccgcc cgagaagctg 1980gtaatccacg agcagataac gcgcgatttg agggattaca aagccaccgg cccgcacgtt 2040gccgttgcga agaggctcgc ggcgcgtgga gtgaaaatcc ggcccggcac ggtgataagc 2100tacatcgtcc taaagggctc tggaaggata ggcgacgcgg cgattccagc tgatgagttc 2160gacccgacga agcaccgcta cgatgcggaa tactacatcg agaaccaggt tctcccggcg 2220gtggagagga ttctaaaagc cttcggctat cggaaggagg atttgcgcta ccagaagacg 2280aagcaggtcg gcttgggcgc gtggctgaag gtgaagggga agaagtga 23285775PRTArtificial sequenceMutant polymerase 5Met Ile Leu Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile 1

5 10 15 Arg Val Phe Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg 20 25 30 Thr Phe Glu Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile 35 40 45 Glu Asp Val Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys 50 55 60 Val Lys Arg Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile 65 70 75 80 Glu Val Trp Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile 85 90 95 Arg Asp Arg Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr 100 105 110 Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro 115 120 125 Met Glu Gly Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr 130 135 140 Leu Tyr His Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile 145 150 155 160 Ser Tyr Ala Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile 165 170 175 Asp Leu Pro Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys 180 185 190 Arg Phe Leu Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr 195 200 205 Tyr Asn Gly Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Ser Glu 210 215 220 Glu Leu Gly Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys 225 230 235 240 Ile Gln Arg Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile 245 250 255 His Phe Asp Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr 260 265 270 Tyr Thr Leu Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu 275 280 285 Lys Val Tyr Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly 290 295 300 Leu Glu Arg Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr 305 310 315 320 Glu Leu Gly Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu 325 330 335 Ile Gly Gln Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu 340 345 350 Val Glu Trp Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala 355 360 365 Pro Asn Lys Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr 370 375 380 Ala Gly Gly Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile 385 390 395 400 Val Tyr Leu Asp Phe Arg Ser Tyr Ala Val Ser Ile Ile Ile Thr His 405 410 415 Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp 420 425 430 Val Ala Pro 
Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe 435 440 445 Ile Pro Ser Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys 450 455 460 Arg Lys Met Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp 465 470 475 480 Tyr Arg Gln Arg Leu Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr 485 490 495 Tyr Gly Tyr Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser 500 505 510 Val Thr Ala Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu 515 520 525 Glu Glu Lys Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu 530 535 540 His Ala Thr Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala 545 550 555 560 Lys Glu Phe Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu 565 570 575 Leu Glu Tyr Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys 580 585 590 Lys Tyr Ala Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu 595 600 605 Glu Ile Val Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala 610 615 620 Arg Val Leu Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val 625 630 635 640 Arg Ile Val Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro 645 650 655 Pro Glu Lys Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp 660 665 670 Tyr Lys Ala Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala 675 680 685 Arg Gly Val Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu 690 695 700 Lys Gly Ser Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe 705 710 715 720 Asp Pro Thr Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln 725 730 735 Val Leu Pro Ala Val Glu Ala Ile Leu Lys Ala Phe Gly Tyr Arg Lys 740 745 750 Glu Asp Leu Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp 755 760 765 Leu Lys Val Lys Gly Lys Lys 770 775 62328DNAArtificial sequenceMutant polymerase 6atgattctcg ataccgacta catcaccgag aacgggaagc ccgtgataag ggtcttcaag 60aaggagaacg gcgagtttaa aatcgagtac gacagaacct tcgagcccta cttctacgcc 120cttctgaagg acgattctgc gatagaggac gtcaagaagg taaccgcaaa gaggcacgga 180acggttgtca aggtgaagcg cgccgagaag gtgcagaaga agttcctcgg caggccgata 240gaggtctgga agctctactt caaccatcct 
caggacgtcc cggcgattcg agacaggata 300cgtgcccacc ccgctgtcgt tgacatctac gagtacgaca tacccttcgc caagcgctac 360ctcatcgaca agggcctgat tccgatggag ggcgacgagg agcttacgat gctcgccttc 420gcgatcgcaa ccctctatca cgagggcgag gagttcggaa ccgggccgat tctcatgata 480agctacgccg acgggagcga ggcgagggtg ataacctgga agaagattga ccttccgtac 540gttgacgtcg tctcgaccga gaaggagatg attaagcgct tcctccgcgt cgtcagggag 600aaggaccccg acgtgctcat cacctacaac ggcgacaact tcgacttcgc ctacctgaag 660aagcgctctg aggaactcgg aataaagttc acactcggca gggacgggag cgagccgaag 720atacagcgaa tgggcgaccg ctttgccgtt gaggtgaagg gcaggattca cttcgacctc 780taccccgtca taaggcgcac gataaacctc ccgacctaca cccttgaggc cgtttacgag 840gccgtctttg gaaagcccaa ggagaaggtt tacgcagagg agatagcgca ggcctgggag 900agcggggagg gccttgaaag ggttgcaaga tactcgatgg aggacgctaa ggtgacctac 960gagctgggaa gggagttctt cccgatggag gcccagcttt cgaggcttat aggccagagc 1020ctctgggacg tctcgcgctc gagcaccgga aatttggtgg agtggttcct cctgcggaag 1080gcctacaaga ggaacgagct cgccccaaac aagcccgacg agagggagct cgcgagacgg 1140cgcgggggct acgctggcgg gtacgttaag gaaccagagc ggggattgtg ggacaacatt 1200gtgtatctag acttccgctc gtatgcggtt tcaatcatca taacccacaa cgtctcgccg 1260gataccctca accgcgaggg ctgtaaagag tacgacgtcg cccctgaggt tggacacaag 1320ttctgcaagg acttccccgg cttcatacca agcctcctgg gagatttgct cgaggagagg 1380cagaagataa agcggaagat gaaggcaacg gttgacccgc tggagaagaa actcctcgat 1440tacaggcaga ggctgatcaa aatcctcgcc aacagcttct acggctacta cggctacgcc 1500aaggcccggt ggtactgcaa ggagtgcgcc gagagcgtta cggcctgggg aagggagtat 1560atagaaatgg ttatccggga actcgaagaa aaattcggtt ttaaagttct ctatgccgat 1620acagacggtc tccatgctac cattcccgga gcagacgctg aaacagtcaa gaaaaaagca 1680aaggagttct taaaatacat taatccaaaa ctgcccggcc tgctcgaact tgagtacgag 1740ggcttctacg tgaggggctt cttcgtcacg aagaagaagt acgctgtgat agacgaggag 1800ggcaagataa ccacgagggg tcttgagatt gtgaggcgcg actggagcga gatagcgaag 1860gagacccagg ccagggtctt agaggcgata ctcaagcacg gtgacgtcga ggaggccgtt 1920aggatagtca aggaagtgac ggaaaagctg agcaagtatg aggtcccgcc cgagaagctg 1980gtaatccacg 
agcagataac gcgcgatttg agggattaca aagccaccgg cccgcacgtt 2040gccgttgcga agaggctcgc ggcgcgtgga gtgaaaatcc ggcccggcac ggtgataagc 2100tacatcgtcc taaagggctc tggaaggata ggcgacaggg cgattccagc tgatgagttc 2160gacccgacga agcaccgcta cgatgcggaa tactacatcg agaaccaggt tctcccggcg 2220gtggaggcga ttctaaaagc cttcggctat cggaaggagg atttgcgcta ccagaagacg 2280aagcaggtcg gcttgggcgc gtggctgaag gtgaagggga agaagtga 23287704PRTArtificial sequenceMutant polymerase 7Met Ile Leu Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile 1 5 10 15 Arg Val Phe Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg 20 25 30 Thr Phe Glu Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile 35 40 45 Glu Asp Val Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys 50 55 60 Val Lys Arg Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile 65 70 75 80 Glu Val Trp Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile 85 90 95 Arg Asp Arg Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr 100 105 110 Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro 115 120 125 Met Glu Gly Asp Glu Glu Leu Thr Met Leu Ala Phe Ala Ile Ala Thr 130 135 140 Leu Tyr His Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile 145 150 155 160 Ser Tyr Ala Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile 165 170 175 Asp Leu Pro Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys 180 185 190 Arg Phe Leu Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr 195 200 205 Tyr Asn Gly Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Ser Glu 210 215 220 Glu Leu Gly Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys 225 230 235 240 Ile Gln Arg Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile 245 250 255 His Phe Asp Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr 260 265 270 Tyr Thr Leu Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu 275 280 285 Lys Val Tyr Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly 290 295 300 Leu Glu Arg Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr 305 310 315 320 Glu Leu Gly Arg Glu Phe Phe Pro Met Glu 
Ala Gln Leu Ser Arg Leu 325 330 335 Ile Gly Gln Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu 340 345 350 Val Glu Trp Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala 355 360 365 Pro Asn Lys Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr 370 375 380 Ala Gly Gly Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile 385 390 395 400 Val Tyr Leu Asp Phe Arg Ser Tyr Ala Val Ser Ile Ile Ile Thr His 405 410 415 Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp 420 425 430 Val Ala Pro Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe 435 440 445 Ile Pro Ser Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys 450 455 460 Arg Lys Met Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp 465 470 475 480 Tyr Arg Gln Arg Leu Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr 485 490 495 Tyr Gly Tyr Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser 500 505 510 Val Thr Ala Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu 515 520 525 Glu Glu Lys Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu 530 535 540 His Ala Thr Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala 545 550 555 560 Lys Glu Phe Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu 565 570 575 Leu Glu Tyr Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys 580 585 590 Lys Tyr Ala Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu 595 600 605 Glu Ile Val Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala 610 615 620 Arg Val Leu Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val 625 630 635 640 Arg Ile Val Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro 645 650 655 Pro Glu Lys Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp 660 665 670 Tyr Lys Ala Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala 675 680 685 Arg Gly Val Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu 690 695 700 82328DNAArtificial sequenceMutant polymerase 8atgattctcg ataccgacta catcaccgag aacgggaagc ccgtgataag ggtcttcaag 60aaggagaacg gcgagtttaa aatcgagtac gacagaacct tcgagcccta cttctacgcc 120cttctgaagg 
acgattctgc gatagaggac gtcaagaagg taaccgcaaa gaggcacgga 180acggttgtca aggtgaagcg cgccgagaag gtgcagaaga agttcctcgg caggccgata 240gaggtctgga agctctactt caaccatcct caggacgtcc cggcgattcg agacaggata 300cgtgcccacc ccgctgtcgt tgacatctac gagtacgaca tacccttcgc caagcgctac 360ctcatcgaca agggcctgat tccgatggag ggcgacgagg agcttacgat gctcgccttc 420gcgatcgcaa ccctctatca cgagggcgag gagttcggaa ccgggccgat tctcatgata 480agctacgccg acgggagcga ggcgagggtg ataacctgga agaagattga ccttccgtac 540gttgacgtcg tctcgaccga gaaggagatg attaagcgct tcctccgcgt cgtcagggag 600aaggaccccg acgtgctcat cacctacaac ggcgacaact tcgacttcgc ctacctgaag 660aagcgctctg aggaactcgg aataaagttc acactcggca gggacgggag cgagccgaag 720atacagcgaa tgggcgaccg ctttgccgtt gaggtgaagg gcaggattca cttcgacctc 780taccccgtca taaggcgcac gataaacctc ccgacctaca cccttgaggc cgtttacgag 840gccgtctttg gaaagcccaa ggagaaggtt tacgcagagg agatagcgca ggcctgggag 900agcggggagg gccttgaaag ggttgcaaga tactcgatgg aggacgctaa ggtgacctac 960gagctgggaa gggagttctt cccgatggag gcccagcttt cgaggcttat aggccagagc 1020ctctgggacg tctcgcgctc gagcaccgga aatttggtgg agtggttcct cctgcggaag 1080gcctacaaga ggaacgagct cgccccaaac aagcccgacg agagggagct cgcgagacgg 1140cgcgggggct acgctggcgg gtacgttaag gaaccagagc ggggattgtg ggacaacatt 1200gtgtatctag acttccgctc gtatgcggtt tcaatcatca taacccacaa cgtctcgccg 1260gataccctca accgcgaggg ctgtaaagag tacgacgtcg cccctgaggt tggacacaag 1320ttctgcaagg acttccccgg cttcatacca agcctcctgg gagatttgct cgaggagagg 1380cagaagataa agcggaagat gaaggcaacg gttgacccgc tggagaagaa actcctcgat 1440tacaggcaga ggctgatcaa aatcctcgcc aacagcttct acggctacta cggctacgcc 1500aaggcccggt ggtactgcaa ggagtgcgcc gagagcgtta cggcctgggg aagggagtat 1560atagaaatgg ttatccggga actcgaagaa aaattcggtt ttaaagttct ctatgccgat 1620acagacggtc tccatgctac cattcccgga gcagacgctg aaacagtcaa gaaaaaagca 1680aaggagttct taaaatacat taatccaaaa ctgcccggcc tgctcgaact tgagtacgag 1740ggcttctacg tgaggggctt cttcgtcacg aagaagaagt acgctgtgat agacgaggag 1800ggcaagataa ccacgagggg tcttgagatt gtgaggcgcg actggagcga 
gatagcgaag 1860gagacccagg ccagggtctt agaggcgata ctcaagcacg gtgacgtcga ggaggccgtt 1920aggatagtca aggaagtgac ggaaaagctg agcaagtatg aggtcccgcc cgagaagctg 1980gtaatccacg agcagataac gcgcgatttg agggattaca aagccaccgg cccgcacgtt 2040gccgttgcga agaggctcgc ggcgcgtgga gtgaaaatcc ggcccggcac ggtgataagc 2100tacatcgtcc tgacgggctc tggaaggata ggcgacaggg cgattccagc tgatgagttc 2160gacccgacga agcaccgcta cgatgcggaa tactacatcg agaaccaggt tctcccggcg 2220gtggagagga ttctaaaagc cttcggctat cggaaggagg atttgcgcta ccagaagacg 2280aagcaggtcg gcttgggcgc gtggctgaag gtgaagggga agaagtga 2328928DNAArtificial sequenceFwd primer 9cccggcggtg gaggcgattc taaaagcc 281028DNAArtificial sequenceRev primer 10gggccgccac ctccgctaag attttcgg 281130DNAArtificial sequenceFwd primer 11gaaggatagg cgacgcggcg attccagctg 301230DNAArtificial sequenceRev primer 12cttcctatcc gctgcgccgc taaggtcgac 301329DNAArtificial sequenceFwd primer 13gctacatcgt cctagcgggc tctggaagg 291429DNAArtificial sequenceRev primer 14cgatgtagca ggatcgcccg agaccttcc 291529DNAArtificial sequenceFwd primer 15gctacatcgt cctatgaggc tctggaagg
291629DNAArtificial sequenceRev primer 16cgatgtagca ggatactccg agaccttcc 291784DNAArtificial sequenceTemplate DNA 17cgatcacgat cacgatcacg atcacgatca cgatcacgct gatgtgcatg ctgttgtttt 60tttacaacag catgcacatc agcg 841884DNAArtificial sequenceNH2 coupled template 18cgatcacgat cacgatcacg atcacgatca cgatcacgct gatgtgcatg ctgttgtttt 60tttacaacag catgcacatc agcg 84192328DNAArtificial sequenceCodon optimised polymerase 19atgatcttag ataccgacta tatcaccgag aacggtaaac cggtgataag ggtgttcaaa 60aaggaaaatg gcgaattcaa gatcgagtat gatagaacct tcgaaccgta cttctacgcc 120ttgttgaagg acgatagtgc catcgaagat gtgaaaaaag ttaccgccaa acgtcacggc 180accgtggtaa aggttaaacg cgccgaaaag gttcagaaga agttcctagg ccgtccgatc 240gaggtgtgga aattgtactt taaccatccg caggatgtcc cggcgattag agatcgtatt 300cgtgcccacc cggcggtagt ggatatctat gagtacgata tcccgttcgc aaaaagatac 360ttgattgata aaggactaat cccgatggaa ggcgatgaag aattaaccat gttagcgttc 420tccatctcca ccctgtacca cgaaggcgaa gagttcggca ccggtccgat tctgatgatc 480tcctacgcag acggtagcga agcacgtgtg ataacctgga agaaaataga cctaccttac 540gtggacgtcg taagtaccga gaaggagatg atcaaaagat tcctgagggt ggtccgtgag 600aaggatccgg acgtactgat tacctataac ggcgataact tcgacttcgc ctacttgaaa 660aagagatctg aggaattagg catcaaattc accctgggcc gtgatggcag tgagccgaaa 720atccaacgta tgggcgaccg cttcgccgtc gaggtgaaag gccgtataca tttcgacttg 780tatccggtga ttaggcgtac cattaatttg ccgacctaca ccttggaagc ggtgtacgag 840gcggtcttcg gcaagccgaa ggaaaaggtg tacgccgaag agatcgcgca ggcgtgggag 900agcggtgagg gtctagaacg tgttgcaaga tatagcatgg aggacgccaa agttacctac 960gaattgggcc gcgagttttt tccgatggag gcccagttat ctcgtttaat tggccagtcc 1020ctgtgggatg ttagccgcag ttctactggt aatttggtag aatggttctt actgcgcaaa 1080gcgtataaac gtaacgagtt agcgccaaat aagccggacg aacgtgaact ggcccgtcgt 1140cgtggtggct atgccggcgg ttacgtgaag gaaccggagc gtggcctatg ggataacatt 1200gtgtaccttg actttagaag ctatgcggtt agcatcatca tcacccataa tgttagtccg 1260gacacattga atcgtgaagg atgcaaagaa tatgacgtcg ccccagaggt gggccacaaa 1320ttttgtaaag atttcccagg attcatccca agtttgttgg gtgatctgct 
ggaagaacgc 1380cagaaaatca aacgtaagat gaaggcgacc gtcgatccac tggagaaaaa gctattggac 1440taccgtcagc gcctgatcaa gattttggcg aattctttct atggatacta cggctacgcc 1500aaagcccgtt ggtattgtaa agagtgcgcc gagtctgtca ctgcctgggg tcgtgaatat 1560atcgaaatgg tgatccgcga gctggaagag aaatttggat tcaaagtctt gtacgccgat 1620accgatggtc tgcacgcgac cattccgggt gccgatgccg agaccgtgaa gaaaaaggcg 1680aaagagtttt tgaaatatat caatccgaag ttgccgggat tattagaatt ggaatacgaa 1740ggtttctatg ttcgcggctt tttcgtgacc aagaaaaaat acgccgtgat cgacgaggaa 1800ggaaaaatta ccacccgtgg tctagagatt gttcgtcgtg actggtccga aatcgccaaa 1860gaaacccagg cccgtgtact ggaagcgatt ttgaagcatg gcgatgtgga ggaggcggtt 1920cgtatcgtca aagaagtgac cgaaaagctg agcaagtatg aagtgccgcc ggagaaattg 1980gtcatacacg aacaaatcac acgtgacctg cgcgattata aggcgaccgg tccgcacgtt 2040gccgtggcga agcgtttggc ggcccgtggt gttaagattc gtccaggaac cgtgattagt 2100tacatagtgt tgaagggcag tggtcgtatt ggtgaccgtg ccatcccggc ggatgagttt 2160gacccgacca agcatcgtta tgacgccgaa tattatatcg agaatcaggt gctaccagcg 2220gttgaacgta ttttgaaggc attcggctat cgtaaagaag acctgcgcta ccagaaaacc 2280aagcaggttg gtctgggtgc ctggttgaaa gtgaaaggca aaaaataa 2328202328DNAArtificial sequenceCodon optimised thumb mutant polymerase 20atgatcttag ataccgacta tatcaccgag aacggtaaac cggtgataag ggtgttcaaa 60aaggaaaatg gcgaattcaa gatcgagtat gatagaacct tcgaaccgta cttctacgcc 120ttgttgaagg acgatagtgc catcgaagat gtgaaaaaag ttaccgccaa acgtcacggc 180accgtggtaa aggttaaacg cgccgaaaag gttcagaaga agttcctagg ccgtccgatc 240gaggtgtgga aattgtactt taaccatccg caggatgtcc cggcgattag agatcgtatt 300cgtgcccacc cggcggtagt ggatatctat gagtacgata tcccgttcgc aaaaagatac 360ttgattgata aaggactaat cccgatggaa ggcgatgaag aattaaccat gttagcgttc 420tccatctcca ccctgtacca cgaaggcgaa gagttcggca ccggtccgat tctgatgatc 480tcctacgcag acggtagcga agcacgtgtg ataacctgga agaaaataga cctaccttac 540gtggacgtcg taagtaccga gaaggagatg atcaaaagat tcctgagggt ggtccgtgag 600aaggatccgg acgtactgat tacctataac ggcgataact tcgacttcgc ctacttgaaa 660aagagatctg aggaattagg catcaaattc accctgggcc 
gtgatggcag tgagccgaaa 720atccaacgta tgggcgaccg cttcgccgtc gaggtgaaag gccgtataca tttcgacttg 780tatccggtga ttaggcgtac cattaatttg ccgacctaca ccttggaagc ggtgtacgag 840gcggtcttcg gcaagccgaa ggaaaaggtg tacgccgaag agatcgcgca ggcgtgggag 900agcggtgagg gtctagaacg tgttgcaaga tatagcatgg aggacgccaa agttacctac 960gaattgggcc gcgagttttt tccgatggag gcccagttat ctcgtttaat tggccagtcc 1020ctgtgggatg ttagccgcag ttctactggt aatttggtag aatggttctt actgcgcaaa 1080gcgtataaac gtaacgagtt agcgccaaat aagccggacg aacgtgaact ggcccgtcgt 1140cgtggtggct atgccggcgg ttacgtgaag gaaccggagc gtggcctatg ggataacatt 1200gtgtaccttg actttagaag ctatgcggtt agcatcatca tcacccataa tgttagtccg 1260gacacattga atcgtgaagg atgcaaagaa tatgacgtcg ccccagaggt gggccacaaa 1320ttttgtaaag atttcccagg attcatccca agtttgttgg gtgatctgct ggaagaacgc 1380cagaaaatca aacgtaagat gaaggcgacc gtcgatccac tggagaaaaa gctattggac 1440taccgtcagc gcctgatcaa gattttggcg aattctttct atggatacta cggctacgcc 1500aaagcccgtt ggtattgtaa agagtgcgcc gagtctgtca ctgcctgggg tcgtgaatat 1560atcgaaatgg tgatccgcga gctggaagag aaatttggat tcaaagtctt gtacgccgat 1620accgatggtc tgcacgcgac cattccgggt gccgatgccg agaccgtgaa gaaaaaggcg 1680aaagagtttt tgaaatatat caatccgaag ttgccgggat tattagaatt ggaatacgaa 1740ggtttctatg ttcgcggctt tttcgtgacc aagaaaaaat acgccgtgat cgacgaggaa 1800ggaaaaatta ccacccgtgg tctagagatt gttcgtcgtg actggtccga aatcgccaaa 1860gaaacccagg cccgtgtact ggaagcgatt ttgaagcatg gcgatgtgga ggaggcggtt 1920cgtatcgtca aagaagtgac cgaaaagctg agcaagtatg aagtgccgcc ggagaaattg 1980gtcatacacg aacaaatcac acgtgacctg cgcgattata aggcgaccgg tccgcacgtt 2040gccgtggcga agcgtttggc ggcccgtggt gttaagattc gtccaggaac cgtgattagt 2100tacatagtgt tgaagggcag tggtcgtatt ggtgaccgtg ccatcccggc ggatgagttt 2160gacccgacca agcatcgtta tgacgccgaa tattatatcg agaatcaggt gctaccagcg 2220gttgaagcta ttttgaaggc attcggctat cgtaaagaag acctgcgcta ccagaaaacc 2280aagcaggttg gtctgggtgc ctggttgaaa gtgaaaggca aaaaataa 232821775PRTArtificial sequenceCodon optimised thumb mutant polymerase 21Met Ile Leu Asp Thr Asp Tyr Ile 
Thr Glu Asn Gly Lys Pro Val Ile 1 5 10 15 Arg Val Phe Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg 20 25 30 Thr Phe Glu Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile 35 40 45 Glu Asp Val Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val Lys 50 55 60 Val Lys Arg Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile 65 70 75 80 Glu Val Trp Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile 85 90 95 Arg Asp Arg Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr 100 105 110 Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro 115 120 125 Met Glu Gly Asp Glu Glu Leu Thr Met Leu Ala Phe Ser Ile Ser Thr 130 135 140 Leu Tyr His Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile 145 150 155 160 Ser Tyr Ala Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile 165 170 175 Asp Leu Pro Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys 180 185 190 Arg Phe Leu Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr 195 200 205 Tyr Asn Gly Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Ser Glu 210 215 220 Glu Leu Gly Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys 225 230 235 240 Ile Gln Arg Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile 245 250 255 His Phe Asp Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr 260 265 270 Tyr Thr Leu Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu 275 280 285 Lys Val Tyr Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly 290 295 300 Leu Glu Arg Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr 305 310 315 320 Glu Leu Gly Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu 325 330 335 Ile Gly Gln Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu 340 345 350 Val Glu Trp Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala 355 360 365 Pro Asn Lys Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr 370 375 380 Ala Gly Gly Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile 385 390 395 400 Val Tyr Leu Asp Phe Arg Ser Tyr Ala Val Ser Ile Ile Ile Thr His 405 410 415 Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys 
Glu Tyr Asp 420 425 430 Val Ala Pro Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe 435 440 445 Ile Pro Ser Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys 450 455 460 Arg Lys Met Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp 465 470 475 480 Tyr Arg Gln Arg Leu Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr 485 490 495 Tyr Gly Tyr Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser 500 505 510 Val Thr Ala Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu 515 520 525 Glu Glu Lys Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu 530 535 540 His Ala Thr Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala 545 550 555 560 Lys Glu Phe Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu 565 570 575 Leu Glu Tyr Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys 580 585 590 Lys Tyr Ala Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu 595 600 605 Glu Ile Val Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala 610 615 620 Arg Val Leu Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val 625 630 635 640 Arg Ile Val Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro 645 650 655 Pro Glu Lys Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp 660 665 670 Tyr Lys Ala Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala 675 680 685 Arg Gly Val Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu 690 695 700 Lys Gly Ser Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe 705 710 715 720 Asp Pro Thr Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln 725 730 735 Val Leu Pro Ala Val Glu Ala Ile Leu Lys Ala Phe Gly Tyr Arg Lys 740 745 750 Glu Asp Leu Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp 755 760 765 Leu Lys Val Lys Gly Lys Lys 770 775 22775PRTArtificial sequenceMutant polymerase 22Met Ile Leu Asp Thr Asp Tyr Ile Thr Glu Asn Gly Lys Pro Val Ile1 5 10 15 Arg Val Phe Lys Lys Glu Asn Gly Glu Phe Lys Ile Glu Tyr Asp Arg 20 25 30 Thr Phe Glu Pro Tyr Phe Tyr Ala Leu Leu Lys Asp Asp Ser Ala Ile 35 40 45 Glu Asp Val Lys Lys Val Thr Ala Lys Arg His Gly Thr Val Val 
Lys 50 55 60 Val Lys Arg Ala Glu Lys Val Gln Lys Lys Phe Leu Gly Arg Pro Ile65 70 75 80 Glu Val Trp Lys Leu Tyr Phe Asn His Pro Gln Asp Val Pro Ala Ile 85 90 95 Arg Asp Arg Ile Arg Ala His Pro Ala Val Val Asp Ile Tyr Glu Tyr 100 105 110 Asp Ile Pro Phe Ala Lys Arg Tyr Leu Ile Asp Lys Gly Leu Ile Pro 115 120 125 Met Glu Gly Asp Glu Glu Leu Thr Met Leu Ala Phe Ser Ile Ser Thr 130 135 140 Leu Tyr His Glu Gly Glu Glu Phe Gly Thr Gly Pro Ile Leu Met Ile145 150 155 160 Ser Tyr Ala Asp Gly Ser Glu Ala Arg Val Ile Thr Trp Lys Lys Ile 165 170 175 Asp Leu Pro Tyr Val Asp Val Val Ser Thr Glu Lys Glu Met Ile Lys 180 185 190 Arg Phe Leu Arg Val Val Arg Glu Lys Asp Pro Asp Val Leu Ile Thr 195 200 205 Tyr Asn Gly Asp Asn Phe Asp Phe Ala Tyr Leu Lys Lys Arg Ser Glu 210 215 220 Glu Leu Gly Ile Lys Phe Thr Leu Gly Arg Asp Gly Ser Glu Pro Lys225 230 235 240 Ile Gln Arg Met Gly Asp Arg Phe Ala Val Glu Val Lys Gly Arg Ile 245 250 255 His Phe Asp Leu Tyr Pro Val Ile Arg Arg Thr Ile Asn Leu Pro Thr 260 265 270 Tyr Thr Leu Glu Ala Val Tyr Glu Ala Val Phe Gly Lys Pro Lys Glu 275 280 285 Lys Val Tyr Ala Glu Glu Ile Ala Gln Ala Trp Glu Ser Gly Glu Gly 290 295 300 Leu Glu Arg Val Ala Arg Tyr Ser Met Glu Asp Ala Lys Val Thr Tyr305 310 315 320 Glu Leu Gly Arg Glu Phe Phe Pro Met Glu Ala Gln Leu Ser Arg Leu 325 330 335 Ile Gly Gln Ser Leu Trp Asp Val Ser Arg Ser Ser Thr Gly Asn Leu 340 345 350 Val Glu Trp Phe Leu Leu Arg Lys Ala Tyr Lys Arg Asn Glu Leu Ala 355 360 365 Pro Asn Lys Pro Asp Glu Arg Glu Leu Ala Arg Arg Arg Gly Gly Tyr 370 375 380 Ala Gly Gly Tyr Val Lys Glu Pro Glu Arg Gly Leu Trp Asp Asn Ile385 390 395 400 Val Tyr Leu Asp Phe Arg Ser Tyr Ala Val Ser Ile Ile Ile Thr His 405 410 415 Asn Val Ser Pro Asp Thr Leu Asn Arg Glu Gly Cys Lys Glu Tyr Asp 420 425 430 Val Ala Pro Glu Val Gly His Lys Phe Cys Lys Asp Phe Pro Gly Phe 435 440 445 Ile Pro Ser Leu Leu Gly Asp Leu Leu Glu Glu Arg Gln Lys Ile Lys 450 455 460 Arg Lys Met Lys Ala Thr Val Asp Pro Leu Glu Lys Lys Leu Leu Asp465 470 475 480 
Tyr Arg Gln Arg Leu Ile Lys Ile Leu Ala Asn Ser Phe Tyr Gly Tyr 485 490 495 Tyr Gly Tyr Ala Lys Ala Arg Trp Tyr Cys Lys Glu Cys Ala Glu Ser 500 505 510 Val Thr Ala Trp Gly Arg Glu Tyr Ile Glu Met Val Ile Arg Glu Leu 515 520 525 Glu Glu Lys Phe Gly Phe Lys Val Leu Tyr Ala Asp Thr Asp Gly Leu 530 535 540 His Ala Thr Ile Pro Gly Ala Asp Ala Glu Thr Val Lys Lys Lys Ala545 550 555 560 Lys Glu Phe Leu Lys Tyr Ile Asn Pro Lys Leu Pro Gly Leu Leu Glu 565 570 575 Leu Glu Tyr Glu Gly Phe Tyr Val Arg Gly Phe Phe Val Thr Lys Lys 580 585 590 Lys Tyr Ala Val Ile Asp Glu Glu Gly Lys Ile Thr Thr Arg Gly Leu 595 600 605 Glu Ile Val Arg Arg Asp Trp Ser Glu Ile Ala Lys Glu Thr Gln Ala 610 615 620 Arg Val Leu Glu Ala Ile Leu Lys His Gly Asp Val Glu Glu Ala Val625 630 635 640 Arg Ile Val Lys Glu Val Thr Glu Lys Leu Ser Lys Tyr Glu Val Pro 645 650 655 Pro Glu Lys Leu Val Ile His Glu Gln Ile Thr Arg Asp Leu Arg Asp 660 665 670 Tyr Lys Ala Thr Gly Pro His Val Ala Val Ala Lys Arg Leu Ala Ala 675 680 685 Arg Gly Val Lys Ile Arg Pro Gly Thr Val Ile Ser Tyr Ile Val Leu 690 695 700 Lys Gly Ser Gly Arg Ile Gly Asp Arg Ala Ile Pro Ala Asp Glu Phe705 710 715
720 Asp Pro Thr Lys His Arg Tyr Asp Ala Glu Tyr Tyr Ile Glu Asn Gln 725 730 735 Val Leu Pro Ala Val Glu Arg Ile Leu Lys Ala Phe Gly Tyr Arg Lys 740 745 750 Glu Asp Leu Arg Tyr Gln Lys Thr Lys Gln Val Gly Leu Gly Ala Trp 755 760 765 Leu Lys Val Lys Gly Lys Lys 770 775232374DNAArtificial SequenceMutant polymerase 23gggcgaattg ggtacccata tgatcttaga taccgactat atcaccgaga acggtaaacc 60ggtgataagg gtgttcaaaa aggaaaatgg cgaattcaag atcgagtatg atagaacctt 120cgaaccgtac ttctacgcct tgttgaagga cgatagtgcc atcgaagatg tgaaaaaagt 180taccgccaaa cgtcacggca ccgtggtaaa ggttaaacgc gccgaaaagg ttcagaagaa 240gttcctaggc cgtccgatcg aggtgtggaa attgtacttt aaccatccgc aggatgtccc 300ggcgattaga gatcgtattc gtgcccaccc ggcggtagtg gatatctatg agtacgatat 360cccgttcgca aaaagatact tgattgataa aggactaatc ccgatggaag gcgatgaaga 420attaaccatg ttagcgttct ccatctccac cctgtaccac gaaggcgaag agttcggcac 480cggtccgatt ctgatgatct cctacgcaga cggtagcgaa gcacgtgtga taacctggaa 540gaaaatagac ctaccttacg tggacgtcgt aagtaccgag aaggagatga tcaaaagatt 600cctgagggtg gtccgtgaga aggatccgga cgtactgatt acctataacg gcgataactt 660cgacttcgcc tacttgaaaa agagatctga ggaattaggc atcaaattca ccctgggccg 720tgatggcagt gagccgaaaa tccaacgtat gggcgaccgc ttcgccgtcg aggtgaaagg 780ccgtatacat ttcgacttgt atccggtgat taggcgtacc attaatttgc cgacctacac 840cttggaagcg gtgtacgagg cggtcttcgg caagccgaag gaaaaggtgt acgccgaaga 900gatcgcgcag gcgtgggaga gcggtgaggg tctagaacgt gttgcaagat atagcatgga 960ggacgccaaa gttacctacg aattgggccg cgagtttttt ccgatggagg cccagttatc 1020tcgtttaatt ggccagtccc tgtgggatgt tagccgcagt tctactggta atttggtaga 1080atggttctta ctgcgcaaag cgtataaacg taacgagtta gcgccaaata agccggacga 1140acgtgaactg gcccgtcgtc gtggtggcta tgccggcggt tacgtgaagg aaccggagcg 1200tggcctatgg gataacattg tgtaccttga ctttagaagc tatgcggtta gcatcatcat 1260cacccataat gttagtccgg acacattgaa tcgtgaagga tgcaaagaat atgacgtcgc 1320cccagaggtg ggccacaaat tttgtaaaga tttcccagga ttcatcccaa gtttgttggg 1380tgatctgctg gaagaacgcc agaaaatcaa acgtaagatg aaggcgaccg tcgatccact 1440ggagaaaaag ctattggact 
accgtcagcg cctgatcaag attttggcga attctttcta 1500tggatactac ggctacgcca aagcccgttg gtattgtaaa gagtgcgccg agtctgtcac 1560tgcctggggt cgtgaatata tcgaaatggt gatccgcgag ctggaagaga aatttggatt 1620caaagtcttg tacgccgata ccgatggtct gcacgcgacc attccgggtg ccgatgccga 1680gaccgtgaag aaaaaggcga aagagttttt gaaatatatc aatccgaagt tgccgggatt 1740attagaattg gaatacgaag gtttctatgt tcgcggcttt ttcgtgacca agaaaaaata 1800cgccgtgatc gacgaggaag gaaaaattac cacccgtggt ctagagattg ttcgtcgtga 1860ctggtccgaa atcgccaaag aaacccaggc ccgtgtactg gaagcgattt tgaagcatgg 1920cgatgtggag gaggcggttc gtatcgtcaa agaagtgacc gaaaagctga gcaagtatga 1980agtgccgccg gagaaattgg tcatacacga acaaatcaca cgtgacctgc gcgattataa 2040ggcgaccggt ccgcacgttg ccgtggcgaa gcgtttggcg gcccgtggtg ttaagattcg 2100tccaggaacc gtgattagtt acatagtgtt gaagggcagt ggtcgtattg gtgaccgtgc 2160catcccggcg gatgagtttg acccgaccaa gcatcgttat gacgccgaat attatatcga 2220gaatcaggtg ctaccagcgg ttgaacgtat tttgaaggca ttcggctatc gtaaagaaga 2280cctgcgctac cagaaaacca agcaggttgg tctgggtgcc tggttgaaag tgaaaggcaa 2340aaaataagct agcggagctc cagcttttgt tccc 2374242374DNAArtificial SequenceMutant polymerase 24gggaacaaaa gctggagctc cgctagctta ttttttgcct ttcactttca accaggcacc 60cagaccaacc tgcttggttt tctggtagcg caggtcttct ttacgatagc cgaatgcctt 120caaaatacgt tcaaccgctg gtagcacctg attctcgata taatattcgg cgtcataacg 180atgcttggtc gggtcaaact catccgccgg gatggcacgg tcaccaatac gaccactgcc 240cttcaacact atgtaactaa tcacggttcc tggacgaatc ttaacaccac gggccgccaa 300acgcttcgcc acggcaacgt gcggaccggt cgccttataa tcgcgcaggt cacgtgtgat 360ttgttcgtgt atgaccaatt tctccggcgg cacttcatac ttgctcagct tttcggtcac 420ttctttgacg atacgaaccg cctcctccac atcgccatgc ttcaaaatcg cttccagtac 480acgggcctgg gtttctttgg cgatttcgga ccagtcacga cgaacaatct ctagaccacg 540ggtggtaatt tttccttcct cgtcgatcac ggcgtatttt ttcttggtca cgaaaaagcc 600gcgaacatag aaaccttcgt attccaattc taataatccc ggcaacttcg gattgatata 660tttcaaaaac tctttcgcct ttttcttcac ggtctcggca tcggcacccg gaatggtcgc 720gtgcagacca tcggtatcgg cgtacaagac tttgaatcca aatttctctt 
ccagctcgcg 780gatcaccatt tcgatatatt cacgacccca ggcagtgaca gactcggcgc actctttaca 840ataccaacgg gctttggcgt agccgtagta tccatagaaa gaattcgcca aaatcttgat 900caggcgctga cggtagtcca atagcttttt ctccagtgga tcgacggtcg ccttcatctt 960acgtttgatt ttctggcgtt cttccagcag atcacccaac aaacttggga tgaatcctgg 1020gaaatcttta caaaatttgt ggcccacctc tggggcgacg tcatattctt tgcatccttc 1080acgattcaat gtgtccggac taacattatg ggtgatgatg atgctaaccg catagcttct 1140aaagtcaagg tacacaatgt tatcccatag gccacgctcc ggttccttca cgtaaccgcc 1200ggcatagcca ccacgacgac gggccagttc acgttcgtcc ggcttatttg gcgctaactc 1260gttacgttta tacgctttgc gcagtaagaa ccattctacc aaattaccag tagaactgcg 1320gctaacatcc cacagggact ggccaattaa acgagataac tgggcctcca tcggaaaaaa 1380ctcgcggccc aattcgtagg taactttggc gtcctccatg ctatatcttg caacacgttc 1440tagaccctca ccgctctccc acgcctgcgc gatctcttcg gcgtacacct tttccttcgg 1500cttgccgaag accgcctcgt acaccgcttc caaggtgtag gtcggcaaat taatggtacg 1560cctaatcacc ggatacaagt cgaaatgtat acggcctttc acctcgacgg cgaagcggtc 1620gcccatacgt tggattttcg gctcactgcc atcacggccc agggtgaatt tgatgcctaa 1680ttcctcagat ctctttttca agtaggcgaa gtcgaagtta tcgccgttat aggtaatcag 1740tacgtccgga tccttctcac ggaccaccct caggaatctt ttgatcatct ccttctcggt 1800acttacgacg tccacgtaag gtaggtctat tttcttccag gttatcacac gtgcttcgct 1860accgtctgcg taggagatca tcagaatcgg accggtgccg aactcttcgc cttcgtggta 1920cagggtggag atggagaacg ctaacatggt taattcttca tcgccttcca tcgggattag 1980tcctttatca atcaagtatc tttttgcgaa cgggatatcg tactcataga tatccactac 2040cgccgggtgg gcacgaatac gatctctaat cgccgggaca tcctgcggat ggttaaagta 2100caatttccac acctcgatcg gacggcctag gaacttcttc tgaacctttt cggcgcgttt 2160aacctttacc acggtgccgt gacgtttggc ggtaactttt ttcacatctt cgatggcact 2220atcgtccttc aacaaggcgt agaagtacgg ttcgaaggtt ctatcatact cgatcttgaa 2280ttcgccattt tcctttttga acacccttat caccggttta ccgttctcgg tgatatagtc 2340ggtatctaag atcatatggg tacccaattc gccc 2374

* * * * *