Affinity Peptides And Method For Purification Of Recombinant Proteins Hernan; Ronald A. ; et al. [SIGMA-ALDRICH CO.]

Patent Application Summary

U.S. patent application number 12/842764 was filed with the patent office on 2010-07-23, and published on 2012-06-21, for affinity peptides and method for purification of recombinant proteins. This patent application is currently assigned to SIGMA-ALDRICH CO. Invention is credited to Ian R. Brockie, Ronald A. Hernan, Elizabeth Jenkins, Richard J. Mehigh.

Application Number	20120157659 12/842764
Family ID	31498502
Filed Date	2010-07-23

United States Patent Application	20120157659
Kind Code	A1
Hernan; Ronald A. ; et al.	June 21, 2012

AFFINITY PEPTIDES AND METHOD FOR PURIFICATION OF RECOMBINANT PROTEINS

Abstract

This invention describes a process for separating a fusion protein or polypeptide in the form of its precursor from a mixture containing said fusion protein and impurities, which comprises contacting said fusion protein with a resin containing immobilized metal ions, said fusion protein covalently operably linked directly or indirectly to an immobilized metal ion-affinity peptide, binding said fusion protein to said resin, and selectively eluting said fusion protein from said resin.

Inventors:	Hernan; Ronald A.; (Ballwin, MO) ; Mehigh; Richard J.; (St. Louis, MO) ; Brockie; Ian R.; (St. Louis, MO) ; Jenkins; Elizabeth; (Sherman, IL)
Assignee:	SIGMA-ALDRICH CO. St. Louis MO
Family ID:	31498502
Appl. No.:	12/842764
Filed:	July 23, 2010

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
10460524	Jun 12, 2003	7799561
12842764
60388059	Jun 12, 2002

Current U.S. Class:	530/324 ; 530/325; 530/326; 530/327; 530/328; 530/350
Current CPC Class:	B01J 20/3265 20130101; C07K 1/047 20130101; C07K 1/22 20130101; C07K 7/06 20130101; B01J 20/286 20130101; B01D 15/3828 20130101; C07K 7/08 20130101
Class at Publication:	530/324 ; 530/325; 530/326; 530/327; 530/328; 530/350
International Class:	C07K 7/06 20060101 C07K007/06; C07K 14/00 20060101 C07K014/00; C07K 7/08 20060101 C07K007/08

Claims

1. A polypeptide, protein or protein fragment represented by the formula R.sub.1-Sp.sub.1-(His-Z.sub.1-His-Arg-His-Z.sub.2-His)-Sp.sub.2-R.sub.2, wherein (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) is a metal ion-affinity peptide, R.sub.1 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.1 is a covalent bond or a spacer comprising at least one amino acid residue, R.sub.2 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.2 is a covalent bond or a spacer comprising at least one amino acid residue, Z.sub.1 is an amino acid residue selected from the group consisting of Ala, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z.sub.2 is an amino acid residue selected from the group consisting of Ala, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.

2. The polypeptide, protein or protein fragment of claim 1, wherein Z.sub.1 is selected from the group consisting of Ala, Asn, Ile, Lys, Phe, Ser, Thr, and Val, and Z.sub.2 is selected from the group consisting of Ala, Asn, Gly, Lys, Ser, Thr and Tyr.

3. The polypeptide, protein or protein fragment of claim 1, wherein Z.sub.1 and Z.sub.2 are selected from the group consisting of: (a) Z.sub.1 is Asn and Z.sub.2 is Gly; (b) Z.sub.1 is Asn and Z.sub.2 is Lys; (c) Z.sub.1 is Lys and Z.sub.2 is Gly; (d) Z.sub.1 is Lys and Z.sub.2 is Lys; (e) Z.sub.1 is Ile and Z.sub.2 is Asn; (f) Z.sub.1 is Thr and Z.sub.2 is Ser; (g) Z.sub.1 is Ser and Z.sub.2 is Tyr; (h) Z.sub.1 is Val and Z.sub.2 is Ala; and (i) Z.sub.1 is Ala and Z.sub.2 is Lys.

4. The polypeptide, protein or protein fragment of claim 1, wherein R.sub.1 or R.sub.2 is hydrogen.

5. The polypeptide, protein or protein fragment of claim 1, wherein R.sub.1 or R.sub.2 is an amino acid residue.

6. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sub.1 or Sp.sub.2 is a spacer comprising a proteolytic cleavage site, a fusion protein, a secretion sequence, a leader sequence for cellular targeting, an antibody epitope or an internal ribosomal sequence.

7. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sub.1 or Sp.sub.2 is a spacer comprising a proteolytic cleavage site.

8. The polypeptide, protein or protein fragment of claim 7, wherein the proteolytic cleavage site is cleaved with enterokinase.

9. The polypeptide, protein or protein fragment of claim 1, wherein any one of Sp.sub.1, Sp.sub.2, R.sub.1 and R.sub.2 comprises at least one of the amino acid sequences selected from the group consisting of SEQ ID NOS: 1-17.

10. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sub.1 or Sp.sub.2 is a spacer comprising the enzyme glutathione-S-transferase of the parasite helminth Schistosoma japonicum.

11. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sub.1 or Sp.sub.2 is a spacer comprising the amino acid sequence DYKDDDDK (SEQ ID NO: 15).

12. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sub.1 or Sp.sub.2 is a spacer comprising the amino acid sequence DLYDDDDK (SEQ ID NO: 16).

13. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sub.1 or Sp.sub.2 is a spacer comprising the amino acid sequence Met-Asp-Tyr-Lys-Asp-His-Asp-Gly-Asp-Tyr-Lys-Asp-His-Asp-Ile-Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 17).

14. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sup.1 or Sp.sup.2 is a spacer comprising at least one amino acid residue, said spacer comprising an antigenic domain, wherein the antigenic domain comprises the sequence (SEQ ID NO: 39):

X.sup.20-(X.sup.1-Y-K-X.sup.2-X.sup.3-D-X.sup.4).sub.n-X.sup.5-(X.sup.1-Y-K-X.sup.7-X.sup.8-D-X.sup.9-K)-X.sup.21

where: D, Y and K are their representative amino acids; X.sup.20 and X.sup.21 are independently a hydrogen or a covalent bond; each X.sup.1 and X.sup.4 is independently a covalent bond or at least one amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; each X.sup.2, X.sup.3, X.sup.7 and X.sup.8 is independently an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; X.sup.5 is a covalent bond or a spacer domain, the spacer domain comprising at least one amino acid or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X).sub.m--, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val; X.sup.9 is a covalent bond or an aspartate residue; and n is 0, 1 or 2.

15. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sup.1 or Sp.sup.2 is a spacer comprising at least one amino acid residue, said spacer comprising an antigenic domain, wherein the antigenic domain comprises the sequence (SEQ ID NO: 40):

X.sup.20-(D-Y-K-X.sup.2-X.sup.3-D).sub.n-X.sup.5-(D-Y-K-X.sup.7-X.sup.8-D-X.sup.9-K)-X.sup.21

where: D, Y, K are their representative amino acids; X.sup.20 and X.sup.21 are independently a hydrogen or a covalent bond; each X.sup.2, X.sup.3, X.sup.7 and X.sup.8 is independently an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; X.sup.5 is a covalent bond or a spacer domain, the spacer domain comprising at least one amino acid or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X).sub.m--, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val; X.sup.9 is a covalent bond or an aspartate residue; and n is at least 2.

16. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sup.1 or Sp.sup.2 is a spacer comprising at least one amino acid residue, said spacer comprising an antigenic domain, wherein the antigenic domain comprises the sequence (SEQ ID NO: 45):

X.sup.20-X.sup.10-(D-Y-K-X.sup.2-X.sup.3-D).sub.n-X.sup.5-(D-Y-K-X.sup.7-X.sup.8-D-X.sup.9-K)-X.sup.21

where: D, Y, and K are their representative amino acids; X.sup.20 and X.sup.21 are independently a hydrogen or a covalent bond; X.sup.10 is a covalent bond or an amino acid; each X.sup.2, X.sup.3, X.sup.7 and X.sup.8 is independently an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; X.sup.5 is a covalent bond or a spacer domain, the spacer domain comprising at least one amino acid or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X).sub.m--, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val; X.sup.9 is a covalent bond or an aspartate residue; and n is at least 2.

17. The polypeptide, protein or protein fragment of claim 1, wherein Sp.sup.1 or Sp.sup.2 is a spacer comprising at least one amino acid residue, said spacer comprising an antigenic domain, wherein the antigenic domain comprises the sequence (SEQ ID NO: 42):

X.sup.20-(D-X.sup.11-Y-X.sup.12-X.sup.13).sub.n-X.sup.14-(D-X.sup.11-Y-X.sup.12-X.sup.13-D-X.sup.15-K)-X.sup.21

where: D, Y and K are their representative amino acids; X.sup.20 and X.sup.21 are independently a hydrogen or a covalent bond; each X.sup.11 is a covalent bond or an amino acid; each X.sup.12 is an amino acid selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; each X.sup.13 is a covalent bond or at least one amino acid selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues; X.sup.14 is a covalent bond or a spacer domain, the spacer domain comprising at least one amino acid or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X).sub.m--, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val; X.sup.15 is a covalent bond or an aspartate residue; and n is 0 or at least 1.

18. A polypeptide, protein or protein fragment represented by the formula R.sub.1-Sp.sub.1-(His-Z.sub.1-His-Arg-His-Z.sub.2-His).sub.t-Sp.sub.2-R.sub.2, wherein (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) is a metal ion-affinity peptide, t is at least 2, R.sub.1 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.1 is a covalent bond or a spacer comprising at least one amino acid residue, R.sub.2 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.2 is a covalent bond or a spacer comprising at least one amino acid residue, Z.sub.1 is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z.sub.2 is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.

19. The peptide of claim 18, wherein Z.sub.1 and Z.sub.2 are selected from the group consisting of: (a) Z.sub.1 is Asn and Z.sub.2 is Lys; and (b) Z.sub.1 is Lys and Z.sub.2 is Gly.

20. A polypeptide, protein or protein fragment represented by the formula R.sub.1-Sp.sub.1-[(His-Z.sub.1-His-Arg-His-Z.sub.2-His)-Sp.sub.2].sub.t-R.sub.2, wherein (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) is a metal ion-affinity peptide, t is at least 2, R.sub.1 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.1 is a covalent bond or a spacer comprising at least one amino acid residue, R.sub.2 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.2 is a covalent bond or a spacer comprising at least one amino acid residue, Z.sub.1 is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val, and Z.sub.2 is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val; and each Sp.sub.2 of the recombinant polypeptides, proteins or protein fragments may be the same or different.

21. The peptide of claim 20, wherein Z.sub.1 and Z.sub.2 are selected from the group consisting of: (a) Z.sub.1 is Asn and Z.sub.2 is Lys; and (b) Z.sub.1 is Lys and Z.sub.2 is Gly.

Description

REFERENCE TO RELATED APPLICATIONS

[0001] This application is a divisional of application Ser. No. 10/460,524, filed Jun. 12, 2003, which is a non-provisional application claiming priority from provisional Application Ser. No. 60/388,059, filed Jun. 12, 2002, the content of each of which is hereby incorporated herein by reference.

FIELD OF THE INVENTION

[0002] This invention relates to affinity peptides, fusion proteins containing affinity peptides, genes coding for such proteins, expression vectors and transformed microorganisms containing such genes, and methods for the purification of the fusion proteins.

BACKGROUND OF THE INVENTION

[0003] The possibility of preparing hybrid genes by gene technology has opened up new routes for the analysis of recombinant proteins. By linking the coding gene sequence of a desired protein to the coding gene sequence of a protein fragment having a high affinity for a ligand (affinity peptide), it is possible to purify desired recombinant proteins in the form of fusion proteins in one step using the affinity peptide.

[0004] Immobilized metal affinity chromatography (IMAC), also known as metal chelate affinity chromatography (MCAC), is a specialized aspect of affinity chromatography. The principle behind IMAC lies in the fact that many transition metal ions, e.g., nickel, zinc and copper, can coordinate to the amino acids histidine, cysteine, and tryptophan via electron donor groups on the amino acid side chains. To utilize this interaction for chromatographic purposes, the metal ion is typically immobilized onto an insoluble support. This can be done by attaching a chelating group to the chromatographic matrix. Most importantly, to be useful, the metal of choice must have a higher affinity for the matrix than for the compounds to be purified.

[0005] In U.S. Pat. No. 4,569,794, Smith et al. disclose the preparation of a fusion protein containing a metal ion-affinity peptide linker and a biologically active polypeptide, expressing the fusion protein, and purifying it using immobilized metal ion chromatography. Because essentially any biologically active polypeptide could be used, this approach enabled the convenient expression and purification of essentially any biologically active polypeptide by immobilized metal ion chromatography.

[0006] In U.S. Pat. Nos. 5,310,663 and 5,284,933, Dobeli et al. disclose a process for separating a biologically active polypeptide from impurities by producing the desired polypeptide as a fusion protein containing a metal ion-affinity peptide linker comprising 2 to 6 adjacent histidine residues. Although Dobeli et al.'s metal ion-affinity peptide provides greater metal affinity relative to certain of the sequences disclosed by Smith et al., there is some cautionary evidence that proteins containing His-tags may differ from their wild-type counterparts in dimerization/oligomerization properties. For example, Wu and Filutowicz present evidence that the biochemical properties of the pi(30.5) protein of plasmid R6K, a DNA binding protein, were fundamentally altered due to the presence of an N-terminal 6×His-tag. Wu, J. and Filutowicz, M., Acta Biochim. Pol., 46:591-599, 1999. In addition, Rodriguez-Viciana et al. stated that V12 Ras proteins expressed as histidine-tagged fusion proteins exhibited poor biological activity. Rodriguez-Viciana, P., et al., Cell, 89:457-67, 1997.

SUMMARY OF THE INVENTION

[0007] One aspect of the present invention is a peptide which is relatively hydrophilic, is capable of exhibiting appropriate biological activity, and has a relatively high affinity for coordinating metals. Advantageously, this metal ion-affinity peptide may be incorporated into a fusion protein to enable ready purification of the fusion protein from aqueous solutions by immobilized metal affinity chromatography. In addition to the metal ion-affinity peptide, the fusion protein typically comprises a protein or polypeptide of interest, covalently linked, directly or indirectly, to the metal ion-affinity peptide.

[0008] Briefly, therefore, the present invention is directed to a peptide represented by the formula R.sub.1-Sp.sub.1-(His-Z.sub.1-His-Arg-His-Z.sub.2-His)-Sp.sub.2-R.sub.2, wherein (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) is a metal ion-affinity peptide, R.sub.1 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.1 is a covalent bond or a spacer comprising at least one amino acid residue, R.sub.2 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.2 is a covalent bond or a spacer comprising at least one amino acid residue, Z.sub.1 is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val; and Z.sub.2 is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr, and Val.

[0009] The present invention is further directed to a process for separating a recombinant protein or polypeptide from a liquid mixture wherein the recombinant protein or polypeptide comprises a metal ion-affinity peptide having the sequence His-Z.sub.1-His-Arg-His-Z.sub.2-His (SEQ ID NO: 24) and Z.sub.1 and Z.sub.2 are as previously defined. In the process, the mixture is combined with a solid support having immobilized metal ions to bind the recombinant protein or polypeptide, and the bound recombinant protein or polypeptide is then eluted from the solid support.

[0010] The present invention is further directed to vectors and host cells for recombinant expression of the nucleic acid molecules described herein, as well as methods of making such vectors and host cells and for using them for production of the polypeptides or peptides of the present invention by recombinant techniques.

[0011] The present invention is further directed to a kit for the expression and/or separation of the recombinant proteins or polypeptides from a mixture wherein the recombinant proteins or polypeptides contain the sequence R.sub.1-Sp.sub.1-(His-Z.sub.1-His-Arg-His-Z.sub.2-His)-Sp.sub.2-R.sub.2, and R.sub.1, R.sub.2, Sp.sub.1, Sp.sub.2, Z.sub.1 and Z.sub.2 are as previously defined. The kit may comprise, in separate containers, the nucleic acid components to be assembled into a vector encoding for a fusion protein comprising a protein or polypeptide covalently operably linked directly or indirectly to an immobilized metal ion-affinity peptide. In addition, or alternatively, the kit may be comprised of one or more of the following: buffers, enzymes, a chromatography column comprising a resin containing immobilized metal ions and an instructional brochure explaining how to use the kit.

[0012] Other objects and advantages of the present invention will become apparent as the detailed description of the invention proceeds.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0013] The present invention generally relates to the expression and purification of recombinant polypeptides, proteins or protein fragments containing a metal ion-affinity peptide. In addition to the metal ion-affinity peptide, the recombinant polypeptides and proteins will typically also contain a target polypeptide, protein or fragment thereof covalently linked to the metal ion-affinity peptide. In one embodiment, the target polypeptide, protein or protein fragment is a biologically active protein or protein fragment. Advantageously, the metal ion-affinity peptide enables the recombinant polypeptides and proteins to be readily purified from a liquid sample by means of metal ion affinity chromatography.

[0014] The fusion proteins of this invention are prepared by recombinant DNA methodology. In accordance with the present invention, a gene sequence coding for a desired protein is isolated, synthesized or otherwise obtained and operably linked to a DNA sequence coding for the metal ion-affinity peptide. The hybrid gene containing the gene for a desired protein operably linked to a DNA sequence encoding the metal ion-affinity peptide is referred to as a chimeric gene.

[0015] In one embodiment, the metal ion-affinity peptide is covalently linked to the carboxy terminus of the target polypeptide, protein or protein fragment. In another embodiment, the metal ion-affinity peptide is covalently linked to the amino terminus of the target polypeptide, protein or protein fragment. In each of these embodiments, the metal ion-affinity peptide and the target polypeptide, protein or protein fragment may be directly attached by means of a peptide bond or, alternatively, the two may be separated by a linker. When present, the linker may provide other functionality to the recombinant polypeptide, protein or protein fragment.

[0016] The recombinant polypeptides, proteins or protein fragments of the present invention are defined by the general formula (I):

R.sub.1-Sp.sub.1-(His-Z.sub.1-His-Arg-His-Z.sub.2-His)-Sp.sub.2-R.sub.2 (I)

wherein (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) is a metal ion-affinity peptide; Z.sub.1 is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Gln, Glu, Ile, Lys, Phe, Pro, Ser, Thr, Trp, and Val; and Z.sub.2 is an amino acid residue selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, Ile, Leu, Lys, Met, Pro, Ser, Thr, Tyr and Val. In addition, R.sub.1 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.1 is a covalent bond or a spacer comprising at least one amino acid residue, R.sub.2 is hydrogen, a polypeptide, protein or protein fragment, Sp.sub.2 is a covalent bond or a spacer comprising at least one amino acid residue. Thus, for example, R.sub.1 or R.sub.2 may comprise a target polypeptide, protein, or protein fragment which is directly or indirectly linked to the metal ion-affinity peptide.
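For illustration only, the constraints that formula (I) places on the seven-residue metal ion-affinity peptide can be expressed as a short check. The sketch below is not part of the application. It assumes the peptide is written with standard three-letter residue codes joined by hyphens, and the names `Z1_ALLOWED`, `Z2_ALLOWED` and `is_affinity_peptide` are hypothetical.

```python
# Allowed residues for Z1 and Z2, copied from the definition of formula (I).
Z1_ALLOWED = {"Ala", "Arg", "Asn", "Asp", "Gln", "Glu", "Ile", "Lys",
              "Phe", "Pro", "Ser", "Thr", "Trp", "Val"}
Z2_ALLOWED = {"Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "Ile",
              "Leu", "Lys", "Met", "Pro", "Ser", "Thr", "Tyr", "Val"}


def is_affinity_peptide(peptide: str) -> bool:
    """Return True if `peptide` matches His-Z1-His-Arg-His-Z2-His."""
    residues = peptide.split("-")
    if len(residues) != 7:
        return False
    h1, z1, h2, arg, h3, z2, h4 = residues
    # Positions 1, 3, 5 and 7 are fixed as His; position 4 is fixed as Arg.
    fixed_ok = (h1, h2, arg, h3, h4) == ("His", "His", "Arg", "His", "His")
    return fixed_ok and z1 in Z1_ALLOWED and z2 in Z2_ALLOWED


print(is_affinity_peptide("His-Asn-His-Arg-His-Lys-His"))
print(is_affinity_peptide("His-Gly-His-Arg-His-Lys-His"))
```

The first call uses Z.sub.1 = Asn and Z.sub.2 = Lys, both of which appear in the lists above. The second uses Gly at the Z.sub.1 position, which is permitted only for Z.sub.2, so the check rejects it.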

[0017] Metal Ion-Affinity Peptide

[0018] In one embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z.sub.1 is an amino acid selected from the group consisting of Ala, Asn, Ile, Lys, Phe, Ser, Thr, and Val; and Z.sub.2 is an amino acid selected from the group consisting of Ala, Asn, Gly, Lys, Ser, Thr, Tyr; and R.sub.1, R.sub.2, Sp.sub.1, and Sp.sub.2 are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2), may be directly fused (when Sp.sub.1 or Sp.sub.2 is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp.sub.1 or Sp.sub.2 is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0019] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z.sub.1 is an amino acid selected from the group consisting of Asn and Lys; and Z.sub.2 is an amino acid selected from the group consisting of Gly and Lys; and R.sub.1, R.sub.2, Sp.sub.1, and Sp.sub.2 are as previously defined. For example, in one such embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I) wherein Z.sub.1 is Asn, Z.sub.2 is Lys and R.sub.1, R.sub.2, Sp.sub.1, and Sp.sub.2 are as previously defined. By way of further example, in another such embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I) wherein Z.sub.1 is Lys and Z.sub.2 is Gly. In each of these alternatives, the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2), may be directly fused (when Sp.sub.1 or Sp.sub.2 is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp.sub.1 or Sp.sub.2 is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0020] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z.sub.1 is Ile, Z.sub.2 is Asn, and R.sub.1, R.sub.2, Sp.sub.1, and Sp.sub.2 are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2), may be directly fused (when Sp.sub.1 or Sp.sub.2 is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp.sub.1 or Sp.sub.2 is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0021] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z.sub.1 is Thr, Z.sub.2 is Ser, and R.sub.1, R.sub.2, Sp.sub.1, and Sp.sub.2 are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2), may be directly fused (when Sp.sub.1 or Sp.sub.2 is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp.sub.1 or Sp.sub.2 is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0022] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z.sub.1 is Ser, Z.sub.2 is Tyr, and R.sub.1, R.sub.2, Sp.sub.1, and Sp.sub.2 are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2), may be directly fused (when Sp.sub.1 or Sp.sub.2 is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp.sub.1 or Sp.sub.2 is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0023] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z.sub.1 is Val, Z.sub.2 is Ala, and R.sub.1, R.sub.2, Sp.sub.1, and Sp.sub.2 are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2), may be directly fused (when Sp.sub.1 or Sp.sub.2 is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp.sub.1 or Sp.sub.2 is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.

[0024] In another embodiment, the recombinant polypeptide, protein or protein fragment is defined by formula (I), wherein Z.sub.1 is Ala, Z.sub.2 is Lys, and R.sub.1, R.sub.2, Sp.sub.1, and Sp.sub.2 are as previously defined. Thus, for example, in this embodiment the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2) may be at the carboxy or amino terminus of the metal ion-affinity polypeptide. In addition, the target polypeptide, protein or protein fragment (R.sub.1 or R.sub.2), may be directly fused (when Sp.sub.1 or Sp.sub.2 is a covalent bond) or separated from the metal ion-affinity polypeptide by a spacer (when Sp.sub.1 or Sp.sub.2 is one or more amino acid residues) regardless of whether the target polypeptide, protein or protein fragment is fused to the amino or carboxy terminus of the metal ion-affinity polypeptide.
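The Z.sub.1/Z.sub.2 pairs named individually in the preceding embodiments can be written out as concrete heptapeptides. The following sketch is illustrative only and is not part of the application. The one-letter mapping is the standard amino acid code, while `PAIRS` and `tag_sequence` are hypothetical names.

```python
# Standard one-letter codes for the residues used at Z1 and Z2 below.
ONE_LETTER = {"Ala": "A", "Asn": "N", "Gly": "G", "Ile": "I", "Lys": "K",
              "Ser": "S", "Thr": "T", "Tyr": "Y", "Val": "V"}

# (Z1, Z2) pairs named individually in the embodiments above.
PAIRS = [("Asn", "Lys"), ("Lys", "Gly"), ("Ile", "Asn"), ("Thr", "Ser"),
         ("Ser", "Tyr"), ("Val", "Ala"), ("Ala", "Lys")]


def tag_sequence(z1: str, z2: str) -> str:
    """One-letter sequence of His-Z1-His-Arg-His-Z2-His."""
    return f"H{ONE_LETTER[z1]}HRH{ONE_LETTER[z2]}H"


for z1, z2 in PAIRS:
    print(f"Z1={z1}, Z2={z2}: {tag_sequence(z1, z2)}")
```

For instance, the embodiment with Z.sub.1 = Asn and Z.sub.2 = Lys corresponds to the one-letter sequence HNHRHKH.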

[0025] In a further embodiment, R.sub.1 may be a polypeptide which drives expression of the fusion protein and R.sub.2 is the target polypeptide, protein or protein fragment. In this embodiment, each of Sp.sub.1 and Sp.sub.2 may be a covalent bond or a spacer, independently of the other. Thus, for example, R.sub.1 may be directly fused to the metal ion-affinity peptide or separated from the metal ion-affinity peptide by a spacer independently of whether R.sub.2 is directly fused to the metal ion-affinity peptide or separated from the metal ion-affinity peptide by a spacer; all of these combinations and permutations are contemplated. This type of arrangement is particularly useful when chimeric proteins are constructed which comprise epitopes from two portions of antigenic protein or from two different antigenic proteins. Such chimeric proteins may be useful in vaccine preparations.

[0026] In another embodiment, the recombinant polypeptides, proteins or protein fragments of the present invention comprise multiple copies of the metal ion-affinity peptide (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) wherein Z.sub.1 and Z.sub.2 are as previously defined. In this embodiment, the additional copies of the metal affinity peptide may occur in either or both of the spacer domains (Sp.sub.1 and Sp.sub.2) or in either or both of the other domains (R.sub.1 and R.sub.2) of the recombinant polypeptides, proteins or protein fragments. Thus, for example, in one embodiment a second copy of the metal ion-affinity peptide (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) wherein Z.sub.1 and Z.sub.2 are as previously defined is located in one of the spacer domains (Sp.sub.1 or Sp.sub.2) or other domains (R.sub.1 and R.sub.2) of the recombinant polypeptides, proteins or protein fragments. By way of further example, in another embodiment two additional copies of the metal ion-affinity peptide (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) wherein Z.sub.1 and Z.sub.2 are as previously defined are located in the spacer domains (Sp.sub.1 or Sp.sub.2) or other domains (R.sub.1 and R.sub.2) of the recombinant polypeptides, proteins or protein fragments. By way of further example, in another embodiment at least three additional copies of the metal ion-affinity peptide (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) wherein Z.sub.1 and Z.sub.2 are as previously defined are located in the spacer domains (Sp.sub.1 or Sp.sub.2) or other domains (R.sub.1 and R.sub.2) of the recombinant polypeptides, proteins or protein fragments. In each of these embodiments, the multiple copies of the metal ion-affinity peptide may be separated by one or more amino acid residues (i.e., a spacer) as described herein. 
Alternatively, in each of these embodiments the multiple copies of the metal ion-affinity peptide may be directly linked to each other without any intervening amino acid residues. Thus, for example, in one such embodiment the recombinant polypeptides, proteins or protein fragments of the present invention may be defined by the general formula (II):

R.sub.1-Sp.sub.1-(His-Z.sub.1-His-Arg-His-Z.sub.2-His).sub.t-Sp.sub.2-R.sub.2 (II)

wherein (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) is a metal ion-affinity peptide; t is at least 2 and R.sub.1, R.sub.2, Z.sub.1, Z.sub.2, Sp.sub.1 and Sp.sub.2 are as previously defined. By way of further example, in one such embodiment the recombinant polypeptides, proteins or protein fragments of the present invention may be defined by the general formula (III):

R.sub.1-Sp.sub.1-[(His-Z.sub.1-His-Arg-His-Z.sub.2-His)-Sp.sub.2].sub.t-R.sub.2 (III)

wherein (His-Z.sub.1-His-Arg-His-Z.sub.2-His) (SEQ ID NO: 24) is a metal ion-affinity peptide; t is at least 2 and R.sub.1, R.sub.2, Z.sub.1, Z.sub.2, Sp.sub.1 and Sp.sub.2 are as previously defined; in addition, each Sp.sub.2 of the recombinant polypeptides, proteins or protein fragments corresponding to general formula (III) may be the same or different.
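General formulae (II) and (III) can be illustrated with a minimal Python sketch that assembles one-letter amino acid strings. The function names are hypothetical, an empty string stands in for a covalent bond, and the residues passed for Z.sub.1 and Z.sub.2 (Asn and Lys) and for R.sub.1 and R.sub.2 are illustrative placeholders that are not checked against the definitions given elsewhere in this document.

```python
def affinity_peptide(z1: str, z2: str) -> str:
    """Return His-Z1-His-Arg-His-Z2-His in one-letter code."""
    return f"H{z1}HRH{z2}H"


def formula_ii(r1: str, sp1: str, z1: str, z2: str, t: int, sp2: str, r2: str) -> str:
    """Formula (II): t directly linked copies of the affinity peptide."""
    if t < 2:
        raise ValueError("t must be at least 2")
    return r1 + sp1 + affinity_peptide(z1, z2) * t + sp2 + r2


def formula_iii(r1: str, sp1: str, z1: str, z2: str, sp2_list: list[str], r2: str) -> str:
    """Formula (III): each copy is followed by its own Sp2, which may differ."""
    if len(sp2_list) < 2:
        raise ValueError("t (the number of Sp2 entries) must be at least 2")
    repeats = "".join(affinity_peptide(z1, z2) + sp2 for sp2 in sp2_list)
    return r1 + sp1 + repeats + r2


# Placeholder domains: "M" as R1, "AAA" as R2, "" as a covalent bond.
print(formula_ii("M", "", "N", "K", 2, "GS", "AAA"))
print(formula_iii("M", "", "N", "K", ["GG", "S"], "AAA"))
```

Passing a list of spacers to `formula_iii` reflects the statement that each Sp.sub.2 in formula (III) may be the same or different.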

[0027] Target Polypeptide, Protein or Protein Fragment

[0028] The target polypeptide, protein or protein fragment may be composed of any proteinaceous substance that can be expressed in transformed host cells. Accordingly, the present invention may be beneficially employed to produce substantially any prokaryotic or eukaryotic, simple or conjugated, protein that can be expressed by a vector in a transformed host cell. For example, the target protein may be

[0029] a) an enzyme, whether oxidoreductase, transferase, hydrolase, lyase, isomerase or ligase;

[0030] b) a storage protein, such as ferritin or ovalbumin or a transport protein, such as hemoglobin, serum albumin or ceruloplasmin;

[0031] c) a protein that functions in contractile and motile systems such as actin or myosin;

[0032] d) any of a class of proteins that serve a protective or defense function, such as the blood protein fibrinogen or a binding protein, such as antibodies or immunoglobulins that bind to and thus neutralize antigens;

[0033] e) a hormone such as human Growth Hormone, somatostatin, prolactin, estrone, progesterone, melanocyte, thyrotropin, calcitonin, gonadotropin and insulin;

[0034] f) a hormone involved in the immune system, such as interleukin-1, interleukin-2, colony stimulating factor, macrophage-activating factor and interferon;

[0035] g) a toxic protein, such as ricin from castor bean or gossypin from cotton linseed;

[0036] h) a protein that serves as structural elements such as collagen, elastin, alpha-keratin, glyco-proteins, viral proteins and muco-proteins; or

[0037] i) a synthetic protein, defined generally as any sequence of amino acids not occurring in nature.

In general, the target polypeptide, protein or protein fragment may be a constituent of the R.sub.1 and R.sub.2 moieties of the recombinant polypeptides, proteins or protein fragments corresponding to general formulae (I), (II) and (III).

[0038] Genes coding for the various types of protein molecules identified above may be obtained from a variety of prokaryotic or eukaryotic sources, such as plant or animal cells or bacterial cells. The genes can be isolated from the chromosome material of these cells or from plasmids of prokaryotic cells by employing standard, well-known techniques. A variety of naturally occurring and synthesized plasmids having genes coding for many different protein molecules are now commercially available from a variety of sources. The desired DNA also can be produced from mRNA by using the enzyme reverse transcriptase. This enzyme permits the synthesis of DNA from an RNA template.

[0039] In one embodiment, R.sub.1 may be a protein which enhances expression and R.sub.2 is the target polypeptide, protein, or protein fragment. It is well known that the presence of some proteins in a cell results in expression of genes. If a chimeric protein contains an active portion of the protein which prompts or enhances expression of the gene encoding it, greater quantities of the protein may be expressed than if it were not present.

[0040] Linker and Other Optional Elements

[0041] In one embodiment, the recombinant polypeptide, protein or protein fragment includes a spacer (Sp.sub.1 or Sp.sub.2) between the metal ion-affinity polypeptide and the target polypeptide, protein or protein fragment. If present, the spacer may simply comprise one or more, e.g., three to ten amino acid residues, separating the metal ion-affinity peptide from the target polypeptide, protein or protein fragment. Alternatively, the spacer may comprise a sequence which imparts other functionality, such as a proteolytic cleavage site, a fusion protein, a secretion sequence (e.g., OmpA or OmpT for E. coli, preprotrypsin for mammalian cells, .alpha.-factor for yeast, and melittin for insect cells), a leader sequence for cellular targeting, antibody epitopes, or internal ribosomal entry sequences (IRES).

[0042] In one embodiment, the spacer is selected from among hydrophilic amino acids to increase the hydrophilic character of the recombinant polypeptide, protein or protein fragment. Alternatively, the amino acid(s) of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment, thereby increasing accessibility to one or more regions of the molecule. For example, the spacer domain may comprise glycine residues which result in a protein folding conformation which allows for improved accessibility to antibodies.

[0043] In another embodiment, the spacer comprises a cleavage site which consists of a unique amino acid sequence cleavable by use of a sequence-specific proteolytic agent. Such a site would enable the metal ion-affinity polypeptide to be readily cleaved from the target polypeptide, protein or protein fragment by digestion with a proteolytic agent specific for the amino acids of the cleavage site. Alternatively, the metal ion-affinity peptide may be removed from the desired protein by chemical cleavage using methods known to the art.

[0044] When present, the cleavable site may be located at the amino or carboxy terminus of the target peptide. Preferably, the cleavable site is immediately adjacent to the desired protein to enable separation of the desired protein from the metal ion-affinity peptide. This cleavable site preferably does not appear in the desired protein. In one embodiment, the cleavable site is located at the amino terminus of the desired protein. If the cleavable site is located at the amino terminus of the desired protein and if there are remaining extraneous amino acids on the desired protein after cleavage with the proteolytic agent, an endopeptidase such as trypsin, clostripain or furin may be utilized to remove these remaining amino acids, thus resulting in a highly purified desired protein. Further examples of proteolytic enzymatic agents useful for cleavage are papain, pepsin, plasmin, thrombin, enterokinase, and the like. Each effects cleavage at a particular amino acid sequence which it recognizes.

[0045] Digestion with a proteolytic agent may occur while the fusion protein is still bound to the affinity resin or alternatively, the fusion protein may be eluted from the affinity resin and then digested with the proteolytic agent in order to further purify the desired protein. Preferably, the amino acid sequence of the proteolytic cleavage site is unique, thus minimizing the possibility that the proteolytic agent will cleave the desired protein. In one embodiment, the cleavable site comprises amino acids for an enterokinase, thrombin or a Factor Xa cleavage site.

[0046] Enterokinase recognizes several sequences: Asp-Lys; Asp-Asp-Lys; Asp-Asp-Asp-Lys (SEQ ID NO: 25); and Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 26). The only known natural occurrences of Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 26) are in the protein trypsinogen, which is a natural substrate for bovine enterokinase, and in some yeast proteins. As such, by interposing a fragment containing the amino acid sequence Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 26) as a cleavable site between the metal ion-affinity polypeptide and the amino terminus of the target polypeptide, protein or protein fragment, the metal ion-affinity polypeptide can be liberated from the desired protein by use of bovine enterokinase with very little likelihood that this enzyme will cleave any portion of the desired protein itself.
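The placement of the Asp-Asp-Asp-Asp-Lys site between the affinity peptide and the target can be sketched in Python as a simple string operation. This is only an illustration: the tag and target sequences are hypothetical placeholders, the helper names are invented for the example, and the cut is modeled as falling immediately after the lysine of the site.

```python
ENTEROKINASE_SITE = "DDDDK"  # Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 26)


def site_absent_from_target(target: str, site: str = ENTEROKINASE_SITE) -> bool:
    """True when the cleavable site does not appear in the desired protein."""
    return site not in target


def cleave_fusion(fusion: str, site: str = ENTEROKINASE_SITE) -> tuple[str, str]:
    """Split a fusion at the first occurrence of the site, cutting after its Lys."""
    idx = fusion.find(site)
    if idx < 0:
        raise ValueError("no cleavage site found")
    cut = idx + len(site)
    return fusion[:cut], fusion[cut:]


tag = "HNHRHKH"        # hypothetical metal ion-affinity peptide
target = "MKTAYIAKQR"  # hypothetical desired protein
fusion = tag + ENTEROKINASE_SITE + target

released_tag, released_target = cleave_fusion(fusion)
print(released_tag, released_target)
```

Because the site sits immediately adjacent to the amino terminus of the target, the second fragment returned is the target with no extraneous residues.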

[0047] Thrombin cleaves on the carboxy-terminal side of arginine in the following sequence: Leu-Val-Pro-Arg-Gly-X (SEQ ID NO: 27), where X is a non-acidic amino acid. Factor Xa protease (i.e., the activated form of Factor X) cleaves after the Arg in the following sequences: Ile-Glu-Gly-Arg-X (SEQ ID NO: 28), Ile-Asp-Gly-Arg-X (SEQ ID NO: 29), and Ala-Glu-Gly-Arg-X (SEQ ID NO: 30), where X is any amino acid except proline or arginine. A fusion protein comprising the 31 amino-terminal residues of the cII protein, a Factor Xa cleavage site and human .beta.-globin was shown to be cleaved by Factor Xa and generate authentic .beta.-globin. A limitation of the Factor Xa-based fusion systems is the fact that Factor Xa has been reported to cleave at arginine residues that are not present within the Factor Xa recognition sequence. Lauritzen, C. et al., Protein Expr. and Purif., 5-6:372-378 (1991).
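The thrombin and Factor Xa recognition rules stated above can be expressed as regular expressions, as in the following Python sketch. It assumes sequences written in uppercase one-letter code, and the helper name `cut_positions` is hypothetical. The patterns encode only the rules as written here, so they do not model the off-target Factor Xa cleavage mentioned in the text.

```python
import re

# Thrombin: cut after Arg in Leu-Val-Pro-Arg-Gly-X, X non-acidic (not Asp or Glu).
THROMBIN = re.compile(r"LVPR(?=G[^DE])")

# Factor Xa: cut after Arg in Ile-Glu-Gly-Arg-X, Ile-Asp-Gly-Arg-X or
# Ala-Glu-Gly-Arg-X, where X is anything except Pro or Arg.
FACTOR_XA = re.compile(r"(?:IEGR|IDGR|AEGR)(?=[^PR])")


def cut_positions(seq: str, pattern: re.Pattern) -> list[int]:
    """Return the indices at which seq would be split (the cut follows the Arg)."""
    return [m.end() for m in pattern.finditer(seq)]


def cleave(seq: str, pattern: re.Pattern) -> list[str]:
    """Split seq at every predicted cut position."""
    fragments, start = [], 0
    for pos in cut_positions(seq, pattern):
        fragments.append(seq[start:pos])
        start = pos
    fragments.append(seq[start:])
    return fragments


print(cleave("HNHRHKHLVPRGSMKT", THROMBIN))
print(cleave("HNHRHKHIEGRMKT", FACTOR_XA))
```

The lookahead keeps the residues after the Arg out of the match, so the reported position is the cut itself; a site at the very end of a sequence, with no following residue, is not reported.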

[0048] While less preferred, other unique amino acid sequences for other cleavable sites may also be employed in the spacer without departing from the spirit or scope of the present invention. For instance, the spacer may be composed, at least in part, of a pair of basic amino acids, i.e., Arg, His or Lys. This sequence is cleaved by kallikrein, a glandular enzyme. Also, the spacer may be composed, at least in part, of Arg-Gly, since it is known that the enzyme thrombin will cleave after the Arg if this residue is followed by Gly.

[0049] Regardless of whether a cleavage site is present, the recombinant polypeptide, protein or protein fragment may comprise an antigenic domain in a spacer region (Sp.sub.1 or Sp.sub.2). For example, in one embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises one or multiple copies of an antigenic domain generally corresponding to the FLAG.RTM. (Sigma-Aldrich, St. Louis, Mo.) peptide sequence joined to a linking sequence containing a single enterokinase cleavage site. Such antigenic domains generally correspond to the sequence:

[0050] X.sup.20--(X.sup.1--Y--K--X.sup.2--X.sup.3-D-X.sup.4).sub.n--X.sup.5--(X.sup.1--Y--K--X.sup.7--X.sup.8-D-X.sup.9--K)--X.sup.21 (SEQ ID NO: 39)

[0051] where:

[0052] D, Y and K are their representative amino acids;

[0053] X.sup.20 and X.sup.21 are independently a hydrogen or a covalent bond;

[0054] each X.sup.1 and X.sup.4 is independently a covalent bond or at least one amino acid residue, if other than a covalent bond, preferably at least one amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably at least one hydrophilic amino acid residue, and still more preferably at least one aspartate residue;

[0055] each X.sup.2, X.sup.3, X.sup.7 and X.sup.8 is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0056] X.sup.5 is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X).sub.m--, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

[0057] X.sup.9 is a covalent bond or D; and

[0058] n is 0, 1 or 2.

[0059] In this embodiment, the amino acid sequence X.sup.20--(X.sup.1--Y--K--X.sup.2--X.sup.3-D-X.sup.4).sub.n (SEQ ID NO: 35) comprises an antigenic domain --X.sup.1--Y--K--X.sup.2--X.sup.3-D- (SEQ ID NO: 36) joined in tandem which are joined to a linking sequence (X.sup.1--Y--K--X.sup.7--X.sup.8-D-X.sup.9--K) (SEQ ID NO: 37). The antigenic domains may be immediately adjacent to each other when n is at least one and X.sup.4 is a covalent bond; optionally, X.sup.4 may be a spacer domain interposed between the multiple copies of antigenic domains. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence --X.sup.7--X.sup.8-D-X.sup.9--K (SEQ ID NO: 38), where X.sup.7 and X.sup.8 may be an amino acid residue or a covalent bond and X.sup.9 is a covalent bond or an aspartate residue. In one embodiment, each X.sup.7, X.sup.8 and X.sup.9 is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 26) which is preferably located immediately adjacent to the amino terminus of the target peptide. When n is at least one and X.sup.5 is a covalent bond, the multiple copies of antigenic domains may be immediately adjacent to the linking sequence; optionally, X.sup.5 may be a spacer domain interposed between the linking sequence and the antigenic domains. When each X.sup.4 and X.sup.5 is independently a spacer domain, it is preferred that the amino acid residue(s) of each X.sup.4 and X.sup.5 impart one or more desired properties to the antigenic domain; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the identification polypeptide thereby increasing accessibility to the antibody. 
In another embodiment, the amino acids of the spacer domain X.sup.4 and X.sup.5 may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. Furthermore, these desired properties may be designed into other areas of the identification polypeptide; for example, the amino acids represented by X.sup.2 and X.sup.3 may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.

[0060] In another embodiment, the spacer comprises multiple copies of an antigenic domain. For example, in one embodiment the spacer may comprise a linking sequence containing a single enterokinase or other cleavage site, or generally correspond to the sequence:

[0061] X.sup.20-(D-Y--K--X.sup.2--X.sup.3-D).sub.n-X.sup.5-(D-Y--K--X.sup.7--X.sup.8-D-X.sup.9--K)--X.sup.21 (SEQ ID NO: 40)

[0062] where:

[0063] D, Y, K are their representative amino acids;

[0064] X.sup.20 and X.sup.21 are independently a hydrogen or a covalent bond; each X.sup.2, X.sup.3, X.sup.7 and X.sup.8 is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0065] X.sup.5 is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X).sub.m--, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

[0066] X.sup.9 is a covalent bond or an aspartate residue; and

[0067] n is at least 2.

[0068] In this embodiment, the amino acid sequence X.sup.20-(D-Y--K--X.sup.2--X.sup.3-D).sub.n (SEQ ID NO: 41) represents the multiple copies of the antigenic domain D-Y--K--X.sup.2--X.sup.3-D (SEQ ID NO: 31) in tandem which are joined to a linking sequence (D-Y--K--X.sup.7--X.sup.8-D-X.sup.9--K) (SEQ ID NO: 32). In this embodiment, one antigenic domain is immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain are immediately adjacent to the linking sequence when X.sup.5 is a covalent bond. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence --X.sup.7--X.sup.8-D-X.sup.9-K (SEQ ID NO: 38), where X.sup.7 and X.sup.8 may be a covalent bond or an amino acid residue, preferably an aspartate residue, and X.sup.9 is a covalent bond or an aspartate residue. In one embodiment, each X.sup.7, X.sup.8 and X.sup.9 is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 26) which is preferably adjacent to the amino terminus of the target peptide. Optionally, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer X.sup.5 when X.sup.5 is at least one amino acid residue. When X.sup.5 is a spacer domain, it is preferred that the amino acid residue(s) of X.sup.5 impart one or more desired properties to the recombinant polypeptide, protein or protein fragment; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to the antibody. In another embodiment, the amino acids of the spacer domain may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. 
Furthermore, these desired properties may be designed into other areas of the spacer; for example, the amino acids represented by X.sup.2 and X.sup.3 may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.
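The sequence of SEQ ID NO: 40 can be assembled with a short Python sketch. The function name is hypothetical, an empty string stands in for a covalent bond, the terminal groups X.sup.20 and X.sup.21 are not represented, and the defaults use the still-more-preferred aspartate for X.sup.2, X.sup.3, X.sup.7, X.sup.8 and X.sup.9.

```python
def antigenic_spacer(n: int, x2: str = "D", x3: str = "D", x5: str = "",
                     x7: str = "D", x8: str = "D", x9: str = "D") -> str:
    """Build (D-Y-K-X2-X3-D)n - X5 - (D-Y-K-X7-X8-D-X9-K) in one-letter code."""
    if n < 2:
        raise ValueError("n must be at least 2 for this embodiment")
    antigenic_domain = f"DYK{x2}{x3}D"        # one copy of the antigenic domain
    linking_sequence = f"DYK{x7}{x8}D{x9}K"   # carries the enterokinase site
    return antigenic_domain * n + x5 + linking_sequence


all_asp = antigenic_spacer(2)
with_his_spacer = antigenic_spacer(2, x5="HGH")  # His-Gly-His as X5

print(all_asp)
print(with_his_spacer)
print(all_asp.count("DDDDK"))  # occurrences of the enterokinase site
```

With the all-aspartate defaults, the enterokinase site DDDDK occurs once, at the carboxy end of the linking sequence, which is consistent with the single cleavable site described above.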

[0069] When the affinity polypeptide is located at the amino terminus of the target polypeptide, protein or protein fragment, it is often desirable to design the amino acid sequence such that an initiator methionine is present. Accordingly, in one embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises multiple copies of an antigenic domain, a linking sequence containing a single enterokinase cleavage site and generally corresponds to the sequence:

[0070] X.sup.20--X.sup.10-(D-Y--K--X.sup.2--X.sup.3-D).sub.n-X.sup.5-(D-Y--K--X.sup.7--X.sup.8-D-X.sup.9--K)--X.sup.21 (SEQ ID NO: 45)

[0071] where:

[0072] D, Y, and K are their representative amino acids;

[0073] X.sup.20 and X.sup.21 are independently a hydrogen or a covalent bond;

[0074] X.sup.10 is a covalent bond or an amino acid, if other than a covalent bond, preferably a methionine residue;

[0075] each X.sup.2, X.sup.3, X.sup.7 and X.sup.8 is independently an amino acid residue, preferably an amino acid residue selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0076] X.sup.5 is a covalent bond or a spacer domain comprising at least one amino acid, if other than a bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X).sub.m--, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr, and Val;

[0077] X.sup.9 is a covalent bond or an aspartate residue; and

[0078] n is at least 2.

[0079] In this embodiment, the amino acid sequence X.sup.20--X.sup.10-(D-Y--K--X.sup.2--X.sup.3-D).sub.n (SEQ ID NO: 44) represents the multiple copies of the antigenic domain D-Y--K--X.sup.2--X.sup.3-D (SEQ ID NO: 31) in tandem which is flanked by a linking sequence (D-Y--K--X.sup.7--X.sup.8-D-X.sup.9--K) (SEQ ID NO: 32) and an initiator amino acid X.sup.10, preferably methionine. The antigenic domain D-Y--K--X.sup.2--X.sup.3-D with an initiator methionine is recognized by the M5.RTM. antibody (Sigma-Aldrich, St. Louis, Mo.). In this embodiment, one antigenic domain is immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain are immediately adjacent to the linking sequence when X.sup.5 is a covalent bond. The linking sequence contains an enterokinase cleavable site which is represented by the amino acid sequence --X.sup.7--X.sup.8-D-X.sup.9--K (SEQ ID NO: 38), where X.sup.7 and X.sup.8 may be a covalent bond or an amino acid residue, preferably an aspartate residue, and X.sup.9 is a covalent bond or an aspartate residue. In one embodiment, each X.sup.7, X.sup.8 and X.sup.9 is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 26) which is preferably adjacent to the amino terminus of the target peptide. Optionally, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer domain X.sup.5 when X.sup.5 is at least one amino acid residue. When X.sup.5 is a spacer domain, it is preferred that the amino acid residue(s) of X.sup.5 impart one or more desired properties to the affinity polypeptide; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to the antibody. 
In another embodiment, the amino acids of the spacer domain may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix. Furthermore, these desired properties may be designed into other areas of the affinity polypeptide; for example, the amino acids represented by X.sup.2 and X.sup.3 may be selected to impart a desired peptide folding or a desired affinity characteristic for use in affinity purification.

[0080] In another embodiment of the present invention, the recombinant polypeptide, protein or protein fragment comprises one or more copies of an antigenic sequence, a linking sequence containing a single enterokinase cleavable site and generally corresponds to the sequence:

[0081] X.sup.20-(D-X.sup.11--Y--X.sup.12--X.sup.13).sub.n--X.sup.14-(D-X.sup.11--Y--X.sup.12--X.sup.13-D-X.sup.15--K)--X.sup.21 (SEQ ID NO: 42)

[0082] where:

[0083] D, Y and K are their representative amino acids;

[0084] X.sup.20 and X.sup.21 are independently a hydrogen or a covalent bond;

[0085] each X.sup.11 is a covalent bond or an amino acid, preferably Leu;

[0086] each X.sup.12 is an amino acid, preferably selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0087] each X.sup.13 is a covalent bond or at least one amino acid, if other than a covalent bond, preferably selected from the group consisting of aromatic amino acid residues and hydrophilic amino acid residues, more preferably a hydrophilic amino acid residue, and still more preferably an aspartate residue;

[0088] X.sup.14 is a covalent bond or a spacer domain comprising at least one amino acid, if other than a covalent bond, preferably a histidine residue, a glycine residue or a combination of multiple or alternating histidine residues, said combination comprising His-Gly-His, or -(His-X).sub.m--, wherein m is 1 to 6 and X is selected from the group consisting of Ala, Arg, Asn, Asp, Cys, Gln, Glu, Gly, His, Ile, Leu, Lys, Met, Phe, Pro, Ser, Thr, Trp, Tyr and Val;

[0089] X.sup.15 is a covalent bond or an aspartate residue; and

[0090] n is 0 or at least 1.

[0091] In this embodiment, when n is at least 2, the amino acid sequence X.sup.20-(D-X.sup.11--Y--X.sup.12--X.sup.13).sub.n (SEQ ID NO: 43) constitutes multiple copies of the antigenic domain D-X.sup.11--Y--X.sup.12--X.sup.13 (SEQ ID NO: 33) in tandem which are joined to a linking sequence (D-X.sup.11--Y--X.sup.12--X.sup.13-D-X.sup.15--K) (SEQ ID NO: 34). Additionally, one antigenic domain may be immediately adjacent to another antigenic domain, i.e., no intervening spacer domains, and the multiple copies of the antigenic domain may be immediately adjacent to the linking sequence when X.sup.14 is a covalent bond. The linking sequence contains a single enterokinase cleavable site which is represented by the sequence --X.sup.12--X.sup.13-D-X.sup.15--K, (SEQ ID NO: 38) where X.sup.12 and X.sup.13 may be a covalent bond or an amino acid residue, preferably an aspartate residue, and X.sup.15 is a covalent bond or an aspartate residue. In one embodiment, each X.sup.12, X.sup.13 and X.sup.15 is independently an aspartate residue thus resulting in the enterokinase cleavable site DDDDK (SEQ ID NO: 26) which is preferably adjacent to the amino terminus of the target peptide. Optionally, when n is at least two, the multiple copies of the antigenic domain are joined to the linking sequence by a spacer X.sup.14 when X.sup.14 is at least one amino acid residue. When X.sup.14 is a spacer domain, it is preferred that the amino acid residue(s) of X.sup.14 impart one or more desired properties to the recombinant polypeptide, protein or protein fragment; for example, the amino acids of the spacer domain may be selected to impart a desired folding to the recombinant polypeptide, protein or protein fragment thereby increasing accessibility to the antibody. 
In another embodiment, the amino acids of the spacer domain X.sup.14 may be selected to impart a desired affinity characteristic such as a combination of multiple or alternating histidine residues capable of chelating to an immobilized metal ion on a resin or other matrix.

[0092] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises the enzyme glutathione-S-transferase of the parasitic helminth Schistosoma japonicum (SEQ ID NO: 1). The glutathione-S-transferase may, however, be derived from other species including human and other mammalian glutathione-S-transferase. Proteins expressed as fusions with the enzyme glutathione-S-transferase can be purified under non-denaturing conditions by affinity chromatography on immobilized glutathione. Glutathione-agarose beads have a capacity of at least 8 mg fusion protein/ml swollen beads and can be used several times for different preparations of the same fusion protein. Smith, D. B. and Johnson, K. S., Gene, 67:31-40, 1988.

[0093] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises a cellulose binding domain (CBD) (SEQ ID NO: 2). CBD's are found in both bacterial and fungal sources and possess a high affinity for the crystalline form of cellulose. This property has been useful for purification of fusion proteins using a cellulose matrix. Fusion proteins have been attached at both the N- and C-terminus of CBD.

[0094] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises the Maltose Binding Protein (MBP) encoded by the malE gene in E. coli (SEQ ID NO: 3). MBP has found utility in the formation of chimeric proteins with eukaryotic proteins for expression in bacterial systems. This system permits expression of soluble fusion proteins that can readily be purified on immobilized amylose resin.

[0095] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises Protein A (SEQ ID NO: 4). Protein A is isolated from Staphylococcus aureus and binds to the Fc region of IgG. Fusion proteins containing the IgG binding domains of Protein A can be affinity purified on IgG resins (e.g., IgG Sepharose 6FF (Pharmacia Biotech)). The signal sequence of Protein A is functional in E. coli. Fusion proteins using Protein A have shown increased stability when expressed both in the cytoplasm and periplasm in E. coli.

[0096] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises Protein G (SEQ ID NO: 5). Protein G is similar to Protein A, with the difference being that Protein G binds to human serum albumin in addition to IgG. The major disadvantage is that a low pH (<3.4) is required to elute the fusion protein.

[0097] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises IgG (SEQ ID NO: 6). Placing the protein of interest on the C-terminus of IgG generates chimeric proteins. This allows purification of the fusion protein using either a Protein A or a Protein G matrix.

[0098] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises the enzyme chloramphenicol acetyl transferase (CAT) from E. coli (SEQ ID NO: 7). CAT is used in the form of a C-terminal fusion. CAT is readily translated in E. coli and allows for over-expression of heterologous proteins. Capture of fusion proteins is accomplished using a chloramphenicol matrix.

[0099] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises streptavidin (SEQ ID NO: 8). Streptavidin is used for fusion proteins because of its high affinity and high specificity for biotin. Streptavidin is a neutral protein, free from carbohydrates and sulphydryl groups.

[0100] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises .beta.-galactosidase (SEQ ID NO: 9). .beta.-galactosidase is an enzyme that is utilized as both an N- and C-terminal fusion protein. Fusion proteins containing .beta.-galactosidase sequences can be affinity purified on aminophenyl-.beta.-D-thiogalactosidyl-succinyldiaminohexyl-Sepharose. However, given that C-terminal fusion proteins are usually insoluble, the system has limited use in bacterial systems. N-terminal fusions are soluble in E. coli, but due to the large size of .beta.-galactosidase, this system is used more often in eukaryotic gene expression.

[0101] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises the Green Fluorescent Protein (GFP) (SEQ ID NO: 10). GFP is a protein from the jellyfish Aequorea victoria, and many mutant variations of this protein have been used successfully in most organisms for protein expression. The major use of these types of fusion proteins is for targeting and determining physiological function of the host cell protein.

[0102] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises thioredoxin (SEQ ID NO: 11). Thioredoxin is a relatively small thermostable protein that is easily over-expressed in bacterial systems. Thioredoxin fusion systems are useful in avoiding the formation of inclusion bodies during heterologous gene expression. This has been particularly useful in the expression of mammalian cytokines.

[0103] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises Calmodulin Binding Protein (CBP) (SEQ ID NO: 12). This tag is derived from the C-terminus of skeletal muscle myosin light chain kinase. This small tag is recognized by calmodulin and forms the base of the technology. The tag is translated efficiently and allows for the expression and recovery of N-terminal chimeric genes.

[0104] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises the c-myc epitope sequence Glu-Gln-Lys-Leu-Ile-Ser-Glu-Glu-Asp-Leu (SEQ ID NO: 13). This C-terminal portion of the myc oncogene, which is part of the p53 signaling pathway, has been used as a detection tag for expression of recombinant proteins in mammalian cells.

[0105] In another embodiment of this invention, a spacer (Sp.sub.1 or Sp.sub.2) comprises the HA epitope sequence Tyr-Pro-Tyr-Asp-Val-Tyr-Ala (SEQ ID NO: 14). This detection tag has been utilized for the expression of recombinant proteins in mammalian cells.

[0106] In another embodiment of this invention, the spacer (Sp.sub.1 or Sp.sub.2) comprises a polypeptide possessing an amino acid sequence having at least 70% homology to any one of the amino acid sequences disclosed in SEQ ID NOS:1-14, and retains the same binding characteristics as said amino acid sequence.
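By way of non-limiting illustration only, a threshold such as the 70% figure above can be checked with a position-by-position count. The sketch below assumes that "homology" is measured as percent identity over two pre-aligned sequences of equal length; that reading, and the helper name percent_identity, are assumptions made for illustration and are not definitions taken from this description. The two octapeptides used as sample inputs are SEQ ID NO: 15 and SEQ ID NO: 16 as recited elsewhere in this description; they are chosen only because they are short.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions carrying identical residues in two pre-aligned,
    equal-length one-letter sequences (hypothetical helper)."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Sample inputs: the two octapeptide epitopes recited in this description.
identity = percent_identity("DYKDDDDK", "DLYDDDDK")
meets_threshold = identity >= 70.0  # 70% is the figure used in the paragraph above
print(f"{identity:.1f}% identical; meets threshold: {meets_threshold}")
```

A real comparison of full-length spacer proteins would generally require an alignment step before counting, which this sketch deliberately omits.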

[0107] DNA sequences encoding the aforementioned proteins which may be employed as spacers (Sp.sub.1 or Sp.sub.2) are commercially available (e.g., malE gene sequences encoding the MBP are available from New England Biolabs (pMAL-c2 and pMAL-p2); Schistosoma japonicum glutathione-S-transferase (GST) gene sequences are available from Pharmacia Biotech (the pGEX series which have GenBank Accession Nos.: U13849 to U13858); .beta.-galactosidase (the lacZ gene product) gene sequences are available from Pharmacia Biotech (pCH110 and pMC1871; GenBank Accession Nos: U13845 and L08936, respectively); sequences encoding the IgG binding domains of Protein A are available from Pharmacia Biotech (pRIT2T; GenBank Accession No. U13864)).

[0108] When any of the above listed proteins (including the hinge/Fc domains of human IgG.sub.1) are used as spacers, it is not required that the entire protein be used as a spacer. Portions of these proteins may be used as the spacer provided the portion selected is sufficient to permit interaction of a fusion protein containing the portion of the protein used as the spacer with the desired affinity resin.

[0109] Expression and Purification

[0110] The polypeptides, proteins and protein fragments of the present invention are generally prepared and expressed as a fusion protein using conventional recombinant DNA technology. The fusion protein is thus produced by host cells transformed with the genetic information encoding the fusion protein. The host cells may secrete the fusion protein into the culture media or store it in the cells, in which case the cells must be collected and disrupted in order to extract the product. As hosts, E. coli, yeast, insect cells, mammalian cells and plants are suitable. Of these, E. coli will typically be the preferred host for most applications. In one embodiment, the recombinant polypeptides, proteins and protein fragments are produced in a soluble form or secreted from the host.

[0111] In general, a chimeric gene is inserted into an expression vector which allows for the expression of the desired fusion protein in a suitable transformed host. The expression vector provides the inserted chimeric gene with the necessary regulatory sequences to control expression in the suitable transformed host.

[0112] The expression control sequence comprises six elements for proteins which are to be secreted from a host into the medium, while five of these elements apply to fusion proteins expressed intracellularly. These elements, in the order in which they appear in the gene, are: a) the promoter region; b) the 5' untranslated region; c) the signal sequence; d) the chimeric coding sequence; e) the 3' untranslated region; and f) the transcription termination site. Fusion proteins which are not secreted do not contain c), the signal sequence.
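By way of illustration only, the ordering of these elements and the omission of the signal sequence for non-secreted fusion proteins can be sketched as a small data structure. The names CONTROL_ELEMENTS and elements_for are hypothetical and are introduced here solely for the example.

```python
from typing import List

# Elements in the order in which they appear in the gene, per the paragraph above.
CONTROL_ELEMENTS: List[str] = [
    "promoter region",
    "5' untranslated region",
    "signal sequence",
    "chimeric coding sequence",
    "3' untranslated region",
    "transcription termination site",
]


def elements_for(secreted: bool) -> List[str]:
    """Return the ordered elements; non-secreted fusions omit the signal sequence."""
    if secreted:
        return list(CONTROL_ELEMENTS)
    return [e for e in CONTROL_ELEMENTS if e != "signal sequence"]


secreted_layout = elements_for(secreted=True)
intracellular_layout = elements_for(secreted=False)
print(len(secreted_layout), len(intracellular_layout))
```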

[0113] The recombinant expression vectors of the invention comprise a nucleic acid of the invention in a form suitable for expression of the nucleic acid in a host cell. This means that the recombinant expression vectors include one or more regulatory sequences, selected on the basis of the host cells to be used for expression, operably linked to the nucleic acid sequence to be expressed. It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression of protein desired, etc. The expression vectors of the invention can be introduced into host cells to thereby produce proteins or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein.

[0114] Vector DNA can be introduced into prokaryotic or eukaryotic cells via conventional transformation or transfection techniques. For stable transfection of mammalian cells, it is known that, depending upon the expression vector and transfection technique used, only a small fraction of cells may integrate the foreign DNA into their genome. In order to identify and select these integrants, a gene that encodes a selectable marker (e.g., for resistance to antibiotics) is generally introduced into the host cells along with the gene of interest. Preferred selectable markers include those which confer resistance to drugs, such as G418, hygromycin, and methotrexate. Nucleic acid encoding a selectable marker can be introduced into a host cell on the same vector as that encoding the metal ion-affinity peptide containing fusion protein or can be introduced on a separate vector. Cells stably transfected with the introduced nucleic acid can be identified by drug selection (e.g., cells that have incorporated the selectable marker gene will survive, while the other cells die). Methods and materials for preparing recombinant vectors, transforming host cells using replicating vectors, and expressing biologically active foreign polypeptides and proteins are generally well known.

[0115] The expressed recombinant polypeptides, proteins and protein fragments may be separated from other material present in the secretion media or extraction solution, or from other liquid mixtures, through immobilized metal affinity chromatography ("IMAC"). For example, the culture media containing the secreted recombinant polypeptides, proteins and protein fragments or the cell extracts containing the recombinant polypeptides, proteins and protein fragments may be passed through a column that contains a resin comprising an immobilized metal ion. In IMAC, metal ions are immobilized onto a solid support, and used to capture proteins comprising a metal chelating peptide. The metal chelating peptide may occur naturally in the protein, or the protein may be a recombinant protein with an affinity tag comprising a metal chelating peptide. Exemplary metal ions include aluminum, cadmium, calcium, cobalt, copper, gallium, iron, nickel, ytterbium and zinc. In one embodiment, the metal ion is preferably nickel, copper, cobalt, or zinc. In another embodiment, the metal ion is nickel. Advantageously, the components of the solution other than the recombinant polypeptide, protein or protein fragment freely pass through the column. The immobilized metal, however, chelates or binds the recombinant polypeptides, proteins and protein fragments, thereby separating them from the remaining contents of the liquid mixture in which they were originally contained.

[0116] Resins useful for producing immobilized metal ion affinity chromatography (IMAC) columns are available commercially. Examples of resins derivatized with iminodiacetic acid (IDA) are Chelating Sepharose 6B (Pharmacia), Immobilized Iminodiacetic Acid (Pierce), and Iminodiacetic Acid Agarose (Sigma-Aldrich). In addition, Porath has immobilized tris(carboxymethyl)ethylenediamine (TED) on Sepharose 6B and used it to fractionate serum proteins. Porath, J. and Olin, B., Biochemistry, 22:1621-1630, 1983. Other reports suggest that trisacryl GF2000 and silica can be derivatized with IDA, TED, or aspartic acid, and the resulting materials used in producing IMAC substances.

[0117] In one embodiment, the capture ligand is a metal chelate as described in WO 01/81365. More specifically, in this embodiment the capture ligand is a metal chelate derived from metal chelating composition (1):

##STR00001##

wherein
[0118] Q is a carrier;
[0119] S.sup.1 is a spacer;
[0120] L is -A-T-CH(X)-- or --C(.dbd.O)--;
[0121] A is an ether, thioether, selenoether, or amide linkage;
[0122] T is a bond or substituted or unsubstituted alkyl or alkenyl;
[0123] X is --(CH.sub.2).sub.kCH.sub.3, --(CH.sub.2).sub.kCOOH, --(CH.sub.2).sub.kSO.sub.3H, --(CH.sub.2).sub.kPO.sub.3H.sub.2, --(CH.sub.2).sub.kN(J).sub.2, or --(CH.sub.2).sub.kP(J).sub.2, preferably --(CH.sub.2).sub.kCOOH or --(CH.sub.2).sub.kSO.sub.3H;
[0124] k is an integer from 0 to 2;
[0125] J is hydrocarbyl or substituted hydrocarbyl;
[0126] Y is --COOH, --H, --SO.sub.3H, --PO.sub.3H.sub.2, --N(J).sub.2, or --P(J).sub.2, preferably --COOH;
[0127] Z is --COOH, --H, --SO.sub.3H, --PO.sub.3H.sub.2, --N(J).sub.2, or --P(J).sub.2, preferably --COOH; and
[0128] i is an integer from 0 to 4, preferably 1 or 2.

[0129] In general, the carrier, Q, may comprise any solid or soluble material or compound capable of being derivatized for coupling. Solid (or insoluble) carriers may be selected from a group including agarose, cellulose, methacrylate co-polymers, polystyrene, polypropylene, paper, polyamide, polyacrylonitrile, polyvinylidene, polysulfone, nitrocellulose, polyester, polyethylene, silica, glass, latex, plastic, gold, iron oxide and polyacrylamide, but may be any insoluble or solid compound able to be derivatized to allow coupling of the remainder of the composition to the carrier, Q. Soluble carriers include proteins, nucleic acids including DNA, RNA, and oligonucleotides, lipids, liposomes, synthetic soluble polymers, polyamino acids, albumin, antibodies, enzymes, streptavidin, peptides, hormones, chromogenic dyes, fluorescent dyes, fluorochromes or any other detection molecule, drugs, small organic compounds, polysaccharides and any other soluble compound able to be derivatized for coupling the remainder of the composition to the carrier, Q. In one embodiment, the carrier, Q, is the container of the present invention. In another embodiment, the carrier, Q, is a body provided within the container of the present invention.

[0130] The spacer, S.sup.1, which flanks the carrier comprises a chain of atoms which may be saturated or unsaturated, substituted or unsubstituted, linear or cyclic, or straight or branched. Typically, the chain of atoms defining the spacer, S.sup.1, will consist of no more than about 25 atoms; stated another way, the backbone of the spacer will consist of no more than about 25 atoms. More preferably, the chain of atoms defining the spacer, S.sup.1, will consist of no more than about 15 atoms, and still more preferably no more than about 12 atoms. The chain of atoms defining the spacer, S.sup.1, will typically be selected from the group consisting of carbon, oxygen, nitrogen, sulfur, selenium, silicon and phosphorus and preferably from the group consisting of carbon, oxygen, nitrogen, sulfur and selenium. In addition, the chain atoms may be substituted or unsubstituted with atoms other than hydrogen such as hydroxy, keto (.dbd.O), or acyl such as acetyl. Thus, the chain may optionally include one or more ether, thioether, selenoether, amide, or amine linkages between hydrocarbyl or substituted hydrocarbyl regions. Exemplary spacers, S.sup.1, include methylene, alkyleneoxy (--(CH.sub.2).sub.aO--), alkylenethioether (--(CH.sub.2).sub.aS--), alkyleneselenoether (--(CH.sub.2).sub.aSe--), alkyleneamide (--(CH.sub.2).sub.aNR.sup.1(C.dbd.O)--), alkylenecarbonyl (--(CH.sub.2).sub.aCO--), and combinations thereof wherein a is generally from 1 to about 20 and R.sup.1 is hydrogen or hydrocarbyl, preferably alkyl. In one embodiment, the spacer, S.sup.1, is a hydrophilic, neutral structure and does not contain any amine linkages or substituents or other linkages or substituents which could become electrically charged during the purification of a polypeptide.

[0131] As noted above, the linker, L, may be -A-T-CH(X)-- or --C(.dbd.O)--. When L is -A-T-CH(X)--, the chelating composition corresponds to the formula:

##STR00002##

wherein Q, S.sup.1, A, T, X, Y, and Z are as previously defined. In this embodiment, the ether (--O--), thioether (--S--), selenoether (--Se--) or amide ((--NR.sup.1(C.dbd.O)--) or (--(C.dbd.O)NR.sup.1--) wherein R.sup.1 is hydrogen or hydrocarbyl) linkage is separated from the chelating portion of the molecule by a substituted or unsubstituted alkyl or alkenyl region. If other than a bond, T is preferably substituted or unsubstituted C.sub.1 to C.sub.6 alkyl or substituted or unsubstituted C.sub.2 to C.sub.6 alkenyl. More preferably, A is --S--, T is --(CH.sub.2).sub.n--, and n is an integer from 0 to 6, typically 0 to 4, and more typically 0, 1 or 2.

[0132] When L is --C(.dbd.O)--, the chelating composition corresponds to the formula:

##STR00003##

wherein Q, S.sup.1, i, Y, and Z are as previously defined.

[0133] In one embodiment, the sequence --S.sup.1-L-, in combination, is a chain of no more than about 35 atoms selected from the group consisting of carbon, oxygen, sulfur, selenium, nitrogen, silicon and phosphorus, more preferably only carbon, oxygen, sulfur and nitrogen, and still more preferably only carbon, oxygen and sulfur. To reduce the prospects for non-specific binding, nitrogen, when present, is preferably in the form of an amide moiety. In addition, if the carbon chain atoms are substituted with anything other than hydrogen, they are preferably substituted with hydroxy or keto. In one embodiment, L comprises a portion (sometimes referred to as a fragment or residue) derived from an amino acid such as cystine, homocystine, cysteine, homocysteine, aspartic acid, cysteic acid or an ester thereof such as the methyl or ethyl ester thereof.

[0134] Exemplary chelating compositions corresponding to formula 1 include the following:

##STR00004## ##STR00005## ##STR00006## ##STR00007## ##STR00008##

wherein Q is a carrier and Ac is acetyl.

[0135] In another embodiment, the capture ligand is a metal chelate of the type described in U.S. Pat. No. 5,047,513. More specifically, in this embodiment the capture ligand is a metal chelate derived from nitrilotriacetic acid derivatives of the formula

##STR00009##

wherein S.sup.2 is --O--CH.sub.2--CH(OH)--CH.sub.2-- or --O--CO-- and x is 2, 3 or 4. In this embodiment, the nitrilotriacetic acid derivative is immobilized on any of the previously described carriers, Q.

[0136] In these embodiments in which the capture ligand is a metal chelate as described in WO 01/81365 or U.S. Pat. No. 5,047,513, the metal chelate may contain any of the metal ions previously described in connection with IMAC. In one embodiment, the metal chelate comprises a metal ion selected from among nickel (Ni.sup.2+), zinc (Zn.sup.2+), copper (Cu.sup.2+), iron (Fe.sup.3+), cobalt (Co.sup.2+), calcium (Ca.sup.2+), aluminum (Al.sup.3+), magnesium (Mg.sup.2+), and manganese (Mn.sup.2+). In another embodiment, the metal chelate comprises nickel (Ni.sup.2+).

[0137] Another common purification technique that can be used in the context of the present invention is the use of an immunogenic capture system where the recombinant polypeptide, protein or protein fragment comprises an antigenic domain in a spacer region (Sp.sub.1 or Sp.sub.2). Any of the previously described antigenic systems comprising the spacer may be used for this purpose. In such systems, an epitope tag on a protein or peptide allows the protein to which it is attached to be purified based upon the affinity of the epitope tag for a corresponding ligand (e.g., antibody) immobilized on a support. One example of such a tag is the sequence Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 15), or DYKDDDDK (SEQ ID NO: 15); antibodies having specificity for this sequence are sold by Sigma-Aldrich (St. Louis, Mo.) under the FLAG.RTM. trademark. Another example of such a tag is the sequence Asp-Leu-Tyr-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 16), or DLYDDDDK (SEQ ID NO: 16); antibodies having specificity for this sequence are sold by Invitrogen (Carlsbad, Calif.). Another example of such a tag is the 3.times. FLAG.RTM. sequence Met-Asp-Tyr-Lys-Asp-His-Asp-Gly-Asp-Tyr-Lys-Asp-His-Asp-Ile-Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys (SEQ ID NO: 17); antibodies having specificity for this sequence are sold by Sigma-Aldrich (St. Louis, Mo.). Thus, in one embodiment, the carrier comprises immobilized antibodies which have specificity for the DYKDDDDK (SEQ ID NO: 15) epitope; in another embodiment, the carrier comprises immobilized antibodies which have specificity for the DLYDDDDK (SEQ ID NO: 16) epitope. In another embodiment, the carrier comprises immobilized antibodies which have specificity for SEQ ID NO: 17. For example, in one embodiment, the ANTI-FLAG.RTM. M1, M2, or M5 antibody is immobilized on the interior surface of a column, or a portion thereof, and/or a bead or other support within a column.
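By way of illustration only, the correspondence between the three-letter and one-letter forms of the epitope tags recited above can be checked mechanically. The sketch below uses the standard three-letter and one-letter amino acid codes; the helper name to_one_letter is hypothetical.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in three_letter.split("-"))


seq15 = to_one_letter("Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys")
seq16 = to_one_letter("Asp-Leu-Tyr-Asp-Asp-Asp-Asp-Lys")
seq17 = to_one_letter(
    "Met-Asp-Tyr-Lys-Asp-His-Asp-Gly-Asp-Tyr-Lys-Asp-His-Asp-Ile-"
    "Asp-Tyr-Lys-Asp-Asp-Asp-Asp-Lys"
)
print(seq15, seq16, seq17)
```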

[0138] After the recombinant polypeptides, proteins and protein fragments are separated from other components of the liquid mixture, the conditions in the column may be changed to release the bound material. For example, the bound molecules may be eluted from the column by a change in pH, by imidazole, or by competition with another linker peptide.

[0139] Alternatively, the target polypeptide, protein or protein fragment portion of the bound recombinant polypeptide, protein or protein fragment may be selectively released from the immobilized metal. For example, if there is a cleavage site between the target polypeptide, protein or protein fragment and the metal ion-affinity peptide, and if the bound recombinant polypeptide, protein or protein fragment is treated with the appropriate enzyme, the target polypeptide, protein or protein fragment may be selectively released while the metal ion-affinity polypeptide fragment remains bound to the immobilized metal. For this purpose, the cleavage site is preferably an enzymatically cleavable linker peptide having the ability to undergo site-specific proteolysis. Suitable cleaving enzymes in accordance with this invention are activated factor X (factor Xa), DPP I, DPP II, DPP IV, carboxypeptidase A, collagenase, enterokinase, human renin, thrombin, trypsin, subtilisin and V5.

[0140] It is to be appreciated that some polypeptide or protein molecules will possess the desired enzymatic or biological activity with the metal chelate peptide still attached either at the C-terminal end or at the N-terminal end or both. In those cases the purification of the chimeric protein will be accomplished without subjecting the protein to site-specific proteolysis.

[0141] The present invention may be used to purify any prokaryotic or eukaryotic protein that can be expressed as the product of recombinant DNA technology in a transformed host cell. These recombinant protein products include hormones, receptors, enzymes, storage proteins, blood proteins, mutant proteins produced by protein engineering techniques, or synthetic proteins. The purification process of the present invention can be used batchwise or in continuously run columns.

[0142] It is to be understood that the present invention has been described in detail by way of illustration and example in order to acquaint others skilled in the art with the invention, its principles, and its practical application. Further, the specific embodiments of the present invention as set forth are not intended to be exhaustive or to limit the invention, and many alternatives, modifications, and variations will be apparent to those skilled in the art in light of the foregoing examples and detailed description. Accordingly, this invention is intended to embrace all such alternatives, modifications, and variations that fall within the spirit and scope of the following claims. While some of the examples and descriptions above include some conclusions about the way the invention may function, the inventors do not intend to be bound by those conclusions and functions, but put them forth only as possible explanations in light of current understanding.

Abbreviations and Definitions

[0143] To facilitate understanding of the invention, a number of terms are defined below. Definitions of certain terms are included here. Any term not defined is understood to have the normal meaning used by scientists contemporaneous with the submission of this application.

[0144] The term "expression vector" as used herein refers to nucleic acid sequences containing a desired coding sequence and appropriate nucleic acid sequences necessary for the expression of the operably linked coding sequence in a particular host organism. Nucleic acid sequences necessary for expression in prokaryotes include a promoter, a ribosome binding site, an initiation codon, a stop codon, optionally an operator sequence and possibly other regulatory sequences. Eukaryotic cells utilize promoters, a Kozak sequence and often enhancers and polyadenylation signals. Prokaryotic cells also utilize a Shine-Dalgarno ribosome binding site. The present invention includes vectors or plasmids which can be used as vehicles to transform any viable host cell with the recombinant DNA expression vector.

[0145] "Operably linked" is intended to mean that the nucleotide sequence of interest is linked to the regulatory sequence(s) in a manner that allows for expression of the nucleotide sequence (e.g., in an in vitro transcription/translation system or in a host cell when the vector is introduced into the host cell).

[0146] The term "regulatory sequence" is intended to include promoters, enhancers, and other expression control elements (e.g., polyadenylation signals). Regulatory sequences include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences).

[0147] The terms "transformation" and "transfection" are intended to refer to a variety of art-recognized techniques for introducing foreign nucleic acid (e.g., DNA) into a host cell, including calcium phosphate or calcium chloride co-precipitation, DEAE-dextran-mediated transfection, lipofection, or electroporation. Suitable methods for transforming or transfecting host cells can be found in laboratory manuals.

[0148] The term "hydrophilic" when used in reference to amino acids refers to those amino acids which have polar and/or charged side chains. Hydrophilic amino acids include lysine, arginine, histidine, aspartate (i.e., aspartic acid), glutamate (i.e., glutamic acid), serine, threonine, cysteine, tyrosine, asparagine and glutamine.

[0149] The term "hydrophobic" when used in reference to amino acids refers to those amino acids which have nonpolar side chains. Hydrophobic amino acids include valine, leucine, isoleucine, cysteine and methionine. Three hydrophobic amino acids have aromatic side chains. Accordingly, the term "aromatic" when used in reference to amino acids refers to the three aromatic hydrophobic amino acids phenylalanine, tyrosine and tryptophan.

[0150] The term "fusion protein" refers to polypeptides and proteins which consist of a metal ion-affinity linker peptide and a protein or polypeptide operably linked directly or indirectly to the metal ion-affinity peptide. The metal ion-affinity linker peptide may be located at the amino-terminal portion of the fusion protein or at the carboxy-terminal portion, thus forming an "amino-terminal fusion protein" or a "carboxy-terminal fusion protein," respectively.

[0151] The terms "metal ion-affinity peptide", "metal binding peptide" and "linker peptide" are used interchangeably to refer to an amino acid sequence which displays an affinity to metal ions. The minimum length of the immobilized metal ion-affinity peptide according to the present invention is seven amino acids including four alternating histidines. The most preferred length is seven amino acids including four alternating histidines.
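By way of illustration only, the "seven amino acids including four alternating histidines" arrangement can be expressed as a pattern over one-letter amino acid codes. The sketch below reads "alternating" as His-X-His-X-His-X-His with each X being a single non-histidine residue; that reading, the helper name has_alternating_his, and the test peptides are assumptions made for the example and are not sequences taken from this description.

```python
import re

# The 19 standard amino acids other than histidine, in one-letter code.
NON_HIS = "ACDEFGIKLMNPQRSTVWY"

# His-X-His-X-His-X-His, where X is any single non-histidine residue.
ALTERNATING_HIS = re.compile(f"H[{NON_HIS}]H[{NON_HIS}]H[{NON_HIS}]H")


def has_alternating_his(peptide: str) -> bool:
    """True if the peptide contains a seven-residue H-X-H-X-H-X-H stretch."""
    return ALTERNATING_HIS.search(peptide.upper()) is not None


# Hypothetical peptides, for illustration only.
print(has_alternating_his("HAHAHAH"))   # alternating pattern present
print(has_alternating_his("HHHHHHH"))   # consecutive histidines only
print(has_alternating_his("MHAHAHA"))   # only three histidines
```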

[0152] The term "enzyme" referred to herein in the context of a cleavage enzyme means a polypeptide or protein which recognizes a specific amino acid sequence in a polypeptide and cleaves the polypeptide at the scissile bond. In one embodiment of the present invention, enterokinase is the enzyme which is used to free the fusion protein from the immobilized metal ion column. In further embodiments, carboxypeptidase A, DPP I, DPP II, DPP IV, factor Xa, human renin, TEV, thrombin or VIII protease is the enzyme.

[0153] The term "cleavage site" used herein refers to an amino acid sequence which is recognized and cleaved by an enzyme or chemical means at the scissile bond.

[0154] The term "scissile bond" referred to herein is the juncture where cleavage occurs; for example the scissile bond recognized by enterokinase may be the bond following the sequence (Asp.sub.4)-Lys in the spacer peptide or affinity peptide.
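By way of illustration only, the position of the scissile bond relative to the (Asp.sub.4)-Lys sequence can be sketched as a string operation on one-letter codes. The fusion sequence used below is hypothetical, as is the helper name cleave_after_site; the sketch models only the first occurrence of the site and makes no claim about enzyme kinetics or secondary cleavage.

```python
from typing import List

ENTEROKINASE_SITE = "DDDDK"  # (Asp)4-Lys; the scissile bond follows the Lys


def cleave_after_site(sequence: str, site: str = ENTEROKINASE_SITE) -> List[str]:
    """Split a one-letter sequence at the bond following the first site found."""
    index = sequence.find(site)
    if index == -1:
        return [sequence]  # no recognition sequence, nothing to cleave
    cut = index + len(site)
    return [sequence[:cut], sequence[cut:]]


# Hypothetical fusion: affinity peptide + cleavage site + target fragment.
fusion = "HAHAHAH" + "DDDDK" + "MKTAYIAK"
fragments = cleave_after_site(fusion)
print(fragments)
```

In this arrangement the first fragment carries the affinity peptide and would remain bound to the immobilized metal, while the second fragment corresponds to the released target.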

[0155] By the term "immobilized metal ion-affinity peptide" as used herein is meant an amino acid sequence that chelates immobilized divalent metal ions of metals selected from the group consisting of aluminum, cadmium, calcium, cobalt, copper, gallium, iron, nickel, ytterbium and zinc.

[0156] The term "capture ligand" means any ligand or receptor that can be immobilized or supported on a container or support and used to isolate a cellular component from cellular debris. Some non-limiting examples of capture ligands that may be used in connection with the present invention include: biotin, streptavidin, various metal chelate ions, antibodies, various charged particles such as those for use in ion exchange chromatography, various affinity chromatography supports, and various hydrophobic groups for use in hydrophobic chromatography.

[0157] For all the nucleotide and amino acid sequences disclosed herein, it is understood that equivalent nucleotides and amino acids can be substituted into the sequences without affecting the function of the sequences. Such substitutions are within the ability of a person of ordinary skill in the art.

[0158] The procedures disclosed herein which involve the molecular manipulation of nucleic acids are known to those skilled in the art.

EXAMPLES

Example 1

Construction and Screening of a Metal Ion-Affinity Peptide Library

[0159] A pseudo-random glutathione-S-transferase C-terminal peptide library was constructed with the amino acid sequence of His-X-His-X-His-X-His where X is any amino acid except Gln, His and Pro. The library vector was constructed from the bacterial expression vector pGEX-2T. The library was constructed by annealing a pair of complementary oligonucleotides together. Oligonucleotides were constructed as follows: 5'GATCCCATDNDCATDNDCATDNDCATTAAC3' (SEQ ID NO: 18) and 5'AATTGTTAATGHNHATGHNHATGHNHATGG3' (SEQ ID NO: 19) where D is nucleotides A, G, or T, H is nucleotides A, C, or T and N is nucleotides A, C, T, or G. The 5' ends were phosphorylated with T.sub.4 polynucleotide kinase and the oligonucleotides were annealed together to generate a cassette. The cassette was ligated into pGEX-2T, which had been digested with EcoRI and BamHI restriction endonucleases. Ligated vector was transformed into E. coli DH5-.alpha. using standard protocols. Transformants were plated on LB/ampicillin plates (100 mg/L) and incubated overnight at 37.degree. C.
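By way of illustration only, the relationship between the degenerate codon DND and the statement that X is any amino acid except Gln, His and Pro can be checked by enumeration. The sketch below assumes the standard genetic code, written in the conventional compact form with bases ordered T, C, A, G; all variable names are hypothetical.

```python
from itertools import product

# Standard genetic code in compact form (bases ordered T, C, A, G; '*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Degenerate bases as defined in the paragraph above.
D = "AGT"
N = "ACTG"

dnd_codons = ["".join(c) for c in product(D, N, D)]
encoded = {CODON_TABLE[c] for c in dnd_codons}
excluded = set(AMINO_ACIDS) - {"*"} - encoded
stop_codons = sorted(c for c in dnd_codons if CODON_TABLE[c] == "*")

print(len(dnd_codons), "codons")
print("excluded amino acids:", sorted(excluded))
print("stop codons admitted:", stop_codons)
```

Under that assumption the enumeration also shows that DND admits the three stop codons, so some library members may terminate within the peptide.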

[0160] 900 colonies were picked and placed on 9 master plates. Each master plate contained 100 colonies and was grown overnight at 37.degree. C. A piece of nitrocellulose was placed onto each of the master plates. This piece of nitrocellulose was then removed and the transferred colonies were placed onto a LB/ampicillin plate containing 1 mM isopropyl .beta.-D-thiogalactopyranoside (IPTG) to induce the expression of the GST fusion peptides. The cells were allowed to grow for an additional 4 hours at 37.degree. C. The nitrocellulose filter was removed from the plate and placed sequentially on blotting paper containing the following solutions to lyse the cells in situ:
[0161] (a) 10% SDS for 10 minutes,
[0162] (b) 1.5 M sodium chloride, 0.5 M sodium hydroxide for 5 minutes
[0163] (c) 1.5 M sodium chloride, 0.5 M Tris-HCl pH 7.4 for 5 minutes
[0164] (d) 1.5 M sodium chloride, 0.5 M Tris-HCl pH 7.4 for 5 minutes
[0165] (e) 2.times.SSC for 15 minutes.

[0166] The filters were dried at ambient temperature followed by an incubation in Tris-buffered saline (TBS) containing 3% non-fat dry milk for 1 hour at room temperature. Filters were then washed 3.times. for 5 minutes with TBS containing 0.05% Tween-20 (TBS-T). To detect clones that were capable of binding to a metal ion, the filters were incubated with nickel NTA horseradish peroxidase (HRP) at a concentration of 1 mg/ml in TBS-T for 1 hour. The filters were then washed with TBS-T 3.times. for 5 minutes and incubated with 3,3',5,5'-tetramethylbenzidine (TMB) to detect the horseradish peroxidase. The reaction was stopped by placing the filters in water. 250 colonies, which were detected above, were picked from the master plate and placed into 1 ml of LB/ampicillin and grown overnight in a 96 deep well plate at 37.degree. C. at 250 rpm on an orbital shaker. 10 .mu.l of the overnight cultures were transferred to a fresh aliquot of LB/ampicillin (1 ml) in a 96 deep well plate and grown for an additional 3 hours. The culture was then induced by adding IPTG (final concentration of 1 mM) and the culture was allowed to grow for an additional 3 hours prior to harvesting by centrifugation. The media was decanted and the cells were frozen overnight at -20.degree. C. in the collection plate. Cells were lysed with 0.6 ml of CelLytic-B (Sigma-Aldrich product no. B3553) and incubated for 15 minutes at room temperature. The cell debris was removed by centrifugation at 3,000.times.g for 15 minutes. Two experiments were done in parallel, one on a HIS-Select High Sensitivity (HS) nickel coated plate, and the second on a HIS-Select High Capacity (HC) nickel coated plate. 0.1 ml of cell extracts of each clone were placed in an HS microwell plate in the presence of imidazole at a final concentration of 5 mM. This is the selective condition used for screening the different metal ion-affinity clones. HS plates were incubated for 4 hours at room temperature.
Plates were then washed 3.times. with phosphate-buffered saline (PBS) containing 0.05% Tween 20 (PBS-T). The HS plates were then incubated with anti-GST at 1:1,000 dilution in PBS-BSA buffer (0.2 ml/well) for 1 hour at room temperature. HS plates were washed 3.times. with PBS-T. The HS plates were then incubated with anti-mouse HRP conjugate at 1:10,000 dilution in PBS-BSA buffer for 1 hour at room temperature. Plates were washed 3.times. with PBS-T. The plate was then developed with 2,2'-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid) (ABTS) substrate. Color development was stopped by the addition of sodium azide to a final concentration of 2 mM. Absorbance of the plates was read at 405 nm using a Wallac 1420 plate reader. The HC plates were used to further analyze potential clones. To further characterize the clones, 0.2 ml of cell extracts were applied to the HC plates and the plates were incubated at ambient temperature for 1 hour. The plates were washed with PBS as described above. Twenty-one clones that produced the highest response on the HS plates were eluted from the corresponding HC plate. The selected cloned proteins were eluted from the HC plates by incubating at 37.degree. C. for 15 minutes in 50 mM sodium phosphate, 0.3 M sodium chloride and 0.2 M imidazole buffer. Eluted proteins were then moved to clean tubes and analyzed by SDS-PAGE. All 21 clones had the expected molecular weight and were sequence verified.
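By way of illustration only, the selection of the clones producing the highest response on the HS plates can be sketched as a ranking of absorbance readings at 405 nm. The clone identifiers and absorbance values below are hypothetical placeholders and are not data from this example; the helper name top_clones is likewise hypothetical.

```python
from typing import Dict, List


def top_clones(a405: Dict[str, float], n: int = 21) -> List[str]:
    """Return the n clone identifiers with the highest absorbance, best first."""
    ranked = sorted(a405, key=lambda clone: a405[clone], reverse=True)
    return ranked[:n]


# Hypothetical readings for four wells; a real screen would hold 250 entries.
readings = {
    "clone_A01": 0.12,
    "clone_A02": 1.45,
    "clone_A03": 0.88,
    "clone_A04": 1.02,
}
selected = top_clones(readings, n=2)
print(selected)
```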

[0167] These 21 colonies were grown overnight in 1 ml LB/ampicillin media at 37.degree. C. at 250 rpm. 100 .mu.l of the overnight cultures were transferred to 50 ml of fresh LB/ampicillin media and the cultures grown for an additional 3 hours at 37.degree. C. The cultures were induced with IPTG (final concentration of 1 mM) and the cultures grown for an additional 3 hours prior to harvesting by centrifugation.

Example 2

Construction of an N-Terminal Metal Ion-Affinity Fusion Protein

[0168] Two metal ion-affinity tags were introduced at the N-terminus of bacterial alkaline phosphatase (BAP). The constructs were prepared from the BAP expression vector pFLAG-CTS-BAP. Construction was done by annealing two pairs of complementary oligonucleotides together. The following oligonucleotides were constructed: 5'TATGCATAATCATCGACATGAACATA3' (SEQ ID NO: 20), 5'AGCTTATGTTTATGTCGATGATTATGCA3' (SEQ ID NO: 21), 5'TATGCATAAACATAGACATGGGCATA3' (SEQ ID NO: 22) and 5'AGCTTGATGCCCATGTCTATGTTTATGCA3' (SEQ ID NO: 23). The oligonucleotides were annealed together to generate a cassette. The cassette was ligated into pFLAG-CTS-BAP, which had been digested with NdeI and HindIII restriction endonucleases. Ligated vector was transformed into E. coli DH5-.alpha. using standard protocols and plated on LB/ampicillin.
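By way of illustration only, the peptides encoded by the two top-strand oligonucleotides, as printed above, can be read off by translation. The sketch below assumes the standard genetic code and assumes that the reading frame begins at the first ATG in each oligonucleotide; the helper name translate_from_atg is hypothetical.

```python
from itertools import product

# Standard genetic code in compact form (bases ordered T, C, A, G; '*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_from_atg(dna: str) -> str:
    """Translate complete codons starting at the first ATG (assumed frame)."""
    start = dna.find("ATG")
    if start == -1:
        raise ValueError("no ATG found")
    peptide = []
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


seq20 = "TATGCATAATCATCGACATGAACATA"  # SEQ ID NO: 20 as printed
seq22 = "TATGCATAAACATAGACATGGGCATA"  # SEQ ID NO: 22 as printed
print(translate_from_atg(seq20))
print(translate_from_atg(seq22))
```

Under those assumptions each translation places histidine at every second position after the initiating methionine.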

Example 3

Expression of an N-Terminal Metal Ion-Affinity Fusion Protein

[0169] MAT-BAP fusion peptide cultures were grown overnight in 1 ml LB/ampicillin at 37 °C. 500 µl of overnight culture was transferred to 500 ml of fresh TB medium containing ampicillin (100 mg/L). The cultures were grown for three hours at 37 °C at 250 rpm. Protein expression was induced by the addition of IPTG (final concentration of 1 mM). Cultures were then grown for an additional three hours, harvested by centrifugation and stored at -70 °C until further use.
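The inoculation and induction quantities above reduce to simple dilution arithmetic, sketched here in Python. The 1 M IPTG stock concentration is an assumption chosen for illustration (the text gives only the final concentration of 1 mM), and both function names are hypothetical. The small volume added by the IPTG stock itself is ignored.

```python
def dilution_factor(inoculum_ml: float, medium_ml: float) -> float:
    """Fold dilution of an inoculum added to fresh medium."""
    return (inoculum_ml + medium_ml) / inoculum_ml


def stock_volume_ml(final_mM: float, culture_ml: float, stock_mM: float) -> float:
    """Volume of stock needed for a target final concentration (C1*V1 = C2*V2)."""
    return final_mM * culture_ml / stock_mM


# 500 µl of overnight culture into 500 ml of TB medium.
fold = dilution_factor(inoculum_ml=0.5, medium_ml=500.0)

# IPTG to 1 mM final in 500 ml, from an ASSUMED 1 M (1000 mM) stock.
iptg_ml = stock_volume_ml(final_mM=1.0, culture_ml=500.0, stock_mM=1000.0)

print(f"inoculum dilution: about 1:{fold:.0f}")
print(f"IPTG stock to add: {iptg_ml} ml")
```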

Example 4

Metal Ion-Affinity Fusion Protein Purification Protocol #1

[0170] Cells were resuspended in 2 ml of TE (50 mM Tris-HCl pH 8.0, 2 mM EDTA). Lysozyme (4 mg/ml in 2 ml of TE) was added to the resuspended cells and the cells were lysed at ambient temperature for 4 hours. The cell debris was removed by centrifugation at 27,000 × g for 15 minutes. The supernatant was dialyzed overnight against 50 mM Tris-HCl pH 8.0 to remove the EDTA. The dialyzed supernatant was applied to a 1 ml column containing a nickel bis-carboxy-methyl-cysteine resin (nickel resin). The column was washed with 4 ml of 50 mM Tris-HCl pH 8.0 and then with 2 ml of 50 mM Tris-HCl pH 8.0, 10 mM imidazole. The column was then eluted with 50 mM Tris-HCl pH 8.0, 250 mM imidazole. Samples were analyzed for purity by SDS-PAGE.

Example 5

Metal Ion-Affinity Fusion Protein Purification Protocol #2

[0171] Cells were resuspended in CelLytic B (Sigma-Aldrich product no. B3553) and 10 mM imidazole. The cells were solubilized by incubation for 15 minutes. The cell debris was removed by centrifugation at 15,000 × g for 5 minutes at room temperature. A 0.5 ml column containing nickel resin was equilibrated with 10 column volumes (5 ml) of 50 mM sodium phosphate, pH 8, and 300 mM sodium chloride (column buffer). The supernatant was loaded on the column. The column was washed with 10 column volumes (5 ml) of 10 mM imidazole in column buffer. The column was eluted with 100 mM imidazole in column buffer. The samples were analyzed for specificity by SDS-PAGE.
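The column steps of Protocol #2 can be written down as a small data structure, which makes the column-volume arithmetic explicit (10 column volumes of a 0.5 ml bed is 5 ml). The following Python sketch is illustrative only: the `Step` class and `step_volume_ml` function are hypothetical names, and the elution volume is left as `None` because the text does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    name: str
    imidazole_mM: float               # imidazole added to the column buffer
    column_volumes: Optional[float]   # None where the text gives no volume


# Column buffer: 50 mM sodium phosphate, pH 8, 300 mM sodium chloride.
PROTOCOL_2 = [
    Step("equilibrate", imidazole_mM=0.0, column_volumes=10),
    Step("wash", imidazole_mM=10.0, column_volumes=10),
    Step("elute", imidazole_mM=100.0, column_volumes=None),
]


def step_volume_ml(step: Step, bed_volume_ml: float) -> Optional[float]:
    """Convert column volumes to millilitres for a given resin bed volume."""
    if step.column_volumes is None:
        return None
    return step.column_volumes * bed_volume_ml


for step in PROTOCOL_2:
    volume = step_volume_ml(step, bed_volume_ml=0.5)
    shown = "not specified" if volume is None else f"{volume} ml"
    print(f"{step.name}: {step.imidazole_mM} mM imidazole, {shown}")
```

Expressing volumes in column volumes rather than millilitres lets the same step list be reused if the bed volume changes.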

Example 6

Metal Ion-Affinity Fusion Protein Purification Protocol #3: Use of Chaotropic Agents

[0172] The cells were resuspended in 100 mM sodium phosphate, pH 8, and 8 M urea (denaturant column buffer). The cells were solubilized by sonication three times, 15 seconds each, with a probe sonicator. Cell debris was removed by centrifugation at 15,000 × g for 5 minutes at room temperature. A 0.5 ml column containing nickel resin was equilibrated with 10 column volumes (5 ml) of the denaturant column buffer. The supernatant was loaded on the column and the column was washed with 10 column volumes (5 ml) of denaturant column buffer. The column was sequentially eluted with 100 mM sodium phosphate, 8 M urea at pH 7.5, 7.0, 6.5, 6.0, 5.5, 5.0 and 4.5. The samples were analyzed for specificity by SDS-PAGE.
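The sequential elution above is a descending pH step series in 0.5-unit decrements. A minimal Python sketch that generates the same series is shown below; the function name `ph_steps` and the idea of labeling fractions programmatically are illustrative assumptions, not part of the described method.

```python
def ph_steps(start: float, stop: float, decrement: float = 0.5) -> list[float]:
    """Return descending pH values from start to stop, inclusive."""
    count = round((start - stop) / decrement) + 1
    return [start - decrement * i for i in range(count)]


# Elution buffer: 100 mM sodium phosphate, 8 M urea, at each pH in turn.
schedule = ph_steps(7.5, 4.5)

for number, ph in enumerate(schedule, start=1):
    print(f"fraction {number}: elute at pH {ph}")
```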

Sequence Listing

481211PRTSchistosoma japonicum 1Met Ala Cys Gly His Val Lys Leu Ile Tyr Phe Asn Gly Arg Gly Arg1 5 10 15Ala Glu Pro Ile Arg Met Ile Leu Val Ala Ala Gly Val Glu Phe Glu 20 25 30Asp Glu Arg Ile Glu Phe Gln Asp Trp Pro Lys Ile Lys Pro Thr Ile 35 40 45Pro Gly Gly Arg Leu Pro Ile Val Lys Ile Thr Asp Lys Arg Gly Asp 50 55 60Val Lys Thr Met Ser Glu Ser Leu Ala Ile Ala Arg Phe Ile Ala Arg65 70 75 80Lys His Asn Met Met Gly Asp Thr Asp Asp Glu Tyr Tyr Ile Ile Glu 85 90 95Lys Met Ile Gly Gln Val Glu Asp Val Glu Ser Asp Tyr His Lys Thr 100 105 110Leu Ile Lys Pro Pro Glu Glu Lys Glu Lys Ile Ser Lys Glu Ile Leu 115 120 125Asn Gly Lys Val Pro Ile Leu Leu Gln Ala Ile Cys Glu Thr Leu Lys 130 135 140Glu Ser Thr Gly Asn Leu Thr Val Gly Asp Lys Val Thr Leu Ala Asp145 150 155 160Val Val Leu Ile Ala Ser Ile Asp His Ile Thr Asp Leu Asp Lys Glu 165 170 175Phe Leu Thr Gly Lys Tyr Pro Glu Ile His Lys His Arg Lys His Leu 180 185 190Leu Ala Thr Ser Pro Lys Leu Ala Lys Tyr Leu Ser Glu Arg His Ala 195 200 205Thr Ala Phe 2102163PRTClostridium cellulovorans 2Ala Ala Thr Ser Ser Met Ser Val Glu Phe Tyr Asn Ser Asn Lys Ser1 5 10 15Ala Gln Thr Asn Ser Ile Thr Pro Ile Ile Lys Ile Thr Asn Thr Ser 20 25 30Asp Ser Asp Leu Asn Leu Asn Asp Val Lys Val Arg Tyr Thr Tyr Tyr 35 40 45Thr Ser Asp Gly Thr Gln Gly Gln Thr Phe Trp Cys Asp His Ala Gly 50 55 60Ala Leu Leu Gly Asn Ser Tyr Val Asp Asn Thr Ser Lys Val Thr Ala65 70 75 80Asn Phe Val Lys Glu Thr Ala Ser Pro Thr Ser Thr Tyr Asp Thr Tyr 85 90 95Val Glu Phe Gly Phe Ala Ser Gly Ala Ala Thr Leu Lys Lys Gly Gln 100 105 110Phe Ile Thr Ile Gln Gly Arg Ile Thr Lys Ser Asp Trp Ser Asn Tyr 115 120 125Thr Gln Thr Asn Asp Tyr Ser Phe Asp Ala Ser Ser Ser Thr Pro Val 130 135 140Val Asn Pro Lys Val Thr Gly Tyr Ile Gly Gly Ala Lys Val Leu Gly145 150 155 160Thr Ala Pro3396PRTEscherichia coli 3Met Lys Ile Lys Thr Gly Ala Arg Ile Leu Ala Leu Ser Ala Leu Thr1 5 10 15Thr Met Met Phe Ser Ala Ser Ala Leu Ala Lys Ile Glu Glu Gly Lys 20 25 30Leu Val Ile Trp Ile Asn Gly Asp Lys Gly 
Tyr Asn Gly Leu Ala Glu 35 40 45Val Gly Lys Lys Phe Glu Lys Asp Thr Gly Ile Lys Val Thr Val Glu 50 55 60His Pro Asp Lys Leu Glu Glu Lys Phe Pro Gln Val Ala Ala Thr Gly65 70 75 80Asp Gly Pro Asp Ile Ile Phe Trp Ala His Asp Arg Phe Gly Gly Tyr 85 90 95Ala Gln Ser Gly Leu Leu Ala Glu Ile Thr Pro Asp Lys Ala Phe Gln 100 105 110Asp Lys Leu Tyr Pro Phe Thr Trp Asp Ala Val Arg Tyr Asn Gly Lys 115 120 125Leu Ile Ala Tyr Pro Ile Ala Val Glu Ala Leu Ser Leu Ile Tyr Asn 130 135 140Lys Asp Leu Leu Pro Asn Pro Pro Lys Thr Trp Glu Glu Ile Pro Ala145 150 155 160Leu Asp Lys Glu Leu Lys Ala Lys Gly Lys Ser Ala Leu Met Phe Asn 165 170 175Leu Gln Glu Pro Tyr Phe Thr Trp Pro Leu Ile Ala Ala Asp Gly Gly 180 185 190Tyr Ala Phe Lys Tyr Glu Asn Gly Lys Tyr Asp Ile Lys Asp Val Gly 195 200 205Val Asp Asn Ala Gly Ala Lys Ala Gly Leu Thr Phe Leu Val Asp Leu 210 215 220Ile Lys Asn Lys His Met Asn Ala Asp Thr Asp Tyr Ser Ile Ala Glu225 230 235 240Ala Ala Phe Asn Lys Gly Glu Thr Ala Met Thr Ile Asn Gly Pro Trp 245 250 255Ala Trp Ser Asn Ile Asp Thr Ser Lys Val Asn Tyr Gly Val Thr Val 260 265 270Leu Pro Thr Phe Lys Gly Gln Pro Ser Lys Pro Phe Val Gly Val Leu 275 280 285Ser Ala Gly Ile Asn Ala Ala Ser Pro Asn Lys Glu Leu Ala Lys Glu 290 295 300Phe Leu Glu Asn Tyr Leu Leu Thr Asp Glu Gly Leu Glu Ala Val Asn305 310 315 320Lys Asp Lys Pro Leu Gly Ala Val Ala Leu Lys Ser Tyr Glu Glu Glu 325 330 335Leu Ala Lys Asp Pro Arg Ile Ala Ala Thr Met Glu Asn Ala Gln Lys 340 345 350Gly Glu Ile Met Pro Asn Ile Pro Gln Met Ser Ala Phe Trp Tyr Ala 355 360 365Val Arg Thr Ala Val Ile Asn Ala Ala Ser Gly Arg Gln Thr Val Asp 370 375 380Glu Ala Leu Lys Asp Ala Gln Thr Arg Ile Thr Lys385 390 3954524PRTStaphylococcus aureus 4Met Lys Lys Lys Asn Ile Tyr Ser Ile Arg Lys Leu Gly Val Gly Ile1 5 10 15Ala Ser Val Thr Leu Gly Thr Leu Leu Ile Ser Gly Gly Val Thr Pro 20 25 30Ala Ala Asn Ala Ala Gln His Asp Glu Ala Gln Gln Asn Ala Phe Tyr 35 40 45Gln Val Leu Asn Met Pro Asn Leu Asn Ala Asp Gln Arg Asn Gly Phe 50 55 60Ile Gln Ser 
Leu Lys Asp Asp Pro Ser Gln Ser Ala Asn Val Leu Gly65 70 75 80Glu Ala Gln Lys Leu Asn Asp Ser Gln Ala Pro Lys Ala Asp Ala Gln 85 90 95Gln Asn Asn Phe Asn Lys Asp Gln Gln Ser Ala Phe Tyr Glu Ile Leu 100 105 110Asn Met Pro Asn Leu Asn Glu Ala Gln Arg Asn Gly Phe Ile Gln Ser 115 120 125Leu Lys Asp Asp Pro Ser Gln Ser Thr Asn Val Leu Gly Glu Ala Lys 130 135 140Lys Leu Asn Glu Ser Gln Ala Pro Lys Ala Asp Asn Asn Phe Asn Lys145 150 155 160Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu Asn Met Pro Asn Leu Asn 165 170 175Glu Glu Gln Arg Asn Gly Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser 180 185 190Gln Ser Ala Asn Leu Leu Ser Glu Ala Lys Lys Leu Asn Glu Ser Gln 195 200 205Ala Pro Lys Ala Asp Asn Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe 210 215 220Tyr Glu Ile Leu His Leu Pro Asn Leu Asn Glu Glu Gln Arg Asn Gly225 230 235 240Phe Ile Gln Ser Leu Lys Asp Asp Pro Ser Gln Ser Ala Asn Leu Leu 245 250 255Ala Glu Ala Lys Lys Leu Asn Asp Ala Gln Ala Pro Lys Ala Asp Asn 260 265 270Lys Phe Asn Lys Glu Gln Gln Asn Ala Phe Tyr Glu Ile Leu His Leu 275 280 285Pro Asn Leu Thr Glu Glu Gln Arg Asn Gly Phe Ile Gln Ser Leu Lys 290 295 300Asp Asp Pro Ser Val Ser Lys Glu Ile Leu Ala Glu Ala Lys Lys Leu305 310 315 320Asn Asp Ala Gln Ala Pro Lys Glu Glu Asp Asn Asn Lys Pro Gly Lys 325 330 335Glu Asp Asn Asn Lys Pro Gly Lys Glu Asp Asn Asn Lys Pro Gly Lys 340 345 350Glu Asp Asn Asn Lys Pro Gly Lys Glu Asp Asn Asn Lys Pro Gly Lys 355 360 365Glu Asp Asn Asn Lys Pro Gly Lys Glu Asp Gly Asn Lys Pro Gly Lys 370 375 380Glu Asp Asn Lys Lys Pro Gly Lys Glu Asp Gly Asn Lys Pro Gly Lys385 390 395 400Glu Asp Asn Lys Lys Pro Gly Lys Glu Asp Gly Asn Lys Pro Gly Lys 405 410 415Glu Asp Gly Asn Lys Pro Gly Lys Glu Asp Gly Asn Gly Val His Val 420 425 430Val Lys Pro Gly Asp Thr Val Asn Asp Ile Ala Lys Ala Asn Gly Thr 435 440 445Thr Ala Asp Lys Ile Ala Ala Asp Asn Lys Leu Ala Asp Lys Asn Met 450 455 460Ile Lys Pro Gly Gln Glu Leu Val Val Asp Lys Lys Gln Pro Ala Asn465 470 475 480His Ala Asp Ala Asn Lys Ala Gln Ala Leu Pro Glu 
Thr Gly Glu Glu 485 490 495Asn Pro Phe Ile Gly Thr Thr Val Phe Gly Gly Leu Ser Leu Ala Leu 500 505 510Gly Ala Ala Leu Leu Ala Gly Arg Arg Arg Glu Leu 515 5205448PRTStreptococcus pyogenes 5Met Glu Lys Glu Lys Lys Val Lys Tyr Phe Leu Arg Lys Ser Ala Phe1 5 10 15Gly Leu Ala Ser Val Ser Ala Ala Phe Leu Val Gly Ser Thr Val Phe 20 25 30Ala Val Asp Ser Pro Ile Glu Asp Thr Pro Ile Ile Arg Asn Gly Gly 35 40 45Glu Leu Thr Asn Leu Leu Gly Asn Ser Glu Thr Thr Leu Ala Leu Arg 50 55 60Asn Glu Glu Ser Ala Thr Ala Asp Leu Thr Ala Ala Ala Val Ala Asp65 70 75 80Thr Val Ala Ala Ala Ala Ala Glu Asn Ala Gly Ala Ala Ala Trp Glu 85 90 95Ala Ala Ala Ala Ala Asp Ala Leu Ala Lys Ala Lys Ala Asp Ala Leu 100 105 110Lys Glu Phe Asn Lys Tyr Gly Val Ser Asp Tyr Tyr Lys Asn Leu Ile 115 120 125Asn Asn Ala Lys Thr Val Glu Gly Ile Lys Asp Leu Gln Ala Gln Val 130 135 140Val Glu Ser Ala Lys Lys Ala Arg Ile Ser Glu Ala Thr Asp Gly Leu145 150 155 160Ser Asp Phe Leu Lys Ser Gln Thr Pro Ala Glu Asp Thr Val Lys Ser 165 170 175Ile Glu Leu Ala Glu Ala Lys Val Leu Ala Asn Arg Glu Leu Asp Lys 180 185 190Tyr Gly Val Ser Asp Tyr His Lys Asn Leu Ile Asn Asn Ala Lys Thr 195 200 205Val Glu Gly Val Lys Glu Leu Ile Asp Glu Ile Leu Ala Ala Leu Pro 210 215 220Lys Thr Asp Thr Tyr Lys Leu Ile Leu Asn Gly Lys Thr Leu Lys Gly225 230 235 240Glu Thr Thr Thr Glu Ala Val Asp Ala Ala Thr Ala Glu Lys Val Phe 245 250 255Lys Gln Tyr Ala Asn Asp Asn Gly Val Asp Gly Glu Trp Thr Tyr Asp 260 265 270Asp Ala Thr Lys Thr Phe Thr Val Thr Glu Lys Pro Glu Val Ile Asp 275 280 285Ala Ser Glu Leu Thr Pro Ala Val Thr Thr Tyr Lys Leu Val Ile Asn 290 295 300Gly Lys Thr Leu Lys Gly Glu Thr Thr Thr Lys Ala Val Asp Ala Glu305 310 315 320Thr Ala Glu Lys Ala Phe Lys Gln Tyr Ala Asn Asp Asn Gly Val Asp 325 330 335Gly Val Trp Thr Tyr Asp Asp Ala Thr Lys Thr Phe Thr Val Thr Glu 340 345 350Met Val Thr Glu Val Pro Gly Asp Ala Pro Thr Glu Pro Glu Lys Pro 355 360 365Glu Ala Ser Ile Pro Leu Val Pro Leu Thr Pro Ala Thr Pro Ile Ala 370 375 380Lys Asp Asp Ala 
Lys Lys Asp Asp Thr Lys Lys Glu Asp Ala Lys Lys385 390 395 400Pro Glu Ala Lys Lys Asp Asp Ala Lys Lys Ala Glu Thr Leu Pro Thr 405 410 415Thr Gly Glu Gly Ser Asn Pro Phe Phe Thr Ala Ala Ala Leu Ala Val 420 425 430Met Ala Gly Ala Gly Ala Leu Ala Val Ala Ser Lys Arg Lys Glu Asp 435 440 4456192PRTHomo sapiens 6Met Ala Pro Ser Leu Ser Ala Met Thr Pro Trp Thr Pro Gly Pro Ser1 5 10 15Trp Ser Ser Val Tyr Met Thr Cys Val Trp Ser Val Gly Ser Gly Ser 20 25 30Ala Cys Ala Val Ala Ser Ala Pro Met Pro Arg Pro Val Trp Ser Leu 35 40 45Ala Ser Arg Leu Gly Thr Gly Asp His Gln Pro Thr Ala Pro Cys Pro 50 55 60Ala Leu Pro Thr Ala Ala Met Ser Ser Ala Ala Leu Leu Ala Arg Pro65 70 75 80Pro Ala Thr Gly Leu Arg Arg Arg Pro Thr Ala Pro Gly Ala Pro Ala 85 90 95Trp Arg Ala Ala Cys Ala Ser Gln Ala Ser Trp Pro Ala Ala Ala Pro 100 105 110Ala Cys Arg Pro Arg Arg Val Ala Ala Pro Ser Arg Val Ser Ser Ser 115 120 125Leu Arg Ala Arg Lys Cys Gly Arg Thr Ser Cys Ala Lys Gly Ala Ala 130 135 140Pro Ala Thr Ala Pro Pro Ile Arg Ser Pro Ala Ala Thr Ser Arg Ala145 150 155 160Ala Arg Arg Val Ser Ala Ala Ala Ser Arg Thr Ala Ser Trp Ala Ala 165 170 175Thr Pro Ile Ala Ser Gly Pro Ala Arg Gly Pro Gly Thr His Thr Met 180 185 1907216PRTEscherichia coli 7Met Asn Phe Asn Lys Ile Asp Leu Asp Asn Trp Lys Arg Lys Glu Ile1 5 10 15Phe Asn His Tyr Leu Asn Gln Gln Thr Thr Phe Ser Ile Thr Thr Glu 20 25 30Ile Asp Ile Ser Val Leu Tyr Arg Asn Ile Lys Gln Glu Gly Tyr Lys 35 40 45Phe Tyr Pro Ala Phe Ile Phe Leu Val Thr Arg Val Ile Asn Ser Asn 50 55 60Thr Ala Phe Arg Thr Gly Tyr Asn Ser Asp Gly Glu Leu Gly Tyr Trp65 70 75 80Asp Lys Leu Glu Pro Leu Tyr Thr Ile Phe Asp Gly Val Ser Lys Thr 85 90 95Phe Ser Gly Ile Trp Thr Pro Val Lys Asn Asp Phe Lys Glu Phe Tyr 100 105 110Asp Leu Tyr Leu Ser Asp Val Glu Lys Tyr Asn Gly Ser Gly Lys Leu 115 120 125Phe Pro Lys Thr Pro Ile Pro Glu Asn Ala Phe Ser Leu Ser Ile Ile 130 135 140Pro Trp Thr Ser Phe Thr Gly Phe Asn Leu Asn Ile Asn Asn Asn Ser145 150 155 160Asn Tyr Leu Leu Pro Ile Ile Thr Ala Gly 
Lys Phe Ile Asn Lys Gly 165 170 175Asn Ser Ile Tyr Leu Pro Leu Ser Leu Gln Val His His Ser Val Cys 180 185 190Asp Gly Tyr His Ala Gly Leu Phe Met Asn Ser Ile Gln Glu Leu Ser 195 200 205Asp Arg Pro Asn Asp Trp Leu Leu 210 2158160PRTStreptomyces avidinii 8Met Asp Pro Ser Lys Asp Ser Lys Ala Gln Val Ser Ala Ala Glu Ala1 5 10 15Gly Ile Thr Gly Thr Trp Tyr Asn Gln Leu Gly Ser Thr Phe Ile Val 20 25 30Thr Ala Gly Ala Asp Gly Ala Leu Thr Gly Thr Tyr Glu Ser Ala Val 35 40 45Gly Asn Ala Glu Ser Arg Tyr Val Leu Thr Gly Arg Tyr Asp Ser Ala 50 55 60Pro Ala Thr Asp Gly Ser Gly Thr Ala Leu Gly Trp Thr Val Ala Trp65 70 75 80Lys Asn Asn Tyr Arg Asn Ala His Ser Ala Thr Thr Trp Ser Gly Gln 85 90 95Tyr Val Gly Gly Ala Glu Ala Arg Ile Asn Thr Gln Trp Leu Leu Thr 100 105 110Ser Gly Thr Thr Glu Ala Asn Ala Trp Lys Ser Thr Leu Val Gly His 115 120 125Asp Thr Phe Thr Lys Val Lys Pro Ser Ala Ala Ser Ile Asp Ala Ala 130 135 140Lys Lys Ala Gly Val Asn Asn Gly Asn Pro Leu Asp Ala Val Gln Gln145 150 155 16091024PRTEscherichia coli 9Met Thr Met Ile Thr Asp Ser Leu Ala Val Val Leu Gln Arg Arg Asp1 5 10 15Trp Glu Asn Pro Gly Val Thr Gln Leu Asn Arg Leu Ala Ala His Pro 20 25 30Pro Phe Ala Ser Trp Arg Asn Ser Glu Glu Ala Arg Thr Asp Arg Pro 35 40 45Ser Gln Gln Leu Arg Ser Leu Asn Gly Glu Trp Arg Phe Ala Trp Phe 50 55 60Pro Ala Pro Glu Ala Val Pro Glu Ser Trp Leu Glu Cys Asp Leu Pro65 70 75 80Glu Ala Asp Thr Val Val Val Pro Ser Asn Trp Gln Met His Gly Tyr 85 90 95Asp Ala Pro Ile Tyr Thr Asn Val Thr Tyr Pro Ile Thr Val Asn Pro 100 105 110Pro Phe Val Pro Thr Glu Asn Pro Thr Gly Cys Tyr Ser Leu Thr Phe 115 120 125Asn Val Asp Glu Ser Trp Leu Gln Glu Gly Gln Thr Arg Ile Ile Phe 130 135

140Asp Gly Val Asn Ser Ala Phe His Leu Trp Cys Asn Gly Arg Trp Val145 150 155 160Gly Tyr Gly Gln Asp Ser Arg Leu Pro Ser Glu Phe Asp Leu Ser Ala 165 170 175Phe Leu Arg Ala Gly Glu Asn Arg Leu Ala Val Met Val Leu Arg Trp 180 185 190Ser Asp Gly Ser Tyr Leu Glu Asp Gln Asp Met Trp Arg Met Ser Gly 195 200 205Ile Phe Arg Asp Val Ser Leu Leu His Lys Pro Thr Thr Gln Ile Ser 210 215 220Asp Phe His Val Ala Thr Arg Phe Asn Asp Asp Phe Ser Arg Ala Val225 230 235 240Leu Glu Ala Glu Val Gln Met Cys Gly Glu Leu Arg Asp Tyr Leu Arg 245 250 255Val Thr Val Ser Leu Trp Gln Gly Glu Thr Gln Val Ala Ser Gly Thr 260 265 270Ala Pro Phe Gly Gly Glu Ile Ile Asp Glu Arg Gly Gly Tyr Ala Asp 275 280 285Arg Val Thr Leu Arg Leu Asn Val Glu Asn Pro Lys Leu Trp Ser Ala 290 295 300Glu Ile Pro Asn Leu Tyr Arg Ala Val Val Glu Leu His Thr Ala Asp305 310 315 320Gly Thr Leu Ile Glu Ala Glu Ala Cys Asp Val Gly Phe Arg Glu Val 325 330 335Arg Ile Glu Asn Gly Leu Leu Leu Leu Asn Gly Lys Pro Leu Leu Ile 340 345 350Arg Gly Val Asn Arg His Glu His His Pro Leu His Gly Gln Val Met 355 360 365Asp Glu Gln Thr Met Val Gln Asp Ile Leu Leu Met Lys Gln Asn Asn 370 375 380Phe Asn Ala Val Arg Cys Ser His Tyr Pro Asn His Pro Leu Trp Tyr385 390 395 400Thr Leu Cys Asp Arg Tyr Gly Leu Tyr Val Val Asp Glu Ala Asn Ile 405 410 415Glu Thr His Gly Met Val Pro Met Asn Arg Leu Thr Asp Asp Pro Arg 420 425 430Trp Leu Pro Ala Met Ser Glu Arg Val Thr Arg Met Val Gln Arg Asp 435 440 445Arg Asn His Pro Ser Val Ile Ile Trp Ser Leu Gly Asn Glu Ser Gly 450 455 460His Gly Ala Asn His Asp Ala Leu Tyr Arg Trp Ile Lys Ser Val Asp465 470 475 480Pro Ser Arg Pro Val Gln Tyr Glu Gly Gly Gly Ala Asp Thr Thr Ala 485 490 495Thr Asp Ile Ile Cys Pro Met Tyr Ala Arg Val Asp Glu Asp Gln Pro 500 505 510Phe Pro Ala Val Pro Lys Trp Ser Ile Lys Lys Trp Leu Ser Leu Pro 515 520 525Gly Glu Thr Arg Pro Leu Ile Leu Cys Glu Tyr Ala His Ala Met Gly 530 535 540Asn Ser Leu Gly Gly Phe Ala Lys Tyr Trp Gln Ala Phe Arg Gln Tyr545 550 555 560Pro Arg Leu Gln Gly Gly Phe 
Val Trp Asp Trp Val Asp Gln Ser Leu 565 570 575Ile Lys Tyr Asp Glu Asn Gly Asn Pro Trp Ser Ala Tyr Gly Gly Asp 580 585 590Phe Gly Asp Thr Pro Asn Asp Arg Gln Phe Cys Met Asn Gly Leu Val 595 600 605Phe Ala Asp Arg Thr Pro His Pro Ala Leu Thr Glu Ala Lys His Gln 610 615 620Gln Gln Phe Phe Gln Phe Arg Leu Ser Gly Gln Thr Ile Glu Val Thr625 630 635 640Ser Glu Tyr Leu Phe Arg His Ser Asp Asn Glu Leu Leu His Trp Met 645 650 655Val Ala Leu Asp Gly Lys Pro Leu Ala Ser Gly Glu Val Pro Leu Asp 660 665 670Val Ala Pro Gln Gly Lys Gln Leu Ile Glu Leu Pro Glu Leu Pro Gln 675 680 685Pro Glu Ser Ala Gly Gln Leu Trp Leu Thr Val Arg Val Val Gln Pro 690 695 700Asn Ala Thr Ala Trp Ser Glu Ala Gly His Ile Ser Ala Trp Gln Gln705 710 715 720Trp Arg Leu Ala Glu Asn Leu Ser Val Thr Leu Pro Ala Ala Ser His 725 730 735Ala Ile Pro His Leu Thr Thr Ser Glu Met Asp Phe Cys Ile Glu Leu 740 745 750Gly Asn Lys Arg Trp Gln Phe Asn Arg Gln Ser Gly Phe Leu Ser Gln 755 760 765Met Trp Ile Gly Asp Lys Lys Gln Leu Leu Thr Pro Leu Arg Asp Gln 770 775 780Phe Thr Arg Ala Pro Leu Asp Asn Asp Ile Gly Val Ser Glu Ala Thr785 790 795 800Arg Ile Asp Pro Asn Ala Trp Val Glu Arg Trp Lys Ala Ala Gly His 805 810 815Tyr Gln Ala Glu Ala Ala Leu Leu Gln Cys Thr Ala Asp Thr Leu Ala 820 825 830Asp Ala Val Leu Ile Thr Thr Ala His Ala Trp Gln His Gln Gly Lys 835 840 845Thr Leu Phe Ile Ser Arg Lys Thr Tyr Arg Ile Asp Gly Ser Gly Gln 850 855 860Met Ala Ile Thr Val Asp Val Glu Val Ala Ser Asp Thr Pro His Pro865 870 875 880Ala Arg Ile Gly Leu Asn Cys Gln Leu Ala Gln Val Ala Glu Arg Val 885 890 895Asn Trp Leu Gly Leu Gly Pro Gln Glu Asn Tyr Pro Asp Arg Leu Thr 900 905 910Ala Ala Cys Phe Asp Arg Trp Asp Leu Pro Leu Ser Asp Met Tyr Thr 915 920 925Pro Tyr Val Phe Pro Ser Glu Asn Gly Leu Arg Cys Gly Thr Arg Glu 930 935 940Leu Asn Tyr Gly Pro His Gln Trp Arg Gly Asp Phe Gln Phe Asn Ile945 950 955 960Ser Arg Tyr Ser Gln Gln Gln Leu Met Glu Thr Ser His Arg His Leu 965 970 975Leu His Ala Glu Glu Gly Thr Trp Leu Asn Ile Asp Gly Phe His 
Met 980 985 990Gly Ile Gly Gly Asp Asp Ser Trp Ser Pro Ser Val Ser Ala Glu Phe 995 1000 1005Gln Leu Ser Ala Gly Arg Tyr His Tyr Gln Leu Val Trp Cys Gln 1010 1015 1020Lys10238PRTAequorea victoria 10Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val1 5 10 15Glu Leu Asp Gly Asp Val Asn Gly Gln Lys Phe Ser Val Ser Gly Glu 20 25 30Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys 35 40 45Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe 50 55 60Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Gln65 70 75 80His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg 85 90 95Thr Ile Phe Tyr Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val 100 105 110Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile 115 120 125Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Met Glu Tyr Asn 130 135 140Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Pro Lys Asn Gly145 150 155 160Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Lys Asp Gly Ser Val 165 170 175Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro 180 185 190Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser 195 200 205Lys Asp Pro Asn Glu Lys Arg Asp His Met Ile Leu Leu Glu Phe Val 210 215 220Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys225 230 23511109PRTEscherichia coli 11Met Ser Asp Lys Ile Ile His Leu Thr Asp Asp Ser Phe Asp Thr Asp1 5 10 15Val Leu Lys Ala Asp Gly Ala Ile Leu Val Asp Phe Trp Ala Glu Trp 20 25 30Cys Gly Pro Cys Lys Met Ile Ala Pro Ile Leu Asp Glu Ile Ala Asp 35 40 45Glu Tyr Gln Gly Lys Leu Thr Val Ala Lys Leu Asn Ile Asp Gln Asn 50 55 60Pro Gly Thr Ala Pro Lys Tyr Gly Ile Arg Gly Ile Pro Thr Leu Leu65 70 75 80Leu Phe Lys Asn Gly Glu Val Ala Ala Thr Lys Val Gly Ala Leu Ser 85 90 95Lys Gly Gln Leu Lys Glu Phe Leu Asp Ala Asn Leu Ala 100 1051213PRTOryctolagus cuniculus 12Ile Ala Val Ser Ala Ala Asn Arg Phe Lys Lys Ile Ser1 5 101310PRTArtificial SequenceSYNTHESIZED 13Glu Gln Lys Leu Ile Ser Glu Glu Asp Leu1 5 
10

SEQ ID NO: 14; length 7; PRT; Artificial Sequence; SYNTHESIZED
Tyr Pro Tyr Asp Val Tyr Ala

SEQ ID NO: 15; length 8; PRT; Artificial Sequence; SYNTHESIZED
Asp Tyr Lys Asp Asp Asp Asp Lys

SEQ ID NO: 16; length 8; PRT; Artificial Sequence; SYNTHESIZED
Asp Leu Tyr Asp Asp Asp Asp Lys

SEQ ID NO: 17; length 23; PRT; Artificial Sequence; SYNTHESIZED
Met Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp Asp Asp Lys

SEQ ID NO: 18; length 30; DNA; Artificial Sequence; SYNTHESIZED
gatcccatdn dcatdndcat dndcattaac

SEQ ID NO: 19; length 30; DNA; Artificial Sequence; SYNTHESIZED
aattgttaat ghnhatghnh atghnhatgg

SEQ ID NO: 20; length 26; DNA; Artificial Sequence; SYNTHESIZED
tatgcataat catcgacatg aacata

SEQ ID NO: 21; length 28; DNA; Artificial Sequence; SYNTHESIZED
agcttatgtt tatgtcgatg attatgca

SEQ ID NO: 22; length 26; DNA; Artificial Sequence; SYNTHESIZED
tatgcataaa catagacatg ggcata

SEQ ID NO: 23; length 29; DNA; Artificial Sequence; SYNTHESIZED
agcttgatgc ccatgtctat gtttatgca

SEQ ID NO: 24; length 7; PRT; Artificial Sequence; SYNTHESIZED
His Xaa His Arg His Xaa His

SEQ ID NO: 25; length 4; PRT; Artificial Sequence; SYNTHESIZED
Asp Asp Asp Lys

SEQ ID NO: 26; length 5; PRT; Artificial Sequence; SYNTHESIZED
Asp Asp Asp Asp Lys

SEQ ID NO: 27; length 6; PRT; Artificial Sequence; SYNTHESIZED
Leu Val Pro Arg Gly Xaa

SEQ ID NO: 28; length 5; PRT; Artificial Sequence; SYNTHESIZED
Ile Glu Gly Arg Xaa

SEQ ID NO: 29; length 5; PRT; Artificial Sequence; SYNTHESIZED
Ile Asp Gly Arg Xaa

SEQ ID NO: 30; length 5; PRT; Artificial Sequence; SYNTHESIZED
Ala Glu Gly Arg Xaa

SEQ ID NO: 31; length 6; PRT; Artificial Sequence; SYNTHESIZED
Asp Tyr Lys Xaa Xaa Asp

SEQ ID NO: 32; length 8; PRT; Artificial Sequence; SYNTHESIZED
Asp Tyr Lys Xaa Xaa Asp Xaa Lys

SEQ ID NO: 33; length 5; PRT; Artificial Sequence; SYNTHESIZED
Asp Xaa Tyr Xaa Xaa

SEQ ID NO: 34; length 8; PRT; Artificial Sequence; SYNTHESIZED
Asp Xaa Tyr Xaa Xaa Asp Xaa Lys

SEQ ID NO: 35; length 7; PRT; Artificial Sequence; SYNTHESIZED
Xaa Tyr Lys Xaa Xaa Asp Xaa

SEQ ID NO: 36; length 6; PRT; Artificial Sequence; SYNTHESIZED
Xaa Tyr Lys Xaa Xaa Asp

SEQ ID NO: 37; length 8; PRT; Artificial Sequence; SYNTHESIZED
Xaa Tyr Lys Xaa Xaa Asp Xaa Lys

SEQ ID NO: 38; length 5; PRT; Artificial Sequence; SYNTHESIZED
Xaa Xaa Asp Xaa Lys

SEQ ID NO: 39; length 16; PRT; Artificial Sequence; SYNTHESIZED
Xaa Tyr Lys Xaa Xaa Asp Xaa Xaa Xaa Tyr Lys Xaa Xaa Asp Xaa Lys

SEQ ID NO: 40; length 21; PRT; Artificial Sequence; SYNTHESIZED
Asp Tyr Lys Xaa Xaa Asp Asp Tyr Lys Xaa Xaa Asp Xaa Asp Tyr Lys Xaa Xaa Asp Xaa Lys

SEQ ID NO: 41; length 12; PRT; Artificial Sequence; SYNTHESIZED
Asp Tyr Lys Xaa Xaa Asp Asp Tyr Lys Xaa Xaa Asp

SEQ ID NO: 42; length 14; PRT; Artificial Sequence; SYNTHESIZED
Asp Xaa Tyr Xaa Xaa Xaa Asp Xaa Tyr Xaa Xaa Asp Xaa Lys

SEQ ID NO: 43; length 10; PRT; Artificial Sequence; SYNTHESIZED
Asp Xaa Tyr Xaa Xaa Asp Xaa Tyr Xaa Xaa

SEQ ID NO: 44; length 13; PRT; Artificial Sequence; SYNTHESIZED
Xaa Asp Tyr Lys Xaa Xaa Asp Asp Tyr Lys Xaa Xaa Asp

SEQ ID NO: 45; length 22; PRT; Artificial Sequence; SYNTHESIZED
Xaa Asp Tyr Lys Xaa Xaa Asp Asp Tyr Lys Xaa Xaa Asp Xaa Asp Tyr Lys Xaa Xaa Asp Xaa Lys

SEQ ID NO: 46; length 23; PRT; Artificial Sequence; SYNTHESIZED
Met Asp Tyr Lys Asp His Asp Gly Asp Tyr Lys Asp His Asp Ile Asp Tyr Lys Asp Asp Asp Asp Lys

SEQ ID NO: 47; length 9; PRT; Artificial Sequence; SYNTHESIZED
Xaa Xaa Tyr Lys Xaa Xaa Asp Xaa Lys

SEQ ID NO: 48; length 9; PRT; Artificial Sequence; SYNTHESIZED
Xaa Asp Xaa Tyr Xaa Xaa Asp Xaa Lys

* * * * *