Fusion Proteins

James; Peter ;   et al.

Patent Application Summary

U.S. patent application number 13/595927 was filed with the patent office on 2012-08-27 and published on 2014-02-27 for fusion proteins. This patent application is currently assigned to ALLERGAN, INC. The applicant listed for this patent is Kei Roger Aoki, John Chaddock, Keith Foster, Joseph Francis, Peter James, Lance Steward. Invention is credited to Kei Roger Aoki, John Chaddock, Keith Foster, Joseph Francis, Peter James, Lance Steward.

Application Number: 13/595927 (Publication No. 20140056870)
Family ID: 49123869
Publication Date: 2014-02-27

United States Patent Application 20140056870
Kind Code A1
James; Peter ;   et al. February 27, 2014

FUSION PROTEINS

Abstract

A single chain, polypeptide fusion protein, comprising: a non-cytotoxic protease, which cleaves a protein of the exocytic fusion apparatus of a nociceptive sensory afferent; a galanin Targeting Moiety that binds a Binding Site on the nociceptive sensory afferent, which can undergo endocytosis to be incorporated into an endosome; a protease cleavage site where the fusion protein is cleavable by a protease located between the non-cytotoxic protease and the galanin Targeting Moiety; a translocation domain that translocates the protease from within an endosome, across the endosomal membrane and into the cytosol of the nociceptive sensory afferent; a first spacer from 4 to 25 amino acids between the non-cytotoxic protease and protease cleavage site; and a second spacer comprising from 4 to 35 residues between the galanin Targeting Moiety and translocation domain. Nucleic acid sequences encoding the polypeptide fusion proteins, methods of preparing same and uses thereof are also described.


Inventors: James; Peter; (Eastleigh, GB) ; Foster; Keith; (Salisbury, GB) ; Chaddock; John; (Salisbury, GB) ; Aoki; Kei Roger; (Coto de Caza, CA) ; Steward; Lance; (Irvine, CA) ; Francis; Joseph; (Laguna Niguel, CA)
Applicant:
Name              City            State  Country
James; Peter      Eastleigh              GB
Foster; Keith     Salisbury              GB
Chaddock; John    Salisbury              GB
Aoki; Kei Roger   Coto de Caza    CA     US
Steward; Lance    Irvine          CA     US
Francis; Joseph   Laguna Niguel   CA     US
Assignee: ALLERGAN, INC. (Irvine, CA); SYNTAXIN LIMITED (Abingdon)

Family ID: 49123869
Appl. No.: 13/595927
Filed: August 27, 2012

Current U.S. Class: 424/94.63 ; 435/212; 435/320.1; 536/23.2
Current CPC Class: A61K 47/64 20170801; A61P 25/06 20180101; A61P 29/00 20180101; C07K 2319/50 20130101; A61K 38/00 20130101; C07K 2319/74 20130101; C07K 2319/55 20130101; A61K 47/65 20170801; C12Y 304/21072 20130101; C07K 2319/01 20130101; C12N 15/625 20130101; C12Y 304/24068 20130101; C12Y 304/24069 20130101; A61P 25/04 20180101; C12N 9/52 20130101; C12Y 304/24013 20130101; C07K 14/575 20130101
Class at Publication: 424/94.63 ; 435/212; 536/23.2; 435/320.1
International Class: C12N 9/48 20060101 C12N009/48; A61P 29/00 20060101 A61P029/00; A61K 38/48 20060101 A61K038/48; C12N 15/57 20060101 C12N015/57; C12N 15/63 20060101 C12N015/63

Claims



1. A single chain, polypeptide fusion protein, comprising: a. a non-cytotoxic protease, which protease cleaves a protein of the exocytic fusion apparatus of a nociceptive sensory afferent; b. a galanin Targeting Moiety that binds to a Binding Site on the nociceptive sensory afferent, which Binding Site endocytoses to be incorporated into an endosome within the nociceptive sensory afferent; c. a protease cleavage site at which site the fusion protein is cleavable by a protease, wherein the protease cleavage site is located between the non-cytotoxic protease and the galanin Targeting Moiety; d. a translocation domain that translocates the protease from within an endosome, across the endosomal membrane and into the cytosol of the nociceptive sensory afferent, wherein the Targeting Moiety is located between the protease cleavage site and the translocation domain; e. a first spacer located between the non-cytotoxic protease and the protease cleavage site, wherein said first spacer comprises an amino acid sequence of from 4 to 25 amino acid residues; f. a second spacer located between the galanin Targeting Moiety and the translocation domain, wherein said second spacer comprises an amino acid sequence of from 4 to 35 amino acid residues.

2. The fusion protein according to claim 1, wherein the first spacer comprises an amino acid sequence of from 6 to 16 amino acid residues.

3. The fusion protein according to claim 1, wherein said amino acid residues of said first spacer are selected from the group consisting of glycine, threonine, arginine, serine, alanine, asparagine, glutamine, aspartic acid, proline, glutamic acid and/or lysine.

4. The fusion protein according to claim 1, wherein the amino acid residues of the first spacer are selected from the group consisting of glycine, serine and alanine.

5. The fusion protein according to claim 1, wherein the first spacer is selected from a GS5, GS10, GS15, GS18 or GS20 spacer.

6. The fusion protein according to claim 1, wherein the galanin Targeting Moiety binds specifically to the GALR1, GALR2 and/or the GALR3 receptor.

7. The fusion protein according to claim 1, wherein the galanin Targeting Moiety comprises or consists of an amino acid sequence having at least 70% sequence identity to SEQ ID NO: 7 or SEQ ID NO: 8.

8. The fusion protein according to claim 1, wherein the galanin Targeting Moiety comprises an amino acid sequence according to SEQ ID NO: 7 or a fragment comprising or consisting of at least 14 or 16 contiguous amino acid residues thereof, or a variant amino acid sequence of said SEQ ID NO: 7 or said fragment having a maximum of 5 or 6 conservative amino acid substitutions.

9. The fusion protein according to claim 1, wherein the non-cytotoxic protease is a clostridial neurotoxin L-chain or an IgA protease.

10. The fusion protein according to claim 1, wherein the translocation domain is the HN domain of a clostridial neurotoxin.

11. The fusion protein according to claim 1, wherein said fusion protein comprises an amino acid sequence having at least 90% sequence identity to the amino acid sequence selected from the group consisting of SEQ ID NOs: 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 53, 56 and/or 59.

12. A polynucleotide molecule encoding the polypeptide fusion protein according to claim 1.

13. An expression vector, which comprises a promoter, the polynucleotide molecule according to claim 12, wherein said polynucleotide molecule is located downstream of the promoter, and a terminator located downstream of the polynucleotide molecule.

14. A method for preparing a single-chain polypeptide fusion protein, comprising: a. transfecting a host cell with the expression vector of claim 13, and b. culturing said host cell under conditions promoting expressing of the polypeptide fusion protein by the expression vector.

15. A method of preparing a non-cytotoxic agent, comprising: a. contacting a single-chain polypeptide fusion protein according to claim 1 with a protease capable of cleaving the protease cleavage site; b. cleaving the protease cleavage site; and thereby forming a di-chain fusion protein.

16. A non-cytotoxic polypeptide, obtained by the method of claim 15, wherein the polypeptide is a di-chain polypeptide, and wherein: a. the first chain comprises the non-cytotoxic protease, which protease is capable of cleaving a protein of the exocytic fusion apparatus of a nociceptive sensory afferent; b. the second chain comprises the galanin TM and the translocation domain that is capable of translocating the protease from within an endosome, across the endosomal membrane and into the cytosol of the nociceptive sensory afferent; and the first and second chains are disulphide linked together.

17. A method of treating, preventing or ameliorating pain in a subject, comprising administering to said patient a therapeutically effective amount of the fusion protein according to claim 1.

18. A method according to claim 17, wherein the pain is chronic pain selected from neuropathic pain, inflammatory pain, headache pain, somatic pain, visceral pain, and referred pain.

19. A method of treating, preventing or ameliorating pain in a subject, comprising administering to said patient a therapeutically effective amount of a polypeptide according to claim 16.

20. A method according to claim 19, wherein the pain is chronic pain selected from neuropathic pain, inflammatory pain, headache pain, somatic pain, visceral pain, and referred pain.
Description



STATEMENT REGARDING SEQUENCE LISTING

[0001] The sequence listing associated with this application is provided in text format in lieu of a paper copy and is hereby incorporated by reference into the specification. The name of the text file containing the sequence listing is 39905_Sequence Listing_FINAL_2012-08-24_ST25.txt. The text file is 352 KB; was created on Aug. 24, 2012; and is being submitted via EFS-Web with the filing of the specification.

FIELD OF THE INVENTION

[0002] This invention relates to non-cytotoxic fusion proteins, and to the therapeutic application thereof as analgesic molecules.

BACKGROUND

[0003] Toxins may be generally divided into two groups according to the type of effect that they have on a target cell. In more detail, the first group of toxins kill their natural target cells, and are therefore known as cytotoxic toxin molecules. This group of toxins is exemplified inter alia by plant toxins such as ricin, and abrin, and by bacterial toxins such as diphtheria toxin, and Pseudomonas exotoxin A. Cytotoxic toxins have attracted much interest in the design of "magic bullets" (e.g., immunoconjugates, which comprise a cytotoxic toxin component and an antibody that binds to a specific marker on a target cell) for the treatment of cellular disorders and conditions such as cancer. Cytotoxic toxins typically kill their target cells by inhibiting the cellular process of protein synthesis.

[0004] The second group of toxins, which are known as non-cytotoxic toxins, do not (as their name confirms) kill their natural target cells. Non-cytotoxic toxins have attracted much less commercial interest than have their cytotoxic counterparts, and exert their effects on a target cell by inhibiting cellular processes other than protein synthesis. Non-cytotoxic toxins are produced by a variety of plants, and by a variety of microorganisms such as Clostridium sp. and Neisseria sp.

[0005] Clostridial neurotoxins are proteins that typically have a molecular mass of the order of 150 kDa. They are produced by various species of bacteria, especially of the genus Clostridium, most importantly C. tetani and several strains of C. botulinum, C. butyricum and C. argentinense. There are at present eight different classes of the clostridial neurotoxin, namely: tetanus toxin, and botulinum neurotoxin in its serotypes A, B, C1, D, E, F and G, and they all share similar structures and modes of action.

[0006] Clostridial neurotoxins represent a major group of non-cytotoxic toxin molecules, and are synthesized by the host bacterium as single polypeptides that are modified post-translationally by a proteolytic cleavage event to form two polypeptide chains joined together by a disulphide bond. The two chains are termed the heavy chain (H-chain), which has a molecular mass of approximately 100 kDa, and the light chain (L-chain), which has a molecular mass of approximately 50 kDa.

[0007] L-chains possess a protease function (zinc-dependent endopeptidase activity) and exhibit a high substrate specificity for vesicle and/or plasma membrane associated proteins involved in the exocytic process. L-chains from different clostridial species or serotypes may hydrolyze different but specific peptide bonds in one of three substrate proteins, namely synaptobrevin, syntaxin or SNAP-25. These substrates are important components of the neurosecretory machinery.

[0008] Neisseria sp., most importantly from the species N. gonorrhoeae, produce functionally similar non-cytotoxic proteases. An example of such a protease is IgA protease (see WO99/58571).

[0009] It has been well documented in the art that toxin molecules may be re-targeted to a cell that is not the toxin's natural target cell. When so re-targeted, the modified toxin is capable of binding to a desired target cell and, following subsequent translocation into the cytosol, is capable of exerting its effect on the target cell. Said re-targeting is achieved by replacing the natural Targeting Moiety (TM) of the toxin with a different TM. In this regard, the TM is selected so that it will bind to a desired target cell, and allow subsequent passage of the modified toxin into an endosome within the target cell. The modified toxin also comprises a translocation domain to enable entry of the non-cytotoxic protease into the cell cytosol. The translocation domain can be the natural translocation domain of the toxin or it can be a different translocation domain obtained from a microbial protein with translocation activity.

[0010] The above-mentioned TM replacement may be effected by conventional chemical conjugation techniques, which are well known to a skilled person. In this regard, reference is made to Hermanson, G. T. (1996), Bioconjugate Techniques, Academic Press, and to Wong, S. S. (1991), Chemistry of Protein Conjugation and Cross-Linking, CRC Press. Alternatively, recombinant techniques may be employed, such as those described in WO98/07864. All of the above cited references are incorporated by reference herein.

[0011] Pain-sensing cells possess a wide range of receptor types. However, not all receptor types are suited (least of all desirable) for receptor-mediated endocytosis. Similarly, binding properties can vary widely between different TMs for the same receptor, and even more so between different TMs and different receptors.

[0012] There is therefore a need to develop modified non-cytotoxic fusion proteins that address one or more of the above problems. Of particular interest is the development of an alternative/improved non-cytotoxic fusion protein for use in treating pain.

[0013] The present invention seeks to address one or more of the above problems by providing unique fusion proteins.

SUMMARY

[0014] The present invention addresses one or more of the above-mentioned problems by providing a single chain, polypeptide fusion protein, comprising:
[0015] a. a non-cytotoxic protease, which protease cleaves a protein of the exocytic fusion apparatus of a nociceptive sensory afferent;
[0016] b. a galanin Targeting Moiety that binds to a Binding Site on the nociceptive sensory afferent, which Binding Site endocytoses to be incorporated into an endosome within the nociceptive sensory afferent;
[0017] c. a protease cleavage site at which site the fusion protein is cleavable by a protease, wherein the protease cleavage site is located between the non-cytotoxic protease and the galanin Targeting Moiety;
[0018] d. a translocation domain that translocates the protease from within an endosome, across the endosomal membrane and into the cytosol of the nociceptive sensory afferent, wherein the Targeting Moiety is located between the protease cleavage site and the translocation domain;
[0019] e. a first spacer located between the non-cytotoxic protease and the protease cleavage site, wherein said first spacer comprises an amino acid sequence of from 4 to 25 amino acid residues;
[0020] f. a second spacer located between the galanin Targeting Moiety and the translocation domain, wherein said second spacer comprises an amino acid sequence of from 4 to 35 amino acid residues.
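The arrangement set out in items a to f can be sketched as a small data structure. The following Python is illustrative only and is not part of the specification: the names `FusionLayout`, `DOMAIN_ORDER` and `spacers_in_range` are hypothetical, the left-to-right reading of the order as N- to C-terminal is an assumption, and the check treats the stated residue ranges as bounds on the full spacer length, which simplifies the "comprises" language.

```python
from dataclasses import dataclass

# Residue ranges taken from items e and f above.
FIRST_SPACER_RANGE = (4, 25)
SECOND_SPACER_RANGE = (4, 35)

# Linear order implied by items c to f (labels are hypothetical;
# reading it as N- to C-terminal is an assumption).
DOMAIN_ORDER = (
    "non_cytotoxic_protease",
    "first_spacer",
    "protease_cleavage_site",
    "galanin_targeting_moiety",
    "second_spacer",
    "translocation_domain",
)


@dataclass(frozen=True)
class FusionLayout:
    """Hypothetical container holding the two spacer lengths in residues."""

    first_spacer_len: int
    second_spacer_len: int

    def spacers_in_range(self) -> bool:
        """Return True when both spacer lengths fall inside the stated ranges."""
        lo1, hi1 = FIRST_SPACER_RANGE
        lo2, hi2 = SECOND_SPACER_RANGE
        return (lo1 <= self.first_spacer_len <= hi1
                and lo2 <= self.second_spacer_len <= hi2)
```

For instance, a layout with an 18-residue first spacer and a 20-residue second spacer passes the check, whereas a 3-residue first spacer does not.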

DESCRIPTION OF THE DRAWINGS

[0021] FIGS. 1A and 1B Depict the Purification of a LC/A-Spacer-Galanin-Spacer-HN/A Fusion Protein.

[0022] Using the methodology outlined in Example 3, a LC/A-GS18-galanin-GS20-HN/A fusion protein was purified from E. coli BL21 cells. Briefly, the soluble products obtained following cell disruption were applied to a nickel-charged affinity capture column. Bound proteins were eluted with 100 mM imidazole, treated with enterokinase to activate the fusion protein and treated with factor Xa to remove the maltose-binding protein (MBP) tag. Activated fusion protein was then re-applied to a second nickel-charged affinity capture column. Samples from the purification procedure were assessed by SDS-PAGE (FIG. 1A) and Western blotting (FIG. 1B). Anti-galanin antisera (obtained from Abcam) and Anti-histag antisera (obtained from Qiagen) were used as the primary antibody for Western blotting. The final purified material in the absence and presence of reducing agent is identified in the lanes of FIG. 1A marked [-] and [+] respectively. FIG. 1A: Lane 1=Benchmark ladder; Lane 2=soluble fraction; Lane 3=1st His product; Lane 4=activated purified protein; Lane 5=second His product; Lane 6=final purified protein 5 µl; Lane 7=final purified protein 10 µl; Lane 8=final purified protein 20 µl; Lane 9=final purified protein 5 µl+DTT; Lane 10=final purified protein 10 µl+DTT. FIG. 1B: Lane 1=Benchmark ladder; Lane 2=soluble fraction; Lane 3=1st His product; Lane 4=activated purified protein; Lane 5=second His product; Lane 6=final purified protein 2 µl; Lane 7=final purified protein 5 µl; Lane 8=final purified protein 10 µl; Lane 9=final purified protein 2 µl+DTT; Lane 10=final purified protein 5 µl+DTT.

[0023] FIGS. 2A and 2B Depict the Purification of a LC/C-Spacer-Galanin-Spacer-HN/C Fusion Protein.

[0024] Using the methodology outlined in Example 3, an LC/C-galanin-HN/C fusion protein was purified from E. coli BL21 cells. Briefly, the soluble products obtained following cell disruption were applied to a nickel-charged affinity capture column. Bound proteins were eluted with 100 mM imidazole, treated with enterokinase to activate the fusion protein, then re-applied to a second nickel-charged affinity capture column. Samples from the purification procedure were assessed by SDS-PAGE (FIG. 2A) and Western blotting (FIG. 2B). Anti-galanin antisera (obtained from Abcam) and Anti-histag antisera (obtained from Qiagen) were used as the primary antibody for Western blotting. The final purified material in the absence and presence of reducing agent in FIG. 2A is identified in the lanes marked [-] and [+] respectively. FIG. 2A: Lane 1=Benchmark ladder; Lane 2=soluble fraction; Lane 3=product 1st column; Lane 4=enterokinase activated protein; Lane 5=final product 0.1 mg/ml (5 µl); Lane 6=final product 0.1 mg/ml+DTT (5 µl); Lane 7=final product 0.1 mg/ml (10 µl); Lane 8=final product 0.1 mg/ml+DTT (10 µl). FIG. 2B: Lane 1=Magic mark; Lane 2=soluble fraction; Lane 3=product 1st His-tag column; Lane 4=activated fusion; Lane 5=purified @ 0.1 mg/ml (5 µl); Lane 6=purified @ 0.1 mg/ml+DTT (5 µl); Lane 7=purified @ 0.1 mg/ml+100 mM DTT (10 µl); Lane 8=purified @ 0.1 mg/ml+100 mM DTT (10 µl)+DTT.

[0025] FIGS. 3A and 3B Depict a Comparison of SNARE Cleavage Efficacy of a LC-Spacer-Galanin-Spacer-HN Fusion Protein and a LC-HN-Galanin Fusion Protein.

[0026] FIGS. 3A and 3B: The ability of galanin fusions to cleave SNAP-25 in a CHO GALR1 SNAP25 cell was assessed. Chinese hamster ovary (CHO) cells were transfected so that they express the GALR1 receptor. Said cells were further transfected to express a SNARE protein (SNAP-25). The transfected cells were exposed to varying concentrations of different galanin fusion proteins for 24 hours. Cellular proteins were separated by SDS-PAGE, Western blotted, and probed with anti-SNAP-25 to facilitate an assessment of SNAP-25 cleavage. The percentage of cleaved SNAP-25 was calculated by densitometric analysis. It is clear from the data that the LC-spacer-galanin-spacer-HN fusion (Fusion 1) is more potent than the LC-HN-galanin fusion (LHN-gal) and the unliganded LC/A-HN/A control molecule (LHA).
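The text does not state the formula used for the densitometric calculation. One common convention, assumed here purely for illustration, expresses cleavage as the intensity of the cleaved SNAP-25 band divided by the summed intensity of the intact and cleaved bands in the same lane. The function name `percent_cleaved` and its parameters are hypothetical.

```python
def percent_cleaved(intact_intensity: float, cleaved_intensity: float) -> float:
    """Percentage of SNAP-25 cleaved, from two band intensities in one lane.

    Assumed convention (not given in the source):
        100 * cleaved / (intact + cleaved)
    Intensities are background-subtracted densitometry values in
    arbitrary but identical units.
    """
    if intact_intensity < 0 or cleaved_intensity < 0:
        raise ValueError("band intensities must be non-negative")
    total = intact_intensity + cleaved_intensity
    if total == 0:
        raise ValueError("no SNAP-25 signal detected in this lane")
    return 100.0 * cleaved_intensity / total
```

Normalizing within each lane makes the value independent of how much total protein was loaded, which is why this ratio form is often chosen.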

[0027] FIG. 4 Depicts GALR1 Receptor Activation Studies in the CHO-GALR1 SNAP-25 Cleavage Assay with Galanin Fusion Proteins of the Present Invention Having Different Serotype Backbones.

[0028] Chinese hamster ovary (CHO) cells were transfected so that they express the GALR1 receptor and SNAP-25. Said cells were used to measure the cAMP depletion that occurs when the receptor is activated with a galanin ligand, using a FRET-based cAMP kit (LANCE kit from Perkin Elmer). The transfected cells were exposed to varying concentrations of galanin (GA16) fusion proteins having different serotype backbones (i.e., botulinum neurotoxin serotypes A, B, C and D) for 2 hours. cAMP levels were then detected by addition of a detection mix containing a fluorescently labeled cAMP tracer (Europium-streptavidin/biotin-cAMP) and fluorescently (Alexa) labeled anti-cAMP antibody and incubating at room temperature for 24 hours. Samples were then excited at 320 nm and emitted light was measured at 665 nm to determine cAMP levels. The data demonstrate that galanin fusion proteins of the present invention having different serotype backbones activated the GALR1 receptor.

[0029] FIG. 5 Depicts the Cleavage of SNARE Protein by Galanin (GA16 and GA30) Fusion Proteins in CHO-GALR1 SNAP-25 Cleavage Assay.

[0030] Chinese hamster ovary (CHO) cells were transfected so that they express the GALR1 receptor. Said cells were further transfected to express a SNARE protein (SNAP-25). The transfected cells were exposed to varying concentrations of different galanin fusion proteins for 24 hours. Cellular proteins were separated by SDS-PAGE, Western blotted, and probed with anti-SNAP-25 to facilitate an assessment of SNAP-25 cleavage. The percentage of cleaved SNAP-25 was calculated by densitometric analysis. The data demonstrate that galanin fusion proteins having galanin-16 and galanin-30 ligands cleave SNARE protein. In addition, the data confirm that galanin fusion proteins having GS5, GS10 and GS18 spacers between the non-cytotoxic protease component and the protease cleavage site are functional.

[0031] FIG. 6 Depicts the Results of In Vivo Paw Guarding Assay Employing Galanin Fusion Proteins.

[0032] The nociceptive flexion reflex (also known as paw guarding assay) is a rapid withdrawal movement that constitutes a protective mechanism against possible limb damage. It can be quantified by assessment of the electromyography (EMG) response in an anesthetized rat as a result of low dose capsaicin, electrical stimulation or the capsaicin-sensitized electrical response. Fusion proteins of the present invention were administered by intraplantar pretreatment (24 hours) to 300-380 g male Sprague-Dawley rats. Induction of paw guarding was achieved by 0.006% capsaicin, 10 µl in PBS (7.5% DMSO), injected in 10 seconds. This produced a robust reflex response from the biceps femoris muscle. A reduction/inhibition of the nociceptive flexion reflex indicates that the test substance demonstrates an anti-nociceptive effect. The data demonstrated the anti-nociceptive effect of the galanin fusion proteins of the present invention.

[0033] FIG. 7 Depicts Galanin Fusion Protein Efficacy in Capsaicin-Induced Thermal Hyperalgesia Assay.

[0034] The ability of different galanin fusion proteins of the invention to inhibit capsaicin-induced thermal hyperalgesia was evaluated. Fusion proteins were administered by intraplantar pretreatment to Sprague-Dawley rats; 24 hours later, 0.3% capsaicin was injected and the rats were placed on a 25° C. glass plate (contained in acrylic boxes). A light beam of adjustable intensity was focused on the hind paw. Sensors detected movement of the paw and stopped a timer. Paw Withdrawal Latency is defined as the time to remove the paw from the heat source (cut-off of 20.48 seconds). A reduction/inhibition of the paw withdrawal latency indicates that the test substance demonstrates an anti-nociceptive effect. No. 1=LHN-GA16; No. 2=LHN-GA30; No. 3=LC-GS5-EN-CPGA16-GS20-HN-HT; No. 4=LC-GS18-EN-CPGA16-GS20-HN-HT; No. 5=BOTOX; No. 6=morphine. The data demonstrated the enhanced anti-nociceptive effect of the galanin fusion proteins of the present invention compared to fusion proteins with a C-terminally presented ligand.
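The handling of the 20.48 second cut-off can be illustrated with a short sketch. This is an assumption about how such a cut-off is typically applied (a trial in which the paw is not withdrawn is recorded at the cut-off value), not a description of the instrument software used; `capped_latency` and `mean_latency` are hypothetical helper names.

```python
CUTOFF_SECONDS = 20.48  # cut-off stated for the Paw Withdrawal Latency measurement


def capped_latency(observed_seconds: float) -> float:
    """Return the latency for one trial, truncated at the cut-off."""
    if observed_seconds < 0:
        raise ValueError("latency cannot be negative")
    return min(observed_seconds, CUTOFF_SECONDS)


def mean_latency(observed: list[float]) -> float:
    """Mean of the capped latencies for one treatment group (hypothetical summary)."""
    if not observed:
        raise ValueError("no observations supplied")
    capped = [capped_latency(t) for t in observed]
    return sum(capped) / len(capped)
```

Capping each trial before averaging keeps a single non-responding animal from dominating the group mean.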

[0035] FIG. 8 Depicts Galanin Fusion Protein Efficacy in Capsaicin-Induced Thermal Hyperalgesia Assay.

[0036] The ability of different galanin fusion proteins of the invention to inhibit capsaicin-induced thermal hyperalgesia was evaluated. Fusion proteins were administered by intraplantar pretreatment to Sprague-Dawley rats; 24 hours later, 0.3% capsaicin was injected and the rats were placed on a 25° C. glass plate (contained in acrylic boxes). A light beam of adjustable intensity was focused on the hind paw. Sensors detected movement of the paw and stopped a timer. Paw Withdrawal Latency is defined as the time to remove the paw from the heat source (cut-off of 20.48 seconds). A reduction/inhibition of the paw withdrawal latency indicates that the test substance demonstrates an anti-nociceptive effect. The data demonstrated the anti-nociceptive effect of the galanin fusion proteins of the present invention having different serotype backbones (i.e., A, B, C and D).

[0037] FIGS. 9A Through 9C Depict the Activation of Galanin Fusion Proteins with Single and Double-Spacers.

[0038] Galanin fusion proteins lacking a first spacer (spacer 1) of the present invention located between the non-cytotoxic protease component and the Targeting Moiety component showed poor activation with protease (FIGS. 9A and 9B). FIG. 9C demonstrates the enhanced activation of galanin fusion proteins of the present invention having both first (spacer 1) and second (spacer 2) spacers. FIGS. 9A and 9B: Lane 1=Benchmark ladder; Lane 2=Unactivated control; Lane 3=Unactivated control+DTT; Lane 4=Protease activated protein+0.0 mM ZnCl2; Lane 5=Protease activated protein+0.0 mM ZnCl2+DTT; Lane 6=Protease activated protein+0.2 mM ZnCl2; Lane 7=Protease activated protein+0.2 mM ZnCl2+DTT; Lane 8=Protease activated protein+0.4 mM ZnCl2; Lane 9=Protease activated protein+0.4 mM ZnCl2+DTT; Lane 10=Protease activated protein+0.8 mM ZnCl2; Lane 11=Protease activated protein+0.8 mM ZnCl2+DTT. FIG. 9C: Lane 1=Benchmark ladder; Lane 2=Unactivated control 25° C.; Lane 3=Unactivated control 25° C.+DTT; Lane 4=Protease activated protein 25° C.; Lane 5=Protease activated protein 25° C.+DTT; Lane 6=Benchmark ladder.

DETAILED DESCRIPTION

[0039] The non-cytotoxic protease component of the present invention is a non-cytotoxic protease, which protease is capable of cleaving different but specific peptide bonds in one of three substrate proteins, namely synaptobrevin, syntaxin or SNAP-25, of the exocytic fusion apparatus in a nociceptive sensory afferent. These substrates are important components of the neurosecretory machinery. The non-cytotoxic protease component of the present invention is preferably a neisserial IgA protease or a clostridial neurotoxin L-chain. The term non-cytotoxic protease embraces functionally equivalent fragments and derivatives of said non-cytotoxic protease(s). A particularly preferred non-cytotoxic protease component is a botulinum neurotoxin (BoNT) L-chain.

[0040] The translocation component of the present invention enables translocation of the non-cytotoxic protease (or fragment thereof) into the target cell such that functional expression of protease activity occurs within the cytosol of the target cell. The translocation component is preferably capable of forming ion-permeable pores in lipid membranes under conditions of low pH. It has been found preferable to use only those portions of the protein molecule capable of pore-formation within the endosomal membrane. The translocation component may be obtained from a microbial protein source, in particular from a bacterial or viral protein source. Hence, in one embodiment, the translocation component is a translocating domain of an enzyme, such as a bacterial toxin or viral protein. The translocation component of the present invention is preferably a clostridial neurotoxin H-chain or a fragment thereof. Most preferably it is the HN domain (or a functional component thereof), wherein HN means a portion or fragment of the H-chain of a clostridial neurotoxin approximately equivalent to the amino-terminal half of the H-chain, or the domain corresponding to that fragment in the intact H-chain.

[0041] The galanin TM component of the present invention is responsible for binding the fusion protein of the present invention to a Binding Site on a target cell. Thus, the galanin TM component is a ligand through which the fusion proteins of the present invention bind to a selected target cell.

[0042] In the context of the present invention, the target cell is a nociceptive sensory afferent, preferably a primary nociceptive afferent (e.g., an A-fiber such as an Aδ-fiber or a C-fiber). Thus, the fusion proteins of the present invention are capable of inhibiting neurotransmitter or neuromodulator (e.g., glutamate, substance P, calcitonin-gene related peptide (CGRP), and/or neuropeptide Y) release from discrete populations of nociceptive sensory afferent neurons. In use, the fusion proteins reduce or prevent the transmission of sensory afferent signals (e.g., neurotransmitters or neuromodulators) from peripheral to central pain fibers, and therefore have application as therapeutic molecules for the treatment of pain, in particular chronic pain.

[0043] It is routine to confirm that a TM binds to a nociceptive sensory afferent. For example, a simple radioactive displacement experiment may be employed in which tissue or cells representative of the nociceptive sensory afferent (for example DRGs) are exposed to labeled (e.g., tritiated) ligand in the presence of an excess of unlabelled ligand. In such an experiment, the relative proportions of non-specific and specific binding may be assessed, thereby allowing confirmation that the ligand binds to the nociceptive sensory afferent target cell. Optionally, the assay may include one or more binding antagonists, and the assay may further comprise observing a loss of ligand binding. Examples of this type of experiment can be found in Hulme, E. C. (1990), Receptor-binding Studies, A Brief Outline, pp. 303-311, in Receptor Biochemistry, A Practical Approach, ed. Hulme, Oxford University Press.
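The arithmetic behind such a displacement experiment can be sketched as follows. Binding measured with labeled ligand alone is taken as total binding, and binding measured in the presence of excess unlabelled ligand is taken as non-specific binding; specific binding is the difference. The function name `specific_binding` and the use of counts per minute are illustrative assumptions.

```python
def specific_binding(total_cpm: float, nonspecific_cpm: float) -> tuple[float, float]:
    """Return (specific binding, fraction of total that is specific).

    total_cpm:       signal with labeled ligand only
    nonspecific_cpm: signal with labeled ligand plus excess unlabelled ligand
    Units (counts per minute) are an assumption; any consistent unit works.
    """
    if total_cpm <= 0:
        raise ValueError("total binding must be positive")
    if not 0 <= nonspecific_cpm <= total_cpm:
        raise ValueError("non-specific binding must lie between 0 and total binding")
    specific = total_cpm - nonspecific_cpm
    return specific, specific / total_cpm
```

A specific fraction well above zero supports the conclusion that the ligand binds a saturable site on the target cells rather than adhering non-specifically.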

[0044] The fusion proteins of the present invention generally demonstrate a reduced binding affinity (in the region of up to 10-fold) for the galanin receptor (e.g., GALR1) when compared with the corresponding 'free' TM (e.g., gal16). However, despite this observation, the fusion proteins of the present invention surprisingly demonstrate good efficacy. This can be attributed to two principal features. First, the non-cytotoxic protease component is catalytic; thus, the therapeutic effect of a few such molecules is rapidly amplified. Secondly, the galanin receptors present on the nociceptive sensory afferents need only act as a gateway for entry of the therapeutic, and need not necessarily be stimulated to a level required in order to achieve a ligand-receptor mediated pharmacological response. Accordingly, the fusion proteins of the present invention may be administered at a dosage that is much lower than would be employed for other types of analgesic molecules such as NSAIDs, morphine, and gabapentin. The latter molecules are typically administered at high microgram to milligram (even up to hundreds of milligrams) quantities, whereas the fusion proteins of the present invention may be administered at much lower dosages, typically at least 10-fold lower, and more typically at 100-fold lower.

[0045] The galanin TM of the invention can also be a molecule that acts as an "agonist" at one or more of the galanin receptors present on a nociceptive sensory afferent, more particularly on a primary nociceptive afferent. Conventionally, an agonist has been considered any molecule that can either increase or decrease activities within a cell, namely any molecule that simply causes an alteration of cell activity. For example, the conventional meaning of an agonist would include a chemical substance capable of combining with a receptor on a cell and initiating a reaction or activity, or a drug that induces an active response by activating receptors, whether the response is an increase or decrease in cellular activity.

[0046] However, for the purposes of this invention, an agonist is more specifically defined as a molecule that is capable of stimulating the process of exocytic fusion in a target cell, which process is susceptible to inhibition by a protease (or fragment thereof) capable of cleaving a protein of the exocytic fusion apparatus in said target cell.

[0047] Accordingly, the particular agonist definition of the present invention would exclude many molecules that would be conventionally considered as agonists. For example, nerve growth factor (NGF) is an agonist in respect of its ability to promote neuronal differentiation via binding to a TrkA receptor. However, NGF is not an agonist when assessed by the above criteria because it is not a principal inducer of exocytic fusion. In addition, the process that NGF stimulates (i.e., cell differentiation) is not susceptible to inhibition by the protease activity of a non-cytotoxic toxin molecule.

[0048] In one embodiment, the fusion proteins according to the present invention demonstrate preferential receptor binding and/or internalization properties. This, in turn, may result in more efficient delivery of the protease component to a pain-sensing target cell.

[0049] Use of an agonist as a TM is self-limiting with respect to side-effects. In more detail, binding of an agonist TM to a pain-sensing target cell increases exocytic fusion, which may exacerbate the sensation of pain. However, the exocytic process that is stimulated by agonist binding is subsequently reduced or inhibited by the protease component of the fusion protein.

[0050] The agonist properties of a TM that binds to a receptor on a nociceptive afferent can be confirmed using the methods described in Example 9.

[0051] The Targeting Moiety of the present invention comprises or consists of galanin and/or derivatives of galanin. Galanin receptors (e.g., GALR1, GALR2 and GALR3) are found pre- and post-synaptically in dorsal root ganglia (DRGs) (Liu and Hokfelt, (2002) Trends Pharm. Sci., 23(10):468-474), and are enhanced in expression during neuropathic pain states. Xu et al., (2000) Neuropeptides, 34(3-4):137-147 provides further information in relation to galanin. All of the above cited references are incorporated by reference herein.

[0052] In one embodiment of the invention, the target for the galanin TM is the GALR1, GALR2 and/or the GALR3 receptor. These receptors are members of the G-protein-coupled class of receptors, and have a seven transmembrane domain structure.

[0053] In one embodiment, the galanin TM is a molecule that binds (preferably that specifically binds) to the GALR1, GALR2 and/or the GALR3 receptor. More preferably, the galanin TM is an "agonist" of the GALR1, GALR2 and/or the GALR3 receptor. The term "agonist" in this context is defined as above.

[0054] Wild-type human galanin peptide is a 30 amino acid peptide, abbreviated herein as "GA30" (represented by SEQ ID NO: 7). In one embodiment, the galanin TM comprises or consists of SEQ ID NO: 7.

[0055] The invention also encompasses fragments, variants, and derivatives of the galanin TM described above. These fragments, variants, and derivatives substantially retain the properties that are ascribed to said galanin TM (i.e., are functionally equivalent). For example, the fragments, variants, and derivatives may retain the ability to bind to the GALR1, GALR2 and/or GALR3 receptor. In one embodiment, the galanin TM of the invention comprises or consists of a 16 amino acid fragment of full-length galanin peptide and is referred to herein as GA16 (represented by SEQ ID NO: 8).

[0056] In one embodiment, the galanin TM comprises or consists of an amino acid sequence having at least 70%, preferably at least 80% (such as at least 82, 84, 85, 86, 88 or 89%), more preferably at least 90% (such as at least 91, 92, 93 or 94%), and most preferably at least 95% (such as at least 96, 97, 98, 99 or 100%) amino acid sequence identity to SEQ ID NO: 7 or SEQ ID NO: 8.

[0057] In one embodiment, the galanin TM comprises or consists of an amino acid sequence having at least 70% (such as at least 80, 82, 84, 85, 86, 88 or 89%), more preferably at least 90% (such as at least 91, 92, 93 or 94%), and most preferably at least 95% (such as at least 96, 97, 98, 99 or 100%) amino acid sequence identity to the full-length amino acid sequence of SEQ ID NO: 7 or SEQ ID NO: 8, or a fragment of SEQ ID NO: 7 or SEQ ID NO: 8 comprising or consisting of at least 10 (such as at least 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28 or 29) contiguous amino acid residues thereof.
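The identity thresholds recited above (70%, 80%, 90%, 95%) can be illustrated with a minimal sketch. It assumes the two sequences have already been aligned to equal length (with `-` marking gaps) and that identity is counted over the full alignment length; the specification does not state which alignment method or denominator is intended, so both are assumptions. The function names and the toy peptides are hypothetical and are not SEQ ID NO: 7 or SEQ ID NO: 8.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over two pre-aligned, equal-length sequences.

    Assumption: identity = matching non-gap columns / alignment length.
    """
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned, non-empty and equal in length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float = 70.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 10-residue peptides (hypothetical, for illustration only).
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"  # one substitution at the last position
print(percent_identity(reference, variant))
print(meets_threshold(reference, variant, threshold=95.0))
```

With one mismatch in ten aligned positions, the toy pair satisfies the 70%, 80% and 90% thresholds but not the 95% threshold.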

[0058] In one embodiment, the galanin Targeting Moiety comprises or consists of an amino acid sequence according to SEQ ID NO: 7 or a fragment comprising or consisting of at least 16 (such as at least 10, 11, 12, 13, 14 or 15) contiguous amino acid residues thereof, or a variant amino acid sequence of said SEQ ID NO: 7 or said fragment having a maximum of 6 (such as a maximum of 5, 4, 3, 2 or 1) conservative amino acid substitutions.

[0059] The protease cleavage site of the present invention allows cleavage (preferably controlled cleavage) of the fusion protein at a position between the non-cytotoxic protease component and the TM component. It is this cleavage reaction that converts the fusion protein from a single chain polypeptide into a disulphide-linked, di-chain polypeptide.

[0060] According to a preferred embodiment of the present invention, the galanin TM binds via a domain or amino acid sequence that is located away from the C-terminus of the galanin TM. For example, the relevant binding domain may include an intra domain or an amino acid sequence located towards the middle (i.e., of the linear peptide sequence) of the TM. Preferably, the relevant binding domain is located towards the N-terminus of the galanin TM, more preferably at or near to the N-terminus.

[0061] In one embodiment, the single chain polypeptide fusion may include more than one proteolytic cleavage site. However, where two or more such sites exist, they are different, thereby substantially preventing the occurrence of multiple cleavage events in the presence of a single protease. In another embodiment, it is preferred that the single chain polypeptide fusion has a single protease cleavage site.

[0062] The protease cleavage sequence(s) may be introduced (and/or any inherent cleavage sequence removed) at the DNA level by conventional means, such as by site-directed mutagenesis. Screening to confirm the presence of cleavage sequences may be performed manually or with the assistance of computer software (e.g., the MapDraw program by DNASTAR, Inc.).

[0063] Whilst any protease cleavage site may be employed, the following are preferred:

Protease                    Cleavage site     SEQ ID NO:
Enterokinase                DDDDK↓            60
Factor Xa                   IEGR↓ / IDGR↓     61 / 62
TEV (Tobacco Etch virus)    ENLYFQ↓G          63
Thrombin                    LVPR↓GS           64
PreScission                 LEVLFQ↓GP         65
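The computer-assisted screening for cleavage sequences mentioned in paragraph [0062] can be sketched as a simple motif search over an amino acid sequence. This is an illustrative sketch only, not the MapDraw program: the motif strings are taken from the preferred sites listed above, while the dictionary name, function name and test sequence are hypothetical.

```python
import re

# Recognition motifs from the list of preferred cleavage sites.
# Factor Xa is written as a pattern covering both IEGR and IDGR.
CLEAVAGE_MOTIFS = {
    "Enterokinase": "DDDDK",
    "Factor Xa": "I[ED]GR",
    "TEV": "ENLYFQG",
    "Thrombin": "LVPRGS",
    "PreScission": "LEVLFQGP",
}


def find_cleavage_sites(sequence: str):
    """Return (protease name, 0-based start index) for every motif found."""
    hits = []
    for name, pattern in CLEAVAGE_MOTIFS.items():
        for match in re.finditer(pattern, sequence.upper()):
            hits.append((name, match.start()))
    return sorted(hits, key=lambda hit: hit[1])


# Hypothetical test sequence containing one enterokinase site.
print(find_cleavage_sites("GGGGSADDDDKGWT"))
```

A sequence returning more than one hit for the same protease would conflict with the preference, stated in paragraph [0061], that multiple sites be different from one another.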

[0064] In one embodiment, the protease cleavage site is an enterokinase cleavage site (DDDDK↓). In one embodiment, enterokinase protease is used to cleave the enterokinase cleavage site and activate the fusion protein.

[0065] Also embraced by the term protease cleavage site is an intein, which is a self-cleaving sequence. The self-splicing reaction is controllable, for example by varying the concentration of reducing agent present.

[0066] In use, the protease cleavage site is cleaved and the N-terminal region (preferably the N-terminus) of the TM becomes exposed. The resulting polypeptide has a TM with an N-terminal domain or an intra domain that is substantially free from the remainder of the fusion protein. This arrangement ensures that the N-terminal component (or intra domain) of the TM may interact directly with a Binding Site on a target cell.

[0067] In one embodiment, the TM and the protease cleavage site are distanced apart in the fusion protein by at most 10 amino acid residues, more preferably by at most 5 amino acid residues, and most preferably by zero amino acid residues. In one embodiment, the TM and the protease cleavage site are distanced apart in the fusion protein by 0-10 (such as 0-9, 0-8, 0-7, 0-6, 0-5, 0-4, 0-3, 0-2) and preferably 0-1 amino acid residues. Thus, following cleavage of the protease cleavage site, a fusion is provided with a TM that has an N-terminal domain that is substantially free from the remainder of the fusion. This arrangement ensures that the N-terminal component of the Targeting Moiety may interact directly with a Binding Site on a target cell.

[0068] One advantage associated with the above-mentioned activation step is that the TM only becomes susceptible to N-terminal degradation once proteolytic cleavage of the fusion protein has occurred. In addition, the selection of a specific protease cleavage site permits selective activation of the polypeptide fusion into a di-chain conformation.

[0069] Construction of the single-chain polypeptide fusion of the present invention places the protease cleavage site between the TM and the non-cytotoxic protease component.

[0070] It is preferred that, in the single-chain fusion, the TM is located between the protease cleavage site and the translocation component. This ensures that the TM is attached to the translocation domain (i.e., as occurs with native clostridial holotoxin), though in the case of the present invention the order of the two components is reversed vis-a-vis native holotoxin. A further advantage with this arrangement is that the TM is located in an exposed loop region of the fusion protein, which has minimal structural effects on the conformation of the fusion protein. In this regard, said loop is variously referred to as the linker, the activation loop, the inter-domain linker, or just the surface exposed loop (Schiavo et al., (2000) Physiol. Rev. 80:717-766; Turton et al., (2002) Trends Biochem. Sci. 27:552-558).

[0071] The single chain fusion protein of the present invention comprises a first spacer located between the non-cytotoxic protease and the protease cleavage site, wherein said first spacer comprises (or consists of) an amino acid sequence of from 4 to 25 (such as from 6 to 25, 8 to 25, 10 to 25, 15 to 25 or from 4 to 21, 4 to 20, 4 to 18, 4 to 15, 4 to 12 or 4 to 10) amino acid residues. In one embodiment, the first spacer comprises (or consists of) an amino acid sequence of at least 4 (such as at least 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15) amino acid residues. In one embodiment, the first spacer comprises (or consists of) an amino acid sequence of at most 25 (such as at most 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 10) amino acid residues. Said first spacer enables cleavage of the fusion protein at the protease cleavage site.

[0072] Without a first spacer of the present invention, protease cleavage and activation of the fusion protein is markedly poor. Without wishing to be bound by theory, it is hypothesized that the galanin Targeting Moiety may sterically block or interact with the protease cleavage site resulting in poor activation of fusion proteins lacking a first spacer of the present invention. The present inventors believe that it is the flexibility afforded by the first spacer which provides for the enhanced/improved activation properties of the presently claimed fusion proteins. Rigid linkers such as alpha-helical linkers do not afford the necessary flexibility. This is also true for galanin fusion proteins having `natural` spacer sequences containing a protease cleavage site, which may replicate undesirable rigid alpha-helical linker structures. Flexibility and mobility of polypeptide domains can be ascertained by a number of methods including determining the X-ray crystallographic B-factor (see e.g., Smith et al., (2003) Protein Sci. 12:1060-1072; incorporated by reference herein). The specifically selected spacer sequences of the present invention provide for enhanced activation over and above any `natural` spacer sequences. Activation in this context means that said first spacer enables cleavage of the fusion protein at the protease cleavage site. Particularly preferred amino acid residues for use in the first spacer include glycine, threonine, arginine, serine, alanine, asparagine, glutamine, aspartic acid, proline, glutamic acid and/or lysine. The aforementioned amino acids are considered to be the most flexible amino acids--see Smith et al., (2003) Protein Sci. 12:1060-1072.

[0073] In one embodiment, the amino acid residues of the first spacer are selected from the group consisting of glycine, threonine, arginine, serine, asparagine, glutamine, alanine, aspartic acid, proline, glutamic acid, lysine, leucine and/or valine. In one embodiment, the amino acid residues of the first spacer are selected from the group consisting of glycine, serine, alanine, leucine and/or valine. In one embodiment, the amino acid residues of the first spacer are selected from the group consisting of glycine, serine and/or alanine. Glycine and serine are particularly preferred. In one embodiment, the first spacer comprises or consists of one or more pentapeptides having glycine, serine, and/or threonine residues. One way of assessing whether the first spacer possesses the requisite flexibility in the presently claimed fusion proteins is by performing a simple protease cleavage assay. It would be routine for a person skilled in the art to assess cleavage/activation of a fusion protein--standard methodology is described, for example, in Example 1.

[0074] In one embodiment, the first spacer may be selected from a GS5, GS10, GS15, GS18, GS20, FL3 and/or FL4 spacers. The sequence of said spacers is provided in Table 1, below.

TABLE 1

SPACER    SEQUENCE                     SEQ ID NO:
GS5       GGGGSA                       66
GS10      GGGGSGGGGSA                  67
GS15      ALAGGGGSGGGGSALV             68
GS18      GGGGSGGGGSGGGGSA             69
GS20      ALAGGGGSGGGGSGGGGSALV        70
FL3       LGGGGSGGGGSGGGGSAAA          71
FL4       LSGGGGSGGGGSGGGGSGGGGSAAA    72
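As a minimal sketch, the first-spacer constraints described in paragraphs [0071] and [0073] can be checked against the Table 1 sequences: a length of from 4 to 25 residues, and residues drawn from glycine, serine, alanine, leucine and/or valine (one of the embodiments in [0073]). The dictionary and function names are hypothetical; the sequences are copied from Table 1.

```python
# Spacer sequences as listed in Table 1.
FIRST_SPACERS = {
    "GS5": "GGGGSA",
    "GS10": "GGGGSGGGGSA",
    "GS15": "ALAGGGGSGGGGSALV",
    "GS18": "GGGGSGGGGSGGGGSA",
    "GS20": "ALAGGGGSGGGGSGGGGSALV",
    "FL3": "LGGGGSGGGGSGGGGSAAA",
    "FL4": "LSGGGGSGGGGSGGGGSGGGGSAAA",
}

# One embodiment of [0073]: glycine, serine, alanine, leucine, valine.
ALLOWED_RESIDUES = set("GSALV")


def is_valid_first_spacer(sequence: str, min_len: int = 4, max_len: int = 25) -> bool:
    """Check length (4 to 25 residues) and residue composition."""
    return min_len <= len(sequence) <= max_len and set(sequence) <= ALLOWED_RESIDUES


for name, seq in FIRST_SPACERS.items():
    print(name, len(seq), is_valid_first_spacer(seq))
```

This check addresses only length and composition; whether a given spacer actually permits activation would still be assessed by the protease cleavage assay referred to in [0073].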

[0075] In one embodiment, the first spacer enables at least 45% (such as at least 50, 55, 60, 65, 70, 75, 80, 90, 95, 98, 99 or 100%) activation of the fusion protein by protease cleavage. In one embodiment, the first spacer enables at least 70% activation of the fusion protein by protease cleavage.

[0076] In one embodiment, the first spacer is not a naturally-occurring spacer sequence. In one embodiment, the first spacer does not comprise or consist of an amino acid sequence native to the natural (i.e., wild-type) clostridial neurotoxin, such as botulinum neurotoxin. In other words, the first spacer may be a non-clostridial sequence (i.e., not found in the native clostridial neurotoxin). In one embodiment, the fusion protein does not comprise or consist of the amino acid sequence GIITSK (BoNT/A) (SEQ ID NO:74); VK (BoNT/B); AIDGR (BoNT/C) (SEQ ID NO:75); LTK (BoNT/D); IVSVK (BoNT/E) (SEQ ID NO:76); VIPR (BoNT/F) (SEQ ID NO:77); VMYK (BoNT/G) (SEQ ID NO:78) and/or IIPPTNIREN (TeNT) (SEQ ID NO:79) as the first spacer.

[0077] In one embodiment, the first spacer begins on the third amino acid residue following the conserved cysteine residue in the clostridial neurotoxin L-chain (see Table 3 below). In one embodiment, the first spacer begins after the VD amino acid residues of a non-cytotoxic protease clostridial L-chain engineered with a SalI site following the conserved cysteine residue. In one embodiment, the first spacer ends with the amino acid residue marking the beginning of the protease cleavage sites mentioned above.

[0078] In one embodiment, the single chain fusion protein comprises a second spacer, which is located between the galanin Targeting Moiety and the translocation domain. Said second spacer may comprise (or consist of) an amino acid sequence of from 4 to 35 (such as from 6 to 35, 10 to 35, 15 to 35, 20 to 35 or from 4 to 28, 4 to 25, 4 to 20 or 4 to 10) amino acid residues. The present inventors have unexpectedly found that the fusion proteins of the present invention may demonstrate an improved binding activity when the size of the second spacer is selected so that (in use) the C-terminus of the TM and the N-terminus of the translocation component are separated from one another by 40-105 angstroms, preferably by 50-100 angstroms, and more preferably by 50-90 angstroms.

[0079] Suitable second spacers may be routinely identified and obtained according to Crasto and Feng, (2000) Protein Eng. 13(5):309-312. In one embodiment, the second spacer is selected from a GS5, GS10, GS15, GS18, GS20 or HX27 spacer. The sequence of said spacers is provided in Table 2, below.

TABLE 2

SPACER    SEQUENCE                        SEQ ID NO:
GS5       GGGGSA                          66
GS10      GGGGSGGGGSA                     67
GS15      ALAGGGGSGGGGSALV                68
GS18      GGGGSGGGGSGGGGSA                69
GS20      ALAGGGGSGGGGSGGGGSALV           70
HX27      ALAAEAAAKEAAAKEAAAKAGGGGSALV    73

[0080] The inventors have surprisingly found that the presently claimed fusion proteins having said first and second spacer features display enhanced activation properties and increased yield during recombinant expression. In addition, the presently claimed fusion proteins display enhanced potency compared to fusion proteins wherein the galanin TM is C-terminal of the translocation domain component.

[0081] In one embodiment, the invention provides a single-chain polypeptide fusion protein comprising (or consisting of) an amino acid sequence having at least 80% (such as at least 85, 90, 92, 94, 95, 96, 97, 98, 99 or 100%) sequence identity to the amino acid sequence of SEQ ID NOs: 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 53, 56 and/or 59.

[0082] In one embodiment, the invention provides a single-chain polypeptide fusion protein comprising (or consisting of) an amino acid sequence having at least 80% (such as at least 85, 90, 92, 94, 95, 96, 97, 98, 99 or 100%) sequence identity to the full-length amino acid sequence of SEQ ID NOs: 10, 11, 13, 14, 16, 17, 19, 20, 22, 23, 25, 26, 28, 29, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 53, 56 and/or 59.

[0083] In one embodiment, in the single chain polypeptide, the non-cytotoxic protease component and the translocation component are linked together by a disulphide bond. Thus, following cleavage of the protease cleavage site, the polypeptide assumes a di-chain conformation, wherein the protease and translocation components remain linked together by the disulphide bond. To this end, it is preferred that the protease and translocation components are distanced apart from one another in the single chain fusion protein by a maximum of 100 amino acid residues, more preferably a maximum of 80 amino acid residues, particularly preferably by a maximum of 60 amino acid residues, and most preferably by a maximum of 50 amino acid residues.

[0084] In one embodiment, the non-cytotoxic protease component forms a disulphide bond with the translocation component of the fusion protein. For example, the amino acid residue of the protease component that forms the disulphide bond is located within the last 20, preferably within the last 10 C-terminal amino acid residues of the protease component. Similarly, the amino acid residue within the translocation component that forms the second part of the disulphide bond may be located within the first 20, preferably within the first 10 N-terminal amino acid residues of the translocation component.

[0085] Alternatively, in the single chain polypeptide, the non-cytotoxic protease component and the TM may be linked together by a disulphide bond. In this regard, the amino acid residue of the TM that forms the disulphide bond is preferably located away from the N-terminus of the TM, more preferably towards the C-terminus of the TM.

[0086] In one embodiment, the non-cytotoxic protease component forms a disulphide bond with the TM component of the fusion protein. In this regard, the amino acid residue of the protease component that forms the disulphide bond is preferably located within the last 20, more preferably within the last 10 C-terminal amino acid residues of the protease component. Similarly, the amino acid residue within the TM component that forms the second part of the disulphide bond is preferably located within the last 20, more preferably within the last 10 C-terminal amino acid residues of the TM.

[0087] The above disulphide bond arrangements have the advantage that the protease and translocation components are arranged in a manner similar to that for native clostridial neurotoxin. By way of comparison, referring to the primary amino acid sequence for native clostridial neurotoxin, the respective cysteine amino acid residues are distanced apart by between 8 and 27 amino acid residues--taken from Popoff and Marvaud, 1999, Structural and genomic features of clostridial neurotoxins, Chapter 9, in The Comprehensive Sourcebook of Bacterial Protein Toxins. eds. Alouf and Freer, Elsevier:

TABLE 3

Serotype¹    Sequence                         `Native` length between C-C    SEQ ID NO:
BoNT/A1      CVRGIITSKTKS----LDKGYNKALNDLC    23                             80
BoNT/A2      CVRGIIPFKTKS----LDEGYNKALNDLC    23                             81
BoNT/B       CKSVKAPG-------------------IC    8                              82
BoNT/C       CHKAIDGRS----------LYNKTLDC      15                             83
BoNT/D       CLRLTK---------------NSRDDSTC    12                             84
BoNT/E       CKN-IVSVK----------GIRK---SIC    13                             85
BoNT/F       CKS-VIPRK----------GTKAPP-RLC    15                             86
BoNT/G       CKPVMYKNT----------GKSE----QC    13                             87
TeNT         CKKIIPPTNIRENLYNRTASLTDLGGELC    27                             88

¹Information from proteolytic strains only
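The `Native` length values in Table 3 can be reproduced by removing the alignment gaps (`-`) from each sequence and counting the residues that lie between the two flanking cysteines. The following is a minimal sketch of that count; the sequences and expected values are copied from Table 3, and the function and variable names are hypothetical.

```python
# (aligned sequence, residues between the cysteines) as given in Table 3.
TABLE_3 = {
    "BoNT/A1": ("CVRGIITSKTKS----LDKGYNKALNDLC", 23),
    "BoNT/A2": ("CVRGIIPFKTKS----LDEGYNKALNDLC", 23),
    "BoNT/B": ("CKSVKAPG-------------------IC", 8),
    "BoNT/C": ("CHKAIDGRS----------LYNKTLDC", 15),
    "BoNT/D": ("CLRLTK---------------NSRDDSTC", 12),
    "BoNT/E": ("CKN-IVSVK----------GIRK---SIC", 13),
    "BoNT/F": ("CKS-VIPRK----------GTKAPP-RLC", 15),
    "BoNT/G": ("CKPVMYKNT----------GKSE----QC", 13),
    "TeNT": ("CKKIIPPTNIRENLYNRTASLTDLGGELC", 27),
}


def residues_between_cysteines(aligned: str) -> int:
    """Count residues between the first and last cysteine, ignoring gaps."""
    ungapped = aligned.replace("-", "")
    first = ungapped.index("C")
    last = ungapped.rindex("C")
    return last - first - 1


for serotype, (sequence, expected) in TABLE_3.items():
    print(serotype, residues_between_cysteines(sequence), expected)
```

The computed counts span 8 (BoNT/B) to 27 (TeNT), matching the range of "between 8 and 27 amino acid residues" stated in paragraph [0087].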

[0088] The fusion protein may comprise one or more purification tags, which are located N-terminal to the protease component and/or C-terminal to the translocation component.

[0089] Whilst any purification tag may be employed, the following are preferred:

His-tag (e.g., 6× histidine), preferably as a C-terminal and/or N-terminal tag
MBP-tag (maltose binding protein), preferably as an N-terminal tag
GST-tag (glutathione-S-transferase), preferably as an N-terminal tag
His-MBP-tag, preferably as an N-terminal tag
GST-MBP-tag, preferably as an N-terminal tag
Thioredoxin-tag, preferably as an N-terminal tag
CBD-tag (Chitin Binding Domain), preferably as an N-terminal tag.

[0090] According to a further embodiment of the present invention, one or more additional peptide spacer molecules may be included in the fusion protein. For example, a peptide spacer may be employed between a purification tag and the rest of the fusion protein molecule (e.g., between an N-terminal purification tag and a protease component of the present invention; and/or between a C-terminal purification tag and a translocation component of the present invention).

[0091] In accordance with a second aspect of the present invention, there is provided a DNA sequence that encodes the above-mentioned single chain polypeptide. In a preferred aspect of the present invention, the DNA sequence is prepared as part of a DNA vector, wherein the vector comprises a promoter and terminator.

[0092] In a preferred embodiment, the vector has a promoter selected from:

Promoter           Induction Agent    Typical Induction Condition
Tac (hybrid)       IPTG               0.2 mM (0.05-2.0 mM)
AraBAD             L-arabinose        0.2% (0.002-0.4%)
T7-lac operator    IPTG               0.2 mM (0.05-2.0 mM)

[0093] The DNA construct of the present invention is preferably designed in silico, and then synthesized by conventional DNA synthesis techniques.

[0094] The above-mentioned DNA sequence information is optionally modified for codon-biasing according to the ultimate host cell (e.g. E. coli) expression system that is to be employed.

[0095] The DNA backbone is preferably screened for any inherent nucleic acid sequence, which when transcribed and translated would produce an amino acid sequence corresponding to the protease cleavage site encoded by the second peptide-coding sequence. This screening may be performed manually or with the assistance of computer software (e.g., the MapDraw program by DNASTAR, Inc.).
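The backbone screening described in paragraph [0095] can be sketched as a six-frame translation followed by a search for the cleavage-site motif. This is an illustrative sketch only and does not represent the MapDraw program; it assumes the standard genetic code, uses the enterokinase motif DDDDK as the default example, and all function names and the test DNA sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def reverse_complement(dna: str) -> str:
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def frames_encoding(dna: str, motif: str = "DDDDK"):
    """Return (strand, frame offset) for each reading frame encoding the motif."""
    dna = dna.upper()
    hits = []
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            if motif in translate(seq[offset:]):
                hits.append((strand, offset))
    return hits


# Hypothetical backbone fragment encoding D-D-D-D-K in forward frame 0.
print(frames_encoding("GATGACGATGACAAA"))
```

Any frame reported by such a screen would flag an inherent sequence that could be removed at the DNA level, for example by site-directed mutagenesis as noted in paragraph [0062].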

[0096] According to a further embodiment of the present invention, there is provided a method of preparing a non-cytotoxic agent, comprising:

[0097] a. contacting a single-chain polypeptide fusion protein of the invention with a protease capable of cleaving the protease cleavage site;

[0098] b. cleaving the protease cleavage site, and thereby forming a di-chain fusion protein.

[0099] This aspect provides a di-chain polypeptide, which generally mimics the structure of clostridial holotoxin. In more detail, the resulting di-chain polypeptide typically has a structure wherein:

[0100] a. the first chain comprises the non-cytotoxic protease, which protease is capable of cleaving a protein of the exocytic fusion apparatus of a nociceptive sensory afferent;

[0101] b. the second chain comprises the galanin TM and the translocation domain that is capable of translocating the protease from within an endosome, across the endosomal membrane and into the cytosol of the nociceptive sensory afferent; and

[0102] the first and second chains are disulphide linked together.

[0103] In use, the single chain or di-chain polypeptide of the invention treats, prevents or ameliorates pain.

[0104] In use, a therapeutically effective amount of a single chain or di-chain polypeptide of the invention is administered to a patient.

[0105] According to a further aspect of the present invention, there is provided use of a single chain or di-chain polypeptide of the invention, for the manufacture of a medicament for treating, preventing or ameliorating pain.

[0106] According to a related aspect, there is provided a method of treating, preventing or ameliorating pain in a subject, comprising administering to said subject a therapeutically effective amount of a single chain or di-chain polypeptide of the invention.

[0107] The compounds described here may be used to treat a patient suffering from one or more types of chronic pain including neuropathic pain, inflammatory pain, headache pain, somatic pain, visceral pain, and referred pain.

[0108] To "treat," as used here, means to deal with medically. It includes, for example, administering a compound of the invention to prevent pain or to lessen its severity.

[0109] The term "pain," as used here, means any unpleasant sensory experience, usually associated with a physical disorder. The physical disorder may or may not be apparent to a clinician. Pain is of two types: chronic and acute. An "acute pain" is a pain of short duration having a sudden onset. One type of acute pain, for example, is cutaneous pain felt on injury to the skin or other superficial tissues, such as caused by a cut or a burn. Cutaneous nociceptors terminate just below the skin, and due to the high concentration of nerve endings, produce a well-defined, localized pain of short duration. "Chronic pain" is a pain other than an acute pain. Chronic pain includes neuropathic pain, inflammatory pain, headache pain, somatic pain, visceral pain and referred pain.

[0110] I. Neuropathic Pain

[0111] The compounds of the invention may be used to treat pain caused by or otherwise associated with any of the following neuropathic pain conditions. "Neuropathic pain" means abnormal sensory input, resulting in discomfort, from the peripheral nervous system, central nervous system, or both.

[0112] A. Symptoms of Neuropathic Pain

[0113] Symptoms of neuropathic pain can involve persistent, spontaneous pain, as well as allodynia (a painful response to a stimulus that normally is not painful), hyperalgesia (an accentuated response to a painful stimulus that usually causes only a mild discomfort, such as a pin prick), or hyperpathia (where a short discomfort becomes a prolonged severe pain).

[0114] B. Causes of Neuropathic Pain

[0115] Neuropathic pain may be caused by any of the following.

[0116] 1. A traumatic insult, such as, for example, a nerve compression injury (e.g., a nerve crush, a nerve stretch, a nerve entrapment or an incomplete nerve transection); a spinal cord injury (e.g., a hemisection of the spinal cord); a limb amputation; a contusion; an inflammation (e.g., an inflammation of the spinal cord); or a surgical procedure.

[0117] 2. An ischemic event, including, for example, a stroke and heart attack.

[0118] 3. An infectious agent.

[0119] 4. Exposure to a toxic agent, including, for example, a drug, an alcohol, a heavy metal (e.g., lead, arsenic, mercury), an industrial agent (e.g., a solvent, fumes from a glue) or nitrous oxide.

[0120] 5. A disease, including, for example, an inflammatory disorder, a neoplastic tumor, an acquired immune deficiency syndrome (AIDS), Lyme disease, a leprosy, a metabolic disease, a peripheral nerve disorder, like neuroma, a mononeuropathy or a polyneuropathy.

[0121] C. Types of Neuropathic Pain

[0122] 1. Neuralgia.

[0123] A neuralgia is a pain that radiates along the course of one or more specific nerves usually without any demonstrable pathological change in the nerve structure. The causes of neuralgia are varied. Chemical irritation, inflammation, trauma (including surgery), compression by nearby structures (for instance, tumors), and infections may all lead to neuralgia. In many cases, however, the cause is unknown or unidentifiable. Neuralgia is most common in elderly persons, but it may occur at any age. A neuralgia includes, without limitation, a trigeminal neuralgia, a postherpetic neuralgia, a glossopharyngeal neuralgia, a sciatica and an atypical facial pain.

[0124] Neuralgia is pain in the distribution of a nerve or nerves. Examples are trigeminal neuralgia (TN), atypical facial pain, and postherpetic neuralgia (caused by shingles or herpes). In trigeminal neuralgia, the affected nerves are responsible for sensing touch, temperature and pressure in the facial area from the jaw to the forehead. The disorder generally causes short episodes of excruciating pain, usually for less than two minutes and on only one side of the face. The pain can be described in a variety of ways such as "stabbing," "sharp," "like lightning," "burning," and even "itchy". In the atypical form of TN, the pain can also present as severe or merely aching and last for extended periods. The pain associated with TN is recognized as one of the most excruciating pains that can be experienced.

[0125] Simple stimuli such as eating, talking, washing the face, or any light touch or sensation can trigger an attack (even the sensation of a gentle breeze). The attacks can occur in clusters or as an isolated attack.

[0126] Symptoms include sharp, stabbing pain or constant, burning pain located anywhere, usually on or near the surface of the body, in the same location for each episode; pain along the path of a specific nerve; impaired function of the affected body part due to pain, or muscle weakness due to concomitant motor nerve damage; increased sensitivity of the skin or numbness of the affected skin area (feeling similar to a local anesthetic such as a Novocaine shot); and any touch or pressure being interpreted as pain. Movement may also be painful.

[0127] Trigeminal neuralgia is the most common form of neuralgia. It affects the main sensory nerve of the face, the trigeminal nerve ("trigeminal" literally means "three origins", referring to the division of the nerve into 3 branches). This condition involves sudden and short attacks of severe pain on the side of the face, along the area supplied by the trigeminal nerve on that side. The pain attacks may be severe enough to cause a facial grimace, which is classically referred to as a painful tic (tic douloureux). Sometimes, the cause of trigeminal neuralgia is a blood vessel or small tumor pressing on the nerve. Disorders such as multiple sclerosis (an inflammatory disease affecting the brain and spinal cord), certain forms of arthritis, and diabetes (high blood sugar) may also cause trigeminal neuralgia, but a cause is not always identified. In this condition, certain movements such as chewing, talking, swallowing, or touching an area of the face may trigger a spasm of excruciating pain.

[0128] A related but rather uncommon neuralgia affects the glosso-pharyngeal nerve, which provides sensation to the throat. Symptoms of this neuralgia are short, shock-like episodes of pain located in the throat.

[0129] Neuralgia may occur after infections such as shingles, which is caused by the varicella-zoster virus, a type of herpesvirus. This neuralgia produces a constant burning pain after the shingles rash has healed. The pain is worsened by movement of or contact with the affected area. Not all of those diagnosed with shingles go on to experience postherpetic neuralgia, which can be more painful than shingles. The pain and sensitivity can last for months or even years. The pain is usually in the form of an intolerable sensitivity to any touch but especially light touch. Postherpetic neuralgia is not restricted to the face; it can occur anywhere on the body but usually occurs at the location of the shingles rash. Depression is not uncommon due to the pain and social isolation during the illness.

[0130] Postherpetic neuralgia may be debilitating long after signs of the original herpes infection have disappeared. Other infectious diseases that may cause neuralgia are syphilis and Lyme disease.

[0131] Diabetes is another common cause of neuralgia. This very common medical problem affects almost 1 out of every 20 Americans during adulthood. Diabetes damages the tiny arteries that supply circulation to the nerves, resulting in nerve fiber malfunction and sometimes nerve loss. Diabetes can produce almost any neuralgia, including trigeminal neuralgia, carpal tunnel syndrome (pain and numbness of the hand and wrist), and meralgia paresthetica (numbness and pain in the thigh due to damage to the lateral femoral cutaneous nerve). Strict control of blood sugar may prevent diabetic nerve damage and may accelerate recovery in patients who do develop neuralgia.

[0132] Other medical conditions that may be associated with neuralgias are chronic renal insufficiency and porphyria--a hereditary disease in which the body cannot rid itself of certain substances produced after the normal breakdown of blood in the body. Certain drugs may also cause this problem.

[0133] 2. Deafferentation.

[0134] Deafferentation indicates a loss of the sensory input from a portion of the body, and can be caused by interruption of either peripheral sensory fibers or nerves from the central nervous system. A deafferentation pain syndrome includes, without limitation, an injury to the brain or spinal cord, a post-stroke pain, a phantom pain, a paraplegia, a brachial plexus avulsion injury, and a lumbar radiculopathy.

[0135] 3. Complex Regional Pain Syndromes (CRPSs)

[0136] CRPS is a chronic pain syndrome resulting from sympathetically-maintained pain, and presents in two forms. CRPS 1 currently replaces the term "reflex sympathetic dystrophy syndrome". It is a chronic nerve disorder that occurs most often in the arms or legs after a minor or major injury. CRPS 1 is associated with severe pain; changes in the nails, bone, and skin; and an increased sensitivity to touch in the affected limb. CRPS 2 replaces the term causalgia, and results from an identified injury to the nerve. A CRPS includes, without limitation, a CRPS Type I (reflex sympathetic dystrophy) and a CRPS Type II (causalgia).

[0137] 4. Neuropathy.

[0138] A neuropathy is a functional or pathological change in a nerve and is characterized clinically by sensory or motor neuron abnormalities.

[0139] Central neuropathy is a functional or pathological change in the central nervous system.

[0140] Peripheral neuropathy is a functional or pathological change in one or more peripheral nerves. The peripheral nerves relay information from the central nervous system (brain and spinal cord) to muscles and other organs and from the skin, joints, and other organs back to the brain. Peripheral neuropathy occurs when these nerves fail to carry information to and from the brain and spinal cord, resulting in pain, loss of sensation, or inability to control muscles. In some cases, the failure of nerves that control blood vessels, intestines, and other organs results in abnormal blood pressure, digestion problems, and loss of other basic body processes. Risk factors for neuropathy include diabetes, heavy alcohol use, and exposure to certain chemicals and drugs. Some people have a hereditary predisposition for neuropathy. Prolonged pressure on a nerve is another risk for developing a nerve injury. Pressure injury may be caused by prolonged immobility (such as a long surgical procedure or lengthy illness) or compression of a nerve by casts, splints, braces, crutches, or other devices. Polyneuropathy implies a widespread process that usually affects both sides of the body equally. The symptoms depend on which type of nerve is affected. The three main types of nerves are sensory, motor, and autonomic. Neuropathy can affect any one or a combination of all three types of nerves. Symptoms also depend on whether the condition affects the whole body or just one nerve (as from an injury). The cause of chronic inflammatory polyneuropathy is an abnormal immune response. The specific antigens, immune processes, and triggering factors are variable and in many cases are unknown. It may occur in association with other conditions such as HIV, inflammatory bowel disease, lupus erythematosus, chronic active hepatitis, and blood cell abnormalities.

[0141] Peripheral neuropathy may involve a functional or pathological change to a single nerve or nerve group (mononeuropathy) or a functional or pathological change affecting multiple nerves (polyneuropathy).

[0142] Peripheral Neuropathies

[0143] Hereditary disorders
[0144] Charcot-Marie-Tooth disease
[0145] Friedreich's ataxia

[0146] Systemic or metabolic disorders
[0147] Diabetes (diabetic neuropathy)
[0148] Dietary deficiencies (especially vitamin B-12)
[0149] Excessive alcohol use (alcoholic neuropathy)
[0150] Uremia (from kidney failure)
[0151] Cancer

[0152] Infectious or Inflammatory Conditions
[0153] AIDS
[0154] Hepatitis
[0155] Colorado tick fever
[0156] diphtheria
[0157] Guillain-Barre syndrome
[0158] HIV infection without development of AIDS
[0159] leprosy
[0160] Lyme
[0161] polyarteritis nodosa
[0162] rheumatoid arthritis
[0163] sarcoidosis
[0164] Sjogren syndrome
[0165] syphilis
[0166] systemic lupus erythematosus
[0167] amyloid

[0168] Exposure to Toxic Compounds
[0169] sniffing glue or other toxic compounds
[0170] nitrous oxide
[0171] industrial agents--especially solvents
[0172] heavy metals (lead, arsenic, mercury, etc.)
[0173] Neuropathy secondary to drugs like analgesic nephropathy

[0174] Miscellaneous Causes
[0175] ischemia (decreased oxygen/decreased blood flow)
[0176] prolonged exposure to cold temperature

[0177] a. Polyneuropathy

[0178] Polyneuropathy is a peripheral neuropathy involving the loss of movement or sensation to an area caused by damage or destruction to multiple peripheral nerves. Polyneuropathic pain includes, without limitation, post-polio syndrome, postmastectomy syndrome, diabetic neuropathy, alcohol neuropathy, amyloid, toxins, AIDS, hypothyroidism, uremia, vitamin deficiencies, chemotherapy-induced pain, 2',3'-dideoxycytidine (ddC) treatment, Guillain-Barre syndrome or Fabry's disease.

[0179] b. Mononeuropathy

[0180] Mononeuropathy is a peripheral neuropathy involving loss of movement or sensation to an area caused by damage or destruction to a single peripheral nerve or nerve group. Mononeuropathy is most often caused by damage to a local area resulting from injury or trauma, although occasionally systemic disorders may cause isolated nerve damage (as with mononeuritis multiplex). The usual causes are direct trauma, prolonged pressure on the nerve, and compression of the nerve by swelling or injury to nearby body structures. The damage includes destruction of the myelin sheath (covering) of the nerve or of part of the nerve cell (the axon). This damage slows or prevents conduction of impulses through the nerve. Mononeuropathy may involve any part of the body. Mononeuropathic pain includes, without limitation, a sciatic nerve dysfunction, a common peroneal nerve dysfunction, a radial nerve dysfunction, an ulnar nerve dysfunction, a cranial mononeuropathy VI, a cranial mononeuropathy VII, a cranial mononeuropathy III (compression type), a cranial mononeuropathy III (diabetic type), an axillary nerve dysfunction, a carpal tunnel syndrome, a femoral nerve dysfunction, a tibial nerve dysfunction, a Bell's palsy, a thoracic outlet syndrome and a sixth (abducent) nerve palsy.

[0181] c. Generalized Peripheral Neuropathies

[0182] Generalized peripheral neuropathies are symmetrical, and usually due to various systemic illnesses and disease processes that affect the peripheral nervous system in its entirety. They are further subdivided into several categories:

[0183] i. Distal axonopathies are the result of some metabolic or toxic derangement of neurons. They may be caused by metabolic diseases such as diabetes, renal failure, deficiency syndromes such as malnutrition and alcoholism, or the effects of toxins or drugs. Distal axonopathy (aka dying back neuropathy) is a type of peripheral neuropathy that results from some metabolic or toxic derangement of peripheral nervous system (PNS) neurons. It is the most common response of nerves to metabolic or toxic disturbances, and as such may be caused by metabolic diseases such as diabetes, renal failure, deficiency syndromes such as malnutrition and alcoholism, or the effects of toxins or drugs. The most common cause of distal axonopathy is diabetes, and the most common distal axonopathy is diabetic neuropathy.

[0184] ii. Myelinopathies are due to a primary attack on myelin causing an acute failure of impulse conduction. The most common cause is acute inflammatory demyelinating polyneuropathy (AIDP; aka Guillain-Barre syndrome), though other causes include chronic inflammatory demyelinating polyneuropathy (CIDP), genetic metabolic disorders (e.g., leukodystrophy), or toxins. Myelinopathy is due to primary destruction of myelin or the myelinating Schwann cells, which leaves the axon intact, but causes an acute failure of impulse conduction. This demyelination slows down or completely blocks the conduction of electrical impulses through the nerve. The most common cause is acute inflammatory demyelinating polyneuropathy (AIDP, better known as Guillain-Barre syndrome), though other causes include chronic inflammatory demyelinating polyneuropathy (CIDP), genetic metabolic disorders (e.g., leukodystrophy or Charcot-Marie-Tooth disease), or toxins.

[0185] iii. Neuronopathies are the result of destruction of peripheral nervous system (PNS) neurons. They may be caused by motor neuron diseases, sensory neuronopathies (e.g., Herpes zoster), toxins or autonomic dysfunction. Neurotoxins may cause neuronopathies, such as the chemotherapy agent vincristine. Neuronopathy is dysfunction due to damage to neurons of the peripheral nervous system (PNS), resulting in a peripheral neuropathy. It may be caused by motor neuron diseases, sensory neuronopathies (e.g., Herpes zoster), toxic substances or autonomic dysfunction. A person with neuronopathy may present in different ways, depending on the cause, the way it affects the nerve cells, and the type of nerve cell that is most affected.

[0186] Focal entrapment neuropathies (e.g., carpal tunnel syndrome).

[0187] II. Inflammatory Pain

[0188] The compounds of the invention may be used to treat pain caused by or otherwise associated with any of the following inflammatory conditions.

[0189] A. Arthritic Disorder

[0190] Arthritic disorders include, for example, a rheumatoid arthritis; a juvenile rheumatoid arthritis; a systemic lupus erythematosus (SLE); a gouty arthritis; a scleroderma; an osteoarthritis; a psoriatic arthritis; an ankylosing spondylitis; a Reiter's syndrome (reactive arthritis); an adult Still's disease; an arthritis from a viral infection; an arthritis from a bacterial infection, such as, e.g., a gonococcal arthritis and a non-gonococcal bacterial arthritis (septic arthritis); a tertiary Lyme disease; a tuberculous arthritis; and an arthritis from a fungal infection, such as, e.g., a blastomycosis.

[0191] B. Autoimmune Diseases

[0192] Autoimmune diseases include, for example, a Guillain-Barre syndrome, a Hashimoto's thyroiditis, a pernicious anemia, an Addison's disease, a type I diabetes, a systemic lupus erythematosus, a dermatomyositis, a Sjogren's syndrome, a lupus erythematosus, a multiple sclerosis, a myasthenia gravis, a Reiter's syndrome and a Graves' disease.

[0193] C. Connective Tissue Disorder

[0194] Connective tissue disorders include, for example, a spondyloarthritis, a dermatomyositis, and a fibromyalgia.

[0195] D. Injury

[0196] Inflammation caused by injury, including, for example, a crush, puncture, stretch of a tissue or joint, may cause chronic inflammatory pain.

[0197] E. Infection

[0198] Inflammation caused by infection, including, for example, a tuberculosis or an interstitial keratitis, may cause chronic inflammatory pain.

[0199] F. Neuritis

[0200] Neuritis is an inflammatory process affecting a nerve or group of nerves. Symptoms depend on the nerves involved, but may include pain, paresthesias, paresis, or hypesthesia (numbness).
[0201] Examples include:
[0202] a. Brachial neuritis
[0203] b. Retrobulbar neuropathy, an inflammatory process affecting the part of the optic nerve lying immediately behind the eyeball.
[0204] c. Optic neuropathy, an inflammatory process affecting the optic nerve causing sudden, reduced vision in the affected eye. The cause of optic neuritis is unknown. The sudden inflammation of the optic nerve (the nerve connecting the eye and the brain) leads to swelling and destruction of the myelin sheath. The inflammation may occasionally be the result of a viral infection, or it may be caused by autoimmune diseases such as multiple sclerosis. Risk factors are related to the possible causes.
[0205] d. Vestibular neuritis, a viral infection causing an inflammatory process affecting the vestibular nerve.

[0206] G. Joint Inflammation

[0207] Inflammation of the joint, such as that caused by bursitis or tendonitis, for example, may cause chronic inflammatory pain.

[0208] III. Headache Pain

[0209] The compounds of the invention may be used to treat pain caused by or otherwise associated with any of the following headache conditions. A headache (medically known as cephalgia) is a condition of mild to severe pain in the head; sometimes neck or upper back pain may also be interpreted as a headache. It may indicate an underlying local or systemic disease or be a disorder in itself.

[0210] A. Muscular/Myogenic Headache

[0211] Muscular/myogenic headaches appear to involve the tightening or tensing of facial and neck muscles; they may radiate to the forehead. Tension headache is the most common form of myogenic headache.

[0212] A tension headache is a condition involving pain or discomfort in the head, scalp, or neck, usually associated with muscle tightness in these areas. Tension headaches result from the contraction of neck and scalp muscles. One cause of this muscle contraction is a response to stress, depression or anxiety. Any activity that causes the head to be held in one position for a long time without moving can cause a headache. Such activities include typing or use of computers, fine work with the hands, and use of a microscope. Sleeping in a cold room or sleeping with the neck in an abnormal position may also trigger this type of headache. A tension-type headache includes, without limitation, an episodic tension headache and a chronic tension headache.

[0213] B. Vascular Headache

[0214] The most common type of vascular headache is migraine. Other kinds of vascular headaches include cluster headaches, which cause repeated episodes of intense pain, and headaches resulting from high blood pressure.

[0215] 1. Migraine

[0216] A migraine is a heterogeneous disorder that generally involves recurring headaches. Migraines are different from other headaches because they occur with other symptoms, such as, e.g., nausea, vomiting, or sensitivity to light. In most people, a throbbing pain is felt only on one side of the head. Clinical features such as type of aura symptoms, presence of prodromes, or associated symptoms such as vertigo, may be seen in subgroups of patients with different underlying pathophysiological and genetic mechanisms. A migraine headache includes, without limitation, a migraine without aura (common migraine), a migraine with aura (classic migraine), a menstrual migraine, a migraine equivalent (acephalic headache), a complicated migraine, an abdominal migraine and a mixed tension migraine.

[0217] 2. Cluster Headache

[0218] Cluster headaches affect one side of the head (unilateral) and may be associated with tearing of the eyes and nasal congestion. They occur in clusters, happening repeatedly every day at the same time for several weeks and then remitting.

[0219] D. High Blood Pressure Headache

[0220] E. Traction and Inflammatory Headache

[0221] Traction and inflammatory headaches are usually symptoms of other disorders, ranging from stroke to sinus infection.

[0222] F. Hormone Headache

[0223] G. Rebound Headache

[0224] Rebound headaches, also known as medication overuse headaches, occur when medication is taken too frequently to relieve headache. Rebound headaches frequently occur daily and can be very painful.

[0225] H. Chronic Sinusitis Headache

[0226] Sinusitis is inflammation, either bacterial, fungal, viral, allergic or autoimmune, of the paranasal sinuses. Chronic sinusitis is one of the most common complications of the common cold. Symptoms include: nasal congestion; facial pain; headache; fever; general malaise; thick green or yellow discharge; feeling of facial `fullness` worsening on bending over. In a small number of cases, chronic maxillary sinusitis can also be brought on by the spreading of bacteria from a dental infection. Chronic hyperplastic eosinophilic sinusitis is a non-infective form of chronic sinusitis.

[0227] I. An Organic Headache

[0228] J. Ictal Headaches

[0229] Ictal headaches are headaches associated with seizure activity.

[0230] IV. Somatic Pain

[0231] The compounds of the invention may be used to treat pain caused by or otherwise associated with any of the following somatic pain conditions. Somatic pain originates from ligaments, tendons, bones, blood vessels, and even nerves themselves. It is detected with somatic nociceptors. The scarcity of pain receptors in these areas produces a dull, poorly-localized pain of longer duration than cutaneous pain; examples include sprains and broken bones. Additional examples include the following.

[0232] A. Excessive Muscle Tension

[0233] Excessive muscle tension can be caused, for example, by a sprain or a strain.

[0234] B. Repetitive Motion Disorders

[0235] Repetitive motion disorders can result from overuse of the hands, wrists, elbows, shoulders, neck, back, hips, knees, feet, legs, or ankles.

[0236] C. Muscle Disorders

[0237] Muscle disorders causing somatic pain include, for example, a polymyositis, a dermatomyositis, a lupus, a fibromyalgia, a polymyalgia rheumatica, and a rhabdomyolysis.

[0238] D. Myalgia

[0239] Myalgia is muscle pain and is a symptom of many diseases and disorders. The most common cause for myalgia is either overuse or over-stretching of a muscle or group of muscles. Myalgia without a traumatic history is often due to viral infections. Longer-term myalgias may be indicative of a metabolic myopathy, some nutritional deficiencies or chronic fatigue syndrome.

[0240] E. Infection

[0241] Infection can cause somatic pain. Examples of such infection include an abscess in the muscle, a trichinosis, an influenza, a Lyme disease, a malaria, a Rocky Mountain spotted fever, avian influenza, the common cold, community-acquired pneumonia, meningitis, monkeypox, Severe Acute Respiratory Syndrome, toxic shock syndrome, typhoid fever, and upper respiratory tract infection.

[0242] F. Drugs

[0243] Drugs can cause somatic pain. Such drugs include, for example, cocaine, a statin for lowering cholesterol (such as atorvastatin, simvastatin, and lovastatin), and an ACE inhibitor for lowering blood pressure (such as enalapril and captopril).

[0244] V. Visceral Pain

[0245] The compounds of the invention may be used to treat pain caused by or otherwise associated with any of the following visceral pain conditions. Visceral pain originates from the body's viscera, or organs. Visceral nociceptors are located within body organs and internal cavities. The even greater scarcity of nociceptors in these areas produces pain that is usually more aching and of a longer duration than somatic pain. Visceral pain is extremely difficult to localize, and several injuries to visceral tissue exhibit "referred" pain, where the sensation is localized to an area completely unrelated to the site of injury. Examples of visceral pain include the following:

[0246] A. Functional Visceral Pain

[0247] Functional visceral pain includes, for example, an irritable bowel syndrome and a chronic functional abdominal pain (CFAP), a functional constipation and a functional dyspepsia, a non-cardiac chest pain (NCCP) and a chronic abdominal pain.

[0248] B. Chronic Gastrointestinal Inflammation

[0249] Chronic gastrointestinal inflammation includes, for example, a gastritis, an inflammatory bowel disease, like, e.g., a Crohn's disease, an ulcerative colitis, a microscopic colitis, a diverticulitis and a gastroenteritis; an interstitial cystitis; an intestinal ischemia; a cholecystitis; an appendicitis; a gastroesophageal reflux; an ulcer, a nephrolithiasis, a urinary tract infection, a pancreatitis and a hernia.

[0250] C. Autoimmune Pain

[0251] Autoimmune pain includes, for example, a sarcoidosis and a vasculitis.

[0252] D. Organic Visceral Pain

[0253] Organic visceral pain includes, for example, pain resulting from a traumatic, inflammatory or degenerative lesion of the gut or produced by a tumor impinging on sensory innervation.

[0254] E. Treatment-Induced Visceral Pain

[0255] Treatment-induced visceral pain includes, for example, a pain attendant to chemotherapy or a pain attendant to radiation therapy.

[0256] VI. Referred Pain

[0257] The compounds of the invention may be used to treat pain caused by or otherwise associated with any of the following referred pain conditions.

[0258] Referred pain arises from pain localized to an area separate from the site of pain stimulation. Often, referred pain arises when a nerve is compressed or damaged at or near its origin. In this circumstance, the sensation of pain will generally be felt in the territory that the nerve serves, even though the damage originates elsewhere. A common example occurs in intervertebral disc herniation, in which a nerve root arising from the spinal cord is compressed by adjacent disc material. Although pain may arise from the damaged disc itself, pain will also be felt in the region served by the compressed nerve (for example, the thigh, knee, or foot). Relieving the pressure on the nerve root may ameliorate the referred pain, provided that permanent nerve damage has not occurred. Myocardial ischaemia (the loss of blood flow to a part of the heart muscle tissue) is possibly the best known example of referred pain; the sensation can occur in the upper chest as a restricted feeling, or as an ache in the left shoulder, arm or even hand.

[0259] The present invention addresses a wide range of pain conditions, in particular chronic pain conditions. Preferred conditions include cancerous and non-cancerous pain, inflammatory pain and neuropathic pain. The opioid-fusions of the present application are particularly suited to addressing inflammatory pain, though may be less suited to addressing neuropathic pain. The galanin-fusions are more suited to addressing neuropathic pain.

[0260] In use, the polypeptides of the present invention are typically employed in the form of a pharmaceutical composition in association with a pharmaceutical carrier, diluent and/or excipient, although the exact form of the composition may be tailored to the mode of administration. Administration is preferably to a mammal, more preferably to a human.

[0261] The polypeptides may, for example, be employed in the form of a sterile solution for intra-articular administration or intra-cranial administration. Spinal injection (e.g., epidural or intrathecal) is preferred.

[0262] The dosage ranges for administration of the polypeptides of the present invention are those to produce the desired therapeutic effect. It will be appreciated that the dosage range required depends on the precise nature of the components, the route of administration, the nature of the formulation, the age of the patient, the nature, extent or severity of the patient's condition, contraindications, if any, and the judgment of the attending physician.

[0263] Suitable daily dosages are in the range 0.0001 to 1 mg/kg, preferably 0.0001 to 0.5 mg/kg, more preferably 0.002 to 0.5 mg/kg, and particularly preferably 0.004 to 0.5 mg/kg. The unit dosage can vary from less than 1 microgram to 30 mg, but typically will be in the region of 0.01 to 1 mg per dose, which may be administered daily or preferably less frequently, such as weekly or six-monthly.
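
The per-kilogram ranges above can be turned into absolute daily amounts for a given body weight. The following is a minimal sketch of that arithmetic only; the band labels, the function name `daily_dose_range_mg`, and the 70 kg body weight are illustrative assumptions and do not appear in the specification.

```python
# Daily dosage bands quoted in paragraph [0263], in mg per kg body weight.
# The dictionary keys are illustrative labels, not terms defined by the text.
DAILY_RANGES_MG_PER_KG = {
    "suitable": (0.0001, 1.0),
    "preferred": (0.0001, 0.5),
    "more preferred": (0.002, 0.5),
    "particularly preferred": (0.004, 0.5),
}


def daily_dose_range_mg(body_weight_kg, band="suitable"):
    """Return (low, high) daily amounts in mg for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low, high = DAILY_RANGES_MG_PER_KG[band]
    return (low * body_weight_kg, high * body_weight_kg)


if __name__ == "__main__":
    weight_kg = 70.0  # hypothetical body weight, chosen only for illustration
    for band in DAILY_RANGES_MG_PER_KG:
        low, high = daily_dose_range_mg(weight_kg, band)
        print(f"{band}: {low:.4f} to {high:.1f} mg per day")
```

The sketch performs unit conversion only; it does not select a dose, which the text leaves to the factors listed in paragraph [0262].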

[0264] A particularly preferred dosing regimen is based on 2.5 ng of fusion protein as the 1× dose. In this regard, preferred dosages are in the range 1× to 100× (i.e., 2.5 to 250 ng). This dosage range is significantly lower (i.e., at least 10-fold, typically 100-fold lower) than would be employed with other types of analgesic molecules such as NSAIDs, morphine, and gabapentin. Moreover, the above-mentioned difference is considerably magnified when the same comparison is made on a molar basis--this is because the fusion proteins of the present invention have a considerably greater molecular weight than do conventional 'small' molecule therapeutics.
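
The dose multiples and the molar comparison described above can be illustrated with a short calculation. This is a sketch under stated assumptions: the fusion protein molecular weight of 100,000 g/mol is a hypothetical round figure (the specification gives no value here), the function names are illustrative, and the morphine free-base molecular weight of about 285.34 g/mol is used only as an example of a small molecule.

```python
BASE_DOSE_NG = 2.5  # the 1x dose stated in paragraph [0264]


def dose_ng(multiple):
    """Mass in ng for a dose multiple within the stated 1x to 100x range."""
    if not 1 <= multiple <= 100:
        raise ValueError("multiple must lie between 1 and 100")
    return BASE_DOSE_NG * multiple


def mass_to_picomoles(mass_ng, molecular_weight_g_per_mol):
    """Convert a mass in ng to an amount in pmol (ng / (g/mol) = nmol)."""
    if molecular_weight_g_per_mol <= 0:
        raise ValueError("molecular weight must be positive")
    return mass_ng / molecular_weight_g_per_mol * 1000.0


if __name__ == "__main__":
    FUSION_MW = 100_000.0  # ASSUMPTION: hypothetical fusion protein MW, g/mol
    MORPHINE_MW = 285.34   # morphine free base, g/mol (small-molecule example)

    top_dose = dose_ng(100)
    print(f"100x dose: {top_dose} ng")
    # The same mass corresponds to far fewer moles of the larger molecule,
    # which is why the dose difference widens on a molar basis.
    print(f"as fusion protein: {mass_to_picomoles(top_dose, FUSION_MW):.2f} pmol")
    print(f"same mass of morphine: {mass_to_picomoles(top_dose, MORPHINE_MW):.1f} pmol")
```

Under these assumed molecular weights, equal masses differ by a factor of roughly 350 in molar terms, in addition to the 10-fold to 100-fold mass difference stated in the text.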

[0265] Wide variations in the required dosage, however, are to be expected depending on the precise nature of the components, and the differing efficiencies of various routes of administration.

[0266] Variations in these dosage levels can be adjusted using standard empirical routines for optimization, as is well understood in the art.

[0267] Compositions suitable for injection may be in the form of solutions, suspensions or emulsions, or dry powders which are dissolved or suspended in a suitable vehicle prior to use.

[0268] Fluid unit dosage forms are typically prepared utilizing a pyrogen-free sterile vehicle. The active ingredients, depending on the vehicle and concentration used, can be either dissolved or suspended in the vehicle.

[0269] In preparing administrable solutions, the polypeptides can be dissolved in a vehicle, the solution being made isotonic if necessary by addition of sodium chloride and sterilized by filtration through a sterile filter using aseptic techniques before filling into suitable sterile vials or ampoules and sealing. Alternatively, if solution stability is adequate, the solution in its sealed containers may be sterilized by autoclaving.

[0270] Advantageously additives such as buffering, solubilizing, stabilizing, preservative or bactericidal, suspending or emulsifying agents may be dissolved in the vehicle.

[0271] Dry powders which are dissolved or suspended in a suitable vehicle prior to use may be prepared by filling pre-sterilized drug substance and other ingredients into a sterile container using aseptic technique in a sterile area.

[0272] Alternatively the polypeptides and other ingredients may be dissolved in an aqueous vehicle, the solution is sterilized by filtration and distributed into suitable containers using aseptic technique in a sterile area. The product is then freeze dried and the containers are sealed aseptically.

[0273] Parenteral suspensions, suitable for intramuscular, subcutaneous or intradermal injection, are prepared in substantially the same manner, except that the sterile components are suspended in the sterile vehicle instead of being dissolved, and sterilization cannot be accomplished by filtration. The components may be isolated in a sterile state or, alternatively, they may be sterilized after isolation, e.g., by gamma irradiation.

[0274] Advantageously, a suspending agent, for example polyvinylpyrrolidone, is included in the composition(s) to facilitate uniform distribution of the components.

DEFINITIONS SECTION

[0275] Targeting Moiety (TM) means any chemical structure associated with an agent that functionally interacts with a Binding Site to cause a physical association between the agent and the surface of a target cell. In the context of the present invention, the target cell is a nociceptive sensory afferent. The term TM embraces any molecule (i.e., a naturally occurring molecule, or a chemically/physically modified variant thereof) that is capable of binding to a Binding Site on the target cell, which Binding Site is capable of internalization (e.g., endosome formation)--also referred to as receptor-mediated endocytosis. The TM may possess an endosomal membrane translocation function, in which case separate TM and Translocation Domain components need not be present in an agent of the present invention.

[0276] The TM of the present invention binds (preferably specifically binds) to a nociceptive sensory afferent (e.g., a primary nociceptive afferent). In this regard, specifically binds means that the TM binds to a nociceptive sensory afferent (e.g., a primary nociceptive afferent) with a greater affinity than it binds to other neurons such as non-nociceptive afferents, and/or to motor neurons (i.e., the natural target for clostridial neurotoxin holotoxin). The term "specifically binding" can also mean that a given TM binds to a given receptor, for example galanin receptors, such as GALR1, GALR2 and/or GALR3 receptors, with a binding affinity (Ka) of 10^6 M^-1 or greater, preferably 10^7 M^-1 or greater, more preferably 10^8 M^-1 or greater, and most preferably 10^9 M^-1 or greater.
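
The affinity thresholds above are stated as association constants (Ka, in M^-1). Binding data are often reported instead as a dissociation constant (Kd, in M), and for simple 1:1 binding Kd = 1/Ka, so a Ka of 10^6 M^-1 corresponds to a Kd of 1 micromolar and a Ka of 10^9 M^-1 to a Kd of 1 nanomolar. The following sketch shows the conversion and a lookup against the four thresholds; it assumes 1:1 binding, and the function names and tier labels are illustrative rather than terms from the specification.

```python
# Ka thresholds from paragraph [0276], highest first (units: M^-1).
# The labels are illustrative shorthand for the preference levels in the text.
KA_TIERS = [
    (1e9, "most preferred"),
    (1e8, "more preferred"),
    (1e7, "preferred"),
    (1e6, "specific binding"),
]


def kd_from_ka(ka_per_molar):
    """Dissociation constant in M, assuming simple 1:1 binding (Kd = 1/Ka)."""
    if ka_per_molar <= 0:
        raise ValueError("Ka must be positive")
    return 1.0 / ka_per_molar


def affinity_tier(ka_per_molar):
    """Return the highest tier whose Ka threshold is met, or None."""
    for threshold, label in KA_TIERS:
        if ka_per_molar >= threshold:
            return label
    return None


if __name__ == "__main__":
    for ka in (5e5, 1e6, 5e8, 2e9):  # hypothetical example values
        print(f"Ka={ka:.1e} M^-1  Kd={kd_from_ka(ka):.1e} M  tier={affinity_tier(ka)}")
```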

[0277] For the purposes of this invention, an agonist is defined as a molecule that is capable of stimulating the process of exocytic fusion in a target cell, which process is susceptible to inhibition by a protease capable of cleaving a protein of the exocytic fusion apparatus in said target cell.

[0278] Accordingly, the particular agonist definition of the present invention would exclude many molecules that would be conventionally considered as agonists.

[0279] For example, nerve growth factor (NGF) is an agonist in respect of its ability to promote neuronal differentiation via binding to a TrkA receptor. However, NGF is not an agonist when assessed by the above criteria because it is not a principal inducer of exocytic fusion. In addition, the process that NGF stimulates (i.e., cell differentiation) is not susceptible to inhibition by the protease activity of a non-cytotoxic toxin molecule.

[0280] The term "fragment", when used in relation to a protein, means a peptide having at least thirty-five, preferably at least twenty-five, more preferably at least twenty, and most preferably at least 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6 or 5 amino acid residues of the protein in question.

[0281] The term "variant", when used in relation to a protein, means a peptide or peptide fragment of the protein that contains one or more analogues of an amino acid (e.g., an unnatural amino acid), or a substituted linkage.

[0282] The term "derivative", when used in relation to a protein, means a protein that comprises the protein in question, and a further peptide sequence. The further peptide sequence should preferably not interfere with the basic folding and thus conformational structure of the original protein. Two or more peptides (or fragments, or variants) may be joined together to form a derivative. Alternatively, a peptide (or fragment, or variant) may be joined to an unrelated molecule (e.g., a second, unrelated peptide). Derivatives may be chemically synthesized, but will be typically prepared by recombinant nucleic acid methods. Additional components such as lipid, and/or polysaccharide, and/or polypeptide components may be included.

[0283] The term non-cytotoxic means that the protease molecule in question does not kill the target cell to which it has been re-targeted.

[0284] The protease of the present invention embraces all naturally-occurring non-cytotoxic proteases that are capable of cleaving one or more proteins of the exocytic fusion apparatus in eukaryotic cells.

[0285] The non-cytotoxic protease of the present invention is preferably a bacterial protease. In one embodiment, the non-cytotoxic protease is selected from the genera Clostridium or Neisseria (e.g., a clostridial L-chain, or a neisserial IgA protease preferably from N. gonorrhoeae). The term protease embraces functionally equivalent fragments and molecules thereof.

[0286] The present invention also embraces modified non-cytotoxic proteases, which include amino acid sequences that do not occur in nature and/or synthetic amino acid residues, so long as the modified proteases still demonstrate the above-mentioned protease activity.

[0287] The protease of the present invention preferably demonstrates a serine or metalloprotease activity (e.g., endopeptidase activity). The protease is preferably specific for a SNARE protein (e.g., SNAP-25, synaptobrevin/VAMP, or syntaxin).

[0288] Particular mention is made to the protease domains of neurotoxins, for example the protease domains of bacterial neurotoxins. Thus, the present invention embraces the use of neurotoxin domains, which occur in nature, as well as recombinantly prepared versions of said naturally-occurring neurotoxins.

[0289] Exemplary neurotoxins are produced by clostridia, and the term clostridial neurotoxin embraces neurotoxins produced by C. tetani (TeNT), and by C. botulinum (BoNT) serotypes A through G, as well as the closely related BoNT-like neurotoxins produced by C. baratii and C. butyricum. The above-mentioned abbreviations are used throughout the present specification. For example, the nomenclature BoNT/A denotes the source of neurotoxin as BoNT (serotype A). Corresponding nomenclature applies to other BoNT serotypes.

[0290] The term L-chain fragment means a component of the L-chain of a neurotoxin, which fragment demonstrates a metalloprotease activity and is capable of proteolytically cleaving a vesicle and/or plasma membrane associated protein involved in cellular exocytosis.

[0291] A Translocation Domain is a molecule that enables translocation of a protease (or fragment thereof) into a target cell such that a functional expression of protease activity occurs within the cytosol of the target cell. Whether any molecule (e.g., a protein or peptide) possesses the requisite translocation function of the present invention may be confirmed by any one of a number of conventional assays.

[0292] For example, Shone, (1987) Eur. J. Biochem 167(1):175-180, describes an in vitro assay employing liposomes, which are challenged with a test molecule. Presence of the requisite translocation function is confirmed by release from the liposomes of K.sup.+ and/or labeled NAD, which may be readily monitored.

[0293] A further example is provided by Blaustein, (1987) FEBS Letts. 226:115-120, which describes a simple in vitro assay employing planar phospholipid bilayer membranes. The membranes are challenged with a test molecule and the requisite translocation function is confirmed by an increase in conductance across said membranes.

[0294] Additional methodology to enable assessment of membrane fusion and thus identification of Translocation Domains suitable for use in the present invention are provided by Methods in Enzymology Vol. 220 and 221, Membrane Fusion Techniques, Parts A and B, Academic Press 1993.

[0295] The Translocation Domain is preferably capable of formation of ion-permeable pores in lipid membranes under conditions of low pH. It has been found preferable to use only those portions of the protein molecule capable of pore-formation within the endosomal membrane.

[0296] The Translocation Domain may be obtained from a microbial protein source, in particular from a bacterial or viral protein source. Hence, in one embodiment, the Translocation Domain is a translocation domain of an enzyme, such as a bacterial toxin or viral protein.

[0297] It is well documented that certain domains of bacterial toxin molecules are capable of forming such pores. It is also known that certain translocation domains of virally expressed membrane fusion proteins are capable of forming such pores. Such domains may be employed in the present invention.

[0298] The Translocation Domain may be of a clostridial origin, namely the H.sub.N domain (or a functional component thereof). H.sub.N means a portion or fragment of the H-chain of a clostridial neurotoxin approximately equivalent to the amino-terminal half of the H-chain, or the domain corresponding to that fragment in the intact H-chain. It is preferred that the H-chain substantially lacks the natural binding function of the H.sub.C component of the H-chain. In this regard, the H.sub.C function may be removed by deletion of the H.sub.C amino acid sequence (either at the DNA synthesis level, or at the post-synthesis level by nuclease or protease treatment). Alternatively, the H.sub.C function may be inactivated by chemical or biological treatment. Thus, the H-chain is preferably incapable of binding to the Binding Site on a target cell to which native clostridial neurotoxin (i.e., holotoxin) binds.

[0299] In one embodiment, the translocation domain is a H.sub.N domain (or a fragment thereof) of a clostridial neurotoxin. Examples of suitable clostridial Translocation Domains include:

[0300] Botulinum type A neurotoxin--amino acid residues (449-871)

[0301] Botulinum type B neurotoxin--amino acid residues (441-858)

[0302] Botulinum type C neurotoxin--amino acid residues (442-866)

[0303] Botulinum type D neurotoxin--amino acid residues (446-862)

[0304] Botulinum type E neurotoxin--amino acid residues (423-845)

[0305] Botulinum type F neurotoxin--amino acid residues (440-864)

[0306] Botulinum type G neurotoxin--amino acid residues (442-863)

[0307] Tetanus neurotoxin--amino acid residues (458-879)
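The residue ranges listed above can be applied to a full-length neurotoxin sequence by simple slicing. The following is a minimal sketch only, assuming that the ranges are 1-based and inclusive; the dictionary keys and the function names `extract_hn` and `hn_length` are hypothetical and do not form part of the specification.

```python
# Residue ranges of the clostridial H_N translocation domains listed above.
# Keys are illustrative labels; ranges are assumed to be 1-based and inclusive.
HN_RANGES = {
    "BoNT/A": (449, 871),
    "BoNT/B": (441, 858),
    "BoNT/C": (442, 866),
    "BoNT/D": (446, 862),
    "BoNT/E": (423, 845),
    "BoNT/F": (440, 864),
    "BoNT/G": (442, 863),
    "TeNT": (458, 879),
}


def hn_length(toxin: str) -> int:
    """Number of residues in the listed H_N range for the given toxin."""
    start, end = HN_RANGES[toxin]
    return end - start + 1


def extract_hn(full_sequence: str, toxin: str) -> str:
    """Slice the H_N range out of a full-length amino acid sequence."""
    start, end = HN_RANGES[toxin]
    if len(full_sequence) < end:
        raise ValueError("sequence is shorter than the listed H_N range")
    # Convert the 1-based inclusive range to a Python slice.
    return full_sequence[start - 1:end]
```

For example, `hn_length("BoNT/A")` evaluates to 423 residues for the range 449-871.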

[0308] For further details on the genetic basis of toxin production in Clostridium botulinum and C. tetani, reference is made to Henderson et al. (1997) in The Clostridia: Molecular Biology and Pathogenesis, Academic Press.

[0309] The term H.sub.N embraces naturally-occurring neurotoxin H.sub.N portions, and modified H.sub.N portions having amino acid sequences that do not occur in nature and/or synthetic amino acid residues, so long as the modified H.sub.N portions still demonstrate the above-mentioned translocation function.

[0310] Alternatively, the Translocation Domain may be of a non-clostridial origin (see Table 4). Examples of non-clostridial Translocation Domain origins include, but are not restricted to, the translocation domain of diphtheria toxin (O'Keefe et al., (1992) Proc. Natl. Acad. Sci. USA 89:6202-6206; Silverman et al., (1994) J. Biol. Chem. 269:22524-22532; and London, (1992) Biochem. Biophys. Acta. 1113:25-51), the translocation domain of Pseudomonas exotoxin type A (Prior et al., (1992) Biochemistry 31:3555-3559), the translocation domains of anthrax toxin (Blanke et al., (1996) Proc. Natl. Acad. Sci. USA 93:8437-8442), a variety of fusogenic or hydrophobic peptides of translocating function (Plank et al., (1994) J. Biol. Chem. 269:12918-12924; and Wagner et al., (1992) Proc. Natl. Acad. Sci. USA 89:7934-7938), and amphiphilic peptides (Murata et al., (1992) Biochemistry, 31:1986-1992). The Translocation Domain may mirror the Translocation Domain present in a naturally-occurring protein, or may include amino acid variations so long as the variations do not destroy the translocating ability of the Translocation Domain.

[0311] Particular examples of viral Translocation Domains suitable for use in the present invention include certain translocating domains of virally expressed membrane fusion proteins. For example, Wagner et al., (1992) and Murata et al., (1992), (both supra) describe the translocation (i.e., membrane fusion and vesiculation) function of a number of fusogenic and amphiphilic peptides derived from the N-terminal region of influenza virus haemagglutinin. Other virally expressed membrane fusion proteins known to have the desired translocating activity are a translocating domain of a fusogenic peptide of Semliki Forest Virus (SFV), a translocating domain of vesicular stomatitis virus (VSV) glycoprotein G, a translocating domain of SER virus F protein and a translocating domain of Foamy virus envelope glycoprotein. Virally encoded spike proteins have particular application in the context of the present invention, for example, the E1 protein of SFV and the G protein of VSV.

[0312] Use of the Translocation Domains listed in Table 4 (below) includes use of sequence variants thereof. A variant may comprise one or more conservative nucleic acid substitutions and/or nucleic acid deletions or insertions, with the proviso that the variant possesses the requisite translocating function. A variant may also comprise one or more amino acid substitutions and/or amino acid deletions or insertions, so long as the variant possesses the requisite translocating function.

TABLE 4 Translocation Domains

Translocation domain source | Amino acid residues | References
Diphtheria toxin | 194-380 | Silverman et al., (1994) J. Biol. Chem. 269: 22524-22532; London, (1992) Biochem. Biophys. Acta. 1113: 25-51
Domain II of pseudomonas exotoxin | 405-613 | Prior et al., (1992) Biochemistry 31: 3555-3559; Kihara and Pastan, (1994) Bioconj Chem. 5: 532-538
Influenza virus haemagglutinin | GLFGAIAGFIENGWEGMIDGWYG (SEQ ID NO: 89), and Variants thereof | Plank et al., (1994) J. Biol. Chem. 269: 12918-12924; Wagner et al., (1992) Proc. Natl. Acad. Sci. USA 89: 7934-7938; Murata et al., (1992) Biochemistry 31: 1986-1992
Semliki Forest virus fusogenic protein | Translocation domain | Kielian et al., (1996) J Cell Biol. 134(4): 863-872
Vesicular Stomatitis virus glycoprotein G | 118-139 | Yao et al., (2003) Virology 310(2): 319-332
SER virus F protein | Translocation domain | Seth et al. (2003), J Virol 77(11): 6520-6527
Foamy virus envelope glycoprotein | Translocation domain | Picard-Maureau et al. (2003), J Virol. 77(8): 4722-4730

SEQ ID NOs

[0313] Where an initial Met amino acid residue or a corresponding initial codon is indicated in any of the following SEQ ID NOs, said residue/codon is optional.

[0314] SEQ ID NO: 1 DNA SEQUENCE OF THE LC/A
[0315] SEQ ID NO: 2 DNA SEQUENCE OF THE H.sub.N/A
[0316] SEQ ID NO: 3 DNA SEQUENCE OF THE LC/B
[0317] SEQ ID NO: 4 DNA SEQUENCE OF THE H.sub.N/B
[0318] SEQ ID NO: 5 DNA SEQUENCE OF THE LC/C
[0319] SEQ ID NO: 6 DNA SEQUENCE OF THE H.sub.N/C
[0320] SEQ ID NO: 7 PROTEIN SEQUENCE OF GALANIN GA30
[0321] SEQ ID NO: 8 PROTEIN SEQUENCE OF GALANIN GA16
[0322] SEQ ID NO: 9 DNA SEQUENCE OF LHA-GS18-EN-CPGA16-GS20-HT
[0323] SEQ ID NO: 10 PROTEIN SEQUENCE OF LHA-GS18-EN-CPGA16-GS20-HT
[0324] SEQ ID NO: 11 PROTEIN SEQUENCE OF LHA-GS18-EN-CPGA16-GS20
[0325] SEQ ID NO: 12 DNA SEQUENCE OF LHA-GS5-EN-CPGA16-GS20-HT
[0326] SEQ ID NO: 13 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA16-GS20-HT
[0327] SEQ ID NO: 14 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA16-GS20
[0328] SEQ ID NO: 15 DNA SEQUENCE OF LHA-GS5-EN-CPGA30-GS20-HT
[0329] SEQ ID NO: 16 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA30-GS20-HT
[0330] SEQ ID NO: 17 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA30-GS20
[0331] SEQ ID NO: 18 DNA SEQUENCE OF LHBK191A-GS5-EN-CPGA16-GS20-HT
[0332] SEQ ID NO: 19 PROTEIN SEQUENCE OF LHBK191A-GS5-EN-CPGA16-GS20-HT
[0333] SEQ ID NO: 20 PROTEIN SEQUENCE OF LHBK191A-GS5-EN-CPGA16-GS20
[0334] SEQ ID NO: 21 DNA SEQUENCE OF LHB-GS5-EN-CPGA16-GS20-HT
[0335] SEQ ID NO: 22 PROTEIN SEQUENCE OF LHB-GS5-EN-CPGA16-GS20-HT
[0336] SEQ ID NO: 23 PROTEIN SEQUENCE OF LHB-GS5-EN-CPGA16-GS20
[0337] SEQ ID NO: 24 DNA SEQUENCE OF LHC-GS5-EN-CPGA16-GS20-HT
[0338] SEQ ID NO: 25 PROTEIN SEQUENCE OF LHC-GS5-EN-CPGA16-GS20-HT
[0339] SEQ ID NO: 26 PROTEIN SEQUENCE OF LHC-GS5-EN-CPGA16-GS20
[0340] SEQ ID NO: 27 DNA SEQUENCE OF LHD-GS5-EN-CPGA16-GS20-HT
[0341] SEQ ID NO: 28 PROTEIN SEQUENCE OF LHD-GS5-EN-CPGA16-GS20-HT
[0342] SEQ ID NO: 29 PROTEIN SEQUENCE OF LHD-GS5-EN-CPGA16-GS20
[0343] SEQ ID NO: 30 DNA SEQUENCE OF LHA-GS5-EN-CPGA16-HX27-HT
[0344] SEQ ID NO: 31 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA16-HX27-HT
[0345] SEQ ID NO: 32 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA16-HX27
[0346] SEQ ID NO: 33 PROTEIN SEQUENCE OF LHA-GS10-EN-CPGA16-GS20-H.sub.N/A
[0347] SEQ ID NO: 34 PROTEIN SEQUENCE OF LHA-GS10-EN-CPGA16-GS20
[0348] SEQ ID NO: 35 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA16-GS15-HT
[0349] SEQ ID NO: 36 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA16-GS15
[0350] SEQ ID NO: 37 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA16-GS10-HT
[0351] SEQ ID NO: 38 PROTEIN SEQUENCE OF LHA-GS5-EN-CPGA16-GS10
[0352] SEQ ID NO: 39 PROTEIN SEQUENCE OF LHA-GS18-EN-CPGA16-HX27-H.sub.N/A
[0353] SEQ ID NO: 40 PROTEIN SEQUENCE OF LHA-GS18-EN-CPGA16-HX27
[0354] SEQ ID NO: 41 PROTEIN SEQUENCE OF LHA-GS18-EN-CPGA16-GS15-HT
[0355] SEQ ID NO: 42 PROTEIN SEQUENCE OF LHA-GS18-EN-CPGA16-GS15
[0356] SEQ ID NO: 43 PROTEIN SEQUENCE OF LHA-GS18-EN-CPGA16-GS10-HT
[0357] SEQ ID NO: 44 PROTEIN SEQUENCE OF LHA-GS18-EN-CPGA16-GS10
[0358] SEQ ID NO: 45 PROTEIN SEQUENCE OF LHA-GS10-EN-CPGA16-HX27-HT
[0359] SEQ ID NO: 46 PROTEIN SEQUENCE OF LHA-GS10-EN-CPGA16-HX27
[0360] SEQ ID NO: 47 PROTEIN SEQUENCE OF LHA-GS10-EN-CPGA16-GS15-HT
[0361] SEQ ID NO: 48 PROTEIN SEQUENCE OF LHA-GS10-EN-CPGA16-GS15
[0362] SEQ ID NO: 49 PROTEIN SEQUENCE OF LHA-GS10-EN-CPGA16-GS10-HT
[0363] SEQ ID NO: 50 PROTEIN SEQUENCE OF LHA-GS10-EN-CPGA16-GS10
[0364] SEQ ID NO: 51 DNA SEQUENCE OF THE IGA PROTEASE
[0365] SEQ ID NO: 52 DNA SEQUENCE OF THE IGA-GS5-GAL16-GS18-H.sub.N/A FUSION
[0366] SEQ ID NO: 53 PROTEIN SEQUENCE OF THE IGA-GS5-GAL16-GS18-H.sub.N/A FUSION
[0367] SEQ ID NO: 54 DNA SEQUENCE OF DT TRANSLOCATION DOMAIN
[0368] SEQ ID NO: 55 DNA SEQUENCE OF LC/A-GS5-GAL16-GS20-DT-A
[0369] SEQ ID NO: 56 PROTEIN SEQUENCE OF LC/A-GS5-GAL16-GS20-DT-A
[0370] SEQ ID NO: 57 DNA SEQUENCE OF TENT LC
[0371] SEQ ID NO: 58 DNA SEQUENCE OF TENT LC-GS5-CPGAL16-GS20-H.sub.N/A
[0372] SEQ ID NO: 59 PROTEIN SEQUENCE OF TENT LC-GS5-EN-CPGA16-GS20-H.sub.N/A

EXAMPLES

Example 1

Construction and Activation of Galanin Fusion Proteins

Preparation of LC/A and H.sub.N/A Backbone Clones

[0373] The following procedure creates the LC and H.sub.N fragments for use as the component backbone for multidomain fusion expression. This example is based on preparation of a serotype A based clone (SEQ ID NO: 1 and SEQ ID NO: 2), though the procedures and methods are equally applicable to the other serotypes (i.e., A, B, C, D and E serotypes) as illustrated by the sequence listing for serotype B (SEQ ID NO: 3 and SEQ ID NO: 4) and serotype C (SEQ ID NO: 5 and SEQ ID NO: 6).

Preparation of Cloning and Expression Vectors

[0374] pCR 4 (Invitrogen) is the chosen standard cloning vector, selected due to the lack of restriction sequences within the vector and adjacent sequencing primer sites for easy construct confirmation. The expression vector is based on the pMAL (NEB) expression vector, which has the desired restriction sequences within the multiple cloning site in the correct orientation for construct insertion (BamHI-SalI-PstI-HindIII). A fragment of the expression vector has been removed to create a non-mobilizable plasmid and a variety of different fusion tags have been inserted to increase purification options.

Preparation of Protease (e.g. LC/A) Insert

[0375] The LC/A (SEQ ID NO: 1) is created by one of two ways:

[0376] The DNA sequence is designed by back translation of the LC/A amino acid sequence (obtained from freely available database sources such as GenBank (accession number P10845) or Swissprot (accession locus BXA1_CLOBO)) using one of a variety of reverse translation software tools (for example EditSeq best E. coli reverse translation (DNASTAR Inc.), or Backtranslation tool v 2.0 (Entelechon)). BamHI/SalI recognition sequences are incorporated at the 5' and 3' ends respectively of the sequence, maintaining the correct reading frame. The DNA sequence is screened (using software such as MapDraw, DNASTAR Inc.) for restriction enzyme cleavage sequences incorporated during the back translation. Any cleavage sequences that are found to be common to those required by the cloning system are removed manually from the proposed coding sequence, ensuring common E. coli codon usage is maintained. E. coli codon usage is assessed by reference to software programs such as Graphical Codon Usage Analyser (Geneart), and the % GC content and codon usage ratio assessed by reference to published codon usage tables (for example GenBank Release 143, 13 Sep. 2004). This optimized DNA sequence containing the LC/A open reading frame (ORF) is then commercially synthesized (for example by Entelechon, Geneart or Sigma-Genosys) and is provided in the pCR 4 vector.

[0377] The alternative method is to use PCR amplification from an existing DNA sequence with BamHI and SalI restriction enzyme sequences incorporated into the 5' and 3' PCR primers respectively. Complementary oligonucleotide primers are chemically synthesized by a supplier (for example MWG or Sigma-Genosys), so that each pair has the ability to hybridize to the opposite strands (3' ends pointing "towards" each other) flanking the stretch of Clostridium target DNA, one oligonucleotide for each of the two DNA strands. To generate a PCR product the pair of short oligonucleotide primers specific for the Clostridium DNA sequence are mixed with the Clostridium DNA template and other reaction components and placed in a machine (the `PCR machine`) that can change the incubation temperature of the reaction tube automatically, cycling between approximately 94.degree. C. (for denaturation), 55.degree. C. (for oligonucleotide annealing), and 72.degree. C. (for synthesis). Other reagents required for amplification of a PCR product include a DNA polymerase (such as Taq or Pfu polymerase), each of the four nucleotide dNTP building blocks of DNA in equimolar amounts (50 to 200 .mu.M) and a buffer appropriate for the enzyme optimized for Mg.sup.2+ concentration (0.5 to 5 mM).

[0378] The amplification product is cloned into pCR 4 using either TOPO TA cloning for Taq PCR products or Zero Blunt TOPO cloning for Pfu PCR products (both kits commercially available from Invitrogen). The resultant clone is checked by sequencing. Any additional restriction sequences which are not compatible with the cloning system are then removed using site directed mutagenesis (for example, using Quickchange (Stratagene Inc.)).
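By way of illustration only, the back translation and restriction screening steps of the first method may be sketched as follows. The sketch assumes a simplified one-codon-per-amino-acid table (`PREFERRED_CODON`, a hypothetical selection of commonly used E. coli codons) and hypothetical function names; it is not the algorithm used by the commercial tools named above.

```python
# Hypothetical preferred-codon table: one frequently used E. coli codon per
# amino acid. This choice is an assumption made for illustration only.
PREFERRED_CODON = {
    "A": "GCG", "R": "CGT", "N": "AAC", "D": "GAT", "C": "TGC",
    "Q": "CAG", "E": "GAA", "G": "GGC", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCG",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAT", "V": "GTG",
}

# Recognition sequences of the enzymes used by the cloning system.
# All five are palindromic, so scanning one strand is sufficient.
CLONING_SITES = {
    "BamHI": "GGATCC",
    "SalI": "GTCGAC",
    "PstI": "CTGCAG",
    "XbaI": "TCTAGA",
    "HindIII": "AAGCTT",
}


def back_translate(protein: str) -> str:
    """Back translate an amino acid sequence using the preferred codons."""
    return "".join(PREFERRED_CODON[aa] for aa in protein.upper())


def find_internal_sites(dna: str) -> dict:
    """Return {enzyme: [0-based positions]} for cloning sites found in dna."""
    hits = {}
    for name, site in CLONING_SITES.items():
        positions = [i for i in range(len(dna) - len(site) + 1)
                     if dna.startswith(site, i)]
        if positions:
            hits[name] = positions
    return hits


def add_flanking_sites(orf: str) -> str:
    """Add BamHI (5') and SalI (3') sites; both are 6 nt, so frame is kept."""
    return CLONING_SITES["BamHI"] + orf + CLONING_SITES["SalI"]
```

In this sketch, a Leu-Gln dipeptide back translates to CTGCAG, which `find_internal_sites` reports as an internal PstI sequence; such a sequence would then be removed by substituting a synonymous codon.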

Preparation of Translocation (e.g. H.sub.N) Insert

[0379] The H.sub.N/A (SEQ ID NO: 2) is created by one of two ways:

[0380] The DNA sequence is designed by back translation of the H.sub.N/A amino acid sequence (obtained from freely available database sources such as GenBank (accession number P10845) or Swissprot (accession locus BXA1_CLOBO)) using one of a variety of reverse translation software tools (for example EditSeq best E. coli reverse translation (DNASTAR Inc.), or Backtranslation tool v 2.0 (Entelechon)). A PstI restriction sequence is added to the N-terminus and XbaI-stop codon-HindIII to the C-terminus, ensuring the correct reading frame is maintained. The DNA sequence is screened (using software such as MapDraw, DNASTAR Inc.) for restriction enzyme cleavage sequences incorporated during the back translation. Any sequences that are found to be common to those required by the cloning system are removed manually from the proposed coding sequence, ensuring common E. coli codon usage is maintained. E. coli codon usage is assessed by reference to software programs such as Graphical Codon Usage Analyser (Geneart), and the % GC content and codon usage ratio assessed by reference to published codon usage tables (for example GenBank Release 143, 13 Sep. 2004). This optimized DNA sequence is then commercially synthesized (for example by Entelechon, Geneart or Sigma-Genosys) and is provided in the pCR 4 vector.

[0381] The alternative method is to use PCR amplification from an existing DNA sequence with PstI and XbaI-stop codon-HindIII restriction enzyme sequences incorporated into the 5' and 3' PCR primers respectively. The PCR amplification is performed as described above. The PCR product is inserted into pCR 4 vector and checked by sequencing. Any additional restriction sequences which are not compatible with the cloning system are then removed using site directed mutagenesis (for example using Quickchange (Stratagene Inc.)).
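The incorporation of restriction sequences into the 5' and 3' PCR primers may be sketched as follows. This is a minimal illustration under stated assumptions: the annealing length of 20 bases, the TAA stop codon and the function names are hypothetical choices, and primer melting temperature is not considered.

```python
PSTI = "CTGCAG"
XBAI = "TCTAGA"
HINDIII = "AAGCTT"
STOP = "TAA"  # assumed stop codon; any stop codon could be used

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(template: str, anneal_len: int = 20):
    """Return (forward, reverse) primers carrying the cloning sites.

    The forward primer adds PstI upstream of the coding sequence; the
    reverse primer adds XbaI-stop codon-HindIII downstream of it.
    """
    template = template.upper()
    if len(template) < anneal_len:
        raise ValueError("template shorter than annealing length")
    forward = PSTI + template[:anneal_len]
    # The reverse primer anneals to the top strand, so it is written as the
    # reverse complement of the desired 3' end of the product.
    reverse = reverse_complement(template[-anneal_len:] + XBAI + STOP + HINDIII)
    return forward, reverse


def expected_amplicon(template: str) -> str:
    """Top-strand sequence of the PCR product made with these primers."""
    return PSTI + template.upper() + XBAI + STOP + HINDIII
```

Because the PstI and XbaI sequences are each six bases long, a template whose length is a multiple of three remains in frame with the flanking sequences in this sketch.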

Preparation of LHA-GS18-EN-CPGA16-GS20 Fusion

[0382] In order to create the LC/A-GS18-EN-CPGA16-GS20-H.sub.N/A construct, an A serotype linker with the addition of an Enterokinase (EN) site for activation, arranged as BamHI-SalI-GS18-protease site-GS20-PstI-XbaI-stop codon-HindIII, is synthesized. The pCR 4 vector encoding the linker is cleaved with BamHI+SalI restriction enzymes. This cleaved vector then serves as the recipient for insertion and ligation of the LC/A DNA (SEQ ID NO: 1) also cleaved with BamHI+SalI. This construct is then cleaved with BamHI+HindIII and inserted into an expression vector such as the pMAL plasmid (NEB) or pET based plasmid (Novagen). The resulting plasmid DNA is then cleaved with PstI+XbaI restriction enzymes, and the H.sub.N/A DNA (SEQ ID NO: 2), also cleaved with PstI+XbaI restriction enzymes, is inserted into the similarly cleaved pMAL vector to create pMAL-LC/A-GS18-EN-CPGA16-GS20-H.sub.N/A-XbaI-His-tag-stop codon-HindIII. The final construct contains the GS18-EN-CPGA16-GS20 spacer ORF for expression as a protein of the sequence illustrated in SEQ ID NO: 10.

Activation Assay

[0383] NuPAGE 4-12% Bis-Tris gels (10, 12 and 15 well pre-poured gels) were used to analyze activation of fusion proteins after treatment with protease. Protein samples were prepared with NuPAGE 4.times.LDS sample buffer, typically to a final volume of 100 .mu.l. Samples were either diluted or made up neat (75 .mu.l of sample, 25 .mu.l of sample buffer) depending on protein concentration. The samples were mixed and then heated in a heat block at 95.degree. C. for 5 min before loading onto the gel. 5 to 20 .mu.l of sample was loaded along with 5 .mu.l of the protein marker (Benchmark.TM. protein marker from Invitrogen). The gels were typically run for 50 min at 200 V. The gel was immersed in dH.sub.2O and microwaved for 2 min on full power. The gel was rinsed and the microwave step was repeated. The gel was transferred to a staining box and immersed in Simply Blue SafeStain (Invitrogen). It was microwaved for 1 minute on full power and left for 0.5 to 2 h to stain. The gel was then destained by pouring off the SafeStain and rinsing the gel with dH.sub.2O. The gels were left in dH.sub.2O to destain overnight and an image was taken on a GeneGnome (Syngene) imager. Total activated protein was calculated by comparing the density of the band that corresponded to full-length fusion protein (after protease treatment) in non-reduced and reduced conditions.
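The density comparison described above may be expressed as a simple calculation. The sketch below assumes that activation converts the single chain fusion into a disulfide-linked di-chain, so that the full-length band persists under non-reducing conditions but is lost on reduction in proportion to the activated fraction. The formula, the function name `percent_activated` and the example densities are assumptions for illustration and are not taken from the specification.

```python
def percent_activated(nonreduced_density: float, reduced_density: float) -> float:
    """Estimate % activated protein from full-length band densities.

    nonreduced_density: density of the full-length band, non-reduced lane.
    reduced_density: density of the full-length band, reduced lane.
    Assumes equal loading of both lanes (an assumption of this sketch).
    """
    if nonreduced_density <= 0:
        raise ValueError("non-reduced band density must be positive")
    fraction_uncleaved = reduced_density / nonreduced_density
    # Clamp to [0, 1] to absorb small densitometry errors.
    fraction_uncleaved = min(max(fraction_uncleaved, 0.0), 1.0)
    return 100.0 * (1.0 - fraction_uncleaved)


# Hypothetical densities in arbitrary units.
print(round(percent_activated(1000.0, 150.0), 1))
```

Under these assumptions, a reduced-lane band retaining 15% of the non-reduced density corresponds to approximately 85% activation.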

Example 2

Preparation of an LHA-GS18-EN-CPGA16-GS20 Fusion Protein Family with Variable Spacer Length

[0384] Using the same strategy as employed in Example 1, a range of DNA linkers were prepared that encoded galanin16 and variable spacer content. Using one of a variety of reverse translation software tools (for example EditSeq best E. coli reverse translation (DNASTAR Inc.), or Backtranslation tool v 2.0 (Entelechon)), the DNA sequence encoding the Spacer 1-Protease site-ligand-spacer 2 region is determined. Restriction sites are then incorporated into the DNA sequence and can be arranged as BamHI-SalI-Spacer 1-protease site-CPGA16-NheI-spacer 2-SpeI-PstI-XbaI-stop codon-HindIII. It is important to ensure the correct reading frame is maintained for the spacer, GA16 and restriction sequences and that the XbaI sequence is not preceded by the bases GA or followed by the bases TC, either of which would create a GATC sequence and result in DAM methylation. The DNA sequence is screened for restriction sequence incorporation and any additional sequences are removed manually from the remaining sequence, ensuring common E. coli codon usage is maintained. E. coli codon usage is assessed by reference to software programs such as Graphical Codon Usage Analyser (Geneart), and the % GC content and codon usage ratio assessed by reference to published codon usage tables (for example GenBank Release 143, 13 Sep. 2004). This optimized DNA sequence is then commercially synthesized (for example by Entelechon, Geneart or Sigma-Genosys) and is provided in the pCR 4 vector.
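The check for DAM methylation at the XbaI sequence may be automated. Dam methylase acts on the sequence GATC, and the sketch below flags any XbaI sequence (TCTAGA) whose immediate flanking bases complete a GATC motif. The function name and the two-base flanking window are illustrative assumptions.

```python
XBAI = "TCTAGA"
DAM_MOTIF = "GATC"


def dam_blocked_xbai_sites(dna: str) -> list:
    """Return 0-based positions of XbaI sites that overlap a GATC motif."""
    dna = dna.upper()
    blocked = []
    start = dna.find(XBAI)
    while start != -1:
        # Inspect the site plus up to two bases on either side; a GATC
        # within this window necessarily overlaps the XbaI sequence.
        window = dna[max(0, start - 2):start + len(XBAI) + 2]
        if DAM_MOTIF in window:
            blocked.append(start)
        start = dna.find(XBAI, start + 1)
    return blocked
```

A linker design for which this function returns an empty list contains no XbaI sequence overlapping a GATC motif.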

TABLE 5 Spacer-linkers that were created

Spacer 1-protease site-GA16-Spacer 2 | SEQ ID NO: of the linker
GS5-EN-CPGA16-GS20 | 12, 13, 14, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
GS10-EN-CPGA16-GS20 | 33, 34
GS5-EN-CPGA16-HX27 | 30, 31, 32
GS5-EN-CPGA16-GS15 | 35, 36
GS5-EN-CPGA16-GS10 | 37, 38
GS18-EN-CPGA16-HX27 | 39, 40
GS18-EN-CPGA16-GS15 | 41, 42
GS18-EN-CPGA16-GS10 | 43, 44
GS10-EN-CPGA16-HX27 | 45, 46
GS10-EN-CPGA16-GS15 | 47, 48
GS10-EN-CPGA16-GS10 | 49, 50

[0385] By way of example, and in order to create the LC/A-GS5-EN-CPGA16-GS20-H.sub.N/A fusion construct (SEQ ID NO: 12), the pCR 4 vector encoding the linker (BamHI-SalI-GS5-protease site-GS20-PstI-XbaI-stop codon-HindIII) is cleaved with BamHI+SalI restriction enzymes. This cleaved vector then serves as the recipient vector for insertion and ligation of the LC/A DNA (SEQ ID NO: 1) also cleaved with BamHI+SalI. The resulting plasmid DNA is then cleaved with BamHI+HindIII restriction enzymes and the LC/A-linker fragment inserted into a similarly cleaved vector containing a unique multiple cloning site for BamHI, SalI, PstI, and HindIII such as the pMAL vector (NEB) or the pET vector (Novagen). The H.sub.N/A DNA (SEQ ID NO: 2) is then cleaved with PstI+HindIII restriction enzymes and inserted into the similarly cleaved pMAL-LC/A-linker construct. The final construct contains the LC/A-GS5-EN-CPGA16-GS20-H.sub.N/A ORF for expression as a protein of the sequence illustrated in SEQ ID NO: 13.

Example 3

Purification Method for Galanin Fusion Protein

[0386] A falcon tube containing 25 ml 50 mM HEPES pH 7.2, 200 mM NaCl and approximately 10 g of E. coli BL21 cell paste was defrosted. The thawed cell paste was made up to 80 ml with 50 mM HEPES pH 7.2, 200 mM NaCl and sonicated on ice for 30 seconds on, 30 seconds off for 10 cycles at a power of 22 microns ensuring that the sample remained cool. The lysed cells were centrifuged at 18,000 rpm, 4.degree. C. for 30 minutes. The supernatant was loaded onto a 0.1 M NiSO.sub.4 charged Chelating column (a 20-30 ml column was sufficient) equilibrated with 50 mM HEPES pH 7.2, 200 mM NaCl. Using a step gradient of 10 and 40 mM imidazole, the non-specific bound protein was washed away and the fusion protein was eluted with 100 mM imidazole. The eluted fusion protein was dialyzed against 5 L of 50 mM HEPES pH 7.2, 200 mM NaCl at 4.degree. C. overnight and the OD of the dialyzed fusion protein was measured. 1 .mu.g of enterokinase (1 mg/ml) was added per 100 .mu.g of purified fusion protein along with 10 .mu.l of factor Xa per mg of purified fusion protein if the fusion protein contained a maltose binding protein. Incubation was at 25.degree. C. static overnight. The dialysate was loaded onto a 0.1 M NiSO.sub.4 charged Chelating column (a 20-30 ml column was sufficient) equilibrated with 50 mM HEPES pH 7.2, 200 mM NaCl. The column was washed to baseline with 50 mM HEPES pH 7.2, 200 mM NaCl. Then, using a step gradient of 10 and 40 mM imidazole, the non-specific bound protein was washed away and the fusion protein was eluted with 100 mM imidazole.

[0387] The eluted fusion protein was dialyzed against 5 L of 50 mM HEPES pH 7.2, 200 mM NaCl at 4.degree. C. overnight and the fusion protein was concentrated to about 2 mg/ml, aliquoted, and frozen at -20.degree. C.
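The quantities of activating protease stated in Example 3 scale linearly with the amount of purified fusion protein. A minimal sketch of that arithmetic follows; the function name, the dictionary keys and the 5 mg example are hypothetical.

```python
def activation_reagents(fusion_protein_ug: float, has_mbp: bool) -> dict:
    """Compute protease additions for a given mass of fusion protein.

    Uses the ratios stated in Example 3: 1 ug enterokinase (1 mg/ml stock)
    per 100 ug fusion protein, and 10 ul factor Xa per mg fusion protein
    when a maltose binding protein is present.
    """
    if fusion_protein_ug < 0:
        raise ValueError("protein mass must be non-negative")
    enterokinase_ug = fusion_protein_ug / 100.0
    # A 1 mg/ml stock contains 1 ug per ul, so volume (ul) equals mass (ug).
    enterokinase_ul = enterokinase_ug / 1.0
    factor_xa_ul = 10.0 * (fusion_protein_ug / 1000.0) if has_mbp else 0.0
    return {
        "enterokinase_ug": enterokinase_ug,
        "enterokinase_ul": enterokinase_ul,
        "factor_xa_ul": factor_xa_ul,
    }


# Hypothetical batch of 5 mg fusion protein carrying a maltose binding protein.
print(activation_reagents(5000.0, has_mbp=True))
```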

Example 4

Preparation of a LC/C-GA16-H.sub.N/C Fusion Protein with a Serotype A Activation Sequence

[0388] Following the methods used in Examples 1 and 2, the LC/C (SEQ ID NO: 5) and H.sub.N/C (SEQ ID NO: 6) are created and inserted into the A serotype linker arranged as BamHI-SalI-Spacer 1-protease site-GA16-NheI-spacer 2-SpeI-PstI-XbaI-stop codon-HindIII. The final construct contains the LC-spacer 1-GA16-spacer 2-H.sub.N ORF for expression as a protein of the sequence illustrated in SEQ ID NO: 25.

Example 5

Preparation of an IgA Protease-GA16-H.sub.N/A Fusion Protein

[0389] The IgA protease amino acid sequence was obtained from freely available database sources such as GenBank (accession number P09790). Information regarding the structure of the N. gonorrhoeae IgA protease gene is available in the literature (Pohlner et al., (1987) Nature 325(6103):458-462). Using Backtranslation tool v 2.0 (Entelechon), the DNA sequence encoding the IgA protease modified for E. coli expression was determined. A BamHI recognition sequence was incorporated at the 5' end and a codon encoding a cysteine amino acid and SalI recognition sequence were incorporated at the 3' end of the IgA DNA. The DNA sequence was screened using MapDraw (DNASTAR Inc.) for restriction enzyme cleavage sequences incorporated during the back translation. Any cleavage sequences that were found to be common to those required for cloning were removed manually from the proposed coding sequence, ensuring common E. coli codon usage was maintained. E. coli codon usage was assessed using Graphical Codon Usage Analyser (Geneart), and the % GC content and codon usage ratio assessed by reference to published codon usage tables. This optimized DNA sequence (SEQ ID NO: 51) containing the IgA open reading frame (ORF) was then commercially synthesized.

[0390] The IgA (SEQ ID NO: 51) was inserted into the LC-GS5-GA16-GS18-H.sub.N ORF using BamHI and SalI restriction enzymes to replace the LC with the IgA protease DNA. The final construct contains the IgA-GS5-GA16-GS18-H.sub.N ORF for expression as a protein of the sequence illustrated in SEQ ID NO: 53.

Example 6

Preparation of a Galanin Targeted Endopeptidase Fusion Protein Containing a LC Domain Derived from Tetanus

[0391] The DNA sequence is designed by back translation of the tetanus toxin LC amino acid sequence (obtained from freely available database sources such as GenBank (accession number X04436)) using one of a variety of reverse translation software tools (for example EditSeq best E. coli reverse translation (DNASTAR Inc.), or Backtranslation tool v 2.0 (Entelechon)). BamHI/SalI recognition sequences are incorporated at the 5' and 3' ends respectively of the sequence, maintaining the correct reading frame (SEQ ID NO: 57). The DNA sequence is screened (using software such as MapDraw, DNASTAR Inc.) for restriction enzyme cleavage sequences incorporated during the back translation. Any cleavage sequences that are found to be common to those required by the cloning system are removed manually from the proposed coding sequence, ensuring common E. coli codon usage is maintained. E. coli codon usage is assessed by reference to software programs such as Graphical Codon Usage Analyser (Geneart), and the % GC content and codon usage ratio assessed by reference to published codon usage tables (for example GenBank Release 143, 13 Sep. 2004). This optimized DNA sequence containing the tetanus toxin LC open reading frame (ORF) is then commercially synthesized (for example by Entelechon, Geneart or Sigma-Genosys) and is provided in the pCR4 vector (Invitrogen). The pCR4 vector encoding the TeNT LC is cleaved with BamHI and SalI. The BamHI-SalI fragment is then inserted into the LC/A-GA16-H.sub.N/A vector that has also been cleaved by BamHI and SalI. The final construct contains the TeNT LC-GS5-GA16-GS20-H.sub.N ORF sequences for expression as a protein of the sequence illustrated in SEQ ID NO: 59.

Example 7

Construction of CHO-K1 GALR1 & GALR2 Receptor Activation Assay and SNAP-25 Cleavage Assay

Cell-Line Creation

[0392] CHO-K1 cells stably expressing either the human galanin 1 receptor (CHO-K1-Gal-1R; product number ES-510-C) or human galanin 2 receptor (CHO-K1-Gal-2R; product number ES-511-C) were purchased from Perkin-Elmer (Bucks, UK).

[0393] Where required, cells were transfected with SNAP-25 DNA using Lipofectamine.TM. 2000 and incubated for 4 hours before media replacement. After 24 hours, cells were transferred to a T175 flask. 100 .mu.g/ml Zeocin was added after a further 24 hours to begin selection of SNAP-25 expressing cells, and 5 .mu.g/ml Blasticidin was added to maintain selective pressure for the receptor. Cells were maintained in media containing selection agents for two weeks, passaging cells every two to three days to maintain 30 to 70% confluence. Cells were then diluted in selective media to achieve 0.5 cells per well in a 96 well microplate. After a few days, the plates were examined under a microscope, and those wells containing single colonies were marked. Media in these wells was changed weekly. As cells became confluent in the wells, they were transferred to T25 flasks. When the cells had expanded sufficiently, each clone was seeded to 24 wells of a 96 well plate, and a frozen stock vial was created. Galanin fusion proteins of the invention and LC/A-H.sub.N/A (LHA) were applied to the cells for 24 hours, and then Western blots were performed to detect SNAP-25 cleavage. Clones that gave strong SNAP-25 bands and high cleavage levels with fusion protein were maintained for further investigation. Full dose curves were run on these, and the clone with the highest differential between galanin fusion protein and LC/A-H.sub.N/A cleavage levels was selected.

GALR1 Receptor Activation Assay

[0394] The GALR1 receptor activation assay measures the potency and intrinsic efficacy of ligands at the GALR1 receptor in transfected CHO-K1 cells by quantifying the reduction of forskolin-stimulated intracellular cAMP using a FRET-based cAMP assay (Perkin Elmer LANCE cAMP kit). After stimulation, a fluorescently labeled cAMP tracer (Europium-streptavidin/biotin-cAMP) and fluorescently (Alexa) labeled anti-cAMP antibody are added to the cells in a lysis buffer. cAMP from the cells competes with the cAMP tracer for antibody binding sites. When read, a light pulse at 320 nm excites the fluorescent portion (Europium) of the cAMP tracer. The energy emitted from the europium is transferred to the Alexa fluor-labeled antibodies bound to the tracer, generating a TR-FRET signal at 665 nm (time-resolved fluorescence resonance energy transfer is based on the proximity of the donor label, europium, and the acceptor label, Alexa fluor, which have been brought together by a specific binding reaction). Residual energy from the europium produces light at 615 nm. In agonist treated cells there will be less cAMP to compete with the tracer, so a dose-dependent increase in signal at 665 nm will be observed compared with samples treated with forskolin alone. The signal at 665 nm is converted to cAMP concentration by interpolation to a cAMP standard curve which is included in each experiment.
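The conversion of the 665 nm signal to a cAMP concentration can be illustrated as follows. The text only says that interpolation to a standard curve is used; the piecewise-linear interpolation in log concentration shown here is an assumption (a fitted sigmoidal curve would be an equally plausible reading), and the standard values and the function name `signal_to_camp` are hypothetical.

```python
import math


def signal_to_camp(signal, standards):
    """Interpolate a 665 nm signal to a cAMP concentration.

    standards: list of (cAMP concentration in M, 665 nm signal) pairs.
    Interpolation is linear in log10(concentration) between the two
    standards that bracket the signal (an assumed method, see lead-in).
    """
    pts = sorted(standards, key=lambda p: p[1])  # ascending signal
    if not pts[0][1] <= signal <= pts[-1][1]:
        raise ValueError("signal lies outside the standard curve")
    for (c_lo, s_lo), (c_hi, s_hi) in zip(pts, pts[1:]):
        if s_lo <= signal <= s_hi:
            frac = (signal - s_lo) / (s_hi - s_lo)
            log_c = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c


# Hypothetical standard curve: more cAMP displaces more tracer, so the
# 665 nm signal falls as the cAMP concentration rises.
STANDARDS = [(1e-6, 1000.0), (1e-7, 3000.0), (1e-8, 9000.0), (1e-9, 15000.0)]

camp_at_standard = signal_to_camp(3000.0, STANDARDS)
camp_between = signal_to_camp(2000.0, STANDARDS)
print(f"{camp_at_standard:.3e} M, {camp_between:.3e} M")
```

Because the signal falls as cAMP rises, an agonist-treated well (less cAMP) reads a higher 665 nm signal and therefore interpolates to a lower concentration than a well treated with forskolin alone.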

[0395] Using Gilson pipettes and Sigmacote.RTM. treated or lo-bind tips, test materials and standards were diluted to the appropriate concentrations in the wells of the first two columns of an Eppendorf 500 .mu.l deep-well lo-bind plate, in assay buffer containing 10 .mu.M forskolin. The chosen concentrations in columns one and two were half a log unit apart. From these, serial 1:10 dilutions were made across the plate (using an electronic eight channel pipette with Sigmacote.RTM. or lo-bind tips) until eleven concentrations at half log intervals had been created. In the twelfth column, assay buffer only was added as a `basal`. Using a 12 channel digital pipette, 10 .mu.l of sample from the lo-bind plate was transferred to the Optiplate.TM. 96 well microplate.
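The plate layout above yields a half-log series because two starting columns half a log unit apart are each carried forward by 1:10 steps, so the two 1:10 series interleave. A minimal sketch, assuming a hypothetical top concentration of 1 .mu.M (the text does not give the actual starting concentrations):

```python
import math


def half_log_series(top_conc, n_points=11):
    """Concentrations for columns 1..n_points of the dilution plate.

    Columns 1 and 2 are half a log unit apart; every later column is a
    1:10 dilution of the column two positions before it.
    """
    cols = [top_conc, top_conc / math.sqrt(10)]
    while len(cols) < n_points:
        cols.append(cols[-2] / 10)
    return cols[:n_points]


series = half_log_series(1e-6)  # hypothetical top concentration, in M
for col, conc in enumerate(series, start=1):
    print(f"column {col:2d}: {conc:.3e} M")
```

With eleven columns the series spans five log units; column twelve is left for the assay-buffer-only basal.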

[0396] To wells containing the standard curve, 10 .mu.l of assay buffer was added using a multichannel digital pipette. To wells containing the test materials, 10 .mu.l of cells in assay buffer at the appropriate concentration were added. Plates were sealed and incubated for 120 min at room temperature, for the first hour on an IKA MTS 2/4 orbital shaker set to maximum speed.

[0397] LANCE Eu-W8044 labeled streptavidin (Eu-SA) and Biotin-cAMP (b-cAMP) were diluted in cAMP Detection Buffer (both from Perkin Elmer LANCE cAMP kit) to create sub-stocks, at dilution ratios of 1:17 and 1:5, respectively. The final detection mix was prepared by diluting from the two sub stocks into detection buffer at a ratio of 1:125. The mixture was incubated for 15-30 min at room temperature before addition of 1:200 Alexa Fluor.RTM. 647-anti cAMP Antibody (Alexa-Fluor Ab). After briefly vortex mixing, 20 .mu.l was immediately added to each well using a digital multichannel pipette. Microplate sealers were applied and plates incubated for 24 h at room temperature (for the first hour on an IKA MTS 2/4 orbital shaker set to maximum speed). Plate sealers were removed prior to reading on a microplate reader (EnVision.RTM.).

GALR2 Receptor Activation Assay

[0398] The GALR2 receptor activation assay measures the potency and intrinsic efficacy of ligands at the GALR2 receptor in transfected CHO-K1 cells by measuring the calcium mobilization that occurs when the receptor is activated. The transfected cells are pre-loaded with a calcium sensitive dye (FLIPR) before treatment. When read using a microplate reader (Flexstation 3, Molecular Devices), a light pulse at 485 nm excites the fluorescent dye and causes an emission at 525 nm. This provides real-time fluorescence data from changes in intracellular calcium. In agonist treated cells there will be activation of the receptor, leading to an increase in calcium mobilization. This will be measured as an increase in the relative fluorescence units (RFU) at 525 nm.

Culture of Cells for Receptor Activation Assay:

[0399] Cells were seeded and cultured in T175 flasks containing Ham F12 with Glutamax, 10% Fetal bovine serum, 5 .mu.g/ml Blasticidin and 100 .mu.g/ml Zeocin. The flasks were incubated at 37.degree. C. in a humidified environment containing 5% CO.sub.2 until 60 to 80% confluent. On the day of harvest the media was removed and the cells were washed twice with 25 ml PBS. The cells were removed from the flask by addition of 10 ml of TrypLE Express and incubation at 37.degree. C. for 10 min, followed by gentle tapping of the flask. The dislodged cells were transferred to a 50 ml centrifuge tube and the flask washed twice with 10 ml media, which was added to the cell suspension. The tube was centrifuged at 1300.times.g for 3 min and the supernatant removed. Cells were gently re-suspended in 10 ml media (if freezing cells) or assay buffer (if using `fresh` cells), and a sample was removed for counting using a Nucleocounter.RTM. (ChemoMetec). Cells for use `fresh` in an assay were diluted further in assay buffer to the appropriate concentration. Cells harvested for freezing were re-centrifuged (1300.times.g; 3 min), the supernatant removed and cells re-suspended in Synth-a-freeze at 4.degree. C. to 3.times.10.sup.6 cells/ml. Cryovials containing 1 ml suspension each were placed in a chilled Nalgene Mr. Frosty freezing container (-1.degree. C./minute cooling rate), and left overnight in a -80.degree. C. freezer. The following day vials were transferred to the vapor phase of a liquid nitrogen storage tank.
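The freezing step above amounts to a simple calculation: the cell count and the 10 ml suspension volume give the total number of cells, which fixes the volume of freezing medium needed to reach 3.times.10.sup.6 cells/ml. A minimal sketch; the counted density used in the example is hypothetical, as is the function name `freezing_plan`.

```python
import math

TARGET_DENSITY = 3e6   # cells/ml in freezing medium, from the text
VIAL_VOLUME_ML = 1.0   # ml of suspension per cryovial, from the text


def freezing_plan(counted_cells_per_ml, suspension_volume_ml):
    """Return (total cells, resuspension volume in ml, whole cryovials)."""
    total_cells = counted_cells_per_ml * suspension_volume_ml
    resuspend_ml = total_cells / TARGET_DENSITY
    vials = math.floor(resuspend_ml / VIAL_VOLUME_ML)
    return total_cells, resuspend_ml, vials


# Hypothetical count: 4.5e6 cells/ml in the 10 ml suspension.
total, volume_ml, vials = freezing_plan(4.5e6, 10)
print(f"{total:.2e} cells -> {volume_ml:.1f} ml freezing medium, {vials} vials")
```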

[0400] FIG. 4 demonstrates that galanin fusion proteins of the present invention having different galanin ligands (i.e., galanin-16 and galanin-30) and different serotype backbones (i.e., LC/A-H.sub.N/A, LC/B-H.sub.N/B, LC/C--H.sub.N/C and LC/D-H.sub.N/D) activate GALR1 receptors.

CHO-K1 GALR1 SNAP-25 Cleavage Assays

[0401] Cultures of cells were exposed to varying concentrations of galanin fusion protein for 24 hours. Cellular proteins were separated by SDS-PAGE and Western blotted with anti-SNAP-25 antibody to facilitate assessment of SNAP-25 cleavage. SNAP-25 cleavage was calculated by densitometric analysis (Syngene).
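The text does not state the formula applied to the densitometry readings. One common convention, shown here purely as an assumption, is to express the cleaved SNAP-25 band as a percentage of the total SNAP-25 signal (intact plus cleaved) in the same lane. The function name and the band intensities are hypothetical.

```python
def percent_cleaved(intact_signal, cleaved_signal):
    """Cleaved SNAP-25 as a percentage of total SNAP-25 signal in a lane.

    Assumed formula: 100 * cleaved / (intact + cleaved). The source only
    says cleavage was calculated by densitometric analysis.
    """
    total = intact_signal + cleaved_signal
    if total <= 0:
        raise ValueError("no SNAP-25 signal in lane")
    return 100.0 * cleaved_signal / total


# Hypothetical band intensities (arbitrary densitometry units).
print(percent_cleaved(300.0, 100.0))
```

Normalising within each lane in this way makes the value insensitive to lane-to-lane differences in loading.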

Plating Cells

[0402] Cells at 2.times.10.sup.5 cells/ml were prepared and seeded at 125 .mu.l per well of a 96 well plate using 500 ml Gibco Ham F12 with Glutamax, 50 ml FBS, 5 .mu.g/ml Blasticidin (Calbiochem, 10 ml at 10 mg/ml), 100 .mu.g/ml Zeocin. Cells were allowed to grow for 24 hrs (37.degree. C., 5% CO.sub.2, humidified atmosphere).

Cell Treatment

[0403] Dilutions of test protein were prepared to give a dose range for each test protein. CHO GALR1 feeding medium was filter sterilized (20 ml syringe, 0.2 .mu.m syringe filter) to make the dilutions. Filtered medium was added into 5 labeled Bijoux (7 ml tubes), 0.9 ml each, using a Gilson pipette or multi-stepper. The stock test protein was diluted to 2000 nM (working stock solution 1) and 600 nM (working stock solution 2). Using a Gilson pipette, 10-fold serial dilutions of each working stock were prepared by adding 100 .mu.l to the next concentration in the series. Each dilution was pipetted up and down to mix thoroughly. This step was repeated to obtain 4 serial dilutions for solution 1, and 3 serial dilutions for solution 2. A 0 nM control (filtered feeding medium only) was also prepared as a negative control for each plate. The above steps were repeated for each test protein. In each experiment a `standard` batch of material was included as control/reference material; this is also called unliganded LC/A-H.sub.N/A.

Apply Diluted Sample to CHO GALR1 Plates

[0404] Test sample (125 .mu.l, double concentration) was applied per well. Each test sample was applied to triplicate wells and each dose range tested included a 0 nM control. Samples were incubated for 24 hrs (37.degree. C., 5% CO.sub.2, humidified atmosphere).
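The two interleaved 10-fold series and the double-concentration application described above can be checked numerically. The sketch assumes that each working stock is itself applied as the top dose of its series and that each well already holds 125 .mu.l of medium from plating, so that adding 125 .mu.l of sample halves the concentration; both points are readings of the text rather than explicit statements in it.

```python
def tenfold_series(working_stock_nm, n_dilutions):
    """Working stock followed by n_dilutions successive 1:10 dilutions.

    Each step transfers 100 ul into 900 ul of filtered medium.
    """
    concs = [working_stock_nm]
    for _ in range(n_dilutions):
        concs.append(concs[-1] * 100 / 1000)
    return concs


# Solution 1: 2000 nM with 4 dilutions; solution 2: 600 nM with 3 dilutions;
# plus the 0 nM negative control.
applied_nm = sorted(tenfold_series(2000, 4) + tenfold_series(600, 3) + [0],
                    reverse=True)

# 125 ul of sample added to 125 ul already in the well (assumed) halves it.
final_nm = [c / 2 for c in applied_nm]

for applied, final in zip(applied_nm, final_nm):
    print(f"applied {applied:7.1f} nM -> final {final:7.2f} nM")
```

Under these assumptions the final doses on the cells run from 1000 nM down to 0.1 nM in alternating steps of roughly threefold, plus the 0 nM control.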

Cell Lysis

[0405] Fresh lysis buffer (20 ml per plate) was prepared with 25% (4.times.) NuPAGE LDS sample buffer, 65% dH.sub.2O and 10% 1 M DTT. Medium was removed from the CHO GALR1 plate by inverting over a waste receptacle. The remaining media was drained from each well using a fine-tipped pipette. The cells were lysed by adding 125 .mu.l of lysis buffer per well using a multi-stepper pipette. After a minimum of 20 min, the buffer was removed from each well to a 1.5 ml microcentrifuge tube. Tubes were numbered to allow tracking of the CHO GALR1 treatments throughout the blotting procedure: A1-A3 down to H1-H3 numbered 1-24, A4-A6 down to H4-H6 numbered 25-48, A7-A9 down to H7-H9 numbered 49-72, and A10-A12 down to H10-H12 numbered 73-96. Each sample was mixed by vortex and heated at 90.degree. C. for 5 to 10 min in a pre-warmed heat block. The samples were either stored at -20.degree. C. or separated on the same day on an SDS gel.
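The tube numbering scheme above maps each well of the 96 well plate to a tube number by working through blocks of three columns (the triplicate wells), row by row. A minimal sketch of that mapping, assuming that numbering runs left to right within each triplicate and top to bottom within each block; the function name `well_to_tube` is hypothetical.

```python
def well_to_tube(well):
    """Map a well label such as 'B5' to its tube number (1-96).

    Columns are grouped in blocks of three (1-3, 4-6, 7-9, 10-12); within a
    block, tubes are numbered across the triplicate and then down the rows.
    """
    row = ord(well[0].upper()) - ord("A")   # 0 for A ... 7 for H
    col = int(well[1:])                     # 1 ... 12
    if not (0 <= row <= 7 and 1 <= col <= 12):
        raise ValueError(f"not a 96 well plate position: {well}")
    block = (col - 1) // 3                  # which group of three columns
    within = (col - 1) % 3                  # position inside the triplicate
    return block * 24 + row * 3 + within + 1


for label in ("A1", "H3", "A4", "H9", "A10", "H12"):
    print(label, well_to_tube(label))
```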

Gel Electrophoresis

[0406] If the sample had been stored overnight or longer, the sample was pre-warmed in a heat block to 90.degree. C. for 5 to 10 min. 12% bis-tris SDS-PAGE gels were used with running buffer (1.times., Invitrogen NuPAGE MOPS SDS Running Buffer (20.times.)). 500 .mu.l of NuPAGE antioxidant was added to the upper buffer chamber. 15 .mu.l samples were loaded onto the gel lanes from left to right. 2.5 .mu.l Magic Marker XP (Invitrogen), 5 .mu.l See Blue Plus 2 pre-stained standard (Invitrogen) and 15 .mu.l of non-treated control were also loaded onto gel lanes. The gels were run at 200 V for 1 hour and 25 minutes (until the pink (17 kDa) marker reached the bottom of the tank).

Western Blotting

[0407] Semi-dry transfer was carried out using an Invitrogen iBlot (iBlot Programme 3 for 6 minutes). Nitrocellulose membranes were put in individual small trays and incubated with blocking buffer solution (5 g Marvel milk powder per 100 ml 0.1% PBS/Tween) at room temperature on a rocker for 1 hour. Primary antibody (Anti-SNAP-25, 1:1000 dilution in blocking buffer) was applied and the membranes were incubated for 1 hour on a rocker at room temperature. Membranes were washed by rinsing 3 times with PBS/Tween (0.1%). The secondary antibody (Anti-Rabbit-HRP conjugate, diluted 1:1000 in blocking buffer) was then applied and the membranes were incubated at room temperature on a rocker for 1 hour. Membranes were again washed by rinsing 3 times with PBS/Tween (0.1%), and the membranes were left for a minimum of 20 min for the last wash. Bound antibody was detected using Syngene: blots were drained of PBS/Tween, and WestDura reagents were mixed 1:1 and added to completely cover the blots for 5 minutes. The membranes were placed in a Syngene tray, and the Syngene software was set for a 5 min exposure time.

[0408] FIGS. 3 and 5 demonstrate that galanin fusion proteins of the invention effectively cleave SNAP-25.

Example 8

Assessment of In Vivo Efficacy of a Galanin Fusion

[0409] The nociceptive flexion reflex (also known as paw guarding assay) is a rapid withdrawal movement that constitutes a protective mechanism against possible limb damage. It can be quantified by assessment of the electromyography (EMG) response in an anesthetized rat as a result of low dose capsaicin, electrical stimulation or the capsaicin-sensitized electrical response. Fusion proteins of the present invention were administered by intraplantar injection to 300-380 g male Sprague-Dawley rats as a 24 hour pretreatment. Induction of paw guarding was achieved by 0.006% capsaicin, 10 .mu.l in PBS (7.5% DMSO), injected in 10 seconds. This produced a robust reflex response from the biceps femoris muscle. A reduction/inhibition of the nociceptive flexion reflex indicates that the test substance demonstrates an anti-nociceptive effect. The data demonstrated the anti-nociceptive effect of the galanin fusion proteins of the present invention as a percentage (FIG. 6).

[0410] The ability of different galanin fusion proteins of the invention to inhibit capsaicin-induced thermal hyperalgesia was evaluated (FIGS. 7 and 8). Fusion proteins were administered by intraplantar injection to Sprague-Dawley rats; 24 hours later, 0.3% capsaicin was injected and the rats were placed in acrylic boxes on a 25.degree. C. glass plate. A light beam (adjustable light intensity) was focused on the hind paw, and sensors detected movement of the paw, stopping the timer. Paw Withdrawal Latency is the time taken to remove the paw from the heat source (cut-off of 20.48 seconds). A reduction/inhibition of the capsaicin-induced decrease in paw withdrawal latency indicates that the test substance demonstrates an anti-nociceptive effect. The data demonstrated the enhanced anti-nociceptive effect of the galanin fusion proteins of the present invention compared to fusion proteins with a C-terminally presented ligand.

Example 9

Confirmation of TM Agonist Activity by Measuring Release of Substance P from Neuronal Cell Cultures

Materials

[0411] Substance P EIA was obtained from R&D Systems, UK.

Methods

[0412] Primary neuronal cultures of eDRG were established as described previously (Duggan et al., (2002) J. Biol. Chem. 277:34846-34852). Substance P released from the cultures was assessed by EIA, essentially as described previously (Duggan et al. (2002), J. Biol. Chem. 277:34846-34852). The TM of interest was added to the neuronal cultures (established for at least 2 weeks prior to treatment); control cultures were performed in parallel by addition of vehicle in place of TM. Stimulated (100 mM KCl) and basal release, together with total cell lysate content, of substance P were obtained for both control and TM treated cultures. Substance P immunoreactivity was measured using Substance P Enzyme Immunoassay Kits (Cayman Chemical Company, USA or R&D Systems, UK) according to manufacturers' instructions.

[0413] The amount of Substance P released by the neuronal cells in the presence of the TM of interest was compared to the release obtained in the presence and absence of 100 mM KCl. Stimulation of Substance P release by the TM of interest above the basal release establishes that the TM of interest was an "agonist ligand" as defined in this specification. If desired, the stimulation of Substance P release by the TM of interest could be compared to a standard Substance P release-curve produced using the natural ORL-1 receptor ligand, nociceptin (Tocris).

Example 10

[0414] A method of treating, preventing or ameliorating pain in a subject, comprising administration to said patient a therapeutic effective amount of fusion protein, wherein said pain is selected from the group consisting of: chronic pain arising from malignant disease, chronic pain not caused by malignant disease (peripheral neuropathies).

Patient A

[0415] A 73 year old woman suffering from severe pain caused by postherpetic neuralgia is treated by a peripheral injection with fusion protein to reduce neurotransmitter release at the synapse of nerve terminals to reduce the pain. The patient experiences good analgesic effect within 2 hours of said injection.

Patient B

[0416] A 32 year old male suffering from phantom limb pain after having his left arm amputated following a car accident is treated by peripheral injection with fusion protein to reduce the pain. The patient experiences good analgesic effect within 1 hour of said injection.

Patient C

[0417] A 55 year old male suffering from diabetic neuropathy is treated by a peripheral injection with fusion protein to reduce neurotransmitter release at the synapse of nerve terminals to reduce the pain. The patient experiences good analgesic effect within 4 hours of said injection.

Patient D

[0418] A 63 year old woman suffering from cancer pain is treated by a peripheral injection with fusion protein to reduce neurotransmitter release at the synapse of nerve terminals to reduce the pain. The patient experiences good analgesic effect within 4 hours of said injection.

[0419] All documents, books, manuals, papers, patents, published patent applications, guides, abstracts and other reference materials cited herein are incorporated by reference in their entirety. While the foregoing specification teaches the principles of the present invention, with examples provided for the purpose of illustration, it will be appreciated by one skilled in the art from reading this disclosure that various changes in form and detail can be made without departing from the true scope of the invention.

[0420] While illustrative embodiments have been illustrated and described, it will be appreciated that various changes can be made therein without departing from the spirit and scope of the invention.

Sequence CWU 1

1

8911302DNAArtificial SequenceSynthetic sequence 1ggatccatgg agttcgttaa caaacagttc aactataaag acccagttaa cggtgttgac 60attgcttaca tcaaaatccc gaacgctggc cagatgcagc cggtaaaggc attcaaaatc 120cacaacaaaa tctgggttat cccggaacgt gataccttta ctaacccgga agaaggtgac 180ctgaacccgc caccggaagc gaaacaggtg ccggtatctt actatgactc cacctacctg 240tctaccgata acgaaaagga caactacctg aaaggtgtta ctaaactgtt cgagcgtatt 300tactccaccg acctgggccg tatgctgctg actagcatcg ttcgcggtat cccgttctgg 360ggcggttcta ccatcgatac cgaactgaaa gtaatcgaca ctaactgcat caacgttatt 420cagccggacg gttcctatcg ttccgaagaa ctgaacctgg tgatcatcgg cccgtctgct 480gatatcatcc agttcgagtg taagagcttt ggtcacgaag ttctgaacct cacccgtaac 540ggctacggtt ccactcagta catccgtttc tctccggact tcaccttcgg ttttgaagaa 600tccctggaag tagacacgaa cccactgctg ggcgctggta aattcgcaac tgatcctgcg 660gttaccctgg ctcacgaact gattcatgca ggccaccgcc tgtacggtat cgccatcaat 720ccgaaccgtg tcttcaaagt taacaccaac gcgtattacg agatgtccgg tctggaagtt 780agcttcgaag aactgcgtac ttttggcggt cacgacgcta aattcatcga ctctctgcaa 840gaaaacgagt tccgtctgta ctactataac aagttcaaag atatcgcatc caccctgaac 900aaagcgaaat ccatcgtggg taccactgct tctctccagt acatgaagaa cgtttttaaa 960gaaaaatacc tgctcagcga agacacctcc ggcaaattct ctgtagacaa gttgaaattc 1020gataaacttt acaaaatgct gactgaaatt tacaccgaag acaacttcgt taagttcttt 1080aaagttctga accgcaaaac ctatctgaac ttcgacaagg cagtattcaa aatcaacatc 1140gtgccgaaag ttaactacac tatctacgat ggtttcaacc tgcgtaacac caacctggct 1200gctaatttta acggccagaa cacggaaatc aacaacatga acttcacaaa actgaaaaac 1260ttcactggtc tgttcgagtt ttacaagctg ctgtgcgtcg ac 130221257DNAArtificial SequenceSynthetic 2ctgcagtgta tcaaggttaa caactgggat ttattcttca gcccgagtga agacaacttc 60accaacgacc tgaacaaagg tgaagaaatc acctcagata ctaacatcga agcagccgaa 120gaaaacatct cgctggacct gatccagcag tactacctga cctttaattt cgacaacgag 180ccggaaaaca tttctatcga aaacctgagc tctgatatca tcggccagct ggaactgatg 240ccgaacatcg aacgtttccc aaacggtaaa aagtacgagc tggacaaata taccatgttc 300cactacctgc gcgcgcagga atttgaacac ggcaaatccc gtatcgcact 
gactaactcc 360gttaacgaag ctctgctcaa cccgtcccgt gtatacacct tcttctctag cgactacgtg 420aaaaaggtca acaaagcgac tgaagctgca atgttcttgg gttgggttga acagcttgtt 480tatgatttta ccgacgagac gtccgaagta tctactaccg acaaaattgc ggatatcact 540atcatcatcc cgtacatcgg tccggctctg aacattggca acatgctgta caaagacgac 600ttcgttggcg cactgatctt ctccggtgcg gtgatcctgc tggagttcat cccggaaatc 660gccatcccgg tactgggcac ctttgctctg gtttcttaca ttgcaaacaa ggttctgact 720gtacaaacca tcgacaacgc gctgagcaaa cgtaacgaaa aatgggatga agtttacaaa 780tatatcgtga ccaactggct ggctaaggtt aatactcaga tcgacctcat ccgcaaaaaa 840atgaaagaag cactggaaaa ccaggcggaa gctaccaagg caatcattaa ctaccagtac 900aaccagtaca ccgaggaaga aaaaaacaac atcaacttca acatcgacga tctgtcctct 960aaactgaacg aatccatcaa caaagctatg atcaacatca acaagttcct gaaccagtgc 1020tctgtaagct atctgatgaa ctccatgatc ccgtacggtg ttaaacgtct ggaggacttc 1080gatgcgtctc tgaaagacgc cctgctgaaa tacatttacg acaaccgtgg cactctgatc 1140ggtcaggttg atcgtctgaa ggacaaagtg aacaatacct tatcgaccga catccctttt 1200cagctcagta aatatgtcga taaccaacgc cttttgtcca ctctagacta gaagctt 125731323DNAArtificial SequenceSynthetic 3ggatccatgc cggttaccat caacaacttc aactacaacg acccgatcga caacaacaac 60atcattatga tggaaccgcc gttcgcacgt ggtaccggac gttactacaa ggcttttaag 120atcaccgacc gtatctggat catcccggaa cgttacacct tcggttacaa acctgaggac 180ttcaacaaga gtagcgggat tttcaatcgt gacgtctgcg agtactatga tccagattat 240ctgaatacca acgataagaa gaacatattc cttcagacta tgattaaact cttcaaccgt 300atcaaaagca aaccgctcgg tgaaaaactc ctcgaaatga ttatcaacgg tatcccgtac 360ctcggtgacc gtcgtgtccc gcttgaagag ttcaacacca acatcgcaag cgtcaccgtc 420aacaaactca tcagcaaccc aggtgaagtc gaacgtaaaa aaggtatctt cgcaaacctc 480atcatcttcg gtccgggtcc ggtcctcaac gaaaacgaaa ccatcgacat cggtatccag 540aaccacttcg caagccgtga aggtttcggt ggtatcatgc agatgaaatt ctgcccggaa 600tacgtcagtg tcttcaacaa cgtccaggaa aacaaaggtg caagcatctt caaccgtcgt 660ggttacttca gcgacccggc actcatcctc atgcatgaac tcatccacgt cctccacggt 720ctctacggta tcaaagttga cgacctcccg atcgtcccga acgagaagaa attcttcatg 780cagagcaccg 
acgcaatcca ggctgaggaa ctctacacct tcggtggcca agacccaagt 840atcataaccc cgtccaccga caaaagcatc tacgacaaag tcctccagaa cttcaggggt 900atcgtggaca gactcaacaa agtcctcgtc tgcatcagcg acccgaacat caatatcaac 960atatacaaga acaagttcaa agacaagtac aaattcgtcg aggacagcga aggcaaatac 1020agcatcgacg tagaaagttt cgacaagctc tacaaaagcc tcatgttcgg tttcaccgaa 1080accaacatcg ccgagaacta caagatcaag acaagggcaa gttacttcag cgacagcctc 1140ccgcctgtca aaatcaagaa cctcttagac aacgagattt acacaattga agagggcttc 1200aacatcagtg acaaagacat ggagaaggaa tacagaggtc agaacaaggc tatcaacaaa 1260caggcatacg aggagatcag caaagaacac ctcgcagtct acaagatcca gatgtgcgtc 1320gac 132341260DNAArtificial SequenceSynthetic 4ctgcagtgca tcgacgttga caacgaagac ctgttcttca tcgctgacaa aaacagcttc 60agtgacgacc tgagcaaaaa cgaacgtatc gaatacaaca cccagagcaa ctacatcgaa 120aacgacttcc cgatcaacga actgatcctg gacaccgacc tgataagtaa aatcgaactg 180ccgagcgaaa acaccgaaag tctgaccgac ttcaacgttg acgttccggt ttacgaaaaa 240cagccggcta tcaagaaaat cttcaccgac gaaaacacca tcttccagta cctgtacagc 300cagaccttcc cgctggacat ccgtgacatc agtctgacca gcagtttcga cgacgctctg 360ctgttcagca acaaagttta cagtttcttc agcatggact acatcaaaac cgctaacaaa 420gttgttgaag cagggctgtt cgctggttgg gttaaacaga tcgttaacga cttcgttatc 480gaagctaaca aaagcaacac tatggacaaa atcgctgaca tcagtctgat cgttccgtac 540atcggtctgg ctctgaacgt tggtaacgaa accgctaaag gtaactttga aaacgctttc 600gagatcgctg gtgcaagcat cctgctggag ttcatcccgg aactgctgat cccggttgtt 660ggtgctttcc tgctggaaag ttacatcgac aacaaaaaca agatcatcaa aaccatcgac 720aacgctctga ccaaacgtaa cgaaaaatgg agtgatatgt acggtctgat cgttgctcag 780tggctgagca ccgtcaacac ccagttctac accatcaaag aaggtatgta caaagctctg 840aactaccagg ctcaggctct ggaagagatc atcaaatacc gttacaacat ctacagtgag 900aaggaaaaga gtaacatcaa catcgacttc aacgacatca acagcaaact gaacgaaggt 960atcaaccagg ctatcgacaa catcaacaac ttcatcaacg gttgcagtgt tagctacctg 1020atgaagaaga tgatcccgct ggctgttgaa aaactgctgg acttcgacaa caccctgaaa 1080aagaacctgc tgaactacat cgacgaaaac aagctgtacc tgatcggtag tgctgaatac 1140gaaaaaagta aagtgaacaa 
atacctgaag accatcatgc cgttcgacct gagtatctac 1200accaacgaca ccatcctgat cgaaatgttc aacaaataca actctctaga ctagaagctt 126051329DNAArtificial SequenceSynthetic 5ggatccgaat tcatgccgat caccatcaac aacttcaact acagcgatcc ggtggataac 60aaaaacatcc tgtacctgga tacccatctg aataccctgg cgaacgaacc ggaaaaagcg 120tttcgtatca ccggcaacat ttgggttatt ccggatcgtt ttagccgtaa cagcaacccg 180aatctgaata aaccgccgcg tgttaccagc ccgaaaagcg gttattacga tccgaactat 240ctgagcaccg atagcgataa agataccttc ctgaaagaaa tcatcaaact gttcaaacgc 300atcaacagcc gtgaaattgg cgaagaactg atctatcgcc tgagcaccga tattccgttt 360ccgggcaaca acaacacccc gatcaacacc tttgatttcg atgtggattt caacagcgtt 420gatgttaaaa cccgccaggg taacaattgg gtgaaaaccg gcagcattaa cccgagcgtg 480attattaccg gtccgcgcga aaacattatt gatccggaaa ccagcacctt taaactgacc 540aacaacacct ttgcggcgca ggaaggtttt ggcgcgctga gcattattag cattagcccg 600cgctttatgc tgacctatag caacgcgacc aacgatgttg gtgaaggccg tttcagcaaa 660agcgaatttt gcatggaccc gatcctgatc ctgatgcatg aactgaacca tgcgatgcat 720aacctgtatg gcatcgcgat tccgaacgat cagaccatta gcagcgtgac cagcaacatc 780ttttacagcc agtacaacgt gaaactggaa tatgcggaaa tctatgcgtt tggcggtccg 840accattgatc tgattccgaa aagcgcgcgc aaatacttcg aagaaaaagc gctggattac 900tatcgcagca ttgcgaaacg tctgaacagc attaccaccg cgaatccgag cagcttcaac 960aaatatatcg gcgaatataa acagaaactg atccgcaaat atcgctttgt ggtggaaagc 1020agcggcgaag ttaccgttaa ccgcaataaa ttcgtggaac tgtacaacga actgacccag 1080atcttcaccg aatttaacta tgcgaaaatc tataacgtgc agaaccgtaa aatctacctg 1140agcaacgtgt ataccccggt gaccgcgaat attctggatg ataacgtgta cgatatccag 1200aacggcttta acatcccgaa aagcaacctg aacgttctgt ttatgggcca gaacctgagc 1260cgtaatccgg cgctgcgtaa agtgaacccg gaaaacatgc tgtacctgtt caccaaattt 1320tgcgtcgac 132961263DNAArtificial SequenceSynthetic 6ctgcagtgtc gtgaactgct ggtgaaaaac accgatctgc cgtttattgg cgatatcagc 60gatgtgaaaa ccgatatctt cctgcgcaaa gatatcaacg aagaaaccga agtgatctac 120tacccggata acgtgagcgt tgatcaggtg atcctgagca aaaacaccag cgaacatggt 180cagctggatc tgctgtatcc gagcattgat agcgaaagcg aaattctgcc 
gggcgaaaac 240caggtgtttt acgataaccg tacccagaac gtggattacc tgaacagcta ttactacctg 300gaaagccaga aactgagcga taacgtggaa gattttacct ttacccgcag cattgaagaa 360gcgctggata acagcgcgaa agtttacacc tattttccga ccctggcgaa caaagttaat 420gcgggtgttc agggcggtct gtttctgatg tgggcgaacg atgtggtgga agatttcacc 480accaacatcc tgcgtaaaga taccctggat aaaatcagcg atgttagcgc gattattccg 540tatattggtc cggcgctgaa cattagcaat agcgtgcgtc gtggcaattt taccgaagcg 600tttgcggtta ccggtgtgac cattctgctg gaagcgtttc cggaatttac cattccggcg 660ctgggtgcgt ttgtgatcta tagcaaagtg caggaacgca acgaaatcat caaaaccatc 720gataactgcc tggaacagcg tattaaacgc tggaaagata gctatgaatg gatgatgggc 780acctggctga gccgtattat cacccagttc aacaacatca gctaccagat gtacgatagc 840ctgaactatc aggcgggtgc gattaaagcg aaaatcgatc tggaatacaa aaaatacagc 900ggcagcgata aagaaaacat caaaagccag gttgaaaacc tgaaaaacag cctggatgtg 960aaaattagcg aagcgatgaa taacatcaac aaattcatcc gcgaatgcag cgtgacctac 1020ctgttcaaaa acatgctgcc gaaagtgatc gatgaactga acgaatttga tcgcaacacc 1080aaagcgaaac tgatcaacct gatcgatagc cacaacatta ttctggtggg cgaagtggat 1140aaactgaaag cgaaagttaa caacagcttc cagaacacca tcccgtttaa catcttcagc 1200tataccaaca acagcctgct gaaagatatc atcaacgaat acttcaatct agactagaag 1260ctt 1263730PRTArtificial SequenceSynthetic sequence 7Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val 1 5 10 15 Gly Asn His Arg Ser Phe Ser Asp Leu Asn Gly Leu Thr Ser 20 25 30 816PRTArtificial SequenceSynthetic sequence 8Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val 1 5 10 15 92781DNAArtificial SequenceSynthetic sequence 9atttcagaat tcggatccat ggagttcgtt aacaaacagt tcaactataa agacccagtt 60aacggtgttg acattgctta catcaaaatc ccgaacgctg gccagatgca gccggtaaag 120gcattcaaaa tccacaacaa aatctgggtt atcccggaac gtgatacctt tactaacccg 180gaagaaggtg acctgaaccc gccaccggaa gcgaaacagg tgccggtatc ttactatgac 240tccacctacc tgtctaccga taacgaaaag gacaactacc tgaaaggtgt tactaaactg 300ttcgagcgta tttactccac cgacctgggc cgtatgctgc tgactagcat cgttcgcggt 360atcccgttct ggggcggttc taccatcgat 
accgaactga aagtaatcga cactaactgc 420atcaacgtta ttcagccgga cggttcctat cgttccgaag aactgaacct ggtgatcatc 480ggcccgtctg ctgatatcat ccagttcgag tgtaagagct ttggtcacga agttctgaac 540ctcacccgta acggctacgg ttccactcag tacatccgtt tctctccgga cttcaccttc 600ggttttgaag aatccctgga agtagacacg aacccactgc tgggcgctgg taaattcgca 660actgatcctg cggttaccct ggctcacgaa ctgattcatg caggccaccg cctgtacggt 720atcgccatca atccgaaccg tgtcttcaaa gttaacacca acgcgtatta cgagatgtcc 780ggtctggaag ttagcttcga agaactgcgt acttttggcg gtcacgacgc taaattcatc 840gactctctgc aagaaaacga gttccgtctg tactactata acaagttcaa agatatcgca 900tccaccctga acaaagcgaa atccatcgtg ggtaccactg cttctctcca gtacatgaag 960aacgttttta aagaaaaata cctgctcagc gaagacacct ccggcaaatt ctctgtagac 1020aagttgaaat tcgataaact ttacaaaatg ctgactgaaa tttacaccga agacaacttc 1080gttaagttct ttaaagttct gaaccgcaaa acctatctga acttcgacaa ggcagtattc 1140aaaatcaaca tcgtgccgaa agttaactac actatctacg atggtttcaa cctgcgtaac 1200accaacctgg ctgctaattt taacggccag aacacggaaa tcaacaacat gaacttcaca 1260aaactgaaaa acttcactgg tctgttcgag ttttacaagc tgctgtgcgt cgacggcggt 1320ggcggtagcg gcggtggcgg tagcggcggt ggcggtagcg cagacgatga cgataaaggt 1380tggaccctga actctgctgg ttacctgctg ggtccgcacg ctgttgcgct agcgggcggt 1440ggcggtagcg gcggtggcgg tagcggcggt ggcggtagcg cactagtgct gcagtgtatc 1500aaggttaaca actgggattt attcttcagc ccgagtgaag acaacttcac caacgacctg 1560aacaaaggtg aagaaatcac ctcagatact aacatcgaag cagccgaaga aaacatctcg 1620ctggacctga tccagcagta ctacctgacc tttaatttcg acaacgagcc ggaaaacatt 1680tctatcgaaa acctgagctc tgatatcatc ggccagctgg aactgatgcc gaacatcgaa 1740cgtttcccaa acggtaaaaa gtacgagctg gacaaatata ccatgttcca ctacctgcgc 1800gcgcaggaat ttgaacacgg caaatcccgt atcgcactga ctaactccgt taacgaagct 1860ctgctcaacc cgtcccgtgt atacaccttc ttctctagcg actacgtgaa aaaggtcaac 1920aaagcgactg aagctgcaat gttcttgggt tgggttgaac agcttgttta tgattttacc 1980gacgagacgt ccgaagtatc tactaccgac aaaattgcgg atatcactat catcatcccg 2040tacatcggtc cggctctgaa cattggcaac atgctgtaca aagacgactt cgttggcgca 2100ctgatcttct 
ccggtgcggt gatcctgctg gagttcatcc cggaaatcgc catcccggta 2160ctgggcacct ttgctctggt ttcttacatt gcaaacaagg ttctgactgt acaaaccatc 2220gacaacgcgc tgagcaaacg taacgaaaaa tgggatgaag tttacaaata tatcgtgacc 2280aactggctgg ctaaggttaa tactcagatc gacctcatcc gcaaaaaaat gaaagaagca 2340ctggaaaacc aggcggaagc taccaaggca atcattaact accagtacaa ccagtacacc 2400gaggaagaaa aaaacaacat caacttcaac atcgacgatc tgtcctctaa actgaacgaa 2460tccatcaaca aagctatgat caacatcaac aagttcctga accagtgctc tgtaagctat 2520ctgatgaact ccatgatccc gtacggtgtt aaacgtctgg aggacttcga tgcgtctctg 2580aaagacgccc tgctgaaata catttacgac aaccgtggca ctctgatcgg tcaggttgat 2640cgtctgaagg acaaagtgaa caatacctta tcgaccgaca tcccttttca gctcagtaaa 2700tatgtcgata accaacgcct tttgtccact ctagaagcac tagcgagtgg gcaccatcac 2760catcaccatt aatgaaagct t 278110923PRTArtificial SequenceSynthetic sequence 10Ile Ser Glu Phe Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr 1 5 10 15 Lys Asp Pro Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn 20 25 30 Ala Gly Gln Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile 35 40 45 Trp Val Ile Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp 50 55 60 Leu Asn Pro Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp 65 70 75 80 Ser Thr Tyr Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly 85 90 95 Val Thr Lys Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met 100 105 110 Leu Leu Thr Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr 115 120 125 Ile Asp Thr Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile 130 135 140 Gln Pro Asp Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile 145 150 155 160 Gly Pro Ser Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His 165 170 175 Glu Val Leu Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile 180 185 190 Arg Phe Ser Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val 195 200 205 Asp Thr Asn Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala 210 215 220 Val Thr Leu Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly 225 230 235 240 Ile Ala Ile Asn Pro Asn Arg 
Val Phe Lys Val Asn Thr Asn Ala Tyr 245 250 255 Tyr Glu Met Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe 260 265 270 Gly Gly His Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe 275 280 285 Arg Leu Tyr Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn 290 295 300 Lys Ala Lys Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys 305 310 315 320 Asn Val Phe Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys 325 330 335 Phe Ser Val Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr 340 345 350 Glu Ile Tyr Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn 355 360 365 Arg Lys Thr Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile 370 375 380 Val Pro Lys Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn 385 390 395 400 Thr Asn Leu Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn 405 410 415 Met Asn Phe Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr 420 425 430 Lys Leu Leu Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 435 440 445 Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn 450 455 460 Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly 465 470 475 480 Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val 485 490 495 Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser 500 505 510 Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser 515 520 525 Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile 530 535 540 Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile 545 550
555 560 Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met 565 570 575 Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys 580 585 590 Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly Lys 595 600 605 Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro 610 615 620 Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn 625 630 635 640 Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val 645 650 655 Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys Ile 660 665 670 Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile 675 680 685 Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser 690 695 700 Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val 705 710 715 720 Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr 725 730 735 Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp 740 745 750 Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn Thr 755 760 765 Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln 770 775 780 Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr 785 790 795 800 Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser 805 810 815 Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe 820 825 830 Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr 835 840 845 Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu 850 855 860 Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp 865 870 875 880 Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe 885 890 895 Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Glu 900 905 910 Ala Leu Ala Ser Gly His His His His His His 915 920 11912PRTArtificial SequenceSynthetic sequence 11Ile Ser Glu Phe Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr 1 5 10 15 Lys Asp Pro Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn 20 25 30 Ala Gly Gln Met Gln Pro Val Lys Ala Phe Lys Ile His Asn 
Lys Ile 35 40 45 Trp Val Ile Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp 50 55 60 Leu Asn Pro Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp 65 70 75 80 Ser Thr Tyr Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly 85 90 95 Val Thr Lys Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met 100 105 110 Leu Leu Thr Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr 115 120 125 Ile Asp Thr Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile 130 135 140 Gln Pro Asp Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile 145 150 155 160 Gly Pro Ser Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His 165 170 175 Glu Val Leu Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile 180 185 190 Arg Phe Ser Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val 195 200 205 Asp Thr Asn Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala 210 215 220 Val Thr Leu Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly 225 230 235 240 Ile Ala Ile Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr 245 250 255 Tyr Glu Met Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe 260 265 270 Gly Gly His Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe 275 280 285 Arg Leu Tyr Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn 290 295 300 Lys Ala Lys Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys 305 310 315 320 Asn Val Phe Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys 325 330 335 Phe Ser Val Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr 340 345 350 Glu Ile Tyr Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn 355 360 365 Arg Lys Thr Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile 370 375 380 Val Pro Lys Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn 385 390 395 400 Thr Asn Leu Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn 405 410 415 Met Asn Phe Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr 420 425 430 Lys Leu Leu Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 435 440 445 Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn 450 455 
460 Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly 465 470 475 480 Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val 485 490 495 Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser 500 505 510 Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser 515 520 525 Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile 530 535 540 Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile 545 550 555 560 Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met 565 570 575 Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys 580 585 590 Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly Lys 595 600 605 Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro 610 615 620 Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn 625 630 635 640 Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val 645 650 655 Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys Ile 660 665 670 Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile 675 680 685 Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser 690 695 700 Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val 705 710 715 720 Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr 725 730 735 Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp 740 745 750 Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn Thr 755 760 765 Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln 770 775 780 Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr 785 790 795 800 Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser 805 810 815 Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe 820 825 830 Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr 835 840 845 Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu 850 855 860 Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp 865 870 875 
880 Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe 885 890 895 Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Asp 900 905 910 122745DNAArtificial SequenceSynthetic sequence 12catatgggat ccatggagtt cgttaacaaa cagttcaact ataaagaccc agttaacggt 60gttgacattg cttacatcaa aatcccgaac gctggccaga tgcagccggt aaaggcattc 120aaaatccaca acaaaatctg ggttatcccg gaacgtgata cctttactaa cccggaagaa 180ggtgacctga acccgccacc ggaagcgaaa caggtgccgg tatcttacta tgactccacc 240tacctgtcta ccgataacga aaaggacaac tacctgaaag gtgttactaa actgttcgag 300cgtatttact ccaccgacct gggccgtatg ctgctgacta gcatcgttcg cggtatcccg 360ttctggggcg gttctaccat cgataccgaa ctgaaagtaa tcgacactaa ctgcatcaac 420gttattcagc cggacggttc ctatcgttcc gaagaactga acctggtgat catcggcccg 480tctgctgata tcatccagtt cgagtgtaag agctttggtc acgaagttct gaacctcacc 540cgtaacggct acggttccac tcagtacatc cgtttctctc cggacttcac cttcggtttt 600gaagaatccc tggaagtaga cacgaaccca ctgctgggcg ctggtaaatt cgcaactgat 660cctgcggtta ccctggctca cgaactgatt catgcaggcc accgcctgta cggtatcgcc 720atcaatccga accgtgtctt caaagttaac accaacgcgt attacgagat gtccggtctg 780gaagttagct tcgaagaact gcgtactttt ggcggtcacg acgctaaatt catcgactct 840ctgcaagaaa acgagttccg tctgtactac tataacaagt tcaaagatat cgcatccacc 900ctgaacaaag cgaaatccat cgtgggtacc actgcttctc tccagtacat gaagaacgtt 960tttaaagaaa aatacctgct cagcgaagac acctccggca aattctctgt agacaagttg 1020aaattcgata aactttacaa aatgctgact gaaatttaca ccgaagacaa cttcgttaag 1080ttctttaaag ttctgaaccg caaaacctat ctgaacttcg acaaggcagt attcaaaatc 1140aacatcgtgc cgaaagttaa ctacactatc tacgatggtt tcaacctgcg taacaccaac 1200ctggctgcta attttaacgg ccagaacacg gaaatcaaca acatgaactt cacaaaactg 1260aaaaacttca ctggtctgtt cgagttttac aagctgctgt gcgtcgacgg cggtggcggt 1320agcgcagacg atgacgataa aggttggacc ctgaactctg ctggttacct gctgggtccg 1380cacgctgttg cgctagcggg cggtggcggt agcggcggtg gcggtagcgg cggtggcggt 1440agcgcactag tgctgcagtg tatcaaggtt aacaactggg atttattctt cagcccgagt 1500gaagacaact tcaccaacga cctgaacaaa ggtgaagaaa tcacctcaga 
tactaacatc 1560gaagcagccg aagaaaacat ctcgctggac ctgatccagc agtactacct gacctttaat 1620ttcgacaacg agccggaaaa catttctatc gaaaacctga gctctgatat catcggccag 1680ctggaactga tgccgaacat cgaacgtttc ccaaacggta aaaagtacga gctggacaaa 1740tataccatgt tccactacct gcgcgcgcag gaatttgaac acggcaaatc ccgtatcgca 1800ctgactaact ccgttaacga agctctgctc aacccgtccc gtgtatacac cttcttctct 1860agcgactacg tgaaaaaggt caacaaagcg actgaagctg caatgttctt gggttgggtt 1920gaacagcttg tttatgattt taccgacgag acgtccgaag tatctactac cgacaaaatt 1980gcggatatca ctatcatcat cccgtacatc ggtccggctc tgaacattgg caacatgctg 2040tacaaagacg acttcgttgg cgcactgatc ttctccggtg cggtgatcct gctggagttc 2100atcccggaaa tcgccatccc ggtactgggc acctttgctc tggtttctta cattgcaaac 2160aaggttctga ctgtacaaac catcgacaac gcgctgagca aacgtaacga aaaatgggat 2220gaagtttaca aatatatcgt gaccaactgg ctggctaagg ttaatactca gatcgacctc 2280atccgcaaaa aaatgaaaga agcactggaa aaccaggcgg aagctaccaa ggcaatcatt 2340aactaccagt acaaccagta caccgaggaa gaaaaaaaca acatcaactt caacatcgac 2400gatctgtcct ctaaactgaa cgaatccatc aacaaagcta tgatcaacat caacaagttc 2460ctgaaccagt gctctgtaag ctatctgatg aactccatga tcccgtacgg tgttaaacgt 2520ctggaggact tcgatgcgtc tctgaaagac gccctgctga aatacattta cgacaaccgt 2580ggcactctga tcggtcaggt tgatcgtctg aaggacaaag tgaacaatac cttatcgacc 2640gacatccctt ttcagctcag taaatatgtc gataaccaac gccttttgtc cactctagaa 2700gcactagcga gtgggcacca tcaccatcac cattaatgaa agctt 274513910PRTArtificial SequenceSynthetic sequence 13Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe 
Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu 450 455 460 Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 465 470 475 480 Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe 485 490 495 Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu 500 505 510 Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu 515 520 525 Asp Leu Ile Gln Gln Tyr Tyr Leu Thr 
Phe Asn Phe Asp Asn Glu Pro 530 535 540 Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu 545 550 555 560 Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu 565 570 575 Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu
580 585 590 His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu 595 600 605 Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys 610 615 620 Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu 625 630 635 640 Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr 645 650 655 Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala 660 665 670 Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu 675 680 685 Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala 690 695 700 Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys 705 710 715 720 Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu 725 730 735 Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys 740 745 750 Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu 755 760 765 Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn 770 775 780 Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp 785 790 795 800 Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile 805 810 815 Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met 820 825 830 Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys 835 840 845 Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly 850 855 860 Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp 865 870 875 880 Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser 885 890 895 Thr Leu Glu Ala Leu Ala Ser Gly His His His His His His 900 905 910 14899PRTArtificial SequenceSynthetic sequence 14Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr 
Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu 450 455 460 Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 465 470 475 480 Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe 485 490 495 Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu 
Asn Lys Gly Glu Glu 500 505 510 Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu 515 520 525 Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro 530 535 540 Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu 545 550 555 560 Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu 565 570 575 Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu 580 585 590 His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu 595 600 605 Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys 610 615 620 Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu 625 630 635 640 Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr 645 650 655 Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala 660 665 670 Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu 675 680 685 Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala 690 695 700 Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys 705 710 715 720 Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu 725 730 735 Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys 740 745 750 Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu 755 760 765 Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn 770 775 780 Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp 785 790 795 800 Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile 805 810 815 Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met 820 825 830 Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys 835 840 845 Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly 850 855 860 Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp 865 870 875 880 Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser 885 890 895 Thr Leu Asp 152787DNAArtificial SequenceSynthetic sequence 15catatgggat ccatggagtt cgttaacaaa cagttcaact ataaagaccc 
agttaacggt 60gttgacattg cttacatcaa aatcccgaac gctggccaga tgcagccggt aaaggcattc 120aaaatccaca acaaaatctg ggttatcccg gaacgtgata cctttactaa cccggaagaa 180ggtgacctga acccgccacc ggaagcgaaa caggtgccgg tatcttacta tgactccacc 240tacctgtcta ccgataacga aaaggacaac tacctgaaag gtgttactaa actgttcgag 300cgtatttact ccaccgacct gggccgtatg ctgctgacta gcatcgttcg cggtatcccg 360ttctggggcg gttctaccat cgataccgaa ctgaaagtaa tcgacactaa ctgcatcaac 420gttattcagc cggacggttc ctatcgttcc gaagaactga acctggtgat catcggcccg 480tctgctgata tcatccagtt cgagtgtaag agctttggtc acgaagttct gaacctcacc 540cgtaacggct acggttccac tcagtacatc cgtttctctc cggacttcac cttcggtttt 600gaagaatccc tggaagtaga cacgaaccca ctgctgggcg ctggtaaatt cgcaactgat 660cctgcggtta ccctggctca cgaactgatt catgcaggcc accgcctgta cggtatcgcc 720atcaatccga accgtgtctt caaagttaac accaacgcgt attacgagat gtccggtctg 780gaagttagct tcgaagaact gcgtactttt ggcggtcacg acgctaaatt catcgactct 840ctgcaagaaa acgagttccg tctgtactac tataacaagt tcaaagatat cgcatccacc 900ctgaacaaag cgaaatccat cgtgggtacc actgcttctc tccagtacat gaagaacgtt 960tttaaagaaa aatacctgct cagcgaagac acctccggca aattctctgt agacaagttg 1020aaattcgata aactttacaa aatgctgact gaaatttaca ccgaagacaa cttcgttaag 1080ttctttaaag ttctgaaccg caaaacctat ctgaacttcg acaaggcagt attcaaaatc 1140aacatcgtgc cgaaagttaa ctacactatc tacgatggtt tcaacctgcg taacaccaac 1200ctggctgcta attttaacgg ccagaacacg gaaatcaaca acatgaactt cacaaaactg 1260aaaaacttca ctggtctgtt cgagttttac aagctgctgt gcgtcgacgg cggtggcggt 1320agcgcagacg atgacgataa aggttggacc ctgaactctg ctggttacct gctgggtccg 1380cacgctgttg gtaaccaccg ttctttctct gacctgaacg gtctgacctc tgcgctagcg 1440ggcggtggcg gtagcggcgg tggcggtagc ggcggtggcg gtagcgcact agtgctgcag 1500tgtatcaagg ttaacaactg ggatttattc ttcagcccga gtgaagacaa cttcaccaac 1560gacctgaaca aaggtgaaga aatcacctca gatactaaca tcgaagcagc cgaagaaaac 1620atctcgctgg acctgatcca gcagtactac ctgaccttta atttcgacaa cgagccggaa 1680aacatttcta tcgaaaacct gagctctgat atcatcggcc agctggaact gatgccgaac 1740atcgaacgtt tcccaaacgg taaaaagtac 
gagctggaca aatataccat gttccactac 1800ctgcgcgcgc aggaatttga acacggcaaa tcccgtatcg cactgactaa ctccgttaac 1860gaagctctgc tcaacccgtc ccgtgtatac accttcttct ctagcgacta cgtgaaaaag 1920gtcaacaaag cgactgaagc tgcaatgttc ttgggttggg ttgaacagct tgtttatgat 1980tttaccgacg agacgtccga agtatctact accgacaaaa ttgcggatat cactatcatc 2040atcccgtaca tcggtccggc tctgaacatt ggcaacatgc tgtacaaaga cgacttcgtt 2100ggcgcactga tcttctccgg tgcggtgatc ctgctggagt tcatcccgga aatcgccatc 2160ccggtactgg gcacctttgc tctggtttct tacattgcaa acaaggttct gactgtacaa 2220accatcgaca acgcgctgag caaacgtaac gaaaaatggg atgaagttta caaatatatc 2280gtgaccaact ggctggctaa ggttaatact cagatcgacc tcatccgcaa aaaaatgaaa 2340gaagcactgg aaaaccaggc ggaagctacc aaggcaatca ttaactacca gtacaaccag 2400tacaccgagg aagaaaaaaa caacatcaac ttcaacatcg acgatctgtc ctctaaactg 2460aacgaatcca tcaacaaagc tatgatcaac atcaacaagt tcctgaacca gtgctctgta 2520agctatctga tgaactccat gatcccgtac ggtgttaaac gtctggagga cttcgatgcg 2580tctctgaaag acgccctgct gaaatacatt tacgacaacc gtggcactct gatcggtcag 2640gttgatcgtc tgaaggacaa agtgaacaat accttatcga ccgacatccc ttttcagctc 2700agtaaatatg tcgataacca acgccttttg tccactctag aagcactagc gagtgggcac 2760catcaccatc accattaatg aaagctt 278716924PRTArtificial SequenceSynthetic sequence 16Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp 
Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Gly Asn 450 455 460 His Arg Ser Phe Ser Asp Leu Asn Gly Leu Thr Ser Ala Leu Ala Gly 465 470 475 480 Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu 485 490 495 Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro 500 505 510 Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr 515 520 525 Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu 530 535 540 Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn 545 550 555 560 Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu 565 570 575 Met Pro Asn 
Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp 580 585 590 Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly 595 600 605 Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn 610 615 620 Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val 625 630
635 640 Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu 645 650 655 Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys 660 665 670 Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn 675 680 685 Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe 690 695 700 Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro 705 710 715 720 Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu 725 730 735 Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp 740 745 750 Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn 755 760 765 Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn 770 775 780 Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr 785 790 795 800 Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser 805 810 815 Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys 820 825 830 Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro 835 840 845 Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala 850 855 860 Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val 865 870 875 880 Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro 885 890 895 Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu 900 905 910 Glu Ala Leu Ala Ser Gly His His His His His His 915 920 17913PRTArtificial SequenceSynthetic sequence 17Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 
115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Gly Asn 450 455 460 His Arg Ser Phe Ser Asp Leu Asn Gly Leu Thr Ser Ala Leu Ala Gly 465 470 475 480 Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu 485 490 495 Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro 500 505 510 Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr 515 520 525 Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu 530 
535 540 Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn 545 550 555 560 Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu 565 570 575 Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp 580 585 590 Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly 595 600 605 Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn 610 615 620 Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val 625 630 635 640 Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu 645 650 655 Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys 660 665 670 Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn 675 680 685 Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe 690 695 700 Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro 705 710 715 720 Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu 725 730 735 Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp 740 745 750 Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn 755 760 765 Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn 770 775 780 Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr 785 790 795 800 Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser 805 810 815 Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys 820 825 830 Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro 835 840 845 Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala 850 855 860 Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val 865 870 875 880 Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro 885 890 895 Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu 900 905 910 Asp 182769DNAArtificial SequenceSynthetic sequence 18catatgggat ccatgccggt taccatcaac aacttcaact acaacgaccc gatcgacaac 60aacaacatca ttatgatgga accgccgttc gcacgtggta ccggacgtta ctacaaggct 120tttaagatca ccgaccgtat ctggatcatc 
ccggaacgtt acaccttcgg ttacaaacct 180gaggacttca acaagagtag cgggattttc aatcgtgacg tctgcgagta ctatgatcca 240gattatctga ataccaacga taagaagaac atattccttc agactatgat taaactcttc 300aaccgtatca aaagcaaacc gctcggtgaa aaactcctcg aaatgattat caacggtatc 360ccgtacctcg gtgaccgtcg tgtcccgctt gaagagttca acaccaacat cgcaagcgtc 420accgtcaaca aactcatcag caacccaggt gaagtcgaac gtaaaaaagg tatcttcgca 480aacctcatca tcttcggtcc gggtccggtc ctcaacgaaa acgaaaccat cgacatcggt 540atccagaacc acttcgcaag ccgtgaaggt ttcggtggta tcatgcagat gaaattctgc 600ccggaatacg tcagtgtctt caacaacgtc caggaaaaca aaggtgcaag catcttcaac 660cgtcgtggtt acttcagcga cccggcactc atcctcatgc atgaactcat ccacgtcctc 720cacggtctct acggtatcaa agttgacgac ctcccgatcg tcccgaacga gaagaaattc 780ttcatgcaga gcaccgacgc aatccaggct gaggaactct acaccttcgg tggccaagac 840ccaagtatca taaccccgtc caccgacaaa agcatctacg acaaagtcct ccagaacttc 900aggggtatcg tggacagact caacaaagtc ctcgtctgca tcagcgaccc gaacatcaat 960atcaacatat acaagaacaa gttcaaagac aagtacaaat tcgtcgagga cagcgaaggc 1020aaatacagca tcgacgtaga aagtttcgac aagctctaca aaagcctcat gttcggtttc 1080accgaaacca acatcgccga gaactacaag atcaagacaa gggcaagtta cttcagcgac 1140agcctcccgc ctgtcaaaat caagaacctc ttagacaacg agatttacac aattgaagag 1200ggcttcaaca tcagtgacaa agacatggag aaggaataca gaggtcagaa caaggctatc 1260aacaaacagg catacgagga gatcagcaaa gaacacctcg cagtctacaa gatccagatg 1320tgcgtcgacg gcggtggcgg tagcgcagac gatgacgata aaggttggac cctgaactct 1380gctggttacc tgctgggtcc gcacgctgtt gcgctagcgg gcggtggcgg tagcggcggt 1440ggcggtagcg gcggtggcgg tagcgcacta gtgctgcagt gcatcgacgt tgacaacgaa 1500gacctgttct tcatcgctga caaaaacagc ttcagtgacg acctgagcaa aaacgaacgt 1560atcgaataca acacccagag caactacatc gaaaacgact tcccgatcaa cgaactgatc 1620ctggacaccg acctgataag taaaatcgaa ctgccgagcg aaaacaccga aagtctgacc 1680gacttcaacg ttgacgttcc ggtttacgaa aaacagccgg ctatcaagaa aatcttcacc 1740gacgaaaaca ccatcttcca gtacctgtac agccagacct tcccgctgga catccgtgac 1800atcagtctga ccagcagttt cgacgacgct ctgctgttca gcaacaaagt ttacagtttc 1860ttcagcatgg 
actacatcaa aaccgctaac aaagttgttg aagcagggct gttcgctggt 1920tgggttaaac agatcgttaa cgacttcgtt atcgaagcta acaaaagcaa cactatggac 1980gcaatcgctg acatcagtct gatcgttccg tacatcggtc tggctctgaa cgttggtaac 2040gaaaccgcta aaggtaactt tgaaaacgct ttcgagatcg ctggtgcaag catcctgctg 2100gagttcatcc cggaactgct gatcccggtt gttggtgctt tcctgctgga aagttacatc 2160gacaacaaaa acaagatcat caaaaccatc gacaacgctc tgaccaaacg taacgaaaaa 2220tggagtgata tgtacggtct gatcgttgct cagtggctga gcaccgtcaa cacccagttc 2280tacaccatca aagaaggtat gtacaaagct ctgaactacc aggctcaggc tctggaagag 2340atcatcaaat accgttacaa catctacagt gagaaggaaa agagtaacat caacatcgac 2400ttcaacgaca tcaacagcaa actgaacgaa ggtatcaacc aggctatcga caacatcaac 2460aacttcatca acggttgcag tgttagctac ctgatgaaga agatgatccc gctggctgtt 2520gaaaaactgc tggacttcga caacaccctg aaaaagaacc tgctgaacta catcgacgaa 2580aacaagctgt acctgatcgg tagtgctgaa tacgaaaaaa gtaaagtgaa caaatacctg 2640aagaccatca tgccgttcga cctgagtatc tacaccaacg acaccatcct gatcgaaatg 2700ttcaacaaat acaactctct agaagcacta gcgagtgggc accatcacca tcaccattaa 2760tgaaagctt 276919918PRTArtificial SequenceSynthetic sequence 19Met Gly Ser Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro 1 5 10 15 Ile Asp Asn Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly 20 25 30 Thr Gly Arg Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile 35 40 45 Ile Pro Glu Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys 50 55 60 Ser Ser Gly Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp 65 70 75 80 Tyr Leu Asn Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile 85 90 95 Lys Leu Phe Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu 100 105 110 Glu Met Ile Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro 115 120 125 Leu Glu Glu Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu 130 135 140 Ile Ser Asn Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn 145 150 155 160 Leu Ile Ile Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile 165 170 175 Asp Ile Gly Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly 180 185 
190 Ile Met Gln Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn 195 200 205 Val Gln Glu Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe 210 215 220 Ser Asp Pro Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His 225 230 235 240 Gly Leu Tyr Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu 245 250 255 Lys Lys Phe Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu 260 265 270 Tyr Thr Phe Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp 275 280 285 Lys Ser Ile Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp 290 295 300 Arg Leu Asn Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile 305 310 315 320 Asn Ile Tyr Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp 325 330 335 Ser Glu Gly Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr 340 345 350 Lys Ser Leu Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr 355 360 365 Lys Ile Lys Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val 370 375 380 Lys Ile Lys Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly 385 390 395 400 Phe Asn Ile Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn 405 410 415 Lys Ala Ile Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu 420 425 430 Ala Val Tyr Lys Ile Gln Met Cys Val Asp Gly Gly Gly Gly Ser Ala 435 440 445 Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu 450 455 460 Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly 465 470 475 480 Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Asp Val 485 490 495 Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser Asp 500 505 510 Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn Tyr 515 520 525 Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp Leu 530 535 540 Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr Asp 545 550 555 560 Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys Lys 565 570 575 Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln Thr 580 585 590 Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp Asp 595 600 605 
Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp Tyr 610 615 620 Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly Trp 625 630 635 640 Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser Asn 645 650
655 Thr Met Asp Ala Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile Gly 660 665 670 Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu Asn 675 680 685 Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro Glu 690 695 700 Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile Asp 705 710 715 720 Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys Arg 725 730 735 Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp Leu 740 745 750 Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr Lys 755 760 765 Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr Arg 770 775 780 Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp Phe 785 790 795 800 Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile Asp 805 810 815 Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met Lys 820 825 830 Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn Thr 835 840 845 Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr Leu 850 855 860 Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu Lys 865 870 875 880 Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile Leu 885 890 895 Ile Glu Met Phe Asn Lys Tyr Asn Ser Leu Glu Ala Leu Ala Ser Gly 900 905 910 His His His His His His 915 20908PRTArtificial SequenceSynthetic sequence 20Met Gly Ser Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro 1 5 10 15 Ile Asp Asn Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly 20 25 30 Thr Gly Arg Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile 35 40 45 Ile Pro Glu Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys 50 55 60 Ser Ser Gly Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp 65 70 75 80 Tyr Leu Asn Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile 85 90 95 Lys Leu Phe Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu 100 105 110 Glu Met Ile Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro 115 120 125 Leu Glu Glu Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu 130 135 140 Ile Ser Asn Pro Gly 
Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn 145 150 155 160 Leu Ile Ile Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile 165 170 175 Asp Ile Gly Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly 180 185 190 Ile Met Gln Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn 195 200 205 Val Gln Glu Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe 210 215 220 Ser Asp Pro Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His 225 230 235 240 Gly Leu Tyr Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu 245 250 255 Lys Lys Phe Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu 260 265 270 Tyr Thr Phe Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp 275 280 285 Lys Ser Ile Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp 290 295 300 Arg Leu Asn Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile 305 310 315 320 Asn Ile Tyr Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp 325 330 335 Ser Glu Gly Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr 340 345 350 Lys Ser Leu Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr 355 360 365 Lys Ile Lys Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val 370 375 380 Lys Ile Lys Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly 385 390 395 400 Phe Asn Ile Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn 405 410 415 Lys Ala Ile Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu 420 425 430 Ala Val Tyr Lys Ile Gln Met Cys Val Asp Gly Gly Gly Gly Ser Ala 435 440 445 Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu 450 455 460 Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly 465 470 475 480 Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Asp Val 485 490 495 Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser Asp 500 505 510 Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn Tyr 515 520 525 Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp Leu 530 535 540 Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr Asp 545 550 555 560 Phe Asn Val Asp Val 
Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys Lys 565 570 575 Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln Thr 580 585 590 Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp Asp 595 600 605 Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp Tyr 610 615 620 Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly Trp 625 630 635 640 Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser Asn 645 650 655 Thr Met Asp Ala Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile Gly 660 665 670 Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu Asn 675 680 685 Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro Glu 690 695 700 Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile Asp 705 710 715 720 Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys Arg 725 730 735 Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp Leu 740 745 750 Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr Lys 755 760 765 Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr Arg 770 775 780 Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp Phe 785 790 795 800 Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile Asp 805 810 815 Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met Lys 820 825 830 Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn Thr 835 840 845 Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr Leu 850 855 860 Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu Lys 865 870 875 880 Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile Leu 885 890 895 Ile Glu Met Phe Asn Lys Tyr Asn Ser Leu Glu Asp 900 905 212769DNAArtificial SequenceSynthetic sequence 21catatgggat ccatgccggt taccatcaac aacttcaact acaacgaccc gatcgacaac 60aacaacatca ttatgatgga accgccgttc gcacgtggta ccggacgtta ctacaaggct 120tttaagatca ccgaccgtat ctggatcatc ccggaacgtt acaccttcgg ttacaaacct 180gaggacttca acaagagtag cgggattttc aatcgtgacg tctgcgagta ctatgatcca 240gattatctga ataccaacga 
taagaagaac atattccttc agactatgat taaactcttc 300aaccgtatca aaagcaaacc gctcggtgaa aaactcctcg aaatgattat caacggtatc 360ccgtacctcg gtgaccgtcg tgtcccgctt gaagagttca acaccaacat cgcaagcgtc 420accgtcaaca aactcatcag caacccaggt gaagtcgaac gtaaaaaagg tatcttcgca 480aacctcatca tcttcggtcc gggtccggtc ctcaacgaaa acgaaaccat cgacatcggt 540atccagaacc acttcgcaag ccgtgaaggt ttcggtggta tcatgcagat gaaattctgc 600ccggaatacg tcagtgtctt caacaacgtc caggaaaaca aaggtgcaag catcttcaac 660cgtcgtggtt acttcagcga cccggcactc atcctcatgc atgaactcat ccacgtcctc 720cacggtctct acggtatcaa agttgacgac ctcccgatcg tcccgaacga gaagaaattc 780ttcatgcaga gcaccgacgc aatccaggct gaggaactct acaccttcgg tggccaagac 840ccaagtatca taaccccgtc caccgacaaa agcatctacg acaaagtcct ccagaacttc 900aggggtatcg tggacagact caacaaagtc ctcgtctgca tcagcgaccc gaacatcaat 960atcaacatat acaagaacaa gttcaaagac aagtacaaat tcgtcgagga cagcgaaggc 1020aaatacagca tcgacgtaga aagtttcgac aagctctaca aaagcctcat gttcggtttc 1080accgaaacca acatcgccga gaactacaag atcaagacaa gggcaagtta cttcagcgac 1140agcctcccgc ctgtcaaaat caagaacctc ttagacaacg agatttacac aattgaagag 1200ggcttcaaca tcagtgacaa agacatggag aaggaataca gaggtcagaa caaggctatc 1260aacaaacagg catacgagga gatcagcaaa gaacacctcg cagtctacaa gatccagatg 1320tgcgtcgacg gcggtggcgg tagcgcagac gatgacgata aaggttggac cctgaactct 1380gctggttacc tgctgggtcc gcacgctgtt gcgctagcgg gcggtggcgg tagcggcggt 1440ggcggtagcg gcggtggcgg tagcgcacta gtgctgcagt gcatcgacgt tgacaacgaa 1500gacctgttct tcatcgctga caaaaacagc ttcagtgacg acctgagcaa aaacgaacgt 1560atcgaataca acacccagag caactacatc gaaaacgact tcccgatcaa cgaactgatc 1620ctggacaccg acctgataag taaaatcgaa ctgccgagcg aaaacaccga aagtctgacc 1680gacttcaacg ttgacgttcc ggtttacgaa aaacagccgg ctatcaagaa aatcttcacc 1740gacgaaaaca ccatcttcca gtacctgtac agccagacct tcccgctgga catccgtgac 1800atcagtctga ccagcagttt cgacgacgct ctgctgttca gcaacaaagt ttacagtttc 1860ttcagcatgg actacatcaa aaccgctaac aaagttgttg aagcagggct gttcgctggt 1920tgggttaaac agatcgttaa cgacttcgtt atcgaagcta acaaaagcaa cactatggac 
1980aaaatcgctg acatcagtct gatcgttccg tacatcggtc tggctctgaa cgttggtaac 2040gaaaccgcta aaggtaactt tgaaaacgct ttcgagatcg ctggtgcaag catcctgctg 2100gagttcatcc cggaactgct gatcccggtt gttggtgctt tcctgctgga aagttacatc 2160gacaacaaaa acaagatcat caaaaccatc gacaacgctc tgaccaaacg taacgaaaaa 2220tggagtgata tgtacggtct gatcgttgct cagtggctga gcaccgtcaa cacccagttc 2280tacaccatca aagaaggtat gtacaaagct ctgaactacc aggctcaggc tctggaagag 2340atcatcaaat accgttacaa catctacagt gagaaggaaa agagtaacat caacatcgac 2400ttcaacgaca tcaacagcaa actgaacgaa ggtatcaacc aggctatcga caacatcaac 2460aacttcatca acggttgcag tgttagctac ctgatgaaga agatgatccc gctggctgtt 2520gaaaaactgc tggacttcga caacaccctg aaaaagaacc tgctgaacta catcgacgaa 2580aacaagctgt acctgatcgg tagtgctgaa tacgaaaaaa gtaaagtgaa caaatacctg 2640aagaccatca tgccgttcga cctgagtatc tacaccaacg acaccatcct gatcgaaatg 2700ttcaacaaat acaactctct agaagcacta gcgagtgggc accatcacca tcaccattaa 2760tgaaagctt 276922918PRTArtificial SequenceSynthetic sequence 22Met Gly Ser Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro 1 5 10 15 Ile Asp Asn Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly 20 25 30 Thr Gly Arg Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile 35 40 45 Ile Pro Glu Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys 50 55 60 Ser Ser Gly Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp 65 70 75 80 Tyr Leu Asn Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile 85 90 95 Lys Leu Phe Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu 100 105 110 Glu Met Ile Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro 115 120 125 Leu Glu Glu Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu 130 135 140 Ile Ser Asn Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn 145 150 155 160 Leu Ile Ile Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile 165 170 175 Asp Ile Gly Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly 180 185 190 Ile Met Gln Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn 195 200 205 Val Gln Glu Asn Lys Gly Ala Ser Ile Phe Asn 
Arg Arg Gly Tyr Phe 210 215 220 Ser Asp Pro Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His 225 230 235 240 Gly Leu Tyr Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu 245 250 255 Lys Lys Phe Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu 260 265 270 Tyr Thr Phe Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp 275 280 285 Lys Ser Ile Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp 290 295 300 Arg Leu Asn Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile 305 310 315 320 Asn Ile Tyr Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp 325 330 335 Ser Glu Gly Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr 340 345 350 Lys Ser Leu Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr 355 360 365 Lys Ile Lys Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val 370 375 380 Lys Ile Lys Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly 385 390 395 400 Phe Asn Ile Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn 405 410 415 Lys Ala Ile Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu 420 425 430 Ala Val Tyr Lys Ile Gln Met Cys Val Asp Gly Gly Gly Gly Ser Ala 435 440 445 Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu 450 455 460 Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly 465 470 475 480 Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Asp Val 485 490 495 Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser Asp 500 505 510 Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn Tyr 515 520 525 Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp Leu 530 535 540 Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr Asp 545 550 555 560 Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys Lys 565 570 575 Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln Thr 580 585 590 Phe Pro Leu Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp Asp 595 600 605 Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp Tyr 610 615 620 Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu 
Phe Ala Gly Trp 625 630 635 640 Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser Asn 645 650 655 Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile Gly 660 665 670 Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu Asn
675 680 685 Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro Glu 690 695 700 Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile Asp 705 710 715 720 Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys Arg 725 730 735 Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp Leu 740 745 750 Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr Lys 755 760 765 Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr Arg 770 775 780 Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp Phe 785 790 795 800 Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile Asp 805 810 815 Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met Lys 820 825 830 Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn Thr 835 840 845 Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr Leu 850 855 860 Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu Lys 865 870 875 880 Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile Leu 885 890 895 Ile Glu Met Phe Asn Lys Tyr Asn Ser Leu Glu Ala Leu Ala Ser Gly 900 905 910 His His His His His His 915 23907PRTArtificial SequenceSynthetic sequence 23Met Gly Ser Met Pro Val Thr Ile Asn Asn Phe Asn Tyr Asn Asp Pro 1 5 10 15 Ile Asp Asn Asn Asn Ile Ile Met Met Glu Pro Pro Phe Ala Arg Gly 20 25 30 Thr Gly Arg Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp Ile 35 40 45 Ile Pro Glu Arg Tyr Thr Phe Gly Tyr Lys Pro Glu Asp Phe Asn Lys 50 55 60 Ser Ser Gly Ile Phe Asn Arg Asp Val Cys Glu Tyr Tyr Asp Pro Asp 65 70 75 80 Tyr Leu Asn Thr Asn Asp Lys Lys Asn Ile Phe Leu Gln Thr Met Ile 85 90 95 Lys Leu Phe Asn Arg Ile Lys Ser Lys Pro Leu Gly Glu Lys Leu Leu 100 105 110 Glu Met Ile Ile Asn Gly Ile Pro Tyr Leu Gly Asp Arg Arg Val Pro 115 120 125 Leu Glu Glu Phe Asn Thr Asn Ile Ala Ser Val Thr Val Asn Lys Leu 130 135 140 Ile Ser Asn Pro Gly Glu Val Glu Arg Lys Lys Gly Ile Phe Ala Asn 145 150 155 160 Leu Ile Ile Phe Gly Pro Gly Pro Val Leu Asn Glu Asn Glu Thr Ile 165 170 175 Asp Ile 
Gly Ile Gln Asn His Phe Ala Ser Arg Glu Gly Phe Gly Gly 180 185 190 Ile Met Gln Met Lys Phe Cys Pro Glu Tyr Val Ser Val Phe Asn Asn 195 200 205 Val Gln Glu Asn Lys Gly Ala Ser Ile Phe Asn Arg Arg Gly Tyr Phe 210 215 220 Ser Asp Pro Ala Leu Ile Leu Met His Glu Leu Ile His Val Leu His 225 230 235 240 Gly Leu Tyr Gly Ile Lys Val Asp Asp Leu Pro Ile Val Pro Asn Glu 245 250 255 Lys Lys Phe Phe Met Gln Ser Thr Asp Ala Ile Gln Ala Glu Glu Leu 260 265 270 Tyr Thr Phe Gly Gly Gln Asp Pro Ser Ile Ile Thr Pro Ser Thr Asp 275 280 285 Lys Ser Ile Tyr Asp Lys Val Leu Gln Asn Phe Arg Gly Ile Val Asp 290 295 300 Arg Leu Asn Lys Val Leu Val Cys Ile Ser Asp Pro Asn Ile Asn Ile 305 310 315 320 Asn Ile Tyr Lys Asn Lys Phe Lys Asp Lys Tyr Lys Phe Val Glu Asp 325 330 335 Ser Glu Gly Lys Tyr Ser Ile Asp Val Glu Ser Phe Asp Lys Leu Tyr 340 345 350 Lys Ser Leu Met Phe Gly Phe Thr Glu Thr Asn Ile Ala Glu Asn Tyr 355 360 365 Lys Ile Lys Thr Arg Ala Ser Tyr Phe Ser Asp Ser Leu Pro Pro Val 370 375 380 Lys Ile Lys Asn Leu Leu Asp Asn Glu Ile Tyr Thr Ile Glu Glu Gly 385 390 395 400 Phe Asn Ile Ser Asp Lys Asp Met Glu Lys Glu Tyr Arg Gly Gln Asn 405 410 415 Lys Ala Ile Asn Lys Gln Ala Tyr Glu Glu Ile Ser Lys Glu His Leu 420 425 430 Ala Val Tyr Lys Ile Gln Met Cys Val Asp Gly Gly Gly Gly Ser Ala 435 440 445 Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu 450 455 460 Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly 465 470 475 480 Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Asp Val 485 490 495 Asp Asn Glu Asp Leu Phe Phe Ile Ala Asp Lys Asn Ser Phe Ser Asp 500 505 510 Asp Leu Ser Lys Asn Glu Arg Ile Glu Tyr Asn Thr Gln Ser Asn Tyr 515 520 525 Ile Glu Asn Asp Phe Pro Ile Asn Glu Leu Ile Leu Asp Thr Asp Leu 530 535 540 Ile Ser Lys Ile Glu Leu Pro Ser Glu Asn Thr Glu Ser Leu Thr Asp 545 550 555 560 Phe Asn Val Asp Val Pro Val Tyr Glu Lys Gln Pro Ala Ile Lys Lys 565 570 575 Ile Phe Thr Asp Glu Asn Thr Ile Phe Gln Tyr Leu Tyr Ser Gln Thr 580 585 590 Phe Pro Leu 
Asp Ile Arg Asp Ile Ser Leu Thr Ser Ser Phe Asp Asp 595 600 605 Ala Leu Leu Phe Ser Asn Lys Val Tyr Ser Phe Phe Ser Met Asp Tyr 610 615 620 Ile Lys Thr Ala Asn Lys Val Val Glu Ala Gly Leu Phe Ala Gly Trp 625 630 635 640 Val Lys Gln Ile Val Asn Asp Phe Val Ile Glu Ala Asn Lys Ser Asn 645 650 655 Thr Met Asp Lys Ile Ala Asp Ile Ser Leu Ile Val Pro Tyr Ile Gly 660 665 670 Leu Ala Leu Asn Val Gly Asn Glu Thr Ala Lys Gly Asn Phe Glu Asn 675 680 685 Ala Phe Glu Ile Ala Gly Ala Ser Ile Leu Leu Glu Phe Ile Pro Glu 690 695 700 Leu Leu Ile Pro Val Val Gly Ala Phe Leu Leu Glu Ser Tyr Ile Asp 705 710 715 720 Asn Lys Asn Lys Ile Ile Lys Thr Ile Asp Asn Ala Leu Thr Lys Arg 725 730 735 Asn Glu Lys Trp Ser Asp Met Tyr Gly Leu Ile Val Ala Gln Trp Leu 740 745 750 Ser Thr Val Asn Thr Gln Phe Tyr Thr Ile Lys Glu Gly Met Tyr Lys 755 760 765 Ala Leu Asn Tyr Gln Ala Gln Ala Leu Glu Glu Ile Ile Lys Tyr Arg 770 775 780 Tyr Asn Ile Tyr Ser Glu Lys Glu Lys Ser Asn Ile Asn Ile Asp Phe 785 790 795 800 Asn Asp Ile Asn Ser Lys Leu Asn Glu Gly Ile Asn Gln Ala Ile Asp 805 810 815 Asn Ile Asn Asn Phe Ile Asn Gly Cys Ser Val Ser Tyr Leu Met Lys 820 825 830 Lys Met Ile Pro Leu Ala Val Glu Lys Leu Leu Asp Phe Asp Asn Thr 835 840 845 Leu Lys Lys Asn Leu Leu Asn Tyr Ile Asp Glu Asn Lys Leu Tyr Leu 850 855 860 Ile Gly Ser Ala Glu Tyr Glu Lys Ser Lys Val Asn Lys Tyr Leu Lys 865 870 875 880 Thr Ile Met Pro Phe Asp Leu Ser Ile Tyr Thr Asn Asp Thr Ile Leu 885 890 895 Ile Glu Met Phe Asn Lys Tyr Asn Ser Leu Asp 900 905 242778DNAArtificial SequenceSynthetic sequence 24catatgggat ccgaattcat gccgatcacc atcaacaact tcaactacag cgatccggtg 60gataacaaaa acatcctgta cctggatacc catctgaata ccctggcgaa cgaaccggaa 120aaagcgtttc gtatcaccgg caacatttgg gttattccgg atcgttttag ccgtaacagc 180aacccgaatc tgaataaacc gccgcgtgtt accagcccga aaagcggtta ttacgatccg 240aactatctga gcaccgatag cgataaagat accttcctga aagaaatcat caaactgttc 300aaacgcatca acagccgtga aattggcgaa gaactgatct atcgcctgag caccgatatt 360ccgtttccgg gcaacaacaa caccccgatc 
aacacctttg atttcgatgt ggatttcaac 420agcgttgatg ttaaaacccg ccagggtaac aattgggtga aaaccggcag cattaacccg 480agcgtgatta ttaccggtcc gcgcgaaaac attattgatc cggaaaccag cacctttaaa 540ctgaccaaca acacctttgc ggcgcaggaa ggttttggcg cgctgagcat tattagcatt 600agcccgcgct ttatgctgac ctatagcaac gcgaccaacg atgttggtga aggccgtttc 660agcaaaagcg aattttgcat ggacccgatc ctgatcctga tgcatgaact gaaccatgcg 720atgcataacc tgtatggcat cgcgattccg aacgatcaga ccattagcag cgtgaccagc 780aacatctttt acagccagta caacgtgaaa ctggaatatg cggaaatcta tgcgtttggc 840ggtccgacca ttgatctgat tccgaaaagc gcgcgcaaat acttcgaaga aaaagcgctg 900gattactatc gcagcattgc gaaacgtctg aacagcatta ccaccgcgaa tccgagcagc 960ttcaacaaat atatcggcga atataaacag aaactgatcc gcaaatatcg ctttgtggtg 1020gaaagcagcg gcgaagttac cgttaaccgc aataaattcg tggaactgta caacgaactg 1080acccagatct tcaccgaatt taactatgcg aaaatctata acgtgcagaa ccgtaaaatc 1140tacctgagca acgtgtatac cccggtgacc gcgaatattc tggatgataa cgtgtacgat 1200atccagaacg gctttaacat cccgaaaagc aacctgaacg ttctgtttat gggccagaac 1260ctgagccgta atccggcgct gcgtaaagtg aacccggaaa acatgctgta cctgttcacc 1320aaattttgcg tcgacggcgg tggcggtagc gcagacgatg acgataaagg ttggaccctg 1380aactctgctg gttacctgct gggtccgcac gctgttgcgc tagcgggcgg tggcggtagc 1440ggcggtggcg gtagcggcgg tggcggtagc gcactagtgc tgcagtgtcg tgaactgctg 1500gtgaaaaaca ccgatctgcc gtttattggc gatatcagcg atgtgaaaac cgatatcttc 1560ctgcgcaaag atatcaacga agaaaccgaa gtgatctact acccggataa cgtgagcgtt 1620gatcaggtga tcctgagcaa aaacaccagc gaacatggtc agctggatct gctgtatccg 1680agcattgata gcgaaagcga aattctgccg ggcgaaaacc aggtgtttta cgataaccgt 1740acccagaacg tggattacct gaacagctat tactacctgg aaagccagaa actgagcgat 1800aacgtggaag attttacctt tacccgcagc attgaagaag cgctggataa cagcgcgaaa 1860gtttacacct attttccgac cctggcgaac aaagttaatg cgggtgttca gggcggtctg 1920tttctgatgt gggcgaacga tgtggtggaa gatttcacca ccaacatcct gcgtaaagat 1980accctggata aaatcagcga tgttagcgcg attattccgt atattggtcc ggcgctgaac 2040attagcaata gcgtgcgtcg tggcaatttt accgaagcgt ttgcggttac cggtgtgacc 2100attctgctgg 
aagcgtttcc ggaatttacc attccggcgc tgggtgcgtt tgtgatctat 2160agcaaagtgc aggaacgcaa cgaaatcatc aaaaccatcg ataactgcct ggaacagcgt 2220attaaacgct ggaaagatag ctatgaatgg atgatgggca cctggctgag ccgtattatc 2280acccagttca acaacatcag ctaccagatg tacgatagcc tgaactatca ggcgggtgcg 2340attaaagcga aaatcgatct ggaatacaaa aaatacagcg gcagcgataa agaaaacatc 2400aaaagccagg ttgaaaacct gaaaaacagc ctggatgtga aaattagcga agcgatgaat 2460aacatcaaca aattcatccg cgaatgcagc gtgacctacc tgttcaaaaa catgctgccg 2520aaagtgatcg atgaactgaa cgaatttgat cgcaacacca aagcgaaact gatcaacctg 2580atcgatagcc acaacattat tctggtgggc gaagtggata aactgaaagc gaaagttaac 2640aacagcttcc agaacaccat cccgtttaac atcttcagct ataccaacaa cagcctgctg 2700aaagatatca tcaacgaata cttcaatcta gaagcactag cgagtgggca ccatcaccat 2760caccattaat gaaagctt 277825921PRTArtificial SequenceSynthetic sequence 25Met Gly Ser Glu Phe Met Pro Ile Thr Ile Asn Asn Phe Asn Tyr Ser 1 5 10 15 Asp Pro Val Asp Asn Lys Asn Ile Leu Tyr Leu Asp Thr His Leu Asn 20 25 30 Thr Leu Ala Asn Glu Pro Glu Lys Ala Phe Arg Ile Thr Gly Asn Ile 35 40 45 Trp Val Ile Pro Asp Arg Phe Ser Arg Asn Ser Asn Pro Asn Leu Asn 50 55 60 Lys Pro Pro Arg Val Thr Ser Pro Lys Ser Gly Tyr Tyr Asp Pro Asn 65 70 75 80 Tyr Leu Ser Thr Asp Ser Asp Lys Asp Thr Phe Leu Lys Glu Ile Ile 85 90 95 Lys Leu Phe Lys Arg Ile Asn Ser Arg Glu Ile Gly Glu Glu Leu Ile 100 105 110 Tyr Arg Leu Ser Thr Asp Ile Pro Phe Pro Gly Asn Asn Asn Thr Pro 115 120 125 Ile Asn Thr Phe Asp Phe Asp Val Asp Phe Asn Ser Val Asp Val Lys 130 135 140 Thr Arg Gln Gly Asn Asn Trp Val Lys Thr Gly Ser Ile Asn Pro Ser 145 150 155 160 Val Ile Ile Thr Gly Pro Arg Glu Asn Ile Ile Asp Pro Glu Thr Ser 165 170 175 Thr Phe Lys Leu Thr Asn Asn Thr Phe Ala Ala Gln Glu Gly Phe Gly 180 185 190 Ala Leu Ser Ile Ile Ser Ile Ser Pro Arg Phe Met Leu Thr Tyr Ser 195 200 205 Asn Ala Thr Asn Asp Val Gly Glu Gly Arg Phe Ser Lys Ser Glu Phe 210 215 220 Cys Met Asp Pro Ile Leu Ile Leu Met His Glu Leu Asn His Ala Met 225 230 235 240 His Asn Leu Tyr Gly Ile Ala Ile 
Pro Asn Asp Gln Thr Ile Ser Ser 245 250 255 Val Thr Ser Asn Ile Phe Tyr Ser Gln Tyr Asn Val Lys Leu Glu Tyr 260 265 270 Ala Glu Ile Tyr Ala Phe Gly Gly Pro Thr Ile Asp Leu Ile Pro Lys 275 280 285 Ser Ala Arg Lys Tyr Phe Glu Glu Lys Ala Leu Asp Tyr Tyr Arg Ser 290 295 300 Ile Ala Lys Arg Leu Asn Ser Ile Thr Thr Ala Asn Pro Ser Ser Phe 305 310 315 320 Asn Lys Tyr Ile Gly Glu Tyr Lys Gln Lys Leu Ile Arg Lys Tyr Arg 325 330 335 Phe Val Val Glu Ser Ser Gly Glu Val Thr Val Asn Arg Asn Lys Phe 340 345 350 Val Glu Leu Tyr Asn Glu Leu Thr Gln Ile Phe Thr Glu Phe Asn Tyr 355 360 365 Ala Lys Ile Tyr Asn Val Gln Asn Arg Lys Ile Tyr Leu Ser Asn Val 370 375 380 Tyr Thr Pro Val Thr Ala Asn Ile Leu Asp Asp Asn Val Tyr Asp Ile 385 390 395 400 Gln Asn Gly Phe Asn Ile Pro Lys Ser Asn Leu Asn Val Leu Phe Met 405 410 415 Gly Gln Asn Leu Ser Arg Asn Pro Ala Leu Arg Lys Val Asn Pro Glu 420 425 430 Asn Met Leu Tyr Leu Phe Thr Lys Phe Cys Val Asp Gly Gly Gly Gly 435 440 445 Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr 450 455 460 Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly 465 470 475 480 Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Arg 485 490 495 Glu Leu Leu Val Lys Asn Thr Asp Leu Pro Phe Ile Gly Asp Ile Ser 500 505 510 Asp Val Lys Thr Asp Ile Phe Leu Arg Lys Asp Ile Asn Glu Glu Thr 515 520 525 Glu Val Ile Tyr Tyr Pro Asp Asn Val Ser Val Asp Gln Val Ile Leu 530 535 540 Ser Lys Asn Thr Ser Glu His Gly Gln Leu Asp Leu Leu Tyr Pro Ser 545 550 555 560 Ile Asp Ser Glu Ser Glu Ile Leu Pro Gly Glu Asn Gln Val Phe Tyr 565 570 575 Asp Asn Arg Thr Gln Asn Val Asp Tyr Leu Asn Ser Tyr Tyr Tyr Leu 580 585 590 Glu Ser Gln Lys Leu Ser Asp Asn Val Glu Asp Phe Thr Phe Thr Arg 595 600 605 Ser Ile Glu Glu Ala Leu Asp Asn Ser Ala Lys Val Tyr Thr Tyr Phe 610 615 620 Pro Thr Leu Ala Asn Lys Val Asn Ala Gly Val Gln Gly Gly Leu Phe 625 630 635 640 Leu Met Trp Ala Asn Asp Val Val Glu Asp Phe Thr Thr Asn Ile Leu 645 650 655 Arg Lys Asp Thr Leu Asp Lys Ile Ser 
Asp Val Ser Ala Ile Ile Pro 660 665 670 Tyr Ile Gly Pro Ala Leu Asn Ile Ser Asn Ser Val Arg Arg Gly Asn 675 680 685 Phe Thr Glu Ala Phe Ala Val Thr Gly Val Thr Ile Leu Leu Glu Ala 690 695 700 Phe Pro Glu Phe Thr Ile Pro Ala Leu Gly Ala Phe Val
Ile Tyr Ser 705 710 715 720 Lys Val Gln Glu Arg Asn Glu Ile Ile Lys Thr Ile Asp Asn Cys Leu 725 730 735 Glu Gln Arg Ile Lys Arg Trp Lys Asp Ser Tyr Glu Trp Met Met Gly 740 745 750 Thr Trp Leu Ser Arg Ile Ile Thr Gln Phe Asn Asn Ile Ser Tyr Gln 755 760 765 Met Tyr Asp Ser Leu Asn Tyr Gln Ala Gly Ala Ile Lys Ala Lys Ile 770 775 780 Asp Leu Glu Tyr Lys Lys Tyr Ser Gly Ser Asp Lys Glu Asn Ile Lys 785 790 795 800 Ser Gln Val Glu Asn Leu Lys Asn Ser Leu Asp Val Lys Ile Ser Glu 805 810 815 Ala Met Asn Asn Ile Asn Lys Phe Ile Arg Glu Cys Ser Val Thr Tyr 820 825 830 Leu Phe Lys Asn Met Leu Pro Lys Val Ile Asp Glu Leu Asn Glu Phe 835 840 845 Asp Arg Asn Thr Lys Ala Lys Leu Ile Asn Leu Ile Asp Ser His Asn 850 855 860 Ile Ile Leu Val Gly Glu Val Asp Lys Leu Lys Ala Lys Val Asn Asn 865 870 875 880 Ser Phe Gln Asn Thr Ile Pro Phe Asn Ile Phe Ser Tyr Thr Asn Asn 885 890 895 Ser Leu Leu Lys Asp Ile Ile Asn Glu Tyr Phe Asn Leu Glu Ala Leu 900 905 910 Ala Ser Gly His His His His His His 915 920 26910PRTArtificial SequenceSynthetic sequence 26Met Gly Ser Glu Phe Met Pro Ile Thr Ile Asn Asn Phe Asn Tyr Ser 1 5 10 15 Asp Pro Val Asp Asn Lys Asn Ile Leu Tyr Leu Asp Thr His Leu Asn 20 25 30 Thr Leu Ala Asn Glu Pro Glu Lys Ala Phe Arg Ile Thr Gly Asn Ile 35 40 45 Trp Val Ile Pro Asp Arg Phe Ser Arg Asn Ser Asn Pro Asn Leu Asn 50 55 60 Lys Pro Pro Arg Val Thr Ser Pro Lys Ser Gly Tyr Tyr Asp Pro Asn 65 70 75 80 Tyr Leu Ser Thr Asp Ser Asp Lys Asp Thr Phe Leu Lys Glu Ile Ile 85 90 95 Lys Leu Phe Lys Arg Ile Asn Ser Arg Glu Ile Gly Glu Glu Leu Ile 100 105 110 Tyr Arg Leu Ser Thr Asp Ile Pro Phe Pro Gly Asn Asn Asn Thr Pro 115 120 125 Ile Asn Thr Phe Asp Phe Asp Val Asp Phe Asn Ser Val Asp Val Lys 130 135 140 Thr Arg Gln Gly Asn Asn Trp Val Lys Thr Gly Ser Ile Asn Pro Ser 145 150 155 160 Val Ile Ile Thr Gly Pro Arg Glu Asn Ile Ile Asp Pro Glu Thr Ser 165 170 175 Thr Phe Lys Leu Thr Asn Asn Thr Phe Ala Ala Gln Glu Gly Phe Gly 180 185 190 Ala Leu Ser Ile Ile Ser Ile Ser Pro Arg Phe Met Leu Thr 
Tyr Ser 195 200 205 Asn Ala Thr Asn Asp Val Gly Glu Gly Arg Phe Ser Lys Ser Glu Phe 210 215 220 Cys Met Asp Pro Ile Leu Ile Leu Met His Glu Leu Asn His Ala Met 225 230 235 240 His Asn Leu Tyr Gly Ile Ala Ile Pro Asn Asp Gln Thr Ile Ser Ser 245 250 255 Val Thr Ser Asn Ile Phe Tyr Ser Gln Tyr Asn Val Lys Leu Glu Tyr 260 265 270 Ala Glu Ile Tyr Ala Phe Gly Gly Pro Thr Ile Asp Leu Ile Pro Lys 275 280 285 Ser Ala Arg Lys Tyr Phe Glu Glu Lys Ala Leu Asp Tyr Tyr Arg Ser 290 295 300 Ile Ala Lys Arg Leu Asn Ser Ile Thr Thr Ala Asn Pro Ser Ser Phe 305 310 315 320 Asn Lys Tyr Ile Gly Glu Tyr Lys Gln Lys Leu Ile Arg Lys Tyr Arg 325 330 335 Phe Val Val Glu Ser Ser Gly Glu Val Thr Val Asn Arg Asn Lys Phe 340 345 350 Val Glu Leu Tyr Asn Glu Leu Thr Gln Ile Phe Thr Glu Phe Asn Tyr 355 360 365 Ala Lys Ile Tyr Asn Val Gln Asn Arg Lys Ile Tyr Leu Ser Asn Val 370 375 380 Tyr Thr Pro Val Thr Ala Asn Ile Leu Asp Asp Asn Val Tyr Asp Ile 385 390 395 400 Gln Asn Gly Phe Asn Ile Pro Lys Ser Asn Leu Asn Val Leu Phe Met 405 410 415 Gly Gln Asn Leu Ser Arg Asn Pro Ala Leu Arg Lys Val Asn Pro Glu 420 425 430 Asn Met Leu Tyr Leu Phe Thr Lys Phe Cys Val Asp Gly Gly Gly Gly 435 440 445 Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr 450 455 460 Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly 465 470 475 480 Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Arg 485 490 495 Glu Leu Leu Val Lys Asn Thr Asp Leu Pro Phe Ile Gly Asp Ile Ser 500 505 510 Asp Val Lys Thr Asp Ile Phe Leu Arg Lys Asp Ile Asn Glu Glu Thr 515 520 525 Glu Val Ile Tyr Tyr Pro Asp Asn Val Ser Val Asp Gln Val Ile Leu 530 535 540 Ser Lys Asn Thr Ser Glu His Gly Gln Leu Asp Leu Leu Tyr Pro Ser 545 550 555 560 Ile Asp Ser Glu Ser Glu Ile Leu Pro Gly Glu Asn Gln Val Phe Tyr 565 570 575 Asp Asn Arg Thr Gln Asn Val Asp Tyr Leu Asn Ser Tyr Tyr Tyr Leu 580 585 590 Glu Ser Gln Lys Leu Ser Asp Asn Val Glu Asp Phe Thr Phe Thr Arg 595 600 605 Ser Ile Glu Glu Ala Leu Asp Asn Ser Ala Lys Val Tyr Thr Tyr 
Phe 610 615 620 Pro Thr Leu Ala Asn Lys Val Asn Ala Gly Val Gln Gly Gly Leu Phe 625 630 635 640 Leu Met Trp Ala Asn Asp Val Val Glu Asp Phe Thr Thr Asn Ile Leu 645 650 655 Arg Lys Asp Thr Leu Asp Lys Ile Ser Asp Val Ser Ala Ile Ile Pro 660 665 670 Tyr Ile Gly Pro Ala Leu Asn Ile Ser Asn Ser Val Arg Arg Gly Asn 675 680 685 Phe Thr Glu Ala Phe Ala Val Thr Gly Val Thr Ile Leu Leu Glu Ala 690 695 700 Phe Pro Glu Phe Thr Ile Pro Ala Leu Gly Ala Phe Val Ile Tyr Ser 705 710 715 720 Lys Val Gln Glu Arg Asn Glu Ile Ile Lys Thr Ile Asp Asn Cys Leu 725 730 735 Glu Gln Arg Ile Lys Arg Trp Lys Asp Ser Tyr Glu Trp Met Met Gly 740 745 750 Thr Trp Leu Ser Arg Ile Ile Thr Gln Phe Asn Asn Ile Ser Tyr Gln 755 760 765 Met Tyr Asp Ser Leu Asn Tyr Gln Ala Gly Ala Ile Lys Ala Lys Ile 770 775 780 Asp Leu Glu Tyr Lys Lys Tyr Ser Gly Ser Asp Lys Glu Asn Ile Lys 785 790 795 800 Ser Gln Val Glu Asn Leu Lys Asn Ser Leu Asp Val Lys Ile Ser Glu 805 810 815 Ala Met Asn Asn Ile Asn Lys Phe Ile Arg Glu Cys Ser Val Thr Tyr 820 825 830 Leu Phe Lys Asn Met Leu Pro Lys Val Ile Asp Glu Leu Asn Glu Phe 835 840 845 Asp Arg Asn Thr Lys Ala Lys Leu Ile Asn Leu Ile Asp Ser His Asn 850 855 860 Ile Ile Leu Val Gly Glu Val Asp Lys Leu Lys Ala Lys Val Asn Asn 865 870 875 880 Ser Phe Gln Asn Thr Ile Pro Phe Asn Ile Phe Ser Tyr Thr Asn Asn 885 890 895 Ser Leu Leu Lys Asp Ile Ile Asn Glu Tyr Phe Asn Leu Asp 900 905 910 272769DNAArtificial SequenceSynthetic sequence 27catatgggat ccatgacgtg gccagttaag gatttcaact actcagatcc tgtaaatgac 60aacgatattc tgtaccttcg cattccacaa aataaactga tcaccacacc agtcaaagca 120ttcatgatta ctcaaaacat ttgggtcatt ccagaacgct tttctagtga cacaaatccg 180agtttatcta aacctccgcg tccgacgtcc aaatatcaga gctattacga tccctcatat 240ctcagtacgg acgaacaaaa agatactttc cttaaaggta tcattaaact gtttaagcgt 300attaatgagc gcgatatcgg gaaaaagttg attaattatc ttgttgtggg ttccccgttc 360atgggcgata gctctacccc cgaagacact tttgatttta cccgtcatac gacaaacatc 420gcggtagaga agtttgagaa cggatcgtgg aaagtcacaa acatcattac acctagcgtc 
480ttaatttttg gtccgctgcc aaacatctta gattatacag ccagcctgac tttgcagggg 540caacagtcga atccgagttt cgaaggtttt ggtaccctga gcattctgaa agttgccccg 600gaatttctgc tcactttttc agatgtcacc agcaaccaga gctcagcagt attaggaaag 660tcaatttttt gcatggaccc ggttattgca ctgatgcacg aactgacgca ctctctgcat 720caactgtatg ggatcaacat ccccagtgac aaacgtattc gtccccaggt gtctgaagga 780tttttctcac aggatgggcc gaacgtccag ttcgaagagt tgtatacttt cggaggcctg 840gacgtagaga tcattcccca gattgagcgc agtcagctgc gtgagaaggc attgggccat 900tataaggata ttgcaaaacg cctgaataac attaacaaaa cgattccatc ttcgtggatc 960tcgaatattg ataaatataa gaaaattttt agcgagaaat ataattttga taaagataat 1020acaggtaact ttgtggttaa cattgacaaa ttcaactccc tttacagtga tttgacgaat 1080gtaatgagcg aagttgtgta tagttcccaa tacaacgtta agaatcgtac ccattacttc 1140tctcgtcact acctgccggt tttcgcgaac atccttgacg ataatattta cactattcgt 1200gacggcttta acttgaccaa caagggcttc aatattgaaa attcaggcca gaacattgaa 1260cgcaacccgg ccttgcagaa actgtcgagt gaatccgtgg ttgacctgtt taccaaagtc 1320tgcgtcgacg gcggtggcgg tagcgcagac gatgacgata aaggttggac cctgaactct 1380gctggttacc tgctgggtcc gcacgctgtt gcgctagcgg gcggtggcgg tagcggcggt 1440ggcggtagcg gcggtggcgg tagcgcacta gtgctgcagt gtattaaagt gaaaaacaat 1500cggctgcctt atgtagcaga taaagatagc attagtcagg agattttcga aaataaaatt 1560atcactgacg aaaccaatgt tcagaattat tcagataaat tttcactgga cgaaagcatc 1620ttagatggcc aagttccgat taacccggaa attgttgatc cgttactgcc gaacgtgaat 1680atggaaccgt taaacctccc tggcgaagag atcgtatttt atgatgacat tacgaaatat 1740gtggactacc ttaattctta ttactatttg gaaagccaga aactgtccaa taacgtggaa 1800aacattactc tgaccacaag cgtggaagag gctttaggct actcaaataa gatttatacc 1860ttcctcccgt cgctggcgga aaaagtaaat aaaggtgtgc aggctggtct gttcctcaac 1920tgggcgaatg aagttgtcga agactttacc acgaatatta tgaaaaagga taccctggat 1980aaaatctccg acgtctcggt tattatccca tatattggcc ctgcgttaaa tatcggtaat 2040agtgcgctgc gggggaattt taaccaggcc tttgctaccg cgggcgtcgc gttcctcctg 2100gagggctttc ctgaatttac tatcccggcg ctcggtgttt ttacatttta ctcttccatc 2160caggagcgtg agaaaattat caaaaccatc gaaaactgcc 
tggagcagcg ggtgaaacgc 2220tggaaagatt cttatcaatg gatggtgtca aactggttat ctcgcatcac gacccaattc 2280aaccatatta attaccagat gtatgatagt ctgtcgtacc aagctgacgc cattaaagcc 2340aaaattgatc tggaatataa aaagtactct ggtagcgata aggagaacat caaaagccag 2400gtggagaacc ttaagaatag tctggatgtg aaaatctctg aagctatgaa taacattaac 2460aaattcattc gtgaatgttc ggtgacgtac ctgttcaaga atatgctgcc aaaagttatt 2520gatgaactga ataaatttga tctgcgtacc aaaaccgaac ttatcaacct catcgactcc 2580cacaacatta tccttgtggg cgaagtggat cgtctgaagg ccaaagtaaa cgagagcttt 2640gaaaatacga tgccgtttaa tattttttca tataccaata actccttgct gaaagatatc 2700atcaatgaat atttcaatct agaagcacta gcgagtgggc accatcacca tcaccattaa 2760tgaaagctt 276928918PRTArtificial SequenceSynthetic sequence 28Met Gly Ser Met Thr Trp Pro Val Lys Asp Phe Asn Tyr Ser Asp Pro 1 5 10 15 Val Asn Asp Asn Asp Ile Leu Tyr Leu Arg Ile Pro Gln Asn Lys Leu 20 25 30 Ile Thr Thr Pro Val Lys Ala Phe Met Ile Thr Gln Asn Ile Trp Val 35 40 45 Ile Pro Glu Arg Phe Ser Ser Asp Thr Asn Pro Ser Leu Ser Lys Pro 50 55 60 Pro Arg Pro Thr Ser Lys Tyr Gln Ser Tyr Tyr Asp Pro Ser Tyr Leu 65 70 75 80 Ser Thr Asp Glu Gln Lys Asp Thr Phe Leu Lys Gly Ile Ile Lys Leu 85 90 95 Phe Lys Arg Ile Asn Glu Arg Asp Ile Gly Lys Lys Leu Ile Asn Tyr 100 105 110 Leu Val Val Gly Ser Pro Phe Met Gly Asp Ser Ser Thr Pro Glu Asp 115 120 125 Thr Phe Asp Phe Thr Arg His Thr Thr Asn Ile Ala Val Glu Lys Phe 130 135 140 Glu Asn Gly Ser Trp Lys Val Thr Asn Ile Ile Thr Pro Ser Val Leu 145 150 155 160 Ile Phe Gly Pro Leu Pro Asn Ile Leu Asp Tyr Thr Ala Ser Leu Thr 165 170 175 Leu Gln Gly Gln Gln Ser Asn Pro Ser Phe Glu Gly Phe Gly Thr Leu 180 185 190 Ser Ile Leu Lys Val Ala Pro Glu Phe Leu Leu Thr Phe Ser Asp Val 195 200 205 Thr Ser Asn Gln Ser Ser Ala Val Leu Gly Lys Ser Ile Phe Cys Met 210 215 220 Asp Pro Val Ile Ala Leu Met His Glu Leu Thr His Ser Leu His Gln 225 230 235 240 Leu Tyr Gly Ile Asn Ile Pro Ser Asp Lys Arg Ile Arg Pro Gln Val 245 250 255 Ser Glu Gly Phe Phe Ser Gln Asp Gly Pro Asn Val Gln Phe Glu Glu 260 265 
270 Leu Tyr Thr Phe Gly Gly Leu Asp Val Glu Ile Ile Pro Gln Ile Glu 275 280 285 Arg Ser Gln Leu Arg Glu Lys Ala Leu Gly His Tyr Lys Asp Ile Ala 290 295 300 Lys Arg Leu Asn Asn Ile Asn Lys Thr Ile Pro Ser Ser Trp Ile Ser 305 310 315 320 Asn Ile Asp Lys Tyr Lys Lys Ile Phe Ser Glu Lys Tyr Asn Phe Asp 325 330 335 Lys Asp Asn Thr Gly Asn Phe Val Val Asn Ile Asp Lys Phe Asn Ser 340 345 350 Leu Tyr Ser Asp Leu Thr Asn Val Met Ser Glu Val Val Tyr Ser Ser 355 360 365 Gln Tyr Asn Val Lys Asn Arg Thr His Tyr Phe Ser Arg His Tyr Leu 370 375 380 Pro Val Phe Ala Asn Ile Leu Asp Asp Asn Ile Tyr Thr Ile Arg Asp 385 390 395 400 Gly Phe Asn Leu Thr Asn Lys Gly Phe Asn Ile Glu Asn Ser Gly Gln 405 410 415 Asn Ile Glu Arg Asn Pro Ala Leu Gln Lys Leu Ser Ser Glu Ser Val 420 425 430 Val Asp Leu Phe Thr Lys Val Cys Val Asp Gly Gly Gly Gly Ser Ala 435 440 445 Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu 450 455 460 Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly 465 470 475 480 Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val 485 490 495 Lys Asn Asn Arg Leu Pro Tyr Val Ala Asp Lys Asp Ser Ile Ser Gln 500 505 510 Glu Ile Phe Glu Asn Lys Ile Ile Thr Asp Glu Thr Asn Val Gln Asn 515 520 525 Tyr Ser Asp Lys Phe Ser Leu Asp Glu Ser Ile Leu Asp Gly Gln Val 530 535 540 Pro Ile Asn Pro Glu Ile Val Asp Pro Leu Leu Pro Asn Val Asn Met 545 550 555 560 Glu Pro Leu Asn Leu Pro Gly Glu Glu Ile Val Phe Tyr Asp Asp Ile 565 570 575 Thr Lys Tyr Val Asp Tyr Leu Asn Ser Tyr Tyr Tyr Leu Glu Ser Gln 580 585 590 Lys Leu Ser Asn Asn Val Glu Asn Ile Thr Leu Thr Thr Ser Val Glu 595 600 605 Glu Ala Leu Gly Tyr Ser Asn Lys Ile Tyr Thr Phe Leu Pro Ser Leu 610 615 620 Ala Glu Lys Val Asn Lys Gly Val Gln Ala Gly Leu Phe Leu Asn Trp 625 630 635 640 Ala Asn Glu Val Val Glu Asp Phe Thr Thr Asn Ile Met Lys Lys Asp 645 650 655 Thr Leu Asp Lys Ile Ser Asp Val Ser Val Ile Ile Pro Tyr Ile Gly 660 665 670 Pro Ala Leu Asn Ile Gly Asn Ser Ala Leu Arg Gly Asn Phe Asn Gln 675 680 685 
Ala Phe Ala Thr Ala Gly Val Ala Phe Leu Leu Glu Gly Phe Pro Glu 690 695 700 Phe Thr Ile Pro Ala Leu Gly Val Phe Thr Phe Tyr Ser Ser Ile Gln 705 710 715 720 Glu Arg Glu Lys Ile Ile Lys Thr Ile Glu Asn Cys Leu Glu Gln Arg 725 730
735 Val Lys Arg Trp Lys Asp Ser Tyr Gln Trp Met Val Ser Asn Trp Leu 740 745 750 Ser Arg Ile Thr Thr Gln Phe Asn His Ile Asn Tyr Gln Met Tyr Asp 755 760 765 Ser Leu Ser Tyr Gln Ala Asp Ala Ile Lys Ala Lys Ile Asp Leu Glu 770 775 780 Tyr Lys Lys Tyr Ser Gly Ser Asp Lys Glu Asn Ile Lys Ser Gln Val 785 790 795 800 Glu Asn Leu Lys Asn Ser Leu Asp Val Lys Ile Ser Glu Ala Met Asn 805 810 815 Asn Ile Asn Lys Phe Ile Arg Glu Cys Ser Val Thr Tyr Leu Phe Lys 820 825 830 Asn Met Leu Pro Lys Val Ile Asp Glu Leu Asn Lys Phe Asp Leu Arg 835 840 845 Thr Lys Thr Glu Leu Ile Asn Leu Ile Asp Ser His Asn Ile Ile Leu 850 855 860 Val Gly Glu Val Asp Arg Leu Lys Ala Lys Val Asn Glu Ser Phe Glu 865 870 875 880 Asn Thr Met Pro Phe Asn Ile Phe Ser Tyr Thr Asn Asn Ser Leu Leu 885 890 895 Lys Asp Ile Ile Asn Glu Tyr Phe Asn Leu Glu Ala Leu Ala Ser Gly 900 905 910 His His His His His His 915 29907PRTArtificial SequenceSynthetic sequence 29Met Gly Ser Met Thr Trp Pro Val Lys Asp Phe Asn Tyr Ser Asp Pro 1 5 10 15 Val Asn Asp Asn Asp Ile Leu Tyr Leu Arg Ile Pro Gln Asn Lys Leu 20 25 30 Ile Thr Thr Pro Val Lys Ala Phe Met Ile Thr Gln Asn Ile Trp Val 35 40 45 Ile Pro Glu Arg Phe Ser Ser Asp Thr Asn Pro Ser Leu Ser Lys Pro 50 55 60 Pro Arg Pro Thr Ser Lys Tyr Gln Ser Tyr Tyr Asp Pro Ser Tyr Leu 65 70 75 80 Ser Thr Asp Glu Gln Lys Asp Thr Phe Leu Lys Gly Ile Ile Lys Leu 85 90 95 Phe Lys Arg Ile Asn Glu Arg Asp Ile Gly Lys Lys Leu Ile Asn Tyr 100 105 110 Leu Val Val Gly Ser Pro Phe Met Gly Asp Ser Ser Thr Pro Glu Asp 115 120 125 Thr Phe Asp Phe Thr Arg His Thr Thr Asn Ile Ala Val Glu Lys Phe 130 135 140 Glu Asn Gly Ser Trp Lys Val Thr Asn Ile Ile Thr Pro Ser Val Leu 145 150 155 160 Ile Phe Gly Pro Leu Pro Asn Ile Leu Asp Tyr Thr Ala Ser Leu Thr 165 170 175 Leu Gln Gly Gln Gln Ser Asn Pro Ser Phe Glu Gly Phe Gly Thr Leu 180 185 190 Ser Ile Leu Lys Val Ala Pro Glu Phe Leu Leu Thr Phe Ser Asp Val 195 200 205 Thr Ser Asn Gln Ser Ser Ala Val Leu Gly Lys Ser Ile Phe Cys Met 210 215 220 Asp Pro Val Ile Ala 
Leu Met His Glu Leu Thr His Ser Leu His Gln 225 230 235 240 Leu Tyr Gly Ile Asn Ile Pro Ser Asp Lys Arg Ile Arg Pro Gln Val 245 250 255 Ser Glu Gly Phe Phe Ser Gln Asp Gly Pro Asn Val Gln Phe Glu Glu 260 265 270 Leu Tyr Thr Phe Gly Gly Leu Asp Val Glu Ile Ile Pro Gln Ile Glu 275 280 285 Arg Ser Gln Leu Arg Glu Lys Ala Leu Gly His Tyr Lys Asp Ile Ala 290 295 300 Lys Arg Leu Asn Asn Ile Asn Lys Thr Ile Pro Ser Ser Trp Ile Ser 305 310 315 320 Asn Ile Asp Lys Tyr Lys Lys Ile Phe Ser Glu Lys Tyr Asn Phe Asp 325 330 335 Lys Asp Asn Thr Gly Asn Phe Val Val Asn Ile Asp Lys Phe Asn Ser 340 345 350 Leu Tyr Ser Asp Leu Thr Asn Val Met Ser Glu Val Val Tyr Ser Ser 355 360 365 Gln Tyr Asn Val Lys Asn Arg Thr His Tyr Phe Ser Arg His Tyr Leu 370 375 380 Pro Val Phe Ala Asn Ile Leu Asp Asp Asn Ile Tyr Thr Ile Arg Asp 385 390 395 400 Gly Phe Asn Leu Thr Asn Lys Gly Phe Asn Ile Glu Asn Ser Gly Gln 405 410 415 Asn Ile Glu Arg Asn Pro Ala Leu Gln Lys Leu Ser Ser Glu Ser Val 420 425 430 Val Asp Leu Phe Thr Lys Val Cys Val Asp Gly Gly Gly Gly Ser Ala 435 440 445 Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu 450 455 460 Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly 465 470 475 480 Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val 485 490 495 Lys Asn Asn Arg Leu Pro Tyr Val Ala Asp Lys Asp Ser Ile Ser Gln 500 505 510 Glu Ile Phe Glu Asn Lys Ile Ile Thr Asp Glu Thr Asn Val Gln Asn 515 520 525 Tyr Ser Asp Lys Phe Ser Leu Asp Glu Ser Ile Leu Asp Gly Gln Val 530 535 540 Pro Ile Asn Pro Glu Ile Val Asp Pro Leu Leu Pro Asn Val Asn Met 545 550 555 560 Glu Pro Leu Asn Leu Pro Gly Glu Glu Ile Val Phe Tyr Asp Asp Ile 565 570 575 Thr Lys Tyr Val Asp Tyr Leu Asn Ser Tyr Tyr Tyr Leu Glu Ser Gln 580 585 590 Lys Leu Ser Asn Asn Val Glu Asn Ile Thr Leu Thr Thr Ser Val Glu 595 600 605 Glu Ala Leu Gly Tyr Ser Asn Lys Ile Tyr Thr Phe Leu Pro Ser Leu 610 615 620 Ala Glu Lys Val Asn Lys Gly Val Gln Ala Gly Leu Phe Leu Asn Trp 625 630 635 640 Ala Asn Glu Val Val 
Glu Asp Phe Thr Thr Asn Ile Met Lys Lys Asp 645 650 655 Thr Leu Asp Lys Ile Ser Asp Val Ser Val Ile Ile Pro Tyr Ile Gly 660 665 670 Pro Ala Leu Asn Ile Gly Asn Ser Ala Leu Arg Gly Asn Phe Asn Gln 675 680 685 Ala Phe Ala Thr Ala Gly Val Ala Phe Leu Leu Glu Gly Phe Pro Glu 690 695 700 Phe Thr Ile Pro Ala Leu Gly Val Phe Thr Phe Tyr Ser Ser Ile Gln 705 710 715 720 Glu Arg Glu Lys Ile Ile Lys Thr Ile Glu Asn Cys Leu Glu Gln Arg 725 730 735 Val Lys Arg Trp Lys Asp Ser Tyr Gln Trp Met Val Ser Asn Trp Leu 740 745 750 Ser Arg Ile Thr Thr Gln Phe Asn His Ile Asn Tyr Gln Met Tyr Asp 755 760 765 Ser Leu Ser Tyr Gln Ala Asp Ala Ile Lys Ala Lys Ile Asp Leu Glu 770 775 780 Tyr Lys Lys Tyr Ser Gly Ser Asp Lys Glu Asn Ile Lys Ser Gln Val 785 790 795 800 Glu Asn Leu Lys Asn Ser Leu Asp Val Lys Ile Ser Glu Ala Met Asn 805 810 815 Asn Ile Asn Lys Phe Ile Arg Glu Cys Ser Val Thr Tyr Leu Phe Lys 820 825 830 Asn Met Leu Pro Lys Val Ile Asp Glu Leu Asn Lys Phe Asp Leu Arg 835 840 845 Thr Lys Thr Glu Leu Ile Asn Leu Ile Asp Ser His Asn Ile Ile Leu 850 855 860 Val Gly Glu Val Asp Arg Leu Lys Ala Lys Val Asn Glu Ser Phe Glu 865 870 875 880 Asn Thr Met Pro Phe Asn Ile Phe Ser Tyr Thr Asn Asn Ser Leu Leu 885 890 895 Lys Asp Ile Ile Asn Glu Tyr Phe Asn Leu Asp 900 905 302766DNAArtificial SequenceSynthetic sequence 30catatgggat ccatggagtt cgttaacaaa cagttcaact ataaagaccc agttaacggt 60gttgacattg cttacatcaa aatcccgaac gctggccaga tgcagccggt aaaggcattc 120aaaatccaca acaaaatctg ggttatcccg gaacgtgata cctttactaa cccggaagaa 180ggtgacctga acccgccacc ggaagcgaaa caggtgccgg tatcttacta tgactccacc 240tacctgtcta ccgataacga aaaggacaac tacctgaaag gtgttactaa actgttcgag 300cgtatttact ccaccgacct gggccgtatg ctgctgacta gcatcgttcg cggtatcccg 360ttctggggcg gttctaccat cgataccgaa ctgaaagtaa tcgacactaa ctgcatcaac 420gttattcagc cggacggttc ctatcgttcc gaagaactga acctggtgat catcggcccg 480tctgctgata tcatccagtt cgagtgtaag agctttggtc acgaagttct gaacctcacc 540cgtaacggct acggttccac tcagtacatc cgtttctctc cggacttcac cttcggtttt 
600gaagaatccc tggaagtaga cacgaaccca ctgctgggcg ctggtaaatt cgcaactgat 660cctgcggtta ccctggctca cgaactgatt catgcaggcc accgcctgta cggtatcgcc 720atcaatccga accgtgtctt caaagttaac accaacgcgt attacgagat gtccggtctg 780gaagttagct tcgaagaact gcgtactttt ggcggtcacg acgctaaatt catcgactct 840ctgcaagaaa acgagttccg tctgtactac tataacaagt tcaaagatat cgcatccacc 900ctgaacaaag cgaaatccat cgtgggtacc actgcttctc tccagtacat gaagaacgtt 960tttaaagaaa aatacctgct cagcgaagac acctccggca aattctctgt agacaagttg 1020aaattcgata aactttacaa aatgctgact gaaatttaca ccgaagacaa cttcgttaag 1080ttctttaaag ttctgaaccg caaaacctat ctgaacttcg acaaggcagt attcaaaatc 1140aacatcgtgc cgaaagttaa ctacactatc tacgatggtt tcaacctgcg taacaccaac 1200ctggctgcta attttaacgg ccagaacacg gaaatcaaca acatgaactt cacaaaactg 1260aaaaacttca ctggtctgtt cgagttttac aagctgctgt gcgtcgacgg cggtggcggt 1320agcgcagacg atgacgataa aggttggacc ctgaactctg ctggttacct gctgggtccg 1380cacgctgttg cgctagcggc tgaagctgct gctaaagaag ctgctgctaa agaagctgct 1440gctaaagctg gtggcggtgg ttccgcacta gtgctgcagt gtatcaaggt taacaactgg 1500gatttattct tcagcccgag tgaagacaac ttcaccaacg acctgaacaa aggtgaagaa 1560atcacctcag atactaacat cgaagcagcc gaagaaaaca tctcgctgga cctgatccag 1620cagtactacc tgacctttaa tttcgacaac gagccggaaa acatttctat cgaaaacctg 1680agctctgata tcatcggcca gctggaactg atgccgaaca tcgaacgttt cccaaacggt 1740aaaaagtacg agctggacaa atataccatg ttccactacc tgcgcgcgca ggaatttgaa 1800cacggcaaat cccgtatcgc actgactaac tccgttaacg aagctctgct caacccgtcc 1860cgtgtataca ccttcttctc tagcgactac gtgaaaaagg tcaacaaagc gactgaagct 1920gcaatgttct tgggttgggt tgaacagctt gtttatgatt ttaccgacga gacgtccgaa 1980gtatctacta ccgacaaaat tgcggatatc actatcatca tcccgtacat cggtccggct 2040ctgaacattg gcaacatgct gtacaaagac gacttcgttg gcgcactgat cttctccggt 2100gcggtgatcc tgctggagtt catcccggaa atcgccatcc cggtactggg cacctttgct 2160ctggtttctt acattgcaaa caaggttctg actgtacaaa ccatcgacaa cgcgctgagc 2220aaacgtaacg aaaaatggga tgaagtttac aaatatatcg tgaccaactg gctggctaag 2280gttaatactc agatcgacct catccgcaaa 
aaaatgaaag aagcactgga aaaccaggcg 2340gaagctacca aggcaatcat taactaccag tacaaccagt acaccgagga agaaaaaaac 2400aacatcaact tcaacatcga cgatctgtcc tctaaactga acgaatccat caacaaagct 2460atgatcaaca tcaacaagtt cctgaaccag tgctctgtaa gctatctgat gaactccatg 2520atcccgtacg gtgttaaacg tctggaggac ttcgatgcgt ctctgaaaga cgccctgctg 2580aaatacattt acgacaaccg tggcactctg atcggtcagg ttgatcgtct gaaggacaaa 2640gtgaacaata ccttatcgac cgacatccct tttcagctca gtaaatatgt cgataaccaa 2700cgccttttgt ccactctaga agcactagcg agtgggcacc atcaccatca ccattaatga 2760aagctt 276631917PRTArtificial SequenceSynthetic sequence 31Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn 
Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu 450 455 460 Ala Ala Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala 465 470 475 480 Lys Ala Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val 485 490 495 Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn 500 505 510 Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala 515 520 525 Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr 530 535 540 Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser 545 550 555 560 Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe 565 570 575 Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr 580 585 590 Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr 595 600 605 Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe 610 615 620 Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala 625 630 635 640 Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu 645 650 655 Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile 660 665 670 Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys 675 680 685 Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu 690 695 700 Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe 
Ala Leu 705 710 715 720 Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn 725 730 735 Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile 740 745 750 Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg 755 760
765 Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala 770 775 780 Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn 785 790 795 800 Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile 805 810 815 Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val 820 825 830 Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu 835 840 845 Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp 850 855 860 Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val 865 870 875 880 Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val 885 890 895 Asp Asn Gln Arg Leu Leu Ser Thr Leu Glu Ala Leu Ala Ser Gly His 900 905 910 His His His His His 915 32906PRTArtificial SequenceSynthetic sequence 32Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val 
Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu 450 455 460 Ala Ala Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala 465 470 475 480 Lys Ala Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val 485 490 495 Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn 500 505 510 Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala 515 520 525 Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr 530 535 540 Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser 545 550 555 560 Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe 565 570 575 Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr 580 585 590 Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr 595 600 605 Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe 610 615 620 Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala 625 630 635 640 Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu 645 650 655 Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile 660 665 670 Ile Pro Tyr Ile Gly Pro 
Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys 675 680 685 Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu 690 695 700 Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu 705 710 715 720 Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn 725 730 735 Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile 740 745 750 Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg 755 760 765 Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala 770 775 780 Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn 785 790 795 800 Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile 805 810 815 Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val 820 825 830 Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu 835 840 845 Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp 850 855 860 Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val 865 870 875 880 Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val 885 890 895 Asp Asn Gln Arg Leu Leu Ser Thr Leu Asp 900 905 33915PRTArtificial SequenceSynthetic sequence 33Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly 
Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Asp Asp 435 440 445 Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro 450 455 460 His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 465 470 475 480 Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn 485 490 495 Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu 500 505 510 Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu 515 520 525 Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn 530 535 540 Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp 545 550 555 560 Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn 565 570 575 Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg 580 585 590 Ala Gln Glu Phe Glu His Gly 
Lys Ser Arg Ile Ala Leu Thr Asn Ser 595 600 605 Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser 610 615 620 Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe 625 630 635 640 Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser 645 650 655 Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro 660 665 670 Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp 675 680 685 Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe 690 695 700 Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser 705 710 715 720 Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu 725 730 735 Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr 740 745 750 Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys 755 760 765 Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile 770 775 780 Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn 785 790 795 800 Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys 805 810 815 Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr 820 825 830 Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe 835 840 845 Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg 850 855 860 Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn 865 870 875 880 Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn 885 890 895 Gln Arg Leu Leu Ser Thr Leu Glu Ala Leu Ala Ser Gly His His His 900 905 910 His His His 915 34904PRTArtificial SequenceSynthetic sequence 34Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val 
Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305
310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Asp Asp 435 440 445 Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro 450 455 460 His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 465 470 475 480 Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn 485 490 495 Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu 500 505 510 Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu 515 520 525 Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn 530 535 540 Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp 545 550 555 560 Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn 565 570 575 Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg 580 585 590 Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser 595 600 605 Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser 610 615 620 Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe 625 630 635 640 Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser 645 650 655 Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro 660 665 670 Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp 675 680 685 Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe 690 695 700 Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser 705 710 715 720 Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu 725 
730 735 Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr 740 745 750 Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys 755 760 765 Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile 770 775 780 Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn 785 790 795 800 Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys 805 810 815 Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr 820 825 830 Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe 835 840 845 Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg 850 855 860 Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn 865 870 875 880 Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn 885 890 895 Gln Arg Leu Leu Ser Thr Leu Asp 900
35 905 PRT Artificial Sequence Synthetic sequence
35 Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn
Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu 450 455 460 Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln 465 470 475 480 Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp 485 490 495 Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr 500 505 510 Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln 515 520 525 Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile 530 535 540 Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn 545 550 555 560 Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr 565 570 575 Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg 580 585 590 Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg 595 600 605 Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala 610 615 620 Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp 625 630 635 640 Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp 645 650 655 Ile Thr 
Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn 660 665 670 Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala 675 680 685 Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly 690 695 700 Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln 705 710 715 720 Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val 725 730 735 Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile 740 745 750 Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu 755 760 765 Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu 770 775 780 Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu 785 790 795 800 Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn 805 810 815 Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val 820 825 830 Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys 835 840 845 Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu 850 855 860 Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu 865 870 875 880 Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Glu Ala Leu 885 890 895 Ala Ser Gly His His His His His His 900 905
36 894 PRT Artificial Sequence Synthetic sequence
36 Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile
Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu 450 455 460 Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln 465 470 475 480 Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp 485 490 495 Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr 500 505 510 Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln 515 520 525 Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile 530 535 540 Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn 545 550 555 560 Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr 565 570 575 Met Phe His Tyr 
Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg 580 585 590 Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg 595 600 605 Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala 610 615 620 Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp 625 630 635 640 Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp 645 650 655 Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn 660 665 670 Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala 675 680 685 Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly 690 695 700 Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln 705 710 715 720 Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val 725 730 735 Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile 740 745 750 Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu 755 760 765 Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu 770 775 780 Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu 785 790
795 800 Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn 805 810 815 Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val 820 825 830 Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys 835 840 845 Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu 850 855 860 Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu 865 870 875 880 Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Asp 885 890
37 900 PRT Artificial Sequence Synthetic sequence
37 Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn
Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu 450 455 460 Ala Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val Asn 465 470 475 480 Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp 485 490 495 Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala 500 505 510 Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe 515 520 525 Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser 530 535 540 Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro 545 550 555 560 Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu 565 570 575 Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn 580 585 590 Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe 595 600 605 Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met 610 615 620 Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr 625 630 635 640 Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile 645 650 655 Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp 660 665 670 Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu 675 680 685 Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val 690 695 700 Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala 705 710 715 720 Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr 
Ile Val 725 730 735 Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys 740 745 750 Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile 755 760 765 Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile 770 775 780 Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn 785 790 795 800 Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser 805 810 815 Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp 820 825 830 Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn 835 840 845 Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn 850 855 860 Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp 865 870 875 880 Asn Gln Arg Leu Leu Ser Thr Leu Glu Ala Leu Ala Ser Gly His His 885 890 895 His His His His 900
38 889 PRT Artificial Sequence Synthetic sequence
38 Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro
Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 435 440 445 Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala Leu 450 455 460 Ala Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val Asn 465 470 475 480 Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp 485 490 495 Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala 500 505 510 Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe 515 520 525 Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser 530 535 540 Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro 545 550 555 560 Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu 565 570 575 Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn 580 585 590 Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe 595 600 605 Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met 610 615 620 Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr 625 630 635 640 Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile 645 650 655 Pro Tyr Ile 
Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp 660 665 670 Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu 675 680 685 Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val 690 695 700 Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala 705 710 715 720 Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val 725 730 735 Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys 740 745 750 Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile 755 760 765 Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile 770 775 780 Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn 785 790 795 800 Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser 805 810 815 Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp 820 825 830 Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn 835 840 845 Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn 850 855 860 Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp 865 870 875 880 Asn Gln Arg Leu Leu Ser Thr Leu Asp 885
39 927 PRT Artificial Sequence Synthetic sequence
39 Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn
Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn
Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly 435 440 445 Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly 450 455 460 Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Ala Glu Ala Ala Ala 465 470 475 480 Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Ala Gly Gly Gly Gly 485 490 495 Ser Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe 500 505 510 Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu 515 520 525 Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser 530 535 540 Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu 545 550 555 560 Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln 565 570 575 Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr 580 585 590 Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe 595 600 605 Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala 610 615 620 Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val 625 630 635 640 Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val 645 650 655 Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr 660 665 670 Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro 675 680 685 Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala 690 695 700 Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile 705 710 715 720 Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn 725 730 735 Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn 740 745 750 Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala 755 760 765 Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala 770 775 780 Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr 785 790 795 800 Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp 805 810 815 Asp Leu Ser Ser 
Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn 820 825 830 Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser 835 840 845 Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu 850 855 860 Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile 865 870 875 880 Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr 885 890 895 Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu 900 905 910 Ser Thr Leu Glu Ala Leu Ala Ser Gly His His His His His His 915 920 925
40 916 PRT Artificial Sequence Synthetic sequence
40 Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295
300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly 435 440 445 Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly 450 455 460 Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Ala Glu Ala Ala Ala 465 470 475 480 Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Ala Gly Gly Gly Gly 485 490 495 Ser Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe 500 505 510 Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu 515 520 525 Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser 530 535 540 Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu 545 550 555 560 Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln 565 570 575 Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr 580 585 590 Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe 595 600 605 Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala 610 615 620 Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val 625 630 635 640 Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val 645 650 655 Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr 660 665 670 Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro 675 680 685 Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala 690 695 700 Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile 705 710 715 
720 Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn 725 730 735 Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn 740 745 750 Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala 755 760 765 Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala 770 775 780 Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr 785 790 795 800 Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp 805 810 815 Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn 820 825 830 Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser 835 840 845 Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu 850 855 860 Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile 865 870 875 880 Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr 885 890 895 Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu 900 905 910 Ser Thr Leu Asp 915
41 915 PRT Artificial Sequence Synthetic sequence
41 Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys
Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly 435 440 445 Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly 450 455 460 Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser 465 470 475 480 Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn 485 490 495 Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu 500 505 510 Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu 515 520 525 Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn 530 535 540 Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp 545 550 555 560 Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn 565 570 575 Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg 580 585 590 Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser 595 600 605 Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser 610 615 620 Ser Asp Tyr Val Lys Lys Val Asn 
Lys Ala Thr Glu Ala Ala Met Phe 625 630 635 640 Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser 645 650 655 Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro 660 665 670 Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp 675 680 685 Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe 690 695 700 Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser 705 710 715 720 Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu 725 730 735 Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr 740 745 750 Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys 755 760 765 Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile 770 775 780 Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn 785 790 795 800 Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys 805 810 815 Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr 820 825 830 Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe 835 840 845 Asp Ala Ser Leu Lys Asp Ala Leu
Leu Lys Tyr Ile Tyr Asp Asn Arg 850 855 860 Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn 865 870 875 880 Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn 885 890 895 Gln Arg Leu Leu Ser Thr Leu Glu Ala Leu Ala Ser Gly His His His 900 905 910 His His His 915
42 904 PRT Artificial Sequence Synthetic sequence 42
Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr
340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly 435 440 445 Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly 450 455 460 Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser 465 470 475 480 Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn 485 490 495 Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu 500 505 510 Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu 515 520 525 Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn 530 535 540 Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp 545 550 555 560 Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn 565 570 575 Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg 580 585 590 Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser 595 600 605 Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser 610 615 620 Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe 625 630 635 640 Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser 645 650 655 Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro 660 665 670 Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp 675 680 685 Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe 690 695 700 Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser 705 710 715 720 Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu 725 730 735 Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr 740 745 750 Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys 755 
760 765 Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile 770 775 780 Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn 785 790 795 800 Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys 805 810 815 Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr 820 825 830 Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe 835 840 845 Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg 850 855 860 Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn 865 870 875 880 Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn 885 890 895 Gln Arg Leu Leu Ser Thr Leu Asp 900
43 910 PRT Artificial Sequence Synthetic sequence 43
Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp
Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly 435 440 445 Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly 450 455 460 Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser 465 470 475 480 Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe 485 490 495 Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu 500 505 510 Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu 515 520 525 Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro 530 535 540 Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu 545 550 555 560 Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu 565 570 575 Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu 580 585 590 His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu 595 600 605 Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys 610 615 620 Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu 625 630 635 640 Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr 645 650 655 Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala 660 665 670 Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu 675 680 685 Ile Phe 
Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala 690 695 700 Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys 705 710 715 720 Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu 725 730 735 Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys 740 745 750 Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu 755 760 765 Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn 770 775 780 Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp 785 790 795 800 Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile 805 810 815 Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met 820 825 830 Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys 835 840 845 Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly 850 855 860 Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp 865 870 875 880 Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser 885 890 895 Thr Leu Glu Ala Leu Ala Ser Gly His His His His His His 900 905 910
44 899 PRT Artificial Sequence Synthetic sequence 44
Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser
180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405
410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly 435 440 445 Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly 450 455 460 Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser 465 470 475 480 Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe 485 490 495 Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu 500 505 510 Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu 515 520 525 Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro 530 535 540 Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu 545 550 555 560 Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu 565 570 575 Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu 580 585 590 His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu 595 600 605 Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys 610 615 620 Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu 625 630 635 640 Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr 645 650 655 Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala 660 665 670 Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu 675 680 685 Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala 690 695 700 Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys 705 710 715 720 Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu 725 730 735 Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys 740 745 750 Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu 755 760 765 Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn 770 775 780 Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp 785 790 795 800 Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile 805 810 815 Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met 820 825 
830 Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys 835 840 845 Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly 850 855 860 Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp 865 870 875 880 Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser 885 890 895 Thr Leu Asp
45 922 PRT Artificial Sequence Synthetic sequence 45
Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu
Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Asp Asp 435 440 445 Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro 450 455 460 His Ala Val Ala Leu Ala Ala Glu Ala Ala Ala Lys Glu Ala Ala Ala 465 470 475 480 Lys Glu Ala Ala Ala Lys Ala Gly Gly Gly Gly Ser Ala Leu Val Leu 485 490 495 Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu 500 505 510 Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp 515 520 525 Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln 530 535 540 Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser 545 550 555 560 Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro 565 570 575 Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr 580 585 590 Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser 595 600 605 Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser 610 615 620 Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys 625 630 635 640 Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr 645 650 655 Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala 660 665 670 Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly 675 680 685 Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly 690 695 700 Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu 705 710 715 720 Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val 725 730 735 Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu 740 745 750 Val Tyr Lys Tyr Ile Val Thr Asn Trp 
Leu Ala Lys Val Asn Thr Gln 755 760 765 Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala 770 775 780 Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu 785 790 795 800 Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys 805 810 815 Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu 820 825 830 Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly 835 840 845 Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu 850 855 860 Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg 865 870 875 880 Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln 885 890 895 Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Glu Ala 900 905 910 Leu Ala Ser Gly His His His His His His 915 920
46 911 PRT Artificial Sequence Synthetic sequence 46
Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn
Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Asp Asp 435 440 445 Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro 450 455 460 His Ala Val Ala Leu Ala Ala Glu Ala Ala Ala Lys Glu Ala Ala Ala 465 470 475 480 Lys Glu Ala Ala Ala Lys Ala Gly Gly Gly Gly Ser Ala Leu Val Leu 485 490 495 Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu 500 505 510 Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp 515 520 525 Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln 530 535 540 Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser 545 550 555 560 Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro 565 570 575 Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr 580 585 590 Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser 595 600 605 Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser 610 615 620 Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys 625 630 635 640 Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr 645 650 655 Asp Phe Thr Asp Glu Thr Ser Glu Val Ser 
Thr Thr Asp Lys Ile Ala 660 665 670 Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly 675 680 685 Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly 690 695 700 Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu 705 710 715 720 Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val 725 730 735 Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu 740 745 750 Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln 755 760 765 Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala 770 775 780 Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu 785 790 795 800 Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys 805 810 815 Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu 820 825 830 Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly 835 840 845 Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu 850 855 860 Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg 865 870 875 880 Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr
Asp Ile Pro Phe Gln 885 890 895 Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Asp 900 905 910
47 910 PRT Artificial Sequence Synthetic sequence 47
Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr
Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Asp Asp 435 440 445 Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro 450 455 460 His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 465 470 475 480 Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe 485 490 495 Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu 500 505 510 Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu 515 520 525 Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro 530 535 540 Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu 545 550 555 560 Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu 565 570 575 Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu 580 585 590 His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu 595 600 605 Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys 610 615 620 Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu 625 630 635 640 Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr 645 650 655 Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala 660 665 670 Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu 675 680 685 Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala 690 695 700 Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys 705 710 715 720 Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu 725 730 735 Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys 740 745 750 Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu 755 760 765 Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn 770 775 780 Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp 785 790 795 800 Leu Ser Ser Lys Leu Asn 
Glu Ser Ile Asn Lys Ala Met Ile Asn Ile 805 810 815 Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met 820 825 830 Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys 835 840 845 Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly 850 855 860 Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp 865 870 875 880 Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser 885 890 895 Thr Leu Glu Ala Leu Ala Ser Gly His His His His His His 900 905 910
48 899 PRT Artificial Sequence Synthetic sequence 48
Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile
Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Asp Asp 435 440 445 Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro 450 455 460 His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 465 470 475 480 Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe 485 490 495 Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu 500 505 510 Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu 515 520 525 Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro 530 535 540 Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu 545 550 555 560 Glu Leu Met Pro Asn Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu 565 570 575 Leu Asp Lys Tyr Thr Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu 580 585 590 His Gly Lys Ser Arg Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu 595 600 605 Leu Asn Pro Ser Arg Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys 610 615 620 Lys Val Asn Lys Ala Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu 625 630 635 640 Gln Leu Val Tyr Asp Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr 645 650 655 Asp Lys Ile Ala Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala 660 665 670 Leu Asn Ile Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu 675 680 685 Ile Phe Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala 690 695 700 Ile Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys 705 710 715 720 Val Leu 
Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu 725 730 735 Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys 740 745 750 Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu 755 760 765 Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn 770 775 780 Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp 785 790 795 800 Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile 805 810 815 Asn Lys Phe Leu Asn Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met 820 825 830 Ile Pro Tyr Gly Val Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys 835 840 845 Asp Ala Leu Leu Lys Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly 850 855 860 Gln Val Asp Arg Leu Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp 865 870 875 880 Ile Pro Phe Gln Leu Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser 885 890 895 Thr Leu Asp
49 905 PRT Artificial Sequence Synthetic sequence 49
Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr
Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Asp Asp 435 440 445 Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr
Leu Leu Gly Pro 450 455 460 His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Ala Leu Val Leu Gln 465 470 475 480 Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp 485 490 495 Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr 500 505 510 Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln 515 520 525 Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile 530 535 540 Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn 545 550 555 560 Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr 565 570 575 Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg 580 585 590 Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg 595 600 605 Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala 610 615 620 Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp 625 630 635 640 Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp 645 650 655 Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn 660 665 670 Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala 675 680 685 Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly 690 695 700 Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln 705 710 715 720 Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val 725 730 735 Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile 740 745 750 Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu 755 760 765 Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu 770 775 780 Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu 785 790 795 800 Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn 805 810 815 Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val 820 825 830 Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys 835 840 845 Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu 850 855 860 Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro 
Phe Gln Leu 865 870 875 880 Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Glu Ala Leu 885 890 895 Ala Ser Gly His His His His His His 900 905 50894PRTArtificial SequenceSynthetic sequence 50Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp Pro 1 5 10 15 Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly Gln 20 25 30 Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val Ile 35 40 45 Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn Pro 50 55 60 Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr Tyr 65 70 75 80 Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr Lys 85 90 95 Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu Thr 100 105 110 Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp Thr 115 120 125 Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile Gln Pro Asp 130 135 140 Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro Ser 145 150 155 160 Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val Leu 165 170 175 Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe Ser 180 185 190 Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr Asn 195 200 205 Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr Leu 210 215 220 Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala Ile 225 230 235 240 Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu Met 245 250 255 Ser Gly Leu Glu Val Ser Phe Glu Glu Leu Arg Thr Phe Gly Gly His 260 265 270 Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu Tyr 275 280 285 Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala Lys 290 295 300 Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val Phe 305 310 315 320 Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser Val 325 330 335 Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile Tyr 340 345 350 Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys Thr 355 360 365 Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val 
Pro Lys 370 375 380 Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn Leu 385 390 395 400 Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn Phe 405 410 415 Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu Leu 420 425 430 Cys Val Asp Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Asp Asp 435 440 445 Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro 450 455 460 His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser Ala Leu Val Leu Gln 465 470 475 480 Cys Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp 485 490 495 Asn Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr 500 505 510 Asn Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln 515 520 525 Tyr Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile 530 535 540 Glu Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn 545 550 555 560 Ile Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr 565 570 575 Met Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg 580 585 590 Ile Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg 595 600 605 Val Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala 610 615 620 Thr Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp 625 630 635 640 Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp 645 650 655 Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn 660 665 670 Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala 675 680 685 Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly 690 695 700 Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln 705 710 715 720 Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val 725 730 735 Tyr Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile 740 745 750 Asp Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu 755 760 765 Ala Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu 770 775 780 Glu Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys 
Leu 785 790 795 800 Asn Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn 805 810 815 Gln Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val 820 825 830 Lys Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys 835 840 845 Tyr Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu 850 855 860 Lys Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu 865 870 875 880 Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Asp 885 890 512889DNAArtificial SequenceSynthetic sequence 51ggatccttgg tacgagatga cgttgactat caaattttcc gcgactttgc ggaaaataaa 60ggtaagtttt tcgtcggcgc cacagacctg tccgtcaaaa ataagagagg ccagaacatc 120ggtaacgcac tgagcaacgt ccctatgatt gattttagtg tagcggacgt taataaacgg 180attgcaaccg tcgttgatcc gcagtatgct gtcagcgtca aacatgctaa agcggaagtt 240catacgttct attacgggca atataacggc cataacgatg tggctgataa agaaaatgaa 300tatcgcgtgg tcgagcagaa caattacgaa ccgcacaaag cgtggggcgc gagtaattta 360ggccgcctgg aggactataa catggcccgt ttcaataaat tcgtgaccga ggtagcaccg 420atcgccccca cagatgctgg tgggggcctg gatacctaca aagataaaaa ccgcttctct 480agcttcgtgc gcattggcgc cggtcgtcag ctcgtgtacg agaagggtgt ctatcaccag 540gaaggtaatg aaaaggggta cgacctccgt gatttgtccc aggcgtatcg ctacgctatt 600gccggaaccc cgtataaaga tattaatatc gatcaaacca tgaataccga aggcctaatt 660ggtttcggga atcataataa gcaatatagc gcagaagagc taaagcaggc cctcagccaa 720gatgcgttaa ccaattacgg agtgttaggc gatagcggca gtccgctgtt tgccttcgat 780aaacagaaaa atcaatgggt gtttctgggc acttatgatt attgggccgg atatggtaaa 840aagagctggc aggaatggaa tatttataaa aaggaattcg cagacaaaat caagcagcat 900gacaacgcag gtacggtgaa ggggaacggc gaacatcact ggaagacgac cggcacgaat 960agtcatatcg gatcgacggc cgttcgcctg gcgaacaatg agggcgatgc aaacaatggg 1020caaaacgtga cctttgagga caacggtacc ctggtcctta accagaacat aaatcagggc 1080gcgggaggct tgttctttaa aggcgactat actgttaagg gagcaaacaa tgacatcacc 1140tggttagggg ccggtattga cgttgcggat ggaaaaaagg tggtttggca ggttaaaaac 1200cctaacgggg accggctggc aaaaatcggc aaagggacat tggaaattaa tggtaccggt 1260gtgaatcagg gtcagctgaa 
agtgggagat gggaccgtga ttctgaacca gaaagcagac 1320gctgacaaaa aggtgcaagc ctttagccaa gtaggaattg ttagtggtcg tggcacactc 1380gtcttgaact caagcaacca aataaatccg gataacctgt actttggatt tcgtggcgga 1440cgcctggatg ctaacgggaa tgatctgacc tttgaacata tccgtaacgt tgacgagggt 1500gcgcgcatag ttaatcataa tactgaccat gcatcaacta tcaccttgac cgggaaaagt 1560ctgattacaa acccaaactc tctgtcagta cattccatcc agaatgatta tgatgaagac 1620gattactcat actattaccg gccgcgtaga ccaattccac aaggtaaaga tctttattac 1680aaaaattacc gttattacgc attaaaatcc ggagggcggc tgaatgcacc tatgccggaa 1740aatggcgtgg ccgaaaacaa tgactggatt tttatgggtt atactcaaga agaggctcgc 1800aaaaatgcaa tgaaccataa aaataaccga aggatcggtg atttcggcgg atttttcgat 1860gaggaaaatg gtaaaggtca caatggtgcg ctgaatctaa attttaacgg caaaagtgcc 1920cagaaacgtt tccttctgac tggtggcgct aatctgaatg gtaaaatcag tgtgacgcag 1980ggtaacgtgc tgctttctgg ccggccaact ccgcatgcac gtgattttgt aaataaatcg 2040agcgctcgta aagatgcgca tttttctaaa aataacgagg tcgtgtttga agatgactgg 2100ataaatcgca cctttaaagc ggcagaaatc gcggttaatc agagtgcgag cttttcatcg 2160ggtaggaatg tatctgatat tacagcaaac attacagcca ctgataatgc gaaggtcaac 2220ctgggttata aaaacggtga tgaagtttgt gttcgatcgg attacacggg ctatgttacc 2280tgcaacactg gcaatctgtc tgataaagcg cttaactctt ttgacgccac gcgcattaac 2340gggaatgtga acctgaacca aaacgctgcc ttggtacttg gtaaggccgc gttgtggggt 2400aaaattcagg gccagggcaa ctcccgtgtg tctctgaacc agcactcgaa gtggcacctg 2460acgggggact cgcaggtgca caacttgtcc ctggccgata gccatattca ccttaacaat 2520gcgtccgatg cccagtcagc taataaatat catacgatca aaatcaatca cctctctggc 2580aacggtcact ttcactactt aacggattta gcaaaaaact taggggataa agtcctggta 2640aaagaatcag cgagcggaca ttatcagtta catgtacaga acaaaacagg cgagccaaat 2700caggaaggcc ttgacttatt tgatgcttca tcggtacaag atcgttccag actgttcgtt 2760tcactcgcga atcactacgt tgatctgggt gcgctgcgct atactataaa gacggaaaat 2820ggcataacac gcctctataa tccctatgcc ggtaacggcc gtccggtgaa acctgctccc 2880tgcgtcgac 2889524284DNAArtificial SequenceSynthetic sequence 52ggatccttgg tacgagatga cgttgactat caaattttcc gcgactttgc ggaaaataaa 
60ggtaagtttt tcgtcggcgc cacagacctg tccgtcaaaa ataagagagg ccagaacatc 120ggtaacgcac tgagcaacgt ccctatgatt gattttagtg tagcggacgt taataaacgg 180attgcaaccg tcgttgatcc gcagtatgct gtcagcgtca aacatgctaa agcggaagtt 240catacgttct attacgggca atataacggc cataacgatg tggctgataa agaaaatgaa 300tatcgcgtgg tcgagcagaa caattacgaa ccgcacaaag cgtggggcgc gagtaattta 360ggccgcctgg aggactataa catggcccgt ttcaataaat tcgtgaccga ggtagcaccg 420atcgccccca cagatgctgg tgggggcctg gatacctaca aagataaaaa ccgcttctct 480agcttcgtgc gcattggcgc cggtcgtcag ctcgtgtacg agaagggtgt ctatcaccag 540gaaggtaatg aaaaggggta cgacctccgt gatttgtccc aggcgtatcg ctacgctatt 600gccggaaccc cgtataaaga tattaatatc gatcaaacca tgaataccga aggcctaatt 660ggtttcggga atcataataa gcaatatagc gcagaagagc taaagcaggc cctcagccaa 720gatgcgttaa ccaattacgg agtgttaggc gatagcggca gtccgctgtt tgccttcgat 780aaacagaaaa atcaatgggt gtttctgggc acttatgatt attgggccgg atatggtaaa 840aagagctggc aggaatggaa tatttataaa aaggaattcg cagacaaaat caagcagcat 900gacaacgcag gtacggtgaa ggggaacggc gaacatcact ggaagacgac cggcacgaat 960agtcatatcg gatcgacggc cgttcgcctg gcgaacaatg agggcgatgc aaacaatggg 1020caaaacgtga cctttgagga caacggtacc ctggtcctta accagaacat aaatcagggc 1080gcgggaggct tgttctttaa aggcgactat actgttaagg gagcaaacaa tgacatcacc 1140tggttagggg ccggtattga cgttgcggat ggaaaaaagg tggtttggca ggttaaaaac 1200cctaacgggg accggctggc aaaaatcggc aaagggacat tggaaattaa tggtaccggt 1260gtgaatcagg gtcagctgaa agtgggagat gggaccgtga ttctgaacca gaaagcagac 1320gctgacaaaa aggtgcaagc ctttagccaa gtaggaattg ttagtggtcg tggcacactc 1380gtcttgaact caagcaacca aataaatccg gataacctgt actttggatt tcgtggcgga 1440cgcctggatg ctaacgggaa tgatctgacc tttgaacata tccgtaacgt tgacgagggt 1500gcgcgcatag ttaatcataa tactgaccat gcatcaacta tcaccttgac cgggaaaagt 1560ctgattacaa acccaaactc tctgtcagta cattccatcc agaatgatta tgatgaagac 1620gattactcat actattaccg gccgcgtaga ccaattccac aaggtaaaga tctttattac 1680aaaaattacc gttattacgc attaaaatcc ggagggcggc tgaatgcacc tatgccggaa 1740aatggcgtgg ccgaaaacaa tgactggatt tttatgggtt 
atactcaaga agaggctcgc 1800aaaaatgcaa tgaaccataa aaataaccga aggatcggtg atttcggcgg atttttcgat 1860gaggaaaatg gtaaaggtca caatggtgcg ctgaatctaa attttaacgg caaaagtgcc 1920cagaaacgtt tccttctgac tggtggcgct aatctgaatg gtaaaatcag tgtgacgcag 1980ggtaacgtgc tgctttctgg ccggccaact ccgcatgcac gtgattttgt aaataaatcg 2040agcgctcgta aagatgcgca tttttctaaa aataacgagg tcgtgtttga agatgactgg 2100ataaatcgca cctttaaagc ggcagaaatc gcggttaatc agagtgcgag cttttcatcg 2160ggtaggaatg tatctgatat tacagcaaac attacagcca ctgataatgc gaaggtcaac 2220ctgggttata aaaacggtga tgaagtttgt gttcgatcgg attacacggg ctatgttacc 2280tgcaacactg gcaatctgtc tgataaagcg cttaactctt ttgacgccac gcgcattaac 2340gggaatgtga acctgaacca aaacgctgcc ttggtacttg gtaaggccgc gttgtggggt 2400aaaattcagg gccagggcaa ctcccgtgtg tctctgaacc agcactcgaa gtggcacctg 2460acgggggact cgcaggtgca caacttgtcc ctggccgata gccatattca ccttaacaat 2520gcgtccgatg cccagtcagc taataaatat catacgatca aaatcaatca cctctctggc 2580aacggtcact ttcactactt aacggattta gcaaaaaact taggggataa agtcctggta 2640aaagaatcag cgagcggaca ttatcagtta catgtacaga acaaaacagg cgagccaaat 2700caggaaggcc ttgacttatt tgatgcttca tcggtacaag atcgttccag actgttcgtt 2760tcactcgcga atcactacgt tgatctgggt gcgctgcgct atactataaa gacggaaaat 2820ggcataacac gcctctataa tccctatgcc ggtaacggcc gtccggtgaa acctgctccc 2880tgcgtcgacg gcggtggcgg tagcgcagac gatgacgata aaggttggac cctgaactct 2940gctggttacc tgctgggtcc gcacgctgtt gcgctagcgg gcggtggcgg tagcggcggt 3000ggcggtagcg gcggtggcgg tagcgcacta gtgctgcagt gtatcaaggt taacaactgg 3060gatttattct tcagcccgag tgaagacaac ttcaccaacg acctgaacaa aggtgaagaa 3120atcacctcag atactaacat cgaagcagcc gaagaaaaca tctcgctgga cctgatccag 3180cagtactacc tgacctttaa tttcgacaac gagccggaaa
acatttctat cgaaaacctg 3240agctctgata tcatcggcca gctggaactg atgccgaaca tcgaacgttt cccaaacggt 3300aaaaagtacg agctggacaa atataccatg ttccactacc tgcgcgcgca ggaatttgaa 3360cacggcaaat cccgtatcgc actgactaac tccgttaacg aagctctgct caacccgtcc 3420cgtgtataca ccttcttctc tagcgactac gtgaaaaagg tcaacaaagc gactgaagct 3480gcaatgttct tgggttgggt tgaacagctt gtttatgatt ttaccgacga gacgtccgaa 3540gtatctacta ccgacaaaat tgcggatatc actatcatca tcccgtacat cggtccggct 3600ctgaacattg gcaacatgct gtacaaagac gacttcgttg gcgcactgat cttctccggt 3660gcggtgatcc tgctggagtt catcccggaa atcgccatcc cggtactggg cacctttgct 3720ctggtttctt acattgcaaa caaggttctg actgtacaaa ccatcgacaa cgcgctgagc 3780aaacgtaacg aaaaatggga tgaagtttac aaatatatcg tgaccaactg gctggctaag 3840gttaatactc agatcgacct catccgcaaa aaaatgaaag aagcactgga aaaccaggcg 3900gaagctacca aggcaatcat taactaccag tacaaccagt acaccgagga agaaaaaaac 3960aacatcaact tcaacatcga cgatctgtcc tctaaactga acgaatccat caacaaagct 4020atgatcaaca tcaacaagtt cctgaaccag tgctctgtaa gctatctgat gaactccatg 4080atcccgtacg gtgttaaacg tctggaggac ttcgatgcgt ctctgaaaga cgccctgctg 4140aaatacattt acgacaaccg tggcactctg atcggtcagg ttgatcgtct gaaggacaaa 4200gtgaacaata ccttatcgac cgacatccct tttcagctca gtaaatatgt cgataaccaa 4260cgccttttgt ccactctaga ctag 4284531427PRTArtificial SequenceSynthetic sequence 53Gly Ser Leu Val Arg Asp Asp Val Asp Tyr Gln Ile Phe Arg Asp Phe 1 5 10 15 Ala Glu Asn Lys Gly Lys Phe Phe Val Gly Ala Thr Asp Leu Ser Val 20 25 30 Lys Asn Lys Arg Gly Gln Asn Ile Gly Asn Ala Leu Ser Asn Val Pro 35 40 45 Met Ile Asp Phe Ser Val Ala Asp Val Asn Lys Arg Ile Ala Thr Val 50 55 60 Val Asp Pro Gln Tyr Ala Val Ser Val Lys His Ala Lys Ala Glu Val 65 70 75 80 His Thr Phe Tyr Tyr Gly Gln Tyr Asn Gly His Asn Asp Val Ala Asp 85 90 95 Lys Glu Asn Glu Tyr Arg Val Val Glu Gln Asn Asn Tyr Glu Pro His 100 105 110 Lys Ala Trp Gly Ala Ser Asn Leu Gly Arg Leu Glu Asp Tyr Asn Met 115 120 125 Ala Arg Phe Asn Lys Phe Val Thr Glu Val Ala Pro Ile Ala Pro Thr 130 135 140 Asp Ala Gly Gly Gly Leu Asp Thr 
Tyr Lys Asp Lys Asn Arg Phe Ser 145 150 155 160 Ser Phe Val Arg Ile Gly Ala Gly Arg Gln Leu Val Tyr Glu Lys Gly 165 170 175 Val Tyr His Gln Glu Gly Asn Glu Lys Gly Tyr Asp Leu Arg Asp Leu 180 185 190 Ser Gln Ala Tyr Arg Tyr Ala Ile Ala Gly Thr Pro Tyr Lys Asp Ile 195 200 205 Asn Ile Asp Gln Thr Met Asn Thr Glu Gly Leu Ile Gly Phe Gly Asn 210 215 220 His Asn Lys Gln Tyr Ser Ala Glu Glu Leu Lys Gln Ala Leu Ser Gln 225 230 235 240 Asp Ala Leu Thr Asn Tyr Gly Val Leu Gly Asp Ser Gly Ser Pro Leu 245 250 255 Phe Ala Phe Asp Lys Gln Lys Asn Gln Trp Val Phe Leu Gly Thr Tyr 260 265 270 Asp Tyr Trp Ala Gly Tyr Gly Lys Lys Ser Trp Gln Glu Trp Asn Ile 275 280 285 Tyr Lys Lys Glu Phe Ala Asp Lys Ile Lys Gln His Asp Asn Ala Gly 290 295 300 Thr Val Lys Gly Asn Gly Glu His His Trp Lys Thr Thr Gly Thr Asn 305 310 315 320 Ser His Ile Gly Ser Thr Ala Val Arg Leu Ala Asn Asn Glu Gly Asp 325 330 335 Ala Asn Asn Gly Gln Asn Val Thr Phe Glu Asp Asn Gly Thr Leu Val 340 345 350 Leu Asn Gln Asn Ile Asn Gln Gly Ala Gly Gly Leu Phe Phe Lys Gly 355 360 365 Asp Tyr Thr Val Lys Gly Ala Asn Asn Asp Ile Thr Trp Leu Gly Ala 370 375 380 Gly Ile Asp Val Ala Asp Gly Lys Lys Val Val Trp Gln Val Lys Asn 385 390 395 400 Pro Asn Gly Asp Arg Leu Ala Lys Ile Gly Lys Gly Thr Leu Glu Ile 405 410 415 Asn Gly Thr Gly Val Asn Gln Gly Gln Leu Lys Val Gly Asp Gly Thr 420 425 430 Val Ile Leu Asn Gln Lys Ala Asp Ala Asp Lys Lys Val Gln Ala Phe 435 440 445 Ser Gln Val Gly Ile Val Ser Gly Arg Gly Thr Leu Val Leu Asn Ser 450 455 460 Ser Asn Gln Ile Asn Pro Asp Asn Leu Tyr Phe Gly Phe Arg Gly Gly 465 470 475 480 Arg Leu Asp Ala Asn Gly Asn Asp Leu Thr Phe Glu His Ile Arg Asn 485 490 495 Val Asp Glu Gly Ala Arg Ile Val Asn His Asn Thr Asp His Ala Ser 500 505 510 Thr Ile Thr Leu Thr Gly Lys Ser Leu Ile Thr Asn Pro Asn Ser Leu 515 520 525 Ser Val His Ser Ile Gln Asn Asp Tyr Asp Glu Asp Asp Tyr Ser Tyr 530 535 540 Tyr Tyr Arg Pro Arg Arg Pro Ile Pro Gln Gly Lys Asp Leu Tyr Tyr 545 550 555 560 Lys Asn Tyr Arg Tyr Tyr Ala Leu 
Lys Ser Gly Gly Arg Leu Asn Ala 565 570 575 Pro Met Pro Glu Asn Gly Val Ala Glu Asn Asn Asp Trp Ile Phe Met 580 585 590 Gly Tyr Thr Gln Glu Glu Ala Arg Lys Asn Ala Met Asn His Lys Asn 595 600 605 Asn Arg Arg Ile Gly Asp Phe Gly Gly Phe Phe Asp Glu Glu Asn Gly 610 615 620 Lys Gly His Asn Gly Ala Leu Asn Leu Asn Phe Asn Gly Lys Ser Ala 625 630 635 640 Gln Lys Arg Phe Leu Leu Thr Gly Gly Ala Asn Leu Asn Gly Lys Ile 645 650 655 Ser Val Thr Gln Gly Asn Val Leu Leu Ser Gly Arg Pro Thr Pro His 660 665 670 Ala Arg Asp Phe Val Asn Lys Ser Ser Ala Arg Lys Asp Ala His Phe 675 680 685 Ser Lys Asn Asn Glu Val Val Phe Glu Asp Asp Trp Ile Asn Arg Thr 690 695 700 Phe Lys Ala Ala Glu Ile Ala Val Asn Gln Ser Ala Ser Phe Ser Ser 705 710 715 720 Gly Arg Asn Val Ser Asp Ile Thr Ala Asn Ile Thr Ala Thr Asp Asn 725 730 735 Ala Lys Val Asn Leu Gly Tyr Lys Asn Gly Asp Glu Val Cys Val Arg 740 745 750 Ser Asp Tyr Thr Gly Tyr Val Thr Cys Asn Thr Gly Asn Leu Ser Asp 755 760 765 Lys Ala Leu Asn Ser Phe Asp Ala Thr Arg Ile Asn Gly Asn Val Asn 770 775 780 Leu Asn Gln Asn Ala Ala Leu Val Leu Gly Lys Ala Ala Leu Trp Gly 785 790 795 800 Lys Ile Gln Gly Gln Gly Asn Ser Arg Val Ser Leu Asn Gln His Ser 805 810 815 Lys Trp His Leu Thr Gly Asp Ser Gln Val His Asn Leu Ser Leu Ala 820 825 830 Asp Ser His Ile His Leu Asn Asn Ala Ser Asp Ala Gln Ser Ala Asn 835 840 845 Lys Tyr His Thr Ile Lys Ile Asn His Leu Ser Gly Asn Gly His Phe 850 855 860 His Tyr Leu Thr Asp Leu Ala Lys Asn Leu Gly Asp Lys Val Leu Val 865 870 875 880 Lys Glu Ser Ala Ser Gly His Tyr Gln Leu His Val Gln Asn Lys Thr 885 890 895 Gly Glu Pro Asn Gln Glu Gly Leu Asp Leu Phe Asp Ala Ser Ser Val 900 905 910 Gln Asp Arg Ser Arg Leu Phe Val Ser Leu Ala Asn His Tyr Val Asp 915 920 925 Leu Gly Ala Leu Arg Tyr Thr Ile Lys Thr Glu Asn Gly Ile Thr Arg 930 935 940 Leu Tyr Asn Pro Tyr Ala Gly Asn Gly Arg Pro Val Lys Pro Ala Pro 945 950 955 960 Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp 965 970 975 Thr Leu Asn Ser Ala Gly Tyr Leu Leu 
Gly Pro His Ala Val Ala Leu 980 985 990 Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser 995 1000 1005 Ala Leu Val Leu Gln Cys Ile Lys Val Asn Asn Trp Asp Leu Phe 1010 1015 1020 Phe Ser Pro Ser Glu Asp Asn Phe Thr Asn Asp Leu Asn Lys Gly 1025 1030 1035 Glu Glu Ile Thr Ser Asp Thr Asn Ile Glu Ala Ala Glu Glu Asn 1040 1045 1050 Ile Ser Leu Asp Leu Ile Gln Gln Tyr Tyr Leu Thr Phe Asn Phe 1055 1060 1065 Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu Asn Leu Ser Ser Asp 1070 1075 1080 Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile Glu Arg Phe Pro 1085 1090 1095 Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met Phe His Tyr 1100 1105 1110 Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile Ala Leu 1115 1120 1125 Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val Tyr 1130 1135 1140 Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr 1145 1150 1155 Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp 1160 1165 1170 Phe Thr Asp Glu Thr Ser Glu Val Ser Thr Thr Asp Lys Ile Ala 1175 1180 1185 Asp Ile Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile 1190 1195 1200 Gly Asn Met Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe 1205 1210 1215 Ser Gly Ala Val Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile 1220 1225 1230 Pro Val Leu Gly Thr Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys 1235 1240 1245 Val Leu Thr Val Gln Thr Ile Asp Asn Ala Leu Ser Lys Arg Asn 1250 1255 1260 Glu Lys Trp Asp Glu Val Tyr Lys Tyr Ile Val Thr Asn Trp Leu 1265 1270 1275 Ala Lys Val Asn Thr Gln Ile Asp Leu Ile Arg Lys Lys Met Lys 1280 1285 1290 Glu Ala Leu Glu Asn Gln Ala Glu Ala Thr Lys Ala Ile Ile Asn 1295 1300 1305 Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu Lys Asn Asn Ile Asn 1310 1315 1320 Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn Glu Ser Ile Asn 1325 1330 1335 Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln Cys Ser Val 1340 1345 1350 Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys Arg Leu 1355 1360 1365 Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr Ile 1370 1375 1380 Tyr 
Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys 1385 1390 1395 Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu 1400 1405 1410 Ser Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Asp 1415 1420 1425 54558DNAArtificial SequenceSynthetic sequence 54ctgcagtgta tcaatctgga ttgggacgta atccgtgata agaccaaaac aaaaatcgag 60tctttgaaag aacacggccc gatcaaaaat aagatgtctg aatcacccaa taaaactgtt 120tcggaggaaa aagcgaaaca gtatttggaa gagtttcatc aaaccgcgct tgaacatccg 180gagctcagtg aactgaaaac agtgacggga acgaatcctg tttttgcagg cgcaaactat 240gcggcttggg ccgtgaatgt tgcccaagta attgatagtg agaccgcaga caacctggaa 300aagacgaccg cagcgttaag cattttaccg gggattggtt ccgtgatggg tatagcggat 360ggagcggtcc accataacac tgaggaaatt gtcgcccagt caatcgctct gagttccctg 420atggttgcac aggctatccc actcgtgggg gaactggttg acataggttt cgccgcctac 480aacttcgtag aaagcattat taatcttttt caggtggtgc ataacagcta caaccgccct 540ctagaatgat aaaagctt 558552010DNAArtificial SequenceSynthetic sequence 55catatgggat ccatggagtt cgttaacaaa cagttcaact ataaagaccc agttaacggt 60gttgacattg cttacatcaa aatcccgaac gctggccaga tgcagccggt aaaggcattc 120aaaatccaca acaaaatctg ggttatcccg gaacgtgata cctttactaa cccggaagaa 180ggtgacctga acccgccacc ggaagcgaaa caggtgccgg tatcttacta tgactccacc 240tacctgtcta ccgataacga aaaggacaac tacctgaaag gtgttactaa actgttcgag 300cgtatttact ccaccgacct gggccgtatg ctgctgacta gcatcgttcg cggtatcccg 360ttctggggcg gttctaccat cgataccgaa ctgaaagtaa tcgacactaa ctgcatcaac 420gttattcagc cggacggttc ctatcgttcc gaagaactga acctggtgat catcggcccg 480tctgctgata tcatccagtt cgagtgtaag agctttggtc acgaagttct gaacctcacc 540cgtaacggct acggttccac tcagtacatc cgtttctctc cggacttcac cttcggtttt 600gaagaatccc tggaagtaga cacgaaccca ctgctgggcg ctggtaaatt cgcaactgat 660cctgcggtta ccctggctca cgaactgatt catgcaggcc accgcctgta cggtatcgcc 720atcaatccga accgtgtctt caaagttaac accaacgcgt attacgagat gtccggtctg 780gaagttagct tcgaagaact gcgtactttt ggcggtcacg acgctaaatt catcgactct 840ctgcaagaaa acgagttccg tctgtactac tataacaagt tcaaagatat cgcatccacc 
900ctgaacaaag cgaaatccat cgtgggtacc actgcttctc tccagtacat gaagaacgtt 960tttaaagaaa aatacctgct cagcgaagac acctccggca aattctctgt agacaagttg 1020aaattcgata aactttacaa aatgctgact gaaatttaca ccgaagacaa cttcgttaag 1080ttctttaaag ttctgaaccg caaaacctat ctgaacttcg acaaggcagt attcaaaatc 1140aacatcgtgc cgaaagttaa ctacactatc tacgatggtt tcaacctgcg taacaccaac 1200ctggctgcta attttaacgg ccagaacacg gaaatcaaca acatgaactt cacaaaactg 1260aaaaacttca ctggtctgtt cgagttttac aagctgctgt gcgtcgacgg cggtggcggt 1320agcgcagacg atgacgataa aggttggacc ctgaactctg ctggttacct gctgggtccg 1380cacgctgttg cgctagcggg cggtggcggt agcggcggtg gcggtagcgg cggtggcggt 1440agcgcactag tgctgcagtg tatcaatctg gattgggacg taatccgtga taagaccaaa 1500acaaaaatcg agtctttgaa agaacacggc ccgatcaaaa ataagatgtc tgaatcaccc 1560aataaaactg tttcggagga aaaagcgaaa cagtatttgg aagagtttca tcaaaccgcg 1620cttgaacatc cggagctcag tgaactgaaa acagtgacgg gaacgaatcc tgtttttgca 1680ggcgcaaact atgcggcttg ggccgtgaat gttgcccaag taattgatag tgagaccgca 1740gacaacctgg aaaagacgac cgcagcgtta agcattttac cggggattgg ttccgtgatg 1800ggtatagcgg atggagcggt ccaccataac actgaggaaa ttgtcgccca gtcaatcgct 1860ctgagttccc tgatggttgc acaggctatc ccactcgtgg gggaactggt tgacataggt 1920ttcgccgcct acaacttcgt agaaagcatt attaatcttt ttcaggtggt gcataacagc 1980tacaaccgcc ctctagaatg ataaaagctt 201056666PRTArtificial SequenceSynthetic sequence 56His Met Gly Ser Met Glu Phe Val Asn Lys Gln Phe Asn Tyr Lys Asp 1 5 10 15 Pro Val Asn Gly Val Asp Ile Ala Tyr Ile Lys Ile Pro Asn Ala Gly 20 25 30 Gln Met Gln Pro Val Lys Ala Phe Lys Ile His Asn Lys Ile Trp Val 35 40 45 Ile Pro Glu Arg Asp Thr Phe Thr Asn Pro Glu Glu Gly Asp Leu Asn 50 55 60 Pro Pro Pro Glu Ala Lys Gln Val Pro Val Ser Tyr Tyr Asp Ser Thr 65 70 75 80 Tyr Leu Ser Thr Asp Asn Glu Lys Asp Asn Tyr Leu Lys Gly Val Thr 85 90 95 Lys Leu Phe Glu Arg Ile Tyr Ser Thr Asp Leu Gly Arg Met Leu Leu 100 105 110 Thr Ser Ile Val Arg Gly Ile Pro Phe Trp Gly Gly Ser Thr Ile Asp 115 120 125 Thr Glu Leu Lys Val Ile Asp Thr Asn Cys Ile Asn Val Ile 
Gln Pro 130 135 140 Asp Gly Ser Tyr Arg Ser Glu Glu Leu Asn Leu Val Ile Ile Gly Pro 145 150 155 160 Ser Ala Asp Ile Ile Gln Phe Glu Cys Lys Ser Phe Gly His Glu Val 165 170 175 Leu Asn Leu Thr Arg Asn Gly Tyr Gly Ser Thr Gln Tyr Ile Arg Phe 180 185 190 Ser Pro Asp Phe Thr Phe Gly Phe Glu Glu Ser Leu Glu Val Asp Thr 195 200 205 Asn Pro Leu Leu Gly Ala Gly Lys Phe Ala Thr Asp Pro Ala Val Thr 210 215 220 Leu Ala His Glu Leu Ile His Ala Gly His Arg Leu Tyr Gly Ile Ala 225 230 235 240 Ile Asn Pro Asn Arg Val Phe Lys Val Asn Thr Asn Ala Tyr Tyr Glu 245 250 255 Met Ser Gly Leu Glu Val Ser Phe Glu
Glu Leu Arg Thr Phe Gly Gly 260 265 270 His Asp Ala Lys Phe Ile Asp Ser Leu Gln Glu Asn Glu Phe Arg Leu 275 280 285 Tyr Tyr Tyr Asn Lys Phe Lys Asp Ile Ala Ser Thr Leu Asn Lys Ala 290 295 300 Lys Ser Ile Val Gly Thr Thr Ala Ser Leu Gln Tyr Met Lys Asn Val 305 310 315 320 Phe Lys Glu Lys Tyr Leu Leu Ser Glu Asp Thr Ser Gly Lys Phe Ser 325 330 335 Val Asp Lys Leu Lys Phe Asp Lys Leu Tyr Lys Met Leu Thr Glu Ile 340 345 350 Tyr Thr Glu Asp Asn Phe Val Lys Phe Phe Lys Val Leu Asn Arg Lys 355 360 365 Thr Tyr Leu Asn Phe Asp Lys Ala Val Phe Lys Ile Asn Ile Val Pro 370 375 380 Lys Val Asn Tyr Thr Ile Tyr Asp Gly Phe Asn Leu Arg Asn Thr Asn 385 390 395 400 Leu Ala Ala Asn Phe Asn Gly Gln Asn Thr Glu Ile Asn Asn Met Asn 405 410 415 Phe Thr Lys Leu Lys Asn Phe Thr Gly Leu Phe Glu Phe Tyr Lys Leu 420 425 430 Leu Cys Val Asp Gly Gly Gly Gly Ser Ala Asp Asp Asp Asp Lys Gly 435 440 445 Trp Thr Leu Asn Ser Ala Gly Tyr Leu Leu Gly Pro His Ala Val Ala 450 455 460 Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly 465 470 475 480 Ser Ala Leu Val Leu Gln Cys Ile Asn Leu Asp Trp Asp Val Ile Arg 485 490 495 Asp Lys Thr Lys Thr Lys Ile Glu Ser Leu Lys Glu His Gly Pro Ile 500 505 510 Lys Asn Lys Met Ser Glu Ser Pro Asn Lys Thr Val Ser Glu Glu Lys 515 520 525 Ala Lys Gln Tyr Leu Glu Glu Phe His Gln Thr Ala Leu Glu His Pro 530 535 540 Glu Leu Ser Glu Leu Lys Thr Val Thr Gly Thr Asn Pro Val Phe Ala 545 550 555 560 Gly Ala Asn Tyr Ala Ala Trp Ala Val Asn Val Ala Gln Val Ile Asp 565 570 575 Ser Glu Thr Ala Asp Asn Leu Glu Lys Thr Thr Ala Ala Leu Ser Ile 580 585 590 Leu Pro Gly Ile Gly Ser Val Met Gly Ile Ala Asp Gly Ala Val His 595 600 605 His Asn Thr Glu Glu Ile Val Ala Gln Ser Ile Ala Leu Ser Ser Leu 610 615 620 Met Val Ala Gln Ala Ile Pro Leu Val Gly Glu Leu Val Asp Ile Gly 625 630 635 640 Phe Ala Ala Tyr Asn Phe Val Glu Ser Ile Ile Asn Leu Phe Gln Val 645 650 655 Val His Asn Ser Tyr Asn Arg Pro Leu Glu 660 665 571329DNAArtificial SequenceSynthetic sequence 57ggatccatgc 
ctattactat taacaatttt cgttatagcg atcccgtcaa caatgacacc 60attatcatga tggaaccgcc atattgcaaa ggactggaca tttactataa agccttcaag 120attactgacc gcatttggat tgttccagag cgttacgagt tcgggacgaa accagaagat 180tttaacccgc cttcatcgct gatcgaagga gcatcagagt attacgatcc gaactatctg 240cgtacggaca gcgataaaga ccgcttctta cagaccatgg tcaaactttt taaccgtatt 300aagaacaatg tggccggaga agcactcttg gataagatta tcaacgcgat tccatacctg 360ggcaattctt acagcctgct ggataaattt gacacaaata gtaattcagt cagctttaac 420ctgttagaac aagatccgag tggcgcaacc acgaagtctg ccatgctgac aaatctgatc 480atttttggtc caggtcctgt actgaataaa aatgaagtac gcggcatcgt tctccgcgtg 540gacaataaga actacttccc atgccgtgac ggcttcggtt cgatcatgca gatggctttc 600tgtccggagt acgttccgac gtttgataat gttattgaga atatcacgag tttaacaatc 660ggtaagtcaa aatattttca agatccggcc cttctcctta tgcatgaact gattcacgtg 720ctgcacggct tatatggtat gcaagtgtcc tcgcatgaaa tcattccgtc caaacaggaa 780atttatatgc agcataccta cccgatttca gctgaagagt tgtttacgtt tggtggccag 840gacgcgaatt tgatctccat cgacatcaaa aacgatctgt atgagaaaac attaaatgac 900tataaagcga ttgcgaacaa actgtctcag gtgactagct gcaacgatcc taacattgat 960attgattcct acaaacaaat ttatcaacag aaataccagt tcgataaaga cagcaatggt 1020cagtatatcg taaacgaaga taaatttcag atcctgtata acagcattat gtatggcttt 1080accgaaattg agttggggaa gaaatttaac attaaaaccc gtctgtctta ttttagtatg 1140aaccatgatc cggtgaaaat ccccaatctg cttgatgata ccatttataa tgataccgaa 1200gggttcaaca ttgaatctaa ggatctgaaa tccgaataca aaggccaaaa tatgcgtgtt 1260aatactaacg ctttccgtaa tgttgatggt agtggactcg tctcgaaact gattgggttg 1320tgtgtcgac 1329582772DNAArtificial SequenceSynthetic sequence 58catatgggat ccatgcctat tactattaac aattttcgtt atagcgatcc cgtcaacaat 60gacaccatta tcatgatgga accgccatat tgcaaaggac tggacattta ctataaagcc 120ttcaagatta ctgaccgcat ttggattgtt ccagagcgtt acgagttcgg gacgaaacca 180gaagatttta acccgccttc atcgctgatc gaaggagcat cagagtatta cgatccgaac 240tatctgcgta cggacagcga taaagaccgc ttcttacaga ccatggtcaa actttttaac 300cgtattaaga acaatgtggc cggagaagca ctcttggata agattatcaa cgcgattcca 360tacctgggca 
attcttacag cctgctggat aaatttgaca caaatagtaa ttcagtcagc 420tttaacctgt tagaacaaga tccgagtggc gcaaccacga agtctgccat gctgacaaat 480ctgatcattt ttggtccagg tcctgtactg aataaaaatg aagtacgcgg catcgttctc 540cgcgtggaca ataagaacta cttcccatgc cgtgacggct tcggttcgat catgcagatg 600gctttctgtc cggagtacgt tccgacgttt gataatgtta ttgagaatat cacgagttta 660acaatcggta agtcaaaata ttttcaagat ccggcccttc tccttatgca tgaactgatt 720cacgtgctgc acggcttata tggtatgcaa gtgtcctcgc atgaaatcat tccgtccaaa 780caggaaattt atatgcagca tacctacccg atttcagctg aagagttgtt tacgtttggt 840ggccaggacg cgaatttgat ctccatcgac atcaaaaacg atctgtatga gaaaacatta 900aatgactata aagcgattgc gaacaaactg tctcaggtga ctagctgcaa cgatcctaac 960attgatattg attcctacaa acaaatttat caacagaaat accagttcga taaagacagc 1020aatggtcagt atatcgtaaa cgaagataaa tttcagatcc tgtataacag cattatgtat 1080ggctttaccg aaattgagtt ggggaagaaa tttaacatta aaacccgtct gtcttatttt 1140agtatgaacc atgatccggt gaaaatcccc aatctgcttg atgataccat ttataatgat 1200accgaagggt tcaacattga atctaaggat ctgaaatccg aatacaaagg ccaaaatatg 1260cgtgttaata ctaacgcttt ccgtaatgtt gatggtagtg gactcgtctc gaaactgatt 1320gggttgtgtg tcgacggcgg tggcggtagc gcagacgatg acgataaagg ttggaccctg 1380aactctgctg gttacctgct gggtccgcac gctgttgcgc tagcgggcgg tggcggtagc 1440ggcggtggcg gtagcggcgg tggcggtagc gcactagtgc tgcagtgtat caaggttaac 1500aactgggatt tattcttcag cccgagtgaa gacaacttca ccaacgacct gaacaaaggt 1560gaagaaatca cctcagatac taacatcgaa gcagccgaag aaaacatctc gctggacctg 1620atccagcagt actacctgac ctttaatttc gacaacgagc cggaaaacat ttctatcgaa 1680aacctgagct ctgatatcat cggccagctg gaactgatgc cgaacatcga acgtttccca 1740aacggtaaaa agtacgagct ggacaaatat accatgttcc actacctgcg cgcgcaggaa 1800tttgaacacg gcaaatcccg tatcgcactg actaactccg ttaacgaagc tctgctcaac 1860ccgtcccgtg tatacacctt cttctctagc gactacgtga aaaaggtcaa caaagcgact 1920gaagctgcaa tgttcttggg ttgggttgaa cagcttgttt atgattttac cgacgagacg 1980tccgaagtat ctactaccga caaaattgcg gatatcacta tcatcatccc gtacatcggt 2040ccggctctga acattggcaa catgctgtac aaagacgact tcgttggcgc 
actgatcttc 2100tccggtgcgg tgatcctgct ggagttcatc ccggaaatcg ccatcccggt actgggcacc 2160tttgctctgg tttcttacat tgcaaacaag gttctgactg tacaaaccat cgacaacgcg 2220ctgagcaaac gtaacgaaaa atgggatgaa gtttacaaat atatcgtgac caactggctg 2280gctaaggtta atactcagat cgacctcatc cgcaaaaaaa tgaaagaagc actggaaaac 2340caggcggaag ctaccaaggc aatcattaac taccagtaca accagtacac cgaggaagaa 2400aaaaacaaca tcaacttcaa catcgacgat ctgtcctcta aactgaacga atccatcaac 2460aaagctatga tcaacatcaa caagttcctg aaccagtgct ctgtaagcta tctgatgaac 2520tccatgatcc cgtacggtgt taaacgtctg gaggacttcg atgcgtctct gaaagacgcc 2580ctgctgaaat acatttacga caaccgtggc actctgatcg gtcaggttga tcgtctgaag 2640gacaaagtga acaatacctt atcgaccgac atcccttttc agctcagtaa atatgtcgat 2700aaccaacgcc ttttgtccac tctagaagca ctagcgagtg ggcaccatca ccatcaccat 2760taatgaaagc tt 277259920PRTArtificial SequenceSynthetic sequence 59His Met Gly Ser Met Pro Ile Thr Ile Asn Asn Phe Arg Tyr Ser Asp 1 5 10 15 Pro Val Asn Asn Asp Thr Ile Ile Met Met Glu Pro Pro Tyr Cys Lys 20 25 30 Gly Leu Asp Ile Tyr Tyr Lys Ala Phe Lys Ile Thr Asp Arg Ile Trp 35 40 45 Ile Val Pro Glu Arg Tyr Glu Phe Gly Thr Lys Pro Glu Asp Phe Asn 50 55 60 Pro Pro Ser Ser Leu Ile Glu Gly Ala Ser Glu Tyr Tyr Asp Pro Asn 65 70 75 80 Tyr Leu Arg Thr Asp Ser Asp Lys Asp Arg Phe Leu Gln Thr Met Val 85 90 95 Lys Leu Phe Asn Arg Ile Lys Asn Asn Val Ala Gly Glu Ala Leu Leu 100 105 110 Asp Lys Ile Ile Asn Ala Ile Pro Tyr Leu Gly Asn Ser Tyr Ser Leu 115 120 125 Leu Asp Lys Phe Asp Thr Asn Ser Asn Ser Val Ser Phe Asn Leu Leu 130 135 140 Glu Gln Asp Pro Ser Gly Ala Thr Thr Lys Ser Ala Met Leu Thr Asn 145 150 155 160 Leu Ile Ile Phe Gly Pro Gly Pro Val Leu Asn Lys Asn Glu Val Arg 165 170 175 Gly Ile Val Leu Arg Val Asp Asn Lys Asn Tyr Phe Pro Cys Arg Asp 180 185 190 Gly Phe Gly Ser Ile Met Gln Met Ala Phe Cys Pro Glu Tyr Val Pro 195 200 205 Thr Phe Asp Asn Val Ile Glu Asn Ile Thr Ser Leu Thr Ile Gly Lys 210 215 220 Ser Lys Tyr Phe Gln Asp Pro Ala Leu Leu Leu Met His Glu Leu Ile 225 230 235 240 His Val Leu 
His Gly Leu Tyr Gly Met Gln Val Ser Ser His Glu Ile 245 250 255 Ile Pro Ser Lys Gln Glu Ile Tyr Met Gln His Thr Tyr Pro Ile Ser 260 265 270 Ala Glu Glu Leu Phe Thr Phe Gly Gly Gln Asp Ala Asn Leu Ile Ser 275 280 285 Ile Asp Ile Lys Asn Asp Leu Tyr Glu Lys Thr Leu Asn Asp Tyr Lys 290 295 300 Ala Ile Ala Asn Lys Leu Ser Gln Val Thr Ser Cys Asn Asp Pro Asn 305 310 315 320 Ile Asp Ile Asp Ser Tyr Lys Gln Ile Tyr Gln Gln Lys Tyr Gln Phe 325 330 335 Asp Lys Asp Ser Asn Gly Gln Tyr Ile Val Asn Glu Asp Lys Phe Gln 340 345 350 Ile Leu Tyr Asn Ser Ile Met Tyr Gly Phe Thr Glu Ile Glu Leu Gly 355 360 365 Lys Lys Phe Asn Ile Lys Thr Arg Leu Ser Tyr Phe Ser Met Asn His 370 375 380 Asp Pro Val Lys Ile Pro Asn Leu Leu Asp Asp Thr Ile Tyr Asn Asp 385 390 395 400 Thr Glu Gly Phe Asn Ile Glu Ser Lys Asp Leu Lys Ser Glu Tyr Lys 405 410 415 Gly Gln Asn Met Arg Val Asn Thr Asn Ala Phe Arg Asn Val Asp Gly 420 425 430 Ser Gly Leu Val Ser Lys Leu Ile Gly Leu Cys Val Asp Gly Gly Gly 435 440 445 Gly Ser Ala Asp Asp Asp Asp Lys Gly Trp Thr Leu Asn Ser Ala Gly 450 455 460 Tyr Leu Leu Gly Pro His Ala Val Ala Leu Ala Gly Gly Gly Gly Ser 465 470 475 480 Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val Leu Gln Cys 485 490 495 Ile Lys Val Asn Asn Trp Asp Leu Phe Phe Ser Pro Ser Glu Asp Asn 500 505 510 Phe Thr Asn Asp Leu Asn Lys Gly Glu Glu Ile Thr Ser Asp Thr Asn 515 520 525 Ile Glu Ala Ala Glu Glu Asn Ile Ser Leu Asp Leu Ile Gln Gln Tyr 530 535 540 Tyr Leu Thr Phe Asn Phe Asp Asn Glu Pro Glu Asn Ile Ser Ile Glu 545 550 555 560 Asn Leu Ser Ser Asp Ile Ile Gly Gln Leu Glu Leu Met Pro Asn Ile 565 570 575 Glu Arg Phe Pro Asn Gly Lys Lys Tyr Glu Leu Asp Lys Tyr Thr Met 580 585 590 Phe His Tyr Leu Arg Ala Gln Glu Phe Glu His Gly Lys Ser Arg Ile 595 600 605 Ala Leu Thr Asn Ser Val Asn Glu Ala Leu Leu Asn Pro Ser Arg Val 610 615 620 Tyr Thr Phe Phe Ser Ser Asp Tyr Val Lys Lys Val Asn Lys Ala Thr 625 630 635 640 Glu Ala Ala Met Phe Leu Gly Trp Val Glu Gln Leu Val Tyr Asp Phe 645 650 655 Thr Asp Glu Thr 
Ser Glu Val Ser Thr Thr Asp Lys Ile Ala Asp Ile 660 665 670 Thr Ile Ile Ile Pro Tyr Ile Gly Pro Ala Leu Asn Ile Gly Asn Met 675 680 685 Leu Tyr Lys Asp Asp Phe Val Gly Ala Leu Ile Phe Ser Gly Ala Val 690 695 700 Ile Leu Leu Glu Phe Ile Pro Glu Ile Ala Ile Pro Val Leu Gly Thr 705 710 715 720 Phe Ala Leu Val Ser Tyr Ile Ala Asn Lys Val Leu Thr Val Gln Thr 725 730 735 Ile Asp Asn Ala Leu Ser Lys Arg Asn Glu Lys Trp Asp Glu Val Tyr 740 745 750 Lys Tyr Ile Val Thr Asn Trp Leu Ala Lys Val Asn Thr Gln Ile Asp 755 760 765 Leu Ile Arg Lys Lys Met Lys Glu Ala Leu Glu Asn Gln Ala Glu Ala 770 775 780 Thr Lys Ala Ile Ile Asn Tyr Gln Tyr Asn Gln Tyr Thr Glu Glu Glu 785 790 795 800 Lys Asn Asn Ile Asn Phe Asn Ile Asp Asp Leu Ser Ser Lys Leu Asn 805 810 815 Glu Ser Ile Asn Lys Ala Met Ile Asn Ile Asn Lys Phe Leu Asn Gln 820 825 830 Cys Ser Val Ser Tyr Leu Met Asn Ser Met Ile Pro Tyr Gly Val Lys 835 840 845 Arg Leu Glu Asp Phe Asp Ala Ser Leu Lys Asp Ala Leu Leu Lys Tyr 850 855 860 Ile Tyr Asp Asn Arg Gly Thr Leu Ile Gly Gln Val Asp Arg Leu Lys 865 870 875 880 Asp Lys Val Asn Asn Thr Leu Ser Thr Asp Ile Pro Phe Gln Leu Ser 885 890 895 Lys Tyr Val Asp Asn Gln Arg Leu Leu Ser Thr Leu Glu Ala Leu Ala 900 905 910 Ser Gly His His His His His His 915 920

SEQ ID NO: 60; length 5; PRT; Artificial sequence; Synthetic
Asp Asp Asp Asp Lys

SEQ ID NO: 61; length 4; PRT; Artificial sequence; Synthetic
Ile Glu Gly Arg

SEQ ID NO: 62; length 4; PRT; Artificial sequence; Synthetic
Ile Asp Gly Arg

SEQ ID NO: 63; length 7; PRT; Artificial sequence; Synthetic
Glu Asn Leu Tyr Phe Gln Gly

SEQ ID NO: 64; length 6; PRT; Artificial sequence; Synthetic
Leu Val Pro Arg Gly Ser

SEQ ID NO: 65; length 8; PRT; Artificial sequence; Synthetic
Leu Glu Val Leu Phe Gln Gly Pro

SEQ ID NO: 66; length 6; PRT; Artificial sequence; Synthetic
Gly Gly Gly Gly Ser Ala

SEQ ID NO: 67; length 11; PRT; Artificial sequence; Synthetic
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala

SEQ ID NO: 68; length 16; PRT; Artificial sequence; Synthetic
Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val

SEQ ID NO: 69; length 16; PRT; Artificial sequence; Synthetic
Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala

SEQ ID NO: 70; length 21; PRT; Artificial sequence; Synthetic
Ala Leu Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Leu Val

SEQ ID NO: 71; length 19; PRT; Artificial sequence; Synthetic
Leu Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Ala Ala

SEQ ID NO: 72; length 25; PRT; Artificial sequence; Synthetic
Leu Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Ala Ala Ala

SEQ ID NO: 73; length 28; PRT; Artificial sequence; Synthetic
Ala Leu Ala Ala Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Ala Gly Gly Gly Gly Ser Ala Leu Val

SEQ ID NO: 74; length 6; PRT; Clostridium botulinum
Gly Ile Ile Thr Ser Lys

SEQ ID NO: 75; length 5; PRT; Clostridium botulinum
Ala Ile Asp Gly Arg

SEQ ID NO: 76; length 5; PRT; Clostridium botulinum
Ile Val Ser Val Lys

SEQ ID NO: 77; length 4; PRT; Clostridium botulinum
Val Ile Pro Arg

SEQ ID NO: 78; length 4; PRT; Clostridium botulinum
Val Met Tyr Lys

SEQ ID NO: 79; length 10; PRT; Clostridium tetani
Ile Ile Pro Pro Thr Asn Ile Arg Glu Asn

SEQ ID NO: 80; length 25; PRT; Clostridium botulinum
Cys Val Arg Gly Ile Ile Thr Ser Lys Thr Lys Ser Leu Asp Lys Gly Tyr Asn Lys Ala Leu Asn Asp Leu Cys

SEQ ID NO: 81; length 25; PRT; Clostridium botulinum
Cys Val Arg Gly Ile Ile Pro Phe Lys Thr Lys Ser Leu Asp Glu Gly Tyr Asn Lys Ala Leu Asn Asp Leu Cys

SEQ ID NO: 82; length 10; PRT; Clostridium botulinum
Cys Lys Ser Val Lys Ala Pro Gly Ile Cys

SEQ ID NO: 83; length 17; PRT; Clostridium botulinum
Cys His Lys Ala Ile Asp Gly Arg Ser Leu Tyr Asn Lys Thr Leu Asp Cys

SEQ ID NO: 84; length 14; PRT; Clostridium botulinum
Cys Leu Arg Leu Thr Lys Asn Ser Arg Asp Asp Ser Thr Cys

SEQ ID NO: 85; length 15; PRT; Clostridium botulinum
Cys Lys Asn Ile Val Ser Val Lys Gly Ile Arg Lys Ser Ile Cys

SEQ ID NO: 86; length 17; PRT; Clostridium botulinum
Cys Lys Ser Val Ile Pro Arg Lys Gly Thr Lys Ala Pro Pro Arg Leu Cys

SEQ ID NO: 87; length 15; PRT; Clostridium botulinum
Cys Lys Pro Val Met Tyr Lys Asn Thr Gly Lys Ser Glu Gln Cys

SEQ ID NO: 88; length 29; PRT; Clostridium tetani
Cys Lys Lys Ile Ile Pro Pro Thr Asn Ile Arg Glu Asn Leu Tyr Asn Arg Thr Ala Ser Leu Thr Asp Leu Gly Gly Glu Leu Cys

SEQ ID NO: 89; length 23; PRT; Influenza virus
Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Asn Gly Trp Glu Gly Met Ile Asp Gly Trp Tyr Gly

* * * * *

