U.S. patent application number 16/449359 was filed with the patent office on 2020-08-27 for p97-ids fusion proteins.
The applicant listed for this patent is Bioasis Technologies Inc. Invention is credited to Reinhard Gabathuler, Timothy Z. Vitalis.
Application Number | 20200270590 16/449359 |
Document ID | / |
Family ID | 1000004827872 |
Filed Date | 2020-08-27 |
United States Patent Application | 20200270590
Kind Code | A1
Vitalis; Timothy Z.; et al. | August 27, 2020
P97-IDS FUSION PROTEINS
Abstract
Provided are fusion proteins between p97 (melanotransferrin) and
iduronate-2-sulfatase (IDS), and related compositions and methods
of use thereof, for instance, to facilitate delivery of IDS across
the blood-brain barrier (BBB) and/or improve its tissue penetration
in CNS and/or peripheral tissues, and thereby treat and/or diagnose
Hunter Syndrome (Mucopolysaccharidosis type II; MPS II) and related
lysosomal storage disorders, including those having a central
nervous system (CNS) component.
Inventors: Vitalis; Timothy Z. (Vancouver, CA); Gabathuler; Reinhard (Montreal, CA)
Applicant:
Name | City | State | Country | Type
Bioasis Technologies Inc. | Guilford | CT | US |
Family ID: 1000004827872
Appl. No.: 16/449359
Filed: June 22, 2019
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number
15119293 | Aug 16, 2016 | 10392605
PCT/US2015/015662 | Feb 12, 2015 |
16449359 | |
61941896 | Feb 19, 2014 |
|
Current U.S. Class: 1/1
Current CPC Class: C07K 2319/21 20130101;
A61K 38/465 20130101; C12N 9/16 20130101; C12N 9/14 20130101; C07K
2319/50 20130101; A61K 38/00 20130101; A61K 38/40 20130101; C07K
14/70596 20130101; C07K 2319/00 20130101; A61K 9/0019 20130101;
C12Y 306/04006 20130101; C07K 2319/43 20130101; C12Y 301/06013
20130101
International Class: C12N 9/14 20060101
C12N009/14; C12N 9/16 20060101 C12N009/16; A61K 38/40 20060101
A61K038/40; A61K 38/46 20060101 A61K038/46; C07K 14/705 20060101
C07K014/705
Claims
1-46. (canceled)
47. A pharmaceutical composition comprising a p97
(melanotransferrin) fusion protein comprising: (i) an
iduronate-2-sulfatase (IDS) polypeptide fused to the C-terminus of
a p97 polypeptide and an optional heterologous peptide linker (L)
in between, wherein the p97 polypeptide comprises the amino acid
sequence having at least 80% sequence identity to DSSHAFTLDELR (SEQ
ID NO:14) and having transport activity; and (ii) a
pharmaceutically acceptable carrier.
48. The pharmaceutical composition of claim 47 which is sterile and
non-pyrogenic.
49. The pharmaceutical composition of claim 47 wherein the
pharmaceutically acceptable carrier is in the form of a liquid,
semi-liquid or a solid.
50. The pharmaceutical composition of claim 47 wherein the
pharmaceutically acceptable carrier is a saline solution.
Description
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority under 35 U.S.C. §
119(e) to U.S. Application No. 61/941,896, filed Feb. 19, 2014,
which is incorporated by reference in its entirety.
SEQUENCE LISTING
[0002] The Sequence Listing associated with this application is
provided in text format in lieu of a paper copy, and is hereby
incorporated by reference into the specification. The name of the
text file containing the Sequence Listing is dsshaft. The text file
is about 153 KB, was created on Feb. 9, 2015, and is being
submitted electronically via EFS-Web.
BACKGROUND
Technical Field
[0003] The present invention relates to fusion proteins between p97
(melanotransferrin) and iduronate-2-sulfatase (IDS), and related
compositions and methods of use thereof, for instance, to
facilitate delivery of IDS across the blood-brain barrier (BBB)
and/or improve its tissue penetration in CNS and/or peripheral
tissues, and thereby treat and/or diagnose Hunter Syndrome
(Mucopolysaccharidosis type II; MPS II) and related lysosomal
storage disorders, including those having a central nervous system
(CNS) component.
Description of the Related Art
[0004] Lysosomal storage diseases (LSDs) result from the absence or
reduced activity of specific enzymes or proteins within the
lysosomes of a cell. Within cells, the effect of the missing enzyme
activity can be seen as an accumulation of un-degraded "storage
material" within the intracellular lysosome. This build-up causes
lysosomes to swell and malfunction, resulting in cellular and
tissue damage. As lysosomal storage diseases typically have a
genetic etiology, many tissues will lack the enzyme in question.
However, different tissues suffer the absence of the same enzyme
activity differently. How adversely a tissue will be affected is
determined, to some extent, by the degree to which that tissue
generates the substrate of the missing enzyme. The types of tissue
most burdened by storage, in turn, dictate how the drug should be
administered to the patient.
[0005] A large number of lysosomal storage disease enzymes have
been identified and correlated with their respective diseases. Once
the missing or deficient enzyme has been identified, treatment can
focus on the problem of effectively delivering the replacement
enzyme to a patient's affected tissues. Hunter Syndrome or
Mucopolysaccharidosis type II (MPS II) is a lysosomal storage
disorder (LSD) caused by a deficiency in iduronate-2-sulfatase
(I2S or IDS). I2S is a lysosomal enzyme responsible for the
metabolism of mucopolysaccharides. Deficiency in the enzyme
activity leads to a variety of pathologies and, ultimately, to
premature death. Enzyme replacement therapy (ERT) with recombinant
I2S (Elaprase®) can treat peripheral symptoms, but patients
eventually suffer from dementia because the enzyme cannot cross the
blood-brain barrier (BBB).
[0006] Intravenous enzyme replacement therapy (ERT) can be
beneficial for LSDs such as MPSII. However, means for enhancing the
delivery of the therapeutic enzyme to the lysosome in such diseases
would be advantageous in terms of reduced cost and increased
therapeutic efficacy.
[0007] As one problem, the blood-brain barrier (BBB) blocks the
free transfer of many agents from blood to brain. For this reason,
LSDs that present with a significant neurological aspect are not
expected to be as responsive to intravenous ERT. For such diseases,
methods of improving the delivery of the enzyme across the BBB and
into the lysosomes of the affected cells would be highly
desirable.
BRIEF SUMMARY
[0008] Embodiments of the present invention include p97
(melanotransferrin or MTf) fusion proteins, comprising an
iduronate-2-sulfatase (IDS or I2S) polypeptide fused to a p97
polypeptide and an optional peptide linker (L) in between.
[0009] In some embodiments, the IDS polypeptide is fused to the
N-terminus of the p97 polypeptide. In certain embodiments, the IDS
polypeptide is fused to the C-terminus of the p97 polypeptide.
[0010] Certain fusion proteins comprise the peptide linker in
between. In certain embodiments, the peptide linker is selected
from one or more of a rigid linker, a flexible linker, and an
enzymatically-cleavable linker. In certain embodiments, the peptide
linker is a rigid linker, optionally comprising the sequence
(EAAAK)₁₋₃ (SEQ ID NOS:36-38), such as EAAAKEAAAKEAAAK (SEQ ID
NO:38). In some embodiments, the peptide linker is a flexible
linker. In certain embodiments, the peptide linker is an
enzymatically-cleavable linker.
[0011] In certain embodiments, the fusion protein comprises an
N-terminal signal peptide (SP) sequence, optionally selected from
Table 4. In some embodiments, the fusion protein comprises the
structure: (a) SP-IDS-L-p97 or (b) SP-p97-L-IDS.
[0012] In particular embodiments, the SP comprises the sequence
MEWSWVFLFFLSVTTGVHS (SEQ ID NO:149) and the p97 fusion protein
comprises the structure: (a) SP-p97-IDS or (b) SP-p97-L-IDS.
[0013] In certain embodiments, the SP comprises the human p97 SP
sequence MRGPSGALWLLLALRTVLG (SEQ ID NO:39) and the p97 fusion
protein comprises the structure: (a) SP-p97-IDS or (b)
SP-p97-L-IDS.
[0014] In certain embodiments, the SP comprises the human IDS SP
sequence MPPPRTGRGLLWLGLVLSSVCVALG (SEQ ID NO:40) and the p97
fusion protein comprises the structure: (a) SP-IDS-p97 or (b)
SP-IDS-L-p97.
[0015] In some embodiments, the fusion protein comprises a
purification tag (TAG), optionally selected from Table 5. In
certain embodiments, the fusion protein comprises the structure:
(a) SP-TAG-IDS-L-p97 or (b) SP-TAG-p97-L-IDS. In certain
embodiments, the tag comprises a poly-histidine tag, optionally a
10× poly-histidine tag. In some embodiments, the tag
comprises a FLAG tag DYKDDDDK (SEQ ID NO:122). In specific
embodiments, the tag comprises a poly-histidine tag, for example, a
10× poly-histidine tag, and a FLAG tag.
[0016] In certain embodiments, the fusion protein comprises a
protease site (PS), optionally selected from Table 6. In particular
embodiments, the fusion protein comprises the structure: (a)
SP-TAG-PS-IDS-L-p97 or (b) SP-TAG-PS-p97-L-IDS. In specific
embodiments, the PS site comprises the TEV protease site ENLYFQG
(SEQ ID NO:135).
[0017] In certain embodiments, the fusion protein comprises the
structure (a) SP (MEWSWVFLFFLSVTTGVHS; SEQ ID NO:149)-HIS TAG-TEV
PS-IDS-Rigid L-p97 or (b) SP (MEWSWVFLFFLSVTTGVHS; SEQ ID NO:
149)-HIS TAG-TEV PS-p97-Rigid L-IDS.
[0018] In specific embodiments, the fusion protein comprises the
structure (a) SP (MEWSWVFLFFLSVTTGVHS; SEQ ID NO: 149)-HIS TAG-TEV
PS-IDS-(EAAAK)₃-p97 or (b) SP (MEWSWVFLFFLSVTTGVHS; SEQ ID NO:
149)-HIS TAG-TEV PS-p97-(EAAAK)₃-IDS.
[0019] In certain embodiments, the fusion protein comprises the
structure (a) IDS SP-HIS TAG-TEV PS-IDS-Rigid L-p97 or (b) p97
SP-HIS TAG-TEV PS-p97-Rigid L-IDS.
[0020] In particular embodiments, the fusion protein comprises the
structure (a) IDS SP-10×HIS TAG-TEV PS-IDS-(EAAAK)₃-p97
(SEQ ID NO:29) or (b) p97 SP-10×HIS TAG-TEV
PS-p97-(EAAAK)₃-IDS (SEQ ID NO:30).
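The domain orders recited in the preceding paragraphs can be sketched as simple string concatenation from N-terminus to C-terminus. The following is a minimal illustration, not the sequences of SEQ ID NOs:29-30: the signal peptide, TEV site, and linker strings are those quoted in the text, while `IDS_PLACEHOLDER`, `P97_PLACEHOLDER`, and the helper `assemble_fusion` are hypothetical names introduced only for this example.

```python
# Illustrative sketch of the construct layouts named in the text.
# The IDS and p97 bodies are placeholders, NOT real sequences.
SIGNAL_PEPTIDE = "MEWSWVFLFFLSVTTGVHS"  # SEQ ID NO:149, as quoted in the text
HIS_TAG = "H" * 10                       # 10x poly-histidine tag
TEV_SITE = "ENLYFQG"                     # SEQ ID NO:135, as quoted in the text
RIGID_LINKER = "EAAAK" * 3               # three EAAAK repeats (SEQ ID NO:38)
IDS_PLACEHOLDER = "[IDS]"                # hypothetical stand-in for an IDS polypeptide
P97_PLACEHOLDER = "[p97]"                # hypothetical stand-in for a p97 polypeptide


def assemble_fusion(parts):
    """Join (label, sequence) pairs in N- to C-terminal order.

    Returns the dash-separated layout string and the concatenated sequence.
    """
    layout = "-".join(label for label, _ in parts)
    sequence = "".join(seq for _, seq in parts)
    return layout, sequence


# Structure (a): SP-TAG-PS-IDS-L-p97
layout_a, sequence_a = assemble_fusion([
    ("SP", SIGNAL_PEPTIDE), ("TAG", HIS_TAG), ("PS", TEV_SITE),
    ("IDS", IDS_PLACEHOLDER), ("L", RIGID_LINKER), ("p97", P97_PLACEHOLDER),
])

# Structure (b): SP-TAG-PS-p97-L-IDS
layout_b, sequence_b = assemble_fusion([
    ("SP", SIGNAL_PEPTIDE), ("TAG", HIS_TAG), ("PS", TEV_SITE),
    ("p97", P97_PLACEHOLDER), ("L", RIGID_LINKER), ("IDS", IDS_PLACEHOLDER),
])

print(layout_a)
print(layout_b)
```

Keeping each element as a labeled pair makes the two orientations differ only in the order of the list, which mirrors how the structures are written in the text.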
[0021] In certain embodiments, the IDS polypeptide comprises,
consists, or consists essentially of (a) an amino acid sequence set
forth in SEQ ID NOs:31-35; (b) an amino acid sequence at least 90%
identical to a sequence set forth in SEQ ID NOs:31-35; or (c) an
amino acid sequence that differs from SEQ ID NOs:31-35 by addition,
substitution, insertion, or deletion of about 1-50 amino acids. In
some embodiments, the IDS polypeptide comprises, consists, or
consists essentially of the amino acid sequence set forth in SEQ ID
NO:32 or 33.
[0022] In certain embodiments, the p97 polypeptide comprises,
consists, or consists essentially of (a) an amino acid sequence set
forth in SEQ ID NOs:1-28; (b) an amino acid sequence at least 90%
identical to a sequence set forth in SEQ ID NOs: 1-28; or (c) an
amino acid sequence that differs from SEQ ID NOs: 1-28 by addition,
substitution, insertion, or deletion of about 1-50 amino acids. In
particular embodiments, the p97 polypeptide comprises, consists, or
consists essentially of the amino acid sequence set forth in SEQ ID
NO:2 (soluble human p97) or SEQ ID NO:14 or 148 (MTfpep).
[0023] In certain embodiments, the fusion protein comprises,
consists, or consists essentially of (a) an amino acid sequence set
forth in SEQ ID NO: 138-142 or 29-30; (b) an amino acid sequence at
least 90% identical to a sequence set forth in SEQ ID NO: 138-142
or 29-30; or (c) an amino acid sequence that differs from SEQ ID
NO: 138-142 or 29-30 by addition, substitution, insertion, or
deletion of about 1-50 amino acids. In specific embodiments, the
fusion protein comprises, consists, or consists essentially of an
amino acid sequence set forth in SEQ ID NO: 138-142 or 29-30.
[0024] Also included are isolated polynucleotides which encode a
p97 fusion protein described herein. In some embodiments, the
isolated polynucleotide is codon-optimized for expression in a host
cell. In certain embodiments, the host cell is a mammalian cell, an
insect cell, a yeast cell, or a bacterial cell. In particular
embodiments, the polynucleotide comprises a sequence selected from
SEQ ID NOs:143-147.
[0025] Some embodiments include recombinant host cells, comprising
an isolated polynucleotide described herein, where the isolated
polynucleotide is operably linked to one or more regulatory
elements.
[0026] Also included are vectors, comprising an isolated
polynucleotide that encodes a p97 fusion protein described herein,
which is operably linked to one or more regulatory elements.
[0027] Also included are recombinant host cells, comprising a
vector, isolated polynucleotide, and/or p97 fusion protein
described herein. In certain embodiments, the host cell is a
mammalian cell, an insect cell, a yeast cell, or a bacterial cell.
In specific embodiments, the mammalian cell is a Chinese hamster
ovary (CHO) cell, a HEK-293 cell, or a HT-1080 human fibrosarcoma
cell.
[0028] Certain embodiments include pharmaceutical compositions,
comprising a pharmaceutically-acceptable carrier and a p97 fusion
protein described herein, where the pharmaceutical composition is
sterile and non-pyrogenic.
[0029] Also included are methods for the treatment of a lysosomal
storage disease in a subject in need thereof, comprising
administering to the subject a p97 fusion protein or pharmaceutical
composition described herein. In certain embodiments, the lysosomal
storage disease is Hunter Syndrome (MPS II). In certain
embodiments, the lysosomal storage disease has central nervous
system (CNS) involvement. In certain embodiments, the subject is at
risk for developing CNS involvement of the lysosomal storage
disease. In certain embodiments, the subject is a human male. In
certain embodiments, the p97 fusion protein or pharmaceutical
composition is administered by intravenous (IV) infusion or
intraperitoneal (IP) injection.
BRIEF DESCRIPTION OF THE DRAWINGS
[0030] FIGS. 1A and 1B illustrate the general structure of
exemplary fusion proteins having a signal peptide (SP),
purification or affinity tag (TAG), protease site (PS) for removal
of the SP and TAG, p97 (melanotransferrin) polypeptide, a linker
(L), and an iduronate-2-sulfatase (IDS) polypeptide.
[0031] FIG. 2 shows the enzyme activity evaluation of I2S-MTf and
MTf-I2S fusion proteins as measured by their ability to hydrolyze
the substrate 4-Nitrocatechol Sulfate (PNCS) relative to
recombinant human IDS and negative control (TZM-MTf fusion). 1 µg
of each sample was used in the enzyme activity assay, and data
presented are normalized to rhIDS.
[0032] FIG. 3 shows the enzyme activity evaluation of MTfpep-I2S
and I2S-MTfpep (with I2S propeptide) fusion proteins as measured by
their ability to hydrolyze the substrate PNCS relative to I2S-MTf
fusion and negative control (TZM-MTf fusion). 1 µg of each sample
was used in the enzyme activity assay, and data presented are
normalized to substrate blank.
[0033] FIG. 4 shows a comparison of the enzyme activity of
I2S-MTfpep (with I2S propeptide) and I2S-MTfpep (without I2S
propeptide) fusion proteins as measured by their ability to
hydrolyze the substrate PNCS. 1 µg of each sample was used in the
enzyme activity assay, and data presented are normalized to
substrate blank.
[0034] FIG. 5 shows quantification of the relative distribution of
MTfpep-I2S (with propeptide) and I2S-MTf fusion proteins between
capillaries (C) and parenchyma (P) in the brain, relative to the
total (T) signal. Quantitative confocal microscopy imaging shows
that both the MTfpep-I2S and I2S-MTf fusion proteins were strongly
associated with parenchymal tissues of the CNS.
DETAILED DESCRIPTION
[0035] The practice of the present invention will employ, unless
indicated specifically to the contrary, conventional methods of
molecular biology and recombinant DNA techniques within the skill
of the art, many of which are described below for the purpose of
illustration. Such techniques are explained fully in the
literature. See, e.g., Sambrook, et al., Molecular Cloning: A
Laboratory Manual (3rd Edition, 2000); DNA Cloning: A
Practical Approach, vol. I & II (D. Glover, ed.);
Oligonucleotide Synthesis (N. Gait, ed., 1984); Oligonucleotide
Synthesis: Methods and Applications (P. Herdewijn, ed., 2004);
Nucleic Acid Hybridization (B. Hames & S. Higgins, eds., 1985);
Nucleic Acid Hybridization: Modern Applications (Buzdin and
Lukyanov, eds., 2009); Transcription and Translation (B. Hames
& S. Higgins, eds., 1984); Animal Cell Culture (R. Freshney,
ed., 1986); Freshney, R.I. (2005) Culture of Animal Cells, a Manual
of Basic Technique, 5th Ed. Hoboken N.J., John Wiley &
Sons; B. Perbal, A Practical Guide to Molecular Cloning (3rd
Edition 2010); Farrell, R., RNA Methodologies: A Laboratory Guide
for Isolation and Characterization (3rd Edition 2005).
[0036] All publications, patents, and patent applications cited
herein are hereby incorporated by reference in their entirety.
Definitions
[0037] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as commonly understood by those
of ordinary skill in the art to which the invention belongs.
Although any methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the
present invention, certain exemplary methods and materials are
described herein. For the purposes of the present invention, the
following terms are defined below.
[0038] The articles "a" and "an" are used herein to refer to one or
to more than one (i.e., to at least one) of the grammatical object
of the article. By way of example, "an element" means one element
or more than one element.
[0039] By "about" is meant a quantity, level, value, number,
frequency, percentage, dimension, size, amount, weight or length
that varies by as much as 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3,
2 or 1% to a reference quantity, level, value, number, frequency,
percentage, dimension, size, amount, weight or length.
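As a small worked illustration of this definition, the check below tests whether a measured value falls within a chosen percentage of a reference value. It is only a sketch: the function name `within_about` and the choice to express the deviation relative to the reference are assumptions made for the example.

```python
def within_about(value, reference, percent):
    """Return True if `value` deviates from `reference` by at most `percent` %.

    The deviation is expressed relative to the reference value.
    """
    if reference == 0:
        raise ValueError("reference must be non-zero")
    deviation = abs(value - reference) * 100.0 / abs(reference)
    return deviation <= percent


# A value of 110 is within 10% of a reference of 100; 111 is not.
print(within_about(110, 100, 10))
print(within_about(111, 100, 10))
```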
[0040] As used herein, the term "amino acid" is intended to mean
both naturally occurring and non-naturally occurring amino acids as
well as amino acid analogs and mimetics. Naturally occurring amino
acids include the 20 (L)-amino acids utilized during protein
biosynthesis as well as others such as 4-hydroxyproline,
hydroxylysine, desmosine, isodesmosine, homocysteine, citrulline
and ornithine, for example. Non-naturally occurring amino acids
include, for example, (D)-amino acids, norleucine, norvaline,
p-fluorophenylalanine, ethionine and the like, which are known to a
person skilled in the art. Amino acid analogs include modified
forms of naturally and non-naturally occurring amino acids. Such
modifications can include, for example, substitution or replacement
of chemical groups and moieties on the amino acid or by
derivatization of the amino acid. Amino acid mimetics include, for
example, organic structures which exhibit functionally similar
properties such as charge and charge spacing characteristic of the
reference amino acid. For example, an organic structure which
mimics Arginine (Arg or R) would have a positive charge moiety
located in similar molecular space and having the same degree of
mobility as the ε-amino group of the side chain of the naturally
occurring Arg amino acid. Mimetics also include constrained
structures so as to maintain optimal spacing and charge
interactions of the amino acid or of the amino acid functional
groups. Those skilled in the art know or can determine what
structures constitute functionally equivalent amino acid analogs
and amino acid mimetics.
[0041] Throughout this specification, unless the context requires
otherwise, the words "comprise," "comprises," and "comprising" will
be understood to imply the inclusion of a stated step or element or
group of steps or elements but not the exclusion of any other step
or element or group of steps or elements. By "consisting of" is
meant including, and limited to, whatever follows the phrase
"consisting of." Thus, the phrase "consisting of" indicates that
the listed elements are required or mandatory, and that no other
elements may be present. By "consisting essentially of" is meant
including any elements listed after the phrase, and limited to
other elements that do not interfere with or contribute to the
activity or action specified in the disclosure for the listed
elements. Thus, the phrase "consisting essentially of" indicates
that the listed elements are required or mandatory, but that other
elements are optional and may or may not be present depending upon
whether or not they materially affect the activity or action of the
listed elements.
[0042] The term "conjugate" is intended to refer to the entity
formed as a result of covalent or non-covalent attachment or
linkage of an agent or other molecule, e.g., a biologically active
molecule, to a p97 polypeptide or p97 sequence. One example of a
conjugate polypeptide is a "fusion protein" or "fusion
polypeptide," that is, a polypeptide that is created through the
joining of two or more coding sequences, which originally coded for
separate polypeptides; translation of the joined coding sequences
results in a single, fusion polypeptide, typically with functional
properties derived from each of the separate polypeptides.
[0043] As used herein, the terms "function" and "functional" and
the like refer to a biological, enzymatic, or therapeutic
function.
[0044] "Homology" refers to the percentage number of amino acids
that are identical or constitute conservative substitutions.
Homology may be determined using sequence comparison programs such
as GAP (Devereux et al., Nucleic Acids Research. 12, 387-395,
1984), which is incorporated herein by reference. In this way
sequences of a similar or substantially different length to those
cited herein could be compared by insertion of gaps into the
alignment, such gaps being determined, for example, by the
comparison algorithm used by GAP.
[0045] By "isolated" is meant material that is substantially or
essentially free from components that normally accompany it in its
native state. For example, an "isolated peptide" or an "isolated
polypeptide" and the like, as used herein, includes the in vitro
isolation and/or purification of a peptide or polypeptide molecule
from its natural cellular environment, and from association with
other components of the cell; i.e., it is not significantly
associated with in vivo substances.
[0046] The term "linkage," "linker," "linker moiety," or "L" is
used herein to refer to a linker that can be used to separate a p97
polypeptide from an agent of interest, or to separate a first agent
from another agent, for instance where two or more agents are
linked to form a p97 conjugate or fusion protein. The linker may be
physiologically stable or may include a releasable linker such as
an enzymatically degradable linker (e.g., proteolytically cleavable
linkers). In certain aspects, the linker may be a peptide linker,
for instance, as part of a p97 fusion protein. In some aspects, the
linker may be a non-peptide linker or non-proteinaceous linker. In
some aspects, the linker may be a particle, such as a
nanoparticle.
[0047] The terms "modulating" and "altering" include "increasing,"
"enhancing" or "stimulating," as well as "decreasing" or
"reducing," typically in a statistically significant or a
physiologically significant amount or degree relative to a control.
An "increased," "stimulated" or "enhanced" amount is typically a
"statistically significant" amount, and may include an increase
that is 1.1, 1.2, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 30 or more
times (e.g., 500, 1000 times) (including all integers and decimal
points in between and above 1, e.g., 1.5, 1.6, 1.7, 1.8, etc.) the
amount produced by no composition (e.g., the absence of a fusion
protein of the invention) or a control composition, sample or test
subject. A "decreased" or "reduced" amount is typically a
"statistically significant" amount, and may include a 1%, 2%, 3%,
4%, 5%, 6%, 7%, 8%, 9%, 10%, 11%, 12%, 13%, 14%, 15%, 16%, 17%,
18%, 19%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%,
75%, 80%, 85%, 90%, 95%, or 100% decrease in the amount produced by
no composition or a control composition, including all integers in
between. As one non-limiting example, a control could compare the
activity, such as the enzymatic activity, the amount or rate of
transport/delivery across the blood-brain barrier, the rate and/or
levels of distribution to central nervous system tissue, and/or the
Cmax for plasma, central nervous system tissues, or any other
systemic or peripheral non-central nervous system tissues, of a p97
fusion protein relative to the agent/protein alone. Other examples
of comparisons and "statistically significant" amounts are
described herein.
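The fold-increase and percent-decrease comparisons described above reduce to simple arithmetic against a control measurement. The sketch below shows that arithmetic; the function names and the numeric inputs are hypothetical and are not data from this disclosure.

```python
def fold_change(test, control):
    """Ratio of a test measurement to a control measurement."""
    if control == 0:
        raise ValueError("control must be non-zero")
    return test / control


def percent_decrease(test, control):
    """Percentage by which the test measurement is below the control."""
    if control == 0:
        raise ValueError("control must be non-zero")
    return (control - test) * 100.0 / control


# Hypothetical readings in arbitrary units.
print(fold_change(30.0, 20.0))       # a 1.5-fold increase over control
print(percent_decrease(15.0, 20.0))  # a 25% decrease relative to control
```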
[0048] In certain embodiments, the "purity" of any given agent
(e.g., a p97 conjugate such as a fusion protein) in a composition
may be specifically defined. For instance, certain compositions may
comprise an agent that is at least 80%, 85%, 90%, 91%, 92%, 93%,
94%, 95%, 96%, 97%, 98%, 99%, or 100% pure, including all decimals
in between, as measured, for example and by no means limiting, by
high pressure liquid chromatography (HPLC), a well-known form of
column chromatography used frequently in biochemistry and
analytical chemistry to separate, identify, and quantify
compounds.
[0049] The terms "polypeptide" and "protein" are used
interchangeably herein to refer to a polymer of amino acid residues
and to variants and synthetic analogues of the same. Thus, these
terms apply to amino acid polymers in which one or more amino acid
residues are synthetic non-naturally occurring amino acids, such as
a chemical analogue of a corresponding naturally occurring amino
acid, as well as to naturally-occurring amino acid polymers. The
polypeptides described herein are not limited to a specific length
of the product; thus, peptides, oligopeptides, and proteins are
included within the definition of polypeptide, and such terms may
be used interchangeably herein unless specifically indicated
otherwise. The polypeptides described herein may also comprise
post-expression modifications, such as glycosylations,
acetylations, phosphorylations and the like, as well as other
modifications known in the art, both naturally occurring and
non-naturally occurring. A polypeptide may be an entire protein, or
a subsequence, fragment, variant, or derivative thereof.
[0050] A "physiologically cleavable" or "hydrolyzable" or
"degradable" bond is a bond that reacts with water (i.e., is
hydrolyzed) under physiological conditions. The tendency of a bond
to hydrolyze in water will depend not only on the general type of
linkage connecting two central atoms but also on the substituents
attached to these central atoms. Appropriate hydrolytically
unstable or weak linkages include, but are not limited to:
carboxylate ester, phosphate ester, anhydride, acetal, ketal,
acyloxyalkyl ether, imine, orthoester, thio ester, thiol ester,
carbonate, hydrazone, peptides, and oligonucleotides.
[0051] A "releasable linker" includes, but is not limited to, a
physiologically cleavable linker and an enzymatically degradable
linker. Thus, a "releasable linker" is a linker that may undergo
either spontaneous hydrolysis, or cleavage by some other mechanism
(e.g., enzyme-catalyzed, acid-catalyzed, base-catalyzed, and so
forth) under physiological conditions. For example, a "releasable
linker" can involve an elimination reaction that has a base
abstraction of a proton, (e.g., an ionizable hydrogen atom,
Hα), as the driving force. For purposes herein, a "releasable
linker" is synonymous with a "degradable linker." An "enzymatically
degradable linkage" includes a linkage, e.g., amino acid sequence
that is subject to degradation by one or more enzymes, e.g.,
peptidases or proteases. In particular embodiments, a releasable
linker has a half-life at pH 7.4, 25° C., e.g., a
physiological pH, human body temperature (e.g., in vivo), of about
30 minutes, about 1 hour, about 2 hours, about 3 hours, about 4
hours, about 5 hours, about 6 hours, about 12 hours, about 18
hours, about 24 hours, about 36 hours, about 48 hours, about 72
hours, or about 96 hours or less.
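To make the half-life figures concrete, the fraction of intact linker remaining after a given time can be estimated if cleavage is assumed to follow first-order kinetics. That kinetic model is an assumption of this sketch and is not stated in the text; the function name `fraction_remaining` is likewise hypothetical.

```python
def fraction_remaining(elapsed_hours, half_life_hours):
    """Fraction of intact linker left after `elapsed_hours`.

    Assumes simple first-order decay: the amount halves every half-life.
    """
    if half_life_hours <= 0:
        raise ValueError("half-life must be positive")
    if elapsed_hours < 0:
        raise ValueError("elapsed time must be non-negative")
    return 0.5 ** (elapsed_hours / half_life_hours)


# With a 2-hour half-life, one quarter of the linker remains after 4 hours.
print(fraction_remaining(4, 2))
```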
[0052] The term "reference sequence" refers generally to a nucleic
acid coding sequence, or amino acid sequence, to which another
sequence is being compared. All polypeptide and polynucleotide
sequences described herein are included as reference sequences,
including those described by name and those described in the
Sequence Listing.
[0053] The terms "sequence identity" or, for example, comprising a
"sequence 50% identical to," as used herein, refer to the extent
that sequences are identical on a nucleotide-by-nucleotide basis or
an amino acid-by-amino acid basis over a window of comparison.
Thus, a "percentage of sequence identity" may be calculated by
comparing two optimally aligned sequences over the window of
comparison, determining the number of positions at which the
identical nucleic acid base (e.g., A, T, C, G, I) or the identical
amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile,
Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys and Met)
occurs in both sequences to yield the number of matched positions,
dividing the number of matched positions by the total number of
positions in the window of comparison (i.e., the window size), and
multiplying the result by 100 to yield the percentage of sequence
identity. Included are nucleotides and polypeptides having at least
about 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 97%, 98%,
99%, or 100% sequence identity to any of the reference sequences
described herein (see, e.g., Sequence Listing), typically where the
polypeptide variant maintains at least one biological activity of
the reference polypeptide.
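The percentage calculation described in this paragraph can be sketched for two sequences that have already been aligned to the same length. The example uses the 12-residue sequence DSSHAFTLDELR (SEQ ID NO:14) from the claims as the reference; the two variant strings are hypothetical and were invented only to show values above and below an 80% threshold. Producing the optimal alignment itself is outside the scope of this sketch.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over a comparison window of pre-aligned sequences.

    Matched positions are divided by the window size and multiplied by 100.
    A gap character ("-") never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


REFERENCE = "DSSHAFTLDELR"   # SEQ ID NO:14, as quoted in the claims
VARIANT_2 = "DSSHAYTLDELK"   # hypothetical variant, 2 substitutions
VARIANT_3 = "DSSHAYTLEELK"   # hypothetical variant, 3 substitutions

for variant in (VARIANT_2, VARIANT_3):
    identity = percent_identity(REFERENCE, variant)
    print(variant, round(identity, 1), identity >= 80.0)
```

For a 12-residue window, at least 10 matched positions are needed to reach 80% identity, since 9 of 12 gives only 75%.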
[0054] Terms used to describe sequence relationships between two or
more polynucleotides or polypeptides include "reference sequence,"
"comparison window," "sequence identity," "percentage of sequence
identity," and "substantial identity." A "reference sequence" is at
least 12 but frequently 15 to 18 and often at least 25 monomer
units, inclusive of nucleotides and amino acid residues, in length.
Because two polynucleotides may each comprise (1) a sequence (i.e.,
only a portion of the complete polynucleotide sequence) that is
similar between the two polynucleotides, and (2) a sequence that is
divergent between the two polynucleotides, sequence comparisons
between two (or more) polynucleotides are typically performed by
comparing sequences of the two polynucleotides over a "comparison
window" to identify and compare local regions of sequence
similarity. A "comparison window" refers to a conceptual segment of
at least 6 contiguous positions, usually about 50 to about 100,
more usually about 100 to about 150 in which a sequence is compared
to a reference sequence of the same number of contiguous positions
after the two sequences are optimally aligned. The comparison
window may comprise additions or deletions (i.e., gaps) of about
20% or less as compared to the reference sequence (which does not
comprise additions or deletions) for optimal alignment of the two
sequences. Optimal alignment of sequences for aligning a comparison
window may be conducted by computerized implementations of
algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin
Genetics Software Package Release 7.0, Genetics Computer Group, 575
Science Drive Madison, Wis., USA) or by inspection and the best
alignment (i.e., resulting in the highest percentage homology over
the comparison window) generated by any of the various methods
selected. Reference also may be made to the BLAST family of
programs as for example disclosed by Altschul et al., Nucl. Acids
Res. 25:3389, 1997. A detailed discussion of sequence analysis can
be found in Unit 19.3 of Ausubel et al., "Current Protocols in
Molecular Biology," John Wiley & Sons Inc, 1994-1998, Chapter
15.
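As a rough illustration of the terms defined above, percent sequence identity over a comparison window can be computed from two sequences that have already been aligned. The sketch below assumes the alignment was produced elsewhere (for example by GAP or BLAST), uses "-" as the gap character, and counts identical non-gap columns over the total number of window columns; that convention, the function names, and the toy sequences are illustrative assumptions and are not taken from any of the cited programs.

```python
def percent_identity(aligned_query: str, aligned_ref: str, gap: str = "-") -> float:
    """Percent identity over a comparison window of two pre-aligned sequences.

    Both strings must have the same length (the window, including gap columns).
    Assumed convention: identical non-gap columns / total columns * 100.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must be the same length")
    if not aligned_query:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for q, r in zip(aligned_query, aligned_ref)
        if q == r and q != gap
    )
    return 100.0 * matches / len(aligned_query)


def gap_fraction(aligned_query: str, gap: str = "-") -> float:
    """Fraction of window columns that are gaps in the compared sequence."""
    return aligned_query.count(gap) / len(aligned_query)


# Toy 12-column window: 10 identical columns, 1 gap, 1 mismatch.
query = "DSSHAFTLDE-R"
ref = "DSSHAFTLDELK"

identity = percent_identity(query, ref)   # 10 of 12 columns are identical
gaps_ok = gap_fraction(query) <= 0.20     # the "about 20% or less" gap allowance
```

Other programs may normalize by the shorter sequence or by ungapped columns only, so identity values are comparable only when the same convention is used.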
[0055] By "statistically significant," it is meant that the result
was unlikely to have occurred by chance. Statistical significance
can be determined by any method known in the art. Commonly used
measures of significance include the p-value, which is the
frequency or probability with which the observed event would occur,
if the null hypothesis were true. If the obtained p-value is
smaller than the significance level, then the null hypothesis is
rejected. In simple cases, the significance level is defined at a
p-value of 0.05 or less.
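The decision rule described above can be sketched with a small exact test. The example assumes a one-sided binomial test with a null success probability of 0.5; the choice of test, the function name, and the numbers are illustrative assumptions, since the text permits any method known in the art.

```python
from math import comb


def binomial_p_value(successes: int, trials: int, p_null: float = 0.5) -> float:
    """One-sided exact p-value: P(X >= successes) if the null hypothesis were true."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


ALPHA = 0.05  # significance level used "in simple cases"

# Hypothetical observation: 9 successes in 10 trials.
p = binomial_p_value(successes=9, trials=10)
reject_null = p < ALPHA  # reject only if p is smaller than the significance level
```

Here the p-value is 11/1024 (about 0.011), which is below 0.05, so the null hypothesis would be rejected under these assumptions.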
[0056] The term "solubility" refers to the property of a protein to
dissolve in a liquid solvent and form a homogeneous solution.
Solubility is typically expressed as a concentration, either by
mass of solute per unit volume of solvent (g of solute per kg of
solvent, g per dL (100 mL), mg/ml, etc.), molarity, molality, mole
fraction or other similar descriptions of concentration. The
maximum equilibrium amount of solute that can dissolve per amount
of solvent is the solubility of that solute in that solvent under
the specified conditions, including temperature, pressure, pH, and
the nature of the solvent. In certain embodiments, solubility is
measured at physiological pH, or other pH, for example, at pH 5.0,
pH 6.0, pH 7.0, or pH 7.4. In certain embodiments, solubility is
measured in water or a physiological buffer such as PBS or NaCl
(with or without NaP). In specific embodiments, solubility is
measured at relatively lower pH (e.g., pH 6.0) and relatively
higher salt (e.g., 500 mM NaCl and 10 mM NaP). In certain
embodiments, solubility is measured in a biological fluid (solvent)
such as blood or serum. In certain embodiments, the temperature can
be about room temperature (e.g., about 20, 21, 22, 23, 24,
25° C.) or about body temperature (~37° C.). In certain
embodiments, a p97 polypeptide fusion protein has a
solubility of at least about 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7,
0.8, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19, 20, 25, or 30 mg/ml at room temperature or at about
37° C.
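Because solubility may be stated either as mass per volume or as molarity, converting between the two can help when comparing values. The sketch below converts mg/ml to micromolar; the molecular weight used in the example is a hypothetical round number chosen for the arithmetic and is not a value stated for any protein described herein.

```python
def mg_per_ml_to_micromolar(conc_mg_per_ml: float, mol_weight_da: float) -> float:
    """Convert a mass concentration (mg/ml) to a molar concentration (micromolar).

    1 mg/ml equals 1 g/L, and dividing g/L by g/mol (Da) gives mol/L.
    """
    if mol_weight_da <= 0:
        raise ValueError("molecular weight must be positive")
    grams_per_liter = conc_mg_per_ml
    return grams_per_liter / mol_weight_da * 1e6


# Hypothetical protein of 100,000 Da dissolved at 1 mg/ml.
example_um = mg_per_ml_to_micromolar(1.0, 100_000.0)  # 10 micromolar
```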
[0057] A "subject," as used herein, includes any animal that
exhibits a symptom, or is at risk for exhibiting a symptom, which
can be treated or diagnosed with a p97 fusion protein of the
invention. Suitable subjects (patients) include laboratory animals
(such as mouse, rat, rabbit, or guinea pig), farm animals, and
domestic animals or pets (such as a cat or dog). Non-human primates
and, preferably, human patients, are included.
[0058] "Substantially" or "essentially" means nearly totally or
completely, for instance, 95%, 96%, 97%, 98%, 99% or greater of
some given quantity.
[0059] "Substantially free" refers to the nearly complete or
complete absence of a given quantity, for instance, less than about
10%, 5%, 4%, 3%, 2%, 1%, 0.5% or less of some given quantity. For
example, certain compositions may be "substantially free" of cell
proteins, membranes, nucleic acids, endotoxins, or other
contaminants.
[0060] "Treatment" or "treating," as used herein, includes any
desirable effect on the symptoms or pathology of a disease or
condition, and may include even minimal changes or improvements in
one or more measurable markers of the disease or condition being
treated. "Treatment" or "treating" does not necessarily indicate
complete eradication or cure of the disease or condition, or
associated symptoms thereof. The subject receiving this treatment
is any subject in need thereof. Exemplary markers of clinical
improvement will be apparent to persons skilled in the art.
[0061] The term "wild-type" refers to a gene or gene product that
has the characteristics of that gene or gene product when isolated
from a naturally-occurring source. A wild-type gene or gene product
(e.g., a polypeptide) is that which is most frequently observed in
a population and is thus arbitrarily designated the "normal" or
"wild-type" form of the gene.
Fusion Proteins
[0062] Embodiments of the present invention relate generally to
fusion proteins that comprise a human p97 (melanotransferrin; MTf)
polypeptide sequence and an iduronate-2-sulfatase (IDS or I2S)
polypeptide sequence, polynucleotides encoding the fusion proteins,
host cells and methods of producing fusion proteins, and related
compositions and methods of use thereof. Exemplary fusion proteins
(e.g., Table 1), p97 polypeptide sequences (e.g., Table 2), and IDS
polypeptide sequences (e.g., Table 3) are described herein. The
terms "p97" and "MTf" are used interchangeably herein, as are the
terms "IDS" and "I2S."
[0063] Also described are exemplary methods and components for
coupling a p97 polypeptide sequence to an IDS sequence. In certain
embodiments, the p97 fusion protein comprises one or more signal
peptide sequences (SP), purification tags (TAG), protease cleavage
sites (PS), and/or peptide linkers (L), including any combination
of the foregoing, examples of which are provided herein. Variants
and fragments of any of the foregoing are also described
herein.
[0064] In certain embodiments, the p97 fusion protein comprises,
consists, or consists essentially of at least one of the
configurations illustrated below (N-terminus to C-terminus):
[0065] IDS-p97
[0066] p97-IDS
[0067] IDS-L-p97
[0068] p97-L-IDS
[0069] SP-IDS-p97
[0070] SP-p97-IDS
[0071] SP-IDS-L-p97
[0072] SP-p97-L-IDS
[0073] SP-PS-IDS-p97
[0074] SP-PS-p97-IDS
[0075] SP-PS-IDS-L-p97
[0076] SP-PS-p97-L-IDS
[0077] SP-TAG-PS-IDS-p97
[0078] SP-TAG-PS-p97-IDS
[0079] SP-TAG-PS-IDS-L-p97
[0080] SP-TAG-PS-p97-L-IDS
[0081] TAG-IDS-p97
[0082] TAG-p97-IDS
[0083] TAG-IDS-L-p97
[0084] TAG-p97-L-IDS
[0085] TAG-PS-IDS-p97
[0086] TAG-PS-p97-IDS
[0087] TAG-PS-IDS-L-p97
[0088] TAG-PS-p97-L-IDS
[0089] IDS SP-HIS TAG-TEV PS-IDS-Rigid L-p97
[0090] IDS SP-HIS TAG-TEV PS-IDS-(EAAAK)₃-p97
[0091] p97 SP-HIS TAG-TEV PS-p97-Rigid L-IDS
[0092] p97 SP-HIS TAG-TEV PS-p97-(EAAAK)₃-IDS
[0093] Fusion proteins of these and related configurations can be
constructed using any of the IDS, p97, L, SP, TAG, or PS sequences
described herein, including functional or active variants and
fragments thereof.
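The modular configurations listed above can be read as an ordered concatenation of component sequences from N-terminus to C-terminus. The sketch below assembles one such configuration in silico. The 10xHIS tag, the TEV site (ENLYFQ, as it appears in Table 1), the (EAAAK)3 linker, and MTfpep with a C-terminal Y are taken from sequences described herein; the IDS entry is deliberately a 10-residue stub (the first residues of SEQ ID NO:32) to keep the example short, and the dictionary and function names are hypothetical.

```python
# Component sequences keyed by the abbreviations used in the configurations.
COMPONENTS = {
    "TAG": "HHHHHHHHHH",      # 10xHIS purification tag
    "PS": "ENLYFQ",           # TEV protease site as it appears in Table 1
    "IDS": "SETQANSTTD",      # stub only: first 10 residues of SEQ ID NO:32
    "L": "EAAAKEAAAKEAAAK",   # (EAAAK)3 rigid linker, SEQ ID NO:38
    "p97": "DSSHAFTLDELRY",   # MTfpep with C-terminal Y, SEQ ID NO:148
}


def build_fusion(configuration: str, components: dict[str, str] = COMPONENTS) -> str:
    """Concatenate components N-terminus to C-terminus, e.g. 'TAG-PS-IDS-L-p97'."""
    parts = configuration.split("-")
    unknown = [part for part in parts if part not in components]
    if unknown:
        raise KeyError(f"unknown component(s): {unknown}")
    return "".join(components[part] for part in parts)


fusion = build_fusion("TAG-PS-IDS-L-p97")
```

Swapping the order of the keys in the configuration string (for example "TAG-PS-p97-L-IDS") yields the reverse orientation without changing the component table.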
[0094] Specific examples of p97 fusion proteins are illustrated in
Table 1 below.
TABLE-US-00001 TABLE 1 Exemplary p97 Fusion Proteins SEQ ID
Description Sequence NO: IDS SP-
MPPPRTGRGLLWLGLVLSSVCVALGHHHHHHHHHHENLYFQSETQANST 29 10xHIS TAG-
TDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLLFQNAFAQQA TEV PS-IDS-
VCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYFKENGYVTM Rigid L-p97
SVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHA
NLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPH
IPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQA
LNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANST
IIAFTSDHGWALGEHGEWAKYSNFDVATHVPLIFYVPGRTASLPEAGEK
LFPYLDPFDSASQLMEPGRQSMDLVELVSLFPTLAGLAGLQVPPRCPVP
SFHVELCREGKNLLKHFRFRDLEEDPYLPGNPRELIAYSQYPRPSDIPQ
WNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFNPDEFLANFSDIHAGELY
FVDSDPLQDHNMYNDSQGGDLFQLLMPEAAAKEAAAKEAAAKGMEVRWC
ATSDPEQHKCGNMSEAFREAGIQPSLLCVRGTSADHCVQLIAAQEADAI
TLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSYYAVAVVRRSSHVTIDTL
KGVKSCHTGINRTVGWNVPVGYLVESGRLSVMGCDVLKAVSDYFGGSCV
PGAGETSYSESLCRLCRGDSSGEGVCDKSPLERYYDYSGAFRCLAEGAG
DVAFVKHSTVLENTDGKTLPSWGQALLSQDFELLCRDGSRADVTEWRQC
HLARVPAHAVVVRADTDGGLIFRLLNEGQRLFSHEGSSFQMFSSEAYGQ
KDLLFKDSTSELVPIATQTYEAWLGHEYLHAMKGLLCDPNRLPPYLRWC
VLSTPEIQKCGDMAVAFRRQRLKPEIQCVSAKSPQHCMERIQAEQVDAV
TLSGEDIYTAGKTYGLVPAAGEHYAPEDSSNSYYVVAVVRRDSSHAFTL
DELRGKRSCHAGFGSPAGWDVPVGALIQRGFIRPKDCDVLTAVSEFFNA
SCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNSQERYYGYRGAFRCLVE
NAGDVAFVRHTTVFDNTNGHNSEPWAAELRSEDYELLCPNGARAEVSQF
AACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQDLFGDDHNKNGFKMFDS
SNYHGQDLLFKDATVRAVPVGEKTTYRGWLGLDYVAALEGMSSQQCS P97 SP-
MRGPSGALWLLLALRTVLGHHHHHHHHHHENLYFQGMEVRWCATSDPEQ 30 10xHIS TAG-
HKCGNMSEAFREAGIQPSLLCVRGTSADHCVQLIAAQEADAITLDGGAI TEV PS-p97-
YEAGKEHGLKPVVGEVYDQEVGTSYYAVAVVRRSSHVTIDTLKGVKSCH Rigid L-IDS
TGINRTVGWNVPVGYLVESGRLSVMGCDVLKAVSDYFGGSCVPGAGETS
YSESLCRLCRGDSSGEGVCDKSPLERYYDYSGAFRCLAEGAGDVAFVKH
STVLENTDGKTLPSWGQALLSQDFELLCRDGSRADVTEWRQCHLARVPA
HAVVVRADTDGGLIFRLLNEGQRLFSHEGSSFQMFSSEAYGQKDLLFKD
STSELVPIATQTYEAWLGHEYLHAMKGLLCDPNRLPPYLRWCVLSTPEI
QKCGDMAVAFRRQRLKPEIQCVSAKSPQHCMERIQAEQVDAVTLSGEDI
YTAGKTYGLVPAAGEHYAPEDSSNSYYVVAVVRRDSSHAFTLDELRGKR
SCHAGFGSPAGWDVPVGALIQRGFIRPKDCDVLTAVSEFFNASCVPVNN
PKNYPSSLCALCVGDEQGRNKCVGNSQERYYGYRGAFRCLVENAGDVAF
VRHTTVFDNTNGHNSEPWAAELRSEDYELLCPNGARAEVSQFAACNLAQ
IPPHAVMVRPDTNIFTVYGLLDKAQDLFGDDHNKNGFKMFDSSNYHGQD
LLFKDATVRAVPVGEKTTYRGWLGLDYVAALEGMSSQQCSEAAAKEAAA
KEAAAKSETQANSTTDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLA
SHSLLFQNAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFS
TIPQYFKENGYVTMSVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYE
NTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTS
ASPFFLAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVAY
NPWMDIRQREDVQALNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGR
LLSALDDLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDVATHVPLIFY
VPGRTASLPEAGEKLFPYLDPFDSASQLMEPGRQSMDLVELVSLFPTLA
GLAGLQVPPRCPVPSFHVELCREGKNLLKHFRFRDLEEDPYLPGNPREL
IAYSQYPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFNPDE
FLANFSDIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMP I2S-MTf
MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGG 138 (SP: Flag
GGENLYFQGSETQANSTTDALNVLLIIVDDLRPSLGCYGDKLVRSPNID TAG and
QLASHSLLFQNAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAG 10xHIS TAG:
NFSTIPQYFKENGYVTMSVGKVFHPGISSNHTDDSPYSWSFPPYHPSSE TEV PS: IDS:
KYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKM Rigid L:
KTSASPFFLAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPEVPDGLPP Soluble p97)
VAYNPWMDIRQREDVQALNISVPYGPIPVDFQRKIRQSYFASVSYLDTQ
VGRLLSALDDLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDVATHVPL
IFYVPGRTASLPEAGEKLFPYLDPFDSASQLMEPGRQSMDLVELVSLFP
TLAGLAGLQVPPRCPVPSFHVELCREGKNLLKHFRFRDLEEDPYLPGNP
RELIAYSQYPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFN
PDEFLANFSDIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMPEAAAK
EAAAKEAAAKGMEVRWCATSDPEQHKCGNMSEAFREAGIQPSLLCVRGT
SADHCVQLIAAQEADAITLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSY
YAVAVVRRSSHVTIDTLKGVKSCHTGINRTVGWNVPVGYLVESGRLSVM
GCDVLKAVSDYFGGSCVPGAGETSYSESLCRLCRGDSSGEGVCDKSPLE
RYYDYSGAFRCLAEGAGDVAFVKHSTVLENTDGKTLPSWGQALLSQDFE
LLCRDGSRADVTEWRQCHLARVPAHAVVVRADTDGGLIFRLLNEGQRLF
SHEGSSFQMFSSEAYGQKDLLFKDSTSELVPIATQTYEAWLGHEYLHAM
KGLLCDPNRLPPYLRWCVLSTPEIQKCGDMAVAFRRQRLKPEIQCVSAK
SPQHCMERIQAEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDSSNS
YYVVAVVRRDSSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFI
RPKDCDVLTAVSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGN
SQERYYGYRGAFRCLVENAGDVAFVRHTTVFDNTNGHNSEPWAAELRSE
DYELLCPNGARAEVSQFAACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQ
DLFGDDHNKNGFKMFDSSNYHGQDLLFKDATVRAVPVGEKTTYRGWLGL DYVAALEGMSSQQCS
MTf-I2S MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGG 139 (SP:
Flag GGENLYFQGGMEVRWCATSDPEQHKCGNMSEAFREAGIQPSLLCVRGTS TAG and
ADHCVQLIAAQEADAITLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSYY 10xHIS TAG:
AVAVVRRSSHVTIDTLKGVKSCHTGINRTVGWNVPVGYLVESGRLSVMG TEV PS:
CDVLKAVSDYFGGSCVPGAGETSYSESLCRLCRGDSSGEGVCDKSPLER Soluble p97:
YYDYSGAFRCLAEGAGDVAFVKHSTVLENTDGKTLPSWGQALLSQDFEL Rigid L:
LCRDGSRADVTEWRQCHLARVPAHAVVVRADTDGGLIFRLLNEGQRLFS IDS)
HEGSSFQMFSSEAYGQKDLLFKDSTSELVPIATQTYEAWLGHEYLHAMK
GLLCDPNRLPPYLRWCVLSTPEIQKCGDMAVAFRRQRLKPEIQCVSAKS
PQHCMERIQAEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDSSNSY
YVVAVVRRDSSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFIR
PKDCDVLTAVSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNS
QERYYGYRGAFRCLVENAGDVAFVRHTTVFDNTNGHNSEPWAAELRSED
YELLCPNGARAEVSQFAACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQD
LFGDDHNKNGFKMFDSSNYHGQDLLFKDATVRAVPVGEKTTYRGWLGLD
YVAALEGNSSQQCSEAAAKEAAAKEAAAKSETQANSTTDALNVLLIIVD
DLRPSLGCYGDKLVRSPNIDQLASHSLLFQNAFAQQAVCAPSRVSFLTG
RRPDTTRLYDFNSYWRVHAGNFSTIPQYFKENGYVTMSVGKVFHPGISS
NHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVP
EGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFRYPKEFQKL
YPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNISVPYGPIPV
DFQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANSTIIAFTSDHGWAL
GEHGEWAKYSNFDVATHVPLIFYVPGRTASLPEAGEKLFPYLDPFDSAS
QLMEPGRQSMDLVELVSLFPTLAGLAGLQVPPRCPVPSFHVELCREGKN
LLKHFRFRDLEEDPYLPGNPRELIAYSQYPRPSDIPQWNSDKPSLKDIK
IMGYSIRTIDYRYTVWVGFNPDEFLANFSDIHAGELYFVDSDPLQDHNM YNDSQGGDLFQLLMP
MTfpep-I2S MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGG 140
(SP: Flag GGENLYFQGDSSHAFTLDELRYEAAAKEAAAKEAAAKSETQANSTTDAL TAG and
NVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLLFQNAFAQQAVCAP 10xHIS TAG:
SRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYFKENGYVTMSVGK TEV PS:
VFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLLC MTfpep w/C-
PVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFR terminal Y:
YPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNIS Rigid L:
VPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANSTIIAF I2S)
TSDHGWALGEHGEWAKYSNFDVATHVPLIFYVPGRTASLPEAGEKLFPY
LDPFDSASQLMEPGRQSMDLVELVSLFPTLAGLAGLQVPPRCPVPSFHV
ELCREGKNLLKHFRFRDLEEDPYLPGNPRELIAYSQYPRPSDIPQWNSD
KPSLKDIKIMGYSIRTIDYRYTVWVGFNPDEFLANFSDIHAGELYFVDS
DPLQDHNMYNDSQGGDLFQLLMP I2S-MTfpep
MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGG 141 (SP: Flag
GGENLYFQGSETQANSTTDALNVLLIIVDDLRPSLGCYGDKLVRSPNID TAG and
QLASHSLLFQNAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAG 10xHIS TAG:
NFSTIPQYFKENGYVTMSVGKVFHPGISSNHTDDSPYSWSFPPYHPSSE TEV PS: I2S:
KYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKM Rigid L:
KTSASPFFLAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPEVPDGLPP MTfpep w/C-
VAYNPWMDIRQREDVQALNISVPYGPIPVDFQRKIRQSYFASVSYLDTQ terminal Y)
VGRLLSALDDLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDVATHVPL
IFYVPGRTASLPEAGEKLFPYLDPFDSASQLMEPGRQSMDLVELVSLFP
TLAGLAGLQVPPRCPVPSFHVELCREGKNLLKHFRFRDLEEDPYLPGNP
RELIAYSQYPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFN
PDEFLANFSDIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMPEAAAK
EAAAKEAAAKDSSHAFTLDELRY I2S-MTfpep
MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGG 142 (without
GGENLYFQGTDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLL propep of
FQNAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQY I2S)
FKENGYVTMSVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTC SP: Flag
RGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFF TAG and
LAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMD 10xHIS TAG:
IRQREDVQALNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSAL TEV PS: I2S
DDLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDVATHVPLIFYVPGRT w/o propep:
ASLPEAGEKLFPYLDPFDSASQLMEPGRQSMDLVELVSLFPTLAGLAGL Rigid L:
QVPPRCPVPSFHVELCREGKNLLKHFRFRDLEEDPYLPGNPRELIAYSQ MTfpep w/C-
YPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFNPDEFLANF terminal Y)
SDIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMPEAAAKEAAAKEAA
AKDSSHAFTLDELRY
[0095] Thus, in some embodiments, the fusion protein comprises,
consists, or consists essentially of an amino acid sequence from
Table 1, or a variant and/or fragment thereof.
[0096] p97 Sequences.
[0097] In certain embodiments, a p97 polypeptide sequence used in a
composition and/or fusion protein of the invention comprises,
consists essentially of, or consists of a human p97 reference
sequence provided in Table 2 below. Also included are variants and
fragments thereof.
TABLE-US-00002 TABLE 2 Exemplary p97 Sequences SEQ ID Description
Sequence NO: FL Human p97
MRGPSGALWLLLALRTVLGGMEVRWCATSDPEQHKCGNMSEAFREAGIQ 1
PSLLCVRGTSADHCVQLIAAQEADAITLDGGAIYEAGKEHGLKPVVGEV
YDQEVGTSYYAVAVVRRSSHVTIDTLKGVKSCHTGINRTVGWNVPVGYL
VESGRLSVMGCDVLKAVSDYFGGSCVPGAGETSYSESLCRLCRGDSSGE
GVCDKSPLERYYDYSGAFRCLAEGAGDVAFVKHSTVLENTDGKTLPSWG
QALLSQDFELLCRDGSRADVTEWRQCHLARVPAHAVVVRADTDGGLIFR
LLNEGQRLFSHEGSSFQMFSSEAYGQKDLLFKDSTSELVPIATQTYEAW
LGHEYLHAMKGLLCDPNRLPPYLRWCVLSTPEIQKCGDMAVAFRRQRLK
PEIQCVSAKSPQHCMERIQAEQVDAVTLSGEDIYTAGKTYGLVPAAGEH
YAPEDSSNSYYVVAVVRRDSSHAFTLDELRGKRSCHAGFGSPAGWDVPV
GALIQRGFIRPKDCDVLTAVSEFFNASCVPVNNPKNYPSSLCALCVGDE
QGRNKCVGNSQERYYGYRGAFRCLVENAGDVAFVRHTTVFDNTNGHNSE
PWAAELRSEDYELLCPNGARAEVSQFAACNLAQIPPHAVMVRPDTNIFT
VYGLLDKAQDLFGDDHNKNGFKMFDSSNYHGQDLLFKDATVRAVPVGEK
TTYRGWLGLDYVAALEGMSSQQCSGAAAPAPGAPLLPLLLPALAARLLP PAL Soluble
GMEVRWCATSDPEQHKCGNMSEAFREAGIQPSLLCVRGTSADHCVQLIA 2 Human p97
AQEADAITLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSYYAVAVVRRSS
HVTIDTLKGVKSCHTGINRTVGWNVPVGYLVESGRLSVMGCDVLKAVSD
YFGGSCVPGAGETSYSESLCRLCRGDSSGEGVCDKSPLERYYDYSGAFR
CLAEGAGDVAFVKHSTVLENTDGKTLPSWGQALLSQDFELLCRDGSRAD
VTEWRQCHLARVPAHAVVVRADTDGGLIFRLLNEGQRLFSHEGSSFQMF
SSEAYGQKDLLFKDSTSELVPIATQTYEAWLGHEYLHAMKGLLCDPNRL
PPYLRWCVLSTPEIQKCGDMAVAFRRQRLKPEIQCVSAKSPQHCMERIQ
AEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDSSNSYYVVAVVRRD
SSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFIRPKDCDVLTA
VSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNSQERYYGYRG
AFRCLVENAGDVAFVRHTTVFDNTNGHNSEPWAAELRSEDYELLCPNGA
RAEVSQFAACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQDLFGDDHNKN
GFKMFDSSNYHGQDLLFKDATVRAVPVGEKTTYRGWLGLDYVAALEGMS SQQCS P97
fragment WCATSDPEQHK 3 P97 fragment RSSHVTIDTLK 4 P97 fragment
SSHVTIDTLKGVK 5 P97 fragment LCRGDSSGEGVCDK 6 P97 fragment
GDSSGEGVCDKSPLER 7 P97 fragment YYDYSGAFR 8 P97 fragment ADVTEWR 9
P97 fragment VPAHAVVVR 10 P97 fragment ADTDGGLIFR 11 P97 fragment
CGDMAVAFR 12 P97 fragment LKPEIQCVSAK 13 P97 fragment DSSHAFTLDELR
14 P97 fragment 14 148 P97 fragment SEDYELLCPNGAR 15 P97 fragment
AQDLFGDDHNKNGFK 16 P97 fragment
FSSEAYGQKDLLFKDSTSELVPIATQTYEAWLGHEYLHAM 17 P97 fragment
ERIQAEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDSSNSYYVVAV 18
VRRDSSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFIRPKDCD
VLTAVSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNSQERYY
GYRGAFRCLVENAGDVAFVRHTTVFDNTNGHNSEPWAAELRSEDYELLC
PNGARAEVSQFAACNLAQIPPHAVM P97 fragment
VRPDTNIFTVYGLLDKAQDLFGDDHNKNGFKM 19 P97 fragment
GMEVRWCATSDPEQHKCGNMSEAFREAGIQPSLLCVRGTSADHCVQLIA 20
AQEADAITLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSYYAVAVVRRSS
HVTIDTLKGVKSCHTGINRTVGWNVPVGYLVESGRLSVMGCDVLKAVSD
YFGGSCVPGAGETSYSESLCRLCRGDSSGEGVCDKSPLERYYDYSGAFR
CLAEGAGDVAFVKHSTVLENTDGKTLPSWGQALLSQDFELLCRDGSRAD
VTEWRQCHLARVPAHAVVVRADTDGGLIFRLLNEGQRLFSHEGSSFQMF
SSEAYGQKDLLFKDSTSELVPIATQTYEAWLGHEYLHAMKGLLCDPNRL
PPYLRWCVLSTPEIQKCGDMAVAFRRQRLKPEIQCVSAKSPQHCMERIQ
AEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDSSNSYYVVAVVRRD
SSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFIRPKDCDVLTA
VSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNSQERYYGYRG
AFRCLVENAGDVAFVRHTTVFDNTN P97 fragment GHNSEPWAAELRSEDYELLCPN 21
P97 fragment GARAEVSQFAACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQDLFGDDHN 22
KN P97 fragment GFKMFDSSNYHGQDLLFKDATVRAVPVGEKTTYRGWLGLDYVAALEGMS
23 SQQC P97 fragment
GMEVRWCATSDPEQHKCGNMSEAFREAGIQPSLLCVRGTSADHCVQLIA 24
AQEADAITLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSYYAVAVVRRSS
HVTIDTLKGVKSCHTGINRTVGWNVPVGYLVESGRLSVMGCDVLKAVSD
YFGGSCVPGAGETSYSESLCRLCRGDSSGEGVCDKSPLERYYDYSGAFR
CLAEGAGDVAFVKHSTVLENTDGKTLPSWGQALLSQDFELLCRDGSRAD
VTEWRQCHLARVPAHAVVVRADTDGGLIFRLLNEGQRLFSHEGSSFQMF
SSEAYGQKDLLFKDSTSELVPIATQTYEAWLGHEYLHAMKGLLCDPNRL
PPYLRWCVLSTPEIQKCGDMAVAFRRQRLKPEIQCVSAKSPQHCMERIQ
AEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDSSNSYYVVAVVRRD
SSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFIRPKDCDVLTA
VSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNSQERYYGYRG
AFRCLVENAGDVAFVRHTTVFDNTNGHNSEPWAAELRSEDYELLCPN P97 fragment
GMEVRWCATSDPEQHKCGNMSEAFREAGIQPSLLCVRGTSADHCVQLIA 25
AQEADAITLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSYYAVAVVRRSS
HVTIDTLKGVKSCHTGINRTVGWNVPVGYLVESGRLSVMGCDVLKAVSD
YFGGSCVPGAGETSYSESLCRLCRGDSSGEGVCDKSPLERYYDYSGAFR
CLAEGAGDVAFVKHSTVLENTDGKTLPSWGQALLSQDFELLCRDGSRAD
VTEWRQCHLARVPAHAVVVRADTDGGLIFRLLNEGQRLFSHEGSSFQMF
SSEAYGQKDLLFKDSTSELVPIATQTYEAWLGHEYLHAMKGLLCDPNRL
PPYLRWCVLSTPEIQKCGDMAVAFRRQRLKPEIQCVSAKSPQHCMERIQ
AEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDSSNSYYVVAVVRRD
SSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFIRPKDCDVLTA
VSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNSQERYYGYRG
AFRCLVENAGDVAFVRHTTVFDNTNGHNSEPWAAELRSEDYELLCPNGA
RAEVSQFAACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQDLFGDDHNKN P97 fragment
GHNSEPWAAELRSEDYELLCPNGARAEVSQFAACNLAQIPPHAVMVRPD 26
TNIFTVYGLLDKAQDLFGDDHNKN P97 fragment
GHNSEPWAAELRSEDYELLCPNGARAEVSQFAACNLAQIPPHAVMVRPD 27
TNIFTVYGLLDKAQDLFGDDHNKNGFKMFDSSNYHGQDLLFKDATVRAV
PVGEKTTYRGWLGLDYVAALEGMSSQQC P97 fragment
GARAEVSQFAACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQDLFGDDHN 28
KNGFKMFDSSNYHGQDLLFKDATVRAVPVGEKTTYRGWLGLDYVAALEG MSSQQC
[0098] In some embodiments, a p97 polypeptide sequence comprises a
sequence having at least 70%, 75%, 80%, 85%, 90%, 95%, 96%, 97%,
98%, or 99% identity or homology, along its length, to a human p97
sequence in Table 2, or a fragment thereof.
[0099] In specific embodiments, the p97 polypeptide sequence
comprises, consists, or consists essentially of SEQ ID NO:2
(soluble MTf) or SEQ ID NO:14 (MTfpep). In some embodiments, the
MTfpep has a C-terminal tyrosine (Y) residue, as set forth in SEQ
ID NO:148.
[0100] In particular embodiments, a p97 polypeptide sequence
comprises a fragment of a human p97 sequence in Table 2. In certain
embodiments, a p97 polypeptide fragment is about, at least about,
or up to about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35,
36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52,
53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69,
70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86,
87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 100, 105, 110, 115,
120, 125, 130, 135, 140, 145, 150, 155, 160, 165, 170, 175, 180,
185, 190, 195, 200, 210, 220, 230, 240, 250, 260, 270, 280, 290,
300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410, 420,
430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540, 550,
560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670, 680,
690, 700, 710, 720, 730 or more amino acids in length,
including all integers and ranges in between, and which may
comprise all or a portion of the sequence of a p97 reference
sequence.
[0101] In certain embodiments, a p97 polypeptide fragment is about
5-700, 5-600, 5-500, 5-400, 5-300, 5-200, 5-100, 5-50, 5-40, 5-30,
5-25, 5-20, 5-15, 5-10, 10-700, 10-600, 10-500, 10-400, 10-300,
10-200, 10-100, 10-50, 10-40, 10-30, 10-25, 10-20, 10-15, 20-700,
20-600, 20-500, 20-400, 20-300, 20-200, 20-100, 20-50, 20-40,
20-30, 20-25, 30-700, 30-600, 30-500, 30-400, 30-300, 30-200,
30-100, 30-50, 30-40, 40-700, 40-600, 40-500, 40-400, 40-300,
40-200, 40-100, 40-50, 50-700, 50-600, 50-500, 50-400, 50-300,
50-200, 50-100, 60-700, 60-600, 60-500, 60-400, 60-300, 60-200,
60-100, 60-70, 70-700, 70-600, 70-500, 70-400, 70-300, 70-200,
70-100, 70-80, 80-700, 80-600, 80-500, 80-400, 80-300, 80-200,
80-100, 80-90, 90-700, 90-600, 90-500, 90-400, 90-300, 90-200,
90-100, 100-700, 100-600, 100-500, 100-400, 100-300, 100-250,
100-200, 100-150, 200-700, 200-600, 200-500, 200-400, 200-300, or
200-250 amino acids in length, and comprises all or a portion of a
p97 reference sequence.
[0102] In certain embodiments, p97 polypeptide sequences of
interest include p97 amino acid sequences, subsequences, and/or
variants of p97 that are effective for transporting an agent of
interest across the blood brain barrier and into the central
nervous system (CNS). In particular embodiments, the variant or
fragment comprises the N-lobe of human p97 (residues 20-361 of SEQ
ID NO:1). In specific aspects, the variant or fragment comprises an
intact and functional Fe³⁺-binding site.
[0103] In some embodiments, a p97 polypeptide sequence is a soluble
form of a p97 polypeptide (see Yang et al., Prot Exp Purif.
34:28-48, 2004), or a fragment or variant thereof. In some aspects,
the soluble p97 polypeptide has a deletion of all or a portion
of the hydrophobic domain (residues 710-738 of SEQ ID NO:1), alone
or in combination with a deletion of all or a portion of the signal
peptide (residues 1-19 of SEQ ID NO:1). In specific aspects, the
soluble p97 polypeptide comprises or consists of SEQ ID NO:2
(residues 20-710 or 20-711 of SEQ ID NO:1), including variants and
fragments thereof.
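Residue ranges such as "residues 1-19 of SEQ ID NO:1" are 1-based and inclusive, which differs from the 0-based, end-exclusive slicing used by most programming languages. The sketch below shows the conversion on the first 30 residues of SEQ ID NO:1 as listed in Table 2; the helper name is hypothetical, and only this N-terminal excerpt is used so that the example stays short.

```python
def residues(sequence: str, start: int, end: int) -> str:
    """Return residues start..end (1-based, inclusive), as numbered in the text."""
    if not 1 <= start <= end <= len(sequence):
        raise ValueError("residue range outside sequence")
    return sequence[start - 1:end]


# First 30 residues of SEQ ID NO:1 (full-length human p97), from Table 2.
P97_N_TERM = "MRGPSGALWLLLALRTVLGGMEVRWCATSD"

signal_peptide = residues(P97_N_TERM, 1, 19)   # residues 1-19
mature_start = residues(P97_N_TERM, 20, 30)    # begins at residue 20
```

The excerpt beginning at residue 20 starts with GMEVRW, matching the first residues of the soluble p97 sequence (SEQ ID NO:2) in Table 2.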
[0104] In certain embodiments, for instance, those that employ
liposomes, the p97 polypeptide sequence is a lipid soluble form of
a p97 polypeptide. For instance, certain of these and related
embodiments include a p97 polypeptide that comprises all or a
portion of the hydrophobic domain, optionally with or without the
signal peptide.
[0105] In certain other embodiments, the p97 fragment or variant is
capable of specifically binding to a p97 receptor, an LRP1 receptor
and/or an LRP1B receptor.
[0106] Variants and fragments of reference p97 polypeptides and
other reference polypeptides are described in greater detail
below.
[0107] Iduronate-2-Sulfatase Sequences.
[0108] In certain embodiments, an IDS (or I2S) polypeptide sequence
used in a fusion protein of the invention comprises, consists
essentially of, or consists of one or more human IDS sequences
illustrated in Table 3 below.
TABLE-US-00003 TABLE 3 Exemplary IDS Sequences SEQ ID Name Sequence
NO: Full-length MPPPRTGRGLLWLGLVLSSVCVALGSETQANSTTDALNVLLIIVDDLRP
31 human IDS SLGCYGDKLVRSPNIDQLASHSLLFQNAFAQQAVCAPSRVSFLTGRRPD
(signal TTRLYDFNSYWRVHAGNFSTIPQYFKENGYVTMSVGKVFHPGISSNHTD sequence
DSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVPEGTL underlined)
PDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFRYPKEFQKLYPLE
NITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNISVPYGPIPVDFQR
KIRQSYFASVSYLDTQVGRLLSALDDLQLANSTIIAFTSDHGWALGEHG
EWAKYSNFDVATHVPLIFYVPGRTASLPEAGEKLFPYLDPFDSASQLME
PGRQSMDLVELVSLFPTLAGLAGLQVPPRCPVPSFHVELCREGKNLLKH
FRFRDLEEDPYLPGNPRELIAYSQYPRPSDIPQWNSDKPSLKDIKIMGY
SIRTIDYRYTVWVGFNPDEFLANFSDIHAGELYFVDSDPLQDHNMYNDS QGGDLFQLLMP Human
IDS SETQANSTTDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLLF 32 with
QNAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYF propeptide
KENGYVTMSVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCR sequence
GPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFL (underlined)
AVGYHKPHIPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMDI but without
RQREDVQALNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSALD signal
DLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDVATHVPLIFYVPGRTA sequence
SLPEAGEKLFPYLDPFDSASQLMEPGRQSMDLVELVSLFPTLAGLAGLQ
VPPRCPVPSFHVELCREGKNLLKHFRFRDLEEDPYLPGNPRELIAYSQY
PRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFNPDEFLANFS
DIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMP Human IDS
TDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLLFQNAFAQQA 33 without
VCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYFKENGYVTM propeptide
SVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHA or signal
NLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPH sequence
IPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQA
LNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANST
IIAFTSDHGWALGEHGEWAKYSNFDVATHVPLIFYVPGRTASLPEAGEK
LFPYLDPFDSASQLMEPGRQSMDLVELVSLFPTLAGLAGLQVPPRCPVP
SFHVELCREGKNLLKHFRFRDLEEDPYLPGNPRELIAYSQYPRPSDIPQ
WNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFNPDEFLANFSDIHAGELY
FVDSDPLQDHNMYNDSQGGDLFQLLMP Human IDS 42
TDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLLFQNAFAQQA 34 kDa chain
VCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYFKENGYVTM
SVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHA
NLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPH
IPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQA
LNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANST
IIAFTSDHGWALGEHGEWAKYSNFDVATHVPLIFYVPGRTASLPEAGEK
LFPYLDPFDSASQLMEPGRQSMDLVELVSLFPTLAGLAGLQVPPRCPVP
SFHVELCREGKNLLKHFRFRDLEEDPYLPG Human IDS 14
NPRELIAYSQYPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVG 35 kDa chain
FNPDEFLANFSDIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMP
[0109] Also included are biologically active variants and fragments
of the IDS sequences in Table 3 and the Sequence Listing. In
certain aspects, a biologically active IDS polypeptide or
variant/fragment thereof hydrolyzes the 2-sulfate groups of the
L-iduronate 2-sulfate units of dermatan sulfate, heparan sulfate,
and/or heparin, for example, at about 30%, 35%, 40%, 45%, 50%, 55%,
60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 100%, 200%, 300%, 400%,
500% or more of the activity of wild-type human IDS (e.g., SEQ ID
NO:31).
[0110] Linkers.
[0111] As noted above, certain fusion proteins may employ one or
more linker groups, including peptide linkers. Such linkers can be
rigid linkers, flexible linkers, stable linkers, or releasable
linkers, such as enzymatically-cleavable linkers. See, e.g., Chen
et al., Adv. Drug Deliv. Rev., 65:1357-69, 2012.
[0112] For instance, for polypeptide-polypeptide conjugates,
peptide linkers can separate the components by a distance
sufficient to ensure that each polypeptide folds into its secondary
and tertiary structures. Such a peptide linker sequence may be
incorporated into the fusion protein using standard techniques
described herein and well-known in the art. Suitable peptide linker
sequences may be chosen based on the following factors: (1) their
ability to adopt a rigid or flexible extended conformation; (2)
their inability to adopt a secondary structure that could interact
with functional epitopes on the first and second polypeptides; and
(3) the lack of hydrophobic or charged residues that might react
with the polypeptide functional epitopes. Amino acid sequences
which may be usefully employed as linkers include those disclosed
in Maratea et al., Gene 40:39-46, 1985; Murphy et al., Proc. Natl.
Acad. Sci. USA 83:8258-8262, 1986; U.S. Pat. Nos. 4,935,233 and
4,751,180.
[0113] In certain illustrative embodiments, a peptide linker is
between about 1 to 5 amino acids, between 5 to 10 amino acids,
between 5 to 25 amino acids, between 5 to 50 amino acids, between
10 to 25 amino acids, between 10 to 50 amino acids, between 10 to
100 amino acids, or any intervening range of amino acids. In other
illustrative embodiments, a peptide linker is about 1, 5,
10, 15, 20, 25, 30, 35, 40, 45, 50 or more amino acids in length.
Particular linkers can have an overall amino acid length of about
1-200 amino acids, 1-150 amino acids, 1-100 amino acids, 1-90 amino
acids, 1-80 amino acids, 1-70 amino acids, 1-60 amino acids, 1-50
amino acids, 1-40 amino acids, 1-30 amino acids, 1-20 amino acids,
1-10 amino acids, 1-5 amino acids, 1-4 amino acids, 1-3 amino
acids, or about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
50, 60, 70, 80, 90, 100 or more amino acids.
[0114] A peptide linker may employ any one or more
naturally-occurring amino acids, non-naturally occurring amino
acid(s), amino acid analogs, and/or amino acid mimetics as
described elsewhere herein and known in the art. Certain amino acid
sequences which may be usefully employed as linkers include those
disclosed in Maratea et al., Gene 40:39-46, 1985; Murphy et al.,
PNAS USA. 83:8258-8262, 1986; U.S. Pat. Nos. 4,935,233 and
4,751,180. Particular peptide linker sequences contain Gly, Ser,
and/or Asn residues. Other near neutral amino acids, such as Thr
and Ala may also be employed in the peptide linker sequence, if
desired.
[0115] In particular embodiments, the linker is a rigid linker.
Examples of rigid linkers include, without limitation,
(EAAAK)ₓ (SEQ ID NO:36), A(EAAAK)ₓALEA(EAAAK)ₓA
(SEQ ID NO:41), and (Ala-Pro)ₓ, where x is 1, 2, 3, 4, 5,
6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 or more.
Specific examples of rigid linkers include EAAAK (SEQ ID NO:36),
(EAAAK)₂ (SEQ ID NO:37), (EAAAK)₃ (SEQ ID NO:38),
A(EAAAK)₄ALEA(EAAAK)₄A (SEQ ID NO:42), PAPAP (SEQ ID
NO:43), and AEAAAKEAAAKA (SEQ ID NO:44).
[0116] In specific embodiments, the linker comprises, consists, or
consists essentially of (EAAAK)₃ or EAAAKEAAAKEAAAK (SEQ ID
NO:38).
[0117] In some embodiments, the linker is a flexible linker. In
particular embodiments, the flexible linker is GGGGS (SEQ ID
NO:45), (GGGGS)₂ (SEQ ID NO:46), (GGGGS)₃ (SEQ ID NO:47),
or Gly₂₋₁₀ (SEQ ID NOS:48-54). Additional examples of flexible
linkers are provided below.
[0118] Certain exemplary linkers include Gly, Ser and/or
Asn-containing linkers, as follows: [G]ₓ, [S]ₓ,
[N]ₓ, [GS]ₓ, [GGS]ₓ, [GSS]ₓ, [GSGS]ₓ (SEQ
ID NO:55), [GGSG]ₓ (SEQ ID NO:56), [GGGS]ₓ (SEQ ID NO:57),
[GGGGS]ₓ (SEQ ID NO:45), [GN]ₓ, [GGN]ₓ,
[GNN]ₓ, [GNGN]ₓ (SEQ ID NO:58), [GGNG]ₓ (SEQ ID NO:59),
[GGGN]ₓ (SEQ ID NO:60), and [GGGGN]ₓ (SEQ ID NO:61)
linkers, where x is 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 16, 17, 18, 19, or 20 or more. Other combinations of these
and related amino acids will be apparent to persons skilled in the
art. In specific embodiments, the linker comprises or consists of a
[GGGGS]₃ sequence, i.e., GGGGSGGGGSGGGGS (SEQ ID
NO:47).
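The repeat notation used for the rigid and flexible linkers above expands to a plain amino acid string by repeating the bracketed unit x times. A minimal sketch of that expansion follows; the function and variable names are hypothetical, and the expected strings are the ones spelled out in the text for (EAAAK)3 and [GGGGS]3.

```python
def repeat_linker(unit: str, x: int) -> str:
    """Expand a repeat-unit linker such as (EAAAK)x or [GGGGS]x."""
    if x < 1:
        raise ValueError("x must be at least 1")
    return unit * x


rigid_3 = repeat_linker("EAAAK", 3)      # (EAAAK)3, SEQ ID NO:38
flexible_3 = repeat_linker("GGGGS", 3)   # [GGGGS]3, SEQ ID NO:47

# A(EAAAK)4ALEA(EAAAK)4A, the longer rigid linker pattern with x = 4.
helical_4 = "A" + repeat_linker("EAAAK", 4) + "ALEA" + repeat_linker("EAAAK", 4) + "A"
```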
[0119] In specific embodiments, the linker sequence comprises a
Gly3 linker sequence, which includes three glycine residues. In
particular embodiments, flexible linkers can be rationally designed
using a computer program capable of modeling both DNA-binding sites
and the peptides themselves (Desjarlais & Berg, PNAS.
90:2256-2260, 1993; and PNAS. 91:11099-11103, 1994) or by phage
display methods.
[0120] The peptide linkers may be physiologically stable or may
include a releasable linker such as a physiologically degradable or
enzymatically degradable linker (e.g., proteolytically or
enzymatically-cleavable linker). In certain embodiments, one or
more releasable linkers can result in a shorter half-life and more
rapid clearance of the fusion protein. These and related
embodiments can be used, for example, to enhance the solubility and
blood circulation lifetime of p97 fusion proteins in the
bloodstream, while also delivering an agent into the bloodstream
(or across the BBB) that, subsequent to linker degradation, is
substantially free of the p97 sequence. These aspects are
especially useful in those cases where polypeptides or other
agents, when permanently fused to a p97 sequence, demonstrate
reduced activity. By using the linkers as provided herein, such
polypeptides can maintain their therapeutic activity when in
conjugated or fused form. In these and other ways, the properties
of the p97 fusion proteins can be more effectively tailored to
balance the bioactivity and circulating half-life of the
polypeptides over time.
[0121] Specific examples of enzymatically-cleavable linkers
include, without limitation, a Factor XIa/FVIIa cleavable linker
(VSQTSKLTR AETVFPDV) (SEQ ID NO:62), a matrix metalloprotease-1
cleavable linker (PLG LWA) (SEQ ID NO:63), an HIV protease
cleavable linker (RVL AEA) (SEQ ID NO:64), a hepatitis C virus NS3
protease cleavable linker (EDVVCC SMSY) (SEQ ID NO:65), a Factor Xa
cleavable linker (GGIEGR/GS) (SEQ ID NO:66), a Furin cleavable
linker (TRHRQPRY GWE or AGNRVRR SVG or RRRRRRR R R) (SEQ ID
NOS:67-69), and a Cathepsin B cleavable linker (GFLG) (SEQ ID
NO:70).
[0122] Enzymatically degradable linkages suitable for use in
particular embodiments include, but are not limited to: an amino
acid sequence cleaved by a serine protease such as thrombin,
chymotrypsin, trypsin, elastase, kallikrein, or subtilisin.
Illustrative examples of thrombin-cleavable amino acid sequences
include, but are not limited to: -Gly-Arg-Gly-Asp--(SEQ ID NO: 71),
-Gly-Gly-Arg-, -Gly-Arg-Gly-Asp-Asn-Pro--(SEQ ID NO:72),
-Gly-Arg-Gly-Asp-Ser--(SEQ ID NO: 73),
-Gly-Arg-Gly-Asp-Ser-Pro-Lys--(SEQ ID NO: 74), -Gly-Pro-Arg-,
-Val-Pro-Arg-, and -Phe-Val-Arg-. Illustrative examples of
elastase-cleavable amino acid sequences include, but are not
limited to: -Ala-Ala-Ala-, -Ala-Ala-Pro-Val--(SEQ ID NO:75),
-Ala-Ala-Pro-Leu--(SEQ ID NO: 76), -Ala-Ala-Pro-Phe--(SEQ ID NO:
77), -Ala-Ala-Pro-Ala--(SEQ ID NO: 78), and -Ala-Tyr-Leu-Val--(SEQ
ID NO: 79).
[0123] Enzymatically degradable linkages suitable for use in
particular embodiments also include amino acid sequences that can
be cleaved by a matrix metalloproteinase such as collagenase,
stromelysin, and gelatinase. Illustrative examples of matrix
metalloproteinase-cleavable amino acid sequences include, but are
not limited to: -Gly-Pro-Y-Gly-Pro-Z--(SEQ ID NO: 80), -Gly-Pro-,
Leu-Gly-Pro-Z--(SEQ ID NO: 81), -Gly-Pro-Ile-Gly-Pro-Z--(SEQ ID
NO:82), and -Ala-Pro-Gly-Leu-Z--(SEQ ID NO: 83), where Y and Z are
amino acids. Illustrative examples of collagenase-cleavable amino
acid sequences include, but are not limited to:
-Pro-Leu-Gly-Pro-D-Arg-Z--(SEQ ID NO: 84),
-Pro-Leu-Gly-Leu-Leu-Gly-Z--(SEQ ID NO: 85),
-Pro-Gln-Gly-Ile-Ala-Gly-Trp--(SEQ ID NO: 86),
-Pro-Leu-Gly-Cys(Me)-His--(SEQ ID NO: 87),
-Pro-Leu-Gly-Leu-Tyr-Ala--(SEQ ID NO:88),
-Pro-Leu-Ala-Leu-Trp-Ala-Arg--(SEQ ID NO: 89), and
-Pro-Leu-Ala-Tyr-Trp-Ala-Arg--(SEQ ID NO: 90), where Z is an amino
acid. An illustrative example of a stromelysin-cleavable amino acid
sequence is -Pro-Tyr-Ala-Tyr-Tyr-Met-Arg--(SEQ ID NO: 91); and an
example of a gelatinase-cleavable amino acid sequence is
-Pro-Leu-Gly-Met-Tyr-Ser-Arg--(SEQ ID NO: 92).
[0124] Enzymatically degradable linkages suitable for use in
particular embodiments also include amino acid sequences that can
be cleaved by an angiotensin converting enzyme, such as, for
example, -Asp-Lys-Pro-, -Gly-Asp-Lys-Pro--(SEQ ID NO: 93), and
-Gly-Ser-Asp-Lys-Pro--(SEQ ID NO: 94).
[0125] Enzymatically degradable linkages suitable for use in
particular embodiments also include amino acid sequences that can
be degraded by cathepsin B, such as, for example, -Val-Cit-,
-Ala-Leu-Ala-Leu--(SEQ ID NO:95), -Gly-Phe-Leu-Gly--(SEQ ID NO:96)
and -Phe-Lys-.
[0126] In certain embodiments, however, any one or more of the
non-peptide or peptide linkers are optional. For instance, linker
sequences may not be required in a fusion protein where the first
and second polypeptides have non-essential N-terminal and/or
C-terminal amino acid regions that can be used to separate the
functional domains and prevent steric interference.
[0127] Signal Peptide Sequences.
[0128] In certain embodiments, a p97 fusion protein comprises one
or more signal peptide sequences (SP). In particular embodiments,
the signal peptide sequence is an N-terminal signal sequence, i.e.,
the most N-terminal portion of the fusion protein.
[0129] Specific examples of signal sequences are provided in Table
4 below. See also Kober et al., Biotechnology and Bioengineering.
110:1164-73, 2013.
TABLE-US-00004 TABLE 4 Exemplary Signal Peptide Sequences (SP)
Protein | Signal Sequence | SEQ ID NO:
Human p97 | MRGPSGALWLLLALRTVLG | 39
Human IDS | MPPPRTGRGLLWLGLVLSSVCVALG | 40
Ig Heavy Chain | MEWSWVFLFFLSVTTGVHS | 149
Ig kappa light chain precursor | MDMRAPAGIFGFLLVLFPGYRS | 97
Serum albumin preprotein | MKWVTFISLLFLFSSAYS | 98
Ig heavy chain | MDWTWRVFCLLAVTPGAHP | 99
Ig light chain | MAWSPLFLTLITHCAGSWA | 100
Azurocidin preprotein | MTRLTVLALLAGLLASSRA | 101
Cystatin-S precursor | MARPLCTLLLLMATLAGALA | 102
Trypsinogen 2 precursor | MRSLVFVLLIGAAFA | 103
Potassium channel blocker | MSRLFVFILIALFLSAIIDVMS | 104
Alpha conotoxin | MGMRMMFIMFMLVVLATTVVS | 105
Alfa-galactosidase (mutant m3) | MRAFLFLTACISLPGVFG | 106
Cellulase | MKFQSTLLLAAAAGSALA | 107
Aspartic proteinase nepenthesin-1 | MASSLYSFLLALSIVYIFVAPTHS | 108
Acid chitinase | MKTHYSSAILPILTLFVFLSINPSHG | 109
K28 prepro-toxin | MESVSSLFNIFSTIMVNYKSLVLALLSVSNLKYARG | 110
Killer toxin zygocin precursor | MKAAQILTASIVSLLPIYTSA | 111
Cholera toxin | MIKLKFGVFFTVLLSSAYA | 112
[0130] Thus, in some embodiments, the signal peptide comprises,
consists of, or consists essentially of at least one sequence from
Table 4. In some embodiments, the signal peptide comprises SEQ ID
NO:149.
[0131] In specific embodiments, the signal peptide sequence
corresponds to the most N-terminal protein (p97 or IDS) of the
fusion protein. That is, in some embodiments the N-terminal signal
peptide sequence is the human p97 signal peptide sequence (SEQ ID
NO:39) and the p97 fusion protein comprises the general structure:
p97 SP-p97-IDS. In other embodiments, the N-terminal signal
sequence is the human IDS signal peptide sequence (SEQ ID NO:40)
and the p97 fusion protein comprises the general structure: IDS
SP-IDS-p97. Optionally, the fusion protein can further comprise one
or more purification tags and/or protease sites, for example,
between the N-terminal signal sequence and the p97/IDS portions of
the fusion protein, as described elsewhere herein. Here, the
protease site is typically placed at the C-terminus of the signal
sequence or purification tag so that treatment with the
corresponding protease removes the N-terminal signal sequence,
purification tag, and most or all of the protease site from the
fusion protein.
[0132] Purification Tags.
[0133] In some embodiments, the fusion protein comprises one or
more purification or affinity tags (TAG or TAGs). Non-limiting
examples of purification tags include poly-histidine tags (e.g.,
6.times.His tags), avidin, FLAG tags, glutathione S-transferase
(GST) tags, maltose-binding protein tags, chitin binding protein
(CBP), and others. Also included are epitope tags, which bind to
high-affinity antibodies, examples of which include V5-tags,
Myc-tags, and HA-tags. In specific examples, the purification tag
is a polyhistidine tag (H.sub.5-10), for example, H.sub.5, H.sub.6,
H.sub.7, H.sub.8, H.sub.9, or H.sub.10 (SEQ ID NOS:113-118).
[0134] Non-limiting examples of purification tags are provided in
Table 5 below.
TABLE-US-00005 TABLE 5 Exemplary Purification Tags (TAG)
Name | Sequence | SEQ ID NO:
5X-HIS | HHHHH | 113
6X-HIS | HHHHHH | 114
7X-HIS | HHHHHHH | 115
8X-HIS | HHHHHHHH | 116
9X-HIS | HHHHHHHHH | 117
10X-HIS | HHHHHHHHHH | 118
AviTag | GLNDIFEAQKIEWHE | 119
Calmodulin-tag | KRRWKKNFIAVSAANRFKKISSSGAL | 120
Polyglutamate tag | EEEEEE | 121
FLAG-tag | DYKDDDDK | 122
HA-tag | YPYDVPDYA | 123
MYC-tag | EQKLISEEDL | 124
S-tag | KETAAAKFERQHMDS | 125
SPB-tag | MDEKTTGWRGGHVVEGLAGELEQLRARLEHHPQGQREP | 126
Softag 1 | SLAELLNAGLGGS | 127
Softag 3 | TQDPSRVG | 128
V5 tag | GKPIPNPLLGLDST | 129
Xpress tag | DLYDDDDK | 130
[0135] Thus, in certain embodiments, the purification tag
comprises, consists of, or consists essentially of at least one
sequence from Table 5. In specific embodiments, the tag comprises a
FLAG tag and a HIS tag, for example, a 10.times.-HIS tag.
[0136] Protease Sites (PS).
[0137] In some embodiments, the fusion protein comprises one or
more protease sites. Optionally, the one or more protease sites are
positioned at the C-terminus of the purification tag and/or signal
peptide sequence (if either one or both are present) so that
treatment with the corresponding protease removes the N-terminal
signal sequence, purification tag, and/or most or all of the
protease site from the fusion protein.
[0138] In particular embodiments, for instance, where the fusion
protein comprises an enzymatically-cleavable linker, the protease
site typically differs from that of the enzymatically-cleavable
linker, so that treatment with the protease removes any terminal
sequences (e.g., signal peptide sequence, purification tag) without
cleaving the peptide linker between the p97 and IDS sequences.
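As a non-limiting illustration of the arrangement described above, the following Python sketch assembles an N-terminal leader (signal peptide, purification tags, protease site) in front of a placeholder body, checks that the protease site occurs only once in the construct, and models cleavage. The sequences are taken from Tables 4-6 and the linker examples; PAYLOAD is a hypothetical stand-in and is not a p97 or IDS sequence, the helper names are hypothetical, and cleavage between the Gln and Gly of ENLYFQG is an assumption of this sketch.

```python
SIGNAL_PEPTIDE = "MEWSWVFLFFLSVTTGVHS"  # Ig heavy chain SP, SEQ ID NO: 149
FLAG_TAG = "DYKDDDDK"                    # SEQ ID NO: 122
HIS10_TAG = "HHHHHHHHHH"                 # SEQ ID NO: 118
TEV_SITE = "ENLYFQG"                     # SEQ ID NO: 135
LINKER = "EAAAKEAAAKEAAAK"               # (EAAAK)3, SEQ ID NO: 38
# Hypothetical placeholder standing in for a p97-linker-IDS body.
PAYLOAD = "AAAA" + LINKER + "CCCC"


def assemble(signal_peptide, tags, protease_site, body):
    """Concatenate SP, TAG(s), PS, and body, N-terminus to C-terminus."""
    return signal_peptide + "".join(tags) + protease_site + body


def site_is_unique(construct, protease_site):
    """True if the protease site occurs exactly once (not also in the linker or tags)."""
    return construct.count(protease_site) == 1


def cleave_tev(construct):
    """Split after the Gln of ENLYFQG (assumed cleavage position)."""
    cut = construct.index(TEV_SITE) + len(TEV_SITE) - 1
    return construct[:cut], construct[cut:]


construct = assemble(SIGNAL_PEPTIDE, [FLAG_TAG, HIS10_TAG], TEV_SITE, PAYLOAD)
leader, product = cleave_tev(construct)
print(site_is_unique(construct, TEV_SITE), product)
```

In this sketch, pairing a FLAG tag with a DDDDK protease site would fail the uniqueness check, because the FLAG sequence DYKDDDDK itself contains DDDDK.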
[0139] Non-limiting examples of protease sites are provided in
Table 6 below.
TABLE-US-00006 TABLE 6 Exemplary Protease Sites (PS)
Protease | Sequence | SEQ ID NO:
Thrombin | LVPRGS | 131
Enteropeptidase | DDDDK | 132
Factor Xa | I(E/D)GR | 133
Enterokinase | DDDDK | 134
TEV Protease | ENLYFQG | 135
HRV 3C Protease | LEVLFQGP | 136
SUMO Protease (Ulp1) | GSLQDSEVNQEAKPEVKPEVKPETHINLKVSDGSSEIFFKIKKTTPLRRLMEAFAKRQGKEMDSLTFLYDGIEIQADQTPEDLDMEDNDIIEAHREQIGG | 137
Denotes site of cleavage
[0140] Thus, in certain embodiments, the protease site comprises,
consists of, or consists essentially of at least one sequence from
Table 6. In specific embodiments, the protease site comprises the
TEV protease site (SEQ ID NO:135).
[0141] Variant Sequences.
[0142] Certain embodiments include variants of the reference
polypeptide and polynucleotide sequences described herein, whether
described by name or by reference to a sequence identifier,
including p97 sequences, IDS sequences, linker sequences, signal
peptide sequences, purification tags, and protease sites (see,
e.g., Tables 1-6 and the Sequence Listing). The wild-type or most
prevalent sequences of these polypeptides are known in the art, and
can be used as a comparison for the variants and fragments
described herein.
[0143] A "variant" sequence, as the term is used herein, refers to
a polypeptide or polynucleotide sequence that differs from a
reference sequence disclosed herein by one or more substitutions,
deletions (e.g., truncations), additions, and/or insertions.
Certain variants thus include fragments of a reference sequence
described herein. Variant polypeptides are biologically active,
that is, they continue to possess the enzymatic or binding activity
of a reference polypeptide. Such variants may result from, for
example, genetic polymorphism and/or from human manipulation.
[0144] In many instances, a biologically active variant will
contain one or more conservative substitutions. A "conservative
substitution" is one in which an amino acid is substituted for
another amino acid that has similar properties, such that one
skilled in the art of peptide chemistry would expect the secondary
structure and hydropathic nature of the polypeptide to be
substantially unchanged. As described above, modifications may be
made in the structure of the polynucleotides and polypeptides of
the present invention and still obtain a functional molecule that
encodes a variant or derivative polypeptide with desirable
characteristics. When it is desired to alter the amino acid
sequence of a polypeptide to create an equivalent, or even an
improved, variant or portion of a polypeptide of the invention, one
skilled in the art will typically change one or more of the codons
of the encoding DNA sequence according to Table A below.
TABLE-US-00007 TABLE A Amino Acids and Codons
Amino Acid | Three-Letter Code | One-Letter Code | Codons
Alanine | Ala | A | GCA GCC GCG GCU
Cysteine | Cys | C | UGC UGU
Aspartic acid | Asp | D | GAC GAU
Glutamic acid | Glu | E | GAA GAG
Phenylalanine | Phe | F | UUC UUU
Glycine | Gly | G | GGA GGC GGG GGU
Histidine | His | H | CAC CAU
Isoleucine | Ile | I | AUA AUC AUU
Lysine | Lys | K | AAA AAG
Leucine | Leu | L | UUA UUG CUA CUC CUG CUU
Methionine | Met | M | AUG
Asparagine | Asn | N | AAC AAU
Proline | Pro | P | CCA CCC CCG CCU
Glutamine | Gln | Q | CAA CAG
Arginine | Arg | R | AGA AGG CGA CGC CGG CGU
Serine | Ser | S | AGC AGU UCA UCC UCG UCU
Threonine | Thr | T | ACA ACC ACG ACU
Valine | Val | V | GUA GUC GUG GUU
Tryptophan | Trp | W | UGG
Tyrosine | Tyr | Y | UAC UAU
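For illustration only, Table A can be expressed as a lookup structure. The Python sketch below transcribes the codon assignments of Table A and shows two simple uses: listing the synonymous codons for a given codon, and counting how many distinct coding sequences encode a given peptide. The function names are hypothetical and are not part of the disclosure.

```python
import math

# Codon assignments transcribed from Table A (RNA alphabet, sense codons only).
_TABLE_A = {
    "A": "GCA GCC GCG GCU", "C": "UGC UGU", "D": "GAC GAU", "E": "GAA GAG",
    "F": "UUC UUU", "G": "GGA GGC GGG GGU", "H": "CAC CAU", "I": "AUA AUC AUU",
    "K": "AAA AAG", "L": "UUA UUG CUA CUC CUG CUU", "M": "AUG", "N": "AAC AAU",
    "P": "CCA CCC CCG CCU", "Q": "CAA CAG", "R": "AGA AGG CGA CGC CGG CGU",
    "S": "AGC AGU UCA UCC UCG UCU", "T": "ACA ACC ACG ACU",
    "V": "GUA GUC GUG GUU", "W": "UGG", "Y": "UAC UAU",
}
CODONS = {aa: codons.split() for aa, codons in _TABLE_A.items()}


def synonymous_codons(codon):
    """Return (amino acid, other codons encoding the same amino acid)."""
    for aa, codons in CODONS.items():
        if codon in codons:
            return aa, [c for c in codons if c != codon]
    raise ValueError(f"{codon!r} is not a sense codon in Table A")


def count_encodings(peptide):
    """Number of distinct coding sequences that encode the peptide."""
    return math.prod(len(CODONS[aa]) for aa in peptide)


print(synonymous_codons("GAA"))
print(count_encodings("GGGGS"))
```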
[0145] For example, certain amino acids may be substituted for
other amino acids in a protein structure without appreciable loss
of interactive binding capacity with structures such as, for
example, antigen-binding regions of antibodies or binding sites on
substrate molecules. Since it is the interactive capacity and
nature of a protein that defines that protein's biological
functional activity, certain amino acid sequence substitutions can
be made in a protein sequence, and, of course, its underlying DNA
coding sequence, and nevertheless obtain a protein with like
properties. It is thus contemplated that various changes may be
made in the peptide sequences of the disclosed compositions, or
corresponding DNA sequences which encode said peptides without
appreciable loss of their utility.
[0146] In making such changes, the hydropathic index of amino acids
may be considered. The importance of the hydropathic amino acid
index in conferring interactive biologic function on a protein is
generally understood in the art (Kyte & Doolittle, 1982,
incorporated herein by reference). It is accepted that the relative
hydropathic character of the amino acid contributes to the
secondary structure of the resultant protein, which in turn defines
the interaction of the protein with other molecules, for example,
enzymes, substrates, receptors, DNA, antibodies, antigens, and the
like. Each amino acid has been assigned a hydropathic index on the
basis of its hydrophobicity and charge characteristics (Kyte &
Doolittle, 1982). These values are: isoleucine (+4.5); valine
(+4.2); leucine (+3.8); phenylalanine (+2.8); cysteine (+2.5);
methionine (+1.9); alanine (+1.8); glycine (-0.4); threonine
(-0.7); serine (-0.8); tryptophan (-0.9); tyrosine (-1.3); proline
(-1.6); histidine (-3.2); glutamate (-3.5); glutamine (-3.5);
aspartate (-3.5); asparagine (-3.5); lysine (-3.9); and arginine
(-4.5). It is known in the art that certain amino acids may be
substituted by other amino acids having a similar hydropathic index
or score and still result in a protein with similar biological
activity, i.e., still obtain a biologically functionally equivalent
protein. In making such changes, the substitution of amino acids
whose hydropathic indices are within .+-.2 is preferred, those
within .+-.1 are particularly preferred, and those within .+-.0.5
are even more particularly preferred.
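By way of example only, the hydropathic index comparison described above can be sketched in Python as follows. The index values are those recited in the preceding paragraph; the function name and the returned tier labels are hypothetical conveniences for this illustration.

```python
# Hydropathic indices as recited above (Kyte & Doolittle, 1982).
HYDROPATHY = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}


def hydropathy_tier(original, replacement):
    """Classify a substitution by the difference in hydropathic index.

    Returns "within 0.5", "within 1", "within 2", or None if the
    difference exceeds 2. The difference is rounded to one decimal
    place to avoid floating-point edge effects at the thresholds.
    """
    diff = round(abs(HYDROPATHY[original] - HYDROPATHY[replacement]), 1)
    for limit, label in ((0.5, "within 0.5"), (1.0, "within 1"), (2.0, "within 2")):
        if diff <= limit:
            return label
    return None


print(hydropathy_tier("I", "V"), hydropathy_tier("K", "R"), hydropathy_tier("I", "R"))
```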
[0147] It is also understood in the art that the substitution of
like amino acids can be made effectively on the basis of
hydrophilicity. U.S. Pat. No. 4,554,101 (specifically incorporated
herein by reference in its entirety), states that the greatest
local average hydrophilicity of a protein, as governed by the
hydrophilicity of its adjacent amino acids, correlates with a
biological property of the protein. As detailed in U.S. Pat. No.
4,554,101, the following hydrophilicity values have been assigned
to amino acid residues: arginine (+3.0); lysine (+3.0); aspartate
(+3.0.+-.1); glutamate (+3.0.+-.1); serine (+0.3); asparagine
(+0.2); glutamine (+0.2); glycine (0); threonine (-0.4); proline
(-0.5.+-.1); alanine (-0.5); histidine (-0.5); cysteine (-1.0);
methionine (-1.3); valine (-1.5); leucine (-1.8); isoleucine
(-1.8); tyrosine (-2.3); phenylalanine (-2.5); tryptophan (-3.4).
It is understood that an amino acid can be substituted for another
having a similar hydrophilicity value and still obtain a
biologically equivalent, and in particular, an immunologically
equivalent protein. In such changes, the substitution of amino
acids whose hydrophilicity values are within .+-.2 is preferred,
those within .+-.1 are particularly preferred, and those within
.+-.0.5 are even more particularly preferred.
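As a further illustration, the "greatest local average hydrophilicity" referred to above can be approximated by a sliding-window average over the recited hydrophilicity values. The Python sketch below makes several assumptions that are not taken from the disclosure: a window of six residues, use of the central value for residues recited with a tolerance (aspartate, glutamate, proline), and a hypothetical function name.

```python
# Hydrophilicity values as recited above; central values are used where a
# tolerance is recited (an assumption made for this sketch).
HYDROPHILICITY = {
    "R": 3.0, "K": 3.0, "D": 3.0, "E": 3.0, "S": 0.3, "N": 0.2, "Q": 0.2,
    "G": 0.0, "T": -0.4, "P": -0.5, "A": -0.5, "H": -0.5, "C": -1.0,
    "M": -1.3, "V": -1.5, "L": -1.8, "I": -1.8, "Y": -2.3, "F": -2.5,
    "W": -3.4,
}


def max_local_hydrophilicity(sequence, window=6):
    """Return (highest window average, 0-based start index of that window)."""
    if len(sequence) < window:
        raise ValueError("sequence is shorter than the window")
    best = None
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        average = sum(HYDROPHILICITY[aa] for aa in segment) / window
        if best is None or average > best[0]:
            best = (average, start)
    return best


# FLAG-tag (SEQ ID NO: 122) used as a short example input.
print(max_local_hydrophilicity("DYKDDDDK"))
```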
[0148] As outlined above, amino acid substitutions are generally
therefore based on the relative similarity of the amino acid
side-chain substituents, for example, their hydrophobicity,
hydrophilicity, charge, size, and the like. Exemplary substitutions
that take various of the foregoing characteristics into
consideration are well known to those of skill in the art and
include: arginine and lysine; glutamate and aspartate; serine and
threonine; glutamine and asparagine; and valine, leucine and
isoleucine.
[0149] Amino acid substitutions may further be made on the basis of
similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity and/or the amphipathic nature of the residues. For
example, negatively charged amino acids include aspartic acid and
glutamic acid; positively charged amino acids include lysine and
arginine; and amino acids with uncharged polar head groups having
similar hydrophilicity values include leucine, isoleucine and
valine; glycine and alanine; asparagine and glutamine; and serine,
threonine, phenylalanine and tyrosine. Other groups of amino acids
that may represent conservative changes include: (1) ala, pro, gly,
glu, asp, gln, asn, ser, thr; (2) cys, ser, tyr, thr; (3) val, ile,
leu, met, ala, phe; (4) lys, arg, his; and (5) phe, tyr, trp,
his.
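For illustration, the five groups enumerated above can be encoded as sets and used to test whether a proposed substitution falls within at least one group. This Python sketch is a simplification; the function name is hypothetical, and membership in a shared group is treated here as the sole criterion.

```python
# Groups (1)-(5) as enumerated above, in one-letter code.
CONSERVATIVE_GROUPS = [
    set("APGEDQNST"),  # (1) ala, pro, gly, glu, asp, gln, asn, ser, thr
    set("CSYT"),       # (2) cys, ser, tyr, thr
    set("VILMAF"),     # (3) val, ile, leu, met, ala, phe
    set("KRH"),        # (4) lys, arg, his
    set("FYWH"),       # (5) phe, tyr, trp, his
]


def shares_group(original, replacement):
    """True if both residues appear together in at least one group."""
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


print(shares_group("K", "R"), shares_group("W", "G"))
```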
[0150] A variant may also, or alternatively, contain
non-conservative changes. In a preferred embodiment, variant
polypeptides differ from a native or reference sequence by
substitution, deletion or addition of fewer than about 10, 9, 8, 7,
6, 5, 4, 3, 2 amino acids, or even 1 amino acid. Variants may also
(or alternatively) be modified by, for example, the deletion or
addition of amino acids that have minimal influence on the
immunogenicity, secondary structure, enzymatic activity, and/or
hydropathic nature of the polypeptide.
[0151] In certain embodiments, a polypeptide sequence is about, at
least about, or up to about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32,
33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49,
50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140,
150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270,
280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400,
410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530,
540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660,
670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790,
800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920,
930, 940, 950, 960, 970, 980, 990, 1000 or more
contiguous amino acids in length, including all integers in
between, and which may comprise all or a portion of a reference
sequence (see, e.g., Sequence Listing).
[0152] In other specific embodiments, a polypeptide sequence
consists of about or no more than about 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29,
30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46,
47, 48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120,
130, 140, 150, 160, 170, 180, 190, 200, 210, 220, 230, 240, 250,
260, 270, 280, 290, 300, 310, 320, 330, 340, 350, 360, 370, 380,
390, 400, 410, 420, 430, 440, 450, 460, 470, 480, 490, 500, 510,
520, 530, 540, 550, 560, 570, 580, 590, 600, 610, 620, 630, 640,
650, 660, 670, 680, 690, 700, 710, 720, 730, 740, 750, 760, 770,
780, 790, 800, 810, 820, 830, 840, 850, 860, 870, 880, 890, 900,
910, 920, 930, 940, 950, 960, 970, 980, 990, 1000 or more
contiguous amino acids, including all integers in between, and
which may comprise all or a portion of a reference sequence (see,
e.g., Sequence Listing).
[0153] In still other specific embodiments, a polypeptide sequence
is about 10-1000, 10-900, 10-800, 10-700, 10-600, 10-500, 10-400,
10-300, 10-200, 10-100, 10-50, 10-40, 10-30, 10-20, 20-1000,
20-900, 20-800, 20-700, 20-600, 20-500, 20-400, 20-300, 20-200,
20-100, 20-50, 20-40, 20-30, 50-1000, 50-900, 50-800, 50-700,
50-600, 50-500, 50-400, 50-300, 50-200, 50-100, 100-1000, 100-900,
100-800, 100-700, 100-600, 100-500, 100-400, 100-300, 100-200,
200-1000, 200-900, 200-800, 200-700, 200-600, 200-500, 200-400, or
200-300 contiguous amino acids, including all ranges in between,
and comprises all or a portion of a reference sequence. In certain
embodiments, the C-terminal or N-terminal region of any reference
polypeptide may be truncated by about 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 15, 20, 25, 30, 35, 40, 45, 50, 60, 70, 80, 90, 100, 110, 120,
130, 140, 150, 160, 170, 180, 190, 200, 250, 300, 350, 400, 450,
500, 550, 600, 650, 700, 750, or 800 or more amino acids, or by
about 10-50, 20-50, 50-100, 100-150, 150-200, 200-250, 250-300,
300-350, 350-400, 400-450, 450-500, 500-550, 550-600, 600-650,
650-700, 700-750, 750-800 or more amino acids, including all
integers and ranges in between (e.g., 101, 102, 103, 104, 105), so
long as the truncated polypeptide retains the binding properties
and/or activity of the reference polypeptide. Typically, the
biologically-active fragment has no less than about 1%, about 5%,
about 10%, about 25%, or about 50% of an activity of the
biologically-active reference polypeptide from which it is
derived.
[0154] In general, variants will display at least about 30%, 40%,
50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%,
95%, 96%, 97%, 98%, 99% similarity or sequence identity or sequence
homology to a reference polypeptide sequence. Moreover, sequences
differing from the native or parent sequences by the addition
(e.g., C-terminal addition, N-terminal addition, both), deletion,
truncation, insertion, or substitution (e.g., conservative
substitution) of about 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13,
14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30,
31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47,
48, 49, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acids
(including all integers and ranges in between) but which retain the
properties or activities of a parent or reference polypeptide
sequence are contemplated.
[0155] In some embodiments, variant polypeptides differ from a
reference sequence by at least one but by less than 50, 40, 30, 20,
15, 10, 8, 6, 5, 4, 3 or 2 amino acid residue(s). In other
embodiments, variant polypeptides differ from a reference sequence
by at least 1% but less than 20%, 15%, 10% or 5% of the residues.
(If this comparison requires alignment, the sequences should be
aligned for maximum similarity. "Looped" out sequences from
deletions or insertions, or mismatches, are considered
differences.)
[0156] Calculations of sequence similarity or sequence identity
between sequences (the terms are used interchangeably herein) are
performed as follows. To determine the percent identity of two
amino acid sequences, or of two nucleic acid sequences, the
sequences are aligned for optimal comparison purposes (e.g., gaps
can be introduced in one or both of a first and a second amino acid
or nucleic acid sequence for optimal alignment and non-homologous
sequences can be disregarded for comparison purposes). In certain
embodiments, the length of a reference sequence aligned for
comparison purposes is at least 30%, preferably at least 40%, more
preferably at least 50%, 60%, and even more preferably at least
70%, 80%, 90%, 100% of the length of the reference sequence. The
amino acid residues or nucleotides at corresponding amino acid
positions or nucleotide positions are then compared. When a
position in the first sequence is occupied by the same amino acid
residue or nucleotide as the corresponding position in the second
sequence, then the molecules are identical at that position.
[0157] The percent identity between the two sequences is a function
of the number of identical positions shared by the sequences,
taking into account the number of gaps, and the length of each gap,
which need to be introduced for optimal alignment of the two
sequences.
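By way of illustration, percent identity for a pair of already-aligned sequences can be computed as sketched below in Python. The choice of denominator (the full alignment length, so that every gap position lowers the result) is an assumption of this sketch; other conventions exist. The function name and the "-" gap character are likewise hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both rows.

    Both inputs must already be aligned (equal length, gaps inserted).
    Gap columns count toward the length but never count as matches.
    """
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


print(percent_identity("GGGGS", "GGGGN"))
print(percent_identity("EAAAK-EAAAK", "EAAAKAEAAAK"))
```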
[0158] The comparison of sequences and determination of percent
identity between two sequences can be accomplished using a
mathematical algorithm. In a preferred embodiment, the percent
identity between two amino acid sequences is determined using the
Needleman and Wunsch (J. Mol. Biol. 48: 443-453, 1970) algorithm,
which has been incorporated into the GAP program in the GCG
software package, using either a BLOSUM62 matrix or a PAM250
matrix, and a gap weight of 16, 14, 12, 10, 8, 6, or 4 and a length
weight of 1, 2, 3, 4, 5, or 6. In yet another preferred embodiment,
the percent identity between two nucleotide sequences is determined
using the GAP program in the GCG software package, using a
NWSgapdna.CMP matrix and a gap weight of 40, 50, 60, 70, or 80 and
a length weight of 1, 2, 3, 4, 5, or 6. A particularly preferred
set of parameters (and the one that should be used unless otherwise
specified) is a BLOSUM62 scoring matrix with a gap penalty of
12, a gap extend penalty of 4, and a frameshift gap penalty of
5.
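For orientation only, the following Python sketch shows a minimal global alignment in the style of Needleman and Wunsch. It is not the GAP program and does not reproduce the parameters recited above: it assumes a simple match/mismatch score in place of a BLOSUM62 or PAM250 matrix and a single linear gap penalty in place of separate gap and length weights. All names and default values are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with linear gap penalty.

    Returns (score, aligned_a, aligned_b), with "-" marking gaps.
    """
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


print(needleman_wunsch("EAAAK", "EAAK"))
```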
[0159] The percent identity between two amino acid or nucleotide
sequences can be determined using the algorithm of E. Myers and W.
Miller (CABIOS. 4:11-17, 1988), which has been incorporated into the
ALIGN program (version 2.0), using a PAM120 weight residue table, a
gap length penalty of 12 and a gap penalty of 4.
[0160] The nucleic acid and protein sequences described herein can
be used as a "query sequence" to perform a search against public
databases to, for example, identify other family members or related
sequences. Such searches can be performed using the NBLAST and
XBLAST programs (version 2.0) of Altschul, et al., (1990, J. Mol.
Biol, 215: 403-10). BLAST nucleotide searches can be performed with
the NBLAST program, score=100, wordlength=12 to obtain nucleotide
sequences homologous to nucleic acid molecules of the invention.
BLAST protein searches can be performed with the XBLAST program,
score=50, wordlength=3 to obtain amino acid sequences homologous to
protein molecules of the invention. To obtain gapped alignments for
comparison purposes, Gapped BLAST can be utilized as described in
Altschul et al., (Nucleic Acids Res. 25: 3389-3402, 1997). When
utilizing BLAST and Gapped BLAST programs, the default parameters
of the respective programs (e.g., XBLAST and NBLAST) can be
used.
[0161] In one embodiment, as noted above, polynucleotides and/or
polypeptides can be evaluated using a BLAST alignment tool. A local
alignment consists simply of a pair of sequence segments, one from
each of the sequences being compared. A modification of
Smith-Waterman or Sellers algorithms will find all segment pairs
whose scores cannot be improved by extension or trimming, called
high-scoring segment pairs (HSPs). The results of the BLAST
alignments include statistical measures to indicate the likelihood
that the BLAST score can be expected from chance alone.
[0162] The raw score, S, is calculated from the number of gaps and
substitutions associated with each aligned sequence wherein higher
similarity scores indicate a more significant alignment.
Substitution scores are given by a look-up table (see PAM,
BLOSUM).
[0163] Gap scores are typically calculated as the sum of G, the gap
opening penalty and L, the gap extension penalty. For a gap of
length n, the gap cost would be G+Ln. The choice of gap costs, G
and L is empirical, but it is customary to choose a high value for
G (10-15), e.g., 11, and a low value for L (1-2) e.g., 1.
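As a simple worked illustration of the gap cost G+Ln described above, the following Python sketch uses the example values G=11 and L=1 as defaults. The function name is hypothetical.

```python
def gap_cost(n, G=11, L=1):
    """Cost of a single gap of length n: opening penalty G plus L per position."""
    if n < 1:
        raise ValueError("gap length must be 1 or more")
    return G + L * n


# A gap of length 3 with G = 11 and L = 1 costs 11 + 1*3.
print(gap_cost(3))
```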
[0164] The bit score, S', is derived from the raw alignment score S
in which the statistical properties of the scoring system used have
been taken into account. Bit scores are normalized with respect to
the scoring system, therefore they can be used to compare alignment
scores from different searches. The terms "bit score" and
"similarity score" are used interchangeably. The bit score gives an
indication of how good the alignment is; the higher the score, the
better the alignment.
[0165] The E-Value, or expected value, describes the likelihood
that a sequence with a similar score will occur in the database by
chance. It is a prediction of the number of different alignments
with scores equivalent to or better than S that are expected to
occur in a database search by chance. The smaller the E-Value, the
more significant the alignment. For example, an alignment having an
E value of e.sup.-117 means that a sequence with a similar score is
very unlikely to occur simply by chance. Additionally, the expected
score for aligning a random pair of amino acids is required to be
negative, otherwise long alignments would tend to have high score
independently of whether the segments aligned were related.
Additionally, the BLAST algorithm uses an appropriate substitution
matrix, nucleotide or amino acid and for gapped alignments uses gap
creation and extension penalties. For example, BLAST alignment and
comparison of polypeptide sequences are typically done using the
BLOSUM62 matrix, a gap existence penalty of 11 and a gap extension
penalty of 1.
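For illustration, the relationship between the raw score S, the bit score S', and the E-value is commonly expressed using Karlin-Altschul statistics as S' = (lambda*S - ln K)/ln 2 and E = m*n*2^(-S'), where m and n are the effective query and database lengths. The Python sketch below implements these two formulas. The statistical parameters lambda and K depend on the scoring system and are supplied by the caller; the values shown in the usage lines (0.267 and 0.041) are assumptions used only as an example and should be checked against the BLAST output for the scoring system actually used.

```python
import math


def bit_score(raw_score, lam, K):
    """Normalize a raw alignment score S into a bit score S'."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def e_value(bits, query_len, db_len):
    """Expected number of chance alignments scoring at least this well."""
    return query_len * db_len * 2.0 ** (-bits)


# Example parameters are assumptions for illustration only.
bits = bit_score(100, lam=0.267, K=0.041)
print(bits, e_value(bits, query_len=300, db_len=1_000_000))
```

Consistent with the text above, a higher raw score yields a higher bit score and a smaller E-value for the same search space.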
[0166] In one embodiment, sequence similarity scores are reported
from BLAST analyses done using the BLOSUM62 matrix, a gap existence
penalty of 11 and a gap extension penalty of 1.
[0167] In a particular embodiment, sequence identity/similarity
scores provided herein refer to the value obtained using GAP
Version 10 (GCG, Accelrys, San Diego, Calif.) using the following
parameters: % identity and % similarity for a nucleotide sequence
using GAP Weight of 50 and Length Weight of 3, and the
nwsgapdna.cmp scoring matrix; % identity and % similarity for an
amino acid sequence using GAP Weight of 8 and Length Weight of 2,
and the BLOSUM62 scoring matrix (Henikoff and Henikoff, PNAS USA.
89:10915-10919, 1992). GAP uses the algorithm of Needleman and
Wunsch (J Mol Biol. 48:443-453, 1970) to find the alignment of two
complete sequences that maximizes the number of matches and
minimizes the number of gaps.
[0168] In one particular embodiment, the variant polypeptide
comprises an amino acid sequence that can be optimally aligned with
a reference polypeptide sequence (see, e.g., Sequence Listing) to
generate BLAST bit scores or sequence similarity scores of at
least about 50, 60, 70, 80, 90, 100, 110, 120, 130, 140, 150,
160, 170, 180, 190, 200, 210, 220, 230, 240, 250, 260, 270, 280,
290, 300, 310, 320, 330, 340, 350, 360, 370, 380, 390, 400, 410,
420, 430, 440, 450, 460, 470, 480, 490, 500, 510, 520, 530, 540,
550, 560, 570, 580, 590, 600, 610, 620, 630, 640, 650, 660, 670,
680, 690, 700, 710, 720, 730, 740, 750, 760, 770, 780, 790, 800,
810, 820, 830, 840, 850, 860, 870, 880, 890, 900, 910, 920, 930,
940, 950, 960, 970, 980, 990, 1000, or more, including all integers
and ranges in between, wherein the BLAST alignment used the
BLOSUM62 matrix, a gap existence penalty of 11, and a gap extension
penalty of 1.
[0169] As noted above, a reference polypeptide may be altered in
various ways including amino acid substitutions, deletions,
truncations, additions, and insertions. Methods for such
manipulations are generally known in the art. For example, amino
acid sequence variants of a reference polypeptide can be prepared
by mutations in the DNA. Methods for mutagenesis and nucleotide
sequence alterations are well known in the art. See, for example,
Kunkel (PNAS USA. 82: 488-492, 1985); Kunkel et al., (Methods in
Enzymol. 154: 367-382, 1987), U.S. Pat. No. 4,873,192, Watson, J.
D. et al., ("Molecular Biology of the Gene," Fourth Edition,
Benjamin/Cummings, Menlo Park, Calif., 1987) and the references
cited therein. Guidance as to appropriate amino acid substitutions
that do not affect biological activity of the protein of interest
may be found in the model of Dayhoff et al., (1978) Atlas of
Protein Sequence and Structure (Natl. Biomed. Res. Found.,
Washington, D.C.).
[0170] Methods for screening gene products of combinatorial
libraries made by such modifications, and for screening cDNA
libraries for gene products having a selected property are known in
the art. Such methods are adaptable for rapid screening of the gene
libraries generated by combinatorial mutagenesis of reference
polypeptides. As one example, recursive ensemble mutagenesis (REM),
a technique which enhances the frequency of functional mutants in
the libraries, can be used in combination with the screening assays
to identify polypeptide variants (Arkin and Youvan, PNAS USA 89:
7811-7815, 1992; Delagrave et al., Protein Engineering. 6: 327-331,
1993).
[0171] Polynucleotides, Host Cells, and Methods of Production.
[0172] Certain embodiments relate to polynucleotides that encode
the fusion proteins described herein, and vectors that comprise
such polynucleotides, for example, where the polynucleotides are
operably linked to one or more regulatory elements. Also included
are recombinant host cells that comprise such polynucleotides,
vectors, fusion proteins, and methods of recombinant production of
the foregoing.
[0173] Fusion proteins may be prepared using standard techniques.
Preferably, however, a fusion protein is expressed as a recombinant
protein in an expression system, as described herein and known in
the art. Fusion proteins can contain one or multiple copies of a
p97 sequence and one or multiple copies of an IDS sequence, present
in any desired arrangement.
[0174] Polynucleotides and fusion polynucleotides can contain one
or multiple copies of a nucleic acid encoding a p97 polypeptide
sequence, and/or may contain one or multiple copies of a nucleic
acid encoding an IDS sequence.
[0175] For fusion proteins, DNA sequences encoding the p97
polypeptide sequence, the IDS sequence of interest, and optionally
a peptide linker component may be assembled separately, and then
ligated into an appropriate expression vector. The 3' end of the
DNA sequence encoding one polypeptide component can be ligated,
with or without a peptide linker, to the 5' end of a DNA sequence
encoding the other polypeptide component(s) so that the reading
frames of the sequences are in frame. The ligated DNA sequences are
operably linked to suitable transcriptional and/or translational
regulatory elements. The regulatory elements responsible for
expression of DNA are usually located only 5' to the DNA sequence
encoding the first polypeptide. Similarly, stop codons required to
end translation and transcription termination signals are only
present 3' to the DNA sequence encoding the most C-terminal
polypeptide. This permits translation into a single fusion
polypeptide that retains the biological activity of both component
polypeptides.
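The in-frame assembly described above can be sketched as follows. This is a minimal illustration, not a cloning protocol: the two short open reading frames in the usage note are hypothetical toy sequences, the function names are assumptions, and the linker is the 45-nucleotide stretch that appears between the I2S and MTf portions of the Table 7 sequences, which translates to (EAAAK)3. The sketch drops the stop codon of the upstream component, concatenates the parts, and checks that the result has a single stop codon at the 3' end.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order; "*" marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
STOP_CODONS = {"TAA", "TAG", "TGA"}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def assemble_fusion(upstream_orf, downstream_orf, linker=""):
    """Join two ORFs (5' to 3') and an optional linker into one reading frame."""
    for name, seq in (("upstream", upstream_orf), ("linker", linker),
                      ("downstream", downstream_orf)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{name} length is not a multiple of 3")
    # The upstream component must not terminate translation.
    if upstream_orf[-3:] in STOP_CODONS:
        upstream_orf = upstream_orf[:-3]
    fusion = upstream_orf + linker + downstream_orf
    protein = translate(fusion)
    if not protein.endswith("*") or "*" in protein[:-1]:
        raise ValueError("expected exactly one stop codon, at the 3' end")
    return fusion


# Linker as it appears in the Table 7 sequences; encodes EAAAKEAAAKEAAAK.
LINKER = "GAAGCCGCCGCGAAAGAAGCCGCCGCAAAAGAAGCCGCTGCCAAA"
```

With the hypothetical toy inputs `"ATGGCCTGA"` and `"GGCATGTAA"`, `translate(assemble_fusion("ATGGCCTGA", "GGCATGTAA", LINKER))` yields `MAEAAAKEAAAKEAAAKGM*`, a single polypeptide read through the linker.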
[0176] Similar techniques, mainly the arrangement of regulatory
elements such as promoters, stop codons, and transcription
termination signals, can be applied to the recombinant production
of non-fusion proteins.
[0177] Suitable vectors can be chosen or constructed, containing
appropriate regulatory sequences, including promoter sequences,
terminator sequences, polyadenylation sequences, enhancer
sequences, marker genes and other sequences as appropriate. Vectors
may be plasmids, viral (e.g., phage), or phagemid, as appropriate. For
further details see, for example, Molecular Cloning: a Laboratory
Manual: 2nd edition, Sambrook et al., 1989, Cold Spring Harbor
Laboratory Press. Many known techniques and protocols for
manipulation of nucleic acid, for example in preparation of nucleic
acid constructs, mutagenesis, sequencing, introduction of DNA into
cells and gene expression, and analysis of proteins, are described
in detail in Current Protocols in Molecular Biology, Second
Edition, Ausubel et al. eds., John Wiley & Sons, 1992, or
subsequent updates thereto.
[0178] As will be understood by those of skill in the art, it may
be advantageous in some instances to produce polypeptide-encoding
nucleotide sequences possessing non-naturally occurring codons. For
example, codons preferred by a particular prokaryotic or eukaryotic
host can be selected to increase the rate of protein expression or
to produce a recombinant RNA transcript having desirable
properties, such as a half-life which is longer than that of a
transcript generated from the naturally occurring sequence. Such
polynucleotides are commonly referred to as "codon-optimized." Any
of the polynucleotides described herein may be utilized in a
codon-optimized form. In certain embodiments, a polynucleotide can
be codon optimized for use in specific bacteria such as E. coli or
yeast such as S. cerevisiae (see, e.g., Burgess-Brown et al.,
Protein Expr Purif. 59:94-102, 2008).
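A simple form of codon optimization, synonymous codon replacement, can be sketched as below. This is an illustration under stated assumptions rather than the method of the cited reference: the function names are hypothetical, and the small preference table is a placeholder for illustration only, not measured codon-usage data for any host. Codons for amino acids missing from the table are left unchanged, and the encoded protein is checked to be identical before and after recoding.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order; "*" marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def codon_optimize(cds, preferred):
    """Replace each codon with the host-preferred synonymous codon, if given."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    for amino_acid, codon in preferred.items():
        if CODON_TABLE[codon] != amino_acid:
            raise ValueError(f"{codon} does not encode {amino_acid}")
    recoded = "".join(
        preferred.get(CODON_TABLE[cds[i:i + 3]], cds[i:i + 3])
        for i in range(0, len(cds), 3)
    )
    # Recoding must never change the encoded polypeptide.
    if translate(recoded) != translate(cds):
        raise RuntimeError("recoding altered the protein sequence")
    return recoded


# Placeholder preferences for illustration only (hypothetical, not usage data).
TOY_PREFERENCES = {"L": "CTG", "R": "CGT", "K": "AAA"}
```

For instance, `codon_optimize("ATGTTAAGAAAGTAA", TOY_PREFERENCES)` returns `ATGCTGCGTAAATAA`, which still encodes M-L-R-K followed by a stop codon. A practical workflow would use a full codon-usage table for the chosen host and would also weigh factors such as mRNA structure and repeated motifs.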
[0179] Exemplary polynucleotide sequences are provided in Table 7
below.
TABLE 7 Exemplary polynucleotide sequences (Name; SEQ ID NO; Polynucleotide Sequence)
I2S-MTf (SEQ ID NO: 143)
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCTCGGAAACTCAGGCCAACTCCACCACAGATGCACTCAACGTG
CTGCTGATCATCGTAGATGACCTCCGACCTTCTCTGGGCTGTTACGGCGACAAG
CTAGTACGGAGCCCAAACATCGACCAGCTCGCATCGCACTCTCTCCTATTCCAG
AACGCATTCGCCCAGCAGGCTGTCTGTGCTCCCTCCCGAGTGTCCTTCCTCACG
GGTCGGAGACCCGATACCACGAGGTTATATGACTTCAACTCATACTGGCGCGTG
CATGCCGGTAACTTTTCTACTATACCCCAGTATTTTAAAGAAAATGGCTATGTT
ACAATGTCCGTTGGCAAGGTATTTCATCCTGGTATTAGCAGCAACCACACAGAT
GACTCTCCGTATAGCTGGTCATTCCCACCATACCACCCCTCCAGCGAAAAGTAC
GAAAACACAAAGACTTGCCGGGGCCCAGATGGCGAACTGCACGCAAATCTGCTG
TGCCCTGTAGATGTCTTGGACGTGCCCGAAGGTACTCTGCCCGACAAACAGTCC
ACAGAACAGGCAATCCAACTCCTTGAAAAGATGAAAACGAGCGCGTCCCCCTTC
TTCCTCGCCGTGGGCTACCACAAGCCCCACATCCCGTTTAGATACCCCAAGGAA
TTTCAGAAACTGTACCCCCTGGAAAACATCACTCTCGCGCCCGACCCCGAAGTG
CCAGACGGACTCCCTCCTGTTGCCTACAACCCTTGGATGGACATCAGACAACGT
GAAGATGTGCAGGCCCTGAACATCTCAGTGCCTTACGGCCCCATTCCAGTTGAC
TTCCAGAGGAAGATTCGGCAGTCCTACTTCGCCTCCGTTAGTTACCTGGACACC
CAAGTGGGTAGACTCCTGAGCGCCTTGGACGATCTCCAGCTCGCAAACAGCACC
ATCATTGCCTTCACCAGCGACCATGGTTGGGCGCTGGGTGAACATGGAGAATGG
GCTAAATATTCAAATTTCGACGTTGCGACCCACGTCCCATTGATCTTCTACGTG
CCTGGACGAACAGCCTCCTTGCCTGAAGCCGGGGAAAAGTTGTTTCCATATCTG
GACCCTTTCGATTCTGCGAGCCAACTCATGGAACCTGGGCGACAGAGCATGGAC
CTGGTGGAACTGGTCAGTTTATTTCCAACCCTGGCAGGCCTTGCAGGCCTCCAA
GTTCCACCTCGGTGTCCCGTTCCCTCATTCCACGTCGAACTCTGTCGCGAAGGT
AAAAACCTCCTCAAGCATTTTCGTTTTCGGGACCTCGAAGAAGACCCATACCTG
CCAGGGAATCCAAGGGAACTGATTGCCTACAGCCAGTACCCTAGACCTAGCGAC
ATCCCACAGTGGAACAGCGACAAGCCCTCCCTCAAGGACATTAAAATCATGGGT
TATAGTATCCGGACTATTGACTACAGGTATACCGTGTGGGTGGGTTTCAACCCA
GACGAATTTCTCGCCAATTTCTCCGACATCCACGCGGGCGAACTGTATTTCGTT
GATTCCGATCCACTGCAAGATCATAATATGTACAACGATAGTCAAGGGGGTGAC
CTCTTCCAGTTGCTAATGCCAGAAGCCGCCGCGAAAGAAGCCGCCGCAAAAGAA
GCCGCTGCCAAAGGCATGGAAGTGCGTTGGTGCGCCACCTCTGACCCCGAGCAG
CACAAGTGCGGCAACATGTCCGAGGCCTTCAGAGAGGCCGGCATCCAGCCTTCT
CTGCTGTGTGTGCGGGGCACCTCTGCCGACCATTGCGTGCAGCTGATCGCCGCC
CAGGAAGCCGACGCTATCACACTGGATGGCGGCGCTATCTACGAGGCTGGCAAA
GAGCACGGCCTGAAGCCCGTCGTGGGCGAGGTGTACGATCAGGAAGTGGGCACC
TCCTACTACGCCGTGGCTGTCGTGCGGAGATCCTCCCACGTGACCATCGACACC
CTGAAGGGCGTGAAGTCCTGCCACACCGGCATCAACAGAACCGTGGGCTGGAAC
GTGCCCGTGGGCTACCTGGTGGAATCCGGCAGACTGTCCGTGATGGGCTGCGAC
GTGCTGAAGGCCGTGTCCGATTACTTCGGCGGCTCTTGTGTGCCTGGCGCTGGC
GAGACATCCTACTCCGAGTCCCTGTGCAGACTGTGCAGGGGCGACTCTTCTGGC
GAGGGCGTGTGCGACAAGTCCCCTCTGGAACGGTACTACGACTACTCCGGCGCC
TTCAGATGCCTGGCTGAAGGTGCTGGCGACGTGGCCTTCGTGAAGCACTCCACC
GTGCTGGAAAACACCGACGGCAAGACCCTGCCTTCTTGGGGCCAGGCACTGCTG
TCCCAGGACTTCGAGCTGCTGTGCCGGGATGGCTCCAGAGCCGATGTGACAGAG
TGGCGGCAGTGCCACCTGGCCAGAGTGCCTGCTCATGCTGTGGTCGTGCGCGCC
GATACAGATGGCGGCCTGATCTTCCGGCTGCTGAACGAGGGCCAGCGGCTGTTC
TCTCACGAGGGCTCCAGCTTCCAGATGTTCTCCAGCGAGGCCTACGGCCAGAAG
GACCTGCTGTTCAAGGACTCCACCTCCGAGCTGGTGCCTATCGCCACCCAGACC
TATGAGGCTTGGCTGGGCCACGAGTACCTGCACGCTATGAAGGGACTGCTGTGC
GACCCCAACCGGCTGCCTCCTTATCTGAGGTGGTGCGTGCTGTCCACCCCCGAG
ATCCAGAAATGCGGCGATATGGCCGTGGCCTTTCGGCGGCAGAGACTGAAGCCT
GAGATCCAGTGCGTGTCCGCCAAGAGCCCTCAGCACTGCATGGAACGGATCCAG
GCCGAACAGGTGGACGCCGTGACACTGTCCGGCGAGGATATCTACACCGCCGGA
AAGACCTACGGCCTGGTGCCAGCTGCTGGCGAGCATTACGCCCCTGAGGACTCC
TCCAACAGCTACTACGTGGTGGCAGTCGTGCGCCGGGACTCCTCTCACGCCTTT
ACCCTGGATGAGCTGCGGGGCAAGAGAAGCTGTCACGCCGGCTTTGGAAGCCCT
GCCGGATGGGATGTGCCTGTGGGCGCTCTGATCCAGCGGGGCTTCATCAGACCC
AAGGACTGTGATGTGCTGACCGCCGTGTCTGAGTTCTTCAACGCCTCCTGTGTG
CCCGTGAACAACCCCAAGAACTACCCCTCCAGCCTGTGCGCCCTGTGTGTGGGA
GATGAGCAGGGCCGGAACAAATGCGTGGGCAACTCCCAGGAAAGATATTACGGC
TACAGAGGCGCCTTCCGGTGTCTGGTGGAAAACGCCGGGGATGTGGCTTTTGTG
CGGCACACCACCGTGTTCGACAACACCAATGGCCACAACTCCGAGCCTTGGGCC
GCTGAGCTGAGATCCGAGGATTACGAACTGCTGTGTCCCAACGGCGCCAGGGCT
GAGGTGTCCCAGTTTGCCGCCTGTAACCTGGCCCAGATCCCTCCCCACGCTGTG
ATGGTGCGACCCGACACCAACATCTTCACCGTGTACGGCCTGCTGGACAAGGCC
CAGGATCTGTTCGGCGACGACCACAACAAGAACGGGTTCAAGATGTTCGACTCC
AGCAACTACCACGGACAGGATCTGCTGTTTAAAGATGCCACCGTGCGGGCCGTG
CCAGTGGGCGAAAAGACCACCTACAGAGGATGGCTGGGACTGGACTACGTGGCC
GCCCTGGAAGGCATGTCCTCCCAGCAGTGTTCCTGA
MTf-I2S (SEQ ID NO: 144)
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCGGCATGGAAGTGCGTTGGTGCGCCACCTCTGACCCCGAGCAG
CACAAGTGCGGCAACATGTCCGAGGCCTTCAGAGAGGCCGGCATCCAGCCTTCT
CTGCTGTGTGTGCGGGGCACCTCTGCCGACCATTGCGTGCAGCTGATCGCCGCC
CAGGAAGCCGACGCTATCACACTGGATGGCGGCGCTATCTACGAGGCTGGCAAA
GAGCACGGCCTGAAGCCCGTCGTGGGCGAGGTGTACGATCAGGAAGTGGGCACC
TCCTACTACGCCGTGGCTGTCGTGCGGAGATCCTCCCACGTGACCATCGACACC
CTGAAGGGCGTGAAGTCCTGCCACACCGGCATCAACAGAACCGTGGGCTGGAAC
GTGCCCGTGGGCTACCTGGTGGAATCCGGCAGACTGTCCGTGATGGGCTGCGAC
GTGCTGAAGGCCGTGTCCGATTACTTCGGCGGCTCTTGTGTGCCTGGCGCTGGC
GAGACATCCTACTCCGAGTCCCTGTGCAGACTGTGCAGGGGCGACTCTTCTGGC
GAGGGCGTGTGCGACAAGTCCCCTCTGGAACGGTACTACGACTACTCCGGCGCC
TTCAGATGCCTGGCTGAAGGTGCTGGCGACGTGGCCTTCGTGAAGCACTCCACC
GTGCTGGAAAACACCGACGGCAAGACCCTGCCTTCTTGGGGCCAGGCACTGCTG
TCCCAGGACTTCGAGCTGCTGTGCCGGGATGGCTCCAGAGCCGATGTGACAGAG
TGGCGGCAGTGCCACCTGGCCAGAGTGCCTGCTCATGCTGTGGTCGTGCGCGCC
GATACAGATGGCGGCCTGATCTTCCGGCTGCTGAACGAGGGCCAGCGGCTGTTC
TCTCACGAGGGCTCCAGCTTCCAGATGTTCTCCAGCGAGGCCTACGGCCAGAAG
GACCTGCTGTTCAAGGACTCCACCTCCGAGCTGGTGCCTATCGCCACCCAGACC
TATGAGGCTTGGCTGGGCCACGAGTACCTGCACGCTATGAAGGGACTGCTGTGC
GACCCCAACCGGCTGCCTCCTTATCTGAGGTGGTGCGTGCTGTCCACCCCCGAG
ATCCAGAAATGCGGCGATATGGCCGTGGCCTTTCGGCGGCAGAGACTGAAGCCT
GAGATCCAGTGCGTGTCCGCCAAGAGCCCTCAGCACTGCATGGAACGGATCCAG
GCCGAACAGGTGGACGCCGTGACACTGTCCGGCGAGGATATCTACACCGCCGGA
AAGACCTACGGCCTGGTGCCAGCTGCTGGCGAGCATTACGCCCCTGAGGACTCC
TCCAACAGCTACTACGTGGTGGCAGTCGTGCGCCGGGACTCCTCTCACGCCTTT
ACCCTGGATGAGCTGCGGGGCAAGAGAAGCTGTCACGCCGGCTTTGGAAGCCCT
GCCGGATGGGATGTGCCTGTGGGCGCTCTGATCCAGCGGGGCTTCATCAGACCC
AAGGACTGTGATGTGCTGACCGCCGTGTCTGAGTTCTTCAACGCCTCCTGTGTG
CCCGTGAACAACCCCAAGAACTACCCCTCCAGCCTGTGCGCCCTGTGTGTGGGA
GATGAGCAGGGCCGGAACAAATGCGTGGGCAACTCCCAGGAAAGATATTACGGC
TACAGAGGCGCCTTCCGGTGTCTGGTGGAAAACGCCGGGGATGTGGCTTTTGTG
CGGCACACCACCGTGTTCGACAACACCAATGGCCACAACTCCGAGCCTTGGGCC
GCTGAGCTGAGATCCGAGGATTACGAACTGCTGTGTCCCAACGGCGCCAGGGCT
GAGGTGTCCCAGTTTGCCGCCTGTAACCTGGCCCAGATCCCTCCCCACGCTGTG
ATGGTGCGACCCGACACCAACATCTTCACCGTGTACGGCCTGCTGGACAAGGCC
CAGGATCTGTTCGGCGACGACCACAACAAGAACGGGTTCAAGATGTTCGACTCC
AGCAACTACCACGGACAGGATCTGCTGTTTAAAGATGCCACCGTGCGGGCCGTG
CCAGTGGGCGAAAAGACCACCTACAGAGGATGGCTGGGACTGGACTACGTGGCC
GCCCTGGAAGGCATGTCCTCCCAGCAGTGTTCCGAAGCCGCCGCGAAAGAAGCC
GCCGCAAAAGAAGCCGCTGCCAAATCGGAAACTCAGGCCAACTCCACCACAGAT
GCACTCAACGTGCTGCTGATCATCGTAGATGACCTCCGACCTTCTCTGGGCTGT
TACGGCGACAAGCTAGTACGGAGCCCAAACATCGACCAGCTCGCATCGCACTCT
CTCCTATTCCAGAACGCATTCGCCCAGCAGGCTGTCTGTGCTCCCTCCCGAGTG
TCCTTCCTCACGGGTCGGAGACCCGATACCACGAGGTTATATGACTTCAACTCA
TACTGGCGCGTGCATGCCGGTAACTTTTCTACTATACCCCAGTATTTTAAAGAA
AATGGCTATGTTACAATGTCCGTTGGCAAGGTATTTCATCCTGGTATTAGCAGC
AACCACACAGATGACTCTCCGTATAGCTGGTCATTCCCACCATACCACCCCTCC
AGCGAAAAGTACGAAAACACAAAGACTTGCCGGGGCCCAGATGGCGAACTGCAC
GCAAATCTGCTGTGCCCTGTAGATGTCTTGGACGTGCCCGAAGGTACTCTGCCC
GACAAACAGTCCACAGAACAGGCAATCCAACTCCTTGAAAAGATGAAAACGAGC
GCGTCCCCCTTCTTCCTCGCCGTGGGCTACCACAAGCCCCACATCCCGTTTAGA
TACCCCAAGGAATTTCAGAAACTGTACCCCCTGGAAAACATCACTCTCGCGCCC
GACCCCGAAGTGCCAGACGGACTCCCTCCTGTTGCCTACAACCCTTGGATGGAC
ATCAGACAACGTGAAGATGTGCAGGCCCTGAACATCTCAGTGCCTTACGGCCCC
ATTCCAGTTGACTTCCAGAGGAAGATTCGGCAGTCCTACTTCGCCTCCGTTAGT
TACCTGGACACCCAAGTGGGTAGACTCCTGAGCGCCTTGGACGATCTCCAGCTC
GCAAACAGCACCATCATTGCCTTCACCAGCGACCATGGTTGGGCGCTGGGTGAA
CATGGAGAATGGGCTAAATATTCAAATTTCGACGTTGCGACCCACGTCCCATTG
ATCTTCTACGTGCCTGGACGAACAGCCTCCTTGCCTGAAGCCGGGGAAAAGTTG
TTTCCATATCTGGACCCTTTCGATTCTGCGAGCCAACTCATGGAACCTGGGCGA
CAGAGCATGGACCTGGTGGAACTGGTCAGTTTATTTCCAACCCTGGCAGGCCTT
GCAGGCCTCCAAGTTCCACCTCGGTGTCCCGTTCCCTCATTCCACGTCGAACTC
TGTCGCGAAGGTAAAAACCTCCTCAAGCATTTTCGTTTTCGGGACCTCGAAGAA
GACCCATACCTGCCAGGGAATCCAAGGGAACTGATTGCCTACAGCCAGTACCCT
AGACCTAGCGACATCCCACAGTGGAACAGCGACAAGCCCTCCCTCAAGGACATT
AAAATCATGGGTTATAGTATCCGGACTATTGACTACAGGTATACCGTGTGGGTG
GGTTTCAACCCAGACGAATTTCTCGCCAATTTCTCCGACATCCACGCGGGCGAA
CTGTATTTCGTTGATTCCGATCCACTGCAAGATCATAATATGTACAACGATAGT
CAAGGGGGTGACCTCTTCCAGTTGCTAATGCCATGA
MTfpep-I2S (SEQ ID NO: 145)
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCGACTCCTCTCACGCCTTCACCCTGGACGAGCTGCGGTACGAA
GCCGCCGCGAAAGAAGCCGCCGCAAAAGAAGCCGCTGCCAAATCGGAAACTCAG
GCCAACTCCACCACAGATGCACTCAACGTGCTGCTGATCATCGTAGATGACCTC
CGACCTTCTCTGGGCTGTTACGGCGACAAGCTAGTACGGAGCCCAAACATCGAC
CAGCTCGCATCGCACTCTCTCCTATTCCAGAACGCATTCGCCCAGCAGGCTGTC
TGTGCTCCCTCCCGAGTGTCCTTCCTCACGGGTCGGAGACCCGATACCACGAGG
TTATATGACTTCAACTCATACTGGCGCGTGCATGCCGGTAACTTTTCTACTATA
CCCCAGTATTTTAAAGAAAATGGCTATGTTACAATGTCCGTTGGCAAGGTATTT
CATCCTGGTATTAGCAGCAACCACACAGATGACTCTCCGTATAGCTGGTCATTC
CCACCATACCACCCCTCCAGCGAAAAGTACGAAAACACAAAGACTTGCCGGGGC
CCAGATGGCGAACTGCACGCAAATCTGCTGTGCCCTGTAGATGTCTTGGACGTG
CCCGAAGGTACTCTGCCCGACAAACAGTCCACAGAACAGGCAATCCAACTCCTT
GAAAAGATGAAAACGAGCGCGTCCCCCTTCTTCCTCGCCGTGGGCTACCACAAG
CCCCACATCCCGTTTAGATACCCCAAGGAATTTCAGAAACTGTACCCCCTGGAA
AACATCACTCTCGCGCCCGACCCCGAAGTGCCAGACGGACTCCCTCCTGTTGCC
TACAACCCTTGGATGGACATCAGACAACGTGAAGATGTGCAGGCCCTGAACATC
TCAGTGCCTTACGGCCCCATTCCAGTTGACTTCCAGAGGAAGATTCGGCAGTCC
TACTTCGCCTCCGTTAGTTACCTGGACACCCAAGTGGGTAGACTCCTGAGCGCC
TTGGACGATCTCCAGCTCGCAAACAGCACCATCATTGCCTTCACCAGCGACCAT
GGTTGGGCGCTGGGTGAACATGGAGAATGGGCTAAATATTCAAATTTCGACGTT
GCGACCCACGTCCCATTGATCTTCTACGTGCCTGGACGAACAGCCTCCTTGCCT
GAAGCCGGGGAAAAGTTGTTTCCATATCTGGACCCTTTCGATTCTGCGAGCCAA
CTCATGGAACCTGGGCGACAGAGCATGGACCTGGTGGAACTGGTCAGTTTATTT
CCAACCCTGGCAGGCCTTGCAGGCCTCCAAGTTCCACCTCGGTGTCCCGTTCCC
TCATTCCACGTCGAACTCTGTCGCGAAGGTAAAAACCTCCTCAAGCATTTTCGT
TTTCGGGACCTCGAAGAAGACCCATACCTGCCAGGGAATCCAAGGGAACTGATT
GCCTACAGCCAGTACCCTAGACCTAGCGACATCCCACAGTGGAACAGCGACAAG
CCCTCCCTCAAGGACATTAAAATCATGGGTTATAGTATCCGGACTATTGACTAC
AGGTATACCGTGTGGGTGGGTTTCAACCCAGACGAATTTCTCGCCAATTTCTCC
GACATCCACGCGGGCGAACTGTATTTCGTTGATTCCGATCCACTGCAAGATCAT
AATATGTACAACGATAGTCAAGGGGGTGACCTCTTCCAGTTGCTAATGCCATGA
I2S-MTfpep (SEQ ID NO: 146)
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCTCGGAAACTCAGGCCAACTCCACCACAGATGCACTCAACGTG
CTGCTGATCATCGTAGATGACCTCCGACCTTCTCTGGGCTGTTACGGCGACAAG
CTAGTACGGAGCCCAAACATCGACCAGCTCGCATCGCACTCTCTCCTATTCCAG
AACGCATTCGCCCAGCAGGCTGTCTGTGCTCCCTCCCGAGTGTCCTTCCTCACG
GGTCGGAGACCCGATACCACGAGGTTATATGACTTCAACTCATACTGGCGCGTG
CATGCCGGTAACTTTTCTACTATACCCCAGTATTTTAAAGAAAATGGCTATGTT
ACAATGTCCGTTGGCAAGGTATTTCATCCTGGTATTAGCAGCAACCACACAGAT
GACTCTCCGTATAGCTGGTCATTCCCACCATACCACCCCTCCAGCGAAAAGTAC
GAAAACACAAAGACTTGCCGGGGCCCAGATGGCGAACTGCACGCAAATCTGCTG
TGCCCTGTAGATGTCTTGGACGTGCCCGAAGGTACTCTGCCCGACAAACAGTCC
ACAGAACAGGCAATCCAACTCCTTGAAAAGATGAAAACGAGCGCGTCCCCCTTC
TTCCTCGCCGTGGGCTACCACAAGCCCCACATCCCGTTTAGATACCCCAAGGAA
TTTCAGAAACTGTACCCCCTGGAAAACATCACTCTCGCGCCCGACCCCGAAGTG
CCAGACGGACTCCCTCCTGTTGCCTACAACCCTTGGATGGACATCAGACAACGT
GAAGATGTGCAGGCCCTGAACATCTCAGTGCCTTACGGCCCCATTCCAGTTGAC
TTCCAGAGGAAGATTCGGCAGTCCTACTTCGCCTCCGTTAGTTACCTGGACACC
CAAGTGGGTAGACTCCTGAGCGCCTTGGACGATCTCCAGCTCGCAAACAGCACC
ATCATTGCCTTCACCAGCGACCATGGTTGGGCGCTGGGTGAACATGGAGAATGG
GCTAAATATTCAAATTTCGACGTTGCGACCCACGTCCCATTGATCTTCTACGTG
CCTGGACGAACAGCCTCCTTGCCTGAAGCCGGGGAAAAGTTGTTTCCATATCTG
GACCCTTTCGATTCTGCGAGCCAACTCATGGAACCTGGGCGACAGAGCATGGAC
CTGGTGGAACTGGTCAGTTTATTTCCAACCCTGGCAGGCCTTGCAGGCCTCCAA
GTTCCACCTCGGTGTCCCGTTCCCTCATTCCACGTCGAACTCTGTCGCGAAGGT
AAAAACCTCCTCAAGCATTTTCGTTTTCGGGACCTCGAAGAAGACCCATACCTG
CCAGGGAATCCAAGGGAACTGATTGCCTACAGCCAGTACCCTAGACCTAGCGAC
ATCCCACAGTGGAACAGCGACAAGCCCTCCCTCAAGGACATTAAAATCATGGGT
TATAGTATCCGGACTATTGACTACAGGTATACCGTGTGGGTGGGTTTCAACCCA
GACGAATTTCTCGCCAATTTCTCCGACATCCACGCGGGCGAACTGTATTTCGTT
GATTCCGATCCACTGCAAGATCATAATATGTACAACGATAGTCAAGGGGGTGAC
CTCTTCCAGTTGCTAATGCCAGAGGCCGCTGCTAAAGAGGCTGCCGCCAAAGAA
GCCGCCGCTAAGGACTCCTCTCACGCCTTCACCCTGGACGAGCTGCGGTACTAA
I2S-MTfpep (without propep of I2S) (SEQ ID NO: 147)
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCACAGATGCACTCAACGTGCTGCTGATCATCGTAGATGACCTC
CGACCTTCTCTGGGCTGTTACGGCGACAAGCTAGTACGGAGCCCAAACATCGAC
CAGCTCGCATCGCACTCTCTCCTATTCCAGAACGCATTCGCCCAGCAGGCTGTC
TGTGCTCCCTCCCGAGTGTCCTTCCTCACGGGTCGGAGACCCGATACCACGAGG
TTATATGACTTCAACTCATACTGGCGCGTGCATGCCGGTAACTTTTCTACTATA
CCCCAGTATTTTAAAGAAAATGGCTATGTTACAATGTCCGTTGGCAAGGTATTT
CATCCTGGTATTAGCAGCAACCACACAGATGACTCTCCGTATAGCTGGTCATTC
CCACCATACCACCCCTCCAGCGAAAAGTACGAAAACACAAAGACTTGCCGGGGC
CCAGATGGCGAACTGCACGCAAATCTGCTGTGCCCTGTAGATGTCTTGGACGTG
CCCGAAGGTACTCTGCCCGACAAACAGTCCACAGAACAGGCAATCCAACTCCTT
GAAAAGATGAAAACGAGCGCGTCCCCCTTCTTCCTCGCCGTGGGCTACCACAAG
CCCCACATCCCGTTTAGATACCCCAAGGAATTTCAGAAACTGTACCCCCTGGAA
AACATCACTCTCGCGCCCGACCCCGAAGTGCCAGACGGACTCCCTCCTGTTGCC
TACAACCCTTGGATGGACATCAGACAACGTGAAGATGTGCAGGCCCTGAACATC
TCAGTGCCTTACGGCCCCATTCCAGTTGACTTCCAGAGGAAGATTCGGCAGTCC
TACTTCGCCTCCGTTAGTTACCTGGACACCCAAGTGGGTAGACTCCTGAGCGCC
TTGGACGATCTCCAGCTCGCAAACAGCACCATCATTGCCTTCACCAGCGACCAT
GGTTGGGCGCTGGGTGAACATGGAGAATGGGCTAAATATTCAAATTTCGACGTT
GCGACCCACGTCCCATTGATCTTCTACGTGCCTGGACGAACAGCCTCCTTGCCT
GAAGCCGGGGAAAAGTTGTTTCCATATCTGGACCCTTTCGATTCTGCGAGCCAA
CTCATGGAACCTGGGCGACAGAGCATGGACCTGGTGGAACTGGTCAGTTTATTT
CCAACCCTGGCAGGCCTTGCAGGCCTCCAAGTTCCACCTCGGTGTCCCGTTCCC
TCATTCCACGTCGAACTCTGTCGCGAAGGTAAAAACCTCCTCAAGCATTTTCGT
TTTCGGGACCTCGAAGAAGACCCATACCTGCCAGGGAATCCAAGGGAACTGATT
GCCTACAGCCAGTACCCTAGACCTAGCGACATCCCACAGTGGAACAGCGACAAG
CCCTCCCTCAAGGACATTAAAATCATGGGTTATAGTATCCGGACTATTGACTAC
AGGTATACCGTGTGGGTGGGTTTCAACCCAGACGAATTTCTCGCCAATTTCTCC
GACATCCACGCGGGCGAACTGTATTTCGTTGATTCCGATCCACTGCAAGATCAT
AATATGTACAACGATAGTCAAGGGGGTGACCTCTTCCAGTTGCTAATGCCAGAG
GCCGCTGCTAAAGAGGCTGCCGCCAAAGAAGCCGCCGCTAAGGACTCCTCTCAC
GCCTTCACCCTGGACGAGCTGCGGTACTAA
[0180] Thus, in certain embodiments, a polynucleotide that encodes
a fusion protein or antibody fusion described herein, or a portion
thereof, comprises one or more polynucleotide sequences from Table
7 (e.g., SEQ ID NOS:143-147), or a fragment/variant thereof.
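As a reading aid for Table 7, the sketch below translates the first 174 nucleotides shared by SEQ ID NOS:143-147. The variable and function names are hypothetical. The motif annotations in the comments are an interpretation based on widely used tag sequences (FLAG epitope DYKDDDDK, c-Myc epitope EQKLISEEDL, a ten-residue polyhistidine run, and the TEV protease recognition site ENLYFQG); this section of the source does not itself label these elements.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order; "*" marks a stop codon.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 174 nt common to the five Table 7 sequences, copied from the table.
SHARED_LEADER = (
    "ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC"
    "TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC"
    "CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG"
    "TACTTTCAGGGC"
)

leader_protein = translate(SHARED_LEADER)

# Interpretive annotations (assumptions, not labels from the source).
MOTIFS = {
    "FLAG epitope": "DYKDDDDK",
    "c-Myc epitope": "EQKLISEEDL",
    "polyhistidine": "HHHHHHHHHH",
    "TEV protease site": "ENLYFQG",
}
found = {name: leader_protein.find(motif) for name, motif in MOTIFS.items()}
```

Each value in `found` is the zero-based position of the motif in the translated leader, or -1 if the motif is absent. The same `translate` helper can be applied to a full Table 7 sequence to confirm that it reads through to a single terminal stop codon.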
[0181] In some embodiments, nucleic acids or vectors encoding a
subject p97 polypeptide, an IDS polypeptide, and/or a p97-IDS
fusion are introduced directly into a host cell, and the cell is
incubated under conditions sufficient to induce expression of the
encoded polypeptide(s). Therefore, according to certain related
embodiments, there is provided a recombinant host cell which
comprises a polynucleotide or a fusion polynucleotide that encodes
one or more fusion proteins described herein, and which optionally
comprises additional exogenous polynucleotides.
[0182] Expression of a fusion protein in the host cell may be
achieved by culturing the recombinant host cells (containing the
polynucleotide(s)) under appropriate conditions. Following
production by expression, the polypeptide(s) and/or fusion
proteins may be isolated and/or purified using any suitable
technique, and then used as desired. The term "host cell" is used
to refer to a cell into which has been introduced, or which is
capable of having introduced into it, a nucleic acid sequence
encoding one or more of the polypeptides described herein, and
which further expresses or is capable of expressing a selected gene
of interest, such as a gene encoding any herein described
polypeptide. The term includes the progeny of the parent cell,
whether or not the progeny are identical in morphology or in
genetic make-up to the original parent, so long as the selected
gene is present. Host cells may be chosen for certain
characteristics, for instance, the expression of aminoacyl tRNA
synthetase(s) that can incorporate unnatural amino acids into the
polypeptide.
[0183] Systems for cloning and expression of a protein in a variety
of different host cells are well known. Suitable host cells include
mammalian cells, bacteria, yeast, and baculovirus systems.
Mammalian cell lines available in the art for expression of a
heterologous polypeptide include Chinese hamster ovary (CHO) cells,
HeLa cells, baby hamster kidney cells, HEK-293 cells, human
fibrosarcoma cell line HT-1080 (see, e.g., Moran, Nat. Biotechnol.
28:1139-40, 2010), NS0 mouse myeloma cells and many others.
Additional examples of useful mammalian host cell lines include
monkey kidney CV1 line transformed by SV40 (COS-7, ATCC CRL 1651);
human embryonic kidney line (293 or 293 cells sub-cloned for growth
in suspension culture, Graham et al., J. Gen Virol. 36:59 (1977));
baby hamster kidney cells (BHK, ATCC CCL 10); mouse Sertoli cells
(TM4, Mather, Biol. Reprod. 23:243-251 (1980)); monkey kidney cells
(CV1 ATCC CCL 70); African green monkey kidney cells (VERO-76, ATCC
CRL-1587); human cervical carcinoma cells (HELA, ATCC CCL 2);
canine kidney cells (MDCK, ATCC CCL 34); buffalo rat liver cells
(BRL 3A, ATCC CRL 1442); human lung cells (WI-38, ATCC CCL 75);
human liver cells (Hep G2, HB 8065); mouse mammary tumor (MMT
060562, ATCC CCL51); TR1 cells (Mather et al., Annals N.Y. Acad.
Sci. 383:44-68 (1982)); MRC 5 cells; FS4 cells; and a human
hepatoma line (Hep G2). Other useful mammalian host cell lines
include Chinese hamster ovary (CHO) cells, including DHFR-CHO cells
(Urlaub et al., PNAS USA 77:4216 (1980)); and myeloma cell lines
such as NS0 and Sp2/0. For a review of certain mammalian host cell
lines suitable for polypeptide production, see, e.g., Yazaki and
Wu, Methods in Molecular Biology, Vol. 248 (B. K. C. Lo, ed., Humana
Press, Totowa, N.J., 2003), pp. 255-268. Certain preferred
mammalian cell expression systems include CHO and HEK293-cell based
expression systems. Mammalian expression systems can utilize
attached cell lines, for example, in T-flasks, roller bottles, or
cell factories, or suspension cultures, for example, in 1 L and 5 L
spinners, 5 L, 14 L, 40 L, 100 L and 200 L stir tank bioreactors,
or 20/50 L and 100/200 L WAVE bioreactors, among others known in
the art.
[0184] A common, preferred bacterial host is E. coli. The
expression of proteins in prokaryotic cells such as E. coli is well
established in the art. For a review, see for example Pluckthun, A.
Bio/Technology. 9:545-551 (1991). Expression in eukaryotic cells in
culture is also available to those skilled in the art as an option
for recombinant production of polypeptides (see Reff, Curr. Opinion
Biotech. 4:573-576, 1993; and Trill et al., Curr. Opinion Biotech.
6:553-560, 1995). In specific embodiments, protein expression may
be controlled by a T7 RNA polymerase (e.g., pET vector series).
These and related embodiments may utilize the expression host
strain BL21(DE3), a λDE3 lysogen of BL21 that supports
T7-mediated expression and is deficient in lon and ompT proteases
for improved target protein stability. Also included are expression
host strains carrying plasmids encoding tRNAs rarely used in E.
coli, such as Rosetta™ (DE3) and Rosetta 2 (DE3) strains. Cell
lysis and sample handling may also be improved using reagents such
as Benzonase® nuclease and BugBuster® Protein Extraction
Reagent. For cell culture, auto-inducing media can improve the
efficiency of many expression systems, including high-throughput
expression systems. Media of this type (e.g., Overnight Express™
Autoinduction System) gradually elicit protein expression through
metabolic shift without the addition of artificial inducing agents
such as IPTG. Particular embodiments employ hexahistidine tags
(such as His•Tag® fusions), followed by immobilized metal
affinity chromatography (IMAC) purification, or related techniques.
In certain aspects, however, clinical grade proteins can be
isolated from E. coli inclusion bodies, with or without the use
of affinity tags (see, e.g., Shimp et al., Protein Expr Purif.
50:58-67, 2006). As a further example, certain embodiments may
employ a cold-shock induced E. coli high-yield production system,
because over-expression of proteins in Escherichia coli at low
temperature improves their solubility and stability (see, e.g.,
Qing et al., Nature Biotechnology. 22:877-882, 2004).
[0185] In addition, a host cell strain may be chosen for its
ability to modulate the expression of the inserted sequences or to
process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to,
post-translational modifications such as acetylation,
carboxylation, glycosylation, phosphorylation, lipidation, and
acylation. Post-translational processing, which cleaves a "prepro"
form of the protein may also be used to facilitate correct
insertion, folding and/or function. Different host cells such as
yeast, CHO, HeLa, MDCK, HEK293, and WI-38, in addition to bacterial
cells, which have or even lack specific cellular machinery and
characteristic mechanisms for such post-translational activities,
may be chosen to ensure the correct modification and processing of
the fusion protein of interest.
[0186] For long-term, high-yield production of recombinant
proteins, stable expression is generally preferred. For example,
cell lines that stably express a polynucleotide of interest may be
transformed using expression vectors which may contain viral
origins of replication and/or endogenous expression elements and a
selectable marker gene on the same or on a separate vector.
Following the introduction of the vector, cells may be allowed to
grow for about 1-2 days in an enriched medium before they are
switched to selective media. The purpose of the selectable marker
is to confer resistance to selection, and its presence allows
growth and recovery of cells which successfully express the
introduced sequences. Resistant clones of stably transformed cells
may be proliferated using tissue culture techniques appropriate to
the cell type. Transient production, such as by transient
transfection or infection, can also be employed. Exemplary
mammalian expression systems that are suitable for transient
production include HEK293 and CHO-based systems.
[0187] Host cells transformed with a polynucleotide sequence of
interest may be cultured under conditions suitable for the
expression and recovery of the protein from cell culture. Certain
specific embodiments utilize serum free cell expression systems.
Examples include HEK293 cells and CHO cells that can grow on serum
free medium (see, e.g., Rosser et al., Protein Expr. Purif.
40:237-43, 2005; and U.S. Pat. No. 6,210,922).
[0188] The protein(s) produced by a recombinant cell can be
purified and characterized according to a variety of techniques
known in the art. Exemplary systems for performing protein
purification and analyzing protein purity include fast protein
liquid chromatography (FPLC) (e.g., AKTA and Bio-Rad FPLC systems)
and high-pressure liquid chromatography (HPLC) (e.g., Beckman and
Waters HPLC). Exemplary chemistries for purification include ion
exchange chromatography (e.g., Q, S), size exclusion
chromatography, salt gradients, affinity purification (e.g., Ni,
Co, FLAG, maltose, glutathione, protein A/G), gel filtration,
reverse-phase, ceramic HyperD® ion exchange chromatography, and
hydrophobic interaction columns (HIC), among others known in the
art. Also included are analytical methods such as SDS-PAGE (e.g.,
coomassie, silver stain), immunoblot, Bradford, and ELISA, which
may be utilized during any step of the production or purification
process, typically to measure the purity of the protein
composition.
[0189] Also included are methods of concentrating recombinantly
produced proteins, e.g., fusion proteins. Examples include
lyophilization, which is typically employed when the solution
contains few soluble components other than the protein of interest.
Lyophilization is often performed after an HPLC run, and can remove
most or all volatile components from the mixture. Also included are
ultrafiltration techniques, which typically employ one or more
selective permeable membranes to concentrate a protein solution.
The membrane allows water and small molecules to pass through and
retains the protein; the solution can be forced against the
membrane by mechanical pump, gas pressure, or centrifugation, among
other techniques.
[0190] In certain embodiments, the fusion proteins have a purity of
at least about 90%, as measured according to routine techniques in
the art. In certain embodiments, such as diagnostic compositions or
certain therapeutic compositions, the fusion proteins have a purity
of at least about 95%. In specific embodiments, such as therapeutic
or pharmaceutical compositions, the fusion proteins have a purity
of at least about 97% or 98% or 99%. In other embodiments, such as
when being used as reference or research reagents, fusion proteins
can be of lesser purity, and may have a purity of at least about
50%, 60%, 70%, or 80%. Purity can be measured overall or in
relation to selected components, such as other proteins, e.g.,
purity on a protein basis.
[0191] In certain embodiments, as noted above, the compositions
described here are substantially endotoxin free, including,
for example, about 95% endotoxin free, preferably about 99%
endotoxin free, and more preferably about 99.99% endotoxin free.
The presence of endotoxins can be detected according to routine
techniques in the art, as described herein. In specific
embodiments, the fusion proteins are made from a eukaryotic cell
such as a mammalian or human cell in substantially serum free
media.
Methods of Use and Pharmaceutical Compositions
[0192] Certain embodiments of the present invention relate to
methods of using the p97 fusion proteins described herein. Examples
of such methods include methods of treatment and methods of
diagnosis, including for instance, the use of p97 fusion proteins
for medical imaging of certain organs/tissues, such as those of the
nervous system. Some embodiments include methods of diagnosing
and/or treating disorders or conditions of the central nervous
system (CNS), or disorders or conditions having a CNS component.
Particular aspects include methods of treating a lysosomal storage
disorder (LSD), including those having a CNS component.
[0193] Accordingly, certain embodiments include methods of treating
a subject in need thereof, comprising administering a p97 fusion
protein described herein. Also included are methods of delivering
an IDS enzyme to the nervous system (e.g., central nervous system
tissues) of a subject, comprising administering a composition that
comprises a p97 fusion protein described herein. In certain of
these and related embodiments, the methods increase the rate of
delivery of the agent to the central nervous system tissues,
relative, for example, to delivery by a composition that comprises
a non-fusion IDS enzyme.
[0194] In some instances, the subject has or is at risk for having
a lysosomal storage disease. Certain methods thus relate to the
treatment of lysosomal storage diseases in a subject in need
thereof, optionally those lysosomal storage diseases associated
with the central nervous system, or having CNS involvement.
Exemplary lysosomal storage diseases include mucopolysaccharidosis
type II (Hunter Syndrome). Hunter Syndrome is an X-linked
multisystem disorder characterized by glycosaminoglycans (GAG)
accumulation. The vast majority of affected individuals are male;
on rare occasions, carrier females manifest findings. Age of onset,
disease severity, and rate of progression may vary
significantly.
[0195] In those with severe disease, CNS involvement (manifest
primarily by progressive cognitive deterioration), progressive
airway disease, and cardiac disease usually result in death in the
first or second decade of life. Certain embodiments therefore
include the treatment of Hunter Syndrome with CNS involvement.
[0196] In those with attenuated disease, the CNS is not (or is
minimally) affected, although the effect of GAG accumulation on
other organ systems may be just as severe as in those who have
progressive cognitive decline. Survival into the early adult years
with normal intelligence is common in the attenuated form of the
disease. However, subjects with attenuated disease can still
benefit from administration of a p97-IDS fusion protein having
improved penetration into CNS tissues, for instance, to reduce the
risk of progression from attenuated Hunter Syndrome to that with
CNS involvement.
[0197] Additional findings in both forms of Hunter Syndrome
include: short stature; macrocephaly with or without communicating
hydrocephalus; macroglossia; hoarse voice; conductive and
sensorineural hearing loss; hepatomegaly and/or splenomegaly;
dysostosis multiplex and joint contractures including ankylosis of
the temporomandibular joint; spinal stenosis; and carpal tunnel
syndrome. Subjects undergoing treatment with fusion proteins
described herein may thus have one or more of these findings of
Hunter Syndrome.
[0198] Urine GAGs and skeletal surveys can establish the presence
of an MPS condition but are not specific to MPS II. The gold
standard for diagnosis of MPS II in a male proband is deficient
iduronate sulfatase (IDS) enzyme activity in white cells,
fibroblasts or plasma in the presence of normal activity of at
least one other sulfatase. Molecular genetic testing of IDS, the
only gene in which mutation is known to be associated with Hunter
Syndrome, can be used to confirm the diagnosis in a male proband
with an unusual phenotype or a phenotype that does not match the
results of GAG testing.
[0199] Common treatments for Hunter Syndrome include developmental,
occupational, and physical therapy; shunting for hydrocephalus;
tonsillectomy and adenoidectomy; positive pressure ventilation
(CPAP or tracheostomy); carpal tunnel release; cardiac valve
replacement; and inguinal hernia repair. Hence, in certain aspects, a
subject for treatment by the fusion proteins described herein may
be about to undergo, may be undergoing, or may have undergone one or
more of these treatments.
[0200] Disease monitoring can depend on organ system involvement
and disease severity, and usually includes annual cardiac
evaluation and echocardiograms; pulmonary evaluations including
pulmonary function testing; audiograms; eye examinations;
developmental assessments; and neurologic examinations. Additional
studies may include sleep studies for obstructive apnea; nerve
conduction velocity (NCV) to assess for carpal tunnel syndrome;
evaluations for hydrocephalus; and orthopedic evaluations to monitor
hip disease. Thus, in some aspects, a subject for treatment by the
fusion proteins described herein may be about to undergo, may be
undergoing, or may have undergone one or more of these disease
monitoring protocols.
[0201] For in vivo use, for instance, for the treatment of human
disease, medical imaging, or testing, the p97 fusion proteins
described herein are generally incorporated into a pharmaceutical
composition prior to administration. A pharmaceutical composition
comprises one or more of the p97 fusion proteins described herein
in combination with a physiologically acceptable carrier or
excipient.
[0202] To prepare a pharmaceutical composition, an effective or
desired amount of one or more fusion proteins is mixed with any
pharmaceutical carrier(s) or excipient known to those skilled in
the art to be suitable for the particular mode of administration. A
pharmaceutical carrier may be liquid, semi-liquid or solid.
Solutions or suspensions used for parenteral, intradermal,
subcutaneous or topical application may include, for example, a
sterile diluent (such as water), saline solution (e.g., phosphate
buffered saline; PBS), fixed oil, polyethylene glycol, glycerin,
propylene glycol or other synthetic solvent; antimicrobial agents
(such as benzyl alcohol and methyl parabens); antioxidants (such as
ascorbic acid and sodium bisulfite); chelating agents (such as
ethylenediaminetetraacetic acid (EDTA)); and buffers (such as acetates,
citrates and phosphates). If administered intravenously (e.g., by
IV infusion), suitable carriers include physiological saline or
phosphate buffered saline (PBS), and solutions containing
thickening and solubilizing agents, such as glucose, polyethylene
glycol, polypropylene glycol and mixtures thereof.
[0203] Administration of fusion proteins described herein, in pure
form or in an appropriate pharmaceutical composition, can be
carried out via any of the accepted modes of administration of
agents for serving similar utilities. The pharmaceutical
compositions can be prepared by combining a fusion
protein-containing composition with an appropriate physiologically
acceptable carrier, diluent or excipient, and may be formulated
into preparations in solid, semi-solid, liquid or gaseous forms,
such as tablets, capsules, powders, granules, ointments, solutions,
suppositories, injections, inhalants, gels, microspheres, and
aerosols. In addition, other pharmaceutically active ingredients
(including other small molecules as described elsewhere herein)
and/or suitable excipients such as salts, buffers and stabilizers
may, but need not, be present within the composition.
[0204] Administration may be achieved by a variety of different
routes, including oral, parenteral, nasal, intravenous,
intradermal, subcutaneous or topical. Preferred modes of
administration depend upon the nature of the condition to be
treated or prevented. Particular embodiments include administration
by IV infusion. Some embodiments include administration by
intraperitoneal (IP) injection. Also included are combinations
thereof.
[0205] Carriers can include, for example, pharmaceutically
acceptable carriers, excipients, or stabilizers that are nontoxic
to the cell or mammal being exposed thereto at the dosages and
concentrations employed. Often the physiologically acceptable
carrier is an aqueous pH buffered solution. Examples of
physiologically acceptable carriers include buffers such as
phosphate, citrate, and other organic acids; antioxidants including
ascorbic acid; low molecular weight (less than about 10 residues)
polypeptides; proteins, such as serum albumin, gelatin, or
immunoglobulins; hydrophilic polymers such as polyvinylpyrrolidone;
amino acids such as glycine, glutamine, asparagine, arginine or
lysine; monosaccharides, disaccharides, and other carbohydrates
including glucose, mannose, or dextrins; chelating agents such as
EDTA; sugar alcohols such as mannitol or sorbitol; salt-forming
counterions such as sodium; and/or nonionic surfactants such as
polysorbate 20 (TWEEN™), polyethylene glycol (PEG), and
poloxamers (PLURONICS™), and the like.
[0206] In certain aspects, a fusion protein is bound to or
encapsulated within a particle, e.g., a nanoparticle, bead, lipid
formulation, lipid particle, or liposome, e.g., immunoliposome. The
fusion proteins may be entrapped in microcapsules prepared, for
example, by coacervation techniques or by interfacial
polymerization (for example, hydroxymethylcellulose or
gelatin-microcapsules and poly-(methylmethacrylate) microcapsules,
respectively), in colloidal drug delivery systems (for example,
liposomes, albumin microspheres, microemulsions, nano-particles and
nanocapsules), or in macroemulsions. Such techniques are disclosed
in Remington's Pharmaceutical Sciences, 16th edition, Osol, A.,
Ed. (1980). The particle(s) or liposomes may further comprise
other therapeutic or diagnostic agents.
[0207] The precise dosage and duration of treatment are a function
of the disease being treated and may be determined empirically
using known testing protocols or by testing the compositions in
model systems known in the art and extrapolating therefrom.
Controlled clinical trials may also be performed. Dosages may also
vary with the severity of the condition to be alleviated. A
pharmaceutical composition is generally formulated and administered
to exert a therapeutically useful effect while minimizing
undesirable side effects. The composition may be administered one
time, or may be divided into a number of smaller doses to be
administered at intervals of time. For any particular subject,
specific dosage regimens may be adjusted over time according to the
individual need.
[0208] Typical routes of administering these and related
pharmaceutical compositions thus include, without limitation, oral,
topical, transdermal, inhalation, parenteral, sublingual, buccal,
rectal, vaginal, and intranasal. The term parenteral as used herein
includes subcutaneous injections, intravenous, intramuscular,
intrasternal injection or infusion techniques. Pharmaceutical
compositions according to certain embodiments of the present
invention are formulated so as to allow the active ingredients
contained therein to be bioavailable upon administration of the
composition to a patient. Compositions that will be administered to
a subject or patient may take the form of one or more dosage units,
where for example, a tablet may be a single dosage unit, and a
container of a herein described conjugate in aerosol form may hold
a plurality of dosage units. Actual methods of preparing such
dosage forms are known, or will be apparent, to those skilled in
this art; for example, see Remington: The Science and Practice of
Pharmacy, 20th Edition (Philadelphia College of Pharmacy and
Science, 2000). The composition to be administered will typically
contain a therapeutically effective amount of a fusion protein
described herein, for treatment of a disease or condition of
interest.
[0209] A pharmaceutical composition may be in the form of a solid
or liquid. In one embodiment, the carrier(s) are particulate, so
that the compositions are, for example, in tablet or powder form.
The carrier(s) may be liquid, with the compositions being, for
example, an oral oil, injectable liquid or an aerosol, which is
useful in, for example, inhalatory administration. When intended
for oral administration, the pharmaceutical composition is
preferably in either solid or liquid form, where semi-solid,
semi-liquid, suspension and gel forms are included within the forms
considered herein as either solid or liquid.
[0210] As a solid composition for oral administration, the
pharmaceutical composition may be formulated into a powder,
granule, compressed tablet, pill, capsule, chewing gum, wafer or
the like. Such a solid composition will typically contain one or
more inert diluents or edible carriers. In addition, one or more of
the following may be present: binders such as
carboxymethylcellulose, ethyl cellulose, microcrystalline
cellulose, gum tragacanth or gelatin; excipients such as starch,
lactose or dextrins; disintegrating agents such as alginic acid,
sodium alginate, Primogel, corn starch and the like; lubricants
such as magnesium stearate or Sterotex; glidants such as colloidal
silicon dioxide; sweetening agents such as sucrose or saccharin; a
flavoring agent such as peppermint, methyl salicylate or orange
flavoring; and a coloring agent. When the pharmaceutical
composition is in the form of a capsule, for example, a gelatin
capsule, it may contain, in addition to materials of the above
type, a liquid carrier such as polyethylene glycol or oil.
[0211] The pharmaceutical composition may be in the form of a
liquid, for example, an elixir, syrup, solution, emulsion or
suspension. The liquid may be for oral administration or for
delivery by injection, as two examples. When intended for oral
administration, preferred compositions contain, in addition to the
present compounds, one or more of a sweetening agent,
preservatives, dye/colorant and flavor enhancer. In a composition
intended to be administered by injection, one or more of a
surfactant, preservative, wetting agent, dispersing agent,
suspending agent, buffer, stabilizer and isotonic agent may be
included.
[0212] The liquid pharmaceutical compositions, whether they be
solutions, suspensions or other like form, may include one or more
of the following adjuvants: sterile diluents such as water for
injection, saline solution, preferably physiological saline,
Ringer's solution, isotonic sodium chloride, fixed oils such as
synthetic mono or diglycerides which may serve as the solvent or
suspending medium, polyethylene glycols, glycerin, propylene glycol
or other solvents; antibacterial agents such as benzyl alcohol or
methyl paraben; antioxidants such as ascorbic acid or sodium
bisulfite; chelating agents such as ethylenediaminetetraacetic
acid; buffers such as acetates, citrates or phosphates; and agents
for the adjustment of tonicity such as sodium chloride or dextrose.
The parenteral preparation can be enclosed in ampoules, disposable
syringes or multiple dose vials made of glass or plastic.
Physiological saline is a preferred adjuvant. An injectable
pharmaceutical composition is preferably sterile.
[0213] A liquid pharmaceutical composition intended for either
parenteral or oral administration should contain an amount of a
fusion protein such that a suitable dosage will be obtained.
Typically, this amount is at least 0.01% of the agent of interest
in the composition. When intended for oral administration, this
amount may be varied to be between 0.1 and about 70% of the weight
of the composition. Certain oral pharmaceutical compositions
contain between about 4% and about 75% of the agent of interest. In
certain embodiments, pharmaceutical compositions and preparations
according to the present invention are prepared so that a
parenteral dosage unit contains between 0.01% and 10% by weight of
the agent of interest prior to dilution.
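The weight-percent ranges recited above can be illustrated with a short calculation. The following is a minimal sketch only; the function names and the example masses (50 mg of fusion protein in 5,000 mg of liquid composition) are hypothetical values chosen for illustration and are not taken from this disclosure.

```python
def weight_percent(agent_mass_mg: float, total_mass_mg: float) -> float:
    """Return the agent content as a percentage of total composition weight."""
    if total_mass_mg <= 0:
        raise ValueError("total composition mass must be positive")
    return 100.0 * agent_mass_mg / total_mass_mg


def within_parenteral_range(percent: float) -> bool:
    """Check a value against the 0.01% to 10% by weight range recited for
    a parenteral dosage unit prior to dilution."""
    return 0.01 <= percent <= 10.0


# Hypothetical example: 50 mg of agent in 5,000 mg of composition.
example_percent = weight_percent(50.0, 5000.0)
print(f"{example_percent:.2f}% by weight, "
      f"within parenteral range: {within_parenteral_range(example_percent)}")
```

The same helper can be applied to the oral ranges by substituting the corresponding bounds.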
[0214] The pharmaceutical composition may be intended for topical
administration, in which case the carrier may suitably comprise a
solution, emulsion, ointment or gel base. The base, for example,
may comprise one or more of the following: petrolatum, lanolin,
polyethylene glycols, beeswax, mineral oil, diluents such as water
and alcohol, and emulsifiers and stabilizers. Thickening agents may
be present in a pharmaceutical composition for topical
administration. If intended for transdermal administration, the
composition may include a transdermal patch or iontophoresis
device.
[0215] The pharmaceutical composition may be intended for rectal
administration, in the form, for example, of a suppository, which
will melt in the rectum and release the drug. The composition for
rectal administration may contain an oleaginous base as a suitable
nonirritating excipient. Such bases include, without limitation,
lanolin, cocoa butter, and polyethylene glycol.
[0216] The pharmaceutical composition may include various
materials, which modify the physical form of a solid or liquid
dosage unit. For example, the composition may include materials
that form a coating shell around the active ingredients. The
materials that form the coating shell are typically inert, and may
be selected from, for example, sugar, shellac, and other enteric
coating agents. Alternatively, the active ingredients may be
encased in a gelatin capsule. The pharmaceutical composition in
solid or liquid form may include an agent that binds to the
conjugate or agent and thereby assists in the delivery of the
compound. Suitable agents that may act in this capacity include
monoclonal or polyclonal antibodies, one or more proteins or a
liposome.
[0217] The pharmaceutical composition may consist essentially of
dosage units that can be administered as an aerosol. The term
aerosol is used to denote a variety of systems ranging from those
of colloidal nature to systems consisting of pressurized packages.
Delivery may be by a liquefied or compressed gas or by a suitable
pump system that dispenses the active ingredients. Aerosols may be
delivered in single phase, bi-phasic, or tri-phasic systems in
order to deliver the active ingredient(s). Delivery of the aerosol
includes the necessary container, activators, valves,
subcontainers, and the like, which together may form a kit. One of
ordinary skill in the art may, without undue experimentation,
determine preferred aerosols.
[0218] The compositions described herein may be prepared with
carriers that protect the fusion proteins against rapid elimination
from the body, such as time release formulations or coatings. Such
carriers include controlled release formulations, such as, but not
limited to, implants and microencapsulated delivery systems, and
biodegradable, biocompatible polymers, such as ethylene vinyl
acetate, polyanhydrides, polyglycolic acid, polyorthoesters,
polylactic acid and others known to those of ordinary skill in the
art.
[0219] The pharmaceutical compositions may be prepared by
methodology well known in the pharmaceutical art. For example, a
pharmaceutical composition intended to be administered by injection
may comprise one or more of salts, buffers and/or stabilizers, with
sterile, distilled water so as to form a solution. A surfactant may
be added to facilitate the formation of a homogeneous solution or
suspension. Surfactants are compounds that non-covalently interact
with the conjugate so as to facilitate dissolution or homogeneous
suspension of the conjugate in the aqueous delivery system.
[0220] The compositions may be administered in a therapeutically
effective amount, which will vary depending upon a variety of
factors including the activity of the specific compound employed;
the metabolic stability and length of action of the compound; the
age, body weight, general health, sex, and diet of the patient; the
mode and time of administration; the rate of excretion; the drug
combination; the severity of the particular disorder or condition;
and the subject undergoing therapy. Generally, a therapeutically
effective daily dose is (for a 70 kg mammal) from about 0.001 mg/kg
(i.e., 0.07 mg) to about 100 mg/kg (i.e., 7.0 g); preferably a
therapeutically effective dose is (for a 70 kg mammal) from about
0.01 mg/kg (i.e., 0.7 mg) to about 50 mg/kg (i.e., 3.5 g); more
preferably a therapeutically effective dose is (for a 70 kg mammal)
from about 1 mg/kg (i.e., 70 mg) to about 25 mg/kg (i.e., 1.75
g).
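The per-kilogram ranges above convert to total doses by multiplying by body weight. The following minimal sketch reproduces that arithmetic for the 70 kg reference mammal used in the text; the function name and the dictionary layout are illustrative assumptions, not part of the disclosure.

```python
def total_dose_mg(dose_mg_per_kg: float, body_weight_kg: float = 70.0) -> float:
    """Convert a weight-normalized dose (mg/kg) into a total dose (mg)."""
    return dose_mg_per_kg * body_weight_kg


# Dose ranges (mg/kg) as recited in the paragraph above.
DOSE_RANGES_MG_PER_KG = {
    "general": (0.001, 100.0),
    "preferred": (0.01, 50.0),
    "more preferred": (1.0, 25.0),
}

for label, (low, high) in DOSE_RANGES_MG_PER_KG.items():
    print(f"{label}: {total_dose_mg(low):g} mg to {total_dose_mg(high):g} mg")
```

Passing a different `body_weight_kg` gives the corresponding totals for a subject of another weight.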
[0221] Compositions described herein may also be administered
simultaneously with, prior to, or after administration of one or
more other therapeutic agents, as described herein. For instance,
in one embodiment, the conjugate is administered with an
anti-inflammatory agent. Anti-inflammatory agents or drugs include,
but are not limited to, steroids and glucocorticoids (including
betamethasone, budesonide, dexamethasone, hydrocortisone acetate,
hydrocortisone, methylprednisolone, prednisolone,
prednisone, triamcinolone), nonsteroidal anti-inflammatory drugs
(NSAIDs) including aspirin, ibuprofen, naproxen, methotrexate,
sulfasalazine, leflunomide, anti-TNF medications, cyclophosphamide
and mycophenolate.
[0222] Such combination therapy may include administration of a
single pharmaceutical dosage formulation, which contains a compound
of the invention (i.e., fusion protein) and one or more additional
active agents, as well as administration of compositions comprising
conjugates of the invention and each active agent in its own
separate pharmaceutical dosage formulation. For example, a fusion
protein as described herein and the other active agent can be
administered to the patient together in a single oral dosage
composition such as a tablet or capsule, or each agent administered
in separate oral dosage formulations. Similarly, a fusion protein
as described herein and the other active agent can be administered
to the patient together in a single parenteral dosage composition
such as in a saline solution or other physiologically acceptable
solution, or each agent administered in separate parenteral dosage
formulations. Where separate dosage formulations are used, the
compositions comprising fusion proteins and one or more additional
active agents can be administered at essentially the same time,
i.e., concurrently, or at separately staggered times, i.e.,
sequentially and in any order; combination therapy is understood to
include all these regimens.
[0223] The various embodiments described herein can be combined to
provide further embodiments. All of the U.S. patents, U.S. patent
application publications, U.S. patent applications, foreign patents,
foreign patent applications and non-patent publications referred to
in this specification and/or listed in the Application Data Sheet
are incorporated herein by reference, in their entirety. Aspects of
the embodiments can be modified, if necessary, to employ concepts of
the various patents, applications and publications to provide yet
further embodiments.
[0224] These and other changes can be made to the embodiments in
light of the above-detailed description. In general, in the
following claims, the terms used should not be construed to limit
the claims to the specific embodiments disclosed in the
specification and the claims, but should be construed to include
all possible embodiments along with the full scope of equivalents
to which such claims are entitled. Accordingly, the claims are not
limited by the disclosure.
EXAMPLES
Example 1
In Vitro Activity of Fusion Proteins
[0225] Fusion proteins of human p97 (melanotransferrin; MTf) and
human iduronate-2-sulfatase (IDS) were prepared and tested for
enzymatic activity in vitro. Table E1 provides the amino acid
sequences and Table E2 provides the corresponding polynucleotide
coding sequences of the fusion proteins that were prepared and
tested.
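As an illustrative aid only, the correspondence between a coding sequence in Table E2 and its polypeptide in Table E1 can be spot-checked by translation. The following minimal sketch assumes the standard genetic code; the names `translate`, `CODING_START` and `PROTEIN_START` are hypothetical. It translates the first 54 nucleotides of the I2S-MTf coding sequence (SEQ ID NO: 143) and compares the result with the first 18 residues of the I2S-MTf polypeptide (SEQ ID NO: 138).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence in frame 1; '*' marks a stop codon.

    Trailing nucleotides that do not complete a codon are ignored.
    """
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# First 54 nucleotides of SEQ ID NO: 143 (Table E2).
CODING_START = "ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC"
# First 18 residues of SEQ ID NO: 138 (Table E1).
PROTEIN_START = "MEWSWVFLFFLSVTTGVH"

print(translate(CODING_START) == PROTEIN_START)
```

The same check can be extended to a full-length construct by passing the complete coding sequence and comparing against the complete polypeptide.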
TABLE-US-00009 TABLE E1 Polypeptide Sequences of Fusion Proteins
Name: I2S-MTf (SP: TAG: PS: I2S: Linker: Soluble MTf); SEQ ID NO: 138
MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGGGGENL
YFQGSETQANSTTDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLLFQ
NAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYFKENGYV
TMSVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLL
CPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFRYPKE
FQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNISVPYGPIPVD
FQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANSTIIAFTSDHGWALGEHGEW
AKYSNFDVATHVPLIFYVPGRTASLPEAGEKLFPYLDPFDSASQLMEPGRQSMD
LVELVSLFPTLAGLAGLQVPPRCPVPSFHVELCREGKNLLKHFRFRDLEEDPYL
PGNPRELIAYSQYPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFNP
DEFLANFSDIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMPEAAAKEAAAKE
AAAKGMEVRWCATSDPEQHKCGNMSEAFREAGIQPSLLCVRGTSADHCVQLIAA
QEADAITLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSYYAVAVVRRSSHVTIDT
LKGVKSCHTGINRTVGWNVPVGYLVESGRLSVMGCDVLKAVSDYFGGSCVPGAG
ETSYSESLCRLCRGDSSGEGVCDKSPLERYYDYSGAFRCLAEGAGDVAFVKHST
VLENTDGKTLPSWGQALLSQDFELLCRDGSRADVTEWRQCHLARVPAHAVVVRA
DTDGGLIFRLLNEGQRLFSHEGSSFQMFSSEAYGQKDLLFKDSTSELVPIATQT
YEAWLGHEYLHAMKGLLCDPNRLPPYLRWCVLSTPEIQKCGDMAVAFRRQRLKP
EIQCVSAKSPQHCMERIQAEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDS
SNSYYVVAVVRRDSSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFIRP
KDCDVLTAVSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNSQERYYG
YRGAFRCLVENAGDVAFVRHTTVFDNTNGHNSEPWAAELRSEDYELLCPNGARA
EVSQFAACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQDLFGDDHNKNGFKMFDS
SNYHGQDLLFKDATVRAVPVGEKTTYRGWLGLDYVAALEGMSSQQCS
Name: MTf-I2S (SP: TAG: PS: Soluble MTf: Linker: I2S); SEQ ID NO: 139
MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGGGGENL
YFQGGMEVRWCATSDPEQHKCGNMSEAFREAGIQPSLLCVRGTSADHCVQLIAA
QEADAITLDGGAIYEAGKEHGLKPVVGEVYDQEVGTSYYAVAVVRRSSHVTIDT
LKGVKSCHTGINRTVGWNVPVGYLVESGRLSVMGCDVLKAVSDYFGGSCVPGAG
ETSYSESLCRLCRGDSSGEGVCDKSPLERYYDYSGAFRCLAEGAGDVAFVKHST
VLENTDGKTLPSWGQALLSQDFELLCRDGSRADVTEWRQCHLARVPAHAVVVRA
DTDGGLIFRLLNEGQRLFSHEGSSFQMFSSEAYGQKDLLFKDSTSELVPIATQT
YEAWLGHEYLHAMKGLLCDPNRLPPYLRWCVLSTPEIQKCGDMAVAFRRQRLKP
EIQCVSAKSPQHCMERIQAEQVDAVTLSGEDIYTAGKTYGLVPAAGEHYAPEDS
SNSYYVVAVVRRDSSHAFTLDELRGKRSCHAGFGSPAGWDVPVGALIQRGFIRP
KDCDVLTAVSEFFNASCVPVNNPKNYPSSLCALCVGDEQGRNKCVGNSQERYYG
YRGAFRCLVENAGDVAFVRHTTVFDNTNGHNSEPWAAELRSEDYELLCPNGARA
EVSQFAACNLAQIPPHAVMVRPDTNIFTVYGLLDKAQDLFGDDHNKNGFKMFDS
SNYHGQDLLFKDATVRAVPVGEKTTYRGWLGLDYVAALEGMSSQQCSEAAAKEA
AAKEAAAKSETQANSTTDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHS
LLFQNAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYFKE
NGYVTMSVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELH
ANLLCPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFR
YPKEFQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNISVPYGP
IPVDFQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANSTIIAFTSDHGWALGE
HGEWAKYSNFDVATHVPLIFYVPGRTASLPEAGEKLFPYLDPFDSASQLMEPGR
QSMDLVELVSLFPTLAGLAGLQVPPRCPVPSFHVELCREGKNLLKHFRFRDLEE
DPYLPGNPRELIAYSQYPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWV
GFNPDEFLANFSDIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMP
Name: MTfpep-I2S (SP: TAG: PS: MTfpep: Linker: I2S); SEQ ID NO: 140
MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGGGGENL
YFQGDSSHAFTLDELRYEAAAKEAAAKEAAAKSETQANSTTDALNVLLIIVDDL
RPSLGCYGDKLVRSPNIDQLASHSLLFQNAFAQQAVCAPSRVSFLTGRRPDTTR
LYDFNSYWRVHAGNFSTIPQYFKENGYVTMSVGKVFHPGISSNHTDDSPYSWSF
PPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDVPEGTLPDKQSTEQAIQLL
EKMKTSASPFFLAVGYHKPHIPFRYPKEFQKLYPLENITLAPDPEVPDGLPPVA
YNPWMDIRQREDVQALNISVPYGPIPVDFQRKIRQSYFASVSYLDTQVGRLLSA
LDDLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDVATHVPLIFYVPGRTASLP
EAGEKLFPYLDPFDSASQLMEPGRQSMDLVELVSLFPTLAGLAGLQVPPRCPVP
SFHVELCREGKNLLKHFRFRDLEEDPYLPGNPRELIAYSQYPRPSDIPQWNSDK
PSLKDIKIMGYSIRTIDYRYTVWVGFNPDEFLANFSDIHAGELYFVDSDPLQDH
NMYNDSQGGDLFQLLMP
Name: I2S-MTfpep (SP: TAG: PS: I2S: Linker: MTfpep); SEQ ID NO: 141
MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGGGGENL
YFQGSETQANSTTDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLLFQ
NAFAQQAVCAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYFKENGYV
TMSVGKVFHPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLL
CPVDVLDVPEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFRYPKE
FQKLYPLENITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNISVPYGPIPVD
FQRKIRQSYFASVSYLDTQVGRLLSALDDLQLANSTIIAFTSDHGWALGEHGEW
AKYSNFDVATHVPLIFYVPGRTASLPEAGEKLFPYLDPFDSASQLMEPGRQSMD
LVELVSLFPTLAGLAGLQVPPRCPVPSFHVELCREGKNLLKHFRFRDLEEDPYL
PGNPRELIAYSQYPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFNP
DEFLANFSDIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMPEAAAKEAAAKE
AAAKDSSHAFTLDELRY
Name: I2S-MTfpep (without propep of I2S) (SP: TAG: PS: I2S w/o propep: Linker: MTfpep); SEQ ID NO: 142
MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGGGGENL
YFQGTDALNVLLIIVDDLRPSLGCYGDKLVRSPNIDQLASHSLLFQNAFAQQAV
CAPSRVSFLTGRRPDTTRLYDFNSYWRVHAGNFSTIPQYFKENGYVTMSVGKVF
HPGISSNHTDDSPYSWSFPPYHPSSEKYENTKTCRGPDGELHANLLCPVDVLDV
PEGTLPDKQSTEQAIQLLEKMKTSASPFFLAVGYHKPHIPFRYPKEFQKLYPLE
NITLAPDPEVPDGLPPVAYNPWMDIRQREDVQALNISVPYGPIPVDFQRKIRQS
YFASVSYLDTQVGRLLSALDDLQLANSTIIAFTSDHGWALGEHGEWAKYSNFDV
ATHVPLIFYVPGRTASLPEAGEKLFPYLDPFDSASQLMEPGRQSMDLVELVSLF
PTLAGLAGLQVPPRCPVPSFHVELCREGKNLLKHFRFRDLEEDPYLPGNPRELI
AYSQYPRPSDIPQWNSDKPSLKDIKIMGYSIRTIDYRYTVWVGFNPDEFLANFS
DIHAGELYFVDSDPLQDHNMYNDSQGGDLFQLLMPEAAAKEAAAKEAAAKDSSH
AFTLDELRY
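The modular layout noted in the Name column of Table E1 (SP: TAG: PS: ... Linker: ...) can be made explicit by locating short motifs within a sequence. The following is a minimal sketch under stated assumptions: the motif labels are not given in Table E1 and are assigned here on the basis of commonly known tag and protease-site sequences; the MTfpep and linker segments are inferred by comparing the constructs in Table E1; and the function name `locate_motifs` is hypothetical. The input is the first 108 residues of MTfpep-I2S (SEQ ID NO: 140).

```python
# First 108 residues of SEQ ID NO: 140 (MTfpep-I2S), copied from Table E1.
N_TERMINUS = (
    "MEWSWVFLFFLSVTTGVHSDYKDDDDKEQKLISEEDLHHHHHHHHHHGGGGENL"
    "YFQGDSSHAFTLDELRYEAAAKEAAAKEAAAKSETQANSTTDALNVLLIIVDDL"
)

# (label, motif) pairs; the labels are assumptions, not taken from Table E1.
MOTIFS = [
    ("FLAG-type tag", "DYKDDDDK"),
    ("Myc-type tag", "EQKLISEEDL"),
    ("10x His tag", "HHHHHHHHHH"),
    ("TEV-type protease site", "ENLYFQG"),
    ("MTfpep", "DSSHAFTLDELRY"),
    ("Linker", "EAAAKEAAAKEAAAK"),
]


def locate_motifs(sequence, motifs):
    """Return (label, start, end) for the first occurrence of each motif,
    using 1-based inclusive residue positions, sorted by start position."""
    hits = []
    for label, motif in motifs:
        index = sequence.find(motif)
        if index != -1:
            hits.append((label, index + 1, index + len(motif)))
    return sorted(hits, key=lambda hit: hit[1])


for label, start, end in locate_motifs(N_TERMINUS, MOTIFS):
    print(f"{label}: residues {start}-{end}")
```

Only the first occurrence of each motif is reported, which suffices for these constructs because each listed element appears once in the N-terminal region shown.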
TABLE-US-00010 TABLE E2 Polynucleotide Coding Sequences of Fusion Constructs
Name: I2S-MTf; SEQ ID NO: 143
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCTCGGAAACTCAGGCCAACTCCACCACAGATGCACTCAACGTG
CTGCTGATCATCGTAGATGACCTCCGACCTTCTCTGGGCTGTTACGGCGACAAG
CTAGTACGGAGCCCAAACATCGACCAGCTCGCATCGCACTCTCTCCTATTCCAG
AACGCATTCGCCCAGCAGGCTGTCTGTGCTCCCTCCCGAGTGTCCTTCCTCACG
GGTCGGAGACCCGATACCACGAGGTTATATGACTTCAACTCATACTGGCGCGTG
CATGCCGGTAACTTTTCTACTATACCCCAGTATTTTAAAGAAAATGGCTATGTT
ACAATGTCCGTTGGCAAGGTATTTCATCCTGGTATTAGCAGCAACCACACAGAT
GACTCTCCGTATAGCTGGTCATTCCCACCATACCACCCCTCCAGCGAAAAGTAC
GAAAACACAAAGACTTGCCGGGGCCCAGATGGCGAACTGCACGCAAATCTGCTG
TGCCCTGTAGATGTCTTGGACGTGCCCGAAGGTACTCTGCCCGACAAACAGTCC
ACAGAACAGGCAATCCAACTCCTTGAAAAGATGAAAACGAGCGCGTCCCCCTTC
TTCCTCGCCGTGGGCTACCACAAGCCCCACATCCCGTTTAGATACCCCAAGGAA
TTTCAGAAACTGTACCCCCTGGAAAACATCACTCTCGCGCCCGACCCCGAAGTG
CCAGACGGACTCCCTCCTGTTGCCTACAACCCTTGGATGGACATCAGACAACGT
GAAGATGTGCAGGCCCTGAACATCTCAGTGCCTTACGGCCCCATTCCAGTTGAC
TTCCAGAGGAAGATTCGGCAGTCCTACTTCGCCTCCGTTAGTTACCTGGACACC
CAAGTGGGTAGACTCCTGAGCGCCTTGGACGATCTCCAGCTCGCAAACAGCACC
ATCATTGCCTTCACCAGCGACCATGGTTGGGCGCTGGGTGAACATGGAGAATGG
GCTAAATATTCAAATTTCGACGTTGCGACCCACGTCCCATTGATCTTCTACGTG
CCTGGACGAACAGCCTCCTTGCCTGAAGCCGGGGAAAAGTTGTTTCCATATCTG
GACCCTTTCGATTCTGCGAGCCAACTCATGGAACCTGGGCGACAGAGCATGGAC
CTGGTGGAACTGGTCAGTTTATTTCCAACCCTGGCAGGCCTTGCAGGCCTCCAA
GTTCCACCTCGGTGTCCCGTTCCCTCATTCCACGTCGAACTCTGTCGCGAAGGT
AAAAACCTCCTCAAGCATTTTCGTTTTCGGGACCTCGAAGAAGACCCATACCTG
CCAGGGAATCCAAGGGAACTGATTGCCTACAGCCAGTACCCTAGACCTAGCGAC
ATCCCACAGTGGAACAGCGACAAGCCCTCCCTCAAGGACATTAAAATCATGGGT
TATAGTATCCGGACTATTGACTACAGGTATACCGTGTGGGTGGGTTTCAACCCA
GACGAATTTCTCGCCAATTTCTCCGACATCCACGCGGGCGAACTGTATTTCGTT
GATTCCGATCCACTGCAAGATCATAATATGTACAACGATAGTCAAGGGGGTGAC
CTCTTCCAGTTGCTAATGCCAGAAGCCGCCGCGAAAGAAGCCGCCGCAAAAGAA
GCCGCTGCCAAAGGCATGGAAGTGCGTTGGTGCGCCACCTCTGACCCCGAGCAG
CACAAGTGCGGCAACATGTCCGAGGCCTTCAGAGAGGCCGGCATCCAGCCTTCT
CTGCTGTGTGTGCGGGGCACCTCTGCCGACCATTGCGTGCAGCTGATCGCCGCC
CAGGAAGCCGACGCTATCACACTGGATGGCGGCGCTATCTACGAGGCTGGCAAA
GAGCACGGCCTGAAGCCCGTCGTGGGCGAGGTGTACGATCAGGAAGTGGGCACC
TCCTACTACGCCGTGGCTGTCGTGCGGAGATCCTCCCACGTGACCATCGACACC
CTGAAGGGCGTGAAGTCCTGCCACACCGGCATCAACAGAACCGTGGGCTGGAAC
GTGCCCGTGGGCTACCTGGTGGAATCCGGCAGACTGTCCGTGATGGGCTGCGAC
GTGCTGAAGGCCGTGTCCGATTACTTCGGCGGCTCTTGTGTGCCTGGCGCTGGC
GAGACATCCTACTCCGAGTCCCTGTGCAGACTGTGCAGGGGCGACTCTTCTGGC
GAGGGCGTGTGCGACAAGTCCCCTCTGGAACGGTACTACGACTACTCCGGCGCC
TTCAGATGCCTGGCTGAAGGTGCTGGCGACGTGGCCTTCGTGAAGCACTCCACC
GTGCTGGAAAACACCGACGGCAAGACCCTGCCTTCTTGGGGCCAGGCACTGCTG
TCCCAGGACTTCGAGCTGCTGTGCCGGGATGGCTCCAGAGCCGATGTGACAGAG
TGGCGGCAGTGCCACCTGGCCAGAGTGCCTGCTCATGCTGTGGTCGTGCGCGCC
GATACAGATGGCGGCCTGATCTTCCGGCTGCTGAACGAGGGCCAGCGGCTGTTC
TCTCACGAGGGCTCCAGCTTCCAGATGTTCTCCAGCGAGGCCTACGGCCAGAAG
GACCTGCTGTTCAAGGACTCCACCTCCGAGCTGGTGCCTATCGCCACCCAGACC
TATGAGGCTTGGCTGGGCCACGAGTACCTGCACGCTATGAAGGGACTGCTGTGC
GACCCCAACCGGCTGCCTCCTTATCTGAGGTGGTGCGTGCTGTCCACCCCCGAG
ATCCAGAAATGCGGCGATATGGCCGTGGCCTTTCGGCGGCAGAGACTGAAGCCT
GAGATCCAGTGCGTGTCCGCCAAGAGCCCTCAGCACTGCATGGAACGGATCCAG
GCCGAACAGGTGGACGCCGTGACACTGTCCGGCGAGGATATCTACACCGCCGGA
AAGACCTACGGCCTGGTGCCAGCTGCTGGCGAGCATTACGCCCCTGAGGACTCC
TCCAACAGCTACTACGTGGTGGCAGTCGTGCGCCGGGACTCCTCTCACGCCTTT
ACCCTGGATGAGCTGCGGGGCAAGAGAAGCTGTCACGCCGGCTTTGGAAGCCCT
GCCGGATGGGATGTGCCTGTGGGCGCTCTGATCCAGCGGGGCTTCATCAGACCC
AAGGACTGTGATGTGCTGACCGCCGTGTCTGAGTTCTTCAACGCCTCCTGTGTG
CCCGTGAACAACCCCAAGAACTACCCCTCCAGCCTGTGCGCCCTGTGTGTGGGA
GATGAGCAGGGCCGGAACAAATGCGTGGGCAACTCCCAGGAAAGATATTACGGC
TACAGAGGCGCCTTCCGGTGTCTGGTGGAAAACGCCGGGGATGTGGCTTTTGTG
CGGCACACCACCGTGTTCGACAACACCAATGGCCACAACTCCGAGCCTTGGGCC
GCTGAGCTGAGATCCGAGGATTACGAACTGCTGTGTCCCAACGGCGCCAGGGCT
GAGGTGTCCCAGTTTGCCGCCTGTAACCTGGCCCAGATCCCTCCCCACGCTGTG
ATGGTGCGACCCGACACCAACATCTTCACCGTGTACGGCCTGCTGGACAAGGCC
CAGGATCTGTTCGGCGACGACCACAACAAGAACGGGTTCAAGATGTTCGACTCC
AGCAACTACCACGGACAGGATCTGCTGTTTAAAGATGCCACCGTGCGGGCCGTG
CCAGTGGGCGAAAAGACCACCTACAGAGGATGGCTGGGACTGGACTACGTGGCC
GCCCTGGAAGGCATGTCCTCCCAGCAGTGTTCCTGA
Name: MTf-I2S; SEQ ID NO: 144
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCGGCATGGAAGTGCGTTGGTGCGCCACCTCTGACCCCGAGCAG
CACAAGTGCGGCAACATGTCCGAGGCCTTCAGAGAGGCCGGCATCCAGCCTTCT
CTGCTGTGTGTGCGGGGCACCTCTGCCGACCATTGCGTGCAGCTGATCGCCGCC
CAGGAAGCCGACGCTATCACACTGGATGGCGGCGCTATCTACGAGGCTGGCAAA
GAGCACGGCCTGAAGCCCGTCGTGGGCGAGGTGTACGATCAGGAAGTGGGCACC
TCCTACTACGCCGTGGCTGTCGTGCGGAGATCCTCCCACGTGACCATCGACACC
CTGAAGGGCGTGAAGTCCTGCCACACCGGCATCAACAGAACCGTGGGCTGGAAC
GTGCCCGTGGGCTACCTGGTGGAATCCGGCAGACTGTCCGTGATGGGCTGCGAC
GTGCTGAAGGCCGTGTCCGATTACTTCGGCGGCTCTTGTGTGCCTGGCGCTGGC
GAGACATCCTACTCCGAGTCCCTGTGCAGACTGTGCAGGGGCGACTCTTCTGGC
GAGGGCGTGTGCGACAAGTCCCCTCTGGAACGGTACTACGACTACTCCGGCGCC
TTCAGATGCCTGGCTGAAGGTGCTGGCGACGTGGCCTTCGTGAAGCACTCCACC
GTGCTGGAAAACACCGACGGCAAGACCCTGCCTTCTTGGGGCCAGGCACTGCTG
TCCCAGGACTTCGAGCTGCTGTGCCGGGATGGCTCCAGAGCCGATGTGACAGAG
TGGCGGCAGTGCCACCTGGCCAGAGTGCCTGCTCATGCTGTGGTCGTGCGCGCC
GATACAGATGGCGGCCTGATCTTCCGGCTGCTGAACGAGGGCCAGCGGCTGTTC
TCTCACGAGGGCTCCAGCTTCCAGATGTTCTCCAGCGAGGCCTACGGCCAGAAG
GACCTGCTGTTCAAGGACTCCACCTCCGAGCTGGTGCCTATCGCCACCCAGACC
TATGAGGCTTGGCTGGGCCACGAGTACCTGCACGCTATGAAGGGACTGCTGTGC
GACCCCAACCGGCTGCCTCCTTATCTGAGGTGGTGCGTGCTGTCCACCCCCGAG
ATCCAGAAATGCGGCGATATGGCCGTGGCCTTTCGGCGGCAGAGACTGAAGCCT
GAGATCCAGTGCGTGTCCGCCAAGAGCCCTCAGCACTGCATGGAACGGATCCAG
GCCGAACAGGTGGACGCCGTGACACTGTCCGGCGAGGATATCTACACCGCCGGA
AAGACCTACGGCCTGGTGCCAGCTGCTGGCGAGCATTACGCCCCTGAGGACTCC
TCCAACAGCTACTACGTGGTGGCAGTCGTGCGCCGGGACTCCTCTCACGCCTTT
ACCCTGGATGAGCTGCGGGGCAAGAGAAGCTGTCACGCCGGCTTTGGAAGCCCT
GCCGGATGGGATGTGCCTGTGGGCGCTCTGATCCAGCGGGGCTTCATCAGACCC
AAGGACTGTGATGTGCTGACCGCCGTGTCTGAGTTCTTCAACGCCTCCTGTGTG
CCCGTGAACAACCCCAAGAACTACCCCTCCAGCCTGTGCGCCCTGTGTGTGGGA
GATGAGCAGGGCCGGAACAAATGCGTGGGCAACTCCCAGGAAAGATATTACGGC
TACAGAGGCGCCTTCCGGTGTCTGGTGGAAAACGCCGGGGATGTGGCTTTTGTG
CGGCACACCACCGTGTTCGACAACACCAATGGCCACAACTCCGAGCCTTGGGCC
GCTGAGCTGAGATCCGAGGATTACGAACTGCTGTGTCCCAACGGCGCCAGGGCT
GAGGTGTCCCAGTTTGCCGCCTGTAACCTGGCCCAGATCCCTCCCCACGCTGTG
ATGGTGCGACCCGACACCAACATCTTCACCGTGTACGGCCTGCTGGACAAGGCC
CAGGATCTGTTCGGCGACGACCACAACAAGAACGGGTTCAAGATGTTCGACTCC
AGCAACTACCACGGACAGGATCTGCTGTTTAAAGATGCCACCGTGCGGGCCGTG
CCAGTGGGCGAAAAGACCACCTACAGAGGATGGCTGGGACTGGACTACGTGGCC
GCCCTGGAAGGCATGTCCTCCCAGCAGTGTTCCGAAGCCGCCGCGAAAGAAGCC
GCCGCAAAAGAAGCCGCTGCCAAATCGGAAACTCAGGCCAACTCCACCACAGAT
GCACTCAACGTGCTGCTGATCATCGTAGATGACCTCCGACCTTCTCTGGGCTGT
TACGGCGACAAGCTAGTACGGAGCCCAAACATCGACCAGCTCGCATCGCACTCT
CTCCTATTCCAGAACGCATTCGCCCAGCAGGCTGTCTGTGCTCCCTCCCGAGTG
TCCTTCCTCACGGGTCGGAGACCCGATACCACGAGGTTATATGACTTCAACTCA
TACTGGCGCGTGCATGCCGGTAACTTTTCTACTATACCCCAGTATTTTAAAGAA
AATGGCTATGTTACAATGTCCGTTGGCAAGGTATTTCATCCTGGTATTAGCAGC
AACCACACAGATGACTCTCCGTATAGCTGGTCATTCCCACCATACCACCCCTCC
AGCGAAAAGTACGAAAACACAAAGACTTGCCGGGGCCCAGATGGCGAACTGCAC
GCAAATCTGCTGTGCCCTGTAGATGTCTTGGACGTGCCCGAAGGTACTCTGCCC
GACAAACAGTCCACAGAACAGGCAATCCAACTCCTTGAAAAGATGAAAACGAGC
GCGTCCCCCTTCTTCCTCGCCGTGGGCTACCACAAGCCCCACATCCCGTTTAGA
TACCCCAAGGAATTTCAGAAACTGTACCCCCTGGAAAACATCACTCTCGCGCCC
GACCCCGAAGTGCCAGACGGACTCCCTCCTGTTGCCTACAACCCTTGGATGGAC
ATCAGACAACGTGAAGATGTGCAGGCCCTGAACATCTCAGTGCCTTACGGCCCC
ATTCCAGTTGACTTCCAGAGGAAGATTCGGCAGTCCTACTTCGCCTCCGTTAGT
TACCTGGACACCCAAGTGGGTAGACTCCTGAGCGCCTTGGACGATCTCCAGCTC
GCAAACAGCACCATCATTGCCTTCACCAGCGACCATGGTTGGGCGCTGGGTGAA
CATGGAGAATGGGCTAAATATTCAAATTTCGACGTTGCGACCCACGTCCCATTG
ATCTTCTACGTGCCTGGACGAACAGCCTCCTTGCCTGAAGCCGGGGAAAAGTTG
TTTCCATATCTGGACCCTTTCGATTCTGCGAGCCAACTCATGGAACCTGGGCGA
CAGAGCATGGACCTGGTGGAACTGGTCAGTTTATTTCCAACCCTGGCAGGCCTT
GCAGGCCTCCAAGTTCCACCTCGGTGTCCCGTTCCCTCATTCCACGTCGAACTC
TGTCGCGAAGGTAAAAACCTCCTCAAGCATTTTCGTTTTCGGGACCTCGAAGAA
GACCCATACCTGCCAGGGAATCCAAGGGAACTGATTGCCTACAGCCAGTACCCT
AGACCTAGCGACATCCCACAGTGGAACAGCGACAAGCCCTCCCTCAAGGACATT
AAAATCATGGGTTATAGTATCCGGACTATTGACTACAGGTATACCGTGTGGGTG
GGTTTCAACCCAGACGAATTTCTCGCCAATTTCTCCGACATCCACGCGGGCGAA
CTGTATTTCGTTGATTCCGATCCACTGCAAGATCATAATATGTACAACGATAGT
CAAGGGGGTGACCTCTTCCAGTTGCTAATGCCATGA MTfpep-
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC 145 I2S
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCGACTCCTCTCACGCCTTCACCCTGGACGAGCTGCGGTACGAA
GCCGCCGCGAAAGAAGCCGCCGCAAAAGAAGCCGCTGCCAAATCGGAAACTCAG
GCCAACTCCACCACAGATGCACTCAACGTGCTGCTGATCATCGTAGATGACCTC
CGACCTTCTCTGGGCTGTTACGGCGACAAGCTAGTACGGAGCCCAAACATCGAC
CAGCTCGCATCGCACTCTCTCCTATTCCAGAACGCATTCGCCCAGCAGGCTGTC
TGTGCTCCCTCCCGAGTGTCCTTCCTCACGGGTCGGAGACCCGATACCACGAGG
TTATATGACTTCAACTCATACTGGCGCGTGCATGCCGGTAACTTTTCTACTATA
CCCCAGTATTTTAAAGAAAATGGCTATGTTACAATGTCCGTTGGCAAGGTATTT
CATCCTGGTATTAGCAGCAACCACACAGATGACTCTCCGTATAGCTGGTCATTC
CCACCATACCACCCCTCCAGCGAAAAGTACGAAAACACAAAGACTTGCCGGGGC
CCAGATGGCGAACTGCACGCAAATCTGCTGTGCCCTGTAGATGTCTTGGACGTG
CCCGAAGGTACTCTGCCCGACAAACAGTCCACAGAACAGGCAATCCAACTCCTT
GAAAAGATGAAAACGAGCGCGTCCCCCTTCTTCCTCGCCGTGGGCTACCACAAG
CCCCACATCCCGTTTAGATACCCCAAGGAATTTCAGAAACTGTACCCCCTGGAA
AACATCACTCTCGCGCCCGACCCCGAAGTGCCAGACGGACTCCCTCCTGTTGCC
TACAACCCTTGGATGGACATCAGACAACGTGAAGATGTGCAGGCCCTGAACATC
TCAGTGCCTTACGGCCCCATTCCAGTTGACTTCCAGAGGAAGATTCGGCAGTCC
TACTTCGCCTCCGTTAGTTACCTGGACACCCAAGTGGGTAGACTCCTGAGCGCC
TTGGACGATCTCCAGCTCGCAAACAGCACCATCATTGCCTTCACCAGCGACCAT
GGTTGGGCGCTGGGTGAACATGGAGAATGGGCTAAATATTCAAATTTCGACGTT
GCGACCCACGTCCCATTGATCTTCTACGTGCCTGGACGAACAGCCTCCTTGCCT
GAAGCCGGGGAAAAGTTGTTTCCATATCTGGACCCTTTCGATTCTGCGAGCCAA
CTCATGGAACCTGGGCGACAGAGCATGGACCTGGTGGAACTGGTCAGTTTATTT
CCAACCCTGGCAGGCCTTGCAGGCCTCCAAGTTCCACCTCGGTGTCCCGTTCCC
TCATTCCACGTCGAACTCTGTCGCGAAGGTAAAAACCTCCTCAAGCATTTTCGT
TTTCGGGACCTCGAAGAAGACCCATACCTGCCAGGGAATCCAAGGGAACTGATT
GCCTACAGCCAGTACCCTAGACCTAGCGACATCCCACAGTGGAACAGCGACAAG
CCCTCCCTCAAGGACATTAAAATCATGGGTTATAGTATCCGGACTATTGACTAC
AGGTATACCGTGTGGGTGGGTTTCAACCCAGACGAATTTCTCGCCAATTTCTCC
GACATCCACGCGGGCGAACTGTATTTCGTTGATTCCGATCCACTGCAAGATCAT
AATATGTACAACGATAGTCAAGGGGGTGACCTCTTCCAGTTGCTAATGCCATGA I2S-
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC 146 MTfpep
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG
TACTTTCAGGGCTCGGAAACTCAGGCCAACTCCACCACAGATGCACTCAACGTG
CTGCTGATCATCGTAGATGACCTCCGACCTTCTCTGGGCTGTTACGGCGACAAG
CTAGTACGGAGCCCAAACATCGACCAGCTCGCATCGCACTCTCTCCTATTCCAG
AACGCATTCGCCCAGCAGGCTGTCTGTGCTCCCTCCCGAGTGTCCTTCCTCACG
GGTCGGAGACCCGATACCACGAGGTTATATGACTTCAACTCATACTGGCGCGTG
CATGCCGGTAACTTTTCTACTATACCCCAGTATTTTAAAGAAAATGGCTATGTT
ACAATGTCCGTTGGCAAGGTATTTCATCCTGGTATTAGCAGCAACCACACAGAT
GACTCTCCGTATAGCTGGTCATTCCCACCATACCACCCCTCCAGCGAAAAGTAC
GAAAACACAAAGACTTGCCGGGGCCCAGATGGCGAACTGCACGCAAATCTGCTG
TGCCCTGTAGATGTCTTGGACGTGCCCGAAGGTACTCTGCCCGACAAACAGTCC
ACAGAACAGGCAATCCAACTCCTTGAAAAGATGAAAACGAGCGCGTCCCCCTTC
TTCCTCGCCGTGGGCTACCACAAGCCCCACATCCCGTTTAGATACCCCAAGGAA
TTTCAGAAACTGTACCCCCTGGAAAACATCACTCTCGCGCCCGACCCCGAAGTG
CCAGACGGACTCCCTCCTGTTGCCTACAACCCTTGGATGGACATCAGACAACGT
GAAGATGTGCAGGCCCTGAACATCTCAGTGCCTTACGGCCCCATTCCAGTTGAC
TTCCAGAGGAAGATTCGGCAGTCCTACTTCGCCTCCGTTAGTTACCTGGACACC
CAAGTGGGTAGACTCCTGAGCGCCTTGGACGATCTCCAGCTCGCAAACAGCACC
ATCATTGCCTTCACCAGCGACCATGGTTGGGCGCTGGGTGAACATGGAGAATGG
GCTAAATATTCAAATTTCGACGTTGCGACCCACGTCCCATTGATCTTCTACGTG
CCTGGACGAACAGCCTCCTTGCCTGAAGCCGGGGAAAAGTTGTTTCCATATCTG
GACCCTTTCGATTCTGCGAGCCAACTCATGGAACCTGGGCGACAGAGCATGGAC
CTGGTGGAACTGGTCAGTTTATTTCCAACCCTGGCAGGCCTTGCAGGCCTCCAA
GTTCCACCTCGGTGTCCCGTTCCCTCATTCCACGTCGAACTCTGTCGCGAAGGT
AAAAACCTCCTCAAGCATTTTCGTTTTCGGGACCTCGAAGAAGACCCATACCTG
CCAGGGAATCCAAGGGAACTGATTGCCTACAGCCAGTACCCTAGACCTAGCGAC
ATCCCACAGTGGAACAGCGACAAGCCCTCCCTCAAGGACATTAAAATCATGGGT
TATAGTATCCGGACTATTGACTACAGGTATACCGTGTGGGTGGGTTTCAACCCA
GACGAATTTCTCGCCAATTTCTCCGACATCCACGCGGGCGAACTGTATTTCGTT
GATTCCGATCCACTGCAAGATCATAATATGTACAACGATAGTCAAGGGGGTGAC
CTCTTCCAGTTGCTAATGCCAGAGGCCGCTGCTAAAGAGGCTGCCGCCAAAGAA
GCCGCCGCTAAGGACTCCTCTCACGCCTTCACCCTGGACGAGCTGCGGTACTAA I2S-
ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC 147 MTfpep
TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC (without
CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG propep of
TACTTTCAGGGCACAGATGCACTCAACGTGCTGCTGATCATCGTAGATGACCTC I2S)
CGACCTTCTCTGGGCTGTTACGGCGACAAGCTAGTACGGAGCCCAAACATCGAC
CAGCTCGCATCGCACTCTCTCCTATTCCAGAACGCATTCGCCCAGCAGGCTGTC
TGTGCTCCCTCCCGAGTGTCCTTCCTCACGGGTCGGAGACCCGATACCACGAGG
TTATATGACTTCAACTCATACTGGCGCGTGCATGCCGGTAACTTTTCTACTATA
CCCCAGTATTTTAAAGAAAATGGCTATGTTACAATGTCCGTTGGCAAGGTATTT
CATCCTGGTATTAGCAGCAACCACACAGATGACTCTCCGTATAGCTGGTCATTC
CCACCATACCACCCCTCCAGCGAAAAGTACGAAAACACAAAGACTTGCCGGGGC
CCAGATGGCGAACTGCACGCAAATCTGCTGTGCCCTGTAGATGTCTTGGACGTG
CCCGAAGGTACTCTGCCCGACAAACAGTCCACAGAACAGGCAATCCAACTCCTT
GAAAAGATGAAAACGAGCGCGTCCCCCTTCTTCCTCGCCGTGGGCTACCACAAG
CCCCACATCCCGTTTAGATACCCCAAGGAATTTCAGAAACTGTACCCCCTGGAA
AACATCACTCTCGCGCCCGACCCCGAAGTGCCAGACGGACTCCCTCCTGTTGCC
TACAACCCTTGGATGGACATCAGACAACGTGAAGATGTGCAGGCCCTGAACATC
TCAGTGCCTTACGGCCCCATTCCAGTTGACTTCCAGAGGAAGATTCGGCAGTCC
TACTTCGCCTCCGTTAGTTACCTGGACACCCAAGTGGGTAGACTCCTGAGCGCC
TTGGACGATCTCCAGCTCGCAAACAGCACCATCATTGCCTTCACCAGCGACCAT
GGTTGGGCGCTGGGTGAACATGGAGAATGGGCTAAATATTCAAATTTCGACGTT
GCGACCCACGTCCCATTGATCTTCTACGTGCCTGGACGAACAGCCTCCTTGCCT
GAAGCCGGGGAAAAGTTGTTTCCATATCTGGACCCTTTCGATTCTGCGAGCCAA
CTCATGGAACCTGGGCGACAGAGCATGGACCTGGTGGAACTGGTCAGTTTATTT
CCAACCCTGGCAGGCCTTGCAGGCCTCCAAGTTCCACCTCGGTGTCCCGTTCCC
TCATTCCACGTCGAACTCTGTCGCGAAGGTAAAAACCTCCTCAAGCATTTTCGT
TTTCGGGACCTCGAAGAAGACCCATACCTGCCAGGGAATCCAAGGGAACTGATT
GCCTACAGCCAGTACCCTAGACCTAGCGACATCCCACAGTGGAACAGCGACAAG
CCCTCCCTCAAGGACATTAAAATCATGGGTTATAGTATCCGGACTATTGACTAC
AGGTATACCGTGTGGGTGGGTTTCAACCCAGACGAATTTCTCGCCAATTTCTCC
GACATCCACGCGGGCGAACTGTATTTCGTTGATTCCGATCCACTGCAAGATCAT
AATATGTACAACGATAGTCAAGGGGGTGACCTCTTCCAGTTGCTAATGCCAGAG
GCCGCTGCTAAAGAGGCTGCCGCCAAAGAAGCCGCCGCTAAGGACTCCTCTCAC
GCCTTCACCCTGGACGAGCTGCGGTACTAA
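The constructs of SEQ ID NOs: 144-147 above begin with the same 174 nucleotides. As a reading-frame check, the following minimal Python sketch translates that shared 5' region with the standard genetic code. The function and variable names are illustrative only and are not part of the disclosure. Reading the translated motifs as a FLAG tag (DYKDDDDK), a myc tag (EQKLISEEDL), a ten-histidine tag, and a TEV protease site (ENLYFQG) is an assumption based on commonly used tag sequences, not a statement from the table itself.

```python
# Standard genetic code, built from the conventional T, C, A, G codon ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(dna):
    """Translate an in-frame DNA string into one-letter amino acid codes."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 174 nucleotides shared by the constructs of SEQ ID NOs: 144-147,
# copied from the table above (three full rows plus the first 12 nt of the fourth).
SHARED_5PRIME = (
    "ATGGAATGGAGCTGGGTCTTTCTCTTCTTCCTGTCAGTAACGACTGGTGTCCAC"
    "TCCGACTACAAGGACGACGACGACAAAGAGCAGAAGCTGATCTCCGAAGAGGAC"
    "CTGCACCACCATCATCACCATCACCACCATCACGGAGGCGGTGGAGAGAACCTG"
    "TACTTTCAGGGC"
)

protein = translate(SHARED_5PRIME)
print(protein)
```

The translation reads MEWSWVFLFFLSVTTGVHS, DYKDDDDK, EQKLISEEDL, ten His residues, GGGG, and ENLYFQG in a single open frame, so the downstream MTf or I2S coding region begins in frame immediately after the final glycine codon.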
[0226] Recombinant proteins were prepared and tested for enzymatic
activity against the substrate 4-Nitrocatechol Sulfate (PNCS)
relative to recombinant human IDS and a negative control
(trastuzumab-MTf fusion). The results are shown in FIGS. 2-4. One μg
of each sample was used in the enzyme activity assay, and the data
presented are normalized to substrate blank.
[0227] FIG. 2 shows the enzyme activity evaluation of I2S-MTf and
MTf-I2S fusion proteins as measured by their ability to hydrolyze
the substrate 4-Nitrocatechol Sulfate (PNCS) relative to
recombinant human IDS and negative control (TZM-MTf fusion). These
data show that the I2S-MTf and MTf-I2S fusion proteins not only had
significant enzymatic activity, but also had increased enzymatic
activity relative to wild-type (non-fusion) human IDS.
[0228] FIG. 3 shows the enzyme activity evaluation of MTfpep-I2S
and I2S-MTfpep (with I2S propeptide) fusion proteins as measured by
their ability to hydrolyze the substrate PNCS relative to I2S-MTf
fusion and negative control (TZM-MTf fusion). These data show that
the MTfpep-I2S and I2S-MTfpep fusion proteins not only had
significant enzymatic activity, but also had increased enzymatic
activity relative to the significantly active I2S-MTf fusion
protein (from FIG. 2), and thus increased enzymatic activity
relative to wild-type (non-fusion) human IDS.
[0229] FIG. 4 shows a comparison of the enzyme activity of
I2S-MTfpep (with I2S propeptide) and I2S-MTfpep (without I2S
propeptide) fusion proteins as measured by their ability to
hydrolyze the substrate PNCS. These data show that the MTfpep-I2S
and I2S-MTfpep fusion proteins not only had significant enzymatic
activity, but also had increased enzymatic activity relative to
wild-type (non-fusion) human IDS.
Example 2
In Vivo Distribution of I2S-MTf and MTfpep-I2S Fusions in the
Brain
[0230] The brain biodistribution of the I2S-MTf and MTfpep-I2S
fusion proteins in mice was evaluated by quantitative confocal
microscopy imaging. Therapeutic dose equivalents of I2S-MTf and
MTfpep-I2S were administered in a 100 μL volume to mice via tail
vein injection. Prior to euthanasia, mice were injected (i.v.) with
Tomato Lectin-FITC (40 μg) for 10 min to stain the brain
vasculature. Blood was cleared by intracardiac perfusion of 10 ml
heparinised saline at a rate of 1 ml per minute. Brains were
excised, frozen in OCT, and stored at -80° C. Brains were
mounted in Tissue Tek and sectioned with a cryostat at -20° C.
Sections were mounted on Superfrost Plus microscope slides,
fixed in cold Acetone/MeOH (1:1) for 10 minutes at room
temperature, and washed with PBS. Glass coverslips were mounted on
sections using Prolong Gold antifade reagent with DAPI (Molecular
Probes, P36931). Three-dimensional (3D) confocal microscopy and
quantitative analysis were performed.
[0231] FIG. 5 shows quantification of the relative distribution of
MTfpep-I2S (with propeptide) and I2S-MTf fusion proteins between
capillaries (C) and parenchyma (P) in the brain, relative to the
total (T) signal. The significant staining of parenchymal tissues
relative to capillaries confirms that the MTfpep-I2S and I2S-MTf
fusion proteins were both able to cross the blood-brain barrier
(BBB) and accumulate in tissues of the central nervous system.
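The relative distribution described for FIG. 5 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the total signal (T) is the sum of the capillary (C) and parenchymal (P) signals; the source does not state how T was computed, and the function name and the input values are hypothetical.

```python
def relative_distribution(capillary_signal, parenchyma_signal):
    """Return the capillary and parenchymal fractions of the total signal.

    Assumption: total signal T = C + P (not specified in the source).
    """
    total = capillary_signal + parenchyma_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {
        "C/T": capillary_signal / total,
        "P/T": parenchyma_signal / total,
    }


# Hypothetical integrated fluorescence values (arbitrary units).
fractions = relative_distribution(capillary_signal=25.0, parenchyma_signal=75.0)
print(fractions)
```

Under this assumption, a P/T fraction well above the C/T fraction corresponds to the predominantly parenchymal staining described above.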
[0232] In summary, the data from Examples 1 and 2 show that the
MTfpep-I2S (with propeptide) and I2S-MTf fusion proteins are not
only able to cross the BBB and accumulate in tissues of the CNS,
but also have significantly increased enzymatic activity relative
to wild-type (non-fusion) recombinant human IDS.
Example 3
In Vivo Activity of I2S-MTf and MTfpep-I2S Fusions in Mouse Model
of MPS II
[0233] The therapeutic efficacy of the I2S-MTf and MTfpep-I2S
fusion proteins is evaluated in a mouse model of Hunter Syndrome
(Mucopolysaccharidosis type II; MPS II) relative to Idursulfase
(Elaprase®), which is indicated for the treatment of Hunter
Syndrome. These studies are designed to evaluate the effect of
intravenous (IV) and intraperitoneal (IP) administration of the
fusion proteins on brain pathology in a knock-out mouse model of
MPS II.
[0234] Hunter Syndrome.
[0235] As noted above, Hunter Syndrome is an X-linked recessive
disease caused by insufficient levels of the lysosomal enzyme
iduronate 2-sulfatase (IDS). This enzyme cleaves the terminal
2-O-sulfate moieties from the glycosaminoglycans (GAG)
dermatan-sulfate and heparan-sulfate. Due to the missing or
defective IDS enzyme activity in patients with Hunter syndrome, GAG
accumulate progressively in the lysosomes of a variety of cell
types. This leads to cellular engorgement, organomegaly, tissue
destruction, and organ system dysfunction.
[0236] Mouse Model.
[0237] IDS-KO mice have little or no tissue IDS activity and
exhibit many of the cellular and clinical effects observed in
Hunter syndrome, including increased tissue vacuolization, GAG
levels, and urinary excretion of GAG. Due to the X-linked recessive
nature of Hunter syndrome, all pharmacology studies are performed
in male mice. Animal breeding is performed as described by Garcia
et al., 2007 (3). Briefly, carrier females are bred with wild-type
male mice of the C57BL/6 background strain, producing heterozygous
females and hemizygous male knock-out mice, as well as wild-type
(WT) males and females. IDS-KO male mice are alternatively obtained
by breeding carrier females with IDS-KO male mice. The genotype of
all mice used in these experiments is confirmed by polymerase chain
reaction of DNA obtained from tail snip. All IDS-KO mice are
hemizygous IKO (-/0) males between 12 and 13 weeks old at
treatment initiation (mice younger than 12 weeks are
not used in this study). A group of untreated WT littermate (+/0)
males is used as controls.
[0238] Idursulfase (Elaprase®).
[0239] Idursulfase is a drug used to treat Hunter syndrome (also
called MPS II) (see Garcia et al., Mol Genet Metab. 91:183-90,
2007). It is a purified form of the lysosomal enzyme
iduronate-2-sulfatase and is produced by recombinant DNA technology
in a human cell line.
[0240] Study Design.
[0241] The study design is outlined in Table E3 below.
TABLE E3
Group | Animal | Mice/Group | Dose level (mg/kg) | Dose volume (mL/kg) | Treatment regimen | Sacrifice
Vehicle (control) | WT | 5 | 0 | 5-6 | IV, once per week for 6 wks | 24 h after last injection
Vehicle (control) | IDS-KO | 5 | 0 | 5-6 | IV, once per week for 6 wks | 24 h after last injection
IDS (Elaprase) | IDS-KO | 5 | 6 mg/kg (high dose) | 5-6 | IV, bi-weekly for 6 wks | 24 h after last injection
hMTf | IDS-KO | 3-5 | Molar equivalent to hMTf-IDS dose | 5-6 | IV, once per week for 6 wks | 24 h after last injection
IDS-hMTf | IDS-KO | 5 | Activity equivalent to IDS (high dose) | 5-6 | IV, once per week for 6 wks | 24 h after last injection
hMTfpep-IDS | IDS-KO | 5 | Activity equivalent to IDS (high dose) | 5-6 | IV, once per week for 6 wks | 24 h after last injection
[0242] All test articles and vehicle controls are administered by
two slow bolus injections (one IV and one IP), performed once
a week for a total of 6 weeks.
[0243] Body weights are determined at randomization on the first
day of treatment and weekly thereafter. Clinical observations are
performed daily. The animals are sacrificed approximately 24 hours
after the last treatment.
[0244] Selected organs (brain, liver, kidney and heart) are
collected and their weights recorded. The brains are preserved for
histopathology and immunostaining analysis. The other tissues are
divided, with one half or one paired organ preserved for
histopathology and immunostaining in a manner similar to the brain.
The other half or paired organ is frozen in liquid nitrogen and
stored at -80° C. until assayed for GAG.
[0245] Study End Points:
[0246] The primary endpoints are as follows:
[0247] Histological evaluation: Hematoxylin and eosin staining of
brain sections. This method is used to evaluate whether treatment
has an effect on reducing the number/size of cellular storage
vacuoles observed in IDS-KO mice; and
[0248] Immunohistochemical evaluation of lysosomal-associated
membrane protein-1 (LAMP-1) in brain sections: This method is used
to determine if treatment has an effect on reducing the elevated
LAMP-1 immunoreactivity that is observed in IDS-KO mice.
[0249] If feasible, quantitative or semi-quantitative methods are
also employed for analysis of the two primary endpoints (such as
scoring, area measurements, section scans, etc.). The histopathologist
performing this analysis is blinded with regard to slide allocation
to the study groups. Lysosome surface area is quantified by
scanning areas stained for LAMP-1 (IHC) and compared between
experimental groups.
[0250] The secondary endpoints are as follows:
[0251] GAG levels in selected tissues (liver, kidney, and heart);
[0252] H&E staining of selected tissues and detection of cellular
storage vacuoles; and
[0253] Immunohistochemical evaluation of LAMP-1 levels in selected
organs/tissues.
[0254] Histopathology (H&E Stain).
[0255] Tissues are collected and fixed in 10% neutral buffered
formalin, then processed and embedded in paraffin. Paraffin
sections (5 μm) are prepared and stained with hematoxylin and eosin
(H&E) using standard procedures.
[0256] Immunohistochemistry (LAMP-1).
[0257] Deparaffinized slides are incubated overnight with rat
anti-LAMP-1 IgG (Santa Cruz Biotechnology) as the primary antibody
or rat IgG2a as a control antibody (AbD Serotec, Raleigh, N.C.).
Following overnight incubation at 2-8° C., biotinylated
rabbit anti-rat IgG (H&L), mouse adsorbed (Vector Laboratories),
is added. Following 30 minutes of incubation at 37° C.,
samples are washed and then treated with avidin-biotin-peroxidase
complex (Vector Laboratories) for 30 minutes. Labeled protein is
localized by incubation with 3,3'-diaminobenzidine. The area of
LAMP-1-positive cells is analyzed with Image-Pro Plus software
(Media Cybernetics, Inc., Bethesda, Md.).
[0258] GAG Measurements.
[0259] Tissue extracts are prepared by homogenizing tissue in a
lysis buffer (10 mM Tris, 5 mM EDTA, 0.1% Igepal CA-630, 2 mM
Pefabloc SC) using a glass grinder (Kontes Glass Company, Vineland,
N.J.) or a motorized tissue homogenizer (PowerGen Model 125, Omni
International, Warrenton, Va.). Homogenates are then subjected to 5
freeze-thaw cycles using an ethanol/dry ice bath and a 37° C.
water bath. Tissue debris is pelleted twice by room temperature
centrifugation at 2000 g for 12 minutes, and supernatants are
collected and assayed for total protein concentration (mg/mL) using
the bicinchoninic acid (BCA) assay (Pierce, Rockford, Ill.).
[0260] GAG concentration in urine and tissue extracts is quantified
by a colorimetric assay using 1,9-dimethylmethylene blue (DMB) dye
and a standard curve (1.56-25 μg/mL) prepared from dermatan
sulfate (MP Biomedicals, Aurora, Ohio). Urine samples are run at
dilutions of 1/10, 1/20, and 1/40. To avoid assay interference from
protein, tissue extract samples are diluted to protein
concentrations of <200 μg/mL. GAG concentrations in urine are
adjusted for creatinine concentrations measured with a commercially
available kit (Sigma, St. Louis, Mo., part no. 555A) to compensate
for differences in kidney function and expressed as μg GAG/mg
creatinine. GAG levels in tissue extracts are adjusted for protein
concentration (μg GAG/mg protein) or gram of tissue.
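The GAG quantification arithmetic described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the assay's actual analysis procedure: it assumes a linear absorbance response across the 1.56-25 μg/mL dermatan sulfate standards, assumes the standards form a two-fold dilution series, and uses made-up absorbance readings. All function and variable names are hypothetical.

```python
def fit_standard_curve(concentrations, absorbances):
    """Least-squares line: absorbance = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(absorbances) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, absorbances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def gag_concentration(absorbance, slope, intercept, dilution_factor=1.0):
    """GAG in the undiluted sample (ug/mL), read back from the standard curve."""
    return (absorbance - intercept) / slope * dilution_factor


def normalize(gag_ug_per_ml, reference_mg_per_ml):
    """Express GAG per mg of creatinine (urine) or per mg of protein (tissue)."""
    return gag_ug_per_ml / reference_mg_per_ml


# Assumed two-fold dilution series spanning roughly 1.56-25 ug/mL.
STANDARDS_UG_PER_ML = [1.5625, 3.125, 6.25, 12.5, 25.0]
# Made-up, perfectly linear absorbance readings for illustration only.
STANDARD_ABSORBANCES = [0.05 + 0.02 * c for c in STANDARDS_UG_PER_ML]

slope, intercept = fit_standard_curve(STANDARDS_UG_PER_ML, STANDARD_ABSORBANCES)

# Hypothetical urine sample read at a 1/10 dilution, with 0.5 mg/mL creatinine.
urine_gag = gag_concentration(0.25, slope, intercept, dilution_factor=10)
print(normalize(urine_gag, 0.5))  # ug GAG per mg creatinine
```

In practice, only readings that fall within the range of the standards would be interpolated, which is one reason to run several dilutions (1/10, 1/20, 1/40) of each urine sample.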
Sequence CWU 1
1
1491738PRTHomo sapiens 1Met Arg Gly Pro Ser Gly Ala Leu Trp Leu Leu
Leu Ala Leu Arg Thr1 5 10 15Val Leu Gly Gly Met Glu Val Arg Trp Cys
Ala Thr Ser Asp Pro Glu 20 25 30Gln His Lys Cys Gly Asn Met Ser Glu
Ala Phe Arg Glu Ala Gly Ile 35 40 45Gln Pro Ser Leu Leu Cys Val Arg
Gly Thr Ser Ala Asp His Cys Val 50 55 60Gln Leu Ile Ala Ala Gln Glu
Ala Asp Ala Ile Thr Leu Asp Gly Gly65 70 75 80Ala Ile Tyr Glu Ala
Gly Lys Glu His Gly Leu Lys Pro Val Val Gly 85 90 95Glu Val Tyr Asp
Gln Glu Val Gly Thr Ser Tyr Tyr Ala Val Ala Val 100 105 110Val Arg
Arg Ser Ser His Val Thr Ile Asp Thr Leu Lys Gly Val Lys 115 120
125Ser Cys His Thr Gly Ile Asn Arg Thr Val Gly Trp Asn Val Pro Val
130 135 140Gly Tyr Leu Val Glu Ser Gly Arg Leu Ser Val Met Gly Cys
Asp Val145 150 155 160Leu Lys Ala Val Ser Asp Tyr Phe Gly Gly Ser
Cys Val Pro Gly Ala 165 170 175Gly Glu Thr Ser Tyr Ser Glu Ser Leu
Cys Arg Leu Cys Arg Gly Asp 180 185 190Ser Ser Gly Glu Gly Val Cys
Asp Lys Ser Pro Leu Glu Arg Tyr Tyr 195 200 205Asp Tyr Ser Gly Ala
Phe Arg Cys Leu Ala Glu Gly Ala Gly Asp Val 210 215 220Ala Phe Val
Lys His Ser Thr Val Leu Glu Asn Thr Asp Gly Lys Thr225 230 235
240Leu Pro Ser Trp Gly Gln Ala Leu Leu Ser Gln Asp Phe Glu Leu Leu
245 250 255Cys Arg Asp Gly Ser Arg Ala Asp Val Thr Glu Trp Arg Gln
Cys His 260 265 270Leu Ala Arg Val Pro Ala His Ala Val Val Val Arg
Ala Asp Thr Asp 275 280 285Gly Gly Leu Ile Phe Arg Leu Leu Asn Glu
Gly Gln Arg Leu Phe Ser 290 295 300His Glu Gly Ser Ser Phe Gln Met
Phe Ser Ser Glu Ala Tyr Gly Gln305 310 315 320Lys Asp Leu Leu Phe
Lys Asp Ser Thr Ser Glu Leu Val Pro Ile Ala 325 330 335Thr Gln Thr
Tyr Glu Ala Trp Leu Gly His Glu Tyr Leu His Ala Met 340 345 350Lys
Gly Leu Leu Cys Asp Pro Asn Arg Leu Pro Pro Tyr Leu Arg Trp 355 360
365Cys Val Leu Ser Thr Pro Glu Ile Gln Lys Cys Gly Asp Met Ala Val
370 375 380Ala Phe Arg Arg Gln Arg Leu Lys Pro Glu Ile Gln Cys Val
Ser Ala385 390 395 400Lys Ser Pro Gln His Cys Met Glu Arg Ile Gln
Ala Glu Gln Val Asp 405 410 415Ala Val Thr Leu Ser Gly Glu Asp Ile
Tyr Thr Ala Gly Lys Thr Tyr 420 425 430Gly Leu Val Pro Ala Ala Gly
Glu His Tyr Ala Pro Glu Asp Ser Ser 435 440 445Asn Ser Tyr Tyr Val
Val Ala Val Val Arg Arg Asp Ser Ser His Ala 450 455 460Phe Thr Leu
Asp Glu Leu Arg Gly Lys Arg Ser Cys His Ala Gly Phe465 470 475
480Gly Ser Pro Ala Gly Trp Asp Val Pro Val Gly Ala Leu Ile Gln Arg
485 490 495Gly Phe Ile Arg Pro Lys Asp Cys Asp Val Leu Thr Ala Val
Ser Glu 500 505 510Phe Phe Asn Ala Ser Cys Val Pro Val Asn Asn Pro
Lys Asn Tyr Pro 515 520 525Ser Ser Leu Cys Ala Leu Cys Val Gly Asp
Glu Gln Gly Arg Asn Lys 530 535 540Cys Val Gly Asn Ser Gln Glu Arg
Tyr Tyr Gly Tyr Arg Gly Ala Phe545 550 555 560Arg Cys Leu Val Glu
Asn Ala Gly Asp Val Ala Phe Val Arg His Thr 565 570 575Thr Val Phe
Asp Asn Thr Asn Gly His Asn Ser Glu Pro Trp Ala Ala 580 585 590Glu
Leu Arg Ser Glu Asp Tyr Glu Leu Leu Cys Pro Asn Gly Ala Arg 595 600
605Ala Glu Val Ser Gln Phe Ala Ala Cys Asn Leu Ala Gln Ile Pro Pro
610 615 620His Ala Val Met Val Arg Pro Asp Thr Asn Ile Phe Thr Val
Tyr Gly625 630 635 640Leu Leu Asp Lys Ala Gln Asp Leu Phe Gly Asp
Asp His Asn Lys Asn 645 650 655Gly Phe Lys Met Phe Asp Ser Ser Asn
Tyr His Gly Gln Asp Leu Leu 660 665 670Phe Lys Asp Ala Thr Val Arg
Ala Val Pro Val Gly Glu Lys Thr Thr 675 680 685Tyr Arg Gly Trp Leu
Gly Leu Asp Tyr Val Ala Ala Leu Glu Gly Met 690 695 700Ser Ser Gln
Gln Cys Ser Gly Ala Ala Ala Pro Ala Pro Gly Ala Pro705 710 715
720Leu Leu Pro Leu Leu Leu Pro Ala Leu Ala Ala Arg Leu Leu Pro Pro
725 730 735Ala Leu2692PRTHomo sapiens 2Gly Met Glu Val Arg Trp Cys
Ala Thr Ser Asp Pro Glu Gln His Lys1 5 10 15Cys Gly Asn Met Ser Glu
Ala Phe Arg Glu Ala Gly Ile Gln Pro Ser 20 25 30Leu Leu Cys Val Arg
Gly Thr Ser Ala Asp His Cys Val Gln Leu Ile 35 40 45Ala Ala Gln Glu
Ala Asp Ala Ile Thr Leu Asp Gly Gly Ala Ile Tyr 50 55 60Glu Ala Gly
Lys Glu His Gly Leu Lys Pro Val Val Gly Glu Val Tyr65 70 75 80Asp
Gln Glu Val Gly Thr Ser Tyr Tyr Ala Val Ala Val Val Arg Arg 85 90
95Ser Ser His Val Thr Ile Asp Thr Leu Lys Gly Val Lys Ser Cys His
100 105 110Thr Gly Ile Asn Arg Thr Val Gly Trp Asn Val Pro Val Gly
Tyr Leu 115 120 125Val Glu Ser Gly Arg Leu Ser Val Met Gly Cys Asp
Val Leu Lys Ala 130 135 140Val Ser Asp Tyr Phe Gly Gly Ser Cys Val
Pro Gly Ala Gly Glu Thr145 150 155 160Ser Tyr Ser Glu Ser Leu Cys
Arg Leu Cys Arg Gly Asp Ser Ser Gly 165 170 175Glu Gly Val Cys Asp
Lys Ser Pro Leu Glu Arg Tyr Tyr Asp Tyr Ser 180 185 190Gly Ala Phe
Arg Cys Leu Ala Glu Gly Ala Gly Asp Val Ala Phe Val 195 200 205Lys
His Ser Thr Val Leu Glu Asn Thr Asp Gly Lys Thr Leu Pro Ser 210 215
220Trp Gly Gln Ala Leu Leu Ser Gln Asp Phe Glu Leu Leu Cys Arg
Asp225 230 235 240Gly Ser Arg Ala Asp Val Thr Glu Trp Arg Gln Cys
His Leu Ala Arg 245 250 255Val Pro Ala His Ala Val Val Val Arg Ala
Asp Thr Asp Gly Gly Leu 260 265 270Ile Phe Arg Leu Leu Asn Glu Gly
Gln Arg Leu Phe Ser His Glu Gly 275 280 285Ser Ser Phe Gln Met Phe
Ser Ser Glu Ala Tyr Gly Gln Lys Asp Leu 290 295 300Leu Phe Lys Asp
Ser Thr Ser Glu Leu Val Pro Ile Ala Thr Gln Thr305 310 315 320Tyr
Glu Ala Trp Leu Gly His Glu Tyr Leu His Ala Met Lys Gly Leu 325 330
335Leu Cys Asp Pro Asn Arg Leu Pro Pro Tyr Leu Arg Trp Cys Val Leu
340 345 350Ser Thr Pro Glu Ile Gln Lys Cys Gly Asp Met Ala Val Ala
Phe Arg 355 360 365Arg Gln Arg Leu Lys Pro Glu Ile Gln Cys Val Ser
Ala Lys Ser Pro 370 375 380Gln His Cys Met Glu Arg Ile Gln Ala Glu
Gln Val Asp Ala Val Thr385 390 395 400Leu Ser Gly Glu Asp Ile Tyr
Thr Ala Gly Lys Thr Tyr Gly Leu Val 405 410 415Pro Ala Ala Gly Glu
His Tyr Ala Pro Glu Asp Ser Ser Asn Ser Tyr 420 425 430Tyr Val Val
Ala Val Val Arg Arg Asp Ser Ser His Ala Phe Thr Leu 435 440 445Asp
Glu Leu Arg Gly Lys Arg Ser Cys His Ala Gly Phe Gly Ser Pro 450 455
460Ala Gly Trp Asp Val Pro Val Gly Ala Leu Ile Gln Arg Gly Phe
Ile465 470 475 480Arg Pro Lys Asp Cys Asp Val Leu Thr Ala Val Ser
Glu Phe Phe Asn 485 490 495Ala Ser Cys Val Pro Val Asn Asn Pro Lys
Asn Tyr Pro Ser Ser Leu 500 505 510Cys Ala Leu Cys Val Gly Asp Glu
Gln Gly Arg Asn Lys Cys Val Gly 515 520 525Asn Ser Gln Glu Arg Tyr
Tyr Gly Tyr Arg Gly Ala Phe Arg Cys Leu 530 535 540Val Glu Asn Ala
Gly Asp Val Ala Phe Val Arg His Thr Thr Val Phe545 550 555 560Asp
Asn Thr Asn Gly His Asn Ser Glu Pro Trp Ala Ala Glu Leu Arg 565 570
575Ser Glu Asp Tyr Glu Leu Leu Cys Pro Asn Gly Ala Arg Ala Glu Val
580 585 590Ser Gln Phe Ala Ala Cys Asn Leu Ala Gln Ile Pro Pro His
Ala Val 595 600 605Met Val Arg Pro Asp Thr Asn Ile Phe Thr Val Tyr
Gly Leu Leu Asp 610 615 620Lys Ala Gln Asp Leu Phe Gly Asp Asp His
Asn Lys Asn Gly Phe Lys625 630 635 640Met Phe Asp Ser Ser Asn Tyr
His Gly Gln Asp Leu Leu Phe Lys Asp 645 650 655Ala Thr Val Arg Ala
Val Pro Val Gly Glu Lys Thr Thr Tyr Arg Gly 660 665 670Trp Leu Gly
Leu Asp Tyr Val Ala Ala Leu Glu Gly Met Ser Ser Gln 675 680 685Gln
Cys Ser Gly 690311PRTHomo sapiens 3Trp Cys Ala Thr Ser Asp Pro Glu
Gln His Lys1 5 10411PRTHomo sapiens 4Arg Ser Ser His Val Thr Ile
Asp Thr Leu Lys1 5 10513PRTHomo sapiens 5Ser Ser His Val Thr Ile
Asp Thr Leu Lys Gly Val Lys1 5 10614PRTHomo sapiens 6Leu Cys Arg
Gly Asp Ser Ser Gly Glu Gly Val Cys Asp Lys1 5 10716PRTHomo sapiens
7Gly Asp Ser Ser Gly Glu Gly Val Cys Asp Lys Ser Pro Leu Glu Arg1 5
10 1589PRTHomo sapiens 8Tyr Tyr Asp Tyr Ser Gly Ala Phe Arg1
597PRTHomo sapiens 9Ala Asp Val Thr Glu Trp Arg1 5109PRTHomo
sapiens 10Val Pro Ala His Ala Val Val Val Arg1 51110PRTHomo sapiens
11Ala Asp Thr Asp Gly Gly Leu Ile Phe Arg1 5 10129PRTHomo sapiens
12Cys Gly Asp Met Ala Val Ala Phe Arg1 51311PRTHomo sapiens 13Leu
Lys Pro Glu Ile Gln Cys Val Ser Ala Lys1 5 101412PRTHomo sapiens
14Asp Ser Ser His Ala Phe Thr Leu Asp Glu Leu Arg1 5 101513PRTHomo
sapiens 15Ser Glu Asp Tyr Glu Leu Leu Cys Pro Asn Gly Ala Arg1 5
101615PRTHomo sapiens 16Ala Gln Asp Leu Phe Gly Asp Asp His Asn Lys
Asn Gly Phe Lys1 5 10 151740PRTHomo sapiens 17Phe Ser Ser Glu Ala
Tyr Gly Gln Lys Asp Leu Leu Phe Lys Asp Ser1 5 10 15Thr Ser Glu Leu
Val Pro Ile Ala Thr Gln Thr Tyr Glu Ala Trp Leu 20 25 30Gly His Glu
Tyr Leu His Ala Met 35 4018221PRTHomo sapiens 18Glu Arg Ile Gln Ala
Glu Gln Val Asp Ala Val Thr Leu Ser Gly Glu1 5 10 15Asp Ile Tyr Thr
Ala Gly Lys Thr Tyr Gly Leu Val Pro Ala Ala Gly 20 25 30Glu His Tyr
Ala Pro Glu Asp Ser Ser Asn Ser Tyr Tyr Val Val Ala 35 40 45Val Val
Arg Arg Asp Ser Ser His Ala Phe Thr Leu Asp Glu Leu Arg 50 55 60Gly
Lys Arg Ser Cys His Ala Gly Phe Gly Ser Pro Ala Gly Trp Asp65 70 75
80Val Pro Val Gly Ala Leu Ile Gln Arg Gly Phe Ile Arg Pro Lys Asp
85 90 95Cys Asp Val Leu Thr Ala Val Ser Glu Phe Phe Asn Ala Ser Cys
Val 100 105 110Pro Val Asn Asn Pro Lys Asn Tyr Pro Ser Ser Leu Cys
Ala Leu Cys 115 120 125Val Gly Asp Glu Gln Gly Arg Asn Lys Cys Val
Gly Asn Ser Gln Glu 130 135 140Arg Tyr Tyr Gly Tyr Arg Gly Ala Phe
Arg Cys Leu Val Glu Asn Ala145 150 155 160Gly Asp Val Ala Phe Val
Arg His Thr Thr Val Phe Asp Asn Thr Asn 165 170 175Gly His Asn Ser
Glu Pro Trp Ala Ala Glu Leu Arg Ser Glu Asp Tyr 180 185 190Glu Leu
Leu Cys Pro Asn Gly Ala Arg Ala Glu Val Ser Gln Phe Ala 195 200
205Ala Cys Asn Leu Ala Gln Ile Pro Pro His Ala Val Met 210 215
2201932PRTHomo sapiens 19Val Arg Pro Asp Thr Asn Ile Phe Thr Val
Tyr Gly Leu Leu Asp Lys1 5 10 15Ala Gln Asp Leu Phe Gly Asp Asp His
Asn Lys Asn Gly Phe Lys Met 20 25 3020564PRTHomo sapiens 20Gly Met
Glu Val Arg Trp Cys Ala Thr Ser Asp Pro Glu Gln His Lys1 5 10 15Cys
Gly Asn Met Ser Glu Ala Phe Arg Glu Ala Gly Ile Gln Pro Ser 20 25
30Leu Leu Cys Val Arg Gly Thr Ser Ala Asp His Cys Val Gln Leu Ile
35 40 45Ala Ala Gln Glu Ala Asp Ala Ile Thr Leu Asp Gly Gly Ala Ile
Tyr 50 55 60Glu Ala Gly Lys Glu His Gly Leu Lys Pro Val Val Gly Glu
Val Tyr65 70 75 80Asp Gln Glu Val Gly Thr Ser Tyr Tyr Ala Val Ala
Val Val Arg Arg 85 90 95Ser Ser His Val Thr Ile Asp Thr Leu Lys Gly
Val Lys Ser Cys His 100 105 110Thr Gly Ile Asn Arg Thr Val Gly Trp
Asn Val Pro Val Gly Tyr Leu 115 120 125Val Glu Ser Gly Arg Leu Ser
Val Met Gly Cys Asp Val Leu Lys Ala 130 135 140Val Ser Asp Tyr Phe
Gly Gly Ser Cys Val Pro Gly Ala Gly Glu Thr145 150 155 160Ser Tyr
Ser Glu Ser Leu Cys Arg Leu Cys Arg Gly Asp Ser Ser Gly 165 170
175Glu Gly Val Cys Asp Lys Ser Pro Leu Glu Arg Tyr Tyr Asp Tyr Ser
180 185 190Gly Ala Phe Arg Cys Leu Ala Glu Gly Ala Gly Asp Val Ala
Phe Val 195 200 205Lys His Ser Thr Val Leu Glu Asn Thr Asp Gly Lys
Thr Leu Pro Ser 210 215 220Trp Gly Gln Ala Leu Leu Ser Gln Asp Phe
Glu Leu Leu Cys Arg Asp225 230 235 240Gly Ser Arg Ala Asp Val Thr
Glu Trp Arg Gln Cys His Leu Ala Arg 245 250 255Val Pro Ala His Ala
Val Val Val Arg Ala Asp Thr Asp Gly Gly Leu 260 265 270Ile Phe Arg
Leu Leu Asn Glu Gly Gln Arg Leu Phe Ser His Glu Gly 275 280 285Ser
Ser Phe Gln Met Phe Ser Ser Glu Ala Tyr Gly Gln Lys Asp Leu 290 295
300Leu Phe Lys Asp Ser Thr Ser Glu Leu Val Pro Ile Ala Thr Gln
Thr305 310 315 320Tyr Glu Ala Trp Leu Gly His Glu Tyr Leu His Ala
Met Lys Gly Leu 325 330 335Leu Cys Asp Pro Asn Arg Leu Pro Pro Tyr
Leu Arg Trp Cys Val Leu 340 345 350Ser Thr Pro Glu Ile Gln Lys Cys
Gly Asp Met Ala Val Ala Phe Arg 355 360 365Arg Gln Arg Leu Lys Pro
Glu Ile Gln Cys Val Ser Ala Lys Ser Pro 370 375 380Gln His Cys Met
Glu Arg Ile Gln Ala Glu Gln Val Asp Ala Val Thr385 390 395 400Leu
Ser Gly Glu Asp Ile Tyr Thr Ala Gly Lys Thr Tyr Gly Leu Val 405 410
415Pro Ala Ala Gly Glu His Tyr Ala Pro Glu Asp Ser Ser Asn Ser Tyr
420 425 430Tyr Val Val Ala Val Val Arg Arg Asp Ser Ser His Ala Phe
Thr Leu 435 440 445Asp Glu Leu Arg Gly Lys Arg Ser Cys His Ala Gly
Phe Gly Ser Pro 450 455 460Ala Gly Trp Asp Val Pro Val Gly Ala Leu
Ile Gln Arg Gly Phe Ile465 470 475 480Arg Pro Lys Asp Cys Asp Val
Leu Thr Ala Val Ser Glu Phe Phe Asn 485 490 495Ala Ser Cys Val Pro
Val Asn Asn Pro Lys Asn Tyr Pro Ser Ser Leu 500 505 510Cys Ala Leu
Cys Val Gly Asp Glu Gln Gly Arg Asn Lys Cys Val Gly 515 520 525Asn
Ser Gln Glu Arg Tyr Tyr Gly Tyr Arg Gly Ala Phe Arg Cys Leu 530 535
540Val Glu Asn Ala Gly Asp Val Ala Phe Val Arg His Thr Thr Val
Phe545 550 555 560Asp Asn Thr Asn2122PRTHomo sapiens 21Gly His Asn
Ser Glu Pro Trp Ala Ala Glu Leu Arg Ser Glu Asp Tyr1 5 10 15Glu Leu
Leu Cys Pro Asn 202251PRTHomo sapiens 22Gly Ala Arg Ala Glu Val Ser
Gln Phe Ala Ala Cys Asn Leu Ala Gln1 5 10 15Ile Pro Pro His Ala Val
Met Val Arg Pro Asp Thr Asn Ile Phe Thr 20 25 30Val Tyr Gly Leu Leu
Asp Lys Ala Gln Asp Leu Phe Gly Asp Asp His 35 40 45Asn Lys Asn
502353PRTHomo sapiens 23Gly Phe Lys Met Phe Asp Ser Ser Asn Tyr His
Gly Gln Asp Leu Leu1 5 10 15Phe Lys Asp Ala Thr Val Arg Ala Val Pro
Val Gly Glu Lys Thr Thr 20 25 30Tyr Arg Gly Trp Leu Gly Leu Asp Tyr
Val Ala Ala Leu Glu Gly Met 35 40 45Ser Ser Gln Gln Cys
5024586PRTHomo sapiens 24Gly Met Glu Val Arg Trp Cys Ala Thr Ser
Asp Pro Glu Gln His Lys1 5 10 15Cys Gly Asn Met Ser Glu Ala Phe Arg
Glu Ala Gly Ile Gln Pro Ser 20 25 30Leu Leu Cys Val Arg Gly Thr Ser
Ala Asp His Cys Val Gln Leu Ile 35 40 45Ala Ala Gln Glu Ala Asp Ala
Ile Thr Leu Asp Gly Gly Ala Ile Tyr 50 55 60Glu Ala Gly Lys Glu His
Gly Leu Lys Pro Val Val Gly Glu Val Tyr65 70 75 80Asp Gln Glu Val
Gly Thr Ser Tyr Tyr Ala Val Ala Val Val Arg Arg 85 90 95Ser Ser His
Val Thr Ile Asp Thr Leu Lys Gly Val Lys Ser Cys His 100 105 110Thr
Gly Ile Asn Arg Thr Val Gly Trp Asn Val Pro Val Gly Tyr Leu 115 120
125Val Glu Ser Gly Arg Leu Ser Val Met Gly Cys Asp Val Leu Lys Ala
130 135 140Val Ser Asp Tyr Phe Gly Gly Ser Cys Val Pro Gly Ala Gly
Glu Thr145 150 155 160Ser Tyr Ser Glu Ser Leu Cys Arg Leu Cys Arg
Gly Asp Ser Ser Gly 165 170 175Glu Gly Val Cys Asp Lys Ser Pro Leu
Glu Arg Tyr Tyr Asp Tyr Ser 180 185 190Gly Ala Phe Arg Cys Leu Ala
Glu Gly Ala Gly Asp Val Ala Phe Val 195 200 205Lys His Ser Thr Val
Leu Glu Asn Thr Asp Gly Lys Thr Leu Pro Ser 210 215 220Trp Gly Gln
Ala Leu Leu Ser Gln Asp Phe Glu Leu Leu Cys Arg Asp225 230 235
240Gly Ser Arg Ala Asp Val Thr Glu Trp Arg Gln Cys His Leu Ala Arg
245 250 255Val Pro Ala His Ala Val Val Val Arg Ala Asp Thr Asp Gly
Gly Leu 260 265 270Ile Phe Arg Leu Leu Asn Glu Gly Gln Arg Leu Phe
Ser His Glu Gly 275 280 285Ser Ser Phe Gln Met Phe Ser Ser Glu Ala
Tyr Gly Gln Lys Asp Leu 290 295 300Leu Phe Lys Asp Ser Thr Ser Glu
Leu Val Pro Ile Ala Thr Gln Thr305 310 315 320Tyr Glu Ala Trp Leu
Gly His Glu Tyr Leu His Ala Met Lys Gly Leu 325 330 335Leu Cys Asp
Pro Asn Arg Leu Pro Pro Tyr Leu Arg Trp Cys Val Leu 340 345 350Ser
Thr Pro Glu Ile Gln Lys Cys Gly Asp Met Ala Val Ala Phe Arg 355 360
365Arg Gln Arg Leu Lys Pro Glu Ile Gln Cys Val Ser Ala Lys Ser Pro
370 375 380Gln His Cys Met Glu Arg Ile Gln Ala Glu Gln Val Asp Ala
Val Thr385 390 395 400Leu Ser Gly Glu Asp Ile Tyr Thr Ala Gly Lys
Thr Tyr Gly Leu Val 405 410 415Pro Ala Ala Gly Glu His Tyr Ala Pro
Glu Asp Ser Ser Asn Ser Tyr 420 425 430Tyr Val Val Ala Val Val Arg
Arg Asp Ser Ser His Ala Phe Thr Leu 435 440 445Asp Glu Leu Arg Gly
Lys Arg Ser Cys His Ala Gly Phe Gly Ser Pro 450 455 460Ala Gly Trp
Asp Val Pro Val Gly Ala Leu Ile Gln Arg Gly Phe Ile465 470 475
480Arg Pro Lys Asp Cys Asp Val Leu Thr Ala Val Ser Glu Phe Phe Asn
485 490 495Ala Ser Cys Val Pro Val Asn Asn Pro Lys Asn Tyr Pro Ser
Ser Leu 500 505 510Cys Ala Leu Cys Val Gly Asp Glu Gln Gly Arg Asn
Lys Cys Val Gly 515 520 525Asn Ser Gln Glu Arg Tyr Tyr Gly Tyr Arg
Gly Ala Phe Arg Cys Leu 530 535 540Val Glu Asn Ala Gly Asp Val Ala
Phe Val Arg His Thr Thr Val Phe545 550 555 560Asp Asn Thr Asn Gly
His Asn Ser Glu Pro Trp Ala Ala Glu Leu Arg 565 570 575Ser Glu Asp
Tyr Glu Leu Leu Cys Pro Asn 580 58525637PRTHomo sapiens 25Gly Met
Glu Val Arg Trp Cys Ala Thr Ser Asp Pro Glu Gln His Lys1 5 10 15Cys
Gly Asn Met Ser Glu Ala Phe Arg Glu Ala Gly Ile Gln Pro Ser 20 25
30Leu Leu Cys Val Arg Gly Thr Ser Ala Asp His Cys Val Gln Leu Ile
35 40 45Ala Ala Gln Glu Ala Asp Ala Ile Thr Leu Asp Gly Gly Ala Ile
Tyr 50 55 60Glu Ala Gly Lys Glu His Gly Leu Lys Pro Val Val Gly Glu
Val Tyr65 70 75 80Asp Gln Glu Val Gly Thr Ser Tyr Tyr Ala Val Ala
Val Val Arg Arg 85 90 95Ser Ser His Val Thr Ile Asp Thr Leu Lys Gly
Val Lys Ser Cys His 100 105 110Thr Gly Ile Asn Arg Thr Val Gly Trp
Asn Val Pro Val Gly Tyr Leu 115 120 125Val Glu Ser Gly Arg Leu Ser
Val Met Gly Cys Asp Val Leu Lys Ala 130 135 140Val Ser Asp Tyr Phe
Gly Gly Ser Cys Val Pro Gly Ala Gly Glu Thr145 150 155 160Ser Tyr
Ser Glu Ser Leu Cys Arg Leu Cys Arg Gly Asp Ser Ser Gly 165 170
175Glu Gly Val Cys Asp Lys Ser Pro Leu Glu Arg Tyr Tyr Asp Tyr Ser
180 185 190Gly Ala Phe Arg Cys Leu Ala Glu Gly Ala Gly Asp Val Ala
Phe Val 195 200 205Lys His Ser Thr Val Leu Glu Asn Thr Asp Gly Lys
Thr Leu Pro Ser 210 215 220Trp Gly Gln Ala Leu Leu Ser Gln Asp Phe
Glu Leu Leu Cys Arg Asp225 230 235 240Gly Ser Arg Ala Asp Val Thr
Glu Trp Arg Gln Cys His Leu Ala Arg 245 250 255Val Pro Ala His Ala
Val Val Val Arg Ala Asp Thr Asp Gly Gly Leu 260 265 270Ile Phe Arg
Leu Leu Asn Glu Gly Gln Arg Leu Phe Ser His Glu Gly 275 280 285Ser
Ser Phe Gln Met Phe Ser Ser Glu Ala Tyr Gly Gln Lys Asp Leu 290 295
300Leu Phe Lys Asp Ser Thr Ser Glu Leu Val Pro Ile Ala Thr Gln
Thr305 310 315 320Tyr Glu Ala Trp Leu Gly His Glu Tyr Leu His Ala
Met Lys Gly Leu 325 330 335Leu Cys Asp Pro Asn Arg Leu Pro Pro Tyr
Leu Arg Trp Cys Val Leu 340 345 350Ser Thr Pro Glu Ile Gln Lys Cys
Gly Asp Met Ala Val Ala Phe Arg 355 360 365Arg Gln Arg Leu Lys Pro
Glu Ile Gln Cys Val Ser Ala Lys Ser Pro 370 375 380Gln His Cys Met
Glu Arg Ile Gln Ala Glu Gln Val Asp Ala Val Thr385 390 395 400Leu
Ser Gly Glu Asp Ile Tyr Thr Ala Gly Lys Thr Tyr Gly Leu Val 405 410
415Pro Ala Ala Gly Glu His Tyr Ala Pro Glu Asp Ser Ser Asn Ser Tyr
420 425 430Tyr Val Val Ala Val Val Arg Arg Asp Ser Ser His Ala Phe
Thr Leu 435 440 445Asp Glu Leu Arg Gly Lys Arg Ser Cys His Ala Gly
Phe Gly Ser Pro 450 455 460Ala Gly Trp Asp Val Pro Val Gly Ala Leu
Ile Gln Arg Gly Phe Ile465 470 475 480Arg Pro Lys Asp Cys Asp Val
Leu Thr Ala Val Ser Glu Phe Phe Asn 485 490 495Ala Ser Cys Val Pro
Val Asn Asn Pro Lys Asn Tyr Pro Ser Ser Leu 500 505 510Cys Ala Leu
Cys Val Gly Asp Glu Gln Gly Arg Asn Lys Cys Val Gly 515 520 525Asn
Ser Gln Glu Arg Tyr Tyr Gly Tyr Arg Gly Ala Phe Arg Cys Leu 530 535
540Val Glu Asn Ala Gly Asp Val Ala Phe Val Arg His Thr Thr Val
Phe545 550 555 560Asp Asn Thr Asn Gly His Asn Ser Glu Pro Trp Ala
Ala Glu Leu Arg 565 570 575Ser Glu Asp Tyr Glu Leu Leu Cys Pro Asn
Gly Ala Arg Ala Glu Val 580 585 590Ser Gln Phe Ala Ala Cys Asn Leu
Ala Gln Ile Pro Pro His Ala Val 595 600 605Met Val Arg Pro Asp Thr
Asn Ile Phe Thr Val Tyr Gly Leu Leu Asp 610 615 620Lys Ala Gln Asp
Leu Phe Gly Asp Asp His Asn Lys Asn625 630 6352673PRTHomo sapiens
26Gly His Asn Ser Glu Pro Trp Ala Ala Glu Leu Arg Ser Glu Asp Tyr1
5 10 15Glu Leu Leu Cys Pro Asn Gly Ala Arg Ala Glu Val Ser Gln Phe
Ala 20 25 30Ala Cys Asn Leu Ala Gln Ile Pro Pro His Ala Val Met Val
Arg Pro 35 40 45Asp Thr Asn Ile Phe Thr Val Tyr Gly Leu Leu Asp Lys
Ala Gln Asp 50 55 60Leu Phe Gly Asp Asp His Asn Lys Asn65
7027126PRTHomo sapiens 27Gly His Asn Ser Glu Pro Trp Ala Ala Glu
Leu Arg Ser Glu Asp Tyr1 5 10 15Glu Leu Leu Cys Pro Asn Gly Ala Arg
Ala Glu Val Ser Gln Phe Ala 20 25 30Ala Cys Asn Leu Ala Gln Ile Pro
Pro His Ala Val Met Val Arg Pro 35 40 45Asp Thr Asn Ile Phe Thr Val
Tyr Gly Leu Leu Asp Lys Ala Gln Asp 50 55 60Leu Phe Gly Asp Asp His
Asn Lys Asn Gly Phe Lys Met Phe Asp Ser65 70 75 80Ser Asn Tyr His
Gly Gln Asp Leu Leu Phe Lys Asp Ala Thr Val Arg 85 90 95Ala Val Pro
Val Gly Glu Lys Thr Thr Tyr Arg Gly Trp Leu Gly Leu 100 105 110Asp
Tyr Val Ala Ala Leu Glu Gly Met Ser Ser Gln Gln Cys 115 120
12528104PRTHomo sapiens 28Gly Ala Arg Ala Glu Val Ser Gln Phe Ala
Ala Cys Asn Leu Ala Gln1 5 10 15Ile Pro Pro His Ala Val Met Val Arg
Pro Asp Thr Asn Ile Phe Thr 20 25 30Val Tyr Gly Leu Leu Asp Lys Ala
Gln Asp Leu Phe Gly Asp Asp His 35 40 45Asn Lys Asn Gly Phe Lys Met
Phe Asp Ser Ser Asn Tyr His Gly Gln 50 55 60Asp Leu Leu Phe Lys Asp
Ala Thr Val Arg Ala Val Pro Val Gly Glu65 70 75 80Lys Thr Thr Tyr
Arg Gly Trp Leu Gly Leu Asp Tyr Val Ala Ala Leu 85 90 95Glu Gly Met
Ser Ser Gln Gln Cys 100291272PRTArtificial Sequencep97 fusion
protein 29Met Pro Pro Pro Arg Thr Gly Arg Gly Leu Leu Trp Leu Gly
Leu Val1 5 10 15Leu Ser Ser Val Cys Val Ala Leu Gly His His His His
His His His 20 25 30His His His Glu Asn Leu Tyr Phe Gln Ser Glu Thr
Gln Ala Asn Ser 35 40 45Thr Thr Asp Ala Leu Asn Val Leu Leu Ile Ile
Val Asp Asp Leu Arg 50 55 60Pro Ser Leu Gly Cys Tyr Gly Asp Lys Leu
Val Arg Ser Pro Asn Ile65 70 75 80Asp Gln Leu Ala Ser His Ser Leu
Leu Phe Gln Asn Ala Phe Ala Gln 85 90 95Gln Ala Val Cys Ala Pro Ser
Arg Val Ser Phe Leu Thr Gly Arg Arg 100 105 110Pro Asp Thr Thr Arg
Leu Tyr Asp Phe Asn Ser Tyr Trp Arg Val His 115 120 125Ala Gly Asn
Phe Ser Thr Ile Pro Gln Tyr Phe Lys Glu Asn Gly Tyr 130 135 140Val
Thr Met Ser Val Gly Lys Val Phe His Pro Gly Ile Ser Ser Asn145 150
155 160His Thr Asp Asp Ser Pro Tyr Ser Trp Ser Phe Pro Pro Tyr His
Pro 165 170 175Ser Ser Glu Lys Tyr Glu Asn Thr Lys Thr Cys Arg Gly
Pro Asp Gly 180 185 190Glu Leu His Ala Asn Leu Leu Cys Pro Val Asp
Val Leu Asp Val Pro 195 200 205Glu Gly Thr Leu Pro Asp Lys Gln Ser
Thr Glu Gln Ala Ile Gln Leu 210 215 220Leu Glu Lys Met Lys Thr Ser
Ala Ser Pro Phe Phe Leu Ala Val Gly225 230 235 240Tyr His Lys Pro
His Ile Pro Phe Arg Tyr Pro Lys Glu Phe Gln Lys 245 250 255Leu Tyr
Pro Leu Glu Asn Ile Thr Leu Ala Pro Asp Pro Glu Val Pro 260 265
270Asp Gly Leu Pro Pro Val Ala Tyr Asn Pro Trp Met Asp Ile Arg Gln
275 280 285Arg Glu Asp Val Gln Ala Leu Asn Ile Ser Val Pro Tyr Gly
Pro Ile 290 295 300Pro Val Asp Phe Gln Arg Lys Ile Arg Gln Ser Tyr
Phe Ala Ser Val305 310 315 320Ser Tyr Leu Asp Thr Gln Val Gly Arg
Leu Leu Ser Ala Leu Asp Asp 325 330 335Leu Gln Leu Ala Asn Ser Thr
Ile Ile Ala Phe Thr Ser Asp His Gly 340 345 350Trp Ala Leu Gly Glu
His Gly Glu Trp Ala Lys Tyr Ser Asn Phe Asp 355 360 365Val Ala Thr
His Val Pro Leu Ile Phe Tyr Val Pro Gly Arg Thr Ala 370 375 380Ser
Leu Pro Glu Ala Gly Glu Lys Leu Phe Pro Tyr Leu Asp Pro Phe385 390
395 400Asp Ser Ala Ser Gln Leu Met Glu Pro Gly Arg Gln Ser Met Asp
Leu 405 410 415Val Glu Leu Val Ser Leu Phe Pro Thr Leu Ala Gly Leu
Ala Gly Leu 420 425 430Gln Val Pro Pro Arg Cys Pro Val Pro Ser Phe
His Val Glu Leu Cys 435 440 445Arg Glu Gly Lys Asn Leu Leu Lys His
Phe Arg Phe Arg Asp Leu Glu 450 455 460Glu Asp Pro Tyr Leu Pro Gly
Asn Pro Arg Glu Leu Ile Ala Tyr Ser465 470 475 480Gln Tyr Pro Arg
Pro Ser Asp Ile Pro Gln Trp Asn Ser Asp Lys Pro 485 490 495Ser Leu
Lys Asp Ile Lys Ile Met Gly Tyr Ser Ile Arg Thr Ile Asp 500 505
510Tyr Arg Tyr Thr Val Trp Val Gly Phe Asn Pro Asp Glu Phe Leu Ala
515 520 525Asn Phe Ser Asp Ile His Ala Gly Glu Leu Tyr Phe Val Asp
Ser Asp 530 535 540Pro Leu Gln Asp His Asn Met Tyr Asn Asp Ser Gln
Gly Gly Asp Leu545 550 555 560Phe Gln Leu Leu Met Pro Glu Ala Ala
Ala Lys Glu Ala Ala Ala Lys 565 570 575Glu Ala Ala Ala Lys Gly Met
Glu Val Arg Trp Cys Ala Thr Ser Asp 580 585 590Pro Glu Gln His Lys
Cys Gly Asn Met Ser Glu Ala Phe Arg Glu Ala 595 600 605Gly Ile Gln
Pro Ser Leu Leu Cys Val Arg Gly Thr Ser Ala Asp His 610 615 620Cys
Val Gln Leu Ile Ala Ala Gln Glu Ala Asp Ala Ile Thr Leu Asp625 630
635 640Gly Gly Ala Ile Tyr Glu Ala Gly Lys Glu His Gly Leu Lys Pro
Val 645 650 655Val Gly Glu Val Tyr Asp Gln Glu Val Gly Thr Ser Tyr
Tyr Ala Val 660 665 670Ala Val Val Arg Arg Ser Ser His Val Thr Ile
Asp Thr Leu Lys Gly 675 680 685Val Lys Ser Cys His Thr Gly Ile Asn
Arg Thr Val Gly Trp Asn Val 690 695 700Pro Val Gly Tyr Leu Val Glu
Ser Gly Arg Leu Ser Val Met Gly Cys705 710 715 720Asp Val Leu Lys
Ala Val Ser Asp Tyr Phe Gly Gly Ser Cys Val Pro 725 730 735Gly Ala
Gly Glu Thr Ser Tyr Ser Glu Ser Leu Cys Arg Leu Cys Arg 740 745
750Gly Asp Ser Ser Gly Glu Gly Val Cys Asp Lys Ser Pro Leu Glu Arg
755 760 765Tyr Tyr Asp Tyr Ser Gly Ala Phe Arg Cys Leu Ala Glu Gly
Ala Gly 770 775 780Asp Val Ala Phe Val Lys His Ser Thr Val Leu Glu Asn Thr Asp
Gly785 790 795 800Lys Thr Leu Pro Ser Trp Gly Gln Ala Leu Leu Ser
Gln Asp Phe Glu 805 810 815Leu Leu Cys Arg Asp Gly Ser Arg Ala Asp
Val Thr Glu Trp Arg Gln 820 825 830Cys His Leu Ala Arg Val Pro Ala
His Ala Val Val Val Arg Ala Asp 835 840 845Thr Asp Gly Gly Leu Ile
Phe Arg Leu Leu Asn Glu Gly Gln Arg Leu 850 855 860Phe Ser His Glu
Gly Ser Ser Phe Gln Met Phe Ser Ser Glu Ala Tyr865 870 875 880Gly
Gln Lys Asp Leu Leu Phe Lys Asp Ser Thr Ser Glu Leu Val Pro 885 890
895Ile Ala Thr Gln Thr Tyr Glu Ala Trp Leu Gly His Glu Tyr Leu His
900 905 910Ala Met Lys Gly Leu Leu Cys Asp Pro Asn Arg Leu Pro Pro
Tyr Leu 915 920 925Arg Trp Cys Val Leu Ser Thr Pro Glu Ile Gln Lys
Cys Gly Asp Met 930 935 940Ala Val Ala Phe Arg Arg Gln Arg Leu Lys
Pro Glu Ile Gln Cys Val945 950 955 960Ser Ala Lys Ser Pro Gln His
Cys Met Glu Arg Ile Gln Ala Glu Gln 965 970 975Val Asp Ala Val Thr
Leu Ser Gly Glu Asp Ile Tyr Thr Ala Gly Lys 980 985 990Thr Tyr Gly
Leu Val Pro Ala Ala Gly Glu His Tyr Ala Pro Glu Asp 995 1000
1005Ser Ser Asn Ser Tyr Tyr Val Val Ala Val Val Arg Arg Asp Ser
1010 1015 1020Ser His Ala Phe Thr Leu Asp Glu Leu Arg Gly Lys Arg
Ser Cys 1025 1030 1035His Ala Gly Phe Gly Ser Pro Ala Gly Trp Asp
Val Pro Val Gly 1040 1045 1050Ala Leu Ile Gln Arg Gly Phe Ile Arg
Pro Lys Asp Cys Asp Val 1055 1060 1065Leu Thr Ala Val Ser Glu Phe
Phe Asn Ala Ser Cys Val Pro Val 1070 1075 1080Asn Asn Pro Lys Asn
Tyr Pro Ser Ser Leu Cys Ala Leu Cys Val 1085 1090 1095Gly Asp Glu
Gln Gly Arg Asn Lys Cys Val Gly Asn Ser Gln Glu 1100 1105 1110Arg
Tyr Tyr Gly Tyr Arg Gly Ala Phe Arg Cys Leu Val Glu Asn 1115 1120
1125Ala Gly Asp Val Ala Phe Val Arg His Thr Thr Val Phe Asp Asn
1130 1135 1140Thr Asn Gly His Asn Ser Glu Pro Trp Ala Ala Glu Leu
Arg Ser 1145 1150 1155Glu Asp Tyr Glu Leu Leu Cys Pro Asn Gly Ala
Arg Ala Glu Val 1160 1165 1170Ser Gln Phe Ala Ala Cys Asn Leu Ala
Gln Ile Pro Pro His Ala 1175 1180 1185Val Met Val Arg Pro Asp Thr
Asn Ile Phe Thr Val Tyr Gly Leu 1190 1195 1200Leu Asp Lys Ala Gln
Asp Leu Phe Gly Asp Asp His Asn Lys Asn 1205 1210 1215Gly Phe Lys
Met Phe Asp Ser Ser Asn Tyr His Gly Gln Asp Leu 1220 1225 1230Leu
Phe Lys Asp Ala Thr Val Arg Ala Val Pro Val Gly Glu Lys 1235 1240
1245Thr Thr Tyr Arg Gly Trp Leu Gly Leu Asp Tyr Val Ala Ala Leu
1250 1255 1260Glu Gly Met Ser Ser Gln Gln Cys Ser 1265
1270301266PRTArtificial Sequencep97 fusion protein 30Met Arg Gly
Pro Ser Gly Ala Leu Trp Leu Leu Leu Ala Leu Arg Thr1 5 10 15Val Leu
Gly His His His His His His His His His His Glu Asn Leu 20 25 30Tyr
Phe Gln Gly Met Glu Val Arg Trp Cys Ala Thr Ser Asp Pro Glu 35 40
45Gln His Lys Cys Gly Asn Met Ser Glu Ala Phe Arg Glu Ala Gly Ile
50 55 60Gln Pro Ser Leu Leu Cys Val Arg Gly Thr Ser Ala Asp His Cys
Val65 70 75 80Gln Leu Ile Ala Ala Gln Glu Ala Asp Ala Ile Thr Leu
Asp Gly Gly 85 90 95Ala Ile Tyr Glu Ala Gly Lys Glu His Gly Leu Lys
Pro Val Val Gly 100 105 110Glu Val Tyr Asp Gln Glu Val Gly Thr Ser
Tyr Tyr Ala Val Ala Val 115 120 125Val Arg Arg Ser Ser His Val Thr
Ile Asp Thr Leu Lys Gly Val Lys 130 135 140Ser Cys His Thr Gly Ile
Asn Arg Thr Val Gly Trp Asn Val Pro Val145 150 155 160Gly Tyr Leu
Val Glu Ser Gly Arg Leu Ser Val Met Gly Cys Asp Val 165 170 175Leu
Lys Ala Val Ser Asp Tyr Phe Gly Gly Ser Cys Val Pro Gly Ala 180 185
190Gly Glu Thr Ser Tyr Ser Glu Ser Leu Cys Arg Leu Cys Arg Gly Asp
195 200 205Ser Ser Gly Glu Gly Val Cys Asp Lys Ser Pro Leu Glu Arg
Tyr Tyr 210 215 220Asp Tyr Ser Gly Ala Phe Arg Cys Leu Ala Glu Gly
Ala Gly Asp Val225 230 235 240Ala Phe Val Lys His Ser Thr Val Leu
Glu Asn Thr Asp Gly Lys Thr 245 250 255Leu Pro Ser Trp Gly Gln Ala
Leu Leu Ser Gln Asp Phe Glu Leu Leu 260 265 270Cys Arg Asp Gly Ser
Arg Ala Asp Val Thr Glu Trp Arg Gln Cys His 275 280 285Leu Ala Arg
Val Pro Ala His Ala Val Val Val Arg Ala Asp Thr Asp 290 295 300Gly
Gly Leu Ile Phe Arg Leu Leu Asn Glu Gly Gln Arg Leu Phe Ser305 310
315 320His Glu Gly Ser Ser Phe Gln Met Phe Ser Ser Glu Ala Tyr Gly
Gln 325 330 335Lys Asp Leu Leu Phe Lys Asp Ser Thr Ser Glu Leu Val
Pro Ile Ala 340 345 350Thr Gln Thr Tyr Glu Ala Trp Leu Gly His Glu
Tyr Leu His Ala Met 355 360 365Lys Gly Leu Leu Cys Asp Pro Asn Arg
Leu Pro Pro Tyr Leu Arg Trp 370 375 380Cys Val Leu Ser Thr Pro Glu
Ile Gln Lys Cys Gly Asp Met Ala Val385 390 395 400Ala Phe Arg Arg
Gln Arg Leu Lys Pro Glu Ile Gln Cys Val Ser Ala 405 410 415Lys Ser
Pro Gln His Cys Met Glu Arg Ile Gln Ala Glu Gln Val Asp 420 425
430Ala Val Thr Leu Ser Gly Glu Asp Ile Tyr Thr Ala Gly Lys Thr Tyr
435 440 445Gly Leu Val Pro Ala Ala Gly Glu His Tyr Ala Pro Glu Asp
Ser Ser 450 455 460Asn Ser Tyr Tyr Val Val Ala Val Val Arg Arg Asp
Ser Ser His Ala465 470 475 480Phe Thr Leu Asp Glu Leu Arg Gly Lys
Arg Ser Cys His Ala Gly Phe 485 490 495Gly Ser Pro Ala Gly Trp Asp
Val Pro Val Gly Ala Leu Ile Gln Arg 500 505 510Gly Phe Ile Arg Pro
Lys Asp Cys Asp Val Leu Thr Ala Val Ser Glu 515 520 525Phe Phe Asn
Ala Ser Cys Val Pro Val Asn Asn Pro Lys Asn Tyr Pro 530 535 540Ser
Ser Leu Cys Ala Leu Cys Val Gly Asp Glu Gln Gly Arg Asn Lys545 550
555 560Cys Val Gly Asn Ser Gln Glu Arg Tyr Tyr Gly Tyr Arg Gly Ala
Phe 565 570 575Arg Cys Leu Val Glu Asn Ala Gly Asp Val Ala Phe Val
Arg His Thr 580 585 590Thr Val Phe Asp Asn Thr Asn Gly His Asn Ser
Glu Pro Trp Ala Ala 595 600 605Glu Leu Arg Ser Glu Asp Tyr Glu Leu
Leu Cys Pro Asn Gly Ala Arg 610 615 620Ala Glu Val Ser Gln Phe Ala
Ala Cys Asn Leu Ala Gln Ile Pro Pro625 630 635 640His Ala Val Met
Val Arg Pro Asp Thr Asn Ile Phe Thr Val Tyr Gly 645 650 655Leu Leu
Asp Lys Ala Gln Asp Leu Phe Gly Asp Asp His Asn Lys Asn 660 665
670Gly Phe Lys Met Phe Asp Ser Ser Asn Tyr His Gly Gln Asp Leu Leu
675 680 685Phe Lys Asp Ala Thr Val Arg Ala Val Pro Val Gly Glu Lys
Thr Thr 690 695 700Tyr Arg Gly Trp Leu Gly Leu Asp Tyr Val Ala Ala
Leu Glu Gly Met705 710 715 720Ser Ser Gln Gln Cys Ser Glu Ala Ala
Ala Lys Glu Ala Ala Ala Lys 725 730 735Glu Ala Ala Ala Lys Ser Glu
Thr Gln Ala Asn Ser Thr Thr Asp Ala 740 745 750Leu Asn Val Leu Leu
Ile Ile Val Asp Asp Leu Arg Pro Ser Leu Gly 755 760 765Cys Tyr Gly
Asp Lys Leu Val Arg Ser Pro Asn Ile Asp Gln Leu Ala 770 775 780Ser
His Ser Leu Leu Phe Gln Asn Ala Phe Ala Gln Gln Ala Val Cys785 790
795 800Ala Pro Ser Arg Val Ser Phe Leu Thr Gly Arg Arg Pro Asp Thr
Thr 805 810 815Arg Leu Tyr Asp Phe Asn Ser Tyr Trp Arg Val His Ala
Gly Asn Phe 820 825 830Ser Thr Ile Pro Gln Tyr Phe Lys Glu Asn Gly
Tyr Val Thr Met Ser 835 840 845Val Gly Lys Val Phe His Pro Gly Ile
Ser Ser Asn His Thr Asp Asp 850 855 860Ser Pro Tyr Ser Trp Ser Phe
Pro Pro Tyr His Pro Ser Ser Glu Lys865 870 875 880Tyr Glu Asn Thr
Lys Thr Cys Arg Gly Pro Asp Gly Glu Leu His Ala 885 890 895Asn Leu
Leu Cys Pro Val Asp Val Leu Asp Val Pro Glu Gly Thr Leu 900 905
910Pro Asp Lys Gln Ser Thr Glu Gln Ala Ile Gln Leu Leu Glu Lys Met
915 920 925Lys Thr Ser Ala Ser Pro Phe Phe Leu Ala Val Gly Tyr His
Lys Pro 930 935 940His Ile Pro Phe Arg Tyr Pro Lys Glu Phe Gln Lys
Leu Tyr Pro Leu945 950 955 960Glu Asn Ile Thr Leu Ala Pro Asp Pro
Glu Val Pro Asp Gly Leu Pro 965 970 975Pro Val Ala Tyr Asn Pro Trp
Met Asp Ile Arg Gln Arg Glu Asp Val 980 985 990Gln Ala Leu Asn Ile
Ser Val Pro Tyr Gly Pro Ile Pro Val Asp Phe 995 1000 1005Gln Arg
Lys Ile Arg Gln Ser Tyr Phe Ala Ser Val Ser Tyr Leu 1010 1015
1020Asp Thr Gln Val Gly Arg Leu Leu Ser Ala Leu Asp Asp Leu Gln
1025 1030 1035Leu Ala Asn Ser Thr Ile Ile Ala Phe Thr Ser Asp His
Gly Trp 1040 1045 1050Ala Leu Gly Glu His Gly Glu Trp Ala Lys Tyr
Ser Asn Phe Asp 1055 1060 1065Val Ala Thr His Val Pro Leu Ile Phe
Tyr Val Pro Gly Arg Thr 1070 1075 1080Ala Ser Leu Pro Glu Ala Gly
Glu Lys Leu Phe Pro Tyr Leu Asp 1085 1090 1095Pro Phe Asp Ser Ala
Ser Gln Leu Met Glu Pro Gly Arg Gln Ser 1100 1105 1110Met Asp Leu
Val Glu Leu Val Ser Leu Phe Pro Thr Leu Ala Gly 1115 1120 1125Leu
Ala Gly Leu Gln Val Pro Pro Arg Cys Pro Val Pro Ser Phe 1130 1135
1140His Val Glu Leu Cys Arg Glu Gly Lys Asn Leu Leu Lys His Phe
1145 1150 1155Arg Phe Arg Asp Leu Glu Glu Asp Pro Tyr Leu Pro Gly
Asn Pro 1160 1165 1170Arg Glu Leu Ile Ala Tyr Ser Gln Tyr Pro Arg
Pro Ser Asp Ile 1175 1180 1185Pro Gln Trp Asn Ser Asp Lys Pro Ser
Leu Lys Asp Ile Lys Ile 1190 1195 1200Met Gly Tyr Ser Ile Arg Thr
Ile Asp Tyr Arg Tyr Thr Val Trp 1205 1210 1215Val Gly Phe Asn Pro
Asp Glu Phe Leu Ala Asn Phe Ser Asp Ile 1220 1225 1230His Ala Gly
Glu Leu Tyr Phe Val Asp Ser Asp Pro Leu Gln Asp 1235 1240 1245His
Asn Met Tyr Asn Asp Ser Gln Gly Gly Asp Leu Phe Gln Leu 1250 1255
1260Leu Met Pro 126531550PRTHomo sapiens 31Met Pro Pro Pro Arg Thr
Gly Arg Gly Leu Leu Trp Leu Gly Leu Val1 5 10 15Leu Ser Ser Val Cys
Val Ala Leu Gly Ser Glu Thr Gln Ala Asn Ser 20 25 30Thr Thr Asp Ala
Leu Asn Val Leu Leu Ile Ile Val Asp Asp Leu Arg 35 40 45Pro Ser Leu
Gly Cys Tyr Gly Asp Lys Leu Val Arg Ser Pro Asn Ile 50 55 60Asp Gln
Leu Ala Ser His Ser Leu Leu Phe Gln Asn Ala Phe Ala Gln65 70 75
80Gln Ala Val Cys Ala Pro Ser Arg Val Ser Phe Leu Thr Gly Arg Arg
85 90 95Pro Asp Thr Thr Arg Leu Tyr Asp Phe Asn Ser Tyr Trp Arg Val
His 100 105 110Ala Gly Asn Phe Ser Thr Ile Pro Gln Tyr Phe Lys Glu
Asn Gly Tyr 115 120 125Val Thr Met Ser Val Gly Lys Val Phe His Pro
Gly Ile Ser Ser Asn 130 135 140His Thr Asp Asp Ser Pro Tyr Ser Trp
Ser Phe Pro Pro Tyr His Pro145 150 155 160Ser Ser Glu Lys Tyr Glu
Asn Thr Lys Thr Cys Arg Gly Pro Asp Gly 165 170 175Glu Leu His Ala
Asn Leu Leu Cys Pro Val Asp Val Leu Asp Val Pro 180 185 190Glu Gly
Thr Leu Pro Asp Lys Gln Ser Thr Glu Gln Ala Ile Gln Leu 195 200
205Leu Glu Lys Met Lys Thr Ser Ala Ser Pro Phe Phe Leu Ala Val Gly
210 215 220Tyr His Lys Pro His Ile Pro Phe Arg Tyr Pro Lys Glu Phe
Gln Lys225 230 235 240Leu Tyr Pro Leu Glu Asn Ile Thr Leu Ala Pro
Asp Pro Glu Val Pro 245 250 255Asp Gly Leu Pro Pro Val Ala Tyr Asn
Pro Trp Met Asp Ile Arg Gln 260 265 270Arg Glu Asp Val Gln Ala Leu
Asn Ile Ser Val Pro Tyr Gly Pro Ile 275 280 285Pro Val Asp Phe Gln
Arg Lys Ile Arg Gln Ser Tyr Phe Ala Ser Val 290 295 300Ser Tyr Leu
Asp Thr Gln Val Gly Arg Leu Leu Ser Ala Leu Asp Asp305 310 315
320Leu Gln Leu Ala Asn Ser Thr Ile Ile Ala Phe Thr Ser Asp His Gly
325 330 335Trp Ala Leu Gly Glu His Gly Glu Trp Ala Lys Tyr Ser Asn
Phe Asp 340 345 350Val Ala Thr His Val Pro Leu Ile Phe Tyr Val Pro
Gly Arg Thr Ala 355 360 365Ser Leu Pro Glu Ala Gly Glu Lys Leu Phe
Pro Tyr Leu Asp Pro Phe 370 375 380Asp Ser Ala Ser Gln Leu Met Glu
Pro Gly Arg Gln Ser Met Asp Leu385 390 395 400Val Glu Leu Val Ser
Leu Phe Pro Thr Leu Ala Gly Leu Ala Gly Leu 405 410 415Gln Val Pro
Pro Arg Cys Pro Val Pro Ser Phe His Val Glu Leu Cys 420 425 430Arg
Glu Gly Lys Asn Leu Leu Lys His Phe Arg Phe Arg Asp Leu Glu 435 440
445Glu Asp Pro Tyr Leu Pro Gly Asn Pro Arg Glu Leu Ile Ala Tyr Ser
450 455 460Gln Tyr Pro Arg Pro Ser Asp Ile Pro Gln Trp Asn Ser Asp
Lys Pro465 470 475 480Ser Leu Lys Asp Ile Lys Ile Met Gly Tyr Ser
Ile Arg Thr Ile Asp 485 490 495Tyr Arg Tyr Thr Val Trp Val Gly Phe
Asn Pro Asp Glu Phe Leu Ala 500 505 510Asn Phe Ser Asp Ile His Ala
Gly Glu Leu Tyr Phe Val Asp Ser Asp 515 520 525Pro Leu Gln Asp His
Asn Met Tyr Asn Asp Ser Gln Gly Gly Asp Leu 530 535 540Phe Gln Leu
Leu Met Pro545 55032525PRTHomo sapiens 32Ser Glu Thr Gln Ala Asn
Ser Thr Thr Asp Ala Leu Asn Val Leu Leu1 5 10 15Ile Ile Val Asp Asp
Leu Arg Pro Ser Leu Gly Cys Tyr Gly Asp Lys 20 25 30Leu Val Arg Ser
Pro Asn Ile Asp Gln Leu Ala Ser His Ser Leu Leu 35 40 45Phe Gln Asn
Ala Phe Ala Gln Gln Ala Val Cys Ala Pro Ser Arg Val 50 55 60Ser Phe
Leu Thr Gly Arg Arg Pro Asp Thr Thr Arg Leu Tyr Asp Phe65 70 75
80Asn Ser Tyr Trp Arg Val His Ala Gly Asn Phe Ser Thr Ile Pro Gln
85 90 95Tyr Phe Lys Glu Asn Gly Tyr Val Thr Met Ser Val Gly Lys Val
Phe 100 105 110His Pro Gly Ile Ser Ser Asn His Thr Asp Asp Ser Pro
Tyr Ser Trp 115 120 125Ser Phe Pro Pro Tyr His Pro Ser Ser Glu Lys
Tyr Glu Asn Thr Lys 130 135 140Thr Cys Arg Gly Pro Asp Gly Glu Leu
His Ala Asn Leu Leu Cys Pro145 150 155 160Val Asp Val Leu Asp Val Pro Glu
Gly Thr Leu Pro Asp Lys Gln Ser 165 170 175Thr Glu Gln Ala Ile Gln
Leu Leu Glu Lys Met Lys Thr Ser Ala Ser 180 185 190Pro Phe Phe Leu
Ala Val Gly Tyr His Lys Pro His Ile Pro Phe Arg 195 200 205Tyr Pro
Lys Glu Phe Gln Lys Leu Tyr Pro Leu Glu Asn Ile Thr Leu 210 215
220Ala Pro Asp Pro Glu Val Pro Asp Gly Leu Pro Pro Val Ala Tyr
Asn225 230 235 240Pro Trp Met Asp Ile Arg Gln Arg Glu Asp Val Gln
Ala Leu Asn Ile 245 250 255Ser Val Pro Tyr Gly Pro Ile Pro Val Asp
Phe Gln Arg Lys Ile Arg 260 265 270Gln Ser Tyr Phe Ala Ser Val Ser
Tyr Leu Asp Thr Gln Val Gly Arg 275 280 285Leu Leu Ser Ala Leu Asp
Asp Leu Gln Leu Ala Asn Ser Thr Ile Ile 290 295 300Ala Phe Thr Ser
Asp His Gly Trp Ala Leu Gly Glu His Gly Glu Trp305 310 315 320Ala
Lys Tyr Ser Asn Phe Asp Val Ala Thr His Val Pro Leu Ile Phe 325 330
335Tyr Val Pro Gly Arg Thr Ala Ser Leu Pro Glu Ala Gly Glu Lys Leu
340 345 350Phe Pro Tyr Leu Asp Pro Phe Asp Ser Ala Ser Gln Leu Met
Glu Pro 355 360 365Gly Arg Gln Ser Met Asp Leu Val Glu Leu Val Ser
Leu Phe Pro Thr 370 375 380Leu Ala Gly Leu Ala Gly Leu Gln Val Pro
Pro Arg Cys Pro Val Pro385 390 395 400Ser Phe His Val Glu Leu Cys
Arg Glu Gly Lys Asn Leu Leu Lys His 405 410 415Phe Arg Phe Arg Asp
Leu Glu Glu Asp Pro Tyr Leu Pro Gly Asn Pro 420 425 430Arg Glu Leu
Ile Ala Tyr Ser Gln Tyr Pro Arg Pro Ser Asp Ile Pro 435 440 445Gln
Trp Asn Ser Asp Lys Pro Ser Leu Lys Asp Ile Lys Ile Met Gly 450 455
460Tyr Ser Ile Arg Thr Ile Asp Tyr Arg Tyr Thr Val Trp Val Gly
Phe465 470 475 480Asn Pro Asp Glu Phe Leu Ala Asn Phe Ser Asp Ile
His Ala Gly Glu 485 490 495Leu Tyr Phe Val Asp Ser Asp Pro Leu Gln
Asp His Asn Met Tyr Asn 500 505 510Asp Ser Gln Gly Gly Asp Leu Phe
Gln Leu Leu Met Pro 515 520 52533517PRTHomo sapiens 33Thr Asp Ala
Leu Asn Val Leu Leu Ile Ile Val Asp Asp Leu Arg Pro1 5 10 15Ser Leu
Gly Cys Tyr Gly Asp Lys Leu Val Arg Ser Pro Asn Ile Asp 20 25 30Gln
Leu Ala Ser His Ser Leu Leu Phe Gln Asn Ala Phe Ala Gln Gln 35 40
45Ala Val Cys Ala Pro Ser Arg Val Ser Phe Leu Thr Gly Arg Arg Pro
50 55 60Asp Thr Thr Arg Leu Tyr Asp Phe Asn Ser Tyr Trp Arg Val His
Ala65 70 75 80Gly Asn Phe Ser Thr Ile Pro Gln Tyr Phe Lys Glu Asn
Gly Tyr Val 85 90 95Thr Met Ser Val Gly Lys Val Phe His Pro Gly Ile
Ser Ser Asn His 100 105 110Thr Asp Asp Ser Pro Tyr Ser Trp Ser Phe
Pro Pro Tyr His Pro Ser 115 120 125Ser Glu Lys Tyr Glu Asn Thr Lys
Thr Cys Arg Gly Pro Asp Gly Glu 130 135 140Leu His Ala Asn Leu Leu
Cys Pro Val Asp Val Leu Asp Val Pro Glu145 150 155 160Gly Thr Leu
Pro Asp Lys Gln Ser Thr Glu Gln Ala Ile Gln Leu Leu 165 170 175Glu
Lys Met Lys Thr Ser Ala Ser Pro Phe Phe Leu Ala Val Gly Tyr 180 185
190His Lys Pro His Ile Pro Phe Arg Tyr Pro Lys Glu Phe Gln Lys Leu
195 200 205Tyr Pro Leu Glu Asn Ile Thr Leu Ala Pro Asp Pro Glu Val
Pro Asp 210 215 220Gly Leu Pro Pro Val Ala Tyr Asn Pro Trp Met Asp
Ile Arg Gln Arg225 230 235 240Glu Asp Val Gln Ala Leu Asn Ile Ser
Val Pro Tyr Gly Pro Ile Pro 245 250 255Val Asp Phe Gln Arg Lys Ile
Arg Gln Ser Tyr Phe Ala Ser Val Ser 260 265 270Tyr Leu Asp Thr Gln
Val Gly Arg Leu Leu Ser Ala Leu Asp Asp Leu 275 280 285Gln Leu Ala
Asn Ser Thr Ile Ile Ala Phe Thr Ser Asp His Gly Trp 290 295 300Ala
Leu Gly Glu His Gly Glu Trp Ala Lys Tyr Ser Asn Phe Asp Val305 310
315 320Ala Thr His Val Pro Leu Ile Phe Tyr Val Pro Gly Arg Thr Ala
Ser 325 330 335Leu Pro Glu Ala Gly Glu Lys Leu Phe Pro Tyr Leu Asp
Pro Phe Asp 340 345 350Ser Ala Ser Gln Leu Met Glu Pro Gly Arg Gln
Ser Met Asp Leu Val 355 360 365Glu Leu Val Ser Leu Phe Pro Thr Leu
Ala Gly Leu Ala Gly Leu Gln 370 375 380Val Pro Pro Arg Cys Pro Val
Pro Ser Phe His Val Glu Leu Cys Arg385 390 395 400Glu Gly Lys Asn
Leu Leu Lys His Phe Arg Phe Arg Asp Leu Glu Glu 405 410 415Asp Pro
Tyr Leu Pro Gly Asn Pro Arg Glu Leu Ile Ala Tyr Ser Gln 420 425
430Tyr Pro Arg Pro Ser Asp Ile Pro Gln Trp Asn Ser Asp Lys Pro Ser
435 440 445Leu Lys Asp Ile Lys Ile Met Gly Tyr Ser Ile Arg Thr Ile
Asp Tyr 450 455 460Arg Tyr Thr Val Trp Val Gly Phe Asn Pro Asp Glu
Phe Leu Ala Asn465 470 475 480Phe Ser Asp Ile His Ala Gly Glu Leu
Tyr Phe Val Asp Ser Asp Pro 485 490 495Leu Gln Asp His Asn Met Tyr
Asn Asp Ser Gln Gly Gly Asp Leu Phe 500 505 510Gln Leu Leu Met Pro
51534422PRTHomo sapiens 34Thr Asp Ala Leu Asn Val Leu Leu Ile Ile
Val Asp Asp Leu Arg Pro1 5 10 15Ser Leu Gly Cys Tyr Gly Asp Lys Leu
Val Arg Ser Pro Asn Ile Asp 20 25 30Gln Leu Ala Ser His Ser Leu Leu
Phe Gln Asn Ala Phe Ala Gln Gln 35 40 45Ala Val Cys Ala Pro Ser Arg
Val Ser Phe Leu Thr Gly Arg Arg Pro 50 55 60Asp Thr Thr Arg Leu Tyr
Asp Phe Asn Ser Tyr Trp Arg Val His Ala65 70 75 80Gly Asn Phe Ser
Thr Ile Pro Gln Tyr Phe Lys Glu Asn Gly Tyr Val 85 90 95Thr Met Ser
Val Gly Lys Val Phe His Pro Gly Ile Ser Ser Asn His 100 105 110Thr
Asp Asp Ser Pro Tyr Ser Trp Ser Phe Pro Pro Tyr His Pro Ser 115 120
125Ser Glu Lys Tyr Glu Asn Thr Lys Thr Cys Arg Gly Pro Asp Gly Glu
130 135 140Leu His Ala Asn Leu Leu Cys Pro Val Asp Val Leu Asp Val
Pro Glu145 150 155 160Gly Thr Leu Pro Asp Lys Gln Ser Thr Glu Gln
Ala Ile Gln Leu Leu 165 170 175Glu Lys Met Lys Thr Ser Ala Ser Pro
Phe Phe Leu Ala Val Gly Tyr 180 185 190His Lys Pro His Ile Pro Phe
Arg Tyr Pro Lys Glu Phe Gln Lys Leu 195 200 205Tyr Pro Leu Glu Asn
Ile Thr Leu Ala Pro Asp Pro Glu Val Pro Asp 210 215 220Gly Leu Pro
Pro Val Ala Tyr Asn Pro Trp Met Asp Ile Arg Gln Arg225 230 235
240Glu Asp Val Gln Ala Leu Asn Ile Ser Val Pro Tyr Gly Pro Ile Pro
245 250 255Val Asp Phe Gln Arg Lys Ile Arg Gln Ser Tyr Phe Ala Ser
Val Ser 260 265 270Tyr Leu Asp Thr Gln Val Gly Arg Leu Leu Ser Ala
Leu Asp Asp Leu 275 280 285Gln Leu Ala Asn Ser Thr Ile Ile Ala Phe
Thr Ser Asp His Gly Trp 290 295 300Ala Leu Gly Glu His Gly Glu Trp
Ala Lys Tyr Ser Asn Phe Asp Val305 310 315 320Ala Thr His Val Pro
Leu Ile Phe Tyr Val Pro Gly Arg Thr Ala Ser 325 330 335Leu Pro Glu
Ala Gly Glu Lys Leu Phe Pro Tyr Leu Asp Pro Phe Asp 340 345 350Ser
Ala Ser Gln Leu Met Glu Pro Gly Arg Gln Ser Met Asp Leu Val 355 360
365Glu Leu Val Ser Leu Phe Pro Thr Leu Ala Gly Leu Ala Gly Leu Gln
370 375 380Val Pro Pro Arg Cys Pro Val Pro Ser Phe His Val Glu Leu
Cys Arg385 390 395 400Glu Gly Lys Asn Leu Leu Lys His Phe Arg Phe
Arg Asp Leu Glu Glu 405 410 415Asp Pro Tyr Leu Pro Gly
4203595PRTHomo sapiens 35Asn Pro Arg Glu Leu Ile Ala Tyr Ser Gln
Tyr Pro Arg Pro Ser Asp1 5 10 15Ile Pro Gln Trp Asn Ser Asp Lys Pro
Ser Leu Lys Asp Ile Lys Ile 20 25 30Met Gly Tyr Ser Ile Arg Thr Ile
Asp Tyr Arg Tyr Thr Val Trp Val 35 40 45Gly Phe Asn Pro Asp Glu Phe
Leu Ala Asn Phe Ser Asp Ile His Ala 50 55 60Gly Glu Leu Tyr Phe Val
Asp Ser Asp Pro Leu Gln Asp His Asn Met65 70 75 80Tyr Asn Asp Ser
Gln Gly Gly Asp Leu Phe Gln Leu Leu Met Pro 85 90
95365PRTArtificial Sequencerigid peptide linker 36Glu Ala Ala Ala
Lys1 53710PRTArtificial Sequencerigid peptide linker 37Glu Ala Ala
Ala Lys Glu Ala Ala Ala Lys1 5 103815PRTArtificial Sequencerigid
peptide linker 38Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Glu Ala
Ala Ala Lys1 5 10 153919PRTHomo sapiens 39Met Arg Gly Pro Ser Gly
Ala Leu Trp Leu Leu Leu Ala Leu Arg Thr1 5 10 15Val Leu
Gly4025PRTHomo sapiens 40Met Pro Pro Pro Arg Thr Gly Arg Gly Leu
Leu Trp Leu Gly Leu Val1 5 10 15Leu Ser Ser Val Cys Val Ala Leu Gly
20 254116PRTArtificial Sequencerigid peptide linker 41Ala Glu Ala
Ala Ala Lys Ala Leu Glu Ala Glu Ala Ala Ala Lys Ala1 5 10
154246PRTArtificial Sequencerigid peptide linker 42Ala Glu Ala Ala
Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys1 5 10 15Glu Ala Ala
Ala Lys Ala Leu Glu Ala Glu Ala Ala Ala Lys Glu Ala 20 25 30Ala Ala
Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Ala 35 40
45435PRTArtificial Sequencerigid peptide linker 43Pro Ala Pro Ala
Pro1 54412PRTArtificial Sequencerigid peptide linker 44Ala Glu Ala
Ala Ala Lys Glu Ala Ala Ala Lys Ala1 5 10455PRTArtificial
Sequenceflexible peptide linker 45Gly Gly Gly Gly Ser1
54610PRTArtificial Sequenceflexible peptide linker 46Gly Gly Gly
Gly Ser Gly Gly Gly Gly Ser1 5 104715PRTArtificial Sequenceflexible
peptide linker 47Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly
Gly Gly Ser1 5 10 15484PRTArtificial Sequenceflexible peptide
linker 48Gly Gly Gly Gly1495PRTArtificial Sequenceflexible peptide
linker 49Gly Gly Gly Gly Gly1 5506PRTArtificial Sequenceflexible
peptide linker 50Gly Gly Gly Gly Gly Gly1 5517PRTArtificial
Sequenceflexible peptide linker 51Gly Gly Gly Gly Gly Gly Gly1
5528PRTArtificial Sequenceflexible peptide linker 52Gly Gly Gly Gly
Gly Gly Gly Gly1 5539PRTArtificial Sequenceflexible peptide linker
53Gly Gly Gly Gly Gly Gly Gly Gly Gly1 55410PRTArtificial
Sequenceflexible peptide linker 54Gly Gly Gly Gly Gly Gly Gly Gly
Gly Gly1 5 10554PRTArtificial SequencePeptide linker 55Gly Ser Gly
Ser1564PRTArtificial SequencePeptide linker 56Gly Gly Ser
Gly1574PRTArtificial SequencePeptide linker 57Gly Gly Gly
Ser1584PRTArtificial SequencePeptide linker 58Gly Asn Gly
Asn1594PRTArtificial SequencePeptide linker 59Gly Gly Asn
Gly1604PRTArtificial SequencePeptide linker 60Gly Gly Gly
Asn1615PRTArtificial SequencePeptide linker 61Gly Gly Gly Gly Asn1
56217PRTArtificial SequenceFactor XIa/FVIIa cleavable linker 62Val
Ser Gln Thr Ser Lys Leu Thr Arg Ala Glu Thr Val Phe Pro Asp1 5 10
15Val636PRTArtificial SequenceMatrix metalloprotease-1 cleavable
linker 63Pro Leu Gly Leu Trp Ala1 5646PRTArtificial SequenceHIV
protease cleavable linker 64Arg Val Leu Ala Glu Ala1
56510PRTArtificial SequenceHepatitis C virus NS3 protease cleavable
linker 65Glu Asp Val Val Cys Cys Ser Met Ser Tyr1 5
10668PRTArtificial SequenceFactor Xa cleavable linker 66Gly Gly Ile
Glu Gly Arg Gly Ser1 56710PRTArtificial SequenceFurin cleavable
linker 67Thr Arg His Arg Gln Pro Arg Gly Trp Glu1 5
106810PRTArtificial SequenceFurin cleavable linker 68Ala Gly Asn
Arg Val Arg Arg Ser Val Gly1 5 10699PRTArtificial SequenceFurin
cleavable linker 69Arg Arg Arg Arg Arg Arg Arg Arg Arg1
5704PRTArtificial SequenceCathepsin B cleavable linker 70Gly Phe
Leu Gly1714PRTArtificial SequenceThrombin cleavable linker 71Gly
Arg Gly Asp1726PRTArtificial SequenceThrombin cleavable linker
72Gly Arg Gly Asp Asn Pro1 5735PRTArtificial SequenceThrombin
cleavable linker 73Gly Arg Gly Asp Ser1 5747PRTArtificial
SequenceThrombin cleavable linker 74Gly Arg Gly Asp Ser Pro Lys1
5754PRTArtificial SequenceElastase cleavable linker 75Ala Ala Pro
Val1764PRTArtificial SequenceElastase cleavable linker 76Ala Ala
Pro Leu1774PRTArtificial SequenceElastase cleavable linker 77Ala
Ala Pro Phe1784PRTArtificial SequenceElastase cleavable linker
78Ala Ala Pro Ala1794PRTArtificial SequenceElastase cleavable
linker 79Ala Tyr Leu Val1806PRTArtificial SequenceMatrix
metalloproteinase cleavable linkerVARIANT(3)..(3)Xaa = Any amino
acidVARIANT(6)..(6)Xaa = Any amino acid 80Gly Pro Xaa Gly Pro Xaa1
5814PRTArtificial SequenceMatrix metalloproteinase cleavable
linkerVARIANT(4)..(4)Xaa = Any amino acid 81Leu Gly Pro
Xaa1826PRTArtificial SequenceMatrix metalloproteinase cleavable
linkerVARIANT(6)..(6)Xaa = Any amino acid 82Gly Pro Ile Gly Pro
Xaa1 5835PRTArtificial SequenceMatrix metalloproteinase cleavable
linkerVARIANT(5)..(5)Xaa = Any amino acid 83Ala Pro Gly Leu Xaa1
5847PRTArtificial SequenceCollagenase cleavable
linkerVARIANT(7)..(7)Xaa = Any amino acid 84Pro Leu Gly Pro Asp Arg
Xaa1 5857PRTArtificial SequenceCollagenase cleavable
linkerVARIANT(7)..(7)Xaa = Any amino acid 85Pro Leu Gly Leu Leu Gly
Xaa1 5867PRTArtificial SequenceCollagenase cleavable linker 86Pro
Gln Gly Ile Ala Gly Trp1 5875PRTArtificial SequenceCollagenase
cleavable linker 87Pro Leu Gly Cys His1 5886PRTArtificial
SequenceCollagenase cleavable linker 88Pro Leu Gly Leu Tyr Ala1
5897PRTArtificial SequenceCollagenase cleavable linker 89Pro Leu
Ala Leu Trp Ala Arg1 5907PRTArtificial SequenceCollagenase
cleavable linker 90Pro Leu Ala Tyr Trp Ala Arg1 5917PRTArtificial
SequenceStromelysin cleavable linker 91Pro Tyr Ala Tyr Tyr Met Arg1
5927PRTArtificial SequenceGelatinase cleavable linker 92Pro Leu Gly
Met Tyr Ser Arg1 5934PRTArtificial SequenceAngiotensin converting
enzyme cleavable linker 93Gly Asp Lys Pro1945PRTArtificial
SequenceAngiotensin converting enzyme cleavable linker 94Gly Ser
Asp Lys Pro1 5954PRTArtificial SequenceCathepsin B cleavable linker
95Ala Leu Ala Leu1964PRTArtificial SequenceCathepsin B cleavable
linker 96Gly Phe Leu Gly19722PRTHomo sapiens 97Met Asp Met Arg Ala
Pro Ala Gly Ile Phe Gly Phe Leu Leu Val Leu1 5 10 15Phe Pro Gly Tyr
Arg Ser 209818PRTHomo sapiens 98Met Lys Trp Val Thr Phe Ile Ser Leu
Leu Phe Leu Phe Ser Ser Ala1 5 10 15Tyr Ser9919PRTHomo sapiens
99Met Asp Trp Thr Trp Arg Val Phe Cys Leu Leu Ala Val Thr Pro Gly1
5 10 15Ala His Pro10019PRTHomo sapiens 100Met Ala Trp Ser Pro Leu
Phe Leu Thr Leu Ile Thr His Cys Ala Gly1 5 10 15Ser Trp
Ala10119PRTHomo sapiens 101Met Thr Arg Leu Thr Val Leu Ala Leu Leu
Ala Gly Leu Leu Ala Ser1 5 10 15Ser Arg Ala10220PRTHomo sapiens
102Met Ala Arg Pro Leu Cys Thr Leu Leu Leu Leu Met Ala Thr Leu
Ala1 5 10 15Gly Ala Leu Ala 2010315PRTHomo sapiens 103Met Arg Ser
Leu Val Phe Val Leu Leu Ile Gly Ala Ala Phe Ala1 5 10
1510422PRTHomo sapiens 104Met Ser Arg Leu Phe Val Phe Ile Leu Ile
Ala Leu Phe Leu Ser Ala1 5 10 15Ile Ile Asp Val Met Ser
2010521PRTHomo sapiens 105Met Gly Met Arg Met Met Phe Ile Met Phe
Met Leu Val Val Leu Ala1 5 10 15Thr Thr Val Val Ser 2010618PRTHomo
sapiens 106Met Arg Ala Phe Leu Phe Leu Thr Ala Cys Ile Ser Leu Pro
Gly Val1 5 10 15Phe Gly10718PRTHomo sapiens 107Met Lys Phe Gln Ser
Thr Leu Leu Leu Ala Ala Ala Ala Gly Ser Ala1 5 10 15Leu
Ala10824PRTHomo sapiens 108Met Ala Ser Ser Leu Tyr Ser Phe Leu Leu
Ala Leu Ser Ile Val Tyr1 5 10 15Ile Phe Val Ala Pro Thr His Ser
2010926PRTHomo sapiens 109Met Lys Thr His Tyr Ser Ser Ala Ile Leu
Pro Ile Leu Thr Leu Phe1 5 10 15Val Phe Leu Ser Ile Asn Pro Ser His
Gly 20 2511036PRTHomo sapiens 110Met Glu Ser Val Ser Ser Leu Phe
Asn Ile Phe Ser Thr Ile Met Val1 5 10 15Asn Tyr Lys Ser Leu Val Leu
Ala Leu Leu Ser Val Ser Asn Leu Lys 20 25 30Tyr Ala Arg Gly
3511121PRTHomo sapiens 111Met Lys Ala Ala Gln Ile Leu Thr Ala Ser
Ile Val Ser Leu Leu Pro1 5 10 15Ile Tyr Thr Ser Ala 2011219PRTHomo
sapiens 112Met Ile Lys Leu Lys Phe Gly Val Phe Phe Thr Val Leu Leu
Ser Ser1 5 10 15Ala Tyr Ala1135PRTArtificial SequencePurification
tag 113His His His His His1 51146PRTArtificial SequencePurification
tag 114His His His His His His1 51157PRTArtificial
SequencePurification tag 115His His His His His His His1
51168PRTArtificial SequencePurification tag 116His His His His His
His His His1 51179PRTArtificial SequencePurification tag 117His His
His His His His His His His1 511810PRTArtificial
SequencePurification tag 118His His His His His His His His His
His1 5 1011915PRTArtificial SequencePurification tag - AviTag
119Gly Leu Asn Asp Ile Phe Glu Ala Gln Lys Ile Glu Trp His Glu1 5
10 1512026PRTArtificial SequencePurification tag - Calmodulin-tag
120Lys Arg Arg Trp Lys Lys Asn Phe Ile Ala Val Ser Ala Ala Asn Arg1
5 10 15Phe Lys Lys Ile Ser Ser Ser Gly Ala Leu 20
251216PRTArtificial SequencePurification tag - Polyglutamate tag
121Glu Glu Glu Glu Glu Glu1 51228PRTArtificial SequencePurification
tag - FLAG-tag 122Asp Tyr Lys Asp Asp Asp Asp Lys1
51239PRTArtificial SequencePurification tag - HA-tag 123Tyr Pro
Tyr Asp Val Pro Asp Tyr Ala1 512410PRTArtificial
SequencePurification tag - MYC-tag 124Glu Gln Lys Leu Ile Ser Glu
Glu Asp Leu1 5 1012515PRTArtificial SequencePurification tag -
S-tag 125Lys Glu Thr Ala Ala Ala Lys Phe Glu Arg Gln His Met Asp
Ser1 5 10 1512638PRTArtificial SequencePurification tag - SPB-tag
126Met Asp Glu Lys Thr Thr Gly Trp Arg Gly Gly His Val Val Glu Gly1
5 10 15Leu Ala Gly Glu Leu Glu Gln Leu Arg Ala Arg Leu Glu His His
Pro 20 25 30Gln Gly Gln Arg Glu Pro 3512713PRTArtificial
SequencePurification tag - Softag 1 127Ser Leu Ala Glu Leu Leu Asn
Ala Gly Leu Gly Gly Ser1 5 101288PRTArtificial SequencePurification
tag - Softag 3 128Thr Gln Asp Pro Ser Arg Val Gly1
512914PRTArtificial SequencePurification tag - V5 tag 129Gly Lys
Pro Ile Pro Asn Pro Leu Leu Gly Leu Asp Ser Thr1 5
101308PRTArtificial SequencePurification tag - Xpress tag 130Asp
Leu Tyr Asp Asp Asp Asp Lys1 51316PRTHomo sapiens 131Leu Val Pro
Arg Gly Ser1 51325PRTHomo sapiens 132Asp Asp Asp Asp Lys1
51334PRTHomo sapiensMISC_FEATURE(2)..(2)Xaa = Glu or Asp 133Ile Xaa
Gly Arg11345PRTHomo sapiens 134Asp Asp Asp Asp Lys1 51357PRTHomo
sapiens 135Glu Asn Leu Tyr Phe Gln Gly1 51368PRTHomo sapiens 136Leu
Glu Val Leu Phe Gln Gly Pro1 5137100PRTHomo sapiens 137Gly Ser Leu
Gln Asp Ser Glu Val Asn Gln Glu Ala Lys Pro Glu Val1 5 10 15Lys Pro
Glu Val Lys Pro Glu Thr His Ile Asn Leu Lys Val Ser Asp 20 25 30Gly
Ser Ser Glu Ile Phe Phe Lys Ile Lys Lys Thr Thr Pro Leu Arg 35 40
45Arg Leu Met Glu Ala Phe Ala Lys Arg Gln Gly Lys Glu Met Asp Ser
50 55 60Leu Thr Phe Leu Tyr Asp Gly Ile Glu Ile Gln Ala Asp Gln Thr
Pro65 70 75 80Glu Asp Leu Asp Met Glu Asp Asn Asp Ile Ile Glu Ala
His Arg Glu 85 90 95Gln Ile Gly Gly 1001381289PRTArtificial
Sequencep97 fusion protein 138Met Glu Trp Ser Trp Val Phe Leu Phe
Phe Leu Ser Val Thr Thr Gly1 5 10 15Val His Ser Asp Tyr Lys Asp Asp
Asp Asp Lys Glu Gln Lys Leu Ile 20 25 30Ser Glu Glu Asp Leu His His
His His His His His His His His Gly 35 40 45Gly Gly Gly Glu Asn Leu
Tyr Phe Gln Gly Ser Glu Thr Gln Ala Asn 50 55 60Ser Thr Thr Asp Ala
Leu Asn Val Leu Leu Ile Ile Val Asp Asp Leu65 70 75 80Arg Pro Ser
Leu Gly Cys Tyr Gly Asp Lys Leu Val Arg Ser Pro Asn 85 90 95Ile Asp
Gln Leu Ala Ser His Ser Leu Leu Phe Gln Asn Ala Phe Ala 100 105
110Gln Gln Ala Val Cys Ala Pro Ser Arg Val Ser Phe Leu Thr Gly Arg
115 120 125Arg Pro Asp Thr Thr Arg Leu Tyr Asp Phe Asn Ser Tyr Trp
Arg Val 130 135 140His Ala Gly Asn Phe Ser Thr Ile Pro Gln Tyr Phe
Lys Glu Asn Gly145 150 155 160Tyr Val Thr Met Ser Val Gly Lys Val
Phe His Pro Gly Ile Ser Ser 165 170 175Asn His Thr Asp Asp Ser Pro
Tyr Ser Trp Ser Phe Pro Pro Tyr His 180 185 190Pro Ser Ser Glu Lys
Tyr Glu Asn Thr Lys Thr Cys Arg Gly Pro Asp 195 200 205Gly Glu Leu
His Ala Asn Leu Leu Cys Pro Val Asp Val Leu Asp Val 210 215 220Pro
Glu Gly Thr Leu Pro Asp Lys Gln Ser Thr Glu Gln Ala Ile Gln225 230
235 240Leu Leu Glu Lys Met Lys Thr Ser Ala Ser Pro Phe Phe Leu Ala
Val 245 250 255Gly Tyr His Lys Pro His Ile Pro Phe Arg Tyr Pro Lys
Glu Phe Gln 260 265 270Lys Leu Tyr Pro Leu Glu Asn Ile Thr Leu Ala
Pro Asp Pro Glu Val 275 280 285Pro Asp Gly Leu Pro Pro Val Ala Tyr
Asn Pro Trp Met Asp Ile Arg 290 295 300Gln Arg Glu Asp Val Gln Ala
Leu Asn Ile Ser Val Pro Tyr Gly Pro305 310 315 320Ile Pro Val Asp
Phe Gln Arg Lys Ile Arg Gln Ser Tyr Phe Ala Ser 325 330 335Val Ser
Tyr Leu Asp Thr Gln Val Gly Arg Leu Leu Ser Ala Leu Asp 340 345
350Asp Leu Gln Leu Ala Asn Ser Thr Ile Ile Ala Phe Thr Ser Asp His
355 360 365Gly Trp Ala Leu Gly Glu His Gly Glu Trp Ala Lys Tyr Ser
Asn Phe 370 375 380Asp Val Ala Thr His Val Pro Leu Ile Phe Tyr Val
Pro Gly Arg Thr385 390 395 400Ala Ser Leu Pro Glu Ala Gly Glu Lys
Leu Phe Pro Tyr Leu Asp Pro 405 410 415Phe Asp Ser Ala Ser Gln Leu
Met Glu Pro Gly Arg Gln Ser Met Asp 420 425 430Leu Val Glu Leu Val
Ser Leu Phe Pro Thr Leu Ala Gly Leu Ala Gly 435 440 445Leu Gln Val
Pro Pro Arg Cys Pro Val Pro Ser Phe His Val Glu Leu 450 455 460Cys
Arg Glu Gly Lys Asn Leu Leu Lys His Phe Arg Phe Arg Asp Leu465 470
475 480Glu Glu Asp Pro Tyr Leu Pro Gly Asn Pro Arg Glu Leu Ile Ala
Tyr 485 490 495Ser Gln Tyr Pro Arg Pro Ser Asp Ile Pro Gln Trp Asn
Ser Asp Lys 500 505 510Pro Ser Leu Lys Asp Ile Lys Ile Met Gly Tyr
Ser Ile Arg Thr Ile 515 520 525Asp Tyr Arg Tyr Thr Val Trp Val Gly
Phe Asn Pro Asp Glu Phe Leu 530 535 540Ala Asn Phe Ser Asp Ile His
Ala Gly Glu Leu Tyr Phe Val Asp Ser545 550 555 560Asp Pro Leu Gln
Asp His Asn Met Tyr Asn Asp Ser Gln Gly Gly Asp 565 570 575Leu Phe
Gln Leu Leu Met Pro Glu Ala Ala Ala Lys Glu Ala Ala Ala 580 585
590Lys Glu Ala Ala Ala Lys Gly Met Glu Val Arg Trp Cys Ala Thr Ser
595 600 605Asp Pro Glu Gln His Lys Cys Gly Asn Met Ser Glu Ala Phe
Arg Glu 610 615 620Ala Gly Ile Gln Pro Ser Leu Leu Cys Val Arg Gly
Thr Ser Ala Asp625 630 635 640His Cys Val Gln Leu Ile Ala Ala Gln
Glu Ala Asp Ala Ile Thr Leu 645 650 655Asp Gly Gly Ala Ile Tyr Glu
Ala Gly Lys Glu His Gly Leu Lys Pro 660 665 670Val Val Gly Glu Val
Tyr Asp Gln Glu Val Gly Thr Ser Tyr Tyr Ala 675 680 685Val Ala Val
Val Arg Arg Ser Ser His Val Thr Ile Asp Thr Leu Lys 690 695 700Gly
Val Lys Ser Cys His Thr Gly Ile Asn Arg Thr Val Gly Trp Asn705 710
715 720Val Pro Val Gly Tyr Leu Val Glu Ser Gly Arg Leu Ser Val Met
Gly 725 730 735Cys Asp Val Leu Lys Ala Val Ser Asp Tyr Phe Gly Gly
Ser Cys Val 740 745 750Pro Gly Ala Gly Glu Thr Ser Tyr Ser Glu Ser
Leu Cys Arg Leu Cys 755 760 765Arg Gly Asp Ser Ser Gly Glu Gly Val
Cys Asp Lys Ser Pro Leu Glu 770 775 780Arg Tyr Tyr Asp Tyr Ser Gly
Ala Phe Arg Cys Leu Ala Glu Gly Ala785 790 795 800Gly Asp Val Ala
Phe Val Lys His Ser Thr Val Leu Glu Asn Thr Asp 805 810 815Gly Lys
Thr Leu Pro Ser Trp Gly Gln Ala Leu Leu Ser Gln Asp Phe 820 825
830Glu Leu Leu Cys Arg Asp Gly Ser Arg Ala Asp Val Thr Glu Trp Arg
835 840 845Gln Cys His Leu Ala Arg Val Pro Ala His Ala Val Val Val
Arg Ala 850 855 860Asp Thr Asp Gly Gly Leu Ile Phe Arg Leu Leu Asn
Glu Gly Gln Arg865 870 875 880Leu Phe Ser His Glu Gly Ser Ser Phe
Gln Met Phe Ser Ser Glu Ala 885 890 895Tyr Gly Gln Lys Asp Leu Leu
Phe Lys Asp Ser Thr Ser Glu Leu Val 900 905 910Pro Ile Ala Thr Gln
Thr Tyr Glu Ala Trp Leu Gly His Glu Tyr Leu 915 920 925His Ala Met
Lys Gly Leu Leu Cys Asp Pro Asn Arg Leu Pro Pro Tyr 930 935 940Leu
Arg Trp Cys Val Leu Ser Thr Pro Glu Ile Gln Lys Cys Gly Asp945 950
955 960Met Ala Val Ala Phe Arg Arg Gln Arg Leu Lys Pro Glu Ile Gln
Cys 965 970 975Val Ser Ala Lys Ser Pro Gln His Cys Met Glu Arg Ile
Gln Ala Glu 980 985 990Gln Val Asp Ala Val Thr Leu Ser Gly Glu Asp
Ile Tyr Thr Ala Gly 995 1000 1005Lys Thr Tyr Gly Leu Val Pro Ala
Ala Gly Glu His Tyr Ala Pro 1010 1015 1020Glu Asp Ser Ser Asn Ser
Tyr Tyr Val Val Ala Val Val Arg Arg 1025 1030 1035Asp Ser Ser His
Ala Phe Thr Leu Asp Glu Leu Arg Gly Lys Arg 1040 1045 1050Ser Cys
His Ala Gly Phe Gly Ser Pro Ala Gly Trp Asp Val Pro 1055 1060
1065Val Gly Ala Leu Ile Gln Arg Gly Phe Ile Arg Pro Lys Asp Cys
1070 1075 1080Asp Val Leu Thr Ala Val Ser Glu Phe Phe Asn Ala Ser
Cys Val 1085 1090 1095Pro Val Asn Asn Pro Lys Asn Tyr Pro Ser Ser
Leu Cys Ala Leu 1100 1105 1110Cys Val Gly Asp Glu Gln Gly Arg Asn
Lys Cys Val Gly Asn Ser 1115 1120 1125Gln Glu Arg Tyr Tyr Gly Tyr
Arg Gly Ala Phe Arg Cys Leu Val 1130 1135 1140Glu Asn Ala Gly Asp
Val Ala Phe Val Arg His Thr Thr Val Phe 1145 1150 1155Asp Asn Thr
Asn Gly His Asn Ser Glu Pro Trp Ala Ala Glu Leu 1160 1165 1170Arg
Ser Glu Asp Tyr Glu Leu Leu Cys Pro Asn Gly Ala Arg Ala 1175 1180
1185Glu Val Ser Gln Phe Ala Ala Cys Asn Leu Ala Gln Ile Pro Pro
1190 1195 1200His Ala Val Met Val Arg Pro Asp Thr Asn Ile Phe Thr
Val Tyr 1205 1210 1215Gly Leu Leu Asp Lys Ala Gln Asp Leu Phe Gly
Asp Asp His Asn 1220 1225 1230Lys Asn Gly Phe Lys Met Phe Asp Ser
Ser Asn Tyr His Gly Gln 1235 1240 1245Asp Leu Leu Phe Lys Asp Ala
Thr Val Arg Ala Val Pro Val Gly 1250 1255 1260Glu Lys Thr Thr Tyr
Arg Gly Trp Leu Gly Leu Asp Tyr Val Ala 1265 1270 1275Ala Leu Glu
Gly Met Ser Ser Gln Gln Cys Ser 1280 12851391289PRTArtificial
Sequencep97 fusion protein 139Met Glu Trp Ser Trp Val Phe Leu Phe
Phe Leu Ser Val Thr Thr Gly1 5 10 15Val His Ser Asp Tyr Lys Asp Asp
Asp Asp Lys Glu Gln Lys Leu Ile 20 25 30Ser Glu Glu Asp Leu His His
His His His His His His His His Gly 35 40 45Gly Gly Gly Glu Asn Leu
Tyr Phe Gln Gly Gly Met Glu Val Arg Trp 50 55 60Cys Ala Thr Ser Asp
Pro Glu Gln His Lys Cys Gly Asn Met Ser Glu65 70 75 80Ala Phe Arg
Glu Ala Gly Ile Gln Pro Ser Leu Leu Cys Val Arg Gly 85 90 95Thr Ser
Ala Asp His Cys Val Gln Leu Ile Ala Ala Gln Glu Ala Asp 100 105
110Ala Ile Thr Leu Asp Gly Gly Ala Ile Tyr Glu Ala Gly Lys Glu His
115 120 125Gly Leu Lys Pro Val Val Gly Glu Val Tyr Asp Gln Glu Val
Gly Thr 130 135 140Ser Tyr Tyr Ala Val Ala Val Val Arg Arg Ser Ser
His Val Thr Ile145 150 155 160Asp Thr Leu Lys Gly Val Lys Ser Cys
His Thr Gly Ile Asn Arg Thr 165 170 175Val Gly Trp Asn Val Pro Val
Gly Tyr Leu Val Glu Ser Gly Arg Leu 180 185 190Ser Val Met Gly Cys
Asp Val Leu Lys Ala Val Ser Asp Tyr Phe Gly 195 200 205Gly Ser Cys
Val Pro Gly Ala Gly Glu Thr Ser Tyr Ser Glu Ser Leu 210 215 220Cys
Arg Leu Cys Arg Gly Asp Ser Ser Gly Glu Gly Val Cys Asp Lys225 230
235 240Ser Pro Leu Glu Arg Tyr Tyr Asp Tyr Ser Gly Ala Phe Arg Cys
Leu 245 250 255Ala Glu Gly Ala Gly Asp Val Ala Phe Val Lys His Ser
Thr Val Leu 260 265 270Glu Asn Thr Asp Gly Lys Thr Leu Pro Ser Trp
Gly Gln Ala Leu Leu 275 280 285Ser Gln Asp Phe Glu Leu Leu Cys Arg
Asp Gly Ser Arg Ala Asp Val 290 295 300Thr Glu Trp Arg Gln Cys His
Leu Ala Arg Val Pro Ala His Ala Val305 310 315 320Val Val Arg Ala
Asp Thr Asp Gly Gly Leu Ile Phe Arg Leu Leu Asn 325 330 335Glu Gly
Gln Arg Leu Phe Ser His Glu Gly Ser Ser Phe Gln Met Phe 340 345
350Ser Ser Glu Ala Tyr Gly Gln Lys Asp Leu Leu Phe Lys Asp Ser Thr
355 360 365Ser Glu Leu Val Pro Ile Ala Thr Gln Thr Tyr Glu Ala Trp
Leu Gly 370 375 380His Glu Tyr Leu His Ala Met Lys Gly Leu Leu Cys
Asp Pro Asn Arg385 390 395 400Leu Pro Pro Tyr Leu Arg Trp Cys Val Leu Ser Thr Pro Glu Ile
Gln 405 410 415Lys Cys Gly Asp Met Ala Val Ala Phe Arg Arg Gln Arg
Leu Lys Pro 420 425 430Glu Ile Gln Cys Val Ser Ala Lys Ser Pro Gln
His Cys Met Glu Arg 435 440 445Ile Gln Ala Glu Gln Val Asp Ala Val
Thr Leu Ser Gly Glu Asp Ile 450 455 460Tyr Thr Ala Gly Lys Thr Tyr
Gly Leu Val Pro Ala Ala Gly Glu His465 470 475 480Tyr Ala Pro Glu
Asp Ser Ser Asn Ser Tyr Tyr Val Val Ala Val Val 485 490 495Arg Arg
Asp Ser Ser His Ala Phe Thr Leu Asp Glu Leu Arg Gly Lys 500 505
510Arg Ser Cys His Ala Gly Phe Gly Ser Pro Ala Gly Trp Asp Val Pro
515 520 525Val Gly Ala Leu Ile Gln Arg Gly Phe Ile Arg Pro Lys Asp
Cys Asp 530 535 540Val Leu Thr Ala Val Ser Glu Phe Phe Asn Ala Ser
Cys Val Pro Val545 550 555 560Asn Asn Pro Lys Asn Tyr Pro Ser Ser
Leu Cys Ala Leu Cys Val Gly 565 570 575Asp Glu Gln Gly Arg Asn Lys
Cys Val Gly Asn Ser Gln Glu Arg Tyr 580 585 590Tyr Gly Tyr Arg Gly
Ala Phe Arg Cys Leu Val Glu Asn Ala Gly Asp 595 600 605Val Ala Phe
Val Arg His Thr Thr Val Phe Asp Asn Thr Asn Gly His 610 615 620Asn
Ser Glu Pro Trp Ala Ala Glu Leu Arg Ser Glu Asp Tyr Glu Leu625 630
635 640Leu Cys Pro Asn Gly Ala Arg Ala Glu Val Ser Gln Phe Ala Ala
Cys 645 650 655Asn Leu Ala Gln Ile Pro Pro His Ala Val Met Val Arg
Pro Asp Thr 660 665 670Asn Ile Phe Thr Val Tyr Gly Leu Leu Asp Lys
Ala Gln Asp Leu Phe 675 680 685Gly Asp Asp His Asn Lys Asn Gly Phe
Lys Met Phe Asp Ser Ser Asn 690 695 700Tyr His Gly Gln Asp Leu Leu
Phe Lys Asp Ala Thr Val Arg Ala Val705 710 715 720Pro Val Gly Glu
Lys Thr Thr Tyr Arg Gly Trp Leu Gly Leu Asp Tyr 725 730 735Val Ala
Ala Leu Glu Gly Met Ser Ser Gln Gln Cys Ser Glu Ala Ala 740 745
750Ala Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Ser Glu Thr Gln
755 760 765Ala Asn Ser Thr Thr Asp Ala Leu Asn Val Leu Leu Ile Ile
Val Asp 770 775 780Asp Leu Arg Pro Ser Leu Gly Cys Tyr Gly Asp Lys
Leu Val Arg Ser785 790 795 800Pro Asn Ile Asp Gln Leu Ala Ser His
Ser Leu Leu Phe Gln Asn Ala 805 810 815Phe Ala Gln Gln Ala Val Cys
Ala Pro Ser Arg Val Ser Phe Leu Thr 820 825 830Gly Arg Arg Pro Asp
Thr Thr Arg Leu Tyr Asp Phe Asn Ser Tyr Trp 835 840 845Arg Val His
Ala Gly Asn Phe Ser Thr Ile Pro Gln Tyr Phe Lys Glu 850 855 860Asn
Gly Tyr Val Thr Met Ser Val Gly Lys Val Phe His Pro Gly Ile865 870
875 880Ser Ser Asn His Thr Asp Asp Ser Pro Tyr Ser Trp Ser Phe Pro
Pro 885 890 895Tyr His Pro Ser Ser Glu Lys Tyr Glu Asn Thr Lys Thr
Cys Arg Gly 900 905 910Pro Asp Gly Glu Leu His Ala Asn Leu Leu Cys
Pro Val Asp Val Leu 915 920 925Asp Val Pro Glu Gly Thr Leu Pro Asp
Lys Gln Ser Thr Glu Gln Ala 930 935 940Ile Gln Leu Leu Glu Lys Met
Lys Thr Ser Ala Ser Pro Phe Phe Leu945 950 955 960Ala Val Gly Tyr
His Lys Pro His Ile Pro Phe Arg Tyr Pro Lys Glu 965 970 975Phe Gln
Lys Leu Tyr Pro Leu Glu Asn Ile Thr Leu Ala Pro Asp Pro 980 985
990Glu Val Pro Asp Gly Leu Pro Pro Val Ala Tyr Asn Pro Trp Met Asp
995 1000 1005Ile Arg Gln Arg Glu Asp Val Gln Ala Leu Asn Ile Ser
Val Pro 1010 1015 1020Tyr Gly Pro Ile Pro Val Asp Phe Gln Arg Lys
Ile Arg Gln Ser 1025 1030 1035Tyr Phe Ala Ser Val Ser Tyr Leu Asp
Thr Gln Val Gly Arg Leu 1040 1045 1050Leu Ser Ala Leu Asp Asp Leu
Gln Leu Ala Asn Ser Thr Ile Ile 1055 1060 1065Ala Phe Thr Ser Asp
His Gly Trp Ala Leu Gly Glu His Gly Glu 1070 1075 1080Trp Ala Lys
Tyr Ser Asn Phe Asp Val Ala Thr His Val Pro Leu 1085 1090 1095Ile
Phe Tyr Val Pro Gly Arg Thr Ala Ser Leu Pro Glu Ala Gly 1100 1105
1110Glu Lys Leu Phe Pro Tyr Leu Asp Pro Phe Asp Ser Ala Ser Gln
1115 1120 1125Leu Met Glu Pro Gly Arg Gln Ser Met Asp Leu Val Glu
Leu Val 1130 1135 1140Ser Leu Phe Pro Thr Leu Ala Gly Leu Ala Gly
Leu Gln Val Pro 1145 1150 1155Pro Arg Cys Pro Val Pro Ser Phe His
Val Glu Leu Cys Arg Glu 1160 1165 1170Gly Lys Asn Leu Leu Lys His
Phe Arg Phe Arg Asp Leu Glu Glu 1175 1180 1185Asp Pro Tyr Leu Pro
Gly Asn Pro Arg Glu Leu Ile Ala Tyr Ser 1190 1195 1200Gln Tyr Pro
Arg Pro Ser Asp Ile Pro Gln Trp Asn Ser Asp Lys 1205 1210 1215Pro
Ser Leu Lys Asp Ile Lys Ile Met Gly Tyr Ser Ile Arg Thr 1220 1225
1230Ile Asp Tyr Arg Tyr Thr Val Trp Val Gly Phe Asn Pro Asp Glu
1235 1240 1245Phe Leu Ala Asn Phe Ser Asp Ile His Ala Gly Glu Leu
Tyr Phe 1250 1255 1260Val Asp Ser Asp Pro Leu Gln Asp His Asn Met
Tyr Asn Asp Ser 1265 1270 1275Gln Gly Gly Asp Leu Phe Gln Leu Leu
Met Pro 1280 1285140611PRTArtificial Sequencep97 fusion protein
140Met Glu Trp Ser Trp Val Phe Leu Phe Phe Leu Ser Val Thr Thr Gly1
5 10 15Val His Ser Asp Tyr Lys Asp Asp Asp Asp Lys Glu Gln Lys Leu
Ile 20 25 30Ser Glu Glu Asp Leu His His His His His His His His His
His Gly 35 40 45Gly Gly Gly Glu Asn Leu Tyr Phe Gln Gly Asp Ser Ser
His Ala Phe 50 55 60Thr Leu Asp Glu Leu Arg Tyr Glu Ala Ala Ala Lys
Glu Ala Ala Ala65 70 75 80Lys Glu Ala Ala Ala Lys Ser Glu Thr Gln
Ala Asn Ser Thr Thr Asp 85 90 95Ala Leu Asn Val Leu Leu Ile Ile Val
Asp Asp Leu Arg Pro Ser Leu 100 105 110Gly Cys Tyr Gly Asp Lys Leu
Val Arg Ser Pro Asn Ile Asp Gln Leu 115 120 125Ala Ser His Ser Leu
Leu Phe Gln Asn Ala Phe Ala Gln Gln Ala Val 130 135 140Cys Ala Pro
Ser Arg Val Ser Phe Leu Thr Gly Arg Arg Pro Asp Thr145 150 155
160Thr Arg Leu Tyr Asp Phe Asn Ser Tyr Trp Arg Val His Ala Gly Asn
165 170 175Phe Ser Thr Ile Pro Gln Tyr Phe Lys Glu Asn Gly Tyr Val
Thr Met 180 185 190Ser Val Gly Lys Val Phe His Pro Gly Ile Ser Ser
Asn His Thr Asp 195 200 205Asp Ser Pro Tyr Ser Trp Ser Phe Pro Pro
Tyr His Pro Ser Ser Glu 210 215 220Lys Tyr Glu Asn Thr Lys Thr Cys
Arg Gly Pro Asp Gly Glu Leu His225 230 235 240Ala Asn Leu Leu Cys
Pro Val Asp Val Leu Asp Val Pro Glu Gly Thr 245 250 255Leu Pro Asp
Lys Gln Ser Thr Glu Gln Ala Ile Gln Leu Leu Glu Lys 260 265 270Met
Lys Thr Ser Ala Ser Pro Phe Phe Leu Ala Val Gly Tyr His Lys 275 280
285Pro His Ile Pro Phe Arg Tyr Pro Lys Glu Phe Gln Lys Leu Tyr Pro
290 295 300Leu Glu Asn Ile Thr Leu Ala Pro Asp Pro Glu Val Pro Asp
Gly Leu305 310 315 320Pro Pro Val Ala Tyr Asn Pro Trp Met Asp Ile
Arg Gln Arg Glu Asp 325 330 335Val Gln Ala Leu Asn Ile Ser Val Pro
Tyr Gly Pro Ile Pro Val Asp 340 345 350Phe Gln Arg Lys Ile Arg Gln
Ser Tyr Phe Ala Ser Val Ser Tyr Leu 355 360 365Asp Thr Gln Val Gly
Arg Leu Leu Ser Ala Leu Asp Asp Leu Gln Leu 370 375 380Ala Asn Ser
Thr Ile Ile Ala Phe Thr Ser Asp His Gly Trp Ala Leu385 390 395
400Gly Glu His Gly Glu Trp Ala Lys Tyr Ser Asn Phe Asp Val Ala Thr
405 410 415His Val Pro Leu Ile Phe Tyr Val Pro Gly Arg Thr Ala Ser
Leu Pro 420 425 430Glu Ala Gly Glu Lys Leu Phe Pro Tyr Leu Asp Pro
Phe Asp Ser Ala 435 440 445Ser Gln Leu Met Glu Pro Gly Arg Gln Ser
Met Asp Leu Val Glu Leu 450 455 460Val Ser Leu Phe Pro Thr Leu Ala
Gly Leu Ala Gly Leu Gln Val Pro465 470 475 480Pro Arg Cys Pro Val
Pro Ser Phe His Val Glu Leu Cys Arg Glu Gly 485 490 495Lys Asn Leu
Leu Lys His Phe Arg Phe Arg Asp Leu Glu Glu Asp Pro 500 505 510Tyr
Leu Pro Gly Asn Pro Arg Glu Leu Ile Ala Tyr Ser Gln Tyr Pro 515 520
525Arg Pro Ser Asp Ile Pro Gln Trp Asn Ser Asp Lys Pro Ser Leu Lys
530 535 540Asp Ile Lys Ile Met Gly Tyr Ser Ile Arg Thr Ile Asp Tyr
Arg Tyr545 550 555 560Thr Val Trp Val Gly Phe Asn Pro Asp Glu Phe
Leu Ala Asn Phe Ser 565 570 575Asp Ile His Ala Gly Glu Leu Tyr Phe
Val Asp Ser Asp Pro Leu Gln 580 585 590Asp His Asn Met Tyr Asn Asp
Ser Gln Gly Gly Asp Leu Phe Gln Leu 595 600 605Leu Met Pro
610141611PRTArtificial Sequencep97 fusion protein 141Met Glu Trp
Ser Trp Val Phe Leu Phe Phe Leu Ser Val Thr Thr Gly1 5 10 15Val His
Ser Asp Tyr Lys Asp Asp Asp Asp Lys Glu Gln Lys Leu Ile 20 25 30Ser
Glu Glu Asp Leu His His His His His His His His His His Gly 35 40
45Gly Gly Gly Glu Asn Leu Tyr Phe Gln Gly Ser Glu Thr Gln Ala Asn
50 55 60Ser Thr Thr Asp Ala Leu Asn Val Leu Leu Ile Ile Val Asp Asp
Leu65 70 75 80Arg Pro Ser Leu Gly Cys Tyr Gly Asp Lys Leu Val Arg
Ser Pro Asn 85 90 95Ile Asp Gln Leu Ala Ser His Ser Leu Leu Phe Gln
Asn Ala Phe Ala 100 105 110Gln Gln Ala Val Cys Ala Pro Ser Arg Val
Ser Phe Leu Thr Gly Arg 115 120 125Arg Pro Asp Thr Thr Arg Leu Tyr
Asp Phe Asn Ser Tyr Trp Arg Val 130 135 140His Ala Gly Asn Phe Ser
Thr Ile Pro Gln Tyr Phe Lys Glu Asn Gly145 150 155 160Tyr Val Thr
Met Ser Val Gly Lys Val Phe His Pro Gly Ile Ser Ser 165 170 175Asn
His Thr Asp Asp Ser Pro Tyr Ser Trp Ser Phe Pro Pro Tyr His 180 185
190Pro Ser Ser Glu Lys Tyr Glu Asn Thr Lys Thr Cys Arg Gly Pro Asp
195 200 205Gly Glu Leu His Ala Asn Leu Leu Cys Pro Val Asp Val Leu
Asp Val 210 215 220Pro Glu Gly Thr Leu Pro Asp Lys Gln Ser Thr Glu
Gln Ala Ile Gln225 230 235 240Leu Leu Glu Lys Met Lys Thr Ser Ala
Ser Pro Phe Phe Leu Ala Val 245 250 255Gly Tyr His Lys Pro His Ile
Pro Phe Arg Tyr Pro Lys Glu Phe Gln 260 265 270Lys Leu Tyr Pro Leu
Glu Asn Ile Thr Leu Ala Pro Asp Pro Glu Val 275 280 285Pro Asp Gly
Leu Pro Pro Val Ala Tyr Asn Pro Trp Met Asp Ile Arg 290 295 300Gln
Arg Glu Asp Val Gln Ala Leu Asn Ile Ser Val Pro Tyr Gly Pro305 310
315 320Ile Pro Val Asp Phe Gln Arg Lys Ile Arg Gln Ser Tyr Phe Ala
Ser 325 330 335Val Ser Tyr Leu Asp Thr Gln Val Gly Arg Leu Leu Ser
Ala Leu Asp 340 345 350Asp Leu Gln Leu Ala Asn Ser Thr Ile Ile Ala
Phe Thr Ser Asp His 355 360 365Gly Trp Ala Leu Gly Glu His Gly Glu
Trp Ala Lys Tyr Ser Asn Phe 370 375 380Asp Val Ala Thr His Val Pro
Leu Ile Phe Tyr Val Pro Gly Arg Thr385 390 395 400Ala Ser Leu Pro
Glu Ala Gly Glu Lys Leu Phe Pro Tyr Leu Asp Pro 405 410 415Phe Asp
Ser Ala Ser Gln Leu Met Glu Pro Gly Arg Gln Ser Met Asp 420 425
430Leu Val Glu Leu Val Ser Leu Phe Pro Thr Leu Ala Gly Leu Ala Gly
435 440 445Leu Gln Val Pro Pro Arg Cys Pro Val Pro Ser Phe His Val
Glu Leu 450 455 460Cys Arg Glu Gly Lys Asn Leu Leu Lys His Phe Arg
Phe Arg Asp Leu465 470 475 480Glu Glu Asp Pro Tyr Leu Pro Gly Asn
Pro Arg Glu Leu Ile Ala Tyr 485 490 495Ser Gln Tyr Pro Arg Pro Ser
Asp Ile Pro Gln Trp Asn Ser Asp Lys 500 505 510Pro Ser Leu Lys Asp
Ile Lys Ile Met Gly Tyr Ser Ile Arg Thr Ile 515 520 525Asp Tyr Arg
Tyr Thr Val Trp Val Gly Phe Asn Pro Asp Glu Phe Leu 530 535 540Ala
Asn Phe Ser Asp Ile His Ala Gly Glu Leu Tyr Phe Val Asp Ser545 550
555 560Asp Pro Leu Gln Asp His Asn Met Tyr Asn Asp Ser Gln Gly Gly
Asp 565 570 575Leu Phe Gln Leu Leu Met Pro Glu Ala Ala Ala Lys Glu
Ala Ala Ala 580 585 590Lys Glu Ala Ala Ala Lys Asp Ser Ser His Ala
Phe Thr Leu Asp Glu 595 600 605Leu Arg Tyr 610142603PRTArtificial
Sequencep97 fusion protein 142Met Glu Trp Ser Trp Val Phe Leu Phe
Phe Leu Ser Val Thr Thr Gly1 5 10 15Val His Ser Asp Tyr Lys Asp Asp
Asp Asp Lys Glu Gln Lys Leu Ile 20 25 30Ser Glu Glu Asp Leu His His
His His His His His His His His Gly 35 40 45Gly Gly Gly Glu Asn Leu
Tyr Phe Gln Gly Thr Asp Ala Leu Asn Val 50 55 60Leu Leu Ile Ile Val
Asp Asp Leu Arg Pro Ser Leu Gly Cys Tyr Gly65 70 75 80Asp Lys Leu
Val Arg Ser Pro Asn Ile Asp Gln Leu Ala Ser His Ser 85 90 95Leu Leu
Phe Gln Asn Ala Phe Ala Gln Gln Ala Val Cys Ala Pro Ser 100 105
110Arg Val Ser Phe Leu Thr Gly Arg Arg Pro Asp Thr Thr Arg Leu Tyr
115 120 125Asp Phe Asn Ser Tyr Trp Arg Val His Ala Gly Asn Phe Ser
Thr Ile 130 135 140Pro Gln Tyr Phe Lys Glu Asn Gly Tyr Val Thr Met
Ser Val Gly Lys145 150 155 160Val Phe His Pro Gly Ile Ser Ser Asn
His Thr Asp Asp Ser Pro Tyr 165 170 175Ser Trp Ser Phe Pro Pro Tyr
His Pro Ser Ser Glu Lys Tyr Glu Asn 180 185 190Thr Lys Thr Cys Arg
Gly Pro Asp Gly Glu Leu His Ala Asn Leu Leu 195 200 205Cys Pro Val
Asp Val Leu Asp Val Pro Glu Gly Thr Leu Pro Asp Lys 210 215 220Gln
Ser Thr Glu Gln Ala Ile Gln Leu Leu Glu Lys Met Lys Thr Ser225 230
235 240Ala Ser Pro Phe Phe Leu Ala Val Gly Tyr His Lys Pro His Ile
Pro 245 250 255Phe Arg Tyr Pro Lys Glu Phe Gln Lys Leu Tyr Pro Leu
Glu Asn Ile 260 265 270Thr Leu Ala Pro Asp Pro Glu Val Pro Asp Gly
Leu Pro Pro Val Ala 275 280 285Tyr Asn Pro Trp Met Asp Ile Arg Gln
Arg Glu Asp Val Gln Ala Leu 290 295 300Asn Ile Ser Val Pro Tyr Gly
Pro Ile Pro Val Asp Phe Gln Arg Lys305 310 315 320Ile Arg Gln Ser
Tyr Phe Ala Ser Val Ser Tyr Leu Asp Thr Gln Val 325 330 335Gly Arg
Leu Leu Ser Ala Leu Asp Asp Leu Gln Leu Ala Asn Ser Thr 340 345 350Ile Ile
Ala Phe Thr Ser Asp His Gly Trp Ala Leu Gly Glu His Gly 355 360
365Glu Trp Ala Lys Tyr Ser Asn Phe Asp Val Ala Thr His Val Pro Leu
370 375 380Ile Phe Tyr Val Pro Gly Arg Thr Ala Ser Leu Pro Glu Ala
Gly Glu385 390 395 400Lys Leu Phe Pro Tyr Leu Asp Pro Phe Asp Ser
Ala Ser Gln Leu Met 405 410 415Glu Pro Gly Arg Gln Ser Met Asp Leu
Val Glu Leu Val Ser Leu Phe 420 425 430Pro Thr Leu Ala Gly Leu Ala
Gly Leu Gln Val Pro Pro Arg Cys Pro 435 440 445Val Pro Ser Phe His
Val Glu Leu Cys Arg Glu Gly Lys Asn Leu Leu 450 455 460Lys His Phe
Arg Phe Arg Asp Leu Glu Glu Asp Pro Tyr Leu Pro Gly465 470 475
480Asn Pro Arg Glu Leu Ile Ala Tyr Ser Gln Tyr Pro Arg Pro Ser Asp
485 490 495Ile Pro Gln Trp Asn Ser Asp Lys Pro Ser Leu Lys Asp Ile
Lys Ile 500 505 510Met Gly Tyr Ser Ile Arg Thr Ile Asp Tyr Arg Tyr
Thr Val Trp Val 515 520 525Gly Phe Asn Pro Asp Glu Phe Leu Ala Asn
Phe Ser Asp Ile His Ala 530 535 540Gly Glu Leu Tyr Phe Val Asp Ser
Asp Pro Leu Gln Asp His Asn Met545 550 555 560Tyr Asn Asp Ser Gln
Gly Gly Asp Leu Phe Gln Leu Leu Met Pro Glu 565 570 575Ala Ala Ala
Lys Glu Ala Ala Ala Lys Glu Ala Ala Ala Lys Asp Ser 580 585 590Ser
His Ala Phe Thr Leu Asp Glu Leu Arg Tyr 595 6001433870DNAArtificial
SequencePolynucleotide coding for p97 fusion protein 143atggaatgga
gctgggtctt tctcttcttc ctgtcagtaa cgactggtgt ccactccgac 60tacaaggacg
acgacgacaa agagcagaag ctgatctccg aagaggacct gcaccaccat
120catcaccatc accaccatca cggaggcggt ggagagaacc tgtactttca
gggctcggaa 180actcaggcca actccaccac agatgcactc aacgtgctgc
tgatcatcgt agatgacctc 240cgaccttctc tgggctgtta cggcgacaag
ctagtacgga gcccaaacat cgaccagctc 300gcatcgcact ctctcctatt
ccagaacgca ttcgcccagc aggctgtctg tgctccctcc 360cgagtgtcct
tcctcacggg tcggagaccc gataccacga ggttatatga cttcaactca
420tactggcgcg tgcatgccgg taacttttct actatacccc agtattttaa
agaaaatggc 480tatgttacaa tgtccgttgg caaggtattt catcctggta
ttagcagcaa ccacacagat 540gactctccgt atagctggtc attcccacca
taccacccct ccagcgaaaa gtacgaaaac 600acaaagactt gccggggccc
agatggcgaa ctgcacgcaa atctgctgtg ccctgtagat 660gtcttggacg
tgcccgaagg tactctgccc gacaaacagt ccacagaaca ggcaatccaa
720ctccttgaaa agatgaaaac gagcgcgtcc cccttcttcc tcgccgtggg
ctaccacaag 780ccccacatcc cgtttagata ccccaaggaa tttcagaaac
tgtaccccct ggaaaacatc 840actctcgcgc ccgaccccga agtgccagac
ggactccctc ctgttgccta caacccttgg 900atggacatca gacaacgtga
agatgtgcag gccctgaaca tctcagtgcc ttacggcccc 960attccagttg
acttccagag gaagattcgg cagtcctact tcgcctccgt tagttacctg
1020gacacccaag tgggtagact cctgagcgcc ttggacgatc tccagctcgc
aaacagcacc 1080atcattgcct tcaccagcga ccatggttgg gcgctgggtg
aacatggaga atgggctaaa 1140tattcaaatt tcgacgttgc gacccacgtc
ccattgatct tctacgtgcc tggacgaaca 1200gcctccttgc ctgaagccgg
ggaaaagttg tttccatatc tggacccttt cgattctgcg 1260agccaactca
tggaacctgg gcgacagagc atggacctgg tggaactggt cagtttattt
1320ccaaccctgg caggccttgc aggcctccaa gttccacctc ggtgtcccgt
tccctcattc 1380cacgtcgaac tctgtcgcga aggtaaaaac ctcctcaagc
attttcgttt tcgggacctc 1440gaagaagacc catacctgcc agggaatcca
agggaactga ttgcctacag ccagtaccct 1500agacctagcg acatcccaca
gtggaacagc gacaagccct ccctcaagga cattaaaatc 1560atgggttata
gtatccggac tattgactac aggtataccg tgtgggtggg tttcaaccca
1620gacgaatttc tcgccaattt ctccgacatc cacgcgggcg aactgtattt
cgttgattcc 1680gatccactgc aagatcataa tatgtacaac gatagtcaag
ggggtgacct cttccagttg 1740ctaatgccag aagccgccgc gaaagaagcc
gccgcaaaag aagccgctgc caaaggcatg 1800gaagtgcgtt ggtgcgccac
ctctgacccc gagcagcaca agtgcggcaa catgtccgag 1860gccttcagag
aggccggcat ccagccttct ctgctgtgtg tgcggggcac ctctgccgac
1920cattgcgtgc agctgatcgc cgcccaggaa gccgacgcta tcacactgga
tggcggcgct 1980atctacgagg ctggcaaaga gcacggcctg aagcccgtcg
tgggcgaggt gtacgatcag 2040gaagtgggca cctcctacta cgccgtggct
gtcgtgcgga gatcctccca cgtgaccatc 2100gacaccctga agggcgtgaa
gtcctgccac accggcatca acagaaccgt gggctggaac 2160gtgcccgtgg
gctacctggt ggaatccggc agactgtccg tgatgggctg cgacgtgctg
2220aaggccgtgt ccgattactt cggcggctct tgtgtgcctg gcgctggcga
gacatcctac 2280tccgagtccc tgtgcagact gtgcaggggc gactcttctg
gcgagggcgt gtgcgacaag 2340tcccctctgg aacggtacta cgactactcc
ggcgccttca gatgcctggc tgaaggtgct 2400ggcgacgtgg ccttcgtgaa
gcactccacc gtgctggaaa acaccgacgg caagaccctg 2460ccttcttggg
gccaggcact gctgtcccag gacttcgagc tgctgtgccg ggatggctcc
2520agagccgatg tgacagagtg gcggcagtgc cacctggcca gagtgcctgc
tcatgctgtg 2580gtcgtgcgcg ccgatacaga tggcggcctg atcttccggc
tgctgaacga gggccagcgg 2640ctgttctctc acgagggctc cagcttccag
atgttctcca gcgaggccta cggccagaag 2700gacctgctgt tcaaggactc
cacctccgag ctggtgccta tcgccaccca gacctatgag 2760gcttggctgg
gccacgagta cctgcacgct atgaagggac tgctgtgcga ccccaaccgg
2820ctgcctcctt atctgaggtg gtgcgtgctg tccacccccg agatccagaa
atgcggcgat 2880atggccgtgg cctttcggcg gcagagactg aagcctgaga
tccagtgcgt gtccgccaag 2940agccctcagc actgcatgga acggatccag
gccgaacagg tggacgccgt gacactgtcc 3000ggcgaggata tctacaccgc
cggaaagacc tacggcctgg tgccagctgc tggcgagcat 3060tacgcccctg
aggactcctc caacagctac tacgtggtgg cagtcgtgcg ccgggactcc
3120tctcacgcct ttaccctgga tgagctgcgg ggcaagagaa gctgtcacgc
cggctttgga 3180agccctgccg gatgggatgt gcctgtgggc gctctgatcc
agcggggctt catcagaccc 3240aaggactgtg atgtgctgac cgccgtgtct
gagttcttca acgcctcctg tgtgcccgtg 3300aacaacccca agaactaccc
ctccagcctg tgcgccctgt gtgtgggaga tgagcagggc 3360cggaacaaat
gcgtgggcaa ctcccaggaa agatattacg gctacagagg cgccttccgg
3420tgtctggtgg aaaacgccgg ggatgtggct tttgtgcggc acaccaccgt
gttcgacaac 3480accaatggcc acaactccga gccttgggcc gctgagctga
gatccgagga ttacgaactg 3540ctgtgtccca acggcgccag ggctgaggtg
tcccagtttg ccgcctgtaa cctggcccag 3600atccctcccc acgctgtgat
ggtgcgaccc gacaccaaca tcttcaccgt gtacggcctg 3660ctggacaagg
cccaggatct gttcggcgac gaccacaaca agaacgggtt caagatgttc
3720gactccagca actaccacgg acaggatctg ctgtttaaag atgccaccgt
gcgggccgtg 3780ccagtgggcg aaaagaccac ctacagagga tggctgggac
tggactacgt ggccgccctg 3840gaaggcatgt cctcccagca gtgttcctga
3870
SEQ ID NO 144
LENGTH: 3870
TYPE: DNA
ORGANISM: Artificial Sequence
OTHER INFORMATION: Polynucleotide coding for p97 fusion protein
SEQUENCE: 144
atggaatgga gctgggtctt tctcttcttc ctgtcagtaa
cgactggtgt ccactccgac 60tacaaggacg acgacgacaa agagcagaag ctgatctccg
aagaggacct gcaccaccat 120catcaccatc accaccatca cggaggcggt
ggagagaacc tgtactttca gggcggcatg 180gaagtgcgtt ggtgcgccac
ctctgacccc gagcagcaca agtgcggcaa catgtccgag 240gccttcagag
aggccggcat ccagccttct ctgctgtgtg tgcggggcac ctctgccgac
300cattgcgtgc agctgatcgc cgcccaggaa gccgacgcta tcacactgga
tggcggcgct 360atctacgagg ctggcaaaga gcacggcctg aagcccgtcg
tgggcgaggt gtacgatcag 420gaagtgggca cctcctacta cgccgtggct
gtcgtgcgga gatcctccca cgtgaccatc 480gacaccctga agggcgtgaa
gtcctgccac accggcatca acagaaccgt gggctggaac 540gtgcccgtgg
gctacctggt ggaatccggc agactgtccg tgatgggctg cgacgtgctg
600aaggccgtgt ccgattactt cggcggctct tgtgtgcctg gcgctggcga
gacatcctac 660tccgagtccc tgtgcagact gtgcaggggc gactcttctg
gcgagggcgt gtgcgacaag 720tcccctctgg aacggtacta cgactactcc
ggcgccttca gatgcctggc tgaaggtgct 780ggcgacgtgg ccttcgtgaa
gcactccacc gtgctggaaa acaccgacgg caagaccctg 840ccttcttggg
gccaggcact gctgtcccag gacttcgagc tgctgtgccg ggatggctcc
900agagccgatg tgacagagtg gcggcagtgc cacctggcca gagtgcctgc
tcatgctgtg 960gtcgtgcgcg ccgatacaga tggcggcctg atcttccggc
tgctgaacga gggccagcgg 1020ctgttctctc acgagggctc cagcttccag
atgttctcca gcgaggccta cggccagaag 1080gacctgctgt tcaaggactc
cacctccgag ctggtgccta tcgccaccca gacctatgag 1140gcttggctgg
gccacgagta cctgcacgct atgaagggac tgctgtgcga ccccaaccgg
1200ctgcctcctt atctgaggtg gtgcgtgctg tccacccccg agatccagaa
atgcggcgat 1260atggccgtgg cctttcggcg gcagagactg aagcctgaga
tccagtgcgt gtccgccaag 1320agccctcagc actgcatgga acggatccag
gccgaacagg tggacgccgt gacactgtcc 1380ggcgaggata tctacaccgc
cggaaagacc tacggcctgg tgccagctgc tggcgagcat 1440tacgcccctg
aggactcctc caacagctac tacgtggtgg cagtcgtgcg ccgggactcc
1500tctcacgcct ttaccctgga tgagctgcgg ggcaagagaa gctgtcacgc
cggctttgga 1560agccctgccg gatgggatgt gcctgtgggc gctctgatcc
agcggggctt catcagaccc 1620aaggactgtg atgtgctgac cgccgtgtct
gagttcttca acgcctcctg tgtgcccgtg 1680aacaacccca agaactaccc
ctccagcctg tgcgccctgt gtgtgggaga tgagcagggc 1740cggaacaaat
gcgtgggcaa ctcccaggaa agatattacg gctacagagg cgccttccgg
1800tgtctggtgg aaaacgccgg ggatgtggct tttgtgcggc acaccaccgt
gttcgacaac 1860accaatggcc acaactccga gccttgggcc gctgagctga
gatccgagga ttacgaactg 1920ctgtgtccca acggcgccag ggctgaggtg
tcccagtttg ccgcctgtaa cctggcccag 1980atccctcccc acgctgtgat
ggtgcgaccc gacaccaaca tcttcaccgt gtacggcctg 2040ctggacaagg
cccaggatct gttcggcgac gaccacaaca agaacgggtt caagatgttc
2100gactccagca actaccacgg acaggatctg ctgtttaaag atgccaccgt
gcgggccgtg 2160ccagtgggcg aaaagaccac ctacagagga tggctgggac
tggactacgt ggccgccctg 2220gaaggcatgt cctcccagca gtgttccgaa
gccgccgcga aagaagccgc cgcaaaagaa 2280gccgctgcca aatcggaaac
tcaggccaac tccaccacag atgcactcaa cgtgctgctg 2340atcatcgtag
atgacctccg accttctctg ggctgttacg gcgacaagct agtacggagc
2400ccaaacatcg accagctcgc atcgcactct ctcctattcc agaacgcatt
cgcccagcag 2460gctgtctgtg ctccctcccg agtgtccttc ctcacgggtc
ggagacccga taccacgagg 2520ttatatgact tcaactcata ctggcgcgtg
catgccggta acttttctac tataccccag 2580tattttaaag aaaatggcta
tgttacaatg tccgttggca aggtatttca tcctggtatt 2640agcagcaacc
acacagatga ctctccgtat agctggtcat tcccaccata ccacccctcc
2700agcgaaaagt acgaaaacac aaagacttgc cggggcccag atggcgaact
gcacgcaaat 2760ctgctgtgcc ctgtagatgt cttggacgtg cccgaaggta
ctctgcccga caaacagtcc 2820acagaacagg caatccaact ccttgaaaag
atgaaaacga gcgcgtcccc cttcttcctc 2880gccgtgggct accacaagcc
ccacatcccg tttagatacc ccaaggaatt tcagaaactg 2940taccccctgg
aaaacatcac tctcgcgccc gaccccgaag tgccagacgg actccctcct
3000gttgcctaca acccttggat ggacatcaga caacgtgaag atgtgcaggc
cctgaacatc 3060tcagtgcctt acggccccat tccagttgac ttccagagga
agattcggca gtcctacttc 3120gcctccgtta gttacctgga cacccaagtg
ggtagactcc tgagcgcctt ggacgatctc 3180cagctcgcaa acagcaccat
cattgccttc accagcgacc atggttgggc gctgggtgaa 3240catggagaat
gggctaaata ttcaaatttc gacgttgcga cccacgtccc attgatcttc
3300tacgtgcctg gacgaacagc ctccttgcct gaagccgggg aaaagttgtt
tccatatctg 3360gaccctttcg attctgcgag ccaactcatg gaacctgggc
gacagagcat ggacctggtg 3420gaactggtca gtttatttcc aaccctggca
ggccttgcag gcctccaagt tccacctcgg 3480tgtcccgttc cctcattcca
cgtcgaactc tgtcgcgaag gtaaaaacct cctcaagcat 3540tttcgttttc
gggacctcga agaagaccca tacctgccag ggaatccaag ggaactgatt
3600gcctacagcc agtaccctag acctagcgac atcccacagt ggaacagcga
caagccctcc 3660ctcaaggaca ttaaaatcat gggttatagt atccggacta
ttgactacag gtataccgtg 3720tgggtgggtt tcaacccaga cgaatttctc
gccaatttct ccgacatcca cgcgggcgaa 3780ctgtatttcg ttgattccga
tccactgcaa gatcataata tgtacaacga tagtcaaggg 3840ggtgacctct
tccagttgct aatgccatga 3870
SEQ ID NO 145
LENGTH: 1836
TYPE: DNA
ORGANISM: Artificial Sequence
OTHER INFORMATION: Polynucleotide coding for p97 fusion protein
SEQUENCE: 145
atggaatgga gctgggtctt tctcttcttc ctgtcagtaa cgactggtgt ccactccgac 60tacaaggacg
acgacgacaa agagcagaag ctgatctccg aagaggacct gcaccaccat
120catcaccatc accaccatca cggaggcggt ggagagaacc tgtactttca
gggcgactcc 180tctcacgcct tcaccctgga cgagctgcgg tacgaagccg
ccgcgaaaga agccgccgca 240aaagaagccg ctgccaaatc ggaaactcag
gccaactcca ccacagatgc actcaacgtg 300ctgctgatca tcgtagatga
cctccgacct tctctgggct gttacggcga caagctagta 360cggagcccaa
acatcgacca gctcgcatcg cactctctcc tattccagaa cgcattcgcc
420cagcaggctg tctgtgctcc ctcccgagtg tccttcctca cgggtcggag
acccgatacc 480acgaggttat atgacttcaa ctcatactgg cgcgtgcatg
ccggtaactt ttctactata 540ccccagtatt ttaaagaaaa tggctatgtt
acaatgtccg ttggcaaggt atttcatcct 600ggtattagca gcaaccacac
agatgactct ccgtatagct ggtcattccc accataccac 660ccctccagcg
aaaagtacga aaacacaaag acttgccggg gcccagatgg cgaactgcac
720gcaaatctgc tgtgccctgt agatgtcttg gacgtgcccg aaggtactct
gcccgacaaa 780cagtccacag aacaggcaat ccaactcctt gaaaagatga
aaacgagcgc gtcccccttc 840ttcctcgccg tgggctacca caagccccac
atcccgttta gataccccaa ggaatttcag 900aaactgtacc ccctggaaaa
catcactctc gcgcccgacc ccgaagtgcc agacggactc 960cctcctgttg
cctacaaccc ttggatggac atcagacaac gtgaagatgt gcaggccctg
1020aacatctcag tgccttacgg ccccattcca gttgacttcc agaggaagat
tcggcagtcc 1080tacttcgcct ccgttagtta cctggacacc caagtgggta
gactcctgag cgccttggac 1140gatctccagc tcgcaaacag caccatcatt
gccttcacca gcgaccatgg ttgggcgctg 1200ggtgaacatg gagaatgggc
taaatattca aatttcgacg ttgcgaccca cgtcccattg 1260atcttctacg
tgcctggacg aacagcctcc ttgcctgaag ccggggaaaa gttgtttcca
1320tatctggacc ctttcgattc tgcgagccaa ctcatggaac ctgggcgaca
gagcatggac 1380ctggtggaac tggtcagttt atttccaacc ctggcaggcc
ttgcaggcct ccaagttcca 1440cctcggtgtc ccgttccctc attccacgtc
gaactctgtc gcgaaggtaa aaacctcctc 1500aagcattttc gttttcggga
cctcgaagaa gacccatacc tgccagggaa tccaagggaa 1560ctgattgcct
acagccagta ccctagacct agcgacatcc cacagtggaa cagcgacaag
1620ccctccctca aggacattaa aatcatgggt tatagtatcc ggactattga
ctacaggtat 1680accgtgtggg tgggtttcaa cccagacgaa tttctcgcca
atttctccga catccacgcg 1740ggcgaactgt atttcgttga ttccgatcca
ctgcaagatc ataatatgta caacgatagt 1800caagggggtg acctcttcca
gttgctaatg ccatga 1836
SEQ ID NO 146
LENGTH: 1836
TYPE: DNA
ORGANISM: Artificial Sequence
OTHER INFORMATION: Polynucleotide coding for p97 fusion protein
SEQUENCE: 146
atggaatgga gctgggtctt tctcttcttc
ctgtcagtaa cgactggtgt ccactccgac 60tacaaggacg acgacgacaa agagcagaag
ctgatctccg aagaggacct gcaccaccat 120catcaccatc accaccatca
cggaggcggt ggagagaacc tgtactttca gggctcggaa 180actcaggcca
actccaccac agatgcactc aacgtgctgc tgatcatcgt agatgacctc
240cgaccttctc tgggctgtta cggcgacaag ctagtacgga gcccaaacat
cgaccagctc 300gcatcgcact ctctcctatt ccagaacgca ttcgcccagc
aggctgtctg tgctccctcc 360cgagtgtcct tcctcacggg tcggagaccc
gataccacga ggttatatga cttcaactca 420tactggcgcg tgcatgccgg
taacttttct actatacccc agtattttaa agaaaatggc 480tatgttacaa
tgtccgttgg caaggtattt catcctggta ttagcagcaa ccacacagat
540gactctccgt atagctggtc attcccacca taccacccct ccagcgaaaa
gtacgaaaac 600acaaagactt gccggggccc agatggcgaa ctgcacgcaa
atctgctgtg ccctgtagat 660gtcttggacg tgcccgaagg tactctgccc
gacaaacagt ccacagaaca ggcaatccaa 720ctccttgaaa agatgaaaac
gagcgcgtcc cccttcttcc tcgccgtggg ctaccacaag 780ccccacatcc
cgtttagata ccccaaggaa tttcagaaac tgtaccccct ggaaaacatc
840actctcgcgc ccgaccccga agtgccagac ggactccctc ctgttgccta
caacccttgg 900atggacatca gacaacgtga agatgtgcag gccctgaaca
tctcagtgcc ttacggcccc 960attccagttg acttccagag gaagattcgg
cagtcctact tcgcctccgt tagttacctg 1020gacacccaag tgggtagact
cctgagcgcc ttggacgatc tccagctcgc aaacagcacc 1080atcattgcct
tcaccagcga ccatggttgg gcgctgggtg aacatggaga atgggctaaa
1140tattcaaatt tcgacgttgc gacccacgtc ccattgatct tctacgtgcc
tggacgaaca 1200gcctccttgc ctgaagccgg ggaaaagttg tttccatatc
tggacccttt cgattctgcg 1260agccaactca tggaacctgg gcgacagagc
atggacctgg tggaactggt cagtttattt 1320ccaaccctgg caggccttgc
aggcctccaa gttccacctc ggtgtcccgt tccctcattc 1380cacgtcgaac
tctgtcgcga aggtaaaaac ctcctcaagc attttcgttt tcgggacctc
1440gaagaagacc catacctgcc agggaatcca agggaactga ttgcctacag
ccagtaccct 1500agacctagcg acatcccaca gtggaacagc gacaagccct
ccctcaagga cattaaaatc 1560atgggttata gtatccggac tattgactac
aggtataccg tgtgggtggg tttcaaccca 1620gacgaatttc tcgccaattt
ctccgacatc cacgcgggcg aactgtattt cgttgattcc 1680gatccactgc
aagatcataa tatgtacaac gatagtcaag ggggtgacct cttccagttg
1740ctaatgccag aggccgctgc taaagaggct gccgccaaag aagccgccgc
taaggactcc 1800tctcacgcct tcaccctgga cgagctgcgg tactaa
1836
SEQ ID NO 147
LENGTH: 1812
TYPE: DNA
ORGANISM: Artificial Sequence
OTHER INFORMATION: Polynucleotide coding for p97 fusion protein
SEQUENCE: 147
atggaatgga gctgggtctt tctcttcttc ctgtcagtaa
cgactggtgt ccactccgac 60tacaaggacg acgacgacaa agagcagaag ctgatctccg
aagaggacct gcaccaccat 120catcaccatc accaccatca cggaggcggt
ggagagaacc tgtactttca gggcacagat 180gcactcaacg tgctgctgat
catcgtagat gacctccgac cttctctggg ctgttacggc 240gacaagctag
tacggagccc aaacatcgac cagctcgcat cgcactctct cctattccag
300aacgcattcg cccagcaggc tgtctgtgct ccctcccgag tgtccttcct
cacgggtcgg 360agacccgata ccacgaggtt atatgacttc aactcatact
ggcgcgtgca tgccggtaac 420ttttctacta taccccagta ttttaaagaa
aatggctatg ttacaatgtc cgttggcaag 480gtatttcatc ctggtattag
cagcaaccac acagatgact ctccgtatag ctggtcattc 540ccaccatacc
acccctccag cgaaaagtac gaaaacacaa agacttgccg gggcccagat
600ggcgaactgc acgcaaatct gctgtgccct gtagatgtct tggacgtgcc
cgaaggtact 660ctgcccgaca aacagtccac agaacaggca atccaactcc
ttgaaaagat gaaaacgagc 720gcgtccccct tcttcctcgc cgtgggctac
cacaagcccc acatcccgtt tagatacccc 780aaggaatttc agaaactgta
ccccctggaa aacatcactc tcgcgcccga ccccgaagtg 840ccagacggac
tccctcctgt tgcctacaac ccttggatgg acatcagaca acgtgaagat
900gtgcaggccc tgaacatctc agtgccttac ggccccattc cagttgactt
ccagaggaag 960attcggcagt cctacttcgc ctccgttagt tacctggaca
cccaagtggg tagactcctg 1020agcgccttgg acgatctcca gctcgcaaac
agcaccatca ttgccttcac cagcgaccat 1080ggttgggcgc tgggtgaaca
tggagaatgg gctaaatatt caaatttcga cgttgcgacc 1140cacgtcccat
tgatcttcta cgtgcctgga cgaacagcct ccttgcctga agccggggaa
1200aagttgtttc catatctgga ccctttcgat tctgcgagcc aactcatgga
acctgggcga 1260cagagcatgg acctggtgga actggtcagt ttatttccaa
ccctggcagg ccttgcaggc 1320ctccaagttc cacctcggtg tcccgttccc
tcattccacg tcgaactctg tcgcgaaggt 1380aaaaacctcc tcaagcattt
tcgttttcgg gacctcgaag aagacccata cctgccaggg 1440aatccaaggg
aactgattgc ctacagccag taccctagac ctagcgacat cccacagtgg
1500aacagcgaca agccctccct caaggacatt aaaatcatgg gttatagtat
ccggactatt 1560gactacaggt ataccgtgtg ggtgggtttc aacccagacg
aatttctcgc caatttctcc 1620gacatccacg cgggcgaact gtatttcgtt gattccgatc
cactgcaaga tcataatatg 1680tacaacgata gtcaaggggg tgacctcttc
cagttgctaa tgccagaggc cgctgctaaa 1740gaggctgccg ccaaagaagc
cgccgctaag gactcctctc acgccttcac cctggacgag 1800ctgcggtact aa
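1812
SEQ ID NO 148
LENGTH: 13
TYPE: PRT
ORGANISM: Homo sapiens
SEQUENCE: 148
Asp Ser Ser His Ala Phe Thr Leu Asp Glu Leu Arg Tyr
1               5                   10
SEQ ID NO 149
LENGTH: 19
TYPE: PRT
ORGANISM: Homo sapiens
SEQUENCE: 149
Met Glu Trp Ser Trp Val Phe Leu Phe Phe Leu Ser Val Thr Thr Gly
1               5                   10                  15
Val His Ser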
* * * * *