U.S. patent application number 15/566365 was published by the patent office on 2018-05-03 for cyclized cytokine and method for producing same.
The applicant listed for this patent is National Institute of Advanced Industrial Science and Technology. Invention is credited to Shinya Honda, Takamitsu Miyafusa.
Application Number | 20180118799 15/566365 |
Document ID | / |
Family ID | 57127130 |
Filed Date | 2018-05-03 |
United States Patent Application | 20180118799 |
Kind Code | A1 |
Honda; Shinya; et al. |
May 3, 2018 |
CYCLIZED CYTOKINE AND METHOD FOR PRODUCING SAME
Abstract
The purpose of the present invention is to provide a method for
producing a very stable, cyclized mutant protein such that high
cyclization efficiency is achieved while the number of amino acids
added is minimal and the biological properties of an original
protein are maintained. In view of conformational information about
the original protein, secondary structure-free regions at N/C
terminal portions are deleted. Then, a protein database is screened
for proteins with secondary structures similar to those of N/C
terminal residues of a secondary structure-forming portion after
the deletion. The screening results are used to determine the amino
acid length of a loop structure through which the N-terminus and
the C-terminus of the secondary structure-forming portion of the
original protein are to be connected. A cyclized mutant protein is
finally produced having a loop structure with the determined amino
acid length.
Inventors: Honda; Shinya (Tsukuba-shi, JP); Miyafusa; Takamitsu (Tsukuba-shi, JP)
Applicant:
Name | City | State | Country | Type
National Institute of Advanced Industrial Science and Technology | Tokyo | | JP |
Family ID: 57127130
Appl. No.: 15/566365
Filed: April 13, 2016
PCT Filed: April 13, 2016
PCT NO: PCT/JP2016/061923
371 Date: October 13, 2017
Current U.S. Class: 1/1
Current CPC Class: A61K 38/22 20130101; C12N 2840/445 20130101; A61P 7/00 20180101; C07K 14/535 20130101; C12N 15/70 20130101; C07K 1/02 20130101; C12N 5/10 20130101; A61K 38/00 20130101; C07K 19/00 20130101; C12P 21/02 20130101
International Class: C07K 14/535 20060101 C07K014/535; A61P 7/00 20060101 A61P007/00; C12N 15/70 20060101 C12N015/70
Foreign Application Data
Date | Code | Application Number
Apr 13, 2015 | JP | 2015-082018
Claims
1. A method for producing a cyclized mutant protein with biological
properties not less than those of an original protein and with
stability higher than that of the original protein by modifying,
through the following steps (i) to (v), the original protein with a
specific three-dimensional structure, the method comprising the
steps of: (i) determining, based on conformational information
about the original protein, secondary structure-free regions at an
N-terminal portion and a C-terminal portion of the original protein
and determining an N-terminus and a C-terminus of a secondary
structure-forming portion of the original protein if the secondary
structure-free regions are deleted; (ii) screening a protein
database for proteins with secondary structures similar to those of
N-terminal residues and C-terminal residues of the secondary
structure-forming portion; (iii) determining, based on results of
the screening, an amino acid length of a loop structure through
which the N-terminus and the C-terminus of the secondary
structure-forming portion of the original protein are to be
connected; (iv) designing and producing a linear mutant protein
that can be subjected to cyclization so as to obtain a cyclized
mutant protein in which the N-terminus and the C-terminus of the
secondary structure-forming portion of the original protein are
connected via the loop structure with the amino acid length
determined; and (v) linking an N-terminus and a C-terminus of the
linear mutant protein by a chemical or biological method so that
the cyclization is made to obtain the cyclized mutant protein.
2. The method for producing a cyclized mutant protein according to
claim 1, wherein the cyclization of step (v) uses a trans-splicing
reaction mediated by split inteins.
3. The method for producing a cyclized mutant protein according to
claim 1, wherein the original protein with a specific
three-dimensional structure is a cytokine with a helix bundle
structure and the biological properties of the original protein
involve affinity for a receptor.
4. The method for producing a cyclized mutant protein according to
claim 3, wherein the original protein with a specific
three-dimensional structure is granulocyte colony-stimulating
factor (G-CSF).
5. (canceled)
6. A cyclized mutant protein consisting of an amino acid sequence
in which secondary structure-free regions at an N-terminal portion
and a C-terminal portion of an original protein with a specific
three-dimensional structure are deleted and a predetermined number
of amino acid residues is added to each of an N-terminus and a
C-terminus of a secondary structure-forming portion of the original
protein after the deletion, wherein the predetermined number of
amino acid residues is the number of amino acid residues, the
number corresponding to the number of amino acids connecting
secondary structures of N-terminal residues and C-terminal residues
of a secondary structure-forming portion of another protein, the
secondary structures being similar to those of the secondary
structure-forming portion of the original protein.
7. The cyclized mutant protein according to claim 6, wherein the
original protein with a specific three-dimensional structure is a
cytokine with a helix bundle structure.
8. The cyclized mutant protein according to claim 6, wherein the
original protein with a specific three-dimensional structure is a
granulocyte colony-stimulating factor (G-CSF) consisting of an
amino acid sequence set forth in SEQ ID NO: 1; 0 to 11 amino acid
residues are deleted from an N-terminus of the amino acid sequence
set forth in SEQ ID NO: 1; 0 to 3 amino acid residues are deleted
from a C-terminus of the amino acid sequence; a serine or cysteine
residue is added to an N-terminus and/or a C-terminus of the amino
acid sequence after the deletion; and an amino acid residue other
than proline is added to the C-terminus of the amino acid sequence
after the deletion.
9. The cyclized mutant protein according to claim 6, wherein the
protein consists of a cyclized amino acid sequence set forth in any
one of SEQ ID NOs: 6 to 9.
10. A linear mutant protein designed so as to produce the cyclized
mutant protein according to claim 6, wherein the linear mutant
protein consists of an amino acid sequence in which secondary
structure-free regions at an N-terminal portion and a C-terminal
portion of an original protein with a specific three-dimensional
structure are deleted and a predetermined number of amino acid
residues is added to each of an N-terminus and a C-terminus of a
secondary structure-forming portion of the original protein after
the deletion; the predetermined number of amino acid residues is
the number of amino acid residues, the number corresponding to the
number of amino acids connecting secondary structures of N-terminal
residues and C-terminal residues of a secondary structure-forming
portion of another protein, the secondary structures being similar
to those of the secondary structure-forming portion of the original
protein; and a most N-terminus amino acid residue and a most
C-terminus amino acid residue of the predetermined number of amino
acid residues added are amino acid residues that enable the linear
mutant protein to be subject to cyclization.
11. The linear mutant protein according to claim 10, wherein the
protein is designed so as to produce the cyclized mutant protein
that consists of a cyclized amino acid sequence set forth in any
one of SEQ ID NOs: 6 to 9 and consists of an amino acid sequence
set forth in any one of SEQ ID NOs: 2 to 5.
12. The linear mutant protein according to claim 10, wherein the
protein is designed so as to produce the cyclized mutant protein
that consists of a cyclized amino acid sequence set forth in any
one of SEQ ID NOs: 6 to 9 by using, as split inteins, DnaE-C
(C-intein) and DnaE-N (N-intein) of DnaE derived from Nostoc
punctiforme and consists of an amino acid sequence set forth in any
one of SEQ ID NOs: 10 to 13.
13. A nucleic acid comprising a nucleotide sequence encoding an
amino acid sequence of the mutant protein according to claim 6.
14. A recombinant vector comprising the nucleic acid according to
claim 13.
15. A transformant comprising the recombinant vector according to
claim 14.
16. A method for producing the mutant protein according to claim 6,
comprising using a transformant, said transformant comprising a
recombinant vector, said recombinant vector comprising a nucleic
acid, said nucleic acid comprising a nucleotide sequence encoding
an amino acid sequence of the mutant protein according to claim
6.
17. A pharmaceutical composition comprising, as an active
ingredient, the mutant protein according to claim 6.
18. A pharmaceutical composition for treatment of neutropenia,
comprising, as an active ingredient, the mutant protein according
to claim 6.
19. The cyclized mutant protein according to claim 7, wherein the
cytokine is selected from granulocyte colony-stimulating factor,
erythropoietin, and interferon α.
20. A method for producing the cyclized mutant protein of claim 6,
comprising the steps of: (i) determining, based on conformational
information about the original protein, secondary structure-free
regions at an N-terminal portion and a C-terminal portion of the
original protein and determining an N-terminus and a C-terminus of
a secondary structure-forming portion of the original protein if
the secondary structure-free regions are deleted; (ii) screening a
protein database for proteins with secondary structures similar to
those of N-terminal residues and C-terminal residues of the
secondary structure-forming portion; (iii) determining, based on
results of the screening, an amino acid length of a loop structure
through which the N-terminus and the C-terminus of the secondary
structure-forming portion of the original protein are to be
connected; (iv) designing and producing a linear mutant protein
that can be subjected to cyclization so as to obtain a cyclized
mutant protein in which the N-terminus and the C-terminus of the
secondary structure-forming portion of the original protein are
connected via the loop structure with the amino acid length
determined; and (v) linking an N-terminus and a C-terminus of the
linear mutant protein by a chemical or biological method so that
the cyclization is made to obtain the cyclized mutant protein, to
thereby produce the cyclized mutant protein of claim 6.
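Step (i) of the claimed method can be illustrated with a minimal sketch. It assumes that the conformational information has already been reduced to a per-residue secondary-structure string (here "H" for helix and "-" for no secondary structure). The function name `structured_core` and the example string are hypothetical and are not taken from the application.

```python
from typing import Tuple


def structured_core(ss: str, coil: str = "-") -> Tuple[int, int]:
    """Return 0-based inclusive (start, end) indices of the secondary
    structure-forming portion, i.e. the span left after trimming
    secondary structure-free residues from both termini."""
    start = 0
    while start < len(ss) and ss[start] == coil:
        start += 1
    end = len(ss) - 1
    while end >= start and ss[end] == coil:
        end -= 1
    if start > end:
        raise ValueError("no secondary structure found")
    return start, end


# Hypothetical annotation, for illustration only.
ss = "---HHHHHH--HHHHH--"
start, end = structured_core(ss)
n_deleted = start                # residues trimmed from the N-terminal portion
c_deleted = len(ss) - 1 - end    # residues trimmed from the C-terminal portion
print(start, end, n_deleted, c_deleted)
```

Internal regions without secondary structure (the "--" between the two helices above) are left untouched; only the terminal regions are trimmed, as step (i) describes.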
Description
TECHNICAL FIELD
[0001] The present invention relates to novel improved versions of
a pharmaceutically bioactive protein, preferably a cytokine with a
helix bundle conformation, and more preferably granulocyte
colony-stimulating factor (G-CSF), which acts to promote production
of granulocytes and to enhance functions of neutrophils, nucleic
acids encoding the improved proteins, and a production method
therefor.
BACKGROUND ART
(Protein Instability Problem When Used as Pharmaceutical)
[0002] When administered for treatment of humans, a bioactive
protein often has an undesirable metabolic half-life. This
intrinsic metabolic half-life frequently causes suboptimal
therapeutic efficacy, medication compliance problems, and dosing
schedules and administration regimens that are inconvenient for
patients.
(Conventional Major Stability-improving Strategies and Their
Problems)
[0003] In the production of a bioactive protein for treatment of
humans, its metabolic half-life has been extended by physical means
(e.g., alteration of an administration route, nanoparticle
encapsulation, and liposome encapsulation), chemical modifications
(e.g., emulsion, PEGylation, glycosylation), and modifications by
genetic engineering (e.g., amino acid substitution, fusion to human
serum albumin, incorporation of post-translational glycosylation)
(Non-Patent Literatures 1 and 2). Unfortunately, most such
approaches are expensive and entail additional time-consuming
processes during production. In addition, PEG itself is
immunogenic, so that anti-PEG antibodies are produced, resulting in
a decrease in drug efficiency of a PEGylated protein and a decrease
in safety, as disclosed in Non-Patent Literature 3.
(Cyclization-mediated, Stability-improving Approaches and Their
Problems)
[0004] Bioactive peptides each have a markedly shorter metabolic
half-life than regular proteins. The metabolic half-life has been
extended by cyclization of a certain linear peptide (Patent
Literatures 1 and 2). In various conventional methods for producing
a cyclized peptide, naturally occurring cyclized peptides have been
mimicked and a biological process and a chemical process have been
used in combination. For instance, cysteine residues have been
introduced by chemical modification or genetic engineering
modification so as to give a disulfide bond (Patent Literature 1).
In another technology, a linear precursor molecule is produced in a
cell (e.g., a bacterium); and an exogenous factor (e.g., an enzyme)
is then added to make the linear precursor molecule cyclic by using
a chemical process. In still another technology, a cyclized peptide
is produced in a cell (e.g., a bacterium) by genetic engineering
modification. For example, as described in Patent Literature 3, the
trans-splicing function of split inteins is exploited to cyclize a
precursor peptide that is interposed between the 2 parts of the
split inteins.
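The split-intein arrangement described above can be sketched at the level of sequence bookkeeping. This is only an illustration under stated assumptions: the precursor is modeled as C-intein, target, N-intein in that order, the intein labels are placeholder tokens rather than real sequences, and the target peptide is hypothetical. A cyclized product has no fixed start, so two linear strings describe the same ring if one is a rotation of the other.

```python
C_INTEIN = "[IntC]"  # placeholder label, not a real intein sequence
N_INTEIN = "[IntN]"  # placeholder label, not a real intein sequence


def excise_inteins(precursor: str,
                   c_intein: str = C_INTEIN,
                   n_intein: str = N_INTEIN) -> str:
    """Return the interposed target whose ends are joined after splicing."""
    if (len(precursor) < len(c_intein) + len(n_intein)
            or not precursor.startswith(c_intein)
            or not precursor.endswith(n_intein)):
        raise ValueError("precursor is not flanked by the expected inteins")
    return precursor[len(c_intein):len(precursor) - len(n_intein)]


def same_cycle(a: str, b: str) -> bool:
    """True if a and b are rotations of each other (the same cyclic sequence)."""
    return len(a) == len(b) and b in a + a


target = "SKLMNPQ"  # hypothetical peptide, for illustration only
precursor = C_INTEIN + target + N_INTEIN
ring = excise_inteins(precursor)
print(ring, same_cycle(ring, "MNPQSKL"))
```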
[0005] In still another method, a cyclized peptide is produced
through chemical modification by introducing unnatural amino acids
and cross-linkers. For instance, a technology has been known which
uses an olefin metathesis reaction to cross-link
(S')-α-(2'-pentenyl) alanine residues, thereby making a
peptide cyclized within a molecule (Patent Literature 4).
[0006] The above-mentioned cyclization methods are also used as a
method for extending the metabolic half-life of not only peptides
but also proteins. For proteins, which are more vulnerable than
peptides to some chemical synthesis processes, commonly used are
modifications by genetic engineering or a protocol in which
modifications by genetic engineering and chemical modifications
under mild conditions are combined. However, such approaches each
require addition, to a wild-type protein, of cross-linkers and an
amino acid sequence necessary for promoting a cyclization reaction.
Generally speaking, when amino acids are substituted and/or
inserted in a wild-type protein, the physical and biological
characteristics of the protein are disturbed. Thus, it is important
to complete a cyclization reaction without modifying the wild-type
amino acid sequence whenever possible.
[0007] In addition, when reaction efficiency of the cyclization
reaction is insufficient, an additional purification step of
separating the resulting cyclized protein from an unreacted linear
protein is needed. The cyclized protein and the unreacted
linear protein differ very little in molecular weight and
surface charge, so that it is difficult to
separate and purify them from each other. Meanwhile, the efficiency
of the cyclization reaction depends on the characteristics of an
exogenous factor (e.g., an enzyme) used for the cyclization
reaction and the characteristics of a target protein that is
subject to cyclization. There is no established common procedure to
increase the efficiency.
(About Cytokine with Helix Bundle Conformation)
[0008] Among bioactive proteins for treatment of humans, cytokines
such as granulocyte colony-stimulating factor (G-CSF) and
erythropoietin have long been clinically applicable. These
cytokines have a helix bundle conformation and high
three-dimensional homology. When stabilization of the conformation
is used as a basis to extend its metabolic half-life, analogous
approaches are often effective with respect to proteins with a
similar conformation (Non-Patent Literature 4). In addition, the
cytokines share structural features, such as their N-terminus and
C-terminus being positioned close to each other.
(About G-CSF)
[0009] Granulocyte colony-stimulating factor (G-CSF) is a
hematopoietic factor that was discovered as a factor for inducing
differentiation of granulocyte stem cells and that promotes production
of neutrophils in vivo. Thus, G-CSF is clinically applicable as a
drug for neutropenia after bone marrow transplantation and/or
cancer chemotherapy. Various sources may be used to obtain and
purify G-CSF. Glycosylated G-CSF, a product expressed in eukaryotic
host cells, and non-glycosylated G-CSF, a product expressed in
prokaryotic host cells, can be produced on an industrial scale. For
instance, the former is lenograstim, which is produced as a
polypeptide chain with the same 174 amino acids as the native
protein. The latter is filgrastim, which is produced as a
polypeptide chain of 175 amino acids that is identical to the native
protein except that a methionyl residue is added to the N-terminus.
Both are available for treatment use in several countries (Patent
Literatures 5 and 6). Filgrastim has demonstrated that addition of
a methionyl group to the N-terminus and a lack of glycosylation
do not affect the functions of G-CSF. This is a known phenomenon.
In addition, some reports have stated that a mutant in which amino acids
positioned at the N-terminal region (1 to 17 residues) are replaced
or deleted, and a mutant in which a peptide sequence having 12
amino acids is added to the C-terminus, retained the functions of
G-CSF (Non-Patent Literatures 5 and 6). These reports indicate that
the N-terminus and the C-terminus of G-CSF are tolerant to the
introduction of mutations.
(Examples of Improving Stability of G-CSF)
[0010] Various modifications have been reported so as to extend the
metabolic half-life of G-CSF. These modifications have been
conducted by a technique in which amino acids are genetically
replaced and modified or a technique in which PEGs, for example,
are chemically added (Patent Literatures 7 and 8). For instance,
the former is used for nartograstim, which is produced as a
polypeptide chain of 175 amino acids that is identical to the native
protein except that a methionyl residue is added to the N-terminus
and the residues Thr2, Leu4, Gly5, Pro6, and Cys18 are replaced by Ala,
Thr, Tyr, Arg, and Ser residues, respectively. The latter is used
for pegfilgrastim that is a conjugated product of filgrastim and
monomethoxy polyethylene glycol. They are available for treatment
use in several countries. The PEGylation of filgrastim, like common
PEGylation, has been demonstrated to result in an increase in
stability but a decrease in activity thereof. For example, it has
been shown that when compared with that of filgrastim, its
metabolic half-life is increased from 1.79 h to 7.05 h (Patent
Literature 7). By contrast, its activity is shown to be decreased
to about 68 to 21% of that of filgrastim (Patent Literature 8).
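The trade-off quoted above can be restated as a short worked calculation. The sketch below uses only the figures given in this paragraph (half-life of 1.79 h versus 7.05 h, residual activity of about 21 to 68%); the function name `fold_change` is hypothetical.

```python
def fold_change(before: float, after: float) -> float:
    """Ratio of a quantity after modification to its value before."""
    return after / before


# Half-life values quoted from Patent Literature 7 in the text above.
half_life_fold = fold_change(1.79, 7.05)

# Residual activity range quoted from Patent Literature 8, as fractions.
activity_low, activity_high = 0.21, 0.68

print(f"half-life increases about {half_life_fold:.2f}-fold")
print(f"activity falls to {activity_low:.0%} to {activity_high:.0%} of filgrastim")
```

In other words, PEGylation extends the half-life roughly fourfold while leaving between about one fifth and two thirds of the original activity.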
[0011] In addition to the above, Patent Literature 9 reports that,
by taking advantage of the fact that the N-terminus and the
C-terminus of G-CSF are closely positioned in its conformation,
G-CSF has been modified to obtain a cyclized polypeptide chain as
follows: a G-CSF in which a plurality of Gly residues are added to the
N-terminus and Leu, Pro, Xaa (Xaa is any amino acid), Thr, and Gly
are added to the C-terminus is expressed and isolated, and sortase
A derived from Staphylococcus aureus is then added thereto to chemically
produce the cyclized polypeptide chain in vitro.
CITATION LIST
Patent Literature
[0012] Patent Literature 1: U.S. Pat. No. 4,033,940
[0013] Patent Literature 2: U.S. Pat. No. 4,102,877
[0014] Patent Literature 3: U.S. Pat. No. 7,846,710
[0015] Patent Literature 4: National Publication of International Patent Application No. 2008-501623
[0016] Patent Literature 5: U.S. Pat. No. 4,810,643
[0017] Patent Literature 6: U.S. Pat. No. 5,104,651
[0018] Patent Literature 7: U.S. Pat. No. 5,824,778
[0019] Patent Literature 8: U.S. Pat. No. 5,985,265
[0020] Patent Literature 9: U.S. Pat. No. 8,940,501
Non Patent Literature
[0021] Non Patent Literature 1: B. I. Lord, L. B. Woolford, and G. Molineux, Clin. Cancer Res., 2001, 7, 2085-90.
[0022] Non Patent Literature 2: P. van Der Auwera, E. Platzer, Z. X. Xu, R. Schulz, O. Feugeas, R. Capdeville, and D. J. Edwards, Am. J. Hematol., 2001, 66, 245-51.
[0023] Non Patent Literature 3: Guidance for Industry "Immunogenicity Assessment for Therapeutic Protein Products", U.S. Department of Health and Human Services, Food and Drug Administration, August 2014, 1-36.
[0024] Non Patent Literature 4: M. W. Popp, S. K. Dougan, T. Y. Chuang, E. Spooner, and H. L. Ploegh, Proc. Natl. Acad. Sci. U.S.A., 2011, 108, 3169-74.
[0025] Non Patent Literature 5: T. Kuga, Y. Komatsu, M. Yamasaki, S. Sekine, H. Miyaji, T. Nishi, M. Sato, Y. Yokoo, M. Asano, M. Okabe, M. Morimoto, and S. Itoh, Biochem. Biophys. Res. Commun., 1989, 159, 103-111.
[0026] Non Patent Literature 6: Y. Oshima, A. Tojo, Y. Niho, and S. Asano, Biochem. Biophys. Res. Commun., 2000, 267, 924-7.
SUMMARY OF INVENTION
Problems to be Solved by the Invention
[0027] The purpose of the present invention is to solve
conventional technical problems in the production of a cyclized
protein.
[0028] As described above, a cyclized peptide or a cyclized protein
is produced by the same general procedure in each case: a linear
polypeptide is first synthesized, and a chemical or
biological process is then used to carry out a cyclization
reaction. In this procedure, addition of a sequence for the cyclization
reaction to a wild-type amino acid sequence may cause an unexpected
change in the physical and biological characteristics of a target
protein. In addition, if the linear polypeptide remains as a
by-product because of low cyclization reaction efficiency, the subsequent
purification step is generally difficult. As a result, a target
substance with high purity cannot be obtained.
[0029] The present invention addresses the problem of establishing
a production method for a cyclized protein in which high
cyclization reaction efficiency is achieved and
the number of amino acids added is minimal, so that
the biological characteristics of an original protein are retained and
a high-purity cyclized protein is provided.
[0030] In particular, the present invention addresses the problem
of establishing a method for producing a high-purity cyclized
molecule modified by adding the minimum number of amino acids to a
cytokine (e.g., G-CSF), a very stable modified version of which
should be developed as a pharmaceutical, and providing the
high-purity cyclized cytokine.
Means for Solving the Problems
[0031] The present inventors considered, as a cyclization
efficiency-determining factor, how closely the N-terminus and the
C-terminus are positioned in the conformation of a protein that is
subject to cyclization. From this viewpoint, cyclized proteins
having various modifications at their terminal sequences were
designed and synthesized.
[0032] Specifically, a group of novel cyclized cytokines designed
from the above viewpoint were produced and the cyclization
efficiencies thereof were compared. In addition, the structure and
function of each cyclized cytokine purified were examined in detail
and were compared with those of a linear cytokine counterpart. This
comparison showed that each cyclized cytokine had comparable
functions and was very stable.
[0033] That is, the present application provides the following
aspects of the present invention.
[0034] <1> A method for producing a cyclized mutant protein
with biological properties not less than those of an original
protein and with stability higher than that of the original protein
by modifying, through the following steps (i) to (v), the original
protein with a specific three-dimensional structure, the method
comprising the steps of:
[0035] (i) determining, based on conformational information about
the original protein, secondary structure-free regions at an
N-terminal portion or a C-terminal portion of the original protein
and determining an N-terminus and a C-terminus of a secondary
structure-forming portion of the original protein if the secondary
structure-free regions are deleted;
[0036] (ii) screening a protein database for proteins with
secondary structures similar to those of N-terminal residues and
C-terminal residues of the secondary structure-forming portion;
[0037] (iii) determining, based on results of the screening, an
amino acid length of a loop structure through which the N-terminus
and the C-terminus of the secondary structure-forming portion of
the original protein are to be connected;
[0038] (iv) designing and producing a linear mutant protein that
can be subjected to cyclization so as to obtain a cyclized mutant
protein in which the N-terminus and the C-terminus of the secondary
structure-forming portion of the original protein are connected via
the loop structure with the amino acid length determined; and
[0039] (v) linking an N-terminus and a C-terminus of the linear
mutant protein by a chemical or biological method so that the
cyclization is made to obtain the cyclized mutant protein.
[0040] <2> The method for producing a cyclized mutant protein
according to item <1>, wherein the cyclization of step (v)
uses a trans-splicing reaction mediated by split inteins.
[0041] <3> The method for producing a cyclized mutant protein
according to item <1> or <2>, wherein the original
protein with a specific three-dimensional structure is a cytokine
with a helix bundle structure and the biological properties of the
original protein involve affinity for a receptor.
[0042] <4> The method for producing a cyclized mutant protein
according to item <3>, wherein the original protein with a
specific three-dimensional structure is granulocyte
colony-stimulating factor (G-CSF).
[0043] <5> A cyclized mutant protein produced using the
production method according to any one of items <1> to <4>.
[0044] <6> A cyclized mutant protein consisting of an amino
acid sequence in which secondary structure-free regions at an
N-terminal portion or a C-terminal portion of an original protein
with a specific three-dimensional structure are deleted and a
predetermined number of amino acid residues is added to each of an
N-terminus and a C-terminus of a secondary structure-forming
portion of the original protein after the deletion, wherein the
predetermined number of amino acid residues is the number of amino
acid residues, the number corresponding to the number of amino
acids connecting secondary structures of N-terminal residues and
C-terminal residues of a secondary structure-forming portion of
another protein, the secondary structures being similar to those of
the secondary structure-forming portion of the original
protein.
[0045] <7> The cyclized mutant protein according to item
<6>, wherein the protein with a specific three-dimensional
structure is a cytokine with a helix bundle structure.
[0046] <8> The cyclized mutant protein according to item
<6> or <7>, wherein the original protein with a
specific three-dimensional structure is a granulocyte
colony-stimulating factor (G-CSF) consisting of an amino acid
sequence set forth in SEQ ID NO: 1; 0 to 11 amino acid residues are
deleted from an N-terminus of the amino acid sequence set forth in
SEQ ID NO: 1; 0 to 3 amino acid residues are deleted from a
C-terminus of the amino acid sequence; a serine or cysteine residue
is added to an N-terminus and/or a C-terminus of the amino acid
sequence after the deletion; and an amino acid residue other than
proline is added to the C-terminus of the amino acid sequence after
the deletion.
[0047] <9> The cyclized mutant protein according to any one
of items <6> to <8>, wherein the protein consists of a
cyclized amino acid sequence set forth in any one of SEQ ID NOs: 6
to 9.
[0048] <10> A linear mutant protein designed so as to produce
the cyclized mutant protein according to any one of items <6>
to <9>, wherein the linear mutant protein consists of an
amino acid sequence in which secondary structure-free regions at an
N-terminal portion and a C-terminal portion of an original protein
with a specific three-dimensional structure are deleted and a
predetermined number of amino acid residues is added to each of an
N-terminus and a C-terminus of a secondary structure-forming
portion of the original protein after the deletion; the
predetermined number of amino acid residues is the number of amino
acid residues, the number corresponding to the number of amino
acids connecting secondary structures of N-terminal residues and
C-terminal residues of a secondary structure-forming portion of
another protein, the secondary structures being similar to those of
the secondary structure-forming portion of the original protein;
and a most N-terminus amino acid residue and a most C-terminus
amino acid residue of the predetermined number of amino acid
residues added are amino acid residues that enable the linear
mutant protein to be subject to cyclization.
[0049] <11> The linear mutant protein according to item
<10>, wherein the protein is designed so as to produce the
cyclized mutant protein according to item <9> and consists of
an amino acid sequence set forth in any one of SEQ ID NOs: 2 to
5.
[0050] <12> The linear mutant protein according to item
<10>, wherein the protein is designed so as to produce the
cyclized mutant protein according to item <9> by using, as
split inteins, DnaE-C (C-intein) and DnaE-N (N-intein) of DnaE
derived from Nostoc punctiforme and consists of an amino acid
sequence set forth in any one of SEQ ID NOs: 10 to 13.
[0051] <13> A nucleic acid comprising a nucleotide sequence
encoding an amino acid sequence of the mutant protein according to
any one of items <5> to <12>.
[0052] <14> A recombinant vector comprising the nucleic acid
according to item <13>.
[0053] <15> A transformant comprising the recombinant vector
according to item <14>.
[0054] <16> A method for producing the mutant protein
according to any one of items <5> to <12>, using the
transformant according to item <15>.
[0055] <17> A pharmaceutical composition comprising, as an
active ingredient, the mutant protein according to any one of items
<5> to <12>.
[0056] <18> A pharmaceutical composition for treatment of
neutropenia, comprising, as an active ingredient, the mutant
protein according to any one of items <5> to <12>.
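Step (iii) of the method, choosing a loop length from the screening results of step (ii), can be sketched as follows. The selection rule is an assumption made for illustration: the text states only that the loop length is determined from the screening results, so this sketch takes the most frequent loop length among the hits and breaks ties toward the shorter loop, in keeping with the stated aim of adding as few amino acids as possible. The function name `pick_loop_length` and the example hit list are hypothetical.

```python
from collections import Counter
from typing import Iterable


def pick_loop_length(hit_loop_lengths: Iterable[int]) -> int:
    """Pick the most common loop length among database hits.

    Assumed rule (not stated in the text): most frequent length wins,
    and ties are resolved in favor of the shortest loop.
    """
    counts = Counter(hit_loop_lengths)
    if not counts:
        raise ValueError("screening returned no hits")
    best = max(counts.values())
    return min(length for length, n in counts.items() if n == best)


# Hypothetical loop lengths (in residues) observed in screening hits.
hits = [2, 5, 2, 9, 5, 2]
print(pick_loop_length(hits))
```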
Advantageous Effects of Invention
[0057] Specifically, the present invention provides
G-CSF(C177), a cyclized protein composed of an amino acid sequence
in which only 2 amino acids are added to a linear control G-CSF. In
addition, G-CSF(C163) is a cyclized protein composed of an amino
acid sequence in which nonstructural regions at the N-terminus and
the C-terminus are deleted and only 2 amino acids are added.
Further, G-CSF(C166) or G-CSF(C170) is a cyclized protein composed
of an amino acid sequence in which nonstructural regions at the
N-terminus and the C-terminus are deleted and 5 or 9 amino acid
residues, respectively, are added.
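The residue counts implied by these names can be checked with a small calculation. The sketch below rests on assumptions that are not stated in this paragraph: that the number in each name is the residue count of the cyclized protein, that the linear control G-CSF has 175 residues (inferred from G-CSF(C177) being the control plus 2), and that the truncated designs delete the maximum numbers given in item <8> (11 N-terminal and 3 C-terminal residues). The function name `cyclized_length` is hypothetical.

```python
LINEAR_CONTROL_LENGTH = 175  # assumption: inferred from C177 = control + 2
MAX_N_DELETED = 11           # upper bound given in item <8>
MAX_C_DELETED = 3            # upper bound given in item <8>


def cyclized_length(n_deleted: int, c_deleted: int, added: int) -> int:
    """Residue count after terminal deletions and loop-residue additions."""
    if not 0 <= n_deleted <= MAX_N_DELETED:
        raise ValueError("N-terminal deletion outside 0 to 11")
    if not 0 <= c_deleted <= MAX_C_DELETED:
        raise ValueError("C-terminal deletion outside 0 to 3")
    if added < 0:
        raise ValueError("number of added residues cannot be negative")
    return LINEAR_CONTROL_LENGTH - n_deleted - c_deleted + added


# (N-terminal deletions, C-terminal deletions, residues added); the deletion
# counts for the truncated designs are assumed, as noted above.
designs = {
    "G-CSF(C177)": (0, 0, 2),
    "G-CSF(C163)": (11, 3, 2),
    "G-CSF(C166)": (11, 3, 5),
    "G-CSF(C170)": (11, 3, 9),
}
lengths = {name: cyclized_length(*spec) for name, spec in designs.items()}
print(lengths)
```

Under these assumptions the three truncated designs share a 161-residue core (175 - 11 - 3), and the computed lengths agree with the numbers in the names.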
[0058] When compared with commercially available filgrastim and the
linear control G-CSF, all of them maintain intrinsic
receptor-binding activity and have increased thermostability and
protease resistance. Among them, G-CSF(C163), G-CSF(C166), and
G-CSF(C170) achieved high cyclization efficiency even though
the number of amino acids added is small.
[0059] Currently, G-CSF has been widely used in clinical practice
including treatment of neutropenia caused by cancer chemotherapy.
Development of biobetter drugs is in progress so as to enhance
stability by using a variety of strategies such as a mutant G-CSF
and a chemically-modified G-CSF. Among the strategies, a
polypeptide chain cyclization technique aims at increasing
stability while change in the amino acid sequence and change in the
conformation of a protein can be minimized. An undesirable
characteristic change such as an increase in immunogenicity can
also be minimized. Thus, the technique seems to have a substantial
advantage in the production of bioactive proteins for treatment of
humans.
[0060] According to the present invention, it is possible to
reduce to a minimum the addition of a sequence necessary for a
cyclization reaction, which addition is the biggest disadvantage in
increasing stability by cyclization in view of the above-described
situations. Further, the present invention can resolve the problems
of low cyclization reaction efficiency and a complicated
purification step, thereby significantly contributing to technical
advances in the art.
BRIEF DESCRIPTION OF DRAWINGS
[0061] FIG. 1 compares the amino acid sequence of filgrastim and the
amino acid sequence of a linear control G-CSF, in which some amino
acids of the filgrastim sequence are replaced, with the amino acid
sequences of the four different cyclized G-CSFs designed. The
schematic diagram in the top row shows helix-forming regions
predicted on the basis of a known G-CSF conformation (PDB code:
2D9Q).
[0062] FIG. 2 is a synthetic scheme for a cyclized G-CSF produced
using inteins.
[0063] FIG. 3 is an SDS-PAGE picture showing a group of modified
cyclized G-CSFs (G-CSF(C177) and G-CSF(C163)) after
purification.
[0064] FIG. 4 shows sensorgrams obtained by measuring, by means of
surface plasmon resonance, the binding activity of each modified
cyclized G-CSF (G-CSF(C177) or G-CSF(C163)) toward a G-CSF
receptor.
[0065] FIG. 5 shows graphs of the activity of each modified
cyclized G-CSF (G-CSF(C177) or G-CSF(C163)) on cells.
[0066] FIG. 6 is a graph obtained by measuring, by means of
circular dichroism, the thermostability of each modified cyclized
G-CSF (G-CSF(C177) or G-CSF(C163)).
[0067] FIG. 7 is a graph in which the protease resistance of each
modified cyclized G-CSF (G-CSF(C177) or G-CSF(C163)) was evaluated
using the remaining amount thereof after a reaction.
[0068] FIG. 8 shows electron density maps of terminal regions of each
modified cyclized G-CSF (G-CSF(C177) or G-CSF(C163)), which regions
were subjected to cyclization and are visualized by X-ray
crystallography.
[0069] FIG. 9 presents SDS-PAGE pictures showing, among the modified
cyclized G-CSFs, G-CSF(C170) and G-CSF(C166) after purification by
size-exclusion chromatography.
[0070] FIG. 10 shows sensorgrams obtained by measuring, by means of
surface plasmon resonance, the binding activity of each of
G-CSF(C170), G-CSF(C166), and filgrastim among the modified
cyclized G-CSFs toward a G-CSF receptor, and sensorgrams obtained
by re-measuring the binding activity of each of the linear control
G-CSF, G-CSF(C177), and G-CSF(C163) toward the G-CSF receptor.
[0071] FIG. 11 is a graph obtained by measuring, by means of
circular dichroism, the thermostability of each of G-CSF(C170) and
G-CSF(C166) among the modified cyclized G-CSFs, and a graph
obtained by re-measuring the thermostability of each of the linear
control G-CSF, G-CSF(C177), and G-CSF(C163).
[0072] FIG. 12 is a graph in which the protease resistance of each
of G-CSF(C170) and G-CSF(C166) among the modified cyclized G-CSFs
was evaluated using the remaining amount thereof after a
reaction.
MODE FOR CARRYING OUT THE INVENTION
(Cyclized Mutant Protein and Production Method Thereof)
[0073] In a production method according to an aspect of the present
invention, an original protein with a specific three-dimensional
structure is modified to obtain a cyclized mutant protein after the
following steps (i) to (v):
[0074] (i) determining, based on conformational information about
the original protein, secondary structure-free regions at an
N-terminal portion or a C-terminal portion of the original protein
and determining an N-terminus and a C-terminus of a secondary
structure-forming portion of the original protein if the secondary
structure-free regions are deleted;
[0075] (ii) screening a protein database for proteins with
secondary structures similar to those of N-terminal residues and
C-terminal residues of the secondary structure-forming portion;
[0076] (iii) determining, based on results of the screening, an
amino acid length of a loop structure through which the N-terminus
and the C-terminus of the secondary structure-forming portion of
the original protein are to be connected;
[0077] (iv) designing and producing a linear mutant protein that
can be subjected to cyclization so as to obtain a cyclized mutant
protein in which the N-terminus and the C-terminus of the secondary
structure-forming portion of the original protein are connected via
the loop structure with the amino acid length determined; and
[0078] (v) linking an N-terminus and a C-terminus of the linear
mutant protein by a chemical or biological method so that the
cyclization is made to obtain the cyclized mutant protein.
[0079] Hereinbelow, disclosed is a method for producing a cyclized
mutant protein when granulocyte colony-stimulating factor (G-CSF)
is used as an original protein with a specific three-dimensional
structure. The present production method is not limited to this
embodiment, and is widely applicable to cyclization of various
proteins. The present production method is applicable to
cyclization of any protein as long as the amino acid sequence of
the protein is known and its conformation (three-dimensional
structure) is already determined. In particular, from the viewpoint
of cyclization efficiency, preferred is cyclization of a protein,
the N-terminus and the C-terminus of which are positioned closely
in its conformation. Here, G-CSF is known to have a helix bundle
structure as its three-dimensional structure. The present
production method seems to particularly fit cyclization of a
protein (e.g., erythropoietin, interferon α) with the same helix
bundle structure as that of G-CSF. Those skilled in the art can carry
out cyclization of another protein by using substantially the same
technique as the present production method when the below-described
G-CSF cyclization technique is taken into consideration. This
production method enables very efficient cyclization while the
number of amino acids added is minimal. Also, the secondary
structure of a cyclized mutant protein as obtained using this
production method does not change from that of an original protein.
This yields such advantages as the biological properties of the
original protein being retained and the stability being higher than
that of the original protein.
(Description of G-CSF)
[0080] Granulocyte colony-stimulating factor (G-CSF) is a cytokine
that binds to a granulocyte colony-stimulating factor receptor
(G-CSF receptor) to induce production of neutrophils, etc., and is
used for treatment of neutropenia during cancer chemotherapy, etc.
While G-CSF is very useful in treatment, it unfortunately has low
stability. Accordingly, efforts have been made to increase its
stability (Reference Document 1). It is known that the N-terminus
and the C-terminus of G-CSF are positioned closely in its
three-dimensional structure. It has been reported that when the
termini are linked by a covalent bond, the thermostability
increases (Reference Document 2).
[0081] In the present invention, a linear polypeptide consisting of
175 amino acids set forth in SEQ ID NO: 1 was used as a linear
control G-CSF, a starting material for improvement. When the amino
acid sequence of the linear control G-CSF is compared with the
amino acid sequence (SEQ ID NO: 14) of filgrastim, threonine at
position 2 and cysteine at position 18 are replaced by alanine and
serine, respectively. A previous report has demonstrated that these
mutations do not affect the activity of G-CSF (Reference Document
3).
(Description of Mutants Developed Herein)
[0082] Improved proteins developed herein each consist of an amino
acid sequence that is human-designed on the basis of the amino acid
sequence of the linear control G-CSF, and can be produced highly
efficiently as a cyclized molecule without impairing the G-CSF
receptor-binding activity while the number of amino acids added is
minimal.
(Sequences of Specific Mutants Developed Herein)
[0083] Four different improved proteins developed and designed
herein are composed of the respective four different amino acid
sequences described below (FIG. 1).
[0084] G-CSF(C177) is a polypeptide consisting of 177 amino acids
set forth in SEQ ID NO: 2, and an amino group of serine at the
N-terminus and a carbonyl group of glycine at the C-terminus are
linked by a peptide bond as shown in SEQ ID NO: 6. The amino acid
sequence set forth in SEQ ID NO: 2 corresponds to a sequence in
which serine is added at the N-terminus and glycine is added to the
C-terminus of the amino acid sequence (SEQ ID NO: 1) of the linear
control G-CSF.
[0085] G-CSF(C170) is a polypeptide consisting of 170 amino acids
set forth in SEQ ID NO: 3, and an amino group of serine at the
N-terminus and a carbonyl group of glycine at the C-terminus are
linked by a peptide bond as shown in SEQ ID NO: 7. The amino
acid sequence set forth in SEQ ID NO: 3 corresponds to a sequence
in which N-terminal 4 amino acid residues and C-terminal 3 amino
acid residues are deleted from the amino acid sequence (SEQ ID NO:
1) of the linear control G-CSF and serine is added to its
N-terminus and glycine is added to its C-terminus.
[0086] G-CSF(C166) is a polypeptide consisting of 166 amino acids
set forth in SEQ ID NO: 4, and an amino group of serine at the
N-terminus and a carbonyl group of glycine at the C-terminus are
linked by a peptide bond as shown in SEQ ID NO: 8. The amino acid
sequence set forth in SEQ ID NO: 4 corresponds to a sequence in
which N-terminal 8 amino acid residues and C-terminal 3 amino acid
residues are deleted from the amino acid sequence (SEQ ID NO: 1) of
the linear control G-CSF and serine is added to its N-terminus and
glycine is added to its C-terminus.
[0087] In addition, G-CSF(C163) is a polypeptide consisting of 163
amino acids set forth in SEQ ID NO: 5, and an amino group of serine
at the N-terminus and a carbonyl group of glycine at the C-terminus
are linked by a peptide bond as shown in SEQ ID NO: 9. The
amino acid sequence set forth in SEQ ID NO: 5 corresponds to a
sequence in which N-terminal 11 amino acid residues and C-terminal
3 amino acid residues are deleted from the amino acid sequence (SEQ
ID NO: 1) of the linear control G-CSF and serine is added to its
N-terminus and glycine is added to its C-terminus.
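The residue counts given above for the four improved proteins follow from simple arithmetic on the 175-residue linear control G-CSF. As a quick consistency check, the following Python sketch recomputes each chain length from the stated terminal deletions and the two added residues (serine and glycine). The dictionary and function names are illustrative only and are not part of the disclosure.

```python
# Length of the linear control G-CSF (SEQ ID NO: 1), as stated in the text.
LINEAR_CONTROL_LENGTH = 175
# One serine (N-terminus) and one glycine (C-terminus) are added for cyclization.
ADDED_RESIDUES = 2

# (N-terminal residues deleted, C-terminal residues deleted), per the text.
DELETIONS = {
    "G-CSF(C177)": (0, 0),
    "G-CSF(C170)": (4, 3),
    "G-CSF(C166)": (8, 3),
    "G-CSF(C163)": (11, 3),
}


def variant_length(n_deleted: int, c_deleted: int) -> int:
    """Return the chain length after terminal deletions and Ser/Gly addition."""
    return LINEAR_CONTROL_LENGTH - n_deleted - c_deleted + ADDED_RESIDUES


LENGTHS = {name: variant_length(*d) for name, d in DELETIONS.items()}

for name, length in LENGTHS.items():
    print(name, length)
```

Each computed length matches the number in the variant name, which is how the designations C177, C170, C166, and C163 can be read.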
(Concerning Addition of Amino Acids to N-terminus and C-terminus so
as to Carry out Cyclization Reaction)
[0088] To make a protein cyclic in accordance with the present
invention, it is possible to use an intracellular cyclization
reaction using inteins. DnaE derived from Nostoc punctiforme can be
used as the inteins. More specifically, it is possible to use
DnaE-C and DnaE-N, proteins consisting of amino acid sequences set
forth in SEQ ID NOs: 15 and 16, respectively. A polypeptide chain
in which a target protein is interposed between the DnaE-C and the
DnaE-N is subject to spontaneous splicing using DnaE autocatalytic
function, thereby synthesizing a cyclized protein in which an amino
group at the N-terminus and a carbonyl group at the C-terminus of
the target protein are linked by a peptide bond (by means of split
inteins; see FIG. 2).
[0089] Use of split inteins requires cysteine or serine at either
the N-terminus or the C-terminus of a target protein; in addition,
proline at the C-terminus markedly decreases reaction efficiency
(Reference Document 6). Because of this, in the cases of the
above-mentioned present improved proteins, serine is added to the
N-terminus and glycine is added to the C-terminus of the target
protein. However, amino acids usable at the N-terminus or the
C-terminus are not limited to the above amino acids. In addition to
the above combination, 78 different sequence combinations may each
be a choice. Whether the sequence combination is good or poor may
be evaluated by a method for comparing proteins actually
synthesized and purified. In addition, it is possible to use a
method including: replacing, on a model, a loop region of a known
conformation described below by each corresponding sequence; and
computationally comparing the free energies thereof.
[0090] In the cases of the improved G-CSFs, the present inventors
designed, from the above viewpoint, a sequence (SEQ ID NO: 2) in
which serine is added to the N-terminus and glycine is added to the
C-terminus of the amino acid sequence (SEQ ID NO: 1) of the linear
control G-CSF, and produced G-CSF(C177) by using split inteins.
G-CSF(C177) had G-CSF receptor-binding activity comparable to that
of the linear control G-CSF and exhibited better activity in cells,
thermostability, protease resistance, and metabolic half-life, but
exhibited somewhat low cyclization efficiency (see Table 1).
(To Select Sequences Deleted from N-terminus and C-terminus of
Original Protein (Step (i)))
[0091] In order to modify an original protein to obtain
substantially the same biological activity as that of the original
protein, it is important not to damage the conformation of the
original protein. Also, to achieve high cyclization efficiency, it
seems to be critical that the N-terminus and the C-terminus of the
modified protein are positioned closely in its conformation.
[0092] To design an improved G-CSF from this viewpoint, the present
inventors removed conformationally disordered regions from the
N-terminal sequence and the C-terminal sequence of the original
protein. Specifically, in the known G-CSF conformation model, a
secondary structure-forming region is determined to be a
conformationally stable region and a secondary structure-free
region is determined to be a disordered region. More specifically,
11 amino acids from methionine at position 1 to proline at position
11 and 3 amino acids from alanine at position 173 to proline at
position 175 of the amino acid sequence (SEQ ID NO: 1) of the
linear control G-CSF were determined to be secondary structure-free
regions and were then removed.
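The trimming described above can be illustrated on a per-residue secondary structure string. The sketch below assumes a hypothetical DSSP-style annotation in which "H" marks a helical residue and "-" marks a residue without secondary structure. The function name and the toy string are illustrative only and do not reproduce the actual assignment for G-CSF; the toy string is merely built so that its terminal regions have the lengths stated above (11 and 3 residues).

```python
def structured_core(ss: str, free: str = "-") -> tuple[int, int]:
    """Return 1-based (first, last) positions of the secondary structure-forming portion.

    Only terminal stretches without secondary structure are trimmed;
    internal loops between helices are left untouched.
    """
    trimmed_left = ss.lstrip(free)
    if not trimmed_left:
        raise ValueError("no secondary structure found")
    first = len(ss) - len(trimmed_left) + 1
    last = len(ss.rstrip(free))
    return first, last


# Toy annotation (175 positions): 11 free, helix, internal loop, helix, 3 free.
toy_ss = "-" * 11 + "H" * 30 + "-" * 10 + "H" * 121 + "-" * 3
first, last = structured_core(toy_ss)
print(first, last)  # 12 172
```

With this toy input, positions 1 to 11 and 173 to 175 fall outside the returned range, which corresponds to the regions removed in the text.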
[0093] Reference Documents 4 and 5 report a plurality of algorithms
used to determine a secondary structure-forming region in
conformation modeling. Their determination often gives different
results with respect to a terminal portion of the secondary
structure. Accordingly, different algorithms adopted result in
different candidates for a cleavage site. In addition, to determine
the disordered region, it is possible to use a molecular dynamics
technique to calculate a change in the structure of the terminal
portion. When different methods give different determination
results, it is reasonable to adopt a representative value obtained
by statistical processing. As the representative value, preferred
is a mode. If there are different results, a value at which the
secondary structure-forming region is minimal is adopted.
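When several assignment methods disagree about where the secondary structure begins or ends, the rule above can be sketched as follows. The sketch takes the mode of the reported boundary positions and, as one possible reading of the tie-break stated above, falls back to the value that makes the secondary structure-forming region smallest. The function name and the input values are invented for illustration.

```python
from collections import Counter


def representative_boundary(positions: list[int], terminus: str) -> int:
    """Pick a representative boundary of the secondary structure-forming region.

    positions: 1-based boundary residue reported by each method.
    terminus: "N" for the first structured residue, "C" for the last one.
    """
    counts = Counter(positions)
    top = max(counts.values())
    modes = [p for p, c in counts.items() if c == top]
    # Tie-break (assumed reading of the text): keep the smallest structured
    # region, i.e. the latest start at the N-terminus or the earliest end at
    # the C-terminus.
    return max(modes) if terminus == "N" else min(modes)


print(representative_boundary([12, 12, 11, 13], "N"))      # 12 (single mode)
print(representative_boundary([12, 13, 12, 13], "N"))      # 13 (tie, smaller core)
print(representative_boundary([172, 173, 172, 173], "C"))  # 172 (tie, smaller core)
```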
[0094] If information on the conformation of some protein other
than G-CSF is available, secondary structure-free regions at the
N-terminus and the C-terminus are likewise determined. Then, the
N-terminus and the C-terminus of the protein when the secondary
structure-free regions are deleted (i.e., the N-terminus and the
C-terminus of the secondary structure-forming portion) can be
determined. Note that in this step, it is not necessary to actually
remove a secondary structure-free region from a protein by means of
deletion and mutation, etc. It is simply sufficient to determine,
in a design simulation, the N-terminus and the C-terminus thereof
while considering only a secondary structure-forming region without
considering secondary structure-free regions.
(Loop Design)
[0095] Of the two termini of a protein after secondary
structure-free regions at the two termini have been removed, each
has a secondary structure, so that it seems to be difficult to link
the termini as they are and to make the protein cyclic. Then, the
present inventors added, to each terminus, amino acid residues (or
a sequence) used to form a loop with an appropriate length so as to
connect these termini after conformationally disordered regions
have been removed from the N-terminal sequence and the C-terminal
sequence of G-CSF for cyclization. Note that as used herein, the
loop refers to a portion composed of an amino acid sequence
connecting both the termini of the protein after the secondary
structure-free regions have been removed from the original termini
(i.e., connecting the N-terminus and the C-terminus of a secondary
structure-forming portion of a protein of interest).
[0096] The appropriate length of the loop was determined using the
following procedure.
(Screening for Analogous Proteins (Step (ii)))
[0097] First, from protein conformations registered in a database,
a group of proteins with a "helix-loop-helix" secondary structure
in which two helices are connected via a loop was retrieved and a
group of proteins with structural homology to G-CSF was further
selected. Specifically retrieved was a group of proteins with a
helix-loop-helix structure in which proximal 4-amino-acid portions
(total of 8 amino acids) adjacent to the loop between the two
helices are structurally similar to 4-amino-acid portions (total of
8 amino acids) at the corresponding G-CSF helix termini (i.e., the
N-terminus and the C-terminus of the secondary structure-forming
region after the secondary structure-free regions have been
removed). For example, when 4 amino acid residues (α-carbon
atoms) positioned at each of the two helix terminal portions of
G-CSF and those of a subject protein are superimposed, their root
mean square deviation may be less than 1.5 Å and preferably
less than 1 Å. In this case, the subject protein can be
retrieved as a structurally similar protein. In this regard,
however, the amino acid residue lengths at both the helix termini
being compared herein do not have to be 4, and may be adjusted to,
for example, a range from 1 to 10 and preferably from 3 to 6
depending on how many proteins have a helix-loop-helix structure
being selected based on structural similarity. That is, if the
amino acid residue lengths at both the helix termini being compared
are short, the number of the structurally similar proteins being
selected becomes large, and if the amino acid residue lengths are
long, the number of the structurally similar proteins being
selected becomes small. Because of this, the amino acid lengths at
both the helix termini being compared may be suitably adjusted such
that the number of the structurally similar proteins being selected
is set to a desirable degree (e.g., within 100 to within 1000 and
preferably from about 50 to 100). For example, Protein Data Bank
(PDB; http://www.rcsb.org/) is available as the database. Examples
of a condition that can be added for the group of proteins retrieved
from the database include the resolution of the protein
conformation (e.g., 3.0 Å or better and preferably 2.5 Å or
better).
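The structural similarity criterion above (root mean square deviation of superimposed α-carbon atoms below a cutoff) can be sketched in Python as follows. This is a minimal illustration that assumes NumPy is available and uses the Kabsch algorithm for the superposition, which is a common choice but is not stated in the text. The function names and the toy coordinates (an idealized helical trace, not real G-CSF coordinates) are hypothetical.

```python
import numpy as np


def superposed_rmsd(p: np.ndarray, q: np.ndarray) -> float:
    """RMSD between two (N, 3) coordinate sets after optimal superposition."""
    p_c = p - p.mean(axis=0)
    q_c = q - q.mean(axis=0)
    # Kabsch algorithm: optimal rotation from the SVD of the covariance matrix.
    u, _, vt = np.linalg.svd(p_c.T @ q_c)
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0  # avoid reflections
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    diff = p_c @ rot.T - q_c
    return float(np.sqrt((diff ** 2).sum() / len(p)))


def is_structurally_similar(p, q, cutoff: float = 1.5) -> bool:
    """Apply the cutoff from the text (1.5 by default, in the same unit as the input)."""
    return superposed_rmsd(np.asarray(p, float), np.asarray(q, float)) < cutoff


# Toy data: 8 points on an idealized helix (radius 2.3, rise 1.5, 100 degrees per step).
angles = np.deg2rad(100.0) * np.arange(8)
ref = np.column_stack([2.3 * np.cos(angles), 2.3 * np.sin(angles), 1.5 * np.arange(8)])

# The same points after a rigid rotation about z and a translation.
t = np.deg2rad(40.0)
rz = np.array([[np.cos(t), -np.sin(t), 0.0], [np.sin(t), np.cos(t), 0.0], [0.0, 0.0, 1.0]])
moved = ref @ rz.T + np.array([5.0, -3.0, 2.0])

print(is_structurally_similar(ref, moved))      # rigidly moved copy
print(is_structurally_similar(ref, 3.0 * ref))  # same points scaled threefold
```

In an actual screen, the two coordinate sets would be the eight α-carbon positions (four per helix terminus) of G-CSF and of each candidate helix-loop-helix fragment, and a tighter cutoff of 1.0 could be passed for the preferred criterion.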
[0098] Like in the case of G-CSF, the database may be screened for
proteins with secondary structures similar to those of N-terminal
residues and C-terminal residues of a secondary structure-forming
region of any protein other than G-CSF.
(To determine the Length of Loop (Step (iii)))
[0099] Next, the number of amino acid residues in the loop of each
protein selected as having high structural homology to G-CSF was
analyzed. As a result, it was observed that the number of amino
acid residues of the loop was mostly 2, 5, or 9
(22% of all the loops had 2 amino acid residues; 26% of all had 5
amino acid residues; and 16% of all had 9 amino acid residues). In
view of the above, it was predicted that a modified cyclized
molecule with the smallest size could be constructed using a
procedure in which the nonstructural regions at the N-terminus and
the C-terminus of G-CSF had been removed and two amino acids were
added to the resulting G-CSF. Further, it was also predicted that a
stable modified cyclized molecule could be constructed by adding
five or nine amino acids.
[0100] The amino acid length of a loop structure through which the
N-terminus and the C-terminus of a secondary structure-forming
portion of a protein other than G-CSF are to be connected can be
determined such that
the amino acid length corresponds to the number of amino acids used
to connect corresponding secondary structure portions of another
protein having similar secondary structure portions as obtained as
a result of the screening. Specifically, the screening results in
the number of amino acids used to connect, to each other, secondary
structure portions of each of analogous proteins having high
structural homology. Among the numbers, the frequently occurring
amino acid length is determined to be the amino acid length (loop
length) of a loop structure through which the N-terminus and the
C-terminus of a secondary structure-forming portion of a protein
subjected to cyclization are to be connected. As the criterion for a
frequently occurring length, an amino acid length may be selected
that connects the secondary structure portions to each other in 10%
or more, and preferably 15% or more, of the proteins screened
out. When there are several frequently occurring amino acid
lengths, one of the amino acid lengths may be selected. To
construct a modified cyclized molecule with a minimum size, the
amino acid length of the loop structure is preferably as short as
possible. However, a shorter one does not necessarily mean that it
will be a better one. From the viewpoint of stability and
cyclization efficiency, it is desirable to select a suitable
length. This step makes it possible to determine an optimal loop
length so as to connect the N-terminus and the C-terminus of a
secondary structure-forming portion of a protein without changing
its structure.
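The selection of frequently occurring loop lengths described above can be sketched as a simple frequency count. In the following Python sketch, the function name is hypothetical and the list of loop lengths is invented: only the shares of lengths 2, 5, and 9 (22%, 26%, and 16%) mirror the values reported above, while the remaining lengths are filler.

```python
from collections import Counter


def frequent_loop_lengths(loop_lengths: list[int], min_fraction: float = 0.10) -> list[int]:
    """Return loop lengths whose share among the screened proteins is at least
    min_fraction, shortest first."""
    counts = Counter(loop_lengths)
    total = len(loop_lengths)
    return sorted(n for n, c in counts.items() if c / total >= min_fraction)


# Invented screening result for 100 proteins.
observed = ([2] * 22 + [5] * 26 + [9] * 16
            + [3] * 6 + [4] * 6 + [6] * 6 + [7] * 6 + [8] * 6 + [10] * 6)

print(frequent_loop_lengths(observed))        # [2, 5, 9]
print(frequent_loop_lengths(observed, 0.15))  # [2, 5, 9]
print(frequent_loop_lengths(observed, 0.25))  # [5]
```

Returning the candidates shortest first reflects the stated preference for a minimal loop, while leaving the final choice open, since a shorter loop is not necessarily the better one.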
(To Design Linear Mutant Protein and Make It Cyclic (Steps (iv) and
(v)))
[0101] Based on the above prediction, N-terminal 11 amino acid
residues and C-terminal 3 amino acid residues (i.e., secondary
structure-free regions) were deleted from the amino acid sequence
(SEQ ID NO: 1) of the linear control G-CSF. In addition, serine and
glycine (total of 2 amino acid residues) were added to the
N-terminus and the C-terminus, respectively, to design a sequence
(SEQ ID NO: 5). Next, split inteins were used to produce
G-CSF(C163) with a loop length of 2 amino acids. G-CSF(C163) had
G-CSF receptor-binding activity comparable to that of the linear
control G-CSF and exhibited somewhat poorer activity on cells,
thermostability, protease resistance, and metabolic half-life than
G-CSF(C177), but exhibited high cyclization efficiency (100%) (see
Table 1).
[0102] From these results, it can be predicted that any modified
cyclized molecule having a longer amino acid sequence than
G-CSF(C163) and a shorter amino acid sequence than G-CSF(C177) has
substantially the same preferable properties as of G-CSF(C177) and
G-CSF(C163). In particular, modified cyclized molecules in which
the loop length has been adjusted to 9 or 5 amino acids are highly
likely to have improved cyclization reaction efficiency and
enhanced stability while maintaining the structure of the linear
control G-CSF.
[0103] When kinds of amino acids constituting the loop are
determined, an available sequence is limited depending on a
technique used for cyclization. For instance, when split inteins
are used, it has already been shown that cysteine or serine at
either the N-terminus or the C-terminus of a target protein is
essential and proline at the C-terminus markedly decreases reaction
efficiency (Reference Document 6). Hence, an amino acid residue
added to form a loop after cyclization (or a terminal residue of an
amino acid sequence) should be cysteine or serine at either the
N-terminus or the C-terminus, and the C-terminal residue should not
be proline.
(To Improve Loops)
[0104] Based on the above prediction, modified molecules were
produced in which the loop length was adjusted to nine or five
amino acids. Seven or three amino acids, except for serine and
glycine at the N-terminus and the C-terminus, respectively, which
are amino acid residues required for cyclization using split
inteins, were designed so as to maintain the amino acid sequence of
the linear control G-CSF as much as possible. Specifically,
N-terminal 4 amino acid residues and C-terminal 3 amino acid
residues of the amino acid sequence (SEQ ID NO: 1) of the linear
control G-CSF were deleted; serine was added to the N-terminus and
glycine was added to the C-terminus to give an engineered sequence
(SEQ ID NO: 3); and split inteins were used to produce G-CSF(C170)
with a loop length of 9 amino acids. Similarly, N-terminal 8 amino
acid residues and C-terminal 3 amino acid residues of the amino
acid sequence (SEQ ID NO: 1) of the linear control G-CSF were
deleted; serine was added to the N-terminus and glycine was added
to the C-terminus to give an engineered sequence (SEQ ID NO: 4);
and split inteins were used to produce G-CSF(C166) with a loop
length of 5 amino acids.
[0105] Both G-CSF(C170) and G-CSF(C166) had G-CSF receptor-binding
activity comparable to that of the linear control G-CSF and had
higher thermostability than G-CSF(C177). In addition, the
cyclization efficiency was equal to or somewhat lower than that of
G-CSF(C163), but was markedly higher than that of G-CSF(C177) (see
Table 3). In particular, the thermostability of G-CSF(C166) was
markedly higher than that of the other modified molecules. Accordingly, its
loop length is considered to be optimal for stabilization of the
structure of G-CSF.
[0106] From these results, it can be predicted that any modified
cyclized molecules having an amino acid sequence longer than
G-CSF(C163) and shorter than G-CSF(C177) have likewise preferable
properties, the molecules being modified G-CSF cyclized molecules
each having an amino acid sequence in which 0 to 11, and preferably
1 to 11, amino acid residues are deleted from the N-terminus and 0
to 3, and preferably 1 to 3, amino acid residues are deleted from
the C-terminus of the amino acid sequence set forth in SEQ ID NO:
1; a serine or cysteine residue is added to the N-terminus and/or
the C-terminus after the deletion; and an amino acid residue other
than proline is added to the C-terminus.
[0107] For proteins other than G-CSF, a loop structure having a
determined amino acid length is used to connect the N-terminus and
the C-terminus of a secondary structure-forming portion of a
protein of interest to obtain a cyclized mutant protein. For this
purpose, a linear mutant protein that can be made cyclic may be
designed and produced and then the N-terminus and the C-terminus of
the linear mutant protein may be subjected to cyclization and be
linked to obtain a cyclized mutant protein.
[0108] The linear mutant protein is designed such that a portion of
the loop structure-forming amino acid sequence is added to the
N-terminus and the rest is added to the C-terminus of the amino
acid sequence of a secondary structure-forming portion of an
original protein. The total of the number of amino acid residues
added to the N-terminus and the number of amino acid residues added
to the C-terminus should be the amino acid length of the loop
structure determined above. Preferable design is such that at least
one amino acid residue is added to each of the N-terminus and the
C-terminus of a secondary structure-forming portion of an original
protein.
[0109] Here, the loop structure-forming amino acid sequence is
preferably designed such that the amino acid sequence of the
secondary structure-free region removed, on the basis of the design
concept, in step (i) is replaced by the necessary number of amino
acids. That is, it is preferable that the amino acid sequence of
the loop structure portion is designed so as to maintain the amino
acid sequence of the original protein as much as possible. From the
viewpoint of using, as a pharmaceutical composition, the resulting
cyclized mutant protein, it is desirable to design so as to
maintain the same conformation as that of the original protein as
much as possible.
[0110] In this regard, however, at least the amino acid residues of
the linear mutant protein nearest the N-terminal and nearest the
C-terminal should be amino acid residues necessary for cyclization.
As described above, when split inteins are used for cyclization,
either the nearest N-terminal residue or the nearest C-terminal
residue is cysteine or serine, and the nearest C-terminal residue
is an amino acid residue other than proline.
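The terminal-residue conditions stated above for cyclization with split inteins can be expressed as a short check on a one-letter amino acid sequence. The function name and the toy sequences below are illustrative only.

```python
def satisfies_split_intein_rule(sequence: str) -> bool:
    """Check the terminal-residue conditions stated for split intein cyclization:
    Cys or Ser at either terminus, and no Pro at the C-terminus."""
    if not sequence:
        return False
    n_term, c_term = sequence[0], sequence[-1]
    has_cys_or_ser = n_term in "CS" or c_term in "CS"
    return has_cys_or_ser and c_term != "P"


# Toy sequences; only the terminal residues matter for this check.
print(satisfies_split_intein_rule("SAAAG"))  # True: Ser at N-terminus, Gly at C-terminus
print(satisfies_split_intein_rule("AAAAG"))  # False: no Cys/Ser at either terminus
print(satisfies_split_intein_rule("SAAAP"))  # False: Pro at C-terminus
```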
[0111] Hence, in this case, the design may be such that the amino
acid length of the remaining secondary structure-free region of the
original protein plus the number of amino acid residues necessary
for cyclization (when split inteins are used, the total number of
amino acid residues nearest the N-terminal side and nearest the
C-terminal side is two) equals the amino acid length of the loop
structure determined above.
[0112] In other words, the linear mutant protein or cyclized mutant
protein as obtained by the above production method can be said to
have an amino acid sequence in which a predetermined number of
amino acid residues is deleted from a portion of the secondary
structure-free region at each of the N-terminus and the C-terminus
of the original protein; after the deletion, amino acid residues
necessary for cyclization are added to each of the N-terminus and
the C-terminus. As a result, the conformation of the original
protein can remain the same as much as possible and an increase in
the cyclization efficiency can be achieved while the number of
amino acids added is minimal.
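The design rule in the preceding paragraphs (retained secondary structure-free residues plus the residues needed for cyclization equal the chosen loop length) can be sketched as a slicing operation. In the following Python sketch, the function and parameter names are hypothetical, and the 175-residue input is a placeholder string, not the actual SEQ ID NO: 1. The boundaries 12 and 172 correspond to the secondary structure-forming portion of the linear control G-CSF as determined in the text.

```python
def design_linear_mutant(original: str, core_first: int, core_last: int,
                         loop_length: int, n_add: str = "S", c_add: str = "G",
                         keep_c: int = 0) -> str:
    """Build a linear precursor: N-terminal addition + retained flank(s) + core
    + C-terminal addition.

    core_first / core_last: 1-based bounds of the secondary structure-forming portion.
    loop_length: total residues connecting the core termini after cyclization.
    keep_c: secondary structure-free residues retained after the core; the
            remaining retained residues are taken from before the core.
    """
    retained = loop_length - len(n_add) - len(c_add)
    keep_n = retained - keep_c
    n_free = core_first - 1
    c_free = len(original) - core_last
    if keep_n < 0 or keep_c < 0 or keep_n > n_free or keep_c > c_free:
        raise ValueError("loop length not reachable with the available residues")
    start = core_first - 1 - keep_n  # 0-based slice start
    end = core_last + keep_c         # 0-based slice end (exclusive)
    return n_add + original[start:end] + c_add


# Placeholder 175-residue string; NOT the actual SEQ ID NO: 1.
placeholder = "M" + "A" * 174
for loop in (2, 5, 9):
    print(loop, len(design_linear_mutant(placeholder, 12, 172, loop)))
```

With loop lengths of 2, 5, and 9, the resulting chain lengths are 163, 166, and 170, matching G-CSF(C163), G-CSF(C166), and G-CSF(C170), in which all retained residues come from the N-terminal side.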
(About Other Cyclization Techniques)
[0113] When cyclization techniques other than the technique using
inteins are used, loop sequence design conditions are different.
For instance, when a disulfide bond is used for cyclization,
cysteine is added to each of the N-terminus and the C-terminus. In
addition, when the sortase-mediated formation of an isopeptide bond
is used for cyclization, it is essential to design a loop with an
amino acid sequence suitable for the substrate specificity of a
sortase used. Specifically, when Staphylococcus aureus-derived
sortase A is used, a loop region should have a plurality of Gly
residues at the N-terminus and a sequence containing Leu, Pro, Xaa
(Xaa is any amino acid), Thr, and Gly at the C-terminus.
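The sequence conditions for these alternative cyclization techniques can likewise be written as simple checks. The sketch below rests on two assumptions not fixed by the text: "a plurality of Gly residues" is taken as at least two consecutive glycines at the very N-terminus, and the Leu-Pro-Xaa-Thr-Gly sequence is taken to sit at the very end of the chain. The function names and toy sequences are hypothetical.

```python
import re


def fits_disulfide_cyclization(sequence: str) -> bool:
    """Cysteine at both the N-terminus and the C-terminus."""
    return len(sequence) >= 2 and sequence[0] == "C" and sequence[-1] == "C"


def fits_sortase_a_cyclization(sequence: str, min_gly: int = 2) -> bool:
    """Gly run at the N-terminus and Leu-Pro-Xaa-Thr-Gly at the C-terminus.

    min_gly and the end-anchored motif are assumptions made for this sketch.
    """
    has_gly_run = sequence.startswith("G" * min_gly)
    has_motif = re.search(r"LP[A-Z]TG$", sequence) is not None
    return has_gly_run and has_motif


print(fits_disulfide_cyclization("CAAAC"))           # True
print(fits_sortase_a_cyclization("GGAAALPETG"))      # True
print(fits_sortase_a_cyclization("GAAALPETG"))       # False: single N-terminal Gly
```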
(To Produce Improved Protein)
(1) To Produce Improved Protein by Using Genetic Engineering
Technique.
[0114] a. Gene Encoding Improved Protein
[0115] In the present invention, a genetic engineering method can
be used to produce the improved proteins designed above.
[0116] Specific examples of the gene used, in such a method, for
each improved G-CSF include nucleic acids encoding the amino acid
sequences set forth in SEQ ID NOs: 2, 3, 4, and 5 as described
above. The nucleic acids are not limited to the above as long as
the nucleic acids each encode a protein having substantially the
same G-CSF receptor-binding activity as of each improved G-CSF.
[0117] Examples may include: nucleic acids encoding proteins each
having G-CSF receptor-binding activity and having an amino acid
sequence in which one or several amino acid residues are deleted
from, replaced in, or added to the amino acid sequence set forth in
SEQ ID NO: 2, 3, 4, or 5 except that the amino acid residues added
to the N-terminus and the C-terminus so as to perform the
above-described cyclization reaction remain intact; and nucleic
acids encoding proteins each having said amino acid sequence in
which the residue at either the N-terminus or C-terminus is changed
to cysteine or serine and the residue at the C-terminus is changed
to an amino acid other than proline.
[0118] Examples of such genes include nucleic acids, each
consisting of the nucleotide sequence set forth in SEQ ID NO: 17,
18, 19, or 20.
[0119] Also, examples of the gene used in the present invention
include a gene, the nucleic acid of which is hybridized, under
stringent conditions, with a sequence complementary to the sequence
of a nucleic acid encoding the amino acid sequence set forth in SEQ
ID NO: 2, 3, 4, or 5 except for a portion encoding the amino acid
residues added at the N-terminus and the C-terminus so as to
perform the above-described cyclization reaction, and in which a
nucleic acid encoding cysteine or serine is added to either the
N-terminus or the C-terminus of a nucleic acid encoding a protein
having G-CSF receptor-binding activity and a nucleic acid encoding
an amino acid other than proline is added to the C-terminus. As
used herein, the term "stringent conditions" refers to conditions
under which a specific hybrid is formed and no non-specific hybrid
is formed. For instance, the stringent conditions refer to
conditions under which a nucleic acid having high homology (the
homology is 60% or higher and preferably 80% or higher) is
hybridized therewith. More specifically, the concentration of
sodium is from 150 to 900 mM and preferably from 600 to 900 mM and
the temperature is from 60 to 68.degree. C. and preferably
65.degree. C. in that condition. For example, the hybridization is
performed at 65.degree. C. and the washing is performed in 0.1%
SDS-containing 0.1.times.SSC at 65.degree. C. for 10 min. In such
conditions, when the hybridization is demonstrated by a
conventional technique (e.g., Southern blotting, dot-blot
hybridization), it can be said that the hybridization under the
stringent condition occurs.
[0120] In addition to the above, a gene used for the cyclization
reaction consists of a nucleic acid encoding a protein having an
amino acid sequence (SEQ ID NO: 10, 11, 12, or 13) in which 2 amino
acid sequences set forth in SEQ ID NOs: 15 and 16 are linked to the
N-terminus and the C-terminus of the amino acid sequence set forth
in SEQ ID NO: 2, 3, 4, or 5, or having an amino acid sequence in
which the above-described modification is added to the amino acid
sequence set forth in SEQ ID NO: 2, 3, 4, or 5 or an amino acid
sequence encoded by a nucleic acid as obtained by adding the
above-described modification to a nucleic acid hybridized, under
stringent conditions, with a nucleic acid encoding the amino acid
sequence set forth in SEQ ID NO: 2, 3, 4, or 5; and having an amino
acid sequence in which 2 amino acid sequences set forth in SEQ ID
NOs: 15 and 16 are linked to the N-terminus and the C-terminus of
the amino acid sequence of a protein having G-CSF receptor-binding
activity and having a function that makes the amino acid sequence
sandwiched between SEQ ID NOs: 15 and 16 cyclic as a result of an
autocatalytic process.
[0121] Examples of such genes include nucleic acids, each
consisting of the nucleotide sequence set forth in SEQ ID NO: 21,
22, 23, or 24.
b. To Produce Genes, Recombinant Vectors, and Transformants
[0122] Genes according to the present invention can be synthesized
using chemical synthesis, a PCR method, a cassette mutation method,
site-specific mutagenesis, etc. For example, a plurality of
oligonucleotides having up to about 100-bp nucleotides and a
complementary region containing about 20 nucleotides at the
terminus are chemically synthesized. Then, they may be used in
combination and in an overlap elongation process (Reference
Document 7) to perform total synthesis of a gene of interest.
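As a rough illustration of the oligonucleotide layout described above, the following sketch tiles a gene into fragments of at most 100 nucleotides that overlap their neighbors by 20 nucleotides. Alternating the strand of successive oligonucleotides is an assumption typical of overlap-based assembly rather than something stated in the text, and the function names and the test sequence are hypothetical.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def design_overlap_oligos(gene: str, max_len: int = 100, overlap: int = 20) -> list[str]:
    """Tile `gene` with oligos of up to `max_len` nt sharing `overlap` nt.

    Even-numbered oligos are sense strand; odd-numbered oligos are written as
    the reverse complement (assumed convention) so neighbors can anneal.
    """
    if not 0 < overlap < max_len:
        raise ValueError("overlap must be positive and shorter than max_len")
    step = max_len - overlap
    oligos = []
    start = 0
    index = 0
    while True:
        fragment = gene[start:start + max_len]
        oligos.append(fragment if index % 2 == 0 else reverse_complement(fragment))
        if start + max_len >= len(gene):
            break
        start += step
        index += 1
    return oligos


# Hypothetical 300-nt sequence, for illustration only.
demo_gene = "ACGT" * 75
demo_oligos = design_overlap_oligos(demo_gene)
print(len(demo_oligos), [len(o) for o in demo_oligos])
```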
[0123] A recombinant vector according to the present invention can
be constructed by ligating (inserting) a gene containing the
above-described nucleotide sequence into a suitable vector. A
vector used in the present invention has no particular limitation
as long as the vector can replicate in a host or a gene of interest
can be integrated into a host genome by means of the vector.
Examples include bacteriophages, plasmids, cosmids, and
phagemids.
[0124] Examples of the plasmid DNA include Actinomyces-derived
plasmids (e.g., pK4, pRK401, pRF31), E. coli-derived plasmids (e.g.
pBR322, pBR325, pUC118, pUC119, pUC18), Bacillus subtilis-derived
plasmids (e.g., pUB110, pTP5), and yeast-derived plasmids (e.g.,
YEp13, YEp24, YCp50). Examples of the phage DNA include .lamda.
phages (e.g., .lamda.gt10, .lamda.gt11, .lamda.ZAP). Further,
animal viruses such as a retrovirus or a vaccinia virus or insect
virus vectors such as a baculovirus can be used.
[0125] In order to insert a gene into a vector, for example, a
method is adopted in which purified DNA is first digested by a
suitable restriction enzyme(s), is inserted into a restriction
enzyme site(s) or a multi-cloning site of a suitable vector DNA,
and is ligated to the vector. The gene should be cloned in the
vector so as to express an improved protein according to the
present invention. Here, a promoter and a gene nucleotide sequence
as well as, as desired, a cis-element (e.g., an enhancer), a
splicing signal, a poly-A tail signal, a selection marker, a
ribosome binding sequence (SD sequence), a start codon, and/or a
stop codon may be ligated to the vector according to the present
invention. In addition, it is possible to add a tag sequence to
easily purify a protein produced. Examples of the tag sequence that
can be used include nucleotide sequences encoding known tags such
as a His tag, a GST tag, and an MBP tag.
[0126] Whether or not the gene is inserted into the vector can be
examined by using a known genetic engineering technology. For
instance, in the case of a plasmid vector, etc., the insertion can be
examined by a procedure in which a competent cell is used for vector
subcloning, DNA is extracted, and a DNA sequencer is then used to
check its nucleotide sequence. A similar procedure is applicable to other
vectors as long as bacteria or other hosts can be used for
subcloning. In addition, a vector is effectively selected by using
a selection marker such as a drug resistance gene.
[0127] Transformants may be obtained by introducing a recombinant
vector according to the present invention into a host cell so as to
enable expression of an improved protein according to the present
invention. A host used for transformation is not particularly
limited if the host can express a protein or a polypeptide.
Examples include bacteria (e.g., Escherichia coli, Bacillus
subtilis), yeast, plant cells, animal cells (e.g., COS cells, CHO
cells), and insect cells.
[0128] When a bacterium is used as a host, a recombinant vector
should be able to self-replicate in the bacterium, and it is
preferable that the vector includes a promoter, a ribosome binding
sequence, a start codon, a nucleic acid encoding an improved
protein according to the present invention, and a transcription
termination sequence. Examples of the Escherichia coli include
Escherichia coli DH5.alpha.. Examples of the Bacillus include
Bacillus subtilis. A method for introducing a recombinant vector
into a bacterium is not particularly limited if the method can
introduce a DNA into a bacterium. Examples of the method include a
method using a calcium ion and electroporation.
[0129] When the yeast is used as a host, for example, Saccharomyces
cerevisiae, Schizosaccharomyces pombe, or another yeast may be
employed. A method for introducing a recombinant vector into yeast
is not particularly limited if the method can introduce a DNA into
yeast. Examples of the method include electroporation, a
spheroplast method, and a lithium acetate method.
[0130] When the animal cell is used as a host, a monkey cell COS-7,
a Vero cell, a Chinese hamster ovary cell (CHO cell), a mouse L
cell, a rat GH3, a human FL cell, or the like may be employed.
Examples of a method for introducing a recombinant vector into an
animal cell include electroporation, a calcium phosphate method,
and lipofection.
[0131] When the insect cell is used as a host, Sf9 or another
insect cell may be employed. Examples of a method for introducing a
recombinant vector into an insect cell include a calcium phosphate
method, lipofection, and electroporation.
[0132] Whether or not a gene is integrated into a host can be
examined by a PCR method, Southern hybridization, Northern
hybridization, or the like. For example, a DNA is prepared from a
transformant, and DNA-specific primers are designed to carry out a
PCR. Then, the resulting PCR amplification product is subjected to
agarose gel electrophoresis, polyacrylamide gel electrophoresis,
capillary electrophoresis, or the like. Subsequently, the
amplification product is stained with, for example, ethidium
bromide or a SYBR Green solution and is detected as a single band.
This can verify that a cell has been transformed. In addition, a
primer labeled in advance with, for example, a fluorescent dye can
be used to carry out a PCR to detect its amplification product.
c. To Obtain Improved Protein by Culturing Transformant
[0133] When produced as a recombinant protein, an improved protein
according to the present invention may be obtained by culturing the
above-described transformant and by collecting the protein from the
culture. The term "culture" means any of culture supernatant,
cultured cells or microorganisms, and lysates of cells or
microorganisms. Transformants according to the present invention
may be cultured by a common procedure used for culturing their
host.
[0134] Either a natural medium or a synthetic medium may be used as
a medium for culturing a transformant obtained by using a
microorganism such as E. coli or yeast as a host if the medium
contains, for example, a carbon source, a nitrogen source, and
inorganic salts, as utilized by the microorganism, to efficiently
culture the transformant. Examples of the carbon source include
carbohydrates (e.g., glucose, fructose, sucrose, starch), organic
acids (e.g., acetic acid, propionic acid), and alcohols (e.g.,
ethanol, propanol). Examples of the nitrogen source include
ammonia, ammonium chloride, ammonium salts of inorganic or organic
acids (e.g., ammonium sulfate, ammonium acetate, ammonium
phosphate), other nitrogen-containing compounds, peptone, meat
extracts, and corn steep liquors. Examples of the inorganic matter
include potassium primary phosphate, potassium secondary phosphate,
magnesium phosphate, magnesium sulfate, sodium chloride, ferrous
sulfate, manganese sulfate, copper sulfate, and calcium carbonate.
The culturing is carried out at 20 to 37.degree. C. for 12 h to 3
days under aerobic conditions such as shaking culture or aerated
and agitated culture.
[0135] When an improved protein according to the present invention
is produced in microbial bodies or cells after culturing, the
protein may be collected, for example, after homogenization (e.g.,
sonication, repeated freeze-thawing) so as to disrupt the microbial
bodies or cells. In addition, when the protein is secreted from
microbial bodies or cells, the resulting culture medium may be used
as it is or is, for example, centrifuged to remove the microbial
bodies or cells. Then, common biochemical processes used for
isolating and purifying a protein are used singly or in
combination, including ammonium sulfate precipitation,
size-exclusion chromatography, ion exchange chromatography, and
affinity chromatography. By doing so, an improved protein according
to the present invention can be isolated and purified from the
culture.
[0136] In addition, it is possible to use what is called a
cell-free synthesis system using a mixture of only protein
biosynthesis reaction-involving factors (e.g., enzymes, nucleic
acids, ATP, amino acids). In this case, an improved protein
according to the present invention can be synthesized from a vector
without using viable cells (Reference Document 8). Then, the
purification processes may be likewise used to isolate and purify
the improved protein according to the present invention from a
post-reaction mixed solution.
[0137] To check whether the purified/isolated improved protein
according to the present invention is a protein consisting of an
amino acid sequence of interest, a sample containing the protein is
analyzed. Analysis methods that may be used include SDS-PAGE, Western
blotting, mass spectrometry, amino acid analysis, and analysis
using an amino acid sequencer (Reference Document 9).
(2) To Produce Improved Protein by Using Another Technique
[0138] An improved protein according to the present invention may
be produced using an organic chemistry technique (e.g., a solid-phase
peptide synthesis process). A process for producing a protein by
using such a technique is well-known in the art and is briefly
described below.
[0139] When a protein is chemically produced using a solid-phase peptide
synthesis process, an automatic synthesizer is preferably used to
synthesize, on a resin, a protected polypeptide having an amino
acid sequence of an improved protein according to the present
invention while repeating a polycondensation reaction of activated
amino acid derivatives. Next, this protected polypeptide is cleaved
from the resin and, at the same time, a side chain protection group
is cleaved. Regarding this cleavage reaction, it has been known
that there is a suitable cocktail prepared in accordance with a
type of resin or protection group and/or an amino acid composition
(Reference Document 10). Subsequently, a crude protein is
transferred from an organic solvent layer to an aqueous layer and a
mutant protein of interest is then purified. Examples of the
purification process that can be used include reverse-phase
chromatography (Reference Document 10).
(Tests for Checking Performance of Improved Protein)
[0140] The following tests for checking performance may be
conducted to select which of the improved proteins as so produced
are good. Here, all of the improved proteins according to the
present invention had good performance.
(1) Test for G-CSF Receptor-binding Activity
[0141] The G-CSF receptor-binding activity of an improved protein
according to the present invention may be checked and evaluated
using Western blotting, immunoprecipitation, pull-down assay, ELISA
(Enzyme-Linked ImmunoSorbent Assay), surface plasmon resonance
(SPR) spectroscopy, etc. Among them, the SPR spectroscopy enables
real-time sequential observation of the interaction between
label-free biomolecules. Accordingly, the SPR spectroscopy makes it
possible to quantitatively evaluate a binding reaction of the
improved protein from the kinetic viewpoint.
(2) Test for Activity of Improved Protein on Cells
[0142] The activity of an improved protein according to the present
invention on cells can be evaluated using, as an index, a
proliferation potential of a G-CSF-dependent culture cell line
(NFS-60). The NFS-60 is a cell line exhibiting the G-CSF-dependent
proliferation potential and a cell line generally used for
biological activity assay for G-CSF including a standard G-CSF
product produced by NIBSC. Hence, the cell line is suitable when
the activity of the improved protein on cells is evaluated.
(3) Test for Thermostability of Improved Protein
[0143] The thermostability of an improved protein according to the
present invention can be evaluated using circular dichroism (CD)
spectroscopy, fluorescence spectroscopy, infrared spectroscopy,
differential scanning calorimetry, residual activity after heating,
etc. Among them, the CD spectroscopy is a spectroscopic analysis
method that precisely reflects a change in the secondary structure
of a protein. Because of this, it is possible to observe a
temperature-dependent change in the conformation of the improved
protein and to quantitatively evaluate the thermodynamical
structural stability thereof.
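One simple way to read a midpoint temperature (Tm) from a CD melting curve is sketched below. It assumes that the first and last data points represent the fully folded and fully unfolded baselines and that linear interpolation between neighboring points is adequate. The function name and the data are hypothetical, and this is not necessarily the analysis used for the values reported in this document.

```python
def estimate_tm(temps_c, signal):
    """Estimate Tm as the temperature where the normalized signal reaches 0.5.

    Assumes signal[0] is the folded baseline and signal[-1] the unfolded one.
    """
    folded, unfolded = signal[0], signal[-1]
    span = unfolded - folded
    if span == 0:
        raise ValueError("flat melting curve")
    frac = [(s - folded) / span for s in signal]  # fraction unfolded
    for i in range(len(frac) - 1):
        f0, f1 = frac[i], frac[i + 1]
        if f0 < 0.5 <= f1:
            t0, t1 = temps_c[i], temps_c[i + 1]
            # Linear interpolation between the two bracketing points.
            return t0 + (0.5 - f0) * (t1 - t0) / (f1 - f0)
    raise ValueError("no midpoint crossing found")


# Hypothetical ellipticity readings (arbitrary units), for illustration only.
demo_temps = [30, 40, 50, 60, 70, 80]
demo_signal = [-20.0, -20.0, -16.0, -4.0, 0.0, 0.0]
demo_tm = estimate_tm(demo_temps, demo_signal)
print(round(demo_tm, 1))
```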
(4) Test for Stability of Improved Protein against Protease
[0144] The stability of an improved protein according to the
present invention against a protease can be evaluated using, for
example, a protocol in which a protease such as carboxypeptidase Y
and the improved protein are mixed; and the amount of a degradation
product caused by the reaction or the amount of a remaining
unreacted molecule is analyzed over time by using SDS-PAGE, liquid
chromatography, etc. Among them, the SDS-PAGE can be used to
analyze a trace amount of a protein in a simple fashion, so that
this method is suitable when the stability of the improved protein
against a protease is evaluated.
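If the amount of remaining intact protein is assumed to decay with first-order kinetics, a half-life can be estimated from such a time course as sketched below. The first-order model, the function name, and the data points are assumptions made for illustration; the document does not state how its half-life values were calculated.

```python
import math


def half_life(times_h, remaining_fraction):
    """Estimate t0.5 by a least-squares fit of ln(fraction) versus time.

    Assumes first-order decay: fraction = exp(-k * t), so t0.5 = ln(2) / k.
    """
    ys = [math.log(f) for f in remaining_fraction]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(ys) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, ys))
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    slope = sxy / sxx  # equals -k for a decaying signal
    return math.log(2) / -slope


# Hypothetical band intensities normalized to time zero.
demo_half_life = half_life([0, 1, 2, 3], [1.0, 0.5, 0.25, 0.125])
print(round(demo_half_life, 3))
```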
(5) Test for In Vivo Stability of Improved Protein
[0145] The in vivo stability of an improved protein according to
the present invention can be evaluated using, for example, a
protocol in which the improved protein is intravenously injected
into a rat; and the concentration of G-CSF present in blood is
analyzed over time by using ELISA, SDS-PAGE, liquid chromatography,
etc. Among them, the ELISA analysis can be used to quantify a trace
amount of G-CSF in a sample containing serum components in a simple
and accurate manner, so that the method is suitable when the in
vivo stability is evaluated.
(6) Conformation Analysis of Improved Protein
[0146] The conformation of an improved protein according to the
present invention can be evaluated using X-ray crystallography,
nuclear magnetic resonance spectroscopy, etc.
[0147] Among them, the X-ray crystallography can be used to analyze
a high-resolution diffraction image and to precisely observe a
local structure including the conformation of terminal regions that
are subjected to cyclization, so that the method is suitable when
the conformation of the improved protein is evaluated.
EXAMPLES
Example 1
To Design Improved G-CSFs
[0148] (1) As used herein, a linear control G-CSF, which is a
starting material for improvement, is defined as follows.
Specifically, the linear control G-CSF is a linear polypeptide
consisting of 175 amino acids set forth in SEQ ID NO: 1. When the
amino acid sequence of the linear control G-CSF is compared with
the amino acid sequence (SEQ ID NO: 14) of filgrastim, threonine at
position 2 and cysteine at position 18 are replaced by alanine and
serine, respectively. A previous report has demonstrated that these
mutations do not affect the activity of G-CSF (Reference Document
3).
[0149] (2) In order to make a polypeptide of interest cyclic by
using the catalytic function of inteins, a residue of at least one
of the N-terminus and the C-terminus should be serine or cysteine.
In addition, if the residue at the C-terminus is proline, it is
known that the reaction fails to proceed. Here, as a mutant
modified so as to make the inteins reaction proceed by introducing
the minimum number of mutations, G-CSF(C177) (SEQ ID NO: 2) was
produced by designing a sequence in which serine and glycine were
added to the N-terminus and the C-terminus of the linear control
G-CSF, respectively.
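The terminal-residue conditions stated above can be expressed as a short check. The sketch below uses a 175-residue placeholder that reproduces only the terminal residues mentioned in this document (methionine at position 1 and proline at position 175); it is not SEQ ID NO: 1, and the function names are hypothetical.

```python
def meets_intein_terminal_rules(seq: str) -> bool:
    """Return True if the termini satisfy the two conditions described above.

    At least one terminal residue must be Ser (S) or Cys (C), and the
    C-terminal residue must not be Pro (P).
    """
    n_res, c_res = seq[0], seq[-1]
    has_ser_or_cys = n_res in "SC" or c_res in "SC"
    return has_ser_or_cys and c_res != "P"


def add_minimal_termini(seq: str) -> str:
    """Add Ser to the N-terminus and Gly to the C-terminus, as for G-CSF(C177)."""
    return "S" + seq + "G"


# Placeholder sequence: only positions 1 (Met) and 175 (Pro) follow the text.
placeholder = "M" + "X" * 173 + "P"
modified = add_minimal_termini(placeholder)
print(len(modified), meets_intein_terminal_rules(placeholder),
      meets_intein_terminal_rules(modified))
```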
[0150] (3) Next, designed were G-CSFs, the terminal sequences of
which were optimized for a cyclization reaction. First, human G-CSF
three-dimensional structural coordinate data (PDB code: 2D9Q-A
chain) was downloaded from the Protein Data Bank (PDB;
http://www.rcsb.org/), an international database for the
three-dimensional structures of proteins. Next, nonstructural
regions that were disadvantageous for the cyclization reaction were
selected and removed from amino acids positioned at the N-terminus
and the C-terminus of G-CSF. According to STRIDE, which is a
secondary structure-determining algorithm provided at the PDB, an
N-terminal region from glutamine at position 11 to serine at
position 37 (corresponding to a region from position 12 to position
38 of the linear control G-CSF) forms a helix (helix A). Meanwhile,
a C-terminal region from alanine at position 143 to leucine at
position 171 (corresponding to a region from position 144 to
position 172 of the linear control G-CSF) forms a helix structure
(helix D) (FIG. 1). Then, 11 amino acids from methionine at
position 1 to proline at position 11 and 3 amino acids from alanine
at position 173 to proline at position 175 were deleted from the
linear control G-CSF to give a sequence as a backbone suitable for
cyclization.
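The deletion step can be sketched as a simple slice. The placeholder below reproduces only the residues named in this document (Met 1, Pro 11, Gln 12, Leu 172, Ala 173, and Pro 175 of the linear control G-CSF); every other position is filled with X, so it is not SEQ ID NO: 1. The function name is hypothetical.

```python
def truncate_termini(seq: str, n_delete: int = 11, c_delete: int = 3) -> str:
    """Remove `n_delete` N-terminal and `c_delete` C-terminal residues."""
    return seq[n_delete:len(seq) - c_delete]


# Placeholder with the residues named in the text at positions
# 1, 11, 12, 172, 173, and 175 (1-based); all other positions are X.
placeholder = "M" + "X" * 9 + "P" + "Q" + "X" * 159 + "L" + "A" + "X" + "P"
backbone = truncate_termini(placeholder)
print(len(placeholder), len(backbone), backbone[0], backbone[-1])
```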
[0151] Subsequently, the optimal length of a loop used to connect
the N-terminus and the C-terminus was determined using the
following procedure.
[0152] High-resolution three-dimensional structures having a
resolution of at least 2.5 .ANG. and registered at the PDB were
collected. Among these structures, loops sandwiched between two
helices were retrieved. From these loops, 57 loop structures were
selected in which the 4 amino acids at the terminus of each flanking
helix (8 amino acids in total) are structurally similar (i.e., the
root mean square deviation of the .alpha.-carbons is less than 1
.ANG.) to the 4 amino acids at each terminus of the backbone (8 amino
acids in total). The numbers of amino acids in these loop structures
were used as references for comparison. This comparison showed that
13 of the loop structures (22% of the total) consisted of 2 amino
acids. Accordingly, it was determined that two amino acids were
sufficient for the loop connecting the N-terminus and the
C-terminus. An amino acid sequence containing serine and
glycine as the loop sequence was added to give G-CSF(C163) (SEQ ID
NO: 5), the smallest improved G-CSF suitable for the cyclization
reaction.
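The loop-length tally and the resulting chain length can be checked with the short sketch below. Only the counts given above (13 two-residue loops among 57 hits) come from this document; the lengths of the other 44 loops are not reported and are therefore recorded as None. The function name is hypothetical.

```python
from collections import Counter


def loop_length_share(loop_lengths, length):
    """Fraction of retrieved loops that have exactly `length` residues."""
    counts = Counter(loop_lengths)
    return counts[length] / len(loop_lengths)


# 13 two-residue loops among 57 hits; the other lengths are unreported.
hits = [2] * 13 + [None] * 44
share = loop_length_share(hits, 2)

backbone_length = 175 - 11 - 3          # residues left after the terminal deletions
cyclized_length = backbone_length + 2   # plus the two-residue Ser/Gly loop

print(f"{share:.3f}", backbone_length, cyclized_length)
```

The computed share, 13/57, is consistent with the roughly 22% quoted above, and the resulting length of 163 residues matches the name G-CSF(C163).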
[0153] On one hand, minimization of the loop region increases
cyclization reaction efficiency. On the other hand, the
minimization may cause distortion of the three-dimensional
structure, thereby decreasing the stability. From this, it can be
predicted that the number of amino acids that maximizes the
combined benefit of increased cyclization reaction efficiency and
structural stabilization lies between those of G-CSF(C163) and
G-CSF(C177). Specifically,
G-CSF(C170) and G-CSF(C166) set forth in SEQ ID NOs: 3 and 4,
respectively, are exemplified as the candidates.
Example 2
To Synthesize Linear Control G-CSF-Expression Plasmid
[0154] (1) A chemically synthesized nucleic acid sequence (SEQ ID
NO: 25) encoding the amino acid sequence (SEQ ID NO: 1) of the
linear control G-CSF was purchased.
[0155] (2) Using the chemically synthesized nucleic acid as a
template, PCR amplification was carried out using two primers (SEQ
ID NOs: 26 and 27). The resulting PCR product was purified and
digested by restriction enzymes NcoI and BamHI. Next, an E. coli
vector pET16b was digested by restriction enzymes NcoI and BamHI
and a band at or near 5600 bp was excised and purified. The band
was dephosphorylated using E. coli alkaline phosphatase. The two
samples purified were ligated using T7 DNA ligase.
[0156] (3) E. coli DH5.alpha. was transformed with the resulting ligation
product of (2) and was cultured on an LB plate medium containing
100 .mu.g/ml of ampicillin. The resulting transformants were
subjected to colony PCR and DNA sequencing (GE Healthcare
Bioscience, BigDye Terminator v3.1) and were then selected. After
that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a
linear control G-CSF-expression plasmid.
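Before a restriction-ligation cloning step of this kind, it is common to confirm that the amplified insert carries the intended sites only at its ends. The sketch below scans a sequence for the recognition sites of the enzymes named in these Examples. The recognition sequences are the standard ones for these enzymes; the function name and the demonstration insert are hypothetical and are not SEQ ID NO: 25. Because all five sites are palindromic, scanning one strand is sufficient.

```python
# Standard recognition sequences of the enzymes named in the Examples.
SITES = {
    "NcoI": "CCATGG",
    "BamHI": "GGATCC",
    "XhoI": "CTCGAG",
    "NheI": "GCTAGC",
    "NdeI": "CATATG",
}


def find_sites(dna: str, enzymes):
    """Return {enzyme: [0-based start positions]} for each requested enzyme."""
    dna = dna.upper()
    hits = {}
    for name in enzymes:
        site = SITES[name]
        positions = []
        start = dna.find(site)
        while start != -1:
            positions.append(start)
            start = dna.find(site, start + 1)
        hits[name] = positions
    return hits


# Hypothetical insert flanked by NcoI and BamHI sites.
demo_insert = "CCATGG" + "ACGTACGTAA" + "GGATCC"
demo_hits = find_sites(demo_insert, ["NcoI", "BamHI", "XhoI"])
print(demo_hits)
```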
Example 3
To Construct Inteins Vector
[0157] (1) A synthetic gene was purchased having a nucleotide
sequence (SEQ ID NO: 28) encoding the inteins sequence (a sequence
containing, in sequence, SEQ ID NO: 15 and SEQ ID NO: 16) derived
from Nostoc punctiforme.
[0158] (2) The synthetic gene of (1) was mixed with restriction
enzymes NcoI and XhoI and was subjected to a cleavage reaction at
37.degree. C. for 4 h. After gel electrophoresis, a band at or near
450 bp was excised and purified. Next, an E. coli vector pET16b was
digested by NcoI and XhoI and a band at or near 5600 bp was excised
and purified. The band was dephosphorylated using E. coli alkaline
phosphatase. The two samples purified were ligated using T7 DNA
ligase.
[0159] (3) E. coli DH5.alpha. was transformed with the resulting ligation
product and was cultured on an LB plate medium containing 100
.mu.g/ml of ampicillin. The resulting transformants were subjected
to colony PCR and DNA sequencing (GE Healthcare Bioscience, BigDye
Terminator v3.1) and were then selected. After that, a QIAprep Spin
Miniprep kit (Qiagen) was used to extract an inteins-expression
plasmid.
Example 4
To Synthesize G-CSF(C177)-Expression Plasmid
[0160] A nucleic acid (SEQ ID NO: 17) encoding the amino acid
sequence of G-CSF(C177) was synthesized by PCR reaction using the
linear control G-CSF-expression plasmid as a template and two
single strand DNAs (SEQ ID NO: 29 and SEQ ID NO: 30) as primers.
The resulting product was digested by restriction enzymes NheI and
NdeI and was then purified. Next, the inteins-expression plasmid
was digested by restriction enzymes NheI and NdeI and was then
purified. The two samples purified were ligated using T7 DNA
ligase.
[0161] E. coli DH5.alpha. was transformed with the resulting
ligation product and was cultured on an LB plate medium containing
100 .mu.g/ml of ampicillin. The resulting transformants were
subjected to colony PCR and DNA sequencing (GE Healthcare
Bioscience, BigDye Terminator v3.1) and were then selected. After
that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a
G-CSF(C177)-expression plasmid.
Example 5
To Synthesize G-CSF(C163)-Expression Plasmid
[0162] A nucleic acid (SEQ ID NO: 20) encoding the amino acid
sequence of G-CSF(C163) was synthesized by PCR reaction using the
linear control G-CSF-expression plasmid as a template and two
single strand DNAs (SEQ ID NO: 35 and SEQ ID NO: 36) as primers.
The resulting product was digested by restriction enzymes NheI and
NdeI and was then purified. Next, the inteins-expression plasmid
was digested by restriction enzymes NheI and NdeI and was then
purified. The two samples purified were ligated using T7 DNA
ligase.
[0163] E. coli DH5.alpha. was transformed with the resulting
ligation product and was cultured on an LB plate medium containing
100 .mu.g/ml of ampicillin. The resulting transformants were
subjected to colony PCR and DNA sequencing (GE Healthcare
Bioscience, BigDye Terminator v3.1) and were then selected. After
that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a
mutant G-CSF(C163)-expression plasmid.
Example 6
To Produce Linear Control G-CSF, and G-CSF(C177) and
G-CSF(C163)
[0164] (1) E. coli BL21(DE3) (Novagen), which was used for
expression, was transformed with the linear control
G-CSF-expression plasmid. The resulting transformant was
pre-cultured and was inoculated in an LB medium at 1 ml/200 ml and
the mixture was cultured under shaking until O.D..sub.600 = 0.8 to
1.0. To express the linear control G-CSF, IPTG (0.4 mM) was added
and the mixture was further cultured under shaking at 37.degree. C.
for 4 h.
[0165] (2) The bacteria collected were suspended in 10 ml of a
lysis buffer. The lysis buffer is a PBS containing 1% (w/v) sodium
deoxycholate (DOC), 1.2 kU/ml lysozyme, and 25 U/ml Benzonase.
After the suspension, the lysate was stirred at room temperature
for 15 min and was subjected to sonication for 2 min, followed by
centrifugation at 4.degree. C. for 10 min at 10,000.times.g. The
supernatant was removed, and 20 ml of washing buffer 1 was added to
the precipitate to be suspended. Washing buffer 1 contains 50 mM
Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 2% Tween20. The
suspended sample was centrifuged to remove the supernatant. Then,
20 ml of washing buffer 2 was added to the precipitate to be
suspended. Washing buffer 2 contains 50 mM Tris-HCl pH 8.0
(25.degree. C.), 5 mM EDTA, and 1% (w/v) DOC. The suspended sample
was centrifuged to remove the supernatant. Then, 20 ml of washing
buffer 3 was added to the precipitate to be suspended. Washing
buffer 3 contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA,
and 1 M NaCl. The suspended sample was centrifuged to remove the
supernatant. Then, 10 ml of a solubilization buffer was added to
the precipitate, and the mixture was stirred at room temperature
for 18 h. The solubilization buffer contains 50 mM Tris-HCl pH 8.0
(25.degree. C.), 5 mM EDTA, and 6 M guanidine hydrochloride. The
solubilized sample was concentrated by size-exclusion filtration
and the concentrate was added dropwise to the 10-fold volume of an
ice-cold refolding buffer. Next, the mixture, as it was, was
stirred at 4.degree. C. for 18 h. The refolding buffer contains 100
mM Tris-HCl pH 8.0 (4.degree. C.), 2 mM EDTA, 400 mM L-arginine, 1
mM reduced glutathione, and 0.1 mM oxidized glutathione. After
that, the refolded sample was used as internal liquid and the
100-fold volume of 20 mM Tris-HCl pH 8.0 (4.degree. C.) was used as
external liquid to carry out dialysis (at 4.degree. C. for 18 h).
The sample was centrifuged to recover the supernatant, which was
filtered with a 0.22-.mu.m syringe filter and was subjected to
dialysis using 20 mM Tris-HCl pH 8.0 (25.degree. C.).
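As a rough check on the refolding step, the residual denaturant concentration after dropwise dilution can be estimated as sketched below. The sketch assumes that one volume of concentrate containing 6 M guanidine hydrochloride is mixed with ten volumes of refolding buffer that contains none, and that volumes are additive; the function name is hypothetical.

```python
def diluted_concentration(c_initial, sample_volume, diluent_volume):
    """Concentration after mixing a sample with a solute-free diluent."""
    return c_initial * sample_volume / (sample_volume + diluent_volume)


# 1 volume of concentrate in 6 M guanidine hydrochloride + 10 volumes of buffer.
residual_gdn_m = diluted_concentration(6.0, 1.0, 10.0)
print(round(residual_gdn_m, 3))
```

Under these assumptions the guanidine hydrochloride concentration falls to about 0.55 M before the subsequent dialysis.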
[0166] (3) The filter-sterilized dialysis internal liquid was
injected into a HiTrap Q HP column (GE Healthcare Bioscience) of a
liquid chromatography apparatus AKTApurifier (GE Healthcare
Bioscience). Then, the linear control G-CSF was purified by anion
exchange chromatography (a running buffer: 20 mM Tris-HCl pH 8.0
(25.degree. C.); an elution buffer: 20 mM Tris-HCl pH 8.0
(25.degree. C.) and 1 M NaCl). Further, the purified sample was
injected into a Superdex75 10/300 GL column (GE Healthcare
Bioscience) of the liquid chromatography apparatus AKTApurifier (GE
Healthcare Bioscience). The sample was purified by size-exclusion
chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree.
C.) and 150 mM NaCl). The purified sample was subjected to dialysis
using PBS and was then stored at 4.degree. C.
[0167] (4) G-CSF(C177) and G-CSF(C163) were likewise produced using
the procedures (1), (2), and (3). In this regard, however, to
isolate a linear molecule produced as a by-product of the inteins
reaction, the following manipulations were added to the step of
purifying G-CSF(C177). The sample purified by size-exclusion
chromatography was concentrated by size-exclusion filtration,
followed by dialysis using the 100-fold volume of 20 mM Tris-HCl pH
8.0 (25.degree. C.). The filter-sterilized dialysis internal liquid
was injected into a MonoQ column (GE Healthcare Bioscience) of a
liquid chromatography apparatus AKTApurifier (GE Healthcare
Bioscience). Then, the G-CSF(C177) was purified by anion exchange
chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree.
C.); an elution buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.) and 1
M NaCl). When the amount of G-CSF(C177) added to the
above-described column was 20 .mu.g or less, a cyclized main
product and a linear by-product were separated as 2 independent
peaks. The purified sample of interest was subjected to dialysis
using PBS and was then stored at 4.degree. C.
Example 7
To Measure Cyclization Efficiency by Using SDS-PAGE
[0168] In this example, the purity of each G-CSF after purification
was evaluated by using SDS-PAGE.
[0169] An aqueous solution containing each purified G-CSF at a
concentration of about 50 .mu.M was prepared and was subjected to
SDS-PAGE (4 to 20% precast gels (Bio-Rad), at 200 V for 30 min).
Then, an Oriole.RTM. Fluorescent Gel Stain kit (Bio-Rad) was used
to detect a band, and the purity of each G-CSF was determined. The
results showed that the linear control G-CSF and G-CSF(C163) were
each detected as the major band in their respective samples and
that the degree of purification was sufficient. The G-CSF(C177)
sample after the size-exclusion chromatography contained about 30%
of a linear molecule and the purity of the cyclized product was
demonstrated to be 70% or less. However, the purity after the final
purification using a MonoQ column was demonstrated to be 95% or
higher (FIG. 3 and Table 1).
TABLE-US-00001 TABLE 1 The cyclization efficiency and the function
and stability of each modified cyclized G-CSF were compared.
Sample | Cyclization efficiency (%) | G-CSF receptor-binding activity KD (nM) | Activity on cells EC50 (pg/ml) | Thermostability Tm (.degree. C.) | Protease resistance t0.5 (hours) | Metabolic half-life t0.5 (hours)
filgrastim | -- | 0.19* | 103 | N.A. | 2.0 | N.A.
Linear control G-CSF | -- | 0.43 | 82.3 | 56.5 | 1.7 | 0.95
G-CSF(C177) | 63.6 | 0.57 | 60.7 | 66.2 | 230 | 1.13
G-CSF(C163) | 100 | 0.57 | 80.6 | 58.3 | 11 | 1.05
*The value reported (by Feng Y et al., 1999, Biochemistry) was cited.
Example 8
To Measure G-CSF Receptor-Binding Activity by Using Surface Plasmon
Resonance Spectroscopy
[0170] As described below, surface plasmon resonance (SPR)
spectroscopy was used to evaluate the receptor-binding activity of
each of the linear control G-CSF, G-CSF(C177), and G-CSF(C163).
[0171] Note that SPR spectroscopy is recognized as an excellent
method in which the specific interaction between biopolymers is
measured over time and the reaction can be quantitatively
interpreted from the kinetic viewpoint.
[0172] First, Protein G was immobilized on a flow cell of a sensor
chip CM5 (GE Healthcare) by using a maleimide coupling reaction.
Next, a running buffer HBS-P (10 mM HEPES pH 7.4, 150 mM NaCl, and
0.05% v/v Surfactant P20) in which an Fc-fusion G-CSF receptor had
been dissolved was added to immobilize the receptor on the chip.
The SPR measurement was carried out using Biacore T100 (Biacore)
and the reaction temperature was 25.degree. C. A 2-fold dilution
series from 1.25 nM to 20 nM of the sample solution was prepared,
and a single kinetic mode was used for measurement and
analysis.
[0173] The observation results were analyzed using BIAevaluation
version 4.1 to calculate a dissociation constant. The results
demonstrated that the dissociation constant between G-CSF(C177) or
G-CSF(C163) and the G-CSF receptor was substantially the same value
as that of the linear control G-CSF, indicating that the affinity
remained the same (FIG. 4 and Table 1).
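As a rough illustration of how a dissociation constant relates to the dilution series described above, the following Python sketch assumes a simple 1:1 binding model, in which KD = kd/ka and the steady-state response is Rmax*C/(KD + C). The rate constants and Rmax are hypothetical values chosen only for illustration; they are not taken from the measurements, and the sketch is not the fitting procedure implemented in BIAevaluation.

```python
def two_fold_series(start_nM, stop_nM):
    """Return a 2-fold dilution series from start_nM up to stop_nM (inclusive)."""
    series = []
    conc = start_nM
    while conc <= stop_nM * (1 + 1e-9):
        series.append(conc)
        conc *= 2
    return series


def dissociation_constant_nM(ka_per_M_s, kd_per_s):
    """KD = kd / ka for a 1:1 interaction, converted from M to nM."""
    return kd_per_s / ka_per_M_s * 1e9


def steady_state_response(conc_nM, kd_nM, rmax):
    """Equilibrium response of a 1:1 binding model at analyte concentration conc_nM."""
    return rmax * conc_nM / (kd_nM + conc_nM)


# Analyte concentrations as in the text: 1.25 nM to 20 nM in 2-fold steps.
concs = two_fold_series(1.25, 20.0)

# Hypothetical rate constants, for illustration only.
KA = 2.0e6       # association rate constant, 1/(M*s)
KD_RATE = 1.0e-3  # dissociation rate constant, 1/s
kd_nM = dissociation_constant_nM(KA, KD_RATE)

# Hypothetical Rmax of 100 response units.
responses = [steady_state_response(c, kd_nM, 100.0) for c in concs]
```

Under this model, a sample whose KD is unchanged gives the same response curve across the series, which is the sense in which the affinity is said to remain the same.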
Example 9
To Measure Activity on Cells
[0174] In this Example, the activity of each of filgrastim, the
linear control G-CSF, G-CSF(C177), and G-CSF(C163) on cells was
evaluated.
[0175] A cell line NFS-60, which exhibits G-CSF-dependent growth,
was maintained under conditions at 37.degree. C. and 5% CO.sub.2 in
RPMI1640 culture medium containing 10 ng/ml rhG-CSF and 10% FBS.
The cells in a logarithmic growth phase were prepared at
2.times.10.sup.5 cells/mL in 10% FBS-containing (culture-use
rhG-CSF-free) RPMI1640 culture medium, and were then used for
experiments. A 5-fold dilution series from 3 nM to 192 fM of the
linear control G-CSF, G-CSF(C177), or G-CSF(C163) was prepared
using 10% FBS-containing RPMI1640 culture medium. Equal volumes of
the cell solution and each G-CSF solution were mixed and the
mixture was cultured under conditions at 37.degree. C. and 5%
CO.sub.2 for 48 h. Then, a cell counting kit-8 (DOJINDO
LABORATORIES) was used to count the number of cells.
[0176] As a growth-stimulating activity on NFS-60 cells, the G-CSF
activity of each of the linear control G-CSF, G-CSF(C177), and
G-CSF(C163) was evaluated by calculating 50% effective
concentration (EC50) as obtained by fitting a logistic curve to the
activity data. The average of the vehicle control (VC) was
subtracted from absorbance data obtained using each sample at each
concentration. Then, the resulting value was divided by the average
of values of the reference at the maximum activity concentration to
calculate a relative activity (%) for each plate. The maximum value
was set to 100%, and a logistic formula (Hill equation) was fit to
the data to calculate an EC50 value.
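The normalization and EC50 calculation described above can be sketched in Python as follows. This is a minimal illustration, not the analysis actually used: it assumes the reference maximum is already vehicle-subtracted, fixes the curve between 0% and 100%, and estimates the Hill parameters by linear regression on the logit-transformed data (a linearized Hill plot) rather than by nonlinear least squares. The EC50 and Hill coefficient used to generate the demonstration data are hypothetical.

```python
import math


def five_fold_series(top, n_points):
    """Return n_points concentrations, each 5-fold lower than the previous one."""
    return [top / 5 ** i for i in range(n_points)]


def relative_activity(absorbance, vehicle_mean, reference_max_mean):
    """Vehicle-subtracted absorbance as a percentage of the reference maximum.

    reference_max_mean is assumed to be vehicle-subtracted already.
    """
    return (absorbance - vehicle_mean) / reference_max_mean * 100.0


def fit_hill_linearized(concs, activities_pct):
    """Estimate (EC50, Hill coefficient) for a = 100 / (1 + (EC50 / c) ** h).

    Uses ln(a / (100 - a)) = h * ln(c) - h * ln(EC50) and ordinary least
    squares; points at or outside 0% and 100% are skipped.
    """
    xs, ys = [], []
    for c, a in zip(concs, activities_pct):
        if 0.0 < a < 100.0:
            xs.append(math.log(c))
            ys.append(math.log(a / (100.0 - a)))
    if len(xs) < 2:
        raise ValueError("need at least two points strictly between 0% and 100%")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(-intercept / slope), slope


# 5-fold series from 3 nM (3000 pM) down to 192 fM (0.192 pM): seven points.
concs_pM = five_fold_series(3000.0, 7)

# Synthetic data from a hypothetical EC50 of 4 pM and Hill coefficient of 1.
synthetic = [100.0 / (1.0 + (4.0 / c) ** 1.0) for c in concs_pM]
ec50_pM, hill = fit_hill_linearized(concs_pM, synthetic)
```

With real plate data, `relative_activity` would be applied to each well before fitting, and a nonlinear fit would usually be preferred because the logit transform amplifies noise near 0% and 100%.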
[0177] The results demonstrated that G-CSF(C177) and G-CSF(C163)
exhibited substantially the same activity on cells as that of filgrastim
or the linear control G-CSF (FIG. 5 and Table 1).
Example 10
To Determine Thermostability by Using Circular Dichroism
Spectroscopy
[0178] In this Example, the thermostability of each of the linear
control G-CSF, G-CSF(C177), and G-CSF(C163) was evaluated.
[0179] Circular dichroism (CD)
spectroscopy is known to be a spectroscopic analysis method that
precisely reflects a change in the secondary structure of a
protein. The method can reveal at which temperature a sample is
denatured by measuring a molar ellipticity, which is represented by
the intensity of a CD spectrum, while the temperature of the sample
is changed.
[0180] An aqueous solution (10 mM HEPES buffer, pH 7.4, and 150 mM
sodium chloride) containing the linear control G-CSF, G-CSF(C177),
or G-CSF(C163) at a concentration of 5 .mu.M was prepared. This
sample solution was injected into a cylindrical cell (with a cell
length of 0.2 cm), and a circular dichroism spectrophotometer model
J805 (JASCO Corporation) was used for measurement.
[0181] While the measurement wavelength at a temperature of
10.degree. C. was shifted from 260 nm to 200 nm, a circular
dichroism spectrum was obtained. The results showed that the linear
control G-CSF, G-CSF(C177), and G-CSF(C163) had substantially the
same secondary structure. The same samples were heated to
80.degree. C. and then cooled from 80.degree. C. to 10.degree. C.
After that, a circular dichroism spectrum was remeasured. The molar
ellipticity, however, was not recovered, indicating the occurrence
of irreversible heat denaturation.
[0182] Next, while the temperature was raised from 10.degree. C. to
80.degree. C. at a rate of 1.degree. C. per min, their circular
dichroism at a wavelength of 222 nm was scanned and determined. The
resulting thermal melting curve was analyzed using a theoretical
formula of a two-state phase transition model (Reference Document
11). This analysis determined a denaturation temperature T.sub.m and a
denaturation enthalpy change .DELTA.H.sub.m at the T.sub.m. The
results revealed that the thermostability of each of G-CSF(C177)
and G-CSF(C163) was increased by 9.7.degree. C. and 1.8.degree. C.,
respectively, when compared with that of the linear control G-CSF
(FIG. 6 and Table 1).
Example 11
Protease Resistance Test
[0183] In this Example, the protease resistance of each of
filgrastim, the linear control G-CSF, G-CSF(C177), and G-CSF(C163)
was evaluated.
[0184] Carboxypeptidase Y and each G-CSF were mixed, and the
proportion of an unreacted G-CSF was quantified using SDS-PAGE.
Carboxypeptidase Y is an enzyme that catalyzes the cleavage of an
amino acid from the C-terminus of a protein and exhibits low
substrate specificity and sequence specificity during the reaction.
Because of this, carboxypeptidase Y is effective in comprehensively
evaluating protease resistance.
[0185] The linear control G-CSF, G-CSF(C177), or G-CSF(C163),
isolated and purified, and carboxypeptidase Y were diluted and
mixed in 100 mM acetate buffer (pH 6.5) at a final concentration of
10 .mu.g/ml and 1 .mu.g/ml, respectively. The reaction was carried
out at 37.degree. C. for 0, 60, 120, 180, and 240 min, and 10 .mu.l
of each sample was collected. Each sample was subjected to SDS-PAGE
(4 to 20% precast gels (Bio-Rad), at 200 V for 30 min), and an
Oriole.RTM. Fluorescent Gel Stain kit (Bio-Rad) was used to detect
a band. An imaging system (ChemiDoc.TM., BioRad) and
image-analyzing software (QuantityOne.TM., BioRad) were used to
detect the band and quantify its image density. The result at 0 min
was set to 100%, and the level of the remaining band was scored.
The carboxypeptidase Y-mediated digestion reaction was assumed to
be a pseudo-first-order reaction to calculate a half-life (FIG. 7
and Table 1). The results showed that G-CSF(C177) and G-CSF(C163)
had markedly higher protease resistance than the linear control
G-CSF.
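The half-life calculation under the pseudo-first-order assumption can be sketched as follows. This is a minimal Python illustration, assuming that the remaining band intensity decays as 100*exp(-k*t) and that the fit is constrained through 100% at 0 min, as the normalization above implies; the demonstration data are synthetic and are not the measured band intensities.

```python
import math


def half_life_pseudo_first_order(times_h, remaining_pct):
    """Fit ln(remaining / 100) = -k * t through the origin and return ln(2) / k.

    Time points at t = 0 carry no information for a fit through the origin
    and are skipped, as are non-positive intensities.
    """
    num = 0.0
    den = 0.0
    for t, r in zip(times_h, remaining_pct):
        if t > 0 and r > 0:
            num += t * math.log(r / 100.0)
            den += t * t
    if den == 0.0:
        raise ValueError("need at least one time point after 0")
    k = -num / den
    if k <= 0:
        raise ValueError("no measurable decay; half-life is undefined")
    return math.log(2) / k


# Sampling times from the text: 0, 60, 120, 180, and 240 min, in hours.
times_h = [0.0, 1.0, 2.0, 3.0, 4.0]

# Synthetic intensities for a hypothetical half-life of 1.7 h.
remaining = [100.0 * 0.5 ** (t / 1.7) for t in times_h]
t_half = half_life_pseudo_first_order(times_h, remaining)
```

A sample that shows no degradation over the sampling window gives k <= 0 in this sketch, so no half-life can be reported for it.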
Example 12
Test for Measuring Metabolic Half-life
[0186] In this Example, the pharmacokinetics of each of the linear
control G-CSF, G-CSF(C177), and G-CSF(C163) was evaluated.
[0187] A single dose of each sample was administered at 100 .mu.g
protein/kg to a tail vein of each 7-week-old male Sprague-Dawley
rat. Then, 20 min, 1.5 h, 3 h, 4.5 h, 6 h, and 24 h after the
administration, 100 .mu.l of blood was taken via a tail vein. Each
blood sample was fractionated by centrifugation and the serum
concentration of the remaining G-CSF was quantified using a G-CSF
Human ELISA kit (Abcam.RTM.).
[0188] The results showed that the linear control G-CSF,
G-CSF(C177), and G-CSF(C163) had a calculated metabolic half-life
of 0.95 h, 1.13 h, and 1.05 h, respectively, indicating that
G-CSF(C177) and G-CSF(C163) had a decrease in in vivo clearance
efficiency (Table 1).
Example 13
X-ray Crystallography
[0189] In this Example, a single crystal of each improved protein
was created and its three-dimensional structure was determined by
X-ray crystallography. X-ray crystallography is a technique by
which the high-resolution structure of a protein can be observed;
it can be used to check whether cyclization has actually occurred
and whether the cyclization causes a change in the
receptor-binding interface.
[0190] Regarding G-CSF(C177), a mother liquor containing 200 mM
Li.sub.2SO.sub.4, 100 mM Tris-HCl (pH 8.4), and 20% (w/v) PEG 4000
was used for crystallization to give a columnar single crystal.
This crystal was used to obtain diffraction data with a resolution
of 3.0 .ANG.. Meanwhile, regarding G-CSF(C163), a mother liquor
containing 0.4 M ammonium phosphate was used for crystallization to
give a regular octahedral single crystal. This crystal was used to
obtain diffraction data with a resolution of 1.65 .ANG.. These two
different diffraction data were used to construct each
three-dimensional structure model.
[0191] First, it was verified that the N-terminal region and the
C-terminal region of each were made cyclic. With respect to each of
G-CSF(C177) and G-CSF(C163), electron density maps were obtained
showing that the serine residue at the N-terminus and the
glycine residue at the C-terminus were bonded (FIG. 8). The
three-dimensional structure of each of G-CSF(C177) and G-CSF(C163)
was superimposed on the known three-dimensional structure (PDB
code: 2D9Q-A chain) of the linear control G-CSF. Then, whether the
introduced mutations affected the G-CSF receptor-binding
interface was checked.
[0192] There are two separate interaction interfaces between G-CSF
and its receptor (site II and site III). It has been reported that each
interface requires 8 amino acids for the binding (Reference
Document 12). The positions of these 16 amino acids were compared
among the linear control G-CSF, G-CSF(C177), and G-CSF(C163). The
results showed that the distances between the corresponding
.alpha.-carbons of the G-CSFs were each less than 3 .ANG.,
indicating no difference in their three-dimensional structures
(Table 2).
TABLE-US-00002 TABLE 2 Comparison of the position of .alpha.-carbon
of each amino acid used to form the G-CSF receptor-binding
interface between the known three-dimensional structure (PDB code:
2D9Q-A chain) of human G-CSF and the three-dimensional structure of
each of G-CSF(C177) and G-CSF(C163) by using X-ray
crystallography.
Site | Known structure of G-CSF (2D9Q A chain) | G-CSF(C177) distance (.ANG.) | G-CSF(C177) (corresponding amino acid) | G-CSF(C163) distance (.ANG.) | G-CSF(C163) (corresponding amino acid)
site II | Lys-16 | 0.49 | (Lys-18) | 1.00 | (Lys-7)
site II | Glu-19 | 0.38 | (Glu-21) | 0.73 | (Glu-10)
site II | Gln-20 | 0.10 | (Gln-22) | 0.35 | (Gln-11)
site II | Arg-22 | 0.86 | (Arg-24) | 0.44 | (Arg-13)
site II | Lys-23 | 0.34 | (Lys-25) | 0.47 | (Lys-14)
site II | Leu-108 | 0.76 | (Leu-110) | 0.95 | (Leu-99)
site II | Asp-109 | 0.47 | (Asp-111) | 0.77 | (Asp-100)
site II | Asp-112 | 0.76 | (Asp-114) | 1.36 | (Asp-103)
site III | Tyr-39 | 0.65 | (Tyr-41) | 0.27 | (Tyr-30)
site III | Leu-41 | 0.30 | (Leu-43) | 0.33 | (Leu-32)
site III | Glu-46 | 1.44 | (Glu-48) | 1.37 | (Glu-37)
site III | Val-48 | 1.16 | (Val-50) | 1.74 | (Val-39)
site III | Leu-49 | 1.29 | (Leu-51) | 2.39 | (Leu-40)
site III | Ser-53 | 0.39 | (Ser-55) | 2.09 | (Ser-44)
site III | Phe-144 | 0.65 | (Phe-146) | 0.30 | (Phe-135)
site III | Arg-147 | 0.71 | (Arg-149) | 0.44 | (Arg-138)
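The comparison in Table 2 can be checked mechanically. The following Python sketch transcribes the table values and tests two things: that every .alpha.-carbon distance is below the 3 .ANG. threshold stated in the text, and that the residue numbering of each cyclized molecule differs from that of the known structure by a constant offset. The data layout and helper names are illustrative choices, not part of the original analysis.

```python
# (site, known residue, C177 distance, C177 residue, C163 distance, C163 residue)
TABLE2 = [
    ("II", "Lys-16", 0.49, "Lys-18", 1.00, "Lys-7"),
    ("II", "Glu-19", 0.38, "Glu-21", 0.73, "Glu-10"),
    ("II", "Gln-20", 0.10, "Gln-22", 0.35, "Gln-11"),
    ("II", "Arg-22", 0.86, "Arg-24", 0.44, "Arg-13"),
    ("II", "Lys-23", 0.34, "Lys-25", 0.47, "Lys-14"),
    ("II", "Leu-108", 0.76, "Leu-110", 0.95, "Leu-99"),
    ("II", "Asp-109", 0.47, "Asp-111", 0.77, "Asp-100"),
    ("II", "Asp-112", 0.76, "Asp-114", 1.36, "Asp-103"),
    ("III", "Tyr-39", 0.65, "Tyr-41", 0.27, "Tyr-30"),
    ("III", "Leu-41", 0.30, "Leu-43", 0.33, "Leu-32"),
    ("III", "Glu-46", 1.44, "Glu-48", 1.37, "Glu-37"),
    ("III", "Val-48", 1.16, "Val-50", 1.74, "Val-39"),
    ("III", "Leu-49", 1.29, "Leu-51", 2.39, "Leu-40"),
    ("III", "Ser-53", 0.39, "Ser-55", 2.09, "Ser-44"),
    ("III", "Phe-144", 0.65, "Phe-146", 0.30, "Phe-135"),
    ("III", "Arg-147", 0.71, "Arg-149", 0.44, "Arg-138"),
]

THRESHOLD_ANGSTROM = 3.0  # threshold stated in the text


def resnum(label):
    """Extract the residue number from a label such as 'Lys-16'."""
    return int(label.split("-")[1])


# Largest C-alpha displacement over both cyclized molecules.
max_distance = max(max(row[2], row[4]) for row in TABLE2)
all_below_threshold = max_distance < THRESHOLD_ANGSTROM

# Numbering offsets relative to the known structure; one value per molecule
# means the residue correspondence is a pure renumbering.
offsets_c177 = {resnum(row[3]) - resnum(row[1]) for row in TABLE2}
offsets_c163 = {resnum(row[5]) - resnum(row[1]) for row in TABLE2}
```

The largest displacement in the table is 2.39 .ANG. (Leu-49 in G-CSF(C163)), and the numbering offsets are +2 for G-CSF(C177) and -9 for G-CSF(C163) throughout.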
Example 14
To Improve Loop Length
[0193] In Example 1, the protein conformation models registered in
the database (PDB) were also compared to the structure of G-CSF. As
a result, 9 similar conformation models having the structure of a
loop consisting of 9 amino acids were discovered (accounting for 16%
of the total). Likewise, 15 similar conformation models having the
structure of a loop consisting of 5 amino acids were found
(accounting for 26% of the total). Then, two different modified
G-CSFs were designed and produced such that corresponding seven or
three amino acids of the amino acid sequence of the linear control
G-CSF were maintained as much as possible, except for the serine at
the N-terminus and the glycine at the C-terminus. Specifically,
N-terminal 4 amino acid residues and C-terminal 3 amino acid
residues of the amino acid sequence (SEQ ID NO: 1) of the linear
control G-CSF were deleted; serine was added to the N-terminus and
glycine was added to the C-terminus to give an engineered sequence
(SEQ ID NO: 3); and split inteins were used to produce G-CSF(C170).
Likewise, N-terminal 8 amino acid residues and C-terminal 3 amino
acid residues of the amino acid sequence (SEQ ID NO: 1) of the
linear control G-CSF were deleted; serine was added to the
N-terminus and glycine was added to the C-terminus to give an
engineered sequence (SEQ ID NO: 4); and split inteins were used to
produce G-CSF(C166).
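The residue bookkeeping behind these two designs can be sketched in Python as follows. The sketch assumes, as an interpretation not stated in this passage, that the number in each construct name (C170, C166) is the residue count of the cyclized molecule; under that assumption both designs should imply the same length for the linear control sequence. The helper names are hypothetical.

```python
CAPPING_RESIDUES = 2  # serine added at the N-terminus, glycine at the C-terminus


def native_loop_residues(loop_length):
    """Native residues kept in a loop once the added Ser and Gly are excluded."""
    return loop_length - CAPPING_RESIDUES


def implied_parent_length(cyclized_length, n_deleted, c_deleted):
    """Length of the linear control implied by a design (assumption: the
    construct name gives the residue count of the cyclized molecule)."""
    return cyclized_length - CAPPING_RESIDUES + n_deleted + c_deleted


# (assumed residue count, N-terminal deletions, C-terminal deletions)
designs = {
    "G-CSF(C170)": (170, 4, 3),
    "G-CSF(C166)": (166, 8, 3),
}

implied = {name: implied_parent_length(*spec) for name, spec in designs.items()}
kept_for_9_loop = native_loop_residues(9)
kept_for_5_loop = native_loop_residues(5)
```

The two loop lengths leave seven and three native residues respectively, matching the "seven or three amino acids" in the text, and the extra four N-terminal deletions in G-CSF(C166) account for the difference between them.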
Example 15
To Synthesize G-CSF(C170)-Expression Plasmid
[0194] A nucleic acid (SEQ ID NO: 18) encoding the amino acid
sequence of G-CSF(C170) was synthesized by PCR reaction using the
linear control G-CSF-expression plasmid as a template and two
single strand DNAs (SEQ ID NO: 31 and SEQ ID NO: 32) as primers.
The resulting product was digested by restriction enzymes NheI and
NdeI and was then purified. Next, the inteins-expression plasmid
was digested by restriction enzymes NheI and NdeI and was then
purified. The two samples purified were ligated using T7 DNA
ligase.
[0195] E. coli DH5.alpha. was transformed with the resulting
ligation product and was cultured on an LB plate medium containing
100 .mu.g/ml of ampicillin. The resulting transformants were
subjected to colony PCR and DNA sequencing (GE Healthcare
Bioscience, BigDye Terminator v3.1) and were then selected. After
that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a
G-CSF(C170)-expression plasmid.
Example 16
To Synthesize G-CSF(C166)-Expression Plasmid
[0196] A nucleic acid (SEQ ID NO: 19) encoding the amino acid
sequence of G-CSF(C166) was synthesized by PCR reaction using the
linear control G-CSF-expression plasmid as a template and two
single strand DNAs (SEQ ID NO: 33 and SEQ ID NO: 34) as primers.
The resulting product was digested by restriction enzymes NheI and
NdeI and was then purified. Next, the inteins-expression plasmid
was digested by restriction enzymes NheI and NdeI and was then
purified. The two samples purified were ligated using T7 DNA
ligase.
[0197] E. coli DH5.alpha. was transformed with the resulting
ligation product and was cultured on an LB plate medium containing
100 .mu.g/ml of ampicillin. The resulting transformants were
subjected to colony PCR and DNA sequencing (GE Healthcare
Bioscience, BigDye Terminator v3.1) and were then selected. After
that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a
mutant G-CSF(C166)-expression plasmid.
Example 17
To produce G-CSF(C170) and G-CSF(C166)
[0198] (1) E. coli BL21(DE3) (Novagen), which was used for
expression, was transformed with the G-CSF(C170)-expression plasmid
or the G-CSF(C166)-expression plasmid. The pre-cultured
transformant was inoculated in an LB medium at 1 ml/200 ml and the
mixture was cultured under shaking until O.D..sub.600 =0.8 to 1.0.
To induce expression of each G-CSF, IPTG (0.4 mM) was added and
the mixture was further cultured under shaking at 37.degree. C. for
4 h.
[0199] (2) The bacteria collected were suspended in 10 ml of a
lysis buffer. The lysis buffer is a PBS containing 1% (w/v) sodium
deoxycholate (DOC), 1.2 kU/ml lysozyme, and 25 U/ml Benzonase.
After the suspension, the mixture was stirred at room temperature
for 15 min and was subjected to sonication for 2 min. Then, the
mixture was centrifuged at 4.degree. C. for 10 min at 10,000 .times.g.
The supernatant was removed, and 20 ml of washing buffer 1 was
added to the precipitate to be suspended. Washing buffer 1 contains
50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 2% Tween20.
The suspended sample was centrifuged to remove the supernatant.
Then, 20 ml of washing buffer 2 was added to the precipitate to be
suspended. Washing buffer 2 contains 50 mM Tris-HCl pH 8.0
(25.degree. C.), 5 mM EDTA, and 1% (w/v) DOC. The suspended sample
was centrifuged to remove the supernatant. Then, 20 ml of washing
buffer 3 was added to the precipitate to be suspended. Washing
buffer 3 contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA,
and 1 M NaCl. The suspended sample was centrifuged to remove the
supernatant. Then, 10 ml of a solubilization buffer was added to
the precipitate, and the mixture was stirred at room temperature
for 18 h. The solubilization buffer contains 50 mM Tris-HCl pH 8.0
(25.degree. C.), 5 mM EDTA, and 6 M guanidine hydrochloride. The
solubilized sample was concentrated by size-exclusion filtration
and the concentrate was added dropwise to the 10-fold volume of an
ice-cold refolding buffer. Next, the mixture, as it was, was
stirred at 4.degree. C. for 18 h. The refolding buffer contains 100
mM Tris-HCl pH 8.0 (4.degree. C.), 2 mM EDTA, 400 mM L-arginine, 1
mM reduced glutathione, and 0.1 mM oxidized glutathione. After
that, the refolded sample was used as internal liquid and the
100-fold volume of 20 mM Tris-HCl pH 8.0 (4.degree. C.) was used as
external liquid to carry out dialysis (at 4.degree. C. for 18 h).
The sample was centrifuged to recover the supernatant, which was
filtered with a 0.22-.mu.m syringe filter and was subjected to
dialysis using 20 mM Tris-HCl pH 8.0 (25.degree. C.).
[0200] (3) The filter-sterilized dialysis internal liquid was
injected into a HiTrap Q HP column (GE Healthcare Bioscience) of a
liquid chromatography apparatus AKTApurifier (GE Healthcare
Bioscience).
[0201] Then, each G-CSF was purified by
anion exchange chromatography (a running buffer: 20 mM Tris-HCl pH
8.0 (25.degree. C.); an elution buffer: 20 mM Tris-HCl pH 8.0
(25.degree. C.) and 1 M NaCl). Further, each purified sample was
injected into a Superdex75 10/300 GL column (GE Healthcare
Bioscience) of the liquid chromatography apparatus AKTApurifier (GE
Healthcare Bioscience). Each sample was purified by size-exclusion
chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree.
C.) and 150 mM NaCl). Each sample purified by size-exclusion
chromatography was concentrated by size-exclusion filtration,
followed by dialysis using the 100-fold volume of 20 mM Tris-HCl pH
8.0 (25.degree. C.). The filter-sterilized dialysis internal liquid
was injected into a MonoQ column (GE Healthcare Bioscience) of the
liquid chromatography apparatus AKTApurifier (GE Healthcare
Bioscience). Then, each G-CSF was purified by anion exchange
chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree.
C.); an elution buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.) and 1
M NaCl). Each purified sample was subjected to dialysis using PBS
and was then stored at 4.degree. C.
Example 18
To Measure Cyclization Efficiency by Using SDS-PAGE
[0202] In this example, the purity of each G-CSF after purification
was evaluated by using SDS-PAGE.
[0203] An aqueous solution containing each purified G-CSF at a
concentration of about 50 .mu.M was prepared and was subjected to
SDS-PAGE (4 to 20% gels (Bio-Rad), at 200 V for 30 min). Then, an
Oriole.RTM. Fluorescent Gel Stain kit (Bio-Rad) was used to detect a
band, and the purity of each G-CSF was determined. The results
showed that G-CSF(C170) and G-CSF(C166) were each detected as the
major band in every sample measured and that the degree of
purification was sufficient. Each sample after the size-exclusion
chromatography was fractionated using a MonoQ column. G-CSF(C170)
appeared as a single peak and no linear molecule was observed. By
contrast, regarding G-CSF(C166), 4% of a linear molecule was
observed in terms of a peak area (Table 3).
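A minimal Python sketch of the peak-area arithmetic follows. It assumes that cyclization efficiency is the cyclized peak area expressed as a percentage of the combined cyclized and linear peak areas, which is consistent with a 4% linear fraction corresponding to 96.0% efficiency; the function name and the area units are illustrative.

```python
def cyclization_efficiency(cyclic_area, linear_area):
    """Cyclized product as a percentage of total (cyclized + linear) peak area."""
    total = cyclic_area + linear_area
    if total <= 0:
        raise ValueError("total peak area must be positive")
    return 100.0 * cyclic_area / total


# Single peak with no linear molecule, as described for G-CSF(C170).
eff_single_peak = cyclization_efficiency(100.0, 0.0)

# 4% of the peak area from a linear molecule, as described for G-CSF(C166).
eff_with_linear = cyclization_efficiency(96.0, 4.0)
```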
TABLE-US-00003 TABLE 3 The cyclization efficiency and the function
and stability of each of G-CSF(C170) and G-CSF(C166) among the
modified cyclized G-CSFs were compared to those of the linear
control G-CSF.
Sample | Cyclization efficiency (%) | G-CSF receptor-binding activity KD (nM) | Thermostability Tm (.degree. C.) | Protease resistance t0.5 (hours)
filgrastim | -- | 0.27 | N.A. | 2.0*
Linear control G-CSF | -- | 0.08 | 56.5 | 1.7*
G-CSF(C177) | 63.6* | 0.06 | 62.1 | 230*
G-CSF(C170) | 100 | 0.05 | 64.1 | 10
G-CSF(C166) | 96.0 | 0.06 | 59.1 | **
G-CSF(C163) | 100* | 0.10 | 58.2 | 11*
*The analysis values in Table 1 were used.
** No degradation was observed during the reaction.
Example 19
To Measure G-CSF Receptor-Binding Activity by Using Surface Plasmon
Resonance Spectroscopy
[0204] As described below, surface plasmon resonance (SPR)
spectroscopy was used to evaluate the receptor-binding activity of
each of G-CSF(C170) and G-CSF(C166).
[0205] Note that SPR spectroscopy is recognized as an excellent
method in which the specific interaction between biopolymers is
measured over time and the reaction can be quantitatively
interpreted from the kinetic viewpoint.
[0206] First, commercially available Protein A (Nacalai Tesque,
Inc.) was immobilized on a flow cell of a sensor chip CM5 (GE
Healthcare) by using an amine coupling reaction. Next, a running
buffer HBS-P (10 mM HEPES pH 7.4, 150 mM NaCl, and 0.05% v/v
Surfactant P20) in which an Fc-fusion G-CSF receptor had been
dissolved was added to immobilize the receptor on the chip. The SPR
measurement was carried out using Biacore T200 (Biacore) and the
reaction temperature was 25.degree. C. A 2-fold dilution series
from 1.25 nM to 20 nM of the sample solution was prepared, and a
single kinetic mode was used for measurement and analysis. When
compared with Example 8, this Example had improvements such as use
of Biacore T200, which has higher sensitivity than Biacore T100,
and use of commercially available Protein A. The improvements
helped increase the sensitivity and reproducibility of measurement.
Then, the linear control G-CSF, G-CSF(C177), and G-CSF(C163) as measured in
Example 8 were remeasured and the resulting analysis values were
used for comparison.
[0207] The observation results were analyzed using BIAevaluation
version 4.1 to calculate a dissociation constant. The results
demonstrated that the dissociation constant between G-CSF(C170) or
G-CSF(C166) and the G-CSF receptor was substantially the same value
as that of the linear control G-CSF, indicating that the affinity
remained the same (FIG. 10 and Table 3).
Example 20
To Determine Thermostability by Using Circular Dichroism
Spectroscopy
[0208] In this Example, the thermostability of each of G-CSF(C170)
and G-CSF(C166) was evaluated. Circular dichroism (CD)
spectroscopy is known to be a spectroscopic analysis method that
precisely reflects a change in the secondary structure of a
protein. The method can reveal at which temperature a sample is
denatured by measuring a molar ellipticity, which is represented by
the intensity of a CD spectrum, while the temperature of the sample
is changed.
[0209] An aqueous solution (10 mM HEPES buffer, pH 7.4, and 150 mM
sodium chloride) containing G-CSF(C170) or G-CSF(C166) at a
concentration of 5 .mu.M was prepared. This sample solution was
injected into a cylindrical cell (with a cell length of 0.2 cm),
and a circular dichroism spectrophotometer model J805 (JASCO
Corporation) was used for measurement.
[0210] While the measurement wavelength at a temperature of
10.degree. C. was shifted from 260 nm to 200 nm, a circular
dichroism spectrum was obtained. The results showed that
G-CSF(C170) and G-CSF(C166) had substantially the same secondary
structure. The same samples were heated to 90.degree. C. and then
cooled from 90.degree. C. to 10.degree. C. After that, a circular
dichroism spectrum was remeasured. The molar ellipticity, however,
was not recovered, indicating the occurrence of irreversible heat
denaturation.
[0211] Next, while the temperature was raised from 10.degree. C. to
90.degree. C. at a rate of 1.degree. C. per min, their circular
dichroism at a wavelength of 222 nm was scanned and determined. The
resulting thermal melting curve was analyzed using a theoretical
formula of a two-state phase transition model (Reference Document
11). Then, the denaturation temperature T.sub.m was determined. At
the time of curve fitting, a heat capacity change due to the
denaturation was assumed and fixed to .DELTA.Cp=7.5
kJ mol.sup.-1 K.sup.-1. When compared with Example 10, this Example
was carried out by measuring thermal melting up to a higher
temperature and by using a value representing the fixed heat
capacity change, resulting in higher reproducibility. G-CSF(C177)
was remeasured and re-analyzed under the same conditions. In
addition, the linear control G-CSF and G-CSF(C163) were
re-analyzed. The resulting analysis values were used for
comparison.
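To make the role of the fixed heat capacity change concrete, the following Python sketch evaluates a two-state unfolding model. The exact formula of Reference Document 11 is not reproduced in this document, so the sketch assumes the standard Gibbs-Helmholtz form, dG(T) = dHm*(1 - T/Tm) + dCp*((T - Tm) - T*ln(T/Tm)), with the unfolded fraction given by K/(1 + K) and K = exp(-dG/(R*T)). The enthalpy value used in the demonstration is hypothetical.

```python
import math

R_KJ = 8.314e-3  # gas constant in kJ/(mol*K)


def unfolding_free_energy(temp_k, tm_k, dhm_kj, dcp_kj=7.5):
    """Free energy of unfolding (kJ/mol) at temp_k, Gibbs-Helmholtz form.

    dcp_kj defaults to the fixed value 7.5 kJ/(mol*K) stated in the text.
    """
    return (dhm_kj * (1.0 - temp_k / tm_k)
            + dcp_kj * ((temp_k - tm_k) - temp_k * math.log(temp_k / tm_k)))


def fraction_unfolded(temp_k, tm_k, dhm_kj, dcp_kj=7.5):
    """Unfolded fraction of a two-state system at temp_k."""
    k_eq = math.exp(-unfolding_free_energy(temp_k, tm_k, dhm_kj, dcp_kj)
                    / (R_KJ * temp_k))
    return k_eq / (1.0 + k_eq)


def celsius_to_kelvin(temp_c):
    return temp_c + 273.15


# Tm of the linear control from Table 3; dHm is a hypothetical value.
TM_K = celsius_to_kelvin(56.5)
DHM_KJ = 400.0

# Scan from 10 to 90 degrees C in 1-degree steps, as in the measurement.
curve = [(t, fraction_unfolded(celsius_to_kelvin(t), TM_K, DHM_KJ))
         for t in range(10, 91)]
```

In an actual fit, the measured ellipticity at 222 nm would be modeled as a mixture of folded and unfolded baselines weighted by this fraction, with Tm and dHm as free parameters and dCp held fixed.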
[0212] The results revealed that the thermostability of each of
G-CSF(C170) and G-CSF(C166) was increased by 7.6.degree. C. and
12.6.degree. C., respectively, when compared with that of the
linear control G-CSF (FIG. 11 and Table 3). The results also
revealed that the thermostability of each of G-CSF(C177) and
G-CSF(C163) was increased by 5.6.degree. C. and 1.7.degree. C.,
respectively, when compared with that of the linear control G-CSF.
Collectively, in terms of the thermostability, G-CSF(C166) was the
most stable modified molecule.
Example 21
Protease Resistance Test
[0213] In this Example, the protease resistance of each of
G-CSF(C170) and G-CSF(C166) was evaluated.
[0214] Carboxypeptidase Y and each G-CSF were mixed, and the
proportion of an unreacted G-CSF was quantified using SDS-PAGE.
Carboxypeptidase Y is an enzyme that catalyzes the cleavage of an
amino acid from the C-terminus of a protein and exhibits low
substrate specificity and sequence specificity during the reaction.
Because of this, carboxypeptidase Y is effective in comprehensively
evaluating protease resistance.
[0215] The G-CSF(C170) or G-CSF(C166), isolated and purified, and
carboxypeptidase Y were diluted and mixed in 100 mM acetate buffer
(pH 6.5) at a final concentration of 10 .mu.g/ml and 1 .mu.g/ml,
respectively. The reaction was carried out at 37.degree. C. for 0,
60, 120, 180, and 240 min, and 10 .mu.l of each sample was
collected. Each sample was subjected to SDS-PAGE (4 to 20% precast
gels (Bio-Rad), at 200 V for 30 min), and an Oriole.RTM.
Fluorescent Gel Stain kit (Bio-Rad) was used to detect a band. An
imaging system (ChemiDoc.TM., BioRad) and image-analyzing software
(QuantityOne.TM., BioRad) were used to detect the band and quantify
its image density. The result at 0 min was set to 100%, and the
level of the remaining band was scored. The carboxypeptidase
Y-mediated digestion reaction was assumed to be a
pseudo-first-order reaction to calculate a half-life (FIG. 12 and
Table 3). The results showed that G-CSF(C170) and G-CSF(C166) had
markedly higher protease resistance than the linear control
G-CSF.
[0216] Note that the present application further relates to the
following inventions.
[0217] <1> A method for producing a cyclized mutant protein
with biological properties not less than those of an original
protein and with stability higher than that of the original protein
by modifying, through the following steps (a) and (b) and/or steps
(c) to (f), the original protein with a specific three-dimensional
structure, the method comprising the steps of:
[0218] (a) adding, to an N-terminus and/or a C-terminus of an
amino acid sequence of the original protein, residue(s) that
enable formation of a peptide bond therebetween by using a
trans-splicing reaction using split inteins, provided that when
amino acid residues at the N-terminus and the C-terminus of the
amino acid sequence of the original protein are residues that
enable the formation of a peptide bond, the amino acid residues may
remain intact, and
[0219] (b) linking the N-terminus and the C-terminus by using the
trans-splicing reaction using the split inteins to make a
polypeptide main chain cyclic; and if sufficient cyclization
efficiency is not achieved after steps (a) and (b),
[0220] (c) determining N-terminal and C-terminal secondary
structure-free regions on a basis of conformational information
about the original protein and deleting or mutating the
regions,
[0221] (d) screening a known protein database for a structure of a
loop connecting secondary structures similar to N-terminal and
C-terminal secondary structures of a protein obtained after step
(c) and determining a suitable length of the loop,
[0222] (e) adding, to an N-terminus and a C-terminus of the protein
obtained after step (c), amino acid residues or amino acid
sequences used to form the structure of the loop, with the suitable
length, that connects the N-terminus and the C-terminus after
cyclization, and
[0223] (f) linking the N-terminus and the C-terminus by means of a
chemical or biological technique to make the polypeptide main chain
cyclic.
[0224] <2> The method for producing a cyclized mutant protein
according to item <1>, wherein the polypeptide main chain
cyclization of step (f) uses a trans-splicing reaction mediated by
split inteins.
[0225] <3> The method for producing a cyclized mutant protein
according to item <1> or <2>, wherein the original
protein with a specific three-dimensional structure is a cytokine
with a helix bundle structure and the biological properties of the
original protein involve affinity for a receptor.
[0226] <4> A cyclized mutant protein produced using the
production method according to any one of items <1> to
<3>.
[0227] <5> The cyclized mutant protein according to item
<4>, wherein the original cytokine is granulocyte
colony-stimulating factor (G-CSF).
[0228] <6> The cyclized mutant protein according to item
<5>, wherein the original cytokine is a granulocyte
colony-stimulating factor (G-CSF) consisting of an amino acid
sequence set forth in SEQ ID NO: 1.
[0229] <7> The cyclized mutant protein according to item
<6>, wherein the protein consists of an amino acid sequence
in which 0 to 11 amino acid residues are deleted from an N-terminus
and 0 to 3 amino acid residues are deleted from a C-terminus of the
amino acid sequence set forth in SEQ ID NO: 1; a serine residue or
a cysteine residue is added to an N-terminus and/or a C-terminus
after the deletion; and an amino acid residue other than proline is
added to the C-terminus.
[0230] <8> The cyclized mutant protein according to item
<7>, wherein the protein consists of a cyclized amino acid
sequence set forth in any one of SEQ ID NOs: 6 to 9.
[0231] <9> A linear mutant protein designed so as to
produce, in accordance with the method according to any one of
items <1> to <3>, the cyclized mutant protein according
to item <8> and consisting of an amino acid sequence set
forth in any one of SEQ ID NOs: 2 to 5.
[0232] <10> A linear mutant protein designed so as to
produce, in accordance with the method according to any one of
items <1> to <3>, the mutant protein according to item
<8> by using, as split inteins, DnaE-C (C-intein) and DnaE-N
(N-intein) of DnaE derived from Nostoc punctiforme and consisting
of an amino acid sequence set forth in any one of SEQ ID NOs: 10 to
13.
[0233] <11> A nucleic acid comprising a nucleotide sequence
encoding an amino acid sequence of the mutant protein according to
any one of items <4> to <10>.
[0234] <12> A recombinant vector comprising the nucleic acid
according to item <11>.
[0235] <13> A transformant comprising the recombinant vector
according to item <12>.
[0236] <14> A method for producing the mutant protein
according to any one of items <4> to <10>, comprising
using the transformant according to item <13>.
[0237] <15> A pharmaceutical composition comprising, as an
active ingredient, the mutant protein according to any one of items
<5> to <9>.
[0238] <16> A pharmaceutical composition for treatment of
neutropenia, comprising, as an active ingredient, the mutant
protein according to any one of items <5> to <9>.
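The construction described in item <7> can be illustrated with a short sketch. It assumes the pattern seen in SEQ ID NOs: 2 to 5 (one serine added at the N-terminus and one glycine added at the C-terminus after the deletion); the function name `linear_variant`, its parameters, and the one-letter transcription of SEQ ID NO: 1 are hypothetical conveniences for illustration and should be checked against the sequence listing before any real use.

```python
# One-letter transcription of SEQ ID NO: 1 (linear control G-CSF, 175 residues).
# Transcribed by hand for this sketch; verify against the sequence listing.
SEQ_ID_1 = (
    "MAPLGPASSLPQSFLL" "KSLEQVRKIQGDGAAL" "QEKLCATYKLCHPEEL"
    "VLLGHSLGIPWAPLSS" "CPSQALQLAGCLSQLH" "SGLFLYQGLLQALEGI"
    "SPELGPTLDTLQLDVA" "DFATTIWQQMEELGMA" "PALQPTQGAMPAFASA"
    "FQRRAGGVLVASHLQS" "FLEVSYRVLRHLAQP"
)


def linear_variant(parent, n_del, c_del, n_add="S", c_add="G"):
    """Delete terminal residues, then add one residue at each terminus.

    Limits follow item <7>: 0 to 11 residues removed from the N-terminus,
    0 to 3 from the C-terminus, Ser or Cys added at the N-terminus, and a
    residue other than Pro added at the C-terminus (an assumed reading).
    """
    if not 0 <= n_del <= 11:
        raise ValueError("N-terminal deletion must be 0 to 11 residues")
    if not 0 <= c_del <= 3:
        raise ValueError("C-terminal deletion must be 0 to 3 residues")
    if n_add not in ("S", "C"):
        raise ValueError("N-terminal addition must be Ser or Cys")
    if len(c_add) != 1 or c_add == "P":
        raise ValueError("C-terminal addition must be one non-proline residue")
    core = parent[n_del:len(parent) - c_del]
    return n_add + core + c_add


# Deletion pairs inferred by comparing SEQ ID NOs: 2 to 5 with SEQ ID NO: 1.
variants = {"C177": (0, 0), "C170": (4, 3), "C166": (8, 3), "C163": (11, 3)}
for name, (n_del, c_del) in variants.items():
    seq = linear_variant(SEQ_ID_1, n_del, c_del)
    print(name, len(seq), seq[:6], seq[-4:])
```

The residue counts produced by the four deletion pairs are 177, 170, 166, and 163, which is where the variant names come from in this reading.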
(LIST OF REFERENCE DOCUMENTS)
[0239] Reference Document 1: B. I. Lord, L. B. Woolford, and G.
Molineux, Clin. Cancer Res., 2001, 7, 2085-90.
[0240] Reference Document 2: M. W. Popp, S. K. Dougan, T. Y. Chuang,
E. Spooner, and H. L. Ploegh, Proc. Natl. Acad. Sci. U.S.A., 2011,
108, 3169-74.
[0241] Reference Document 3: T. Kuga, Y. Komatsu, M. Yamasaki, S.
Sekine, H. Miyaji, T. Nishi, M. Sato, Y. Yokoo, M. Asano, M. Okabe,
M. Morimoto, and S. Itoh, Biochem. Biophys. Res. Commun., 1989,
159, 103-111.
[0242] Reference Document 4: W. Kabsch and C. Sander, Biopolymers,
1983, 22, 2577-637.
[0243] Reference Document 5: D. Frishman and P. Argos, Proteins,
1995, 23, 566-79.
[0244] Reference Document 6: C. P. Scott, E. Abel-Santos, M. Wall,
D. C. Wahnon, and S. J. Benkovic, Proc. Natl. Acad. Sci., 1999, 96,
13638-13643.
[0245] Reference Document 7: R. M. Horton, H. D. Hunt, S. N. Ho, J.
K. Pullen, and L. R. Pease, Gene, 1989, 77, 61-8.
[0246] Reference Document 8: OKADA, Masato and MIYAZAKI, Kaori
(2004), "Tanpakushitsu Jikken Noto (Notes on Protein Experiments)"
(I), YODOSHA CO., LTD.
[0247] Reference Document 9: OHNO, Shigeo and NISHIMURA, Yoshifumi
(1997), "Tanpakushitsu Jikken Purotokoru (Protocols for Protein
Experiments) 1-Functional Analysis Part", Shujunsha Co., Ltd.
[0248] Reference Document 10: OHNO, Shigeo and NISHIMURA,
Yoshifumi (1997), "Tanpakushitsu Jikken Purotokoru (Protocols for
Protein Experiments) 2-Structural Analysis Part", Shujunsha Co.,
Ltd.
[0249] Reference Document 11: ARISAKA, Fumio (2004),
"Bioscience no Tame no Tanpakusitsukagaku Nyumon (Introduction to
Protein Science for Bioscience)", Shokabo Co., Ltd.
[0250] Reference Document 12: T. Tamada, E. Honjo, Y. Maeda, T.
Okamoto, M. Ishibashi, M. Tokunaga, and R. Kuroki, Proc. Natl. Acad.
Sci. U.S.A., 2006, 103, 3135-40.
[0251] (SEQ ID NO Description List)
[0252] SEQ ID NO: 1: amino acid sequence of linear control G-CSF
[0253] SEQ ID NO: 2: amino acid sequence of G-CSF(C177)
[0254] SEQ ID NO: 3: amino acid sequence of G-CSF(C170)
[0255] SEQ ID NO: 4: amino acid sequence of G-CSF(C166)
[0256] SEQ ID NO: 5: amino acid sequence of G-CSF(C163)
[0257] SEQ ID NO: 6: amino acid sequence of cyclized G-CSF(C177)
[0258] SEQ ID NO: 7: amino acid sequence of cyclized G-CSF(C170)
[0259] SEQ ID NO: 8: amino acid sequence of cyclized G-CSF(C166)
[0260] SEQ ID NO: 9: amino acid sequence of cyclized G-CSF(C163)
[0261] SEQ ID NO: 10: amino acid sequence of G-CSF(C177) linked to
DnaE
[0262] SEQ ID NO: 11: amino acid sequence of cyclized G-CSF(C170)
linked to DnaE
[0263] SEQ ID NO: 12: amino acid sequence of cyclized G-CSF(C166)
linked to DnaE
[0264] SEQ ID NO: 13: amino acid sequence of G-CSF(C163) linked to
DnaE
[0265] SEQ ID NO: 14: amino acid sequence of filgrastim
[0266] SEQ ID NO: 15: amino acid sequence of DnaE-C
[0267] SEQ ID NO: 16: amino acid sequence of DnaE-N
[0268] SEQ ID NO: 17: nucleotide sequence of G-CSF(C177)
[0269] SEQ ID NO: 18: nucleotide sequence of G-CSF(C170)
[0270] SEQ ID NO: 19: nucleotide sequence of G-CSF(C166)
[0271] SEQ ID NO: 20: nucleotide sequence of G-CSF(C163)
[0272] SEQ ID NO: 21: nucleotide sequence of G-CSF(C177) linked to
DnaE
[0273] SEQ ID NO: 22: nucleotide sequence of cyclized G-CSF(C170)
linked to DnaE
[0274] SEQ ID NO: 23: nucleotide sequence of cyclized G-CSF(C166)
linked to DnaE
[0275] SEQ ID NO: 24: nucleotide sequence of G-CSF(C163) linked to
DnaE
[0276] SEQ ID NO: 25: nucleotide sequence of linear control G-CSF
[0277] SEQ ID NO: 26: primer sequence for amplifying the nucleotide
sequence of linear control G-CSF
[0278] SEQ ID NO: 27: primer sequence for amplifying the nucleotide
sequence of linear control G-CSF
[0279] SEQ ID NO: 28: nucleotide sequence encoding, in sequence,
DnaE-C and DnaE-N
[0280] SEQ ID NO: 29: primer sequence for amplifying the nucleotide
sequence of G-CSF(C177)
[0281] SEQ ID NO: 30: primer sequence for amplifying the nucleotide
sequence of G-CSF(C177)
[0282] SEQ ID NO: 31: primer sequence for amplifying the nucleotide
sequence of G-CSF(C170)
[0283] SEQ ID NO: 32: primer sequence for amplifying the nucleotide
sequence of G-CSF(C170)
[0284] SEQ ID NO: 33: primer sequence for amplifying the nucleotide
sequence of G-CSF(C166)
[0285] SEQ ID NO: 34: primer sequence for amplifying the nucleotide
sequence of G-CSF(C166)
[0286] SEQ ID NO: 35: primer sequence for amplifying the nucleotide
sequence of G-CSF(C163)
[0287] SEQ ID NO: 36: primer sequence for amplifying the nucleotide
sequence of G-CSF(C163)
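As a rough illustration of how the DnaE-linked sequences (SEQ ID NOs: 10 to 13) appear to be laid out, the sketch below concatenates DnaE-C, a Ser/Gly-capped G-CSF segment, and DnaE-N in that order. The helper names `extein` and `precursor` are hypothetical, the one-letter strings were transcribed by hand from SEQ ID NOs: 1, 15, and 16, and the deletion pairs are inferred from the listing rather than stated in this list.

```python
# One-letter transcriptions (made by hand for this sketch; verify before use).
SEQ_ID_1 = (  # linear control G-CSF, 175 residues
    "MAPLGPASSLPQSFLL" "KSLEQVRKIQGDGAAL" "QEKLCATYKLCHPEEL"
    "VLLGHSLGIPWAPLSS" "CPSQALQLAGCLSQLH" "SGLFLYQGLLQALEGI"
    "SPELGPTLDTLQLDVA" "DFATTIWQQMEELGMA" "PALQPTQGAMPAFASA"
    "FQRRAGGVLVASHLQS" "FLEVSYRVLRHLAQP"
)
DNAE_C = "MVKIATRKYLGKQNVYDIGVERDHNFALKNGFIASN"  # SEQ ID NO: 15, 36 residues
DNAE_N = (  # SEQ ID NO: 16, 112 residues including the C-terminal His tag
    "CLSYETEILTVEYGLL" "PIGKIVEKRIECTVYS" "VDNNGNIYTQPVAQWH"
    "DRGEQEVFEYCLEDGS" "LIRATKDHKFMTVDGQ" "MLPIDEIFERELDLMR"
    "VDNLPNSGSGHHHHHH"
)


def extein(n_del, c_del):
    """Truncated G-CSF capped with Ser (N-terminus) and Gly (C-terminus)."""
    core = SEQ_ID_1[n_del:len(SEQ_ID_1) - c_del]
    return "S" + core + "G"


def precursor(n_del, c_del):
    """DnaE-C, then the segment to be cyclized, then DnaE-N."""
    return DNAE_C + extein(n_del, c_del) + DNAE_N


# Deletion pairs inferred from SEQ ID NOs: 2 to 5 (an assumption of this sketch).
for name, (n_del, c_del) in {
    "C177": (0, 0), "C170": (4, 3), "C166": (8, 3), "C163": (11, 3)
}.items():
    print(name, len(extein(n_del, c_del)), len(precursor(n_del, c_del)))
```

Under these assumptions the precursor lengths come out as 325, 318, 314, and 311 residues, matching the lengths given in the listing for SEQ ID NOs: 10 to 13.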
Sequence CWU 1
1
361175PRTartificialG-CSF added M at N terminal and replaced
residues 2 and 18 to A and S 1Met Ala Pro Leu Gly Pro Ala Ser Ser
Leu Pro Gln Ser Phe Leu Leu 1 5 10 15 Lys Ser Leu Glu Gln Val Arg
Lys Ile Gln Gly Asp Gly Ala Ala Leu 20 25 30 Gln Glu Lys Leu Cys
Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu 35 40 45 Val Leu Leu
Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser 50 55 60 Cys
Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His 65 70
75 80 Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly
Ile 85 90 95 Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu
Asp Val Ala 100 105 110 Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu
Glu Leu Gly Met Ala 115 120 125 Pro Ala Leu Gln Pro Thr Gln Gly Ala
Met Pro Ala Phe Ala Ser Ala 130 135 140 Phe Gln Arg Arg Ala Gly Gly
Val Leu Val Ala Ser His Leu Gln Ser 145 150 155 160 Phe Leu Glu Val
Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro 165 170 175
2177PRTartificialG-CSF modified for cyclization 2Ser Met Ala Pro
Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu 1 5 10 15 Leu Lys
Ser Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala 20 25 30
Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu 35
40 45 Leu Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu
Ser 50 55 60 Ser Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu
Ser Gln Leu 65 70 75 80 His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu
Gln Ala Leu Glu Gly 85 90 95 Ile Ser Pro Glu Leu Gly Pro Thr Leu
Asp Thr Leu Gln Leu Asp Val 100 105 110 Ala Asp Phe Ala Thr Thr Ile
Trp Gln Gln Met Glu Glu Leu Gly Met 115 120 125 Ala Pro Ala Leu Gln
Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser 130 135 140 Ala Phe Gln
Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln 145 150 155 160
Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro 165
170 175 Gly 3170PRTartificialG-CSF modified for cyclization 3Ser
Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Ser Leu 1 5 10
15 Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys
20 25 30 Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu Val
Leu Leu 35 40 45 Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser
Ser Cys Pro Ser 50 55 60 Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser
Gln Leu His Ser Gly Leu 65 70 75 80 Phe Leu Tyr Gln Gly Leu Leu Gln
Ala Leu Glu Gly Ile Ser Pro Glu 85 90 95 Leu Gly Pro Thr Leu Asp
Thr Leu Gln Leu Asp Val Ala Asp Phe Ala 100 105 110 Thr Thr Ile Trp
Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu 115 120 125 Gln Pro
Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg 130 135 140
Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser Phe Leu Glu 145
150 155 160 Val Ser Tyr Arg Val Leu Arg His Leu Gly 165 170
4166PRTartificialG-CSF modified for cyclization 4Ser Ser Leu Pro
Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg 1 5 10 15 Lys Ile
Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr 20 25 30
Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu 35
40 45 Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu
Gln 50 55 60 Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu Phe
Leu Tyr Gln 65 70 75 80 Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro
Glu Leu Gly Pro Thr 85 90 95 Leu Asp Thr Leu Gln Leu Asp Val Ala
Asp Phe Ala Thr Thr Ile Trp 100 105 110 Gln Gln Met Glu Glu Leu Gly
Met Ala Pro Ala Leu Gln Pro Thr Gln 115 120 125 Gly Ala Met Pro Ala
Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly 130 135 140 Val Leu Val
Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg 145 150 155 160
Val Leu Arg His Leu Gly 165 5163PRTartificialG-CSF modified for
cyclization 5Ser Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg
Lys Ile Gln 1 5 10 15 Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys
Ala Thr Tyr Lys Leu 20 25 30 Cys His Pro Glu Glu Leu Val Leu Leu
Gly His Ser Leu Gly Ile Pro 35 40 45 Trp Ala Pro Leu Ser Ser Cys
Pro Ser Gln Ala Leu Gln Leu Ala Gly 50 55 60 Cys Leu Ser Gln Leu
His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu 65 70 75 80 Gln Ala Leu
Glu Gly Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr 85 90 95 Leu
Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met 100 105
110 Glu Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met
115 120 125 Pro Ala Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly Val
Leu Val 130 135 140 Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr
Arg Val Leu Arg 145 150 155 160 His Leu Gly
6177PRTartificialmodified G-CSF cyclized by peptide bond formation
between N and C terminals 6Ser Met Ala Pro Leu Gly Pro Ala Ser Ser
Leu Pro Gln Ser Phe Leu 1 5 10 15 Leu Lys Ser Leu Glu Gln Val Arg
Lys Ile Gln Gly Asp Gly Ala Ala 20 25 30 Leu Gln Glu Lys Leu Cys
Ala Thr Tyr Lys Leu Cys His Pro Glu Glu 35 40 45 Leu Val Leu Leu
Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser 50 55 60 Ser Cys
Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu 65 70 75 80
His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly 85
90 95 Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp
Val 100 105 110 Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu
Leu Gly Met 115 120 125 Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met
Pro Ala Phe Ala Ser 130 135 140 Ala Phe Gln Arg Arg Ala Gly Gly Val
Leu Val Ala Ser His Leu Gln 145 150 155 160 Ser Phe Leu Glu Val Ser
Tyr Arg Val Leu Arg His Leu Ala Gln Pro 165 170 175 Gly
7170PRTartificialmodified G-CSF cyclized by peptide bond formation
between N and C terminals 7Ser Gly Pro Ala Ser Ser Leu Pro Gln Ser
Phe Leu Leu Lys Ser Leu 1 5 10 15 Glu Gln Val Arg Lys Ile Gln Gly
Asp Gly Ala Ala Leu Gln Glu Lys 20 25 30 Leu Cys Ala Thr Tyr Lys
Leu Cys His Pro Glu Glu Leu Val Leu Leu 35 40 45 Gly His Ser Leu
Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser 50 55 60 Gln Ala
Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu 65 70 75 80
Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro Glu 85
90 95 Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala Asp Phe
Ala 100 105 110 Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
Pro Ala Leu 115 120 125 Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala
Ser Ala Phe Gln Arg 130 135 140 Arg Ala Gly Gly Val Leu Val Ala Ser
His Leu Gln Ser Phe Leu Glu 145 150 155 160 Val Ser Tyr Arg Val Leu
Arg His Leu Gly 165 170 8166PRTartificialmodified G-CSF cyclized by
peptide bond formation between N and C terminals 8Ser Ser Leu Pro
Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg 1 5 10 15 Lys Ile
Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr 20 25 30
Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu 35
40 45 Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu
Gln 50 55 60 Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu Phe
Leu Tyr Gln 65 70 75 80 Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro
Glu Leu Gly Pro Thr 85 90 95 Leu Asp Thr Leu Gln Leu Asp Val Ala
Asp Phe Ala Thr Thr Ile Trp 100 105 110 Gln Gln Met Glu Glu Leu Gly
Met Ala Pro Ala Leu Gln Pro Thr Gln 115 120 125 Gly Ala Met Pro Ala
Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly 130 135 140 Val Leu Val
Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg 145 150 155 160
Val Leu Arg His Leu Gly 165 9163PRTartificialmodified G-CSF
cyclized by peptide bond formation between N and C terminals 9Ser
Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg Lys Ile Gln 1 5 10
15 Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu
20 25 30 Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu Gly
Ile Pro 35 40 45 Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu
Gln Leu Ala Gly 50 55 60 Cys Leu Ser Gln Leu His Ser Gly Leu Phe
Leu Tyr Gln Gly Leu Leu 65 70 75 80 Gln Ala Leu Glu Gly Ile Ser Pro
Glu Leu Gly Pro Thr Leu Asp Thr 85 90 95 Leu Gln Leu Asp Val Ala
Asp Phe Ala Thr Thr Ile Trp Gln Gln Met 100 105 110 Glu Glu Leu Gly
Met Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met 115 120 125 Pro Ala
Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly Val Leu Val 130 135 140
Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg 145
150 155 160 His Leu Gly 10325PRTartificialmodified G-CSF fused with
DnaE-C and DnaE-N 10Met Val Lys Ile Ala Thr Arg Lys Tyr Leu Gly Lys
Gln Asn Val Tyr 1 5 10 15 Asp Ile Gly Val Glu Arg Asp His Asn Phe
Ala Leu Lys Asn Gly Phe 20 25 30 Ile Ala Ser Asn Ser Met Ala Pro
Leu Gly Pro Ala Ser Ser Leu Pro 35 40 45 Gln Ser Phe Leu Leu Lys
Ser Leu Glu Gln Val Arg Lys Ile Gln Gly 50 55 60 Asp Gly Ala Ala
Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys 65 70 75 80 His Pro
Glu Glu Leu Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp 85 90 95
Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys 100
105 110 Leu Ser Gln Leu His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu
Gln 115 120 125 Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro Thr Leu
Asp Thr Leu 130 135 140 Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile
Trp Gln Gln Met Glu 145 150 155 160 Glu Leu Gly Met Ala Pro Ala Leu
Gln Pro Thr Gln Gly Ala Met Pro 165 170 175 Ala Phe Ala Ser Ala Phe
Gln Arg Arg Ala Gly Gly Val Leu Val Ala 180 185 190 Ser His Leu Gln
Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His 195 200 205 Leu Ala
Gln Pro Gly Cys Leu Ser Tyr Glu Thr Glu Ile Leu Thr Val 210 215 220
Glu Tyr Gly Leu Leu Pro Ile Gly Lys Ile Val Glu Lys Arg Ile Glu 225
230 235 240 Cys Thr Val Tyr Ser Val Asp Asn Asn Gly Asn Ile Tyr Thr
Gln Pro 245 250 255 Val Ala Gln Trp His Asp Arg Gly Glu Gln Glu Val
Phe Glu Tyr Cys 260 265 270 Leu Glu Asp Gly Ser Leu Ile Arg Ala Thr
Lys Asp His Lys Phe Met 275 280 285 Thr Val Asp Gly Gln Met Leu Pro
Ile Asp Glu Ile Phe Glu Arg Glu 290 295 300 Leu Asp Leu Met Arg Val
Asp Asn Leu Pro Asn Ser Gly Ser Gly His 305 310 315 320 His His His
His His 325 11318PRTartificialmodified G-CSF fused with DnaE-C and
DnaE-N 11Met Val Lys Ile Ala Thr Arg Lys Tyr Leu Gly Lys Gln Asn
Val Tyr 1 5 10 15 Asp Ile Gly Val Glu Arg Asp His Asn Phe Ala Leu
Lys Asn Gly Phe 20 25 30 Ile Ala Ser Asn Ser Gly Pro Ala Ser Ser
Leu Pro Gln Ser Phe Leu 35 40 45 Leu Lys Ser Leu Glu Gln Val Arg
Lys Ile Gln Gly Asp Gly Ala Ala 50 55 60 Leu Gln Glu Lys Leu Cys
Ala Thr Tyr Lys Leu Cys His Pro Glu Glu 65 70 75 80 Leu Val Leu Leu
Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser 85 90 95 Ser Cys
Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu 100 105 110
His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly 115
120 125 Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp
Val 130 135 140 Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu
Leu Gly Met 145 150 155 160 Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala
Met Pro Ala Phe Ala Ser 165 170 175 Ala Phe Gln Arg Arg Ala Gly Gly
Val Leu Val Ala Ser His Leu Gln 180 185 190 Ser Phe Leu Glu Val Ser
Tyr Arg Val Leu Arg His Leu Gly Cys Leu 195 200 205 Ser Tyr Glu Thr
Glu Ile Leu Thr Val Glu Tyr Gly Leu Leu Pro Ile 210 215 220 Gly Lys
Ile Val Glu Lys Arg Ile Glu Cys Thr Val Tyr Ser Val Asp 225 230 235
240 Asn Asn Gly Asn Ile Tyr Thr Gln Pro Val Ala Gln Trp His Asp Arg
245 250 255 Gly Glu Gln Glu Val Phe Glu Tyr Cys Leu Glu Asp Gly Ser
Leu Ile 260 265 270 Arg Ala Thr Lys Asp His Lys Phe Met Thr Val Asp
Gly Gln Met Leu 275 280 285 Pro Ile Asp Glu Ile Phe Glu Arg Glu Leu
Asp Leu Met Arg Val Asp 290 295 300 Asn Leu Pro Asn Ser Gly Ser Gly
His His His His His His 305 310 315 12314PRTartificialmodified
G-CSF fused with DnaE-C and DnaE-N 12Met Val Lys Ile Ala Thr Arg
Lys Tyr Leu Gly Lys Gln Asn Val Tyr 1 5 10 15 Asp Ile
Gly Val Glu Arg Asp His Asn Phe Ala Leu Lys Asn Gly Phe 20 25 30
Ile Ala Ser Asn Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Ser Leu 35
40 45 Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu
Lys 50 55 60 Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
Val Leu Leu 65 70 75 80 Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu
Ser Ser Cys Pro Ser 85 90 95 Gln Ala Leu Gln Leu Ala Gly Cys Leu
Ser Gln Leu His Ser Gly Leu 100 105 110 Phe Leu Tyr Gln Gly Leu Leu
Gln Ala Leu Glu Gly Ile Ser Pro Glu 115 120 125 Leu Gly Pro Thr Leu
Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala 130 135 140 Thr Thr Ile
Trp Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu 145 150 155 160
Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg 165
170 175 Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser Phe Leu
Glu 180 185 190 Val Ser Tyr Arg Val Leu Arg His Leu Gly Cys Leu Ser
Tyr Glu Thr 195 200 205 Glu Ile Leu Thr Val Glu Tyr Gly Leu Leu Pro
Ile Gly Lys Ile Val 210 215 220 Glu Lys Arg Ile Glu Cys Thr Val Tyr
Ser Val Asp Asn Asn Gly Asn 225 230 235 240 Ile Tyr Thr Gln Pro Val
Ala Gln Trp His Asp Arg Gly Glu Gln Glu 245 250 255 Val Phe Glu Tyr
Cys Leu Glu Asp Gly Ser Leu Ile Arg Ala Thr Lys 260 265 270 Asp His
Lys Phe Met Thr Val Asp Gly Gln Met Leu Pro Ile Asp Glu 275 280 285
Ile Phe Glu Arg Glu Leu Asp Leu Met Arg Val Asp Asn Leu Pro Asn 290
295 300 Ser Gly Ser Gly His His His His His His 305 310
13311PRTartificialmodified G-CSF fused with DnaE-C and DnaE-N 13Met
Val Lys Ile Ala Thr Arg Lys Tyr Leu Gly Lys Gln Asn Val Tyr 1 5 10
15 Asp Ile Gly Val Glu Arg Asp His Asn Phe Ala Leu Lys Asn Gly Phe
20 25 30 Ile Ala Ser Asn Ser Gln Ser Phe Leu Leu Lys Ser Leu Glu
Gln Val 35 40 45 Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu
Lys Leu Cys Ala 50 55 60 Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
Val Leu Leu Gly His Ser 65 70 75 80 Leu Gly Ile Pro Trp Ala Pro Leu
Ser Ser Cys Pro Ser Gln Ala Leu 85 90 95 Gln Leu Ala Gly Cys Leu
Ser Gln Leu His Ser Gly Leu Phe Leu Tyr 100 105 110 Gln Gly Leu Leu
Gln Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro 115 120 125 Thr Leu
Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile 130 135 140
Trp Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr 145
150 155 160 Gln Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg Arg
Ala Gly 165 170 175 Gly Val Leu Val Ala Ser His Leu Gln Ser Phe Leu
Glu Val Ser Tyr 180 185 190 Arg Val Leu Arg His Leu Gly Cys Leu Ser
Tyr Glu Thr Glu Ile Leu 195 200 205 Thr Val Glu Tyr Gly Leu Leu Pro
Ile Gly Lys Ile Val Glu Lys Arg 210 215 220 Ile Glu Cys Thr Val Tyr
Ser Val Asp Asn Asn Gly Asn Ile Tyr Thr 225 230 235 240 Gln Pro Val
Ala Gln Trp His Asp Arg Gly Glu Gln Glu Val Phe Glu 245 250 255 Tyr
Cys Leu Glu Asp Gly Ser Leu Ile Arg Ala Thr Lys Asp His Lys 260 265
270 Phe Met Thr Val Asp Gly Gln Met Leu Pro Ile Asp Glu Ile Phe Glu
275 280 285 Arg Glu Leu Asp Leu Met Arg Val Asp Asn Leu Pro Asn Ser
Gly Ser 290 295 300 Gly His His His His His His 305 310
14175PRTartificialG-CSF added M at N terminal 14Met Thr Pro Leu Gly
Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu 1 5 10 15 Lys Cys Leu
Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu 20 25 30 Gln
Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu 35 40
45 Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
50 55 60 Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln
Leu His 65 70 75 80 Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala
Leu Glu Gly Ile 85 90 95 Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr
Leu Gln Leu Asp Val Ala 100 105 110 Asp Phe Ala Thr Thr Ile Trp Gln
Gln Met Glu Glu Leu Gly Met Ala 115 120 125 Pro Ala Leu Gln Pro Thr
Gln Gly Ala Met Pro Ala Phe Ala Ser Ala 130 135 140 Phe Gln Arg Arg
Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser 145 150 155 160 Phe
Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro 165 170 175
1536PRTartificialDnaE-C 15Met Val Lys Ile Ala Thr Arg Lys Tyr Leu
Gly Lys Gln Asn Val Tyr 1 5 10 15 Asp Ile Gly Val Glu Arg Asp His
Asn Phe Ala Leu Lys Asn Gly Phe 20 25 30 Ile Ala Ser Asn 35
16112PRTartificialDnaE-N 16Cys Leu Ser Tyr Glu Thr Glu Ile Leu Thr
Val Glu Tyr Gly Leu Leu 1 5 10 15 Pro Ile Gly Lys Ile Val Glu Lys
Arg Ile Glu Cys Thr Val Tyr Ser 20 25 30 Val Asp Asn Asn Gly Asn
Ile Tyr Thr Gln Pro Val Ala Gln Trp His 35 40 45 Asp Arg Gly Glu
Gln Glu Val Phe Glu Tyr Cys Leu Glu Asp Gly Ser 50 55 60 Leu Ile
Arg Ala Thr Lys Asp His Lys Phe Met Thr Val Asp Gly Gln 65 70 75 80
Met Leu Pro Ile Asp Glu Ile Phe Glu Arg Glu Leu Asp Leu Met Arg 85
90 95 Val Asp Asn Leu Pro Asn Ser Gly Ser Gly His His His His His
His 100 105 110 17531DNAartificialG-CSF modified for cyclization
17agcatggcac cattaggtcc agcgagcagc ctgccgcaga gctttctgct gaaaagcctg
60gaacaggtgc gtaaaattca gggtgatggt gcggcgctgc aagaaaaact gtgcgcgacc
120tataaactgt gccatccgga agagctggtg ctgctgggcc atagcctggg
tattccgtgg 180gcaccgctgt ctagctgtcc gagccaggcg ctgcaactgg
ccggttgtct gagccagctg 240catagcggcc tgtttctgta tcagggcctg
ctgcaagcgc tggaaggcat tagcccggag 300ctgggcccga ctctggatac
cctgcaactg gatgtggcgg attttgcgac caccatttgg 360cagcagatgg
aagagctggg catggcaccg gcgctgcaac cgacccaggg tgccatgccg
420gcgtttgcga gcgcgtttca gcgtcgtgcg ggcggtgttc tggtggcgag
ccatctgcaa 480tcttttctgg aagtgagcta tcgtgtgctg cgtcatctgg
cccagccggg c 53118510DNAartificialG-CSF modified for cyclization
18agcggtccag cgagcagcct gccgcagagc tttctgctga aaagcctgga acaggtgcgt
60aaaattcagg gtgatggtgc ggcgctgcaa gaaaaactgt gcgcgaccta taaactgtgc
120catccggaag agctggtgct gctgggccat agcctgggta ttccgtgggc
accgctgtct 180agctgtccga gccaggcgct gcaactggcc ggttgtctga
gccagctgca tagcggcctg 240tttctgtatc agggcctgct gcaagcgctg
gaaggcatta gcccggagct gggcccgact 300ctggataccc tgcaactgga
tgtggcggat tttgcgacca ccatttggca gcagatggaa 360gagctgggca
tggcaccggc gctgcaaccg acccagggtg ccatgccggc gtttgcgagc
420gcgtttcagc gtcgtgcggg cggtgttctg gtggcgagcc atctgcaatc
ttttctggaa 480gtgagctatc gtgtgctgcg tcatctgggc
51019498DNAartificialG-CSF modified for cyclization 19agcagcctgc
cgcagagctt tctgctgaaa agcctggaac aggtgcgtaa aattcagggt 60gatggtgcgg
cgctgcaaga aaaactgtgc gcgacctata aactgtgcca tccggaagag
120ctggtgctgc tgggccatag cctgggtatt ccgtgggcac cgctgtctag
ctgtccgagc 180caggcgctgc aactggccgg ttgtctgagc cagctgcata
gcggcctgtt tctgtatcag 240ggcctgctgc aagcgctgga aggcattagc
ccggagctgg gcccgactct ggataccctg 300caactggatg tggcggattt
tgcgaccacc atttggcagc agatggaaga gctgggcatg 360gcaccggcgc
tgcaaccgac ccagggtgcc atgccggcgt ttgcgagcgc gtttcagcgt
420cgtgcgggcg gtgttctggt ggcgagccat ctgcaatctt ttctggaagt
gagctatcgt 480gtgctgcgtc atctgggc 49820489DNAartificialG-CSF
modified for cyclization 20agccagagct ttctgctgaa aagcctggaa
caggtgcgta aaattcaggg tgatggtgcg 60gcgctgcaag aaaaactgtg cgcgacctat
aaactgtgcc atccggaaga gctggtgctg 120ctgggccata gcctgggtat
tccgtgggca ccgctgtcta gctgtccgag ccaggcgctg 180caactggccg
gttgtctgag ccagctgcat agcggcctgt ttctgtatca gggcctgctg
240caagcgctgg aaggcattag cccggagctg ggcccgactc tggataccct
gcaactggat 300gtggcggatt ttgcgaccac catttggcag cagatggaag
agctgggcat ggcaccggcg 360ctgcaaccga cccagggtgc catgccggcg
tttgcgagcg cgtttcagcg tcgtgcgggc 420ggtgttctgg tggcgagcca
tctgcaatct tttctggaag tgagctatcg tgtgctgcgt 480catctgggc
48921978DNAartificialmodified G-CSF fused with DnaE-C and DnaE-N
21atggtgaaaa tagccacacg caaatatctg ggcaaacaga acgtgtatga tattggcgtg
60gaacgcgatc ataactttgc gctgaaaaac ggcttcatag ctagcaatag catggcacca
120ttaggtccag cgagcagcct gccgcagagc tttctgctga aaagcctgga
acaggtgcgt 180aaaattcagg gtgatggtgc ggcgctgcaa gaaaaactgt
gcgcgaccta taaactgtgc 240catccggaag agctggtgct gctgggccat
agcctgggta ttccgtgggc accgctgtct 300agctgtccga gccaggcgct
gcaactggcc ggttgtctga gccagctgca tagcggcctg 360tttctgtatc
agggcctgct gcaagcgctg gaaggcatta gcccggagct gggcccgact
420ctggataccc tgcaactgga tgtggcggat tttgcgacca ccatttggca
gcagatggaa 480gagctgggca tggcaccggc gctgcaaccg acccagggtg
ccatgccggc gtttgcgagc 540gcgtttcagc gtcgtgcggg cggtgttctg
gtggcgagcc atctgcaatc ttttctggaa 600gtgagctatc gtgtgctgcg
tcatctggcc cagccgggct gtttatcata tgaaacggaa 660atattgaccg
tggaatatgg cctgctgccg attggcaaaa ttgtggaaaa acgcattgaa
720tgcaccgtgt atagcgtgga taacaacggc aacatttata cccagccggt
ggcgcagtgg 780catgatcgcg gcgaacagga agtgtttgaa tattgcctgg
aagatggcag cctgattcgc 840gcgaccaaag atcataaatt tatgaccgtg
gatggccaga tgctgccgat tgatgaaatt 900tttgaacgcg aactggatct
gatgcgggtt gataatttgc cgaatagcgg cagcggccat 960caccatcacc atcactaa
97822957DNAartificialmodified G-CSF fused with DnaE-C and DnaE-N
22atggtgaaaa tagccacacg caaatatctg ggcaaacaga acgtgtatga tattggcgtg
60gaacgcgatc ataactttgc gctgaaaaac ggcttcatag ctagcaatag cggtccagcg
120agcagcctgc cgcagagctt tctgctgaaa agcctggaac aggtgcgtaa
aattcagggt 180gatggtgcgg cgctgcaaga aaaactgtgc gcgacctata
aactgtgcca tccggaagag 240ctggtgctgc tgggccatag cctgggtatt
ccgtgggcac cgctgtctag ctgtccgagc 300caggcgctgc aactggccgg
ttgtctgagc cagctgcata gcggcctgtt tctgtatcag 360ggcctgctgc
aagcgctgga aggcattagc ccggagctgg gcccgactct ggataccctg
420caactggatg tggcggattt tgcgaccacc atttggcagc agatggaaga
gctgggcatg 480gcaccggcgc tgcaaccgac ccagggtgcc atgccggcgt
ttgcgagcgc gtttcagcgt 540cgtgcgggcg gtgttctggt ggcgagccat
ctgcaatctt ttctggaagt gagctatcgt 600gtgctgcgtc atctgggctg
tttatcatat gaaacggaaa tattgaccgt ggaatatggc 660ctgctgccga
ttggcaaaat tgtggaaaaa cgcattgaat gcaccgtgta tagcgtggat
720aacaacggca acatttatac ccagccggtg gcgcagtggc atgatcgcgg
cgaacaggaa 780gtgtttgaat attgcctgga agatggcagc ctgattcgcg
cgaccaaaga tcataaattt 840atgaccgtgg atggccagat gctgccgatt
gatgaaattt ttgaacgcga actggatctg 900atgcgggttg ataatttgcc
gaatagcggc agcggccatc accatcacca tcactaa
95723945DNAartificialmodified G-CSF fused with DnaE-C and DnaE-N
23atggtgaaaa tagccacacg caaatatctg ggcaaacaga acgtgtatga tattggcgtg
60gaacgcgatc ataactttgc gctgaaaaac ggcttcatag ctagcaatag cagcctgccg
120cagagctttc tgctgaaaag cctggaacag gtgcgtaaaa ttcagggtga
tggtgcggcg 180ctgcaagaaa aactgtgcgc gacctataaa ctgtgccatc
cggaagagct ggtgctgctg 240ggccatagcc tgggtattcc gtgggcaccg
ctgtctagct gtccgagcca ggcgctgcaa 300ctggccggtt gtctgagcca
gctgcatagc ggcctgtttc tgtatcaggg cctgctgcaa 360gcgctggaag
gcattagccc ggagctgggc ccgactctgg ataccctgca actggatgtg
420gcggattttg cgaccaccat ttggcagcag atggaagagc tgggcatggc
accggcgctg 480caaccgaccc agggtgccat gccggcgttt gcgagcgcgt
ttcagcgtcg tgcgggcggt 540gttctggtgg cgagccatct gcaatctttt
ctggaagtga gctatcgtgt gctgcgtcat 600ctgggctgtt tatcatatga
aacggaaata ttgaccgtgg aatatggcct gctgccgatt 660ggcaaaattg
tggaaaaacg cattgaatgc accgtgtata gcgtggataa caacggcaac
720atttataccc agccggtggc gcagtggcat gatcgcggcg aacaggaagt
gtttgaatat 780tgcctggaag atggcagcct gattcgcgcg accaaagatc
ataaatttat gaccgtggat 840ggccagatgc tgccgattga tgaaattttt
gaacgcgaac tggatctgat gcgggttgat 900aatttgccga atagcggcag
cggccatcac catcaccatc actaa 94524936DNAartificialmodified G-CSF
fused with DnaE-C and DnaE-N 24atggtgaaaa tagccacacg caaatatctg
ggcaaacaga acgtgtatga tattggcgtg 60gaacgcgatc ataactttgc gctgaaaaac
ggcttcatag ctagcaatag ccagagcttt 120ctgctgaaaa gcctggaaca
ggtgcgtaaa attcagggtg atggtgcggc gctgcaagaa 180aaactgtgcg
cgacctataa actgtgccat ccggaagagc tggtgctgct gggccatagc
240ctgggtattc cgtgggcacc gctgtctagc tgtccgagcc aggcgctgca
actggccggt 300tgtctgagcc agctgcatag cggcctgttt ctgtatcagg
gcctgctgca agcgctggaa 360ggcattagcc cggagctggg cccgactctg
gataccctgc aactggatgt ggcggatttt 420gcgaccacca tttggcagca
gatggaagag ctgggcatgg caccggcgct gcaaccgacc 480cagggtgcca
tgccggcgtt tgcgagcgcg tttcagcgtc gtgcgggcgg tgttctggtg
540gcgagccatc tgcaatcttt tctggaagtg agctatcgtg tgctgcgtca
tctgggctgt 600ttatcatatg aaacggaaat attgaccgtg gaatatggcc
tgctgccgat tggcaaaatt 660gtggaaaaac gcattgaatg caccgtgtat
agcgtggata acaacggcaa catttatacc 720cagccggtgg cgcagtggca
tgatcgcggc gaacaggaag tgtttgaata ttgcctggaa 780gatggcagcc
tgattcgcgc gaccaaagat cataaattta tgaccgtgga tggccagatg
840ctgccgattg atgaaatttt tgaacgcgaa ctggatctga tgcgggttga
taatttgccg 900aatagcggca gcggccatca ccatcaccat cactaa
93625525DNAartificialG-CSF added M at N terminal and replaced
residues 2 and 18 to A and S 25atggcaccat taggtccagc gagcagcctg
ccgcagagct ttctgctgaa aagcctggaa 60caggtgcgta aaattcaggg tgatggtgcg
gcgctgcaag aaaaactgtg cgcgacctat 120aaactgtgcc atccggaaga
gctggtgctg ctgggccata gcctgggtat tccgtgggca 180ccgctgtcta
gctgtccgag ccaggcgctg caactggccg gttgtctgag ccagctgcat
240agcggcctgt ttctgtatca gggcctgctg caagcgctgg aaggcattag
cccggagctg 300ggcccgactc tggataccct gcaactggat gtggcggatt
ttgcgaccac catttggcag 360cagatggaag agctgggcat ggcaccggcg
ctgcaaccga cccagggtgc catgccggcg 420tttgcgagcg cgtttcagcg
tcgtgcgggc ggtgttctgg tggcgagcca tctgcaatct 480tttctggaag
tgagctatcg tgtgctgcgt catctggccc agccg 5252648DNAartificialprimer
26tataccatgg caccattagg tccagcgagc agcctgccgc agagcttt
482725DNAartificialprimer 27atggatcctt acggctgggc cagat
2528477DNAartificialcoding continuously DnaE-C and -N derived from
Nostoc punctiforme 28atggtgaaaa tagccacacg caaatatctg ggcaaacaga
acgtgtatga tattggcgtg 60gaacgcgatc ataactttgc gctgaaaaac ggcttcatag
ctagcaattg ttttaacaaa 120agccattttg cggaatattg tttatcatat
gaaacggaaa tattgaccgt ggaatatggc 180ctgctgccga ttggcaaaat
tgtggaaaaa cgcattgaat gcaccgtgta tagcgtggat 240aacaacggca
acatttatac ccagccggtg gcgcagtggc atgatcgcgg cgaacaggaa
300gtgtttgaat attgcctgga agatggcagc ctgattcgcg cgaccaaaga
tcataaattt 360atgaccgtgg atggccagat gctgccgatt gatgaaattt
ttgaacgcga actggatctg 420atgcgggttg ataatttgcc gaatagcggc
agcggccatc accatcacca tcactaa 4772933DNAartificialprimer
29aaagctagca atagcatggc accattaggt cca 333038DNAartificialprimer
30aaaaattcat atgataaaca gcccggctgg gccagatg
383130DNAartificialprimer 31aaagctagca atagcggtcc agcgagcagc
303241DNAartificialprimer 32aaaaattcat atgataaaca gcccagatga
cgcagcacac g 413327DNAartificialprimer 33aaagctagca atagcagcct
gccgcag 273445DNAartificialprimer 34aaaaattcat atgataaaca
gcccagatga cgcagcacac gatag 453536DNAartificialprimer 35aaaaaagcta
gcaatagcca gagctttctg ctgaaa 363638DNAartificialprimer 36aaaaattcat
atgataaaca gcccagatga cgcagcac 38
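The relationship between the nucleotide and amino acid sequences in the listing can be spot-checked with a small translation sketch. It assumes the standard genetic code and that the position numbers and spaces of the flattened listing have been stripped by hand; the `translate` function is a hypothetical helper, not part of the application.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering ('*' marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna):
    """Translate a DNA string read in frame from its first base."""
    dna = "".join(dna.split()).upper()  # drop the listing's block spacing
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 30 bases of SEQ ID NO: 17 (G-CSF(C177)).
print(translate("agcatggcac cattaggtcc agcgagcagc"))  # SMAPLGPASS
# Last 21 bases of SEQ ID NO: 21 (His tag followed by a stop codon).
print(translate("catcaccatcaccatcactaa"))              # HHHHHH*
```

The first output agrees with the opening residues of SEQ ID NO: 2 (Ser Met Ala Pro Leu Gly Pro Ala Ser Ser), and the second with the hexahistidine tail of SEQ ID NO: 10.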
* * * * *
References