Cyclized Cytokine And Method For Producing Same Honda; Shinya ; et al. [National Institute of Advanced Industrial Science and Technology]

Patent Application Summary

U.S. patent application number 15/566365 for cyclized cytokine and method for producing same was published on 2018-05-03. The applicant listed for this patent is National Institute of Advanced Industrial Science and Technology. Invention is credited to Shinya Honda, Takamitsu Miyafusa.

Publication Number	20180118799
Application Number	15/566365
Family ID	57127130
Publication Date	2018-05-03

United States Patent Application	20180118799
Kind Code	A1
Honda; Shinya ; et al.	May 3, 2018

CYCLIZED CYTOKINE AND METHOD FOR PRODUCING SAME

Abstract

The purpose of the present invention is to provide a method for producing a very stable, cyclized mutant protein such that high cyclization efficiency is achieved while the number of amino acids added is minimal and the biological properties of an original protein are maintained. In view of conformational information about the original protein, secondary structure-free regions at N/C terminal portions are deleted. Then, a protein database is screened for proteins with secondary structures similar to those of N/C terminal residues of a secondary structure-forming portion after the deletion. The screening results are used to determine the amino acid length of a loop structure through which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are to be connected. A cyclized mutant protein is finally produced having a loop structure with the determined amino acid length.
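As an illustrative sketch only (not the applicants' implementation), the deletion of secondary structure-free terminal regions can be pictured with a per-residue secondary-structure string. The function name `trim_unstructured_termini`, the DSSP-style one-letter codes, and the toy sequence are assumptions made for this example.

```python
# Hypothetical sketch: trim terminal residues that carry no secondary structure.
# Assumes a DSSP-style per-residue string where 'H' = helix, 'E' = strand,
# and '-' = no regular secondary structure. Not taken from the application.
from typing import Tuple

STRUCTURED = {"H", "E"}


def trim_unstructured_termini(sequence: str, sec_struct: str) -> Tuple[str, int, int]:
    """Return (core sequence, residues cut from N-terminus, residues cut from C-terminus)."""
    if len(sequence) != len(sec_struct):
        raise ValueError("sequence and secondary-structure string must have equal length")
    structured = [i for i, code in enumerate(sec_struct) if code in STRUCTURED]
    if not structured:
        raise ValueError("no secondary structure found")
    first, last = structured[0], structured[-1]
    return sequence[first:last + 1], first, len(sequence) - 1 - last


# Toy input (made up for illustration; this is not G-CSF).
core, n_cut, c_cut = trim_unstructured_termini("MTPLGAAKKLLEEAGS", "---HHHHHHHHHH---")
print(core, n_cut, c_cut)
```

The new termini of the returned core are the positions that the loop of determined length would then connect.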

Inventors:

Honda; Shinya; (Tsukuba-shi, JP) ; Miyafusa; Takamitsu; (Tsukuba-shi, JP)

Applicant:

Name	City	State	Country	Type
National Institute of Advanced Industrial Science and Technology	Tokyo		JP

Family ID:

57127130

Appl. No.:

15/566365

Filed:

April 13, 2016

PCT Filed:

April 13, 2016

PCT NO:

PCT/JP2016/061923

371 Date:

October 13, 2017

Current U.S. Class:	1/1
Current CPC Class:	A61K 38/22 20130101; C12N 2840/445 20130101; A61P 7/00 20180101; C07K 14/535 20130101; C12N 15/70 20130101; C07K 1/02 20130101; C12N 5/10 20130101; A61K 38/00 20130101; C07K 19/00 20130101; C12P 21/02 20130101
International Class:	C07K 14/535 20060101 C07K014/535; A61P 7/00 20060101 A61P007/00; C12N 15/70 20060101 C12N015/70

Foreign Application Data

Date	Code	Application Number
Apr 13, 2015	JP	2015-082018

Claims

1. A method for producing a cyclized mutant protein with biological properties not less than those of an original protein and with stability higher than that of the original protein by modifying, through the following steps (i) to (v), the original protein with a specific three-dimensional structure, the method comprising the steps of: (i) determining, based on conformational information about the original protein, secondary structure-free regions at an N-terminal portion and a C-terminal portion of the original protein and determining an N-terminus and a C-terminus of a secondary structure-forming portion of the original protein if the secondary structure-free regions are deleted; (ii) screening a protein database for proteins with secondary structures similar to those of N-terminal residues and C-terminal residues of the secondary structure-forming portion; (iii) determining, based on results of the screening, an amino acid length of a loop structure through which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are to be connected; (iv) designing and producing a linear mutant protein that can be subjected to cyclization so as to obtain a cyclized mutant protein in which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are connected via the loop structure with the amino acid length determined; and (v) linking an N-terminus and a C-terminus of the linear mutant protein by a chemical or biological method so that the cyclization is made to obtain the cyclized mutant protein.

2. The method for producing a cyclized mutant protein according to claim 1, wherein the cyclization of step (v) uses a trans-splicing reaction mediated by split inteins.

3. The method for producing a cyclized mutant protein according to claim 1, wherein the original protein with a specific three-dimensional structure is a cytokine with a helix bundle structure and the biological properties of the original protein involve affinity for a receptor.

4. The method for producing a cyclized mutant protein according to claim 3, wherein the original protein with a specific three-dimensional structure is granulocyte colony-stimulating factor (G-CSF).

5. (canceled)

6. A cyclized mutant protein consisting of an amino acid sequence in which secondary structure-free regions at an N-terminal portion and a C-terminal portion of an original protein with a specific three-dimensional structure are deleted and a predetermined number of amino acid residues is added to each of an N-terminus and a C-terminus of a secondary structure-forming portion of the original protein after the deletion, wherein the predetermined number of amino acid residues is the number of amino acid residues, the number corresponding to the number of amino acids connecting secondary structures of N-terminal residues and C-terminal residues of a secondary structure-forming portion of another protein, the secondary structures being similar to those of the secondary structure-forming portion of the original protein.

7. The cyclized mutant protein according to claim 6, wherein the original protein with a specific three-dimensional structure is a cytokine with a helix bundle structure.

8. The cyclized mutant protein according to claim 6, wherein the original protein with a specific three-dimensional structure is a granulocyte colony-stimulating factor (G-CSF) consisting of an amino acid sequence set forth in SEQ ID NO: 1; 0 to 11 amino acid residues are deleted from an N-terminus of the amino acid sequence set forth in SEQ ID NO: 1; 0 to 3 amino acid residues are deleted from a C-terminus of the amino acid sequence; a serine or cysteine residue is added to an N-terminus and/or a C-terminus of the amino acid sequence after the deletion; and an amino acid residue other than proline is added to the C-terminus of the amino acid sequence after the deletion.

9. The cyclized mutant protein according to claim 6, wherein the protein consists of a cyclized amino acid sequence set forth in any one of SEQ ID NOs: 6 to 9.

10. A linear mutant protein designed so as to produce the cyclized mutant protein according to claim 6, wherein the linear mutant protein consists of an amino acid sequence in which secondary structure-free regions at an N-terminal portion and a C-terminal portion of an original protein with a specific three-dimensional structure are deleted and a predetermined number of amino acid residues is added to each of an N-terminus and a C-terminus of a secondary structure-forming portion of the original protein after the deletion; the predetermined number of amino acid residues is the number of amino acid residues, the number corresponding to the number of amino acids connecting secondary structures of N-terminal residues and C-terminal residues of a secondary structure-forming portion of another protein, the secondary structures being similar to those of the secondary structure-forming portion of the original protein; and a most N-terminus amino acid residue and a most C-terminus amino acid residue of the predetermined number of amino acid residues added are amino acid residues that enable the linear mutant protein to be subject to cyclization.

11. The linear mutant protein according to claim 10, wherein the protein is designed so as to produce the cyclized mutant protein that consists of a cyclized amino acid sequence set forth in any one of SEQ ID NOs: 6 to 9 and consists of an amino acid sequence set forth in any one of SEQ ID NOs: 2 to 5.

12. The linear mutant protein according to claim 10, wherein the protein is designed so as to produce the cyclized mutant protein that consists of a cyclized amino acid sequence set forth in any one of SEQ ID NOs: 6 to 9 by using, as split inteins, DnaE-C (C-intein) and DnaE-N (N-intein) of DnaE derived from Nostoc punctiforme and consists of an amino acid sequence set forth in any one of SEQ ID NOs: 10 to 13.

13. A nucleic acid comprising a nucleotide sequence encoding an amino acid sequence of the mutant protein according to claim 6.

14. A recombinant vector comprising the nucleic acid according to claim 13.

15. A transformant comprising the recombinant vector according to claim 14.

16. A method for producing the mutant protein according to claim 6, comprising using a transformant, said transformant comprising a recombinant vector, said recombinant vector comprising a nucleic acid, said nucleic acid comprising a nucleotide sequence encoding an amino acid sequence of the mutant protein according to claim 6.

17. A pharmaceutical composition comprising, as an active ingredient, the mutant protein according to claim 6.

18. A pharmaceutical composition for treatment of neutropenia, comprising, as an active ingredient, the mutant protein according to claim 6.

19. The cyclized mutant protein according to claim 7, wherein the cytokine is selected from granulocyte colony-stimulating factor, erythropoietin, and interferon α.

20. A method for producing the cyclized mutant protein of claim 6, comprising the steps of: (i) determining, based on conformational information about the original protein, secondary structure-free regions at an N-terminal portion and a C-terminal portion of the original protein and determining an N-terminus and a C-terminus of a secondary structure-forming portion of the original protein if the secondary structure-free regions are deleted; (ii) screening a protein database for proteins with secondary structures similar to those of N-terminal residues and C-terminal residues of the secondary structure-forming portion; (iii) determining, based on results of the screening, an amino acid length of a loop structure through which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are to be connected; (iv) designing and producing a linear mutant protein that can be subjected to cyclization so as to obtain a cyclized mutant protein in which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are connected via the loop structure with the amino acid length determined; and (v) linking an N-terminus and a C-terminus of the linear mutant protein by a chemical or biological method so that the cyclization is made to obtain the cyclized mutant protein, to thereby produce the cyclized mutant protein of claim 6.

Description

TECHNICAL FIELD

[0001] The present invention relates to novel improved versions of a pharmaceutically bioactive protein, preferably a cytokine with a helix bundle conformation, and more preferably granulocyte colony-stimulating factor (G-CSF), which acts to promote production of granulocytes and to enhance functions of neutrophils, nucleic acids encoding the improved proteins, and a production method therefor.

BACKGROUND ART

(Protein Instability Problem When Used as Pharmaceutical)

[0002] When administered for treatment of humans, a bioactive protein often has an undesirable metabolic half-life. This intrinsic metabolic half-life frequently causes suboptimal therapeutic efficacy, medication compliance problems, and dosing schedules and administration regimens that are inconvenient for patients.

(Conventional Major Stability-improving Strategies and Their Problems)

[0003] In the production of a bioactive protein for treatment of humans, its metabolic half-life has been extended by physical means (e.g., alteration of an administration route, nanoparticle encapsulation, and liposome encapsulation), chemical modifications (e.g., emulsion, PEGylation, glycosylation), and modifications by genetic engineering (e.g., amino acid substitution, fusion to human serum albumin, incorporation of post-translational glycosylation) (Non-Patent Literatures 1 and 2). Unfortunately, most such approaches are expensive and entail additional time-consuming processes during production. In addition, PEG itself is immunogenic, so that anti-PEG antibodies are produced, resulting in a decrease in drug efficiency of a PEGylated protein and a decrease in safety, as disclosed in Non-Patent Literature 3.

(Cyclization-mediated, Stability-improving Approaches and Their Problems)

[0004] Bioactive peptides generally have a markedly shorter metabolic half-life than regular proteins. The metabolic half-life has been extended by cyclization of certain linear peptides (Patent Literatures 1 and 2). In various conventional methods for producing a cyclized peptide, naturally occurring cyclized peptides have been mimicked and a biological process and a chemical process have been used in combination. For instance, cysteine residues have been introduced by chemical modification or genetic engineering modification so as to give a disulfide bond (Patent Literature 1). In another technology, a linear precursor molecule is produced in a cell (e.g., a bacterium); and an exogenous factor (e.g., an enzyme) is then added to make the linear precursor molecule cyclic by using a chemical process. In still another technology, a cyclized peptide is produced in a cell (e.g., a bacterium) by genetic engineering modification. For example, as described in Patent Literature 3, the trans-splicing function of split inteins is exploited to cyclize a precursor peptide that is interposed between the two parts of the split inteins.
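The split-intein arrangement described above can be pictured with a small sketch. It is only a schematic: the placeholder strings are not real intein or target sequences, the function names are invented for this example, and the ordering shown (C-intein upstream, N-intein downstream) follows the commonly described arrangement for intein-mediated cyclization and is an assumption here, since the paragraph above states only that the peptide is interposed between the two parts.

```python
# Hypothetical sketch of a split-intein cyclization precursor.
# Placeholder strings only; none of these are real sequences.

def build_precursor(c_intein: str, target: str, n_intein: str) -> str:
    """Place the target between the C-intein (upstream) and the N-intein (downstream)."""
    return c_intein + target + n_intein


def cyclized_product(precursor: str, c_intein: str, n_intein: str) -> str:
    """Return the target segment left after both intein halves are excised.

    Trans-splicing joins the two ends of this segment, so the returned string
    should be read as a circle with no free N- or C-terminus.
    """
    if not (precursor.startswith(c_intein) and precursor.endswith(n_intein)):
        raise ValueError("precursor is not flanked by the expected intein halves")
    return precursor[len(c_intein):len(precursor) - len(n_intein)]


INT_C = "<IntC>"          # placeholder for the C-intein half
INT_N = "<IntN>"          # placeholder for the N-intein half
TARGET = "SAAKKLLEEG"     # made-up target segment

precursor = build_precursor(INT_C, TARGET, INT_N)
print(precursor)
print(cyclized_product(precursor, INT_C, INT_N))
```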

[0005] In still another method, a cyclized peptide is produced through chemical modification by introducing unnatural amino acids and cross-linkers. For instance, a technology has been known which uses an olefin metathesis reaction to cross-link (S')-α-(2'-pentenyl) alanine residues, thereby making a peptide cyclized within a molecule (Patent Literature 4).

[0006] The above-mentioned cyclization methods are also used as a method for extending the metabolic half-life of not only peptides but also proteins. For proteins, which are more vulnerable to some chemical synthesis process than peptides, commonly used are modifications by genetic engineering or a protocol in which modifications by genetic engineering and chemical modifications under mild conditions are combined. However, such approaches each require addition of cross-linkers and amino acid sequence necessary for promoting a cyclization reaction to a wild-type protein. Generally speaking, when amino acids are substituted and/or inserted in a wild-type protein, the physical and biological characteristics of the protein are disturbed. Thus, it is important to complete a cyclization reaction without modifying the wild-type amino acid sequence whenever possible.

[0007] In addition, when the efficiency of the cyclization reaction is insufficient, an additional purification step of separating the resulting cyclized protein from unreacted linear protein is needed. The cyclized protein and the unreacted linear protein differ very little in molecular weight and surface charge, so that it is not easy to separate and purify them from each other. Meanwhile, the efficiency of the cyclization reaction depends on the characteristics of an exogenous factor (e.g., an enzyme) used for the cyclization reaction and the characteristics of a target protein that is subject to cyclization. There is no established common procedure to increase the efficiency.

(About Cytokine with Helix Bundle Conformation)

[0008] Among bioactive proteins for treatment of humans, cytokines such as granulocyte colony-stimulating factor (G-CSF) and erythropoietin have long been clinically applicable. These cytokines have a helix bundle conformation and high three-dimensional homology. When stabilization of the conformation is used as a basis to extend the metabolic half-life, analogous approaches are often effective with respect to proteins with a similar conformation (Non-Patent Literature 4). In addition, the cytokines share structural features such as their N-terminus and C-terminus being positioned closely.

(About G-CSF)

[0009] Granulocyte colony-stimulating factor (G-CSF) is a hematopoietic factor that was discovered as a factor for inducing and differentiating granulocyte stem cells and that promotes production of neutrophils in vivo. Thus, G-CSF is clinically applicable as a drug for neutropenia after bone marrow transplantation and/or cancer chemotherapy. Various sources may be used to obtain and purify G-CSF. Glycosylated G-CSF, a product expressed in eukaryotic host cells, and non-glycosylated G-CSF, a product expressed in prokaryotic host cells, can be produced on an industrial scale. For instance, the former includes lenograstim, which is produced as a polypeptide chain with the same 174 amino acids as the native protein. The latter includes filgrastim, which is produced as a 175-amino-acid polypeptide chain identical to the native protein except that a methionyl residue is added to the N-terminus. They are available for treatment use in several countries (Patent Literatures 5 and 6). Filgrastim has demonstrated that addition of a methionyl group to the N-terminus and a lack of glycosylation do not affect the functions of G-CSF. This is a known phenomenon. In addition, some reports have stated that a mutant in which amino acids positioned at the N-terminal region (1 to 17 residues) are replaced or deleted, and a mutant in which a peptide sequence having 12 amino acids is added to the C-terminus, retained the functions of G-CSF (Non-Patent Literatures 5 and 6). These reports indicate that the N-terminus and the C-terminus of G-CSF are tolerant to the introduction of mutations.

(Examples of Improving Stability of G-CSF)

[0010] Various modifications have been reported to extend the metabolic half-life of G-CSF. These modifications have been made either by genetically replacing and modifying amino acids or by chemically adding moieties such as PEG (Patent Literatures 7 and 8). For instance, the former technique is used for nartograstim, which is produced as a 175-amino-acid polypeptide chain identical to the native protein except that a methionyl residue is added to the N-terminus and the residues Thr2, Leu4, Gly5, Pro6, and Cys18 are replaced by Ala, Thr, Tyr, Arg, and Ser residues, respectively. The latter technique is used for pegfilgrastim, which is a conjugate of filgrastim and monomethoxy polyethylene glycol. They are available for treatment use in several countries. The PEGylation of filgrastim, like common PEGylation, has been demonstrated to result in an increase in stability but a decrease in activity. For example, it has been shown that, compared with filgrastim, the metabolic half-life is increased from 1.79 h to 7.05 h (Patent Literature 7). By contrast, the activity is shown to be decreased to about 68 to 21% of that of filgrastim (Patent Literature 8).
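As a quick worked check of the figures quoted above, the half-life extension amounts to roughly a fourfold increase, while the retained activity is simply restated as a fraction. The symbols $t_{1/2}$ and $A$ are introduced here for illustration and do not appear in the cited literature.

```latex
\[
\frac{t_{1/2}^{\text{PEGylated}}}{t_{1/2}^{\text{filgrastim}}}
  = \frac{7.05\ \text{h}}{1.79\ \text{h}} \approx 3.9,
\qquad
0.21 \le \frac{A^{\text{PEGylated}}}{A^{\text{filgrastim}}} \le 0.68 .
\]
```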

[0011] In addition to the above, it has been reported (in Patent Literature 9) that by using the fact that the N-terminus and the C-terminus of G-CSF are closely positioned in its conformation, G-CSF has been modified to obtain a cyclized polypeptide chain such that a G-CSF, in which a plurality of Gly residues are added to the N-terminus and Leu, Pro, Xaa (Xaa is any amino acid), Thr, and Gly are added to the C-terminus, is expressed and isolated; and sortase A derived from Staphylococcus aureus is added thereto to chemically produce the cyclized polypeptide chain in vitro.

CITATION LIST

Patent Literature

[0012] Patent Literature 1: U.S. Pat. No. 4,033,940

[0013] Patent Literature 2: U.S. Pat. No. 4,102,877

[0014] Patent Literature 3: U.S. Pat. No. 7,846,710

[0015] Patent Literature 4: National Publication of International Patent Application No. 2008-501623

[0016] Patent Literature 5: U.S. Pat. No. 4,810,643

[0017] Patent Literature 6: U.S. Pat. No. 5,104,651

[0018] Patent Literature 7: U.S. Pat. No. 5,824,778

[0019] Patent Literature 8: U.S. Pat. No. 5,985,265

[0020] Patent Literature 9: U.S. Pat. No. 8,940,501

Non Patent Literature

[0021] Non Patent Literature 1: B. I. Lord, L. B. Woolford, and G. Molineux, Clin. Cancer Res., 2001, 7, 2085-90.

[0022] Non Patent Literature 2: P. van Der Auwera, E. Platzer, Z. X. Xu, R. Schulz, O. Feugeas, R. Capdeville, and D. J. Edwards, Am. J. Hematol., 2001, 66, 245-51.

[0023] Non Patent Literature 3: Guidance for Industry "Immunogenicity Assessment for Therapeutic Protein Products", U.S. Department of Health and Human Services, Food and Drug Administration, August 2014, 1-36.

[0024] Non Patent Literature 4: M. W. Popp, S. K. Dougan, T. Y. Chuang, E. Spooner, and H. L. Ploegh, Proc. Natl. Acad. Sci. U.S.A., 2011, 108, 3169-74.

[0025] Non Patent Literature 5: T. Kuga, Y. Komatsu, M. Yamasaki, S. Sekine, H. Miyaji, T. Nishi, M. Sato, Y. Yokoo, M. Asano, M. Okabe, M. Morimoto, and S. Itoh, Biochem. Biophys. Res. Commun., 1989, 159, 103-111.

[0026] Non Patent Literature 6: Y. Oshima, A. Tojo, Y. Niho, and S. Asano, Biochem. Biophys. Res. Commun., 2000, 267, 924-7.

SUMMARY OF INVENTION

Problems to be Solved by the Invention

[0027] The purpose of the present invention is to solve conventional technical problems in the production of a cyclized protein.

[0028] As described above, in the production of a cyclized peptide or a cyclized protein, the same procedure is used in each case: a linear polypeptide is first synthesized, and a chemical or biological process is then used to carry out a cyclization reaction. At this time, addition of a sequence for the cyclization reaction to a wild-type amino acid sequence may cause an unexpected change in the physical and biological characteristics of a target protein. In addition, if the linear polypeptide remains as a contaminating by-product due to low cyclization reaction efficiency, the subsequent purification step is generally difficult. As a result, a target substance with high purity cannot be obtained.

[0029] The present invention addresses the problem of establishing a production method such that, in the production of a cyclized protein, high cyclization reaction efficiency can be achieved and the number of amino acids added is minimal, thereby making it possible to retain the biological characteristics of an original protein and to provide a high-purity cyclized protein.

[0030] In particular, the present invention addresses the problem of establishing a method for producing a high-purity cyclized molecule modified by adding the minimum number of amino acids to a cytokine (e.g., G-CSF), a very stable modified version of which should be developed as a pharmaceutical, and providing the high-purity cyclized cytokine.

Means for Solving the Problems

[0031] The present inventors considered, as a cyclization efficiency-determining factor, how closely the N-terminus and the C-terminus are positioned in the conformation of a protein that is subject to cyclization. From this viewpoint, cyclized proteins having various modifications at their terminal sequences were designed and synthesized.
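One simple way to quantify how closely two termini are positioned is the straight-line distance between their atomic coordinates. The sketch below is an illustration under stated assumptions: the function name `terminal_distance` is invented here, the coordinates are made-up numbers rather than values from any G-CSF structure, and the text does not say that the inventors used this particular measure.

```python
# Hypothetical sketch: Euclidean distance between two terminal atoms.
# Coordinates are invented for illustration (units would be angstroms
# if taken from a real structure file).
import math


def terminal_distance(n_term_xyz, c_term_xyz):
    """Return the straight-line distance between two 3D points."""
    return math.dist(n_term_xyz, c_term_xyz)


n_term = (0.0, 0.0, 0.0)    # made-up position of the N-terminal residue
c_term = (3.0, 4.0, 12.0)   # made-up position of the C-terminal residue
print(terminal_distance(n_term, c_term))
```

A shorter distance would suggest that fewer added residues are needed to bridge the termini, which is the intuition the paragraph above describes.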

[0032] Specifically, a group of novel cyclized cytokines designed from the above viewpoint were produced and the cyclization efficiencies thereof were compared. In addition, the structure and function of each cyclized cytokine purified were examined in detail and were compared with those of a linear cytokine counterpart. This comparison showed that each cyclized cytokine had comparable functions and was very stable.

[0033] That is, the present application provides the following aspects of the present invention.

[0034] <1> A method for producing a cyclized mutant protein with biological properties not less than those of an original protein and with stability higher than that of the original protein by modifying, through the following steps (i) to (v), the original protein with a specific three-dimensional structure, the method comprising the steps of:

[0035] (i) determining, based on conformational information about the original protein, secondary structure-free regions at an N-terminal portion or a C-terminal portion of the original protein and determining an N-terminus and a C-terminus of a secondary structure-forming portion of the original protein if the secondary structure-free regions are deleted;

[0036] (ii) screening a protein database for proteins with secondary structures similar to those of N-terminal residues and C-terminal residues of the secondary structure-forming portion;

[0037] (iii) determining, based on results of the screening, an amino acid length of a loop structure through which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are to be connected;

[0038] (iv) designing and producing a linear mutant protein that can be subjected to cyclization so as to obtain a cyclized mutant protein in which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are connected via the loop structure with the amino acid length determined; and

[0039] (v) linking an N-terminus and a C-terminus of the linear mutant protein by a chemical or biological method so that the cyclization is made to obtain the cyclized mutant protein.

[0040] <2> The method for producing a cyclized mutant protein according to item <1>, wherein the cyclization of step (v) uses a trans-splicing reaction mediated by split inteins.

[0041] <3> The method for producing a cyclized mutant protein according to item <1> or <2>, wherein the original protein with a specific three-dimensional structure is a cytokine with a helix bundle structure and the biological properties of the original protein involve affinity for a receptor.

[0042] <4> The method of producing a cyclized mutant protein according to item <3>, wherein the original protein with a specific three-dimensional structure is granulocyte colony-stimulating factor (G-CSF).

[0043] <5> A cyclized mutant protein produced using the production method according to any one of items <1> to <4>.

[0044] <6> A cyclized mutant protein consisting of an amino acid sequence in which secondary structure-free regions at an N-terminal portion or a C-terminal portion of an original protein with a specific three-dimensional structure are deleted and a predetermined number of amino acid residues is added to each of an N-terminus and a C-terminus of a secondary structure-forming portion of the original protein after the deletion, wherein the predetermined number of amino acid residues is the number of amino acid residues, the number corresponding to the number of amino acids connecting secondary structures of N-terminal residues and C-terminal residues of a secondary structure-forming portion of another protein, the secondary structures being similar to those of the secondary structure-forming portion of the original protein.

[0045] <7> The cyclized mutant protein according to item <6>, wherein the protein with a specific three-dimensional structure is a cytokine with a helix bundle structure.

[0046] <8> The cyclized mutant protein according to item <6> or <7>, wherein the original protein with a specific three-dimensional structure is a granulocyte colony-stimulating factor (G-CSF) consisting of an amino acid sequence set forth in SEQ ID NO: 1; 0 to 11 amino acid residues are deleted from an N-terminus of the amino acid sequence set forth in SEQ ID NO: 1; 0 to 3 amino acid residues are deleted from a C-terminus of the amino acid sequence; a serine or cysteine residue is added to an N-terminus and/or a C-terminus of the amino acid sequence after the deletion; and an amino acid residue other than proline is added to the C-terminus of the amino acid sequence after the deletion.

[0047] <9> The cyclized mutant protein according to any one of items <6> to <8>, wherein the protein consists of a cyclized amino acid sequence set forth in any one of SEQ ID NOs: 6 to 9.

[0048] <10> A linear mutant protein designed so as to produce the cyclized mutant protein according to any one of items <6> to <9>, wherein the linear mutant protein consists of an amino acid sequence in which secondary structure-free regions at an N-terminal portion and a C-terminal portion of an original protein with a specific three-dimensional structure are deleted and a predetermined number of amino acid residues is added to each of an N-terminus and a C-terminus of a secondary structure-forming portion of the original protein after the deletion; the predetermined number of amino acid residues is the number of amino acid residues, the number corresponding to the number of amino acids connecting secondary structures of N-terminal residues and C-terminal residues of a secondary structure-forming portion of another protein, the secondary structures being similar to those of the secondary structure-forming portion of the original protein; and a most N-terminus amino acid residue and a most C-terminus amino acid residue of the predetermined number of amino acid residues added are amino acid residues that enable the linear mutant protein to be subject to cyclization.

[0049] <11> The linear mutant protein according to item <10>, wherein the protein is designed so as to produce the cyclized mutant protein according to item <9> and consists of an amino acid sequence set forth in any one of SEQ ID NOs: 2 to 5.

[0050] <12> The linear mutant protein according to item <10>, wherein the protein is designed so as to produce the cyclized mutant protein according to item <9> by using, as split inteins, DnaE-C (C-intein) and DnaE-N (N-intein) of DnaE derived from Nostoc punctiforme and consists of an amino acid sequence set forth in any one of SEQ ID NOs: 10 to 13.

[0051] <13> A nucleic acid comprising a nucleotide sequence encoding an amino acid sequence of the mutant protein according to any one of items <5> to <12>.

[0052] <14> A recombinant vector comprising the nucleic acid according to item <13>.

[0053] <15> A transformant comprising the recombinant vector according to item <14>.

[0054] <16> A method for producing the mutant protein according to any one of items <5> to <12>, using the transformant according to item <15>.

[0055] <17> A pharmaceutical composition comprising, as an active ingredient, the mutant protein according to any one of items <5> to <12>.

[0056] <18> A pharmaceutical composition for treatment of neutropenia, comprising, as an active ingredient, the mutant protein according to any one of items <5> to <12>.
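Steps (ii) and (iii) of item <1> can be pictured as tallying the loop lengths seen in database hits and shortlisting the most common ones. This is a minimal sketch under assumptions: the text does not state how the length is derived from the screening results, so ranking by frequency is a choice made for this example, and the function name and toy data are hypothetical.

```python
# Hypothetical sketch: shortlist loop lengths from database screening hits.
# Ranking by frequency is an assumption; the application only says the length
# is determined "based on results of the screening".
from collections import Counter


def candidate_loop_lengths(observed_lengths, top_n=3):
    """Return up to top_n loop lengths, most frequent first (ties: shorter first)."""
    counts = Counter(observed_lengths)
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [length for length, _ in ranked[:top_n]]


# Toy data: loop lengths (in residues) found in made-up database hits.
hits = [2, 5, 5, 9, 2, 5, 3]
print(candidate_loop_lengths(hits))
```

Each shortlisted length would then correspond to one linear mutant design in step (iv).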

Advantageous Effects of Invention

[0057] In the present invention, specifically developed is G-CSF(C177), a cyclized protein composed of an amino acid sequence in which only 2 amino acids are added to a linear control G-CSF. In addition, G-CSF(C163) is a cyclized protein composed of an amino acid sequence in which nonstructural regions at the N-terminus and the C-terminus are deleted and only 2 amino acids are added. Further, G-CSF(C166) or G-CSF(C170) is a cyclized protein composed of an amino acid sequence in which nonstructural regions at the N-terminus and the C-terminus are deleted and 5 or 9 amino acid residues, respectively, are added.
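The residue counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch assumes that the number in each name (for example, 163 in G-CSF(C163)) is the residue count of the cyclized chain; this is an inference from the paragraph above rather than something it states explicitly, and the variable and function names are invented for the example.

```python
# Hypothetical cross-check of the variant names against the residues added.
# Assumption: the number in each name is the length of the cyclized chain.

ADDED_RESIDUES = {"C177": 2, "C163": 2, "C166": 5, "C170": 9}


def retained_length(name: str) -> int:
    """Residues kept from the linear control = named length minus residues added."""
    return int(name[1:]) - ADDED_RESIDUES[name]


for variant in ADDED_RESIDUES:
    print(variant, retained_length(variant))
```

Under this assumption the linear control has 175 residues and the three truncated variants share a 161-residue core, that is, 14 residues fewer, which matches the upper bounds of 11 N-terminal and 3 C-terminal deletions given in item <8>.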

[0058] When compared with commercially available filgrastim and the linear control G-CSF, all of them maintain intrinsic receptor-binding activity and have increased thermostability and protease resistance. Among them, G-CSF(C163), G-CSF(C166), and G-CSF(C170) achieved high cyclization efficiency while the number of amino acids added was small.

[0059] Currently, G-CSF has been widely used in clinical practice including treatment of neutropenia caused by cancer chemotherapy. Development of biobetter drugs is in progress so as to enhance stability by using a variety of strategies such as a mutant G-CSF and a chemically-modified G-CSF. Among the strategies, a polypeptide chain cyclization technique aims at increasing stability while change in the amino acid sequence and change in the conformation of a protein can be minimized. An undesirable characteristic change such as an increase in immunogenicity can also be minimized. Thus, the technique seems to have a substantial advantage in the production of bioactive proteins for treatment of humans.

[0060] According to the present invention, it is possible to minimize the addition of a sequence necessary for a cyclization reaction, which addition is the biggest disadvantage of increasing stability by cyclization in view of the above-described situations. Further, the present invention can resolve the problems of low cyclization reaction efficiency and a complicated purification step, thereby significantly contributing to technical advances in the art.

BRIEF DESCRIPTION OF DRAWINGS

[0061] FIG. 1 shows a comparison of the amino acid sequence of filgrastim and the amino acid sequence of a linear control G-CSF, in which some amino acids of the filgrastim sequence are replaced, with the amino acid sequences of the four different cyclized G-CSFs designed. The schematic diagram in the top row shows helix-forming regions predicted on the basis of a known G-CSF conformation (PDB code: 2D9Q).

[0062] FIG. 2 is a synthetic scheme for a cyclized G-CSF produced using inteins.

[0063] FIG. 3 is an SDS-PAGE picture showing a group of modified cyclized G-CSFs (G-CSF(C177) and G-CSF(C163)) after purification.

[0064] FIG. 4 shows sensorgrams obtained by measuring, by means of surface plasmon resonance, the binding activity of each modified cyclized G-CSF (G-CSF(C177) or G-CSF(C163)) toward a G-CSF receptor.

[0065] FIG. 5 shows graphs of the activity of each modified cyclized G-CSF (G-CSF(C177) or G-CSF(C163)) on cells.

[0066] FIG. 6 is a graph obtained by measuring, by means of circular dichroism, the thermostability of each modified cyclized G-CSF (G-CSF(C177) or G-CSF(C163)).

[0067] FIG. 7 is a graph in which the protease resistance of each modified cyclized G-CSF (G-CSF(C177) or G-CSF(C163)) was evaluated using the remaining amount thereof after a reaction.

[0068] FIG. 8 shows electron density maps of the terminal regions of each modified cyclized G-CSF (G-CSF(C177) or G-CSF(C163)), which regions were subjected to cyclization and are visualized by X-ray crystallography.

[0069] FIG. 9 shows SDS-PAGE pictures of, among the modified cyclized G-CSFs, G-CSF(C170) and G-CSF(C166) after purification by size-exclusion chromatography.

[0070] FIG. 10 shows sensorgrams obtained by measuring, by means of surface plasmon resonance, the binding activity of each of G-CSF(C170), G-CSF(C166), and filgrastim among the modified cyclized G-CSFs toward a G-CSF receptor, and sensorgrams obtained by re-measuring the binding activity of each of the linear control G-CSF, G-CSF(C177), and G-CSF(C163) toward the G-CSF receptor.

[0071] FIG. 11 is a graph obtained by measuring, by means of circular dichroism, the thermostability of each of G-CSF(C170) and G-CSF(C166) among the modified cyclized G-CSFs, and a graph obtained by re-measuring the thermostability of each of the linear control G-CSF, G-CSF(C177), and G-CSF(C163).

[0072] FIG. 12 is a graph in which the protease resistance of each of G-CSF(C170) and G-CSF(C166) among the modified cyclized G-CSFs was evaluated using the remaining amount thereof after a reaction.

MODE FOR CARRYING OUT THE INVENTION

(Cyclized Mutant Protein and Production Method Thereof)

[0073] In a production method according to an aspect of the present invention, an original protein with a specific three-dimensional structure is modified to obtain a cyclized mutant protein after the following steps (i) to (v):

[0074] (i) determining, based on conformational information about the original protein, secondary structure-free regions at an N-terminal portion or a C-terminal portion of the original protein and determining an N-terminus and a C-terminus of a secondary structure-forming portion of the original protein if the secondary structure-free regions are deleted;

[0075] (ii) screening a protein database for proteins with secondary structures similar to those of N-terminal residues and C-terminal residues of the secondary structure-forming portion;

[0076] (iii) determining, based on results of the screening, an amino acid length of a loop structure through which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are to be connected;

[0077] (iv) designing and producing a linear mutant protein that can be subjected to cyclization so as to obtain a cyclized mutant protein in which the N-terminus and the C-terminus of the secondary structure-forming portion of the original protein are connected via the loop structure with the amino acid length determined; and

[0078] (v) linking an N-terminus and a C-terminus of the linear mutant protein by a chemical or biological method so that the cyclization is made to obtain the cyclized mutant protein.

[0079] Hereinbelow, disclosed is a method for producing a cyclized mutant protein when granulocyte colony-stimulating factor (G-CSF) is used as an original protein with a specific three-dimensional structure. The present production method is not limited to this embodiment, and is widely applicable to cyclization of various proteins. The present production method is applicable to cyclization of any protein as long as the amino acid sequence of the protein is known and its conformation (three-dimensional structure) is already determined. In particular, from the viewpoint of cyclization efficiency, preferred is cyclization of a protein, the N-terminus and the C-terminus of which are positioned closely in its conformation. Here, G-CSF is known to have a helix bundle structure as its three-dimensional structure. The present production method seems to particularly fit cyclization of a protein (e.g., erythropoietin, interferon α) with the same helix bundle structure as that of G-CSF. Those skilled in the art can carry out cyclization of another protein by using substantially the same technique as the present production method when the below-described G-CSF cyclization technique is taken into consideration. This production method enables very efficient cyclization while the number of amino acids added is minimal. Also, the secondary structure of a cyclized mutant protein as obtained using this production method does not change from that of an original protein. This yields such advantages as the biological properties of the original protein being retained and the stability being higher than that of the original protein.

(Description of G-CSF)

[0080] Granulocyte colony-stimulating factor (G-CSF) is a cytokine that binds to a granulocyte colony-stimulating factor receptor (G-CSF receptor) to induce production of neutrophils, etc., and is used for treatment of neutropenia during cancer chemotherapy, etc. While G-CSF is very useful in treatment, it unfortunately has low stability. Accordingly, efforts have been made to increase its stability (Reference Document 1). It is known that the N-terminus and the C-terminus of G-CSF are positioned closely in its three-dimensional structure. It has been reported that when the termini are linked by a covalent bond, the thermostability increases (Reference Document 2).

[0081] In the present invention, a linear polypeptide consisting of 175 amino acids set forth in SEQ ID NO: 1 was used as a linear control G-CSF, a starting material for improvement. When the amino acid sequence of the linear control G-CSF is compared with the amino acid sequence (SEQ ID NO: 14) of filgrastim, threonine at position 2 and cysteine at position 18 are replaced by alanine and serine, respectively. A previous report has demonstrated that these mutations do not affect the activity of G-CSF (Reference Document 3).

(Description of Mutants Developed Herein)

[0082] Improved proteins developed herein each consist of an amino acid sequence that is human-designed on the basis of the amino acid sequence of the linear control G-CSF, and can be produced highly efficiently as a cyclized molecule without impairing the G-CSF receptor-binding activity while the number of amino acids added is minimal.

(Sequences of Specific Mutants Developed Herein)

[0083] Four different improved proteins developed and designed herein are composed of the respective four different amino acid sequences described below (FIG. 1).

[0084] G-CSF(C177) is a polypeptide consisting of 177 amino acids set forth in SEQ ID NO: 2, and an amino group of serine at the N-terminus and a carbonyl group of glycine at the C-terminus are linked by a peptide bond as shown in SEQ ID NO: 6. The amino acid sequence set forth in SEQ ID NO: 2 corresponds to a sequence in which serine is added at the N-terminus and glycine is added to the C-terminus of the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF.

[0085] G-CSF(C170) is a polypeptide consisting of 170 amino acids set forth in SEQ ID NO: 3, and an amino group of serine at the N-terminus and a carbonyl group of glycine at the C-terminus are linked by a peptide bond as shown in SEQ ID NO: 7. The amino acid sequence set forth in SEQ ID NO: 3 corresponds to a sequence in which N-terminal 4 amino acid residues and C-terminal 3 amino acid residues are deleted from the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF and serine is added to its N-terminus and glycine is added to its C-terminus.

[0086] G-CSF(C166) is a polypeptide consisting of 166 amino acids set forth in SEQ ID NO: 4, and an amino group of serine at the N-terminus and a carbonyl group of glycine at the C-terminus are linked by a peptide bond as shown in SEQ ID NO: 8. The amino acid sequence set forth in SEQ ID NO: 4 corresponds to a sequence in which N-terminal 8 amino acid residues and C-terminal 3 amino acid residues are deleted from the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF and serine is added to its N-terminus and glycine is added to its C-terminus.

[0087] In addition, G-CSF(C163) is a polypeptide consisting of 163 amino acids set forth in SEQ ID NO: 5, and an amino group of serine at the N-terminus and a carbonyl group of glycine at the C-terminus are linked by a peptide bond as shown in SEQ ID NO: 9. The amino acid sequence set forth in SEQ ID NO: 5 corresponds to a sequence in which N-terminal 11 amino acid residues and C-terminal 3 amino acid residues are deleted from the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF and serine is added to its N-terminus and glycine is added to its C-terminus.

(Concerning Addition of Amino Acids to N-terminus and C-terminus so as to Carry out Cyclization Reaction)

[0088] To make a protein cyclic in accordance with the present invention, it is possible to use an intracellular cyclization reaction using inteins. DnaE derived from Nostoc punctiforme can be used as the inteins. More specifically, it is possible to use DnaE-C and DnaE-N, proteins consisting of amino acid sequences set forth in SEQ ID NOs: 15 and 16, respectively. A polypeptide chain in which a target protein is interposed between the DnaE-C and the DnaE-N is subject to spontaneous splicing using DnaE autocatalytic function, thereby synthesizing a cyclized protein in which an amino group at the N-terminus and a carbonyl group at the C-terminus of the target protein are linked by a peptide bond (by means of split inteins; see FIG. 2).
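The arrangement described in paragraph [0088] can be illustrated with a minimal Python sketch. It only models the order of the segments (DnaE-C, target protein, DnaE-N) and the removal of the inteins on splicing. The intein strings, the short target string, and all function names are hypothetical placeholders chosen for illustration; they are not the sequences of SEQ ID NOs: 15, 16, or 2.

```python
# Placeholder strings only; NOT the real DnaE-C / DnaE-N sequences.
DNAE_C = "cccc"  # stands in for the C-intein placed before the target
DNAE_N = "nnnn"  # stands in for the N-intein placed after the target


def build_precursor(target: str) -> str:
    """Interpose the target protein between DnaE-C and DnaE-N."""
    return DNAE_C + target + DNAE_N


def splice_to_cyclic(precursor: str) -> str:
    """Model splicing: drop both inteins and return the residues of the
    cyclized product, written starting from the former N-terminus."""
    if not (precursor.startswith(DNAE_C) and precursor.endswith(DNAE_N)):
        raise ValueError("precursor is not flanked by DnaE-C and DnaE-N")
    return precursor[len(DNAE_C):len(precursor) - len(DNAE_N)]


def same_cycle(a: str, b: str) -> bool:
    """Two written sequences describe the same ring if one is a rotation
    of the other (a cyclic chain has no distinguished start)."""
    return len(a) == len(b) and a in (b + b)
```

Because the product is a ring, any rotation of the written sequence denotes the same molecule, which is why `same_cycle` compares rotations rather than plain strings.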

[0089] Use of split inteins requires cysteine or serine at either the N-terminus or the C-terminus of a target protein; in addition, proline at the C-terminus markedly decreases reaction efficiency (Reference Document 6). Because of this, in the cases of the above-mentioned present improved proteins, serine is added to the N-terminus and glycine is added to the C-terminus of the target protein. However, amino acids usable at the N-terminus or the C-terminus are not limited to the above amino acids. In addition to the above combination, 78 different sequence combinations may each be a choice. Whether the sequence combination is good or poor may be evaluated by a method for comparing proteins actually synthesized and purified. In addition, it is possible to use a method including: replacing, on a model, a loop region of a known conformation described below by each corresponding sequence; and computationally comparing the free energies thereof.

[0090] In the cases of the improved G-CSFs, the present inventors designed, from the above viewpoint, a sequence (SEQ ID NO: 2) in which serine is added to the N-terminus and glycine is added to the C-terminus of the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF, and produced G-CSF(C177) by using split inteins. G-CSF(C177) had G-CSF receptor-binding activity comparable to that of the linear control G-CSF and exhibited better activity in cells, thermostability, protease resistance, and metabolic half-life, but exhibited somewhat low cyclization efficiency (see Table 1).

(To Select Sequences Deleted from N-terminus and C-terminus of Original Protein (Step (i)))

[0091] In order to modify an original protein to obtain substantially the same biological activity as that of the original protein, it is important not to damage the conformation of the original protein. Also, to achieve high cyclization efficiency, it seems to be critical that the N-terminus and the C-terminus of the modified protein are positioned closely in its conformation.

[0092] To design an improved G-CSF from such a viewpoint, the present inventors removed conformationally disordered regions from the N-terminal sequence and the C-terminal sequence of the original protein. Specifically, in the known G-CSF conformation model, a secondary structure-forming region is determined to be a conformationally stable region and a secondary structure-free region is determined to be a disordered region. More specifically, 11 amino acids from methionine at position 1 to proline at position 11 and 3 amino acids from alanine at position 173 to proline at position 175 of the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF were determined to be secondary structure-free regions and were then removed.

[0093] Reference Documents 4 and 5 report a plurality of algorithms used to determine a secondary structure-forming region in conformation modeling. Their determination often gives different results with respect to a terminal portion of the secondary structure. Accordingly, different algorithms adopted result in different candidates for a cleavage site. In addition, to determine the disordered region, it is possible to use a molecular dynamics technique to calculate a change in the structure of the terminal portion. When different methods give different determination results, it is reasonable to adopt a representative value obtained by statistical processing. As the representative value, preferred is a mode. If there are different results, a value at which the secondary structure-forming region is minimal is adopted.
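The statistical treatment in paragraph [0093] can be sketched as follows. This is a minimal illustration, assuming that each algorithm reports the first and the last residue position of the secondary structure-forming portion, and assuming that "different results" refers to a tie between several modes. The function names and the example calls are hypothetical.

```python
from collections import Counter


def _modes(calls):
    """Return every position that shares the highest count."""
    counts = Counter(calls)
    top = max(counts.values())
    return [pos for pos, n in counts.items() if n == top]


def first_structured_residue(calls):
    """Mode of the reported start positions; on a tie, take the largest
    start so that the secondary structure-forming region is minimal."""
    return max(_modes(calls))


def last_structured_residue(calls):
    """Mode of the reported end positions; on a tie, take the smallest
    end so that the secondary structure-forming region is minimal."""
    return min(_modes(calls))


# Hypothetical calls from four different assignment methods.
start = first_structured_residue([12, 12, 11, 13])
end = last_structured_residue([172, 173, 172, 171])
```

With these made-up inputs the consensus is a structured portion running from position 12 to position 172, which would correspond to deleting 11 N-terminal and 3 C-terminal residues of a 175-residue chain.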

[0094] If information on the conformation of some protein other than G-CSF is available, secondary structure-free regions at the N-terminus and the C-terminus are likewise determined. Then, the N-terminus and the C-terminus of the protein when the secondary structure-free regions are deleted (i.e., the N-terminus and the C-terminus of the secondary structure-forming portion) can be determined. Note that in this step, it is not necessary to actually remove a secondary structure-free region from a protein by means of deletion and mutation, etc. It is simply sufficient to determine, in a design simulation, the N-terminus and the C-terminus thereof while considering only a secondary structure-forming region without considering secondary structure-free regions.

(Loop Design)

[0095] Of the two termini of a protein after secondary structure-free regions at the two termini have been removed, each has a secondary structure, so that it seems to be difficult to link the termini as they are and to make the protein cyclic. Then, the present inventors added, to each terminus, amino acid residues (or a sequence) used to form a loop with an appropriate length so as to connect these termini after conformationally disordered regions have been removed from the N-terminal sequence and the C-terminal sequence of G-CSF for cyclization. Note that as used herein, the loop refers to a portion composed of an amino acid sequence connecting both the termini of the protein after the secondary structure-free regions have been removed from the original termini (i.e., connecting the N-terminus and the C-terminus of a secondary structure-forming portion of a protein of interest).

[0096] The appropriate length of the loop was determined using the following procedure.

(Screening for Analogous Proteins (Step (ii)))

[0097] First, from protein conformations registered in a database, a group of proteins with a "helix-loop-helix" secondary structure in which two helices are connected via a loop was retrieved and a group of proteins with structural homology to G-CSF was further selected. Specifically retrieved was a group of proteins with a helix-loop-helix structure in which proximal 4-amino-acid portions (total of 8 amino acids) adjacent to the loop between the two helices are structurally similar to 4-amino-acid portions (total of 8 amino acids) at the corresponding G-CSF helix termini (i.e., the N-terminus and the C-terminus of the secondary structure-forming region after the secondary structure-free regions have been removed). For example, when 4 amino acid residues (α-carbon atoms) positioned at each of the two helix terminal portions of G-CSF and those of a subject protein are superimposed, their root mean square deviation may be less than 1.5 Å and preferably less than 1 Å. In this case, the subject protein can be retrieved as a structurally similar protein. In this regard, however, the amino acid residue lengths at both the helix termini being compared herein do not have to be 4, and may be adjusted to, for example, a range from 1 to 10 and preferably from 3 to 6 depending on how many proteins have a helix-loop-helix structure being selected based on structural similarity. That is, if the amino acid residue lengths at both the helix termini being compared are short, the number of the structurally similar proteins being selected becomes large, and if the amino acid residue lengths are long, the number of the structurally similar proteins being selected becomes small. Because of this, the amino acid lengths at both the helix termini being compared may be suitably adjusted such that the number of the structurally similar proteins being selected is set to a desirable degree (e.g., within 100 to within 1000 and preferably from about 50 to 100). For example, Protein Data Bank (PDB; http://www.rcsb.org/) is available as the database. Examples of a condition that is for a group of proteins retrieved from the database and can be added include the degree of resolution of protein conformation (e.g., 3.0 Å or better and preferably 2.5 Å or better).
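The similarity criterion in paragraph [0097] can be sketched in Python as below. This is a simplified illustration, not the screening software actually used: it assumes that the 8 α-carbon coordinates of the query and of the candidate have already been optimally superimposed by some other tool, so that only the root mean square deviation and the cutoff test remain. The function names and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally long lists of
    (x, y, z) coordinates that are assumed to be superimposed already."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def is_structurally_similar(query, candidate, cutoff=1.5):
    """Apply the 'less than 1.5 angstrom' criterion (1.0 for the preferred case)."""
    return rmsd(query, candidate) < cutoff


# Made-up coordinates: 4 + 4 alpha-carbons from the two helix termini.
query = [(1.5 * i, 0.0, 0.0) for i in range(8)]
candidate = [(x + 0.5, y, z) for (x, y, z) in query]
similar = is_structurally_similar(query, candidate)
```

Shortening the compared window (fewer than 4 residues per helix terminus) makes the test easier to pass and enlarges the retrieved group, which is the adjustment the paragraph describes.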

[0098] Like in the case of G-CSF, the database may be screened for proteins with secondary structures similar to those of N-terminal residues and C-terminal residues of a secondary structure-forming region of any protein other than G-CSF.

(To determine the Length of Loop (Step (iii)))

[0099] Next, how many amino acid residues a loop of each protein selected as having high structural homology to G-CSF had was analyzed. As a result, it was observed that the number of amino acid residues of the loop was mostly 2, 5, or 9 (22% of all the loops had 2 amino acid residues; 26% of all had 5 amino acid residues; and 16% of all had 9 amino acid residues). In view of the above, it was predicted that a modified cyclized molecule with the smallest size could be constructed using a procedure in which the nonstructural regions at the N-terminus and the C-terminus of G-CSF had been removed and two amino acids were added to the resulting G-CSF. Further, it was also predicted that a stable modified cyclized molecule could be constructed by adding five or nine amino acids.

[0100] The amino acid length of a loop structure through which the N-terminus and the C-terminus of a secondary structure-forming portion of a protein other than G-CSF are to be connected can be determined such that the amino acid length corresponds to the number of amino acids used to connect corresponding secondary structure portions of another protein having similar secondary structure portions as obtained as a result of the screening. Specifically, the screening yields the number of amino acids used to connect, to each other, secondary structure portions of each of analogous proteins having high structural homology. Among the numbers, a frequently occurring amino acid length is determined to be the amino acid length (loop length) of a loop structure through which the N-terminus and the C-terminus of a secondary structure-forming portion of a protein subjected to cyclization are to be connected. As the criterion for how frequently the selected loop length should occur, it is possible to adopt a number of connecting amino acids that occurs in 10% or more, and preferably 15% or more, of the proteins screened out. When there are several frequently occurring amino acid lengths, one of the amino acid lengths may be selected. To construct a modified cyclized molecule with a minimum size, the amino acid length of the loop structure is preferably as short as possible. However, a shorter one does not necessarily mean that it will be a better one. From the viewpoint of stability and cyclization efficiency, it is desirable to select a suitable length. This step makes it possible to determine an optimal loop length so as to connect the N-terminus and the C-terminus of a secondary structure-forming portion of a protein without changing its structure.
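The frequency criterion of paragraphs [0099] and [0100] can be sketched as follows. The function name is hypothetical, and the example data are synthetic: the counts for loop lengths 2, 5, and 9 reproduce the percentages stated above, while the remaining lengths are invented filler so that the list totals 100 entries.

```python
from collections import Counter


def candidate_loop_lengths(loop_lengths, min_fraction=0.10):
    """Return, in ascending order, every loop length that occurs in at
    least min_fraction of the screened proteins."""
    if not loop_lengths:
        raise ValueError("no loops to analyze")
    counts = Counter(loop_lengths)
    total = len(loop_lengths)
    return sorted(n for n, c in counts.items() if c / total >= min_fraction)


# Synthetic screening result: 22% length 2, 26% length 5, 16% length 9;
# the other lengths (3, 4, 6, 7 at 9% each) are made up for illustration.
observed = [2] * 22 + [5] * 26 + [9] * 16 + [3] * 9 + [4] * 9 + [6] * 9 + [7] * 9

candidates = candidate_loop_lengths(observed)          # 10% criterion
preferred = candidate_loop_lengths(observed, 0.15)     # 15% criterion
```

Under both the 10% and the 15% criteria these synthetic data leave 2, 5, and 9 as candidates; choosing among them is then a matter of the size, stability, and cyclization efficiency considerations discussed in the paragraph.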

(To Design Linear Mutant Protein and Make It Cyclic (Steps (iv) and (v)))

[0101] Based on the above prediction, N-terminal 11 amino acid residues and C-terminal 3 amino acid residues (i.e., secondary structure-free regions) were deleted from the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF. In addition, serine and glycine (total of 2 amino acid residues) were added to the N-terminus and the C-terminus, respectively, to design a sequence (SEQ ID NO: 5). Next, split inteins were used to produce G-CSF(C163) with a loop length of 2 amino acids. G-CSF(C163) had G-CSF receptor-binding activity comparable to that of the linear control G-CSF and exhibited somewhat poorer activity on cells, thermostability, protease resistance, and metabolic half-life than G-CSF(C177), but exhibited high cyclization efficiency (100%) (see Table 1).

[0102] From these results, it can be predicted that any modified cyclized molecule having a longer amino acid sequence than G-CSF(C163) and a shorter amino acid sequence than G-CSF(C177) has substantially the same preferable properties as of G-CSF(C177) and G-CSF(C163). In particular, modified cyclized molecules in which the loop length has been adjusted to 9 or 5 amino acids are highly likely to have improved cyclization reaction efficiency and enhanced stability while maintaining the structure of the linear control G-CSF.

[0103] When kinds of amino acids constituting the loop are determined, an available sequence is limited depending on a technique used for cyclization. For instance, when split inteins are used, it has already been shown that cysteine or serine at either the N-terminus or the C-terminus of a target protein is essential and proline at the C-terminus markedly decreases reaction efficiency (Reference Document 6). Hence, an amino acid residue added to form a loop after cyclization (or a terminal residue of an amino acid sequence) should be cysteine or serine at either the N-terminus or the C-terminus, and the C-terminal residue should not be proline.

(To Improve Loops)

[0104] Based on the above prediction, modified molecules were produced in which the loop length was adjusted to nine or five amino acids. Seven or three amino acids, except for serine and glycine at the N-terminus and the C-terminus, respectively, which are amino acid residues required for cyclization using split inteins, were designed so as to maintain the amino acid sequence of the linear control G-CSF as much as possible. Specifically, N-terminal 4 amino acid residues and C-terminal 3 amino acid residues of the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF were deleted; serine was added to the N-terminus and glycine was added to the C-terminus to give an engineered sequence (SEQ ID NO: 3); and split inteins were used to produce G-CSF(C170) with a loop length of 9 amino acids. Similarly, N-terminal 8 amino acid residues and C-terminal 3 amino acid residues of the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF were deleted; serine was added to the N-terminus and glycine was added to the C-terminus to give an engineered sequence (SEQ ID NO: 4); and split inteins were used to produce G-CSF(C166) with a loop length of 5 amino acids.
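The residue counts in the designs above can be checked with a short Python sketch. It assumes the numbers given in the text (a 175-residue linear control, an 11-residue N-terminal and a 3-residue C-terminal secondary structure-free region, and serine and glycine as the two added residues). The string standing in for SEQ ID NO: 1 is a placeholder of the right length, not the real sequence, and the function names are hypothetical.

```python
N_DISORDERED = 11  # secondary structure-free residues at the N-terminus
C_DISORDERED = 3   # secondary structure-free residues at the C-terminus

# Placeholder of 175 residues; NOT the sequence of SEQ ID NO: 1.
LINEAR_CONTROL = "X" * 175


def design_linear_mutant(seq, n_del, c_del, n_add="S", c_add="G"):
    """Delete n_del N-terminal and c_del C-terminal residues, then add
    the residues required for intein-mediated cyclization."""
    core = seq[n_del:len(seq) - c_del]
    return n_add + core + c_add


def loop_length(n_del, c_del, added=2):
    """Residues connecting the two helix termini after cyclization:
    retained disordered residues plus the residues added for cyclization."""
    return (N_DISORDERED - n_del) + (C_DISORDERED - c_del) + added


designs = {
    "G-CSF(C170)": (4, 3),
    "G-CSF(C166)": (8, 3),
    "G-CSF(C163)": (11, 3),
}
lengths = {name: len(design_linear_mutant(LINEAR_CONTROL, n, c))
           for name, (n, c) in designs.items()}
loops = {name: loop_length(n, c) for name, (n, c) in designs.items()}
```

Under these assumptions the chain lengths come out as 170, 166, and 163 residues and the loop lengths as 9, 5, and 2 residues, matching the names and loop lengths stated in the text.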

[0105] Both G-CSF(C170) and G-CSF(C166) had G-CSF receptor-binding activity comparable to that of the linear control G-CSF and had higher thermostability than G-CSF(C177). In addition, the cyclization efficiency was equal to or somewhat lower than that of G-CSF(C163), but was markedly higher than that of G-CSF(C177) (see Table 3). In particular, the thermostability of G-CSF(C166) was markedly higher than that of the other modified molecules. Accordingly, its loop length is considered to be optimal for stabilization of the structure of G-CSF.

[0106] From these results, it can be predicted that any modified cyclized molecules having an amino acid sequence longer than G-CSF(C163) and shorter than G-CSF(C177) have likewise preferable properties, the molecules being modified G-CSF cyclized molecules each having an amino acid sequence in which 0 to 11, and preferably 1 to 11, amino acid residues are deleted from the N-terminus and 0 to 3, and preferably 1 to 3, amino acid residues are deleted from the C-terminus of the amino acid sequence set forth in SEQ ID NO: 1; a serine or cysteine residue is added to the N-terminus and/or the C-terminus after the deletion; and an amino acid residue other than proline is added to the C-terminus.

[0107] For proteins other than G-CSF, a loop structure having a determined amino acid length is used to connect the N-terminus and the C-terminus of a secondary structure-forming portion of a protein of interest to obtain a cyclized mutant protein. For this purpose, a linear mutant protein that can be made cyclic may be designed and produced and then the N-terminus and the C-terminus of the linear mutant protein may be subjected to cyclization and be linked to obtain a cyclized mutant protein.

[0108] The linear mutant protein is designed such that a portion of the loop structure-forming amino acid sequence is added to the N-terminus and the rest is added to the C-terminus of the amino acid sequence of a secondary structure-forming portion of an original protein. The total of the number of amino acid residues added to the N-terminus and the number of amino acid residues added to the C-terminus should be the amino acid length of the loop structure determined above. Preferable design is such that at least one amino acid residue is added to each of the N-terminus and the C-terminus of a secondary structure-forming portion of an original protein.

[0109] Here, the loop structure-forming amino acid sequence is preferably designed such that the amino acid sequence of the secondary structure-free region removed, on the basis of the design concept, in step (i) is replaced by the necessary number of amino acids. That is, it is preferable that the amino acid sequence of the loop structure portion is designed so as to maintain the amino acid sequence of the original protein as much as possible. From the viewpoint of using, as a pharmaceutical composition, the resulting cyclized mutant protein, it is desirable to design so as to maintain the same conformation as that of the original protein as much as possible.

[0110] In this regard, however, at least the amino acid residues of the linear mutant protein nearest the N-terminal and nearest the C-terminal should be amino acid residues necessary for cyclization. As described above, when split inteins are used for cyclization, either the nearest N-terminal residue or the nearest C-terminal residue is cysteine or serine, and the nearest C-terminal residue is an amino acid residue other than proline.
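The terminal-residue rule for split inteins stated in paragraph [0110] can be expressed as a simple check. This is a minimal sketch using one-letter amino acid codes; the function name and the short test sequences are hypothetical.

```python
def termini_suit_split_inteins(seq: str) -> bool:
    """True if either terminal residue is cysteine or serine and the
    C-terminal residue is not proline (one-letter amino acid codes)."""
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    n_term, c_term = seq[0], seq[-1]
    has_cys_or_ser = n_term in "CS" or c_term in "CS"
    return has_cys_or_ser and c_term != "P"


# Serine at the N-terminus and glycine at the C-terminus, as in the
# improved G-CSFs; the interior residues here are arbitrary.
ok = termini_suit_split_inteins("SAAAAG")
```

A check of this kind only screens designs against the stated sequence requirement; it says nothing about how efficiently a permitted combination actually cyclizes.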

[0111] Hence, in this case, the design may be such that the amino acid length of the remaining secondary structure-free region of the original protein plus the number of amino acid residues necessary for cyclization (when split inteins are used, the total number of amino acid residues nearest the N-terminal side and nearest the C-terminal side is two) equals the amino acid length of the loop structure determined above.

[0112] In other words, the linear mutant protein or cyclized mutant protein as obtained by the above production method can be said to have an amino acid sequence in which a predetermined number of amino acid residues is deleted from a portion of the secondary structure-free region at each of the N-terminus and the C-terminus of the original protein; after the deletion, amino acid residues necessary for cyclization are added to each of the N-terminus and the C-terminus. As a result, the conformation of the original protein can remain the same as much as possible and an increase in the cyclization efficiency can be achieved while the number of amino acids added is minimal.

(About Other Cyclization Techniques)

[0113] When cyclization techniques other than the technique using inteins are used, loop sequence design conditions are different. For instance, when a disulfide bond is used for cyclization, cysteine is added to each of the N-terminus and the C-terminus. In addition, when the sortase-mediated formation of an isopeptide bond is used for cyclization, it is essential to design a loop with an amino acid sequence suitable for the substrate specificity of a sortase used. Specifically, when Staphylococcus aureus-derived sortase A is used, a loop region should have a plurality of Gly residues at the N-terminus and a sequence containing Leu, Pro, Xaa (Xaa is any amino acid), Thr, and Gly at the C-terminus.
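The alternative design conditions in paragraph [0113] can likewise be written as sequence checks. The sketch below makes two assumptions that go beyond the text: "a plurality of Gly residues" is read as two or more consecutive glycines at the very N-terminus, and the Leu-Pro-Xaa-Thr-Gly sequence is required to be the last five residues. The function names and test sequences are hypothetical.

```python
import re

# Assumption: at least two glycines at the very start of the chain.
_N_TERMINAL_GLY = re.compile(r"^G{2,}")
# Assumption: Leu-Pro-Xaa-Thr-Gly forms the last five residues.
_C_TERMINAL_LPXTG = re.compile(r"LP[A-Z]TG$")


def suits_sortase_a(seq: str) -> bool:
    """Check the termini for the sortase A design condition."""
    return bool(_N_TERMINAL_GLY.search(seq)) and bool(_C_TERMINAL_LPXTG.search(seq))


def suits_disulfide(seq: str) -> bool:
    """Check that cysteine is present at both termini."""
    return len(seq) >= 2 and seq[0] == "C" and seq[-1] == "C"
```

For example, `suits_sortase_a("GGGAAALPETG")` passes both conditions, whereas a chain starting with a single glycine does not.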

(To Produce Improved Protein)

(1) To Produce Improved Protein by Using Genetic Engineering Technique.

[0114] a. Gene Encoding Improved Protein

[0115] In the present invention, a genetic engineering method can be used to produce the improved proteins designed above.

[0116] Specific examples of the gene used, in such a method, for each improved G-CSF include nucleic acids encoding the amino acid sequences set forth in SEQ ID NOs: 2, 3, 4, and 5 as described above. The nucleic acids are not limited to the above as long as the nucleic acids each encode a protein having substantially the same G-CSF receptor-binding activity as of each improved G-CSF.

[0117] Examples may include: nucleic acids encoding proteins each having G-CSF receptor-binding activity and having an amino acid sequence in which one or several amino acid residues are deleted from, replaced in, or added to the amino acid sequence set forth in SEQ ID NO: 2, 3, 4, or 5 except that the amino acid residues added to the N-terminus and the C-terminus so as to perform the above-described cyclization reaction remain intact; and nucleic acids encoding proteins each having said amino acid sequence in which the residue at either the N-terminus or C-terminus is changed to cysteine or serine and the residue at the C-terminus is changed to an amino acid other than proline.

[0118] Examples of such genes include nucleic acids, each consisting of the nucleotide sequence set forth in SEQ ID NO: 17, 18, 19, or 20.

[0119] Also, examples of the gene used in the present invention include a gene, the nucleic acid of which is hybridized, under stringent conditions, with a sequence complementary to the sequence of a nucleic acid encoding the amino acid sequence set forth in SEQ ID NO: 2, 3, 4, or 5 except for a portion encoding the amino acid residues added at the N-terminus and the C-terminus so as to perform the above-described cyclization reaction, and in which a nucleic acid encoding cysteine or serine is added to either the N-terminus or the C-terminus of a nucleic acid encoding a protein having G-CSF receptor-binding activity and a nucleic acid encoding an amino acid other than proline is added to the C-terminus. As used herein, the term "stringent conditions" refers to conditions under which a specific hybrid is formed and no non-specific hybrid is formed. For instance, the stringent conditions refer to conditions under which a nucleic acid having high homology (the homology is 60% or higher and preferably 80% or higher) is hybridized therewith. More specifically, under such conditions, the concentration of sodium is from 150 to 900 mM and preferably from 600 to 900 mM and the temperature is from 60 to 68.degree. C. and preferably 65.degree. C. For example, the hybridization is performed at 65.degree. C. and the washing is performed in 0.1% SDS-containing 0.1.times.SSC at 65.degree. C. for 10 min. When hybridization under such conditions is demonstrated by a conventional technique (e.g., Southern blotting, dot-blot hybridization), it can be said that hybridization under the stringent conditions occurs.

[0120] In addition to the above, a gene used for the cyclization reaction consists of a nucleic acid encoding a protein having an amino acid sequence (SEQ ID NO: 10, 11, 12, or 13) in which 2 amino acid sequences set forth in SEQ ID NOs: 15 and 16 are linked to the N-terminus and the C-terminus of the amino acid sequence set forth in SEQ ID NO: 2, 3, 4, or 5, or having an amino acid sequence in which the above-described modification is added to the amino acid sequence set forth in SEQ ID NO: 2, 3, 4, or 5 or an amino acid sequence encoded by a nucleic acid as obtained by adding the above-described modification to a nucleic acid hybridized, under stringent conditions, with a nucleic acid encoding the amino acid sequence set forth in SEQ ID NO: 2, 3, 4, or 5; and having an amino acid sequence in which 2 amino acid sequences set forth in SEQ ID NOs: 15 and 16 are linked to the N-terminus and the C-terminus of the amino acid sequence of a protein having G-CSF receptor-binding activity and having a function that makes the amino acid sequence sandwiched between SEQ ID NOs: 15 and 16 cyclic as a result of an autocatalytic process.

[0121] Examples of such genes include nucleic acids, each consisting of the nucleotide sequence set forth in SEQ ID NO: 21, 22, 23, or 24.

b. To Produce Genes, Recombinant Vectors, and Transformants

[0122] Genes according to the present invention can be synthesized using chemical synthesis, a PCR method, a cassette mutation method, site-specific mutagenesis, etc. For example, a plurality of oligonucleotides, each having up to about 100 nucleotides and a complementary region of about 20 nucleotides at the terminus, are chemically synthesized. Then, they may be used in combination in an overlap elongation process (Reference Document 7) to perform total synthesis of a gene of interest.
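
By way of a non-limiting illustration, the splitting of a gene into overlapping oligonucleotides for such an overlap elongation process may be sketched as follows. This Python sketch is an assumption-laden example and not the procedure of Reference Document 7: the function names, the alternation of sense and antisense strands, and the placeholder input sequence are hypothetical, and no melting-temperature or secondary-structure optimization is attempted.

```python
# Hypothetical helpers for illustration; not part of the disclosed method.
_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def design_overlap_oligos(gene: str, max_len: int = 100, overlap: int = 20) -> list:
    """Tile a gene with oligos of at most max_len nt sharing `overlap` nt with the next oligo.

    Even-numbered oligos are taken from the sense strand and odd-numbered
    oligos from the antisense strand, so that neighbors can anneal and be extended.
    """
    if not gene:
        raise ValueError("gene must not be empty")
    if not 0 < overlap < max_len:
        raise ValueError("overlap must be positive and smaller than max_len")
    gene = gene.upper()
    step = max_len - overlap
    oligos = []
    start = 0
    index = 0
    while True:
        end = min(start + max_len, len(gene))
        segment = gene[start:end]
        oligos.append(segment if index % 2 == 0 else reverse_complement(segment))
        if end == len(gene):
            break
        start += step
        index += 1
    return oligos


# Placeholder 260-nt sequence, NOT one of the sequences of the sequence listing.
placeholder_gene = "ACGT" * 65
for number, oligo in enumerate(design_overlap_oligos(placeholder_gene), start=1):
    print(number, len(oligo))
```

Note that this simple tiling may leave a very short final oligonucleotide; in practice the segment boundaries would be adjusted by hand or by dedicated design software.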

[0123] A recombinant vector according to the present invention can be constructed by ligating (inserting) a gene containing the above-described nucleotide sequence into a suitable vector. A vector used in the present invention has no particular limitation as long as the vector can replicate in a host or a gene of interest can be integrated into a host genome by means of the vector. Examples include bacteriophages, plasmids, cosmids, and phagemids.

[0124] Examples of the plasmid DNA include Actinomyces-derived plasmids (e.g., pK4, pRK401, pRF31), E. coli-derived plasmids (e.g. pBR322, pBR325, pUC118, pUC119, pUC18), Bacillus subtilis-derived plasmids (e.g., pUB110, pTP5), and yeast-derived plasmids (e.g., YEp13, YEp24, YCp50). Examples of the phage DNA include .lamda. phages (e.g., .lamda.gt10, .lamda.gt11, .lamda.ZAP). Further, animal viruses such as a retrovirus or a vaccinia virus or insect virus vectors such as a baculovirus can be used.

[0125] In order to insert a gene into a vector, for example, a method is adopted in which purified DNA is first digested by a suitable restriction enzyme(s), is inserted into a restriction enzyme site(s) or a multi-cloning site of a suitable vector DNA, and is ligated to the vector. The gene should be cloned in the vector so as to express an improved protein according to the present invention. Here, a promoter and a gene nucleotide sequence as well as, as desired, a cis-element (e.g., an enhancer), a splicing signal, a poly-A tail signal, a selection marker, a ribosome binding sequence (SD sequence), a start codon, and/or a stop codon may be ligated to the vector according to the present invention. In addition, it is possible to add a tag sequence to easily purify a protein produced. Examples of the tag sequence that can be used include nucleotide sequences encoding known tags such as a His tag, a GST tag, and an MBP tag.

[0126] Whether or not the gene is inserted into the vector can be examined by using a known genetic engineering technology. For instance, in the case of a plasmid vector, etc., the insertion can be examined by a procedure in which a competent cell is used for vector subcloning, DNA is extracted, and a DNA sequencer is then used to check its nucleotide sequence. A similar procedure is applicable to other vectors as long as bacteria or other hosts can be used for subcloning. In addition, a vector is effectively selected by using a selection marker such as a drug resistance gene.

[0127] Transformants may be obtained by introducing a recombinant vector according to the present invention into a host cell so as to enable expression of an improved protein according to the present invention. A host used for transformation is not particularly limited if the host can express a protein or a polypeptide. Examples include bacteria (e.g., Escherichia coli, Bacillus subtilis), yeast, plant cells, animal cells (e.g., COS cells, CHO cells), and insect cells.

[0128] When a bacterium is used as a host, a recombinant vector should be able to self-replicate in the bacterium, and it is preferable that the vector includes a promoter, a ribosome binding sequence, a start codon, a nucleic acid encoding an improved protein according to the present invention, and a transcription termination sequence. Examples of the Escherichia coli include Escherichia coli DH5.alpha.. Examples of the Bacillus include Bacillus subtilis. A method for introducing a recombinant vector into a bacterium is not particularly limited if the method can introduce a DNA into a bacterium. Examples of the method include a method using a calcium ion and electroporation.

[0129] When the yeast is used as a host, for example, Saccharomyces cerevisiae, Schizosaccharomyces pombe, or another yeast may be employed. A method for introducing a recombinant vector into yeast is not particularly limited if the method can introduce a DNA into yeast. Examples of the method include electroporation, a spheroplast method, and a lithium acetate method.

[0130] When the animal cell is used as a host, a monkey cell COS-7, a Vero cell, a Chinese hamster ovary cell (CHO cell), a mouse L cell, a rat GH3, a human FL cell, or the like may be employed. Examples of a method for introducing a recombinant vector into an animal cell include electroporation, a calcium phosphate method, and lipofection.

[0131] When the insect cell is used as a host, Sf9 or another insect cell may be employed. Examples of a method for introducing a recombinant vector into an insect cell include a calcium phosphate method, lipofection, and electroporation.

[0132] Whether or not a gene is integrated into a host can be examined by a PCR method, Southern hybridization, Northern hybridization, or the like. For example, a DNA is prepared from a transformant, and DNA-specific primers are designed to carry out a PCR. Then, the resulting PCR amplification product is subjected to agarose gel electrophoresis, polyacrylamide gel electrophoresis, capillary electrophoresis, or the like. Subsequently, the amplification product is stained with, for example, ethidium bromide or a SYBR Green solution and is detected as a single band. This can verify that a cell has been transformed. In addition, a primer labeled in advance with, for example, a fluorescent dye can be used to carry out a PCR to detect its amplification product.

c. To Obtain Improved Protein by Culturing Transformant

[0133] When produced as a recombinant protein, an improved protein according to the present invention may be obtained by culturing the above-described transformant and by collecting the protein from the culture. The term "culture" means any of culture supernatant, cultured cells or microorganisms, and lysates of cells or microorganisms. Transformants according to the present invention may be cultured by a common procedure used for culturing their host.

[0134] Either a natural medium or a synthetic medium may be used as a medium for culturing a transformant obtained by using a microorganism such as E. coli or yeast as a host if the medium contains, for example, a carbon source, a nitrogen source, and inorganic salts, as utilized by the microorganism, to efficiently culture the transformant. Examples of the carbon source include carbohydrates (e.g., glucose, fructose, sucrose, starch), organic acids (e.g., acetic acid, propionic acid), and alcohols (e.g., ethanol, propanol). Examples of the nitrogen source include ammonia, ammonium chloride, ammonium salts of inorganic or organic acids (e.g., ammonium sulfate, ammonium acetate, ammonium phosphate), other nitrogen-containing compounds, peptone, meat extracts, and corn steep liquors. Examples of the inorganic matter include potassium primary phosphate, potassium secondary phosphate, magnesium phosphate, magnesium sulfate, sodium chloride, ferrous sulfate, manganese sulfate, copper sulfate, and calcium carbonate. The culturing is carried out at 20 to 37.degree. C. for 12 h to 3 days under aerobic conditions such as shaking culture or aerated and agitated culture.

[0135] When an improved protein according to the present invention is produced in microbial bodies or cells after culturing, the protein may be collected, for example, after homogenization (e.g., sonication, repeated freeze-thawing) so as to disrupt the microbial bodies or cells. In addition, when the protein is secreted from microbial bodies or cells, the resulting culture medium may be used as it is or is, for example, centrifuged to remove the microbial bodies or cells. Then, common biochemical processes used for isolating and purifying a protein are used singly or in combination, including ammonium sulfate precipitation, size-exclusion chromatography, ion exchange chromatography, and affinity chromatography. By doing so, an improved protein according to the present invention can be isolated and purified from the culture.

[0136] In addition, it is possible to use what is called a cell-free synthesis system using a mixture of only protein biosynthesis reaction-involving factors (e.g., enzymes, nucleic acids, ATP, amino acids). In this case, an improved protein according to the present invention can be synthesized from a vector without using viable cells (Reference Document 8). Then, the purification processes may be likewise used to isolate and purify the improved protein according to the present invention from a post-reaction mixed solution.

[0137] To check whether the purified/isolated improved protein according to the present invention is a protein consisting of an amino acid sequence of interest, a sample containing the protein is analyzed. Analysis methods that may be used include SDS-PAGE, Western blotting, mass spectrometry, amino acid analysis, and analysis using an amino acid sequencer (Reference Document 9).

(2) To Produce Improved Protein by Using Another Technique

[0138] An improved protein according to the present invention may be produced using an organic chemistry technique (e.g., a solid-phase peptide synthesis process). A process for producing a protein by using such a technique is well-known in the art and is briefly described below.

[0139] When a protein is chemically produced using a solid-phase peptide synthesis process, an automatic synthesizer is preferably used to synthesize, on a resin, a protected polypeptide having an amino acid sequence of an improved protein according to the present invention while repeating a polycondensation reaction of activated amino acid derivatives. Next, this protected polypeptide is cleaved from the resin and, at the same time, the side chain protection groups are cleaved. Regarding this cleavage reaction, it is known that a suitable cocktail can be prepared in accordance with the type of resin or protection group and/or the amino acid composition (Reference Document 10). Subsequently, a crude protein is transferred from an organic solvent layer to an aqueous layer and a mutant protein of interest is then purified. Examples of the purification process that can be used include reverse-phase chromatography (Reference Document 10).

(Tests for Checking Performance of Improved Protein)

[0140] The following tests for checking performance may be conducted to select which of the improved proteins so produced are good. Here, all of the improved proteins according to the present invention showed good performance.

(1) Test for G-CSF Receptor-binding Activity

[0141] The G-CSF receptor-binding activity of an improved protein according to the present invention may be checked and evaluated using Western blotting, immunoprecipitation, pull-down assay, ELISA (Enzyme-Linked ImmunoSorbent Assay), surface plasmon resonance (SPR) spectroscopy, etc. Among them, the SPR spectroscopy enables real-time sequential observation of the interaction between label-free biomolecules. Accordingly, the SPR spectroscopy makes it possible to quantitatively evaluate a binding reaction of the improved protein from the kinetic viewpoint.

(2) Test for Activity of Improved Protein on Cells

[0142] The activity of an improved protein according to the present invention on cells can be evaluated using, as an index, a proliferation potential of a G-CSF-dependent culture cell line (NFS-60). The NFS-60 is a cell line exhibiting the G-CSF-dependent proliferation potential and a cell line generally used for biological activity assay for G-CSF including a standard G-CSF product produced by NIBSC. Hence, the cell line is suitable when the activity of the improved protein on cells is evaluated.

(3) Test for Thermostability of Improved Protein

[0143] The thermostability of an improved protein according to the present invention can be evaluated using circular dichroism (CD) spectroscopy, fluorescence spectroscopy, infrared spectroscopy, differential scanning calorimetry, residual activity after heating, etc. Among them, the CD spectroscopy is a spectroscopic analysis method that precisely reflects a change in the secondary structure of a protein. Because of this, it is possible to observe a temperature-dependent change in the conformation of the improved protein and to quantitatively evaluate the thermodynamical structural stability thereof.

(4) Test for Stability of Improved Protein against Protease

[0144] The stability of an improved protein according to the present invention against a protease can be evaluated using, for example, a protocol in which a protease such as carboxypeptidase Y and the improved protein are mixed; and the amount of a degradation product caused by the reaction or the amount of a remaining unreacted molecule is analyzed over time by using SDS-PAGE, liquid chromatography, etc. Among them, the SDS-PAGE can be used to analyze a trace amount of a protein in a simple fashion, so that this method is suitable when the stability of the improved protein against a protease is evaluated.
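
By way of a non-limiting illustration, a half-life may be estimated from such a time course of the remaining unreacted molecule. The following Python sketch assumes simple first-order (single-exponential) decay, which the present description does not state, and uses a hypothetical function name and made-up data points; it is not the analysis actually used to obtain the values reported herein.

```python
import math


def half_life_hours(times_h, fraction_remaining):
    """Estimate t0.5 by least-squares fitting ln(fraction) = intercept - k * t.

    Assumes first-order decay (an assumption of this sketch).
    """
    if len(times_h) != len(fraction_remaining) or len(times_h) < 2:
        raise ValueError("need at least two paired observations")
    if any(f <= 0 for f in fraction_remaining):
        raise ValueError("fractions must be positive to take logarithms")
    log_fractions = [math.log(f) for f in fraction_remaining]
    n = len(times_h)
    mean_t = sum(times_h) / n
    mean_y = sum(log_fractions) / n
    sxx = sum((t - mean_t) ** 2 for t in times_h)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_h, log_fractions))
    rate_constant = -sxy / sxx  # k in 1/h
    if rate_constant <= 0:
        raise ValueError("no decay detected in the data")
    return math.log(2) / rate_constant


# Made-up band intensities, normalized to the zero-time sample.
times = [0.0, 1.0, 2.0, 4.0]
fractions = [1.0, 0.5, 0.25, 0.0625]
print(half_life_hours(times, fractions))
```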

(5) Test for In Vivo Stability of Improved Protein

[0145] The in vivo stability of an improved protein according to the present invention can be evaluated using, for example, a protocol in which the improved protein is intravenously injected into a rat; and the concentration of G-CSF present in blood is analyzed over time by using ELISA, SDS-PAGE, liquid chromatography, etc. Among them, the ELISA analysis can be used to quantify a trace amount of G-CSF in a sample containing serum components in a simple and accurate manner, so that the method is suitable when the in vivo stability is evaluated.

(6) Conformation Analysis of Improved Protein

[0146] The conformation of an improved protein according to the present invention can be evaluated using X-ray crystallography, nuclear magnetic resonance spectroscopy, etc.

[0147] Among them, the X-ray crystallography can be used to analyze a high-resolution diffraction image and to precisely observe a local structure including the conformation of terminal regions that are subjected to cyclization, so that the method is suitable when the conformation of the improved protein is evaluated.

EXAMPLES

Example 1

To Design Improved G-CSFs

[0148] (1) As used herein, a linear control G-CSF, which is a starting material for improvement, is defined as follows. Specifically, the linear control G-CSF is a linear polypeptide consisting of 175 amino acids set forth in SEQ ID NO: 1. When the amino acid sequence of the linear control G-CSF is compared with the amino acid sequence (SEQ ID NO: 14) of filgrastim, threonine at position 2 and cysteine at position 18 are replaced by alanine and serine, respectively. A previous report has demonstrated that these mutations do not affect the activity of G-CSF (Reference Document 3).

[0149] (2) In order to make a polypeptide of interest cyclic by using the catalytic function of inteins, a residue of at least one of the N-terminus and the C-terminus should be serine or cysteine. In addition, if the residue at the C-terminus is proline, it is known that the reaction fails to proceed. Here, as a mutant modified so as to make the inteins reaction proceed by introducing the minimum number of mutations, G-CSF(C177) (SEQ ID NO: 2) was produced by designing a sequence in which serine and glycine were added to the N-terminus and the C-terminus of the linear control G-CSF, respectively.
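
By way of a non-limiting illustration, the two terminal-residue conditions stated above (serine or cysteine at at least one terminus, and a C-terminal residue other than proline) may be expressed as a simple check. The following Python sketch uses hypothetical function names and a short made-up peptide in place of SEQ ID NO: 1; only the rule itself and the lengths of 175 and 177 residues are taken from the present description.

```python
# Hypothetical helpers for illustration; not part of the disclosed method.
def is_intein_cyclizable(seq: str) -> bool:
    """Apply the two terminal-residue conditions for intein-mediated cyclization."""
    seq = seq.upper()
    has_ser_or_cys_terminus = seq[0] in "SC" or seq[-1] in "SC"
    c_terminus_is_proline = seq[-1] == "P"
    return has_ser_or_cys_terminus and not c_terminus_is_proline


def add_minimal_termini(seq: str, n_add: str = "S", c_add: str = "G") -> str:
    """Add serine to the N-terminus and glycine to the C-terminus, as for G-CSF(C177)."""
    return n_add + seq + c_add


# Made-up peptide that, like the linear control, starts with Met and ends with Pro.
toy_linear = "MAPLGQP"
print(is_intein_cyclizable(toy_linear))
print(is_intein_cyclizable(add_minimal_termini(toy_linear)))

# Length bookkeeping with a 175-residue placeholder: 175 + 2 = 177.
print(len(add_minimal_termini("A" * 175)))
```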

[0150] (3) Next, designed were G-CSFs, the terminal sequences of which were optimized for a cyclization reaction. First, human G-CSF three-dimensional structural coordinate data (PDB code: 2D9Q-A chain) was downloaded from the Protein Data Bank (PDB; http://www.rcsb.org/), an international database for the three-dimensional structures of proteins. Next, nonstructural regions that were disadvantageous for the cyclization reaction were selected and removed from amino acids positioned at the N-terminus and the C-terminus of G-CSF. According to the STRIDE, which is a secondary structure-determining algorithm provided at the PDB, an N-terminal region from glutamine at position 11 to serine at position 37 (corresponding to a region from position 12 to position 38 of the linear control G-CSF) forms a helix (helix A). Meanwhile, a C-terminal region from alanine at position 143 to leucine at position 171 (corresponding to a region from position 144 to position 172 of the linear control G-CSF) forms a helix structure (helix D) (FIG. 1). Then, 11 amino acids from methionine at position 1 to proline at position 11 and 3 amino acids from alanine at position 173 to proline at position 175 were deleted from the linear control G-CSF to give a sequence as a backbone suitable for cyclization.

[0151] Subsequently, the optimal length of a loop used to connect the N-terminus and the C-terminus was determined using the following procedure.

[0152] High-resolution three-dimensional structures having a resolution of 2.5 .ANG. or better and registered at the PDB were collected. Among the structures, the structures of loops sandwiched between two helices were retrieved. Among these loop structures, retrieved were 57 loop structures in which 4 amino acids at the terminus of each flanking helix (the total of 8 amino acids) are structurally similar (i.e., the root mean square deviation of .alpha.-carbon is less than 1 .ANG.) to 4 amino acids at each terminus (the total of 8 amino acids) of the backbone. The numbers of amino acids of such loop structures were set as references and were used for comparison. This comparison demonstrated that 13 of the loop structures consisted of 2 amino acids (accounting for 22% of the total). Accordingly, it was determined that two amino acids were sufficient for the loop connecting the N-terminus and the C-terminus. An amino acid sequence containing serine and glycine as the loop sequence was added to give G-CSF(C163) (SEQ ID NO: 5), the smallest improved G-CSF suitable for the cyclization reaction.
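
By way of a non-limiting illustration, the residue bookkeeping behind G-CSF(C163) may be verified as follows. The following Python sketch uses a hypothetical function name and a 175-residue placeholder string rather than SEQ ID NO: 1. It further assumes, by analogy with G-CSF(C177), that serine is placed at the N-terminus and glycine at the C-terminus of the precursor; the actual arrangement is defined by SEQ ID NO: 5.

```python
# Hypothetical helper for illustration; not part of the disclosed method.
def trim_and_add_loop(seq: str, n_delete: int, c_delete: int,
                      n_add: str = "S", c_add: str = "G") -> str:
    """Delete unstructured terminal residues, then add the loop-forming residues."""
    if n_delete < 0 or c_delete < 0 or n_delete + c_delete >= len(seq):
        raise ValueError("deletions must leave at least one residue")
    core = seq[n_delete:len(seq) - c_delete]
    return n_add + core + c_add


# Placeholder for the 175-residue linear control (NOT the real sequence).
placeholder_linear_control = "X" * 175

# Positions 1 to 11 (11 residues) and 173 to 175 (3 residues) are deleted,
# and a 2-residue loop (Ser and Gly) is added: 175 - 11 - 3 + 2 = 163.
precursor = trim_and_add_loop(placeholder_linear_control, n_delete=11, c_delete=3)
print(len(precursor))
```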

[0153] On one hand, minimization of the loop region increases cyclization reaction efficiency. On the other hand, the minimization may cause distortion of the three-dimensional structure, thereby decreasing the stability. From this, it can be predicted that the number of amino acids that maximizes the combined benefit of the increase in the cyclization reaction efficiency and the effect of stabilizing the structure lies between those of G-CSF(C177) and G-CSF(C163). Specifically, G-CSF(C170) and G-CSF(C166) set forth in SEQ ID NOs: 3 and 4, respectively, are exemplified as the candidates.

Example 2

To Synthesize Linear Control G-CSF-Expression Plasmid

[0154] (1) A chemically synthesized nucleic acid sequence (SEQ ID NO: 25) encoding the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF was purchased.

[0155] (2) Using the chemically synthesized nucleic acid as a template, PCR amplification was carried out using two primers (SEQ ID NOs: 26 and 27). The resulting PCR product was purified and digested by restriction enzymes NcoI and BamHI. Next, an E. coli vector pET16b was digested by restriction enzymes NcoI and BamHI and a band at or near 5600 bp was excised and purified. The band was dephosphorylated using E. coli alkaline phosphatase. The two samples purified were ligated using T7 DNA ligase.

[0156] (3) E. coli DH5.alpha. was transformed with the resulting ligation product of (2) and was cultured on an LB plate medium containing 100 .mu.g/ml of ampicillin. The resulting transformants were subjected to colony PCR and DNA sequencing (GE Healthcare Bioscience, BigDye Terminator v3.1) and were then selected. After that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a linear control G-CSF-expression plasmid.

Example 3

To Construct Inteins Vector

[0157] (1) A synthetic gene was purchased having a nucleotide sequence (SEQ ID NO: 28) encoding the inteins sequence (a sequence containing, in sequence, SEQ ID NO: 15 and SEQ ID NO: 16) derived from Nostoc punctiforme.

[0158] (2) The synthetic gene of (1) was mixed with restriction enzymes NcoI and XhoI and was subjected to a cleavage reaction at 37.degree. C. for 4 h. After gel electrophoresis, a band at or near 450 bp was excised and purified. Next, an E. coli vector pET16b was digested by NcoI and XhoI and a band at or near 5600 bp was excised and purified. The band was dephosphorylated using E. coli alkaline phosphatase. The two samples purified were ligated using T7 DNA ligase.

[0159] (3) E. coli DH5.alpha. was transformed with the resulting ligation product and was cultured on an LB plate medium containing 100 .mu.g/ml of ampicillin. The resulting transformants were subjected to colony PCR and DNA sequencing (GE Healthcare Bioscience, BigDye Terminator v3.1) and were then selected. After that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract an inteins-expression plasmid.

Example 4

To Synthesize G-CSF(C177)-Expression Plasmid

[0160] A nucleic acid (SEQ ID NO: 17) encoding the amino acid sequence of G-CSF(C177) was synthesized by PCR reaction using the linear control G-CSF-expression plasmid as a template and two single strand DNAs (SEQ ID NO: 29 and SEQ ID NO: 30) as primers. The resulting product was digested by restriction enzymes NheI and NdeI and was then purified. Next, the inteins-expression plasmid was digested by restriction enzymes NheI and NdeI and was then purified. The two samples purified were ligated using T7 DNA ligase.

[0161] E. coli DH5.alpha. was transformed with the resulting ligation product and was cultured on an LB plate medium containing 100 .mu.g/ml of ampicillin. The resulting transformants were subjected to colony PCR and DNA sequencing (GE Healthcare Bioscience, BigDye Terminator v3.1) and were then selected. After that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a G-CSF(C177)-expression plasmid.

Example 5

To Synthesize G-CSF(C163)-Expression Plasmid

[0162] A nucleic acid (SEQ ID NO: 20) encoding the amino acid sequence of G-CSF(C163) was synthesized by PCR reaction using the linear control G-CSF-expression plasmid as a template and two single strand DNAs (SEQ ID NO: 35 and SEQ ID NO: 36) as primers. The resulting product was digested by restriction enzymes NheI and NdeI and was then purified. Next, the inteins-expression plasmid was digested by restriction enzymes NheI and NdeI and was then purified. The two samples purified were ligated using T7 DNA ligase.

[0163] E. coli DH5.alpha. was transformed with the resulting ligation product and was cultured on an LB plate medium containing 100 .mu.g/ml of ampicillin. The resulting transformants were subjected to colony PCR and DNA sequencing (GE Healthcare Bioscience, BigDye Terminator v3.1) and were then selected. After that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a mutant G-CSF(C163)-expression plasmid.

Example 6

To Produce Linear Control G-CSF, G-CSF(C177), and G-CSF(C163)

[0164] (1) E. coli BL21(DE3) (Novagen), which was used for expression, was transformed with the linear control G-CSF-expression plasmid. The resulting transformant was pre-cultured and was inoculated in an LB medium at 1 ml/200 ml and the mixture was cultured under shaking until O.D..sub.600 =0.8 to 1.0. To express the linear control G-CSF, IPTG (0.4 mM) was added and the mixture was further cultured under shaking at 37.degree. C. for 4 h.

[0165] (2) The bacteria collected were suspended in 10 ml of a lysis buffer. The lysis buffer is a PBS containing 1% (w/v) sodium deoxycholate (DOC), 1.2 kU/ml lysozyme, and 25 U/ml Benzonase. After the suspension, the lysate was stirred at room temperature for 15 min and was subjected to sonication for 2 min, followed by centrifugation at 4.degree. C. for 10 min at 10,000 .times.g. The supernatant was removed, and 20 ml of washing buffer 1 was added to the precipitate to be suspended. Washing buffer 1 contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 2% Tween20. The suspended sample was centrifuged to remove the supernatant. Then, 20 ml of washing buffer 2 was added to the precipitate to be suspended. Washing buffer 2 contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 1% (w/v) DOC. The suspended sample was centrifuged to remove the supernatant. Then, 20 ml of washing buffer 3 was added to the precipitate to be suspended. Washing buffer 3 contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 1 M NaCl. The suspended sample was centrifuged to remove the supernatant. Then, 10 ml of a solubilization buffer was added to the precipitate, and the mixture was stirred at room temperature for 18 h. The solubilization buffer contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 6 M guanidine hydrochloride. The solubilized sample was concentrated by size-exclusion filtration and the concentrate was added dropwise to the 10-fold volume of an ice-cold refolding buffer. Next, the mixture, as it was, was stirred at 4.degree. C. for 18 h. The refolding buffer contains 100 mM Tris-HCl pH 8.0 (4.degree. C.), 2 mM EDTA, 400 mM L-arginine, 1 mM reduced glutathione, and 0.1 mM oxidized glutathione. After that, the refolded sample was used as internal liquid and the 100-fold volume of 20 mM Tris-HCl pH 8.0 (4.degree. C.) was used as external liquid to carry out dialysis (at 4.degree. C. for 18 h). The sample was centrifuged to recover the supernatant, which was filtered with a 0.22-.mu.m syringe filter and was subjected to dialysis using 20 mM Tris-HCl pH 8.0 (25.degree. C.).

[0166] (3) The filter-sterilized dialysis internal liquid was injected into a HiTrap Q HP column (GE Healthcare Bioscience) of a liquid chromatography apparatus AKTApurifier (GE Healthcare Bioscience). Then, the linear control G-CSF was purified by anion exchange chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.); an elution buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.) and 1 M NaCl). Further, the purified sample was injected into a Superdex75 10/300 GL column (GE Healthcare Bioscience) of the liquid chromatography apparatus AKTApurifier (GE Healthcare Bioscience). The sample was purified by size-exclusion chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.) and 150 mM NaCl). The purified sample was subjected to dialysis using PBS and was then stored at 4.degree. C.

[0167] (4) G-CSF(C177) and G-CSF(C163) were likewise produced using the procedures (1), (2), and (3). In this regard, however, to isolate a linear molecule produced as a by-product of the inteins reaction, the following manipulations were added to the step of purifying G-CSF(C177). The sample purified by size-exclusion chromatography was concentrated by size-exclusion filtration, followed by dialysis using the 100-fold volume of 20 mM Tris-HCl pH 8.0 (25.degree. C.). The filter-sterilized dialysis internal liquid was injected into a MonoQ column (GE Healthcare Bioscience) of a liquid chromatography apparatus AKTApurifier (GE Healthcare Bioscience). Then, the G-CSF(C177) was purified by anion exchange chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.); an elution buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.) and 1 M NaCl). When the amount of G-CSF(C177) added to the above-described column was 20 .mu.g or less, a cyclized main product and a linear by-product were separated as 2 independent peaks. The purified sample of interest was subjected to dialysis using PBS and was then stored at 4.degree. C.

Example 7

To Measure Cyclization Efficiency by Using SDS-PAGE

[0168] In this example, the purity of each G-CSF after purification was evaluated by using SDS-PAGE.

[0169] An aqueous solution containing each purified G-CSF at a concentration of about 50 .mu.M was prepared and was subjected to SDS-PAGE (4 to 20% precast gels (Bio-Rad), at 200 V for 30 min). Then, an Oriole.RTM. Fluorescent Gel Stain kit (Bio-Rad) was used to detect a band, and the purity of each G-CSF was determined. The results showed that the linear control G-CSF and G-CSF(C163) were each detected as the major band in every sample measured and that the degree of purification was sufficient. The G-CSF(C177) sample after the size-exclusion chromatography contained about 30% of a linear molecule and the purity of the cyclized product was demonstrated to be 70% or less. However, the purity after the final purification using a MonoQ column was demonstrated to be 95% or higher (FIG. 3 and Table 1).
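
By way of a non-limiting illustration, a cyclization efficiency may be expressed as the share of the cyclized band in the sum of the cyclized and linear bands on an SDS-PAGE gel. The present description does not state the exact formula used, so the following Python sketch, its function name, and its intensity values are assumptions given for illustration only.

```python
# Hypothetical helper for illustration; the formula is an assumption of this sketch.
def cyclization_efficiency_percent(cyclic_intensity: float, linear_intensity: float) -> float:
    """Return the cyclized band as a percentage of cyclized plus linear band intensity."""
    if cyclic_intensity < 0 or linear_intensity < 0:
        raise ValueError("band intensities must not be negative")
    total = cyclic_intensity + linear_intensity
    if total == 0:
        raise ValueError("no signal in either band")
    return 100.0 * cyclic_intensity / total


# Made-up densitometry readings (arbitrary units) for a sample
# containing roughly 30% of the linear by-product.
print(cyclization_efficiency_percent(700.0, 300.0))
```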

TABLE 1 The cyclization efficiency and the function and stability of each modified cyclized G-CSF were compared.

Sample	Cyclization efficiency (%)	G-CSF receptor-binding activity KD (nM)	Activity on cells EC50 (pg/ml)	Thermostability Tm (.degree. C.)	Protease resistance t0.5 (hours)	Metabolic half-life t0.5 (hours)
filgrastim	--	0.19*	103	N.A.	2.0	N.A.
Linear control G-CSF	--	0.43	82.3	56.5	1.7	0.95
G-CSF(C177)	63.6	0.57	60.7	66.2	230	1.13
G-CSF(C163)	100	0.57	80.6	58.3	11	1.05

*The value reported (by Feng Y et al., 1999, Biochemistry) was cited.

Example 8

To Measure G-CSF Receptor-Binding Activity by Using Surface Plasmon Resonance Spectroscopy

[0170] As described below, surface plasmon resonance (SPR) spectroscopy was used to evaluate the receptor-binding activity of each of the linear control G-CSF, G-CSF(C177), and G-CSF(C163).

[0171] Note that SPR spectroscopy is recognized as an excellent method in which the specific interaction between biopolymers is measured over time and the reaction can be quantitatively interpreted from the kinetic viewpoint.

[0172] First, Protein G was immobilized on a flow cell of a sensor chip CM5 (GE Healthcare) by using a maleimide coupling reaction. Next, a running buffer HBS-P (10 mM HEPES pH 7.4, 150 mM NaCl, and 0.05% v/v Surfactant P20) in which an Fc-fusion G-CSF receptor had been dissolved was added to immobilize the receptor on the chip. The SPR measurement was carried out using Biacore T100 (Biacore) and the reaction temperature was 25.degree. C. A 2-fold dilution series from 1.25 nM to 20 nM of the sample solution was prepared, and a single kinetic mode was used for measurement and analysis.

[0173] The observation results were analyzed using BIAevaluation version 4.1 to calculate a dissociation constant. The results demonstrated that the dissociation constant between G-CSF(C177) or G-CSF(C163) and the G-CSF receptor was substantially the same as that of the linear control G-CSF, indicating that the affinity remained the same (FIG. 4 and Table 1).

Example 9

To Measure Activity on Cells

[0174] In this Example, the activity of each of filgrastim, the linear control G-CSF, G-CSF(C177), and G-CSF(C163) on cells was evaluated.

[0175] A cell line NFS-60, which exhibits G-CSF-dependent growth, was maintained under conditions at 37.degree. C. and 5% CO.sub.2 in RPMI1640 culture medium containing 10 ng/ml rhG-CSF and 10% FBS. The cells in a logarithmic growth phase were prepared at 2.times.10.sup.5 cells/mL in 10% FBS-containing (culture-use rhG-CSF-free) RPMI1640 culture medium, and were then used for experiments. A 5-fold dilution series from 3 nM to 192 fM of the linear control G-CSF, G-CSF(C177), or G-CSF(C163) was prepared using 10% FBS-containing RPMI1640 culture medium. Equal volumes of the cell solution and each G-CSF solution were mixed and the mixture was cultured under conditions at 37.degree. C. and 5% CO.sub.2 for 48 h. Then, a cell counting kit-8 (DOJINDO LABORATORIES) was used to count the number of cells.
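The dilution scheme described in paragraph [0175] can be checked with a short calculation. The following is only an illustrative sketch: the function name and the choice of seven points are assumptions, inferred from the stated endpoints (3 nM and 192 fM) and the 5-fold step.

```python
def dilution_series(top_conc, fold, n_points):
    """Return n_points concentrations, starting at top_conc and
    dividing by `fold` at each subsequent step."""
    return [top_conc / fold**i for i in range(n_points)]

# 3 nM expressed in fM (1 nM = 1,000,000 fM); 7 points is an assumption
# inferred from the endpoints given in the text.
series_fM = dilution_series(3_000_000.0, 5, 7)

for conc in series_fM:
    print(f"{conc:,.0f} fM")
```

Six successive 5-fold steps take 3 nM down to 192 fM, which is consistent with the range stated in the text.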

[0176] As a growth-stimulating activity on NFS-60 cells, the G-CSF activity of each of the linear control G-CSF, G-CSF(C177), and G-CSF(C163) was evaluated by calculating 50% effective concentration (EC50) as obtained by fitting a logistic curve to the activity data. The average of the vehicle control (VC) was subtracted from absorbance data obtained using each sample at each concentration. Then, the resulting value was divided by the average of values of the reference at the maximum activity concentration to calculate a relative activity (%) for each plate. The maximum value was set to 100%, and a logistic formula (Hill equation) was fit to the data to calculate an EC50 value.
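The normalization and curve fitting described in paragraph [0176] can be sketched as follows. This is a minimal illustration and not the software actually used: the function names, the coarse grid search used in place of a nonlinear least-squares routine, and the synthetic data are all assumptions. It is also assumed that the reference average is already background-subtracted.

```python
import math

def relative_activity(absorbance, vehicle_mean, reference_max_mean):
    """Relative activity (%): subtract the vehicle control average, then
    divide by the reference average at the maximum-activity concentration
    (assumed here to be background-subtracted already)."""
    return 100.0 * (absorbance - vehicle_mean) / reference_max_mean

def hill(conc, ec50, slope, top=100.0):
    """Hill (logistic) equation with the maximum fixed at `top`."""
    return top / (1.0 + (ec50 / conc) ** slope)

def fit_hill(concs, responses, top=100.0):
    """Grid search for the EC50 and slope that minimise the squared error."""
    lo, hi = math.log10(min(concs)), math.log10(max(concs))
    best = None
    for i in range(401):                      # EC50 candidates, log-spaced
        ec50 = 10 ** (lo + (hi - lo) * i / 400)
        for j in range(1, 61):                # slope candidates 0.1 .. 6.0
            slope = 0.1 * j
            sse = sum((hill(c, ec50, slope, top) - r) ** 2
                      for c, r in zip(concs, responses))
            if best is None or sse < best[0]:
                best = (sse, ec50, slope)
    return best[1], best[2]

# Synthetic, noise-free data (NOT measured values) in arbitrary units.
concs = [3000.0 / 5**i for i in range(7)]
true_ec50, true_slope = 4.0, 1.0
responses = [hill(c, true_ec50, true_slope) for c in concs]

ec50_fit, slope_fit = fit_hill(concs, responses)
print(f"fitted EC50 = {ec50_fit:.2f}, slope = {slope_fit:.1f}")
```

In practice a dedicated nonlinear regression routine would normally replace the grid search; the grid is used here only to keep the sketch free of external dependencies.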

[0177] The results demonstrated that G-CSF(C177) and G-CSF(C163) exhibited substantially the same activity on cells as of filgrastim or the linear control G-CSF (FIG. 5 and Table 1).

Example 10

To Determine Thermostability by Using Circular Dichroism Spectroscopy

[0178] In this Example, the thermostability of each of the linear control G-CSF, G-CSF(C177), and G-CSF(C163) was evaluated.

[0179] Circular dichroism (CD) spectroscopy is known to be a spectroscopic analysis method that precisely reflects a change in the secondary structure of a protein. The method can reveal at which temperature a sample is denatured by measuring a molar ellipticity, which is represented by the intensity of a CD spectrum, while the temperature of the sample is changed.

[0180] An aqueous solution (10 mM HEPES buffer, pH 7.4, and 150 mM sodium chloride) containing the linear control G-CSF, G-CSF(C177), or G-CSF(C163) at a concentration of 5 .mu.M was prepared. This sample solution was injected into a cylindrical cell (with a cell length of 0.2 cm), and a circular dichroism spectrophotometer model J805 (JASCO Corporation) was used for measurement.

[0181] While the measurement wavelength at a temperature of 10.degree. C. was shifted from 260 nm to 200 nm, a circular dichroism spectrum was obtained. The results showed that the linear control G-CSF, G-CSF(C177), and G-CSF(C163) had substantially the same secondary structure. The same samples were heated to 80.degree. C. and then cooled from 80.degree. C. to 10.degree. C. After that, a circular dichroism spectrum was remeasured. The molar ellipticity, however, was not recovered, indicating the occurrence of irreversible heat denaturation.

[0182] Next, while the temperature was raised from 10.degree. C. to 80.degree. C. at a rate of 1.degree. C. per min, the circular dichroism at a wavelength of 222 nm was monitored. The resulting thermal melting curve was analyzed using a theoretical formula of a two-state phase transition model (Reference Document 11). This analysis determined a denaturation temperature T.sub.m and a denaturation enthalpy change .DELTA.H.sub.m at the T.sub.m. The results revealed that the thermostability of G-CSF(C177) and G-CSF(C163) was increased by 9.7.degree. C. and 1.8.degree. C., respectively, when compared with that of the linear control G-CSF (FIG. 6 and Table 1).

Example 11

Protease Resistance Test

[0183] In this Example, the protease resistance of each of filgrastim, the linear control G-CSF, G-CSF(C177), and G-CSF(C163) was evaluated.

[0184] Carboxypeptidase Y and each G-CSF were mixed, and the proportion of an unreacted G-CSF was quantified using SDS-PAGE. Carboxypeptidase Y is an enzyme that catalyzes the cleavage of an amino acid from the C-terminus of a protein and exhibits low substrate specificity and sequence specificity during the reaction. Because of this, carboxypeptidase Y is effective in comprehensively evaluating protease resistance.

[0185] The linear control G-CSF, G-CSF(C177), or G-CSF(C163), isolated and purified, and carboxypeptidase Y were diluted and mixed in 100 mM acetate buffer (pH 6.5) at a final concentration of 10 .mu.g/ml and 1 .mu.g/ml, respectively. The reaction was carried out at 37.degree. C. for 0, 60, 120, 180, and 240 min, and 10 .mu.l of each sample was collected. Each sample was subjected to SDS-PAGE (4 to 20% precast gels (Bio-Rad), at 200 V for 30 min), and an Oriole.RTM. Fluorescent Gel Stain kit (Bio-Rad) was used to detect a band. An imaging system (ChemiDoc.TM., BioRad) and image-analyzing software (QuantityOne.TM., BioRad) were used to detect the band and quantify its image density. The result at 0 min was set to 100%, and the level of the remaining band was scored. The carboxypeptidase Y-mediated digestion reaction was assumed to be a pseudo-first-order reaction to calculate a half-life (FIG. 7 and Table 1). The results showed that G-CSF(C177) and G-CSF(C163) had markedly higher protease resistance than the linear control G-CSF.
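Under the pseudo-first-order assumption, the remaining fraction decays as exp(-kt) and the half-life equals ln 2 / k. The sketch below shows one way such a half-life could be estimated from band densities, by linear regression of the log-transformed remaining percentage against time. The function name and the synthetic data are assumptions for illustration; the actual fitting procedure is not specified in the text.

```python
import math

def first_order_half_life(times, remaining_percent):
    """Estimate a half-life by least-squares regression of
    ln(remaining %) against time. Returns math.inf when no decay
    is detected. Time points with 0% remaining are skipped."""
    pairs = [(t, math.log(r)) for t, r in zip(times, remaining_percent) if r > 0]
    n = len(pairs)
    mean_t = sum(t for t, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((t - mean_t) ** 2 for t, _ in pairs)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in pairs)
    k = -sxy / sxx                     # first-order rate constant
    if k <= 0:
        return math.inf                # no degradation observed
    return math.log(2) / k

# Sampling times from the text (min); the band densities are synthetic
# values generated for a 120 min half-life, NOT measured data.
times_min = [0, 60, 120, 180, 240]
synthetic = [100.0 * 0.5 ** (t / 120.0) for t in times_min]

half_life = first_order_half_life(times_min, synthetic)
print(f"half-life = {half_life:.1f} min")
```

A sample whose band does not diminish over the reaction time yields an infinite half-life in this sketch.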

Example 12

Test for Measuring Metabolic Half-life

[0186] In this Example, the pharmacokinetics of each of the linear control G-CSF, G-CSF(C177), and G-CSF(C163) was evaluated.

[0187] A single dose of each sample was administered at 100 .mu.g protein/kg to a tail vein of each 7-week-old male Sprague-Dawley rat. Then, 20 min, 1.5 h, 3 h, 4.5 h, 6 h, and 24 h after the administration, 100 .mu.l of blood was taken via a tail vein. Each blood sample was fractionated by centrifugation and the serum concentration of the remaining G-CSF was quantified using a G-CSF Human ELISA kit (Abcam.RTM.).

[0188] The results showed that the linear control G-CSF, G-CSF(C177), and G-CSF(C163) had a calculated metabolic half-life of 0.95 h, 1.13 h, and 1.05 h, respectively, indicating that G-CSF(C177) and G-CSF(C163) had a decrease in in vivo clearance efficiency (Table 1).

Example 13

X-ray Crystallography

[0189] In this Example, a single crystal of each improved protein was created and its three-dimensional structure was determined by X-ray crystallography. X-ray crystallography is a technique by which the high-resolution structure of a protein can be observed. It can be used to confirm that cyclization has actually occurred and to check whether or not the cyclization causes a change in the receptor-binding interface.

[0190] Regarding G-CSF(C177), a mother liquor containing 200 mM Li.sub.2SO.sub.4, 100 mM Tris-HCl (pH 8.4), and 20% (w/v) PEG 4000 was used for crystallization to give a columnar single crystal. This crystal was used to obtain diffraction data with a resolution of 3.0 .ANG.. Meanwhile, regarding G-CSF(C163), a mother liquor containing 0.4 M ammonium phosphate was used for crystallization to give a regular octahedral single crystal. This crystal was used to obtain diffraction data with a resolution of 1.65 .ANG.. These two different diffraction data were used to construct each three-dimensional structure model.

[0191] First, it was verified that the N-terminal region and the C-terminal region of each were made cyclic. With respect to each of G-CSF(C177) and G-CSF(C163), electron density maps were obtained showing that the serine residue at the N-terminus and the glycine residue at the C-terminus were bonded (FIG. 8). The three-dimensional structure of each of G-CSF(C177) and G-CSF(C163) was superimposed on the known three-dimensional structure (PDB code: 2D9Q-A chain) of the linear control G-CSF. Then, it was checked whether the introduced mutations affected the G-CSF receptor-binding interface.

[0192] There are two separate interaction interfaces between G-CSF and its receptor (site II and site III). It has been reported that each interface requires 8 amino acids for the binding (Reference Document 12). The positions of these 16 amino acids were compared among the linear control G-CSF, G-CSF(C177), and G-CSF(C163). The results showed that the distances between the corresponding .alpha.-carbons of the G-CSFs were each less than 3 .ANG., indicating no difference in their three-dimensional structures (Table 2).

TABLE 2 Comparison of the position of the .alpha.-carbon of each amino acid used to form the G-CSF receptor-binding interface between the known three-dimensional structure (PDB code: 2D9Q-A chain) of human G-CSF and the three-dimensional structure of each of G-CSF(C177) and G-CSF(C163) by using X-ray crystallography.

Site	Known structure of G-CSF (2D9Q A chain)	G-CSF(C177) distance (.ANG.)	G-CSF(C177) (corresponding amino acid)	G-CSF(C163) distance (.ANG.)	G-CSF(C163) (corresponding amino acid)
site II	Lys-16	0.49	(Lys-18)	1.00	(Lys-7)
site II	Glu-19	0.38	(Glu-21)	0.73	(Glu-10)
site II	Gln-20	0.10	(Gln-22)	0.35	(Gln-11)
site II	Arg-22	0.86	(Arg-24)	0.44	(Arg-13)
site II	Lys-23	0.34	(Lys-25)	0.47	(Lys-14)
site II	Leu-108	0.76	(Leu-110)	0.95	(Leu-99)
site II	Asp-109	0.47	(Asp-111)	0.77	(Asp-100)
site II	Asp-112	0.76	(Asp-114)	1.36	(Asp-103)
site III	Tyr-39	0.65	(Tyr-41)	0.27	(Tyr-30)
site III	Leu-41	0.30	(Leu-43)	0.33	(Leu-32)
site III	Glu-46	1.44	(Glu-48)	1.37	(Glu-37)
site III	Val-48	1.16	(Val-50)	1.74	(Val-39)
site III	Leu-49	1.29	(Leu-51)	2.39	(Leu-40)
site III	Ser-53	0.39	(Ser-55)	2.09	(Ser-44)
site III	Phe-144	0.65	(Phe-146)	0.30	(Phe-135)
site III	Arg-147	0.71	(Arg-149)	0.44	(Arg-138)
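The comparison summarized in Table 2 can be illustrated with a short script. The distances below are transcribed from Table 2; the helper function, the variable names, and the numbering offsets (+2 for G-CSF(C177) and -9 for G-CSF(C163), inferred from the parenthesized residue numbers in the table) are assumptions made for illustration. The superposition step itself is not reproduced here.

```python
import math

# (residue, number in known structure, distance to C177 (A), distance to C163 (A))
TABLE_2 = [
    ("Lys", 16, 0.49, 1.00), ("Glu", 19, 0.38, 0.73), ("Gln", 20, 0.10, 0.35),
    ("Arg", 22, 0.86, 0.44), ("Lys", 23, 0.34, 0.47), ("Leu", 108, 0.76, 0.95),
    ("Asp", 109, 0.47, 0.77), ("Asp", 112, 0.76, 1.36), ("Tyr", 39, 0.65, 0.27),
    ("Leu", 41, 0.30, 0.33), ("Glu", 46, 1.44, 1.37), ("Val", 48, 1.16, 1.74),
    ("Leu", 49, 1.29, 2.39), ("Ser", 53, 0.39, 2.09), ("Phe", 144, 0.65, 0.30),
    ("Arg", 147, 0.71, 0.44),
]

# Residue-numbering offsets inferred from Table 2 (e.g. Lys-16 -> Lys-18, Lys-7).
C177_OFFSET = 2
C163_OFFSET = -9

def ca_distance(coord_a, coord_b):
    """Euclidean distance between two alpha-carbon coordinates (x, y, z)."""
    return math.dist(coord_a, coord_b)

max_c177 = max(row[2] for row in TABLE_2)
max_c163 = max(row[3] for row in TABLE_2)
all_below_3 = all(row[2] < 3.0 and row[3] < 3.0 for row in TABLE_2)

print(f"largest deviation, G-CSF(C177): {max_c177} A")
print(f"largest deviation, G-CSF(C163): {max_c163} A")
print(f"all 16 residues below 3 A: {all_below_3}")
```

With the tabulated values, the largest deviations are found at Glu-46 for G-CSF(C177) and at Leu-49 for G-CSF(C163), both below the 3 .ANG. threshold mentioned in the text.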

Example 14

To Improve Loop Length

[0193] In Example 1, the protein conformation models registered in the database (PDB) were also compared to the structure of G-CSF. As a result, 9 similar conformation models having a loop structure consisting of 9 amino acids were discovered (accounting for 16% of the total). Likewise, 15 similar conformation models having a loop structure consisting of 5 amino acids were found (accounting for 26% of the total). Then, two different modified G-CSFs were designed and produced such that, apart from the serine at the N-terminus and the glycine at the C-terminus, the corresponding seven or three amino acids of the amino acid sequence of the linear control G-CSF were maintained as much as possible. Specifically, N-terminal 4 amino acid residues and C-terminal 3 amino acid residues of the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF were deleted; serine was added to the N-terminus and glycine was added to the C-terminus to give an engineered sequence (SEQ ID NO: 3); and split inteins were used to produce G-CSF(C170). Likewise, N-terminal 8 amino acid residues and C-terminal 3 amino acid residues of the amino acid sequence (SEQ ID NO: 1) of the linear control G-CSF were deleted; serine was added to the N-terminus and glycine was added to the C-terminus to give an engineered sequence (SEQ ID NO: 4); and split inteins were used to produce G-CSF(C166).
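The sequence design described above (terminal deletion followed by addition of serine and glycine) can be sketched as a simple string operation. This is an illustration only: the function name is hypothetical, the placeholder sequence is not SEQ ID NO: 1, and the length of 175 residues is an assumption inferred from the variant names and the stated deletion counts (for example, 175 - 4 - 3 + 2 = 170).

```python
def design_precursor(original, n_delete, c_delete, n_add="S", c_add="G"):
    """Delete n_delete residues from the N-terminus and c_delete residues
    from the C-terminus, then add one residue to each end."""
    core = original[n_delete:len(original) - c_delete]
    return n_add + core + c_add

# Placeholder of the assumed length; NOT the real G-CSF sequence.
placeholder = "A" * 175

c170 = design_precursor(placeholder, n_delete=4, c_delete=3)
c166 = design_precursor(placeholder, n_delete=8, c_delete=3)

print(len(c170), len(c166))
```

Under these assumptions the resulting lengths match the numbers in the variant names G-CSF(C170) and G-CSF(C166).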

Example 15

To Synthesize G-CSF(C170)-Expression Plasmid

[0194] A nucleic acid (SEQ ID NO: 18) encoding the amino acid sequence of G-CSF(C170) was synthesized by PCR reaction using the linear control G-CSF-expression plasmid as a template and two single strand DNAs (SEQ ID NO: 31 and SEQ ID NO: 32) as primers. The resulting product was digested by restriction enzymes NheI and NdeI and was then purified. Next, the inteins-expression plasmid was digested by restriction enzymes NheI and NdeI and was then purified. The two samples purified were ligated using T7 DNA ligase.

[0195] E. coli DH5.alpha. was transformed with the resulting ligation product and was cultured on an LB plate medium containing 100 .mu.g/ml of ampicillin. The resulting transformants were subjected to colony PCR and DNA sequencing (GE Healthcare Bioscience, BigDye Terminator v3.1) and were then selected. After that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a G-CSF(C170)-expression plasmid.

Example 16

To Synthesize G-CSF(C166)-Expression Plasmid

[0196] A nucleic acid (SEQ ID NO: 19) encoding the amino acid sequence of G-CSF(C166) was synthesized by PCR reaction using the linear control G-CSF-expression plasmid as a template and two single strand DNAs (SEQ ID NO: 33 and SEQ ID NO: 34) as primers. The resulting product was digested by restriction enzymes NheI and NdeI and was then purified. Next, the inteins-expression plasmid was digested by restriction enzymes NheI and NdeI and was then purified. The two samples purified were ligated using T7 DNA ligase.

[0197] E. coli DH5.alpha. was transformed with the resulting ligation product and was cultured on an LB plate medium containing 100 .mu.g/ml of ampicillin. The resulting transformants were subjected to colony PCR and DNA sequencing (GE Healthcare Bioscience, BigDye Terminator v3.1) and were then selected. After that, a QIAprep Spin Miniprep kit (Qiagen) was used to extract a mutant G-CSF(C166)-expression plasmid.

Example 17

To produce G-CSF(C170) and G-CSF(C166)

[0198] (1) E. coli BL21(DE3) (Novagen), which was used for expression, was transformed with the G-CSF(C170)-expression plasmid or the G-CSF(C166)-expression plasmid. A pre-culture of the resulting transformant was inoculated into an LB medium at 1 ml/200 ml and the mixture was cultured under shaking until O.D..sub.600 = 0.8 to 1.0. To induce expression, IPTG (0.4 mM) was added and the mixture was further cultured under shaking at 37.degree. C. for 4 h.

[0199] (2) The bacteria collected were suspended in 10 ml of a lysis buffer. The lysis buffer is a PBS containing 1% (w/v) sodium deoxycholate (DOC), 1.2 kU/ml lysozyme, and 25 U/ml Benzonase. After the suspension, the mixture was stirred at room temperature for 15 min and was subjected to sonication for 2 min. Then, the mixture was centrifuged at 4.degree. C. for 10 min at 10,000 .times.g. The supernatant was removed, and 20 ml of washing buffer 1 was added to the precipitate to be suspended. Washing buffer 1 contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 2% Tween20. The suspended sample was centrifuged to remove the supernatant. Then, 20 ml of washing buffer 2 was added to the precipitate to be suspended. Washing buffer 2 contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 1% (w/v) DOC. The suspended sample was centrifuged to remove the supernatant. Then, 20 ml of washing buffer 3 was added to the precipitate to be suspended. Washing buffer 3 contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 1 M NaCl. The suspended sample was centrifuged to remove the supernatant. Then, 10 ml of a solubilization buffer was added to the precipitate, and the mixture was stirred at room temperature for 18 h. The solubilization buffer contains 50 mM Tris-HCl pH 8.0 (25.degree. C.), 5 mM EDTA, and 6 M guanidine hydrochloride. The solubilized sample was concentrated by size-exclusion filtration and the concentrate was added dropwise to a 10-fold volume of an ice-cold refolding buffer. Next, the mixture, as it was, was stirred at 4.degree. C. for 18 h. The refolding buffer contains 100 mM Tris-HCl pH 8.0 (4.degree. C.), 2 mM EDTA, 400 mM L-arginine, 1 mM reduced glutathione, and 0.1 mM oxidized glutathione. After that, the refolded sample was used as internal liquid and a 100-fold volume of 20 mM Tris-HCl pH 8.0 (4.degree. C.) was used as external liquid to carry out dialysis (at 4.degree. C. for 18 h). The sample was centrifuged to recover the supernatant, which was filtered with a 0.22-.mu.m syringe filter and was subjected to dialysis using 20 mM Tris-HCl pH 8.0 (25.degree. C.).

[0200] (3) The filter-sterilized dialysis internal liquid was injected into a HiTrap Q HP column (GE Healthcare Bioscience) of a liquid chromatography apparatus AKTApurifier (GE Healthcare Bioscience).

[0201] Then, each G-CSF was purified by anion exchange chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.); an elution buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.) and 1 M NaCl). Further, each purified sample was injected into a Superdex75 10/300 GL column (GE Healthcare Bioscience) of the liquid chromatography apparatus AKTApurifier (GE Healthcare Bioscience). Each sample was purified by size-exclusion chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.) and 150 mM NaCl). Each sample purified by size-exclusion chromatography was concentrated by size-exclusion filtration, followed by dialysis using the 100-fold volume of 20 mM Tris-HCl pH 8.0 (25.degree. C.). The filter-sterilized dialysis internal liquid was injected into a MonoQ column (GE Healthcare Bioscience) of the liquid chromatography apparatus AKTApurifier (GE Healthcare Bioscience). Then, each G-CSF was purified by anion exchange chromatography (a running buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.); an elution buffer: 20 mM Tris-HCl pH 8.0 (25.degree. C.) and 1 M NaCl). Each purified sample was subjected to dialysis using PBS and was then stored at 4.degree. C.

Example 18

To Measure Cyclization Efficiency by Using SDS-PAGE

[0202] In this example, the purity of each G-CSF after purification was evaluated by using SDS-PAGE.

[0203] An aqueous solution containing each purified G-CSF at a concentration of about 50 .mu.M was prepared and was subjected to SDS-PAGE (4 to 20% gels (Bio-Rad), at 200 V for 30 min). Then, an Oriole.RTM. Fluorescent Gel Stain kit (Bio-Rad) was used to detect a band, and the purity of each G-CSF was determined. The results showed that G-CSF(C170) and G-CSF(C166) were each detected as the major band in every sample measured and that the degree of purification was sufficient. Each sample after the size-exclusion chromatography was fractionated using a MonoQ column. G-CSF(C170) appeared as a single peak and no linear molecule was observed. By contrast, regarding G-CSF(C166), 4% of a linear molecule was observed in terms of a peak area (Table 3).

TABLE 3 The cyclization efficiency and the function and stability of each of G-CSF(C170) and G-CSF(C166) among the modified cyclized G-CSFs were compared to those of the linear control G-CSF.

Sample	Cyclization efficiency (%)	G-CSF receptor-binding activity KD (nM)	Thermostability Tm (.degree. C.)	Protease resistance t0.5 (hours)
filgrastim	--	0.27	N.A.	2.0*
Linear control G-CSF	--	0.08	56.5	1.7*
G-CSF(C177)	63.6*	0.06	62.1	230*
G-CSF(C170)	100	0.05	64.1	10
G-CSF(C166)	96.0	0.06	59.1	**
G-CSF(C163)	100*	0.10	58.2	11*

*The analysis values in Table 1 were used.
**No degradation was observed during the reaction.

Example 19

To Measure G-CSF Receptor-Binding Activity by Using Surface Plasmon Resonance Spectroscopy

[0204] As described below, surface plasmon resonance (SPR) spectroscopy was used to evaluate the receptor-binding activity of each of G-CSF(C170) and G-CSF(C166).

[0205] Note that SPR spectroscopy is recognized as an excellent method in which the specific interaction between biopolymers is measured over time and the reaction can be quantitatively interpreted from the kinetic viewpoint.

[0206] First, commercially available Protein A (Nacalai Tesque, Inc.) was immobilized on a flow cell of a sensor chip CM5 (GE Healthcare) by using an amine coupling reaction. Next, a running buffer HBS-P (10 mM HEPES pH 7.4, 150 mM NaCl, and 0.05% v/v Surfactant P20) in which an Fc-fusion G-CSF receptor had been dissolved was added to immobilize the receptor on the chip. The SPR measurement was carried out using Biacore T200 (Biacore) and the reaction temperature was 25.degree. C. A 2-fold dilution series from 1.25 nM to 20 nM of the sample solution was prepared, and a single kinetic mode was used for measurement and analysis. When compared with Example 8, this Example had improvements such as use of Biacore T200, which has higher sensitivity than Biacore T100, and use of commercially available Protein A. The improvements helped increase the sensitivity and reproducibility of measurement. Then, the linear control G-CSF, G-CSF(C177), and G-CSF(C163) as measured in Example 8 were remeasured and the resulting analysis values were used for comparison.

[0207] The observation results were analyzed using BIAevaluation version 4.1 to calculate a dissociation constant. The results demonstrated that the dissociation constant between G-CSF(C170) or G-CSF(C166) and the G-CSF receptor was substantially the same as that of the linear control G-CSF, indicating that the affinity remained the same (FIG. 10 and Table 3).

Example 20

To Determine Thermostability by Using Circular Dichroism Spectroscopy

[0208] In this Example, the thermostability of each of G-CSF(C170) and G-CSF(C166) was evaluated. Circular dichroism (CD) spectroscopy is known to be a spectroscopic analysis method that precisely reflects a change in the secondary structure of a protein. The method can reveal at which temperature a sample is denatured by measuring a molar ellipticity, which is represented by the intensity of a CD spectrum, while the temperature of the sample is changed.

[0209] An aqueous solution (10 mM HEPES buffer, pH 7.4, and 150 mM sodium chloride) containing G-CSF(C170) or G-CSF(C166) at a concentration of 5 .mu.M was prepared. This sample solution was injected into a cylindrical cell (with a cell length of 0.2 cm), and a circular dichroism spectrophotometer model J805 (JASCO Corporation) was used for measurement.

[0210] While the measurement wavelength at a temperature of 10.degree. C. was shifted from 260 nm to 200 nm, a circular dichroism spectrum was obtained. The results showed that G-CSF(C170) and G-CSF(C166) had substantially the same secondary structure. The same samples were heated to 90.degree. C. and then cooled from 90.degree. C. to 10.degree. C. After that, a circular dichroism spectrum was remeasured. The molar ellipticity, however, was not recovered, indicating the occurrence of irreversible heat denaturation.

[0211] Next, while the temperature was raised from 10.degree. C. to 90.degree. C. at a rate of 1.degree. C. per min, the circular dichroism at a wavelength of 222 nm was monitored. The resulting thermal melting curve was analyzed using a theoretical formula of a two-state phase transition model (Reference Document 11). Then, the denaturation temperature T.sub.m was determined. At the time of curve fitting, a heat capacity change due to the denaturation was assumed and fixed to .DELTA.Cp=7.5 kJmol.sup.-1K.sup.-1. When compared with Example 10, this Example was carried out by measuring thermal melting up to a higher temperature and by using a value representing the fixed heat capacity change, resulting in higher reproducibility. G-CSF(C177) was remeasured and re-analyzed under the same conditions. In addition, the linear control G-CSF and G-CSF(C163) were re-analyzed. The resulting analysis values were used for comparison.
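For orientation, a two-state model with a fixed heat capacity change is commonly written using the Gibbs-Helmholtz relation. The sketch below uses that standard textbook form, which is assumed here and is not necessarily the exact formula of Reference Document 11. The .DELTA.Cp value is the one stated in the text, the T.sub.m is the linear control value from Table 3, and the enthalpy value is a hypothetical placeholder.

```python
import math

R = 8.314e-3  # gas constant, kJ mol^-1 K^-1

def delta_g_unfold(temp_k, tm_k, dh_m, dcp):
    """Gibbs-Helmholtz free energy of unfolding (kJ/mol) for a
    two-state transition with a temperature-independent heat capacity change."""
    return (dh_m * (1.0 - temp_k / tm_k)
            + dcp * ((temp_k - tm_k) - temp_k * math.log(temp_k / tm_k)))

def fraction_unfolded(temp_c, tm_c, dh_m, dcp=7.5):
    """Fraction of unfolded protein at temp_c (degrees C)."""
    temp_k, tm_k = temp_c + 273.15, tm_c + 273.15
    dg = delta_g_unfold(temp_k, tm_k, dh_m, dcp)
    k_unfold = math.exp(-dg / (R * temp_k))   # equilibrium constant
    return k_unfold / (1.0 + k_unfold)

TM_C = 56.5        # linear control G-CSF, from Table 3
DH_M = 400.0       # kJ/mol, hypothetical value chosen for illustration only

for t in (10.0, 50.0, 56.5, 60.0, 90.0):
    print(f"{t:5.1f} C  fraction unfolded = {fraction_unfolded(t, TM_C, DH_M):.3f}")
```

In an actual analysis, the measured ellipticity at 222 nm would be modeled as a population-weighted mix of folded and unfolded baselines, and T.sub.m (and .DELTA.H.sub.m) would be adjusted to fit the curve; that fitting step is omitted from this sketch.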

[0212] The results revealed that the thermostability of each of G-CSF(C170) and G-CSF(C166) was increased by 7.6.degree. C. and 12.6.degree. C., respectively, when compared with that of the linear control G-CSF (FIG. 11 and Table 3). The results also revealed that the thermostability of each of G-CSF(C177) and G-CSF(C163) was increased by 5.6.degree. C. and 1.7.degree. C., respectively, when compared with that of the linear control G-CSF. Collectively, in terms of the thermostability, G-CSF(C166) was the most stable modified molecule.

Example 21

Protease Resistance Test

[0213] In this Example, the protease resistance of each of G-CSF(C170) and G-CSF(C166) was evaluated.

[0214] Carboxypeptidase Y and each G-CSF were mixed, and the proportion of an unreacted G-CSF was quantified using SDS-PAGE. Carboxypeptidase Y is an enzyme that catalyzes the cleavage of an amino acid from the C-terminus of a protein and exhibits low substrate specificity and sequence specificity during the reaction. Because of this, carboxypeptidase Y is effective in comprehensively evaluating protease resistance.

[0215] The G-CSF(C170) or G-CSF(C166), isolated and purified, and carboxypeptidase Y were diluted and mixed in 100 mM acetate buffer (pH 6.5) at a final concentration of 10 .mu.g/ml and 1 .mu.g/ml, respectively. The reaction was carried out at 37.degree. C. for 0, 60, 120, 180, and 240 min, and 10 .mu.l of each sample was collected. Each sample was subjected to SDS-PAGE (4 to 20% precast gels (Bio-Rad), at 200 V for 30 min), and an Oriole.RTM. Fluorescent Gel Stain kit (Bio-Rad) was used to detect a band. An imaging system (ChemiDoc.TM., BioRad) and image-analyzing software (QuantityOne.TM., BioRad) were used to detect the band and quantify its image density. The result at 0 min was set to 100%, and the level of the remaining band was scored. The carboxypeptidase Y-mediated digestion reaction was assumed to be a pseudo-first-order reaction to calculate a half-life (FIG. 12 and Table 3). The results showed that G-CSF(C170) and G-CSF(C166) had markedly higher protease resistance than the linear control G-CSF.

[0216] Note that the present application further relates to the following inventions.

[0217] <1> A method for producing a cyclized mutant protein with biological properties not less than those of an original protein and with stability higher than that of the original protein by modifying, through the following steps (a) and (b) and/or steps (c) to (f), the original protein with a specific three-dimensional structure, the method comprising the steps of:

[0218] (a) adding, to an N-terminus and/or a C-terminus of an amino acid sequence of the original protein, residue(s) that enable formation of a peptide bond therebetween by using a trans-splicing reaction using split inteins, provided that when amino acid residues at the N-terminus and the C-terminus of the amino acid sequence of the original protein are residues that enable the formation of a peptide bond, the amino acid residues may remain intact, and

[0219] (b) linking the N-terminus and the C-terminus by using the trans-splicing reaction using the split inteins to make a polypeptide main chain cyclic; and if sufficient cyclization efficiency is not achieved after steps (a) and (b),

[0220] (c) determining N-terminal and C-terminal secondary structure-free regions on a basis of conformational information about the original protein and deleting or mutating the regions,

[0221] (d) screening a known protein database for a structure of a loop connecting secondary structures similar to N-terminal and C-terminal secondary structures of a protein obtained after step (c) and determining a suitable length of the loop,

[0222] (e) adding, to an N-terminus and a C-terminus of the protein obtained after step (c), amino acid residues or amino acid sequences used to form the structure of the loop, with the suitable length, that connects the N-terminus and the C-terminus after cyclization, and

[0223] (f) linking the N-terminus and the C-terminus by means of a chemical or biological technique to make the polypeptide main chain cyclic.

[0224] <2> The method for producing a cyclized mutant protein according to item <1>, wherein the polypeptide main chain cyclization of step (f) uses a trans-splicing reaction mediated by split inteins.

[0225] <3> The method for producing a cyclized mutant protein according to item <1> or <2>, wherein the original protein with a specific three-dimensional structure is a cytokine with a helix bundle structure and the biological properties of the original protein involve affinity for a receptor.

[0226] <4> A cyclized mutant protein produced using the production method according to any one of items <1> to <3>.

[0227] <5> The cyclized mutant protein according to item <4>, wherein the original cytokine is granulocyte colony-stimulating factor (G-CSF).

[0228] <6> The cyclized mutant protein according to item <5>, wherein the original cytokine is a granulocyte colony-stimulating factor (G-CSF) consisting of an amino acid sequence set forth in SEQ ID NO: 1.

[0229] <7> The cyclized mutant protein according to item <6>, wherein the protein consists of an amino acid sequence in which 0 to 11 amino acid residues are deleted from an N-terminus and 0 to 3 amino acid residues are deleted from a C-terminus of the amino acid sequence set forth in SEQ ID NO: 1; a serine residue or a cysteine residue is added to an N-terminus and/or a C-terminus after the deletion; and an amino acid residue other than proline is added to the C-terminus.

[0230] <8> The cyclized mutant protein according to item <7>, wherein the protein consists of a cyclized amino acid sequence set forth in any one of SEQ ID NOs: 6 to 9.

[0231] <9> A linear mutant protein is designed so as to produce, in accordance with the method according to any one of items <1> to <3>, the cyclized mutant protein according to item <8> and consisting of an amino acid sequence set forth in any one of SEQ ID NOs: 2 to 5.

[0232] <10> A linear mutant protein is designed so as to produce, in accordance with the method according to any one of items <1> to <3>, the mutant protein according to item <8> by using, as split inteins, DnaE-C (C-intein) and DnaE-N (N-intein) of DnaE derived from Nostoc punctiforme and consisting of an amino acid sequence set forth in any one of SEQ ID NOs: 10 to 13.

[0233] <11> A nucleic acid comprising a nucleotide sequence encoding an amino acid sequence of the mutant protein according to any one of items <4> to <10>.

[0234] <12> A recombinant vector comprising the nucleic acid according to item <11>.

[0235] <13> A transformant comprising the recombinant vector according to item <12>.

[0236] <14> A method for producing the mutant protein according to any one of items <4> to <10>, comprising using the transformant according to item <13>.

[0237] <15> A pharmaceutical composition comprising, as an active ingredient, the mutant protein according to any one of items <5> to <9>.

[0238] <16> A pharmaceutical composition for treatment of neutropenia, comprising, as an active ingredient, the mutant protein according to any one of items <5> to <9>.

(LIST OF REFERENCE DOCUMENTS)

[0239] Reference Document 1; B. I. Lord, L. B. Woolford, and G. Molineux, Clin. Cancer Res., 2001, 7, 2085-90. [0240] Reference Document 2; M. W. Popp, S. K. Dougan, T. Y. Chuang, E. Spooner, and H. L. Ploegh, Proc. Natl. Acad. Sci. U.S.A., 2011, 108, 3169-74 [0241] Reference Document 3; T. Kuga, Y. Komatsu, M. Yamasaki, S. Sekine, H. Miyaji, T. Nishi, M. Sato, Y. Yokoo, M. Asano, M. Okabe, M. Morimoto, and S. Itoh, Biochem. Biophys. Res. Commun., 1989, 159, 103-111. [0242] Reference Document 4; W. Kabsch and C. Sander, Biopolymers, 1983, 22, 2577-637. [0243] Reference Document 5; D. Frishman and P. Argos, Proteins, 1995, 23, 566-79. [0244] Reference Document 6; C. P. Scott, E. Abel-Santos, M. Wall, D. C. Wahnon, and S. J. Benkovic, Proc. Natl. Acad. Sci., 1999, 96, 13638-13643. [0245] Reference Document 7; R. M. Horton, H. D. Hunt, S. N. Ho, J. K. Pullen, and L. R. Pease, Gene, 1989, 77, 61-8. [0246] Reference Document 8: OKADA, Masato and MIYAZAKI, Kaori (2004) "Tanpakushitsu Jikken Noto (Notes on Protein Experiments)" (I), YODOSHA CO., LTD. [0247] Reference Document 9: OHNO, Shigeo and NISHIMURA, Yoshifumi (1997), "Tanpakushitsu Jikken Purotokoru (Protocols for Protein Experiments) 1-Functional Analysis Part", Shujunsha Co., Ltd. [0248] Reference Document 10: OIINO, Shigeo and NISHIMURA, Yoshifumi (1997), "Tanpakushitsu Jikken Purotokoru (Protocols for Protein Experiments) 2-Structural Analysis Part", Shujunsha Co., Ltd. [0249] Reference Document 11: ARISAKA, Fumio (2004) "Bioscience no Tame no Tanpakusitsukagaku Nyumon (Introduction to Protein Science for Bioscience)", Shokabo Co., Ltd. [0250] Reference Document 12; T. Tamada, E. Honjo, Y. Maeda, T. Okamoto, M. Ishibashi, M. Tokunaga, and R. Kuroki, Proc. Natl. Acad. Sci. U.S.A., 2006, 103, 3135-40. 
[0251] (SEQ ID NO Description List)
[0252] SEQ ID NO: 1: amino acid sequence of linear control G-CSF
[0253] SEQ ID NO: 2: amino acid sequence of G-CSF(C177)
[0254] SEQ ID NO: 3: amino acid sequence of G-CSF(C170)
[0255] SEQ ID NO: 4: amino acid sequence of G-CSF(C166)
[0256] SEQ ID NO: 5: amino acid sequence of G-CSF(C163)
[0257] SEQ ID NO: 6: amino acid sequence of cyclized G-CSF(C177)
[0258] SEQ ID NO: 7: amino acid sequence of cyclized G-CSF(C170)
[0259] SEQ ID NO: 8: amino acid sequence of cyclized G-CSF(C166)
[0260] SEQ ID NO: 9: amino acid sequence of cyclized G-CSF(C163)
[0261] SEQ ID NO: 10: amino acid sequence of G-CSF(C177) linked to DnaE
[0262] SEQ ID NO: 11: amino acid sequence of cyclized G-CSF(C170) linked to DnaE
[0263] SEQ ID NO: 12: amino acid sequence of cyclized G-CSF(C166) linked to DnaE
[0264] SEQ ID NO: 13: amino acid sequence of G-CSF(C163) linked to DnaE
[0265] SEQ ID NO: 14: amino acid sequence of filgrastim
[0266] SEQ ID NO: 15: amino acid sequence of DnaE-C
[0267] SEQ ID NO: 16: amino acid sequence of DnaE-N
[0268] SEQ ID NO: 17: nucleotide sequence of G-CSF(C177)
[0269] SEQ ID NO: 18: nucleotide sequence of G-CSF(C170)
[0270] SEQ ID NO: 19: nucleotide sequence of G-CSF(C166)
[0271] SEQ ID NO: 20: nucleotide sequence of G-CSF(C163)
[0272] SEQ ID NO: 21: nucleotide sequence of G-CSF(C177) linked to DnaE
[0273] SEQ ID NO: 22: nucleotide sequence of cyclized G-CSF(C170) linked to DnaE
[0274] SEQ ID NO: 23: nucleotide sequence of cyclized G-CSF(C166) linked to DnaE
[0275] SEQ ID NO: 24: nucleotide sequence of G-CSF(C163) linked to DnaE
[0276] SEQ ID NO: 25: nucleotide sequence of linear control G-CSF
[0277] SEQ ID NO: 26: primer sequence for amplifying the nucleotide sequence of linear control G-CSF
[0278] SEQ ID NO: 27: primer sequence for amplifying the nucleotide sequence of linear control G-CSF
[0279] SEQ ID NO: 28: nucleotide sequence encoding, in sequence, DnaE-C and DnaE-N
[0280] SEQ ID NO: 29: primer sequence for amplifying the nucleotide sequence of G-CSF(C177)
[0281] SEQ ID NO: 30: primer sequence for amplifying the nucleotide sequence of G-CSF(C177)
[0282] SEQ ID NO: 31: primer sequence for amplifying the nucleotide sequence of G-CSF(C170)
[0283] SEQ ID NO: 32: primer sequence for amplifying the nucleotide sequence of G-CSF(C170)
[0284] SEQ ID NO: 33: primer sequence for amplifying the nucleotide sequence of G-CSF(C166)
[0285] SEQ ID NO: 34: primer sequence for amplifying the nucleotide sequence of G-CSF(C166)
[0286] SEQ ID NO: 35: primer sequence for amplifying the nucleotide sequence of G-CSF(C163)
[0287] SEQ ID NO: 36: primer sequence for amplifying the nucleotide sequence of G-CSF(C163)
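The description list pairs each nucleotide sequence with an amino acid sequence of the same construct, for example SEQ ID NO: 17 and SEQ ID NO: 2 for G-CSF(C177). The following is a minimal sketch, not part of the application, of how such a pairing might be checked. It assumes the standard genetic code; the function name `translate` and the constant names are illustrative only. It translates the first 60 bases of SEQ ID NO: 17 and compares the result with the first 20 residues of SEQ ID NO: 2.

```python
from itertools import product

# Standard genetic code, codons enumerated in t/c/a/g order (an assumption:
# the application does not state which code table applies).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue names, as used in the sequence listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def translate(dna):
    """Translate a DNA string (whitespace ignored) into three-letter residues.

    Translation stops at the first stop codon; a trailing partial codon is ignored.
    """
    dna = "".join(dna.split()).lower()
    residues = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        residues.append(THREE_LETTER[aa])
    return residues


# First 60 bases of SEQ ID NO: 17 and first 20 residues of SEQ ID NO: 2,
# copied from the sequence listing.
SEQ17_FIRST_60 = (
    "agcatggcac cattaggtcc agcgagcagc ctgccgcaga gctttctgct gaaaagcctg"
)
SEQ2_FIRST_20 = (
    "Ser Met Ala Pro Leu Gly Pro Ala Ser Ser "
    "Leu Pro Gln Ser Phe Leu Leu Lys Ser Leu"
).split()

if __name__ == "__main__":
    print(translate(SEQ17_FIRST_60) == SEQ2_FIRST_20)
```

The same function could be applied to the full-length sequences; only the opening fragment is used here to keep the example short.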

SEQUENCE LISTING

Number of SEQ ID NOs: 36

SEQ ID NO: 1; Length: 175; Type: PRT; Organism: artificial; Description: G-CSF added M at N terminal and replaced residues 2 and 18 to A and S
Met Ala Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
Lys Ser Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro

SEQ ID NO: 2; Length: 177; Type: PRT; Organism: artificial; Description: G-CSF modified for cyclization
Ser Met Ala Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu
Leu Lys Ser Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala
Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu
Leu Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser
Ser Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu
His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly
Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val
Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met
Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser
Ala Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln
Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
Gly

SEQ ID NO: 3; Length: 170; Type: PRT; Organism: artificial; Description: G-CSF modified for cyclization
Ser Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Ser Leu
Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys
Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu
Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser
Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu
Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro Glu
Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala
Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu
Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg
Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser Phe Leu Glu
Val Ser Tyr Arg Val Leu Arg His Leu Gly

SEQ ID NO: 4; Length: 166; Type: PRT; Organism: artificial; Description: G-CSF modified for cyclization
Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg
Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr
Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu
Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu Gln
Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu Phe Leu Tyr Gln
Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro Thr
Leu Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile Trp
Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr Gln
Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly
Val Leu Val Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg
Val Leu Arg His Leu Gly

SEQ ID NO: 5; Length: 163; Type: PRT; Organism: artificial; Description: G-CSF modified for cyclization
Ser Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg Lys Ile Gln
Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu
Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu Gly Ile Pro
Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly
Cys Leu Ser Gln Leu His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu
Gln Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr
Leu Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met
Glu Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met
Pro Ala Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly Val Leu Val
Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg
His Leu Gly

SEQ ID NO: 6; Length: 177; Type: PRT; Organism: artificial; Description: modified G-CSF cyclized by peptide bond formation between N and C terminals
Ser Met Ala Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu
Leu Lys Ser Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala
Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu
Leu Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser
Ser Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu
His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly
Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val
Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met
Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser
Ala Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln
Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro
Gly

SEQ ID NO: 7; Length: 170; Type: PRT; Organism: artificial; Description: modified G-CSF cyclized by peptide bond formation between N and C terminals
Ser Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Ser Leu
Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys
Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu
Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser
Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu
Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro Glu
Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala
Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu
Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg
Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser Phe Leu Glu
Val Ser Tyr Arg Val Leu Arg His Leu Gly

SEQ ID NO: 8; Length: 166; Type: PRT; Organism: artificial; Description: modified G-CSF cyclized by peptide bond formation between N and C terminals
Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg
Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr
Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu
Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu Gln
Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu Phe Leu Tyr Gln
Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro Thr
Leu Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile Trp
Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr Gln
Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly
Val Leu Val Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg
Val Leu Arg His Leu Gly

SEQ ID NO: 9; Length: 163; Type: PRT; Organism: artificial; Description: modified G-CSF cyclized by peptide bond formation between N and C terminals
Ser Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg Lys Ile Gln
Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu
Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu Gly Ile Pro
Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly
Cys Leu Ser Gln Leu His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu
Gln Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr
Leu Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met
Glu Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met
Pro Ala Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly Val Leu Val
Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg
His Leu Gly

SEQ ID NO: 10; Length: 325; Type: PRT; Organism: artificial; Description: modified G-CSF fused with DnaE-C and DnaE-N
Met Val Lys Ile Ala Thr Arg Lys Tyr Leu Gly Lys Gln Asn Val Tyr
Asp Ile Gly Val Glu Arg Asp His Asn Phe Ala Leu Lys Asn Gly Phe
Ile Ala Ser Asn Ser Met Ala Pro Leu Gly Pro Ala Ser Ser Leu Pro
Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val Arg Lys Ile Gln Gly
Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys
His Pro Glu Glu Leu Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp
Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys
Leu Ser Gln Leu His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln
Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu
Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu
Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro
Ala Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala
Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His
Leu Ala Gln Pro Gly Cys Leu Ser Tyr Glu Thr Glu Ile Leu Thr Val
Glu Tyr Gly Leu Leu Pro Ile Gly Lys Ile Val Glu Lys Arg Ile Glu
Cys Thr Val Tyr Ser Val Asp Asn Asn Gly Asn Ile Tyr Thr Gln Pro
Val Ala Gln Trp His Asp Arg Gly Glu Gln Glu Val Phe Glu Tyr Cys
Leu Glu Asp Gly Ser Leu Ile Arg Ala Thr Lys Asp His Lys Phe Met
Thr Val Asp Gly Gln Met Leu Pro Ile Asp Glu Ile Phe Glu Arg Glu
Leu Asp Leu Met Arg Val Asp Asn Leu Pro Asn Ser Gly Ser Gly His
His His His His His

SEQ ID NO: 11; Length: 318; Type: PRT; Organism: artificial; Description: modified G-CSF fused with DnaE-C and DnaE-N
Met Val Lys Ile Ala Thr Arg Lys Tyr Leu Gly Lys Gln Asn Val Tyr
Asp Ile Gly Val Glu Arg Asp His Asn Phe Ala Leu Lys Asn Gly Phe
Ile Ala Ser Asn Ser Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu
Leu Lys Ser Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala
Leu Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu
Leu Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser
Ser Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu
His Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly
Ile Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val
Ala Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met
Ala Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser
Ala Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln
Ser Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Gly Cys Leu
Ser Tyr Glu Thr Glu Ile Leu Thr Val Glu Tyr Gly Leu Leu Pro Ile
Gly Lys Ile Val Glu Lys Arg Ile Glu Cys Thr Val Tyr Ser Val Asp
Asn Asn Gly Asn Ile Tyr Thr Gln Pro Val Ala Gln Trp His Asp Arg
Gly Glu Gln Glu Val Phe Glu Tyr Cys Leu Glu Asp Gly Ser Leu Ile
Arg Ala Thr Lys Asp His Lys Phe Met Thr Val Asp Gly Gln Met Leu
Pro Ile Asp Glu Ile Phe Glu Arg Glu Leu Asp Leu Met Arg Val Asp
Asn Leu Pro Asn Ser Gly Ser Gly His His His His His His

SEQ ID NO: 12; Length: 314; Type: PRT; Organism: artificial; Description: modified G-CSF fused with DnaE-C and DnaE-N
Met Val Lys Ile Ala Thr Arg Lys Tyr Leu Gly Lys Gln Asn Val Tyr
Asp Ile Gly Val Glu Arg Asp His Asn Phe Ala Leu Lys Asn Gly Phe
Ile Ala Ser Asn Ser Ser Leu Pro Gln Ser Phe Leu Leu Lys Ser Leu
Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys
Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu
Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser
Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu
Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro Glu
Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala
Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu
Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg
Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser Phe Leu Glu
Val Ser Tyr Arg Val Leu Arg His Leu Gly Cys Leu Ser Tyr Glu Thr
Glu Ile Leu Thr Val Glu Tyr Gly Leu Leu Pro Ile Gly Lys Ile Val
Glu Lys Arg Ile Glu Cys Thr Val Tyr Ser Val Asp Asn Asn Gly Asn
Ile Tyr Thr Gln Pro Val Ala Gln Trp His Asp Arg Gly Glu Gln Glu
Val Phe Glu Tyr Cys Leu Glu Asp Gly Ser Leu Ile Arg Ala Thr Lys
Asp His Lys Phe Met Thr Val Asp Gly Gln Met Leu Pro Ile Asp Glu
Ile Phe Glu Arg Glu Leu Asp Leu Met Arg Val Asp Asn Leu Pro Asn
Ser Gly Ser Gly His His His His His His

SEQ ID NO: 13; Length: 311; Type: PRT; Organism: artificial; Description: modified G-CSF fused with DnaE-C and DnaE-N
Met Val Lys Ile Ala Thr Arg Lys Tyr Leu Gly Lys Gln Asn Val Tyr
Asp Ile Gly Val Glu Arg Asp His Asn Phe Ala Leu Lys Asn Gly Phe
Ile Ala Ser Asn Ser Gln Ser Phe Leu Leu Lys Ser Leu Glu Gln Val
Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu Gln Glu Lys Leu Cys Ala
Thr Tyr Lys Leu Cys His Pro Glu Glu Leu Val Leu Leu Gly His Ser
Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser Cys Pro Ser Gln Ala Leu
Gln Leu Ala Gly Cys Leu Ser Gln Leu His Ser Gly Leu Phe Leu Tyr
Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile Ser Pro Glu Leu Gly Pro
Thr Leu Asp Thr Leu Gln Leu Asp Val Ala Asp Phe Ala Thr Thr Ile
Trp Gln Gln Met Glu Glu Leu Gly Met Ala Pro Ala Leu Gln Pro Thr
Gln Gly Ala Met Pro Ala Phe Ala Ser Ala Phe Gln Arg Arg Ala Gly
Gly Val Leu Val Ala Ser His Leu Gln Ser Phe Leu Glu Val Ser Tyr
Arg Val Leu Arg His Leu Gly Cys Leu Ser Tyr Glu Thr Glu Ile Leu
Thr Val Glu Tyr Gly Leu Leu Pro Ile Gly Lys Ile Val Glu Lys Arg
Ile Glu Cys Thr Val Tyr Ser Val Asp Asn Asn Gly Asn Ile Tyr Thr
Gln Pro Val Ala Gln Trp His Asp Arg Gly Glu Gln Glu Val Phe Glu
Tyr Cys Leu Glu Asp Gly Ser Leu Ile Arg Ala Thr Lys Asp His Lys
Phe Met Thr Val Asp Gly Gln Met Leu Pro Ile Asp Glu Ile Phe Glu
Arg Glu Leu Asp Leu Met Arg Val Asp Asn Leu Pro Asn Ser Gly Ser
Gly His His His His His His

SEQ ID NO: 14; Length: 175; Type: PRT; Organism: artificial; Description: G-CSF added M at N terminal
Met Thr Pro Leu Gly Pro Ala Ser Ser Leu Pro Gln Ser Phe Leu Leu
Lys Cys Leu Glu Gln Val Arg Lys Ile Gln Gly Asp Gly Ala Ala Leu
Gln Glu Lys Leu Cys Ala Thr Tyr Lys Leu Cys His Pro Glu Glu Leu
Val Leu Leu Gly His Ser Leu Gly Ile Pro Trp Ala Pro Leu Ser Ser
Cys Pro Ser Gln Ala Leu Gln Leu Ala Gly Cys Leu Ser Gln Leu His
Ser Gly Leu Phe Leu Tyr Gln Gly Leu Leu Gln Ala Leu Glu Gly Ile
Ser Pro Glu Leu Gly Pro Thr Leu Asp Thr Leu Gln Leu Asp Val Ala
Asp Phe Ala Thr Thr Ile Trp Gln Gln Met Glu Glu Leu Gly Met Ala
Pro Ala Leu Gln Pro Thr Gln Gly Ala Met Pro Ala Phe Ala Ser Ala
Phe Gln Arg Arg Ala Gly Gly Val Leu Val Ala Ser His Leu Gln Ser
Phe Leu Glu Val Ser Tyr Arg Val Leu Arg His Leu Ala Gln Pro

SEQ ID NO: 15; Length: 36; Type: PRT; Organism: artificial; Description: DnaE-C
Met Val Lys Ile Ala Thr Arg Lys Tyr Leu Gly Lys Gln Asn Val Tyr
Asp Ile Gly Val Glu Arg Asp His Asn Phe Ala Leu Lys Asn Gly Phe
Ile Ala Ser Asn

SEQ ID NO: 16; Length: 112; Type: PRT; Organism: artificial; Description: DnaE-N
Cys Leu Ser Tyr Glu Thr Glu Ile Leu Thr Val Glu Tyr Gly Leu Leu
Pro Ile Gly Lys Ile Val Glu Lys Arg Ile Glu Cys Thr Val Tyr Ser
Val Asp Asn Asn Gly Asn Ile Tyr Thr Gln Pro Val Ala Gln Trp His
Asp Arg Gly Glu Gln Glu Val Phe Glu Tyr Cys Leu Glu Asp Gly Ser
Leu Ile Arg Ala Thr Lys Asp His Lys Phe Met Thr Val Asp Gly Gln
Met Leu Pro Ile Asp Glu Ile Phe Glu Arg Glu Leu Asp Leu Met Arg
Val Asp Asn Leu Pro Asn Ser Gly Ser Gly His His His His His His

SEQ ID NO: 17; Length: 531; Type: DNA; Organism: artificial; Description: G-CSF modified for cyclization
agcatggcac cattaggtcc agcgagcagc ctgccgcaga gctttctgct gaaaagcctg 60
gaacaggtgc gtaaaattca gggtgatggt gcggcgctgc aagaaaaact gtgcgcgacc 120
tataaactgt gccatccgga agagctggtg ctgctgggcc atagcctggg tattccgtgg 180
gcaccgctgt ctagctgtcc gagccaggcg ctgcaactgg ccggttgtct gagccagctg 240
catagcggcc tgtttctgta tcagggcctg ctgcaagcgc tggaaggcat tagcccggag 300
ctgggcccga ctctggatac cctgcaactg gatgtggcgg attttgcgac caccatttgg 360
cagcagatgg aagagctggg catggcaccg gcgctgcaac cgacccaggg tgccatgccg 420
gcgtttgcga gcgcgtttca gcgtcgtgcg ggcggtgttc tggtggcgag ccatctgcaa 480
tcttttctgg aagtgagcta tcgtgtgctg cgtcatctgg cccagccggg c 531

SEQ ID NO: 18; Length: 510; Type: DNA; Organism: artificial; Description: G-CSF modified for cyclization
agcggtccag cgagcagcct gccgcagagc tttctgctga aaagcctgga acaggtgcgt 60
aaaattcagg gtgatggtgc ggcgctgcaa gaaaaactgt gcgcgaccta taaactgtgc 120
catccggaag agctggtgct gctgggccat agcctgggta ttccgtgggc accgctgtct 180
agctgtccga gccaggcgct gcaactggcc ggttgtctga gccagctgca tagcggcctg 240
tttctgtatc agggcctgct gcaagcgctg gaaggcatta gcccggagct gggcccgact 300
ctggataccc tgcaactgga tgtggcggat tttgcgacca ccatttggca gcagatggaa 360
gagctgggca tggcaccggc gctgcaaccg acccagggtg ccatgccggc gtttgcgagc 420
gcgtttcagc gtcgtgcggg cggtgttctg gtggcgagcc atctgcaatc ttttctggaa 480
gtgagctatc gtgtgctgcg tcatctgggc 510

SEQ ID NO: 19; Length: 498; Type: DNA; Organism: artificial; Description: G-CSF modified for cyclization
agcagcctgc cgcagagctt tctgctgaaa agcctggaac aggtgcgtaa aattcagggt 60
gatggtgcgg cgctgcaaga aaaactgtgc gcgacctata aactgtgcca tccggaagag 120
ctggtgctgc tgggccatag cctgggtatt ccgtgggcac cgctgtctag ctgtccgagc 180
caggcgctgc aactggccgg ttgtctgagc cagctgcata gcggcctgtt tctgtatcag 240
ggcctgctgc aagcgctgga aggcattagc ccggagctgg gcccgactct ggataccctg 300
caactggatg tggcggattt tgcgaccacc atttggcagc agatggaaga gctgggcatg 360
gcaccggcgc tgcaaccgac ccagggtgcc atgccggcgt ttgcgagcgc gtttcagcgt 420
cgtgcgggcg gtgttctggt ggcgagccat ctgcaatctt ttctggaagt gagctatcgt 480
gtgctgcgtc atctgggc 498

SEQ ID NO: 20; Length: 489; Type: DNA; Organism: artificial; Description: G-CSF modified for cyclization
agccagagct ttctgctgaa aagcctggaa caggtgcgta aaattcaggg tgatggtgcg 60
gcgctgcaag aaaaactgtg cgcgacctat aaactgtgcc atccggaaga gctggtgctg 120
ctgggccata gcctgggtat tccgtgggca ccgctgtcta gctgtccgag ccaggcgctg 180
caactggccg gttgtctgag ccagctgcat agcggcctgt ttctgtatca gggcctgctg 240
caagcgctgg aaggcattag cccggagctg ggcccgactc tggataccct gcaactggat 300
gtggcggatt ttgcgaccac catttggcag cagatggaag agctgggcat ggcaccggcg 360
ctgcaaccga cccagggtgc catgccggcg tttgcgagcg cgtttcagcg tcgtgcgggc 420
ggtgttctgg tggcgagcca tctgcaatct tttctggaag tgagctatcg tgtgctgcgt 480
catctgggc 489

SEQ ID NO: 21; Length: 978; Type: DNA; Organism: artificial; Description: modified G-CSF fused with DnaE-C and DnaE-N
atggtgaaaa tagccacacg caaatatctg ggcaaacaga acgtgtatga tattggcgtg 60
gaacgcgatc ataactttgc gctgaaaaac ggcttcatag ctagcaatag catggcacca 120
ttaggtccag cgagcagcct gccgcagagc tttctgctga aaagcctgga acaggtgcgt 180
aaaattcagg gtgatggtgc ggcgctgcaa gaaaaactgt gcgcgaccta taaactgtgc 240
catccggaag agctggtgct gctgggccat agcctgggta ttccgtgggc accgctgtct 300
agctgtccga gccaggcgct gcaactggcc ggttgtctga gccagctgca tagcggcctg 360
tttctgtatc agggcctgct gcaagcgctg gaaggcatta gcccggagct gggcccgact 420
ctggataccc tgcaactgga tgtggcggat tttgcgacca ccatttggca gcagatggaa 480
gagctgggca tggcaccggc gctgcaaccg acccagggtg ccatgccggc gtttgcgagc 540
gcgtttcagc gtcgtgcggg cggtgttctg gtggcgagcc atctgcaatc ttttctggaa 600
gtgagctatc gtgtgctgcg tcatctggcc cagccgggct gtttatcata tgaaacggaa 660
atattgaccg tggaatatgg cctgctgccg attggcaaaa ttgtggaaaa acgcattgaa 720
tgcaccgtgt atagcgtgga taacaacggc aacatttata cccagccggt ggcgcagtgg 780
catgatcgcg gcgaacagga agtgtttgaa tattgcctgg aagatggcag cctgattcgc 840
gcgaccaaag atcataaatt tatgaccgtg gatggccaga tgctgccgat tgatgaaatt 900
tttgaacgcg aactggatct gatgcgggtt gataatttgc cgaatagcgg cagcggccat 960
caccatcacc atcactaa 978

SEQ ID NO: 22; Length: 957; Type: DNA; Organism: artificial; Description: modified G-CSF fused with DnaE-C and DnaE-N
atggtgaaaa tagccacacg caaatatctg ggcaaacaga acgtgtatga tattggcgtg 60
gaacgcgatc ataactttgc gctgaaaaac ggcttcatag ctagcaatag cggtccagcg 120
agcagcctgc cgcagagctt tctgctgaaa agcctggaac aggtgcgtaa aattcagggt 180
gatggtgcgg cgctgcaaga aaaactgtgc gcgacctata aactgtgcca tccggaagag 240
ctggtgctgc tgggccatag cctgggtatt ccgtgggcac cgctgtctag ctgtccgagc 300
caggcgctgc aactggccgg ttgtctgagc cagctgcata gcggcctgtt tctgtatcag 360
ggcctgctgc aagcgctgga aggcattagc ccggagctgg gcccgactct ggataccctg 420
caactggatg tggcggattt tgcgaccacc atttggcagc agatggaaga gctgggcatg 480
gcaccggcgc tgcaaccgac ccagggtgcc atgccggcgt ttgcgagcgc gtttcagcgt 540
cgtgcgggcg gtgttctggt ggcgagccat ctgcaatctt ttctggaagt gagctatcgt 600
gtgctgcgtc atctgggctg tttatcatat gaaacggaaa tattgaccgt ggaatatggc 660
ctgctgccga ttggcaaaat tgtggaaaaa cgcattgaat gcaccgtgta tagcgtggat 720
aacaacggca acatttatac ccagccggtg gcgcagtggc atgatcgcgg cgaacaggaa 780
gtgtttgaat attgcctgga agatggcagc ctgattcgcg cgaccaaaga tcataaattt 840
atgaccgtgg atggccagat gctgccgatt gatgaaattt ttgaacgcga actggatctg 900
atgcgggttg ataatttgcc gaatagcggc agcggccatc accatcacca tcactaa 957

SEQ ID NO: 23; Length: 945; Type: DNA; Organism: artificial; Description: modified G-CSF fused with DnaE-C and DnaE-N
atggtgaaaa tagccacacg caaatatctg ggcaaacaga acgtgtatga tattggcgtg 60
gaacgcgatc ataactttgc gctgaaaaac ggcttcatag ctagcaatag cagcctgccg 120
cagagctttc tgctgaaaag cctggaacag gtgcgtaaaa ttcagggtga tggtgcggcg 180
ctgcaagaaa aactgtgcgc gacctataaa ctgtgccatc cggaagagct ggtgctgctg 240
ggccatagcc tgggtattcc gtgggcaccg ctgtctagct gtccgagcca ggcgctgcaa 300
ctggccggtt gtctgagcca gctgcatagc ggcctgtttc tgtatcaggg cctgctgcaa 360
gcgctggaag gcattagccc ggagctgggc ccgactctgg ataccctgca actggatgtg 420
gcggattttg cgaccaccat ttggcagcag atggaagagc tgggcatggc accggcgctg 480
caaccgaccc agggtgccat gccggcgttt gcgagcgcgt ttcagcgtcg tgcgggcggt 540
gttctggtgg cgagccatct gcaatctttt ctggaagtga gctatcgtgt gctgcgtcat 600
ctgggctgtt tatcatatga aacggaaata ttgaccgtgg aatatggcct gctgccgatt 660
ggcaaaattg tggaaaaacg cattgaatgc accgtgtata gcgtggataa caacggcaac 720
atttataccc agccggtggc gcagtggcat gatcgcggcg aacaggaagt gtttgaatat 780
tgcctggaag atggcagcct gattcgcgcg accaaagatc ataaatttat gaccgtggat 840
ggccagatgc tgccgattga tgaaattttt gaacgcgaac tggatctgat gcgggttgat 900
aatttgccga atagcggcag cggccatcac catcaccatc actaa 945

SEQ ID NO: 24; Length: 936; Type: DNA; Organism: artificial; Description: modified G-CSF fused with DnaE-C and DnaE-N
atggtgaaaa tagccacacg caaatatctg ggcaaacaga acgtgtatga tattggcgtg 60
gaacgcgatc ataactttgc gctgaaaaac ggcttcatag ctagcaatag ccagagcttt 120
ctgctgaaaa gcctggaaca ggtgcgtaaa attcagggtg atggtgcggc gctgcaagaa 180
aaactgtgcg cgacctataa actgtgccat ccggaagagc tggtgctgct gggccatagc 240
ctgggtattc cgtgggcacc gctgtctagc tgtccgagcc aggcgctgca actggccggt 300
tgtctgagcc agctgcatag cggcctgttt ctgtatcagg gcctgctgca agcgctggaa 360
ggcattagcc cggagctggg cccgactctg gataccctgc aactggatgt ggcggatttt 420
gcgaccacca tttggcagca gatggaagag ctgggcatgg caccggcgct gcaaccgacc 480
cagggtgcca tgccggcgtt tgcgagcgcg tttcagcgtc gtgcgggcgg tgttctggtg 540
gcgagccatc tgcaatcttt tctggaagtg agctatcgtg tgctgcgtca tctgggctgt 600
ttatcatatg aaacggaaat attgaccgtg gaatatggcc tgctgccgat tggcaaaatt 660
gtggaaaaac gcattgaatg caccgtgtat agcgtggata acaacggcaa catttatacc 720
cagccggtgg cgcagtggca tgatcgcggc gaacaggaag tgtttgaata ttgcctggaa 780
gatggcagcc tgattcgcgc gaccaaagat cataaattta tgaccgtgga tggccagatg 840
ctgccgattg atgaaatttt tgaacgcgaa ctggatctga tgcgggttga taatttgccg 900
aatagcggca gcggccatca ccatcaccat cactaa 936

SEQ ID NO: 25; Length: 525; Type: DNA; Organism: artificial; Description: G-CSF added M at N terminal and replaced residues 2 and 18 to A and S
atggcaccat taggtccagc gagcagcctg ccgcagagct ttctgctgaa aagcctggaa 60
caggtgcgta aaattcaggg tgatggtgcg gcgctgcaag aaaaactgtg cgcgacctat 120
aaactgtgcc atccggaaga gctggtgctg ctgggccata gcctgggtat tccgtgggca 180
ccgctgtcta gctgtccgag ccaggcgctg caactggccg gttgtctgag ccagctgcat 240
agcggcctgt ttctgtatca gggcctgctg caagcgctgg aaggcattag cccggagctg 300
ggcccgactc tggataccct gcaactggat gtggcggatt ttgcgaccac catttggcag 360
cagatggaag agctgggcat ggcaccggcg ctgcaaccga cccagggtgc catgccggcg 420
tttgcgagcg cgtttcagcg tcgtgcgggc ggtgttctgg tggcgagcca tctgcaatct 480
tttctggaag tgagctatcg tgtgctgcgt catctggccc agccg 525

SEQ ID NO: 26; Length: 48; Type: DNA; Organism: artificial; Description: primer
tataccatgg caccattagg tccagcgagc agcctgccgc agagcttt 48

SEQ ID NO: 27; Length: 25; Type: DNA; Organism: artificial; Description: primer
atggatcctt acggctgggc cagat 25

SEQ ID NO: 28; Length: 477; Type: DNA; Organism: artificial; Description: coding continuously DnaE-C and -N derived from Nostoc punctiforme
atggtgaaaa tagccacacg caaatatctg ggcaaacaga acgtgtatga tattggcgtg 60
gaacgcgatc ataactttgc gctgaaaaac ggcttcatag ctagcaattg ttttaacaaa 120
agccattttg cggaatattg tttatcatat gaaacggaaa tattgaccgt ggaatatggc 180
ctgctgccga ttggcaaaat tgtggaaaaa cgcattgaat gcaccgtgta tagcgtggat 240
aacaacggca acatttatac ccagccggtg gcgcagtggc atgatcgcgg cgaacaggaa 300
gtgtttgaat attgcctgga agatggcagc ctgattcgcg cgaccaaaga tcataaattt 360
atgaccgtgg atggccagat gctgccgatt gatgaaattt ttgaacgcga actggatctg 420
atgcgggttg ataatttgcc gaatagcggc agcggccatc accatcacca tcactaa 477

SEQ ID NO: 29; Length: 33; Type: DNA; Organism: artificial; Description: primer
aaagctagca atagcatggc accattaggt cca 33

SEQ ID NO: 30; Length: 38; Type: DNA; Organism: artificial; Description: primer
aaaaattcat atgataaaca gcccggctgg gccagatg 38

SEQ ID NO: 31; Length: 30; Type: DNA; Organism: artificial; Description: primer
aaagctagca atagcggtcc agcgagcagc 30

SEQ ID NO: 32; Length: 41; Type: DNA; Organism: artificial; Description: primer
aaaaattcat atgataaaca gcccagatga cgcagcacac g 41

SEQ ID NO: 33; Length: 27; Type: DNA; Organism: artificial; Description: primer
aaagctagca atagcagcct gccgcag 27

SEQ ID NO: 34; Length: 45; Type: DNA; Organism: artificial; Description: primer
aaaaattcat atgataaaca gcccagatga cgcagcacac gatag 45

SEQ ID NO: 35; Length: 36; Type: DNA; Organism: artificial; Description: primer
aaaaaagcta gcaatagcca gagctttctg ctgaaa 36

SEQ ID NO: 36; Length: 38; Type: DNA; Organism: artificial; Description: primer
aaaaattcat atgataaaca gcccagatga cgcagcac 38
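The lengths stated in the sequence listing appear consistent with each DnaE fusion (SEQ ID NOs: 10 to 13) being DnaE-C (SEQ ID NO: 15, 36 residues), followed by the corresponding G-CSF variant (SEQ ID NOs: 2 to 5), followed by DnaE-N (SEQ ID NO: 16, 112 residues), and with each fusion nucleotide sequence (SEQ ID NOs: 21 to 24) carrying one stop codon. The arithmetic can be sketched as follows; this is an illustrative check only, and the names `CONSTRUCTS`, `expected_fusion_lengths`, and `check_all` are hypothetical and not taken from the application.

```python
# Lengths (in amino acid residues) as stated in the sequence listing.
DNAE_C_LEN = 36    # SEQ ID NO: 15
DNAE_N_LEN = 112   # SEQ ID NO: 16

# variant -> (variant length, listed fusion protein length, listed fusion DNA length)
CONSTRUCTS = {
    "C177": (177, 325, 978),  # SEQ ID NOs: 2, 10, 21
    "C170": (170, 318, 957),  # SEQ ID NOs: 3, 11, 22
    "C166": (166, 314, 945),  # SEQ ID NOs: 4, 12, 23
    "C163": (163, 311, 936),  # SEQ ID NOs: 5, 13, 24
}


def expected_fusion_lengths(variant_len):
    """Return (protein length, DNA length) for DnaE-C + variant + DnaE-N.

    Assumes the DNA carries exactly one stop codon after the coding region.
    """
    protein = DNAE_C_LEN + variant_len + DNAE_N_LEN
    dna = 3 * (protein + 1)
    return protein, dna


def check_all():
    """Compare the expected lengths with the lengths given in the listing."""
    return {
        name: expected_fusion_lengths(variant_len) == (protein_len, dna_len)
        for name, (variant_len, protein_len, dna_len) in CONSTRUCTS.items()
    }


if __name__ == "__main__":
    for name, ok in check_all().items():
        print(name, "consistent" if ok else "mismatch")
```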

* * * * *
