Construct System And Uses Therefor Frazer; Ian Hector [THE UNIVERSITY OF QUEENSLAND]

Patent Application Summary

U.S. patent application number 12/738291 was filed with the patent office on 2012-02-16 for construct system and uses therefor. This patent application is currently assigned to THE UNIVERSITY OF QUEENSLAND. Invention is credited to Ian Hector Frazer.

Application Number	20120040367 12/738291
Family ID	40566909
Filed Date	2012-02-16

United States Patent Application	20120040367
Kind Code	A1
Frazer; Ian Hector	February 16, 2012

CONSTRUCT SYSTEM AND USES THEREFOR

Abstract

The present invention discloses construct systems and methods for comparing different iso-accepting codons according to their preference for translating RNA transcripts into proteins in cells or tissues of interest or for producing a selected phenotype in an organism of interest or part thereof. The codon preference comparisons thus obtained are particularly useful for modifying the translational efficiency of protein-encoding polynucleotides in cells or tissues of interest or for modulating the quality of a selected phenotype conferred by a phenotype-associated polypeptide upon an organism of interest or part thereof.

Inventors:	Frazer; Ian Hector; (Queensland, AU)
Assignee:	THE UNIVERSITY OF QUEENSLAND ST. LUCIA, QLD AU
Family ID:	40566909
Appl. No.:	12/738291
Filed:	October 2, 2008
PCT Filed:	October 2, 2008
PCT NO:	PCT/AU2008/001465
371 Date:	May 11, 2011

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
60980145	Oct 15, 2007

Current U.S. Class:	435/6.17 ; 435/6.1; 435/6.18
Current CPC Class:	C07K 14/005 20130101; A61P 33/00 20180101; A61P 31/16 20180101; A61P 31/06 20180101; A61P 33/12 20180101; A61K 2039/55516 20130101; A61K 39/245 20130101; A61P 31/14 20180101; A61P 33/06 20180101; C12N 2710/20034 20130101; C40B 40/08 20130101; A61P 31/22 20180101; A61K 2039/575 20130101; A61P 37/04 20180101; C12N 2760/16122 20130101; A61K 39/145 20130101; A61P 31/18 20180101; A61K 48/0075 20130101; C12N 2710/16222 20130101; A61P 31/12 20180101; C12N 2710/20022 20130101; A61P 33/02 20180101; C12N 2710/20071 20130101; C40B 50/04 20130101; C12N 2760/16134 20130101; A61P 35/02 20180101; Y02A 50/30 20180101; Y02A 50/39 20180101; A61P 35/00 20180101; C12N 15/67 20130101; C12N 15/79 20130101; C12N 2710/16622 20130101; C12N 2770/24222 20130101; C12N 2770/24234 20130101; Y02A 50/469 20180101; C12N 2710/16634 20130101; A61K 39/29 20130101; C12N 2710/16234 20130101; A61K 48/0066 20130101; A61P 33/04 20180101; C12N 15/85 20130101; A61K 39/12 20130101; A61K 2039/585 20130101; C12N 2800/22 20130101; A61K 2039/53 20130101; A61P 31/04 20180101; A61P 31/10 20180101; A61P 31/20 20180101; A61K 2039/54 20130101
Class at Publication:	435/6.17 ; 435/6.1; 435/6.18
International Class:	C12Q 1/68 20060101 C12Q001/68

Claims

1. A construct system for determining the translational efficiency or phenotypic preference of different synonymous codons, the system comprising a plurality of synthetic constructs, each comprising a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a first construct comprises a first coding sequence for interrogating the translational efficiency or phenotypic preference of a first codon ("the first interrogating codon") that codes for a first amino acid, wherein the reporter polynucleotide of a second construct comprises a second coding sequence for interrogating the translational efficiency or phenotypic preference of a second codon ("the second interrogating codon") that codes for the first amino acid, wherein the first and second coding sequences encode the same amino acid sequence, wherein the first coding sequence comprises the first interrogating codon to code for the first amino acid at one or more positions of the amino acid sequence, wherein the second coding sequence comprises the second interrogating codon to code for the first amino acid at one or more positions of the amino acid sequence, and wherein the first and second coding sequences differ from one another in the choice of the first interrogating codon or the second interrogating codon to code for the first amino acid at the corresponding position(s) in the amino acid sequence, and wherein the first coding sequence comprises the same number of first interrogating codons as the number of second interrogating codons in the second coding sequence.

2. A system according to claim 1, wherein the second coding sequence differs from the first coding sequence by the substitution of the first interrogating codon with the second interrogating codon to code for the first amino acid at the one or more positions of the amino acid sequence.

3. A system according to claim 1, wherein the construct system comprises one or more additional synthetic constructs for interrogating the translational efficiency or phenotypic preference of one or more additional interrogating codons that code for the first amino acid.

4. A system according to claim 3, wherein the construct system comprises a corresponding number of synthetic constructs as the number of synonymous codons that normally encode the first amino acid.

5. A system according to claim 1, wherein the coding sequence of individual synthetic constructs comprises at least 2 interrogating codons of the corresponding type.

6. A system according to claim 1, wherein at least 10% of codons that code for the first amino acid in the coding sequence of individual synthetic constructs are the same interrogating codon.

7. A system according to claim 1, wherein the construct system further comprises a third construct and a fourth construct, wherein the reporter polynucleotide of the third construct comprises a third coding sequence for interrogating the translational efficiency or phenotypic preference of a third codon ("the third interrogating codon") that codes for a second amino acid that is different to the first amino acid, wherein the reporter polynucleotide of the fourth construct comprises a fourth coding sequence for interrogating the translational efficiency or phenotypic preference of a fourth codon ("the fourth interrogating codon") that codes for the second amino acid, wherein the third and fourth coding sequences encode the same amino acid sequence as the first and second coding sequences, wherein the third coding sequence comprises the third interrogating codon to code for the second amino acid at one or more positions of the amino acid sequence, wherein the fourth coding sequence comprises the fourth interrogating codon to code for the second amino acid at one or more positions of the amino acid sequence, and wherein the third and fourth coding sequences differ from one another in the choice of the third interrogating codon or the fourth interrogating codon to code for the second amino acid at the corresponding position(s) in the amino acid sequence.

8. A system according to claim 1, wherein the construct system further comprises synthetic constructs for interrogating the translational efficiency or phenotypic preference of codons that code for other amino acids.

9. A system according to claim 1, wherein the coding sequence of individual reporter polynucleotides encodes a polypeptide that confers a phenotype upon a cell or tissue in which the coding sequence is expressed.

10. A system according to claim 9, wherein the polypeptide is selected from a reporter protein which, when present in a cell or tissue, is detectable either by its presence or activity.

11. A system according to claim 10, wherein the reporter protein is selected from a chemiluminescent reporter protein such as luciferase, a fluorescent protein such as green fluorescent protein, an enzymatic reporter protein such as chloramphenicol acetyl transferase, β-galactosidase, secreted placental alkaline phosphatase, β-lactamase or a growth factor such as human growth hormone.

12. A system according to claim 1, wherein the coding sequence of individual reporter polynucleotides encodes a polypeptide that confers a phenotype upon a cell or tissue in which the coding sequence is not expressed.

13. A system according to claim 12, wherein the polypeptide is a phenotype-associated polypeptide that is the subject of producing a selected phenotype or a phenotype of the same class as the selected phenotype.

14. A system according to claim 1, wherein the reporter polynucleotide of individual synthetic constructs further comprises an ancillary coding sequence that encodes a detectable tag.

15. A system according to claim 14, wherein the tag is a member of a specific binding pair.

16. A system according to claim 14, wherein the ancillary coding sequence of one reporter polynucleotide encodes a different tag than the ancillary coding sequence of another reporter polynucleotide.

17. A method for determining the translational efficiency of a first codon relative to a second codon in a cell of interest, wherein the first codon and the second codon code for the same amino acid, the method comprising: providing a plurality of synthetic constructs, each comprising a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a first construct comprises a first coding sequence for interrogating the translational efficiency of the first codon, wherein the reporter polynucleotide of a second construct comprises a second coding sequence for interrogating the translational efficiency of the second codon, wherein the first and second coding sequences encode the same amino acid sequence, which defines in whole or in part a reporter protein, wherein the first coding sequence comprises the first codon to code for the first amino acid at one or more positions of the amino acid sequence, wherein the second coding sequence comprises the second codon to code for the first amino acid at one or more positions of the amino acid sequence, and wherein the first and second coding sequences differ from one another in the choice of the first codon or the second codon to code for the first amino acid at the corresponding position(s) in the amino acid sequence and wherein the first coding sequence comprises the same number of first interrogating codons as the number of second interrogating codons in the second coding sequence; introducing the first construct into a cell of the same type as the cell of interest; introducing the second construct into a cell of the same type as the cell of interest; measuring expression of the reporter protein from the first construct and from the second construct in the cell; and determining the translational efficiency of the first codon and the translational efficiency of the second codon based on the measured expression of the reporter protein in the cell, to thereby determine the translational efficiency of the first codon relative to the second codon in the cell of interest.

18. A method according to claim 17, further comprising determining a comparison of translational efficiencies of individual synonymous codons in the cell of interest.

19. A method according to claim 17, comprising: introducing an individual synthetic construct into a progenitor of the cell of interest; and differentiating the cell of interest from the progenitor, wherein the cell of interest contains the synthetic construct.

20. A method according to claim 17, wherein the first and second constructs are separately introduced into different cells.

21. A method according to claim 17, wherein the first and second constructs are introduced into the same cell.

22. A method for determining the translational efficiency of a first codon and a second codon in a first cell type relative to a second cell type, the method comprising: providing a plurality of synthetic constructs, each comprising a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a first construct comprises a first coding sequence for interrogating the translational efficiency of the first codon, wherein the reporter polynucleotide of a second construct comprises a second coding sequence for interrogating the translational efficiency of the second codon, wherein the first and second coding sequences encode the same amino acid sequence, which defines in whole or in part a reporter protein, wherein the first coding sequence comprises the first codon to code for the first amino acid at one or more positions of the amino acid sequence, wherein the second coding sequence comprises the second codon to code for the first amino acid at one or more positions of the amino acid sequence, and wherein the first and second coding sequences differ from one another in the choice of the first codon or the second codon to code for the first amino acid at the corresponding position(s) in the amino acid sequence and wherein the first coding sequence comprises the same number of first interrogating codons as the number of second interrogating codons in the second coding sequence; separately introducing the first construct into the first cell type and into the second cell type; separately introducing the second construct into the first cell type and into the second cell type; measuring expression of the reporter protein in the first cell type and in the second cell type to which the first construct was provided; measuring expression of the reporter protein in the first cell type and in the second cell type to which the second construct was provided; determining the translational efficiency of the first codon in the first cell type and in the second cell type based on the measured expression of the reporter protein in the first cell type and in the second cell type, respectively, to which the first construct was provided, to thereby determine the translational efficiency of the first codon in the first cell type relative to the second cell type; and determining the translational efficiency of the second codon in the first cell type and in the second cell type based on the measured expression of the reporter protein in the first cell type and in the second cell type, respectively, to which the second construct was provided, to thereby determine the translational efficiency of the second codon in the first cell type relative to the second cell type.

23. A method according to claim 22, further comprising determining a comparison of translational efficiencies of individual synonymous codons in the first cell type relative to the second cell type.

24. A method according to claim 22, further comprising: introducing an individual synthetic construct into a progenitor of a cell selected from the first cell type or the second cell type; and differentiating the cell from the progenitor, wherein the cell contains the synthetic construct.

25. A method for determining the preference of a first codon relative to the preference of a second codon for producing a selected phenotype ("the phenotypic preference") in an organism of interest or part thereof, wherein the first codon and the second codon code for the same amino acid, the method comprising: providing a plurality of synthetic constructs, each comprising a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a first construct comprises a first coding sequence for interrogating the phenotypic preference of the first codon, wherein the reporter polynucleotide of a second construct comprises a second coding sequence for interrogating the phenotypic preference of the second codon, wherein the first and second coding sequences encode the same amino acid sequence, which defines in whole or in part a reporter protein, which produces, or which is predicted to produce, the selected phenotype or a phenotype of the same class as the selected phenotype, wherein the first coding sequence comprises the first codon to code for the first amino acid at one or more positions of the amino acid sequence, wherein the second coding sequence comprises the second codon to code for the first amino acid at one or more positions of the amino acid sequence, and wherein the first and second coding sequences differ from one another in the choice of the first codon or the second codon to code for the first amino acid at the corresponding position(s) in the amino acid sequence and wherein the first coding sequence comprises the same number of first interrogating codons as the number of second interrogating codons in the second coding sequence; introducing the first construct into a first test organism or part thereof, wherein the test organism is selected from the group consisting of an organism of the same species as the organism of interest and an organism that is related to the organism of interest; introducing the second construct into a second test organism or part thereof, wherein the second test organism is of the same type as the first organism; determining the quality of the corresponding phenotype displayed by the first test organism or part and by the second test organism or part; and determining the phenotypic preference of the first codon and the phenotypic preference of the second codon based, respectively, on the quality of the corresponding phenotype displayed by the first test organism or part and by the second test organism or part, to thereby determine the phenotypic preference of the first codon relative to the phenotypic preference of the second codon in the organism of interest or part thereof.

26. A method according to claim 25, further comprising determining a comparison of phenotypic preferences of individual synonymous codons in the organism of interest or part thereof.

27. A method according to claim 25, further comprising: introducing an individual synthetic construct into a progenitor of the test organism or part; and growing a non-human organism or part from the progenitor, wherein the organism or part contains the synthetic construct.

28. A method according to claim 25, further comprising: introducing an individual synthetic construct into a progenitor of the test organism or part; and growing a non-human organism or part from the progenitor, wherein the organism or part comprises a cell containing the synthetic construct.

29. A method of constructing a synthetic polynucleotide from which an encoded polypeptide is produced at a higher level in a cell of interest than from a parent polynucleotide that encodes the same polypeptide, the method comprising: determining the translational efficiency of different synonymous codons in cells of the same type as the cell of interest, as defined in claim 17, to thereby determine a comparison of translational efficiencies of individual synonymous codons in the cell of interest; selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a higher translational efficiency than the first codon in the cell of interest according to the comparison of translational efficiencies; and replacing the first codon with the synonymous codon to construct the synthetic polynucleotide.

30. A method according to claim 29, wherein the synonymous codon is selected on the basis that it corresponds to an interrogating codon in a synthetic construct from which the reporter protein is expressed in the cell of interest at a level that is at least about 10% higher than the level of the reporter protein expressed from a synthetic construct that comprises the first codon as the interrogating codon.

31. A method of constructing a synthetic polynucleotide from which an encoded polypeptide is produced at a lower level in a cell of interest than from a parent polynucleotide that encodes the same polypeptide, the method comprising: determining the translational efficiency of different synonymous codons in cells of the same type as the cell of interest, as defined in claim 17, to thereby determine a comparison of translational efficiencies of individual synonymous codons in the cell of interest; selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a lower translational efficiency than the first codon in the cell of interest according to the comparison of translational efficiencies; and replacing the first codon with the synonymous codon to construct the synthetic polynucleotide.

32. A method according to claim 31, wherein the synonymous codon is selected on the basis that it corresponds to an interrogating codon in a synthetic construct from which the reporter protein is expressed in the cell of interest at a level that is no more than 90% of the level of the reporter protein expressed from a synthetic construct that comprises the first codon as the interrogating codon.

33. A method of constructing a synthetic polynucleotide from which an encoded polypeptide is produced at a higher level in a first cell than in a second cell, the method comprising: determining the translational efficiency of different synonymous codons in cells of the same type as the first cell and in cells of the same type as the second cell, as defined in claim 22, to thereby determine a comparison of translational efficiencies of individual synonymous codons between the first cell and the second cell; selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a higher translational efficiency in the first cell than in the second cell according to the comparison of translational efficiencies; and replacing the first codon with the synonymous codon to construct the synthetic polynucleotide.

34. A method according to claim 28, wherein the synonymous codon is the same as the interrogating codon in a synthetic construct from which the reporter protein is expressed in the first cell at a level that is at least about 10% higher than the level of the reporter protein expressed from the same synthetic construct in the second cell.

35. A method of constructing a synthetic polynucleotide from which a polypeptide is producible to confer a selected phenotype upon an organism of interest or part thereof in a different quality than that conferred by a parent polynucleotide that encodes the same polypeptide, the method comprising: determining the preference of different synonymous codons for producing the selected phenotype ("the phenotypic preference") in test organisms or parts thereof, as defined in claim 25, wherein the test organisms are selected from the group consisting of an organism of the same species as the organism of interest and an organism that is related to the organism of interest, to thereby determine a comparison of phenotypic preferences of individual synonymous codons in the organism of interest; selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a different phenotypic preference than the first codon in the comparison of phenotypic preferences in the organism or part thereof; and replacing the first codon with the synonymous codon to construct the synthetic polynucleotide.

36. A method according to claim 35, wherein the synthetic polynucleotide confers the selected phenotype upon the organism of interest or part thereof in a higher quality than that conferred by the parent polynucleotide.

37. A method according to claim 36, wherein the synonymous codon is selected on the basis that it corresponds to an interrogating codon in a synthetic construct that confers the selected phenotype in the organism of interest or part thereof in a quality that is at least about 10% higher than the quality of the phenotype conferred by the synthetic construct comprising the first codon as the interrogating codon.

38. A method according to claim 35, wherein the synthetic polynucleotide confers the selected phenotype upon the organism of interest or part thereof in a lower quality than that conferred by the parent polynucleotide.

39. A method according to claim 38, wherein the synonymous codon is selected on the basis that it corresponds to an interrogating codon in a synthetic construct that confers the selected phenotype in the organism of interest or part thereof in a quality that is no more than 90% of the quality of the phenotype conferred by the synthetic construct comprising the first codon as the interrogating codon.
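
By way of illustration only, and not as part of the claims, the relationship between the first and second coding sequences recited in claim 1 can be sketched in Python. The parent sequence, the function names and the choice of the alanine codons GCA and GCG as interrogating codons are hypothetical; the sketch substitutes a single interrogating codon at every position that codes for the first amino acid, then checks that both sequences encode the same amino acid sequence and carry the same number of interrogating codons.

```python
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(seq):
    """Split an in-frame DNA coding sequence into a list of codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate an in-frame DNA coding sequence ('*' marks a stop codon)."""
    return "".join(CODON_TABLE[c] for c in split_codons(seq))


def make_interrogating_sequence(parent, amino_acid, interrogating_codon):
    """Use one interrogating codon wherever the parent codes for amino_acid."""
    if CODON_TABLE[interrogating_codon] != amino_acid:
        raise ValueError("interrogating codon does not code for the amino acid")
    return "".join(
        interrogating_codon if CODON_TABLE[c] == amino_acid else c
        for c in split_codons(parent)
    )


# Hypothetical parent reporter fragment: Met-Ala-Lys-Ala-Glu-Ala-stop.
parent = "ATGGCTAAAGCCGAAGCATAA"
first = make_interrogating_sequence(parent, "A", "GCA")   # first construct
second = make_interrogating_sequence(parent, "A", "GCG")  # second construct

same_protein = translate(first) == translate(second)
same_count = split_codons(first).count("GCA") == split_codons(second).count("GCG")
```

In this sketch every alanine position is substituted; the claims also cover substitution at only some positions, provided both constructs carry the same number of interrogating codons.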

Description

FIELD OF THE INVENTION

[0001] The present invention relates generally to gene expression. More particularly, the present invention relates to construct systems and methods for comparing different iso-accepting codons according to their preference for translating RNA transcripts into proteins in cells or tissues of interest or for producing a selected phenotype in an organism of interest or part thereof. The codon preference comparisons thus obtained are particularly useful for modifying the translational efficiency of protein-encoding polynucleotides in cells or tissues of interest or for modulating the quality of a selected phenotype conferred by a phenotype-associated polypeptide upon an organism of interest or part thereof.

BACKGROUND OF THE INVENTION

[0002] The expression of foreign heterologous genes in transformed cells is now commonplace. A large number of mammalian genes, including, for example, murine and human genes, have been successfully expressed in various host cells, including bacterial, yeast, insect, plant and mammalian host cells. Nevertheless, despite the burgeoning knowledge of expression systems and recombinant DNA technology, significant obstacles remain when one attempts to express a foreign or synthetic gene in a selected host cell. For example, translation of a synthetic gene, even when coupled with a strong promoter, often proceeds much more slowly than would be expected. The same is frequently true of exogenous genes that are foreign to the host cell. This lower than expected translation efficiency is often due to the protein coding regions of the gene having a codon usage pattern that does not resemble those of highly expressed genes in the host cell. It is known in this regard that codon utilization is highly biased and varies considerably in different organisms and that biases in codon usage can alter peptide elongation rates. It is also known that codon usage patterns are related to the relative abundance of tRNA isoacceptors, and that genes encoding proteins of high versus low abundance show differences in their codon preferences.

[0003] The implications of codon preference phenomena on gene expression are manifest in that these phenomena can affect the translational efficiency of messenger RNA (mRNA). It is widely known in this regard that translation of "rare codons", for which the corresponding iso-tRNA is in low abundance relative to other iso-tRNAs, may cause a ribosome to pause during translation which can lead to a failure to complete a nascent polypeptide chain and an uncoupling of transcription and translation. Thus, the expression of an exogenous gene may be impeded severely if a particular host cell of an organism or the organism itself has a low abundance of iso-tRNAs corresponding to one or more codons of the exogenous gene. Accordingly, a major aim of investigators in this field is to first ascertain the codon preference for particular cells in which an exogenous gene is to be expressed, and to subsequently alter the codon composition of that gene for optimized expression in those cells.

[0004] Codon-optimization techniques are known for improving the translational kinetics of translationally inefficient protein coding regions. Traditionally, these techniques have been based on the replacement of codons that are rarely or infrequently used in the host cell with those that are host-preferred. Codon frequencies can be derived from literature sources for the highly expressed genes of many organisms (see, for example, Nakamura et al., 1996, Nucleic Acids Res 24: 214-215). These frequencies are generally expressed on an `organism-wide average basis` as the percentage of occasions that a synonymous codon is used to encode a corresponding amino acid across a collection of protein-encoding genes of that organism, which are preferably highly expressed.
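
As a minimal sketch of how such frequencies might be tabulated, the following Python fragment counts codons across a collection of coding sequences and expresses each count as a fraction of all codons used for the same amino acid. The function name and the toy input sequence are hypothetical, and the standard genetic code is assumed; real compilations draw on curated sets of highly expressed genes.

```python
from collections import Counter
from itertools import product

# Standard genetic code, written in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(coding_sequences):
    """Map each observed codon to its share of its amino acid's codons."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("U", "T")
        # Ignore any trailing bases that do not complete a codon.
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1
    per_amino_acid = Counter()
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {c: n / per_amino_acid[CODON_TABLE[c]] for c, n in counts.items()}


# Toy input: Met-Ala-Ala-Ala-Ala-stop, with alanine as GCT, GCT, GCC, GCA.
usage = codon_usage(["ATGGCTGCTGCCGCATAA"])
```

Here `usage["GCT"]` is the fraction of alanine codons in the input that are GCT, which corresponds to the percentage described above once multiplied by 100.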

[0005] Typically, codons are classified as: (a) "common" codons (or "preferred" codons) if their frequency of usage is above about 4/3 × the frequency of usage that would be expected in the absence of any bias in codon usage; (b) "rare" codons (or "non-preferred" codons) if their frequency of usage is below about 2/3 × the frequency of usage that would be expected in the absence of any bias in codon usage; and (c) "intermediate" codons (or "less preferred" codons) if their frequency of usage is in-between the frequency of usage of "common" codons and of "rare" codons. Since an amino acid can be encoded by 2, 3, 4 or 6 codons, the frequency of usage of any selected codon, which would be expected in the absence of any bias in codon usage, will be dependent upon the number of synonymous codons which code for the same amino acid as the selected codon. Accordingly, for a particular amino acid, the frequency thresholds for classifying codons in the "common", "intermediate" and "rare" categories will be dependent upon the number of synonymous codons for that amino acid. Consequently, for amino acids having 6 choices of synonymous codon, the frequency of codon usage that would be expected in the absence of any bias in codon usage is 16% and thus the "common", "intermediate" and "rare" codons are defined as those codons that have a frequency of usage above 20%, between 10 and 20% and below 10%, respectively. For amino acids having 4 choices of synonymous codon, the frequency of codon usage that would be expected in the absence of codon usage bias is 25% and thus the "common", "intermediate" and "rare" codons are defined as those codons that have a frequency of usage above 33%, between 16 and 33% and below 16%, respectively. 
For isoleucine, which is the only amino acid having 3 choices of synonymous codon, the frequency of codon usage that would be expected in the absence of any bias in codon usage is 33% and thus the "common", "intermediate" and "rare" codons for isoleucine are defined as those codons that have a frequency of usage above 45%, between 20 and 45% and below 20%, respectively. For amino acids having 2 choices of synonymous codon, the frequency of codon usage that would be expected in the absence of codon usage bias is 50% and thus the "common", "intermediate" and "rare" codons are defined as those codons that have a frequency of usage above 60%, between 30 and 60% and below 30%, respectively. Thus, the categorization of codons into the "common", "intermediate" and "rare" classes (or "preferred", "less preferred" or "non preferred", respectively) has been based conventionally on a compilation of codon usage for an organism in general (e.g., `human-wide`) or for a class of organisms in general (e.g., `mammal-wide`). For example, reference may be made to Seed (see U.S. Pat. Nos. 5,786,464 and 5,795,737) who discloses preferred, less preferred and non-preferred codons for mammalian cells in general. However, the present inventor revealed in WO 99/02694 and in WO 00/42190 that there are substantial differences in the relative abundance of particular iso-tRNAs in different cells or tissues of a single multicellular organism (e.g., a mammal or a plant) and that this plays a pivotal role in protein translation from a coding sequence with a given codon usage or composition.
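
The conventional classification described in paragraph [0005] can be sketched as a small lookup. The thresholds in the table are the rounded values stated in the text for amino acids with 6, 4, 3 and 2 synonymous codons; the function names are hypothetical, and treating a frequency that falls exactly on a threshold as "intermediate" is an assumption, since the text does not address that case. The helper `rule_thresholds` shows the unrounded "about 4/3" and "about 2/3" rule for comparison.

```python
# (common_above, rare_below) thresholds as stated in the text, keyed by the
# number of synonymous codons for the amino acid.
STATED_THRESHOLDS = {
    6: (0.20, 0.10),
    4: (0.33, 0.16),
    3: (0.45, 0.20),
    2: (0.60, 0.30),
}


def rule_thresholds(n_synonymous):
    """Unrounded thresholds: 4/3 and 2/3 of the unbiased frequency 1/n."""
    expected = 1.0 / n_synonymous
    return (4.0 / 3.0 * expected, 2.0 / 3.0 * expected)


def classify_codon(frequency, n_synonymous):
    """Classify a usage frequency (0..1) as common, intermediate or rare."""
    common_above, rare_below = STATED_THRESHOLDS[n_synonymous]
    if frequency > common_above:
        return "common"
    if frequency < rare_below:
        return "rare"
    # Assumption: values on a threshold are treated as intermediate.
    return "intermediate"


examples = [
    classify_codon(0.25, 6),  # above 20% for a six-codon amino acid
    classify_codon(0.15, 4),  # below 16% for a four-codon amino acid
    classify_codon(0.30, 3),  # between 20% and 45% for isoleucine
    classify_codon(0.50, 2),  # between 30% and 60% for a two-codon amino acid
]
```

The stated thresholds are approximations of the 4/3 and 2/3 rule rather than exact products, which is why the sketch keeps them in a table instead of computing them.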

[0006] Thus, in contrast to the art-recognized presumption that different cells of a multicellular organism have the same bias in codon usage, it was revealed for the first time that one cell type of a multicellular organism uses codons in a manner distinct from another cell type of the same organism. In other words, it was discovered that different cells of an organism can exhibit different translational efficiencies for the same codon and that it was not possible to predict which codons would be preferred, less preferred or non preferred in a selected cell type. Accordingly, it was proposed that differences in codon translational efficiency between cell types could be exploited, together with codon composition of a gene, to regulate the production of a protein in, or to direct that production to, a chosen cell type.

[0007] Therefore, in order to optimize the expression of a protein-encoding polynucleotide in a particular cell type, WO 99/02694 and WO 00/42190 teach that it is necessary to first determine the translational efficiency for each codon in that cell type, rather than to rely on codon frequencies calculated on an organism-wide average basis, and then to codon modify the polynucleotide based on that determination. WO 00/42190 further teaches a vector system for ranking synonymous codons according to their translational efficiencies. This vector system comprises a plurality of synthetic constructs, each comprising a regulatory sequence that is operably linked to a tandem repeat of a codon fused in frame with a reporter polynucleotide that encodes a reporter protein, wherein the tandemly repeated codon of one construct is different to the tandemly repeated codon of another. In this system, the tandemly repeated codon is thought to cause a ribosome to pause during translation if the iso-tRNA corresponding to the tandemly repeated codon is limiting. Accordingly, the levels of reporter protein produced using this vector system are sensitive to the intracellular abundance of the iso-tRNA species corresponding to the tandemly repeated codon and provide, therefore, a direct correlation of a cell's or tissue's preference for translating a given codon.
This means, for example, that if the levels of the reporter protein obtained in a cell or tissue type to which a synthetic construct having a first tandemly repeated codon is provided are lower than the levels expressed in the same cell or tissue type to which a different synthetic construct having a second tandemly repeated codon is provided (i.e., wherein the first tandemly repeated codon is different than, but synonymous with, the second tandemly repeated codon), then it can be deduced that the second tandemly repeated codon has a higher translational efficiency than the first tandemly repeated codon in the cell or tissue type.

[0008] The present inventor further determined a strategy for enhancing or reducing the quality of a selected phenotype (immunity, tolerance, pathogen resistance, enhancement or prevention of a repair process, pest resistance, frost resistance, herbicide tolerance etc) that is displayed, or proposed to be displayed, by an organism of interest. This strategy, which is disclosed in WO 2004/042059, involves codon modification of a polynucleotide that encodes a phenotype-associated polypeptide that either by itself, or in association with other molecules, in the organism of interest imparts or confers the selected phenotype upon the organism. Unlike previous methods, which rely on data that provide a ranking of synonymous codons according to their preference of usage or according to their translational efficiencies, this strategy is based on ranking individual synonymous codons according to their preference of usage by the organism or class of organisms, or by a part thereof, for producing the selected phenotype. An illustrative method for determining codon phenotypic preferences is disclosed in WO 2004/042059, which employs the synthetic construct system disclosed in WO 00/42190 to derive a set of synonymous codons that may display a range of phenotypic preferences, which can be used as a basis for rationally selecting a codon in a polynucleotide that encodes a phenotype-associated polypeptide for replacement with a synonymous codon that has a different phenotypic preference.

SUMMARY OF THE INVENTION

[0009] The present invention is predicated in part on the discovery that the sensitivity of determining the translational efficiency or phenotypic preference of different synonymous codons can be improved using a construct system that employs different reporter polynucleotides that encode the same amino acid sequence, wherein individual reporter polynucleotides use the same codon (also referred to herein as "an interrogating codon") to code for a particular amino acid at one or more positions of the amino acid sequence, and wherein the interrogating codon of one reporter polynucleotide is different to but synonymous with the interrogating codon of another reporter polynucleotide. In specific embodiments, the sensitivity is improved further by incorporating two or more interrogating codons to code for the particular amino acid in the amino acid sequence.

[0010] Thus, in one aspect of the present invention, construct systems are provided for determining the translational efficiency or phenotypic preference of different synonymous codons. These systems generally comprise a plurality of synthetic constructs, each comprising a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a first construct comprises a first coding sequence for interrogating the translational efficiency or phenotypic preference of a first codon ("the first interrogating codon") that codes for a first amino acid, wherein the reporter polynucleotide of a second construct comprises a second coding sequence for interrogating the translational efficiency or phenotypic preference of a second codon ("the second interrogating codon") that codes for the first amino acid, wherein the first and second coding sequences encode the same amino acid sequence, wherein the first coding sequence comprises the first interrogating codon to code for the first amino acid at one or more positions of the amino acid sequence, wherein the second coding sequence comprises the second interrogating codon to code for the first amino acid at one or more positions of the amino acid sequence, and wherein the first and second coding sequences differ from one another in the choice of the first interrogating codon or the second interrogating codon to code for the first amino acid at the corresponding position(s) in the amino acid sequence. Suitably, the first coding sequence comprises the same number of first interrogating codons as the number of second interrogating codons in the second coding sequence. In specific embodiments, the second coding sequence differs from the first coding sequence by the substitution of the first interrogating codon with the second interrogating codon to code for the first amino acid at the one or more positions of the amino acid sequence. In some embodiments, the construct system comprises one or more additional synthetic constructs for interrogating the translational efficiency or phenotypic preference of one or more additional interrogating codons that code for the first amino acid. In illustrative examples of this type, the construct system comprises a corresponding number of synthetic constructs as the number of synonymous codons that normally encode the first amino acid.
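The relationship between the coding sequences of such a construct system can be illustrated with a short sketch that derives one coding sequence per synonymous codon from a single parent sequence. The parent sequence, the function names and the choice to substitute every position of the interrogated amino acid are all illustrative assumptions; the text itself requires substitution only at "one or more positions", and regulatory sequences are not modelled.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(seq: str) -> list[str]:
    """Split an in-frame coding sequence into codons."""
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq: str) -> str:
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(seq))


def make_interrogating_variant(parent: str, amino_acid: str, interrogating_codon: str) -> str:
    """Use `interrogating_codon` at every position that encodes `amino_acid`."""
    if CODON_TO_AA[interrogating_codon] != amino_acid:
        raise ValueError("interrogating codon does not encode the chosen amino acid")
    return "".join(
        interrogating_codon if CODON_TO_AA[codon] == amino_acid else codon
        for codon in split_codons(parent)
    )


# Hypothetical toy parent sequence encoding Met-Ala-Lys-Ala-Ala-stop.
parent = "ATGGCAAAAGCGGCCTAA"
constructs = {
    codon: make_interrogating_variant(parent, "A", codon)
    for codon in ("GCT", "GCC", "GCA", "GCG")
}
for codon, seq in constructs.items():
    print(codon, seq, translate(seq))
```

Every variant translates to the same amino acid sequence and carries the same number of interrogating codons, so the variants differ only in which synonymous codon is used for alanine.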

[0011] In some embodiments, the coding sequence of individual synthetic constructs comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400 or 500 interrogating codons of the corresponding type. Suitably, at least 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%, 99% or even 100% of codons that code for the first amino acid in the coding sequence of individual synthetic constructs are the same interrogating codon.
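The two quantities referred to in this paragraph, the number of interrogating codons and the percentage of codons for the first amino acid that are the interrogating codon, can be computed as sketched here. The function name and the example sequence are hypothetical, and the set of synonymous codons is supplied by the caller rather than looked up.

```python
def interrogating_codon_usage(coding_sequence: str, interrogating_codon: str,
                              synonymous_codons: set[str]) -> tuple[int, float]:
    """Return (count, percent) of the interrogating codon among all codons
    in the sequence that encode the same amino acid."""
    codons = [coding_sequence[i:i + 3] for i in range(0, len(coding_sequence), 3)]
    # Positions that encode the amino acid of interest, whatever codon is used.
    sites = [codon for codon in codons if codon in synonymous_codons]
    if not sites:
        return 0, 0.0
    count = sites.count(interrogating_codon)
    return count, 100.0 * count / len(sites)


# Hypothetical sequence: Met-Ala-Lys-Ala-Ala-stop with two GCT and one GCC.
alanine_codons = {"GCT", "GCC", "GCA", "GCG"}
count, percent = interrogating_codon_usage("ATGGCTAAAGCTGCCTAA", "GCT", alanine_codons)
print(count, round(percent, 1))
```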

[0012] In some embodiments, the construct system further comprises a third construct and a fourth construct, wherein the reporter polynucleotide of the third construct comprises a third coding sequence for interrogating the translational efficiency or phenotypic preference of a third codon ("the third interrogating codon") that codes for a second amino acid that is different to the first amino acid, wherein the reporter polynucleotide of the fourth construct comprises a fourth coding sequence for interrogating the translational efficiency or phenotypic preference of a fourth codon ("the fourth interrogating codon") that codes for the second amino acid, wherein the third and fourth coding sequences encode the same amino acid sequence as the first and second coding sequences, wherein the third coding sequence comprises the third interrogating codon to code for the second amino acid at one or more positions of the amino acid sequence, wherein the fourth coding sequence comprises the fourth interrogating codon to code for the second amino acid at one or more positions of the amino acid sequence, and wherein the third and fourth coding sequences differ from one another in the choice of the third interrogating codon or the fourth interrogating codon to code for the second amino acid at the corresponding position(s) in the amino acid sequence.

[0013] In some embodiments, the construct system further comprises synthetic constructs for interrogating the translational efficiency or phenotypic preference of codons that code for other amino acids.

[0014] In some embodiments, the coding sequence of individual reporter polynucleotides encodes an amino acid sequence that confers a phenotype upon a cell or tissue in which the coding sequence is expressed (e.g., an amino acid sequence of a reporter protein which, when present in a cell or tissue, is detectable either by its presence or activity, including, but not limited to, a chemiluminescent reporter protein such as luciferase, a fluorescent protein such as green fluorescent protein, an enzymatic reporter protein such as chloramphenicol acetyl transferase, β-galactosidase, secreted placental alkaline phosphatase, β-lactamase or a growth factor such as human growth hormone). Such reporter proteins are useful, for example, in determining the translational efficiency of different synonymous codons in a cell or tissue type of interest. In other embodiments, the coding sequence of individual reporter polynucleotides encodes an amino acid sequence that confers a phenotype upon a cell or tissue in which the coding sequence is not expressed including, for example, the amino acid sequence of a phenotype-associated polypeptide that is the subject of producing a selected phenotype (e.g., cellular immunity to melanoma) or a phenotype of the same class as the selected phenotype (e.g., a cellular immune response), as for example disclosed in WO 2004/042059, which is hereby incorporated by reference herein in its entirety.

[0015] In some embodiments, the reporter polynucleotide of individual synthetic constructs further comprises an ancillary coding sequence that encodes a detectable tag, which is suitably a member of a specific binding pair, which includes for example, antibody-antigen (or hapten) pairs, ligand-receptor pairs, enzyme-substrate pairs, biotin-avidin pairs, and the like. In illustrative examples of this type, the ancillary coding sequence of one reporter polynucleotide encodes a different tag than the ancillary coding sequence of another reporter polynucleotide. In these examples, it is possible to detectably distinguish the polypeptide products of different reporter polynucleotides in the same cell or organism of interest or part thereof, thereby permitting simultaneous determination of the translational efficiencies of different interrogating codons in the same cell or organism or part.

[0016] In another aspect, the present invention provides methods for determining the translational efficiency of a first codon relative to a second codon in a cell of interest,

[0017] wherein the first codon and the second codon code for the same amino acid. These methods generally comprise: [0018] providing a plurality of synthetic constructs, each comprising a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a first construct comprises a first coding sequence for interrogating the translational efficiency of the first codon, wherein the reporter polynucleotide of a second construct comprises a second coding sequence for interrogating the translational efficiency of the second codon, wherein the first and second coding sequences encode the same amino acid sequence, which defines in whole or in part a reporter protein, wherein the first coding sequence comprises the first codon to code for the first amino acid at one or more positions of the amino acid sequence, wherein the second coding sequence comprises the second codon to code for the first amino acid at one or more positions of the amino acid sequence, and wherein the first and second coding sequences differ from one another in the choice of the first codon or the second codon to code for the first amino acid at the corresponding position(s) in the amino acid sequence; [0019] introducing the first construct into a cell of the same type as the cell of interest; [0020] introducing the second construct into a cell of the same type as the cell of interest; [0021] measuring expression of the reporter protein from the first construct and from the second construct in the cell; and [0022] determining the translational efficiency of the first codon and the translational efficiency of the second codon based on the measured expression of the reporter protein in the cell, to thereby determine the translational efficiency of the first codon relative to the second codon in the cell of interest.

[0023] In some embodiments, the first and second constructs are separately introduced into different cells. In other embodiments, the first and second constructs are introduced into the same cell.

[0024] Suitably, the methods further comprise determining a comparison of translational efficiencies of individual synonymous codons in the cell of interest.
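The determining and comparing steps described above can be sketched as a calculation on measured reporter levels. This is an illustrative sketch under stated assumptions: reporter expression is taken directly as the proxy for translational efficiency, replicate measurements are averaged, and the measurement values and function names are hypothetical. Normalization for construct delivery or transcript abundance is not addressed here.

```python
from statistics import mean


def mean_levels(reporter_levels: dict[str, list[float]]) -> dict[str, float]:
    """Average replicate reporter measurements for each interrogating codon."""
    return {codon: mean(values) for codon, values in reporter_levels.items()}


def relative_efficiency(reporter_levels: dict[str, list[float]],
                        reference_codon: str) -> dict[str, float]:
    """Express each codon's mean reporter level relative to a reference codon."""
    means = mean_levels(reporter_levels)
    reference = means[reference_codon]
    return {codon: level / reference for codon, level in means.items()}


def rank_codons(reporter_levels: dict[str, list[float]]) -> list[str]:
    """Order synonymous codons from highest to lowest mean reporter level."""
    means = mean_levels(reporter_levels)
    return sorted(means, key=means.get, reverse=True)


# Hypothetical replicate measurements (arbitrary units) for three alanine codons.
measurements = {
    "GCT": [120.0, 130.0],
    "GCC": [100.0, 100.0],
    "GCA": [50.0, 60.0],
}
relative = relative_efficiency(measurements, reference_codon="GCC")
ranking = rank_codons(measurements)
print(relative)
print(ranking)
```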

[0025] In some embodiments, the methods further comprise: [0026] introducing an individual synthetic construct into a progenitor of the cell of interest; and [0027] differentiating the cell of interest from the progenitor,

[0028] wherein the cell of interest contains the synthetic construct.

[0029] In yet another aspect, the present invention provides methods for determining the translational efficiency of a first codon and a second codon in a first cell type relative to a second cell type. These methods generally comprise: [0030] providing a plurality of synthetic constructs, each comprising a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a first construct comprises a first coding sequence for interrogating the translational efficiency of the first codon, wherein the reporter polynucleotide of a second construct comprises a second coding sequence for interrogating the translational efficiency of the second codon, wherein the first and second coding sequences encode the same amino acid sequence, which defines in whole or in part a reporter protein, wherein the first coding sequence comprises the first codon to code for the first amino acid at one or more positions of the amino acid sequence, wherein the second coding sequence comprises the second codon to code for the first amino acid at one or more positions of the amino acid sequence, and wherein the first and second coding sequences differ from one another in the choice of the first codon or the second codon to code for the first amino acid at the corresponding position(s) in the amino acid sequence; [0031] separately introducing the first construct into the first cell type and into the second cell type; [0032] separately introducing the second construct into the first cell type and into the second cell type; [0033] measuring expression of the reporter protein in the first cell type and in the second cell type to which the first construct was provided; [0034] measuring expression of the reporter protein in the first cell type and in the second cell type to which the second construct was provided; [0035] determining the translational efficiency of the first codon in the first cell type and in the second cell type based on the 
measured expression of the reporter protein in the first cell type and in the second cell type, respectively, to which the first construct was provided, to thereby determine the translational efficiency of the first codon in the first cell type relative to the second cell type; and [0036] determining the translational efficiency of the second codon in the first cell type and in the second cell type based on the measured expression of the reporter protein in the first cell type and in the second cell type, respectively, to which the second construct was provided, to thereby determine the translational efficiency of the second codon in the first cell type relative to the second cell type.

[0037] In some embodiments, the methods further comprise determining a comparison of translational efficiencies of individual synonymous codons in the first cell type relative to the second cell type.

[0038] In some embodiments, the methods further comprise: [0039] introducing an individual synthetic construct into a progenitor of a cell selected from the first cell type or the second cell type; and [0040] differentiating the cell from the progenitor,

[0041] wherein the cell contains the synthetic construct.

[0042] Still another aspect of the present invention provides methods for determining the preference of a first codon relative to the preference of a second codon for producing a selected phenotype ("the phenotypic preference") in an organism of interest or part thereof, wherein the first codon and the second codon code for the same amino acid. These methods generally comprise: [0043] providing a plurality of synthetic constructs, each comprising a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a first construct comprises a first coding sequence for interrogating the phenotypic preference of the first codon, wherein the reporter polynucleotide of a second construct comprises a second coding sequence for interrogating the phenotypic preference of the second codon, wherein the first and second coding sequences encode the same amino acid sequence, which defines in whole or in part a reporter protein, which produces, or which is predicted to produce, the selected phenotype or a phenotype of the same class as the selected phenotype, wherein the first coding sequence comprises the first codon to code for the first amino acid at one or more positions of the amino acid sequence, wherein the second coding sequence comprises the second codon to code for the first amino acid at one or more positions of the amino acid sequence, and wherein the first and second coding sequences differ from one another in the choice of the first codon or the second codon to code for the first amino acid at the corresponding position(s) in the amino acid sequence; [0044] introducing the first construct into a first test organism or part thereof, wherein the test organism is selected from the group consisting of an organism of the same species as the organism of interest and an organism that is related to the organism of interest; [0045] introducing the second construct into a second test organism or part thereof, wherein the second test organism is of the same type as the first organism; [0046] determining the quality of the corresponding phenotype displayed by the first test organism or part and by the second test organism or part; and [0047] determining the phenotypic preference of the first codon and the phenotypic preference of the second codon based, respectively, on the quality of the corresponding phenotype displayed by the first test organism or part and by the second test organism or part, to thereby determine the phenotypic preference of the first codon relative to the phenotypic preference of the second codon in the organism of interest or part thereof.

[0048] In some embodiments, the methods further comprise determining a comparison of phenotypic preferences of individual synonymous codons in the organism of interest or part thereof.

[0049] In some embodiments, the methods further comprise: [0050] introducing an individual synthetic construct into a progenitor of the test organism or part; and [0051] growing a non-human organism or part from the progenitor,

[0052] wherein the organism or part contains the synthetic construct.

[0053] In other embodiments, the methods further comprise: [0054] introducing an individual synthetic construct into a progenitor of the test organism or part; and [0055] growing a non-human organism or part from the progenitor,

[0056] wherein the organism or part comprises a cell containing the synthetic construct.

[0057] In still another aspect, the present invention provides methods of constructing a synthetic polynucleotide from which an encoded polypeptide is produced at a higher level in a cell of interest than from a parent polynucleotide that encodes the same polypeptide. These methods generally comprise: [0058] determining the translational efficiency of different synonymous codons in cells of the same type as the cell of interest, as broadly described above, to thereby determine a comparison of translational efficiencies of individual synonymous codons in the cell of interest; [0059] selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a higher translational efficiency than the first codon in the cell of interest according to the comparison of translational efficiencies; and [0060] replacing the first codon with the synonymous codon to construct the synthetic polynucleotide.

[0061] In some embodiments, the synonymous codon is selected on the basis that it corresponds to an interrogating codon in a synthetic construct from which the reporter protein is expressed in the cell of interest at a level that is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher or at least about 2, 3, 4, 5, 6, 7, 8, 9, or 10 times higher than the level of the reporter protein expressed from a synthetic construct that comprises the first codon as the interrogating codon.
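The selection criterion in the preceding two paragraphs can be sketched as a threshold test on reporter levels. The function name, the default margin of 10% and the reporter values are hypothetical; the dictionary is assumed to hold only codons that are synonymous with the first codon, each mapped to the reporter level measured for the construct that uses it as the interrogating codon.

```python
from typing import Optional


def choose_higher_efficiency_codon(first_codon: str,
                                   reporter_levels: dict[str, float],
                                   min_percent_higher: float = 10.0) -> Optional[str]:
    """Pick the synonymous codon with the highest reporter level, provided it
    exceeds the first codon's level by at least `min_percent_higher` percent.
    Returns None when no synonymous codon clears the margin."""
    baseline = reporter_levels[first_codon]
    threshold = baseline * (1.0 + min_percent_higher / 100.0)
    candidates = {
        codon: level
        for codon, level in reporter_levels.items()
        if codon != first_codon and level >= threshold
    }
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


# Hypothetical reporter levels (arbitrary units) for the four alanine codons.
alanine_levels = {"GCT": 125.0, "GCC": 100.0, "GCA": 55.0, "GCG": 104.0}
print(choose_higher_efficiency_codon("GCC", alanine_levels))
print(choose_higher_efficiency_codon("GCT", alanine_levels))
```

A construct for lower expression would use the mirror-image test, keeping candidates whose level is no more than a chosen percentage of the first codon's level.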

[0062] A further aspect of the present invention provides methods of constructing a synthetic polynucleotide from which an encoded polypeptide is produced at a lower level in a cell of interest than from a parent polynucleotide that encodes the same polypeptide. These methods generally comprise: [0063] determining the translational efficiency of different synonymous codons in cells of the same type as the cell of interest, as broadly described above, to thereby determine a comparison of translational efficiencies of individual synonymous codons in the cell of interest; [0064] selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a lower translational efficiency than the first codon in the cell of interest according to the comparison of translational efficiencies; and [0065] replacing the first codon with the synonymous codon to construct the synthetic polynucleotide.

[0066] In some embodiments, the synonymous codon is selected on the basis that it corresponds to an interrogating codon in a synthetic construct from which the reporter protein is expressed in the cell of interest at a level that is no more than 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of the level of the reporter protein expressed from a synthetic construct that comprises the first codon as the interrogating codon.

[0067] Still another aspect of the present invention provides methods of constructing a synthetic polynucleotide from which an encoded polypeptide is produced at a higher level in a first cell than in a second cell. These methods generally comprise: [0068] determining the translational efficiency of different synonymous codons in cells of the same type as the first cell and in cells of the same type as the second cell, as broadly described above, to thereby determine a comparison of translational efficiencies of individual synonymous codons between the first cell and the second cell; [0069] selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a higher translational efficiency in the first cell than in the second cell according to the comparison of translational efficiencies; and [0070] replacing the first codon with the synonymous codon to construct the synthetic polynucleotide.

[0071] In some embodiments, the synonymous codon is the same as the interrogating codon in a synthetic construct from which the reporter protein is expressed in the first cell at a level that is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher or at least about 2, 3, 4, 5, 6, 7, 8, 9, or 10 times higher than the level of the reporter protein expressed from the same synthetic construct in the second cell.
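For the two-cell comparison, the same reporter construct is measured in both cell types and codons are kept when expression in the first cell exceeds that in the second by a chosen margin. The following sketch assumes one reporter level per codon per cell type; the cell data, codon choices and function name are hypothetical.

```python
def codons_favoured_in_first_cell(levels_first: dict[str, float],
                                  levels_second: dict[str, float],
                                  min_percent_higher: float = 10.0) -> dict[str, float]:
    """Return codons whose reporter level in the first cell is at least
    `min_percent_higher` percent above the level in the second cell,
    mapped to the observed percentage difference."""
    favoured = {}
    for codon in levels_first.keys() & levels_second.keys():
        second = levels_second[codon]
        if second <= 0:
            continue  # percentage difference is undefined; skip in this sketch
        percent_higher = 100.0 * (levels_first[codon] - second) / second
        if percent_higher >= min_percent_higher:
            favoured[codon] = percent_higher
    return favoured


# Hypothetical reporter levels (arbitrary units) for three leucine codons.
first_cell = {"CTG": 200.0, "CTC": 105.0, "TTA": 40.0}
second_cell = {"CTG": 100.0, "CTC": 100.0, "TTA": 80.0}
favoured = codons_favoured_in_first_cell(first_cell, second_cell)
print(favoured)
```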

[0072] In yet another aspect, the present invention provides methods of constructing a synthetic polynucleotide from which a polypeptide is producible to confer a selected phenotype upon an organism of interest or part thereof in a different quality than that conferred by a parent polynucleotide that encodes the same polypeptide. These methods generally comprise: [0073] determining the preference of different synonymous codons for producing the selected phenotype ("the phenotypic preference") in test organisms or parts thereof, as broadly described above, wherein the test organisms are selected from the group consisting of an organism of the same species as the organism of interest and an organism that is related to the organism of interest, to thereby determine a comparison of phenotypic preferences of individual synonymous codons in the organism of interest; [0074] selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a different phenotypic preference than the first codon in the comparison of phenotypic preferences in the organism or part thereof; and [0075] replacing the first codon with the synonymous codon to construct the synthetic polynucleotide.

[0076] In some embodiments, the synthetic polynucleotide confers the selected phenotype upon the organism of interest or part thereof in a higher quality than that conferred by the parent polynucleotide. In illustrative examples of this type, the synonymous codon is selected on the basis that it corresponds to an interrogating codon in a synthetic construct that confers the selected phenotype in the organism of interest or part thereof in a quality that is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher or at least about 2, 3, 4, 5, 6, 7, 8, 9, or 10 times higher than the quality of the phenotype conferred by the synthetic construct comprising the first codon as the interrogating codon.

[0077] In other embodiments, the synthetic polynucleotide confers the selected phenotype upon the organism of interest or part thereof in a lower quality than that conferred by the parent polynucleotide. In illustrative examples of this type, the synonymous codon is selected on the basis that it corresponds to an interrogating codon in a synthetic construct that confers the selected phenotype in the organism of interest or part thereof in a quality that is no more than 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of the quality of the phenotype conferred by the synthetic construct comprising the first codon as the interrogating codon.

[0078] The construct system of the present invention has been used to determine a ranking of individual synonymous codons according to their preference for producing an immune response, including a humoral immune response, to an antigen in a mammal. Significantly, this ranking is not coterminous with a ranking of codon frequency values derivable from an analysis of the frequency with which codons are used to encode their corresponding amino acids across a collection of highly expressed mammalian protein-encoding genes, as for example disclosed by Seed (supra). Nor is it coterminous with a ranking of translational efficiency values obtained from an analysis of the translational efficiencies of codons in specific cell types, as disclosed for example in WO 99/02694 for COS-1 cells and epithelial cells and in WO 2004/024915 for CHO cells. As a result, the present invention enables for the first time the construction of antigen-encoding polynucleotides, which are codon-optimized for efficient production of immune responses, including humoral immune responses, in a mammal.

[0079] Accordingly, in yet another aspect, methods are provided for constructing a synthetic polynucleotide from which a polypeptide is producible to confer an immune response to a target antigen in a mammal in a different quality than that conferred by a parent polynucleotide that encodes the same polypeptide, wherein the polypeptide corresponds to at least a portion of the target antigen. These methods generally comprise: (a) selecting a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a different preference for conferring an immune response ("an immune response preference") than the first codon in a comparison of immune response preferences; and (b) replacing the first codon with the synonymous codon to construct the synthetic polynucleotide, wherein the comparison of immune response preferences of the codons is represented by TABLE 1:

TABLE 1
Amino Acid	Ranking of Immune Response Preferences for Synonymous Codons
Ala	Ala.sup.GCT > Ala.sup.GCC > (Ala.sup.GCA, Ala.sup.GCG)
Arg	(Arg.sup.CGA, Arg.sup.CGC, Arg.sup.CGT, Arg.sup.AGA) > (Arg.sup.AGG, Arg.sup.CGG)
Asn	Asn.sup.AAC > Asn.sup.AAT
Asp	Asp.sup.GAC > Asp.sup.GAT
Cys	Cys.sup.TGC > Cys.sup.TGT
Glu	Glu.sup.GAA > Glu.sup.GAG
Gln	Gln.sup.CAA = Gln.sup.CAG
Gly	Gly.sup.GGA > (Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC)
His	His.sup.CAC = His.sup.CAT
Ile	Ile.sup.ATC >> Ile.sup.ATT > Ile.sup.ATA
Leu	(Leu.sup.CTG, Leu.sup.CTC) > (Leu.sup.CTA, Leu.sup.CTT) >> Leu.sup.TTG > Leu.sup.TTA
Lys	Lys.sup.AAG = Lys.sup.AAA
Phe	Phe.sup.TTT > Phe.sup.TTC
Pro	Pro.sup.CCC > Pro.sup.CCT >> (Pro.sup.CCA, Pro.sup.CCG)
Ser	Ser.sup.TCG >> (Ser.sup.TCT, Ser.sup.TCA, Ser.sup.TCC) >> (Ser.sup.AGC, Ser.sup.AGT)
Thr	Thr.sup.ACG > Thr.sup.ACC >> Thr.sup.ACA > Thr.sup.ACT
Tyr	Tyr.sup.TAC > Tyr.sup.TAT
Val	(Val.sup.GTG, Val.sup.GTC) > Val.sup.GTT > Val.sup.GTA

[0080] Thus, a stronger or enhanced immune response to the target antigen (e.g., an immune response that is at least about 110%, 150%, 200%, 300%, 400%, 500%, 600%, 700%, 800%, 900%, 1000% and all integer percentages in between, of that produced from the parent polynucleotide under identical conditions) can be achieved by selecting a synonymous codon that has a higher immune response preference than the first codon it replaces. In specific embodiments, the synonymous codon is selected such that it has a higher immune response preference that is at least about 10% (and at least about 11% to at least about 1000% and all integer percentages in between) higher than the immune response preference of the codon it replaces. In illustrative examples of this type, the first and synonymous codons are selected from TABLE 2:

TABLE 2
First Codon	Synonymous Codon
Ala.sup.GCG	Ala.sup.GCT
Ala.sup.GCG	Ala.sup.GCC
Ala.sup.GCA	Ala.sup.GCT
Ala.sup.GCA	Ala.sup.GCC
Ala.sup.GCC	Ala.sup.GCT
Arg.sup.CGG	Arg.sup.CGA
Arg.sup.CGG	Arg.sup.CGC
Arg.sup.CGG	Arg.sup.CGT
Arg.sup.CGG	Arg.sup.AGA
Arg.sup.AGG	Arg.sup.CGA
Arg.sup.AGG	Arg.sup.CGC
Arg.sup.AGG	Arg.sup.CGT
Arg.sup.AGG	Arg.sup.AGA
Asn.sup.AAT	Asn.sup.AAC
Asp.sup.GAT	Asp.sup.GAC
Cys.sup.TGT	Cys.sup.TGC
Glu.sup.GAG	Glu.sup.GAA
Gly.sup.GGC	Gly.sup.GGA
Gly.sup.GGT	Gly.sup.GGA
Gly.sup.GGG	Gly.sup.GGA
Ile.sup.ATA	Ile.sup.ATC
Ile.sup.ATA	Ile.sup.ATT
Ile.sup.ATT	Ile.sup.ATC
Leu.sup.TTA	Leu.sup.CTG
Leu.sup.TTA	Leu.sup.CTC
Leu.sup.TTA	Leu.sup.CTA
Leu.sup.TTA	Leu.sup.CTT
Leu.sup.TTA	Leu.sup.TTG
Leu.sup.TTG	Leu.sup.CTG
Leu.sup.TTG	Leu.sup.CTC
Leu.sup.TTG	Leu.sup.CTA
Leu.sup.TTG	Leu.sup.CTT
Leu.sup.CTT	Leu.sup.CTG
Leu.sup.CTT	Leu.sup.CTC
Leu.sup.CTA	Leu.sup.CTG
Leu.sup.CTA	Leu.sup.CTC
Phe.sup.TTC	Phe.sup.TTT
Pro.sup.CCG	Pro.sup.CCC
Pro.sup.CCG	Pro.sup.CCT
Pro.sup.CCA	Pro.sup.CCC
Pro.sup.CCA	Pro.sup.CCT
Pro.sup.CCT	Pro.sup.CCC
Ser.sup.AGT	Ser.sup.TCG
Ser.sup.AGT	Ser.sup.TCT
Ser.sup.AGT	Ser.sup.TCA
Ser.sup.AGT	Ser.sup.TCC
Ser.sup.AGC	Ser.sup.TCG
Ser.sup.AGC	Ser.sup.TCT
Ser.sup.AGC	Ser.sup.TCA
Ser.sup.AGC	Ser.sup.TCC
Ser.sup.TCC	Ser.sup.TCG
Ser.sup.TCA	Ser.sup.TCG
Ser.sup.TCT	Ser.sup.TCG
Thr.sup.ACT	Thr.sup.ACG
Thr.sup.ACT	Thr.sup.ACC
Thr.sup.ACT	Thr.sup.ACA
Thr.sup.ACA	Thr.sup.ACG
Thr.sup.ACA	Thr.sup.ACC
Thr.sup.ACC	Thr.sup.ACG
Tyr.sup.TAT	Tyr.sup.TAC
Val.sup.GTA	Val.sup.GTG
Val.sup.GTA	Val.sup.GTC
Val.sup.GTA	Val.sup.GTT
Val.sup.GTT	Val.sup.GTG
Val.sup.GTT	Val.sup.GTC
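The rankings of TABLE 1 can be transcribed as ordered tiers, from which every pairing of a first codon with a higher-ranked synonymous codon can be enumerated. The sketch below is illustrative: the data structure and function name are hypothetical, and the distinction between ">" and ">>" is deliberately flattened (both simply start a new tier), while codons joined by "=" or grouped in parentheses share a tier. Enumerated this way, the rankings yield 65 pairs, which appears to correspond to the pairs listed in TABLE 2.

```python
# TABLE 1 rankings as ordered tiers, highest immune response preference first.
# Codons in the same tuple share a tier (parenthesised or "=" in the table).
TABLE_1 = {
    "Ala": [("GCT",), ("GCC",), ("GCA", "GCG")],
    "Arg": [("CGA", "CGC", "CGT", "AGA"), ("AGG", "CGG")],
    "Asn": [("AAC",), ("AAT",)],
    "Asp": [("GAC",), ("GAT",)],
    "Cys": [("TGC",), ("TGT",)],
    "Glu": [("GAA",), ("GAG",)],
    "Gln": [("CAA", "CAG")],
    "Gly": [("GGA",), ("GGG", "GGT", "GGC")],
    "His": [("CAC", "CAT")],
    "Ile": [("ATC",), ("ATT",), ("ATA",)],
    "Leu": [("CTG", "CTC"), ("CTA", "CTT"), ("TTG",), ("TTA",)],
    "Lys": [("AAG", "AAA")],
    "Phe": [("TTT",), ("TTC",)],
    "Pro": [("CCC",), ("CCT",), ("CCA", "CCG")],
    "Ser": [("TCG",), ("TCT", "TCA", "TCC"), ("AGC", "AGT")],
    "Thr": [("ACG",), ("ACC",), ("ACA",), ("ACT",)],
    "Tyr": [("TAC",), ("TAT",)],
    "Val": [("GTG", "GTC"), ("GTT",), ("GTA",)],
}


def higher_preference_pairs(table: dict) -> set[tuple[str, str]]:
    """Return all (first_codon, synonymous_codon) pairs in which the
    synonymous codon sits in a strictly higher tier than the first codon."""
    pairs = set()
    for tiers in table.values():
        for low_index, low_tier in enumerate(tiers):
            for high_tier in tiers[:low_index]:
                for first_codon in low_tier:
                    for synonymous_codon in high_tier:
                        pairs.add((first_codon, synonymous_codon))
    return pairs


pairs = higher_preference_pairs(TABLE_1)
print(len(pairs))
print(("GCG", "GCT") in pairs, ("CAA", "CAG") in pairs)
```

Amino acids whose codons are ranked as equal (Gln, His, Lys) contribute no pairs, which is consistent with their absence from TABLE 2.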

[0081] In other illustrative examples of this type, the first and synonymous codons are selected from TABLE 3:

TABLE 3
First Codon	Synonymous Codon
Ala.sup.GCG	Ala.sup.GCT
Ala.sup.GCA	Ala.sup.GCT
Ala.sup.GCC	Ala.sup.GCT
Arg.sup.CGG	Arg.sup.CGA
Arg.sup.CGG	Arg.sup.CGT
Arg.sup.CGG	Arg.sup.AGA
Arg.sup.AGG	Arg.sup.CGA
Arg.sup.AGG	Arg.sup.CGT
Arg.sup.AGG	Arg.sup.AGA
Glu.sup.GAG	Glu.sup.GAA
Gly.sup.GGC	Gly.sup.GGA
Gly.sup.GGT	Gly.sup.GGA
Gly.sup.GGG	Gly.sup.GGA
Leu.sup.TTA	Leu.sup.CTA
Leu.sup.TTA	Leu.sup.CTT
Leu.sup.TTA	Leu.sup.TTG
Leu.sup.TTG	Leu.sup.CTA
Leu.sup.TTG	Leu.sup.CTT
Phe.sup.TTC	Phe.sup.TTT
Pro.sup.CCG	Pro.sup.CCT
Pro.sup.CCA	Pro.sup.CCT
Ser.sup.AGT	Ser.sup.TCG
Ser.sup.AGT	Ser.sup.TCT
Ser.sup.AGT	Ser.sup.TCA
Ser.sup.AGC	Ser.sup.TCG
Ser.sup.AGC	Ser.sup.TCT
Ser.sup.AGC	Ser.sup.TCA
Ser.sup.AGC	Ser.sup.TCC
Ser.sup.TCC	Ser.sup.TCG
Ser.sup.TCA	Ser.sup.TCG
Ser.sup.TCT	Ser.sup.TCG
Thr.sup.ACT	Thr.sup.ACG
Thr.sup.ACT	Thr.sup.ACA
Thr.sup.ACA	Thr.sup.ACG
Thr.sup.ACC	Thr.sup.ACG
Val.sup.GTA	Val.sup.GTT
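The codon replacement described above can be sketched in Python. The mapping copies only a few rows of TABLE 3 for brevity. The function names `split_codons` and `enhance`, the rule of taking the first listed synonymous codon, and the example parent sequence are illustrative assumptions rather than part of the disclosed method.

```python
# Excerpt of TABLE 3: first codon -> synonymous codons with a higher
# immune response preference. Only a few rows are copied here.
TABLE_3_EXCERPT = {
    "GCG": ["GCT"],                # Ala
    "CGG": ["CGA", "CGT", "AGA"],  # Arg
    "GAG": ["GAA"],                # Glu
    "GGC": ["GGA"],                # Gly
    "TTC": ["TTT"],                # Phe
}


def split_codons(seq):
    """Split an in-frame coding sequence into codon triplets."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def enhance(seq, table=TABLE_3_EXCERPT):
    """Replace each codon found in the table with a listed synonymous codon.

    Assumption: where several synonymous codons are listed, the first one
    is taken; this excerpt does not rank the alternatives.
    """
    out = []
    for codon in split_codons(seq.upper()):
        options = table.get(codon)
        out.append(options[0] if options else codon)
    return "".join(out)


# Hypothetical parent sequence encoding Met-Ala-Arg-Glu-Phe.
parent = "ATGGCGCGGGAGTTC"
synthetic = enhance(parent)
print(synthetic)  # ATGGCTCGAGAATTT
```

Codons absent from the mapping (here ATG) pass through unchanged, so the encoded amino acid sequence is preserved.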

[0082] Suitably, in some of the illustrative examples noted above, the method further comprises: (a) selecting a second codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a higher immune response preference than the second codon in a comparison of immune response preferences; and (b) replacing the second codon with the synonymous codon, wherein the comparison of immune response preferences of the codons is represented by TABLE 4:

TABLE 4
Second Codon	Synonymous Codon
Ala.sup.GCG	Ala.sup.GCT
Ala.sup.GCG	Ala.sup.GCC
Ala.sup.GCA	Ala.sup.GCT
Ala.sup.GCA	Ala.sup.GCC
Ala.sup.GCC	Ala.sup.GCT
Arg.sup.CGG	Arg.sup.CGA
Arg.sup.CGG	Arg.sup.CGC
Arg.sup.CGG	Arg.sup.CGT
Arg.sup.CGG	Arg.sup.AGA
Arg.sup.AGG	Arg.sup.CGA
Arg.sup.AGG	Arg.sup.CGC
Arg.sup.AGG	Arg.sup.CGT
Arg.sup.AGG	Arg.sup.AGA
Asn.sup.AAT	Asn.sup.AAC
Asp.sup.GAT	Asp.sup.GAC
Cys.sup.TGT	Cys.sup.TGC
Glu.sup.GAG	Glu.sup.GAA
Gly.sup.GGC	Gly.sup.GGA
Gly.sup.GGT	Gly.sup.GGA
Gly.sup.GGG	Gly.sup.GGA
Ile.sup.ATA	Ile.sup.ATC
Ile.sup.ATA	Ile.sup.ATT
Ile.sup.ATT	Ile.sup.ATC
Leu.sup.TTA	Leu.sup.CTG
Leu.sup.TTA	Leu.sup.CTC
Leu.sup.TTA	Leu.sup.CTA
Leu.sup.TTA	Leu.sup.CTT
Leu.sup.TTA	Leu.sup.TTG
Leu.sup.TTG	Leu.sup.CTG
Leu.sup.TTG	Leu.sup.CTC
Leu.sup.TTG	Leu.sup.CTA
Leu.sup.TTG	Leu.sup.CTT
Leu.sup.CTT	Leu.sup.CTG
Leu.sup.CTT	Leu.sup.CTC
Leu.sup.CTA	Leu.sup.CTG
Leu.sup.CTA	Leu.sup.CTC
Phe.sup.TTC	Phe.sup.TTT
Pro.sup.CCG	Pro.sup.CCC
Pro.sup.CCG	Pro.sup.CCT
Pro.sup.CCA	Pro.sup.CCC
Pro.sup.CCA	Pro.sup.CCT
Pro.sup.CCT	Pro.sup.CCC
Ser.sup.AGT	Ser.sup.TCG
Ser.sup.AGT	Ser.sup.TCT
Ser.sup.AGT	Ser.sup.TCA
Ser.sup.AGT	Ser.sup.TCC
Ser.sup.AGC	Ser.sup.TCG
Ser.sup.AGC	Ser.sup.TCT
Ser.sup.AGC	Ser.sup.TCA
Ser.sup.AGC	Ser.sup.TCC
Ser.sup.TCC	Ser.sup.TCG
Ser.sup.TCA	Ser.sup.TCG
Ser.sup.TCT	Ser.sup.TCG
Thr.sup.ACT	Thr.sup.ACG
Thr.sup.ACT	Thr.sup.ACC
Thr.sup.ACT	Thr.sup.ACA
Thr.sup.ACA	Thr.sup.ACG
Thr.sup.ACA	Thr.sup.ACC
Thr.sup.ACC	Thr.sup.ACG
Tyr.sup.TAT	Tyr.sup.TAC
Val.sup.GTA	Val.sup.GTG
Val.sup.GTA	Val.sup.GTC
Val.sup.GTA	Val.sup.GTT
Val.sup.GTT	Val.sup.GTG
Val.sup.GTT	Val.sup.GTC

[0083] Conversely, a weaker or reduced immune response to the target antigen (e.g., an immune response that is less than about 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 10%, 1% and all integer percentages in between, of that produced from the parent polynucleotide under identical conditions) can be achieved by selecting a synonymous codon that has a lower immune response preference than the first codon it replaces. In specific embodiments of this type, the synonymous codon is selected such that it has an immune response preference that is less than about 90% of the immune response preference of the codon it replaces. In illustrative examples, the first and synonymous codons are selected from TABLE 5:

TABLE 5
First Codon	Synonymous Codon
Ala.sup.GCT	Ala.sup.GCG
Ala.sup.GCT	Ala.sup.GCA
Ala.sup.GCT	Ala.sup.GCC
Ala.sup.GCC	Ala.sup.GCG
Ala.sup.GCC	Ala.sup.GCA
Arg.sup.CGA	Arg.sup.AGG
Arg.sup.CGA	Arg.sup.CGG
Arg.sup.CGC	Arg.sup.AGG
Arg.sup.CGC	Arg.sup.CGG
Arg.sup.CGT	Arg.sup.AGG
Arg.sup.CGT	Arg.sup.CGG
Arg.sup.AGA	Arg.sup.AGG
Arg.sup.AGA	Arg.sup.CGG
Asn.sup.AAC	Asn.sup.AAT
Asp.sup.GAC	Asp.sup.GAT
Cys.sup.TGC	Cys.sup.TGT
Glu.sup.GAA	Glu.sup.GAG
Gly.sup.GGA	Gly.sup.GGC
Gly.sup.GGA	Gly.sup.GGT
Gly.sup.GGA	Gly.sup.GGG
Ile.sup.ATC	Ile.sup.ATA
Ile.sup.ATC	Ile.sup.ATT
Ile.sup.ATT	Ile.sup.ATA
Leu.sup.CTG	Leu.sup.CTA
Leu.sup.CTG	Leu.sup.CTT
Leu.sup.CTG	Leu.sup.TTG
Leu.sup.CTG	Leu.sup.TTA
Leu.sup.CTC	Leu.sup.CTA
Leu.sup.CTC	Leu.sup.CTT
Leu.sup.CTC	Leu.sup.TTG
Leu.sup.CTC	Leu.sup.TTA
Leu.sup.CTA	Leu.sup.TTG
Leu.sup.CTA	Leu.sup.TTA
Leu.sup.CTT	Leu.sup.TTG
Leu.sup.CTT	Leu.sup.TTA
Leu.sup.TTG	Leu.sup.TTA
Phe.sup.TTT	Phe.sup.TTC
Pro.sup.CCC	Pro.sup.CCT
Pro.sup.CCC	Pro.sup.CCA
Pro.sup.CCC	Pro.sup.CCG
Pro.sup.CCT	Pro.sup.CCA
Pro.sup.CCT	Pro.sup.CCG
Ser.sup.TCG	Ser.sup.TCT
Ser.sup.TCG	Ser.sup.TCA
Ser.sup.TCG	Ser.sup.TCC
Ser.sup.TCG	Ser.sup.AGC
Ser.sup.TCG	Ser.sup.AGT
Ser.sup.TCT	Ser.sup.AGC
Ser.sup.TCT	Ser.sup.AGT
Ser.sup.TCA	Ser.sup.AGC
Ser.sup.TCA	Ser.sup.AGT
Ser.sup.TCC	Ser.sup.AGC
Ser.sup.TCC	Ser.sup.AGT
Thr.sup.ACG	Thr.sup.ACC
Thr.sup.ACG	Thr.sup.ACA
Thr.sup.ACG	Thr.sup.ACT
Thr.sup.ACC	Thr.sup.ACA
Thr.sup.ACC	Thr.sup.ACT
Thr.sup.ACA	Thr.sup.ACT
Tyr.sup.TAC	Tyr.sup.TAT
Val.sup.GTG	Val.sup.GTT
Val.sup.GTG	Val.sup.GTA
Val.sup.GTC	Val.sup.GTT
Val.sup.GTC	Val.sup.GTA
Val.sup.GTT	Val.sup.GTA

[0084] In other illustrative examples, the first and synonymous codons are selected from TABLE 6:

TABLE 6
First Codon	Synonymous Codon
Ala.sup.GCT	Ala.sup.GCG
Ala.sup.GCT	Ala.sup.GCA
Ala.sup.GCT	Ala.sup.GCC
Arg.sup.CGA	Arg.sup.AGG
Arg.sup.CGA	Arg.sup.CGG
Arg.sup.CGT	Arg.sup.AGG
Arg.sup.CGT	Arg.sup.CGG
Arg.sup.AGA	Arg.sup.AGG
Arg.sup.AGA	Arg.sup.CGG
Glu.sup.GAA	Glu.sup.GAG
Gly.sup.GGA	Gly.sup.GGC
Gly.sup.GGA	Gly.sup.GGT
Gly.sup.GGA	Gly.sup.GGG
Leu.sup.CTA	Leu.sup.TTG
Leu.sup.CTA	Leu.sup.TTA
Leu.sup.CTT	Leu.sup.TTG
Leu.sup.CTT	Leu.sup.TTA
Leu.sup.TTG	Leu.sup.TTA
Phe.sup.TTT	Phe.sup.TTC
Pro.sup.CCT	Pro.sup.CCA
Pro.sup.CCT	Pro.sup.CCG
Ser.sup.TCG	Ser.sup.TCT
Ser.sup.TCG	Ser.sup.TCA
Ser.sup.TCG	Ser.sup.TCC
Ser.sup.TCG	Ser.sup.AGC
Ser.sup.TCG	Ser.sup.AGT
Ser.sup.TCT	Ser.sup.AGC
Ser.sup.TCT	Ser.sup.AGT
Ser.sup.TCA	Ser.sup.AGC
Ser.sup.TCA	Ser.sup.AGT
Ser.sup.TCC	Ser.sup.AGC
Thr.sup.ACG	Thr.sup.ACC
Thr.sup.ACG	Thr.sup.ACA
Thr.sup.ACG	Thr.sup.ACT
Thr.sup.ACA	Thr.sup.ACT
Val.sup.GTT	Val.sup.GTA
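The percentage criteria stated for enhancing (an immune response preference at least about 10% higher) and for reducing (an immune response preference less than about 90% of that of the replaced codon) amount to simple ratio checks. The Python sketch below assumes that each codon has already been assigned a positive numeric immune response preference score. The scores, the placeholder codon labels and the function names are hypothetical and are not values from this specification.

```python
def is_higher_preference(candidate, original, min_increase=0.10):
    """True if the candidate score is at least `min_increase` above the original."""
    return candidate >= original * (1.0 + min_increase)


def is_lower_preference(candidate, original, max_fraction=0.90):
    """True if the candidate score is less than `max_fraction` of the original."""
    return candidate < original * max_fraction


# Hypothetical scores, for illustration only.
scores = {"codon_x": 1.00, "codon_y": 1.25, "codon_z": 0.80}

print(is_higher_preference(scores["codon_y"], scores["codon_x"]))  # True
print(is_lower_preference(scores["codon_z"], scores["codon_x"]))   # True
print(is_higher_preference(scores["codon_z"], scores["codon_x"]))  # False
```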

[0085] Suitably, in some of the illustrative examples noted above, the method further comprises: (a) selecting a second codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a lower immune response preference than the second codon in a comparison of immune response preferences; and (b) replacing the second codon with the synonymous codon, wherein the comparison of immune response preferences of the codons is represented by TABLE 7:

TABLE 7
Second Codon	Synonymous Codon
Ala.sup.GCT	Ala.sup.GCG
Ala.sup.GCT	Ala.sup.GCA
Ala.sup.GCT	Ala.sup.GCC
Ala.sup.GCC	Ala.sup.GCG
Ala.sup.GCC	Ala.sup.GCA
Arg.sup.CGA	Arg.sup.AGG
Arg.sup.CGA	Arg.sup.CGG
Arg.sup.CGC	Arg.sup.AGG
Arg.sup.CGC	Arg.sup.CGG
Arg.sup.CGT	Arg.sup.AGG
Arg.sup.CGT	Arg.sup.CGG
Arg.sup.AGA	Arg.sup.AGG
Arg.sup.AGA	Arg.sup.CGG
Asn.sup.AAC	Asn.sup.AAT
Asp.sup.GAC	Asp.sup.GAT
Cys.sup.TGC	Cys.sup.TGT
Glu.sup.GAA	Glu.sup.GAG
Gly.sup.GGA	Gly.sup.GGC
Gly.sup.GGA	Gly.sup.GGT
Gly.sup.GGA	Gly.sup.GGG
Ile.sup.ATC	Ile.sup.ATA
Ile.sup.ATC	Ile.sup.ATT
Ile.sup.ATT	Ile.sup.ATA
Leu.sup.CTG	Leu.sup.CTA
Leu.sup.CTG	Leu.sup.CTT
Leu.sup.CTG	Leu.sup.TTG
Leu.sup.CTG	Leu.sup.TTA
Leu.sup.CTC	Leu.sup.CTA
Leu.sup.CTC	Leu.sup.CTT
Leu.sup.CTC	Leu.sup.TTG
Leu.sup.CTC	Leu.sup.TTA
Leu.sup.CTA	Leu.sup.TTG
Leu.sup.CTA	Leu.sup.TTA
Leu.sup.CTT	Leu.sup.TTG
Leu.sup.CTT	Leu.sup.TTA
Leu.sup.TTG	Leu.sup.TTA
Phe.sup.TTT	Phe.sup.TTC
Pro.sup.CCC	Pro.sup.CCT
Pro.sup.CCC	Pro.sup.CCA
Pro.sup.CCC	Pro.sup.CCG
Pro.sup.CCT	Pro.sup.CCA
Pro.sup.CCT	Pro.sup.CCG
Ser.sup.TCG	Ser.sup.TCT
Ser.sup.TCG	Ser.sup.TCA
Ser.sup.TCG	Ser.sup.TCC
Ser.sup.TCG	Ser.sup.AGC
Ser.sup.TCG	Ser.sup.AGT
Ser.sup.TCT	Ser.sup.AGC
Ser.sup.TCT	Ser.sup.AGT
Ser.sup.TCA	Ser.sup.AGC
Ser.sup.TCA	Ser.sup.AGT
Ser.sup.TCC	Ser.sup.AGC
Ser.sup.TCC	Ser.sup.AGT
Thr.sup.ACG	Thr.sup.ACC
Thr.sup.ACG	Thr.sup.ACA
Thr.sup.ACG	Thr.sup.ACT
Thr.sup.ACC	Thr.sup.ACA
Thr.sup.ACC	Thr.sup.ACT
Thr.sup.ACA	Thr.sup.ACT
Tyr.sup.TAC	Tyr.sup.TAT
Val.sup.GTG	Val.sup.GTT
Val.sup.GTG	Val.sup.GTA
Val.sup.GTC	Val.sup.GTT
Val.sup.GTC	Val.sup.GTA
Val.sup.GTT	Val.sup.GTA

[0086] In still another aspect, the invention provides a synthetic polynucleotide constructed according to any one of the above methods.

[0087] In accordance with the present invention, synthetic polynucleotides that are constructed by methods described herein are useful for expression in a mammal to elicit an immune response to a target antigen. Accordingly, in yet another aspect, the present invention provides chimeric constructs that comprise a synthetic polynucleotide of the invention, which is operably connected to a regulatory sequence.

[0088] In some embodiments, the chimeric construct is in the form of a pharmaceutical composition that optionally comprises a pharmaceutically acceptable excipient and/or carrier. Accordingly, in another aspect, the invention provides pharmaceutical compositions that are useful for modulating an immune response to a target antigen in a mammal, which response is conferred by the expression of a parent polynucleotide that encodes a polypeptide corresponding to at least a portion of the target antigen. These compositions generally comprise a chimeric construct and a pharmaceutically acceptable excipient and/or carrier, wherein the chimeric construct comprises a synthetic polynucleotide that is operably connected to a regulatory sequence and that is distinguished from the parent polynucleotide by the replacement of a first codon in the parent polynucleotide with a synonymous codon that has a different immune response preference than the first codon and wherein the first and synonymous codons are selected according to any one of TABLES 2, 3, 5 and 6. In some embodiments, the compositions further comprise an adjuvant that enhances the effectiveness of the immune response. In some embodiments, the composition is formulated for transcutaneous or dermal administration, e.g., by biolistic or microneedle delivery or by intradermal injection. Suitably, in embodiments in which a stronger or enhanced immune response to the target antigen is desired, the first and synonymous codons are selected according to TABLES 2 or 3. Conversely, in embodiments in which a weaker or reduced immune response to the target antigen is desired, the first and synonymous codons are selected according to TABLES 5 or 6.

[0089] In yet another aspect, the invention embraces methods of modulating the quality of an immune response to a target antigen in a mammal, which response is conferred by the expression of a parent polynucleotide that encodes a polypeptide corresponding to at least a portion of the target antigen. These methods generally comprise: introducing into the mammal a synthetic polynucleotide that is operably connected to a regulatory sequence and that is distinguished from the parent polynucleotide by the replacement of a first codon in the parent polynucleotide with a synonymous codon that has a different immune response preference than the first codon and wherein the first and synonymous codons are selected according to any one of TABLES 2, 3, 5 and 6. In these methods, expression of the synthetic polynucleotide results in a different quality (e.g., stronger or weaker) of immune response than the one obtained through expression of the parent polynucleotide under the same conditions. Suitably, the chimeric construct is introduced into the mammal by delivering the construct to antigen-presenting cells (e.g., dendritic cells, macrophages, Langerhans cells or their precursors) of the mammal. In some embodiments, the chimeric construct is introduced into the dermis and/or epidermis of the mammal (e.g., by transcutaneous or intradermal administration) and in this regard any suitable administration site is envisaged including the abdomen. Generally, the immune response is selected from a cell-mediated response and a humoral immune response. In specific embodiments, the immune response is a humoral immune response.

[0090] In a related aspect, the invention encompasses methods of enhancing the quality of an immune response to a target antigen in a mammal, which response is conferred by the expression of a parent polynucleotide that encodes a polypeptide corresponding to at least a portion of the target antigen. These methods generally comprise: introducing into the mammal a chimeric construct comprising a synthetic polynucleotide that is operably connected to a regulatory sequence and that is distinguished from the parent polynucleotide by the replacement of a first codon in the parent polynucleotide with a synonymous codon that has a higher immune response preference than the first codon, wherein the first and synonymous codons are selected according to TABLES 2 or 3. In these methods, expression of the synthetic polynucleotide typically results in a stronger or enhanced immune response than the one obtained through expression of the parent polynucleotide under the same conditions.

[0091] In another related aspect, the invention extends to methods of reducing the quality of an immune response to a target antigen in a mammal, which response is conferred by the expression of a parent polynucleotide that encodes a polypeptide corresponding to at least a portion of the target antigen. These methods generally comprise: introducing into the mammal a chimeric construct comprising a synthetic polynucleotide that is operably connected to a regulatory sequence and that is distinguished from the parent polynucleotide by the replacement of a first codon in the parent polynucleotide with a synonymous codon that has a lower immune response preference than the first codon, wherein the first and synonymous codons are selected according to TABLES 5 or 6. In these methods, expression of the synthetic polynucleotide typically results in a weaker or reduced immune response than the one obtained through expression of the parent polynucleotide under the same conditions.

[0092] Yet a further aspect of the present invention embraces methods of enhancing the quality of an immune response to a target antigen in a mammal, which response is conferred by the expression of a first polynucleotide that encodes a polypeptide corresponding to at least a portion of the target antigen. These methods generally comprise: co-introducing into the mammal a first nucleic acid construct comprising the first polynucleotide in operable connection with a regulatory sequence; and a second nucleic acid construct comprising a second polynucleotide that is operably connected to a regulatory sequence and that encodes an iso-tRNA corresponding to a codon of the first polynucleotide, wherein the codon has a low or intermediate immune response preference and is selected from the group consisting of Ala.sup.GCA, Ala.sup.GCG, Ala.sup.GCC, Arg.sup.AGG, Arg.sup.CGG, Asn.sup.AAT, Asp.sup.GAT, Cys.sup.TGT, Glu.sup.GAG, Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC, Ile.sup.ATA, Ile.sup.ATT, Leu.sup.TTG, Leu.sup.TTA, Leu.sup.CTA, Leu.sup.CTT, Phe.sup.TTC, Pro.sup.CCA, Pro.sup.CCG, Pro.sup.CCT, Ser.sup.AGC, Ser.sup.AGT, Ser.sup.TCT, Ser.sup.TCA, Ser.sup.TCC, Thr.sup.ACA, Thr.sup.ACT, Tyr.sup.TAT, Val.sup.GTA and Val.sup.GTT. In specific embodiments, the codon has a `low` immune response preference, and is selected from the group consisting of Ala.sup.GCA, Ala.sup.GCG, Arg.sup.CGG, Asn.sup.AAT, Asp.sup.GAT, Cys.sup.TGT, Glu.sup.GAG, Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC, Ile.sup.ATA, Leu.sup.TTG, Leu.sup.TTA, Phe.sup.TTC, Pro.sup.CCA, Pro.sup.CCG, Ser.sup.AGC, Ser.sup.AGT, Thr.sup.ACT, Tyr.sup.TAT and Val.sup.GTA.
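As an illustration of how candidate codons for iso-tRNA supplementation might be identified, the following Python sketch counts in-frame occurrences of the 'low' immune response preference codons listed above in a first polynucleotide. The codon set is taken from the preceding paragraph. The function name `low_preference_usage`, the example sequence, and the idea of ranking codons by frequency are assumptions made for illustration only.

```python
from collections import Counter

# Codons listed in the text as having a 'low' immune response preference.
LOW_PREFERENCE_CODONS = {
    "GCA", "GCG",         # Ala
    "CGG",                # Arg
    "AAT",                # Asn
    "GAT",                # Asp
    "TGT",                # Cys
    "GAG",                # Glu
    "GGG", "GGT", "GGC",  # Gly
    "ATA",                # Ile
    "TTG", "TTA",         # Leu
    "TTC",                # Phe
    "CCA", "CCG",         # Pro
    "AGC", "AGT",         # Ser
    "ACT",                # Thr
    "TAT",                # Tyr
    "GTA",                # Val
}


def low_preference_usage(seq):
    """Count in-frame occurrences of each 'low' preference codon."""
    seq = seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return Counter(c for c in codons if c in LOW_PREFERENCE_CODONS)


# Hypothetical first polynucleotide encoding Met-Glu-Gly-Glu-Leu-Lys.
first = "ATGGAGGGGGAGTTAAAA"
usage = low_preference_usage(first)
print(usage.most_common())  # [('GAG', 2), ('GGG', 1), ('TTA', 1)]
```

Under these assumptions, the most frequent entries indicate which codons of the first polynucleotide could be matched by an iso-tRNA encoded on the second construct.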

BRIEF DESCRIPTION OF THE DRAWINGS

[0093] FIG. 1 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted ALA E7 constructs and controls (IgkC1, IgkS1-1, IgkS1-2, IgkS1-3, IgkS1-4 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0094] FIG. 2 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted ARG E7 constructs and controls (IgkS1-5, IgkS1-6, IgkS1-7, IgkS1-8, IgkS1-9, IgkS1-10, IgkC1 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0095] FIG. 3 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted ASN and LYS E7 constructs and controls (IgkS1, IgkS1-12, IgkS1-31 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0096] FIG. 4 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted ASP E7 constructs and controls (IgkC1, IgkS1-13, IgkS1-14 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0097] FIG. 5 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted CYS E7 constructs and controls (IgkC1, IgkS1-15, IgkS1-16 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0098] FIG. 6 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted GLU E7 constructs and controls (IgkS1-17, IgkS1-18, IgkC2 and IgkC1) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0099] FIG. 7 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted GLN E7 constructs and controls (IgkC1, IgkS1-19, IgkS1-20 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0100] FIG. 8 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted GLY E7 constructs and controls (IgkC1, IgkS1-21, IgkS1-22, IgkS1-23, IgkS1-24 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0101] FIG. 9 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted HIS E7 constructs and controls (IgkC1, IgkS1-25, IgkS1-26 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0102] FIG. 10 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted ILE E7 constructs and controls (IgkC1, IgkS1-27, IgkS1-28, IgkS1-29 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0103] FIG. 11 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted LEU E7 constructs and controls (IgkS1-50, IgkS1-51, IgkS1-52, IgkS1-53, IgkS1-54, IgkS1-55, IgkC3 and IgkC4) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3. The LEU E7 constructs are oncogenic (i.e., encode wild-type E7 protein).

[0104] FIG. 12 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted PHE E7 constructs and controls (IgkS1-32, IgkS1-33, IgkC1 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3. The two LEU residues were mutated to PHE in this sequence so that there are three PHE residues instead of one.

[0105] FIG. 13 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted PRO E7 constructs and controls (IgkS1-56, IgkS1-57, IgkS1-58, IgkS1-59, IgkC3 and IgkC4) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3. The PRO E7 constructs are oncogenic (i.e., encode wild-type E7 protein).

[0106] FIG. 14 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted SER E7 constructs and controls (IgkS1-34, IgkS1-35, IgkS1-36, IgkS1-37, IgkS1-38, IgkS1-39, IgkC1 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0107] FIG. 15 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted THR E7 constructs and controls (IgkC1, IgkS1-40, IgkS1-41, IgkS1-42, IgkS1-43 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0108] FIG. 16 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted TYR E7 constructs and controls (IgkC1, IgkS1-44, IgkS1-45 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0109] FIG. 17 is a diagrammatic representation depicting a nucleotide sequence alignment of secreted VAL E7 constructs and controls (IgkC1, IgkS1-46, IgkS1-47, IgkS1-48, IgkS1-49 and IgkC2) as further defined in Example 1 and Table 12. The sequences are ligated into the KpnI and EcoRI sites of pcDNA3.

[0110] FIG. 18 is a graphical representation showing the response to gene gun immunization with optimized and de-optimized E7 constructs measured by (a) ELISA, (b) Memory B cell ELISPOT, and (c) IFN-γ ELISPOT. For part (a) eight mice were immunized per group (4 immunizations, 3 weeks apart) and the sera taken three weeks after the final immunization; (left) E7 protein ELISA, (right) E7 peptide 101 ELISA. Wells were done in duplicate. For parts (b) and (c) mice were immunized twice, three weeks apart and the spleens collected three weeks after the second immunization. The spleens were pooled prior to analysis. The Memory B cell and IFN-γ ELISPOTs were conducted twice and three times, respectively, and the wells done in triplicate. Three mice were used per group per repeat. The results shown in parts (b) and (c) are from individual experiments and are representative of the complete data sets. The particular ELISPOT experimental data included here were gathered together with the corresponding data in FIG. 20 and therefore may be directly compared. Unpaired two-tailed t-tests were used to compare the modified constructs to wild-type. ***P<0.001, **0.001<P<0.01, *0.01<P<0.05, ns=not significant (P>0.05). In (a) O1-O3 were not significantly different from MC as measured by unpaired two-tailed t-tests. wt=wild-type codon usage E7; O1-O3=codon-optimized E7 constructs 1 to 3; W=codon de-optimized E7; MC=mammalian consensus codon usage E7.

[0111] FIG. 19 is a graphical representation showing the response to immunization by intradermal injection with optimized and de-optimized constructs measured by (a) ELISA, (b) Memory B cell ELISPOT, and (c) IFN-γ ELISPOT. For part (a) eight mice were immunized per group (4 immunizations, 3 weeks apart) and the sera taken three weeks after the final immunization; (left) E7 protein ELISA, (right) E7 peptide 101 ELISA. Wells were done in duplicate. For parts (b) and (c) mice were immunized twice, three weeks apart and the spleens collected three weeks after the second immunization. The spleens were pooled prior to analysis. The Memory B cell and IFN-γ ELISPOTs were conducted twice and three times, respectively, and the wells done in triplicate. Three mice were used per group per repeat. The results shown in parts (b) and (c) are from individual experiments and are representative of the complete data sets. The particular ELISPOT experimental data included here were gathered together with the corresponding data in FIG. 20 and therefore may be directly compared. Unpaired two-tailed t-tests were used to compare the modified constructs to wild-type. ***P<0.001, **0.001≤P<0.01, *0.01≤P≤0.05, ns=not significant (P>0.05). In (a) O1-O3 were not significantly different from MC as measured by unpaired two-tailed t-tests. wt=wild-type codon usage E7; O1-O3=codon-optimized E7 constructs 1 to 3; W=codon de-optimized E7; MC=mammalian consensus codon usage E7.
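The star notation used in the figure legends can be expressed as a small lookup. The Python sketch below follows the boundaries given in the FIG. 19 legend (P<0.001, 0.001 to below 0.01, 0.01 to 0.05 inclusive, and above 0.05). The function name `significance_label` and the example P values are hypothetical and are not taken from the reported data.

```python
def significance_label(p):
    """Map a P value to the star notation used in the figure legends."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("P value must lie between 0 and 1")
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


# Hypothetical P values, for illustration only.
for p in (0.0004, 0.005, 0.03, 0.2):
    print(p, significance_label(p))
```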

[0112] FIG. 20 is a graphical representation showing the results of an ELISA that measures binding of serum from mice immunized with various gD2 constructs by intradermal injection (white bars) or gene gun immunization (black bars), to C-terminally His-tagged gD2tr. Note that the His-tagged gD2tr protein was used in an unpurified state (in CHO cell supernatant) and that background readings of non-specific binding to control supernatant have been subtracted from the results.

BRIEF DESCRIPTION OF THE SEQUENCES

TABLE 8
SEQUENCE ID NUMBER	SEQUENCE	LENGTH
SEQ ID NO: 1	IgkS2-13 Asp GAT construct nucleotide sequence	387 nts
SEQ ID NO: 2	IgkS2-14 Asp GAC construct nucleotide sequence	387 nts
SEQ ID NO: 3	IgkS2-15 Cys TGT construct nucleotide sequence	387 nts
SEQ ID NO: 4	IgkS2-16 Cys TGC construct nucleotide sequence	387 nts
SEQ ID NO: 5	IgkS2-17 Glu GAG construct nucleotide sequence	387 nts
SEQ ID NO: 6	IgkS2-18 Glu GAA construct nucleotide sequence	387 nts
SEQ ID NO: 7	IgkS2-19 Gln CAG construct nucleotide sequence	387 nts
SEQ ID NO: 8	IgkS2-20 Gln CAA construct nucleotide sequence	387 nts
SEQ ID NO: 9	IgkS2-21 Gly GGG construct nucleotide sequence	387 nts
SEQ ID NO: 10	IgkS2-22 Gly GGA construct nucleotide sequence	387 nts
SEQ ID NO: 11	IgkS2-23 Gly GGT construct nucleotide sequence	387 nts
SEQ ID NO: 12	IgkS2-24 Gly GGC construct nucleotide sequence	387 nts
SEQ ID NO: 13	IgkS2-27 Ile ATA construct nucleotide sequence	387 nts
SEQ ID NO: 14	IgkS2-28 Ile ATT construct nucleotide sequence	387 nts
SEQ ID NO: 15	IgkS2-29 Ile ATC construct nucleotide sequence	387 nts
SEQ ID NO: 16	IgkS2-34 Ser AGT construct nucleotide sequence	387 nts
SEQ ID NO: 17	IgkS2-35 Ser AGC construct nucleotide sequence	387 nts
SEQ ID NO: 18	IgkS2-36 Ser TCG construct nucleotide sequence	387 nts
SEQ ID NO: 19	IgkS2-37 Ser TCA construct nucleotide sequence	387 nts
SEQ ID NO: 20	IgkS2-38 Ser TCT construct nucleotide sequence	387 nts
SEQ ID NO: 21	IgkS2-39 Ser TCC construct nucleotide sequence	387 nts
SEQ ID NO: 22	IgkS2-40 Thr ACG construct nucleotide sequence	387 nts
SEQ ID NO: 23	IgkS2-41 Thr ACA construct nucleotide sequence	387 nts
SEQ ID NO: 24	IgkS2-42 Thr ACT construct nucleotide sequence	387 nts
SEQ ID NO: 25	IgkS2-43 Thr ACC construct nucleotide sequence	387 nts
SEQ ID NO: 26	IgkS2-46 Val GTG construct nucleotide sequence	387 nts
SEQ ID NO: 27	IgkS2-47 Val GTA construct nucleotide sequence	387 nts
SEQ ID NO: 28	IgkS2-48 Val GTT construct nucleotide sequence	387 nts
SEQ ID NO: 29	IgkS2-49 Val GTG construct nucleotide sequence	387 nts
SEQ ID NO: 30	IgkS2-1 Ala GCG Linker nucleotide sequence	408 nts
SEQ ID NO: 31	IgkS2-2 Ala GCA Linker nucleotide sequence	408 nts
SEQ ID NO: 32	IgkS2-3 Ala GCT Linker nucleotide sequence	408 nts
SEQ ID NO: 33	IgkS2-4 Ala GCC Linker nucleotide sequence	408 nts
SEQ ID NO: 34	IgkS2-5 Arg AGG Linker nucleotide sequence	408 nts
SEQ ID NO: 35	IgkS2-6 Arg AGA Linker nucleotide sequence	408 nts
SEQ ID NO: 36	IgkS2-7 Arg CGG Linker nucleotide sequence	408 nts
SEQ ID NO: 37	IgkS2-8 Arg CGA Linker nucleotide sequence	408 nts
SEQ ID NO: 38	IgkS2-9 Arg CGT Linker nucleotide sequence	408 nts
SEQ ID NO: 39	IgkS2-10 Arg CGC Linker nucleotide sequence	408 nts
SEQ ID NO: 40	IgkS2-11 Asn AAT Linker nucleotide sequence	408 nts
SEQ ID NO: 41	IgkS2-12 Asn AAC Linker nucleotide sequence	408 nts
SEQ ID NO: 42	IgkS2-25 His CAT Linker nucleotide sequence	408 nts
SEQ ID NO: 43	IgkS2-26 His CAC Linker nucleotide sequence	408 nts
SEQ ID NO: 44	IgkS2-30 Lys AAG Linker nucleotide sequence	408 nts
SEQ ID NO: 45	IgkS2-31 Lys AAA Linker nucleotide sequence	408 nts
SEQ ID NO: 46	IgkS2-32 Phe TTT Linker nucleotide sequence	408 nts
SEQ ID NO: 47	IgkS2-33 Phe TTC Linker nucleotide sequence	408 nts
SEQ ID NO: 48	IgkS2-44 Tyr TAT Linker nucleotide sequence	408 nts
SEQ ID NO: 49	IgkS2-45 Tyr TAC Linker nucleotide sequence	408 nts
SEQ ID NO: 50	Influenza A Virus HA hemagglutinin (A/Hong Kong/213/03(H5N1)) BAE07201 wild-type	1707 nts
SEQ ID NO: 51	Influenza A Virus HA hemagglutinin (A/Hong Kong/213/03(H5N1)) BAE07201 wild-type	568 aa
SEQ ID NO: 52	Influenza A Virus HA hemagglutinin (A/Hong Kong/213/03(H5N1)) Codon modified	1707 nts
SEQ ID NO: 53	Influenza A Virus HA hemagglutinin (A/swine/Korea/PZ72-1/2006 (H3N1)) DQ923506 wild-type	1701 nts
SEQ ID NO: 54	Influenza A Virus HA hemagglutinin (A/swine/Korea/PZ72-1/2006 (H3N1)) DQ923506 wild-type	566 aa
SEQ ID NO: 55	Influenza A Virus HA hemagglutinin (A/swine/Korea/PZ72-1/2006 (H3N1)) Codon modified	1701 nts
SEQ ID NO: 56	Influenza A Virus NA neuraminidase (A/Hong Kong/213/03(H5N1)) AB212056 wild-type	1410 nts
SEQ ID NO: 57	Influenza A Virus NA neuraminidase (A/Hong Kong/213/03(H5N1)) AB212056 wild-type	469 aa
SEQ ID NO: 58	Influenza A Virus NA neuraminidase (A/Hong Kong/213/03(H5N1)) Codon modified	1410 nts
SEQ ID NO: 59	Influenza A Virus NA neuraminidase (A/swine/MI/PU243/04 (H3N1)) DQ150427 wild-type	1410 nts
SEQ ID NO: 60	Influenza A Virus NA neuraminidase (A/swine/MI/PU243/04 (H3N1)) DQ150427 wild-type	469 aa
SEQ ID NO: 61	Influenza A Virus NA neuraminidase (A/swine/MI/PU243/04 (H3N1)) Codon modified	1410 nts
SEQ ID NO: 62	Hepatitis C Virus E1 (Serotype 1A, isolate H77) AF009606 wild-type	576 nts
SEQ ID NO: 63	Hepatitis C Virus E1 (Serotype 1A, isolate H77) NP 751920 wild-type	192 aa
SEQ ID NO: 64	Hepatitis C Virus E1 (Serotype 1A, isolate H77) Codon modified	576 nts
SEQ ID NO: 65	Hepatitis C Virus E2 (Serotype 1A, isolate H77) AF009606 wild-type	1089 nts
SEQ ID NO: 66	Hepatitis C Virus E2 (Serotype 1A, isolate H77) NP 751921 wild-type	363 aa
SEQ ID NO: 67	Hepatitis C Virus E2 (Serotype 1A, isolate H77) Codon modified	1089 nts
SEQ ID NO: 68	Epstein Barr Virus (Type 1, gp350 B95-8) NC 007605 wild-type	2724 nts
SEQ ID NO: 69	Epstein Barr Virus (Type 1, gp350 B95-8) CAD53417 wild-type	907 aa
SEQ ID NO: 70	Epstein Barr Virus (Type 1, gp350 B95-8) Codon modified	2724 nts
SEQ ID NO: 71	Epstein Barr Virus (Type 2, gp350 AG876) NC 009334 wild-type	2661 nts
SEQ ID NO: 72	Epstein Barr Virus (Type 2, gp350 AG876) YP 001129462 wild-type	886 aa
SEQ ID NO: 73	Epstein Barr Virus (Type 2, gp350 AG876) Codon modified	2661 nts
SEQ ID NO: 74	Herpes Simplex Virus 2 (Glycoprotein B strain HG52) NC 001798 wild-type	2715 nts
SEQ ID NO: 75	Herpes Simplex Virus 2 (Glycoprotein B strain HG52) CAB06752 wild-type	904 aa
SEQ ID NO: 76	Herpes Simplex Virus 2 (Glycoprotein B strain HG52) Codon modified	2715 nts
SEQ ID NO: 77	Herpes Simplex Virus (Glycoprotein D strain HG52) NC 001798 wild-type	1182 nts
SEQ ID NO: 78	Herpes Simplex Virus (Glycoprotein D strain HG52) NP 0044536 wild-type	393 aa
SEQ ID NO: 79	Herpes Simplex Virus (Glycoprotein D strain HG52) Codon modified	1182 nts
SEQ ID NO: 80	HPV-16 E7 wild-type	387 nts
SEQ ID NO: 81	HPV-16 E7 O1	387 nts
SEQ ID NO: 82	HPV-16 E7 O2	387 nts
SEQ ID NO: 83	HPV-16 E7 O3	417 nts
SEQ ID NO: 84	HPV-16 E7 W	387 nts
SEQ ID NO: 85	HSV-2 gD2 wild-type	1182 nts
SEQ ID NO: 86	HSV-2 gD2 O1	1182 nts
SEQ ID NO: 87	HSV-2 gD2 O2	1182 nts
SEQ ID NO: 88	HSV-2 gD2 O3	1182 nts
SEQ ID NO: 89	HSV-2 gD2 W	1182 nts
SEQ ID NO: 90	Common forward primer	41 nts
SEQ ID NO: 91	ODN-7909	24 nts

DETAILED DESCRIPTION OF THE INVENTION

1. Definitions

[0113] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by those of ordinary skill in the art to which the invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, preferred methods and materials are described. For the purposes of the present invention, the following terms are defined below.

[0114] The articles "a" and "an" are used herein to refer to one or to more than one (i.e. to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.

[0115] By "about" is meant a quantity, level, value, frequency, percentage, dimension, size, or amount that varies by no more than 15%, and preferably by no more than 10%, 9%, 8%, 7%, 6%, 5%, 4%, 3%, 2% or 1%, relative to a reference quantity, level, value, frequency, percentage, dimension, size, or amount.
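
By way of illustration only, the tolerance in this definition can be expressed as a simple relative-deviation check. The following is a minimal sketch; the function name is_about and the treatment of a zero reference are assumptions made for the example and are not part of the specification.

```python
def is_about(value: float, reference: float, tolerance: float = 0.15) -> bool:
    """Return True if value deviates from reference by no more than
    the given fraction of the reference (0.15 corresponds to 15%)."""
    if reference == 0:
        # Assumption for this sketch: only an exact match counts as
        # "about" zero, since a relative deviation is undefined there.
        return value == 0
    return abs(value - reference) <= tolerance * abs(reference)


# 110 is within 15% of 100; 120 is not.
print(is_about(110, 100))
print(is_about(120, 100))
# The preferred, narrower tolerances can be passed explicitly.
print(is_about(104, 100, tolerance=0.05))
```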

[0116] The terms "administration concurrently" or "administering concurrently" or "co-administering" and the like refer to the administration of a single composition containing two or more actives, or the administration of each active as separate compositions and/or delivered by separate routes either contemporaneously or simultaneously or sequentially within a short enough period of time that the effective result is equivalent to that obtained when all such actives are administered as a single composition. By "simultaneously" is meant that the active agents are administered at substantially the same time, and desirably together in the same formulation. By "contemporaneously" it is meant that the active agents are administered closely in time, e.g., one agent is administered within from about one minute to within about one day before or after another. Any contemporaneous time is useful. However, it will often be the case that when not administered simultaneously, the agents will be administered within about one minute to within about eight hours and preferably within less than about one to about four hours. When administered contemporaneously, the agents are suitably administered at the same site on the subject. The term "same site" includes the exact location, but can be within about 0.5 to about 15 centimeters, preferably from within about 0.5 to about 5 centimeters. The term "separately" as used herein means that the agents are administered at an interval, for example at an interval of about a day to several weeks or months. The active agents may be administered in either order. The term "sequentially" as used herein means that the agents are administered in sequence, for example at an interval or intervals of minutes, hours, days or weeks. If appropriate the active agents may be administered in a regular repeating cycle.

[0117] As used herein, the term "cis-acting sequence" or "cis-regulatory region" or similar term shall be taken to mean any sequence of nucleotides which is derived from an expressible genetic sequence wherein the expression of the genetic sequence is regulated, at least in part, by the sequence of nucleotides. Those skilled in the art will be aware that a cis-regulatory region may be capable of activating, silencing, enhancing, repressing or otherwise altering the level of expression and/or cell-type-specificity and/or developmental specificity of any structural gene sequence.

[0118] Throughout this specification, unless the context requires otherwise, the words "comprise," "comprises" and "comprising" will be understood to imply the inclusion of a stated step or element or group of steps or elements but not the exclusion of any other step or element or group of steps or elements.

[0119] As used herein, a "chimeric construct" refers to a polynucleotide having heterologous nucleic acid elements. Chimeric constructs include "expression cassettes" or "expression constructs," which refer to an assembly that is capable of directing the expression of the sequence(s) or gene(s) of interest. An expression cassette generally includes control elements such as a promoter that is operably linked to (so as to direct transcription of) a synthetic polynucleotide of the invention, and often includes a polyadenylation sequence as well. Within certain embodiments of the invention, the chimeric construct may be contained within a vector. In addition to the components of the chimeric construct, the vector may include one or more selectable markers, a signal which allows the vector to exist as single-stranded DNA (e.g., an M13 origin of replication), at least one multiple cloning site, and a "mammalian" origin of replication (e.g., an SV40 or adenovirus origin of replication).

[0120] By "coding sequence" is meant any nucleic acid sequence that contributes to the code for the polypeptide product of a polynucleotide (e.g., a reporter polynucleotide).

[0121] As used herein a "conferred phenotype" refers to a temporary or permanent change in the state of an organism of interest or class of organisms of interest, or of a part or tissue or cell or cell type or class of cell of an organism of interest, which occurs after the introduction of a polynucleotide to that organism, or to that class of organisms, or to the part or tissue or cell or cell type or class of cell, or to a precursor of that organism or part or tissue or cell or cell type or class of cell, and which would not have occurred in the absence of that introduction. Typically, such a temporary or permanent change occurs as a result of the transcription and/or translation of genetic information contained within that polynucleotide in the cell, or in at least one cell or cell type or class of cell within the organism of interest or within the class of organisms of interest, and can be used to distinguish the organism of interest, or class of organisms of interest, or part or tissue or cell or cell type or class of cell thereof, or genetic progeny of these, to which the polynucleotide has been provided from a similar organism of interest, or class of organisms of interest, or part or tissue or cell or cell type or class of cell thereof, or genetic progeny of these, to which the polynucleotide has not been provided.

[0122] As used herein, "conferred immune response," "immune response that is conferred" and the like refer to a temporary or permanent change in immune response to a target antigen, which occurs or would occur after the introduction of a polynucleotide to the mammal, and which would not occur in the absence of that introduction. Typically, such a temporary or permanent change occurs as a result of the transcription and/or translation of genetic information contained within that polynucleotide in a cell, or in at least one cell or cell type or class of cell within a mammal or within a class of mammals, and can be used to distinguish the mammal, or class of mammals to which the polynucleotide has been provided from a similar mammal, or class of mammals, to which the polynucleotide has not been provided.

[0123] By "corresponds to" or "corresponding to" is meant an antigen which encodes an amino acid sequence that displays substantial similarity to an amino acid sequence in a target antigen. In general the antigen will display at least about 30, 40, 50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99% similarity or identity to at least a portion of the target antigen (e.g., at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 95% of the amino acid sequence of the target antigen).

[0124] By "effective amount," in the context of modulating an immune response or treating or preventing a disease or condition, is meant the administration of that amount of composition to an individual in need thereof, either in a single dose or as part of a series, that is effective for achieving that modulation, treatment or prevention. The effective amount will vary depending upon the health and physical condition of the individual to be treated, the taxonomic group of individual to be treated, the formulation of the composition, the assessment of the medical situation, and other relevant factors. It is expected that the amount will fall in a relatively broad range that can be determined through routine trials.

[0125] The terms "enhancing an immune response," "producing a stronger immune response" and the like refer to increasing an animal's capacity to respond to a target antigen (e.g., a foreign or disease-specific antigen or a self antigen), which can be determined for example by detecting an increase in the number, activity, and ability of the animal's cells that are primed to attack such antigens or an increase in the titer or activity of antibodies in the animal, which are immuno-interactive with the target antigen. Strength of immune response can be measured by standard immunoassays including: direct measurement of antibody titers or peripheral blood lymphocytes; cytolytic T lymphocyte assays; assays of natural killer cell cytotoxicity; cell proliferation assays including lymphoproliferation (lymphocyte activation) assays; immunoassays of immune cell subsets; assays of T-lymphocytes specific for the antigen in a sensitized subject; skin tests for cell-mediated immunity; etc. Such assays are well known in the art. See, e.g., Erickson et al., 1993, J. Immunol. 151:4189-4199; Doe et al., 1994, Eur. J. Immunol. 24:2369-2376. Recent methods of measuring cell-mediated immune response include measurement of intracellular cytokines or cytokine secretion by T-cell populations, or by measurement of epitope specific T-cells (e.g., by the tetramer technique) (reviewed by McMichael, A. J., and O'Callaghan, C. A., 1998, J. Exp. Med. 187(9):1367-1371; McHeyzer-Williams, M. G., et al., 1996, Immunol. Rev. 150:5-21; Lalvani, A., et al., 1997, J. Exp. Med. 186:859-865). Any statistically significant increase in strength of immune response as measured for example by immunoassay is considered an "enhanced immune response" or "immunoenhancement" as used herein.
Enhanced immune response is also indicated by physical manifestations such as fever and inflammation, as well as healing of systemic and local infections, and reduction of symptoms in disease, i.e., decrease in tumor size, alleviation of symptoms of a disease or condition including, but not restricted to, leprosy, tuberculosis, malaria, aphthous ulcers, herpetic and papillomatous warts, gingivitis, arthrosclerosis, the concomitants of AIDS such as Kaposi's sarcoma, bronchial infections, and the like. Such physical manifestations also encompass "enhanced immune response" or "immunoenhancement" as used herein. By contrast, "reducing an immune response," "producing a weaker immune response" and the like refer to decreasing an animal's capacity to respond to a target antigen, which can be determined for example by conducting immunoassays or assessing physical manifestations, as described for example above.

[0126] The terms "expression" or "gene expression" refer to production of RNA message and/or translation of RNA message into proteins or polypeptides.

[0127] By "expression vector" is meant any autonomous genetic element capable of directing the synthesis of a protein encoded by the vector. Such expression vectors are known by practitioners in the art.

[0128] The term "gene" is used in its broadest context to include both a genomic DNA region corresponding to the gene as well as a cDNA sequence corresponding to exons or a recombinant molecule engineered to encode a functional form of a product.

[0129] As used herein the term "heterologous" refers to a combination of elements that are not naturally occurring or that are obtained from different sources.

[0130] "Immune response" or "immunological response" refers to the concerted action of lymphocytes, antigen-presenting cells, phagocytic cells, granulocytes, and soluble macromolecules produced by the above cells or the liver (including antibodies, cytokines, and complement) that results in selective damage to, destruction of, or elimination from the body of cancerous cells, metastatic tumor cells, metastatic breast cancer cells, invading pathogens, cells or tissues infected with pathogens, or, in cases of autoimmunity or pathological inflammation, normal human cells or tissues. In some embodiments, an "immune response" encompasses the development in an individual of a humoral and/or a cellular immune response to a polypeptide that is encoded by an introduced synthetic polynucleotide of the invention. As known in the art, the term "humoral immune response" includes and encompasses an immune response mediated by antibody molecules, while a "cellular immune response" includes and encompasses an immune response mediated by T-lymphocytes and/or other white blood cells. Thus, an immune response that is stimulated by a synthetic polynucleotide of the invention may be one that stimulates the production of antibodies (e.g., neutralizing antibodies that block bacterial toxins and pathogens such as viruses entering cells and replicating by binding to toxins and pathogens, typically protecting cells from infection and destruction). The synthetic polynucleotide may also elicit production of cytolytic T lymphocytes (CTLs). Hence, an immunological response may include one or more of the following effects: the production of antibodies by B-cells; and/or the activation of suppressor T-cells and/or memory/effector T-cells directed specifically to an antigen or antigens present in the composition or vaccine of interest.
In some embodiments, these responses may serve to neutralize infectivity, and/or mediate antibody-complement, or antibody dependent cell cytotoxicity (ADCC) to provide protection to an immunized host. Such responses can be determined using standard immunoassays and neutralization assays, well known in the art. (See, e.g., Montefiori et al., 1988, J Clin Microbiol. 26:231-235; Dreyer et al., 1999, AIDS Res Hum Retroviruses 15(17):1563-1571). The innate immune system of mammals also recognizes and responds to molecular features of pathogenic organisms and cancer cells via activation of Toll-like receptors and similar receptor molecules on immune cells. Upon activation of the innate immune system, various non-adaptive immune response cells are activated to, e.g., produce various cytokines, lymphokines and chemokines. Cells activated by an innate immune response include immature and mature dendritic cells of, for example, the monocyte and plasmacytoid lineage (MDC, PDC), as well as gamma, delta, alpha and beta T cells and B cells and the like. Thus, the present invention also contemplates an immune response wherein the immune response involves both an innate and adaptive response.

[0131] A composition is "immunogenic" if it is capable of either: a) generating an immune response against a target antigen (e.g., a viral or tumor antigen) in an individual; or b) reconstituting, boosting, or maintaining an immune response in an individual beyond what would occur if the agent or composition was not administered. An agent or composition is immunogenic if it is capable of attaining either of these criteria when administered in single or multiple doses.

[0132] "Immunomodulation," "modulating an immune response" and the like refer to the modulation of the immune system in response to a stimulus and includes increasing or decreasing an immune response to a target antigen or changing an immune response from one that is predominantly a humoral immune response to one that is a more cell-mediated immune response and vice versa. For example, it is known in the art that decreasing the amount of antigen for immunization can change the bias of the immune system from a predominantly humoral immune response to a predominantly cellular immune response.

[0133] By "isoaccepting transfer RNA" or "iso-tRNA" is meant one or more transfer RNA molecules that differ in their anticodon nucleotide sequence but are specific for the same amino acid.

[0134] As used herein, the term "mammal" refers to any mammal including, without limitation, humans and other primates, including non-human primates such as chimpanzees and other apes and monkey species; farm animals such as cattle, sheep, pigs, goats and horses; domestic mammals such as dogs and cats; and laboratory animals including rodents such as mice, rats and guinea pigs. The term does not denote a particular age. Thus, both adult and newborn individuals are intended to be covered.

[0135] By "modulating," "modulate" and the like is meant increasing or decreasing, either directly or indirectly, the quality of a selected phenotype (e.g., an immune response). In certain embodiments, "modulation" or "modulating" means that a desired/selected immune response is more efficient (e.g., at least 10%, 20%, 30%, 40%, 50%, 60% or more), more rapid (e.g., at least 10%, 20%, 30%, 40%, 50%, 60% or more), greater in magnitude (e.g., at least 10%, 20%, 30%, 40%, 50%, 60% or more), and/or more easily induced (e.g., at least 10%, 20%, 30%, 40%, 50%, 60% or more) than if the parent polynucleotide had been used under the same conditions as the synthetic polynucleotide. In other embodiments, "modulation" or "modulating" means changing an immune response from a predominantly antibody-mediated immune response as conferred by the parent polynucleotide, to a predominantly cellular immune response as conferred by the synthetic polynucleotide under the same conditions. In still other embodiments, "modulation" or "modulating" means changing an immune response from a predominantly cellular immune response as conferred by the parent polynucleotide, to a predominantly antibody-mediated immune response as conferred by the synthetic polynucleotide under the same conditions.

[0136] By "natural gene" is meant a gene that naturally encodes the protein. However, it is possible that the parent polynucleotide encodes a protein that is not naturally-occurring but has been engineered using recombinant techniques.

[0137] The term "5' non-coding region" is used herein in its broadest context to include all nucleotide sequences which are derived from the upstream region of an expressible gene, other than those sequences which encode amino acid residues which comprise the polypeptide product of the gene, wherein the 5' non-coding region confers or activates or otherwise facilitates, at least in part, expression of the gene.

[0138] The term "oligonucleotide" as used herein refers to a polymer composed of a multiplicity of nucleotide units (deoxyribonucleotides or ribonucleotides, or related structural variants or synthetic analogues thereof) linked via phosphodiester bonds (or related structural variants or synthetic analogues thereof). Thus, while the term "oligonucleotide" typically refers to a nucleotide polymer in which the nucleotides and linkages between them are naturally occurring, it will be understood that the term also includes within its scope various analogues including, but not restricted to, peptide nucleic acids (PNAs), phosphoramidates, phosphorothioates, methyl phosphonates, 2-O-methyl ribonucleic acids, and the like. The exact size of the molecule may vary depending on the particular application. An oligonucleotide is typically rather short in length, generally from about 10 to 30 nucleotides, but the term can refer to molecules of any length, although the term "polynucleotide" or "nucleic acid" is typically used for large oligonucleotides.

[0139] The terms "operably connected," "operably linked" and the like as used herein refer to an arrangement of elements wherein the components so described are configured so as to perform their usual function. Thus, a given promoter operably linked to a coding sequence is capable of effecting the expression of the coding sequence when the proper enzymes are present. The promoter need not be contiguous with the coding sequence, so long as it functions to direct the expression thereof. Thus, for example, intervening untranslated yet transcribed sequences can be present between the promoter sequence and the coding sequence and the promoter sequence can still be considered "operably linked" to the coding sequence. Terms such as "operably connected," therefore, include placing a structural gene under the regulatory control of a promoter, which then controls the transcription and optionally translation of the gene. In the construction of heterologous promoter/structural gene combinations, it is generally preferred to position the genetic sequence or promoter at a distance from the gene transcription start site that is approximately the same as the distance between that genetic sequence or promoter and the gene it controls in its natural setting; i.e. the gene from which the genetic sequence or promoter is derived. As is known in the art, some variation in this distance can be accommodated without loss of function. Similarly, the preferred positioning of a regulatory sequence element with respect to a heterologous gene to be placed under its control is defined by the positioning of the element in its natural setting; i.e., the genes from which it is derived.

[0140] By "pharmaceutically-acceptable carrier" is meant a solid or liquid filler, diluent or encapsulating substance that may be safely used in topical or systemic administration.

[0141] The term "phenotype" means any one or more detectable physical or functional characteristics, properties, attributes or traits of an organism, tissue, or cell, or class of organisms, tissues or cells, which generally result from the interaction between the genetic makeup (i.e., genotype) of the organism, tissue, or cell, or the class of organisms, tissues or cells and the environment. In certain embodiments, the term "phenotype" excludes resistance to a selective agent or screening an enzymic or light-emitting activity, conferred directly by a reporter protein.

[0142] By "phenotypic preference" is meant the preference with which an organism uses a codon to produce a selected phenotype. This preference can be evidenced, for example, by the quality of a selected phenotype that is producible by a polynucleotide that comprises the codon in an open reading frame which codes for a polypeptide that produces the selected phenotype. In certain embodiments, the preference of usage is independent of the route by which the polynucleotide is introduced into the organism. However, in other embodiments, the preference of usage is dependent on the route of introduction of the polynucleotide into the organism.

[0143] The term "polynucleotide" or "nucleic acid" as used herein designates mRNA, RNA, cRNA, cDNA or DNA. The term typically refers to oligonucleotides greater than 30 nucleotides in length.

[0144] "Polypeptide," "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues and to variants and synthetic analogues of the same. Thus, these terms apply to amino acid polymers in which one or more amino acid residues is a synthetic non-naturally occurring amino acid, such as a chemical analogue of a corresponding naturally occurring amino acid, as well as to naturally-occurring amino acid polymers. As used herein, the terms "polypeptide," "peptide" and "protein" are not limited to a minimum length of the product. Thus, peptides, oligopeptides, dimers, multimers, and the like, are included within the definition. Both full-length proteins and fragments thereof are encompassed by the definition. The terms also include post expression modifications of a polypeptide, for example, glycosylation, acetylation, phosphorylation and the like. In some embodiments, a "polypeptide" refers to a protein which includes modifications, such as deletions, additions and substitutions (generally conservative in nature), to the native sequence, so long as the protein maintains the desired activity. These modifications may be deliberate, as through site-directed mutagenesis, or may be accidental, such as through mutations of hosts which produce the proteins or errors due to PCR amplification.

[0145] The terms "polypeptide variant," and "variant" refer to polypeptides that vary from a reference polypeptide by the addition, deletion or substitution (generally conservative in nature) of at least one amino acid residue. Typically, variants retain a desired activity of the reference polypeptide, such as antigenic activity in inducing an immune response against a target antigen. In general, variant polypeptides are "substantially similar" or "substantially identical" to the reference polypeptide, e.g., amino acid sequence identity or similarity of more than 50%, generally more than 60%-70%, even more particularly 80%-85% or more, such as at least 90%-95% or more, when the two sequences are aligned. Often, the variants will include the same number of amino acids but will include substitutions, as explained herein.

[0146] The terms "precursor cell or tissue" and "progenitor cell or tissue" as used herein refer to a cell or tissue that can give rise to a particular cell or tissue in which a polypeptide is produced by expression of the coding sequences in the synthetic constructs of the invention.

[0147] The terms "precursor" and "progenitor," as used herein in the context of phenotypic preference, refer to a cell or part of an organism that can give rise to an organism of interest in which phenotypic expression is desired or in which phenotypic preference of a codon is to be determined.

[0148] By "primer" is meant an oligonucleotide which, when paired with a strand of DNA, is capable of initiating the synthesis of a primer extension product in the presence of a suitable polymerizing agent. The primer is preferably single-stranded for maximum efficiency in amplification but may alternatively be double-stranded. A primer must be sufficiently long to prime the synthesis of extension products in the presence of the polymerization agent. The length of the primer depends on many factors, including application, temperature to be employed, template reaction conditions, other reagents, and source of primers. For example, depending on the complexity of the target sequence, the oligonucleotide primer typically contains 15 to 35 or more nucleotides, although it may contain fewer nucleotides. Primers can be large polynucleotides, such as from about 200 nucleotides to several kilobases or more. A primer may be selected to be "substantially complementary" to the sequence on the template to which it is designed to hybridize and serve as a site for the initiation of synthesis. By "substantially complementary", it is meant that the primer is sufficiently complementary to hybridize with a target nucleotide sequence. Preferably, the primer contains no mismatches with the template to which it is designed to hybridize but this is not essential. For example, non-complementary nucleotides may be attached to the 5' end of the primer, with the remainder of the primer sequence being complementary to the template. Alternatively, non-complementary nucleotides or a stretch of non-complementary nucleotides can be interspersed into a primer, provided that the primer sequence has sufficient complementarity with the sequence of the template to hybridize therewith and thereby form a template for synthesis of the extension product of the primer.

[0149] By "producing", and like terms such as "production" and "producible", in the context of protein production, is meant production of a protein to a level sufficient to achieve a particular function or phenotype associated with the protein. By contrast, the terms "not producible" and "not substantially producible" as used interchangeably herein refer to (a) no production of a protein, (b) production of a protein to a level that is not sufficient to effect a particular function or phenotype associated with the protein, (c) production of a protein, which cannot be detected by a monoclonal antibody specific for the protein, or (d) production of a protein, which is less than 1% of the level produced in a wild-type cell that normally produces the protein.

[0150] Reference herein to a "promoter" is to be taken in its broadest context and includes the transcriptional regulatory sequences of a classical genomic gene, including the TATA box which is required for accurate transcription initiation, with or without a CCAAT box sequence and additional regulatory elements (i.e. upstream activating sequences, enhancers and silencers) which alter gene expression in response to developmental and/or environmental stimuli, or in a tissue-specific or cell-type-specific manner. A promoter is usually, but not necessarily, positioned upstream or 5', of a structural gene, the expression of which it regulates. Furthermore, the regulatory elements comprising a promoter are usually positioned within 2 kb of the start site of transcription of the gene. Preferred promoters according to the invention may contain additional copies of one or more specific regulatory elements to further enhance expression in a cell, and/or to alter the timing of expression of a structural gene to which it is operably connected.

[0151] The term "quality" is used herein in its broadest sense and includes a measure, strength, intensity, degree or grade of a phenotype, e.g., a superior or inferior immune response, increased or decreased disease resistance, higher or lower sucrose accumulation, better or worse salt tolerance etc.

[0152] By "regulatory element" or "regulatory sequence" is meant a nucleic acid sequence (e.g., DNA) that expresses an operably linked nucleotide sequence (e.g., a coding sequence) in a particular host cell. The regulatory sequences that are suitable for prokaryotic cells for example, include a promoter, and optionally a cis-acting sequence such as an operator sequence and a ribosome binding site. Control sequences that are suitable for eukaryotic cells include promoters, polyadenylation signals, transcriptional enhancers, translational enhancers, leader or trailing sequences that modulate mRNA stability, as well as targeting sequences that target a product encoded by a transcribed polynucleotide to an intracellular compartment within a cell or to the extracellular environment.

[0153] The term "sequence identity" as used herein refers to the extent that sequences are identical on a nucleotide-by-nucleotide basis or an amino acid-by-amino acid basis over a window of comparison. Thus, a "percentage of sequence identity" is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, I) or the identical amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile, Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys and Met) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size), and multiplying the result by 100 to yield the percentage of sequence identity. For the purposes of the present invention, "sequence identity" will be understood to mean the "match percentage" calculated by the DNASIS computer program (Version 2.5 for Windows; available from Hitachi Software Engineering Co., Ltd., South San Francisco, Calif., USA) using standard defaults as used in the reference manual accompanying the software.
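
By way of illustration only, the arithmetic described above (matched positions divided by window size, multiplied by 100) can be sketched as follows. The sketch assumes that the two sequences have already been optimally aligned to equal length, with "-" marking gaps, and it treats any gapped position as a non-match; both are assumptions of the example. It does not reproduce the alignment or the "match percentage" computed by the DNASIS program, and the name percent_identity is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of positions in the comparison window at which both
    pre-aligned sequences carry the identical base or residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    # Count matched positions; a gap character never counts as a match
    # (an assumption of this sketch).
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    # Divide by the window size and scale to a percentage.
    return 100.0 * matches / len(aligned_a)


# Seven of eight positions match in this window.
print(percent_identity("ATGCATGC", "ATGCATGA"))  # 87.5
```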

[0154] "Similarity" refers to the percentage number of amino acids that are identical or constitute conservative substitutions as defined in Table 10. Similarity may be determined using sequence comparison programs such as GAP (Devereux et al. 1984, Nucleic Acids Research 12, 387-395). In this way, sequences of a similar or substantially different length to those cited herein might be compared by insertion of gaps into the alignment, such gaps being determined, for example, by the comparison algorithm used by GAP.

[0155] Terms used to describe sequence relationships between two or more polynucleotides or polypeptides include "reference sequence", "comparison window", "sequence identity", "percentage of sequence identity" and "substantial identity". A "reference sequence" is at least 12 but frequently 15 to 18 and often at least 25 monomer units, inclusive of nucleotides and amino acid residues, in length. Because two polynucleotides may each comprise (1) a sequence (i.e., only a portion of the complete polynucleotide sequence) that is similar between the two polynucleotides, and (2) a sequence that is divergent between the two polynucleotides, sequence comparisons between two (or more) polynucleotides are typically performed by comparing sequences of the two polynucleotides over a "comparison window" to identify and compare local regions of sequence similarity. A "comparison window" refers to a conceptual segment of at least 6 contiguous positions, usually about 50 to about 100, more usually about 100 to about 150 in which a sequence is compared to a reference sequence of the same number of contiguous positions after the two sequences are optimally aligned. The comparison window may comprise additions or deletions (i.e., gaps) of about 20% or less as compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences. Optimal alignment of sequences for aligning a comparison window may be conducted by computerized implementations of algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Drive Madison, Wis., USA) or by inspection and the best alignment (i.e., resulting in the highest percentage homology over the comparison window) generated by any of the various methods selected. Reference also may be made to the BLAST family of programs as for example disclosed by Altschul et al., 1997, Nucl. Acids Res. 25:3389. 
A detailed discussion of sequence analysis can be found in Unit 19.3 of Ausubel et al., "Current Protocols in Molecular Biology", John Wiley & Sons Inc, 1994-1998, Chapter 15.
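As a rough illustration of comparing two sequences over a comparison window, the sketch below scores two already-aligned sequences of equal length. It assumes that percentage identity is taken as the number of columns carrying the same residue divided by the total number of aligned columns; that convention, the gap character `-` and the function names `percent_identity` and `gap_fraction` are assumptions made for this example only. It does not perform the alignment itself and is not the algorithm used by GAP, BESTFIT, FASTA or BLAST.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of aligned columns in which both sequences carry the same residue.

    Assumption: the denominator is the full window length, gap columns included.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != gap
    )
    return 100.0 * matches / len(aligned_a)


def gap_fraction(aligned: str, gap: str = "-") -> float:
    """Fraction of window positions that are gaps in one aligned sequence."""
    if not aligned:
        raise ValueError("empty alignment")
    return aligned.count(gap) / len(aligned)


# Toy 10-column window: one gap column and one mismatch column.
window_a = "ACGT-ACGTA"
window_b = "ACGTTACGAA"
identity = percent_identity(window_a, window_b)
gaps = gap_fraction(window_a)
```

In this toy window, 8 of the 10 columns match, and the single gap is within the limit of about 20% mentioned above.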

[0156] As used herein, the term "specific binding pair" refers to a pair of molecules that physically interact with one another in a specific manner that gives rise to a biological activity, that is, to the substantial exclusion of other polypeptides. Members of a specific binding pair interact through complementary interaction domains, such that they interact to the substantial exclusion of proteins that do not have a complementary interaction domain. Non-limiting examples of specific binding pairs include antibody-antigen pairs, enzyme-substrate pairs, dimeric transcription factors (e.g., AP-1, composed of Fos specifically bound to Jun via a leucine zipper interaction domain) and receptor-ligand pairs.

[0157] The terms "synthetic polynucleotide," "synthetic construct" and the like as used herein refer to a nucleic acid molecule that is formed by recombinant or synthetic techniques and typically includes polynucleotides that are not normally found in nature.

[0158] The term "synonymous codon" as used herein refers to a codon having a different nucleotide sequence than another codon but encoding the same amino acid as that other codon.

[0159] By "treatment," "treat," "treated" and the like is meant to include both therapeutic and prophylactic treatment.

[0160] By "vector" is meant a nucleic acid molecule, preferably a DNA molecule derived, for example, from a plasmid, bacteriophage, or plant virus, into which a nucleic acid sequence may be inserted or cloned. A vector preferably contains one or more unique restriction sites and may be capable of autonomous replication in a defined host cell including a target cell or tissue or a progenitor cell or tissue thereof, or be integrable with the genome of the defined host such that the cloned sequence is reproducible. Accordingly, the vector may be an autonomously replicating vector, i.e., a vector that exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a linear or closed circular plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. A vector system may comprise a single vector or plasmid, two or more vectors or plasmids, which together contain the total DNA to be introduced into the genome of the host cell, or a transposon. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vector may also include a selection marker such as an antibiotic resistance gene that can be used for selection of suitable transformants. Examples of such resistance genes are well known to those of skill in the art.

2. Abbreviations

[0161] The following abbreviations are used throughout the application:

[0162] nt=nucleotide

[0163] nts=nucleotides

[0164] aa=amino acid(s)

[0165] kb=kilobase(s) or kilobase pair(s)

[0166] kDa=kilodalton(s)

[0167] d=day

[0168] h=hour

[0169] s=seconds

3. Construct System of the Invention

[0170] In accordance with the present invention, a construct system is provided for determining the translational efficiency or phenotypic preference of different synonymous codons. In its broadest form, the system comprises a plurality of synthetic constructs each of which is useful for interrogating the translational efficiency or phenotypic preference of a single codon ("interrogating codon"), wherein the interrogating codon of one construct is different from the interrogating codon of another. Thus, in order to compare the translational efficiency or phenotypic preference of different synonymous codons, it is generally desirable to use two or more synthetic constructs, suitably one for each synonymous codon that codes for a particular amino acid. For example, in the case of arginine, 6 synthetic constructs are necessary to determine the translational efficiency or phenotypic preference of all 6 synonymous codons for arginine (i.e. Arg.sup.CGA, Arg.sup.CGC, Arg.sup.CGT, Arg.sup.AGA, Arg.sup.AGG, Arg.sup.CGG). By contrast, only 2 synthetic constructs are required to determine the translational efficiency or phenotypic preference of both synonymous codons for phenylalanine (i.e., Phe.sup.TTT, Phe.sup.TTC) and so on. Accordingly, in order to interrogate the translational efficiency or phenotypic preference of a finite number of synonymous codons, a corresponding number of synthetic constructs will generally be required.

[0171] The synthetic constructs of the invention each comprise a regulatory sequence that is operably connected to a reporter polynucleotide, wherein the reporter polynucleotide of a respective construct encodes the same amino acid sequence as the reporter polynucleotide of another. In accordance with the present invention, individual reporter polynucleotides use the same interrogating codon to code for a particular amino acid at one or more positions of the amino acid sequence, wherein the interrogating codon of one reporter polynucleotide is different to but synonymous with the interrogating codon of another. In specific embodiments, the coding sequences of individual reporter polynucleotides comprise the same number of interrogating codons. Suitably, all codons in a respective coding sequence, which code for a particular amino acid, are the same interrogating codon. However, this is not necessary as it is possible to use fewer interrogating codons than the number of codons in a respective coding sequence, which code for the same amino acid as the interrogating codons. Nevertheless, the sensitivity of an individual synthetic construct in determining the translational efficiency or phenotypic preference of a corresponding interrogating codon is generally improved by incorporating more interrogating codons in the coding sequence.
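The recoding described above can be sketched as follows: every codon for a chosen amino acid in a parent coding sequence is replaced by a single interrogating codon, which leaves the encoded amino acid sequence unchanged. This is a minimal sketch under stated assumptions. The standard genetic code is assumed, the 18-nucleotide parent sequence is an invented toy rather than a real reporter gene, and the names `CODON_TABLE`, `recode` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(coding_seq: str) -> list[str]:
    """Split an in-frame coding sequence into triplets."""
    if len(coding_seq) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]


def translate(coding_seq: str) -> str:
    """Translate a coding sequence into one-letter amino acid codes."""
    return "".join(CODON_TABLE[c] for c in split_codons(coding_seq))


def recode(coding_seq: str, amino_acid: str, interrogating_codon: str) -> str:
    """Replace every codon for `amino_acid` with `interrogating_codon`."""
    if CODON_TABLE.get(interrogating_codon) != amino_acid:
        raise ValueError("interrogating codon does not encode the amino acid")
    return "".join(
        interrogating_codon if CODON_TABLE[c] == amino_acid else c
        for c in split_codons(coding_seq)
    )


# Toy parent sequence encoding Met-Phe-Arg-Phe-Arg-stop (illustrative only).
parent = "ATGTTTCGTTTCAGATAA"
phe_ttc = recode(parent, "F", "TTC")  # reporter for the Phe TTC construct
phe_ttt = recode(parent, "F", "TTT")  # reporter for the Phe TTT construct
```

Both recoded sequences translate to the same amino acid sequence as the parent and differ only at the phenylalanine codons, which is the property the paragraph above requires of the reporter polynucleotides.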

[0172] In some embodiments, the interrogating codon(s) in one coding sequence is (are) located at the same positions as the interrogating codons in another coding sequence. In other embodiments, the interrogating codon(s) of one coding sequence is (are) located at different positions relative to the interrogating codons in another coding sequence. For example, a first coding sequence and a second coding sequence may each contain 5 codons that code for a particular amino acid and only 3 of those are used as interrogating codons. In this non-limiting example, the first coding sequence may comprise the sequence:

[0173] X.sub.1 X.sub.2 X.sub.3 A.sub.1 X.sub.4 X.sub.5 B.sub.1 X.sub.6 X.sub.7 X.sub.8 A.sub.2 X.sub.9 A.sub.3 X.sub.10 X.sub.11 X.sub.12 B.sub.2 X.sub.13 X.sub.14

[0174] and the second coding sequence may comprise:

[0175] X.sub.1 X.sub.2 X.sub.3 A.sub.1 X.sub.4 X.sub.5 A.sub.2 X.sub.6 X.sub.7 X.sub.8 B.sub.1 X.sub.9 A.sub.3 X.sub.10 X.sub.11 X.sub.12 B.sub.2 X.sub.13 X.sub.14

[0176] wherein:

[0177] A.sub.1-3 represent the same interrogating codon;

[0178] B.sub.1-2 represent codons that code for the same amino acid as the interrogating codon; and

[0179] X.sub.1-14 represent codons that code for different amino acids than the amino acid coded for by A.sub.1-3 and B.sub.1-2.
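The two layouts above can be reproduced with a small symbolic sketch. It assumes a 19-codon sequence in which the amino acid of interest occurs at five positions (0-based positions 3, 6, 10, 12 and 16, read off the sequences above), and it labels each position A (interrogating codon), B (another codon for the same amino acid) or X (codon for a different amino acid). The function name `symbolic_layout` is hypothetical and subscripts are written as plain digits.

```python
def symbolic_layout(length, target_positions, interrogating_positions):
    """Label each codon position as A, B or X and number each label in order."""
    target = set(target_positions)
    interrogating = set(interrogating_positions)
    if not interrogating <= target:
        raise ValueError("interrogating positions must be target positions")
    counts = {"A": 0, "B": 0, "X": 0}
    tokens = []
    for pos in range(length):
        if pos in interrogating:
            kind = "A"   # interrogating codon
        elif pos in target:
            kind = "B"   # same amino acid, different codon
        else:
            kind = "X"   # codon for a different amino acid
        counts[kind] += 1
        tokens.append(f"{kind}{counts[kind]}")
    return " ".join(tokens)


# Positions (0-based) at which the amino acid of interest occurs.
TARGETS = [3, 6, 10, 12, 16]

first = symbolic_layout(19, TARGETS, [3, 10, 12])
second = symbolic_layout(19, TARGETS, [3, 6, 12])
```

Both layouts use three interrogating codons out of five candidate positions and differ only in where the second interrogating codon is placed.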

[0180] In some embodiments, the construct system comprises synthetic constructs for interrogating the translational efficiency or phenotypic preference of codons that code for two or more different amino acids. In illustrative examples of this type, the construct system comprises synthetic constructs for interrogating the translational efficiency or phenotypic preference of codons that code for 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 or 20 (suitably naturally occurring) amino acids. In specific embodiments, the construct system comprises 59 synthetic constructs for interrogating the translational efficiency or phenotypic preference of all naturally occurring codons for which there are two or more synonymous codons (e.g., Ala.sup.GCT, Ala.sup.GCC, Ala.sup.GCA, Ala.sup.GCG, Arg.sup.CGA, Arg.sup.CGC, Arg.sup.CGT, Arg.sup.AGA, Arg.sup.AGG, Arg.sup.CGG, Asn.sup.AAC, Asn.sup.AAT, Asp.sup.GAC, Asp.sup.GAT, Cys.sup.TGC, Cys.sup.TGT, Glu.sup.GAA, Glu.sup.GAG, Gln.sup.CAA, Gln.sup.CAG, Gly.sup.GGA, Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC, His.sup.CAC, His.sup.CAT, Ile.sup.ATC, Ile.sup.ATT, Ile.sup.ATA, Leu.sup.CTG, Leu.sup.CTC, Leu.sup.CTA, Leu.sup.CTT, Leu.sup.TTG, Leu.sup.TTA, Lys.sup.AAG, Lys.sup.AAA, Phe.sup.TTT, Phe.sup.TTC, Pro.sup.CCC, Pro.sup.CCT, Pro.sup.CCA, Pro.sup.CCG, Ser.sup.TCG, Ser.sup.TCT, Ser.sup.TCA, Ser.sup.TCC, Ser.sup.AGC, Ser.sup.AGT, Thr.sup.ACG, Thr.sup.ACC, Thr.sup.ACA, Thr.sup.ACT, Tyr.sup.TAC, Tyr.sup.TAT, Val.sup.GTG, Val.sup.GTC, Val.sup.GTT and Val.sup.GTA).
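The construct counts given above (6 for arginine, 2 for phenylalanine, 59 in total) follow directly from the standard genetic code, as the following sketch shows. It assumes the standard code and uses one-letter amino acid symbols; the names `CODON_TABLE` and `synonymous_sets` are hypothetical.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def synonymous_sets(table):
    """Group sense codons by amino acid, keeping amino acids with >= 2 codons."""
    groups = defaultdict(list)
    for codon, amino_acid in table.items():
        if amino_acid != "*":  # stop codons are not interrogated
            groups[amino_acid].append(codon)
    return {aa: sorted(cs) for aa, cs in groups.items() if len(cs) >= 2}


sets = synonymous_sets(CODON_TABLE)
constructs_for_arg = len(sets["R"])  # one construct per arginine codon
constructs_for_phe = len(sets["F"])  # one construct per phenylalanine codon
total_constructs = sum(len(codons) for codons in sets.values())
```

Methionine and tryptophan drop out because each has a single codon, which leaves 18 amino acids and 59 codons.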

[0181] In some embodiments in which the construct system is used for determining the translational efficiency of synonymous codons, the reporter polynucleotide encodes an amino acid sequence that defines, in whole or in part, a reporter protein that, when present in a cell, is detectable and distinguishable from other polypeptides present in the cell. A reporter protein may be a naturally occurring protein or a protein that is not naturally occurring. Illustrative examples of such reporter proteins include fluorescent proteins such as green fluorescent protein (gfp), cyan fluorescent protein (cfp), red fluorescent protein (rfp), or blue fluorescent protein (bfp), or derivatives of these proteins, or enzymatic proteins such as chloramphenicol acetyl transferase, .beta.-galactosidase, .beta.-glucuronidase (GUS), secreted placental alkaline phosphatase and .beta.-lactamase, chemiluminescent proteins such as luciferase, and selectable marker proteins including proteins encoded by antibiotic resistance genes (e.g., hygromycin resistance genes, neomycin resistance genes, tetracycline resistance genes, ampicillin resistance genes, kanamycin resistance genes, phleomycin resistance genes, herbicide resistance genes such as the bialophos resistance (BAR) gene that confers resistance to the herbicide BASTA, bleomycin resistance genes, geneticin resistance genes, carbenicillin resistance genes, chloramphenicol resistance genes, puromycin resistance genes, blasticidin-S-deaminase genes), heavy metal resistance genes, hisD genes, hypoxanthine phosphoribosyl transferase (HPRT) genes and guanine phosphoribosyl transferase (Gpt) genes.

[0182] In some embodiments in which the construct system is used for determining the phenotypic preference of synonymous codons, the reporter polynucleotide encodes an amino acid sequence that defines, in whole or in part, a reporter protein that confers upon an organism of interest or part thereof, either by itself or in association with other molecules, a selected phenotype or a phenotype of the same class as the selected phenotype. For example, the reporter protein may be a phenotype-associated polypeptide (e.g., a melanoma specific antigen such as BAGE or GAGE-1) that will be the subject of producing the selected phenotype (e.g., immunity to melanoma). Alternatively, the phenotype-associated polypeptide (e.g., green fluorescent protein or a gastrointestinal associated antigen such as 17-1A) may not produce the selected phenotype (e.g., immunity to melanoma) but may produce the same class of phenotype (e.g., an immune response) as the selected phenotype. In illustrative examples, the phenotype-associated polypeptide is selected from antigens including antigens from pathogenic organisms or cancers (e.g., wherein the phenotype is immunity to disease) and self antigens or transplantation antigens (e.g., wherein the phenotype is antigen-specific anergy or tolerance), growth factors (e.g., wherein the phenotype is selected from size of the organism or part, wound healing, cell proliferation, cell differentiation, cell migration, immune cell function), hormones (e.g., wherein the phenotype is increased lactation, e.g., using oxytocin, or amelioration of a diabetic state, e.g., using insulin) and toxins (e.g., wherein the phenotype is tumour regression or cell death). In specific embodiments, the selected phenotype or class of phenotype corresponds to a beneficial or improved or superior state or condition of the organism or part thereof relative to a reference state or condition.
In illustrative examples, the reference state or condition corresponds to a pathophysiological state. Phenotypes contemplated by the present invention include any desirable beneficial trait including, but not restricted to: immunity (e.g., immunity to pathogenic infection or cancer); antigen tolerance (e.g., antigen-specific T lymphocyte anergy, tolerance to allergens, transplantation antigens and self antigens); angiogenesis (e.g., blood vessel formation in the heart and vasculature and in tumour growths); anti-angiogenesis (e.g., treatment of ischaemic heart disease and tumours); amelioration of clinical symptoms (e.g., fever; inflammation; encephalitis; weight loss; anaemia; sensory symptoms such as paraesthesia or hypaesthesia; ataxia; neuralgia; paralysis; vertigo; urinary or bowel movement abnormalities; and cognitive dysfunction such as memory loss, impaired attention, problem-solving difficulties, slowed information processing, and difficulty in shifting between cognitive tasks); reduced or increased cell death (e.g., apoptosis); reduced or increased cell differentiation; reduced or increased cell proliferation; tumour or cancer regression; growth and repair of tissue or organ; decreased fibrosis; inhibition or reversal of cell senescence; increased or reduced cell migration; differential expression of protein between different cells or tissues of an organism or part thereof; trauma recovery; recovery from burns; antibiotic resistance or sensitivity (e.g., resistance or sensitivity to aminoglycosidic antibiotics such as geneticin and paromomycin); herbicide tolerance or sensitivity (e.g. tolerance or sensitivity to glyphosate or glufosinate); starch biosynthesis or modification (e.g. using a starch branching enzyme, starch synthases, ADP-glucose pyrophosphorylase); fatty acid biosynthesis (e.g.
using a desaturase or hydroxylase); disease resistance or tolerance (e.g., resistance to animal diseases such as cardiovascular disease, autoimmunity, Alzheimer's disease, Parkinson's disease, diabetes, AIDS etc or resistance to plant diseases such as rust, dwarfism, rot, smut, mould, scab and mildew); pest resistance or tolerance including insect resistance or tolerance (e.g., resistance to borers and worms); viral resistance or tolerance (e.g. resistance to animal viruses such as herpesviruses, hepadnaviruses, adenoviruses, flaviviruses, lentiviruses, poxviruses etc or resistance to plant viruses such as badnaviruses, caulimoviruses, potyviruses, luteoviruses, rhabdoviruses etc); fungal resistance or tolerance (e.g., resistance to arbuscular mycorrhizal fungi, endophytic fungi etc); a metabolic trait including sucrose metabolism (e.g., sucrose isomerisation); frost resistance or tolerance; stress tolerance (e.g., salt tolerance, drought tolerance); and improved food content or increased yields. Persons of skill in the art will recognise that the above exemplary classes of phenotype may be subdivided into phenotypic subclasses and that such subclasses would also fall within the scope of phenotypic classes contemplated by the present invention. For example, subclasses of immunity include innate immunity (which can be further subdivided inter alia into complement system, monocytes, macrophages, neutrophils and natural killer cells), cellular immunity (which can be further subdivided inter alia into cytolytic T lymphocytes, dendritic cells and T helper lymphocytes) and humoral immunity (which can be further subdivided inter alia into antibody subclasses IgA, IgD, IgE, IgG and IgM).

[0183] In some embodiments, the reporter polynucleotide of individual synthetic constructs further comprises an ancillary coding sequence that encodes a detectable tag (e.g., streptavidin, avidin, an antibody, an antigen, an epitope, a hapten, a protein, or a fluorescent, chemiluminescent or chemically reactive moiety). The detectable tag is suitably a member of a specific binding pair, which includes for example, antibody-antigen (or hapten) pairs, ligand-receptor pairs, enzyme-substrate pairs, biotin-avidin pairs, and the like. In illustrative examples of this type, the ancillary coding sequence of one reporter polynucleotide encodes a first tag (e.g., a first epitope to which a first antibody binds) and the ancillary coding sequence of another reporter polynucleotide encodes a second tag (e.g., a second epitope to which a second antibody binds), which is detectably distinguishable from the first tag. In these examples, it is possible to detectably distinguish the polypeptide products of different reporter polynucleotides in the same cell or organism of interest or part thereof, thereby permitting simultaneous determination of translational efficiencies or phenotypic preferences of different interrogating codons in the same cell or organism or part.

[0184] In accordance with the present invention, the reporter polynucleotide is operably linked in the synthetic constructs to a regulatory sequence. The regulatory sequence suitably comprises transcriptional and/or translational control sequences, which will be compatible for expression in the cell or organism of interest. Typically, the transcriptional and translational regulatory control sequences include, but are not limited to, a promoter sequence, a 5' non-coding region, a cis-regulatory region such as a functional binding site for transcriptional regulatory protein or translational regulatory protein, an upstream open reading frame, ribosomal-binding sequences, transcriptional start site, translational start site, and/or nucleotide sequence which encodes a leader sequence, termination codon, translational stop site and a 3' non-translated region. Constitutive or inducible promoters as known in the art are contemplated by the invention. The promoters may be either naturally occurring promoters, or hybrid promoters that combine elements of more than one promoter. Promoter sequences contemplated by the present invention may be native to the organism of interest or may be derived from an alternative source, where the region is functional in the chosen organism. The choice of promoter will differ depending on the intended host. 
For example, promoters which could be used for expression in plants include plant promoters such as: constitutive plant promoters, examples of which include CaMV35S plant promoter, CaMV19S plant promoter, FMV34S plant promoter, sugarcane bacilliform badnavirus plant promoter, CsVMV plant promoter, Arabidopsis ACT2/ACT8 actin plant promoter, Arabidopsis ubiquitin UBQ1 plant promoter, barley leaf thionin BTH6 plant promoter, and rice actin plant promoter; tissue specific plant promoters, examples of which include bean phaseolin storage protein plant promoter, DLEC plant promoter, PHSf3 plant promoter, zein storage protein plant promoter, conglutin gamma plant promoter from soybean, AT2S1 gene plant promoter, ACT11 actin plant promoter from Arabidopsis, napA plant promoter from Brassica napus and potato patatin gene plant promoter; and inducible plant promoters, examples of which include a light-inducible plant promoter derived from the pea rbcS gene, a plant promoter from the alfalfa rbcS gene, DRE, MYC and MYB plant promoters which are active in drought; INT, INPS, prxEa, Ha hsp17.7G4 and RD21 plant promoters active in high salinity and osmotic stress, and hsr203J and str246C plant promoters active in pathogenic stress. Alternatively, promoters which could be used for expression in mammals include the metallothionein promoter, which can be induced in response to heavy metals such as cadmium, and the .beta.-actin promoter, as well as viral promoters such as the SV40 large T antigen promoter, human cytomegalovirus (CMV) immediate early (IE) promoter, Rous sarcoma virus LTR promoter, adenovirus promoter, or an HPV promoter, particularly the HPV upstream regulatory region (URR). All these promoters are well described and readily available in the art.

[0185] The synthetic constructs of the present invention may also comprise a 3' non-translated sequence. A 3' non-translated sequence refers to that portion of a gene comprising a DNA segment that contains a polyadenylation signal and any other regulatory signals capable of effecting mRNA processing or gene expression. The polyadenylation signal is characterised by effecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. Polyadenylation signals are commonly recognised by the presence of homology to the canonical form 5'-AATAAA-3', although variations are not uncommon. The 3' non-translated regulatory DNA sequence preferably includes from about 50 to 1,000 nucleotide base pairs and may contain transcriptional and translational termination sequences in addition to a polyadenylation signal and any other regulatory signals capable of effecting mRNA processing or gene expression.
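Locating candidate polyadenylation signals in a 3' non-translated sequence can be sketched as a simple motif scan. This is a minimal sketch and not a method taken from this specification: it finds exact matches only, the function name `find_polya_signals` is hypothetical, and the test sequence is invented. The optional second motif, ATTAAA, is included as one commonly reported variant of the canonical hexamer; which variants to accept is an assumption left to the user.

```python
def find_polya_signals(sequence, motifs=("AATAAA",)):
    """Return sorted (position, motif) pairs for exact motif matches.

    Positions are 0-based. RNA input is accepted by mapping U to T.
    """
    seq = sequence.upper().replace("U", "T")
    hits = []
    for motif in motifs:
        start = seq.find(motif)
        while start != -1:
            hits.append((start, motif))
            start = seq.find(motif, start + 1)  # allow overlapping matches
    return sorted(hits)


# Invented 25-nt fragment containing one canonical and one variant hexamer.
utr = "GCTAGCAATAAAGGCATTAAATTGC"
canonical_only = find_polya_signals(utr)
with_variant = find_polya_signals(utr, motifs=("AATAAA", "ATTAAA"))
```

A scan of this kind only flags candidate hexamers; whether a given hexamer functions as a polyadenylation signal depends on its sequence context.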

[0186] In specific embodiments, the synthetic constructs further contain a selectable marker gene to permit selection of an organism or a precursor thereof that contains a synthetic construct. Selection genes are well known in the art and will be compatible for expression in the cell or organism of interest, or a progenitor or precursor thereof.

[0187] In some embodiments, the synthetic constructs of the invention are in the form of viral vectors, such as simian virus 40 (SV40) or bovine papilloma virus (BPV), which have the ability to replicate as extra-chromosomal elements (Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman ed., 1982; Sarver et al., 1981, Mol. Cell. Biol. 1:486). Viral vectors include retroviral (lentivirus), adeno-associated virus (see, e.g., Okada, 1996, Gene Ther. 3:957-964; Muzyczka, 1994, J. Clin. Invest. 94:1351; U.S. Pat. Nos. 6,156,303; 6,143,548; 5,952,221, describing AAV vectors; see also U.S. Pat. Nos. 6,004,799; 5,833,993), adenovirus (see, e.g., U.S. Pat. Nos. 6,140,087; 6,136,594; 6,133,028; 6,120,764), reovirus, herpesvirus, rotavirus genomes etc., modified for introducing and directing expression of a polynucleotide or transgene in cells. Retroviral vectors can include those based upon murine leukemia virus (see, e.g., U.S. Pat. No. 6,132,731), gibbon ape leukemia virus (see, e.g., U.S. Pat. No. 6,033,905), simian immuno-deficiency virus, human immuno-deficiency virus (see, e.g., U.S. Pat. No. 5,985,641), and combinations thereof.

[0188] Vectors also include those that efficiently deliver genes to animal cells in vivo (e.g., stem cells) (see, e.g., U.S. Pat. Nos. 5,821,235 and 5,786,340; Croyle et al., 1998, Gene Ther. 5:645; Croyle et al., 1998, Pharm. Res. 15:1348; Croyle et al., 1998, Hum. Gene Ther. 9:561; Foreman et al., 1998, Hum. Gene Ther. 9:1313; Wirtz et al., 1999, Gut 44:800). Adenoviral and adeno-associated viral vectors suitable for in vivo delivery are described, for example, in U.S. Pat. Nos. 5,700,470, 5,731,172 and 5,604,090. Additional vectors suitable for in vivo delivery include herpes simplex virus vectors (see, e.g., U.S. Pat. No. 5,501,979), retroviral vectors (see, e.g., U.S. Pat. Nos. 5,624,820, 5,693,508 and 5,674,703; and WO92/05266 and WO92/14829), bovine papilloma virus (BPV) vectors (see, e.g., U.S. Pat. No. 5,719,054), CMV-based vectors (see, e.g., U.S. Pat. No. 5,561,063) and parvovirus, rotavirus and Norwalk virus vectors. Lentiviral vectors are useful for infecting dividing as well as non-dividing cells (see, e.g., U.S. Pat. No. 6,013,516).

[0189] Vectors for insect cell expression commonly use recombinant variations of baculoviruses and other nucleopolyhedroviruses, e.g., Bombyx mori nucleopolyhedrovirus vectors (see, e.g., Choi, 2000, Arch. Virol. 145:171-177). For example, Lepidopteran and Coleopteran cells are used to replicate baculoviruses to promote expression of foreign genes carried by baculoviruses, e.g., Spodoptera frugiperda cells are infected with recombinant Autographa californica nuclear polyhedrosis viruses (AcNPV) carrying a heterologous, e.g., a human, coding sequence (see, e.g., Lee, 2000, J. Virol. 74:11873-11880; Wu, 2000, J. Biotechnol. 80:75-83). See, e.g., U.S. Pat. No. 6,143,565, describing use of the polydnavirus of the parasitic wasp Glyptapanteles indiensis to stably integrate nucleic acid into the genome of Lepidopteran and Coleopteran insect cell lines. See also, U.S. Pat. Nos. 6,130,074; 5,858,353; 5,004,687.

[0190] Expression vectors capable of expressing proteins in plants are well known in the art, and include, e.g., vectors from Agrobacterium spp., potato virus X (see, e.g., Angell, 1997, EMBO J. 16:3675-3684), tobacco mosaic virus (see, e.g., Casper, 1996, Gene 173:69-73), tomato bushy stunt virus (see, e.g., Hillman, 1989, Virology 169:42-50), tobacco etch virus (see, e.g., Dolja, 1997, Virology 234:243-252), bean golden mosaic virus (see, e.g., Morinaga, 1993, Microbiol Immunol. 37:471-476), cauliflower mosaic virus (see, e.g., Cecchini, 1997, Mol. Plant. Microbe Interact. 10:1094-1101), maize Ac/Ds transposable element (see, e.g., Rubin, 1997, Mol. Cell. Biol. 17:6294-6302; Kunze, 1996, Curr. Top. Microbiol. Immunol. 204:161-194), and the maize suppressor-mutator (Spm) transposable element (see, e.g., Schlappi, 1996, Plant Mol. Biol. 32:717-725); and derivatives thereof.

[0191] The invention further contemplates cells or organisms containing therein the synthetic constructs of the invention, or alternatively, parts, precursors, cells or tissues produced by the methods described herein. In this regard, it will be appreciated that the construct system of the present invention is applicable to prokaryotic as well as eukaryotic hosts and includes for example unicellular organisms and multicellular organisms, such as but not limited to yeast, plants and animals including vertebrate animals such as mammals, reptiles, fish, birds etc as well as invertebrate animals such as metazoa, sponges, worms, molluscs, nematodes, crustaceans, echinoderms etc. In certain embodiments, the construct system is used to determine the translational efficiency of different synonymous codons in plant cells or animal cells or to determine the phenotypic preference of different synonymous codons in plants and mammals.

[0192] Illustrative examples of eukaryotic organisms include, but are not limited to, fungi such as yeast and filamentous fungi, including species of Aspergillus, Trichoderma, and Neurospora; animal hosts including vertebrate animals illustrative examples of which include fish (e.g., salmon, trout, tilapia, tuna, carp, flounder, halibut, swordfish, cod and zebrafish), birds (e.g., chickens, ducks, quail, pheasants and turkeys, and other jungle fowl or game birds) and mammals (e.g., dogs, cats, horses, cows, buffalo, deer, sheep, rabbits, rodents such as mice, rats, hamsters and guinea pigs, goats, pigs, primates, marine mammals including dolphins and whales, as well as cell lines, such as human or other mammalian cell lines of any tissue or stem cell type (e.g., COS, NIH 3T3, CHO, BHK, 293, or HeLa cells), and stem cells, including pluripotent and non-pluripotent and embryonic stem cells, and non-human zygotes), as well as invertebrate animals illustrative examples of which include nematodes (representative genera of which include those that infect animals such as but not limited to Ancylostoma, Ascaridia, Ascaris, Bunostomum, Caenorhabditis, Capillaria, Chabertia, Cooperia, Dictyocaulus, Haemonchus, Heterakis, Nematodirus, Oesophagostomum, Ostertagia, Oxyuris, Parascaris, Strongylus, Toxascaris, Trichuris, Trichostrongylus, Trichonema, Toxocara, Uncinaria, and those that infect plants such as but not limited to Bursaphelenchus, Criconemella, Ditylenchus, Globodera, Helicotylenchus, Heterodera, Longidorus, Meloidogyne, Nacobbus, Paratylenchus, Pratylenchus, Radopholus, Rotylenchus, Tylenchus, and Xiphinema) and other worms, Drosophila, and other insects (such as from the families Apidae, Curculionidae, Scarabaeidae, Tephritidae, Tortricidae, amongst others, representative orders of which include Coleoptera, Diptera, Lepidoptera, and Homoptera).

[0193] In certain embodiments, the construct system is used to determine the translational efficiency or phenotypic preference of different synonymous codons in plants or plant cells (e.g., a plant that is suitably selected from monocotyledons, dicotyledons and gymnosperms). The plant may be an ornamental plant or crop plant. Illustrative examples of ornamental plants include, but are not limited to, Malus spp, Crataegus spp, Rosa spp., Betula spp, Sorbus spp, Olea spp, Nerium spp, Salix spp, Populus spp. Illustrative examples of crop plants include plant species which are cultivated in order to produce a harvestable product such as, but not limited to, Abelmoschus esculentus (okra), Acacia spp., Agave fourcroydes (henequen), Agave sisalana (sisal), Albizia spp., Allium fistulosum (bunching onion), Allium sativum (garlic), Allium spp. (onions), Alpinia galanga (greater galanga), Amaranthus caudatus, Amaranthus spp., Anacardium spp. (cashew), Ananas comosus (pineapple), Anethum graveolens (dill), Annona cherimola (cherimoya), Apios americana (American potatobean), Arachis hypogaea (peanut), Arctium spp. (burdock), Artemisia spp. (wormwood), Aspalathus linearis (redbush tea), Athertonia diversifolia, Atriplex nummularia (old man saltbush), Averrhoa carambola (starfruit), Azadirachta indica (neem), Backhousia spp., Bambusa spp. (bamboo), Beta vulgaris (sugar beet), Boehmeria nivea (ramie), bok choy, Boronia megastigma (sweet boronia), Brassica carinata (Abyssinian mustard), Brassica juncea (Indian mustard), Brassica napus (rapeseed), Brassica oleracea (cabbage, broccoli), Brassica oleracea var. alboglabra (gai lum), Brassica parachinensis (choi sum), Brassica pekinensis (Wong bok or Chinese cabbage), Brassica spp., Burcella obovata, Cajanus cajan (pigeon pea), Camellia sinensis (tea), Cannabis sativa (non-drug hemp), Capsicum spp., Carica spp.
(papaya), Carthamus tinctorius (safflower), Carum carvi (caraway), Cassinia spp., Castanospermum australe (blackbean), Casuarina cunninghamiana (beefwood), Ceratonia siliqua (carob), Chamaemelum nobile (chamomile), Chamelaucium spp. (Geraldton wax), Chenopodium quinoa (quinoa), Chrysanthemum (Tanacetum) cinerariifolium (pyrethrum), Cicer arietinum (chickpea), Cichorium intybus (chicory), Clematis spp., Clianthus formosus (Sturt's desert pea), Cocos nucifera (coconut), Coffea spp. (coffee), Colocasia esculenta (taro), Coriandrum sativum (coriander), Crambe abyssinica (crambe), Crocus sativus (saffron), Cucurbita foetidissima (buffalo gourd), Cucurbita spp. (gourd), Cyamopsis tetragonoloba (guar), Cymbopogon spp. (lemongrass), Cytisus proliferus (tagasaste), Daucus carota (carrot), Desmanthus spp., Dioscorea esculenta (Asiatic yam), Dioscorea spp. (yams), Diospyros spp. (persimmon), Doronicum sp., Echinacea spp., Eleocharis dulcis (water chestnut), Eleusine coracana (finger millet), Eragrostis tef (tef), Erianthus arundinaceus, Eriobotrya japonica (loquat), Eucalyptus spp., Eucalyptus spp. (oil mallee), Euclea spp., Eugenia malaccensis (jumba), Euphorbia spp., Euphoria longana (longan), Eutrema wasabi (wasabi), Fagopyrum esculentum (buckwheat), Festuca arundinacea (tall fescue), Ficus spp. (fig), Flacourtia inermis, Flindersia brayleyana (Queensland maple), Foeniculum olearia, Foeniculum vulgare (fennel), Garcinia mangostana (mangosteen), Glycine latifolia, Glycine max (soybean), Glycine max (vegetable soybean), Glycyrrhiza glabra (licorice), Gossypium spp. (cottons), Grevillea spp., Grindelia spp., Guizotia abyssinica (niger), Harpagophyllum sp., Helianthus annuus (high oleic sunflowers), Helianthus annuus (monosun sunflowers), Helianthus tuberosus (Jerusalem artichoke), Hibiscus cannabinus (kenaf), Hordeum bulbosum, Hordeum spp. (waxy barley), Hordeum vulgare (barley), Hordeum vulgare subsp.
spontaneum, Humulus lupulus (hops), Hydrastis canadensis (golden seal), Hymenachne spp., Hyssopus officinalis (hyssop), Indigofera spp., Inga edulis (ice cream bean), Inocarpus tugiter, Ipomoea batatas (sweet potato), Ipomoea sp. (kang kong), Lablab purpureus (white lablab), Lactuca spp. (lettuce), Lathyrus spp. (vetch), Lavandula spp. (lavender), Lens spp. (lentil), Lesquerella spp. (bladderpod), Leucaena spp., Lilium spp., Limnanthes spp. (meadowfoam), Linum usitatissimum (flax), Linum usitatissimum (linseed), Linum usitatissimum (Linola.TM.), Litchi chinensis (lychee), Lotus corniculatus (birdsfoot trefoil), Lotus pedunculatus, Lotus sp., Luffa spp., Lunaria annua (honesty), Lupinus mutabilis (pearl lupin), Lupinus spp. (lupin), Macadamia spp., Mangifera indica (mango), Manihot esculenta (cassaya), Medicago spp. (lucerne), Medicago spp., Melaleuca spp. (tea tree), Melaleuca uncinata (broombush), Mentha tasmannia, Mentha spicata (spearmint), Mentha X piperita (peppermint), Momordica charantia (bitter melon), Musa spp. (banana), Myrciaria cauliflora (jaboticaba), Myrothamnus flabellifolia, Nephelium lappaceum (rambutan), Nerine spp., Ocimum basilicum (basil), Oenanthe javanica (water dropwort), Oenothera biennis (evening primrose), Olea europaea (olive), Olearia sp., Origanum spp. (marjoram, oregano), Oryza spp. (rice), Oxalis tuberosa (oca), Ozothamnus spp. (rice flower), Pachyrrhizus ahipa (yam bean), Panax spp. (ginseng), Panicum miliaceum (common millet), Papaver spp. 
(poppy), Parthenium argentatum (guayule), Passiflora sp., Paulownia tomemtosa (princess tree), Pelargonium graveolens (rose geranium), Pelargonium sp., Pennisetum americanum (bulrush or pearl millet), Persoonia spp., Petroselinum crispum (parsley), Phacelia tanacetifolia (tansy), Phalaris canariensis (canary grass), Phalaris sp., Phaseolus coccineus (scarlet runner bean), Phaseolus lunatus (lima bean), Phaseolus spp., Phaseolus vulgaris (culinary bean), Phaseolus vulgaris (navy bean), Phaseolus vulgaris (red kidney bean), Pisum sativum (field pea), Plantago ovata (psyllium), Polygonum minus, Polygonum odoratum, Prunus mume (Japanese apricot), Psidium guajava (guava), Psophocarpus tetragonolobus (winged bean), Pyrus spp. (nashi), Raphanus satulus (long white radish or Daikon), Rhagodia spp. (saltbush), Ribes nigrum (black currant), Ricinus communis (castor bean), Rosmarinus officinalis (rosemary), Rungia klossii (rungia), Saccharum officinarum (sugar cane), Salvia officinalis (sage), Salvia sclarea (clary sage), Salvia sp., Sandersonia sp., Santalum acuminatum (sweet quandong), Santalum spp. (sandalwood), Sclerocarya caffra (macula), Scutellaria galericulata (scullcap), Secale cereale (rye), Sesamum indicum (sesame), Setaria italica (foxtail millet), Simmondsia spp. (jojoba), Solanum spp., Sorghum almum (sorghum), Stachys betonica (wood betony), Stenanthemum scortechenii, Strychnos cocculoides (monkey orange), Stylosanthes spp. (stylo), Syzygium spp., Tasmannia lanceolata (mountain pepper), Terminalia karnbachii, Theobroma cacao (cocoa), Thymus vulgaris (thyme), Toona australis (red cedar), Trifoliium spp. (clovers), Trifolium alexandrinum (berseem clover), Trifolium resupinatum (persian clover), Triticum spp., Triticum tauschii, Tylosema esculentum (morama bean), Valeriana sp. 
(valerian), Vernonia spp., Vetiver zizanioides (vetiver grass), Vicia benghalensis (purple vetch), Vicia faba (faba bean), Vicia narbonensis (narbon bean), Vicia sativa, Vicia spp., Vigna aconitifolia (mothbean), Vigna angularis (adzuki bean), Vigna mungo (black gram), Vigna radiata (mung bean), Vigna spp., Vigna unguiculata (cowpea), Vitis spp. (grapes), Voandzeia subterranea (bambarra groundnut), Triticosecale (triticale), Zea mays (bicolour sweetcorn), Zea mays (maize), Zea mays (sweet corn), Zea mays subsp. mexicana (teosinte), Zieria spp., Zingiber officinale (ginger), Zizania spp. (wild rice), Ziziphus jujuba (common jujube). Desirable crops for the practice of the present invention include Nicotiana tabacum (tobacco) and horticultural crops such as, for example, Ananas comosus (pineapple), Saccharum spp (sugar cane), Musa spp (banana), Lycopersicon esculentum (tomato) and Solanum tuberosum (potato).

[0194] The synthetic constructs of the present invention may be introduced directly ex vivo or in cell culture into a cell of interest or into an organism of interest or into one or more parts of an organism of interest, e.g., cell or tissue types (e.g., a muscle, skin, brain, lung, kidney, pancreas, a reproductive organ such as testes, ovaries and breast, eye, liver, heart, vascular cell, root, leaf, flower, stalk or meristem) or into an organ of an organism of interest. Alternatively, the synthetic constructs are introduced into a progenitor of a cell or organism of interest and the progenitor is then grown or cultured for a time and under conditions sufficient to differentiate into the cell of interest or produce the organism of interest, whereby the synthetic construct is contained in the cell of interest or one or more cell types of the organism of interest. Suitable progenitor cells include, but are not limited to, stem cells such as embryonic stem cells, pluripotential immune cells, meristematic cells and embryonic callus. In certain embodiments, the synthetic construct is introduced into the organism of interest using a particular route of administration (e.g., for mammals, by an oral, parenteral (e.g., intravenous, intramuscular, intraperitoneal, intraventricular, intraarticular), mucosal (e.g., intranasal, intrapulmonary, oral, buccal, sublingual, rectal, intravaginal) or dermal (e.g., topical, subcutaneous, transdermal) route; for plants, by administration to flowers, meristem, root, leaves or stalk). Practitioners in the art will recognise that the route of administration will differ depending on the choice of organism of interest and the sought-after phenotype. In some embodiments relating to determination of phenotypic preference, the synthetic constructs are suitably introduced into the same or corresponding site of the organism or part thereof. 
In other embodiments, the synthetic constructs are introduced into a cell of the organism of interest (e.g., autologous cells), or into a cell that is compatible with the organism of interest (e.g., syngeneic or allogeneic cells) and the genetically-modified cell so produced is introduced into the organism of interest at a selected site or into a part of that organism.

[0195] The synthetic constructs of the present invention may be introduced into a cell or organism of interest or part thereof using any suitable method, and the kind of method employed will differ depending on the intended cell type, part and/or organism of interest. For example, four general classes of methods for delivering nucleic acid molecules into cells have been described: (1) chemical methods such as calcium phosphate precipitation, polyethylene glycol (PEG)-mediated precipitation and lipofection; (2) physical methods such as microinjection, electroporation, acceleration methods and vacuum infiltration; (3) vector-based methods such as bacterial and viral vector-mediated transformation; and (4) receptor-mediated methods. Transformation techniques that fall within these and other classes are well known to workers in the art, and new techniques are continually becoming known. The particular choice of a transformation technology will be determined by its efficiency in transforming certain host species as well as the experience and preference of the person practising the invention with a particular methodology of choice. It will be apparent to the skilled person that the particular choice of a transformation system to introduce a synthetic construct of the invention into cells is not essential to or a limitation of the invention, provided it achieves an acceptable level of nucleic acid transfer. Thus, the synthetic constructs are introduced into tissues or host cells by any number of routes, including viral infection, phage infection, microinjection, electroporation, fusion of vesicles, lipofection, infection by Agrobacterium tumefaciens or A. rhizogenes, or protoplast fusion. Jet injection may also be used for intra-muscular administration (as described for example by Furth et al., 1992, Anal Biochem 205:365-368). 
The synthetic constructs may be coated onto microprojectiles, and delivered into a host cell or into tissue by a particle bombardment device, or "gene gun" (see, for example, Tang et al., 1992, Nature 356:152-154). Alternatively, the synthetic constructs can be fed directly to, or injected into, a host organism or it may be introduced into a cell (i.e., intracellularly) or introduced extracellularly into a cavity, interstitial space, into the circulation of an organism, introduced orally, etc. Methods for oral introduction include direct mixing of the synthetic constructs with food of the organism. In certain embodiments, a hydrodynamic nucleic acid administration protocol is employed (e.g., see Chang et al., 2001, J. Virol. 75:3469-3473; Liu et al., 1999, Gene Ther. 6:1258-1266; Wolff et al., 1990, Science 247:1465-1468; Zhang et al., 1999, Hum. Gene Ther. 10:1735-1737; and Zhang et al., 1999, Gene Ther. 7:1344-1349). Other methods of nucleic acid delivery include, but are not limited to, liposome-mediated transfer, naked DNA delivery (direct injection) and receptor-mediated transfer (ligand-DNA complex).

4. Methods of Determining the Translational Efficiency or Phenotypic Preference of Synonymous Codons

[0196] The construct system of the present invention can be used to compare the translational efficiency of different synonymous codons in cells of a particular type or to compare the translational efficiency of individual synonymous codons between different types of cells. Not wishing to be bound by any one particular theory or mode of operation, it is believed that the levels of reporter protein produced in a cell of interest from individual synthetic constructs are sensitive to the intracellular abundance of the iso-tRNA species corresponding to the interrogating codon(s) in the corresponding coding sequences and, therefore, provide a direct correlation of a cell's preference for or efficiency in translating a given codon. This means, for example, that if the level of the reporter protein obtained in a cell of the same type as a cell of interest, to which a synthetic construct having at least one first interrogating codon is provided, is higher than the level produced in a cell of the same type as the cell of interest, to which another synthetic construct having at least one second interrogating codon is provided (i.e., wherein the first interrogating codon(s) is (are) different from, but synonymous with, the second interrogating codon(s)), then it can be deduced that the first interrogating codon has a higher translational efficiency than the second interrogating codon in the cell of interest. Methods for measuring reporter protein levels are well-known in the art and include, but are not limited to, immunoassays such as Western blotting, ELISA, and RIA assays, chemiluminescent protein assays such as luciferase assays, enzymatic assays such as assays that measure β-galactosidase or chloramphenicol acetyl transferase (CAT) activity as well as fluorometric assays that measure fluorescence associated with a fluorescent protein. In some embodiments, the different synthetic constructs are separately introduced into different cells. 
In other embodiments, the different synthetic constructs are introduced into the same cell (e.g., when the reporter polynucleotides comprise ancillary coding sequences that encode a tag, as described herein).

[0197] With regard to differential expression of the reporter polynucleotide between different cell types, it will be appreciated that if the level of the reporter protein obtained in a first cell type to which a synthetic construct having at least one interrogating codon is provided is higher than the level obtained in a second cell type to which the same synthetic construct is provided, then it can be deduced that the interrogating codon has a higher translational efficiency in the first cell type than in the second cell type.

[0198] The translational efficiencies of different synonymous codons so determined are then typically compared to provide a ranked order of individual synonymous codons according to their preference for translation in the cell or cells of interest. One of ordinary skill in the art will thereby be able to determine a "codon translational efficiency table" for each amino acid. Comparison of synonymous codons within a codon translational efficiency table can then be used to identify codons for tailoring a synthetic polynucleotide to modulate the level of an encoded polypeptide that is expressed in a cell type of interest or to differentially express an encoded polypeptide between different cell types.
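
By way of illustration only, the ranking step described above can be sketched in Python. The reporter readings, the dictionary layout and the function name build_efficiency_table are hypothetical and are not taken from the disclosed construct system; the sketch merely averages replicate reporter levels for each interrogating codon and orders the synonymous codons for each amino acid from highest to lowest level.

```python
from collections import defaultdict

# Hypothetical reporter readings (arbitrary units), one list of replicates
# per interrogating codon. The values are invented for illustration only.
reporter_levels = {
    ("Ala", "GCT"): [120.0, 130.0, 125.0],
    ("Ala", "GCC"): [80.0, 90.0, 85.0],
    ("Ala", "GCA"): [40.0, 50.0, 45.0],
    ("Ala", "GCG"): [60.0, 60.0, 60.0],
}


def build_efficiency_table(levels):
    """Return {amino_acid: [(codon, mean_level), ...]}, ranked high to low."""
    table = defaultdict(list)
    for (amino_acid, codon), readings in levels.items():
        mean_level = sum(readings) / len(readings)
        table[amino_acid].append((codon, mean_level))
    for ranked in table.values():
        # Highest mean reporter level first, i.e. most preferred codon first.
        ranked.sort(key=lambda pair: pair[1], reverse=True)
    return dict(table)


efficiency_table = build_efficiency_table(reporter_levels)
for codon, mean_level in efficiency_table["Ala"]:
    print(codon, mean_level)
```

In practice the readings would come from one of the reporter assays mentioned above, and any normalisation (for example, to transfection efficiency) would need to be applied before ranking.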

[0199] In other embodiments, the construct system is used to compare the preference of different synonymous codons for producing a selected phenotype in an organism of interest or part thereof (i.e., "phenotypic preference"). In these embodiments, the synthetic constructs are used to determine the influence of the interrogating codon(s) on the phenotype or class of phenotype displayed by the organism or part in response to the phenotype-associated protein produced by those synthetic constructs. This means, for example, that if the quality of the phenotype displayed by the organism or part to which a synthetic construct having at least one first interrogating codon is provided is higher than the quality of the phenotype displayed by the organism or part to which a synthetic construct having at least one second interrogating codon is provided (i.e., wherein the first interrogating codon is different than, but synonymous with, the second interrogating codon), then it can be deduced that the organism of interest or part thereof has a higher preference for the first interrogating codon than the second interrogating codon with respect to the quality of the phenotype produced. Put another way, the first interrogating codon has a higher phenotypic preference than the second interrogating codon in the organism of interest or part thereof.

[0200] In accordance with the present invention, individual synthetic constructs are introduced into test organisms which are preferably selected from organisms of the same species as the organism of interest or organisms that are related to the organism of interest, or into test parts of such organisms. Related organisms are generally species within the same phylum, preferably species within the same subphylum, more preferably species within the same superclass, even more preferably species within the same class, even more preferably species within the same order and still even more preferably species within the same genus. For example, if the organism of interest is human, a related species is suitably selected from mouse, cow, dog or cat, which belong to the same class as human, or a chimpanzee, which belongs to the same order as human. Alternatively, if the organism of interest is banana, the related organism may be selected from taro, ginger, onions, garlic, pineapple, bromeliads, palms, orchids, lilies, irises and the like, which are all non-graminaceous monocotyledonous plants and which constitute horticultural or botanical relatives.

[0201] After introduction of the synthetic constructs into the test organisms or parts, the qualities of their phenotypes are determined by a suitable assay and then compared to determine the relative phenotypic preferences of the synonymous codons. The quality is suitably a measure of the strength, intensity or grade of the phenotype, or the relative strength, intensity or grade of two or more desired phenotypic traits. Assays for various phenotypes conferred by the production of a chosen reporter protein are known by those of skill in the art. For example, immunity may be assayed by any suitable method that detects an increase in an animal's capacity to respond to foreign or disease-specific antigens (e.g., cancer antigens), i.e., those cells primed to attack such antigens are increased in number, activity, and ability to detect and destroy those antigens. Strength of immune response is measured by standard tests including: direct measurement of peripheral blood lymphocytes by means known to the art; natural killer cell cytotoxicity assays (see, e.g., Provinciali et al., 1992, J. Immunol. Meth. 155: 19-24); cell proliferation assays (see, e.g., Vollenweider and Groscurth, 1992, J. Immunol. Meth. 149: 133-135); immunoassays of immune cells and subsets (see, e.g., Loeffler et al., 1992, Cytom. 13: 169-174; Rivoltini et al., 1992, Cancer Immunol. Immunother. 34: 241-251); or skin tests for cell-mediated immunity (see, e.g., Chang et al., 1993, Cancer Res. 53: 1043-1050). 
Enhanced immune response is also indicated by physical manifestations such as fever and inflammation, as well as healing of systemic and local infections, and reduction of symptoms in disease, i.e., decrease in tumour size, alleviation of symptoms of a disease or condition including, but not restricted to, leprosy, tuberculosis, malaria, aphthous ulcers, herpetic and papillomatous warts, gingivitis, atherosclerosis, the concomitants of AIDS such as Kaposi's sarcoma, bronchial infections, and the like. Such physical manifestations may also be used to detect, or define the quality of, the phenotype or class of phenotype displayed by an organism. Alternatively, herbicide tolerance may be assayed by treating test organisms (e.g., plants such as cotton plants), which express a herbicide tolerance gene (e.g., glyphosate tolerance protein gene such as a glyphosate resistant EPSP synthase), with a herbicide (e.g., glyphosate) and determining the efficacy of herbicide tolerance displayed by the plants. For example, when determining the efficacy of synthetic constructs for conferring herbicide tolerance in cotton, the amount of boll retention is a measure of efficacy and is a desirable trait.

[0202] The qualities of the selected phenotype displayed by the test organisms or by the test parts are then compared to provide a ranked order of the individual synonymous codons according to their preference of usage by the organism or part to confer the selected phenotype. One of ordinary skill in the art will thereby be able to determine a "codon preference table" for each amino acid in the polypeptide whose expression conveys the selected phenotype to the organism of interest. Comparison of synonymous codons within a codon preference table can then be used to identify codons for tailoring a synthetic polynucleotide to modulate the quality of a selected phenotype.

5. Codon Modification of Polynucleotides

[0203] The construct system of the present invention can thus be used to provide a comparison of translational efficiencies for synonymous codons in a cell of interest or a comparison of phenotypic preferences for synonymous codons in an organism of interest or in a related organism, or in parts thereof. These comparisons can then be used as a basis for constructing a synthetic or `codon modified` polynucleotide which differs from a parent or reference polynucleotide by the substitution of at least one `replaceable` codon (also referred to herein as "a first codon") in the parent polynucleotide with a synonymous codon that has a different translational efficiency or different phenotypic preference than the replaceable codon.

[0204] 5.1 Modifications Based on Synonymous Codons with Different Translational Efficiencies

[0205] In some embodiments, the synthetic polynucleotide is constructed so that it produces an encoded polypeptide in a cell of interest at a different level than that produced from a parent polynucleotide. The method comprises selecting a replaceable codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a different translational efficiency than the replaceable codon in a comparison of translational efficiencies in the cell of interest, as determined, for example, in Section 4. The replaceable codon is then replaced with the synonymous codon to construct the synthetic polynucleotide.

[0206] Synonymous codons can thus be selected to increase or decrease the level of polypeptide that is produced in a cell of interest. For example, when it is desired to increase the level of polypeptide that is produced in the cell, it is generally desirable to use a synonymous codon whose translational efficiency is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher or at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 50 or 100 times higher than the translational efficiency of the replaceable codon. Alternatively, when it is desired to decrease the level of polypeptide that is produced in the cell, it is generally desirable to use a synonymous codon whose translational efficiency is no more than 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of the translational efficiency of the replaceable codon.
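
The threshold-based selection described above can be illustrated with a minimal Python sketch. The efficiency values and the function names pick_higher and pick_lower are hypothetical; the sketch assumes that a codon translational efficiency table has already been obtained as described in Section 4 and simply applies an "at least about X% higher" or "no more than Y% of" criterion to it.

```python
# Hypothetical relative translational efficiencies of the alanine codons in a
# cell of interest (invented values, arbitrary units).
ala_efficiency = {"GCT": 125.0, "GCC": 85.0, "GCG": 60.0, "GCA": 45.0}


def pick_higher(replaceable, efficiencies, min_increase=0.10):
    """Return the most efficient synonymous codon whose efficiency is at least
    `min_increase` (as a fraction, 0.10 = 10%) higher than that of the
    replaceable codon, or None if no synonymous codon qualifies."""
    threshold = efficiencies[replaceable] * (1.0 + min_increase)
    candidates = [
        codon
        for codon, eff in efficiencies.items()
        if codon != replaceable and eff >= threshold
    ]
    return max(candidates, key=efficiencies.get) if candidates else None


def pick_lower(replaceable, efficiencies, max_fraction=0.95):
    """Return the least efficient synonymous codon whose efficiency is no more
    than `max_fraction` of that of the replaceable codon, or None."""
    limit = efficiencies[replaceable] * max_fraction
    candidates = [
        codon
        for codon, eff in efficiencies.items()
        if codon != replaceable and eff <= limit
    ]
    return min(candidates, key=efficiencies.get) if candidates else None


print(pick_higher("GCA", ala_efficiency))                   # raise expression
print(pick_lower("GCT", ala_efficiency, max_fraction=0.5))  # lower expression
```

Returning the extreme candidate is only one possible design choice; any synonymous codon that satisfies the chosen threshold could equally be used.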

[0207] Generally, the difference in level of polypeptide produced in the cell from a synthetic polynucleotide as compared to that produced from a parent polynucleotide depends on the number of replaceable codons that are replaced by synonymous codons, and on the difference in translational efficiencies between the replaceable codons and the synonymous codons in the cell of interest. Put another way, the fewer such replacements, and/or the smaller the difference in translational efficiencies between the synonymous and replaceable codons, the smaller the difference will be in protein production between the synthetic polynucleotide and parent polynucleotide. Conversely, the more such replacements, and/or the greater the difference in translational efficiencies between the synonymous and replaceable codons, the greater the difference will be in protein production between the synthetic polynucleotide and parent polynucleotide.

[0208] Accordingly, when it is desired to increase or decrease the level of polypeptide produced in the cell of interest, it is generally desirable but not necessary to replace all the replaceable codons of the parent polynucleotide with synonymous codons having higher or lower translational efficiencies in the cell of interest, as the case may be, than the replaceable codons. Changes in expression can be accomplished even with partial replacement. Typically, the replacement step affects at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or more of the replaceable codons of the parent polynucleotide. Suitably, the number of, and difference in translational efficiency between, the replaceable codons and the synonymous codons are selected such that the chosen polypeptide is produced from the synthetic polynucleotide in the cell at a level which is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher than, or even at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 50 or 100 times higher than, or no more than 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of, the level at which the polypeptide is produced from the parent polynucleotide in the cell. In the case of two or more synonymous codons having similar translational efficiencies, it will be appreciated that any one of these codons can be used to replace the replaceable codon. Generally, if a parent polynucleotide has a choice of low and intermediate translational efficiency codons, it is preferable in the first instance to replace some, or more preferably all, of the low translational efficiency codons with synonymous codons having intermediate, or preferably high, translational efficiencies when higher production of polypeptide is required. 
Typically, replacement of low with intermediate or high translational efficiency codons results in a substantial increase in the level of polypeptide produced by the synthetic polynucleotide so constructed. However, it is also preferable to replace some, or preferably all, of the intermediate translational efficiency codons with high translational efficiency codons for conferring an optimal production of the encoded polypeptide.
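
As a non-limiting sketch of full and partial replacement, the following Python code substitutes a chosen fraction of the replaceable codons of a parent coding sequence with a preferred synonymous codon. The parent sequence, the preferred-codon mapping and the function name codon_modify are hypothetical, the standard genetic code is assumed, and the rule of replacing the first replaceable codons in 5' to 3' order is an arbitrary choice made only to keep the example deterministic.

```python
import math
from itertools import product

# Standard genetic code, built in the conventional TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(seq):
    """Split an in-frame coding sequence into a list of codons."""
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]


def translate(seq):
    """Translate an in-frame coding sequence (one-letter code, '*' = stop)."""
    return "".join(CODON_TO_AA[codon] for codon in split_codons(seq))


def codon_modify(parent, preferred, fraction=1.0):
    """Replace `fraction` of the replaceable codons in `parent`.

    `preferred` maps a one-letter amino acid to the synonymous codon selected
    for it; a codon is replaceable if it encodes such an amino acid and is
    not already the selected codon.
    """
    codons = split_codons(parent)
    replaceable = [
        i for i, codon in enumerate(codons)
        if preferred.get(CODON_TO_AA[codon], codon) != codon
    ]
    n_replace = math.ceil(fraction * len(replaceable))
    for i in replaceable[:n_replace]:
        codons[i] = preferred[CODON_TO_AA[codons[i]]]
    return "".join(codons)


parent = "ATGGCAGCGGCAGCCTAA"   # hypothetical parent: Met-Ala-Ala-Ala-Ala-stop
preferred = {"A": "GCT"}        # hypothetical high-efficiency alanine codon
half = codon_modify(parent, preferred, fraction=0.5)
full = codon_modify(parent, preferred)
print(half, full)
```

Because only synonymous substitutions are made, the encoded polypeptide is unchanged, which can be checked by comparing translate(parent) with translate(full).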

[0209] 5.2 Modifications Based on Synonymous Codons with Different Phenotypic Preferences

[0210] In other embodiments, the synthetic polynucleotide is constructed so that its expression in the organism or part confers a selected phenotype upon that organism or part but in a different quality than that conferred by a parent polynucleotide that encodes the same polypeptide. The method comprises selecting a replaceable codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a different phenotypic preference than the first codon in a comparison of phenotypic preferences in the organism of interest or in a related organism, or in a part thereof, as determined in Section 4. The replaceable codon is then replaced with the synonymous codon to construct the synthetic polynucleotide.

[0211] Thus, a parent polynucleotide can be modified with synonymous codons such that quality of the selected phenotype conferred by the polynucleotide so modified (synthetic polynucleotide) is higher than from the parent polynucleotide. Generally, the difference between the respective phenotypic qualities conferred by a synthetic polynucleotide and by a parent polynucleotide depends on the number of first codons that are replaced by synonymous codons, and on the difference in phenotypic preference between the first codons and the synonymous codons in the organism of interest or part thereof. Put another way, the fewer such replacements, and/or the smaller the difference in phenotypic preference between the synonymous and first codons, the smaller the difference will be in the phenotypic quality between the synthetic and parent polynucleotides. Conversely, the more such replacements, and/or the greater the difference in phenotypic preference between the synonymous and first codons, the greater the difference will be in the phenotypic quality between the synthetic and parent polynucleotides.

[0212] In some embodiments in which a higher quality of a selected phenotype is required to be displayed by an organism of interest or part thereof, a replaceable codon of the parent polynucleotide is suitably selected for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a higher phenotypic preference than the replaceable codon in a comparison of phenotypic preferences in the organism of interest or in a related organism, or in a part thereof. Generally, a higher phenotypic preference will correlate with a higher quality of the selected phenotype. Thus, in a non-limiting example of such a correlation, a synonymous codon is deemed to have at least about a 10% higher phenotypic preference than a replaceable codon when the quality of phenotype displayed by an organism or part thereof to which a synthetic construct comprising the synonymous codon as the interrogating codon has been provided is at least about 10% higher than the quality of phenotype displayed by an organism or part thereof to which a synthetic construct comprising the replaceable codon as the interrogating codon has been provided. When it is desired to increase the quality of a phenotype, it is generally desirable to use a synonymous codon whose phenotypic preference (i.e., preference for conferring that phenotype upon the organism or part) is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher or at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 50 or 100 times higher than the phenotypic preference of the replaceable codon. In the case of two or more synonymous codons having similar phenotypic preferences, it will be appreciated that any one of these codons can be used to replace the first codon. 
Generally, if a parent polynucleotide has a choice of low and intermediate phenotypic preference codons, it is preferable in the first instance to replace some, or more preferably all, of the low phenotypic preference codons with synonymous codons having intermediate, or preferably high, phenotypic preferences. Typically, replacement of low with intermediate or high phenotypic preference codons results in a substantial increase in the quality of the phenotype conferred by the synthetic polynucleotide so constructed. However, it is also preferable to replace some, or preferably all, of the intermediate phenotypic preference codons with high phenotypic preference codons for conferring an optimal quality in the selected phenotype.

[0213] In some embodiments in which a lower quality of a selected phenotype is required to be displayed by an organism of interest or part thereof, a replaceable codon of the parent polynucleotide is selected for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a lower phenotypic preference than the replaceable codon in a comparison of phenotypic preferences in the organism of interest or in a related organism or in a part thereof, as determined for example according to the method described in Section 4. A lower phenotypic preference will typically correlate with a lower quality of the selected phenotype. Accordingly, in a non-limiting example of such a correlation, a synonymous codon is deemed to have at least about a 10% lower phenotypic preference than a replaceable codon when the quality of phenotype displayed by an organism or part thereof to which a synthetic construct comprising the synonymous codon as the interrogating codon has been provided is at least about 10% lower than the quality of phenotype displayed by an organism or part thereof to which a synthetic construct comprising the replaceable codon as the interrogating codon has been provided. When selecting the synonymous codon for this embodiment, it is preferred that it has a phenotypic preference in the organism of interest that is no more than about 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of the phenotypic preference of the replaceable codon.

[0214] It is preferable but not necessary to replace all the replaceable codons of the parent polynucleotide with synonymous codons having higher or lower phenotypic preference in the organism of interest or part thereof than the replaceable codons. For example, a higher or lower phenotypic quality can be accomplished even with partial replacement. Typically, the replacement step affects at least about 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%, 99% or more of the replaceable codons of the parent polynucleotide. In some embodiments requiring a higher phenotypic quality, the number of, and difference in phenotypic preference between, the replaceable codons and the synonymous codons are selected such that the phenotype-associated polypeptide is produced from the synthetic polynucleotide to confer a phenotype upon a chosen organism or organism part in a quality that is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher, or even at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 50 or 100 times higher than the quality of phenotype conferred by the parent polynucleotide in the organism or part. Conversely, in some embodiments requiring a lower phenotypic quality, the number of, and difference in phenotypic preference between, the replaceable codons and the synonymous codons are selected such that the phenotype-associated polypeptide is produced from the synthetic polynucleotide to confer a phenotype upon a chosen organism or part thereof in a quality that is no more than about 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of the quality of phenotype conferred by the parent polynucleotide in the organism or part.

[0215] 5.3 Construction of Synthetic Polynucleotides

[0216] Replacement of one codon for another can be achieved using standard methods known in the art. For example, codon modification of a parent polynucleotide can be effected using several known mutagenesis techniques including, for example, oligonucleotide-directed mutagenesis, mutagenesis with degenerate oligonucleotides, and region-specific mutagenesis. Exemplary in vitro mutagenesis techniques are described for example in U.S. Pat. Nos. 4,184,917, 4,321,365 and 4,351,901 or in the relevant sections of Ausubel, et al. (CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, Inc. 1997) and of Sambrook, et al., (MOLECULAR CLONING. A LABORATORY MANUAL, Cold Spring Harbor Press, 1989). Instead of in vitro mutagenesis, the synthetic polynucleotide can be synthesised de novo using readily available machinery as described, for example, in U.S. Pat. No. 4,293,652. However, it should be noted that the present invention is not dependent on, and not directed to, any one particular technique for constructing the synthetic polynucleotide.

[0217] The parent polynucleotide is suitably a natural gene. However, the parent polynucleotide may also be one that is not naturally occurring but has been engineered using recombinant techniques. Parent polynucleotides can be obtained from any suitable source, such as from eukaryotic or prokaryotic organisms, including but not limited to mammals or other animals, and pathogenic organisms such as yeasts, bacteria, protozoa and viruses.

6. Immune Response Preference Ranking of Codons in Mammals

[0218] The construct system of the present invention has been used to experimentally determine a ranking of individual synonymous codons according to their preference for producing an immune response, including a humoral immune response, to an antigen in a mammal. Accordingly, the present invention provides for the first time an immune response preference ranking of individual synonymous codons in mammals. This ranking was determined using a construct system that comprises a series of reporter constructs each comprising a different coding sequence for an antigenic polypeptide (e.g., a papillomavirus E7 polypeptide), wherein the coding sequence of individual constructs is distinguished from a parent (e.g., wild-type) coding sequence that encodes the antigenic polypeptide by the substitution of a single species of iso-accepting codon for other species of iso-accepting codon that are present in the parent coding sequence. Accordingly, the coding sequences of individual synthetic constructs use the same "interrogating" iso-accepting codon to encode at least 1, generally at least 2, usually at least 3 instances, typically most instances and preferably every instance of a particular amino acid residue in the antigenic polypeptide, and individual synthetic constructs differ in the species of interrogating iso-accepting codon used to encode a particular amino acid residue at one or more different positions in the polypeptide sequence. For example, in an antigenic polypeptide containing several alanine residues, the coding sequence of a synthetic construct in the construct system of the present invention may comprise Ala.sup.GCT as the interrogating codon for each encoded alanine residue, whereas the coding sequence of another construct may comprise Ala.sup.GCC as the interrogating codon for each encoded alanine residue, and so on. 
An illustrative synthetic construct system is described in Example 1, which covers the entire set of synonymous codons that code for amino acids.
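The alanine example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure of Example 1: the function names and the toy parent sequence are hypothetical, and only the alanine codon family is handled.

```python
# Hypothetical sketch: build one coding sequence per interrogating codon.
# Only the four alanine codons are covered; the parent sequence is a toy example.
ALA_CODONS = ["GCT", "GCC", "GCA", "GCG"]


def split_codons(cds):
    """Split a coding sequence into consecutive triplets."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def build_constructs(parent_cds, iso_codons):
    """Return {interrogating codon: coding sequence}.

    In each returned sequence, every codon of the parent that belongs to
    the iso-accepting family is replaced by the interrogating codon, so
    the encoded amino acid sequence is unchanged.
    """
    codons = split_codons(parent_cds.upper())
    family = set(iso_codons)
    return {
        query: "".join(query if c in family else c for c in codons)
        for query in iso_codons
    }


# Toy parent: Met-Ala-Asp-Ala-Ala-stop
parent = "ATGGCAGATGCGGCCTAA"
constructs = build_constructs(parent, ALA_CODONS)
for codon, seq in constructs.items():
    print(codon, seq)
```

Each entry of `constructs` corresponds to one synthetic construct of the series; repeating the call for the codon family of each other amino acid would give the remaining constructs.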

[0219] In order to determine the immune response preference of different codons, test mammals (e.g., mice) are immunized with the synthetic construct system in which individual mammals are immunized with a different synthetic construct and the host immune response (e.g., humoral immune response or a cellular immune response) to the antigenic polypeptide is determined for each construct. In accordance with the present invention, the strength of immune response obtained from individual synthetic constructs provides a direct correlation to the immune preference of a corresponding interrogating codon in a test mammal. Accordingly, the stronger the immune response produced from a given construct in a test mammal, the higher the immune preference will be of the corresponding interrogating codon.
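As a minimal sketch of how response strength could be turned into a preference ranking, the Python below averages per-animal readings for each construct and sorts the interrogating codons from strongest to weakest. The readings are invented placeholder numbers in arbitrary units, not data from this application, and the use of a simple mean is an assumption.

```python
from statistics import mean


def rank_codons(responses):
    """Rank interrogating codons by mean response, strongest first.

    responses: {codon: [one reading per immunized animal]}
    Returns a list of (codon, mean reading) tuples.
    """
    means = {codon: mean(values) for codon, values in responses.items()}
    return sorted(means.items(), key=lambda item: item[1], reverse=True)


# Placeholder readings (arbitrary units), for illustration only.
readings = {
    "GCA": [4, 5, 6],
    "GCC": [9, 10, 11],
    "GCG": [3, 4, 5],
    "GCT": [12, 14, 13],
}
ranking = rank_codons(readings)
print([codon for codon, _ in ranking])
```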

[0220] In an illustrative example, comparison of the immune response preferences determined according to Example 1 with the translational efficiencies derived from codon usage frequency values for mammalian cells in general as determined by Seed (see U.S. Pat. Nos. 5,786,464 and 5,795,737) reveals several differences in the ranking of codons. For convenience, these differences are highlighted in TABLE 9, in which Seed `preferred` codons are highlighted with a blue background, Seed `less preferred` codons are highlighted with a green background, and Seed `non preferred` codons are highlighted with a grey background.

TABLE 9

aa | Preferential codon usage as predicted by Seed for mammalian cells in general | Experimentally determined codon immune response preferences in test mammals
Ala | GCC >> (GCG, GCT, GCA) | GCT > GCC > (GCA, GCG)
Arg | CGC >> (CGA, CGT, AGA, AGG, CGG) | (CGA, CGC, CGT, AGA) > (AGG, CGG)
Asn | AAC >> AAT | AAC > AAT
Asp | GAC >> GAT | GAC > GAT
Cys | TGC >> TGT | TGC > TGT
Glu | (GAA, GAG) | GAA > GAG
Gln | CAG >> CAA | CAA = CAG
Gly | GGC > GGG > (GGT, GGA) | GGA > (GGG, GGT, GGC)
His | CAC >> CAT | CAC = CAT
Ile | ATC > ATT > ATA | ATC >> ATT > ATA
Leu | CTG > CTC > (TTA, CTA, CTT, TTG) | (CTG, CTC) > (CTA, CTT) >> TTG > TTA
Lys | AAG >> AAA | AAG = AAA
Phe | TTC >> TTT | TTT > TTC
Pro | CCC >> (CCG, CCA, CCT) | CCC > CCT >> (CCA, CCG)
Ser | AGC > TCC > (TCG, AGT, TCA, TCT) | TCG >> (TCT, TCA, TCC) >> (AGC, AGT)
Thr | ACC >> (ACG, ACA, ACT) | ACG > ACC >> ACA > ACT
Tyr | TAC >> TAT | TAC > TAT
Val | GTG > GTC > (GTA, GTT) | (GTG, GTC) > GTT > GTA

[0221] As will be apparent from the above table:

[0222] (i) several codons deemed by Seed to have a higher codon usage ranking in mammalian cells than at least one other synonymous codon have in fact a lower immune response preference ranking than the or each other synonymous codon (e.g., Ala.sup.GCC has a higher codon usage ranking but lower immune response preference ranking than Ala.sup.GCT; Gly.sup.GGC has a higher codon usage ranking but lower immune response preference ranking than Gly.sup.GGA; Phe.sup.TTC has a higher codon usage ranking but lower immune response preference ranking than Phe.sup.TTT; Ser.sup.AGC has a higher codon usage ranking but lower immune response preference ranking than any one of Ser.sup.TCG, Ser.sup.TCT, Ser.sup.TCA and Ser.sup.TCC; and Thr.sup.ACC has a higher codon usage ranking but lower immune response preference ranking than Thr.sup.ACG);

[0223] (ii) several codons deemed by Seed to have a lower codon usage ranking in mammalian cells than at least one other synonymous codon have in fact a higher immune response preference ranking than the or each other synonymous codon (e.g., Ala.sup.GCT has a lower codon usage ranking but higher immune response preference ranking than Ala.sup.GCC; Gly.sup.GGA has a lower codon usage ranking but higher immune response preference ranking than Gly.sup.GGC or Gly.sup.GGG; Phe.sup.TTT has a lower codon usage ranking but higher immune response preference ranking than Phe.sup.TTC; Ser.sup.TCG has a lower codon usage ranking but higher immune response preference ranking than Ser.sup.AGC or Ser.sup.TCC; Ser.sup.TCT and Ser.sup.TCA have a lower codon usage ranking but higher immune response preference ranking than Ser.sup.AGC; and Thr.sup.ACG has a lower codon usage ranking but higher immune response preference ranking than Thr.sup.ACC);

[0224] (iii) several codons deemed by Seed to have a higher codon usage ranking in mammalian cells than another synonymous codon have in fact the same immune response preference ranking as the other synonymous codon (e.g., Gln.sup.CAG has a higher codon usage ranking than, but the same immune response preference ranking as, Gln.sup.CAA; His.sup.CAC has a higher codon usage ranking than, but the same immune response preference ranking as, His.sup.CAT; Leu.sup.CTG has a higher codon usage ranking than, but the same immune response preference ranking as, Leu.sup.CTC; Lys.sup.AAG has a higher codon usage ranking than, but the same immune response preference ranking as, Lys.sup.AAA; and Val.sup.GTG has a higher codon usage ranking than, but the same immune response preference ranking as, Val.sup.GTC); and

[0225] (iv) several codons deemed by Seed to have the same codon usage ranking in mammalian cells as at least one other synonymous codon have in fact a different immune response preference ranking than the or each other synonymous codon (e.g., Ala.sup.GCT has the same codon usage ranking as, but a higher immune response preference ranking than, Ala.sup.GCA and Ala.sup.GCG; Arg.sup.CGA, Arg.sup.CGT and Arg.sup.AGA have the same codon usage ranking as, but a higher immune response preference ranking than, Arg.sup.AGG and Arg.sup.CGG; Glu.sup.GAA has the same codon usage ranking as, but a higher immune response preference ranking than, Glu.sup.GAG; Gly.sup.GGA has the same codon usage ranking as, but a higher immune response preference ranking than, Gly.sup.GGT; Leu.sup.CTA and Leu.sup.CTT have the same codon usage ranking as, but a higher immune response preference ranking than, Leu.sup.TTG and Leu.sup.TTA; Pro.sup.CCT has the same codon usage ranking as, but a higher immune response preference ranking than, Pro.sup.CCA or Pro.sup.CCG; Ser.sup.TCG has the same codon usage ranking as, but a higher immune response preference ranking than, any one of Ser.sup.TCT, Ser.sup.TCA and Ser.sup.AGT; Ser.sup.TCT and Ser.sup.TCA have the same codon usage ranking as, but a higher immune response preference ranking than, Ser.sup.AGT; Thr.sup.ACG has the same codon usage ranking as, but a higher immune response preference ranking than, any one of Thr.sup.ACA and Thr.sup.ACT; Thr.sup.ACG has the same codon usage ranking as, but a higher immune response preference ranking than, Thr.sup.ACT; and Val.sup.GTT has the same codon usage ranking as, but a higher immune response preference ranking than, Val.sup.GTA).
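The four kinds of difference listed in (i) to (iv) can be found mechanically by comparing the two columns of TABLE 9 pair by pair. The following Python sketch does this for four amino acids only; the tiers are transcribed from TABLE 9, the `>` and `>>` symbols are both treated simply as "ranks above", and all function names are hypothetical.

```python
# Tiers are listed from highest to lowest rank; codons within a tier are tied.
# Transcribed from TABLE 9 for four amino acids only.
SEED = {
    "Ala": [["GCC"], ["GCG", "GCT", "GCA"]],
    "Gln": [["CAG"], ["CAA"]],
    "Glu": [["GAA", "GAG"]],
    "Phe": [["TTC"], ["TTT"]],
}
IMMUNE = {
    "Ala": [["GCT"], ["GCC"], ["GCA", "GCG"]],
    "Gln": [["CAA", "CAG"]],
    "Glu": [["GAA"], ["GAG"]],
    "Phe": [["TTT"], ["TTC"]],
}


def tier_of(tiers):
    """Map each codon to the index of its tier (0 = highest rank)."""
    return {codon: i for i, tier in enumerate(tiers) for codon in tier}


def relation(a, b, tiers):
    """Return '>', '<' or '=' for codon a relative to codon b."""
    rank = tier_of(tiers)
    if rank[a] < rank[b]:
        return ">"
    if rank[a] > rank[b]:
        return "<"
    return "="


def discordant_pairs(aa):
    """List codon pairs whose relation differs between the two rankings.

    Each item is (codon_a, codon_b, relation under Seed, relation under
    the immune response preference ranking).
    """
    codons = sorted(tier_of(SEED[aa]))
    found = []
    for i, a in enumerate(codons):
        for b in codons[i + 1:]:
            seed_rel = relation(a, b, SEED[aa])
            immune_rel = relation(a, b, IMMUNE[aa])
            if seed_rel != immune_rel:
                found.append((a, b, seed_rel, immune_rel))
    return found


for aa in SEED:
    print(aa, discordant_pairs(aa))
```

A pair reported as `('>', '<')` corresponds to category (i) or (ii), `('<', '=')` or `('>', '=')` to category (iii), and `('=', '>')` or `('=', '<')` to category (iv).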

[0226] Accordingly, the present invention enables for the first time the modulation of an immune response to a target antigen in a mammal from a polynucleotide that encodes a polypeptide that corresponds to at least a portion of the target antigen by replacing at least one codon of the polynucleotide with a synonymous codon that has a higher or lower preference for producing an immune response than the codon it replaces. In some embodiments, therefore, the present invention embraces methods of constructing a synthetic polynucleotide from which a polypeptide is producible to confer an enhanced or stronger immune response than one conferred by a parent polynucleotide that encodes the same polypeptide. These methods generally comprise selecting from TABLE 1 a codon (often referred to herein arbitrarily as a "first codon") of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a higher immune response preference than the first codon, and replacing the first codon with the synonymous codon to construct the synthetic polynucleotide. Illustrative selections of the first and synonymous codons are made according to TABLE 2.

[0227] In some embodiments, the selection of the first and synonymous codons is made according to TABLE 3, which is the same as TABLE 2 with the exception that it excludes selections based on codon usage rankings as disclosed by Seed. In illustrative examples of this type, the selection of a second codon (and subsequent codons if desired) for replacement with a synonymous codon is made according to TABLE 4.

[0228] Where synonymous codons are classified into three ranks (`high`, `intermediate` and `low` ranks) based on their immune response preference ranking (e.g., the synonymous codons for Ala, Ile, Leu, Pro, Ser, Thr and Val), it is preferred that the synonymous codon that is selected is a high rank codon when the first codon is a low rank codon. However, this is not essential and the synonymous codon can be selected from intermediate rank codons. In the case of two or more synonymous codons having similar immune response preferences, it will be appreciated that any one of these codons can be used to replace the first codon.

[0229] In other embodiments, the invention provides methods of constructing a synthetic polynucleotide from which a polypeptide is producible to confer a reduced or weaker immune response than one conferred by a parent polynucleotide that encodes the same polypeptide. These methods generally comprise selecting from TABLE 1 a first codon of the parent polynucleotide for replacement with a synonymous codon, wherein the synonymous codon is selected on the basis that it exhibits a lower immune response preference than the first codon and replacing the first codon with the synonymous codon to construct the synthetic polynucleotide. Illustrative selections of the first and synonymous codons are made according to TABLE 5.

[0230] In some embodiments, the selection of the first and synonymous codons is made according to TABLE 6, which is the same as TABLE 5 with the exception that it excludes selections based on codon usage rankings as disclosed by Seed. In illustrative examples of this type, the selection of a second codon (and subsequent codons if desired) for replacement with a synonymous codon is made according to TABLE 7.

[0231] Where synonymous codons are classified into the three ranks noted above, it is preferred that the synonymous codon that is selected is a low rank codon when the first codon is a high rank codon but this is not essential and thus the synonymous codon can be selected from intermediate rank codons if desired.
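The selection step for a stronger or weaker response can be illustrated with a short Python sketch. It is an assumption-laden simplification: TABLES 1 to 7 are not reproduced here, so the tiers below are taken from the experimental column of TABLE 9 for three amino acids only, and the hypothetical `recode` function always moves to the top or bottom tier, whereas the text also permits intermediate rank codons.

```python
# Immune response preference tiers (highest first), from TABLE 9, three amino acids only.
IMMUNE_TIERS = {
    "Ala": [["GCT"], ["GCC"], ["GCA", "GCG"]],
    "Ile": [["ATC"], ["ATT"], ["ATA"]],
    "Phe": [["TTT"], ["TTC"]],
}


def recode(cds, direction="higher"):
    """Replace codons with synonymous codons of higher or lower rank.

    direction="higher" moves each listed codon to the top tier for its
    amino acid; direction="lower" moves it to the bottom tier. Codons
    already in the target tier, and codons not listed in IMMUNE_TIERS,
    are left unchanged.
    """
    if direction not in ("higher", "lower"):
        raise ValueError("direction must be 'higher' or 'lower'")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    lookup = {
        codon: (aa, rank)
        for aa, tiers in IMMUNE_TIERS.items()
        for rank, tier in enumerate(tiers)
        for codon in tier
    }
    out = []
    for i in range(0, len(cds), 3):
        codon = cds[i:i + 3]
        if codon in lookup:
            aa, rank = lookup[codon]
            tiers = IMMUNE_TIERS[aa]
            target_rank = 0 if direction == "higher" else len(tiers) - 1
            if rank != target_rank:
                codon = tiers[target_rank][0]
        out.append(codon)
    return "".join(out)


# Toy parent: Met-Ala-Ile-Phe-stop
parent = "ATGGCAATTTTCTAA"
print(recode(parent, "higher"))
print(recode(parent, "lower"))
```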

[0232] Generally, the difference in strength of the immune response produced in the mammal from the synthetic polynucleotide as compared to that produced from the parent polynucleotide depends on the number of first/second codons that are replaced by synonymous codons, and on the difference in immune response preference ranking between the first/second codons and the synonymous codons. Put another way, the fewer such replacements, and/or the smaller the difference in immune response preference ranking between the synonymous and first/second codons, the smaller the difference will be in the immune response produced by the synthetic polynucleotide and the one produced by the parent polynucleotide. Conversely, the more such replacements, and/or the greater the difference in immune response preference ranking between the synonymous and first/second codons, the greater the difference will be in the immune response produced by the synthetic polynucleotide and the one produced by the parent polynucleotide.

[0233] It is preferable but not necessary to replace all the codons of the parent polynucleotide with synonymous codons having different (e.g., higher or lower) immune response preference rankings than the first/second codons. Changes in the conferred immune response can be accomplished even with partial replacement. Generally, the replacement step affects at least about 5%, 10%, 15%, 20%, 25%, 30%, usually at least about 35%, 40%, 50%, and typically at least about 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%, 99% or more of the first/second codons of the parent polynucleotide. In embodiments in which a stronger or enhanced immune response is required, it is generally desirable to replace some, preferably most and more preferably all, low rank codons in a parent polynucleotide with synonymous codons that are intermediate, or preferably high rank codons. Typically, replacement of low with intermediate or high rank codons will result in an increase in the strength of immune response from the synthetic polynucleotide so constructed, as compared to the one produced from the parent polynucleotide under the same conditions. However, it is often desirable to replace some, preferably most and more preferably all, intermediate rank codons in the parent polynucleotide with high rank codons, if stronger or more enhanced immune responses are desired.

[0234] By contrast, in some embodiments in which a weaker or reduced immune response is required, it is generally desirable to replace some, preferably most and more preferably all, high rank codons in a parent polynucleotide with synonymous codons that are intermediate, or preferably low rank codons. Typically, replacement of high with intermediate or low rank codons will result in a substantial decrease in the strength of immune response from the synthetic polynucleotide so constructed, as compared to the one produced from the parent polynucleotide under the same conditions. In specific embodiments in which it is desired to confer a weaker or more reduced immune response, it is generally desirable to replace some, preferably most and more preferably all, intermediate rank codons in the parent polynucleotide with low rank codons.

[0235] In illustrative examples requiring a stronger or enhanced immune response, the number of, and difference in immune response preference ranking between, the first/second codons and the synonymous codons are selected such that the immune response conferred by the synthetic polynucleotide is at least about 110%, 150%, 200%, 250%, 300%, 350%, 400%, 450%, 500%, 600%, 700%, 800%, 900%, 1000%, or more, of the immune response conferred by the parent polynucleotide under the same conditions. Conversely, in some embodiments requiring a lower or weaker immune response, the number of, and difference in phenotypic preference ranking between, the first/second codons and the synonymous codons are selected such that the immune response conferred by the synthetic polynucleotide is no more than about 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 10%, 5%, or less of the immune response conferred by the parent polynucleotide under the same conditions.
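Partial replacement, as discussed in paragraph [0233], can be sketched as replacing only a chosen fraction of the replaceable codons. In the Python below, the `partial_replace` function, the rule of replacing the earliest positions first, and the toy sequence are all assumptions made for illustration; the mapping of low rank alanine codons to Ala.sup.GCT follows the experimental column of TABLE 9.

```python
import math


def partial_replace(cds, mapping, fraction):
    """Replace a fraction of the replaceable codons in a coding sequence.

    mapping:  {codon to replace: synonymous replacement codon}
    fraction: value between 0 and 1; the number of codons replaced is
              rounded up, and the earliest positions are replaced first.
    Returns (new sequence, number replaced, number replaceable).
    """
    if not 0 <= fraction <= 1:
        raise ValueError("fraction must be between 0 and 1")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    positions = [i for i, c in enumerate(codons) if c in mapping]
    k = math.ceil(fraction * len(positions))
    for i in positions[:k]:
        codons[i] = mapping[codons[i]]
    return "".join(codons), k, len(positions)


# Low rank alanine codons mapped to the high rank codon GCT (per TABLE 9).
ala_up = {"GCA": "GCT", "GCG": "GCT"}
# Toy parent: Met-Ala-Ala-Ala-Ala-stop
parent = "ATGGCAGCGGCAGCGTAA"
print(partial_replace(parent, ala_up, 0.5))
print(partial_replace(parent, ala_up, 1.0))
```

Which positions are replaced first is a design choice the text leaves open; replacing the earliest positions is used here only to keep the example deterministic.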

7. Modulating Immune Responses in Mammals by Expression of Isoaccepting Transfer RNA-encoding Polynucleotides

[0236] It is possible to take advantage of the immune response preference rankings of codons discussed in Section 6 to modulate an immune response to a target antigen by changing the level of iso-tRNAs in the cell population which is the target of the immunization. Accordingly, the invention also features methods of enhancing the quality of an immune response to a target antigen in a mammal, wherein the response is conferred by the expression of a first polynucleotide that encodes a polypeptide corresponding to at least a portion of the target antigen. These methods generally comprise: introducing into the mammal a first nucleic acid construct comprising the first polynucleotide in operable connection with a regulatory sequence. A second nucleic acid construct is then introduced into the mammal, which comprises a second polynucleotide that is operably connected to a regulatory sequence and that encodes an iso-tRNA corresponding to a low immune preference codon of the first polynucleotide.

[0237] In practice, therefore, an iso-tRNA is introduced into the mammal by the second nucleic acid construct when the iso-tRNA corresponds to a low immune response preference codon in the first polynucleotide, which is suitably selected from the group consisting of Ala.sup.GCA, Ala.sup.GCG, Ala.sup.GCC, Arg.sup.AGG, Arg.sup.CGG, Asn.sup.AAT, Asp.sup.GAT, Cys.sup.TGT, Glu.sup.GAG, Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC, Ile.sup.ATA, Ile.sup.ATT, Leu.sup.TTG, Leu.sup.TTA, Leu.sup.CTA, Leu.sup.CTT, Phe.sup.TTC, Pro.sup.CCA, Pro.sup.CCG, Pro.sup.CCT, Ser.sup.AGC, Ser.sup.AGT, Ser.sup.TCT, Ser.sup.TCA, Ser.sup.TCC, Thr.sup.ACA, Thr.sup.ACT, Tyr.sup.TAT, Val.sup.GTA and Val.sup.GTT. In specific embodiments, the supplied iso-tRNAs are specific for codons that have a `low` immune response preference, which may be selected from the group consisting of Ala.sup.GCA, Ala.sup.GCG, Arg.sup.CGG, Asn.sup.AAT, Asp.sup.GAT, Cys.sup.TGT, Glu.sup.GAG, Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC, Ile.sup.ATA, Leu.sup.TTG, Leu.sup.TTA, Phe.sup.TTC, Pro.sup.CCA, Pro.sup.CCG, Ser.sup.AGC, Ser.sup.AGT, Thr.sup.ACT, Tyr.sup.TAT and Val.sup.GTA. The first construct (i.e., antigen-expressing construct) and the second construct (i.e., the iso-tRNA-expressing construct) may be introduced simultaneously or sequentially (in either order) and may be introduced at the same or different sites. In some embodiments, the first and second constructs are contained in separate vectors. In other embodiments, they are contained in a single vector. If desired, two or more second constructs may be introduced each expressing a different iso-tRNA corresponding to a low preference codon of the first polynucleotide. The first and second nucleic acid constructs may be constructed and administered concurrently or contemporaneously to a mammal according to any suitable method, illustrative examples of which are discussed below for the chimeric constructs of the invention.

[0238] In some embodiments, a plurality of different iso-tRNA-expressing constructs (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more) are administered concurrently or contemporaneously with the antigen-expressing construct, wherein individual iso-tRNA-expressing constructs express a different iso-tRNA than other iso-tRNA-expressing constructs.
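To decide which iso-tRNAs might be worth supplying, one could scan the first polynucleotide for codons in the `low` immune response preference group of paragraph [0237] and count them. The Python sketch below does only that; the function name and toy sequence are hypothetical, and ordering candidates by frequency is an assumption rather than a criterion stated in the text.

```python
from collections import Counter

# 'Low' immune response preference codons listed in paragraph [0237],
# keyed by codon with the encoded amino acid as the value.
LOW_PREFERENCE = {
    "GCA": "Ala", "GCG": "Ala", "CGG": "Arg", "AAT": "Asn", "GAT": "Asp",
    "TGT": "Cys", "GAG": "Glu", "GGG": "Gly", "GGT": "Gly", "GGC": "Gly",
    "ATA": "Ile", "TTG": "Leu", "TTA": "Leu", "TTC": "Phe", "CCA": "Pro",
    "CCG": "Pro", "AGC": "Ser", "AGT": "Ser", "ACT": "Thr", "TAT": "Tyr",
    "GTA": "Val",
}


def low_preference_codons(cds):
    """Count low preference codons in a coding sequence.

    Returns a list of (codon, count) tuples, most frequent first. Each
    codon in the list is a candidate for a matching iso-tRNA construct.
    """
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    counts = Counter(c for c in codons if c in LOW_PREFERENCE)
    return counts.most_common()


# Toy antigen coding sequence: Met-Asp-Ala-Asp-Phe-stop
first_polynucleotide = "ATGGATGCAGATTTCTAA"
for codon, count in low_preference_codons(first_polynucleotide):
    print(f"{LOW_PREFERENCE[codon]}.sup.{codon}: {count}")
```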

8. Antigens

[0239] Target antigens useful in the present invention are typically proteinaceous molecules, representative examples of which include polypeptides and peptides. Target antigens may be selected from endogenous antigens produced by a host or exogenous antigens that are foreign to the host. Suitable endogenous antigens include, but are not restricted to, cancer or tumor antigens. Non-limiting examples of cancer or tumor antigens include antigens from a cancer or tumor selected from ABL1 proto-oncogene, AIDS related cancers, acoustic neuroma, acute lymphocytic leukemia, acute myeloid leukemia, adenocystic carcinoma, adrenocortical cancer, agnogenic myeloid metaplasia, alopecia, alveolar soft-part sarcoma, anal cancer, angiosarcoma, aplastic anemia, astrocytoma, ataxia-telangiectasia, basal cell carcinoma (skin), bladder cancer, bone cancers, bowel cancer, brain stem glioma, brain and CNS tumors, breast cancer, CNS tumors, carcinoid tumors, cervical cancer, childhood brain tumors, childhood cancer, childhood leukemia, childhood soft tissue sarcoma, chondrosarcoma, choriocarcinoma, chronic lymphocytic leukemia, chronic myeloid leukemia, colorectal cancers, cutaneous T-cell lymphoma, dermatofibrosarcoma protuberans, desmoplastic small round cell tumor, ductal carcinoma, endocrine cancers, endometrial cancer, ependymoma, oesophageal cancer, Ewing's Sarcoma, Extra-Hepatic Bile Duct Cancer, Eye Cancer, Eye: Melanoma, Retinoblastoma, Fallopian Tube cancer, Fanconi anemia, fibrosarcoma, gall bladder cancer, gastric cancer, gastrointestinal cancers, gastrointestinal-carcinoid-tumor, genitourinary cancers, germ cell tumors, gestational-trophoblastic-disease, glioma, gynecological cancers, haematological malignancies, hairy cell leukemia, head and neck cancer, hepatocellular cancer, hereditary breast cancer, histiocytosis, Hodgkin's disease, human papillomavirus, hydatidiform mole, hypercalcemia, hypopharynx cancer, intraocular melanoma, islet cell cancer, Kaposi's sarcoma, 
kidney cancer, Langerhans cell histiocytosis, laryngeal cancer, leiomyosarcoma, leukemia, Li-Fraumeni syndrome, lip cancer, liposarcoma, liver cancer, lung cancer, lymphedema, lymphoma, Hodgkin's lymphoma, non-Hodgkin's lymphoma, male breast cancer, malignant-rhabdoid tumor of kidney, medulloblastoma, melanoma, Merkel cell cancer, mesothelioma, metastatic cancer, mouth cancer, multiple endocrine neoplasia, mycosis fungoides, myelodysplastic syndromes, myeloma, myeloproliferative disorders, nasal cancer, nasopharyngeal cancer, nephroblastoma, neuroblastoma, neurofibromatosis, Nijmegen breakage syndrome, non-melanoma skin cancer, non-small-cell-lung-cancer (NSCLC), ocular cancers, esophageal cancer, oral cavity cancer, oropharynx cancer, osteosarcoma, ostomy ovarian cancer, pancreas cancer, paranasal cancer, parathyroid cancer, parotid gland cancer, penile cancer, peripheral-neuroectodermal tumours, pituitary cancer, polycythemia vera, prostate cancer, rare cancers and associated disorders, renal cell carcinoma, retinoblastoma, rhabdomyosarcoma, Rothmund-Thomson syndrome, salivary gland cancer, sarcoma, schwannoma, Sezary syndrome, skin cancer, small cell lung cancer (SCLC), small intestine cancer, soft tissue sarcoma, spinal cord tumors, squamous-cell-carcinoma-(skin), stomach cancer, synovial sarcoma, testicular cancer, thymus cancer, thyroid cancer, transitional-cell-cancer-(bladder), transitional-cell-cancer-(renal-pelvis-/- ureter), trophoblastic cancer, urethral cancer, urinary system cancer, uroplakins, uterine sarcoma, uterus cancer, vaginal cancer, vulva cancer, Waldenstroms macroglobulinemia, Wilms' tumor. In certain embodiments, the cancer or tumor relates to melanoma. Illustrative examples of melanoma-related antigens include melanocyte differentiation antigen (e.g., gp100, MART, Melan-A/MART-1, TRP-1, Tyros, TRP2, MC1R, MUC1F, MUC1R or a combination thereof) and melanoma-specific antigens (e.g., BAGE, GAGE-1, gp100In4, MAGE-1 (e.g., GenBank Accession No. 
X54156 and AA494311), MAGE-3, MAGE4, PRAME, TRP2IN2, NYNSO1a, NYNSO1b, LAGE1, p97 melanoma antigen (e.g., GenBank Accession No. M12154), p5 protein, gp75, oncofetal antigen, GM2 and GD2 gangliosides, cdc27, p21ras, gp100.sup.Pmel117 or a combination thereof). Other tumour-specific antigens include, but are not limited to: etv6, aml1, cyclophilin b (acute lymphoblastic leukemia); Ig-idiotype (B cell lymphoma); E-cadherin, .alpha.-catenin, .beta.-catenin, .gamma.-catenin, p120ctn (glioma); p21ras (bladder cancer); p21ras (biliary cancer); MUC family, HER2/neu, c-erbB-2 (breast cancer); p53, p21ras (cervical carcinoma); p21ras, HER2/neu, c-erbB-2, MUC family, Cripto-1 protein, Pim-1 protein (colon carcinoma); Colorectal associated antigen (CRC)-0017-1A/GA733, APC (colorectal cancer); carcinoembryonic antigen (CEA) (colorectal cancer; choriocarcinoma); cyclophilin b (epithelial cell cancer); HER2/neu, c-erbB-2, ga733 glycoprotein (gastric cancer); .alpha.-fetoprotein (hepatocellular cancer); Imp-1, EBNA-1 (Hodgkin's lymphoma); CEA, MAGE-3, NY-ESO-1 (lung cancer); cyclophilin b (lymphoid cell-derived leukemia); MUC family, p21ras (myeloma); HER2/neu, c-erbB-2 (non-small cell lung carcinoma); Imp-1, EBNA-1 (nasopharyngeal cancer); MUC family, HER2/neu, c-erbB-2, MAGE-A4, NY-ESO-1 (ovarian cancer); Prostate Specific Antigen (PSA) and its antigenic epitopes PSA-1, PSA-2, and PSA-3, PSMA, HER2/neu, c-erbB-2, ga733 glycoprotein (prostate cancer); HER2/neu, c-erbB-2 (renal cancer); viral products such as human papillomavirus proteins (squamous cell cancers of the cervix and esophagus); NY-ESO-1 (testicular cancer); and HTLV-1 epitopes (T cell leukemia).

[0240] Foreign or exogenous antigens are suitably selected from antigens of pathogenic organisms. Exemplary pathogenic organisms include, but are not limited to, viruses, bacteria, fungi, parasites, algae, protozoa and amoebae. Illustrative viruses include viruses responsible for diseases including, but not limited to, measles, mumps, rubella, poliomyelitis, hepatitis A, B (e.g., GenBank Accession No. E02707), and C (e.g., GenBank Accession No. E06890), as well as other hepatitis viruses, influenza, adenovirus (e.g., types 4 and 7), rabies (e.g., GenBank Accession No. M34678), yellow fever, Epstein-Barr virus and other herpesviruses such as papillomavirus, Ebola virus, influenza virus, Japanese encephalitis (e.g., GenBank Accession No. E07883), dengue (e.g., GenBank Accession No. M24444), hantavirus, Sendai virus, respiratory syncytial virus, orthomyxoviruses, vesicular stomatitis virus, visna virus, cytomegalovirus and human immunodeficiency virus (HIV) (e.g., GenBank Accession No. U18552). Any suitable antigen derived from such viruses is useful in the practice of the present invention. For example, illustrative retroviral antigens derived from HIV include, but are not limited to, antigens such as gene products of the gag, pol, and env genes, the Nef protein, reverse transcriptase, and other HIV components. Illustrative examples of hepatitis viral antigens include, but are not limited to, antigens such as the S, M, and L proteins of hepatitis B virus, the pre-S antigen of hepatitis B virus, and other hepatitis, e.g., hepatitis A, B, and C, viral components such as hepatitis C viral RNA. Illustrative examples of influenza viral antigens include, but are not limited to, antigens such as hemagglutinin and neuraminidase and other influenza viral components. Illustrative examples of measles viral antigens include, but are not limited to, antigens such as the measles virus fusion protein and other measles virus components. 
Illustrative examples of rubella viral antigens include, but are not limited to, antigens such as proteins E1 and E2 and other rubella virus components; rotaviral antigens such as VP7sc and other rotaviral components. Illustrative examples of cytomegaloviral antigens include, but are not limited to, antigens such as envelope glycoprotein B and other cytomegaloviral antigen components. Non-limiting examples of respiratory syncytial viral antigens include antigens such as the RSV fusion protein, the M2 protein and other respiratory syncytial viral antigen components. Illustrative examples of herpes simplex viral antigens include, but are not limited to, antigens such as immediate early proteins, glycoprotein D, and other herpes simplex viral antigen components. Non-limiting examples of varicella zoster viral antigens include antigens such as 9PI, gpII, and other varicella zoster viral antigen components. Non-limiting examples of Japanese encephalitis viral antigens include antigens such as proteins E, M-E, M-E-NS 1, NS 1, NS 1-NS2A, 80% E, and other Japanese encephalitis viral antigen components. Representative examples of rabies viral antigens include, but are not limited to, antigens such as rabies glycoprotein, rabies nucleoprotein and other rabies viral antigen components. Illustrative examples of papillomavirus antigens include, but are not limited to, the L1 and L2 capsid proteins as well as the E6/E7 antigens associated with cervical cancers. See Fundamental Virology, Second Edition, eds. Fields, B. N. and Knipe, D. M., 1991, Raven Press, New York, for additional examples of viral antigens.

[0241] Illustrative examples of fungi include Acremonium spp., Aspergillus spp., Basidiobolus spp., Bipolaris spp., Blastomyces dermatitidis, Candida spp., Cladophialophora carrionii, Coccidioides immitis, Conidiobolus spp., Cryptococcus spp., Curvularia spp., Epidermophyton spp., Exophiala jeanselmei, Exserohilum spp., Fonsecaea compacta, Fonsecaea pedrosoi, Fusarium oxysporum, Fusarium solani, Geotrichum candidum, Histoplasma capsulatum var. capsulatum, Histoplasma capsulatum var. duboisii, Hortaea werneckii, Lacazia loboi, Lasiodiplodia theobromae, Leptosphaeria senegalensis, Madurella grisea, Madurella mycetomatis, Malassezia furfur, Microsporum spp., Neotestudina rosatii, Onychocola canadensis, Paracoccidioides brasiliensis, Phialophora verrucosa, Piedraia hortae, Pityriasis versicolor, Pseudallescheria boydii, Pyrenochaeta romeroi, Rhizopus arrhizus, Scopulariopsis brevicaulis, Scytalidium dimidiatum, Sporothrix schenckii, Trichophyton spp., Trichosporon spp., Zygomycete fungi, Absidia corymbifera, Rhizomucor pusillus and Rhizopus arrhizus. Thus, representative fungal antigens that can be used in the compositions and methods of the present invention include, but are not limited to, candida fungal antigen components; histoplasma fungal antigens such as heat shock protein 60 (HSP60) and other histoplasma fungal antigen components; cryptococcal fungal antigens such as capsular polysaccharides and other cryptococcal fungal antigen components; coccidioides fungal antigens such as spherule antigens and other coccidioides fungal antigen components; and tinea fungal antigens such as trichophytin and other coccidioides fungal antigen components.

[0242] Illustrative examples of bacteria include bacteria that are responsible for diseases including, but not restricted to, diphtheria (e.g., Corynebacterium diphtheriae), pertussis (e.g., Bordetella pertussis, GenBank Accession No. M35274), tetanus (e.g., Clostridium tetani, GenBank Accession No. M64353), tuberculosis (e.g., Mycobacterium tuberculosis), bacterial pneumonias (e.g., Haemophilus influenzae), cholera (e.g., Vibrio cholerae), anthrax (e.g., Bacillus anthracis), typhoid, plague, shigellosis (e.g., Shigella dysenteriae), botulism (e.g., Clostridium botulinum), salmonellosis (e.g., GenBank Accession No. L03833), peptic ulcers (e.g., Helicobacter pylori), Legionnaire's Disease and Lyme disease (e.g., GenBank Accession No. U59487). Other pathogenic bacteria include Escherichia coli, Clostridium perfringens, Pseudomonas aeruginosa, Staphylococcus aureus and Streptococcus pyogenes. Thus, bacterial antigens which can be used in the compositions and methods of the invention include, but are not limited to: pertussis bacterial antigens such as pertussis toxin, filamentous hemagglutinin, pertactin, FIM2, FIM3, adenylate cyclase and other pertussis bacterial antigen components; diphtheria bacterial antigens such as diphtheria toxin or toxoid and other diphtheria bacterial antigen components; tetanus bacterial antigens such as tetanus toxin or toxoid and other tetanus bacterial antigen components; streptococcal bacterial antigens such as M proteins and other streptococcal bacterial antigen components; gram-negative bacilli bacterial antigens such as lipopolysaccharides and other gram-negative bacterial antigen components; Mycobacterium tuberculosis bacterial antigens such as mycolic acid, heat shock protein 65 (HSP65), the kDa major secreted protein, antigen 85A and other mycobacterial antigen components; Helicobacter pylori bacterial antigen components; pneumococcal bacterial antigens such as pneumolysin, pneumococcal capsular polysaccharides and other pneumococcal bacterial antigen components; Haemophilus influenzae bacterial antigens such as capsular polysaccharides and other Haemophilus influenzae bacterial antigen components; anthrax bacterial antigens such as anthrax protective antigen and other anthrax bacterial antigen components; rickettsiae bacterial antigens such as rompA and other rickettsiae bacterial antigen components. Also included with the bacterial antigens described herein are any other bacterial, mycobacterial, mycoplasmal, rickettsial, or chlamydial antigens.

[0243] Illustrative examples of protozoa include protozoa that are responsible for diseases including, but not limited to, malaria (e.g., GenBank Accession No. X53832), hookworm, onchocerciasis (e.g., GenBank Accession No. M27807), schistosomiasis (e.g., GenBank Accession No. LOS198), toxoplasmosis, trypanosomiasis, leishmaniasis, giardiasis (e.g., GenBank Accession No. M33641), amoebiasis, filariasis (e.g., GenBank Accession No. J03266), borreliosis, and trichinosis. Thus, protozoal antigens which can be used in the compositions and methods of the invention include, but are not limited to: Plasmodium falciparum antigens such as merozoite surface antigens, sporozoite surface antigens, circumsporozoite antigens, gametocyte/gamete surface antigens, blood-stage antigen pf 155/RESA and other plasmodial antigen components; toxoplasma antigens such as SAG-1, p30 and other toxoplasma antigen components; schistosoma antigens such as glutathione-S-transferase, paramyosin, and other schistosomal antigen components; Leishmania major and other leishmaniae antigens such as gp63, lipophosphoglycan and its associated protein and other leishmanial antigen components; and Trypanosoma cruzi antigens such as the 75-77 kDa antigen, the 56 kDa antigen and other trypanosomal antigen components.

[0244] The present invention also contemplates toxin components as antigens, illustrative examples of which include staphylococcal enterotoxins, toxic shock syndrome toxin; retroviral antigens (e.g., antigens derived from HIV), streptococcal antigens, staphylococcal enterotoxin-A (SEA), staphylococcal enterotoxin-B (SEB), staphylococcal enterotoxin 1-3 (SE1-3), staphylococcal enterotoxin-D (SED), staphylococcal enterotoxin-E (SEE) as well as toxins derived from mycoplasma, mycobacterium, and herpes viruses.

9. Construction of Synthetic Polynucleotides

[0245] Replacement of one codon for another can be achieved using standard methods known in the art. For example, codon modification of a parent polynucleotide can be effected using several known mutagenesis techniques including, for example, oligonucleotide-directed mutagenesis, mutagenesis with degenerate oligonucleotides, and region-specific mutagenesis. Exemplary in vitro mutagenesis techniques are described, for example, in U.S. Pat. Nos. 4,184,917, 4,321,365 and 4,351,901 or in the relevant sections of Ausubel, et al. (CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, Inc. 1997) and of Sambrook, et al., (MOLECULAR CLONING. A LABORATORY MANUAL, Cold Spring Harbor Press, 1989). Instead of in vitro mutagenesis, the synthetic polynucleotide can be synthesized de novo using readily available machinery as described, for example, in U.S. Pat. No. 4,293,652. However, it should be noted that the present invention is not dependent on, and not directed to, any one particular technique for constructing the synthetic polynucleotide.
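By way of non-limiting illustration only, the in-frame replacement of one codon by a synonymous (iso-accepting) codon can be sketched computationally before any mutagenesis or de novo synthesis is undertaken. The following is a minimal sketch, not the method of the invention: the function name `replace_codons` and the replacement map `HYPOTHETICAL_REPLACEMENTS` are assumptions introduced here, and the codon pairs in the map are placeholders rather than preferences determined by the construct system described herein.

```python
# Hypothetical map from a first codon to a synonymous (iso-accepting) codon.
# These pairs are illustrative placeholders, not preferences from this disclosure.
HYPOTHETICAL_REPLACEMENTS = {
    "GCG": "GCC",  # Ala -> Ala
    "CTA": "CTG",  # Leu -> Leu
    "AGA": "CGC",  # Arg -> Arg
}


def replace_codons(coding_sequence, replacements):
    """Return the coding sequence with each listed codon swapped in frame."""
    sequence = coding_sequence.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # Split into codons starting at position 0 so the reading frame is kept.
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    # Codons without an entry in the map are left unchanged.
    return "".join(replacements.get(codon, codon) for codon in codons)


# Toy parent sequence: Met-Ala-Leu-Arg-stop.
parent = "ATGGCGCTAAGATAA"
synthetic = replace_codons(parent, HYPOTHETICAL_REPLACEMENTS)
print(synthetic)
```

Because every replacement in the map is synonymous, the encoded amino acid sequence is unchanged; only the codon usage of the resulting sequence differs from that of the parent.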

[0246] The parent polynucleotide is suitably a natural gene. However, it is possible that the parent polynucleotide is not naturally-occurring but has been engineered using recombinant techniques. Parent polynucleotides can be obtained from any suitable source, such as from eukaryotic or prokaryotic organisms, including but not limited to mammals or other animals, and pathogenic organisms such as yeasts, bacteria, protozoa and viruses.

[0247] The invention also contemplates synthetic polynucleotides encoding one or more desired portions of a target antigen. In some embodiments, the synthetic polynucleotide encodes at least about 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150, 300, 400, 500, 600, 700, 800, 900 or 1000, or even at least about 2000, 3000, 4000 or 5000 contiguous amino acid residues, or almost up to the total number of amino acids present in a full-length target antigen. In some embodiments, the synthetic polynucleotide encodes a plurality of portions of the target antigen, wherein the portions are the same or different. In illustrative examples of this type, the synthetic polynucleotide encodes a multi-epitope fusion protein. A number of factors can influence the choice of portion size. For example, the size of individual portions encoded by the synthetic polynucleotide can be chosen such that it includes, or corresponds to the size of, T cell epitopes and/or B cell epitopes, and their processing requirements. Practitioners in the art will recognize that class I-restricted T cell epitopes are typically between 8 and 10 amino acid residues in length and if placed next to unnatural flanking residues, such epitopes can generally require 2 to 3 natural flanking amino acid residues to ensure that they are efficiently processed and presented. Class II-restricted T cell epitopes usually range between 12 and 25 amino acid residues in length and may not require natural flanking residues for efficient proteolytic processing although it is believed that natural flanking residues may play a role. 
Another important feature of class II-restricted epitopes is that they generally contain a core of 9-10 amino acid residues in the middle which bind specifically to class II MHC molecules with flanking sequences either side of this core stabilizing binding by associating with conserved structures on either side of class II MHC antigens in a sequence independent manner. Thus the functional region of class II-restricted epitopes is typically less than about 15 amino acid residues long. The size of linear B cell epitopes and the factors affecting their processing, like class II-restricted epitopes, are quite variable although such epitopes are frequently smaller in size than 15 amino acid residues. From the foregoing, it is advantageous, but not essential, that the size of individual portions of the target antigen is at least 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 20, 25, 30 amino acid residues. Suitably, the size of individual portions is no more than about 500, 200, 100, 80, 60, 50, 40 amino acid residues. In certain advantageous embodiments, the size of individual portions is sufficient for presentation by an antigen-presenting cell of a T cell and/or a B cell epitope contained within the peptide.
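As a non-limiting illustration of the portion sizes discussed above, the following minimal sketch enumerates contiguous portions of a chosen size and extends a selected portion with natural flanking residues. The function names and the amino acid sequence are hypothetical and are used only to show the arithmetic; the flank length of 3 reflects the "2 to 3 natural flanking amino acid residues" mentioned above for class I-restricted epitopes.

```python
def contiguous_portions(antigen, size):
    """List every contiguous portion of the given size in the antigen."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [antigen[i:i + size] for i in range(len(antigen) - size + 1)]


def portion_with_flanks(antigen, start, length, flank):
    """Return a portion plus up to `flank` natural residues on each side."""
    begin = max(0, start - flank)            # clip at the N-terminus
    end = min(len(antigen), start + length + flank)  # clip at the C-terminus
    return antigen[begin:end]


# Hypothetical 22-residue sequence used only for illustration.
antigen = "MKTAYIAKQRQISFVKSHFSRQ"

nine_mers = contiguous_portions(antigen, 9)
print(len(nine_mers))                            # number of 9-residue portions
print(portion_with_flanks(antigen, 5, 9, 3))     # 9-mer at index 5 with flanks
```

A portion taken near either terminus simply receives fewer flanking residues, since none are available in the natural sequence.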

[0248] As will be appreciated by those of skill in the art, it is generally not necessary to immunize with a polypeptide that shares exactly the same amino acid sequence with the target antigen to produce an immune response to that antigen. In some embodiments, therefore, the polypeptide encoded by the synthetic polynucleotide is desirably a variant of at least a portion of the target antigen. "Variant" polypeptides include proteins derived from the target antigen by deletion (so-called truncation) or addition of one or more amino acids to the N-terminal and/or C-terminal end of the target antigen; deletion or addition of one or more amino acids at one or more sites in the target antigen; or substitution of one or more amino acids at one or more sites in the target antigen. Variant polypeptides encompassed by the present invention will have at least 40%, 50%, 60%, 70%, generally at least 75%, 80%, 85%, typically at least about 90% to 95% or more, and more typically at least about 96%, 97%, 98%, 99% or more sequence similarity or identity with the amino acid sequence of the target antigen or portion thereof as determined by sequence alignment programs described elsewhere herein using default parameters. A variant of a target antigen may differ from that antigen generally by as many as 1000, 500, 400, 300, 200, 100, 50 or 20 amino acid residues or suitably by as few as 1-15 amino acid residues, as few as 1-10, such as 6-10, as few as 5, as few as 4, 3, 2, or even 1 amino acid residue.
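For illustration only, the percentage identity and the number of differing residues between a target antigen and a variant can be computed from a pre-existing pairwise alignment as sketched below. This sketch assumes the two sequences have already been aligned (for example by an alignment program of the kind referred to above) and that gaps are written as "-"; the function names and example sequences are hypothetical, and alignment programs may define the denominator differently.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the aligned columns of two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == gap and b == gap)]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def count_differences(aligned_a, aligned_b):
    """Number of aligned columns at which the two sequences differ."""
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)


# Hypothetical target portion and a variant with one substitution (I -> L).
target = "MKTAYIAKQR"
variant = "MKTAYLAKQR"
print(percent_identity(target, variant))
print(count_differences(target, variant))
```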

[0249] Variant polypeptides corresponding to at least a portion of a target antigen may contain conservative amino acid substitutions at various locations along their sequence, as compared to the target antigen amino acid sequence. A "conservative amino acid substitution" is one in which the amino acid residue is replaced with an amino acid residue having a similar side chain. Families of amino acid residues having similar side chains have been defined in the art, which can be generally sub-classified as follows:

[0250] Acidic: The residue has a negative charge due to loss of H ion at physiological pH and the residue is attracted by aqueous solution so as to seek the surface positions in the conformation of a peptide in which it is contained when the peptide is in aqueous medium at physiological pH. Amino acids having an acidic side chain include glutamic acid and aspartic acid.

[0251] Basic: The residue has a positive charge due to association with H ion at physiological pH or within one or two pH units thereof (e.g., histidine) and the residue is attracted by aqueous solution so as to seek the surface positions in the conformation of a peptide in which it is contained when the peptide is in aqueous medium at physiological pH. Amino acids having a basic side chain include arginine, lysine and histidine.

[0252] Charged: The residues are charged at physiological pH and, therefore, include amino acids having acidic or basic side chains (i.e., glutamic acid, aspartic acid, arginine, lysine and histidine).

[0253] Hydrophobic: The residues are not charged at physiological pH and the residue is repelled by aqueous solution so as to seek the inner positions in the conformation of a peptide in which it is contained when the peptide is in aqueous medium. Amino acids having a hydrophobic side chain include tyrosine, valine, isoleucine, leucine, methionine, phenylalanine and tryptophan.

[0254] Neutral/polar: The residues are not charged at physiological pH, but the residue is not sufficiently repelled by aqueous solutions so that it would seek inner positions in the conformation of a peptide in which it is contained when the peptide is in aqueous medium. Amino acids having a neutral/polar side chain include asparagine, glutamine, cysteine, histidine, serine and threonine.

[0255] This description also characterizes certain amino acids as "small" since their side chains are not sufficiently large, even if polar groups are lacking, to confer hydrophobicity. With the exception of proline, "small" amino acids are those with four carbons or less when at least one polar group is on the side chain and three carbons or less when not. Amino acids having a small side chain include glycine, serine, alanine and threonine. The gene-encoded secondary amino acid proline is a special case due to its known effects on the secondary conformation of peptide chains. The structure of proline differs from all the other naturally-occurring amino acids in that its side chain is bonded to the nitrogen of the α-amino group, as well as the α-carbon. Several amino acid similarity matrices (e.g., PAM120 matrix and PAM250 matrix as disclosed for example by Dayhoff et al. (1978) A model of evolutionary change in proteins. Matrices for determining distance relationships In M. O. Dayhoff, (ed.), Atlas of protein sequence and structure, Vol. 5, pp. 345-358, National Biomedical Research Foundation, Washington D.C.; and by Gonnet et al., 1992, Science 256(5062): 1443-1445), however, include proline in the same group as glycine, serine, alanine and threonine. Accordingly, for the purposes of the present invention, proline is classified as a "small" amino acid.

[0256] The degree of attraction or repulsion required for classification as polar or nonpolar is arbitrary and, therefore, amino acids specifically contemplated by the invention have been classified as one or the other. Most amino acids not specifically named can be classified on the basis of known behavior.

[0257] Amino acid residues can be further sub-classified as cyclic or noncyclic, and aromatic or nonaromatic, self-explanatory classifications with respect to the side-chain substituent groups of the residues, and as small or large. The residue is considered small if it contains a total of four carbon atoms or less, inclusive of the carboxyl carbon, provided an additional polar substituent is present; three or less if not. Small residues are, of course, always nonaromatic. Dependent on their structural properties, amino acid residues may fall in two or more classes. For the naturally-occurring protein amino acids, sub-classification according to this scheme is presented in Table 10.

TABLE 10

Original Residue    Exemplary Substitutions
Ala                 Ser
Arg                 Lys
Asn                 Gln, His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Pro
His                 Asn, Gln
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 Met, Leu, Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp, Phe
Val                 Ile, Leu

[0258] Conservative amino acid substitution also includes groupings based on side chains. For example, a group of amino acids having aliphatic side chains is glycine, alanine, valine, leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is serine and threonine; a group of amino acids having amide-containing side chains is asparagine and glutamine; a group of amino acids having aromatic side chains is phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side chains is cysteine and methionine. For example, it is reasonable to expect that replacement of a leucine with an isoleucine or valine, an aspartate with a glutamate, a threonine with a serine, or a similar replacement of an amino acid with a structurally related amino acid will not have a major effect on the properties of the resulting variant polypeptide. Conservative substitutions are shown in Table 11 below under the heading of exemplary substitutions. More preferred substitutions are shown under the heading of preferred substitutions. Amino acid substitutions falling within the scope of the invention are, in general, accomplished by selecting substitutions that do not differ significantly in their effect on maintaining (a) the structure of the peptide backbone in the area of the substitution, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. After the substitutions are introduced, the variants are screened for biological activity.

TABLE 11
EXEMPLARY AND PREFERRED AMINO ACID SUBSTITUTIONS

Original Residue    Exemplary Substitutions                 Preferred Substitutions
Ala                 Val, Leu, Ile                           Val
Arg                 Lys, Gln, Asn                           Lys
Asn                 Gln, His, Lys, Arg                      Gln
Asp                 Glu                                     Glu
Cys                 Ser                                     Ser
Gln                 Asn, His, Lys                           Asn
Glu                 Asp, Lys                                Asp
Gly                 Pro                                     Pro
His                 Asn, Gln, Lys, Arg                      Arg
Ile                 Leu, Val, Met, Ala, Phe, Norleu         Leu
Leu                 Norleu, Ile, Val, Met, Ala, Phe         Ile
Lys                 Arg, Gln, Asn                           Arg
Met                 Leu, Ile, Phe                           Leu
Phe                 Leu, Val, Ile, Ala                      Leu
Pro                 Gly                                     Gly
Ser                 Thr                                     Thr
Thr                 Ser                                     Ser
Trp                 Tyr                                     Tyr
Tyr                 Trp, Phe, Thr, Ser                      Phe
Val                 Ile, Leu, Met, Phe, Ala, Norleu         Leu

[0259] Alternatively, similar amino acids for making conservative substitutions can be grouped into three categories based on the identity of the side chains. The first group includes glutamic acid, aspartic acid, arginine, lysine, histidine, which all have charged side chains; the second group includes glycine, serine, threonine, cysteine, tyrosine, glutamine, asparagine; and the third group includes leucine, isoleucine, valine, alanine, proline, phenylalanine, tryptophan, methionine, as described in Zubay, G., Biochemistry, third edition, Wm.C. Brown Publishers (1993).
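Purely by way of example, the three-category grouping described in the preceding paragraph can be expressed as a lookup that reports whether a proposed substitution stays within one category. This is a minimal sketch under the assumption that one-letter amino acid codes are used; the group labels and function names are introduced here for illustration and do not appear in the cited reference.

```python
# The three categories of the preceding paragraph, in one-letter codes.
SIDE_CHAIN_GROUPS = {
    "charged": set("EDRKH"),     # Glu, Asp, Arg, Lys, His
    "second": set("GSTCYQN"),    # Gly, Ser, Thr, Cys, Tyr, Gln, Asn
    "third": set("LIVAPFWM"),    # Leu, Ile, Val, Ala, Pro, Phe, Trp, Met
}


def group_of(residue):
    """Return the label of the category containing the residue."""
    for label, members in SIDE_CHAIN_GROUPS.items():
        if residue.upper() in members:
            return label
    raise ValueError(f"unrecognised residue: {residue!r}")


def is_conservative(original, replacement):
    """True when both residues fall in the same one of the three categories."""
    return group_of(original) == group_of(replacement)


print(is_conservative("L", "I"))  # leucine -> isoleucine
print(is_conservative("D", "L"))  # aspartic acid -> leucine
```

This grouping is coarse: it treats acidic and basic residues alike as "charged", so a finer scheme such as Table 11 may be preferable where the sign of the charge matters.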

[0260] The invention further contemplates a chimeric construct comprising a synthetic polynucleotide of the invention, which is operably linked to a regulatory sequence. The regulatory sequence suitably comprises transcriptional and/or translational control sequences, which will be compatible for expression in the organism of interest or in cells of that organism. Typically, the transcriptional and translational regulatory control sequences include, but are not limited to, a promoter sequence, a 5' non-coding region, a cis-regulatory region such as a functional binding site for transcriptional regulatory protein or translational regulatory protein, an upstream open reading frame, ribosomal-binding sequences, transcriptional start site, translational start site, and/or nucleotide sequence which encodes a leader sequence, termination codon, translational stop site and a 3' non-translated region. Constitutive or inducible promoters as known in the art are contemplated by the invention. The promoters may be either naturally occurring promoters, or hybrid promoters that combine elements of more than one promoter. Promoter sequences contemplated by the present invention may be native to the organism of interest or may be derived from an alternative source, where the region is functional in the chosen organism. The choice of promoter will differ depending on the intended host or cell or tissue type. For example, promoters which could be used for expression in mammals include the metallothionein promoter, which can be induced in response to heavy metals such as cadmium, the β-actin promoter as well as viral promoters such as the SV40 large T antigen promoter, human cytomegalovirus (CMV) immediate early (IE) promoter, Rous sarcoma virus LTR promoter, the mouse mammary tumor virus LTR promoter, the adenovirus major late promoter (Ad MLP), the herpes simplex virus promoter, and a HPV promoter, particularly the HPV upstream regulatory region (URR), among others. 
All these promoters are well described and readily available in the art.

[0261] Enhancer elements may also be used herein to increase expression levels of the mammalian constructs. Examples include the SV40 early gene enhancer, as described for example in Dijkema et al. (1985, EMBO J. 4:761), the enhancer/promoter derived from the long terminal repeat (LTR) of the Rous Sarcoma Virus, as described for example in Gorman et al., (1982, Proc. Natl. Acad. Sci. USA 79:6777) and elements derived from human CMV, as described for example in Boshart et al. (1985, Cell 41:521), such as elements included in the CMV intron A sequence.

[0262] The chimeric construct may also comprise a 3' non-translated sequence. A 3' non-translated sequence refers to that portion of a gene comprising a DNA segment that contains a polyadenylation signal and any other regulatory signals capable of effecting mRNA processing or gene expression. The polyadenylation signal is characterized by effecting the addition of polyadenylic acid tracts to the 3' end of the mRNA precursor. Polyadenylation signals are commonly recognized by the presence of homology to the canonical form 5'-AATAAA-3' although variations are not uncommon. The 3' non-translated regulatory DNA sequence preferably includes from about 50 to 1,000 nts and may contain transcriptional and translational termination sequences in addition to a polyadenylation signal and any other regulatory signals capable of effecting mRNA processing or gene expression.
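As a non-limiting illustration, the canonical polyadenylation signal referred to above can be located in a candidate 3' non-translated sequence with a simple pattern search. The sketch below looks only for exact matches to AATAAA and therefore will not find the variant signals that, as noted above, are not uncommon; the function name and the example sequence are hypothetical.

```python
import re

CANONICAL_SIGNAL = "AATAAA"


def find_polyadenylation_signals(dna):
    """Return 0-based start positions of each exact AATAAA match."""
    # A zero-width lookahead reports overlapping matches as well.
    pattern = re.compile("(?=" + CANONICAL_SIGNAL + ")")
    return [match.start() for match in pattern.finditer(dna.upper())]


# Hypothetical fragment of a 3' non-translated sequence.
fragment = "GCTAATAAAGGTTTAATAAACC"
print(find_polyadenylation_signals(fragment))
```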

[0263] In some embodiments, the chimeric construct further contains a selectable marker gene to permit selection of cells containing the construct. Selection genes are well known in the art and will be compatible for expression in the cell of interest.

[0264] It will be understood, however, that expression of protein-encoding polynucleotides in heterologous systems is now well known, and the present invention is not directed to or dependent on any particular vector, transcriptional control sequence or technique for expression of the polynucleotides. Rather, synthetic polynucleotides prepared according to the methods set forth herein may be introduced into a mammal in any suitable manner in the form of any suitable construct or vector, and the synthetic polynucleotides may be expressed with known transcription regulatory elements in any conventional manner.

[0265] In addition, chimeric constructs can be constructed that include sequences coding for adjuvants. Particularly suitable are detoxified mutants of bacterial ADP-ribosylating toxins, for example, diphtheria toxin, pertussis toxin (PT), cholera toxin (CT), Escherichia coli heat-labile toxins (LT1 and LT2), Pseudomonas exotoxin A, Clostridium botulinum C2 and C3 toxins, as well as toxins from C. perfringens, C. spiriforme and C. difficile. In some embodiments, the chimeric constructs include coding sequences for detoxified mutants of E. coli heat-labile toxins, such as the LT-K63 and LT-R72 detoxified mutants, described in U.S. Pat. No. 6,818,222. In some embodiments, the adjuvant is a protein-destabilising element, which increases processing and presentation of the polypeptide that corresponds to at least a portion of the target antigen through the class I MHC pathway, thereby leading to enhanced cell-mediated immunity against the polypeptide. Illustrative protein-destabilising elements include intracellular protein degradation signals or degrons which may be selected without limitation from a destabilising amino acid at the amino-terminus of a polypeptide of interest, a PEST region or a ubiquitin. For example, the coding sequence for the polypeptide can be modified to include a destabilising amino acid at its amino-terminus so that the protein so modified is subject to the N-end rule pathway as disclosed, for example, by Bachmair et al. in U.S. Pat. No. 5,093,242 and by Varshavsky et al. in U.S. Pat. No. 5,122,463. In some embodiments, the destabilising amino acid is selected from isoleucine and glutamic acid, especially from histidine, tyrosine and glutamine, and more especially from aspartic acid, asparagine, phenylalanine, leucine, tryptophan and lysine. In certain embodiments, the destabilising amino acid is arginine. In some proteins, the amino-terminal end is obscured as a result of the protein's conformation (i.e., its tertiary or quaternary structure). 
In these cases, more extensive alteration of the amino-terminus may be necessary to make the protein subject to the N-end rule pathway. For example, where simple addition or replacement of the single amino-terminal residue is insufficient because of an inaccessible amino-terminus, several amino acids (including lysine, the site of ubiquitin joining to substrate proteins) may be added to the original amino-terminus to increase the accessibility and/or segmental mobility of the engineered amino terminus. In some embodiments, a nucleic acid sequence encoding the amino-terminal region of the polypeptide can be modified to introduce a lysine residue in an appropriate context. This can be achieved most conveniently by employing DNA constructs encoding "universal destabilising segments". A universal destabilising segment comprises a nucleic acid construct which encodes a polypeptide structure, preferably segmentally mobile, containing one or more lysine residues, the codons for lysine residues being positioned within the construct such that when the construct is inserted into the coding sequence of the protein-encoding synthetic polynucleotide, the lysine residues are sufficiently spatially proximate to the amino-terminus of the encoded protein to serve as the second determinant of the complete amino-terminal degradation signal. The insertion of such constructs into the 5' portion of a polypeptide-encoding synthetic polynucleotide would provide the encoded polypeptide with a lysine residue (or residues) in an appropriate context for destabilization. In other embodiments, the polypeptide is modified to contain a PEST region, which is rich in an amino acid selected from proline, glutamic acid, serine and threonine, which region is optionally flanked by amino acids comprising electropositive side chains. 
In this regard, it is known that amino acid sequences of proteins with intracellular half-lives less than about 2 hours contain one or more regions rich in proline (P), glutamic acid (E), serine (S), and threonine (T) as for example shown by Rogers et al. (1986, Science 234 (4774): 364-368). In still other embodiments, the polypeptide is conjugated to a ubiquitin or a biologically active fragment thereof, to produce a modified polypeptide whose rate of intracellular proteolytic degradation is increased, enhanced or otherwise elevated relative to the unmodified polypeptide.
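For illustration only, two of the protein-destabilising features described above can be screened for in a candidate polypeptide sequence as sketched below. The sketch assumes one-letter amino acid codes, takes the first residue of the supplied sequence as the amino-terminus, and uses a crude composition measure (the fraction of P, E, S and T residues in a window) rather than any published PEST-scoring algorithm; the window length of 12 and all function names are hypothetical choices made here.

```python
# Destabilising amino-terminal residues named in the text above:
# Ile, Glu, His, Tyr, Gln, Asp, Asn, Phe, Leu, Trp, Lys and Arg.
DESTABILISING_N_TERMINAL = set("IEHYQDNFLWKR")


def has_destabilising_n_terminus(protein):
    """True when the first residue is one of the residues listed above."""
    return bool(protein) and protein[0].upper() in DESTABILISING_N_TERMINAL


def pest_fraction(segment):
    """Fraction of residues in the segment that are P, E, S or T."""
    if not segment:
        raise ValueError("segment must not be empty")
    return sum(1 for residue in segment.upper() if residue in "PEST") / len(segment)


def richest_pest_window(protein, window=12):
    """Return (start, fraction) for the window richest in P, E, S and T."""
    if len(protein) < window:
        raise ValueError("protein is shorter than the window")
    starts = range(len(protein) - window + 1)
    # max() keeps the first window when several share the top fraction.
    best = max(starts, key=lambda i: pest_fraction(protein[i:i + window]))
    return best, pest_fraction(protein[best:best + window])


# Hypothetical sequence with a P/E/S/T-rich stretch in the middle.
candidate = "AAAAPESTPESTAAAA"
print(has_destabilising_n_terminus("RGSAAA"))
print(richest_pest_window(candidate, window=8))
```

A high fraction in some window suggests only that a region is compositionally rich in these residues; whether it functions as a degradation signal would need to be established experimentally.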

[0266] One or more adjuvant polypeptides may be co-expressed with an `antigenic` polypeptide that corresponds to at least a portion of the target antigen. In certain embodiments, adjuvant and antigenic polypeptides may be co-expressed in the form of a fusion protein comprising one or more adjuvant polypeptides and one or more antigenic polypeptides. Alternatively, adjuvant and antigenic polypeptides may be co-expressed as separate proteins.

[0267] Furthermore, chimeric constructs can be constructed that include chimeric antigen-coding gene sequences, encoding, e.g., multiple antigens/epitopes of interest, for example derived from a single or from more than one target antigen. In certain embodiments, multi-cistronic cassettes (e.g., bi-cistronic cassettes) can be constructed allowing expression of multiple adjuvants and/or antigenic polypeptides from a single mRNA using, for example, the EMCV IRES, or the like. In other embodiments, adjuvants and/or antigenic polypeptides can be encoded on separate coding sequences that are operably connected to independent transcription regulatory elements.

[0268] In some embodiments, the chimeric constructs of the invention are in the form of expression vectors which are suitably selected from self-replicating extra-chromosomal vectors (e.g., plasmids) and vectors that integrate into a host genome. In illustrative examples of this type, the expression vectors are viral vectors, such as simian virus 40 (SV40) or bovine papilloma virus (BPV), which have the ability to replicate as extra-chromosomal elements (Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman ed., 1982; Sarver et al., 1981, Mol. Cell. Biol. 1:486). Viral vectors include retroviral (lentivirus), adeno-associated virus (see, e.g., Okada, 1996, Gene Ther. 3:957-964; Muzyczka, 1994, J. Clin. Invest. 94:1351; U.S. Pat. Nos. 6,156,303; 6,143,548; 5,952,221, describing AAV vectors; see also U.S. Pat. Nos. 6,004,799; 5,833,993), adenovirus (see, e.g., U.S. Pat. Nos. 6,140,087; 6,136,594; 6,133,028; 6,120,764), reovirus, herpesvirus, rotavirus genomes etc., modified for introducing and directing expression of a polynucleotide or transgene in cells. Retroviral vectors can include those based upon murine leukemia virus (see, e.g., U.S. Pat. No. 6,132,731), gibbon ape leukemia virus (see, e.g., U.S. Pat. No. 6,033,905), simian immuno-deficiency virus, human immuno-deficiency virus (see, e.g., U.S. Pat. No. 5,985,641), and combinations thereof.

[0269] Vectors also include those that efficiently deliver genes to animal cells in vivo (e.g., stem cells) (see, e.g., U.S. Pat. Nos. 5,821,235 and 5,786,340; Croyle et al., 1998, Gene Ther. 5:645; Croyle et al., 1998, Pharm. Res. 15:1348; Croyle et al., 1998, Hum. Gene Ther. 9:561; Foreman et al., 1998, Hum. Gene Ther. 9:1313; Wirtz et al., 1999, Gut 44:800). Adenoviral and adeno-associated viral vectors suitable for in vivo delivery are described, for example, in U.S. Pat. Nos. 5,700,470, 5,731,172 and 5,604,090. Additional vectors suitable for in vivo delivery include herpes simplex virus vectors (see, e.g., U.S. Pat. No. 5,501,979), retroviral vectors (see, e.g., U.S. Pat. Nos. 5,624,820, 5,693,508 and 5,674,703; and WO92/05266 and WO92/14829), bovine papilloma virus (BPV) vectors (see, e.g., U.S. Pat. No. 5,719,054), CMV-based vectors (see, e.g., U.S. Pat. No. 5,561,063) and parvovirus, rotavirus and Norwalk virus vectors. Lentiviral vectors are useful for infecting dividing as well as non-dividing cells (see, e.g., U.S. Pat. No. 6,013,516).

[0270] Additional viral vectors which will find use for delivering the nucleic acid molecules encoding the antigens of interest include those derived from the pox family of viruses, including vaccinia virus and avian poxvirus. By way of example, vaccinia virus recombinants expressing the chimeric constructs can be constructed as follows. The antigen coding sequence is first inserted into an appropriate vector so that it is adjacent to a vaccinia promoter and flanking vaccinia DNA sequences, such as the sequence encoding thymidine kinase (TK). This vector is then used to transfect cells that are simultaneously infected with vaccinia. Homologous recombination serves to insert the vaccinia promoter plus the gene encoding the coding sequences of interest into the viral genome. The resulting TK-negative recombinant can be selected by culturing the cells in the presence of 5-bromodeoxyuridine and picking viral plaques resistant thereto.

[0271] Alternatively, avipoxviruses, such as the fowlpox and canarypox viruses, can also be used to deliver the genes. Recombinant avipox viruses, expressing immunogens from mammalian pathogens, are known to confer protective immunity when administered to non-avian species. The use of an avipox vector is particularly desirable in human and other mammalian species since members of the avipox genus can only productively replicate in susceptible avian species and therefore are not infective in mammalian cells. Methods for producing recombinant avipoxviruses are known in the art and employ genetic recombination, as described above with respect to the production of vaccinia viruses. See, e.g., WO 91/12882; WO 89/03429; and WO 92/03545.

[0272] Molecular conjugate vectors, such as the adenovirus chimeric vectors described in Michael et al., J. Biol. Chem. (1993) 268:6866-6869 and Wagner et al., Proc. Natl. Acad. Sci. USA (1992) 89:6099-6103, can also be used for gene delivery.

[0273] Members of the Alphavirus genus, such as, but not limited to, vectors derived from the Sindbis virus (SIN), Semliki Forest virus (SFV), and Venezuelan Equine Encephalitis virus (VEE), will also find use as viral vectors for delivering the chimeric constructs of the present invention. For a description of Sindbis-virus derived vectors useful for the practice of the instant methods, see Dubensky et al. (1996, J. Virol. 70:508-519) and International Publication Nos. WO 95/07995 and WO 96/17072, as well as Dubensky, Jr., T. W., et al., U.S. Pat. No. 5,843,723, and Dubensky, Jr., T. W., U.S. Pat. No. 5,789,245. Exemplary vectors of this type are chimeric alphavirus vectors comprised of sequences derived from Sindbis virus and Venezuelan equine encephalitis virus. See, e.g., Perri et al. (2003, J. Virol. 77:10394-10403) and International Publication Nos. WO 02/099035, WO 02/080982, WO 01/81609, and WO 00/61772.

[0274] In other illustrative embodiments, lentiviral vectors are employed to deliver a chimeric construct of the invention into selected cells or tissues. Typically, these vectors comprise a 5' lentiviral LTR, a tRNA binding site, a packaging signal, a promoter operably linked to one or more genes of interest, an origin of second strand DNA synthesis and a 3' lentiviral LTR, wherein the lentiviral vector contains a nuclear transport element. The nuclear transport element may be located either upstream (5') or downstream (3') of a coding sequence of interest (for example, a synthetic Gag or Env expression cassette of the present invention). A wide variety of lentiviruses may be utilized within the context of the present invention, including for example, lentiviruses selected from the group consisting of HIV, HIV-1, HIV-2, FIV, BIV, EIAV, MVV, CAEV, and SIV. Illustrative examples of lentiviral vectors are described in PCT Publication Nos. WO 00/66759, WO 00/00600, WO 99/24465, WO 98/51810, WO 99/51754, WO 99/31251, WO 99/30742, and WO 99/15641. Desirably, a third generation self-inactivating (SIN) lentivirus is used. Commercial suppliers of third generation SIN lentiviruses include Invitrogen (ViraPower Lentiviral Expression System). Detailed methods for construction, transfection, harvesting, and use of lentiviral vectors are given, for example, in the Invitrogen technical manual "ViraPower Lentiviral Expression System version B 050102 25-0501", available at http://www.invitrogen.com/Content/Tech-Online/molecular_biology/manuals_pps/virapower_lentiviral_system_man.pdf. Lentiviral vectors have emerged as an efficient method for gene transfer. Improvements in biosafety characteristics have made these vectors suitable for use at biosafety level 2 (BL2). A number of safety features are incorporated into third generation SIN vectors. Deletion of the viral 3' LTR U3 region results in a provirus that is unable to transcribe a full length viral RNA. In addition, a number of essential genes are provided in trans, yielding a viral stock that is capable of but a single round of infection and integration. Lentiviral vectors have several advantages, including: 1) pseudotyping of the vector using amphotropic envelope proteins allows them to infect virtually any cell type; 2) gene delivery to quiescent, post mitotic, differentiated cells, including neurons, has been demonstrated; 3) their low cellular toxicity is unique among transgene delivery systems; 4) viral integration into the genome permits long term transgene expression; 5) their packaging capacity (6-14 kb) is much larger than that of other retroviral or adeno-associated viral vectors. In a recent demonstration of the capabilities of this system, lentiviral vectors expressing GFP were used to infect murine stem cells, resulting in live progeny, germline transmission, and promoter- and tissue-specific expression of the reporter (Ailles, L. E. and Naldini, L., HIV-1-Derived Lentiviral Vectors. In: Trono, D. (Ed.), Lentiviral Vectors, Springer-Verlag, Berlin, Heidelberg, New York, 2002, pp. 31-52). An example of the current generation vectors is outlined in FIG. 2 of a review by Lois et al. (2002, Science 295:868-872).

[0275] The chimeric construct can also be delivered without a vector. For example, the chimeric construct can be packaged as DNA or RNA in liposomes prior to delivery to the subject or to cells derived therefrom. Lipid encapsulation is generally accomplished using liposomes which are able to stably bind or entrap and retain nucleic acid. The ratio of condensed DNA to lipid preparation can vary but will generally be around 1:1 (mg DNA:micromoles lipid), or more of lipid. For a review of the use of liposomes as carriers for delivery of nucleic acids, see Hug and Sleight (1991, Biochim. Biophys. Acta 1097:1-17); and Straubinger et al., in Methods in Enzymology (1983), Vol. 101, pp. 512-527.

[0276] Liposomal preparations for use in the present invention include cationic (positively charged), anionic (negatively charged) and neutral preparations, with cationic liposomes particularly preferred. Cationic liposomes have been shown to mediate intracellular delivery of plasmid DNA (Felgner et al., 1987, Proc. Natl. Acad. Sci. USA 84:7413-7416); mRNA (Malone et al., 1989, Proc. Natl. Acad. Sci. USA 86:6077-6081); and purified transcription factors (Debs et al., 1990, J. Biol. Chem. 265:10189-10192), in functional form.

[0277] Cationic liposomes are readily available. For example, N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium (DOTMA) liposomes are available under the trademark Lipofectin, from GIBCO BRL, Grand Island, N.Y. (See, also, Felgner et al., 1987, Proc. Natl. Acad. Sci. USA 84:7413-7416). Other commercially available lipids include DDAB/DOPE and DOTAP/DOPE (Boehringer). Alternative cationic liposomes can be prepared from readily available materials using techniques well known in the art. See, e.g., Szoka et al., 1978, Proc. Natl. Acad. Sci. USA 75:4194-4198; PCT Publication No. WO 90/11092 for a description of the synthesis of DOTAP (1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.

[0278] Similarly, anionic and neutral liposomes are readily available, such as, from Avanti Polar Lipids (Birmingham, Ala.), or can be easily prepared using readily available materials. Such materials include phosphatidyl choline, cholesterol, phosphatidyl ethanolamine, dioleoylphosphatidyl choline (DOPC), dioleoylphosphatidyl glycerol (DOPG), dioleoylphosphatidyl ethanolamine (DOPE), among others. These materials can also be mixed with the DOTMA and DOTAP starting materials in appropriate ratios. Methods for making liposomes using these materials are well known in the art.

[0279] The liposomes can comprise multilamellar vesicles (MLVs), small unilamellar vesicles (SUVs), or large unilamellar vesicles (LUVs). The various liposome-nucleic acid complexes are prepared using methods known in the art. See, e.g., Straubinger et al., in Methods in Enzymology (1983), Vol. 101, pp. 512-527; Szoka et al., 1978, Proc. Natl. Acad. Sci. USA 75:4194-4198; Papahadjopoulos et al., 1975, Biochim. Biophys. Acta 394:483; Wilson et al., 1979, Cell 17:77; Deamer and Bangham, 1976, Biochim. Biophys. Acta 443:629; Ostro et al., 1977, Biochem. Biophys. Res. Commun. 76:836; Fraley et al., 1979, Proc. Natl. Acad. Sci. USA 76:3348; Enoch and Strittmatter, 1979, Proc. Natl. Acad. Sci. USA 76:145; Fraley et al., 1980, J. Biol. Chem. 255:10431; Szoka and Papahadjopoulos, 1978, Proc. Natl. Acad. Sci. USA 75:145; and Schaefer-Ridder et al., 1982, Science 215:166.

[0280] The chimeric construct can also be delivered in cochleate lipid compositions similar to those described by Papahadjopoulos et al., 1975, Biochim. Biophys. Acta 394:483-491. See, also, U.S. Pat. Nos. 4,663,161 and 4,871,488.

[0281] The chimeric construct may also be encapsulated, adsorbed to, or associated with, particulate carriers. Such carriers present multiple copies of a selected chimeric construct to the immune system. The particles can be taken up by professional antigen presenting cells such as macrophages and dendritic cells, and/or can enhance antigen presentation through other mechanisms such as stimulation of cytokine release. Examples of particulate carriers include those derived from polymethyl methacrylate polymers, as well as microparticles derived from poly(lactides) and poly(lactide-co-glycolides), known as PLG. See, e.g., Jeffery et al., 1993, Pharm. Res. 10:362-368; McGee J. P., et al., 1997, J. Microencapsul. 14(2):197-210; O'Hagan D. T., et al., 1993, Vaccine 11(2):149-54.

[0282] Furthermore, other particulate systems and polymers can be used for the in vivo delivery of the chimeric construct. For example, polymers such as polylysine, polyarginine, polyornithine, spermine, spermidine, as well as conjugates of these molecules, are useful for transferring a nucleic acid of interest. Similarly, DEAE dextran-mediated transfection, calcium phosphate precipitation or precipitation using other insoluble inorganic salts, such as strontium phosphate, aluminum silicates including bentonite and kaolin, chromic oxide, magnesium silicate, talc, and the like, will find use with the present methods. See, e.g., Felgner, P. L., Advanced Drug Delivery Reviews (1990) 5:163-187, for a review of delivery systems useful for gene transfer. Peptoids (Zuckerman, R. N., et al., U.S. Pat. No. 5,831,005, issued Nov. 3, 1998) may also be used for delivery of a construct of the present invention.

[0283] Additionally, biolistic delivery systems employing particulate carriers such as gold and tungsten are especially useful for delivering chimeric constructs of the present invention. The particles are coated with the synthetic expression cassette(s) to be delivered and accelerated to high velocity, generally under a reduced atmosphere, using a gunpowder discharge from a "gene gun." For a description of such techniques, and apparatuses useful therefor, see, e.g., U.S. Pat. Nos. 4,945,050; 5,036,006; 5,100,792; 5,179,022; 5,371,015; and 5,478,744. In illustrative examples, gas-driven particle acceleration can be achieved with devices such as those manufactured by PowderMed Pharmaceuticals PLC (Oxford, UK) and PowderMed Vaccines Inc. (Madison, Wis.), some examples of which are described in U.S. Pat. Nos. 5,846,796; 6,010,478; 5,865,796; 5,584,807; and EP Patent No. 0500 799. This approach offers needle-free delivery in which a dry powder formulation of microscopic particles, such as polynucleotide or polypeptide particles, is accelerated to high speed within a helium gas jet generated by a hand held device, propelling the particles into a target tissue of interest. Other devices and methods that may be useful for gas-driven needle-less injection of compositions of the present invention include those provided by Bioject, Inc. (Portland, Oreg.), some examples of which are described in U.S. Pat. Nos. 4,790,824; 5,064,413; 5,312,335; 5,383,851; 5,399,163; 5,520,639 and 5,993,412.

[0284] Alternatively, micro-cannula- and microneedle-based devices (such as those being developed by Becton Dickinson and others) can be used to administer the chimeric constructs of the invention. Illustrative devices of this type are described in EP 1 092 444 A1, and U.S. application Ser. No. 606,909, filed Jun. 29, 2000. Standard steel cannula can also be used for intra-dermal delivery using devices and methods as described in U.S. Ser. No. 417,671, filed Oct. 14, 1999. These methods and devices include the delivery of substances through narrow gauge (about 30 G) "micro-cannula" with limited depth of penetration, as defined by the total length of the cannula or the total length of the cannula that is exposed beyond a depth-limiting feature. It is within the scope of the present invention that targeted delivery of substances including chimeric constructs can be achieved either through a single microcannula or an array of microcannula (or "microneedles"), for example 3-6 microneedles mounted on an injection device that may include or be attached to a reservoir in which the substance to be administered is contained.

10. Compositions

[0285] The invention also provides compositions, particularly immunomodulating compositions, comprising one or more of the chimeric constructs described herein. The immunomodulating compositions may comprise a mixture of chimeric constructs, which in turn may be delivered, for example, using the same or different vectors or vehicles. Antigens may be administered individually or in combination, in e.g., prophylactic (i.e., to prevent infection or disease) or therapeutic (to treat infection or disease) immunomodulating compositions. The immunomodulating compositions may be given more than once (e.g., a "prime" administration followed by one or more "boosts") to achieve the desired effects. The same composition can be administered in one or more priming and one or more boosting steps. Alternatively, different compositions can be used for priming and boosting.

[0286] The immunomodulating compositions will generally include one or more "pharmaceutically acceptable excipients or vehicles" such as water, saline, glycerol, ethanol, etc. Additionally, auxiliary substances, such as wetting or emulsifying agents, pH buffering substances, and the like, may be present in such vehicles.

[0287] Immunomodulating compositions will typically, in addition to the components mentioned above, comprise one or more "pharmaceutically acceptable carriers." These include any carrier which does not itself induce the production of antibodies harmful to the individual receiving the composition. Suitable carriers typically are large, slowly metabolized macromolecules such as proteins, polysaccharides, polylactic acids, polyglycolic acids, polymeric amino acids, amino acid copolymers, and lipid aggregates (such as oil droplets or liposomes). Such carriers are well known to those of ordinary skill in the art. A composition may also contain a diluent, such as water, saline, glycerol, etc. Additionally, an auxiliary substance, such as a wetting or emulsifying agent, pH buffering substance, and the like, may be present. A thorough discussion of pharmaceutically acceptable components is available in Gennaro (2000) Remington: The Science and Practice of Pharmacy. 20th ed., ISBN: 0683306472.

[0288] Pharmaceutically compatible salts can also be used in compositions of the invention, for example, mineral salts such as hydrochlorides, hydrobromides, phosphates, or sulfates, as well as salts of organic acids such as acetates, propionates, malonates, or benzoates. Especially useful protein substrates are serum albumins, keyhole limpet hemocyanin, immunoglobulin molecules, thyroglobulin, ovalbumin, tetanus toxoid, and other proteins well known to those of skill in the art.

[0289] The chimeric constructs of the invention can also be adsorbed to, entrapped within or otherwise associated with liposomes and particulate carriers such as PLG.

[0290] The chimeric constructs of the present invention are formulated into compositions for delivery to a mammal. These compositions may either be prophylactic (to prevent infection) or therapeutic (to treat disease after infection). The compositions will comprise a "therapeutically effective amount" of the gene of interest such that an amount of the antigen can be produced in vivo so that an immune response is generated in the individual to which it is administered. The exact amount necessary will vary depending on the subject being treated; the age and general condition of the subject to be treated; the capacity of the subject's immune system to synthesize antibodies; the degree of protection desired; the severity of the condition being treated; the particular antigen selected and its mode of administration, among other factors. An appropriate effective amount can be readily determined by one of skill in the art. Thus, a "therapeutically effective amount" will fall in a relatively broad range that can be determined through routine trials.

[0291] Once formulated, the compositions of the invention can be administered directly to the subject (e.g., as described above). Direct delivery of chimeric construct-containing compositions in vivo will generally be accomplished with or without vectors, as described above, by injection using either a conventional syringe, needleless devices such as Bioject™, a gene gun such as the Accell™ gene delivery system (PowderMed Ltd, Oxford, England), or a microneedle device. The constructs can be delivered (e.g., injected) either subcutaneously, epidermally, intradermally, intramuscularly, intravenously, intramucosally (such as nasally, rectally and vaginally), intraperitoneally or orally. Delivery of nucleic acid into cells of the epidermis is particularly preferred as this mode of administration provides access to skin-associated lymphoid cells and provides for a transient presence of nucleic acid (e.g., DNA) in the recipient. Other modes of administration include oral ingestion and pulmonary administration, suppositories, needle-less injection, transcutaneous, topical, and transdermal applications. Dosage treatment may be a single dose schedule or a multiple dose schedule.

[0292] In order that the invention may be readily understood and put into practical effect, particular preferred embodiments will now be described by way of the following non-limiting examples.

EXAMPLES

Example 1

Synthetic Construct System for Determining the Immune Response Preference of Codons in Mammals

Materials and Methods

[0293] Primer Design/synthesis and Sequence Manipulation

[0294] Oligonucleotides for site-directed mutagenesis were designed according to the guidelines included in the mutagenesis kit manuals (Quikchange II Site-directed Mutagenesis kit or Quikchange Multi Site-directed Mutagenesis Kit; Stratagene, La Jolla Calif.). These primers were synthesized and PAGE purified by Sigma (formerly Proligo).

[0295] Oligonucleotides for whole gene synthesis were designed by eye and synthesized by Sigma (formerly Proligo). The primers were supplied as standard desalted oligos. No additional purification of the oligonucleotides was carried out.

[0296] Sequence manipulation and analysis were carried out using the suite of programs on Biomanager (ANGIS) and various other web-based programs including BLAST at NCBI (http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi), NEBcutter V2.0 from New England Biolabs (http://tools.neb.com/NEBcutter2/index.php), the Translate Tool on ExPASy (http://au.expasy.org/tools/dna.html), and the SignalP 3.0 server (http://www.cbs.dtu.dk/services/SignalP/).

[0297] Standard Cloning Techniques

[0298] Restriction enzyme digests, alkaline phosphatase treatments and ligations were carried out according to the enzyme manufacturers' instructions (various manufacturers including New England Biolabs, Roche and Fermentas).

[0299] Purification of DNA from agarose gels and preparation of mini-prep DNA were carried out using commercial kits (Qiagen, Bio-Rad, Macherey-Nagel).

[0300] Agarose gel electrophoresis, phenol/chloroform extraction of contaminant protein from DNA, ethanol precipitation of DNA and other basic molecular biological procedures were carried out using standard protocols, similar to those described in Current Protocols in Molecular Biology (Ebook available via Wiley InterScience; edited by Ausubel et al.).

[0301] Sequencing was carried out by the Australian Genome Research Facility (AGRF, Brisbane).

[0302] Whole Gene Synthesis

[0303] Overlapping ~35-50mer oligonucleotides (Sigma-Proligo) were used to synthesize longer DNA sequences. Restriction enzyme sites were incorporated to facilitate cloning. The method used to synthesize the fragments is based on that given in Smith et al. (2003). First, oligonucleotides for the top or bottom strand were mixed and then phosphorylated using T4 polynucleotide kinase (PNK; New England Biolabs). The oligonucleotide mixes were then purified from the PNK by a standard phenol/chloroform extraction and sodium acetate/ethanol (NaAc/EtOH) precipitation. Equal volumes of oligonucleotide mixes for the top and bottom strands were then mixed and the oligonucleotides denatured by heating at 95 °C for 2 min. The oligonucleotides were annealed by slowly cooling the sample to 55 °C and the annealed oligonucleotides ligated using Taq ligase (New England Biolabs). The resulting fragment was purified by phenol/CHCl3 extraction and NaAc/EtOH precipitation.
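By way of illustration only, the layout of overlapping top-strand and bottom-strand oligonucleotides can be sketched in Python as follows. This is a naive tiling under stated assumptions, not the design procedure actually used (the oligonucleotides herein were designed by eye): the function names `reverse_complement` and `tile_oligos`, the fixed 40-nucleotide length and the half-length offset are all hypothetical choices made for the example.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def tile_oligos(seq, oligo_len=40):
    """Tile a sequence with contiguous top-strand oligos and with
    bottom-strand oligos shifted by half an oligo length, so that each
    bottom-strand oligo bridges a junction between two top-strand oligos.

    Assumption: fixed-length tiling; the terminal oligos may be shorter.
    """
    seq = seq.upper()
    offset = oligo_len // 2
    # Top strand: adjacent windows covering the whole sequence.
    top = [seq[i:i + oligo_len] for i in range(0, len(seq), oligo_len)]
    # Bottom strand: windows starting half an oligo in, written 5'->3'.
    bottom = [
        reverse_complement(seq[i:i + oligo_len])
        for i in range(offset, len(seq), oligo_len)
    ]
    return top, bottom


# Hypothetical 100-nt target used only to exercise the function.
target = "ATGC" * 25
top_oligos, bottom_oligos = tile_oligos(target, oligo_len=40)
print(len(top_oligos), len(bottom_oligos))
```

In this sketch the first half-oligo of the top strand is left single-stranded, which is consistent with a subsequent step in which the ends of the annealed fragment are filled in.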

[0304] The ends of the fragments were filled in and the fragments then amplified, using the outermost forward and reverse primers, with the Clontech Advantage HF 2 PCR kit (Clontech) according to the manufacturer's instructions. To fill in the ends the following PCR was used: 35 cycles of a denaturation step of 94 °C for 15 s, a slow annealing step where the temperature was ramped down to 55 °C over 7 min and then kept at 55 °C for 2 min, and an elongation step of 72 °C for 6 min. A final elongation step for 7 min at 72 °C was then carried out. The second PCR to amplify the fragment involved: an initial denaturation step at 94 °C for 30 s, followed by 25 cycles of 94 °C for 15 s, 55 °C for 30 s and 68 °C for 1 min, and a final elongation step of 68 °C for 3 min.
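As a rough guide to the run times implied by the two programs above, the hold times can be summed as in the following Python sketch. The helper name `program_seconds` is hypothetical, and the calculation assumes that only the stated holds and the stated 7 minute annealing ramp contribute; instrument ramping between other steps is ignored, so actual run times would be somewhat longer.

```python
def program_seconds(initial, cycles, steps, final):
    """Total programmed time in seconds: an initial hold, then `cycles`
    repeats of the per-cycle `steps`, then a final hold."""
    return initial + cycles * sum(steps) + final


# Fill-in PCR: 35 cycles of 15 s denaturation, 7 min ramp to 55 C,
# 2 min at 55 C and 6 min elongation, then a 7 min final elongation.
fill_in = program_seconds(
    initial=0, cycles=35, steps=[15, 7 * 60, 2 * 60, 6 * 60], final=7 * 60
)

# Amplification PCR: 30 s initial denaturation, 25 cycles of
# 15 s / 30 s / 1 min, then a 3 min final elongation.
amplify = program_seconds(
    initial=30, cycles=25, steps=[15, 30, 60], final=3 * 60
)

print(fill_in / 3600)  # 9.0125 hours
print(amplify / 60)    # 47.25 minutes
```

On these assumptions the fill-in program runs for about nine hours, most of it spent in the slow annealing ramp and the long elongation step, whereas the amplification program takes under an hour.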

[0305] The fragments were then purified by gel electrophoresis, digested and ligated into the relevant vector. Following transformation of E. coli with the ligation mixture, mini-preps were made for multiple colonies and the inserts sequenced. Sometimes it was not possible to isolate clones with entirely correct sequence. In those cases the errors were fixed by single or multi site-directed mutagenesis.

[0306] Site-directed Mutagenesis

[0307] Mutagenesis was carried out using the Quikchange II Site-directed Mutagenesis kit or Quikchange Multi Site-directed Mutagenesis Kit (Stratagene, La Jolla Calif.), with appropriate PAGE (polyacrylamide gel electrophoresis)-purified primers (Sigma), according to the manufacturer's instructions.

[0308] Preparation of Constructs

[0309] The details of the constructs used to generate the codon preference table are summarized in TABLE 12. All constructs were made using pcDNA3 from Invitrogen and were verified by sequencing prior to use.

TABLE-US-00012 TABLE 12 SUMMARY OF SECRETORY E7 CONSTRUCT SERIES 1 AND 2

Construct	AA & Codon	CU of Sec Seq	CU of E7	E7 Protein

Control Constructs
IgkC1	N/A	wt	wt	non-onc
IgkC2	N/A	mc	mc	non-onc
IgkC3	N/A	wt	wt	onc
IgkC4	N/A	mc	mc	onc

Secretory E7 construct series 1
IgkS1-1	Ala GCG	wt	wt with all Ala gcg	non-onc
IgkS1-2	Ala GCA	wt	wt with all Ala gca	non-onc
IgkS1-3	Ala GCT	wt	wt with all Ala gct	non-onc
IgkS1-4	Ala GCC	wt	wt with all Ala gcc	non-onc
IgkS1-5	Arg AGG	wt	wt with all Arg agg	non-onc
IgkS1-6	Arg AGA	wt	wt with all Arg aga	non-onc
IgkS1-7	Arg CGG	wt	wt with all Arg cgg	non-onc
IgkS1-8	Arg CGA	wt	wt with all Arg cga	non-onc
IgkS1-9	Arg CGT	wt	wt with all Arg cgt	non-onc
IgkS1-10	Arg CGC	wt	wt with all Arg cgc	non-onc
IgkS1-11	Asn AAT	wt	wt with all Asn aat	non-onc
IgkS1-12	Asn AAC	wt	wt with all Asn aac	non-onc
IgkS1-13	Asp GAT	wt with all Asp gat	wt with all Asp gat	non-onc
IgkS1-14	Asp GAC	wt with all Asp gac	wt with all Asp gac	non-onc
IgkS1-15	Cys TGT	wt	wt with all Cys tgt	non-onc
IgkS1-16	Cys TGC	wt	wt with all Cys tgc	non-onc
IgkS1-17	Glu GAG	wt with all Glu gag	wt with all Glu gag	non-onc
IgkS1-18	Glu GAA	wt with all Glu gaa	wt with all Glu gaa	non-onc
IgkS1-19	Gln CAG	wt	wt with all Gln cag	non-onc
IgkS1-20	Gln CAA	wt	wt with all Gln caa	non-onc
IgkS1-21	Gly GGG	wt with all Gly ggg	wt with all Gly ggg	non-onc
IgkS1-22	Gly GGA	wt with all Gly gga	wt with all Gly gga	non-onc
IgkS1-23	Gly GGT	wt with all Gly ggt	wt with all Gly ggt	non-onc
IgkS1-24	Gly GGC	wt with all Gly ggc	wt with all Gly ggc	non-onc
IgkS1-25	His CAT	wt	wt with all His cat	non-onc
IgkS1-26	His CAC	wt	wt with all His cac	non-onc
IgkS1-27	Ile ATA	wt	wt with all Ile ata	non-onc
IgkS1-28	Ile ATT	wt	wt with all Ile att	non-onc
IgkS1-29	Ile ATC	wt	wt with all Ile atc	non-onc
IgkS1-30	Lys AAG	wt	wt with all Lys aag	non-onc
IgkS1-31	Lys AAA	wt	wt with all Lys aaa	non-onc
IgkS1-32	Phe TTT	wt	wt with all Phe ttt	non-onc L15F, L22F
IgkS1-33	Phe TTC	wt	wt with all Phe ttc	non-onc L15F, L22F
IgkS1-34	Ser AGT	wt with all Ser agt	wt with all Ser agt	non-onc
IgkS1-35	Ser AGC	wt with all Ser agc	wt with all Ser agc	non-onc
IgkS1-36	Ser TCG	wt with all Ser tcg	wt with all Ser tcg	non-onc
IgkS1-37	Ser TCA	wt with all Ser tca	wt with all Ser tca	non-onc
IgkS1-38	Ser TCT	wt with all Ser tct	wt with all Ser tct	non-onc
IgkS1-39	Ser TCC	wt	wt with all Ser tcc	non-onc
IgkS1-40	Thr ACG	wt with all Thr acg	wt with all Thr acg	non-onc
IgkS1-41	Thr ACA	wt with all Thr aca	wt with all Thr aca	non-onc
IgkS1-42	Thr ACT	wt with all Thr act	wt with all Thr act	non-onc
IgkS1-43	Thr ACC	wt with all Thr acc	wt with all Thr acc	non-onc
IgkS1-44	Tyr TAT	wt	wt with all Tyr tat	non-onc
IgkS1-45	Tyr TAC	wt	wt with all Tyr tac	non-onc
IgkS1-46	Val GTG	wt with all Val gtg	wt with all Val gtg	non-onc
IgkS1-47	Val GTA	wt with all Val gta	wt with all Val gta	non-onc
IgkS1-48	Val GTT	wt with all Val gtt	wt with all Val gtt	non-onc
IgkS1-49	Val GTC	wt with all Val gtc	wt with all Val gtc	non-onc
IgkS1-50	Leu CTG	altered with Leu ctg	altered with Leu ctg	onc
IgkS1-51	Leu CTA	altered with Leu cta	altered with Leu cta	onc
IgkS1-52	Leu CTT	altered with Leu ctt	altered with Leu ctt	onc
IgkS1-53	Leu CTC	altered with Leu ctc	altered with Leu ctc	onc
IgkS1-54	Leu TTG	altered with Leu ttg	altered with Leu ttg	onc
IgkS1-55	Leu TTA	altered with Leu tta	altered with Leu tta	onc
IgkS1-56	Pro CCG	altered with Pro ccg	altered with Pro ccg	onc
IgkS1-57	Pro CCA	altered with Pro cca	altered with Pro cca	onc
IgkS1-58	Pro CCT	altered with Pro cct	altered with Pro cct	onc
IgkS1-59	Pro CCC	altered with Pro ccc	altered with Pro ccc	onc

Secretory E7 construct series 2
IgkS2-1	Ala GCG	mc	mc	linkerA-onc
IgkS2-2	Ala GCA	mc	mc	linkerA-onc
IgkS2-3	Ala GCT	mc	mc	linkerA-onc
IgkS2-4	Ala GCC	mc	mc	linkerA-onc
IgkS2-5	Arg AGG	mc	mc	linkerR-onc
IgkS2-6	Arg AGA	mc	mc	linkerR-onc
IgkS2-7	Arg CGG	mc	mc	linkerR-onc
IgkS2-8	Arg CGA	mc	mc	linkerR-onc
IgkS2-9	Arg CGT	mc	mc	linkerR-onc
IgkS2-10	Arg CGC	mc	mc	linkerR-onc
IgkS2-11	Asn AAT	mc	mc	linkerN-onc
IgkS2-12	Asn AAC	mc	mc	linkerN-onc
IgkS2-13	Asp GAT	wt with all Asp gat	wt with all Asp gat	onc
IgkS2-14	Asp GAC	wt with all Asp gac	wt with all Asp gac	onc
IgkS2-15	Cys TGT	wt	wt with all Cys tgt	onc
IgkS2-16	Cys TGC	wt	wt with all Cys tgc	onc
IgkS2-17	Glu GAG	wt with all Glu gag	wt with all Glu gag	onc
IgkS2-18	Glu GAA	wt with all Glu gaa	wt with all Glu gaa	onc
IgkS2-19	Gln CAG	wt	wt with all Gln cag	onc
IgkS2-20	Gln CAA	wt	wt with all Gln caa	onc
IgkS2-21	Gly GGG	wt with all Gly ggg	wt with all Gly ggg	onc
IgkS2-22	Gly GGA	wt with all Gly gga	wt with all Gly gga	onc
IgkS2-23	Gly GGT	wt with all Gly ggt	wt with all Gly ggt	onc
IgkS2-24	Gly GGC	wt with all Gly ggc	wt with all Gly ggc	onc
IgkS2-25	His CAT	mc	mc	linkerH-onc
IgkS2-26	His CAC	mc	mc	linkerH-onc
IgkS2-27	Ile ATA	wt	wt with all Ile ata	onc
IgkS2-28	Ile ATT	wt	wt with all Ile att	onc
IgkS2-29	Ile ATC	wt	wt with all Ile atc	onc
IgkS2-30	Lys AAG	mc	mc	linkerK-onc
IgkS2-31	Lys AAA	mc	mc	linkerK-onc
IgkS2-32	Phe TTT	mc	mc	linkerF-onc
IgkS2-33	Phe TTC	mc	mc	linkerF-onc
IgkS2-34	Ser AGT	wt with all Ser agt	wt with all Ser agt	onc
IgkS2-35	Ser AGC	wt with all Ser agc	wt with all Ser agc	onc
IgkS2-36	Ser TCG	wt with all Ser tcg	wt with all Ser tcg	onc
IgkS2-37	Ser TCA	wt with all Ser tca	wt with all Ser tca	onc
IgkS2-38	Ser TCT	wt with all Ser tct	wt with all Ser tct	onc
IgkS2-39	Ser TCC	wt	wt with all Ser tcc	onc
IgkS2-40	Thr ACG	wt with all Thr acg	wt with all Thr acg	onc
IgkS2-41	Thr ACA	wt with all Thr aca	wt with all Thr aca	onc
IgkS2-42	Thr ACT	wt with all Thr act	wt with all Thr act	onc
IgkS2-43	Thr ACC	wt with all Thr acc	wt with all Thr acc	onc
IgkS2-44	Tyr TAT	mc	mc	linkerY-onc
IgkS2-45	Tyr TAC	mc	mc	linkerY-onc
IgkS2-46	Val GTG	wt with all Val gtg	wt with all Val gtg	onc
IgkS2-47	Val GTA	wt with all Val gta	wt with all Val gta	onc
IgkS2-48	Val GTT	wt with all Val gtt	wt with all Val gtt	onc
IgkS2-49	Val GTC	wt with all Val gtc	wt with all Val gtc	onc
IgkS2-11b	Asn AAT	wt	wt with all Asn aat	linkerN-non-onc
IgkS2-12b	Asn AAC	wt	wt with all Asn aac	linkerN-non-onc

AA = amino acid, CU = codon usage, mc = mammalian consensus, wt = wild-type, onc = oncogenic, non-onc = non-oncogenic, Sec seq = secretory sequence, N/A = not applicable

[0310] Control Constructs

[0311] Control E7 constructs were based on those from Liu et al. (2002). Both oncogenic (i.e., wild-type) and non-oncogenic E7 control constructs were made with wild-type or mammalian consensus codon usage. "Non-oncogenic" E7 is E7 with D21G, C24G and E26G mutations, i.e., with mutations that have been reported to render E7 non-transforming (Edmonds and Vousden, 1989; Heck et al., 1992).
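The substitution notation used above (original residue, 1-based position, replacement residue) can be illustrated with the following Python sketch. The function name `apply_mutations` is hypothetical, and the peptide used is a made-up placeholder that merely carries D, C and E at positions 21, 24 and 26; it is not the E7 sequence.

```python
import re


def apply_mutations(protein, mutations):
    """Apply substitutions written as e.g. 'D21G' to a protein sequence.

    Positions are 1-based. A ValueError is raised if the notation is
    malformed, the position is out of range, or the residue found at
    that position does not match the residue named in the mutation.
    """
    residues = list(protein)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        if pos < 1 or pos > len(residues):
            raise ValueError(f"position {pos} is outside the sequence")
        if residues[pos - 1] != old:
            raise ValueError(
                f"expected {old} at position {pos}, found {residues[pos - 1]}"
            )
        residues[pos - 1] = new
    return "".join(residues)


# Placeholder peptide (NOT E7): 20 alanines, then D L Y C Y E, then 4 alanines.
placeholder = "A" * 20 + "DLYCYE" + "A" * 4
mutant = apply_mutations(placeholder, ["D21G", "C24G", "E26G"])
print(mutant[20:26])  # GLYGYG
```

Checking the original residue before substituting guards against off-by-one numbering errors, which are easy to introduce when a secretory sequence or linker is fused upstream of the coding sequence.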

[0312] The secretory sequence was derived from Mus musculus IgK RNA for the anti-HLA-DR antibody light chain (GenBank accession number D84070). For some constructs the codon usage of this sequence was modified.

[0313] Wild-type Codon Usage Control Constructs:

[0314] The wild-type (wt) codon usage E7 construct from Liu et al. was used as the template in a site-directed mutagenesis PCR to make the wt codon usage non-oncogenic E7 construct.

[0315] The non-oncogenic and oncogenic wild-type codon usage E7 sequences were amplified to incorporate a 5' BamHI site and a 3' EcoRI site. The resulting fragments were cloned into BamHI and EcoRI cut pcDNA3 and sequenced. The secretory fragment was made by whole gene synthesis using wild-type codon usage with flanking KpnI and BamHI sites. The Kozak-secretory fragments were then ligated into KpnI/BamHI cut pcDNA3-wtE7 (non-oncogenic or oncogenic) to make pcDNA3-Igk-nE7 and pcDNA3-Igk-E7 (named IgkC1 and IgkC3 respectively; see TABLE 12). The identity of the constructs was confirmed by sequencing.
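The logic of adding flanking restriction sites to an insert while confirming that neither site occurs within the insert can be sketched in Python as follows. The recognition sequences shown are the standard ones for these enzymes; the function name `add_flanking_sites` and the example insert are hypothetical, and the extra 5' bases that amplification primers normally carry beyond the site to aid digestion are omitted for brevity.

```python
# Recognition sequences (5'->3'). All four are palindromic, so checking
# one strand is sufficient.
SITES = {
    "KpnI": "GGTACC",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
}


def add_flanking_sites(insert, five_prime, three_prime):
    """Return the insert flanked by the two named restriction sites.

    Raises ValueError if either site already occurs inside the insert,
    since digestion would then cut the insert itself.
    """
    insert = insert.upper()
    for enzyme in (five_prime, three_prime):
        if SITES[enzyme] in insert:
            raise ValueError(f"insert contains an internal {enzyme} site")
    return SITES[five_prime] + insert + SITES[three_prime]


# Hypothetical short insert used only to exercise the function.
fragment = add_flanking_sites("ATGCATGGAGATACACCTACA", "BamHI", "EcoRI")
print(fragment)
```

The same check applies to the KpnI/BamHI secretory fragments and to any fragment cloned via HindIII and BamHI.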

[0316] Mammalian Consensus (mc) Codon Usage Control Constructs:

[0317] As there were errors in the original mammalian consensus (mc) E7 construct (L28F, Q70R and an E35 deletion; Liu et al., 2002) it was not used. A mc non-oncogenic E7 control construct was synthesized by whole gene synthesis. A mc oncogenic E7 (i.e., wild-type E7) control construct was subsequently made from the mc non-oncogenic E7 construct by single site-directed mutagenesis.

[0318] Secretory mc oncogenic and non-oncogenic constructs were made by amplifying the mc E7 sequence with a forward primer that introduced a BamHI site and a reverse primer that incorporated an EcoRI site. The resulting E7 fragment was cloned into the respective sites in pcDNA3 and sequenced. A mc secretory sequence flanked by KpnI and BamHI sites, 5' and 3' respectively, was synthesised and ligated into the KpnI and BamHI sites of pcDNA3-mcE7 (oncogenic or non-oncogenic) to make pcDNA3-mcIgk-mcnE7 and pcDNA3-mcIgk-mcE7 (named IgkC2 and IgkC4 respectively; see TABLE 12). The identity of the constructs was confirmed by sequencing.

[0319] Secreted Non-oncogenic E7 Constructs with Predominantly Wild-type Codon Usage, Modified for Individual Codons

[0320] Plasmids encoding a non-oncogenic form of E7 were made for all of the codons, with the exception of the Pro and Leu codons, stop codons and codons for non-degenerate amino acids. As Phe occurs just once in the E7 sequence, the codons for two Leu residues, L15 and L22, were mutated to Phe codons. A combination of techniques was used to make these constructs. When few mutations were required single or multi site-directed mutagenesis of a control construct encoding non-oncogenic E7 was performed (details of the control construct are given above under "control constructs"). When more extensive modifications were required whole gene synthesis was employed. Regardless of the methods used these constructs all include an E7 encoding sequence with identical upstream and downstream sequence cloned into the KpnI and EcoRI sites of pcDNA3. These constructs were then modified to include a secretory sequence, as described below.
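The principle of these constructs, in which every codon for one amino acid is replaced by a single chosen iso-accepting codon while the remaining codons and the encoded protein are left unchanged, can be sketched in Python as follows. This is an illustrative sketch only: the function names `translate` and `recode` and the short test sequence are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(cds):
    """Translate an in-frame coding sequence into one-letter amino acids."""
    cds = cds.upper()
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def recode(cds, amino_acid, codon):
    """Replace every codon encoding `amino_acid` with the given `codon`.

    Raises ValueError if the sequence is not a whole number of codons or
    if `codon` does not encode `amino_acid`.
    """
    cds, codon = cds.upper(), codon.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    if CODON_TABLE[codon] != amino_acid:
        raise ValueError(f"{codon} does not encode {amino_acid}")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(
        codon if CODON_TABLE[c] == amino_acid else c for c in codons
    )


# Hypothetical sequence: Met-Ala-Ala-Ala-Lys with three different Ala codons.
original = "ATGGCTGCCGCAAAA"
all_gcg = recode(original, "A", "GCG")
print(all_gcg)                                    # ATGGCGGCGGCGAAA
print(translate(original) == translate(all_gcg))  # True
```

Comparing the translations before and after recoding provides a simple check that only synonymous changes were introduced.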

[0321] First, using the whole gene synthesis method, DNA fragments that included a secretory sequence flanked by KpnI and BamHI sites were synthesized. For some constructs the amino acid of interest occurred in the secretory sequence, so individually modified secretory sequence fragments were made. For constructs for amino acids that did not occur in the secretory sequence, wild-type secretory sequence was used. These fragments were digested with KpnI and BamHI. Then, using the relevant nE7 construct as a template and a standard PCR protocol, a BamHI site was introduced at the 5' end of the E7 sequence. The 3' EcoRI site was retained. The resulting E7 fragments were cut with BamHI and EcoRI, purified, and ligated into pcDNA3. Following sequencing, the plasmids were cut with KpnI and BamHI and ligated with the relevant KpnI/BamHI secretory sequences. The sequences of the constructs were then confirmed. Constructs IgkS1-1 to IgkS1-49 were made in this way (see TABLE 12 and FIGS. 1 to 11, 13 and 15 to 17 for sequence comparisons).

[0322] Secreted E7 Constructs with Individual Pro or Leu Codons Modified

[0323] E7 DNA sequences in which the Pro or Leu codons were individually modified were designed. The rest of the codon usage for these E7 DNAs was the same for all of the Pro and Leu constructs but differed from the wild-type or mammalian consensus codon usage. [Note that this codon usage was based on our preliminary data from immunizing mice with the GFP constructs.]

[0324] The Pro/LeuE7 DNA fragments, flanked by HindIII and BamHI sites, were made by whole gene synthesis and cloned into the HindIII and BamHI sites of pcDNA3. Using these constructs as templates, a KpnI site was incorporated upstream and an EcoRI site downstream, of the Pro/Leu E7 sequences by standard PCR methods. The resulting fragments were cut with KpnI and EcoRI and cloned into pcDNA3. These constructs were then used to make the secreted E7 constructs with Pro or Leu codon modifications.

[0325] Firstly, using the whole gene synthesis method, DNA fragments that included a secretory sequence flanked by KpnI and BamHI sites were synthesized. As Pro and Leu occur in the secretory sequence, individually modified secretory sequence fragments were made for the different constructs. These fragments were digested with KpnI and BamHI. Then, using the relevant Pro or Leu E7 construct as a template and a standard PCR protocol, a BamHI site was introduced at the 5' end of the E7 sequence. The 3' EcoRI site was retained. The resulting fragments were cut with BamHI and EcoRI, purified, and ligated into pcDNA3. Following sequencing, the plasmids were cut with KpnI and BamHI and ligated with the relevant KpnI/BamHI secretory sequences. The resulting constructs were sequenced and are denoted IgkS1-50 to IgkS1-59 (see TABLE 12 and FIGS. 12 and 14 for sequence comparisons).

[0326] Secreted E7 Constructs with Predominantly Wild-type Codon Usage, Modified for Individual Codons

[0327] Constructs encoding a secreted form of oncogenic E7 (i.e. wild-type E7 protein) were made by site-directed mutagenesis of the plasmids encoding a secreted form of non-oncogenic E7. This was done for constructs for codons for the following amino acids: Asp, Cys, Glu, Gln, Gly, Ile, Ser, Thr and Val.

[0328] Site-directed mutagenesis was carried out using the Quikchange II Site-directed Mutagenesis kit (Stratagene, La Jolla Calif.) and appropriate PAGE (polyacrylamide gel electrophoresis)-purified primers (Sigma) according to the manufacturer's instructions. The pcDNA-kIgkX-nE7X series of constructs were used as templates for the mutagenesis (i.e. constructs IgkS1-13 to 24, IgkS1-27 to 29, IgkS1-34 to 43 and IgkS1-46 to 49). The primers introduced the desired G21D, G24C, G26E mutations.
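The substitution notation used above (e.g. G21D: Gly at position 21 replaced by Asp) can be illustrated with a minimal Python sketch. The function name `apply_substitutions` is hypothetical, and the 26-residue N-terminal fragment is an assumption: it is the wild-type E7 N-terminus as translated from SEQ ID NO: 1, with Gly placed at positions 21, 24 and 26 to represent the non-oncogenic form. The sketch only checks the bookkeeping of the three substitutions and is not the primer design actually used.

```python
import re

def apply_substitutions(protein, substitutions):
    """Apply substitutions written as <old residue><1-based position><new residue>."""
    residues = list(protein)
    for sub in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", sub)
        if match is None:
            raise ValueError(f"cannot parse substitution {sub!r}")
        old, pos, new = match.group(1), int(match.group(2)), match.group(3)
        # Guard against applying a substitution to the wrong template.
        if residues[pos - 1] != old:
            raise ValueError(f"{sub}: found {residues[pos - 1]} at position {pos}")
        residues[pos - 1] = new
    return "".join(residues)

# Assumed N-terminal fragment of non-oncogenic E7 (Gly at positions 21, 24 and 26).
NON_ONCOGENIC_NTERM = "MHGDTPTLHEYMLDLQPETTGLYGYG"

oncogenic_nterm = apply_substitutions(NON_ONCOGENIC_NTERM, ["G21D", "G24C", "G26E"])
print(oncogenic_nterm)  # MHGDTPTLHEYMLDLQPETTDLYCYE
```

The check on the existing residue makes a mismatched template fail loudly rather than silently producing a wrong sequence.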

[0329] The resulting constructs, IgkS2-13 to 24, IgkS2-27 to 29, IgkS2-34 to 43 and IgkS2-46 to 49 (see Table 8, SEQ ID NOs: 1 to 29), have wild-type codon usage for the Igk secretory sequence and E7 sequence with the exception that the codons for the relevant amino acid were changed, and they encode oncogenic E7.

[0330] Linker Constructs

[0331] Constructs encoding the N-terminal Igk secretory sequence followed by a linker sequence (XXGXGXX, where X is the relevant amino acid for a particular construct and G is glycine) and the E7 protein were made for each of the following amino acids: Asn, Ala, Lys, Arg, Phe, His and Tyr.

[0332] Fragments consisting of the Igk secretory sequence (with mammalian consensus codon usage) and the linker sequences were made by PCR using Taq polymerase and standard cycling conditions, as recommended by the manufacturer.

[0333] The fragments were amplified from pcDNA3-kmcIgk-mcE7 using a common forward primer (5'TTGAATAGGTACCGCCGCCACCATGGAGACCGACACCCTCC3'; SEQ ID NO: 90) that annealed to the KpnI site, the Kozak sequence and the beginning of the Igk secretory sequence. The reverse primers were different for each linker construct: each annealed to the end of the Igk secretory sequence (with mammalian consensus codon usage) and introduced new sequence that encoded the relevant linker sequence and a 3' BamHI site.
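As an informal check of the primer layout described above, the following Python sketch locates the KpnI recognition site (GGTACC), the sequence between that site and the start codon, and the first codons of the secretory sequence in SEQ ID NO: 90. The helper `translate` and the variable names are illustrative assumptions; the codon table is the standard genetic code.

```python
# Common forward primer, SEQ ID NO: 90 (written 5' to 3').
PRIMER = "TTGAATAGGTACCGCCGCCACCATGGAGACCGACACCCTCC"
KPNI_SITE = "GGTACC"

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

def translate(dna):
    """Translate complete codons only; trailing bases are ignored."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))

kpni_start = PRIMER.find(KPNI_SITE)            # 0-based index of the KpnI site
atg_start = PRIMER.find("ATG")                 # 0-based index of the start codon
kozak_region = PRIMER[kpni_start + len(KPNI_SITE):atg_start]
leader_start = translate(PRIMER[atg_start:])   # first residues of the Igk leader

print(kpni_start, kozak_region, atg_start, leader_start)
# 7 GCCGCCACC 22 METDTL
```

The seven bases upstream of the KpnI site presumably serve as a short overhang for efficient digestion; that reading is an assumption, not something the text states.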

[0334] The fragments were digested with KpnI/BamHI and were ligated into KpnI/BamHI-cut pcDNA3-mcIgk-mcE7 (i.e. the Kozak sequence and secretory sequence had been removed from the plasmid by digestion) to make pcDNA3-mcIgk-linkerX-mcE7 (i.e., IgkS2-1 to 12, IgkS2-25 and 26, IgkS2-30 to 33 and IgkS2-44 and 45 as illustrated in Table 8, SEQ ID NOs: 30 to 49).

[0335] For Asn the fragments were also ligated into KpnI/BamHI-cut pcDNA3-Igk-nE7Asn1/2 (i.e. IgkS1-11 and 12) to make pcDNA3-mcIgk-linkerN1/2-nE7Asn1/2 (i.e., IgkS2-11b and IgkS2-12b, see Table 12).

E7 Protein Expression

[0336] Cell Culture

[0337] CHO cells were cultured in DMEM (GIBCO from Invitrogen) containing 10% foetal bovine serum (FBS) (DKSH), penicillin, streptomycin and glutamine (GIBCO from Invitrogen) at 37.degree. C. and 5% CO.sub.2. Cells were plated into 6-well plates at 3.times.10.sup.5/well, 24 hours prior to transfection. For each transfection, 2 .mu.g of DNA was mixed with 50 .mu.L OptiMEM (GIBCO from Invitrogen) and 4 .mu.L Plus reagent (Invitrogen) and incubated at room temperature (RT) for 30 min. Lipofectamine (Invitrogen; 5 .mu.L in 50 .mu.L OptiMEM) was added and the complexes incubated at RT for 30 min. The cells were rinsed with OptiMEM, 2 mL OptiMEM were added to each well, and the complexes then added. The cells were incubated overnight at 37.degree. C. and 5% CO.sub.2. The following morning the complexes were removed and 2 mL of fresh DMEM containing 2% FBS added to each well.

[0338] Cell pellets and supernatants were collected about 40 h after transfection. The cell pellets were resuspended in lysis buffer (0.1% NP-40, 2 .mu.g/mL Aprotinin, 1 .mu.g/mL Leupeptin and 2 mM PMSF in PBS).

[0339] Transfections were carried out in duplicate and repeated. Control transfections, with empty vector (pcDNA3), were also carried out.

[0340] Western Blotting

[0341] Western blots of the CHO cell supernatants or lysates were carried out according to standard protocols. Briefly, this involved firstly separating the samples by polyacrylamide gel electrophoresis (PAGE). For cell lysates, 30 .mu.g of total protein were loaded for each sample. For supernatants, 30 .mu.L of each was loaded. The protein samples were boiled with SDS-PAGE loading buffer for 10 mins before loading onto 12% SDS-PAGE gels and the gels were run at 150-200V for approximately 1 h.

[0342] The separated proteins were then transferred from the gels to PVDF membrane (100V for 1 h). The membranes were blocked with 5% skim milk (in PBS/0.05% Tween 20 (PBS-T)) for 1 h at room temperature and were then incubated with the primary antibody, HPV-16 E7 Mouse Monoclonal Antibody (Zymed Laboratories), at a dilution of 1:1000 in 5% skim milk (in PBS-T) overnight at 4.degree. C. Following washing of the membrane in PBS-T (3.times.10 min), secondary antibody, anti-mouse IgG (Sigma) in 5% skim milk, was added and the membrane incubated at room temperature for 4 h. The membranes were washed as before, incubated for 1 min in a mixture containing equal volumes of solution A (4.425 mL water, 50 .mu.L luminol, 22 .mu.L p-coumaric acid and 500 .mu.L 1M Tris pH 8.5) and solution B (4.5 mL water, 3 .mu.L 30% H.sub.2O.sub.2 and 500 .mu.L 1M Tris pH 8.5), and then dried and wrapped in plastic wrap. Film was exposed to the blots for various times (1 min, 3 min or 10 min) and then developed.

Gene Gun Immunization Protocols

[0343] Plasmid Purification

[0344] All plasmids used for vaccination were grown in the Escherichia coli strain DH5.alpha. and purified using the NucleoBond Maxi Kit (Macherey-Nagel). DNA concentration was quantitated spectrophotometrically at 260 nm.

[0345] Preparation of DNA/Gold Cartridges

[0346] Coating of gold particles with plasmid DNA was performed as described in the Bio-Rad Helios Gene Gun System instruction manual using a microcarrier loading quantity (MLQ) of 0.5 mg gold/cartridge and a DNA loading ratio of 2 .mu.g DNA/mg gold. This resulted in 1 .mu.g of DNA per prepared cartridge. In brief, 50 .mu.L of 0.05M spermidine (Sigma) was added to 25 mg of 1.0 .mu.m gold particles (Bio-Rad) and the spermidine/gold was sonicated for 3 seconds. 50 .mu.g of plasmid DNA was then added, followed by the dropwise addition of 100 .mu.L 1M CaCl.sub.2 while vortexing. The mixture was allowed to precipitate at room temperature for 10 min, then centrifuged to pellet the DNA/gold. The pellet was washed three times with HPLC grade ethanol (Scharlau), before resuspension in HPLC grade ethanol containing 0.5 mg/mL of polyvinylpyrrolidone (PVP) (Bio-Rad). The gold/plasmid suspension was then coated onto Tefzel tubing and 0.5 inch cartridges prepared.
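The loading figures in this paragraph are internally consistent, as the following short Python sketch shows. The function name `cartridge_loading` is hypothetical, and the cartridge count is a theoretical maximum that assumes no loss of gold during washing or coating.

```python
def cartridge_loading(gold_mg, dna_ug, mlq_mg_per_cartridge):
    """Return (DNA loading ratio in ug/mg, ug DNA per cartridge, theoretical cartridge count)."""
    dna_loading_ratio = dna_ug / gold_mg                     # ug DNA per mg gold
    dna_per_cartridge = dna_loading_ratio * mlq_mg_per_cartridge
    max_cartridges = gold_mg / mlq_mg_per_cartridge          # assumes no loss of gold
    return dna_loading_ratio, dna_per_cartridge, max_cartridges

# Values from the paragraph above: 25 mg gold, 50 ug plasmid DNA, MLQ of 0.5 mg gold/cartridge.
ratio, per_cartridge, n_cartridges = cartridge_loading(25.0, 50.0, 0.5)
print(ratio, per_cartridge, n_cartridges)  # 2.0 1.0 50.0
```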

[0347] Gene Gun Immunization of Mice

[0348] Groups of 8 female C57BL/6J mice (6-8 weeks old) (ARC, WA or Monash Animal Services, VIC) were immunized on Day 0, Day 21, Day 42 and Day 63 with the relevant DNA. The day before each immunization the abdomen of each mouse was shaved and depilatory cream (Nair) applied for 1 minute. DNA was delivered with the Helios gene gun (Bio-Rad) using a pressure of 400 psi. Mice were given 2 shots on either side of the abdomen, with 1 .mu.g of DNA delivered per shot. Serum was collected via intra-ocular bleed 2 days prior to the initial immunization and 2 weeks after each subsequent immunization (Day -2, Day 35, Day 56 and Day 77).
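The bleed days follow from the immunization days by simple arithmetic, sketched below in Python. The sketch assumes that the pre-bleed falls two days before Day 0 and that "each subsequent immunization" means the second to fourth doses, which is the reading that reproduces Days 35, 56 and 77. The variable names are illustrative.

```python
IMMUNIZATION_DAYS = [0, 21, 42, 63]
PRE_BLEED_OFFSET = -2    # days relative to the first immunization (assumed reading)
POST_BLEED_OFFSET = 14   # two weeks after each subsequent immunization

# One pre-bleed before the first dose, then one bleed after each later dose.
bleed_days = [IMMUNIZATION_DAYS[0] + PRE_BLEED_OFFSET] + [
    day + POST_BLEED_OFFSET for day in IMMUNIZATION_DAYS[1:]
]
print(bleed_days)  # [-2, 35, 56, 77]
```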

[0349] ELISA to Measure E7 Immune Response

[0350] Nine peptides spanning the full length of HPV16 E7 (Frazer et al., 1995) were used to measure the E7 antibody response. The peptides were synthesised and purified to >70% purity by Auspep (Melbourne). Peptides GF101 to 106 and GF108 to 109 described in Frazer et al. were made. Note that instead of GF107, GF107a (HYNIVTFCCKCDSTLRL) was used.

[0351] GF102 D13G, GF103 D5G/C8G/E10G and GF104 E2G peptides, named GF102n, GF103n and GF104n respectively, were also synthesised. These peptides were used for the ELISA when measuring antibodies to non-oncogenic E7, i.e. they incorporate the mutations that were made to render the E7 protein non-oncogenic.

[0352] Microtiter plates (Maxisorp, Nunc) were coated overnight with 50 .mu.L of 10 .mu.g/mL E7 peptide per well. After coating, the plates were washed two times with PBS/0.05% Tween 20 (PBS-T) and then blocked for two hours at 37.degree. C. with 100 .mu.L of 5% skim milk powder in PBS-T. After blocking, plates were washed three times with PBS-T and 50 .mu.L of mouse sera at a dilution of 1 in 100 was added for 2 hours at 37.degree. C. All sera were assayed in duplicate wells. Plates were then washed three times with PBS-T and 50 .mu.L of sheep anti-mouse IgG horseradish peroxidase conjugate (Sigma) was added at a 1 in 1000 dilution. After 1 hour plates were washed and 50 .mu.L of OPD substrate was added. After 30 min, 25 .mu.L of 2.5 M HCl was added and absorbance was measured at 490 nm in a Multiskan EX plate reader (Pathtech). The following controls were included: control primary antibody as a positive control, and secondary antibody only and day 0 serum/serum from unimmunized mice as negative controls.

[0353] The immune response preferences of codons determined from these experiments are tabulated in TABLE 1.

Example 2

Construction of Codon Modified Influenza A Virus (H5N1) HA DNA for Conferring an Enhanced Immune Response to H5N1 HA

[0354] The wild-type nucleotide sequence of the influenza A virus, HA gene for hemagglutinin (A/Hong Kong/213/03(H5N1), MDCK isolate, embryonated chicken egg isolate) is shown in SEQ ID NO: 50 and encodes the amino acid sequence shown in SEQ ID NO: 51. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 52.

Example 3

Construction of Codon Modified Influenza A Virus (H3N1) HA DNA for Conferring an Enhanced Immune Response to H3N1 HA

[0355] The wild-type nucleotide sequence of the influenza A virus, HA gene for hemagglutinin (A/swine/Korea/PZ72-1/2006(H3N1)) is shown in SEQ ID NO: 53 and encodes the amino acid sequence shown in SEQ ID NO: 54. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 55.

Example 4

Construction of Codon Modified Influenza A Virus (H5N1) NA DNA for Conferring an Enhanced Immune Response to H5N1 NA

[0356] The wild-type nucleotide sequence of the influenza A virus, NA gene for neuraminidase (A/Hong Kong/213/03(H5N1), NA gene neuraminidase, MDCK isolate, embryonated chicken egg isolate) is shown in SEQ ID NO: 56 and encodes the amino acid sequence shown in SEQ ID NO: 57. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 58.

Example 5

Construction of Codon Modified Influenza A Virus (H3N1) NA DNA for Conferring an Enhanced Immune Response to H3N1 NA

[0357] The wild-type nucleotide sequence of the influenza A virus, NA gene for neuraminidase (A/swine/MI/PU243/04(H3N1)) is shown in SEQ ID NO: 59 and encodes the amino acid sequence shown in SEQ ID NO: 60. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 61.

Example 6

Construction of Codon Modified Hepatitis C Virus E1 (1AH77) DNA for Conferring an Enhanced Immune Response to HCV E1 (1AH77)

[0358] The wild-type nucleotide sequence of the hepatitis C Virus E1, (serotype 1A, isolate H77, from polyprotein nucleotide sequence AF009606) is shown in SEQ ID NO: 62 and encodes the amino acid sequence (NP 751920) shown in SEQ ID NO: 63. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 64.

Example 7

Construction of Codon Modified Hepatitis C Virus E2 (1AH77) DNA for Conferring an Enhanced Immune Response to HCV E2 (1AH77)

[0359] The wild-type nucleotide sequence of the hepatitis C Virus E2, (serotype 1A, isolate H77, from polyprotein nucleotide sequence AF009606) is shown in SEQ ID NO: 65 and encodes the amino acid sequence (NP 751921) shown in SEQ ID NO: 66. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 67.

Example 8

Construction of Codon Modified Epstein--Barr Virus Type 1 GP350 DNA for Conferring an Enhanced Immune Response to EBV Type 1 GP350

[0360] The wild-type nucleotide sequence of the Epstein--Barr virus, EBV type 1 gp350 (Gene BLLF1, strand 77142-79865) is shown in SEQ ID NO: 68 and encodes the amino acid sequence (CAD53417) shown in SEQ ID NO: 69. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 70.

Example 9

Construction of Codon Modified Epstein--Barr Virus Type 2 GP350 DNA for Conferring an Enhanced Immune Response to EBV Type 2 GP350

[0361] The wild-type nucleotide sequence of the Epstein--Barr virus, EBV type 2 gp350 (Gene BLLF1, strand 77267-29936) is shown in SEQ ID NO: 71 and encodes the amino acid sequence (YP 001129462) shown in SEQ ID NO: 72. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 73.

Example 10

Construction of Codon Modified Herpes Simplex Virus 2 Glycoprotein B DNA for Conferring an Enhanced Immune Response to HSV-2 Glycoprotein B

[0362] The wild-type nucleotide sequence of the Herpes Simplex virus 2, glycoprotein B strain HG52 (genome strain NC 001798) is shown in SEQ ID NO: 74 and encodes the amino acid sequence (CAB06752) shown in SEQ ID NO: 75. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 76.

Example 11

Construction of Codon Modified Herpes Simplex Virus 2 Glycoprotein D DNA for Conferring an Enhanced Immune Response to HSV-2 Glycoprotein D

[0363] The wild-type nucleotide sequence of the Herpes Simplex virus 2, glycoprotein D strain HG52 (genome strain NC 001798) is shown in SEQ ID NO: 77 and encodes the amino acid sequence (NP 044536) shown in SEQ ID NO: 78. Several codons within that sequence were mutated using the method described in Example 1. Specifically, the method involved replacing codons of the wild type nucleotide sequence with corresponding synonymous codons having higher immune response preferences than the codons they replaced, as represented in Table 1. An illustrative codon modified nucleotide sequence comprising high immune response preference codons is shown in SEQ ID NO: 79.

Example 12

Optimised E7 and HSV-2 Constructs

Design and Synthesis of Optimal and Least Optimal E7 Constructs

[0364] One de-optimized (W) and three optimized (O1-O3) E7 constructs were designed and made using the codon preferences summarized in Table 1 ("the Immune Coricode table"). The least favourable codons were used for construct W. For the first optimized construct, O1, whose sequence is shown in SEQ ID NO: 81, all of the codons were modified to those codons determined most optimal. O2, whose sequence is shown in SEQ ID NO: 82, is an alternative optimized construct which involved changing all Ala to GCT; Arg CGG and AGG to CGA and AGA, respectively; Glu to GAA; Gly to GGA; Ile to ATC; all Leu to CTG; Phe to TTT; Pro to CCT or CCC; Ser to TCG; Thr to ACG; and all Val except GTG to GTC. The O2 modifications avoided, with the exception of Leu and Ile, changing codons to mammalian consensus-preferred codons. For O3, whose sequence is shown in SEQ ID NO: 83, only certain amino acids for which particularly distinct differences were observed between codons, and for which the optimal codon(s) was not also a mammalian consensus preferred codon, were modified. In particular, in O3 all non-preferred Gly, Leu, Pro, Ser and Thr codons were changed to GGA, CTC, CCT, TCG and ACG, respectively, and where a preferred codon was already used it was not altered. Codons for other amino acids in O3 were not modified.
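The O3 rule set lends itself to a short worked illustration. The Python sketch below replaces every Gly, Leu, Pro, Ser and Thr codon of a coding sequence with the target codon named above and leaves all other codons untouched. It is a simplified reading rather than the design procedure actually used: it assumes that the single target codon is the only preferred codon for each amino acid, whereas Table 1 may list further preferred codons that would be left as they are. The names `O3_TARGETS` and `apply_o3` are hypothetical, and the example input is the first eight codons of the wild-type E7 coding sequence as read from SEQ ID NO: 1.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Target codons stated for O3 (one-letter amino acid code -> codon).
O3_TARGETS = {"G": "GGA", "L": "CTC", "P": "CCT", "S": "TCG", "T": "ACG"}

def translate(cds):
    """Translate an in-frame coding sequence codon by codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def apply_o3(cds):
    """Swap Gly/Leu/Pro/Ser/Thr codons for the O3 targets; keep all other codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = (cds[i:i + 3] for i in range(0, len(cds), 3))
    return "".join(O3_TARGETS.get(CODON_TABLE[codon], codon) for codon in codons)

wild_type = "ATGCATGGAGATACACCTACATTG"   # M H G D T P T L
modified = apply_o3(wild_type)
print(modified)                          # ATGCATGGAGATACGCCTACGCTC
print(translate(wild_type) == translate(modified))  # True
```

Because only synonymous codons are exchanged, the encoded protein is unchanged, which the final comparison confirms for this fragment.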

Humoral and Cellular Responses to Biolistic Immunization with the Optimal and Least Optimal E7 Constructs

[0365] As may be seen in FIG. 18 (a) all three optimized constructs (O1 to O3) gave rise to significantly larger antibody responses than the wild-type construct as measured by both the peptide ELISA and a GST-E7 protein ELISA. The amplitudes of the response were not statistically different between the three optimized constructs. The de-optimized construct, W, whose sequence is shown in SEQ ID NO: 84, gave a very low antibody response, appearing slightly lower but not statistically different from the wild-type (wt) codon usage (CU) construct, whose sequence is shown in SEQ ID NO: 80. From the IFN-.gamma. ELISPOT experiments, a representative example of which is shown in FIG. 18, it appears that the codon preferences for maximizing the antibody response are similar to those required for maximising the T cell response: the de-optimized construct W failed to give a measurable response in the IFN-.gamma. ELISPOT assay and two of the optimized constructs (O2 and O3) gave statistically significantly larger responses than the wild-type CU construct. Over the three repeats the responses to O2 and O3 were not statistically different from each other. Unexpectedly, and in contrast to the antibody trend, in two of the three repeat experiments O1 gave a similar cellular response to the wt CU construct, which was less than that achieved by the O2 or O3 constructs.

Humoral and Cellular Responses to Immunization by Intradermal Injection with the Optimal and Least Optimal E7 Constructs

[0366] The humoral and cellular responses of mice to the optimized, wild-type CU and de-optimized constructs delivered by intradermal injection were also measured and the results are summarized in FIG. 19. In general, similar trends were observed for intradermal injection as for biolistic delivery.

[0367] From the E7 protein ELISA, it is apparent that the three optimized constructs, O1-O3, were all significantly better at generating antibodies than the wild-type construct and that the de-optimized construct gave a very low antibody response similar to wild-type. The optimized constructs all gave rise to significantly more spots in the IFN-.gamma. ELISPOT than the wild-type construct and the de-optimized construct failed to give rise to a measurable response.

[0368] The amplitudes of the antibody responses to gene gun immunization were larger than those for the intradermally (ID) delivered vaccines, despite the ID immunization delivering more than five times the dose.

Design and Synthesis of Optimal and Least Optimal HSV-2 Constructs

[0369] Three optimized constructs (O1-O3; whose sequences are shown in SEQ ID NO: 86-88, respectively) and a de-optimized construct (W; whose sequence is shown in SEQ ID NO: 89) encoding full-length glycoprotein D from Herpes Simplex Virus 2 (gD2) were prepared. A control construct, pcDNA3-gD2, with wt CU was also made. Wild-type CU, whose sequence is shown in SEQ ID NO: 85, is close to MC CU.

Humoral Responses to Biolistic and Intradermal Immunization with the Optimal and Least Optimal gD2 Constructs

[0370] C57BL/6 mice were immunized in two groups (8 mice/construct) by intradermal (ID) injection or gene gun delivery, using the same immunization protocol as for the E7 constructs.

[0371] Group 1 included pcDNA3-gD2 and pcDNA3-gD2 O1. Group 2 included pcDNA3-gD2, pcDNA3-gD2 O2, pcDNA3-gD2 O3, and pcDNA3-gD2 W.

[0372] Antibody responses were measured by an ELISA using plates coated with CHO cell supernatant containing C-terminally His-tagged and truncated gD2. The truncation is at amino acid residue 331 and removes the transmembrane region, resulting in the protein being secreted into the medium. ELISA plates coated with supernatant from CHO cells transfected with empty vector were used as a control.

[0373] For both the biolistic and intradermal injection delivery routes it was found that the three optimized constructs generated levels of antibodies similar to those generated by the wt CU gD2 construct (FIG. 20). The de-optimized construct, W gD2, was very poor at generating antibodies, particularly when delivered by intradermal injection. The two delivery methods resulted in similar levels of antibodies.

[0374] To date, there are no DNA vaccines on the market for the treatment or prevention of disease in humans. There is a need to maximize the immune responses generated by DNA vaccines and the present invention discloses ways of enhancing efficacy of DNA vaccines by using codons that have a higher preference for producing an immune response.

[0375] The study described in this Example has validated the Immune Coricode table by applying it to optimization or de-optimization of the HPV16 E7 and HSV-2 glycoprotein D (gD2) genes and demonstrating that this does enhance or reduce, respectively, the antibody or cellular response to biolistic delivery of these genes to mammals such as mice.

Materials and Methods

[0376] ELISPOT Assay

[0377] For the IFN-.gamma. ELISPOTs, mice were immunized twice, at days 0 and 21, and the spleens were collected 3 weeks after the second immunization.

[0378] Intradermal Injection Protocol

[0379] The timing and frequency of the immunizations by intradermal injection were the same as for gene gun immunization. At each immunization 5 .mu.g of DNA was injected per ear, i.e. a total of 10 .mu.g was administered per immunization per mouse. Hair removal prior to immunization was not necessary. The timing of bleeds and spleen collection was the same as for the gene gun immunized mice.

[0380] GST-E7 ELISA

[0381] The GST-E7 ELISA was carried out in the same way as the peptide ELISA with the exception that the plates were coated overnight with 50 .mu.L of 10 .mu.g/mL GST-tagged E7 protein (kindly provided by the Frazer group from the Diamantina Institute, The University of Queensland, Brisbane).

[0382] HSV-2 gD ELISA

[0383] This ELISA was carried out in the same way as the E7 ELISAs with the exception that the plates were coated with supernatant from CHO cells transfected with a vector encoding C-terminally His-tagged and truncated gD2 protein. Control plates coated with supernatant from CHO cells transfected with empty vector were also used.

[0384] Detection of HPV-specific Responses

[0385] For the detection of HPV-specific responses, 96-well filter ELISPOT plates (Millipore) were coated overnight with 10 .mu.g/mL HPV GST-tagged E7 protein in 0.1 M NaHCO.sub.3. For the detection of total IgG secreting cells, 96-well filter ELISPOT plates were coated overnight with 2 .mu.g/mL goat anti-mouse Ig (Sigma) in PBS without MgCl.sub.2 and CaCl.sub.2. After coating, plates were washed once with complete DMEM without FCS and then blocked with complete DMEM supplemented with 10% FCS for one hour at 37.degree. C. Cultured mouse spleen cells were washed and added to ELISPOT plates at 10.sup.6 cells/100 .mu.L. For the detection of HPV-specific memory B cells, plates were incubated overnight at 37.degree. C. and for measuring total IgG cells, plates were incubated for 1 hour at 37.degree. C. For detection, we used biotinylated goat anti-mouse IgG (Sigma) in PBS-T/1% FCS, followed by 5 .mu.g/mL HRP-conjugated avidin (Pierce), and developed the plates using 3-amino-9-ethylcarbazole (Sigma). Developed plates were counted using an automated ELISPOT plate counter.

[0386] E7 IFN-.gamma. ELISPOT

[0387] 96-well filter plates (Millipore) were coated overnight with 4 .mu.g/mL of monoclonal antibody (AN18; Mabtech). After coating, plates were washed once with complete RPMI and blocked for 2 hours with complete RPMI with 10% foetal calf serum (FCS; CSL Ltd). Mouse spleens were made into single cell suspensions and treated with ACK lysis buffer, washed and resuspended at a concentration of 10.sup.7 cells/mL. Spleen cells (10.sup.6/well) were added to each well followed by the addition of complete RPMI supplemented with recombinant hIL-2 (ProSpec-Tany TechnoGene Ltd) and peptide to a final concentration of 10 IU/well and 1 .mu.g/mL, respectively. Medium containing hIL-2 without peptide was added to control wells. Plates were incubated for approximately 18 hours at 37.degree. C. in 5-8% CO.sub.2.

[0388] After overnight incubation, cells were lysed by rinsing the plates in tap water and then washed six times in PBS/0.05% Tween 20 (PBS-T). For detection, biotinylated detection mAb (R4-6A2; Mabtech) in PBS-T/2% FCS was added, followed by horseradish peroxidase (HRP)-conjugated streptavidin and DAB (Sigma). Developed plates were counted using an automated ELISPOT plate counter.

[0389] The disclosure of every patent, patent application, and publication cited herein is hereby incorporated herein by reference in its entirety.

[0390] The citation of any reference herein should not be construed as an admission that such reference is available as "Prior Art" to the instant application.

[0391] Throughout the specification the aim has been to describe the preferred embodiments of the invention without limiting the invention to any one embodiment or specific collection of features. Those of skill in the art will therefore appreciate that, in light of the instant disclosure, various modifications and changes can be made in the particular embodiments exemplified without departing from the scope of the present invention. All such modifications and changes are intended to be included within the scope of the appended claims.

BIBLIOGRAPHY

[0392] Ausubel, F. M. (Ed.) (2007). Current Protocols in Molecular Biology. Ebook (http://www.mrw.interscience.wiley.com/emrw/9780471142720/cp/cpmb/toc).

[0393] Edmonds, C., and Vousden, K. H. (1989). A point mutational analysis of human papillomavirus type 16 E7 protein. Journal of Virology 63: 2650-2656.

[0394] Frazer, I. H., Leippe, D. M., Dunn, L. A., Leim, A., Tindle, R. W., Fernando, G. J., Phelps, W. C., and Lambert, P. F. (1995). Immunological responses in human papillomavirus 16 E6/E7 transgenic mice to E7 protein correlate with the presence of skin disease. Cancer Research 55: 2635-2639.

[0395] Heck, D. V., Yee, C. L., Howley, P. M., and Munger, K. (1992). Efficiency of binding the retinoblastoma protein correlates with the transforming capacity of the E7 oncoproteins of the human papillomaviruses. PNAS 89: 4442-4446.

[0396] Liu, W. J., Gao, F., Zhao, K. N., Zhao, W., Fernando, G. J., Thomas, R., and Frazer, I. H. (2002). Codon modified human papillomavirus type 16 E7 DNA vaccine enhances cytotoxic T-lymphocyte induction and anti-tumour activity. Virology 301: 43-52.

[0397] Smith, H. O., Hutchison III, C. A., Pfannkoch, C., and Venter, J. C. (2003). Generating a synthetic genome by whole genome assembly: .phi.X174 bacteriophage from synthetic oligonucleotides. PNAS 100 (26): 15440-15445.

SEQUENCE LISTING

911387DNAArtificial sequencePlasmid sequence 1ggtaccgccg ccaccatgga gacagataca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgatgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgatagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggatag agcccattac 240aatattgtaa ccttttgttg caagtgtgat tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagata ttcgtacttt ggaagatctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 3872387DNAArtificial sequencePlasmid sequence 2ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagacacac ctacattgca tgaatatatg 120ttagacttgc aaccagagac aactgacctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg acgaaataga cggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 3873387DNAArtificial sequencePlasmid sequence 3ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg taagtgtgac tctacgcttc ggttgtgtgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgtccc 360atctgttctc agaagcccta agaattc 3874387DNAArtificial sequencePlasmid sequence 4ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgctatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgctg caagtgcgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 3875387DNAArtificial sequencePlasmid 
sequence 5ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgagtatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgagataga tggtccagct ggacaagcag agccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaggacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 3876387DNAArtificial sequencePlasmid sequence 6ggtaccgccg ccaccatgga aacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagaaac aactgatctc tactgttatg aacaattaaa tgacagctca 180gaagaagaag atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 3877387DNAArtificial sequencePlasmid sequence 7ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc agccagagac aactgatctc tactgttatg agcagttaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaggcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acagagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 3878387DNAArtificial sequencePlasmid sequence 8ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc aaaagcccta agaattc 3879387DNAArtificial sequencePlasmid sequence 9ggtaccgccg ccaccatgga 
gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccagggtcca ctggggacgg atccatgcat ggggatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tgggccagct gggcaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggga cactagggat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38710387DNAArtificial sequencePlasmid sequence 10ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggatcca ctggagacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggaccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggaa cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38711387DNAArtificial sequencePlasmid sequence 11ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggtgatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggtcaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggta cactaggtat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38712387DNAArtificial sequencePlasmid sequence 12ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggctcca ctggcgacgg atccatgcat ggcgatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggcccagct ggccaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggcat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38713387DNAArtificial sequencePlasmid sequence 13ggtaccgccg ccaccatgga gacagacaca ctcctgctat 
gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatatagtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca tacgtacttt ggaagacctg ttaatgggca cactaggaat agtgtgcccc 360atatgctctc agaagcccta agaattc 38714387DNAArtificial sequencePlasmid sequence 14ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaattga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atttgctctc agaagcccta agaattc 38715387DNAArtificial sequencePlasmid sequence 15ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaatcga tggtccagct ggacaagcag aaccggacag agcccattac 240aatatcgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca tccgtacttt ggaagacctg ttaatgggca cactaggaat cgtgtgcccc 360atctgctctc agaagcccta agaattc 38716387DNAArtificial sequencePlasmid sequence 16ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggtagta ctggtgacgg aagtatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagtagt 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac agtacgcttc ggttgtgcgt acaaagtaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgcagtc agaagcccta agaattc 38717387DNAArtificial sequencePlasmid sequence 17ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 
60ccaggtagca ctggtgacgg aagcatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagcagc 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac agcacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgcagcc agaagcccta agaattc 38718387DNAArtificial sequencePlasmid sequence 18ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcga ctggtgacgg atcgatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgactcgtcg 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tcgacgcttc ggttgtgcgt acaatcgaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctcgc agaagcccta agaattc 38719387DNAArtificial sequencePlasmid sequence 19ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcaa ctggtgacgg atcaatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgactcatca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tcaacgcttc ggttgtgcgt acaatcaaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctcac agaagcccta agaattc 38720387DNAArtificial sequencePlasmid sequence 20ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcta ctggtgacgg atctatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgactcttct 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaatctaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38721387DNAArtificial sequencePlasmid sequence 21ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg 
atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgactcctcc 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tccacgcttc ggttgtgcgt acaatccaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctccc agaagcccta agaattc 38722387DNAArtificial sequencePlasmid sequence 22ggtaccgccg ccaccatgga gacggacacg ctcctgctat gggtactgct gctctgggtt 60ccaggttcca cgggtgacgg atccatgcat ggagatacgc ctacgttgca tgaatatatg 120ttagatttgc aaccagagac gacggatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa cgttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcacg 300cacgtagaca ttcgtacgtt ggaagacctg ttaatgggca cgctaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38723387DNAArtificial sequencePlasmid sequence 23ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca caggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aacagatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa cattttgttg caagtgtgac tctacacttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacatt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38724387DNAArtificial sequencePlasmid sequence 24ggtaccgccg ccaccatgga gactgacact ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatactc ctactttgca tgaatatatg 120ttagatttgc aaccagagac tactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ctttttgttg caagtgtgac tctactcttc ggttgtgcgt acaaagcact 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca ctctaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38725387DNAArtificial sequencePlasmid sequence 25ggtaccgccg ccaccatgga gaccgacacc ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ccggtgacgg atccatgcat ggagataccc 
ctaccttgca tgaatatatg 120ttagatttgc aaccagagac caccgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacccttc ggttgtgcgt acaaagcacc 300cacgtagaca ttcgtacctt ggaagacctg ttaatgggca ccctaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38726387DNAArtificial sequencePlasmid sequence 26ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtgctgct gctctgggtg 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtga ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt gcaaagcaca 300cacgtggaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38727387DNAArtificial sequencePlasmid sequence 27ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggta 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtatgcccc 360atctgctctc agaagcccta agaattc 38728387DNAArtificial sequencePlasmid sequence 28ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggttctgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtta ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt tcaaagcaca 300cacgttgaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtttgcccc 360atctgctctc agaagcccta agaattc 38729387DNAArtificial sequencePlasmid sequence 29ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtcctgct gctctgggtc 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtca ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt ccaaagcaca 300cacgtcgaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtctgcccc 360atctgctctc agaagcccta agaattc 38730408DNAArtificial sequencePlasmid linker sequence 30ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacgc ggcgggcgcg ggcgcggcgg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40831408DNAArtificial sequencePlasmid linker sequence 31ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacgc agcaggcgca ggcgcagcag gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40832408DNAArtificial sequencePlasmid linker sequence 32ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacgc tgctggcgct ggcgctgctg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40833408DNAArtificial sequencePlasmid linker sequence 33ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 
60cccggctcca ccggcgacgc cgccggcgcc ggcgccgccg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc

gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40834408DNAArtificial sequencePlasmid linker sequence 34ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacag gaggggcagg ggcaggaggg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40835408DNAArtificial sequencePlasmid linker sequence 35ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacag aagaggcaga ggcagaagag gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40836408DNAArtificial sequencePlasmid linker sequence 36ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgaccg gcggggccgg ggccggcggg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40837408DNAArtificial sequencePlasmid linker sequence 37ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgaccg acgaggccga ggccgacgag gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct 
gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40838408DNAArtificial sequencePlasmid linker sequence 38ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgaccg tcgtggccgt ggccgtcgtg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40839408DNAArtificial sequencePlasmid linker sequence 39ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgaccg ccgcggccgc ggccgccgcg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40840408DNAArtificial sequencePlasmid linker sequence 40ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacaa taatggcaat ggcaataatg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40841408DNAArtificial sequencePlasmid linker sequence 41ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacaa caacggcaac 
ggcaacaacg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40842408DNAArtificial sequencePlasmid linker sequence 42ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacca tcatggccat ggccatcatg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40843408DNAArtificial sequencePlasmid linker sequence 43ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacca ccacggccac ggccaccacg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40844408DNAArtificial sequencePlasmid linker sequence 44ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacaa gaagggcaag ggcaagaagg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40845408DNAArtificial sequencePlasmid linker sequence 45ggtaccgccg 
ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacaa aaaaggcaaa ggcaaaaaag gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40846408DNAArtificial sequencePlasmid linker sequence 46ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgactt ttttggcttt ggcttttttg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40847408DNAArtificial sequencePlasmid linker sequence 47ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgactt cttcggcttc ggcttcttcg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 40848408DNAArtificial sequencePlasmid linker sequence 48ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacta ttatggctat ggctattatg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc 
catctgctcc cagaagccct aagaattc 40849408DNAArtificial sequencePlasmid linker sequence 49ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacta ctacggctac ggctactacg gatccatgca cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc 408501707DNAArtificial sequenceVirus 50atggagaaaa tagtgcttct ttttgcaata gtcagtcttg ttaaaagtga tcagatttgc 60attggttacc atgcaaacaa ctcgacagag caggttgaca caataatgga aaagaacgtt 120actgttacac atgcccaaga catactggaa aagacacaca acgggaagct ctgcgatcta 180gatggagtga agcctctaat tttgagagat tgtagtgtag ctggatggct cctcggaaac 240ccaatgtgtg acgaattcat caatgtgccg gaatggtctt acatagtgga gaaggccaat 300ccagccaatg acctctgtta cccaggggat ttcaacgact atgaagaatt gaaacaccta 360ttgagcagaa taaaccattt tgagaaaatt cagatcatcc ccaaaaattc ttggtccagt 420catgaagcct cattaggggt gagctcagca tgtccatacc aaggaaagtc ctcctttttc 480aggaatgtgg tatggcttat caaaaagaac aatgcatacc caacaataaa gaggagctac 540aataatacca accaagaaga tcttttggta ttgtggggga ttcaccatcc taatgatgcg 600gcagagcaga ctaggctcta tcaaaaccca accacctaca tttccgttgg gacatcaaca 660ctaaaccaga gattggtacc aaaaatagct actagatcca aagtaaacgg gcaaaatgga 720aggatggagt tcttctggac aattttaaaa ccgaatgatg caatcaactt cgagagcaat 780ggaaatttca ttgctccaga atatgcatac aaaattgtca agaaagggga ctcagcaatt 840atgaaaagtg aattggaata tggtaactgc aacaccaagt gtcaaactcc aatgggggcg 900ataaactcta gtatgccatt ccacaatata caccctctca ccatcgggga atgccccaaa 960tatgtgaaat caaacagatt agtccttgcg actgggctca gaaatagccc tcaaagagag 1020agaagaagaa aaaagagagg attatttgga gctatagcag gttttataga gggaggatgg 1080cagggaatgg tagatggttg gtatgggtac caccatagca atgagcaggg gagtgggtac 1140gctgcagaca aagaatccac tcaaaaggca atagatggag tcaccaataa ggtcaactcg 1200atcattgaca aaatgaacac 
tcagtttgag gccgttggaa gggaatttaa taacttagaa 1260aggagaatag agaatttaaa caagaagatg gaagacggat tcctagatgt ctggacttat 1320aatgctgaac ttctggttct catggaaaat gagagaactc tagactttca tgactcaaat 1380gtcaagaacc tttacgacaa ggtccgacta cagcttaggg ataatgcaaa ggagctgggt 1440aacggttgtt tcgagttcta tcacaaatgt gataatgaat gtatggaaag tgtaagaaac 1500ggaacgtatg actacccgca gtattcagaa gaagcaagac taaaaagaga ggaaataagt 1560ggagtaaaat tggagtcaat aggaacttac caaatactgt caatttattc tacagtggcg 1620agttccctag cactggcaat catggtagct ggtctatctt tatggatgtg ctccaatggg 1680tcgttacaat gcagaatttg catttaa 170751568PRTArtificial sequenceVirus 51Met Glu Lys Ile Val Leu Leu Phe Ala Ile Val Ser Leu Val Lys Ser1 5 10 15Asp Gln Ile Cys Ile Gly Tyr His Ala Asn Asn Ser Thr Glu Gln Val 20 25 30Asp Thr Ile Met Glu Lys Asn Val Thr Val Thr His Ala Gln Asp Ile 35 40 45Leu Glu Lys Thr His Asn Gly Lys Leu Cys Asp Leu Asp Gly Val Lys 50 55 60Pro Leu Ile Leu Arg Asp Cys Ser Val Ala Gly Trp Leu Leu Gly Asn65 70 75 80Pro Met Cys Asp Glu Phe Ile Asn Val Pro Glu Trp Ser Tyr Ile Val 85 90 95Glu Lys Ala Asn Pro Ala Asn Asp Leu Cys Tyr Pro Gly Asp Phe Asn 100 105 110Asp Tyr Glu Glu Leu Lys His Leu Leu Ser Arg Ile Asn His Phe Glu 115 120 125Lys Ile Gln Ile Ile Pro Lys Asn Ser Trp Ser Ser His Glu Ala Ser 130 135 140Leu Gly Val Ser Ser Ala Cys Pro Tyr Gln Gly Lys Ser Ser Phe Phe145 150 155 160Arg Asn Val Val Trp Leu Ile Lys Lys Asn Asn Ala Tyr Pro Thr Ile 165 170 175Lys Arg Ser Tyr Asn Asn Thr Asn Gln Glu Asp Leu Leu Val Leu Trp 180 185 190Gly Ile His His Pro Asn Asp Ala Ala Glu Gln Thr Arg Leu Tyr Gln 195 200 205Asn Pro Thr Thr Tyr Ile Ser Val Gly Thr Ser Thr Leu Asn Gln Arg 210 215 220Leu Val Pro Lys Ile Ala Thr Arg Ser Lys Val Asn Gly Gln Asn Gly225 230 235 240Arg Met Glu Phe Phe Trp Thr Ile Leu Lys Pro Asn Asp Ala Ile Asn 245 250 255Phe Glu Ser Asn Gly Asn Phe Ile Ala Pro Glu Tyr Ala Tyr Lys Ile 260 265 270Val Lys Lys Gly Asp Ser Ala Ile Met Lys Ser Glu Leu Glu Tyr Gly 275 280 285Asn Cys Asn Thr Lys Cys Gln Thr Pro Met Gly Ala Ile 
Asn Ser Ser 290 295 300Met Pro Phe His Asn Ile His Pro Leu Thr Ile Gly Glu Cys Pro Lys305 310 315 320Tyr Val Lys Ser Asn Arg Leu Val Leu Ala Thr Gly Leu Arg Asn Ser 325 330 335Pro Gln Arg Glu Arg Arg Arg Lys Lys Arg Gly Leu Phe Gly Ala Ile 340 345 350Ala Gly Phe Ile Glu Gly Gly Trp Gln Gly Met Val Asp Gly Trp Tyr 355 360 365Gly Tyr His His Ser Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Lys 370 375 380Glu Ser Thr Gln Lys Ala Ile Asp Gly Val Thr Asn Lys Val Asn Ser385 390 395 400Ile Ile Asp Lys Met Asn Thr Gln Phe Glu Ala Val Gly Arg Glu Phe 405 410 415Asn Asn Leu Glu Arg Arg Ile Glu Asn Leu Asn Lys Lys Met Glu Asp 420 425 430Gly Phe Leu Asp Val Trp Thr Tyr Asn Ala Glu Leu Leu Val Leu Met 435 440 445Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser Asn Val Lys Asn Leu 450 455 460Tyr Asp Lys Val Arg Leu Gln Leu Arg Asp Asn Ala Lys Glu Leu Gly465 470 475 480Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp Asn Glu Cys Met Glu 485 490 495Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro Gln Tyr Ser Glu Glu Ala 500 505 510Arg Leu Lys Arg Glu Glu Ile Ser Gly Val Lys Leu Glu Ser Ile Gly 515 520 525Thr Tyr Gln Ile Leu Ser Ile Tyr Ser Thr Val Ala Ser Ser Leu Ala 530 535 540Leu Ala Ile Met Val Ala Gly Leu Ser Leu Trp Met Cys Ser Asn Gly545 550 555 560Ser Leu Gln Cys Arg Ile Cys Ile 565521707DNAArtificial sequenceVirus 52atggaaaaaa tcgtgctgct gttcgctatc gtctcgctgg tcaaatcgga tcagatctgc 60atcggatacc atgctaacaa ctcgacggaa caggtcgaca cgatcatgga aaagaacgtc 120acggtcacgc atgctcaaga catcctggaa aagacgcaca acggaaagct gtgcgatctg 180gatggagtga agcctctgat cctgagagat tgttcggtcg ctggatggct gctgggaaac 240cccatgtgtg acgaatttat caatgtgccc gaatggtcgt acatcgtgga aaaggctaat 300cccgctaatg acctgtgtta ccccggagat tttaacgact atgaagaact gaaacacctg 360ctgtcgagaa tcaaccattt cgaaaaaatc cagatcatcc ccaaaaattc gtggtcgtcg 420catgaagctt cgctgggagt gtcgtcggct tgtccctacc aaggaaagtc gtcgttcttt 480agaaatgtgg tctggctgat caaaaagaac aatgcttacc ccacgatcaa gagatcgtac 540aataatacga accaagaaga tctgctggtc ctgtggggaa tccaccatcc taatgatgct 
600gctgaacaga cgagactgta tcaaaacccc acgacgtaca tctcggtcgg aacgtcgacg 660ctgaaccaga gactggtccc caaaatcgct acgagatcga aagtcaacgg acaaaatgga 720agaatggaat ttttttggac gatcctgaaa cccaatgatg ctatcaactt tgaatcgaat 780ggaaatttta tcgctcccga atatgcttac aaaatcgtca agaaaggaga ctcggctatc 840atgaaatcgg aactggaata tggaaactgc aacacgaagt gtcaaacgcc catgggagct 900atcaactcgt cgatgccctt tcacaatatc caccctctga cgatcggaga atgccccaaa 960tatgtgaaat cgaacagact ggtcctggct acgggactga gaaattcgcc tcaaagagaa 1020agaagaagaa aaaagagagg actgttcgga gctatcgctg gattcatcga aggaggatgg 1080cagggaatgg tcgatggatg gtatggatac caccattcga atgaacaggg atcgggatac 1140gctgctgaca aagaatcgac gcaaaaggct atcgatggag tcacgaataa ggtcaactcg 1200atcatcgaca aaatgaacac gcagttcgaa gctgtcggaa gagaattcaa taacctggaa 1260agaagaatcg aaaatctgaa caagaagatg gaagacggat ttctggatgt ctggacgtat 1320aatgctgaac tgctggtcct gatggaaaat gaaagaacgc tggacttcca tgactcgaat 1380gtcaagaacc tgtacgacaa ggtccgactg cagctgagag ataatgctaa ggaactggga 1440aacggatgtt ttgaatttta tcacaaatgt gataatgaat gtatggaatc ggtcagaaac 1500ggaacgtatg actaccccca gtattcggaa gaagctagac tgaaaagaga agaaatctcg 1560ggagtcaaac tggaatcgat cggaacgtac caaatcctgt cgatctattc gacggtggct 1620tcgtcgctgg ctctggctat catggtcgct ggactgtcgc tgtggatgtg ctcgaatgga 1680tcgctgcaat gcagaatctg catctaa 1707531701DNAArtificial sequenceVirus 53atgaagacta tcattgctct gagctacatt ttatgtctgg tcttcgctca aaaacttccc 60cgaaatgaca acagcacggc aacgctgtgc ttgggacacc atgcagtgtc aaacggaaca 120ctagtgaaaa caatcacgaa tgaccaaatt

gaagtgacta atgctactga attggttcag 180agttcctcaa caggtagaat atgtgaccga cctcatcgaa tccttgatgg ggaaaactgc 240acactgatag atgctctctt gggagaccct cattgtgata gtttccaaaa caaggaatgg 300gacctttttg tagaacgcag cacagcttac agcgactgtt acccttatga tgtgccggat 360tatgcctccc ttaggtcact agttgcctca tccggcaccc tggagtttaa cgatgaaagt 420ttcgattgga ctggagtctc tcaggatgga acaagcaatg cttgcaaaag gagatctgtt 480aaaagttttt ttagtagatt aaattggttg tacaaattag aatacaaata tccagcactg 540aacgtgacta tgccaaacaa tgaaaaattt gacaaattgt acatttgggg ggtgcaccac 600ccgagcacgg acagtgacca aaccagtcta tatgttcaag catcagggag agtcacaatc 660tctaccaaaa gaagccaaca aactgtaatc ccgaatatcg gatctagacc ctgggtaagg 720ggtatctcca gcagaataag catctattgg acaatagtaa aacctggaga catacttatg 780attaacagca cagggaatct aatcgcccct cggggttact tcaagatacg aagtggagaa 840agctcaataa tgaggtcaga tgcacccatt gatagctgca attctgaatg catcactcca 900aatggaagca ttcccaataa caaaccattt caaaatgtaa acaggatcac atatggggcc 960tgtcctagat atgttaaaca aaaaactcta aaattggcaa cagggatgcg gaatgtacca 1020gagaaacaag ctaggggcat attcggcgcc atcgcaggtt tcatagaaaa tggttgggag 1080ggaatggtag acggttggta cggttttagg catctaaatt ctgagggctc aggacaagca 1140gcagacctca aaagcactca ggcagcaatt aaccaaatca acgggaaact gaataggttg 1200gtcgaaaaaa caaacgagaa attccatcaa attgaaaaag aattctcaga cgtggaaggg 1260agaattcagg atctcgagaa atatgttgaa gacaccaaaa tagatctctg gtcatacaat 1320gcggagcttc ttgttgccct ggagaaccaa cacacaattg atctaactga ctcagaaatg 1380aacaaactgt tcgaaagaac aaggaaacaa ctgagggaaa atgctgagga catgggcaat 1440ggttgcttca aaatatacca caaatgtgac aatgcctgca tagggtcgat cagaaatgga 1500acttatgacc ataatgtata cagagacgaa gcattaaaca accgactcca tatcaaaggg 1560gttgagctga agtcaggata caaagattgg atcttatgga tctcattttc catatcatgc 1620tttttgtttt gtgttgtttt gctggggttc atcatgtggg cctgccaaaa aggcaacatt 1680aggtgcaaca tttgcatttg a 170154566PRTArtificial sequenceVirus 54Met Lys Thr Ile Ile Ala Leu Ser Tyr Ile Leu Cys Leu Val Phe Ala1 5 10 15Gln Lys Leu Pro Arg Asn Asp Asn Ser Thr Ala Thr Leu Cys Leu Gly 20 25 30His His Ala Val Ser 
Asn Gly Thr Leu Val Lys Thr Ile Thr Asn Asp 35 40 45Gln Ile Glu Val Thr Asn Ala Thr Glu Leu Val Gln Ser Ser Ser Thr 50 55 60Gly Arg Ile Cys Asp Arg Pro His Arg Ile Leu Asp Gly Glu Asn Cys65 70 75 80Thr Leu Ile Asp Ala Leu Leu Gly Asp Pro His Cys Asp Ser Phe Gln 85 90 95Asn Lys Glu Trp Asp Leu Phe Val Glu Arg Ser Thr Ala Tyr Ser Asp 100 105 110Cys Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser Leu Arg Ser Leu Val 115 120 125Ala Ser Ser Gly Thr Leu Glu Phe Asn Asp Glu Ser Phe Asp Trp Thr 130 135 140Gly Val Ser Gln Asp Gly Thr Ser Asn Ala Cys Lys Arg Arg Ser Val145 150 155 160Lys Ser Phe Phe Ser Arg Leu Asn Trp Leu Tyr Lys Leu Glu Tyr Lys 165 170 175Tyr Pro Ala Leu Asn Val Thr Met Pro Asn Asn Glu Lys Phe Asp Lys 180 185 190Leu Tyr Ile Trp Gly Val His His Pro Ser Thr Asp Ser Asp Gln Thr 195 200 205Ser Leu Tyr Val Gln Ala Ser Gly Arg Val Thr Ile Ser Thr Lys Arg 210 215 220Ser Gln Gln Thr Val Ile Pro Asn Ile Gly Ser Arg Pro Trp Val Arg225 230 235 240Gly Ile Ser Ser Arg Ile Ser Ile Tyr Trp Thr Ile Val Lys Pro Gly 245 250 255Asp Ile Leu Met Ile Asn Ser Thr Gly Asn Leu Ile Ala Pro Arg Gly 260 265 270Tyr Phe Lys Ile Arg Ser Gly Glu Ser Ser Ile Met Arg Ser Asp Ala 275 280 285Pro Ile Asp Ser Cys Asn Ser Glu Cys Ile Thr Pro Asn Gly Ser Ile 290 295 300Pro Asn Asn Lys Pro Phe Gln Asn Val Asn Arg Ile Thr Tyr Gly Ala305 310 315 320Cys Pro Arg Tyr Val Lys Gln Lys Thr Leu Lys Leu Ala Thr Gly Met 325 330 335Arg Asn Val Pro Glu Lys Gln Ala Arg Gly Ile Phe Gly Ala Ile Ala 340 345 350Gly Phe Ile Glu Asn Gly Trp Glu Gly Met Val Asp Gly Trp Tyr Gly 355 360 365Phe Arg His Leu Asn Ser Glu Gly Ser Gly Gln Ala Ala Asp Leu Lys 370 375 380Ser Thr Gln Ala Ala Ile Asn Gln Ile Asn Gly Lys Leu Asn Arg Leu385 390 395 400Val Glu Lys Thr Asn Glu Lys Phe His Gln Ile Glu Lys Glu Phe Ser 405 410 415Asp Val Glu Gly Arg Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr 420 425 430Lys Ile Asp Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu 435 440 445Asn Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe 450 
455 460Glu Arg Thr Arg Lys Gln Leu Arg Glu Asn Ala Glu Asp Met Gly Asn465 470 475 480Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala Cys Ile Gly Ser 485 490 495Ile Arg Asn Gly Thr Tyr Asp His Asn Val Tyr Arg Asp Glu Ala Leu 500 505 510Asn Asn Arg Leu His Ile Lys Gly Val Glu Leu Lys Ser Gly Tyr Lys 515 520 525Asp Trp Ile Leu Trp Ile Ser Phe Ser Ile Ser Cys Phe Leu Phe Cys 530 535 540Val Val Leu Leu Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile545 550 555 560Arg Cys Asn Ile Cys Ile 565551701DNAArtificial sequenceVirus 55atgaagacga tcatcgctct gtcgtacatc ctgtgtctgg tctttgctca aaaactgccc 60cgaaatgaca actcgacggc tacgctgtgc ctgggacacc atgctgtgtc gaacggaacg 120ctggtgaaaa cgatcacgaa tgaccaaatc gaagtgacga atgctacgga actggtccag 180tcgtcgtcga cgggaagaat ctgtgaccga cctcatcgaa tcctggatgg agaaaactgc 240acgctgatcg atgctctgct gggagaccct cattgtgatt cgtttcaaaa caaggaatgg 300gacctgttcg tcgaacgctc gacggcttac tcggactgtt acccttatga tgtgcccgat 360tatgcttcgc tgagatcgct ggtcgcttcg tcgggaacgc tggaattcaa cgatgaatcg 420tttgattgga cgggagtctc gcaggatgga acgtcgaatg cttgcaaaag aagatcggtc 480aaatcgttct tctcgagact gaattggctg tacaaactgg aatacaaata tcccgctctg 540aacgtgacga tgcccaacaa tgaaaaattc gacaaactgt acatctgggg agtgcaccac 600ccctcgacgg actcggacca aacgtcgctg tatgtccaag cttcgggaag agtcacgatc 660tcgacgaaaa gatcgcaaca aacggtcatc cccaatatcg gatcgagacc ctgggtcaga 720ggaatctcgt cgagaatctc gatctattgg acgatcgtca aacctggaga catcctgatg 780atcaactcga cgggaaatct gatcgctcct cgaggatact ttaagatccg atcgggagaa 840tcgtcgatca tgagatcgga tgctcccatc gattcgtgca attcggaatg catcacgccc 900aatggatcga tccccaataa caaacccttc caaaatgtca acagaatcac gtatggagct 960tgtcctagat atgtcaaaca aaaaacgctg aaactggcta cgggaatgcg aaatgtcccc 1020gaaaaacaag ctagaggaat ctttggagct atcgctggat ttatcgaaaa tggatgggaa 1080ggaatggtcg acggatggta cggattcaga catctgaatt cggaaggatc gggacaagct 1140gctgacctga aatcgacgca ggctgctatc aaccaaatca acggaaaact gaatagactg 1200gtcgaaaaaa cgaacgaaaa atttcatcaa atcgaaaaag aattttcgga cgtggaagga 1260agaatccagg 
atctggaaaa atatgtcgaa gacacgaaaa tcgatctgtg gtcgtacaat 1320gctgaactgc tggtcgctct ggaaaaccaa cacacgatcg atctgacgga ctcggaaatg 1380aacaaactgt ttgaaagaac gagaaaacaa ctgagagaaa atgctgaaga catgggaaat 1440ggatgcttta aaatctacca caaatgtgac aatgcttgca tcggatcgat cagaaatgga 1500acgtatgacc ataatgtcta cagagacgaa gctctgaaca accgactgca tatcaaagga 1560gtcgaactga agtcgggata caaagattgg atcctgtgga tctcgttctc gatctcgtgc 1620ttcctgttct gtgtcgtcct gctgggattt atcatgtggg cttgccaaaa aggaaacatc 1680agatgcaaca tctgcatctg a 1701561410DNAArtificial sequenceVirus 56atgaatccaa atcagaagat aacaaccatt ggatcaatct gtatggtaat tggaatagtt 60agcttgatgt tacaaattgg gaacataatc tcaatatggg ttagtcattc aattcaaaca 120gggaatcaac accaggctga accatgcaat caaagcatta ttacttatga aaacaacacc 180tgggtaaacc agacatatgt caacatcagc aataccaatt ttcttactga gaaagctgtg 240gcttcagtaa cattagcggg caattcatct ctttgcccca ttagtggatg ggctgtatac 300agtaaggaca acggtataag aatcggttcc aagggggatg tgtttgttat aagagagccg 360ttcatctcat gctcccactt ggaatgcaga actttctttt tgactcaggg agccttgctg 420aatgacaagc attctaatgg gaccgtcaaa gacagaagcc ctcacagaac attaatgagt 480tgtcccgtgg gtgaggctcc ttccccatac aactcgaggt ttgagtctgt tgcttggtcg 540gcaagtgctt gtcatgatgg cactagttgg ttgacaattg gaatttctgg cccagacaat 600ggggctgtgg ctgtattgaa atacaatggc ataataacag acactatcaa gagttggagg 660aacaacataa tgagaactca agagtctgaa tgtgcatgtg taaatggctc ttgctttact 720gttatgactg atggaccaag taatgggcag gcttcataca aaatcttcag aatagaaaaa 780gggaaagtag ttaaatcagc cgaattaaat gcccctaatt atcactatga ggagtgctcc 840tgttatcctg atgctggaga aatcacatgt gtgtgcaggg ataactggca tggctcaaat 900cggccatggg tatctttcaa tcaaaatttg gagtatcgaa taggatatat atgcagtgga 960gttttcggag acaatccacg ccccaatgat gggacaggca gttgtggtcc ggtgtcccct 1020aaaggggcat atggaataaa agggttctca tttaaatacg gcaatggtgt ttggatcggg 1080agaaccaaaa gcactaattc caggagcggc tttgaaatga tttgggatcc aaatggatgg 1140actggtacgg acagtaattt ttcagtaaag caagatattg tagctataac cgattggtca 1200ggatatagcg ggagttttgt ccagcatcca gaactgacag gattagattg cataagacct 
1260tgtttctggg ttgagctaat cagagggcgg cccaaagaga gcacaatttg gactagtggg 1320agcagcatat ccttttgtgg tgtaaatagt gacactgtgg gttggtcttg gccagacggt 1380gctgagttgc cattcaccat tgacaagtag 141057469PRTArtificial sequenceVirus 57Met Asn Pro Asn Gln Lys Ile Thr Thr Ile Gly Ser Ile Cys Met Val1 5 10 15Ile Gly Ile Val Ser Leu Met Leu Gln Ile Gly Asn Ile Ile Ser Ile 20 25 30Trp Val Ser His Ser Ile Gln Thr Gly Asn Gln His Gln Ala Glu Pro 35 40 45Cys Asn Gln Ser Ile Ile Thr Tyr Glu Asn Asn Thr Trp Val Asn Gln 50 55 60Thr Tyr Val Asn Ile Ser Asn Thr Asn Phe Leu Thr Glu Lys Ala Val65 70 75 80Ala Ser Val Thr Leu Ala Gly Asn Ser Ser Leu Cys Pro Ile Ser Gly 85 90 95Trp Ala Val Tyr Ser Lys Asp Asn Gly Ile Arg Ile Gly Ser Lys Gly 100 105 110Asp Val Phe Val Ile Arg Glu Pro Phe Ile Ser Cys Ser His Leu Glu 115 120 125Cys Arg Thr Phe Phe Leu Thr Gln Gly Ala Leu Leu Asn Asp Lys His 130 135 140Ser Asn Gly Thr Val Lys Asp Arg Ser Pro His Arg Thr Leu Met Ser145 150 155 160Cys Pro Val Gly Glu Ala Pro Ser Pro Tyr Asn Ser Arg Phe Glu Ser 165 170 175Val Ala Trp Ser Ala Ser Ala Cys His Asp Gly Thr Ser Trp Leu Thr 180 185 190Ile Gly Ile Ser Gly Pro Asp Asn Gly Ala Val Ala Val Leu Lys Tyr 195 200 205Asn Gly Ile Ile Thr Asp Thr Ile Lys Ser Trp Arg Asn Asn Ile Met 210 215 220Arg Thr Gln Glu Ser Glu Cys Ala Cys Val Asn Gly Ser Cys Phe Thr225 230 235 240Val Met Thr Asp Gly Pro Ser Asn Gly Gln Ala Ser Tyr Lys Ile Phe 245 250 255Arg Ile Glu Lys Gly Lys Val Val Lys Ser Ala Glu Leu Asn Ala Pro 260 265 270Asn Tyr His Tyr Glu Glu Cys Ser Cys Tyr Pro Asp Ala Gly Glu Ile 275 280 285Thr Cys Val Cys Arg Asp Asn Trp His Gly Ser Asn Arg Pro Trp Val 290 295 300Ser Phe Asn Gln Asn Leu Glu Tyr Arg Ile Gly Tyr Ile Cys Ser Gly305 310 315 320Val Phe Gly Asp Asn Pro Arg Pro Asn Asp Gly Thr Gly Ser Cys Gly 325 330 335Pro Val Ser Pro Lys Gly Ala Tyr Gly Ile Lys Gly Phe Ser Phe Lys 340 345 350Tyr Gly Asn Gly Val Trp Ile Gly Arg Thr Lys Ser Thr Asn Ser Arg 355 360 365Ser Gly Phe Glu Met Ile Trp Asp Pro Asn Gly Trp Thr Gly Thr Asp 
370 375 380Ser Asn Phe Ser Val Lys Gln Asp Ile Val Ala Ile Thr Asp Trp Ser385 390 395 400Gly Tyr Ser Gly Ser Phe Val Gln His Pro Glu Leu Thr Gly Leu Asp 405 410 415Cys Ile Arg Pro Cys Phe Trp Val Glu Leu Ile Arg Gly Arg Pro Lys 420 425 430Glu Ser Thr Ile Trp Thr Ser Gly Ser Ser Ile Ser Phe Cys Gly Val 435 440 445Asn Ser Asp Thr Val Gly Trp Ser Trp Pro Asp Gly Ala Glu Leu Pro 450 455 460Phe Thr Ile Asp Lys465581410DNAArtificial sequenceVirus 58atgaatccca atcagaagat cacgacgatc ggatcgatct gtatggtcat cggaatcgtc 60tcgctgatgc tgcaaatcgg aaacatcatc tcgatctggg tctcgcattc gatccaaacg 120ggaaatcaac accaggctga accctgcaat caatcgatca tcacgtatga aaacaacacg 180tgggtcaacc agacgtatgt caacatctcg aatacgaatt tcctgacgga aaaagctgtg 240gcttcggtca cgctggctgg aaattcgtcg ctgtgcccca tctcgggatg ggctgtctac 300tcgaaggaca acggaatcag aatcggatcg aagggagatg tgttcgtcat cagagaaccc 360tttatctcgt gctcgcacct ggaatgcaga acgtttttcc tgacgcaggg agctctgctg 420aatgacaagc attcgaatgg aacggtcaaa gacagatcgc ctcacagaac gctgatgtcg 480tgtcccgtgg gagaagctcc ttcgccctac aactcgagat tcgaatcggt cgcttggtcg 540gcttcggctt gtcatgatgg aacgtcgtgg ctgacgatcg gaatctcggg acccgacaat 600ggagctgtgg ctgtcctgaa atacaatgga atcatcacgg acacgatcaa gtcgtggaga 660aacaacatca tgagaacgca agaatcggaa tgtgcttgtg tcaatggatc gtgcttcacg 720gtcatgacgg atggaccctc gaatggacag gcttcgtaca aaatctttag aatcgaaaaa 780ggaaaagtcg tcaaatcggc tgaactgaat gctcctaatt atcactatga agaatgctcg 840tgttatcctg atgctggaga aatcacgtgt gtgtgcagag ataactggca tggatcgaat 900cgaccctggg tctcgtttaa tcaaaatctg gaatatcgaa tcggatatat ctgctcggga 960gtctttggag acaatccccg ccccaatgat ggaacgggat cgtgtggacc cgtgtcgcct 1020aaaggagctt atggaatcaa aggattttcg ttcaaatacg gaaatggagt ctggatcgga 1080agaacgaaat cgacgaattc gagatcggga ttcgaaatga tctgggatcc caatggatgg 1140acgggaacgg actcgaattt ctcggtcaag caagatatcg tcgctatcac ggattggtcg 1200ggatattcgg gatcgttcgt ccagcatccc gaactgacgg gactggattg catcagacct 1260tgtttttggg tcgaactgat cagaggacga cccaaagaat cgacgatctg gacgtcggga 1320tcgtcgatct cgttctgtgg 
agtcaattcg gacacggtgg gatggtcgtg gcccgacgga 1380gctgaactgc cctttacgat cgacaagtag 1410591410DNAArtificial sequenceVirus 59atgaatacaa atcaaaaaat aataaccatt ggaacagcct gtctgatagt cggaataatt 60agtctattat tgcagatagg agatatagtc tcgttatgga taagccattc aattcagact 120ggagagaaaa accactctca gatatgcagt caaagtgtca ttacatatga aaacaacaca 180tgggtgaacc aaacttatgt aaacattggc aataccaata ttgctgatgg acagggagta 240aattcaataa tactagcggg caattcctct ctttgcccag taagtggatg ggccatatac 300agcaaagaca atagcataag gatcggttcc aaaggagaca tttttgtcat aagagaacta 360tttatctcat gctctcattt ggagtgcaga actttttatc tgacccaagg tgctttgctg 420aatgacaagc attctaatgg aaccgtcaaa gacaggagtc cttatagaac cttaatgagc 480tgcccgattg gtgaagctcc ttctccgtac aattcaaggt tcgaatcagt tgcttggtca 540gcaagtgcat gccatgacgg aatgggatgg ctgacaatcg gaatttccgg cccagataat 600ggagcagtgg ctgttttgaa atacaatggg ataataacag atacaataaa aagttggagg 660aacaaaatac taagaacaca agaatcagaa tgtgtctgta taaacggttc gtgtttcact 720ataatgactg atggcccaag caatgggcag gcctcataca aaatattcaa aatgaagaaa 780gggaaaatta ttaaatcagt ggagatgaat gcacctaatt accactatga ggaatgctcc 840tgttaccctg atacaggcaa agtggtgtgc gtgtgcagag acaattggca tgcttcgaat 900agaccgtggg tctctttcga tcagaacctt aattatcaga tagggtacat atgtagtggg 960gttttcggtg ataacccgcg ttctaatgat gggagaggcg attgtgggcc agtactttct 1020aatggagcta atggagtgaa aggattctca tttaggtatg gcaatggcgt ttggatagga 1080agaactaaaa gcatcagctc tagaagtgga tttgagatga tttgggatcc gaatggatgg 1140acggaaaccg atagtagttt ctcgataaag caggatgtta tagcattaac tgattggtca 1200ggatacagtg ggaactttgt ccaacatccc gaattaacag gaatgaactg cataaagcct 1260tgtttctggg tagagttaat cagaggacag cccaaggaga gaacaatctg gactagtgga 1320agcagcattt ctttctgtgg tgtagacagt gaaaccgcaa gctggtcatg gccagacgga 1380gctgatctgc cattcactat tgacaagtag 141060469PRTArtificial sequenceVirus 60Met Asn Thr Asn Gln Lys Ile Ile Thr Ile Gly Thr Ala Cys Leu Ile1 5 10 15Val Gly Ile Ile Ser Leu Leu Leu Gln Ile Gly Asp Ile Val Ser Leu 20 25 30Trp Ile Ser His Ser Ile Gln Thr Gly Glu Lys Asn His Ser Gln Ile 35 40 
45Cys Ser Gln Ser Val Ile Thr Tyr Glu Asn Asn Thr Trp Val Asn Gln 50 55 60Thr Tyr Val Asn Ile Gly Asn Thr Asn Ile Ala Asp Gly Gln Gly Val65 70 75 80Asn Ser Ile Ile Leu Ala Gly Asn Ser Ser Leu Cys Pro Val Ser Gly 85 90 95Trp Ala Ile Tyr Ser Lys Asp Asn Ser Ile Arg Ile Gly Ser Lys Gly 100 105 110Asp Ile Phe Val Ile Arg Glu Leu Phe Ile Ser Cys Ser His Leu Glu 115 120 125Cys Arg Thr Phe Tyr Leu Thr Gln Gly Ala Leu Leu Asn Asp Lys His 130 135 140Ser Asn Gly Thr Val Lys Asp Arg Ser Pro Tyr Arg Thr Leu Met
Ser145 150 155 160Cys Pro Ile Gly Glu Ala Pro Ser Pro Tyr Asn Ser Arg Phe Glu Ser 165 170 175Val Ala Trp Ser Ala Ser Ala Cys His Asp Gly Met Gly Trp Leu Thr 180 185 190Ile Gly Ile Ser Gly Pro Asp Asn Gly Ala Val Ala Val Leu Lys Tyr 195 200 205Asn Gly Ile Ile Thr Asp Thr Ile Lys Ser Trp Arg Asn Lys Ile Leu 210 215 220Arg Thr Gln Glu Ser Glu Cys Val Cys Ile Asn Gly Ser Cys Phe Thr225 230 235 240Ile Met Thr Asp Gly Pro Ser Asn Gly Gln Ala Ser Tyr Lys Ile Phe 245 250 255Lys Met Lys Lys Gly Lys Ile Ile Lys Ser Val Glu Met Asn Ala Pro 260 265 270Asn Tyr His Tyr Glu Glu Cys Ser Cys Tyr Pro Asp Thr Gly Lys Val 275 280 285Val Cys Val Cys Arg Asp Asn Trp His Ala Ser Asn Arg Pro Trp Val 290 295 300Ser Phe Asp Gln Asn Leu Asn Tyr Gln Ile Gly Tyr Ile Cys Ser Gly305 310 315 320Val Phe Gly Asp Asn Pro Arg Ser Asn Asp Gly Arg Gly Asp Cys Gly 325 330 335Pro Val Leu Ser Asn Gly Ala Asn Gly Val Lys Gly Phe Ser Phe Arg 340 345 350Tyr Gly Asn Gly Val Trp Ile Gly Arg Thr Lys Ser Ile Ser Ser Arg 355 360 365Ser Gly Phe Glu Met Ile Trp Asp Pro Asn Gly Trp Thr Glu Thr Asp 370 375 380Ser Ser Phe Ser Ile Lys Gln Asp Val Ile Ala Leu Thr Asp Trp Ser385 390 395 400Gly Tyr Ser Gly Asn Phe Val Gln His Pro Glu Leu Thr Gly Met Asn 405 410 415Cys Ile Lys Pro Cys Phe Trp Val Glu Leu Ile Arg Gly Gln Pro Lys 420 425 430Glu Arg Thr Ile Trp Thr Ser Gly Ser Ser Ile Ser Phe Cys Gly Val 435 440 445Asp Ser Glu Thr Ala Ser Trp Ser Trp Pro Asp Gly Ala Asp Leu Pro 450 455 460Phe Thr Ile Asp Lys465611410DNAArtificial sequenceVirus 61atgaatacga atcaaaaaat catcacgatc ggaacggctt gtctgatcgt cggaatcatc 60tcgctgctgc tgcagatcgg agatatcgtc tcgctgtgga tctcgcattc gatccagacg 120ggagaaaaaa accactcgca gatctgctcg caatcggtca tcacgtatga aaacaacacg 180tgggtgaacc aaacgtatgt caacatcgga aatacgaata tcgctgatgg acagggagtc 240aattcgatca tcctggctgg aaattcgtcg ctgtgccccg tctcgggatg ggctatctac 300tcgaaagaca attcgatcag aatcggatcg aaaggagaca tcttcgtcat cagagaactg 360ttcatctcgt gctcgcatct ggaatgcaga acgttctatc tgacgcaagg agctctgctg 
420aatgacaagc attcgaatgg aacggtcaaa gacagatcgc cttatagaac gctgatgtcg 480tgccccatcg gagaagctcc ttcgccctac aattcgagat ttgaatcggt cgcttggtcg 540gcttcggctt gccatgacgg aatgggatgg ctgacgatcg gaatctcggg acccgataat 600ggagctgtgg ctgtcctgaa atacaatgga atcatcacgg atacgatcaa atcgtggaga 660aacaaaatcc tgagaacgca agaatcggaa tgtgtctgta tcaacggatc gtgttttacg 720atcatgacgg atggaccctc gaatggacag gcttcgtaca aaatctttaa aatgaagaaa 780ggaaaaatca tcaaatcggt ggaaatgaat gctcctaatt accactatga agaatgctcg 840tgttaccctg atacgggaaa agtggtgtgc gtgtgcagag acaattggca tgcttcgaat 900agaccctggg tctcgtttga tcagaacctg aattatcaga tcggatacat ctgttcggga 960gtctttggag ataacccccg ttcgaatgat ggaagaggag attgtggacc cgtcctgtcg 1020aatggagcta atggagtgaa aggattttcg ttcagatatg gaaatggagt ctggatcgga 1080agaacgaaat cgatctcgtc gagatcggga ttcgaaatga tctgggatcc caatggatgg 1140acggaaacgg attcgtcgtt ttcgatcaag caggatgtca tcgctctgac ggattggtcg 1200ggatactcgg gaaacttcgt ccaacatccc gaactgacgg gaatgaactg catcaagcct 1260tgtttttggg tcgaactgat cagaggacag cccaaggaaa gaacgatctg gacgtcggga 1320tcgtcgatct cgttttgtgg agtcgactcg gaaacggctt cgtggtcgtg gcccgacgga 1380gctgatctgc cctttacgat cgacaagtag 141062576DNAArtificial sequenceVirus 62taccaagtgc gcaattcctc ggggctttac catgtcacca atgattgccc taactcgagt 60attgtgtacg aggcggccga tgccatcctg cacactccgg ggtgtgtccc ttgcgttcgc 120gagggtaacg cctcgaggtg ttgggtggcg gtgaccccca cggtggccac cagggacggc 180aaactcccca caacgcagct tcgacgtcat atcgatctgc ttgtcgggag cgccaccctc 240tgctcggccc tctacgtggg ggacctgtgc gggtctgtct ttcttgttgg tcaactgttt 300accttctctc ccaggcgcca ctggacgacg caagactgca attgttctat ctatcccggc 360catataacgg gtcatcgcat ggcatgggat atgatgatga actggtcccc tacggcagcg 420ttggtggtag ctcagctgct ccggatccca caagccatca tggacatgat cgctggtgct 480cactggggag tcctggcggg catagcgtat ttctccatgg tggggaactg ggcgaaggtc 540ctggtagtgc tgctgctatt tgccggcgtc gacgcg 57663192PRTArtificial sequenceVirus 63Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp Cys1 5 10 15Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp 
Ala Ile Leu His Thr 20 25 30Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser Arg Cys Trp 35 40 45Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly Lys Leu Pro Thr 50 55 60Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly Ser Ala Thr Leu65 70 75 80Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys Gly Ser Val Phe Leu Val 85 90 95Gly Gln Leu Phe Thr Phe Ser Pro Arg Arg His Trp Thr Thr Gln Asp 100 105 110Cys Asn Cys Ser Ile Tyr Pro Gly His Ile Thr Gly His Arg Met Ala 115 120 125Trp Asp Met Met Met Asn Trp Ser Pro Thr Ala Ala Leu Val Val Ala 130 135 140Gln Leu Leu Arg Ile Pro Gln Ala Ile Met Asp Met Ile Ala Gly Ala145 150 155 160His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn 165 170 175Trp Ala Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val Asp Ala 180 185 19064576DNAArtificial sequenceVirus 64taccaagtgc gcaattcgtc gggactgtac catgtcacga atgattgccc taactcgtcg 60atcgtgtacg aagctgctga tgctatcctg cacacgcccg gatgtgtccc ttgcgtccgc 120gaaggaaacg cttcgagatg ttgggtggct gtgacgccca cggtggctac gagagacgga 180aaactgccca cgacgcagct gcgacgtcat atcgatctgc tggtcggatc ggctacgctg 240tgctcggctc tgtacgtggg agacctgtgc ggatcggtct tcctggtcgg acaactgttc 300acgttttcgc ccagacgcca ctggacgacg caagactgca attgttcgat ctatcccgga 360catatcacgg gacatcgcat ggcttgggat atgatgatga actggtcgcc tacggctgct 420ctggtggtcg ctcagctgct gcgaatcccc caagctatca tggacatgat cgctggagct 480cactggggag tcctggctgg aatcgcttat ttttcgatgg tgggaaactg ggctaaggtc 540ctggtcgtgc tgctgctgtt cgctggagtc gacgct 576651089DNAArtificial sequenceVirus 65gaaacccacg tcaccggggg aagtgccggc cgcaccacgg ctgggcttgt tggtctcctt 60acaccaggcg ccaagcagaa catccaactg atcaacacca acggcagttg gcacatcaat 120agcacggcct tgaactgcaa tgaaagcctt aacaccggct ggttagcagg gctcttctat 180cagcacaaat tcaactcttc aggctgtcct gagaggttgg ccagctgccg acgccttacc 240gattttgccc agggctgggg tcctatcagt tatgccaacg gaagcggcct cgacgaacgc 300ccctactgct ggcactaccc tccaagacct tgtggcattg tgcccgcaaa gagcgtgtgt 360ggcccggtat attgcttcac tcccagcccc gtggtggtgg gaacgaccga caggtcgggc 420gcgcctacct 
acagctgggg tgcaaatgat acggatgtct tcgtccttaa caacaccagg 480ccaccgctgg gcaattggtt cggttgtacc tggatgaact caactggatt caccaaagtg 540tgcggagcgc ccccttgtgt catcggaggg gtgggcaaca acaccttgct ctgccccact 600gattgtttcc gcaagcatcc ggaagccaca tactctcggt gcggctccgg tccctggatt 660acacccaggt gcatggtcga ctacccgtat aggctttggc actatccttg taccatcaat 720tacaccatat tcaaagtcag gatgtacgtg ggaggggtcg agcacaggct ggaagcggcc 780tgcaactgga cgcggggcga acgctgtgat ctggaagaca gggacaggtc cgagctcagc 840ccattgctgc tgtccaccac acagtggcag gtccttccgt gttctttcac gaccctgcca 900gccttgtcca ccggcctcat ccacctccac cagaacattg tggacgtgca gtacttgtac 960ggggtagggt caagcatcgc gtcctgggcc attaagtggg agtacgtcgt tctcctgttc 1020ctcctgcttg cagacgcgcg cgtctgctcc tgcttgtgga tgatgttact catatcccaa 1080gcggaggcg 108966363PRTArtificial sequenceVirus 66Glu Thr His Val Thr Gly Gly Ser Ala Gly Arg Thr Thr Ala Gly Leu1 5 10 15Val Gly Leu Leu Thr Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn 20 25 30Thr Asn Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu 35 40 45Ser Leu Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr Gln His Lys Phe 50 55 60Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Arg Leu Thr65 70 75 80Asp Phe Ala Gln Gly Trp Gly Pro Ile Ser Tyr Ala Asn Gly Ser Gly 85 90 95Leu Asp Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro Arg Pro Cys Gly 100 105 110Ile Val Pro Ala Lys Ser Val Cys Gly Pro Val Tyr Cys Phe Thr Pro 115 120 125Ser Pro Val Val Val Gly Thr Thr Asp Arg Ser Gly Ala Pro Thr Tyr 130 135 140Ser Trp Gly Ala Asn Asp Thr Asp Val Phe Val Leu Asn Asn Thr Arg145 150 155 160Pro Pro Leu Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly 165 170 175Phe Thr Lys Val Cys Gly Ala Pro Pro Cys Val Ile Gly Gly Val Gly 180 185 190Asn Asn Thr Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu 195 200 205Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro Arg Cys 210 215 220Met Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro Cys Thr Ile Asn225 230 235 240Tyr Thr Ile Phe Lys Val Arg Met Tyr Val Gly Gly Val Glu His Arg 245 250 255Leu Glu 
Ala Ala Cys Asn Trp Thr Arg Gly Glu Arg Cys Asp Leu Glu 260 265 270Asp Arg Asp Arg Ser Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln 275 280 285Trp Gln Val Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr 290 295 300Gly Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr305 310 315 320Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr Val 325 330 335Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys Ser Cys Leu 340 345 350Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala 355 360671089DNAArtificial sequenceVirus 67gaaacgcacg tcacgggagg atcggctgga cgcacgacgg ctggactggt cggactgctg 60acgcccggag ctaagcagaa catccaactg atcaacacga acggatcgtg gcacatcaat 120tcgacggctc tgaactgcaa tgaatcgctg aacacgggat ggctggctgg actgttttat 180cagcacaaat ttaactcgtc gggatgtcct gaaagactgg cttcgtgccg acgcctgacg 240gatttcgctc agggatgggg acctatctcg tatgctaacg gatcgggact ggacgaacgc 300ccctactgct ggcactaccc tcccagacct tgtggaatcg tgcccgctaa gtcggtgtgt 360ggacccgtct attgctttac gccctcgccc gtggtggtgg gaacgacgga cagatcggga 420gctcctacgt actcgtgggg agctaatgat acggatgtct ttgtcctgaa caacacgaga 480ccccccctgg gaaattggtt tggatgtacg tggatgaact cgacgggatt tacgaaagtg 540tgcggagctc ccccttgtgt catcggagga gtgggaaaca acacgctgct gtgccccacg 600gattgttttc gcaagcatcc cgaagctacg tactcgcgat gcggatcggg accctggatc 660acgcccagat gcatggtcga ctacccctat agactgtggc actatccttg tacgatcaat 720tacacgatct ttaaagtcag aatgtacgtg ggaggagtcg aacacagact ggaagctgct 780tgcaactgga cgcgaggaga acgctgtgat ctggaagaca gagacagatc ggaactgtcg 840cccctgctgc tgtcgacgac gcagtggcag gtcctgccct gttcgtttac gacgctgccc 900gctctgtcga cgggactgat ccacctgcac cagaacatcg tggacgtgca gtacctgtac 960ggagtcggat cgtcgatcgc ttcgtgggct atcaagtggg aatacgtcgt cctgctgttt 1020ctgctgctgg ctgacgctcg cgtctgctcg tgcctgtgga tgatgctgct gatctcgcaa 1080gctgaagct 1089682724DNAArtificial sequenceVirus 68atggaggcag ccttgcttgt gtgtcagtac accatccaga gcctgatcca tctcacgggt 60gaagatcctg gttttttcaa tgttgagatt ccggaattcc cattttaccc cacatgcaat 120gtttgcacgg cagatgtcaa tgtaactatc 
aatttcgatg tcgggggcaa aaagcatcaa 180cttgatcttg actttggcca gctgacaccc catacgaagg ctgtctacca acctcgaggt 240gcatttggtg gctcagaaaa tgccaccaat ctctttctac tggagctcct tggtgcagga 300gaattggctc taactatgcg gtctaagaag cttccaatta acgtcaccac cggagaggag 360caacaagtaa gcctggaatc tgtagatgtc tactttcaag atgtgtttgg aaccatgtgg 420tgccaccatg cagaaatgca aaaccccgtg tacctgatac cagaaacagt gccatacata 480aagtgggata actgtaattc taccaatata acggcagtag tgagggcaca ggggctggat 540gtcacgctac ccttaagttt gccaacgtca gctcaagact cgaatttcag cgtaaaaaca 600gaaatgctcg gtaatgagat agatattgag tgtattatgg aggatggcga aatttcacaa 660gttctgcccg gagacaacaa atttaacatc acctgcagtg gatacgagag ccatgttccc 720agcggcggaa ttctcacatc aacgagtccc gtggccaccc caatacctgg tacagggtat 780gcatacagcc tgcgtctgac accacgtcca gtgtcacgat ttcttggcaa taacagtatc 840ctgtacgtgt tttactctgg gaatggaccg aaggcgagcg ggggagatta ctgcattcag 900tccaacattg tgttctctga tgagattcca gcttcacagg acatgccgac aaacaccaca 960gacatcacat atgtgggtga caatgctacc tattcagtgc caatggtcac ttctgaggac 1020gcaaactcgc caaatgttac agtgactgcc ttttgggcct ggccaaacaa cactgaaact 1080gactttaagt gcaaatggac tctcacctcg gggacacctt cgggttgtga aaatatttct 1140ggtgcatttg cgagcaatcg gacatttgac attactgtct cgggtcttgg cacggccccc 1200aagacactca ttatcacacg aacggctacc aatgccacca caacaaccca caaggttata 1260ttctccaagg cacccgagag caccaccacc tcccctacct tgaatacaac tggatttgct 1320gatcccaata caacgacagg tctacccagc tctactcacg tgcctaccaa cctcaccgca 1380cctgcaagca caggccccac tgtatccacc gcggatgtca ccagcccaac accagccggc 1440acaacgtcag gcgcatcacc ggtgacacca agtccatctc catgggacaa cggcacagaa 1500agtaaggccc ccgacatgac cagctccacc tcaccagtga ctaccccaac cccaaatgcc 1560accagcccca ccccagcagt gactacccca accccaaatg ccaccagccc caccccagca 1620gtgactaccc caaccccaaa tgccaccagc cccaccttgg gaaaaacaag tcctacctca 1680gcagtgacta ccccaacccc aaatgccacc agccccacct tgggaaaaac aagccccacc 1740tcagcagtga ctaccccaac cccaaatgcc accagcccca ccttgggaaa aacaagcccc 1800acctcagcag tgactacccc aaccccaaat gccaccggcc ctactgtggg agaaacaagt 1860ccacaggcaa 
atgccaccaa ccacacctta ggaggaacaa gtcccacccc agtagttacc 1920agccaaccaa aaaatgcaac cagtgctgtt accacaggcc aacataacat aacttcaagt 1980tcaacctctt ccatgtcact gagacccagt tcaaacccag agacactcag cccctccacc 2040agtgacaatt caacgtcaca tatgccttta ctaacctccg ctcacccaac aggtggtgaa 2100aatataacac aggtgacacc agcctctatc agcacacatc atgtgtccac cagttcgcca 2160gcaccccgcc caggcaccac cagccaagcg tcaggccctg gaaacagttc cacatccaca 2220aaaccggggg aggttaatgt caccaaaggc acgccccccc aaaatgcaac gtcgccccag 2280gcccccagtg gccaaaagac ggcggttccc acggtcacct caacaggtgg aaaggccaat 2340tctaccaccg gtggaaagca caccacagga catggagccc ggacaagtac agagcccacc 2400acagattacg gcggtgattc aactacgcca agaccgagat acaatgcgac cacctatcta 2460cctcccagca cttctagcaa actgcggccc cgctggactt ttacgagccc accggttacc 2520acagcccaag ccaccgtgcc agtcccgcca acgtcccagc ccagattctc aaacctctcc 2580atgctagtac tgcagtgggc ctctctggct gtgctgaccc ttctgctgct gctggtcatg 2640gcggactgcg cctttaggcg taacttgtct acatcccata cctacaccac cccaccatat 2700gatgacgccg agacctatgt ataa 272469907PRTArtificial sequenceVirus 69Met Glu Ala Ala Leu Leu Val Cys Gln Tyr Thr Ile Gln Ser Leu Ile1 5 10 15His Leu Thr Gly Glu Asp Pro Gly Phe Phe Asn Val Glu Ile Pro Glu 20 25 30Phe Pro Phe Tyr Pro Thr Cys Asn Val Cys Thr Ala Asp Val Asn Val 35 40 45Thr Ile Asn Phe Asp Val Gly Gly Lys Lys His Gln Leu Asp Leu Asp 50 55 60Phe Gly Gln Leu Thr Pro His Thr Lys Ala Val Tyr Gln Pro Arg Gly65 70 75 80Ala Phe Gly Gly Ser Glu Asn Ala Thr Asn Leu Phe Leu Leu Glu Leu 85 90 95Leu Gly Ala Gly Glu Leu Ala Leu Thr Met Arg Ser Lys Lys Leu Pro 100 105 110Ile Asn Val Thr Thr Gly Glu Glu Gln Gln Val Ser Leu Glu Ser Val 115 120 125Asp Val Tyr Phe Gln Asp Val Phe Gly Thr Met Trp Cys His His Ala 130 135 140Glu Met Gln Asn Pro Val Tyr Leu Ile Pro Glu Thr Val Pro Tyr Ile145 150 155 160Lys Trp Asp Asn Cys Asn Ser Thr Asn Ile Thr Ala Val Val Arg Ala 165 170 175Gln Gly Leu Asp Val Thr Leu Pro Leu Ser Leu Pro Thr Ser Ala Gln 180 185 190Asp Ser Asn Phe Ser Val Lys Thr Glu Met Leu Gly Asn Glu Ile Asp 195 200 
205Ile Glu Cys Ile Met Glu Asp Gly Glu Ile Ser Gln Val Leu Pro Gly 210 215 220Asp Asn Lys Phe Asn Ile Thr Cys Ser Gly Tyr Glu Ser His Val Pro225 230 235 240Ser Gly Gly Ile Leu Thr Ser Thr Ser Pro Val Ala Thr Pro Ile Pro 245 250 255Gly Thr Gly Tyr Ala Tyr Ser Leu Arg Leu Thr Pro Arg Pro Val Ser 260 265 270Arg Phe Leu Gly Asn Asn Ser Ile Leu Tyr Val Phe Tyr Ser Gly Asn 275 280 285Gly Pro Lys Ala Ser Gly Gly Asp Tyr Cys Ile Gln Ser Asn Ile Val 290
295 300Phe Ser Asp Glu Ile Pro Ala Ser Gln Asp Met Pro Thr Asn Thr Thr305 310 315 320Asp Ile Thr Tyr Val Gly Asp Asn Ala Thr Tyr Ser Val Pro Met Val 325 330 335Thr Ser Glu Asp Ala Asn Ser Pro Asn Val Thr Val Thr Ala Phe Trp 340 345 350Ala Trp Pro Asn Asn Thr Glu Thr Asp Phe Lys Cys Lys Trp Thr Leu 355 360 365Thr Ser Gly Thr Pro Ser Gly Cys Glu Asn Ile Ser Gly Ala Phe Ala 370 375 380Ser Asn Arg Thr Phe Asp Ile Thr Val Ser Gly Leu Gly Thr Ala Pro385 390 395 400Lys Thr Leu Ile Ile Thr Arg Thr Ala Thr Asn Ala Thr Thr Thr Thr 405 410 415His Lys Val Ile Phe Ser Lys Ala Pro Glu Ser Thr Thr Thr Ser Pro 420 425 430Thr Leu Asn Thr Thr Gly Phe Ala Asp Pro Asn Thr Thr Thr Gly Leu 435 440 445Pro Ser Ser Thr His Val Pro Thr Asn Leu Thr Ala Pro Ala Ser Thr 450 455 460Gly Pro Thr Val Ser Thr Ala Asp Val Thr Ser Pro Thr Pro Ala Gly465 470 475 480Thr Thr Ser Gly Ala Ser Pro Val Thr Pro Ser Pro Ser Pro Trp Asp 485 490 495Asn Gly Thr Glu Ser Lys Ala Pro Asp Met Thr Ser Ser Thr Ser Pro 500 505 510Val Thr Thr Pro Thr Pro Asn Ala Thr Ser Pro Thr Pro Ala Val Thr 515 520 525Thr Pro Thr Pro Asn Ala Thr Ser Pro Thr Pro Ala Val Thr Thr Pro 530 535 540Thr Pro Asn Ala Thr Ser Pro Thr Leu Gly Lys Thr Ser Pro Thr Ser545 550 555 560Ala Val Thr Thr Pro Thr Pro Asn Ala Thr Ser Pro Thr Leu Gly Lys 565 570 575Thr Ser Pro Thr Ser Ala Val Thr Thr Pro Thr Pro Asn Ala Thr Ser 580 585 590Pro Thr Leu Gly Lys Thr Ser Pro Thr Ser Ala Val Thr Thr Pro Thr 595 600 605Pro Asn Ala Thr Gly Pro Thr Val Gly Glu Thr Ser Pro Gln Ala Asn 610 615 620Ala Thr Asn His Thr Leu Gly Gly Thr Ser Pro Thr Pro Val Val Thr625 630 635 640Ser Gln Pro Lys Asn Ala Thr Ser Ala Val Thr Thr Gly Gln His Asn 645 650 655Ile Thr Ser Ser Ser Thr Ser Ser Met Ser Leu Arg Pro Ser Ser Asn 660 665 670Pro Glu Thr Leu Ser Pro Ser Thr Ser Asp Asn Ser Thr Ser His Met 675 680 685Pro Leu Leu Thr Ser Ala His Pro Thr Gly Gly Glu Asn Ile Thr Gln 690 695 700Val Thr Pro Ala Ser Ile Ser Thr His His Val Ser Thr Ser Ser Pro705 710 715 720Ala Pro Arg Pro Gly Thr 
Thr Ser Gln Ala Ser Gly Pro Gly Asn Ser 725 730 735Ser Thr Ser Thr Lys Pro Gly Glu Val Asn Val Thr Lys Gly Thr Pro 740 745 750Pro Gln Asn Ala Thr Ser Pro Gln Ala Pro Ser Gly Gln Lys Thr Ala 755 760 765Val Pro Thr Val Thr Ser Thr Gly Gly Lys Ala Asn Ser Thr Thr Gly 770 775 780Gly Lys His Thr Thr Gly His Gly Ala Arg Thr Ser Thr Glu Pro Thr785 790 795 800Thr Asp Tyr Gly Gly Asp Ser Thr Thr Pro Arg Pro Arg Tyr Asn Ala 805 810 815Thr Thr Tyr Leu Pro Pro Ser Thr Ser Ser Lys Leu Arg Pro Arg Trp 820 825 830Thr Phe Thr Ser Pro Pro Val Thr Thr Ala Gln Ala Thr Val Pro Val 835 840 845Pro Pro Thr Ser Gln Pro Arg Phe Ser Asn Leu Ser Met Leu Val Leu 850 855 860Gln Trp Ala Ser Leu Ala Val Leu Thr Leu Leu Leu Leu Leu Val Met865 870 875 880Ala Asp Cys Ala Phe Arg Arg Asn Leu Ser Thr Ser His Thr Tyr Thr 885 890 895Thr Pro Pro Tyr Asp Asp Ala Glu Thr Tyr Val 900 905702724DNAArtificial sequenceVirus 70atggaagctg ctctgctggt gtgtcagtac acgatccagt cgctgatcca tctgacggga 60gaagatcctg gattctttaa tgtcgaaatc cccgaatttc ccttctaccc cacgtgcaat 120gtctgcacgg ctgatgtcaa tgtcacgatc aattttgatg tcggaggaaa aaagcatcaa 180ctggatctgg acttcggaca gctgacgccc catacgaagg ctgtctacca acctcgagga 240gctttcggag gatcggaaaa tgctacgaat ctgttcctgc tggaactgct gggagctgga 300gaactggctc tgacgatgcg atcgaagaag ctgcccatca acgtcacgac gggagaagaa 360caacaagtct cgctggaatc ggtcgatgtc tacttccaag atgtgttcgg aacgatgtgg 420tgccaccatg ctgaaatgca aaaccccgtg tacctgatcc ccgaaacggt gccctacatc 480aagtgggata actgtaattc gacgaatatc acggctgtcg tgagagctca gggactggat 540gtcacgctgc ccctgtcgct gcccacgtcg gctcaagact cgaatttttc ggtcaaaacg 600gaaatgctgg gaaatgaaat cgatatcgaa tgtatcatgg aagatggaga aatctcgcaa 660gtcctgcccg gagacaacaa attcaacatc acgtgctcgg gatacgaatc gcatgtcccc 720tcgggaggaa tcctgacgtc gacgtcgccc gtggctacgc ccatccctgg aacgggatat 780gcttactcgc tgcgtctgac gccccgtccc gtgtcgcgat tcctgggaaa taactcgatc 840ctgtacgtgt tctactcggg aaatggaccc aaggcttcgg gaggagatta ctgcatccag 900tcgaacatcg tgttttcgga tgaaatcccc gcttcgcagg acatgcccac gaacacgacg 
960gacatcacgt atgtgggaga caatgctacg tattcggtgc ccatggtcac gtcggaagac 1020gctaactcgc ccaatgtcac ggtgacggct ttctgggctt ggcccaacaa cacggaaacg 1080gacttcaagt gcaaatggac gctgacgtcg ggaacgcctt cgggatgtga aaatatctcg 1140ggagctttcg cttcgaatcg aacgttcgac atcacggtct cgggactggg aacggctccc 1200aagacgctga tcatcacgcg aacggctacg aatgctacga cgacgacgca caaggtcatc 1260ttttcgaagg ctcccgaatc gacgacgacg tcgcctacgc tgaatacgac gggattcgct 1320gatcccaata cgacgacggg actgccctcg tcgacgcacg tgcctacgaa cctgacggct 1380cctgcttcga cgggacccac ggtctcgacg gctgatgtca cgtcgcccac gcccgctgga 1440acgacgtcgg gagcttcgcc cgtgacgccc tcgccctcgc cctgggacaa cggaacggaa 1500tcgaaggctc ccgacatgac gtcgtcgacg tcgcccgtga cgacgcccac gcccaatgct 1560acgtcgccca cgcccgctgt gacgacgccc acgcccaatg ctacgtcgcc cacgcccgct 1620gtgacgacgc ccacgcccaa tgctacgtcg cccacgctgg gaaaaacgtc gcctacgtcg 1680gctgtgacga cgcccacgcc caatgctacg tcgcccacgc tgggaaaaac gtcgcccacg 1740tcggctgtga cgacgcccac gcccaatgct acgtcgccca cgctgggaaa aacgtcgccc 1800acgtcggctg tgacgacgcc cacgcccaat gctacgggac ctacggtggg agaaacgtcg 1860ccccaggcta atgctacgaa ccacacgctg ggaggaacgt cgcccacgcc cgtcgtcacg 1920tcgcaaccca aaaatgctac gtcggctgtc acgacgggac aacataacat cacgtcgtcg 1980tcgacgtcgt cgatgtcgct gagaccctcg tcgaaccccg aaacgctgtc gccctcgacg 2040tcggacaatt cgacgtcgca tatgcctctg ctgacgtcgg ctcaccccac gggaggagaa 2100aatatcacgc aggtgacgcc cgcttcgatc tcgacgcatc atgtgtcgac gtcgtcgccc 2160gctccccgcc ccggaacgac gtcgcaagct tcgggacctg gaaactcgtc gacgtcgacg 2220aaacccggag aagtcaatgt cacgaaagga acgccccccc aaaatgctac gtcgccccag 2280gctccctcgg gacaaaagac ggctgtcccc acggtcacgt cgacgggagg aaaggctaat 2340tcgacgacgg gaggaaagca cacgacggga catggagctc gaacgtcgac ggaacccacg 2400acggattacg gaggagattc gacgacgccc agacccagat acaatgctac gacgtatctg 2460cctccctcga cgtcgtcgaa actgcgaccc cgctggacgt tcacgtcgcc ccccgtcacg 2520acggctcaag ctacggtgcc cgtccccccc acgtcgcagc ccagattttc gaacctgtcg 2580atgctggtcc tgcagtgggc ttcgctggct gtgctgacgc tgctgctgct gctggtcatg 2640gctgactgcg ctttcagacg taacctgtcg 
acgtcgcata cgtacacgac gcccccctat 2700gatgacgctg aaacgtatgt ctaa 2724712661DNAArtificial sequenceVirus 71atggaggcag ccttgcttgt gtgtcagtac accatccaga gccttatcca actcacgcgt 60gatgatcctg gttttttcaa tgttgagatt ctggaattcc cattttaccc agcgtgcaat 120gtttgcacgg cagatgtcaa tgcaactatc aatttcgatg tcgggggcaa aaagcataaa 180cttaatcttg actttggcct gctgacaccc catacaaagg ctgtctacca acctcgaggt 240gcatttggtg gctcagaaaa tgccaccaat ctctttctac tggagctcct tggtgcagga 300gaattggctc taactatgcg gtctaagaag cttccaatta acatcaccac cggagaggag 360caacaagtaa gcctggaatc tgtagatgtc tactttcaag atgtgtttgg caccatgtgg 420tgccaccatg cagaaatgca aaacccagta tacctaatac cagaaacagt gccatacata 480aagtgggata actgtaattc taccaatata acggcagtag taagggcaca ggggctggat 540gtcacgctac ccttaagttt gccaacatca gctcaagact cgaatttcag cgtaaaaaca 600gaaatgctcg gtaatgagat agatattgag tgtattatgg aggatggcga aatttcacaa 660gttctgcccg gagacaacaa atttaacatc acctgcagtg gatacgagag ccatgttccc 720agcggcggaa ttctcacatc aacgagtccc gtggccaccc caatacctgg tacagggtat 780gcatacagcc tgcgtctgac accacgtcca gtgtcacgat ttcttggcaa taacagtata 840ctgtacgtgt tttactctgg gaatggaccg aaggcgagcg ggggagatta ctgcattcag 900tccaacattg tgttctctga tgagattcca gcttcacagg acatgccgac aaacaccaca 960gacatcacat atgtgggtga caatgctacc tattcagtgc caatggtcac ttctgaggac 1020gcaaactcgc caaatgttac agtgactgcc ttttgggcct ggccaaacaa cactgaaact 1080gactttaagt gcaaatggac tctcacctcg gggacacctt cgggttgtga aaatatttct 1140ggtgcatttg cgagcaatcg gacatttgac attactgtct cgggtcttgg cacggccccc 1200aagacactca ttatcacacg aacggctacc aatgccacca caacaaccca caaggttata 1260ttctccaagg cacccgagag caccaccacc tcccctacct tgaatacaac tggatttgct 1320gctcccaata caacgacagg tctacccagc tctactcacg tgcctaccaa cctcaccgca 1380cctgcaagca caggccccac tgtatccacc gcggatgtca ccagcccaac accagccggc 1440acaacgtcag gcgcatcacc ggtgacacca agtccatctc cacgggacaa cggcacagaa 1500agtaaggccc ccgacatgac cagccccacc tcagcagtga ctaccccaac cccaaatgcc 1560accagcccca ccccagcagt gactacccca accccaaatg ccaccagccc caccttggga 1620aaaacaagtc ccacctcagc 
agtgactacc ccaaccccaa atgccaccag ccccacccca 1680gcagtgacta ccccaacccc aaatgccacc atccccacct tgggaaaaac aagtcccacc 1740tcagcagtga ctaccccaac cccaaatgcc accagcccta ccgtgggaga aacaagtcca 1800caggcaaata ccaccaacca cacattagga ggaacaagtt ccaccccagt agttaccagc 1860ccaccaaaaa atgcaaccag tgctgttacc acaggccaac ataacataac ttcaagttca 1920acctcttcca tgtcactgag acccagttca atctcagaga cactcagccc ctccaccagt 1980gacaattcaa cgtcacatat gcctttacta acctccgctc acccaacagg tggtgaaaat 2040ataacacagg tgacaccagc ctctaccagc acacatcatg tgtccaccag ttcgccagcg 2100ccccgcccag gcaccaccag ccaagcgtca ggccctggaa acagttccac atccacaaaa 2160ccgggggagg ttaatgtcac caaaggcacg ccccccaaaa atgcaacgtc gccccaggcc 2220cccagtggcc aaaagacggc ggttcccacg gtcacctcaa caggtggaaa ggccaattct 2280accaccggtg gaaagcacac cacaggacat ggagcccgga caagtacaga gcccaccaca 2340gattacggcg gtgattcaac tacgccaaga acgagataca atgcgaccac ctatctacct 2400cccagcactt ctagcaaact gcggccccgc tggactttta cgagcccacc ggttaccaca 2460gcccaagcca ccgtgcctgt cccgccaacg tcccagccca gattctcaaa cctctccatg 2520ctagtactgc agtgggcctc tctggctgtg ctgacccttc tgctgctgct ggtcatggcg 2580gactgcgcct tcaggcgtaa cttgtcgaca tcccatacct acaccacccc accatatgat 2640gacgccgaga cctatgtata a 266172886PRTArtificial sequenceVirus 72Met Glu Ala Ala Leu Leu Val Cys Gln Tyr Thr Ile Gln Ser Leu Ile1 5 10 15Gln Leu Thr Arg Asp Asp Pro Gly Phe Phe Asn Val Glu Ile Leu Glu 20 25 30Phe Pro Phe Tyr Pro Ala Cys Asn Val Cys Thr Ala Asp Val Asn Ala 35 40 45Thr Ile Asn Phe Asp Val Gly Gly Lys Lys His Lys Leu Asn Leu Asp 50 55 60Phe Gly Leu Leu Thr Pro His Thr Lys Ala Val Tyr Gln Pro Arg Gly65 70 75 80Ala Phe Gly Gly Ser Glu Asn Ala Thr Asn Leu Phe Leu Leu Glu Leu 85 90 95Leu Gly Ala Gly Glu Leu Ala Leu Thr Met Arg Ser Lys Lys Leu Pro 100 105 110Ile Asn Ile Thr Thr Gly Glu Glu Gln Gln Val Ser Leu Glu Ser Val 115 120 125Asp Val Tyr Phe Gln Asp Val Phe Gly Thr Met Trp Cys His His Ala 130 135 140Glu Met Gln Asn Pro Val Tyr Leu Ile Pro Glu Thr Val Pro Tyr Ile145 150 155 160Lys Trp Asp Asn Cys Asn Ser Thr 
Asn Ile Thr Ala Val Val Arg Ala 165 170 175Gln Gly Leu Asp Val Thr Leu Pro Leu Ser Leu Pro Thr Ser Ala Gln 180 185 190Asp Ser Asn Phe Ser Val Lys Thr Glu Met Leu Gly Asn Glu Ile Asp 195 200 205Ile Glu Cys Ile Met Glu Asp Gly Glu Ile Ser Gln Val Leu Pro Gly 210 215 220Asp Asn Lys Phe Asn Ile Thr Cys Ser Gly Tyr Glu Ser His Val Pro225 230 235 240Ser Gly Gly Ile Leu Thr Ser Thr Ser Pro Val Ala Thr Pro Ile Pro 245 250 255Gly Thr Gly Tyr Ala Tyr Ser Leu Arg Leu Thr Pro Arg Pro Val Ser 260 265 270Arg Phe Leu Gly Asn Asn Ser Ile Leu Tyr Val Phe Tyr Ser Gly Asn 275 280 285Gly Pro Lys Ala Ser Gly Gly Asp Tyr Cys Ile Gln Ser Asn Ile Val 290 295 300Phe Ser Asp Glu Ile Pro Ala Ser Gln Asp Met Pro Thr Asn Thr Thr305 310 315 320Asp Ile Thr Tyr Val Gly Asp Asn Ala Thr Tyr Ser Val Pro Met Val 325 330 335Thr Ser Glu Asp Ala Asn Ser Pro Asn Val Thr Val Thr Ala Phe Trp 340 345 350Ala Trp Pro Asn Asn Thr Glu Thr Asp Phe Lys Cys Lys Trp Thr Leu 355 360 365Thr Ser Gly Thr Pro Ser Gly Cys Glu Asn Ile Ser Gly Ala Phe Ala 370 375 380Ser Asn Arg Thr Phe Asp Ile Thr Val Ser Gly Leu Gly Thr Ala Pro385 390 395 400Lys Thr Leu Ile Ile Thr Arg Thr Ala Thr Asn Ala Thr Thr Thr Thr 405 410 415His Lys Val Ile Phe Ser Lys Ala Pro Glu Ser Thr Thr Thr Ser Pro 420 425 430Thr Leu Asn Thr Thr Gly Phe Ala Ala Pro Asn Thr Thr Thr Gly Leu 435 440 445Pro Ser Ser Thr His Val Pro Thr Asn Leu Thr Ala Pro Ala Ser Thr 450 455 460Gly Pro Thr Val Ser Thr Ala Asp Val Thr Ser Pro Thr Pro Ala Gly465 470 475 480Thr Thr Ser Gly Ala Ser Pro Val Thr Pro Ser Pro Ser Pro Arg Asp 485 490 495Asn Gly Thr Glu Ser Lys Ala Pro Asp Met Thr Ser Pro Thr Ser Ala 500 505 510Val Thr Thr Pro Thr Pro Asn Ala Thr Ser Pro Thr Pro Ala Val Thr 515 520 525Thr Pro Thr Pro Asn Ala Thr Ser Pro Thr Leu Gly Lys Thr Ser Pro 530 535 540Thr Ser Ala Val Thr Thr Pro Thr Pro Asn Ala Thr Ser Pro Thr Pro545 550 555 560Ala Val Thr Thr Pro Thr Pro Asn Ala Thr Ile Pro Thr Leu Gly Lys 565 570 575Thr Ser Pro Thr Ser Ala Val Thr Thr Pro Thr Pro Asn Ala Thr Ser 
580 585 590Pro Thr Val Gly Glu Thr Ser Pro Gln Ala Asn Thr Thr Asn His Thr 595 600 605Leu Gly Gly Thr Ser Ser Thr Pro Val Val Thr Ser Pro Pro Lys Asn 610 615 620Ala Thr Ser Ala Val Thr Thr Gly Gln His Asn Ile Thr Ser Ser Ser625 630 635 640Thr Ser Ser Met Ser Leu Arg Pro Ser Ser Ile Ser Glu Thr Leu Ser 645 650 655Pro Ser Thr Ser Asp Asn Ser Thr Ser His Met Pro Leu Leu Thr Ser 660 665 670Ala His Pro Thr Gly Gly Glu Asn Ile Thr Gln Val Thr Pro Ala Ser 675 680 685Thr Ser Thr His His Val Ser Thr Ser Ser Pro Ala Pro Arg Pro Gly 690 695 700Thr Thr Ser Gln Ala Ser Gly Pro Gly Asn Ser Ser Thr Ser Thr Lys705 710 715 720Pro Gly Glu Val Asn Val Thr Lys Gly Thr Pro Pro Lys Asn Ala Thr 725 730 735Ser Pro Gln Ala Pro Ser Gly Gln Lys Thr Ala Val Pro Thr Val Thr 740 745 750Ser Thr Gly Gly Lys Ala Asn Ser Thr Thr Gly Gly Lys His Thr Thr 755 760 765Gly His Gly Ala Arg Thr Ser Thr Glu Pro Thr Thr Asp Tyr Gly Gly 770 775 780Asp Ser Thr Thr Pro Arg Thr Arg Tyr Asn Ala Thr Thr Tyr Leu Pro785 790 795 800Pro Ser Thr Ser Ser Lys Leu Arg Pro Arg Trp Thr Phe Thr Ser Pro 805 810 815Pro Val Thr Thr Ala Gln Ala Thr Val Pro Val Pro Pro Thr Ser Gln 820 825 830Pro Arg Phe Ser Asn Leu Ser Met Leu Val Leu Gln Trp Ala Ser Leu 835 840 845Ala Val Leu Thr Leu Leu Leu Leu Leu Val Met Ala Asp Cys Ala Phe 850 855 860Arg Arg Asn Leu Ser Thr Ser His Thr Tyr Thr Thr Pro Pro Tyr Asp865 870 875 880Asp Ala Glu Thr Tyr Val 885732661DNAArtificial sequenceVirus 73atggaagctg ctctgctggt gtgtcagtac acgatccagt cgctgatcca actgacgcgt 60gatgatcctg gattctttaa tgtcgaaatc ctggaatttc ccttctaccc cgcttgcaat 120gtctgcacgg ctgatgtcaa tgctacgatc aattttgatg tcggaggaaa aaagcataaa 180ctgaatctgg acttcggact gctgacgccc catacgaagg ctgtctacca acctcgagga 240gctttcggag gatcggaaaa tgctacgaat ctgttcctgc tggaactgct gggagctgga 300gaactggctc tgacgatgcg atcgaagaag ctgcccatca acatcacgac gggagaagaa 360caacaagtct
cgctggaatc ggtcgatgtc tacttccaag atgtgttcgg aacgatgtgg 420tgccaccatg ctgaaatgca aaaccccgtc tacctgatcc ccgaaacggt gccctacatc 480aagtgggata actgtaattc gacgaatatc acggctgtcg tcagagctca gggactggat 540gtcacgctgc ccctgtcgct gcccacgtcg gctcaagact cgaatttttc ggtcaaaacg 600gaaatgctgg gaaatgaaat cgatatcgaa tgtatcatgg aagatggaga aatctcgcaa 660gtcctgcccg gagacaacaa attcaacatc acgtgctcgg gatacgaatc gcatgtcccc 720tcgggaggaa tcctgacgtc gacgtcgccc gtggctacgc ccatccctgg aacgggatat 780gcttactcgc tgcgtctgac gccccgtccc gtgtcgcgat tcctgggaaa taactcgatc 840ctgtacgtgt tctactcggg aaatggaccc aaggcttcgg gaggagatta ctgcatccag 900tcgaacatcg tgttttcgga tgaaatcccc gcttcgcagg acatgcccac gaacacgacg 960gacatcacgt atgtgggaga caatgctacg tattcggtgc ccatggtcac gtcggaagac 1020gctaactcgc ccaatgtcac ggtgacggct ttctgggctt ggcccaacaa cacggaaacg 1080gacttcaagt gcaaatggac gctgacgtcg ggaacgcctt cgggatgtga aaatatctcg 1140ggagctttcg cttcgaatcg aacgttcgac atcacggtct cgggactggg aacggctccc 1200aagacgctga tcatcacgcg aacggctacg aatgctacga cgacgacgca caaggtcatc 1260ttttcgaagg ctcccgaatc gacgacgacg tcgcctacgc tgaatacgac gggattcgct 1320gctcccaata cgacgacggg actgccctcg tcgacgcacg tgcctacgaa cctgacggct 1380cctgcttcga cgggacccac ggtctcgacg gctgatgtca cgtcgcccac gcccgctgga 1440acgacgtcgg gagcttcgcc cgtgacgccc tcgccctcgc cccgagacaa cggaacggaa 1500tcgaaggctc ccgacatgac gtcgcccacg tcggctgtga cgacgcccac gcccaatgct 1560acgtcgccca cgcccgctgt gacgacgccc acgcccaatg ctacgtcgcc cacgctggga 1620aaaacgtcgc ccacgtcggc tgtgacgacg cccacgccca atgctacgtc gcccacgccc 1680gctgtgacga cgcccacgcc caatgctacg atccccacgc tgggaaaaac gtcgcccacg 1740tcggctgtga cgacgcccac gcccaatgct acgtcgccta cggtgggaga aacgtcgccc 1800caggctaata cgacgaacca cacgctggga ggaacgtcgt cgacgcccgt cgtcacgtcg 1860ccccccaaaa atgctacgtc ggctgtcacg acgggacaac ataacatcac gtcgtcgtcg 1920acgtcgtcga tgtcgctgag accctcgtcg atctcggaaa cgctgtcgcc ctcgacgtcg 1980gacaattcga cgtcgcatat gcctctgctg acgtcggctc accccacggg aggagaaaat 2040atcacgcagg tgacgcccgc ttcgacgtcg acgcatcatg tgtcgacgtc 
gtcgcccgct 2100ccccgccccg gaacgacgtc gcaagcttcg ggacctggaa actcgtcgac gtcgacgaaa 2160cccggagaag tcaatgtcac gaaaggaacg ccccccaaaa atgctacgtc gccccaggct 2220ccctcgggac aaaagacggc tgtccccacg gtcacgtcga cgggaggaaa ggctaattcg 2280acgacgggag gaaagcacac gacgggacat ggagctcgaa cgtcgacgga acccacgacg 2340gattacggag gagattcgac gacgcccaga acgagataca atgctacgac gtatctgcct 2400ccctcgacgt cgtcgaaact gcgaccccgc tggacgttca cgtcgccccc cgtcacgacg 2460gctcaagcta cggtgcctgt cccccccacg tcgcagccca gattttcgaa cctgtcgatg 2520ctggtcctgc agtgggcttc gctggctgtg ctgacgctgc tgctgctgct ggtcatggct 2580gactgcgctt ttagacgtaa cctgtcgacg tcgcatacgt acacgacgcc cccctatgat 2640gacgctgaaa cgtatgtcta a 2661742715DNAArtificial sequenceVirus 74atgcgcgggg ggggcttgat ttgcgcgctg gtcgtggggg cgctggtggc cgcggtggcg 60tcggcggccc cggcggcccc ggcggccccc cgcgcctcgg gcggcgtggc cgcgaccgtc 120gcggcgaacg ggggtcccgc ctcccggccg ccccccgtcc cgagccccgc gaccaccaag 180gcccggaagc ggaaaaccaa aaagccgccc aagcggcccg aggcgacccc gccccccgac 240gccaacgcga ccgtcgccgc cggccacgcc acgctgcgcg cgcacctgcg ggaaatcaag 300gtcgagaacg ccgatgccca gttttacgtg tgcccgcccc cgacgggcgc cacggtggtg 360cagtttgagc agccgcgccg ctgcccgacg cgcccggagg ggcagaacta cacggagggc 420atcgcggtgg tcttcaagga gaacatcgcc ccgtacaaat tcaaggccac catgtactac 480aaagacgtga ccgtgtcgca ggtgtggttc ggccaccgct actcccagtt tatggggata 540ttcgaggacc gcgcccccgt tcccttcgag gaggtgatcg acaagattaa caccaagggg 600gtctgccgct ccacggccaa gtacgtgcgg aacaacatgg agaccaccgc gtttcaccgg 660gacgaccacg agaccgacat ggagctcaag ccggcgaagg tcgccacgcg cacgagccgg 720gggtggcaca ccaccgacct caagtacaac ccctcgcggg tggaggcgtt ccatcggtac 780ggcacgacgg tcaactgcat cgtcgaggag gtggacgcgc ggtcggtgta cccgtacgat 840gagtttgtgc tggcgacggg cgactttgtg tacatgtccc cgttttacgg ctaccgggag 900gggtcgcaca ccgagcacac cagctacgcc gccgaccgct tcaagcaggt cgacggcttc 960tacgcgcgcg acctcaccac gaaggcccgg gccacgtcgc cgacgacccg caacttgctg 1020acgaccccca agtttaccgt ggcctgggac tgggtgccga agcgaccggc ggtctgcacc 1080atgaccaagt ggcaggaggt ggacgagatg ctccgcgccg 
agtacggcgg ctccttccgc 1140ttctcctccg acgccatctc gaccaccttc accaccaacc tgaccgagta ctcgctctcg 1200cgcgtcgacc tgggcgactg catcggccgg gatgcccgcg aggccatcga ccgcatgttt 1260gcgcgcaagt acaacgccac gcacatcaag gtgggccagc cgcagtacta cctggccacg 1320gggggcttcc tcatcgcgta ccagcccctc ctcagcaaca cgctcgccga gctgtacgtg 1380cgggagtaca tgcgggagca ggaccgcaag ccccggaatg ccacgcccgc gccactgcgg 1440gaggcgccca gcgccaacgc gtccgtggag cgcatcaaga ccacctcctc gatcgagttc 1500gcccggctgc agtttacgta taaccacata cagcgccacg tgaatgacat gctggggcgc 1560atcgccgtcg cgtggtgcga gctgcagaac cacgagctga ctctctggaa cgaggcccgc 1620aagctcaacc ccaacgccat cgcctccgcc accgtcggcc ggcgggtgag cgcgcgcatg 1680ctcggagacg tcatggccgt ctccacgtgc gtgcccgtcg ccccggacaa cgtgatcgtg 1740cagaactcga tgcgcgtcag ctcgcggccg gggacgtgct acagccgccc cctggtcagc 1800tttcggtacg aagaccaggg cccgctgatc gaggggcagc tgggcgagaa caacgagctg 1860cgcctcaccc gcgacgcgct cgagccgtgc accgtgggcc accggcgcta cttcatcttc 1920ggcgggggct acgtgtactt cgaggagtac gcgtactctc accagctgag tcgcgccgac 1980gtcaccaccg tcagcacctt catcgacctg aacatcacca tgctggagga ccacgagttt 2040gtgcccctgg aggtctacac gcgccacgag atcaaggaca gcggcctgct ggactacacg 2100gaggtccagc gccgcaacca gctgcacgac ctgcgctttg ccgacatcga cacggtcatc 2160cgcgccgacg ccaacgccgc catgttcgcg gggctgtgcg cgttcttcga ggggatgggg 2220gacttggggc gcgcggtcgg caaggtagtc atgggagtag tggggggcgt ggtgtcggcc 2280gtctcgggcg tgtcctcctt tatgtccaac cccttcgggg cgcttgccgt ggggctgctg 2340gtcctggccg gcctggtcgc ggccttcttc gccttccgct acgtcctgca actgcaacgc 2400aatcccatga aggccctgta tccgctcacc accaaggaac tcaagacttc cgaccccggg 2460ggcgtgggcg gggaggggga ggaaggcgcg gaggggggcg ggtttgacga ggccaagttg 2520gccgaggccc gagaaatgat ccgatatatg gctttggtgt cggccatgga gcgcacggaa 2580cacaaggcca gaaagaaggg cacgagcgcc ctgctcagct ccaaggtcac caacatggtt 2640ctgcgcaagc gcaacaaagc caggtactct ccgctccaca acgaggacga ggccggagac 2700gaagacgagc tctaa 271575904PRTArtificial sequenceVirus 75Met Arg Gly Gly Gly Leu Ile Cys Ala Leu Val Val Gly Ala Leu Val1 5 10 15Ala Ala Val Ala Ser Ala 
Ala Pro Ala Ala Pro Ala Ala Pro Arg Ala 20 25 30Ser Gly Gly Val Ala Ala Thr Val Ala Ala Asn Gly Gly Pro Ala Ser 35 40 45Arg Pro Pro Pro Val Pro Ser Pro Ala Thr Thr Lys Ala Arg Lys Arg 50 55 60Lys Thr Lys Lys Pro Pro Lys Arg Pro Glu Ala Thr Pro Pro Pro Asp65 70 75 80Ala Asn Ala Thr Val Ala Ala Gly His Ala Thr Leu Arg Ala His Leu 85 90 95Arg Glu Ile Lys Val Glu Asn Ala Asp Ala Gln Phe Tyr Val Cys Pro 100 105 110Pro Pro Thr Gly Ala Thr Val Val Gln Phe Glu Gln Pro Arg Arg Cys 115 120 125Pro Thr Arg Pro Glu Gly Gln Asn Tyr Thr Glu Gly Ile Ala Val Val 130 135 140Phe Lys Glu Asn Ile Ala Pro Tyr Lys Phe Lys Ala Thr Met Tyr Tyr145 150 155 160Lys Asp Val Thr Val Ser Gln Val Trp Phe Gly His Arg Tyr Ser Gln 165 170 175Phe Met Gly Ile Phe Glu Asp Arg Ala Pro Val Pro Phe Glu Glu Val 180 185 190Ile Asp Lys Ile Asn Thr Lys Gly Val Cys Arg Ser Thr Ala Lys Tyr 195 200 205Val Arg Asn Asn Met Glu Thr Thr Ala Phe His Arg Asp Asp His Glu 210 215 220Thr Asp Met Glu Leu Lys Pro Ala Lys Val Ala Thr Arg Thr Ser Arg225 230 235 240Gly Trp His Thr Thr Asp Leu Lys Tyr Asn Pro Ser Arg Val Glu Ala 245 250 255Phe His Arg Tyr Gly Thr Thr Val Asn Cys Ile Val Glu Glu Val Asp 260 265 270Ala Arg Ser Val Tyr Pro Tyr Asp Glu Phe Val Leu Ala Thr Gly Asp 275 280 285Phe Val Tyr Met Ser Pro Phe Tyr Gly Tyr Arg Glu Gly Ser His Thr 290 295 300Glu His Thr Ser Tyr Ala Ala Asp Arg Phe Lys Gln Val Asp Gly Phe305 310 315 320Tyr Ala Arg Asp Leu Thr Thr Lys Ala Arg Ala Thr Ser Pro Thr Thr 325 330 335Arg Asn Leu Leu Thr Thr Pro Lys Phe Thr Val Ala Trp Asp Trp Val 340 345 350Pro Lys Arg Pro Ala Val Cys Thr Met Thr Lys Trp Gln Glu Val Asp 355 360 365Glu Met Leu Arg Ala Glu Tyr Gly Gly Ser Phe Arg Phe Ser Ser Asp 370 375 380Ala Ile Ser Thr Thr Phe Thr Thr Asn Leu Thr Glu Tyr Ser Leu Ser385 390 395 400Arg Val Asp Leu Gly Asp Cys Ile Gly Arg Asp Ala Arg Glu Ala Ile 405 410 415Asp Arg Met Phe Ala Arg Lys Tyr Asn Ala Thr His Ile Lys Val Gly 420 425 430Gln Pro Gln Tyr Tyr Leu Ala Thr Gly Gly Phe Leu Ile Ala Tyr Gln 435 440 
445Pro Leu Leu Ser Asn Thr Leu Ala Glu Leu Tyr Val Arg Glu Tyr Met 450 455 460Arg Glu Gln Asp Arg Lys Pro Arg Asn Ala Thr Pro Ala Pro Leu Arg465 470 475 480Glu Ala Pro Ser Ala Asn Ala Ser Val Glu Arg Ile Lys Thr Thr Ser 485 490 495Ser Ile Glu Phe Ala Arg Leu Gln Phe Thr Tyr Asn His Ile Gln Arg 500 505 510His Val Asn Asp Met Leu Gly Arg Ile Ala Val Ala Trp Cys Glu Leu 515 520 525Gln Asn His Glu Leu Thr Leu Trp Asn Glu Ala Arg Lys Leu Asn Pro 530 535 540Asn Ala Ile Ala Ser Ala Thr Val Gly Arg Arg Val Ser Ala Arg Met545 550 555 560Leu Gly Asp Val Met Ala Val Ser Thr Cys Val Pro Val Ala Pro Asp 565 570 575Asn Val Ile Val Gln Asn Ser Met Arg Val Ser Ser Arg Pro Gly Thr 580 585 590Cys Tyr Ser Arg Pro Leu Val Ser Phe Arg Tyr Glu Asp Gln Gly Pro 595 600 605Leu Ile Glu Gly Gln Leu Gly Glu Asn Asn Glu Leu Arg Leu Thr Arg 610 615 620Asp Ala Leu Glu Pro Cys Thr Val Gly His Arg Arg Tyr Phe Ile Phe625 630 635 640Gly Gly Gly Tyr Val Tyr Phe Glu Glu Tyr Ala Tyr Ser His Gln Leu 645 650 655Ser Arg Ala Asp Val Thr Thr Val Ser Thr Phe Ile Asp Leu Asn Ile 660 665 670Thr Met Leu Glu Asp His Glu Phe Val Pro Leu Glu Val Tyr Thr Arg 675 680 685His Glu Ile Lys Asp Ser Gly Leu Leu Asp Tyr Thr Glu Val Gln Arg 690 695 700Arg Asn Gln Leu His Asp Leu Arg Phe Ala Asp Ile Asp Thr Val Ile705 710 715 720Arg Ala Asp Ala Asn Ala Ala Met Phe Ala Gly Leu Cys Ala Phe Phe 725 730 735Glu Gly Met Gly Asp Leu Gly Arg Ala Val Gly Lys Val Val Met Gly 740 745 750Val Val Gly Gly Val Val Ser Ala Val Ser Gly Val Ser Ser Phe Met 755 760 765Ser Asn Pro Phe Gly Ala Leu Ala Val Gly Leu Leu Val Leu Ala Gly 770 775 780Leu Val Ala Ala Phe Phe Ala Phe Arg Tyr Val Leu Gln Leu Gln Arg785 790 795 800Asn Pro Met Lys Ala Leu Tyr Pro Leu Thr Thr Lys Glu Leu Lys Thr 805 810 815Ser Asp Pro Gly Gly Val Gly Gly Glu Gly Glu Glu Gly Ala Glu Gly 820 825 830Gly Gly Phe Asp Glu Ala Lys Leu Ala Glu Ala Arg Glu Met Ile Arg 835 840 845Tyr Met Ala Leu Val Ser Ala Met Glu Arg Thr Glu His Lys Ala Arg 850 855 860Lys Lys Gly Thr Ser Ala Leu Leu 
Ser Ser Lys Val Thr Asn Met Val865 870 875 880Leu Arg Lys Arg Asn Lys Ala Arg Tyr Ser Pro Leu His Asn Glu Asp 885 890 895Glu Ala Gly Asp Glu Asp Glu Leu 900762715DNAArtificial sequenceVirus 76atgcgcggag gaggactgat ctgcgctctg gtcgtgggag ctctggtggc tgctgtggct 60tcggctgctc ccgctgctcc cgctgctccc cgcgcttcgg gaggagtggc tgctacggtc 120gctgctaacg gaggacccgc ttcgcgaccc ccccccgtcc cctcgcccgc tacgacgaag 180gctcgaaagc gaaaaacgaa aaagcccccc aagcgacccg aagctacgcc cccccccgac 240gctaacgcta cggtcgctgc tggacacgct acgctgcgcg ctcacctgcg agaaatcaag 300gtcgaaaacg ctgatgctca gttctacgtg tgcccccccc ccacgggagc tacggtggtg 360cagttcgaac agccccgccg ctgccccacg cgccccgaag gacagaacta cacggaagga 420atcgctgtgg tctttaagga aaacatcgct ccctacaaat ttaaggctac gatgtactac 480aaagacgtga cggtgtcgca ggtgtggttt ggacaccgct actcgcagtt catgggaatc 540tttgaagacc gcgctcccgt cccctttgaa gaagtgatcg acaagatcaa cacgaaggga 600gtctgccgct cgacggctaa gtacgtgcga aacaacatgg aaacgacggc tttccaccga 660gacgaccacg aaacggacat ggaactgaag cccgctaagg tcgctacgcg cacgtcgcga 720ggatggcaca cgacggacct gaagtacaac ccctcgcgag tggaagcttt tcatcgatac 780ggaacgacgg tcaactgcat cgtcgaagaa gtggacgctc gatcggtgta cccctacgat 840gaattcgtgc tggctacggg agacttcgtg tacatgtcgc ccttctacgg ataccgagaa 900ggatcgcaca cggaacacac gtcgtacgct gctgaccgct ttaagcaggt cgacggattt 960tacgctcgcg acctgacgac gaaggctcga gctacgtcgc ccacgacgcg caacctgctg 1020acgacgccca agttcacggt ggcttgggac tgggtgccca agcgacccgc tgtctgcacg 1080atgacgaagt ggcaggaagt ggacgaaatg ctgcgcgctg aatacggagg atcgtttcgc 1140ttttcgtcgg acgctatctc gacgacgttt acgacgaacc tgacggaata ctcgctgtcg 1200cgcgtcgacc tgggagactg catcggacga gatgctcgcg aagctatcga ccgcatgttc 1260gctcgcaagt acaacgctac gcacatcaag gtgggacagc cccagtacta cctggctacg 1320ggaggatttc tgatcgctta ccagcccctg ctgtcgaaca cgctggctga actgtacgtg 1380cgagaataca tgcgagaaca ggaccgcaag ccccgaaatg ctacgcccgc tcccctgcga 1440gaagctccct cggctaacgc ttcggtggaa cgcatcaaga cgacgtcgtc gatcgaattt 1500gctcgactgc agttcacgta taaccacatc cagcgccacg tgaatgacat gctgggacgc 
1560atcgctgtcg cttggtgcga actgcagaac cacgaactga cgctgtggaa cgaagctcgc 1620aagctgaacc ccaacgctat cgcttcggct acggtcggac gacgagtgtc ggctcgcatg 1680ctgggagacg tcatggctgt ctcgacgtgc gtgcccgtcg ctcccgacaa cgtgatcgtg 1740cagaactcga tgcgcgtctc gtcgcgaccc ggaacgtgct actcgcgccc cctggtctcg 1800ttccgatacg aagaccaggg acccctgatc gaaggacagc tgggagaaaa caacgaactg 1860cgcctgacgc gcgacgctct ggaaccctgc acggtgggac accgacgcta ctttatcttt 1920ggaggaggat acgtgtactt tgaagaatac gcttactcgc accagctgtc gcgcgctgac 1980gtcacgacgg tctcgacgtt tatcgacctg aacatcacga tgctggaaga ccacgaattc 2040gtgcccctgg aagtctacac gcgccacgaa atcaaggact cgggactgct ggactacacg 2100gaagtccagc gccgcaacca gctgcacgac ctgcgcttcg ctgacatcga cacggtcatc 2160cgcgctgacg ctaacgctgc tatgtttgct ggactgtgcg ctttttttga aggaatggga 2220gacctgggac gcgctgtcgg aaaggtcgtc atgggagtcg tgggaggagt ggtgtcggct 2280gtctcgggag tgtcgtcgtt catgtcgaac ccctttggag ctctggctgt gggactgctg 2340gtcctggctg gactggtcgc tgcttttttt gcttttcgct acgtcctgca actgcaacgc 2400aatcccatga aggctctgta tcccctgacg acgaaggaac tgaagacgtc ggaccccgga 2460ggagtgggag gagaaggaga agaaggagct gaaggaggag gattcgacga agctaagctg 2520gctgaagctc gagaaatgat ccgatatatg gctctggtgt cggctatgga acgcacggaa 2580cacaaggcta gaaagaaggg aacgtcggct ctgctgtcgt cgaaggtcac gaacatggtc 2640ctgcgcaagc gcaacaaagc tagatactcg cccctgcaca acgaagacga agctggagac 2700gaagacgaac tgtaa 2715771182DNAArtificial sequenceVirus 77atggggcgtt tgacctccgg cgtcgggacg gcggccctgc tagttgtcgc ggtgggactc 60cgcgtcgtct gcgccaaata cgccttagca gacccctcgc ttaagatggc cgatcccaat 120cgatttcgcg ggaagaacct tccggttttg gaccagctga ccgacccccc cggggtgaag 180cgtgtttacc acattcagcc gagcctggag gacccgttcc agccccccag catcccgatc 240actgtgtact acgcagtgct ggaacgtgcc tgccgcagcg tgctcctaca tgccccatcg 300gaggcccccc agatcgtgcg cggggcttcg gacgaggccc gaaagcacac gtacaacctg 360accatcgcct ggtatcgcat gggagacaat tgcgctatcc ccatcacggt tatggaatac 420accgagtgcc cctacaacaa gtcgttgggg gtctgcccca tccgaacgca gccccgctgg 480agctactatg acagctttag cgccgtcagc gaggataacc tgggattcct 
gatgcacgcc 540cccgccttcg agaccgcggg tacgtacctg cggctagtga agataaacga ctggacggag 600atcacacaat ttatcctgga gcaccgggcc cgcgcctcct gcaagtacgc tctccccctg 660cgcatccccc cggcagcgtg cctcacctcg aaggcctacc aacagggcgt gacggtcgac 720agcatcggga tgctaccccg ctttatcccc gaaaaccagc gcaccgtcgc cctatacagc 780ttaaaaatcg ccgggtggca cggccccaag cccccgtaca ccagcaccct gctgccgccg 840gagctgtccg acaccaccaa cgccacgcaa cccgaactcg ttccggaaga ccccgaggac 900tcggccctct tagaggatcc cgccgggacg gtgtcttcgc agatcccccc aaactggcac 960atcccgtcga tccaggacgt cgcgccgcac cacgcccccg ccgcccccag caacccgggc 1020ctgatcatcg gcgcgctggc cggcagtacc ctggcggtgc tggtcatcgg cggtattgcg 1080ttttgggtac gccgccgcgc tcagatggcc cccaagcgcc tacgtctccc ccacatccgg 1140gatgacgacg cgcccccctc gcaccagcca ttgttttact ag 118278393PRTArtificial sequenceVirus 78Met Gly Arg Leu Thr Ser Gly Val Gly Thr Ala Ala Leu Leu Val Val1 5 10 15Ala Val Gly Leu Arg Val Val Cys Ala Lys Tyr Ala Leu Ala Asp Pro 20 25 30Ser Leu Lys Met Ala Asp Pro Asn Arg Phe Arg Gly Lys Asn Leu Pro 35 40 45Val Leu Asp Gln Leu Thr Asp Pro Pro Gly Val Lys Arg Val Tyr His 50
55 60Ile Gln Pro Ser Leu Glu Asp Pro Phe Gln Pro Pro Ser Ile Pro Ile65 70 75 80Thr Val Tyr Tyr Ala Val Leu Glu Arg Ala Cys Arg Ser Val Leu Leu 85 90 95His Ala Pro Ser Glu Ala Pro Gln Ile Val Arg Gly Ala Ser Asp Glu 100 105 110Ala Arg Lys His Thr Tyr Asn Leu Thr Ile Ala Trp Tyr Arg Met Gly 115 120 125Asp Asn Cys Ala Ile Pro Ile Thr Val Met Glu Tyr Thr Glu Cys Pro 130 135 140Tyr Asn Lys Ser Leu Gly Val Cys Pro Ile Arg Thr Gln Pro Arg Trp145 150 155 160Ser Tyr Tyr Asp Ser Phe Ser Ala Val Ser Glu Asp Asn Leu Gly Phe 165 170 175Leu Met His Ala Pro Ala Phe Glu Thr Ala Gly Thr Tyr Leu Arg Leu 180 185 190Val Lys Ile Asn Asp Trp Thr Glu Ile Thr Gln Phe Ile Leu Glu His 195 200 205Arg Ala Arg Ala Ser Cys Lys Tyr Ala Leu Pro Leu Arg Ile Pro Pro 210 215 220Ala Ala Cys Leu Thr Ser Lys Ala Tyr Gln Gln Gly Val Thr Val Asp225 230 235 240Ser Ile Gly Met Leu Pro Arg Phe Ile Pro Glu Asn Gln Arg Thr Val 245 250 255Ala Leu Tyr Ser Leu Lys Ile Ala Gly Trp His Gly Pro Lys Pro Pro 260 265 270Tyr Thr Ser Thr Leu Leu Pro Pro Glu Leu Ser Asp Thr Thr Asn Ala 275 280 285Thr Gln Pro Glu Leu Val Pro Glu Asp Pro Glu Asp Ser Ala Leu Leu 290 295 300Glu Asp Pro Ala Gly Thr Val Ser Ser Gln Ile Pro Pro Asn Trp His305 310 315 320Ile Pro Ser Ile Gln Asp Val Ala Pro His His Ala Pro Ala Ala Pro 325 330 335Ser Asn Pro Gly Leu Ile Ile Gly Ala Leu Ala Gly Ser Thr Leu Ala 340 345 350Val Leu Val Ile Gly Gly Ile Ala Phe Trp Val Arg Arg Arg Ala Gln 355 360 365Met Ala Pro Lys Arg Leu Arg Leu Pro His Ile Arg Asp Asp Asp Ala 370 375 380Pro Pro Ser His Gln Pro Leu Phe Tyr385 390791182DNAArtificial sequenceVirus 79atgggacgtc tgacgtcggg agtcggaacg gctgctctgc tggtcgtcgc tgtgggactg 60cgcgtcgtct gcgctaaata cgctctggct gacccctcgc tgaagatggc tgatcccaat 120cgattccgcg gaaagaacct gcccgtcctg gaccagctga cggacccccc cggagtgaag 180cgtgtctacc acatccagcc ctcgctggaa gacccctttc agcccccctc gatccccatc 240acggtgtact acgctgtgct ggaacgtgct tgccgctcgg tgctgctgca tgctccctcg 300gaagctcccc agatcgtgcg cggagcttcg gacgaagctc gaaagcacac gtacaacctg 
360acgatcgctt ggtatcgcat gggagacaat tgcgctatcc ccatcacggt catggaatac 420acggaatgcc cctacaacaa gtcgctggga gtctgcccca tccgaacgca gccccgctgg 480tcgtactatg actcgttctc ggctgtctcg gaagataacc tgggatttct gatgcacgct 540cccgcttttg aaacggctgg aacgtacctg cgactggtga agatcaacga ctggacggaa 600atcacgcaat tcatcctgga acaccgagct cgcgcttcgt gcaagtacgc tctgcccctg 660cgcatccccc ccgctgcttg cctgacgtcg aaggcttacc aacagggagt gacggtcgac 720tcgatcggaa tgctgccccg cttcatcccc gaaaaccagc gcacggtcgc tctgtactcg 780ctgaaaatcg ctggatggca cggacccaag cccccctaca cgtcgacgct gctgcccccc 840gaactgtcgg acacgacgaa cgctacgcaa cccgaactgg tccccgaaga ccccgaagac 900tcggctctgc tggaagatcc cgctggaacg gtgtcgtcgc agatcccccc caactggcac 960atcccctcga tccaggacgt cgctccccac cacgctcccg ctgctccctc gaaccccgga 1020ctgatcatcg gagctctggc tggatcgacg ctggctgtgc tggtcatcgg aggaatcgct 1080ttctgggtcc gccgccgcgc tcagatggct cccaagcgcc tgcgtctgcc ccacatccga 1140gatgacgacg ctcccccctc gcaccagccc ctgttctact ag 118280387DNAHuman papillomavirus type 16 80ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc 38781387DNAArtificial SequenceHPV-16 E7 O1 81ggtaccgccg ccaccatgga aacggacacg ctgctgctgt gggtcctgct gctgtgggtc 60cccggatcga cgggagacgg atcgatgcat ggagacacgc ccacgctgca tgaatacatg 120ctggacctgc aacccgaaac gacggacctg tactgctacg aacaactgaa cgactcgtcg 180gaagaagaag acgaaatcga cggacccgct ggacaagctg aacccgacag agctcattac 240aacatcgtca cgttctgctg caagtgcgac tcgacgctgc gactgtgcgt ccaatcgacg 300cacgtcgaca tccgtacgct ggaagacctg ctgatgggaa cgctgggaat cgtgtgcccc 360atctgctcgc agaagcccta agaattc 38782387DNAArtificial SequenceHPV16 E7 O2 82ggtaccgccg ccaccatgga 
aacggacacg ctgctgctgt gggtcctgct gctgtgggtc 60cccggatcga cgggagacgg atcgatgcat ggagatacgc ctacgctgca tgaatatatg 120ctggatctgc aacccgaaac gacggatctg tactgttatg aacaactgaa tgactcgtcg 180gaagaagaag atgaaatcga tggacccgct ggacaagctg aacccgacag agctcattac 240aatatcgtca cgttttgttg caagtgtgac tcgacgctgc gactgtgcgt ccaatcgacg 300cacgtcgaca tccgtacgct ggaagacctg ctgatgggaa cgctgggaat cgtgtgcccc 360atctgctcgc agaagcccta agaattc 38783417DNAArtificial SequenceHPV-16 E7 O3 83ggtaccgccg ccaccatgga gacggacacg ctcctgctct gggtactgct gctctgggtt 60cctggatcga cgggattgtg gacggatcga tgcatggaga tacgcctacg ctccatgaat 120atatgctcga tctccaacct ggttgagacg acggatctct actgttatga gcaactcaat 180gactcgtcgg aggaggagga tgaattcata gatggacctg ctggacaagc agaacctgac 240agagcccatt acaatattgt aacgtttgag aattgttgca agtgtgactc gacgctccgg 300ctctgcgtac aatcgacgca cgtagacatt cgtccctcta cgctcgaaga cctgctcatg 360ggaacgctcg gaattgtgtg ccccatctgc tcgcagaagt gtgcccccta agaattc 41784387DNAArtificial SequenceHPV-16 E7 W 84ggtaccgccg ccaccatgga gactgatact ttattattat gggtattatt attatgggtt 60ccaggtagta ctggtgatgg cagtatgcat ggcgatactc caactttaca tgagtatatg 120ttagatttac aaccagagac tactgattta tattgttatg agcaattaaa tgatagcagt 180gaggaggagg atgagataga tggtccagcg ggccaagcag agccggatcg ggcgcattat 240aatatagtaa ctttctgttg taagtgtgat agtactttac ggttatgtgt acaaagcact 300cacgtagata tacggacttt agaggattta ttaatgggca ctttaggcat agtatgtcca 360atatgtagtc agaagccata agaattc 387851182DNAHerpes simplex virus type 2 85atggggcgtt tgacctccgg cgtcgggacg gcggccctgc tagttgtcgc ggtgggactc 60cgcgtcgtct gcgccaaata cgccttagca gacccctcgc ttaagatggc cgatcccaat 120cgatttcgcg ggaagaacct tccggttttg gaccagctga ccgacccccc cggggtgaag 180cgtgtttacc acattcagcc gagcctggag gacccgttcc agccccccag catcccgatc 240actgtgtact acgcagtgct ggaacgtgcc tgccgcagcg tgctcctaca tgccccatcg 300gaggcccccc agatcgtgcg cggggcttcg gacgaggccc gaaagcacac gtacaacctg 360accatcgcct ggtatcgcat gggagacaat tgcgctatcc ccatcacggt tatggaatac 420accgagtgcc cctacaacaa gtcgttgggg gtctgcccca 
tccgaacgca gccccgctgg 480
agctactatg acagctttag cgccgtcagc gaggataacc tgggattcct gatgcacgcc 540
cccgccttcg agaccgcggg tacgtacctg cggctagtga agataaacga ctggacggag 600
atcacacaat ttatcctgga gcaccgggcc cgcgcctcct gcaagtacgc tctccccctg 660
cgcatccccc cggcagcgtg cctcacctcg aaggcctacc aacagggcgt gacggtcgac 720
agcatcggga tgctaccccg ctttatcccc gaaaaccagc gcaccgtcgc cctatacagc 780
ttaaaaatcg ccgggtggca cggccccaag cccccgtaca ccagcaccct gctgccgccg 840
gagctgtccg acaccaccaa cgccacgcaa cccgaactcg ttccggaaga ccccgaggac 900
tcggccctct tagaggatcc cgccgggacg gtgtcttcgc agatcccccc aaactggcac 960
atcccgtcga tccaggacgt cgcgccgcac cacgcccccg ccgcccccag caacccgggc 1020
ctgatcatcg gcgcgctggc cggcagtacc ctggcggtgc tggtcatcgg cggtattgcg 1080
ttttgggtac gccgccgcgc tcagatggcc cccaagcgcc tacgtctccc ccacatccgg 1140
gatgacgacg cgcccccctc gcaccagcca ttgttttact ag 1182

SEQ ID NO: 86; Length: 1182; Type: DNA; Organism: Artificial Sequence; Other information: HSV-2 gD2 O1
Sequence 86:
atgggacgtc tgacgtcggg agtcggaacg gctgctctgc tggtcgtcgc tgtgggactc 60
cgcgtcgtct gcgctaaata cgctctggct gacccctcgc tgaagatggc tgaccccaac 120
cgatttcgcg gaaagaacct gcccgtcctg gaccagctga cggacccccc cggagtgaag 180
cgtgtctacc acatccagcc ctcgctggaa gacccctttc agcccccctc gatccccatc 240
acggtgtact acgctgtgct ggaacgtgct tgccgctcgg tgctcctcca tgctccctcg 300
gaagctcccc agatcgtgcg cggagcttcg gacgaagctc gaaagcacac gtacaacctg 360
acgatcgctt ggtaccgcat gggagacaac tgcgctatcc ccatcacggt catggaatac 420
acggaatgcc cctacaacaa gtcgctcgga gtctgcccca tccgaacgca gccccgctgg 480
tcgtactacg actcgttttc ggctgtctcg gaagacaacc tgggatttct gatgcacgct 540
cccgcttttg aaacggctgg aacgtacctg cgactcgtga agatcaacga ctggacggaa 600
atcacgcaat ttatcctgga acaccgagct cgcgcttcgt gcaagtacgc tctccccctg 660
cgcatccccc ccgctgcttg cctcacgtcg aaggcttacc aacagggagt gacggtcgac 720
tcgatcggaa tgctcccccg ctttatcccc gaaaaccagc gcacggtcgc tctctactcg 780
ctcaaaatcg ctggatggca cggacccaag cccccctaca cgtcgacgct gctgcccccc 840
gaactgtcgg acacgacgaa cgctacgcaa cccgaactcg tccccgaaga ccccgaagac 900
tcggctctcc tcgaagaccc cgctggaacg gtgtcgtcgc agatcccccc caactggcac 960
atcccctcga tccaggacgt cgctccccac cacgctcccg ctgctccctc gaaccccgga 1020
ctgatcatcg gagctctggc tggatcgacg ctggctgtgc tggtcatcgg aggaatcgct 1080
ttttgggtcc gccgccgcgc tcagatggct cccaagcgcc tccgtctccc ccacatccga 1140
gacgacgacg ctcccccctc gcaccagccc ctcttttact ag 1182

SEQ ID NO: 87; Length: 1182; Type: DNA; Organism: Artificial Sequence; Other information: HSV-2 gD2 O2
Sequence 87:
atgggacgtc tgacgtcggg agtcggaacg gctgctctgc tggtcgtcgc tgtgggactg 60
cgcgtcgtct gcgctaaata cgctctggct gacccctcgc tgaagatggc tgatcccaat 120
cgatttcgcg gaaagaacct gcccgtcctg gaccagctga cggacccccc cggagtgaag 180
cgtgtctacc acatccagcc ctcgctggaa gacccctttc agcccccctc gatccccatc 240
acggtgtact acgctgtgct ggaacgtgct tgccgctcgg tgctgctgca tgctccctcg 300
gaagctcccc agatcgtgcg cggagcttcg gacgaagctc gaaagcacac gtacaacctg 360
acgatcgctt ggtatcgcat gggagacaat tgcgctatcc ccatcacggt catggaatac 420
acggaatgcc cctacaacaa gtcgctggga gtctgcccca tccgaacgca gccccgctgg 480
tcgtactatg actcgttttc ggctgtctcg gaagataacc tgggatttct gatgcacgct 540
cccgcttttg aaacggctgg aacgtacctg cgactggtga agatcaacga ctggacggaa 600
atcacgcaat ttatcctgga acaccgagct cgcgcttcgt gcaagtacgc tctgcccctg 660
cgcatccccc ccgctgcttg cctgacgtcg aaggcttacc aacagggagt gacggtcgac 720
tcgatcggaa tgctgccccg ctttatcccc gaaaaccagc gcacggtcgc tctgtactcg 780
ctgaaaatcg ctggatggca cggacccaag cccccctaca cgtcgacgct gctgcccccc 840
gaactgtcgg acacgacgaa cgctacgcaa cccgaactgg tccccgaaga ccccgaagac 900
tcggctctgc tggaagatcc cgctggaacg gtgtcgtcgc agatcccccc caactggcac 960
atcccctcga tccaggacgt cgctccccac cacgctcccg ctgctccctc gaaccccgga 1020
ctgatcatcg gagctctggc tggatcgacg ctggctgtgc tggtcatcgg aggaatcgct 1080
ttttgggtcc gccgccgcgc tcagatggct cccaagcgcc tgcgtctgcc ccacatccga 1140
gatgacgacg ctcccccctc gcaccagccc ctgttttact ag 1182

SEQ ID NO: 88; Length: 1182; Type: DNA; Organism: Artificial Sequence; Other information: HSV-2 gD2 O3
Sequence 88:
atgggacgtc tcacgtcggg agtcggaacg gcggccctgc tcgttgtcgc ggtgggactc 60
cgcgtcgtct gcgccaaata cgccctcgca gacccctcgc tcaagatggc cgatcccaat 120
cgatttcgcg gaaagaacct ccctgttctc gaccagctga cggacccccc cggagtgaag 180
cgtgtttacc acattcagcc ttcgctggag gaccctttcc agcccccctc gatccctatc 240
acggtgtact acgcagtgct ggaacgtgcc tgccgctcgg tgctcctcca tgccccttcg 300
gaggcccccc agatcgtgcg cggagcttcg gacgaggccc gaaagcacac gtacaacctg 360
acgatcgcct ggtatcgcat gggagacaat tgcgctatcc ccatcacggt tatggaatac 420
acggagtgcc cctacaacaa gtcgctcgga gtctgcccca tccgaacgca gccccgctgg 480
tcgtactatg actcgttttc ggccgtctcg gaggataacc tgggattcct gatgcacgcc 540
cccgccttcg agacggcggg aacgtacctg cggctcgtga agataaacga ctggacggag 600
atcacgcaat ttatcctgga gcaccgggcc cgcgcctcgt gcaagtacgc tctccccctg 660
cgcatccccc ctgcagcgtg cctcacgtcg aaggcctacc aacagggagt gacggtcgac 720
tcgatcggaa tgctcccccg ctttatcccc gaaaaccagc gcacggtcgc cctctactcg 780
ctcaaaatcg ccggatggca cggacccaag cccccttaca cgtcgacgct gctgcctcct 840
gagctgtcgg acacgacgaa cgccacgcaa cccgaactcg ttcctgaaga ccccgaggac 900
tcggccctcc tagaggatcc cgccggaacg gtgtcgtcgc agatcccccc taactggcac 960
atcccttcga tccaggacgt cgcgcctcac cacgcccccg ccgccccctc gaaccctgga 1020
ctgatcatcg gagcgctggc cggatcgacg ctggcggtgc tggtcatcgg aggaattgcg 1080
ttttgggtac gccgccgcgc tcagatggcc cccaagcgcc tccgtctccc ccacatccgg 1140
gatgacgacg cgcccccctc gcaccagcct ctcttttact ag 1182

SEQ ID NO: 89; Length: 1182; Type: DNA; Organism: Artificial Sequence; Other information: HSV-2 gD2 W
Sequence 89:
atggggcggt tgactagtgg cgtagggact gcggcgttat tagtagtagc ggtaggctta 60
cgggtagtat gtgcaaaata tgcgttagca gatccaagtt taaagatggc ggatccaaat 120
cggttccggg ggaagaattt accggtattg gatcagttaa ctgatccacc aggggtaaag 180
cgggtatatc acatacagcc gagcttagag gatccgttcc agccaccaag cataccgata 240
actgtatatt atgcagtatt agagcgggcg tgtcggagcg tattattaca tgcaccaagt 300
gaggcgccac agatagtacg gggggcaagt gatgaggcgc ggaagcacac ttataattta 360
actatagcat ggtatcggat gggcgataat tgtgcgatac caataactgt aatggagtat 420
actgagtgtc catataataa gagtttgggg gtatgtccaa tacggactca gccacggtgg 480
agctattatg atagcttcag cgcagtaagc gaggataatt taggcttctt aatgcacgcg 540
ccagcattcg agactgcggg tacttattta cggttagtaa agataaatga ttggactgag 600
ataactcaat tcatattaga gcaccgggca cgggcgagtt gtaagtatgc attaccatta 660
cggataccac cggcagcgtg tttaactagt aaggcatatc aacagggcgt aactgtagat 720
agcataggga tgttaccacg gttcatacca gagaatcagc ggactgtagc gttatatagc 780
ttaaaaatag cagggtggca cggcccaaag ccaccgtata ctagcacttt attaccgccg 840
gagttaagtg atactactaa tgcgactcaa ccagagttag taccggagga tccagaggat 900
agtgcattat tagaggatcc agcggggact gtaagtagtc agataccacc aaattggcac 960
ataccgagta tacaggatgt agcgccgcac cacgcaccag cggcaccaag caatccgggc 1020
ttaataatag gcgcgttagc aggcagtact ttagcggtat tagtaatagg cggtatagcg 1080
ttctgggtac ggcggcgggc gcagatggcg ccaaagcggt tacggttacc acacatacgg 1140
gatgatgatg cgccaccaag tcaccagcca ttgttctatt ag 1182

SEQ ID NO: 90; Length: 41; Type: DNA; Organism: Artificial Sequence; Other information: Common forward primer
Sequence 90:
ttgaataggt accgccgcca ccatggagac cgacaccctc c 41

SEQ ID NO: 91; Length: 24; Type: DNA; Organism: Artificial Sequence; Other information: ODN-7909
Sequence 91:
tcgtcgtttt gtcgttttgt cgtt 24

* * * * *

References