U.S. patent application number 12/738291 was filed with the patent office on 2012-02-16 for construct system and uses therefor.
This patent application is currently assigned to THE UNIVERSITY OF QUEENSLAND. Invention is credited to Ian Hector Frazer.
Application Number | 20120040367 12/738291 |
Document ID | / |
Family ID | 40566909 |
Filed Date | 2012-02-16 |
United States Patent Application | 20120040367 |
Kind Code | A1 |
Frazer; Ian Hector | February 16, 2012 |
CONSTRUCT SYSTEM AND USES THEREFOR
Abstract
The present invention discloses construct systems and methods
for comparing different iso-accepting codons according to their
preference for translating RNA transcripts into proteins in cells or
tissues of interest or for producing a selected phenotype in an
organism of interest or part thereof. The codon preference
comparisons thus obtained are particularly useful for modifying the
translational efficiency of protein-encoding polynucleotides in
cells or tissues of interest or for modulating the quality of a
selected phenotype conferred by a phenotype-associated polypeptide
upon an organism of interest or part thereof.
Inventors: | Frazer; Ian Hector (Queensland, AU) |
Assignee: | THE UNIVERSITY OF QUEENSLAND, ST. LUCIA, QLD, AU |
Family ID: | 40566909 |
Appl. No.: | 12/738291 |
Filed: | October 2, 2008 |
PCT Filed: | October 2, 2008 |
PCT NO: | PCT/AU2008/001465 |
371 Date: | May 11, 2011 |
Related U.S. Patent Documents
Application Number | Filing Date | Patent Number |
60980145 | Oct 15, 2007 | |
Current U.S. Class: | 435/6.17; 435/6.1; 435/6.18 |
Current CPC Class: |
C07K 14/005 20130101;
A61P 33/00 20180101; A61P 31/16 20180101; A61P 31/06 20180101; A61P
33/12 20180101; A61K 2039/55516 20130101; A61K 39/245 20130101;
A61P 31/14 20180101; A61P 33/06 20180101; C12N 2710/20034 20130101;
C40B 40/08 20130101; A61P 31/22 20180101; A61K 2039/575 20130101;
A61P 37/04 20180101; C12N 2760/16122 20130101; A61K 39/145
20130101; A61P 31/18 20180101; A61K 48/0075 20130101; C12N
2710/16222 20130101; A61P 31/12 20180101; C12N 2710/20022 20130101;
A61P 33/02 20180101; C12N 2710/20071 20130101; C40B 50/04 20130101;
C12N 2760/16134 20130101; A61P 35/02 20180101; Y02A 50/30 20180101;
Y02A 50/39 20180101; A61P 35/00 20180101; C12N 15/67 20130101; C12N
15/79 20130101; C12N 2710/16622 20130101; C12N 2770/24222 20130101;
C12N 2770/24234 20130101; Y02A 50/469 20180101; C12N 2710/16634
20130101; A61K 39/29 20130101; C12N 2710/16234 20130101; A61K
48/0066 20130101; A61P 33/04 20180101; C12N 15/85 20130101; A61K
39/12 20130101; A61K 2039/585 20130101; C12N 2800/22 20130101; A61K
2039/53 20130101; A61P 31/04 20180101; A61P 31/10 20180101; A61P
31/20 20180101; A61K 2039/54 20130101 |
Class at Publication: | 435/6.17; 435/6.1; 435/6.18 |
International Class: | C12Q 1/68 20060101 C12Q001/68 |
Claims
1. A construct system for determining the translational efficiency
or phenotypic preference of different synonymous codons, the system
comprising a plurality of synthetic constructs, each comprising a
regulatory sequence that is operably connected to a reporter
polynucleotide, wherein the reporter polynucleotide of a first
construct comprises a first coding sequence for interrogating the
translational efficiency or phenotypic preference of a first codon
("the first interrogating codon") that codes for a first amino
acid, wherein the reporter polynucleotide of a second construct
comprises a second coding sequence for interrogating the
translational efficiency or phenotypic preference of a second codon
("the second interrogating codon") that codes for the first amino
acid, wherein the first and second coding sequences encode the same
amino acid sequence, wherein the first coding sequence comprises
the first interrogating codon to code for the first amino acid at
one or more positions of the amino acid sequence, wherein the
second coding sequence comprises the second interrogating codon to
code for the first amino acid at one or more positions of the amino
acid sequence, and wherein the first and second coding sequences
differ from one another in the choice of the first interrogating
codon or the second interrogating codon to code for the first amino
acid at the corresponding position(s) in the amino acid sequence,
and wherein the first coding sequence comprises the same number of
first interrogating codons as the number of second interrogating
codons in the second coding sequence.
2. A system according to claim 1, wherein the second coding
sequence differs from the first coding sequence by the
substitution of the first interrogating codon with the second
interrogating codon to code for the first amino acid at the one or
more positions of the amino acid sequence.
3. A system according to claim 1, wherein the construct system
comprises one or more additional synthetic constructs for
interrogating the translational efficiency or phenotypic preference
of one or more additional interrogating codons that code for the
first amino acid.
4. A system according to claim 3, wherein the construct system
comprises a corresponding number of synthetic constructs as the
number of synonymous codons that normally encode the first amino
acid.
5. A system according to claim 1, wherein the coding sequence of
individual synthetic constructs comprises at least 2 interrogating
codons of the corresponding type.
6. A system according to claim 1, wherein at least 10% of codons
that code for the first amino acid in the coding sequence of
individual synthetic constructs are the same interrogating
codon.
7. A system according to claim 1, wherein the construct system
further comprises a third construct and a fourth construct, wherein
the reporter polynucleotide of the third construct comprises a
third coding sequence for interrogating the translational
efficiency or phenotypic preference of a third codon ("the third
interrogating codon") that codes for a second amino acid that is
different to the first amino acid, wherein the reporter
polynucleotide of the fourth construct comprises a fourth coding
sequence for interrogating the translational efficiency or
phenotypic preference of a fourth codon ("the fourth interrogating
codon") that codes for the second amino acid, wherein the third and
fourth coding sequences encode the same amino acid sequence as the
first and second coding sequences, wherein the third coding
sequence comprises the third interrogating codon to code for the
second amino acid at one or more positions of the amino acid
sequence, wherein the fourth coding sequence comprises the fourth
interrogating codon to code for the second amino acid at one or
more positions of the amino acid sequence, and wherein the third
and fourth coding sequences differ from one another in the choice
of the third interrogating codon or the fourth interrogating codon
to code for the second amino acid at the corresponding position(s)
in the amino acid sequence.
8. A system according to claim 1, wherein the construct system
further comprises synthetic constructs for interrogating the
translational efficiency or phenotypic preference of codons that
code for other amino acids.
9. A system according to claim 1, wherein the coding sequence of
individual reporter polynucleotides encodes a polypeptide that
confers a phenotype upon a cell or tissue in which the coding
sequence is expressed.
10. A system according to claim 9, wherein the polypeptide is
selected from a reporter protein which, when present in a cell or
tissue, is detectable either by its presence or activity.
11. A system according to claim 10, wherein the reporter protein is
selected from a chemiluminescent reporter protein such as
luciferase, a fluorescent protein such as green fluorescent
protein, an enzymatic reporter protein such as chloramphenicol
acetyl transferase, β-galactosidase, secreted placental alkaline
phosphatase, β-lactamase or a growth factor such as human growth
hormone.
12. A system according to claim 1, wherein the coding sequence of
individual reporter polynucleotides encodes a polypeptide that
confers a phenotype upon a cell or tissue in which the coding
sequence is not expressed.
13. A system according to claim 12, wherein the polypeptide is a
phenotype-associated polypeptide that is the subject of producing a
selected phenotype or a phenotype of the same class as the selected
phenotype.
14. A system according to claim 1, wherein the reporter
polynucleotide of individual synthetic constructs further comprises
an ancillary coding sequence that encodes a detectable tag.
15. A system according to claim 14, wherein the tag is a member of
a specific binding pair.
16. A system according to claim 14, wherein the ancillary coding
sequence of one reporter polynucleotide encodes a different tag
than the ancillary coding sequence of another reporter
polynucleotide.
17. A method for determining the translational efficiency of a
first codon relative to a second codon in a cell of interest,
wherein the first codon and the second codon code for the same
amino acid, the method comprising: providing a plurality of
synthetic constructs, each comprising a regulatory sequence that is
operably connected to a reporter polynucleotide, wherein the
reporter polynucleotide of a first construct comprises a first
coding sequence for interrogating the translational efficiency of
the first codon, wherein the reporter polynucleotide of a second
construct comprises a second coding sequence for interrogating the
translational efficiency of the second codon, wherein the first and
second coding sequences encode the same amino acid sequence, which
defines in whole or in part a reporter protein, wherein the first
coding sequence comprises the first codon to code for the first
amino acid at one or more positions of the amino acid sequence,
wherein the second coding sequence comprises the second codon to
code for the first amino acid at one or more positions of the amino
acid sequence, and wherein the first and second coding sequences
differ from one another in the choice of the first codon or the
second codon to code for the first amino acid at the corresponding
position(s) in the amino acid sequence and wherein the first coding
sequence comprises the same number of first interrogating codons as
the number of second interrogating codons in the second coding
sequence; introducing the first construct into a cell of the same
type as the cell of interest; introducing the second construct into
a cell of the same type as the cell of interest; measuring
expression of the reporter protein from the first construct and
from the second construct in the cell; and determining the
translational efficiency of the first codon and the translational
efficiency of the second codon based on the measured expression of
the reporter protein in the cell, to thereby determine the
translational efficiency of the first codon relative to the second
codon in the cell of interest.
18. A method according to claim 17, further comprising determining
a comparison of translational efficiencies of individual synonymous
codons in the cell of interest.
19. A method according to claim 17, comprising: introducing an
individual synthetic construct into a progenitor of the cell of
interest; and differentiating the cell of interest from the
progenitor, wherein the cell of interest contains the synthetic
construct.
20. A method according to claim 17, wherein the first and second
constructs are separately introduced into different cells.
21. A method according to claim 17, wherein the first and second
constructs are introduced into the same cell.
22. A method for determining the translational efficiency of a
first codon and a second codon in a first cell type relative to a
second cell type, the method comprising: providing a plurality of
synthetic constructs, each comprising a regulatory sequence that is
operably connected to a reporter polynucleotide, wherein the
reporter polynucleotide of a first construct comprises a first
coding sequence for interrogating the translational efficiency of
the first codon, wherein the reporter polynucleotide of a second
construct comprises a second coding sequence for interrogating the
translational efficiency of the second codon, wherein the first and
second coding sequences encode the same amino acid sequence, which
defines in whole or in part a reporter protein, wherein the first
coding sequence comprises the first codon to code for the first
amino acid at one or more positions of the amino acid sequence,
wherein the second coding sequence comprises the second codon to
code for the first amino acid at one or more positions of the amino
acid sequence, and wherein the first and second coding sequences
differ from one another in the choice of the first codon or the
second codon to code for the first amino acid at the corresponding
position(s) in the amino acid sequence and wherein the first coding
sequence comprises the same number of first interrogating codons as
the number of second interrogating codons in the second coding
sequence; separately introducing the first construct into the first
cell type and into the second cell type; separately introducing the
second construct into the first cell type and into the second cell
type; measuring expression of the reporter protein in the first
cell type and in the second cell type to which the first construct
was provided; measuring expression of the reporter protein in the
first cell type and in the second cell type to which the second
construct was provided; determining the translational efficiency of
the first codon in the first cell type and in the second cell type
based on the measured expression of the reporter protein in the
first cell type and in the second cell type, respectively, to which
the first construct was provided, to thereby determine the
translational efficiency of the first codon in the first cell type
relative to the second cell type; and determining the translational
efficiency of the second codon in the first cell type and in the
second cell type based on the measured expression of the reporter
protein in the first cell type and in the second cell type,
respectively, to which the second construct was provided, to
thereby determine the translational efficiency of the second codon
in the first cell type relative to the second cell type.
23. A method according to claim 22, further comprising determining
a comparison of translational efficiencies of individual synonymous
codons in the first cell type relative to the second cell
type.
24. A method according to claim 22, further comprising: introducing
an individual synthetic construct into a progenitor of a cell
selected from the first cell type or the second cell type; and
differentiating the cell from the progenitor, wherein the cell
contains the synthetic construct.
25. A method for determining the preference of a first codon
relative to the preference of a second codon for producing a
selected phenotype ("the phenotypic preference") in an organism of
interest or part thereof, wherein the first codon and the second
codon code for the same amino acid, the method comprising:
providing a plurality of synthetic constructs, each comprising a
regulatory sequence that is operably connected to a reporter
polynucleotide, wherein the reporter polynucleotide of a first
construct comprises a first coding sequence for interrogating the
phenotypic preference of the first codon, wherein the reporter
polynucleotide of a second construct comprises a second coding
sequence for interrogating the phenotypic preference of the second
codon, wherein the first and second coding sequences encode the
same amino acid sequence, which defines in whole or in part a
reporter protein, which produces, or which is predicted to produce,
the selected phenotype or a phenotype of the same class as the
selected phenotype, wherein the first coding sequence comprises the
first codon to code for the first amino acid at one or more
positions of the amino acid sequence, wherein the second coding
sequence comprises the second codon to code for the first amino
acid at one or more positions of the amino acid sequence, and
wherein the first and second coding sequences differ from one
another in the choice of the first codon or the second codon to
code for the first amino acid at the corresponding position(s) in
the amino acid sequence and wherein the first coding sequence
comprises the same number of first interrogating codons as the
number of second interrogating codons in the second coding
sequence; introducing the first construct into a first test
organism or part thereof, wherein the test organism is selected
from the group consisting of an organism of the same species as the
organism of interest and an organism that is related to the
organism of interest; introducing the second construct into a
second test organism or part thereof, wherein the second test
organism is of the same type as the first organism; determining the
quality of the corresponding phenotype displayed by the first test
organism or part and by the second test organism or part; and
determining the phenotypic preference of the first codon and the
phenotypic preference of the second codon based, respectively, on
the quality of the corresponding phenotype displayed by the first
test organism or part and by the second test organism or part, to
thereby determine the phenotypic preference of the first codon
relative to the phenotypic preference of the second codon in the
organism of interest or part thereof.
26. A method according to claim 25, further comprising determining
a comparison of phenotypic preferences of individual synonymous
codons in the organism of interest or part thereof.
27. A method according to claim 25, further comprising: introducing
an individual synthetic construct into a progenitor of the test
organism or part; and growing a non-human organism or part from the
progenitor, wherein the organism or part contains the synthetic
construct.
28. A method according to claim 25, further comprising: introducing
an individual synthetic construct into a progenitor of the test
organism or part; and growing a non-human organism or part from the
progenitor, wherein the organism or part comprises a cell
containing the synthetic construct.
29. A method of constructing a synthetic polynucleotide from which
an encoded polypeptide is produced at a higher level in a cell of
interest than from a parent polynucleotide that encodes the same
polypeptide, the method comprising: determining the translational
efficiency of different synonymous codons in cells of the same type
as the cell of interest, as defined in claim 17, to thereby
determine a comparison of translational efficiencies of individual
synonymous codons in the cell of interest; selecting a first codon
of the parent polynucleotide for replacement with a synonymous
codon, wherein the synonymous codon is selected on the basis that
it exhibits a higher translational efficiency than the first codon
in the cell of interest according to the comparison of
translational efficiencies; and replacing the first codon with the
synonymous codon to construct the synthetic polynucleotide.
30. A method according to claim 29, wherein the synonymous codon is
selected on the basis that it corresponds to an interrogating codon
in a synthetic construct from which the reporter protein is
expressed in the cell of interest at a level that is at least about
10% higher than the level of the reporter protein expressed from a
synthetic construct that comprises the first codon as the
interrogating codon.
31. A method of constructing a synthetic polynucleotide from which
an encoded polypeptide is produced at a lower level in a cell of
interest than from a parent polynucleotide that encodes the same
polypeptide, the method comprising: determining the translational
efficiency of different synonymous codons in cells of the same type
as the cell of interest, as defined in claim 17, to thereby
determine a comparison of translational efficiencies of individual
synonymous codons in the cell of interest; selecting a first codon
of the parent polynucleotide for replacement with a synonymous
codon, wherein the synonymous codon is selected on the basis that
it exhibits a lower translational efficiency than the first codon
in the cell of interest according to the comparison of
translational efficiencies; and replacing the first codon with the
synonymous codon to construct the synthetic polynucleotide.
32. A method according to claim 31, wherein the synonymous codon is
selected on the basis that it corresponds to an interrogating codon
in a synthetic construct from which the reporter protein is
expressed in the cell of interest at a level that is no more than
90% of the level of the reporter protein expressed from a synthetic
construct that comprises the first codon as the interrogating
codon.
33. A method of constructing a synthetic polynucleotide from which
an encoded polypeptide is produced at a higher level in a first
cell than in a second cell, the method comprising: determining the
translational efficiency of different synonymous codons in cells of
the same type as the first cell and in cells of the same type as
the second cell, as defined in claim 22, to thereby determine a
comparison of translational efficiencies of individual synonymous
codons between the first cell and the second cell; selecting a
first codon of the parent polynucleotide for replacement with a
synonymous codon, wherein the synonymous codon is selected on the
basis that it exhibits a higher translational efficiency in the
first cell than in the second cell according to the comparison of
translational efficiencies; and replacing the first codon with the
synonymous codon to construct the synthetic polynucleotide.
34. A method according to claim 28, wherein the synonymous codon is
the same as the interrogating codon in a synthetic construct from
which the reporter protein is expressed in the first cell at a
level that is at least about 10% higher than the level of the
reporter protein expressed from the same synthetic construct in the
second cell.
35. A method of constructing a synthetic polynucleotide from which
a polypeptide is producible to confer a selected phenotype upon an
organism of interest or part thereof in a different quality than
that conferred by a parent polynucleotide that encodes the same
polypeptide, the method comprising: determining the preference of
different synonymous codons for producing the selected phenotype
("the phenotypic preference") in test organisms or parts thereof,
as defined in claim 25, wherein the test organisms are selected
from the group consisting of an organism of the same species as the
organism of interest and an organism that is related to the
organism of interest, to thereby determine a comparison of
phenotypic preferences of individual synonymous codons in the
organism of interest; selecting a first codon of the parent
polynucleotide for replacement with a synonymous codon, wherein the
synonymous codon is selected on the basis that it exhibits a
different phenotypic preference than the first codon in the
comparison of phenotypic preferences in the organism or part thereof;
and replacing the first codon with the synonymous codon to
construct the synthetic polynucleotide.
36. A method according to claim 35, wherein the synthetic
polynucleotide confers the selected phenotype upon the organism of
interest or part thereof in a higher quality than that conferred by
the parent polynucleotide.
37. A method according to claim 36, wherein the synonymous codon is
selected on the basis that it corresponds to an interrogating codon
in a synthetic construct that confers the selected phenotype in the
organism of interest or part thereof in a quality that is at least
about 10% higher than the quality of the phenotype conferred by the
synthetic construct comprising the first codon as the interrogating
codon.
38. A method according to claim 35, wherein the synthetic
polynucleotide confers the selected phenotype upon the organism of
interest or part thereof in a lower quality than that conferred by
the parent polynucleotide.
39. A method according to claim 38, wherein the synonymous codon is
selected on the basis that it corresponds to an interrogating codon
in a synthetic construct that confers the selected phenotype in the
organism of interest or part thereof in a quality that is no more
than 90% of the quality of the phenotype conferred by the synthetic
construct comprising the first codon as the interrogating codon.
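The construct design recited in claim 1 and the codon selection recited in claims 29 and 30 can be illustrated with a minimal sketch. It is not the applicant's implementation. The function names `interrogating_variants` and `pick_replacement`, the toy coding sequence, and the reporter levels are all hypothetical, and choosing the single best-performing codon is only one possible way of satisfying the "at least about 10% higher" criterion.

```python
def interrogating_variants(codons, positions, synonymous_codons):
    """Build one coding sequence per interrogating codon (hypothetical helper).

    codons            -- codons of the shared amino acid sequence
    positions         -- indices at which the first amino acid is interrogated
    synonymous_codons -- codons that all code for that amino acid

    Every variant encodes the same amino acid sequence and carries the same
    number of interrogating codons; variants differ only in codon choice.
    """
    variants = {}
    for interrogating in synonymous_codons:
        variant = list(codons)
        for p in positions:
            variant[p] = interrogating
        variants[interrogating] = "".join(variant)
    return variants


def pick_replacement(first_codon, reporter_levels, min_gain=0.10):
    """Pick a synonymous codon whose construct gave a higher reporter level.

    reporter_levels maps each interrogating codon to the measured reporter
    level of its construct. Returns the best codon if its level is at least
    (1 + min_gain) times the level for first_codon, otherwise None.
    """
    baseline = reporter_levels[first_codon]
    best = max(reporter_levels, key=reporter_levels.get)
    if best != first_codon and reporter_levels[best] >= baseline * (1 + min_gain):
        return best
    return None


# Toy reporter: Met-Ala-Lys-Ala-stop, interrogating the two alanine positions.
toy_codons = ["ATG", "GCT", "AAA", "GCT", "TAA"]
variants = interrogating_variants(toy_codons, [1, 3], ["GCT", "GCC", "GCA", "GCG"])

# Invented reporter levels (arbitrary units), for illustration only.
levels = {"GCT": 100.0, "GCC": 140.0, "GCA": 105.0, "GCG": 60.0}
choice = pick_replacement("GCT", levels)
```

With these invented numbers, `choice` is the codon `GCC`. Passing a negative selection rule instead (a level no more than 90% of the baseline) would correspond to the lower-expression case of claims 31 and 32.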
Description
FIELD OF THE INVENTION
[0001] The present invention relates generally to gene expression.
More particularly, the present invention relates to construct
systems and methods for comparing different iso-accepting codons
according to their preference for translating RNA transcripts into
proteins in cells or tissues of interest or for producing a selected
phenotype in an organism of interest or part thereof. The codon
preference comparisons thus obtained are particularly useful for
modifying the translational efficiency of protein-encoding
polynucleotides in cells or tissues of interest or for modulating
the quality of a selected phenotype conferred by a
phenotype-associated polypeptide upon an organism of interest or
part thereof.
BACKGROUND OF THE INVENTION
[0002] The expression of foreign heterologous genes in transformed
cells is now commonplace. A large number of mammalian genes,
including, for example, murine and human genes, have been
successfully expressed in various host cells, including bacterial,
yeast, insect, plant and mammalian host cells. Nevertheless,
despite the burgeoning knowledge of expression systems and
recombinant DNA technology, significant obstacles remain when one
attempts to express a foreign or synthetic gene in a selected host
cell. For example, translation of a synthetic gene, even when
coupled with a strong promoter, often proceeds much more slowly
than would be expected. The same is frequently true of exogenous
genes that are foreign to the host cell. This lower than expected
translation efficiency is often due to the protein coding regions
of the gene having a codon usage pattern that does not resemble
those of highly expressed genes in the host cell. It is known in
this regard that codon utilization is highly biased and varies
considerably in different organisms and that biases in codon usage
can alter peptide elongation rates. It is also known that codon
usage patterns are related to the relative abundance of tRNA
isoacceptors, and that genes encoding proteins of high versus low
abundance show differences in their codon preferences.
[0003] The implications of codon preference phenomena on gene
expression are manifest in that these phenomena can affect the
translational efficiency of messenger RNA (mRNA). It is widely
known in this regard that translation of "rare codons", for which
the corresponding iso-tRNA is in low abundance relative to other
iso-tRNAs, may cause a ribosome to pause during translation which
can lead to a failure to complete a nascent polypeptide chain and
an uncoupling of transcription and translation. Thus, the
expression of an exogenous gene may be impeded severely if a
particular host cell of an organism or the organism itself has a
low abundance of iso-tRNAs corresponding to one or more codons of
the exogenous gene. Accordingly, a major aim of investigators in
this field is to first ascertain the codon preference for
particular cells in which an exogenous gene is to be expressed, and
to subsequently alter the codon composition of that gene for
optimized expression in those cells.
[0004] Codon-optimization techniques are known for improving the
translational kinetics of translationally inefficient protein
coding regions. Traditionally, these techniques have been based on
the replacement of codons that are rarely or infrequently used in
the host cell with those that are host-preferred. Codon frequencies
can be derived from literature sources for the highly expressed
genes of many organisms (see, for example, Nakamura et al., 1996,
Nucleic Acids Res 24: 214-215). These frequencies are generally
expressed on an `organism-wide average basis` as the percentage of
occasions that a synonymous codon is used to encode a corresponding
amino acid across a collection of protein-encoding genes of that
organism, which are preferably highly expressed.
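As an illustration of how such synonymous codon frequencies can be computed, the following is a minimal sketch that pools a collection of coding sequences and reports, for each amino acid, the percentage of occasions on which each synonymous codon is used. It assumes the standard genetic code and in-frame DNA (or RNA) coding sequences. The function name `synonymous_usage` and the two toy genes are hypothetical and do not come from the cited literature.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def synonymous_usage(coding_sequences):
    """Return {amino_acid: {codon: percent}} pooled over all sequences."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper().replace("U", "T")
        # Read whole codons only; ignore any trailing partial codon.
        for pos in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[pos:pos + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    # Total occurrences of each amino acid across all of its codons.
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[CODON_TABLE[codon]] += n

    usage = defaultdict(dict)
    for codon, aa in CODON_TABLE.items():
        if totals[aa]:
            usage[aa][codon] = 100.0 * counts[codon] / totals[aa]
    return dict(usage)


# Two invented toy genes, for illustration only.
genes = ["ATGGCTGCTGCCAAATAA", "ATGGCTGCAAAGAAATAA"]
usage = synonymous_usage(genes)
```

In this toy input alanine occurs five times (GCT three times, GCC and GCA once each, GCG never), so `usage["A"]` reports 60%, 20%, 20% and 0% for those codons respectively.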
[0005] Typically, codons are classified as: (a) "common" codons (or
"preferred" codons) if their frequency of usage is above about
4/3 × the frequency of usage that would be expected in the
absence of any bias in codon usage; (b) "rare" codons (or
"non-preferred" codons) if their frequency of usage is below about
2/3 × the frequency of usage that would be expected in the
absence of any bias in codon usage; and (c) "intermediate" codons
(or "less preferred" codons) if their frequency of usage is
in-between the frequency of usage of "common" codons and of "rare"
codons. Since an amino acid can be encoded by 2, 3, 4 or 6 codons,
the frequency of usage of any selected codon, which would be
expected in the absence of any bias in codon usage, will be
dependent upon the number of synonymous codons which code for the
same amino acid as the selected codon. Accordingly, for a
particular amino acid, the frequency thresholds for classifying
codons in the "common", "intermediate" and "rare" categories will
be dependent upon the number of synonymous codons for that amino
acid. Consequently, for amino acids having 6 choices of synonymous
codon, the frequency of codon usage that would be expected in the
absence of any bias in codon usage is 16% and thus the "common",
"intermediate" and "rare" codons are defined as those codons that
have a frequency of usage above 20%, between 10 and 20% and below
10%, respectively. For amino acids having 4 choices of synonymous
codon, the frequency of codon usage that would be expected in the
absence of codon usage bias is 25% and thus the "common",
"intermediate" and "rare" codons are defined as those codons that
have a frequency of usage above 33%, between 16 and 33% and below
16%, respectively. For isoleucine, which is the only amino acid
having 3 choices of synonymous codon, the frequency of codon usage
that would be expected in the absence of any bias in codon usage is
33% and thus the "common", "intermediate" and "rare" codons for
isoleucine are defined as those codons that have a frequency of
usage above 45%, between 20 and 45% and below 20%, respectively.
For amino acids having 2 choices of synonymous codon, the frequency
of codon usage that would be expected in the absence of codon usage
bias is 50% and thus the "common", "intermediate" and "rare" codons
are defined as those codons that have a frequency of usage above
60%, between 30 and 60% and below 30%, respectively. Thus, the
categorization of codons into the "common", "intermediate" and
"rare" classes (or "preferred", "less preferred" or
"non-preferred", respectively) has been based conventionally on a
compilation of codon usage for an organism in general (e.g.,
`human-wide`) or for a class of organisms in general (e.g.,
`mammal-wide`). For example, reference may be made to Seed (see
U.S. Pat. Nos. 5,786,464 and 5,795,737) who discloses preferred,
less preferred and non-preferred codons for mammalian cells in
general. However, the present inventor revealed in WO 99/02694 and
in WO 00/42190 that there are substantial differences in the
relative abundance of particular iso-tRNAs in different cells or
tissues of a single multicellular organism (e.g., a mammal or a
plant) and that this plays a pivotal role in protein translation
from a coding sequence with a given codon usage or composition.
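The conventional classification described above can be sketched as a small lookup. This is an illustration only: the function names are hypothetical, and the percentage cut-offs are copied from the figures quoted in the preceding paragraph. Those quoted figures are rounded, so they do not always equal exactly 4/3 or 2/3 of the unbiased frequency; the second function computes the unrounded values for comparison.

```python
# (upper, lower) percentage cut-offs as quoted in the text, keyed by the
# number of synonymous codons available for the amino acid.
THRESHOLDS = {6: (20.0, 10.0), 4: (33.0, 16.0), 3: (45.0, 20.0), 2: (60.0, 30.0)}


def classify_codon(usage_percent, n_synonymous):
    """Classify a codon as 'common', 'intermediate' or 'rare'.

    usage_percent -- percentage of occasions the codon encodes its amino acid
    n_synonymous  -- number of synonymous codons for that amino acid
    """
    if n_synonymous not in THRESHOLDS:
        raise ValueError("expected 2, 3, 4 or 6 synonymous codons")
    upper, lower = THRESHOLDS[n_synonymous]
    if usage_percent > upper:
        return "common"
    if usage_percent < lower:
        return "rare"
    return "intermediate"


def unrounded_thresholds(n_synonymous):
    """Return (4/3, 2/3) times the usage expected without codon bias, in percent."""
    expected = 100.0 / n_synonymous
    return (4.0 / 3.0 * expected, 2.0 / 3.0 * expected)


label_four_fold = classify_codon(60.0, 4)   # well above the 33% cut-off
label_six_fold = classify_codon(5.0, 6)     # below the 10% cut-off
label_ile = classify_codon(25.0, 3)         # between 20% and 45%
```

For a four-fold degenerate amino acid, `unrounded_thresholds(4)` gives roughly 33.3% and 16.7%, which the text rounds to 33% and 16%.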
[0006] Thus, in contrast to the art-recognized presumption that
different cells of a multicellular organism have the same bias in
codon usage, it was revealed for the first time that one cell type
of a multicellular organism uses codons in a manner distinct from
another cell type of the same organism. In other words, it was
discovered that different cells of an organism can exhibit
different translational efficiencies for the same codon and that it
was not possible to predict which codons would be preferred, less
preferred or non-preferred in a selected cell type. Accordingly, it
was proposed that differences in codon translational efficiency
between cell types could be exploited, together with codon
composition of a gene, to regulate the production of a protein in,
or to direct that production to, a chosen cell type.
[0007] Therefore, in order to optimize the expression of a
protein-encoding polynucleotide in a particular cell type, WO
99/02694 and WO 00/42190 teach that it is necessary to first
determine the translational efficiency for each codon in that cell
type, rather than to rely on codon frequencies calculated on an
organism-wide average basis, and then to codon modify the
polynucleotide based on that determination. WO 00/42190 further
teaches a vector system for ranking synonymous codons according to
their translational efficiencies. This vector system comprises a
plurality of synthetic constructs, each comprising a regulatory
sequence that is operably linked to a tandem repeat of a codon
fused in frame with a reporter polynucleotide that encodes a
reporter protein, wherein the tandemly repeated codon of one
construct is different to the tandemly repeated codon of another.
In this system, the tandemly repeated codon is thought to cause a
ribosome to pause during translation if the iso-tRNA corresponding
to the tandemly repeated codon is limiting. Accordingly, the levels
of reporter protein produced using this vector system are sensitive
to the intracellular abundance of the iso-tRNA species
corresponding to the tandemly repeated codon and provide,
therefore, a direct correlation of a cell's or tissue's preference
for translating a given codon. This means, for example, that if the
levels of the reporter protein obtained in a cell or tissue type to
which a synthetic construct having a first tandemly repeated codon
is provided are lower than the levels expressed in the same cell or
tissue type to which a different synthetic construct having a
second tandemly repeated codon is provided (i.e., wherein the first
tandemly repeated codon is different than, but synonymous with, the
second tandemly repeated codon), then it can be deduced that the
second tandemly repeated codon has a higher translational
efficiency than the first tandemly repeated codon in the cell or
tissue type.
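The deduction described above, in which a higher reporter level implies a higher translational efficiency for the corresponding codon, can be sketched as a simple ordering. The function name and the numeric reporter levels below are hypothetical and are given for illustration only; they are not measurements from any of the cited documents.

```python
from typing import Dict, List


def rank_synonymous_codons(reporter_levels: Dict[str, float]) -> List[str]:
    """Order synonymous codons from highest to lowest reporter level.

    reporter_levels maps each tandemly repeated codon to the reporter
    protein level measured, in one cell or tissue type, from the construct
    carrying that codon.
    """
    return sorted(reporter_levels, key=reporter_levels.get, reverse=True)


# Hypothetical reporter levels (arbitrary units) for the four alanine codons.
levels = {"GCT": 120.0, "GCC": 95.0, "GCA": 40.0, "GCG": 55.0}
ranking = rank_synonymous_codons(levels)
# ranking == ["GCT", "GCC", "GCG", "GCA"]
```

Under these made-up values, GCT would be deduced to have the highest translational efficiency in the tested cell type and GCA the lowest.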
[0008] The present inventor further determined a strategy for
enhancing or reducing the quality of a selected phenotype
(immunity, tolerance, pathogen resistance, enhancement or
prevention of a repair process, pest resistance, frost resistance,
herbicide tolerance, etc.) that is displayed, or proposed to be
displayed, by an organism of interest. This strategy, which is
disclosed in WO 2004/042059, involves codon modification of a
polynucleotide that encodes a phenotype-associated polypeptide that
either by itself, or in association with other molecules, in the
organism of interest imparts or confers the selected phenotype upon
the organism. Unlike previous methods, which rely on data that
provide a ranking of synonymous codons according to their
preference of usage or according to their translational
efficiencies, this strategy is based on ranking individual
synonymous codons according to their preference of usage by the
organism or class of organisms, or by a part thereof, for producing
the selected phenotype. An illustrative method for determining
codon phenotypic preferences is disclosed in WO 2004/042059, which
employs the synthetic construct system disclosed in WO 00/42190 to
derive a set of synonymous codons that may display a range of
phenotypic preferences, which can be used as a basis for rationally
selecting a codon in a polynucleotide that encodes a
phenotype-associated polypeptide for replacement with a synonymous codon that
has a different phenotypic preference.
SUMMARY OF THE INVENTION
[0009] The present invention is predicated in part on the discovery
that the sensitivity of determining the translational efficiency or
phenotypic preference of different synonymous codons can be
improved using a construct system that employs different reporter
polynucleotides that encode the same amino acid sequence, wherein
individual reporter polynucleotides use the same codon (also
referred to herein as "an interrogating codon") to code for a
particular amino acid at one or more positions of the amino acid
sequence, and wherein the interrogating codon of one reporter
polynucleotide is different to but synonymous with the
interrogating codon of another reporter polynucleotide. In specific
embodiments, the sensitivity is improved further by incorporating
two or more interrogating codons to code for the particular amino
acid in the amino acid sequence.
[0010] Thus, in one aspect of the present invention, construct
systems are provided for determining the translational efficiency
or phenotypic preference of different synonymous codons. These
systems generally comprise a plurality of synthetic constructs,
each comprising a regulatory sequence that is operably connected to
a reporter polynucleotide, wherein the reporter polynucleotide of a
first construct comprises a first coding sequence for interrogating
the translational efficiency or phenotypic preference of a first
codon ("the first interrogating codon") that codes for a first
amino acid, wherein the reporter polynucleotide of a second
construct comprises a second coding sequence for interrogating the
translational efficiency or phenotypic preference of a second codon
("the second interrogating codon") that codes for the first amino
acid, wherein the first and second coding sequences encode the same
amino acid sequence, wherein the first coding sequence comprises
the first interrogating codon to code for the first amino acid at
one or more positions of the amino acid sequence, wherein the
second coding sequence comprises the second interrogating codon to
code for the first amino acid at one or more positions of the amino
acid sequence, and wherein the first and second coding sequences
differ from one another in the choice of the first interrogating
codon or the second interrogating codon to code for the first amino
acid at the corresponding position(s) in the amino acid sequence.
Suitably, the first coding sequence comprises the same number of
first interrogating codons as the number of second interrogating
codons in the second coding sequence. In specific embodiments, the
second coding sequence differs from the first coding
sequence by the substitution of the first interrogating codon with
the second interrogating codon to code for the first amino acid at
the one or more positions of the amino acid sequence. In some
embodiments, the construct system comprises one or more additional
synthetic constructs for interrogating the translational efficiency
or phenotypic preference of one or more additional interrogating
codons that code for the first amino acid. In illustrative
examples of this type, the construct system comprises a
corresponding number of synthetic constructs as the number of
synonymous codons that normally encode the first amino acid.
[0011] In some embodiments, the coding sequence of individual
synthetic constructs comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 30, 40, 50, 60, 70, 80, 90,
100, 150, 200, 250, 300, 350, 400 or 500 interrogating codons of the
corresponding type. Suitably, at least 10%, 15%, 20%, 25%, 30%,
35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95%,
99% or even 100% of codons that code for the first amino acid in
the coding sequence of individual synthetic constructs are the same
interrogating codon.
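As a non-limiting sketch of how coding sequences of this kind might be derived, the following replaces every codon for one amino acid in a parent coding sequence with a single interrogating codon, producing one variant per synonymous codon. The function names and the short parent sequence are hypothetical; the sketch covers only the coding sequence (not the regulatory sequence) and corresponds to the case in which 100% of the codons for the first amino acid are the same interrogating codon.

```python
from typing import Dict, List, Sequence

# The four alanine codons of the standard genetic code.
ALA_CODONS = ("GCT", "GCC", "GCA", "GCG")


def split_codons(coding_seq: str) -> List[str]:
    """Split an in-frame coding sequence into successive codons."""
    if len(coding_seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]


def make_variant(parent: str, synonymous: Sequence[str], interrogating: str) -> str:
    """Use the interrogating codon at every position encoding the amino acid."""
    if interrogating not in synonymous:
        raise ValueError("interrogating codon must belong to the synonymous set")
    return "".join(
        interrogating if codon in synonymous else codon
        for codon in split_codons(parent)
    )


def build_construct_set(parent: str, synonymous: Sequence[str]) -> Dict[str, str]:
    """One coding sequence per synonymous codon, all encoding the same protein."""
    return {codon: make_variant(parent, synonymous, codon) for codon in synonymous}


# Hypothetical parent: Met-Ala-Lys-Ala-Phe-Ala-stop.
variants = build_construct_set("ATGGCAAAAGCGTTTGCCTAA", ALA_CODONS)
# variants["GCT"] == "ATGGCTAAAGCTTTTGCTTAA"
```

Each variant contains the same number of interrogating codons (three in this toy example) and differs from the others only at the positions encoding alanine.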
[0012] In some embodiments, the construct system further comprises
a third construct and a fourth construct, wherein the reporter
polynucleotide of the third construct comprises a third coding
sequence for interrogating the translational efficiency or
phenotypic preference of a third codon ("the third interrogating
codon") that codes for a second amino acid that is different to the
first amino acid, wherein the reporter polynucleotide of the fourth
construct comprises a fourth coding sequence for interrogating the
translational efficiency or phenotypic preference of a fourth codon
("the fourth interrogating codon") that codes for the second amino
acid, wherein the third and fourth coding sequences encode the same
amino acid sequence as the first and second coding sequences,
wherein the third coding sequence comprises the third interrogating
codon to code for the second amino acid at one or more positions of
the amino acid sequence, wherein the fourth coding sequence
comprises the fourth interrogating codon to code for the second
amino acid at one or more positions of the amino acid sequence, and
wherein the third and fourth coding sequences differ from one
another in the choice of the third interrogating codon or the
fourth interrogating codon to code for the second amino acid at the
corresponding position(s) in the amino acid sequence.
[0013] In some embodiments, the construct system further comprises
synthetic constructs for interrogating the translational efficiency
or phenotypic preference of codons that code for other amino
acids.
[0014] In some embodiments, the coding sequence of individual
reporter polynucleotides encodes an amino acid sequence that
confers a phenotype upon a cell or tissue in which the coding
sequence is expressed (e.g., an amino acid sequence of a reporter
protein which, when present in a cell or tissue, is detectable
either by its presence or activity, including, but not limited to,
a chemiluminescent reporter protein such as luciferase, a
fluorescent protein such as green fluorescent protein, an enzymatic
reporter protein such as chloramphenicol acetyl transferase,
β-galactosidase, secreted placental alkaline phosphatase,
β-lactamase or a growth factor such as human growth hormone).
Such reporter proteins are useful, for example, in determining the
translational efficiency of different synonymous codons in a cell
or tissue type of interest. In other embodiments, the coding
sequence of individual reporter polynucleotides encodes an amino
acid sequence that confers a phenotype upon a cell or tissue in
which the coding sequence is not expressed including, for example,
the amino acid sequence of a phenotype-associated polypeptide that
is the subject of producing a selected phenotype (e.g., cellular
immunity to melanoma) or a phenotype of the same class as the
selected phenotype (e.g., a cellular immune response), as for
example disclosed in WO 2004/042059, which is hereby incorporated
by reference herein in its entirety.
[0015] In some embodiments, the reporter polynucleotide of
individual synthetic constructs further comprises an ancillary
coding sequence that encodes a detectable tag, which is suitably a
member of a specific binding pair, which includes for example,
antibody-antigen (or hapten) pairs, ligand-receptor pairs,
enzyme-substrate pairs, biotin-avidin pairs, and the like. In
illustrative examples of this type, the ancillary coding sequence
of one reporter polynucleotide encodes a different tag than the
ancillary coding sequence of another reporter polynucleotide. In
these examples, it is possible to detectably distinguish the
polypeptide products of different reporter polynucleotides in the
same cell or organism of interest or part thereof, thereby
permitting simultaneous determination of the translational
efficiencies of different interrogating codons in the same cell or
organism or part.
[0016] In another aspect, the present invention provides methods
for determining the translational efficiency of a first codon
relative to a second codon in a cell of interest,
[0017] wherein the first codon and the second codon code for the
same amino acid. These methods generally comprise: [0018] providing
a plurality of synthetic constructs, each comprising a regulatory
sequence that is operably connected to a reporter polynucleotide,
wherein the reporter polynucleotide of a first construct comprises
a first coding sequence for interrogating the translational
efficiency of the first codon, wherein the reporter polynucleotide
of a second construct comprises a second coding sequence for
interrogating the translational efficiency of the second codon,
wherein the first and second coding sequences encode the same amino
acid sequence, which defines in whole or in part a reporter
protein, wherein the first coding sequence comprises the first
codon to code for the first amino acid at one or more positions of
the amino acid sequence, wherein the second coding sequence
comprises the second codon to code for the first amino acid at one
or more positions of the amino acid sequence, and wherein the first
and second coding sequences differ from one another in the choice
of the first codon or the second codon to code for the first amino
acid at the corresponding position(s) in the amino acid sequence;
[0019] introducing the first construct into a cell of the same type
as the cell of interest; [0020] introducing the second construct
into a cell of the same type as the cell of interest; [0021]
measuring expression of the reporter protein from the first
construct and from the second construct in the cell; and [0022]
determining the translational efficiency of the first codon and the
translational efficiency of the second codon based on the measured
expression of the reporter protein in the cell, to thereby
determine the translational efficiency of the first codon relative
to the second codon in the cell of interest.
[0023] In some embodiments, the first and second constructs are
separately introduced into different cells. In other embodiments,
the first and second constructs are introduced into the same
cell.
[0024] Suitably, the methods further comprise determining a
comparison of translational efficiencies of individual synonymous
codons in the cell of interest.
[0025] In some embodiments, the methods further comprise: [0026]
introducing an individual synthetic construct into a progenitor of
the cell of interest; and [0027] differentiating the cell of
interest from the progenitor,
[0028] wherein the cell of interest contains the synthetic
construct.
[0029] In yet another aspect, the present invention provides
methods for determining the translational efficiency of a first
codon and a second codon in a first cell type relative to a second
cell type. These methods generally comprise: [0030] providing a
plurality of synthetic constructs, each comprising a regulatory
sequence that is operably connected to a reporter polynucleotide,
wherein the reporter polynucleotide of a first construct comprises
a first coding sequence for interrogating the translational
efficiency of the first codon, wherein the reporter polynucleotide
of a second construct comprises a second coding sequence for
interrogating the translational efficiency of the second codon,
wherein the first and second coding sequences encode the same amino
acid sequence, which defines in whole or in part a reporter
protein, wherein the first coding sequence comprises the first
codon to code for the first amino acid at one or more positions of
the amino acid sequence, wherein the second coding sequence
comprises the second codon to code for the first amino acid at one
or more positions of the amino acid sequence, and wherein the first
and second coding sequences differ from one another in the choice
of the first codon or the second codon to code for the first amino
acid at the corresponding position(s) in the amino acid sequence;
[0031] separately introducing the first construct into the first
cell type and into the second cell type; [0032] separately
introducing the second construct into the first cell type and into
the second cell type; [0033] measuring expression of the reporter
protein in the first cell type and in the second cell type to which
the first construct was provided; [0034] measuring expression of
the reporter protein in the first cell type and in the second cell
type to which the second construct was provided; [0035] determining
the translational efficiency of the first codon in the first cell
type and in the second cell type based on the measured expression
of the reporter protein in the first cell type and in the second
cell type, respectively, to which the first construct was provided,
to thereby determine the translational efficiency of the first
codon in the first cell type relative to the second cell type; and
[0036] determining the translational efficiency of the second codon
in the first cell type and in the second cell type based on the
measured expression of the reporter protein in the first cell type
and in the second cell type, respectively, to which the second
construct was provided, to thereby determine the translational
efficiency of the second codon in the first cell type relative to the
second cell type.
[0037] In some embodiments, the methods further comprise
determining a comparison of translational efficiencies of
individual synonymous codons in the first cell type relative to the
second cell type.
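One simple way to express such a comparison, offered here as an assumption rather than as the method of the invention, is the ratio of the reporter level measured in the first cell type to that measured in the second cell type for the same construct. The function name and the reporter levels are hypothetical.

```python
from typing import Dict


def cell_type_ratio(levels_first: Dict[str, float],
                    levels_second: Dict[str, float]) -> Dict[str, float]:
    """Per-codon ratio of reporter level in the first vs. the second cell type."""
    ratios = {}
    for codon, first_level in levels_first.items():
        second_level = levels_second.get(codon)
        # Skip codons that were not measured, or gave no signal, in the
        # second cell type, since no ratio can be formed for them.
        if not second_level:
            continue
        ratios[codon] = first_level / second_level
    return ratios


# Hypothetical reporter levels (arbitrary units) for two leucine codons.
first_cell_type = {"CTG": 80.0, "TTA": 30.0}
second_cell_type = {"CTG": 40.0, "TTA": 60.0}
ratios = cell_type_ratio(first_cell_type, second_cell_type)
# ratios == {"CTG": 2.0, "TTA": 0.5}
```

With these made-up values, CTG would be translated more efficiently in the first cell type than in the second, and TTA less efficiently.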
[0038] In some embodiments, the methods further comprise: [0039]
introducing an individual synthetic construct into a progenitor of
a cell selected from the first cell type or the second cell type;
and [0040] differentiating the cell from the progenitor,
[0041] wherein the cell contains the synthetic construct.
[0042] Still another aspect of the present invention provides
methods for determining the preference of a first codon relative to
the preference of a second codon for producing a selected phenotype
("the phenotypic preference") in an organism of interest or part
thereof, wherein the first codon and the second codon code for the
same amino acid. These methods generally comprise: [0043] providing
a plurality of synthetic constructs, each comprising a regulatory
sequence that is operably connected to a reporter polynucleotide,
wherein the reporter polynucleotide of a first construct comprises
a first coding sequence for interrogating the phenotypic preference
of the first codon, wherein the reporter polynucleotide of a second
construct comprises a second coding sequence for interrogating the
phenotypic preference of the second codon, wherein the first and
second coding sequences encode the same amino acid sequence, which
defines in whole or in part a reporter protein, which produces, or
which is predicted to produce, the selected phenotype or a
phenotype of the same class as the selected phenotype, wherein the
first coding sequence comprises the first codon to code for the
first amino acid at one or more positions of the amino acid
sequence, wherein the second coding sequence comprises the second
codon to code for the first amino acid at one or more positions of
the amino acid sequence, and wherein the first and second coding
sequences differ from one another in the choice of the first codon
or the second codon to code for the first amino acid at the
corresponding position(s) in the amino acid sequence; [0044]
introducing the first construct into a first test organism or part
thereof, wherein the test organism is selected from the group
consisting of an organism of the same species as the organism of
interest and an organism that is related to the organism of
interest; [0045] introducing the second construct into a second
test organism or part thereof, wherein the second test organism is
of the same type as the first organism; [0046] determining the
quality of the corresponding phenotype displayed by the first test
organism or part and by the second test organism or part; and
[0047] determining the phenotypic preference of the first codon and
the phenotypic preference of the second codon based, respectively,
on the quality of the corresponding phenotype displayed by the
first test organism or part and by the second test organism or
part, to thereby determine the phenotypic preference of the first
codon relative to the phenotypic preference of the second codon in the
organism of interest or part thereof.
[0048] In some embodiments, the methods further comprise
determining a comparison of phenotypic preferences of individual
synonymous codons in the organism of interest or part thereof.
[0049] In some embodiments, the methods further comprise: [0050]
introducing an individual synthetic construct into a progenitor of
the test organism or part; and [0051] growing a non-human organism
or part from the progenitor,
[0052] wherein the organism or part contains the synthetic
construct.
[0053] In other embodiments, the methods further comprise: [0054]
introducing an individual synthetic construct into a progenitor of
the test organism or part; and [0055] growing a non-human organism
or part from the progenitor,
[0056] wherein the organism or part comprises a cell containing the
synthetic construct.
[0057] In still another aspect, the present invention provides
methods of constructing a synthetic polynucleotide from which an
encoded polypeptide is produced at a higher level in a cell of
interest than from a parent polynucleotide that encodes the same
polypeptide. These methods generally comprise: [0058] determining
the translational efficiency of different synonymous codons in
cells of the same type as the cell of interest, as broadly
described above, to thereby determine a comparison of translational
efficiencies of individual synonymous codons in the cell of
interest; [0059] selecting a first codon of the parent
polynucleotide for replacement with a synonymous codon, wherein the
synonymous codon is selected on the basis that it exhibits a higher
translational efficiency than the first codon in the cell of
interest according to the comparison of translational efficiencies;
and [0060] replacing the first codon with the synonymous codon to
construct the synthetic polynucleotide.
[0061] In some embodiments, the synonymous codon is selected on the
basis that it corresponds to an interrogating codon in a synthetic
construct from which the reporter protein is expressed in the cell
of interest at a level that is at least about 10%, 15%, 20%, 25%,
30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or
95% higher or at least about 2, 3, 4, 5, 6, 7, 8, 9, or 10 times
higher than the level of the reporter protein expressed from a
synthetic construct that comprises the first codon as the
interrogating codon.
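The selection and replacement steps described above can be sketched as follows, purely by way of example. The function name, the codon groups, the reporter levels and the parent sequence are all hypothetical; the `min_increase` parameter stands in for the "at least about 10%" style thresholds recited above, and replacing each codon with the single best-measured synonymous codon is an illustrative choice, not a requirement of the method.

```python
from typing import Dict, Sequence, Tuple


def raise_expression(parent: str,
                     groups: Sequence[Tuple[str, ...]],
                     levels: Dict[str, float],
                     min_increase: float = 0.10) -> str:
    """Replace codons with synonymous codons of sufficiently higher efficiency.

    levels maps an interrogating codon to the reporter level measured in the
    cell of interest; min_increase=0.10 requires a level at least 10% higher.
    """
    group_of = {codon: group for group in groups for codon in group}
    out = []
    for i in range(0, len(parent), 3):
        codon = parent[i:i + 3]
        group = group_of.get(codon)
        if group is not None and codon in levels:
            measured = [c for c in group if c in levels]
            best = max(measured, key=levels.get)
            if levels[best] >= levels[codon] * (1.0 + min_increase):
                codon = best
        # Codons without measurements (e.g. ATG, stop codons) are kept as-is.
        out.append(codon)
    return "".join(out)


# Hypothetical synonymous groups and reporter levels (arbitrary units).
GROUPS = [("GCT", "GCC", "GCA", "GCG"), ("AAA", "AAG")]
LEVELS = {"GCT": 100.0, "GCC": 80.0, "GCA": 50.0, "GCG": 40.0,
          "AAA": 70.0, "AAG": 75.0}

synthetic = raise_expression("ATGGCGAAAGCTTAA", GROUPS, LEVELS)
# synthetic == "ATGGCTAAAGCTTAA"
```

Here GCG is replaced with GCT, whereas AAA is retained because AAG is less than 10% higher under these made-up values. Reversing the comparison would give the corresponding sketch for producing the polypeptide at a lower level.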
[0062] A further aspect of the present invention provides methods
of constructing a synthetic polynucleotide from which an encoded
polypeptide is produced at a lower level in a cell of interest than
from a parent polynucleotide that encodes the same polypeptide.
These methods generally comprise: [0063] determining the
translational efficiency of different synonymous codons in cells of
the same type as the cell of interest, as broadly described above,
to thereby determine a comparison of translational efficiencies of
individual synonymous codons in the cell of interest; [0064]
selecting a first codon of the parent polynucleotide for
replacement with a synonymous codon, wherein the synonymous codon
is selected on the basis that it exhibits a lower translational
efficiency than the first codon in the cell of interest according
to the comparison of translational efficiencies; and [0065]
replacing the first codon with the synonymous codon to construct
the synthetic polynucleotide.
[0066] In some embodiments, the synonymous codon is selected on the
basis that it corresponds to an interrogating codon in a synthetic
construct from which the reporter protein is expressed in the cell
of interest at a level that is no more than 95%, 90%, 85%, 80%,
75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%,
10%, 5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of the level of the
reporter protein expressed from a synthetic construct that
comprises the first codon as the interrogating codon.
[0067] Still another aspect of the present invention provides
methods of constructing a synthetic polynucleotide from which an
encoded polypeptide is produced at a higher level in a first cell
than in a second cell. These methods generally comprise: [0068]
determining the translational efficiency of different synonymous
codons in cells of the same type as the first cell and in cells of
the same type as the second cell, as broadly described above, to
thereby determine a comparison of translational efficiencies of
individual synonymous codons between the first cell and the second
cell; [0069] selecting a first codon of the parent polynucleotide
for replacement with a synonymous codon, wherein the synonymous
codon is selected on the basis that it exhibits a higher
translational efficiency in the first cell than in the second cell
according to the comparison of translational efficiencies; and
[0070] replacing the first codon with the synonymous codon to
construct the synthetic polynucleotide.
[0071] In some embodiments, the synonymous codon is the same as the
interrogating codon in a synthetic construct from which the
reporter protein is expressed in the first cell at a level that is
at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%,
60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher or at least about
2, 3, 4, 5, 6, 7, 8, 9, or 10 times higher than the level of the
reporter protein expressed from the same synthetic construct in the
second cell.
[0072] In yet another aspect, the present invention provides
methods of constructing a synthetic polynucleotide from which a
polypeptide is producible to confer a selected phenotype upon an
organism of interest or part thereof in a different quality than
that conferred by a parent polynucleotide that encodes the same
polypeptide. These methods generally comprise: [0073] determining
the preference of different synonymous codons for producing the
selected phenotype ("the phenotypic preference") in test organisms
or parts thereof, as broadly described above, wherein the test
organisms are selected from the group consisting of an organism of
the same species as the organism of interest and an organism that
is related to the organism of interest, to thereby determine a
comparison of phenotypic preferences of individual synonymous
codons in the organism of interest; [0074] selecting a first codon
of the parent polynucleotide for replacement with a synonymous
codon, wherein the synonymous codon is selected on the basis that
it exhibits a different phenotypic preference than the first codon
in the comparison of phenotypic preferences in the organism or part
thereof; and [0075] replacing the first codon with the synonymous
codon to construct the synthetic polynucleotide.
[0076] In some embodiments, the synthetic polynucleotide confers
the selected phenotype upon the organism of interest or part
thereof in a higher quality than that conferred by the parent
polynucleotide. In illustrative examples of this type, the
synonymous codon is selected on the basis that it corresponds to an
interrogating codon in a synthetic construct that confers the
selected phenotype in the organism of interest or part thereof in a
quality that is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%,
45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher or
at least about 2, 3, 4, 5, 6, 7, 8, 9, or 10 times higher than the
quality of the phenotype conferred by the synthetic construct
comprising the first codon as the interrogating codon.
[0077] In other embodiments, the synthetic polynucleotide confers
the selected phenotype upon the organism of interest or part
thereof in a lower quality than that conferred by the parent
polynucleotide. In illustrative examples of this type, the
synonymous codon is selected on the basis that it corresponds to an
interrogating codon in a synthetic construct that confers the
selected phenotype in the organism of interest or part thereof in a
quality that is no more than 95%, 90%, 85%, 80%, 75%, 70%, 65%,
60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 1%,
0.5%, 0.1%, 0.05% or 0.01% of the quality of the phenotype
conferred by the synthetic construct comprising the first codon as
the interrogating codon.
[0078] The construct system of the present invention has been used
to determine a ranking of individual synonymous codons according to
their preference for producing an immune response, including a
humoral immune response, to an antigen in a mammal. Significantly,
this ranking is not coterminous with a ranking of codon frequency
values derivable from an analysis of the frequency with which
codons are used to encode their corresponding amino acids across a
collection of highly expressed mammalian protein-encoding genes, as
for example disclosed by Seed (supra). Nor is it coterminous with a
ranking of translational efficiency values obtained from an
analysis of the translational efficiencies of codons in specific
cell types, as disclosed for example in WO 99/02694 for COS-1 cells
and epithelial cells and in WO 2004/024915 for CHO cells. As a
result, the present invention enables for the first time the
construction of antigen-encoding polynucleotides, which are
codon-optimized for efficient production of immune responses,
including humoral immune responses, in a mammal.
[0079] Accordingly, in yet another aspect, methods are provided for
constructing a synthetic polynucleotide from which a polypeptide is
producible to confer an immune response to a target antigen in a
mammal in a different quality than that conferred by a parent
polynucleotide that encodes the same polypeptide, wherein the
polypeptide corresponds to at least a portion of the target
antigen. These methods generally comprise: (a) selecting a first
codon of the parent polynucleotide for replacement with a
synonymous codon, wherein the synonymous codon is selected on the
basis that it exhibits a different preference for conferring an
immune response ("an immune response preference") than the first
codon in a comparison of immune response preferences; and (b)
replacing the first codon with the synonymous codon to construct
the synthetic polynucleotide, wherein the comparison of immune
response preferences of the codons is represented by TABLE 1:
TABLE 1
Amino Acid | Ranking of Immune Response Preferences for Synonymous Codons
Ala | Ala.sup.GCT > Ala.sup.GCC > (Ala.sup.GCA, Ala.sup.GCG)
Arg | (Arg.sup.CGA, Arg.sup.CGC, Arg.sup.CGT, Arg.sup.AGA) > (Arg.sup.AGG, Arg.sup.CGG)
Asn | Asn.sup.AAC > Asn.sup.AAT
Asp | Asp.sup.GAC > Asp.sup.GAT
Cys | Cys.sup.TGC > Cys.sup.TGT
Glu | Glu.sup.GAA > Glu.sup.GAG
Gln | Gln.sup.CAA = Gln.sup.CAG
Gly | Gly.sup.GGA > (Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC)
His | His.sup.CAC = His.sup.CAT
Ile | Ile.sup.ATC >> Ile.sup.ATT > Ile.sup.ATA
Leu | (Leu.sup.CTG, Leu.sup.CTC) > (Leu.sup.CTA, Leu.sup.CTT) >> Leu.sup.TTG > Leu.sup.TTA
Lys | Lys.sup.AAG = Lys.sup.AAA
Phe | Phe.sup.TTT > Phe.sup.TTC
Pro | Pro.sup.CCC > Pro.sup.CCT >> (Pro.sup.CCA, Pro.sup.CCG)
Ser | Ser.sup.TCG >> (Ser.sup.TCT, Ser.sup.TCA, Ser.sup.TCC) >> (Ser.sup.AGC, Ser.sup.AGT)
Thr | Thr.sup.ACG > Thr.sup.ACC >> Thr.sup.ACA > Thr.sup.ACT
Tyr | Tyr.sup.TAC > Tyr.sup.TAT
Val | (Val.sup.GTG, Val.sup.GTC) > Val.sup.GTT > Val.sup.GTA
[0080] Thus, a stronger or enhanced immune response to the target
antigen (e.g., an immune response that is at least about 110%,
150%, 200%, 300%, 400%, 500%, 600%, 700%, 800%, 900%, 1000% and all
integer percentages in between, of that produced from the parent
polynucleotide under identical conditions) can be achieved by
selecting a synonymous codon that has a higher immune response
preference than the first codon it replaces. In specific
embodiments, the synonymous codon is selected such that its immune
response preference is at least about 10% (and at least about 11% to
at least about 1000% and all integer percentages in between) higher
than the immune response preference of the codon it replaces. In
illustrative examples of this type,
the first and synonymous codons are selected from TABLE 2:
TABLE-US-00002 TABLE 2
First Codon | Synonymous Codon
Ala.sup.GCG | Ala.sup.GCT
Ala.sup.GCG | Ala.sup.GCC
Ala.sup.GCA | Ala.sup.GCT
Ala.sup.GCA | Ala.sup.GCC
Ala.sup.GCC | Ala.sup.GCT
Arg.sup.CGG | Arg.sup.CGA
Arg.sup.CGG | Arg.sup.CGC
Arg.sup.CGG | Arg.sup.CGT
Arg.sup.CGG | Arg.sup.AGA
Arg.sup.AGG | Arg.sup.CGA
Arg.sup.AGG | Arg.sup.CGC
Arg.sup.AGG | Arg.sup.CGT
Arg.sup.AGG | Arg.sup.AGA
Asn.sup.AAT | Asn.sup.AAC
Asp.sup.GAT | Asp.sup.GAC
Cys.sup.TGT | Cys.sup.TGC
Glu.sup.GAG | Glu.sup.GAA
Gly.sup.GGC | Gly.sup.GGA
Gly.sup.GGT | Gly.sup.GGA
Gly.sup.GGG | Gly.sup.GGA
Ile.sup.ATA | Ile.sup.ATC
Ile.sup.ATA | Ile.sup.ATT
Ile.sup.ATT | Ile.sup.ATC
Leu.sup.TTA | Leu.sup.CTG
Leu.sup.TTA | Leu.sup.CTC
Leu.sup.TTA | Leu.sup.CTA
Leu.sup.TTA | Leu.sup.CTT
Leu.sup.TTA | Leu.sup.TTG
Leu.sup.TTG | Leu.sup.CTG
Leu.sup.TTG | Leu.sup.CTC
Leu.sup.TTG | Leu.sup.CTA
Leu.sup.TTG | Leu.sup.CTT
Leu.sup.CTT | Leu.sup.CTG
Leu.sup.CTT | Leu.sup.CTC
Leu.sup.CTA | Leu.sup.CTG
Leu.sup.CTA | Leu.sup.CTC
Phe.sup.TTC | Phe.sup.TTT
Pro.sup.CCG | Pro.sup.CCC
Pro.sup.CCG | Pro.sup.CCT
Pro.sup.CCA | Pro.sup.CCC
Pro.sup.CCA | Pro.sup.CCT
Pro.sup.CCT | Pro.sup.CCC
Ser.sup.AGT | Ser.sup.TCG
Ser.sup.AGT | Ser.sup.TCT
Ser.sup.AGT | Ser.sup.TCA
Ser.sup.AGT | Ser.sup.TCC
Ser.sup.AGC | Ser.sup.TCG
Ser.sup.AGC | Ser.sup.TCT
Ser.sup.AGC | Ser.sup.TCA
Ser.sup.AGC | Ser.sup.TCC
Ser.sup.TCC | Ser.sup.TCG
Ser.sup.TCA | Ser.sup.TCG
Ser.sup.TCT | Ser.sup.TCG
Thr.sup.ACT | Thr.sup.ACG
Thr.sup.ACT | Thr.sup.ACC
Thr.sup.ACT | Thr.sup.ACA
Thr.sup.ACA | Thr.sup.ACG
Thr.sup.ACA | Thr.sup.ACC
Thr.sup.ACC | Thr.sup.ACG
Tyr.sup.TAT | Tyr.sup.TAC
Val.sup.GTA | Val.sup.GTG
Val.sup.GTA | Val.sup.GTC
Val.sup.GTA | Val.sup.GTT
Val.sup.GTT | Val.sup.GTG
Val.sup.GTT | Val.sup.GTC
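The pairs listed in TABLE 2 appear to correspond to every combination in which the synonymous codon ranks above the first codon in TABLE 1. The following Python sketch enumerates such pairs for an excerpt of three amino acids. The tier encoding and the names `TIERS` and `enhancing_pairs` are illustrative assumptions, not part of the disclosure.

```python
from itertools import combinations

# Excerpt of TABLE 1 for three amino acids (assumed encoding: tiers ordered
# from highest to lowest immune response preference).
TIERS = {
    "Ala": (("GCT",), ("GCC",), ("GCA", "GCG")),
    "Ile": (("ATC",), ("ATT",), ("ATA",)),
    "Pro": (("CCC",), ("CCT",), ("CCA", "CCG")),
}


def enhancing_pairs(tiers):
    """Return all (first codon, synonymous codon) pairs in which the
    synonymous codon sits in a higher tier than the first codon."""
    pairs = set()
    # combinations() preserves order, so `higher` always precedes `lower`.
    for higher, lower in combinations(tiers, 2):
        for synonymous in higher:
            for first in lower:
                pairs.add((first, synonymous))
    return pairs


for amino_acid, tiers in TIERS.items():
    for first, synonymous in sorted(enhancing_pairs(tiers)):
        print(f"{amino_acid}: {first} -> {synonymous}")
```

Swapping the two members of each pair yields the corresponding pairs for a weaker or reduced response.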
[0081] In other illustrative examples of this type, the first and
synonymous codons are selected from TABLE 3:
TABLE-US-00003 TABLE 3
First Codon | Synonymous Codon
Ala.sup.GCG | Ala.sup.GCT
Ala.sup.GCA | Ala.sup.GCT
Ala.sup.GCC | Ala.sup.GCT
Arg.sup.CGG | Arg.sup.CGA
Arg.sup.CGG | Arg.sup.CGT
Arg.sup.CGG | Arg.sup.AGA
Arg.sup.AGG | Arg.sup.CGA
Arg.sup.AGG | Arg.sup.CGT
Arg.sup.AGG | Arg.sup.AGA
Glu.sup.GAG | Glu.sup.GAA
Gly.sup.GGC | Gly.sup.GGA
Gly.sup.GGT | Gly.sup.GGA
Gly.sup.GGG | Gly.sup.GGA
Leu.sup.TTA | Leu.sup.CTA
Leu.sup.TTA | Leu.sup.CTT
Leu.sup.TTA | Leu.sup.TTG
Leu.sup.TTG | Leu.sup.CTA
Leu.sup.TTG | Leu.sup.CTT
Phe.sup.TTC | Phe.sup.TTT
Pro.sup.CCG | Pro.sup.CCT
Pro.sup.CCA | Pro.sup.CCT
Ser.sup.AGT | Ser.sup.TCG
Ser.sup.AGT | Ser.sup.TCT
Ser.sup.AGT | Ser.sup.TCA
Ser.sup.AGC | Ser.sup.TCG
Ser.sup.AGC | Ser.sup.TCT
Ser.sup.AGC | Ser.sup.TCA
Ser.sup.AGC | Ser.sup.TCC
Ser.sup.TCC | Ser.sup.TCG
Ser.sup.TCA | Ser.sup.TCG
Ser.sup.TCT | Ser.sup.TCG
Thr.sup.ACT | Thr.sup.ACG
Thr.sup.ACT | Thr.sup.ACA
Thr.sup.ACA | Thr.sup.ACG
Thr.sup.ACC | Thr.sup.ACG
Val.sup.GTA | Val.sup.GTT
[0082] Suitably, in some of the illustrative examples noted above,
the method further comprises: (a) selecting a second codon of the
parent polynucleotide for replacement with a synonymous codon,
wherein the synonymous codon is selected on the basis that it
exhibits a higher immune response preference than the second codon
in a comparison of immune response preferences; and (b) replacing
the second codon with the synonymous codon, wherein the comparison
of immune response preferences of the codons is represented by
TABLE 4:
TABLE-US-00004 TABLE 4
Second Codon | Synonymous Codon
Ala.sup.GCG | Ala.sup.GCT
Ala.sup.GCG | Ala.sup.GCC
Ala.sup.GCA | Ala.sup.GCT
Ala.sup.GCA | Ala.sup.GCC
Ala.sup.GCC | Ala.sup.GCT
Arg.sup.CGG | Arg.sup.CGA
Arg.sup.CGG | Arg.sup.CGC
Arg.sup.CGG | Arg.sup.CGT
Arg.sup.CGG | Arg.sup.AGA
Arg.sup.AGG | Arg.sup.CGA
Arg.sup.AGG | Arg.sup.CGC
Arg.sup.AGG | Arg.sup.CGT
Arg.sup.AGG | Arg.sup.AGA
Asn.sup.AAT | Asn.sup.AAC
Asp.sup.GAT | Asp.sup.GAC
Cys.sup.TGT | Cys.sup.TGC
Glu.sup.GAG | Glu.sup.GAA
Gly.sup.GGC | Gly.sup.GGA
Gly.sup.GGT | Gly.sup.GGA
Gly.sup.GGG | Gly.sup.GGA
Ile.sup.ATA | Ile.sup.ATC
Ile.sup.ATA | Ile.sup.ATT
Ile.sup.ATT | Ile.sup.ATC
Leu.sup.TTA | Leu.sup.CTG
Leu.sup.TTA | Leu.sup.CTC
Leu.sup.TTA | Leu.sup.CTA
Leu.sup.TTA | Leu.sup.CTT
Leu.sup.TTA | Leu.sup.TTG
Leu.sup.TTG | Leu.sup.CTG
Leu.sup.TTG | Leu.sup.CTC
Leu.sup.TTG | Leu.sup.CTA
Leu.sup.TTG | Leu.sup.CTT
Leu.sup.CTT | Leu.sup.CTG
Leu.sup.CTT | Leu.sup.CTC
Leu.sup.CTA | Leu.sup.CTG
Leu.sup.CTA | Leu.sup.CTC
Phe.sup.TTC | Phe.sup.TTT
Pro.sup.CCG | Pro.sup.CCC
Pro.sup.CCG | Pro.sup.CCT
Pro.sup.CCA | Pro.sup.CCC
Pro.sup.CCA | Pro.sup.CCT
Pro.sup.CCT | Pro.sup.CCC
Ser.sup.AGT | Ser.sup.TCG
Ser.sup.AGT | Ser.sup.TCT
Ser.sup.AGT | Ser.sup.TCA
Ser.sup.AGT | Ser.sup.TCC
Ser.sup.AGC | Ser.sup.TCG
Ser.sup.AGC | Ser.sup.TCT
Ser.sup.AGC | Ser.sup.TCA
Ser.sup.AGC | Ser.sup.TCC
Ser.sup.TCC | Ser.sup.TCG
Ser.sup.TCA | Ser.sup.TCG
Ser.sup.TCT | Ser.sup.TCG
Thr.sup.ACT | Thr.sup.ACG
Thr.sup.ACT | Thr.sup.ACC
Thr.sup.ACT | Thr.sup.ACA
Thr.sup.ACA | Thr.sup.ACG
Thr.sup.ACA | Thr.sup.ACC
Thr.sup.ACC | Thr.sup.ACG
Tyr.sup.TAT | Tyr.sup.TAC
Val.sup.GTA | Val.sup.GTG
Val.sup.GTA | Val.sup.GTC
Val.sup.GTA | Val.sup.GTT
Val.sup.GTT | Val.sup.GTG
Val.sup.GTT | Val.sup.GTC
[0083] Conversely, a weaker or reduced immune response to the
target antigen (e.g., an immune response that is less than about
90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 10%, 1% and all integer
percentages in between, of that produced from the parent
polynucleotide under identical conditions) can be achieved by
selecting a synonymous codon that has a lower immune response
preference than the first codon it replaces. In specific
embodiments of this type, the synonymous codon is selected such
that it has an immune response preference that is less than about
90% of the immune response preference of the codon it replaces. In
illustrative examples, the first and synonymous codons are selected
from TABLE 5:
TABLE-US-00005 TABLE 5
First Codon | Synonymous Codon
Ala.sup.GCT | Ala.sup.GCG
Ala.sup.GCT | Ala.sup.GCA
Ala.sup.GCT | Ala.sup.GCC
Ala.sup.GCC | Ala.sup.GCG
Ala.sup.GCC | Ala.sup.GCA
Arg.sup.CGA | Arg.sup.AGG
Arg.sup.CGA | Arg.sup.CGG
Arg.sup.CGC | Arg.sup.AGG
Arg.sup.CGC | Arg.sup.CGG
Arg.sup.CGT | Arg.sup.AGG
Arg.sup.CGT | Arg.sup.CGG
Arg.sup.AGA | Arg.sup.AGG
Arg.sup.AGA | Arg.sup.CGG
Asn.sup.AAC | Asn.sup.AAT
Asp.sup.GAC | Asp.sup.GAT
Cys.sup.TGC | Cys.sup.TGT
Glu.sup.GAA | Glu.sup.GAG
Gly.sup.GGA | Gly.sup.GGC
Gly.sup.GGA | Gly.sup.GGT
Gly.sup.GGA | Gly.sup.GGG
Ile.sup.ATC | Ile.sup.ATA
Ile.sup.ATC | Ile.sup.ATT
Ile.sup.ATT | Ile.sup.ATA
Leu.sup.CTG | Leu.sup.CTA
Leu.sup.CTG | Leu.sup.CTT
Leu.sup.CTG | Leu.sup.TTG
Leu.sup.CTG | Leu.sup.TTA
Leu.sup.CTC | Leu.sup.CTA
Leu.sup.CTC | Leu.sup.CTT
Leu.sup.CTC | Leu.sup.TTG
Leu.sup.CTC | Leu.sup.TTA
Leu.sup.CTA | Leu.sup.TTG
Leu.sup.CTA | Leu.sup.TTA
Leu.sup.CTT | Leu.sup.TTG
Leu.sup.CTT | Leu.sup.TTA
Leu.sup.TTG | Leu.sup.TTA
Phe.sup.TTT | Phe.sup.TTC
Pro.sup.CCC | Pro.sup.CCT
Pro.sup.CCC | Pro.sup.CCA
Pro.sup.CCC | Pro.sup.CCG
Pro.sup.CCT | Pro.sup.CCA
Pro.sup.CCT | Pro.sup.CCG
Ser.sup.TCG | Ser.sup.TCT
Ser.sup.TCG | Ser.sup.TCA
Ser.sup.TCG | Ser.sup.TCC
Ser.sup.TCG | Ser.sup.AGC
Ser.sup.TCG | Ser.sup.AGT
Ser.sup.TCT | Ser.sup.AGC
Ser.sup.TCT | Ser.sup.AGT
Ser.sup.TCA | Ser.sup.AGC
Ser.sup.TCA | Ser.sup.AGT
Ser.sup.TCC | Ser.sup.AGC
Ser.sup.TCC | Ser.sup.AGT
Thr.sup.ACG | Thr.sup.ACC
Thr.sup.ACG | Thr.sup.ACA
Thr.sup.ACG | Thr.sup.ACT
Thr.sup.ACC | Thr.sup.ACA
Thr.sup.ACC | Thr.sup.ACT
Thr.sup.ACA | Thr.sup.ACT
Tyr.sup.TAC | Tyr.sup.TAT
Val.sup.GTG | Val.sup.GTT
Val.sup.GTG | Val.sup.GTA
Val.sup.GTC | Val.sup.GTT
Val.sup.GTC | Val.sup.GTA
Val.sup.GTT | Val.sup.GTA
[0084] In other illustrative examples, the first and synonymous
codons are selected from TABLE 6:
TABLE-US-00006 TABLE 6
First Codon | Synonymous Codon
Ala.sup.GCT | Ala.sup.GCG
Ala.sup.GCT | Ala.sup.GCA
Ala.sup.GCT | Ala.sup.GCC
Arg.sup.CGA | Arg.sup.AGG
Arg.sup.CGA | Arg.sup.CGG
Arg.sup.CGT | Arg.sup.AGG
Arg.sup.CGT | Arg.sup.CGG
Arg.sup.AGA | Arg.sup.AGG
Arg.sup.AGA | Arg.sup.CGG
Glu.sup.GAA | Glu.sup.GAG
Gly.sup.GGA | Gly.sup.GGC
Gly.sup.GGA | Gly.sup.GGT
Gly.sup.GGA | Gly.sup.GGG
Leu.sup.CTA | Leu.sup.TTG
Leu.sup.CTA | Leu.sup.TTA
Leu.sup.CTT | Leu.sup.TTG
Leu.sup.CTT | Leu.sup.TTA
Leu.sup.TTG | Leu.sup.TTA
Phe.sup.TTT | Phe.sup.TTC
Pro.sup.CCT | Pro.sup.CCA
Pro.sup.CCT | Pro.sup.CCG
Ser.sup.TCG | Ser.sup.TCT
Ser.sup.TCG | Ser.sup.TCA
Ser.sup.TCG | Ser.sup.TCC
Ser.sup.TCG | Ser.sup.AGC
Ser.sup.TCG | Ser.sup.AGT
Ser.sup.TCT | Ser.sup.AGC
Ser.sup.TCT | Ser.sup.AGT
Ser.sup.TCA | Ser.sup.AGC
Ser.sup.TCA | Ser.sup.AGT
Ser.sup.TCC | Ser.sup.AGC
Thr.sup.ACG | Thr.sup.ACC
Thr.sup.ACG | Thr.sup.ACA
Thr.sup.ACG | Thr.sup.ACT
Thr.sup.ACA | Thr.sup.ACT
Val.sup.GTT | Val.sup.GTA
[0085] Suitably, in some of the illustrative examples noted above,
the method further comprises: (a) selecting a second codon of the
parent polynucleotide for replacement with a synonymous codon,
wherein the synonymous codon is selected on the basis that it
exhibits a lower immune response preference than the second codon
in a comparison of immune response preferences; and (b) replacing
the second codon with the synonymous codon, wherein the comparison
of immune response preferences of the codons is represented by
TABLE 7:
TABLE-US-00007 TABLE 7
Second Codon | Synonymous Codon
Ala.sup.GCT | Ala.sup.GCG
Ala.sup.GCT | Ala.sup.GCA
Ala.sup.GCT | Ala.sup.GCC
Ala.sup.GCC | Ala.sup.GCG
Ala.sup.GCC | Ala.sup.GCA
Arg.sup.CGA | Arg.sup.AGG
Arg.sup.CGA | Arg.sup.CGG
Arg.sup.CGC | Arg.sup.AGG
Arg.sup.CGC | Arg.sup.CGG
Arg.sup.CGT | Arg.sup.AGG
Arg.sup.CGT | Arg.sup.CGG
Arg.sup.AGA | Arg.sup.AGG
Arg.sup.AGA | Arg.sup.CGG
Asn.sup.AAC | Asn.sup.AAT
Asp.sup.GAC | Asp.sup.GAT
Cys.sup.TGC | Cys.sup.TGT
Glu.sup.GAA | Glu.sup.GAG
Gly.sup.GGA | Gly.sup.GGC
Gly.sup.GGA | Gly.sup.GGT
Gly.sup.GGA | Gly.sup.GGG
Ile.sup.ATC | Ile.sup.ATA
Ile.sup.ATC | Ile.sup.ATT
Ile.sup.ATT | Ile.sup.ATA
Leu.sup.CTG | Leu.sup.CTA
Leu.sup.CTG | Leu.sup.CTT
Leu.sup.CTG | Leu.sup.TTG
Leu.sup.CTG | Leu.sup.TTA
Leu.sup.CTC | Leu.sup.CTA
Leu.sup.CTC | Leu.sup.CTT
Leu.sup.CTC | Leu.sup.TTG
Leu.sup.CTC | Leu.sup.TTA
Leu.sup.CTA | Leu.sup.TTG
Leu.sup.CTA | Leu.sup.TTA
Leu.sup.CTT | Leu.sup.TTG
Leu.sup.CTT | Leu.sup.TTA
Leu.sup.TTG | Leu.sup.TTA
Phe.sup.TTT | Phe.sup.TTC
Pro.sup.CCC | Pro.sup.CCT
Pro.sup.CCC | Pro.sup.CCA
Pro.sup.CCC | Pro.sup.CCG
Pro.sup.CCT | Pro.sup.CCA
Pro.sup.CCT | Pro.sup.CCG
Ser.sup.TCG | Ser.sup.TCT
Ser.sup.TCG | Ser.sup.TCA
Ser.sup.TCG | Ser.sup.TCC
Ser.sup.TCG | Ser.sup.AGC
Ser.sup.TCG | Ser.sup.AGT
Ser.sup.TCT | Ser.sup.AGC
Ser.sup.TCT | Ser.sup.AGT
Ser.sup.TCA | Ser.sup.AGC
Ser.sup.TCA | Ser.sup.AGT
Ser.sup.TCC | Ser.sup.AGC
Ser.sup.TCC | Ser.sup.AGT
Thr.sup.ACG | Thr.sup.ACC
Thr.sup.ACG | Thr.sup.ACA
Thr.sup.ACG | Thr.sup.ACT
Thr.sup.ACC | Thr.sup.ACA
Thr.sup.ACC | Thr.sup.ACT
Thr.sup.ACA | Thr.sup.ACT
Tyr.sup.TAC | Tyr.sup.TAT
Val.sup.GTG | Val.sup.GTT
Val.sup.GTG | Val.sup.GTA
Val.sup.GTC | Val.sup.GTT
Val.sup.GTC | Val.sup.GTA
Val.sup.GTT | Val.sup.GTA
[0086] In still another aspect, the invention provides a synthetic
polynucleotide constructed according to any one of the above
methods.
[0087] In accordance with the present invention, synthetic
polynucleotides that are constructed by methods described herein
are useful for expression in a mammal to elicit an immune response
to a target antigen. Accordingly, in yet another aspect, the
present invention provides chimeric constructs that comprise a
synthetic polynucleotide of the invention, which is operably
connected to a regulatory sequence.
[0088] In some embodiments, the chimeric construct is in the form
of a pharmaceutical composition that optionally comprises a
pharmaceutically acceptable excipient and/or carrier. Accordingly,
in another aspect, the invention provides pharmaceutical
compositions that are useful for modulating an immune response to a
target antigen in a mammal, which response is conferred by the
expression of a parent polynucleotide that encodes a polypeptide
corresponding to at least a portion of the target antigen. These
compositions generally comprise a chimeric construct and a
pharmaceutically acceptable excipient and/or carrier, wherein the
chimeric construct comprises a synthetic polynucleotide that is
operably connected to a regulatory sequence and that is
distinguished from the parent polynucleotide by the replacement of
a first codon in the parent polynucleotide with a synonymous codon
that has a different immune response preference than the first
codon and wherein the first and synonymous codons are selected
according to any one of TABLES 2, 3, 5 and 6. In some embodiments,
the compositions further comprise an adjuvant that enhances the
effectiveness of the immune response. In some embodiments, the
composition is formulated for transcutaneous or dermal
administration, e.g., by biolistic or microneedle delivery or by
intradermal injection. Suitably, in embodiments in which a stronger
or enhanced immune response to the target antigen is desired, the
first and synonymous codons are selected according to TABLES 2 or
3. Conversely, in embodiments in which a weaker or reduced immune
response to the target antigen is desired, the first and synonymous
codons are selected according to TABLES 5 or 6.
[0089] In yet another aspect, the invention embraces methods of
modulating the quality of an immune response to a target antigen in
a mammal, which response is conferred by the expression of a parent
polynucleotide that encodes a polypeptide corresponding to at least
a portion of the target antigen. These methods generally comprise:
introducing into the mammal a chimeric construct comprising a
synthetic polynucleotide that is
operably connected to a regulatory sequence and that is
distinguished from the parent polynucleotide by the replacement of
a first codon in the parent polynucleotide with a synonymous codon
that has a different immune response preference than the first
codon and wherein the first and synonymous codons are selected
according to any one of TABLES 2, 3, 5 and 6. In these methods,
expression of the synthetic polynucleotide results in a different
quality (e.g., stronger or weaker) of immune response than the one
obtained through expression of the parent polynucleotide under the
same conditions. Suitably, the chimeric construct is introduced
into the mammal by delivering the construct to antigen-presenting
cells (e.g., dendritic cells, macrophages, Langerhans cells or
their precursors) of the mammal. In some embodiments, the chimeric
construct is introduced into the dermis and/or epidermis of the
mammal (e.g., by transcutaneous or intradermal administration) and
in this regard any suitable administration site is envisaged
including the abdomen. Generally, the immune response is selected
from a cell-mediated response and a humoral immune response. In
specific embodiments, the immune response is a humoral immune
response.
[0090] In a related aspect, the invention encompasses methods of
enhancing the quality of an immune response to a target antigen in
a mammal, which response is conferred by the expression of a parent
polynucleotide that encodes a polypeptide corresponding to at least
a portion of the target antigen. These methods generally comprise:
introducing into the mammal a chimeric construct comprising a
synthetic polynucleotide that is operably connected to a regulatory
sequence and that is distinguished from the parent polynucleotide
by the replacement of a first codon in the parent polynucleotide
with a synonymous codon that has a higher immune response
preference than the first codon, wherein the first and synonymous
codons are selected according to TABLES 2 or 3. In these methods,
expression of the synthetic polynucleotide typically results in a
stronger or enhanced immune response than the one obtained through
expression of the parent polynucleotide under the same
conditions.
[0091] In another related aspect, the invention extends to methods
of reducing the quality of an immune response to a target antigen
in a mammal, which response is conferred by the expression of a
parent polynucleotide that encodes a polypeptide corresponding to
at least a portion of the target antigen. These methods generally
comprise: introducing into the mammal a chimeric construct
comprising a synthetic polynucleotide that is operably connected to
a regulatory sequence and that is distinguished from the parent
polynucleotide by the replacement of a first codon in the parent
polynucleotide with a synonymous codon that has a lower immune
response preference than the first codon, wherein the first and
synonymous codons are selected according to TABLES 5 or 6. In these
methods, expression of the synthetic polynucleotide typically
results in a weaker or reduced immune response than the one
obtained through expression of the parent polynucleotide under the
same conditions.
[0092] Yet a further aspect of the present invention embraces
methods of enhancing the quality of an immune response to a target
antigen in a mammal, which response is conferred by the expression
of a first polynucleotide that encodes a polypeptide corresponding
to at least a portion of the target antigen. These methods
generally comprise: co-introducing into the mammal a first nucleic
acid construct comprising the first polynucleotide in operable
connection with a regulatory sequence; and a second nucleic acid
construct comprising a second polynucleotide that is operably
connected to a regulatory sequence and that encodes an iso-tRNA
corresponding to a codon of the first polynucleotide, wherein the
codon has a low or intermediate immune response preference and is
selected from the group consisting of Ala.sup.GCA, Ala.sup.GCG,
Ala.sup.GCC, Arg.sup.AGG, Arg.sup.CGG, Asn.sup.AAT, Asp.sup.GAT,
Cys.sup.TGT, Glu.sup.GAG, Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC,
Ile.sup.ATA, Ile.sup.ATT, Leu.sup.TTG, Leu.sup.TTA, Leu.sup.CTA,
Leu.sup.CTT, Phe.sup.TTC, Pro.sup.CCA, Pro.sup.CCG, Pro.sup.CCT,
Ser.sup.AGC, Ser.sup.AGT, Ser.sup.TCT, Ser.sup.TCA, Ser.sup.TCC,
Thr.sup.ACA, Thr.sup.ACT, Tyr.sup.TAT, Val.sup.GTA and Val.sup.GTT.
In specific embodiments, the codon has a `low` immune response
preference, and is selected from the group consisting of
Ala.sup.GCA, Ala.sup.GCG, Arg.sup.CGG, Asn.sup.AAT, Asp.sup.GAT,
Cys.sup.TGT, Glu.sup.GAG, Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC,
Ile.sup.ATA, Leu.sup.TTG, Leu.sup.TTA, Phe.sup.TTC, Pro.sup.CCA,
Pro.sup.CCG, Ser.sup.AGC, Ser.sup.AGT, Thr.sup.ACT, Tyr.sup.TAT and
Val.sup.GTA.
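To identify which codons of a first polynucleotide might call for a co-introduced iso-tRNA, the coding sequence can be scanned against the two codon groups listed above. The following Python sketch does this; the names `LOW_OR_INTERMEDIATE`, `LOW` and `candidate_codons` and the example sequence are hypothetical, and the sketch assumes an in-frame, upper-case DNA sequence.

```python
from collections import Counter

# Codons listed in the text as having a low or intermediate immune response
# preference.
LOW_OR_INTERMEDIATE = frozenset(
    "GCA GCG GCC AGG CGG AAT GAT TGT GAG GGG GGT GGC ATA ATT TTG TTA "
    "CTA CTT TTC CCA CCG CCT AGC AGT TCT TCA TCC ACA ACT TAT GTA GTT".split()
)

# Codons listed in the text as having a 'low' immune response preference.
LOW = frozenset(
    "GCA GCG CGG AAT GAT TGT GAG GGG GGT GGC ATA TTG TTA TTC CCA CCG "
    "AGC AGT ACT TAT GTA".split()
)


def candidate_codons(sequence, low_only=False):
    """Count the in-frame codons of `sequence` that fall in the chosen group."""
    group = LOW if low_only else LOW_OR_INTERMEDIATE
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial codon
    codons = (sequence[i:i + 3] for i in range(0, usable, 3))
    return Counter(codon for codon in codons if codon in group)


example = "GCGATTGCGATG"  # hypothetical fragment: Ala-Ile-Ala-Met
print(candidate_codons(example))
print(candidate_codons(example, low_only=True))
```

The most frequent codons in the result are one plausible starting point for choosing which iso-tRNA to encode on the second construct.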
BRIEF DESCRIPTION OF THE DRAWINGS
[0093] FIG. 1 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted ALA E7 constructs and
controls (IgkC1, IgkS1-1, IgkS1-2, IgkS1-3, IgkS1-4 and IgkC2) as
further defined in Example 1 and Table 12. The sequences are
ligated into the KpnI and EcoRI sites of pcDNA3.
[0094] FIG. 2 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted ARG E7 constructs and
controls (IgkS1-5, IgkS1-6, IgkS1-7, IgkS1-8, IgkS1-9, IgkS1-10,
IgkC1 and IgkC2) as further defined in Example 1 and Table 12. The
sequences are ligated into the KpnI and EcoRI sites of pcDNA3.
[0095] FIG. 3 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted ASN and LYS E7 constructs
and controls (IgkS1, IgkS1-12, IgkS1-31 and IgkC2) as further
defined in Example 1 and Table 12. The sequences are ligated into
the KpnI and EcoRI sites of pcDNA3.
[0096] FIG. 4 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted ASP E7 constructs and
controls (IgkC1, IgkS1-13, IgkS1-14 and IgkC2) as further defined
in Example 1 and Table 12. The sequences are ligated into the KpnI
and EcoRI sites of pcDNA3.
[0097] FIG. 5 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted CYS E7 constructs and
controls (IgkC1, IgkS1-15, IgkS1-16 and IgkC2) as further defined
in Example 1 and Table 12. The sequences are ligated into the KpnI
and EcoRI sites of pcDNA3.
[0098] FIG. 6 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted GLU E7 constructs and
controls (IgkS1-17, IgkS1-18, IgkC2 and IgkC1) as further defined
in Example 1 and Table 12. The sequences are ligated into the KpnI
and EcoRI sites of pcDNA3.
[0099] FIG. 7 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted GLN E7 constructs and
controls (IgkC1, IgkS1-19, IgkS1-20 and IgkC2) as further defined
in Example 1 and Table 12. The sequences are ligated into the KpnI
and EcoRI sites of pcDNA3.
[0100] FIG. 8 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted GLY E7 constructs and
controls (IgkC1, IgkS1-21, IgkS1-22, IgkS1-23, IgkS1-24 and IgkC2)
as further defined in Example 1 and Table 12. The sequences are
ligated into the KpnI and EcoRI sites of pcDNA3.
[0101] FIG. 9 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted HIS E7 constructs and
controls (IgkC1, IgkS1-25, IgkS1-26 and IgkC2) as further defined
in Example 1 and Table 12. The sequences are ligated into the KpnI
and EcoRI sites of pcDNA3.
[0102] FIG. 10 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted ILE E7 constructs and
controls (IgkC1, IgkS1-27, IgkS1-28, IgkS1-29 and IgkC2) as further
defined in Example 1 and Table 12. The sequences are ligated into
the KpnI and EcoRI sites of pcDNA3.
[0103] FIG. 11 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted LEU E7 constructs and
controls (IgkS1-50, IgkS1-51, IgkS1-52, IgkS1-53, IgkS1-54,
IgkS1-55, IgkC3 and IgkC4) as further defined in Example 1 and
Table 12. The sequences are ligated into the KpnI and EcoRI sites
of pcDNA3. The LEU E7 constructs are oncogenic (i.e., encode
wild-type E7 protein).
[0104] FIG. 12 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted PHE E7 constructs and
controls (IgkS1-32, IgkS1-33, IgkC1 and IgkC2) as further defined
in Example 1 and Table 12. The sequences are ligated into the KpnI
and EcoRI sites of pcDNA3. The two LEU residues were mutated to PHE
in this sequence so that there are three instead of one PHE
residue.
[0105] FIG. 13 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted PRO E7 constructs and
controls (IgkS1-56, IgkS1-57, IgkS1-58, IgkS1-59, IgkC3 and IgkC4)
as further defined in Example 1 and Table 12. The sequences are
ligated into the KpnI and EcoRI sites of pcDNA3. The PRO E7
constructs are oncogenic (i.e., encode wild-type E7 protein).
[0106] FIG. 14 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted SER E7 constructs and
controls (IgkS1-34, IgkS1-35, IgkS1-36, IgkS1-37, IgkS1-38,
IgkS1-39, IgkC1 and IgkC2) as further defined in Example 1 and
Table 12. The sequences are ligated into the KpnI and EcoRI sites
of pcDNA3.
[0107] FIG. 15 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted THR E7 constructs and
controls (IgkC1, IgkS1-40, IgkS1-41, IgkS1-42, IgkS1-43 and IgkC2)
as further defined in Example 1 and Table 12. The sequences are
ligated into the KpnI and EcoRI sites of pcDNA3.
[0108] FIG. 16 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted TYR E7 constructs and
controls (IgkC1, IgkS1-44, IgkS1-45 and IgkC2) as further defined
in Example 1 and Table 12. The sequences are ligated into the KpnI
and EcoRI sites of pcDNA3.
[0109] FIG. 17 is a diagrammatic representation depicting a
nucleotide sequence alignment of secreted VAL E7 constructs and
controls (IgkC1, IgkS1-46, IgkS1-47, IgkS1-48, IgkS1-49 and IgkC2)
as further defined in Example 1 and Table 12. The sequences are
ligated into the KpnI and EcoRI sites of pcDNA3.
[0110] FIG. 18 is a graphical representation showing the response
to gene gun immunization with optimized and de-optimized E7
constructs measured by (a) ELISA, (b) Memory B cell ELISPOT, and
(c) IFN-γ ELISPOT. For part (a) eight mice were immunized per
group (4 immunizations, 3 weeks apart) and the sera taken three
weeks after the final immunization; (left) E7 protein ELISA,
(right) E7 peptide 101 ELISA. Wells were done in duplicate. For
parts (b) and (c) mice were immunized twice, three weeks apart and
the spleens collected three weeks after the second immunization.
The spleens were pooled prior to analysis. The Memory B cell and
IFN-γ ELISPOTs were conducted twice and three times,
respectively, and the wells done in triplicate. Three mice were
used per group per repeat. The results shown in parts (b) and (c)
are from individual experiments and are representative of the
complete data sets. The particular ELISPOT experimental data
included here were gathered together with the corresponding data in
FIG. 20 and therefore may be directly compared. Unpaired two-tailed
t-tests were used to compare the modified constructs to wild-type.
***P<0.001, **0.001<P<0.01, *0.01<P<0.05, ns=not
significant (P>0.05). In (a) O1-O3 were not significantly
different from MC as measured by unpaired two-tailed t-tests.
wt=wild-type codon usage E7; O1-O3=codon-optimized E7 constructs 1
to 3; W=codon de-optimized E7; MC=mammalian consensus codon usage
E7.
[0111] FIG. 19 is a graphical representation showing the response
to immunization by intradermal injection with optimized and
de-optimized constructs measured by (a) ELISA, (b) Memory B cell
ELISPOT, and (c) IFN-γ ELISPOT. For part (a) eight mice were
immunized per group (4 immunizations, 3 weeks apart) and the sera
taken three weeks after the final immunization; (left) E7 protein
ELISA, (right) E7 peptide 101 ELISA. Wells were done in duplicate.
For parts (b) and (c) mice were immunized twice, three weeks apart
and the spleens collected three weeks after the second
immunization. The spleens were pooled prior to analysis. The Memory
B cell and IFN-γ ELISPOTs were conducted twice and three
times, respectively, and the wells done in triplicate. Three mice
were used per group per repeat. The results shown in parts (b) and
(c) are from individual experiments and are representative of the
complete data sets. The particular ELISPOT experimental data
included here were gathered together with the corresponding data in
FIG. 20 and therefore may be directly compared. Unpaired two-tailed
t-tests were used to compare the modified constructs to wild-type.
***P<0.001, **0.001≤P<0.01,
*0.01≤P≤0.05, ns=not significant (P>0.05). In (a)
O1-O3 were not significantly different from MC as measured by
unpaired two-tailed t-tests. wt=wild-type codon usage E7;
O1-O3=codon-optimized E7 constructs 1 to 3; W=codon de-optimized
E7; MC=mammalian consensus codon usage E7.
[0112] FIG. 20 is a graphical representation showing the results of
an ELISA that measures binding of serum from mice immunized with
various gD2 constructs by intradermal injection (white bars) or
gene gun immunization (black bars), to C-terminally His-tagged
gD2tr. Note that the His-tagged gD2tr protein was used in an
unpurified state (in CHO cell supernatant) and that background
readings of non-specific binding to control supernatant have been
subtracted from the results.
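The asterisk notation used in the legends of FIGS. 18 and 19 can be expressed as a small helper, sketched below in Python. The function name `significance_label` is hypothetical, and the sketch assumes the boundary handling given in the FIG. 19 legend: three asterisks below 0.001, two asterisks from 0.001 up to but not including 0.01, one asterisk from 0.01 up to and including 0.05, and "ns" above 0.05. The P value itself would come from the unpaired two-tailed t-test and is not computed here.

```python
def significance_label(p):
    """Map a P value to the asterisk notation of the FIG. 19 legend."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("P value must lie between 0 and 1")
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p <= 0.05:
        return "*"
    return "ns"


for value in (0.0004, 0.004, 0.04, 0.4):
    print(value, significance_label(value))
```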
TABLE-US-00008 TABLE 8 BRIEF DESCRIPTION OF THE SEQUENCES
SEQUENCE ID NUMBER | SEQUENCE | LENGTH
SEQ ID NO: 1 | IgkS2-13 Asp GAT construct nucleotide sequence | 387 nts
SEQ ID NO: 2 | IgkS2-14 Asp GAC construct nucleotide sequence | 387 nts
SEQ ID NO: 3 | IgkS2-15 Cys TGT construct nucleotide sequence | 387 nts
SEQ ID NO: 4 | IgkS2-16 Cys TGC construct nucleotide sequence | 387 nts
SEQ ID NO: 5 | IgkS2-17 Glu GAG construct nucleotide sequence | 387 nts
SEQ ID NO: 6 | IgkS2-18 Glu GAA construct nucleotide sequence | 387 nts
SEQ ID NO: 7 | IgkS2-19 Gln CAG construct nucleotide sequence | 387 nts
SEQ ID NO: 8 | IgkS2-20 Gln CAA construct nucleotide sequence | 387 nts
SEQ ID NO: 9 | IgkS2-21 Gly GGG construct nucleotide sequence | 387 nts
SEQ ID NO: 10 | IgkS2-22 Gly GGA construct nucleotide sequence | 387 nts
SEQ ID NO: 11 | IgkS2-23 Gly GGT construct nucleotide sequence | 387 nts
SEQ ID NO: 12 | IgkS2-24 Gly GGC construct nucleotide sequence | 387 nts
SEQ ID NO: 13 | IgkS2-27 Ile ATA construct nucleotide sequence | 387 nts
SEQ ID NO: 14 | IgkS2-28 Ile ATT construct nucleotide sequence | 387 nts
SEQ ID NO: 15 | IgkS2-29 Ile ATC construct nucleotide sequence | 387 nts
SEQ ID NO: 16 | IgkS2-34 Ser AGT construct nucleotide sequence | 387 nts
SEQ ID NO: 17 | IgkS2-35 Ser AGC construct nucleotide sequence | 387 nts
SEQ ID NO: 18 | IgkS2-36 Ser TCG construct nucleotide sequence | 387 nts
SEQ ID NO: 19 | IgkS2-37 Ser TCA construct nucleotide sequence | 387 nts
SEQ ID NO: 20 | IgkS2-38 Ser TCT construct nucleotide sequence | 387 nts
SEQ ID NO: 21 | IgkS2-39 Ser TCC construct nucleotide sequence | 387 nts
SEQ ID NO: 22 | IgkS2-40 Thr ACG construct nucleotide sequence | 387 nts
SEQ ID NO: 23 | IgkS2-41 Thr ACA construct nucleotide sequence | 387 nts
SEQ ID NO: 24 | IgkS2-42 Thr ACT construct nucleotide sequence | 387 nts
SEQ ID NO: 25 | IgkS2-43 Thr ACC construct nucleotide sequence | 387 nts
SEQ ID NO: 26 | IgkS2-46 Val GTG construct nucleotide sequence | 387 nts
SEQ ID NO: 27 | IgkS2-47 Val GTA construct nucleotide sequence | 387 nts
SEQ ID NO: 28 | IgkS2-48 Val GTT construct nucleotide sequence | 387 nts
SEQ ID NO: 29 | IgkS2-49 Val GTG construct nucleotide sequence | 387 nts
SEQ ID NO: 30 | IgkS2-1 Ala GCG Linker nucleotide sequence | 408 nts
SEQ ID NO: 31 | IgkS2-2 Ala GCA Linker nucleotide sequence | 408 nts
SEQ ID NO: 32 | IgkS2-3 Ala GCT Linker nucleotide sequence | 408 nts
SEQ ID NO: 33 | IgkS2-4 Ala GCC Linker nucleotide sequence | 408 nts
SEQ ID NO: 34 | IgkS2-5 Arg AGG Linker nucleotide sequence | 408 nts
SEQ ID NO: 35 | IgkS2-6 Arg AGA Linker nucleotide sequence | 408 nts
SEQ ID NO: 36 | IgkS2-7 Arg CGG Linker nucleotide sequence | 408 nts
SEQ ID NO: 37 | IgkS2-8 Arg CGA Linker nucleotide sequence | 408 nts
SEQ ID NO: 38 | IgkS2-9 Arg CGT Linker nucleotide sequence | 408 nts
SEQ ID NO: 39 | IgkS2-10 Arg CGC Linker nucleotide sequence | 408 nts
SEQ ID NO: 40 | IgkS2-11 Asn AAT Linker nucleotide sequence | 408 nts
SEQ ID NO: 41 | IgkS2-12 Asn AAC Linker nucleotide sequence | 408 nts
SEQ ID NO: 42 | IgkS2-25 His CAT Linker nucleotide sequence | 408 nts
SEQ ID NO: 43 | IgkS2-26 His CAC Linker nucleotide sequence | 408 nts
SEQ ID NO: 44 | IgkS2-30 Lys AAG Linker nucleotide sequence | 408 nts
SEQ ID NO: 45 | IgkS2-31 Lys AAA Linker nucleotide sequence | 408 nts
SEQ ID NO: 46 | IgkS2-32 Phe TTT Linker nucleotide sequence | 408 nts
SEQ ID NO: 47 | IgkS2-33 Phe TTC Linker nucleotide sequence | 408 nts
SEQ ID NO: 48 | IgkS2-44 Tyr TAT Linker nucleotide sequence | 408 nts
SEQ ID NO: 49 | IgkS2-45 Tyr TAC Linker nucleotide sequence | 408 nts
SEQ ID NO: 50 | Influenza A Virus HA hemagglutinin (A/Hong Kong/213/03(H5N1)) BAE07201 wild-type | 1707 nts
SEQ ID NO: 51 | Influenza A Virus HA hemagglutinin (A/Hong Kong/213/03(H5N1)) BAE07201 wild-type | 568 aa
SEQ ID NO: 52 | Influenza A Virus HA hemagglutinin (A/Hong Kong/213/03(H5N1)) Codon modified | 1707 nts
SEQ ID NO: 53 | Influenza A Virus HA hemagglutinin (A/swine/Korea/PZ72-1/2006 (H3N1)) DQ923506 wild-type | 1701 nts
SEQ ID NO: 54 | Influenza A Virus HA hemagglutinin (A/swine/Korea/PZ72-1/2006 (H3N1)) DQ923506 wild-type | 566 aa
SEQ ID NO: 55 | Influenza A Virus HA hemagglutinin (A/swine/Korea/PZ72-1/2006 (H3N1)) Codon modified | 1701 nts
SEQ ID NO: 56 | Influenza A Virus NA neuraminidase (A/Hong Kong/213/03(H5N1)) AB212056 wild-type | 1410 nts
SEQ ID NO: 57 | Influenza A Virus NA neuraminidase (A/Hong Kong/213/03(H5N1)) AB212056 wild-type | 469 aa
SEQ ID NO: 58 | Influenza A Virus NA neuraminidase (A/Hong Kong/213/03(H5N1)) Codon modified | 1410 nts
SEQ ID NO: 59 | Influenza A Virus NA neuraminidase (A/swine/MI/PU243/04 (H3N1)) DQ150427 wild-type | 1410 nts
SEQ ID NO: 60 | Influenza A Virus NA neuraminidase (A/swine/MI/PU243/04 (H3N1)) DQ150427 wild-type | 469 aa
SEQ ID NO: 61 | Influenza A Virus NA neuraminidase (A/swine/MI/PU243/04 (H3N1)) Codon modified | 1410 nts
SEQ ID NO: 62 | Hepatitis C Virus E1 (Serotype 1A, isolate H77) AF009606 wild-type | 576 nts
SEQ ID NO: 63 | Hepatitis C Virus E1 (Serotype 1A, isolate H77) NP 751920 wild-type | 192 aa
SEQ ID NO: 64 | Hepatitis C Virus E1 (Serotype 1A, isolate H77) Codon modified | 576 nts
SEQ ID NO: 65 | Hepatitis C Virus E2 (Serotype 1A, isolate H77) AF009606 wild-type | 1089 nts
SEQ ID NO: 66 | Hepatitis C Virus E2 (Serotype 1A, isolate H77) NP 751921 wild-type | 363 aa
SEQ ID NO: 67 | Hepatitis C Virus E2 (Serotype 1A, isolate H77) Codon modified | 1089 nts
SEQ ID NO: 68 | Epstein Barr Virus (Type 1, gp350 B95-8) NC 007605 wild-type | 2724 nts
SEQ ID NO: 69 | Epstein Barr Virus (Type 1, gp350 B95-8) CAD53417 wild-type | 907 aa
SEQ ID NO: 70 | Epstein Barr Virus (Type 1, gp350 B95-8) Codon modified | 2724 nts
SEQ ID NO: 71 | Epstein Barr Virus (Type 2, gp350 AG876) NC 009334 wild-type | 2661 nts
SEQ ID NO: 72 | Epstein Barr Virus (Type 2, gp350 AG876) YP 001129462 wild-type | 886 aa
SEQ ID NO: 73 | Epstein Barr Virus (Type 2, gp350 AG876) Codon modified | 2661 nts
SEQ ID NO: 74 | Herpes Simplex Virus 2 (Glycoprotein B strain HG52) NC 001798 wild-type | 2715 nts
SEQ ID NO: 75 | Herpes Simplex Virus 2 (Glycoprotein B strain HG52) CAB06752 wild-type | 904 aa
SEQ ID NO: 76 | Herpes Simplex Virus 2 (Glycoprotein B strain HG52) Codon modified | 2715 nts
SEQ ID NO: 77 | Herpes Simplex Virus (Glycoprotein D strain HG52) NC 001798 wild-type | 1182 nts
SEQ ID NO: 78 | Herpes Simplex Virus (Glycoprotein D strain HG52) NP 0044536 wild-type | 393 aa
SEQ ID NO: 79 | Herpes Simplex Virus (Glycoprotein D strain HG52) Codon modified | 1182 nts
SEQ ID NO: 80 | HPV-16 E7 wild-type | 387 nts
SEQ ID NO: 81 | HPV-16 E7 O1 | 387 nts
SEQ ID NO: 82 | HPV-16 E7 O2 | 387 nts
SEQ ID NO: 83 | HPV-16 E7 O3 | 417 nts
SEQ ID NO: 84 | HPV-16 E7 W | 387 nts
SEQ ID NO: 85 | HSV-2 gD2 wild-type | 1182 nts
SEQ ID NO: 86 | HSV-2 gD2 O1 | 1182 nts
SEQ ID NO: 87 | HSV-2 gD2 O2 | 1182 nts
SEQ ID NO: 88 | HSV-2 gD2 O3 | 1182 nts
SEQ ID NO: 89 | HSV-2 gD2 W | 1182 nts
SEQ ID NO: 90 | Common forward primer | 41 nts
SEQ ID NO: 91 | ODN-7909 | 24 nts
DETAILED DESCRIPTION OF THE INVENTION
1. Definitions
[0113] Unless defined otherwise, all technical and scientific terms
used herein have the same meaning as commonly understood by those
of ordinary skill in the art to which the invention belongs.
Although any methods and materials similar or equivalent to those
described herein can be used in the practice or testing of the
present invention, preferred methods and materials are described.
For the purposes of the present invention, the following terms are
defined below.
[0114] The articles "a" and "an" are used herein to refer to one or
to more than one (i.e. to at least one) of the grammatical object
of the article. By way of example, "an element" means one element
or more than one element.
[0115] By "about" is meant a quantity, level, value, frequency,
percentage, dimension, size, or amount that varies by no more than
15%, and preferably by no more than 10%, 9%, 8%, 7%, 6%, 5%, 4%,
3%, 2% or 1%, relative to a reference quantity, level, value, frequency,
percentage, dimension, size, or amount.
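By way of illustration only, the tolerance expressed by "about" can be sketched as a simple relative-deviation check. The function name is_about and the parameter name tolerance are hypothetical and are not part of the specification; the default of 0.15 reflects the broadest bound of 15% stated above.

```python
def is_about(value: float, reference: float, tolerance: float = 0.15) -> bool:
    """Return True if value differs from reference by no more than
    the given fraction (tolerance) of the reference."""
    if reference == 0:
        # A relative tolerance is undefined for a zero reference;
        # this sketch only accepts an exact match in that case.
        return value == 0
    return abs(value - reference) <= tolerance * abs(reference)


# Example: 110 is "about" 100 under the default bound, 120 is not.
print(is_about(110, 100))        # True
print(is_about(120, 100))        # False
print(is_about(104, 100, 0.05))  # True under a narrower 5% bound
```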
[0116] The terms "administration concurrently" or "administering
concurrently" or "co-administering" and the like refer to the
administration of a single composition containing two or more
actives, or the administration of each active as separate
compositions and/or delivered by separate routes either
contemporaneously or simultaneously or sequentially within a short
enough period of time that the effective result is equivalent to
that obtained when all such actives are administered as a single
composition. By "simultaneously" is meant that the active agents
are administered at substantially the same time, and desirably
together in the same formulation. By "contemporaneously" it is
meant that the active agents are administered closely in time,
e.g., one agent is administered within from about one minute to
within about one day before or after another. Any contemporaneous
time is useful. However, it will often be the case that when not
administered simultaneously, the agents will be administered within
about one minute to within about eight hours and preferably within
less than about one to about four hours. When administered
contemporaneously, the agents are suitably administered at the same
site on the subject. The term "same site" includes the exact
location, but can be within about 0.5 to about 15 centimeters,
preferably from within about 0.5 to about 5 centimeters. The term
"separately" as used herein means that the agents are administered
at an interval, for example at an interval of about a day to
several weeks or months. The active agents may be administered in
either order. The term "sequentially" as used herein means that the
agents are administered in sequence, for example at an interval or
intervals of minutes, hours, days or weeks. If appropriate the
active agents may be administered in a regular repeating cycle.
[0117] As used herein, the term "cis-acting sequence" or
"cis-regulatory region" or similar term shall be taken to mean any
sequence of nucleotides which is derived from an expressible
genetic sequence wherein the expression of the genetic sequence is
regulated, at least in part, by the sequence of nucleotides. Those
skilled in the art will be aware that a cis-regulatory region may
be capable of activating, silencing, enhancing, repressing or
otherwise altering the level of expression and/or
cell-type-specificity and/or developmental specificity of any
structural gene sequence.
[0118] Throughout this specification, unless the context requires
otherwise, the words "comprise," "comprises" and "comprising" will
be understood to imply the inclusion of a stated step or element or
group of steps or elements but not the exclusion of any other step
or element or group of steps or elements.
[0119] As used herein, a "chimeric construct" refers to a
polynucleotide having heterologous nucleic acid elements. Chimeric
constructs include "expression cassettes" or "expression
constructs," which refer to an assembly that is capable of
directing the expression of the sequence(s) or gene(s) of interest.
An expression cassette generally includes control elements such as
a promoter that is operably linked to (so as to direct
transcription of) a synthetic polynucleotide of the invention, and
often includes a polyadenylation sequence as well. Within certain
embodiments of the invention, the chimeric construct may be
contained within a vector. In addition to the components of the
chimeric construct, the vector may include one or more selectable
markers, a signal which allows the vector to exist as
single-stranded DNA (e.g., an M13 origin of replication), at least
one multiple cloning site, and a "mammalian" origin of replication
(e.g., an SV40 or adenovirus origin of replication).
[0120] By "coding sequence" is meant any nucleic acid sequence that
contributes to the code for the polypeptide product of a
polynucleotide (e.g., a reporter polynucleotide).
[0121] As used herein a "conferred phenotype" refers to a temporary
or permanent change in the state of an organism of interest or
class of organisms of interest, or of a part or tissue or cell or
cell type or class of cell of an organism of interest, which occurs
after the introduction of a polynucleotide to that organism, or to
that class of organisms, or to the part or tissue or cell or cell
type or class of cell, or to a precursor of that organism or part
or tissue or cell or cell type or class of cell, and which would
not have occurred in the absence of that introduction. Typically,
such a temporary or permanent change occurs as a result of the
transcription and/or translation of genetic information contained
within that polynucleotide in the cell, or in at least one cell or
cell type or class of cell within the organism of interest or
within the class of organisms of interest, and can be used
to distinguish the organism of interest, or class of organisms of
interest, or part or tissue or cell or cell type or class of cell
thereof, or genetic progeny of these, to which the polynucleotide
has been provided from a similar organism of interest, or class of
organisms of interest, or part or tissue or cell or cell type or
class of cell thereof, or genetic progeny of these, to which the
polynucleotide has not been provided.
[0122] As used herein, "conferred immune response," "immune
response that is conferred" and the like refer to a temporary or
permanent change in immune response to a target antigen, which
occurs or would occur after the introduction of a polynucleotide to
the mammal, and which would not occur in the absence of that
introduction. Typically, such a temporary or permanent change
occurs as a result of the transcription and/or translation of
genetic information contained within that polynucleotide in a cell,
or in at least one cell or cell type or class of cell within a
mammal or within a class of mammals, and can be used to distinguish
the mammal, or class of mammals to which the polynucleotide has
been provided from a similar mammal, or class of mammals, to which
the polynucleotide has not been provided.
[0123] By "corresponds to" or "corresponding to" is meant an
antigen which encodes an amino acid sequence that displays
substantial similarity to an amino acid sequence in a target
antigen. In general the antigen will display at least about 30, 40,
50, 55, 60, 65, 70, 75, 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98,
99% similarity or identity to at least a portion of the target
antigen (e.g., at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%
or 95% of the amino acid sequence of the target antigen).
[0124] By "effective amount," in the context of modulating an
immune response or treating or preventing a disease or condition,
is meant the administration of that amount of composition to an
individual in need thereof, either in a single dose or as part of a
series, that is effective for achieving that modulation, treatment
or prevention. The effective amount will vary depending upon the
health and physical condition of the individual to be treated, the
taxonomic group of individual to be treated, the formulation of the
composition, the assessment of the medical situation, and other
relevant factors. It is expected that the amount will fall in a
relatively broad range that can be determined through routine
trials.
[0125] The terms "enhancing an immune response," "producing a
stronger immune response" and the like refer to increasing an
animal's capacity to respond to a target antigen (e.g., a foreign
or disease-specific antigen or a self antigen), which can be
determined for example by detecting an increase in the number,
activity, and ability of the animal's cells that are primed to
attack such antigens or an increase in the titer or activity of
antibodies in the animal, which are immuno-interactive with the
target antigen. Strength of immune response can be measured by
standard immunoassays including: direct measurement of antibody
titers or peripheral blood lymphocytes; cytolytic T lymphocyte
assays; assays of natural killer cell cytotoxicity; cell
proliferation assays including lymphoproliferation (lymphocyte
activation) assays; immunoassays of immune cell subsets; assays of
T-lymphocytes specific for the antigen in a sensitized subject;
skin tests for cell-mediated immunity; etc. Such assays are well
known in the art. See, e.g., Erickson et al., 1993, J. Immunol.
151:4189-4199; Doe et al., 1994, Eur. J. Immunol. 24:2369-2376.
Recent methods of measuring cell-mediated immune response include
measurement of intracellular cytokines or cytokine secretion by
T-cell populations, or by measurement of epitope specific T-cells
(e.g., by the tetramer technique) (reviewed by McMichael, A. J.,
and O'Callaghan, C. A., 1998, J. Exp. Med. 187(9):1367-1371;
McHeyzer-Williams, M. G., et al., 1996, Immunol. Rev. 150:5-21;
Lalvani, A., et al., 1997, J. Exp. Med. 186:859-865). Any
statistically significant increase in strength of immune response
as measured for example by immunoassay is considered an "enhanced
immune response" or "immunoenhancement" as used herein. Enhanced
immune response is also indicated by physical manifestations such
as fever and inflammation, as well as healing of systemic and local
infections, and reduction of symptoms in disease, i.e., decrease in
tumor size, alleviation of symptoms of a disease or condition
including, but not restricted to, leprosy, tuberculosis, malaria,
aphthous ulcers, herpetic and papillomatous warts, gingivitis,
arthrosclerosis, the concomitants of AIDS such as Kaposi's sarcoma,
bronchial infections, and the like. Such physical manifestations
also encompass "enhanced immune response" or "immunoenhancement" as
used herein. By contrast, "reducing an immune response," "producing
a weaker immune response" and the like refer to decreasing an
animal's capacity to respond to a target antigen, which can be
determined for example by conducting immunoassays or assessing
physical manifestations, as described for example above.
[0126] The terms "expression" or "gene expression" refer to
production of RNA message and/or translation of RNA message into
proteins or polypeptides.
[0127] By "expression vector" is meant any autonomous genetic
element capable of directing the synthesis of a protein encoded by
the vector. Such expression vectors are known by practitioners in
the art.
[0128] The term "gene" is used in its broadest context to include
both a genomic DNA region corresponding to the gene as well as a
cDNA sequence corresponding to exons or a recombinant molecule
engineered to encode a functional form of a product.
[0129] As used herein the term "heterologous" refers to a
combination of elements that are not naturally occurring or that
are obtained from different sources.
[0130] "Immune response" or "immunological response" refers to the
concerted action of lymphocytes, antigen-presenting cells,
phagocytic cells, granulocytes, and soluble macromolecules produced
by the above cells or the liver (including antibodies, cytokines,
and complement) that results in selective damage to, destruction
of, or elimination from the body of cancerous cells, metastatic
tumor cells, metastatic breast cancer cells, invading pathogens,
cells or tissues infected with pathogens, or, in cases of
autoimmunity or pathological inflammation, normal human cells or
tissues. In some embodiments, an "immune response" encompasses the
development in an individual of a humoral and/or a cellular immune
response to a polypeptide that is encoded by an introduced
synthetic polynucleotide of the invention. As known in the art, the
term "humoral immune response" includes and encompasses an immune
response mediated by antibody molecules, while a "cellular immune
response" includes and encompasses an immune response mediated by
T-lymphocytes and/or other white blood cells. Thus, an immune
response that is stimulated by a synthetic polynucleotide of the
invention may be one that stimulates the production of antibodies
(e.g., neutralizing antibodies that block bacterial toxins and
pathogens such as viruses entering cells and replicating by binding
to toxins and pathogens, typically protecting cells from infection
and destruction). The synthetic polynucleotide may also elicit
production of cytolytic T lymphocytes (CTLs). Hence, an
immunological response may include one or more of the following
effects: the production of antibodies by B-cells; and/or the
activation of suppressor T-cells and/or memory/effector T-cells
directed specifically to an antigen or antigens present in the
composition or vaccine of interest. In some embodiments, these
responses may serve to neutralize infectivity, and/or mediate
antibody-complement, or antibody dependent cell cytotoxicity (ADCC)
to provide protection to an immunized host. Such responses can be
determined using standard immunoassays and neutralization assays,
well known in the art. (See, e.g., Montefiori et al., 1988, J Clin
Microbiol. 26:231-235; Dreyer et al., 1999, AIDS Res Hum
Retroviruses 15(17):1563-1571). The innate immune system of mammals
also recognizes and responds to molecular features of pathogenic
organisms and cancer cells via activation of Toll-like receptors
and similar receptor molecules on immune cells. Upon activation of
the innate immune system, various non-adaptive immune response
cells are activated to, e.g., produce various cytokines,
lymphokines and chemokines. Cells activated by an innate immune
response include immature and mature dendritic cells of, for
example, the monocyte and plasmacytoid lineage (MDC, PDC), as well
as gamma, delta, alpha and beta T cells and B cells and the like.
Thus, the present invention also contemplates an immune response
wherein the immune response involves both an innate and adaptive
response.
[0131] A composition is "immunogenic" if it is capable of either:
a) generating an immune response against a target antigen (e.g., a
viral or tumor antigen) in an individual; or b) reconstituting,
boosting, or maintaining an immune response in an individual beyond
what would occur if the agent or composition was not administered.
An agent or composition is immunogenic if it is capable of
attaining either of these criteria when administered in single or
multiple doses.
[0132] "Immunomodulation," "modulating an immune response" and the
like refer to the modulation of the immune system in response to a
stimulus and includes increasing or decreasing an immune response
to a target antigen or changing an immune response from one that is
predominantly a humoral immune response to one that is a more
cell-mediated immune response and vice versa. For example, it is
known in the art that decreasing the amount of antigen for
immunization can change the bias of the immune system from a
predominantly humoral immune response to a predominantly cellular
immune response.
[0133] By "isoaccepting transfer RNA" or "iso-tRNA" is meant one or
more transfer RNA molecules that differ in their anticodon
nucleotide sequence but are specific for the same amino acid.
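As a non-limiting illustration, the synonymous codons that specify a given amino acid (and which may be read by different isoaccepting tRNAs) can be listed from the standard genetic code. The sketch below assumes the standard nuclear genetic code; the names CODON_TABLE and synonymous_codons are hypothetical. Because of wobble pairing, the number of synonymous codons need not equal the number of distinct iso-tRNAs in a given cell.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon in TCAG order
# (third base varies fastest); '*' marks a stop codon.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)

# Map each DNA codon (e.g. "TTC") to its one-letter amino acid.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid):
    """Return the sorted list of codons encoding the given amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# Example: Phe and Tyr each have two synonymous codons, Leu has six.
print(synonymous_codons("F"))  # ['TTC', 'TTT']
print(synonymous_codons("Y"))  # ['TAC', 'TAT']
print(len(synonymous_codons("L")))  # 6
```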
[0134] As used herein, the term "mammal" refers to any mammal
including, without limitation, humans and other primates, including
non-human primates such as chimpanzees and other apes and monkey
species; farm animals such as cattle, sheep, pigs, goats and
horses; domestic mammals such as dogs and cats; and laboratory
animals including rodents such as mice, rats and guinea pigs. The
term does not denote a particular age. Thus, both adult and newborn
individuals are intended to be covered.
[0135] By "modulating," "modulate" and the like is meant increasing
or decreasing, either directly or indirectly, the quality of a
selected phenotype (e.g., an immune response). In certain
embodiments, "modulation" or "modulating" means that a
desired/selected immune response is more efficient (e.g., at least
10%, 20%, 30%, 40%, 50%, 60% or more), more rapid (e.g., at least
10%, 20%, 30%, 40%, 50%, 60% or more), greater in magnitude (e.g.,
at least 10%, 20%, 30%, 40%, 50%, 60% or more), and/or more easily
induced (e.g., at least 10%, 20%, 30%, 40%, 50%, 60% or more) than
if the parent polynucleotide had been used under the same
conditions as the synthetic polynucleotide. In other embodiments,
"modulation" or "modulating" means changing an immune response from
a predominantly antibody-mediated immune response as conferred by
the parent polynucleotide, to a predominantly cellular immune
response as conferred by the synthetic polynucleotide under the
same conditions. In still other embodiments, "modulation" or
"modulating" means changing an immune response from a predominantly
cellular immune response as conferred by the parent polynucleotide,
to a predominantly antibody-mediated immune response as conferred
by the synthetic polynucleotide under the same conditions.
[0136] By "natural gene" is meant a gene that naturally encodes the
protein. However, it is possible that the parent polynucleotide
encodes a protein that is not naturally-occurring but has been
engineered using recombinant techniques.
[0137] The term "5' non-coding region" is used herein in its
broadest context to include all nucleotide sequences which are
derived from the upstream region of an expressible gene, other than
those sequences which encode amino acid residues which comprise the
polypeptide product of the gene, wherein 5' non-coding region
confers or activates or otherwise facilitates, at least in part,
expression of the gene.
[0138] The term "oligonucleotide" as used herein refers to a
polymer composed of a multiplicity of nucleotide units
(deoxyribonucleotides or ribonucleotides, or related structural
variants or synthetic analogues thereof) linked via phosphodiester
bonds (or related structural variants or synthetic analogues
thereof). Thus, while the term "oligonucleotide" typically refers
to a nucleotide polymer in which the nucleotides and linkages
between them are naturally occurring, it will be understood that
the term also includes within its scope various analogues
including, but not restricted to, peptide nucleic acids (PNAs),
phosphoramidates, phosphorothioates, methyl phosphonates,
2-O-methyl ribonucleic acids, and the like. The exact size of the
molecule may vary depending on the particular application. An
oligonucleotide is typically rather short in length, generally from
about 10 to 30 nucleotides, but the term can refer to molecules of
any length, although the term "polynucleotide" or "nucleic acid" is
typically used for large oligonucleotides.
[0139] The terms "operably connected," "operably linked" and the
like as used herein refer to an arrangement of elements wherein the
components so described are configured so as to perform their usual
function. Thus, a given promoter operably linked to a coding
sequence is capable of effecting the expression of the coding
sequence when the proper enzymes are present. The promoter need not
be contiguous with the coding sequence, so long as it functions to
direct the expression thereof. Thus, for example, intervening
untranslated yet transcribed sequences can be present between the
promoter sequence and the coding sequence and the promoter sequence
can still be considered "operably linked" to the coding sequence.
Terms such as "operably connected," therefore, include placing a
structural gene under the regulatory control of a promoter, which
then controls the transcription and optionally translation of the
gene. In the construction of heterologous promoter/structural gene
combinations, it is generally preferred to position the genetic
sequence or promoter at a distance from the gene transcription
start site that is approximately the same as the distance between
that genetic sequence or promoter and the gene it controls in its
natural setting; i.e. the gene from which the genetic sequence or
promoter is derived. As is known in the art, some variation in this
distance can be accommodated without loss of function. Similarly,
the preferred positioning of a regulatory sequence element with
respect to a heterologous gene to be placed under its control is
defined by the positioning of the element in its natural setting;
i.e., the genes from which it is derived.
[0140] By "pharmaceutically-acceptable carrier" is meant a solid or
liquid filler, diluent or encapsulating substance that may be
safely used in topical or systemic administration.
[0141] The term "phenotype" means any one or more detectable
physical or functional characteristics, properties, attributes or
traits of an organism, tissue, or cell, or class of organisms,
tissues or cells, which generally result from the interaction
between the genetic makeup (i.e., genotype) of the organism,
tissue, or cell, or the class of organisms, tissues or cells and
the environment. In certain embodiments, the term "phenotype"
excludes resistance to a selective agent or screening an enzymic or
light-emitting activity, conferred directly by a reporter
protein.
[0142] By "phenotypic preference" is meant the preference with
which an organism uses a codon to produce a selected phenotype.
This preference can be evidenced, for example, by the quality of a
selected phenotype that is producible by a polynucleotide that
comprises the codon in an open reading frame which codes for a
polypeptide that produces the selected phenotype. In certain
embodiments, the preference of usage is independent of the route by
which the polynucleotide is introduced into the organism. However,
in other embodiments, the preference of usage is dependent on the
route of introduction of the polynucleotide into the organism.
[0143] The term "polynucleotide" or "nucleic acid" as used herein
designates mRNA, RNA, cRNA, cDNA or DNA. The term typically refers
to oligonucleotides greater than 30 nucleotides in length.
[0144] "Polypeptide," "peptide" and "protein" are used
interchangeably herein to refer to a polymer of amino acid residues
and to variants and synthetic analogues of the same. Thus, these
terms apply to amino acid polymers in which one or more amino acid
residues is a synthetic non-naturally occurring amino acid, such as
a chemical analogue of a corresponding naturally occurring amino
acid, as well as to naturally-occurring amino acid polymers. As
used herein, the terms "polypeptide," "peptide" and "protein" are
not limited to a minimum length of the product. Thus, peptides,
oligopeptides, dimers, multimers, and the like, are included within
the definition. Both full-length proteins and fragments thereof are
encompassed by the definition. The terms also include post
expression modifications of a polypeptide, for example,
glycosylation, acetylation, phosphorylation and the like. In some
embodiments, a "polypeptide" refers to a protein which includes
modifications, such as deletions, additions and substitutions
(generally conservative in nature), to the native sequence, so long
as the protein maintains the desired activity. These modifications
may be deliberate, as through site-directed mutagenesis, or may be
accidental, such as through mutations of hosts which produce the
proteins or errors due to PCR amplification.
[0145] The terms "polypeptide variant," and "variant" refer to
polypeptides that vary from a reference polypeptide by the
addition, deletion or substitution (generally conservative in
nature) of at least one amino acid residue. Typically, variants
retain a desired activity of the reference polypeptide, such as
antigenic activity in inducing an immune response against a target
antigen. In general, variant polypeptides are "substantially
similar" or "substantially identical" to the reference polypeptide,
e.g., amino acid sequence identity or similarity of more than 50%,
generally more than 60%-70%, even more particularly 80%-85% or
more, such as at least 90%-95% or more, when the two sequences are
aligned. Often, the variants will include the same number of amino
acids but will include substitutions, as explained herein.
[0146] The terms "precursor cell or tissue" and "progenitor cell or
tissue" as used herein refer to a cell or tissue that can give
rise to a particular cell or tissue in which a polypeptide is
produced by expression of the coding sequences in the synthetic
constructs of the invention.
[0147] The terms "precursor" and "progenitor," as used herein in
the context of phenotypic preference, refer to a cell or part of an
organism that can give rise to an organism of interest in which
phenotypic expression is desired or in which phenotypic preference
of a codon is to be determined.
[0148] By "primer" is meant an oligonucleotide which, when paired
with a strand of DNA, is capable of initiating the synthesis of a
primer extension product in the presence of a suitable polymerizing
agent. The primer is preferably single-stranded for maximum
efficiency in amplification but may alternatively be
double-stranded. A primer must be sufficiently long to prime the
synthesis of extension products in the presence of the
polymerization agent. The length of the primer depends on many
factors, including application, temperature to be employed,
template reaction conditions, other reagents, and source of
primers. For example, depending on the complexity of the target
sequence, the oligonucleotide primer typically contains 15 to 35 or
more nucleotides, although it may contain fewer nucleotides.
Primers can be large polynucleotides, such as from about 200
nucleotides to several kilobases or more. A primer may be selected
to be "substantially complementary" to the sequence on the template
to which it is designed to hybridize and serve as a site for the
initiation of synthesis. By "substantially complementary", it is
meant that the primer is sufficiently complementary to hybridize
with a target nucleotide sequence. Preferably, the primer contains
no mismatches with the template to which it is designed to
hybridize but this is not essential. For example, non-complementary
nucleotides may be attached to the 5' end of the primer, with the
remainder of the primer sequence being complementary to the
template. Alternatively, non-complementary nucleotides or a stretch
of non-complementary nucleotides can be interspersed into a primer,
provided that the primer sequence has sufficient complementarity
with the sequence of the template to hybridize therewith and
thereby form a template for synthesis of the extension product of
the primer.
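By way of a rough illustration, the degree of complementarity between a primer and its intended template site can be estimated by counting mismatched positions. The sketch below assumes that both sequences are written 5' to 3', are of equal length and contain only the bases A, C, G and T; the function names are hypothetical. A mismatch count is only a crude proxy, since actual hybridization also depends on temperature, salt concentration and the position of the mismatches.

```python
# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def count_mismatches(primer: str, template_site: str) -> int:
    """Count positions at which the primer fails to base-pair with the
    template site. Both arguments are written 5'->3'."""
    if len(primer) != len(template_site):
        raise ValueError("primer and template site must have equal length")
    # A perfectly complementary primer equals the reverse complement
    # of the template site.
    ideal_primer = reverse_complement(template_site)
    return sum(p != t for p, t in zip(primer.upper(), ideal_primer))


site = "ATGGCCTTAGC"
print(count_mismatches("GCTAAGGCCAT", site))  # 0: fully complementary
print(count_mismatches("GCTAAGGCCAA", site))  # 1: one mismatched position
```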
[0149] By "producing", and like terms such as "production" and
"producible", in the context of protein production, is meant
production of a protein to a level sufficient to achieve a
particular function or phenotype associated with the protein. By
contrast, the terms "not producible" and "not substantially
producible" as used interchangeably herein refer to (a) no
production of a protein, (b) production of a protein to a level
that is not sufficient to effect a particular function or phenotype
associated with the protein, (c) production of a protein, which
cannot be detected by a monoclonal antibody specific for the
protein, or (d) production of a protein, which is less than 1% of
the level produced in a wild-type cell that normally produces the
protein.
[0150] Reference herein to a "promoter" is to be taken in its
broadest context and includes the transcriptional regulatory
sequences of a classical genomic gene, including the TATA box which
is required for accurate transcription initiation, with or without
a CCAAT box sequence and additional regulatory elements (i.e.
upstream activating sequences, enhancers and silencers) which alter
gene expression in response to developmental and/or environmental
stimuli, or in a tissue-specific or cell-type-specific manner. A
promoter is usually, but not necessarily, positioned upstream or
5', of a structural gene, the expression of which it regulates.
Furthermore, the regulatory elements comprising a promoter are
usually positioned within 2 kb of the start site of transcription
of the gene. Preferred promoters according to the invention may
contain additional copies of one or more specific regulatory
elements to further enhance expression in a cell, and/or to alter
the timing of expression of a structural gene to which it is
operably connected.
[0151] The term "quality" is used herein in its broadest sense and
includes a measure, strength, intensity, degree or grade of a
phenotype, e.g., a superior or inferior immune response, increased
or decreased disease resistance, higher or lower sucrose
accumulation, better or worse salt tolerance etc.
[0152] By "regulatory element" or "regulatory sequence" is meant a
nucleic acid sequence (e.g., DNA) that expresses an operably linked
nucleotide sequence (e.g., a coding sequence) in a particular host
cell. The regulatory sequences that are suitable for prokaryotic
cells for example, include a promoter, and optionally a cis-acting
sequence such as an operator sequence and a ribosome binding site.
Control sequences that are suitable for eukaryotic cells include
promoters, polyadenylation signals, transcriptional enhancers,
translational enhancers, leader or trailing sequences that modulate
mRNA stability, as well as targeting sequences that target a
product encoded by a transcribed polynucleotide to an intracellular
compartment within a cell or to the extracellular environment.
[0153] The term "sequence identity" as used herein refers to the
extent that sequences are identical on a nucleotide-by-nucleotide
basis or an amino acid-by-amino acid basis over a window of
comparison. Thus, a "percentage of sequence identity" is calculated
by comparing two optimally aligned sequences over the window of
comparison, determining the number of positions at which the
identical nucleic acid base (e.g., A, T, C, G, I) or the identical
amino acid residue (e.g., Ala, Pro, Ser, Thr, Gly, Val, Leu, Ile,
Phe, Tyr, Trp, Lys, Arg, His, Asp, Glu, Asn, Gln, Cys and Met)
occurs in both sequences to yield the number of matched positions,
dividing the number of matched positions by the total number of
positions in the window of comparison (i.e., the window size), and
multiplying the result by 100 to yield the percentage of sequence
identity. For the purposes of the present invention, "sequence
identity" will be understood to mean the "match percentage"
calculated by the DNASIS computer program (Version 2.5 for Windows;
available from Hitachi Software Engineering Co., Ltd., South San
Francisco, Calif., USA) using standard defaults as used in the
reference manual accompanying the software.
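The generic calculation described above (matched positions divided by window size, multiplied by 100) can be sketched in Python as follows. This is an illustrative sketch only and is not the DNASIS "match percentage" algorithm; the function name percent_identity is hypothetical, and the two sequences are assumed to be already optimally aligned and of equal length, with '-' marking a gap.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of sequence identity over a pre-aligned comparison window.

    Assumes both strings are already aligned and of equal length, so the
    window size is that common length. A gap ('-') never counts as a match.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    if not seq_a:
        raise ValueError("comparison window must not be empty")
    matches = sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(seq_a)


# 9 of the 10 aligned positions carry the identical base.
print(percent_identity("ATGCGTACGT", "ATGCGAACGT"))  # 90.0
```

The same function applies to amino acid sequences written in one-letter code, since it only compares characters position by position.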
[0154] "Similarity" refers to the percentage number of amino acids
that are identical or constitute conservative substitutions as
defined in Table 10. Similarity may be determined using sequence
comparison programs such as GAP (Devereux et al., 1984, Nucleic
Acids Research 12, 387-395). In this way, sequences of a similar or
substantially different length to those cited herein might be
compared by insertion of gaps into the alignment, such gaps being
determined, for example, by the comparison algorithm used by
GAP.
[0155] Terms used to describe sequence relationships between two or
more polynucleotides or polypeptides include "reference sequence",
"comparison window", "sequence identity", "percentage of sequence
identity" and "substantial identity". A "reference sequence" is at
least 12 but frequently 15 to 18 and often at least 25 monomer
units, inclusive of nucleotides and amino acid residues, in length.
Because two polynucleotides may each comprise (1) a sequence (i.e.,
only a portion of the complete polynucleotide sequence) that is
similar between the two polynucleotides, and (2) a sequence that is
divergent between the two polynucleotides, sequence comparisons
between two (or more) polynucleotides are typically performed by
comparing sequences of the two polynucleotides over a "comparison
window" to identify and compare local regions of sequence
similarity. A "comparison window" refers to a conceptual segment of
at least 6 contiguous positions, usually about 50 to about 100,
more usually about 100 to about 150 in which a sequence is compared
to a reference sequence of the same number of contiguous positions
after the two sequences are optimally aligned. The comparison
window may comprise additions or deletions (i.e., gaps) of about
20% or less as compared to the reference sequence (which does not
comprise additions or deletions) for optimal alignment of the two
sequences. Optimal alignment of sequences for aligning a comparison
window may be conducted by computerized implementations of
algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin
Genetics Software Package Release 7.0, Genetics Computer Group, 575
Science Drive Madison, Wis., USA) or by inspection and the best
alignment (i.e., resulting in the highest percentage homology over
the comparison window) generated by any of the various methods
selected. Reference also may be made to the BLAST family of
programs as for example disclosed by Altschul et al., 1997, Nucl.
Acids Res. 25:3389. A detailed discussion of sequence analysis can
be found in Unit 19.3 of Ausubel et al., "Current Protocols in
Molecular Biology", John Wiley & Sons Inc, 1994-1998, Chapter
15.
[0156] As used herein, the term "specific binding pair" refers to a
pair of molecules that physically interact with one another in a
specific manner that gives rise to a biological activity, that is,
to the substantial exclusion of other polypeptides. Members of a
specific binding pair interact through complementary interaction
domains, such that they interact to the substantial exclusion of
proteins that do not have a complementary interaction domain.
Non-limiting examples of specific binding pairs include
antibody-antigen pairs, enzyme-substrate pairs, dimeric
transcription factors (e.g., AP-1, composed of Fos specifically
bound to Jun via a leucine zipper interaction domain) and
receptor-ligand pairs.
[0157] The terms "synthetic polynucleotide," "synthetic construct"
and the like as used herein refer to a nucleic acid molecule that
is formed by recombinant or synthetic techniques and typically
includes polynucleotides that are not normally found in nature.
[0158] The term "synonymous codon" as used herein refers to a codon
having a different nucleotide sequence than another codon but
encoding the same amino acid as that other codon.
[0159] By "treatment," "treat," "treated" and the like is meant to
include both therapeutic and prophylactic treatment.
[0160] By "vector" is meant a nucleic acid molecule, preferably a
DNA molecule derived, for example, from a plasmid, bacteriophage,
or plant virus, into which a nucleic acid sequence may be inserted
or cloned. A vector preferably contains one or more unique
restriction sites and may be capable of autonomous replication in a
defined host cell including a target cell or tissue or a progenitor
cell or tissue thereof, or be integrable with the genome of the
defined host such that the cloned sequence is reproducible.
Accordingly, the vector may be an autonomously replicating vector,
i.e., a vector that exists as an extrachromosomal entity, the
replication of which is independent of chromosomal replication,
e.g., a linear or closed circular plasmid, an extrachromosomal
element, a minichromosome, or an artificial chromosome. The vector
may contain any means for assuring self-replication. Alternatively,
the vector may be one which, when introduced into the host cell, is
integrated into the genome and replicated together with the
chromosome(s) into which it has been integrated. A vector system
may comprise a single vector or plasmid, two or more vectors or
plasmids, which together contain the total DNA to be introduced
into the genome of the host cell, or a transposon. The choice of
the vector will typically depend on the compatibility of the vector
with the host cell into which the vector is to be introduced. The
vector may also include a selection marker such as an antibiotic
resistance gene that can be used for selection of suitable
transformants. Examples of such resistance genes are well known to
those of skill in the art.
2. Abbreviations
[0161] The following abbreviations are used throughout the
application:
[0162] nt=nucleotide
[0163] nts=nucleotides
[0164] aa=amino acid(s)
[0165] kb=kilobase(s) or kilobase pair(s)
[0166] kDa=kilodalton(s)
[0167] d=day
[0168] h=hour
[0169] s=seconds
3. Construct System of the Invention
[0170] In accordance with the present invention, a construct system
is provided for determining the translational efficiency or
phenotypic preference of different synonymous codons. In its
broadest form, the system comprises a plurality of synthetic
constructs each of which is useful for interrogating the
translational efficiency or phenotypic preference of a single codon
("interrogating codon"), wherein the interrogating codon of one
construct is different from the interrogating codon of another.
Thus, in order to compare the translational efficiency or
phenotypic preference of different synonymous codons, it is
generally desirable to use two or more synthetic constructs,
suitably one for each synonymous codon that codes for a particular
amino acid. For example, in the case of arginine, 6 synthetic
constructs are necessary to determine the translational efficiency
or phenotypic preference of all 6 synonymous codons for arginine
(i.e. Arg.sup.CGA, Arg.sup.CGC, Arg.sup.CGT, Arg.sup.AGA,
Arg.sup.AGG, Arg.sup.CGG). By contrast, only 2 synthetic constructs
are required to determine the translational efficiency or
phenotypic preference of both synonymous codons for phenylalanine
(i.e., Phe.sup.TTT, Phe.sup.TTC) and so on. Accordingly, in order
to interrogate the translational efficiency or phenotypic
preference of a finite number of synonymous codons, a corresponding
number of synthetic constructs will generally be required.
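By way of illustration only, the number of constructs needed for a given amino acid can be read off the genetic code. The Python sketch below assumes the standard nuclear genetic code written in the DNA alphabet; the names BASES, AMINO_ACIDS, CODON_TABLE and synonymous_codons are hypothetical and are not part of the construct system itself.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def synonymous_codons(amino_acid: str) -> list[str]:
    """Return every codon encoding the given one-letter amino acid."""
    return sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid)


# One synthetic construct per synonymous codon:
print(synonymous_codons("R"))  # six codons for arginine
print(synonymous_codons("F"))  # two codons for phenylalanine
```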
[0171] The synthetic constructs of the invention each comprise a
regulatory sequence that is operably connected to a reporter
polynucleotide, wherein the reporter polynucleotide of a respective
construct encodes the same amino acid sequence as the reporter
polynucleotide of another. In accordance with the present
invention, individual reporter polynucleotides use the same
interrogating codon to code for a particular amino acid at one or
more positions of the amino acid sequence, wherein the
interrogating codon of one reporter polynucleotide is different to
but synonymous with the interrogating codon of another. In specific
embodiments, the coding sequences of individual reporter
polynucleotides comprise the same number of interrogating codons.
Suitably, all codons in a respective coding sequence, which code
for a particular amino acid, are the same interrogating codon.
However, this is not necessary as it is possible to use fewer
interrogating codons than the number of codons in a respective
coding sequence, which code for the same amino acid as the
interrogating codons. Nevertheless, the sensitivity of an
individual synthetic construct in determining the translational
efficiency or phenotypic preference of a corresponding
interrogating codon is generally improved by incorporating more
interrogating codons in the coding sequence.
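The arrangement in which every codon for a particular amino acid is replaced by the same interrogating codon can be sketched as follows. This is a minimal illustration under stated assumptions: the standard genetic code is assumed, the reporter sequence is a hypothetical toy open reading frame rather than a real reporter gene, and the helper names translate and recode are invented for the example.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codons(cds: str) -> list[str]:
    """Split a coding sequence into codons; its length must be a multiple of 3."""
    if len(cds) % 3:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds: str) -> str:
    """Translate a coding sequence into one-letter amino acids ('*' = stop)."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def recode(cds: str, interrogating_codon: str) -> str:
    """Replace every codon synonymous with the interrogating codon by it."""
    target = CODON_TABLE[interrogating_codon]
    return "".join(
        interrogating_codon if CODON_TABLE[c] == target else c
        for c in split_codons(cds)
    )


# Hypothetical toy reporter: Met-Arg-Lys-Arg-Phe-Arg-stop.
reporter = "ATGCGTAAAAGGTTTCGATAA"
arg_codons = ("CGA", "CGC", "CGG", "CGT", "AGA", "AGG")
variants = {codon: recode(reporter, codon) for codon in arg_codons}

# Every variant encodes the same amino acid sequence as the original.
assert all(translate(v) == translate(reporter) for v in variants.values())
print(variants["AGA"])  # ATGAGAAAAAGATTTAGATAA
```

Each entry of variants corresponds to the coding sequence of one synthetic construct; all six differ only in which arginine codon they use.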
[0172] In some embodiments, the interrogating codon(s) in one
coding sequence is (are) located at the same positions as the
interrogating codons in another coding sequence. In other
embodiments, the interrogating codon(s) of one coding sequence is
(are) located at different positions relative to the interrogating
codons in another coding sequence. For example, a first coding
sequence and a second coding sequence may each contain 5 codons
that code for a particular amino acid and only 3 of those are used
as interrogating codons. In this non-limiting example, the first
coding sequence may comprise the sequence:
[0173] X.sub.1 X.sub.2 X.sub.3 A.sub.1 X.sub.4 X.sub.5 B.sub.1
X.sub.6 X.sub.7 X.sub.8 A.sub.2 X.sub.9 A.sub.3 X.sub.10 X.sub.11
X.sub.12 B.sub.2 X.sub.13 X.sub.14
[0174] and the second coding sequence may comprise:
[0175] X.sub.1 X.sub.2 X.sub.3 A.sub.1 X.sub.4 X.sub.5 A.sub.2
X.sub.6 X.sub.7 X.sub.8 B.sub.1 X.sub.9 A.sub.3 X.sub.10 X.sub.11
X.sub.12 B.sub.2 X.sub.13 X.sub.14
[0176] wherein:
[0177] A.sub.1-3 represent the same interrogating codon;
[0178] B.sub.1-2 represent codons that code for the same amino acid
as the interrogating codon; and
[0179] X.sub.1-14 represent codons that code for different amino
acids than the amino acid coded for by A.sub.1-3 and B.sub.1-2.
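The two layouts above can be compared programmatically. In the following Python sketch, offered only as an illustration, each codon is abbreviated to a single letter (A for an interrogating codon, B for another codon for the same amino acid, X for any other codon); the names FIRST, SECOND and positions are hypothetical.

```python
# One letter per codon, in the order given for the two coding sequences.
FIRST = "XXXAXXBXXXAXAXXXBXX"
SECOND = "XXXAXXAXXXBXAXXXBXX"


def positions(layout: str, symbols: str) -> set[int]:
    """Return the 0-based codon positions holding any of the given symbols."""
    return {i for i, s in enumerate(layout) if s in symbols}


# Both layouts encode the target amino acid at the same five positions ...
assert positions(FIRST, "AB") == positions(SECOND, "AB")
# ... and contain the same number of interrogating codons (three) ...
assert len(positions(FIRST, "A")) == len(positions(SECOND, "A")) == 3
# ... but only some interrogating codons occupy the same positions.
shared = positions(FIRST, "A") & positions(SECOND, "A")
print(sorted(positions(FIRST, "A")))   # [3, 10, 12]
print(sorted(positions(SECOND, "A")))  # [3, 6, 12]
print(sorted(shared))                  # [3, 12]
```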
[0180] In some embodiments, the construct system comprises
synthetic constructs for interrogating the translational efficiency
or phenotypic preference of codons that code for two or more
different amino acids. In illustrative examples of this type, the
construct system comprises synthetic constructs for interrogating
the translational efficiency or phenotypic preference of codons
that code for 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16,
17, 18, 19 or 20 (suitably naturally occurring) amino acids. In
specific embodiments, the construct system comprises 59 synthetic
constructs for interrogating the translational efficiency or
phenotypic preference of all naturally occurring codons for which
there are two or more synonymous codons (e.g., Ala.sup.GCT,
Ala.sup.GCC, Ala.sup.GCA, Ala.sup.GCG, Arg.sup.CGA, Arg.sup.CGC, Arg.sup.CGT,
Arg.sup.AGA, Arg.sup.AGG, Arg.sup.CGG, Asn.sup.AAC, Asn.sup.AAT,
Asp.sup.GAC, Asp.sup.GAT, Cys.sup.TGC, Cys.sup.TGT, Glu.sup.GAA,
Glu.sup.GAG, Gln.sup.CAA, Gln.sup.CAG, Gly.sup.GGA, Gly.sup.GGG,
Gly.sup.GGT, Gly.sup.GGC, His.sup.CAC, His.sup.CAT, Ile.sup.ATC,
Ile.sup.ATT, Ile.sup.ATA, Leu.sup.CTG, Leu.sup.CTC, Leu.sup.CTA,
Leu.sup.CTT, Leu.sup.TTG, Leu.sup.TTA, Lys.sup.AAG, Lys.sup.AAA,
Phe.sup.TTT, Phe.sup.TTC, Pro.sup.CCC, Pro.sup.CCT, Pro.sup.CCA,
Pro.sup.CCG, Ser.sup.TCG, Ser.sup.TCT, Ser.sup.TCA, Ser.sup.TCC,
Ser.sup.AGC, Ser.sup.AGT, Thr.sup.ACG, Thr.sup.ACC, Thr.sup.ACA,
Thr.sup.ACT, Tyr.sup.TAC, Tyr.sup.TAT, Val.sup.GTG, Val.sup.GTC,
Val.sup.GTT and Val.sup.GTA).
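The figure of 59 follows from the standard genetic code: of the 61 sense codons, only the single codons for methionine (ATG) and tryptophan (TGG) have no synonym. The short Python check below is an illustrative sketch assuming the standard nuclear genetic code; the variable names are hypothetical.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid, ignoring stop codons ('*').
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

# Codons whose amino acid is encoded by two or more synonymous codons.
interrogable = sorted(
    codon
    for codon, aa in CODON_TABLE.items()
    if aa != "*" and degeneracy[aa] >= 2
)

print(len(interrogable))  # 59
print(sorted(aa for aa, n in degeneracy.items() if n == 1))  # ['M', 'W']
```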
[0181] In some embodiments in which the construct system is used
for determining the translational efficiency of synonymous codons,
the reporter polynucleotide encodes an amino acid sequence that
defines, in whole or in part, a reporter protein that, when present
in a cell, is detectable and distinguishable from other
polypeptides present in the cell. A reporter protein may be a
naturally occurring protein or a protein that is not naturally
occurring. Illustrative examples of such reporter proteins include
fluorescent proteins such as green fluorescent protein (gfp), cyan
fluorescent protein (cfp), red fluorescent protein (rfp), or blue
fluorescent protein (bfp), or derivatives of these proteins, or
enzymatic proteins such as chloramphenicol acetyl transferase,
.beta.-galactosidase, .beta.-glucuronidase (GUS), secreted placental
alkaline phosphatase and .beta.-lactamase, chemiluminescent
proteins such as luciferase, and selectable marker proteins
including proteins encoded by antibiotic resistance genes (e.g.,
hygromycin resistance genes, neomycin resistance genes,
tetracycline resistance genes, ampicillin resistance genes,
kanamycin resistance genes, phleomycin resistance genes, herbicide
resistance genes such as the bialophos resistance (BAR) gene that
confers resistance to the herbicide BASTA, bleomycin resistance
genes, geneticin resistance genes, carbenicillin resistance genes,
chloramphenicol resistance genes, puromycin resistance genes,
blasticidin-S-deaminase genes), heavy metal resistance genes, hisD
genes, hypoxanthine phosphoribosyl transferase (HPRT) genes and
guanine phosphoribosyl transferase (Gpt) genes.
[0182] In some embodiments in which the construct system is used
for determining the phenotypic preference of synonymous codons, the
reporter polynucleotide encodes an amino acid sequence that
defines, in whole or in part, a reporter protein that confers upon an
organism of interest or part thereof, either by itself or in
association with other molecules, a selected phenotype or a
phenotype of the same class as the selected phenotype. For example,
the reporter protein may be a phenotype-associated polypeptide
(e.g., a melanoma specific antigen such as BAGE or GAGE-1) that
will be the subject of producing the selected phenotype (e.g.,
immunity to melanoma). Alternatively, the phenotype-associated
polypeptide (e.g., green fluorescent protein or a gastrointestinal
associated antigen such as 17-1A) may not produce the selected
phenotype (e.g., immunity to melanoma) but may produce the same
class of phenotype (e.g., an immune response) as the selected
phenotype. In illustrative examples, the phenotype-associated
polypeptide is selected from antigens including antigens from
pathogenic organisms or cancers (e.g., wherein the phenotype is
immunity to disease) and self antigens or transplantation antigens
(e.g., wherein the phenotype is antigen-specific anergy or
tolerance), growth factors (e.g., wherein the phenotype is selected
from size of the organism or part, wound healing, cell
proliferation, cell differentiation, cell migration, immune cell
function), hormones (e.g., wherein the phenotype is increased
lactation, e.g., using oxytocin, or amelioration of a diabetic
state, e.g., using insulin) and toxins (e.g., wherein the phenotype
is tumour regression or cell death). In specific embodiments, the
selected phenotype or class of phenotype corresponds to a
beneficial or improved or superior state or condition of the
organism or part thereof relative to a reference state or
condition. In illustrative examples, the reference state or
condition corresponds to a pathophysiological state. Phenotypes
contemplated by the present invention include any desirable
beneficial trait including, but not restricted to: immunity (e.g.,
immunity to pathogenic infection or cancer); antigen tolerance
(e.g., antigen-specific T lymphocyte anergy, tolerance to
allergens, transplantation antigens and self antigens);
angiogenesis (e.g., blood vessel formation in the heart and
vasculature and in tumour growths); anti-angiogenesis (e.g.,
treatment of ischaemic heart disease and tumours); amelioration of
clinical symptoms (e.g., fever; inflammation; encephalitis; weight
loss; anaemia; sensory symptoms such as paraesthesia or
hypaesthesia; ataxia; neuralgia; paralysis; vertigo; urinary or
bowel movement abnormalities; and cognitive dysfunction such as memory
loss, impaired attention, problem-solving difficulties, slowed
information processing, and difficulty in shifting between
cognitive tasks); reduced or increased cell death (e.g.,
apoptosis); reduced or increased cell differentiation; reduced or
increased cell proliferation; tumour or cancer regression; growth
and repair of tissue or organ; decreased fibrosis; inhibition or
reversal of cell senescence; increased or reduced cell migration;
differential expression of protein between different cells or
tissues of an organism or part thereof; trauma recovery; recovery
from burns; antibiotic resistance or sensitivity (e.g., resistance
or sensitivity to aminoglycosidic antibiotics such as geneticin and
paromomycin); herbicide tolerance or sensitivity (e.g. tolerance or
sensitivity to glyphosate or glufosinate); starch biosynthesis or
modification (e.g. using a starch branching enzyme, starch
synthases, ADP-glucose pyrophosphorylase); fatty acid biosynthesis
(e.g. using a desaturase or hydroxylase); disease resistance or
tolerance (e.g., resistance to animal diseases such as
cardiovascular disease, autoimmunity, Alzheimer's disease,
Parkinson's disease, diabetes, AIDS etc or resistance to plant
diseases such as rust, dwarfism, rot, smut, mould, scab and
mildew); pest resistance or tolerance including insect resistance
or tolerance (e.g., resistance to borers and worms); viral
resistance or tolerance (e.g. resistance to animal viruses such as
herpesviruses, hepadnaviruses, adenoviruses, flaviviruses,
lentiviruses, poxviruses etc or resistance to plant viruses such as
badnaviruses, caulimoviruses, potyviruses, luteoviruses,
rhabdoviruses etc); fungal resistance or tolerance (e.g.,
resistance to arbuscular mycorrhizal fungi, endophytic fungi etc);
a metabolic trait including sucrose metabolism (e.g., sucrose
isomerisation); frost resistance or tolerance; stress tolerance
(e.g., salt tolerance, drought tolerance); and improved food
content or increased yields. Persons of skill in the art will
recognise that the above exemplary classes of phenotype may be
subdivided into phenotypic subclasses and that such subclasses
would also fall within the scope of phenotypic classes contemplated
by the present invention. For example, subclasses of immunity
include innate immunity (which can be further subdivided inter alia
into complement system, monocytes, macrophages, neutrophils and
natural killer cells), cellular immunity (which can be further
subdivided inter alia into cytolytic T lymphocytes, dendritic cells
and T helper lymphocytes) and humoral immunity (which can be
further subdivided inter alia into antibody subclasses IgA, IgD,
IgE, IgG and IgM).
[0183] In some embodiments, the reporter polynucleotide of
individual synthetic constructs further comprises an ancillary
coding sequence that encodes a detectable tag (e.g., streptavidin,
avidin, an antibody, an antigen, an epitope, a hapten, a protein,
or a fluorescent, chemiluminescent or chemically reactive moiety).
The detectable tag is suitably a member of a specific binding pair,
which includes for example, antibody-antigen (or hapten) pairs,
ligand-receptor pairs, enzyme-substrate pairs, biotin-avidin pairs,
and the like. In illustrative examples of this type, the ancillary
coding sequence of one reporter polynucleotide encodes a first tag
(e.g., a first epitope to which a first antibody binds) and the
ancillary coding sequence of another reporter polynucleotide
encodes a second tag (e.g., a second epitope to which a second
antibody binds), which is detectably distinguishable from the first
tag. In these examples, it is possible to detectably distinguish
the polypeptide products of different reporter polynucleotides in
the same cell or organism of interest or part thereof, thereby
permitting simultaneous determination of translational efficiencies
or phenotypic preferences of different interrogating codons in the
same cell or organism or part.
[0184] In accordance with the present invention, the reporter
polynucleotide is operably linked in the synthetic constructs to a
regulatory sequence. The regulatory sequence suitably comprises
transcriptional and/or translational control sequences, which will
be compatible for expression in the cell or organism of interest.
Typically, the transcriptional and translational regulatory control
sequences include, but are not limited to, a promoter sequence, a
5' non-coding region, a cis-regulatory region such as a functional
binding site for transcriptional regulatory protein or
translational regulatory protein, an upstream open reading frame,
ribosomal-binding sequences, transcriptional start site,
translational start site, and/or nucleotide sequence which encodes
a leader sequence, termination codon, translational stop site and a
3' non-translated region. Constitutive or inducible promoters as
known in the art are contemplated by the invention. The promoters
may be either naturally occurring promoters, or hybrid promoters
that combine elements of more than one promoter. Promoter sequences
contemplated by the present invention may be native to the organism
of interest or may be derived from an alternative source, where the
region is functional in the chosen organism. The choice of promoter
will differ depending on the intended host. For example, promoters
which could be used for expression in plants include plant
promoters such as: constitutive plant promoters examples of which
include CaMV35S plant promoter, CaMV19S plant promoter, FMV34S
plant promoter, sugarcane bacilliform badnavirus plant promoter,
CsVMV plant promoter, Arabidopsis ACT2/ACT8 actin plant promoter,
Arabidopsis ubiquitin UBQ1 plant promoter, barley leaf thionin BTH6
plant promoter, and rice actin plant promoter; tissue specific
plant promoters examples of which include bean phaseolin storage
protein plant promoter, DLEC plant promoter, PHSf3 plant promoter,
zein storage protein plant promoter, conglutin gamma plant promoter
from soybean, AT2S1 gene plant promoter, ACT11 actin plant promoter
from Arabidopsis, napA plant promoter from Brassica napus and
potato patatin gene plant promoter; and inducible plant promoters
examples of which include a light-inducible plant promoter derived
from the pea rbcS gene, a plant promoter from the alfalfa rbcS
gene, DRE, MYC and MYB plant promoters which are active in drought;
INT, INPS, prxEa, Ha hsp17.7G4 and RD21 plant promoters active in
high salinity and osmotic stress, and hsr203J and str246C plant
promoters active in pathogenic stress. Alternatively, promoters
which could be used for expression in mammals include the
metallothionein promoter, which can be induced in response to heavy
metals such as cadmium, the .beta.-actin promoter as well as viral
promoters such as the SV40 large T antigen promoter, human
cytomegalovirus (CMV) immediate early (IE) promoter, Rous sarcoma
virus LTR promoter and adenovirus promoter. An HPV promoter,
particularly the HPV upstream regulatory region (URR), may also be
used. All these promoters are well described and readily available
in the art.
[0185] The synthetic constructs of the present invention may also
comprise a 3' non-translated sequence. A 3' non-translated sequence
refers to that portion of a gene comprising a DNA segment that
contains a polyadenylation signal and any other regulatory signals
capable of effecting mRNA processing or gene expression. The
polyadenylation signal is characterised by effecting the addition
of polyadenylic acid tracts to the 3' end of the mRNA precursor.
Polyadenylation signals are commonly recognised by the presence of
homology to the canonical form 5'-AATAAA-3' although variations are
not uncommon. The 3' non-translated regulatory DNA sequence
preferably includes from about 50 to 1,000 nucleotide base pairs
and may contain transcriptional and translational termination
sequences in addition to a polyadenylation signal and any other
regulatory signals capable of effecting mRNA processing or gene
expression.
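As an illustration of how the canonical polyadenylation signal might be located in a candidate 3' non-translated sequence, the Python sketch below scans for exact occurrences of AATAAA. It is a simplification: the function name find_polya_signals and the example sequence are hypothetical, and the variant signals mentioned above are deliberately not handled.

```python
import re


def find_polya_signals(sequence: str) -> list[int]:
    """Return 0-based start positions of the canonical AATAAA motif.

    Accepts DNA or RNA input (U is treated as T). A lookahead is used so
    that overlapping occurrences would also be reported.
    """
    dna = sequence.upper().replace("U", "T")
    return [m.start() for m in re.finditer(r"(?=AATAAA)", dna)]


# Hypothetical 3' non-translated fragment containing two canonical signals.
utr = "GCTAATAAAGGCTTTAATAAACC"
print(find_polya_signals(utr))  # [3, 15]
```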
[0186] In specific embodiments, the synthetic constructs further
contain a selectable marker gene to permit selection of an organism
or a precursor thereof that contains a synthetic construct.
Selection genes are well known in the art and will be compatible
for expression in the cell or organism of interest, or a progenitor or
precursor thereof.
[0187] In some embodiments, the synthetic constructs of the
invention are in the form of viral vectors, such as simian virus 40
(SV40) or bovine papilloma virus (BPV), which have the ability to
replicate as extra-chromosomal elements (Eukaryotic Viral Vectors,
Cold Spring Harbor Laboratory, Gluzman ed., 1982; Sarver et al.,
1981, Mol. Cell. Biol. 1:486). Viral vectors include retroviral
(lentivirus), adeno-associated virus (see, e.g., Okada, 1996, Gene
Ther. 3:957-964; Muzyczka, 1994, J. Clin. Invest. 94:1351; U.S. Pat.
Nos. 6,156,303; 6,143,548; 5,952,221, describing AAV vectors; see
also U.S. Pat. Nos. 6,004,799; 5,833,993), adenovirus (see, e.g.,
U.S. Pat. Nos. 6,140,087; 6,136,594; 6,133,028; 6,120,764),
reovirus, herpesvirus, rotavirus genomes etc., modified for
introducing and directing expression of a polynucleotide or
transgene in cells. Retroviral vectors can include those based upon
murine leukemia virus (see, e.g., U.S. Pat. No. 6,132,731), gibbon
ape leukemia virus (see, e.g., U.S. Pat. No. 6,033,905), simian
immuno-deficiency virus, human immuno-deficiency virus (see, e.g.,
U.S. Pat. No. 5,985,641), and combinations thereof.
[0188] Vectors also include those that efficiently deliver genes to
animal cells in vivo (e.g., stem cells) (see, e.g., U.S. Pat. Nos.
5,821,235 and 5,786,340; Croyle et al., 1998, Gene Ther. 5:645;
Croyle et al., 1998, Pharm. Res. 15:1348; Croyle et al., 1998, Hum.
Gene Ther. 9:561; Foreman et al., 1998, Hum. Gene Ther. 9:1313;
Wirtz et al., 1999, Gut 44:800). Adenoviral and adeno-associated
viral vectors suitable for in vivo delivery are described, for
example, in U.S. Pat. Nos. 5,700,470, 5,731,172 and 5,604,090.
Additional vectors suitable for in vivo delivery include herpes
simplex virus vectors (see, e.g., U.S. Pat. No. 5,501,979),
retroviral vectors (see, e.g., U.S. Pat. Nos. 5,624,820, 5,693,508
and 5,674,703; and WO92/05266 and WO92/14829), bovine papilloma
virus (BPV) vectors (see, e.g., U.S. Pat. No. 5,719,054), CMV-based
vectors (see, e.g., U.S. Pat. No. 5,561,063) and parvovirus,
rotavirus and Norwalk virus vectors. Lentiviral vectors are useful
for infecting dividing as well as non-dividing cells (see, e.g.,
U.S. Pat. No. 6,013,516).
[0189] Vectors for insect cell expression commonly use recombinant
variations of baculoviruses and other nucleopolyhedroviruses, e.g.,
Bombyx mori nucleopolyhedrovirus vectors (see, e.g., Choi, 2000,
Arch. Virol. 145:171-177). For example, Lepidopteran and
Coleopteran cells are used to replicate baculoviruses to promote
expression of foreign genes carried by baculoviruses, e.g.,
Spodoptera frugiperda cells are infected with recombinant
Autographa californica nuclear polyhedrosis viruses (AcNPV)
carrying a heterologous, e.g., a human, coding sequence (see, e.g.,
Lee, 2000, J. Virol. 74:11873-11880; Wu, 2000, J. Biotechnol.
80:75-83). See, e.g., U.S. Pat. No. 6,143,565, describing use of
the polydnavirus of the parasitic wasp Glyptapanteles indiensis to
stably integrate nucleic acid into the genome of Lepidopteran and
Coleopteran insect cell lines. See also, U.S. Pat. Nos. 6,130,074;
5,858,353; 5,004,687.
[0190] Expression vectors capable of expressing proteins in plants
are well known in the art, and include, e.g., vectors from
Agrobacterium spp., potato virus X (see, e.g., Angell, 1997, EMBO
J. 16:3675-3684), tobacco mosaic virus (see, e.g., Casper, 1996,
Gene 173:69-73), tomato bushy stunt virus (see, e.g., Hillman,
1989, Virology 169:42-50), tobacco etch virus (see, e.g., Dolja,
1997, Virology 234:243-252), bean golden mosaic virus (see, e.g.,
Morinaga, 1993, Microbiol Immunol. 37:471-476), cauliflower mosaic
virus (see, e.g., Cecchini, 1997, Mol. Plant. Microbe Interact.
10:1094-1101), maize Ac/Ds transposable element (see, e.g., Rubin,
1997, Mol. Cell. Biol. 17:6294-6302; Kunze, 1996, Curr. Top.
Microbiol. Immunol. 204:161-194), and the maize suppressor-mutator
(Spm) transposable element (see, e.g., Schlappi, 1996, Plant Mol.
Biol. 32:717-725); and derivatives thereof.
[0191] The invention further contemplates cells or organisms
containing therein the synthetic constructs of the invention, or
alternatively, parts, precursors, cells or tissues produced by the
methods described herein. In this regard, it will be appreciated
that the construct system of the present invention is applicable to
prokaryotic as well as eukaryotic hosts and includes for example
unicellular organisms and multicellular organisms, such as but not
limited to yeast, plants and animals including vertebrate animals
such as mammals, reptiles, fish, birds etc as well as invertebrate
animals such as metazoa, sponges, worms, molluscs, nematodes,
crustaceans, echinoderms etc. In certain embodiments, the construct
system is used to determine the translational efficiency of
different synonymous codons in plant cells or animal cells or to
determine the phenotypic preference of different synonymous codons
in plants and mammals.
[0192] Illustrative examples of eukaryotic organisms include, but
are not limited to, fungi such as yeast and filamentous fungi,
including species of Aspergillus, Trichoderma, and Neurospora;
animal hosts including vertebrate animals illustrative examples of
which include fish (e.g., salmon, trout, tilapia, tuna, carp,
flounder, halibut, swordfish, cod and zebrafish), birds (e.g.,
chickens, ducks, quail, pheasants and turkeys, and other jungle
fowl or game birds) and mammals (e.g., dogs, cats, horses, cows,
buffalo, deer, sheep, rabbits, rodents such as mice, rats, hamsters
and guinea pigs, goats, pigs, primates, marine mammals including
dolphins and whales, as well as cell lines, such as human or other
mammalian cell lines of any tissue or stem cell type (e.g., COS,
NIH 3T3, CHO, BHK, 293, or HeLa cells), and stem cells, including
pluripotent and non-pluripotent and embryonic stem cells, and
non-human zygotes), as well as invertebrate animals illustrative
examples of which include nematodes (representative genera of
which include those that infect animals such as but not limited to
Ancylostoma, Ascaridia, Ascaris, Bunostomum, Caenorhabditis,
Capillaria, Chabertia, Cooperia, Dictyocaulus, Haemonchus,
Heterakis, Nematodirus, Oesophagostomum, Ostertagia, Oxyuris,
Parascaris, Strongylus, Toxascaris, Trichuris, Trichostrongylus,
Trichonema, Toxocara, Uncinaria, and those that infect plants such
as but not limited to Bursaphelenchus, Criconemella,
Ditylenchus, Globodera, Helicotylenchus, Heterodera, Longidorus,
Meloidogyne, Nacobbus, Paratylenchus, Pratylenchus, Radopholus,
Rotylenchus, Tylenchus, and Xiphinema) and other worms,
Drosophila, and other insects (such as from the families Apidae,
Curculionidae, Scarabaeidae, Tephritidae, Tortricidae, amongst
others, representative orders of which include Coleoptera, Diptera,
Lepidoptera, and Homoptera).
[0193] In certain embodiments, the construct system is used to
determine the translational efficiency or phenotypic preference of
different synonymous codons in plants or plant cells (e.g., a plant
that is suitably selected from monocotyledons, dicotyledons and
gymnosperms). The plant may be an ornamental plant or crop plant.
Illustrative examples of ornamental plants include, but are not
limited to, Malus spp., Crataegus spp., Rosa spp., Betula spp., Sorbus
spp., Olea spp., Nerium spp., Salix spp. and Populus spp. Illustrative
examples of crop plants include plant species which are cultivated
in order to produce a harvestable product such as, but not limited
to, Abelmoschus esculentus (okra), Acacia spp., Agave fourcroydes
(henequen), Agave sisalana (sisal), Albizia spp., Allium fistulosum
(bunching onion), Allium sativum (garlic), Allium spp. (onions),
Alpinia galanga (greater galanga), Amaranthus caudatus, Amaranthus
spp., Anacardium spp. (cashew), Ananas comosus (pineapple), Anethum
graveolens (dill), Annona cherimola (cherimoya), Apios americana
(American potatobean), Arachis hypogaea (peanut), Arctium spp.
(burdock), Artemisia spp. (wormwood), Aspalathus linearis (redbush
tea), Athertonia diversifolia, Atriplex nummularia (old man
saltbush), Averrhoa carambola (starfruit), Azadirachta indica
(neem), Backhousia spp., Bambusa spp. (bamboo), Beta vulgaris
(sugar beet), Boehmeria nivea (ramie), bok choy, Boronia megastigma
(sweet boronia), Brassica carinata (Abyssinian mustard), Brassica
juncea (Indian mustard), Brassica napus (rapeseed), Brassica
oleracea (cabbage, broccoli), Brassica oleracea var. alboglabra (gai
lum), Brassica parachinensis (choi sum), Brassica pekinensis (Wong
bok or Chinese cabbage), Brassica spp., Burcella obovata, Cajanus
cajan (pigeon pea), Camellia sinensis (tea), Cannabis sativa
(non-drug hemp), Capsicum spp., Carica spp. (papaya), Carthamus
tinctorius (safflower), Carum carvi (caraway), Cassinia spp.,
Castanospermum australe (blackbean), Casuarina cunninghamiana
(beefwood), Ceratonia siliqua (carob), Chamaemelum nobile
(chamomile), Chamelaucium spp. (Geraldton wax), Chenopodium quinoa
(quinoa), Chrysanthemum (Tanacetum) cinerariifolium (pyrethrum),
Cicer arietinum (chickpea), Cichorium intybus (chicory), Clematis
spp., Clianthus formosus (Sturt's desert pea), Cocos nucifera
(coconut), Coffea spp. (coffee), Colocasia esculenta (taro),
Coriandrum sativum (coriander), Crambe abyssinica (crambe), Crocus
sativus (saffron), Cucurbita foetidissima (buffalo gourd),
Cucurbita spp. (gourd), Cyamopsis tetragonoloba (guar), Cymbopogon
spp. (lemongrass), Cytisus proliferus (tagasaste), Daucus carota
(carrot), Desmanthus spp., Dioscorea esculenta (Asiatic yam),
Dioscorea spp. (yams), Diospyros spp. (persimmon), Doronicum sp.,
Echinacea spp., Eleocharis dulcis (water chestnut), Eleusine
coracana (finger millet), Eragrostis tef (tef), Erianthus
arundinaceus, Eriobotrya japonica (loquat), Eucalyptus spp.,
Eucalyptus spp. (oil mallee), Euclea spp., Eugenia
malaccensis (jumba), Euphorbia spp., Euphoria longana (longan),
Eutrema wasabi (wasabi), Fagopyrum esculentum (buckwheat), Festuca
arundinacea (tall fescue), Ficus spp. (fig), Flacourtia inermis,
Flindersia brayleyana (Queensland maple), Foeniculum olearia,
Foeniculum vulgare (fennel), Garcinia mangostana (mangosteen),
Glycine latifolia, Glycine max (soybean), Glycine max (vegetable
soybean), Glycyrrhiza glabra (licorice), Gossypium spp. (cottons),
Grevillea spp., Grindelia spp., Guizotia abyssinica (niger),
Harpagophyllum sp., Helianthus annuus (high oleic sunflowers),
Helianthus annuus (monosun sunflowers), Helianthus tuberosus
(Jerusalem artichoke), Hibiscus cannabinus (kenaf), Hordeum
bulbosum, Hordeum spp. (waxy barley), Hordeum vulgare (barley),
Hordeum vulgare subsp. spontaneum, Humulus lupulus (hops),
Hydrastis canadensis (golden seal), Hymenachne spp., Hyssopus
officinalis (hyssop), Indigofera spp., Inga edulis (ice cream
bean), Inocarpus fagifer, Ipomoea batatas (sweet potato), Ipomoea
sp. (kang kong), Lablab purpureus (white lablab), Lactuca spp.
(lettuce), Lathyrus spp. (vetch), Lavandula spp. (lavender), Lens
spp. (lentil), Lesquerella spp. (bladderpod), Leucaena spp., Lilium
spp., Limnanthes spp. (meadowfoam), Linum usitatissimum (flax),
Linum usitatissimum (linseed), Linum usitatissimum (Linola.TM.),
Litchi chinensis (lychee), Lotus corniculatus (birdsfoot trefoil),
Lotus pedunculatus, Lotus sp., Luffa spp., Lunaria annua (honesty),
Lupinus mutabilis (pearl lupin), Lupinus spp. (lupin), Macadamia
spp., Mangifera indica (mango), Manihot esculenta (cassava),
Medicago spp. (lucerne), Medicago spp., Melaleuca spp. (tea tree),
Melaleuca uncinata (broombush), Mentha tasmannia, Mentha spicata
(spearmint), Mentha X piperita (peppermint), Momordica charantia
(bitter melon), Musa spp. (banana), Myrciaria cauliflora
(jaboticaba), Myrothamnus flabellifolia, Nephelium lappaceum
(rambutan), Nerine spp., Ocimum basilicum (basil), Oenanthe
javanica (water dropwort), Oenothera biennis (evening primrose),
Olea europaea (olive), Olearia sp., Origanum spp. (marjoram,
oregano), Oryza spp. (rice), Oxalis tuberosa (oca), Ozothamnus spp.
(rice flower), Pachyrrhizus ahipa (yam bean), Panax spp. (ginseng),
Panicum miliaceum (common millet), Papaver spp. (poppy), Parthenium
argentatum (guayule), Passiflora sp., Paulownia tomentosa (princess
tree), Pelargonium graveolens (rose geranium), Pelargonium sp.,
Pennisetum americanum (bulrush or pearl millet), Persoonia spp.,
Petroselinum crispum (parsley), Phacelia tanacetifolia (tansy),
Phalaris canariensis (canary grass), Phalaris sp., Phaseolus
coccineus (scarlet runner bean), Phaseolus lunatus (lima bean),
Phaseolus spp., Phaseolus vulgaris (culinary bean), Phaseolus
vulgaris (navy bean), Phaseolus vulgaris (red kidney bean), Pisum
sativum (field pea), Plantago ovata (psyllium), Polygonum minus,
Polygonum odoratum, Prunus mume (Japanese apricot), Psidium guajava
(guava), Psophocarpus tetragonolobus (winged bean), Pyrus spp.
(nashi), Raphanus sativus (long white radish or Daikon), Rhagodia
spp. (saltbush), Ribes nigrum (black currant), Ricinus communis
(castor bean), Rosmarinus officinalis (rosemary), Rungia klossii
(rungia), Saccharum officinarum (sugar cane), Salvia officinalis
(sage), Salvia sclarea (clary sage), Salvia sp., Sandersonia sp.,
Santalum acuminatum (sweet quandong), Santalum spp. (sandalwood),
Sclerocarya caffra (marula), Scutellaria galericulata (scullcap),
Secale cereale (rye), Sesamum indicum (sesame), Setaria italica
(foxtail millet), Simmondsia spp. (jojoba), Solanum spp., Sorghum
almum (sorghum), Stachys betonica (wood betony), Stenanthemum
scortechenii, Strychnos cocculoides (monkey orange), Stylosanthes
spp. (stylo), Syzygium spp., Tasmannia lanceolata (mountain
pepper), Terminalia karnbachii, Theobroma cacao (cocoa), Thymus
vulgaris (thyme), Toona australis (red cedar), Trifolium spp.
(clovers), Trifolium alexandrinum (berseem clover), Trifolium
resupinatum (persian clover), Triticum spp., Triticum tauschii,
Tylosema esculentum (morama bean), Valeriana sp. (valerian),
Vernonia spp., Vetiveria zizanioides (vetiver grass), Vicia
benghalensis (purple vetch), Vicia faba (faba bean), Vicia
narbonensis (narbon bean), Vicia sativa, Vicia spp., Vigna
aconitifolia (mothbean), Vigna angularis (adzuki bean), Vigna mungo
(black gram), Vigna radiata (mung bean), Vigna spp., Vigna
unguiculata (cowpea), Vitis spp. (grapes), Voandzeia subterranea
(bambarra groundnut), Triticosecale (triticale), Zea mays (bicolour
sweetcorn), Zea mays (maize), Zea mays (sweet corn), Zea mays
subsp. mexicana (teosinte), Zieria spp., Zingiber officinale
(ginger), Zizania spp. (wild rice), Ziziphus jujuba (common
jujube). Desirable crops for the practice of the present invention
include Nicotiana tabacum (tobacco) and horticultural crops such
as, for example, Ananas comosus (pineapple), Saccharum spp. (sugar
cane), Musa spp. (banana), Lycopersicon esculentum (tomato) and
Solanum tuberosum (potato).
[0194] The synthetic constructs of the present invention may be
introduced directly ex vivo or in cell culture into a cell of
interest or into an organism of interest or into one or more
parts of an organism of interest, e.g., cell or tissue types (e.g.,
a muscle, skin, brain, lung, kidney, pancreas, a reproductive organ
such as testes, ovaries and breast, eye, liver, heart, vascular
cell, root, leaf, flower, stalk or meristem) or into an organ of an
organism of interest. Alternatively, the synthetic constructs are
introduced into a progenitor of a cell or organism of interest and
the progenitor is then grown or cultured for a time and under
conditions sufficient to differentiate into the cell of interest or
produce the organism of interest, whereby the synthetic construct
is contained in the cell of interest or one or more cell types of
the organism of interest. Suitable progenitor cells include, but
are not limited to, stem cells such as embryonic stem cells,
pluripotential immune cells, meristematic cells and embryonic
callus. In certain embodiments, the synthetic construct is
introduced into the organism of interest using a particular route
of administration (e.g., for mammals, by an oral, parenteral
(e.g., intravenous, intramuscular, intraperitoneal,
intraventricular, intraarticular), mucosal (e.g., intranasal,
intrapulmonary, oral, buccal, sublingual, rectal, intravaginal) or
dermal (topical, subcutaneous, transdermal) route; for plants, by
administration to flowers, meristem, root, leaves or stalk).
Practitioners in the art will recognise that the route of
administration will differ depending on the choice of organism of
interest and the sought-after phenotype. In some embodiments
relating to determination of phenotypic preference, the synthetic
constructs are suitably introduced into the same or corresponding
site of the organism or part thereof. In other embodiments, the
synthetic constructs are introduced into a cell of the organism of
interest (e.g., autologous cells), or into a cell that is
compatible with the organism of interest (e.g., syngeneic or
allogeneic cells) and the genetically-modified cell so produced is
introduced into the organism of interest at a selected site or into
a part of that organism.
[0195] The synthetic constructs of the present invention may be
introduced into a cell or organism of interest or part thereof
using any suitable method, and the kind of method employed will
differ depending on the intended cell type, part and/or organism of
interest. For example, four general classes of methods for
delivering nucleic acid molecules into cells have been described:
(1) chemical methods such as calcium phosphate precipitation,
polyethylene glycol (PEG)-mediated precipitation and lipofection;
(2) physical methods such as microinjection, electroporation,
acceleration methods and vacuum infiltration; (3) vector-based
methods such as bacterial and viral vector-mediated transformation;
and (4) receptor-mediated methods. Transformation techniques that fall
within these and other classes are well known to workers in the
art, and new techniques are continually becoming known. The
particular choice of a transformation technology will be determined
by its efficiency to transform certain host species as well as the
experience and preference of the person practising the invention
with a particular methodology of choice. It will be apparent to the
skilled person that the particular choice of a transformation
system to introduce a synthetic construct of the invention into
cells is not essential to or a limitation of the invention,
provided it achieves an acceptable level of nucleic acid transfer.
Thus, the synthetic constructs are introduced into tissues or host
cells by any number of routes, including viral infection, phage
infection, microinjection, electroporation, or fusion of vesicles,
lipofection, infection by Agrobacterium tumefaciens or A.
rhizogenes, or protoplast fusion. Jet injection may also be used
for intra-muscular administration (as described for example by
Furth et al., 1992, Anal Biochem 205:365-368). The synthetic
constructs may be coated onto microprojectiles, and delivered into
a host cell or into tissue by a particle bombardment device, or
"gene gun" (see, for example, Tang et al., 1992, Nature
356:152-154). Alternatively, the synthetic constructs can be fed
directly to, or injected into, a host organism, or they may be
introduced into a cell (i.e., intracellularly) or introduced
extracellularly into a cavity, interstitial space, into the
circulation of an organism, introduced orally, etc. Methods for
oral introduction include direct mixing of the synthetic constructs
with food of the organism. In certain embodiments, a hydrodynamic
nucleic acid administration protocol is employed (e.g., see Chang
et al., 2001, J. Virol. 75:3469-3473; Liu et al., 1999, Gene Ther.
6:1258-1266; Wolff et al., 1990, Science 247:1465-1468; Zhang et
al., 1999, Hum. Gene Ther. 10:1735-1737; and Zhang et al., 1999,
Gene Ther. 7:1344-1349). Other methods of nucleic acid delivery
include, but are not limited to, liposome-mediated transfer, naked
DNA delivery (direct injection) and receptor-mediated transfer
(ligand-DNA complex).
4. Methods of Determining the Translational Efficiency or
Phenotypic Preference of Synonymous Codons
[0196] The construct system of the present invention can be used to
compare the translational efficiency of different synonymous codons
in cells of a particular type or to compare the translational
efficiency of individual synonymous codons between different types
of cells. Not wishing to be bound by any one particular theory or
mode of operation, it is believed that the levels of reporter
protein produced in a cell of interest from individual synthetic
constructs are sensitive to the intracellular abundance of the
iso-tRNA species corresponding to the interrogating codon(s) in the
corresponding coding sequences and, therefore, provide a direct
correlation of a cell's preference for or efficiency in translating
a given codon. This means, for example, that if the level of the
reporter protein obtained in a cell of the same type as a cell of
interest, to which a synthetic construct having at least one first
interrogating codon is provided, is higher than the level produced
in a cell of the same type as the cell of interest, to which
another synthetic construct having at least one second
interrogating codon is provided (i.e., wherein the first
interrogating codon(s) is (are) different from, but synonymous
with, the second interrogating codon(s)), then it can be deduced
that the first interrogating codon has a higher translational
efficiency than the second interrogating codon in the cell of
interest. Methods for measuring reporter protein levels are
well-known in the art and include, but are not limited to,
immunoassays such as Western blotting, ELISA, and RIA assays,
chemiluminescent protein assays such as luciferase assays,
enzymatic assays such as assays that measure β-galactosidase
or chloramphenicol acetyl transferase (CAT) activity as well as
fluorometric assays that measure fluorescence associated with a
fluorescent protein. In some embodiments, the different synthetic
constructs are separately introduced into different cells. In other
embodiments, the different synthetic constructs are introduced into
the same cell (e.g., when the reporter polynucleotides comprise
ancillary coding sequences that encode a tag, as described
herein).
[0197] With regard to differential expression of the reporter
polynucleotide between different cell types, it will be appreciated
that if the level of the reporter protein obtained in a first cell
type to which a synthetic construct having at least one
interrogating codon is provided is higher than the level obtained
in a second cell type to which the same synthetic construct is
provided, then it can be deduced that the interrogating codon has a
higher translational efficiency in the first cell type than in the
second cell type.
[0198] The translational efficiencies of different synonymous
codons so determined are then typically compared to provide a
ranked order of individual synonymous codons according to their
preference for translation in the cell or cells of interest. One of
ordinary skill in the art will thereby be able to determine a
"codon translational efficiency table" for each amino acid.
Comparison of synonymous codons within a codon translational
efficiency table can then be used to identify codons for tailoring
a synthetic polynucleotide to modulate the level of an encoded
polypeptide that is expressed in a cell type of interest or to
differentially express an encoded polypeptide between different
cell types.
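By way of illustration only, the ranking step described above can be sketched as follows. The reporter levels, the dictionary layout and the function name efficiency_table are hypothetical and are not taken from the present specification; any suitable measure of reporter protein level could be substituted.

```python
from collections import defaultdict

# Hypothetical reporter-protein levels (arbitrary units), keyed by
# (amino acid, interrogating codon). The numbers are illustrative only.
reporter_levels = {
    ("Ala", "GCA"): 1.2,
    ("Ala", "GCC"): 3.4,
    ("Ala", "GCG"): 0.6,
    ("Ala", "GCT"): 2.1,
    ("Pro", "CCA"): 0.9,
    ("Pro", "CCC"): 2.7,
}

def efficiency_table(levels):
    """Group codons by amino acid and rank them from highest to lowest level."""
    grouped = defaultdict(list)
    for (amino_acid, codon), level in levels.items():
        grouped[amino_acid].append((codon, level))
    return {
        amino_acid: sorted(entries, key=lambda entry: entry[1], reverse=True)
        for amino_acid, entries in grouped.items()
    }

table = efficiency_table(reporter_levels)
for amino_acid, ranked in table.items():
    print(amino_acid, [codon for codon, _ in ranked])
```

Each entry of the resulting table lists the synonymous codons for one amino acid in descending order of measured reporter level, which is one possible representation of a codon translational efficiency table.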
[0199] In other embodiments, the construct system is used to
compare the preference of different synonymous codons for producing
a selected phenotype in an organism of interest or part thereof
(i.e., "phenotypic preference"). In these embodiments, the
synthetic constructs are used to determine the influence of the
interrogating codon(s) on the phenotype or class of phenotype
displayed by the organism or part in response to the
phenotype-associated protein produced by those synthetic
constructs. This means, for example, that if the quality of the
phenotype displayed by the organism or part to which a synthetic
construct having at least one first interrogating codon is provided
is higher than the quality of the phenotype displayed by the
organism or part to which a synthetic construct having at least one
second interrogating codon is provided (i.e., wherein the first
interrogating codon is different than, but synonymous with, the
second interrogating codon), then it can be deduced that the
organism of interest or part thereof has a higher preference for
the first interrogating codon than the second interrogating codon
with respect to the quality of the phenotype produced. Put another
way, the first interrogating codon has a higher phenotypic
preference than the second interrogating codon in the organism of
interest or part thereof.
[0200] In accordance with the present invention, individual
synthetic constructs are introduced into test organisms which are
preferably selected from organisms of the same species as the
organism of interest or organisms that are related to the organism
of interest, or into test parts of such organisms. Related
organisms are generally species within the same phylum, preferably
species within the same subphylum, more preferably species within
the same superclass, even more preferably species within the same class,
even more preferably species within the same order and still even
more preferably species within the same genus. For example, if the
organism of interest is human, a related species is suitably
selected from mouse, cow, dog or cat, which belong to the same
class as human, or a chimpanzee, which belongs to the same order as
human. Alternatively, if the organism of interest is banana, the
related organism may be selected from taro, ginger, onions, garlic,
pineapple, bromeliads, palms, orchids, lilies, irises and the
like, which are all non-graminaceous monocotyledonous plants and
which constitute horticultural or botanical relatives.
[0201] After introduction of the synthetic constructs into the test
organisms or parts, the qualities of their phenotypes are
determined by a suitable assay and then compared to determine the
relative phenotypic preferences of the synonymous codons. The
quality is suitably a measure of the strength, intensity or grade
of the phenotype, or the relative strength, intensity or grade of
two or more desired phenotypic traits. Assays for various
phenotypes conferred by the production of a chosen reporter protein
are known by those of skill in the art. For example, immunity may
be assayed by any suitable method that detects an increase in an
animal's capacity to respond to foreign or disease-specific
antigens (e.g., cancer antigens), i.e., those cells primed to attack
such antigens are increased in number, activity, and ability to
detect and destroy those antigens. Strength of immune response
is measured by standard tests including: direct measurement of
peripheral blood lymphocytes by means known to the art; natural
killer cell cytotoxicity assays (see, e.g., Provinciali et al.,
1992, J. Immunol. Meth. 155: 19-24); cell proliferation assays
(see, e.g., Vollenweider and Groscurth, 1992, J. Immunol. Meth. 149:
133-135); immunoassays of immune cells and subsets (see, e.g.,
Loeffler et al., 1992, Cytom. 13: 169-174; Rivoltini et al., 1992,
Can. Immunol. Immunother. 34: 241-251); or skin tests for
cell-mediated immunity (see, e.g., Chang et al., 1993, Cancer Res.
53: 1043-1050). Enhanced immune response is also indicated by
physical manifestations such as fever and inflammation, as well as
healing of systemic and local infections, and reduction of symptoms
in disease, i.e., decrease in tumour size, alleviation of symptoms
of a disease or condition including, but not restricted to,
leprosy, tuberculosis, malaria, aphthous ulcers, herpetic and
papillomatous warts, gingivitis, atherosclerosis, the concomitants
of AIDS such as Kaposi's sarcoma, bronchial infections, and the
like. Such physical manifestations may also be used to detect, or
define the quality of, the phenotype or class of phenotype
displayed by an organism. Alternatively, herbicide tolerance may be
assayed by treating test organisms (e.g., plants such as cotton
plants), which express a herbicide tolerance gene (e.g., glyphosate
tolerance protein gene such as a glyphosate resistant EPSP
synthase), with a herbicide (e.g., glyphosate) and determining the
efficacy of herbicide tolerance displayed by the plants. For
example, when determining the efficacy of synthetic constructs for
conferring herbicide tolerance in cotton, the amount of boll
retention is a measure of efficacy and is a desirable trait.
[0202] The qualities of the selected phenotype displayed by the test
organisms or by the test parts are then compared to provide a
ranked order of the individual synonymous codons according to their
preference of usage by the organism or part to confer the selected
phenotype. One of ordinary skill in the art will thereby be able to
determine a "codon preference table" for each amino acid in the
polypeptide whose expression conveys the selected phenotype to the
organism of interest. Comparison of synonymous codons within a
codon preference table can then be used to identify codons for
tailoring a synthetic polynucleotide to modulate the quality of a
selected phenotype.
5. Codon Modification of Polynucleotides
[0203] The construct system of the present invention can thus be
used to provide a comparison of translational efficiencies for
synonymous codons in a cell of interest or a comparison of
phenotypic preferences for synonymous codons in an organism of
interest or in a related organism, or in parts thereof. These
comparisons can then be used as a basis for constructing a
synthetic or `codon modified` polynucleotide which differs from a
parent or reference polynucleotide by the substitution of at least
one `replaceable` codon (also referred to herein as "a first
codon") in the parent polynucleotide with a synonymous codon that
has a different translational efficiency or different phenotypic
preference than the replaceable codon.
[0204] 5.1 Modifications Based on Synonymous Codons with Different
Translational Efficiencies
[0205] In some embodiments, the synthetic polynucleotide is
constructed so that it produces an encoded polypeptide in a cell of
interest at a different level than that produced from a parent
polynucleotide. The method comprises selecting a replaceable codon
of the parent polynucleotide for replacement with a synonymous
codon, wherein the synonymous codon is selected on the basis that
it exhibits a different translational efficiency than the
replaceable codon in a comparison of translational efficiencies in
the cell of interest, as determined, for example, in Section 4. The
replaceable codon is then replaced with the synonymous codon to
construct the synthetic polynucleotide.
[0206] Synonymous codons can thus be selected to increase or
decrease the level of polypeptide that is produced in a cell of
interest. For example, when it is desired to increase the level of
polypeptide that is produced in the cell, it is generally desirable
to use a synonymous codon whose translational efficiency is at
least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%,
65%, 70%, 75%, 80%, 85%, 90% or 95% higher or at least about 2, 3,
4, 5, 6, 7, 8, 9, 10, 50 or 100 times higher than the translational
efficiency of the replaceable codon. Alternatively, when it is
desired to decrease the level of polypeptide that is produced in
the cell, it is generally desirable to use a synonymous codon whose
translational efficiency is no more than 95%, 90%, 85%, 80%, 75%,
70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%,
5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of the translational efficiency
of the replaceable codon.
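As a non-limiting sketch of the selection criteria described above, the following assumes that relative translational efficiencies have already been measured for a set of synonymous codons. The efficiency values and the function names higher_efficiency_codons and lower_efficiency_codons are hypothetical; the default thresholds correspond to the "at least about 10% higher" and "no more than 95%" examples.

```python
def higher_efficiency_codons(efficiencies, replaceable, min_fold=1.10):
    """Codons at least `min_fold` times as efficient as the replaceable codon."""
    base = efficiencies[replaceable]
    if base <= 0:
        raise ValueError("reference efficiency must be positive")
    chosen = [c for c, e in efficiencies.items()
              if c != replaceable and e >= min_fold * base]
    # Most efficient candidates first.
    return sorted(chosen, key=lambda c: efficiencies[c], reverse=True)

def lower_efficiency_codons(efficiencies, replaceable, max_fraction=0.95):
    """Codons with no more than `max_fraction` of the replaceable codon's efficiency."""
    base = efficiencies[replaceable]
    if base <= 0:
        raise ValueError("reference efficiency must be positive")
    chosen = [c for c, e in efficiencies.items()
              if c != replaceable and e <= max_fraction * base]
    # Least efficient candidates first.
    return sorted(chosen, key=lambda c: efficiencies[c])

# Hypothetical relative efficiencies for alanine codons in a cell of interest.
ala = {"GCA": 1.0, "GCC": 3.0, "GCG": 0.5, "GCT": 1.05}
higher = higher_efficiency_codons(ala, "GCA")
lower = lower_efficiency_codons(ala, "GCA")
print(higher, lower)
```

In this sketch GCT is excluded from both lists because its efficiency is only 5% higher than that of the replaceable codon GCA, which falls short of the 10% threshold for an increase and exceeds the 95% ceiling for a decrease.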
[0207] Generally, the difference in level of polypeptide produced
in the cell from a synthetic polynucleotide as compared to that
produced from a parent polynucleotide depends on the number of
replaceable codons that are replaced by synonymous codons, and on
the difference in translational efficiencies between the
replaceable codons and the synonymous codons in the cell of
interest. Put another way, the fewer such replacements, and/or the
smaller the difference in translational efficiencies between the
synonymous and replaceable codons, the smaller the difference will
be in protein production between the synthetic polynucleotide and
parent polynucleotide. Conversely, the more such replacements,
and/or the greater the difference in translational efficiencies
between the synonymous and replaceable codons, the greater the
difference will be in protein production between the synthetic
polynucleotide and parent polynucleotide.
[0208] Accordingly, when it is desired to increase or decrease the
level of polypeptide produced in the cell of interest, it is
generally desirable but not necessary to replace all the
replaceable codons of the parent polynucleotide with synonymous
codons having higher or lower translational efficiencies in the
cell of interest, as the case may be, than the replaceable codons.
Changes in expression can be accomplished even with partial
replacement. Typically, the replacement step affects at least about
5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%,
95%, 99% or more of the replaceable codons of the parent
polynucleotide. Suitably, the number of, and difference in
translational efficiency between, the replaceable codons and the
synonymous codons are selected such that the chosen polypeptide is
produced from the synthetic polynucleotide in the cell at a level
which is at least about 10%, 15%, 20%, 25%, 30%,
35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95%
higher than, or even at least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 50
or 100 times higher than, or no more than 95%, 90%, 85%, 80%, 75%,
70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%,
5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of, the level at which the
polypeptide is produced from the parent polynucleotide in the cell.
In the case of two or more synonymous codons having similar
translational efficiencies, it will be appreciated that any one of
these codons can be used to replace the replaceable codon.
Generally, if a parent polynucleotide has a choice of low and
intermediate translational efficiency codons, it is preferable in
the first instance to replace some, or more preferably all, of the
low translational efficiency codons with synonymous codons having
intermediate, or preferably high, translational efficiencies when
higher production of polypeptide is required. Typically,
replacement of low with intermediate or high translational
efficiency codons results in a substantial increase in the level of
polypeptide produced by the synthetic polynucleotide so
constructed. However, it is also preferable to replace some, or
preferably all, of the intermediate translational efficiency codons
with high translationally efficient codons for conferring an
optimal production of the encoded polypeptide.
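The replacement step can be illustrated, purely by way of example, with the following sketch. The parent sequence, the replacement map and the function name codon_modify are hypothetical, and the codon-to-amino-acid map is deliberately partial, covering only the codons used in the example. The check that each replacement is synonymous is an assumption added for safety and is not a feature recited above.

```python
# Partial codon-to-amino-acid map covering only the codons used below.
CODON_TO_AA = {
    "ATG": "Met", "GCG": "Ala", "GCC": "Ala",
    "CCA": "Pro", "CCC": "Pro", "TAA": "Stop",
}

def codon_modify(parent, replacements):
    """Replace codons of an in-frame parent sequence with synonymous codons.

    Returns the synthetic sequence and the number of codons replaced.
    """
    if len(parent) % 3 != 0:
        raise ValueError("parent sequence length must be a multiple of 3")
    for old, new in replacements.items():
        if CODON_TO_AA[old] != CODON_TO_AA[new]:
            raise ValueError(f"{old} -> {new} is not a synonymous replacement")
    codons = [parent[i:i + 3] for i in range(0, len(parent), 3)]
    modified = [replacements.get(codon, codon) for codon in codons]
    replaced = sum(1 for a, b in zip(codons, modified) if a != b)
    return "".join(modified), replaced

# Hypothetical parent sequence: Met-Ala-Pro-Ala-Stop.
parent = "ATGGCGCCAGCGTAA"
# Hypothetical map from low-efficiency codons to higher-efficiency synonyms.
synthetic, n_replaced = codon_modify(parent, {"GCG": "GCC", "CCA": "CCC"})
print(synthetic, n_replaced)
```

Partial replacement, as contemplated above, could be obtained by supplying a replacement map that covers only some of the replaceable codons.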
[0209] 5.2 Modifications Based on Synonymous Codons with
Different Phenotypic Preferences
[0210] In other embodiments, the synthetic polynucleotide is
constructed so that its expression in the organism or part confers
a selected phenotype upon that organism or part but in a different
quality than that conferred by a parent polynucleotide that encodes
the same polypeptide. The method comprises selecting a replaceable
codon of the parent polynucleotide for replacement with a
synonymous codon, wherein the synonymous codon is selected on the
basis that it exhibits a different phenotypic preference than the
first codon in a comparison of phenotypic preferences in the
organism of interest or in a related organism, or in a part
thereof, as determined in Section 4. The replaceable codon is then
replaced with the synonymous codon to construct the synthetic
polynucleotide.
[0211] Thus, a parent polynucleotide can be modified with
synonymous codons such that the quality of the selected phenotype
conferred by the polynucleotide so modified (synthetic
polynucleotide) is higher than that conferred by the parent polynucleotide.
Generally, the difference between the respective phenotypic
qualities conferred by a synthetic polynucleotide and by a parent
polynucleotide depends on the number of first codons that are
replaced by synonymous codons, and on the difference in phenotypic
preference between the first codons and the synonymous codons in
the organism of interest or part thereof. Put another way, the
fewer such replacements, and/or the smaller the difference in
phenotypic preference between the synonymous and first codons, the
smaller the difference will be in the phenotypic quality between
the synthetic and parent polynucleotides. Conversely, the more such
replacements, and/or the greater the difference in phenotypic
preference between the synonymous and first codons, the greater the
difference will be in the phenotypic quality between the synthetic
and parent polynucleotides.
[0212] In some embodiments in which a higher quality of a selected
phenotype is required to be displayed by an organism of interest or
part thereof, a replaceable codon of the parent polynucleotide is
suitably selected for replacement with a synonymous codon, wherein
the synonymous codon is selected on the basis that it exhibits a
higher phenotypic preference than the replaceable codon in a
comparison of phenotypic preferences in the organism of interest or
in a related organism, or in a part thereof. Generally, a higher
phenotypic preference will correlate with a higher quality of the
selected phenotype. Thus, in a non-limiting example of such a
correlation, a synonymous codon is deemed to have at least about a
10% higher phenotypic preference than a replaceable codon when the
quality of phenotype displayed by an organism or part thereof to
which a synthetic construct comprising the synonymous codon as the
interrogating codon has been provided is at least about 10% higher
than the quality of phenotype displayed by an organism or part
thereof to which a synthetic construct comprising the replaceable
codon as the interrogating codon has been provided. When it is
desired to increase the quality of a phenotype, it is generally
desirable to use a synonymous codon whose phenotypic preference
(i.e., preference for conferring that phenotype upon the organism
or part) is at least about 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%,
50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90% or 95% higher or at
least about 2, 3, 4, 5, 6, 7, 8, 9, 10, 50 or 100 times higher than
the phenotypic preference of the replaceable codon. In the case of
two or more synonymous codons having similar phenotypic
preferences, it will be appreciated that any one of these codons
can be used to replace the first codon. Generally, if a parent
polynucleotide has a choice of low and intermediate phenotypic
preference codons, it is preferable in the first instance to
replace some, or more preferably all, of the low phenotypic
preference codons with synonymous codons having intermediate, or
preferably high, phenotypic preferences. Typically, replacement of
low with intermediate or high phenotypic preference codons results
in a substantial increase in the quality of the phenotype conferred
by the synthetic polynucleotide so constructed. However, it is also
preferable to replace some, or preferably all, of the intermediate
phenotypic preference codons with high phenotypic preference
codons for conferring an optimal quality in the selected
phenotype.
[0213] In some embodiments in which a lower quality of a selected
phenotype is required to be displayed by an organism of interest or
part thereof, a replaceable codon of the parent polynucleotide is
selected for replacement with a synonymous codon, wherein the
synonymous codon is selected on the basis that it exhibits a lower
phenotypic preference than the replaceable codon in a comparison of
phenotypic preferences in the organism of interest or in a related
organism or in a part thereof, as determined for example according
to the method described in Section 4. A lower phenotypic preference
will typically correlate with a lower quality of the selected
phenotype. Accordingly, in a non-limiting example of such a
correlation, a synonymous codon is deemed to have at least about a
10% lower phenotypic preference than a first codon when the quality
of phenotype displayed by an organism or part thereof to which a
synthetic construct comprising the synonymous codon as the
interrogating codon has been provided is at least about 10% lower
than the quality of phenotype displayed by an organism or part
thereof to which a synthetic construct comprising the replaceable
codon as the interrogating codon has been provided. When selecting
the synonymous codon for this embodiment, it is preferred that it
has a phenotypic preference in the organism of interest that is no
more than about 95%, 90%, 85%, 80%, 75%, 70%, 65%, 60%, 55%, 50%,
45%, 40%, 35%, 30%, 25%, 20%, 15%, 10%, 5%, 1%, 0.5%, 0.1%, 0.05%
or 0.01% of the phenotypic preference of the replaceable codon.
[0214] It is preferable but not necessary to replace all the
replaceable codons of the parent polynucleotide with synonymous
codons having higher or lower phenotypic preference in the organism
of interest or part thereof than the first codons. For example, a
higher or lower phenotypic quality can be accomplished even with
partial replacement. Typically, the replacement step affects 5%,
10%, 15%, 20%, 25%, 30%, 35%, 40%, 50%, 60%, 70%, 80%, 90%, 95%,
99% or more of the replaceable codons of the parent polynucleotide.
In some embodiments requiring a higher phenotypic quality, the
number of, and difference in phenotypic preference between, the
replaceable codons and the synonymous codons are selected such that
the phenotype-associated polypeptide is produced from the synthetic
polynucleotide to confer a phenotype upon a chosen organism or
organism part in a quality that is at least about 10%, 15%, 20%,
25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%,
90% or 95% higher, or even at least about 2, 3, 4, 5, 6, 7, 8, 9,
10, 50 or 100 times higher than the quality of phenotype conferred
by the parent polynucleotide in the organism or part. Conversely,
in some embodiments requiring a lower phenotypic quality, the
number of, and difference in phenotypic preference between, the
replaceable codons and the synonymous codons are selected such that
the phenotype-associated polypeptide is produced from the synthetic
polynucleotide to confer a phenotype upon a chosen organism or part
thereof in a quality that is no more than about 95%, 90%, 85%, 80%,
75%, 70%, 65%, 60%, 55%, 50%, 45%, 40%, 35%, 30%, 25%, 20%, 15%,
10%, 5%, 1%, 0.5%, 0.1%, 0.05% or 0.01% of the quality of phenotype
conferred by the parent polynucleotide in the organism or part.
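By way of illustration only, the selection thresholds described in paragraphs [0212] and [0213] can be sketched in Python as follows. The function names, the `PREF` dictionary and the numerical preference values are hypothetical placeholders in arbitrary units, not measured data; in practice the values would come from a comparison of phenotypic preferences such as that described in Section 4.

```python
# Hypothetical phenotypic preference values (arbitrary units) for the
# four alanine codons. These numbers are placeholders, not measurements.
PREF = {"GCT": 1.0, "GCC": 0.8, "GCA": 0.5, "GCG": 0.5}
ALA = ["GCT", "GCC", "GCA", "GCG"]


def pick_higher(replaceable, preference, synonyms, min_gain=0.10):
    """Return the best synonymous codon whose preference is at least
    `min_gain` (e.g. 0.10 for "at least about 10%") higher than that of
    the replaceable codon, or None if no synonymous codon qualifies."""
    base = preference[replaceable]
    candidates = [
        c for c in synonyms
        if c != replaceable and preference[c] >= base * (1 + min_gain)
    ]
    return max(candidates, key=lambda c: preference[c], default=None)


def pick_lower(replaceable, preference, synonyms, max_fraction=0.95):
    """Return the synonymous codon with the lowest preference that is no
    more than `max_fraction` of the replaceable codon's preference, or
    None if no synonymous codon qualifies."""
    base = preference[replaceable]
    candidates = [
        c for c in synonyms
        if c != replaceable and preference[c] <= base * max_fraction
    ]
    return min(candidates, key=lambda c: preference[c], default=None)


print(pick_higher("GCA", PREF, ALA))  # a codon at least 10% higher, if any
print(pick_lower("GCT", PREF, ALA))   # a codon at most 95% of GCT, if any
```

Where two or more synonymous codons have similar preferences, the sketch simply returns the first one encountered, consistent with the observation above that any one of such codons can be used.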
[0215] 5.3 Construction of Synthetic Polynucleotides
[0216] Replacement of one codon for another can be achieved using
standard methods known in the art. For example, codon modification
of a parent polynucleotide can be effected using several known
mutagenesis techniques including, for example,
oligonucleotide-directed mutagenesis, mutagenesis with degenerate
oligonucleotides, and region-specific mutagenesis. Exemplary in
vitro mutagenesis techniques are described for example in U.S. Pat.
Nos. 4,184,917, 4,321,365 and 4,351,901 or in the relevant sections
of Ausubel, et al. (CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John
Wiley & Sons, Inc. 1997) and of Sambrook, et al., (MOLECULAR
CLONING. A LABORATORY MANUAL, Cold Spring Harbor Press, 1989).
Instead of in vitro mutagenesis, the synthetic polynucleotide can
be synthesised de novo using readily available machinery as
described, for example, in U.S. Pat. No. 4,293,652. However, it
should be noted that the present invention is not dependent on, and
not directed to, any one particular technique for constructing the
synthetic polynucleotide.
[0217] The parent polynucleotide is suitably a natural gene.
However, it is possible that the parent polynucleotide is not
naturally occurring but has been engineered using recombinant
techniques. Parent polynucleotides can be obtained from any
suitable source, such as from eukaryotic or prokaryotic organisms,
including but not limited to mammals or other animals, and
pathogenic organisms such as yeasts, bacteria, protozoa and
viruses.
6. Immune Response Preference Ranking of Codons in Mammals
[0218] The construct system of the present invention has been used
to experimentally determine a ranking of individual synonymous
codons according to their preference for producing an immune
response, including a humoral immune response, to an antigen in a
mammal. Accordingly, the present invention provides for the first
time an immune response preference ranking of individual synonymous
codons in mammals. This ranking was determined using a construct
system that comprises a series of reporter constructs each
comprising a different coding sequence for an antigenic polypeptide
(e.g., a papillomavirus E7 polypeptide), wherein the coding
sequence of individual constructs is distinguished from a parent
(e.g., wild-type) coding sequence that encodes the antigenic
polypeptide by the substitution of a single species of
iso-accepting codon for other species of iso-accepting codon that
are present in the parent coding sequence. Accordingly, the coding
sequences of individual synthetic constructs use the same
"interrogating" iso-accepting codon to encode at least 1, generally
at least 2, usually at least 3 instances, typically most
instances and preferably every instance of a particular amino acid
residue in the antigenic polypeptide and individual synthetic
constructs differ in the species of interrogating iso-accepting
codon used to encode a particular amino acid residue at one or more
different positions in the polypeptide sequence. For example, in an
antigenic polypeptide containing several alanine residues, the
coding sequence of a synthetic construct in the construct system of
the present invention may comprise Ala.sup.GCT as the interrogating
codon for each encoded alanine residue, whereas the coding sequence
of another construct may comprise Ala.sup.GCC as the interrogating
codon for each encoded alanine residue, and so on. An illustrative
synthetic construct system is described in Example 1, which covers
the entire set of synonymous codons that code for amino acids.
[0219] In order to determine the immune response preference of
different codons, test mammals (e.g., mice) are immunized with the
synthetic construct system in which individual mammals are
immunized with a different synthetic construct and the host immune
response (e.g., humoral immune response or a cellular immune
response) to the antigenic polypeptide is determined for each
construct. In accordance with the present invention, the strength
of immune response obtained from individual synthetic constructs
provides a direct correlation to the immune preference of a
corresponding interrogating codon in a test mammal. Accordingly,
the stronger the immune response produced from a given construct in
a test mammal, the higher the immune preference will be of the
corresponding interrogating codon.
[0220] In an illustrative example, comparison of the immune
response preferences determined according to Example 1 with the
translational efficiencies derived from codon usage frequency
values for mammalian cells in general as determined by Seed (see
U.S. Pat. Nos. 5,786,464 and 5,795,737) reveals several differences
in the ranking of codons. For convenience, these differences are
highlighted in TABLE 9, in which Seed `preferred` codons are
highlighted with a blue background, Seed `less preferred` codons
are highlighted with a green background, and Seed `non preferred`
codons are highlighted with a grey background.
TABLE-US-00009 TABLE 9
aa | Preferential codon usage as predicted by Seed for mammalian cells in general | Experimentally determined codon immune response preferences in test mammals
Ala | GCC >> (GCG, GCT, GCA) | GCT > GCC > (GCA, GCG)
Arg | CGC >> (CGA, CGT, AGA, AGG, CGG) | (CGA, CGC, CGT, AGA) > (AGG, CGG)
Asn | AAC >> AAT | AAC > AAT
Asp | GAC >> GAT | GAC > GAT
Cys | TGC >> TGT | TGC > TGT
Glu | (GAA, GAG) | GAA > GAG
Gln | CAG >> CAA | CAA = CAG
Gly | GGC > GGG > (GGT, GGA) | GGA > (GGG, GGT, GGC)
His | CAC >> CAT | CAC = CAT
Ile | ATC > ATT > ATA | ATC >> ATT > ATA
Leu | CTG > CTC > (TTA, CTA, CTT, TTG) | (CTG, CTC) > (CTA, CTT) >> TTG > TTA
Lys | AAG >> AAA | AAG = AAA
Phe | TTC >> TTT | TTT > TTC
Pro | CCC >> (CCG, CCA, CCT) | CCC > CCT >> (CCA, CCG)
Ser | AGC > TCC > (TCG, AGT, TCA, TCT) | TCG >> (TCT, TCA, TCC) >> (AGC, AGT)
Thr | ACC >> (ACG, ACA, ACT) | ACG > ACC >> ACA > ACT
Tyr | TAC >> TAT | TAC > TAT
Val | GTG > GTC > (GTA, GTT) | (GTG, GTC) > GTT > GTA
[0221] As will be apparent from the above table:
[0222] (i) several codons deemed by Seed to have a higher codon
usage ranking in mammalian cells than at least one other synonymous
codon have in fact a lower immune response preference ranking than
the or each other synonymous codon (e.g., Ala.sup.GCC has a higher
codon usage ranking but lower immune response preference ranking
than Ala.sup.GCT; Gly.sup.GGC has a higher codon usage ranking but
lower immune response preference ranking than Gly.sup.GGA;
Phe.sup.TTC has a higher codon usage ranking but lower immune
response preference ranking than Phe.sup.TTT; Ser.sup.AGC has a
higher codon usage ranking but lower immune response preference
ranking than any one of Ser.sup.TCG, Ser.sup.TCT,
Ser.sup.TCA and Ser.sup.TCC; and Thr.sup.ACC has a higher codon
usage ranking but lower immune response preference ranking than
Thr.sup.ACG);
[0223] (ii) several codons deemed by Seed to have a lower codon
usage ranking in mammalian cells than at least one other synonymous
codon have in fact a higher immune response preference ranking than
the or each other synonymous codon (e.g., Ala.sup.GCT has a lower
codon usage ranking but higher immune response preference ranking
than Ala.sup.GCC; Gly.sup.GGA has a lower codon usage ranking but
higher immune response preference ranking than Gly.sup.GGC or
Gly.sup.GGG; Phe.sup.TTT has a lower codon usage ranking but higher
immune response preference ranking than Phe.sup.TTC; Ser.sup.TCG
has a lower codon usage ranking but higher immune response
preference ranking than Ser.sup.AGC or Ser.sup.TCC; Ser.sup.TCT and
Ser.sup.TCA have a lower codon usage ranking but higher immune
response preference ranking than Ser.sup.AGC; and Thr.sup.ACG has a
lower codon usage ranking but higher immune response preference
ranking than Thr.sup.ACC);
[0224] (iii) several codons deemed by Seed to have a higher codon
usage ranking in mammalian cells than another synonymous codon have
in fact the same immune response preference ranking as the other
synonymous codon (e.g., Gln.sup.CAG has a higher codon usage
ranking than, but the same immune response preference ranking as,
Gln.sup.CAA; His.sup.CAC has a higher codon usage ranking than, but
the same immune response preference ranking as, His.sup.CAT;
Leu.sup.CTG has a higher codon usage ranking than, but the same
immune response preference ranking as, Leu.sup.CTC; Lys.sup.AAG has
a higher codon usage ranking than, but the same immune response
preference ranking as, Lys.sup.AAA; and Val.sup.GTG has a higher codon
usage ranking than, but the same immune response preference ranking
as, Val.sup.GTC); and
[0225] (iv) several codons deemed by Seed to have the same codon
usage ranking in mammalian cells as at least one other synonymous
codon have in fact a different immune response preference ranking
than the or each other synonymous codon (e.g., Ala.sup.GCT has the
same codon usage ranking as, but a higher immune response
preference ranking than, Ala.sup.GCA and Ala.sup.GCG; Arg.sup.CGA,
Arg.sup.CGT and Arg.sup.AGA have the same codon usage ranking as,
but a higher immune response preference ranking than, Arg.sup.AGG
and Arg.sup.CGG; Glu.sup.GAA has the same codon usage ranking as,
but a higher immune response preference ranking than, Glu.sup.GAG;
Gly.sup.GGA has the same codon usage ranking as, but a higher
immune response preference ranking than, Gly.sup.GGT; Leu.sup.CTA
and Leu.sup.CTT have the same codon usage ranking as, but a higher
immune response preference ranking than, Leu.sup.TTG and
Leu.sup.TTA; Pro.sup.CCT has the same codon usage ranking as,
but a higher immune response preference ranking than, Pro.sup.CCA
or Pro.sup.CCG; Ser.sup.TCG has the same codon usage ranking as,
but a higher immune response preference ranking than, any one of
Ser.sup.TCT, Ser.sup.TCA and Ser.sup.AGT; Ser.sup.TCT and
Ser.sup.TCA have the same codon usage ranking as, but a higher
immune response preference ranking than, Ser.sup.AGT; Thr.sup.ACG
has the same codon usage ranking as, but a higher immune response
preference ranking than, any one of Thr.sup.ACA and Thr.sup.ACT;
Thr.sup.ACA has the same codon usage ranking as, but a higher
immune response preference ranking than, Thr.sup.ACT; and Val.sup.GTT
has the same codon usage ranking as, but a higher immune response
preference ranking than, Val.sup.GTA).
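The pairwise comparisons enumerated in items (i) to (iv) can be reproduced mechanically from TABLE 9. The following Python sketch is illustrative only: it transcribes four rows of TABLE 9 into tiers (tier 0 being the most preferred) and lists every pair of synonymous codons whose relative ranking differs between the two columns. The names `SEED`, `EXPERIMENTAL`, `relation` and `discordant_pairs` are hypothetical, and the sketch assumes that the ">" and ">>" separators of TABLE 9 can both be treated simply as tier boundaries.

```python
from itertools import combinations

# Tiers transcribed from TABLE 9; tier 0 is the most preferred and
# codons sharing a tier are ranked equally. Only four rows are shown.
SEED = {
    "Ala": {"GCC": 0, "GCG": 1, "GCT": 1, "GCA": 1},
    "Gln": {"CAG": 0, "CAA": 1},
    "Phe": {"TTC": 0, "TTT": 1},
    "Tyr": {"TAC": 0, "TAT": 1},
}
EXPERIMENTAL = {
    "Ala": {"GCT": 0, "GCC": 1, "GCA": 2, "GCG": 2},
    "Gln": {"CAA": 0, "CAG": 0},
    "Phe": {"TTT": 0, "TTC": 1},
    "Tyr": {"TAC": 0, "TAT": 1},
}


def relation(tiers, a, b):
    """Return 'higher', 'lower' or 'same' for codon a relative to codon b."""
    if tiers[a] < tiers[b]:
        return "higher"
    if tiers[a] > tiers[b]:
        return "lower"
    return "same"


def discordant_pairs(amino_acid):
    """List (codon_a, codon_b, seed_relation, experimental_relation) for
    every codon pair whose relative ranking differs between the columns."""
    result = []
    for a, b in combinations(sorted(SEED[amino_acid]), 2):
        seed_rel = relation(SEED[amino_acid], a, b)
        exp_rel = relation(EXPERIMENTAL[amino_acid], a, b)
        if seed_rel != exp_rel:
            result.append((a, b, seed_rel, exp_rel))
    return result


for aa in SEED:
    print(aa, discordant_pairs(aa))
```

In this representation, a pair reported as ("higher", "lower") corresponds to item (i), ("lower", "higher") to item (ii), ("higher", "same") or ("lower", "same") to item (iii), and ("same", "higher") or ("same", "lower") to item (iv).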
[0226] Accordingly, the present invention enables for the first
time the modulation of an immune response to a target antigen in a
mammal from a polynucleotide that encodes a polypeptide that
corresponds to at least a portion of the target antigen by
replacing at least one codon of the polynucleotide with a
synonymous codon that has a higher or lower preference for
producing an immune response than the codon it replaces. In some
embodiments, therefore, the present invention embraces methods of
constructing a synthetic polynucleotide from which a polypeptide is
producible to confer an enhanced or stronger immune response than
one conferred by a parent polynucleotide that encodes the same
polypeptide. These methods generally comprise selecting from TABLE
1 a codon (often referred to herein arbitrarily as a "first codon")
of the parent polynucleotide for replacement with a synonymous
codon, wherein the synonymous codon is selected on the basis that
it exhibits a higher immune response preference than the first
codon and replacing the first codon with the synonymous codon to
construct the synthetic polynucleotide. Illustrative selections of
the first and synonymous codons are made according to TABLE 2.
[0227] In some embodiments, the selection of the first and
synonymous codons is made according to TABLE 3, which is the same
as TABLE 2 with the exception that it excludes selections based on
codon usage rankings as disclosed by Seed. In illustrative examples
of this type, the selection of a second codon (and subsequent
codons if desired) for replacement with a synonymous codon is made
according to TABLE 4.
[0228] Where synonymous codons are classified into three ranks
(`high`, `intermediate` and `low` ranks) based on their immune
response preference ranking (e.g., the synonymous codons for Ala,
Ile, Leu, Pro, Ser, Thr and Val), it is preferred that the
synonymous codon that is selected is a high rank codon when the
first codon is a low rank codon. However, this is not essential and
the synonymous codon can be selected from intermediate rank codons.
In the case of two or more synonymous codons having similar immune
response preferences, it will be appreciated that any one of these
codons can be used to replace the first codon.
[0229] In other embodiments, the invention provides methods of
constructing a synthetic polynucleotide from which a polypeptide is
producible to confer a reduced or weaker immune response than one
conferred by a parent polynucleotide that encodes the same
polypeptide. These methods generally comprise selecting from TABLE
1 a first codon of the parent polynucleotide for replacement with a
synonymous codon, wherein the synonymous codon is selected on the
basis that it exhibits a lower immune response preference than the
first codon and replacing the first codon with the synonymous codon
to construct the synthetic polynucleotide. Illustrative selections
of the first and synonymous codons are made according to TABLE
5.
[0230] In some embodiments, the selection of the first and
synonymous codons is made according to TABLE 6, which is the same
as TABLE 5 with the exception that it excludes selections based on
codon usage rankings as disclosed by Seed. In illustrative examples
of this type, the selection of a second codon (and subsequent
codons if desired) for replacement with a synonymous codon is made
according to TABLE 7.
[0231] Where synonymous codons are classified into the three ranks
noted above, it is preferred that the synonymous codon that is
selected is a low rank codon when the first codon is a high rank
codon but this is not essential and thus the synonymous codon can
be selected from intermediate rank codons if desired.
[0232] Generally, the difference in strength of the immune response
produced in the mammal from the synthetic polynucleotide as
compared to that produced from the parent polynucleotide depends on
the number of first/second codons that are replaced by synonymous
codons, and on the difference in immune response preference ranking
between the first/second codons and the synonymous codons. Put
another way, the fewer such replacements, and/or the smaller the
difference in immune response preference ranking between the
synonymous and first/second codons, the smaller the difference will
be in the immune response produced by the synthetic polynucleotide
and the one produced by the parent polynucleotide. Conversely, the
more such replacements, and/or the greater the difference in immune
response preference ranking between the synonymous and first/second
codons, the greater the difference will be in the immune response
produced by the synthetic polynucleotide and the one produced by
the parent polynucleotide.
[0233] It is preferable but not necessary to replace all the codons
of the parent polynucleotide with synonymous codons having
different (e.g., higher or lower) immune response preference
rankings than the first/second codons. Changes in the conferred
immune response can be accomplished even with partial replacement.
Generally, the replacement step affects at least about 5%, 10%,
15%, 20%, 25%, 30%, usually at least about 35%, 40%, 50%, and
typically at least about 60%, 70%, 80%, 90%, 95%, 96%, 97%, 98%,
99% or more of the first/second codons of the parent
polynucleotide. In embodiments in which a stronger or enhanced
immune response is required, it is generally desirable to replace
some, preferably most and more preferably all, low rank codons in a
parent polynucleotide with synonymous codons that are intermediate,
or preferably high rank codons. Typically, replacement of low with
intermediate or high rank codons will result in an increase in the
strength of immune response from the synthetic polynucleotide so
constructed, as compared to the one produced from the parent
polynucleotide under the same conditions. However, it is often
desirable to replace some, preferably most and more preferably all,
intermediate rank codons in the parent polynucleotide with high
rank codons, if stronger or more enhanced immune responses are
desired.
[0234] By contrast, in some embodiments in which a weaker or
reduced immune response is required, it is generally desirable to
replace some, preferably most and more preferably all, high rank
codons in a parent polynucleotide with synonymous codons that are
intermediate, or preferably low rank codons. Typically, replacement
of high with intermediate or low rank codons will result in a
substantial decrease in the strength of immune response from the
synthetic polynucleotide so constructed, as compared to the one
produced from the parent polynucleotide under the same conditions.
In specific embodiments in which it is desired to confer a weaker
or more reduced immune response, it is generally desirable to
replace some, preferably most and more preferably all, intermediate
rank codons in the parent polynucleotide with low rank codons.
[0235] In illustrative examples requiring a stronger or enhanced
immune response, the number of, and difference in immune response
preference ranking between, the first/second codons and the
synonymous codons are selected such that the immune response
conferred by the synthetic polynucleotide is at least about 110%,
150%, 200%, 250%, 300%, 350%, 400%, 450%, 500%, 600%, 700%, 800%,
900%, 1000%, or more, of the immune response conferred by the
parent polynucleotide under the same conditions. Conversely, in
some embodiments requiring a lower or weaker immune response, the
number of, and difference in immune response preference ranking between,
the first/second codons and the synonymous codons are selected such
that the immune response conferred by the synthetic polynucleotide
is no more than about 90%, 80%, 70%, 60%, 50%, 40%, 30%, 20%, 10%,
5%, or less of the immune response conferred by the parent
polynucleotide under the same conditions.
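As a non-limiting sketch of the replacement strategy discussed in this section, the following Python code recodes a parent coding sequence so that each codon covered by the lookup is replaced with a synonymous codon from the highest tier (for a stronger immune response) or the lowest tier (for a weaker one). The tiers are transcribed from the experimental column of TABLE 9 for four amino acids only; the names `EXPERIMENTAL_TIERS`, `build_lookup` and `recode` are hypothetical, and choosing the first codon listed within a tier is an arbitrary assumption.

```python
# Experimental tiers transcribed from TABLE 9 for four amino acids;
# index 0 holds the codons with the highest immune response preference.
EXPERIMENTAL_TIERS = {
    "Ala": [["GCT"], ["GCC"], ["GCA", "GCG"]],
    "Gly": [["GGA"], ["GGG", "GGT", "GGC"]],
    "Phe": [["TTT"], ["TTC"]],
    "Thr": [["ACG"], ["ACC"], ["ACA"], ["ACT"]],
}


def build_lookup(tiers):
    """Map each codon to (its tier index, the tier list of its amino acid)."""
    lookup = {}
    for ranked in tiers.values():
        for tier_index, codons in enumerate(ranked):
            for codon in codons:
                lookup[codon] = (tier_index, ranked)
    return lookup


def recode(parent, tiers=EXPERIMENTAL_TIERS, enhance=True):
    """Return (synthetic sequence, number of codons replaced).

    Codons absent from the lookup are left unchanged. With enhance=True
    each covered codon is moved to the top tier; with enhance=False it
    is moved to the bottom tier."""
    lookup = build_lookup(tiers)
    codons = [parent[i:i + 3] for i in range(0, len(parent), 3)]
    synthetic, replaced = [], 0
    for codon in codons:
        if codon not in lookup:
            synthetic.append(codon)
            continue
        tier_index, ranked = lookup[codon]
        target = 0 if enhance else len(ranked) - 1
        if tier_index != target:
            synthetic.append(ranked[target][0])  # arbitrary pick within tier
            replaced += 1
        else:
            synthetic.append(codon)
    return "".join(synthetic), replaced


print(recode("GCAGGCTTCACT"))
print(recode("GCTATG", enhance=False))
```

The returned count can be divided by the number of covered codons to estimate what fraction of the first/second codons the replacement step has affected. Partial replacement could be modelled by stopping once a chosen fraction is reached.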
7. Modulating Immune Responses in Mammals by Expression of
Isoaccepting Transfer RNA-encoding Polynucleotides
[0236] It is possible to take advantage of the immune response
preference rankings of codons discussed in Section 6 to modulate an
immune response to a target antigen by changing the level of
iso-tRNAs in the cell population which is the target of the
immunization. Accordingly, the invention also features methods of
enhancing the quality of an immune response to a target antigen in
a mammal, wherein the response is conferred by the expression of a
first polynucleotide that encodes a polypeptide corresponding to at
least a portion of the target antigen. These methods generally
comprise: introducing into the mammal a first nucleic acid
construct comprising the first polynucleotide in operable
connection with a regulatory sequence. A second nucleic acid
construct is then introduced into the mammal, which comprises a
second polynucleotide that is operably connected to a regulatory
sequence and that encodes an iso-tRNA corresponding to a low immune
preference codon of the first polynucleotide.
[0237] In practice, therefore, an iso-tRNA is introduced into the
mammal by the second nucleic acid construct when the iso-tRNA
corresponds to a low immune response preference codon in the first
polynucleotide, which codon is suitably selected from the group
consisting of Ala.sup.GCA, Ala.sup.GCG, Ala.sup.GCC, Arg.sup.AGG,
Arg.sup.CGG, Asn.sup.AAT, Asp.sup.GAT, Cys.sup.TGT, Glu.sup.GAG,
Gly.sup.GGG, Gly.sup.GGT, Gly.sup.GGC, Ile.sup.ATA, Ile.sup.ATT,
Leu.sup.TTG, Leu.sup.TTA, Leu.sup.CTA, Leu.sup.CTT, Phe.sup.TTC,
Pro.sup.CCA, Pro.sup.CCG, Pro.sup.CCT, Ser.sup.AGC, Ser.sup.AGT,
Ser.sup.TCT, Ser.sup.TCA, Ser.sup.TCC, Thr.sup.ACA, Thr.sup.ACT,
Tyr.sup.TAT, Val.sup.GTA and Val.sup.GTT. In specific embodiments,
the supplied iso-tRNAs are specific for `low`
immune response preference codons, which may be selected from the
group consisting of Ala.sup.GCA, Ala.sup.GCG, Arg.sup.CGG,
Asn.sup.AAT, Asp.sup.GAT, Cys.sup.TGT, Glu.sup.GAG, Gly.sup.GGG,
Gly.sup.GGT, Gly.sup.GGC, Ile.sup.ATA, Leu.sup.TTG, Leu.sup.TTA,
Phe.sup.TTC, Pro.sup.CCA, Pro.sup.CCG, Ser.sup.AGC, Ser.sup.AGT,
Thr.sup.ACT, Tyr.sup.TAT and Val.sup.GTA. The first construct
(i.e., antigen-expressing construct) and the second construct
(i.e., the iso-tRNA-expressing construct) may be introduced
simultaneously or sequentially (in either order) and may be
introduced at the same or different sites. In some embodiments, the
first and second constructs are contained in separate vectors. In
other embodiments, they are contained in a single vector. If
desired, two or more second constructs may be introduced each
expressing a different iso-tRNA corresponding to a low preference
codon of the first polynucleotide. The first and second nucleic
acid constructs may be constructed and administered concurrently or
contemporaneously to a mammal according to any suitable method,
illustrative examples of which are discussed below for the chimeric
constructs of the invention.
[0238] In some embodiments, a plurality of different
iso-tRNA-expressing constructs (e.g., 2, 3, 4, 5, 6, 7, 8, 9, 10,
11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more) are administered
concurrently or contemporaneously with the antigen-expressing
construct, wherein individual iso-tRNA-expressing constructs
express a different iso-tRNA than other iso-tRNA-expressing
constructs.
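For illustration, candidate iso-tRNAs for co-expression could be identified by scanning the first polynucleotide for the `low` immune response preference codons listed in paragraph [0237]. The Python sketch below is a minimal example under that assumption; the function name is hypothetical, and ordering the codons by frequency of occurrence is merely one possible way of prioritising iso-tRNA-expressing constructs, not a requirement of the invention.

```python
from collections import Counter

# `Low` immune response preference codons as listed in paragraph [0237].
LOW_PREFERENCE_CODONS = {
    "GCA", "GCG", "CGG", "AAT", "GAT", "TGT", "GAG", "GGG", "GGT", "GGC",
    "ATA", "TTG", "TTA", "TTC", "CCA", "CCG", "AGC", "AGT", "ACT", "TAT",
    "GTA",
}


def low_preference_codon_counts(coding_sequence):
    """Return (codon, count) pairs for the low preference codons present
    in an in-frame coding sequence, most frequent first."""
    sequence = coding_sequence.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [sequence[i:i + 3] for i in range(0, len(sequence), 3)]
    counts = Counter(c for c in codons if c in LOW_PREFERENCE_CODONS)
    return counts.most_common()


# Short made-up sequence used only to exercise the function.
print(low_preference_codon_counts("ATGGCAGAGGCATTCTAA"))
```

Each codon reported in this way corresponds to an iso-tRNA that could be supplied by a second nucleic acid construct.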
8. Antigens
[0239] Target antigens useful in the present invention are
typically proteinaceous molecules, representative examples of which
include polypeptides and peptides. Target antigens may be selected
from endogenous antigens produced by a host or exogenous antigens
that are foreign to the host. Suitable endogenous antigens include,
but are not restricted to, cancer or tumor antigens. Non-limiting
examples of cancer or tumor antigens include antigens from a cancer
or tumor selected from ABL1 proto-oncogene, AIDS related cancers,
acoustic neuroma, acute lymphocytic leukemia, acute myeloid
leukemia, adenocystic carcinoma, adrenocortical cancer, agnogenic
myeloid metaplasia, alopecia, alveolar soft-part sarcoma, anal
cancer, angiosarcoma, aplastic anemia, astrocytoma,
ataxia-telangiectasia, basal cell carcinoma (skin), bladder cancer,
bone cancers, bowel cancer, brain stem glioma, brain and CNS
tumors, breast cancer, CNS tumors, carcinoid tumors, cervical
cancer, childhood brain tumors, childhood cancer, childhood
leukemia, childhood soft tissue sarcoma, chondrosarcoma,
choriocarcinoma, chronic lymphocytic leukemia, chronic myeloid
leukemia, colorectal cancers, cutaneous T-cell lymphoma,
dermatofibrosarcoma protuberans, desmoplastic small round cell
tumor, ductal carcinoma, endocrine cancers, endometrial cancer,
ependymoma, oesophageal cancer, Ewing's Sarcoma, Extra-Hepatic Bile
Duct Cancer, Eye Cancer, Eye: Melanoma, Retinoblastoma, Fallopian
Tube cancer, Fanconi anemia, fibrosarcoma, gall bladder cancer,
gastric cancer, gastrointestinal cancers,
gastrointestinal-carcinoid-tumor, genitourinary cancers, germ cell
tumors, gestational-trophoblastic-disease, glioma, gynecological
cancers, haematological malignancies, hairy cell leukemia, head and
neck cancer, hepatocellular cancer, hereditary breast cancer,
histiocytosis, Hodgkin's disease, human papillomavirus,
hydatidiform mole, hypercalcemia, hypopharynx cancer, intraocular
melanoma, islet cell cancer, Kaposi's sarcoma, kidney cancer,
Langerhans cell histiocytosis, laryngeal cancer, leiomyosarcoma,
leukemia, Li-Fraumeni syndrome, lip cancer, liposarcoma, liver
cancer, lung cancer, lymphedema, lymphoma, Hodgkin's lymphoma,
non-Hodgkin's lymphoma, male breast cancer, malignant-rhabdoid
tumor of kidney, medulloblastoma, melanoma, Merkel cell cancer,
mesothelioma, metastatic cancer, mouth cancer, multiple endocrine
neoplasia, mycosis fungoides, myelodysplastic syndromes, myeloma,
myeloproliferative disorders, nasal cancer, nasopharyngeal cancer,
nephroblastoma, neuroblastoma, neurofibromatosis, Nijmegen breakage
syndrome, non-melanoma skin cancer, non-small-cell-lung-cancer
(NSCLC), ocular cancers, esophageal cancer, oral cavity cancer,
oropharynx cancer, osteosarcoma, ostomy, ovarian cancer, pancreas
cancer, paranasal cancer, parathyroid cancer, parotid gland cancer,
penile cancer, peripheral-neuroectodermal tumours, pituitary
cancer, polycythemia vera, prostate cancer, rare cancers and
associated disorders, renal cell carcinoma, retinoblastoma,
rhabdomyosarcoma, Rothmund-Thomson syndrome, salivary gland cancer,
sarcoma, schwannoma, Sezary syndrome, skin cancer, small cell lung
cancer (SCLC), small intestine cancer, soft tissue sarcoma, spinal
cord tumors, squamous-cell-carcinoma-(skin), stomach cancer,
synovial sarcoma, testicular cancer, thymus cancer, thyroid cancer,
transitional-cell-cancer-(bladder),
transitional-cell-cancer-(renal-pelvis-/- ureter), trophoblastic
cancer, urethral cancer, urinary system cancer, uroplakins, uterine
sarcoma, uterus cancer, vaginal cancer, vulva cancer, Waldenstroms
macroglobulinemia, Wilms' tumor. In certain embodiments, the cancer
or tumor relates to melanoma. Illustrative examples of
melanoma-related antigens include melanocyte differentiation
antigen (e.g., gp100, MART, Melan-A/MART-1, TRP-1, Tyros, TRP2,
MC1R, MUC1F, MUC1R or a combination thereof) and melanoma-specific
antigens (e.g., BAGE, GAGE-1, gp100In4, MAGE-1 (e.g., GenBank
Accession No. X54156 and AA494311), MAGE-3, MAGE4, PRAME, TRP2IN2,
NYNSO1a, NYNSO1b, LAGE1, p97 melanoma antigen (e.g., GenBank
Accession No. M12154), p5 protein, gp75, oncofetal antigen, GM2 and
GD2 gangliosides, cdc27, p21ras, gp100.sup.Pmel117 or a combination
thereof). Other tumour-specific antigens include, but are not
limited to: etv6, aml1, cyclophilin b (acute lymphoblastic
leukemia); Ig-idiotype (B cell lymphoma); E-cadherin,
.alpha.-catenin, .beta.-catenin, .gamma.-catenin, p120ctn (glioma);
p21ras (bladder cancer); p21ras (biliary cancer); MUC family,
HER2/neu, c-erbB-2 (breast cancer); p53, p21ras (cervical
carcinoma); p21ras, HER2/neu, c-erbB-2, MUC family,
Cripto-1 protein, Pim-1 protein (colon carcinoma); Colorectal
associated antigen (CRC)-0017-1A/GA733, APC (colorectal cancer);
carcinoembryonic antigen (CEA) (colorectal cancer;
choriocarcinoma); cyclophilin b (epithelial cell cancer); HER2/neu,
c-erbB-2, ga733 glycoprotein (gastric cancer); .alpha.-fetoprotein
(hepatocellular cancer); Imp-1, EBNA-1 (Hodgkin's lymphoma); CEA,
MAGE-3, NY-ESO-1 (lung cancer); cyclophilin b (lymphoid
cell-derived leukemia); MUC family, p21ras (myeloma); HER2/neu,
c-erbB-2 (non-small cell lung carcinoma); Imp-1, EBNA-1
(nasopharyngeal cancer); MUC family, HER2/neu, c-erbB-2, MAGE-A4,
NY-ESO-1 (ovarian cancer); Prostate Specific Antigen (PSA) and its
antigenic epitopes PSA-1, PSA-2, and PSA-3, PSMA, HER2/neu,
c-erbB-2, ga733 glycoprotein (prostate cancer); HER2/neu, c-erbB-2
(renal cancer); viral products such as human papillomavirus
proteins (squamous cell cancers of the cervix and esophagus);
NY-ESO-1 (testicular cancer); and HTLV-1 epitopes (T cell
leukemia).
[0240] Foreign or exogenous antigens are suitably selected from
antigens of pathogenic organisms. Exemplary pathogenic organisms
include, but are not limited to, viruses, bacteria, fungi,
parasites, algae, protozoa and amoebae. Illustrative viruses
include viruses responsible for diseases including, but not limited
to, measles, mumps, rubella, poliomyelitis, hepatitis A, B (e.g.,
GenBank Accession No. E02707), and C (e.g., GenBank Accession No.
E06890), as well as other hepatitis viruses, influenza, adenovirus
(e.g., types 4 and 7), rabies (e.g., GenBank Accession No. M34678),
yellow fever, Epstein-Barr virus and other herpesviruses such as
papillomavirus, Ebola virus, influenza virus, Japanese encephalitis
(e.g., GenBank Accession No. E07883), dengue (e.g., GenBank
Accession No. M24444), hantavirus, Sendai virus, respiratory
syncytial virus, orthomyxoviruses, vesicular stomatitis virus,
visna virus, cytomegalovirus and human immunodeficiency virus (HIV)
(e.g., GenBank Accession No. U18552). Any suitable antigen derived
from such viruses is useful in the practice of the present
invention. For example, illustrative retroviral antigens derived
from HIV include, but are not limited to, antigens such as gene
products of the gag, pol, and env genes, the Nef protein, reverse
transcriptase, and other HIV components. Illustrative examples of
hepatitis viral antigens include, but are not limited to, antigens
such as the S, M, and L proteins of hepatitis B virus, the pre-S
antigen of hepatitis B virus, and other hepatitis, e.g., hepatitis
A, B, and C, viral components such as hepatitis C viral RNA.
Illustrative examples of influenza viral antigens include, but are
not limited to, antigens such as hemagglutinin and neuraminidase
and other influenza viral components. Illustrative examples of
measles viral antigens include, but are not limited to, antigens
such as the measles virus fusion protein and other measles virus
components. Illustrative examples of rubella viral antigens
include, but are not limited to, antigens such as proteins E1 and
E2 and other rubella virus components; rotaviral antigens such as
VP7sc and other rotaviral components. Illustrative examples of
cytomegaloviral antigens include, but are not limited to, antigens
such as envelope glycoprotein B and other cytomegaloviral antigen
components. Non-limiting examples of respiratory syncytial viral
antigens include antigens such as the RSV fusion protein, the M2
protein and other respiratory syncytial viral antigen components.
Illustrative examples of herpes simplex viral antigens include, but
are not limited to, antigens such as immediate early proteins,
glycoprotein D, and other herpes simplex viral antigen components.
Non-limiting examples of varicella zoster viral antigens include
antigens such as 9PI, gpII, and other varicella zoster viral
antigen components. Non-limiting examples of Japanese encephalitis
viral antigens include antigens such as proteins E, M-E, M-E-NS1,
NS1, NS1-NS2A, 80% E, and other Japanese encephalitis viral
antigen components. Representative examples of rabies viral
antigens include, but are not limited to, antigens such as rabies
glycoprotein, rabies nucleoprotein and other rabies viral antigen
components. Illustrative examples of papillomavirus antigens
include, but are not limited to, the L1 and L2 capsid proteins as
well as the E6/E7 antigens associated with cervical cancers. See
Fundamental Virology, Second Edition, eds. Fields, B. N. and Knipe,
D. M., 1991, Raven Press, New York, for additional examples of
viral antigens.
[0241] Illustrative examples of fungi include Acremonium spp.,
Aspergillus spp., Basidiobolus spp., Bipolaris spp., Blastomyces
dermatitidis, Candida spp., Cladophialophora carrionii, Coccidioides
immitis, Conidiobolus spp., Cryptococcus spp., Curvularia spp.,
Epidermophyton spp., Exophiala jeanselmei, Exserohilum spp.,
Fonsecaea compacta, Fonsecaea pedrosoi, Fusarium oxysporum,
Fusarium solani, Geotrichum candidum, Histoplasma capsulatum var.
capsulatum, Histoplasma capsulatum var. duboisii, Hortaea
werneckii, Lacazia loboi, Lasiodiplodia theobromae, Leptosphaeria
senegalensis, Madurella grisea, Madurella mycetomatis, Malassezia
furfur, Microsporum spp., Neotestudina rosatii, Onychocola
canadensis, Paracoccidioides brasiliensis, Phialophora verrucosa,
Piedraia hortae, Pityriasis versicolor,
Pseudallescheria boydii, Pyrenochaeta romeroi, Rhizopus arrhizus,
Scopulariopsis brevicaulis, Scytalidium dimidiatum, Sporothrix
schenckii, Trichophyton spp., Trichosporon spp., Zygomycete fungi,
Absidia corymbifera, Rhizomucor pusillus and Rhizopus arrhizus.
Thus, representative fungal antigens that can be used in the
compositions and methods of the present invention include, but are
not limited to, candida fungal antigen components; histoplasma
fungal antigens such as heat shock protein 60 (HSP60) and other
histoplasma fungal antigen components; cryptococcal fungal antigens
such as capsular polysaccharides and other cryptococcal fungal
antigen components; coccidioides fungal antigens such as spherule
antigens and other coccidioides fungal antigen components; and
tinea fungal antigens such as trichophytin and other tinea
fungal antigen components.
[0242] Illustrative examples of bacteria include bacteria that are
responsible for diseases including, but not restricted to,
diphtheria (e.g., Corynebacterium diphtheriae), pertussis (e.g.,
Bordetella pertussis, GenBank Accession No. M35274), tetanus (e.g.,
Clostridium tetani, GenBank Accession No. M64353), tuberculosis
(e.g., Mycobacterium tuberculosis), bacterial pneumonias (e.g.,
Haemophilus influenzae), cholera (e.g., Vibrio cholerae), anthrax
(e.g., Bacillus anthracis), typhoid, plague, shigellosis (e.g.,
Shigella dysenteriae), botulism (e.g., Clostridium botulinum),
salmonellosis (e.g., GenBank Accession No. L03833), peptic ulcers
(e.g., Helicobacter pylori), Legionnaire's Disease, Lyme disease
(e.g., GenBank Accession No. U59487). Other pathogenic bacteria
include Escherichia coli, Clostridium perfringens, Pseudomonas
aeruginosa, Staphylococcus aureus and Streptococcus pyogenes. Thus,
bacterial antigens which can be used in the compositions and
methods of the invention include, but are not limited to: pertussis
bacterial antigens such as pertussis toxin, filamentous
hemagglutinin, pertactin, FIM2, FIM3, adenylate cyclase and other
pertussis bacterial antigen components; diphtheria bacterial
antigens such as diphtheria toxin or toxoid and other diphtheria
bacterial antigen components; tetanus bacterial antigens such as
tetanus toxin or toxoid and other tetanus bacterial antigen
components; streptococcal bacterial antigens such as M proteins and
other streptococcal bacterial antigen components; gram-negative
bacilli bacterial antigens such as lipopolysaccharides and other
gram-negative bacterial antigen components; Mycobacterium
tuberculosis bacterial antigens such as mycolic acid, heat shock
protein 65 (HSP65), the kDa major secreted protein, antigen 85A and
other mycobacterial antigen components; Helicobacter pylori
bacterial antigen components; pneumococcal bacterial antigens such
as pneumolysin, pneumococcal capsular polysaccharides and other
pneumococcal bacterial antigen components; Haemophilus influenzae
bacterial antigens such as capsular polysaccharides and other
Haemophilus influenzae bacterial antigen components; anthrax
bacterial antigens such as anthrax protective antigen and other
anthrax bacterial antigen components; rickettsiae bacterial
antigens such as rompA and other rickettsiae bacterial antigen
components. Also included with the bacterial antigens described
herein are any other bacterial, mycobacterial, mycoplasmal,
rickettsial, or chlamydial antigens.
[0243] Illustrative examples of protozoa include protozoa that are
responsible for diseases including, but not limited to, malaria
(e.g., GenBank Accession No. X53832), hookworm, onchocerciasis
(e.g., GenBank Accession No. M27807), schistosomiasis (e.g.,
GenBank Accession No. LOS198), toxoplasmosis, trypanosomiasis,
leishmaniasis, giardiasis (GenBank Accession No. M33641),
amoebiasis, filariasis (e.g., GenBank Accession No. J03266),
borreliosis, and trichinosis. Thus, protozoal antigens which can be
used in the compositions and methods of the invention include, but
are not limited to: Plasmodium falciparum antigens such as
merozoite surface antigens, sporozoite surface antigens,
circumsporozoite antigens, gametocyte/gamete surface antigens,
blood-stage antigen pf 155/RESA and other plasmodial antigen
components; toxoplasma antigens such as SAG-1, p30 and other
toxoplasma antigen components; schistosoma antigens such as
glutathione-S-transferase, paramyosin, and other schistosomal
antigen components; Leishmania major and other leishmaniae antigens
such as gp63, lipophosphoglycan and its associated protein and
other leishmanial antigen components; and Trypanosoma cruzi
antigens such as the 75-77 kDa antigen, the 56 kDa antigen and
other trypanosomal antigen components.
[0244] The present invention also contemplates toxin components as
antigens, illustrative examples of which include staphylococcal
enterotoxins, toxic shock syndrome toxin; retroviral antigens
(e.g., antigens derived from HIV), streptococcal antigens,
staphylococcal enterotoxin-A (SEA), staphylococcal enterotoxin-B
(SEB), staphylococcal enterotoxin.sub.1-3 (SE.sub.1-3),
staphylococcal enterotoxin-D (SED), staphylococcal enterotoxin-E
(SEE) as well as toxins derived from mycoplasma, mycobacterium, and
herpes viruses.
9. Construction of Synthetic Polynucleotides
[0245] Replacement of one codon for another can be achieved using
standard methods known in the art. For example, codon modification
of a parent polynucleotide can be effected using several known
mutagenesis techniques including, for example,
oligonucleotide-directed mutagenesis, mutagenesis with degenerate
oligonucleotides, and region-specific mutagenesis. Exemplary in
vitro mutagenesis techniques are described for example in U.S. Pat.
Nos. 4,184,917, 4,321,365 and 4,351,901 or in the relevant sections
of Ausubel, et al. (CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John
Wiley & Sons, Inc. 1997) and of Sambrook, et al., (MOLECULAR
CLONING. A LABORATORY MANUAL, Cold Spring Harbor Press, 1989).
Instead of in vitro mutagenesis, the synthetic polynucleotide can
be synthesized de novo using readily available machinery as
described, for example, in U.S. Pat. No. 4,293,652. However, it
should be noted that the present invention is not dependent on, and
not directed to, any one particular technique for constructing the
synthetic polynucleotide.
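By way of non-limiting illustration only, codon replacement of the kind described above can be sketched in Python as follows. The codon table is a deliberately small, hypothetical subset of the standard genetic code, and the function names and replacement map are assumptions introduced for this sketch; they are not taken from the present specification.

```python
# Illustrative sketch only. CODON_TO_AA is a small subset of the standard
# genetic code chosen for the example; names here are hypothetical.
CODON_TO_AA = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "CTG": "Leu", "CTC": "Leu", "TTA": "Leu",
    "ATG": "Met", "TAA": "Stop",
}


def replace_codons(parent, replacements):
    """Replace codons in an in-frame coding sequence, one codon at a time.

    Every replacement must be synonymous, so that the encoded amino acid
    sequence of the parent polynucleotide is left unchanged.
    """
    if len(parent) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    out = []
    for i in range(0, len(parent), 3):
        codon = parent[i:i + 3].upper()
        new = replacements.get(codon, codon)
        if CODON_TO_AA.get(new) != CODON_TO_AA.get(codon):
            raise ValueError(f"{codon}->{new} is not a synonymous change")
        out.append(new)
    return "".join(out)


def translate(seq):
    """Translate a sequence made only of codons present in CODON_TO_AA."""
    return [CODON_TO_AA[seq[i:i + 3]] for i in range(0, len(seq), 3)]
```

For instance, replace_codons("ATGGCTTTATAA", {"GCT": "GCC", "TTA": "CTG"}) changes two codons while translate() gives the same residues for parent and product. Which codon is substituted for which would in practice depend on the codon preference comparison obtained for the cell or tissue of interest.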
[0246] The parent polynucleotide is suitably a natural gene.
However, it is possible that the parent polynucleotide is not
naturally-occurring but has been engineered using recombinant
techniques. Parent polynucleotides can be obtained from any
suitable source, such as from eukaryotic or prokaryotic organisms,
including but not limited to mammals or other animals, and
pathogenic organisms such as yeasts, bacteria, protozoa and
viruses.
[0247] The invention also contemplates synthetic polynucleotides
encoding one or more desired portions of a target antigen. In some
embodiments, the synthetic polynucleotide encodes at least about 5,
6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23,
24, 25, 30, 40, 50, 60, 70, 80, 90, 100, 120, 150, 300, 400, 500,
600, 700, 800, 900 or 1000, or even at least about 2000, 3000, 4000
or 5000 contiguous amino acid residues, or almost up to the total
number of amino acids present in a full-length target antigen. In
some embodiments, the synthetic polynucleotide encodes a plurality
of portions of the target antigen, wherein the portions are the
same or different. In illustrative examples of this type, the
synthetic polynucleotide encodes a multi-epitope fusion protein. A
number of factors can influence the choice of portion size. For
example, the size of individual portions encoded by the synthetic
polynucleotide can be chosen such that it includes, or corresponds
to the size of, T cell epitopes and/or B cell epitopes, and their
processing requirements. Practitioners in the art will recognize
that class I-restricted T cell epitopes are typically between 8 and
10 amino acid residues in length and if placed next to unnatural
flanking residues, such epitopes can generally require 2 to 3
natural flanking amino acid residues to ensure that they are
efficiently processed and presented. Class II-restricted T cell
epitopes usually range between 12 and 25 amino acid residues in
length and may not require natural flanking residues for efficient
proteolytic processing although it is believed that natural
flanking residues may play a role. Another important feature of
class II-restricted epitopes is that they generally contain a core
of 9-10 amino acid residues in the middle which bind specifically
to class II MHC molecules with flanking sequences either side of
this core stabilizing binding by associating with conserved
structures on either side of class II MHC antigens in a sequence
independent manner. Thus the functional region of class
II-restricted epitopes is typically less than about 15 amino acid
residues long. The size of linear B cell epitopes and the factors
affecting their processing, like class II-restricted epitopes, are
quite variable although such epitopes are frequently smaller in
size than 15 amino acid residues. From the foregoing, it is
advantageous, but not essential, that the size of individual
portions of the target antigen is at least 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 20, 25, 30 amino acid residues. Suitably, the size of
individual portions is no more than about 500, 200, 100, 80, 60,
50, 40 amino acid residues. In certain advantageous embodiments,
the size of individual portions is sufficient for presentation by
an antigen-presenting cell of a T cell and/or a B cell epitope
contained within the peptide.
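As a purely illustrative sketch, the selection of portions of a target antigen by size can be expressed in Python as below. The function names, the example sequence and the default flank length are assumptions made for this sketch; the flank length of 3 merely mirrors the 2 to 3 natural flanking residues mentioned above for class I-restricted epitopes.

```python
def portions(antigen, size, step=1):
    """Return every contiguous portion of `antigen` that is `size` residues long."""
    if size <= 0 or step <= 0:
        raise ValueError("size and step must be positive")
    return [antigen[i:i + size]
            for i in range(0, len(antigen) - size + 1, step)]


def with_flanks(antigen, start, length, flank=3):
    """Return the portion starting at `start` plus natural flanking residues.

    The flanks are truncated where the portion lies near either end of
    the antigen.
    """
    if start < 0 or length <= 0 or start + length > len(antigen):
        raise ValueError("portion lies outside the antigen")
    left = max(0, start - flank)
    right = min(len(antigen), start + length + flank)
    return antigen[left:right]
```

With a hypothetical 10-residue sequence "MKTAYIAKQR", portions(..., 9) yields the two overlapping 9-mers, and with_flanks() extends a chosen portion by its native neighbours.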
[0248] As will be appreciated by those of skill in the art, it is
generally not necessary to immunize with a polypeptide that shares
exactly the same amino acid sequence with the target antigen to
produce an immune response to that antigen. In some embodiments,
therefore, the polypeptide encoded by the synthetic polynucleotide
is desirably a variant of at least a portion of the target antigen.
"Variant" polypeptides include proteins derived from the target
antigen by deletion (so-called truncation) or addition of one or
more amino acids to the N-terminal and/or C-terminal end of the
target antigen; deletion or addition of one or more amino acids at
one or more sites in the target antigen; or substitution of one or
more amino acids at one or more sites in the target antigen.
Variant polypeptides encompassed by the present invention will have
at least 40%, 50%, 60%, 70%, generally at least 75%, 80%, 85%,
typically at least about 90% to 95% or more, and more typically at
least about 96%, 97%, 98%, 99% or more sequence similarity or
identity with the amino acid sequence of the target antigen or
portion thereof as determined by sequence alignment programs
described elsewhere herein using default parameters. A variant of a
target antigen may differ from that antigen generally by as many as
1000, 500, 400, 300, 200, 100, 50 or 20 amino acid residues or
suitably by as few as 1-15 amino acid residues, as few as 1-10,
such as 6-10, as few as 5, as few as 4, 3, 2, or even 1 amino acid
residue.
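For illustration only, the percentage sequence identity between a variant and a target antigen can be computed from a pre-computed pairwise alignment as sketched below in Python. The alignment step itself is not shown; the gap character "-" and the choice of denominator (all alignment columns that are not gaps in both sequences) are assumptions of this sketch and may differ from the conventions of any particular alignment program.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two sequences that are already aligned.

    Both inputs must have the same length, with '-' marking gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns where both sequences have a gap.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if not (a == "-" and b == "-")]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)


def count_differences(aligned_a, aligned_b):
    """Number of aligned positions at which the two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for a, b in zip(aligned_a, aligned_b) if a != b)
```

A variant differing from a hypothetical 10-residue portion at a single position would thus score 90% identity and one residue difference.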
[0249] Variant polypeptides corresponding to at least a portion of
a target antigen may contain conservative amino acid substitutions
at various locations along their sequence, as compared to the
target antigen amino acid sequence. A "conservative amino acid
substitution" is one in which the amino acid residue is replaced
with an amino acid residue having a similar side chain. Families of
amino acid residues having similar side chains have been defined in
the art, which can be generally sub-classified as follows:
[0250] Acidic: The residue has a negative charge due to loss of H
ion at physiological pH and the residue is attracted by aqueous
solution so as to seek the surface positions in the conformation of
a peptide in which it is contained when the peptide is in aqueous
medium at physiological pH. Amino acids having an acidic side chain
include glutamic acid and aspartic acid.
[0251] Basic: The residue has a positive charge due to association
with H ion at physiological pH or within one or two pH units
thereof (e.g., histidine) and the residue is attracted by aqueous
solution so as to seek the surface positions in the conformation of
a peptide in which it is contained when the peptide is in aqueous
medium at physiological pH. Amino acids having a basic side chain
include arginine, lysine and histidine.
[0252] Charged: The residues are charged at physiological pH and,
therefore, include amino acids having acidic or basic side chains
(i.e., glutamic acid, aspartic acid, arginine, lysine and
histidine).
[0253] Hydrophobic: The residues are not charged at physiological
pH and the residue is repelled by aqueous solution so as to seek
the inner positions in the conformation of a peptide in which it is
contained when the peptide is in aqueous medium. Amino acids having
a hydrophobic side chain include tyrosine, valine, isoleucine,
leucine, methionine, phenylalanine and tryptophan.
[0254] Neutral/polar: The residues are not charged at physiological
pH, but the residue is not sufficiently repelled by aqueous
solutions so that it would seek inner positions in the conformation
of a peptide in which it is contained when the peptide is in
aqueous medium. Amino acids having a neutral/polar side chain
include asparagine, glutamine, cysteine, histidine, serine and
threonine.
[0255] This description also characterizes certain amino acids as
"small" since their side chains are not sufficiently large, even if
polar groups are lacking, to confer hydrophobicity. With the
exception of proline, "small" amino acids are those with four
carbons or less when at least one polar group is on the side chain
and three carbons or less when not. Amino acids having a small side
chain include glycine, serine, alanine and threonine. The
gene-encoded secondary amino acid proline is a special case due to
its known effects on the secondary conformation of peptide chains.
The structure of proline differs from all the other
naturally-occurring amino acids in that its side chain is bonded to
the nitrogen of the .alpha.-amino group, as well as the
.alpha.-carbon. Several amino acid similarity matrices (e.g.,
PAM120 matrix and PAM250 matrix as disclosed for example by Dayhoff
et al. (1978) A model of evolutionary change in proteins. Matrices
for determining distance relationships In M. O. Dayhoff, (ed.),
Atlas of protein sequence and structure, Vol. 5, pp. 345-358,
National Biomedical Research Foundation, Washington D.C.; and by
Gonnet et al., 1992, Science 256(5062): 1443-1445), however,
include proline in the same group as glycine, serine, alanine and
threonine. Accordingly, for the purposes of the present invention,
proline is classified as a "small" amino acid.
[0256] The degree of attraction or repulsion required for
classification as polar or nonpolar is arbitrary and, therefore,
amino acids specifically contemplated by the invention have been
classified as one or the other. Most amino acids not specifically
named can be classified on the basis of known behavior.
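The sub-classification set out above can be represented, by way of a minimal and non-authoritative sketch, as a mapping from class name to member residues. The class memberships below are copied from the preceding paragraphs (including the classification of proline as "small"); the variable and function names are assumptions introduced for the sketch.

```python
# Class memberships as listed in the text above; three-letter residue codes.
ACIDIC = {"Glu", "Asp"}
BASIC = {"Arg", "Lys", "His"}

CLASSES = {
    "acidic": ACIDIC,
    "basic": BASIC,
    "charged": ACIDIC | BASIC,
    "hydrophobic": {"Tyr", "Val", "Ile", "Leu", "Met", "Phe", "Trp"},
    "neutral/polar": {"Asn", "Gln", "Cys", "His", "Ser", "Thr"},
    "small": {"Gly", "Ser", "Ala", "Thr", "Pro"},
}


def classes_of(residue):
    """Return the sorted names of every class that contains `residue`."""
    return sorted(name for name, members in CLASSES.items()
                  if residue in members)


def share_class(residue_a, residue_b):
    """True if the two residues fall together in at least one class."""
    return bool(set(classes_of(residue_a)) & set(classes_of(residue_b)))
```

Because a residue may fall in two or more classes, classes_of() returns a list; histidine, for example, is listed above as both basic and neutral/polar.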
[0257] Amino acid residues can be further sub-classified as cyclic
or noncyclic, and aromatic or nonaromatic, self-explanatory
classifications with respect to the side-chain substituent groups
of the residues, and as small or large. The residue is considered
small if it contains a total of four carbon atoms or less,
inclusive of the carboxyl carbon, provided an additional polar
substituent is present; three or less if not. Small residues are,
of course, always nonaromatic. Dependent on their structural
properties, amino acid residues may fall in two or more classes.
For the naturally-occurring protein amino acids, sub-classification
according to this scheme is presented in Table 10.
TABLE-US-00010 TABLE 10
Original Residue    Exemplary Substitutions
Ala                 Ser
Arg                 Lys
Asn                 Gln, His
Asp                 Glu
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Pro
His                 Asn, Gln
Ile                 Leu, Val
Leu                 Ile, Val
Lys                 Arg, Gln, Glu
Met                 Leu, Ile
Phe                 Met, Leu, Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp, Phe
Val                 Ile, Leu
[0258] Conservative amino acid substitution also includes groupings
based on side chains. For example, a group of amino acids having
aliphatic side chains is glycine, alanine, valine, leucine, and
isoleucine; a group of amino acids having aliphatic-hydroxyl side
chains is serine and threonine; a group of amino acids having
amide-containing side chains is asparagine and glutamine; a group
of amino acids having aromatic side chains is phenylalanine,
tyrosine, and tryptophan; a group of amino acids having basic side
chains is lysine, arginine, and histidine; and a group of amino
acids having sulfur-containing side chains is cysteine and
methionine. For example, it is reasonable to expect that
replacement of a leucine with an isoleucine or valine, an aspartate
with a glutamate, a threonine with a serine, or a similar
replacement of an amino acid with a structurally related amino acid
will not have a major effect on the properties of the resulting
variant polypeptide. Conservative substitutions are shown in Table
11 below under the heading of exemplary substitutions. More
preferred substitutions are shown under the heading of preferred
substitutions. Amino acid substitutions falling within the scope of
the invention, are, in general, accomplished by selecting
substitutions that do not differ significantly in their effect on
maintaining (a) the structure of the peptide backbone in the area
of the substitution, (b) the charge or hydrophobicity of the
molecule at the target site, or (c) the bulk of the side chain.
After the substitutions are introduced, the variants are screened
for biological activity.
TABLE-US-00011 TABLE 11 EXEMPLARY AND PREFERRED AMINO ACID SUBSTITUTIONS
Original Residue    Exemplary Substitutions                Preferred Substitutions
Ala                 Val, Leu, Ile                          Val
Arg                 Lys, Gln, Asn                          Lys
Asn                 Gln, His, Lys, Arg                     Gln
Asp                 Glu                                    Glu
Cys                 Ser                                    Ser
Gln                 Asn, His, Lys                          Asn
Glu                 Asp, Lys                               Asp
Gly                 Pro                                    Pro
His                 Asn, Gln, Lys, Arg                     Arg
Ile                 Leu, Val, Met, Ala, Phe, Norleu        Leu
Leu                 Norleu, Ile, Val, Met, Ala, Phe        Ile
Lys                 Arg, Gln, Asn                          Arg
Met                 Leu, Ile, Phe                          Leu
Phe                 Leu, Val, Ile, Ala                     Leu
Pro                 Gly                                    Gly
Ser                 Thr                                    Thr
Thr                 Ser                                    Ser
Trp                 Tyr                                    Tyr
Tyr                 Trp, Phe, Thr, Ser                     Phe
Val                 Ile, Leu, Met, Phe, Ala, Norleu        Leu
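Purely as an illustrative sketch, the entries of Table 11 can be encoded as a lookup so that a proposed substitution may be checked against the exemplary and preferred columns. The data are transcribed from Table 11; the names TABLE_11, is_exemplary and preferred are hypothetical and introduced only for this sketch.

```python
# residue -> (exemplary substitutions, preferred substitution), per Table 11.
TABLE_11 = {
    "Ala": (("Val", "Leu", "Ile"), "Val"),
    "Arg": (("Lys", "Gln", "Asn"), "Lys"),
    "Asn": (("Gln", "His", "Lys", "Arg"), "Gln"),
    "Asp": (("Glu",), "Glu"),
    "Cys": (("Ser",), "Ser"),
    "Gln": (("Asn", "His", "Lys"), "Asn"),
    "Glu": (("Asp", "Lys"), "Asp"),
    "Gly": (("Pro",), "Pro"),
    "His": (("Asn", "Gln", "Lys", "Arg"), "Arg"),
    "Ile": (("Leu", "Val", "Met", "Ala", "Phe", "Norleu"), "Leu"),
    "Leu": (("Norleu", "Ile", "Val", "Met", "Ala", "Phe"), "Ile"),
    "Lys": (("Arg", "Gln", "Asn"), "Arg"),
    "Met": (("Leu", "Ile", "Phe"), "Leu"),
    "Phe": (("Leu", "Val", "Ile", "Ala"), "Leu"),
    "Pro": (("Gly",), "Gly"),
    "Ser": (("Thr",), "Thr"),
    "Thr": (("Ser",), "Ser"),
    "Trp": (("Tyr",), "Tyr"),
    "Tyr": (("Trp", "Phe", "Thr", "Ser"), "Phe"),
    "Val": (("Ile", "Leu", "Met", "Phe", "Ala", "Norleu"), "Leu"),
}


def is_exemplary(original, replacement):
    """True if `replacement` is listed as an exemplary substitution."""
    exemplary, _ = TABLE_11[original]
    return replacement in exemplary


def preferred(original):
    """Return the preferred substitution listed for `original`."""
    return TABLE_11[original][1]
```

A lookup of this kind only screens candidate substitutions; as noted above, the resulting variants would still be screened for biological activity.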
[0259] Alternatively, similar amino acids for making conservative
substitutions can be grouped into three categories based on the
identity of the side chains. The first group includes glutamic
acid, aspartic acid, arginine, lysine, histidine, which all have
charged side chains; the second group includes glycine, serine,
threonine, cysteine, tyrosine, glutamine, asparagine; and the third
group includes leucine, isoleucine, valine, alanine, proline,
phenylalanine, tryptophan, methionine, as described in Zubay, G.,
Biochemistry, third edition, Wm.C. Brown Publishers (1993).
[0260] The invention further contemplates a chimeric construct
comprising a synthetic polynucleotide of the invention, which is
operably linked to a regulatory sequence. The regulatory sequence
suitably comprises transcriptional and/or translational control
sequences, which will be compatible for expression in the organism
of interest or in cells of that organism. Typically, the
transcriptional and translational regulatory control sequences
include, but are not limited to, a promoter sequence, a 5'
non-coding region, a cis-regulatory region such as a functional
binding site for a transcriptional regulatory protein or
translational regulatory protein, an upstream open reading frame,
ribosomal-binding sequences, transcriptional start site,
translational start site, and/or nucleotide sequence which encodes
a leader sequence, termination codon, translational stop site and a
3' non-translated region. Constitutive or inducible promoters as
known in the art are contemplated by the invention. The promoters
may be either naturally occurring promoters, or hybrid promoters
that combine elements of more than one promoter. Promoter sequences
contemplated by the present invention may be native to the organism
of interest or may be derived from an alternative source, where the
region is functional in the chosen organism. The choice of promoter
will differ depending on the intended host or cell or tissue type.
For example, promoters which could be used for expression in
mammals include the metallothionein promoter, which can be induced
in response to heavy metals such as cadmium, the .beta.-actin
promoter as well as viral promoters such as the SV40 large T
antigen promoter, human cytomegalovirus (CMV) immediate early (IE)
promoter, Rous sarcoma virus LTR promoter, the mouse mammary tumor
virus LTR promoter, the adenovirus major late promoter (Ad MLP),
the herpes simplex virus promoter, and a HPV promoter, particularly
the HPV upstream regulatory region (URR), among others. All these
promoters are well described and readily available in the art.
[0261] Enhancer elements may also be used herein to increase
expression levels of the mammalian constructs. Examples include the
SV40 early gene enhancer, as described for example in Dijkema et
al. (1985, EMBO J. 4:761), the enhancer/promoter derived from the
long terminal repeat (LTR) of the Rous Sarcoma Virus, as described
for example in Gorman et al., (1982, Proc. Natl. Acad. Sci. USA
79:6777) and elements derived from human CMV, as described for
example in Boshart et al. (1985, Cell 41:521), such as elements
included in the CMV intron A sequence.
[0262] The chimeric construct may also comprise a 3' non-translated
sequence. A 3' non-translated sequence refers to that portion of a
gene comprising a DNA segment that contains a polyadenylation
signal and any other regulatory signals capable of effecting mRNA
processing or gene expression. The polyadenylation signal is
characterized by effecting the addition of polyadenylic acid tracts
to the 3' end of the mRNA precursor. Polyadenylation signals are
commonly recognized by the presence of homology to the canonical
form 5'-AATAAA-3' although variations are not uncommon. The 3'
non-translated regulatory DNA sequence preferably includes from
about 50 to 1,000 nts and may contain transcriptional and
translational termination sequences in addition to a
polyadenylation signal and any other regulatory signals capable of
effecting mRNA processing or gene expression.
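As a minimal sketch, and without implying any particular method, candidate polyadenylation signals in a 3' non-translated sequence can be located by searching for the canonical hexamer. The function name is hypothetical, and the search deliberately matches only the exact canonical form; the variations mentioned above are not detected.

```python
def find_polya_signals(utr, motif="AATAAA"):
    """Return 0-based start positions of `motif` in a 3' non-translated sequence.

    The input may be DNA or RNA in either case; U is treated as T.
    """
    seq = utr.upper().replace("U", "T")
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        # Continue one base further on so overlapping matches are found.
        start = seq.find(motif, start + 1)
    return hits
```

An empty result indicates only that the exact canonical hexamer is absent, not that the sequence lacks a functional polyadenylation signal.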
[0263] In some embodiments, the chimeric construct further contains
a selectable marker gene to permit selection of cells containing
the construct. Selection genes are well known in the art and will
be compatible for expression in the cell of interest.
[0264] It will be understood, however, that expression of
protein-encoding polynucleotides in heterologous systems is now
well known, and the present invention is not directed to or
dependent on any particular vector, transcriptional control
sequence or technique for expression of the polynucleotides.
Rather, synthetic polynucleotides prepared according to the methods
set forth herein may be introduced into a mammal in any suitable
manner in the form of any suitable construct or vector, and the
synthetic polynucleotides may be expressed with known transcription
regulatory elements in any conventional manner.
[0265] In addition, chimeric constructs can be constructed that
include sequences coding for adjuvants. Particularly suitable are
detoxified mutants of bacterial ADP-ribosylating toxins, for
example, diphtheria toxin, pertussis toxin (PT), cholera toxin
(CT), Escherichia coli heat-labile toxins (LT1 and LT2),
Pseudomonas exotoxin A, Clostridium botulinum C2 and C3 toxins, as
well as toxins from C. perfringens, C. spiroforme and C. difficile.
In some embodiments, the chimeric constructs include coding
sequences for detoxified mutants of E. coli heat-labile toxins,
such as the LT-K63 and LT-R72 detoxified mutants, described in U.S.
Pat. No. 6,818,222. In some embodiments, the adjuvant is a
protein-destabilising element, which increases processing and
presentation of the polypeptide that corresponds to at least a
portion of the target antigen through the class I MHC pathway,
thereby leading to enhanced cell-mediated immunity against the
polypeptide. Illustrative protein-destabilising elements include
intracellular protein degradation signals or degrons which may be
selected without limitation from a destabilising amino acid at the
amino-terminus of a polypeptide of interest, a PEST region or a
ubiquitin. For example, the coding sequence for the polypeptide can
be modified to include a destabilising amino acid at its
amino-terminus so that the protein so modified is subject to the
N-end rule pathway as disclosed, for example, by Bachmair et al. in
U.S. Pat. No. 5,093,242 and by Varshavsky et al. in U.S. Pat. No.
5,122,463. In some embodiments, the destabilising amino acid is
selected from isoleucine and glutamic acid, especially from
histidine, tyrosine and glutamine, and more especially from aspartic
acid, asparagine, phenylalanine, leucine, tryptophan and lysine. In
certain embodiments, the destabilising amino acid is arginine. In
some proteins, the amino-terminal end is obscured as a result of
the protein's conformation (i.e., its tertiary or quaternary
structure). In these cases, more extensive alteration of the
amino-terminus may be necessary to make the protein subject to the
N-end rule pathway. For example, where simple addition or
replacement of the single amino-terminal residue is insufficient
because of an inaccessible amino-terminus, several amino acids
(including lysine, the site of ubiquitin joining to substrate
proteins) may be added to the original amino-terminus to increase
the accessibility and/or segmental mobility of the engineered amino
terminus. In some embodiments, a nucleic acid sequence encoding the
amino-terminal region of the polypeptide can be modified to
introduce a lysine residue in an appropriate context. This can be
achieved most conveniently by employing DNA constructs encoding
"universal destabilising segments". A universal destabilising
segment comprises a nucleic acid construct which encodes a
polypeptide structure, preferably segmentally mobile, containing
one or more lysine residues, the codons for lysine residues being
positioned within the construct such that when the construct is
inserted into the coding sequence of the protein-encoding synthetic
polynucleotide, the lysine residues are sufficiently spatially
proximate to the amino-terminus of the encoded protein to serve as
the second determinant of the complete amino-terminal degradation
signal. The insertion of such constructs into the 5' portion of a
polypeptide-encoding synthetic polynucleotide would provide the
encoded polypeptide with a lysine residue (or residues) in an
appropriate context for destabilization. In other embodiments, the
polypeptide is modified to contain a PEST region, which is rich in
an amino acid selected from proline, glutamic acid, serine and
threonine, which region is optionally flanked by amino acids
comprising electropositive side chains. In this regard, it is known
that amino acid sequences of proteins with intracellular half-lives
less than about 2 hours contain one or more regions rich in proline
(P), glutamic acid (E), serine (S), and threonine (T) as for
example shown by Rogers et al. (1986, Science 234 (4774): 364-368).
In still other embodiments, the polypeptide is conjugated to a
ubiquitin or a biologically active fragment thereof, to produce a
modified polypeptide whose rate of intracellular proteolytic
degradation is increased, enhanced or otherwise elevated relative
to the unmodified polypeptide.
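By way of illustration only, two of the features discussed above can be sketched in Python: whether the amino-terminal residue is one of the destabilising amino acids listed, and which windows of a sequence are rich in proline, glutamic acid, serine and threonine. The window size and threshold are arbitrary assumptions of this sketch and are not a published PEST-scoring method; the function names are likewise hypothetical.

```python
# One-letter codes for the destabilising amino-terminal residues listed above:
# Ile, Glu, His, Tyr, Gln, Asp, Asn, Phe, Leu, Trp, Lys and Arg.
DESTABILISING_N_TERMINAL = set("IEHYQDNFLWKR")
PEST_RESIDUES = set("PEST")


def has_destabilising_n_terminus(protein):
    """True if the first residue is one of the listed destabilising residues."""
    return bool(protein) and protein[0].upper() in DESTABILISING_N_TERMINAL


def pest_fraction(window):
    """Fraction of residues in `window` that are P, E, S or T."""
    if not window:
        return 0.0
    return sum(r in PEST_RESIDUES for r in window.upper()) / len(window)


def pest_rich_windows(protein, size=12, threshold=0.5):
    """Return (start, window) pairs whose P/E/S/T fraction meets `threshold`.

    `size` and `threshold` are illustrative defaults, not values taken
    from the specification.
    """
    return [(i, protein[i:i + size])
            for i in range(len(protein) - size + 1)
            if pest_fraction(protein[i:i + size]) >= threshold]
```

A sketch of this kind merely flags candidate regions; whether a given modification actually increases intracellular degradation would need to be determined empirically.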
[0266] One or more adjuvant polypeptides may be co-expressed with
an `antigenic` polypeptide that corresponds to at least a portion
of the target antigen. In certain embodiments, adjuvant and
antigenic polypeptides may be co-expressed in the form of a fusion
protein comprising one or more adjuvant polypeptides and one or
more antigenic polypeptides. Alternatively, adjuvant and antigenic
polypeptides may be co-expressed as separate proteins.
[0267] Furthermore, chimeric constructs can be constructed that
include chimeric antigen-coding gene sequences, encoding, e.g.,
multiple antigens/epitopes of interest, for example derived from a
single or from more than one target antigen. In certain
embodiments, multi-cistronic cassettes (e.g., bi-cistronic
cassettes) can be constructed allowing expression of multiple
adjuvants and/or antigenic polypeptides from a single mRNA using,
for example, the EMCV IRES, or the like. In other embodiments,
adjuvants and/or antigenic polypeptides can be encoded on separate
coding sequences that are operably connected to independent
transcription regulatory elements.
[0268] In some embodiments, the chimeric constructs of the
invention are in the form of expression vectors which are suitably
selected from self-replicating extra-chromosomal vectors (e.g.,
plasmids) and vectors that integrate into a host genome. In
illustrative examples of this type, the expression vectors are
viral vectors, such as simian virus 40 (SV40) or bovine papilloma
virus (BPV), which have the ability to replicate as
extra-chromosomal elements (Eukaryotic Viral Vectors, Cold Spring
Harbor Laboratory, Gluzman ed., 1982; Sarver et al., 1981, Mol.
Cell. Biol. 1:486). Viral vectors include retroviral (lentivirus),
adeno-associated virus (see, e.g., Okada, 1996, Gene Ther.
3:957-964; Muzyczka, 1994, J. Clin. Invest. 94:1351; U.S. Pat. Nos.
6,156,303; 6,143,548; 5,952,221, describing AAV vectors; see also
U.S. Pat. Nos. 6,004,799; 5,833,993), adenovirus (see, e.g., U.S.
Pat. Nos. 6,140,087; 6,136,594; 6,133,028; 6,120,764), reovirus,
herpesvirus, rotavirus genomes etc., modified for introducing and
directing expression of a polynucleotide or transgene in cells.
Retroviral vectors can include those based upon murine leukemia
virus (see, e.g., U.S. Pat. No. 6,132,731), gibbon ape leukemia
virus (see, e.g., U.S. Pat. No. 6,033,905), simian
immuno-deficiency virus, human immuno-deficiency virus (see, e.g.,
U.S. Pat. No. 5,985,641), and combinations thereof.
[0269] Vectors also include those that efficiently deliver genes to
animal cells in vivo (e.g., stem cells) (see, e.g., U.S. Pat. Nos.
5,821,235 and 5,786,340; Croyle et al., 1998, Gene Ther. 5:645;
Croyle et al., 1998, Pharm. Res. 15:1348; Croyle et al., 1998, Hum.
Gene Ther. 9:561; Foreman et al., 1998, Hum. Gene Ther. 9:1313;
Wirtz et al., 1999, Gut 44:800). Adenoviral and adeno-associated
viral vectors suitable for in vivo delivery are described, for
example, in U.S. Pat. Nos. 5,700,470, 5,731,172 and 5,604,090.
Additional vectors suitable for in vivo delivery include herpes
simplex virus vectors (see, e.g., U.S. Pat. No. 5,501,979),
retroviral vectors (see, e.g., U.S. Pat. Nos. 5,624,820, 5,693,508
and 5,674,703; and WO92/05266 and WO92/14829), bovine papilloma
virus (BPV) vectors (see, e.g., U.S. Pat. No. 5,719,054), CMV-based
vectors (see, e.g., U.S. Pat. No. 5,561,063) and parvovirus,
rotavirus and Norwalk virus vectors. Lentiviral vectors are useful
for infecting dividing as well as non-dividing cells (see, e.g.,
U.S. Pat. No. 6,013,516).
[0270] Additional viral vectors which will find use for delivering
the nucleic acid molecules encoding the antigens of interest
include those derived from the pox family of viruses, including
vaccinia virus and avian poxvirus. By way of example, vaccinia
virus recombinants expressing the chimeric constructs can be
constructed as follows. The antigen coding sequence is first
inserted into an appropriate vector so that it is adjacent to a
vaccinia promoter and flanking vaccinia DNA sequences, such as the
sequence encoding thymidine kinase (TK). This vector is then used
to transfect cells that are simultaneously infected with vaccinia.
Homologous recombination serves to insert the vaccinia promoter
plus the gene encoding the coding sequences of interest into the
viral genome. The resulting TK-negative recombinant can be selected by
culturing the cells in the presence of 5-bromodeoxyuridine and
picking viral plaques resistant thereto.
[0271] Alternatively, avipoxviruses, such as the fowlpox and
canarypox viruses, can also be used to deliver the genes.
Recombinant avipox viruses, expressing immunogens from mammalian
pathogens, are known to confer protective immunity when
administered to non-avian species. The use of an avipox vector is
particularly desirable in human and other mammalian species since
members of the avipox genus can only productively replicate in
susceptible avian species and therefore are not infective in
mammalian cells. Methods for producing recombinant avipoxviruses
are known in the art and employ genetic recombination, as described
above with respect to the production of vaccinia viruses. See,
e.g., WO 91/12882; WO 89/03429; and WO 92/03545.
[0272] Molecular conjugate vectors, such as the adenovirus chimeric
vectors described in Michael et al., J. Biol. Chem. (1993)
268:6866-6869 and Wagner et al., Proc. Natl. Acad. Sci. USA (1992)
89:6099-6103, can also be used for gene delivery.
[0273] Members of the Alphavirus genus, such as, but not limited
to, vectors derived from the Sindbis virus (SIN), Semliki Forest
virus (SFV), and Venezuelan Equine Encephalitis virus (VEE), will
also find use as viral vectors for delivering the chimeric
constructs of the present invention. For a description of
Sindbis-virus derived vectors useful for the practice of the
instant methods, see, Dubensky et al. (1996, J. Virol. 70:508-519;
and International Publication Nos. WO 95/07995, WO 96/17072); as
well as, Dubensky, Jr., T. W., et al., U.S. Pat. No. 5,843,723, and
Dubensky, Jr., T. W., U.S. Pat. No. 5,789,245. Exemplary vectors of
this type are chimeric alphavirus vectors comprised of sequences
derived from Sindbis virus and Venezuelan equine encephalitis
virus. See, e.g., Perri et al. (2003, J. Virol. 77: 10394-10403)
and International Publication Nos. WO 02/099035, WO 02/080982, WO
01/81609, and WO 00/61772.
[0274] In other illustrative embodiments, lentiviral vectors are
employed to deliver a chimeric construct of the invention into
selected cells or tissues. Typically, these vectors comprise a 5'
lentiviral LTR, a tRNA binding site, a packaging signal, a promoter
operably linked to one or more genes of interest, an origin of
second strand DNA synthesis and a 3' lentiviral LTR, wherein the
lentiviral vector contains a nuclear transport element. The nuclear
transport element may be located either upstream (5') or downstream
(3') of a coding sequence of interest (for example, a synthetic Gag
or Env expression cassette of the present invention). A wide
variety of lentiviruses may be utilized within the context of the
present invention, including for example, lentiviruses selected
from the group consisting of HIV, HIV-1, HIV-2, FIV, BIV, EIAV,
MVV, CAEV, and SIV. Illustrative examples of lentiviral vectors are
described in PCT Publication Nos. WO 00/66759, WO 00/00600, WO
99/24465, WO 98/51810, WO 99/51754, WO 99/31251, WO 99/30742, and
WO 99/15641. Desirably, a third generation SIN lentivirus is used.
Commercial suppliers of third generation SIN (self-inactivating)
lentiviruses include Invitrogen (ViraPower Lentiviral Expression
System). Detailed methods for construction, transfection,
harvesting, and use of lentiviral vectors are given, for example,
in the Invitrogen technical manual "ViraPower Lentiviral Expression
System version B 050102 25-0501", available at
http://www.invitrogen.com/Content/Tech-Online/molecular_biology/manuals_pps/virapower_lentiviral_system_man.pdf.
Lentiviral vectors have
emerged as an efficient method for gene transfer. Improvements in
biosafety characteristics have made these vectors suitable for use
at biosafety level 2 (BL2). A number of safety features are
incorporated into third generation SIN (self-inactivating) vectors.
Deletion of the viral 3' LTR U3 region results in a provirus that
is unable to transcribe a full length viral RNA. In addition, a
number of essential genes are provided in trans, yielding a viral
stock that is capable of but a single round of infection and
integration. Lentiviral vectors have several advantages, including:
1) pseudotyping of the vector using amphotropic envelope proteins
allows them to infect virtually any cell type; 2) gene delivery to
quiescent, post mitotic, differentiated cells, including neurons,
has been demonstrated; 3) their low cellular toxicity is unique
among transgene delivery systems; 4) viral integration into the
genome permits long term transgene expression; 5) their packaging
capacity (6-14 kb) is much larger than that of other retroviral or
adeno-associated viral vectors. In a recent demonstration of the
capabilities of this system, lentiviral vectors expressing GFP were
used to infect murine stem cells resulting in live progeny,
germline transmission, and promoter-, and tissue-specific
expression of the reporter (Ailles, L. E. and Naldini, L.,
HIV-1-Derived Lentiviral Vectors. In: Trono, D. (Ed.), Lentiviral
Vectors, Springer-Verlag, Berlin, Heidelberg, New York, 2002, pp.
31-52). An example of the current generation vectors is outlined in
FIG. 2 of a review by Lois et al. (2002, Science 295: 868-872).
[0275] The chimeric construct can also be delivered without a
vector. For example, the chimeric construct can be packaged as DNA
or RNA in liposomes prior to delivery to the subject or to cells
derived therefrom. Lipid encapsulation is generally accomplished
using liposomes which are able to stably bind or entrap and retain
nucleic acid. The ratio of condensed DNA to lipid preparation can
vary but will generally be around 1:1 (mg DNA:micromoles lipid), or
more of lipid. For a review of the use of liposomes as carriers for
delivery of nucleic acids, see, Hug and Sleight, (1991, Biochim.
Biophys. Acta. 1097:1-17); and Straubinger et al., in Methods in
Enzymology (1983), Vol. 101, pp. 512-527.
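The ratio stated above can be worked through numerically. The following Python sketch is offered only as an illustration: it converts a mass of DNA into the corresponding amount of lipid at a chosen ratio (1 micromole of lipid per mg of DNA by default) and then into a volume of lipid stock. The function names and the stock concentration used in the usage note are assumptions, not values taken from the present disclosure.

```python
def lipid_micromoles(dna_mg: float, lipid_per_mg_dna: float = 1.0) -> float:
    """Micromoles of lipid for `dna_mg` of DNA at the given ratio.

    `lipid_per_mg_dna` is micromoles of lipid per mg of DNA; values above
    1.0 correspond to "more of lipid".
    """
    if dna_mg < 0 or lipid_per_mg_dna <= 0:
        raise ValueError("dna_mg must be >= 0 and the ratio must be > 0")
    return dna_mg * lipid_per_mg_dna


def stock_volume_ul(micromoles: float, stock_mm: float) -> float:
    """Microlitres of a `stock_mm` mM lipid stock that contain `micromoles`."""
    if stock_mm <= 0:
        raise ValueError("stock concentration must be > 0")
    # 1 mM equals 1 micromole per mL, so micromoles / mM gives mL.
    return micromoles / stock_mm * 1000.0
```

For example, 0.05 mg of DNA at the 1:1 ratio calls for 0.05 micromoles of lipid, which a hypothetical 10 mM stock would supply in about 5 microlitres.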
[0276] Liposomal preparations for use in the present invention
include cationic (positively charged), anionic (negatively charged)
and neutral preparations, with cationic liposomes particularly
preferred. Cationic liposomes have been shown to mediate
intracellular delivery of plasmid DNA (Felgner et al., 1987, Proc.
Natl. Acad. Sci. USA 84:7413-7416); mRNA (Malone et al., 1989,
Proc. Natl. Acad. Sci. USA 86:6077-6081); and purified
transcription factors (Debs et al., 1990, J. Biol. Chem.
265:10189-10192), in functional form.
[0277] Cationic liposomes are readily available. For example,
N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium (DOTMA) liposomes
are available under the trademark Lipofectin, from GIBCO BRL, Grand
Island, N.Y. (See, also, Felgner et al., 1987, Proc. Natl. Acad.
Sci. USA 84:7413-7416). Other commercially available lipids include
DDAB/DOPE and DOTAP/DOPE (Boehringer). Alternative cationic
liposomes can be prepared from readily available materials using
techniques well known in the art. See, e.g., Szoka et al., 1978,
Proc. Natl. Acad. Sci. USA 75:4194-4198; PCT Publication No. WO
90/11092 for a description of the synthesis of DOTAP
(1,2-bis(oleoyloxy)-3-(trimethylammonio)propane) liposomes.
[0278] Similarly, anionic and neutral liposomes are readily
available, such as, from Avanti Polar Lipids (Birmingham, Ala.), or
can be easily prepared using readily available materials. Such
materials include phosphatidyl choline, cholesterol, phosphatidyl
ethanolamine, dioleoylphosphatidyl choline (DOPC),
dioleoylphosphatidyl glycerol (DOPG), dioleoylphosphatidyl
ethanolamine (DOPE), among others. These materials can also be
mixed with the DOTMA and DOTAP starting materials in appropriate
ratios. Methods for making liposomes using these materials are well
known in the art.
[0279] The liposomes can comprise multilamellar vesicles (MLVs),
small unilamellar vesicles (SUVs), or large unilamellar vesicles
(LUVs). The various liposome-nucleic acid complexes are prepared
using methods known in the art. See, e.g., Straubinger et al., in
Methods in Enzymology (1983), Vol. 101, pp. 512-527; Szoka et al.,
1978, Proc. Natl. Acad. Sci. USA 75:4194-4198; Papahadjopoulos et
al., 1975, Biochim. Biophys. Acta 394:483; Wilson et al., 1979,
Cell 17:77; Deamer and Bangham, 1976, Biochim. Biophys. Acta
443:629; Ostro et al., 1977, Biochem. Biophys. Res. Commun. 76:836;
Fraley et al., 1979, Proc. Natl. Acad. Sci. USA 76:3348; Enoch and
Strittmatter, 1979, Proc. Natl. Acad. Sci. USA 76:145; Fraley et
al., 1980, J. Biol. Chem. 255:10431; Szoka and Papahadjopoulos,
1978, Proc. Natl. Acad. Sci. USA 75:145; and Schaefer-Ridder et
al., 1982, Science 215:166.
[0280] The chimeric construct can also be delivered in cochleate
lipid compositions similar to those described by Papahadjopoulos et
al., 1975, Biochim. Biophys. Acta 394:483-491. See, also, U.S.
Pat. Nos. 4,663,161 and 4,871,488.
[0281] The chimeric construct may also be encapsulated, adsorbed
to, or associated with, particulate carriers. Such carriers present
multiple copies of a selected chimeric construct to the immune
system. The particles can be taken up by professional antigen
presenting cells such as macrophages and dendritic cells, and/or
can enhance antigen presentation through other mechanisms such as
stimulation of cytokine release. Examples of particulate carriers
include those derived from polymethyl methacrylate polymers, as
well as microparticles derived from poly(lactides) and
poly(lactide-co-glycolides), known as PLG. See, e.g., Jeffery et
al., 1993, Pharm. Res. 10:362-368; McGee J. P., et al., 1997, J.
Microencapsul. 14(2):197-210; O'Hagan D. T., et al., 1993, Vaccine
11(2):149-54.
[0282] Furthermore, other particulate systems and polymers can be
used for the in vivo delivery of the chimeric construct. For
example, polymers such as polylysine, polyarginine, polyornithine,
spermine, spermidine, as well as conjugates of these molecules, are
useful for transferring a nucleic acid of interest. Similarly, DEAE
dextran-mediated transfection, calcium phosphate precipitation or
precipitation using other insoluble inorganic salts, such as
strontium phosphate, aluminum silicates including bentonite and
kaolin, chromic oxide, magnesium silicate, talc, and the like, will
find use with the present methods. See, e.g., Felgner, P. L.,
Advanced Drug Delivery Reviews (1990) 5:163-187, for a review of
delivery systems useful for gene transfer. Peptoids (Zuckerman, R.
N., et al., U.S. Pat. No. 5,831,005, issued Nov. 3, 1998) may also
be used for delivery of a construct of the present invention.
[0283] Additionally, biolistic delivery systems employing
particulate carriers such as gold and tungsten, are especially
useful for delivering chimeric constructs of the present invention.
The particles are coated with the synthetic expression cassette(s)
to be delivered and accelerated to high velocity, generally under a
reduced atmosphere, using a gun powder discharge from a "gene gun."
For a description of such techniques, and apparatuses useful
therefor, see, e.g., U.S. Pat. Nos. 4,945,050; 5,036,006;
5,100,792; 5,179,022; 5,371,015; and 5,478,744. In illustrative
examples, gas-driven particle acceleration can be achieved with
devices such as those manufactured by PowderMed Pharmaceuticals PLC
(Oxford, UK) and PowderMed Vaccines Inc. (Madison, Wis.), some
examples of which are described in U.S. Pat. Nos. 5,846,796;
6,010,478; 5,865,796; 5,584,807; and EP Patent No. 0500 799. This
approach offers needle-free delivery wherein a dry
powder formulation of microscopic particles, such as polynucleotide
or polypeptide particles, is accelerated to high speed within a
helium gas jet generated by a hand held device, propelling the
particles into a target tissue of interest. Other devices and
methods that may be useful for gas-driven needle-less injection of
compositions of the present invention include those provided by
Bioject, Inc. (Portland, Oreg.), some examples of which are
described in U.S. Pat. Nos. 4,790,824; 5,064,413; 5,312,335;
5,383,851; 5,399,163; 5,520,639 and 5,993,412.
[0284] Alternatively, micro-cannula- and microneedle-based devices
(such as those being developed by Becton Dickinson and others) can
be used to administer the chimeric constructs of the invention.
Illustrative devices of this type are described in EP 1 092 444 A1,
and U.S. application Ser. No. 606,909, filed Jun. 29, 2000.
Standard steel cannula can also be used for intra-dermal delivery
using devices and methods as described in U.S. Ser. No. 417,671,
filed Oct. 14, 1999. These methods and devices include the delivery
of substances through narrow gauge (about 30 G) "micro-cannula"
with limited depth of penetration, as defined by the total length
of the cannula or the total length of the cannula that is exposed
beyond a depth-limiting feature. It is within the scope of the
present invention that targeted delivery of substances including
chimeric constructs can be achieved either through a single
microcannula or an array of microcannula (or "microneedles"), for
example 3-6 microneedles mounted on an injection device that may
include or be attached to a reservoir in which the substance to be
administered is contained.
10. Compositions
[0285] The invention also provides compositions, particularly
immunomodulating compositions, comprising one or more of the
chimeric constructs described herein. The immunomodulating
compositions may comprise a mixture of chimeric constructs, which
in turn may be delivered, for example, using the same or different
vectors or vehicles. Antigens may be administered individually or
in combination, e.g., in prophylactic (i.e., to prevent infection
or disease) or therapeutic (to treat infection or disease)
immunomodulating compositions. The immunomodulating compositions
may be given more than once (e.g., a "prime" administration
followed by one or more "boosts") to achieve the desired effects.
The same composition can be administered in one or more priming and
one or more boosting steps. Alternatively, different compositions
can be used for priming and boosting.
[0286] The immunomodulating compositions will generally include one
or more "pharmaceutically acceptable excipients or vehicles" such
as water, saline, glycerol, ethanol, etc. Additionally, auxiliary
substances, such as wetting or emulsifying agents, pH buffering
substances, and the like, may be present in such vehicles.
[0287] Immunomodulating compositions will typically, in addition to
the components mentioned above, comprise one or more
"pharmaceutically acceptable carriers." These include any carrier
which does not itself induce the production of antibodies harmful
to the individual receiving the composition. Suitable carriers
typically are large, slowly metabolized macromolecules such as
proteins, polysaccharides, polylactic acids, polyglycolic acids,
polymeric amino acids, amino acid copolymers, and lipid aggregates
(such as oil droplets or liposomes). Such carriers are well known
to those of ordinary skill in the art. A composition may also
contain a diluent, such as water, saline, glycerol, etc.
Additionally, an auxiliary substance, such as a wetting or
emulsifying agent, pH buffering substance, and the like, may be
present. A thorough discussion of pharmaceutically acceptable
components is available in Gennaro (2000) Remington: The Science
and Practice of Pharmacy. 20th ed., ISBN: 0683306472.
[0288] Pharmaceutically compatible salts can also be used in
compositions of the invention, for example, mineral salts such as
hydrochlorides, hydrobromides, phosphates, or sulfates, as well as
salts of organic acids such as acetates, propionates, malonates, or
benzoates. Especially useful protein substrates are serum albumins,
keyhole limpet hemocyanin, immunoglobulin molecules, thyroglobulin,
ovalbumin, tetanus toxoid, and other proteins well known to those
of skill in the art.
[0289] The chimeric constructs of the invention can also be
adsorbed to, entrapped within or otherwise associated with
liposomes and particulate carriers such as PLG.
[0290] The chimeric constructs of the present invention are
formulated into compositions for delivery to a mammal. These
compositions may either be prophylactic (to prevent infection) or
therapeutic (to treat disease after infection). The compositions
will comprise a "therapeutically effective amount" of the gene of
interest such that an amount of the antigen can be produced in vivo
so that an immune response is generated in the individual to which
it is administered. The exact amount necessary will vary depending
on the subject being treated; the age and general condition of the
subject to be treated; the capacity of the subject's immune system
to synthesize antibodies; the degree of protection desired; the
severity of the condition being treated; the particular antigen
selected and its mode of administration, among other factors. An
appropriate effective amount can be readily determined by one of
skill in the art. Thus, a "therapeutically effective amount" will
fall in a relatively broad range that can be determined through
routine trials.
[0291] Once formulated, the compositions of the invention can be
administered directly to the subject (e.g., as described above).
Direct delivery of chimeric construct-containing compositions in
vivo will generally be accomplished with or without vectors, as
described above, by injection using either a conventional syringe,
needle-less devices such as Bioject™ or a gene gun, such as the
Accell™ gene delivery system (PowderMed Ltd, Oxford, England) or
microneedle device. The constructs can be delivered (e.g.,
injected) either subcutaneously, epidermally, intradermally,
intramuscularly, intravenously, intramucosally (such as nasally,
rectally and vaginally), intraperitoneally or orally. Delivery of
nucleic acid into cells of the epidermis is particularly preferred
as this mode of administration provides access to skin-associated
lymphoid cells and provides for a transient presence of nucleic
acid (e.g., DNA) in the recipient. Other modes of administration
include oral ingestion and pulmonary administration, suppositories,
needle-less injection, transcutaneous, topical, and transdermal
applications. Dosage treatment may be a single dose schedule or a
multiple dose schedule.
[0292] In order that the invention may be readily understood and
put into practical effect, particular preferred embodiments will
now be described by way of the following non-limiting examples.
EXAMPLES
Example 1
Synthetic Construct System for Determining the Immune Response
Preference of Codons in Mammals
Materials and Methods
[0293] Primer Design/synthesis and Sequence Manipulation
[0294] Oligonucleotides for site-directed mutagenesis were designed
according to the guidelines included in the mutagenesis kit manuals
(Quikchange II Site-directed Mutagenesis kit or Quikchange Multi
Site-directed Mutagenesis Kit; Stratagene, La Jolla Calif.). These
primers were synthesized and PAGE purified by Sigma (formerly
Proligo).
[0295] Oligonucleotides for whole gene synthesis were designed by
eye and synthesized by Sigma (formerly Proligo). The primers were
supplied as standard desalted oligos. No additional purification of
the oligonucleotides was carried out.
[0296] Sequence manipulation and analysis was carried out using the
suite of programs on Biomanager (ANGIS) and various other web-based
programs including BLAST at NCBI
(http://www.ncbi.nlm.nih.gov/blast/bl2seq/wblast2.cgi), NEBcutter
V2.0 from New England Biolabs
(http://tools.neb.com/NEBcutter2/index.php), the Translate Tool on
ExPASy (http://au.expasy.org/tools/dna.html), and the SignalP 3.0
server (http://www.cbs.dtu.dk/services/SignalP/).
[0297] Standard Cloning Techniques
[0298] Restriction enzyme digests, alkaline phosphatase treatments
and ligations were carried out according to the enzyme
manufacturers' instructions (various manufacturers including New
England Biolabs, Roche and Fermentas).
[0299] Purification of DNA from agarose gels and preparation of
mini-prep DNA were carried out using commercial kits (Qiagen,
Bio-Rad, Macherey-Nagel).
[0300] Agarose gel electrophoresis, phenol/chloroform extraction of
contaminant protein from DNA, ethanol precipitation of DNA and
other basic molecular biological procedures were carried out using
standard protocols, similar to those described in Current Protocols
in Molecular Biology (Ebook available via Wiley InterScience;
edited by Ausubel et al.).
[0301] Sequencing was carried out by the Australian Genome Research
Facility (AGRF, Brisbane).
[0302] Whole Gene Synthesis
[0303] Overlapping ~35-50mer oligonucleotides (Sigma-Proligo)
were used to synthesize longer DNA sequences. Restriction enzyme
sites were incorporated to facilitate cloning. The method used to
synthesize the fragments is based on that given in Smith et al.
(2003). First, oligonucleotides for the top or bottom strand were
mixed and then phosphorylated using T4 polynucleotide kinase (PNK;
New England Biolabs). The oligonucleotide mixes were then purified
from the PNK by a standard phenol/chloroform extraction and sodium
acetate/ethanol (NaAc/EtOH) precipitation. Equal volumes of
oligonucleotide mixes for the top and bottom strands were then
mixed and the oligonucleotides denatured by heating at 95 °C
for 2 min. The oligonucleotides were annealed by slowly cooling
the sample to 55 °C and the annealed oligonucleotides
ligated using Taq ligase (New England Biolabs). The resulting
fragment was purified by phenol/CHCl3 extraction and NaAc/EtOH
precipitation.
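The staggered arrangement of top-strand and bottom-strand oligonucleotides described above can be sketched in Python as follows. This is an illustrative layout only: in the present work the oligonucleotides were designed by eye, and the function names, the fixed oligonucleotide length and the half-length offset used here are assumptions rather than the procedure of Smith et al. (2003).

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def design_oligos(seq: str, oligo_len: int = 40):
    """Split `seq` into top-strand and staggered bottom-strand oligos.

    Top-strand oligos tile the sequence end to end. Bottom-strand oligos
    are offset by half an oligo length, so each one bridges the junction
    between two adjacent top-strand oligos and holds them together for
    ligation. Both lists are written 5' to 3'.
    """
    seq = seq.upper()
    offset = oligo_len // 2
    top = [seq[i:i + oligo_len] for i in range(0, len(seq), oligo_len)]
    bottom = [
        reverse_complement(seq[i:i + oligo_len])
        for i in range(offset, len(seq), oligo_len)
    ]
    return top, bottom
```

With this layout the first half of the first top-strand oligonucleotide has no bottom-strand partner, which corresponds to a single-stranded end of the kind that is filled in before amplification.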
[0304] The ends of the fragments were filled in and the fragments
then amplified, using the outermost forward and reverse primers,
with the Clontech Advantage HF 2 PCR kit (Clontech) according to
the manufacturer's instructions. To fill in the ends the following
PCR was used: 35 cycles of a denaturation step of 94 °C for
15 s, a slow annealing step where the temperature was ramped down to
55 °C over 7 minutes and then kept at 55 °C for 2
min, and an elongation step of 72 °C for 6 minutes. A final
elongation step for 7 min at 72 °C was then carried out.
The second PCR to amplify the fragment involved: an initial
denaturation step at 94 °C for 30 s, followed by 25 cycles
of 94 °C for 15 s, 55 °C for 30 s and 68 °C for
1 min, and a final elongation step of 68 °C for 3 min.
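The two cycling programs above can be written down as data and their programmed durations added up, as in the following Python sketch. The class and variable names are hypothetical, and the totals count only the hold and ramp times stated in the text; instrument ramping between the other steps is not included.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str
    temp_c: float   # target temperature in degrees Celsius
    seconds: int    # programmed duration of the step


@dataclass(frozen=True)
class Program:
    initial: tuple  # steps run once before cycling
    cycle: tuple    # steps repeated `cycles` times
    cycles: int
    final: tuple    # steps run once after cycling

    def total_seconds(self) -> int:
        """Sum of all programmed step durations."""
        once = sum(s.seconds for s in self.initial + self.final)
        return once + self.cycles * sum(s.seconds for s in self.cycle)


# Fill-in PCR: 35 cycles with a slow ramp down to the annealing temperature.
FILL_IN = Program(
    initial=(),
    cycle=(
        Step("denature", 94, 15),
        Step("ramp down to anneal", 55, 7 * 60),
        Step("anneal", 55, 2 * 60),
        Step("extend", 72, 6 * 60),
    ),
    cycles=35,
    final=(Step("final extension", 72, 7 * 60),),
)

# Amplification PCR with the outermost forward and reverse primers.
AMPLIFY = Program(
    initial=(Step("initial denature", 94, 30),),
    cycle=(
        Step("denature", 94, 15),
        Step("anneal", 55, 30),
        Step("extend", 68, 60),
    ),
    cycles=25,
    final=(Step("final extension", 68, 3 * 60),),
)
```

On these figures the fill-in program runs for 32,445 s (about 9 hours) and the amplification program for 2,835 s (about 47 minutes), which makes clear that the slow annealing ramp dominates the total time.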
[0305] The fragments were then purified by gel electrophoresis,
digested and ligated into the relevant vector. Following
transformation of E. coli with the ligation mixture, mini-preps
were made for multiple colonies and the inserts sequenced.
Sometimes it was not possible to isolate clones with entirely
correct sequence. In those cases the errors were fixed by single or
multi site-directed mutagenesis.
[0306] Site-directed Mutagenesis
[0307] Mutagenesis was carried out using the Quikchange II
Site-directed Mutagenesis kit or Quikchange Multi Site-directed
Mutagenesis Kit (Stratagene, La Jolla Calif.), with appropriate
PAGE (polyacrylamide gel electrophoresis)-purified primers (Sigma),
according to the manufacturer's instructions.
[0308] Preparation of Constructs
[0309] The details of the constructs used to generate the codon
preference table are summarized in TABLE 12. All constructs were
made using pcDNA3 from Invitrogen and were verified by sequencing
prior to use.
TABLE 12 SUMMARY OF SECRETORY E7 CONSTRUCT SERIES 1 AND 2
Construct | AA & Codon | CU of Sec Seq | CU of E7 | E7 Protein
Control Constructs
IgkC1 | N/A | wt | wt | non-onc
IgkC2 | N/A | mc | mc | non-onc
IgkC3 | N/A | wt | wt | onc
IgkC4 | N/A | mc | mc | onc
Secretory E7 construct series 1
IgkS1-1 | Ala GCG | wt | wt with all Ala gcg | non-onc
IgkS1-2 | Ala GCA | wt | wt with all Ala gca | non-onc
IgkS1-3 | Ala GCT | wt | wt with all Ala gct | non-onc
IgkS1-4 | Ala GCC | wt | wt with all Ala gcc | non-onc
IgkS1-5 | Arg AGG | wt | wt with all Arg agg | non-onc
IgkS1-6 | Arg AGA | wt | wt with all Arg aga | non-onc
IgkS1-7 | Arg CGG | wt | wt with all Arg cgg | non-onc
IgkS1-8 | Arg CGA | wt | wt with all Arg cga | non-onc
IgkS1-9 | Arg CGT | wt | wt with all Arg cgt | non-onc
IgkS1-10 | Arg CGC | wt | wt with all Arg cgc | non-onc
IgkS1-11 | Asn AAT | wt | wt with all Asn aat | non-onc
IgkS1-12 | Asn AAC | wt | wt with all Asn aac | non-onc
IgkS1-13 | Asp GAT | wt with all Asp gat | wt with all Asp gat | non-onc
IgkS1-14 | Asp GAC | wt with all Asp gac | wt with all Asp gac | non-onc
IgkS1-15 | Cys TGT | wt | wt with all Cys tgt | non-onc
IgkS1-16 | Cys TGC | wt | wt with all Cys tgc | non-onc
IgkS1-17 | Glu GAG | wt with all Glu gag | wt with all Glu gag | non-onc
IgkS1-18 | Glu GAA | wt with all Glu gaa | wt with all Glu gaa | non-onc
IgkS1-19 | Gln CAG | wt | wt with all Gln cag | non-onc
IgkS1-20 | Gln CAA | wt | wt with all Gln caa | non-onc
IgkS1-21 | Gly GGG | wt with all Gly ggg | wt with all Gly ggg | non-onc
IgkS1-22 | Gly GGA | wt with all Gly gga | wt with all Gly gga | non-onc
IgkS1-23 | Gly GGT | wt with all Gly ggt | wt with all Gly ggt | non-onc
IgkS1-24 | Gly GGC | wt with all Gly ggc | wt with all Gly ggc | non-onc
IgkS1-25 | His CAT | wt | wt with all His cat | non-onc
IgkS1-26 | His CAC | wt | wt with all His cac | non-onc
IgkS1-27 | Ile ATA | wt | wt with all Ile ata | non-onc
IgkS1-28 | Ile ATT | wt | wt with all Ile att | non-onc
IgkS1-29 | Ile ATC | wt | wt with all Ile atc | non-onc
IgkS1-30 | Lys AAG | wt | wt with all Lys aag | non-onc
IgkS1-31 | Lys AAA | wt | wt with all Lys aaa | non-onc
IgkS1-32 | Phe TTT | wt | wt with all Phe ttt | non-onc L15F, L22F
IgkS1-33 | Phe TTC | wt | wt with all Phe ttc | non-onc L15F, L22F
IgkS1-34 | Ser AGT | wt with all Ser agt | wt with all Ser agt | non-onc
IgkS1-35 | Ser AGC | wt with all Ser agc | wt with all Ser agc | non-onc
IgkS1-36 | Ser TCG | wt with all Ser tcg | wt with all Ser tcg | non-onc
IgkS1-37 | Ser TCA | wt with all Ser tca | wt with all Ser tca | non-onc
IgkS1-38 | Ser TCT | wt with all Ser tct | wt with all Ser tct | non-onc
IgkS1-39 | Ser TCC | wt | wt with all Ser tcc | non-onc
IgkS1-40 | Thr ACG | wt with all Thr acg | wt with all Thr acg | non-onc
IgkS1-41 | Thr ACA | wt with all Thr aca | wt with all Thr aca | non-onc
IgkS1-42 | Thr ACT | wt with all Thr act | wt with all Thr act | non-onc
IgkS1-43 | Thr ACC | wt with all Thr acc | wt with all Thr acc | non-onc
IgkS1-44 | Tyr TAT | wt | wt with all Tyr tat | non-onc
IgkS1-45 | Tyr TAC | wt | wt with all Tyr tac | non-onc
IgkS1-46 | Val GTG | wt with all Val gtg | wt with all Val gtg | non-onc
IgkS1-47 | Val GTA | wt with all Val gta | wt with all Val gta | non-onc
IgkS1-48 | Val GTT | wt with all Val gtt | wt with all Val gtt | non-onc
IgkS1-49 | Val GTC | wt with all Val gtc | wt with all Val gtc | non-onc
IgkS1-50 | Leu CTG | altered with Leu ctg | altered with Leu ctg | onc
IgkS1-51 | Leu CTA | altered with Leu cta | altered with Leu cta | onc
IgkS1-52 | Leu CTT | altered with Leu ctt | altered with Leu ctt | onc
IgkS1-53 | Leu CTC | altered with Leu ctc | altered with Leu ctc | onc
IgkS1-54 | Leu TTG | altered with Leu ttg | altered with Leu ttg | onc
IgkS1-55 | Leu TTA | altered with Leu tta | altered with Leu tta | onc
IgkS1-56 | Pro CCG | altered with Pro ccg | altered with Pro ccg | onc
IgkS1-57 | Pro CCA | altered with Pro cca | altered with Pro cca | onc
IgkS1-58 | Pro CCT | altered with Pro cct | altered with Pro cct | onc
IgkS1-59 | Pro CCC | altered with Pro ccc | altered with Pro ccc | onc
Secretory E7 construct series 2
IgkS2-1 | Ala GCG | mc | mc | linkerA-onc
IgkS2-2 | Ala GCA | mc | mc | linkerA-onc
IgkS2-3 | Ala GCT | mc | mc | linkerA-onc
IgkS2-4 | Ala GCC | mc | mc | linkerA-onc
IgkS2-5 | Arg AGG | mc | mc | linkerR-onc
IgkS2-6 | Arg AGA | mc | mc | linkerR-onc
IgkS2-7 | Arg CGG | mc | mc | linkerR-onc
IgkS2-8 | Arg CGA | mc | mc | linkerR-onc
IgkS2-9 | Arg CGT | mc | mc | linkerR-onc
IgkS2-10 | Arg CGC | mc | mc | linkerR-onc
IgkS2-11 | Asn AAT | mc | mc | linkerN-onc
IgkS2-12 | Asn AAC | mc | mc | linkerN-onc
IgkS2-13 | Asp GAT | wt with all Asp gat | wt with all Asp gat | onc
IgkS2-14 | Asp GAC | wt with all Asp gac | wt with all Asp gac | onc
IgkS2-15 | Cys TGT | wt | wt with all Cys tgt | onc
IgkS2-16 | Cys TGC | wt | wt with all Cys tgc | onc
IgkS2-17 | Glu GAG | wt with all Glu gag | wt with all Glu gag | onc
IgkS2-18 | Glu GAA | wt with all Glu gaa | wt with all Glu gaa | onc
IgkS2-19 | Gln CAG | wt | wt with all Gln cag | onc
IgkS2-20 | Gln CAA | wt | wt with all Gln caa | onc
IgkS2-21 | Gly GGG | wt with all Gly ggg | wt with all Gly ggg | onc
IgkS2-22 | Gly GGA | wt with all Gly gga | wt with all Gly gga | onc
IgkS2-23 | Gly GGT | wt with all Gly ggt | wt with all Gly ggt | onc
IgkS2-24 | Gly GGC | wt with all Gly ggc | wt with all Gly ggc | onc
IgkS2-25 | His CAT | mc | mc | linkerH-onc
IgkS2-26 | His CAC | mc | mc | linkerH-onc
IgkS2-27 | Ile ATA | wt | wt with all Ile ata | onc
IgkS2-28 | Ile ATT | wt | wt with all Ile att | onc
IgkS2-29 | Ile ATC | wt | wt with all Ile atc | onc
IgkS2-30 | Lys AAG | mc | mc | linkerK-onc
IgkS2-31 | Lys AAA | mc | mc | linkerK-onc
IgkS2-32 | Phe TTT | mc | mc | linkerF-onc
IgkS2-33 | Phe TTC | mc | mc | linkerF-onc
IgkS2-34 | Ser AGT | wt with all Ser agt | wt with all Ser agt | onc
IgkS2-35 | Ser AGC | wt with all Ser agc | wt with all Ser agc | onc
IgkS2-36 | Ser TCG | wt with all Ser tcg | wt with all Ser tcg | onc
IgkS2-37 | Ser TCA | wt with all Ser tca | wt with all Ser tca | onc
IgkS2-38 | Ser TCT | wt with all Ser tct | wt with all Ser tct | onc
IgkS2-39 | Ser TCC | wt | wt with all Ser tcc | onc
IgkS2-40 | Thr ACG | wt with all Thr acg | wt with all Thr acg | onc
IgkS2-41 | Thr ACA | wt with all Thr aca | wt with all Thr aca | onc
IgkS2-42 | Thr ACT | wt with all Thr act | wt with all Thr act | onc
IgkS2-43 | Thr ACC | wt with all Thr acc | wt with all Thr acc | onc
IgkS2-44 | Tyr TAT | mc | mc | linkerY-onc
IgkS2-45 | Tyr TAC | mc | mc | linkerY-onc
IgkS2-46 | Val GTG | wt with all Val gtg | wt with all Val gtg | onc
IgkS2-47 | Val GTA | wt with all Val gta | wt with all Val gta | onc
IgkS2-48 | Val GTT | wt with all Val gtt | wt with all Val gtt | onc
IgkS2-49 | Val GTC | wt with all Val gtc | wt with all Val gtc | onc
IgkS2-11b | Asn AAT | wt | wt with all Asn aat | linkerN-non-onc
IgkS2-12b | Asn AAC | wt | wt with all Asn aac | linkerN-non-onc
AA = amino acid, CU = codon usage, mc = mammalian consensus, wt =
wild-type, onc = oncogenic, non-onc = non-oncogenic, Sec seq =
secretory sequence, N/A = not applicable
[0310] Control Constructs
[0311] Control E7 constructs were based on those from Liu et al.
(2002). Both oncogenic (i.e. wild-type) and non-oncogenic E7
control constructs were made with wild-type or mammalian consensus
codon usage. "Non-oncogenic" E7 is E7 with D21G, C24G, E26G
mutations, i.e. with mutations that have been reported to render E7
non-transforming (Edmonds and Vousden, 1989; Heck et al., 1992).
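The substitution shorthand used here (D21G denotes Asp at position 21 replaced by Gly) can be illustrated with a minimal Python sketch. The helper `apply_mutations` and the six-residue fragment `DLYCYE`, taken here to represent E7 residues 21 to 26, are assumptions made for illustration only and are not part of the disclosure; the fragment should be checked against the reference E7 sequence before any use.

```python
import re


def apply_mutations(seq, mutations, first_residue=1):
    """Apply substitutions written as <wild-type><position><new>, e.g. 'D21G'.

    first_residue is the protein position of seq[0], so a fragment can be
    supplied instead of the full-length sequence.
    """
    residues = list(seq)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"unrecognised mutation: {mutation}")
        wild_type = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        index = position - first_residue
        if not 0 <= index < len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        # Refuse to mutate if the stated wild-type residue is not present.
        if residues[index] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}")
        residues[index] = replacement
    return "".join(residues)


# Hypothetical fragment assumed to span E7 residues 21-26.
E7_21_26 = "DLYCYE"
non_oncogenic = apply_mutations(E7_21_26, ["D21G", "C24G", "E26G"], first_residue=21)
print(non_oncogenic)
```

The wild-type check makes the sketch fail loudly if the numbering and the supplied sequence disagree, which is the usual source of error when mutations are transcribed by hand.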
[0312] The secretory sequence was derived from Mus musculus IgK RNA
for the anti-HLA-DR antibody light chain (GenBank accession number
D84070). For some constructs the codon usage of this sequence was
modified.
[0313] Wild-type Codon Usage Control Constructs:
[0314] The wild-type (wt) codon usage E7 construct from Liu et al.
was used as the template in a site-directed mutagenesis PCR to make
the wt codon usage non-oncogenic E7 construct.
[0315] The non-oncogenic and oncogenic wild-type codon usage E7
sequences were amplified to incorporate a 5' BamHI site and a 3'
EcoRI site. The resulting fragments were cloned into BamHI and
EcoRI cut pcDNA3 and sequenced. The secretory fragment was made by
whole gene synthesis using wild-type codon usage with flanking KpnI
and BamHI sites. The Kozak-secretory fragments were then ligated
into KpnI/BamHI cut pcDNA3-wtE7 (non-oncogenic or oncogenic) to
make pcDNA3-Igk-nE7 and pcDNA3-Igk-E7 (named IgkC1 and IgkC3
respectively; see TABLE 12). The identity of the constructs was
confirmed by sequencing.
[0316] Mammalian Consensus (mc) Codon Usage Control Constructs:
[0317] As there were errors in the original mammalian consensus
(mc) E7 construct (L28F, Q70R and an E35 deletion; Liu et al.,
2002) it was not used. A mc non-oncogenic E7 control construct was
synthesized by whole gene synthesis. A mc oncogenic E7 (i.e.,
wild-type E7) control construct was subsequently made from the mc
non-oncogenic E7 construct by single site-directed mutagenesis.
[0318] Secretory mc oncogenic and non-oncogenic constructs were
made by amplifying the mc E7 sequence with a forward primer that
introduced a BamHI site and a reverse primer that incorporated an
EcoRI site. The resulting E7 fragment was cloned into the
respective sites in pcDNA3 and sequenced. A mc secretory sequence
flanked by KpnI and BamHI sites, 5' and 3' respectively, was
synthesised and ligated into the KpnI and BamHI sites of
pcDNA3-mcE7 (oncogenic or non-oncogenic) to make pcDNA3-mcIgk-mcnE7
and pcDNA3-mcIgk-mcE7 (named IgkC2 and IgkC4 respectively; see
TABLE 12). The identity of the constructs was confirmed by
sequencing.
[0319] Secreted Non-oncogenic E7 Constructs with Predominantly
Wild-type Codon Usage, Modified for Individual Codons
[0320] Plasmids encoding a non-oncogenic form of E7 were made for
all of the codons, with the exception of the Pro and Leu codons,
stop codons and codons for non-degenerate amino acids. As Phe
occurs just once in the E7 sequence, the codons for two Leu
residues, L15 and L22, were mutated to Phe codons. A combination of
techniques was used to make these constructs. When few mutations
were required, single or multi site-directed mutagenesis of a
control construct encoding non-oncogenic E7 was performed (details
of the control construct are given above under "Control
Constructs"). When more extensive modifications were required, whole
gene synthesis was employed. Regardless of the method used, these
constructs all include an E7-encoding sequence with identical
upstream and downstream sequences cloned into the KpnI and EcoRI
sites of pcDNA3. These constructs were then modified to include a
secretory sequence, as described below.
[0321] First, using the whole gene synthesis method, DNA fragments
that included a secretory sequence flanked by KpnI and BamHI sites
were synthesized. For some constructs the amino acid of interest
occurred in the secretory sequence so individual modified secretory
sequence fragments were made. For constructs for amino acids that
did not occur in the secretory sequence, wild-type secretory
sequence was used. These fragments were digested with KpnI and
BamHI. Then, using the relevant nE7 construct as a template and a
standard PCR protocol, a BamHI site was introduced at the 5' end of
the E7 sequence. The 3' EcoRI site was retained. The resulting E7
fragments were cut with BamHI and EcoRI, purified, and ligated into
pcDNA3. Following sequencing, the plasmids were cut with KpnI and
BamHI and ligated with the relevant KpnI/BamHI secretory sequences.
The sequences of the constructs were then confirmed. Constructs
IgkS1-1 to IgkS1-49 were made in this way (see TABLE 12 and FIGS. 1
to 11, 13 and 15 to 17 for sequence comparisons).
[0322] Secreted E7 Constructs with Individual Pro or Leu Codons
Modified
[0323] E7 DNA sequences in which the Pro or Leu codons were
individually modified were designed. The rest of the codon usage
for these E7 DNAs was the same for all of the Pro and Leu
constructs but differed from the wild-type or mammalian consensus
codon usage. [Note that this codon usage was based on our
preliminary data from immunizing mice with the GFP constructs.]
[0324] The Pro/LeuE7 DNA fragments, flanked by HindIII and BamHI
sites, were made by whole gene synthesis and cloned into the
HindIII and BamHI sites of pcDNA3. Using these constructs as
templates, a KpnI site was incorporated upstream and an EcoRI site
downstream, of the Pro/Leu E7 sequences by standard PCR methods.
The resulting fragments were cut with KpnI and EcoRI and cloned
into pcDNA3. These constructs were then used to make the secreted
E7 constructs with Pro or Leu codon modifications.
[0325] Firstly, using the whole gene synthesis method, DNA
fragments that included a secretory sequence flanked by KpnI and
BamHI sites were synthesized. As Pro and Leu occur in the secretory
sequence, individually modified secretory sequence fragments were
made for the different constructs. These fragments were digested
with KpnI and BamHI. Then, using the relevant Pro or Leu E7
construct as a template and a standard PCR protocol, a BamHI site
was introduced at the 5' end of the E7 sequence. The 3' EcoRI site
was retained. The resulting fragments were cut with BamHI and
EcoRI, purified, and ligated into pcDNA3. Following sequencing, the
plasmids were cut with KpnI and BamHI and ligated with the relevant
KpnI/BamHI secretory sequences. The resulting constructs were
sequenced and are denoted IgkS1-50 to IgkS1-59 (see TABLE 12 and
FIGS. 12 and 14 for sequence comparisons).
[0326] Secreted E7 Constructs with Predominantly Wild-type Codon
Usage, Modified for Individual Codons
[0327] Constructs encoding a secreted form of oncogenic E7 (i.e.
wild-type E7 protein) were made by site-directed mutagenesis of the
plasmids encoding a secreted form of non-oncogenic E7. This was
done for constructs for codons for the following amino acids: Asp,
Cys, Glu, Gln, Gly, Ile, Ser, Thr and Val.
[0328] Site-directed mutagenesis was carried out using the
QuikChange II Site-Directed Mutagenesis kit (Stratagene, La Jolla,
Calif.) and appropriate PAGE (polyacrylamide gel
electrophoresis)-purified primers (Sigma) according to the
manufacturer's instructions. The pcDNA-kIgkX-nE7X series of
constructs were used as templates for the mutagenesis (i.e.
constructs IgkS1-13 to 24, IgkS1-27 to 29, IgkS1-34 to 43 and
IgkS1-46 to 49). The primers introduced the desired G21D, G24C,
G26E mutations.
[0329] The resulting constructs, IgkS2-13 to 24, IgkS2-27 to 29,
IgkS2-34 to 43 and IgkS2-46 to 49 (see Table 8, SEQ ID NOs: 1 to
29), have wild-type codon usage for the Igk secretory sequence and
E7 sequence with the exception that the codons for the relevant
amino acid were changed, and they encode oncogenic E7.
[0330] Linker Constructs
[0331] Constructs encoding the N-terminal Igk secretory sequence
followed by a linker sequence (XXGXGXX, where X is the relevant
amino acid for a particular construct and G is glycine) and the E7
protein were made for each of the following amino acids: Asn, Ala,
Lys, Arg, Phe, His and Tyr.
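The XXGXGXX linker pattern can be sketched in Python as follows. The function names are hypothetical, and the glycine codon used for the two G positions (`GGC` by default) is an assumption, since the text does not state which glycine codon was used in the linkers.

```python
def linker_peptide(residue):
    """Return the XXGXGXX linker for a one-letter amino acid code X."""
    return f"{residue}{residue}G{residue}G{residue}{residue}"


def linker_dna(test_codon, gly_codon="GGC"):
    """Return a coding sequence for XXGXGXX in which every X uses test_codon.

    gly_codon is an assumed placeholder; the source does not specify it.
    """
    pattern = [test_codon, test_codon, gly_codon, test_codon,
               gly_codon, test_codon, test_codon]
    return "".join(pattern)


# Example: the Asn AAT linker construct.
print(linker_peptide("N"))
print(linker_dna("AAT"))
```

Each linker therefore presents five copies of the codon under test, separated by two glycine codons.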
[0332] Fragments consisting of the Igk secretory sequence (with
mammalian consensus codon usage) and the linker sequences were made
by PCR using Taq polymerase and standard cycling conditions, as
recommended by the manufacturer.
[0333] The fragments were amplified from pcDNA3-kmcIgk-mcE7 using a
common forward primer
(5'-TTGAATAGGTACCGCCGCCACCATGGAGACCGACACCCTCC-3'; SEQ ID NO: 90)
that annealed to the KpnI site, the Kozak sequence and the
beginning of the Igk secretory sequence. The reverse primers were
different for each linker construct; each annealed to the end of the
Igk secretory sequence (with mammalian consensus codon usage) and
introduced new sequence that encoded the relevant linker sequence
and a 3' BamHI site.
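The layout of the common forward primer (SEQ ID NO: 90) can be inspected with a short Python sketch. The primer sequence is taken from the text; the KpnI recognition sequence GGTACC is standard, while the choice of `GCCGCCACC` as the Kozak stretch immediately upstream of the ATG, and the function name `annotate_primer`, are assumptions made for this illustration.

```python
FORWARD_PRIMER = "TTGAATAGGTACCGCCGCCACCATGGAGACCGACACCCTCC"  # SEQ ID NO: 90
KPNI_SITE = "GGTACC"     # KpnI recognition sequence
KOZAK_STRETCH = "GCCGCCACC"  # assumed Kozak stretch directly upstream of ATG


def annotate_primer(primer):
    """Locate the KpnI site, the Kozak stretch and the start codon in a primer."""
    kpni_index = primer.find(KPNI_SITE)
    kozak_index = primer.find(KOZAK_STRETCH + "ATG")
    if kpni_index < 0 or kozak_index < 0:
        raise ValueError("expected features not found in primer")
    atg_index = kozak_index + len(KOZAK_STRETCH)
    coding = primer[atg_index:]
    # Split the coding part into complete codons, ignoring a trailing partial codon.
    full_length = len(coding) - len(coding) % 3
    codons = [coding[i:i + 3] for i in range(0, full_length, 3)]
    return {
        "leading_bases": primer[:kpni_index],  # extra 5' bases ahead of the site
        "kpni_index": kpni_index,
        "kozak_index": kozak_index,
        "atg_index": atg_index,
        "codons": codons,
    }


features = annotate_primer(FORWARD_PRIMER)
print(features)
```

All indices are 0-based positions within the primer string.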
[0334] The fragments were digested with KpnI/BamHI and were ligated
into KpnI/BamHI-cut pcDNA3-mcIgk-mcE7 (i.e. the Kozak sequence and
secretory sequence had been removed from the plasmid by digestion)
to make pcDNA3-mcIgk-linkerX-mcE7 (i.e., IgkS2-1 to 12, IgkS2-25
and 26, IgkS2-30 to 33 and IgkS2-44 and 45 as illustrated in Table
8, SEQ ID NOs: 30 to 49).
[0335] For Asn the fragments were also ligated into KpnI/BamHI-cut
pcDNA3-Igk-nE7Asn1/2 (i.e. IgkS1-11 and 12) to make
pcDNA3-mcIgk-linkerN1/2-nE7Asn1/2 (i.e., IgkS2-11b and IgkS2-12b,
see Table 12).
E7 Protein Expression
[0336] Cell Culture
[0337] CHO cells were cultured in DMEM (GIBCO from Invitrogen)
containing 10% foetal bovine serum (FBS) (DKSH), penicillin,
streptomycin and glutamine (GIBCO from Invitrogen) at 37.degree. C.
and 5% CO.sub.2. Cells were plated into 6-well plates at
3.times.10.sup.5/well, 24 hours prior to transfection. For each
transfection, 2 .mu.g of DNA was mixed with 50 .mu.L OptiMEM (GIBCO from
Invitrogen) and 4 .mu.L Plus reagent (Invitrogen) and incubated at
room temperature (RT) for 30 min. Lipofectamine (Invitrogen; 5
.mu.L in 50 .mu.L OptiMEM) was added and the complexes incubated at
RT for 30 min. The cells were rinsed with OptiMEM, 2 mL OptiMEM
were added to each well, and the complexes then added. The cells
were incubated overnight at 37.degree. C. and 5% CO.sub.2. The
following morning the complexes were removed and 2 mL of fresh DMEM
containing 2% FBS added to each well.
[0338] Cell pellets and supernatants were collected about 40 h
after transfection. The cell pellets were resuspended in lysis
buffer (0.1% NP-40, 2 .mu.g/mL Aprotinin, 1 .mu.g/mL Leupeptin and
2 mM PMSF in PBS).
[0339] Transfections were carried out in duplicate and repeated.
Control transfections, with empty vector (pcDNA3), were also
carried out.
[0340] Western Blotting
[0341] Western blots of the CHO cell supernatants or lysates were
carried out according to standard protocols. Briefly, this involved
firstly separating the samples by polyacrylamide gel
electrophoresis (PAGE). For cell lysates, 30 .mu.g of total protein
were loaded for each sample. For supernatants, 30 .mu.L of each was
loaded. The protein samples were boiled with SDS-PAGE loading
buffer for 10 min before loading onto 12% SDS-PAGE gels and the
gels were run at 150-200V for approximately 1 h.
[0342] The separated proteins were then transferred from the gels
to PVDF membrane (100V for 1 h). The membranes were blocked with 5%
skim milk (in PBS/0.05% Tween 20 (PBS-T)) for 1 h at room
temperature and were then incubated with the primary antibody,
HPV-16 E7 Mouse Monoclonal Antibody (Zymed Laboratories) at a
dilution of 1:1000 in 5% skim milk (in PBS-T) overnight at
4.degree. C. Following washing of the membrane in PBS-T (3.times.10
min), secondary antibody, anti-mouse IgG (Sigma) in 5% skim milk,
was added and the membrane incubated at room temperature for 4 h.
The membranes were washed as before, incubated in a mixture
containing equal volumes of solution A (4.425 mL water, 50 .mu.L
luminol, 22 .mu.L p-coumaric acid and 500 .mu.L 1M Tris pH 8.5) and
solution B (4.5 mL water, 3 .mu.L 30% H.sub.2O.sub.2 and 500 .mu.L
1M Tris pH 8.5) for 1 min, and then dried and wrapped in plastic
wrap. Film was exposed to the blots for various times (1 min, 3 min
or 10 min) and the film then developed.
Gene Gun Immunization Protocols
[0343] Plasmid Purification
[0344] All plasmids used for vaccination were grown in the
Escherichia coli strain DH5.alpha. and purified using the
NucleoBond Maxi Kit (Macherey-Nagel). DNA concentration was
quantitated spectrophotometrically at 260 nm.
[0345] Preparation of DNA/Gold Cartridges
[0346] Coating of gold particles with plasmid DNA was performed as
described in the Bio-Rad Helios Gene Gun System instruction manual
using a microcarrier loading quantity (MLQ) of 0.5 mg
gold/cartridge and a DNA loading ratio of 2 .mu.g DNA/mg gold. This
resulted in 1 .mu.g of DNA per prepared cartridge. In brief, 50
.mu.L of 0.05M spermidine (Sigma) was added to 25 mg of 1.0 .mu.m gold
particles (Bio-Rad) and the spermidine/gold was sonicated for 3
seconds. 50 .mu.g of plasmid DNA was then added, followed by the
dropwise addition of 100 .mu.L 1M CaCl.sub.2 while vortexing. The
mixture was allowed to precipitate at room temperature for 10 min,
then centrifuged to pellet the DNA/gold. The pellet was washed
three times with HPLC grade ethanol (Scharlau), before resuspension
in HPLC grade ethanol containing 0.5 mg/mL of polyvinylpyrrolidone
(PVP) (Bio-Rad). The gold/plasmid suspension was then coated onto
Tefzel tubing and 0.5 inch cartridges prepared.
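The loading arithmetic in this paragraph can be checked with a small Python sketch. The quantities (25 mg gold, 50 .mu.g DNA, MLQ of 0.5 mg gold/cartridge) come from the text; the function name and the returned field names are hypothetical.

```python
def cartridge_loading(gold_mg, dna_ug, mlq_mg_per_cartridge):
    """Derive the DNA loading ratio, cartridge count and DNA per cartridge."""
    dna_loading_ratio = dna_ug / gold_mg            # ug DNA per mg gold
    cartridges = gold_mg / mlq_mg_per_cartridge     # cartridges per preparation
    dna_per_cartridge = dna_loading_ratio * mlq_mg_per_cartridge  # ug DNA
    return {
        "dna_loading_ratio_ug_per_mg": dna_loading_ratio,
        "cartridges": cartridges,
        "dna_per_cartridge_ug": dna_per_cartridge,
    }


prep = cartridge_loading(gold_mg=25, dna_ug=50, mlq_mg_per_cartridge=0.5)
print(prep)
```

With these inputs the sketch reproduces the stated loading ratio of 2 .mu.g DNA/mg gold and 1 .mu.g DNA per cartridge, and indicates that one preparation nominally corresponds to 50 cartridges, ignoring any losses during coating.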
[0347] Gene Gun Immunization of Mice
[0348] Groups of 8 female C57BL/6J mice (6-8 weeks old) (ARC, WA or
Monash Animal Services, VIC) were immunized on Day 0, Day 21, Day 42
and Day 63 with the relevant DNA. The day before each immunization
the abdomen of each mouse was shaved and depilatory cream (Nair)
applied for 1 minute. DNA was delivered with the Helios gene gun
(Bio-Rad) using a pressure of 400 psi. Mice were given 2 shots on
either side of the abdomen, with 1 .mu.g of DNA delivered per shot.
Serum was collected via intra-ocular bleed 2 days prior to initial
immunization and 2 weeks after each subsequent immunization (Day -2,
Day 35, Day 56 and Day 77).
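The relationship between the immunization days and the bleed days can be laid out in a short Python sketch. The immunization days and the offsets (2 days before the first immunization, 2 weeks after each later one) are taken from the text; the function name and parameter names are hypothetical, and the pre-immunization bleed is expressed here as day -2 relative to Day 0.

```python
IMMUNIZATION_DAYS = [0, 21, 42, 63]


def bleed_days(immunization_days, pre_bleed_offset=-2, post_bleed_offset=14):
    """Return bleed days: one before the first immunization, then one
    post_bleed_offset days after each subsequent immunization."""
    first, *subsequent = immunization_days
    return [first + pre_bleed_offset] + [day + post_bleed_offset for day in subsequent]


print(bleed_days(IMMUNIZATION_DAYS))
```

Under this reading, no bleed is scheduled 2 weeks after the Day 0 immunization; the first post-immunization sample follows the Day 21 boost.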
[0349] ELISA to Measure E7 Immune Response
[0350] Nine peptides spanning the full-length of HPV16E7 (Frazer et
al., 1995) were used to measure the E7 antibody response. The
peptides were synthesised and purified to >70% purity by Auspep
(Melbourne). Peptides GF101 to 106 and GF108 to 109 described in
Frazer et al. were made. Note that instead of GF107, GF107a
(HYNIVTFCCKCDSTLRL) was used.
[0351] GF102 D13G, GF103 D5G/C8G/E10G and GF104 E2G peptides, named
GF102n, GF103n and GF104n respectively, were also synthesised.
These peptides were used for the ELISA when measuring antibodies to
non-oncogenic E7, i.e. these peptides incorporate the mutations that
were made to render the E7 protein non-oncogenic.
[0352] Microtiter plates were coated overnight with 50 .mu.L of 10
.mu.g/mL E7 peptide per well. After coating, microtiter plates
(Maxisorp, Nunc) were washed two times with PBS/0.05% Tween 20
(PBS-T) and then blocked for two hours at 37.degree. C. with 100
.mu.L of 5% skim milk powder in PBS-T. After blocking, plates were
washed three times with PBS-T and 50 .mu.L of mouse sera at a
dilution of 1 in 100 was added for 2 hours at 37.degree. C. All
serum was assayed in duplicate wells. Plates were then washed three
times with PBS-T and 50 .mu.L of sheep anti-mouse IgG horseradish
peroxidase conjugate (Sigma) was added at a 1 in 1000 dilution.
After 1 hour plates were washed and 50 .mu.L of OPD substrate was
added. After 30 min, 25 .mu.L of 2.5 M HCl was added and absorbance
was measured at 490 nm in a Multiskan EX plate reader (Pathtech).
The following controls were included: control primary antibody
as a positive control, and secondary antibody only and day 0
serum/serum from unimmunized mice as negative controls.
[0353] The immune response preferences of codons determined from
these experiments are tabulated in TABLE 1.
Example 2
Construction of Codon Modified Influenza A Virus (H5N1) HA DNA for
Conferring an Enhanced Immune Response to H5N1 HA
[0354] The wild-type nucleotide sequence of the influenza A virus,
HA gene for hemagglutinin (A/Hong Kong/213/03(H5N1), MDCK isolate,
embryonated chicken egg isolate) is shown in SEQ ID NO: 50 and
encodes the amino acid sequence shown in SEQ ID NO: 51. Several
codons within that sequence were mutated using the method described
in Example 1. Specifically, the method involved replacing codons of
the wild type nucleotide sequence with corresponding synonymous
codons having higher immune response preferences than the codons
they replaced, as represented in Table 1. An illustrative codon
modified nucleotide sequence comprising high immune response
preference codons is shown in SEQ ID NO: 52.
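The replacement rule described above (swap a codon for a synonymous codon only when the synonym has a higher immune response preference) can be sketched in Python. The preference scores below are invented placeholders and do not reproduce Table 1; the function name `raise_preference` and the two codon families shown are likewise chosen only for illustration.

```python
# Two synonymous codon families (standard genetic code), for illustration.
FAMILIES = [
    ("GGG", "GGA", "GGT", "GGC"),  # Gly
    ("ACG", "ACA", "ACT", "ACC"),  # Thr
]

# Hypothetical preference scores; NOT the values of Table 1.
PLACEHOLDER_PREFERENCE = {"GGA": 3, "GGG": 1, "ACG": 2}


def raise_preference(seq, families, preference):
    """Replace each codon with its best-scoring synonym if that synonym
    scores strictly higher; otherwise leave the codon unchanged."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    family_of = {codon: family for family in families for codon in family}
    out = []
    for i in range(0, len(seq), 3):
        codon = seq[i:i + 3]
        family = family_of.get(codon)
        if family is not None:
            best = max(family, key=lambda c: preference.get(c, 0))
            if preference.get(best, 0) > preference.get(codon, 0):
                codon = best
        out.append(codon)
    return "".join(out)


print(raise_preference("ATGGGGACCGGA", FAMILIES, PLACEHOLDER_PREFERENCE))
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged; codons outside the listed families, such as the ATG start codon, pass through untouched.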
Example 3
Construction of Codon Modified Influenza A Virus (H3N1) HA
DNA for Conferring an Enhanced Immune Response to H3N1 HA
[0355] The wild-type nucleotide sequence of the influenza A virus,
HA gene for hemagglutinin (A/swine/Korea/PZ72-1/2006(H3N1)) is
shown in SEQ ID NO: 53 and encodes the amino acid sequence shown in
SEQ ID NO: 54. Several codons within that sequence were mutated
using the method described in Example 1. Specifically, the method
involved replacing codons
of the wild type nucleotide sequence with corresponding synonymous
codons having higher immune response preferences than the codons
they replaced, as represented in Table 1. An illustrative codon
modified nucleotide sequence comprising high immune response
preference codons is shown in SEQ ID NO: 55.
Example 4
Construction of Codon Modified Influenza A Virus (H5N1)
NA DNA for Conferring an Enhanced Immune Response to H5N1 NA
[0356] The wild-type nucleotide sequence of the influenza A virus,
NA gene for neuraminidase (A/Hong Kong/213/03(H5N1), NA gene
neuraminidase, MDCK isolate, embryonated chicken egg isolate) is
shown in SEQ ID NO: 56 and encodes the amino acid sequence shown in
SEQ ID NO: 57. Several codons within that sequence were mutated
using the method described in Example 1. Specifically, the method
involved replacing codons of the wild type nucleotide sequence with
corresponding synonymous codons having higher immune response
preferences than the codons they replaced, as represented in Table
1. An illustrative codon modified nucleotide sequence comprising
high immune response preference codons is shown in SEQ ID NO:
58.
Example 5
Construction of Codon Modified Influenza A Virus (H3N1)
NA DNA for Conferring an Enhanced Immune Response to H3N1 NA
[0357] The wild-type nucleotide sequence of the influenza A virus,
NA gene for neuraminidase (A/swine/MI/PU243/04(H3N1)) is shown in
SEQ ID NO: 59 and encodes the amino acid sequence shown in SEQ ID
NO: 60. Several codons within that sequence were mutated using the
method described in Example 1. Specifically, the method involved
replacing codons of the wild type nucleotide sequence with
corresponding synonymous codons having higher immune response
preferences than the codons they replaced, as represented in Table
1. An illustrative codon modified nucleotide sequence comprising
high immune response preference codons is shown in SEQ ID NO:
61.
Example 6
Construction of Codon Modified Hepatitis C Virus E1 (1AH77) DNA for
Conferring an Enhanced Immune Response to HCV E1 (1AH77)
[0358] The wild-type nucleotide sequence of the hepatitis C Virus
E1, (serotype 1A, isolate H77, from polyprotein nucleotide sequence
AF009606) is shown in SEQ ID NO: 62 and encodes the amino acid
sequence (NP_751920) shown in SEQ ID NO: 63. Several codons within
that sequence were mutated using the method described in Example 1.
Specifically, the method involved replacing codons of the wild type
nucleotide sequence with corresponding synonymous codons having
higher immune response preferences than the codons they replaced,
as represented in Table 1. An illustrative codon modified
nucleotide sequence comprising high immune response preference
codons is shown in SEQ ID NO: 64.
Example 7
Construction of Codon Modified Hepatitis C Virus E2 (1AH77) DNA for
Conferring an Enhanced Immune Response to HCV E2 (1AH77)
[0359] The wild-type nucleotide sequence of the hepatitis C Virus
E2, (serotype 1A, isolate H77, from polyprotein nucleotide sequence
AF009606) is shown in SEQ ID NO: 65 and encodes the amino acid
sequence (NP_751921) shown in SEQ ID NO: 66. Several codons within
that sequence were mutated using the method described in Example 1.
Specifically, the method involved replacing codons of the wild type
nucleotide sequence with corresponding synonymous codons having
higher immune response preferences than the codons they replaced,
as represented in Table 1. An illustrative codon modified
nucleotide sequence comprising high immune response preference
codons is shown in SEQ ID NO: 67.
Example 8
Construction of Codon Modified Epstein-Barr Virus Type 1 GP350 DNA
for Conferring an Enhanced Immune Response to EBV Type 1 GP350
[0360] The wild-type nucleotide sequence of the Epstein-Barr
virus, EBV type 1 gp350 (Gene BLLF1, strand 77142-79865) is shown
in SEQ ID NO: 68 and encodes the amino acid sequence (CAD53417) shown
in SEQ ID NO: 69. Several codons within that sequence were mutated
using the method described in Example 1. Specifically, the method
involved replacing codons of the wild type nucleotide sequence with
corresponding synonymous codons having higher immune response
preferences than the codons they replaced, as represented in Table
1. An illustrative codon modified nucleotide sequence comprising
high immune response preference codons is shown in SEQ ID NO:
70.
Example 9
Construction of Codon Modified Epstein-Barr Virus Type 2 GP350 DNA
for Conferring an Enhanced Immune Response to EBV Type 2 GP350
[0361] The wild-type nucleotide sequence of the Epstein-Barr
virus, EBV type 2 gp350 (Gene BLLF1, strand 77267-29936) is shown
in SEQ ID NO: 71 and encodes the amino acid sequence (YP_001129462)
shown in SEQ ID NO: 72. Several codons within that sequence were
mutated using the method described in Example 1. Specifically, the
method involved replacing codons of the wild type nucleotide
sequence with corresponding synonymous codons having higher immune
response preferences than the codons they replaced, as represented
in Table 1. An illustrative codon modified nucleotide sequence
comprising high immune response preference codons is shown in SEQ
ID NO: 73.
Example 10
Construction of Codon Modified Herpes Simplex Virus 2 Glycoprotein
B DNA for Conferring an Enhanced Immune Response to HSV-2
Glycoprotein B
[0362] The wild-type nucleotide sequence of the Herpes Simplex
virus 2, glycoprotein B strain HG52 (genome strain NC_001798) is
shown in SEQ ID NO: 74 and encodes the amino acid sequence
(CAB06752) shown in SEQ ID NO: 75. Several codons within that
sequence were mutated using the method described in Example 1.
Specifically, the method involved replacing codons of the wild type
nucleotide sequence with corresponding synonymous codons having
higher immune response preferences than the codons they replaced,
as represented in Table 1. An illustrative codon modified
nucleotide sequence comprising high immune response preference
codons is shown in SEQ ID NO: 76.
Example 11
Construction of Codon Modified Herpes Simplex Virus 2 Glycoprotein
D DNA for Conferring an Enhanced Immune Response to HSV-2
Glycoprotein D
[0363] The wild-type nucleotide sequence of the Herpes Simplex
virus 2, glycoprotein D strain HG52 (genome strain NC_001798) is
shown in SEQ ID NO: 77 and encodes the amino acid sequence
(NP_044536) shown in SEQ ID NO: 78. Several codons within that sequence
were mutated using the method described in Example 1. Specifically,
the method involved replacing codons of the wild type nucleotide
sequence with corresponding synonymous codons having higher immune
response preferences than the codons they replaced, as represented
in Table 1. An illustrative codon modified nucleotide sequence
comprising high immune response preference codons is shown in SEQ
ID NO: 79.
Example 12
Optimised E7 and HSV-2 Constructs
Design and Synthesis of Optimal and Least Optimal E7 Constructs
[0364] One de-optimized (W) and three optimized (O1-O3) E7
constructs were designed and made using the codon preferences
summarized in Table 1 ("the Immune Coricode table"). The least
favourable codons were used for construct W. For the first
optimized construct, O1, whose sequence is shown in SEQ ID NO: 81,
all of the codons were changed to those determined to be
optimal. O2, whose sequence is shown in SEQ ID NO: 82, is an
alternative optimized construct which involved changing all Ala to
GCT; Arg CGG and AGG to CGA and AGA, respectively; Glu to GAA; Gly
to GGA; Ile to ATC; all Leu to CTG; Phe to TTT; Pro to CCT or CCC;
Ser to TCG; Thr to ACG; and all Val except GTG to GTC. The O2
modifications avoided, with the exception of Leu and Ile, changing
codons to mammalian consensus-preferred codons. For O3, whose
sequence is shown in SEQ ID NO: 83, only certain amino acids for
which particularly distinct differences were observed between
codons, and for which the optimal codon(s) was not also a mammalian
consensus preferred codon, were modified. In particular, in O3 all
non-preferred Gly, Leu, Pro, Ser and Thr codons were changed to
GGA, CTC, CCT, TCG and ACG, respectively, and where a preferred
codon was already used it was not altered. Codons for other amino
acids in O3 were not modified.
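The O2 substitution rules listed above can be expressed as a codon-to-codon map and applied to a coding sequence, as in the following Python sketch. The function names are hypothetical. One point is an interpretive assumption rather than something stated in the text: "Pro to CCT or CCC" is read here as leaving existing CCT and CCC codons alone and changing CCG and CCA to CCT.

```python
# Synonymous families (standard genetic code) that O2 converts wholesale.
WHOLE_FAMILY_TARGETS = {
    "GCT": ["GCG", "GCA", "GCT", "GCC"],                # Ala
    "GAA": ["GAG", "GAA"],                              # Glu
    "GGA": ["GGG", "GGA", "GGT", "GGC"],                # Gly
    "ATC": ["ATA", "ATT", "ATC"],                       # Ile
    "CTG": ["CTG", "CTA", "CTT", "CTC", "TTG", "TTA"],  # Leu
    "TTT": ["TTT", "TTC"],                              # Phe
    "TCG": ["AGT", "AGC", "TCG", "TCA", "TCT", "TCC"],  # Ser
    "ACG": ["ACG", "ACA", "ACT", "ACC"],                # Thr
}


def build_o2_map():
    """Build the codon substitution map for the O2 rules."""
    codon_map = {}
    for target, family in WHOLE_FAMILY_TARGETS.items():
        for codon in family:
            codon_map[codon] = target
    # Arg: only CGG and AGG are changed.
    codon_map["CGG"] = "CGA"
    codon_map["AGG"] = "AGA"
    # Val: every codon except GTG goes to GTC (GTC itself is unchanged).
    codon_map["GTA"] = "GTC"
    codon_map["GTT"] = "GTC"
    # Pro: ASSUMPTION, CCG and CCA go to CCT; CCT and CCC are left as is.
    codon_map["CCG"] = "CCT"
    codon_map["CCA"] = "CCT"
    return codon_map


def recode(seq, codon_map):
    """Apply a codon substitution map to an in-frame coding sequence."""
    seq = seq.upper()
    if len(seq) % 3:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(codon_map.get(seq[i:i + 3], seq[i:i + 3])
                   for i in range(0, len(seq), 3))


print(recode("ATGGCCCTTGTGGTATAA", build_o2_map()))
```

Codons that no rule mentions, including the start and stop codons, are passed through unchanged, so the encoded protein is the same before and after recoding.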
Humoral and Cellular Responses to Biolistic Immunization with the
Optimal and Least Optimal E7 Constructs
[0365] As may be seen in FIG. 18 (a) all three optimized constructs
(O1 to O3) gave rise to significantly larger antibody responses
than the wild-type construct as measured by both the peptide ELISA
and a GST-E7 protein ELISA. The amplitudes of the response were not
statistically different between the three optimized constructs. The
de-optimized construct, W, whose sequence is shown in SEQ ID NO:
84, gave a very low antibody response that appeared slightly lower
than, but was not statistically different from, that of the wild-type (wt) codon usage
(CU) construct, whose sequence is shown in SEQ ID NO: 80. From the
IFN-.gamma. ELISPOT experiments, a representative example of which
is shown in FIG. 18, it appears that the codon preferences for
maximizing the antibody response are similar to those required for
maximising the T cell response: the de-optimized construct W failed
to give a measurable response in the IFN-.gamma. ELISPOT assay and
two of the optimized constructs (O2 and O3) gave statistically
significantly larger responses than the wild-type CU construct.
Over the three repeats the responses to O2 and O3 were not
statistically different from each other. Unexpectedly, and in
contrast to the antibody trend, in two of the three repeat
experiments O1 gave a similar cellular response to the wt CU
construct, which was less than that achieved by the O2 or O3
constructs.
Humoral and Cellular Responses to Immunization by Intradermal
Injection with the Optimal and Least Optimal E7 Constructs
[0366] The humoral and cellular responses of mice to the optimized,
wild-type CU and de-optimized constructs delivered by intradermal
injection were also measured and the results are summarized in FIG.
19. In general, similar trends were observed for intradermal
injection as for biolistic delivery.
[0367] From the E7 protein ELISA, it is apparent that the three
optimized constructs, O1-O3, were all significantly better at
generating antibodies than the wild-type construct and that the
de-optimized construct gave a very low antibody response similar to
wild-type. The optimized constructs all gave rise to significantly
more spots in the IFN-.gamma. ELISPOT than the wild-type construct
and the de-optimized construct failed to give rise to a measurable
response.
[0368] The amplitudes of the antibody responses to gene gun
immunization were larger than those to the intradermally (ID)
delivered vaccines, despite the ID immunization delivering more
than five times the dose.
Design and Synthesis of Optimal and Least Optimal
HSV-2 Constructs
[0369] Three optimized constructs (O1-O3; whose sequences are shown in SEQ ID
NOs: 86-88, respectively) and a de-optimized construct (W; whose
sequence is shown in SEQ ID NO: 89) encoding full-length
glycoprotein D from Herpes Simplex Virus 2 (gD2) were prepared. A
control construct pcDNA3-gD2 with wt CU was also made. Wild-type
CU, whose sequence is shown in SEQ ID NO: 85, is close to MC
CU.
Humoral Responses to Biolistic and Intradermal Immunization with
the Optimal and Least Optimal gD2 Constructs
[0370] C57BL/6 mice were immunized in two groups (8 mice/construct)
by intradermal injection (ID) and by gene gun delivery, using the
same immunization protocol as for the E7 constructs.
[0371] Group 1 included pcDNA3-gD2 and pcDNA3-gD2 O1. Group 2
included pcDNA3-gD2, pcDNA3-gD2 O2, pcDNA3-gD2 O3, and pcDNA3-gD2
W.
[0372] Antibody responses were measured by an ELISA using plates
coated with CHO cell supernatant containing C-terminally His tagged
and truncated gD2. The truncation is at amino acid residue 331 and
removes the transmembrane region, resulting in the protein being
secreted into the medium. ELISA plates coated with
supernatant from CHO cells transfected with empty vector were used
as a control.
[0373] For both biolistic and intradermal injection delivery routes
it was found that the three optimized constructs generated levels
of antibodies similar to those of the wt CU gD2 construct (FIG. 20). The
de-optimized construct, W gD2, was very poor at generating
antibodies, particularly when delivered by intradermal injection.
The two delivery methods resulted in similar levels of
antibodies.
[0374] To date, there are no DNA vaccines on the market for the
treatment or prevention of disease in humans. There is a need to
maximize the immune responses generated by DNA vaccines and the
present invention discloses ways of enhancing efficacy of DNA
vaccines by using codons that have a higher preference for
producing an immune response.
[0375] The study described in this Example has validated the Immune
Coricode table by applying it to optimization or de-optimization of
the HPV16 E7 and HSV-2 glycoprotein D (gD2) genes and demonstrating
that this does enhance or reduce, respectively, the antibody or
cellular response to biolistic delivery of these genes to mammals
such as mice.
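The kind of synonymous codon substitution discussed above can be inspected directly in the sequence listing. The following is a minimal sketch, not the method used in this Example: it assumes that the reading frame begins at the first ATG of each construct, and the names `CODON_TABLE`, `codons` and `synonymous_differences` are hypothetical helpers introduced only for illustration. The two fragments are the first 39 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 2 as printed in the sequence listing.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(triplet): aa
    for triplet, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons(seq, start):
    """Split seq into complete codons, beginning at index start."""
    seq = seq.upper()
    return [seq[i:i + 3] for i in range(start, len(seq) - 2, 3)]


def synonymous_differences(seq_a, seq_b):
    """List codon positions where two sequences differ but encode the same residue.

    Assumption (not stated in the source): translation starts at the first ATG.
    Returns tuples of (codon number, codon in a, codon in b, amino acid).
    """
    start_a = seq_a.upper().find("ATG")
    start_b = seq_b.upper().find("ATG")
    if start_a < 0 or start_a != start_b:
        raise ValueError("sequences must share the same start codon position")
    result = []
    pairs = zip(codons(seq_a, start_a), codons(seq_b, start_b))
    for number, (ca, cb) in enumerate(pairs, start=1):
        if ca != cb and CODON_TABLE[ca] == CODON_TABLE[cb]:
            result.append((number, ca, cb, CODON_TABLE[ca]))
    return result


# First 39 nucleotides of SEQ ID NO: 1 and SEQ ID NO: 2 from the listing.
SEQ1_FRAGMENT = "ggtaccgccgccaccatggagacagatacactcctgcta"
SEQ2_FRAGMENT = "ggtaccgccgccaccatggagacagacacactcctgcta"

if __name__ == "__main__":
    for item in synonymous_differences(SEQ1_FRAGMENT, SEQ2_FRAGMENT):
        print(item)
```

Under these assumptions, the only difference within the two fragments is at the fourth codon, where one aspartate codon (GAT) is exchanged for the other (GAC) without altering the encoded residue.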
Materials and Methods
[0376] ELISPOT Assay
[0377] For the IFN-.gamma. ELISPOTs, mice were immunized twice, at
days 0 and 21, and the spleens were collected 3 weeks after the
second immunization.
[0378] Intradermal Injection Protocol
[0379] The timing and frequency of the immunizations by intradermal
injection were the same as for gene gun immunization. At each
immunization 5 .mu.g of DNA was injected per ear, i.e., a total of 10
.mu.g was administered per immunization per mouse. Hair removal
prior to immunization was not necessary. The timing of bleeds and
spleen collection was the same as for the gene gun immunized
mice.
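The intradermal dose stated above follows from simple arithmetic, sketched here for clarity. The function name `dose_per_immunization_ug` is hypothetical, and the assumption that both ears of each mouse were injected is inferred from the stated total rather than spelled out in the protocol.

```python
def dose_per_immunization_ug(dose_per_ear_ug, ears_injected=2):
    """Total DNA (in micrograms) given to one mouse at one immunization.

    ears_injected=2 is an assumption inferred from the stated 10 ug total.
    """
    return dose_per_ear_ug * ears_injected


if __name__ == "__main__":
    # 5 ug per ear, two ears per mouse
    print(dose_per_immunization_ug(5))
```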
[0380] GST-E7 ELISA
[0381] The GST-E7 ELISA was carried out in the same way as the
peptide ELISA with the exception that the plates were coated
overnight with 50 .mu.L of 10 .mu.g/mL GST-tagged E7 protein
(kindly provided by the Frazer group from the Diamantina Institute,
The University of Queensland, Brisbane).
[0382] HSV-2 gD ELISA
[0383] This ELISA was carried out in the same way as the E7 ELISAs
with the exception that the plates were coated with supernatant
from CHO cells transfected with a vector encoding C-terminally
His-tagged and truncated gD2 protein. Control plates coated with
supernatant from CHO cells transfected with empty vector were also
used.
[0384] Detection of HPV-specific Responses
[0385] For the detection of HPV-specific responses, 96-well filter
ELISPOT plates (Millipore) were coated overnight with 10 mg/mL HPV
GST-tagged E7 protein in 0.1 M NaHCO.sub.3. For the detection of
total IgG secreting cells, 96-well filter ELISPOT plates were
coated overnight with 2 .mu.g/mL goat anti-mouse Ig (Sigma) in PBS
without MgCl.sub.2 and CaCl.sub.2. After coating, plates were
washed once with complete DMEM without FCS and then blocked with
complete DMEM supplemented with 10% FCS for one hour at 37.degree.
C. Cultured mouse spleen cells were washed and added to ELISPOT
plates at 10.sup.6 cells/100 .mu.L. For the detection of
HPV-specific memory B cells, plates were incubated overnight at
37.degree. C. and for measuring total IgG cells, plates were
incubated for 1 hour at 37.degree. C. For detection, we used
biotinylated goat anti-mouse IgG (Sigma) in PBS-T/1% FCS, followed
by 5 .mu.g/mL HRP-conjugated avidin (Pierce) and developed using
3-amino-9-ethylcarbazole (Sigma). Developed plates were counted
using an automated ELISPOT plate counter.
[0386] E7 IFN-.gamma. ELISPOT
[0387] 96-well filter plates (Millipore) were coated overnight with
4 .mu.g/mL of monoclonal antibody (AN18; Mabtech). After coating,
plates were washed once with complete RPMI and blocked for 2 hours
with complete RPMI with 10% foetal calf serum (FCS; CSL Ltd). Mouse
spleens were made into single cell suspensions and treated with ACK
lysis buffer, washed and resuspended at a concentration of 10.sup.7
cells/mL. Spleen cells (10.sup.6/well) were added to each well
followed by the addition of complete RPMI supplemented with
recombinant hIL-2 (ProSpec-Tany TechnoGene Ltd) and peptide to a
final concentration of 10 IU/well and 1 .mu.g/mL, respectively.
Medium containing hIL-2 without peptide was added to control wells.
Plates were incubated for approximately 18 hours at 37.degree. C.
in 5-8% CO.sub.2.
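The plating volume implied by the cell concentration and the number of cells per well given above can be worked out as follows. This is a minimal sketch for illustration only; the function name `volume_per_well_ul` is hypothetical, and the volume itself is derived from the stated figures rather than quoted from the protocol.

```python
def volume_per_well_ul(cells_per_well, cells_per_ml):
    """Volume of cell suspension (in microlitres) that delivers cells_per_well."""
    if cells_per_ml <= 0:
        raise ValueError("cells_per_ml must be positive")
    # Multiply before dividing so that round inputs give a round result.
    return cells_per_well * 1000 / cells_per_ml


if __name__ == "__main__":
    # 10^6 spleen cells per well, taken from a suspension at 10^7 cells/mL
    print(volume_per_well_ul(10**6, 10**7))
```

With the values given in the protocol, this corresponds to 100 microlitres of spleen cell suspension per well before addition of the medium containing hIL-2 and peptide.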
[0388] After overnight incubation, cells were lysed by rinsing the
plates in tap water and then washed six times in PBS/0.05% Tween 20
(PBS-T). For detection, biotinylated detection mAb (R4-6A2;
Mabtech) in PBS-T/2% FCS was added, followed by horseradish
peroxidase (HRP)-conjugated streptavidin and DAB (Sigma). Developed
plates were counted using an automated ELISPOT plate counter.
[0389] The disclosure of every patent, patent application, and
publication cited herein is hereby incorporated herein by reference
in its entirety.
[0390] The citation of any reference herein should not be construed
as an admission that such reference is available as "Prior Art" to
the instant application.
[0391] Throughout the specification the aim has been to describe
the preferred embodiments of the invention without limiting the
invention to any one embodiment or specific collection of features.
Those of skill in the art will therefore appreciate that, in light
of the instant disclosure, various modifications and changes can be
made in the particular embodiments exemplified without departing
from the scope of the present invention. All such modifications and
changes are intended to be included within the scope of the
appended claims.
BIBLIOGRAPHY
[0392] Ausubel, F. M. (Ed.) 2007. Current Protocols in Molecular
Biology. Ebook
(http://www.mrw.interscience.wiley.com/emrw/9780471142720/cp/cpmb/toc).
[0393] Edmonds, C., and Vousden, K. H. (1989). A point mutational
analysis of human papillomavirus type 16 E7 protein. Journal of
Virology. 63: 2650-2656.
[0394] Frazer, I. H., Leippe, D. M., Dunn, L. A., Leim, A., Tindle,
R. W., Fernando, G. J., Phelps, W. C., and Lambert, P. F. (1995).
Immunological responses in human papillomavirus 16 E6/E7 transgenic
mice to E7 protein correlate with the presence of skin disease.
Cancer Research. 55: 2635-2639.
[0395] Heck, D. V., Yee, C. L., Howley, P. M., and Munger, K.
(1992). Efficiency of binding the retinoblastoma protein correlates
with the transforming capacity of the E7 oncoproteins of the human
papillomaviruses. PNAS. 89: 4442-4446.
[0396] Liu, W. J., Gao, F., Zhao, K. N., Zhao, W., Fernando, G. J.,
Thomas, R., and Frazer, I. H. (2002). Codon modified human
papillomavirus type 16 E7 DNA vaccine enhances cytotoxic
T-lymphocyte induction and anti-tumour activity. Virology. 301:
43-52.
[0397] Smith, H. O., Hutchison III, C. A., Pfannkoch, C., and
Venter, J. C. (2003). Generating a synthetic genome by whole genome
assembly: .phi.X174 bacteriophage from synthetic oligonucleotides.
PNAS. 100 (26): 15440-15445.
Sequence CWU 1
1
911387DNAArtificial sequencePlasmid sequence 1ggtaccgccg ccaccatgga
gacagataca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca ctggtgatgg
atccatgcat ggagatacac ctacattgca tgaatatatg 120ttagatttgc
aaccagagac aactgatctc tactgttatg agcaattaaa tgatagctca
180gaggaggagg atgaaataga tggtccagct ggacaagcag aaccggatag
agcccattac 240aatattgtaa ccttttgttg caagtgtgat tctacgcttc
ggttgtgcgt acaaagcaca 300cacgtagata ttcgtacttt ggaagatctg
ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc agaagcccta agaattc
3872387DNAArtificial sequencePlasmid sequence 2ggtaccgccg
ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca
ctggtgacgg atccatgcat ggagacacac ctacattgca tgaatatatg
120ttagacttgc aaccagagac aactgacctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg acgaaataga cggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 3873387DNAArtificial sequencePlasmid sequence
3ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg taagtgtgac
tctacgcttc ggttgtgtgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgtccc 360atctgttctc
agaagcccta agaattc 3874387DNAArtificial sequencePlasmid sequence
4ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgctatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgctg caagtgcgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 3875387DNAArtificial sequencePlasmid sequence
5ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgagtatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgagataga tggtccagct ggacaagcag
agccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaggacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 3876387DNAArtificial sequencePlasmid sequence
6ggtaccgccg ccaccatgga aacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagaaac aactgatctc tactgttatg aacaattaaa
tgacagctca 180gaagaagaag atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 3877387DNAArtificial sequencePlasmid sequence
7ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc agccagagac aactgatctc tactgttatg agcagttaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaggcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acagagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 3878387DNAArtificial sequencePlasmid sequence
8ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
aaaagcccta agaattc 3879387DNAArtificial sequencePlasmid sequence
9ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccagggtcca ctggggacgg atccatgcat ggggatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tgggccagct gggcaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggga cactagggat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38710387DNAArtificial sequencePlasmid sequence
10ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggatcca ctggagacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggaccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggaa cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38711387DNAArtificial sequencePlasmid sequence
11ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggtgatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggtcaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggta cactaggtat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38712387DNAArtificial sequencePlasmid sequence
12ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggctcca ctggcgacgg atccatgcat ggcgatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggcccagct ggccaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggcat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38713387DNAArtificial sequencePlasmid sequence
13ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatatagtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca tacgtacttt
ggaagacctg ttaatgggca cactaggaat agtgtgcccc 360atatgctctc
agaagcccta agaattc 38714387DNAArtificial sequencePlasmid sequence
14ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaattga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atttgctctc
agaagcccta agaattc 38715387DNAArtificial sequencePlasmid sequence
15ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaatcga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatatcgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca tccgtacttt
ggaagacctg ttaatgggca cactaggaat cgtgtgcccc 360atctgctctc
agaagcccta agaattc 38716387DNAArtificial sequencePlasmid sequence
16ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggtagta ctggtgacgg aagtatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagtagt 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
agtacgcttc ggttgtgcgt acaaagtaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgcagtc
agaagcccta agaattc 38717387DNAArtificial sequencePlasmid sequence
17ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggtagca ctggtgacgg aagcatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagcagc 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
agcacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgcagcc
agaagcccta agaattc 38718387DNAArtificial sequencePlasmid sequence
18ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcga ctggtgacgg atcgatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgactcgtcg 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tcgacgcttc ggttgtgcgt acaatcgaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctcgc
agaagcccta agaattc 38719387DNAArtificial sequencePlasmid sequence
19ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcaa ctggtgacgg atcaatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgactcatca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tcaacgcttc ggttgtgcgt acaatcaaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctcac
agaagcccta agaattc 38720387DNAArtificial sequencePlasmid sequence
20ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcta ctggtgacgg atctatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgactcttct 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaatctaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38721387DNAArtificial sequencePlasmid sequence
21ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgactcctcc 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tccacgcttc ggttgtgcgt acaatccaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctccc
agaagcccta agaattc 38722387DNAArtificial sequencePlasmid sequence
22ggtaccgccg ccaccatgga gacggacacg ctcctgctat gggtactgct gctctgggtt
60ccaggttcca cgggtgacgg atccatgcat ggagatacgc ctacgttgca tgaatatatg
120ttagatttgc aaccagagac gacggatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa cgttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcacg 300cacgtagaca ttcgtacgtt
ggaagacctg ttaatgggca cgctaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38723387DNAArtificial sequencePlasmid sequence
23ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt
60ccaggttcca caggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aacagatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa cattttgttg caagtgtgac
tctacacttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacatt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38724387DNAArtificial sequencePlasmid sequence
24ggtaccgccg ccaccatgga gactgacact ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatactc ctactttgca tgaatatatg
120ttagatttgc aaccagagac tactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ctttttgttg caagtgtgac
tctactcttc ggttgtgcgt acaaagcact 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca ctctaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38725387DNAArtificial sequencePlasmid sequence
25ggtaccgccg ccaccatgga gaccgacacc ctcctgctat gggtactgct gctctgggtt
60ccaggttcca ccggtgacgg atccatgcat ggagataccc ctaccttgca tgaatatatg
120ttagatttgc aaccagagac caccgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacccttc ggttgtgcgt acaaagcacc 300cacgtagaca ttcgtacctt
ggaagacctg ttaatgggca ccctaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38726387DNAArtificial sequencePlasmid sequence
26ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtgctgct gctctgggtg
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtga ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt gcaaagcaca 300cacgtggaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 38727387DNAArtificial sequencePlasmid sequence
27ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggta
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtatgcccc 360atctgctctc
agaagcccta agaattc 38728387DNAArtificial sequencePlasmid sequence
28ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggttctgct gctctgggtt
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtta ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt tcaaagcaca 300cacgttgaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtttgcccc 360atctgctctc
agaagcccta agaattc 38729387DNAArtificial sequencePlasmid sequence
29ggtaccgccg ccaccatgga gacagacaca ctcctgctat gggtcctgct gctctgggtc
60ccaggttcca ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtca ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt ccaaagcaca 300cacgtcgaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtctgcccc 360atctgctctc
agaagcccta agaattc 38730408DNAArtificial sequencePlasmid linker
sequence 30ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct
gctctgggtg 60cccggctcca ccggcgacgc ggcgggcgcg ggcgcggcgg gatccatgca
cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga
ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag
gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta
caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg
tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc
360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc
40831408DNAArtificial sequencePlasmid linker sequence 31ggtaccgccg
ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca
ccggcgacgc agcaggcgca ggcgcagcag gatccatgca cggcgacacc
120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct
gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg
acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg
accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac
ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca
tcgtgtgccc catctgctcc cagaagccct aagaattc 40832408DNAArtificial
sequencePlasmid linker sequence 32ggtaccgccg ccaccatgga gaccgacacc
ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacgc tgctggcgct
ggcgctgctg gatccatgca cggcgacacc 120cccaccctgc acgagtacat
gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca
acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc
240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga
cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc
tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc
cagaagccct aagaattc 40833408DNAArtificial sequencePlasmid linker
sequence 33ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct
gctctgggtg 60cccggctcca ccggcgacgc cgccggcgcc ggcgccgccg gatccatgca
cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga
ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag
gacgagatcg acggccccgc cggccaggcc 240gagcccgacc
gcgcccacta caacatcgtg accttctgct gcaagtgcga cagcaccctg
300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc tggaggacct
gctgatgggc 360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc
40834408DNAArtificial sequencePlasmid linker sequence 34ggtaccgccg
ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca
ccggcgacag gaggggcagg ggcaggaggg gatccatgca cggcgacacc
120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct
gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg
acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg
accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac
ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca
tcgtgtgccc catctgctcc cagaagccct aagaattc 40835408DNAArtificial
sequencePlasmid linker sequence 35ggtaccgccg ccaccatgga gaccgacacc
ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacag aagaggcaga
ggcagaagag gatccatgca cggcgacacc 120cccaccctgc acgagtacat
gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca
acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc
240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga
cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc
tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc
cagaagccct aagaattc 40836408DNAArtificial sequencePlasmid linker
sequence 36ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct
gctctgggtg 60cccggctcca ccggcgaccg gcggggccgg ggccggcggg gatccatgca
cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga
ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag
gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta
caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg
tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc
360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc
40837408DNAArtificial sequencePlasmid linker sequence 37ggtaccgccg
ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca
ccggcgaccg acgaggccga ggccgacgag gatccatgca cggcgacacc
120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct
gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg
acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg
accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac
ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca
tcgtgtgccc catctgctcc cagaagccct aagaattc 40838408DNAArtificial
sequencePlasmid linker sequence 38ggtaccgccg ccaccatgga gaccgacacc
ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgaccg tcgtggccgt
ggccgtcgtg gatccatgca cggcgacacc 120cccaccctgc acgagtacat
gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca
acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc
240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga
cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc
tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc
cagaagccct aagaattc 40839408DNAArtificial sequencePlasmid linker
sequence 39ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct
gctctgggtg 60cccggctcca ccggcgaccg ccgcggccgc ggccgccgcg gatccatgca
cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga
ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag
gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta
caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg
tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc
360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc
40840408DNAArtificial sequencePlasmid linker sequence 40ggtaccgccg
ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca
ccggcgacaa taatggcaat ggcaataatg gatccatgca cggcgacacc
120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct
gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg
acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg
accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac
ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca
tcgtgtgccc catctgctcc cagaagccct aagaattc 40841408DNAArtificial
sequencePlasmid linker sequence 41ggtaccgccg ccaccatgga gaccgacacc
ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacaa caacggcaac
ggcaacaacg gatccatgca cggcgacacc 120cccaccctgc acgagtacat
gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca
acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc
240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga
cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc
tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc
cagaagccct aagaattc 40842408DNAArtificial sequencePlasmid linker
sequence 42ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct
gctctgggtg 60cccggctcca ccggcgacca tcatggccat ggccatcatg gatccatgca
cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga
ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag
gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta
caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg
tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc
360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc
40843408DNAArtificial sequencePlasmid linker sequence 43ggtaccgccg
ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca
ccggcgacca ccacggccac ggccaccacg gatccatgca cggcgacacc
120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct
gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg
acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg
accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac
ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca
tcgtgtgccc catctgctcc cagaagccct aagaattc 40844408DNAArtificial
sequencePlasmid linker sequence 44ggtaccgccg ccaccatgga gaccgacacc
ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgacaa gaagggcaag
ggcaagaagg gatccatgca cggcgacacc 120cccaccctgc acgagtacat
gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca
acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc
240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga
cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc
tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc
cagaagccct aagaattc 40845408DNAArtificial sequencePlasmid linker
sequence 45ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct
gctctgggtg 60cccggctcca ccggcgacaa aaaaggcaaa ggcaaaaaag gatccatgca
cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga
ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag
gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta
caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg
tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc
360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc
40846408DNAArtificial sequencePlasmid linker sequence 46ggtaccgccg
ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca
ccggcgactt ttttggcttt ggcttttttg gatccatgca cggcgacacc
120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct
gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg
acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg
accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac
ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca
tcgtgtgccc catctgctcc cagaagccct aagaattc 40847408DNAArtificial
sequencePlasmid linker sequence 47ggtaccgccg ccaccatgga gaccgacacc
ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca ccggcgactt cttcggcttc
ggcttcttcg gatccatgca cggcgacacc 120cccaccctgc acgagtacat
gctggacctg cagcccgaga ccaccgacct gtactgctac 180gagcagctca
acgacagcag cgaggaggag gacgagatcg acggccccgc cggccaggcc
240gagcccgacc gcgcccacta caacatcgtg accttctgct gcaagtgcga
cagcaccctg 300cgcctctgcg tgcagagcac ccacgtggac atccgcaccc
tggaggacct gctgatgggc 360accctgggca tcgtgtgccc catctgctcc
cagaagccct aagaattc 40848408DNAArtificial sequencePlasmid linker
sequence 48ggtaccgccg ccaccatgga gaccgacacc ctcctgctgt gggtgctgct
gctctgggtg 60cccggctcca ccggcgacta ttatggctat ggctattatg gatccatgca
cggcgacacc 120cccaccctgc acgagtacat gctggacctg cagcccgaga
ccaccgacct gtactgctac 180gagcagctca acgacagcag cgaggaggag
gacgagatcg acggccccgc cggccaggcc 240gagcccgacc gcgcccacta
caacatcgtg accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg
tgcagagcac ccacgtggac atccgcaccc tggaggacct gctgatgggc
360accctgggca tcgtgtgccc catctgctcc cagaagccct aagaattc
40849408DNAArtificial sequencePlasmid linker sequence 49ggtaccgccg
ccaccatgga gaccgacacc ctcctgctgt gggtgctgct gctctgggtg 60cccggctcca
ccggcgacta ctacggctac ggctactacg gatccatgca cggcgacacc
120cccaccctgc acgagtacat gctggacctg cagcccgaga ccaccgacct
gtactgctac 180gagcagctca acgacagcag cgaggaggag gacgagatcg
acggccccgc cggccaggcc 240gagcccgacc gcgcccacta caacatcgtg
accttctgct gcaagtgcga cagcaccctg 300cgcctctgcg tgcagagcac
ccacgtggac atccgcaccc tggaggacct gctgatgggc 360accctgggca
tcgtgtgccc catctgctcc cagaagccct aagaattc 408501707DNAArtificial
sequenceVirus 50atggagaaaa tagtgcttct ttttgcaata gtcagtcttg
ttaaaagtga tcagatttgc 60attggttacc atgcaaacaa ctcgacagag caggttgaca
caataatgga aaagaacgtt 120actgttacac atgcccaaga catactggaa
aagacacaca acgggaagct ctgcgatcta 180gatggagtga agcctctaat
tttgagagat tgtagtgtag ctggatggct cctcggaaac 240ccaatgtgtg
acgaattcat caatgtgccg gaatggtctt acatagtgga gaaggccaat
300ccagccaatg acctctgtta cccaggggat ttcaacgact atgaagaatt
gaaacaccta 360ttgagcagaa taaaccattt tgagaaaatt cagatcatcc
ccaaaaattc ttggtccagt 420catgaagcct cattaggggt gagctcagca
tgtccatacc aaggaaagtc ctcctttttc 480aggaatgtgg tatggcttat
caaaaagaac aatgcatacc caacaataaa gaggagctac 540aataatacca
accaagaaga tcttttggta ttgtggggga ttcaccatcc taatgatgcg
600gcagagcaga ctaggctcta tcaaaaccca accacctaca tttccgttgg
gacatcaaca 660ctaaaccaga gattggtacc aaaaatagct actagatcca
aagtaaacgg gcaaaatgga 720aggatggagt tcttctggac aattttaaaa
ccgaatgatg caatcaactt cgagagcaat 780ggaaatttca ttgctccaga
atatgcatac aaaattgtca agaaagggga ctcagcaatt 840atgaaaagtg
aattggaata tggtaactgc aacaccaagt gtcaaactcc aatgggggcg
900ataaactcta gtatgccatt ccacaatata caccctctca ccatcgggga
atgccccaaa 960tatgtgaaat caaacagatt agtccttgcg actgggctca
gaaatagccc tcaaagagag 1020agaagaagaa aaaagagagg attatttgga
gctatagcag gttttataga gggaggatgg 1080cagggaatgg tagatggttg
gtatgggtac caccatagca atgagcaggg gagtgggtac 1140gctgcagaca
aagaatccac tcaaaaggca atagatggag tcaccaataa ggtcaactcg
1200atcattgaca aaatgaacac tcagtttgag gccgttggaa gggaatttaa
taacttagaa 1260aggagaatag agaatttaaa caagaagatg gaagacggat
tcctagatgt ctggacttat 1320aatgctgaac ttctggttct catggaaaat
gagagaactc tagactttca tgactcaaat 1380gtcaagaacc tttacgacaa
ggtccgacta cagcttaggg ataatgcaaa ggagctgggt 1440aacggttgtt
tcgagttcta tcacaaatgt gataatgaat gtatggaaag tgtaagaaac
1500ggaacgtatg actacccgca gtattcagaa gaagcaagac taaaaagaga
ggaaataagt 1560ggagtaaaat tggagtcaat aggaacttac caaatactgt
caatttattc tacagtggcg 1620agttccctag cactggcaat catggtagct
ggtctatctt tatggatgtg ctccaatggg 1680tcgttacaat gcagaatttg catttaa
170751568PRTArtificial sequenceVirus 51Met Glu Lys Ile Val Leu Leu
Phe Ala Ile Val Ser Leu Val Lys Ser1 5 10 15Asp Gln Ile Cys Ile Gly
Tyr His Ala Asn Asn Ser Thr Glu Gln Val 20 25 30Asp Thr Ile Met Glu
Lys Asn Val Thr Val Thr His Ala Gln Asp Ile 35 40 45Leu Glu Lys Thr
His Asn Gly Lys Leu Cys Asp Leu Asp Gly Val Lys 50 55 60Pro Leu Ile
Leu Arg Asp Cys Ser Val Ala Gly Trp Leu Leu Gly Asn65 70 75 80Pro
Met Cys Asp Glu Phe Ile Asn Val Pro Glu Trp Ser Tyr Ile Val 85 90
95Glu Lys Ala Asn Pro Ala Asn Asp Leu Cys Tyr Pro Gly Asp Phe Asn
100 105 110Asp Tyr Glu Glu Leu Lys His Leu Leu Ser Arg Ile Asn His
Phe Glu 115 120 125Lys Ile Gln Ile Ile Pro Lys Asn Ser Trp Ser Ser
His Glu Ala Ser 130 135 140Leu Gly Val Ser Ser Ala Cys Pro Tyr Gln
Gly Lys Ser Ser Phe Phe145 150 155 160Arg Asn Val Val Trp Leu Ile
Lys Lys Asn Asn Ala Tyr Pro Thr Ile 165 170 175Lys Arg Ser Tyr Asn
Asn Thr Asn Gln Glu Asp Leu Leu Val Leu Trp 180 185 190Gly Ile His
His Pro Asn Asp Ala Ala Glu Gln Thr Arg Leu Tyr Gln 195 200 205Asn
Pro Thr Thr Tyr Ile Ser Val Gly Thr Ser Thr Leu Asn Gln Arg 210 215
220Leu Val Pro Lys Ile Ala Thr Arg Ser Lys Val Asn Gly Gln Asn
Gly225 230 235 240Arg Met Glu Phe Phe Trp Thr Ile Leu Lys Pro Asn
Asp Ala Ile Asn 245 250 255Phe Glu Ser Asn Gly Asn Phe Ile Ala Pro
Glu Tyr Ala Tyr Lys Ile 260 265 270Val Lys Lys Gly Asp Ser Ala Ile
Met Lys Ser Glu Leu Glu Tyr Gly 275 280 285Asn Cys Asn Thr Lys Cys
Gln Thr Pro Met Gly Ala Ile Asn Ser Ser 290 295 300Met Pro Phe His
Asn Ile His Pro Leu Thr Ile Gly Glu Cys Pro Lys305 310 315 320Tyr
Val Lys Ser Asn Arg Leu Val Leu Ala Thr Gly Leu Arg Asn Ser 325 330
335Pro Gln Arg Glu Arg Arg Arg Lys Lys Arg Gly Leu Phe Gly Ala Ile
340 345 350Ala Gly Phe Ile Glu Gly Gly Trp Gln Gly Met Val Asp Gly
Trp Tyr 355 360 365Gly Tyr His His Ser Asn Glu Gln Gly Ser Gly Tyr
Ala Ala Asp Lys 370 375 380Glu Ser Thr Gln Lys Ala Ile Asp Gly Val
Thr Asn Lys Val Asn Ser385 390 395 400Ile Ile Asp Lys Met Asn Thr
Gln Phe Glu Ala Val Gly Arg Glu Phe 405 410 415Asn Asn Leu Glu Arg
Arg Ile Glu Asn Leu Asn Lys Lys Met Glu Asp 420 425 430Gly Phe Leu
Asp Val Trp Thr Tyr Asn Ala Glu Leu Leu Val Leu Met 435 440 445Glu
Asn Glu Arg Thr Leu Asp Phe His Asp Ser Asn Val Lys Asn Leu 450 455
460Tyr Asp Lys Val Arg Leu Gln Leu Arg Asp Asn Ala Lys Glu Leu
Gly465 470 475 480Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp Asn
Glu Cys Met Glu 485 490 495Ser Val Arg Asn Gly Thr Tyr Asp Tyr Pro
Gln Tyr Ser Glu Glu Ala 500 505 510Arg Leu Lys Arg Glu Glu Ile Ser
Gly Val Lys Leu Glu Ser Ile Gly 515 520 525Thr Tyr Gln Ile Leu Ser
Ile Tyr Ser Thr Val Ala Ser Ser Leu Ala 530 535 540Leu Ala Ile Met
Val Ala Gly Leu Ser Leu Trp Met Cys Ser Asn Gly545 550 555 560Ser
Leu Gln Cys Arg Ile Cys Ile 565521707DNAArtificial sequenceVirus
52atggaaaaaa tcgtgctgct gttcgctatc gtctcgctgg tcaaatcgga tcagatctgc
60atcggatacc atgctaacaa ctcgacggaa caggtcgaca cgatcatgga aaagaacgtc
120acggtcacgc atgctcaaga catcctggaa aagacgcaca acggaaagct
gtgcgatctg 180gatggagtga agcctctgat cctgagagat tgttcggtcg
ctggatggct gctgggaaac 240cccatgtgtg acgaatttat caatgtgccc
gaatggtcgt acatcgtgga aaaggctaat 300cccgctaatg acctgtgtta
ccccggagat tttaacgact atgaagaact gaaacacctg 360ctgtcgagaa
tcaaccattt cgaaaaaatc cagatcatcc ccaaaaattc gtggtcgtcg
420catgaagctt cgctgggagt gtcgtcggct tgtccctacc aaggaaagtc
gtcgttcttt 480agaaatgtgg tctggctgat caaaaagaac aatgcttacc
ccacgatcaa gagatcgtac 540aataatacga accaagaaga tctgctggtc
ctgtggggaa tccaccatcc taatgatgct 600gctgaacaga cgagactgta
tcaaaacccc acgacgtaca tctcggtcgg aacgtcgacg 660ctgaaccaga
gactggtccc caaaatcgct acgagatcga aagtcaacgg acaaaatgga
720agaatggaat ttttttggac gatcctgaaa cccaatgatg ctatcaactt
tgaatcgaat 780ggaaatttta tcgctcccga atatgcttac aaaatcgtca
agaaaggaga ctcggctatc 840atgaaatcgg aactggaata tggaaactgc
aacacgaagt gtcaaacgcc catgggagct 900atcaactcgt cgatgccctt
tcacaatatc caccctctga cgatcggaga atgccccaaa 960tatgtgaaat
cgaacagact ggtcctggct acgggactga gaaattcgcc tcaaagagaa
1020agaagaagaa aaaagagagg actgttcgga gctatcgctg gattcatcga
aggaggatgg 1080cagggaatgg tcgatggatg gtatggatac caccattcga
atgaacaggg atcgggatac 1140gctgctgaca aagaatcgac gcaaaaggct
atcgatggag tcacgaataa ggtcaactcg 1200atcatcgaca aaatgaacac
gcagttcgaa gctgtcggaa gagaattcaa taacctggaa 1260agaagaatcg
aaaatctgaa caagaagatg gaagacggat ttctggatgt ctggacgtat
1320aatgctgaac tgctggtcct gatggaaaat gaaagaacgc tggacttcca
tgactcgaat 1380gtcaagaacc tgtacgacaa ggtccgactg cagctgagag
ataatgctaa ggaactggga 1440aacggatgtt ttgaatttta tcacaaatgt
gataatgaat gtatggaatc ggtcagaaac 1500ggaacgtatg actaccccca
gtattcggaa gaagctagac tgaaaagaga agaaatctcg 1560ggagtcaaac
tggaatcgat cggaacgtac caaatcctgt cgatctattc gacggtggct
1620tcgtcgctgg ctctggctat catggtcgct ggactgtcgc tgtggatgtg
ctcgaatgga 1680tcgctgcaat gcagaatctg catctaa
1707531701DNAArtificial sequenceVirus 53atgaagacta tcattgctct
gagctacatt ttatgtctgg tcttcgctca aaaacttccc 60cgaaatgaca acagcacggc
aacgctgtgc ttgggacacc atgcagtgtc aaacggaaca 120ctagtgaaaa
caatcacgaa tgaccaaatt
gaagtgacta atgctactga attggttcag 180agttcctcaa caggtagaat
atgtgaccga cctcatcgaa tccttgatgg ggaaaactgc 240acactgatag
atgctctctt gggagaccct cattgtgata gtttccaaaa caaggaatgg
300gacctttttg tagaacgcag cacagcttac agcgactgtt acccttatga
tgtgccggat 360tatgcctccc ttaggtcact agttgcctca tccggcaccc
tggagtttaa cgatgaaagt 420ttcgattgga ctggagtctc tcaggatgga
acaagcaatg cttgcaaaag gagatctgtt 480aaaagttttt ttagtagatt
aaattggttg tacaaattag aatacaaata tccagcactg 540aacgtgacta
tgccaaacaa tgaaaaattt gacaaattgt acatttgggg ggtgcaccac
600ccgagcacgg acagtgacca aaccagtcta tatgttcaag catcagggag
agtcacaatc 660tctaccaaaa gaagccaaca aactgtaatc ccgaatatcg
gatctagacc ctgggtaagg 720ggtatctcca gcagaataag catctattgg
acaatagtaa aacctggaga catacttatg 780attaacagca cagggaatct
aatcgcccct cggggttact tcaagatacg aagtggagaa 840agctcaataa
tgaggtcaga tgcacccatt gatagctgca attctgaatg catcactcca
900aatggaagca ttcccaataa caaaccattt caaaatgtaa acaggatcac
atatggggcc 960tgtcctagat atgttaaaca aaaaactcta aaattggcaa
cagggatgcg gaatgtacca 1020gagaaacaag ctaggggcat attcggcgcc
atcgcaggtt tcatagaaaa tggttgggag 1080ggaatggtag acggttggta
cggttttagg catctaaatt ctgagggctc aggacaagca 1140gcagacctca
aaagcactca ggcagcaatt aaccaaatca acgggaaact gaataggttg
1200gtcgaaaaaa caaacgagaa attccatcaa attgaaaaag aattctcaga
cgtggaaggg 1260agaattcagg atctcgagaa atatgttgaa gacaccaaaa
tagatctctg gtcatacaat 1320gcggagcttc ttgttgccct ggagaaccaa
cacacaattg atctaactga ctcagaaatg 1380aacaaactgt tcgaaagaac
aaggaaacaa ctgagggaaa atgctgagga catgggcaat 1440ggttgcttca
aaatatacca caaatgtgac aatgcctgca tagggtcgat cagaaatgga
1500acttatgacc ataatgtata cagagacgaa gcattaaaca accgactcca
tatcaaaggg 1560gttgagctga agtcaggata caaagattgg atcttatgga
tctcattttc catatcatgc 1620tttttgtttt gtgttgtttt gctggggttc
atcatgtggg cctgccaaaa aggcaacatt 1680aggtgcaaca tttgcatttg a
170154566PRTArtificial sequenceVirus 54Met Lys Thr Ile Ile Ala Leu
Ser Tyr Ile Leu Cys Leu Val Phe Ala1 5 10 15Gln Lys Leu Pro Arg Asn
Asp Asn Ser Thr Ala Thr Leu Cys Leu Gly 20 25 30His His Ala Val Ser
Asn Gly Thr Leu Val Lys Thr Ile Thr Asn Asp 35 40 45Gln Ile Glu Val
Thr Asn Ala Thr Glu Leu Val Gln Ser Ser Ser Thr 50 55 60Gly Arg Ile
Cys Asp Arg Pro His Arg Ile Leu Asp Gly Glu Asn Cys65 70 75 80Thr
Leu Ile Asp Ala Leu Leu Gly Asp Pro His Cys Asp Ser Phe Gln 85 90
95Asn Lys Glu Trp Asp Leu Phe Val Glu Arg Ser Thr Ala Tyr Ser Asp
100 105 110Cys Tyr Pro Tyr Asp Val Pro Asp Tyr Ala Ser Leu Arg Ser
Leu Val 115 120 125Ala Ser Ser Gly Thr Leu Glu Phe Asn Asp Glu Ser
Phe Asp Trp Thr 130 135 140Gly Val Ser Gln Asp Gly Thr Ser Asn Ala
Cys Lys Arg Arg Ser Val145 150 155 160Lys Ser Phe Phe Ser Arg Leu
Asn Trp Leu Tyr Lys Leu Glu Tyr Lys 165 170 175Tyr Pro Ala Leu Asn
Val Thr Met Pro Asn Asn Glu Lys Phe Asp Lys 180 185 190Leu Tyr Ile
Trp Gly Val His His Pro Ser Thr Asp Ser Asp Gln Thr 195 200 205Ser
Leu Tyr Val Gln Ala Ser Gly Arg Val Thr Ile Ser Thr Lys Arg 210 215
220Ser Gln Gln Thr Val Ile Pro Asn Ile Gly Ser Arg Pro Trp Val
Arg225 230 235 240Gly Ile Ser Ser Arg Ile Ser Ile Tyr Trp Thr Ile
Val Lys Pro Gly 245 250 255Asp Ile Leu Met Ile Asn Ser Thr Gly Asn
Leu Ile Ala Pro Arg Gly 260 265 270Tyr Phe Lys Ile Arg Ser Gly Glu
Ser Ser Ile Met Arg Ser Asp Ala 275 280 285Pro Ile Asp Ser Cys Asn
Ser Glu Cys Ile Thr Pro Asn Gly Ser Ile 290 295 300Pro Asn Asn Lys
Pro Phe Gln Asn Val Asn Arg Ile Thr Tyr Gly Ala305 310 315 320Cys
Pro Arg Tyr Val Lys Gln Lys Thr Leu Lys Leu Ala Thr Gly Met 325 330
335Arg Asn Val Pro Glu Lys Gln Ala Arg Gly Ile Phe Gly Ala Ile Ala
340 345 350Gly Phe Ile Glu Asn Gly Trp Glu Gly Met Val Asp Gly Trp
Tyr Gly 355 360 365Phe Arg His Leu Asn Ser Glu Gly Ser Gly Gln Ala
Ala Asp Leu Lys 370 375 380Ser Thr Gln Ala Ala Ile Asn Gln Ile Asn
Gly Lys Leu Asn Arg Leu385 390 395 400Val Glu Lys Thr Asn Glu Lys
Phe His Gln Ile Glu Lys Glu Phe Ser 405 410 415Asp Val Glu Gly Arg
Ile Gln Asp Leu Glu Lys Tyr Val Glu Asp Thr 420 425 430Lys Ile Asp
Leu Trp Ser Tyr Asn Ala Glu Leu Leu Val Ala Leu Glu 435 440 445Asn
Gln His Thr Ile Asp Leu Thr Asp Ser Glu Met Asn Lys Leu Phe 450 455
460Glu Arg Thr Arg Lys Gln Leu Arg Glu Asn Ala Glu Asp Met Gly
Asn465 470 475 480Gly Cys Phe Lys Ile Tyr His Lys Cys Asp Asn Ala
Cys Ile Gly Ser 485 490 495Ile Arg Asn Gly Thr Tyr Asp His Asn Val
Tyr Arg Asp Glu Ala Leu 500 505 510Asn Asn Arg Leu His Ile Lys Gly
Val Glu Leu Lys Ser Gly Tyr Lys 515 520 525Asp Trp Ile Leu Trp Ile
Ser Phe Ser Ile Ser Cys Phe Leu Phe Cys 530 535 540Val Val Leu Leu
Gly Phe Ile Met Trp Ala Cys Gln Lys Gly Asn Ile545 550 555 560Arg
Cys Asn Ile Cys Ile 565551701DNAArtificial sequenceVirus
55atgaagacga tcatcgctct gtcgtacatc ctgtgtctgg tctttgctca aaaactgccc
60cgaaatgaca actcgacggc tacgctgtgc ctgggacacc atgctgtgtc gaacggaacg
120ctggtgaaaa cgatcacgaa tgaccaaatc gaagtgacga atgctacgga
actggtccag 180tcgtcgtcga cgggaagaat ctgtgaccga cctcatcgaa
tcctggatgg agaaaactgc 240acgctgatcg atgctctgct gggagaccct
cattgtgatt cgtttcaaaa caaggaatgg 300gacctgttcg tcgaacgctc
gacggcttac tcggactgtt acccttatga tgtgcccgat 360tatgcttcgc
tgagatcgct ggtcgcttcg tcgggaacgc tggaattcaa cgatgaatcg
420tttgattgga cgggagtctc gcaggatgga acgtcgaatg cttgcaaaag
aagatcggtc 480aaatcgttct tctcgagact gaattggctg tacaaactgg
aatacaaata tcccgctctg 540aacgtgacga tgcccaacaa tgaaaaattc
gacaaactgt acatctgggg agtgcaccac 600ccctcgacgg actcggacca
aacgtcgctg tatgtccaag cttcgggaag agtcacgatc 660tcgacgaaaa
gatcgcaaca aacggtcatc cccaatatcg gatcgagacc ctgggtcaga
720ggaatctcgt cgagaatctc gatctattgg acgatcgtca aacctggaga
catcctgatg 780atcaactcga cgggaaatct gatcgctcct cgaggatact
ttaagatccg atcgggagaa 840tcgtcgatca tgagatcgga tgctcccatc
gattcgtgca attcggaatg catcacgccc 900aatggatcga tccccaataa
caaacccttc caaaatgtca acagaatcac gtatggagct 960tgtcctagat
atgtcaaaca aaaaacgctg aaactggcta cgggaatgcg aaatgtcccc
1020gaaaaacaag ctagaggaat ctttggagct atcgctggat ttatcgaaaa
tggatgggaa 1080ggaatggtcg acggatggta cggattcaga catctgaatt
cggaaggatc gggacaagct 1140gctgacctga aatcgacgca ggctgctatc
aaccaaatca acggaaaact gaatagactg 1200gtcgaaaaaa cgaacgaaaa
atttcatcaa atcgaaaaag aattttcgga cgtggaagga 1260agaatccagg
atctggaaaa atatgtcgaa gacacgaaaa tcgatctgtg gtcgtacaat
1320gctgaactgc tggtcgctct ggaaaaccaa cacacgatcg atctgacgga
ctcggaaatg 1380aacaaactgt ttgaaagaac gagaaaacaa ctgagagaaa
atgctgaaga catgggaaat 1440ggatgcttta aaatctacca caaatgtgac
aatgcttgca tcggatcgat cagaaatgga 1500acgtatgacc ataatgtcta
cagagacgaa gctctgaaca accgactgca tatcaaagga 1560gtcgaactga
agtcgggata caaagattgg atcctgtgga tctcgttctc gatctcgtgc
1620ttcctgttct gtgtcgtcct gctgggattt atcatgtggg cttgccaaaa
aggaaacatc 1680agatgcaaca tctgcatctg a 1701561410DNAArtificial
sequenceVirus 56atgaatccaa atcagaagat aacaaccatt ggatcaatct
gtatggtaat tggaatagtt 60agcttgatgt tacaaattgg gaacataatc tcaatatggg
ttagtcattc aattcaaaca 120gggaatcaac accaggctga accatgcaat
caaagcatta ttacttatga aaacaacacc 180tgggtaaacc agacatatgt
caacatcagc aataccaatt ttcttactga gaaagctgtg 240gcttcagtaa
cattagcggg caattcatct ctttgcccca ttagtggatg ggctgtatac
300agtaaggaca acggtataag aatcggttcc aagggggatg tgtttgttat
aagagagccg 360ttcatctcat gctcccactt ggaatgcaga actttctttt
tgactcaggg agccttgctg 420aatgacaagc attctaatgg gaccgtcaaa
gacagaagcc ctcacagaac attaatgagt 480tgtcccgtgg gtgaggctcc
ttccccatac aactcgaggt ttgagtctgt tgcttggtcg 540gcaagtgctt
gtcatgatgg cactagttgg ttgacaattg gaatttctgg cccagacaat
600ggggctgtgg ctgtattgaa atacaatggc ataataacag acactatcaa
gagttggagg 660aacaacataa tgagaactca agagtctgaa tgtgcatgtg
taaatggctc ttgctttact 720gttatgactg atggaccaag taatgggcag
gcttcataca aaatcttcag aatagaaaaa 780gggaaagtag ttaaatcagc
cgaattaaat gcccctaatt atcactatga ggagtgctcc 840tgttatcctg
atgctggaga aatcacatgt gtgtgcaggg ataactggca tggctcaaat
900cggccatggg tatctttcaa tcaaaatttg gagtatcgaa taggatatat
atgcagtgga 960gttttcggag acaatccacg ccccaatgat gggacaggca
gttgtggtcc ggtgtcccct 1020aaaggggcat atggaataaa agggttctca
tttaaatacg gcaatggtgt ttggatcggg 1080agaaccaaaa gcactaattc
caggagcggc tttgaaatga tttgggatcc aaatggatgg 1140actggtacgg
acagtaattt ttcagtaaag caagatattg tagctataac cgattggtca
1200ggatatagcg ggagttttgt ccagcatcca gaactgacag gattagattg
cataagacct 1260tgtttctggg ttgagctaat cagagggcgg cccaaagaga
gcacaatttg gactagtggg 1320agcagcatat ccttttgtgg tgtaaatagt
gacactgtgg gttggtcttg gccagacggt 1380gctgagttgc cattcaccat
tgacaagtag 141057469PRTArtificial sequenceVirus 57Met Asn Pro Asn
Gln Lys Ile Thr Thr Ile Gly Ser Ile Cys Met Val1 5 10 15Ile Gly Ile
Val Ser Leu Met Leu Gln Ile Gly Asn Ile Ile Ser Ile 20 25 30Trp Val
Ser His Ser Ile Gln Thr Gly Asn Gln His Gln Ala Glu Pro 35 40 45Cys
Asn Gln Ser Ile Ile Thr Tyr Glu Asn Asn Thr Trp Val Asn Gln 50 55
60Thr Tyr Val Asn Ile Ser Asn Thr Asn Phe Leu Thr Glu Lys Ala Val65
70 75 80Ala Ser Val Thr Leu Ala Gly Asn Ser Ser Leu Cys Pro Ile Ser
Gly 85 90 95Trp Ala Val Tyr Ser Lys Asp Asn Gly Ile Arg Ile Gly Ser
Lys Gly 100 105 110Asp Val Phe Val Ile Arg Glu Pro Phe Ile Ser Cys
Ser His Leu Glu 115 120 125Cys Arg Thr Phe Phe Leu Thr Gln Gly Ala
Leu Leu Asn Asp Lys His 130 135 140Ser Asn Gly Thr Val Lys Asp Arg
Ser Pro His Arg Thr Leu Met Ser145 150 155 160Cys Pro Val Gly Glu
Ala Pro Ser Pro Tyr Asn Ser Arg Phe Glu Ser 165 170 175Val Ala Trp
Ser Ala Ser Ala Cys His Asp Gly Thr Ser Trp Leu Thr 180 185 190Ile
Gly Ile Ser Gly Pro Asp Asn Gly Ala Val Ala Val Leu Lys Tyr 195 200
205Asn Gly Ile Ile Thr Asp Thr Ile Lys Ser Trp Arg Asn Asn Ile Met
210 215 220Arg Thr Gln Glu Ser Glu Cys Ala Cys Val Asn Gly Ser Cys
Phe Thr225 230 235 240Val Met Thr Asp Gly Pro Ser Asn Gly Gln Ala
Ser Tyr Lys Ile Phe 245 250 255Arg Ile Glu Lys Gly Lys Val Val Lys
Ser Ala Glu Leu Asn Ala Pro 260 265 270Asn Tyr His Tyr Glu Glu Cys
Ser Cys Tyr Pro Asp Ala Gly Glu Ile 275 280 285Thr Cys Val Cys Arg
Asp Asn Trp His Gly Ser Asn Arg Pro Trp Val 290 295 300Ser Phe Asn
Gln Asn Leu Glu Tyr Arg Ile Gly Tyr Ile Cys Ser Gly305 310 315
320Val Phe Gly Asp Asn Pro Arg Pro Asn Asp Gly Thr Gly Ser Cys Gly
325 330 335Pro Val Ser Pro Lys Gly Ala Tyr Gly Ile Lys Gly Phe Ser
Phe Lys 340 345 350Tyr Gly Asn Gly Val Trp Ile Gly Arg Thr Lys Ser
Thr Asn Ser Arg 355 360 365Ser Gly Phe Glu Met Ile Trp Asp Pro Asn
Gly Trp Thr Gly Thr Asp 370 375 380Ser Asn Phe Ser Val Lys Gln Asp
Ile Val Ala Ile Thr Asp Trp Ser385 390 395 400Gly Tyr Ser Gly Ser
Phe Val Gln His Pro Glu Leu Thr Gly Leu Asp 405 410 415Cys Ile Arg
Pro Cys Phe Trp Val Glu Leu Ile Arg Gly Arg Pro Lys 420 425 430Glu
Ser Thr Ile Trp Thr Ser Gly Ser Ser Ile Ser Phe Cys Gly Val 435 440
445Asn Ser Asp Thr Val Gly Trp Ser Trp Pro Asp Gly Ala Glu Leu Pro
450 455 460Phe Thr Ile Asp Lys465581410DNAArtificial sequenceVirus
58atgaatccca atcagaagat cacgacgatc ggatcgatct gtatggtcat cggaatcgtc
60tcgctgatgc tgcaaatcgg aaacatcatc tcgatctggg tctcgcattc gatccaaacg
120ggaaatcaac accaggctga accctgcaat caatcgatca tcacgtatga
aaacaacacg 180tgggtcaacc agacgtatgt caacatctcg aatacgaatt
tcctgacgga aaaagctgtg 240gcttcggtca cgctggctgg aaattcgtcg
ctgtgcccca tctcgggatg ggctgtctac 300tcgaaggaca acggaatcag
aatcggatcg aagggagatg tgttcgtcat cagagaaccc 360tttatctcgt
gctcgcacct ggaatgcaga acgtttttcc tgacgcaggg agctctgctg
420aatgacaagc attcgaatgg aacggtcaaa gacagatcgc ctcacagaac
gctgatgtcg 480tgtcccgtgg gagaagctcc ttcgccctac aactcgagat
tcgaatcggt cgcttggtcg 540gcttcggctt gtcatgatgg aacgtcgtgg
ctgacgatcg gaatctcggg acccgacaat 600ggagctgtgg ctgtcctgaa
atacaatgga atcatcacgg acacgatcaa gtcgtggaga 660aacaacatca
tgagaacgca agaatcggaa tgtgcttgtg tcaatggatc gtgcttcacg
720gtcatgacgg atggaccctc gaatggacag gcttcgtaca aaatctttag
aatcgaaaaa 780ggaaaagtcg tcaaatcggc tgaactgaat gctcctaatt
atcactatga agaatgctcg 840tgttatcctg atgctggaga aatcacgtgt
gtgtgcagag ataactggca tggatcgaat 900cgaccctggg tctcgtttaa
tcaaaatctg gaatatcgaa tcggatatat ctgctcggga 960gtctttggag
acaatccccg ccccaatgat ggaacgggat cgtgtggacc cgtgtcgcct
1020aaaggagctt atggaatcaa aggattttcg ttcaaatacg gaaatggagt
ctggatcgga 1080agaacgaaat cgacgaattc gagatcggga ttcgaaatga
tctgggatcc caatggatgg 1140acgggaacgg actcgaattt ctcggtcaag
caagatatcg tcgctatcac ggattggtcg 1200ggatattcgg gatcgttcgt
ccagcatccc gaactgacgg gactggattg catcagacct 1260tgtttttggg
tcgaactgat cagaggacga cccaaagaat cgacgatctg gacgtcggga
1320tcgtcgatct cgttctgtgg agtcaattcg gacacggtgg gatggtcgtg
gcccgacgga 1380gctgaactgc cctttacgat cgacaagtag
1410591410DNAArtificial sequenceVirus 59atgaatacaa atcaaaaaat
aataaccatt ggaacagcct gtctgatagt cggaataatt 60agtctattat tgcagatagg
agatatagtc tcgttatgga taagccattc aattcagact 120ggagagaaaa
accactctca gatatgcagt caaagtgtca ttacatatga aaacaacaca
180tgggtgaacc aaacttatgt aaacattggc aataccaata ttgctgatgg
acagggagta 240aattcaataa tactagcggg caattcctct ctttgcccag
taagtggatg ggccatatac 300agcaaagaca atagcataag gatcggttcc
aaaggagaca tttttgtcat aagagaacta 360tttatctcat gctctcattt
ggagtgcaga actttttatc tgacccaagg tgctttgctg 420aatgacaagc
attctaatgg aaccgtcaaa gacaggagtc cttatagaac cttaatgagc
480tgcccgattg gtgaagctcc ttctccgtac aattcaaggt tcgaatcagt
tgcttggtca 540gcaagtgcat gccatgacgg aatgggatgg ctgacaatcg
gaatttccgg cccagataat 600ggagcagtgg ctgttttgaa atacaatggg
ataataacag atacaataaa aagttggagg 660aacaaaatac taagaacaca
agaatcagaa tgtgtctgta taaacggttc gtgtttcact 720ataatgactg
atggcccaag caatgggcag gcctcataca aaatattcaa aatgaagaaa
780gggaaaatta ttaaatcagt ggagatgaat gcacctaatt accactatga
ggaatgctcc 840tgttaccctg atacaggcaa agtggtgtgc gtgtgcagag
acaattggca tgcttcgaat 900agaccgtggg tctctttcga tcagaacctt
aattatcaga tagggtacat atgtagtggg 960gttttcggtg ataacccgcg
ttctaatgat gggagaggcg attgtgggcc agtactttct 1020aatggagcta
atggagtgaa aggattctca tttaggtatg gcaatggcgt ttggatagga
1080agaactaaaa gcatcagctc tagaagtgga tttgagatga tttgggatcc
gaatggatgg 1140acggaaaccg atagtagttt ctcgataaag caggatgtta
tagcattaac tgattggtca 1200ggatacagtg ggaactttgt ccaacatccc
gaattaacag gaatgaactg cataaagcct 1260tgtttctggg tagagttaat
cagaggacag cccaaggaga gaacaatctg gactagtgga 1320agcagcattt
ctttctgtgg tgtagacagt gaaaccgcaa gctggtcatg gccagacgga
1380gctgatctgc cattcactat tgacaagtag 141060469PRTArtificial
sequenceVirus 60Met Asn Thr Asn Gln Lys Ile Ile Thr Ile Gly Thr Ala
Cys Leu Ile1 5 10 15Val Gly Ile Ile Ser Leu Leu Leu Gln Ile Gly Asp
Ile Val Ser Leu 20 25 30Trp Ile Ser His Ser Ile Gln Thr Gly Glu Lys
Asn His Ser Gln Ile 35 40 45Cys Ser Gln Ser Val Ile Thr Tyr Glu Asn
Asn Thr Trp Val Asn Gln 50 55 60Thr Tyr Val Asn Ile Gly Asn Thr Asn
Ile Ala Asp Gly Gln Gly Val65 70 75 80Asn Ser Ile Ile Leu Ala Gly
Asn Ser Ser Leu Cys Pro Val Ser Gly 85 90 95Trp Ala Ile Tyr Ser Lys
Asp Asn Ser Ile Arg Ile Gly Ser Lys Gly 100 105 110Asp Ile Phe Val
Ile Arg Glu Leu Phe Ile Ser Cys Ser His Leu Glu 115 120 125Cys Arg
Thr Phe Tyr Leu Thr Gln Gly Ala Leu Leu Asn Asp Lys His 130 135
140Ser Asn Gly Thr Val Lys Asp Arg Ser Pro Tyr Arg Thr Leu Met
Ser145 150 155 160Cys Pro Ile Gly Glu Ala Pro Ser Pro Tyr Asn Ser
Arg Phe Glu Ser 165 170 175Val Ala Trp Ser Ala Ser Ala Cys His Asp
Gly Met Gly Trp Leu Thr 180 185 190Ile Gly Ile Ser Gly Pro Asp Asn
Gly Ala Val Ala Val Leu Lys Tyr 195 200 205Asn Gly Ile Ile Thr Asp
Thr Ile Lys Ser Trp Arg Asn Lys Ile Leu 210 215 220Arg Thr Gln Glu
Ser Glu Cys Val Cys Ile Asn Gly Ser Cys Phe Thr225 230 235 240Ile
Met Thr Asp Gly Pro Ser Asn Gly Gln Ala Ser Tyr Lys Ile Phe 245 250
255Lys Met Lys Lys Gly Lys Ile Ile Lys Ser Val Glu Met Asn Ala Pro
260 265 270Asn Tyr His Tyr Glu Glu Cys Ser Cys Tyr Pro Asp Thr Gly
Lys Val 275 280 285Val Cys Val Cys Arg Asp Asn Trp His Ala Ser Asn
Arg Pro Trp Val 290 295 300Ser Phe Asp Gln Asn Leu Asn Tyr Gln Ile
Gly Tyr Ile Cys Ser Gly305 310 315 320Val Phe Gly Asp Asn Pro Arg
Ser Asn Asp Gly Arg Gly Asp Cys Gly 325 330 335Pro Val Leu Ser Asn
Gly Ala Asn Gly Val Lys Gly Phe Ser Phe Arg 340 345 350Tyr Gly Asn
Gly Val Trp Ile Gly Arg Thr Lys Ser Ile Ser Ser Arg 355 360 365Ser
Gly Phe Glu Met Ile Trp Asp Pro Asn Gly Trp Thr Glu Thr Asp 370 375
380Ser Ser Phe Ser Ile Lys Gln Asp Val Ile Ala Leu Thr Asp Trp
Ser385 390 395 400Gly Tyr Ser Gly Asn Phe Val Gln His Pro Glu Leu
Thr Gly Met Asn 405 410 415Cys Ile Lys Pro Cys Phe Trp Val Glu Leu
Ile Arg Gly Gln Pro Lys 420 425 430Glu Arg Thr Ile Trp Thr Ser Gly
Ser Ser Ile Ser Phe Cys Gly Val 435 440 445Asp Ser Glu Thr Ala Ser
Trp Ser Trp Pro Asp Gly Ala Asp Leu Pro 450 455 460Phe Thr Ile Asp
Lys465611410DNAArtificial sequenceVirus 61atgaatacga atcaaaaaat
catcacgatc ggaacggctt gtctgatcgt cggaatcatc 60tcgctgctgc tgcagatcgg
agatatcgtc tcgctgtgga tctcgcattc gatccagacg 120ggagaaaaaa
accactcgca gatctgctcg caatcggtca tcacgtatga aaacaacacg
180tgggtgaacc aaacgtatgt caacatcgga aatacgaata tcgctgatgg
acagggagtc 240aattcgatca tcctggctgg aaattcgtcg ctgtgccccg
tctcgggatg ggctatctac 300tcgaaagaca attcgatcag aatcggatcg
aaaggagaca tcttcgtcat cagagaactg 360ttcatctcgt gctcgcatct
ggaatgcaga acgttctatc tgacgcaagg agctctgctg 420aatgacaagc
attcgaatgg aacggtcaaa gacagatcgc cttatagaac gctgatgtcg
480tgccccatcg gagaagctcc ttcgccctac aattcgagat ttgaatcggt
cgcttggtcg 540gcttcggctt gccatgacgg aatgggatgg ctgacgatcg
gaatctcggg acccgataat 600ggagctgtgg ctgtcctgaa atacaatgga
atcatcacgg atacgatcaa atcgtggaga 660aacaaaatcc tgagaacgca
agaatcggaa tgtgtctgta tcaacggatc gtgttttacg 720atcatgacgg
atggaccctc gaatggacag gcttcgtaca aaatctttaa aatgaagaaa
780ggaaaaatca tcaaatcggt ggaaatgaat gctcctaatt accactatga
agaatgctcg 840tgttaccctg atacgggaaa agtggtgtgc gtgtgcagag
acaattggca tgcttcgaat 900agaccctggg tctcgtttga tcagaacctg
aattatcaga tcggatacat ctgttcggga 960gtctttggag ataacccccg
ttcgaatgat ggaagaggag attgtggacc cgtcctgtcg 1020aatggagcta
atggagtgaa aggattttcg ttcagatatg gaaatggagt ctggatcgga
1080agaacgaaat cgatctcgtc gagatcggga ttcgaaatga tctgggatcc
caatggatgg 1140acggaaacgg attcgtcgtt ttcgatcaag caggatgtca
tcgctctgac ggattggtcg 1200ggatactcgg gaaacttcgt ccaacatccc
gaactgacgg gaatgaactg catcaagcct 1260tgtttttggg tcgaactgat
cagaggacag cccaaggaaa gaacgatctg gacgtcggga 1320tcgtcgatct
cgttttgtgg agtcgactcg gaaacggctt cgtggtcgtg gcccgacgga
1380gctgatctgc cctttacgat cgacaagtag 141062576DNAArtificial
sequenceVirus 62taccaagtgc gcaattcctc ggggctttac catgtcacca
atgattgccc taactcgagt 60attgtgtacg aggcggccga tgccatcctg cacactccgg
ggtgtgtccc ttgcgttcgc 120gagggtaacg cctcgaggtg ttgggtggcg
gtgaccccca cggtggccac cagggacggc 180aaactcccca caacgcagct
tcgacgtcat atcgatctgc ttgtcgggag cgccaccctc 240tgctcggccc
tctacgtggg ggacctgtgc gggtctgtct ttcttgttgg tcaactgttt
300accttctctc ccaggcgcca ctggacgacg caagactgca attgttctat
ctatcccggc 360catataacgg gtcatcgcat ggcatgggat atgatgatga
actggtcccc tacggcagcg 420ttggtggtag ctcagctgct ccggatccca
caagccatca tggacatgat cgctggtgct 480cactggggag tcctggcggg
catagcgtat ttctccatgg tggggaactg ggcgaaggtc 540ctggtagtgc
tgctgctatt tgccggcgtc gacgcg 57663192PRTArtificial sequenceVirus
63Tyr Gln Val Arg Asn Ser Ser Gly Leu Tyr His Val Thr Asn Asp Cys1
5 10 15Pro Asn Ser Ser Ile Val Tyr Glu Ala Ala Asp Ala Ile Leu His
Thr 20 25 30Pro Gly Cys Val Pro Cys Val Arg Glu Gly Asn Ala Ser Arg
Cys Trp 35 40 45Val Ala Val Thr Pro Thr Val Ala Thr Arg Asp Gly Lys
Leu Pro Thr 50 55 60Thr Gln Leu Arg Arg His Ile Asp Leu Leu Val Gly
Ser Ala Thr Leu65 70 75 80Cys Ser Ala Leu Tyr Val Gly Asp Leu Cys
Gly Ser Val Phe Leu Val 85 90 95Gly Gln Leu Phe Thr Phe Ser Pro Arg
Arg His Trp Thr Thr Gln Asp 100 105 110Cys Asn Cys Ser Ile Tyr Pro
Gly His Ile Thr Gly His Arg Met Ala 115 120 125Trp Asp Met Met Met
Asn Trp Ser Pro Thr Ala Ala Leu Val Val Ala 130 135 140Gln Leu Leu
Arg Ile Pro Gln Ala Ile Met Asp Met Ile Ala Gly Ala145 150 155
160His Trp Gly Val Leu Ala Gly Ile Ala Tyr Phe Ser Met Val Gly Asn
165 170 175Trp Ala Lys Val Leu Val Val Leu Leu Leu Phe Ala Gly Val
Asp Ala 180 185 19064576DNAArtificial sequenceVirus 64taccaagtgc
gcaattcgtc gggactgtac catgtcacga atgattgccc taactcgtcg 60atcgtgtacg
aagctgctga tgctatcctg cacacgcccg gatgtgtccc ttgcgtccgc
120gaaggaaacg cttcgagatg ttgggtggct gtgacgccca cggtggctac
gagagacgga 180aaactgccca cgacgcagct gcgacgtcat atcgatctgc
tggtcggatc ggctacgctg 240tgctcggctc tgtacgtggg agacctgtgc
ggatcggtct tcctggtcgg acaactgttc 300acgttttcgc ccagacgcca
ctggacgacg caagactgca attgttcgat ctatcccgga 360catatcacgg
gacatcgcat ggcttgggat atgatgatga actggtcgcc tacggctgct
420ctggtggtcg ctcagctgct gcgaatcccc caagctatca tggacatgat
cgctggagct 480cactggggag tcctggctgg aatcgcttat ttttcgatgg
tgggaaactg ggctaaggtc 540ctggtcgtgc tgctgctgtt cgctggagtc gacgct
576651089DNAArtificial sequenceVirus 65gaaacccacg tcaccggggg
aagtgccggc cgcaccacgg ctgggcttgt tggtctcctt 60acaccaggcg ccaagcagaa
catccaactg atcaacacca acggcagttg gcacatcaat 120agcacggcct
tgaactgcaa tgaaagcctt aacaccggct ggttagcagg gctcttctat
180cagcacaaat tcaactcttc aggctgtcct gagaggttgg ccagctgccg
acgccttacc 240gattttgccc agggctgggg tcctatcagt tatgccaacg
gaagcggcct cgacgaacgc 300ccctactgct ggcactaccc tccaagacct
tgtggcattg tgcccgcaaa gagcgtgtgt 360ggcccggtat attgcttcac
tcccagcccc gtggtggtgg gaacgaccga caggtcgggc 420gcgcctacct
acagctgggg tgcaaatgat acggatgtct tcgtccttaa caacaccagg
480ccaccgctgg gcaattggtt cggttgtacc tggatgaact caactggatt
caccaaagtg 540tgcggagcgc ccccttgtgt catcggaggg gtgggcaaca
acaccttgct ctgccccact 600gattgtttcc gcaagcatcc ggaagccaca
tactctcggt gcggctccgg tccctggatt 660acacccaggt gcatggtcga
ctacccgtat aggctttggc actatccttg taccatcaat 720tacaccatat
tcaaagtcag gatgtacgtg ggaggggtcg agcacaggct ggaagcggcc
780tgcaactgga cgcggggcga acgctgtgat ctggaagaca gggacaggtc
cgagctcagc 840ccattgctgc tgtccaccac acagtggcag gtccttccgt
gttctttcac gaccctgcca 900gccttgtcca ccggcctcat ccacctccac
cagaacattg tggacgtgca gtacttgtac 960ggggtagggt caagcatcgc
gtcctgggcc attaagtggg agtacgtcgt tctcctgttc 1020ctcctgcttg
cagacgcgcg cgtctgctcc tgcttgtgga tgatgttact catatcccaa
1080gcggaggcg 108966363PRTArtificial sequenceVirus 66Glu Thr His
Val Thr Gly Gly Ser Ala Gly Arg Thr Thr Ala Gly Leu1 5 10 15Val Gly
Leu Leu Thr Pro Gly Ala Lys Gln Asn Ile Gln Leu Ile Asn 20 25 30Thr
Asn Gly Ser Trp His Ile Asn Ser Thr Ala Leu Asn Cys Asn Glu 35 40
45Ser Leu Asn Thr Gly Trp Leu Ala Gly Leu Phe Tyr Gln His Lys Phe
50 55 60Asn Ser Ser Gly Cys Pro Glu Arg Leu Ala Ser Cys Arg Arg Leu
Thr65 70 75 80Asp Phe Ala Gln Gly Trp Gly Pro Ile Ser Tyr Ala Asn
Gly Ser Gly 85 90 95Leu Asp Glu Arg Pro Tyr Cys Trp His Tyr Pro Pro
Arg Pro Cys Gly 100 105 110Ile Val Pro Ala Lys Ser Val Cys Gly Pro
Val Tyr Cys Phe Thr Pro 115 120 125Ser Pro Val Val Val Gly Thr Thr
Asp Arg Ser Gly Ala Pro Thr Tyr 130 135 140Ser Trp Gly Ala Asn Asp
Thr Asp Val Phe Val Leu Asn Asn Thr Arg145 150 155 160Pro Pro Leu
Gly Asn Trp Phe Gly Cys Thr Trp Met Asn Ser Thr Gly 165 170 175Phe
Thr Lys Val Cys Gly Ala Pro Pro Cys Val Ile Gly Gly Val Gly 180 185
190Asn Asn Thr Leu Leu Cys Pro Thr Asp Cys Phe Arg Lys His Pro Glu
195 200 205Ala Thr Tyr Ser Arg Cys Gly Ser Gly Pro Trp Ile Thr Pro
Arg Cys 210 215 220Met Val Asp Tyr Pro Tyr Arg Leu Trp His Tyr Pro
Cys Thr Ile Asn225 230 235 240Tyr Thr Ile Phe Lys Val Arg Met Tyr
Val Gly Gly Val Glu His Arg 245 250 255Leu Glu Ala Ala Cys Asn Trp
Thr Arg Gly Glu Arg Cys Asp Leu Glu 260 265 270Asp Arg Asp Arg Ser
Glu Leu Ser Pro Leu Leu Leu Ser Thr Thr Gln 275 280 285Trp Gln Val
Leu Pro Cys Ser Phe Thr Thr Leu Pro Ala Leu Ser Thr 290 295 300Gly
Leu Ile His Leu His Gln Asn Ile Val Asp Val Gln Tyr Leu Tyr305 310
315 320Gly Val Gly Ser Ser Ile Ala Ser Trp Ala Ile Lys Trp Glu Tyr
Val 325 330 335Val Leu Leu Phe Leu Leu Leu Ala Asp Ala Arg Val Cys
Ser Cys Leu 340 345 350Trp Met Met Leu Leu Ile Ser Gln Ala Glu Ala
355 360671089DNAArtificial sequenceVirus 67gaaacgcacg tcacgggagg
atcggctgga cgcacgacgg ctggactggt cggactgctg 60acgcccggag ctaagcagaa
catccaactg atcaacacga acggatcgtg gcacatcaat 120tcgacggctc
tgaactgcaa tgaatcgctg aacacgggat ggctggctgg actgttttat
180cagcacaaat ttaactcgtc gggatgtcct gaaagactgg cttcgtgccg
acgcctgacg 240gatttcgctc agggatgggg acctatctcg tatgctaacg
gatcgggact ggacgaacgc 300ccctactgct ggcactaccc tcccagacct
tgtggaatcg tgcccgctaa gtcggtgtgt 360ggacccgtct attgctttac
gccctcgccc gtggtggtgg gaacgacgga cagatcggga 420gctcctacgt
actcgtgggg agctaatgat acggatgtct ttgtcctgaa caacacgaga
480ccccccctgg gaaattggtt tggatgtacg tggatgaact cgacgggatt
tacgaaagtg 540tgcggagctc ccccttgtgt catcggagga gtgggaaaca
acacgctgct gtgccccacg 600gattgttttc gcaagcatcc cgaagctacg
tactcgcgat gcggatcggg accctggatc 660acgcccagat gcatggtcga
ctacccctat agactgtggc actatccttg tacgatcaat 720tacacgatct
ttaaagtcag aatgtacgtg ggaggagtcg aacacagact ggaagctgct
780tgcaactgga cgcgaggaga acgctgtgat ctggaagaca gagacagatc
ggaactgtcg 840cccctgctgc tgtcgacgac gcagtggcag gtcctgccct
gttcgtttac gacgctgccc 900gctctgtcga cgggactgat ccacctgcac
cagaacatcg tggacgtgca gtacctgtac 960ggagtcggat cgtcgatcgc
ttcgtgggct atcaagtggg aatacgtcgt cctgctgttt 1020ctgctgctgg
ctgacgctcg cgtctgctcg tgcctgtgga tgatgctgct gatctcgcaa
1080gctgaagct 1089682724DNAArtificial sequenceVirus 68atggaggcag
ccttgcttgt gtgtcagtac accatccaga gcctgatcca tctcacgggt 60gaagatcctg
gttttttcaa tgttgagatt ccggaattcc cattttaccc cacatgcaat
120gtttgcacgg cagatgtcaa tgtaactatc aatttcgatg tcgggggcaa
aaagcatcaa 180cttgatcttg actttggcca gctgacaccc catacgaagg
ctgtctacca acctcgaggt 240gcatttggtg gctcagaaaa tgccaccaat
ctctttctac tggagctcct tggtgcagga 300gaattggctc taactatgcg
gtctaagaag cttccaatta acgtcaccac cggagaggag 360caacaagtaa
gcctggaatc tgtagatgtc tactttcaag atgtgtttgg aaccatgtgg
420tgccaccatg cagaaatgca aaaccccgtg tacctgatac cagaaacagt
gccatacata 480aagtgggata actgtaattc taccaatata acggcagtag
tgagggcaca ggggctggat 540gtcacgctac ccttaagttt gccaacgtca
gctcaagact cgaatttcag cgtaaaaaca 600gaaatgctcg gtaatgagat
agatattgag tgtattatgg aggatggcga aatttcacaa 660gttctgcccg
gagacaacaa atttaacatc acctgcagtg gatacgagag ccatgttccc
720agcggcggaa ttctcacatc aacgagtccc gtggccaccc caatacctgg
tacagggtat 780gcatacagcc tgcgtctgac accacgtcca gtgtcacgat
ttcttggcaa taacagtatc 840ctgtacgtgt tttactctgg gaatggaccg
aaggcgagcg ggggagatta ctgcattcag 900tccaacattg tgttctctga
tgagattcca gcttcacagg acatgccgac aaacaccaca 960gacatcacat
atgtgggtga caatgctacc tattcagtgc caatggtcac ttctgaggac
1020gcaaactcgc caaatgttac agtgactgcc ttttgggcct ggccaaacaa
cactgaaact 1080gactttaagt gcaaatggac tctcacctcg gggacacctt
cgggttgtga aaatatttct 1140ggtgcatttg cgagcaatcg gacatttgac
attactgtct cgggtcttgg cacggccccc 1200aagacactca ttatcacacg
aacggctacc aatgccacca caacaaccca caaggttata 1260ttctccaagg
cacccgagag caccaccacc tcccctacct tgaatacaac tggatttgct
1320gatcccaata caacgacagg tctacccagc tctactcacg tgcctaccaa
cctcaccgca 1380cctgcaagca caggccccac tgtatccacc gcggatgtca
ccagcccaac accagccggc 1440acaacgtcag gcgcatcacc ggtgacacca
agtccatctc catgggacaa cggcacagaa 1500agtaaggccc ccgacatgac
cagctccacc tcaccagtga ctaccccaac cccaaatgcc 1560accagcccca
ccccagcagt gactacccca accccaaatg ccaccagccc caccccagca
1620gtgactaccc caaccccaaa tgccaccagc cccaccttgg gaaaaacaag
tcctacctca 1680gcagtgacta ccccaacccc aaatgccacc agccccacct
tgggaaaaac aagccccacc 1740tcagcagtga ctaccccaac cccaaatgcc
accagcccca ccttgggaaa aacaagcccc 1800acctcagcag tgactacccc
aaccccaaat gccaccggcc ctactgtggg agaaacaagt 1860ccacaggcaa
atgccaccaa ccacacctta ggaggaacaa gtcccacccc agtagttacc
1920agccaaccaa aaaatgcaac cagtgctgtt accacaggcc aacataacat
aacttcaagt 1980tcaacctctt ccatgtcact gagacccagt tcaaacccag
agacactcag cccctccacc 2040agtgacaatt caacgtcaca tatgccttta
ctaacctccg ctcacccaac aggtggtgaa 2100aatataacac aggtgacacc
agcctctatc agcacacatc atgtgtccac cagttcgcca 2160gcaccccgcc
caggcaccac cagccaagcg tcaggccctg gaaacagttc cacatccaca
2220aaaccggggg aggttaatgt caccaaaggc acgccccccc aaaatgcaac
gtcgccccag 2280gcccccagtg gccaaaagac ggcggttccc acggtcacct
caacaggtgg aaaggccaat 2340tctaccaccg gtggaaagca caccacagga
catggagccc ggacaagtac agagcccacc 2400acagattacg gcggtgattc
aactacgcca agaccgagat acaatgcgac cacctatcta 2460cctcccagca
cttctagcaa actgcggccc cgctggactt ttacgagccc accggttacc
2520acagcccaag ccaccgtgcc agtcccgcca acgtcccagc ccagattctc
aaacctctcc 2580atgctagtac tgcagtgggc ctctctggct gtgctgaccc
ttctgctgct gctggtcatg 2640gcggactgcg cctttaggcg taacttgtct
acatcccata cctacaccac cccaccatat 2700gatgacgccg agacctatgt ataa
2724
69 907 PRT Artificial sequence Virus
69 Met Glu Ala Ala Leu Leu Val
Cys Gln Tyr Thr Ile Gln Ser Leu Ile1 5 10 15His Leu Thr Gly Glu Asp
Pro Gly Phe Phe Asn Val Glu Ile Pro Glu 20 25 30Phe Pro Phe Tyr Pro
Thr Cys Asn Val Cys Thr Ala Asp Val Asn Val 35 40 45Thr Ile Asn Phe
Asp Val Gly Gly Lys Lys His Gln Leu Asp Leu Asp 50 55 60Phe Gly Gln
Leu Thr Pro His Thr Lys Ala Val Tyr Gln Pro Arg Gly65 70 75 80Ala
Phe Gly Gly Ser Glu Asn Ala Thr Asn Leu Phe Leu Leu Glu Leu 85 90
95Leu Gly Ala Gly Glu Leu Ala Leu Thr Met Arg Ser Lys Lys Leu Pro
100 105 110Ile Asn Val Thr Thr Gly Glu Glu Gln Gln Val Ser Leu Glu
Ser Val 115 120 125Asp Val Tyr Phe Gln Asp Val Phe Gly Thr Met Trp
Cys His His Ala 130 135 140Glu Met Gln Asn Pro Val Tyr Leu Ile Pro
Glu Thr Val Pro Tyr Ile145 150 155 160Lys Trp Asp Asn Cys Asn Ser
Thr Asn Ile Thr Ala Val Val Arg Ala 165 170 175Gln Gly Leu Asp Val
Thr Leu Pro Leu Ser Leu Pro Thr Ser Ala Gln 180 185 190Asp Ser Asn
Phe Ser Val Lys Thr Glu Met Leu Gly Asn Glu Ile Asp 195 200 205Ile
Glu Cys Ile Met Glu Asp Gly Glu Ile Ser Gln Val Leu Pro Gly 210 215
220Asp Asn Lys Phe Asn Ile Thr Cys Ser Gly Tyr Glu Ser His Val
Pro225 230 235 240Ser Gly Gly Ile Leu Thr Ser Thr Ser Pro Val Ala
Thr Pro Ile Pro 245 250 255Gly Thr Gly Tyr Ala Tyr Ser Leu Arg Leu
Thr Pro Arg Pro Val Ser 260 265 270Arg Phe Leu Gly Asn Asn Ser Ile
Leu Tyr Val Phe Tyr Ser Gly Asn 275 280 285Gly Pro Lys Ala Ser Gly
Gly Asp Tyr Cys Ile Gln Ser Asn Ile Val 290
295 300Phe Ser Asp Glu Ile Pro Ala Ser Gln Asp Met Pro Thr Asn Thr
Thr305 310 315 320Asp Ile Thr Tyr Val Gly Asp Asn Ala Thr Tyr Ser
Val Pro Met Val 325 330 335Thr Ser Glu Asp Ala Asn Ser Pro Asn Val
Thr Val Thr Ala Phe Trp 340 345 350Ala Trp Pro Asn Asn Thr Glu Thr
Asp Phe Lys Cys Lys Trp Thr Leu 355 360 365Thr Ser Gly Thr Pro Ser
Gly Cys Glu Asn Ile Ser Gly Ala Phe Ala 370 375 380Ser Asn Arg Thr
Phe Asp Ile Thr Val Ser Gly Leu Gly Thr Ala Pro385 390 395 400Lys
Thr Leu Ile Ile Thr Arg Thr Ala Thr Asn Ala Thr Thr Thr Thr 405 410
415His Lys Val Ile Phe Ser Lys Ala Pro Glu Ser Thr Thr Thr Ser Pro
420 425 430Thr Leu Asn Thr Thr Gly Phe Ala Asp Pro Asn Thr Thr Thr
Gly Leu 435 440 445Pro Ser Ser Thr His Val Pro Thr Asn Leu Thr Ala
Pro Ala Ser Thr 450 455 460Gly Pro Thr Val Ser Thr Ala Asp Val Thr
Ser Pro Thr Pro Ala Gly465 470 475 480Thr Thr Ser Gly Ala Ser Pro
Val Thr Pro Ser Pro Ser Pro Trp Asp 485 490 495Asn Gly Thr Glu Ser
Lys Ala Pro Asp Met Thr Ser Ser Thr Ser Pro 500 505 510Val Thr Thr
Pro Thr Pro Asn Ala Thr Ser Pro Thr Pro Ala Val Thr 515 520 525Thr
Pro Thr Pro Asn Ala Thr Ser Pro Thr Pro Ala Val Thr Thr Pro 530 535
540Thr Pro Asn Ala Thr Ser Pro Thr Leu Gly Lys Thr Ser Pro Thr
Ser545 550 555 560Ala Val Thr Thr Pro Thr Pro Asn Ala Thr Ser Pro
Thr Leu Gly Lys 565 570 575Thr Ser Pro Thr Ser Ala Val Thr Thr Pro
Thr Pro Asn Ala Thr Ser 580 585 590Pro Thr Leu Gly Lys Thr Ser Pro
Thr Ser Ala Val Thr Thr Pro Thr 595 600 605Pro Asn Ala Thr Gly Pro
Thr Val Gly Glu Thr Ser Pro Gln Ala Asn 610 615 620Ala Thr Asn His
Thr Leu Gly Gly Thr Ser Pro Thr Pro Val Val Thr625 630 635 640Ser
Gln Pro Lys Asn Ala Thr Ser Ala Val Thr Thr Gly Gln His Asn 645 650
655Ile Thr Ser Ser Ser Thr Ser Ser Met Ser Leu Arg Pro Ser Ser Asn
660 665 670Pro Glu Thr Leu Ser Pro Ser Thr Ser Asp Asn Ser Thr Ser
His Met 675 680 685Pro Leu Leu Thr Ser Ala His Pro Thr Gly Gly Glu
Asn Ile Thr Gln 690 695 700Val Thr Pro Ala Ser Ile Ser Thr His His
Val Ser Thr Ser Ser Pro705 710 715 720Ala Pro Arg Pro Gly Thr Thr
Ser Gln Ala Ser Gly Pro Gly Asn Ser 725 730 735Ser Thr Ser Thr Lys
Pro Gly Glu Val Asn Val Thr Lys Gly Thr Pro 740 745 750Pro Gln Asn
Ala Thr Ser Pro Gln Ala Pro Ser Gly Gln Lys Thr Ala 755 760 765Val
Pro Thr Val Thr Ser Thr Gly Gly Lys Ala Asn Ser Thr Thr Gly 770 775
780Gly Lys His Thr Thr Gly His Gly Ala Arg Thr Ser Thr Glu Pro
Thr785 790 795 800Thr Asp Tyr Gly Gly Asp Ser Thr Thr Pro Arg Pro
Arg Tyr Asn Ala 805 810 815Thr Thr Tyr Leu Pro Pro Ser Thr Ser Ser
Lys Leu Arg Pro Arg Trp 820 825 830Thr Phe Thr Ser Pro Pro Val Thr
Thr Ala Gln Ala Thr Val Pro Val 835 840 845Pro Pro Thr Ser Gln Pro
Arg Phe Ser Asn Leu Ser Met Leu Val Leu 850 855 860Gln Trp Ala Ser
Leu Ala Val Leu Thr Leu Leu Leu Leu Leu Val Met865 870 875 880Ala
Asp Cys Ala Phe Arg Arg Asn Leu Ser Thr Ser His Thr Tyr Thr 885 890
895Thr Pro Pro Tyr Asp Asp Ala Glu Thr Tyr Val 900
905
70 2724 DNA Artificial sequence Virus
70 atggaagctg ctctgctggt
gtgtcagtac acgatccagt cgctgatcca tctgacggga 60gaagatcctg gattctttaa
tgtcgaaatc cccgaatttc ccttctaccc cacgtgcaat 120gtctgcacgg
ctgatgtcaa tgtcacgatc aattttgatg tcggaggaaa aaagcatcaa
180ctggatctgg acttcggaca gctgacgccc catacgaagg ctgtctacca
acctcgagga 240gctttcggag gatcggaaaa tgctacgaat ctgttcctgc
tggaactgct gggagctgga 300gaactggctc tgacgatgcg atcgaagaag
ctgcccatca acgtcacgac gggagaagaa 360caacaagtct cgctggaatc
ggtcgatgtc tacttccaag atgtgttcgg aacgatgtgg 420tgccaccatg
ctgaaatgca aaaccccgtg tacctgatcc ccgaaacggt gccctacatc
480aagtgggata actgtaattc gacgaatatc acggctgtcg tgagagctca
gggactggat 540gtcacgctgc ccctgtcgct gcccacgtcg gctcaagact
cgaatttttc ggtcaaaacg 600gaaatgctgg gaaatgaaat cgatatcgaa
tgtatcatgg aagatggaga aatctcgcaa 660gtcctgcccg gagacaacaa
attcaacatc acgtgctcgg gatacgaatc gcatgtcccc 720tcgggaggaa
tcctgacgtc gacgtcgccc gtggctacgc ccatccctgg aacgggatat
780gcttactcgc tgcgtctgac gccccgtccc gtgtcgcgat tcctgggaaa
taactcgatc 840ctgtacgtgt tctactcggg aaatggaccc aaggcttcgg
gaggagatta ctgcatccag 900tcgaacatcg tgttttcgga tgaaatcccc
gcttcgcagg acatgcccac gaacacgacg 960gacatcacgt atgtgggaga
caatgctacg tattcggtgc ccatggtcac gtcggaagac 1020gctaactcgc
ccaatgtcac ggtgacggct ttctgggctt ggcccaacaa cacggaaacg
1080gacttcaagt gcaaatggac gctgacgtcg ggaacgcctt cgggatgtga
aaatatctcg 1140ggagctttcg cttcgaatcg aacgttcgac atcacggtct
cgggactggg aacggctccc 1200aagacgctga tcatcacgcg aacggctacg
aatgctacga cgacgacgca caaggtcatc 1260ttttcgaagg ctcccgaatc
gacgacgacg tcgcctacgc tgaatacgac gggattcgct 1320gatcccaata
cgacgacggg actgccctcg tcgacgcacg tgcctacgaa cctgacggct
1380cctgcttcga cgggacccac ggtctcgacg gctgatgtca cgtcgcccac
gcccgctgga 1440acgacgtcgg gagcttcgcc cgtgacgccc tcgccctcgc
cctgggacaa cggaacggaa 1500tcgaaggctc ccgacatgac gtcgtcgacg
tcgcccgtga cgacgcccac gcccaatgct 1560acgtcgccca cgcccgctgt
gacgacgccc acgcccaatg ctacgtcgcc cacgcccgct 1620gtgacgacgc
ccacgcccaa tgctacgtcg cccacgctgg gaaaaacgtc gcctacgtcg
1680gctgtgacga cgcccacgcc caatgctacg tcgcccacgc tgggaaaaac
gtcgcccacg 1740tcggctgtga cgacgcccac gcccaatgct acgtcgccca
cgctgggaaa aacgtcgccc 1800acgtcggctg tgacgacgcc cacgcccaat
gctacgggac ctacggtggg agaaacgtcg 1860ccccaggcta atgctacgaa
ccacacgctg ggaggaacgt cgcccacgcc cgtcgtcacg 1920tcgcaaccca
aaaatgctac gtcggctgtc acgacgggac aacataacat cacgtcgtcg
1980tcgacgtcgt cgatgtcgct gagaccctcg tcgaaccccg aaacgctgtc
gccctcgacg 2040tcggacaatt cgacgtcgca tatgcctctg ctgacgtcgg
ctcaccccac gggaggagaa 2100aatatcacgc aggtgacgcc cgcttcgatc
tcgacgcatc atgtgtcgac gtcgtcgccc 2160gctccccgcc ccggaacgac
gtcgcaagct tcgggacctg gaaactcgtc gacgtcgacg 2220aaacccggag
aagtcaatgt cacgaaagga acgccccccc aaaatgctac gtcgccccag
2280gctccctcgg gacaaaagac ggctgtcccc acggtcacgt cgacgggagg
aaaggctaat 2340tcgacgacgg gaggaaagca cacgacggga catggagctc
gaacgtcgac ggaacccacg 2400acggattacg gaggagattc gacgacgccc
agacccagat acaatgctac gacgtatctg 2460cctccctcga cgtcgtcgaa
actgcgaccc cgctggacgt tcacgtcgcc ccccgtcacg 2520acggctcaag
ctacggtgcc cgtccccccc acgtcgcagc ccagattttc gaacctgtcg
2580atgctggtcc tgcagtgggc ttcgctggct gtgctgacgc tgctgctgct
gctggtcatg 2640gctgactgcg ctttcagacg taacctgtcg acgtcgcata
cgtacacgac gcccccctat 2700gatgacgctg aaacgtatgt ctaa
2724
71 2661 DNA Artificial sequence Virus
71 atggaggcag ccttgcttgt
gtgtcagtac accatccaga gccttatcca actcacgcgt 60gatgatcctg gttttttcaa
tgttgagatt ctggaattcc cattttaccc agcgtgcaat 120gtttgcacgg
cagatgtcaa tgcaactatc aatttcgatg tcgggggcaa aaagcataaa
180cttaatcttg actttggcct gctgacaccc catacaaagg ctgtctacca
acctcgaggt 240gcatttggtg gctcagaaaa tgccaccaat ctctttctac
tggagctcct tggtgcagga 300gaattggctc taactatgcg gtctaagaag
cttccaatta acatcaccac cggagaggag 360caacaagtaa gcctggaatc
tgtagatgtc tactttcaag atgtgtttgg caccatgtgg 420tgccaccatg
cagaaatgca aaacccagta tacctaatac cagaaacagt gccatacata
480aagtgggata actgtaattc taccaatata acggcagtag taagggcaca
ggggctggat 540gtcacgctac ccttaagttt gccaacatca gctcaagact
cgaatttcag cgtaaaaaca 600gaaatgctcg gtaatgagat agatattgag
tgtattatgg aggatggcga aatttcacaa 660gttctgcccg gagacaacaa
atttaacatc acctgcagtg gatacgagag ccatgttccc 720agcggcggaa
ttctcacatc aacgagtccc gtggccaccc caatacctgg tacagggtat
780gcatacagcc tgcgtctgac accacgtcca gtgtcacgat ttcttggcaa
taacagtata 840ctgtacgtgt tttactctgg gaatggaccg aaggcgagcg
ggggagatta ctgcattcag 900tccaacattg tgttctctga tgagattcca
gcttcacagg acatgccgac aaacaccaca 960gacatcacat atgtgggtga
caatgctacc tattcagtgc caatggtcac ttctgaggac 1020gcaaactcgc
caaatgttac agtgactgcc ttttgggcct ggccaaacaa cactgaaact
1080gactttaagt gcaaatggac tctcacctcg gggacacctt cgggttgtga
aaatatttct 1140ggtgcatttg cgagcaatcg gacatttgac attactgtct
cgggtcttgg cacggccccc 1200aagacactca ttatcacacg aacggctacc
aatgccacca caacaaccca caaggttata 1260ttctccaagg cacccgagag
caccaccacc tcccctacct tgaatacaac tggatttgct 1320gctcccaata
caacgacagg tctacccagc tctactcacg tgcctaccaa cctcaccgca
1380cctgcaagca caggccccac tgtatccacc gcggatgtca ccagcccaac
accagccggc 1440acaacgtcag gcgcatcacc ggtgacacca agtccatctc
cacgggacaa cggcacagaa 1500agtaaggccc ccgacatgac cagccccacc
tcagcagtga ctaccccaac cccaaatgcc 1560accagcccca ccccagcagt
gactacccca accccaaatg ccaccagccc caccttggga 1620aaaacaagtc
ccacctcagc agtgactacc ccaaccccaa atgccaccag ccccacccca
1680gcagtgacta ccccaacccc aaatgccacc atccccacct tgggaaaaac
aagtcccacc 1740tcagcagtga ctaccccaac cccaaatgcc accagcccta
ccgtgggaga aacaagtcca 1800caggcaaata ccaccaacca cacattagga
ggaacaagtt ccaccccagt agttaccagc 1860ccaccaaaaa atgcaaccag
tgctgttacc acaggccaac ataacataac ttcaagttca 1920acctcttcca
tgtcactgag acccagttca atctcagaga cactcagccc ctccaccagt
1980gacaattcaa cgtcacatat gcctttacta acctccgctc acccaacagg
tggtgaaaat 2040ataacacagg tgacaccagc ctctaccagc acacatcatg
tgtccaccag ttcgccagcg 2100ccccgcccag gcaccaccag ccaagcgtca
ggccctggaa acagttccac atccacaaaa 2160ccgggggagg ttaatgtcac
caaaggcacg ccccccaaaa atgcaacgtc gccccaggcc 2220cccagtggcc
aaaagacggc ggttcccacg gtcacctcaa caggtggaaa ggccaattct
2280accaccggtg gaaagcacac cacaggacat ggagcccgga caagtacaga
gcccaccaca 2340gattacggcg gtgattcaac tacgccaaga acgagataca
atgcgaccac ctatctacct 2400cccagcactt ctagcaaact gcggccccgc
tggactttta cgagcccacc ggttaccaca 2460gcccaagcca ccgtgcctgt
cccgccaacg tcccagccca gattctcaaa cctctccatg 2520ctagtactgc
agtgggcctc tctggctgtg ctgacccttc tgctgctgct ggtcatggcg
2580gactgcgcct tcaggcgtaa cttgtcgaca tcccatacct acaccacccc
accatatgat 2640gacgccgaga cctatgtata a 266172886PRTArtificial
sequenceVirus 72Met Glu Ala Ala Leu Leu Val Cys Gln Tyr Thr Ile Gln
Ser Leu Ile1 5 10 15Gln Leu Thr Arg Asp Asp Pro Gly Phe Phe Asn Val
Glu Ile Leu Glu 20 25 30Phe Pro Phe Tyr Pro Ala Cys Asn Val Cys Thr
Ala Asp Val Asn Ala 35 40 45Thr Ile Asn Phe Asp Val Gly Gly Lys Lys
His Lys Leu Asn Leu Asp 50 55 60Phe Gly Leu Leu Thr Pro His Thr Lys
Ala Val Tyr Gln Pro Arg Gly65 70 75 80Ala Phe Gly Gly Ser Glu Asn
Ala Thr Asn Leu Phe Leu Leu Glu Leu 85 90 95Leu Gly Ala Gly Glu Leu
Ala Leu Thr Met Arg Ser Lys Lys Leu Pro 100 105 110Ile Asn Ile Thr
Thr Gly Glu Glu Gln Gln Val Ser Leu Glu Ser Val 115 120 125Asp Val
Tyr Phe Gln Asp Val Phe Gly Thr Met Trp Cys His His Ala 130 135
140Glu Met Gln Asn Pro Val Tyr Leu Ile Pro Glu Thr Val Pro Tyr
Ile145 150 155 160Lys Trp Asp Asn Cys Asn Ser Thr Asn Ile Thr Ala
Val Val Arg Ala 165 170 175Gln Gly Leu Asp Val Thr Leu Pro Leu Ser
Leu Pro Thr Ser Ala Gln 180 185 190Asp Ser Asn Phe Ser Val Lys Thr
Glu Met Leu Gly Asn Glu Ile Asp 195 200 205Ile Glu Cys Ile Met Glu
Asp Gly Glu Ile Ser Gln Val Leu Pro Gly 210 215 220Asp Asn Lys Phe
Asn Ile Thr Cys Ser Gly Tyr Glu Ser His Val Pro225 230 235 240Ser
Gly Gly Ile Leu Thr Ser Thr Ser Pro Val Ala Thr Pro Ile Pro 245 250
255Gly Thr Gly Tyr Ala Tyr Ser Leu Arg Leu Thr Pro Arg Pro Val Ser
260 265 270Arg Phe Leu Gly Asn Asn Ser Ile Leu Tyr Val Phe Tyr Ser
Gly Asn 275 280 285Gly Pro Lys Ala Ser Gly Gly Asp Tyr Cys Ile Gln
Ser Asn Ile Val 290 295 300Phe Ser Asp Glu Ile Pro Ala Ser Gln Asp
Met Pro Thr Asn Thr Thr305 310 315 320Asp Ile Thr Tyr Val Gly Asp
Asn Ala Thr Tyr Ser Val Pro Met Val 325 330 335Thr Ser Glu Asp Ala
Asn Ser Pro Asn Val Thr Val Thr Ala Phe Trp 340 345 350Ala Trp Pro
Asn Asn Thr Glu Thr Asp Phe Lys Cys Lys Trp Thr Leu 355 360 365Thr
Ser Gly Thr Pro Ser Gly Cys Glu Asn Ile Ser Gly Ala Phe Ala 370 375
380Ser Asn Arg Thr Phe Asp Ile Thr Val Ser Gly Leu Gly Thr Ala
Pro385 390 395 400Lys Thr Leu Ile Ile Thr Arg Thr Ala Thr Asn Ala
Thr Thr Thr Thr 405 410 415His Lys Val Ile Phe Ser Lys Ala Pro Glu
Ser Thr Thr Thr Ser Pro 420 425 430Thr Leu Asn Thr Thr Gly Phe Ala
Ala Pro Asn Thr Thr Thr Gly Leu 435 440 445Pro Ser Ser Thr His Val
Pro Thr Asn Leu Thr Ala Pro Ala Ser Thr 450 455 460Gly Pro Thr Val
Ser Thr Ala Asp Val Thr Ser Pro Thr Pro Ala Gly465 470 475 480Thr
Thr Ser Gly Ala Ser Pro Val Thr Pro Ser Pro Ser Pro Arg Asp 485 490
495Asn Gly Thr Glu Ser Lys Ala Pro Asp Met Thr Ser Pro Thr Ser Ala
500 505 510Val Thr Thr Pro Thr Pro Asn Ala Thr Ser Pro Thr Pro Ala
Val Thr 515 520 525Thr Pro Thr Pro Asn Ala Thr Ser Pro Thr Leu Gly
Lys Thr Ser Pro 530 535 540Thr Ser Ala Val Thr Thr Pro Thr Pro Asn
Ala Thr Ser Pro Thr Pro545 550 555 560Ala Val Thr Thr Pro Thr Pro
Asn Ala Thr Ile Pro Thr Leu Gly Lys 565 570 575Thr Ser Pro Thr Ser
Ala Val Thr Thr Pro Thr Pro Asn Ala Thr Ser 580 585 590Pro Thr Val
Gly Glu Thr Ser Pro Gln Ala Asn Thr Thr Asn His Thr 595 600 605Leu
Gly Gly Thr Ser Ser Thr Pro Val Val Thr Ser Pro Pro Lys Asn 610 615
620Ala Thr Ser Ala Val Thr Thr Gly Gln His Asn Ile Thr Ser Ser
Ser625 630 635 640Thr Ser Ser Met Ser Leu Arg Pro Ser Ser Ile Ser
Glu Thr Leu Ser 645 650 655Pro Ser Thr Ser Asp Asn Ser Thr Ser His
Met Pro Leu Leu Thr Ser 660 665 670Ala His Pro Thr Gly Gly Glu Asn
Ile Thr Gln Val Thr Pro Ala Ser 675 680 685Thr Ser Thr His His Val
Ser Thr Ser Ser Pro Ala Pro Arg Pro Gly 690 695 700Thr Thr Ser Gln
Ala Ser Gly Pro Gly Asn Ser Ser Thr Ser Thr Lys705 710 715 720Pro
Gly Glu Val Asn Val Thr Lys Gly Thr Pro Pro Lys Asn Ala Thr 725 730
735Ser Pro Gln Ala Pro Ser Gly Gln Lys Thr Ala Val Pro Thr Val Thr
740 745 750Ser Thr Gly Gly Lys Ala Asn Ser Thr Thr Gly Gly Lys His
Thr Thr 755 760 765Gly His Gly Ala Arg Thr Ser Thr Glu Pro Thr Thr
Asp Tyr Gly Gly 770 775 780Asp Ser Thr Thr Pro Arg Thr Arg Tyr Asn
Ala Thr Thr Tyr Leu Pro785 790 795 800Pro Ser Thr Ser Ser Lys Leu
Arg Pro Arg Trp Thr Phe Thr Ser Pro 805 810 815Pro Val Thr Thr Ala
Gln Ala Thr Val Pro Val Pro Pro Thr Ser Gln 820 825 830Pro Arg Phe
Ser Asn Leu Ser Met Leu Val Leu Gln Trp Ala Ser Leu 835 840 845Ala
Val Leu Thr Leu Leu Leu Leu Leu Val Met Ala Asp Cys Ala Phe 850 855
860Arg Arg Asn Leu Ser Thr Ser His Thr Tyr Thr Thr Pro Pro Tyr
Asp865 870 875 880Asp Ala Glu Thr Tyr Val 885
73 2661 DNA Artificial sequence Virus
73 atggaagctg ctctgctggt gtgtcagtac acgatccagt
cgctgatcca actgacgcgt 60gatgatcctg gattctttaa tgtcgaaatc ctggaatttc
ccttctaccc cgcttgcaat 120gtctgcacgg ctgatgtcaa tgctacgatc
aattttgatg tcggaggaaa aaagcataaa 180ctgaatctgg acttcggact
gctgacgccc catacgaagg ctgtctacca acctcgagga 240gctttcggag
gatcggaaaa tgctacgaat ctgttcctgc tggaactgct gggagctgga
300gaactggctc tgacgatgcg atcgaagaag ctgcccatca acatcacgac
gggagaagaa 360caacaagtct
cgctggaatc ggtcgatgtc tacttccaag atgtgttcgg aacgatgtgg
420tgccaccatg ctgaaatgca aaaccccgtc tacctgatcc ccgaaacggt
gccctacatc 480aagtgggata actgtaattc gacgaatatc acggctgtcg
tcagagctca gggactggat 540gtcacgctgc ccctgtcgct gcccacgtcg
gctcaagact cgaatttttc ggtcaaaacg 600gaaatgctgg gaaatgaaat
cgatatcgaa tgtatcatgg aagatggaga aatctcgcaa 660gtcctgcccg
gagacaacaa attcaacatc acgtgctcgg gatacgaatc gcatgtcccc
720tcgggaggaa tcctgacgtc gacgtcgccc gtggctacgc ccatccctgg
aacgggatat 780gcttactcgc tgcgtctgac gccccgtccc gtgtcgcgat
tcctgggaaa taactcgatc 840ctgtacgtgt tctactcggg aaatggaccc
aaggcttcgg gaggagatta ctgcatccag 900tcgaacatcg tgttttcgga
tgaaatcccc gcttcgcagg acatgcccac gaacacgacg 960gacatcacgt
atgtgggaga caatgctacg tattcggtgc ccatggtcac gtcggaagac
1020gctaactcgc ccaatgtcac ggtgacggct ttctgggctt ggcccaacaa
cacggaaacg 1080gacttcaagt gcaaatggac gctgacgtcg ggaacgcctt
cgggatgtga aaatatctcg 1140ggagctttcg cttcgaatcg aacgttcgac
atcacggtct cgggactggg aacggctccc 1200aagacgctga tcatcacgcg
aacggctacg aatgctacga cgacgacgca caaggtcatc 1260ttttcgaagg
ctcccgaatc gacgacgacg tcgcctacgc tgaatacgac gggattcgct
1320gctcccaata cgacgacggg actgccctcg tcgacgcacg tgcctacgaa
cctgacggct 1380cctgcttcga cgggacccac ggtctcgacg gctgatgtca
cgtcgcccac gcccgctgga 1440acgacgtcgg gagcttcgcc cgtgacgccc
tcgccctcgc cccgagacaa cggaacggaa 1500tcgaaggctc ccgacatgac
gtcgcccacg tcggctgtga cgacgcccac gcccaatgct 1560acgtcgccca
cgcccgctgt gacgacgccc acgcccaatg ctacgtcgcc cacgctggga
1620aaaacgtcgc ccacgtcggc tgtgacgacg cccacgccca atgctacgtc
gcccacgccc 1680gctgtgacga cgcccacgcc caatgctacg atccccacgc
tgggaaaaac gtcgcccacg 1740tcggctgtga cgacgcccac gcccaatgct
acgtcgccta cggtgggaga aacgtcgccc 1800caggctaata cgacgaacca
cacgctggga ggaacgtcgt cgacgcccgt cgtcacgtcg 1860ccccccaaaa
atgctacgtc ggctgtcacg acgggacaac ataacatcac gtcgtcgtcg
1920acgtcgtcga tgtcgctgag accctcgtcg atctcggaaa cgctgtcgcc
ctcgacgtcg 1980gacaattcga cgtcgcatat gcctctgctg acgtcggctc
accccacggg aggagaaaat 2040atcacgcagg tgacgcccgc ttcgacgtcg
acgcatcatg tgtcgacgtc gtcgcccgct 2100ccccgccccg gaacgacgtc
gcaagcttcg ggacctggaa actcgtcgac gtcgacgaaa 2160cccggagaag
tcaatgtcac gaaaggaacg ccccccaaaa atgctacgtc gccccaggct
2220ccctcgggac aaaagacggc tgtccccacg gtcacgtcga cgggaggaaa
ggctaattcg 2280acgacgggag gaaagcacac gacgggacat ggagctcgaa
cgtcgacgga acccacgacg 2340gattacggag gagattcgac gacgcccaga
acgagataca atgctacgac gtatctgcct 2400ccctcgacgt cgtcgaaact
gcgaccccgc tggacgttca cgtcgccccc cgtcacgacg 2460gctcaagcta
cggtgcctgt cccccccacg tcgcagccca gattttcgaa cctgtcgatg
2520ctggtcctgc agtgggcttc gctggctgtg ctgacgctgc tgctgctgct
ggtcatggct 2580gactgcgctt ttagacgtaa cctgtcgacg tcgcatacgt
acacgacgcc cccctatgat 2640gacgctgaaa cgtatgtcta a
2661
74 2715 DNA Artificial sequence Virus
74 atgcgcgggg ggggcttgat
ttgcgcgctg gtcgtggggg cgctggtggc cgcggtggcg 60tcggcggccc cggcggcccc
ggcggccccc cgcgcctcgg gcggcgtggc cgcgaccgtc 120gcggcgaacg
ggggtcccgc ctcccggccg ccccccgtcc cgagccccgc gaccaccaag
180gcccggaagc ggaaaaccaa aaagccgccc aagcggcccg aggcgacccc
gccccccgac 240gccaacgcga ccgtcgccgc cggccacgcc acgctgcgcg
cgcacctgcg ggaaatcaag 300gtcgagaacg ccgatgccca gttttacgtg
tgcccgcccc cgacgggcgc cacggtggtg 360cagtttgagc agccgcgccg
ctgcccgacg cgcccggagg ggcagaacta cacggagggc 420atcgcggtgg
tcttcaagga gaacatcgcc ccgtacaaat tcaaggccac catgtactac
480aaagacgtga ccgtgtcgca ggtgtggttc ggccaccgct actcccagtt
tatggggata 540ttcgaggacc gcgcccccgt tcccttcgag gaggtgatcg
acaagattaa caccaagggg 600gtctgccgct ccacggccaa gtacgtgcgg
aacaacatgg agaccaccgc gtttcaccgg 660gacgaccacg agaccgacat
ggagctcaag ccggcgaagg tcgccacgcg cacgagccgg 720gggtggcaca
ccaccgacct caagtacaac ccctcgcggg tggaggcgtt ccatcggtac
780ggcacgacgg tcaactgcat cgtcgaggag gtggacgcgc ggtcggtgta
cccgtacgat 840gagtttgtgc tggcgacggg cgactttgtg tacatgtccc
cgttttacgg ctaccgggag 900gggtcgcaca ccgagcacac cagctacgcc
gccgaccgct tcaagcaggt cgacggcttc 960tacgcgcgcg acctcaccac
gaaggcccgg gccacgtcgc cgacgacccg caacttgctg 1020acgaccccca
agtttaccgt ggcctgggac tgggtgccga agcgaccggc ggtctgcacc
1080atgaccaagt ggcaggaggt ggacgagatg ctccgcgccg agtacggcgg
ctccttccgc 1140ttctcctccg acgccatctc gaccaccttc accaccaacc
tgaccgagta ctcgctctcg 1200cgcgtcgacc tgggcgactg catcggccgg
gatgcccgcg aggccatcga ccgcatgttt 1260gcgcgcaagt acaacgccac
gcacatcaag gtgggccagc cgcagtacta cctggccacg 1320gggggcttcc
tcatcgcgta ccagcccctc ctcagcaaca cgctcgccga gctgtacgtg
1380cgggagtaca tgcgggagca ggaccgcaag ccccggaatg ccacgcccgc
gccactgcgg 1440gaggcgccca gcgccaacgc gtccgtggag cgcatcaaga
ccacctcctc gatcgagttc 1500gcccggctgc agtttacgta taaccacata
cagcgccacg tgaatgacat gctggggcgc 1560atcgccgtcg cgtggtgcga
gctgcagaac cacgagctga ctctctggaa cgaggcccgc 1620aagctcaacc
ccaacgccat cgcctccgcc accgtcggcc ggcgggtgag cgcgcgcatg
1680ctcggagacg tcatggccgt ctccacgtgc gtgcccgtcg ccccggacaa
cgtgatcgtg 1740cagaactcga tgcgcgtcag ctcgcggccg gggacgtgct
acagccgccc cctggtcagc 1800tttcggtacg aagaccaggg cccgctgatc
gaggggcagc tgggcgagaa caacgagctg 1860cgcctcaccc gcgacgcgct
cgagccgtgc accgtgggcc accggcgcta cttcatcttc 1920ggcgggggct
acgtgtactt cgaggagtac gcgtactctc accagctgag tcgcgccgac
1980gtcaccaccg tcagcacctt catcgacctg aacatcacca tgctggagga
ccacgagttt 2040gtgcccctgg aggtctacac gcgccacgag atcaaggaca
gcggcctgct ggactacacg 2100gaggtccagc gccgcaacca gctgcacgac
ctgcgctttg ccgacatcga cacggtcatc 2160cgcgccgacg ccaacgccgc
catgttcgcg gggctgtgcg cgttcttcga ggggatgggg 2220gacttggggc
gcgcggtcgg caaggtagtc atgggagtag tggggggcgt ggtgtcggcc
2280gtctcgggcg tgtcctcctt tatgtccaac cccttcgggg cgcttgccgt
ggggctgctg 2340gtcctggccg gcctggtcgc ggccttcttc gccttccgct
acgtcctgca actgcaacgc 2400aatcccatga aggccctgta tccgctcacc
accaaggaac tcaagacttc cgaccccggg 2460ggcgtgggcg gggaggggga
ggaaggcgcg gaggggggcg ggtttgacga ggccaagttg 2520gccgaggccc
gagaaatgat ccgatatatg gctttggtgt cggccatgga gcgcacggaa
2580cacaaggcca gaaagaaggg cacgagcgcc ctgctcagct ccaaggtcac
caacatggtt 2640ctgcgcaagc gcaacaaagc caggtactct ccgctccaca
acgaggacga ggccggagac 2700gaagacgagc tctaa 271575904PRTArtificial
sequenceVirus 75Met Arg Gly Gly Gly Leu Ile Cys Ala Leu Val Val Gly
Ala Leu Val1 5 10 15Ala Ala Val Ala Ser Ala Ala Pro Ala Ala Pro Ala
Ala Pro Arg Ala 20 25 30Ser Gly Gly Val Ala Ala Thr Val Ala Ala Asn
Gly Gly Pro Ala Ser 35 40 45Arg Pro Pro Pro Val Pro Ser Pro Ala Thr
Thr Lys Ala Arg Lys Arg 50 55 60Lys Thr Lys Lys Pro Pro Lys Arg Pro
Glu Ala Thr Pro Pro Pro Asp65 70 75 80Ala Asn Ala Thr Val Ala Ala
Gly His Ala Thr Leu Arg Ala His Leu 85 90 95Arg Glu Ile Lys Val Glu
Asn Ala Asp Ala Gln Phe Tyr Val Cys Pro 100 105 110Pro Pro Thr Gly
Ala Thr Val Val Gln Phe Glu Gln Pro Arg Arg Cys 115 120 125Pro Thr
Arg Pro Glu Gly Gln Asn Tyr Thr Glu Gly Ile Ala Val Val 130 135
140Phe Lys Glu Asn Ile Ala Pro Tyr Lys Phe Lys Ala Thr Met Tyr
Tyr145 150 155 160Lys Asp Val Thr Val Ser Gln Val Trp Phe Gly His
Arg Tyr Ser Gln 165 170 175Phe Met Gly Ile Phe Glu Asp Arg Ala Pro
Val Pro Phe Glu Glu Val 180 185 190Ile Asp Lys Ile Asn Thr Lys Gly
Val Cys Arg Ser Thr Ala Lys Tyr 195 200 205Val Arg Asn Asn Met Glu
Thr Thr Ala Phe His Arg Asp Asp His Glu 210 215 220Thr Asp Met Glu
Leu Lys Pro Ala Lys Val Ala Thr Arg Thr Ser Arg225 230 235 240Gly
Trp His Thr Thr Asp Leu Lys Tyr Asn Pro Ser Arg Val Glu Ala 245 250
255Phe His Arg Tyr Gly Thr Thr Val Asn Cys Ile Val Glu Glu Val Asp
260 265 270Ala Arg Ser Val Tyr Pro Tyr Asp Glu Phe Val Leu Ala Thr
Gly Asp 275 280 285Phe Val Tyr Met Ser Pro Phe Tyr Gly Tyr Arg Glu
Gly Ser His Thr 290 295 300Glu His Thr Ser Tyr Ala Ala Asp Arg Phe
Lys Gln Val Asp Gly Phe305 310 315 320Tyr Ala Arg Asp Leu Thr Thr
Lys Ala Arg Ala Thr Ser Pro Thr Thr 325 330 335Arg Asn Leu Leu Thr
Thr Pro Lys Phe Thr Val Ala Trp Asp Trp Val 340 345 350Pro Lys Arg
Pro Ala Val Cys Thr Met Thr Lys Trp Gln Glu Val Asp 355 360 365Glu
Met Leu Arg Ala Glu Tyr Gly Gly Ser Phe Arg Phe Ser Ser Asp 370 375
380Ala Ile Ser Thr Thr Phe Thr Thr Asn Leu Thr Glu Tyr Ser Leu
Ser385 390 395 400Arg Val Asp Leu Gly Asp Cys Ile Gly Arg Asp Ala
Arg Glu Ala Ile 405 410 415Asp Arg Met Phe Ala Arg Lys Tyr Asn Ala
Thr His Ile Lys Val Gly 420 425 430Gln Pro Gln Tyr Tyr Leu Ala Thr
Gly Gly Phe Leu Ile Ala Tyr Gln 435 440 445Pro Leu Leu Ser Asn Thr
Leu Ala Glu Leu Tyr Val Arg Glu Tyr Met 450 455 460Arg Glu Gln Asp
Arg Lys Pro Arg Asn Ala Thr Pro Ala Pro Leu Arg465 470 475 480Glu
Ala Pro Ser Ala Asn Ala Ser Val Glu Arg Ile Lys Thr Thr Ser 485 490
495Ser Ile Glu Phe Ala Arg Leu Gln Phe Thr Tyr Asn His Ile Gln Arg
500 505 510His Val Asn Asp Met Leu Gly Arg Ile Ala Val Ala Trp Cys
Glu Leu 515 520 525Gln Asn His Glu Leu Thr Leu Trp Asn Glu Ala Arg
Lys Leu Asn Pro 530 535 540Asn Ala Ile Ala Ser Ala Thr Val Gly Arg
Arg Val Ser Ala Arg Met545 550 555 560Leu Gly Asp Val Met Ala Val
Ser Thr Cys Val Pro Val Ala Pro Asp 565 570 575Asn Val Ile Val Gln
Asn Ser Met Arg Val Ser Ser Arg Pro Gly Thr 580 585 590Cys Tyr Ser
Arg Pro Leu Val Ser Phe Arg Tyr Glu Asp Gln Gly Pro 595 600 605Leu
Ile Glu Gly Gln Leu Gly Glu Asn Asn Glu Leu Arg Leu Thr Arg 610 615
620Asp Ala Leu Glu Pro Cys Thr Val Gly His Arg Arg Tyr Phe Ile
Phe625 630 635 640Gly Gly Gly Tyr Val Tyr Phe Glu Glu Tyr Ala Tyr
Ser His Gln Leu 645 650 655Ser Arg Ala Asp Val Thr Thr Val Ser Thr
Phe Ile Asp Leu Asn Ile 660 665 670Thr Met Leu Glu Asp His Glu Phe
Val Pro Leu Glu Val Tyr Thr Arg 675 680 685His Glu Ile Lys Asp Ser
Gly Leu Leu Asp Tyr Thr Glu Val Gln Arg 690 695 700Arg Asn Gln Leu
His Asp Leu Arg Phe Ala Asp Ile Asp Thr Val Ile705 710 715 720Arg
Ala Asp Ala Asn Ala Ala Met Phe Ala Gly Leu Cys Ala Phe Phe 725 730
735Glu Gly Met Gly Asp Leu Gly Arg Ala Val Gly Lys Val Val Met Gly
740 745 750Val Val Gly Gly Val Val Ser Ala Val Ser Gly Val Ser Ser
Phe Met 755 760 765Ser Asn Pro Phe Gly Ala Leu Ala Val Gly Leu Leu
Val Leu Ala Gly 770 775 780Leu Val Ala Ala Phe Phe Ala Phe Arg Tyr
Val Leu Gln Leu Gln Arg785 790 795 800Asn Pro Met Lys Ala Leu Tyr
Pro Leu Thr Thr Lys Glu Leu Lys Thr 805 810 815Ser Asp Pro Gly Gly
Val Gly Gly Glu Gly Glu Glu Gly Ala Glu Gly 820 825 830Gly Gly Phe
Asp Glu Ala Lys Leu Ala Glu Ala Arg Glu Met Ile Arg 835 840 845Tyr
Met Ala Leu Val Ser Ala Met Glu Arg Thr Glu His Lys Ala Arg 850 855
860Lys Lys Gly Thr Ser Ala Leu Leu Ser Ser Lys Val Thr Asn Met
Val865 870 875 880Leu Arg Lys Arg Asn Lys Ala Arg Tyr Ser Pro Leu
His Asn Glu Asp 885 890 895Glu Ala Gly Asp Glu Asp Glu Leu
900
76 2715 DNA Artificial sequence Virus
76 atgcgcggag gaggactgat
ctgcgctctg gtcgtgggag ctctggtggc tgctgtggct 60tcggctgctc ccgctgctcc
cgctgctccc cgcgcttcgg gaggagtggc tgctacggtc 120gctgctaacg
gaggacccgc ttcgcgaccc ccccccgtcc cctcgcccgc tacgacgaag
180gctcgaaagc gaaaaacgaa aaagcccccc aagcgacccg aagctacgcc
cccccccgac 240gctaacgcta cggtcgctgc tggacacgct acgctgcgcg
ctcacctgcg agaaatcaag 300gtcgaaaacg ctgatgctca gttctacgtg
tgcccccccc ccacgggagc tacggtggtg 360cagttcgaac agccccgccg
ctgccccacg cgccccgaag gacagaacta cacggaagga 420atcgctgtgg
tctttaagga aaacatcgct ccctacaaat ttaaggctac gatgtactac
480aaagacgtga cggtgtcgca ggtgtggttt ggacaccgct actcgcagtt
catgggaatc 540tttgaagacc gcgctcccgt cccctttgaa gaagtgatcg
acaagatcaa cacgaaggga 600gtctgccgct cgacggctaa gtacgtgcga
aacaacatgg aaacgacggc tttccaccga 660gacgaccacg aaacggacat
ggaactgaag cccgctaagg tcgctacgcg cacgtcgcga 720ggatggcaca
cgacggacct gaagtacaac ccctcgcgag tggaagcttt tcatcgatac
780ggaacgacgg tcaactgcat cgtcgaagaa gtggacgctc gatcggtgta
cccctacgat 840gaattcgtgc tggctacggg agacttcgtg tacatgtcgc
ccttctacgg ataccgagaa 900ggatcgcaca cggaacacac gtcgtacgct
gctgaccgct ttaagcaggt cgacggattt 960tacgctcgcg acctgacgac
gaaggctcga gctacgtcgc ccacgacgcg caacctgctg 1020acgacgccca
agttcacggt ggcttgggac tgggtgccca agcgacccgc tgtctgcacg
1080atgacgaagt ggcaggaagt ggacgaaatg ctgcgcgctg aatacggagg
atcgtttcgc 1140ttttcgtcgg acgctatctc gacgacgttt acgacgaacc
tgacggaata ctcgctgtcg 1200cgcgtcgacc tgggagactg catcggacga
gatgctcgcg aagctatcga ccgcatgttc 1260gctcgcaagt acaacgctac
gcacatcaag gtgggacagc cccagtacta cctggctacg 1320ggaggatttc
tgatcgctta ccagcccctg ctgtcgaaca cgctggctga actgtacgtg
1380cgagaataca tgcgagaaca ggaccgcaag ccccgaaatg ctacgcccgc
tcccctgcga 1440gaagctccct cggctaacgc ttcggtggaa cgcatcaaga
cgacgtcgtc gatcgaattt 1500gctcgactgc agttcacgta taaccacatc
cagcgccacg tgaatgacat gctgggacgc 1560atcgctgtcg cttggtgcga
actgcagaac cacgaactga cgctgtggaa cgaagctcgc 1620aagctgaacc
ccaacgctat cgcttcggct acggtcggac gacgagtgtc ggctcgcatg
1680ctgggagacg tcatggctgt ctcgacgtgc gtgcccgtcg ctcccgacaa
cgtgatcgtg 1740cagaactcga tgcgcgtctc gtcgcgaccc ggaacgtgct
actcgcgccc cctggtctcg 1800ttccgatacg aagaccaggg acccctgatc
gaaggacagc tgggagaaaa caacgaactg 1860cgcctgacgc gcgacgctct
ggaaccctgc acggtgggac accgacgcta ctttatcttt 1920ggaggaggat
acgtgtactt tgaagaatac gcttactcgc accagctgtc gcgcgctgac
1980gtcacgacgg tctcgacgtt tatcgacctg aacatcacga tgctggaaga
ccacgaattc 2040gtgcccctgg aagtctacac gcgccacgaa atcaaggact
cgggactgct ggactacacg 2100gaagtccagc gccgcaacca gctgcacgac
ctgcgcttcg ctgacatcga cacggtcatc 2160cgcgctgacg ctaacgctgc
tatgtttgct ggactgtgcg ctttttttga aggaatggga 2220gacctgggac
gcgctgtcgg aaaggtcgtc atgggagtcg tgggaggagt ggtgtcggct
2280gtctcgggag tgtcgtcgtt catgtcgaac ccctttggag ctctggctgt
gggactgctg 2340gtcctggctg gactggtcgc tgcttttttt gcttttcgct
acgtcctgca actgcaacgc 2400aatcccatga aggctctgta tcccctgacg
acgaaggaac tgaagacgtc ggaccccgga 2460ggagtgggag gagaaggaga
agaaggagct gaaggaggag gattcgacga agctaagctg 2520gctgaagctc
gagaaatgat ccgatatatg gctctggtgt cggctatgga acgcacggaa
2580cacaaggcta gaaagaaggg aacgtcggct ctgctgtcgt cgaaggtcac
gaacatggtc 2640ctgcgcaagc gcaacaaagc tagatactcg cccctgcaca
acgaagacga agctggagac 2700gaagacgaac tgtaa 2715
77 1182 DNA Artificial sequence Virus
77 atggggcgtt tgacctccgg cgtcgggacg gcggccctgc
tagttgtcgc ggtgggactc 60cgcgtcgtct gcgccaaata cgccttagca gacccctcgc
ttaagatggc cgatcccaat 120cgatttcgcg ggaagaacct tccggttttg
gaccagctga ccgacccccc cggggtgaag 180cgtgtttacc acattcagcc
gagcctggag gacccgttcc agccccccag catcccgatc 240actgtgtact
acgcagtgct ggaacgtgcc tgccgcagcg tgctcctaca tgccccatcg
300gaggcccccc agatcgtgcg cggggcttcg gacgaggccc gaaagcacac
gtacaacctg 360accatcgcct ggtatcgcat gggagacaat tgcgctatcc
ccatcacggt tatggaatac 420accgagtgcc cctacaacaa gtcgttgggg
gtctgcccca tccgaacgca gccccgctgg 480agctactatg acagctttag
cgccgtcagc gaggataacc tgggattcct gatgcacgcc 540cccgccttcg
agaccgcggg tacgtacctg cggctagtga agataaacga ctggacggag
600atcacacaat ttatcctgga gcaccgggcc cgcgcctcct gcaagtacgc
tctccccctg 660cgcatccccc cggcagcgtg cctcacctcg aaggcctacc
aacagggcgt gacggtcgac 720agcatcggga tgctaccccg ctttatcccc
gaaaaccagc gcaccgtcgc cctatacagc 780ttaaaaatcg ccgggtggca
cggccccaag cccccgtaca ccagcaccct gctgccgccg 840gagctgtccg
acaccaccaa cgccacgcaa cccgaactcg ttccggaaga ccccgaggac
900tcggccctct tagaggatcc cgccgggacg gtgtcttcgc agatcccccc
aaactggcac 960atcccgtcga tccaggacgt cgcgccgcac cacgcccccg
ccgcccccag caacccgggc 1020ctgatcatcg gcgcgctggc cggcagtacc
ctggcggtgc tggtcatcgg cggtattgcg 1080ttttgggtac gccgccgcgc
tcagatggcc cccaagcgcc tacgtctccc ccacatccgg 1140gatgacgacg
cgcccccctc gcaccagcca ttgttttact ag 118278393PRTArtificial
sequenceVirus 78Met Gly Arg Leu Thr Ser Gly Val Gly Thr Ala Ala Leu
Leu Val Val1 5 10 15Ala Val Gly Leu Arg Val Val Cys Ala Lys Tyr Ala
Leu Ala Asp Pro 20 25 30Ser Leu Lys Met Ala Asp Pro Asn Arg Phe Arg
Gly Lys Asn Leu Pro 35 40 45Val Leu Asp Gln Leu Thr Asp Pro Pro Gly
Val Lys Arg Val Tyr His 50
55 60Ile Gln Pro Ser Leu Glu Asp Pro Phe Gln Pro Pro Ser Ile Pro
Ile65 70 75 80Thr Val Tyr Tyr Ala Val Leu Glu Arg Ala Cys Arg Ser
Val Leu Leu 85 90 95His Ala Pro Ser Glu Ala Pro Gln Ile Val Arg Gly
Ala Ser Asp Glu 100 105 110Ala Arg Lys His Thr Tyr Asn Leu Thr Ile
Ala Trp Tyr Arg Met Gly 115 120 125Asp Asn Cys Ala Ile Pro Ile Thr
Val Met Glu Tyr Thr Glu Cys Pro 130 135 140Tyr Asn Lys Ser Leu Gly
Val Cys Pro Ile Arg Thr Gln Pro Arg Trp145 150 155 160Ser Tyr Tyr
Asp Ser Phe Ser Ala Val Ser Glu Asp Asn Leu Gly Phe 165 170 175Leu
Met His Ala Pro Ala Phe Glu Thr Ala Gly Thr Tyr Leu Arg Leu 180 185
190Val Lys Ile Asn Asp Trp Thr Glu Ile Thr Gln Phe Ile Leu Glu His
195 200 205Arg Ala Arg Ala Ser Cys Lys Tyr Ala Leu Pro Leu Arg Ile
Pro Pro 210 215 220Ala Ala Cys Leu Thr Ser Lys Ala Tyr Gln Gln Gly
Val Thr Val Asp225 230 235 240Ser Ile Gly Met Leu Pro Arg Phe Ile
Pro Glu Asn Gln Arg Thr Val 245 250 255Ala Leu Tyr Ser Leu Lys Ile
Ala Gly Trp His Gly Pro Lys Pro Pro 260 265 270Tyr Thr Ser Thr Leu
Leu Pro Pro Glu Leu Ser Asp Thr Thr Asn Ala 275 280 285Thr Gln Pro
Glu Leu Val Pro Glu Asp Pro Glu Asp Ser Ala Leu Leu 290 295 300Glu
Asp Pro Ala Gly Thr Val Ser Ser Gln Ile Pro Pro Asn Trp His305 310
315 320Ile Pro Ser Ile Gln Asp Val Ala Pro His His Ala Pro Ala Ala
Pro 325 330 335Ser Asn Pro Gly Leu Ile Ile Gly Ala Leu Ala Gly Ser
Thr Leu Ala 340 345 350Val Leu Val Ile Gly Gly Ile Ala Phe Trp Val
Arg Arg Arg Ala Gln 355 360 365Met Ala Pro Lys Arg Leu Arg Leu Pro
His Ile Arg Asp Asp Asp Ala 370 375 380Pro Pro Ser His Gln Pro Leu
Phe Tyr385 390
79 1182 DNA Artificial sequence Virus
79 atgggacgtc
tgacgtcggg agtcggaacg gctgctctgc tggtcgtcgc tgtgggactg 60cgcgtcgtct
gcgctaaata cgctctggct gacccctcgc tgaagatggc tgatcccaat
120cgattccgcg gaaagaacct gcccgtcctg gaccagctga cggacccccc
cggagtgaag 180cgtgtctacc acatccagcc ctcgctggaa gacccctttc
agcccccctc gatccccatc 240acggtgtact acgctgtgct ggaacgtgct
tgccgctcgg tgctgctgca tgctccctcg 300gaagctcccc agatcgtgcg
cggagcttcg gacgaagctc gaaagcacac gtacaacctg 360acgatcgctt
ggtatcgcat gggagacaat tgcgctatcc ccatcacggt catggaatac
420acggaatgcc cctacaacaa gtcgctggga gtctgcccca tccgaacgca
gccccgctgg 480tcgtactatg actcgttctc ggctgtctcg gaagataacc
tgggatttct gatgcacgct 540cccgcttttg aaacggctgg aacgtacctg
cgactggtga agatcaacga ctggacggaa 600atcacgcaat tcatcctgga
acaccgagct cgcgcttcgt gcaagtacgc tctgcccctg 660cgcatccccc
ccgctgcttg cctgacgtcg aaggcttacc aacagggagt gacggtcgac
720tcgatcggaa tgctgccccg cttcatcccc gaaaaccagc gcacggtcgc
tctgtactcg 780ctgaaaatcg ctggatggca cggacccaag cccccctaca
cgtcgacgct gctgcccccc 840gaactgtcgg acacgacgaa cgctacgcaa
cccgaactgg tccccgaaga ccccgaagac 900tcggctctgc tggaagatcc
cgctggaacg gtgtcgtcgc agatcccccc caactggcac 960atcccctcga
tccaggacgt cgctccccac cacgctcccg ctgctccctc gaaccccgga
1020ctgatcatcg gagctctggc tggatcgacg ctggctgtgc tggtcatcgg
aggaatcgct 1080ttctgggtcc gccgccgcgc tcagatggct cccaagcgcc
tgcgtctgcc ccacatccga 1140gatgacgacg ctcccccctc gcaccagccc
ctgttctact ag 1182
80 387 DNA Human papillomavirus type 16
80 ggtaccgccg
ccaccatgga gacagacaca ctcctgctat gggtactgct gctctgggtt 60ccaggttcca
ctggtgacgg atccatgcat ggagatacac ctacattgca tgaatatatg
120ttagatttgc aaccagagac aactgatctc tactgttatg agcaattaaa
tgacagctca 180gaggaggagg atgaaataga tggtccagct ggacaagcag
aaccggacag agcccattac 240aatattgtaa ccttttgttg caagtgtgac
tctacgcttc ggttgtgcgt acaaagcaca 300cacgtagaca ttcgtacttt
ggaagacctg ttaatgggca cactaggaat tgtgtgcccc 360atctgctctc
agaagcccta agaattc 387
81 387 DNA Artificial Sequence HPV-16 E7 O1
81 ggtaccgccg ccaccatgga aacggacacg ctgctgctgt gggtcctgct gctgtgggtc
60cccggatcga cgggagacgg atcgatgcat ggagacacgc ccacgctgca tgaatacatg
120ctggacctgc aacccgaaac gacggacctg tactgctacg aacaactgaa
cgactcgtcg 180gaagaagaag acgaaatcga cggacccgct ggacaagctg
aacccgacag agctcattac 240aacatcgtca cgttctgctg caagtgcgac
tcgacgctgc gactgtgcgt ccaatcgacg 300cacgtcgaca tccgtacgct
ggaagacctg ctgatgggaa cgctgggaat cgtgtgcccc 360atctgctcgc
agaagcccta agaattc 387
82 387 DNA Artificial Sequence HPV16 E7 O2
82 ggtaccgccg ccaccatgga aacggacacg ctgctgctgt gggtcctgct gctgtgggtc
60cccggatcga cgggagacgg atcgatgcat ggagatacgc ctacgctgca tgaatatatg
120ctggatctgc aacccgaaac gacggatctg tactgttatg aacaactgaa
tgactcgtcg 180gaagaagaag atgaaatcga tggacccgct ggacaagctg
aacccgacag agctcattac 240aatatcgtca cgttttgttg caagtgtgac
tcgacgctgc gactgtgcgt ccaatcgacg 300cacgtcgaca tccgtacgct
ggaagacctg ctgatgggaa cgctgggaat cgtgtgcccc 360atctgctcgc
agaagcccta agaattc 387
83 417 DNA Artificial Sequence HPV-16 E7 O3
83 ggtaccgccg ccaccatgga gacggacacg ctcctgctct gggtactgct gctctgggtt
60cctggatcga cgggattgtg gacggatcga tgcatggaga tacgcctacg ctccatgaat
120atatgctcga tctccaacct ggttgagacg acggatctct actgttatga
gcaactcaat 180gactcgtcgg aggaggagga tgaattcata gatggacctg
ctggacaagc agaacctgac 240agagcccatt acaatattgt aacgtttgag
aattgttgca agtgtgactc gacgctccgg 300ctctgcgtac aatcgacgca
cgtagacatt cgtccctcta cgctcgaaga cctgctcatg 360ggaacgctcg
gaattgtgtg ccccatctgc tcgcagaagt gtgcccccta agaattc 417
84 387 DNA Artificial Sequence HPV-16 E7 W
84 ggtaccgccg ccaccatgga
gactgatact ttattattat gggtattatt attatgggtt 60ccaggtagta ctggtgatgg
cagtatgcat ggcgatactc caactttaca tgagtatatg 120ttagatttac
aaccagagac tactgattta tattgttatg agcaattaaa tgatagcagt
180gaggaggagg atgagataga tggtccagcg ggccaagcag agccggatcg
ggcgcattat 240aatatagtaa ctttctgttg taagtgtgat agtactttac
ggttatgtgt acaaagcact 300cacgtagata tacggacttt agaggattta
ttaatgggca ctttaggcat agtatgtcca 360atatgtagtc agaagccata agaattc 387
85 1182 DNA Herpes simplex virus type 2
85 atggggcgtt tgacctccgg
cgtcgggacg gcggccctgc tagttgtcgc ggtgggactc 60cgcgtcgtct gcgccaaata
cgccttagca gacccctcgc ttaagatggc cgatcccaat 120cgatttcgcg
ggaagaacct tccggttttg gaccagctga ccgacccccc cggggtgaag
180cgtgtttacc acattcagcc gagcctggag gacccgttcc agccccccag
catcccgatc 240actgtgtact acgcagtgct ggaacgtgcc tgccgcagcg
tgctcctaca tgccccatcg 300gaggcccccc agatcgtgcg cggggcttcg
gacgaggccc gaaagcacac gtacaacctg 360accatcgcct ggtatcgcat
gggagacaat tgcgctatcc ccatcacggt tatggaatac 420accgagtgcc
cctacaacaa gtcgttgggg gtctgcccca tccgaacgca gccccgctgg
480agctactatg acagctttag cgccgtcagc gaggataacc tgggattcct
gatgcacgcc 540cccgccttcg agaccgcggg tacgtacctg cggctagtga
agataaacga ctggacggag 600atcacacaat ttatcctgga gcaccgggcc
cgcgcctcct gcaagtacgc tctccccctg 660cgcatccccc cggcagcgtg
cctcacctcg aaggcctacc aacagggcgt gacggtcgac 720agcatcggga
tgctaccccg ctttatcccc gaaaaccagc gcaccgtcgc cctatacagc
780ttaaaaatcg ccgggtggca cggccccaag cccccgtaca ccagcaccct
gctgccgccg 840gagctgtccg acaccaccaa cgccacgcaa cccgaactcg
ttccggaaga ccccgaggac 900tcggccctct tagaggatcc cgccgggacg
gtgtcttcgc agatcccccc aaactggcac 960atcccgtcga tccaggacgt
cgcgccgcac cacgcccccg ccgcccccag caacccgggc 1020ctgatcatcg
gcgcgctggc cggcagtacc ctggcggtgc tggtcatcgg cggtattgcg
1080ttttgggtac gccgccgcgc tcagatggcc cccaagcgcc tacgtctccc
ccacatccgg 1140gatgacgacg cgcccccctc gcaccagcca ttgttttact ag 1182
86 1182 DNA Artificial Sequence HSV-2 gD2 O1
86 atgggacgtc
tgacgtcggg agtcggaacg gctgctctgc tggtcgtcgc tgtgggactc 60cgcgtcgtct
gcgctaaata cgctctggct gacccctcgc tgaagatggc tgaccccaac
120cgatttcgcg gaaagaacct gcccgtcctg gaccagctga cggacccccc
cggagtgaag 180cgtgtctacc acatccagcc ctcgctggaa gacccctttc
agcccccctc gatccccatc 240acggtgtact acgctgtgct ggaacgtgct
tgccgctcgg tgctcctcca tgctccctcg 300gaagctcccc agatcgtgcg
cggagcttcg gacgaagctc gaaagcacac gtacaacctg 360acgatcgctt
ggtaccgcat gggagacaac tgcgctatcc ccatcacggt catggaatac
420acggaatgcc cctacaacaa gtcgctcgga gtctgcccca tccgaacgca
gccccgctgg 480tcgtactacg actcgttttc ggctgtctcg gaagacaacc
tgggatttct gatgcacgct 540cccgcttttg aaacggctgg aacgtacctg
cgactcgtga agatcaacga ctggacggaa 600atcacgcaat ttatcctgga
acaccgagct cgcgcttcgt gcaagtacgc tctccccctg 660cgcatccccc
ccgctgcttg cctcacgtcg aaggcttacc aacagggagt gacggtcgac
720tcgatcggaa tgctcccccg ctttatcccc gaaaaccagc gcacggtcgc
tctctactcg 780ctcaaaatcg ctggatggca cggacccaag cccccctaca
cgtcgacgct gctgcccccc 840gaactgtcgg acacgacgaa cgctacgcaa
cccgaactcg tccccgaaga ccccgaagac 900tcggctctcc tcgaagaccc
cgctggaacg gtgtcgtcgc agatcccccc caactggcac 960atcccctcga
tccaggacgt cgctccccac cacgctcccg ctgctccctc gaaccccgga
1020ctgatcatcg gagctctggc tggatcgacg ctggctgtgc tggtcatcgg
aggaatcgct 1080ttttgggtcc gccgccgcgc tcagatggct cccaagcgcc
tccgtctccc ccacatccga 1140gacgacgacg ctcccccctc gcaccagccc
ctcttttact ag 1182
87 1182 DNA Artificial Sequence HSV-2 gD2 O2
87 atgggacgtc tgacgtcggg agtcggaacg gctgctctgc tggtcgtcgc tgtgggactg
60cgcgtcgtct gcgctaaata cgctctggct gacccctcgc tgaagatggc tgatcccaat
120cgatttcgcg gaaagaacct gcccgtcctg gaccagctga cggacccccc
cggagtgaag 180cgtgtctacc acatccagcc ctcgctggaa gacccctttc
agcccccctc gatccccatc 240acggtgtact acgctgtgct ggaacgtgct
tgccgctcgg tgctgctgca tgctccctcg 300gaagctcccc agatcgtgcg
cggagcttcg gacgaagctc gaaagcacac gtacaacctg 360acgatcgctt
ggtatcgcat gggagacaat tgcgctatcc ccatcacggt catggaatac
420acggaatgcc cctacaacaa gtcgctggga gtctgcccca tccgaacgca
gccccgctgg 480tcgtactatg actcgttttc ggctgtctcg gaagataacc
tgggatttct gatgcacgct 540cccgcttttg aaacggctgg aacgtacctg
cgactggtga agatcaacga ctggacggaa 600atcacgcaat ttatcctgga
acaccgagct cgcgcttcgt gcaagtacgc tctgcccctg 660cgcatccccc
ccgctgcttg cctgacgtcg aaggcttacc aacagggagt gacggtcgac
720tcgatcggaa tgctgccccg ctttatcccc gaaaaccagc gcacggtcgc
tctgtactcg 780ctgaaaatcg ctggatggca cggacccaag cccccctaca
cgtcgacgct gctgcccccc 840gaactgtcgg acacgacgaa cgctacgcaa
cccgaactgg tccccgaaga ccccgaagac 900tcggctctgc tggaagatcc
cgctggaacg gtgtcgtcgc agatcccccc caactggcac 960atcccctcga
tccaggacgt cgctccccac cacgctcccg ctgctccctc gaaccccgga
1020ctgatcatcg gagctctggc tggatcgacg ctggctgtgc tggtcatcgg
aggaatcgct 1080ttttgggtcc gccgccgcgc tcagatggct cccaagcgcc
tgcgtctgcc ccacatccga 1140gatgacgacg ctcccccctc gcaccagccc
ctgttttact ag 1182
88 1182 DNA Artificial Sequence HSV-2 gD2 O3
88 atgggacgtc tcacgtcggg agtcggaacg gcggccctgc tcgttgtcgc ggtgggactc
60cgcgtcgtct gcgccaaata cgccctcgca gacccctcgc tcaagatggc cgatcccaat
120cgatttcgcg gaaagaacct ccctgttctc gaccagctga cggacccccc
cggagtgaag 180cgtgtttacc acattcagcc ttcgctggag gaccctttcc
agcccccctc gatccctatc 240acggtgtact acgcagtgct ggaacgtgcc
tgccgctcgg tgctcctcca tgccccttcg 300gaggcccccc agatcgtgcg
cggagcttcg gacgaggccc gaaagcacac gtacaacctg 360acgatcgcct
ggtatcgcat gggagacaat tgcgctatcc ccatcacggt tatggaatac
420acggagtgcc cctacaacaa gtcgctcgga gtctgcccca tccgaacgca
gccccgctgg 480tcgtactatg actcgttttc ggccgtctcg gaggataacc
tgggattcct gatgcacgcc 540cccgccttcg agacggcggg aacgtacctg
cggctcgtga agataaacga ctggacggag 600atcacgcaat ttatcctgga
gcaccgggcc cgcgcctcgt gcaagtacgc tctccccctg 660cgcatccccc
ctgcagcgtg cctcacgtcg aaggcctacc aacagggagt gacggtcgac
720tcgatcggaa tgctcccccg ctttatcccc gaaaaccagc gcacggtcgc
cctctactcg 780ctcaaaatcg ccggatggca cggacccaag cccccttaca
cgtcgacgct gctgcctcct 840gagctgtcgg acacgacgaa cgccacgcaa
cccgaactcg ttcctgaaga ccccgaggac 900tcggccctcc tagaggatcc
cgccggaacg gtgtcgtcgc agatcccccc taactggcac 960atcccttcga
tccaggacgt cgcgcctcac cacgcccccg ccgccccctc gaaccctgga
1020ctgatcatcg gagcgctggc cggatcgacg ctggcggtgc tggtcatcgg
aggaattgcg 1080ttttgggtac gccgccgcgc tcagatggcc cccaagcgcc
tccgtctccc ccacatccgg 1140gatgacgacg cgcccccctc gcaccagcct
ctcttttact ag 1182
89 1182 DNA Artificial Sequence HSV-2 gD2 W
89 atggggcggt tgactagtgg cgtagggact gcggcgttat tagtagtagc ggtaggctta
60cgggtagtat gtgcaaaata tgcgttagca gatccaagtt taaagatggc ggatccaaat
120cggttccggg ggaagaattt accggtattg gatcagttaa ctgatccacc
aggggtaaag 180cgggtatatc acatacagcc gagcttagag gatccgttcc
agccaccaag cataccgata 240actgtatatt atgcagtatt agagcgggcg
tgtcggagcg tattattaca tgcaccaagt 300gaggcgccac agatagtacg
gggggcaagt gatgaggcgc ggaagcacac ttataattta 360actatagcat
ggtatcggat gggcgataat tgtgcgatac caataactgt aatggagtat
420actgagtgtc catataataa gagtttgggg gtatgtccaa tacggactca
gccacggtgg 480agctattatg atagcttcag cgcagtaagc gaggataatt
taggcttctt aatgcacgcg 540ccagcattcg agactgcggg tacttattta
cggttagtaa agataaatga ttggactgag 600ataactcaat tcatattaga
gcaccgggca cgggcgagtt gtaagtatgc attaccatta 660cggataccac
cggcagcgtg tttaactagt aaggcatatc aacagggcgt aactgtagat
720agcataggga tgttaccacg gttcatacca gagaatcagc ggactgtagc
gttatatagc 780ttaaaaatag cagggtggca cggcccaaag ccaccgtata
ctagcacttt attaccgccg 840gagttaagtg atactactaa tgcgactcaa
ccagagttag taccggagga tccagaggat 900agtgcattat tagaggatcc
agcggggact gtaagtagtc agataccacc aaattggcac 960ataccgagta
tacaggatgt agcgccgcac cacgcaccag cggcaccaag caatccgggc
1020ttaataatag gcgcgttagc aggcagtact ttagcggtat tagtaatagg
cggtatagcg 1080ttctgggtac ggcggcgggc gcagatggcg ccaaagcggt
tacggttacc acacatacgg 1140gatgatgatg cgccaccaag tcaccagcca
ttgttctatt ag 1182
90 41 DNA Artificial sequence Common forward primer
90 ttgaataggt accgccgcca ccatggagac cgacaccctc c 41
91 24 DNA Artificial Sequence ODN-7909
91 tcgtcgtttt gtcgttttgt cgtt 24
* * * * *
References