Influenza virus vaccine composition and methods of use Luke; Catherine J. ; et al. [Evans; Thomas G.]

Patent Application Summary

U.S. patent application number 11/131479 was filed with the patent office on 2005-05-18 and published on 2006-02-02 for influenza virus vaccine composition and methods of use. Invention is credited to Thomas G. Evans, Andrew J. Geall, Gretchen S. Jimenez, Catherine J. Luke, Adrian Vilalta, Mary K. Wloch.

Application Number	20060024670 11/131479
Family ID	35451484
Filed Date	2005-05-18

United States Patent Application	20060024670
Kind Code	A1
Luke; Catherine J. ; et al.	February 2, 2006

Influenza virus vaccine composition and methods of use

Abstract

The present invention is directed to enhancing the immune response of a human in need of protection against IV infection by administering in vivo, into a tissue of the human, at least one polynucleotide comprising one or more regions of nucleic acid encoding an IV protein or a fragment, a variant, or a derivative thereof. The present invention is further directed to enhancing the immune response of a human in need of protection against IV infection by administering, in vivo, into a tissue of the human, at least one IV protein or a fragment, a variant, or derivative thereof. The IV protein can be, for example, in purified form or can be an inactivated IV, such as those present in inactivated IV vaccines. The polynucleotide is incorporated into the cells of the human in vivo, and an immunologically effective amount of an immunogenic epitope of an IV, or a fragment, variant, or derivative thereof is produced in vivo. The IV protein (in purified form or in the form of an inactivated IV vaccine) is also administered in an immunologically effective amount.

Inventors:	Luke; Catherine J.; (Frederick, MD) ; Vilalta; Adrian; (San Diego, CA) ; Wloch; Mary K.; (San Diego, CA) ; Evans; Thomas G.; (San Diego, CA) ; Geall; Andrew J.; (San Marcos, CA) ; Jimenez; Gretchen S.; (San Diego, CA)
Correspondence Address:	STERNE, KESSLER, GOLDSTEIN & FOX PLLC 1100 NEW YORK AVENUE, N.W. WASHINGTON DC 20005 US
Family ID:	35451484
Appl. No.:	11/131479
Filed:	May 18, 2005

Related U.S. Patent Documents


Application Number	Filing Date	Patent Number
60571854	May 18, 2004

Current U.S. Class:	435/5 ; 424/209.1; 435/325; 435/456; 435/69.1; 530/350; 536/23.72
Current CPC Class:	A61K 2039/54 20130101; A61K 39/145 20130101; C12N 2760/16134 20130101; A61K 39/00 20130101; A61P 37/04 20180101; C07K 16/1018 20130101; A61K 39/12 20130101; A61K 2039/70 20130101; C07K 2317/34 20130101; C12N 2760/16151 20130101; C12N 7/00 20130101; A61K 2039/53 20130101; A61P 31/16 20180101; A61K 2039/55511 20130101; A61K 2039/55555 20130101; A61P 31/12 20180101; A61K 2039/55566 20130101
Class at Publication:	435/005 ; 435/069.1; 435/456; 435/325; 424/209.1; 530/350; 536/023.72
International Class:	C12Q 1/70 20060101 C12Q001/70; C07H 21/04 20060101 C07H021/04; C12P 21/06 20060101 C12P021/06; A61K 39/145 20060101 A61K039/145; C07K 14/11 20060101 C07K014/11

Claims

1-392. (canceled)

393. An isolated polynucleotide comprising a nucleic acid fragment which encodes the consensus amino acid sequence of SEQ ID NO:78, wherein the codons of said nucleic acid fragment are optimized for expression in humans.

394. The polynucleotide of claim 393, wherein the nucleotide sequence of said nucleic acid fragment is SEQ ID NO:66.

395. A vector comprising the polynucleotide of claim 393, wherein said vector, upon uptake by a suitable host cell, expresses said amino acid sequence.

396. The polynucleotide of claim 393, further comprising a heterologous nucleic acid ligated to said nucleic acid fragment.

397. A composition comprising the vector of claim 395 and a carrier.

398. The composition of claim 397, further comprising a component selected from the group consisting of an adjuvant and a transfection facilitating compound.

399. The composition of claim 398, wherein said component is a cationic lipid.

400. The composition of claim 399, wherein said adjuvant comprises (±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(syn-9-tetradeceneyloxy)-1-propanaminium bromide (GAP-DMORIE) and a neutral lipid, wherein said neutral lipid is selected from the group consisting of: (a) 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE); (b) 1,2-diphytanoyl-sn-glycero-3-phosphoethanolamine (DPyPE); and (c) 1,2-dimyristoyl-glycero-3-phosphoethanolamine (DMPE).

401. The composition of claim 399, wherein said transfection facilitating compound comprises (±)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide (DMRIE).

402. The composition of claim 401, wherein said transfection facilitating compound further comprises a neutral lipid.

403. The composition of claim 402, wherein said neutral lipid is DOPE.

404. The composition of claim 401 further comprising a 1:1 molar ratio of GAP-DMORIE and DPyPE.

405. A method for treating or preventing influenza infection in a vertebrate comprising administering to a vertebrate in need thereof the composition of claim 397.

406. A method for eliciting an immune response to influenza virus in a vertebrate comprising administering to a vertebrate in need thereof the composition of claim 397.

407. An isolated polynucleotide comprising a nucleic acid fragment which encodes the consensus amino acid sequence of SEQ ID NO:76, wherein the codons of said nucleic acid fragment are optimized for expression in humans.

408. The polynucleotide of claim 407, wherein the nucleotide sequence of said nucleic acid fragment is SEQ ID NO:75.

409. A vector comprising the polynucleotide of claim 407, wherein said vector, upon uptake by a suitable host cell, expresses said amino acid sequence.

410. The polynucleotide of claim 407, further comprising a heterologous nucleic acid ligated to said nucleic acid fragment.

411. A composition comprising the vector of claim 409 and a carrier.

412. The composition of claim 411, further comprising a component selected from the group consisting of an adjuvant and a transfection facilitating compound.

413. The composition of claim 412, wherein said component is a cationic lipid.

414. The composition of claim 413, wherein said adjuvant comprises (±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(syn-9-tetradeceneyloxy)-1-propanaminium bromide (GAP-DMORIE) and a neutral lipid, wherein said neutral lipid is selected from the group consisting of: (a) 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE); (b) 1,2-diphytanoyl-sn-glycero-3-phosphoethanolamine (DPyPE); and (c) 1,2-dimyristoyl-glycero-3-phosphoethanolamine (DMPE).

415. The composition of claim 413, wherein said transfection facilitating compound comprises (±)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide (DMRIE).

416. The composition of claim 415, wherein said transfection facilitating agent further comprises a neutral lipid.

417. The composition of claim 416, wherein the neutral lipid is DOPE.

418. The composition of claim 415 further comprising a 1:1 molar ratio of GAP-DMORIE and DPyPE.

419. An isolated polypeptide produced by the polynucleotide of claim 407.

420. A method for treating or preventing influenza infection in a vertebrate comprising administering to a vertebrate in need thereof the composition of claim 411.

421. A method for eliciting an immune response to influenza virus in a vertebrate by administration of the composition of claim 411.

422. An isolated polynucleotide comprising a first nucleic acid fragment which encodes the consensus amino acid sequence of SEQ ID NO:78 and a second nucleic acid fragment which encodes the consensus amino acid sequence of SEQ ID NO:76, wherein the codons of said first and second nucleic acid fragments are optimized for expression in humans.

423. The polynucleotide of claim 422, wherein the nucleotide sequence of said first nucleic acid fragment is SEQ ID NO:66 and wherein the nucleotide sequence of said second nucleic acid fragment is SEQ ID NO:75.

424. A vector comprising the polynucleotide of claim 422, wherein said vector, upon uptake by a suitable host cell, expresses the consensus amino acid sequences of SEQ ID NO:78 and SEQ ID NO:76.

425. The vector of claim 424, wherein said consensus amino acid sequences of SEQ ID NO:78 and SEQ ID NO:76 are expressed as a fusion protein.

426. The vector of claim 424, wherein said vector is DNA and wherein said vector comprises a first expression cassette and second expression cassette, said first expression cassette comprises a first nucleic acid fragment which encodes the consensus amino acid sequence of SEQ ID NO:78 in operable association with a promoter and said second expression cassette comprises a second nucleic acid fragment which encodes the consensus amino acid sequence of SEQ ID NO:76 in operable association with a promoter.

427. The vector of claim 426, wherein said first expression cassette and said second expression cassette are associated with separate promoters.

428. The vector of claim 427, wherein said separate promoters are non-identical.

429. The vector of claim 426, wherein said first expression cassette and said second expression cassette are associated with a single promoter, and wherein said second expression cassette is in operable association with an internal ribosome entry site (IRES).

430. The vector of claim 426, wherein said first expression cassette and said second expression cassette are associated with a single promoter, and wherein said first expression cassette is in operable association with an internal ribosome entry site (IRES).

431. A composition comprising the vector of claim 424 and a carrier.

432. A composition comprising the vector of claim 426 and a carrier.

433. A composition comprising at least two non-identical vectors, wherein one of said vectors comprises a nucleic acid fragment which encodes the consensus amino acid sequence of SEQ ID NO:78 and wherein another of said vectors comprises a nucleic acid fragment which encodes the consensus amino acid sequence of SEQ ID NO:76, wherein the codons of said nucleic acid fragments encoding SEQ ID NO:78 and SEQ ID NO:76 are optimized for expression in humans, and wherein said vectors, upon uptake by a suitable host cell, express said amino acid sequences.

434. The composition of claim 433, further comprising a carrier.

435. The composition of claim 434, further comprising a component selected from the group consisting of an adjuvant and a transfection facilitating compound.

436. The composition of claim 435, wherein said component is a cationic lipid.

437. The composition of claim 436, wherein said adjuvant comprises (±)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(syn-9-tetradeceneyloxy)-1-propanaminium bromide (GAP-DMORIE) and a neutral lipid, wherein said neutral lipid is selected from the group consisting of: (a) 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE); (b) 1,2-diphytanoyl-sn-glycero-3-phosphoethanolamine (DPyPE); and (c) 1,2-dimyristoyl-glycero-3-phosphoethanolamine (DMPE).

438. The composition of claim 436, wherein said transfection facilitating compound comprises (±)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide (DMRIE).

439. The composition of claim 438, wherein said transfection facilitating compound further comprises a neutral lipid.

440. The composition of claim 439, wherein the neutral lipid is DOPE.

441. The composition of claim 438 further comprising a 1:1 molar ratio of GAP-DMORIE and DPyPE.

442. A method for treating or preventing influenza infection in a vertebrate comprising administering to a vertebrate in need thereof the composition of claim 434.

443. A method for eliciting an immune response to influenza virus in a vertebrate by administration of the composition of claim 434.

Description

CROSS-REFERENCE TO RELATED APPLICATIONS

[0001] The present application claims the benefit of the filing date of U.S. Provisional Application No. 60/571,854 filed May 18, 2004, which is incorporated herein by reference in its entirety.

REFERENCE TO A SEQUENCE LISTING SUBMITTED ON A COMPACT DISC

[0002] This application includes a "Sequence Listing," which is provided as an electronic document on a compact disk (CD-R). This compact disk contains the file "Sequence Listing.txt" (340,000 bytes, created on May 18, 2005), which is hereby incorporated by reference in its entirety.

BACKGROUND OF THE INVENTION

[0003] The present invention relates to influenza virus vaccine compositions and methods of treating or preventing influenza infection and disease in mammals. Influenza is an acute febrile illness caused by infection of the respiratory tract. There are three types of influenza viruses: A, B, and C ("IAV," "IBV," or "ICV," respectively, or generally "IV"). Type A, which includes several subtypes, causes widespread epidemics and global pandemics such as those that occurred in 1918, 1957 and 1968. Type B causes regional epidemics. Type C causes sporadic cases and minor, local outbreaks. These virus types are distinguished in part on the basis of differences in two structural proteins, the nucleoprotein, found in the center of the virus, and the matrix protein, which forms the viral shell.

[0004] The disease can cause significant systemic symptoms, severe illness requiring hospitalization (such as viral pneumonia), and complications such as secondary bacterial pneumonia. More than 20 million people died during the pandemic flu season of 1918/1919, the largest pandemic of the 20th century. Recent epidemics in the United States are believed to have resulted in greater than 10,000 (up to 40,000) excess deaths per year and 5,000-10,000 deaths per year in non-epidemic years.

[0005] The best strategy for prevention of morbidity and mortality associated with influenza is vaccination. Vaccination is especially recommended for people in high-risk groups, such as residents of nursing or residential homes, as well as for people with diabetes, chronic renal failure, or chronic respiratory conditions.

[0006] Traditional methods of producing influenza vaccines involve growth of an isolated strain in embryonated hens' eggs. Initially, the virus is recovered from a throat swab or similar source and isolated in eggs. The initial isolation in eggs is difficult, but the virus adapts to its egg host and subsequent propagation in eggs takes place relatively easily. It is widely recognized, however, that the egg-derived production of IV for vaccine purposes has several disadvantages. One disadvantage is that the production process is vulnerable to variation in the (micro)biological quality of the eggs. Another disadvantage is that the process completely lacks flexibility if demand suddenly increases, i.e., in case of a serious epidemic or pandemic, because of the logistical problems due to the non-availability of large quantities of suitable eggs. Also, vaccines thus produced are contra-indicated for persons with a known hypersensitivity to chicken and/or egg proteins.

[0007] The influenza vaccines currently in use are designated whole virus (WV) vaccine or subvirion (SV) (also called "split" or "purified surface antigen"). The WV vaccine contains intact, inactivated virus, whereas the SV vaccine contains purified virus disrupted with detergents that solubilize the lipid-containing viral envelope, followed by chemical inactivation of residual virus. Attenuated viral vaccines against influenza are also in development. A discussion of methods of preparing conventional vaccine may be found in Wright, P. F. & Webster, R. G., FIELDS VIROLOGY, 4th Ed. (Knipe, D. M. et al. Ed.), 1464-65 (2001), for example.

Virus Structures

[0008] An IV is roughly spherical, but it can also be elongated or irregularly shaped. Inside the virus, eight segments of single-stranded RNA contain the genetic instructions for making the virus. The most striking feature of the virus is a layer of spikes projecting outward over its surface. There are two different types of spikes: one is composed of the molecule hemagglutinin (HA), the other of neuraminidase (NA). The HA molecule allows the virus to "stick" to a cell, initiating infection. The NA molecule allows newly formed viruses to exit their host cell without sticking to the cell surface or to each other. The viral capsid is comprised of viral ribonucleic acid and several so-called "internal" proteins (polymerases (PB1, PB2, and PA), matrix protein (M1) and nucleoprotein (NP)). Because antibodies against HA and NA have traditionally proved the most effective in fighting infection, much research has focused on the structure, function, and genetic variation of those molecules. Researchers are also interested in two non-structural proteins, M2 and NS1; both molecules play important roles in viral infection.

[0009] Type A subtypes are described by a nomenclature system that includes the geographic site of discovery, a lab identification number, the year of discovery, and in parentheses the type of HA and NA it possesses, for example, A/Hong Kong/156/97 (H5N1). If the virus infects non-humans, the host species is included before the geographical site, as in A/Chicken/Hong Kong/G9/97 (H9N2).
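By way of illustration only, the nomenclature described above can be parsed mechanically. The following Python sketch is not part of the application: the function name `parse_strain_name`, the field names, and the regular expression are assumptions chosen for this example, and the pattern is written only for names shaped like the two examples given above.

```python
import re
from typing import NamedTuple, Optional


class StrainName(NamedTuple):
    """Fields of an influenza strain name (field names are illustrative)."""
    virus_type: str          # A, B, or C
    host: Optional[str]      # present only for non-human isolates
    location: str            # geographic site of discovery
    isolate: str             # lab identification number
    year: str                # year of discovery, as written
    subtype: Optional[str]   # HA/NA subtype, e.g. H5N1


# Hypothetical pattern: type / [host /] location / isolate / year (HxNy)
_PATTERN = re.compile(
    r"^(?P<type>[ABC])/"
    r"(?:(?P<host>[^/]+)/)?"
    r"(?P<location>[^/]+)/"
    r"(?P<isolate>[^/]+)/"
    r"(?P<year>\d{2,4})"
    r"(?:\s*\((?P<subtype>H\d+N\d+)\))?$"
)


def parse_strain_name(name: str) -> StrainName:
    """Split a strain name into its components, or raise ValueError."""
    match = _PATTERN.match(name.strip())
    if match is None:
        raise ValueError(f"unrecognized strain name: {name!r}")
    return StrainName(
        match["type"], match["host"], match["location"],
        match["isolate"], match["year"], match["subtype"],
    )


if __name__ == "__main__":
    for example in ("A/Hong Kong/156/97 (H5N1)",
                    "A/Chicken/Hong Kong/G9/97 (H9N2)"):
        print(parse_strain_name(example))
```

The host field is optional in the pattern because, as stated above, the host species appears only when the virus infects non-humans.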

[0010] Virions contain 7 segments (influenza C virus) to 8 segments (influenza A and B virus) of linear negative-sense single stranded RNA. Most of the segments of the virus genome code for a single protein. For many influenza viruses, the whole genome is now known. Genetic reassortment of the virus results from intermixing of the parental gene segments in the progeny of the viruses when a cell is co-infected by two different viruses of a given type. This phenomenon is facilitated by the segmental nature of the genome of influenza virus. Genetic reassortment is manifested as sudden changes in the viral surface antigens.
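The combinatorial consequence of a segmented genome can be sketched as follows. This Python example is illustrative and not part of the application. It assumes an idealized model in which each of the eight segments of an influenza A or B virus is inherited independently from either parent and every combination is viable; the segment labels used are the conventional ones and are an assumption of this example.

```python
from itertools import product

# Conventional labels for the eight genome segments (assumed for illustration).
SEGMENTS = ("PB2", "PB1", "PA", "HA", "NP", "NA", "M", "NS")


def reassortants(parent_a: str, parent_b: str) -> list:
    """Enumerate every progeny genotype as a mapping segment -> parent of origin."""
    return [
        dict(zip(SEGMENTS, combo))
        for combo in product((parent_a, parent_b), repeat=len(SEGMENTS))
    ]


genotypes = reassortants("human", "avian")
# A genotype is a true reassortant if its segments come from both parents.
novel = [g for g in genotypes if len(set(g.values())) > 1]

if __name__ == "__main__":
    print(len(genotypes))  # 2**8 possible genotypes, parents included
    print(len(novel))      # genotypes that mix segments from both parents
```

Under these assumptions, co-infection of one cell by two viruses admits 2^8 = 256 segment combinations, of which 254 differ from both parents.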

[0011] Antigenic changes in HA and NA allow the influenza virus to have tremendous variability. Antigenic drift is the term used to indicate minor antigenic variations in HA and NA of the influenza virus from the original parent virus, while major changes in HA and NA that make the new virions significantly different are called antigenic shift. The difference between the two phenomena is a matter of degree.

[0012] Antigenic drift (minor changes) occurs due to accumulation of point mutations in the gene, which results in changes in the amino acids in the proteins. Changes which are extreme and drastic (too drastic to be explained by mutation alone) result in antigenic shift of the virus. The segmented genomes of the influenza viruses reassort readily in doubly infected cells. Genetic reassortment between human and non-human influenza virus has been suggested as a mechanism for antigenic shift. Influenza is a zoonotic disease, and an important pathogen in a number of animal species, including swine, horses, and birds, both wild and domestic. Influenza viruses are transferred to humans from other species.

[0013] Because of antigenic shift and antigenic drift, immunity to an IV carrying a particular HA and/or NA protein does not necessarily confer protective immunity against IV strains carrying variant, or different HA and/or NA proteins. Because antibodies against HA and NA have traditionally proved the most effective in fighting IV infection, much research has focused on the structure, function and genetic variation of those molecules.

Recent IV Vaccine Candidates

[0014] During the past few years, there has been substantial interest in testing DNA-based vaccines for a number of infectious diseases where the need for a vaccine, or an improved vaccine, exists. Several well-recognized advantages of DNA-based vaccines include the speed, ease and cost of manufacture, the versatility of developing and testing multivalent vaccines, the finding that DNA vaccines can produce a robust cellular response in a wide variety of animal models as well as in humans, and the proven safety of using plasmid DNA as a delivery vector (Donnelly, J. J., et al., Annu. Rev. Immunol. 15:617-648 (1997); Manickan, E., et al., Crit. Rev. Immunol. 17(2):139-154 (1997); U.S. Pat. No. 6,214,804). DNA vaccines represent the next generation in the development of vaccines (Nossal, G., Nat. Med. 4(5 Suppl):475-476 (1998)) and numerous DNA vaccines are in clinical trials. The above references are herein incorporated by reference in their entireties.

[0015] Studies have already been performed using DNA-based vaccines in animals. Ulmer, J. B. et al., Science 259:1745-9 (1993) revealed that mice could be protected by an IV nucleoprotein DNA vaccine alone against severe disease and death resulting from either a homologous or a heterologous IV challenge. Further studies have substantiated this model, and comparative studies of live influenza vaccines versus DNA influenza vaccines show them to be relatively equivalent in immune induction and protection in the murine model.

[0016] WO 94/21797, incorporated herein by reference in its entirety, discloses IV vaccine compositions comprising DNA constructs encoding NP, HA, M1, PB1 and NS1. WO 94/21797 also discloses methods of protecting against IV infection comprising immunization with a prophylactically effective amount of these DNA vaccine compositions.

[0017] The IV nucleoprotein is relatively conserved (see Shu, L. L. et al., J. Virol. 67:2723-9 (1993)), but just as conserved are the M1 matrix protein (which is a major T-cell target), and the M2 protein, which are encoded by separate reading frames of RNA segment 7. See Neirynck, S. et al., Nat. Med. 5:1157-63 (1999); Lamb, R. A. & Lai, C. J., Virology 112:746-51 (1981); Ito, T. et al., J. Virol. 65:5491-8 (1991). Animal DNA vaccine trials have been performed with DNA constructs encoding these genes alone or in combination, usually with success. See Okuda, K., et al., Vaccine 19:3681-91 (2001); Watabe, S. et al., Vaccine 19:4434-44 (2001). Of interest, the M2 protein is involved as part of an ion channel, is critical in resistance to the antiviral agents amantadine and rimantadine, and approximately 24 amino acids are extracellular (eM2). See Fischer, W. B., Biochim Biophys Acta 1561:27-45 (2002); Zhong, Q., FEBS Lett 434:265-71 (1998). Antibodies to this extracellular, highly conserved protein (eM2), which is highly expressed in infected cells (Lamb, R. A., et al., Cell 40:627-33 (1985)), have been shown to be involved in animal models. Treanor, J. J., J. Virol. 64:1375-7 (1990); Slepushkin, V. A. et al., Vaccine 13:1399-402 (1995). An approach using a conjugate hepatitis B core-eM2 protein has been evaluated in an animal model and proposed as a pandemic influenza vaccine. Neirynck, S. et al., Nat. Med. 5:1157-63 (1999). However, in one study vaccination of pigs with a DNA construct expressing eM2-NP fusion protein exacerbated disease after challenge with influenza A virus. Heinen, P. P., J. Gen. Virol. 83:1851-59 (2002). All of the above references are herein incorporated by reference in their entireties.

[0018] Heterologous "prime boost" strategies have been effective for enhancing immune responses and protection against numerous pathogens. Schneider et al., Immunol. Rev. 170:29-38 (1999); Robinson, H. L., Nat. Rev. Immunol. 2:239-50 (2002); Gonzalo, R. M. et al., Vaccine 20:1226-31 (2002); Tanghe, A., Infect. Immun. 69:3041-7 (2001). Providing antigen in different forms in the prime and the boost injections appears to maximize the immune response to the antigen. DNA vaccine priming followed by boosting with protein in adjuvant or by viral vector delivery of DNA encoding antigen appears to be the most effective way of improving antigen specific antibody and CD4+ T-cell responses or CD8+ T-cell responses respectively. Shiver J. W. et al., Nature 415: 331-5 (2002); Gilbert, S. C. et al., Vaccine 20:1039-45 (2002); Billaut-Mulot, O. et al., Vaccine 19:95-102 (2000); Sin, J. I. et al., DNA Cell Biol. 18:771-9 (1999). Recent data from monkey vaccination studies suggests that adding CRL1005 poloxamer (12 kDa, 5% POE), to DNA encoding the HIV gag antigen enhances T-cell responses when monkeys are vaccinated with an HIV gag DNA prime followed by a boost with an adenoviral vector expressing HIV gag (Ad5-gag). The cellular immune responses for a DNA/poloxamer prime followed by an Ad5-gag boost were greater than the responses induced with a DNA (without poloxamer) prime followed by Ad5-gag boost or for Ad5-gag only. Shiver, J. W. et al. Nature 415:331-5 (2002). U.S. patent application Publication No. US 2002/0165172 A1 describes simultaneous administration of a vector construct encoding an immunogenic portion of an antigen and a protein comprising the immunogenic portion of an antigen such that an immune response is generated. The document is limited to hepatitis B antigens and HIV antigens. Moreover, U.S. Pat. No. 6,500,432 is directed to methods of enhancing an immune response of nucleic acid vaccination by simultaneous administration of a polynucleotide and polypeptide of interest. According to the patent, simultaneous administration means administration of the polynucleotide and the polypeptide during the same immune response, preferably within 0-10 or 3-7 days of each other. The antigens contemplated by the patent include, among others, those of Hepatitis (all forms), HSV, HIV, CMV, EBV, RSV, VZV, HPV, polio, influenza, parasites (e.g., from the genus Plasmodium), and pathogenic bacteria (including but not limited to M. tuberculosis, M. leprae, Chlamydia, Shigella, B. burgdorferi, enterotoxigenic E. coli, S. typhosa, H. pylori, V. cholerae, B. pertussis, etc.). All of the above references are herein incorporated by reference in their entireties.

SUMMARY OF THE INVENTION

[0019] The present invention is directed to enhancing the immune response of a vertebrate in need of protection against IV infection by administering in vivo, into a tissue of the vertebrate, at least one polynucleotide, wherein the polynucleotide comprises one or more nucleic acid fragments, where the one or more nucleic acid fragments are optionally fragments of codon-optimized coding regions operably encoding one or more IV polypeptides, or fragments, variants, or derivatives thereof. The present invention is further directed to enhancing the immune response of a vertebrate in need of protection against IV infection by administering, in vivo, into a tissue of the vertebrate, a polynucleotide described above plus at least one isolated IV polypeptide or a fragment, a variant, or derivative thereof. The isolated IV polypeptide can be, for example, a purified subunit, a recombinant protein, a viral vector expressing an isolated IV polypeptide, or can be an inactivated or attenuated IV, such as those present in conventional IV vaccines. According to either method, the polynucleotide is incorporated into the cells of the vertebrate in vivo, and an immunologically effective amount of an immunogenic epitope of the encoded IV polypeptide, or a fragment, variant, or derivative thereof, is produced in vivo. When utilized, an isolated IV polypeptide or a fragment, variant, or derivative thereof is also administered in an immunologically effective amount.

[0020] According to the present invention, the polynucleotide can be administered either prior to, at the same time (simultaneously), or subsequent to the administration of the isolated IV polypeptide. The IV polypeptide or fragment, variant, or derivative thereof encoded by the polynucleotide comprises at least one immunogenic epitope capable of eliciting an immune response to influenza virus in a vertebrate. In addition, an isolated IV polypeptide or fragment, variant, or derivative thereof, when used, comprises at least one immunogenic epitope capable of eliciting an immune response in a vertebrate. The IV polypeptide or fragment, variant, or derivative thereof encoded by the polynucleotide can, but need not, be the same protein or fragment, variant, or derivative thereof as the isolated IV polypeptide which can be administered according to the method.

[0021] The polynucleotide of the invention can comprise a nucleic acid fragment, where the nucleic acid fragment is a fragment of a codon-optimized coding region operably encoding any IV polypeptide or fragment, variant, or derivative thereof, including, but not limited to, HA, NA, NP, M1 or M2 proteins or fragments (e.g., eM2), variants or derivatives thereof. A polynucleotide of the invention can also encode a derivative fusion protein, wherein two or more nucleic acid fragments, at least one of which encodes an IV polypeptide or fragment, variant, or derivative thereof, are joined in frame to encode a single polypeptide, e.g., NP fused to eM2. Additionally, a polynucleotide of the invention can further comprise a heterologous nucleic acid or nucleic acid fragment. Such heterologous nucleic acid or nucleic acid fragment may encode a heterologous polypeptide fused in frame with the polynucleotide encoding the IV polypeptide, e.g., a hepatitis B core protein or a secretory signal peptide. Preferably, the polynucleotide encodes an IV polypeptide or fragment, variant, or derivative thereof comprising at least one immunogenic epitope of IV, wherein the epitope elicits a B-cell (antibody) response, a T-cell (e.g., CTL) response, or both.
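The phrase "joined in frame" can be made concrete with a short sketch. The Python example below is illustrative only and is not the applicants' method: the function name `fuse_in_frame` and the toy sequences are assumptions, and the code simply joins two coding fragments so that the downstream fragment stays in the upstream reading frame, dropping a terminal stop codon from the upstream fragment if one is present.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_in_frame(upstream: str, downstream: str) -> str:
    """Join two coding fragments so that both are read in a single frame.

    Both fragments must be whole numbers of codons. A terminal stop codon
    on the upstream fragment is removed so translation continues into the
    downstream fragment; an internal stop codon is rejected.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    for label, seq in (("upstream", upstream), ("downstream", downstream)):
        if len(seq) % 3 != 0:
            raise ValueError(f"{label} fragment is not a whole number of codons")
    if upstream[-3:] in STOP_CODONS:
        upstream = upstream[:-3]
    codons = [upstream[i:i + 3] for i in range(0, len(upstream), 3)]
    if any(codon in STOP_CODONS for codon in codons):
        raise ValueError("upstream fragment contains an internal stop codon")
    return upstream + downstream


if __name__ == "__main__":
    # Toy fragments, not sequences from the application.
    print(fuse_in_frame("ATGGCCTAA", "ATGAAGTGA"))
```

Because each fragment is a whole number of codons, the concatenated sequence encodes a single polypeptide, which is the sense in which a fusion such as NP fused to eM2 is described above.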

[0022] Similarly, the isolated IV polypeptide or fragment, variant, or derivative thereof to be delivered (either a recombinant protein, a purified subunit, or viral vector expressing an isolated IV polypeptide, or in the form of an inactivated IV vaccine) can be any isolated IV polypeptide or fragment, variant, or derivative thereof, including but not limited to the HA, NA, NP, M1 or M2 proteins or fragments (e.g., eM2), variants or derivatives thereof. In certain embodiments, a derivative protein can be a fusion protein, e.g., NP-eM2. In other embodiments, the isolated IV polypeptide or fragment, variant, or derivative thereof can be fused to a heterologous protein, e.g., a secretory signal peptide or the hepatitis B virus core protein. Preferably, the isolated IV polypeptide or fragment, variant, or derivative thereof comprises at least one immunogenic epitope of IV, wherein the antigen elicits a B-cell (antibody) response, a T-cell response, or both.

[0023] Nucleic acids and fragments thereof of the present invention can be altered from their native state in one or more of the following ways. First, a nucleic acid or fragment thereof which encodes an IV polypeptide or fragment, variant, or derivative thereof can be part or all of a codon-optimized coding region, optimized according to codon usage in the animal in which the vaccine is to be delivered. In addition, a nucleic acid or fragment thereof which encodes an IV polypeptide can be a fragment which encodes only a portion of a full-length polypeptide, and/or can be mutated so as to, for example, remove from the encoded polypeptide non-desired protein motifs present in the encoded polypeptide or virulence factors associated with the encoded polypeptide. For example, the nucleic acid sequence could be mutated so as not to encode a membrane anchoring region that would prevent release of the polypeptide from the cell as with, e.g., eM2. Upon delivery, the polynucleotide of the invention is incorporated into the cells of the vertebrate in vivo, and a prophylactically or therapeutically effective amount of an immunologic epitope of an IV is produced in vivo.
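The idea of a codon-optimized coding region can be sketched in its simplest form: translate the native coding sequence, then re-encode each amino acid with a single preferred codon for the target species. The Python example below is illustrative and is not the procedure used to derive the sequences of the application. The `PREFERRED_CODON` table lists codons commonly cited as frequent in human genes and is an assumption of this example; practical schemes may instead sample codons in proportion to their usage frequencies.

```python
from itertools import product

# Standard genetic code, bases ordered T, C, A, G at each codon position.
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(bases): aa
    for bases, aa in zip(product("TCAG", repeat=3), _AMINO)
}

# One preferred codon per amino acid (illustrative assumption, see lead-in).
PREFERRED_CODON = {
    "A": "GCC", "R": "AGA", "N": "AAC", "D": "GAC", "C": "TGC",
    "Q": "CAG", "E": "GAG", "G": "GGC", "H": "CAC", "I": "ATC",
    "L": "CTG", "K": "AAG", "M": "ATG", "F": "TTC", "P": "CCC",
    "S": "AGC", "T": "ACC", "W": "TGG", "Y": "TAC", "V": "GTG",
    "*": "TGA",
}


def translate(dna: str) -> str:
    """Translate a coding sequence (whole codons only) into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence is not a whole number of codons")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def optimize(dna: str) -> str:
    """Re-encode a coding sequence using one preferred codon per amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in translate(dna))


if __name__ == "__main__":
    native = "ATGTCTCTTCTAACG"  # toy sequence encoding M-S-L-L-T
    optimized = optimize(native)
    print(optimized)
    # The encoded polypeptide is unchanged; only the codon choice differs.
    assert translate(optimized) == translate(native)
```

The encoded amino acid sequence is identical before and after re-encoding, which is the defining property of codon optimization as described above.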

[0024] Similarly, the proteins of the invention can be a fragment of a full-length IV polypeptide and/or can be altered so as to, for example, remove from the polypeptide non-desired protein motifs present in the polypeptide or virulence factors associated with the polypeptide. For example, the polypeptide could be altered so as not to include a membrane anchoring region that would prevent release of the polypeptide from the cell.

[0025] The invention further provides immunogenic compositions comprising at least one polynucleotide, wherein the polynucleotide comprises one or more nucleic acid fragments, where each nucleic acid fragment is a fragment of a codon-optimized coding region encoding an IV polypeptide or a fragment, a variant, or a derivative thereof; and immunogenic compositions comprising a polynucleotide as described above and at least one isolated IV polypeptide or a fragment, a variant, or derivative thereof. Such compositions can further comprise, for example, carriers, excipients, transfection facilitating agents, and/or adjuvants as described herein.

[0026] The immunogenic compositions comprising a polynucleotide and an isolated IV polypeptide or fragment, variant, or derivative thereof as described above can be provided so that the polynucleotide and protein formulation are administered separately, for example, when the polynucleotide portion of the composition is administered prior (or subsequent) to the isolated IV polypeptide portion of the composition. Alternatively, immunogenic compositions comprising the polynucleotide and the isolated IV polypeptide or fragment, variant, or derivative thereof can be provided as a single formulation, comprising both the polynucleotide and the protein, for example, when the polynucleotide and the protein are administered simultaneously. In another alternative, the polynucleotide portion of the composition and the isolated IV polypeptide portion of the composition can be provided simultaneously, but in separate formulations.

[0027] Compositions comprising at least one polynucleotide comprising one or more nucleic acid fragments, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide or fragment, variant, or derivative thereof, together with one or more isolated IV polypeptides or fragments, variants or derivatives thereof (as either a recombinant protein, a purified subunit, a viral vector expressing the protein, or in the form of an inactivated or attenuated IV vaccine) will be referred to herein as "combinatorial polynucleotide (e.g., DNA) vaccine compositions" or "single formulation heterologous prime-boost vaccine compositions."

[0028] The compositions of the invention can be univalent, bivalent, trivalent or multivalent. A univalent composition will comprise only one polynucleotide comprising a nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide or a fragment, variant, or derivative thereof, and optionally the same IV polypeptide or a fragment, variant, or derivative thereof in isolated form. In a single formulation heterologous prime-boost vaccine composition, a univalent composition can include a polynucleotide comprising a nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide or a fragment, variant, or derivative thereof and an isolated polypeptide having the same antigenic region as the polypeptide encoded by the polynucleotide. A bivalent composition will comprise, either in polynucleotide or protein form, two different IV polypeptides or fragments, variants, or derivatives thereof, each capable of eliciting an immune response. The polynucleotide(s) of the composition can encode two IV polypeptides or alternatively, the polynucleotide can encode only one IV polypeptide and the second IV polypeptide would be provided by an isolated IV polypeptide of the invention as in, for example, a single formulation heterologous prime-boost vaccine composition. In the case where both IV polypeptides of a bivalent composition are delivered in polynucleotide form, the nucleic acid fragments operably encoding those IV polypeptides need not be on the same polynucleotide, but can be on two different polynucleotides. A trivalent or further multivalent composition will comprise three or more IV polypeptides or fragments, variants or derivatives thereof, either in isolated form or encoded by one or more polynucleotides of the invention.

[0029] The present invention further provides plasmids and other polynucleotide constructs for delivery of nucleic acid fragments of the invention to a vertebrate, e.g., a human, which provide expression of IV polypeptides, or fragments, variants, or derivatives thereof. The present invention further provides carriers, excipients, transfection-facilitating agents, immunogenicity-enhancing agents, e.g., adjuvants, or other agent or agents to enhance the transfection, expression or efficacy of the administered gene and its gene product.

[0030] In one embodiment, a multivalent composition comprises a single polynucleotide, e.g., plasmid, comprising one or more nucleic acid regions operably encoding IV polypeptides or fragments, variants, or derivatives thereof. Reducing the number of polynucleotides, e.g., plasmids, in the compositions of the invention can have significant impacts on the manufacture and release of product, thereby reducing the costs associated with manufacturing the compositions. There are a number of approaches to include more than one expressed antigen coding sequence on a single plasmid. These include, for example, the use of Internal Ribosome Entry Site (IRES) sequences, dual promoters/expression cassettes, and fusion proteins.

[0031] The invention also provides methods for enhancing the immune response of a vertebrate to IV infection by administering to the tissues of a vertebrate one or more polynucleotides each comprising one or more nucleic acid fragments, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide or fragment, variant, or derivative thereof; and optionally administering to the tissues of the vertebrate one or more isolated IV polypeptides, or fragments, variants, or derivatives thereof. The isolated IV polypeptide can be administered prior to, at the same time (simultaneously), or subsequent to administration of the polynucleotides encoding IV polypeptides.

[0032] In addition, the invention provides consensus amino acid sequences for IV polypeptides, or fragments, variants or derivatives thereof, including, but not limited to the HA, NA, NP, M1 or M2 proteins or fragments (e.g. eM2), variants or derivatives thereof. Polynucleotides which encode the consensus polypeptides or fragments, variants or derivatives thereof, are also embodied in this invention. Such polynucleotides can be obtained by known methods, for example by backtranslation of the amino acid sequence and PCR synthesis of the corresponding polynucleotide as described below.
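As a minimal sketch of how a consensus amino acid sequence might be derived, the following assumes a simple majority rule applied column by column to sequences that have already been aligned to equal length. The majority rule, the function name consensus, and the toy input variants are all assumptions made for illustration; this passage does not state how the consensus sequences of the invention were actually computed.

```python
from collections import Counter


def consensus(aligned):
    """Majority-rule consensus of equal-length, pre-aligned sequences.

    Assumption: ties are resolved in favour of the residue seen first in
    the column, which is how Counter.most_common orders equal counts.
    """
    if not aligned:
        raise ValueError("at least one sequence is required")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    residues = []
    for column in zip(*aligned):
        residue, _count = Counter(column).most_common(1)[0]
        residues.append(residue)
    return "".join(residues)


# Hypothetical toy alignment: two identical sequences and one variant.
toy_alignment = ["MSLLTEVET", "MSLLTEVET", "MSLPTEVET"]
print(consensus(toy_alignment))
```

The resulting consensus polypeptide could then be backtranslated into a coding region, for example by choosing one codon per residue according to a codon usage table.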

BRIEF DESCRIPTION OF THE FIGURES

[0033] FIG. 1 shows an alignment of nucleotides 46-1542 of SEQ ID NO:1 (native NP coding region) with a coding region fully codon-optimized for human usage (SEQ ID NO:23).

[0034] FIG. 2 shows the protocol for the preparation of a formulation comprising 0.3 mM BAK, 7.5 mg/ml CRL 1005 and 5 mg/ml of DNA in a final volume of 3.6 ml, through the use of thermal cycling.

[0035] FIG. 3 shows the protocol for the preparation of a formulation comprising 0.3 mM BAK, 34 mg/ml or 50 mg/ml CRL 1005 and 2.5 mg/ml DNA in a final volume of 4.0 ml, through the use of thermal cycling.

[0036] FIG. 4 shows the protocol for the simplified preparation (without thermal cycling) of a formulation comprising 0.3 mM BAK, 7.5 mg/ml CRL 1005 and 5 mg/ml DNA.

[0037] FIG. 5 shows the anti-NP antibody response three weeks after a single administration of a combinatorial prime-boost vaccine formulation against the influenza virus NP protein.

[0038] FIG. 6 shows the anti-NP antibody response twelve days after a second administration of a combinatorial prime-boost vaccine formulation against the influenza virus NP protein.

[0039] FIG. 7 shows the CD8+ T Cell response to a combinatorial prime-boost vaccine formulation against the influenza virus NP protein.

[0040] FIG. 8 shows the CD4+ T Cell response to a combinatorial prime-boost vaccine formulation against the influenza virus NP protein.

[0041] FIGS. 9A and 9B show the results of a two dose mouse immunization regimen study with plasmid DNA encoding IAV HA (H3).

[0042] FIGS. 10A and 10B show the in vitro expression of M1 and M2 from segment 7 and an M1M2 fusion.

[0043] FIGS. 11A and 11B show the in vitro expression of eM2-NP and codon-optimized influenza virus NP protein.

[0044] FIG. 12 shows the influenza A NP protein consensus amino acid sequence aligned with 22 full length NP sequences.

[0045] FIG. 13 is a schematic diagram of various vectors encoding influenza proteins described herein.

[0046] FIG. 14 shows the results of western blot experiments as described in Example 13, Experiment 3. The blots show lysates of VM92 cells transfected with plasmids which express M2 or NP to compare expression of the influenza protein from different expression vectors.

[0047] FIG. 15 shows the results of western blot experiments as described in Example 13, Experiment 3. The blots show lysates of VM92 cells transfected with plasmids which express M1, M2 or NP to compare expression of the influenza protein from expression vectors.

DETAILED DESCRIPTION OF THE INVENTION

[0048] The present invention is directed to compositions and methods for enhancing the immune response of a vertebrate in need of protection against IV infection by administering in vivo, into a tissue of a vertebrate, at least one polynucleotide comprising one or more nucleic acid fragments, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide, or a fragment, variant, or derivative thereof in cells of the vertebrate in need of protection. The present invention is also directed to administering in vivo, into a tissue of the vertebrate the above described polynucleotide and at least one isolated IV polypeptide, or a fragment, variant, or derivative thereof. The isolated IV polypeptide or fragment, variant, or derivative thereof can be, for example, a recombinant protein, a purified subunit protein, a protein expressed and carried by a heterologous live or inactivated or attenuated viral vector expressing the protein, or can be an inactivated IV, such as those present in conventional, commercially available, inactivated IV vaccines. According to either method, the polynucleotide is incorporated into the cells of the vertebrate in vivo, and an immunologically effective amount of the influenza protein, or fragment or variant encoded by the polynucleotide is produced in vivo. The isolated protein or fragment, variant, or derivative thereof is also administered in an immunologically effective amount. The polynucleotide can be administered to the vertebrate in need thereof either prior to, at the same time (simultaneously), or subsequent to the administration of the isolated IV polypeptide or fragment, variant, or derivative thereof.

[0049] Non-limiting examples of IV polypeptides within the scope of the invention include, but are not limited to, NP, HA, NA, M1 and M2 polypeptides, and fragments, e.g., eM2, derivatives, e.g., an NP-eM2 fusion, and variants thereof. Nucleotide and amino acid sequences of IV polypeptides from a wide variety of IV types and subtypes are known in the art. The nucleotide sequences set out below are the wild-type sequences. For example, the nucleotide sequence of the NP protein of Influenza A/PR/8/34 (H1N1) is available as GenBank Accession Number M38279.1, and has the following sequence, referred to herein as SEQ ID NO:1: TABLE-US-00001 AGCAAAAGCAGGGTAGATAATCACTCACTGAGTGACATCAAAATCATGGC GTCTCAAGGCACCAAACGATCTTACGAACAGATGGAGACTGATGGAGAAC GCCAGAATGCCACTGAAATCAGAGCATCCGTCGGAAAAATGATTGGTGGA ATTGGACGATTCTACATCCAAATGTGCACCGAACTCAAACTCAGTGATTA TGAGGGACGGTTGATCCAAAACAGCTTAACAATAGAGAGAATGGTGCTCT CTGCTTTTGACGAAAGGAGAAATAAATACCTTGAAGAACATCCCAGTGCG GGGAAAGATCCTAAGAAAACTGGAGGACCTATATACAGGAGAGTAAACGG AAAGTGGATGAGAGAACTCATCCTTTATGACAAAGAAGAAATAAGGCGAA TCTGGCGCCAAGCTAATAATGGTGACGATGCAACGGCTGGTCTGACTCAC ATGATGATCTGGCATTCCAATTTGAATGATGCAACTTATCAGAGGACAAG AGCTCTTGTTCGCACCGGAATGGATCCCAGGATGTGCTCTCTGATGCAAG GTTCAACTCTCCCTAGGAGGTCTGGAGCCGCAGGTGCTGCAGTCAAAGGA GTTGGAACAATGGTGATGGAATTGGTCAGAATGATCAAACGTGGGATCAA TGATCGGAACTTCTGGAGGGGTGAGAATGGACGAAAAACAAGAATTGCTT ATGAAAGAATGTGCAACATTCTCAAAGGGAAATTTCAAACTGCTGCACAA AAAGCAATGATGGATCAAGTGAGAGAGAGCCGGAACCCAGGGAATGCTGA GTTCGAAGATCTCACTTTTCTAGCACGGTCTGCACTCATATTGAGAGGGT CGGTTGCTCACAAGTCCTGCCTGCCTGCCTGTGTGTATGGACCTGCCGTA GCCAGTGGGTACGACTTTGAAAGGGAGGGATACTCTCTAGTCGGAATAGA CCCTTTCAGACTGCTTCAAAACAGCCAAGTGTACAGCCTAATCAGACCAA ATGAGAATCCAGCACACAAGAGTCAACTGGTGTGGATGGCATGCCATTCT GCCGCATTTGAAGATCTAAGAGTATTAAGCTTCATCAAAGGGACGAAGGT GCTCCCAAGAGGGAAGCTTTCCACTAGAGGAGTTCAAATTGCTTCCAATG AAAATATGGAGACTATGGAATCAAGTACACTTGAACTGAGAAGCAGGTAC TGGGCCATAAGGACCAGAAGTGGAGGAAACACCAATCAACAGAGGGCATC TGCGGGCCAAATCAGCATACAACCTACGTTCTCAGTACAGAGAAATCTCC 
CTTTTGACAGAACAACCGTTATGGCAGCATTCAGTGGGAATACAGAGGGG AGAACATCTGACATGAGGACCGAAATCATAAGGATGATGGAAAGTGCAAG ACCAGAAGATGTGTCTTTCCAGGGGCGGGGAGTCTTCGAGCTCTCGGACG AAAAGGCAGCGAGCCCGATCGTGCCTTCCTTTGACATGAGTAATGAAGGA TCTTATTTCTTCGGAGACAATGCAGAGGAATACGATAATTAAAGAAAAAT ACCCTTGTTTCTACT

[0050] The amino acid sequence of the NP protein of Influenza A/PR/8/34 (H1N1), encoded by nucleotides 46-1542 of SEQ ID NO:1 is as follows, referred to herein as SEQ ID NO:2: TABLE-US-00002 MASQGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLS DYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRV NGKWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQR TRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELVRMIKRG INDRNFWRGENGRKTRIAYERMCNILKGKFQTAAQKAMMDQVRESRNPGN AEFEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVG IDPFRLLQNSQVYSLIRPNENPAHKSQLVWMACHSAAFEDLRVLSFIKGT KVLPRGKLSTRGVQIASNENMETMESSTLELRSRYWAIRTRSGGNTNQQR ASAGQISIQPTFSVQRNLPFDRTTVMAAFSGNTEGRTSDMRTEIIRMMES ARPEDVSFQGRGVFELSDEKAASPIVPSFDMSNEGSYFFGDNAEEYDN

[0051] Segment 7 of the IAV genome encodes both M1 and M2. Segment 7 of Influenza A virus (A/Puerto Rico/8/34/Mount Sinai (H1N1)) is available as GenBank Accession No. AF389121.1, and has the following sequence, referred to herein as SEQ ID NO:3: TABLE-US-00003 AGCGAAAGCAGGTAGATATTGAAAGATGAGTCTTCTAACCGAGGTCGAAA CGTACGTACTCTCTATCATCCCGTCAGGCCCCCTCAAAGCCGAGATCGCA CAGAGACTTGAAGATGTCTTTGCAGGGAAGAACACTGATCTTGAGGTTCT CATGGAATGGCTAAAGACAAGACCAATCCTGTCACCTCTGACTAAGGGGA TTTTAGGATTTGTGTTCACGCTCACCGTGCCCAGTGAGCGAGGACTGCAG CGTAGACGCTTTGTCCAAAATGCCCTTAATGGGAACGGGGATCCAAATAA CATGGACAAAGCAGTTAAACTGTATAGGAAGCTCAAGAGGGAGATTAACA TTCCATGGGGCCAAAGAAATCTCACTCAGTTATTCTGCTGGTGCACTTGC CAGTTGTATGGGCCTCATATACAACAGGATGGGGGCTGTGACCACTGAAG TGGCATTTGGCCTGGTATGTGCAACCTGTGAACAGATTGCTGACTCCCAG CATCGGTCTCATAGGCAAATGGTGACAACAACCAATCCACTAATCAGACA TGAGAACAGAATGGTTTTAGCCAGCACTACAGCTAAGGCTATGGAGCAAA TGGCTGGATCGAGTGAGCAAGCAGCAGAGGCCATGGAGGTTGCTAGTCAG GCTAGACAAATGGTGCAAGCGATGAGAACCATTGGGACTCATCCTAGCTC CAGTGCTGGTCTGAAAAATGATCTTCTTGAAAATTTGCAGGCCTATCAGA AACGAATGGGGGTGCAGATGCAACGGTTCAAGTGATCCTCTCGCTATTGC CGCAAATATCATTGGGATCTTGCACTTGACATTGTGGATTCTTGATCGTC TTTTTTTCAAATGCATTTACCGTCGCTTTAAATACGGACTGAAAGGAGGG CCTTCTACGGAAGGAGTGCCAAAGTCTATGAGGGAAGAATATCGAAAGGA ACAGCAGAGTGCTGTGGATGCTGACGATGGTCATTTTGTCAGCATAGAGC TGGAGTAAAAAACTACCTTGTTTCTACT

[0052] The amino acid sequence of the M1 protein of Influenza A/Puerto Rico/8/34/Mount Sinai(H1N1), encoded by nucleotides 26 to 784 of SEQ ID NO:3 is as follows, referred to herein as SEQ ID NO:4: TABLE-US-00004 MSLLTEVETYVLSIIPSGPLKAEIAQRLEDVFAGKNTDLEVLMEWLKTRP ILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDKAVKLY RKLKREITFHGAKEISLSYSAGALASGMGLIYNRMGAVTTEVAFGLVCAT CEQIADSQHRSHRQMVTTNPLIRHENRMVLASTTAKAMEQMAGSSEQAAE AMEVASQARQMVQAMRTIGTHPSSSAGLKNDLLENLQAYQKRMGVQMQ RFK
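Coordinate ranges such as "nucleotides 26 to 784" are read here as 1-based and inclusive. As a worked check under that assumption, the short sketch below counts the codons spanned by a range; the helper name codons_in_range is hypothetical. For the range 26 to 784 it gives 759 nucleotides, or 253 codons, which would accommodate 252 residues plus a stop codon if the range includes the stop codon.

```python
def codons_in_range(start, end):
    """Number of whole codons in a 1-based, inclusive nucleotide range.

    Raises ValueError when the range is empty or is not a multiple of
    three, since such a range cannot be a complete reading frame.
    """
    length = end - start + 1
    if length <= 0:
        raise ValueError("end must not precede start")
    if length % 3 != 0:
        raise ValueError("range length %d is not a multiple of 3" % length)
    return length // 3


# Range given above for the M1 coding region of SEQ ID NO:3.
print(codons_in_range(26, 784))
```

The same check can be applied to any of the other coding ranges quoted in this section to confirm that a stated range and the length of the listed protein are compatible.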

[0053] The amino acid sequence of the M2 protein of Influenza A/Puerto Rico/8/34/Mount Sinai (H1N1), encoded (in spliced form) by nucleotides 26 to 51 and 740 to 1007 of SEQ ID NO:3 is as follows, referred to herein as SEQ ID NO:5: TABLE-US-00005 MSLLTEVETPIRNEWGCRCNGSSDPLAIAANIIGILHLTLWILDRLFFKC IYRRFKYGLKGGPSTEGVPKSMREEYRKEQQSAVDADDGHFVSIELE
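The spliced reading of M2 can be sketched as follows: two 1-based, inclusive nucleotide ranges are cut from the segment, joined, and translated. The example deliberately uses a shortened toy template rather than the full SEQ ID NO:3. The first 51 nucleotides are those of SEQ ID NO:3, the 10-nucleotide "intron" is a made-up placeholder, and the second exon is only the first 22 nucleotides of the M2-specific sequence as it appears in SEQ ID NO:6, so the second range (62 to 83) is specific to this toy template. The function names splice and translate are hypothetical.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built in TCAG order (first, second, third base).
CODON_TO_AA = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def splice(seq, segments):
    """Join 1-based, inclusive (start, end) ranges of seq in the given order."""
    return "".join(seq[start - 1:end] for start, end in segments)


def translate(cds):
    """Translate codon by codon; a trailing partial codon is ignored."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, usable, 3))


leader = "AGCGAAAGCAGGTAGATATTGAAAG"    # nt 1-25 of SEQ ID NO:3
exon1 = "ATGAGTCTTCTAACCGAGGTCGAAAC"    # nt 26-51 of SEQ ID NO:3
toy_intron = "GTAAAAAAAG"               # placeholder, not the real intron
exon2_start = "GCCTATCAGAAACGAATGGGGG"  # start of the second exon (truncated)

toy_template = leader + exon1 + toy_intron + exon2_start
mrna = splice(toy_template, [(26, 51), (62, 83)])
print(translate(mrna))
```

The first exon ends two nucleotides into a codon, so the ninth codon (ACG) is completed by the first nucleotide of the second exon, which is why the ranges must be joined before translation.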

[0054] The extracellular region of the M2 protein (eM2) corresponds to the first 24 amino acids of the N-terminal end of the protein (MSLLTEVETPIRNEWGCRCNGSSD), and is underlined above. See Fischer, W. B. et al., Biochim. Biophys. Acta. 1561:27-45 (2002); Zhong, Q. et al., FEBS Lett. 434:265-71 (1998).

[0055] A derivative of NP and eM2 described herein is encoded by a construct which encodes the first 24 amino acids of M2 and all or a portion of NP. The fusion constructs may be constructed with the eM2 sequences followed by the NP sequences, or with the NP sequences followed by the eM2 sequences. Exemplary fusion constructs using the NP and M2 sequences from Influenza A/PR/8/34 (H1N1) are set out below. A sequence, using the original influenza virus nucleotide sequences, which encodes the first 24 amino acids of M2 fused at its 3' end to a sequence which encodes NP in its entirety eM2-NP is referred to herein as SEQ ID NO:6: TABLE-US-00006 1 ATGAGTCTTC TAACCGAGGT CGAAACGCCT ATCAGAAACG AATGGGGGTG CAGATGCAAC 61 GGTTCAAGTG ATATGGCGTC TCAAGGCACC AAACGATCTT ACGAACAGAT GGAGACTGAT 121 GGAGAACGCC AGAATGCCAC TGAAATCAGA GCATCCGTCG GAAAAATGAT TGGTGGAATT 181 GGACGATTCT ACATCCAAAT GTGCACCGAA CTCAAACTCA GTGATTATGA GGGACGGTTG 241 ATCCAAAACA GCTTAACAAT AGAGAGAATG GTGCTCTCTG CTTTTGACGA AAGGAGAAAT 301 AAATACCTTG AAGAACATCC CAGTGCGGGG AAAGATCCTA AGAAAACTGG AGGACCTATA 361 TACAGGAGAG TAAACGGAAA GTGGATGAGA GAACTCATCC TTTATGACAA AGAAGAAATA 421 AGGCGAATCT GGCGCCAAGC TAATAATGGT GACGATGCAA CGGCTGGTCT GACTCACATG 481 ATGATCTGGC ATTCCAATTT GAATGATGCA ACTTATCAGA GGACAAGAGC TCTTGTTCGC 541 ACCGGAATGG ATCCCAGGAT GTGCTCTCTG ATGCAAGGTT CAACTCTCCC TAGGAGGTCT 601 GGAGCCGCAG GTGCTGCAGT CAAAGGAGTT GGAACAATGG TGATGGAATT GGTCAGAATG 661 ATCAAACGTG GGATCAATGA TCGGAACTTC TGGAGGGGTG AGAATGGACG AAAAACAAGA 721 ATTGCTTATG AAAGAATGTG CAACATTCTC AAAGGGAAAT TTCAAACTGC TGCACAAAAA 781 GCAATGATGG ATCAAGTGAG AGAGAGCCGG AACCCAGGGA ATGCTGAGTT CGAAGATCTC 841 ACTTTTCTAG CACGGTCTGC ACTCATATTG AGAGGGTCGG TTGCTCACAA GTCCTGCCTG 901 CCTGCCTGTG TGTATGGACC TGCCGTAGCC AGTGGGTACG ACTTTGAAAG GGAGGGATAC 961 TCTCTAGTCG GAATAGACCC TTTCAGACTG CTTCAAAACA GCCAAGTGTA CAGCCTAATC 1021 AGACCAAATG AGAATCCAGC ACACAAGAGT CAACTGGTGT GGATGGCATG CCATTCTGCC 1081 GCATTTGAAG ATCTAAGAGT ATTAAGCTTC ATCAAAGGGA CGAAGGTGCT CCCAAGAGGG 1141 AAGCTTTCCA 
CTAGAGGAGT TCAAATTGCT TCCAATGAAA ATATGGAGAC TATGGAATCA 1201 AGTACACTTG AACTGAGAAG CAGGTACTGG GCCATAAGGA CCAGAAGTGG AGGAAACACC 1261 AATCAACAGA GGGCATCTGC GGGCCAAATC AGCATACAAC CTACGTTCTC AGTACAGAGA 1321 AATCTCCCTT TTGACAGAAC AACCGTTATG GCAGCATTCA GTGGGAATAC AGAGGGGAGA 1381 ACATCTGACA TGAGGACCGA AATCATAAGG ATGATGGAAA GTGCAAGACC AGAAGATGTG 1441 TCTTTCCAGG GGCGGGGAGT CTTCGAGCTC TCGGACGAAA AGGCAGCGAG CCCGATCGTG 1501 CCTTCCTTTG ACATGAGTAA TGAAGGATCT TATTTCTTCG GAGACAATGC AGAGGAATAC 1561 GATAAT

[0056] The amino acid sequence of the eM2-NP fusion protein of Influenza A/PR/8/34 (H1N1), encoded by nucleotides 1 to 1566 of SEQ ID NO:6 is as follows, referred to herein as SEQ ID NO:7 (eM2 amino acid sequence underlined): TABLE-US-00007 MSLLTEVETPIRNEWGCRCNGSSDMASQGTKRSYEQMETDGERQNATEIR ASVGKMIGGIGRFYIQMCTELKLSDYEGRLIQNSLTIERMVLSAFDERRN KYLEEHPSAGKDPKKTGGPIYRRVNGKWMRELILYDKEEIRRIWRQANNG DDATAGLTHMMIWHSNLNDATYQRTRALVRTGMDPRMCSLMQGSTLPRRS GAAGAAVKGVGTMVMELVRMIKRGINDRNFWRGENGRKTRIAYERMCNIL KGKFQTAAQKAMMDQVRESRNPGNAEFEDLTFLARSALILRGSVAHKSCL PACVYGPAVASGYDFEREGYSLVGIDPFRLLQNSQVYSLIRPNENPAHKS QLVWMACHSAAFEDLRVLSFIKGTKVLPRGKLSTRGVQIASNENMETMES STLELRSRYWAIRTRSGGNTNQQRASAGQISIQPTFSVQRNLPFDRTTVM AAFSGNTEGRTSDMRTEIIRMMESARPEDVSFQGRGVFELSDEKAASPIV PSFDMSNEGSYFFGDNAEEYDN

[0057] A sequence, using the original influenza virus nucleotide sequences, which encodes NP in its entirety fused at its 3' end to a sequence which encodes the first 24 amino acids of M2 (NP-eM2) is referred to herein as SEQ ID NO:8: TABLE-US-00008 ATGGCGTCTCAAGGCACCAAACGATCTTACGAACAGATGGAGACTGATGG AGAACGCCAGAATGCCACTGAAATCAGAGCATCCGTCGGAAAAATGATTG GTGGAATTGGACGATTCTACATCCAAATGTGCACCGAACTCAAACTCAGT GATTATGAGGGACGGTTGATCCAAAACAGCTTAACAATAGAGAGAATGGT GCTCTCTGCTTTTGACGAAAGGAGAAATAAATACCTTGAAGAACATCCCA GTGCGGGGAAAGATCCTAAGAAAACTGGAGGACCTATATACAGGAGAGTA AACGGAAAGTGGATGAGAGAACTCATCCTTTATGACAAAGAAGAAATAAG GCGAATCTGGCGCCAAGCTAATAATGGTGACGATGCAACGGCTGGTCTGA CTCACATGATGATCTGGCATTCCAATTTGAATGATGCAACTTATCAGAGG ACAAGAGCTCTTGTTCGCACCGGAATGGATCCCAGGATGTGCTCTCTGAT GCAAGGTTCAACTCTCCCTAGGAGGTCTGGAGCCGCAGGTGCTGCAGTCA AAGGAGTTGGAACAATGGTGATGGAATTGGTCAGAATGATCAAACGTGGG ATCAATGATCGGAACTTCTGGAGGGGTGAGAATGGACGAAAAACAAGAAT TGCTTATGAAAGAATGTGCAACATTCTCAAAGGGAAATTTCAAACTGCTG CACAAAAAGCAATGATGGATCAAGTGAGAGAGAGCCGGAACCCAGCGAAT GCTGAGTTCGAAGATCTCACTTTTCTAGCACGGTCTGCACTCATATTGAG AGCGTCGGTTGCTCACAAGTCCTGCCTGCCTGCCTGTGTGTATGGACCTG CCGTAGCCAGTGGGTACGACTTTGAAAGGGAGGGATACTCTCTAGTCGGA ATAGACCCTTTCAGACTGCTTCAAAACAGCCAAGTGTACAGCCTAATCAG ACCAAATGAGAATCCAGCACACAAGAGTCAACTGGTGTGGATGGCATGCC ATTCTGCCGCATTTGAAGATCTAAGAGTATTAAGCTTCATCAAAGGGACG AAGGTGCTCCCAAGAGGGAAGCTTTCCACTAGAGGAGTTCAAATTGCTTC CAATGAAAATATGGAGACTATGGAATCAAGTACACTTGAACTGAGAAGCA GGTACTGGGCCATAAGGACCAGAAGTGGAGGAAACACCAATCAACAGAGG GCATCTGCGGGCCAAATCAGCATACAACCTACGTTCTCAGTACAGAGAAA TCTCCCTTTTGACAGAACAACCGTTATGGCAGCATTCAGTGGGAATACAG AGGGGAGAACATCTGACATGAGGACCGAAATCATAAGGATGATGGAAAGT GCAAGACCAGAAGATGTGTCTTTCCAGGGGCGGGGAGTCTTCGAGCTCTC GGACGAAAAGGCAGCGAGCCCGATCGTGCCTTCCTTTGACATGAGTAATG AAGGATCTTATTTCTTCGGAGACAATGCAGAGGAATACGATAATATGAGT CTTCTAACCGAGGTCGAAACGCCTATCAGAAACGAATGGGGGTGCAGATG CAACGGTTCAAGTGAT

[0058] The amino acid sequence of the NP-eM2 fusion protein of Influenza A/PR/8/34 (H1N1), encoded by nucleotides 1 to 1566 of SEQ ID NO:8 is as follows, referred to herein as SEQ ID NO:9 (eM2 amino acid sequence underlined): TABLE-US-00009 MASQGTKRSYEQMETDGERQNATEIRASVGKMIGGIGRFYIQMCTELKLS DYEGRLIQNSLTIERMVLSAFDERRNKYLEEHPSAGKDPKKTGGPIYRRV NGKWMRELILYDKEEIRRIWRQANNGDDATAGLTHMMIWHSNLNDATYQR TRALVRTGMDPRMCSLMQGSTLPRRSGAAGAAVKGVGTMVMELVRMIKRG INDRNFWRGENGRKTRIAYERMCNILKGKFQTAAQKAMMDQVRESRNPGN AEFEDLTFLARSALILRGSVAHKSCLPACVYGPAVASGYDFEREGYSLVG IDPFRLLQNSQVYSLIRPNENPAHKSQLVWMACHSAAFEDLRVLSFIKGT KVLPRGKLSTRGVQIASNENMETMESSTLELRSRYWAIRTRSGGNTNQQR ASAGQISIQPTFSVQRNLPFDRTTVMAAFSGNTEGRTSDMRTEIIRMMES ARPEDVSFQGRGVFELSDEKAASPIVPSFDMSNEGSYFFGDNAEEYDN MSLLTEVETPIRNEWGCRCNGSSD

[0059] The construction of functional fusion proteins often requires a linker sequence between the two fused fragments that adopts an extended conformation to allow maximal flexibility. We used the program LINKER (Crasto, C. J. and Feng, J., Protein Engineering 13:309-312 (2000); program publicly available at http://chutney.med.yale.edu/linker/linker.html (visited Apr. 16, 2003)), which can automatically generate a set of linker sequences that are known to adopt extended conformations as determined by X-ray crystallography and NMR. Examples of suitable linkers to use in various eM2-NP or NP-eM2 fusion proteins are as follows: TABLE-US-00010
1. GYNTRA (SEQ ID NO:10)
2. FQMGET (SEQ ID NO:11)
3. FDRVKHLK (SEQ ID NO:12)
4. GRNTNGVIT (SEQ ID NO:13)
5. VNEKTIPDHD (SEQ ID NO:14)
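As a minimal sketch of how one of the listed linkers could be placed between the two fusion partners at the protein level, the following joins eM2, a linker, and the start of NP. The function name fuse and the residue check are hypothetical, and only the first ten residues of NP (from SEQ ID NO:2) are used to keep the example short. The sketch does not indicate which linker, if any, performs best.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")

EM2 = "MSLLTEVETPIRNEWGCRCNGSSD"  # first 24 residues of M2 (SEQ ID NO:5)
NP_START = "MASQGTKRSY"           # first 10 residues of NP (SEQ ID NO:2)
LINKERS = ["GYNTRA", "FQMGET", "FDRVKHLK", "GRNTNGVIT", "VNEKTIPDHD"]


def fuse(n_terminal, c_terminal, linker=""):
    """Concatenate two polypeptides with an optional linker in between."""
    for part in (n_terminal, linker, c_terminal):
        unknown = set(part) - AMINO_ACIDS
        if unknown:
            raise ValueError("unexpected residue(s): %s" % sorted(unknown))
    return n_terminal + linker + c_terminal


# eM2 followed by NP, with and without the first listed linker.
print(fuse(EM2, NP_START))
print(fuse(EM2, NP_START, LINKERS[0]))
# NP followed by eM2, the reverse orientation.
print(fuse(NP_START, EM2, LINKERS[0]))
```

At the nucleotide level the same construct would be assembled by inserting a coding sequence for the linker between the two coding regions while keeping both in the same reading frame.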

[0060] The nucleotide sequence of the NP protein of Influenza B/LEE/40 is available as GenBank Accession Number K01395, and has the following sequence, referred to herein as SEQ ID NO:15: TABLE-US-00011 1 ATGTCCAACA TGGATATTGA CAGTATAAAT ACCGGAACAA TCGATAAAAC ACCAGAAGAA 61 CTGACTCCCG GAACCAGTGG GGCAACCAGA CCAATCATCA AGCCAGCAAC CCTTGCTCCG 121 CCAAGCAACA AACGAACCCG AAATCCATCT CCAGAAAGGA CAACCACAAG CAGTGAAACC 181 GATATCGGAA GGAAAATCCA AAAGAAACAA ACCCCAACAG AGATAAAGAA GAGCGTCTAC 241 AAAATGGTGG TAAAACTGGG TGAATTCTAC AACCAGATGA TGGTCAAAGC TGGACTTAAT 301 GATGACATGG AAAGGAATCT AATTCAAAAT GCACAAGCTG TGGAGAGAAT CCTATTGGCT 361 GCAACTGATG ACAAGAAAAC TGAATACCAA AAGAAAAGGA ATGCCAGAGA TGTCAAAGAA 421 GGGAAGGAAG AAATAGACCA CAACAAGACA GGAGGCACCT TTTATAAGAT GGTAAGAGAT 481 GATAAAACCA TCTACTTCAG CCCTATAAAA ATTACCTTTT TAAAAGAAGA GGTGAAAACA 541 ATGTACAAGA CCACCATGGG GAGTGATGGT TTCAGTGGAC TAAATCACAT TATGATTGGA 601 CATTCACAGA TGAACGATGT CTGTTTCCAA AGATCAAAGG GACTGAAAAG GGTTGGACTT 661 GACCCTTCAT TAATCAGTAC TTTTGCCGGA AGCACACTAC CCAGAAGATC AGGTACAACT 721 GGTGTTGCAA TCAAAGGAGG TGGAACTTTA GTGGATGAAG CCATCCGATT TATAGGAAGA 781 GCAATGGCAG ACAGAGGGCT ACTGAGAGAC ATCAAGGCCA AGACGGCCTA TGAAAAGATT 841 CTTCTGAATC TGAAAAACAA GTGCTCTGCG CCGCAACAAA AGGCTCTAGT TGATCAAGTG 901 ATCGGAAGTA GGAACCCAGG GATTGCAGAC ATAGAAGACC TAACTCTGCT TGCCAGAAGC 961 ATGGTAGTTG TCAGACCCTC TGTAGCGAGC AAAGTGGTGC TTCCCATAAG CATTTATGCT 1021 AAAATACCTC AACTAGGATT CAATACCGAA GAATACTCTA TGGTTGGGTA TGAAGCCATG 1081 GCTCTTTATA ATATGGCAAC ACCTGTTTCC ATATTAAGAA TGGGAGATGA CGCAAAAGAT 1141 AAATCTCAAC TATTCTTCAT GTCGTGCTTC GGAGCTGCCT ATGAAGATCT AAGAGTGTTA 1201 TCTGCACTAA CGGGCACCGA ATTTAAGCCT AGATCAGCAC TAAAATGCAA GGGTTTCCAT 1261 GTCCCGGCTA AGGAGCAAGT AGAAGGAATG GGGGCAGCTC TGATGTCCAT CAAGCTTCAG 1321 TTCTGGGCCC CAATGACCAG ATCTGGAGGG AATGAAGTAA GTGGAGAAGG AGGGTCTGGT 1381 CAAATAAGTT GCAGCCCTGT GTTTGCAGTA GAAAGACCTA TTGCTCTAAG CAAGCAAGCT 1441 GTAAGAAGAA TGCTGTCAAT GAACGTTGAA GGACGTGATG CAGATGTCAA AGGAAATCTA 1501 CTCAAAATGA TGAATGATTC AATGGCAAAG 
AAAACCAGTG GAAATGCTTT CATTGGGAAG 1561 AAAATGTTTC AAATATCAGA CAAAAACAAA GTCAATCCCA TTGAGATTCC AATTAAGCAG 1621 ACCATCCCCA ATTTCTTCTT TGGGAGGGAC ACAGCAGAGG ATTATGATGA CCTCGATTAT 1681 TAA

[0061] The amino acid sequence of the NP protein of IBV B/LEE/40, encoded by nucleotides 1-1680 of SEQ ID NO:15 is as follows, referred to herein as SEQ ID NO:16: TABLE-US-00012 MSNMDIDSINTGTIDKTPEELTPGTSGATRPIIKPATLAPPSNKRTRNPS PERTTTSSETDIGRKIQKKQTPTEIKKSVYKMVVKLGEFYNQMMVKAGLN DDMERNLIQNAQAVERILLAATDDKKTEYQKKRNARDVKEGKEEIDHNKT GGTFYKMVRDDKTIYFSPIKITFLKEEVKTMYKTTMGSDGFSGLNHIMIG HSQMNDVCFQRSKGLKRVGLDPSLISTFAGSTLPRRSGTTGVAIKGGGTL VDEAIRFIGRAMADRGLLRDIKAKTAYEKILLNLKNKCSAPQQKALVDQV IGSRNPGIADIEDLTLLARSMVVVRPSVASKVVLPISIYAKIPQLGFNTE EYSMVGYEAMALYNMATPVSILRMGDDAKDKSQLFFMSCFGAAYEDLRVL SALTGTEFKPRSALKCKGFHVPAKEQVEGMGAALMSIKLQFWAPMTRSGG NEVSGEGGSGQISCSPVFAVERPIALSKQAVRRMLSMNVEGRDADVKGNL LKMMNDSMAKKTSGNAFIGKKMFQISDKNKVNPIEIPIKQTIPNFFFGRD TAEDYDDLDY

[0062] Non-limiting examples of nucleotide sequences encoding the IAV hemagglutinin (HA) are as follows. It should be noted that HA sequences vary significantly between IV subtypes. Virtually any nucleotide sequence encoding an IV HA is suitable for the present invention. In fact, HA sequences included in vaccines and therapeutic formulations of the present invention (discussed in more detail below) might change from year to year depending on the prevalent strain or strains of IV.

[0063] The partial nucleotide sequence of the HA protein of IAV A/New_York/1/18(H1N1) is available as GenBank Accession Number AF116576, and has the following sequence, referred to herein as SEQ ID NO: 17: TABLE-US-00013 1 atggaggcaa gactactggt cttgttatgt gcatttgcag ctacaaatgc agacacaata 61 tgtataggct accatgcgaa taactcaacc gacactgttg acacagtact cgaaaagaat 121 gtgaccgtga cacactctgt taacctgctc gaagacagcc acaacggaaa actatgtaaa 181 ttaaaaggaa tagccccatt acaattgggg aaatgtaata tcgccggatg gctcttggga 241 aacccggaat gcgatttact gctcacagcg agctcatggt cctatattgt agaaacatcg 301 aactcagaga atggaacatg ttacccagga gatttcatcg actatgaaga actgagggag 361 caattgagct cagtgtcatc gtttgaaaaa ttcgaaatat ttcccaagac aagctcgtgg 421 cccaatcatg aaacaaccaa aggtgtaacg gcagcatgct cctatgcggg agcaagcagt 481 ttttacagaa atttgctgtg gctgacaaag aagggaagct catacccaaa gcttagcaag 541 tcctatgtga acaataaagg gaaagaagtc cttgtactat ggggtgttca tcatccgcct 601 accggtactg atcaacagag tctctatcag aatgcagatg cttatgtctc tgtagggtca 661 tcaaaatata acaggagatt caccccggaa atagcagcga gacccaaagt aagaggtcaa 721 gctgggagga tgaactatta ctggacatta ctagaacccg gagacacaat aacatttgag 781 gcaactggaa atctaatagc accatggtat gctttcgcac tgaatagagg ttctggatcc 841 ggtatcatca cttcagacgc accagtgcat gattgtaaca cgaagtgtca aacaccccat 901 ggtgctataa acagcagtct ccctttccag aatatacatc cagtcacaat aggagagtgc 961 ccaaaatacg tcaggagtac caaattgagg atggctacag gactaagaaa cattccatct 1021 attcaatcca ggggtctatt tggagccatt gccggtttta ttgagggggg atggactgga 1081 atgatagatg gatggtatgg ttatcatcat cagaatgaac agggatcagg ctatgcagcg 1141 gatcaaaaaa gcacacaaaa tgccattgac gggattacaa acaaggtgaa ttctgttatc 1201 gagaaaatga acacccaatt

[0064] The amino acid sequence of the partial HA protein of IAV A/New_York/1/18(H1N1), encoded by nucleotides 1 to 1218 of SEQ ID NO:17 is as follows, referred to herein as SEQ ID NO:18: TABLE-US-00014 MEARLLVLLCAFAATNADTICIGYHANNSTDTVDTVLEKNVTVTHSVNLL EDSHNGKLCKLKGIAPLQLGKCNIAGWLLGNPECDLLLTASSWSYIVETS NSENGTGYPGDFLDYEELREQLSSVSSFEKFEIFPKTSSWPNHETTKGVT AACSYAGASSFYRNLLWLTKGSSYPKLSKSYVNNKGKEVLVLWGVHHPPT GTDQQSLYQNADAYVSVGSSKYNRRFTPEIAARPKVRGQAGRMNYYWTLL EPGDTITFEATGNLIAPWYAFALNRGSGSGIITSDAPVHDCNTKCQTPHG AINSSLPFQNIHPVTIGECPKYVRSTKLRMATGLRNIPSIQSRGLFGAIA GFIEGGWTGMIDGWYGYHHQNEQGSGYAADQKSTQNAIDGITNKVNSVIE KMNTQ

[0065] The nucleotide sequence of the IAV A/Hong Kong/482/97 hemagglutinin (H5) is available as GenBank Accession Number AF046098, and has the following sequence, referred to herein as SEQ ID NO:19: TABLE-US-00015 1 ctgtcaaaat ggagaaaata gtgcttcttc ttgcaacagt cagtcttgtt aaaagtgatc 61 agatttgcat tggttaccat gcaaacaact cgacagagca ggttgacaca ataatggaaa 121 agaatgttac tgttacacat gcccaagaca tactggaaag gacacacaac gggaagctct 181 gcgatctaaa tggagtgaaa cctctcattt tgagggattg tagtgtagct ggatggctcc 241 tcggaaaccc tatgtgtgac gaattcatca atgtgccgga atggtcttac atagtggaga 301 aggccagtcc agccaatgac ctctgttatc cagggaattt caacgactat gaagaactga 361 aacacctatt gagcagaata aaccattttg agaaaattca gatcatcccc aaaagttctt 421 ggtccaatca tgatgcctca tcaggggtga gctcagcatg tccatacctt gggaggtcct 481 cctttttcag aaatgtggta tggcttatca aaaagaacag tgcataccca acaataaaga 541 ggagctacaa taataccaac caagaagatc ttttggtact gtgggggatt caccatccta 601 atgatgcggc agaycagaca aagctctatc aaaatccaac cacctacatt tccgttggaa 661 catcaacact gaaccagaga ttggttccag aaatagctac tagacccaaa gtaaacgggc 721 aaagtggaag aatggagttc ttctggacaa ttttaaagcc gaatgatgcc atcaatttcg 781 agagtaatgg aaatttcatt gccccagaat atgcatacaa aattgtcaag aaaggggact 841 caacaattat gaaaagtgaa ttggaatatg gtaactgcaa caccaagtgt caaactccaa 901 tgggggcgat aaactctagt atgccattcc acaacataca ccccctcacc atcggggaat 961 gccccaaata tgtgaaatca aacagattag ttcttgcgac tggactcaga aatacccctc 1021 aaagggagag aagaagaaaa aagagaggac tatttggagc tatagcaggt tttatagagg 1081 gaggatggca gggcatggta gatggttggt atgggtacca ccatagcaat gagcagggga 1141 gtggatacgc tgcagacaaa gaatccactc aaaaggcaat agatggagtc accaataagg 1201 tcaactcgat cattaacaaa atgaacactc agtttgaggc cgttggaagg gaatttaata 1261 acttagaaag gagaatagag aatttaaaca agaaaatgga agacggattc ctagatgtct 1321 ggacttacaa tgctgaactt ctggttctca tggaaaatga gagaactctc gactttcatg 1381 actcaaatgt caagaacctt tacgacaagg tccgactaca gcttagggat aatgcaaagg 1441 aactgggtaa tggttgtttc gaattctatc acaaatgtga taatgaatgt atggaaagtg 1501 taaaaaacgg aacgtatgac 
tacccgcagt attcagaaga agcaagacta aacagagagg 1561 aaataagtgg agtaaaattg gaatcaatgg gaacttacca aatactgtca atttattcaa 1621 cagtggcgag ttccctagca ctggcaatca tggtagctgg tctatcttta tggatgtgct 1681 ccaatggatc gttacaatgc agaatttgca tttaaatttg tgagttcaga ttgtagttaa 1741 a

[0066] The amino acid sequence of the HA protein of IAV A/Hong Kong/482/97 (H5), encoded by nucleotides 9 to 1715 of SEQ ID NO:19 is as follows, referred to herein as SEQ ID NO:20: TABLE-US-00016 MEKIVLLLATVSLVKSDQICIGYHANNSTEQVDTIMEKNVTVTHAQDILE RTHNGKLGDLNGVKPLILRDCSVAGWLLGNPMCDEFINVPEWSYIVEKAS PANDLCYPGNFNDYEELKHLLSRINHFEKIQIIPKSSWSNHDASSGVSSA GPYLGRSSFFRNVVWLIKKNSAYPTIKRSYNNTNQEDLLVLWGIHHPNDA AEQTKLYQNPTTYISVGTSTLNQRLVPEIATRPKVNGQSGRMEFFWTILK PNDAINFESNGNFIAPEYAYKIVKKGDSTIMKSELEYGNCNTKCQTPMGA INSSMPFHNIHPLTIGECPKYVKSNRLVLATGLRNTPQRERRRKKRGLFG AIAGFIEGGWQGMVDGWYGYHHSNEQGSGYAADKESTQKAIDGVTNKVNS IINKMNTQFEAVGREFNNLERRIENLNKKMEDGFLDVWTYNAELLVLMEN ERTLDFHDSNVKNLYDKVRLQLRDNAKELGNGCFEFYHKCDNECMESVKN GTYDYPQYSEEARLNREEISGVKLESMGTYQILSIYSTVASSLALAIMVA GLSLWMCSNGSLQCRICI

[0067] The nucleotide sequence of the IAV A/Hong Kong/1073/99(H9N2) is available as GenBank Accession Number INA404626, and has the following sequence, referred to herein as SEQ ID NO:21: TABLE-US-00017 1 gcaaaagcag gggaattact taactagcaa aatggaaaca atatcactaa taactatact 61 actagtagta acagcaagca atgcagataa aatctgcatc ggccaccagt caacaaactc 121 cacagaaact gtggacacgc taacagaaac caatgttcct gtgacacatg ccaaagaatt 181 gctccacaca gagcataatg gaatgctgtg tgcaacaagc ctgggacatc ccctcattct 241 agacacatgc actattgaag gactagtcta tggcaaccct tcttgtgacc tgctgttggg 301 aggaagagaa tggtcctaca tcgtcgaaag atcatcagct gtaaatggaa cgtgttaccc 361 tgggaatgta gaaaacctag aggaactcag gacacttttt agttccgcta gttcctacca 421 aagaatccaa atcttcccag acacaacctg gaatgtgact tacactggaa caagcagagc 481 atgttcaggt tcattctaca ggagtatgag atggctgact caaaagagcg gtttttaccc 541 tgttcaagac gcccaataca caaataacag gggaaagagc attcttttcg tgtggggcat 601 acatcaccca cccacctata ccgagcaaac aaatttgtac ataagaaacg acacaacaac 661 aagcgtgaca acagaagatt tgaataggac cttcaaacca gtgatagggc caaggcccct 721 tgtcaatggt ctgcagggaa gaattgatta ttattggtcg gtactaaaac caggccaaac 781 attgcgagta cgatccaatg ggaatctaat tgctccatgg tatggacacg ttctttcagg 841 agggagccat ggaagaatcc tgaagactga tttaaaaggt ggtaattgtg tagtgcaatg 901 tcagactgaa aaaggtggct taaacagtac attgccattc cacaatatca gtaaatatgc 961 atttggaacc tgccccaaat atgtaagagt taatagtctc aaactggcag tcggtctgag 1021 gaacgtgcct gctagatcaa gtagaggact atttggagcc atagctggat tcatagaagg 1081 aggttggcca ggactagtcg ctggctggta tggtttccag cattcaaatg atcaaggggt 1141 tggtatggct gcagataggg attcaactca aaaggcaatt gataaaataa catccaaggt 1201 gaataatata gtcgacaaga tgaacaagca atatgaaata attgatcatg aattcagtga 1261 ggttgaaact agactcaata tgatcaataa taagattgat gaccaaatac aagacgtatg 1321 ggcatataat gcagaattgc tagtactact tgaaaatcaa aaaacactcg atgagcatga 1381 tgcgaacgtg aacaatctat ataacaaggt gaagagggca ctgggctcca atgctatgga 1441 agatgggaaa ggctgtttcg agctatacca taaatgtgat gatcagtgca tggaaacaat 1501 tcggaacggg acctataata ggagaaagta 
tagagaggaa tcaagactag aaaggcagaa 1561 aatagagggg gttaagctgg aatctgaggg aacttacaaa atcctcacca tttattcgac 1621 tgtcgcctca tctcttgtgc ttgcaatggg gtttgctgcc ttcctgttct gggccatgtc 1681 caatggatct tgcagatgca acatttgtat ataa

[0068] The amino acid sequence of the HA protein of IAV A/Hong Kong/1073/99 (H9N2), encoded by nucleotides 32 to 1711 of SEQ ID NO:21 is as follows, referred to herein as SEQ ID NO:22: TABLE-US-00018 METISLITILLVVTASNADKICIGHQSTNSTETVDTLTETNVPVTHAKEL LHTEHNGMLCATSLGHPLILDTCTIEGLVYGNPSCDLLLGGREWSYIVER SSAVNGTCYPGNVENLEELRTLFSSASSYQRIQIFPDTTWNVTYTGTSRA CSGSFYRSMRWLTQKSGFYPVQDAQYTNNRGKSILFVWGIHHPPTYTEQT NLYIRNDTTTSVTTEDLNRTFKPVIGPRPLVNGLQGRIDYYWSVLKPGQT LRVRSNGNLIAPWYGHVLSGGSHGRILKTDLKGGNCVVQCQTEKGGLNST LPFHNISKYAFGTCPKYVRVNSLKLAVGLRNVPARSSRGLFGAIAGFIEG GWPGLVAGWYGFQHSNDQGVGMAADRDSTQKAIDKITSKVNNIVDKMNKQ YEIIDHEFSEVETRLNMINNKIDDQIQDVWAYNAELLVLLENQKTLDEHD ANVNNLYNKVKRALGSNAMEDGKGCFELYHKCDDQCMETIRNGTYNRRKY REESRLERQKIEGVKLESEGTYKILTIYSTVASSLVLAMGFAAFLFWAMS NGSCRCNICI
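By way of illustration only, the coordinate convention used above ("encoded by nucleotides 32 to 1711") can be checked with a short Python sketch. The helper names coding_span and extract_span are hypothetical and are not part of the disclosure; the sketch assumes that the stated coordinates are 1-based and inclusive.

```python
def coding_span(start: int, end: int) -> tuple[int, int]:
    """Return (nucleotide count, codon count) for a 1-based, inclusive span."""
    length = end - start + 1
    if length % 3 != 0:
        raise ValueError("span is not a whole number of codons")
    return length, length // 3


def extract_span(sequence: str, start: int, end: int) -> str:
    """Slice a sequence using 1-based, inclusive coordinates."""
    return sequence[start - 1:end]


# Coordinates quoted above for the HA coding sequence within SEQ ID NO:21.
nucleotides, codons = coding_span(32, 1711)
print(nucleotides, codons)  # 1680 560

# Toy example of the slicing convention (not a sequence from the disclosure).
print(extract_span("ccATGGAAtt", 3, 8))  # ATGGAA
```

Under that assumption, the quoted span covers 1,680 nucleotides, i.e., 560 codons.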

[0069] The present invention also provides vaccine compositions and methods for delivery of IV coding sequences to a vertebrate with optimal expression and safety conferred through codon optimization and/or other manipulations. These vaccine compositions are prepared and administered in such a manner that the encoded gene products are optimally expressed in the vertebrate of interest. As a result, these compositions and methods are useful in stimulating an immune response against IV infection. Also included in the invention are expression systems, delivery systems, and codon-optimized IV coding regions.

[0070] In a specific embodiment, the invention provides combinatorial polynucleotide (e.g., DNA) vaccines which combine both a polynucleotide vaccine and polypeptide (e.g., either a recombinant protein, a purified subunit protein, a viral vector expressing an isolated IV polypeptide, or in the form of an inactivated or attenuated IV vaccine) vaccine in a single formulation. The single formulation comprises an IV polypeptide-encoding polynucleotide vaccine as described herein, and optionally, an effective amount of a desired isolated IV polypeptide or fragment, variant, or derivative thereof. The polypeptide may exist in any form, for example, a recombinant protein, a purified subunit protein, a viral vector expressing an isolated IV polypeptide, or in the form of an inactivated or attenuated IV vaccine. The IV polypeptide or fragment, variant, or derivative thereof encoded by the polynucleotide vaccine may be identical to the isolated IV polypeptide or fragment, variant, or derivative thereof. Alternatively, the IV polypeptide or fragment, variant, or derivative thereof encoded by the polynucleotide may be different from the isolated IV polypeptide or fragment, variant, or derivative thereof.

[0071] It is to be noted that the term "a" or "an" entity refers to one or more of that entity; for example, "a polynucleotide," is understood to represent one or more polynucleotides. As such, the terms "a" (or "an"), "one or more," and "at least one" can be used interchangeably herein.

[0072] The term "polynucleotide" is intended to encompass a singular nucleic acid or nucleic acid fragment as well as plural nucleic acids or nucleic acid fragments, and refers to an isolated molecule or construct, e.g., a virus genome (e.g., a non-infectious viral genome), messenger RNA (mRNA), plasmid DNA (pDNA), or derivatives of pDNA (e.g., minicircles as described in Darquet, A-M et al., Gene Therapy 4:1341-1349 (1997)) comprising a polynucleotide. A polynucleotide may comprise a conventional phosphodiester bond or a non-conventional bond (e.g., an amide bond, such as found in peptide nucleic acids (PNA)).

[0073] The terms "nucleic acid" or "nucleic acid fragment" refer to any one or more nucleic acid segments, e.g., DNA or RNA fragments, present in a polynucleotide or construct. A nucleic acid or fragment thereof may be provided in linear (e.g., mRNA) or circular (e.g., plasmid) form as well as double-stranded or single-stranded forms. By "isolated" nucleic acid or polynucleotide is intended a nucleic acid molecule, DNA or RNA, which has been removed from its native environment. For example, a recombinant polynucleotide contained in a vector is considered isolated for the purposes of the present invention. Further examples of an isolated polynucleotide include recombinant polynucleotides maintained in heterologous host cells or purified (partially or substantially) polynucleotides in solution. Isolated RNA molecules include in vivo or in vitro RNA transcripts of the polynucleotides of the present invention. Isolated polynucleotides or nucleic acids according to the present invention further include such molecules produced synthetically.

[0074] As used herein, a "coding region" is a portion of nucleic acid which consists of codons translated into amino acids. Although a "stop codon" (TAG, TGA, or TAA) is not translated into an amino acid, it may be considered to be part of a coding region, but any flanking sequences, for example promoters, ribosome binding sites, transcriptional terminators, and the like, are not part of a coding region. Two or more nucleic acids or nucleic acid fragments of the present invention can be present in a single polynucleotide construct, e.g., on a single plasmid, or in separate polynucleotide constructs, e.g., on separate (different) plasmids. Furthermore, any nucleic acid or nucleic acid fragment may encode a single IV polypeptide or fragment, derivative, or variant thereof, or may encode more than one polypeptide, e.g., a nucleic acid may encode two or more polypeptides. In addition, a nucleic acid may include a regulatory element such as a promoter, ribosome binding site, or a transcription terminator, or may encode heterologous coding regions fused to the IV coding region, e.g., specialized elements or motifs, such as a secretory signal peptide or a heterologous functional domain.

[0075] The terms "fragment," "variant," "derivative" and "analog" when referring to IV polypeptides of the present invention include any polypeptides which retain at least some of the immunogenicity or antigenicity of the corresponding native polypeptide. Fragments of IV polypeptides of the present invention include proteolytic fragments, deletion fragments and in particular, fragments of IV polypeptides which exhibit increased secretion from the cell or higher immunogenicity or reduced pathogenicity when delivered to an animal. Polypeptide fragments further include any portion of the polypeptide which comprises an antigenic or immunogenic epitope of the native polypeptide, including linear as well as three-dimensional epitopes. Variants of IV polypeptides of the present invention include fragments as described above, and also polypeptides with altered amino acid sequences due to amino acid substitutions, deletions, or insertions. Variants may occur naturally, such as an allelic variant. By an "allelic variant" is intended alternate forms of a gene occupying a given locus on a chromosome or genome of an organism or virus. See Genes II, Lewin, B., ed., John Wiley & Sons, New York (1985), which is incorporated herein by reference. For example, as used herein, an allelic variant includes variations in a given gene product. When referring to IV NA or HA proteins, each such protein is a "variant," in that native IV strains are distinguished by the type of NA and HA proteins encoded by the virus. However, within a single HA or NA variant type, further naturally or non-naturally occurring variations such as amino acid deletions, insertions or substitutions may occur. Non-naturally occurring variants may be produced using art-known mutagenesis techniques. Variant polypeptides may comprise conservative or non-conservative amino acid substitutions, deletions or additions. Derivatives of IV polypeptides of the present invention are polypeptides which have been altered so as to exhibit additional features not found on the native polypeptide. Examples include fusion proteins. An analog is another form of an IV polypeptide of the present invention. An example is a proprotein which can be activated by cleavage of the proprotein to produce an active mature polypeptide.

[0076] The terms "infectious polynucleotide" or "infectious nucleic acid" are intended to encompass isolated viral polynucleotides and/or nucleic acids which are solely sufficient to mediate the synthesis of complete infectious virus particles upon uptake by permissive cells. Thus, "infectious nucleic acids" do not require pre-synthesized copies of any of the polypeptides they encode, e.g., viral replicases, in order to initiate their replication cycle in a permissive host cell.

[0077] The terms "non-infectious polynucleotide" or "non-infectious nucleic acid" as defined herein are polynucleotides or nucleic acids which cannot, without additional added materials, e.g., polypeptides, mediate the synthesis of complete infectious virus particles upon uptake by permissive cells. An infectious polynucleotide or nucleic acid is not made "non-infectious" simply because it is taken up by a non-permissive cell. For example, an infectious viral polynucleotide from a virus with limited host range is infectious if it is capable of mediating the synthesis of complete infectious virus particles when taken up by cells derived from a permissive host (i.e., a host permissive for the virus itself). The fact that uptake by cells derived from a non-permissive host does not result in the synthesis of complete infectious virus particles does not make the nucleic acid "non-infectious." In other words, the term is not qualified by the nature of the host cell, the tissue type, or the species taking up the polynucleotide or nucleic acid fragment.

[0078] In some cases, an isolated infectious polynucleotide or nucleic acid may produce fully-infectious virus particles in a host cell population which lacks receptors for the virus particles, i.e., is non-permissive for virus entry. Thus viruses produced will not infect surrounding cells. However, if the supernatant containing the virus particles is transferred to cells which are permissive for the virus, infection will take place.

[0079] The terms "replicating polynucleotide" or "replicating nucleic acid" are meant to encompass those polynucleotides and/or nucleic acids which, upon being taken up by a permissive host cell, are capable of producing multiple, e.g., one or more copies of the same polynucleotide or nucleic acid. Infectious polynucleotides and nucleic acids are a subset of replicating polynucleotides and nucleic acids; the terms are not synonymous. For example, a defective virus genome lacking the genes for virus coat proteins may replicate, e.g., produce multiple copies of itself, but is NOT infectious because it is incapable of mediating the synthesis of complete infectious virus particles unless the coat proteins, or another nucleic acid encoding the coat proteins, are exogenously provided.

[0080] In certain embodiments, the polynucleotide, nucleic acid, or nucleic acid fragment is DNA. In the case of DNA, a polynucleotide comprising a nucleic acid which encodes a polypeptide normally also comprises a promoter and/or other transcription or translation control elements operably associated with the polypeptide-encoding nucleic acid fragment. An operable association is when a nucleic acid fragment encoding a gene product, e.g., a polypeptide, is associated with one or more regulatory sequences in such a way as to place expression of the gene product under the influence or control of the regulatory sequence(s). Two DNA fragments (such as a polypeptide-encoding nucleic acid fragment and a promoter associated with the 5' end of the nucleic acid fragment) are "operably associated" if induction of promoter function results in the transcription of mRNA encoding the desired gene product and if the nature of the linkage between the two DNA fragments does not (1) result in the introduction of a frame-shift mutation, (2) interfere with the ability of the expression regulatory sequences to direct the expression of the gene product, or (3) interfere with the ability of the DNA template to be transcribed. Thus, a promoter region would be operably associated with a nucleic acid fragment encoding a polypeptide if the promoter was capable of effecting transcription of that nucleic acid fragment. The promoter may be a cell-specific promoter that directs substantial transcription of the DNA only in predetermined cells. Other transcription control elements, besides a promoter, for example enhancers, operators, repressors, and transcription termination signals, can be operably associated with the polynucleotide to direct cell-specific transcription. Suitable promoters and other transcription control regions are disclosed herein.

[0081] A variety of transcription control regions are known to those skilled in the art. These include, without limitation, transcription control regions which function in vertebrate cells, such as, but not limited to, promoter and enhancer segments from cytomegaloviruses (the immediate early promoter, in conjunction with intron-A), simian virus 40 (the early promoter), and retroviruses (such as Rous sarcoma virus). Other transcription control regions include those derived from vertebrate genes such as actin, heat shock protein, bovine growth hormone and rabbit .beta.-globin, as well as other sequences capable of controlling gene expression in eukaryotic cells. Additional suitable transcription control regions include tissue-specific promoters and enhancers as well as lymphokine-inducible promoters (e.g., promoters inducible by interferons or interleukins).

[0082] Similarly, a variety of translation control elements are known to those of ordinary skill in the art. These include, but are not limited to, ribosome binding sites, translation initiation and termination codons, and elements from picornaviruses (particularly an internal ribosome entry site, or IRES, also referred to as a CITE sequence).

[0083] A DNA polynucleotide of the present invention may be a circular or linearized plasmid or vector, or other linear DNA which may also be non-infectious and nonintegrating (i.e., does not integrate into the genome of vertebrate cells). A linearized plasmid is a plasmid that was previously circular but has been linearized, for example, by digestion with a restriction endonuclease. Linear DNA may be advantageous in certain situations as discussed, e.g., in Cherng, J. Y., et al., J. Control. Release 60:343-53 (1999), and Chen, Z. Y., et al. Mol. Ther. 3:403-10 (2001), both of which are incorporated herein by reference. As used herein, the terms plasmid and vector can be used interchangeably.

[0084] Alternatively, DNA virus genomes may be used to administer DNA polynucleotides into vertebrate cells. In certain embodiments, a DNA virus genome of the present invention is nonreplicative, noninfectious, and/or nonintegrating. Suitable DNA virus genomes include without limitation, herpesvirus genomes, adenovirus genomes, adeno-associated virus genomes, and poxvirus genomes. References citing methods for the in vivo introduction of non-infectious virus genomes to vertebrate tissues are well known to those of ordinary skill in the art, and are cited supra.

[0085] In other embodiments, a polynucleotide of the present invention is RNA, for example, in the form of messenger RNA (mRNA). Methods for introducing RNA sequences into vertebrate cells are described in U.S. Pat. No. 5,580,859, the disclosure of which is incorporated herein by reference in its entirety.

[0086] Polynucleotides, nucleic acids, and nucleic acid fragments of the present invention may be associated with additional nucleic acids which encode secretory or signal peptides, which direct the secretion of a polypeptide encoded by a nucleic acid fragment or polynucleotide of the present invention. According to the signal hypothesis, proteins secreted by mammalian cells have a signal peptide or secretory leader sequence which is cleaved from the mature protein once export of the growing protein chain across the rough endoplasmic reticulum has been initiated. Those of ordinary skill in the art are aware that polypeptides secreted by vertebrate cells generally have a signal peptide fused to the N-terminus of the polypeptide, which is cleaved from the complete or "full length" polypeptide to produce a secreted or "mature" form of the polypeptide. In certain embodiments, the native leader sequence is used, or a functional derivative of that sequence that retains the ability to direct the secretion of the polypeptide that is operably associated with it. Alternatively, a heterologous mammalian leader sequence, or a functional derivative thereof, may be used. For example, the wild-type leader sequence may be substituted with the leader sequence of human tissue plasminogen activator (TPA) or mouse .beta.-glucuronidase.

[0087] In accordance with one aspect of the present invention, there is provided a polynucleotide construct, for example, a plasmid, comprising a nucleic acid fragment, where the nucleic acid fragment is a fragment of a codon-optimized coding region operably encoding an IV-derived polypeptide, where the coding region is optimized for expression in vertebrate cells of a desired vertebrate species, e.g., humans, to be delivered to a vertebrate to be treated or immunized. Suitable IV polypeptides, or fragments, variants, or derivatives thereof may be derived from, but are not limited to, the IV HA, NA, NP, M1, or M2 proteins. Additional IV-derived coding sequences, e.g., coding for HA, NA, NP, M1, M2 or eM2, may also be included on the plasmid, or on a separate plasmid, and expressed, either using native IV codons or codons optimized for expression in the vertebrate to be treated or immunized. When such a plasmid encoding one or more optimized influenza sequences is delivered in vivo to a tissue of the vertebrate to be treated or immunized, one or more of the encoded gene products will be expressed, i.e., transcribed and translated. The level of expression of the gene product(s) will depend to a significant extent on the strength of the associated promoter and the presence and activation of an associated enhancer element, as well as the degree of optimization of the coding region.

[0088] As used herein, the term "plasmid" refers to a construct made up of genetic material (i.e., nucleic acids). Typically a plasmid contains an origin of replication which is functional in bacterial host cells, e.g., Escherichia coli, and selectable markers for detecting bacterial host cells comprising the plasmid. Plasmids of the present invention may include genetic elements as described herein arranged such that an inserted coding sequence can be transcribed and translated in eukaryotic cells. Also, the plasmid may include a sequence from a viral nucleic acid. However, such viral sequences normally are not sufficient to direct or allow the incorporation of the plasmid into a viral particle, and the plasmid is therefore a non-viral vector. In certain embodiments described herein, a plasmid is a closed circular DNA molecule.

[0089] The term "expression" refers to the biological production of a product encoded by a coding sequence. In most cases a DNA sequence, including the coding sequence, is transcribed to form a messenger-RNA (mRNA). The messenger-RNA is then translated to form a polypeptide product which has a relevant biological activity. Also, the process of expression may involve further processing steps to the RNA product of transcription, such as splicing to remove introns, and/or post-translational processing of a polypeptide product.

[0090] As used herein, the term "polypeptide" is intended to encompass a singular "polypeptide" as well as plural "polypeptides," and comprises any chain or chains of two or more amino acids. Thus, as used herein, terms including, but not limited to "peptide," "dipeptide," "tripeptide," "protein," "amino acid chain," or any other term used to refer to a chain or chains of two or more amino acids, are included in the definition of a "polypeptide," and the term "polypeptide" can be used instead of, or interchangeably with any of these terms. The term further includes polypeptides which have undergone post-translational modifications, for example, glycosylation, acetylation, phosphorylation, amidation, derivatization by known protecting/blocking groups, proteolytic cleavage, or modification by non-naturally occurring amino acids.

[0091] Also included as polypeptides of the present invention are fragments, derivatives, analogs, or variants of the foregoing polypeptides, and any combination thereof. Polypeptides, and fragments, derivatives, analogs, or variants thereof of the present invention can be antigenic and immunogenic polypeptides related to IV polypeptides, which are used to prevent or treat, i.e., cure, ameliorate, lessen the severity of, or prevent or reduce contagion of infectious disease caused by the IV.

[0092] As used herein, an "antigenic polypeptide" or an "immunogenic polypeptide" is a polypeptide which, when introduced into a vertebrate, reacts with the vertebrate's immune system molecules, i.e., is antigenic, and/or induces an immune response in the vertebrate, i.e., is immunogenic. It is quite likely that an immunogenic polypeptide will also be antigenic, but an antigenic polypeptide, because of its size or conformation, may not necessarily be immunogenic. Examples of antigenic and immunogenic polypeptides of the present invention include, but are not limited to, e.g., HA or fragments or variants thereof, e.g. NP, or fragments thereof, e.g., PB1, or fragments or variants thereof, e.g., NS1 or fragments or variants thereof, e.g., M1 or fragments or variants thereof, and e.g. M2 or fragments or variants thereof including the extracellular fragment of M2 (eM2), or e.g., any of the foregoing polypeptides or fragments fused to a heterologous polypeptide, for example, a hepatitis B core antigen. Isolated antigenic and immunogenic polypeptides of the present invention in addition to those encoded by polynucleotides of the invention, may be provided as a recombinant protein, a purified subunit, a viral vector expressing the protein, or may be provided in the form of an inactivated IV vaccine, e.g., a live-attenuated virus vaccine, a heat-killed virus vaccine, etc.

[0093] By an "isolated" IV polypeptide or a fragment, variant, or derivative thereof is intended an IV polypeptide or protein that is not in its natural form. No particular level of purification is required. For example, an isolated IV polypeptide can be removed from its native or natural environment. Recombinantly produced IV polypeptides and proteins expressed in host cells are considered isolated for purposes of the invention, as are native or recombinant IV polypeptides which have been separated, fractionated, or partially or substantially purified by any suitable technique, including the separation of IV virions from eggs or culture cells in which they have been propagated. In addition, an isolated IV polypeptide or protein can be provided as a live or inactivated viral vector expressing an isolated IV polypeptide and can include those found in inactivated IV vaccine compositions. Thus, isolated IV polypeptides and proteins can be provided as, for example, recombinant IV polypeptides, a purified subunit of IV, a viral vector expressing an isolated IV polypeptide, or in the form of an inactivated or attenuated IV vaccine.

[0094] The term "epitopes," as used herein, refers to portions of a polypeptide having antigenic or immunogenic activity in a vertebrate, for example a human. An "immunogenic epitope," as used herein, is defined as a portion of a protein that elicits an immune response in an animal, as determined by any method known in the art. The term "antigenic epitope," as used herein, is defined as a portion of a protein to which an antibody or T-cell receptor can immunospecifically bind as determined by any method well known in the art. Immunospecific binding excludes non-specific binding but does not exclude cross-reactivity with other antigens. Whereas all immunogenic epitopes are antigenic, antigenic epitopes need not be immunogenic.

[0095] The term "immunogenic carrier" as used herein refers to a first polypeptide or fragment, variant, or derivative thereof which enhances the immunogenicity of a second polypeptide or fragment, variant, or derivative thereof. Typically, an "immunogenic carrier" is fused to or conjugated to the desired polypeptide or fragment thereof. An example of an "immunogenic carrier" is a recombinant hepatitis B core antigen expressing, as a surface epitope, an immunogenic epitope of interest. See, e.g., European Patent No. EP 0385610 B1, which is incorporated herein by reference in its entirety.

[0096] In the present invention, antigenic epitopes preferably contain a sequence of at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, or between about 8 and about 30 amino acids contained within the amino acid sequence of an IV polypeptide of the invention, e.g., an NP polypeptide, an M1 polypeptide or an M2 polypeptide. Certain polypeptides comprising immunogenic or antigenic epitopes are at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acid residues in length. Antigenic as well as immunogenic epitopes may be linear, i.e., be comprised of contiguous amino acids in a polypeptide, or may be three dimensional, i.e., where an epitope is comprised of non-contiguous amino acids which come together due to the secondary or tertiary structure of the polypeptide, thereby forming an epitope.
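By way of illustration only, candidate linear (contiguous) peptides within a stated length range can be enumerated with a simple sliding window. The following Python sketch is not part of the disclosed methods; the function name linear_windows is hypothetical, and the 20-residue input is simply the first 20 residues of SEQ ID NO:22, chosen for convenience. The sketch enumerates windows only and makes no prediction as to whether any window is in fact antigenic or immunogenic.

```python
from typing import Iterator


def linear_windows(protein: str, min_len: int = 8,
                   max_len: int = 30) -> Iterator[tuple[int, str]]:
    """Yield (1-based start position, peptide) for every contiguous window
    whose length lies between min_len and max_len, inclusive."""
    for k in range(min_len, max_len + 1):
        for i in range(len(protein) - k + 1):
            yield i + 1, protein[i:i + k]


# First 20 residues of SEQ ID NO:22, used purely as example input.
FRAGMENT = "METISLITILLVVTASNADK"

peptides = list(linear_windows(FRAGMENT, 8, 10))
print(len(peptides))  # 36 windows: 13 of length 8, 12 of length 9, 11 of length 10
print(peptides[0])    # (1, 'METISLIT')
```

Three-dimensional epitopes, which are formed by non-contiguous residues, are not captured by such a window-based enumeration.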

[0097] As to the selection of peptides or polypeptides bearing an antigenic epitope (e.g., that contain a region of a protein molecule to which an antibody or T cell receptor can bind), it is well known in the art that relatively short synthetic peptides that mimic part of a protein sequence are routinely capable of eliciting an antiserum that reacts with the partially mimicked protein. See, e.g., Sutcliffe, J. G., et al., Science 219:660-666 (1983), which is herein incorporated by reference.

[0098] Peptides capable of eliciting an immunogenic response are frequently represented in the primary sequence of a protein, can be characterized by a set of simple chemical rules, and are confined neither to immunodominant regions of intact proteins nor to the amino or carboxyl terminals. Peptides that are extremely hydrophobic and those of six or fewer residues generally are ineffective at inducing antibodies that bind to the mimicked protein; longer peptides, especially those containing proline residues, usually are effective. Sutcliffe et al., supra, at 661. For instance, 18 of 20 peptides designed according to these guidelines, containing 8-39 residues covering 75% of the sequence of the IV hemagglutinin HA1 polypeptide chain, induced antibodies that reacted with the HA1 protein or intact virus; and 12/12 peptides from the MuLV polymerase and 18/18 from the rabies glycoprotein induced antibodies that precipitated the respective proteins.

Codon Optimization

[0099] "Codon optimization" is defined as modifying a nucleic acid sequence for enhanced expression in the cells of the vertebrate of interest, e.g., human, by replacing at least one, more than one, or a significant number, of codons of the native sequence with codons that are more frequently or most frequently used in the genes of that vertebrate. Various species exhibit particular bias for certain codons of a particular amino acid.

[0100] In one aspect, the present invention relates to polynucleotides comprising nucleic acid fragments of codon-optimized coding regions which encode IV polypeptides, or fragments, variants, or derivatives thereof, with the codon usage adapted for optimized expression in the cells of a given vertebrate, e.g., humans. These polynucleotides are prepared by incorporating codons preferred for use in the genes of the vertebrate of interest into the DNA sequence. Also provided are polynucleotide expression constructs, vectors, and host cells comprising nucleic acid fragments of codon-optimized coding regions which encode IV polypeptides, and fragments, variants, or derivatives thereof, and various methods of using the polynucleotide expression constructs, vectors, and host cells to treat or prevent influenza disease in a vertebrate.

[0101] As used herein the term "codon-optimized coding region" means a nucleic acid coding region that has been adapted for expression in the cells of a given vertebrate by replacing at least one, or more than one, or a significant number, of codons with one or more codons that are more frequently used in the genes of that vertebrate.
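By way of illustration only, the foregoing definition can be sketched in Python as a most-frequent-codon substitution. This is merely one simple strategy consistent with the definition and is not asserted to be the method used for any construct described herein. The names GENETIC_CODE, HUMAN_COUNTS, preferred_codons, translate, and optimize are hypothetical; the codon counts are copied from Table 2 (with U written as T), and the 18-nucleotide input is nucleotides 32 to 49 of SEQ ID NO:21.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order ("*" marks a stop).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Human codon counts for six amino acids only, copied from Table 2.
HUMAN_COUNTS = {
    "ATG": 430946,
    "GAA": 561277, "GAG": 787712,
    "ACT": 247913, "ACC": 371420, "ACA": 285655, "ACG": 120022,
    "ATT": 303721, "ATC": 414483, "ATA": 136399,
    "TCT": 282407, "TCC": 336349, "TCA": 225963, "TCG": 86761,
    "AGT": 230047, "AGC": 373362,
    "TTA": 139249, "TTG": 242151, "CTT": 246206, "CTC": 374262,
    "CTA": 133980, "CTG": 777077,
}


def preferred_codons(counts: dict[str, int]) -> dict[str, str]:
    """Map each amino acid to its most frequently used codon."""
    best: dict[str, str] = {}
    for codon, n in counts.items():
        aa = GENETIC_CODE[codon]
        if aa not in best or n > counts[best[aa]]:
            best[aa] = codon
    return best


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    return "".join(GENETIC_CODE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def optimize(dna: str, counts: dict[str, int]) -> str:
    """Replace every codon with the preferred synonymous codon, where known."""
    dna = dna.upper()
    best = preferred_codons(counts)
    codons = (dna[i:i + 3] for i in range(0, len(dna), 3))
    return "".join(best.get(GENETIC_CODE[c], c) for c in codons)


native = "atggaaacaatatcacta"  # nucleotides 32-49 of SEQ ID NO:21
optimized = optimize(native, HUMAN_COUNTS)
print(optimized)                                   # ATGGAGACCATCAGCCTG
print(translate(native), translate(optimized))     # METISL METISL
```

The encoded amino acid sequence is unchanged; only the choice among synonymous codons differs.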

[0102] Deviations in the nucleotide sequence that comprise the codons encoding the amino acids of any polypeptide chain allow for variations in the sequence coding for the gene. Since each codon consists of three nucleotides, and the nucleotides comprising DNA are restricted to four specific bases, there are 64 possible combinations of nucleotides, 61 of which encode amino acids (the remaining three codons encode signals ending translation). The "genetic code" which shows which codons encode which amino acids is reproduced herein as Table 1. As a result, many amino acids are designated by more than one codon. For example, the amino acids alanine and proline are coded for by four triplets, serine and arginine by six, whereas tryptophan and methionine are coded by just one triplet. This degeneracy allows for DNA base composition to vary over a wide range without altering the amino acid sequence of the proteins encoded by the DNA.

TABLE-US-00019 TABLE 1 The Standard Genetic Code

      T             C             A             G
T     TTT Phe (F)   TCT Ser (S)   TAT Tyr (Y)   TGT Cys (C)
      TTC Phe (F)   TCC Ser (S)   TAC Tyr (Y)   TGC Cys (C)
      TTA Leu (L)   TCA Ser (S)   TAA Ter       TGA Ter
      TTG Leu (L)   TCG Ser (S)   TAG Ter       TGG Trp (W)
C     CTT Leu (L)   CCT Pro (P)   CAT His (H)   CGT Arg (R)
      CTC Leu (L)   CCC Pro (P)   CAC His (H)   CGC Arg (R)
      CTA Leu (L)   CCA Pro (P)   CAA Gln (Q)   CGA Arg (R)
      CTG Leu (L)   CCG Pro (P)   CAG Gln (Q)   CGG Arg (R)
A     ATT Ile (I)   ACT Thr (T)   AAT Asn (N)   AGT Ser (S)
      ATC Ile (I)   ACC Thr (T)   AAC Asn (N)   AGC Ser (S)
      ATA Ile (I)   ACA Thr (T)   AAA Lys (K)   AGA Arg (R)
      ATG Met (M)   ACG Thr (T)   AAG Lys (K)   AGG Arg (R)
G     GTT Val (V)   GCT Ala (A)   GAT Asp (D)   GGT Gly (G)
      GTC Val (V)   GCC Ala (A)   GAC Asp (D)   GGC Gly (G)
      GTA Val (V)   GCA Ala (A)   GAA Glu (E)   GGA Gly (G)
      GTG Val (V)   GCG Ala (A)   GAG Glu (E)   GGG Gly (G)
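By way of illustration only, the counts recited above (64 codons, 61 of which encode amino acids, and the number of triplets per amino acid) can be reproduced from the standard genetic code with a short Python sketch. The variable names are hypothetical, and the compact string encoding of Table 1 is merely a convenience of this sketch.

```python
from collections import Counter
from itertools import product

# Table 1 encoded compactly: codons enumerated in T, C, A, G order for each
# position, with one-letter amino acid codes and "*" for a termination codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
genetic_code = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Sense codons are those that encode an amino acid rather than a stop signal.
sense_codons = {codon: aa for codon, aa in genetic_code.items() if aa != "*"}

# Degeneracy: the number of synonymous codons for each amino acid.
degeneracy = Counter(sense_codons.values())

print(len(genetic_code), len(sense_codons))  # 64 61
print(degeneracy["A"], degeneracy["P"])      # 4 4  (alanine, proline)
print(degeneracy["S"], degeneracy["R"])      # 6 6  (serine, arginine)
print(degeneracy["W"], degeneracy["M"])      # 1 1  (tryptophan, methionine)
```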

[0103] Many organisms display a bias for use of particular codons to code for insertion of a particular amino acid in a growing peptide chain. Codon preference or codon bias, differences in codon usage between organisms, is afforded by degeneracy of the genetic code, and is well documented among many organisms. Codon bias often correlates with the efficiency of translation of messenger RNA (mRNA), which is in turn believed to be dependent on, inter alia, the properties of the codons being translated and the availability of particular transfer RNA (tRNA) molecules. The predominance of selected tRNAs in a cell is generally a reflection of the codons used most frequently in peptide synthesis. Accordingly, genes can be tailored for optimal gene expression in a given organism based on codon optimization.

[0104] Given the large number of gene sequences available for a wide variety of animal, plant and microbial species, it is possible to calculate the relative frequencies of codon usage. Codon usage tables are readily available, for example, at the "Codon Usage Database" available at http://www.kazusa.or.jp/codon/ (visited Jul. 9, 2002), and these tables can be adapted in a number of ways. See Nakamura, Y., et al. "Codon usage tabulated from the international DNA sequence databases: status for the year 2000" Nucl. Acids Res. 28:292 (2000), which is incorporated by reference. As examples, the codon usage tables for human, mouse, domestic cat, and cow, calculated from GenBank Release 128.0 (15 Feb. 2002), are reproduced below as Tables 2-5. These Tables use mRNA nomenclature, and so instead of thymine (T) which is found in DNA, the Tables use uracil (U) which is found in RNA. The Tables have been adapted so that frequencies are calculated for each amino acid, rather than for all 64 codons. 
TABLE 2 Codon Usage Table for Human Genes (Homo sapiens)

Amino Acid   Codon   Number     Frequency
Phe          UUU     326146     0.4525
Phe          UUC     394680     0.5475
Total                720826
Leu          UUA     139249     0.0728
Leu          UUG     242151     0.1266
Leu          CUU     246206     0.1287
Leu          CUC     374262     0.1956
Leu          CUA     133980     0.0700
Leu          CUG     777077     0.4062
Total                1912925
Ile          AUU     303721     0.3554
Ile          AUC     414483     0.4850
Ile          AUA     136399     0.1596
Total                854603
Met          AUG     430946     1.0000
Total                430946
Val          GUU     210423     0.1773
Val          GUC     282445     0.2380
Val          GUA     134991     0.1137
Val          GUG     559044     0.4710
Total                1186903
Ser          UCU     282407     0.1840
Ser          UCC     336349     0.2191
Ser          UCA     225963     0.1472
Ser          UCG     86761      0.0565
Ser          AGU     230047     0.1499
Ser          AGC     373362     0.2433
Total                1534889
Pro          CCU     333705     0.2834
Pro          CCC     386462     0.3281
Pro          CCA     322220     0.2736
Pro          CCG     135317     0.1149
Total                1177704
Thr          ACU     247913     0.2419
Thr          ACC     371420     0.3624
Thr          ACA     285655     0.2787
Thr          ACG     120022     0.1171
Total                1025010
Ala          GCU     360146     0.2637
Ala          GCC     551452     0.4037
Ala          GCA     308034     0.2255
Ala          GCG     146233     0.1071
Total                1365865
Tyr          UAU     232240     0.4347
Tyr          UAC     301978     0.5653
Total                534218
His          CAU     201389     0.4113
His          CAC     288200     0.5887
Total                489589
Gln          CAA     227742     0.2541
Gln          CAG     668391     0.7459
Total                896133
Asn          AAU     322271     0.4614
Asn          AAC     376210     0.5386
Total                698481
Lys          AAA     462660     0.4212
Lys          AAG     635755     0.5788
Total                1098415
Asp          GAU     430744     0.4613
Asp          GAC     502940     0.5387
Total                933684
Glu          GAA     561277     0.4161
Glu          GAG     787712     0.5839
Total                1348989
Cys          UGU     190962     0.4468
Cys          UGC     236400     0.5532
Total                427362
Trp          UGG     248083     1.0000
Total                248083
Arg          CGU     90899      0.0830
Arg          CGC     210931     0.1927
Arg          CGA     122555     0.1120
Arg          CGG     228970     0.2092
Arg          AGA     221221     0.2021
Arg          AGG     220119     0.2011
Total                1094695
Gly          GGU     209450     0.1632
Gly          GGC     441320     0.3438
Gly          GGA     315726     0.2459
Gly          GGG     317263     0.2471
Total                1283759
Stop         UAA     13963
Stop         UAG     10631
Stop         UGA     24607
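The per-amino-acid frequencies in Table 2 follow from dividing each codon count by the total count for its amino acid. The sketch below illustrates that calculation for phenylalanine and isoleucine using counts copied from Table 2; the function and variable names are illustrative only.

```python
# Codon counts for two amino acids, copied from Table 2 (human genes).
COUNTS = {
    "Phe": {"UUU": 326146, "UUC": 394680},
    "Ile": {"AUU": 303721, "AUC": 414483, "AUA": 136399},
}


def per_amino_acid_frequencies(counts):
    """Divide each codon count by the total count for its amino acid."""
    result = {}
    for amino_acid, codons in counts.items():
        total = sum(codons.values())
        result[amino_acid] = {c: round(n / total, 4) for c, n in codons.items()}
    return result


FREQS = per_amino_acid_frequencies(COUNTS)
for amino_acid, codons in FREQS.items():
    print(amino_acid, codons)
```

Because the frequencies are normalized within each amino acid, they sum to approximately 1 for every amino acid rather than across all 64 codons.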

[0105]

TABLE 3 Codon Usage Table for Mouse Genes (Mus musculus)

Amino Acid   Codon   Number     Frequency
Phe          UUU     150467     0.4321
Phe          UUC     197795     0.5679
Total                348262
Leu          UUA     55635      0.0625
Leu          UUG     116210     0.1306
Leu          CUU     114699     0.1289
Leu          CUC     179248     0.2015
Leu          CUA     69237      0.0778
Leu          CUG     354743     0.3987
Total                889772
Ile          AUU     137513     0.3367
Ile          AUC     208533     0.5106
Ile          AUA     62349      0.1527
Total                408395
Met          AUG     204546     1.0000
Total                204546
Val          GUU     93754      0.1673
Val          GUC     140762     0.2513
Val          GUA     64417      0.1150
Val          GUG     261308     0.4664
Total                560241
Ser          UCU     139576     0.1936
Ser          UCC     160313     0.2224
Ser          UCA     100524     0.1394
Ser          UCG     38632      0.0536
Ser          AGU     108413     0.1504
Ser          AGC     173518     0.2407
Total                720976
Pro          CCU     162613     0.3036
Pro          CCC     164796     0.3077
Pro          CCA     151091     0.2821
Pro          CCG     57032      0.1065
Total                535532
Thr          ACU     119832     0.2472
Thr          ACC     172415     0.3556
Thr          ACA     140420     0.2896
Thr          ACG     52142      0.1076
Total                484809
Ala          GCU     178593     0.2905
Ala          GCC     236018     0.3839
Ala          GCA     139697     0.2272
Ala          GCG     60444      0.0983
Total                614752
Tyr          UAU     108556     0.4219
Tyr          UAC     148772     0.5781
Total                257328
His          CAU     88786      0.3973
His          CAC     134705     0.6027
Total                223491
Gln          CAA     101783     0.2520
Gln          CAG     302064     0.7480
Total                403847
Asn          AAU     138868     0.4254
Asn          AAC     187541     0.5746
Total                326409
Lys          AAA     188707     0.3839
Lys          AAG     302799     0.6161
Total                491506
Asp          GAU     189372     0.4414
Asp          GAC     239670     0.5586
Total                429042
Glu          GAA     235842     0.4015
Glu          GAG     351582     0.5985
Total                587424
Cys          UGU     97385      0.4716
Cys          UGC     109130     0.5284
Total                206515
Trp          UGG     112588     1.0000
Total                112588
Arg          CGU     41703      0.0863
Arg          CGC     86351      0.1787
Arg          CGA     58928      0.1220
Arg          CGG     92277      0.1910
Arg          AGA     101029     0.2091
Arg          AGG     102859     0.2129
Total                483147
Gly          GGU     103673     0.1750
Gly          GGC     198604     0.3352
Gly          GGA     151497     0.2557
Gly          GGG     138700     0.2341
Total                592474
Stop         UAA     5499
Stop         UAG     4661
Stop         UGA     10356

[0106]

TABLE 4 Codon Usage Table for Domestic Cat Genes (Felis catus)

Amino Acid   Codon   Number     Frequency of usage
Phe          UUU     1204.00    0.4039
Phe          UUC     1777.00    0.5961
Total                2981
Leu          UUA     404.00     0.0570
Leu          UUG     857.00     0.1209
Leu          CUU     791.00     0.1116
Leu          CUC     1513.00    0.2135
Leu          CUA     488.00     0.0688
Leu          CUG     3035.00    0.4282
Total                7088
Ile          AUU     1018.00    0.2984
Ile          AUC     1835.00    0.5380
Ile          AUA     558.00     0.1636
Total                3411
Met          AUG     1553.00    1.0000
Total                1553
Val          GUU     696.00     0.1512
Val          GUC     1279.00    0.2779
Val          GUA     463.00     0.1006
Val          GUG     2164.00    0.4702
Total                4602
Ser          UCU     940.00     0.1875
Ser          UCC     1260.00    0.2513
Ser          UCA     608.00     0.1213
Ser          UCG     332.00     0.0662
Ser          AGU     672.00     0.1340
Ser          AGC     1202.00    0.2397
Total                5014
Pro          CCU     958.00     0.2626
Pro          CCC     1375.00    0.3769
Pro          CCA     850.00     0.2330
Pro          CCG     465.00     0.1275
Total                3648
Thr          ACU     822.00     0.2127
Thr          ACC     1574.00    0.4072
Thr          ACA     903.00     0.2336
Thr          ACG     566.00     0.1464
Total                3865
Ala          GCU     1129.00    0.2496
Ala          GCC     1951.00    0.4313
Ala          GCA     883.00     0.1952
Ala          GCG     561.00     0.1240
Total                4524
Tyr          UAU     837.00     0.3779
Tyr          UAC     1378.00    0.6221
Total                2215
His          CAU     594.00     0.3738
His          CAC     995.00     0.6262
Total                1589
Gln          CAA     747.00     0.2783
Gln          CAG     1937.00    0.7217
Total                2684
Asn          AAU     1109.00    0.3949
Asn          AAC     1699.00    0.6051
Total                2808
Lys          AAA     1445.00    0.4088
Lys          AAG     2090.00    0.5912
Total                3535
Asp          GAU     1255.00    0.4055
Asp          GAC     1840.00    0.5945
Total                3095
Glu          GAA     1637.00    0.4164
Glu          GAG     2294.00    0.5836
Total                3931
Cys          UGU     719.00     0.4425
Cys          UGC     906.00     0.5575
Total                1625
Trp          UGG     1073.00    1.0000
Total                1073
Arg          CGU     236.00     0.0700
Arg          CGC     629.00     0.1865
Arg          CGA     354.00     0.1050
Arg          CGG     662.00     0.1963
Arg          AGA     712.00     0.2112
Arg          AGG     779.00     0.2310
Total                3372
Gly          GGU     648.00     0.1498
Gly          GGC     1536.00    0.3551
Gly          GGA     1065.00    0.2462
Gly          GGG     1077.00    0.2490
Total                4326
Stop         UAA     55
Stop         UAG     36
Stop         UGA     110

[0107]

TABLE 5 Codon Usage Table for Cow Genes (Bos taurus)

Amino Acid   Codon   Number     Frequency of usage
Phe          UUU     13002      0.4112
Phe          UUC     18614      0.5888
Total                31616
Leu          UUA     4467       0.0590
Leu          UUG     9024       0.1192
Leu          CUU     9069       0.1198
Leu          CUC     16003      0.2114
Leu          CUA     4608       0.0609
Leu          CUG     32536      0.4298
Total                75707
Ile          AUU     12474      0.3313
Ile          AUC     19800      0.5258
Ile          AUA     5381       0.1429
Total                37655
Met          AUG     17770      1.0000
Total                17770
Val          GUU     8212       0.1635
Val          GUC     12846      0.2558
Val          GUA     4932       0.0982
Val          GUG     24222      0.4824
Total                50212
Ser          UCU     10287      0.1804
Ser          UCC     13258      0.2325
Ser          UCA     7678       0.1347
Ser          UCG     3470       0.0609
Ser          AGU     8040       0.1410
Ser          AGC     14279      0.2505
Total                57012
Pro          CCU     11695      0.2684
Pro          CCC     15221      0.3493
Pro          CCA     11039      0.2533
Pro          CCG     5621       0.1290
Total                43576
Thr          ACU     9372       0.2203
Thr          ACC     16574      0.3895
Thr          ACA     10892      0.2560
Thr          ACG     5712       0.1342
Total                42550
Ala          GCU     13923      0.2592
Ala          GCC     23073      0.4295
Ala          GCA     10704      0.1992
Ala          GCG     6025       0.1121
Total                53725
Tyr          UAU     9441       0.3882
Tyr          UAC     14882      0.6118
Total                24323
His          CAU     6528       0.3649
His          CAC     11363      0.6351
Total                17891
Gln          CAA     8060       0.2430
Gln          CAG     25108      0.7570
Total                33168
Asn          AAU     12491      0.4088
Asn          AAC     18063      0.5912
Total                30554
Lys          AAA     17244      0.3897
Lys          AAG     27000      0.6103
Total                44244
Asp          GAU     16615      0.4239
Asp          GAC     22580      0.5761
Total                39195
Glu          GAA     21102      0.4007
Glu          GAG     31555      0.5993
Total                52657
Cys          UGU     7556       0.4200
Cys          UGC     10436      0.5800
Total                17992
Trp          UGG     10706      1.0000
Total                10706
Arg          CGU     3391       0.0824
Arg          CGC     7998       0.1943
Arg          CGA     4558       0.1108
Arg          CGG     8300       0.2017
Arg          AGA     8237       0.2001
Arg          AGG     8671       0.2107
Total                41155
Gly          GGU     8508       0.1616
Gly          GGC     18517      0.3518
Gly          GGA     12838      0.2439
Gly          GGG     12772      0.2427
Total                52635
Stop         UAA     555
Stop         UAG     394
Stop         UGA     392

[0108] By utilizing these or similar tables, one of ordinary skill in the art can apply the frequencies to any given polypeptide sequence, and produce a nucleic acid fragment of a codon-optimized coding region which encodes the polypeptide, but which uses codons more optimal for a given species. Codon-optimized coding regions can be designed by various different methods.

[0109] In one method, termed "uniform optimization," a codon usage table is used to find the single most frequent codon used for any given amino acid, and that codon is used each time that particular amino acid appears in the polypeptide sequence. For example, referring to Table 2 above, for leucine, the most frequent codon in humans is CUG, which is used 41% of the time. Thus all the leucine residues in a given amino acid sequence would be assigned the codon CUG. A coding region for IAV NP (SEQ ID NO:2) optimized by the "uniform optimization" method is presented herein as SEQ ID NO:24: 1 ATGGCCAGCC AGGGCACCAA GCGGAGCTAC GAGCAGATGG AGACCGACGG CGAGCGGCAG 61 AACGCCACCG AGATCCGGGC CAGCGTGGGC AAGATGATCG GCGGCATCGG CCGGTTCTAC 121 ATCCAGATGT GCACCGAGCT GAAGCTGAGC GACTACGAGG GCCGGCTGAT CCAGAACAGC 181 CTGACCATCG AGCGGATGGT GCTGAGCGCC TTCGACGAGC GGCGGAACAA GTACCTGGAG 241 GAGCACCCCA GCGCCGGCAA GGACCCCAAG AAGACCGGCG GCCCCATCTA CCGGCGGGTG 301 AACGGCAAGT GGATGCGGGA GCTGATCCTG TACGACAAGG AGGAGATCCG GCGGATCTGG 361 CGGCAGGCCA ACAACGGCGA CGACGCCACC GCCGGCCTGA CCCACATGAT GATCTGGCAC 421 AGCAACCTGA ACGACGCCAC CTACCAGCGG ACCCGGGCCC TGGTGCGGAC CGGCATGGAC 481 CCCCGGATGT GCAGCCTGAT GCAGGGCAGC ACCCTGCCCC GGCGGAGCGG CGCCGCCGGC 541 GCCGCCGTGA AGGGCGTGGG CACCATGGTG ATGGAGCTGG TGCGGATGAT CAAGCGGGGC 601 ATCAACGACC GGAACTTCTG GCGGGGCGAG AACGGCCGGA AGACCCGGAT CGCCTACGAG 661 CGGATGTGCA ACATCCTGAA GGGCAAGTTC CAGACCGCCG CCCAGAAGGC CATGATGGAC 721 CAGGTGCGGG AGAGCCGGAA CCCCGGCAAC GCCGAGTTCG AGGACCTGAC CTTCCTGGCC 781 CGGAGCGCCC TGATCCTGCG GGGCAGCGTG GCCCACAAGA GCTGCCTGCC CGCCTGCGTG 841 TACGGCCCCG CCGTGGCCAG CGGCTACGAC TTCGAGCGGG AGGGCTACAG CCTGGTGGGC 901 ATCGACCCCT TCCGGCTGCT GCAGAACAGC CAGGTGTACA GCCTGATCCG GCCCAACGAG 961 AACCCCGCCC ACAAGAGCCA GCTGGTGTGG ATGGCCTGCC ACAGCGCCGC CTTCGAGGAC 1021 CTGCGGGTGC TGAGCTTCAT CAAGGGCACC AAGGTGCTGC CCCGGGGCAA GCTGAGCACC 1081 CGGGGCGTGC AGATCGCCAG CAACGAGAAC ATGGAGACCA TGGAGAGCAG CACCCTGGAG 1141 CTGCGGAGCC GGTACTGGGC CATCCGGACC CGGAGCGGCG GCAACACCAA CCAGCAGCGG 1201 GCCAGCGCCG GCCAGATCAG CATCCAGCCC ACCTTCAGCG TGCAGCGGAA CCTGCCCTTC 1261 GACCGGACCA CCGTGATGGC CGCCTTCAGC GGCAACACCG AGGGCCGGAC CAGCGACATG 1321 CGGACCGAGA TCATCCGGAT GATGGAGAGC GCCCGGCCCG AGGACGTGAG CTTCCAGGGC 1381 CGGGGCGTGT TCGAGCTGAG CGACGAGAAG GCCGCCAGCC CCATCGTGCC CAGCTTCGAC 1441 ATGAGCAACG AGGGCAGCTA CTTCTTCGGC GACAACGCCG AGGAGTACGA CAACTGA
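The "uniform optimization" method amounts to a one-to-one lookup from amino acid to codon. The following minimal Python sketch assumes the most frequent human codon for each amino acid as read from Table 2, written in DNA nomenclature, with CGG chosen for arginine as in SEQ ID NO:24. The 20-residue test peptide is obtained by translating nucleotides 1-60 of SEQ ID NO:24 with the standard genetic code; the names used are illustrative only.

```python
# Most frequent human codon per amino acid (Table 2), in DNA nomenclature.
# Arginine is assigned CGG, the choice made in SEQ ID NO:24.
PREFERRED_HUMAN_CODON = {
    "F": "TTC", "L": "CTG", "I": "ATC", "M": "ATG", "V": "GTG",
    "S": "AGC", "P": "CCC", "T": "ACC", "A": "GCC", "Y": "TAC",
    "H": "CAC", "Q": "CAG", "N": "AAC", "K": "AAG", "D": "GAC",
    "E": "GAG", "C": "TGC", "W": "TGG", "R": "CGG", "G": "GGC",
}


def uniform_optimize(protein):
    """Assign the single preferred codon to every residue of the protein."""
    return "".join(PREFERRED_HUMAN_CODON[aa] for aa in protein)


# Residues 1-20, translated from nucleotides 1-60 of SEQ ID NO:24.
N_TERMINUS = "MASQGTKRSYEQMETDGERQ"
print(uniform_optimize(N_TERMINUS))
```

Applied to these 20 residues, the lookup reproduces the first 60 nucleotides of SEQ ID NO:24.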

[0110] In another method, termed "full-optimization," the actual frequencies of the codons are distributed randomly throughout the coding region. Thus, using this method for optimization, if a hypothetical polypeptide sequence had 100 leucine residues, referring to Table 2 for frequency of usage in humans, about 7, or 7% of the leucine codons would be UUA, about 13, or 13% of the leucine codons would be UUG, about 13, or 13% of the leucine codons would be CUU, about 20, or 20% of the leucine codons would be CUC, about 7, or 7% of the leucine codons would be CUA, and about 41, or 41% of the leucine codons would be CUG. These frequencies would be distributed randomly throughout the leucine codons in the coding region encoding the hypothetical polypeptide. As will be understood by those of ordinary skill in the art, the distribution of codons in the sequence can vary significantly using this method; however, the sequence always encodes the same polypeptide.
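One way to realize the random distribution described above is to compute a whole-number count for each codon, pool the codons, and shuffle the pool. The sketch below does this for the hypothetical 100-leucine polypeptide using the human leucine frequencies of Table 2. The adjustment step, which absorbs any rounding surplus in the most frequent codon so that the counts sum to the number of residues, is an assumption of this sketch and is not prescribed by the text; the function name and seed are likewise illustrative.

```python
import random

# Human leucine codon frequencies, copied from Table 2.
HUMAN_LEU = {"UUA": 0.0728, "UUG": 0.1266, "CUU": 0.1287,
             "CUC": 0.1956, "CUA": 0.0700, "CUG": 0.4062}


def distribute_codons(freqs, n_residues, seed=0):
    """Return a randomly ordered list of codons, one per residue."""
    # Round each expected count to the nearest whole codon.
    counts = {c: int(f * n_residues + 0.5) for c, f in freqs.items()}
    # Assumption: absorb any rounding surplus or deficit in the most
    # frequent codon so that the total equals the number of residues.
    top = max(freqs, key=freqs.get)
    counts[top] += n_residues - sum(counts.values())
    pool = [c for c, k in counts.items() for _ in range(k)]
    random.Random(seed).shuffle(pool)
    return pool


LEU_CODONS = distribute_codons(HUMAN_LEU, 100)
print(len(LEU_CODONS), LEU_CODONS[:10])
```

With 100 residues the rounded counts total 101, so this sketch assigns CUG to 40 positions rather than 41. Every position still encodes leucine, whatever order the shuffle produces.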

[0111] As an example, a nucleotide sequence for NP (SEQ ID NO:2) fully optimized for human codon usage, is shown as SEQ ID NO:23. An alignment of nucleotides 46-1542 of SEQ ID NO:1 (native NP coding region) with the codon-optimized coding region (SEQ ID NO:23) is presented in FIG. 1.

[0112] In using the "full-optimization" method, an entire polypeptide sequence may be codon-optimized as described above. With respect to various desired fragments, variants or derivatives of the complete polypeptide, the fragment, variant, or derivative may first be designed, and is then codon-optimized individually. Alternatively, a full-length polypeptide sequence is codon-optimized for a given species resulting in a codon-optimized coding region encoding the entire polypeptide, and then nucleic acid fragments of the codon-optimized coding region, which encode fragments, variants, and derivatives of the polypeptide, are made from the original codon-optimized coding region. As would be well understood by those of ordinary skill in the art, if codons have been randomly assigned to the full-length coding region based on their frequency of use in a given species, nucleic acid fragments encoding fragments, variants, and derivatives would not necessarily be fully codon-optimized for the given species. However, such sequences are still much closer to the codon usage of the desired species than the native codon usage. The advantage of this approach is that synthesizing codon-optimized nucleic acid fragments encoding each fragment, variant, and derivative of a given polypeptide, although routine, would be time consuming and would result in significant expense.
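Taking a fragment from an already optimized full-length coding region reduces to slicing on codon boundaries. The sketch below is illustrative only: the function name is hypothetical, residue positions are 1-based and inclusive, and the first 60 nucleotides of SEQ ID NO:24 serve purely as sample input.

```python
def fragment_coding_region(optimized_cds, first_residue, last_residue):
    """Return the nucleotides encoding residues first..last (1-based, inclusive)."""
    start = 3 * (first_residue - 1)
    end = 3 * last_residue
    return optimized_cds[start:end]


# Nucleotides 1-60 of SEQ ID NO:24 (residues 1-20), used as sample input.
CDS_START = "ATGGCCAGCCAGGGCACCAAGCGGAGCTACGAGCAGATGGAGACCGACGGCGAGCGGCAG"

FRAGMENT = fragment_coding_region(CDS_START, 5, 10)
print(FRAGMENT)
```

The fragment keeps whatever codons were assigned in the full-length region, which is why its codon frequencies need not match the usage table exactly.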

[0113] When using the "full-optimization" method, the term "about" is used precisely to account for fractional percentages of codon frequencies for a given amino acid. As used herein, "about" is defined as one amino acid more or one amino acid less than the value given. The whole number value of amino acids is rounded up if the fractional frequency of usage is 0.50 or greater, and is rounded down if the fractional frequency of use is 0.49 or less. Using again the example of the frequency of usage of leucine in human genes for a hypothetical polypeptide having 62 leucine residues, the fractional frequency of codon usage would be calculated by multiplying 62 by the frequencies for the various codons. Thus, 7.28 percent of 62 equals 4.51 UUA codons, or "about 5," i.e., 4, 5, or 6 UUA codons, 12.66 percent of 62 equals 7.85 UUG codons or "about 8," i.e., 7, 8, or 9 UUG codons, 12.87 percent of 62 equals 7.98 CUU codons, or "about 8," i.e., 7, 8, or 9 CUU codons, 19.56 percent of 62 equals 12.13 CUC codons or "about 12," i.e., 11, 12, or 13 CUC codons, 7.00 percent of 62 equals 4.34 CUA codons or "about 4," i.e., 3, 4, or 5 CUA codons, and 40.62 percent of 62 equals 25.18 CUG codons, or "about 25," i.e., 24, 25, or 26 CUG codons.
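The arithmetic in the 62-leucine example can be reproduced as follows. This is a minimal sketch in which the function name is illustrative; it multiplies each Table 2 leucine frequency by the number of residues, rounds to the nearest whole codon (0.50 and above rounding up), and lists the three whole-number values covered by "about."

```python
import math

# Human leucine codon frequencies, copied from Table 2.
HUMAN_LEU = {"UUA": 0.0728, "UUG": 0.1266, "CUU": 0.1287,
             "CUC": 0.1956, "CUA": 0.0700, "CUG": 0.4062}


def about_counts(freqs, n_residues):
    """Map codon -> (expected count, nearest whole count, allowed 'about' values)."""
    rows = {}
    for codon, freq in freqs.items():
        expected = freq * n_residues
        nearest = math.floor(expected + 0.5)  # 0.50 and above rounds up
        rows[codon] = (round(expected, 2), nearest,
                       (nearest - 1, nearest, nearest + 1))
    return rows


LEU_62 = about_counts(HUMAN_LEU, 62)
for codon, (expected, nearest, allowed) in LEU_62.items():
    print(codon, expected, nearest, allowed)
```

For 62 residues the six rounded counts happen to sum to exactly 62, so no further adjustment is needed in this particular case.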

[0114] In a third method termed "minimal optimization," coding regions are only partially optimized. For example, the invention includes a nucleic acid fragment of a codon-optimized coding region encoding a polypeptide in which at least about 1%, 2%, 3%, 4%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, 45%, 50%, 55%, 60%, 65%, 70%, 75%, 80%, 85%, 90%, 95% or 100% of the codon positions have been codon-optimized for a given species. That is, they contain a codon that is preferentially used in the genes of a desired species, e.g., a vertebrate species, e.g., humans, in place of a codon that is normally used in the native nucleic acid sequence. Codons that are rarely found in the genes of the vertebrate of interest are changed to codons more commonly utilized in the coding regions of the vertebrate of interest.

[0115] Thus, those codons which are used more frequently in the IV gene of interest than in genes of the vertebrate of interest are substituted with more frequently-used codons. The difference in frequency at which the IV codons are substituted may vary based on a number of factors as discussed below. For example, codons used at least twice as often per thousand in IV genes as in genes of the vertebrate of interest are substituted with the most frequently used codon for that amino acid in the vertebrate of interest. This ratio may be adjusted higher or lower depending on various factors such as those discussed below. Accordingly, a codon in an IV native coding region would be substituted with a codon used more frequently for that amino acid in coding regions of the vertebrate of interest if the codon is used 1.1 times, 1.2 times, 1.3 times, 1.4 times, 1.5 times, 1.6 times, 1.7 times, 1.8 times, 1.9 times, 2.0 times, 2.1 times, 2.2 times, 2.3 times, 2.4 times, 2.5 times, 2.6 times, 2.7 times, 2.8 times, 2.9 times, 3.0 times, 3.1 times, 3.2 times, 3.3 times, 3.4 times, 3.5 times, 3.6 times, 3.7 times, 3.8 times, 3.9 times, 4.0 times, 4.1 times, 4.2 times, 4.3 times, 4.4 times, 4.5 times, 4.6 times, 4.7 times, 4.8 times, 4.9 times, 5.0 times, 5.5 times, 6.0 times, 6.5 times, 7.0 times, 7.5 times, 8.0 times, 8.5 times, 9.0 times, 9.5 times, 10.0 times, 10.5 times, 11.0 times, 11.5 times, 12.0 times, 12.5 times, 13.0 times, 13.5 times, 14.0 times, 14.5 times, 15.0 times, 15.5 times, 16.0 times, 16.5 times, 17.0 times, 17.5 times, 18.0 times, 18.5 times, 19.0 times, 19.5 times, 20 times, 21 times, 22 times, 23 times, 24 times, 25 times, or greater more frequently in IV coding regions than in coding regions of the vertebrate of interest.

[0116] This minimal human codon optimization for highly variant codons has several advantages, which include but are not limited to the following examples. Since fewer changes are made to the nucleotide sequence of the gene of interest, fewer manipulations are required, which leads to reduced risk of introducing unwanted mutations and lower cost, as well as allowing the use of commercially available site-directed mutagenesis kits, and reducing the need for expensive oligonucleotide synthesis. Further, decreasing the number of changes in the nucleotide sequence decreases the potential of altering the secondary structure of the sequence, which can have a significant impact on gene expression in certain host cells. The introduction of undesirable restriction sites is also reduced, facilitating the subcloning of the genes of interest into the plasmid expression vector.

[0117] The present invention also provides isolated polynucleotides comprising coding regions of IV polypeptides, e.g., NP, M1, M2, HA, NA, PB1, PB2, PA, NS1 or NS2, or fragments, variants, or derivatives thereof. The isolated polynucleotides can also be codon-optimized.

[0118] In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:2 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:2 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:2, optimized according to codon usage in humans, are designed as follows. The amino acid composition of SEQ ID NO:2 is shown in Table 6.

TABLE 6

AMINO ACID    Number in SEQ ID NO: 2
A  Ala        39
R  Arg        49
C  Cys        6
G  Gly        41
H  His        6
I  Ile        26
L  Leu        33
K  Lys        21
M  Met        25
F  Phe        18
P  Pro        17
S  Ser        40
T  Thr        28
W  Trp        6
Y  Tyr        15
V  Val        23
N  Asn        26
D  Asp        22
Q  Gln        21
E  Glu        36
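An amino acid composition of the kind shown in Table 6 is a simple tally of residues. The sketch below illustrates the tally on a 20-residue peptide obtained by translating nucleotides 1-60 of SEQ ID NO:24; the full SEQ ID NO:2 is not reproduced in this section, so the counts printed here are for that short peptide only, and the names used are illustrative.

```python
from collections import Counter


def amino_acid_composition(protein):
    """Count how many times each one-letter amino acid code occurs."""
    return Counter(protein)


# Residues 1-20, translated from nucleotides 1-60 of SEQ ID NO:24.
N_TERMINUS = "MASQGTKRSYEQMETDGERQ"

COMPOSITION = amino_acid_composition(N_TERMINUS)
for amino_acid, number in sorted(COMPOSITION.items()):
    print(amino_acid, number)
```

The resulting counts are the inputs needed by either the "uniform" or the "full optimization" method.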

[0119] Using the amino acid composition shown in Table 6, a human codon-optimized coding region which encodes SEQ ID NO:2 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:2 as follows: the 18 phenylalanine codons are TTC, the 33 leucine codons are CTG, the 26 isoleucine codons are ATC, the 25 methionine codons are ATG, the 23 valine codons are GTG, the 40 serine codons are AGC, the 17 proline codons are CCC, the 28 threonine codons are ACC, the 39 alanine codons are GCC, the 15 tyrosine codons are TAC, the 6 histidine codons are CAC, the 21 glutamine codons are CAG, the 26 asparagine codons are AAC, the 21 lysine codons are AAG, the 22 aspartic acid codons are GAC, the 36 glutamic acid codons are GAG, the 6 cysteine codons are TGC, the 6 tryptophan codons are TGG, the 49 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 41 glycine codons are GGC.

[0120] Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:2 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 2 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:2 as follows: about 8 of the 18 phenylalanine codons are TTT, and about 10 of the phenylalanine codons are TTC; about 2 of the 33 leucine codons are TTA, about 4 of the leucine codons are TTG, about 4 of the leucine codons are CTT, about 6 of the leucine codons are CTC, about 2 of the leucine codons are CTA, and about 13 of the leucine codons are CTG; about 9 of the 26 isoleucine codons are ATT, about 13 of the isoleucine codons are ATC, and about 4 of the isoleucine codons are ATA; the 25 methionine codons are ATG; about 4 of the 23 valine codons are GTT, about 5 of the valine codons are GTC, about 3 of the valine codons are GTA, and about 11 of the valine codons are GTG; about 7 of the 40 serine codons are TCT, about 9 of the serine codons are TCC, about 6 of the serine codons are TCA, about 2 of the serine codons are TCG, about 6 of the serine codons are AGT, and about 10 of the serine codons are AGC; about 5 of the 17 proline codons are CCT, about 6 of the proline codons are CCC, about 5 of the proline codons are CCA, and about 2 of the proline codons are CCG; about 7 of the 28 threonine codons are ACT, about 10 of the threonine codons are ACC, about 8 of the threonine codons are ACA, and about 3 of the threonine codons are ACG; about 10 of the 39 alanine codons are GCT, about 16 of the alanine codons are GCC, about 9 of the alanine codons are GCA, and about 4 of the alanine codons are GCG; about 7 of the 15 tyrosine codons are TAT and about 8 of the tyrosine codons are TAC; about 2 of the 6 histidine codons are CAT and about 4 of the histidine codons are CAC; about 5 of the 21 glutamine codons are CAA and about 16 of the glutamine codons are CAG; about 12 of the 26 asparagine codons are AAT and about 14 of the asparagine codons are AAC; about 9 of the 21 lysine codons are AAA and about 12 of the lysine codons are AAG; about 10 of the 22 aspartic acid codons are GAT and about 12 of the aspartic acid codons are GAC; about 15 of the 36 glutamic acid codons are GAA and about 21 of the glutamic acid codons are GAG; about 3 of the 6 cysteine codons are TGT and about 3 of the cysteine codons are TGC; the 6 tryptophan codons are TGG; about 4 of the 49 arginine codons are CGT, about 9 of the arginine codons are CGC, about 5 of the arginine codons are CGA, about 10 of the arginine codons are CGG, about 10 of the arginine codons are AGA, and about 10 of the arginine codons are AGG; and about 7 of the 41 glycine codons are GGT, about 14 of the glycine codons are GGC, about 10 of the glycine codons are GGA, and about 10 of the glycine codons are GGG.

[0121] As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.

[0122] A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:2, optimized according to codon usage in humans is presented herein as SEQ ID NO:23.

[0123] Additionally, a minimally codon-optimized nucleotide sequence encoding SEQ ID NO:2 can be designed by changing only certain codons found more frequently in IV genes than in human genes, as shown in Table 7. For example, if it is desired to substitute more frequently used codons in humans for those codons that occur at least 2 times more frequently in IV genes (designated with an asterisk in Table 7), Arg AGA, which occurs 2.3 times more frequently in IV genes than in human genes, is changed to, e.g., CGG; Asn AAT, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., AAC; Ile ATA, which occurs 3.6 times more frequently in IV genes than in human genes, is changed to, e.g., ATC; and Leu CTA, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., CTG.

TABLE 7 Codon Usage Table for Human Genes and IV Genes

Amino Acid   Codon   Human   IV
Ala  A       GCA     16      25
             GCG     8       5
             GCC     19      11
             GCT     19      15
Arg  R       AGA     12      28*
             AGG     11      14
             CGA     6       7
             CGG     12      4
             CGC     11      3
             CGT     5       3
Asn  N       AAC     20      27
             AAT     17      34*
Asp  D       GAC     26      20
             GAT     22      25
Cys  C       TGC     12      13
             TGT     10      12
Gln  Q       CAA     12      18
             CAG     35      20
Glu  E       GAA     30      39
             GAG     40      28
Gly  G       GGA     16      30
             GGG     16      19
             GGC     23      9
             GGT     11      13
His  H       CAC     15      13
             CAT     11      7
Ile  I       ATA     7       25*
             ATC     22      18
             ATT     16      23
Leu  L       CTA     7       14*
             CTG     40      17
             CTC     20      14
             CTT     13      14
             TTA     7       8
             TTG     13      14
Lys  K       AAA     24      35
             AAG     33      20
Met  M       ATG     22      30
Phe  F       TTC     21      17
             TTT     17      19
Pro  P       CCA     17      12
             CCG     7       4
             CCC     20      8
             CCT     17      13
Ser  S       AGC     19      14
             AGT     12      16
             TCA     12      23
             TCG     5       4
             TCC     18      12
             TCT     15      15
Thr  T       ACA     15      24
             ACG     6       4
             ACC     19      13
             ACT     13      19
Trp  W       TGG     13      18
Tyr  Y       TAC     16      12
             TAT     12      19
Val  V       GTA     7       13
             GTG     29      20
             GTC     15      12
             GTT     11      15
Term         TAA     1       2
             TAG     0.5     0.4
             TGA     1       1
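The ratio test described in this paragraph can be sketched in Python using the Table 7 values for the four amino acids named in the example (arginine, asparagine, isoleucine and leucine). The function name and the default threshold of 2.0 are illustrative. Choosing the replacement as the codon with the highest human usage among the other codons for the same amino acid is an assumption of this sketch; the text offers the replacements only as examples ("e.g.").

```python
# Usage values for four amino acids copied from Table 7: codon -> (human, IV).
TABLE_7_SUBSET = {
    "Arg": {"AGA": (12, 28), "AGG": (11, 14), "CGA": (6, 7),
            "CGG": (12, 4), "CGC": (11, 3), "CGT": (5, 3)},
    "Asn": {"AAC": (20, 27), "AAT": (17, 34)},
    "Ile": {"ATA": (7, 25), "ATC": (22, 18), "ATT": (16, 23)},
    "Leu": {"CTA": (7, 14), "CTG": (40, 17), "CTC": (20, 14),
            "CTT": (13, 14), "TTA": (7, 8), "TTG": (13, 14)},
}


def codons_to_replace(table, threshold=2.0):
    """Map each over-represented IV codon to (IV/human ratio, replacement codon)."""
    flagged = {}
    for codons in table.values():
        for codon, (human, iv) in codons.items():
            ratio = iv / human
            if ratio >= threshold:
                # Assumption: replace with the other codon for the same
                # amino acid that has the highest human usage.
                others = [c for c in codons if c != codon]
                replacement = max(others, key=lambda c: codons[c][0])
                flagged[codon] = (round(ratio, 1), replacement)
    return flagged


FLAGGED = codons_to_replace(TABLE_7_SUBSET)
for codon, (ratio, replacement) in FLAGGED.items():
    print(codon, ratio, "->", replacement)
```

Raising or lowering the threshold argument changes how many codons are flagged, which corresponds to adjusting the ratio as discussed for the minimal optimization method.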

[0124] In another form of minimal optimization, a Codon Usage Table (CUT) for the specific IV sequence in question is generated and compared to the CUT for human genomic DNA (see Table 7, supra). Amino acids are identified for which there is a difference of at least 10 percentage points in codon usage between human and IV DNA (either more or less). Then the wild type IV codon is modified to conform to the predominant human codon for each such amino acid. Furthermore, the remainder of the codons for that amino acid are also modified such that they conform to the predominant human codon for each such amino acid.
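One possible reading of this second form of minimal optimization is sketched below. It is an interpretation, not a definitive implementation: the sketch assumes that the comparison is made on per-amino-acid codon percentages, that a single codon differing by 10 percentage points or more is enough to select the amino acid, and that every codon for a selected amino acid is then set to the predominant human codon. The human percentages for lysine and cysteine are taken from Table 2 (in DNA nomenclature); the 18-nucleotide coding region is a hypothetical toy input, and all names are illustrative.

```python
from collections import Counter

# Human per-amino-acid codon usage in percent for two amino acids,
# taken from Table 2 and written in DNA nomenclature.
HUMAN_PCT = {
    "K": {"AAA": 42.12, "AAG": 57.88},
    "C": {"TGT": 44.68, "TGC": 55.32},
}


def minimally_optimize(cds, human_pct, cutoff=10.0):
    """Set all codons of an amino acid to the predominant human codon when
    usage in the input differs from human usage by at least `cutoff` points."""
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    usage = Counter(codons)
    out = list(codons)
    for pct in human_pct.values():
        total = sum(usage[c] for c in pct)
        if total == 0:
            continue  # amino acid absent from this coding region
        differs = any(abs(100 * usage[c] / total - pct[c]) >= cutoff for c in pct)
        if differs:
            predominant = max(pct, key=pct.get)
            out = [predominant if c in pct else c for c in out]
    return "".join(out)


# Hypothetical toy coding region: Lys-Lys-Cys-Lys-Lys-Cys.
TOY_CDS = "AAAAAATGTAAAAAGTGC"
print(minimally_optimize(TOY_CDS, HUMAN_PCT))
```

In the toy input, lysine uses AAA for 75% of its codons against about 42% in human genes, so all four lysine codons become AAG, while the cysteine codons, which differ from human usage by only about 5 points, are left unchanged.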

[0125] A representative "minimally optimized" codon-optimized coding region encoding SEQ ID NO:2, minimally optimized according to codon usage in humans by this latter method, is presented herein as SEQ ID NO:25: TABLE-US-00027 1 ATGGCCTCAC AGGGCACCAA GCGGAGTTAT GAGCAGATGG AGACCGATGG CGAGAGACAG 61 AACGCCACAG AGATCAGAGC CTCAGTTGGC AAGATGATCG GCGGCATCGG CCGGTTCTAT 121 ATCCAGATGT GCACGGAGCT GAAGCTGAGC GACTACGAGG GCAGACTGAT TCAGAACTCT 181 CTGACCATCG AGAGAATGGT CCTGAGTGCC TTCGATGAGA GACGAAACAA GTATCTGGAG 241 GAGCATCCCT CCGCCGGCAA GGACCCCAAG AAGACGGGCG GCCCCATATA TAGAAGAGTT 301 AACGGCAAGT GGATGAGAGA GCTGATCCTG TACGATAAGG AGGAGATCCG CAGAATATGG 361 AGGCAGGCCA ACAACGGCGA CGATGCCACT GCCGGCCTGA CACATATGAT GATATGGCAC 421 AGTAACCTGA ACGACGCCAC CTACCAGAGA ACAAGGGCCC TGGTTCGCAC GGGCATGGAT 481 CCCAGAATGT GTTCACTGAT GCAGGGCTCT ACACTGCCCA GAAGGTCTGG CGCCGCCGGC 541 GCCGCCGTCA AGGGCGTTGG CACAATGGTG ATGGAGCTGG TGCGGATGAT CAAGAGAGGC 601 ATTAACGATC GGAACTTTTG GAGGGGCGAG AACGGCAGAA AGACCAGGAT AGCCTACGAG 661 CGAATGTGCA ACATTCTGAA GGGCAAGTTC CAGACTGCCG CCCAGAAGGC CATGATGGAT 721 CAGGTGCGGG AGAGCAGAAA CCCCGGCAAC GCCGAGTTCG AGGACCTGAC TTTCCTGGCC 781 AGATCTGCCC TGATACTGAG GGGCTCTGTA GCCCACAAGT CCTGCCTGCC CGCCTGCGTG 841 TACGGCCCCG CCGTGGCCTC CGGCTATGAC TTCGAGCGAG AGGGCTACTC CCTGGTAGGC 901 ATCGATCCCT TTAGACTGCT GCAGAACTCT CAGGTCTACA GTCTGATTAG ACCCAACGAG 961 AACCCCGCCC ATAAGAGCCA GCTGGTGTGG ATGGCCTGCC ACAGTGCCGC CTTCGAGGAC 1021 CTGAGGGTGC TGTCTTTTAT AAAGGGCACA AAGGTGCTGC CCCGCGGCAA GCTGTCTACT 1081 AGGGGCGTCC AGATAGCCTC CAACGAGAAC ATGGAGACAA TGGAGTCTAG TACTCTGGAG 1141 CTGAGGTCTA GGTACTGGGC CATCAGGACT AGGAGCGGCG GCAACACCAA CCAGCAGAGG 1201 GCCAGCGCCG GCCAGATCAG CATTCAGCCC ACCTTCAGTG TACAGAGAAA CCTGCCCTTT 1261 GATAGAACTA CTGTTATGGC CGCCTTCTCT GGCAACACTG AGGGCAGAAC TAGTGACATG 1321 CGAACAGAGA TCATAAGAAT GATGGAGTCG GCCCGTCCCG AGGATGTGTC CTTTCAGGGC 1381 AGGGGCGTCT TCGAGCTGAG CGACGAGAAG GCCGCCAGCC CCATCGTACC CTCTTTCGAT 1441 ATGAGTAACG AGGGCTCGTA CTTTTTTGGC GACAACGCCG AGGAGTATGA TAACTGA
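Whatever optimization method is used, the optimized coding region must still encode the same polypeptide. The sketch below checks this for the first 60 nucleotides of SEQ ID NO:24 (uniform optimization) and of SEQ ID NO:25 (minimal optimization), as given in this section. The translation table is the standard genetic code of Table 1 in compact form; the function and variable names are illustrative.

```python
from itertools import product

# Standard genetic code (Table 1), codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a DNA coding region codon by codon ('*' = termination)."""
    usable = len(cds) - len(cds) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 1-60 of SEQ ID NO:24 and SEQ ID NO:25.
UNIFORM_START = "ATGGCCAGCCAGGGCACCAAGCGGAGCTACGAGCAGATGGAGACCGACGGCGAGCGGCAG"
MINIMAL_START = "ATGGCCTCACAGGGCACCAAGCGGAGTTATGAGCAGATGGAGACCGATGGCGAGAGACAG"

print(translate(UNIFORM_START))
print(translate(MINIMAL_START))
print(translate(UNIFORM_START) == translate(MINIMAL_START))
```

The two nucleotide sequences differ at several positions, yet both translate to the same 20 residues, which is the property every codon-optimized coding region described herein is required to preserve.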

[0126] In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:4 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:4 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:4, optimized according to codon usage in humans, are designed as follows. The amino acid composition of SEQ ID NO:4 is shown in Table 8.

TABLE 8

AMINO ACID    Number in SEQ ID NO: 4
A  Ala        25
R  Arg        17
C  Cys        3
G  Gly        16
H  His        5
I  Ile        11
L  Leu        26
K  Lys        13
M  Met        14
F  Phe        7
P  Pro        8
S  Ser        18
T  Thr        18
W  Trp        1
Y  Tyr        5
V  Val        16
N  Asn        11
D  Asp        6
Q  Gln        15
E  Glu        17

[0127] Using the amino acid composition shown in Table 8, a human codon-optimized coding region which encodes SEQ ID NO:4 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:4 as follows: the 7 phenylalanine codons are TTC, the 26 leucine codons are CTG, the 11 isoleucine codons are ATC, the 14 methionine codons are ATG, the 16 valine codons are GTG, the 18 serine codons are AGC, the 8 proline codons are CCC, the 18 threonine codons are ACC, the 25 alanine codons are GCC, the 5 tyrosine codons are TAC, the 5 histidine codons are CAC, the 15 glutamine codons are CAG, the 11 asparagine codons are AAC, the 13 lysine codons are AAG, the 6 aspartic acid codons are GAC, the 17 glutamic acid codons are GAG, the 3 cysteine codons are TGC, the 1 tryptophan codon is TGG, the 17 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 16 glycine codons are GGC.
The codon-optimized coding region designed by this method is presented herein as SEQ ID NO:27: TABLE-US-00029 ATGAGCCTGCTGACCGAGGTGGAGACCTACGTGCTGAGCATCATCCCCAG CGGCCCCCTGAAGGCCGAGATCGCCCAGAGGCTGGAGGACGTGTTCGCCG GCAAGAACACCGACCTGGAGGTGCTGATGGAGTGGCTGAAGACCAGGCCC ATCCTGAGCCCCCTGACCAAGGGCATCCTGGGCTTCGTGTTCACCCTGAC CGTGCCCAGCGAGAGGGGCCTGCAGAGGAGGAGGTTCGTGCAGAACGCCC TGAACGGCAACGGCGACCCCAACAACATGGACAAGGCCGTGAAGCTGTAC AGGAAGCTGAAGAGGGAGATCACCTTCCACGGCGCCAAGGAGATCAGCCT GAGCTACAGCGCCGGCGCCCTGGCCAGCTGCATGGGCCTGATCTACAACA GGATGGGCGCCGTGACCACCGAGGTGGCCTTCGGCCTGGTGTGCGCCACC TGCGAGCAGATCGCCGACAGCCAGCACAGGAGCCACAGGCAGATGGTGAC CACCACCAACCCCCTGATCAGGCACGAGAACAGGATGGTGCTGGCCAGCA CCACCGCCAAGGCCATGGAGCAGATGGCCGGCAGCAGCGAGCAGGCCGCC GAGGCCATGGAGGTGGCCAGCCAGGCCAGGCAGATGGTGCAGGCCATGAG GACCATCGGCACCCACCCCAGCAGCAGCGCCGGCCTGAAGAACGACCTGC TGGAGAACCTGCAGGCCTACCAGAAGAGGATGGGCGTGCAGATGCAGAGG TTCAAG

[0128] Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:4 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 2 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:4 as follows: about 3 of the 7 phenylalanine codons are TTT, and about 4 of the phenylalanine codons are TTC; about 2 of the 26 leucine codons are TTA, about 3 of the leucine codons are TTG, about 3 of the leucine codons are CTT, about 5 of the leucine codons are CTC, about 2 of the leucine codons are CTA, and about 11 of the leucine codons are CTG; about 4 of the 11 isoleucine codons are ATT, about 5 of the isoleucine codons are ATC, and about 2 of the isoleucine codons are ATA; the 14 methionine codons are ATG; about 3 of the 16 valine codons are GTT, about 4 of the valine codons are GTC, about 2 of the valine codons are GTA, and about 8 of the valine codons are GTG; about 3 of the 18 serine codons are TCT, about 4 of the serine codons are TCC, about 3 of the serine codons are TCA, about 1 of the serine codons is TCG, about 3 of the serine codons are AGT, and about 4 of the serine codons are AGC; about 2 of the 8 proline codons are CCT, about 3 of the proline codons are CCC, about 2 of the proline codons are CCA, and about 1 of the proline codons is CCG; about 4 of the 18 threonine codons are ACT, about 7 of the threonine codons are ACC, about 5 of the threonine codons are ACA, and about 2 of the threonine codons are ACG; about 7 of the 25 alanine codons are GCT, about 10 of the alanine codons are GCC, about 6 of the alanine codons are GCA, and about 3 of the alanine codons are GCG; about 2 of the 5 tyrosine codons are TAT and about 3 of the tyrosine codons are TAC; about 2 of the 5 histidine codons are CAT and about 3 of the histidine codons are CAC; about 4 of the 15 glutamine codons are CAA and about 11 of the glutamine codons are CAG; about 5 of the 11 asparagine codons are AAT and about 6 of the asparagine codons are AAC; about 5 of the 13 lysine codons are AAA and about 8 of the lysine codons are AAG; about 3 of the 6 aspartic acid codons are GAT and about 3 of the aspartic acid codons are GAC; about 7 of the 17 glutamic acid codons are GAA and about 10 of the glutamic acid codons are GAG; about 1 of the 3 cysteine codons is TGT and about 2 of the cysteine codons are TGC; the 1 tryptophan codon is TGG; about 1 of the 17 arginine codons is CGT, about 3 of the arginine codons are CGC, about 2 of the arginine codons are CGA, about 4 of the arginine codons are CGG, about 3 of the arginine codons are AGA, and about 3 of the arginine codons are AGG; and about 3 of the 16 glycine codons are GGT, about 6 of the glycine codons are GGC, about 4 of the glycine codons are GGA, and about 4 of the glycine codons are GGG.

[0129] As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.
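
By way of illustration only, the apportionment of codons under "full optimization" can be sketched in Python as follows. The function name `apportion_codons`, the largest-remainder rounding rule, and the relative usage values for phenylalanine and leucine are assumptions made for this sketch; they are not taken from the codon usage table of this application. The rounding rule is one simple way to keep the total for each amino acid constant while letting any single codon count differ by about one from its proportional share.

```python
from math import floor


def apportion_codons(total, freqs):
    """Split `total` occurrences of one amino acid among its codons.

    `freqs` maps codon -> relative usage (values summing to about 1.0).
    Largest-remainder rounding keeps the counts summing to `total`.
    """
    quotas = {codon: total * f for codon, f in freqs.items()}
    counts = {codon: floor(q) for codon, q in quotas.items()}
    leftover = total - sum(counts.values())
    # Give the remaining codons to the largest fractional parts first.
    by_remainder = sorted(quotas, key=lambda c: quotas[c] - counts[c], reverse=True)
    for codon in by_remainder[:leftover]:
        counts[codon] += 1
    return counts


# Assumed, illustrative relative usage values (hypothetical inputs).
PHE = {"TTT": 0.45, "TTC": 0.55}
LEU = {"TTA": 0.07, "TTG": 0.13, "CTT": 0.13, "CTC": 0.20, "CTA": 0.07, "CTG": 0.40}

print(apportion_codons(7, PHE))   # 7 phenylalanine codons, as for SEQ ID NO:4
print(apportion_codons(26, LEU))  # 26 leucine codons, as for SEQ ID NO:4
```

With these assumed inputs, the sketch yields 3 TTT and 4 TTC for the 7 phenylalanine codons, and 2, 3, 3, 5, 2, and 11 for the six leucine codons, which agrees with the counts recited for SEQ ID NO:4.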

[0130] A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:4, optimized according to codon usage in humans is presented herein as SEQ ID NO:26: TABLE-US-00030 ATGAGCTTGCTAACAGAAGTGGAAACCTATGTCCTCAGTATCATTCCTAG CGGCCCCTTAAAAGCCGAAATCGCTCAGCGGCTCGAGGATGTTTTTGCCG GCAAGAACACCGACCTGGAGGTATTGATGGAGTGGCTGAAAACGCGACCT ATTCTGAGCCCCCTGACTAAGGGAATACTCGGCTTCGTTTTTACATTGAC CGTGCCCTCAGAGAGGGGTCTCCAAAGGAGGCGCTTCGTGCAGAACGCCT TAAACGGGAACGGGGACCCAAATAATATGGATAAGGCAGTGAAACTGTAT CGCAAATTAAAGCGGGAGATAACCTTCCATGGAGCCAAGGAGATCTCCCT GTCTTACTCTGCAGGTGCTCTCGCGTCGTGTATGGGACTTATCTACAACC GAATGGGCGCCGTCACAACAGAAGTGGCTTTCGGGCTGGTGTGCGCAACT TGCGAACAGATTGCTGACAGTCAGCACCGGTCCCACCGTCAAATGGTCAC CACCACCAATCCGCTGATTAGACATGAAAATCGCATGGTTCTAGCATCAA CTACAGCCAAAGCAATGGAACAAATGGCCGGAAGCTCCGAGCAGGCTGCC GAGGCGATGGAGGTGGCGTCCCAGGCCAGACAGATGGTACAGGCTATGAG AACTATCGGTACGCACCCAAGTTCTTCAGCTGGGCTGAAGAATGATCTTC TTGAGAACCTGCAGGCCTACCAAAAGCGGATGGGCGTCCAGATGCAGAGA TTTAAA

[0131] Additionally, a minimally codon-optimized nucleotide sequence encoding SEQ ID NO:4 can be designed by changing only certain codons found more frequently in IV genes than in human genes, as shown in Table 7. For example, if it is desired to substitute more frequently used codons in humans for those codons that occur at least 2 times more frequently in IV genes (designated with an asterisk in Table 7), Arg AGA, which occurs 2.3 times more frequently in IV genes than in human genes, is changed to, e.g., CGG; Asn AAT, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., AAC; Ile ATA, which occurs 3.6 times more frequently in IV genes than in human genes, is changed to, e.g., ATC; and Leu CTA, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., CTG.
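
As a non-limiting sketch, the four substitutions named above can be applied to a coding region in Python as shown below. The function name `minimally_optimize` and the short input sequence are hypothetical; the substitution table contains only the example replacements given in the preceding paragraph. Codons are read in frame from the first nucleotide, so that a listed triplet occurring out of frame is left unchanged.

```python
# Example substitutions from the text: IV-favored codon -> human-favored codon.
SWAPS = {"AGA": "CGG", "AAT": "AAC", "ATA": "ATC", "CTA": "CTG"}


def minimally_optimize(cds, swaps=SWAPS):
    """Replace only the listed codons, reading `cds` in frame from position 0."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    cds = cds.upper()
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    # Every swap is synonymous, so the encoded polypeptide is unchanged.
    return "".join(swaps.get(codon, codon) for codon in codons)


# Hypothetical toy input: ATG AGA AAT ATA CTA TAA
print(minimally_optimize("ATGAGAAATATACTATAA"))
```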

[0132] In another form of minimal optimization, a Codon Usage Table (CUT) for the specific IV sequence in question is generated and compared to the CUT for human genomic DNA (see Table 7, supra). Amino acids are identified for which there is a difference of at least 10 percentage points in codon usage between human and IV DNA (either more or less). Then the wild-type IV codon is modified to conform to the predominant human codon for each such amino acid. Furthermore, the remainder of the codons for that amino acid are also modified such that they conform to the predominant human codon for each such amino acid.
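
The comparison of codon usage tables described above can be sketched, under stated assumptions, as follows. The names `codon_usage_table`, `optimize_by_cut`, and `HUMAN_CUT_SUBSET` are hypothetical, and the human percentages shown are illustrative placeholders for two amino acids only; they are not the values of Table 7. The sketch assumes an upper-case, in-frame coding sequence and the standard genetic code.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Assumed, illustrative human usage (percent per amino acid); placeholder values.
HUMAN_CUT_SUBSET = {
    "I": {"ATT": 36.0, "ATC": 47.0, "ATA": 17.0},
    "K": {"AAA": 43.0, "AAG": 57.0},
}


def split_codons(cds):
    """Return the in-frame codons of `cds`, ignoring any trailing partial codon."""
    return [cds[i:i + 3] for i in range(0, len(cds) - len(cds) % 3, 3)]


def codon_usage_table(cds):
    """Return {amino acid: {codon: percent of that amino acid's codons}}."""
    counts = Counter(split_codons(cds))
    totals = Counter()
    for codon, n in counts.items():
        totals[CODON_TO_AA[codon]] += n
    table = defaultdict(dict)
    for codon, aa in CODON_TO_AA.items():
        if totals[aa]:
            table[aa][codon] = 100.0 * counts[codon] / totals[aa]
    return dict(table)


def optimize_by_cut(cds, human_cut, threshold=10.0):
    """Recode all codons of each amino acid whose usage differs from the
    human table by at least `threshold` percentage points for any codon."""
    viral_cut = codon_usage_table(cds)
    preferred = {}
    for aa, human in human_cut.items():
        viral = viral_cut.get(aa)
        if viral is None:
            continue  # amino acid absent from this sequence
        if any(abs(viral.get(c, 0.0) - pct) >= threshold for c, pct in human.items()):
            preferred[aa] = max(human, key=human.get)  # predominant human codon
    return "".join(preferred.get(CODON_TO_AA[c], c) for c in split_codons(cds))


# Hypothetical toy input: ATG ATA ATA ATT AAA AAG
print(optimize_by_cut("ATGATAATAATTAAAAAG", HUMAN_CUT_SUBSET))
```

In this toy input, isoleucine usage differs from the assumed human values by more than 10 percentage points, so every isoleucine codon is recoded to ATC, whereas lysine usage differs by only 7 percentage points and is left unchanged.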

[0133] A representative "minimally optimized" codon-optimized coding region encoding SEQ ID NO:4, minimally optimized according to codon usage in humans by this latter method, is presented herein as SEQ ID NO:28: TABLE-US-00031 ATGAGTCTGCTGACAGAGGTTGAGACGTACGTGCTGTCCATCATTCCCTC AGGCCCCCTGAAGGCCGAGATTGCCCAGAGACTGGAGGACGTCTTCGCCG GCAAGAACACCGATCTGGAGGTGCTGATGGAGTGGCTGAAGACTCGCCCC ATCCTGTCTCCCCTGACAAAGGGCATCCTGGGCTTCGTATTTACACTGAC CGTCCCCTCCGAGAGAGGCCTGCAGCGGAGGAGGTTCGTTCAGAACGCCC TGAACGGCAACGGCGATCCCAACAACATGGATAAGGCCGTGAAGCTGTAT AGAAAGCTGAAGCGAGAGATCACATTTCATGGCGCCAAGGAGATATCGCT GAGCTACAGTGCCGGCGCCCTGGCCTCTTGCATGGGCCTGATATACAACA GAATGGGCGCCGTTACTACAGAGGTAGCCTTTGGCCTGGTCTGCGCCACT TGCGAGCAGATCGCCGACTCTCAGCATAGATCTCACAGACAGATGGTGAC GACTACAAACCCCCTGATACGGCACGAGAACAGGATGGTGCTGGCCTCTA CTACCGCCAAGGCCATGGAGCAGATGGCCGGCAGCAGTGAGCAGGCCGCC GAGGCCATGGAGGTAGCCTCACAGGCCAGGCAGATGGTGCAGGCCATGCG AACCATCGGCACTCACCCCTCCAGCTCTGCCGGCCTGAAGAACGACCTGC TGGAGAACCTGCAGGCCTATCAGAAGAGAATGGGCGTACAGATGCAGAGG TTCAAG

[0134] In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:5 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:5 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:5, optimized according to codon usage in humans are designed as follows. The amino acid composition of SEQ ID NO:5 is shown in Table 9.

TABLE-US-00032
TABLE 9
AMINO ACID	Number in SEQ ID NO:5
A Ala	5
R Arg	7
C Cys	3
G Gly	8
H His	2
I Ile	8
L Leu	10
K Lys	5
M Met	2
F Phe	4
P Pro	4
S Ser	7
T Thr	4
W Trp	2
Y Tyr	3
V Val	4
N Asn	3
D Asp	5
Q Gln	2
E Glu	9

[0135] Using the amino acid composition shown in Table 9, a human codon-optimized coding region which encodes SEQ ID NO:5 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:5 as follows: the 4 phenylalanine codons are TTC, the 10 leucine codons are CTG, the 8 isoleucine codons are ATC, the 2 methionine codons are ATG, the 4 valine codons are GTG, the 7 serine codons are AGC, the 4 proline codons are CCC, the 4 threonine codons are ACC, the 5 alanine codons are GCC, the 3 tyrosine codons are TAC, the 2 histidine codons are CAC, the 2 glutamine codons are CAG, the 3 asparagine codons are AAC, the 5 lysine codons are AAG, the 5 aspartic acid codons are GAC, the 9 glutamic acid codons are GAG, the 2 tryptophan codons are TGG, the 7 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 8 glycine codons are GGC. The codon-optimized PA coding region designed by this method is presented herein as SEQ ID NO:30:

TABLE-US-00033
1 ATGAGCCTGC TGACCGAGGT GGAGACCCCC ATCCGGAACG AGTGGGGCTG CCGGTGCAAC
61 GGCAGCAGCG ACCCCCTGGC CATCGCCGCC AACATCATCG GCATCCTGCA CCTGACCCTG
121 TGGATCCTGG ACCGGCTGTT CTTCAAGTGC ATCTACCGGC GGTTCAAGTA CGGCCTGAAG
181 GGCGGCCCCA GCACCGAGGG CGTGCCCAAG AGCATGCGGG AGGAGTACCG GAAGGAGCAG
241 CAGAGCGCCG TGGACGCCGA CGACGGCCAC TTCGTGAGCA TCGAGCTGGA GTGA
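
The "uniform" assignment described above amounts to a one-codon-per-amino-acid lookup, which may be sketched as follows. The names `UNIFORM_HUMAN` and `uniform_back_translate` are hypothetical. The cysteine codon (TGC) and the stop codon (TGA) are not recited in the paragraph above and are inferred here from SEQ ID NO:30; the 20-residue input peptide is likewise an assumption, obtained by translating the first 60 nucleotides of SEQ ID NO:30 with the standard genetic code.

```python
# One codon per amino acid, as recited in the text.
# Assumption: Cys -> TGC is inferred from SEQ ID NO:30, not stated in the text.
UNIFORM_HUMAN = {
    "F": "TTC", "L": "CTG", "I": "ATC", "M": "ATG", "V": "GTG",
    "S": "AGC", "P": "CCC", "T": "ACC", "A": "GCC", "Y": "TAC",
    "H": "CAC", "Q": "CAG", "N": "AAC", "K": "AAG", "D": "GAC",
    "E": "GAG", "W": "TGG", "G": "GGC", "C": "TGC",
}


def uniform_back_translate(protein, arg_codon="CGG", stop_codon="TGA"):
    """Back-translate `protein` using a single codon for each amino acid.

    The text allows CGG, AGA, or AGG for arginine; `arg_codon` selects one.
    """
    if arg_codon not in ("CGG", "AGA", "AGG"):
        raise ValueError("arginine codon must be CGG, AGA, or AGG")
    table = dict(UNIFORM_HUMAN, R=arg_codon)
    return "".join(table[aa] for aa in protein.upper()) + stop_codon


# Assumed input: translation of the first row of SEQ ID NO:30.
PEPTIDE = "MSLLTEVETPIRNEWGCRCN"
print(uniform_back_translate(PEPTIDE, arg_codon="CGG", stop_codon=""))
```

With CGG selected for arginine, the output reproduces nucleotides 1 to 60 of SEQ ID NO:30.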

[0136] Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:5 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 9 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:5 as follows: about 2 of the 4 phenylalanine codons are TTT, and about 2 of the phenylalanine codons are TTC; about 1 of the 10 leucine codons is TTA, about 1 of the leucine codons is TTG, about 1 of the leucine codons is CTT, about 2 of the leucine codons are CTC, about 1 of the leucine codons is CTA, and about 4 of the leucine codons are CTG; about 3 of the 8 isoleucine codons are ATT, about 4 of the isoleucine codons are ATC, and about 1 of the isoleucine codons is ATA; the 2 methionine codons are ATG; about 1 of the 4 valine codons is GTT, about 1 of the valine codons is GTC, about 0 of the valine codons are GTA, and about 2 of the valine codons are GTG; about 1 of the 7 serine codons is TCT, about 2 of the serine codons are TCC, about 1 of the serine codons is TCA, about 0 of the serine codons are TCG, about 1 of the serine codons is AGT, and about 2 of the serine codons are AGC; about 1 of the 4 proline codons is CCT, about 1 of the proline codons is CCC, about 2 of the proline codons are CCA, and about 0 of the proline codons are CCG; about 1 of the 4 threonine codons is ACT, about 1 of the threonine codons is ACC, about 1 of the threonine codons is ACA, and about 0 of the threonine codons are ACG; about 1 of the 5 alanine codons is GCT, about 2 of the alanine codons are GCC, about 1 of the alanine codons is GCA, and about 1 of the alanine codons is GCG; about 1 of the 3 tyrosine codons is TAT and about 2 of the tyrosine codons are TAC; about 1 of the 2 histidine codons is CAT and about 1 of the histidine codons is CAC; about 1 of the 2 glutamine codons is CAA and about 1 of the glutamine codons is CAG; about 1 of the 3 asparagine codons is AAT and about 2 of the asparagine codons are AAC; about 2 of the 5 lysine codons are AAA and about 3 of the lysine codons are AAG; about 2 of the 5 aspartic acid codons are GAT and about 3 of the aspartic acid codons are GAC; about 4 of the 9 glutamic acid codons are GAA and about 5 of the glutamic acid codons are GAG; about 1 of the 3 cysteine codons is TGT and about 2 of the cysteine codons are TGC; the 2 tryptophan codons are TGG; about 1 of the 7 arginine codons is CGT, about 1 of the arginine codons is CGC, about 1 of the arginine codons is CGA, about 1 of the arginine codons is CGG, about 1 of the arginine codons is AGA, and about 1 of the arginine codons is AGG; and about 1 of the 8 glycine codons is GGT, about 3 of the glycine codons are GGC, about 2 of the glycine codons are GGA, and about 2 of the glycine codons are GGG.

[0137] As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.

[0138] A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:5, optimized according to codon usage in humans is presented herein as SEQ ID NO:29:

TABLE-US-00034
1 ATGAGTCTTC TAACCGAGGT CGAAACGCCT ATCAGAAACG AATGGGGGTG CAGATGCAAC
61 GGTTCAAGTG ATCCTCTCGC TATTGCCGCA AATATCATTG GGATCTTGCA CTTGACATTG
121 TGGATTCTTG ATCGTCTTTT TTTCAAATGC ATTTACCGTC GCTTTAAATA CGGACTGAAA
181 GGAGGGCCTT CTACGGAAGG AGTGCCAAAG TCTATGAGGG AAGAATATCG AAAGGAACAG
241 CAGAGTGCTG TGGATGCTGA CGATGGTCAT TTTGTCAGCA TAGAGCTGGA GTAA
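
As an illustrative cross-check, a codon-optimized coding region can be translated and its amino acid composition tallied against the corresponding composition table. The sketch below does so for SEQ ID NO:29 and Table 9. The names `translate`, `SEQ_ID_NO_29`, and `TABLE_9` are hypothetical, and the standard genetic code is assumed.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# SEQ ID NO:29 as listed above, with position numbers and spacing removed.
SEQ_ID_NO_29 = "".join("""
ATGAGTCTTC TAACCGAGGT CGAAACGCCT ATCAGAAACG AATGGGGGTG CAGATGCAAC
GGTTCAAGTG ATCCTCTCGC TATTGCCGCA AATATCATTG GGATCTTGCA CTTGACATTG
TGGATTCTTG ATCGTCTTTT TTTCAAATGC ATTTACCGTC GCTTTAAATA CGGACTGAAA
GGAGGGCCTT CTACGGAAGG AGTGCCAAAG TCTATGAGGG AAGAATATCG AAAGGAACAG
CAGAGTGCTG TGGATGCTGA CGATGGTCAT TTTGTCAGCA TAGAGCTGGA GTAA
""".split())

# Amino acid composition of SEQ ID NO:5 as given in Table 9.
TABLE_9 = {
    "A": 5, "R": 7, "C": 3, "G": 8, "H": 2, "I": 8, "L": 10, "K": 5,
    "M": 2, "F": 4, "P": 4, "S": 7, "T": 4, "W": 2, "Y": 3, "V": 4,
    "N": 3, "D": 5, "Q": 2, "E": 9,
}


def translate(cds):
    """Translate an in-frame coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TO_AA[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


protein = translate(SEQ_ID_NO_29)
composition = dict(Counter(protein))
print(len(protein), composition == TABLE_9)
```

The translated polypeptide has 97 residues, and its composition agrees with Table 9, as expected if the optimized coding region encodes the same polypeptide.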

[0139] Additionally, a minimally codon-optimized nucleotide sequence encoding SEQ ID NO:5 can be designed by changing only certain codons found more frequently in IV genes than in human genes, as shown in Table 7. For example, if it is desired to substitute more frequently used codons in humans for those codons that occur at least 2 times more frequently in IV genes (designated with an asterisk in Table 7), Arg AGA, which occurs 2.3 times more frequently in IV genes than in human genes, is changed to, e.g., CGG; Asn AAT, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., AAC; Ile ATA, which occurs 3.6 times more frequently in IV genes than in human genes, is changed to, e.g., ATC; and Leu CTA, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., CTG.

[0140] In another form of minimal optimization, a Codon Usage Table (CUT) for the specific IV sequence in question is generated and compared to the CUT for human genomic DNA (see Table 7, supra). Amino acids are identified for which there is a difference of at least 10 percentage points in codon usage between human and IV DNA (either more or less). Then the wild-type IV codon is modified to conform to the predominant human codon for each such amino acid. Furthermore, the remainder of the codons for that amino acid are also modified such that they conform to the predominant human codon for each such amino acid.

[0141] A representative "minimally optimized" codon-optimized coding region encoding SEQ ID NO:5, minimally optimized according to codon usage in humans by this latter method, is presented herein as SEQ ID NO:31:

TABLE-US-00035
1 ATGTCTCTGC TGACAGAGGT GGAGACACCC ATAAGGAACG AGTGGGGCTG CAGGTGCAAC
61 GGCTCTAGTG ATCCCCTGGC CATCGCCGCC AACATCATTG GCATACTGCA TCTGACCCTG
121 TGGATCCTGG ATAGACTGTT CTTTAAGTGC ATTTACAGAC GATTTAAGTA TGGCCTGAAG
181 GGCGGCCCCT CAACTGAGGG CGTGCCCAAG AGTATGAGAG AGGAGTACCG GAAGGAGCAG
241 CAGAGCGCCG TTGACGCCGA TGACGGCCAC TTCGTCTCCA TCGAGCTGGA GTGA

[0142] In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:7 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:7 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:7, optimized according to codon usage in humans are designed as follows. The amino acid composition of SEQ ID NO:7 is shown in Table 10.

TABLE-US-00036
TABLE 10
AMINO ACID	Number in SEQ ID NO:7
A Ala	39
R Arg	51
C Cys	8
G Gly	43
H His	6
I Ile	27
L Leu	35
K Lys	21
M Met	26
F Phe	18
P Pro	18
S Ser	43
T Thr	30
W Trp	7
Y Tyr	15
V Val	24
N Asn	28
D Asp	23
Q Gln	21
E Glu	39

[0143] Using the amino acid composition shown in Table 10, a human codon-optimized coding region which encodes SEQ ID NO:7 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:7 as follows: the 18 phenylalanine codons are TTC, the 35 leucine codons are CTG, the 27 isoleucine codons are ATC, the 26 methionine codons are ATG, the 24 valine codons are GTG, the 43 serine codons are AGC, the 18 proline codons are CCC, the 30 threonine codons are ACC, the 39 alanine codons are GCC, the 15 tyrosine codons are TAC, the 6 histidine codons are CAC, the 21 glutamine codons are CAG, the 28 asparagine codons are AAC, the 21 lysine codons are AAG, the 23 aspartic acid codons are GAC, the 39 glutamic acid codons are GAG, the 7 tryptophan codons are TGG, the 51 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 43 glycine codons are GGC. 
The codon-optimized PA coding region designed by this method is presented herein as SEQ ID NO:33: TABLE-US-00037 ATGAGCCTGCTGACCGAGGTGGAGACCCCCATCAGGAACGAGTGGGGCT GCAGGTGCAACGGCAGCAGCGACATGGCCAGCCAGGGCACCAAGAGGAGC TACGAGCAGATGGAGACCGACGGCGAGAGGCAGAACGCCACCGAGATCAG GGCCAGCGTGGGCAAGATGATCGGCGGCATCGGCAGGTTCTACATCCAGA TGTGCACCGAGCTGAAGCTGAGCGACTACGAGGGCAGGCTGATCCAGAAC AGCCTGACCATCGAGAGGATGGTGCTGAGCGCCTTCGACGAGAGGAGGAA CAAGTACCTGGAGGAGCACCCCAGCGCCGGCAAGGACCCCAAGAAGACCG GCGGCCCCATCTACAGGAGGGTGAACGGCAAGTGGATGAGGGAGCTGATC CTGTACGACAAGGAGGAGATCAGGAGGATCTGGAGGCAGGCCAACAACGG CGACGACGCCACCGCCGGCCTGACCCACATGATGATCTGGCACAGCAACC TGAACGACGCCACCTACCAGAGGACCAGGGCCCTGGTGAGGACCGGCATG GACCCCAGGATGTGCAGCCTGATGCAGGGCAGCACCCTGCCCAGGAGGAG CGGCGCCGCCGGCGCCGCCGTGAAGGGCGTGGGCACCATGGTGATGGAGC TGGTGAGGATGATCAAGAGGGGCATCAACGACAGGAACTTCTGGAGGGGC GAGAACGGCAGGAAGACCAGGATCGCCTACGAGAGGATGTGCAACATCCT GAAGGGCAAGTTCCAGACCGCCGCCCAGAAGGCCATGATGGACCAGGTGA GGGAGAGCAGGAACCCCGGCAACGCCGAGTTCGAGGACCTGACCTTCCTG GCCAGGAGCGCCCTGATCCTGAGGGGCAGCGTGGCCCACAAGAGCTGCCT GCCCGCCTGCGTGTACGGCCCCGCCGTGGCCAGCGGCTACGACTTCGAGA GGGAGGGCTACAGCCTGGTGGGCATCGACCCCTTCAGGCTGCTGCAGAAC AGCCAGGTGTACAGCCTGATCAGGCCCAACGAGAACCCCGCCCACAAGAG CCAGCTGGTGTGGATGGCCTGCCACAGCGCCGCCTTCGAGGACCTGAGGG TGCTGAGCTTCATCAAGGGCACCAAGGTGCTGCCCAGGGGCAAGCTGAGC ACCAGGGGCGTGCAGATCGCCAGCAACGAGAACATGGAGACCATGGAGAG CAGCACCCTGGAGCTGAGGAGCAGGTACTGGGCCATCAGGACCAGGAGCG GCGGCAACACCAACCAGCAGAGGGCCAGCGCCGGCCAGATCAGCATCCAG CCCACCTTCAGCGTGCAGAGGAACCTGCCCTTCGACAGGACCACCGTGAT GGCCGCCTTCAGCGGCAACACCGAGGGCAGGACCAGCGACATGAGGACCG AGATCATCAGGATGATGGAGAGCGCCAGGCCCGAGGACGTGAGCTTCCAG GGCAGGGGCGTGTTCGAGCTGAGCGACGAGAAGGCCGCCAGCCCCATCGT GCCCAGCTTCGACATGAGCAACGAGGGCAGCTACTTCTTCGGCGACAACG CCGAGGAGTACGACAAC

[0144] Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:7 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 10 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:7 as follows: about 8 of the 18 phenylalanine codons are TTT, and about 10 of the phenylalanine codons are TTC; about 3 of the 35 leucine codons are TTA, about 4 of the leucine codons are TTG, about 5 of the leucine codons are CTT, about 7 of the leucine codons are CTC, about 2 of the leucine codons are CTA, and about 14 of the leucine codons are CTG; about 10 of the 27 isoleucine codons are ATT, about 13 of the isoleucine codons are ATC, and about 4 of the isoleucine codons are ATA; the 26 methionine codons are ATG; about 4 of the 24 valine codons are GTT, about 6 of the valine codons are GTC, about 3 of the valine codons are GTA, and about 11 of the valine codons are GTG; about 8 of the 43 serine codons are TCT, about 9 of the serine codons are TCC, about 6 of the serine codons are TCA, about 2 of the serine codons are TCG, about 6 of the serine codons are AGT, and about 10 of the serine codons are AGC; about 5 of the 18 proline codons are CCT, about 6 of the proline codons are CCC, about 5 of the proline codons are CCA, and about 2 of the proline codons are CCG; about 7 of the 30 threonine codons are ACT, about 11 of the threonine codons are ACC, about 8 of the threonine codons are ACA, and about 4 of the threonine codons are ACG; about 10 of the 39 alanine codons are GCT, about 16 of the alanine codons are GCC, about 9 of the alanine codons are GCA, and about 4 of the alanine codons are GCG; about 7 of the 15 tyrosine codons are TAT and about 8 of the tyrosine codons are TAC; about 2 of the 6 histidine codons are CAT and about 4 of the histidine codons are CAC; about 5 of the 21 glutamine codons are CAA and about 16 of the glutamine codons are CAG; about 13 of the 28 asparagine codons are AAT and about 15 of the asparagine codons are AAC; about 9 of the 21 lysine codons are AAA and about 12 of the lysine codons are AAG; about 11 of the 23 aspartic acid codons are GAT and about 12 of the aspartic acid codons are GAC; about 16 of the 39 glutamic acid codons are GAA and about 23 of the glutamic acid codons are GAG; about 4 of the 8 cysteine codons are TGT and about 4 of the cysteine codons are TGC; the 7 tryptophan codons are TGG; about 4 of the 51 arginine codons are CGT, about 10 of the arginine codons are CGC, about 6 of the arginine codons are CGA, about 11 of the arginine codons are CGG, about 10 of the arginine codons are AGA, and about 10 of the arginine codons are AGG; and about 7 of the 43 glycine codons are GGT, about 15 of the glycine codons are GGC, about 11 of the glycine codons are GGA, and about 11 of the glycine codons are GGG.

[0145] As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.

[0146] A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:7, optimized according to codon usage in humans is presented herein as SEQ ID NO:32: TABLE-US-00038 ATGAGCCTTCTCACAGAAGTGGAAACACCTATCAGAAATGAATGGGGATG CAGATGCAATGGGTCGAGTGATATGGCCTCTCAAGGTACGAAAAGAAGCT ACGAGCAAATGGAAACGGATGGAGAAAGACAAAACGCGACCGAAATCAGA GCATCCGTCGGGAAGATGATTGGAGGAATCGGACGATTCTACATCCAGAT GTGCACAGAGCTAAAGCTATCGGATTATGAAGGGAGACTAATACAAAATA GCCTAACTATCGAGAGAATGGTGCTGTCTGCATTTGACGAAAGGAGAAAC AAATACCTGGAAGAACACCCCTCTGCAGGGAAAGACCCAAAAAAAACTGG AGGTCCGATATACCGGAGAGTCAACGGTAAATGGATGAGAGAGCTGATCT TGTATGATAAGGAAGAAATAAGACGCATCTGGCGGCAAGCTAATAATGGA GACGACGCTACTGCAGGGCTCACGCATATGATGATCTGGCACTCTAATTT GAATGATGCAACGTACCAAAGAACCCGCGCACTTGTGCGGACCGGAATGG ACCCTCGTATGTGCAGCCTTATGCAGGGGTCCACACTGCCCAGAAGGTCC GGAGCAGCTGGAGCAGCAGTAAAGGGGGTTGGAACCATGGTGATGGAGCT GGTGAGAATGATTAAGAGGGGGATCAATGACAGGAACTTCTGGCGAGGAG AAAACGGGAGAAAAACTAGGATAGCATATGAGAGGATGTGTAACATCCTC AAAGGAAAATTCCAAACCGCTGCTCAGAAAGCAATGATGGATCAAGTACG CGAAAGTAGAAATCCTGGAAATGCAGAGTTTGAAGATCTCACTTTCCTCG CGCGAAGCGCTCTCATCCTCAGAGGGAGTGTCGCTCATAAAAGTTGCCTG CCTGCCTGCGTATATGGTCCTGCCGTGGCAAGTGGATACGACTTTGAGAG AGAGGGGTACTCTCTTGTTGGAATAGATCCATTCAGATTACTTCAGAATT CCCAGGTGTACAGTTTAATAAGGCCAAACGAAAATCCTGCACACAAATCA CAACTTGTTTGGATGGCATGCCATAGTGCCGCATTCGAAGATCTAAGAGT TCTCTCTTTCATCAAAGGTACAAAGGTCCTTCCAAGGGGAAAACTCTCTA CCAGAGGGGTACAAATAGCTTCAAATGAGAACATGGAGACAATGGAATCT AGCACATTGGAATTGAGAAGTAGGTATTGGGCCATTAGAACCAGGAGTGG AGGCAATACTAATCAACAGCGGGCTTCTGCCGGTCAAATTAGCATACAAC CTACTTTTTCAGTGCAACGGAATCTCCCTTTTGATAGGACAACTGTCATG GCGGCATTCTCTGGAAATACCGAAGGAAGGACTTCCGATATGAGGACTGA GATCATTAGGATGATGGAAAGTGCCCGACCTGAAGACGTCAGTTTTCAAG GAAGAGGTGTGTTCGAACTCTCTGACGAAAAGGCAGCTAGCCCAATCGTT CCTTCTTTTGATATGTCAAATGAAGGATCCTACTTCTTCGGCGATAATGC GGAGGAATATGACAAC

[0147] In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:9 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:9 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:9, optimized according to codon usage in humans are designed as follows. The amino acid composition of SEQ ID NO:9 is shown in Table 11.

TABLE-US-00039
TABLE 11
AMINO ACID	Number in SEQ ID NO:9
A Ala	39
R Arg	51
C Cys	8
G Gly	43
H His	6
I Ile	27
L Leu	35
K Lys	21
M Met	26
F Phe	18
P Pro	18
S Ser	43
T Thr	30
W Trp	7
Y Tyr	15
V Val	24
N Asn	28
D Asp	23
Q Gln	21
E Glu	39

[0148] Using the amino acid composition shown in Table 11, a human codon-optimized coding region which encodes SEQ ID NO:9 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:9 as follows: the 18 phenylalanine codons are TTC, the 35 leucine codons are CTG, the 27 isoleucine codons are ATC, the 26 methionine codons are ATG, the 24 valine codons are GTG, the 43 serine codons are AGC, the 18 proline codons are CCC, the 30 threonine codons are ACC, the 39 alanine codons are GCC, the 15 tyrosine codons are TAC, the 6 histidine codons are CAC, the 21 glutamine codons are CAG, the 28 asparagine codons are AAC, the 21 lysine codons are AAG, the 23 aspartic acid codons are GAC, the 39 glutamic acid codons are GAG, the 7 tryptophan codons are TGG, the 51 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 43 glycine codons are GGC. 
The codon-optimized PA coding region designed by this method is presented herein as SEQ ID NO:35: TABLE-US-00040 ATGGCCAGCCAGGGCACCAAGAGGAGCTACGAGCAGATGGAGACCGACGG CGAGAGGCAGAACGCCACCGAGATCAGGGGCAGCGTGGGCAAGATGATCG GCGGCATCGGCAGGTTCTACATCCAGATGTGCACCGAGCTGAAGCTGAGC GACTACGAGGGCAGGCTGATCCAGAACAGCCTGACCATCGAGAGGATGGT GCTGAGCGCCTTCGACGAGAGGAGGAACAAGTACCTGGAGGAGCACCCCA GCGCCGGCAAGGACCCCAAGAAGACCGGCGGCCCCATCTACAGGAGGGT GAACGGCAAGTGGATGAGGGAGCTGATCCTGTACGACAAGGAGGAGATCA GGAGGATCTGGAGGCAGGCCAACAACGGCGACGACGCCACCGCCGGCCTG ACCCACATGATGATCTGGCACAGCAACCTGAACGACGCCACCTACCAGAG GACCAGGGCCCTGGTGAGGACCGGCATGGACCCCAGGATGTGCAGCCTGA TGCAGGGCAGCACCCTGCCCAGGAGGAGCGGCGCCGCCGGCGCCGCCGTG AAGGGCGTGGGCACCATGGTGATGGAGCTGGTGAGGATGATCAAGAGGGG CATCAACGACAGGAACTTCTGGAGGGGCGAGAACGGCAGGAAGACCAGGA TCGCCTACGAGAGGATGTGCAACATCCTGAAGGGCAAGTTCCAGACCGCC GCCCAGAAGGCCATGATGGACCAGGTGAGGGAGAGCAGGAACCCCGGCAA CGCCGAGTTCGAGGACCTGACCTTCCTGGCCAGGAGCGCCCTGATCCTGA GGGGCAGCGTGGCCCACAAGAGCTGCCTGCCCGCCTGCGTGTACGGCCCC GCCGTGGCCAGCGGCTACGACTTCGAGAGGGAGGGCTACAGCCTGGTGGG CATCGACCCCTTCAGGCTGCTGCAGAACAGCCAGGTGTACAGCCTGATCA GGCCCAACGAGAACCCCGCCCACAAGAGCCAGCTGGTGTGGATGGCCTGC CACAGCGCCGCCTTCGAGGACCTGAGGGTGCTGAGCTTCATCAAGGGCAC CAAGGTGCTGCCCAGGGGCAAGCTGAGCACCAGGGGCGTGCAGATGGGCA GCAAGGAGAACATGGAGACCATGGAGAGCAGCACCCTGGAGCTGAGGAGC AGGTACTGGGCCATCAGGACCAGGAGCGGCGGCAACACCAACCAGCAGAG GGCCAGCGCCGGCCAGATCAGCATCCAGCCCACCTTCAGCGTGCAGAGGA ACCTGCCCTTCGACAGGACCACCGTGATGGCCGCCTTCAGCGGCAACACC GAGGGCAGGACCAGCGACATGAGGACCGAGATCATCAGGATGATGGAGAG CGCCAGGCCCGAGGACGTGAGCTTCCAGGGCAGGGGCGTGTTCGAGCTGA GCGACGAGAAGGCCGCCAGCCCCATCGTGCCCAGCTTCGACATGAGCAAC GAGGGCAGCTACTTCTTCGGCGACAACGCCGAGGAGTACGACAACATGAG CCTGCTGACCGAGGTGGAGACCCCCATCAGGAACGAGTGGGGCTGCAGGT GCAACGGCAGCAGCGAC

[0149] Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:9 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 11 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:9 as follows: about 8 of the 18 phenylalanine codons are TTT, and about 10 of the phenylalanine codons are TTC; about 3 of the 35 leucine codons are TTA, about 4 of the leucine codons are TTG, about 5 of the leucine codons are CTT, about 7 of the leucine codons are CTC, about 2 of the leucine codons are CTA, and about 14 of the leucine codons are CTG; about 10 of the 27 isoleucine codons are ATT, about 13 of the isoleucine codons are ATC, and about 4 of the isoleucine codons are ATA; the 26 methionine codons are ATG; about 4 of the 24 valine codons are GTT, about 6 of the valine codons are GTC, about 3 of the valine codons are GTA, and about 11 of the valine codons are GTG; about 8 of the 43 serine codons are TCT, about 9 of the serine codons are TCC, about 6 of the serine codons are TCA, about 2 of the serine codons are TCG, about 6 of the serine codons are AGT, and about 10 of the serine codons are AGC; about 5 of the 18 proline codons are CCT, about 6 of the proline codons are CCC, about 5 of the proline codons are CCA, and about 2 of the proline codons are CCG; about 7 of the 30 threonine codons are ACT, about 11 of the threonine codons are ACC, about 8 of the threonine codons are ACA, and about 4 of the threonine codons are ACG; about 10 of the 39 alanine codons are GCT, about 16 of the alanine codons are GCC, about 9 of the alanine codons are GCA, and about 4 of the alanine codons are GCG; about 7 of the 15 tyrosine codons are TAT and about 8 of the tyrosine codons are TAC; about 2 of the 6 histidine codons are CAT and about 4 of the histidine codons are CAC; about 5 of the 21 glutamine codons are CAA and about 16 of the glutamine codons are CAG; about 13 of the 28 asparagine codons are AAT and about 15 of the asparagine codons are AAC; about 9 of the 21 lysine codons are AAA and about 12 of the lysine codons are AAG; about 11 of the 23 aspartic acid codons are GAT and about 12 of the aspartic acid codons are GAC; about 16 of the 39 glutamic acid codons are GAA and about 23 of the glutamic acid codons are GAG; about 4 of the 8 cysteine codons are TGT and about 4 of the cysteine codons are TGC; the 7 tryptophan codons are TGG; about 4 of the 51 arginine codons are CGT, about 10 of the arginine codons are CGC, about 6 of the arginine codons are CGA, about 11 of the arginine codons are CGG, about 10 of the arginine codons are AGA, and about 10 of the arginine codons are AGG; and about 7 of the 43 glycine codons are GGT, about 15 of the glycine codons are GGC, about 11 of the glycine codons are GGA, and about 11 of the glycine codons are GGG.

[0150] As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.

[0151] A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:9, optimized according to codon usage in humans is presented herein as SEQ ID NO:34: TABLE-US-00041 ATGGCAAGCCAGGGCACAAAACGCAGTTACGAGCAGATGGAGACTGATGG TGAGAGGCAGAACGCCACCGAAATCCGGGCCTCCGTCGGCAAGATGATTG GTGGCATCGGAAGATTCTATATCCAGATGTGCACGGAGCTTAAGCTGTCC GATTACGAGGGGCGCTTAATACAGAACTCTCTGACTATCGAGCGAATGGT CTTGAGCGCCTTTGATGAGCGGCGTAATAAGTATCTCGAAGAGCACCCTT CTGCTGGAAAAGACCCCAAAAAGACCGGGGGACCTATCTACCGACGTGTG AACGGAAAATGGATGCGCGAACTGATACTGTACGACAAGGAGGAGATCCG TAGGATCTGGAGACAGGCTAATAACGGAGATGATGCCACAGCTGGGCTGA CCCATATGATGATATGGCATAGCAACCTGAACGACGCAACCTATCAACGC ACTAGAGCACTCGTGAGGACCGGTATGGACCCACGCATGTGCTCATTGAT GCAAGGTAGCACATTGCCTCGGAGGTCAGGCGCCGCCGGTGCCGCCGTAA AGGGGGTGGGCACAATGGTGATGGAACTGGTCCGAATGATCAAAAGAGGC ATCAATGACAGGAACTTTTGGCGCGGAGAAAACGGGCGCAAGACCCGCAT TGCCTACGAGCGCATGTGTAACATTTTAAAAGGCAAATTCCAGACTGCAG CCCAGAAAGCAATGATGGACCAAGTTAGAGAAAGTAGAAATCCCGGGAAT GCCGAGTTTGAAGACCTGACTTTCCTGGCTAGAAGCGCCTTGATCCTGCG GGGCTCTGTCGCCCACAAGAGCTGCCTCCCCGCTTGCGTTTACGGCCCCG CGGTCGCAAGTGGCTACGATTTCGAGAGGGAGGGGTATTCCCTAGTTGGG ATCGATCCCTTCCGGCTCCTACAGAATTCTCAGGTGTATAGTCTGATTAG ACCCAACGAAAACCCGGCTCACAAGAGTCAGCTTGTTTGGATGGCATGTC ACTCAGCAGCTTTCGAAGACCTGCGGGTACTCAGCTTTATTAAAGGCACC AAGGTCCTGCCAAGAGGAAAGCTCTCCACGAGGGGAGTACAGATCGCCTC AAACGAGAACATGGAGACAATGGAAAGCTCCACCCTTGAGCTTAGGTCGC GGTATTGGGCTATTAGAACACGATCTGGGGGGAATACCAATCAGCAACGA GCGAGTGCTGGTCAGATTTCCATTCAGCCTACTTTCTCTGTGCAACGGAA TCTACCATTTGACAGGACAACTGTGATGGCAGCGTTCTCCGGCAATACAG AAGGACGAACATCAGACATGAGGACCGAAATTATCCGGATGATGGAGAGC GCTCGGCCAGAAGATGTGTCGTTCCAGGGCCGGGGCGTGTTTGAGCTCAG CGACGAGAAGGCCGCGTCTCCAATTGTGCCTTCCTTTGATATGAGCAATG AGGGGTCATACTTTTTCGGAGACAATGCCGAAGAGTATGATAATATGTCT CTGCTTACCGAGGTGGAAACGCCGATACGCAACGAATGGGGTTGTCGTTG TAACGGCTCCAGTGAT

[0152] In certain embodiments described herein, a codon-optimized coding region encoding SEQ ID NO:16 is optimized according to codon usage in humans (Homo sapiens). Alternatively, a codon-optimized coding region encoding SEQ ID NO:16 may be optimized according to codon usage in any plant, animal, or microbial species. Codon-optimized coding regions encoding SEQ ID NO:16, optimized according to codon usage in humans are designed as follows. The amino acid composition of SEQ ID NO:16 is shown in Table 12.

TABLE-US-00042
TABLE 12
AMINO ACID	Number in SEQ ID NO:16
A Ala	41
R Arg	30
C Cys	5
G Gly	44
H His	4
I Ile	38
L Leu	39
K Lys	52
M Met	27
F Phe	21
P Pro	26
S Ser	40
T Thr	38
W Trp	1
Y Tyr	14
V Val	32
N Asn	25
D Asp	34
Q Gln	19
E Glu	30

[0153] Using the amino acid composition shown in Table 12, a human codon-optimized coding region which encodes SEQ ID NO: 16 can be designed by any of the methods discussed herein. For "uniform" optimization, each amino acid is assigned the most frequent codon used in the human genome for that amino acid. According to this method, codons are assigned to the coding region encoding SEQ ID NO:16 as follows: the 21 phenylalanine codons are TTC, the 39 leucine codons are CTG, the 38 isoleucine codons are ATC, the 27 methionine codons are ATG, the 32 valine codons are GTG, the 40 serine codons are AGC, the 26 proline codons are CCC, the 38 threonine codons are ACC, the 41 alanine codons are GCC, the 14 tyrosine codons are TAC, the 4 histidine codons are CAC, the 19 glutamine codons are CAG, the 25 asparagine codons are AAC, the 52 lysine codons are AAG, the 34 aspartic acid codons are GAC, the 30 glutamic acid codons are GAG, the 1 tryptophan codon is TGG, the 30 arginine codons are CGG, AGA, or AGG (the frequencies of usage of these three codons in the human genome are not significantly different), and the 44 glycine codons are GGC. 
The codon-optimized PA coding region designed by this method is presented herein as SEQ ID NO:37: TABLE-US-00043 ATGAGCAACATGGACATCGACAGGATCAACACCGGCACCATCGACAAGAC CGGCGAGGAGCTGAGGGCCGGCACCAGCGGGGCCAGCCGGCCGATCATCA AGCCGGGCAGCGTGGCCCCCCCGAGCAACAAGCGGACCCGGAACCCCAGC CCCGAGCGGACCAGGAGCAGGAGCGAGACCGAGATCGGCCGGAAGATCCA GAAGAAGGAGAGGCCCACCGAGATGAAGAAGAGCGTGTACAAGATGGTGG TGAAGCTGGGCGAGTTGTACAACCAGATGATGGTGAAGGCCGGGCTGAAC GACGACATGGAGCGGAAGCTGATGGAGAACGCCGAGGCGGTGGAGGGGAT GCTGCTGGCCGGCAGCGACGACAAGAAGAGCGAGTACCAGAAGAAGGGGA ACGCCGGGGACGTGAAGGAGGGCAAGGAGGAGATCGACGACAACAAGAGC GGCGGCACCTTCTACAAGATGGTGCGGGACGACAAGACCATGTAGTTGAG CCCCATGAAGATCACCTTCGTGAAGGAGGAGGTGAAGACCATGTACAAGA CCACGATGGGCAGCGACGGCTTCAGCGGCCTGAACCACATCATGATCGGC CACAGCCAGATGAACGAGGTGTGCTTCGAGCGGAGCAAGGGGGTGAAGCG GGTGGGCCTGGACCCCAGCCTGATCAGCAGCTTCGCCGGCAGCACCGTGC CCCGGCGGAGCGGCACCACCGGCGTGGCCATCAAGGGCGGCGGCACGCTG GTGGACGAGGGCATCCGGTTCATCGGCCGGGCCATGGCCGACGGGGGCCT GCTGGGGGACATCAAGGCCAAGACCGCCTACGAGAAGATCCTGCTGAACC TGAAGAACAAGTGCAGCGCCCCCCAGCAGAAGGCCCTGGTGGACCAGGTG ATCGGCAGCCGGAAGCCCGGCATCGCCGACATCGAGGACCTGACCCTGCT GGCCCGGAGCATGGTGGTGGTGCGGCCCAGCGTGGCCAGCAAGGTGGTGC TGCCCATCAGCATCTACGCCAAGATGCCCCAGCTGGGCTTCAACACCGAG GAGTACAGCATGGTGGGCTACGAGGCCATGGCCCTGTACAACATGGCCAC CCCCGTGAGCATCCTGCGGATGGGCGACGACGCCAAGGACAAGAGCCAGC TGTTCTTCATGAGCTGCTTCGGCGCCGCCTACGAGGACCTGCGGGTGCTG AGCGCCCTGACCGGCACCGAGTTCAAGCCCCGGAGCGCCCTGAAGTGCAA GGGCTTCCACGTGCCCGCCAAGGAGCAGGTGGAGGGCATGGGCGCCGCCC TGATGAGCATCAAGCTGCAGTTCTGGGCCCCCATGACCCGGAGCGGCGGG AAGGAGGTGAGCGGCGAGGGCGGGAGCGGCCAGATCAGCTGCAGCCCCGT GTTGGCCGTGGAGCGGCCCATCGCCCTGAGCAAGCAGGCCGTGCGGCGGA TGCTGAGCATGAACGTGGAGGGCCGGGACGCCGAGGTGAAGGGCAACCTG CTGAAGATGATGAACGACAGCATGGCCAAGAAGACCAGCGGCAACGCCTT CATCGGCAAGAAGATGTTCCAGATCAGCGACAAGAACAAGGTGAACCCCA TCGAGATCCCCATCAAGCAGACCATCCCCAACTTCTTCTTCGGCCGGGAC ACCGCCGAGGACTACGACGACCTGGACTACTGA

[0154] Alternatively, a human codon-optimized coding region which encodes SEQ ID NO:16 can be designed by the "full optimization" method, where each amino acid is assigned codons based on the frequency of usage in the human genome. These frequencies are shown in Table 12 above. Using this latter method, codons are assigned to the coding region encoding SEQ ID NO:16 as follows: about 10 of the 21 phenylalanine codons are TTT, and about 12 of the phenylalanine codons are TTC; about 3 of the 39 leucine codons are TTA, about 5 of the leucine codons are TTG, about 5 of the leucine codons are CTT, about 8 of the leucine codons are CTC, about 3 of the leucine codons are CTA, and about 16 of the leucine codons are CTG; about 14 of the 38 isoleucine codons are ATT, about 18 of the isoleucine codons are ATC, and about 6 of the isoleucine codons are ATA; the 27 methionine codons are ATG; about 6 of the 32 valine codons are GTT, about 8 of the valine codons are GTC, about 4 of the valine codons are GTA, and about 15 of the valine codons are GTG; about 7 of the 40 serine codons are TCT, about 9 of the serine codons are TCC, about 6 of the serine codons are TCA, about 2 of the serine codons are TCG, about 6 of the serine codons are AGT, and about 10 of the serine codons are AGC; about 7 of the 26 proline codons are CCT, about 9 of the proline codons are CCC, about 7 of the proline codons are CCA, and about 3 of the proline codons are CCG; about 9 of the 38 threonine codons are ACT, about 14 of the threonine codons are ACC, about 11 of the threonine codons are ACA, and about 4 of the threonine codons are ACG; about 11 of the 41 alanine codons are GCT, about 17 of the alanine codons are GCC, about 9 of the alanine codons are GCA, and about 4 of the alanine codons are GCG; about 6 of the 14 tyrosine codons are TAT and about 8 of the tyrosine codons are TAC; about 2 of the 4 histidine codons are CAT and about 2 of the histidine codons are CAC; about 5 of the 19 glutamine codons are CAA and about 14 of the glutamine codons are CAG; about 12 of the 25 asparagine codons are AAT and about 13 of the asparagine codons are AAC; about 22 of the 52 lysine codons are AAA and about 30 of the lysine codons are AAG; about 16 of the 34 aspartic acid codons are GAT and about 18 of the aspartic acid codons are GAC; about 12 of the 30 glutamic acid codons are GAA and about 18 of the glutamic acid codons are GAG; about 2 of the 5 cysteine codons are TGT and about 3 of the cysteine codons are TGC; the single tryptophan codon is TGG; about 2 of the 30 arginine codons are CGT, about 6 of the arginine codons are CGC, about 3 of the arginine codons are CGA, about 6 of the arginine codons are CGG, about 6 of the arginine codons are AGA, and about 6 of the arginine codons are AGG; and about 7 of the 44 glycine codons are GGT, about 15 of the glycine codons are GGC, about 11 of the glycine codons are GGA, and about 11 of the glycine codons are GGG.

[0155] As described above, the term "about" means that the number of amino acids encoded by a certain codon may be one more or one less than the number given. It would be understood by those of ordinary skill in the art that the total number of any amino acid in the polypeptide sequence must remain constant; therefore, if there is one "more" of one codon encoding a given amino acid, there would have to be one "less" of another codon encoding that same amino acid.
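
As a non-limiting sketch, the per-codon counts used in "full optimization" can be computed so that they always add up to the number of occurrences of the amino acid, which is the constraint the term "about" accommodates. The Python function below uses largest-remainder rounding, which is one possible choice and not necessarily the method used to prepare the counts above. The function name and the frequency values in the example are illustrative assumptions; the frequencies of Table 12 should be substituted in practice.

```python
from math import floor


def allocate_codon_counts(total: int, frequencies: dict[str, float]) -> dict[str, int]:
    """Split `total` occurrences of one amino acid among its synonymous
    codons in proportion to `frequencies`, keeping the sum equal to `total`."""
    norm = sum(frequencies.values())
    exact = {codon: total * f / norm for codon, f in frequencies.items()}
    counts = {codon: floor(x) for codon, x in exact.items()}
    shortfall = total - sum(counts.values())
    # Give the leftover occurrences to the codons with the largest remainders.
    by_remainder = sorted(exact, key=lambda c: exact[c] - counts[c], reverse=True)
    for codon in by_remainder[:shortfall]:
        counts[codon] += 1
    return counts


# Illustrative (assumed) relative frequencies, not copied from Table 12.
print(allocate_codon_counts(4, {"CAT": 0.41, "CAC": 0.59}))   # {'CAT': 2, 'CAC': 2}
print(allocate_codon_counts(52, {"AAA": 0.42, "AAG": 0.58}))  # {'AAA': 22, 'AAG': 30}
```

With these assumed frequencies the sketch reproduces the histidine and lysine counts recited above; any rounding rule that preserves the total would be equally consistent with the "one more or one less" tolerance.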

[0156] A representative "fully optimized" codon-optimized coding region encoding SEQ ID NO:16, optimized according to codon usage in humans is presented herein as SEQ ID NO:36: TABLE-US-00044 ATGTCGAACATGGACATCGACAGCATTAACACAGGTACTATTGACAAAAC CCCCGAAGAACTAACCCCTGGAACCTCAGGAGCAACACGCCCAATAATCA AACCGGCCACCCTCGCGCCCCCTAGCAATAAGAGGACCCGCAATCCAAGT CCTGAGAGAACCACTACTTCATCTGAAACGGATATCGGTCGGAAAATTCA AAAAAAGCAGACGCCCACAGAGATAAAGAAGTCTGTTTACAAAATGGTGG TAAAGCTCGGTGAGTTTTATAACCAGATGATGGTCAAGGCGGGGCTTAAC GACGATATGGAACGAAATCTTATACAGAATGCACAGGCAGTAGAGAGAAT ACTGCTGGCCGCTACTGATGACAAGAAAACGGAGTACCAAAAAAAACGGA ATGCTCGAGATGTGAAAGAAGGAAAAGAAGAAATTGACCATAACAAAACT GGGGGGACATTCTATAAGATGGTGCGGGACGATAAGACAATCTATTTTAG CCCGATAAAGATTACCTTCCTGAAGGAGGAGGTTAAAACAATGTACAAGA CGACGATGGGCAGCGATGGTTTTCCGGACTTAATCATATAATGATTGGTC ACTCGCAGATGAACGATGTATGTTTCCAGCGCTCCAAGGGCTTAAAGAGG GTAGGTCTTGACCCGTCTCTAATATCAACTTTCGCAGGATCCACTTTGCC GAGGCGTTCTGGCACGACAGGCGTGGCTATCAAGGGCGGGGGGACGCTGG TCGATGAGGCCATTCGCTTTATTGGTAGGGCCATGGCCGATAGAGGGCTT CTACGAGACATCAAAGCAAAAACAGCATATGAGAAGATATTATTAAACTT AAAGAACAAATGCTCCGCTCCTCAGCAAAAAGCGCTCGTTGACCAAGTAA TCGGTTCGAGAAATCCAGGCATTGCCGATATCGAAGATCTTACACTCTTG GCGCGAAGCATGGTCGTTGTCCGTCCCAGTGTCGCTAGTAAGGTGGTACT ACCAATCTCGATTTACGCAAAAATTCCACAACTCGGCTTTAATACAGAGG AATATTCTATGGTAGGTTATGAAGCCATGGCGTTGTATAATATGGCTACA CCAGTCTCCATATTGCGTATGGGAGATGACGCAAAAGATAAGAGTCAACT CTTTTTCATGTCATGTTTCGGCGCAGCGTACGAAGATCTGAGAGTACTAT CCGCCTTGACTGGAACGGAATTTAAACCACGGTCAGCCTTAAAGTGTAAG GGTTTTCACGTCCCTGCTAAGGAGCAAGTTGAGGGAATGGGCGCGGCACT GATGAGTATAAAATTACAATTTTGGGCTCCAATGACGCGTTCGGGAGGGA ATGAAGTTTCTGGTGAGGGAGGGAGTGGACAGATATCATGCTCGCCCGTG TTCGCGGTTGAACGTCCGATTGCTTTGAGTAAGCAGGCGGTTAGGCGGAT GTTAAGTATGAATGTGGAGGGCCGCGATGCCGACGTCAAAGGCAACTTAT TAAAAATGATGAACGACAGCATGGCAAAGAAGACTAGTGGGAATGCTTTT ATAGGGAAAAAAATGTTCCAAATAAGTGACAAAAACAAAGTGAACCCCAT CGAAATACCTATCAAGCAAACCATCCCGAATTTCTTTTTCGGTCGAGACA CCGCGGAGGACTACGATGACCTAGATTACTAA

[0157] Additionally, a minimally codon-optimized nucleotide sequence encoding SEQ ID NO:16 can be designed by changing only certain codons found more frequently in IV genes than in human genes, as shown in Table 7. For example, if it is desired to substitute more frequently used codons in humans for those codons that occur at least 2 times more frequently in IV genes (designated with an asterisk in Table 7), Arg AGA, which occurs 2.3 times more frequently in IV genes than in human genes, is changed to, e.g., CGG; Asn AAT, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., AAC; Ile ATA, which occurs 3.6 times more frequently in IV genes than in human genes, is changed to, e.g., ATC; and Leu CTA, which occurs 2.0 times more frequently in IV genes than in human genes, is changed to, e.g., CTG.

[0158] In another form of minimal optimization, a Codon Usage Table (CUT) for the specific IV sequence in question is generated and compared to the CUT for human genomic DNA (see Table 7, supra). Amino acids are identified for which there is a difference of at least 10 percentage points in codon usage between human and IV DNA (either more or less). Then the wild type IV codon is modified to conform to the predominant human codon for each such amino acid. Furthermore, the remainder of the codons for that amino acid are also modified such that they conform to the predominant human codon for each such amino acid.
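
For illustration, the CUT comparison described above might be sketched in Python as follows. The function name `minimally_optimize` is hypothetical, and the human usage percentages in the example are assumed values for two amino acids only, not the contents of Table 7. Codons of amino acids absent from the supplied table are left unchanged.

```python
from collections import Counter


def minimally_optimize(coding_seq: str, human_cut: dict, threshold: float = 10.0) -> str:
    """Replace all codons of an amino acid with the predominant human codon
    when viral and human usage differ by at least `threshold` percentage points.

    human_cut maps amino acid -> {codon: percent usage among its synonyms}.
    """
    codon_to_aa = {c: aa for aa, usage in human_cut.items() for c in usage}
    codons = [coding_seq[i:i + 3] for i in range(0, len(coding_seq), 3)]

    # Build the codon usage table of the viral sequence itself.
    viral_counts = {aa: Counter() for aa in human_cut}
    for codon in codons:
        if codon in codon_to_aa:
            viral_counts[codon_to_aa[codon]][codon] += 1

    replacement = {}
    for aa, usage in human_cut.items():
        total = sum(viral_counts[aa].values())
        if total == 0:
            continue  # amino acid not present in this sequence
        if any(abs(100.0 * viral_counts[aa][c] / total - pct) >= threshold
               for c, pct in usage.items()):
            replacement[aa] = max(usage, key=usage.get)

    return "".join(replacement.get(codon_to_aa.get(c), c) for c in codons)


# Assumed, illustrative human usage percentages.
HUMAN_CUT = {
    "I": {"ATT": 36.0, "ATC": 48.0, "ATA": 16.0},
    "K": {"AAA": 42.0, "AAG": 58.0},
}
# Ile usage differs strongly (ATA 67% vs 16%); Lys differs by only 8 points.
print(minimally_optimize("ATGATAATAATTAAAAAG", HUMAN_CUT))  # ATGATCATCATCAAAAAG
```

In this toy input every isoleucine codon is converted to ATC, while the lysine codons are retained because their usage falls within the 10-percentage-point threshold.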

[0159] A representative "minimally optimized" codon-optimized coding region encoding SEQ ID NO:16, minimally optimized according to codon usage in humans by this latter method, is presented herein as SEQ ID NO:38: TABLE-US-00045 ATGTCTAACATGGACATCGACTCTATAAACACAGGCACGATCGATAAGAC CCCCGAGGAGCTGACACCCGGGACTTCAGGCGCCACCAGACCCATAATAA AGGCCGCGACTCTGGCCCCCCCCTCTAACAAGAGGAGGAGGAACCCCTCT CCCGAGCGCACCACAACGAGTAGCGAGACGGACATCGGCAGGAAGATACA GAAGAAGCAGACTCCCACTGAGATTAAGAAGTGCGTGTATAAGATGGTGG TTAAGCTGGGCGAGTTTTACAACCAGATGATGGTGAAGGCCGGCCTGAAC GATGACATGGAGAGGAACCTGATACAGAACGCCCAGGCCGTGGAGAGGAT TCTGCTGGCCGCCACCGATGACAAGAAGACTGAGTATCAGAAGAAGAGAA ACGCCCGGGACGTTAAGGAGGGCAAGGAGGAGATCGATCACAACAAGACA GGCGGCACTTTCTATAAGATGGTCCGTGATGACAAGACAATCTACTTTTC TCCCATCAAGATCACATTCCTGAAGGAGGAGGTAAAGACTATGTACAAGA CAACTATGGGCTCCGATGGCTTCAGTGGCCTGAACCACATAATGATAGGC CATAGTCAGATGAACGATGTGTGCTTCCAGAGAAGCAAGGGCCTGAAGAG GGTCGGCCTGGATCCCTCGCTGATTAGTACCTTCGCCGGCAGCACTCTGC CCAGAAGATCTGGCACTACTGGCGTAGCCATAAAGGGCGGCGGCACACTG GTAGACGAGGCCATAAGGTTTATTGGCAGAGCCATGGCCGACCGGGGGGT GCTGAGAGATATGAAGGCCAAGACCGCCTACGAGAAGATACTGCTGAACC TGAAGAACAAGTGCTCAGCCCCCCAGCAGAAGGCCCTGGTGGATCAGGTG ATCGGCAGTAGAAACCCCGGCATCGCCGACATCGAGGATCTGACTCTGCT GGCCAGAAGCATGGTAGTCGTAAGACCCTCTGTGGCCTCTAAGGTTGTGC TGCCCATCTCCATCTACGCCAAGATTCCCCAGCTGGGCTTTAACACTGAG GAGTACTCCATGGTGGGCTATGAGGCCATGGCCCTGTATAACATGGCCAC ACCCGTCTCTATCCTGCGGATGGGCGACGATGCCAAGGACAAGTCTCAGC TGTTTTTTATGAGTTGTTTCGGCGCCGCCTATGAGGATCTGAGAGTGCTG TCAGCCCTGACAGGCACTGAGTTCAAGCGCAGGTCGGCGCTGAAGTGCAA GGGCTTTCATGTGGCGGCCAAGGAGCAGGTGGAGGGCATGGGCGCCGGCC TGATGAGCATCAAGCTGCAGTTCTGGGCCCGCATGACCCGGTCTGGCGGC AAGGAGGTCTCGGGCGAGGGCGGCAGTGGCCAGATAAGTTGGAGCCCCGT TTTTGCCGTTGAGAGACCCATCGCCCTGTCTAAGCAGGCCGTTAGACGAA TGCTGAGTATGAACGTCGAGGGCGGAGAGGCCGATGTGAAGGGCAACCTG CTGAAGATGATGAACGATTCCATGGCCAAGAAGACAAGCGGCAACGGCTT CATTGGCAAGAAGATGTTCCAGATAAGCGATAAGAACAAGGTTAACCCCA TCGAGATTCCCATCAAGCAGACCATCCCCAACTTCTTCTTCGGCAGGGAT ACCGCCGAGGATTAGGATGACCTGGACTACTGA

[0160] Randomly assigning codons at an optimized frequency to encode a given polypeptide sequence using the "full-optimization" or "minimal optimization" methods can be done manually by calculating codon frequencies for each amino acid, and then assigning the codons to the polypeptide sequence randomly. Additionally, various algorithms and computer software programs are readily available to those of ordinary skill in the art. Examples include the "EditSeq" function in the Lasergene Package, available from DNAstar, Inc., Madison, Wis., the backtranslation function in the VectorNTI Suite, available from InforMax, Inc., Bethesda, Md., and the "backtranslate" function in the GCG--Wisconsin Package, available from Accelrys, Inc., San Diego, Calif. In addition, various resources are publicly available to codon-optimize coding region sequences. Examples include the "backtranslation" function found at http://www.entelechon.com/eng/backtranslation.html (visited Jul. 9, 2002), and the "backtranseq" function available at http://bioinfo.pbi.nrc.ca:8090/EMBOSS/index.html (visited Oct. 15, 2002). Constructing a rudimentary algorithm to assign codons based on a given frequency can also easily be accomplished with basic mathematical functions by one of ordinary skill in the art.
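
One possible form of such a rudimentary algorithm is sketched below in Python, solely as an example. It assumes that the number of occurrences of each codon has already been fixed for every amino acid, and then distributes those codons over the matching positions in random order. The function name, its parameters, and the toy input are hypothetical.

```python
import random


def assign_codons_randomly(protein: str, codon_counts: dict, seed=None) -> str:
    """Assign pre-counted codons to the positions of each amino acid at random.

    codon_counts maps amino acid -> {codon: number of times to use it}; the
    counts for an amino acid must sum to its occurrences in `protein`.
    """
    rng = random.Random(seed)
    pools = {}
    for aa, counts in codon_counts.items():
        pool = [codon for codon, n in counts.items() for _ in range(n)]
        if len(pool) != protein.count(aa):
            raise ValueError(f"codon counts for {aa} do not match the sequence")
        rng.shuffle(pool)
        pools[aa] = pool
    # Each position draws the next codon from its amino acid's shuffled pool.
    return "".join(pools[aa].pop() for aa in protein)


toy_counts = {"H": {"CAT": 1, "CAC": 1}, "K": {"AAA": 1, "AAG": 1}}
print(assign_codons_randomly("HKHK", toy_counts, seed=0))
```

Fixing the counts first and shuffling afterwards keeps the overall codon frequencies exact, whereas drawing each codon independently at the target frequency would only match the frequencies on average.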

[0161] A number of options are available for synthesizing codon-optimized coding regions designed by any of the methods described above, using standard and routine molecular biological manipulations well known to those of ordinary skill in the art. In one approach, a series of complementary oligonucleotide pairs of 80-90 nucleotides each in length and spanning the length of the desired sequence are synthesized by standard methods. These oligonucleotide pairs are synthesized such that upon annealing, they form double-stranded fragments of 80-90 base pairs containing cohesive ends, e.g., each oligonucleotide in the pair is synthesized to extend 3, 4, 5, 6, 7, 8, 9, 10, or more bases beyond the region that is complementary to the other oligonucleotide in the pair. The single-stranded ends of each pair of oligonucleotides are designed to anneal with the single-stranded end of another pair of oligonucleotides. The oligonucleotide pairs are allowed to anneal, and approximately five to six of these double-stranded fragments are then allowed to anneal together via the cohesive single-stranded ends; they are then ligated together and cloned into a standard bacterial cloning vector, for example, a TOPO® vector available from Invitrogen Corporation, Carlsbad, Calif. The construct is then sequenced by standard methods. Several of these constructs, each consisting of 5 to 6 fragments of 80 to 90 base pairs ligated together, i.e., fragments of about 500 base pairs, are prepared, such that the entire desired sequence is represented in a series of plasmid constructs. The inserts of these plasmids are then cut with appropriate restriction enzymes and ligated together to form the final construct. The final construct is then cloned into a standard bacterial cloning vector, and sequenced. Additional methods would be immediately apparent to the skilled artisan. In addition, gene synthesis is readily available commercially.
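
The layout of the oligonucleotide pairs in this approach can be illustrated with the following Python sketch, which is a simplification offered under stated assumptions rather than a laboratory protocol. The bottom-strand oligonucleotide of each pair is offset from the top-strand oligonucleotide by the overhang length, so that the single-stranded end of one pair is complementary to the start of the next. The function names, the default length of 85 and overhang of 6, and the toy sequence are hypothetical; handling of the final fragment and of restriction sites is ignored.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def design_oligo_pairs(seq: str, length: int = 85, overhang: int = 6):
    """Split `seq` into (top, bottom) oligonucleotide pairs with cohesive ends.

    Top oligos tile the sense strand; each bottom oligo is the reverse
    complement of the same window shifted downstream by `overhang` bases.
    The last bottom oligo is simply truncated at the end of the sequence.
    """
    pairs = []
    for start in range(0, len(seq), length):
        top = seq[start:start + length]
        bottom = reverse_complement(seq[start + overhang:start + length + overhang])
        pairs.append((top, bottom))
    return pairs


toy_sequence = "ATGAGCAACATGGACATCGAC" * 10  # 210 nt toy input
for top, bottom in design_oligo_pairs(toy_sequence):
    print(len(top), len(bottom))
```

Within each pair the two oligonucleotides are complementary over all but the overhang bases, and the protruding end of one pair anneals to the protruding end of the adjacent pair.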

[0162] The codon-optimized coding regions can be versions encoding any gene products from any strain, derivative, or variant of IV, or fragments, variants, or derivatives of such gene products. Examples include nucleic acid fragments of codon-optimized coding regions encoding the NP, M1 and M2 polypeptides, or fragments, variants or derivatives thereof. Codon-optimized coding regions encoding other IV polypeptides or fragments, variants, or derivatives thereof (e.g., HA, NA, PB1, PB2, PA, NS1 or NS2) are included within the present invention. Additional, non-codon-optimized polynucleotides encoding IV polypeptides or other polypeptides are included as well.

Consensus Sequences

[0163] The present invention is further directed to specific consensus sequences of influenza virus proteins, and fragments, derivatives and variants thereof. A "consensus sequence" is, e.g., an idealized sequence that represents the amino acids most often present at each position of two or more sequences which have been compared to each other. A consensus sequence is a theoretical representative amino acid sequence in which each amino acid is the one which occurs most frequently at that site in the different sequences which occur in nature. The term also refers to an actual sequence which approximates the theoretical consensus. A consensus sequence can be derived from sequences which have, e.g., shared functional or structural purposes. It can be defined by aligning as many known examples of a particular structural or functional domain as possible to maximize the homology. A sequence is generally accepted as a consensus when each particular amino acid is reasonably predominant at its position, and most of the sequences which form the basis of the comparison are related to the consensus by rather few substitutions, e.g., from 0 to about 100 substitutions. In general, the wild-type comparison sequences are at least about 50%, 75%, 80%, 90%, 95%, 96%, 97%, 98% or 99% identical to the consensus sequence. Accordingly, polypeptides of the invention are about 50%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, 99% or 100% identical to the consensus sequence. Consensus amino acid sequences can be prepared for any of the influenza antigens. By analyzing amino acid sequences from influenza A strains sequenced since 1990, consensus amino acid sequences were derived for the influenza A NP (SEQ ID NO: 76), M1 (SEQ ID NO:77) and M2 (SEQ ID NO:78) proteins (Example 3).

[0164] A "consensus amino acid" is an amino acid chosen to occupy a given position in the consensus protein. A system which is organized to select consensus amino acids can be a computer program, or a combination of one or more computer programs with "by hand" analysis and calculation. When a consensus amino acid is obtained for each position of the aligned amino acid sequences, then these consensus amino acids are "lined up" to obtain the amino acid sequence of the consensus protein.
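
As a minimal sketch of such a computer program, the following Python code selects the most frequent amino acid at each position of a set of pre-aligned sequences and computes the percent identity of a sequence to the resulting consensus. It assumes the sequences are already aligned to equal length; the function names and the toy sequences are hypothetical and are not influenza sequences from the Examples.

```python
from collections import Counter


def consensus_sequence(aligned: list[str]) -> str:
    """Return the most frequent residue at each position of an alignment.

    Ties are resolved in favor of the residue seen first in that column.
    """
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    consensus = []
    for column in zip(*aligned):
        residue, _ = Counter(column).most_common(1)[0]
        consensus.append(residue)
    return "".join(consensus)


def percent_identity(seq: str, reference: str) -> float:
    """Percentage of aligned positions at which seq matches reference."""
    matches = sum(a == b for a, b in zip(seq, reference))
    return 100.0 * matches / len(reference)


toy_alignment = ["MSLLTE", "MSLLTE", "MSFLTE", "MSLLPE"]  # hypothetical input
cons = consensus_sequence(toy_alignment)
print(cons)  # MSLLTE
print([round(percent_identity(s, cons), 1) for s in toy_alignment])
```

The "by hand" analysis mentioned above would typically enter where a column has no clearly predominant residue, a case this sketch resolves arbitrarily.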

[0165] Another embodiment of this invention is directed to a process for the preparation of a consensus protein comprising a process to calculate an amino acid residue for nearly all positions of a so-called consensus protein and to synthesize a complete gene from this sequence that could be expressed in a prokaryotic or eukaryotic expression system.

[0166] Polynucleotides which encode the consensus influenza polypeptides, or fragments, variants or derivatives thereof, are also part of this invention. Such polynucleotides can be obtained by known methods, for example by backtranslation of the amino acid sequence and PCR synthesis of the corresponding polynucleotide.

Compositions and Methods

[0167] In certain embodiments, the present invention is directed to compositions and methods of enhancing the immune response of a vertebrate in need of protection against IV infection by administering in vivo, into a tissue of a vertebrate, one or more polynucleotides comprising at least one codon-optimized coding region encoding an IV polypeptide, or a fragment, variant, or derivative thereof. In addition, the present invention is directed to compositions and methods of enhancing the immune response of a vertebrate in need of protection against IV infection by administering to the vertebrate a composition comprising one or more polynucleotides as described herein, and at least one isolated IV polypeptide, or a fragment, variant, or derivative thereof. The polynucleotide may be administered prior to, at the same time as (simultaneously with), or subsequent to the administration of the isolated polypeptide.

[0168] The coding regions encoding IV polypeptides or fragments, variants, or derivatives thereof may be codon optimized for a particular vertebrate. Codon optimization is carried out by the methods described herein. For example, in certain embodiments, codon-optimized coding regions encoding polypeptides of IV, or nucleic acid fragments of such coding regions encoding fragments, variants, or derivatives thereof, are optimized according to the codon usage of the particular vertebrate. The polynucleotides of the invention are incorporated into the cells of the vertebrate in vivo, and an immunologically effective amount of an IV polypeptide or a fragment, variant, or derivative thereof is produced in vivo. The coding regions encoding an IV polypeptide or a fragment, variant, or derivative thereof may be codon optimized for mammals, e.g., humans, apes, monkeys (e.g., owl, squirrel, cebus, rhesus, African green, patas, cynomolgus, and cercopithecus), orangutans, baboons, gibbons, and chimpanzees, dogs, wolves, cats, lions, and tigers, horses, donkeys, zebras, cows, pigs, sheep, deer, giraffes, bears, rabbits, mice, ferrets, seals, whales; birds, e.g., ducks, geese, terns, shearwaters, gulls, turkeys, chickens, quail, pheasants, starlings and budgerigars; or other vertebrates.

[0169] In one embodiment, the present invention relates to codon-optimized coding regions encoding polypeptides of IV, or nucleic acid fragments of such coding regions encoding fragments, variants, or derivatives thereof, which have been optimized according to human codon usage. For example, human codon-optimized coding regions encoding polypeptides of IV, or fragments, variants, or derivatives thereof are prepared by substituting one or more codons preferred for use in human genes for the codons naturally used in the DNA sequence encoding the IV polypeptide or a fragment, variant, or derivative thereof. Also provided are polynucleotides, vectors, and other expression constructs comprising codon-optimized coding regions encoding polypeptides of IV, or nucleic acid fragments of such coding regions encoding fragments, variants, or derivatives thereof, pharmaceutical compositions comprising polynucleotides, vectors, and other expression constructs comprising codon-optimized coding regions encoding polypeptides of IV, or nucleic acid fragments of such coding regions encoding fragments, variants, or derivatives thereof, and various methods of using such polynucleotides, vectors and other expression constructs. Coding regions encoding IV polypeptides can be uniformly optimized, fully optimized, minimally optimized, codon-optimized by region and/or not codon-optimized, as described herein.

[0170] The present invention is further directed towards polynucleotides comprising codon-optimized coding regions encoding polypeptides of IV antigens, for example, HA, NA, NP, M1 and M2, optionally in conjunction with other antigens. The invention is also directed to polynucleotides comprising codon-optimized nucleic acid fragments encoding fragments, variants and derivatives of these polypeptides, e.g., an eM2 or a fusion of NP and eM2.

[0171] In certain embodiments, the present invention provides an isolated polynucleotide comprising a nucleic acid fragment, where the nucleic acid fragment is a fragment of a codon-optimized coding region encoding a polypeptide at least 60%, 65%, 70%, 75%, 80%, 85%, 86%, 87%, 88%, 89%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identical to an IV polypeptide, e.g., HA, NA, NP, M1 or M2, and where the nucleic acid fragment is a variant of a codon-optimized coding region encoding an IV polypeptide, e.g., HA, NA, NP, M1 or M2. The codon-optimized coding region can be optimized for any vertebrate species and by any of the methods described herein.

Isolated IV Polypeptides

[0172] The present invention is further drawn to compositions which include at least one polynucleotide comprising one or more nucleic acid fragments, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide or fragment, variant, or derivative thereof; together with one or more isolated IV components or isolated polypeptides. The IV component may be inactivated virus, attenuated virus, a viral vector expressing an isolated influenza virus polypeptide, or an influenza virus protein, fragment, variant or derivative thereof.

[0173] The polypeptides or fragments, variants or derivatives thereof, in combination with the codon-optimized nucleic acid compositions may be referred to as "combinatorial polynucleotide vaccine compositions" or "single formulation heterologous prime-boost vaccine compositions."

[0174] The isolated IV polypeptides of the invention may be in any form, and are generated using techniques well known in the art. Examples include isolated IV proteins produced recombinantly, isolated IV proteins directly purified from their natural milieu, recombinant (non-IV) virus vectors expressing an isolated IV protein, or proteins delivered in the form of an inactivated IV vaccine, such as conventional vaccines.

[0175] When utilized, an isolated IV polypeptide or fragment, variant or derivative thereof is administered in an immunologically effective amount. Conventional IV vaccines have been standardized to micrograms of viral antigens HA and NA. See Subbarao, K., Advances in Viral Research 54:349-373 (1999), incorporated herein by reference in its entirety. The recommended dose for these vaccines is 15 µg of each HA per 0.5 ml. Id. The effective amount of conventional IV vaccines is determinable by one of ordinary skill in the art based upon several factors, including the antigen being expressed, the age and weight of the subject, the precise condition requiring treatment and its severity, and the route of administration.

[0176] In the instant invention, the combination of conventional antigen vaccine compositions with the codon-optimized nucleic acid compositions provides for therapeutically beneficial effects at dose sparing concentrations. For example, immunological responses sufficient for a therapeutically beneficial effect in patients, as predetermined for an approved commercial product such as the conventional product described above, can be attained by using less of the approved commercial product when it is supplemented or enhanced with the appropriate amount of codon-optimized nucleic acid. Thus, dose sparing is contemplated by administering conventional IV vaccines in combination with the codon-optimized nucleic acids of the invention.

[0177] In particular, the dose of conventional vaccine may be reduced by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60% or at least 70% when administered in combination with the codon-optimized nucleic acid compositions of the invention.

[0178] Similarly, a desirable level of an immunological response afforded by a DNA based pharmaceutical alone may be attained with less DNA by including an aliquot of a conventional vaccine. Further, using a combination of conventional and DNA based pharmaceuticals may allow both materials to be used in lesser amounts while still affording the desired level of immune response arising from administration of either component alone in higher amounts (e.g., one may use less of either immunological product when they are used in combination). This may be manifest not only by using lower amounts of materials being delivered at any time, but also by reducing the number of administration points in a vaccination regime (e.g., 2 versus 3 or 4 injections), and/or by shortening the kinetics of the immunological response (e.g., desired response levels are attained in 3 weeks instead of 6 after immunization).

[0179] In particular, the dose of DNA based pharmaceuticals may be reduced by at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60% or at least 70% when administered in combination with conventional IV vaccines.

[0180] Determining the precise amounts of DNA based pharmaceutical and conventional antigen is based on a number of factors as described above, and is readily determined by one of ordinary skill in the art.

[0181] In addition to dose sparing, the claimed combinatorial compositions provide for a broadening of the immune response and/or enhanced beneficial immune responses. Such broadened or enhanced immune responses are achieved by: adding DNA to enhance cellular responses to a conventional vaccine; adding a conventional vaccine to a DNA pharmaceutical to enhance humoral response; using a combination that induces additional epitopes (both humoral and/or cellular) to be recognized and/or more desirably responded to (epitope broadening); employing a DNA-conventional vaccine combination designed for a particular desired spectrum of immunological responses; or obtaining a desirable spectrum by using higher amounts of either component. The broadened immune response is measurable by one of ordinary skill in the art by standard immunological assays specific for the desirable response spectrum.

[0182] Both broadening and dose sparing can be obtained simultaneously.

[0183] The isolated IV polypeptide or fragment, variant, or derivative thereof to be delivered (either a recombinant protein, a purified subunit, or viral vector expressing an isolated IV polypeptide, or in the form of an inactivated IV vaccine) can be any isolated IV polypeptide or fragment, variant, or derivative thereof, including but not limited to the HA, NA, NP, M1, or M2 proteins or fragments, variants or derivatives thereof. Fragments include, but are not limited to, the eM2 protein. In certain embodiments, a derivative protein can be a fusion protein, e.g., NP-eM2. It should be noted that any isolated IV polypeptide or fragment, variant, or derivative thereof described herein can be combined in a composition with any polynucleotide comprising a nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide or fragment, variant, or derivative thereof. The proteins can be different, the same, or can be combined in any combination of one or more isolated IV proteins and one or more polynucleotides.

[0184] In certain embodiments, the isolated IV polypeptides, or fragments, derivatives or variants thereof can be fused to or conjugated to a second isolated IV polypeptide, or fragment, derivative or variant thereof, or can be fused to other heterologous proteins, including for example, hepatitis B proteins including, but not limited to the hepatitis B core antigen (HBcAg), or those derived from diphtheria or tetanus. The second isolated IV polypeptide or other heterologous protein can act as a "carrier" that potentiates the immunogenicity of the IV polypeptide or a fragment, variant, or derivative thereof to which it is attached. Hepatitis B virus proteins and fragments and variants thereof useful as carriers within the scope of the invention are disclosed in U.S. Pat. Nos. 6,231,864 and 5,143,726, which are incorporated by reference in their entireties. Polynucleotides comprising coding regions encoding said fused or conjugated proteins are also within the scope of the invention.

[0185] The use of recombinant particles comprising hepatitis B core antigen ("HBcAg") and heterologous protein sequences as potent immunogenic moieties is well documented. For example, addition of heterologous sequences to the amino terminus of a recombinant HBcAg results in the spontaneous assembly of particulate structures which express the heterologous epitope on their surface, and which are highly immunogenic when inoculated into experimental animals. See Clarke et al., Nature 330:381-384 (1987). Heterologous epitopes can also be inserted into HBcAg particles by replacing approximately 40 amino acids of the carboxy terminus of the protein with the heterologous sequences. These recombinant HBcAg proteins also spontaneously form immunogenic particles. See Stahl and Murray, Proc. Natl. Acad. Sci. USA, 86:6283-6287 (1989). Additionally, chimeric HBcAg particles may be constructed where the heterologous epitope is inserted in or replaces all or part of the sequence of amino acid residues in a more central region of the HBcAg protein, in an immunodominant loop, thereby allowing the heterologous epitope to be displayed on the surface of the resulting particles. See EP Patent No. 0421635 B1. Shown below are the DNA and amino acid sequences of the human hepatitis B core protein (HBc), subtype ayw (SEQ ID NOs 39 and 40), as described in Galibert, F., et al., Nature 281:646-650 (1979); see also U.S. Pat. Nos. 4,818,527, 4,882,145 and 5,143,726. All of the above references are incorporated herein by reference in their entireties. 
The nucleotide and amino acid sequences are presented herein as SEQ ID NO 39: TABLE-US-00046 ATGGACATCGACCCTTATAAAGAATTTGGAGCTACTGTGGAGTTACTCTC GTTTTTGCCTTCTGACTTCTTTCCTTCAGTACGAGATCTTCTAGATACCG CCTCAGCTCTGTATCGGGAAGCCTTAGAGTCTCCTGAGCATTGTTCACCT CACCATACTGCACTCAGGCAAGCAATTCTTTGCTGGGGGGAACTAATGAC TCTAGCTACCTGGGTGGGTGTTAATTTGGAAGATCCAGCGTCTAGAGACC TAGTAGTCAGTTATGTCAACACTAATATGGGCCTAAAGTTCAGGCAACTC TTGTGGTTTCACATTTCTTGTCTCACTTTTGGAAGAGAAACAGTTATAGA GTATTTGGTGTCTTTCGGAGTGTGGATTCGCACTCCTCCAGCTTATAGAC CACCAAATGCCCCTATCCTATCAACACTTCCGGAGACTACTGTTGTTAGA CGACGAGGCAGGTCCCCTAGAAGAAGAACTCCCTCGCCTCGCAGACGAAG GTCTCAATCGCCGCGTCGCAGAAGATCTCAATCTCGGGAATCTCAATG TTAG

[0186] and SEQ ID NO:40: TABLE-US-00047 MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSP HHTALRQAILCWGELMTLATWVGVNLEDPASRDLVVSYVNTNMGLKFRQL LWFHISCLTFGRETVIEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVR RRGRSPRRRTPSPRRRRSQSPRRRRSQSRESQC

[0187] A completely synthetic HBcAg has been synthesized as well. See Nassal, M. Gene 66:279-294 (1988). The nucleotide and amino acid sequences are presented herein as SEQ ID NO 41: TABLE-US-00048 ATGGATATCGATCCTTATAAAGAATTCGGAGCTACTGGGAGTTACTCTCG TTTCTCCCGAGTGACTTCTTTCCTTCAGTACGAGATCTTCTGGATACCGC CAGCGCGCTGTATCGGGAAGCCTTGGAGTCTCCTGAGCACTGCAGCCCTC ACCATACTGCCCTCAGGCAAGCAATTCTTTGCTGGGGGGAGCTCATGACT CTGGCCACGTGGGTGGGTGTTAACTTGGAAGATCCAGCTAGCAGGGACCT GGTAGTCAGTTATGTCAACACTAATATGGGTTTAAAGTTCAGGCAACTCT TGTGGTTTCACATTAGCTGCCTCACTTTCGGCCGAGAAACAGTTCTAGAA TATTTGGTGTCTTTCGGAGTGTGGATCCGCACTCCTCCAGCTTATAGGCC TCCGAATGCCCCTATCCTGTCGACACTCCCGGAGACTACTGTTGTTAGAC GTCGAGGCAGGTCACCTAGAAGAAGAACTCCTTCGCCTCGCAGGCGAAGG TCTCAATCGCCGCGGCGCCGAAGATCTCAATCTCGGGAATCTCAATGTTA GTGA

[0188] and SEQ ID NO:42: TABLE-US-00049 MDIDPYKEFGATVELLSFLPSDFFPSVRDLLDTASALYREALESPEHCSP HHTALRQAILCWGELMTLATWVGVNLEDPASRDLVVSYVNTNMGLKFRQL LWFHISCLTFGRETVLEYLVSFGVWIRTPPAYRPPNAPILSTLPETTVVR RRGRSPRRRTPSPRRRRSQSPRRRRSQSRESQC

[0189] Chimeric HBcAg particles comprising isolated IV proteins or variants, fragments or derivatives thereof are prepared by recombinant techniques well known to those of ordinary skill in the art. A polynucleotide, e.g., a plasmid, which carries the coding region for the HBcAg operably associated with a promoter is constructed. Convenient restriction sites are engineered into the coding region encoding the N-terminal, central, and/or C-terminal portions of the HBcAg, such that heterologous sequences may be inserted. A construct which expresses a HBcAg/IV fusion protein is prepared by inserting a DNA sequence encoding an IV protein or variant, fragment or derivative thereof, in frame, into a desired restriction site in the coding region of the HBcAg. The resulting construct is then inserted into a suitable host cell, e.g., E. coli, under conditions where the chimeric HBcAg will be expressed. The chimeric HBcAg self-assembles into particles when expressed, and can then be isolated, e.g., by ultracentrifugation. The particles formed resemble the natural 27 nm HBcAg particles isolated from a hepatitis B virus, except that an isolated IV protein or fragment, variant, or derivative thereof is contained in the particle, preferably exposed on the outer particle surface.

[0190] The IV protein or fragment, variant, or derivative thereof expressed in a chimeric HBcAg particle may be of any size which allows suitable particles of the chimeric HBcAg to self-assemble. As discussed above, even small antigenic epitopes may be immunogenic when expressed in the context of an immunogenic carrier, e.g., an HBcAg. Thus, HBcAg particles of the invention may comprise at least 4, at least 5, at least 6, at least 7, at least 8, at least 9, at least 10, at least 15, at least 20, at least 25, or between about 15 and about 30 amino acids of an IV protein fragment of interest inserted therein. HBcAg particles of the invention may further comprise immunogenic or antigenic epitopes of at least 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 amino acid residues of an IV protein fragment of interest inserted therein.

[0191] The immunodominant loop region of HBcAg was mapped to about amino acid residues 75 to 83, to about amino acids 75 to 85 or to about amino acids 130 to 140. See Colucci et al., J. Immunol. 141:4376-4380 (1988), and Salfeld et al., J. Virol. 63:798 (1989), which are incorporated by reference. A chimeric HBcAg is still often able to form core particles when foreign epitopes are cloned into the immunodominant loop. Thus, for example, amino acids of the IV protein fragment may be inserted into the sequence of HBcAg amino acids at various positions, for example, at the N-terminus, from about amino acid 75 to about amino acid 85, from about amino acid 75 to about amino acid 83, from about amino acid 130 to about amino acid 140, or at the C-terminus. Where amino acids of the IV protein fragment replace all or part of the native core protein sequence, the inserted IV sequence is generally not shorter, but may be longer, than the HBcAg sequence it replaces.
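Replacement of a stretch of the core protein, such as residues 75 to 85, by an IV fragment can be sketched at the amino acid level as follows. This is an illustration under stated assumptions: the function name `replace_region`, the `allow_shorter` flag, and the placeholder strings are hypothetical, and positions are treated as 1-based and inclusive. By default the sketch enforces the general guideline above that the inserted sequence is not shorter than the sequence it replaces.

```python
def replace_region(carrier: str, epitope: str, start: int, end: int,
                   allow_shorter: bool = False) -> str:
    """Replace residues start..end (1-based, inclusive) of carrier with epitope."""
    if not 1 <= start <= end <= len(carrier):
        raise ValueError("start and end must lie within the carrier sequence")
    replaced_length = end - start + 1
    if len(epitope) < replaced_length and not allow_shorter:
        raise ValueError("epitope is shorter than the region it replaces")
    return carrier[:start - 1] + epitope + carrier[end:]


# Small example: residues 4 to 6 ("DEF") are replaced by a longer insert.
print(replace_region("ABCDEFGHIJ", "XXXX", 4, 6))  # ABCXXXXGHIJ

# Placeholder core protein of 183 residues (the length of SEQ ID NO:42 as
# printed) with a hypothetical 15-residue epitope replacing residues 75 to 85.
placeholder_core = "G" * 183
chimeric = replace_region(placeholder_core, "X" * 15, 75, 85)
print(len(chimeric))  # 187
```

The same function covers the other mapped regions by changing `start` and `end`, for example 75 and 83 or 130 and 140.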

[0192] Alternatively, if particle formation is not desired, full-length IV coding sequences can be fused to the coding region for the HBcAg. The HBcAg sequences can be fused either at the N- or C-terminus of any of the Influenza antigens described herein, including the eM2-NP constructs. Fusions could include flexible protein linkers as described for NP-eM2 fusions above. Examples of IV coding sequences fused to the HBcAg coding sequence of SEQ ID NO:41 include an IAV NP-HBcAg fusion (SEQ ID NO:43), TABLE-US-00050 ATGGCGTCTCAAGGCACCAAACGATCTTACGAACAGATGGAGACTGATG GAGAACGCCAGAATGCCACTGAAATCAGAGCATCCGTCGGAAAAATGAT TGGTGGAATTGGACGATTCTACATCCAAATGTGCACCGAACTCAAACTCA GTGATTATGAGGGACGGTTGATCCAAAACAGCTTAACAATAGAGAGAAT GGTGCTCTCTGCTTTTGACGAAAGGAGAAATAAATACCTTGAAGAACATC CCAGTGCGGGGAAAGATCCTAAGAAAACTGGAGGACCTATATACAGGAG AGTAAACGGAAAGTGGATGAGAGAACTCATCCTTTATGACAAAGAAGAA ATAAGGCGAATCTGGCGCCAAGCTAATAATGGTGACGATGCAACGGCTG GTCTGACTCACATGATGATCTGGGATTCCAATTTGAATGATGCAACTTAT CAGAGGACAAGAGCTCTTGTTCGCACCGGAATGGATCCCAGGATGTGCTC TCTGATGCAAGGTTCAACTCTCCCTAGGAGGTCTGGAGCCGCAGGTGCTG CAGTCAAAGGAGTTTGGAACAATGGTGATGGAATTGGTCAGAATGATCAA ACGTGGGATCAATGATCGGAACTTCTGGAGGGGTGAGAATGGACGAAAAA CAAGAATTGCTTATGAAAGAATGTGCAACATTCTCAAAGGGAAATTTCAA ACTGCTGCACAAAAAGCAATGATGGATCAAGTGAGAGAGAGCCGGAACC CAGGGAATGCTGAGTTTCGAAGATGTCAGTTTCTAGCACGGTCTGCACTC ATATTGAGAGGGTCGGTTGCTCACAAGTCCTGCCTGCCTGCCTGTGTGTA TGGACCTGGCGTAGCCAGTGGGTACGACTTTGAAAGGGAGGGATACTCTC TAGTCGGAATAGACCGTTTCAGACTGCTTCAAAACAGCCAAGTGTACAGC CTAATCAGACCAAATGAGAATCCAGGACACAAGAGTCAACTGGTGTGGA TGGCATGCCATTCTGCCGCATTTGAAGATCTAAGAGTATTAAGCTTCATC AAAGGGACGAAGGTGCTCCCAAGAGGGAAGCTTTCCACTAGAGGAGTTC AAATTGCTTCCAATGAAAATATGGAGACTATGGAATCAAGTACACTTGAA CTGAGAAGCAGGTACTGGGCCATAAGGACCAGAAGTGGAGGAAACACCA ATCAACAGAGGGCATCTGCGGGCCAAATCAGCATACAAGGTACGTTCTCA GTACAGAGAAATCTCCCTTTTGACAGAACAACCGTTATGGCAGCATTCAG TGGGAATACAGAGGGGAGATGGCGTCTCAAGGCACCAAACGATCTACG AACAGATGGAGACTGATGGAGAACGCCAGAATGCCACTGAAATCAGAGG ATCCGTCGGAAAAATGATTGGTGGAATGGACGATTCTACATCCAAATGT 
GCACCGAACTCAAACTCAGTGATTATGAGGGACGGTTGATCCAAAACAG CTTAACAATAGAGAGAATGGTGCTCTCTGCTTTTGACGAAAGGAGAAATA AATACCTTGAAGAACATCCCAGTGCGGGGAAAGATCCTAAGAAAACTGG AGGACCTATATACAGGAGAGTAAACGGAAAGTGGATGAGAGAACTCATC CTTTATGACAAAGAAGAAATAAGGCGAATCTGGCGCCAAGCTAATAATG GTGACGATGCAACGGCTGGTCTGACTCACATGATGATCTGGCATTCCAAT TTGAATGATGCAACTTATCAGAGGACAAGAGCTCTTGTTCGCACCGGAAT GGATCCCAGGATGTGCTCTCTGATGCAAGGTTCAACTCTCCCTAGGAGGT CTGGAGCCGCAGGTGCTGCAGTCAAAGGAGTTGGAACAATGGTGATGGA ATTGGTCAGAATGATCAAACGTGGGATCAATGATCGGAACTTCTGGAGG GGTGAGAATGGACGAAAAACAAGAATTGCTTATGAAAGAATGTGCAACA TTCTCAAAGGGAAATTTCAAACTGGTGCAGAAAAAGGAATGATGGATCA AGTGAGAGAGAGGCGGAAGCCAGGGAATGCTGAGTTCGAAGATCTCACT TTTCTAGCACGGTCTGCACTCATATTGAGAGGGTCGGTTGCTCACAAGTC CTGCCTGCCTGCCTGTGTGTATGGACCTGCCGTAGCCAGTGGGTACGACT TTGAAAGGGAGGGATACTCTCTAGTCGGAATAGACCCTTTCAGACTGCTT CAAAACAGCCAAGTGTACAGCCTAATCAGACCAAATGAGAATCCAGCAC ACAAGAGTCAACTGGTGTGGATGGCATGCGATTCTGCCGCATTTGAAGAT CTAAGAGTATTAAGCTTCATCAAAGGGACGAAGGTGGTCCCAAGAGGGA AGCTTTCCACTAGAGGAGTTCAAATTGCTTCCAATGAAAATATGGAGACT ATGGAATCAAGTACACTTGAACTGAGAAGCAGGTACTGGGCGATAAGGA CCAGAAGTGGAGGAAACACCAATCAACAGAGGGCATCTGCGGGCCAAAT CAGCATACAACCTACGTTCTCAGTACAGAGAAATCTCCCTTTTGACAGAA CAACCGTTATGGCAGCATTCAGTGGGAATACAGAGGGGAGAACATCTGA CATGAGGACCGAAATGATAAGGATGATGGAAAGTGGAAGACCAGAAGAT GTGTCTTCCAGGGGCGGGGAGTCTTCGAGCTCTCGGACGAAAAGGCAGC GAGCCCGATCGTGCCTTCCTTTGACATGAGTAATGAAGGATCTTATTTC TTCGGAGACAATGCAGAGGAATACGATAATATGGATATCGATCCTTATA AAGAATTCGGAGCTACTGTGGAGTTACTCTCGTTTCTCCCGAGTGACTT CTTTCCTTCAGTACGAGATCTTCTGGATACCGGCAGCGCGCTGTATCGG GAAGCCTTGGAGTCTCCTGAGCACTGCAGCCCTGACCATACTGGCCTCA GGGAAGCAATTCTTTGCTGGGGGGAGCTCATGACTCTGGCCACGTGGGT GGGTGTTAACTTGGAAGATGGAGCTAGCAGGGACCTGGTAGTCAGTTAT GTCAACACTAATATGGGTTTAAAGTTCAGGCAACTCTTGTGGTTTCACA TTAGCTGCCTCACTTTCGGCCGAGAAACAGTTCTAGAATATTTGGTGTC TTTCGGAGTGTGGATCCGCACTCCTCCAGCTTATAGGCCTCCGAATGCC CCTATCCTGTCGACAGTCCCGGAGACTACTGTTTGTTAGACGTCGAGGC AGGTCACCTAGAAGAAGAACTCCTTCGCCTCGCAGGCGAAGGTCTCAAT CGCCGCGGCGCCGAAGATCTCAATCTCGGGAATCTCAATGT

[0193] an IBV NP-HBcAg fusion (SEQ ID NO:44), TABLE-US-00051 ATGTCCAACATGGATATTGACAGTATAAATACCGGAACAATGGATAAAA GACCAGAAGAACTGACTGCCGGAACCAGTGGGGCAACCAGACCAATCAT CAAGCCAGCAACCCTTGCTCCGCCAAGCAACAAACGAACCCGAAATCCA TCTCCAGAAAGGACAACCACAAGCAGTGAAACCGATATCGGAAGGAAAA TCCAAAAGAAACAAACCCCAACAGAGATAAAGAAGAGCGTCTACAAAAT GGTGGTAAAACTGGGTGAATTTGTACAACCAGATGATGGTCAAAGGTGGA CTTAATGATGACATGGAAAGGAATCTAATTCAAAATGCACAAGCTGTGG AGAGAATCCTATTGGCTGCAACTGATGACAAGAAAACTGAATACCAAAA GAAAAGGAATGCGAGAGATGTCAAAGAAGGGAAGGAAGAAATAGACCA CAACAAGACAGGAGGCACCTTTTATAAGATGGTAAGAGATGATAAAACC ATCTACTTCAGCCCTATAAAAATTACCTTTTTAAAAGAAGAGGTGAAAAC AATGTACAAGACCACCATGGGGAGTGATGGTTTCAGTGGACTAAATCAC ATTATGATTGGACATTCACAGATGAACGATGTCTGTTTCCAAAGATCAAA GGGACTGAAAAGGGTTGGACTTGACCCTTCATTAATCAGTACTTTTGCCG GAAGGACAGTACCCAGAAGATCAGGTACAACTGGTGTTGCAATCAAAGG AGGTGGAACTTAGTGGATGAAGCCATCCGATTTATAGGAAGAGCAATG GCAGACAGAGGGGTACTGAGAGACATGAAGGCCAAGACGGCCTATGAAA AGATTCTTCTGAATGTGAAAAACAAGTGCTCTGCGGCGCAACAAAAGGCT CTAGTTTGATCAAGTGATCGGAAGTAGGAACCCAGGGATTGCAGACATAG AAGACCTAACTCTGCTTGCCAGAAGCATGGTAGTTGTCAGACCCTCTGTA GCGAGCAAAGTGGTGCTTCCCATAAGGATTTATGCTAAAATACCTCAACT AGGATTCAATACCGAAGAATACTCTATGGTTGGGTATGAAGCCATGGCTC TTTATAATATGGCAACACCTGTTCCATATTAAGAATGGGAGATGACGCA AAAGATAAATCTCAACTATTCTTCATGTCGTGCTTCGGAGCTGCCTATGA AGATCTAAGAGTGTTATCTGCACTAACGGGCACCGAATTTAAGCCTAGAT CAGCACTAAAATGCAAGGGTTTCCATGTCCCGGCTAAGGAGCAAGTAGA AGGAATGGGGGCAGCTCTGATGTCCATCAAGCTTCAGTTCTGGGCCCCAA TGACCAGATCTGGAGGGAATGAAGTAAGTGGAGAAGGAGGGTCTGGTGA AATAAGTTGCAGCCCTGTGTTTGCAGTAGAAAGACCTATTGCTCTAAGGA AGCAAGCTGTAAGAAGAATGCTGTCAATGAACGTTGAAGGACGTGATGC AGATGTCAAAGGAAATCTACTCAAAATGATGAATGATTCAATGGCAAAG AAAACCAGTGGAAATGCTTTCATTGGGAAGAAAATGTTTCAAATATCAGA CAAAAACAAAGTCAATCCCATTGAGATTCCAATTAAGCAGACCATCCCCA ATTTCTTCTTTGGGAGGGACACAGCAGAGGATTATGATGACCTCGATTAT ATGGATATCGATCCTTATAAAGAATTCGGAGCTACTGTGGAGTTACTCTC GTTTCTCCCGAGTGACTTCTTCCTTCAGTACGAGATCTTCTGGATACCGC CAGCGCGCTGTATCGGGAAGCCTTGGAGTCTCCTGAGCACTGCAGCCCTC ACCATACTGCCCTCAGGCAAGCAATTCTTTGCTGGGGGGAGCTCATGACT 
CTGGCCACGTGGGTGGGTGTTAACTTGGAAGATCCAGCTAGCAGGGACCT GGTAGTCAGTTATGTCAACACTAATATGGGTTTAAAGTTCAGGCAACTCT TGTGGTTTGACATTAGCTGCCTCACTTTCGGCCGAGAAACAGTTCTAGAA TATTTGGTGTCTTTCGGAGTGTGGATCCGCACTCCTCCAGCTTATAGGCC TCCGAATGCCCCTATCCTGTCGACACTCCCGGAGACTACTGTTGTTAGAC GTCGAGGCAGGTCACCTAGAAGAAGAACTCCTTCGCCTCGCAGGCGAAGG TCTCAATCGCCGCGGCGCCGAAGATCTCAATCTCGGGAATCTCAATGTT

[0194] or an IAV M1-HBcAg fusion (SEQ ID NO:45), TABLE-US-00052 ATGAGTCTTCTAACCGAGGTCGAAACGTACGTACTCTCTATCATCCCGTC AGGCCCCCTCAAAGCCGAGATCGCACAGAGACTTGAAGATGTCTTTGCAG GGAAGAACACTGATCTTGAGGTTCTCATGGAATGGCTAAAGACAAGACC AATCCTGTCACCTCTGACTAAGGGGATTTTAGGATTTGTGTTCACGCTCA CCGTGCCCAGTGAGCGAGGACTGCAGCGTAGACGCTTTGTCCAAAATGCC CTTAATGGGAACGGGGATCCAAATAACATGGACAAAGCAGTTAAACTGTA TAGGAAGCTCAAGAGGGAGATAACATTCCATGGGGCCAAAGAAATCTCA CTCAGTTATTCTGCTGGTGCACTTGCCAGTTGTATGGGCCTCATATACAA CAGGATGGGGGCTGTGACCACTGAAGTGGCATTTGGCCTGGTATGTGCAA CCTGTGAACAGATTGCTGACTCCCAGCATCGGTCTCATAGGCAAATGGTG ACAACAACCAATCCACTAATCAGACATGAGAACAGAATGGTTTTAGCCAG CACTACAGCTAAGGCTATGGAGCAAATGGCTGGATCGAGTGAGCAAGCA GCAGAGGCCATGGAGGTTGCTAGTCAGGCTAGACAAATGGTGCAAGCGA TGAGAACCATTGGGACTCATCCTAGCTCCAGTGCTGGTCTGAAAAATGAT CTTCTTGAAAATTTGCAGGCCTATCAGAAACGAATGGGGGTGCAGATGCA ACGGTTCAAGATGGATATCGATCCTTATAAAGAATTCGGAGCTACTGTGG AGTTACTCTCGTTTCTCCCGAGTGACTTCTTTCCTTCAGTACGAGATCTT CTGGATACCGCCAGCGCGCTGTATCGGGAAGCCTTGGAGTCTCCTGAGCA CTGCAGCCCTCACCATACTGCCCTCAGGCAAGCAATTCTTTGCTGGGGGG AGCTCATGACTCTGGCCACGTGGGTGGGTGTTAACTTGGAAGATCCAGCT AGCAGGGACCTGGTAGTCAGTTATGTCAACACTAATATGGGTTTAAAGTT CAGGCAACTCTTGTGGTTTCACATTAGCTGCCTCACTTTCGGCCGAGAAA CAGTTCTAGAATATTTGGTGTCTTTCGGAGTGTGGATCCGCACTCCTCCA GCTTATAGGCCTCCGAATGCCCCTATCCTGTCGACACTCCCGGAGACTAC TGTTGTTAGACGTCGAGGCAGGTCACCTAGAAGAAGAACTCCTTCGCCTC GCAGGCGAAGGTCTCAATCGCCGCGGCGCCGAAGATCTCAATCTCGGGAA TCTCAATGT
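In the fusion of SEQ ID NO:45 as printed, the M1 coding sequence is followed directly by the start of the HBcAg coding sequence (ATGGATATCGAT...). Joining two coding regions in this way, optionally through a linker, can be sketched as below. The function name `fuse_cds`, the removal of a terminal stop codon from the upstream partner, and the two-codon linker in the example are assumptions for illustration, not a description of how the listed constructs were actually built.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def fuse_cds(upstream: str, downstream: str, linker: str = "") -> str:
    """Join two coding sequences in frame, with an optional linker between them.

    A terminal stop codon on the upstream partner is removed so that
    translation can continue into the downstream partner.
    """
    up, down, link = upstream.upper(), downstream.upper(), linker.upper()
    for name, seq in (("upstream", up), ("downstream", down), ("linker", link)):
        if len(seq) % 3:
            raise ValueError(f"{name} is not a whole number of codons")
    if up[-3:] in STOP_CODONS:
        up = up[:-3]
    return up + link + down


# Hypothetical toy inputs: the first three codons of SEQ ID NO:45 plus a stop
# codon, the first three codons of SEQ ID NO:41, and a Gly-Gly linker.
print(fuse_cds("ATGAGTCTTTGA", "ATGGATATC", linker="GGAGGC"))
# ATGAGTCTTGGAGGCATGGATATC
```

Swapping the `upstream` and `downstream` arguments gives the opposite orientation, with the HBcAg sequence at the N-terminus of the fusion.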

[0195] These fusion constructs could be codon optimized by any of the methods described.
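The codon-optimization methods referred to are defined elsewhere in the specification and are not reproduced in this section. As a generic illustration only, one simple approach assigns each amino acid the codon used most frequently in the target species. In the sketch below, the function names and the frequency values are hypothetical placeholders chosen for the example; they are not measured codon usage data for any organism.

```python
def most_frequent_codons(usage: dict[str, dict[str, float]]) -> dict[str, str]:
    """Pick, for each amino acid, the codon with the highest frequency."""
    return {aa: max(codons, key=codons.get) for aa, codons in usage.items()}


def back_translate(protein: str, usage: dict[str, dict[str, float]]) -> str:
    """Encode a protein using the most frequent codon for every residue."""
    best = most_frequent_codons(usage)
    missing = sorted(set(protein) - set(best))
    if missing:
        raise ValueError(f"no codon usage supplied for: {', '.join(missing)}")
    return "".join(best[aa] for aa in protein)


# Placeholder frequencies for three amino acids (illustrative values only).
toy_usage = {
    "M": {"ATG": 1.0},
    "D": {"GAT": 0.4, "GAC": 0.6},
    "I": {"ATT": 0.3, "ATC": 0.5, "ATA": 0.2},
}
print(back_translate("MDID", toy_usage))  # ATGGACATCGAC
```

A real application would supply a complete codon usage table for the species of interest in place of `toy_usage`.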

[0196] The chimeric HBcAg can be used in the present invention in conjunction with a polynucleotide comprising a nucleic acid fragment, where each nucleic acid fragment is optionally a fragment of a codon-optimized coding region operably encoding an IV polypeptide, or a fragment, variant, or derivative thereof, as an influenza vaccine for a vertebrate.

Methods and Administration

[0197] The present invention also provides methods for delivering an IV polypeptide or a fragment, variant, or derivative thereof to a human, which comprise administering to a human one or more of the compositions described herein, such that upon administration of compositions such as those described herein, an IV polypeptide or a fragment, variant, or derivative thereof is expressed in human cells in an amount sufficient to generate an immune response to the IV; or which comprise administering the IV polypeptide or a fragment, variant, or derivative thereof itself to the human in an amount sufficient to generate an immune response.

[0198] The present invention further provides methods for delivering an IV polypeptide or a fragment, variant, or derivative thereof to a human, which comprise administering to a vertebrate one or more of the compositions described herein; such that upon administration of compositions such as those described herein, an immune response is generated in the vertebrate.

[0199] The term "vertebrate" is intended to encompass a singular "vertebrate" as well as plural "vertebrates" and comprises mammals and birds, as well as fish, reptiles, and amphibians.

[0200] The term "mammal" is intended to encompass a singular "mammal" and plural "mammals," and includes, but is not limited to, humans; primates such as apes, monkeys (e.g., owl, squirrel, cebus, rhesus, African green, patas, cynomolgus, and cercopithecus), orangutans, baboons, gibbons, and chimpanzees; canids such as dogs and wolves; felids such as cats, lions, and tigers; equines such as horses, donkeys, and zebras; food animals such as cows, pigs, and sheep; ungulates such as deer and giraffes; ursids such as bears; and others such as rabbits, mice, ferrets, seals, and whales. In particular, the mammal can be a human subject, a food animal or a companion animal.

[0201] The term "bird" is intended to encompass a singular "bird" and plural "birds," and includes, but is not limited to feral water birds such as ducks, geese, terns, shearwaters, and gulls; as well as domestic avian species such as turkeys, chickens, quail, pheasants, geese, and ducks. The term "bird" also encompasses passerine birds such as starlings and budgerigars.

[0202] The present invention further provides a method for generating, enhancing or modulating an immune response to an IV comprising administering to a vertebrate one or more of the compositions described herein. In this method, the compositions may include one or more isolated polynucleotides comprising at least one nucleic acid fragment where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide, or a fragment, variant, or derivative thereof. In another embodiment, the compositions may include both a polynucleotide as described above, and also an isolated IV polypeptide, or a fragment, variant, or derivative thereof, wherein the protein is provided as a recombinant protein, in particular, a fusion protein, a purified subunit, a viral vector expressing the protein, or in the form of an inactivated IV vaccine. Thus, the latter compositions include both a polynucleotide encoding an IV polypeptide or a fragment, variant, or derivative thereof and an isolated IV polypeptide or a fragment, variant, or derivative thereof. The IV polypeptide or a fragment, variant, or derivative thereof encoded by the polynucleotide of the compositions need not be the same as the isolated IV polypeptide or a fragment, variant, or derivative thereof of the compositions. Compositions to be used according to this method may be univalent, bivalent, trivalent or multivalent.

[0203] The polynucleotides of the compositions may comprise a fragment of a human (or other vertebrate) codon-optimized coding region encoding a protein of the IV, or a fragment, variant, or derivative thereof. The polynucleotides are incorporated into the cells of the vertebrate in vivo, and an antigenic amount of the IV polypeptide, or fragment, variant, or derivative thereof, is produced in vivo. Upon administration of the composition according to this method, the IV polypeptide or a fragment, variant, or derivative thereof is expressed in the vertebrate in an amount sufficient to elicit an immune response. Such an immune response might be used, for example, to generate antibodies to the IV for use in diagnostic assays or as laboratory reagents, or as therapeutic or preventative vaccines as described herein.

[0204] The present invention further provides a method for generating, enhancing, or modulating a protective and/or therapeutic immune response to IV in a vertebrate, comprising administering to a vertebrate in need of therapeutic and/or preventative immunity one or more of the compositions described herein. In this method, the compositions include one or more polynucleotides comprising at least one nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide, or a fragment, variant, or derivative thereof. In a further embodiment, the composition used in this method includes both an isolated polynucleotide comprising at least one nucleic acid fragment, where the nucleic acid fragment is optionally a fragment of a codon-optimized coding region encoding an IV polypeptide, or a fragment, variant, or derivative thereof; and at least one isolated IV polypeptide, or a fragment, variant, or derivative thereof. Thus, the latter composition includes both an isolated polynucleotide encoding an IV polypeptide or a fragment, variant, or derivative thereof and an isolated IV polypeptide or a fragment, variant, or derivative thereof, for example, a recombinant protein, a purified subunit, viral vector expressing the protein, or an inactivated virus vaccine. Upon administration of the composition according to this method, the IV polypeptide or a fragment, variant, or derivative thereof is expressed in the human in a therapeutically or prophylactically effective amount.

[0205] As used herein, an "immune response" refers to the ability of a vertebrate to elicit an immune reaction to a composition delivered to that vertebrate. Examples of immune responses include an antibody response or a cellular, e.g., cytotoxic T-cell, response. One or more compositions of the present invention may be used to prevent influenza infection in vertebrates, e.g., as a prophylactic vaccine, to establish or enhance immunity to IV in a healthy individual prior to exposure to influenza or contraction of influenza disease, thus preventing the disease or reducing the severity of disease symptoms.

[0206] As mentioned above, compositions of the present invention can be used both to prevent IV infection, and also to therapeutically treat IV infection. In individuals already exposed to influenza, or already suffering from influenza disease, the present invention is used to further stimulate the immune system of the vertebrate, thus reducing or eliminating the symptoms associated with that disease or disorder. As defined herein, "treatment" refers to the use of one or more compositions of the present invention to prevent, cure, retard, or reduce the severity of influenza disease symptoms in a vertebrate, and/or result in no worsening of influenza disease over a specified period of time in a vertebrate which has already been exposed to IV and is thus in need of therapy. The term "prevention" refers to the use of one or more compositions of the present invention to generate immunity in a vertebrate which has not yet been exposed to a particular strain of IV, thereby preventing or reducing disease symptoms if the vertebrate is later exposed to the particular strain of IV. The methods of the present invention therefore may be referred to as therapeutic vaccination or preventative or prophylactic vaccination. It is not required that any composition of the present invention provide total immunity to influenza or totally cure or eliminate all influenza disease symptoms. As used herein, a "vertebrate in need of therapeutic and/or preventative immunity" refers to an individual for whom it is desirable to treat, i.e., to prevent, cure, retard, or reduce the severity of influenza disease symptoms, and/or result in no worsening of influenza disease over a specified period of time.
Vertebrates to treat and/or vaccinate include humans, apes, monkeys (e.g., owl, squirrel, cebus, rhesus, African green, patas, cynomolgus, and cercopithecus), orangutans, baboons, gibbons, and chimpanzees, dogs, wolves, cats, lions, and tigers, horses, donkeys, zebras, cows, pigs, sheep, deer, giraffes, bears, rabbits, mice, ferrets, seals, whales, ducks, geese, terns, shearwaters, gulls, turkeys, chickens, quail, pheasants, geese, starlings and budgerigars.

[0207] One or more compositions of the present invention are utilized in a "prime boost" regimen. An example of a "prime boost" regimen may be found in Yang, Z. et al. J. Virol. 77:799-803 (2002), which is incorporated herein by reference in its entirety. In these embodiments, one or more polynucleotide vaccine compositions of the present invention are delivered to a vertebrate, thereby priming the immune response of the vertebrate to an IV, and then a second immunogenic composition is utilized as a boost vaccination. One or more compositions of the present invention are used to prime immunity, and then a second immunogenic composition, e.g., a recombinant viral vaccine or vaccines, a different polynucleotide vaccine, or one or more purified subunit isolated IV polypeptides or fragments, variants or derivatives thereof is used to boost the anti-IV immune response.

[0208] In one embodiment, a priming composition and a boosting composition are combined in a single composition or single formulation. For example, a single composition may comprise an isolated IV polypeptide or a fragment, variant, or derivative thereof as the priming component and a polynucleotide encoding an influenza protein as the boosting component. In this embodiment, the compositions may be contained in a single vial where the priming component and boosting component are mixed together. In general, because the peak levels of expression of protein from the polynucleotide do not occur until later (e.g., 7-10 days) after administration, the polynucleotide component may provide a boost to the isolated protein component. Compositions comprising both a priming component and a boosting component are referred to herein as "combinatorial vaccine compositions" or "single formulation heterologous prime-boost vaccine compositions." In addition, the priming composition may be administered before the boosting composition, or even after the boosting composition, if the boosting composition is expected to take longer to act.

[0209] In another embodiment, the priming composition may be administered simultaneously with the boosting composition, but in separate formulations where the priming component and the boosting component are separated.

[0210] The terms "priming" or "primary" and "boost" or "boosting" as used herein may refer to the initial and subsequent immunizations, respectively, i.e., in accordance with the definitions these terms normally have in immunology. However, in certain embodiments, e.g., where the priming component and boosting component are in a single formulation, initial and subsequent immunizations may not be necessary as both the "prime" and the "boost" compositions are administered simultaneously.

[0211] In certain embodiments, one or more compositions of the present invention are delivered to a vertebrate by methods described herein, thereby achieving an effective therapeutic and/or an effective preventative immune response. More specifically, the compositions of the present invention may be administered to any tissue of a vertebrate, including, but not limited to, muscle, skin, brain tissue, lung tissue, liver tissue, spleen tissue, bone marrow tissue, thymus tissue, heart tissue, e.g., myocardium, endocardium, and pericardium, lymph tissue, blood tissue, bone tissue, pancreas tissue, kidney tissue, gall bladder tissue, stomach tissue, intestinal tissue, testicular tissue, ovarian tissue, uterine tissue, vaginal tissue, rectal tissue, nervous system tissue, eye tissue, glandular tissue, tongue tissue, and connective tissue, e.g., cartilage.

[0212] Furthermore, the compositions of the present invention may be administered to any internal cavity of a vertebrate, including, but not limited to, the lungs, the mouth, the nasal cavity, the stomach, the peritoneal cavity, the intestine, any heart chamber, veins, arteries, capillaries, lymphatic cavities, the uterine cavity, the vaginal cavity, the rectal cavity, joint cavities, ventricles in brain, spinal canal in spinal cord, the ocular cavities, the lumen of a duct of a salivary gland or a liver. When the compositions of the present invention are administered to the lumen of a duct of a salivary gland or liver, the desired polypeptide is expressed in the salivary gland and the liver such that the polypeptide is delivered into the blood stream of the vertebrate from each of the salivary gland or the liver. Certain modes for administration to secretory organs of a gastrointestinal system using the salivary gland, liver and pancreas to release a desired polypeptide into the bloodstream are disclosed in U.S. Pat. Nos. 5,837,693 and 6,004,944, both of which are incorporated herein by reference in their entireties.

[0213] In certain embodiments, the compositions are administered into embryonated chicken eggs or by intra-muscular injection into the defeathered breast area of chicks as described in Kodihalli S. et al., Vaccine 18:2592-9 (2000), which is incorporated herein by reference in its entirety.

[0214] In certain embodiments, the compositions are administered to muscle, either skeletal muscle or cardiac muscle, or to lung tissue. Specific, but non-limiting modes for administration to lung tissue are disclosed in Wheeler, C. J., et al., Proc. Natl. Acad. Sci. USA 93:11454-11459 (1996), which is incorporated herein by reference in its entirety.

[0215] According to the disclosed methods, compositions of the present invention can be administered by intramuscular (i.m.), subcutaneous (s.c.), or intrapulmonary routes. Other suitable routes of administration include, but are not limited to, intratracheal, transdermal, intraocular, intranasal, inhalation, intracavity, intravenous (i.v.), intraductal (e.g., into the pancreas) and intraparenchymal (i.e., into any tissue) administration. Transdermal delivery includes, but is not limited to, intradermal (e.g., into the dermis or epidermis), transdermal (e.g., percutaneous) and transmucosal administration (i.e., into or through skin or mucosal tissue). Intracavity administration includes, but is not limited to, administration into oral, vaginal, rectal, nasal, peritoneal, or intestinal cavities as well as intrathecal (i.e., into spinal canal), intraventricular (i.e., into the brain ventricles or the heart ventricles), intraatrial (i.e., into the heart atrium) and sub arachnoid (i.e., into the sub arachnoid spaces of the brain) administration.

[0216] Any mode of administration can be used so long as the mode results in the expression of the desired peptide or protein, in the desired tissue, in an amount sufficient to generate an immune response to IV and/or to generate a prophylactically or therapeutically effective immune response to IV in a human in need of such response. Administration means of the present invention include needle injection, catheter infusion, biolistic injectors, particle accelerators (e.g., "gene guns" or pneumatic "needleless" injectors), Med-E-Jet (Vahlsing, H., et al., J. Immunol. Methods 171:11-22 (1994)), Pigjet (Schrijver, R., et al., Vaccine 15:1908-1916 (1997)), Biojector (Davis, H., et al., Vaccine 12:1503-1509 (1994); Gramzinski, R., et al., Mol. Med. 4:109-118 (1998)), AdvantaJet (Linmayer, I., et al., Diabetes Care 9:294-297 (1986)), Medi-jector (Martins, J., and Roedl, E., J. Occup. Med. 21:821-824 (1979)), gelfoam sponge depots, other commercially available depot materials (e.g., hydrogels), osmotic pumps (e.g., Alza minipumps), oral or suppositorial solid (tablet or pill) pharmaceutical formulations, topical skin creams, and decanting, use of polynucleotide coated suture (Qin, Y., et al., Life Sciences 65:2193-2203 (1999)) or topical applications during surgery. Certain modes of administration are intramuscular needle-based injection and pulmonary application via catheter infusion. Energy-assisted plasmid delivery (EAPD) methods may also be employed to administer the compositions of the invention. One such method involves the application of brief electrical pulses to injected tissues, a procedure commonly known as electroporation. See generally Mir, L. M. et al., Proc. Natl. Acad. Sci. USA 96:4262-7 (1999); Hartikka, J. et al., Mol. Ther. 4:407-15 (2001); Mathiesen, I., Gene Ther. 6:508-14 (1999); Rizzuto, G. et al., Hum. Gene Ther. 11:1891-900 (2000). Each of the references cited in this paragraph is incorporated herein by reference in its entirety.

[0217] Determining an effective amount of one or more compositions of the present invention depends upon a number of factors including, for example, the antigen being expressed or administered directly, e.g., HA, NA, NP, M1 or M2, or fragments, e.g., eM2, variants, or derivatives thereof, the age and weight of the subject, the precise condition requiring treatment and its severity, and the route of administration. Based on the above factors, determining the precise amount, number of doses, and timing of doses is within the ordinary skill in the art and will be readily accomplished by the attending physician or veterinarian.

[0218] Compositions of the present invention may include various salts, excipients, delivery vehicles and/or auxiliary agents as are disclosed, e.g., in U.S. patent application Publication No. 2002/0019358, published Feb. 14, 2002, which is incorporated herein by reference in its entirety.

[0219] Furthermore, compositions of the present invention may include one or more transfection facilitating compounds that facilitate delivery of polynucleotides to the interior of a cell, and/or to a desired location within a cell. As used herein, the terms "transfection facilitating compound," "transfection facilitating agent," and "transfection facilitating material" are synonymous, and may be used interchangeably. It should be noted that certain transfection facilitating compounds may also be "adjuvants" as described infra, i.e., in addition to facilitating delivery of polynucleotides to the interior of a cell, the compound acts to alter or increase the immune response to the antigen encoded by that polynucleotide. Examples of the transfection facilitating compounds include, but are not limited to, inorganic materials such as calcium phosphate, alum (aluminum sulfate), and gold particles (e.g., "powder" type delivery vehicles); peptides that are, for example, cationic, intercell targeting (for selective delivery to certain cell types), intracell targeting (for nuclear localization or endosomal escape), and amphipathic (helix forming or pore forming); proteins that are, for example, basic (e.g., positively charged) such as histones, targeting (e.g., asialoprotein), viral (e.g., Sendai virus coat protein), and pore-forming; lipids that are, for example, cationic (e.g., DMRIE, DOSPA, DC-Chol), basic (e.g., steryl amine), neutral (e.g., cholesterol), anionic (e.g., phosphatidyl serine), and zwitterionic (e.g., DOPE, DOPC); and polymers such as dendrimers, star-polymers, "homogenous" poly-amino acids (e.g., poly-lysine, poly-arginine), "heterogeneous" poly-amino acids (e.g., mixtures of lysine & glycine), co-polymers, polyvinylpyrrolidinone (PVP), poloxamers (e.g., CRL 1005) and polyethylene glycol (PEG). A transfection facilitating material can be used alone or in combination with one or more other transfection facilitating materials.
Two or more transfection facilitating materials can be combined by chemical bonding (e.g., covalent and ionic such as in lipidated polylysine, PEGylated polylysine) (Toncheva, et al., Biochim. Biophys. Acta 1380(3):354-368 (1988)), mechanical mixing (e.g., free moving materials in liquid or solid phase such as "polylysine+cationic lipids") (Gao and Huang, Biochemistry 35:1027-1036 (1996); Trubetskoy, et al., Biochim. Biophys. Acta 1131:311-313 (1992)), and aggregation (e.g., co-precipitation, gel forming such as in cationic lipids+poly-lactide, and polylysine+gelatin). Each of the references cited in this paragraph is incorporated herein by reference in its entirety.

[0220] One category of transfection facilitating materials is cationic lipids. Examples of cationic lipids are 5-carboxyspermylglycine dioctadecylamide (DOGS) and dipalmitoyl-phosphatidylethanolamine-5-carboxyspermylamide (DPPES). Cationic cholesterol derivatives are also useful, including {3β-[N-(N',N'-dimethylamino)ethane]-carbamoyl}-cholesterol (DC-Chol). Dimethyldioctadecyl-ammonium bromide (DDAB), N-(3-aminopropyl)-N,N-(bis-(2-tetradecyloxyethyl))-N-methyl-ammonium bromide (PA-DEMO), N-(3-aminopropyl)-N,N-(bis-(2-dodecyloxyethyl))-N-methyl-ammonium bromide (PA-DELO), N,N,N-tris-(2-dodecyloxy)ethyl-N-(3-amino)propyl-ammonium bromide (PA-TELO), and N1-(3-aminopropyl)((2-dodecyloxy)ethyl)-N2-(2-dodecyloxy)ethyl-1-piperazinaminium bromide (GA-LOE-BP) can also be employed in the present invention.

[0221] Non-diether cationic lipids, such as DL-1,2-dioleoyl-3-dimethylaminopropyl-β-hydroxyethylammonium (DORI diester), 1-O-oleyl-2-oleoyl-3-dimethylaminopropyl-β-hydroxyethylammonium (DORI ester/ether), and their salts promote in vivo gene delivery. In some embodiments, cationic lipids comprise groups attached via a heteroatom attached to the quaternary ammonium moiety in the head group. A glycyl spacer can connect the linker to the hydroxyl group.

[0222] Specific, but non-limiting cationic lipids for use in certain embodiments of the present invention include DMRIE ((.+-.)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide), GAP-DMORIE ((.+-.)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(syn-9-tetradeceneyloxy)-1-propanaminium bromide), and GAP-DMRIE ((.+-.)-N-(3-aminopropyl)-N,N-dimethyl-2,3-(bis-dodecyloxy)-1-propanaminium bromide).

[0223] Other specific but non-limiting cationic surfactants for use in certain embodiments of the present invention include Bn-DHRIE, DhxRIE, DhxRIE-OAc, DhxRIE-OBz and Pr-DOctRIE-OAc. These lipids are disclosed in copending U.S. patent application Ser. No. 10/725,015. In another aspect of the present invention, the cationic surfactant is Pr-DOctRIE-OAc.

[0224] Other cationic lipids include (.+-.)-N,N-dimethyl-N-[2-(sperminecarboxamido)ethyl]-2,3-bis(dioleyloxy)-1-propaniminium pentahydrochloride (DOSPA), (.+-.)-N-(2-aminoethyl)-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propaniminium bromide (.beta.-aminoethyl-DMRIE or .beta.AE-DMRIE) (Wheeler, et al., Biochim. Biophys. Acta 1280:1-11 (1996)), and (.+-.)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(dodecyloxy)-1-propaniminium bromide (GAP-DLRIE) (Wheeler, et al., Proc. Natl. Acad. Sci. USA 93:11454-11459 (1996)), which have been developed from DMRIE. Both of the references cited in this paragraph are incorporated herein by reference in their entirety.

[0225] Other examples of DMRIE-derived cationic lipids that are useful for the present invention are (.+-.)-N-(3-aminopropyl)-N,N-dimethyl-2,3-(bis-decyloxy)-1-propanaminium bromide (GAP-DDRIE), (.+-.)-N-(3-aminopropyl)-N,N-dimethyl-2,3-(bis-tetradecyloxy)-1-propanaminium bromide (GAP-DMRIE), (.+-.)-N-((N''-methyl)-N'-ureyl)propyl-N,N-dimethyl-2,3-bis(tetradecyloxy)-1-propanaminium bromide (GMU-DMRIE), (.+-.)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis(dodecyloxy)-1-propanaminium bromide (DLRIE), and (.+-.)-N-(2-hydroxyethyl)-N,N-dimethyl-2,3-bis-([Z]-9-octadecenyloxy)propyl-1-propaniminium bromide (HP-DORIE).

[0226] In the embodiments where the immunogenic composition comprises a cationic lipid, the cationic lipid may be mixed with one or more co-lipids. For purposes of definition, the term "co-lipid" refers to any hydrophobic material which may be combined with the cationic lipid component and includes amphipathic lipids, such as phospholipids, and neutral lipids, such as cholesterol. Cationic lipids and co-lipids may be mixed or combined in a number of ways to produce a variety of non-covalently bonded macroscopic structures, including, for example, liposomes, multilamellar vesicles, unilamellar vesicles, micelles, and simple films. One non-limiting class of co-lipids is the zwitterionic phospholipids, which include the phosphatidylethanolamines and the phosphatidylcholines. Examples of phosphatidylethanolamines include DOPE, DMPE and DPyPE. In certain embodiments, the co-lipid is DPyPE, which comprises two phytanoyl substituents incorporated into the diacylphosphatidylethanolamine skeleton. In other embodiments, the co-lipid is DOPE, CAS name 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine.

[0227] When a composition of the present invention comprises a cationic lipid and co-lipid, the cationic lipid:co-lipid molar ratio may be from about 9:1 to about 1:9, from about 4:1 to about 1:4, from about 2:1 to about 1:2, or about 1:1.

[0228] In order to maximize homogeneity, the cationic lipid and co-lipid components may be dissolved in a solvent such as chloroform, followed by evaporation of the cationic lipid/co-lipid solution under vacuum to dryness as a film on the inner surface of a glass vessel (e.g., a Rotovap round-bottomed flask). Upon suspension in an aqueous solvent, the amphipathic lipid component molecules self-assemble into homogenous lipid vesicles. These lipid vesicles may subsequently be processed to have a selected mean diameter of uniform size prior to complexing with, for example, a codon-optimized polynucleotide of the present invention, according to methods known to those skilled in the art. For example, the sonication of a lipid solution is described in Felgner et al., Proc. Natl. Acad. Sci. USA 84:7413-7417 (1987) and in U.S. Pat. No. 5,264,618, the disclosures of which are incorporated herein by reference.

[0229] In those embodiments where the composition includes a cationic lipid, polynucleotides of the present invention are complexed with lipids by mixing, for example, a plasmid in aqueous solution with a solution of cationic lipid:co-lipid prepared as described herein. The concentration of each of the constituent solutions can be adjusted prior to mixing such that the desired final plasmid/cationic lipid:co-lipid ratio and the desired plasmid final concentration will be obtained upon mixing the two solutions. The cationic lipid:co-lipid mixtures are suitably prepared by hydrating a thin film of the mixed lipid materials in an appropriate volume of aqueous solvent by vortex mixing at ambient temperatures for about 1 minute. The thin films are prepared by admixing chloroform solutions of the individual components to afford a desired molar solute ratio followed by aliquoting the desired volume of the solutions into a suitable container. The solvent is removed by evaporation, first with a stream of dry, inert gas (e.g., argon) followed by high vacuum treatment.
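
By way of illustration only, the concentration adjustment described in the preceding paragraph can be sketched as a short calculation. The sketch below rests on assumptions that are not stated in the specification: the two solutions are mixed in equal volumes (so each stock is prepared at twice its final concentration), the plasmid/cationic lipid ratio is expressed as moles of DNA phosphate per mole of cationic lipid, and an average nucleotide mass of 330 g/mol is used to convert plasmid mass to phosphate molarity. The function name `mixing_plan` and its parameters are hypothetical.

```python
def mixing_plan(final_dna_mg_per_ml, phosphate_to_cationic_ratio,
                cationic_to_colipid=(1, 1), nucleotide_mw=330.0):
    """Return final and 2x stock concentrations for an equal-volume mix.

    Assumptions (not from the specification): equal-volume mixing, a ratio
    defined as DNA phosphate : cationic lipid (mol:mol), and an average
    nucleotide mass of about 330 g/mol.
    """
    if final_dna_mg_per_ml <= 0 or phosphate_to_cationic_ratio <= 0:
        raise ValueError("concentration and ratio must be positive")
    cationic_parts, colipid_parts = cationic_to_colipid
    if cationic_parts <= 0 or colipid_parts < 0:
        raise ValueError("invalid cationic lipid:co-lipid ratio")

    # mg/mL equals g/L; dividing by g/mol gives mol/L, times 1000 gives mM.
    phosphate_mM = final_dna_mg_per_ml / nucleotide_mw * 1000.0
    cationic_mM = phosphate_mM / phosphate_to_cationic_ratio
    colipid_mM = cationic_mM * colipid_parts / cationic_parts

    return {
        "final_cationic_lipid_mM": cationic_mM,
        "final_colipid_mM": colipid_mM,
        # Equal-volume mixing halves every concentration, so stocks are 2x.
        "plasmid_stock_mg_per_ml": 2.0 * final_dna_mg_per_ml,
        "lipid_stock_cationic_mM": 2.0 * cationic_mM,
        "lipid_stock_colipid_mM": 2.0 * colipid_mM,
    }


if __name__ == "__main__":
    # Hypothetical target: 1 mg/mL plasmid, 4:1 phosphate:cationic lipid,
    # 1:1 cationic lipid:co-lipid.
    for name, value in mixing_plan(1.0, 4.0).items():
        print(f"{name}: {value:.3f}")
```

A different mixing volume ratio or a mass-based definition of the plasmid/lipid ratio would change the arithmetic; the sketch is meant only to show how the stock concentrations follow from the chosen final values.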

[0230] Other hydrophobic and amphiphilic additives, such as, for example, sterols, fatty acids, gangliosides, glycolipids, lipopeptides, liposaccharides, neobees, niosomes, prostaglandins and sphingolipids, may also be included in compositions of the present invention. In such compositions, these additives may be included in an amount between about 0.1 mol % and about 99.9 mol % (relative to total lipid), about 1-50 mol %, or about 2-25 mol %.

[0231] Additional embodiments of the present invention are drawn to compositions comprising an auxiliary agent which is administered before, after, or concurrently with the polynucleotide. As used herein, an "auxiliary agent" is a substance included in a composition for its ability to enhance, relative to a composition which is identical except for the inclusion of the auxiliary agent, the entry of polynucleotides into vertebrate cells in vivo, and/or the in vivo expression of polypeptides encoded by such polynucleotides. Certain auxiliary agents may, in addition to enhancing entry of polynucleotides into cells, enhance an immune response to an immunogen encoded by the polynucleotide. Auxiliary agents of the present invention include nonionic, anionic, cationic, or zwitterionic surfactants or detergents, with nonionic surfactants or detergents being preferred, chelators, DNase inhibitors, poloxamers, agents that aggregate or condense nucleic acids, emulsifying or solubilizing agents, wetting agents, gel-forming agents, and buffers.

[0232] Auxiliary agents for use in compositions of the present invention include, but are not limited to, the non-ionic detergents and surfactants IGEPAL CA 630.RTM., NONIDET NP-40, Nonidet.RTM. P40, Tween-20.TM., Tween-80.TM., Pluronic.RTM. F68 (ave. MW: 8400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 80%), Pluronic F77.RTM. (ave. MW: 6600; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 70%), Pluronic P65.RTM. (ave. MW: 3400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 50%), Triton X-100.TM., and Triton X-114.TM.; the anionic detergent sodium dodecyl sulfate (SDS); the sugar stachyose; the condensing agent DMSO; and the chelator/DNAse inhibitor EDTA, CRL 1005 (12 kDa, 5% POE), and BAK (Benzalkonium chloride 50% solution, available from Ruger Chemical Co. Inc.). In certain specific embodiments, the auxiliary agent is DMSO, Nonidet P40, Pluronic F68.RTM. (ave. MW: 8400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 80%), Pluronic F77.RTM. (ave. MW: 6600; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 70%), Pluronic P65.RTM. (ave. MW: 3400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 50%), Pluronic L64.RTM. (ave. MW: 2900; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 40%), or Pluronic F108.RTM. (ave. MW: 14600; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 80%). See, e.g., U.S. patent application Publication No. 2002/0019358, published Feb. 14, 2002, which is incorporated herein by reference in its entirety.

[0233] Certain compositions of the present invention can further include one or more adjuvants administered before, after, or concurrently with the polynucleotide. The term "adjuvant" refers to any material having the ability to (1) alter or increase the immune response to a particular antigen or (2) increase or aid an effect of a pharmacological agent. It should be noted, with respect to polynucleotide vaccines, that an "adjuvant" can be a transfection facilitating material. Similarly, certain "transfection facilitating materials" described supra may also be an "adjuvant." An adjuvant may be used with a composition comprising a polynucleotide of the present invention. In a prime-boost regimen, as described herein, an adjuvant may be used with either the priming immunization, the booster immunization, or both. Suitable adjuvants include, but are not limited to, cytokines and growth factors; bacterial components (e.g., endotoxins, in particular superantigens, exotoxins and cell wall components); aluminum-based salts; calcium-based salts; silica; polynucleotides; toxoids; serum proteins, viruses and virally-derived materials, poisons, venoms, imidazoquinoline compounds, poloxamers, and cationic lipids.

[0234] A great variety of materials have been shown to have adjuvant activity through a variety of mechanisms. Any compound which may increase the expression, antigenicity or immunogenicity of the polypeptide is a potential adjuvant. The present invention provides an assay to screen for improved immune responses to potential adjuvants. Potential adjuvants which may be screened for their ability to enhance the immune response according to the present invention include, but are not limited to: inert carriers, such as alum, bentonite, latex, and acrylic particles; pluronic block polymers, such as TiterMax.RTM. (block copolymer CRL-8941, squalene (a metabolizable oil) and a microparticulate silica stabilizer); depot formers, such as Freund's adjuvant; surface active materials, such as saponin, lysolecithin, retinal, Quil A, liposomes, and pluronic polymer formulations; macrophage stimulators, such as bacterial lipopolysaccharide; alternate pathway complement activators, such as inulin, zymosan, endotoxin, and levamisole; and non-ionic surfactants, such as poloxamers, poly(oxyethylene)-poly(oxypropylene) tri-block copolymers. Also included as adjuvants are transfection-facilitating materials, such as those described above.

[0235] Poloxamers which may be screened for their ability to enhance the immune response according to the present invention include, but are not limited to, commercially available poloxamers such as Pluronic.RTM. surfactants, which are block copolymers of propylene oxide and ethylene oxide in which the propylene oxide block is sandwiched between two ethylene oxide blocks. Examples of Pluronic.RTM. surfactants include Pluronic.RTM. L121 (ave. MW: 4400; approx. MW of hydrophobe, 3600; approx. wt. % of hydrophile, 10%), Pluronic.RTM. L101 (ave. MW: 3800; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 10%), Pluronic.RTM. L81 (ave. MW: 2750; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 10%), Pluronic.RTM. L61 (ave. MW: 2000; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 10%), Pluronic.RTM. L31 (ave. MW: 1100; approx. MW of hydrophobe, 900; approx. wt. % of hydrophile, 10%), Pluronic.RTM. L122 (ave. MW: 5000; approx. MW of hydrophobe, 3600; approx. wt. % of hydrophile, 20%), Pluronic.RTM. L92 (ave. MW: 3650; approx. MW of hydrophobe, 2700; approx. wt. % of hydrophile, 20%), Pluronic.RTM. L72 (ave. MW: 2750; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 20%), Pluronic.RTM. L62 (ave. MW: 2500; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 20%), Pluronic.RTM. L42 (ave. MW: 1630; approx. MW of hydrophobe, 1200; approx. wt. % of hydrophile, 20%), Pluronic.RTM. L63 (ave. MW: 2650; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 30%), Pluronic.RTM. L43 (ave. MW: 1850; approx. MW of hydrophobe, 1200; approx. wt. % of hydrophile, 30%), Pluronic.RTM. L64 (ave. MW: 2900; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 40%), Pluronic.RTM. L44 (ave. MW: 2200; approx. MW of hydrophobe, 1200; approx. wt. % of hydrophile, 40%), Pluronic.RTM. L35 (ave. MW: 1900; approx. MW of hydrophobe, 900; approx. wt. % of hydrophile, 50%), Pluronic.RTM. P123 (ave. MW: 5750; approx. 
MW of hydrophobe, 3600; approx. wt. % of hydrophile, 30%), Pluronic.RTM. P103 (ave. MW: 4950; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 30%), Pluronic.RTM. P104 (ave. MW: 5900; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 40%), Pluronic.RTM. P84 (ave. MW: 4200; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 40%), Pluronic.RTM. P105 (ave. MW: 6500; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 50%), Pluronic.RTM. P85 (ave. MW: 4600; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 50%), Pluronic.RTM. P75 (ave. MW: 4150; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 50%), Pluronic.RTM. P65 (ave. MW: 3400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 50%), Pluronic.RTM. F127 (ave. MW: 12600; approx. MW of hydrophobe, 3600; approx. wt. % of hydrophile, 70%), Pluronic.RTM. F98 (ave. MW: 13000; approx. MW of hydrophobe, 2700; approx. wt. % of hydrophile, 80%), Pluronic.RTM. F87 (ave. MW: 7700; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 70%), Pluronic.RTM. F77 (ave. MW: 6600; approx. MW of hydrophobe, 2100; approx. wt. % of hydrophile, 70%), Pluronic.RTM. F108 (ave. MW: 14600; approx. MW of hydrophobe, 3000; approx. wt. % of hydrophile, 80%), Pluronic.RTM. F88 (ave. MW: 11400; approx. MW of hydrophobe, 2400; approx. wt. % of hydrophile, 80%), Pluronic.RTM. F68 (ave. MW: 8400; approx. MW of hydrophobe, 1800; approx. wt. % of hydrophile, 80%), and Pluronic.RTM. F38 (ave. MW: 4700; approx. MW of hydrophobe, 900; approx. wt. % of hydrophile, 80%).
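
For orientation only, the three figures listed for each poloxamer above can be related by simple arithmetic. If the hydrophile weight percent is taken as a fraction of the total mass (an assumption made here, not a statement of the specification), the total molecular weight is roughly the hydrophobe molecular weight divided by the hydrophobe mass fraction. The helper name `estimate_total_mw` is hypothetical, and the result is only a rough cross-check against the listed average molecular weights, which are themselves approximate.

```python
def estimate_total_mw(hydrophobe_mw, hydrophile_wt_percent):
    """Estimate total MW from hydrophobe MW and hydrophile weight percent.

    Assumes the hydrophile percentage is a fraction of total mass, so the
    hydrophobe accounts for (100 - percent) of the total.
    """
    if not 0 <= hydrophile_wt_percent < 100:
        raise ValueError("hydrophile wt. % must be in [0, 100)")
    return hydrophobe_mw * 100.0 / (100.0 - hydrophile_wt_percent)


if __name__ == "__main__":
    # (grade, listed ave. MW, approx. hydrophobe MW, approx. wt. % hydrophile)
    listed = [
        ("L121", 4400, 3600, 10),
        ("F127", 12600, 3600, 70),
        ("F68", 8400, 1800, 80),
    ]
    for grade, ave_mw, hydrophobe, pct in listed:
        est = estimate_total_mw(hydrophobe, pct)
        print(f"{grade}: listed {ave_mw}, estimated {est:.0f}")
```

The estimates differ from the listed averages by several percent, which is consistent with all three listed figures being rounded approximations.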

[0236] Reverse poloxamers which may be screened for their ability to enhance the immune response according to the present invention include, but are not limited to Pluronic.RTM. R 31R1 (ave. MW: 3250; approx. MW of hydrophobe, 3100; approx. wt. % of hydrophile, 10%), Pluronic.RTM. R 25R1 (ave. MW: 2700; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 10%), Pluronic.RTM. R 17R1 (ave. MW: 1900; approx. MW of hydrophobe, 1700; approx. wt. % of hydrophile, 10%), Pluronic.RTM. R 31R2 (ave. MW: 3300; approx. MW of hydrophobe, 3100; approx. wt. % of hydrophile, 20%), Pluronic.RTM. R 25R2 (ave. MW: 3100; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 20%), Pluronic.RTM. R 17R2 (ave. MW: 2150; approx. MW of hydrophobe, 1700; approx. wt. % of hydrophile, 20%), Pluronic.RTM. R 12R3 (ave. MW: 1800; approx. MW of hydrophobe, 1200; approx. wt. % of hydrophile, 30%), Pluronic.RTM. R 31R4 (ave. MW: 4150; approx. MW of hydrophobe, 3100; approx. wt. % of hydrophile, 40%), Pluronic.RTM. R 25R4 (ave. MW: 3600; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 40%), Pluronic.RTM. R 22R4 (ave. MW: 3350; approx. MW of hydrophobe, 2200; approx. wt. % of hydrophile, 40%), Pluronic.RTM. R 17R4 (ave. MW: 3650; approx. MW of hydrophobe, 1700; approx. wt. % of hydrophile, 40%), Pluronic.RTM. R 25R5 (ave. MW: 4320; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 50%), Pluronic.RTM. R 10R5 (ave. MW: 1950; approx. MW of hydrophobe, 1000; approx. wt. % of hydrophile, 50%), Pluronic.RTM. R 25R8 (ave. MW: 8550; approx. MW of hydrophobe, 2500; approx. wt. % of hydrophile, 80%), Pluronic.RTM. R 17R8 (ave. MW: 7000; approx. MW of hydrophobe, 1700; approx. wt. % of hydrophile, 80%), and Pluronic.RTM. R 10R8 (ave. MW: 4550; approx. MW of hydrophobe, 1000; approx. wt. % of hydrophile, 80%).

[0237] Other commercially available poloxamers which may be screened for their ability to enhance the immune response according to the present invention include compounds that are block copolymers of polyethylene glycol and polypropylene glycol such as Synperonic.RTM. L121 (ave. MW: 4400), Synperonic.RTM. L122 (ave. MW: 5000), Synperonic.RTM. P104 (ave. MW: 5850), Synperonic.RTM. P105 (ave. MW: 6500), Synperonic.RTM. P123 (ave. MW: 5750), Synperonic.RTM. P85 (ave. MW: 4600) and Synperonic.RTM. P94 (ave. MW: 4600), in which L indicates that the surfactants are liquids, P that they are pastes, the first digit is a measure of the molecular weight of the polypropylene portion of the surfactant and the last digit of the number, multiplied by 10, gives the percent ethylene oxide content of the surfactant; and compounds that are nonylphenyl polyethylene glycol such as Synperonic.RTM. NP10 (nonylphenol ethoxylated surfactant--10% solution), Synperonic.RTM. NP30 (condensate of 1 mole of nonylphenol with 30 moles of ethylene oxide) and Synperonic.RTM. NP5 (condensate of 1 mole of nonylphenol with 5.5 moles of ethylene oxide).
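
The grade nomenclature described in the preceding paragraph can be illustrated with a small parser. The sketch below is not part of the disclosed invention. It takes the letter as the physical form and the last digit times 10 as the percent ethylene oxide, as stated above. Two points are assumptions added here: the letter F (used for grades such as F68) is read as a flake or solid form, and the leading digit or digits are multiplied by 300 to approximate the hydrophobe molecular weight, a manufacturer convention that agrees with the approximate hydrophobe values listed for the Pluronic.RTM. grades (e.g., 12 x 300 = 3600 for L121). The name `parse_grade` is hypothetical.

```python
import re

# "L" and "P" are given in the text; "F" (flake/solid) is an assumption.
FORMS = {"L": "liquid", "P": "paste", "F": "flake"}

# Letter, one or two leading digits (hydrophobe code), one final digit.
_GRADE = re.compile(r"([LPF])(\d{1,2})(\d)")


def parse_grade(grade):
    """Split a grade such as 'P104' into form, hydrophobe MW and EO percent."""
    match = _GRADE.fullmatch(grade.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized grade: {grade!r}")
    letter, hydrophobe_code, last_digit = match.groups()
    return {
        "form": FORMS[letter],
        # Assumed convention: leading digits x 300 ~ hydrophobe MW.
        "approx_hydrophobe_mw": int(hydrophobe_code) * 300,
        # From the text: last digit x 10 = percent ethylene oxide.
        "ethylene_oxide_percent": int(last_digit) * 10,
    }


if __name__ == "__main__":
    for grade in ("L121", "P104", "P85", "F68"):
        print(grade, parse_grade(grade))
```

The parser does not cover the reverse (R-series) grades or the nonylphenol ethoxylates, which follow different naming schemes.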

[0238] Other poloxamers which may be screened for their ability to enhance the immune response according to the present invention include: (a) a polyether block copolymer comprising an A-type segment and a B-type segment, wherein the A-type segment comprises a linear polymeric segment of relatively hydrophilic character, the repeating units of which contribute an average Hansch-Leo fragmental constant of about -0.4 or less and have molecular weight contributions between about 30 and about 500, wherein the B-type segment comprises a linear polymeric segment of relatively hydrophobic character, the repeating units of which contribute an average Hansch-Leo fragmental constant of about -0.4 or more and have molecular weight contributions between about 30 and about 500, wherein at least about 80% of the linkages joining the repeating units for each of the polymeric segments comprise an ether linkage; (b) a block copolymer having a polyether segment and a polycation segment, wherein the polyether segment comprises at least an A-type block, and the polycation segment comprises a plurality of cationic repeating units; and (c) a polyether-polycation copolymer comprising a polymer, a polyether segment and a polycationic segment comprising a plurality of cationic repeating units of formula --NH--R.sup.0, wherein R.sup.0 is a straight chain aliphatic group of 2 to 6 carbon atoms, which may be substituted, wherein said polyether segments comprise at least one of an A-type or B-type segment. See U.S. Pat. No. 5,656,611, by Kabanov, et al., which is incorporated herein by reference in its entirety. Other poloxamers of interest include CRL1005 (12 kDa, 5% POE), CRL8300 (11 kDa, 5% POE), CRL2690 (12 kDa, 10% POE), CRL4505 (15 kDa, 5% POE) and CRL1415 (9 kDa, 10% POE).

[0239] Other auxiliary agents which may be screened for their ability to enhance the immune response according to the present invention include, but are not limited to, Acacia (gum arabic); the polyoxyethylene ether R--O--(C.sub.2H.sub.4O).sub.x--H (BRIJ.RTM.), e.g., polyethylene glycol dodecyl ether (BRIJ.RTM. 35, x=23), polyethylene glycol dodecyl ether (BRIJ.RTM. 30, x=4), polyethylene glycol hexadecyl ether (BRIJ.RTM. 52, x=2), polyethylene glycol hexadecyl ether (BRIJ.RTM. 56, x=10), polyethylene glycol hexadecyl ether (BRIJ.RTM. 58P, x=20), polyethylene glycol octadecyl ether (BRIJ.RTM. 72, x=2), polyethylene glycol octadecyl ether (BRIJ.RTM. 76, x=10), polyethylene glycol octadecyl ether (BRIJ.RTM. 78P, x=20), polyethylene glycol oleyl ether (BRIJ.RTM. 92V, x=2), and polyoxyl 10 oleyl ether (BRIJ.RTM. 97, x=10); poly-D-glucosamine (chitosan); chlorbutanol; cholesterol; diethanolamine; digitonin; dimethylsulfoxide (DMSO); ethylenediamine tetraacetic acid (EDTA); glyceryl monostearate; lanolin alcohols; mono- and di-glycerides; monoethanolamine; nonylphenol polyoxyethylene ether (NP-40.RTM.); octylphenoxypolyethoxyethanol (NONIDET NP-40 from Amresco); ethyl phenol poly (ethylene glycol ether).sup.n, n=11 (Nonidet.RTM. P40 from Roche); octyl phenol ethylene oxide condensate with about 9 ethylene oxide units (nonidet P40); IGEPAL CA 630.RTM. ((octyl phenoxy) polyethoxyethanol; structurally same as NONIDET NP-40); oleic acid; oleyl alcohol; polyethylene glycol 8000; polyoxyl 20 cetostearyl ether; polyoxyl 35 castor oil; polyoxyl 40 hydrogenated castor oil; polyoxyl 40 stearate; polyoxyethylene sorbitan monolaurate (polysorbate 20, or TWEEN-20.RTM.); polyoxyethylene sorbitan monooleate (polysorbate 80, or TWEEN-80.RTM.); propylene glycol diacetate; propylene glycol monostearate; protamine sulfate; proteolytic enzymes; sodium dodecyl sulfate (SDS); sodium monolaurate; sodium stearate; sorbitan derivatives (SPAN.RTM.), e.g., sorbitan monopalmitate (SPAN.RTM. 40), sorbitan monostearate (SPAN.RTM. 60), sorbitan tristearate (SPAN.RTM. 65), sorbitan monooleate (SPAN.RTM. 80), and sorbitan trioleate (SPAN.RTM. 85); 2,6,10,15,19,23-hexamethyl-2,6,10,14,18,22-tetracosa-hexaene (squalene); stachyose; stearic acid; sucrose; surfactin (lipopeptide antibiotic from Bacillus subtilis); dodecylpoly(ethyleneglycolether).sub.9 (Thesit.RTM.) MW 582.9; octyl phenol ethylene oxide condensate with about 9-10 ethylene oxide units (Triton X-100.TM.); octyl phenol ethylene oxide condensate with about 7-8 ethylene oxide units (Triton X-114.TM.); tris(2-hydroxyethyl)amine (trolamine); and emulsifying wax.

[0240] In certain adjuvant compositions, the adjuvant is a cytokine. A composition of the present invention can comprise one or more cytokines, chemokines, or compounds that induce the production of cytokines and chemokines, or a polynucleotide encoding one or more cytokines, chemokines, or compounds that induce the production of cytokines and chemokines. Examples include, but are not limited to, granulocyte macrophage colony stimulating factor (GM-CSF), granulocyte colony stimulating factor (G-CSF), macrophage colony stimulating factor (M-CSF), colony stimulating factor (CSF), erythropoietin (EPO), interleukin 2 (IL-2), interleukin-3 (IL-3), interleukin 4 (IL-4), interleukin 5 (IL-5), interleukin 6 (IL-6), interleukin 7 (IL-7), interleukin 8 (IL-8), interleukin 10 (IL-10), interleukin 12 (IL-12), interleukin 15 (IL-15), interleukin 18 (IL-18), interferon alpha (IFN.alpha.), interferon beta (IFN.beta.), interferon gamma (IFN.gamma.), interferon omega (IFN.omega.), interferon tau (IFN.tau.), interferon gamma inducing factor I (IGIF), transforming growth factor beta (TGF-.beta.), RANTES (regulated upon activation, normal T-cell expressed and presumably secreted), macrophage inflammatory proteins (e.g., MIP-1 alpha and MIP-1 beta), Leishmania elongation initiating factor (LEIF), and Flt-3 ligand.

[0241] In certain compositions of the present invention, the polynucleotide construct may be complexed with an adjuvant composition comprising (.+-.)-N-(3-aminopropyl)-N,N-dimethyl-2,3-bis(syn-9-tetradeceneyloxy)-1-propanaminium bromide (GAP-DMORIE). The composition may also comprise one or more co-lipids, e.g., 1,2-dioleoyl-sn-glycero-3-phosphoethanolamine (DOPE), 1,2-diphytanoyl-sn-glycero-3-phosphoethanolamine (DPyPE), and/or 1,2-dimyristoyl-glycero-3-phosphoethanolamine (DMPE). An adjuvant composition comprising GAP-DMORIE and DPyPE at a 1:1 molar ratio is referred to herein as Vaxfectin.TM.. See, e.g., PCT Publication No. WO 00/57917, which is incorporated herein by reference in its entirety.

[0242] In other embodiments, the polynucleotide itself may function as an adjuvant as is the case when the polynucleotides of the invention are derived, in whole or in part, from bacterial DNA. Bacterial DNA containing motifs of unmethylated CpG-dinucleotides (CpG-DNA) triggers innate immune cells in vertebrates through a pattern recognition receptor (including toll receptors such as TLR 9) and thus possesses potent immunostimulatory effects on macrophages, dendritic cells and B-lymphocytes. See, e.g., Wagner, H., Curr. Opin. Microbiol. 5:62-69 (2002); Jung, J. et al., J. Immunol. 169: 2368-73 (2002); see also Klinman, D. M. et al., Proc. Natl Acad. Sci. U.S.A. 93:2879-83 (1996). Methods of using unmethylated CpG-dinucleotides as adjuvants are described in, for example, U.S. Pat. Nos. 6,207,646, 6,406,705 and 6,429,199, the disclosures of which are herein incorporated by reference.

[0243] The ability of an adjuvant to increase the immune response to an antigen is typically manifested by a significant increase in immune-mediated protection. For example, an increase in humoral immunity is typically manifested by a significant increase in the titer of antibodies raised to the antigen, and an increase in T-cell activity is typically manifested in increased cell proliferation, or cellular cytotoxicity, or cytokine secretion. An adjuvant may also alter an immune response, for example, by changing a primarily humoral or Th.sub.2 response into a primarily cellular, or Th.sub.1 response.

[0244] Nucleic acid molecules and/or polynucleotides of the present invention, e.g., plasmid DNA, mRNA, linear DNA or oligonucleotides, may be solubilized in any of various buffers. Suitable buffers include, for example, phosphate buffered saline (PBS), normal saline, Tris buffer, and sodium phosphate (e.g., 150 mM sodium phosphate). Insoluble polynucleotides may be solubilized in a weak acid or weak base, and then diluted to the desired volume with a buffer. The pH of the buffer may be adjusted as appropriate. In addition, a pharmaceutically acceptable additive can be used to provide an appropriate osmolarity. Such additives are within the purview of one skilled in the art. For aqueous compositions used in vivo, sterile pyrogen-free water can be used. Such formulations will contain an effective amount of a polynucleotide together with a suitable amount of an aqueous solution in order to prepare pharmaceutically acceptable compositions suitable for administration to a human.

[0245] Compositions of the present invention can be formulated according to known methods. Suitable preparation methods are described, for example, in Remington's Pharmaceutical Sciences, 16th Edition, A. Osol, ed., Mack Publishing Co., Easton, Pa. (1980), and Remington's Pharmaceutical Sciences, 19th Edition, A. R. Gennaro, ed., Mack Publishing Co., Easton, Pa. (1995), both of which are incorporated herein by reference in their entireties. Although the composition may be administered as an aqueous solution, it can also be formulated as an emulsion, gel, solution, suspension, lyophilized form, or any other form known in the art. In addition, the composition may contain pharmaceutically acceptable additives including, for example, diluents, binders, stabilizers, and preservatives.

[0246] The following examples are included for purposes of illustration only and are not intended to limit the scope of the present invention, which is defined by the appended claims. All references cited in the Examples are incorporated herein by reference in their entireties.

EXAMPLES

Materials and Methods

[0247] The following materials and methods apply generally to all the examples disclosed herein. Specific materials and methods are disclosed in each example, as necessary.

[0248] The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology (including PCR), vaccinology, microbiology, recombinant DNA, and immunology, which are within the skill of the art. Such techniques are explained fully in the literature. See, for example, Molecular Cloning A Laboratory Manual, 2nd Ed., Sambrook et al., ed., Cold Spring Harbor Laboratory Press: (1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985); Oligonucleotide Synthesis (M. J. Gait ed., 1984); Mullis et al. U.S. Pat. No: 4,683,195; Nucleic Acid Hybridization (B. D. Hames & S. J. Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins eds. 1984); Culture Of Animal Cells (R. I. Freshney, Alan R. Liss, Inc., 1987); Immobilized Cells And Enzymes (IRL Press, 1986); B. Perbal, A Practical Guide To Molecular Cloning (1984); the treatise, Methods In Enzymology (Academic Press, Inc., N.Y.); Gene Transfer Vectors For Mammalian Cells (J. H. Miller and M. P. Calos eds., 1987, Cold Spring Harbor Laboratory); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.), Immunochemical Methods In Cell And Molecular Biology (Mayer and Walker, eds., Academic Press, London, 1987); and in Ausubel et al., Current Protocols in Molecular Biology, John Wiley and Sons, Baltimore, Md. (1989). Each of the references cited in this paragraph is incorporated herein by reference in its entirety.

Gene Construction

[0249] Constructs of the present invention are constructed based on the sequence information provided herein or in the art utilizing standard molecular biology techniques, including, but not limited to, the following. First, a series of complementary oligonucleotide pairs of 80-90 nucleotides each in length and spanning the length of the construct are synthesized by standard methods. These oligonucleotide pairs are synthesized such that upon annealing, they form double stranded fragments of 80-90 base pairs, containing cohesive ends. The single-stranded ends of each pair of oligonucleotides are designed to anneal with a single-stranded end of an adjacent oligonucleotide duplex. Several adjacent oligonucleotide pairs prepared in this manner are allowed to anneal, and approximately five to six adjacent oligonucleotide duplex fragments are then allowed to anneal together via the cohesive single stranded ends. This series of annealed oligonucleotide duplex fragments is then ligated together and cloned into a suitable plasmid, such as the TOPO.RTM. vector available from Invitrogen Corporation, Carlsbad, Calif. The construct is then sequenced by standard methods. Constructs prepared in this manner, comprising 5 to 6 adjacent 80 to 90 base pair fragments ligated together, i.e., fragments of about 500 base pairs, are prepared, such that the entire desired sequence of the construct is represented in a series of plasmid constructs. The inserts of these plasmids are then cut with appropriate restriction enzymes and ligated together to form the final construct. The final construct is then cloned into a standard bacterial cloning vector, and sequenced. The oligonucleotides and primers referred to herein can easily be designed by a person of skill in the art based on the sequence information provided herein and in the art, and such can be synthesized by any of a number of commercial nucleotide providers, for example Retrogen, San Diego, Calif., and GENEART, Regensburg, Germany.
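
For illustration only, the division of a target sequence into oligonucleotide pairs with cohesive ends, as described in the preceding paragraph, can be sketched in a few lines. The sketch is one possible layout and not necessarily the one used in practice. The top-strand oligonucleotides tile the sequence, and each bottom-strand oligonucleotide is shifted downstream by a fixed overhang, so that the single-stranded end of one duplex is complementary to that of the next. The overhang length (10 nucleotides here), the blunt outer ends, and the names `revcomp` and `design_duplexes` are assumptions made for this example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def design_duplexes(seq, oligo_len=85, overhang=10):
    """Split seq into (top, bottom) oligo pairs with cohesive ends.

    Both oligos are written 5'->3'. The overhang length is an assumed
    design parameter; the outermost ends are left blunt.
    """
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    if not 0 < overhang < oligo_len:
        raise ValueError("overhang must be between 0 and oligo_len")

    n = len(seq)
    bounds = list(range(0, n, oligo_len)) + [n]
    # Fold a tail too short to carry an overhang into the previous fragment.
    if len(bounds) > 2 and bounds[-1] - bounds[-2] <= overhang:
        del bounds[-2]

    pairs = []
    for i in range(len(bounds) - 1):
        start, end = bounds[i], bounds[i + 1]
        top = seq[start:end]
        # Bottom strand is shifted downstream by `overhang`, except at the
        # two outer ends, which are kept flush with the top strand.
        b_start = 0 if i == 0 else start + overhang
        b_end = n if end == n else end + overhang
        pairs.append((top, revcomp(seq[b_start:b_end])))
    return pairs


if __name__ == "__main__":
    target = ("ACGTTGCA" * 63)[:500]  # placeholder 500-nt sequence
    for top, bottom in design_duplexes(target):
        print(len(top), len(bottom))
```

With a 500-nucleotide target and 85-nucleotide oligonucleotides, the sketch yields six duplexes, in line with the five to six fragments per roughly 500 base pair construct described above. A practical design would also check melting temperatures and the uniqueness of each overhang, which this sketch omits.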

Plasmid Vectors

[0250] Constructs of the present invention can be inserted, for example, into eukaryotic expression vectors VR1012 or VR10551. These vectors are built on a modified pUC18 background (see Yanisch-Perron, C., et al. Gene 33:103-119 (1985)), and contain a kanamycin resistance gene, the human cytomegalovirus immediate early promoter/enhancer and intron A, and the bovine growth hormone transcription termination signal, and a polylinker for inserting foreign genes. See Hartikka, J., et al., Hum. Gene Ther. 7:1205-1217 (1996). However, other standard commercially available eukaryotic expression vectors may be used in the present invention, including, but not limited to: plasmids pcDNA3, pHCMV/Zeo, pCR3.1, pEF1/His, pIND/GS, pRc/HCMV2, pSV40/Zeo2, pTRACER-HCMV, pUB6/V5-His, pVAX1, and pZeoSV2 (available from Invitrogen, San Diego, Calif.), and plasmid pCI (available from Promega, Madison, Wis.).

[0251] An optimized backbone plasmid, termed VR10551, has minor changes from the VR1012 backbone described above. The VR10551 vector is derived from and similar to VR1012 in that it uses the human cytomegalovirus immediate early (hCMV-IE) gene enhancer/promoter and 5' untranslated region (UTR), including the hCMV-IE Intron A. The changes from VR1012 to VR10551 include some modifications to the multiple cloning site, and a modified rabbit β-globin 3' untranslated region/polyadenylation signal sequence/transcriptional terminator has been substituted for the same functional domain derived from the bovine growth hormone gene.

[0252] Additionally, constructs of the present invention can be inserted into other eukaryotic expression vector backbones such as VR10682 or VR10686. The VR10682 expression vector backbone (SEQ ID NO:94) contains a modified Rous sarcoma virus (RSV) promoter from expression plasmid VCL1005, the bovine growth hormone (BGH) poly-adenylation site, a polylinker for inserting foreign genes, and a kanamycin resistance gene. The modified RSV promoter in VCL1005 and VR10682 contains an XbaI endonuclease restriction site near the transcription start site in the sequence TAC TCT AGA CG (SEQ ID NO:82). Expression plasmid VCL1005 is described in U.S. Pat. No. 5,561,064, which is incorporated herein by reference.

[0253] The VR10686 expression vector backbone (SEQ ID NO:112) was created by replacing the West Nile Virus (WNV) antigen insert in VR6430 (SEQ ID NO:89) with the multiple cloning site from the VR1012 vector. The VR10686 and VR6430 expression vector backbones contain the RSV promoter, derived from VCL1005, which has been modified back to the wild-type RSV sequence (TAC AAT AAA CG (SEQ ID NO:83)). The wild-type RSV promoter is fused to the "R" region plus the first 39 nucleotides of the U5 region from Human T-Cell Leukemia Virus I (HTLV-I), hereinafter referred to as the RU5 element. The R and U5 regions are portions of the long terminal repeat (LTR) of HTLV-I, which controls expression of the HTLV-I transcript and is duplicated at either end of the integrated viral genome as a result of the retroviral integration mechanism. The LTRs of HTLV-I and most retroviruses are divided into three regions: U3, R and U5. Transcription from the integrated viral genome commences at the U3-R boundary of the 5' LTR and the transcript is polyadenylated at the R-U5 boundary of the 3' LTR. (See Goff, S. P. Retroviridae, Fields Virology 4th ed. 2:1871-1939 (2001).) This RU5 HTLV-I element has been shown to be a potent stimulator of translation when fused to the SV40 early gene promoter. See Takebe et al., Mol. Cell Biol. 8:466-472 (1988). It has been proposed that the stimulation of translation by the HTLV-I RU5 element is due to its function, in part, as a translational enhancing internal ribosome entry site (IRES). See Attal et al. FEBS Letters 392:220-224 (1996). Additionally, the HTLV-I RU5 element provides the 5'-splice donor site. Immediately downstream of the RU5 element is the 3'-end of the HCMV intron A sequence containing the splice acceptor sequence. The VR10686 and VR6430 expression vectors contain a hybrid intron composed of the 5'-HTLV-I intron sequence fused to the 3'-end of the HCMV intron A, a bovine growth hormone poly-adenylation site, a polylinker for insertion of foreign genes and a kanamycin resistance gene. The VR6430 vector expresses the prM and E West Nile Virus antigens (GenBank Accession No. AF202541).

[0254] The vector backbones described above may be used to create expression vectors which express multiple influenza proteins, fragments, variants or derivatives thereof. An expression vector as described herein may contain an additional promoter. For example, construct VR4774 (described in Example 13) contains a CMV promoter and an RSV promoter. Thus, the vector backbones described herein may contain multiple expression cassettes which comprise a promoter and an influenza coding sequence including, inter alia, polynucleotides as described herein. The expression cassettes may encode the same or different influenza polypeptides. Additionally, the expression cassettes may be in the same or opposite orientation relative to each other. As such, transcription from each cassette may be in the same or opposite direction (i.e., 5' to 3' in both expression cassettes or, alternatively, 5' to 3' in one expression cassette and 3' to 5' in the other expression cassette).

Plasmid DNA Purification

[0255] Plasmid DNA may be transformed into competent cells of an appropriate Escherichia coli strain (including but not limited to the DH5α strain) and highly purified covalently closed circular plasmid DNA was isolated by a modified lysis procedure (Horn, N. A., et al., Hum. Gene Ther. 6:565-573 (1995)) followed by standard double CsCl-ethidium bromide gradient ultracentrifugation (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989)). Alternatively, plasmid DNAs are purified using Giga columns from Qiagen (Valencia, Calif.) according to the kit instructions. All plasmid preparations were free of detectable chromosomal DNA, RNA and protein impurities based on gel analysis and the bicinchoninic protein assay (Pierce Chem. Co., Rockford Ill.). Endotoxin levels were measured using the Limulus Amebocyte Lysate assay (LAL, Associates of Cape Cod, Falmouth, Mass.) and were less than 0.6 Endotoxin Units/mg of plasmid DNA. The spectrophotometric A260/A280 ratios of the DNA solutions were typically above 1.8. Plasmids were ethanol precipitated and resuspended in an appropriate solution, e.g., 150 mM sodium phosphate (for other appropriate excipients and auxiliary agents, see U.S. patent application Publication 2002/0019358, published Feb. 14, 2002). DNA was stored at -20° C. until use. DNA was diluted by mixing it with 300 mM salt solutions and by adding an appropriate amount of USP water to obtain 1 mg/ml plasmid DNA in the desired salt at the desired molar concentration.
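The final dilution step is simple volume arithmetic, which the following sketch works through. It assumes, for illustration only, that the DNA stock is salt-free and that a single 300 mM salt stock is used; the function name `dilution_volumes` and the example stock concentration of 5 mg/ml are hypothetical and do not come from the text.

```python
def dilution_volumes(stock_dna_mg_per_ml, final_volume_ml, target_salt_mM,
                     salt_stock_mM=300.0, target_dna_mg_per_ml=1.0):
    """Return (DNA stock, salt stock, water) volumes in ml.

    Assumes the DNA stock contributes no salt (an assumption of this sketch).
    """
    v_dna = final_volume_ml * target_dna_mg_per_ml / stock_dna_mg_per_ml
    v_salt = final_volume_ml * target_salt_mM / salt_stock_mM
    v_water = final_volume_ml - v_dna - v_salt
    if v_water < 0:
        raise ValueError("stocks are too dilute to reach the requested targets")
    return v_dna, v_salt, v_water


# Example: 1 ml of 1 mg/ml DNA in 150 mM salt from a hypothetical 5 mg/ml stock.
v_dna, v_salt, v_water = dilution_volumes(5.0, 1.0, 150.0)
```

For this example the volumes come to roughly 0.2 ml of DNA stock, 0.5 ml of 300 mM salt solution, and 0.3 ml of water. If the DNA stock already contains salt, the salt term would need to be adjusted accordingly.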

Plasmid Expression in Mammalian Cell Lines

[0256] The expression plasmids were analyzed in vitro by transfecting the plasmids into a well characterized mouse melanoma cell line (VM-92, also known as UM-449). See, e.g., Wheeler, C. J., Sukhu, L., Yang, G., Tsai, Y., Bustamente, C., Felgner, P. Norman, J & Manthorpe, M. "Converting an Alcohol to an Amine in a Cationic Lipid Dramatically Alters the Co-lipid Requirement, Cellular Transfection Activity and the Ultrastructure of DNA-Cytofectin Complexes," Biochim. Biophys. Acta. 1280:1-11 (1996). Other well-characterized human cell lines can also be used, e.g. MRC-5 cells, ATCC Accession No. CCL-171 or human rhabdomyosarcoma cell line RD (ATCC CCL-136). The transfection was performed using cationic lipid-based transfection procedures well known to those of skill in the art. Other transfection procedures are well known in the art and may be used, for example electroporation and calcium chloride-mediated transfection (Graham F. L. and A. J. van der Eb Virology 52:456-67 (1973)). Following transfection, cell lysates and culture supernatants of transfected cells were evaluated to compare relative levels of expression of IV antigen proteins. The samples were assayed by western blots and ELISAs, using commercially available polyclonal and/or monoclonal antibodies (available, e.g., from Research Diagnostics Inc., Flanders N.J.), so as to compare both the quality and the quantity of expressed antigen.

Injections of Plasmid DNA

[0257] The quadriceps muscles of restrained awake mice (e.g., female 6-12 week old BALB/c mice from Harlan Sprague Dawley, Indianapolis, Ind.) are injected bilaterally with 1-50 μg of DNA in 50 μl solution (100 μg in 100 μl total per mouse) using a disposable plastic insulin syringe and 28G 1/2 needle (Becton-Dickinson, Franklin Lakes, N.J., Cat. No. 329430) fitted with a plastic collar cut from a micropipette tip, as previously described (Hartikka, J., et al., Hum. Gene Ther. 7:1205-1217 (1996)).

[0258] Animal care throughout the study was in compliance with the "Guide for the Care and Use of Laboratory Animals", Institute of Laboratory Animal Resources, Commission on Life Sciences, National Research Council, National Academy Press, Washington, D.C., 1996, as well as with Vical's Institutional Animal Care and Use Committee.

Example 1

Construction of Expression Vectors

[0259] Plasmid constructs comprising the native coding regions encoding NP, M1, M2, HA, and eM2 IV proteins, or fragments, variants or derivatives thereof, are constructed as follows. The NP, M1, and M2 genes from IV (A/PR/8/34) are isolated from viral RNA by RT-PCR, or prepared by direct synthesis if the wild-type sequence is known, and are inserted into the vector VR10551 via standard restriction sites, by standard methods.

[0260] Plasmid constructs comprising human codon-optimized coding regions encoding NP, M1, M2, HA, eM2, and/or an eM2-NP fusion; or other codon-optimized coding regions encoding other IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, are prepared as follows. The codon-optimized coding regions are generated using the full, minimal, or uniform codon optimization methods described herein. The codon optimized coding regions are constructed using standard PCR methods described herein, or are ordered commercially. Oligonucleotides representing about the first 23-24 aa extracellular region of M2 are constructed, and are used in an overlap PCR reaction with the NP coding regions described above, to create a coding region coding for an eM2/NP fusion protein, for example as shown in SEQ ID NOs 6 and 7. The codon-optimized coding regions are inserted into the vector VR10551 via standard restriction sites, by standard methods.

[0261] Plasmids constructed as above are propagated in Escherichia coli and purified by the alkaline lysis method (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., ed. 2 (1989)). CsCl-banded DNA is ethanol precipitated and resuspended in 0.9% saline or PBS to a final concentration of 2 mg/ml for injection. Alternatively, plasmids are purified using any of a variety of commercial kits, or by other known procedures involving differential precipitation and/or chromatographic purification.

[0262] Expression is tested by formulating each of the plasmids in DMRIE/DOPE and transfecting VM92 cells. The supernatants are collected and the protein production is tested by Western blot or ELISA. The relative expression of the wild-type and codon-optimized constructs is compared.

[0263] Examples of constructs made according to the above methods are listed in Table 13. The experimental procedure for generating the listed constructs is as described above, with particular parameters and materials employed as described herein.

TABLE 13
Plasmid #	Description
VR4700	TPA leader - NP (A/PR/34) in VR 1255
VR4707	TPA leader-M2 with transmembrane deletion, glycine linker inserted
VR4710	TPA leader - 1st 24 amino acids of M2 from VR4707 fused to NP from VR4700
VR4750	full length HA from mouse adapted virus (H3, Hong Kong 68)
VR4752	full length HA from mouse adapted virus (H1, Puerto Rico 34)
VR4755	algorithm to codon optimize consensus amino acid sequence, direct fusion M2 to ATG of M1
VR4756	native sequence from A/Niigata/137/96 influenza strain (matches amino acid consensus sequence)
VR4757	Contracted codon optimized - 1st 24 amino acids of M2 from consensus fused to full-length NP consensus
VR4758	Applicants' codon optimized - 1st 24 amino acids of M2 from consensus fused to full-length NP consensus
VR4759	Full-length M2 derived from VR4755
VR4760	Full-length M1 derived from VR4755
VR4761	Full-length NP derived from VR4757
VR4762	Full-length NP derived from VR4758
VR4763	Selectively codon-optimized regions of segment 7

[0264] The pDNA expression vector VR4700 which encodes the influenza NP protein has been described in the art. See, e.g. Sankar, V., Baccaglilni, L., Sawddey, M., Wheeler, C. J., Pillemer, S. R., Baum, B. J. and Atkinson, J. C., "Salivary Gland Delivery of pDNA-Cationic Lipolplexes Elicits Systemic Immune Responses," Oral Diseases 8:275-281 (2002). The following is the open reading frame for TPA-NP (from VR4700), referred to herein as SEQ ID NO:46: TABLE-US-00054 1 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 61 tcgcccagcg ctagaggatc gggaatggcg tcccaaggca ccaaacggtc ttacgaacag 121 atggagactg atggagaacg ccagaatgcc actgaaatca gagcatccgt cggaaaaatg 181 attggtggaa ttggacgatt ctacatccaa atgtgcaccg aactcaaact cagtgattat 241 gagggacggt tgatccaaaa cagcttaaca atagagagaa tggtgctctc tgcttttgac 301 gaaaggagaa ataaatacct ggaagaacat cccagtgcgg ggaaagatcc taagaaaact 361 ggaggaccta tatacaggag agtaaacgga aagtggatga gagaactcat cctttatgac 421 aaagaagaaa taaggcgaat ctggcgccaa gctaataatg gtgacgatgc aacggctggt 481 ctgactcaca tgatgatctg gcattccaat ttgaatgatg caacttatca gaggacaaga 541 gctcttgttc gcaccggaat ggatcccagg atgtgctctc tgatgcaagg ttcaactctc 601 cctaggaggt ctggagccgc aggtgctgca gtcaaaggag ttggaacaat ggtgatggaa 661 ttggtcagga tgatcaaacg tgggatcaat gatcggaact tctggagggg tgagaatgga 721 cgaaaaacaa gaattgctta tgaaagaatg tgcaacattc tcaaagggaa atttcaaact 781 gctgcacaaa aagcaatgat ggatcaagtg agagagagcc ggaacccagg gaatgctgag 841 ttcgaagatc tcacttttct agcacggtct gcactcatat tgagagggtc ggttgctcac 901 aagtcctgcc tgcctgcctg tgtgtatgga cctgccgtag ccagtgggta cgactttgaa 961 agagagggat actctctagt cggaatagac cctttcagac tgcttcaaaa cagccaagtg 1021 tacagcctaa tcagaccaaa tgagaatcca gcacacaaga gtcaactggt gtggatggca 1081 tgccattctg ccgcatttga agatctaaga gtattaagct tcatcaaagg gacgaaggtg 1141 ctcccaagag ggaagctttc cactagagga gttcaaattg cttccaatga aaatatggag 1201 actatggaat caagtacact tgaactgaga agcaggtact gggccataag gaccagaagt 1261 ggaggaaaca ccaatcaaca gagggcatct gcgggccaaa tcagcataca 
acctacgttc 1321 tcagtacaga gaaatctccc ttttgacaga acaaccatta tggcagcatt caatgggaat 1381 acagagggaa gaacatctga catgaggacc gaaatcataa ggatgatgga aagtgcaaga 1441 ccagaagatg tgtctttcca ggggcgggga gtcttcgagc tctcggacga aaaggcagcg 1501 agcccgatcg tgccttcctt tgacatgagt aatgaaggat cttatttctt cggagacaat 1561 gcagatgagt acgacaatta a

[0265] Purified VR4700 DNA was used to transfect the murine cell line VM92 to determine expression of the NP protein. Expression of NP was confirmed with a Western Blot assay. Western blot analysis showed very low level expression of VR4700 in vitro as detected with mouse polyclonal anti-NP antibody. In vivo antibody response was detected by ELISA with an average titer of 62,578.

[0266] Plasmid VR4707 expresses a secreted form of M2, i.e., TPA-M2. The sequence was assembled using synthetic oligonucleotides in which the oligos were annealed amongst themselves, and then ligated and gel purified. The purified product was then ligated (cloned) into Eco RI/Sal I of VR10551. The M2 sequence lacks the transmembrane domain; the cloned sequence contains amino acids [TPA(1-23)]ARGSG[M2(1-25)]GGG[M2(44-97)]. Amino acid residues between TPA and M2 and between M2 domains were added as flexible linkers. The following mutations were introduced to generate appropriate T-cell epitopes: 74S→G and 78S→N. The following is the open reading frame for TPA-M2ΔTM (from VR4707), referred to herein as SEQ ID NO:47: TABLE-US-00055 1 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 61 tcgcccagcg ctagaggatc gggaatgagt cttctgaccg aggtcgaaac ccctatcaga 121 aacgaatggg ggtgcagatg caacgattca agtgatcctg gcggcggcga tcggcttttt 181 ttcaaatgca tttatcggcg ctttaaatac ggcttgaaaa gagggccttc taccgaagga 241 gtgccagagt ctatgaggga agaatatcgg aaggaacagc agaatgctgt ggatgttgac 301 gatagccatt ttgtcagcat cgagctggag taa
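As a quick consistency check on the construct layout given in this paragraph, the segment lengths can be summed and compared with the length of the open reading frame, whose listing ends at nucleotide 333. The short sketch below does only that arithmetic; the variable names are illustrative.

```python
# Segment lengths in amino acids, read from the construct description
# [TPA(1-23)]ARGSG[M2(1-25)]GGG[M2(44-97)].
segments = [
    ("TPA leader (1-23)", 23),
    ("ARGSG linker", len("ARGSG")),
    ("M2 (1-25)", 25),
    ("GGG linker", len("GGG")),
    ("M2 (44-97)", 97 - 44 + 1),
]

protein_length = sum(length for _, length in segments)

# Three nucleotides per codon, plus one stop codon.
orf_length_nt = 3 * (protein_length + 1)
```

The segments sum to 110 amino acids, which with a stop codon corresponds to 333 nucleotides, the same length as the listed open reading frame.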

[0267] Purified VR4707 DNA was used to transfect the murine cell line VM92 to determine expression of the M2 protein. Expression of M2 was confirmed with a Western blot assay. Expression was visualized with a commercially available anti-M2 monoclonal antibody. In vivo M2 antibody response to VR4707, as assayed by ELISA, resulted in an average titer of 110, which is lower than the average titer of 9,240 for VR4756, encoding full-length M2 from segment 7. An IFN-γ ELISPOT assay for M2-specific T cells resulted in an average of 61 SFU/10^6 cells versus an average of 121 SFU/10^6 cells for the segment 7 construct.

[0268] VR4710 was created by fusing the TPA leader and the first 24 amino acids of M2 from VR4707 to the full-length NP gene from VR4700. Primers 5'-GCCGAATCCATGGATGCAATGAAG-3' (SEQ ID NO:48) and 5'-GGTGCCTTGGGACGCCATATCACTTGAATCGTTGCA-3' (SEQ ID NO:49) were used to amplify the TPA-M2 fragment from VR4707. Primers 5'-TGCAACGATTCAAGTGATATGGCGTCCCAAGGCACC-3' (SEQ ID NO:50) and 5'-GCCGTCGACTTAATTGTCGTACTC-3' (SEQ ID NO:51) were used to amplify the NP gene from VR4700. Then the N-terminal and C-terminal primers were used to assemble the fusion, and the eM2NP fusion was cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for TPA-M2-NP (from VR4710), referred to herein as SEQ ID NO:52: TABLE-US-00056 1 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 61 tcgcccagcg ctagaggatc gggaatgagt cttctgaccg aggtcgaaac ccctatcaga 121 aacgaatggg ggtgcagatg caacgattca agtgatatgg cgtcccaagg caccaaacgg 181 tcttacgaac agatggagac tgatggagaa cgccagaatg ccactgaaat cagagcatcc 241 gtcggaaaaa tgattggtgg aattggacga ttctacatcc aaatgtgcac cgaactcaaa 301 ctcagtgatt atgagggacg gttgatccaa aacagcttaa caatagagag aatggtgctc 361 tctgcttttg acgaaaggag aaataaatac ctggaagaac atcccagtgc ggggaaagat 421 cctaagaaaa ctggaggacc tatatacagg agagtaaacg gaaagtggat gagagaactc 481 atcctttatg acaaagaaga aataaggcga atctggcgcc aagctaataa tggtgacgat 541 gcaacggctg gtctgactca catgatgatc tggcattcca atttgaatga tgcaacttat 601 cagaggacaa gagctcttgt tcgcaccgga atggatccca ggatgtgctc tctgatgcaa 661 ggttcaactc tccctaggag gtctggagcc gcaggtgctg cagtcaaagg agttggaaca 721 atggtgatgg aattggtcag gatgatcaaa cgtgggatca atgatcggaa cttctggagg 781 ggtgagaatg gacgaaaaac aagaattgct tatgaaagaa tgtgcaacat tctcaaaggg 841 aaatttcaaa ctgctgcaca aaaagcaatg atggatcaag tgagagagag ccggaaccca 901 gggaatgctg agttcgaaga tctcactttt ctagcacggt ctgcactcat attgagaggg 961 tcggttgctc acaagtcctg cctgcctgcc tgtgtgtatg gacctgccgt agccagtggg 1021 tacgactttg aaagagaggg atactctcta gtcggaatag accctttcag actgcttcaa 
1081 aacagccaag tgtacagcct aatcagacca aatgagaatc cagcacacaa gagtcaactg 1141 gtgtggatgg catgccattc tgccgcattt gaagatctaa gagtattaag cttcatcaaa 1201 gggacgaagg tgctcccaag agggaagctt tccactagag gagttcaaat tgcttccaat 1261 gaaaatatgg agactatgga atcaagtaca cttgaactga gaagcaggta ctgggccata 1321 aggaccagaa gtggaggaaa caccaatcaa cagagggcat ctgcgggcca aatcagcata 1381 caacctacgt tctcagtaca gagaaatctc ccttttgaca gaacaaccat tatggcagca 1441 ttcaatggga atacagagyg aagaacatct gacatgagga ccgaaatcat aaggatgatg 1501 gaaagtgcaa gaccagaaga tgtgtctttc caggggcggg gagtcttcga gctctcggac 1561 gaaaaggcag cgagcccgat cgtgccttcc tttgacatga gtaatgaagg atcttatttc 1621 ttcggagaca atgcagatga gtacgacaat taa
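The two internal primers used for the overlap PCR (SEQ ID NO:49 and SEQ ID NO:50) span the eM2/NP junction and should be reverse complements of one another so that the two amplified fragments can anneal. The following sketch checks that relationship using the primer sequences quoted above; the helper name `reverse_complement` and the variable names are illustrative, and splitting SEQ ID NO:50 at position 18 is an interpretation based on the listed fusion sequence.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Internal (junction) primers as quoted in the text, written 5'->3'.
SEQ_ID_49 = "GGTGCCTTGGGACGCCATATCACTTGAATCGTTGCA"  # reverse primer, TPA-M2 fragment
SEQ_ID_50 = "TGCAACGATTCAAGTGATATGGCGTCCCAAGGCACC"  # forward primer, NP fragment

# The two junction primers should be exact reverse complements.
primers_overlap = reverse_complement(SEQ_ID_50) == SEQ_ID_49

# First half matches the end of the eM2 region, second half the start of NP.
m2_part, np_part = SEQ_ID_50[:18], SEQ_ID_50[18:]
```

Here `m2_part` covers the last six codons of the 24-residue eM2 region and `np_part` begins with the NP start codon, so the 36-nucleotide overlap joins the two coding regions in frame.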

[0269] Purified VR4710 DNA was used to transfect the murine cell line VM92 to determine expression of the eM2-NP fusion protein. Expression of eM2-NP was confirmed with a Western blot assay. Expression was visualized with a commercially available monoclonal antibody to M2 and with mouse polyclonal antibody to NP. ELISA results following 2 injections of pDNA into mice revealed little antibody response to M2, but an average titer of 66,560 for anti-NP antibody.

[0270] VR4750 was created by first reverse transcribing RNA from the mouse-adapted A/Hong Kong/1/68 virus stock using random hexamer to create a cDNA library. Then primers 5' GGGCTAGCGCCGCCACCATGAAGACCATCATTGCT 3' (SEQ ID NO:53) and 5' CCGTCGACTCAAATGCAAATGTTGCA 3' (SEQ ID NO:54) were employed to PCR the HA gene. The gene was inserted into the Invitrogen TOPO-TA vector first, and then sub-cloned into VR10551 using restriction enzymes NheI and SalI. The following is the open reading frame for HA (H3N2) from mouse-adapted A/Hong Kong/68 (from VR4750), referred to herein as SEQ ID NO:55: TABLE-US-00057 1 atgaagacca tcattgcttt gagctacatt ttctgtctgg ctctcggcca agaccttcca 61 ggaaatgaca acaacacagc aacgctgtgc ctgggacatc atgcggtgcc aaacggaaca 121 ctagtgaaaa caatcacaga tgatcagatt gaagtgacta atgctactga gctagttcag 181 agctcctcaa cggggaaaat atgcaacaat cctcatcgaa tccttgatgg aatagactgc 241 acactgatag atgctctatt gggggaccct cattgtgatg tttttcaaaa tgagacatgg 301 gaccttttcg ttgaacgcag caaagctttc agcaactgtt acccttatga tgtgccagat 361 tatgcccccc ttaggtcact agttgcctcg tcaggcactc tggagtttat cactgagggt 421 ttcacttgga ctggggtcac tcagaatggg ggaagcagtg cttgcaaaag gggacctggt 481 agcggttttt tcagtagact gaactggttg accaaatcag gaagcacata tccagtgctg 541 aacgtgacta tgccaaacaa tgacaatttt gacaaactat acatttgggg ggttcaccac 601 ccgagcacga accaagaaca aaccagcctg tatgttcaag catcagggag agtcacagtc 661 tctaccagga gaagccagca aactataatc ccgaatatcg agtccagacc ctgggtaagg 721 ggtctgtcta gtagaataag catctattgg acaatagtta agccgggaga cgtactggta 781 attaatagta atgggaacct aatcgctcct cggggttatt tcaagatgcg cactgggaaa 841 agctcaataa tgaggtcaga tgcacctatt gatacctgta tttctgaatg catcactcca 901 aatggaagca ttcccaatga caagcccttt caaaacgtaa acaaaatcac gtatggagca 961 tgccccaagt atgttaagca aaacaccctg aagttggcaa cagggatgcg gaatgtacca 1021 gagaaacaaa ctagaggcct attcggcgca atagcaggtt tcatagaaaa tggttgggag 1081 ggaatgatag acggttggta cggtttcagg catcaaaatt ctgagggcac aggacaagca 1141 gcagatctta aaagcactca agcagccatc gaccaaatca atgggaaatt 
gaacaggata 1201 atcaagaaga cgaacgagaa attccatcaa atcgaaaagg aattctcaga agtagaaggg 1261 agaattcagg acctcgagaa atacgttgaa gacactaaaa tagatctctg gtcttacaat 1321 gcggagcttc ttgtcgctct ggagaatcaa catacaattg acctgactga ctcggaaatg 1381 aacaagctgt ttgaaaaaac aaggaggcaa ctgagggaaa atgctgaaga catgggcaat 1441 ggttgcttca aaatatacca caaatgtgac aacgcttgca tagagtcaat cagaactggg 1501 acttatgacc atgatgtata cagagacgaa gcattaaaca accggtttca gatcaaaggt 1561 gttgaactga agtctggata caaagactgg atcctgtgga tttcctttgc catatcatgc 1621 tttttgcttt gtgttgtttt gctggggttc atcatgtggg cctgccagaa aggcaacatt 1681 aggtgcaaca tttgcatttg a

[0271] While VR4750 expression was not clearly detected in vitro by Western blot assay, two 100 μg vaccinations of VR4750 have been shown to protect mice from intranasal challenge with mouse-adapted A/Hong Kong/68 virus.

[0272] VR4752 was created by first reverse transcribing RNA from the mouse-adapted A/Puerto Rico/8/34 virus stock using random hexamer to create a cDNA library. Then primers 5' GGGCTAGCGCCGCCACCATGAAGGCAAACCTACTG 3' (SEQ ID NO:56) and 5' CCGTCGACTCAGATGCATATTCTGCA 3' (SEQ ID NO:57) were employed to PCR the HA gene. The gene was then cloned into the TOPO-TA vector first, and then sub-cloned into VR10551 using restriction enzymes NheI and SalI. The following is the open reading frame for HA (H1N1) cloned from mouse-adapted A/Puerto Rico/34 (from VR4752), referred to herein as SEQ ID NO:58: TABLE-US-00058 1 atgaaggcaa acctactggt cctgttatgt gcacttgcag ctgcagatgc agacacaata 61 tgtataggct accatgcgaa caattcaacc gacactgttg acacagtgct cgagaagaat 121 gtgacagtga cacactctgt taacctgctc gaagacagcc acaacggaaa actatgtaga 181 ttaaaaggaa tagccccact acaattgggg aaatgtaaca tcgccggatg gctcttggga 241 aacccagaat gcgacccact gcttccagtg agatcatggt cctacattgt agaaacacca 301 aactctgaga atggaatatg ttatccagga gatttcatcg actatgagga gctgagggag 361 caattgagct cagtgtcatc attcgaaaga ttcgaaatat ttcccaaaga aagctcatgg 421 cccaaccaca acacaaccaa aggagtaacg gcagcatgct cccatgcggg gaaaagcagt 481 ttttacagaa atttgctatg gctgacggag aaggagggct catacccaaa gctgaaaaat 541 tcttatgtga acaagaaagg gaaagaagtc cttgtactgt ggggtattca tcacccgtct 601 aacagtaagg atcaacagaa tatctatcag aatgaaaatg cttatgtctc tgtagtgact 661 tcaaattata acaggagatt taccccggaa atagcagaaa gacccaaagt aagagatcaa 721 gctgggagga tgaactatta ctggaccttg ctaaaacccg gagacacaat aatatttgag 781 gcaaatggaa atctaatagc accaaggtat gctttcgcac tgagtagagg ctttgggtcc 841 ggcatcatca cctcaaacgc atcaatgcat gagtgtaaca cgaagtgtca aacacccctg 901 ggagctataa acagcagtct ccctttccag aatatacacc cagtcacaat aggagagtgc 961 ccaaaatacg tcaggagtgc caaattgagg atggttacag gactaaggaa cattccgtcc 1021 attcaatcca gaggtctatt tggagccatt gccggtttta ttgaaggggg atggactgga 1081 atgatagatg gatggtacgg ttatcatcat cagaatgaac agggatcagg ctatgcagcg 1141 gatcaaaaaa gcacacaaaa tgccattaac gggattacaa acaaggtgaa 
ctctgttatc 1201 gagaaaatga acattcaatt cacagctgtg ggtaaagaat tcaacaaatt agaaaaaagg 1261 atggaaaatt taaataaaaa agttgatgat ggatttctgg acatttggac atataatgca 1321 gaattgttag ttctactgga aaatgaaagg actctggatt tccatgactc aaatgtgaag 1381 aatctgtatg agaaagtaaa aagccaatta aagaataatg ccaaagaaat cggaaatgga 1441 tgttttgagt tctaccacaa gtgtgacaat gaatgcatgg aaagtgtaag aaatgggact 1501 tatgattatc ccaaatattc agaagagtca aagttgaaca gggaaaaggt agatggagtg 1561 aaattggaat caatggggat ctatcagatt ctggcgatct actcaactgt cgccagttca 1621 ctggtgcttt tggtctccct gggggcaatc agtttctgga tgtgttctaa tggatctttg 1681 cagtgcagaa tatgcatctg a

[0273] Purified VR4752 DNA was used to transfect the murine cell line VM92 to determine expression of the HA protein. Expression of HA was confirmed with a Western Blot assay. Expression was visualized with a commercially available goat anti-influenza A (H1N1) antibody.

[0274] A direct fusion of the M2 gene to the M1 gene was synthesized based on a codon-optimized sequence derived from methods described in Example 4 using the "universal" optimization strategy. The synthesized gene was received in the pUC119 vector and then sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for the M2M1 fusion (from VR4755), referred to herein as SEQ ID NO:59: TABLE-US-00059 1 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 61 gacagcagcg accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 121 tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 181 agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 241 cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga gatgtccctg 301 ctgacagaag tggaaacata cgtgctgagc atcgtgccca gcggccccct gaaggccgag 361 atcgcccaga gactggagga cgtgttcgcc ggcaagaaca ccgacctgga ggccctgatg 421 gagtggctga agaccagacc catcctgagc cccctgacca agggcatcct gggcttcgtg 481 ttcaccctga ccgtgcccag cgagagaggc ctgcagagaa gaagattcgt gcagaacgcc 541 ctgaacggca acggcgaccc caacaacatg gaccgggccg tgaagctgta ccggaagctg 601 aagagagaga tcaccttcca cggcgccaag gagatcgccc tgagctacag cgccggcgcc 661 ctggccagct gcatgggcct gatctacaac agaatgggcg ccgtgaccac cgaggtggcc 721 ttcggcctgg tgtgcgccac ctgcgagcag atcgccgaca gccagcacag aagccacaga 781 cagatggtgg ccaccaccaa ccccctgatc agacacgaga acagaatggt gctggccagc 841 accaccgcca aggccatgga gcagatggcc ggcagcagcg agcaggccgc cgaggccatg 901 gagatcgcca gccaggccag acagatggtg caggccatga gagccatcgg cacccacccc 961 agcagcagcg ccggcctgaa ggacgacctg ctggagaacc tgcagaccta ccagaagaga 1021 atgggcgtgc agatgcagag attcaagtga
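The "universal" strategy, in which every amino acid is encoded by a single preferred codon, can be sketched as a table lookup. The codon table below is an assumption for illustration: it was read off the codons that appear at the start of SEQ ID NO:59 and has not been checked against a codon usage database. The function name `optimize_uniform` is likewise hypothetical.

```python
# One codon per amino acid (illustrative table; see the lead-in for its origin).
PREFERRED_CODON = {
    "A": "gcc", "R": "aga", "N": "aac", "D": "gac", "C": "tgc",
    "Q": "cag", "E": "gag", "G": "ggc", "H": "cac", "I": "atc",
    "L": "ctg", "K": "aag", "M": "atg", "F": "ttc", "P": "ccc",
    "S": "agc", "T": "acc", "W": "tgg", "Y": "tac", "V": "gtg",
}


def optimize_uniform(protein, stop_codon=""):
    """Back-translate a protein using one fixed codon per amino acid."""
    return "".join(PREFERRED_CODON[aa] for aa in protein) + stop_codon


# First 25 residues of the M2 consensus sequence (SEQ ID NO:78).
m2_start = "MSLLTEVETPIRNEWGCRCNDSSDP"
optimized = optimize_uniform(m2_start)

# First 75 nucleotides of SEQ ID NO:59 as listed above, spaces removed.
seq59_start = ("atgagcctgctgaccgaggtggagacccccatcagaaacg"
               "agtggggctgcagatgcaacgacagcagcgacccc")
matches_listing = optimized == seq59_start
```

The first 25 codons of the listed M2M1 fusion agree with this lookup. Later parts of SEQ ID NO:59 use other codons in places (for example at the start of the M1 portion), so the table should be read as a simplification rather than the exact algorithm used.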

[0275] Purified VR4755 DNA was used to transfect the murine cell line VM92 to determine expression of the M2M1 fusion protein. Expression of M2M1 was confirmed with a Western Blot assay. Expression of the M2M1 fusion was visualized with commercially available anti-M1 and anti-M2 monoclonal antibodies.

[0276] The segment 7 RNA of influenza A encodes both the M1 and M2 genes. A consensus amino acid sequence for M1 and M2 was derived according to methods described herein. The consensus sequences for both proteins, however, are identical to the M1 and M2 amino acid sequences derived from the IV strain A/Niigata/137/96, represented herein as SEQ ID NO:77 and SEQ ID NO:78, respectively. Accordingly, the native sequence for segment 7, A/Niigata/137/96, was synthesized and received as an insert in pUC119. The segment 7 insert was sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for segment 7 (from VR4756), referred to herein as SEQ ID NO:60: TABLE-US-00060 1 atgagccttc taaccgaggt cgaaacgtat gttctctcta tcgttccatc aggccccctc 61 aaagccgaaa tcgcgcagag acttgaagat gtctttgctg ggaaaaacac agatcttgag 121 gctctcatgg aatggctaaa gacaagacca atcctgtcac ctctgactaa ggggattttg 181 gggtttgtgt tcacgctcac cgtgcccagt gagcgaggac tgcagcgtag acgctttgtc 241 caaaatgccc tcaatgggaa tggggatcca aataacatgg acagagcagt taaactatat 301 agaaaactta agagggagat tacattccat ggggccaaag aaatagcact cagttattct 361 gctggtgcac ttgccagttg catgggcctc atatacaaca gaatgggggc tgtaaccact 421 gaagtggcct ttggcctggt atgtgcaaca tgtgaacaga ttgctgactc ccagcacagg 481 tctcataggc aaatggtggc aacaaccaat ccattaataa ggcatgagaa cagaatggtt 541 ttggccagca ctacagctaa ggctatggag caaatggctg gatcaagtga gcaggcagcg 601 gaggccatgg aaattgctag tcaggccagg caaatggtgc aggcaatgag agccattggg 661 actcatccta gctccagtgc tggtctaaaa gatgatcttc ttgaaaattt gcagacctat 721 cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac ccgcttgttg ttgctgcgag 781 tatcattggg atcttgcact tgatattgtg gattcttgat cgtctttttt tcaaatgcat 841 ctatcgactc ttcaaacacg gtctgaaaag agggccttct acggaaggag tacctgagtc 901 tatgagggaa gaatatcgaa aggaacagca gaatgctgtg gatgctgacg acagtcattt 961 tgtcagcata gagctggagt aa

[0277] SEQ ID NO:77 ("consensus" (A/Niigata/137/96) M1): TABLE-US-00061 MSLLTEVETYVLSIVPSGPLKAEIAQRLEDVFAGKNTDLEALMEWLKTRP ILSPLTKGILGFVFTLTVPSERGLQRRRFVQNALNGNGDPNNMDRAVKLY RKLKREITFHGAKEIALSYSAGALASCMGLIYNRMGAVTTEVAFGLVCAT CEQIADSQHRSHRQMVATTNPLIRHENRMVLASTTAKAMEQMAGSSEQAA EAMEIASQARQMVQAMRAIGTHPSSSAGLKDDLLENLQTYQKRMGVQM QRFK

[0278] SEQ ID NO:78 ("consensus" (A/Niigata/137/96) M2): TABLE-US-00062 MSLLTEVETPIRNEWGCRCNDSSDPLVVAASIIGILHLILWILDRLFFKC IYRLFKHGLKRGPSTEGVPESMREEYRKEQQNAVDADDSHFVSIELE

[0279] Purified VR4756 DNA was used to transfect the murine cell line VM92 to determine expression of the proteins encoded by segment 7. Expression of both M1 and M2 was confirmed with a Western blot assay using commercially available anti-M1 and anti-M2 monoclonal antibodies. ELISA results following 2 injections of pDNA into mice revealed an average anti-M2 antibody titer of 9,240 versus an average titer of 110 for VR4707. An IFN-γ ELISPOT assay for M2-specific T cells resulted in an average of 121 SFU/10^6 cells for VR4756-injected mice versus an average of 61 SFU/10^6 cells for the VR4707 construct.

[0280] An additional segment 7 sequence is created, VR4763, which contains selectively codon-optimized regions of segment 7. Optimization of the coding regions in segment 7 is selective, because segment 7 contains two overlapping coding regions (i.e., encoding M1 and M2), and these coding regions are partially in different reading frames. From the AUG encoded by nucleotides 1 to 3 of segment 7, M1 is encoded by bp 1 through 759 of the segment 7 RNA, while M2 is encoded by a spliced messenger RNA which includes nucleotides 1 to 26 of segment 7 spliced to nucleotides 715 to 982 of segment 7. Optimization of the region from 715 to 759 is avoided because the M1 and M2 coding sequences (in different reading frames) overlap in that region. Due to the splicing that occurs to join bp 26 to an alternate frame at bp 715 of the segment 7 sequence, optimization in these splicing regions is also avoided; adjacent regions that arguably could also participate in splicing are likewise avoided. Optimization is done in a manner to ensure that no new splicing sites are inadvertently introduced. The areas that are optimized are optimized using the "universal" strategy, e.g., inserting the most frequently used codon for each amino acid.
The following is the nucleotide sequence for codon-optimized segment 7 (from VR4763), referred to herein as SEQ ID NO:61: TABLE-US-00063 1 atgagcctgc tgaccgaggt cgaaacgtat gttctctcta tcgtgcccag cggccccctg 61 aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac cgacctggag 121 gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg 181 ggcttcgtgt tcaccctgac cgtgcccagc gagagaggcc tgcagagaag aagattcgtg 241 cagaacgccc tgaacggcaa cggcgacccc aacaacatgg acagagccgt gaagctgtac 301 agaaagctga agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc 361 gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc 421 gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga 481 agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg 541 ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 601 gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc 661 acccacccca gcagcagcgc cggcctgaaa gatgatcttc ttgaaaattt gcagacctat 721 cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac cccctggtgg tggccgccag 781 catcatcggc atcctgcacc tgatcctgtg gatcctggac agactgttct tcaagtgcat 841 ctacagactg ttcaagcacg gcctgaagag aggccccagc accgagggcg tgcccgagag 901 catgagagag gagtacagaa aggagcagca gaacgccgtg gacgccgacg acagccactt 961 cgtgagcatc gagctggagt ga

[0281] The codon-optimized coding region for M1 extends from nucleotide 1 to nucleotide 759 of SEQ ID NO:61, including the stop codon, and is represented herein as SEQ ID NO:79. The codon-optimized coding region for M2 extends from nucleotide 1 to nucleotide 26 of SEQ ID NO:61 spliced to nucleotide 715 through nucleotide 982 of SEQ ID NO:61, including the stop codon, and is represented herein as SEQ ID NO:80.
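As an illustration of how the two coding regions relate to the segment 7 sequence, the following Python sketch joins 1-based, inclusive nucleotide ranges into a single coding region. It uses the segment 7 coordinates given in paragraph [0280] (nucleotides 1 to 759 for M1; nucleotides 1 to 26 joined to 715 to 982 for M2). The function and constant names are illustrative only.

```python
def extract(seq, segments):
    """Join 1-based inclusive (start, end) segments of seq into one coding region."""
    return "".join(seq[start - 1:end] for start, end in segments)


def coding_length(segments):
    """Total number of nucleotides covered by the segments."""
    return sum(end - start + 1 for start, end in segments)


# Coordinates from paragraph [0280]
M1_SEGMENTS = [(1, 759)]
M2_SEGMENTS = [(1, 26), (715, 982)]
```

With these coordinates the M2 coding region is 294 nucleotides, i.e., 98 codons, which corresponds to a 97-residue protein plus a stop codon.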

[0282] Optimized M1 Coding Region (SEQ ID NO:79): TABLE-US-00064 ATGAGCCTGCTGACCGAGGTCGAAACGTATGTTCTCTCTATCGTGCCCAG CGGCCCCCTGAAGGCCGAGATCGCCCAGAGACTGGAGGACGTGTTCGCCG GCAAGAACACCGACCTGGAGGCCCTGATGGAGTGGCTGAAGACCAGACCC ATCCTGAGCCCCCTGACCAAGGGCATCCTGGGCTTCGTGTTCACCCTGAC CGTGCCCAGCGAGAGAGGCCTGCAGAGAAGAAGATTCGTGCAGAACGCCC TGAACGGCAACGGCGACCCCAACAACATGGACAGAGCCGTGAAGCTGTAC AGAAAGCTGAAGAGAGAGATCACCTTCCACGGCGCCAAGGAGATCGCCCT GAGCTACAGCGCCGGCGCCCTGGCCAGCTGCATGGGCCTGATCTACAACA GAATGGGCGCCGTGACCACCGAGGTGGCCTTCGGCCTGGTGTGCGCCACC TGCGAGCAGATCGCCGACAGCCAGCACAGAAGCCACAGACAGATGGTGGC CACCACCAACCCCCTGATCAGACACGAGAACAGAATGGTGCTGGCCAGCA CCACCGCCAAGGCCATGGAGCAGATGGCCGGCAGCAGCGAGCAGGCCGCC GAGGCCATGGAGATCGCCAGCCAGGCCAGACAGATGGTGCAGGCCATGAG AGCCATCGGCACCCACCCCAGCAGCAGCGCCGGCCTGAAAGATGATCTTC TTGAAAATTTGCAGACCTATCAGAAACGAATGGGGGTGCAGATGCAACGA TTCAAGTGA

[0283] Optimized M2 Coding Region (SEQ ID NO:80): TABLE-US-00065 ATGAGCCTGCTGACCGAGGTCGAAACACCTATCAGAAACGAATGGGGGTG CAGATGCAACGATTCAAGTGACCCCCTGGTGGTGGCCGCCAGCATCATCG GCATCCTGCACCTGATCCTGTGGATCCTGGACAGACTGTTCTTCAAGTGC ATCTACAGACTGTTCAAGCACGGCCTGAAGAGAGGCCCCAGCACCGAGGG CGTGCCCGAGAGCATGAGAGAGGAGTACAGAAAGGAGCAGCAGAACGCCG TGGACGCCGACGACAGCCACTTCGTGAGCATCGAGCTGGAGTGA

[0284] The eM2-NP fusion was codon-optimized, inserted in pUC119 and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for eM2-NP: codon-optimized by Contract (from VR4757), referred to herein as SEQ ID NO:62: TABLE-US-00066 1 atgagcttgc tcactgaagt cgagacacca atcagaaacg aatggggatg tagatgcaac 61 gatagctcag acatggcctc ccagggaacc aaaagaagct atgaacagat ggagactgac 121 ggagagagac agaacgccac agagatcaga gctagtgtag gaaagatgat agacggtatc 181 gggcgatttt acattcaaat gtgtacggaa ttgaaactca gcgactatga aggcagactt 241 atccagaact cactcacaat tgagcgcatg gtactcagtg catttgatga aagaaggaat 301 aggtacctcg aagaacaccc cagcgccggc aaagatccca agaagactgg cggcccaatt 361 tacagaagag tggacggtaa gtggatgaga gagctggtat tgtacgataa agaagaaatt 421 agaagaatct ggaggcaagc aaacaatgga gaggatgcta cagctggcct gacccacatg 481 atgatttggc atagtaacct gaatgatacc acctaccagc ggacaagggc tctcgttcga 541 accgggatgg atccccgcat gtgctcattg atgcagggta gtacactccc gaggaggtca 601 ggcgcggccg gtgcagccgt gaaaggaatc ggcactatgg taatggaatt gataagaatg 661 attaaaaggg ggattaatga caggaacttt tggagaggag aaaatggacg caaaacaagg 721 agtgcgtatg aacggatgtg caatattttg aaaggaaaat tccaaactgc agcacagcgc 781 gccatgatgg atcaggtacg agaaagtcgc aacccaggta atgctgaaat agaggacctt 841 atatttctcg cccggagtgc tctcatactt agaggaagcg tggcccataa aagttgtctc 901 cccgcatgcg tatacggtcc cgctgtgtct tccggatacg attttgaaaa agagggatat 961 tcattggtgg gaatcgaccc ttttaagctg cttcagaact cacaggttta cagtttgatt 1021 agaccaaacg agaacccagc ccacaaatca caactcgtgt ggatggcatg ccactctgcc 1081 gctttcgaag atctgagact gctctcattt attagaggca ctaaagtgag cccgagggga 1141 aaactgagca cacgaggagt acagatagca tctaacgaaa atatggataa tatgggatct 1201 agcacactcg aattgaggtc acgatactgg gctattagaa cacggagcgg agggaacacc 1261 aaccagcaga gagcatccgc cggtcagata agcgttcagc ctacattttc agtacaacga 1321 aacctgccat ttgaaaagag tacagtgatg gccgcattta ctggcaacac cgagggacga 1381 acaagcgaca tgagagcaga gattattaga atgatggaag gagctaaacc agaggaggtt 1441 tcatttagag gaaggggagt cttcgaattg tccgatgaga 
aagccacaaa tcccatagta 1501 cctagcttcg acatgtccaa cgaaggctct tacttttttg gtgacaatgc cgaagagtac 1561 gacaattga

[0285] Purified VR4757 DNA was used to transfect the murine cell line VM92 to determine expression of the eM2-NP fusion protein. Expression of eM2-NP was confirmed with a Western Blot assay. Expression was visualized with a commercially available monoclonal antibody to M2 and with mouse polyclonal antibody to NP. In vivo antibody response to NP was detected by ELISA with an average titer of 51,200.

[0286] The eM2-NP fusion gene in VR4758 was codon-optimized and synthesized. The gene was inserted into pUC119 and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for eM2-NP: codon-optimized by Applicants (from VR4758), referred to herein as SEQ ID NO:63: TABLE-US-00067 1 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 61 gacagcagcg acatggccag ccagggcacc aagagaagct acgagcagat ggagaccgac 121 ggcgagagac agaacgccac cgagatcaga gccagcgtgg gcaagatgat cgacggcatc 181 ggcagattct acatccagat gtgcaccgag ctgaagctga gcgactacga gggcagactg 241 atccagaaca gcctgaccat cgagagaatg gtgctgagcg ccttcgacga gagaagaaac 301 agatacctgg aggagcaccc cagcgccggc aaggacccca agaagaccgg cggccccatc 361 tacagaagag tggacggcaa gtggatgaga gagctggtgc tgtacgacaa ggaggagatc 421 agaagaatct ggagacaggc caacaacggc gaggacgcca ccgccggcct gacccacatg 481 atgatctggc acagcaacct gaacgacacc acctaccaga gaaccagagc cctggtgcgg 541 accggcatgg accccagaat gtgcagcctg atgcagggca gcaccctgcc cagaagaagc 601 ggcgccgccg gcgccgccgt gaagggcatc ggcaccatgg tgatggagct gatcagaatg 661 atcaagagag gcatcaacga cagaaacttc tggagaggcg agaacggcag aaagaccaga 721 agcgcctacg agagaatgtg caacatcctg aagggcaagt tccagaccgc cgcccagaga 781 gccatgatgg accaggtccg ggagagcaga aaccccggca acgccgagat cgaggacctg 841 atcttcctgg ccagaagcgc cctgatcctg agaggcagcg tggcccacaa gagctgcctg 901 cccgcctgcg tgtacggccc cgccgtgagc agcggctacg acttcgagaa ggagggctac 961 agcctggtgg gcatcgaccc cttcaagctg ctgcagaaca gccaggtgta cagcctgatc 1021 agacccaacg agaaccccgc ccacaagagc cagctggtgt ggatggcctg ccacagcgcc 1081 gccttcgagg acctgagact gctgagcttc atcagaggca ccaaggtgtc ccccagaggc 1141 aagctgagca ccagaggcgt gcagatcgcc agcaacgaga acatggacaa catgggcagc 1201 agcaccctgg agctgagaag cagatactgg gccatcagaa ccagaagcgg cggcaacacc 1261 aaccagcaga gagccagcgc cggccagatc agcgtgcagc ccaccttcag cgtgcagaga 1321 aacctgccct tcgagaagag caccgtgatg gccgccttca ccggcaacac cgagggcaga 1381 accagcgaca tgagagccga gatcatcaga atgatggagg gcgccaagcc cgaggaggtg 
1441 tccttcagag gcagaggcgt gttcgagctg agcgacgaga aggccaccaa ccccatcgtg 1501 cctagcttcg acatgagcaa cgagggcagc tacttcttcg gcgacaacgc cgaggagtac 1561 gacaactga

[0287] Purified VR4758 DNA was used to transfect the murine cell line VM92 to determine expression of the eM2-NP protein. Expression of eM2-NP was confirmed with a Western Blot assay. Expression was visualized with a commercially available monoclonal antibody to M2 and with mouse polyclonal antibody to NP. In vivo antibody response to NP was detected by ELISA with an average titer of 48,640.

[0288] The M2 gene was PCR-amplified from VR4755 using the primers 5'-GCCGAATTCGCCACCATGAGCCTGCTGACC-3' (SEQ ID NO:64) and 5'-GCCGTCGACTGATCACTCCAGCTCGATGCTCAC-3' (SEQ ID NO:65) and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for M2 (from VR4759), referred to herein as SEQ ID NO:66: TABLE-US-00068 1 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 61 gacagcagcg accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 121 tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 181 agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 241 cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga gtga

[0289] Purified VR4759 DNA was used to transfect the murine cell line VM92 to determine expression of the M2 protein. Expression of M2 was confirmed with a Western Blot assay. Expression was visualized with a commercially available anti-M2 monoclonal antibody.

[0290] The M1 gene was PCR-amplified from VR4755 using the primers 5'-GCCGAATTCGCCACCATGTCCCTGCTGACAGAAGTG-3' (SEQ ID NO:67) and 5'-GCCGTCGACTGATCACTTGAATCTCTGCATC-3' (SEQ ID NO:68) and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for M1 (from VR4760), referred to herein as SEQ ID NO:69: TABLE-US-00069 1 atgtccctgc tgacagaagt ggaaacatac gtgctgagca tcgtgcccag cggccccctg 61 aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac cgacctggag 121 gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg 181 ggcttcgtgt tcaccctgac cgtgcccagc gagagaggcc tgcagagaag aagattcgtg 241 cagaacgccc tgaacggcaa cggcgacccc aacaacatgg accgggccgt gaagctgtac 301 cggaagctga agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc 361 gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc 421 gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga 481 agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg 541 ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 601 gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc 661 acccacccca gcagcagcgc cggcctgaag gacgacctgc tggagaacct gcagacctac 721 cagaagagaa tgggcgtgca gatgcagaga ttcaagtga

[0291] Purified VR4760 DNA was used to transfect the murine cell line VM92 to determine expression of the M1 protein. Expression of M1 was confirmed with a Western Blot assay. Expression was visualized with a commercially available anti-M1 monoclonal antibody.

[0292] The NP gene was PCR-amplified from VR4757 using primers 5'-GCCGAATTCGCCACCATGGCCTCCCAGGGAACCAAAAG-3' (SEQ ID NO:70) and 5'-GCCGTCGACTGATCAATTGTCGTACTCTTC-3' (SEQ ID NO:71) and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for NP: codon-optimized by Contract (from VR4761), referred to herein as SEQ ID NO:72: TABLE-US-00070 1 atg gcc tcc cag gga acc aaa aga agc tat gaa cag atg gag act gac 49 gga gag aga cag aac gcc aca gag atc aga gct agt gta gga aag atg 97 ata gac ggt atc ggg cga ttt tac att caa atg tgt acg gaa ttg aaa 145 ctc agc gac tat gaa ggc aga ctt atc cag aac tca ctc aca att gag 193 cgc atg gta ctc agt gca ttt gat gaa aga agg aat agg tac ctc gaa 241 gaa cac ccc agc gcc ggc aaa gat ccc aag aag act ggc ggc cca att 289 tac aga aga gtg gac ggt aag tgg atg aga gag ctg gta ttg tac gat 337 aaa gaa gaa att aga aga atc tgg agg caa gca aac aat gga gag gat 385 gct aca gct ggc ctg acc cac atg atg att tgg cat agt aac ctg aat 433 gat acc acc tac cag cgg aca agg gct ctc gtt cga acc ggg atg gat 481 ccc cgc atg tgc tca ttg atg cag ggt agt aca ctc ccg agg agg tca 529 ggc gcg gcc ggt gca gcc gtg aaa gga atc ggc act atg gta atg gaa 577 ttg ata aga atg att aaa agg ggg att aat gac agg aac ttt tgg aga 625 gga gaa aat gga cgc aaa aca agg agt gcg tat gaa cgg atg tgc aat 673 att ttg aaa gga aaa ttc caa act gca gca cag cgc gcc atg atg gat 721 cag gta cga gaa agt cgc aac cca ggt aat gct gaa ata gag gac ctt 769 ata ttt ctc gcc cgg agt gct ctc ata ctt aga gga agc gtg gcc cat 817 aaa agt tgt ctc ccc gca tgc gta tac ggt ccc gct gtg tct tcc gga 865 tac gat ttt gaa aaa gag gga tat tca ttg gtg gga atc gac cct ttt 913 aag ctg ctt cag aac tca cag gtt tac agt ttg att aga cca aac gag 961 aac cca gcc cac aaa tca caa ctc gtg tgg atg gca tgc cac tct gcc 1009 gct ttc gaa gat ctg aga ctg ctc tca ttt att aga ggc act aaa gtg 1057 agc ccg agg gga aaa ctg agc aca cga gga gta cag ata gca tct aac 1105 gaa aat atg gat aat atg gga tct agc aca ctc gaa ttg agg 
tca cga 1153 tac tgg gct att aga aca cgg agc gga ggg aac acc aac cag cag aga 1201 gca tcc gcc ggt cag ata agc gtt cag cct aca ttt tca gta caa cga 1249 aac ctg cca ttt gaa aag agt aca gtg atg gcc gca ttt act ggc aac 1297 acc gag gga cga aca agc gac atg aga gca gag att att aga atg atg 1345 gaa gga gct aaa cca gag gag gtt tca ttt aga gga agg gga gtc ttc 1393 gaa ttg tcc gat gag aaa gcc aca aat ccc ata gta cct agc ttc gac 1441 atg tcc aac gaa ggc tct tac ttt ttt ggt gac aat gcc gaa gag tac 1489 gac aat tga

[0293] Purified VR4761 DNA was used to transfect the murine cell line VM92 to determine expression of the NP protein. Expression of NP was confirmed with a Western Blot assay. Expression was visualized with a mouse polyclonal anti-NP antibody. In vitro expression of VR4761 was significantly higher than VR4700 and comparable to VR4762.

[0294] The NP gene was PCR-amplified from VR4758 using primers 5'-GCCGAATTCGCCACCATGGCCAGCCAGGGCACCAAG-3' (SEQ ID NO:73) and 5'-GCCGTCGACTGATCAGTTGTCGTACTCC-3' (SEQ ID NO:74) and sub-cloned into VR10551 as an EcoRI-SalI fragment. The following is the open reading frame for NP: codon-optimized by Applicants (from VR4762), referred to herein as SEQ ID NO:75: TABLE-US-00071 1 atg gcc agc cag ggc acc aag aga agc tac gag cag atg gag acc gac 49 ggc gag aga cag aac gcc acc gag atc aga gcc agc gtg ggc aag atg 97 atc gac ggc atc ggc aga ccc tac atc cag atg tgc acc gag ctg aag 145 ctg agc gac tac gag ggc aga ctg atc cag aac agc ctg acc atc gag 193 aga atg gtg ctg agc gcc ccc gac gag aga aga aac aga tac ctg gag 241 gag cac ccc agc gcc ggc aag gac ccc aag aag acc ggc ggc ccc atc 289 tac aga aga gtg gac ggc aag tgg atg aga gag ctg gtg ctg tac gac 337 aag gag gag atc aga aga atc tgg aga cag gcc aac aac ggc gag gac 385 gcc acc gcc ggc ctg acc cac atg atg atc tgg cac agc aac ctg aac 433 gac acc acc tac cag aga acc aga gcc ctg gtg cgg acc ggc atg gac 481 ccc aga atg tgc agc ctg atg cag ggc agc acc ctg ccc aga aga agc 529 ggc gcc gcc ggc gcc gcc gtg aag ggc atc ggc acc atg gtg atg gag 577 ctg atc aga atg atc aag aga ggc atc aac gac aga aac ccc tgg aga 625 ggc gag aac ggc aga aag acc aga agc gcc tac gag aga atg tgc aac 673 atc ctg aag ggc aag ttc cag acc gcc gcc cag aga gcc atg atg gac 721 cag gtc cgg gag agc aga aac ccc ggc aac gcc gag atc gag gac ctg 769 atc ttc ctg gcc aga agc gcc ctg atc ctg aga ggc agc gtg gcc cac 817 aag agc tgc ctg ccc gcc tgc gtg cac ggc ccc gcc gtg agc agc ggc 865 cac gac ccc gag aag gag ggc cac agc ctg gtg ggc atc gac ccc ccc 913 aag ctg ctg cag aac agc cag gtg tac agc ctg atc aga ccc aac gag 961 aac ccc gcc cac aag agc cag ctg gtg tgg atg gcc tgc cac agc gcc 1009 gcc ttc gag gac ctg aga ctg ctg agc ttc atc aga ggc acc aag gtg 1057 ccc ccc aga ggc aag ctg agc acc aga ggc gtg cag atc gcc agc aac 1105 gag aac atg gac aac atg ggc agc agc acc ctg gag ctg aga 
agc aga 1153 tac tgg gcc atc aga acc aga agc ggc ggc aac acc aac cag cag aga 1201 gcc agc gcc ggc cag atc agc gtg cag ccc acc ttc agc gtg cag aga 1249 aac ctg ccc ttc gag aag agc acc gtg atg gcc gcc ttc acc ggc aac 1297 acc gag ggc aga acc agc gac atg aga gcc gag atc atc aga atg atg 1345 gag ggc gcc aag ccc gag gag gtg ccc ttc aga ggc aga ggc gtg ttc 1393 gag ctg agc gac gag aag gcc acc aac ccc atc gtg cct agc ttc gac 1441 atg agc aac gag ggc agc tac ttc ttc ggc gac aac gcc gag gag tac 1489 gac aac tga

[0295] Purified VR4762 DNA was used to transfect the murine cell line VM92 to determine expression of the NP protein. Expression of NP was confirmed with a Western Blot assay. Expression was visualized with a mouse polyclonal anti-NP antibody. In vitro expression of VR4762 was significantly higher than VR4700 and comparable to VR4761.

[0296] In addition to plasmids encoding single IV proteins, single plasmids which contain two or more IV coding regions are constructed according to standard methods. For example, a polycistronic construct, in which two or more IV coding regions are transcribed as a single transcript in eukaryotic cells, may be constructed by separating the various coding regions with IRES sequences. Alternatively, two or more coding regions may be inserted into a single plasmid, each with its own promoter sequence.

Example 2

Preparation of Recombinant NP DNA and Protein

[0297] Recombinant NP DNA and protein may be prepared using the following procedure. Eukaryotic cells may be used to express the NP protein from a transfected expression plasmid. Alternatively, a baculovirus system can be used wherein insect cells such as, but not limited to, Sf9, Sf21, or D.Mel-2 cells are infected with a recombinant baculovirus which expresses the NP protein. Cells which have been infected with recombinant baculoviruses, or which contain expression plasmids, encoding recombinant NP are collected by knocking and scraping the cells off the bottom of the flask in which they are grown. Cells infected for 24 or 48 hours are less easy to detach from the flask and may lyse, thus care must be taken with their removal. The flask containing the cells is then rinsed with PBS and the cells are transferred to 250 ml conical tubes. The tubes are spun at 1000 rpm in a J-6 centrifuge (300 × g) for about 5-10 minutes. The cell pellets are washed two times with PBS and then resuspended in about 10-20 ml of PBS for counting. The cells are finally resuspended at a concentration of about 2 × 10^7 cells/ml in RSB (10 mM Tris pH 7.5, 1.5 mM MgCl2, 10 mM KCl).

[0298] Approximately 10^6 cells are used per lane of a standard SDS-PAGE mini-protein gel; this sample is the whole cell fraction for gel analysis purposes. 10% NP40 is added to the cells to a final concentration of 0.5%. The cell-NP40 mixture is vortexed and placed on ice for 10 minutes, with occasional vortexing. After the ice incubation, the cells are spun at 1500 rpm in a J-6 centrifuge (600 × g) for 10 minutes. The supernatant, which is the cytoplasmic fraction, is removed. The remaining pellet, containing the nuclei, is washed two times with buffer C (20 mM HEPES pH 7.9, 1.5 mM MgCl2, 0.2 mM EDTA, 0.5 mM PMSF, 0.5 mM DTT) to remove cytoplasmic proteins. The nuclei are resuspended in buffer C to 5 × 10^7 nuclei/ml. The nuclei are vortexed vigorously to break up particles and an aliquot, which is the nuclear fraction, is removed for the mini-protein gel.

[0299] To the remaining nuclei a quarter of the volume of 5 M NaCl is added and the mixture is sonicated for 5 minutes at maximum output in a bath-type sonicator at 4° C., in 1-2 minute bursts, resting 30 seconds between bursts. The sonicated mixture is stirred at 4° C., then spun at 12000 × g for 10 minutes. A sample equivalent to approximately 10^6 nuclei is removed for the protein mini-gel. The sample for the gel is centrifuged; the supernatant is the nuclear extract and the pellet is the nuclear pellet for gel analysis.

[0300] For gel analysis, a small amount (about 10^6 nuclear equivalents) of the nuclear pellet is resuspended directly in gel sample buffer and run with equivalent amounts of whole cells, cytoplasm, nuclei, nuclear extract and nuclear pellet. The above method gives relatively crude NP. To recover NP of a higher purity, 2.1 M NaCl can be added to the nuclear pellet instead of 5 M NaCl. This will bring the salt content to 0.42 M NaCl. The supernatant will then contain about 60-70% of the total NP plus nuclear proteins. The resulting pellet is then extracted with 1 M NaCl and centrifuged as above. The supernatant will contain NP at more than 95% purity.
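The salt concentrations quoted above follow from a simple dilution calculation, sketched here in Python. The sketch assumes that "a quarter of the volume" means 0.25 sample volumes of stock and that the nuclei are suspended in a buffer without NaCl (buffer C as listed contains none); the function name is illustrative.

```python
def final_concentration(stock_molar, added_fraction):
    """Final molarity after adding `added_fraction` sample volumes of stock
    to one volume of a sample that contains none of the solute."""
    return stock_molar * added_fraction / (1.0 + added_fraction)


crude = final_concentration(5.0, 0.25)     # 5 M stock, quarter volume
purified = final_concentration(2.1, 0.25)  # 2.1 M stock, quarter volume
```

Under these assumptions the 2.1 M stock gives the 0.42 M stated in the text, and the 5 M stock gives 1.0 M.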

Example 3

Consensus Amino Acid Sequences of NP, M1 and M2

[0301] By analyzing amino acid sequences from influenza strains sequenced since 1990, consensus amino acid sequences were derived for influenza NP, M1 and M2 antigens.

NP Consensus Amino Acid Sequence

[0302] The amino acid sequences for influenza NP (strain A) were chosen as follows. The database at http://www.flu.lanl.gov, which contains influenza sequences for each segment, was searched for influenza A strains, human, NP, amino acids. The search returned about 400 sequences, the majority of which were only partial sequences. These were narrowed down to 85 approximately full-length sequences. If different passages of the same strain were found, the earliest passage was chosen. The sequences were further narrowed down to 28 full-length NP sequences isolated from 1990 to 2000 (no full-length sequences were available from 2001-2003). Five additional sequences were eliminated because they were identical to another sequence isolated in the same year, on the assumption that sequences with the same year and identical amino acid sequences were likely to be the same virus strain (this avoids double weighting). If there were sequences from the same year with different amino acid sequences, both sequences were kept, leaving 23 sequences.
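The last filtering step (dropping sequences that are identical to another sequence from the same year) can be sketched in Python as follows. The record layout and the function name are assumptions made for illustration; the text does not describe how the filtering was actually carried out.

```python
def drop_same_year_duplicates(records):
    """Keep one record per (year, sequence) pair, preserving input order.

    records -- iterable of (strain_name, year, amino_acid_sequence) tuples
    """
    seen = set()
    kept = []
    for name, year, seq in records:
        key = (year, seq)
        if key not in seen:
            seen.add(key)
            kept.append((name, year, seq))
    return kept
```

Sequences from the same year that differ in amino acid sequence have different keys and are therefore both kept, as described above.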

[0303] Sequences were aligned to the A/PR/8/34 strain in descending order by most recent, and the consensus sequence was determined by utilizing the amino acid with the majority (FIG. 12). There are 32 amino acid changes between the A/PR/8/34 and the consensus sequence, and all amino acid changes are also present in the two year 2000 NP sequences. For one additional amino acid (aa 275) 15/23 have changed from E (in A/PR/8/34) to G, D or V (7G, 7D, 1V). Since the two 2000 strains both contain a G at this position, G was chosen. The changes total 33 amino acids, which is about a 7% difference from the A/PR/8/34 strain.
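One way to read the rule described above is sketched below in Python: take the residue held by a strict majority of the aligned sequences, and where no residue reaches a strict majority (as at aa 275), defer to the most recent isolates. This is an interpretation of the text, not the Applicants' stated algorithm; the function name and the `recent` parameter are hypothetical.

```python
from collections import Counter


def consensus(aligned, recent=()):
    """Majority-rule consensus of equal-length aligned sequences.

    aligned -- list of aligned amino acid sequences of identical length
    recent  -- optional most recent isolates, consulted only at columns
               where no residue holds a strict majority
    """
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to the same length")
    out = []
    for col in range(length):
        column = [s[col] for s in aligned]
        residue, count = Counter(column).most_common(1)[0]
        if 2 * count <= len(column) and recent:
            # No strict majority: use the residue found in the recent isolates
            residue = Counter(s[col] for s in recent).most_common(1)[0][0]
        out.append(residue)
    return "".join(out)
```

For a column distributed as 8 E, 7 G, 7 D and 1 V, with two recent isolates carrying G, this reading returns G, in line with the choice described for aa 275.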

[0304] The dominant Balb/c epitope TYQRTRALV is still maintained in the new consensus; changes to other theoretical human epitopes have not been determined as yet.

[0305] The A strains used in the last 8 years of flu vaccines (USA) are as follows (no full-length sequences are available for the NP genes of any of these strains):

[0306] a. 2002-2003 A/Moscow/10/99, A/New Caledonia/20/99

[0307] b. 2001-2002 A/Moscow/10/99, A/New Caledonia/20/99

[0308] c. 2000-2001 A/Panama/2007/99, A/New Caledonia/20/99

[0309] d. 1999-2000 A/Sydney/05/97, A/Beijing/262/95

[0310] e. 1998-1999 A/Sydney/05/97, A/Beijing/262/95

[0311] f. 1997-1998 A/Nanchang/933/95, A/Johannesburg/82/96

[0312] g. 1996-1997 A/Nanchang/933/95, A/Texas/36/91

[0313] h. 1995-1996 A/Johannesburg/33/94, A/Texas/36/91

[0314] The final NP consensus amino acid sequence derived using this method is referred to herein as SEQ ID NO:76: TABLE-US-00072 1 masqgtkrsy eqmetdgerq nateirasvg kmidgigrfy iqmctelkls dyegrliqns 61 ltiermvlsa fderrnryle ehpsagkdpk ktggpiyrrv dgkwmrelvl ydkeeirriw 121 rqanngedat aglthmmiwh snlndttyqr tralvrtgmd prmcslmqgs tlprrsgaag 181 aavkgigtmv melirmikrg indrnfwrge ngrktrsaye rmcnilkgkf qtaaqrammd 241 qvresrnpgn aeiedlifla rsalilrgsv ahksclpacv ygpavssgyd fekegyslvg 301 idpfkllqns qvyslirpne npahksqlvw machsaafed lrllsfirgt kvsprgklst 361 rgvqiasnen mdnmgsstle lrsrywairt rsggntnqqr asagqisvqp tfsvqrnlpf 421 ekstvmaaft gntegrtsdm raeiirmmeg akpeevsfrg rgvfelsdek atnpivpsfd 481 msnegsyffg dnaeeydn

M1 and M2 Consensus Amino Acid Sequences

[0315] Consensus sequences for M1 and M2 were determined in a similar fashion, as follows. The search parameters on the http://www.flu.lanl.gov/ website were: influenza A strains, human, segment 7, nucleotide (both M1 and M2 are derived from segment 7). Full-length sequences from 1990-1999 (no 2000+ sequences were available) were chosen. For sequences with the same year and city, only the earliest passage was used. For entries from the same year, sequences were eliminated that were identical to another sequence isolated in the same year (even if from a different city). Twenty-one sequences, full-length for both M1 and M2 from 1993-1999, were compared. At each position, the amino acid with the simple majority was used.

[0316] The M1 amino acid consensus sequence is referred to herein as SEQ ID NO:77: TABLE-US-00073 1 mslltevety vlsivpsgpl kaeiaqrled vfagkntdle almewlktrp ilspltkgil 61 gfvftltvps erglqrrrfv qnalngngdp nnmdravkly rklkreitfh gakeialsys 121 agalascmgl iynrmgavtt evafglvcat ceqiadsqhr shrqmvattn plirhenrmv 181 lasttakame qmagsseqaa eameiasqar qmvqamraig thpsssaglk ddllenlqty 241 qkrmgvqmqr fk

[0317] The M2 amino acid consensus sequence is referred to herein as SEQ ID NO:78: TABLE-US-00074 1 mslltevetp irnewgcrcn dssdplvvaa siigilhlil wildrlffkc iyrlfkhglk 61 rgpstegvpe smreeyrkeq qnavdaddsh fvsiele

Example 4

Codon Optimization Algorithm

[0318] The following is an outline of the algorithm used to derive human codon-optimized sequences of influenza antigens.

Back Translation

[0319] Starting with the amino acid sequence, one can either (a) manually backtranslate using the human codon usage table from http://www.kazusa.or.jp/codon/

[0320] Homo sapiens [gbpri]: 55194 CDS's (24298072 codons)

[0321] Fields: [triplet] [frequency: per thousand] ([number]) TABLE-US-00075
UUU 17.1(415589)  UCU 14.7(357770)  UAU 12.1(294182)  UGU 10.0(243198)
UUC 20.6(500964)  UCC 17.6(427664)  UAC 15.5(377811)  UGC 12.2(297010)
UUA  7.5(182466)  UCA 12.0(291788)  UAA  0.7(17545)   UGA  1.5(36163)
UUG 12.6(306793)  UCG  4.4(107809)  UAG  0.6(13416)   UGG 12.7(309683)
CUU 13.0(315804)  CCU 17.3(419521)  CAU 10.5(255135)  CGU  4.6(112673)
CUC 19.8(480790)  CCC 20.1(489224)  CAC 15.0(364828)  CGC 10.7(259950)
CUA  7.8(189383)  CCA 16.7(405320)  CAA 12.0(292745)  CGA  6.3(152905)
CUG 39.8(967277)  CCG  6.9(168542)  CAG 34.1(827754)  CGG 11.6(281493)
AUU 16.1(390571)  ACU 13.0(315736)  AAU 16.7(404867)  AGU 11.9(289294)
AUC 21.6(525478)  ACC 19.4(471273)  AAC 19.5(473208)  AGC 19.3(467869)
AUA  7.7(186138)  ACA 15.1(366753)  AAA 24.1(585243)  AGA 11.5(278843)
AUG 22.2(538917)  ACG  6.1(148277)  AAG 32.2(781752)  AGG 11.4(277693)
GUU 11.0(266493)  GCU 18.6(451517)  GAU 21.9(533009)  GGU 10.8(261467)
GUC 14.6(354537)  GCC 28.4(690382)  GAC 25.6(621290)  GGC 22.5(547729)
GUA  7.2(174572)  GCA 16.1(390964)  GAA 29.0(703852)  GGA 16.4(397574)
GUG 28.4(690428)  GCG  7.5(181803)  GAG 39.9(970417)  GGG 16.3(396931)

* Coding GC 52.45% 1st letter GC 56.04% 2nd letter GC 42.37% 3rd letter GC 58.93% (Table as of Nov. 6, 2003)

[0322] Or (b) log on to www.svntheticgenes.com and use the backtranslation tool, as follows:

[0323] (1) Under Protein tab, paste amino acid sequence;

[0324] (2) Under download codon usage tab, highlight homo sapiens and then download CUT. The downloaded table (TABLE-US-00076, as of Nov. 6, 2003) is identical to the human codon usage table reproduced above.

[0325] (3) Hit Apply button.

[0326] (4) Under Optimize TAB, open General TAB.

[0327] (5) Check use only most frequent codon box.

[0328] (6) Hit Apply button.

[0329] (7) Under Optimize TAB, open Motif TAB.

[0330] (8) Load desired cloning restriction sites into bad motifs; load any undesirable sequences, such as Pribnow Box sequences (TATAA), Chi sequences (GCTGGCGG), and restriction sites into bad motifs.

[0331] (9) Under Output TAB, click on Start box. Output will include sequence, motif search results (under Report TAB), and codon usage report.
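The bad-motif check in step (8) can be approximated outside the web tool with a short Python sketch that scans both strands of a sequence for unwanted motifs. This is not the tool's implementation. The Pribnow and Chi strings are those quoted in step (8); EcoRI (GAATTC) and SalI (GTCGAC) are included as the cloning sites used in Example 1. The function names are illustrative.

```python
def reverse_complement(dna):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


BAD_MOTIFS = {
    "EcoRI": "GAATTC",
    "SalI": "GTCGAC",
    "Pribnow box": "TATAA",
    "Chi": "GCTGGCGG",
}


def find_bad_motifs(dna, motifs=BAD_MOTIFS):
    """Return sorted (name, strand, 1-based start) hits on either strand."""
    dna = dna.upper()
    hits = []
    for name, motif in motifs.items():
        for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
            if strand == "-" and pattern == motif:
                continue  # palindromic site: already reported on the + strand
            pos = dna.find(pattern)
            while pos != -1:
                hits.append((name, strand, pos + 1))
                pos = dna.find(pattern, pos + 1)
    return sorted(hits)
```

Any hit would then be removed by swapping in a synonymous codon and re-running the scan.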

[0332] The program did not always use the most frequent codon for amino acids such as cysteine, proline, and arginine. To change this, go back to the Edit CUT TAB and manually drag the rainbow colored bar to 100% for the desired codon. Then re-do start under the Output TAB.

[0333] The use of CGG for arginine can lead to very high GC content, so AGA can be used for arginine as an alternative. The difference in codon usage is 11.6 per thousand for CGG vs. 11.5 per thousand for AGA.
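The manual route (a), together with the arginine substitution just described, can be sketched in Python as follows. The per-thousand frequencies are transcribed from the human codon usage table above; the function names and the `overrides` parameter are illustrative and are not part of any tool named in the text.

```python
import re

# Human codon usage, frequency per thousand (from the table above)
USAGE = """
UUU 17.1 UCU 14.7 UAU 12.1 UGU 10.0 UUC 20.6 UCC 17.6 UAC 15.5 UGC 12.2
UUA 7.5 UCA 12.0 UAA 0.7 UGA 1.5 UUG 12.6 UCG 4.4 UAG 0.6 UGG 12.7
CUU 13.0 CCU 17.3 CAU 10.5 CGU 4.6 CUC 19.8 CCC 20.1 CAC 15.0 CGC 10.7
CUA 7.8 CCA 16.7 CAA 12.0 CGA 6.3 CUG 39.8 CCG 6.9 CAG 34.1 CGG 11.6
AUU 16.1 ACU 13.0 AAU 16.7 AGU 11.9 AUC 21.6 ACC 19.4 AAC 19.5 AGC 19.3
AUA 7.7 ACA 15.1 AAA 24.1 AGA 11.5 AUG 22.2 ACG 6.1 AAG 32.2 AGG 11.4
GUU 11.0 GCU 18.6 GAU 21.9 GGU 10.8 GUC 14.6 GCC 28.4 GAC 25.6 GGC 22.5
GUA 7.2 GCA 16.1 GAA 29.0 GGA 16.4 GUG 28.4 GCG 7.5 GAG 39.9 GGG 16.3
"""
FREQ = {c: float(f) for c, f in re.findall(r"([UCAG]{3}) ([\d.]+)", USAGE)}

# Standard genetic code with bases ordered U, C, A, G at each position
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {a + b + c: AMINO[16 * i + 4 * j + k]
        for i, a in enumerate(BASES)
        for j, b in enumerate(BASES)
        for k, c in enumerate(BASES)}


def most_frequent_codons(freq=FREQ):
    """Map each amino acid (and '*' for stop) to its most frequent codon."""
    best = {}
    for codon, aa in CODE.items():
        if aa not in best or freq[codon] > freq[best[aa]]:
            best[aa] = codon
    return best


def back_translate(protein, overrides=None):
    """Back-translate a protein to DNA using one codon per amino acid."""
    table = most_frequent_codons()
    table.update(overrides or {})  # e.g. {"R": "AGA"} to lower GC content
    return "".join(table[aa] for aa in protein.upper()).replace("U", "T")


def gc_fraction(dna):
    """Fraction of G and C nucleotides in a DNA string."""
    dna = dna.upper()
    return (dna.count("G") + dna.count("C")) / len(dna)
```

With this table the default arginine codon is CGG; passing `overrides={"R": "AGA"}` switches every arginine to AGA, which lowers the GC fraction of arginine-rich stretches at a cost of 0.1 per thousand in codon frequency.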

Splice Donor and Acceptor Site Search

[0334] (1) Log on to Berkeley Drosophila Genome Project Website at http://www.fruitfly.org/seq_tools/splice.html

[0335] (2) Check boxes for Human or other and both splice sites.

[0336] (3) Select minimum scores for 5' and 3' splice sites between 0 and 1.

[0337] The default setting of 0.4 was used.

[0338] Default minimum score is 0.4, where: TABLE-US-00077
                        % splice sites recognized   % false positives
Human 5' Splice sites   93.2%                       5.2%
Human 3' Splice sites   83.8%                       3.1%

[0339] (4) Paste in sequence.

[0340] (5) Submit.

[0341] (6) Based on predicted donors or acceptors, change the individual codons until the sites are no longer predicted.

Add in 5' and 3' Sequences.

[0342] On the 5' end of the gene sequence, the restriction enzyme site and Kozak sequence (gccacc) were added before the ATG. On the 3' end of the sequence, tca was added following the stop codon (tga on the opposite strand), followed by a restriction enzyme site. The GC content and Open Reading Frames were then checked in SEC Central.
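The flanking step can be sketched in Python. The sketch assumes the EcoRI and SalI sites and the leading GCC bases seen on the Example 1 primers (e.g., SEQ ID NOs:64 and 65); the function names and the default primer lengths are illustrative rather than taken from the text.

```python
def reverse_complement(dna):
    """Reverse complement of a DNA string containing only A, C, G, T."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


ECORI, SALI, KOZAK = "GAATTC", "GTCGAC", "GCCACC"
CLAMP = "GCC"  # leading bases on the Example 1 primers (assumed to be a clamp)


def add_flanks(orf):
    """5' EcoRI site + Kozak sequence + ORF + tca + 3' SalI site."""
    return ECORI + KOZAK + orf.upper() + "TCA" + SALI


def primers(orf, fwd_len=15, rev_len=21):
    """Forward and reverse primers covering the flanks plus the ORF ends.

    fwd_len / rev_len -- number of ORF nucleotides each primer anneals to
    (illustrative defaults).
    """
    insert = CLAMP + add_flanks(orf) + reverse_complement(CLAMP)
    fwd_end = len(CLAMP + ECORI + KOZAK) + fwd_len
    rev_start = len(insert) - len(CLAMP + SALI + "TCA") - rev_len
    return insert[:fwd_end], reverse_complement(insert[rev_start:])
```

Applied to an ORF that begins ATGAGCCTGCTGACC and ends GTGAGCATCGAGCTGGAGTGA, as the M2 open reading frame of SEQ ID NO:66 does, the sketch reproduces the primer sequences of SEQ ID NO:64 and SEQ ID NO:65. Because TGATCA reads the same on both strands, the added tca also places a tga stop on the opposite strand.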

Example 5

Preparation of Vaccine Formulations

[0343] Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, HA, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are formulated with the poloxamer CRL 1005 and BAK (Benzalkonium chloride 50% solution, available from Ruger Chemical Co. Inc.) by the following methods. Specific final concentrations of each component of the formulae are described in the following methods, but for any of these methods, the concentrations of each component may be varied by basic stoichiometric calculations known by those of ordinary skill in the art to make a final solution having the desired concentrations.

[0344] For example, the concentration of CRL 1005 is adjusted depending on, for example, transfection efficiency, expression efficiency, or immunogenicity, to achieve a final concentration of from about 1 mg/ml to about 75 mg/ml, for example, about 1 mg/ml, about 2 mg/ml, about 3 mg/ml, about 4 mg/ml, about 5 mg/ml, about 6.5 mg/ml, about 7 mg/ml, about 7.5 mg/ml, about 8 mg/ml, about 9 mg/ml, about 10 mg/ml, about 15 mg/ml, about 20 mg/ml, about 25 mg/ml, about 30 mg/ml, about 35 mg/ml, about 40 mg/ml, about 45 mg/ml, about 50 mg/ml, about 55 mg/ml, about 60 mg/ml, about 65 mg/ml, about 70 mg/ml, or about 75 mg/ml of CRL 1005.

[0345] Similarly, the concentration of DNA is adjusted depending on many factors, including the amount of a formulation to be delivered, the age and weight of the subject, the delivery method and route, and the immunogenicity of the antigen being delivered. In general, formulations of the present invention are adjusted to have a final concentration from about 1 ng/ml to about 30 mg/ml of plasmid (or other polynucleotide). For example, a formulation of the present invention may have a final concentration of about 1 ng/ml, about 5 ng/ml, about 10 ng/ml, about 50 ng/ml, about 100 ng/ml, about 500 ng/ml, about 1 µg/ml, about 5 µg/ml, about 10 µg/ml, about 50 µg/ml, about 200 µg/ml, about 400 µg/ml, about 600 µg/ml, about 800 µg/ml, about 1 mg/ml, about 2 mg/ml, about 2.5 mg/ml, about 3 mg/ml, about 3.5 mg/ml, about 4 mg/ml, about 4.5 mg/ml, about 5 mg/ml, about 5.5 mg/ml, about 6 mg/ml, about 7 mg/ml, about 8 mg/ml, about 9 mg/ml, about 10 mg/ml, about 20 mg/ml, or about 30 mg/ml of a plasmid.

[0346] Certain formulations of the present invention include a cocktail of plasmids (see, e.g., Example 2 supra) of the present invention, e.g., comprising coding regions encoding IV proteins NP, M1 and/or M2 and, optionally, plasmids encoding immunity enhancing proteins, e.g., cytokines. Various plasmids desired in a cocktail are combined together in PBS or other diluent prior to the addition to the other ingredients. Furthermore, plasmids may be present in a cocktail at equal proportions, or the ratios may be adjusted based on, for example, relative expression levels of the antigens or the relative immunogenicity of the encoded antigens. Thus, various plasmids in the cocktail may be present in equal proportion, or up to twice or three times as much of one plasmid may be included relative to other plasmids in the cocktail.

[0347] Additionally, the concentration of BAK may be adjusted depending on, for example, a desired particle size and improved stability. Indeed, in certain embodiments, formulations of the present invention include CRL 1005 and DNA, but are free of BAK. In general BAK-containing formulations of the present invention are adjusted to have a final concentration of BAK from about 0.05 mM to about 0.5 mM. For example, a formulation of the present invention may have a final BAK concentration of about 0.05 mM, 0.1 mM, 0.2 mM, 0.3 mM, 0.4 mM or 0.5 mM.

[0348] The total volume of the formulations produced by the methods below may be scaled up or down, by choosing apparatus of proportional size. Finally, in carrying out any of the methods described below, the three components of the formulation, BAK, CRL 1005, and plasmid DNA, may be added in any order. In each of these methods described below the term "cloud point" refers to the point in a temperature shift, or other titration, at which a clear solution becomes cloudy, i.e., when a component dissolved in a solution begins to precipitate out of solution.

Thermal Cycling of a Pre-Mixed Formulation

[0349] This example describes the preparation of a formulation comprising 0.3 mM BAK, 7.5 mg/ml CRL 1005, and 5 mg/ml of DNA in a total volume of 3.6 ml. The ingredients are combined together at a temperature below the cloud point and then the formulation is thermally cycled to room temperature (above the cloud point) several times, according to the protocol outlined in FIG. 2.

[0350] A 1.28 mM solution of BAK is prepared in PBS, 846 .mu.l of the solution is placed into a 15 ml round bottom flask fitted with a magnetic stirring bar, and the solution is stirred at moderate speed in an ice bath on top of a stirrer/hotplate (hotplate off) for 10 minutes. CRL 1005 (27 .mu.l) is then added using a 100 .mu.l positive displacement pipette and the solution is stirred for a further 60 minutes on ice. Plasmids comprising codon-optimized coding regions encoding, for example, NP, M1, and M2 as described herein, and optionally, additional plasmids comprising codon-optimized or non-codon-optimized coding regions encoding, e.g., additional IV proteins, and/or other proteins, e.g., cytokines, are mixed together at desired proportions in PBS to achieve 6.4 mg/ml total DNA. This plasmid cocktail is added dropwise, slowly, to the stirring solution over 1 min using a 5 ml pipette. The solution at this point (on ice) is clear since it is below the cloud point of the poloxamer and is further stirred on ice for 15 min. The ice bath is then removed, and the solution is stirred at ambient temperature for 15 minutes to produce a cloudy solution as the poloxamer passes through the cloud point.
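
The stock volumes used in this example follow approximately from the dilution relation C1 x V1 = C2 x V2. The sketch below recomputes them under stated assumptions: the helper name is hypothetical, and neat CRL 1005 is treated as roughly 1 mg per .mu.l (a density of about 1 g/ml), which is an assumption not stated in the text.

```python
# Sketch: stock volumes for a target formulation, using C1*V1 = C2*V2.
# Function and variable names are illustrative only.

def stock_volume_ul(final_conc, stock_conc, final_volume_ul):
    """Volume of stock (in ul) giving final_conc in final_volume_ul."""
    if stock_conc <= 0 or final_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be at least as concentrated as the target")
    return final_conc * final_volume_ul / stock_conc

FINAL_VOLUME_UL = 3600.0  # 3.6 ml total volume

# 0.3 mM BAK from a 1.28 mM stock.
bak_ul = stock_volume_ul(0.3, 1.28, FINAL_VOLUME_UL)

# 5 mg/ml DNA from a 6.4 mg/ml plasmid cocktail.
dna_ul = stock_volume_ul(5.0, 6.4, FINAL_VOLUME_UL)

# 7.5 mg/ml CRL 1005; ASSUMPTION: neat poloxamer is ~1000 mg/ml.
crl_ul = stock_volume_ul(7.5, 1000.0, FINAL_VOLUME_UL)

print(f"BAK: {bak_ul:.1f} ul, DNA: {dna_ul:.1f} ul, CRL 1005: {crl_ul:.1f} ul")
```

The computed BAK and CRL 1005 volumes are close to the 846 .mu.l and 27 .mu.l used in the example; the small differences are consistent with the concentrations being approximate ("about") values.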

[0351] The flask is then placed back into the ice bath and stirred for a further 15 minutes to produce a clear solution as the mixture is cooled below the poloxamer cloud point. The ice bath is again removed and the solution stirred at ambient temperature for a further 15 minutes. Stirring for 15 minutes above and below the cloud point (total of 30 minutes), is defined as one thermal cycle. The mixture is cycled six more times. The resulting formulation may be used immediately, or may be placed in a glass vial, cooled below the cloud point, and frozen at -80 .degree. C. for use at a later time.
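
The cycling protocol can be laid out as a simple timetable. This sketch assumes that "one thermal cycle" plus "six more" means seven cycles in total and that each cycle is counted as an ambient phase followed by an ice-bath phase; both readings, and all names in the code, are assumptions made for illustration.

```python
# Sketch: timetable for thermal cycling through the cloud point.
# Phase order and the total cycle count are assumed readings of the protocol.

def thermal_cycle_schedule(n_cycles, minutes_per_phase=15):
    """Return a list of (cycle, phase, start_min, end_min) tuples."""
    if n_cycles < 1 or minutes_per_phase <= 0:
        raise ValueError("need at least one cycle and a positive phase length")
    phases = ("ambient, above cloud point", "ice bath, below cloud point")
    steps = []
    elapsed = 0
    for cycle in range(1, n_cycles + 1):
        for phase in phases:
            steps.append((cycle, phase, elapsed, elapsed + minutes_per_phase))
            elapsed += minutes_per_phase
    return steps

schedule = thermal_cycle_schedule(7)  # one cycle plus six more (assumed)
total_minutes = schedule[-1][3]

for cycle, phase, start, end in schedule[:4]:
    print(f"cycle {cycle}: {phase}: {start}-{end} min")
print(f"total cycling time: {total_minutes} min")
```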

Thermal Cycling, Dilution and Filtration of a Pre-Mixed Formulation, Using Increased Concentrations of CRL 1005

[0352] This example describes the preparation of a formulation comprising 0.3 mM BAK, 34 mg/ml or 50 mg/ml CRL 1005, and 5.0 mg/ml of DNA in a final volume of 4.0 ml. The ingredients are combined together at a temperature below the cloud point, then the formulation is thermally cycled to room temperature (above the cloud point) several times, diluted, and filtered according to the protocol outlined in FIG. 3.

[0353] Plasmids comprising codon-optimized coding regions encoding, for example, NP, M1, and M2 as described herein, and optionally, additional plasmids comprising codon-optimized or non-codon-optimized coding regions encoding, e.g., additional IV proteins, and/or other proteins, e.g., cytokines, are mixed together at desired proportions in PBS to achieve 6.4 mg/ml total DNA. This plasmid cocktail is placed into a 15 ml round bottom flask fitted with a magnetic stirring bar; for the formulation containing 50 mg/ml CRL 1005, 3.13 ml of a solution containing about 3.2 mg/ml of NP-encoding plasmid and about 3.2 mg/ml of M2-encoding plasmid (about 6.4 mg/ml total DNA) is placed into a 15 ml round bottom flask fitted with a magnetic stirring bar. The solutions are stirred at moderate speed in an ice bath on top of a stirrer/hotplate (hotplate off) for 10 minutes. CRL 1005 (136 .mu.l for 34 mg/ml final concentration, and 200 .mu.l for 50 mg/ml final concentration) is then added using a 200 .mu.l positive displacement pipette and the solution is stirred for a further 30 minutes on ice. Solutions of 1.6 mM and 1.8 mM BAK are prepared in PBS, and 734 .mu.l of the 1.6 mM solution and 670 .mu.l of the 1.8 mM solution are then added dropwise, slowly, to the stirring 34 mg/ml and 50 mg/ml poloxamer mixtures, respectively, over 1 min using a 1 ml pipette. The solutions at this point are clear since they are below the cloud point of the poloxamer and are stirred on ice for 30 min. The ice baths are then removed, and the solutions are stirred at ambient temperature for 15 minutes to produce cloudy solutions as the poloxamer passes through the cloud point.

[0354] The flasks are then placed back into the ice baths and stirred for a further 15 minutes to produce clear solutions as the mixtures are cooled below the poloxamer cloud point. The ice baths are again removed and the solutions are stirred for a further 15 minutes. Stirring for 15 minutes above and below the cloud point (total of 30 minutes) is defined as one thermal cycle. The mixtures are cycled two more times.

[0355] In the meantime, two Steriflip.RTM. 50 ml disposable vacuum filtration devices, each with a 0.22 .mu.m Millipore Express.RTM. membrane (available from Millipore, cat # SCGP00525) are placed in an ice bucket, with a vacuum line attached and left for 1 hour to allow the devices to equilibrate to the temperature of the ice. The poloxamer formulations are then diluted to 2.5 mg/ml DNA with PBS and filtered under vacuum.
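
The dilution step affects every component proportionally. The sketch below, using hypothetical names, estimates the PBS volume needed and the resulting concentrations for the 50 mg/ml CRL 1005 formulation; the diluted CRL 1005 and BAK values are derived here by simple proportion and are not stated in the text.

```python
# Sketch: dilute a formulation until one component reaches a target value.
# All components are assumed to scale by the same dilution factor.

def dilute_to_target(concs, volume_ml, component, target):
    """Return (diluted concentrations, diluent volume in ml)."""
    current = concs[component]
    if target <= 0 or target > current:
        raise ValueError("target must be positive and not exceed current value")
    factor = current / target
    diluted = {name: value / factor for name, value in concs.items()}
    diluent_ml = volume_ml * (factor - 1)
    return diluted, diluent_ml

start = {"dna_mg_ml": 5.0, "crl1005_mg_ml": 50.0, "bak_mM": 0.3}
diluted, pbs_ml = dilute_to_target(start, 4.0, "dna_mg_ml", 2.5)

print(f"add {pbs_ml:.1f} ml PBS")
for name, value in diluted.items():
    print(f"{name}: {value:g}")
```

Under these assumptions a 4.0 ml batch takes an equal volume of PBS, and the poloxamer and BAK concentrations are halved along with the DNA.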

[0356] The resulting formulations may be used immediately, or may be transferred to glass vials, cooled below the cloud point, and frozen at -80 .degree. C. for use at a later time.

A Simplified Method Without Thermal Cycling

[0357] This example describes a simplified preparation of a formulation comprising 0.3 mM BAK, 7.5 mg/ml CRL 1005, and 5 mg/ml of DNA in a total volume of 2.0 ml. The ingredients are combined together at a temperature below the cloud point and then the formulation is simply filtered and then used or stored, according to the protocol outlined in FIG. 4.

[0358] A 0.77 mM solution of BAK is prepared in PBS, and 780 .mu.l of the solution is placed into a 15 ml round bottom flask fitted with a magnetic stirring bar, and the solution is stirred at moderate speed in an ice bath on top of a stirrer/hotplate (hotplate off) for 15 minutes. CRL 1005 (15 .mu.l) is then added using a 100 .mu.l positive displacement pipette and the solution is stirred for a further 60 minutes on ice. Plasmids comprising codon-optimized coding regions encoding, for example, NP, M1, and M2 as described herein, and optionally, additional plasmids comprising codon-optimized or non-codon-optimized coding regions encoding, e.g., additional IV proteins, and/or other proteins, e.g., cytokines, are mixed together at desired proportions in PBS to achieve a final concentration of about 8.3 mg/ml total DNA. This plasmid cocktail is added dropwise, slowly, to the stirring solution over 1 min using a 5 ml pipette. The solution at this point (on ice) is clear since it is below the cloud point of the poloxamer and is further stirred on ice for 15 min.

[0359] In the meantime, one Steriflip.RTM. 50 ml disposable vacuum filtration device, with a 0.22 .mu.m Millipore Express.RTM. membrane (available from Millipore, cat # SCGP00525), is placed in an ice bucket, with a vacuum line attached, and left for 1 hour to allow the device to equilibrate to the temperature of the ice. The poloxamer formulation is then filtered under vacuum, below the cloud point, and then allowed to warm above the cloud point. The resulting formulation may be used immediately, or may be transferred to glass vials, cooled below the cloud point and then frozen at -80 .degree. C. for use at a later time.

Example 6

Animal Immunizations

[0360] The immunogenicity of the various IV expression products encoded by the codon-optimized polynucleotides described herein is initially evaluated based on each plasmid's ability to mount an immune response in vivo. Plasmids are tested individually and in combinations by injecting single constructs as well as multiple constructs. Immunizations are initially carried out in animals, such as mice, rabbits, goats, sheep, non-human primates, or other suitable animals, by intramuscular (IM) injections. Serum is collected from immunized animals, and the antigen specific antibody response is quantified by ELISA assay using purified immobilized antigen proteins in a protein--immunized subject antibody--anti-species antibody type assay, according to standard protocols. The tests of immunogenicity further include measuring antibody titer, neutralizing antibody titer, T-cell proliferation, T-cell secretion of cytokines, and cytolytic T cell responses, and direct enumeration of antigen specific CD4+ and CD8+ T-cells. Correlations to protective levels of the immune responses in humans are made according to methods well known by those of ordinary skill in the art. See above.

A. DNA Formulations

[0361] Plasmid DNA is formulated with a poloxamer by any of the methods described in Example 3. Alternatively, plasmid DNA is prepared as described above and dissolved at a concentration of about 0.1 mg/ml to about 10 mg/ml, preferably about 1 mg/ml, in PBS with or without transfection-facilitating cationic lipids, e.g., DMRIE/DOPE at a 4:1 DNA:lipid mass ratio. Alternative DNA formulations include 150 mM sodium phosphate instead of PBS, adjuvants, e.g., Vaxfectin.TM. at a 4:1 DNA:Vaxfectin.TM. mass ratio, mono-phosphoryl lipid A (detoxified endotoxin) from S. minnesota (MPL) and trehalose dicorynomycolate (TDM), in 2% oil (squalene)-Tween 80-water (MPL+TDM, available from Sigma/Aldrich, St. Louis, Mo., (catalog # M6536)), a solubilized mono-phosphoryl lipid A formulation (AF, available from Corixa), or (.+-.)-N-(3-Acetoxypropyl)-N,N-dimethyl-2,3-bis(octyloxy)-1-propanaminium chloride (compound # VC1240) (see Shriver, J. W. et al., Nature 415:331-335 (2002), and P.C.T. Publication No. WO 02/00844 A2, each of which is incorporated herein by reference in its entirety).

B. Animal Immunizations

[0362] Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are injected into BALB/c mice as single plasmids or as cocktails of two or more plasmids, as either DNA in PBS or formulated with the poloxamer-based delivery system: 2 mg/ml DNA, 3 mg/ml CRL 1005, and 0.1 mM BAK. Groups of 10 mice are immunized three times, at biweekly intervals, and serum is obtained to determine antibody titers to each of the antigens. Groups are also included in which mice are immunized with a trivalent preparation, containing each of the three plasmid constructs in equal mass.

[0363] The immunization schedule is as follows:

TABLE-US-00078
Day -3    Pre-bleed
Day 0     Plasmid injections, intramuscular, bilateral in rectus femoris, 5-50 .mu.g/leg
Day 21    Plasmid injections, intramuscular, bilateral in rectus femoris, 5-50 .mu.g/leg
Day 49    Plasmid injections, intramuscular, bilateral in rectus femoris, 5-50 .mu.g/leg
Day 59    Serum collection

[0364] Serum antibody titers are determined by ELISA with recombinant proteins, peptides, or transfection supernatants and lysates from transfected VM-92 cells, or with live, inactivated, or lysed virus.

C. Immunization of Mice with Vaccine Formulations Using a Vaxfectin.TM. Adjuvant

[0365] Vaxfectin.TM. (a 1:1 molar ratio of the cationic lipid VC1052 and the neutral co-lipid DPyPE) is a synthetic cationic lipid formulation which has shown promise for its ability to enhance antibody titers against an antigen when administered with DNA intramuscularly to mice.

[0366] In mice, intramuscular injection of Vaxfectin.TM. formulated with NP DNA increased antibody titers up to 20-fold to levels that could not be reached with DNA alone. In rabbits, complexing DNA with Vaxfectin.TM. enhanced antibody titers up to 50-fold. Thus, Vaxfectin.TM. shows promise as a delivery system and as an adjuvant in a DNA vaccine.

[0367] Vaxfectin.TM. mixtures are prepared by mixing chloroform solutions of VC1052 cationic lipid with chloroform solutions of DPyPE neutral co-lipid. Dried films are prepared in 2 ml sterile glass vials by evaporating the chloroform under a stream of nitrogen, and placing the vials under vacuum overnight to remove solvent traces. Each vial contains 1.5 .mu.mole each of VC1052 and DPyPE. Liposomes are prepared by adding sterile water followed by vortexing. The resulting liposome solution is mixed with DNA at a phosphate mole:cationic lipid mole ratio of 4:1.

[0368] Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are mixed together at desired proportions in PBS to achieve a final concentration of 1.0 mg/ml. The plasmid cocktail, as well as the controls, are formulated with Vaxfectin.TM.. Groups of 5 BALB/c female mice are injected bilaterally in the rectus femoris muscle with 50 .mu.l of DNA solution (100 .mu.l total/mouse) on days 1, 21, and 49 with each formulation. Mice are bled for serum on days 0 (prebleed), 20 (bleed 1), 41 (bleed 2), and 62 (bleed 3), and up to 40 weeks post-injection. Antibody titers to the various IV proteins encoded by the plasmid DNAs are measured by ELISA as described elsewhere herein.

[0369] Cytolytic T-cell responses are measured as described in Hartikka et al., "Vaxfectin Enhances the Humoral Response to Plasmid DNA-encoded Antigens," Vaccine 19:1911-1923 (2001), which is incorporated herein in its entirety by reference. Standard ELISPOT technology is used for the CD4+ and CD8+ T-cell assays as described in Example 6, part A.

D. Production of NP, M1 or M2 Antisera in Animals

[0370] Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are prepared according to the immunization scheme described above and injected into a suitable animal for generating polyclonal antibodies. Serum is collected and the antibody titered as above.

[0371] Monoclonal antibodies are also produced using hybridoma technology (Kohler, et al., Nature 256:495 (1975); Kohler, et al., Eur. J. Immunol. 6:511 (1976); Kohler, et al., Eur. J. Immunol. 6:292 (1976); Hammerling, et al., in Monoclonal Antibodies and T-Cell Hybridomas, Elsevier, N.Y., (1981), pp. 563-681, each of which is incorporated herein by reference in its entirety). In general, such procedures involve immunizing an animal (preferably a mouse) as described above. The splenocytes of such mice are extracted and fused with a suitable myeloma cell line. Any suitable myeloma cell line may be employed in accordance with the present invention; however, it is preferable to employ the parent myeloma cell line (SP2/0), available from the American Type Culture Collection, Rockville, Md. After fusion, the resulting hybridoma cells are selectively maintained in HAT medium, and then cloned by limiting dilution as described by Wands et al., Gastroenterology 80:225-232 (1981), incorporated herein by reference in its entirety. The hybridoma cells obtained through such a selection are then assayed to identify clones which secrete antibodies capable of binding the various IV proteins.

[0372] Alternatively, additional antibodies capable of binding to IV proteins described herein may be produced in a two-step procedure through the use of anti-idiotypic antibodies. Such a method makes use of the fact that antibodies are themselves antigens, and that, therefore, it is possible to obtain an antibody which binds to a second antibody. In accordance with this method, various IV-specific antibodies are used to immunize an animal, preferably a mouse. The splenocytes of such an animal are then used to produce hybridoma cells, and the hybridoma cells are screened to identify clones which produce an antibody whose ability to bind to the IV protein-specific antibody can be blocked by the cognate IV protein. Such antibodies comprise anti-idiotypic antibodies to the IV protein-specific antibody and can be used to immunize an animal to induce formation of further IV-specific antibodies.

[0373] It will be appreciated that Fab and F(ab').sub.2 and other fragments of the antibodies of the present invention may be used according to the methods disclosed herein. Such fragments are typically produced by proteolytic cleavage, using enzymes such as papain (to produce Fab fragments) or pepsin (to produce F(ab').sub.2 fragments). Alternatively, NP, M1, M2, HA and eM2 binding fragments can be produced through the application of recombinant DNA technology or through synthetic chemistry.

[0374] It may be preferable to use "humanized" chimeric monoclonal antibodies. Such antibodies can be produced using genetic constructs derived from hybridoma cells producing the monoclonal antibodies described above. Methods for producing chimeric antibodies are known in the art. See, for review, Morrison, Science 229:1202 (1985); Oi, et al., BioTechniques 4:214 (1986); Cabilly, et al., U.S. Pat. No. 4,816,567; Taniguchi, et al., EP 171496; Morrison, et al., EP 173494; Neuberger, et al., WO 8601533; Robinson, et al., WO 8702671; Boulianne, et al., Nature 312:643 (1984); Neuberger, et al., Nature 314:268 (1985).

[0375] These antibodies are used, for example, in diagnostic assays, as a research reagent, or to further immunize animals to generate IV-specific anti-idiotypic antibodies. Non-limiting examples of uses for anti-IV antibodies include use in Western blots, ELISA (competitive, sandwich, and direct), immunofluorescence, immunoelectron microscopy, radioimmunoassay, immunoprecipitation, agglutination assays, immunodiffusion, immunoelectrophoresis, and epitope mapping (Weir, D. Ed. Handbook of Experimental Immunology, 4.sup.th ed. Vols. I and II, Blackwell Scientific Publications (1986)).

Example 7

Mucosal Vaccination and Electrically Assisted Plasmid Delivery

A. Mucosal DNA Vaccination

[0376] Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, HA, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, (100 .mu.g/50 .mu.l total DNA) are delivered to BALB/c mice at 0, 2 and 4 weeks via i.m., intranasal (i.n.), intravenous (i.v.), intravaginal (i.vag.), intrarectal (i.r.) or oral routes. The DNA is delivered unformulated or formulated with the cationic lipids DMRIE/DOPE (DD) or GAP-DLRIE/DOPE (GD). As endpoints, serum IgG titers against the various IV antigens are measured by ELISA and splenic T-cell responses are measured by antigen-specific production of IFN-gamma and IL-4 in ELISPOT assays. Standard chromium release assays are used to measure specific cytotoxic T lymphocyte (CTL) activity against the various IV antigens. Tetramer assays are used to detect and quantify antigen specific T-cells, with quantification being confirmed and phenotypic characterization accomplished by intracellular cytokine staining. In addition, IgG and IgA responses against the various IV antigens are analyzed by ELISA of vaginal washes.

B. Electrically-Assisted Plasmid Delivery

[0377] In vivo gene delivery may be enhanced through the application of brief electrical pulses to injected tissues, a procedure referred to herein as electrically-assisted plasmid delivery. See, e.g., Aihara, H. & Miyazaki, J. Nat. Biotechnol. 16:867-70 (1998); Mir, L. M. et al., Proc. Natl Acad. Sci. USA 96:4262-67 (1999); Hartikka, J. et al., Mol. Ther. 4:407-15 (2001); and Mir, L. M. et al.; Rizzuto, G. et al., Hum Gene Ther 11:1891-900 (2000); Widera, G. et al, J. of Immuno. 164: 4635-4640 (2000). Electrical pulses for cell electropermeabilization have been used to introduce foreign DNA into prokaryotic and eukaryotic cells in vitro. Cell permeabilization can also be achieved locally, in vivo, using electrodes and optimal electrical parameters that are compatible with cell survival.

[0378] The electroporation procedure can be performed with various electroporation devices. These devices include external plate type electrodes or invasive needle/rod electrodes and can possess two electrodes or multiple electrodes placed in an array. Distances between the plate or needle electrodes can vary depending upon the number of electrodes, size of target area and treatment subject.

[0379] The TriGrid needle array, used in examples described herein, is a three electrode array comprising three elongate electrodes in the approximate shape of a geometric triangle. Needle arrays may include single, double, three, four, five, six or more needles arranged in various array formations. The electrodes are connected through conductive cables to a high voltage switching device that is connected to a power supply.

[0380] The electrode array is placed into the muscle tissue, around the site of nucleic acid injection, to a depth of approximately 3 mm to 3 cm. The depth of insertion varies depending upon the target tissue and size of patient receiving electroporation. After injection of foreign nucleic acid, such as plasmid DNA, and a period of time sufficient for distribution of the nucleic acid, square wave electrical pulses are applied to the tissue. The amplitude of each pulse ranges from about 100 volts to about 1500 volts, e.g., about 100 volts, about 200 volts, about 300 volts, about 400 volts, about 500 volts, about 600 volts, about 700 volts, about 800 volts, about 900 volts, about 1000 volts, about 1100 volts, about 1200 volts, about 1300 volts, about 1400 volts, or about 1500 volts or about 1-1.5 kV/cm, based on the spacing between electrodes. Each pulse has a duration of about 1 .mu.s to about 1000 .mu.s, e.g., about 1 .mu.s, about 10 .mu.s, about 50 .mu.s, about 100 .mu.s, about 200 .mu.s, about 300 .mu.s, about 400 .mu.s, about 500 .mu.s, about 600 .mu.s, about 700 .mu.s, about 800 .mu.s, about 900 .mu.s, or about 1000 .mu.s, and a pulse frequency on the order of about 1-10 Hz. The polarity of the pulses may be reversed during the electroporation procedure by switching the connectors to the pulse generator. Pulses are repeated multiple times. The electroporation parameters (e.g., voltage amplitude, duration of pulse, number of pulses, depth of electrode insertion and frequency) will vary based on target tissue type, number of electrodes used and distance of electrode spacing, as would be understood by one of ordinary skill in the art.

[0381] Immediately after completion of the pulse regimen, subjects receiving electroporation can be optionally treated with membrane stabilizing agents to prolong cell membrane permeability as a result of the electroporation. Examples of membrane stabilizing agents include, but are not limited to, steroids (e.g., dexamethasone, methylprednisone and progesterone), angiotensin II and vitamin E. A single dose of dexamethasone, approximately 0.1 mg per kilogram of body weight, should be sufficient to achieve a beneficial effect.

[0382] EAPD techniques such as electroporation can also be used for plasmids contained in liposome formulations. The liposome--plasmid suspension is administered to the animal or patient and the site of injection is treated with a safe but effective electrical field generated, for example, by a TriGrid needle array. The electroporation may aid in plasmid delivery to the cell by destabilizing the liposome bilayer so that membrane fusion between the liposome and the target cellular structure occurs. Electroporation may also aid in plasmid delivery to the cell by triggering the release of the plasmid, in high concentrations, from the liposome at the surface of the target cell so that the plasmid is driven across the cell membrane by a concentration gradient via the pores created in the cell membrane as a result of the electroporation.

[0383] Female BALB/c mice aged 8-10 weeks are anesthetized with inhalant isoflurane and maintained under anesthesia for the duration of the electroporation procedure. The legs are shaved prior to treatment. Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, HA, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are administered to BALB/c mice (n=10) via unilateral injection in the quadriceps with 25 .mu.g total of plasmid DNA per mouse using a 0.3 cc insulin syringe and a 26 gauge, 1/2 length needle fitted with a plastic collar to regulate injection depth. Approximately one minute after injection, electrodes are applied. Modified caliper electrodes are used to apply the electrical pulse. See Hartikka, J. et al., Mol. Ther. 4:407-415 (2001). The caliper electrode plates are coated with conductivity gel and applied to the sides of the injected muscle before closing to a gap of 3 mm for administration of pulses. EAPD is applied using a square pulse type at 1-10 Hz with a field strength of 100-500 V/cm, 1-10 pulses, of 10-100 ms each.
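
The field strengths above can be translated into nominal applied voltages for the stated electrode gap. The following sketch assumes an idealized uniform field between parallel plates, so that voltage equals field strength multiplied by gap; the function name is hypothetical and tissue effects are ignored.

```python
# Sketch: nominal voltage for a target field strength across a plate gap.
# ASSUMPTION: uniform field between parallel plates (V = E * d).

def plate_voltage(field_v_per_cm, gap_mm):
    """Return the nominal voltage (V) for a field strength and gap."""
    if field_v_per_cm <= 0 or gap_mm <= 0:
        raise ValueError("field strength and gap must be positive")
    return field_v_per_cm * gap_mm / 10.0  # convert mm to cm

GAP_MM = 3.0  # caliper electrode gap

low_v = plate_voltage(100, GAP_MM)
high_v = plate_voltage(500, GAP_MM)

print(f"nominal voltage range: {low_v:.0f}-{high_v:.0f} V")
```

Under this idealization, a 100-500 V/cm field across a 3 mm gap corresponds to roughly 30-150 V at the pulse generator.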

[0384] Mice are vaccinated .+-.EAPD at 0, 2 and 4 weeks. As endpoints, serum IgG titers against the various IV antigens are measured by ELISA and splenic T-cell responses are measured by antigen-specific production of IFN-gamma and IL-4 in ELISPOT assays. Standard chromium release assays are used to measure specific cytotoxic T lymphocyte (CTL) activity against the various IV antigens.

[0385] Rabbits (n=3) are given bilateral injections in the quadriceps muscle with plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, HA, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector. The implantation area is shaved and the TriGrid electrode array is implanted into the target region of the muscle. 3.0 mg of plasmid DNA is administered per dose through the injection port of the electrode array. An injection collet is used to control the depth of injection. Electroporation begins approximately one minute after injection of the plasmid DNA is complete. Electroporation is administered with a TriGrid needle array, with electrodes evenly spaced 7 mm apart, using an Ichor TGP-2 pulse generator. The array is inserted into the target muscle to a depth of about 1 to 2 cm. 4-8 pulses are administered. Each pulse has a duration of about 50-100 .mu.s, an amplitude of about 1-1.2 kV/cm and a pulse frequency of 1 Hz. The injection and electroporation may be repeated.

[0386] Sera are collected from vaccinated rabbits at various time points. As endpoints, serum IgG titers against the various IV antigens are measured by ELISA, and PBMC T-cell proliferative responses are measured.

[0387] To test the effect of electroporation on therapeutic protein expression in non-human primates, male or female rhesus monkeys are given either 2 or 6 i.m. injections of plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, (0.1 to 10 mg DNA total per animal). Target muscle groups include, but are not limited to, bilateral rectus femoris, cranial tibialis, biceps, gastrocnemius or deltoid muscles. The target area is shaved and a needle array, comprising between 4 and 10 electrodes, spaced between 0.5-1.5 cm apart, is implanted into the target muscle. Once injections are complete, a sequence of brief electrical pulses is applied to the electrodes implanted in the target muscle using an Ichor TGP-2 pulse generator. The pulses have an amplitude of approximately 120-200 V. The pulse sequence is completed within one second. During this time, the target muscle may make brief contractions or twitches. The injection and electroporation may be repeated.

[0388] Sera are collected from vaccinated monkeys at various time points. As endpoints, serum IgG titers against the various IV antigens are measured by ELISA and PBMC T-cell proliferative responses are measured by antigen-specific production of IFN-gamma and IL-4 in ELISPOT assays or by tetramer assays to detect and quantify antigen specific T-cells, with quantification being confirmed and phenotypic characterization accomplished by intracellular cytokine staining. Standard chromium release assays are used to measure specific cytotoxic T lymphocyte (CTL) activity against the various IV antigens.

Example 8

Combinatorial DNA Vaccine Using Heterologous Prime-Boost Vaccination

[0389] This Example describes vaccination with a combinatorial formulation including one or more polynucleotides comprising one or more codon-optimized coding regions encoding an IV protein or fragment, variant, or derivative thereof prepared with an adjuvant and/or transfection facilitating agent; and also an isolated IV protein or fragment, variant, or derivative thereof. Thus, antigen is provided in two forms. The exogenous isolated protein stimulates antigen specific antibody and CD4+ T-cell responses, while the polynucleotide-encoded protein, produced as a result of cellular uptake and expression of the coding region, stimulates a CD8+ T-cell response. Unlike conventional "prime-boost" vaccination strategies, this approach provides different forms of antigen in the same formulation. Because antigen expression from the DNA vaccine does not peak until 7-10 days after injection, the DNA vaccine provides a boost for the protein component. Furthermore, the formulation takes advantage of the immunostimulatory properties of the bacterial plasmid DNA.

A. Non-Codon Optimized NP Gene

[0390] This example demonstrates the efficacy of this procedure using a non-codon-optimized polynucleotide encoding NP; however, the methods described herein are applicable to any IV polynucleotide vaccine formulation. Because only a small amount of protein is needed in this method, it is conceivable that the approach could be used to reduce the dose of conventional vaccines, thus increasing the availability of scarce or expensive vaccines. This feature would be particularly important for vaccines against pandemic influenza or biological warfare agents.

[0391] An injection dose of 10 .mu.g influenza A/PR/8/34 nucleoprotein (NP) DNA per mouse, prepared essentially as described in Ulmer, J. B., et al., Science 259:1745-49 (1993) and Ulmer, J. B. et al., J. Virol. 72:5648-53 (1998), was pre-determined in dose response studies to induce T cell and antibody responses in the linear range of the dose response and to result in a response rate of greater than 95% of mice injected. Each formulation, NP DNA alone, or NP DNA.+-.NP protein formulated with Ribi I or the cationic lipids, DMRIE:DOPE or Vaxfectin.TM., was prepared in the recommended buffer for that vaccine modality. For injections with NP DNA formulated with cationic lipid, the DNA was diluted in 2.times. PBS to 0.2 mg/ml.+-.purified recombinant NP protein (produced in baculovirus as described in Example 2) at 0.08 mg/ml. Each cationic lipid was reconstituted from a dried film by adding 1 ml of sterile water for injection (SWFI) to each vial and vortexing continuously for 2 min., then diluted with SWFI to a final concentration of 0.15 mM. Equal volumes of NP DNA (.+-.NP protein) and cationic lipid were mixed to obtain a DNA to cationic lipid molar ratio of 4:1. For injections with DNA containing Ribi I adjuvant (Sigma), Ribi I was reconstituted with saline to twice the final concentration. Ribi I (2.times.) was mixed with an equal volume of NP DNA at 0.2 mg/ml in saline.+-.NP protein at 0.08 mg/ml. For immunizations without cationic lipid or Ribi, NP DNA was prepared in 150 mM sodium phosphate buffer, pH 7.2. For each experiment, groups of 9 BALB/c female mice at 7-9 weeks of age were injected with 50 .mu.l of NP DNA.+-.NP protein, cationic lipid or Ribi I. Injections were given bilaterally in each rectus femoris at day 0 and day 21. The mice were bled by OSP on day 20 and day 33 and serum titers of individual mice were measured.
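
The concentrations in this paragraph can be checked with a short calculation. The sketch below is illustrative only and rests on three assumptions that are not stated in the text: an average mass of about 330 g/mol per nucleotide, a DNA to cationic lipid molar ratio counted as moles of nucleotide phosphate per mole of cationic lipid, and an injection of 50 .mu.l into each of the two legs. All function and variable names are hypothetical.

```python
# Illustrative check of the formulation arithmetic. The names and the
# 330 g/mol average nucleotide mass are assumptions, not taken from the text.

AVG_NUCLEOTIDE_MASS_G_PER_MOL = 330.0  # assumed average mass per nucleotide


def nucleotide_mM(dna_mg_per_ml: float) -> float:
    """Convert a DNA concentration in mg/ml to mM of nucleotide phosphate."""
    # mg/ml equals g/L; g/L divided by g/mol gives mol/L; times 1000 gives mM.
    return dna_mg_per_ml / AVG_NUCLEOTIDE_MASS_G_PER_MOL * 1000.0


def dose_per_mouse_ug(stock_mg_per_ml: float, volume_per_leg_ul: float, legs: int = 2) -> float:
    """Dose delivered after a 1:1 mix of the stock with lipid or adjuvant."""
    final_mg_per_ml = stock_mg_per_ml / 2.0            # equal-volume mixing halves the concentration
    total_volume_ml = volume_per_leg_ul * legs / 1000.0
    return final_mg_per_ml * total_volume_ml * 1000.0  # convert mg to ug


# Equal volumes are mixed, so the ratio of the two stock concentrations
# (0.2 mg/ml DNA, 0.15 mM cationic lipid) is also the ratio in the mixture.
dna_to_lipid_ratio = nucleotide_mM(0.2) / 0.15
dna_dose = dose_per_mouse_ug(0.2, 50)       # NP DNA stock at 0.2 mg/ml
protein_dose = dose_per_mouse_ug(0.08, 50)  # NP protein stock at 0.08 mg/ml

print(round(dna_to_lipid_ratio, 1), round(dna_dose, 3), round(protein_dose, 3))
```

Under these assumptions the molar ratio comes out close to the stated 4:1 and the DNA dose matches the stated 10 .mu.g per mouse.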

[0392] NP specific serum antibody titers were determined by indirect binding ELISA using 96 well ELISA plates coated overnight at 4.degree. C. with purified recombinant NP protein at 0.5 .mu.g per well in BBS buffer pH 8.3. NP coated wells were blocked with 1% bovine serum albumin in BBS for 1 h at room temperature. Two-fold serial dilutions of sera in blocking buffer were incubated for 2 h at room temperature and detected by incubating with alkaline phosphatase conjugated (AP) goat anti-mouse IgG-Fc (Jackson Immunoresearch, West Grove, Pa.) at 1:5000 for 2 h at room temperature. Color was developed with 1 mg/ml para-nitrophenyl phosphate (Calbiochem, La Jolla, Calif.) in 50 mM sodium bicarbonate buffer, pH 9.8 and 1 mM MgCl.sub.2 and the absorbance read at 405 nm. The titer is the reciprocal of the last dilution exhibiting an absorbance value 2 times that of pre-bleed samples.
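
The endpoint titer rule in the last sentence can be sketched in a few lines. This is a minimal illustration, assuming that "2 times that of pre-bleed samples" means an absorbance at or above twice the pre-bleed reading; the function name, argument layout and example readings are hypothetical.

```python
# Minimal sketch of an endpoint-titer calculation. The function name,
# the ">= cutoff" interpretation and the example numbers are assumptions.

def endpoint_titer(dilution_factors, absorbances, prebleed_absorbance, cutoff_multiple=2.0):
    """Return the reciprocal of the last dilution whose A405 reaches the cutoff.

    dilution_factors: reciprocal dilutions in increasing order, e.g. [100, 200, 400]
    absorbances: A405 readings for the same wells, in the same order
    prebleed_absorbance: A405 of the pre-bleed (background) sample
    Returns None if no dilution reaches the cutoff.
    """
    cutoff = cutoff_multiple * prebleed_absorbance
    titer = None
    for factor, a405 in zip(dilution_factors, absorbances):
        if a405 >= cutoff:
            titer = factor  # keep updating so the LAST qualifying dilution wins
    return titer


# Hypothetical two-fold dilution series starting at 1:100
factors = [100 * 2 ** i for i in range(6)]        # 100, 200, 400, 800, 1600, 3200
readings = [1.80, 1.20, 0.70, 0.35, 0.15, 0.08]   # made-up A405 values
print(endpoint_titer(factors, readings, prebleed_absorbance=0.10))
```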

[0393] Standard ELISPOT technology, used to identify the number of interferon gamma (IFN-.gamma.) secreting cells after stimulation with specific antigen (spot forming cells per million splenocytes, expressed as SFU/million), was used for the CD4+ and CD8+ T-cell assays. For the screening assays, 3 mice from each group were sacrificed on day 34, 35, and 36. At the time of collection, spleens from each group were pooled, and single cell suspensions made in cell culture media using a dounce homogenizer. Red blood cells were lysed, and cells washed and counted. For the CD4+ and CD8+ assays, cells were serially diluted 3-fold, starting at 10.sup.6 cells per well and transferred to 96 well ELISPOT plates pre-coated with anti-murine IFN-.gamma. monoclonal antibody. Spleen cells were stimulated with the H-2K.sup.d binding peptide, TYQRTRALV (SEQ ID NO:81), at 1 .mu.g/ml and recombinant murine IL-2 at 1 U/ml for the CD8+ assay and with purified recombinant NP protein at 20 .mu.g/ml for the CD4+ assay. Cells were stimulated for 20-24 hours at 37.degree. C. in 5% CO.sub.2, then the cells were washed out and biotin labeled anti-IFN-.gamma. monoclonal antibody added for a 2 hour incubation at room temperature. Plates were washed and horseradish peroxidase-labeled avidin was added. After a 1-hour incubation at room temperature, AEC substrate was added and "spots" developed for 15 min. Spots were counted using the Immunospot automated spot counter (C.T.L. Inc., Cleveland, Ohio). Thus, CD4+ and CD8+ responses were measured in three separate assays, using spleens collected on each of three consecutive days.
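
The conversion from a raw spot count to SFU/million can be sketched as follows. This is an illustration only: the function names, the number of wells in the series and the spot counts are hypothetical, and the text does not say how counts from different dilutions were combined.

```python
# Sketch of scaling ELISPOT spot counts to SFU per million splenocytes.
# Names, well count and spot counts are hypothetical.

def sfu_per_million(spot_count, cells_per_well):
    """Scale a raw spot count to spot forming units per 10^6 splenocytes."""
    if cells_per_well <= 0:
        raise ValueError("cells_per_well must be positive")
    return spot_count * 1_000_000 / cells_per_well


def dilution_series(start_cells=1_000_000, fold=3, wells=4):
    """Cell inputs per well for a serial dilution, e.g. 3-fold from 10^6 cells."""
    return [start_cells / fold ** i for i in range(wells)]


cells = dilution_series()
spots = [240, 82, 27, 9]  # made-up spot counts, one per dilution
values = [sfu_per_million(s, c) for s, c in zip(spots, cells)]
print([round(v) for v in values])
```

Normalizing every well to 10.sup.6 cells makes counts from different dilutions directly comparable, which is useful when the most concentrated wells are too dense to count.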

[0394] Three weeks after a single injection, antibody responses in mice receiving vaccine formulations containing purified protein were 6 to 8-fold higher than for mice receiving NP DNA only (FIG. 5, Table 15). The titers for mice receiving DNA and protein formulated with a cationic lipid were similar to those for mice receiving protein in Ribi adjuvant or DNA and protein in Ribi adjuvant. These data indicate that the levels of antibody seen when protein is injected with an adjuvant can be obtained with DNA vaccines containing DNA and protein formulated with a cationic lipid, without the addition of conventional adjuvant.

[0395] Twelve days after a second injection, antibody responses in mice receiving vaccine formulations containing purified protein were 9 to 129-fold higher than for mice receiving NP DNA only (FIG. 6, Table 15). With a mean anti-NP antibody titer of 750,933 at day 33, the titers for mice receiving DNA and protein formulated with Vaxfectin.TM. were 25-fold higher than for mice receiving DNA alone (mean titer=30,578), and nearly as high as those for mice injected with protein in Ribi adjuvant (mean titer=1,748,133).

TABLE 15
Fold increase in antibody response over DNA alone

Formulation                        20 days after one injection   12 days after second injection
protein + Ribi                     7X (p = 0.0002)               57X (p = 0.002)
DNA + protein + DMRIE:DOPE         6X (p = 0.00005)              9X (p = 0.0002)
DNA + protein + Vaxfectin.TM.      8X (p = 0.00003)              25X (p = 0.0004)
DNA + protein + Ribi               7X (p = 0.01)                 129X (p = 0.003)

*protein = purified recombinant NP protein
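
The fold increases quoted in this paragraph follow directly from the mean titers. The short sketch below recomputes them; the function and dictionary names are illustrative, and only the three mean titers given in the text are used.

```python
# Recompute fold increase over DNA alone from the day 33 mean titers
# quoted in paragraph [0395]. Names are illustrative.

def fold_increase(titer, reference_titer):
    """Ratio of a group's mean titer to the reference group's mean titer."""
    return titer / reference_titer


dna_alone = 30_578
day33_mean_titers = {
    "DNA + protein + Vaxfectin": 750_933,
    "protein + Ribi": 1_748_133,
}

for name, titer in day33_mean_titers.items():
    print(name, round(fold_increase(titer, dna_alone)))
```

Rounded to whole numbers, the two ratios agree with the 25X and 57X entries in the second data column of Table 15.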

[0396] As expected, an NP specific CD8+ T-cell IFN-.gamma. response was not detected in spleens of mice injected with NP protein in Ribi (FIG. 7). All of the other groups had detectable NP specific CD8+ T-cell responses. The CD8+ T-cell responses for all groups receiving vaccine formulations containing NP DNA were not statistically different from each other.

[0397] Mice from all of the groups had detectable NP specific CD4+ T-cell responses (FIG. 8). The CD4+ T-cell responses of splenocytes from groups receiving vaccine formulations containing NP DNA and NP protein formulated with cationic lipid were 2-6 fold higher than the group injected with DNA alone.

B. Codon-Optimized IV Constructs

[0398] Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are used in the prime-boost compositions described herein. For the prime-boost modalities, the same protein may be used for the boost, e.g., DNA encoding NP with NP protein, or a heterologous boost may be used, e.g., DNA encoding NP with an M1 protein boost. Each formulation, the plasmid comprising a coding region for the IV protein alone, or the plasmid comprising a coding region for the IV protein plus the isolated protein are formulated with Ribi I or the cationic lipids, DMRIE:DOPE or Vaxfectin.TM.. The formulations are prepared in the recommended buffer for that vaccine modality. Exemplary formulations, using NP as an example, are described herein. Other plasmid/protein formulations, including multivalent formulations, can be easily prepared by one of ordinary skill in the art by following this example. For injections with DNA formulated with cationic lipid, the DNA is diluted in 2.times. PBS to 0.2 mg/ml.+-.purified recombinant NP protein at 0.08 mg/ml. Each cationic lipid is reconstituted from a dried film by adding 1 ml of sterile water for injection (SWFI) to each vial and vortexing continuously for 2 min., then diluted with SWFI to a final concentration of 0.15 mM. Equal volumes of NP DNA (.+-.NP protein) and cationic lipid are mixed to obtain a DNA to cationic lipid molar ratio of 4:1. For injections with DNA containing Ribi I adjuvant (Sigma), Ribi I is reconstituted with saline to twice the final concentration. Ribi I (2.times.) is mixed with an equal volume of NP DNA at 0.2 mg/ml in saline.+-.NP protein at 0.08 mg/ml. 
For immunizations without cationic lipid or Ribi, NP DNA is prepared in 150 mM sodium phosphate buffer, pH 7.2. For each experiment, groups of 9 BALB/c female mice at 7-9 weeks of age are injected with 50 .mu.l of NP DNA.+-.NP protein, cationic lipid or Ribi I. The formulations are administered to BALB/c mice (n=10) via bilateral injection in each rectus femoris at day 0 and day 21.

[0399] The mice are bled on day 20 and day 33 and serum titers of individual mice to the various IV antigens are measured. Serum antibody titers specific for the various IV antigens are determined by ELISA. Standard ELISPOT technology, used to identify the number of interferon gamma (IFN-.gamma.) secreting cells after stimulation with specific antigen (spot forming cells per million splenocytes, expressed as SFU/million), is used for the CD4+ and CD8+ T-cell assays using 3 mice from each group vaccinated above, sacrificed on day 34, 35 and 36, post vaccination.

Example 9

Murine Challenge Model of Influenza

General Experimental Procedure

[0400] A murine challenge model with influenza A virus is used to test the efficacy of the immunotherapies. The model used is based on that described in Ulmer, J. B., et al., Science 259:1745-49 (1993) and Ulmer, J. B. et al., J Virol. 72:5648-53 (1998), both of which are incorporated herein by reference in their entireties. This model utilizes a mouse-adapted strain of influenza A/HK/8/68 which replicates in mouse lungs and is titered in tissue culture in Madin Darby Canine Kidney cells. The LD.sub.90 of this mouse-adapted influenza virus is determined in female BALB/c mice age 13-15 weeks. In this model, two types of challenge study can be conducted: lethal challenge, where the virus is administered intranasally to heavily sedated mice under ketamine anesthesia; and a sub-lethal challenge, where mice are not anesthetized when the viral inoculum is administered (also intranasally). The endpoint for lethal challenge is survival, but loss in body mass and body temperature can also be monitored. The read-outs for the sublethal challenge include lung virus titer and loss in body mass and body temperature.

[0401] In the studies described here, mice are subjected to lethal challenge. Mice that are previously vaccinated with DNA encoding IV antigens are anesthetized and challenged intranasally with 0.02 mL of mouse-adapted influenza A/HK/8/68 (mouse passage #6), diluted 1 to 10,000 (500 PFU) in PBS containing 0.2% wt/vol BSA.
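
The relationship between inoculum volume, dilution and challenge dose can be sketched as below. The calculation assumes that 500 PFU is the dose contained in the 0.02 mL inoculum of the 1 to 10,000 dilution; the stock titer it yields is an inference from those numbers, not a value stated in the text. Function names are hypothetical.

```python
# Sketch of the challenge-dose arithmetic, assuming 500 PFU is delivered
# in 0.02 mL of a 1:10,000 dilution. Names are hypothetical.

def implied_stock_titer_pfu_per_ml(dose_pfu, inoculum_ml, dilution_factor):
    """Stock titer implied by a dose delivered in a given volume of a dilution."""
    return dose_pfu / inoculum_ml * dilution_factor


def inoculum_pfu(stock_pfu_per_ml, inoculum_ml, dilution_factor):
    """Dose delivered when a stock is diluted and a fixed volume is given."""
    return stock_pfu_per_ml / dilution_factor * inoculum_ml


stock = implied_stock_titer_pfu_per_ml(500, 0.02, 10_000)
print(f"implied stock titer: {stock:.2e} PFU/mL")
print(f"dose at 1:10,000: {inoculum_pfu(stock, 0.02, 10_000):.0f} PFU")
```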

[0402] These challenge studies utilize groups of 10 mice. The route of administration is intramuscular in rectus femoris (quadriceps), using 0.1 .mu.g up to 1 mg total plasmid DNA. Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are tested singly and in multivalent cocktails for the ability to protect against challenge. The plasmids are formulated with an adjuvant and/or a transfection facilitating agent, e.g., Vaxfectin.TM. by methods described elsewhere herein. Mice are vaccinated on days 0 and 21 using amounts of plasmids as described in Example 6. Subsequent injections can be administered. Nasal challenge of mice takes place 3 weeks after the final immunization, and animals are monitored daily for body mass, hypothermia, general appearance and then death.

[0403] For each group of mice that are studied, blood is taken at 2 weeks following the second injection, and/or any subsequent injection, and the animals are terminally bled two weeks following the last injection. Antibody titers are determined for M2, M1, and NP using ELISAs as previously described.

Plasmids

[0404] As described above, constructs of the present invention were inserted into the expression vector VR10551. VR10551 is an expression vector without any transgene insert.

[0405] VR4750 contains the coding sequence for hemagglutinin (HA) (H3N2) from mouse adapted A/Hong Kong/68. The DNA was prepared using Qiagen plasmid purification kits.

Experimental Procedure

[0406] The experimental procedure for the following example is as described above, with particular parameters and materials employed as described herein. In order to provide a pDNA control for protection in the mouse influenza challenge model, the hemagglutinin (HA) gene was cloned from the influenza A/HK/8/68 challenge virus stock, which was passaged 6 times in mice.

[0407] Mice were vaccinated twice at 3 week intervals with either 100 .mu.g pDNA VR4750 encoding the HA gene cloned directly from the mouse-adapted influenza A/HK/8/68 strain, or with 100 .mu.g blank vector pDNA (VR10551). An additional control group was immunized intranasally with live A/HK/8/68 virus (500 PFU). Three weeks after the last injection, mice were challenged intranasally with mouse-adapted influenza A/HK/8/68 with one of 3 doses (50, 500 and 5,000 PFU). Following viral challenge, mice were monitored daily for symptoms of disease, loss in body mass and survival.

[0408] FIG. 9 shows that homologous HA-pDNA vaccinated mice were completely protected over a range of viral challenge doses (FIG. 9A) and did not suffer significant weight loss (FIG. 9B) during the 3 week period following challenge.

[0409] Based on these results, future mouse flu challenge studies can include VR4750 (HA) pDNA as a positive control for protection and utilize 500 PFU, which is the LD.sub.90 for this mouse-adapted virus, as the challenge dose.

Example 10

Challenge in Non-Human Primates

[0410] The purpose of these studies is to evaluate three or more of the optimal plasmid DNA vaccine formulations for immunogenicity in non-human primates. Rhesus or cynomolgus monkeys (6/group) are vaccinated with plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, HA, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, intramuscularly at 0.1 to 2 mg DNA combined with cationic lipid, and/or poloxamer and/or aluminum phosphate based or other adjuvants at 0, 1 and 4 months.

[0411] Blood is drawn twice at baseline and then again at the time of and two weeks following each vaccination, and then again 4 months following the last vaccination. At 2 weeks post-vaccination, plasma is analyzed for humoral response and PBMCs are monitored for cellular responses, by standard methods described herein. Animals are monitored for 4 months following the final vaccination to determine the durability of the immune response.

[0412] Animals are challenged within 2-4 weeks following the final vaccination. Animals are challenged intratracheally with the suitable dose of virus based on preliminary challenge studies. Nasal swabs, pharyngeal swabs and lung lavages are collected at days 0, 2, 4, 6, 8 and 11 post-challenge and will be assayed for cell-free virus titers on monkey kidney cells. After challenge, animals are monitored for clinical symptoms, e.g., rectal temperature, body weight, leukocyte counts, and in addition, hematocrit and respiratory rate. Oropharyngeal swab samples are taken to allow determination of the length of viral shedding. Illness is scored using the system developed by Berendt & Hall (Infect Immun 16:476-479 (1977)), and will be analyzed by analysis of variance and the method of least significant difference.
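
The statistical analysis named in the last sentence can be sketched as a one-way analysis of variance followed by a least significant difference (LSD) threshold for pairwise comparisons. The sketch below is a generic textbook version, not necessarily the exact procedure intended here. The illness scores are invented for illustration, the function names are hypothetical, and the t critical value is supplied by the caller (2.447 is the two-sided 5% value for 6 degrees of freedom).

```python
# Generic one-way ANOVA and LSD sketch. Scores are invented and the
# function names are hypothetical.
from math import sqrt


def one_way_anova(groups):
    """Return (F, df_between, df_within, mean_square_error) for a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    mse = ss_within / df_within
    f_stat = (ss_between / df_between) / mse
    return f_stat, df_between, df_within, mse


def least_significant_difference(mse, n_a, n_b, t_critical):
    """Smallest difference between two group means considered significant."""
    return t_critical * sqrt(mse * (1 / n_a + 1 / n_b))


scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]  # invented illness scores, 3 groups
f_stat, df_b, df_w, mse = one_way_anova(scores)
lsd = least_significant_difference(mse, 3, 3, t_critical=2.447)
print(round(f_stat, 2), df_b, df_w, round(lsd, 2))
```

Two group means that differ by more than the LSD would be declared different at the chosen significance level, provided the overall F test is itself significant.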

Example 11

Challenge in Birds

[0413] In this example, various vaccine formulations of the present invention are tested in the chicken influenza model. For these studies an IV H5N1 virus, known to infect birds, is used. Plasmid constructs comprising codon-optimized and non-codon-optimized coding regions encoding NP, M1, M2, eM2, and/or an eM2-NP fusion; or alternatively coding regions (either codon-optimized or non-codon optimized) encoding various IV proteins or fragments, variants or derivatives either alone or as fusions with a carrier protein, e.g., HBcAg, as well as various controls, e.g., empty vector, are formulated with cationic lipid, and/or poloxamer and/or aluminum phosphate based or other adjuvants. The vaccine formulations are delivered at a dose of about 1-10 .mu.g, delivered IM into the defeathered breast area, at 0 and 1 month. The animals are bled for antibody results 3 weeks following the second vaccine. Antibody titers against the various IV antigens are determined using techniques described in the literature. See, e.g., Kodihalli S. et al., Vaccine 18:2592-9 (2000). The birds are challenged intranasally with 0.1 mL containing 100 LD.sub.50 3 weeks post second vaccination. The birds are monitored daily for 10 days for disease symptoms, which include loss of appetite, diarrhea, swollen faces, cyanosis, paralysis and death. Tracheal and cloacal swabs are taken 4 days following challenge for virus titration.

Example 12

Formulation Selection Studies

[0414] The potency of different vaccine formulations was evaluated in different experimental studies using the NP protein of Influenza A/PR/8/34.

Vaccination Regimen

[0415] Groups of nine, six- to eight-week old BALB/c mice (Harlan-Sprague-Dawley) received bilateral (50 .mu.L/leg) intramuscular (rectus femoris) injections of plasmid DNA. Control mice received DNA in PBS alone. Mice received injections on days 0, 20 and 49. Mice were bled by OSP on day 62, and NP-specific antibodies analyzed by ELISA. Splenocytes were harvested from 3 mice/group/day for three sequential days beginning day 63, and NP-specific T cells were analyzed by IFN.gamma. ELISPOT using overlapping peptide stimulation.

Cell Culture Media

[0416] Splenocyte cultures were grown in RPMI-1640 medium containing 25 mM HEPES buffer and L-glutamine and supplemented with 10% (v/v) FBS, 55 .mu.M .beta.-mercaptoethanol, 100 U/mL of penicillin G sodium salt, and 100 .mu.g/mL of streptomycin sulfate.

Standard Influenza NP Indirect Binding Assay

[0417] NP specific serum antibody titers were determined by indirect binding ELISA using 96 well ELISA plates coated overnight at 4.degree. C. with purified recombinant NP protein at 0.5 .mu.g per well in BBS buffer, pH 8.3. NP coated wells were blocked with 1% bovine serum albumin in BBS for 1 hour at room temperature. Two-fold serial dilutions of sera in blocking buffer were incubated for 2 hours at room temperature and detected by incubating with alkaline phosphatase conjugated (AP) goat anti-mouse IgG-Fc (Jackson Immunoresearch, West Grove, Pa.) at 1:5000 for 2 hours at room temperature. Color was developed with 1 mg/ml para-nitrophenyl phosphate (Calbiochem, La Jolla, Calif.) in 50 mM sodium bicarbonate buffer, pH 9.8 and 1 mM MgCl.sub.2 and the absorbance read at 405 nm. The titer is the reciprocal of the last dilution exhibiting an absorbance value 2 times that of pre-bleed samples.

Standard NP CD8+ and CD4+ T-Cell ELISPOT Assay

[0418] Standard ELISPOT technology, used to identify the number of interferon gamma (IFN-.gamma.) secreting cells after stimulation with specific antigen (spot forming cells per million splenocytes, expressed as SFU/million), was used for the CD4+ and CD8+ T-cell assays. Three mice from each group were sacrificed on each of three consecutive days. At the time of collection, spleens from each group were pooled, and single cell suspensions were made in cell culture media using a dounce homogenizer. Red blood cells were lysed, and cells were washed and counted. For the CD4+ and CD8+ assays, cells were serially diluted 3-fold, starting at 10.sup.6 cells per well and transferred to 96 well ELISPOT plates pre-coated with anti-murine IFN-.gamma. monoclonal antibody. Spleen cells were stimulated with the H-2K.sup.d binding peptide, TYQRTRALV, at 1 .mu.g/ml and recombinant murine IL-2 at 1 U/ml for the CD8+ assay and with purified recombinant NP protein at 20 .mu.g/ml for the CD4+ assay. Cells were stimulated for 20-24 hours at 37.degree. C. in 5% CO.sub.2, and then the cells were washed out and biotin labeled anti-IFN-.gamma. monoclonal antibody added for a 2 hour incubation at room temperature. Plates were washed and horseradish peroxidase-labeled avidin was added. After a 1-hour incubation at room temperature, AEC substrate was added and "spots" developed for 15 minutes. Spots were counted using the Immunospot automated spot counter (C.T.L. Inc., Cleveland, Ohio).

Experiment 1

[0419] The purpose of this experiment was to determine a dose response to naked DNA (VR4700) and for pDNA formulated with VF-P1205-02A. VR4700 is a plasmid encoding influenza A/PR/8/34 nucleoprotein (NP) in a VR10551 backbone. VR10551 is an expression vector without any transgene insert. VF-P1205-02A is a formulation containing a poloxamer with a POP molecular weight of 12 KDa and POE of 5% (CRL1005) at a DNA:poloxamer:BAK ratio of 5 mg/ml:7.5 mg/ml:0.3 mM. The results of this experiment are shown in the following Table:

TABLE 16

DNA dose   CRL1005 dose   BAK conc.   Serum Ab titers       CD8.sup.+ T cells   CD4.sup.+ T cells
(.mu.g)    (.mu.g)        (.mu.M)     (total IgG, n = 9)    (SFU/10.sup.6)      (SFU/10.sup.6)
1          --             --          11,206                28                  24
10         --             --          31,289                77                  99
100        --             --          65,422                243                 304
1          1.5            0.06        9,956                 48                  57
10         15             0.6         45,511                174                 220
100        150            6           79,644                397                 382

[0420] The results of this experiment indicate that increasing the dose of DNA increases both the humoral and cell mediated immune responses. When the DNA is formulated with poloxamer and BAK, increasing the dose also increases both the humoral and cell mediated immune responses.
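
The CRL1005 doses listed in Table 16 follow from the DNA:poloxamer mass ratio of the VF-P1205-02A formulation (5 mg/ml:7.5 mg/ml). A minimal sketch of that scaling is shown below; the function and constant names are illustrative.

```python
# Poloxamer dose implied by the 5 mg/ml : 7.5 mg/ml DNA:poloxamer ratio.
# Names are illustrative.

DNA_MG_PER_ML = 5.0
POLOXAMER_MG_PER_ML = 7.5


def crl1005_dose_ug(dna_dose_ug):
    """CRL1005 mass co-delivered with a given DNA mass at the fixed ratio."""
    return dna_dose_ug * POLOXAMER_MG_PER_ML / DNA_MG_PER_ML


for dna in (1, 10, 100):
    print(dna, crl1005_dose_ug(dna))
```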

Experiment 2

[0421] The purpose of this experiment was to determine a dose response to CRL1005, with a fixed pDNA (VR4700) dose and no BAK. The results of this experiment are shown in the following Table:

TABLE 17

DNA dose   CRL1005 dose   Serum Ab titers       CD8.sup.+ T cells   CD4.sup.+ T cells
(.mu.g)    (.mu.g)        (total IgG, n = 9)    (SFU/10.sup.6)      (SFU/10.sup.6)
10         --             27,733                45                  46
10         15             38,400                69                  86
10         50             46,933                66                  73
10         150            54,044                90                  97
10         450            76,800                90                  92
10         750            119,467               83                  60

[0422] The results of this experiment indicate that increasing the dose of CRL1005 increases both the humoral and cell mediated immune responses.

Experiment 3

[0423] The purpose of this experiment was to compare immune responses of DMRIE:DOPE (1:1, mol:mol) and Vaxfectin.TM. cationic lipid formulations at different pDNA/cationic lipid molar ratios. The results of this experiment are shown in the following Table:

TABLE 18

DNA dose   DMRIE:DOPE             Vaxfectin.TM.          Serum Ab titers       CD8.sup.+ T cells   CD4.sup.+ T cells
(.mu.g)    pDNA/cationic lipid    pDNA/cationic lipid    (total IgG, n = 9)    (SFU/10.sup.6)      (SFU/10.sup.6)
           molar ratios           molar ratios
10         --                     --                     17,778                57                  54
10         4:1                    --                     48,356                47                  112
10         2:1                    --                     49,778                44                  133
10         --                     4:1                    88,178                68                  464
10         --                     2:1                    150,756               46                  363

[0424] The results of this experiment indicate that formulating the plasmid with DMRIE:DOPE or Vaxfectin.TM. increases both the humoral and cell mediated immune responses.

Experiment 4

[0425] The purpose of this experiment was first to compare immune responses of DMRIE:DOPE (1:1, mol:mol) at pDNA/cationic lipid molar ratios of 4:1 as an MLV (multi lamellar vesicle formulation--multi-vial) or SUV (small unilamellar vesicles--single-vial) formulation. Second, it was to compare sucrose (lyophilized and frozen) and PBS based formulations. The results of this experiment are shown in the following Table:

TABLE 19

DNA dose   Formulation          Buffer            Serum Ab titers       CD8.sup.+ T cells   CD4.sup.+ T cells
(.mu.g)                                           (total IgG, n = 9)    (SFU/10.sup.6)      (SFU/10.sup.6)
10         --                   PBS, pH 7.2       21,333                107                 118
10         SUV                  PBS, pH 7.2       15,644                144                 169
10         SUV                  PBS, pH 7.8       13,511                114                 173
10         SUV Frozen/thawed    Sucrose pH 7.8    15,644                103                 119
10         SUV Lyophilized      Sucrose pH 7.8    10,311                ND                  246
10         MLV                  PBS, pH 7.2       29,867                170                 259

* ND - could not be counted due to high background

[0426] The results of this experiment indicate that formulating the plasmid with DMRIE:DOPE stimulates both the humoral and cell mediated immune responses.

Experiment 5

[0427] The purpose of this experiment was first to determine what effect changing the ratio of DMRIE to DOPE has on immune response at pDNA/cationic lipid molar ratios of 4:1 as an MLV (multi-vial, in PBS) or SUV (single-vial in PBS) formulation. Second, it was to compare the effect of changing the co-lipid from DOPE to cholesterol. The results of this experiment are shown in the following Table:

TABLE 20

DNA dose   Formulation     DMRIE:DOPE   Serum Ab titers       CD8.sup.+ T cells   CD4.sup.+ T cells
(.mu.g)                                 (total IgG, n = 9)    (SFU/10.sup.6)      (SFU/10.sup.6)
10         --              --           19,342                65                  98
10         MLV, DM:DP      1:0          38,684                70                  126
10         MLV, DM:DP      3:1          75,093                82                  162
10         MLV, DM:DP      1:1          53,476                78                  186
10         SUV, DM:DP      1:1          36,409                96                  106
10         MLV, DM:Chol    1:1          52,338                65                  154

[0428] The results of this experiment indicate that formulating the plasmid with DMRIE:DOPE stimulates both the humoral and cell mediated immune responses. Changing the co-lipid from DOPE to cholesterol also stimulates both the humoral and cell mediated immune responses.

Experiment 6

[0429] The purpose of this experiment was to obtain a dose response to pDNA formulated with DMRIE:DOPE (1:1, mol:mol) at a 4:1 pDNA/cationic lipid molar ratio. The results of this experiment are shown in the following Table:

TABLE 21

DNA dose   Formulation   Serum Ab titers       CD8.sup.+ T cells   CD4.sup.+ T cells
(.mu.g)                  (total IgG, n = 9)    (SFU/10.sup.6)      (SFU/10.sup.6)
10         --            22,044                119                 154
1          MLV           5,600                 22                  67
3          MLV           22,756                46                  97
10         MLV           45,511                199                 250
30         MLV           60,444                274                 473
100        MLV           91,022                277                 262

[0430] The results of this experiment indicate that when the plasmid is formulated with DMRIE:DOPE, increasing the dose also increases both the humoral and cell mediated immune responses.

Example 13

In vitro Expression of Influenza Antigens

Plasmid Vector

[0431] Polynucleotides of the present invention were inserted into eukaryotic expression vector backbones VR10551, VR10682 and VR6430 all of which are described previously. The VR10551 vector is built on a modified pUC18 background (see Yanisch-Perron, C., et al. Gene 33:103-119 (1985)), and contains a kanamycin resistance gene, the human cytomegalovirus immediate early 1 promoter/enhancer and intron A, and the bovine growth hormone transcription termination signal, and a polylinker for inserting foreign genes. See Hartikka, J., et al., Hum. Gene Ther. 7:1205-1217 (1996). However, other standard commercially available eukaryotic expression vectors may be used in the present invention, including, but not limited to: plasmids pcDNA3, pHCMV/Zeo, pCR3.1, pEF1/His, pIND/GS, pRc/HCMV2, pSV40/Zeo2, pTRACER-HCMV, pUB6/V5-His, pVAX1, and pZeoSV2 (available from Invitrogen, San Diego, Calif.), and plasmid pCI (available from Promega, Madison, Wis.).

[0432] Various plasmids were generated by cloning the nucleotide sequence for the following influenza A antigens: segment 7 (encodes both M1 and M2 proteins via differential splicing), M2 and NP into expression constructs as described below and pictured in FIG. 13.

[0433] Plasmids VR4756 (SEQ ID NO:91), VR4759 (SEQ ID NO:92) and VR4762 (SEQ ID NO:93) were created by cloning the nucleotide sequence encoding the consensus sequence for the following influenza A antigens respectively: segment 7 (encoding both the M1 and M2 proteins by differential splicing), M2 and NP into the VR10551 backbone. The VR4756, VR4759 and VR4762 plasmids are also described in Table 13.

[0434] The VR4764 (SEQ ID NO:95) and VR4765 (SEQ ID NO:96) plasmids were constructed by ligating the segment 7 and NP coding regions from VR4756 and VR4762 respectively into the VR10682 vector. Specifically, the VR4756 vector was digested with EcoRV and SalI restriction endonucleases and the blunted fragment was ligated into the VR10682 backbone, which had been digested with the EcoRV restriction endonuclease. The VR4765 vector was constructed by digesting the VR4762 vector with EcoRV and NotI and ligating the NP coding region into the VR10682 backbone digested with the same restriction endonucleases.

[0435] VR4766 (SEQ ID NO:97) and VR4767 (SEQ ID NO:98) contain a CMV promoter/intron A-NP expression cassette and a RSV promoter (from VCL1005)-segment 7 expression cassette in the same orientation (VR4766) or opposite orientation (VR4767). These plasmids were generated by digesting VR4762 with the DraIII restriction endonuclease and cutting the RSV-segment 7-mRBG cassette from VR4764 with EcoRV and BamHI restriction endonucleases. After exonuclease digestion with the Klenow fragment of DNA polymerase I, the EcoRV/BamHI fragment was cloned into the DraIII digested VR4762 vector. Both insert orientations were obtained by this blunt end cloning method.

[0436] VR4768 (SEQ ID NO:99) and VR4769 (SEQ ID NO:100), containing a CMV promoter/intron A-segment 7 expression cassette and a RSV promoter-NP expression cassette, were similarly derived. VR4756 was digested with the DraIII restriction endonuclease and blunted by treatment with the Klenow fragment of DNA Polymerase I. The cassette containing the RSV promoter, NP coding region and mRBG terminator was removed from VR4765 by digesting with KpnI and NdeI restriction endonucleases. The fragment was also blunted with the Klenow fragment of DNA polymerase I and ligated into the DraIII-digested VR4756 vector in both gene orientations.

[0437] VR4770 (SEQ ID NO:101), VR4771 (SEQ ID NO:102) and VR4772 (SEQ ID NO:103) were constructed by cloning the coding regions from VR4756, VR4762 and VR4759 respectively into the VR6430 vector backbone. Specifically, the segment 7 gene from VR4756 was removed using SalI and EcoRV restriction endonucleases and blunted with the Klenow fragment of DNA polymerase I. The VR6430 plasmid was digested with EcoRV and BamHI and the vector backbone fragment was blunted with the Klenow fragment of DNA polymerase I. The segment 7 gene fragment was then ligated into the VR6430 vector backbone. VR4771 was derived by removing the NP insert from VR4762 following EcoRV and BglII restriction endonuclease digestion and the fragment was ligated into the VR6430 vector backbone which had been digested with the same restriction endonucleases. VR4772 was derived by subcloning the M2 coding region from VR4759 as a blunted SalI-EcoRV fragment and ligating into the VR6430 vector backbone from a blunted EcoRV-BamHI digest.

[0438] VR4773 (SEQ ID NO:104) and VR4774 (SEQ ID NO:105) contain a CMV promoter/intron A-segment 7 expression cassette and a RSV/R-NP expression cassette with the genes in the same or opposite orientation. These plasmids were generated by digesting VR4756 with the DraIII restriction endonuclease, blunting, and ligating to the RSV/R-NP-BGH fragment from VR4771 (VR4771 digested with NdeI and SfiI and then blunted).

[0439] VR4775 (SEQ ID NO:106) and VR4776 (SEQ ID NO:107) contain a CMV promoter/intron A-NP expression cassette and a RSV/R-segment 7 expression cassette with the genes in the same or opposite orientation. These plasmids were generated by digesting VR4762 with the DraIII restriction enzyme and blunting with the Klenow fragment of DNA polymerase. The RSV/R-segment 7-BGH fragment was generated by digesting VR4770 with NdeI and SfiI restriction endonucleases and ligating the blunted fragment with the DraIII restriction endonuclease digested VR4762.

[0440] VR4777 (SEQ ID NO:108) and VR4778 (SEQ ID NO:109) contain a CMV promoter/intron A-NP expression cassette and a RSV/R-M2 expression cassette in the same or opposite orientation. These plasmids were generated by digesting VR4762 with the MscI restriction endonuclease, digesting VR4772 with NdeI and SfiI restriction endonucleases and treating the RSV/R-M2-BGH with the Klenow fragment of DNA polymerase, followed by ligation of these two gel purified fragments.

[0441] VR4779 and VR4780 contain a CMV promoter/intron A-M2 expression cassette and a RSV/R-NP expression cassette in the same or opposite orientation. These plasmids were generated by digesting VR4759 with the MscI restriction endonuclease, digesting VR4771 with NdeI and SfiI restriction endonucleases and treating the RSV/R-NP-BGH segment with the Klenow fragment of DNA polymerase, followed by ligation of these two gel purified fragments.

Plasmid DNA Purification

[0442] Plasmid DNA was transformed into Escherichia coli DH5.alpha. competent cells, and highly purified covalently closed circular plasmid DNA was isolated by a modified lysis procedure (Horn, N. A., et al., Hum. Gene Ther. 6:565-573 (1995)) followed by standard double CsCl-ethidium bromide gradient ultracentrifugation (Sambrook, J., et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Plainview, N.Y. (1989)). All plasmid preparations were free of detectable chromosomal DNA, RNA and protein impurities based on gel analysis and the bicinchoninic protein assay (Pierce Chem. Co., Rockford Ill.). Endotoxin levels were measured using the Limulus Amebocyte Lysate assay (LAL, Associates of Cape Cod, Falmouth, Mass.) and were less than 0.6 Endotoxin Units/mg of plasmid DNA. The spectrophotometric A.sub.260/A.sub.280 ratios of the DNA solutions were typically above 1.8. Plasmids were ethanol precipitated and resuspended in an appropriate solution, e.g., 150 mM sodium phosphate (for other appropriate excipients and auxiliary agents, see U.S. patent application Publication 2002/0019358, published Feb. 14, 2002). DNA was stored at -20.degree. C. until use. DNA was diluted by mixing it with 300 mM salt solutions and by adding an appropriate amount of USP water to obtain 1 mg/ml plasmid DNA in the desired salt at the desired molar concentration.
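The dilution described above follows ordinary mass-balance arithmetic (C1V1 = C2V2). The following is a minimal sketch of that calculation and is offered only as an illustration. The function name, the 5 mg/ml DNA stock concentration, and the default that the DNA stock itself contributes no salt are assumptions chosen for the example; only the 300 mM salt stock, the 150 mM example salt concentration, and the 1 mg/ml final DNA concentration come from the text.

```python
def dilution_volumes(final_ml, dna_stock_mg_ml, target_salt_mM,
                     salt_stock_mM=300.0, dna_stock_salt_mM=0.0,
                     final_dna_mg_ml=1.0):
    """Return (DNA stock, salt stock, water) volumes in ml for one dilution.

    Assumes simple additive volumes. dna_stock_salt_mM is the salt already
    present in the DNA stock (hypothetical default: none).
    """
    # Volume of DNA stock that supplies the required mass of plasmid.
    dna_ml = final_ml * final_dna_mg_ml / dna_stock_mg_ml
    # Salt still needed after subtracting what the DNA stock brings along.
    salt_ml = (final_ml * target_salt_mM - dna_ml * dna_stock_salt_mM) / salt_stock_mM
    # Water makes up the remaining volume.
    water_ml = final_ml - dna_ml - salt_ml
    if min(dna_ml, salt_ml, water_ml) < 0:
        raise ValueError("target cannot be reached with these stock solutions")
    return dna_ml, salt_ml, water_ml


# Hypothetical example: 1 ml at 1 mg/ml DNA in 150 mM salt from a 5 mg/ml stock.
dna_ml, salt_ml, water_ml = dilution_volumes(1.0, 5.0, 150.0)
print(f"DNA stock {dna_ml:.2f} ml, 300 mM salt {salt_ml:.2f} ml, water {water_ml:.2f} ml")
```

If the DNA stock is itself resuspended in a salt solution, that concentration would be passed as `dna_stock_salt_mM` so that the salt stock volume is reduced accordingly.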

Plasmid Expression in Mammalian Cell Lines

[0443] The expression plasmids were analyzed in vitro by transfecting the plasmids into a well characterized mouse melanoma cell line (VM-92, also known as UM-449) and the human rhabdomyosarcoma cell line RD (ATCC CCL-136) both available from the American Type Culture Collection, Manassas, Va. Other well-characterized human cell lines may also be used, e.g. MRC-5 cells, ATCC Accession No. CCL-171. The transfection was performed using cationic lipid-based transfection procedures well known to those of skill in the art. Other transfection procedures are well known in the art and may be used, for example electroporation and calcium chloride-mediated transfection (Graham F. L. and A. J. van der Eb Virology 52:456-67 (1973)). Following transfection, cell lysates and culture supernatants of transfected cells were evaluated to compare relative levels of expression of IV antigen proteins. The samples were assayed by western blots and ELISAs, using commercially available monoclonal antibodies (available, e.g., from Research Diagnostics Inc., Flanders, N.J.), so as to compare both the quality and the quantity of expressed antigen.

[0444] Genes encoding the consensus amino acid sequences (described above) derived for NP, M1 and M2 antigens were cloned in several configurations into several plasmid vector backbones. The pDNAs were tested for in vitro expression and are being assessed in vivo for immunogenicity, as well as for the ability to protect mice from influenza challenge.

Experiment 1

[0445] Following the derivation of an amino acid consensus for M1 and M2, a native segment 7 isolate was found to encode this consensus, and this nucleotide sequence was synthesized according to methods described above. An M2-M1 fusion gene was also created and the nucleotide sequence was human codon-optimized using the above described codon optimization algorithm of Example 4. The individual full-length M2 and M1 genes were also cloned via PCR from this fusion.

[0446] In vitro expression of influenza antigens in cell lysates was assessed 48 hours after transfection into a mouse melanoma cell line. M2 expression was detected following transfection of VR4756 (segment 7), VR4755 (M2-M1 fusion) and VR4759 (full-length M2) using the anti-M2 monoclonal antibody (14C2) from Affinity BioReagents. The data are shown in FIG. 10 for VR4756 and VR4755. Expression of M1 was detected from transfected VR4756, VR4755 and VR4760 (full-length M1) pDNAs, as detected by anti-M1 monoclonal (Serotec) in FIG. 10 for VR4756 and VR4755, or by anti-M1 goat polyclonal (Virostat, data not shown). VR10551 is the empty cloning vector.

Experiment 2

[0447] In order to compare alternative human codon-optimization methods, two versions of a fusion of the first 24 amino acids of M2 to full-length NP ("eM2-NP") were constructed. One nucleotide sequence was derived from the above codon optimization algorithm, while the other was done by an outside vendor. Comparison of expression levels from the two eM2-NP pDNAs was measured in vitro, and comparison of immunogenicity in vivo is on-going. Additionally, the full-length NP genes for both codon-optimized versions were sub-cloned from the eM2-NP pDNAs and analyzed for expression in vitro.
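The eM2 portion of the fusion can be checked against the sequence listing. The sketch below is illustrative only: it translates the first 72 nucleotides of SEQ ID NO:6 (the eM2NP fusion) with the standard genetic code and compares the result with the first 24 residues of the M2 protein of SEQ ID NO:5, then confirms that the next five codons begin the NP protein of SEQ ID NO:2. The function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(dna):
    """Translate a DNA coding sequence (spaces allowed) to one-letter amino acids."""
    dna = dna.upper().replace(" ", "")
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# First 87 nucleotides of SEQ ID NO:6 as printed in the sequence listing:
# 72 nt encoding the M2 N-terminus followed by the first 15 nt of NP.
EM2_NT = ("atgagtcttc taaccgaggt cgaaacgcct atcagaaacg "
          "aatgggggtg cagatgcaac ggttcaagtg at")
NP_START_NT = "atggcgtctcaaggc"

# First 24 residues of M2 (SEQ ID NO:5) and first 5 residues of NP (SEQ ID NO:2).
M2_FIRST_24 = "MSLLTEVETPIRNEWGCRCNGSSD"
NP_FIRST_5 = "MASQG"

em2_protein = translate(EM2_NT)
np_start_protein = translate(NP_START_NT)
print(em2_protein, np_start_protein)
```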

[0448] In vitro expression was tested to compare eM2-NP and NP pDNAs derived from the above described codon-optimization algorithm and an outside vendor algorithm. The data are shown in FIG. 11. Expression levels were approximately the same for VR4757 (eM2-NP vendor optimization) vs. VR4758 (eM2-NP Applicant optimization), as detected by anti-M2 monoclonal (FIG. 11A) or anti-NP mouse polyclonal (data not shown). Similarly, NP expression was approximately equal for VR4761 (vendor optimization) vs. VR4762 (Applicant optimization), detected by anti-NP mouse polyclonal generated by Applicants (FIG. 11B). NP consensus protein expression in vitro was also detected using a goat polyclonal antibody (Fitzgerald) generated against whole H1N1 or H3N2 virus (data not shown). Expression levels of both of these NP constructs were much higher than a pDNA containing A/PR/34 NP (VR4700).

Experiment 3

[0449] Influenza antigen-encoding plasmids were transfected into VM92 cells using methods described above. Cell lysates and media were collected 48 hours after transfection. Cells were lysed in 200 .mu.l of Laemmli buffer, cell debris was removed by microcentrifuge spin, and 20 .mu.l was heated and loaded on a 4-12% Bis-Tris gel. To determine expression of those vectors encoding secreted NP protein, 15 .mu.l of media was mixed with 5 .mu.l of loading buffer, heated, and loaded on a gel. Western blots were processed as described above. Primary antibodies were as follows: monoclonal antibody MA1-082 (ABR) to detect M2 protein, monoclonal antibody MCA401 (Serotec) to detect M1 protein, and a polyclonal antibody generated in-house in VR4762-injected rabbits. All primary antibodies were used at a 1:500 dilution.

[0450] FIG. 14 shows Western blot results in which M2 protein expression from segment 7-encoding plasmids is higher for CMV promoter/intron A-segment 7 (VR4756) and RSV/R-segment 7 (VR4770) than for VR4764 (RSV promoter). NP expression appeared highest from the RSV/R-NP plasmid (VR4771), followed by CMV/intron A-NP (VR4762) and then RSV-NP (VR4765). Similar results were seen in Western blots from human RD-transfected cells.

[0451] For dual promoter plasmids containing RSV-segment 7 and CMV/intron A-NP (VR4766 and VR4767), M2 expression from segment 7 is very low, independent of orientation. The CMV/intron A-NP expression in these dual promoter plasmids does not differ significantly from that of VR4762. In the dual promoter plasmids in which NP is expressed from RSV and segment 7 is expressed from CMV/intron A (VR4768 and VR4769), NP expression decreases somewhat, but not as drastically as M2 expression in the dual promoter plasmids VR4766 and VR4767.

[0452] FIG. 15 shows expression of the M1 and M2 proteins from segment 7, as well as NP, from CMV promoter/intron A, RSV promoter, and RSV/R-containing plasmids. For these Western blots, dual promoter plasmids contain the CMV promoter/intron A and RSV/R driving either NP or segment 7. Similar results were seen in Western blots from human RD-transfected cells.

[0453] Western blot results confirm that the M1 and M2 protein expression from both CMV promoter/intron A-segment 7 (VR4756) and RSV/R-segment 7 (VR4770) is superior to RSV-segment 7 (VR4764). M1 and M2 expression decrease slightly when RSV/R-segment 7 or CMV/intron A-segment 7 is combined with CMV/intron A-NP or RSV/R-NP in a dual promoter plasmid (VR4773, VR4774, VR4775, and VR4776). Results were similar in Western blots from human RD transfected cells. Human RD cells transfected with M2 antigen encoding plasmids, RSV/R-M2 (VR4772) and CMV/intron A-M2 (VR4759), showed a similar level of M2 expression, which was decreased in dual promoter plasmids (VR4777, VR4778, VR4779, and VR4780). Human RD cells transfected with NP antigen-encoding plasmids, VR4762, VR4771, VR4777, VR4778, VR4779, and VR4780, all showed similar NP expression levels.

Example 14

Murine Influenza A Challenge Model

[0454] An influenza A challenge model has been established utilizing a mouse-adapted A/BK/8/68 strain. Positive and negative control hemagglutinin (HA)-containing plasmids were generated by PCR of the HA genes directly from mouse-adapted A/Hong Kong/68 (H3N2) and A/Puerto Rico/34 (H1N1) viruses, respectively.

[0455] For all experiments, plasmid DNA vaccinations are given as bilateral, rectus femoris injections at 0 and 3 weeks, followed by orbital sinus puncture (OSP) bleed at 5 weeks and intranasal viral challenge at 6 weeks with 500 pfu (1 LD.sub.90) of virus. Mice are monitored for morbidity and weight loss for about 3 weeks following viral challenge. Endpoint antibody titers for NP and M2 were determined by ELISA. For study GSJ08, 5 additional mice per test group were vaccinated and interferon-.gamma. ELISPOT assays were performed at week number 5.

Study CL88:

[0456] A mouse influenza challenge study was initiated to test the M1, M2, Segment 7, and NP-encoding plasmids alone, or in combination. In addition to HA pDNAs, sub-lethal infection and naive mice serve as additional positive and negative controls, respectively. Mice received 100 .mu.g of each plasmid formulated in poloxamer CRL1005, 02A formulation. The test groups and 21 day post-challenge survival are shown in Table 21:

TABLE 21
Group  Construct(s)                   Total pDNA per vaccination  # mice/group  21 day Survival (%)
A      VR4762 (NP)                    100 .mu.g                   12            17
B      VR4759 (M2)                    100 .mu.g                   12            25
C      VR4760 (M1)                    100 .mu.g                   12            0
D      VR4756 (S7)                    100 .mu.g                   12            50
E      VR4762 (NP) + VR4759 (M2)      200 .mu.g                   12            100
F      VR4762 (NP) + VR4760 (M1)      200 .mu.g                   12            17
G      VR4762 (NP) + VR4756 (S7)      200 .mu.g                   12            75
H      VR4750 (HA, H3N2, + control)   100 .mu.g                   12            100
I      VR4752 (HA, H1N1, - control)   100 .mu.g                   12            8
J      Naive mice (- control)         N/A                         12            8
K      Sub-lethal (+ control)         N/A                         12            100

CL88 Results:

[0457] The performance criteria for this study were survival of >90% for the positive controls, .ltoreq.10% for the negative controls, and >75% for the experimental groups. Table 21 shows that all of the control groups, as well as two experimental groups, met the performance criteria. The M2+NP and S7+NP plasmid DNA combinations resulted in 100% and 75% survival, respectively. There was no statistically significant difference (p<0.05) between the two lead plasmid combinations, but there was statistical significance in the S7, S7+NP, and M2+NP groups vs. the negative controls.

[0458] Weight loss data showed that the positive control groups did not exhibit any weight loss following viral challenge, as opposed to the weight loss seen in all of the experimental groups. Mice that survived the viral challenge recovered to their starting weight by the end of the study. Tables 22 and 23 show endpoint antibody titers for test groups containing M2, Segment 7, and NP antigens. Shaded boxes represent mice that died following viral challenge.

TABLE 22 CL88 M2 Antibody Titers
mouse  Group D     Group G      Group B     Group E
       (seg 7)     (NP + seg7)  (M2)        (NP + M2)
1      800         1600         25600       1600
2      ##STR1##    1600         ##STR2##    6400
3      3200        6400         ##STR3##    200
4      6400        ##STR4##     ##STR5##    6400
5      12800       ##STR6##     3200        3200
6      800         12800        12800       3200
7      ##STR7##    0            ##STR8##    3200
8      ##STR9##    0            ##STR10##   6400
9      800         3200         ##STR11##   1600
10     ##STR12##   3200         ##STR13##   800
11     12800       1600         ##STR14##   3200
12     ##STR15##   12800        ##STR16##   400
**An M2 antibody titer of 0 represents a titer of <100.

[0459] TABLE 23 CL88 NP Antibody Titers
mouse  Group A     Group E     Group F     Group G
       (NP)        (NP + M2)   (NP + M1)   (NP + seg7)
1      204800      51200       ##STR17##   25600
2      ##STR18##   51200       204800      51200
3      204800      51200       ##STR19##   51200
4      ##STR20##   25600       51200       ##STR21##
5      ##STR22##   102400      ##STR23##   ##STR24##
6      ##STR25##   51200       ##STR26##   102400
7      ##STR27##   204800      ##STR28##   102400
8      ##STR29##   102400      ##STR30##   102400
9      ##STR31##   102400      ##STR32##   51200
10     ##STR33##   102400      ##STR34##   102400
11     ##STR35##   51200       ##STR36##   25600
12     ##STR37##   51200       ##STR38##   25600

Study GSJ05:

[0460] In order to attempt to distinguish between the two antigen combinations, S7+NP and M2+NP, a dose ranging challenge experiment was undertaken with these two plasmid combinations. Mice were injected with 100 .mu.g, 30 .mu.g, or 10 .mu.g per plasmid in the 02A poloxamer formulation at 0 and 3 weeks, followed by bleed at 5 weeks and viral challenge at 6 weeks. Sixteen mice per group were vaccinated for test groups A-F, while 12 mice per group were vaccinated for the controls. Poloxamer 02A-formulated HA plasmids, VR4750 (HA H3) and VR4752 (HA H1), were included as positive and negative controls, respectively. The test groups and 21 day survival post-challenge are shown in Table 24:

TABLE 24
Group  Construct(s)                    Total pDNA per vaccination  # mice/group  21 day Survival (%)
A      VR4756 (Seg 7) + VR4762 (NP)    200 .mu.g                   16            73
B      VR4756 (Seg 7) + VR4762 (NP)    60 .mu.g                    16            81
C      VR4756 (Seg 7) + VR4762 (NP)    20 .mu.g                    16            69
D      VR4759 (M2) + VR4762 (NP)       200 .mu.g                   16            94
E      VR4759 (M2) + VR4762 (NP)       60 .mu.g                    16            81
F      VR4759 (M2) + VR4762 (NP)       20 .mu.g                    16            75
G      VR4750 (Positive DNA control)   100 .mu.g                   12            100
H      VR4752 (Negative DNA control)   100 .mu.g                   12            8

Results

[0461] The performance criteria of >90% survival with the HA positive control and .ltoreq.10% for the HA negative control plasmid again were met. The performance criterion for the experimental groups, >75% survival at the 30 .mu.g per plasmid dose, was met by both M2+NP and S7+NP (Table 24). In fact, at a dose of 10 .mu.g per plasmid, S7+NP and M2+NP resulted in 69% and 75% survival, respectively. There was no statistically significant difference (p<0.05) between the three doses of M2+NP or between the three doses of S7+NP, nor was there a statistically significant difference when comparing M2+NP to S7+NP at the 200 .mu.g, 60 .mu.g, or 20 .mu.g doses. However, there was a statistically significant difference for the HA positive control vs. S7+NP at 200 .mu.g and 20 .mu.g. Body mass data show weight loss and recovery by all surviving experimental plasmid DNA-vaccinated groups, while the HA positive control mice did not experience weight loss. Antibody data for M2 and NP are shown in Tables 25 and 26.

TABLE 25 GSJ05 M2 Antibody Titers
mouse #  Group A     Group B     Group C     Group D     Group E     Group F
1        ##STR39##   400         3200        6400        800         3200
2        200         ##STR40##   0           25600       1600        0
3        0           ##STR41##   0           3200        3200        3200
4        100         0           ##STR42##   6400        1600        400
5        ##STR43##   0           0           3200        800         1600
6        3200        400         0           6400        200         100
7        25600       800         0           ##STR44##   ##STR45##   ##STR46##
8        0           100         ##STR47##   1600        0           400
9        ##STR48##   ##STR49##   800         3200        12800       0
10       ##STR50##   800         ##STR51##   1600        800         ##STR52##
11       100         1600        ##STR53##   3200        200         1600
12       3200        0           ##STR54##   6400        ##STR55##   1600
13       800         0           400         3200        ##STR56##   800
14       ##STR57##   0           1600        3200        400         100
15       0           1600        800         1600        3200        ##STR58##
16       0           0           800         800         3200        ##STR59##

[0462] TABLE 26 GSJ05 NP Antibody Titers
mouse #  Group A     Group B     Group C     Group D     Group E     Group F
1        ##STR60##   51200       51200       51200       25600       25600
2        25600       ##STR61##   12800       51200       25600       6400
3        102400      ##STR62##   51200       12800       51200       25600
4        25600       12800       ##STR63##   25600       12800       12800
5        ##STR64##   102400      6400        25600       12800       12800
6        25600       51200       25600       25600       12800       6400
7        102400      51200       6400        ##STR65##   ##STR66##   ##STR67##
8        51200       25600       ##STR68##   12800       51200       6400
9        ##STR69##   ##STR70##   25600       102400      12800       12800
10       ##STR71##   25600       ##STR72##   25600       12800       ##STR73##
11       51200       25600       ##STR74##   25600       25600       3200
12       51200       51200       ##STR75##   25600       ##STR76##   12800
13       51200       51200       25600       51200       ##STR77##   12800
14       ##STR78##   12800       25600       51200       6400        12800
15       25600       6400        25600       25600       25600       ##STR79##
16       51200       51200       25600       12800       12800       ##STR80##
Gray shading represents mice that died post-challenge. Group A, mouse 9 (spotted box) died during the OSP bleed procedure.
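The survival percentages of Table 24 can be related to the shaded entries of Tables 25 and 26 with simple arithmetic. The sketch below is illustrative only. The per-group death counts are an interpretation obtained by counting the shaded-cell placeholders per column in Tables 25 and 26, and the treatment of Group A (the mouse lost during the OSP bleed is removed from the denominator) is an assumption that happens to reproduce the reported 73%. The function and variable names are hypothetical.

```python
def survival_percent(challenged, deaths_after_challenge):
    """Percentage of challenged mice alive at the end of the observation period."""
    return round(100 * (challenged - deaths_after_challenge) / challenged)


# group: (mice vaccinated, mice lost before challenge, deaths after challenge)
# Counts read from the shaded-cell placeholders in Tables 25 and 26 (assumption).
GSJ05_COUNTS = {
    "A": (16, 1, 4),   # one mouse died during the OSP bleed
    "B": (16, 0, 3),
    "C": (16, 0, 5),
    "D": (16, 0, 1),
    "E": (16, 0, 3),
    "F": (16, 0, 4),
}

gsj05_survival = {
    group: survival_percent(vaccinated - lost, deaths)
    for group, (vaccinated, lost, deaths) in GSJ05_COUNTS.items()
}
print(gsj05_survival)
```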

Study GSJ06

[0463] The plasmid combination VR4759 (M2) and VR4762 (NP) was utilized in further mouse influenza challenge studies to examine additional formulations.

[0464] Using the experimental protocol described above, 12 mice per group were vaccinated with equal weights of VR4759 (M2) and VR4762 (NP) in the following formulations:

[0465] Poloxamer 02A used in the previous two challenge experiments.

[0466] DMRIE+Cholesterol (DM:Chol) at a 4:1 molar ratio of DNA to DMRIE; the molar ratio of DM:Chol is 3:1.

[0467] Vaxfectin.TM. (VC1052+DPyPE) at a 4:1 molar ratio of DNA:VC1052; the molar ratio of VC1052:DPyPE is 1:1.

[0468] GSJ06 study design and 21 day survival post-challenge are found in Table 27.

TABLE 27
Group  pDNA                    Total pDNA  21 day Survival (%)
A      Poloxamer 02A           20 ug       92
B      Poloxamer 02A           2 ug        58
C      DMRIE: Cholesterol      20 ug       58
D      DMRIE: Cholesterol      2 ug        17
E      Vaxfectin               20 ug       100
F      Vaxfectin               2 ug        75
G      VR4750 (HA, positive)   100 ug      100
H      VR4752 (HA, negative)   100 ug      0

Results

[0469] Poloxamer 02A and Vaxfectin.TM.-formulated plasmid DNA led to 92% and 100% survival at the 20 .mu.g pDNA dose, and 58% and 75% at the 2 .mu.g dose, respectively (Table 27).

[0470] Average weights were tracked for each group of mice starting at the day of challenge. As shown in Table 28, it was noted in this experiment that the weight recovery for group E (Vaxfectin.TM.-formulated pDNA, 20 .mu.g total) began after day 4, as opposed to the other groups' recovery beginning at day 7. Antibody titers, Tables 29 and 30, were determined for M2 and NP and shaded boxes represent mice that died following viral challenge.

TABLE 28 GSJ06 Average Body Weights Post-Challenge
Avg Body Weights (g)-Days post-challenge
Group  pDNA                    Total pDNA  0          2          4          7          9          11         14         16         18         21
A      Poloxamer 02A           20 ug       20.73      19.98      17.98      ##STR81##  17.36      18.74      19.94      20.45      20.60      21.08
B      Poloxamer 02A           2 ug        21.08      19.91      17.96      15.17      ##STR82##  16.03      16.77      17.41      18.10      19.52
C      DMRIE-Cholesterol       20 ug       21.43      20.24      18.14      ##STR83##  18.68      19.24      20.14      20.50      20.90      21.42
D      DMRIE-Cholesterol       2 ug        21.28      20.24      17.58      ##STR84##  16.18      17.45      18.80      19.84      20.13      20.98
E      Vaxfectin               20 ug       21.41      19.97      ##STR85##  18.10      19.12      19.82      20.39      20.87      20.93      21.34
F      Vaxfectin               2 ug        20.47      18.97      16.86      ##STR86##  16.22      16.84      17.87      18.60      19.08      20.02
G      VR4750 (HA, positive)   100 ug      21.30      20.97      21.60      21.21      21.57      21.79      21.84      22.13      21.94      22.13
H      VR4752 (HA, negative)   100 ug      20.89      20.25      17.57      14.67
Shading represents the lowest group average post-challenge for each test group. Group H (negative control) weight averages are not recorded once the percentage survival has dropped below 50%.
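One simple way to summarize the recovery reported in Table 28 is to express each group's day 21 average weight as a percentage of its day 0 average. The sketch below is illustrative only and is not an analysis performed in the study. The day 0 and day 21 values are taken from Table 28 for groups A-G, and the function and variable names are hypothetical.

```python
# Group-average body weights (g) on day 0 and day 21 post-challenge (Table 28).
DAY0_DAY21 = {
    "A": (20.73, 21.08),
    "B": (21.08, 19.52),
    "C": (21.43, 21.42),
    "D": (21.28, 20.98),
    "E": (21.41, 21.34),
    "F": (20.47, 20.02),
    "G": (21.30, 22.13),
}


def percent_of_start(start_g, end_g):
    """End weight expressed as a percentage of the starting weight."""
    return 100.0 * end_g / start_g


recovery = {group: round(percent_of_start(start, end), 1)
            for group, (start, end) in DAY0_DAY21.items()}

for group, pct in recovery.items():
    print(f"Group {group}: day 21 average is {pct}% of the day 0 average")
```

These are group averages over surviving mice, so a value near 100% describes the survivors and says nothing about animals that died before day 21.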

[0471] TABLE 29 GSJ06 M2 Antibody Titers
mouse #  Group A     Group B     Group C     Group D     Group E     Group F
1        ##STR87##   400         ##STR88##   ##STR89##   1600        6400
2        6400        ##STR90##   1600                    400         800
3        6400        ##STR91##   ##STR92##   ##STR93##   12800       3200
4        1600        0           400         ##STR94##   25600       1600
5        6400        3200        ##STR95##   ##STR96##   100         ##STR97##
6        3200        100         100         ##STR98##   12800       1600
7        800         1600        1600        ##STR99##   800         3200
8        400         100         ##STR100##  200         6400        ##STR101##
9        1600        ##STR102##  100         ##STR103##  6400        ##STR104##
10       100         ##STR105##  1600        ##STR106##  3200        400
11       3200        0           800         ##STR107##  1600        1600
12       6400        ##STR108##  ##STR109##  0           6400        1600

[0472] TABLE 30 GSJ06 NP Antibody Titers
mouse #  Group A     Group B     Group C     Group D     Group E     Group F
1        ##STR110##  6400        ##STR111##  ##STR112##  51200       51200
2        51200       ##STR113##  6400        ##STR114##  102400      102400
3        12800       ##STR115##  ##STR116##  ##STR117##  51200       25600
4        25600       1600        6400        ##STR118##  204800      102400
5        25600       6400        ##STR119##  ##STR120##  51200       ##STR121##
6        51200       12800       25600       ##STR122##  102400      51200
7        25600       25600       12800       ##STR123##  51200       51200
8        25600       3200        ##STR124##  6400        25600       ##STR125##
9        25600       ##STR126##  51200       ##STR127##  51200       ##STR128##
10       51200       ##STR129##  12800       ##STR130##  51200       51200
11       25600       12800       25600       ##STR131##  102400      51200
12       51200       ##STR132##  ##STR133##  400         51200       51200

Study GSJ08

[0473] Further formulation comparisons were done utilizing VR4759 (M2) and VR4762 (NP). Seventeen mice per test group (A-G) were vaccinated with equal weights of VR4759 (M2) and VR4762 (NP) vectors in the following formulations:

[0474] Poloxamer 02A

[0475] Vaxfectin.TM. (preparations A and B represent different purifications)

[0476] DMRIE:DOPE at a 4:1 molar ratio of DNA to DMRIE

[0477] DMRIE:DOPE at a 2.5:1 molar ratio of DNA to DMRIE

[0478] PBS (unformulated pDNA)

[0479] Twelve mice per test group were challenged with influenza virus at week number 6. Five mice per test group were sacrificed at days 36-38 for T cell assays (IFN-.gamma. ELISPOT). The test groups and 21 day survival post-challenge are shown in Table 31. Groups A-D, and F-G were vaccinated with 20 .mu.g total plasmid DNA per injection to further explore the weight loss/recovery phenomena seen in study GSJ06 with the Vaxfectin.TM.-formulated pDNA.

TABLE 31
Group  Construct(s)                   Total pDNA per vaccination  21 Day Survival (%)
A      Poloxamer 02A                  20 .mu.g                    50
B      DMRIE:DOPE 4:1                 20 .mu.g                    92
C      DMRIE:DOPE 2.5:1               20 .mu.g                    92
D      Vaxfectin - prep A             20 .mu.g                    92
E      Vaxfectin - prep A             2 .mu.g                     75
F      Vaxfectin - prep B             20 .mu.g                    100
G      PBS                            20 .mu.g                    42
H      VR4750 (HA, H3N2, +control)    100 .mu.g                   100
I      VR4752 (HA, H1N1, -control)    100 .mu.g                   17

Results

[0480] The DMRIE:DOPE and Vaxfectin.TM. formulated groups resulted in 92-100% survival at a 20 .mu.g pDNA dose. Group A (Poloxamer 02A) and Group G (PBS) survival results were not statistically different from the negative control (as measured by Fisher exact p, one-tailed), while the Vaxfectin.TM. and DMRIE:DOPE Groups (Groups B-F) were shown to be statistically superior (p<0.05) as compared to the negative control. Therefore, the plasmid DNA formulated with lipids appears to provide superior protection in the mouse influenza model challenge.

[0481] A repeated measures ANOVA mixed model analysis of the weight loss and recovery data for Groups B, C, and D showed that Group B and Group D were not statistically different, while Group C and Group D were statistically different.
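The one-tailed Fisher exact comparison named above can be sketched with the hypergeometric distribution. The example below is illustrative and is not the analysis actually run for the study. The survivor counts are assumptions back-calculated from the Table 31 percentages on the premise of 12 challenged mice per group (for example, 92% is taken as 11 of 12), including for the negative control, whose group size is not stated for this study. The function and variable names are hypothetical.

```python
from math import comb


def fisher_exact_one_tailed(surv_test, n_test, surv_ctrl, n_ctrl):
    """One-tailed Fisher exact p: probability that the test group has at least
    surv_test survivors if survival is unrelated to group membership."""
    total = n_test + n_ctrl
    total_surv = surv_test + surv_ctrl
    upper = min(n_test, total_surv)
    favorable = sum(
        comb(total_surv, k) * comb(total - total_surv, n_test - k)
        for k in range(surv_test, upper + 1)
    )
    return favorable / comb(total, n_test)


N_PER_GROUP = 12                      # assumed for every group
NEG_CONTROL_SURVIVORS = 2             # Group I, 17% of 12 (assumed count)
SURVIVORS = {"A": 6, "B": 11, "C": 11, "D": 11, "E": 9, "F": 12, "G": 5}

p_values = {
    group: fisher_exact_one_tailed(s, N_PER_GROUP, NEG_CONTROL_SURVIVORS, N_PER_GROUP)
    for group, s in SURVIVORS.items()
}

for group, p in p_values.items():
    verdict = "p<0.05" if p < 0.05 else "not significant"
    print(f"Group {group} vs. negative control: p = {p:.4f} ({verdict})")
```

Under these assumed counts the sketch gives the same pattern described in the text: Groups A and G do not reach p<0.05 against the negative control, while Groups B-F do.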

[0482] T cell responses, as measured by IFN-.gamma. ELISPOT assay, were conducted on the last 5 mice per group using an M2 peptide encompassing the first 24 amino acids of M2 (TABLE 33), an NP protein expressed in baculovirus (TABLE 34), and an NP CD8+ Balb/c immunodominant peptide (TABLE 35).

[0483] Antibody titers, Tables 36 and 37, were determined for M2 and NP proteins. The first 12 mice listed for each group were challenged at day 42 and the last 5 mice per group were sacrificed for IFN-.gamma. ELISPOT. The shaded boxes represent mice that died following viral challenge.

TABLE 32 GSJ08 Average Body Weights Post-Challenge
Avg Body Weights (g)-Days post-challenge
Group  Construct(s)                  Total pDNA per vaccination  0           2           4           5           6           7           9           11          14          16          18          22
A      Poloxamer 02A                 20 .mu.g                    20.47       18.97       16.30       15.43       14.75       ##STR134##  14.36       14.44       16.63       17.64       18.36       20.53
B      DMRIE-DOPE 4:1                20 .mu.g                    21.58       19.94       17.43       16.75       16.17       ##STR135##  16.43       17.28       18.45       19.50       20.22       20.89
C      DMRIE-DOPE 2.5:1              20 .mu.g                    19.95       18.58       16.44       15.77       ##STR136##  15.56       15.75       16.22       16.78       17.16       17.31       18.04
D      Vaxfectin - prep A            20 .mu.g                    20.87       19.22       16.81       16.47       ##STR137##  16.92       17.94       19.48       20.06       20.19       20.64       21.17
E      Vaxfectin - prep A            2 .mu.g                     20.40       19.59       17.97       17.47       17.27       ##STR138##  18.96       19.83       20.24       20.49       20.57       21.06
F      Vaxfectin - prep B            20 .mu.g                    21.33       20.01       17.88       ##STR139##  17.74       18.21       18.85       19.85       20.29       20.77       20.88       21.39
G      PBS                           20 .mu.g                    20.84       19.46       16.97       16.00       15.38       ##STR140##  15.80       16.39       17.35
H      VR4750 (HA, H3N2, +control)   100 .mu.g                   21.25       21.15       21.27       20.77       20.92       21.24       20.74       21.16       21.33       21.40       21.64       21.64
I      VR4752 (HA, H1N1, -control)   100 .mu.g                   21.67       20.65       17.87       16.77       16.05       15.17       15.09
Shading represents the lowest group average post-challenge for each test group. Group G and I weight averages are not recorded once the percentage survival has dropped below 50%.

[0484] TABLE 33 M2 peptide Interferon-.gamma. ELISPOT
M2 peptide IFN gamma ELISPOT (SFU/10E6 cells)
Mouse  A     B     C     D     E     F     G
1      66    88    145   189   283   253   31
2      11    115   150   269   62    282   47
3      115   247   190   233   99    283   112
4      20    6     51    67    73    93    45
5      93    277   397   248   202   399   93
AVG    61    147   187   201   144   262   66
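The AVG row of Table 33 is the arithmetic mean of the five mice in each group, rounded to the nearest whole spot-forming unit. The sketch below is illustrative only. It recomputes those means from the individual values in Table 33, and the variable names are hypothetical.

```python
from statistics import mean

# M2 peptide IFN-gamma ELISPOT results (SFU per 10E6 cells), mice 1-5, Table 33.
M2_ELISPOT = {
    "A": [66, 11, 115, 20, 93],
    "B": [88, 115, 247, 6, 277],
    "C": [145, 150, 190, 51, 397],
    "D": [189, 269, 233, 67, 248],
    "E": [283, 62, 99, 73, 202],
    "F": [253, 282, 283, 93, 399],
    "G": [31, 47, 112, 45, 93],
}

# AVG row as printed in Table 33.
REPORTED_AVG = {"A": 61, "B": 147, "C": 187, "D": 201, "E": 144, "F": 262, "G": 66}

group_means = {group: round(mean(values)) for group, values in M2_ELISPOT.items()}
print(group_means)
```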

[0485] TABLE 34 NP CD4 peptide Interferon-.gamma. ELISPOT
NP CD4 peptide IFN gamma ELISPOT (SFU/10E6 cells)
Mouse  A     B     C     D     E     F     G
1      7     32    3     52    72    108   18
2      8     83    34    125   8     34    8
3      22    91    106   293   26    51    73
4      9     15    80    39    53    10    12
5      37    150   374   117   40    217   43
AVG    17    74    119   125   40    84    31

[0486] TABLE 35 NP CD8 peptide Interferon-.gamma. ELISPOT
NP CD8 peptide IFN gamma ELISPOT (SFU/10E6 cells)
Mouse  A     B     C     D     E     F     G
1      11    37    4     14    20    67    8
2      0     3     4     6     1     0     2
3      31    19    15    26    23    51    34
4      1     0     0     12    1     38    3
5      46    36    39    21    13    15    18
AVG    18    19    12    16    12    34    13

[0487] TABLE 36 GSJ08 M2 Antibody Titers
mouse #  Group A     Group B     Group C     Group D     Group E     Group F     Group G     Group H     ELISPOT #
1        1600        3200        3200        6400        400         12800       800         6400
2        ##STR141##  12800       6400        1600        3200        800         ##STR142##  ##STR143##
3        ##STR144##  3200        6400        ##STR145##  800         3200        ##STR146##  800
4        ##STR147##  ##STR148##  6400        1600        ##STR149##  800         ##STR150##  0
5        1600        0           ##STR151##  12800       1600        800         ##STR152##  ##STR153##
6        ##STR154##  3200        1600        6400        200         12800       400         ##STR155##
7        ##STR156##  3200        12800       800         1600        3200        1600        ##STR157##
8        12800       6400        3200        12800       12800       12800       12800       ##STR158##
9        1600        1600        0           12800       6400        12800       ##STR159##  ##STR160##
10       3200        1600        12800       12800       1600        800         ##STR161##  12800
11       1600        6400        3200        3200        ##STR162##  6400        ##STR163##  ##STR164##
12       200         800         6400        25600       ##STR165##  800         ##STR166##  6400
13       1600        800         6400        12800       3200        6400        6400        6400        1
14       3200        6400        1600        1600        800         12800       3200        12800       2
15       0           1600        3200        3200        12800       12800       6400        12800       3
16       3200        3200        1600        12800       0           12800       200         6400        4
17       3200        200         400         6400        800         400         1600        3200        5

[0488] TABLE 37 GSJ08 NP Antibody Titers
mouse #  Group A     Group B     Group C     Group D     Group E     Group F     Group G     Group H     ELISPOT #
1        51200       25600       6400        51200       12800       51200       51200       25600
2        ##STR167##  25600       51200       51200       25600       102400      ##STR168##  ##STR169##
3        ##STR170##  51200       12800       ##STR171##  6400        102400      ##STR172##  12800
4        ##STR173##  ##STR174##  51200       102400      ##STR175##  25600       ##STR176##  25600
5        25600       12800       ##STR177##  51200       51200       102400      ##STR178##  ##STR179##
6        ##STR180##  12800       51200       102400      25600       51200       25600       ##STR181##
7        ##STR182##  51200       51200       51200       25600       204800      102400      ##STR183##
8        25600       51200       25600       51200       12800       51200       25600       ##STR184##
9        25600       12800       25600       51200       51200       51200       ##STR185##  ##STR186##
10       6400        12800       51200       51200       25600       204800      ##STR187##  25600
11       12800       51200       25600       204800      ##STR188##  102400      ##STR189##  ##STR190##
12       102400      102400      51200       102400      ##STR191##  204800      ##STR192##  51200
13       25600       25600       12800       51200       51200       102400      25600       25600       1
14       51200       25600       12800       51200       25600       102400      25600       51200       2
15       51200       51200       51200       51200       25600       25600       102400      12800       3
16       25600       6400        25600       51200       25600       102400      25600       51200       4
17       25600       25600       51200       51200       12800       51200       25600       25600       5

[0489] The present invention is not to be limited in scope by the specific embodiments described which are intended as single illustrations of individual aspects of the invention, and any compositions or methods which are functionally equivalent are within the scope of this invention. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description and accompanying drawings. Such modifications are intended to fall within the scope of the appended claims.

[0490] All publications and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference.

Sequence CWU 1

1

112 1 1565 DNA Influenza A virus 1 agcaaaagca gggtagataa tcactcactg agtgacatca aaatcatggc gtctcaaggc 60 accaaacgat cttacgaaca gatggagact gatggagaac gccagaatgc cactgaaatc 120 agagcatccg tcggaaaaat gattggtgga attggacgat tctacatcca aatgtgcacc 180 gaactcaaac tcagtgatta tgagggacgg ttgatccaaa acagcttaac aatagagaga 240 atggtgctct ctgcttttga cgaaaggaga aataaatacc ttgaagaaca tcccagtgcg 300 gggaaagatc ctaagaaaac tggaggacct atatacagga gagtaaacgg aaagtggatg 360 agagaactca tcctttatga caaagaagaa ataaggcgaa tctggcgcca agctaataat 420 ggtgacgatg caacggctgg tctgactcac atgatgatct ggcattccaa tttgaatgat 480 gcaacttatc agaggacaag agctcttgtt cgcaccggaa tggatcccag gatgtgctct 540 ctgatgcaag gttcaactct ccctaggagg tctggagccg caggtgctgc agtcaaagga 600 gttggaacaa tggtgatgga attggtcaga atgatcaaac gtgggatcaa tgatcggaac 660 ttctggaggg gtgagaatgg acgaaaaaca agaattgctt atgaaagaat gtgcaacatt 720 ctcaaaggga aatttcaaac tgctgcacaa aaagcaatga tggatcaagt gagagagagc 780 cggaacccag ggaatgctga gttcgaagat ctcacttttc tagcacggtc tgcactcata 840 ttgagagggt cggttgctca caagtcctgc ctgcctgcct gtgtgtatgg acctgccgta 900 gccagtgggt acgactttga aagggaggga tactctctag tcggaataga ccctttcaga 960 ctgcttcaaa acagccaagt gtacagccta atcagaccaa atgagaatcc agcacacaag 1020 agtcaactgg tgtggatggc atgccattct gccgcatttg aagatctaag agtattaagc 1080 ttcatcaaag ggacgaaggt gctcccaaga gggaagcttt ccactagagg agttcaaatt 1140 gcttccaatg aaaatatgga gactatggaa tcaagtacac ttgaactgag aagcaggtac 1200 tgggccataa ggaccagaag tggaggaaac accaatcaac agagggcatc tgcgggccaa 1260 atcagcatac aacctacgtt ctcagtacag agaaatctcc cttttgacag aacaaccgtt 1320 atggcagcat tcagtgggaa tacagagggg agaacatctg acatgaggac cgaaatcata 1380 aggatgatgg aaagtgcaag accagaagat gtgtctttcc aggggcgggg agtcttcgag 1440 ctctcggacg aaaaggcagc gagcccgatc gtgccttcct ttgacatgag taatgaagga 1500 tcttatttct tcggagacaa tgcagaggaa tacgataatt aaagaaaaat acccttgttt 1560 ctact 1565 2 498 PRT Influenza A virus 2 Met Ala Ser Gln Gly Thr Lys Arg Ser Thr Glu Gln Met Glu Thr Asp 1 5 10 15 Gly Glu Arg Gln Asn 
Ala Thr Glu Ile Arg Ala Ser Val Gly Lys Met 20 25 30 Ile Gly Gly Ile Gly Arg Phe Tyr Ile Gln Met Cys Thr Glu Leu Lys 35 40 45 Leu Ser Asp Tyr Glu Gly Arg Leu Ile Gln Asn Ser Leu Thr Ile Glu 50 55 60 Arg Met Val Leu Ser Ala Phe Asp Glu Arg Arg Asn Lys Tyr Leu Glu 65 70 75 80 Glu His Pro Ser Ala Gly Lys Asp Pro Lys Lys Thr Gly Gly Pro Ile 85 90 95 Tyr Arg Arg Val Asn Gly Lys Trp Met Arg Glu Leu Ile Leu Tyr Asp 100 105 110 Lys Glu Glu Ile Arg Arg Ile Trp Arg Gln Ala Asn Asn Gly Asp Asp 115 120 125 Ala Thr Ala Gly Leu Thr His Met Met Ile Trp His Ser Asn Leu Asn 130 135 140 Asp Ala Thr Tyr Gln Arg Thr Arg Ala Leu Val Arg Thr Gly Met Asp 145 150 155 160 Pro Arg Met Cys Ser Leu Met Gln Gly Ser Thr Leu Pro Arg Arg Ser 165 170 175 Gly Ala Ala Gly Ala Ala Val Lys Gly Val Gly Thr Met Val Met Glu 180 185 190 Leu Val Arg Met Ile Lys Arg Gly Ile Asn Asp Arg Asn Phe Trp Arg 195 200 205 Gly Glu Asn Gly Arg Lys Thr Arg Ile Ala Tyr Glu Arg Met Cys Asn 210 215 220 Ile Leu Lys Gly Lys Phe Gln Thr Ala Ala Gln Lys Ala Met Met Asp 225 230 235 240 Gln Val Arg Glu Ser Arg Asn Pro Gly Asn Ala Glu Phe Glu Asp Leu 245 250 255 Thr Phe Leu Ala Arg Ser Ala Leu Ile Leu Arg Gly Ser Val Ala His 260 265 270 Lys Ser Cys Leu Pro Ala Cys Val Tyr Gly Pro Ala Val Ala Ser Gly 275 280 285 Tyr Asp Phe Glu Arg Glu Gly Tyr Ser Leu Val Gly Ile Asp Pro Phe 290 295 300 Arg Leu Leu Gln Asn Ser Gln Val Tyr Ser Leu Ile Arg Pro Asn Glu 305 310 315 320 Asn Pro Ala His Lys Ser Gln Leu Val Trp Met Ala Cys His Ser Ala 325 330 335 Ala Phe Glu Asp Leu Arg Val Leu Ser Phe Ile Lys Gly Thr Lys Val 340 345 350 Leu Pro Arg Gly Lys Leu Ser Thr Arg Gly Val Gln Ile Ala Ser Asn 355 360 365 Glu Asn Met Glu Thr Met Glu Ser Ser Thr Leu Glu Leu Arg Ser Arg 370 375 380 Tyr Trp Ala Ile Arg Thr Arg Ser Gly Gly Asn Thr Asn Gln Gln Arg 385 390 395 400 Ala Ser Ala Gly Gln Ile Ser Ile Gln Pro Thr Phe Ser Val Gln Arg 405 410 415 Asn Leu Pro Phe Asp Arg Thr Thr Val Met Ala Ala Phe Ser Gly Asn 420 425 430 Thr Glu Gly Arg Thr Ser Asp Met Arg Thr 
Glu Ile Ile Arg Met Met 435 440 445 Glu Ser Ala Arg Pro Glu Asp Val Ser Phe Gln Gly Arg Gly Val Phe 450 455 460 Glu Leu Ser Asp Glu Lys Ala Ala Ser Pro Ile Val Pro Ser Phe Asp 465 470 475 480 Met Ser Asn Glu Gly Ser Tyr Phe Phe Gly Asp Asn Ala Glu Glu Tyr 485 490 495 Asp Asn 3 1027 DNA Influenza A virus 3 agcgaaagca ggtagatatt gaaagatgag tcttctaacc gaggtcgaaa cgtacgtact 60 ctctatcatc ccgtcaggcc ccctcaaagc cgagatcgca cagagacttg aagatgtctt 120 tgcagggaag aacactgatc ttgaggttct catggaatgg ctaaagacaa gaccaatcct 180 gtcacctctg actaagggga ttttaggatt tgtgttcacg ctcaccgtgc ccagtgagcg 240 aggactgcag cgtagacgct ttgtccaaaa tgcccttaat gggaacgggg atccaaataa 300 catggacaaa gcagttaaac tgtataggaa gctcaagagg gagataacat tccatggggc 360 caaagaaatc tcactcagtt attctgctgg tgcacttgcc agttgtatgg gcctcatata 420 caacaggatg ggggctgtga ccactgaagt ggcatttggc ctggtatgtg caacctgtga 480 acagattgct gactcccagc atcggtctca taggcaaatg gtgacaacaa ccaatccact 540 aatcagacat gagaacagaa tggttttagc cagcactaca gctaaggcta tggagcaaat 600 ggctggatcg agtgagcaag cagcagaggc catggaggtt gctagtcagg ctagacaaat 660 ggtgcaagcg atgagaacca ttgggactca tcctagctcc agtgctggtc tgaaaaatga 720 tcttcttgaa aatttgcagg cctatcagaa acgaatgggg gtgcagatgc aacggttcaa 780 gtgatcctct cgctattgcc gcaaatatca ttgggatctt gcacttgaca ttgtggattc 840 ttgatcgtct ttttttcaaa tgcatttacc gtcgctttaa atacggactg aaaggagggc 900 cttctacgga aggagtgcca aagtctatga gggaagaata tcgaaaggaa cagcagagtg 960 ctgtggatgc tgacgatggt cattttgtca gcatagagct ggagtaaaaa actaccttgt 1020 ttctact 1027 4 252 PRT Influenza A virus 4 Met Ser Leu Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Ile Pro 1 5 10 15 Ser Gly Pro Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe 20 25 30 Ala Gly Lys Asn Thr Asp Leu Glu Val Leu Met Glu Trp Leu Lys Thr 35 40 45 Arg Pro Ile Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe 50 55 60 Thr Leu Thr Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val 65 70 75 80 Gln Asn Ala Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Lys Ala 85 90 95 Val Lys Leu Tyr 
Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala 100 105 110 Lys Glu Ile Ser Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met 115 120 125 Gly Leu Ile Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe 130 135 140 Gly Leu Val Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg 145 150 155 160 Ser His Arg Gln Met Val Thr Thr Thr Asn Pro Leu Ile Arg His Glu 165 170 175 Asn Arg Met Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met 180 185 190 Ala Gly Ser Ser Glu Gln Ala Ala Glu Ala Met Glu Val Ala Ser Gln 195 200 205 Ala Arg Gln Met Val Gln Ala Met Arg Thr Ile Gly Thr His Pro Ser 210 215 220 Ser Ser Ala Gly Leu Lys Asn Asp Leu Leu Glu Asn Leu Gln Ala Tyr 225 230 235 240 Gln Lys Arg Met Gly Val Gln Met Gln Arg Phe Lys 245 250 5 97 PRT Influenza A virus 5 Met Ser Leu Leu Thr Glu Val Glu Thr Pro Ile Arg Asn Glu Trp Gly 1 5 10 15 Cys Arg Cys Asn Gly Ser Ser Asp Pro Leu Ala Ile Ala Ala Asn Ile 20 25 30 Ile Gly Ile Leu His Leu Thr Leu Trp Ile Leu Asp Arg Leu Phe Phe 35 40 45 Lys Cys Ile Tyr Arg Arg Phe Lys Tyr Gly Leu Lys Gly Gly Pro Ser 50 55 60 Thr Glu Gly Val Pro Lys Ser Met Arg Glu Glu Tyr Arg Lys Glu Gln 65 70 75 80 Gln Ser Ala Val Asp Ala Asp Asp Gly His Phe Val Ser Ile Glu Leu 85 90 95 Glu 6 1566 DNA Artificial sequence eM2NP fusion 6 atgagtcttc taaccgaggt cgaaacgcct atcagaaacg aatgggggtg cagatgcaac 60 ggttcaagtg atatggcgtc tcaaggcacc aaacgatctt acgaacagat ggagactgat 120 ggagaacgcc agaatgccac tgaaatcaga gcatccgtcg gaaaaatgat tggtggaatt 180 ggacgattct acatccaaat gtgcaccgaa ctcaaactca gtgattatga gggacggttg 240 atccaaaaca gcttaacaat agagagaatg gtgctctctg cttttgacga aaggagaaat 300 aaataccttg aagaacatcc cagtgcgggg aaagatccta agaaaactgg aggacctata 360 tacaggagag taaacggaaa gtggatgaga gaactcatcc tttatgacaa agaagaaata 420 aggcgaatct ggcgccaagc taataatggt gacgatgcaa cggctggtct gactcacatg 480 atgatctggc attccaattt gaatgatgca acttatcaga ggacaagagc tcttgttcgc 540 accggaatgg atcccaggat gtgctctctg atgcaaggtt caactctccc taggaggtct 600 ggagccgcag gtgctgcagt caaaggagtt ggaacaatgg 
tgatggaatt ggtcagaatg 660 atcaaacgtg ggatcaatga tcggaacttc tggaggggtg agaatggacg aaaaacaaga 720 attgcttatg aaagaatgtg caacattctc aaagggaaat ttcaaactgc tgcacaaaaa 780 gcaatgatgg atcaagtgag agagagccgg aacccaggga atgctgagtt cgaagatctc 840 acttttctag cacggtctgc actcatattg agagggtcgg ttgctcacaa gtcctgcctg 900 cctgcctgtg tgtatggacc tgccgtagcc agtgggtacg actttgaaag ggagggatac 960 tctctagtcg gaatagaccc tttcagactg cttcaaaaca gccaagtgta cagcctaatc 1020 agaccaaatg agaatccagc acacaagagt caactggtgt ggatggcatg ccattctgcc 1080 gcatttgaag atctaagagt attaagcttc atcaaaggga cgaaggtgct cccaagaggg 1140 aagctttcca ctagaggagt tcaaattgct tccaatgaaa atatggagac tatggaatca 1200 agtacacttg aactgagaag caggtactgg gccataagga ccagaagtgg aggaaacacc 1260 aatcaacaga gggcatctgc gggccaaatc agcatacaac ctacgttctc agtacagaga 1320 aatctccctt ttgacagaac aaccgttatg gcagcattca gtgggaatac agaggggaga 1380 acatctgaca tgaggaccga aatcataagg atgatggaaa gtgcaagacc agaagatgtg 1440 tctttccagg ggcggggagt cttcgagctc tcggacgaaa aggcagcgag cccgatcgtg 1500 ccttcctttg acatgagtaa tgaaggatct tatttcttcg gagacaatgc agaggaatac 1560 gataat 1566 7 522 PRT Artificial sequence eM2NP fusion 7 Met Ser Leu Leu Thr Glu Val Glu Thr Pro Ile Arg Asn Glu Trp Gly 1 5 10 15 Cys Arg Cys Asn Gly Ser Ser Asp Met Ala Ser Gln Gly Thr Lys Arg 20 25 30 Ser Tyr Glu Gln Met Glu Thr Asp Gly Glu Arg Gln Asn Ala Thr Glu 35 40 45 Ile Arg Ala Ser Val Gly Lys Met Ile Gly Gly Ile Gly Arg Phe Tyr 50 55 60 Ile Gln Met Cys Thr Glu Leu Lys Leu Ser Asp Tyr Glu Gly Arg Leu 65 70 75 80 Ile Gln Asn Ser Leu Thr Ile Glu Arg Met Val Leu Ser Ala Phe Asp 85 90 95 Glu Arg Arg Asn Lys Tyr Leu Glu Glu His Pro Ser Ala Gly Lys Asp 100 105 110 Pro Lys Lys Thr Gly Gly Pro Ile Tyr Arg Arg Val Asn Gly Lys Trp 115 120 125 Met Arg Glu Leu Ile Leu Tyr Asp Lys Glu Glu Ile Arg Arg Ile Trp 130 135 140 Arg Gln Ala Asn Asn Gly Asp Asp Ala Thr Ala Gly Leu Thr His Met 145 150 155 160 Met Ile Trp His Ser Asn Leu Asn Asp Ala Thr Tyr Gln Arg Thr Arg 165 170 175 Ala Leu Val Arg Thr Gly Met 
Asp Pro Arg Met Cys Ser Leu Met Gln 180 185 190 Gly Ser Thr Leu Pro Arg Arg Ser Gly Ala Ala Gly Ala Ala Val Lys 195 200 205 Gly Val Gly Thr Met Val Met Glu Leu Val Arg Met Ile Lys Arg Gly 210 215 220 Ile Asn Asp Arg Asn Phe Trp Arg Gly Glu Asn Gly Arg Lys Thr Arg 225 230 235 240 Ile Ala Tyr Glu Arg Met Cys Asn Ile Leu Lys Gly Lys Phe Gln Thr 245 250 255 Ala Ala Gln Lys Ala Met Met Asp Gln Val Arg Glu Ser Arg Asn Pro 260 265 270 Gly Asn Ala Glu Phe Glu Asp Leu Thr Phe Leu Ala Arg Ser Ala Leu 275 280 285 Ile Leu Arg Gly Ser Val Ala His Lys Ser Cys Leu Pro Ala Cys Val 290 295 300 Tyr Gly Pro Ala Val Ala Ser Gly Tyr Asp Phe Glu Arg Glu Gly Tyr 305 310 315 320 Ser Leu Val Gly Ile Asp Pro Phe Arg Leu Leu Gln Asn Ser Gln Val 325 330 335 Tyr Ser Leu Ile Arg Pro Asn Glu Asn Pro Ala His Lys Ser Gln Leu 340 345 350 Val Trp Met Ala Cys His Ser Ala Ala Phe Glu Asp Leu Arg Val Leu 355 360 365 Ser Phe Ile Lys Gly Thr Lys Val Leu Pro Arg Gly Lys Leu Ser Thr 370 375 380 Arg Gly Val Gln Ile Ala Ser Asn Glu Asn Met Glu Thr Met Glu Ser 385 390 395 400 Ser Thr Leu Glu Leu Arg Ser Arg Tyr Trp Ala Ile Arg Thr Arg Ser 405 410 415 Gly Gly Asn Thr Asn Gln Gln Arg Ala Ser Ala Gly Gln Ile Ser Ile 420 425 430 Gln Pro Thr Phe Ser Val Gln Arg Asn Leu Pro Phe Asp Arg Thr Thr 435 440 445 Val Met Ala Ala Phe Ser Gly Asn Thr Glu Gly Arg Thr Ser Asp Met 450 455 460 Arg Thr Glu Ile Ile Arg Met Met Glu Ser Ala Arg Pro Glu Asp Val 465 470 475 480 Ser Phe Gln Gly Arg Gly Val Phe Glu Leu Ser Asp Glu Lys Ala Ala 485 490 495 Ser Pro Ile Val Pro Ser Phe Asp Met Ser Asn Glu Gly Ser Tyr Phe 500 505 510 Phe Gly Asp Asn Ala Glu Glu Tyr Asp Asn 515 520 8 1566 DNA Artificial sequence NPeM2 Fusion Construct 8 atggcgtctc aaggcaccaa acgatcttac gaacagatgg agactgatgg agaacgccag 60 aatgccactg aaatcagagc atccgtcgga aaaatgattg gtggaattgg acgattctac 120 atccaaatgt gcaccgaact caaactcagt gattatgagg gacggttgat ccaaaacagc 180 ttaacaatag agagaatggt gctctctgct tttgacgaaa ggagaaataa ataccttgaa 240 gaacatccca gtgcggggaa agatcctaag 
aaaactggag gacctatata caggagagta 300 aacggaaagt ggatgagaga actcatcctt tatgacaaag aagaaataag gcgaatctgg 360 cgccaagcta ataatggtga cgatgcaacg gctggtctga ctcacatgat gatctggcat 420 tccaatttga atgatgcaac ttatcagagg acaagagctc ttgttcgcac cggaatggat 480 cccaggatgt gctctctgat gcaaggttca actctcccta ggaggtctgg agccgcaggt 540 gctgcagtca aaggagttgg aacaatggtg atggaattgg tcagaatgat caaacgtggg 600 atcaatgatc ggaacttctg gaggggtgag aatggacgaa aaacaagaat tgcttatgaa 660 agaatgtgca acattctcaa agggaaattt caaactgctg cacaaaaagc aatgatggat 720 caagtgagag agagccggaa cccagggaat gctgagttcg aagatctcac ttttctagca 780 cggtctgcac tcatattgag agggtcggtt gctcacaagt cctgcctgcc tgcctgtgtg 840 tatggacctg ccgtagccag tgggtacgac tttgaaaggg agggatactc tctagtcgga 900 atagaccctt tcagactgct tcaaaacagc caagtgtaca gcctaatcag accaaatgag 960 aatccagcac acaagagtca actggtgtgg atggcatgcc attctgccgc atttgaagat 1020 ctaagagtat taagcttcat caaagggacg aaggtgctcc caagagggaa gctttccact 1080 agaggagttc aaattgcttc caatgaaaat atggagacta tggaatcaag tacacttgaa 1140 ctgagaagca ggtactgggc cataaggacc agaagtggag gaaacaccaa tcaacagagg 1200 gcatctgcgg gccaaatcag catacaacct acgttctcag tacagagaaa tctccctttt 1260 gacagaacaa ccgttatggc agcattcagt gggaatacag aggggagaac atctgacatg 1320 aggaccgaaa tcataaggat gatggaaagt gcaagaccag aagatgtgtc tttccagggg 1380 cggggagtct tcgagctctc ggacgaaaag gcagcgagcc cgatcgtgcc ttcctttgac 1440 atgagtaatg aaggatctta tttcttcgga gacaatgcag aggaatacga taatatgagt 1500 cttctaaccg aggtcgaaac gcctatcaga aacgaatggg ggtgcagatg caacggttca 1560 agtgat 1566 9 522 PRT Artificial sequence NPeM2 Fusion Construct 9 Met Ala Ser Gln Gly Thr Lys Arg Ser Tyr Glu Gln Met Glu Thr Asp 1 5 10 15 Gly Glu Arg Gln Asn Ala Thr Glu Ile Arg Ala Ser Val Gly Lys Met 20 25 30 Ile Gly Gly Ile Gly Arg Phe Tyr Ile Gln Met Cys Thr Glu Leu Lys 35 40 45 Leu Ser Asp Tyr Glu Gly Arg Leu Ile Gln Asn Ser Leu Thr Ile Glu 50 55 60 Arg Met Val Leu Ser Ala Phe Asp Glu Arg
Arg Asn Lys Tyr Leu Glu 65 70 75 80 Glu His Pro Ser Ala Gly Lys Asp Pro Lys Lys Thr Gly Gly Pro Ile 85 90 95 Tyr Arg Arg Val Asn Gly Lys Trp Met Arg Glu Leu Ile Leu Tyr Asp 100 105 110 Lys Glu Glu Ile Arg Arg Ile Trp Arg Gln Ala Asn Asn Gly Asp Asp 115 120 125 Ala Thr Ala Gly Leu Thr His Met Met Ile Trp His Ser Asn Leu Asn 130 135 140 Asp Ala Thr Tyr Gln Arg Thr Arg Ala Leu Val Arg Thr Gly Met Asp 145 150 155 160 Pro Arg Met Cys Ser Leu Met Gln Gly Ser Thr Leu Pro Arg Arg Ser 165 170 175 Gly Ala Ala Gly Ala Ala Val Lys Gly Val Gly Thr Met Val Met Glu 180 185 190 Leu Val Arg Met Ile Lys Arg Gly Ile Asn Asp Arg Asn Phe Trp Arg 195 200 205 Gly Glu Asn Gly Arg Lys Thr Arg Ile Ala Tyr Glu Arg Met Cys Asn 210 215 220 Ile Leu Lys Gly Lys Phe Gln Thr Ala Ala Gln Lys Ala Met Met Asp 225 230 235 240 Gln Val Arg Glu Ser Arg Asn Pro Gly Asn Ala Glu Phe Glu Asp Leu 245 250 255 Thr Phe Leu Ala Arg Ser Ala Leu Ile Leu Arg Gly Ser Val Ala His 260 265 270 Lys Ser Cys Leu Pro Ala Cys Val Tyr Gly Pro Ala Val Ala Ser Gly 275 280 285 Tyr Asp Phe Glu Arg Glu Gly Tyr Ser Leu Val Gly Ile Asp Pro Phe 290 295 300 Arg Leu Leu Gln Asn Ser Gln Val Tyr Ser Leu Ile Arg Pro Asn Glu 305 310 315 320 Asn Pro Ala His Lys Ser Gln Leu Val Trp Met Ala Cys His Ser Ala 325 330 335 Ala Phe Glu Asp Leu Arg Val Leu Ser Phe Ile Lys Gly Thr Lys Val 340 345 350 Leu Pro Arg Gly Lys Leu Ser Thr Arg Gly Val Gln Ile Ala Ser Asn 355 360 365 Glu Asn Met Glu Thr Met Glu Ser Ser Thr Leu Glu Leu Arg Ser Arg 370 375 380 Tyr Trp Ala Ile Arg Thr Arg Ser Gly Gly Asn Thr Asn Gln Gln Arg 385 390 395 400 Ala Ser Ala Gly Gln Ile Ser Ile Gln Pro Thr Phe Ser Val Gln Arg 405 410 415 Asn Leu Pro Phe Asp Arg Thr Thr Val Met Ala Ala Phe Ser Gly Asn 420 425 430 Thr Glu Gly Arg Thr Ser Asp Met Arg Thr Glu Ile Ile Arg Met Met 435 440 445 Glu Ser Ala Arg Pro Glu Asp Val Ser Phe Gln Gly Arg Gly Val Phe 450 455 460 Glu Leu Ser Asp Glu Lys Ala Ala Ser Pro Ile Val Pro Ser Phe Asp 465 470 475 480 Met Ser Asn Glu Gly Ser Tyr Phe Phe Gly Asp 
Asn Ala Glu Glu Tyr 485 490 495 Asp Asn Met Ser Leu Leu Thr Glu Val Glu Thr Pro Ile Arg Asn Glu 500 505 510 Trp Gly Cys Arg Cys Asn Gly Ser Ser Asp 515 520 10 6 PRT Artificial sequence Linker Peptide 10 Gly Tyr Ala Thr Arg Ala 1 5 11 6 PRT Artificial sequence Linker Peptide 11 Phe Gln Met Gly Glu Thr 1 5 12 8 PRT Artificial sequence Linker Peptide 12 Phe Asp Arg Val Lys His Leu Lys 1 5 13 9 PRT Artificial sequence Linker Peptide 13 Gly Arg Asn Thr Asn Gly Val Ile Thr 1 5 14 10 PRT Artificial sequence Linker Peptide 14 Val Asn Glu Lys Thr Ile Pro Asp His Asp 1 5 10 15 1683 DNA Influenza B virus 15 atgtccaaca tggatattga cagtataaat accggaacaa tcgataaaac accagaagaa 60 ctgactcccg gaaccagtgg ggcaaccaga ccaatcatca agccagcaac ccttgctccg 120 ccaagcaaca aacgaacccg aaatccatct ccagaaagga caaccacaag cagtgaaacc 180 gatatcggaa ggaaaatcca aaagaaacaa accccaacag agataaagaa gagcgtctac 240 aaaatggtgg taaaactggg tgaattctac aaccagatga tggtcaaagc tggacttaat 300 gatgacatgg aaaggaatct aattcaaaat gcacaagctg tggagagaat cctattggct 360 gcaactgatg acaagaaaac tgaataccaa aagaaaagga atgccagaga tgtcaaagaa 420 gggaaggaag aaatagacca caacaagaca ggaggcacct tttataagat ggtaagagat 480 gataaaacca tctacttcag ccctataaaa attacctttt taaaagaaga ggtgaaaaca 540 atgtacaaga ccaccatggg gagtgatggt ttcagtggac taaatcacat tatgattgga 600 cattcacaga tgaacgatgt ctgtttccaa agatcaaagg gactgaaaag ggttggactt 660 gacccttcat taatcagtac ttttgccgga agcacactac ccagaagatc aggtacaact 720 ggtgttgcaa tcaaaggagg tggaacttta gtggatgaag ccatccgatt tataggaaga 780 gcaatggcag acagagggct actgagagac atcaaggcca agacggccta tgaaaagatt 840 cttctgaatc tgaaaaacaa gtgctctgcg ccgcaacaaa aggctctagt tgatcaagtg 900 atcggaagta ggaacccagg gattgcagac atagaagacc taactctgct tgccagaagc 960 atggtagttg tcagaccctc tgtagcgagc aaagtggtgc ttcccataag catttatgct 1020 aaaatacctc aactaggatt caataccgaa gaatactcta tggttgggta tgaagccatg 1080 gctctttata atatggcaac acctgtttcc atattaagaa tgggagatga cgcaaaagat 1140 aaatctcaac tattcttcat gtcgtgcttc ggagctgcct atgaagatct 
aagagtgtta 1200 tctgcactaa cgggcaccga atttaagcct agatcagcac taaaatgcaa gggtttccat 1260 gtcccggcta aggagcaagt agaaggaatg ggggcagctc tgatgtccat caagcttcag 1320 ttctgggccc caatgaccag atctggaggg aatgaagtaa gtggagaagg agggtctggt 1380 caaataagtt gcagccctgt gtttgcagta gaaagaccta ttgctctaag caagcaagct 1440 gtaagaagaa tgctgtcaat gaacgttgaa ggacgtgatg cagatgtcaa aggaaatcta 1500 ctcaaaatga tgaatgattc aatggcaaag aaaaccagtg gaaatgcttt cattgggaag 1560 aaaatgtttc aaatatcaga caaaaacaaa gtcaatccca ttgagattcc aattaagcag 1620 accatcccca atttcttctt tgggagggac acagcagagg attatgatga cctcgattat 1680 taa 1683 16 560 PRT Artificial sequence Influenza B Virus 16 Met Ser Asn Met Asp Ile Asp Ser Ile Asn Thr Gly Thr Ile Asp Lys 1 5 10 15 Thr Pro Glu Glu Leu Thr Pro Gly Thr Ser Gly Ala Thr Arg Pro Ile 20 25 30 Ile Lys Pro Ala Thr Leu Ala Pro Pro Ser Asn Lys Arg Thr Arg Asn 35 40 45 Pro Ser Pro Glu Arg Thr Thr Thr Ser Ser Glu Thr Asp Ile Gly Arg 50 55 60 Lys Ile Gln Lys Lys Gln Thr Pro Thr Glu Ile Lys Lys Ser Val Tyr 65 70 75 80 Lys Met Val Val Lys Leu Gly Glu Phe Tyr Asn Gln Met Met Val Lys 85 90 95 Ala Gly Leu Asn Asp Asp Met Glu Arg Asn Leu Ile Gln Asn Ala Gln 100 105 110 Ala Val Glu Arg Ile Leu Leu Ala Ala Thr Asp Asp Lys Lys Thr Glu 115 120 125 Tyr Gln Lys Lys Arg Asn Ala Arg Asp Val Lys Glu Gly Lys Glu Glu 130 135 140 Ile Asp His Asn Lys Thr Gly Gly Thr Phe Tyr Lys Met Val Arg Asp 145 150 155 160 Asp Lys Thr Ile Tyr Phe Ser Pro Ile Lys Ile Thr Phe Leu Lys Glu 165 170 175 Glu Val Lys Thr Met Tyr Lys Thr Thr Met Gly Ser Asp Gly Phe Ser 180 185 190 Gly Leu Asn His Ile Met Ile Gly His Ser Gln Met Asn Asp Val Cys 195 200 205 Phe Gln Arg Ser Lys Gly Leu Lys Arg Val Gly Leu Asp Pro Ser Leu 210 215 220 Ile Ser Thr Phe Ala Gly Ser Thr Leu Pro Arg Arg Ser Gly Thr Thr 225 230 235 240 Gly Val Ala Ile Lys Gly Gly Gly Thr Leu Val Asp Glu Ala Ile Arg 245 250 255 Phe Ile Gly Arg Ala Met Ala Asp Arg Gly Leu Leu Arg Asp Ile Lys 260 265 270 Ala Lys Thr Ala Tyr Glu Lys Ile Leu Leu Asn Leu Lys Asn Lys Cys 275 
280 285 Ser Ala Pro Gln Gln Lys Ala Leu Val Asp Gln Val Ile Gly Ser Arg 290 295 300 Asn Pro Gly Ile Ala Asp Ile Glu Asp Leu Thr Leu Leu Ala Arg Ser 305 310 315 320 Met Val Val Val Arg Pro Ser Val Ala Ser Lys Val Val Leu Pro Ile 325 330 335 Ser Ile Tyr Ala Lys Ile Pro Gln Leu Gly Phe Asn Thr Glu Glu Tyr 340 345 350 Ser Met Val Gly Tyr Glu Ala Met Ala Leu Tyr Asn Met Ala Thr Pro 355 360 365 Val Ser Ile Leu Arg Met Gly Asp Asp Ala Lys Asp Lys Ser Gln Leu 370 375 380 Phe Phe Met Ser Cys Phe Gly Ala Ala Tyr Glu Asp Leu Arg Val Leu 385 390 395 400 Ser Ala Leu Thr Gly Thr Glu Phe Lys Pro Arg Ser Ala Leu Lys Cys 405 410 415 Lys Gly Phe His Val Pro Ala Lys Glu Gln Val Glu Gly Met Gly Ala 420 425 430 Ala Leu Met Ser Ile Lys Leu Gln Phe Trp Ala Pro Met Thr Arg Ser 435 440 445 Gly Gly Asn Glu Val Ser Gly Glu Gly Gly Ser Gly Gln Ile Ser Cys 450 455 460 Ser Pro Val Phe Ala Val Glu Arg Pro Ile Ala Leu Ser Lys Gln Ala 465 470 475 480 Val Arg Arg Met Leu Ser Met Asn Val Glu Gly Arg Asp Ala Asp Val 485 490 495 Lys Gly Asn Leu Leu Lys Met Met Asn Asp Ser Met Ala Lys Lys Thr 500 505 510 Ser Gly Asn Ala Phe Ile Gly Lys Lys Met Phe Gln Ile Ser Asp Lys 515 520 525 Asn Lys Val Asn Pro Ile Glu Ile Pro Ile Lys Gln Thr Ile Pro Asn 530 535 540 Phe Phe Phe Gly Arg Asp Thr Ala Glu Asp Tyr Asp Asp Leu Asp Tyr 545 550 555 560 17 1220 DNA Influenza A virus 17 atggaggcaa gactactggt cttgttatgt gcatttgcag ctacaaatgc agacacaata 60 tgtataggct accatgcgaa taactcaacc gacactgttg acacagtact cgaaaagaat 120 gtgaccgtga cacactctgt taacctgctc gaagacagcc acaacggaaa actatgtaaa 180 ttaaaaggaa tagccccatt acaattgggg aaatgtaata tcgccggatg gctcttggga 240 aacccggaat gcgatttact gctcacagcg agctcatggt cctatattgt agaaacatcg 300 aactcagaga atggaacatg ttacccagga gatttcatcg actatgaaga actgagggag 360 caattgagct cagtgtcatc gtttgaaaaa ttcgaaatat ttcccaagac aagctcgtgg 420 cccaatcatg aaacaaccaa aggtgtaacg gcagcatgct cctatgcggg agcaagcagt 480 ttttacagaa atttgctgtg gctgacaaag aagggaagct catacccaaa gcttagcaag 540 tcctatgtga acaataaagg 
gaaagaagtc cttgtactat ggggtgttca tcatccgcct 600 accggtactg atcaacagag tctctatcag aatgcagatg cttatgtctc tgtagggtca 660 tcaaaatata acaggagatt caccccggaa atagcagcga gacccaaagt aagaggtcaa 720 gctgggagga tgaactatta ctggacatta ctagaacccg gagacacaat aacatttgag 780 gcaactggaa atctaatagc accatggtat gctttcgcac tgaatagagg ttctggatcc 840 ggtatcatca cttcagacgc accagtgcat gattgtaaca cgaagtgtca aacaccccat 900 ggtgctataa acagcagtct ccctttccag aatatacatc cagtcacaat aggagagtgc 960 ccaaaatacg tcaggagtac caaattgagg atggctacag gactaagaaa cattccatct 1020 attcaatcca ggggtctatt tggagccatt gccggtttta ttgagggggg atggactgga 1080 atgatagatg gatggtatgg ttatcatcat cagaatgaac agggatcagg ctatgcagcg 1140 gatcaaaaaa gcacacaaaa tgccattgac gggattacaa acaaggtgaa ttctgttatc 1200 gagaaaatga acacccaatt 1220 18 406 PRT Influenza A virus 18 Met Glu Ala Arg Leu Leu Val Leu Leu Cys Ala Phe Ala Ala Thr Asn 1 5 10 15 Ala Asp Thr Ile Cys Ile Gly Tyr His Ala Asn Asn Ser Thr Asp Thr 20 25 30 Val Asp Thr Val Leu Glu Lys Asn Val Thr Val Thr His Ser Val Asn 35 40 45 Leu Leu Glu Asp Ser His Asn Gly Lys Leu Cys Lys Leu Lys Gly Ile 50 55 60 Ala Pro Leu Gln Leu Gly Lys Cys Asn Ile Ala Gly Trp Leu Leu Gly 65 70 75 80 Asn Pro Glu Cys Asp Leu Leu Leu Thr Ala Ser Ser Trp Ser Tyr Ile 85 90 95 Val Glu Thr Ser Asn Ser Glu Asn Gly Thr Cys Tyr Pro Gly Asp Phe 100 105 110 Ile Asp Tyr Glu Glu Leu Arg Glu Gln Leu Ser Ser Val Ser Ser Phe 115 120 125 Glu Lys Phe Glu Ile Phe Pro Lys Thr Ser Ser Trp Pro Asn His Glu 130 135 140 Thr Thr Lys Gly Val Thr Ala Ala Cys Ser Tyr Ala Gly Ala Ser Ser 145 150 155 160 Phe Tyr Arg Asn Leu Leu Trp Leu Thr Lys Lys Gly Ser Ser Tyr Pro 165 170 175 Lys Leu Ser Lys Ser Tyr Val Asn Asn Lys Gly Lys Glu Val Leu Val 180 185 190 Leu Trp Gly Val His His Pro Pro Thr Gly Thr Asp Gln Gln Ser Leu 195 200 205 Tyr Gln Asn Ala Asp Ala Tyr Val Ser Val Gly Ser Ser Lys Tyr Asn 210 215 220 Arg Arg Phe Thr Pro Glu Ile Ala Ala Arg Pro Lys Val Arg Gly Gln 225 230 235 240 Ala Gly Arg Met Asn Tyr Tyr Trp Thr Leu Leu Glu Pro 
Gly Asp Thr 245 250 255 Ile Thr Phe Glu Ala Thr Gly Asn Leu Ile Ala Pro Trp Tyr Ala Phe 260 265 270 Ala Leu Asn Arg Gly Ser Gly Ser Gly Ile Ile Thr Ser Asp Ala Pro 275 280 285 Val His Asp Cys Asn Thr Lys Cys Gln Thr Pro His Gly Ala Ile Asn 290 295 300 Ser Ser Leu Pro Phe Gln Asn Ile His Pro Val Thr Ile Gly Glu Cys 305 310 315 320 Pro Lys Tyr Val Arg Ser Thr Lys Leu Arg Met Ala Thr Gly Leu Arg 325 330 335 Asn Ile Pro Ser Ile Gln Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly 340 345 350 Phe Ile Glu Gly Gly Trp Thr Gly Met Ile Asp Gly Trp Tyr Gly Tyr 355 360 365 His His Gln Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Gln Lys Ser 370 375 380 Thr Gln Asn Ala Ile Asp Gly Ile Thr Asn Lys Val Asn Ser Val Ile 385 390 395 400 Glu Lys Met Asn Thr Gln 405 19 1741 DNA Influenza A virus 19 ctgtcaaaat ggagaaaata gtgcttcttc ttgcaacagt cagtcttgtt aaaagtgatc 60 agatttgcat tggttaccat gcaaacaact cgacagagca ggttgacaca ataatggaaa 120 agaatgttac tgttacacat gcccaagaca tactggaaag gacacacaac gggaagctct 180 gcgatctaaa tggagtgaaa cctctcattt tgagggattg tagtgtagct ggatggctcc 240 tcggaaaccc tatgtgtgac gaattcatca atgtgccgga atggtcttac atagtggaga 300 aggccagtcc agccaatgac ctctgttatc cagggaattt caacgactat gaagaactga 360 aacacctatt gagcagaata aaccattttg agaaaattca gatcatcccc aaaagttctt 420 ggtccaatca tgatgcctca tcaggggtga gctcagcatg tccatacctt gggaggtcct 480 cctttttcag aaatgtggta tggcttatca aaaagaacag tgcataccca acaataaaga 540 ggagctacaa taataccaac caagaagatc ttttggtact gtgggggatt caccatccta 600 atgatgcggc agagcagaca aagctctatc aaaatccaac cacctacatt tccgttggaa 660 catcaacact gaaccagaga ttggttccag aaatagctac tagacccaaa gtaaacgggc 720 aaagtggaag aatggagttc ttctggacaa ttttaaagcc gaatgatgcc atcaatttcg 780 agagtaatgg aaatttcatt gccccagaat atgcatacaa aattgtcaag aaaggggact 840 caacaattat gaaaagtgaa ttggaatatg gtaactgcaa caccaagtgt caaactccaa 900 tgggggcgat aaactctagt atgccattcc acaacataca ccccctcacc atcggggaat 960 gccccaaata tgtgaaatca aacagattag ttcttgcgac tggactcaga aatacccctc 1020 aaagggagag aagaagaaaa aagagaggac 
tatttggagc tatagcaggt tttatagagg 1080 gaggatggca gggcatggta gatggttggt atgggtacca ccatagcaat gagcagggga 1140 gtggatacgc tgcagacaaa gaatccactc aaaaggcaat agatggagtc accaataagg 1200 tcaactcgat cattaacaaa atgaacactc agtttgaggc cgttggaagg gaatttaata 1260 acttagaaag gagaatagag aatttaaaca agaaaatgga agacggattc ctagatgtct 1320 ggacttacaa tgctgaactt ctggttctca tggaaaatga gagaactctc gactttcatg 1380 actcaaatgt caagaacctt tacgacaagg tccgactaca gcttagggat aatgcaaagg 1440 aactgggtaa tggttgtttc gaattctatc acaaatgtga taatgaatgt atggaaagtg 1500 taaaaaacgg aacgtatgac tacccgcagt attcagaaga agcaagacta aacagagagg 1560 aaataagtgg agtaaaattg gaatcaatgg gaacttacca aatactgtca atttattcaa 1620 cagtggcgag ttccctagca ctggcaatca tggtagctgg tctatcttta tggatgtgct 1680 ccaatggatc gttacaatgc agaatttgca tttaaatttg tgagttcaga ttgtagttaa 1740 a 1741 20 568 PRT Influenza A virus 20 Met Glu Lys Ile Val Leu Leu Leu Ala Thr Val Ser Leu Val Lys Ser 1 5 10 15 Asp Gln Ile Cys Ile Gly Tyr His Ala Asn Asn Ser Thr Glu Gln Val 20 25 30 Asp Thr Ile Met Glu Lys Asn Val Thr Val Thr His Ala Gln Asp Ile 35 40 45 Leu Glu Arg Thr His Asn Gly Lys Leu Cys Asp Leu Asn Gly Val Lys 50 55 60 Pro Leu Ile Leu Arg Asp Cys Ser Val Ala Gly Trp Leu Leu Gly Asn 65 70 75 80 Pro Met Cys Asp Glu Phe Ile Asn Val Pro Glu Trp Ser Tyr Ile Val 85 90 95 Glu Lys Ala Ser Pro Ala Asn Asp Leu Cys Tyr Pro Gly Asn Phe Asn 100 105 110 Asp Tyr Glu Glu Leu Lys His Leu Leu Ser Arg Ile Asn His Phe Glu 115 120 125 Lys Ile Gln Ile Ile Pro Lys Ser Ser Trp Ser Asn His Asp Ala Ser 130 135 140 Ser Gly Val Ser Ser Ala Cys Pro Tyr Leu Gly Arg Ser Ser Phe Phe
145 150 155 160 Arg Asn Val Val Trp Leu Ile Lys Lys Asn Ser Ala Tyr Pro Thr Ile 165 170 175 Lys Arg Ser Tyr Asn Asn Thr Asn Gln Glu Asp Leu Leu Val Leu Trp 180 185 190 Gly Ile His His Pro Asn Asp Ala Ala Glu Gln Thr Lys Leu Tyr Gln 195 200 205 Asn Pro Thr Thr Tyr Ile Ser Val Gly Thr Ser Thr Leu Asn Gln Arg 210 215 220 Leu Val Pro Glu Ile Ala Thr Arg Pro Lys Val Asn Gly Gln Ser Gly 225 230 235 240 Arg Met Glu Phe Phe Trp Thr Ile Leu Lys Pro Asn Asp Ala Ile Asn 245 250 255 Phe Glu Ser Asn Gly Asn Phe Ile Ala Pro Glu Tyr Ala Tyr Lys Ile 260 265 270 Val Lys Lys Gly Asp Ser Thr Ile Met Lys Ser Glu Leu Glu Tyr Gly 275 280 285 Asn Cys Asn Thr Lys Cys Gln Thr Pro Met Gly Ala Ile Asn Ser Ser 290 295 300 Met Pro Phe His Asn Ile His Pro Leu Thr Ile Gly Glu Cys Pro Lys 305 310 315 320 Tyr Val Lys Ser Asn Arg Leu Val Leu Ala Thr Gly Leu Arg Asn Thr 325 330 335 Pro Gln Arg Glu Arg Arg Arg Lys Lys Arg Gly Leu Phe Gly Ala Ile 340 345 350 Ala Gly Phe Ile Glu Gly Gly Trp Gln Gly Met Val Asp Gly Trp Tyr 355 360 365 Gly Tyr His His Ser Asn Glu Gln Gly Ser Gly Tyr Ala Ala Asp Lys 370 375 380 Glu Ser Thr Gln Lys Ala Ile Asp Gly Val Thr Asn Lys Val Asn Ser 385 390 395 400 Ile Ile Asn Lys Met Asn Thr Gln Phe Glu Ala Val Gly Arg Glu Phe 405 410 415 Asn Asn Leu Glu Arg Arg Ile Glu Asn Leu Asn Lys Lys Met Glu Asp 420 425 430 Gly Phe Leu Asp Val Trp Thr Tyr Asn Ala Glu Leu Leu Val Leu Met 435 440 445 Glu Asn Glu Arg Thr Leu Asp Phe His Asp Ser Asn Val Lys Asn Leu 450 455 460 Tyr Asp Lys Val Arg Leu Gln Leu Arg Asp Asn Ala Lys Glu Leu Gly 465 470 475 480 Asn Gly Cys Phe Glu Phe Tyr His Lys Cys Asp Asn Glu Cys Met Glu 485 490 495 Ser Val Lys Asn Gly Thr Tyr Asp Tyr Pro Gln Tyr Ser Glu Glu Ala 500 505 510 Arg Leu Asn Arg Glu Glu Ile Ser Gly Val Lys Leu Glu Ser Met Gly 515 520 525 Thr Tyr Gln Ile Leu Ser Ile Tyr Ser Thr Val Ala Ser Ser Leu Ala 530 535 540 Leu Ala Ile Met Val Ala Gly Leu Ser Leu Trp Met Cys Ser Asn Gly 545 550 555 560 Ser Leu Gln Cys Arg Ile Cys Ile 565 21 1714 DNA Influenza A 
virus 21 gcaaaagcag gggaattact taactagcaa aatggaaaca atatcactaa taactatact 60 actagtagta acagcaagca atgcagataa aatctgcatc ggccaccagt caacaaactc 120 cacagaaact gtggacacgc taacagaaac caatgttcct gtgacacatg ccaaagaatt 180 gctccacaca gagcataatg gaatgctgtg tgcaacaagc ctgggacatc ccctcattct 240 agacacatgc actattgaag gactagtcta tggcaaccct tcttgtgacc tgctgttggg 300 aggaagagaa tggtcctaca tcgtcgaaag atcatcagct gtaaatggaa cgtgttaccc 360 tgggaatgta gaaaacctag aggaactcag gacacttttt agttccgcta gttcctacca 420 aagaatccaa atcttcccag acacaacctg gaatgtgact tacactggaa caagcagagc 480 atgttcaggt tcattctaca ggagtatgag atggctgact caaaagagcg gtttttaccc 540 tgttcaagac gcccaataca caaataacag gggaaagagc attcttttcg tgtggggcat 600 acatcaccca cccacctata ccgagcaaac aaatttgtac ataagaaacg acacaacaac 660 aagcgtgaca acagaagatt tgaataggac cttcaaacca gtgatagggc caaggcccct 720 tgtcaatggt ctgcagggaa gaattgatta ttattggtcg gtactaaaac caggccaaac 780 attgcgagta cgatccaatg ggaatctaat tgctccatgg tatggacacg ttctttcagg 840 agggagccat ggaagaatcc tgaagactga tttaaaaggt ggtaattgtg tagtgcaatg 900 tcagactgaa aaaggtggct taaacagtac attgccattc cacaatatca gtaaatatgc 960 atttggaacc tgccccaaat atgtaagagt taatagtctc aaactggcag tcggtctgag 1020 gaacgtgcct gctagatcaa gtagaggact atttggagcc atagctggat tcatagaagg 1080 aggttggcca ggactagtcg ctggctggta tggtttccag cattcaaatg atcaaggggt 1140 tggtatggct gcagataggg attcaactca aaaggcaatt gataaaataa catccaaggt 1200 gaataatata gtcgacaaga tgaacaagca atatgaaata attgatcatg aattcagtga 1260 ggttgaaact agactcaata tgatcaataa taagattgat gaccaaatac aagacgtatg 1320 ggcatataat gcagaattgc tagtactact tgaaaatcaa aaaacactcg atgagcatga 1380 tgcgaacgtg aacaatctat ataacaaggt gaagagggca ctgggctcca atgctatgga 1440 agatgggaaa ggctgtttcg agctatacca taaatgtgat gatcagtgca tggaaacaat 1500 tcggaacggg acctataata ggagaaagta tagagaggaa tcaagactag aaaggcagaa 1560 aatagagggg gttaagctgg aatctgaggg aacttacaaa atcctcacca tttattcgac 1620 tgtcgcctca tctcttgtgc ttgcaatggg gtttgctgcc ttcctgttct gggccatgtc 1680 caatggatct 
tgcagatgca acatttgtat ataa 1714 22 560 PRT Influenza A virus 22 Met Glu Thr Ile Ser Leu Ile Thr Ile Leu Leu Val Val Thr Ala Ser 1 5 10 15 Asn Ala Asp Lys Ile Cys Ile Gly His Gln Ser Thr Asn Ser Thr Glu 20 25 30 Thr Val Asp Thr Leu Thr Glu Thr Asn Val Pro Val Thr His Ala Lys 35 40 45 Glu Leu Leu His Thr Glu His Asn Gly Met Leu Cys Ala Thr Ser Leu 50 55 60 Gly His Pro Leu Ile Leu Asp Thr Cys Thr Ile Glu Gly Leu Val Tyr 65 70 75 80 Gly Asn Pro Ser Cys Asp Leu Leu Leu Gly Gly Arg Glu Trp Ser Tyr 85 90 95 Ile Val Glu Arg Ser Ser Ala Val Asn Gly Thr Cys Tyr Pro Gly Asn 100 105 110 Val Glu Asn Leu Glu Glu Leu Arg Thr Leu Phe Ser Ser Ala Ser Ser 115 120 125 Tyr Gln Arg Ile Gln Ile Phe Pro Asp Thr Thr Trp Asn Val Thr Tyr 130 135 140 Thr Gly Thr Ser Arg Ala Cys Ser Gly Ser Phe Tyr Arg Ser Met Arg 145 150 155 160 Trp Leu Thr Gln Lys Ser Gly Phe Tyr Pro Val Gln Asp Ala Gln Tyr 165 170 175 Thr Asn Asn Arg Gly Lys Ser Ile Leu Phe Val Trp Gly Ile His His 180 185 190 Pro Pro Thr Tyr Thr Glu Gln Thr Asn Leu Tyr Ile Arg Asn Asp Thr 195 200 205 Thr Thr Ser Val Thr Thr Glu Asp Leu Asn Arg Thr Phe Lys Pro Val 210 215 220 Ile Gly Pro Arg Pro Leu Val Asn Gly Leu Gln Gly Arg Ile Asp Tyr 225 230 235 240 Tyr Trp Ser Val Leu Lys Pro Gly Gln Thr Leu Arg Val Arg Ser Asn 245 250 255 Gly Asn Leu Ile Ala Pro Trp Tyr Gly His Val Leu Ser Gly Gly Ser 260 265 270 His Gly Arg Ile Leu Lys Thr Asp Leu Lys Gly Gly Asn Cys Val Val 275 280 285 Gln Cys Gln Thr Glu Lys Gly Gly Leu Asn Ser Thr Leu Pro Phe His 290 295 300 Asn Ile Ser Lys Tyr Ala Phe Gly Thr Cys Pro Lys Tyr Val Arg Val 305 310 315 320 Asn Ser Leu Lys Leu Ala Val Gly Leu Arg Asn Val Pro Ala Arg Ser 325 330 335 Ser Arg Gly Leu Phe Gly Ala Ile Ala Gly Phe Ile Glu Gly Gly Trp 340 345 350 Pro Gly Leu Val Ala Gly Trp Tyr Gly Phe Gln His Ser Asn Asp Gln 355 360 365 Gly Val Gly Met Ala Ala Asp Arg Asp Ser Thr Gln Lys Ala Ile Asp 370 375 380 Lys Ile Thr Ser Lys Val Asn Asn Ile Val Asp Lys Met Asn Lys Gln 385 390 395 400 Tyr Glu Ile Ile Asp His Glu Phe 
Ser Glu Val Glu Thr Arg Leu Asn 405 410 415 Met Ile Asn Asn Lys Ile Asp Asp Gln Ile Gln Asp Val Trp Ala Tyr 420 425 430 Asn Ala Glu Leu Leu Val Leu Leu Glu Asn Gln Lys Thr Leu Asp Glu 435 440 445 His Asp Ala Asn Val Asn Asn Leu Tyr Asn Lys Val Lys Arg Ala Leu 450 455 460 Gly Ser Asn Ala Met Glu Asp Gly Lys Gly Cys Phe Glu Leu Tyr His 465 470 475 480 Lys Cys Asp Asp Gln Cys Met Glu Thr Ile Arg Asn Gly Thr Tyr Asn 485 490 495 Arg Arg Lys Tyr Arg Glu Glu Ser Arg Leu Glu Arg Gln Lys Ile Glu 500 505 510 Gly Val Lys Leu Glu Ser Glu Gly Thr Tyr Lys Ile Leu Thr Ile Tyr 515 520 525 Ser Thr Val Ala Ser Ser Leu Val Leu Ala Met Gly Phe Ala Ala Phe 530 535 540 Leu Phe Trp Ala Met Ser Asn Gly Ser Cys Arg Cys Asn Ile Cys Ile 545 550 555 560 23 1494 DNA Artificial sequence Human Codon Optimized Influenza A Virus H1N1 Nucleoprotein 23 atggcctctc aggggacaaa gcggtcctac gagcagatgg agaccgatgg agaaaggcag 60 aatgctaccg agatacgagc ctcggtggga aagatgatag gcgggatcgg taggttttac 120 attcagatgt gcactgagct taagctgagt gattatgaag gtagactgat acagaattca 180 ctcaccatcg aaagaatggt gctgagtgca ttcgacgagc gccgaaacaa atacctggag 240 gaacatcctt cagccggcaa ggatcccaag aaaactggcg gacccatcta ccggagggtg 300 aacgggaaat ggatgcgcga gctgattctg tatgataaag aagaaatccg gcgtatctgg 360 aggcaagcta acaacggaga tgatgccaca gccggactga cgcatatgat gatttggcac 420 tctaacctta acgacgcgac ctaccagagg acccgggccc tcgtgagaac aggcatggat 480 ccacgaatgt gctcacttat gcaggggtcc accctgccaa ggaggagcgg ggcagctggt 540 gccgcagtca aaggggtggg aactatggtg atggagctag tgcgtatgat taagcgcggc 600 ataaatgacc gcaatttctg gcggggggaa aacggacgaa agacacgcat tgcatatgaa 660 cgcatgtgca atattctcaa ggggaaattc cagacggctg ctcaaaaggc catgatggac 720 caggtgaggg agtcaagaaa cccaggcaac gccgagtttg aagacctgac cttcctggca 780 cggtctgctc taatcctcag aggtagtgta gcacacaaga gttgtcttcc ggcttgtgtg 840 tatggaccag ctgttgcatc agggtatgat ttcgaaaggg aaggctacag cctagttggt 900 atcgacccgt ttagactctt acagaattcc caagtctatt ccctgatcag acccaacgag 960 aatcctgctc acaaaagcca gttggtctgg atggcctgtc 
actccgccgc cttcgaggac 1020 ctccgggtct tgtcctttat caaaggcact aaggttctgc cccgcggcaa gttaagcact 1080 aggggagttc agatcgcaag taacgagaac atggagacaa tggagtctag caccttggaa 1140 ttgcgctccc gttattgggc gatccggaca agaagcggag gtaacacgaa tcagcaacgg 1200 gccagcgcgg gccaaatttc gatacagcct actttcagcg tgcagcggaa tctccccttc 1260 gatcgcacca ccgtaatggc cgcgtttagt ggtaatacag agggcagaac ttctgacatg 1320 cgaacagaga ttatccgtat gatggagagc gctcgacctg aagatgtgtc atttcagggc 1380 agaggcgtat ttgagctgtc cgacgagaaa gcagcctctc ctattgtccc ctctttcgac 1440 atgtccaacg aggggagcta cttctttggc gacaatgccg aagaatacga caat 1494 24 1497 DNA Artificial Sequence Human Codon Optimized Influenza A Virus H1N1 Nucleoprotein 24 atggccagcc agggcaccaa gcggagctac gagcagatgg agaccgacgg cgagcggcag 60 aacgccaccg agatccgggc cagcgtgggc aagatgatcg gcggcatcgg ccggttctac 120 atccagatgt gcaccgagct gaagctgagc gactacgagg gccggctgat ccagaacagc 180 ctgaccatcg agcggatggt gctgagcgcc ttcgacgagc ggcggaacaa gtacctggag 240 gagcacccca gcgccggcaa ggaccccaag aagaccggcg gccccatcta ccggcgggtg 300 aacggcaagt ggatgcggga gctgatcctg tacgacaagg aggagatccg gcggatctgg 360 cggcaggcca acaacggcga cgacgccacc gccggcctga cccacatgat gatctggcac 420 agcaacctga acgacgccac ctaccagcgg acccgggccc tggtgcggac cggcatggac 480 ccccggatgt gcagcctgat gcagggcagc accctgcccc ggcggagcgg cgccgccggc 540 gccgccgtga agggcgtggg caccatggtg atggagctgg tgcggatgat caagcggggc 600 atcaacgacc ggaacttctg gcggggcgag aacggccgga agacccggat cgcctacgag 660 cggatgtgca acatcctgaa gggcaagttc cagaccgccg cccagaaggc catgatggac 720 caggtgcggg agagccggaa ccccggcaac gccgagttcg aggacctgac cttcctggcc 780 cggagcgccc tgatcctgcg gggcagcgtg gcccacaaga gctgcctgcc cgcctgcgtg 840 tacggccccg ccgtggccag cggctacgac ttcgagcggg agggctacag cctggtgggc 900 atcgacccct tccggctgct gcagaacagc caggtgtaca gcctgatccg gcccaacgag 960 aaccccgccc acaagagcca gctggtgtgg atggcctgcc acagcgccgc cttcgaggac 1020 ctgcgggtgc tgagcttcat caagggcacc aaggtgctgc cccggggcaa gctgagcacc 1080 cggggcgtgc agatcgccag caacgagaac atggagacca tggagagcag 
caccctggag 1140 ctgcggagcc ggtactgggc catccggacc cggagcggcg gcaacaccaa ccagcagcgg 1200 gccagcgccg gccagatcag catccagccc accttcagcg tgcagcggaa cctgcccttc 1260 gaccggacca ccgtgatggc cgccttcagc ggcaacaccg agggccggac cagcgacatg 1320 cggaccgaga tcatccggat gatggagagc gcccggcccg aggacgtgag cttccagggc 1380 cggggcgtgt tcgagctgag cgacgagaag gccgccagcc ccatcgtgcc cagcttcgac 1440 atgagcaacg agggcagcta cttcttcggc gacaacgccg aggagtacga caactga 1497 25 1497 DNA Artificial sequence Human Codon Optimized Influenza A Virus H1N1 Nucleoprotein 25 atggcctcac agggcaccaa gcggagttat gagcagatgg agaccgatgg cgagagacag 60 aacgccacag agatcagagc ctcagttggc aagatgatcg gcggcatcgg ccggttctat 120 atccagatgt gcacggagct gaagctgagc gactacgagg gcagactgat tcagaactct 180 ctgaccatcg agagaatggt cctgagtgcc ttcgatgaga gacgaaacaa gtatctggag 240 gagcatccct ccgccggcaa ggaccccaag aagacgggcg gccccatata tagaagagtt 300 aacggcaagt ggatgagaga gctgatcctg tacgataagg aggagatccg cagaatatgg 360 aggcaggcca acaacggcga cgatgccact gccggcctga cacatatgat gatatggcac 420 agtaacctga acgacgccac ctaccagaga acaagggccc tggttcgcac gggcatggat 480 cccagaatgt gttcactgat gcagggctct acactgccca gaaggtctgg cgccgccggc 540 gccgccgtca agggcgttgg cacaatggtg atggagctgg tgcggatgat caagagaggc 600 attaacgatc ggaacttttg gaggggcgag aacggcagaa agaccaggat agcctacgag 660 cgaatgtgca acattctgaa gggcaagttc cagactgccg cccagaaggc catgatggat 720 caggtgcggg agagcagaaa ccccggcaac gccgagttcg aggacctgac tttcctggcc 780 agatctgccc tgatactgag gggctctgta gcccacaagt cctgcctgcc cgcctgcgtg 840 tacggccccg ccgtggcctc cggctatgac ttcgagcgag agggctactc cctggtaggc 900 atcgatccct ttagactgct gcagaactct caggtctaca gtctgattag acccaacgag 960 aaccccgccc ataagagcca gctggtgtgg atggcctgcc acagtgccgc cttcgaggac 1020 ctgagggtgc tgtcttttat aaagggcaca aaggtgctgc cccgcggcaa gctgtctact 1080 aggggcgtcc agatagcctc caacgagaac atggagacaa tggagtctag tactctggag 1140 ctgaggtcta ggtactgggc catcaggact aggagcggcg gcaacaccaa ccagcagagg 1200 gccagcgccg gccagatcag cattcagccc accttcagtg tacagagaaa 
cctgcccttt 1260 gatagaacta ctgttatggc cgccttctct ggcaacactg agggcagaac tagtgacatg 1320 cgaacagaga tcataagaat gatggagtcg gcccgtcccg aggatgtgtc ctttcagggc 1380 aggggcgtct tcgagctgag cgacgagaag gccgccagcc ccatcgtacc ctctttcgat 1440 atgagtaacg agggctcgta cttttttggc gacaacgccg aggagtatga taactga 1497 26 756 DNA Artificial sequence Human Codon Optimized Influenza A Virus M1 Protein 26 atgagcttgc taacagaagt ggaaacctat gtcctcagta tcattcctag cggcccctta 60 aaagccgaaa tcgctcagcg gctcgaggat gtttttgccg gcaagaacac cgacctggag 120 gtattgatgg agtggctgaa aacgcgacct attctgagcc ccctgactaa gggaatactc 180 ggcttcgttt ttacattgac cgtgccctca gagaggggtc tccaaaggag gcgcttcgtg 240 cagaacgcct taaacgggaa cggggaccca aataatatgg ataaggcagt gaaactgtat 300 cgcaaattaa agcgggagat aaccttccat ggagccaagg agatctccct gtcttactct 360 gcaggtgctc tcgcgtcgtg tatgggactt atctacaacc gaatgggcgc cgtcacaaca 420 gaagtggctt tcgggctggt gtgcgcaact tgcgaacaga ttgctgacag tcagcaccgg 480 tcccaccgtc aaatggtcac caccaccaat ccgctgatta gacatgaaaa tcgcatggtt 540 ctagcatcaa ctacagccaa agcaatggaa caaatggccg gaagctccga gcaggctgcc 600 gaggcgatgg aggtggcgtc ccaggccaga cagatggtac aggctatgag aactatcggt 660 acgcacccaa gttcttcagc tgggctgaag aatgatcttc ttgagaacct gcaggcctac 720 caaaagcgga tgggcgtcca gatgcagaga tttaaa 756 27 756 DNA Artificial sequence Human Codon Optimized Influenza A Virus M1 Protein 27 atgagcctgc tgaccgaggt ggagacctac gtgctgagca tcatccccag cggccccctg 60 aaggccgaga tcgcccagag gctggaggac gtgttcgccg gcaagaacac cgacctggag 120 gtgctgatgg agtggctgaa gaccaggccc atcctgagcc ccctgaccaa gggcatcctg 180 ggcttcgtgt tcaccctgac cgtgcccagc gagaggggcc tgcagaggag gaggttcgtg 240 cagaacgccc tgaacggcaa cggcgacccc aacaacatgg acaaggccgt gaagctgtac 300 aggaagctga agagggagat caccttccac ggcgccaagg agatcagcct gagctacagc 360 gccggcgccc tggccagctg catgggcctg atctacaaca ggatgggcgc cgtgaccacc 420 gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacagg 480 agccacaggc agatggtgac caccaccaac cccctgatca ggcacgagaa caggatggtg 540 ctggccagca 
ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 600 gaggccatgg aggtggccag ccaggccagg cagatggtgc aggccatgag gaccatcggc 660 acccacccca gcagcagcgc cggcctgaag aacgacctgc tggagaacct gcaggcctac 720 cagaagagga tgggcgtgca gatgcagagg ttcaag 756 28 756 DNA Artificial sequence Human Codon Optimized Influenza A Virus M1 Protein 28 atgagtctgc tgacagaggt tgagacgtac gtgctgtcca tcattccctc aggccccctg 60 aaggccgaga ttgcccagag actggaggac gtcttcgccg gcaagaacac cgatctggag 120 gtgctgatgg agtggctgaa gactcgcccc atcctgtctc ccctgacaaa gggcatcctg 180 ggcttcgtat ttacactgac cgtcccctcc gagagaggcc tgcagcggag gaggttcgtt 240 cagaacgccc tgaacggcaa cggcgatccc aacaacatgg ataaggccgt gaagctgtat 300 agaaagctga agcgagagat cacatttcat ggcgccaagg agatatcgct gagctacagt 360 gccggcgccc tggcctcttg catgggcctg atatacaaca gaatgggcgc cgttactaca 420 gaggtagcct ttggcctggt ctgcgccact tgcgagcaga tcgccgactc tcagcataga 480 tctcacagac agatggtgac gactacaaac cccctgatac ggcacgagaa caggatggtg 540 ctggcctcta ctaccgccaa ggccatggag cagatggccg gcagcagtga gcaggccgcc 600 gaggccatgg aggtagcctc acaggccagg cagatggtgc aggccatgcg aaccatcggc 660 actcacccct ccagctctgc cggcctgaag aacgacctgc tggagaacct gcaggcctat 720
cagaagagaa tgggcgtaca gatgcagagg ttcaag 756 29 294 DNA Artificial sequence Human Codon Optimized Influenza A Virus M2 Protein 29 atgagtcttc taaccgaggt cgaaacgcct atcagaaacg aatgggggtg cagatgcaac 60 ggttcaagtg atcctctcgc tattgccgca aatatcattg ggatcttgca cttgacattg 120 tggattcttg atcgtctttt tttcaaatgc atttaccgtc gctttaaata cggactgaaa 180 ggagggcctt ctacggaagg agtgccaaag tctatgaggg aagaatatcg aaaggaacag 240 cagagtgctg tggatgctga cgatggtcat tttgtcagca tagagctgga gtaa 294 30 294 DNA Artificial sequence Human Codon Optimized Influenza A Virus M2 Protein 30 atgagcctgc tgaccgaggt ggagaccccc atccggaacg agtggggctg ccggtgcaac 60 ggcagcagcg accccctggc catcgccgcc aacatcatcg gcatcctgca cctgaccctg 120 tggatcctgg accggctgtt cttcaagtgc atctaccggc ggttcaagta cggcctgaag 180 ggcggcccca gcaccgaggg cgtgcccaag agcatgcggg aggagtaccg gaaggagcag 240 cagagcgccg tggacgccga cgacggccac ttcgtgagca tcgagctgga gtga 294 31 294 DNA Artificial sequence Human Codon-Optimized Influenza A Virus M2 Protein 31 atgtctctgc tgacagaggt ggagacaccc ataaggaacg agtggggctg caggtgcaac 60 ggctctagtg atcccctggc catcgccgcc aacatcattg gcatactgca tctgaccctg 120 tggatcctgg atagactgtt ctttaagtgc atttacagac gatttaagta tggcctgaag 180 ggcggcccct caactgaggg cgtgcccaag agtatgagag aggagtaccg gaaggagcag 240 cagagcgccg ttgacgccga tgacggccac ttcgtctcca tcgagctgga gtga 294 32 1566 DNA Artificial sequence Human Codon Optimized Coding Region Encoding eM2NP 32 atgagccttc tcacagaagt ggaaacacct atcagaaatg aatggggatg cagatgcaat 60 gggtcgagtg atatggcctc tcaaggtacg aaaagaagct acgagcaaat ggaaacggat 120 ggagaaagac aaaacgcgac cgaaatcaga gcatccgtcg ggaagatgat tggaggaatc 180 ggacgattct acatccagat gtgcacagag ctaaagctat cggattatga agggagacta 240 atacaaaata gcctaactat cgagagaatg gtgctgtctg catttgacga aaggagaaac 300 aaatacctgg aagaacaccc ctctgcaggg aaagacccaa aaaaaactgg aggtccgata 360 taccggagag tcaacggtaa atggatgaga gagctgatct tgtatgataa ggaagaaata 420 agacgcatct ggcggcaagc taataatgga gacgacgcta ctgcagggct cacgcatatg 480 atgatctggc actctaattt 
gaatgatgca acgtaccaaa gaacccgcgc acttgtgcgg 540 accggaatgg accctcgtat gtgcagcctt atgcaggggt ccacactgcc cagaaggtcc 600 ggagcagctg gagcagcagt aaagggggtt ggaaccatgg tgatggagct ggtgagaatg 660 attaagaggg ggatcaatga caggaacttc tggcgaggag aaaacgggag aaaaactagg 720 atagcatatg agaggatgtg taacatcctc aaaggaaaat tccaaaccgc tgctcagaaa 780 gcaatgatgg atcaagtacg cgaaagtaga aatcctggaa atgcagagtt tgaagatctc 840 actttcctcg cgcgaagcgc tctcatcctc agagggagtg tcgctcataa aagttgcctg 900 cctgcctgcg tatatggtcc tgccgtggca agtggatacg actttgagag agaggggtac 960 tctcttgttg gaatagatcc attcagatta cttcagaatt cccaggtgta cagtttaata 1020 aggccaaacg aaaatcctgc acacaaatca caacttgttt ggatggcatg ccatagtgcc 1080 gcattcgaag atctaagagt tctctctttc atcaaaggta caaaggtcct tccaagggga 1140 aaactctcta ccagaggggt acaaatagct tcaaatgaga acatggagac aatggaatct 1200 agcacattgg aattgagaag taggtattgg gccattagaa ccaggagtgg aggcaatact 1260 aatcaacagc gggcttctgc cggtcaaatt agcatacaac ctactttttc agtgcaacgg 1320 aatctccctt ttgataggac aactgtcatg gcggcattct ctggaaatac cgaaggaagg 1380 acttccgata tgaggactga gatcattagg atgatggaaa gtgcccgacc tgaagacgtc 1440 agttttcaag gaagaggtgt gttcgaactc tctgacgaaa aggcagctag cccaatcgtt 1500 ccttcttttg atatgtcaaa tgaaggatcc tacttcttcg gcgataatgc ggaggaatat 1560 gacaac 1566 33 1566 DNA Artificial sequence Human Codon Optimized Coding Region Encoding eM2NP 33 atgagcctgc tgaccgaggt ggagaccccc atcaggaacg agtggggctg caggtgcaac 60 ggcagcagcg acatggccag ccagggcacc aagaggagct acgagcagat ggagaccgac 120 ggcgagaggc agaacgccac cgagatcagg gccagcgtgg gcaagatgat cggcggcatc 180 ggcaggttct acatccagat gtgcaccgag ctgaagctga gcgactacga gggcaggctg 240 atccagaaca gcctgaccat cgagaggatg gtgctgagcg ccttcgacga gaggaggaac 300 aagtacctgg aggagcaccc cagcgccggc aaggacccca agaagaccgg cggccccatc 360 tacaggaggg tgaacggcaa gtggatgagg gagctgatcc tgtacgacaa ggaggagatc 420 aggaggatct ggaggcaggc caacaacggc gacgacgcca ccgccggcct gacccacatg 480 atgatctggc acagcaacct gaacgacgcc acctaccaga ggaccagggc cctggtgagg 540 accggcatgg accccaggat 
gtgcagcctg atgcagggca gcaccctgcc caggaggagc 600 ggcgccgccg gcgccgccgt gaagggcgtg ggcaccatgg tgatggagct ggtgaggatg 660 atcaagaggg gcatcaacga caggaacttc tggaggggcg agaacggcag gaagaccagg 720 atcgcctacg agaggatgtg caacatcctg aagggcaagt tccagaccgc cgcccagaag 780 gccatgatgg accaggtgag ggagagcagg aaccccggca acgccgagtt cgaggacctg 840 accttcctgg ccaggagcgc cctgatcctg aggggcagcg tggcccacaa gagctgcctg 900 cccgcctgcg tgtacggccc cgccgtggcc agcggctacg acttcgagag ggagggctac 960 agcctggtgg gcatcgaccc cttcaggctg ctgcagaaca gccaggtgta cagcctgatc 1020 aggcccaacg agaaccccgc ccacaagagc cagctggtgt ggatggcctg ccacagcgcc 1080 gccttcgagg acctgagggt gctgagcttc atcaagggca ccaaggtgct gcccaggggc 1140 aagctgagca ccaggggcgt gcagatcgcc agcaacgaga acatggagac catggagagc 1200 agcaccctgg agctgaggag caggtactgg gccatcagga ccaggagcgg cggcaacacc 1260 aaccagcaga gggccagcgc cggccagatc agcatccagc ccaccttcag cgtgcagagg 1320 aacctgccct tcgacaggac caccgtgatg gccgccttca gcggcaacac cgagggcagg 1380 accagcgaca tgaggaccga gatcatcagg atgatggaga gcgccaggcc cgaggacgtg 1440 agcttccagg gcaggggcgt gttcgagctg agcgacgaga aggccgccag ccccatcgtg 1500 cccagcttcg acatgagcaa cgagggcagc tacttcttcg gcgacaacgc cgaggagtac 1560 gacaac 1566 34 1566 DNA Artificial Sequence Human Codon Optimized Coding Region Encoding NPeM2 34 atggcaagcc agggcacaaa acgcagttac gagcagatgg agactgatgg tgagaggcag 60 aacgccaccg aaatccgggc ctccgtcggc aagatgattg gtggcatcgg aagattctat 120 atccagatgt gcacggagct taagctgtcc gattacgagg ggcgcttaat acagaactct 180 ctgactatcg agcgaatggt cttgagcgcc tttgatgagc ggcgtaataa gtatctcgaa 240 gagcaccctt ctgctggaaa agaccccaaa aagaccgggg gacctatcta ccgacgtgtg 300 aacggaaaat ggatgcgcga actgatactg tacgacaagg aggagatccg taggatctgg 360 agacaggcta ataacggaga tgatgccaca gctgggctga cccatatgat gatatggcat 420 agcaacctga acgacgcaac ctatcaacgc actagagcac tcgtgaggac cggtatggac 480 ccacgcatgt gctcattgat gcaaggtagc acattgcctc ggaggtcagg cgccgccggt 540 gccgccgtaa agggggtggg cacaatggtg atggaactgg tccgaatgat caaaagaggc 600 atcaatgaca ggaacttttg 
gcgcggagaa aacgggcgca agacccgcat tgcctacgag 660 cgcatgtgta acattttaaa aggcaaattc cagactgcag cccagaaagc aatgatggac 720 caagttagag aaagtagaaa tcccgggaat gccgagtttg aagacctgac tttcctggct 780 agaagcgcct tgatcctgcg gggctctgtc gcccacaaga gctgcctccc cgcttgcgtt 840 tacggccccg cggtcgcaag tggctacgat ttcgagaggg aggggtattc cctagttggg 900 atcgatccct tccggctcct acagaattct caggtgtata gtctgattag acccaacgaa 960 aacccggctc acaagagtca gcttgtttgg atggcatgtc actcagcagc tttcgaagac 1020 ctgcgggtac tcagctttat taaaggcacc aaggtcctgc caagaggaaa gctctccacg 1080 aggggagtac agatcgcctc aaacgagaac atggagacaa tggaaagctc cacccttgag 1140 cttaggtcgc ggtattgggc tattagaaca cgatctgggg ggaataccaa tcagcaacga 1200 gcgagtgctg gtcagatttc cattcagcct actttctctg tgcaacggaa tctaccattt 1260 gacaggacaa ctgtgatggc agcgttctcc ggcaatacag aaggacgaac atcagacatg 1320 aggaccgaaa ttatccggat gatggagagc gctcggccag aagatgtgtc gttccagggc 1380 cggggcgtgt ttgagctcag cgacgagaag gccgcgtctc caattgtgcc ttcctttgat 1440 atgagcaatg aggggtcata ctttttcgga gacaatgccg aagagtatga taatatgtct 1500 ctgcttaccg aggtggaaac gccgatacgc aacgaatggg gttgtcgttg taacggctcc 1560 agtgat 1566 35 1566 DNA Artificial sequence Human Codon Optimized Coding Region Encoding NPeM2 35 atggccagcc agggcaccaa gaggagctac gagcagatgg agaccgacgg cgagaggcag 60 aacgccaccg agatcagggc cagcgtgggc aagatgatcg gcggcatcgg caggttctac 120 atccagatgt gcaccgagct gaagctgagc gactacgagg gcaggctgat ccagaacagc 180 ctgaccatcg agaggatggt gctgagcgcc ttcgacgaga ggaggaacaa gtacctggag 240 gagcacccca gcgccggcaa ggaccccaag aagaccggcg gccccatcta caggagggtg 300 aacggcaagt ggatgaggga gctgatcctg tacgacaagg aggagatcag gaggatctgg 360 aggcaggcca acaacggcga cgacgccacc gccggcctga cccacatgat gatctggcac 420 agcaacctga acgacgccac ctaccagagg accagggccc tggtgaggac cggcatggac 480 cccaggatgt gcagcctgat gcagggcagc accctgccca ggaggagcgg cgccgccggc 540 gccgccgtga agggcgtggg caccatggtg atggagctgg tgaggatgat caagaggggc 600 atcaacgaca ggaacttctg gaggggcgag aacggcagga agaccaggat cgcctacgag 660 aggatgtgca acatcctgaa 
gggcaagttc cagaccgccg cccagaaggc catgatggac 720 caggtgaggg agagcaggaa ccccggcaac gccgagttcg aggacctgac cttcctggcc 780 aggagcgccc tgatcctgag gggcagcgtg gcccacaaga gctgcctgcc cgcctgcgtg 840 tacggccccg ccgtggccag cggctacgac ttcgagaggg agggctacag cctggtgggc 900 atcgacccct tcaggctgct gcagaacagc caggtgtaca gcctgatcag gcccaacgag 960 aaccccgccc acaagagcca gctggtgtgg atggcctgcc acagcgccgc cttcgaggac 1020 ctgagggtgc tgagcttcat caagggcacc aaggtgctgc ccaggggcaa gctgagcacc 1080 aggggcgtgc agatcgccag caacgagaac atggagacca tggagagcag caccctggag 1140 ctgaggagca ggtactgggc catcaggacc aggagcggcg gcaacaccaa ccagcagagg 1200 gccagcgccg gccagatcag catccagccc accttcagcg tgcagaggaa cctgcccttc 1260 gacaggacca ccgtgatggc cgccttcagc ggcaacaccg agggcaggac cagcgacatg 1320 aggaccgaga tcatcaggat gatggagagc gccaggcccg aggacgtgag cttccagggc 1380 aggggcgtgt tcgagctgag cgacgagaag gccgccagcc ccatcgtgcc cagcttcgac 1440 atgagcaacg agggcagcta cttcttcggc gacaacgccg aggagtacga caacatgagc 1500 ctgctgaccg aggtggagac ccccatcagg aacgagtggg gctgcaggtg caacggcagc 1560 agcgac 1566 36 1683 DNA Artificial sequence Human Codon Optimized Coding Region Encoding IBV NP Protein 36 atgtcgaaca tggacatcga cagcattaac acaggtacta ttgacaaaac ccccgaagaa 60 ctaacccctg gaacctcagg agcaacacgc ccaataatca aaccggccac cctcgcgccc 120 cctagcaata agaggacccg caatccaagt cctgagagaa ccactacttc atctgaaacg 180 gatatcggtc ggaaaattca aaaaaagcag acgcccacag agataaagaa gtctgtttac 240 aaaatggtgg taaagctcgg tgagttttat aaccagatga tggtcaaggc ggggcttaac 300 gacgatatgg aacgaaatct tatacagaat gcacaggcag tagagagaat actgctggcc 360 gctactgatg acaagaaaac ggagtaccaa aaaaaacgga atgctcgaga tgtgaaagaa 420 ggaaaagaag aaattgacca taacaaaact ggggggacat tctataagat ggtgcgggac 480 gataagacaa tctattttag cccgataaag attaccttcc tgaaggagga ggttaaaaca 540 atgtacaaga cgacgatggg cagcgatggg ttttccggac ttaatcatat aatgattggt 600 cactcgcaga tgaacgatgt atgtttccag cgctccaagg gcttaaagag ggtaggtctt 660 gacccgtctc taatatcaac tttcgcagga tccactttgc cgaggcgttc tggcacgaca 720 ggcgtggcta 
tcaagggcgg ggggacgctg gtcgatgagg ccattcgctt tattggtagg 780 gccatggccg atagagggct tctacgagac atcaaagcaa aaacagcata tgagaagata 840 ttattaaact taaagaacaa atgctccgct cctcagcaaa aagcgctcgt tgaccaagta 900 atcggttcga gaaatccagg cattgccgat atcgaagatc ttacactctt ggcgcgaagc 960 atggtcgttg tccgtcccag tgtcgctagt aaggtggtac taccaatctc gatttacgca 1020 aaaattccac aactcggctt taatacagag gaatattcta tggtaggtta tgaagccatg 1080 gcgttgtata atatggctac accagtctcc atattgcgta tgggagatga cgcaaaagat 1140 aagagtcaac tctttttcat gtcatgtttc ggcgcagcgt acgaagatct gagagtacta 1200 tccgccttga ctggaacgga atttaaacca cggtcagcct taaagtgtaa gggttttcac 1260 gtccctgcta aggagcaagt tgagggaatg ggcgcggcac tgatgagtat aaaattacaa 1320 ttttgggctc caatgacgcg ttcgggaggg aatgaagttt ctggtgaggg agggagtgga 1380 cagatatcat gctcgcccgt gttcgcggtt gaacgtccga ttgctttgag taagcaggcg 1440 gttaggcgga tgttaagtat gaatgtggag ggccgcgatg ccgacgtcaa aggcaactta 1500 ttaaaaatga tgaacgacag catggcaaag aagactagtg ggaatgcttt tatagggaaa 1560 aaaatgttcc aaataagtga caaaaacaaa gtgaacccca tcgaaatacc tatcaagcaa 1620 accatcccga atttcttttt cggtcgagac accgcggagg actacgatga cctagattac 1680 taa 1683 37 1683 DNA Artificial sequence Human Codon Optimized Coding Region Encoding IBV NP Protein 37 atgagcaaca tggacatcga cagcatcaac accggcacca tcgacaagac ccccgaggag 60 ctgacccccg gcaccagcgg cgccacccgg cccatcatca agcccgccac cctggccccc 120 cccagcaaca agcggacccg gaaccccagc cccgagcgga ccaccaccag cagcgagacc 180 gacatcggcc ggaagatcca gaagaagcag acccccaccg agatcaagaa gagcgtgtac 240 aagatggtgg tgaagctggg cgagttctac aaccagatga tggtgaaggc cggcctgaac 300 gacgacatgg agcggaacct gatccagaac gcccaggccg tggagcggat cctgctggcc 360 gccaccgacg acaagaagac cgagtaccag aagaagcgga acgcccggga cgtgaaggag 420 ggcaaggagg agatcgacca caacaagacc ggcggcacct tctacaagat ggtgcgggac 480 gacaagacca tctacttcag ccccatcaag atcaccttcc tgaaggagga ggtgaagacc 540 atgtacaaga ccaccatggg cagcgacggc ttcagcggcc tgaaccacat catgatcggc 600 cacagccaga tgaacgacgt gtgcttccag cggagcaagg gcctgaagcg ggtgggcctg 660 
gaccccagcc tgatcagcac cttcgccggc agcaccctgc cccggcggag cggcaccacc 720 ggcgtggcca tcaagggcgg cggcaccctg gtggacgagg ccatccggtt catcggccgg 780 gccatggccg accggggcct gctgcgggac atcaaggcca agaccgccta cgagaagatc 840 ctgctgaacc tgaagaacaa gtgcagcgcc ccccagcaga aggccctggt ggaccaggtg 900 atcggcagcc ggaaccccgg catcgccgac atcgaggacc tgaccctgct ggcccggagc 960 atggtggtgg tgcggcccag cgtggccagc aaggtggtgc tgcccatcag catctacgcc 1020 aagatccccc agctgggctt caacaccgag gagtacagca tggtgggcta cgaggccatg 1080 gccctgtaca acatggccac ccccgtgagc atcctgcgga tgggcgacga cgccaaggac 1140 aagagccagc tgttcttcat gagctgcttc ggcgccgcct acgaggacct gcgggtgctg 1200 agcgccctga ccggcaccga gttcaagccc cggagcgccc tgaagtgcaa gggcttccac 1260 gtgcccgcca aggagcaggt ggagggcatg ggcgccgccc tgatgagcat caagctgcag 1320 ttctgggccc ccatgacccg gagcggcggc aacgaggtga gcggcgaggg cggcagcggc 1380 cagatcagct gcagccccgt gttcgccgtg gagcggccca tcgccctgag caagcaggcc 1440 gtgcggcgga tgctgagcat gaacgtggag ggccgggacg ccgacgtgaa gggcaacctg 1500 ctgaagatga tgaacgacag catggccaag aagaccagcg gcaacgcctt catcggcaag 1560 aagatgttcc agatcagcga caagaacaag gtgaacccca tcgagatccc catcaagcag 1620 accatcccca acttcttctt cggccgggac accgccgagg actacgacga cctggactac 1680 tga 1683 38 1683 PRT Artificial sequence Human Codon Optimized Coding Region Encoding IBV NP Protein 38 Ala Thr Gly Thr Cys Thr Ala Ala Cys Ala Thr Gly Gly Ala Cys Ala 1 5 10 15 Thr Cys Gly Ala Cys Thr Cys Thr Ala Thr Ala Ala Ala Cys Ala Cys 20 25 30 Ala Gly Gly Cys Ala Cys Gly Ala Thr Cys Gly Ala Thr Ala Ala Gly 35 40 45 Ala Cys Cys Cys Cys Cys Gly Ala Gly Gly Ala Gly Cys Thr Gly Ala 50 55 60 Cys Ala Cys Cys Cys Gly Gly Cys Ala Cys Thr Thr Cys Ala Gly Gly 65 70 75 80 Cys Gly Cys Cys Ala Cys Cys Ala Gly Ala Cys Cys Cys Ala Thr Ala 85 90 95 Ala Thr Ala Ala Ala Gly Cys Cys Cys Gly Cys Cys Ala Cys Thr Cys 100 105 110 Thr Gly Gly Cys Cys Cys Cys Cys Cys Cys Cys Thr Cys Thr Ala Ala 115 120 125 Cys Ala Ala Gly Ala Gly Gly Ala Cys Gly Ala Gly Gly Ala Ala Cys 130 135 140 Cys Cys Cys Thr Cys Thr 
Cys Cys Cys Gly Ala Gly Cys Gly Cys Ala 145 150 155 160 Cys Cys Ala Cys Ala Ala Cys Gly Ala Gly Thr Ala Gly Cys Gly Ala 165 170 175 Gly Ala Cys Gly Gly Ala Cys Ala Thr Cys Gly Gly Cys Ala Gly Gly 180 185 190 Ala Ala Gly Ala Thr Ala Cys Ala Gly Ala Ala Gly Ala Ala Gly Cys 195 200 205 Ala Gly Ala Cys Thr Cys Cys Cys Ala Cys Thr Gly Ala Gly Ala Thr 210 215 220 Thr Ala Ala Gly Ala Ala Gly Thr Cys Cys Gly Thr Gly Thr Ala Thr 225 230 235 240 Ala Ala Gly Ala Thr Gly Gly Thr Gly Gly Thr Thr Ala Ala Gly Cys 245 250 255 Thr Gly Gly Gly Cys Gly Ala Gly Thr Thr Thr Thr Ala Cys Ala Ala 260 265 270 Cys Cys Ala Gly Ala Thr Gly Ala Thr Gly Gly Thr Gly Ala Ala Gly 275 280 285 Gly Cys Cys Gly Gly Cys Cys Thr Gly Ala Ala Cys Gly Ala Thr Gly 290 295 300 Ala Cys Ala Thr Gly Gly Ala Gly Ala Gly Gly Ala Ala Cys Cys Thr 305 310 315 320 Gly Ala Thr Ala Cys Ala Gly Ala Ala Cys Gly Cys Cys Cys Ala Gly 325 330 335 Gly Cys Cys Gly Thr Gly Gly Ala Gly Ala Gly Gly Ala Thr Thr Cys 340 345 350 Thr Gly Cys Thr Gly Gly Cys Cys Gly Cys Cys Ala Cys Cys Gly Ala 355 360 365 Thr Gly Ala Cys Ala Ala Gly Ala Ala Gly Ala Cys Thr Gly Ala Gly 370 375 380 Thr Ala Thr Cys Ala Gly Ala Ala Gly Ala Ala Gly Ala Gly Ala Ala 385 390 395 400 Ala Cys Gly Cys Cys Cys Gly Gly Gly Ala Cys Gly Thr Thr Ala Ala 405 410 415 Gly Gly Ala Gly Gly Gly Cys Ala Ala Gly Gly Ala Gly Gly Ala Gly 420 425 430 Ala Thr Cys Gly Ala Thr Cys Ala Cys Ala Ala Cys Ala Ala Gly Ala 435 440 445 Cys Ala Gly Gly Cys Gly Gly Cys Ala Cys Thr Thr Thr Cys Thr Ala 450 455 460 Thr Ala Ala Gly Ala Thr Gly Gly Thr Cys Cys Gly Thr Gly Ala Thr 465 470 475 480 Gly Ala Cys Ala Ala Gly Ala Cys Ala Ala Thr Cys Thr Ala Cys Thr 485 490 495 Thr Thr Thr Cys Thr Cys Cys Cys Ala Thr Cys Ala Ala Gly Ala Thr 500 505 510 Cys Ala Cys Ala Thr Thr Cys Cys Thr Gly Ala Ala Gly Gly Ala Gly 515 520 525 Gly Ala Gly Gly Thr Ala Ala Ala Gly Ala Cys Thr Ala Thr Gly Thr 530
535 540 Ala Cys Ala Ala Gly Ala Cys Ala Ala Cys Thr Ala Thr Gly Gly Gly 545 550 555 560 Cys Thr Cys Cys Gly Ala Thr Gly Gly Cys Thr Thr Cys Ala Gly Thr 565 570 575 Gly Gly Cys Cys Thr Gly Ala Ala Cys Cys Ala Cys Ala Thr Ala Ala 580 585 590 Thr Gly Ala Thr Ala Gly Gly Cys Cys Ala Thr Ala Gly Thr Cys Ala 595 600 605 Gly Ala Thr Gly Ala Ala Cys Gly Ala Thr Gly Thr Gly Thr Gly Cys 610 615 620 Thr Thr Cys Cys Ala Gly Ala Gly Ala Ala Gly Cys Ala Ala Gly Gly 625 630 635 640 Gly Cys Cys Thr Gly Ala Ala Gly Ala Gly Gly Gly Thr Cys Gly Gly 645 650 655 Cys Cys Thr Gly Gly Ala Thr Cys Cys Cys Thr Cys Gly Cys Thr Gly 660 665 670 Ala Thr Thr Ala Gly Thr Ala Cys Cys Thr Thr Cys Gly Cys Cys Gly 675 680 685 Gly Cys Ala Gly Cys Ala Cys Thr Cys Thr Gly Cys Cys Cys Ala Gly 690 695 700 Ala Ala Gly Ala Thr Cys Thr Gly Gly Cys Ala Cys Thr Ala Cys Thr 705 710 715 720 Gly Gly Cys Gly Thr Ala Gly Cys Cys Ala Thr Ala Ala Ala Gly Gly 725 730 735 Gly Cys Gly Gly Cys Gly Gly Cys Ala Cys Ala Cys Thr Gly Gly Thr 740 745 750 Ala Gly Ala Cys Gly Ala Gly Gly Cys Cys Ala Thr Ala Ala Gly Gly 755 760 765 Thr Thr Thr Ala Thr Thr Gly Gly Cys Ala Gly Ala Gly Cys Cys Ala 770 775 780 Thr Gly Gly Cys Cys Gly Ala Cys Cys Gly Cys Gly Gly Cys Cys Thr 785 790 795 800 Gly Cys Thr Gly Ala Gly Ala Gly Ala Thr Ala Thr Cys Ala Ala Gly 805 810 815 Gly Cys Cys Ala Ala Gly Ala Cys Cys Gly Cys Cys Thr Ala Cys Gly 820 825 830 Ala Gly Ala Ala Gly Ala Thr Ala Cys Thr Gly Cys Thr Gly Ala Ala 835 840 845 Cys Cys Thr Gly Ala Ala Gly Ala Ala Cys Ala Ala Gly Thr Gly Cys 850 855 860 Thr Cys Ala Gly Cys Cys Cys Cys Cys Cys Ala Gly Cys Ala Gly Ala 865 870 875 880 Ala Gly Gly Cys Cys Cys Thr Gly Gly Thr Gly Gly Ala Thr Cys Ala 885 890 895 Gly Gly Thr Gly Ala Thr Cys Gly Gly Cys Ala Gly Thr Ala Gly Ala 900 905 910 Ala Ala Cys Cys Cys Cys Gly Gly Cys Ala Thr Cys Gly Cys Cys Gly 915 920 925 Ala Cys Ala Thr Cys Gly Ala Gly Gly Ala Thr Cys Thr Gly Ala Cys 930 935 940 Thr Cys Thr Gly Cys Thr Gly Gly Cys Cys Ala Gly Ala Ala Gly Cys 945 950 
955 960 Ala Thr Gly Gly Thr Ala Gly Thr Cys Gly Thr Ala Ala Gly Ala Cys 965 970 975 Cys Cys Thr Cys Thr Gly Thr Gly Gly Cys Cys Thr Cys Thr Ala Ala 980 985 990 Gly Gly Thr Thr Gly Thr Gly Cys Thr Gly Cys Cys Cys Ala Thr Cys 995 1000 1005 Thr Cys Cys Ala Thr Cys Thr Ala Cys Gly Cys Cys Ala Ala Gly 1010 1015 1020 Ala Thr Thr Cys Cys Cys Cys Ala Gly Cys Thr Gly Gly Gly Cys 1025 1030 1035 Thr Thr Thr Ala Ala Cys Ala Cys Thr Gly Ala Gly Gly Ala Gly 1040 1045 1050 Thr Ala Cys Thr Cys Cys Ala Thr Gly Gly Thr Gly Gly Gly Cys 1055 1060 1065 Thr Ala Thr Gly Ala Gly Gly Cys Cys Ala Thr Gly Gly Cys Cys 1070 1075 1080 Cys Thr Gly Thr Ala Thr Ala Ala Cys Ala Thr Gly Gly Cys Cys 1085 1090 1095 Ala Cys Ala Cys Cys Cys Gly Thr Cys Thr Cys Thr Ala Thr Cys 1100 1105 1110 Cys Thr Gly Cys Gly Gly Ala Thr Gly Gly Gly Cys Gly Ala Cys 1115 1120 1125 Gly Ala Thr Gly Cys Cys Ala Ala Gly Gly Ala Cys Ala Ala Gly 1130 1135 1140 Thr Cys Thr Cys Ala Gly Cys Thr Gly Thr Thr Thr Thr Thr Thr 1145 1150 1155 Ala Thr Gly Ala Gly Thr Thr Gly Thr Thr Thr Cys Gly Gly Cys 1160 1165 1170 Gly Cys Cys Gly Cys Cys Thr Ala Thr Gly Ala Gly Gly Ala Thr 1175 1180 1185 Cys Thr Gly Ala Gly Ala Gly Thr Cys Cys Thr Gly Thr Cys Ala 1190 1195 1200 Gly Cys Cys Cys Thr Gly Ala Cys Ala Gly Gly Cys Ala Cys Thr 1205 1210 1215 Gly Ala Gly Thr Thr Cys Ala Ala Gly Cys Cys Cys Ala Gly Gly 1220 1225 1230 Thr Cys Cys Gly Cys Cys Cys Thr Gly Ala Ala Gly Thr Gly Cys 1235 1240 1245 Ala Ala Gly Gly Gly Cys Thr Thr Thr Cys Ala Thr Gly Thr Gly 1250 1255 1260 Cys Cys Cys Gly Cys Cys Ala Ala Gly Gly Ala Gly Cys Ala Gly 1265 1270 1275 Gly Thr Gly Gly Ala Gly Gly Gly Cys Ala Thr Gly Gly Gly Cys 1280 1285 1290 Gly Cys Cys Gly Cys Cys Cys Thr Gly Ala Thr Gly Ala Gly Cys 1295 1300 1305 Ala Thr Cys Ala Ala Gly Cys Thr Gly Cys Ala Gly Thr Thr Cys 1310 1315 1320 Thr Gly Gly Gly Cys Cys Cys Cys Cys Ala Thr Gly Ala Cys Cys 1325 1330 1335 Cys Gly Gly Thr Cys Thr Gly Gly Cys Gly Gly Cys Ala Ala Cys 1340 1345 1350 Gly Ala Gly Gly Thr Cys Thr Cys Gly 
Gly Gly Cys Gly Ala Gly 1355 1360 1365 Gly Gly Cys Gly Gly Cys Ala Gly Thr Gly Gly Cys Cys Ala Gly 1370 1375 1380 Ala Thr Ala Ala Gly Thr Thr Gly Cys Ala Gly Cys Cys Cys Cys 1385 1390 1395 Gly Thr Thr Thr Thr Thr Gly Cys Cys Gly Thr Thr Gly Ala Gly 1400 1405 1410 Ala Gly Ala Cys Cys Cys Ala Thr Cys Gly Cys Cys Cys Thr Gly 1415 1420 1425 Thr Cys Thr Ala Ala Gly Cys Ala Gly Gly Cys Cys Gly Thr Thr 1430 1435 1440 Ala Gly Ala Cys Gly Ala Ala Thr Gly Cys Thr Gly Ala Gly Thr 1445 1450 1455 Ala Thr Gly Ala Ala Cys Gly Thr Cys Gly Ala Gly Gly Gly Cys 1460 1465 1470 Cys Gly Ala Gly Ala Cys Gly Cys Cys Gly Ala Thr Gly Thr Gly 1475 1480 1485 Ala Ala Gly Gly Gly Cys Ala Ala Cys Cys Thr Gly Cys Thr Gly 1490 1495 1500 Ala Ala Gly Ala Thr Gly Ala Thr Gly Ala Ala Cys Gly Ala Thr 1505 1510 1515 Thr Cys Cys Ala Thr Gly Gly Cys Cys Ala Ala Gly Ala Ala Gly 1520 1525 1530 Ala Cys Ala Ala Gly Cys Gly Gly Cys Ala Ala Cys Gly Cys Cys 1535 1540 1545 Thr Thr Cys Ala Thr Thr Gly Gly Cys Ala Ala Gly Ala Ala Gly 1550 1555 1560 Ala Thr Gly Thr Thr Cys Cys Ala Gly Ala Thr Ala Ala Gly Cys 1565 1570 1575 Gly Ala Thr Ala Ala Gly Ala Ala Cys Ala Ala Gly Gly Thr Thr 1580 1585 1590 Ala Ala Cys Cys Cys Cys Ala Thr Cys Gly Ala Gly Ala Thr Thr 1595 1600 1605 Cys Cys Cys Ala Thr Cys Ala Ala Gly Cys Ala Gly Ala Cys Cys 1610 1615 1620 Ala Thr Cys Cys Cys Cys Ala Ala Cys Thr Thr Cys Thr Thr Cys 1625 1630 1635 Thr Thr Cys Gly Gly Cys Ala Gly Gly Gly Ala Thr Ala Cys Cys 1640 1645 1650 Gly Cys Cys Gly Ala Gly Gly Ala Thr Thr Ala Cys Gly Ala Thr 1655 1660 1665 Gly Ala Cys Cys Thr Gly Gly Ala Cys Thr Ala Cys Thr Gly Ala 1670 1675 1680 39 552 DNA Hepatitis B virus 39 atggacatcg acccttataa agaatttgga gctactgtgg agttactctc gtttttgcct 60 tctgacttct ttccttcagt acgagatctt ctagataccg cctcagctct gtatcgggaa 120 gccttagagt ctcctgagca ttgttcacct caccatactg cactcaggca agcaattctt 180 tgctgggggg aactaatgac tctagctacc tgggtgggtg ttaatttgga agatccagcg 240 tctagagacc tagtagtcag ttatgtcaac actaatatgg gcctaaagtt caggcaactc 300 
ttgtggtttc acatttcttg tctcactttt ggaagagaaa cagttataga gtatttggtg 360 tctttcggag tgtggattcg cactcctcca gcttatagac caccaaatgc ccctatccta 420 tcaacacttc cggagactac tgttgttaga cgacgaggca ggtcccctag aagaagaact 480 ccctcgcctc gcagacgaag gtctcaatcg ccgcgtcgca gaagatctca atctcgggaa 540 tctcaatgtt ag 552 40 183 PRT Artificial sequence Hepatitis B Virus 40 Met Asp Ile Asp Pro Tyr Lys Glu Phe Gly Ala Thr Val Glu Leu Leu 1 5 10 15 Ser Phe Leu Pro Ser Asp Phe Phe Pro Ser Val Arg Asp Leu Leu Asp 20 25 30 Thr Ala Ser Ala Leu Tyr Arg Glu Ala Leu Glu Ser Pro Glu His Cys 35 40 45 Ser Pro His His Thr Ala Leu Arg Gln Ala Ile Leu Cys Trp Gly Glu 50 55 60 Leu Met Thr Leu Ala Thr Trp Val Gly Val Asn Leu Glu Asp Pro Ala 65 70 75 80 Ser Arg Asp Leu Val Val Ser Tyr Val Asn Thr Asn Met Gly Leu Lys 85 90 95 Phe Arg Gln Leu Leu Trp Phe His Ile Ser Cys Leu Thr Phe Gly Arg 100 105 110 Glu Thr Val Ile Glu Tyr Leu Val Ser Phe Gly Val Trp Ile Arg Thr 115 120 125 Pro Pro Ala Tyr Arg Pro Pro Asn Ala Pro Ile Leu Ser Thr Leu Pro 130 135 140 Glu Thr Thr Val Val Arg Arg Arg Gly Arg Ser Pro Arg Arg Arg Thr 145 150 155 160 Pro Ser Pro Arg Arg Arg Arg Ser Gln Ser Pro Arg Arg Arg Arg Ser 165 170 175 Gln Ser Arg Glu Ser Gln Cys 180 41 555 DNA Artificial sequence Synthetic HBcAg 41 atggatatcg atccttataa agaattcgga gctactgtgg agttactctc gtttctcccg 60 agtgacttct ttccttcagt acgagatctt ctggataccg ccagcgcgct gtatcgggaa 120 gccttggagt ctcctgagca ctgcagccct caccatactg ccctcaggca agcaattctt 180 tgctgggggg agctcatgac tctggccacg tgggtgggtg ttaacttgga agatccagct 240 agcagggacc tggtagtcag ttatgtcaac actaatatgg gtttaaagtt caggcaactc 300 ttgtggtttc acattagctg cctcactttc ggccgagaaa cagttctaga atatttggtg 360 tctttcggag tgtggatccg cactcctcca gcttataggc ctccgaatgc ccctatcctg 420 tcgacactcc cggagactac tgttgttaga cgtcgaggca ggtcacctag aagaagaact 480 ccttcgcctc gcaggcgaag gtctcaatcg ccgcggcgcc gaagatctca atctcgggaa 540 tctcaatgtt agtga 555 42 183 PRT Artificial sequence Synthetic HBcAg 42 Met Asp Ile Asp Pro Tyr Lys Glu Phe Gly 
Ala Thr Val Glu Leu Leu 1 5 10 15 Ser Phe Leu Pro Ser Asp Phe Phe Pro Ser Val Arg Asp Leu Leu Asp 20 25 30 Thr Ala Ser Ala Leu Tyr Arg Glu Ala Leu Glu Ser Pro Glu His Cys 35 40 45 Ser Pro His His Thr Ala Leu Arg Gln Ala Ile Leu Cys Trp Gly Glu 50 55 60 Leu Met Thr Leu Ala Thr Trp Val Gly Val Asn Leu Glu Asp Pro Ala 65 70 75 80 Ser Arg Asp Leu Val Val Ser Tyr Val Asn Thr Asn Met Gly Leu Lys 85 90 95 Phe Arg Gln Leu Leu Trp Phe His Ile Ser Cys Leu Thr Phe Gly Arg 100 105 110 Glu Thr Val Leu Glu Tyr Leu Val Ser Phe Gly Val Trp Ile Arg Thr 115 120 125 Pro Pro Ala Tyr Arg Pro Pro Asn Ala Pro Ile Leu Ser Thr Leu Pro 130 135 140 Glu Thr Thr Val Val Arg Arg Arg Gly Arg Ser Pro Arg Arg Arg Thr 145 150 155 160 Pro Ser Pro Arg Arg Arg Arg Ser Gln Ser Pro Arg Arg Arg Arg Ser 165 170 175 Gln Ser Arg Glu Ser Gln Cys 180 43 2043 DNA Artificial sequence Influenza A Virus NP Gene Fused to Synthetic HBcAg 43 atggcgtctc aaggcaccaa acgatcttac gaacagatgg agactgatgg agaacgccag 60 aatgccactg aaatcagagc atccgtcgga aaaatgattg gtggaattgg acgattctac 120 atccaaatgt gcaccgaact caaactcagt gattatgagg gacggttgat ccaaaacagc 180 ttaacaatag agagaatggt gctctctgct tttgacgaaa ggagaaataa ataccttgaa 240 gaacatccca gtgcggggaa agatcctaag aaaactggag gacctatata caggagagta 300 aacggaaagt ggatgagaga actcatcctt tatgacaaag aagaaataag gcgaatctgg 360 cgccaagcta ataatggtga cgatgcaacg gctggtctga ctcacatgat gatctggcat 420 tccaatttga atgatgcaac ttatcagagg acaagagctc ttgttcgcac cggaatggat 480 cccaggatgt gctctctgat gcaaggttca actctcccta ggaggtctgg agccgcaggt 540 gctgcagtca aaggagttgg aacaatggtg atggaattgg tcagaatgat caaacgtggg 600 atcaatgatc ggaacttctg gaggggtgag aatggacgaa aaacaagaat tgcttatgaa 660 agaatgtgca acattctcaa agggaaattt caaactgctg cacaaaaagc aatgatggat 720 caagtgagag agagccggaa cccagggaat gctgagttcg aagatctcac ttttctagca 780 cggtctgcac tcatattgag agggtcggtt gctcacaagt cctgcctgcc tgcctgtgtg 840 tatggacctg ccgtagccag tgggtacgac tttgaaaggg agggatactc tctagtcgga 900 atagaccctt tcagactgct tcaaaacagc caagtgtaca 
gcctaatcag accaaatgag 960 aatccagcac acaagagtca actggtgtgg atggcatgcc attctgccgc atttgaagat 1020 ctaagagtat taagcttcat caaagggacg aaggtgctcc caagagggaa gctttccact 1080 agaggagttc aaattgcttc caatgaaaat atggagacta tggaatcaag tacacttgaa 1140 ctgagaagca ggtactgggc cataaggacc agaagtggag gaaacaccaa tcaacagagg 1200 gcatctgcgg gccaaatcag catacaacct acgttctcag tacagagaaa tctccctttt 1260 gacagaacaa ccgttatggc agcattcagt gggaatacag aggggagaac atctgacatg 1320 aggaccgaaa tcataaggat gatggaaagt gcaagaccag aagatgtgtc tttccagggg 1380 cggggagtct tcgagctctc ggacgaaaag gcagcgagcc cgatcgtgcc ttcctttgac 1440 atgagtaatg aaggatctta tttcttcgga gacaatgcag aggaatacga taatatggat 1500 atcgatcctt ataaagaatt cggagctact gtggagttac tctcgtttct cccgagtgac 1560 ttctttcctt cagtacgaga tcttctggat accgccagcg cgctgtatcg ggaagccttg 1620 gagtctcctg agcactgcag ccctcaccat actgccctca ggcaagcaat tctttgctgg 1680 ggggagctca tgactctggc cacgtgggtg ggtgttaact tggaagatcc agctagcagg 1740 gacctggtag tcagttatgt caacactaat atgggtttaa agttcaggca actcttgtgg 1800 tttcacatta gctgcctcac tttcggccga gaaacagttc tagaatattt ggtgtctttc 1860 ggagtgtgga tccgcactcc tccagcttat aggcctccga atgcccctat cctgtcgaca 1920 ctcccggaga ctactgttgt tagacgtcga ggcaggtcac ctagaagaag aactccttcg 1980 cctcgcaggc gaaggtctca atcgccgcgg cgccgaagat ctcaatctcg ggaatctcaa 2040 tgt 2043 44 2230 DNA Artificial sequence Influenza B Virus NP Gene Fused to Synthetic HBcAg 44 atgtccaaca tggatattga cagtataaat accggaacaa tcgataaaac accagaagaa 60 ctgactcccg gaaccagtgg ggcaaccaga ccaatcatca agccagcaac ccttgctccg 120 ccaagcaaca aacgaacccg aaatccatct ccagaaagga caaccacaag cagtgaaacc 180 gatatcggaa ggaaaatcca aaagaaacaa accccaacag agataaagaa gagcgtctac 240 aaaatggtgg taaaactggg tgaattctac aaccagatga tggtcaaagc tggacttaat 300 gatgacatgg aaaggaatct aattcaaaat gcacaagctg tggagagaat cctattggct 360 gcaactgatg acaagaaaac tgaataccaa aagaaaagga atgccagaga tgtcaaagaa 420 gggaaggaag aaatagacca caacaagaca ggaggcacct tttataagat ggtaagagat 480 gataaaacca tctacttcag ccctataaaa 
attacctttt taaaagaaga ggtgaaaaca 540 atgtacaaga ccaccatggg gagtgatggt ttcagtggac taaatcacat tatgattgga 600 cattcacaga tgaacgatgt ctgtttccaa agatcaaagg gactgaaaag ggttggactt 660 gacccttcat taatcagtac ttttgccgga agcacactac ccagaagatc aggtacaact 720 ggtgttgcaa tcaaaggagg tggaacttta gtggatgaag ccatccgatt tataggaaga 780 gcaatggcag acagagggct actgagagac atcaaggcca agacggccta tgaaaagatt 840 cttctgaatc tgaaaaacaa gtgctctgcg ccgcaacaaa aggctctagt tgatcaagtg 900 atcggaagta ggaacccagg gattgcagac atagaagacc taactctgct tgccagaagc 960 atggtagttg tcagaccctc tgtagcgagc aaagtggtgc ttcccataag catttatgct 1020 aaaatacctc aactaggatt caataccgaa gaatactcta tggttgggta tgaagccatg 1080 gctctttata atatggcaac acctgtttcc atattaagaa tgggagatga cgcaaaagat 1140 aaatctcaac tattcttcat gtcgtgcttc ggagctgcct atgaagatct aagagtgtta 1200 tctgcactaa cgggcaccga atttaagcct agatcagcac taaaatgcaa gggtttccat 1260 gtcccggcta aggagcaagt agaaggaatg ggggcagctc tgatgtccat caagcttcag 1320 ttctgggccc caatgaccag atctggaggg aatgaagtaa gtggagaagg agggtctggt 1380 caaataagtt gcagccctgt gtttgcagta gaaagaccta ttgctctaag caagcaagct 1440 gtaagaagaa tgctgtcaat gaacgttgaa ggacgtgatg cagatgtcaa aggaaatcta 1500 ctcaaaatga tgaatgattc aatggcaaag aaaaccagtg gaaatgcttt cattgggaag 1560 aaaatgtttc aaatatcaga caaaaacaaa gtcaatccca ttgagattcc aattaagcag 1620 accatcccca atttcttctt tgggagggac acagcagagg attatgatga cctcgattat 1680 atggatatcg atccttataa agaattcgga gctactgtgg agttactctc gtttctcccg 1740 agtgacttct ttccttcagt acgagatctt ctggataccg ccagcgcgct gtatcgggaa 1800 gccttggagt ctcctgagca ctgcagccct caccatactg ccctcaggca agcaattctt 1860 tgctgggggg agctcatgac tctggccacg tgggtgggtg ttaacttgga agatccagct 1920 agcagggacc tggtagtcag ttatgtcaac actaatatgg gtttaaagtt caggcaactc 1980 ttgtggtttc acattagctg cctcactttc ggccgagaaa cagttctaga atatttggtg 2040 tctttcggag tgtggatccg cactcctcca
gcttataggc ctccgaatgc ccctatcctg 2100 tcgacactcc cggagactac tgttgttaga cgtcgaggca ggtcacctag aagaagaact 2160 ccttcgcctc gcaggcgaag gtctcaatcg ccgcggcgcc gaagatctca atctcgggaa 2220 tctcaatgtt 2230 45 1305 DNA Artificial sequence Influenza A Virus M1 Fused to Synthetic HBcAg 45 atgagtcttc taaccgaggt cgaaacgtac gtactctcta tcatcccgtc aggccccctc 60 aaagccgaga tcgcacagag acttgaagat gtctttgcag ggaagaacac tgatcttgag 120 gttctcatgg aatggctaaa gacaagacca atcctgtcac ctctgactaa ggggatttta 180 ggatttgtgt tcacgctcac cgtgcccagt gagcgaggac tgcagcgtag acgctttgtc 240 caaaatgccc ttaatgggaa cggggatcca aataacatgg acaaagcagt taaactgtat 300 aggaagctca agagggagat aacattccat ggggccaaag aaatctcact cagttattct 360 gctggtgcac ttgccagttg tatgggcctc atatacaaca ggatgggggc tgtgaccact 420 gaagtggcat ttggcctggt atgtgcaacc tgtgaacaga ttgctgactc ccagcatcgg 480 tctcataggc aaatggtgac aacaaccaat ccactaatca gacatgagaa cagaatggtt 540 ttagccagca ctacagctaa ggctatggag caaatggctg gatcgagtga gcaagcagca 600 gaggccatgg aggttgctag tcaggctaga caaatggtgc aagcgatgag aaccattggg 660 actcatccta gctccagtgc tggtctgaaa aatgatcttc ttgaaaattt gcaggcctat 720 cagaaacgaa tgggggtgca gatgcaacgg ttcaagatgg atatcgatcc ttataaagaa 780 ttcggagcta ctgtggagtt actctcgttt ctcccgagtg acttctttcc ttcagtacga 840 gatcttctgg ataccgccag cgcgctgtat cgggaagcct tggagtctcc tgagcactgc 900 agccctcacc atactgccct caggcaagca attctttgct ggggggagct catgactctg 960 gccacgtggg tgggtgttaa cttggaagat ccagctagca gggacctggt agtcagttat 1020 gtcaacacta atatgggttt aaagttcagg caactcttgt ggtttcacat tagctgcctc 1080 actttcggcc gagaaacagt tctagaatat ttggtgtctt tcggagtgtg gatccgcact 1140 cctccagctt ataggcctcc gaatgcccct atcctgtcga cactcccgga gactactgtt 1200 gttagacgtc gaggcaggtc acctagaaga agaactcctt cgcctcgcag gcgaaggtct 1260 caatcgccgc ggcgccgaag atctcaatct cgggaatctc aatgt 1305 46 1581 DNA Artificial sequence Open Reading Frame for TPANP from VR4700 46 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 60 tcgcccagcg ctagaggatc gggaatggcg tcccaaggca 
ccaaacggtc ttacgaacag 120 atggagactg atggagaacg ccagaatgcc actgaaatca gagcatccgt cggaaaaatg 180 attggtggaa ttggacgatt ctacatccaa atgtgcaccg aactcaaact cagtgattat 240 gagggacggt tgatccaaaa cagcttaaca atagagagaa tggtgctctc tgcttttgac 300 gaaaggagaa ataaatacct ggaagaacat cccagtgcgg ggaaagatcc taagaaaact 360 ggaggaccta tatacaggag agtaaacgga aagtggatga gagaactcat cctttatgac 420 aaagaagaaa taaggcgaat ctggcgccaa gctaataatg gtgacgatgc aacggctggt 480 ctgactcaca tgatgatctg gcattccaat ttgaatgatg caacttatca gaggacaaga 540 gctcttgttc gcaccggaat ggatcccagg atgtgctctc tgatgcaagg ttcaactctc 600 cctaggaggt ctggagccgc aggtgctgca gtcaaaggag ttggaacaat ggtgatggaa 660 ttggtcagga tgatcaaacg tgggatcaat gatcggaact tctggagggg tgagaatgga 720 cgaaaaacaa gaattgctta tgaaagaatg tgcaacattc tcaaagggaa atttcaaact 780 gctgcacaaa aagcaatgat ggatcaagtg agagagagcc ggaacccagg gaatgctgag 840 ttcgaagatc tcacttttct agcacggtct gcactcatat tgagagggtc ggttgctcac 900 aagtcctgcc tgcctgcctg tgtgtatgga cctgccgtag ccagtgggta cgactttgaa 960 agagagggat actctctagt cggaatagac cctttcagac tgcttcaaaa cagccaagtg 1020 tacagcctaa tcagaccaaa tgagaatcca gcacacaaga gtcaactggt gtggatggca 1080 tgccattctg ccgcatttga agatctaaga gtattaagct tcatcaaagg gacgaaggtg 1140 ctcccaagag ggaagctttc cactagagga gttcaaattg cttccaatga aaatatggag 1200 actatggaat caagtacact tgaactgaga agcaggtact gggccataag gaccagaagt 1260 ggaggaaaca ccaatcaaca gagggcatct gcgggccaaa tcagcataca acctacgttc 1320 tcagtacaga gaaatctccc ttttgacaga acaaccatta tggcagcatt caatgggaat 1380 acagagggaa gaacatctga catgaggacc gaaatcataa ggatgatgga aagtgcaaga 1440 ccagaagatg tgtctttcca ggggcgggga gtcttcgagc tctcggacga aaaggcagcg 1500 agcccgatcg tgccttcctt tgacatgagt aatgaaggat cttatttctt cggagacaat 1560 gcagatgagt acgacaatta a 1581 47 333 DNA Artificial sequence Open Reading Frame for TPAM2 DeltaTM from VR4707 47 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 60 tcgcccagcg ctagaggatc gggaatgagt cttctgaccg aggtcgaaac ccctatcaga 120 aacgaatggg ggtgcagatg caacgattca 
agtgatcctg gcggcggcga tcggcttttt 180 ttcaaatgca tttatcggcg ctttaaatac ggcttgaaaa gagggccttc taccgaagga 240 gtgccagagt ctatgaggga agaatatcgg aaggaacagc agaatgctgt ggatgttgac 300 gatagccatt ttgtcagcat cgagctggag taa 333 48 24 DNA Artificial sequence Primer Used to Amplify TPAM2 Fragment 48 gccgaatcca tggatgcaat gaag 24 49 36 DNA Artificial sequence Primer Used to Amplify TPAM2 Fragment 49 ggtgccttgg gacgccatat cacttgaatc gttgca 36 50 36 DNA Artificial sequence Primer Used to Amplify NP Gene 50 tgcaacgatt caagtgatat ggcgtcccaa ggcacc 36 51 24 DNA Artificial sequence Primer Used to Amplify NP Gene 51 gccgtcgact taattgtcgt actc 24 52 1653 DNA Artificial sequence Open Reading Frame for TPAM2NP from VR4710 52 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 60 tcgcccagcg ctagaggatc gggaatgagt cttctgaccg aggtcgaaac ccctatcaga 120 aacgaatggg ggtgcagatg caacgattca agtgatatgg cgtcccaagg caccaaacgg 180 tcttacgaac agatggagac tgatggagaa cgccagaatg ccactgaaat cagagcatcc 240 gtcggaaaaa tgattggtgg aattggacga ttctacatcc aaatgtgcac cgaactcaaa 300 ctcagtgatt atgagggacg gttgatccaa aacagcttaa caatagagag aatggtgctc 360 tctgcttttg acgaaaggag aaataaatac ctggaagaac atcccagtgc ggggaaagat 420 cctaagaaaa ctggaggacc tatatacagg agagtaaacg gaaagtggat gagagaactc 480 atcctttatg acaaagaaga aataaggcga atctggcgcc aagctaataa tggtgacgat 540 gcaacggctg gtctgactca catgatgatc tggcattcca atttgaatga tgcaacttat 600 cagaggacaa gagctcttgt tcgcaccgga atggatccca ggatgtgctc tctgatgcaa 660 ggttcaactc tccctaggag gtctggagcc gcaggtgctg cagtcaaagg agttggaaca 720 atggtgatgg aattggtcag gatgatcaaa cgtgggatca atgatcggaa cttctggagg 780 ggtgagaatg gacgaaaaac aagaattgct tatgaaagaa tgtgcaacat tctcaaaggg 840 aaatttcaaa ctgctgcaca aaaagcaatg atggatcaag tgagagagag ccggaaccca 900 gggaatgctg agttcgaaga tctcactttt ctagcacggt ctgcactcat attgagaggg 960 tcggttgctc acaagtcctg cctgcctgcc tgtgtgtatg gacctgccgt agccagtggg 1020 tacgactttg aaagagaggg atactctcta gtcggaatag accctttcag actgcttcaa 1080 aacagccaag tgtacagcct 
aatcagacca aatgagaatc cagcacacaa gagtcaactg 1140 gtgtggatgg catgccattc tgccgcattt gaagatctaa gagtattaag cttcatcaaa 1200 gggacgaagg tgctcccaag agggaagctt tccactagag gagttcaaat tgcttccaat 1260 gaaaatatgg agactatgga atcaagtaca cttgaactga gaagcaggta ctgggccata 1320 aggaccagaa gtggaggaaa caccaatcaa cagagggcat ctgcgggcca aatcagcata 1380 caacctacgt tctcagtaca gagaaatctc ccttttgaca gaacaaccat tatggcagca 1440 ttcaatggga atacagaggg aagaacatct gacatgagga ccgaaatcat aaggatgatg 1500 gaaagtgcaa gaccagaaga tgtgtctttc caggggcggg gagtcttcga gctctcggac 1560 gaaaaggcag cgagcccgat cgtgccttcc tttgacatga gtaatgaagg atcttatttc 1620 ttcggagaca atgcagatga gtacgacaat taa 1653 53 35 DNA Artificial sequence Primer Used to Amplify the HA Gene 53 gggctagcgc cgccaccatg aagaccatca ttgct 35 54 26 DNA Artificial sequence Primer Used to Amplify the HA Gene 54 ccgtcgactc aaatgcaaat gttgca 26 55 1701 DNA Artificial sequence Open Reading Frame for HA H3N2 from VR4750 55 atgaagacca tcattgcttt gagctacatt ttctgtctgg ctctcggcca agaccttcca 60 ggaaatgaca acaacacagc aacgctgtgc ctgggacatc atgcggtgcc aaacggaaca 120 ctagtgaaaa caatcacaga tgatcagatt gaagtgacta atgctactga gctagttcag 180 agctcctcaa cggggaaaat atgcaacaat cctcatcgaa tccttgatgg aatagactgc 240 acactgatag atgctctatt gggggaccct cattgtgatg tttttcaaaa tgagacatgg 300 gaccttttcg ttgaacgcag caaagctttc agcaactgtt acccttatga tgtgccagat 360 tatgcccccc ttaggtcact agttgcctcg tcaggcactc tggagtttat cactgagggt 420 ttcacttgga ctggggtcac tcagaatggg ggaagcagtg cttgcaaaag gggacctggt 480 agcggttttt tcagtagact gaactggttg accaaatcag gaagcacata tccagtgctg 540 aacgtgacta tgccaaacaa tgacaatttt gacaaactat acatttgggg ggttcaccac 600 ccgagcacga accaagaaca aaccagcctg tatgttcaag catcagggag agtcacagtc 660 tctaccagga gaagccagca aactataatc ccgaatatcg agtccagacc ctgggtaagg 720 ggtctgtcta gtagaataag catctattgg acaatagtta agccgggaga cgtactggta 780 attaatagta atgggaacct aatcgctcct cggggttatt tcaagatgcg cactgggaaa 840 agctcaataa tgaggtcaga tgcacctatt gatacctgta tttctgaatg catcactcca 900 
aatggaagca ttcccaatga caagcccttt caaaacgtaa acaaaatcac gtatggagca 960 tgccccaagt atgttaagca aaacaccctg aagttggcaa cagggatgcg gaatgtacca 1020 gagaaacaaa ctagaggcct attcggcgca atagcaggtt tcatagaaaa tggttgggag 1080 ggaatgatag acggttggta cggtttcagg catcaaaatt ctgagggcac aggacaagca 1140 gcagatctta aaagcactca agcagccatc gaccaaatca atgggaaatt gaacaggata 1200 atcaagaaga cgaacgagaa attccatcaa atcgaaaagg aattctcaga agtagaaggg 1260 agaattcagg acctcgagaa atacgttgaa gacactaaaa tagatctctg gtcttacaat 1320 gcggagcttc ttgtcgctct ggagaatcaa catacaattg acctgactga ctcggaaatg 1380 aacaagctgt ttgaaaaaac aaggaggcaa ctgagggaaa atgctgaaga catgggcaat 1440 ggttgcttca aaatatacca caaatgtgac aacgcttgca tagagtcaat cagaactggg 1500 acttatgacc atgatgtata cagagacgaa gcattaaaca accggtttca gatcaaaggt 1560 gttgaactga agtctggata caaagactgg atcctgtgga tttcctttgc catatcatgc 1620 tttttgcttt gtgttgtttt gctggggttc atcatgtggg cctgccagaa aggcaacatt 1680 aggtgcaaca tttgcatttg a 1701 56 35 DNA Artificial sequence Primer Used to Amplify the HA Gene 56 gggctagcgc cgccaccatg aaggcaaacc tactg 35 57 26 DNA Artificial sequence Primer Used to Amplify the HA Gene 57 ccgtcgactc agatgcatat tctgca 26 58 1701 DNA Artificial sequence Open Reading Frame for HA H1N1 from VR4752 58 atgaaggcaa acctactggt cctgttatgt gcacttgcag ctgcagatgc agacacaata 60 tgtataggct accatgcgaa caattcaacc gacactgttg acacagtgct cgagaagaat 120 gtgacagtga cacactctgt taacctgctc gaagacagcc acaacggaaa actatgtaga 180 ttaaaaggaa tagccccact acaattgggg aaatgtaaca tcgccggatg gctcttggga 240 aacccagaat gcgacccact gcttccagtg agatcatggt cctacattgt agaaacacca 300 aactctgaga atggaatatg ttatccagga gatttcatcg actatgagga gctgagggag 360 caattgagct cagtgtcatc attcgaaaga ttcgaaatat ttcccaaaga aagctcatgg 420 cccaaccaca acacaaccaa aggagtaacg gcagcatgct cccatgcggg gaaaagcagt 480 ttttacagaa atttgctatg gctgacggag aaggagggct catacccaaa gctgaaaaat 540 tcttatgtga acaagaaagg gaaagaagtc cttgtactgt ggggtattca tcacccgtct 600 aacagtaagg atcaacagaa tatctatcag aatgaaaatg cttatgtctc 
tgtagtgact 660 tcaaattata acaggagatt taccccggaa atagcagaaa gacccaaagt aagagatcaa 720 gctgggagga tgaactatta ctggaccttg ctaaaacccg gagacacaat aatatttgag 780 gcaaatggaa atctaatagc accaaggtat gctttcgcac tgagtagagg ctttgggtcc 840 ggcatcatca cctcaaacgc atcaatgcat gagtgtaaca cgaagtgtca aacacccctg 900 ggagctataa acagcagtct ccctttccag aatatacacc cagtcacaat aggagagtgc 960 ccaaaatacg tcaggagtgc caaattgagg atggttacag gactaaggaa cattccgtcc 1020 attcaatcca gaggtctatt tggagccatt gccggtttta ttgaaggggg atggactgga 1080 atgatagatg gatggtacgg ttatcatcat cagaatgaac agggatcagg ctatgcagcg 1140 gatcaaaaaa gcacacaaaa tgccattaac gggattacaa acaaggtgaa ctctgttatc 1200 gagaaaatga acattcaatt cacagctgtg ggtaaagaat tcaacaaatt agaaaaaagg 1260 atggaaaatt taaataaaaa agttgatgat ggatttctgg acatttggac atataatgca 1320 gaattgttag ttctactgga aaatgaaagg actctggatt tccatgactc aaatgtgaag 1380 aatctgtatg agaaagtaaa aagccaatta aagaataatg ccaaagaaat cggaaatgga 1440 tgttttgagt tctaccacaa gtgtgacaat gaatgcatgg aaagtgtaag aaatgggact 1500 tatgattatc ccaaatattc agaagagtca aagttgaaca gggaaaaggt agatggagtg 1560 aaattggaat caatggggat ctatcagatt ctggcgatct actcaactgt cgccagttca 1620 ctggtgcttt tggtctccct gggggcaatc agtttctgga tgtgttctaa tggatctttg 1680 cagtgcagaa tatgcatctg a 1701 59 1050 DNA Artificial sequence Open Reading Frame for the M2M1 Fusion from VR4755 59 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 60 gacagcagcg accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 120 tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 180 agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 240 cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga gatgtccctg 300 ctgacagaag tggaaacata cgtgctgagc atcgtgccca gcggccccct gaaggccgag 360 atcgcccaga gactggagga cgtgttcgcc ggcaagaaca ccgacctgga ggccctgatg 420 gagtggctga agaccagacc catcctgagc cccctgacca agggcatcct gggcttcgtg 480 ttcaccctga ccgtgcccag cgagagaggc ctgcagagaa gaagattcgt gcagaacgcc 540 ctgaacggca acggcgaccc caacaacatg 
gaccgggccg tgaagctgta ccggaagctg 600 aagagagaga tcaccttcca cggcgccaag gagatcgccc tgagctacag cgccggcgcc 660 ctggccagct gcatgggcct gatctacaac agaatgggcg ccgtgaccac cgaggtggcc 720 ttcggcctgg tgtgcgccac ctgcgagcag atcgccgaca gccagcacag aagccacaga 780 cagatggtgg ccaccaccaa ccccctgatc agacacgaga acagaatggt gctggccagc 840 accaccgcca aggccatgga gcagatggcc ggcagcagcg agcaggccgc cgaggccatg 900 gagatcgcca gccaggccag acagatggtg caggccatga gagccatcgg cacccacccc 960 agcagcagcg ccggcctgaa ggacgacctg ctggagaacc tgcagaccta ccagaagaga 1020 atgggcgtgc agatgcagag attcaagtga 1050 60 982 DNA Artificial sequence Open Reading Frame for Fragment 7 from VR4756 60 atgagccttc taaccgaggt cgaaacgtat gttctctcta tcgttccatc aggccccctc 60 aaagccgaaa tcgcgcagag acttgaagat gtctttgctg ggaaaaacac agatcttgag 120 gctctcatgg aatggctaaa gacaagacca atcctgtcac ctctgactaa ggggattttg 180 gggtttgtgt tcacgctcac cgtgcccagt gagcgaggac tgcagcgtag acgctttgtc 240 caaaatgccc tcaatgggaa tggggatcca aataacatgg acagagcagt taaactatat 300 agaaaactta agagggagat tacattccat ggggccaaag aaatagcact cagttattct 360 gctggtgcac ttgccagttg catgggcctc atatacaaca gaatgggggc tgtaaccact 420 gaagtggcct ttggcctggt atgtgcaaca tgtgaacaga ttgctgactc ccagcacagg 480 tctcataggc aaatggtggc aacaaccaat ccattaataa ggcatgagaa cagaatggtt 540 ttggccagca ctacagctaa ggctatggag caaatggctg gatcaagtga gcaggcagcg 600 gaggccatgg aaattgctag tcaggccagg caaatggtgc aggcaatgag agccattggg 660 actcatccta gctccagtgc tggtctaaaa gatgatcttc ttgaaaattt gcagacctat 720 cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac ccgcttgttg ttgctgcgag 780 tatcattggg atcttgcact tgatattgtg gattcttgat cgtctttttt tcaaatgcat 840 ctatcgactc ttcaaacacg gtctgaaaag agggccttct acggaaggag tacctgagtc 900 tatgagggaa gaatatcgaa aggaacagca gaatgctgtg gatgctgacg acagtcattt 960 tgtcagcata gagctggagt aa 982 61 982 DNA Artificial sequence Codon Optimized Segment 7 from VR4763 61 atgagcctgc tgaccgaggt cgaaacgtat gttctctcta tcgtgcccag cggccccctg 60 aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac 
cgacctggag 120 gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg 180 ggcttcgtgt tcaccctgac cgtgcccagc gagagaggcc tgcagagaag aagattcgtg 240 cagaacgccc tgaacggcaa cggcgacccc aacaacatgg acagagccgt gaagctgtac 300 agaaagctga agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc 360 gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc 420 gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga 480 agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg 540 ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 600 gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc 660 acccacccca gcagcagcgc cggcctgaaa gatgatcttc ttgaaaattt gcagacctat 720 cagaaacgaa tgggggtgca gatgcaacga ttcaagtgac cccctggtgg tggccgccag 780 catcatcggc atcctgcacc tgatcctgtg gatcctggac agactgttct tcaagtgcat 840 ctacagactg ttcaagcacg gcctgaagag aggccccagc accgagggcg tgcccgagag 900 catgagagag gagtacagaa aggagcagca gaacgccgtg gacgccgacg acagccactt 960 cgtgagcatc gagctggagt ga 982 62 1569 DNA Artificial sequence Open Reading Frame for eM2NP Codon Optimized by Contract 62 atgagcttgc tcactgaagt cgagacacca atcagaaacg aatggggatg tagatgcaac 60 gatagctcag acatggcctc ccagggaacc aaaagaagct atgaacagat ggagactgac 120 ggagagagac agaacgccac agagatcaga gctagtgtag gaaagatgat agacggtatc 180 gggcgatttt acattcaaat gtgtacggaa ttgaaactca gcgactatga aggcagactt 240 atccagaact cactcacaat tgagcgcatg gtactcagtg catttgatga aagaaggaat 300 aggtacctcg aagaacaccc cagcgccggc aaagatccca agaagactgg cggcccaatt 360 tacagaagag tggacggtaa gtggatgaga gagctggtat tgtacgataa agaagaaatt 420 agaagaatct ggaggcaagc aaacaatgga gaggatgcta cagctggcct gacccacatg 480 atgatttggc atagtaacct gaatgatacc acctaccagc ggacaagggc tctcgttcga 540 accgggatgg atccccgcat gtgctcattg atgcagggta gtacactccc gaggaggtca 600 ggcgcggccg gtgcagccgt gaaaggaatc ggcactatgg taatggaatt gataagaatg 660 attaaaaggg ggattaatga caggaacttt tggagaggag aaaatggacg caaaacaagg 720 agtgcgtatg aacggatgtg caatattttg aaaggaaaat 
tccaaactgc agcacagcgc 780 gccatgatgg atcaggtacg agaaagtcgc aacccaggta atgctgaaat agaggacctt 840 atatttctcg cccggagtgc tctcatactt agaggaagcg tggcccataa aagttgtctc 900 cccgcatgcg tatacggtcc cgctgtgtct tccggatacg attttgaaaa agagggatat 960 tcattggtgg gaatcgaccc ttttaagctg cttcagaact cacaggttta cagtttgatt 1020 agaccaaacg agaacccagc ccacaaatca caactcgtgt ggatggcatg ccactctgcc 1080 gctttcgaag atctgagact gctctcattt attagaggca ctaaagtgag cccgagggga 1140 aaactgagca cacgaggagt acagatagca tctaacgaaa atatggataa tatgggatct 1200 agcacactcg aattgaggtc acgatactgg gctattagaa cacggagcgg agggaacacc 1260 aaccagcaga gagcatccgc cggtcagata agcgttcagc ctacattttc agtacaacga 1320 aacctgccat ttgaaaagag tacagtgatg gccgcattta ctggcaacac cgagggacga 1380 acaagcgaca tgagagcaga gattattaga atgatggaag gagctaaacc agaggaggtt 1440 tcatttagag gaaggggagt cttcgaattg tccgatgaga aagccacaaa tcccatagta 1500 cctagcttcg acatgtccaa cgaaggctct tacttttttg gtgacaatgc cgaagagtac 1560 gacaattga
1569 63 1569 DNA Artificial sequence Open Reading Frame for eM2NP Codon Optimized by Applicants 63 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 60 gacagcagcg acatggccag ccagggcacc aagagaagct acgagcagat ggagaccgac 120 ggcgagagac agaacgccac cgagatcaga gccagcgtgg gcaagatgat cgacggcatc 180 ggcagattct acatccagat gtgcaccgag ctgaagctga gcgactacga gggcagactg 240 atccagaaca gcctgaccat cgagagaatg gtgctgagcg ccttcgacga gagaagaaac 300 agatacctgg aggagcaccc cagcgccggc aaggacccca agaagaccgg cggccccatc 360 tacagaagag tggacggcaa gtggatgaga gagctggtgc tgtacgacaa ggaggagatc 420 agaagaatct ggagacaggc caacaacggc gaggacgcca ccgccggcct gacccacatg 480 atgatctggc acagcaacct gaacgacacc acctaccaga gaaccagagc cctggtgcgg 540 accggcatgg accccagaat gtgcagcctg atgcagggca gcaccctgcc cagaagaagc 600 ggcgccgccg gcgccgccgt gaagggcatc ggcaccatgg tgatggagct gatcagaatg 660 atcaagagag gcatcaacga cagaaacttc tggagaggcg agaacggcag aaagaccaga 720 agcgcctacg agagaatgtg caacatcctg aagggcaagt tccagaccgc cgcccagaga 780 gccatgatgg accaggtccg ggagagcaga aaccccggca acgccgagat cgaggacctg 840 atcttcctgg ccagaagcgc cctgatcctg agaggcagcg tggcccacaa gagctgcctg 900 cccgcctgcg tgtacggccc cgccgtgagc agcggctacg acttcgagaa ggagggctac 960 agcctggtgg gcatcgaccc cttcaagctg ctgcagaaca gccaggtgta cagcctgatc 1020 agacccaacg agaaccccgc ccacaagagc cagctggtgt ggatggcctg ccacagcgcc 1080 gccttcgagg acctgagact gctgagcttc atcagaggca ccaaggtgtc ccccagaggc 1140 aagctgagca ccagaggcgt gcagatcgcc agcaacgaga acatggacaa catgggcagc 1200 agcaccctgg agctgagaag cagatactgg gccatcagaa ccagaagcgg cggcaacacc 1260 aaccagcaga gagccagcgc cggccagatc agcgtgcagc ccaccttcag cgtgcagaga 1320 aacctgccct tcgagaagag caccgtgatg gccgccttca ccggcaacac cgagggcaga 1380 accagcgaca tgagagccga gatcatcaga atgatggagg gcgccaagcc cgaggaggtg 1440 tccttcagag gcagaggcgt gttcgagctg agcgacgaga aggccaccaa ccccatcgtg 1500 cctagcttcg acatgagcaa cgagggcagc tacttcttcg gcgacaacgc cgaggagtac 1560 gacaactga 1569 64 30 DNA Artificial sequence Primer Used to Amplify the 
M2 Gene 64 gccgaattcg ccaccatgag cctgctgacc 30 65 33 DNA Artificial sequence Primer Used to Amplify the M2 Gene 65 gccgtcgact gatcactcca gctcgatgct cac 33 66 294 DNA Artificial sequence Open Reading Frame for M2 Gene from VR4759 66 atgagcctgc tgaccgaggt ggagaccccc atcagaaacg agtggggctg cagatgcaac 60 gacagcagcg accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 120 tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 180 agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 240 cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga gtga 294 67 36 DNA Artificial sequence Primer Used to Amplify M1 Gene from VR4755 67 gccgaattcg ccaccatgtc cctgctgaca gaagtg 36 68 31 DNA Artificial sequence Primer Used to Amplify M1 Gene from VR4755 68 gccgtcgact gatcacttga atctctgcat c 31 69 759 DNA Artificial sequence Open Reading Frame for M1 Gene from VR4760 69 atgtccctgc tgacagaagt ggaaacatac gtgctgagca tcgtgcccag cggccccctg 60 aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac cgacctggag 120 gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg 180 ggcttcgtgt tcaccctgac cgtgcccagc gagagaggcc tgcagagaag aagattcgtg 240 cagaacgccc tgaacggcaa cggcgacccc aacaacatgg accgggccgt gaagctgtac 300 cggaagctga agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc 360 gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc 420 gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga 480 agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg 540 ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 600 gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc 660 acccacccca gcagcagcgc cggcctgaag gacgacctgc tggagaacct gcagacctac 720 cagaagagaa tgggcgtgca gatgcagaga ttcaagtga 759 70 38 DNA Artificial sequence Primer Used to Amplify NP Gene from VR4757 70 gccgaattcg ccaccatggc ctcccaggga accaaaag 38 71 30 DNA Artificial sequence Primer Used to Amplify NP Gene from VR4757 71 gccgtcgact gatcaattgt cgtactcttc 
30 72 1497 DNA Artificial sequence Open Reading Frame for NP Codon Optimized by Contract 72 atggcctccc agggaaccaa aagaagctat gaacagatgg agactgacgg agagagacag 60 aacgccacag agatcagagc tagtgtagga aagatgatag acggtatcgg gcgattttac 120 attcaaatgt gtacggaatt gaaactcagc gactatgaag gcagacttat ccagaactca 180 ctcacaattg agcgcatggt actcagtgca tttgatgaaa gaaggaatag gtacctcgaa 240 gaacacccca gcgccggcaa agatcccaag aagactggcg gcccaattta cagaagagtg 300 gacggtaagt ggatgagaga gctggtattg tacgataaag aagaaattag aagaatctgg 360 aggcaagcaa acaatggaga ggatgctaca gctggcctga cccacatgat gatttggcat 420 agtaacctga atgataccac ctaccagcgg acaagggctc tcgttcgaac cgggatggat 480 ccccgcatgt gctcattgat gcagggtagt acactcccga ggaggtcagg cgcggccggt 540 gcagccgtga aaggaatcgg cactatggta atggaattga taagaatgat taaaaggggg 600 attaatgaca ggaacttttg gagaggagaa aatggacgca aaacaaggag tgcgtatgaa 660 cggatgtgca atattttgaa aggaaaattc caaactgcag cacagcgcgc catgatggat 720 caggtacgag aaagtcgcaa cccaggtaat gctgaaatag aggaccttat atttctcgcc 780 cggagtgctc tcatacttag aggaagcgtg gcccataaaa gttgtctccc cgcatgcgta 840 tacggtcccg ctgtgtcttc cggatacgat tttgaaaaag agggatattc attggtggga 900 atcgaccctt ttaagctgct tcagaactca caggtttaca gtttgattag accaaacgag 960 aacccagccc acaaatcaca actcgtgtgg atggcatgcc actctgccgc tttcgaagat 1020 ctgagactgc tctcatttat tagaggcact aaagtgagcc cgaggggaaa actgagcaca 1080 cgaggagtac agatagcatc taacgaaaat atggataata tgggatctag cacactcgaa 1140 ttgaggtcac gatactgggc tattagaaca cggagcggag ggaacaccaa ccagcagaga 1200 gcatccgccg gtcagataag cgttcagcct acattttcag tacaacgaaa cctgccattt 1260 gaaaagagta cagtgatggc cgcatttact ggcaacaccg agggacgaac aagcgacatg 1320 agagcagaga ttattagaat gatggaagga gctaaaccag aggaggtttc atttagagga 1380 aggggagtct tcgaattgtc cgatgagaaa gccacaaatc ccatagtacc tagcttcgac 1440 atgtccaacg aaggctctta cttttttggt gacaatgccg aagagtacga caattga 1497 73 36 DNA Artificial sequence Primer Used to Amplify NP Gene from VR4758 73 gccgaattcg ccaccatggc cagccagggc accaag 36 74 28 DNA Artificial sequence 
Primer Used to Amplify NP Gene from VR4758 74 gccgtcgact gatcagttgt cgtactcc 28 75 1497 DNA Artificial sequence Open Reading Frame for NP Codon Optimized by Applicants from VR4762 75 atggccagcc agggcaccaa gagaagctac gagcagatgg agaccgacgg cgagagacag 60 aacgccaccg agatcagagc cagcgtgggc aagatgatcg acggcatcgg cagattctac 120 atccagatgt gcaccgagct gaagctgagc gactacgagg gcagactgat ccagaacagc 180 ctgaccatcg agagaatggt gctgagcgcc ttcgacgaga gaagaaacag atacctggag 240 gagcacccca gcgccggcaa ggaccccaag aagaccggcg gccccatcta cagaagagtg 300 gacggcaagt ggatgagaga gctggtgctg tacgacaagg aggagatcag aagaatctgg 360 agacaggcca acaacggcga ggacgccacc gccggcctga cccacatgat gatctggcac 420 agcaacctga acgacaccac ctaccagaga accagagccc tggtgcggac cggcatggac 480 cccagaatgt gcagcctgat gcagggcagc accctgccca gaagaagcgg cgccgccggc 540 gccgccgtga agggcatcgg caccatggtg atggagctga tcagaatgat caagagaggc 600 atcaacgaca gaaacttctg gagaggcgag aacggcagaa agaccagaag cgcctacgag 660 agaatgtgca acatcctgaa gggcaagttc cagaccgccg cccagagagc catgatggac 720 caggtccggg agagcagaaa ccccggcaac gccgagatcg aggacctgat cttcctggcc 780 agaagcgccc tgatcctgag aggcagcgtg gcccacaaga gctgcctgcc cgcctgcgtg 840 tacggccccg ccgtgagcag cggctacgac ttcgagaagg agggctacag cctggtgggc 900 atcgacccct tcaagctgct gcagaacagc caggtgtaca gcctgatcag acccaacgag 960 aaccccgccc acaagagcca gctggtgtgg atggcctgcc acagcgccgc cttcgaggac 1020 ctgagactgc tgagcttcat cagaggcacc aaggtgtccc ccagaggcaa gctgagcacc 1080 agaggcgtgc agatcgccag caacgagaac atggacaaca tgggcagcag caccctggag 1140 ctgagaagca gatactgggc catcagaacc agaagcggcg gcaacaccaa ccagcagaga 1200 gccagcgccg gccagatcag cgtgcagccc accttcagcg tgcagagaaa cctgcccttc 1260 gagaagagca ccgtgatggc cgccttcacc ggcaacaccg agggcagaac cagcgacatg 1320 agagccgaga tcatcagaat gatggagggc gccaagcccg aggaggtgtc cttcagaggc 1380 agaggcgtgt tcgagctgag cgacgagaag gccaccaacc ccatcgtgcc tagcttcgac 1440 atgagcaacg agggcagcta cttcttcggc gacaacgccg aggagtacga caactga 1497 76 498 PRT Artificial sequence NP Consensus Sequence 76 Met 
Ala Ser Gln Gly Thr Lys Arg Ser Tyr Glu Gln Met Glu Thr Asp 1 5 10 15 Gly Glu Arg Gln Asn Ala Thr Glu Ile Arg Ala Ser Val Gly Lys Met 20 25 30 Ile Asp Gly Ile Gly Arg Phe Tyr Ile Gln Met Cys Thr Glu Leu Lys 35 40 45 Leu Ser Asp Tyr Glu Gly Arg Leu Ile Gln Asn Ser Leu Thr Ile Glu 50 55 60 Arg Met Val Leu Ser Ala Phe Asp Glu Arg Arg Asn Arg Tyr Leu Glu 65 70 75 80 Glu His Pro Ser Ala Gly Lys Asp Pro Lys Lys Thr Gly Gly Pro Ile 85 90 95 Tyr Arg Arg Val Asp Gly Lys Trp Met Arg Glu Leu Val Leu Tyr Asp 100 105 110 Lys Glu Glu Ile Arg Arg Ile Trp Arg Gln Ala Asn Asn Gly Glu Asp 115 120 125 Ala Thr Ala Gly Leu Thr His Met Met Ile Trp His Ser Asn Leu Asn 130 135 140 Asp Thr Thr Tyr Gln Arg Thr Arg Ala Leu Val Arg Thr Gly Met Asp 145 150 155 160 Pro Arg Met Cys Ser Leu Met Gln Gly Ser Thr Leu Pro Arg Arg Ser 165 170 175 Gly Ala Ala Gly Ala Ala Val Lys Gly Ile Gly Thr Met Val Met Glu 180 185 190 Leu Ile Arg Met Ile Lys Arg Gly Ile Asn Asp Arg Asn Phe Trp Arg 195 200 205 Gly Glu Asn Gly Arg Lys Thr Arg Ser Ala Tyr Glu Arg Met Cys Asn 210 215 220 Ile Leu Lys Gly Lys Phe Gln Thr Ala Ala Gln Arg Ala Met Met Asp 225 230 235 240 Gln Val Arg Glu Ser Arg Asn Pro Gly Asn Ala Glu Ile Glu Asp Leu 245 250 255 Ile Phe Leu Ala Arg Ser Ala Leu Ile Leu Arg Gly Ser Val Ala His 260 265 270 Lys Ser Cys Leu Pro Ala Cys Val Tyr Gly Pro Ala Val Ser Ser Gly 275 280 285 Tyr Asp Phe Glu Lys Glu Gly Tyr Ser Leu Val Gly Ile Asp Pro Phe 290 295 300 Lys Leu Leu Gln Asn Ser Gln Val Tyr Ser Leu Ile Arg Pro Asn Glu 305 310 315 320 Asn Pro Ala His Lys Ser Gln Leu Val Trp Met Ala Cys His Ser Ala 325 330 335 Ala Phe Glu Asp Leu Arg Leu Leu Ser Phe Ile Arg Gly Thr Lys Val 340 345 350 Ser Pro Arg Gly Lys Leu Ser Thr Arg Gly Val Gln Ile Ala Ser Asn 355 360 365 Glu Asn Met Asp Asn Met Gly Ser Ser Thr Leu Glu Leu Arg Ser Arg 370 375 380 Tyr Trp Ala Ile Arg Thr Arg Ser Gly Gly Asn Thr Asn Gln Gln Arg 385 390 395 400 Ala Ser Ala Gly Gln Ile Ser Val Gln Pro Thr Phe Ser Val Gln Arg 405 410 415 Asn Leu Pro Phe Glu Lys 
Ser Thr Val Met Ala Ala Phe Thr Gly Asn 420 425 430 Thr Glu Gly Arg Thr Ser Asp Met Arg Ala Glu Ile Ile Arg Met Met 435 440 445 Glu Gly Ala Lys Pro Glu Glu Val Ser Phe Arg Gly Arg Gly Val Phe 450 455 460 Glu Leu Ser Asp Glu Lys Ala Thr Asn Pro Ile Val Pro Ser Phe Asp 465 470 475 480 Met Ser Asn Glu Gly Ser Tyr Phe Phe Gly Asp Asn Ala Glu Glu Tyr 485 490 495 Asp Asn 77 252 PRT Artificial sequence M1 Gene Consensus Sequence 77 Met Ser Leu Leu Thr Glu Val Glu Thr Tyr Val Leu Ser Ile Val Pro 1 5 10 15 Ser Gly Pro Leu Lys Ala Glu Ile Ala Gln Arg Leu Glu Asp Val Phe 20 25 30 Ala Gly Lys Asn Thr Asp Leu Glu Ala Leu Met Glu Trp Leu Lys Thr 35 40 45 Arg Pro Ile Leu Ser Pro Leu Thr Lys Gly Ile Leu Gly Phe Val Phe 50 55 60 Thr Leu Thr Val Pro Ser Glu Arg Gly Leu Gln Arg Arg Arg Phe Val 65 70 75 80 Gln Asn Ala Leu Asn Gly Asn Gly Asp Pro Asn Asn Met Asp Arg Ala 85 90 95 Val Lys Leu Tyr Arg Lys Leu Lys Arg Glu Ile Thr Phe His Gly Ala 100 105 110 Lys Glu Ile Ala Leu Ser Tyr Ser Ala Gly Ala Leu Ala Ser Cys Met 115 120 125 Gly Leu Ile Tyr Asn Arg Met Gly Ala Val Thr Thr Glu Val Ala Phe 130 135 140 Gly Leu Val Cys Ala Thr Cys Glu Gln Ile Ala Asp Ser Gln His Arg 145 150 155 160 Ser His Arg Gln Met Val Ala Thr Thr Asn Pro Leu Ile Arg His Glu 165 170 175 Asn Arg Met Val Leu Ala Ser Thr Thr Ala Lys Ala Met Glu Gln Met 180 185 190 Ala Gly Ser Ser Glu Gln Ala Ala Glu Ala Met Glu Ile Ala Ser Gln 195 200 205 Ala Arg Gln Met Val Gln Ala Met Arg Ala Ile Gly Thr His Pro Ser 210 215 220 Ser Ser Ala Gly Leu Lys Asp Asp Leu Leu Glu Asn Leu Gln Thr Tyr 225 230 235 240 Gln Lys Arg Met Gly Val Gln Met Gln Arg Phe Lys 245 250 78 97 PRT Artificial sequence M2 Gene Consensus Sequence 78 Met Ser Leu Leu Thr Glu Val Glu Thr Pro Ile Arg Asn Glu Trp Gly 1 5 10 15 Cys Arg Cys Asn Asp Ser Ser Asp Pro Leu Val Val Ala Ala Ser Ile 20 25 30 Ile Gly Ile Leu His Leu Ile Leu Trp Ile Leu Asp Arg Leu Phe Phe 35 40 45 Lys Cys Ile Tyr Arg Leu Phe Lys His Gly Leu Lys Arg Gly Pro Ser 50 55 60 Thr Glu Gly Val Pro Glu Ser 
Met Arg Glu Glu Tyr Arg Lys Glu Gln 65 70 75 80 Gln Asn Ala Val Asp Ala Asp Asp Ser His Phe Val Ser Ile Glu Leu 85 90 95 Glu 79 759 DNA Artificial sequence Optimized M1 Coding Region 79 atgagcctgc tgaccgaggt cgaaacgtat gttctctcta tcgtgcccag cggccccctg 60 aaggccgaga tcgcccagag actggaggac gtgttcgccg gcaagaacac cgacctggag 120 gccctgatgg agtggctgaa gaccagaccc atcctgagcc ccctgaccaa gggcatcctg 180 ggcttcgtgt tcaccctgac cgtgcccagc gagagaggcc tgcagagaag aagattcgtg 240 cagaacgccc tgaacggcaa cggcgacccc aacaacatgg acagagccgt gaagctgtac 300 agaaagctga agagagagat caccttccac ggcgccaagg agatcgccct gagctacagc 360 gccggcgccc tggccagctg catgggcctg atctacaaca gaatgggcgc cgtgaccacc 420 gaggtggcct tcggcctggt gtgcgccacc tgcgagcaga tcgccgacag ccagcacaga 480 agccacagac agatggtggc caccaccaac cccctgatca gacacgagaa cagaatggtg 540 ctggccagca ccaccgccaa ggccatggag cagatggccg gcagcagcga gcaggccgcc 600 gaggccatgg agatcgccag ccaggccaga cagatggtgc aggccatgag agccatcggc 660 acccacccca gcagcagcgc cggcctgaaa gatgatcttc ttgaaaattt gcagacctat 720 cagaaacgaa tgggggtgca gatgcaacga ttcaagtga 759 80 294 DNA Artificial sequence Optimized M2 Coding Region 80 atgagcctgc tgaccgaggt cgaaacacct atcagaaacg aatgggggtg cagatgcaac 60 gattcaagtg accccctggt ggtggccgcc agcatcatcg gcatcctgca cctgatcctg 120 tggatcctgg acagactgtt cttcaagtgc atctacagac tgttcaagca cggcctgaag 180 agaggcccca gcaccgaggg cgtgcccgag agcatgagag aggagtacag aaaggagcag 240 cagaacgccg tggacgccga cgacagccac ttcgtgagca tcgagctgga gtga 294 81 9 PRT Artificial sequence H2Kd Binding Peptide 81 Thr Tyr Gln Arg Thr Arg Ala Leu Val 1 5 82 11 DNA Artificial sequence RSV Promoter from Plasmid VCL1005 82 tactctagac g 11 83 11 DNA Artificial sequence Promoter RSV/R 83 tacaataaac g 11 84 27 DNA Artificial Sequence Primer RSVfor 84 catcagctgc tccctgcttg tgtgttg 27 85 19 DNA Artificial sequence Primer WNVpst rev 85 cgatatccga cgacggtga 19 86 39 DNA Artificial sequence Primer RSV HTLV5 86 caccacattg gtgtgcacct ccatcggctc gcatctctc 39 87 42 DNA Artificial sequence 
Primer HTLV RSVrev 87 aggtgcacac caatgtggtg aatggtcaaa tggcgtttat tg 42 88 44 DNA Artificial sequence Primer RSVrev 88 aatggtcaaa tggcgtttat tgtatcgagc taggcactta aata 44 89 6254 DNA Artificial sequence VR-6430, RSV RWNV 89 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta 300 agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg 360 ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt 420 gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta 480 gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc 540
aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc 600 cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg 660 acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag 720 ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg 780 catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg 840 cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt 900 aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag 960 ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg tggagggcag 1020 tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac 1080 taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgga tatcgaattc 1140 gccgccacca tgggcaagcg gagcgctggc tcaatcatgt ggctcgcgag cttggcagtt 1200 gtcatagctt gtgcaggagc cgttaccctc tctaacttcc aagggaaggt gatgatgacg 1260 gtaaatgcta ctgacgtcac agatgtcatc acgattccaa cagctgctgg aaagaaccta 1320 tgcattgtca gagcaatgga tgtgggatac atgtgcgatg atactatcac ctatgaatgc 1380 ccagtgctgt cggctggtaa tgatccagaa gacatcgact gttggtgcac aaagtcagca 1440 gtctacgtca ggtatggaag atgcaccaag acacgccact caagacgcag tcggaggtca 1500 ctgacagtgc agacacacgg agaaagcact ctagcgaaca agaagggggc ttggatggac 1560 agcaccaagg ccacaaggta tttggtaaaa acagaatcat ggatcttgag gaaccctgga 1620 tatgccctgg tggcagccgt cattggttgg atgcttggga gcaacaccat gcagagagtt 1680 gtgtttgtcg tgctattgct tttggtggcc ccagcttaca gcttcaactg ccttggaatg 1740 agcaacagag acttcttgga aggagtgtct ggagcaacat gggtggattt ggttctcgaa 1800 ggcgatagct gcgtgactat catgtctaag gacaagccta ccatcgatgt gaagatgatg 1860 aatatggagg cggccaacct ggcagaggtc cgcagttatt gctatttggc taccgtcagc 1920 gatctctcca ccaaagctgc gtgcccgacc atgggggaag cccacaatga caaacgtgct 1980 gacccagctt ttgtgtgcag acaaggagtg gtggacaggg gctggggcaa cggctgcgga 2040 ctatttggca aaggaagcat tgacacatgc gccaaatttg cctgctctac caaggcaata 2100 ggaagaacca tcttgaaaga gaatatcaag tacgaagtgg ccatttttgt ccatggacca 2160 actactgtgg agtcgcacgg aaactactcc acacaggttg gagccactca ggcagggaga 2220 ttcagcatca 
ctcctgcggc gccttcatac acactaaagc ttggagaata tggagaggtg 2280 acagtggact gtgaaccacg gtcagggatt gacaccaatg catactacgt gatgactgtt 2340 ggaacaaaga cgttcttggt ccatcgtgag tggttcatgg acctcaacct cccttggagc 2400 agtgctggaa gtactgtgtg gaggaacaga gagacgttaa tggagtttga ggaaccacac 2460 gccacgaagc agtctgtgat agcattgggc tcacaagagg gagctctgca tcaagctttg 2520 gctggagcca ttcctgtgga attttcaagc aacactgtca agttgacgtc gggtcatttg 2580 aagtgtagag tgaagatgga aaaattgcag ttgaagggaa caacctatgg cgtctgttca 2640 aaggctttca agtttcttgg gactcccgca gacacaggtc acggcactgt ggtgttggaa 2700 ttgcagtaca ctggcacgga tggaccttgc aaagttccta tctcgtcagt ggcttcattg 2760 aacgacctaa cgccagtggg cagattggtc actgtcaacc cttttgtttc agtggccacg 2820 gccaacgcta aggtcctgat tgaattggaa ccaccctttg gagactcata catagtggtg 2880 ggcagaggag aacaacagat caatcaccat tggcacaagt ctggaagcag cattggcaaa 2940 gcctttacaa ccaccctcaa aggagcgcag agactagccg ctctaggaga cacagcttgg 3000 gactttggat cagttggagg ggtgttcacc tcagttggga aggctgtcca tcaagtgttc 3060 ggaggagcat tccgctcact gttcggaggc atgtcctgga taacgcaagg attgctgggg 3120 gctctcctgt tgtggatggg catcaatgct cgtgataggt ccatagctct cacgtttctc 3180 gcagttggag gagttctgct cttcctctcc gtgaacgtgc acgcttgagg atccagatct 3240 gctgtgcctt ctagttgcca gccatctgtt gtttgcccct cccccgtgcc ttccttgacc 3300 ctggaaggtg ccactcccac tgtcctttcc taataaaatg aggaaattgc atcgcattgt 3360 ctgagtaggt gtcattctat tctggggggt ggggtggggc aggacagcaa gggggaggat 3420 tgggaagaca atagcaggca tgctggggat gcggtgggct ctatgggtac ccaggtgctg 3480 aagaattgac ccggttcctc ctgggccaga aagaagcagg cacatcccct tctctgtgac 3540 acaccctgtc cacgcccctg gttcttagtt ccagccccac tcataggaca ctcatagctc 3600 aggagggctc cgccttcaat cccacccgct aaagtacttg gagcggtctc tccctccctc 3660 atcagcccac caaaccaaac ctagcctcca agagtgggaa gaaattaaag caagataggc 3720 tattaagtgc agagggagag aaaatgcctc caacatgtga ggaagtaatg agagaaatca 3780 tagaatttta aggccatgat ttaaggccat catggcctta atcttccgct tcctcgctca 3840 ctgactcgct gcgctcggtc gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg 3900 taatacggtt atccacagaa 
tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc 3960 agcaaaaggc caggaaccgt aaaaaggccg cgttgctggc gtttttccat aggctccgcc 4020 cccctgacga gcatcacaaa aatcgacgct caagtcagag gtggcgaaac ccgacaggac 4080 tataaagata ccaggcgttt ccccctggaa gctccctcgt gcgctctcct gttccgaccc 4140 tgccgcttac cggatacctg tccgcctttc tcccttcggg aagcgtggcg ctttctcata 4200 gctcacgctg taggtatctc agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc 4260 acgaaccccc cgttcagccc gaccgctgcg ccttatccgg taactatcgt cttgagtcca 4320 acccggtaag acacgactta tcgccactgg cagcagccac tggtaacagg attagcagag 4380 cgaggtatgt aggcggtgct acagagttct tgaagtggtg gcctaactac ggctacacta 4440 gaagaacagt atttggtatc tgcgctctgc tgaagccagt taccttcgga aaaagagttg 4500 gtagctcttg atccggcaaa caaaccaccg ctggtagcgg tggttttttt gtttgcaagc 4560 agcagattac gcgcagaaaa aaaggatctc aagaagatcc tttgatcttt tctacggggt 4620 ctgacgctca gtggaacgaa aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa 4680 ggatcttcac ctagatcctt ttaaattaaa aatgaagttt taaatcaatc taaagtatat 4740 atgagtaaac ttggtctgac agttaccaat gcttaatcag tgaggcacct atctcagcga 4800 tctgtctatt tcgttcatcc atagttgcct gactcggggg gggggggcgc tgaggtctgc 4860 ctcgtgaaga aggtgttgct gactcatacc aggcctgaat cgccccatca tccagccaga 4920 aagtgaggga gccacggttg atgagagctt tgttgtaggt ggaccagttg gtgattttga 4980 acttttgctt tgccacggaa cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca 5040 actcagcaaa agttcgattt attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct 5100 ctgccagtgt tacaaccaat taaccaattc tgattagaaa aactcatcga gcatcaaatg 5160 aaactgcaat ttattcatat caggattatc aataccatat ttttgaaaaa gccgtttctg 5220 taatgaagga gaaaactcac cgaggcagtt ccataggatg gcaagatcct ggtatcggtc 5280 tgcgattccg actcgtccaa catcaataca acctattaat ttcccctcgt caaaaataag 5340 gttatcaagt gagaaatcac catgagtgac gactgaatcc ggtgagaatg gcaaaagctt 5400 atgcatttct ttccagactt gttcaacagg ccagccatta cgctcgtcat caaaatcact 5460 cgcatcaacc aaaccgttat tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc 5520 gctgttaaaa ggacaattac aaacaggaat cgaatgcaac cggcgcagga acactgccag 5580 cgcatcaaca atattttcac ctgaatcagg 
atattcttct aatacctgga atgctgtttt 5640 cccggggatc gcagtggtga gtaaccatgc atcatcagga gtacggataa aatgcttgat 5700 ggtcggaaga ggcataaatt ccgtcagcca gtttagtctg accatctcat ctgtaacatc 5760 attggcaacg ctacctttgc catgtttcag aaacaactct ggcgcatcgg gcttcccata 5820 caatcgatag attgtcgcac ctgattgccc gacattatcg cgagcccatt tatacccata 5880 taaatcagca tccatgttgg aatttaatcg cggcctcgag caagacgttt cccgttgaat 5940 atggctcata acaccccttg tattactgtt tatgtaagca gacagtttta ttgttcatga 6000 tgatatattt ttatcttgtg caatgtaaca tcagagattt tgagacacaa cgtggctttc 6060 cccccccccc cattattgaa gcatttatca gggttattgt ctcatgagcg gatacatatt 6120 tgaatgtatt tagaaaaata aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc 6180 acctgacgtc taagaaacca ttattatcat gacattaacc tataaaaata ggcgtatcac 6240 gaggcccttt cgtc 6254 90 6425 DNA Artificial sequence VR6307, Ligation of VCL6292 into VR6430 90 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta 300 agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg 360 ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt 420 gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta 480 gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc 540 aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc 600 cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg 660 acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag 720 ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg 780 catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg 840 cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt 900 aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag 960 ccggctctcc acgctttgcc tgaccctgct 
tgctcaactc tagttaacgg tggagggcag 1020 tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac 1080 taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgga tatcgccacc 1140 atggatgcaa tgaagagagg gctctgctgt gtgctgctgc tgtgtggagc agtcttcgtt 1200 tcgcccagcg aagtgaagca agaaaatcga cttctgaacg agagcgaaag ttcatcacag 1260 ggtcttctcg gatactactt cagtgacttg aatttccaag caccaatggt ggtgactagt 1320 agcaccaccg gcgatttgag cattcccagc tctgagttgg agaacattcc cagcgaaaat 1380 cagtacttcc agtctgctat ctggtccgga ttcattaagg ttaaaaagtc cgacgaatat 1440 acatttgcta cctcggcgga taaccatgtg acaatgtggg tggacgacca ggaagtgatc 1500 aacaaggctt caaactctaa taaaatccgg ctcgagaagg ggaggctcta ccagatcaaa 1560 attcagtacc agcgggaaaa ccctacagaa aaaggactcg atttcaagct gtactggaca 1620 gatagccaaa acaagaaaga agttatcagc tcagacaatc tgcagttacc cgagctcaag 1680 cagaagagtt ctaatacaag cgctgggcca actgtgcccg acagagacaa tgatggaatc 1740 cctgatagtc tagaggttga gggatacacg gtagatgtca agaacaaaag gacttttctc 1800 tcgccttgga tctcaaatat ccatgagaag aaggggctta ccaagtacaa gtcctccccc 1860 gagaagtggt ctaccgcttc cgatccatat agcgatttcg agaaggtcac aggccggatc 1920 gataaaaatg tgtctccaga ggctagacac cccctggtag cagcctaccc gattgtacac 1980 gtggacatgg agaacatcat tctaagcaaa aacgaggacc agtccacaca aaacactgac 2040 tccgagaccc gcaccatatc taaaaacacc agtacttcaa ggacccacac ctctgaagtg 2100 cacggcaatg cggaagtcca tgcatcgttt ttcgatattg gtggctccgt gtcagccggc 2160 tttagcaata gcaactcctc gacggttgcc attgaccact cactgtcatt agcaggtgag 2220 aggacttggg ctgaaactat gggtctgaat accgccgata cggcccggct caacgcaaat 2280 attcggtacg tcaacacagg gactgctcct atatataacg tgctgcctac gacaagtctt 2340 gtcctgggca aaaatcagac cctcgcaacc attaaggcaa aggaaaatca gctgagccag 2400 atcctcgccc ctaacaacta ttatccatcc aaaaatttag cccccatagc cctgaacgcc 2460 caggacgact tttcctctac ccccataact atgaattaca atcagttcct ggagctggaa 2520 aagacgaagc agctgagact agacaccgat caggtgtatg gaaacatagc gacatataac 2580 tttgagaacg gccgcgtgcg cgtcgacact gggtcaaact ggtctgaagt tctgccgcaa 2640 attcaagaga caaccgccag aattatcttt aatgggaagg 
acttgaacct tgtcgaacgt 2700 agaattgccg ccgtgaaccc cagtgatcca ctcgagacga ctaaaccgga tatgacactg 2760 aaagaggctc tgaagattgc cttcggattc aacgaaccta atggcaattt gcagtatcag 2820 gggaaagaca tcacagagtt tgatttcaat ttcgatcagc agacttccca aaatatcaaa 2880 aatcagttgg cagagctgaa tgccaccaat atctacacgg ttctcgataa aatcaaactt 2940 aacgccaaga tgaacatatt gattcgagac aaacgcttcc actacgaccg caacaatata 3000 gccgtaggcg ctgatgagtc tgtcgtcaag gaggctcata gggaagttat caacagcagt 3060 actgaagggc tgttacttaa tatcgacaag gacattcgga agatcctgtc cgggtatatc 3120 gtggagatcg aggataccga gggcctgaag gaagtcatta acgaccgcta tgatatgctg 3180 aacatttcca gcttacgaca ggacggtaag acatttattg actttaaaaa gtataacgac 3240 aagctacccc tgtacatttc caacccaaat tacaaagtta atgtgtatgc tgtaaccaag 3300 gagaacacaa tcatcaatcc aagcgagaac ggcgatacca gcacaaatgg aatcaaaaag 3360 atccttatat ttagtaaaaa aggctacgag atcggttgag gatccagatc tgctgtgcct 3420 tctagttgcc agccatctgt tgtttgcccc tcccccgtgc cttccttgac cctggaaggt 3480 gccactccca ctgtcctttc ctaataaaat gaggaaattg catcgcattg tctgagtagg 3540 tgtcattcta ttctgggggg tggggtgggg caggacagca agggggagga ttgggaagac 3600 aatagcaggc atgctgggga tgcggtgggc tctatgggta cccaggtgct gaagaattga 3660 cccggttcct cctgggccag aaagaagcag gcacatcccc ttctctgtga cacaccctgt 3720 ccacgcccct ggttcttagt tccagcccca ctcataggac actcatagct caggagggct 3780 ccgccttcaa tcccacccgc taaagtactt ggagcggtct ctccctccct catcagccca 3840 ccaaaccaaa cctagcctcc aagagtggga agaaattaaa gcaagatagg ctattaagtg 3900 cagagggaga gaaaatgcct ccaacatgtg aggaagtaat gagagaaatc atagaatttt 3960 aaggccatga tttaaggcca tcatggcctt aatcttccgc ttcctcgctc actgactcgc 4020 tgcgctcggt cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt 4080 tatccacaga atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg 4140 ccaggaaccg taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg 4200 agcatcacaa aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat 4260 accaggcgtt tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta 4320 ccggatacct gtccgccttt ctcccttcgg gaagcgtggc gctttctcat 
agctcacgct 4380 gtaggtatct cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc 4440 ccgttcagcc cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa 4500 gacacgactt atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg 4560 taggcggtgc tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag 4620 tatttggtat ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt 4680 gatccggcaa acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta 4740 cgcgcagaaa aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc 4800 agtggaacga aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca 4860 cctagatcct tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa 4920 cttggtctga cagttaccaa tgcttaatca gtgaggcacc tatctcagcg atctgtctat 4980 ttcgttcatc catagttgcc tgactcgggg ggggggggcg ctgaggtctg cctcgtgaag 5040 aaggtgttgc tgactcatac caggcctgaa tcgccccatc atccagccag aaagtgaggg 5100 agccacggtt gatgagagct ttgttgtagg tggaccagtt ggtgattttg aacttttgct 5160 ttgccacgga acggtctgcg ttgtcgggaa gatgcgtgat ctgatccttc aactcagcaa 5220 aagttcgatt tattcaacaa agccgccgtc ccgtcaagtc agcgtaatgc tctgccagtg 5280 ttacaaccaa ttaaccaatt ctgattagaa aaactcatcg agcatcaaat gaaactgcaa 5340 tttattcata tcaggattat caataccata tttttgaaaa agccgtttct gtaatgaagg 5400 agaaaactca ccgaggcagt tccataggat ggcaagatcc tggtatcggt ctgcgattcc 5460 gactcgtcca acatcaatac aacctattaa tttcccctcg tcaaaaataa ggttatcaag 5520 tgagaaatca ccatgagtga cgactgaatc cggtgagaat ggcaaaagct tatgcatttc 5580 tttccagact tgttcaacag gccagccatt acgctcgtca tcaaaatcac tcgcatcaac 5640 caaaccgtta ttcattcgtg attgcgcctg agcgagacga aatacgcgat cgctgttaaa 5700 aggacaatta caaacaggaa tcgaatgcaa ccggcgcagg aacactgcca gcgcatcaac 5760 aatattttca cctgaatcag gatattcttc taatacctgg aatgctgttt tcccggggat 5820 cgcagtggtg agtaaccatg catcatcagg agtacggata aaatgcttga tggtcggaag 5880 aggcataaat tccgtcagcc agtttagtct gaccatctca tctgtaacat cattggcaac 5940 gctacctttg ccatgtttca gaaacaactc tggcgcatcg ggcttcccat acaatcgata 6000 gattgtcgca cctgattgcc cgacattatc gcgagcccat ttatacccat ataaatcagc 
6060 atccatgttg gaatttaatc gcggcctcga gcaagacgtt tcccgttgaa tatggctcat 6120 aacacccctt gtattactgt ttatgtaagc agacagtttt attgttcatg atgatatatt 6180 tttatcttgt gcaatgtaac atcagagatt ttgagacaca acgtggcttt cccccccccc 6240 ccattattga agcatttatc agggttattg tctcatgagc ggatacatat ttgaatgtat 6300 ttagaaaaat aaacaaatag gggttccgcg cacatttccc cgaaaagtgc cacctgacgt 6360 ctaagaaacc attattatca tgacattaac ctataaaaat aggcgtatca cgaggccctt 6420 tcgtc 6425 91 5398 DNA Artificial sequence VR4756, Ligation of Segment7 into VR10551 91 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 
gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catgagcctt ctaaccgagg tcgaaacgta 1680 tgttctctct atcgttccat caggccccct caaagccgaa atcgcgcaga gacttgaaga 1740 tgtctttgct gggaaaaaca cagatcttga ggctctcatg gaatggctaa agacaagacc 1800 aatcctgtca cctctgacta aggggatttt ggggtttgtg ttcacgctca ccgtgcccag 1860 tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc ctcaatggga atggggatcc 1920 aaataacatg gacagagcag ttaaactata tagaaaactt aagagggaga ttacattcca 1980 tggggccaaa gaaatagcac tcagttattc tgctggtgca cttgccagtt gcatgggcct 2040 catatacaac agaatggggg ctgtaaccac tgaagtggcc tttggcctgg tatgtgcaac 2100 atgtgaacag attgctgact cccagcacag gtctcatagg caaatggtgg caacaaccaa 2160 tccattaata aggcatgaga acagaatggt tttggccagc actacagcta aggctatgga 2220 gcaaatggct ggatcaagtg agcaggcagc ggaggccatg gaaattgcta gtcaggccag 2280 gcaaatggtg caggcaatga gagccattgg gactcatcct agctccagtg ctggtctaaa 2340 agatgatctt cttgaaaatt tgcagaccta tcagaaacga atgggggtgc agatgcaacg 2400 attcaagtga cccgcttgtt gttgctgcga gtatcattgg gatcttgcac ttgatattgt 2460 ggattcttga tcgtcttttt ttcaaatgca tctatcgact cttcaaacac ggtctgaaaa 2520 gagggccttc tacggaagga gtacctgagt ctatgaggga agaatatcga aaggaacagc 2580 agaatgctgt ggatgctgac gacagtcatt ttgtcagcat agagctggag taatcagtcg 2640 accacgtgtg atccagatct acttctggct aataaaagat cagagctcta
gagatctgtg 2700 tgttggtttt ttgtgtggta ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 2760 gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 2820 tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 2880 aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 2940 aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 3000 ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 3060 tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 3120 agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 3180 gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 3240 tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct 3300 acagagttct tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc 3360 tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 3420 caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 3480 aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 3540 aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 3600 ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac 3660 agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc 3720 atagttgcct gactcggggg gggggggcgc tgaggtctgc ctcgtgaaga aggtgttgct 3780 gactcatacc aggcctgaat cgccccatca tccagccaga aagtgaggga gccacggttg 3840 atgagagctt tgttgtaggt ggaccagttg gtgattttga acttttgctt tgccacggaa 3900 cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca actcagcaaa agttcgattt 3960 attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt tacaaccaat 4020 taaccaattc tgattagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat 4080 caggattatc aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac 4140 cgaggcagtt ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa 4200 catcaataca acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac 4260 catgagtgac gactgaatcc ggtgagaatg gcaaaagctt atgcatttct ttccagactt 4320 gttcaacagg ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat 
4380 tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac 4440 aaacaggaat cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac 4500 ctgaatcagg atattcttct aatacctgga atgctgtttt cccggggatc gcagtggtga 4560 gtaaccatgc atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt 4620 ccgtcagcca gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc 4680 catgtttcag aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac 4740 ctgattgccc gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg 4800 aatttaatcg cggcctcgag caagacgttt cccgttgaat atggctcata acaccccttg 4860 tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg 4920 caatgtaaca tcagagattt tgagacacaa cgtggctttc cccccccccc cattattgaa 4980 gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 5040 aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca 5100 ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc 5160 gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt 5220 gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg 5280 ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccata 5340 tgcggtgtga aataccgcac agatgcgtaa ggagaaaata ccgcatcaga ttggctat 5398 92 4710 DNA Artificial sequence VR4759, Ligation of M2 into 10551 92 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg 
cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catgagcctg ctgaccgagg tggagacccc 1680 catcagaaac gagtggggct gcagatgcaa cgacagcagc gaccccctgg tggtggccgc 1740 cagcatcatc ggcatcctgc acctgatcct gtggatcctg gacagactgt tcttcaagtg 1800 catctacaga ctgttcaagc acggcctgaa gagaggcccc agcaccgagg gcgtgcccga 1860 gagcatgaga gaggagtaca gaaaggagca gcagaacgcc gtggacgccg acgacagcca 1920 cttcgtgagc atcgagctgg agtgatcagt cgaccacgtg tgatccagat ctacttctgg 1980 ctaataaaag atcagagctc tagagatctg tgtgttggtt ttttgtgtgg tactcttccg 2040 cttcctcgct cactgactcg ctgcgctcgg tcgttcggct gcggcgagcg gtatcagctc 2100 actcaaaggc ggtaatacgg ttatccacag aatcagggga taacgcagga aagaacatgt 2160 gagcaaaagg ccagcaaaag gccaggaacc gtaaaaaggc cgcgttgctg gcgtttttcc 2220 ataggctccg cccccctgac gagcatcaca aaaatcgacg ctcaagtcag aggtggcgaa 2280 acccgacagg actataaaga taccaggcgt 
ttccccctgg aagctccctc gtgcgctctc 2340 ctgttccgac cctgccgctt accggatacc tgtccgcctt tctcccttcg ggaagcgtgg 2400 cgctttctca tagctcacgc tgtaggtatc tcagttcggt gtaggtcgtt cgctccaagc 2460 tgggctgtgt gcacgaaccc cccgttcagc ccgaccgctg cgccttatcc ggtaactatc 2520 gtcttgagtc caacccggta agacacgact tatcgccact ggcagcagcc actggtaaca 2580 ggattagcag agcgaggtat gtaggcggtg ctacagagtt cttgaagtgg tggcctaact 2640 acggctacac tagaagaaca gtatttggta tctgcgctct gctgaagcca gttaccttcg 2700 gaaaaagagt tggtagctct tgatccggca aacaaaccac cgctggtagc ggtggttttt 2760 ttgtttgcaa gcagcagatt acgcgcagaa aaaaaggatc tcaagaagat cctttgatct 2820 tttctacggg gtctgacgct cagtggaacg aaaactcacg ttaagggatt ttggtcatga 2880 gattatcaaa aaggatcttc acctagatcc ttttaaatta aaaatgaagt tttaaatcaa 2940 tctaaagtat atatgagtaa acttggtctg acagttacca atgcttaatc agtgaggcac 3000 ctatctcagc gatctgtcta tttcgttcat ccatagttgc ctgactcggg gggggggggc 3060 gctgaggtct gcctcgtgaa gaaggtgttg ctgactcata ccaggcctga atcgccccat 3120 catccagcca gaaagtgagg gagccacggt tgatgagagc tttgttgtag gtggaccagt 3180 tggtgatttt gaacttttgc tttgccacgg aacggtctgc gttgtcggga agatgcgtga 3240 tctgatcctt caactcagca aaagttcgat ttattcaaca aagccgccgt cccgtcaagt 3300 cagcgtaatg ctctgccagt gttacaacca attaaccaat tctgattaga aaaactcatc 3360 gagcatcaaa tgaaactgca atttattcat atcaggatta tcaataccat atttttgaaa 3420 aagccgtttc tgtaatgaag gagaaaactc accgaggcag ttccatagga tggcaagatc 3480 ctggtatcgg tctgcgattc cgactcgtcc aacatcaata caacctatta atttcccctc 3540 gtcaaaaata aggttatcaa gtgagaaatc accatgagtg acgactgaat ccggtgagaa 3600 tggcaaaagc ttatgcattt ctttccagac ttgttcaaca ggccagccat tacgctcgtc 3660 atcaaaatca ctcgcatcaa ccaaaccgtt attcattcgt gattgcgcct gagcgagacg 3720 aaatacgcga tcgctgttaa aaggacaatt acaaacagga atcgaatgca accggcgcag 3780 gaacactgcc agcgcatcaa caatattttc acctgaatca ggatattctt ctaatacctg 3840 gaatgctgtt ttcccgggga tcgcagtggt gagtaaccat gcatcatcag gagtacggat 3900 aaaatgcttg atggtcggaa gaggcataaa ttccgtcagc cagtttagtc tgaccatctc 3960 atctgtaaca tcattggcaa cgctaccttt gccatgtttc 
agaaacaact ctggcgcatc 4020 gggcttccca tacaatcgat agattgtcgc acctgattgc ccgacattat cgcgagccca 4080 tttataccca tataaatcag catccatgtt ggaatttaat cgcggcctcg agcaagacgt 4140 ttcccgttga atatggctca taacacccct tgtattactg tttatgtaag cagacagttt 4200 tattgttcat gatgatatat ttttatcttg tgcaatgtaa catcagagat tttgagacac 4260 aacgtggctt tccccccccc cccattattg aagcatttat cagggttatt gtctcatgag 4320 cggatacata tttgaatgta tttagaaaaa taaacaaata ggggttccgc gcacatttcc 4380 ccgaaaagtg ccacctgacg tctaagaaac cattattatc atgacattaa cctataaaaa 4440 taggcgtatc acgaggccct ttcgtctcgc gcgtttcggt gatgacggtg aaaacctctg 4500 acacatgcag ctcccggaga cggtcacagc ttgtctgtaa gcggatgccg ggagcagaca 4560 agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg ggctggctta actatgcggc 4620 atcagagcag attgtactga gagtgcacca tatgcggtgt gaaataccgc acagatgcgt 4680 aaggagaaaa taccgcatca gattggctat 4710 93 5913 DNA Artificial sequence VR4762, Ligation of NP Consensus into 10551 93 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta 
taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca agagaagcta 1680 cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg 1740 caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag 1800 cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc 1860 cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa 1920 gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct 1980 gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac 2040 cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag 2100 aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag 2160 caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt 2220 gatggagctg atcagaatga tcaagagagg catcaacgac agaaacttct ggagaggcga 2280 gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt 2340 ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa 2400 cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt 2460 ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga 2520 cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag 2580 ccaggtgtac agcctgatca gacccaacga 
gaaccccgcc cacaagagcc agctggtgtg 2640 gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac 2700 caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa 2760 catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac 2820 cagaagcggc ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc 2880 caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac 2940 cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg 3000 cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga gcgacgagaa 3060 ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg 3120 cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc 3180 tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt 3240 ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 3300 ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 3360 tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 3420 tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 3480 gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 3540 ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 3600 tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 3660 agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3720 atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 3780 acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 3840 actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 3900 tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 3960 tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4020 tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 4080 tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 4140 caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 4200 cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg 4260 ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc 
ataccaggcc tgaatcgccc 4320 catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc 4380 agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg 4440 tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca 4500 agtcagcgta atgctctgcc agtgttacaa ccaattaacc aattctgatt agaaaaactc 4560 atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg 4620 aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag 4680 atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta ttaatttccc 4740 ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg aatccggtga 4800 gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc 4860 gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag 4920 acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg 4980 caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac 5040 ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg 5100 gataaaatgc ttgatggtcg gaagaggcat aaattccgtc agccagttta gtctgaccat 5160 ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc 5220 atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc 5280 ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga 5340 cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag 5400 ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga 5460 cacaacgtgg ctttcccccc ccccccatta ttgaagcatt tatcagggtt attgtctcat 5520 gagcggatac atatttgaat gtatttagaa aaataaacaa ataggggttc cgcgcacatt 5580 tccccgaaaa gtgccacctg acgtctaaga aaccattatt atcatgacat taacctataa 5640 aaataggcgt atcacgaggc cctttcgtct cgcgcgtttc ggtgatgacg gtgaaaacct 5700 ctgacacatg cagctcccgg agacggtcac agcttgtctg taagcggatg ccgggagcag 5760 acaagcccgt cagggcgcgt cagcgggtgt tggcgggtgt cggggctggc ttaactatgc 5820 ggcatcagag cagattgtac tgagagtgca ccatatgcgg tgtgaaatac cgcacagatg 5880 cgtaaggaga aaataccgca tcagattggc tat 5913 94 3817 DNA Artificial sequence VR10682 94 tcgcgcgttt cggtgatgac ggtgaaaacc 
tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatggt gcactctcag tacaatctgc tctgatgccg catagttaag ccagtatctg 240 ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta agctacaaca 300 aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg ttttgcgctg 360 cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt gtttaggcga 420 aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta gtttcgcttt 480 tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc aacatggtaa 540 cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc cgattggtgg 600 aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg acatggattg 660 gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag ctcgatactc 720 tagacgccat ttgaccattc accacattgg tgtgcacctc caagcttccg tcaccgtcgt 780 cgacacgtgt gatcagatat cgcggccgct ctagaccagg cgcctggatc cagatctgct 840 gtgccttcta gttgccagcc atctgttgtt tgcccctccc ccgtgccttc cttgaccctg 900 gaaggtgcca ctcccactgt cctttcctaa taaaatgagg aaattgcatc gcattgtctg 960 agtaggtgtc attctattct ggggggtggg gtggggcagg acagcaaggg ggaggattgg 1020 gaagacaata gcaggcatgc tggggatgcg gtgggctcta tgggtaccca ggtgctgaag 1080 aattgacccg gttcctcctg ggccagaaag aagcaggcac atccccttct ctgtgacaca 1140 ccctgtccac gcccctggtt cttagttcca gccccactca taggacactc atagctcagg 1200 agggctccgc cttcaatccc acccgctaaa gtacttggag cggtctctcc ctccctcatc 1260 agcccaccaa accaaaccta gcctccaaga gtgggaagaa attaaagcaa gataggctat 1320 taagtgcaga gggagagaaa atgcctccaa catgtgagga agtaatgaga gaaatcatag 1380 aatttcttcc gcttcctcgc tcactgactc gctgcgctcg gtcgttcggc tgcggcgagc 1440 ggtatcagct cactcaaagg cggtaatacg gttatccaca gaatcagggg ataacgcagg 1500
aaagaacatg tgagcaaaag gccagcaaaa ggccaggaac cgtaaaaagg ccgcgttgct 1560 ggcgtttttc cataggctcc gcccccctga cgagcatcac aaaaatcgac gctcaagtca 1620 gaggtggcga aacccgacag gactataaag ataccaggcg tttccccctg gaagctccct 1680 cgtgcgctct cctgttccga ccctgccgct taccggatac ctgtccgcct ttctcccttc 1740 gggaagcgtg gcgctttctc atagctcacg ctgtaggtat ctcagttcgg tgtaggtcgt 1800 tcgctccaag ctgggctgtg tgcacgaacc ccccgttcag cccgaccgct gcgccttatc 1860 cggtaactat cgtcttgagt ccaacccggt aagacacgac ttatcgccac tggcagcagc 1920 cactggtaac aggattagca gagcgaggta tgtaggcggt gctacagagt tcttgaagtg 1980 gtggcctaac tacggctaca ctagaagaac agtatttggt atctgcgctc tgctgaagcc 2040 agttaccttc ggaaaaagag ttggtagctc ttgatccggc aaacaaacca ccgctggtag 2100 cggtggtttt tttgtttgca agcagcagat tacgcgcaga aaaaaaggat ctcaagaaga 2160 tcctttgatc ttttctacgg ggtctgacgc tcagtggaac gaaaactcac gttaagggat 2220 tttggtcatg agattatcaa aaaggatctt cacctagatc cttttaaatt aaaaatgaag 2280 ttttaaatca atctaaagta tatatgagta aacttggtct gacagttacc aatgcttaat 2340 cagtgaggca cctatctcag cgatctgtct atttcgttca tccatagttg cctgactcgg 2400 gggggggggg cgctgaggtc tgcctcgtga agaaggtgtt gctgactcat accaggcctg 2460 aatcgcccca tcatccagcc agaaagtgag ggagccacgg ttgatgagag ctttgttgta 2520 ggtggaccag ttggtgattt tgaacttttg ctttgccacg gaacggtctg cgttgtcggg 2580 aagatgcgtg atctgatcct tcaactcagc aaaagttcga tttattcaac aaagccgccg 2640 tcccgtcaag tcagcgtaat gctctgccag tgttacaacc aattaaccaa ttctgattag 2700 aaaaactcat cgagcatcaa atgaaactgc aatttattca tatcaggatt atcaatacca 2760 tatttttgaa aaagccgttt ctgtaatgaa ggagaaaact caccgaggca gttccatagg 2820 atggcaagat cctggtatcg gtctgcgatt ccgactcgtc caacatcaat acaacctatt 2880 aatttcccct cgtcaaaaat aaggttatca agtgagaaat caccatgagt gacgactgaa 2940 tccggtgaga atggcaaaag cttatgcatt tctttccaga cttgttcaac aggccagcca 3000 ttacgctcgt catcaaaatc actcgcatca accaaaccgt tattcattcg tgattgcgcc 3060 tgagcgagac gaaatacgcg atcgctgtta aaaggacaat tacaaacagg aatcgaatgc 3120 aaccggcgca ggaacactgc cagcgcatca acaatatttt cacctgaatc aggatattct 3180 tctaatacct 
ggaatgctgt tttcccgggg atcgcagtgg tgagtaacca tgcatcatca 3240 ggagtacgga taaaatgctt gatggtcgga agaggcataa attccgtcag ccagtttagt 3300 ctgaccatct catctgtaac atcattggca acgctacctt tgccatgttt cagaaacaac 3360 tctggcgcat cgggcttccc atacaatcga tagattgtcg cacctgattg cccgacatta 3420 tcgcgagccc atttataccc atataaatca gcatccatgt tggaatttaa tcgcggcctc 3480 gagcaagacg tttcccgttg aatatggctc ataacacccc ttgtattact gtttatgtaa 3540 gcagacagtt ttattgttca tgatgatata tttttatctt gtgcaatgta acatcagaga 3600 ttttgagaca caacgtggct ttcccccccc ccccattatt gaagcattta tcagggttat 3660 tgtctcatga gcggatacat atttgaatgt atttagaaaa ataaacaaat aggggttccg 3720 cgcacatttc cccgaaaagt gccacctgac gtctaagaaa ccattattat catgacatta 3780 acctataaaa ataggcgtat cacgaggccc tttcgtc 3817 95 4822 DNA Artificial sequence VR4764, Ligation of VR4756 RV-SalI into VR10682 RV 95 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatggt gcactctcag tacaatctgc tctgatgccg catagttaag ccagtatctg 240 ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta agctacaaca 300 aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg ttttgcgctg 360 cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt gtttaggcga 420 aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta gtttcgcttt 480 tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc aacatggtaa 540 cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc cgattggtgg 600 aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg acatggattg 660 gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag ctcgatactc 720 tagacgccat ttgaccattc accacattgg tgtgcacctc caagcttccg tcaccgtcgt 780 cgacacgtgt gatcagatat cgaattcgcc accatgagcc ttctaaccga ggtcgaaacg 840 tatgttctct ctatcgttcc atcaggcccc ctcaaagccg aaatcgcgca gagacttgaa 900 gatgtctttg ctgggaaaaa cacagatctt gaggctctca tggaatggct aaagacaaga 960 ccaatcctgt cacctctgac taaggggatt ttggggtttg 
tgttcacgct caccgtgccc 1020 agtgagcgag gactgcagcg tagacgcttt gtccaaaatg ccctcaatgg gaatggggat 1080 ccaaataaca tggacagagc agttaaacta tatagaaaac ttaagaggga gattacattc 1140 catggggcca aagaaatagc actcagttat tctgctggtg cacttgccag ttgcatgggc 1200 ctcatataca acagaatggg ggctgtaacc actgaagtgg cctttggcct ggtatgtgca 1260 acatgtgaac agattgctga ctcccagcac aggtctcata ggcaaatggt ggcaacaacc 1320 aatccattaa taaggcatga gaacagaatg gttttggcca gcactacagc taaggctatg 1380 gagcaaatgg ctggatcaag tgagcaggca gcggaggcca tggaaattgc tagtcaggcc 1440 aggcaaatgg tgcaggcaat gagagccatt gggactcatc ctagctccag tgctggtcta 1500 aaagatgatc ttcttgaaaa tttgcagacc tatcagaaac gaatgggggt gcagatgcaa 1560 cgattcaagt gacccgcttg ttgttgctgc gagtatcatt gggatcttgc acttgatatt 1620 gtggattctt gatcgtcttt ttttcaaatg catctatcga ctcttcaaac acggtctgaa 1680 aagagggcct tctacggaag gagtacctga gtctatgagg gaagaatatc gaaaggaaca 1740 gcagaatgct gtggatgctg acgacagtca ttttgtcagc atagagctgg agtaatcagt 1800 cgaatcgcgg ccgctctaga ccaggcgcct ggatccagat ctgctgtgcc ttctagttgc 1860 cagccatctg ttgtttgccc ctcccccgtg ccttccttga ccctggaagg tgccactccc 1920 actgtccttt cctaataaaa tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct 1980 attctggggg gtggggtggg gcaggacagc aagggggagg attgggaaga caatagcagg 2040 catgctgggg atgcggtggg ctctatgggt acccaggtgc tgaagaattg acccggttcc 2100 tcctgggcca gaaagaagca ggcacatccc cttctctgtg acacaccctg tccacgcccc 2160 tggttcttag ttccagcccc actcatagga cactcatagc tcaggagggc tccgccttca 2220 atcccacccg ctaaagtact tggagcggtc tctccctccc tcatcagccc accaaaccaa 2280 acctagcctc caagagtggg aagaaattaa agcaagatag gctattaagt gcagagggag 2340 agaaaatgcc tccaacatgt gaggaagtaa tgagagaaat catagaattt cttccgcttc 2400 ctcgctcact gactcgctgc gctcggtcgt tcggctgcgg cgagcggtat cagctcactc 2460 aaaggcggta atacggttat ccacagaatc aggggataac gcaggaaaga acatgtgagc 2520 aaaaggccag caaaaggcca ggaaccgtaa aaaggccgcg ttgctggcgt ttttccatag 2580 gctccgcccc cctgacgagc atcacaaaaa tcgacgctca agtcagaggt ggcgaaaccc 2640 gacaggacta taaagatacc aggcgtttcc ccctggaagc tccctcgtgc 
gctctcctgt 2700 tccgaccctg ccgcttaccg gatacctgtc cgcctttctc ccttcgggaa gcgtggcgct 2760 ttctcatagc tcacgctgta ggtatctcag ttcggtgtag gtcgttcgct ccaagctggg 2820 ctgtgtgcac gaaccccccg ttcagcccga ccgctgcgcc ttatccggta actatcgtct 2880 tgagtccaac ccggtaagac acgacttatc gccactggca gcagccactg gtaacaggat 2940 tagcagagcg aggtatgtag gcggtgctac agagttcttg aagtggtggc ctaactacgg 3000 ctacactaga agaacagtat ttggtatctg cgctctgctg aagccagtta ccttcggaaa 3060 aagagttggt agctcttgat ccggcaaaca aaccaccgct ggtagcggtg gtttttttgt 3120 ttgcaagcag cagattacgc gcagaaaaaa aggatctcaa gaagatcctt tgatcttttc 3180 tacggggtct gacgctcagt ggaacgaaaa ctcacgttaa gggattttgg tcatgagatt 3240 atcaaaaagg atcttcacct agatcctttt aaattaaaaa tgaagtttta aatcaatcta 3300 aagtatatat gagtaaactt ggtctgacag ttaccaatgc ttaatcagtg aggcacctat 3360 ctcagcgatc tgtctatttc gttcatccat agttgcctga ctcggggggg gggggcgctg 3420 aggtctgcct cgtgaagaag gtgttgctga ctcataccag gcctgaatcg ccccatcatc 3480 cagccagaaa gtgagggagc cacggttgat gagagctttg ttgtaggtgg accagttggt 3540 gattttgaac ttttgctttg ccacggaacg gtctgcgttg tcgggaagat gcgtgatctg 3600 atccttcaac tcagcaaaag ttcgatttat tcaacaaagc cgccgtcccg tcaagtcagc 3660 gtaatgctct gccagtgtta caaccaatta accaattctg attagaaaaa ctcatcgagc 3720 atcaaatgaa actgcaattt attcatatca ggattatcaa taccatattt ttgaaaaagc 3780 cgtttctgta atgaaggaga aaactcaccg aggcagttcc ataggatggc aagatcctgg 3840 tatcggtctg cgattccgac tcgtccaaca tcaatacaac ctattaattt cccctcgtca 3900 aaaataaggt tatcaagtga gaaatcacca tgagtgacga ctgaatccgg tgagaatggc 3960 aaaagcttat gcatttcttt ccagacttgt tcaacaggcc agccattacg ctcgtcatca 4020 aaatcactcg catcaaccaa accgttattc attcgtgatt gcgcctgagc gagacgaaat 4080 acgcgatcgc tgttaaaagg acaattacaa acaggaatcg aatgcaaccg gcgcaggaac 4140 actgccagcg catcaacaat attttcacct gaatcaggat attcttctaa tacctggaat 4200 gctgttttcc cggggatcgc agtggtgagt aaccatgcat catcaggagt acggataaaa 4260 tgcttgatgg tcggaagagg cataaattcc gtcagccagt ttagtctgac catctcatct 4320 gtaacatcat tggcaacgct acctttgcca tgtttcagaa acaactctgg cgcatcgggc 
4380 ttcccataca atcgatagat tgtcgcacct gattgcccga cattatcgcg agcccattta 4440 tacccatata aatcagcatc catgttggaa tttaatcgcg gcctcgagca agacgtttcc 4500 cgttgaatat ggctcataac accccttgta ttactgttta tgtaagcaga cagttttatt 4560 gttcatgatg atatattttt atcttgtgca atgtaacatc agagattttg agacacaacg 4620 tggctttccc ccccccccca ttattgaagc atttatcagg gttattgtct catgagcgga 4680 tacatatttg aatgtattta gaaaaataaa caaatagggg ttccgcgcac atttccccga 4740 aaagtgccac ctgacgtcta agaaaccatt attatcatga cattaaccta taaaaatagg 4800 cgtatcacga ggccctttcg tc 4822 96 5341 DNA Artificial sequence VR4765, Ligation of NP from 4762 into VR10682 96 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatggt gcactctcag tacaatctgc tctgatgccg catagttaag ccagtatctg 240 ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta agctacaaca 300 aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg ttttgcgctg 360 cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt gtttaggcga 420 aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta gtttcgcttt 480 tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc aacatggtaa 540 cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc cgattggtgg 600 aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg acatggattg 660 gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag ctcgatactc 720 tagacgccat ttgaccattc accacattgg tgtgcacctc caagcttccg tcaccgtcgt 780 cgacacgtgt gatcagatat cgaattcgcc accatggcca gccagggcac caagagaagc 840 tacgagcaga tggagaccga cggcgagaga cagaacgcca ccgagatcag agccagcgtg 900 ggcaagatga tcgacggcat cggcagattc tacatccaga tgtgcaccga gctgaagctg 960 agcgactacg agggcagact gatccagaac agcctgacca tcgagagaat ggtgctgagc 1020 gccttcgacg agagaagaaa cagatacctg gaggagcacc ccagcgccgg caaggacccc 1080 aagaagaccg gcggccccat ctacagaaga gtggacggca agtggatgag agagctggtg 1140 ctgtacgaca aggaggagat cagaagaatc tggagacagg ccaacaacgg 
cgaggacgcc 1200 accgccggcc tgacccacat gatgatctgg cacagcaacc tgaacgacac cacctaccag 1260 agaaccagag ccctggtgcg gaccggcatg gaccccagaa tgtgcagcct gatgcagggc 1320 agcaccctgc ccagaagaag cggcgccgcc ggcgccgccg tgaagggcat cggcaccatg 1380 gtgatggagc tgatcagaat gatcaagaga ggcatcaacg acagaaactt ctggagaggc 1440 gagaacggca gaaagaccag aagcgcctac gagagaatgt gcaacatcct gaagggcaag 1500 ttccagaccg ccgcccagag agccatgatg gaccaggtcc gggagagcag aaaccccggc 1560 aacgccgaga tcgaggacct gatcttcctg gccagaagcg ccctgatcct gagaggcagc 1620 gtggcccaca agagctgcct gcccgcctgc gtgtacggcc ccgccgtgag cagcggctac 1680 gacttcgaga aggagggcta cagcctggtg ggcatcgacc ccttcaagct gctgcagaac 1740 agccaggtgt acagcctgat cagacccaac gagaaccccg cccacaagag ccagctggtg 1800 tggatggcct gccacagcgc cgccttcgag gacctgagac tgctgagctt catcagaggc 1860 accaaggtgt cccccagagg caagctgagc accagaggcg tgcagatcgc cagcaacgag 1920 aacatggaca acatgggcag cagcaccctg gagctgagaa gcagatactg ggccatcaga 1980 accagaagcg gcggcaacac caaccagcag agagccagcg ccggccagat cagcgtgcag 2040 cccaccttca gcgtgcagag aaacctgccc ttcgagaaga gcaccgtgat ggccgccttc 2100 accggcaaca ccgagggcag aaccagcgac atgagagccg agatcatcag aatgatggag 2160 ggcgccaagc ccgaggaggt gtccttcaga ggcagaggcg tgttcgagct gagcgacgag 2220 aaggccacca accccatcgt gcctagcttc gacatgagca acgagggcag ctacttcttc 2280 ggcgacaacg ccgaggagta cgacaactga tcagtcgacc acatcgcggc cgctctagac 2340 caggcgcctg gatccagatc tgctgtgcct tctagttgcc agccatctgt tgtttgcccc 2400 tcccccgtgc cttccttgac cctggaaggt gccactccca ctgtcctttc ctaataaaat 2460 gaggaaattg catcgcattg tctgagtagg tgtcattcta ttctgggggg tggggtgggg 2520 caggacagca agggggagga ttgggaagac aatagcaggc atgctgggga tgcggtgggc 2580 tctatgggta cccaggtgct gaagaattga cccggttcct cctgggccag aaagaagcag 2640 gcacatcccc ttctctgtga cacaccctgt ccacgcccct ggttcttagt tccagcccca 2700 ctcataggac actcatagct caggagggct ccgccttcaa tcccacccgc taaagtactt 2760 ggagcggtct ctccctccct catcagccca ccaaaccaaa cctagcctcc aagagtggga 2820 agaaattaaa gcaagatagg ctattaagtg cagagggaga gaaaatgcct ccaacatgtg 
2880 aggaagtaat gagagaaatc atagaatttc ttccgcttcc tcgctcactg actcgctgcg 2940 ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc 3000 cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag 3060 gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 3120 tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 3180 ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 3240 atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 3300 gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 3360 tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 3420 cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 3480 cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt 3540 tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc 3600 cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg 3660 cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg 3720 gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta 3780 gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg 3840 gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg 3900 ttcatccata gttgcctgac tcgggggggg ggggcgctga ggtctgcctc gtgaagaagg 3960 tgttgctgac tcataccagg cctgaatcgc cccatcatcc agccagaaag tgagggagcc 4020 acggttgatg agagctttgt tgtaggtgga ccagttggtg attttgaact tttgctttgc 4080 cacggaacgg tctgcgttgt cgggaagatg cgtgatctga tccttcaact cagcaaaagt 4140 tcgatttatt caacaaagcc gccgtcccgt caagtcagcg taatgctctg ccagtgttac 4200 aaccaattaa ccaattctga ttagaaaaac tcatcgagca tcaaatgaaa ctgcaattta 4260 ttcatatcag gattatcaat accatatttt tgaaaaagcc gtttctgtaa tgaaggagaa 4320 aactcaccga ggcagttcca taggatggca agatcctggt atcggtctgc gattccgact 4380 cgtccaacat caatacaacc tattaatttc ccctcgtcaa aaataaggtt atcaagtgag 4440 aaatcaccat gagtgacgac tgaatccggt gagaatggca aaagcttatg catttctttc 4500 cagacttgtt caacaggcca gccattacgc tcgtcatcaa aatcactcgc atcaaccaaa 4560 
ccgttattca ttcgtgattg cgcctgagcg agacgaaata cgcgatcgct gttaaaagga 4620 caattacaaa caggaatcga atgcaaccgg cgcaggaaca ctgccagcgc atcaacaata 4680 ttttcacctg aatcaggata ttcttctaat acctggaatg ctgttttccc ggggatcgca 4740 gtggtgagta accatgcatc atcaggagta cggataaaat gcttgatggt cggaagaggc 4800 ataaattccg tcagccagtt tagtctgacc atctcatctg taacatcatt ggcaacgcta 4860 cctttgccat gtttcagaaa caactctggc gcatcgggct tcccatacaa tcgatagatt 4920 gtcgcacctg attgcccgac attatcgcga gcccatttat acccatataa atcagcatcc 4980 atgttggaat ttaatcgcgg cctcgagcaa gacgtttccc gttgaatatg gctcataaca 5040 ccccttgtat tactgtttat gtaagcagac agttttattg ttcatgatga tatattttta 5100 tcttgtgcaa tgtaacatca gagattttga gacacaacgt ggctttcccc ccccccccat 5160 tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag 5220 aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccacc tgacgtctaa 5280 gaaaccatta ttatcatgac attaacctat aaaaataggc gtatcacgag gccctttcgt 5340 c 5341 97 7798 DNA Artificial sequence VR4766, Ligation of Seg7 into VR4762 97 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta 
tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca agagaagcta 1680 cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg 1740 caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag 1800 cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc 1860 cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa 1920 gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct 1980 gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac 2040 cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag 2100 aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag 2160 caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt 2220 gatggagctg atcagaatga
tcaagagagg catcaacgac agaaacttct ggagaggcga 2280 gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt 2340 ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa 2400 cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt 2460 ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga 2520 cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag 2580 ccaggtgtac agcctgatca gacccaacga gaaccccgcc cacaagagcc agctggtgtg 2640 gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac 2700 caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa 2760 catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac 2820 cagaagcggc ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc 2880 caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac 2940 cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg 3000 cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga gcgacgagaa 3060 ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg 3120 cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc 3180 tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt 3240 ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 3300 ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 3360 tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 3420 tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 3480 gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 3540 ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 3600 tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 3660 agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3720 atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 3780 acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 3840 actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 3900 tcggaaaaag agttggtagc tcttgatccg 
gcaaacaaac caccgctggt agcggtggtt 3960 tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4020 tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 4080 tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 4140 caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 4200 cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg 4260 ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc tgaatcgccc 4320 catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc 4380 agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg 4440 tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca 4500 agtcagcgta atgctctgcc agtgttacaa ccaattaacc aattctgatt agaaaaactc 4560 atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg 4620 aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag 4680 atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta ttaatttccc 4740 ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg aatccggtga 4800 gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc 4860 gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag 4920 acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg 4980 caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac 5040 ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg 5100 gataaaatgc ttgatggtcg gaagaggcat aaattccgtc agccagttta gtctgaccat 5160 ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc 5220 atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc 5280 ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga 5340 cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag 5400 ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga 5460 cactatggtg cactctcagt acaatctgct ctgatgccgc atagttaagc cagtatctgc 5520 tccctgcttg tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa gctacaacaa 5580 ggcaaggctt gaccgacaat tgcatgaaga atctgcttag 
ggttaggcgt tttgcgctgc 5640 ttcgcgatgt acgggccaga tatacgcgta tctgagggga ctagggtgtg tttaggcgaa 5700 aagcggggct tcggttgtac gcggttagga gtcccctcag gatatagtag tttcgctttt 5760 gcatagggag ggggaaatgt agtcttatgc aatactcttg tagtcttgca acatggtaac 5820 gatgagttag caacatgcct tacaaggaga gaaaaagcac cgtgcatgcc gattggtgga 5880 agtaaggtgg tacgatcgtg ccttattagg aaggcaacag acgggtctga catggattgg 5940 acgaaccact gaattccgca ttgcagagat attgtattta agtgcctagc tcgatactct 6000 agacgccatt tgaccattca ccacattggt gtgcacctcc aagcttccgt caccgtcgtc 6060 gacacgtgtg atcagatatc gaattcgcca ccatgagcct tctaaccgag gtcgaaacgt 6120 atgttctctc tatcgttcca tcaggccccc tcaaagccga aatcgcgcag agacttgaag 6180 atgtctttgc tgggaaaaac acagatcttg aggctctcat ggaatggcta aagacaagac 6240 caatcctgtc acctctgact aaggggattt tggggtttgt gttcacgctc accgtgccca 6300 gtgagcgagg actgcagcgt agacgctttg tccaaaatgc cctcaatggg aatggggatc 6360 caaataacat ggacagagca gttaaactat atagaaaact taagagggag attacattcc 6420 atggggccaa agaaatagca ctcagttatt ctgctggtgc acttgccagt tgcatgggcc 6480 tcatatacaa cagaatgggg gctgtaacca ctgaagtggc ctttggcctg gtatgtgcaa 6540 catgtgaaca gattgctgac tcccagcaca ggtctcatag gcaaatggtg gcaacaacca 6600 atccattaat aaggcatgag aacagaatgg ttttggccag cactacagct aaggctatgg 6660 agcaaatggc tggatcaagt gagcaggcag cggaggccat ggaaattgct agtcaggcca 6720 ggcaaatggt gcaggcaatg agagccattg ggactcatcc tagctccagt gctggtctaa 6780 aagatgatct tcttgaaaat ttgcagacct atcagaaacg aatgggggtg cagatgcaac 6840 gattcaagtg acccgcttgt tgttgctgcg agtatcattg ggatcttgca cttgatattg 6900 tggattcttg atcgtctttt tttcaaatgc atctatcgac tcttcaaaca cggtctgaaa 6960 agagggcctt ctacggaagg agtacctgag tctatgaggg aagaatatcg aaaggaacag 7020 cagaatgctg tggatgctga cgacagtcat tttgtcagca tagagctgga gtaatcagtc 7080 gaccacatcg cggccgctct agaccaggcg cctggatcca gatctgctgt gccttctagt 7140 tgccagccat ctgttgtttg cccctccccc gtgccttcct tgaccctgga aggtgccact 7200 cccactgtcc tttcctaata aaatgaggaa attgcatcgc attgtctgag taggtgtcat 7260 tctattctgg ggggtggggt ggggcaggac agcaaggggg aggattggga 
agacaatagc 7320 aggcatgctg gggatgcggt gggctctatg ggtggctttc cccccccccc cattattgaa 7380 gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 7440 aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca 7500 ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc 7560 gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt 7620 gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg 7680 ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccata 7740 tgcggtgtga aataccgcac agatgcgtaa ggagaaaata ccgcatcaga ttggctat 7798 98 7798 DNA Artificial sequence VR4767, Ligation of Inverted RSVSeg7 into VR4762 98 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 
gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca agagaagcta 1680 cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg 1740 caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag 1800 cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc 1860 cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa 1920 gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct 1980 gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac 2040 cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag 2100 aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag 2160 caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt 2220 gatggagctg atcagaatga tcaagagagg catcaacgac agaaacttct ggagaggcga 2280 gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt 2340 ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa 2400 cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt 2460 ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga 2520 cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag 2580 ccaggtgtac agcctgatca gacccaacga gaaccccgcc cacaagagcc agctggtgtg 2640 gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac 2700 caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa 2760 catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac 2820 cagaagcggc 
ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc 2880 caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac 2940 cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg 3000 cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga gcgacgagaa 3060 ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg 3120 cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc 3180 tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt 3240 ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 3300 ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 3360 tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 3420 tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 3480 gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 3540 ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 3600 tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 3660 agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3720 atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 3780 acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 3840 actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 3900 tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 3960 tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4020 tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 4080 tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 4140 caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 4200 cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg 4260 ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc tgaatcgccc 4320 catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc 4380 agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg 4440 tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca 4500 agtcagcgta atgctctgcc 
agtgttacaa ccaattaacc aattctgatt agaaaaactc 4560 atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg 4620 aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag 4680 atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta ttaatttccc 4740 ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg aatccggtga 4800 gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc 4860 gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag 4920 acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg 4980 caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac 5040 ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg 5100 gataaaatgc ttgatggtcg gaagaggcat aaattccgtc agccagttta gtctgaccat 5160 ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc 5220 atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc 5280 ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga 5340 cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag 5400 ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga 5460 cacccataga gcccaccgca tccccagcat gcctgctatt gtcttcccaa tcctccccct 5520 tgctgtcctg ccccacccca ccccccagaa tagaatgaca cctactcaga caatgcgatg 5580 caatttcctc attttattag gaaaggacag tgggagtggc accttccagg gtcaaggaag 5640 gcacggggga ggggcaaaca acagatggct ggcaactaga aggcacagca gatctggatc 5700 caggcgcctg gtctagagcg gccgcgatgt ggtcgactga ttactccagc tctatgctga 5760 caaaatgact gtcgtcagca tccacagcat tctgctgttc ctttcgatat tcttccctca 5820 tagactcagg tactccttcc gtagaaggcc ctcttttcag accgtgtttg aagagtcgat 5880 agatgcattt gaaaaaaaga cgatcaagaa tccacaatat caagtgcaag atcccaatga 5940 tactcgcagc aacaacaagc gggtcacttg aatcgttgca tctgcacccc cattcgtttc 6000 tgataggtct gcaaattttc aagaagatca tcttttagac cagcactgga gctaggatga 6060 gtcccaatgg ctctcattgc ctgcaccatt tgcctggcct gactagcaat ttccatggcc 6120 tccgctgcct gctcacttga tccagccatt tgctccatag ccttagctgt agtgctggcc 6180 aaaaccattc tgttctcatg ccttattaat 
ggattggttg ttgccaccat ttgcctatga 6240 gacctgtgct gggagtcagc aatctgttca catgttgcac ataccaggcc aaaggccact 6300 tcagtggtta cagcccccat tctgttgtat atgaggccca tgcaactggc aagtgcacca 6360 gcagaataac tgagtgctat ttctttggcc ccatggaatg taatctccct cttaagtttt 6420 ctatatagtt taactgctct gtccatgtta tttggatccc cattcccatt gagggcattt 6480 tggacaaagc gtctacgctg cagtcctcgc tcactgggca cggtgagcgt gaacacaaac 6540 cccaaaatcc ccttagtcag aggtgacagg attggtcttg tctttagcca ttccatgaga 6600 gcctcaagat ctgtgttttt cccagcaaag acatcttcaa gtctctgcgc gatttcggct 6660 ttgagggggc ctgatggaac gatagagaga acatacgttt cgacctcggt tagaaggctc 6720 atggtggcga attcgatatc tgatcacacg tgtcgacgac ggtgacggaa gcttggaggt 6780 gcacaccaat gtggtgaatg gtcaaatggc gtctagagta tcgagctagg cacttaaata 6840 caatatctct gcaatgcgga attcagtggt tcgtccaatc catgtcagac ccgtctgttg 6900 ccttcctaat aaggcacgat cgtaccacct tacttccacc aatcggcatg cacggtgctt 6960 tttctctcct tgtaaggcat gttgctaact catcgttacc atgttgcaag actacaagag 7020 tattgcataa gactacattt ccccctccct atgcaaaagc gaaactacta tatcctgagg 7080 ggactcctaa ccgcgtacaa ccgaagcccc gcttttcgcc taaacacacc ctagtcccct 7140 cagatacgcg tatatctggc ccgtacatcg cgaagcagcg caaaacgcct aaccctaagc 7200 agattcttca tgcaattgtc ggtcaagcct tgccttgttg tagcttaaat tttgctcgcg 7260 cactactcag cgacctccaa cacacaagca gggagcagat actggcttaa ctatgcggca 7320 tcagagcaga ttgtactgag agtgcaccat agtggctttc cccccccccc cattattgaa 7380 gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 7440 aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca 7500 ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc 7560 gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt 7620 gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg 7680 ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccata 7740 tgcggtgtga aataccgcac agatgcgtaa ggagaaaata ccgcatcaga ttggctat 7798 99 7798 DNA Artificial sequence VR4768, Ligation of RSVNP into VR4756 99 tggccattgc atacgttgta tccatatcat aatatgtaca 
tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560

ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catgagcctt ctaaccgagg tcgaaacgta 1680 tgttctctct atcgttccat caggccccct caaagccgaa atcgcgcaga gacttgaaga 1740 tgtctttgct gggaaaaaca cagatcttga ggctctcatg gaatggctaa agacaagacc 1800 aatcctgtca cctctgacta aggggatttt ggggtttgtg ttcacgctca ccgtgcccag 1860 tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc ctcaatggga atggggatcc 1920 aaataacatg gacagagcag ttaaactata tagaaaactt aagagggaga ttacattcca 1980 tggggccaaa gaaatagcac tcagttattc tgctggtgca cttgccagtt gcatgggcct 2040 catatacaac agaatggggg ctgtaaccac tgaagtggcc tttggcctgg tatgtgcaac 2100 atgtgaacag attgctgact cccagcacag gtctcatagg caaatggtgg caacaaccaa 2160 tccattaata aggcatgaga acagaatggt tttggccagc actacagcta aggctatgga 2220 gcaaatggct ggatcaagtg agcaggcagc ggaggccatg gaaattgcta gtcaggccag 2280 gcaaatggtg caggcaatga gagccattgg gactcatcct agctccagtg ctggtctaaa 2340 agatgatctt cttgaaaatt tgcagaccta tcagaaacga atgggggtgc agatgcaacg 2400 attcaagtga cccgcttgtt gttgctgcga gtatcattgg gatcttgcac ttgatattgt 2460 ggattcttga tcgtcttttt ttcaaatgca tctatcgact cttcaaacac ggtctgaaaa 2520 gagggccttc tacggaagga gtacctgagt ctatgaggga agaatatcga aaggaacagc 2580 agaatgctgt ggatgctgac gacagtcatt ttgtcagcat agagctggag taatcagtcg 2640 accacgtgtg atccagatct acttctggct aataaaagat cagagctcta gagatctgtg 2700 tgttggtttt ttgtgtggta ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 2760 gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 2820 tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 2880 aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 2940 aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 3000 ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 3060 tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 3120 agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 3180 gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 3240 tcgccactgg 
cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct 3300 acagagttct tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc 3360 tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 3420 caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 3480 aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 3540 aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 3600 ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac 3660 agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc 3720 atagttgcct gactcggggg gggggggcgc tgaggtctgc ctcgtgaaga aggtgttgct 3780 gactcatacc aggcctgaat cgccccatca tccagccaga aagtgaggga gccacggttg 3840 atgagagctt tgttgtaggt ggaccagttg gtgattttga acttttgctt tgccacggaa 3900 cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca actcagcaaa agttcgattt 3960 attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt tacaaccaat 4020 taaccaattc tgattagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat 4080 caggattatc aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac 4140 cgaggcagtt ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa 4200 catcaataca acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac 4260 catgagtgac gactgaatcc ggtgagaatg gcaaaagctt atgcatttct ttccagactt 4320 gttcaacagg ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat 4380 tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac 4440 aaacaggaat cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac 4500 ctgaatcagg atattcttct aatacctgga atgctgtttt cccggggatc gcagtggtga 4560 gtaaccatgc atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt 4620 ccgtcagcca gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc 4680 catgtttcag aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac 4740 ctgattgccc gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg 4800 aatttaatcg cggcctcgag caagacgttt cccgttgaat atggctcata acaccccttg 4860 tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg 4920 caatgtaaca tcagagattt 
tgagacacta tggtgcactc tcagtacaat ctgctctgat 4980 gccgcatagt taagccagta tctgctccct gcttgtgtgt tggaggtcgc tgagtagtgc 5040 gcgagcaaaa tttaagctac aacaaggcaa ggcttgaccg acaattgcat gaagaatctg 5100 cttagggtta ggcgttttgc gctgcttcgc gatgtacggg ccagatatac gcgtatctga 5160 ggggactagg gtgtgtttag gcgaaaagcg gggcttcggt tgtacgcggt taggagtccc 5220 ctcaggatat agtagtttcg cttttgcata gggaggggga aatgtagtct tatgcaatac 5280 tcttgtagtc ttgcaacatg gtaacgatga gttagcaaca tgccttacaa ggagagaaaa 5340 agcaccgtgc atgccgattg gtggaagtaa ggtggtacga tcgtgcctta ttaggaaggc 5400 aacagacggg tctgacatgg attggacgaa ccactgaatt ccgcattgca gagatattgt 5460 atttaagtgc ctagctcgat actctagacg ccatttgacc attcaccaca ttggtgtgca 5520 cctccaagct tccgtcaccg tcgtcgacac gtgtgatcag atatcgaatt cgccaccatg 5580 gccagccagg gcaccaagag aagctacgag cagatggaga ccgacggcga gagacagaac 5640 gccaccgaga tcagagccag cgtgggcaag atgatcgacg gcatcggcag attctacatc 5700 cagatgtgca ccgagctgaa gctgagcgac tacgagggca gactgatcca gaacagcctg 5760 accatcgaga gaatggtgct gagcgccttc gacgagagaa gaaacagata cctggaggag 5820 caccccagcg ccggcaagga ccccaagaag accggcggcc ccatctacag aagagtggac 5880 ggcaagtgga tgagagagct ggtgctgtac gacaaggagg agatcagaag aatctggaga 5940 caggccaaca acggcgagga cgccaccgcc ggcctgaccc acatgatgat ctggcacagc 6000 aacctgaacg acaccaccta ccagagaacc agagccctgg tgcggaccgg catggacccc 6060 agaatgtgca gcctgatgca gggcagcacc ctgcccagaa gaagcggcgc cgccggcgcc 6120 gccgtgaagg gcatcggcac catggtgatg gagctgatca gaatgatcaa gagaggcatc 6180 aacgacagaa acttctggag aggcgagaac ggcagaaaga ccagaagcgc ctacgagaga 6240 atgtgcaaca tcctgaaggg caagttccag accgccgccc agagagccat gatggaccag 6300 gtccgggaga gcagaaaccc cggcaacgcc gagatcgagg acctgatctt cctggccaga 6360 agcgccctga tcctgagagg cagcgtggcc cacaagagct gcctgcccgc ctgcgtgtac 6420 ggccccgccg tgagcagcgg ctacgacttc gagaaggagg gctacagcct ggtgggcatc 6480 gaccccttca agctgctgca gaacagccag gtgtacagcc tgatcagacc caacgagaac 6540 cccgcccaca agagccagct ggtgtggatg gcctgccaca gcgccgcctt cgaggacctg 6600 agactgctga gcttcatcag aggcaccaag 
gtgtccccca gaggcaagct gagcaccaga 6660 ggcgtgcaga tcgccagcaa cgagaacatg gacaacatgg gcagcagcac cctggagctg 6720 agaagcagat actgggccat cagaaccaga agcggcggca acaccaacca gcagagagcc 6780 agcgccggcc agatcagcgt gcagcccacc ttcagcgtgc agagaaacct gcccttcgag 6840 aagagcaccg tgatggccgc cttcaccggc aacaccgagg gcagaaccag cgacatgaga 6900 gccgagatca tcagaatgat ggagggcgcc aagcccgagg aggtgtcctt cagaggcaga 6960 ggcgtgttcg agctgagcga cgagaaggcc accaacccca tcgtgcctag cttcgacatg 7020 agcaacgagg gcagctactt cttcggcgac aacgccgagg agtacgacaa ctgatcagtc 7080 gaccacatcg cggccgctct agaccaggcg cctggatcca gatctgctgt gccttctagt 7140 tgccagccat ctgttgtttg cccctccccc gtgccttcct tgaccctgga aggtgccact 7200 cccactgtcc tttcctaata aaatgaggaa attgcatcgc attgtctgag taggtgtcat 7260 tctattctgg ggggtggggt ggggcaggac agcaaggggg aggattggga agacaatagc 7320 aggcatgctg gggatgcggt gggctctatg ggtggctttc cccccccccc cattattgaa 7380 gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 7440 aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca 7500 ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc 7560 gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt 7620 gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg 7680 ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccata 7740 tgcggtgtga aataccgcac agatgcgtaa ggagaaaata ccgcatcaga ttggctat 7798 100 7798 DNA Artificial sequence VR4769, Ligation of Inverted NP into VR4756 100 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga 
tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catgagcctt ctaaccgagg tcgaaacgta 1680 tgttctctct atcgttccat caggccccct caaagccgaa atcgcgcaga gacttgaaga 1740 tgtctttgct gggaaaaaca cagatcttga ggctctcatg gaatggctaa agacaagacc 1800 aatcctgtca cctctgacta aggggatttt ggggtttgtg ttcacgctca ccgtgcccag 1860 tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc ctcaatggga atggggatcc 1920 aaataacatg gacagagcag ttaaactata tagaaaactt aagagggaga ttacattcca 1980 tggggccaaa gaaatagcac tcagttattc tgctggtgca cttgccagtt gcatgggcct 2040 catatacaac agaatggggg ctgtaaccac tgaagtggcc tttggcctgg tatgtgcaac 2100 atgtgaacag attgctgact cccagcacag gtctcatagg caaatggtgg 
caacaaccaa 2160 tccattaata aggcatgaga acagaatggt tttggccagc actacagcta aggctatgga 2220 gcaaatggct ggatcaagtg agcaggcagc ggaggccatg gaaattgcta gtcaggccag 2280 gcaaatggtg caggcaatga gagccattgg gactcatcct agctccagtg ctggtctaaa 2340 agatgatctt cttgaaaatt tgcagaccta tcagaaacga atgggggtgc agatgcaacg 2400 attcaagtga cccgcttgtt gttgctgcga gtatcattgg gatcttgcac ttgatattgt 2460 ggattcttga tcgtcttttt ttcaaatgca tctatcgact cttcaaacac ggtctgaaaa 2520 gagggccttc tacggaagga gtacctgagt ctatgaggga agaatatcga aaggaacagc 2580 agaatgctgt ggatgctgac gacagtcatt ttgtcagcat agagctggag taatcagtcg 2640 accacgtgtg atccagatct acttctggct aataaaagat cagagctcta gagatctgtg 2700 tgttggtttt ttgtgtggta ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 2760 gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 2820 tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 2880 aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 2940 aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 3000 ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 3060 tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 3120 agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 3180 gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 3240 tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct 3300 acagagttct tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc 3360 tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 3420 caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 3480 aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 3540 aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 3600 ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac 3660 agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc 3720 atagttgcct gactcggggg gggggggcgc tgaggtctgc ctcgtgaaga aggtgttgct 3780 gactcatacc aggcctgaat cgccccatca tccagccaga aagtgaggga gccacggttg 
3840 atgagagctt tgttgtaggt ggaccagttg gtgattttga acttttgctt tgccacggaa 3900 cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca actcagcaaa agttcgattt 3960 attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt tacaaccaat 4020 taaccaattc tgattagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat 4080 caggattatc aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac 4140 cgaggcagtt ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa 4200 catcaataca acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac 4260 catgagtgac gactgaatcc ggtgagaatg gcaaaagctt atgcatttct ttccagactt 4320 gttcaacagg ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat 4380 tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac 4440 aaacaggaat cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac 4500 ctgaatcagg atattcttct aatacctgga atgctgtttt cccggggatc gcagtggtga 4560 gtaaccatgc atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt 4620 ccgtcagcca gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc 4680 catgtttcag aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac 4740 ctgattgccc gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg 4800 aatttaatcg cggcctcgag caagacgttt cccgttgaat atggctcata acaccccttg 4860 tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg 4920 caatgtaaca tcagagattt tgagacaccc atagagccca ccgcatcccc agcatgcctg 4980 ctattgtctt cccaatcctc ccccttgctg tcctgcccca ccccaccccc cagaatagaa 5040 tgacacctac tcagacaatg cgatgcaatt tcctcatttt attaggaaag gacagtggga 5100 gtggcacctt ccagggtcaa ggaaggcacg ggggaggggc aaacaacaga tggctggcaa 5160 ctagaaggca cagcagatct ggatccaggc gcctggtcta gagcggccgc gatgtggtcg 5220 actgatcagt tgtcgtactc ctcggcgttg tcgccgaaga agtagctgcc ctcgttgctc 5280 atgtcgaagc taggcacgat ggggttggtg gccttctcgt cgctcagctc gaacacgcct 5340 ctgcctctga aggacacctc ctcgggcttg gcgccctcca tcattctgat gatctcggct 5400 ctcatgtcgc tggttctgcc ctcggtgttg ccggtgaagg cggccatcac ggtgctcttc 5460 tcgaagggca ggtttctctg cacgctgaag gtgggctgca cgctgatctg gccggcgctg 5520 
gctctctgct ggttggtgtt gccgccgctt ctggttctga tggcccagta tctgcttctc 5580 agctccaggg tgctgctgcc catgttgtcc atgttctcgt tgctggcgat ctgcacgcct 5640 ctggtgctca gcttgcctct gggggacacc ttggtgcctc tgatgaagct cagcagtctc 5700 aggtcctcga aggcggcgct gtggcaggcc atccacacca gctggctctt gtgggcgggg 5760 ttctcgttgg gtctgatcag gctgtacacc tggctgttct gcagcagctt gaaggggtcg 5820 atgcccacca ggctgtagcc ctccttctcg aagtcgtagc cgctgctcac ggcggggccg 5880 tacacgcagg cgggcaggca gctcttgtgg gccacgctgc ctctcaggat cagggcgctt 5940 ctggccagga agatcaggtc ctcgatctcg gcgttgccgg ggtttctgct ctcccggacc 6000 tggtccatca tggctctctg ggcggcggtc tggaacttgc ccttcaggat gttgcacatt 6060 ctctcgtagg cgcttctggt ctttctgccg ttctcgcctc tccagaagtt tctgtcgttg 6120 atgcctctct tgatcattct gatcagctcc atcaccatgg tgccgatgcc cttcacggcg 6180 gcgccggcgg cgccgcttct tctgggcagg gtgctgccct gcatcaggct gcacattctg 6240 gggtccatgc cggtccgcac cagggctctg gttctctggt aggtggtgtc gttcaggttg 6300 ctgtgccaga tcatcatgtg ggtcaggccg gcggtggcgt cctcgccgtt gttggcctgt 6360 ctccagattc ttctgatctc ctccttgtcg tacagcacca gctctctcat ccacttgccg 6420 tccactcttc tgtagatggg gccgccggtc ttcttggggt ccttgccggc gctggggtgc 6480 tcctccaggt atctgtttct tctctcgtcg aaggcgctca gcaccattct ctcgatggtc 6540 aggctgttct ggatcagtct gccctcgtag tcgctcagct tcagctcggt gcacatctgg 6600 atgtagaatc tgccgatgcc gtcgatcatc ttgcccacgc tggctctgat ctcggtggcg 6660 ttctgtctct cgccgtcggt ctccatctgc tcgtagcttc tcttggtgcc ctggctggcc 6720 atggtggcga attcgatatc tgatcacacg tgtcgacgac ggtgacggaa gcttggaggt 6780 gcacaccaat gtggtgaatg gtcaaatggc gtctagagta tcgagctagg cacttaaata 6840 caatatctct gcaatgcgga attcagtggt tcgtccaatc catgtcagac ccgtctgttg 6900 ccttcctaat aaggcacgat cgtaccacct tacttccacc aatcggcatg cacggtgctt 6960 tttctctcct tgtaaggcat gttgctaact catcgttacc atgttgcaag actacaagag 7020 tattgcataa gactacattt ccccctccct atgcaaaagc gaaactacta tatcctgagg 7080 ggactcctaa ccgcgtacaa ccgaagcccc gcttttcgcc taaacacacc ctagtcccct 7140 cagatacgcg tatatctggc ccgtacatcg cgaagcagcg caaaacgcct aaccctaagc 7200 agattcttca 
tgcaattgtc ggtcaagcct tgccttgttg tagcttaaat tttgctcgcg 7260 cactactcag cgacctccaa cacacaagca gggagcagat actggcttaa ctatgcggca 7320 tcagagcaga ttgtactgag agtgcaccat agtggctttc cccccccccc cattattgaa 7380 gcatttatca gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata 7440 aacaaatagg ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca 7500 ttattatcat gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtctcgcgc 7560 gtttcggtga tgacggtgaa aacctctgac acatgcagct cccggagacg gtcacagctt 7620 gtctgtaagc ggatgccggg agcagacaag cccgtcaggg cgcgtcagcg ggtgttggcg 7680 ggtgtcgggg ctggcttaac tatgcggcat cagagcagat tgtactgaga gtgcaccata 7740 tgcggtgtga aataccgcac agatgcgtaa ggagaaaata ccgcatcaga ttggctat 7798 101 5161 DNA Artificial sequence VR4770, M2 Insert Replacing WNV Insert in VR6430 101 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta 300 agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg 360 ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt 420 gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta 480 gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc 540 aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc 600 cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg 660 acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag 720 ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg 780 catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg 840 cgttctgccg cctcccgcct

gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt 900 aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag 960 ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg tggagggcag 1020 tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac 1080 taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgga tatcgaattc 1140 gccaccatga gccttctaac cgaggtcgaa acgtatgttc tctctatcgt tccatcaggc 1200 cccctcaaag ccgaaatcgc gcagagactt gaagatgtct ttgctgggaa aaacacagat 1260 cttgaggctc tcatggaatg gctaaagaca agaccaatcc tgtcacctct gactaagggg 1320 attttggggt ttgtgttcac gctcaccgtg cccagtgagc gaggactgca gcgtagacgc 1380 tttgtccaaa atgccctcaa tgggaatggg gatccaaata acatggacag agcagttaaa 1440 ctatatagaa aacttaagag ggagattaca ttccatgggg ccaaagaaat agcactcagt 1500 tattctgctg gtgcacttgc cagttgcatg ggcctcatat acaacagaat gggggctgta 1560 accactgaag tggcctttgg cctggtatgt gcaacatgtg aacagattgc tgactcccag 1620 cacaggtctc ataggcaaat ggtggcaaca accaatccat taataaggca tgagaacaga 1680 atggttttgg ccagcactac agctaaggct atggagcaaa tggctggatc aagtgagcag 1740 gcagcggagg ccatggaaat tgctagtcag gccaggcaaa tggtgcaggc aatgagagcc 1800 attgggactc atcctagctc cagtgctggt ctaaaagatg atcttcttga aaatttgcag 1860 acctatcaga aacgaatggg ggtgcagatg caacgattca agtgacccgc ttgttgttgc 1920 tgcgagtatc attgggatct tgcacttgat attgtggatt cttgatcgtc tttttttcaa 1980 atgcatctat cgactcttca aacacggtct gaaaagaggg ccttctacgg aaggagtacc 2040 tgagtctatg agggaagaat atcgaaagga acagcagaat gctgtggatg ctgacgacag 2100 tcattttgtc agcatagagc tggagtaatc agtcgagatc cagatctgct gtgccttcta 2160 gttgccagcc atctgttgtt tgcccctccc ccgtgccttc cttgaccctg gaaggtgcca 2220 ctcccactgt cctttcctaa taaaatgagg aaattgcatc gcattgtctg agtaggtgtc 2280 attctattct ggggggtggg gtggggcagg acagcaaggg ggaggattgg gaagacaata 2340 gcaggcatgc tggggatgcg gtgggctcta tgggtaccca ggtgctgaag aattgacccg 2400 gttcctcctg ggccagaaag aagcaggcac atccccttct ctgtgacaca ccctgtccac 2460 gcccctggtt cttagttcca gccccactca taggacactc atagctcagg agggctccgc 2520 cttcaatccc acccgctaaa gtacttggag 
cggtctctcc ctccctcatc agcccaccaa 2580 accaaaccta gcctccaaga gtgggaagaa attaaagcaa gataggctat taagtgcaga 2640 gggagagaaa atgcctccaa catgtgagga agtaatgaga gaaatcatag aattttaagg 2700 ccatgattta aggccatcat ggccttaatc ttccgcttcc tcgctcactg actcgctgcg 2760 ctcggtcgtt cggctgcggc gagcggtatc agctcactca aaggcggtaa tacggttatc 2820 cacagaatca ggggataacg caggaaagaa catgtgagca aaaggccagc aaaaggccag 2880 gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg ctccgccccc ctgacgagca 2940 tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg acaggactat aaagatacca 3000 ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt ccgaccctgc cgcttaccgg 3060 atacctgtcc gcctttctcc cttcgggaag cgtggcgctt tctcatagct cacgctgtag 3120 gtatctcagt tcggtgtagg tcgttcgctc caagctgggc tgtgtgcacg aaccccccgt 3180 tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt gagtccaacc cggtaagaca 3240 cgacttatcg ccactggcag cagccactgg taacaggatt agcagagcga ggtatgtagg 3300 cggtgctaca gagttcttga agtggtggcc taactacggc tacactagaa gaacagtatt 3360 tggtatctgc gctctgctga agccagttac cttcggaaaa agagttggta gctcttgatc 3420 cggcaaacaa accaccgctg gtagcggtgg tttttttgtt tgcaagcagc agattacgcg 3480 cagaaaaaaa ggatctcaag aagatccttt gatcttttct acggggtctg acgctcagtg 3540 gaacgaaaac tcacgttaag ggattttggt catgagatta tcaaaaagga tcttcaccta 3600 gatcctttta aattaaaaat gaagttttaa atcaatctaa agtatatatg agtaaacttg 3660 gtctgacagt taccaatgct taatcagtga ggcacctatc tcagcgatct gtctatttcg 3720 ttcatccata gttgcctgac tcgggggggg ggggcgctga ggtctgcctc gtgaagaagg 3780 tgttgctgac tcataccagg cctgaatcgc cccatcatcc agccagaaag tgagggagcc 3840 acggttgatg agagctttgt tgtaggtgga ccagttggtg attttgaact tttgctttgc 3900 cacggaacgg tctgcgttgt cgggaagatg cgtgatctga tccttcaact cagcaaaagt 3960 tcgatttatt caacaaagcc gccgtcccgt caagtcagcg taatgctctg ccagtgttac 4020 aaccaattaa ccaattctga ttagaaaaac tcatcgagca tcaaatgaaa ctgcaattta 4080 ttcatatcag gattatcaat accatatttt tgaaaaagcc gtttctgtaa tgaaggagaa 4140 aactcaccga ggcagttcca taggatggca agatcctggt atcggtctgc gattccgact 4200 cgtccaacat caatacaacc tattaatttc ccctcgtcaa 
aaataaggtt atcaagtgag 4260 aaatcaccat gagtgacgac tgaatccggt gagaatggca aaagcttatg catttctttc 4320 cagacttgtt caacaggcca gccattacgc tcgtcatcaa aatcactcgc atcaaccaaa 4380 ccgttattca ttcgtgattg cgcctgagcg agacgaaata cgcgatcgct gttaaaagga 4440 caattacaaa caggaatcga atgcaaccgg cgcaggaaca ctgccagcgc atcaacaata 4500 ttttcacctg aatcaggata ttcttctaat acctggaatg ctgttttccc ggggatcgca 4560 gtggtgagta accatgcatc atcaggagta cggataaaat gcttgatggt cggaagaggc 4620 ataaattccg tcagccagtt tagtctgacc atctcatctg taacatcatt ggcaacgcta 4680 cctttgccat gtttcagaaa caactctggc gcatcgggct tcccatacaa tcgatagatt 4740 gtcgcacctg attgcccgac attatcgcga gcccatttat acccatataa atcagcatcc 4800 atgttggaat ttaatcgcgg cctcgagcaa gacgtttccc gttgaatatg gctcataaca 4860 ccccttgtat tactgtttat gtaagcagac agttttattg ttcatgatga tatattttta 4920 tcttgtgcaa tgtaacatca gagattttga gacacaacgt ggctttcccc ccccccccat 4980 tattgaagca tttatcaggg ttattgtctc atgagcggat acatatttga atgtatttag 5040 aaaaataaac aaataggggt tccgcgcaca tttccccgaa aagtgccacc tgacgtctaa 5100 gaaaccatta ttatcatgac attaacctat aaaaataggc gtatcacgag gccctttcgt 5160 c 5161 102 5684 DNA Artificial sequence VR4771, NP Insert Replacing WNV Insert in VR6430 102 tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta 300 agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg 360 ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt 420 gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta 480 gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc 540 aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc 600 cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg 660 acatggattg gacgaaccac tgaattccgc attgcagaga 
tattgtattt aagtgcctag 720 ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg 780 catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg 840 cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt 900 aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag 960 ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg tggagggcag 1020 tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac 1080 taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgga tatcgaattc 1140 gccaccatgg ccagccaggg caccaagaga agctacgagc agatggagac cgacggcgag 1200 agacagaacg ccaccgagat cagagccagc gtgggcaaga tgatcgacgg catcggcaga 1260 ttctacatcc agatgtgcac cgagctgaag ctgagcgact acgagggcag actgatccag 1320 aacagcctga ccatcgagag aatggtgctg agcgccttcg acgagagaag aaacagatac 1380 ctggaggagc accccagcgc cggcaaggac cccaagaaga ccggcggccc catctacaga 1440 agagtggacg gcaagtggat gagagagctg gtgctgtacg acaaggagga gatcagaaga 1500 atctggagac aggccaacaa cggcgaggac gccaccgccg gcctgaccca catgatgatc 1560 tggcacagca acctgaacga caccacctac cagagaacca gagccctggt gcggaccggc 1620 atggacccca gaatgtgcag cctgatgcag ggcagcaccc tgcccagaag aagcggcgcc 1680 gccggcgccg ccgtgaaggg catcggcacc atggtgatgg agctgatcag aatgatcaag 1740 agaggcatca acgacagaaa cttctggaga ggcgagaacg gcagaaagac cagaagcgcc 1800 tacgagagaa tgtgcaacat cctgaagggc aagttccaga ccgccgccca gagagccatg 1860 atggaccagg tccgggagag cagaaacccc ggcaacgccg agatcgagga cctgatcttc 1920 ctggccagaa gcgccctgat cctgagaggc agcgtggccc acaagagctg cctgcccgcc 1980 tgcgtgtacg gccccgccgt gagcagcggc tacgacttcg agaaggaggg ctacagcctg 2040 gtgggcatcg accccttcaa gctgctgcag aacagccagg tgtacagcct gatcagaccc 2100 aacgagaacc ccgcccacaa gagccagctg gtgtggatgg cctgccacag cgccgccttc 2160 gaggacctga gactgctgag cttcatcaga ggcaccaagg tgtcccccag aggcaagctg 2220 agcaccagag gcgtgcagat cgccagcaac gagaacatgg acaacatggg cagcagcacc 2280 ctggagctga gaagcagata ctgggccatc agaaccagaa gcggcggcaa caccaaccag 2340 cagagagcca gcgccggcca gatcagcgtg cagcccacct tcagcgtgca 
gagaaacctg 2400 cccttcgaga agagcaccgt gatggccgcc ttcaccggca acaccgaggg cagaaccagc 2460 gacatgagag ccgagatcat cagaatgatg gagggcgcca agcccgagga ggtgtccttc 2520 agaggcagag gcgtgttcga gctgagcgac gagaaggcca ccaaccccat cgtgcctagc 2580 ttcgacatga gcaacgaggg cagctacttc ttcggcgaca acgccgagga gtacgacaac 2640 tgatcagtcg accacgtgtg atccagatct gctgtgcctt ctagttgcca gccatctgtt 2700 gtttgcccct cccccgtgcc ttccttgacc ctggaaggtg ccactcccac tgtcctttcc 2760 taataaaatg aggaaattgc atcgcattgt ctgagtaggt gtcattctat tctggggggt 2820 ggggtggggc aggacagcaa gggggaggat tgggaagaca atagcaggca tgctggggat 2880 gcggtgggct ctatgggtac ccaggtgctg aagaattgac ccggttcctc ctgggccaga 2940 aagaagcagg cacatcccct tctctgtgac acaccctgtc cacgcccctg gttcttagtt 3000 ccagccccac tcataggaca ctcatagctc aggagggctc cgccttcaat cccacccgct 3060 aaagtacttg gagcggtctc tccctccctc atcagcccac caaaccaaac ctagcctcca 3120 agagtgggaa gaaattaaag caagataggc tattaagtgc agagggagag aaaatgcctc 3180 caacatgtga ggaagtaatg agagaaatca tagaatttta aggccatgat ttaaggccat 3240 catggcctta atcttccgct tcctcgctca ctgactcgct gcgctcggtc gttcggctgc 3300 ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa tcaggggata 3360 acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt aaaaaggccg 3420 cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa aatcgacgct 3480 caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt ccccctggaa 3540 gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg tccgcctttc 3600 tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc agttcggtgt 3660 aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc gaccgctgcg 3720 ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta tcgccactgg 3780 cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct acagagttct 3840 tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc tgcgctctgc 3900 tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa caaaccaccg 3960 ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa aaaggatctc 4020 aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa aactcacgtt 
4080 aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt ttaaattaaa 4140 aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac agttaccaat 4200 gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc atagttgcct 4260 gactcggggg gggggggcgc tgaggtctgc ctcgtgaaga aggtgttgct gactcatacc 4320 aggcctgaat cgccccatca tccagccaga aagtgaggga gccacggttg atgagagctt 4380 tgttgtaggt ggaccagttg gtgattttga acttttgctt tgccacggaa cggtctgcgt 4440 tgtcgggaag atgcgtgatc tgatccttca actcagcaaa agttcgattt attcaacaaa 4500 gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt tacaaccaat taaccaattc 4560 tgattagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat caggattatc 4620 aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac cgaggcagtt 4680 ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa catcaataca 4740 acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac catgagtgac 4800 gactgaatcc ggtgagaatg gcaaaagctt atgcatttct ttccagactt gttcaacagg 4860 ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat tcattcgtga 4920 ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac aaacaggaat 4980 cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac ctgaatcagg 5040 atattcttct aatacctgga atgctgtttt cccggggatc gcagtggtga gtaaccatgc 5100 atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt ccgtcagcca 5160 gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc catgtttcag 5220 aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac ctgattgccc 5280 gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg aatttaatcg 5340 cggcctcgag caagacgttt cccgttgaat atggctcata acaccccttg tattactgtt 5400 tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg caatgtaaca 5460 tcagagattt tgagacacaa cgtggctttc cccccccccc cattattgaa gcatttatca 5520 gggttattgt ctcatgagcg gatacatatt tgaatgtatt tagaaaaata aacaaatagg 5580 ggttccgcgc acatttcccc gaaaagtgcc acctgacgtc taagaaacca ttattatcat 5640 gacattaacc tataaaaata ggcgtatcac gaggcccttt cgtc 5684 103 4473 DNA Artificial sequence VR4772, M2 Insert Replacing WNV Insert from VR6430 103 
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta 300 agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg 360 ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt 420 gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta 480 gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc 540 aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc 600 cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg 660 acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag 720 ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg 780 catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg 840 cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt 900 aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag 960 ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg tggagggcag 1020 tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac 1080 taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgga tatcgaattc 1140 gccaccatga gcctgctgac cgaggtggag acccccatca gaaacgagtg gggctgcaga 1200 tgcaacgaca gcagcgaccc cctggtggtg gccgccagca tcatcggcat cctgcacctg 1260 atcctgtgga tcctggacag actgttcttc aagtgcatct acagactgtt caagcacggc 1320 ctgaagagag gccccagcac cgagggcgtg cccgagagca tgagagagga gtacagaaag 1380 gagcagcaga acgccgtgga cgccgacgac agccacttcg tgagcatcga gctggagtga 1440 tcagtcgaga tccagatctg ctgtgccttc tagttgccag ccatctgttg tttgcccctc 1500 ccccgtgcct tccttgaccc tggaaggtgc cactcccact gtcctttcct aataaaatga 1560 ggaaattgca tcgcattgtc tgagtaggtg tcattctatt ctggggggtg gggtggggca 1620 ggacagcaag ggggaggatt gggaagacaa tagcaggcat gctggggatg cggtgggctc 1680 tatgggtacc caggtgctga 
agaattgacc cggttcctcc tgggccagaa agaagcaggc 1740 acatcccctt ctctgtgaca caccctgtcc acgcccctgg ttcttagttc cagccccact 1800 cataggacac tcatagctca ggagggctcc gccttcaatc ccacccgcta aagtacttgg 1860 agcggtctct ccctccctca tcagcccacc aaaccaaacc tagcctccaa gagtgggaag 1920 aaattaaagc aagataggct attaagtgca gagggagaga aaatgcctcc aacatgtgag 1980 gaagtaatga gagaaatcat agaattttaa ggccatgatt taaggccatc atggccttaa 2040 tcttccgctt cctcgctcac tgactcgctg cgctcggtcg ttcggctgcg gcgagcggta 2100 tcagctcact caaaggcggt aatacggtta tccacagaat caggggataa cgcaggaaag 2160 aacatgtgag caaaaggcca gcaaaaggcc aggaaccgta aaaaggccgc gttgctggcg 2220 tttttccata ggctccgccc ccctgacgag catcacaaaa atcgacgctc aagtcagagg 2280 tggcgaaacc cgacaggact ataaagatac caggcgtttc cccctggaag ctccctcgtg 2340 cgctctcctg ttccgaccct gccgcttacc ggatacctgt ccgcctttct cccttcggga 2400 agcgtggcgc tttctcatag ctcacgctgt aggtatctca gttcggtgta ggtcgttcgc 2460 tccaagctgg gctgtgtgca cgaacccccc gttcagcccg accgctgcgc cttatccggt 2520 aactatcgtc ttgagtccaa cccggtaaga cacgacttat cgccactggc agcagccact 2580 ggtaacagga ttagcagagc gaggtatgta ggcggtgcta cagagttctt gaagtggtgg 2640 cctaactacg gctacactag aagaacagta tttggtatct gcgctctgct gaagccagtt 2700 accttcggaa aaagagttgg tagctcttga tccggcaaac aaaccaccgc tggtagcggt 2760 ggtttttttg tttgcaagca gcagattacg cgcagaaaaa aaggatctca agaagatcct 2820 ttgatctttt ctacggggtc tgacgctcag tggaacgaaa actcacgtta agggattttg 2880 gtcatgagat tatcaaaaag gatcttcacc tagatccttt taaattaaaa atgaagtttt 2940 aaatcaatct aaagtatata tgagtaaact tggtctgaca gttaccaatg cttaatcagt 3000 gaggcaccta tctcagcgat ctgtctattt cgttcatcca tagttgcctg actcgggggg 3060 ggggggcgct gaggtctgcc tcgtgaagaa ggtgttgctg actcatacca ggcctgaatc 3120 gccccatcat ccagccagaa agtgagggag ccacggttga tgagagcttt gttgtaggtg 3180 gaccagttgg tgattttgaa cttttgcttt gccacggaac ggtctgcgtt gtcgggaaga 3240 tgcgtgatct gatccttcaa ctcagcaaaa gttcgattta ttcaacaaag ccgccgtccc 3300 gtcaagtcag cgtaatgctc tgccagtgtt acaaccaatt aaccaattct gattagaaaa 3360 actcatcgag catcaaatga aactgcaatt 
tattcatatc aggattatca ataccatatt 3420 tttgaaaaag ccgtttctgt aatgaaggag aaaactcacc gaggcagttc cataggatgg 3480 caagatcctg gtatcggtct gcgattccga ctcgtccaac atcaatacaa cctattaatt 3540 tcccctcgtc aaaaataagg ttatcaagtg agaaatcacc atgagtgacg actgaatccg 3600 gtgagaatgg caaaagctta tgcatttctt tccagacttg ttcaacaggc cagccattac 3660 gctcgtcatc aaaatcactc gcatcaacca aaccgttatt cattcgtgat tgcgcctgag 3720 cgagacgaaa tacgcgatcg ctgttaaaag gacaattaca aacaggaatc gaatgcaacc 3780 ggcgcaggaa cactgccagc gcatcaacaa tattttcacc tgaatcagga tattcttcta 3840 atacctggaa tgctgttttc ccggggatcg cagtggtgag taaccatgca tcatcaggag 3900 tacggataaa atgcttgatg gtcggaagag gcataaattc cgtcagccag tttagtctga 3960 ccatctcatc tgtaacatca ttggcaacgc tacctttgcc atgtttcaga aacaactctg 4020 gcgcatcggg cttcccatac aatcgataga ttgtcgcacc tgattgcccg acattatcgc 4080 gagcccattt atacccatat aaatcagcat ccatgttgga atttaatcgc ggcctcgagc 4140 aagacgtttc ccgttgaata tggctcataa caccccttgt attactgttt atgtaagcag 4200 acagttttat tgttcatgat gatatatttt tatcttgtgc aatgtaacat cagagatttt 4260 gagacacaac gtggctttcc cccccccccc attattgaag catttatcag ggttattgtc 4320 tcatgagcgg atacatattt gaatgtattt agaaaaataa acaaataggg gttccgcgca 4380 catttccccg aaaagtgcca cctgacgtct aagaaaccat tattatcatg acattaacct 4440 ataaaaatag gcgtatcacg aggccctttc gtc 4473 104 8450 DNA Artificial sequence VR4773, Ligation of RSV RNP into VR4756 104 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg
agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catgagcctt ctaaccgagg tcgaaacgta 1680 tgttctctct atcgttccat caggccccct caaagccgaa atcgcgcaga gacttgaaga 1740 tgtctttgct gggaaaaaca cagatcttga ggctctcatg gaatggctaa agacaagacc 1800 aatcctgtca cctctgacta aggggatttt ggggtttgtg ttcacgctca ccgtgcccag 1860 tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc ctcaatggga atggggatcc 1920 aaataacatg gacagagcag ttaaactata tagaaaactt aagagggaga ttacattcca 
1980 tggggccaaa gaaatagcac tcagttattc tgctggtgca cttgccagtt gcatgggcct 2040 catatacaac agaatggggg ctgtaaccac tgaagtggcc tttggcctgg tatgtgcaac 2100 atgtgaacag attgctgact cccagcacag gtctcatagg caaatggtgg caacaaccaa 2160 tccattaata aggcatgaga acagaatggt tttggccagc actacagcta aggctatgga 2220 gcaaatggct ggatcaagtg agcaggcagc ggaggccatg gaaattgcta gtcaggccag 2280 gcaaatggtg caggcaatga gagccattgg gactcatcct agctccagtg ctggtctaaa 2340 agatgatctt cttgaaaatt tgcagaccta tcagaaacga atgggggtgc agatgcaacg 2400 attcaagtga cccgcttgtt gttgctgcga gtatcattgg gatcttgcac ttgatattgt 2460 ggattcttga tcgtcttttt ttcaaatgca tctatcgact cttcaaacac ggtctgaaaa 2520 gagggccttc tacggaagga gtacctgagt ctatgaggga agaatatcga aaggaacagc 2580 agaatgctgt ggatgctgac gacagtcatt ttgtcagcat agagctggag taatcagtcg 2640 accacgtgtg atccagatct acttctggct aataaaagat cagagctcta gagatctgtg 2700 tgttggtttt ttgtgtggta ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 2760 gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 2820 tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 2880 aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 2940 aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 3000 ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 3060 tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 3120 agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 3180 gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 3240 tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct 3300 acagagttct tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc 3360 tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 3420 caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 3480 aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 3540 aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 3600 ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac 3660 
agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc 3720 atagttgcct gactcggggg gggggggcgc tgaggtctgc ctcgtgaaga aggtgttgct 3780 gactcatacc aggcctgaat cgccccatca tccagccaga aagtgaggga gccacggttg 3840 atgagagctt tgttgtaggt ggaccagttg gtgattttga acttttgctt tgccacggaa 3900 cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca actcagcaaa agttcgattt 3960 attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt tacaaccaat 4020 taaccaattc tgattagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat 4080 caggattatc aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac 4140 cgaggcagtt ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa 4200 catcaataca acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac 4260 catgagtgac gactgaatcc ggtgagaatg gcaaaagctt atgcatttct ttccagactt 4320 gttcaacagg ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat 4380 tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac 4440 aaacaggaat cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac 4500 ctgaatcagg atattcttct aatacctgga atgctgtttt cccggggatc gcagtggtga 4560 gtaaccatgc atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt 4620 ccgtcagcca gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc 4680 catgtttcag aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac 4740 ctgattgccc gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg 4800 aatttaatcg cggcctcgag caagacgttt cccgttgaat atggctcata acaccccttg 4860 tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg 4920 caatgtaaca tcagagattt tgagacacta tgcggtgtga aataccgcac agatgcgtaa 4980 ggagaaaata ccgcatcaga ttggctattg gctgctccct gcttgtgtgt tggaggtcgc 5040 tgagtagtgc gcgagcaaaa tttaagctac aacaaggcaa ggcttgaccg acaattgcat 5100 gaagaatctg cttagggtta ggcgttttgc gctgcttcgc gatgtacggg ccagatatac 5160 gcgtatctga ggggactagg gtgtgtttag gcgaaaagcg gggcttcggt tgtacgcggt 5220 taggagtccc ctcaggatat agtagtttcg cttttgcata gggaggggga aatgtagtct 5280 tatgcaatac tcttgtagtc ttgcaacatg gtaacgatga gttagcaaca tgccttacaa 5340 ggagagaaaa 
agcaccgtgc atgccgattg gtggaagtaa ggtggtacga tcgtgcctta 5400 ttaggaaggc aacagacggg tctgacatgg attggacgaa ccactgaatt ccgcattgca 5460 gagatattgt atttaagtgc ctagctcgat acaataaacg ccatttgacc attcaccaca 5520 ttggtgtgca cctccatcgg ctcgcatctc tccttcacgc gcccgccgcc ctacctgagg 5580 ccgccatcca cgccggttga gtcgcgttct gccgcctccc gcctgtggtg cctcctgaac 5640 tgcgtccgcc gtctaggtaa gtttaaagct caggtcgaga ccgggccttt gtccggcgct 5700 cccttggagc ctacctagac tcagccggct ctccacgctt tgcctgaccc tgcttgctca 5760 actctagtta acggtggagg gcagtgtagt ctgagcagta ctcgttgctg ccgcgcgcgc 5820 caccagacat aatagctgac agactaacag actgttcctt tccatgggtc ttttctgcag 5880 tcaccgtcgt cggatatcga attcgccacc atggccagcc agggcaccaa gagaagctac 5940 gagcagatgg agaccgacgg cgagagacag aacgccaccg agatcagagc cagcgtgggc 6000 aagatgatcg acggcatcgg cagattctac atccagatgt gcaccgagct gaagctgagc 6060 gactacgagg gcagactgat ccagaacagc ctgaccatcg agagaatggt gctgagcgcc 6120 ttcgacgaga gaagaaacag atacctggag gagcacccca gcgccggcaa ggaccccaag 6180 aagaccggcg gccccatcta cagaagagtg gacggcaagt ggatgagaga gctggtgctg 6240 tacgacaagg aggagatcag aagaatctgg agacaggcca acaacggcga ggacgccacc 6300 gccggcctga cccacatgat gatctggcac agcaacctga acgacaccac ctaccagaga 6360 accagagccc tggtgcggac cggcatggac cccagaatgt gcagcctgat gcagggcagc 6420 accctgccca gaagaagcgg cgccgccggc gccgccgtga agggcatcgg caccatggtg 6480 atggagctga tcagaatgat caagagaggc atcaacgaca gaaacttctg gagaggcgag 6540 aacggcagaa agaccagaag cgcctacgag agaatgtgca acatcctgaa gggcaagttc 6600 cagaccgccg cccagagagc catgatggac caggtccggg agagcagaaa ccccggcaac 6660 gccgagatcg aggacctgat cttcctggcc agaagcgccc tgatcctgag aggcagcgtg 6720 gcccacaaga gctgcctgcc cgcctgcgtg tacggccccg ccgtgagcag cggctacgac 6780 ttcgagaagg agggctacag cctggtgggc atcgacccct tcaagctgct gcagaacagc 6840 caggtgtaca gcctgatcag acccaacgag aaccccgccc acaagagcca gctggtgtgg 6900 atggcctgcc acagcgccgc cttcgaggac ctgagactgc tgagcttcat cagaggcacc 6960 aaggtgtccc ccagaggcaa gctgagcacc agaggcgtgc agatcgccag caacgagaac 7020 atggacaaca tgggcagcag 
caccctggag ctgagaagca gatactgggc catcagaacc 7080 agaagcggcg gcaacaccaa ccagcagaga gccagcgccg gccagatcag cgtgcagccc 7140 accttcagcg tgcagagaaa cctgcccttc gagaagagca ccgtgatggc cgccttcacc 7200 ggcaacaccg agggcagaac cagcgacatg agagccgaga tcatcagaat gatggagggc 7260 gccaagcccg aggaggtgtc cttcagaggc agaggcgtgt tcgagctgag cgacgagaag 7320 gccaccaacc ccatcgtgcc tagcttcgac atgagcaacg agggcagcta cttcttcggc 7380 gacaacgccg aggagtacga caactgatca gtcgaccacg tgtgatccag atctgctgtg 7440 ccttctagtt gccagccatc tgttgtttgc ccctcccccg tgccttcctt gaccctggaa 7500 ggtgccactc ccactgtcct ttcctaataa aatgaggaaa ttgcatcgca ttgtctgagt 7560 aggtgtcatt ctattctggg gggtggggtg gggcaggaca gcaaggggga ggattgggaa 7620 gacaatagca ggcatgctgg ggatgcggtg ggctctatgg gtacccaggt gctgaagaat 7680 tgacccggtt cctcctgggc cagaaagaag caggcacatc cccttctctg tgacacaccc 7740 tgtccacgcc cctggttctt agttccagcc ccactcatag gacactcata gctcaggagg 7800 gctccgcctt caatcccacc cgctaaagta cttggagcgg tctctccctc cctcatcagc 7860 ccaccaaacc aaacctagcc tccaagagtg ggaagaaatt aaagcaagat aggctattaa 7920 gtgcagaggg agagaaaatg cctccaacat gtgaggaagt aatgagagaa atcatagaat 7980 tttaaggcca tgatttaagg ccagtggctt tccccccccc cccattattg aagcatttat 8040 cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 8100 ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tctaagaaac cattattatc 8160 atgacattaa cctataaaaa taggcgtatc acgaggccct ttcgtctcgc gcgtttcggt 8220 gatgacggtg aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa 8280 gcggatgccg ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg 8340 ggctggctta actatgcggc atcagagcag attgtactga gagtgcacca tatgcggtgt 8400 gaaataccgc acagatgcgt aaggagaaaa taccgcatca gattggctat 8450 105 8450 DNA Artificial sequence VR4774, Ligation of Inverted RSV RNP into VR4756 105 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca 
ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catgagcctt ctaaccgagg tcgaaacgta 1680 tgttctctct atcgttccat caggccccct caaagccgaa atcgcgcaga gacttgaaga 1740 tgtctttgct gggaaaaaca cagatcttga ggctctcatg gaatggctaa agacaagacc 1800 aatcctgtca cctctgacta aggggatttt ggggtttgtg ttcacgctca ccgtgcccag 1860 tgagcgagga ctgcagcgta gacgctttgt ccaaaatgcc ctcaatggga 
atggggatcc 1920 aaataacatg gacagagcag ttaaactata tagaaaactt aagagggaga ttacattcca 1980 tggggccaaa gaaatagcac tcagttattc tgctggtgca cttgccagtt gcatgggcct 2040 catatacaac agaatggggg ctgtaaccac tgaagtggcc tttggcctgg tatgtgcaac 2100 atgtgaacag attgctgact cccagcacag gtctcatagg caaatggtgg caacaaccaa 2160 tccattaata aggcatgaga acagaatggt tttggccagc actacagcta aggctatgga 2220 gcaaatggct ggatcaagtg agcaggcagc ggaggccatg gaaattgcta gtcaggccag 2280 gcaaatggtg caggcaatga gagccattgg gactcatcct agctccagtg ctggtctaaa 2340 agatgatctt cttgaaaatt tgcagaccta tcagaaacga atgggggtgc agatgcaacg 2400 attcaagtga cccgcttgtt gttgctgcga gtatcattgg gatcttgcac ttgatattgt 2460 ggattcttga tcgtcttttt ttcaaatgca tctatcgact cttcaaacac ggtctgaaaa 2520 gagggccttc tacggaagga gtacctgagt ctatgaggga agaatatcga aaggaacagc 2580 agaatgctgt ggatgctgac gacagtcatt ttgtcagcat agagctggag taatcagtcg 2640 accacgtgtg atccagatct acttctggct aataaaagat cagagctcta gagatctgtg 2700 tgttggtttt ttgtgtggta ctcttccgct tcctcgctca ctgactcgct gcgctcggtc 2760 gttcggctgc ggcgagcggt atcagctcac tcaaaggcgg taatacggtt atccacagaa 2820 tcaggggata acgcaggaaa gaacatgtga gcaaaaggcc agcaaaaggc caggaaccgt 2880 aaaaaggccg cgttgctggc gtttttccat aggctccgcc cccctgacga gcatcacaaa 2940 aatcgacgct caagtcagag gtggcgaaac ccgacaggac tataaagata ccaggcgttt 3000 ccccctggaa gctccctcgt gcgctctcct gttccgaccc tgccgcttac cggatacctg 3060 tccgcctttc tcccttcggg aagcgtggcg ctttctcata gctcacgctg taggtatctc 3120 agttcggtgt aggtcgttcg ctccaagctg ggctgtgtgc acgaaccccc cgttcagccc 3180 gaccgctgcg ccttatccgg taactatcgt cttgagtcca acccggtaag acacgactta 3240 tcgccactgg cagcagccac tggtaacagg attagcagag cgaggtatgt aggcggtgct 3300 acagagttct tgaagtggtg gcctaactac ggctacacta gaagaacagt atttggtatc 3360 tgcgctctgc tgaagccagt taccttcgga aaaagagttg gtagctcttg atccggcaaa 3420 caaaccaccg ctggtagcgg tggttttttt gtttgcaagc agcagattac gcgcagaaaa 3480 aaaggatctc aagaagatcc tttgatcttt tctacggggt ctgacgctca gtggaacgaa 3540 aactcacgtt aagggatttt ggtcatgaga ttatcaaaaa ggatcttcac ctagatcctt 
3600 ttaaattaaa aatgaagttt taaatcaatc taaagtatat atgagtaaac ttggtctgac 3660 agttaccaat gcttaatcag tgaggcacct atctcagcga tctgtctatt tcgttcatcc 3720 atagttgcct gactcggggg gggggggcgc tgaggtctgc ctcgtgaaga aggtgttgct 3780 gactcatacc aggcctgaat cgccccatca tccagccaga aagtgaggga gccacggttg 3840 atgagagctt tgttgtaggt ggaccagttg gtgattttga acttttgctt tgccacggaa 3900 cggtctgcgt tgtcgggaag atgcgtgatc tgatccttca actcagcaaa agttcgattt 3960 attcaacaaa gccgccgtcc cgtcaagtca gcgtaatgct ctgccagtgt tacaaccaat 4020 taaccaattc tgattagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat 4080 caggattatc aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac 4140 cgaggcagtt ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa 4200 catcaataca acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac 4260 catgagtgac gactgaatcc ggtgagaatg gcaaaagctt atgcatttct ttccagactt 4320 gttcaacagg ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat 4380 tcattcgtga ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac 4440 aaacaggaat cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac 4500 ctgaatcagg atattcttct aatacctgga atgctgtttt cccggggatc gcagtggtga 4560 gtaaccatgc atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt 4620 ccgtcagcca gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc 4680 catgtttcag aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac 4740 ctgattgccc gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg 4800 aatttaatcg cggcctcgag caagacgttt cccgttgaat atggctcata acaccccttg 4860 tattactgtt tatgtaagca gacagtttta ttgttcatga tgatatattt ttatcttgtg 4920 caatgtaaca tcagagattt tgagacactg gccttaaatc atggccttaa aattctatga 4980 tttctctcat tacttcctca catgttggag gcattttctc tccctctgca cttaatagcc 5040 tatcttgctt taatttcttc ccactcttgg aggctaggtt tggtttggtg ggctgatgag 5100 ggagggagag accgctccaa gtactttagc gggtgggatt gaaggcggag ccctcctgag 5160 ctatgagtgt cctatgagtg gggctggaac taagaaccag gggcgtggac agggtgtgtc 5220 acagagaagg ggatgtgcct gcttctttct ggcccaggag gaaccgggtc aattcttcag 5280 
cacctgggta cccatagagc ccaccgcatc cccagcatgc ctgctattgt cttcccaatc 5340 ctcccccttg ctgtcctgcc ccaccccacc ccccagaata gaatgacacc tactcagaca 5400 atgcgatgca atttcctcat tttattagga aaggacagtg ggagtggcac cttccagggt 5460 caaggaaggc acgggggagg ggcaaacaac agatggctgg caactagaag gcacagcaga 5520 tctggatcac acgtggtcga ctgatcagtt gtcgtactcc tcggcgttgt cgccgaagaa 5580 gtagctgccc tcgttgctca tgtcgaagct aggcacgatg gggttggtgg ccttctcgtc 5640 gctcagctcg aacacgcctc tgcctctgaa ggacacctcc tcgggcttgg cgccctccat 5700 cattctgatg atctcggctc tcatgtcgct ggttctgccc tcggtgttgc cggtgaaggc 5760 ggccatcacg gtgctcttct cgaagggcag gtttctctgc acgctgaagg tgggctgcac 5820 gctgatctgg ccggcgctgg ctctctgctg gttggtgttg ccgccgcttc tggttctgat 5880 ggcccagtat ctgcttctca gctccagggt gctgctgccc atgttgtcca tgttctcgtt 5940 gctggcgatc tgcacgcctc tggtgctcag cttgcctctg ggggacacct tggtgcctct 6000 gatgaagctc agcagtctca ggtcctcgaa ggcggcgctg tggcaggcca tccacaccag 6060 ctggctcttg tgggcggggt tctcgttggg tctgatcagg ctgtacacct ggctgttctg 6120 cagcagcttg aaggggtcga tgcccaccag gctgtagccc tccttctcga agtcgtagcc 6180 gctgctcacg gcggggccgt acacgcaggc gggcaggcag ctcttgtggg ccacgctgcc 6240 tctcaggatc agggcgcttc tggccaggaa gatcaggtcc tcgatctcgg cgttgccggg 6300 gtttctgctc tcccggacct ggtccatcat ggctctctgg gcggcggtct ggaacttgcc 6360 cttcaggatg ttgcacattc tctcgtaggc gcttctggtc tttctgccgt tctcgcctct 6420 ccagaagttt ctgtcgttga tgcctctctt gatcattctg atcagctcca tcaccatggt 6480 gccgatgccc ttcacggcgg cgccggcggc gccgcttctt ctgggcaggg tgctgccctg 6540 catcaggctg cacattctgg ggtccatgcc ggtccgcacc agggctctgg ttctctggta 6600 ggtggtgtcg ttcaggttgc tgtgccagat catcatgtgg gtcaggccgg cggtggcgtc 6660 ctcgccgttg ttggcctgtc tccagattct tctgatctcc tccttgtcgt acagcaccag 6720 ctctctcatc cacttgccgt ccactcttct gtagatgggg ccgccggtct tcttggggtc 6780 cttgccggcg ctggggtgct
cctccaggta tctgtttctt ctctcgtcga aggcgctcag 6840 caccattctc tcgatggtca ggctgttctg gatcagtctg ccctcgtagt cgctcagctt 6900 cagctcggtg cacatctgga tgtagaatct gccgatgccg tcgatcatct tgcccacgct 6960 ggctctgatc tcggtggcgt tctgtctctc gccgtcggtc tccatctgct cgtagcttct 7020 cttggtgccc tggctggcca tggtggcgaa ttcgatatcc gacgacggtg actgcagaaa 7080 agacccatgg aaaggaacag tctgttagtc tgtcagctat tatgtctggt ggcgcgcgcg 7140 gcagcaacga gtactgctca gactacactg ccctccaccg ttaactagag ttgagcaagc 7200 agggtcaggc aaagcgtgga gagccggctg agtctaggta ggctccaagg gagcgccgga 7260 caaaggcccg gtctcgacct gagctttaaa cttacctaga cggcggacgc agttcaggag 7320 gcaccacagg cgggaggcgg cagaacgcga ctcaaccggc gtggatggcg gcctcaggta 7380 gggcggcggg cgcgtgaagg agagatgcga gccgatggag gtgcacacca atgtggtgaa 7440 tggtcaaatg gcgtttattg tatcgagcta ggcacttaaa tacaatatct ctgcaatgcg 7500 gaattcagtg gttcgtccaa tccatgtcag acccgtctgt tgccttccta ataaggcacg 7560 atcgtaccac cttacttcca ccaatcggca tgcacggtgc tttttctctc cttgtaaggc 7620 atgttgctaa ctcatcgtta ccatgttgca agactacaag agtattgcat aagactacat 7680 ttccccctcc ctatgcaaaa gcgaaactac tatatcctga ggggactcct aaccgcgtac 7740 aaccgaagcc ccgcttttcg cctaaacaca ccctagtccc ctcagatacg cgtatatctg 7800 gcccgtacat cgcgaagcag cgcaaaacgc ctaaccctaa gcagattctt catgcaattg 7860 tcggtcaagc cttgccttgt tgtagcttaa attttgctcg cgcactactc agcgacctcc 7920 aacacacaag cagggagcag ccaatagcca atctgatgcg gtattttctc cttacgcatc 7980 tgtgcggtat ttcacaccgc atagtggctt tccccccccc cccattattg aagcatttat 8040 cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa taaacaaata 8100 ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tctaagaaac cattattatc 8160 atgacattaa cctataaaaa taggcgtatc acgaggccct ttcgtctcgc gcgtttcggt 8220 gatgacggtg aaaacctctg acacatgcag ctcccggaga cggtcacagc ttgtctgtaa 8280 gcggatgccg ggagcagaca agcccgtcag ggcgcgtcag cgggtgttgg cgggtgtcgg 8340 ggctggctta actatgcggc atcagagcag attgtactga gagtgcacca tatgcggtgt 8400 gaaataccgc acagatgcgt aaggagaaaa taccgcatca gattggctat 8450 106 8442 DNA Artificial sequence VR4775, 
Ligation of RSV RSeg7 into VR4762 106 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca 
agagaagcta 1680 cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg 1740 caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag 1800 cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc 1860 cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa 1920 gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct 1980 gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac 2040 cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag 2100 aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag 2160 caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt 2220 gatggagctg atcagaatga tcaagagagg catcaacgac agaaacttct ggagaggcga 2280 gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt 2340 ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa 2400 cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt 2460 ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga 2520 cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag 2580 ccaggtgtac agcctgatca gacccaacga gaaccccgcc cacaagagcc agctggtgtg 2640 gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac 2700 caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa 2760 catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac 2820 cagaagcggc ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc 2880 caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac 2940 cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg 3000 cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga gcgacgagaa 3060 ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg 3120 cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc 3180 tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt 3240 ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 3300 ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 
3360 tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 3420 tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 3480 gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 3540 ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 3600 tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 3660 agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3720 atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 3780 acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 3840 actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 3900 tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 3960 tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4020 tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 4080 tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 4140 caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 4200 cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg 4260 ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc tgaatcgccc 4320 catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc 4380 agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg 4440 tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca 4500 agtcagcgta atgctctgcc agtgttacaa ccaattaacc aattctgatt agaaaaactc 4560 atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg 4620 aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag 4680 atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta ttaatttccc 4740 ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg aatccggtga 4800 gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc 4860 gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag 4920 acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg 4980 caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac 5040 
ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg 5100 gataaaatgc ttgatggtcg gaagaggcat aaattccgtc agccagttta gtctgaccat 5160 ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc 5220 atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc 5280 ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga 5340 cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag 5400 ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga 5460 cactatgcgg tgtgaaatac cgcacagatg cgtaaggaga aaataccgca tcagattggc 5520 tattggctgc tccctgcttg tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa 5580 gctacaacaa ggcaaggctt gaccgacaat tgcatgaaga atctgcttag ggttaggcgt 5640 tttgcgctgc ttcgcgatgt acgggccaga tatacgcgta tctgagggga ctagggtgtg 5700 tttaggcgaa aagcggggct tcggttgtac gcggttagga gtcccctcag gatatagtag 5760 tttcgctttt gcatagggag ggggaaatgt agtcttatgc aatactcttg tagtcttgca 5820 acatggtaac gatgagttag caacatgcct tacaaggaga gaaaaagcac cgtgcatgcc 5880 gattggtgga agtaaggtgg tacgatcgtg ccttattagg aaggcaacag acgggtctga 5940 catggattgg acgaaccact gaattccgca ttgcagagat attgtattta agtgcctagc 6000 tcgatacaat aaacgccatt tgaccattca ccacattggt gtgcacctcc atcggctcgc 6060 atctctcctt cacgcgcccg ccgccctacc tgaggccgcc atccacgccg gttgagtcgc 6120 gttctgccgc ctcccgcctg tggtgcctcc tgaactgcgt ccgccgtcta ggtaagttta 6180 aagctcaggt cgagaccggg cctttgtccg gcgctccctt ggagcctacc tagactcagc 6240 cggctctcca cgctttgcct gaccctgctt gctcaactct agttaacggt ggagggcagt 6300 gtagtctgag cagtactcgt tgctgccgcg cgcgccacca gacataatag ctgacagact 6360 aacagactgt tcctttccat gggtcttttc tgcagtcacc gtcgtcggat atcgaattcg 6420 ccaccatgag ccttctaacc gaggtcgaaa cgtatgttct ctctatcgtt ccatcaggcc 6480 ccctcaaagc cgaaatcgcg cagagacttg aagatgtctt tgctgggaaa aacacagatc 6540 ttgaggctct catggaatgg ctaaagacaa gaccaatcct gtcacctctg actaagggga 6600 ttttggggtt tgtgttcacg ctcaccgtgc ccagtgagcg aggactgcag cgtagacgct 6660 ttgtccaaaa tgccctcaat gggaatgggg atccaaataa catggacaga gcagttaaac 6720 tatatagaaa 
acttaagagg gagattacat tccatggggc caaagaaata gcactcagtt 6780 attctgctgg tgcacttgcc agttgcatgg gcctcatata caacagaatg ggggctgtaa 6840 ccactgaagt ggcctttggc ctggtatgtg caacatgtga acagattgct gactcccagc 6900 acaggtctca taggcaaatg gtggcaacaa ccaatccatt aataaggcat gagaacagaa 6960 tggttttggc cagcactaca gctaaggcta tggagcaaat ggctggatca agtgagcagg 7020 cagcggaggc catggaaatt gctagtcagg ccaggcaaat ggtgcaggca atgagagcca 7080 ttgggactca tcctagctcc agtgctggtc taaaagatga tcttcttgaa aatttgcaga 7140 cctatcagaa acgaatgggg gtgcagatgc aacgattcaa gtgacccgct tgttgttgct 7200 gcgagtatca ttgggatctt gcacttgata ttgtggattc ttgatcgtct ttttttcaaa 7260 tgcatctatc gactcttcaa acacggtctg aaaagagggc cttctacgga aggagtacct 7320 gagtctatga gggaagaata tcgaaaggaa cagcagaatg ctgtggatgc tgacgacagt 7380 cattttgtca gcatagagct ggagtaatca gtcgagatcc agatctgctg tgccttctag 7440 ttgccagcca tctgttgttt gcccctcccc cgtgccttcc ttgaccctgg aaggtgccac 7500 tcccactgtc ctttcctaat aaaatgagga aattgcatcg cattgtctga gtaggtgtca 7560 ttctattctg gggggtgggg tggggcagga cagcaagggg gaggattggg aagacaatag 7620 caggcatgct ggggatgcgg tgggctctat gggtacccag gtgctgaaga attgacccgg 7680 ttcctcctgg gccagaaaga agcaggcaca tccccttctc tgtgacacac cctgtccacg 7740 cccctggttc ttagttccag ccccactcat aggacactca tagctcagga gggctccgcc 7800 ttcaatccca cccgctaaag tacttggagc ggtctctccc tccctcatca gcccaccaaa 7860 ccaaacctag cctccaagag tgggaagaaa ttaaagcaag ataggctatt aagtgcagag 7920 ggagagaaaa tgcctccaac atgtgaggaa gtaatgagag aaatcataga attttaaggc 7980 catgatttaa ggccagtggc tttccccccc cccccattat tgaagcattt atcagggtta 8040 ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc 8100 gcgcacattt ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt 8160 aacctataaa aataggcgta tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg 8220 tgaaaacctc tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc 8280 cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct 8340 taactatgcg gcatcagagc agattgtact gagagtgcac catatgcggt gtgaaatacc 8400 gcacagatgc gtaaggagaa 
aataccgcat cagattggct at 8442 107 8442 DNA Artificial sequence VR4776, Ligation of Inverted RSV R Seg7 into VR4762 107 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct 
ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca agagaagcta 1680 cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg 1740 caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag 1800 cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc 1860 cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa 1920 gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct 1980 gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac 2040 cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag 2100 aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag 2160 caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt 2220 gatggagctg atcagaatga tcaagagagg catcaacgac agaaacttct ggagaggcga 2280 gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt 2340 ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa 2400 cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt 2460 ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga 2520 cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag 2580 ccaggtgtac agcctgatca gacccaacga gaaccccgcc cacaagagcc agctggtgtg 2640 gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac 2700 caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa 2760 catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac 2820 cagaagcggc ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc 2880 caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac 2940 cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg 3000 cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga gcgacgagaa 3060 ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg 3120 cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc 3180 tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt 3240 ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga 
gcggtatcag 3300 ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 3360 tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 3420 tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 3480 gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 3540 ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 3600 tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 3660 agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3720 atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 3780 acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 3840 actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 3900 tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 3960 tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4020 tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 4080 tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 4140 caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 4200 cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg 4260 ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc tgaatcgccc 4320 catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc 4380 agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg 4440 tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca 4500 agtcagcgta atgctctgcc agtgttacaa ccaattaacc aattctgatt agaaaaactc 4560 atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg 4620 aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag 4680 atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta ttaatttccc 4740 ctcgtcaaaa ataaggttat caagtgagaa atcaccatga
gtgacgactg aatccggtga 4800 gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc 4860 gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag 4920 acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg 4980 caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac 5040 ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg 5100 gataaaatgc ttgatggtcg gaagaggcat aaattccgtc agccagttta gtctgaccat 5160 ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc 5220 atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc 5280 ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga 5340 cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag 5400 ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga 5460 cactggcctt aaatcatggc cttaaaattc tatgatttct ctcattactt cctcacatgt 5520 tggaggcatt ttctctccct ctgcacttaa tagcctatct tgctttaatt tcttcccact 5580 cttggaggct aggtttggtt tggtgggctg atgagggagg gagagaccgc tccaagtact 5640 ttagcgggtg ggattgaagg cggagccctc ctgagctatg agtgtcctat gagtggggct 5700 ggaactaaga accaggggcg tggacagggt gtgtcacaga gaaggggatg tgcctgcttc 5760 tttctggccc aggaggaacc gggtcaattc ttcagcacct gggtacccat agagcccacc 5820 gcatccccag catgcctgct attgtcttcc caatcctccc ccttgctgtc ctgccccacc 5880 ccacccccca gaatagaatg acacctactc agacaatgcg atgcaatttc ctcattttat 5940 taggaaagga cagtgggagt ggcaccttcc agggtcaagg aaggcacggg ggaggggcaa 6000 acaacagatg gctggcaact agaaggcaca gcagatctgg atctcgactg attactccag 6060 ctctatgctg acaaaatgac tgtcgtcagc atccacagca ttctgctgtt cctttcgata 6120 ttcttccctc atagactcag gtactccttc cgtagaaggc cctcttttca gaccgtgttt 6180 gaagagtcga tagatgcatt tgaaaaaaag acgatcaaga atccacaata tcaagtgcaa 6240 gatcccaatg atactcgcag caacaacaag cgggtcactt gaatcgttgc atctgcaccc 6300 ccattcgttt ctgataggtc tgcaaatttt caagaagatc atcttttaga ccagcactgg 6360 agctaggatg agtcccaatg gctctcattg cctgcaccat ttgcctggcc tgactagcaa 6420 tttccatggc ctccgctgcc tgctcacttg atccagccat ttgctccata 
gccttagctg 6480 tagtgctggc caaaaccatt ctgttctcat gccttattaa tggattggtt gttgccacca 6540 tttgcctatg agacctgtgc tgggagtcag caatctgttc acatgttgca cataccaggc 6600 caaaggccac ttcagtggtt acagccccca ttctgttgta tatgaggccc atgcaactgg 6660 caagtgcacc agcagaataa ctgagtgcta tttctttggc cccatggaat gtaatctccc 6720 tcttaagttt tctatatagt ttaactgctc tgtccatgtt atttggatcc ccattcccat 6780 tgagggcatt ttggacaaag cgtctacgct gcagtcctcg ctcactgggc acggtgagcg 6840 tgaacacaaa ccccaaaatc cccttagtca gaggtgacag gattggtctt gtctttagcc 6900 attccatgag agcctcaaga tctgtgtttt tcccagcaaa gacatcttca agtctctgcg 6960 cgatttcggc tttgaggggg cctgatggaa cgatagagag aacatacgtt tcgacctcgg 7020 ttagaaggct catggtggcg aattcgatat ccgacgacgg tgactgcaga aaagacccat 7080 ggaaaggaac agtctgttag tctgtcagct attatgtctg gtggcgcgcg cggcagcaac 7140 gagtactgct cagactacac tgccctccac cgttaactag agttgagcaa gcagggtcag 7200 gcaaagcgtg gagagccggc tgagtctagg taggctccaa gggagcgccg gacaaaggcc 7260 cggtctcgac ctgagcttta aacttaccta gacggcggac gcagttcagg aggcaccaca 7320 ggcgggaggc ggcagaacgc gactcaaccg gcgtggatgg cggcctcagg tagggcggcg 7380 ggcgcgtgaa ggagagatgc gagccgatgg aggtgcacac caatgtggtg aatggtcaaa 7440 tggcgtttat tgtatcgagc taggcactta aatacaatat ctctgcaatg cggaattcag 7500 tggttcgtcc aatccatgtc agacccgtct gttgccttcc taataaggca cgatcgtacc 7560 accttacttc caccaatcgg catgcacggt gctttttctc tccttgtaag gcatgttgct 7620 aactcatcgt taccatgttg caagactaca agagtattgc ataagactac atttccccct 7680 ccctatgcaa aagcgaaact actatatcct gaggggactc ctaaccgcgt acaaccgaag 7740 ccccgctttt cgcctaaaca caccctagtc ccctcagata cgcgtatatc tggcccgtac 7800 atcgcgaagc agcgcaaaac gcctaaccct aagcagattc ttcatgcaat tgtcggtcaa 7860 gccttgcctt gttgtagctt aaattttgct cgcgcactac tcagcgacct ccaacacaca 7920 agcagggagc agccaatagc caatctgatg cggtattttc tccttacgca tctgtgcggt 7980 atttcacacc gcatagtggc tttccccccc cccccattat tgaagcattt atcagggtta 8040 ttgtctcatg agcggataca tatttgaatg tatttagaaa aataaacaaa taggggttcc 8100 gcgcacattt ccccgaaaag tgccacctga cgtctaagaa accattatta tcatgacatt 
8160 aacctataaa aataggcgta tcacgaggcc ctttcgtctc gcgcgtttcg gtgatgacgg 8220 tgaaaacctc tgacacatgc agctcccgga gacggtcaca gcttgtctgt aagcggatgc 8280 cgggagcaga caagcccgtc agggcgcgtc agcgggtgtt ggcgggtgtc ggggctggct 8340 taactatgcg gcatcagagc agattgtact gagagtgcac catatgcggt gtgaaatacc 8400 gcacagatgc gtaaggagaa aataccgcat cagattggct at 8442 108 7754 DNA Artificial sequence VR4777, Ligation of RSVRM2 into VR4762 108 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca 
aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca agagaagcta 1680 cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg 1740 caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag 1800 cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc 1860 cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa 1920 gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct 1980 gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac 2040 cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag 2100 aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag 2160 caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt 2220 gatggagctg atcagaatga tcaagagagg catcaacgac agaaacttct ggagaggcga 2280 gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt 2340 ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa 2400 cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt 2460 ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga 2520 cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag 2580 ccaggtgtac agcctgatca gacccaacga gaaccccgcc cacaagagcc agctggtgtg 2640 gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac 2700 caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa 2760 catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac 2820 cagaagcggc ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc 2880 caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac 2940 cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg 3000 cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg 
ttcgagctga gcgacgagaa 3060 ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg 3120 cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc 3180 tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt 3240 ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 3300 ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 3360 tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 3420 tccataggct ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 3480 gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 3540 ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 3600 tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 3660 agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3720 atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 3780 acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 3840 actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 3900 tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 3960 tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4020 tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 4080 tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 4140 caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 4200 cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg 4260 ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc tgaatcgccc 4320 catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc 4380 agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg 4440 tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca 4500 agtcagcgta atgctctgcc agtgttacaa ccaattaacc aattctgatt agaaaaactc 4560 atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg 4620 aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag 4680 atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta 
ttaatttccc 4740 ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg aatccggtga 4800 gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc 4860 gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag 4920 acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg 4980 caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac 5040 ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg 5100 gataaaatgc ttgatggtcg gaagaggcat aaattccgtc agccagttta gtctgaccat 5160 ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc 5220 atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc 5280 ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga 5340 cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag 5400 ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga 5460 cactatgcgg tgtgaaatac cgcacagatg cgtaaggaga aaataccgca tcagattggc 5520 tattggctgc tccctgcttg tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa 5580 gctacaacaa ggcaaggctt gaccgacaat tgcatgaaga atctgcttag ggttaggcgt 5640 tttgcgctgc ttcgcgatgt acgggccaga tatacgcgta tctgagggga ctagggtgtg 5700 tttaggcgaa aagcggggct tcggttgtac gcggttagga gtcccctcag gatatagtag 5760 tttcgctttt gcatagggag ggggaaatgt agtcttatgc aatactcttg tagtcttgca 5820 acatggtaac gatgagttag caacatgcct tacaaggaga gaaaaagcac cgtgcatgcc 5880 gattggtgga agtaaggtgg tacgatcgtg ccttattagg aaggcaacag acgggtctga 5940 catggattgg acgaaccact gaattccgca ttgcagagat attgtattta agtgcctagc 6000 tcgatacaat aaacgccatt tgaccattca ccacattggt gtgcacctcc atcggctcgc 6060 atctctcctt cacgcgcccg ccgccctacc tgaggccgcc atccacgccg gttgagtcgc 6120 gttctgccgc ctcccgcctg tggtgcctcc tgaactgcgt ccgccgtcta ggtaagttta 6180 aagctcaggt cgagaccggg cctttgtccg gcgctccctt ggagcctacc tagactcagc 6240 cggctctcca cgctttgcct gaccctgctt gctcaactct agttaacggt ggagggcagt 6300 gtagtctgag cagtactcgt tgctgccgcg cgcgccacca gacataatag ctgacagact 6360 aacagactgt tcctttccat gggtcttttc tgcagtcacc gtcgtcggat atcgaattcg 
6420 ccaccatgag cctgctgacc gaggtggaga cccccatcag aaacgagtgg ggctgcagat 6480 gcaacgacag cagcgacccc ctggtggtgg ccgccagcat catcggcatc ctgcacctga 6540 tcctgtggat cctggacaga ctgttcttca agtgcatcta cagactgttc aagcacggcc 6600 tgaagagagg ccccagcacc gagggcgtgc ccgagagcat gagagaggag tacagaaagg 6660 agcagcagaa cgccgtggac gccgacgaca gccacttcgt gagcatcgag ctggagtgat 6720 cagtcgagat ccagatctgc tgtgccttct agttgccagc catctgttgt ttgcccctcc 6780 cccgtgcctt ccttgaccct ggaaggtgcc actcccactg tcctttccta ataaaatgag 6840 gaaattgcat cgcattgtct gagtaggtgt cattctattc tggggggtgg ggtggggcag 6900 gacagcaagg gggaggattg ggaagacaat agcaggcatg ctggggatgc ggtgggctct 6960 atgggtaccc aggtgctgaa gaattgaccc ggttcctcct gggccagaaa gaagcaggca 7020 catccccttc tctgtgacac accctgtcca cgcccctggt tcttagttcc agccccactc 7080 ataggacact catagctcag gagggctccg ccttcaatcc cacccgctaa agtacttgga 7140 gcggtctctc cctccctcat cagcccacca aaccaaacct agcctccaag agtgggaaga 7200 aattaaagca agataggcta ttaagtgcag agggagagaa aatgcctcca acatgtgagg 7260 aagtaatgag agaaatcata gaattttaag gccatgattt aaggccagtg gctttccccc 7320 cccccccatt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa 7380 tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct 7440 gacgtctaag aaaccattat tatcatgaca ttaacctata aaaataggcg tatcacgagg 7500 ccctttcgtc tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg 7560 gagacggtca cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg 7620 tcagcgggtg ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta 7680 ctgagagtgc accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc 7740 atcagattgg ctat 7754 109 7754 DNA Artificial sequence VR4778, Ligation of Inverted RSV RM2 into VR4762 109 tggccattgc atacgttgta tccatatcat aatatgtaca tttatattgg ctcatgtcca 60 acattaccgc catgttgaca ttgattattg actagttatt aatagtaatc aattacgggg 120 tcattagttc atagcccata tatggagttc cgcgttacat aacttacggt aaatggcccg 180 cctggctgac cgcccaacga cccccgccca ttgacgtcaa taatgacgta tgttcccata 240 gtaacgccaa tagggacttt ccattgacgt caatgggtgg 
agtatttacg gtaaactgcc 300 cacttggcag tacatcaagt gtatcatatg ccaagtacgc cccctattga cgtcaatgac 360 ggtaaatggc ccgcctggca ttatgcccag tacatgacct tatgggactt tcctacttgg 420 cagtacatct acgtattagt catcgctatt accatggtga tgcggttttg gcagtacatc 480 aatgggcgtg gatagcggtt tgactcacgg ggatttccaa gtctccaccc cattgacgtc 540 aatgggagtt tgttttggca ccaaaatcaa cgggactttc caaaatgtcg taacaactcc 600 gccccattga cgcaaatggg cggtaggcgt gtacggtggg aggtctatat aagcagagct 660 cgtttagtga accgtcagat cgcctggaga cgccatccac gctgttttga cctccataga 720 agacaccggg accgatccag cctccgcggc cgggaacggt gcattggaac gcggattccc 780 cgtgccaaga gtgacgtaag taccgcctat agactctata ggcacacccc tttggctctt 840 atgcatgcta tactgttttt ggcttggggc ctatacaccc ccgcttcctt atgctatagg 900 tgatggtata gcttagccta taggtgtggg ttattgacca ttattgacca ctcccctatt 960 ggtgacgata ctttccatta ctaatccata acatggctct ttgccacaac tatctctatt 1020 ggctatatgc caatactctg tccttcagag actgacacgg actctgtatt tttacaggat 1080 ggggtcccat ttattattta caaattcaca tatacaacaa cgccgtcccc cgtgcccgca 1140 gtttttatta aacatagcgt gggatctcca cgcgaatctc gggtacgtgt tccggacatg 1200 ggctcttctc cggtagcggc ggagcttcca catccgagcc ctggtcccat gcctccagcg 1260 gctcatggtc gctcggcagc tccttgctcc taacagtgga ggccagactt aggcacagca 1320 caatgcccac caccaccagt gtgccgcaca aggccgtggc ggtagggtat gtgtctgaaa 1380 atgagcgtgg agattgggct cgcacggctg acgcagatgg aagacttaag gcagcggcag 1440 aagaagatgc aggcagctga gttgttgtat tctgataaga gtcagaggta actcccgttg 1500 cggtgctgtt aacggtggag ggcagtgtag tctgagcagt actcgttgct gccgcgcgcg 1560 ccaccagaca taatagctga cagactaaca gactgttcct ttccatgggt cttttctgca 1620 gtcaccgtcg tcggatatcg aattcgccac catggccagc cagggcacca agagaagcta 1680 cgagcagatg gagaccgacg gcgagagaca gaacgccacc gagatcagag ccagcgtggg 1740 caagatgatc gacggcatcg gcagattcta catccagatg tgcaccgagc tgaagctgag 1800 cgactacgag ggcagactga tccagaacag cctgaccatc gagagaatgg tgctgagcgc 1860 cttcgacgag agaagaaaca gatacctgga ggagcacccc agcgccggca aggaccccaa 1920 gaagaccggc ggccccatct acagaagagt ggacggcaag tggatgagag agctggtgct 
1980 gtacgacaag gaggagatca gaagaatctg gagacaggcc aacaacggcg aggacgccac 2040 cgccggcctg acccacatga tgatctggca cagcaacctg aacgacacca cctaccagag 2100 aaccagagcc ctggtgcgga ccggcatgga ccccagaatg tgcagcctga tgcagggcag 2160 caccctgccc agaagaagcg gcgccgccgg cgccgccgtg aagggcatcg gcaccatggt 2220 gatggagctg atcagaatga tcaagagagg catcaacgac agaaacttct ggagaggcga 2280 gaacggcaga aagaccagaa gcgcctacga gagaatgtgc aacatcctga agggcaagtt 2340 ccagaccgcc gcccagagag ccatgatgga ccaggtccgg gagagcagaa accccggcaa 2400 cgccgagatc gaggacctga tcttcctggc cagaagcgcc ctgatcctga gaggcagcgt 2460 ggcccacaag agctgcctgc ccgcctgcgt gtacggcccc gccgtgagca gcggctacga 2520 cttcgagaag gagggctaca gcctggtggg catcgacccc ttcaagctgc tgcagaacag 2580 ccaggtgtac agcctgatca gacccaacga gaaccccgcc cacaagagcc agctggtgtg 2640 gatggcctgc cacagcgccg ccttcgagga cctgagactg ctgagcttca tcagaggcac 2700 caaggtgtcc cccagaggca agctgagcac cagaggcgtg cagatcgcca gcaacgagaa 2760 catggacaac atgggcagca gcaccctgga gctgagaagc agatactggg ccatcagaac 2820 cagaagcggc ggcaacacca accagcagag agccagcgcc ggccagatca gcgtgcagcc 2880 caccttcagc gtgcagagaa acctgccctt cgagaagagc accgtgatgg ccgccttcac 2940 cggcaacacc gagggcagaa ccagcgacat gagagccgag atcatcagaa tgatggaggg 3000 cgccaagccc gaggaggtgt ccttcagagg cagaggcgtg ttcgagctga gcgacgagaa 3060 ggccaccaac cccatcgtgc ctagcttcga catgagcaac gagggcagct acttcttcgg 3120 cgacaacgcc gaggagtacg acaactgatc agtcgaccac gtgtgatcca gatctacttc 3180 tggctaataa aagatcagag ctctagagat ctgtgtgttg gttttttgtg tggtactctt 3240 ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg gctgcggcga gcggtatcag 3300 ctcactcaaa ggcggtaata cggttatcca cagaatcagg ggataacgca ggaaagaaca 3360 tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa ggccgcgttg ctggcgtttt 3420 tccataggct
ccgcccccct gacgagcatc acaaaaatcg acgctcaagt cagaggtggc 3480 gaaacccgac aggactataa agataccagg cgtttccccc tggaagctcc ctcgtgcgct 3540 ctcctgttcc gaccctgccg cttaccggat acctgtccgc ctttctccct tcgggaagcg 3600 tggcgctttc tcatagctca cgctgtaggt atctcagttc ggtgtaggtc gttcgctcca 3660 agctgggctg tgtgcacgaa ccccccgttc agcccgaccg ctgcgcctta tccggtaact 3720 atcgtcttga gtccaacccg gtaagacacg acttatcgcc actggcagca gccactggta 3780 acaggattag cagagcgagg tatgtaggcg gtgctacaga gttcttgaag tggtggccta 3840 actacggcta cactagaaga acagtatttg gtatctgcgc tctgctgaag ccagttacct 3900 tcggaaaaag agttggtagc tcttgatccg gcaaacaaac caccgctggt agcggtggtt 3960 tttttgtttg caagcagcag attacgcgca gaaaaaaagg atctcaagaa gatcctttga 4020 tcttttctac ggggtctgac gctcagtgga acgaaaactc acgttaaggg attttggtca 4080 tgagattatc aaaaaggatc ttcacctaga tccttttaaa ttaaaaatga agttttaaat 4140 caatctaaag tatatatgag taaacttggt ctgacagtta ccaatgctta atcagtgagg 4200 cacctatctc agcgatctgt ctatttcgtt catccatagt tgcctgactc gggggggggg 4260 ggcgctgagg tctgcctcgt gaagaaggtg ttgctgactc ataccaggcc tgaatcgccc 4320 catcatccag ccagaaagtg agggagccac ggttgatgag agctttgttg taggtggacc 4380 agttggtgat tttgaacttt tgctttgcca cggaacggtc tgcgttgtcg ggaagatgcg 4440 tgatctgatc cttcaactca gcaaaagttc gatttattca acaaagccgc cgtcccgtca 4500 agtcagcgta atgctctgcc agtgttacaa ccaattaacc aattctgatt agaaaaactc 4560 atcgagcatc aaatgaaact gcaatttatt catatcagga ttatcaatac catatttttg 4620 aaaaagccgt ttctgtaatg aaggagaaaa ctcaccgagg cagttccata ggatggcaag 4680 atcctggtat cggtctgcga ttccgactcg tccaacatca atacaaccta ttaatttccc 4740 ctcgtcaaaa ataaggttat caagtgagaa atcaccatga gtgacgactg aatccggtga 4800 gaatggcaaa agcttatgca tttctttcca gacttgttca acaggccagc cattacgctc 4860 gtcatcaaaa tcactcgcat caaccaaacc gttattcatt cgtgattgcg cctgagcgag 4920 acgaaatacg cgatcgctgt taaaaggaca attacaaaca ggaatcgaat gcaaccggcg 4980 caggaacact gccagcgcat caacaatatt ttcacctgaa tcaggatatt cttctaatac 5040 ctggaatgct gttttcccgg ggatcgcagt ggtgagtaac catgcatcat caggagtacg 5100 gataaaatgc ttgatggtcg 
gaagaggcat aaattccgtc agccagttta gtctgaccat 5160 ctcatctgta acatcattgg caacgctacc tttgccatgt ttcagaaaca actctggcgc 5220 atcgggcttc ccatacaatc gatagattgt cgcacctgat tgcccgacat tatcgcgagc 5280 ccatttatac ccatataaat cagcatccat gttggaattt aatcgcggcc tcgagcaaga 5340 cgtttcccgt tgaatatggc tcataacacc ccttgtatta ctgtttatgt aagcagacag 5400 ttttattgtt catgatgata tatttttatc ttgtgcaatg taacatcaga gattttgaga 5460 cactggcctt aaatcatggc cttaaaattc tatgatttct ctcattactt cctcacatgt 5520 tggaggcatt ttctctccct ctgcacttaa tagcctatct tgctttaatt tcttcccact 5580 cttggaggct aggtttggtt tggtgggctg atgagggagg gagagaccgc tccaagtact 5640 ttagcgggtg ggattgaagg cggagccctc ctgagctatg agtgtcctat gagtggggct 5700 ggaactaaga accaggggcg tggacagggt gtgtcacaga gaaggggatg tgcctgcttc 5760 tttctggccc aggaggaacc gggtcaattc ttcagcacct gggtacccat agagcccacc 5820 gcatccccag catgcctgct attgtcttcc caatcctccc ccttgctgtc ctgccccacc 5880 ccacccccca gaatagaatg acacctactc agacaatgcg atgcaatttc ctcattttat 5940 taggaaagga cagtgggagt ggcaccttcc agggtcaagg aaggcacggg ggaggggcaa 6000 acaacagatg gctggcaact agaaggcaca gcagatctgg atctcgactg atcactccag 6060 ctcgatgctc acgaagtggc tgtcgtcggc gtccacggcg ttctgctgct cctttctgta 6120 ctcctctctc atgctctcgg gcacgccctc ggtgctgggg cctctcttca ggccgtgctt 6180 gaacagtctg tagatgcact tgaagaacag tctgtccagg atccacagga tcaggtgcag 6240 gatgccgatg atgctggcgg ccaccaccag ggggtcgctg ctgtcgttgc atctgcagcc 6300 ccactcgttt ctgatggggg tctccacctc ggtcagcagg ctcatggtgg cgaattcgat 6360 atccgacgac ggtgactgca gaaaagaccc atggaaagga acagtctgtt agtctgtcag 6420 ctattatgtc tggtggcgcg cgcggcagca acgagtactg ctcagactac actgccctcc 6480 accgttaact agagttgagc aagcagggtc aggcaaagcg tggagagccg gctgagtcta 6540 ggtaggctcc aagggagcgc cggacaaagg cccggtctcg acctgagctt taaacttacc 6600 tagacggcgg acgcagttca ggaggcacca caggcgggag gcggcagaac gcgactcaac 6660 cggcgtggat ggcggcctca ggtagggcgg cgggcgcgtg aaggagagat gcgagccgat 6720 ggaggtgcac accaatgtgg tgaatggtca aatggcgttt attgtatcga gctaggcact 6780 taaatacaat atctctgcaa tgcggaattc 
agtggttcgt ccaatccatg tcagacccgt 6840 ctgttgcctt cctaataagg cacgatcgta ccaccttact tccaccaatc ggcatgcacg 6900 gtgctttttc tctccttgta aggcatgttg ctaactcatc gttaccatgt tgcaagacta 6960 caagagtatt gcataagact acatttcccc ctccctatgc aaaagcgaaa ctactatatc 7020 ctgaggggac tcctaaccgc gtacaaccga agccccgctt ttcgcctaaa cacaccctag 7080 tcccctcaga tacgcgtata tctggcccgt acatcgcgaa gcagcgcaaa acgcctaacc 7140 ctaagcagat tcttcatgca attgtcggtc aagccttgcc ttgttgtagc ttaaattttg 7200 ctcgcgcact actcagcgac ctccaacaca caagcaggga gcagccaata gccaatctga 7260 tgcggtattt tctccttacg catctgtgcg gtatttcaca ccgcatagtg gctttccccc 7320 cccccccatt attgaagcat ttatcagggt tattgtctca tgagcggata catatttgaa 7380 tgtatttaga aaaataaaca aataggggtt ccgcgcacat ttccccgaaa agtgccacct 7440 gacgtctaag aaaccattat tatcatgaca ttaacctata aaaataggcg tatcacgagg 7500 ccctttcgtc tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg 7560 gagacggtca cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg 7620 tcagcgggtg ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta 7680 ctgagagtgc accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc 7740 atcagattgg ctat 7754 110 7765 DNA Artificial sequence VR4779, 7765 bps DNA Circular 110 tggtatgcgg tgtgaaatac cgcacagatg cgtaaggaga aaataccgca tcagattggc 60 tattggctgc tccctgcttg tgtgttggag gtcgctgagt agtgcgcgag caaaatttaa 120 gctacaacaa ggcaaggctt gaccgacaat tgcatgaaga atctgcttag ggttaggcgt 180 tttgcgctgc ttcgcgatgt acgggccaga tatacgcgta tctgagggga ctagggtgtg 240 tttaggcgaa aagcggggct tcggttgtac gcggttagga gtcccctcag gatatagtag 300 tttcgctttt gcatagggag ggggaaatgt agtcttatgc aatactcttg tagtcttgca 360 acatggtaac gatgagttag caacatgcct tacaaggaga gaaaaagcac cgtgcatgcc 420 gattggtgga agtaaggtgg tacgatcgtg ccttattagg aaggcaacag acgggtctga 480 catggattgg acgaaccact gaattccgca ttgcagagat attgtattta agtgcctagc 540 tcgatacaat aaacgccatt tgaccattca ccacattggt gtgcacctcc atcggctcgc 600 atctctcctt cacgcgcccg ccgccctacc tgaggccgcc atccacgccg gttgagtcgc 660 gttctgccgc ctcccgcctg tggtgcctcc 
tgaactgcgt ccgccgtcta ggtaagttta 720 aagctcaggt cgagaccggg cctttgtccg gcgctccctt ggagcctacc tagactcagc 780 cggctctcca cgctttgcct gaccctgctt gctcaactct agttaacggt ggagggcagt 840 gtagtctgag cagtactcgt tgctgccgcg cgcgccacca gacataatag ctgacagact 900 aacagactgt tcctttccat gggtcttttc tgcagtcacc gtcgtcggat atcgaattcg 960 ccaccatggc cagccagggc accaagagaa gctacgagca gatggagacc gacggcgaga 1020 gacagaacgc caccgagatc agagccagcg tgggcaagat gatcgacggc atcggcagat 1080 tctacatcca gatgtgcacc gagctgaagc tgagcgacta cgagggcaga ctgatccaga 1140 acagcctgac catcgagaga atggtgctga gcgccttcga cgagagaaga aacagatacc 1200 tggaggagca ccccagcgcc ggcaaggacc ccaagaagac cggcggcccc atctacagaa 1260 gagtggacgg caagtggatg agagagctgg tgctgtacga caaggaggag atcagaagaa 1320 tctggagaca ggccaacaac ggcgaggacg ccaccgccgg cctgacccac atgatgatct 1380 ggcacagcaa cctgaacgac accacctacc agagaaccag agccctggtg cggaccggca 1440 tggaccccag aatgtgcagc ctgatgcagg gcagcaccct gcccagaaga agcggcgccg 1500 ccggcgccgc cgtgaagggc atcggcacca tggtgatgga gctgatcaga atgatcaaga 1560 gaggcatcaa cgacagaaac ttctggagag gcgagaacgg cagaaagacc agaagcgcct 1620 acgagagaat gtgcaacatc ctgaagggca agttccagac cgccgcccag agagccatga 1680 tggaccaggt ccgggagagc agaaaccccg gcaacgccga gatcgaggac ctgatcttcc 1740 tggccagaag cgccctgatc ctgagaggca gcgtggccca caagagctgc ctgcccgcct 1800 gcgtgtacgg ccccgccgtg agcagcggct acgacttcga gaaggagggc tacagcctgg 1860 tgggcatcga ccccttcaag ctgctgcaga acagccaggt gtacagcctg atcagaccca 1920 acgagaaccc cgcccacaag agccagctgg tgtggatggc ctgccacagc gccgccttcg 1980 aggacctgag actgctgagc ttcatcagag gcaccaaggt gtcccccaga ggcaagctga 2040 gcaccagagg cgtgcagatc gccagcaacg agaacatgga caacatgggc agcagcaccc 2100 tggagctgag aagcagatac tgggccatca gaaccagaag cggcggcaac accaaccagc 2160 agagagccag cgccggccag atcagcgtgc agcccacctt cagcgtgcag agaaacctgc 2220 ccttcgagaa gagcaccgtg atggccgcct tcaccggcaa caccgagggc agaaccagcg 2280 acatgagagc cgagatcatc agaatgatgg agggcgccaa gcccgaggag gtgtccttca 2340 gaggcagagg cgtgttcgag ctgagcgacg agaaggccac 
caaccccatc gtgcctagct 2400 tcgacatgag caacgagggc agctacttct tcggcgacaa cgccgaggag tacgacaact 2460 gatcagtcga ccacgtgtga tccagatctg ctgtgccttc tagttgccag ccatctgttg 2520 tttgcccctc ccccgtgcct tccttgaccc tggaaggtgc cactcccact gtcctttcct 2580 aataaaatga ggaaattgca tcgcattgtc tgagtaggtg tcattctatt ctggggggtg 2640 gggtggggca ggacagcaag ggggaggatt gggaagacaa tagcaggcat gctggggatg 2700 cggtgggctc tatgggtacc caggtgctga agaattgacc cggttcctcc tgggccagaa 2760 agaagcaggc acatcccctt ctctgtgaca caccctgtcc acgcccctgg ttcttagttc 2820 cagccccact cataggacac tcatagctca ggagggctcc gccttcaatc ccacccgcta 2880 aagtacttgg agcggtctct ccctccctca tcagcccacc aaaccaaacc tagcctccaa 2940 gagtgggaag aaattaaagc aagataggct attaagtgca gagggagaga aaatgcctcc 3000 aacatgtgag gaagtaatga gagaaatcat agaattttaa ggccatgatt taaggccacc 3060 attgcatacg ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt 3120 accgccatgt tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt 3180 agttcatagc ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg 3240 ctgaccgccc aacgaccccc gcccattgac gtcaataatg acgtatgttc ccatagtaac 3300 gccaataggg actttccatt gacgtcaatg ggtggagtat ttacggtaaa ctgcccactt 3360 ggcagtacat caagtgtatc atatgccaag tacgccccct attgacgtca atgacggtaa 3420 atggcccgcc tggcattatg cccagtacat gaccttatgg gactttccta cttggcagta 3480 catctacgta ttagtcatcg ctattaccat ggtgatgcgg ttttggcagt acatcaatgg 3540 gcgtggatag cggtttgact cacggggatt tccaagtctc caccccattg acgtcaatgg 3600 gagtttgttt tggcaccaaa atcaacggga ctttccaaaa tgtcgtaaca actccgcccc 3660 attgacgcaa atgggcggta ggcgtgtacg gtgggaggtc tatataagca gagctcgttt 3720 agtgaaccgt cagatcgcct ggagacgcca tccacgctgt tttgacctcc atagaagaca 3780 ccgggaccga tccagcctcc gcggccggga acggtgcatt ggaacgcgga ttccccgtgc 3840 caagagtgac gtaagtaccg cctatagact ctataggcac acccctttgg ctcttatgca 3900 tgctatactg tttttggctt ggggcctata cacccccgct tccttatgct ataggtgatg 3960 gtatagctta gcctataggt gtgggttatt gaccattatt gaccactccc ctattggtga 4020 cgatactttc cattactaat ccataacatg gctctttgcc acaactatct 
ctattggcta 4080 tatgccaata ctctgtcctt cagagactga cacggactct gtatttttac aggatggggt 4140 cccatttatt atttacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt 4200 tattaaacat agcgtgggat ctccacgcga atctcgggta cgtgttccgg acatgggctc 4260 ttctccggta gcggcggagc ttccacatcc gagccctggt cccatgcctc cagcggctca 4320 tggtcgctcg gcagctcctt gctcctaaca gtggaggcca gacttaggca cagcacaatg 4380 cccaccacca ccagtgtgcc gcacaaggcc gtggcggtag ggtatgtgtc tgaaaatgag 4440 cgtggagatt gggctcgcac ggctgacgca gatggaagac ttaaggcagc ggcagaagaa 4500 gatgcaggca gctgagttgt tgtattctga taagagtcag aggtaactcc cgttgcggtg 4560 ctgttaacgg tggagggcag tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc 4620 agacataata gctgacagac taacagactg ttcctttcca tgggtctttt ctgcagtcac 4680 cgtcgtcgga tatcgaattc gccaccatga gcctgctgac cgaggtggag acccccatca 4740 gaaacgagtg gggctgcaga tgcaacgaca gcagcgaccc cctggtggtg gccgccagca 4800 tcatcggcat cctgcacctg atcctgtgga tcctggacag actgttcttc aagtgcatct 4860 acagactgtt caagcacggc ctgaagagag gccccagcac cgagggcgtg cccgagagca 4920 tgagagagga gtacagaaag gagcagcaga acgccgtgga cgccgacgac agccacttcg 4980 tgagcatcga gctggagtga tcagtcgacc acgtgtgatc cagatctact tctggctaat 5040 aaaagatcag agctctagag atctgtgtgt tggttttttg tgtggtactc ttccgcttcc 5100 tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca 5160 aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca 5220 aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg 5280 ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg 5340 acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 5400 ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 5460 tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 5520 tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 5580 gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 5640 agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 5700 tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa 
5760 agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt 5820 tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct 5880 acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatta 5940 tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa 6000 agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc 6060 tcagcgatct gtctatttcg ttcatccata gttgcctgac tcgggggggg ggggcgctga 6120 ggtctgcctc gtgaagaagg tgttgctgac tcataccagg cctgaatcgc cccatcatcc 6180 agccagaaag tgagggagcc acggttgatg agagctttgt tgtaggtgga ccagttggtg 6240 attttgaact tttgctttgc cacggaacgg tctgcgttgt cgggaagatg cgtgatctga 6300 tccttcaact cagcaaaagt tcgatttatt caacaaagcc gccgtcccgt caagtcagcg 6360 taatgctctg ccagtgttac aaccaattaa ccaattctga ttagaaaaac tcatcgagca 6420 tcaaatgaaa ctgcaattta ttcatatcag gattatcaat accatatttt tgaaaaagcc 6480 gtttctgtaa tgaaggagaa aactcaccga ggcagttcca taggatggca agatcctggt 6540 atcggtctgc gattccgact cgtccaacat caatacaacc tattaatttc ccctcgtcaa 6600 aaataaggtt atcaagtgag aaatcaccat gagtgacgac tgaatccggt gagaatggca 6660 aaagcttatg catttctttc cagacttgtt caacaggcca gccattacgc tcgtcatcaa 6720 aatcactcgc atcaaccaaa ccgttattca ttcgtgattg cgcctgagcg agacgaaata 6780 cgcgatcgct gttaaaagga caattacaaa caggaatcga atgcaaccgg cgcaggaaca 6840 ctgccagcgc atcaacaata ttttcacctg aatcaggata ttcttctaat acctggaatg 6900 ctgttttccc ggggatcgca gtggtgagta accatgcatc atcaggagta cggataaaat 6960 gcttgatggt cggaagaggc ataaattccg tcagccagtt tagtctgacc atctcatctg 7020 taacatcatt ggcaacgcta cctttgccat gtttcagaaa caactctggc gcatcgggct 7080 tcccatacaa tcgatagatt gtcgcacctg attgcccgac attatcgcga gcccatttat 7140 acccatataa atcagcatcc atgttggaat ttaatcgcgg cctcgagcaa gacgtttccc 7200 gttgaatatg gctcataaca ccccttgtat tactgtttat gtaagcagac agttttattg 7260 ttcatgatga tatattttta tcttgtgcaa tgtaacatca gagattttga gacacaacgt 7320 ggctttcccc ccccccccat tattgaagca tttatcaggg ttattgtctc atgagcggat 7380 acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa 7440 
aagtgccacc tgacgtctaa gaaaccatta ttatcatgac attaacctat aaaaataggc 7500 gtatcacgag gccctttcgt ctcgcgcgtt tcggtgatga cggtgaaaac ctctgacaca 7560 tgcagctccc ggagacggtc acagcttgtc tgtaagcgga tgccgggagc agacaagccc 7620 gtcagggcgc gtcagcgggt gttggcgggt gtcggggctg gcttaactat gcggcatcag 7680 agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga tgcgtaagga 7740 gaaaataccg catcagattg gctat 7765 111 7765 DNA Artificial sequence VR4780, 7765 bps DNA Circular 111 tggtggcctt aaatcatggc cttaaaattc tatgatttct ctcattactt cctcacatgt 60 tggaggcatt ttctctccct ctgcacttaa tagcctatct tgctttaatt tcttcccact 120 cttggaggct aggtttggtt tggtgggctg atgagggagg gagagaccgc tccaagtact 180 ttagcgggtg ggattgaagg cggagccctc ctgagctatg agtgtcctat gagtggggct 240 ggaactaaga accaggggcg tggacagggt gtgtcacaga gaaggggatg tgcctgcttc 300 tttctggccc aggaggaacc gggtcaattc ttcagcacct gggtacccat agagcccacc 360 gcatccccag catgcctgct attgtcttcc caatcctccc ccttgctgtc ctgccccacc 420 ccacccccca gaatagaatg acacctactc agacaatgcg atgcaatttc ctcattttat 480 taggaaagga cagtgggagt ggcaccttcc agggtcaagg aaggcacggg ggaggggcaa 540 acaacagatg gctggcaact agaaggcaca gcagatctgg atcacacgtg gtcgactgat 600 cagttgtcgt actcctcggc gttgtcgccg aagaagtagc tgccctcgtt gctcatgtcg 660 aagctaggca cgatggggtt ggtggccttc tcgtcgctca gctcgaacac gcctctgcct 720 ctgaaggaca cctcctcggg cttggcgccc tccatcattc tgatgatctc ggctctcatg 780 tcgctggttc tgccctcggt gttgccggtg aaggcggcca tcacggtgct cttctcgaag 840 ggcaggtttc tctgcacgct gaaggtgggc tgcacgctga tctggccggc gctggctctc 900 tgctggttgg tgttgccgcc gcttctggtt ctgatggccc agtatctgct tctcagctcc 960 agggtgctgc tgcccatgtt gtccatgttc tcgttgctgg cgatctgcac gcctctggtg 1020 ctcagcttgc ctctggggga caccttggtg cctctgatga agctcagcag tctcaggtcc 1080 tcgaaggcgg cgctgtggca ggccatccac accagctggc tcttgtgggc ggggttctcg 1140 ttgggtctga tcaggctgta cacctggctg ttctgcagca gcttgaaggg gtcgatgccc 1200 accaggctgt agccctcctt ctcgaagtcg tagccgctgc tcacggcggg gccgtacacg 1260 caggcgggca ggcagctctt gtgggccacg ctgcctctca ggatcagggc gcttctggcc 1320 
aggaagatca ggtcctcgat ctcggcgttg ccggggtttc tgctctcccg gacctggtcc 1380 atcatggctc tctgggcggc ggtctggaac ttgcccttca ggatgttgca cattctctcg 1440 taggcgcttc tggtctttct gccgttctcg cctctccaga agtttctgtc gttgatgcct 1500 ctcttgatca ttctgatcag ctccatcacc atggtgccga tgcccttcac ggcggcgccg 1560 gcggcgccgc ttcttctggg cagggtgctg ccctgcatca ggctgcacat tctggggtcc 1620 atgccggtcc gcaccagggc tctggttctc tggtaggtgg tgtcgttcag gttgctgtgc 1680 cagatcatca tgtgggtcag gccggcggtg gcgtcctcgc cgttgttggc ctgtctccag 1740 attcttctga tctcctcctt gtcgtacagc accagctctc tcatccactt gccgtccact 1800 cttctgtaga tggggccgcc ggtcttcttg gggtccttgc cggcgctggg gtgctcctcc 1860 aggtatctgt ttcttctctc gtcgaaggcg ctcagcacca ttctctcgat ggtcaggctg 1920 ttctggatca gtctgccctc gtagtcgctc agcttcagct cggtgcacat ctggatgtag 1980 aatctgccga tgccgtcgat catcttgccc acgctggctc tgatctcggt ggcgttctgt 2040 ctctcgccgt cggtctccat ctgctcgtag cttctcttgg tgccctggct ggccatggtg 2100 gcgaattcga tatccgacga cggtgactgc agaaaagacc catggaaagg aacagtctgt 2160 tagtctgtca gctattatgt ctggtggcgc gcgcggcagc aacgagtact gctcagacta 2220 cactgccctc caccgttaac tagagttgag caagcagggt caggcaaagc gtggagagcc 2280 ggctgagtct aggtaggctc caagggagcg ccggacaaag gcccggtctc gacctgagct 2340 ttaaacttac ctagacggcg gacgcagttc aggaggcacc acaggcggga ggcggcagaa 2400 cgcgactcaa ccggcgtgga tggcggcctc aggtagggcg gcgggcgcgt gaaggagaga 2460 tgcgagccga tggaggtgca caccaatgtg gtgaatggtc aaatggcgtt tattgtatcg 2520 agctaggcac ttaaatacaa tatctctgca atgcggaatt cagtggttcg tccaatccat 2580 gtcagacccg tctgttgcct tcctaataag gcacgatcgt accaccttac ttccaccaat 2640 cggcatgcac ggtgcttttt ctctccttgt aaggcatgtt gctaactcat cgttaccatg 2700 ttgcaagact acaagagtat tgcataagac tacatttccc cctccctatg caaaagcgaa 2760 actactatat
cctgagggga ctcctaaccg cgtacaaccg aagccccgct tttcgcctaa 2820 acacacccta gtcccctcag atacgcgtat atctggcccg tacatcgcga agcagcgcaa 2880 aacgcctaac cctaagcaga ttcttcatgc aattgtcggt caagccttgc cttgttgtag 2940 cttaaatttt gctcgcgcac tactcagcga cctccaacac acaagcaggg agcagccaat 3000 agccaatctg atgcggtatt ttctccttac gcatctgtgc ggtatttcac accgcatacc 3060 attgcatacg ttgtatccat atcataatat gtacatttat attggctcat gtccaacatt 3120 accgccatgt tgacattgat tattgactag ttattaatag taatcaatta cggggtcatt 3180 agttcatagc ccatatatgg agttccgcgt tacataactt acggtaaatg gcccgcctgg 3240 ctgaccgccc aacgaccccc gcccattgac gtcaataatg acgtatgttc ccatagtaac 3300 gccaataggg actttccatt gacgtcaatg ggtggagtat ttacggtaaa ctgcccactt 3360 ggcagtacat caagtgtatc atatgccaag tacgccccct attgacgtca atgacggtaa 3420 atggcccgcc tggcattatg cccagtacat gaccttatgg gactttccta cttggcagta 3480 catctacgta ttagtcatcg ctattaccat ggtgatgcgg ttttggcagt acatcaatgg 3540 gcgtggatag cggtttgact cacggggatt tccaagtctc caccccattg acgtcaatgg 3600 gagtttgttt tggcaccaaa atcaacggga ctttccaaaa tgtcgtaaca actccgcccc 3660 attgacgcaa atgggcggta ggcgtgtacg gtgggaggtc tatataagca gagctcgttt 3720 agtgaaccgt cagatcgcct ggagacgcca tccacgctgt tttgacctcc atagaagaca 3780 ccgggaccga tccagcctcc gcggccggga acggtgcatt ggaacgcgga ttccccgtgc 3840 caagagtgac gtaagtaccg cctatagact ctataggcac acccctttgg ctcttatgca 3900 tgctatactg tttttggctt ggggcctata cacccccgct tccttatgct ataggtgatg 3960 gtatagctta gcctataggt gtgggttatt gaccattatt gaccactccc ctattggtga 4020 cgatactttc cattactaat ccataacatg gctctttgcc acaactatct ctattggcta 4080 tatgccaata ctctgtcctt cagagactga cacggactct gtatttttac aggatggggt 4140 cccatttatt atttacaaat tcacatatac aacaacgccg tcccccgtgc ccgcagtttt 4200 tattaaacat agcgtgggat ctccacgcga atctcgggta cgtgttccgg acatgggctc 4260 ttctccggta gcggcggagc ttccacatcc gagccctggt cccatgcctc cagcggctca 4320 tggtcgctcg gcagctcctt gctcctaaca gtggaggcca gacttaggca cagcacaatg 4380 cccaccacca ccagtgtgcc gcacaaggcc gtggcggtag ggtatgtgtc tgaaaatgag 4440 cgtggagatt gggctcgcac 
ggctgacgca gatggaagac ttaaggcagc ggcagaagaa 4500 gatgcaggca gctgagttgt tgtattctga taagagtcag aggtaactcc cgttgcggtg 4560 ctgttaacgg tggagggcag tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc 4620 agacataata gctgacagac taacagactg ttcctttcca tgggtctttt ctgcagtcac 4680 cgtcgtcgga tatcgaattc gccaccatga gcctgctgac cgaggtggag acccccatca 4740 gaaacgagtg gggctgcaga tgcaacgaca gcagcgaccc cctggtggtg gccgccagca 4800 tcatcggcat cctgcacctg atcctgtgga tcctggacag actgttcttc aagtgcatct 4860 acagactgtt caagcacggc ctgaagagag gccccagcac cgagggcgtg cccgagagca 4920 tgagagagga gtacagaaag gagcagcaga acgccgtgga cgccgacgac agccacttcg 4980 tgagcatcga gctggagtga tcagtcgacc acgtgtgatc cagatctact tctggctaat 5040 aaaagatcag agctctagag atctgtgtgt tggttttttg tgtggtactc ttccgcttcc 5100 tcgctcactg actcgctgcg ctcggtcgtt cggctgcggc gagcggtatc agctcactca 5160 aaggcggtaa tacggttatc cacagaatca ggggataacg caggaaagaa catgtgagca 5220 aaaggccagc aaaaggccag gaaccgtaaa aaggccgcgt tgctggcgtt tttccatagg 5280 ctccgccccc ctgacgagca tcacaaaaat cgacgctcaa gtcagaggtg gcgaaacccg 5340 acaggactat aaagatacca ggcgtttccc cctggaagct ccctcgtgcg ctctcctgtt 5400 ccgaccctgc cgcttaccgg atacctgtcc gcctttctcc cttcgggaag cgtggcgctt 5460 tctcatagct cacgctgtag gtatctcagt tcggtgtagg tcgttcgctc caagctgggc 5520 tgtgtgcacg aaccccccgt tcagcccgac cgctgcgcct tatccggtaa ctatcgtctt 5580 gagtccaacc cggtaagaca cgacttatcg ccactggcag cagccactgg taacaggatt 5640 agcagagcga ggtatgtagg cggtgctaca gagttcttga agtggtggcc taactacggc 5700 tacactagaa gaacagtatt tggtatctgc gctctgctga agccagttac cttcggaaaa 5760 agagttggta gctcttgatc cggcaaacaa accaccgctg gtagcggtgg tttttttgtt 5820 tgcaagcagc agattacgcg cagaaaaaaa ggatctcaag aagatccttt gatcttttct 5880 acggggtctg acgctcagtg gaacgaaaac tcacgttaag ggattttggt catgagatta 5940 tcaaaaagga tcttcaccta gatcctttta aattaaaaat gaagttttaa atcaatctaa 6000 agtatatatg agtaaacttg gtctgacagt taccaatgct taatcagtga ggcacctatc 6060 tcagcgatct gtctatttcg ttcatccata gttgcctgac tcgggggggg ggggcgctga 6120 ggtctgcctc gtgaagaagg tgttgctgac 
tcataccagg cctgaatcgc cccatcatcc 6180 agccagaaag tgagggagcc acggttgatg agagctttgt tgtaggtgga ccagttggtg 6240 attttgaact tttgctttgc cacggaacgg tctgcgttgt cgggaagatg cgtgatctga 6300 tccttcaact cagcaaaagt tcgatttatt caacaaagcc gccgtcccgt caagtcagcg 6360 taatgctctg ccagtgttac aaccaattaa ccaattctga ttagaaaaac tcatcgagca 6420 tcaaatgaaa ctgcaattta ttcatatcag gattatcaat accatatttt tgaaaaagcc 6480 gtttctgtaa tgaaggagaa aactcaccga ggcagttcca taggatggca agatcctggt 6540 atcggtctgc gattccgact cgtccaacat caatacaacc tattaatttc ccctcgtcaa 6600 aaataaggtt atcaagtgag aaatcaccat gagtgacgac tgaatccggt gagaatggca 6660 aaagcttatg catttctttc cagacttgtt caacaggcca gccattacgc tcgtcatcaa 6720 aatcactcgc atcaaccaaa ccgttattca ttcgtgattg cgcctgagcg agacgaaata 6780 cgcgatcgct gttaaaagga caattacaaa caggaatcga atgcaaccgg cgcaggaaca 6840 ctgccagcgc atcaacaata ttttcacctg aatcaggata ttcttctaat acctggaatg 6900 ctgttttccc ggggatcgca gtggtgagta accatgcatc atcaggagta cggataaaat 6960 gcttgatggt cggaagaggc ataaattccg tcagccagtt tagtctgacc atctcatctg 7020 taacatcatt ggcaacgcta cctttgccat gtttcagaaa caactctggc gcatcgggct 7080 tcccatacaa tcgatagatt gtcgcacctg attgcccgac attatcgcga gcccatttat 7140 acccatataa atcagcatcc atgttggaat ttaatcgcgg cctcgagcaa gacgtttccc 7200 gttgaatatg gctcataaca ccccttgtat tactgtttat gtaagcagac agttttattg 7260 ttcatgatga tatattttta tcttgtgcaa tgtaacatca gagattttga gacacaacgt 7320 ggctttcccc ccccccccat tattgaagca tttatcaggg ttattgtctc atgagcggat 7380 acatatttga atgtatttag aaaaataaac aaataggggt tccgcgcaca tttccccgaa 7440 aagtgccacc tgacgtctaa gaaaccatta ttatcatgac attaacctat aaaaataggc 7500 gtatcacgag gccctttcgt ctcgcgcgtt tcggtgatga cggtgaaaac ctctgacaca 7560 tgcagctccc ggagacggtc acagcttgtc tgtaagcgga tgccgggagc agacaagccc 7620 gtcagggcgc gtcagcgggt gttggcgggt gtcggggctg gcttaactat gcggcatcag 7680 agcagattgt actgagagtg caccatatgc ggtgtgaaat accgcacaga tgcgtaagga 7740 gaaaataccg catcagattg gctat 7765 112 4196 DNA Artificial sequence VR10686, 4196 bps DNA Circular 112 tcgcgcgttt 
cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60 cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120 ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180 accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcagattgg 240 ctattggctg ctccctgctt gtgtgttgga ggtcgctgag tagtgcgcga gcaaaattta 300 agctacaaca aggcaaggct tgaccgacaa ttgcatgaag aatctgctta gggttaggcg 360 ttttgcgctg cttcgcgatg tacgggccag atatacgcgt atctgagggg actagggtgt 420 gtttaggcga aaagcggggc ttcggttgta cgcggttagg agtcccctca ggatatagta 480 gtttcgcttt tgcataggga gggggaaatg tagtcttatg caatactctt gtagtcttgc 540 aacatggtaa cgatgagtta gcaacatgcc ttacaaggag agaaaaagca ccgtgcatgc 600 cgattggtgg aagtaaggtg gtacgatcgt gccttattag gaaggcaaca gacgggtctg 660 acatggattg gacgaaccac tgaattccgc attgcagaga tattgtattt aagtgcctag 720 ctcgatacaa taaacgccat ttgaccattc accacattgg tgtgcacctc catcggctcg 780 catctctcct tcacgcgccc gccgccctac ctgaggccgc catccacgcc ggttgagtcg 840 cgttctgccg cctcccgcct gtggtgcctc ctgaactgcg tccgccgtct aggtaagttt 900 aaagctcagg tcgagaccgg gcctttgtcc ggcgctccct tggagcctac ctagactcag 960 ccggctctcc acgctttgcc tgaccctgct tgctcaactc tagttaacgg tggagggcag 1020 tgtagtctga gcagtactcg ttgctgccgc gcgcgccacc agacataata gctgacagac 1080 taacagactg ttcctttcca tgggtctttt ctgcagtcac cgtcgtcgac acgtgtgatc 1140 agatatcgcg gccgctctag accaggccct ggatccagat ctgctgtgcc ttctagttgc 1200 cagccatctg ttgtttgccc ctcccccgtg ccttccttga ccctggaagg tgccactccc 1260 actgtccttt cctaataaaa tgaggaaatt gcatcgcatt gtctgagtag gtgtcattct 1320 attctggggg gtggggtggg gcaggacagc aagggggagg attgggaaga caatagcagg 1380 catgctgggg atgcggtggg ctctatgggt acccaggtgc tgaagaattg acccggttcc 1440 tcctgggcca gaaagaagca ggcacatccc cttctctgtg acacaccctg tccacgcccc 1500 tggttcttag ttccagcccc actcatagga cactcatagc tcaggagggc tccgccttca 1560 atcccacccg ctaaagtact tggagcggtc tctccctccc tcatcagccc accaaaccaa 1620 acctagcctc caagagtggg aagaaattaa agcaagatag gctattaagt gcagagggag 1680 agaaaatgcc tccaacatgt gaggaagtaa 
tgagagaaat catagaattt taaggccatg 1740 atttaaggcc atcatggcct taatcttccg cttcctcgct cactgactcg ctgcgctcgg 1800 tcgttcggct gcggcgagcg gtatcagctc actcaaaggc ggtaatacgg ttatccacag 1860 aatcagggga taacgcagga aagaacatgt gagcaaaagg ccagcaaaag gccaggaacc 1920 gtaaaaaggc cgcgttgctg gcgtttttcc ataggctccg cccccctgac gagcatcaca 1980 aaaatcgacg ctcaagtcag aggtggcgaa acccgacagg actataaaga taccaggcgt 2040 ttccccctgg aagctccctc gtgcgctctc ctgttccgac cctgccgctt accggatacc 2100 tgtccgcctt tctcccttcg ggaagcgtgg cgctttctca tagctcacgc tgtaggtatc 2160 tcagttcggt gtaggtcgtt cgctccaagc tgggctgtgt gcacgaaccc cccgttcagc 2220 ccgaccgctg cgccttatcc ggtaactatc gtcttgagtc caacccggta agacacgact 2280 tatcgccact ggcagcagcc actggtaaca ggattagcag agcgaggtat gtaggcggtg 2340 ctacagagtt cttgaagtgg tggcctaact acggctacac tagaagaaca gtatttggta 2400 tctgcgctct gctgaagcca gttaccttcg gaaaaagagt tggtagctct tgatccggca 2460 aacaaaccac cgctggtagc ggtggttttt ttgtttgcaa gcagcagatt acgcgcagaa 2520 aaaaaggatc tcaagaagat cctttgatct tttctacggg gtctgacgct cagtggaacg 2580 aaaactcacg ttaagggatt ttggtcatga gattatcaaa aaggatcttc acctagatcc 2640 ttttaaatta aaaatgaagt tttaaatcaa tctaaagtat atatgagtaa acttggtctg 2700 acagttacca atgcttaatc agtgaggcac ctatctcagc gatctgtcta tttcgttcat 2760 ccatagttgc ctgactcggg gggggggggc gctgaggtct gcctcgtgaa gaaggtgttg 2820 ctgactcata ccaggcctga atcgccccat catccagcca gaaagtgagg gagccacggt 2880 tgatgagagc tttgttgtag gtggaccagt tggtgatttt gaacttttgc tttgccacgg 2940 aacggtctgc gttgtcggga agatgcgtga tctgatcctt caactcagca aaagttcgat 3000 ttattcaaca aagccgccgt cccgtcaagt cagcgtaatg ctctgccagt gttacaacca 3060 attaaccaat tctgattaga aaaactcatc gagcatcaaa tgaaactgca atttattcat 3120 atcaggatta tcaataccat atttttgaaa aagccgtttc tgtaatgaag gagaaaactc 3180 accgaggcag ttccatagga tggcaagatc ctggtatcgg tctgcgattc cgactcgtcc 3240 aacatcaata caacctatta atttcccctc gtcaaaaata aggttatcaa gtgagaaatc 3300 accatgagtg acgactgaat ccggtgagaa tggcaaaagc ttatgcattt ctttccagac 3360 ttgttcaaca ggccagccat tacgctcgtc atcaaaatca 
ctcgcatcaa ccaaaccgtt 3420 attcattcgt gattgcgcct gagcgagacg aaatacgcga tcgctgttaa aaggacaatt 3480 acaaacagga atcgaatgca accggcgcag gaacactgcc agcgcatcaa caatattttc 3540 acctgaatca ggatattctt ctaatacctg gaatgctgtt ttcccgggga tcgcagtggt 3600 gagtaaccat gcatcatcag gagtacggat aaaatgcttg atggtcggaa gaggcataaa 3660 ttccgtcagc cagtttagtc tgaccatctc atctgtaaca tcattggcaa cgctaccttt 3720 gccatgtttc agaaacaact ctggcgcatc gggcttccca tacaatcgat agattgtcgc 3780 acctgattgc ccgacattat cgcgagccca tttataccca tataaatcag catccatgtt 3840 ggaatttaat cgcggcctcg agcaagacgt ttcccgttga atatggctca taacacccct 3900 tgtattactg tttatgtaag cagacagttt tattgttcat gatgatatat ttttatcttg 3960 tgcaatgtaa catcagagat tttgagacac aacgtggctt tccccccccc cccattattg 4020 aagcatttat cagggttatt gtctcatgag cggatacata tttgaatgta tttagaaaaa 4080 taaacaaata ggggttccgc gcacatttcc ccgaaaagtg ccacctgacg tctaagaaac 4140 cattattatc atgacattaa cctataaaaa taggcgtatc acgaggccct ttcgtc 4196

* * * * *